The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] agglomerative clustering(4hit)

1-4hit
  • A KPI Anomaly Detection Method Based on Fast Clustering

    Yun WU  Yu SHI  Jieming YANG  Lishan BAO  Chunzhe LI  

     
    PAPER

      Pubricized:
    2022/05/27
      Vol:
    E105-B No:11
      Page(s):
    1309-1317

    In the Artificial Intelligence for IT Operations scenarios, KPI (Key Performance Indicator) is a very important operation and maintenance monitoring indicator, and research on KPI anomaly detection has also become a hot spot in recent years. Aiming at the problems of low detection efficiency and insufficient representation learning of existing methods, this paper proposes a fast clustering-based KPI anomaly detection method HCE-DWL. This paper firstly adopts the combination of hierarchical agglomerative clustering (HAC) and deep assignment based on CNN-Embedding (CE) to perform cluster analysis (that is HCE) on KPI data, so as to improve the clustering efficiency of KPI data, and then separately the centroid of each KPI cluster and its Transformed Outlier Scores (TOS) are given weights, and finally they are put into the LightGBM model for detection (the Double Weight LightGBM model, referred to as DWL). Through comparative experimental analysis, it is proved that the algorithm can effectively improve the efficiency and accuracy of KPI anomaly detection.

  • A Hybrid Feature Selection Method for Software Fault Prediction

    Yiheng JIAN  Xiao YU  Zhou XU  Ziyi MA  

     
    PAPER-Software Engineering

      Pubricized:
    2019/07/09
      Vol:
    E102-D No:10
      Page(s):
    1966-1975

    Fault prediction aims to identify whether a software module is defect-prone or not according to metrics that are mined from software projects. These metric values, also known as features, may involve irrelevance and redundancy, which hurt the performance of fault prediction models. In order to filter out irrelevant and redundant features, a Hybrid Feature Selection (abbreviated as HFS) method for software fault prediction is proposed. The proposed HFS method consists of two major stages. First, HFS groups features with hierarchical agglomerative clustering; second, HFS selects the most valuable features from each cluster to remove irrelevant and redundant ones based on two wrapper based strategies. The empirical evaluation was conducted on 11 widely-studied NASA projects, using three different classifiers with four performance metrics (precision, recall, F-measure, and AUC). Comparison with six filter-based feature selection methods demonstrates that HFS achieves higher average F-measure and AUC values. Compared with two classic wrapper feature selection methods, HFS can obtain a competitive prediction performance in terms of average AUC while significantly reducing the computation cost of the wrapper process.

  • Scalable Object Discovery: A Hash-Based Approach to Clustering Co-occurring Visual Words

    Gibran FUENTES PINEDA  Hisashi KOGA  Toshinori WATANABE  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E94-D No:10
      Page(s):
    2024-2035

    We present a scalable approach to automatically discovering particular objects (as opposed to object categories) from a set of images. The basic idea is to search for local image features that consistently appear in the same images under the assumption that such co-occurring features underlie the same object. We first represent each image in the set as a set of visual words (vector quantized local image features) and construct an inverted file to memorize the set of images in which each visual word appears. Then, our object discovery method proceeds by searching the inverted file and extracting visual word sets whose elements tend to appear in the same images; such visual word sets are called co-occurring word sets. Because of unstable and polysemous visual words, a co-occurring word set typically represents only a part of an object. We observe that co-occurring word sets associated with the same object often share many visual words with one another. Hence, to obtain the object models, we further cluster highly overlapping co-occurring word sets in an agglomerative manner. Remarkably, we accelerate both extraction and clustering of co-occurring word sets by Min-Hashing. We show that the models generated by our method can effectively discriminate particular objects. We demonstrate our method on the Oxford buildings dataset. In a quantitative evaluation using a set of ground truth landmarks, our method achieved higher scores than the state-of-the-art methods.

  • User Feedback-Driven Document Clustering Technique for Information Organization

    Han-joon KIM  Sang-goo LEE  

     
    LETTER-Databases

      Vol:
    E85-D No:6
      Page(s):
    1043-1048

    This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organizes documents into groups based only on similarity measures. In this paper, we attempt to isolate more semantically coherent clusters by employing the domain-specific knowledge provided by a document analyst. By using external human knowledge to guide the clustering mechanism with some flexibility when creating the clusters, clustering efficiency can be considerably enhanced. Experimental results show that the use of only a little external knowledge can considerably enhance the quality of clustering results that satisfy users' constraint.