
Keyword Search Result

[Keyword] clustering(170hit)

81-100hit(170hit)

  • Enhancing Digital Book Clustering by LDAC Model

    Lidong WANG  Yuan JIE  

     
    PAPER

    Vol: E95-D No:4  Page(s): 982-988

    In Digital Library (DL) applications, digital book clustering is an important and urgent research task. However, it is difficult to conduct effectively because of the great length of digital books. To cluster digital books correctly, a novel method based on a probabilistic topic model is proposed. First, we build a topic model named LDAC, whose main goal is to effectively extract topics from digital books. Subsequently, Gibbs sampling is applied for parameter inference. Once the model parameters are learned, each book is assigned to the cluster that maximizes the posterior probability. Experimental results demonstrate that our approach based on LDAC achieves significant improvement over related methods.

  • Dynamic Fractional Base Station Cooperation Using Shared Distributed Remote Radio Units for Advanced Cellular Networks

    Naoki KUSASHIMA  Ian Dexter GARCIA  Kei SAKAGUCHI  Kiyomichi ARAKI  Shoji KANEKO  Yoji KISHI  

     
    PAPER

    Vol: E94-B No:12  Page(s): 3259-3271

    Traditional cellular networks suffer the so-called “cell-edge problem” in which the user throughput is deteriorated because of pathloss and inter-cell (co-channel) interference. Recently, Base Station Cooperation (BSC) was proposed as a solution to the cell-edge problem by alleviating the interference and improving diversity and multiplexing gains at the cell-edge. However, it has minimal impact on cell-inner users and increases the complexity of the network. Moreover, static clustering, which fixes the cooperating cells, suffers from inter-cluster interference at the cluster-edge. In this paper, dynamic fractional cooperation is proposed to realize dynamic clustering in a shared RRU network. In the proposed algorithm, base station cooperation is performed dynamically at cell edges for throughput improvement of users located in these areas. To realize such base station cooperation in large scale cellular networks, coordinated scheduling and distributed dynamic cooperation are introduced. The introduction of coordinated scheduling in BSC multi-user MIMO not only maximizes the performance of BSC for cell-edge users but also reduces computational complexity by performing simple single-cell MIMO for cell-inner users. Furthermore, the proposed dynamic clustering employing shared RRU network realizes efficient transmission at all cell edges by forming cooperative cells dynamically with minimal network complexity. Owing to the combinations of the proposed algorithms, dynamic fractional cooperation achieves high network performance at all areas in the cellular network. Simulation results show that the cell-average and the 5% cell-edge user throughput can be significantly increased in practical cellular network scenarios.

  • A Support Vector and K-Means Based Hybrid Intelligent Data Clustering Algorithm

    Liang SUN  Shinichi YOSHIDA  Yanchun LIANG  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E94-D No:11  Page(s): 2234-2243

    Support vector clustering (SVC), a recently developed unsupervised learning algorithm, has been successfully applied to solving many real-life data clustering problems. However, its effectiveness and advantages deteriorate when it is applied to solving complex real-world problems, e.g., those with large proportion of noise data points and with connecting clusters. This paper proposes a support vector and K-Means based hybrid algorithm to improve the performance of SVC. A new SVC training method is developed based on analysis of a Gaussian kernel radius function. An empirical study is conducted to guide better selection of the standard deviation of the Gaussian kernel. In the proposed algorithm, firstly, the outliers which increase problem complexity are identified and removed by training a global SVC. The refined data set is then clustered by a kernel-based K-Means algorithm. Finally, several local SVCs are trained for the clusters and then each removed data point is labeled according to the distance from it to the local SVCs. Since it exploits the advantages of both SVC and K-Means, the proposed algorithm is capable of clustering compact and arbitrary organized data sets and of increasing robustness to outliers and connecting clusters. Experiments are conducted on 2-D data sets generated by mixture models and benchmark data sets taken from the UCI machine learning repository. The cluster error rate is lower than 3.0% for all the selected data sets. The results demonstrate that the proposed algorithm compared favorably with existing SVC algorithms.

  • ROCKET: A Robust Parallel Algorithm for Clustering Large-Scale Transaction Databases

    Woong-Kee LOH  Yang-Sae MOON  Heejune AHN  

     
    LETTER-Artificial Intelligence, Data Mining

    Vol: E94-D No:10  Page(s): 2048-2051

    We propose a robust and efficient algorithm called ROCKET for clustering large-scale transaction databases. ROCKET is a divisive hierarchical algorithm that makes the most of recent hardware architecture. ROCKET handles the cases with the small and the large number of similar transaction pairs separately and efficiently. Through experiments, we show that ROCKET achieves high-quality clustering with a dramatic performance improvement.

  • Scalable Object Discovery: A Hash-Based Approach to Clustering Co-occurring Visual Words

    Gibran FUENTES PINEDA  Hisashi KOGA  Toshinori WATANABE  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E94-D No:10  Page(s): 2024-2035

    We present a scalable approach to automatically discovering particular objects (as opposed to object categories) from a set of images. The basic idea is to search for local image features that consistently appear in the same images under the assumption that such co-occurring features underlie the same object. We first represent each image in the set as a set of visual words (vector quantized local image features) and construct an inverted file to memorize the set of images in which each visual word appears. Then, our object discovery method proceeds by searching the inverted file and extracting visual word sets whose elements tend to appear in the same images; such visual word sets are called co-occurring word sets. Because of unstable and polysemous visual words, a co-occurring word set typically represents only a part of an object. We observe that co-occurring word sets associated with the same object often share many visual words with one another. Hence, to obtain the object models, we further cluster highly overlapping co-occurring word sets in an agglomerative manner. Remarkably, we accelerate both extraction and clustering of co-occurring word sets by Min-Hashing. We show that the models generated by our method can effectively discriminate particular objects. We demonstrate our method on the Oxford buildings dataset. In a quantitative evaluation using a set of ground truth landmarks, our method achieved higher scores than the state-of-the-art methods.
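
    The Min-Hashing step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the inverted file contents, the affine hash family, and all function names are assumptions made for illustration. Each visual word is summarized by a signature built from the images it appears in, and the fraction of agreeing signature positions estimates the Jaccard similarity of two words' image sets, so co-occurring words can be found without comparing the full sets.

```python
import random

def make_hash_funcs(num_hashes, seed=0):
    """Build random affine hashes h(x) = (a*x + b) mod p (an illustrative choice)."""
    rng = random.Random(seed)
    p = 2_147_483_647  # large prime, 2^31 - 1
    return [(rng.randrange(1, p), rng.randrange(0, p), p) for _ in range(num_hashes)]

def minhash_signature(image_ids, hash_funcs):
    """Signature of a visual word: per hash function, the minimum hash over its image ids."""
    return [min((a * i + b) % p for i in image_ids) for a, b, p in hash_funcs]

def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature positions; estimates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)

def jaccard(a, b):
    """Exact Jaccard similarity, used here only to check the estimate."""
    return len(a & b) / len(a | b)

# Hypothetical inverted file: visual word -> ids of the images containing it.
inverted_file = {
    "word_a": {1, 2, 3, 4, 5, 6, 7, 8},
    "word_b": {1, 2, 3, 4, 5, 6, 7, 9},
    "word_c": {20, 21, 22},
}

funcs = make_hash_funcs(num_hashes=200)
signatures = {w: minhash_signature(ids, funcs) for w, ids in inverted_file.items()}

# word_a and word_b share most images, so their estimate should be high;
# word_a and word_c share none, so their signatures never agree.
print(estimated_similarity(signatures["word_a"], signatures["word_b"]))
print(estimated_similarity(signatures["word_a"], signatures["word_c"]))
```

    More hash functions give a tighter estimate at the cost of longer signatures; the value 200 here is arbitrary.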

  • Sub-Category Optimization through Cluster Performance Analysis for Multi-View Multi-Pose Object Detection

    Dipankar DAS  Yoshinori KOBAYASHI  Yoshinori KUNO  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E94-D No:7  Page(s): 1467-1478

    The detection of object categories with large variations in appearance is a fundamental problem in computer vision. The appearance of object categories can change due to intra-class variations, background clutter, and changes in viewpoint and illumination. For object categories with large appearance changes, some kind of sub-categorization based approach is necessary. This paper proposes a sub-category optimization approach that automatically divides an object category into an appropriate number of sub-categories based on appearance variations. Instead of using predefined intra-category sub-categorization based on domain knowledge or validation datasets, we divide the sample space by unsupervised clustering using discriminative image features. We then use a cluster performance analysis (CPA) algorithm to verify the performance of the unsupervised approach. The CPA algorithm uses two performance metrics to determine the optimal number of sub-categories per object category. Furthermore, we employ the optimal sub-category representation as the basis and a supervised multi-category detection system with χ2 merging kernel function to efficiently detect and localize object categories within an image. Extensive experimental results are shown using a standard and the authors' own databases. The comparison results reveal that our approach outperforms the state-of-the-art methods.

  • A Development of Cascade Granular Neural Networks

    Keun-Chang KWAK  

     
    LETTER-Biocybernetics, Neurocomputing

    Vol: E94-D No:7  Page(s): 1515-1518

    This paper studies the design of Cascade Granular Neural Networks (CGNN) for human-centric systems. In contrast to typical rule-based systems encountered in fuzzy modeling, the proposed method consists of two-phase development for CGNN. First, we construct a Granular Neural Network (GNN) which could be treated as a preliminary design. Next, all modeling discrepancies are compensated by a second GNN with a collection of rules that become attached to the regions of the input space where the error is localized. These granular networks are constructed by building a collection of user-centric information granules through Context-based Fuzzy c-Means (CFCM) clustering. Finally, the experimental results on two examples reveal that the proposed approach shows good performance in comparison with the previous works.

  • Enhancing Document Clustering Using Condensing Cluster Terms and Fuzzy Association

    Sun PARK  Seong Ro LEE  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E94-D No:6  Page(s): 1227-1234

    Improving clustering performance remains a challenging issue for most document clustering methods. Document clustering based on semantic features is highly efficient. However, this approach sometimes fails to cluster certain documents, such as highly articulated documents. In order to improve the clustering success of complex documents using semantic features, this paper proposes a document clustering method that uses terms of the condensing document clusters and fuzzy association to efficiently cluster specific documents into meaningful topics based on the document set. The proposed method improves the quality of document clustering because it can extract documents from the perspective of the terms of the cluster topics using semantic features and synonyms, which can also better represent the inherent structure of the document in connection with the document cluster topics. The experimental results demonstrate that the proposed method can achieve better document clustering performance than other methods.

  • Bayesian Context Clustering Using Cross Validation for Speech Recognition

    Kei HASHIMOTO  Heiga ZEN  Yoshihiko NANKAKU  Akinobu LEE  Keiichi TOKUDA  

     
    PAPER-Speech and Hearing

    Vol: E94-D No:3  Page(s): 668-678

    This paper proposes Bayesian context clustering using cross validation for hidden Markov model (HMM) based speech recognition. The Bayesian approach is a statistical technique for estimating reliable predictive distributions by treating model parameters as random variables. The variational Bayesian method, which is widely used as an efficient approximation of the Bayesian approach, has been applied to HMM-based speech recognition, and it shows good performance. Moreover, the Bayesian approach can select an appropriate model structure while taking account of the amount of training data. Since prior distributions which represent prior information about model parameters affect estimation of the posterior distributions and selection of model structure (e.g., decision tree based context clustering), the determination of prior distributions is an important problem. However, it has not been thoroughly investigated in speech recognition, and the determination technique of prior distributions has not performed well. The proposed method can determine reliable prior distributions without any tuning parameters and select an appropriate model structure while taking account of the amount of training data. Continuous phoneme recognition experiments show that the proposed method achieved a higher performance than the conventional methods.

  • A Hierarchical Geographical Routing with Alternative Paths Using Autonomous Clustering for Mobile Ad Hoc Networks

    Hiroshi NAKAGAWA  Satoshi TESHIMA  Tomoyuki OHTA  Yoshiaki KAKUDA  

     
    PAPER-Assurance

    Vol: E94-B No:1  Page(s): 37-44

    Recently, routing schemes for ad hoc networks that use location information provided by GPS (Global Positioning System) have been proposed. However, many routing schemes using location information assume that a source node already knows the location information of the destination node, and they do not adapt to large ad hoc networks. Meanwhile, the autonomous clustering scheme has been proposed to construct a hierarchical structure in ad hoc networks and adapt to large ad hoc networks. However, even when the hierarchical structure is introduced, some problems remain: the data delivery ratio decreases as the node speed increases, and clusterheads incur much overhead in the hierarchical routing scheme based on the autonomous clustering scheme. In order to cope with these problems, this paper proposes a new Hierarchical Geographical Routing with Alternative Paths (Hi-GRAP) using the autonomous clustering scheme and shows the effectiveness of the proposed hierarchical geographical routing in comparison with GPSR, Hi-AODV and AODV through simulation experiments with respect to the amount of control packets and the data delivery ratio.

  • O-means: An Optimized Clustering Method for Analyzing Spam Based Attacks

    Jungsuk SONG  Daisuke INOUE  Masashi ETO  Hyung Chan KIM  Koji NAKAO  

     
    PAPER-Network Security

    Vol: E94-A No:1  Page(s): 245-254

    In recent years, the number of spam emails has been dramatically increasing and spam is recognized as a serious internet threat. Most recent spam emails are being sent by bots which often operate with others in the form of a botnet, and skillful spammers try to conceal their activities from spam analyzers and spam detection technology. In addition, most spam messages contain URLs that lure spam receivers to malicious Web servers for the purpose of carrying out various cyber attacks such as malware infection, phishing attacks, etc. In order to cope with spam based attacks, there have been many efforts made towards the clustering of spam emails based on similarities between them. The spam clusters obtained from the clustering of spam emails can be used to identify the infrastructure of spam sending systems and malicious Web servers, and how they are grouped and correlate with each other, and to minimize the time needed for analyzing Web pages. Therefore, it is very important to improve the accuracy of the spam clustering as much as possible so as to analyze spam based attacks more accurately. In this paper, we present an optimized spam clustering method, called O-means, based on the K-means clustering method, which is one of the most widely used clustering methods. By examining three weeks of spam gathered in our SMTP server, we observed that the accuracy of the O-means clustering method is about 87% which is superior to the previous clustering methods. In addition, we define 12 statistical features to compare similarity between spam emails, and we determined a set of optimized features which makes the O-means clustering method more effective.
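
    For readers unfamiliar with the K-means baseline that the abstract above builds on, a minimal sketch follows. It shows plain K-means only, not O-means, whose optimizations the abstract does not describe. The two-dimensional feature vectors and initial centroids are hypothetical stand-ins for the statistical features mentioned in the abstract.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    labels = []
    for p in points:
        distances = [squared_distance(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels

def kmeans(points, initial_centroids, max_iterations=100):
    """Plain K-means: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                # New centroid is the per-dimension mean of the cluster members.
                new_centroids.append(
                    tuple(sum(p[d] for p in members) / len(members) for d in range(len(old)))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

# Hypothetical feature vectors for six emails forming two obvious groups.
emails = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(emails, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
```

    The result of plain K-means depends on the initial centroids, which are fixed here so that the toy run is reproducible.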

  • 3D Sound Rendering for Multiple Sound Sources Based on Fuzzy Clustering

    Masashi OKADA  Nobuyuki IWANAGA  Tomoya MATSUMURA  Takao ONOYE  Wataru KOBAYASHI  

     
    PAPER

    Vol: E93-A No:11  Page(s): 2163-2172

    In this paper, we propose a new 3D sound rendering method for multiple sound sources with limited computational resources. The method is based on fuzzy clustering, which achieves dual benefits of two general methods based on amplitude-panning and hard clustering. In embedded systems where the number of reproducible sound sources is restricted, the general methods suffer from localization errors and/or serious quality degradation, whereas the proposed method settles the problems by executing clustering-process and amplitude-panning simultaneously. Computational cost evaluation based on DSP implementation and subjective listening test have been performed to demonstrate the applicability for embedded systems and the effectiveness of the proposed method.

  • PAW: A Pattern-Aware Write Policy for a Flash Non-volatile Cache

    Young-Jin KIM  Jihong KIM  Jeong-Bae LEE  Kee-Wook RIM  

     
    PAPER-Software System

    Vol: E93-D No:11  Page(s): 3017-3026

    In disk-based storage systems, non-volatile write caches have been widely used to reduce write latency as well as to ensure data consistency at the level of a storage controller. Write cache policies should basically consider which data is important to cache and evict, and they should also take into account the real I/O features of a non-volatile device. However, existing work has mainly focused on improving basic cache operations, but has not considered the I/O cost of a non-volatile device properly. In this paper, we propose a pattern-aware write cache policy, PAW for a NAND flash memory in disk-based mobile storage systems. PAW is designed to face a mix of a number of sequential accesses and fewer non-sequential ones in mobile storage systems by redirecting the latter to a NAND flash memory and the former to a disk. In addition, PAW employs the synergistic effect of combining a pattern-aware write cache policy and an I/O clustering-based queuing method to strengthen the sequentiality with the aim of reducing the overall system I/O latency. For evaluations, we have built a practical hard disk simulator with a non-volatile cache of a NAND flash memory. Experimental results show that our policy significantly improves the overall I/O performance by reducing the overhead from a non-volatile cache considerably over a traditional one, achieving a high efficiency in energy consumption.

  • An Adaptive Niching EDA with Balance Searching Based on Clustering Analysis

    Benhui CHEN  Jinglu HU  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E93-A No:10  Page(s): 1792-1799

    For optimization problems with irregular and complex multimodal landscapes, Estimation of Distribution Algorithms (EDAs) suffer from the drawback of premature convergence similar to other evolutionary algorithms. In this paper, we propose an adaptive niching EDA based on Affinity Propagation (AP) clustering analysis. The AP clustering is used to adaptively partition the niches and mine the searching information from the evolution process. The obtained information is successfully utilized to improve the EDA performance by using a balance niching searching strategy. Two different categories of optimization problems are used to evaluate the proposed adaptive niching EDA. The first one is solving three benchmark functional multimodal optimization problems by a continuous EDA based on single Gaussian probabilistic model; the other one is solving a real complicated discrete EDA optimization problem, the HP model protein folding based on k-order Markov probabilistic model. Simulation results show that the proposed adaptive niching EDA is an efficient method.

  • An Efficient Filtering Method for Processing Continuous Skyline Queries on Sensor Data

    Su Min JANG  Choon Seo PARK  Dong Min SEO  Jae Soo YOO  

     
    LETTER-Network

    Vol: E93-B No:8  Page(s): 2180-2183

    In this paper, we propose a novel filtering method for processing continuous skyline queries in wireless sensor network environments. The existing filtering methods in such environments use filters that are based on router paths. However, these methods do little to reduce the data that sensor nodes transmit to the base station, because the filters are applied not to the whole area but only to part of it. Therefore, we propose a novel and efficient method that dramatically reduces the data transmissions of sensors by applying an effective, low-cost filter to all sensor nodes. The proposed filter is generated by using characteristics such as the data locality and the clustering of sensors. An extensive performance study verifies the merits of our new method.

  • Spectral Methods for Thesaurus Construction

    Nobuyuki SHIMIZU  Masashi SUGIYAMA  Hiroshi NAKAGAWA  

     
    PAPER-Natural Language Processing

    Vol: E93-D No:6  Page(s): 1378-1385

    Traditionally, popular synonym acquisition methods are based on the distributional hypothesis, and a metric such as Jaccard coefficients is used to evaluate the similarity between the contexts of words to obtain synonyms for a query. On the other hand, when one tries to compile and clean a thesaurus, one often already has a modest number of synonym relations at hand. Could something be done with a half-built thesaurus alone? We propose the use of spectral methods and discuss their relation to other network-based algorithms in natural language processing (NLP), such as PageRank and Bootstrapping. Since compiling a thesaurus is very laborious, we believe that adding the proposed method to the toolkit of thesaurus constructors would significantly ease the pain in accomplishing this task.

  • Distributed Clustering Algorithm to Explore Selection Diversity in Wireless Sensor Networks

    Hyung-Yun KONG   ASADUZZAMAN  

     
    PAPER-Wireless Communication Technologies

    Vol: E93-B No:5  Page(s): 1232-1239

    This paper presents a novel cross-layer approach to explore selection diversity for distributed clustering based wireless sensor networks (WSNs) by selecting a proper cluster-head. We develop and analyze an instantaneous channel state information (CSI) based cluster-head selection algorithm for a distributed, dynamic and randomized clustering based WSN. The proposed cluster-head selection scheme is also random and capable of distributing energy use among the nodes in the network. We present an analytical approach to evaluate the energy efficiency and system lifetime of our proposal. Analysis shows that, in a Rayleigh fading environment, the proposed scheme outperforms the performance of an additive white Gaussian noise (AWGN) channel. This proposal also outperforms the existing cooperative diversity protocols in terms of system lifetime and implementation complexity.

  • A WDS Clustering Algorithm for Wireless Mesh Networks

    Shigeto TAJIMA  Nobuo FUNABIKI  Teruo HIGASHINO  

     
    PAPER-Fundamentals of Information Systems

    Vol: E93-D No:4  Page(s): 800-810

    Wireless mesh networks have been extensively studied as expandable, flexible, and inexpensive access networks to the Internet. This paper focuses on one composed of multiple access points (APs) connected through multihop wireless communications mainly by the wireless distribution system (WDS). For scalability, the proper partition of APs into multiple WDS clusters is essential, because the number of APs in one cluster is limited due to the increasing radio interference and control packets. In this paper, we formulate this WDS clustering problem and prove the NP-completeness of its decision version through reduction from a known NP-complete problem. Then, we propose its heuristic algorithm, using a greedy method and a variable depth search method, to satisfy the complex constraints while optimizing the cost function. We verify the effectiveness of our algorithm through extensive simulations, where the results confirm its superiority to the existing algorithm in terms of throughput.

  • An Improved Anchor Shot Detection Method Using Fitness of Face Location and Dissimilarity of Icon Region

    Ji-Soo KEUM  Hyon-Soo LEE  Masafumi HAGIWARA  

     
    LETTER-Image

    Vol: E93-A No:4  Page(s): 863-866

    In this letter, we propose an improved anchor shot detection (ASD) method in order to effectively retrieve anchor shots from news video. The face location and dissimilarity of icon region are used to reduce false alarms in the proposed method. According to the results of the experiment on several types of news video, the proposed method obtained high anchor detection results compared with previous methods.

  • Energy Efficient and Stable Weight Based Clustering for Mobile Ad Hoc Networks

    Safdar H. BOUK  Iwao SASASE  

     
    PAPER-Network

    Vol: E92-B No:9  Page(s): 2851-2863

    Recently, several weighted clustering algorithms have been proposed; however, to the best of our knowledge, none of them propagates weights to other nodes without a weight message for leader election, normalizes node parameters, and considers neighboring node parameters to calculate node weights. In this paper, we propose an Energy Efficient and Stable Weight Based Clustering (EE-SWBC) algorithm that elects cluster heads without sending any additional weight message. Each node propagates its parameters to its neighbors through the neighbor discovery message (HELLO Message) and stores the received parameters in a neighborhood list. Each node normalizes the parameters and efficiently calculates its own weight and the weights of neighboring nodes from that neighborhood list using the Grey Decision Method (GDM). GDM finds the ideal solution (the best node parameters in the neighborhood list) and calculates node weights in comparison to the ideal solution. The node(s) with maximum weight (parameters closest to the ideal solution) are elected as cluster heads. As a result, EE-SWBC fairly selects potential nodes with parameters closer to the ideal solution with less overhead. Different performance metrics of EE-SWBC and the Distributed Weighted Clustering Algorithm (DWCA) are compared through simulations. The simulation results show that EE-SWBC maintains a smaller average number of stable clusters with minimum overhead, less energy consumption and fewer changes in cluster structure within the network compared to DWCA.
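
    The weighting steps summarized in the abstract above (normalize parameters, find the ideal solution, score each node against it, elect the maximum) can be sketched as follows. The abstract does not give the formulas of GDM, so this sketch substitutes the standard grey relational grade as an assumption. The node names, the three parameters, and the distinguishing coefficient rho = 0.5 are all hypothetical.

```python
def normalize(column, larger_is_better):
    """Scale one parameter to [0, 1] so that 1.0 is always the best value."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [1.0 for _ in column]
    if larger_is_better:
        return [(v - lo) / (hi - lo) for v in column]
    return [(hi - v) / (hi - lo) for v in column]

def grey_relational_grades(neighborhood, larger_is_better, rho=0.5):
    """Score each node by its closeness to the ideal node (all parameters at 1.0)."""
    names = list(neighborhood)
    columns = list(zip(*(neighborhood[n] for n in names)))
    norm_columns = [normalize(c, flag) for c, flag in zip(columns, larger_is_better)]
    norm_rows = list(zip(*norm_columns))
    # Deviation of every normalized parameter from the ideal value 1.0.
    deltas = [[abs(1.0 - v) for v in row] for row in norm_rows]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    grades = {}
    for name, row in zip(names, deltas):
        if d_max == 0:
            grades[name] = 1.0  # every node already matches the ideal
            continue
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades

# Hypothetical neighborhood list: (residual energy, node degree, mobility).
neighborhood = {
    "n1": (0.9, 5, 1.0),
    "n2": (0.5, 3, 2.0),
    "n3": (0.2, 4, 3.0),
}
# Energy and degree are treated as larger-is-better, mobility as smaller-is-better.
grades = grey_relational_grades(neighborhood, larger_is_better=(True, True, False))
cluster_head = max(grades, key=grades.get)
print(cluster_head, grades)
```

    Because every node can compute these grades from the parameters it already holds in its neighborhood list, no separate weight message is needed in this sketch, which mirrors the motivation stated in the abstract.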
