Keyword Search Result

[Keyword] clustering(170hit)

121-140hit(170hit)

  • Improving Accuracy of Recommender System by Item Clustering

    KhanhQuan TRUONG  Fuyuki ISHIKAWA  Shinichi HONIDEN  

     
    PAPER

    Vol: E90-D No:9  Page(s): 1363-1373

    A Recommender System (RS) predicts users' ratings of items and then recommends the items with the highest predicted ratings. In recent years, RS has played an increasingly important role in the agent research field, and a great deal of research has tried to apply agent technology to RS. Collaborative Filtering, one of the most widely used approaches to predicting users' ratings in recommender systems, predicts a user's rating of an item by aggregating the ratings given by users whose preferences are similar to that user's. In existing approaches, user similarity is often computed over the whole set of items. However, because the number of items is often very large and the items are highly diverse, users who have similar preferences in one category may judge items of another kind quite differently. To deal with this problem, we propose a method that clusters items so that, inside a cluster, the similarity between users does not change significantly from item to item. After the item clustering phase, when predicting a user's rating of an item, we aggregate only the ratings of users whose preferences are similar to that user's inside the cluster of that item. Experiments evaluating our approach are carried out on a real dataset taken from MovieLens, a movie recommendation web site. The experimental results suggest that our approach can improve prediction accuracy compared to existing approaches.
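
The cluster-restricted prediction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the item clusters are assumed to be given (the abstract does not specify the clustering algorithm), Pearson correlation and a mean-centered weighted average are assumed as the similarity and aggregation rules, and all names and ratings (`pearson`, `predict`, `ratings`, `item_cluster`) are hypothetical.

```python
from math import sqrt


def pearson(a, b, items):
    """Pearson correlation between two users over the listed items both have rated."""
    common = [i for i in items if i in a and i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)
               * sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0


def predict(ratings, item_cluster, user, item):
    """Predict a rating using only similarity measured inside the item's cluster."""
    # Similarity is computed over the other items of the same cluster only.
    same_cluster = [i for i, c in item_cluster.items()
                    if c == item_cluster[item] and i != item]
    target = ratings[user]
    mean_u = sum(target.values()) / len(target)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        w = pearson(target, theirs, same_cluster)
        if w <= 0:  # assumption: users with non-positive similarity are ignored
            continue
        mean_o = sum(theirs.values()) / len(theirs)
        num += w * (theirs[item] - mean_o)
        den += w
    # Fall back to the user's own mean when no similar neighbour rated the item.
    return mean_u + num / den if den else mean_u


# Hypothetical toy data: m* are "action" items, d* are "drama" items.
ratings = {
    "alice": {"m1": 5, "m2": 1, "d1": 2, "d2": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5, "d1": 5, "d2": 1},
    "carol": {"m1": 1, "m2": 5, "m3": 1, "d1": 1, "d2": 5},
}
item_cluster = {"m1": "action", "m2": "action", "m3": "action",
                "d1": "drama", "d2": "drama"}

prediction = predict(ratings, item_cluster, "alice", "m3")
```

In this toy data, alice and bob agree on the action items but disagree on the drama items, so their correlation over all four shared items is zero while their correlation inside the action cluster is positive. Restricting similarity to the cluster lets bob's rating of `m3` inform the prediction for alice.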

  • Statistical Mechanical Analysis of Fuzzy Clustering Based on Fuzzy Entropy

    Makoto YASUDA  Takeshi FURUHASHI  Shigeru OKUMA  

     
    PAPER-Computation and Computational Models

    Vol: E90-D No:6  Page(s): 883-888

    This paper deals with statistical mechanical characteristics of fuzzy clustering regularized with fuzzy entropy. We obtain the Fermi-Dirac distribution function as a membership function by regularizing the fuzzy c-means with fuzzy entropy. Then we formulate it as a direct annealing clustering, and examine the meanings of Fermi-Dirac function and fuzzy entropy from a statistical mechanical point of view, and show that this fuzzy clustering method is none other than the Fermi-Dirac statistics.
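
As a rough illustration of how a Fermi-Dirac-shaped membership function can arise from entropy regularization, the derivation below minimizes a fuzzy c-means style cost plus a fuzzy entropy term. The notation is assumed here and is not taken from the abstract: u_{ik} is the membership of datum k in cluster i, d_{ik} the squared distance to the cluster center, \beta an inverse-temperature parameter, and \alpha_k a Lagrange multiplier enforcing \sum_i u_{ik} = 1. The paper's exact formulation may differ.

```latex
% Assumed objective: clustering cost minus (1/beta) times the fuzzy entropy,
% with the normalization constraint added through multipliers alpha_k.
\begin{align}
F &= \sum_{k}\sum_{i} u_{ik}\, d_{ik}
   + \frac{1}{\beta}\sum_{k}\sum_{i}
     \bigl[\, u_{ik}\ln u_{ik} + (1-u_{ik})\ln(1-u_{ik}) \,\bigr]
   + \frac{1}{\beta}\sum_{k}\alpha_k \Bigl(\sum_{i} u_{ik} - 1\Bigr), \\
\frac{\partial F}{\partial u_{ik}}
  &= d_{ik} + \frac{1}{\beta}\ln\frac{u_{ik}}{1-u_{ik}} + \frac{\alpha_k}{\beta} = 0
  \quad\Longrightarrow\quad
  u_{ik} = \frac{1}{e^{\alpha_k + \beta d_{ik}} + 1}.
\end{align}
```

The resulting expression has the same functional form as the Fermi-Dirac distribution, with \beta d_{ik} playing the role of energy over temperature and -\alpha_k the role of the scaled chemical potential.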

  • Object Tracking with Target and Background Samples

    Chunsheng HUA  Haiyuan WU  Qian CHEN  Toshikazu WADA  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E90-D No:4  Page(s): 766-774

    In this paper, we present a general object tracking method based on a newly proposed pixel-wise clustering algorithm. Tracking an object in a cluttered environment is challenging because the target object may have a concave shape or apertures (e.g. a hand or a comb). In such cases, it is difficult to separate the target from the background completely by simply modifying the shape of the search area. Our algorithm solves the problem by 1) describing the target object by a set of pixels; 2) using a K-means based algorithm to detect all target pixels. To realize stable and reliable detection of target pixels, we first use a 5D feature vector to describe both the color ("Y, U, V") and the position ("x, y") of each pixel uniformly. This enables simultaneous adaptation to both the color and geometric features during tracking. Second, we use a variable ellipse model to describe the shape of the search area and to model the surrounding background. This guarantees stable object tracking under various geometric transformations. Robust tracking is realized by classifying the pixels within the search area into "target" and "background" groups with a K-means clustering based algorithm that uses "positive" and "negative" samples. We also propose a method that detects tracking failure and recovers from it during tracking by making use of both the "positive" and "negative" samples. This feature makes our method more reliable because it can rediscover the target after the target has been lost. Extensive experiments under various environments and conditions confirm the effectiveness and efficiency of the proposed algorithm.
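
The pixel classification step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: each pixel is a 5D tuple (Y, U, V, x, y), color and position are weighted equally, squared Euclidean distance is used, and a pixel is labeled "target" when it is closer to a target center (positive sample) than to any background sample (negative sample). The function names and pixel values are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 5D feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def classify(pixels, target_centers, background_samples):
    """Label each pixel by comparing its nearest positive and negative samples."""
    labels = []
    for f in pixels:
        d_target = min(sq_dist(f, c) for c in target_centers)
        d_background = min(sq_dist(f, b) for b in background_samples)
        labels.append("target" if d_target < d_background else "background")
    return labels


def update_target_centers(pixels, labels, target_centers):
    """One K-means style update: move each center to the mean of its target pixels."""
    groups = [[] for _ in target_centers]
    for f, label in zip(pixels, labels):
        if label == "target":
            nearest = min(range(len(target_centers)),
                          key=lambda j: sq_dist(f, target_centers[j]))
            groups[nearest].append(f)
    # Centers that attracted no pixels are left unchanged.
    return [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, target_centers)]


# Hypothetical pixels as (Y, U, V, x, y).
pixels = [(200, 120, 130, 10, 10), (198, 121, 129, 11, 10), (50, 128, 128, 30, 30)]
target_centers = [(200, 120, 130, 10, 10)]
background_samples = [(52, 128, 128, 32, 30)]

labels = classify(pixels, target_centers, background_samples)
new_centers = update_target_centers(pixels, labels, target_centers)
```

Because position is part of the feature vector, the updated centers follow the target in the image as well as in color space. In practice the color and position components would likely need separate scaling, which this sketch omits.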

  • TSK-Based Linguistic Fuzzy Model with Uncertain Model Output

    Keun-Chang KWAK  Dong-Hwa KIM  

     
    PAPER-Computation and Computational Models

    Vol: E89-D No:12  Page(s): 2919-2923

    We present a TSK (Takagi-Sugeno-Kang)-based Linguistic Fuzzy Model (TSK-LFM) with uncertain model output. Based on the Linguistic Model (LM) proposed by Pedrycz, we develop a comprehensive design framework. The main design process is composed of the automatic generation of the contexts, fuzzy rule extraction by Context-based Fuzzy C-Means (CFCM) clustering, connection of a bias term, and combination of TSK and linguistic context. Finally, we compare the performance of the presented models with that of other models on the coagulant dosing process in a water purification plant.

  • A New Two-Phase Approach to Fuzzy Modeling for Nonlinear Function Approximation

    Wooyong CHUNG  Euntai KIM  

     
    PAPER-Computation and Computational Models

    Vol: E89-D No:9  Page(s): 2473-2483

    Nonlinear modeling of complex irregular systems constitutes the essential part of many control and decision-making systems and fuzzy logic is one of the most effective algorithms to build such a nonlinear model. In this paper, a new approach to fuzzy modeling is proposed. The model considered herein is the well-known Sugeno-type fuzzy system. The fuzzy modeling algorithm suggested in this paper is composed of two phases: coarse tuning and fine tuning. In the first phase (coarse tuning), a successive clustering algorithm with the fuzzy validity measure (SCFVM) is proposed to find the number of the fuzzy rules and an initial fuzzy model. In the second phase (fine tuning), a moving genetic algorithm with partial encoding (MGAPE) is developed and used for optimized tuning of membership functions of the fuzzy model. Two computer simulation examples are provided to evaluate the performance of the proposed modeling approach and compare it with other modeling approaches.

  • Unsupervised and Semi-Supervised Extraction of Clusters from Hypergraphs

    Weiwei DU  Kohei INOUE  Kiichi URAHAMA  

     
    LETTER-Biological Engineering

    Vol: E89-D No:7  Page(s): 2315-2318

    We extend a graph spectral method for extracting clusters from graphs representing pairwise similarity between data to hypergraph data with hyperedges denoting higher order similarity between data. Our method is robust to noisy outlier data and the number of clusters can be easily determined. The unsupervised method extracts clusters sequentially in the order of the majority of clusters. We derive from the unsupervised algorithm a semi-supervised one which can extract any cluster irrespective of its majority. The performance of those methods is exemplified with synthetic toy data and real image data.

  • Implementation and Evaluation of an HMM-Based Korean Speech Synthesis System

    Sang-Jin KIM  Jong-Jin KIM  Minsoo HAHN  

     
    LETTER

    Vol: E89-D No:3  Page(s): 1116-1119

    The development and evaluation of a hidden Markov model (HMM)-based Korean speech synthesis system are described. Statistical HMMs for Korean speech units are trained with a hand-labeled speech database that includes contextual information about phoneme, morpheme, word phrase, utterance, and break strength. The developed system produces speech with fairly good prosody. The synthesized speech is evaluated and compared with that of our corpus-based unit-concatenating Korean text-to-speech system. The two systems were trained with the same manually labeled speech database.

  • Single-Channel Multiple Regression for In-Car Speech Enhancement

    Weifeng LI  Katsunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Enhancement

    Vol: E89-D No:3  Page(s): 1032-1039

    We address issues for improving hands-free speech enhancement and speech recognition performance in different car environments using a single distant microphone. This paper describes a new single-channel in-car speech enhancement method that estimates the log spectra of speech at a close-talking microphone based on the nonlinear regression of the log spectra of noisy signal captured by a distant microphone and the estimated noise. The proposed method provides significant overall quality improvements in our subjective evaluation on the regression-enhanced speech, and performed best in most objective measures. Based on our isolated word recognition experiments conducted under 15 real car environments, the proposed adaptive nonlinear regression approach shows an advantage in average relative word error rate (WER) reductions of 50.8% and 13.1%, respectively, compared to original noisy speech and ETSI advanced front-end (ETSI ES 202 050).

  • Two-Phased Bulk Insertion by Seeded Clustering for R-Trees

    Taewon LEE  Sukho LEE  

     
    PAPER-Database

    Vol: E89-D No:1  Page(s): 228-236

    With great advances in mobile technology and wireless communications, users expect to be online anytime, anywhere. However, due to the high cost of being online, applications are still implemented as partially connected to the server. In many data-intensive mobile client/server frameworks, it is a daunting task to archive and index the massive volume of complex data that is continuously added to the server whenever a mobile client comes online. In this paper, we propose a scalable technique called Seeded Clustering that allows us to maintain R-tree indexes by bulk insertion while keeping pace with high data arrival rates. Our approach uses a seed tree, which is copied from the top k levels of a target R-tree, to classify input data objects into clusters. We then build an R-tree for each of the clusters and insert the input R-trees into the target R-tree in bulk, one at a time. We present detailed algorithms for the seeded clustering and bulk insertion as well as the results of our extensive experimental study. The experimental results show that bulk insertion by seeded clustering outperforms the previously known methods in terms of insertion cost and the quality of the target R-trees as measured by their query performance.

  • Adaptive Clustering Technique Using Genetic Algorithms

    Nam Hyun PARK  Chang Wook AHN  Rudrapatna S. RAMAKRISHNA  

     
    LETTER-Data Mining

    Vol: E88-D No:12  Page(s): 2880-2882

    This paper proposes a genetically inspired adaptive clustering algorithm for numerical and categorical data sets. To this end, a unique encoding method and fitness functions are developed. The algorithm automatically discovers the actual number of clusters and efficiently performs clustering without unduly compromising cluster purity. Moreover, it outperforms existing clustering algorithms.

  • Robust Multi-Body Motion Segmentation Based on Fuzzy k-Subspace Clustering

    Xi LI  Zhengnan NING  Liuwei XIANG  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E88-D No:11  Page(s): 2609-2614

    The problem of multi-body motion segmentation is important in many computer vision applications. In this paper, we propose a novel algorithm called fuzzy k-subspace clustering for robust segmentation. The proposed method exploits the property that, under the orthographic camera model, the tracked feature points of moving objects reside in multiple subspaces. We compute a partition of the feature points into corresponding subspace clusters. First, we find a "soft partition" of the feature points based on the fuzzy k-subspace algorithm. The proposed fuzzy k-subspace algorithm iteratively minimizes the objective function using Weighted Singular Value Decomposition. Then the points with high partition confidence are gathered to form the subspace bases, and the remaining points are classified using their distance to the bases. The proposed method can handle missing data naturally, meaning that the feature points do not have to be visible throughout the sequence. The method is robust to noise and insensitive to initialization. Extensive experiments on synthetic and real data show the effectiveness of the proposed fuzzy k-subspace clustering algorithm.

  • A Distributed Clustering Method for Hierarchical Routing in Large-Scaled Wavelength Routed Networks

    Yukinobu FUKUSHIMA  Hiroaki HARAI  Shin'ichi ARAKAWA  Masayuki MURATA  

     
    PAPER

    Vol: E88-B No:10  Page(s): 3904-3913

    The scalability of routing protocols has been considered a key issue in large-scale wavelength-routed networks. Hierarchical routing scales well by yielding enormous reductions in routing table length, but it also increases path length. This increased path length in wavelength-routed networks leads to increased blocking probability because longer paths tend to have fewer free wavelength channels. However, if the routes assigned to longer paths have greater wavelength resources, we can expect that the blocking probability will not increase. In this paper, we propose a distributed node-clustering method that maximizes the number of lightpaths between nodes. The key idea behind our method is to construct node-clusters that have much greater wavelength resources from the ingress border nodes to the egress border nodes, which increases the wavelength resources on the routes of lightpaths between nodes. We evaluate the blocking probability for lightpath requests and the maximum table length in simulation experiments. We find that the proposed method significantly reduces the table length, while the blocking probability is almost the same as that without clustering.

  • Adaptive Neuro-Fuzzy Networks with the Aid of Fuzzy Granulation

    Keun-Chang KWAK  Dong-Hwa KIM  

     
    PAPER-Biocybernetics, Neurocomputing

    Vol: E88-D No:9  Page(s): 2189-2196

    In this paper, we present a method for identifying Adaptive Neuro-Fuzzy Networks (ANFN) of the Takagi-Sugeno-Kang (TSK) fuzzy type based on fuzzy granulation. We also develop a systematic approach to generating fuzzy if-then rules from given input-output data. The proposed ANFN is designed by the use of fuzzy granulation realized via context-based fuzzy clustering. This clustering technique builds information granules in the form of fuzzy sets and develops clusters by preserving the homogeneity of the clustered patterns associated with the input and output space. The experimental results reveal that, on Box-Jenkins gas furnace data and automobile MPG prediction, the proposed model performs better than Linguistic Models (LM) and Radial Basis Function Networks (RBFN) based on context-based fuzzy clustering introduced in the previous literature.

  • Tree-Structured Clustering Methods for Piecewise Linear-Transformation-Based Noise Adaptation

    Zhipeng ZHANG  Toshiaki SUGIMURA  Sadaoki FURUI  

     
    PAPER-Speech and Hearing

    Vol: E88-D No:9  Page(s): 2168-2176

    This paper proposes the application of tree-structured clustering to the processing of noisy speech collected under various SNR conditions in the framework of piecewise-linear transformation (PLT)-based HMM adaptation for noisy speech. Three kinds of clustering methods are described: a one-step clustering method that integrates noise and SNR conditions and two two-step clustering methods that construct trees for each SNR condition. According to the clustering results, a noisy speech HMM is made for each node of the tree structure. Based on the likelihood maximization criterion, the HMM that best matches the input speech is selected by tracing the tree from top to bottom, and the selected HMM is further adapted by linear transformation. The proposed methods are evaluated by applying them to a Japanese dialogue recognition system. The results confirm that the proposed methods are effective in recognizing digitally noise-added speech and actual noisy speech issued by a wide range of speakers under various noise conditions. The results also indicate that the one-step clustering method gives better performance than the two-step clustering methods.

  • Using Topic Keyword Clusters for Automatic Document Clustering

    Hsi-Cheng CHANG  Chiun-Chieh HSU  

     
    PAPER-Document Clustering

    Vol: E88-D No:8  Page(s): 1852-1860

    Data clustering is a technique for grouping similar data items together for convenient understanding. Conventional data clustering methods, including agglomerative hierarchical clustering and partitional clustering algorithms, frequently perform unsatisfactorily on large text collections, since their computational complexity increases very quickly with the number of data items. Poor clustering results degrade intelligent applications such as event tracking and information extraction. This paper presents an unsupervised document clustering method which identifies topic keyword clusters of the text corpus. The proposed method adopts a multi-stage process. First, an aggressive data cleaning approach is employed to reduce the noise in the free text and further identify the topic keywords in the documents. All extracted keywords are then grouped into topic keyword clusters using the k-nearest neighbor approach and the keyword clustering technique. Finally, all documents in the corpus are clustered based on the topic keyword clusters. The proposed method is assessed against conventional data clustering methods on a web news corpus. The experimental results show that the proposed method is an efficient and effective clustering approach.

  • Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition

    Weifeng LI  Chiyomi MIYAJIMA  Takanori NISHINO  Katsunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Enhancement

    Vol: E88-A No:7  Page(s): 1716-1723

    In this paper, we address issues in improving hands-free speech recognition performance in different car environments using multiple spatially distributed microphones. In the previous work, we proposed the multiple linear regression of the log spectra (MRLS) for estimating the log spectra of speech at a close-talking microphone. In this paper, the concept is extended to nonlinear regressions. Regressions in the cepstrum domain are also investigated. An effective algorithm is developed to adapt the regression weights automatically to different noise environments. Compared to the nearest distant microphone and adaptive beamformer (Generalized Sidelobe Canceller), the proposed adaptive nonlinear regression approach shows an advantage in the average relative word error rate (WER) reductions of 58.5% and 10.3%, respectively, for isolated word recognition under 15 real car environments.

  • Eigen Image Recognition of Pulmonary Nodules from Thoracic CT Images by Use of Subspace Method

    Gentaro FUKANO  Yoshihiko NAKAMURA  Hotaka TAKIZAWA  Shinji MIZUNO  Shinji YAMAMOTO  Kunio DOI  Shigehiko KATSURAGAWA  Tohru MATSUMOTO  Yukio TATENO  Takeshi IINUMA  

     
    PAPER-Biological Engineering

    Vol: E88-D No:6  Page(s): 1273-1283

    We previously proposed a recognition method for pulmonary nodules based on experimentally selected feature values (such as contrast and circularity) of pathologic candidate regions detected by our Variable N-Quoit (VNQ) filter. In this paper, we propose a new recognition method for pulmonary nodules that uses as feature values not experimentally selected features but the CT values themselves in a region of interest (ROI). The proposed method has two phases: learning and recognition. In the learning phase, the pathologic candidate regions are first classified into several clusters based on a principal component score. This score is calculated from the set of CT values in the ROI, which is regarded as a feature vector. Eigenvectors and eigenvalues are then calculated for each cluster by applying principal component analysis to the cluster. The eigenvectors (we call them "eigen-images") corresponding to the S largest eigenvalues are used as basis vectors for the subspaces of the clusters in the feature space. In the recognition phase, correlations are measured between the feature vector derived from the test data and the subspace spanned by the eigen-images. If the correlation with the nodule subspace is large, the pathologic candidate region is determined to be a nodule; otherwise, it is determined to be a normal organ. In the experiments, we first determine the optimal number of subspace dimensions and then demonstrate the robustness of our algorithm using simulated nodule images.
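
The recognition phase can be illustrated with a generic subspace-method sketch. It assumes that the eigen-images of each class are already available as orthonormal basis vectors, that "correlation" is measured as the fraction of the feature vector's squared norm lying in the subspace, and that a candidate is labeled a nodule when its correlation with the nodule subspace exceeds that with the normal subspace. These choices, the function names, and the three-dimensional toy vectors are assumptions for illustration and are not taken from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def subspace_similarity(x, basis):
    """Fraction of the squared norm of x that lies in span(basis).

    `basis` must be a list of orthonormal vectors (e.g. eigen-images).
    """
    norm_sq = dot(x, x)
    if norm_sq == 0:
        return 0.0
    return sum(dot(x, e) ** 2 for e in basis) / norm_sq


def recognize(x, nodule_basis, normal_basis):
    """Assumed decision rule: pick the class whose subspace correlates more with x."""
    s_nodule = subspace_similarity(x, nodule_basis)
    s_normal = subspace_similarity(x, normal_basis)
    return "nodule" if s_nodule > s_normal else "normal"


# Hypothetical orthonormal bases in a toy 3D feature space.
nodule_basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
normal_basis = [(0.0, 0.0, 1.0)]

label_a = recognize((3.0, 4.0, 1.0), nodule_basis, normal_basis)
label_b = recognize((0.0, 1.0, 5.0), nodule_basis, normal_basis)
```

In a real setting the feature vector would hold the CT values of the ROI, and the number of basis vectors per subspace would be the subspace dimension tuned in the experiments.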

  • Anchor Frame Detection in News Video Using Anchor Object Extraction

    Ki Tae PARK  Doo Sun HWANG  Young Shik MOON  

     
    LETTER

    Vol: E88-A No:6  Page(s): 1525-1528

    In this paper, an algorithm for anchor frame detection in news video is proposed, which consists of four steps. First, the cumulative histogram method is used to detect shot boundaries in order to segment a news video into video shots. Second, skin color information is used to detect face regions in each video shot. Third, color information of upper body regions is used to extract the anchor object. Finally, a graph-theoretic cluster analysis algorithm is used to classify the shots of the news video into anchor-person shots and non-anchor shots. Experimental results show the effectiveness of the proposed algorithm.

  • Voice Activity Detection Algorithm Based on Radial Basis Function Network

    Hong-Ik KIM  Sung-Kwon PARK  

     
    LETTER-Fundamental Theories for Communications

    Vol: E88-B No:4  Page(s): 1653-1657

    This paper proposes a Voice Activity Detection (VAD) algorithm using a Radial Basis Function (RBF) network. K-means clustering and the Least Mean Square (LMS) algorithm are used to adapt the RBF network to the underlying speech condition. The inputs to the RBF network are three parameters of a Code Excited Linear Prediction (CELP) coder, which works stably under various background noise levels. An adaptive hangover threshold is applied in the RBF-VAD to reduce errors, because the threshold value involves a trade-off in the VAD decision. The experimental results show that the proposed VAD algorithm achieves better performance than G.729 Annex B at any noise level.
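
A minimal sketch of an RBF network whose output weights are adapted with LMS is given below. It is only an illustration of the general technique: the centers are fixed by hand here (in the abstract they would come from k-means clustering), Gaussian basis functions with a shared width are assumed, the three-component inputs stand in for the unspecified CELP parameters, and the step size, decision threshold, and all names are hypothetical. The hangover logic is not modeled.

```python
from math import exp


def rbf_features(x, centers, width):
    """Gaussian activation of input x for each center."""
    return [exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * width ** 2))
            for c in centers]


def rbf_output(x, centers, width, weights):
    """Network output: weighted sum of the basis activations."""
    return sum(w * p for w, p in zip(weights, rbf_features(x, centers, width)))


def lms_step(x, target, centers, width, weights, mu=0.5):
    """One LMS update of the output weights: w <- w + mu * error * phi(x)."""
    phi = rbf_features(x, centers, width)
    err = target - sum(w * p for w, p in zip(weights, phi))
    return [w + mu * err * p for w, p in zip(weights, phi)]


# Hypothetical setup: one center for noise-like frames, one for speech-like frames.
centers = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
width = 0.5
weights = [0.0, 0.0]
samples = [((0.0, 0.0, 0.0), 0.0),   # noise frame, target 0
           ((1.0, 1.0, 1.0), 1.0)]   # speech frame, target 1

for _ in range(50):
    for x, t in samples:
        weights = lms_step(x, t, centers, width, weights)

# Assumed decision rule: output above 0.5 means speech.
is_speech = rbf_output((0.9, 1.0, 1.1), centers, width, weights) > 0.5
```

Only the linear output weights are trained by LMS, which keeps each update cheap enough to run frame by frame as the noise condition changes.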

  • Assessing the Quality of Fuzzy Partitions Using Relative Intersection

    Dae-Won KIM  Young-il KIM  Doheon LEE  Kwang Hyung LEE  

     
    PAPER-Computation and Computational Models

    Vol: E88-D No:3  Page(s): 594-602

    In this paper, conventional validity indexes are reviewed and the shortcomings of the fuzzy cluster validation index based on inter-cluster proximity are examined. Based on these considerations, a new cluster validity index is proposed for fuzzy partitions obtained from the fuzzy c-means algorithm. The proposed validity index is defined as the average value of the relative intersections of all possible pairs of fuzzy clusters in the system. It computes the overlap between two fuzzy clusters by considering the intersection of each data point in the overlap. The optimal number of clusters is obtained by minimizing the validity index with respect to c. Experiments in which the proposed validity index and several conventional validity indexes were applied to well known data sets highlight the superior qualities of the proposed index.
