Keyword Search Result

[Keyword] k-means clustering (18 hits)

Hits 1-18
  • Efficient Task Allocation Protocol for a Hybrid-Hierarchical Spatial-Aerial-Terrestrial Edge-Centric IoT Architecture Open Access

    Abbas JAMALIPOUR  Forough SHIRIN ABKENAR  

     
    INVITED PAPER

      Publicized:
    2021/08/17
      Vol:
    E105-B No:2
      Page(s):
    116-130

    In this paper, we propose a novel Hybrid-Hierarchical spatial-aerial-Terrestrial Edge-Centric (H2TEC) architecture for space-air integrated Internet of Things (IoT) networks. H2TEC comprises unmanned aerial vehicles (UAVs) that act as mobile fog nodes to provide the required services for terminal nodes (TNs) in cooperation with the satellites. TNs in H2TEC offload their generated tasks to the UAVs for further processing. Due to the limited energy budget of TNs, a novel task allocation protocol, named TOP, is proposed to minimize the energy consumption of TNs while guaranteeing the outage probability and network reliability, for which the transmission rate of TNs is optimized. TOP also takes advantage of energy harvesting, by which the low earth orbit satellites transfer energy to the UAVs when the remaining energy of the UAVs is below a predefined threshold. To this end, the harvested power of the UAVs is optimized alongside the corresponding harvesting time so that the UAVs can improve the network throughput by processing more bits. Numerical results reveal that TOP outperforms the baseline method in critical situations where more power is required to process the task. It is also found that even in such situations, the energy harvesting mechanism provided in TOP yields a more efficient network throughput.

  • Anomaly Prediction Based on Machine Learning for Memory-Constrained Devices

    Yuto KITAGAWA  Tasuku ISHIGOOKA  Takuya AZUMI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2019/05/30
      Vol:
    E102-D No:9
      Page(s):
    1797-1807

    This paper proposes an anomaly prediction method based on k-means clustering that targets embedded devices with memory constraints. With this method, anomalies can be predicted by checking control system behavior in detail using k-means clustering. However, continuing to cluster is difficult because, as in existing k-means clustering methods, data accumulate in memory, which is problematic for embedded devices with low memory capacity. Therefore, we also propose a k-means clustering method that can continue clustering on infinite stream data. The proposed k-means clustering method is based on online k-means clustering with sequential processing. It stores only the data required for anomaly prediction and releases other data from memory. Due to these characteristics, the proposed k-means clustering enables anomaly prediction with reduced memory consumption. Experiments were performed on actual control system data for anomaly prediction. Experimental results show that the proposed anomaly prediction method can predict anomalies, and that the proposed k-means clustering predicts anomalies about as well as standard k-means clustering while reducing memory consumption. Moreover, the proposed k-means clustering gives better anomaly prediction results than existing online k-means clustering.
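
    The abstract above builds on online k-means with sequential processing. As a rough illustration only, the sketch below shows the textbook sequential k-means update, not the method proposed in the paper: each incoming sample moves its nearest center and can then be discarded, so memory use does not grow with the stream. The function name `online_kmeans_update`, the initial centers, and the sample points are all assumptions made for this example.

```python
import math

def online_kmeans_update(centers, counts, x):
    """Textbook sequential k-means step (illustrative; not the paper's method).

    Finds the center nearest to x, moves it toward x by 1/count,
    and returns its index. The sample x does not need to be stored.
    """
    j = min(range(len(centers)), key=lambda i: math.dist(centers[i], x))
    counts[j] += 1
    eta = 1.0 / counts[j]  # step size shrinks as the cluster sees more samples
    centers[j] = [c + eta * (xi - c) for c, xi in zip(centers[j], x)]
    return j

# Hypothetical starting state: two centers, each counted as one sample.
centers = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]

# Samples arrive one at a time and are dropped after the update.
for point in [[1.0, 1.0], [9.0, 11.0], [0.0, 2.0]]:
    online_kmeans_update(centers, counts, point)
```

    With the 1/count step size, each center stays equal to the running mean of the samples assigned to it, which is why no history has to be kept.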

  • Multi Long-Short Term Memory Models for Short Term Traffic Flow Prediction

    Zelong XUE  Yang XUE  

     
    LETTER-Biocybernetics, Neurocomputing

      Publicized:
    2018/09/18
      Vol:
    E101-D No:12
      Page(s):
    3272-3275

    Many single-model methods have been applied to real-time short-term traffic flow prediction. However, since traffic flow data is a mixture of various components, the performance of a single model is limited. Therefore, we propose Multi-Long-Short Term Memory Models, which improve traffic flow prediction accuracy compared with state-of-the-art models.

  • An FPGA Realization of a Random Forest with k-Means Clustering Using a High-Level Synthesis Design

    Akira JINGUJI  Shimpei SATO  Hiroki NAKAHARA  

     
    PAPER-Emerging Applications

      Publicized:
    2017/11/17
      Vol:
    E101-D No:2
      Page(s):
    354-362

    A random forest (RF) is a kind of ensemble machine learning algorithm used for classification and regression. It consists of multiple decision trees that are built from randomly sampled data. Compared with other machine learning algorithms, the RF is simple and offers fast learning and identification. It is widely used in various recognition systems. Since it requires an unbalanced traversal of each tree and communication among all of them, the random forest is not well suited to SIMD architectures such as GPUs. Although accelerators using FPGAs have been proposed, such implementations were based on HDL design. Thus, they required longer design time than software-based realizations. In our previous work, we showed a high-level synthesis design of the RF including a fully pipelined architecture and all-to-all communication. In this paper, to further reduce the amount of hardware, we use k-means clustering to share comparators of the branch nodes on the decision tree. Also, we develop the krange tool flow, which generates the bitstream from a small number of hyperparameters. Since the proposed tool flow is based on the high-level synthesis design, we can obtain a high-performance RF with a short design time compared with the conventional HDL design. We implemented the RF on the Xilinx Inc. ZC702 evaluation board. Compared with the CPU (Intel Xeon (R) E5607 Processor) and the GPU (NVIDIA GeForce Titan) implementations, the FPGA realization was 8.4 times faster than the CPU one and 62.8 times faster than the GPU one. In power consumption efficiency, the FPGA realization was 7.8 times better than the CPU one and 385.9 times better than the GPU one.

  • Expose Spliced Photographic Basing on Boundary and Noise Features

    Jun HOU  Yan CHENG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2015/04/01
      Vol:
    E98-D No:7
      Page(s):
    1426-1429

    The paper proposes an algorithm to expose spliced photographs. First, a graph-based segmentation, which defines a predictor to measure boundary evidence between two neighboring regions, is used to make greedy decisions. Then the algorithm obtains a prediction error image using non-negative linear least-squares prediction. For each pair of segmented neighboring regions, the proposed algorithm gathers their statistical features and calculates features of the gray level co-occurrence matrix. K-means clustering is applied to create a dictionary, and the vector quantization histogram is taken as the fixed-length result vector. For a tampered image, the noise follows a Gaussian distribution with zero mean. The proposed method checks the similarity between the noise distribution and a zero-mean Gaussian distribution, followed by local flatness and texture measurements. Finally, all features are fed to a support vector machine classifier. The algorithm has low computational cost. Experiments show its effectiveness in exposing forgery.
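
    The dictionary and vector quantization histogram step mentioned above can be pictured with a generic sketch. It assumes the dictionary (codebook) has already been produced by k-means; the function name `vq_histogram`, the codebook values, and the feature vectors are hypothetical and are not taken from the paper.

```python
import math

def vq_histogram(features, codebook):
    """Map each feature vector to its nearest codeword and return a
    normalized histogram whose length equals the codebook size."""
    hist = [0.0] * len(codebook)
    for f in features:
        nearest = min(range(len(codebook)),
                      key=lambda i: math.dist(codebook[i], f))
        hist[nearest] += 1.0
    total = sum(hist)
    # Normalize so that inputs with different feature counts are comparable.
    return [h / total for h in hist] if total > 0 else hist

# Hypothetical codebook, standing in for k-means cluster centers.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
features = [[0.1, 0.0], [0.9, 1.2], [4.0, 5.0], [5.0, 6.0]]
histogram = vq_histogram(features, codebook)
```

    Whatever the number of input features, the result always has one bin per codeword, which is what gives a classifier a fixed-length input.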

  • Kernel-Reliability-Based K-Means (KRKM) Clustering Algorithm and Image Processing

    Chunsheng HUA  Juntong QI  Jianda HAN  Haiyuan WU  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E97-D No:9
      Page(s):
    2423-2433

    In this paper, we introduce a novel Kernel-Reliability-based K-Means (KRKM) clustering algorithm for categorizing an unknown dataset under noisy conditions. Compared with conventional clustering algorithms, the proposed KRKM algorithm measures both the reliability and the similarity for classifying data into its neighbor clusters by dynamic kernel functions, where noisy data are rejected by being given low reliability. The reliability for classifying data is measured by a dynamic kernel function whose window size is determined by the triangular relationship from the data item to its two nearest clusters. The similarity from a data item to its neighbor clusters is measured by another adaptive kernel function, which takes into account not only the similarity from data to clusters but also that between its two nearest clusters. The main contribution of this work lies in introducing the dynamic kernel functions to evaluate both the reliability and similarity for clustering, which makes the proposed algorithm more efficient in dealing with very noisy data. Through various experiments, the efficiency and effectiveness of the proposed algorithm have been confirmed.

  • IDDQ Outlier Screening through Two-Phase Approach: Clustering-Based Filtering and Estimation-Based Current-Threshold Determination

    Michihiro SHINTANI  Takashi SATO  

     
    PAPER-Dependable Computing

      Vol:
    E97-D No:8
      Page(s):
    2095-2104

    We propose a novel IDDQ outlier screening flow through a two-phase approach: a clustering-based filtering and an estimation-based current-threshold determination. In the proposed flow, a clustering technique first filters out chips that have high IDDQ current. Then, in the current-threshold determination phase, device parameters of the unfiltered chips are estimated based on measured IDDQ currents through Bayesian inference. The estimated device parameters are further used to determine a statistical leakage current distribution for each test pattern and to calculate a suitable current threshold. Numerical experiments using a virtual wafer show that our proposed technique is 14 times more accurate than the nearest neighbor residual (NNR) method and can achieve 80% of the test escape in the case of small leakage faults whose ratios of leakage fault sizes to the nominal IDDQ current are above 40%.

  • Online Learned Player Recognition Model Based Soccer Player Tracking and Labeling for Long-Shot Scenes

    Weicun XU  Qingjie ZHAO  Yuxia WANG  Xuanya LI  

     
    PAPER-Pattern Recognition

      Vol:
    E97-D No:1
      Page(s):
    119-129

    Soccer player tracking and labeling suffer from the similar appearance of the players in the same team, especially in long-shot scenes where the faces and the numbers of the players are too blurry to identify. In this paper, we propose an efficient multi-player tracking system. The tracking system takes the detection responses of a human detector as inputs. To realize real-time player detection, we generate a spatial proposal to minimize the scanning scope of the detector. The tracking system utilizes discriminative appearance models trained using the online Boosting method to reduce data-association ambiguity caused by the appearance similarity of the players. We also propose to build an online learned player recognition model which can be embedded in the tracking system to perform online player recognition and labeling in tracking applications for long-shot scenes in two stages. At the first stage, to build the model, we utilize the fast k-means clustering method instead of classic k-means clustering to build and update a visual word vocabulary in an efficient online manner, using the informative descriptors extracted from the training samples drawn at each time step of multi-player tracking. The first stage finishes when the vocabulary is ready. At the second stage, given the obtained visual word vocabulary, an incremental vector quantization strategy is used to recognize and label each tracked player. We also perform importance recognition validation to avoid mistakenly recognizing an outlier, namely, a person we do not need to recognize, as a player. Both quantitative and qualitative experimental results on the long-shot video clips of a real soccer game video demonstrate that the proposed player recognition model performs much better than some state-of-the-art online learned models, and that our tracking system also performs quite effectively even under very complicated situations.

  • A K-Means-Based Multi-Prototype High-Speed Learning System with FPGA-Implemented Coprocessor for 1-NN Searching

    Fengwei AN  Tetsushi KOIDE  Hans Jürgen MATTAUSCH  

     
    PAPER-Biocybernetics, Neurocomputing

      Vol:
    E95-D No:9
      Page(s):
    2327-2338

    In this paper, we propose a hardware solution for overcoming the problem of high computational demands in a nearest neighbor (NN) based multi-prototype learning system. The multiple prototypes are obtained by a high-speed K-means clustering algorithm utilizing a concept of software-hardware cooperation that takes advantage of the flexibility of the software and the efficiency of the hardware. The one nearest neighbor (1-NN) classifier is used to recognize an object by searching for the nearest Euclidean distance among the prototypes. The major deficiency in conventional implementations for both K-means and 1-NN is the high computational demand of the nearest neighbor searching. This deficiency is resolved by an FPGA-implemented coprocessor that is a VLSI circuit for searching the nearest Euclidean distance. The coprocessor requires 12.9% logic elements and 58% block memory bits of an Altera Stratix III E110 FPGA device. The hardware communicates with the software by a PCI Express (×4) local-bus-compatible interface. We benchmark our learning system against the popular case of handwritten digit recognition in which abundant previous works for comparison are available. In the case of the MNIST database, we could attain the most efficient accuracy rate of 97.91% with 930 prototypes, the learning speed of 1.3 × 10^-4 s/sample and the classification speed of 3.94 × 10^-8 s/character.
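
    The 1-NN search over prototypes that the abstract offloads to hardware can be sketched in software as follows. This is a plain illustration of nearest-Euclidean-distance classification, not the coprocessor design; the function name `nearest_prototype`, the prototypes, and the labels are assumptions made for this example.

```python
def nearest_prototype(prototypes, labels, x):
    """Return the label of the prototype closest to x.

    Squared Euclidean distance is used because it gives the same
    ranking as Euclidean distance without needing a square root.
    """
    def sq_dist(p):
        return sum((pi - xi) ** 2 for pi, xi in zip(p, x))

    best = min(range(len(prototypes)), key=lambda i: sq_dist(prototypes[i]))
    return labels[best]

# Hypothetical prototypes, standing in for K-means cluster centers.
prototypes = [[0.0, 0.0], [4.0, 4.0], [0.0, 5.0]]
labels = ["a", "b", "c"]
predicted = nearest_prototype(prototypes, labels, [3.0, 3.5])
```

    Every query is compared with every prototype, so the cost grows with the number of prototypes times the feature dimension, which is the computational demand the abstract identifies.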

  • A Support Vector and K-Means Based Hybrid Intelligent Data Clustering Algorithm

    Liang SUN  Shinichi YOSHIDA  Yanchun LIANG  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E94-D No:11
      Page(s):
    2234-2243

    Support vector clustering (SVC), a recently developed unsupervised learning algorithm, has been successfully applied to solving many real-life data clustering problems. However, its effectiveness and advantages deteriorate when it is applied to solving complex real-world problems, e.g., those with a large proportion of noisy data points and with connecting clusters. This paper proposes a support vector and K-Means based hybrid algorithm to improve the performance of SVC. A new SVC training method is developed based on analysis of a Gaussian kernel radius function. An empirical study is conducted to guide better selection of the standard deviation of the Gaussian kernel. In the proposed algorithm, first, the outliers which increase problem complexity are identified and removed by training a global SVC. The refined data set is then clustered by a kernel-based K-Means algorithm. Finally, several local SVCs are trained for the clusters and then each removed data point is labeled according to the distance from it to the local SVCs. Since it exploits the advantages of both SVC and K-Means, the proposed algorithm is capable of clustering compact and arbitrarily organized data sets and of increasing robustness to outliers and connecting clusters. Experiments are conducted on 2-D data sets generated by mixture models and benchmark data sets taken from the UCI machine learning repository. The cluster error rate is lower than 3.0% for all the selected data sets. The results demonstrate that the proposed algorithm compares favorably with existing SVC algorithms.

  • O-means: An Optimized Clustering Method for Analyzing Spam Based Attacks

    Jungsuk SONG  Daisuke INOUE  Masashi ETO  Hyung Chan KIM  Koji NAKAO  

     
    PAPER-Network Security

      Vol:
    E94-A No:1
      Page(s):
    245-254

    In recent years, the number of spam emails has been dramatically increasing and spam is recognized as a serious internet threat. Most recent spam emails are being sent by bots which often operate with others in the form of a botnet, and skillful spammers try to conceal their activities from spam analyzers and spam detection technology. In addition, most spam messages contain URLs that lure spam receivers to malicious Web servers for the purpose of carrying out various cyber attacks such as malware infection, phishing attacks, etc. In order to cope with spam based attacks, there have been many efforts made towards the clustering of spam emails based on similarities between them. The spam clusters obtained from the clustering of spam emails can be used to identify the infrastructure of spam sending systems and malicious Web servers, and how they are grouped and correlate with each other, and to minimize the time needed for analyzing Web pages. Therefore, it is very important to improve the accuracy of the spam clustering as much as possible so as to analyze spam based attacks more accurately. In this paper, we present an optimized spam clustering method, called O-means, based on the K-means clustering method, which is one of the most widely used clustering methods. By examining three weeks of spam gathered in our SMTP server, we observed that the accuracy of the O-means clustering method is about 87% which is superior to the previous clustering methods. In addition, we define 12 statistical features to compare similarity between spam emails, and we determined a set of optimized features which makes the O-means clustering method more effective.

  • Differentiating Honeycombed Images from Normal HRCT Lung Images

    Aamir Saeed MALIK  Tae-Sun CHOI  

     
    LETTER-Biological Engineering

      Vol:
    E92-D No:5
      Page(s):
    1218-1221

    A classification method is presented for differentiating honeycombed High Resolution Computed Tomographic (HRCT) images from normal HRCT images. For successful classification of honeycombed HRCT images, a complete set of methods and algorithms is described from segmentation to extraction to feature selection to classification. Wavelet energy is selected as a feature for classification using K-means clustering. Test data of 20 patients are used to validate the method.

  • Security and Correctness Analysis on Privacy-Preserving k-Means Clustering Schemes

    Chunhua SU  Feng BAO  Jianying ZHOU  Tsuyoshi TAKAGI  Kouichi SAKURAI  

     
    LETTER-Cryptography and Information Security

      Vol:
    E92-A No:4
      Page(s):
    1246-1250

    Due to the fast development of the Internet and related IT technologies, it has become easier and easier to access a large amount of data. k-means clustering is a powerful and frequently used technique in data mining. Many research papers about privacy-preserving k-means clustering have been published. In this paper, we analyze the existing privacy-preserving k-means clustering schemes based on cryptographic techniques. We show that those schemes cause privacy breaches and cannot output correct results due to faults in the protocol construction. Furthermore, we analyze our proposal as an option to mitigate such problems, although it still leaks intermediate information during the computation.

  • RK-Means Clustering: K-Means with Reliability

    Chunsheng HUA  Qian CHEN  Haiyuan WU  Toshikazu WADA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E91-D No:1
      Page(s):
    96-104

    This paper presents an RK-means clustering algorithm which is developed for reliable data grouping by introducing a new reliability evaluation to the K-means clustering algorithm. The conventional K-means clustering algorithm has two shortfalls: 1) the clustering result becomes unreliable if the assumed number of clusters is incorrect; 2) during the update of a cluster center, all the data points belonging to that cluster are used equally without considering how distant they are from the cluster center. In this paper, we introduce a new reliability evaluation to the K-means clustering algorithm by considering the triangular relationship among each data point and its two nearest cluster centers. We applied the proposed algorithm to track objects in video sequences and confirmed its effectiveness and advantages.

  • Object Tracking with Target and Background Samples

    Chunsheng HUA  Haiyuan WU  Qian CHEN  Toshikazu WADA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E90-D No:4
      Page(s):
    766-774

    In this paper, we present a general object tracking method based on a newly proposed pixel-wise clustering algorithm. Tracking an object in a cluttered environment is challenging because a target object may have a concave shape or apertures (e.g. a hand or a comb). In those cases, it is difficult to separate the target from the background completely by simply modifying the shape of the search area. Our algorithm solves the problem by 1) describing the target object by a set of pixels; 2) using a K-means based algorithm to detect all target pixels. To realize stable and reliable detection of target pixels, we first use a 5D feature vector to describe both the color ("Y, U, V") and the position ("x, y") of each pixel uniformly. This enables simultaneous adaptation to both the color and geometric features during tracking. Second, we use a variable ellipse model to describe the shape of the search area and to model the surrounding background. This guarantees stable object tracking under various geometric transformations. Robust tracking is realized by classifying the pixels within the search area into "target" and "background" groups with a K-means clustering based algorithm that uses the "positive" and "negative" samples. We also propose a method that can detect tracking failure and recover from it during tracking by making use of both the "positive" and "negative" samples. This feature makes our method a more reliable tracking algorithm because it can find the target again after the target has been lost. Through extensive experiments under various environments and conditions, the effectiveness and efficiency of the proposed algorithm are confirmed.

  • Single-Channel Multiple Regression for In-Car Speech Enhancement

    Weifeng LI  Katsunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Enhancement

      Vol:
    E89-D No:3
      Page(s):
    1032-1039

    We address issues for improving hands-free speech enhancement and speech recognition performance in different car environments using a single distant microphone. This paper describes a new single-channel in-car speech enhancement method that estimates the log spectra of speech at a close-talking microphone based on the nonlinear regression of the log spectra of noisy signal captured by a distant microphone and the estimated noise. The proposed method provides significant overall quality improvements in our subjective evaluation on the regression-enhanced speech, and performed best in most objective measures. Based on our isolated word recognition experiments conducted under 15 real car environments, the proposed adaptive nonlinear regression approach shows an advantage in average relative word error rate (WER) reductions of 50.8% and 13.1%, respectively, compared to original noisy speech and ETSI advanced front-end (ETSI ES 202 050).

  • Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition

    Weifeng LI  Chiyomi MIYAJIMA  Takanori NISHINO  Katsunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Enhancement

      Vol:
    E88-A No:7
      Page(s):
    1716-1723

    In this paper, we address issues in improving hands-free speech recognition performance in different car environments using multiple spatially distributed microphones. In the previous work, we proposed the multiple linear regression of the log spectra (MRLS) for estimating the log spectra of speech at a close-talking microphone. In this paper, the concept is extended to nonlinear regressions. Regressions in the cepstrum domain are also investigated. An effective algorithm is developed to adapt the regression weights automatically to different noise environments. Compared to the nearest distant microphone and adaptive beamformer (Generalized Sidelobe Canceller), the proposed adaptive nonlinear regression approach shows an advantage in the average relative word error rate (WER) reductions of 58.5% and 10.3%, respectively, for isolated word recognition under 15 real car environments.

  • Voice Activity Detection Algorithm Based on Radial Basis Function Network

    Hong-Ik KIM  Sung-Kwon PARK  

     
    LETTER-Fundamental Theories for Communications

      Vol:
    E88-B No:4
      Page(s):
    1653-1657

    This paper proposes a Voice Activity Detection (VAD) algorithm using a Radial Basis Function (RBF) network. The k-means clustering and Least Mean Square (LMS) algorithms are used to adapt the RBF network to the underlying speech condition. The inputs for the RBF network are three parameters of a Code Excited Linear Prediction (CELP) coder, which works stably under various background noise levels. An adaptive hangover threshold is applied in RBF-VAD to reduce errors, because the threshold value has a trade-off effect on the VAD decision. The experimental results show that the proposed VAD algorithm achieves better performance than G.729 Annex B at any noise level.