
Keyword Search Result

[Keyword] k-nearest neighbor (11 hits)

1-11 of 11 hits
  • A Personalised Session-Based Recommender System with Sequential Updating Based on Aggregation of Item Embeddings Open Access

    Yuma NAGI  Kazushi OKAMOTO  

     
    PAPER

      Publicized:
    2024/01/09
      Vol:
    E107-D No:5
      Page(s):
    638-649

    The study proposes a personalised session-based recommender system that embeds items by using Word2Vec and sequentially updates the session and user embeddings with the hierarchicalization and aggregation of item embeddings. To process a recommendation request, the system constructs a real-time user embedding that considers users’ general preferences and sequential behaviour to handle short-term changes in user preferences with a low computational cost. The system performance was experimentally evaluated in terms of the accuracy, diversity, and novelty of the ranking of recommended items and the training and prediction times of the system for three different datasets. The results of these evaluations were then compared with those of the five baseline systems. According to the evaluation experiment, the proposed system achieved a relatively high recommendation accuracy compared with baseline systems and the diversity and novelty scores of the proposed system did not fall below 90% for any dataset. Furthermore, the training times of the Word2Vec-based systems, including the proposed system, were shorter than those of FPMC and GRU4Rec. The evaluation results suggest that the proposed recommender system succeeds in keeping the computational cost for training low while maintaining high-level recommendation accuracy, diversity, and novelty.
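
    The sequential updating described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the aggregation function, so the running mean below is an assumption, and `RunningMeanEmbedding`, `rank_items`, and the toy item vectors are hypothetical names and values, not taken from the paper.

```python
import math


class RunningMeanEmbedding:
    """Keeps the mean of all item vectors seen so far (assumed aggregation)."""

    def __init__(self, dim):
        self.vector = [0.0] * dim
        self.count = 0

    def update(self, item_vec):
        # Incremental mean: no need to store or re-read earlier items.
        self.count += 1
        self.vector = [m + (x - m) / self.count
                       for m, x in zip(self.vector, item_vec)]
        return self.vector


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def rank_items(query_vec, item_vecs):
    """Return item ids sorted by descending cosine similarity to query_vec."""
    return sorted(item_vecs,
                  key=lambda item: cosine(query_vec, item_vecs[item]),
                  reverse=True)


# Toy 2-dimensional item embeddings (hypothetical values).
items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
session = RunningMeanEmbedding(dim=2)
session.update([1.0, 3.0])
session.update([3.0, 5.0])
ranking = rank_items(session.vector, items)
```

    Each update costs time proportional to the embedding dimension, which is one plausible way to keep the per-request cost low; the paper's actual hierarchy of session and user embeddings may differ.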

  • Packer Identification Method for Multi-Layer Executables Using Entropy Analysis with k-Nearest Neighbor Algorithm

    Ryoto OMACHI  Yasuyuki MURAKAMI  

     
    LETTER

      Publicized:
    2022/08/16
      Vol:
    E106-A No:3
      Page(s):
    355-357

    The cost of damage caused by malware has been increasing worldwide. Malware is usually packed so that it evades detection. Identifying the packers is a hard task even for professional malware analysts, especially when the malware is packed in multiple layers. In this letter, we propose a method for identifying the packers of multi-layer packed malware by using the k-nearest neighbor algorithm with entropy analysis of the malware.
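
    As a rough illustration of combining entropy analysis with k-nearest neighbor classification, the sketch below computes Shannon entropy over chunks of a byte string and classifies the resulting profile by majority vote. The chunking scheme, the feature layout, and the labels `packer_A` and `packer_B` are assumptions for illustration; the letter's actual features are not given in the abstract.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (range 0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(data).values())


def entropy_profile(data: bytes, n_windows: int = 4) -> list:
    """Split data into n_windows chunks and return the entropy of each chunk."""
    size = max(1, math.ceil(len(data) / n_windows))
    chunks = [data[i * size:(i + 1) * size] for i in range(n_windows)]
    return [shannon_entropy(chunk) for chunk in chunks]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-window entropy profiles with made-up labels.
train = [
    ([1.0, 1.0], "packer_A"),
    ([1.2, 0.9], "packer_A"),
    ([7.5, 7.9], "packer_B"),
    ([7.8, 7.7], "packer_B"),
]
predicted = knn_predict(train, [7.6, 7.8], k=3)
```

    Packed or encrypted regions tend to have entropy close to 8 bits per byte, which is why entropy is a common feature for this kind of task.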

  • Recursive Nearest Neighbor Graph Partitioning for Extreme Multi-Label Learning

    Yukihiro TAGAMI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2018/11/30
      Vol:
    E102-D No:3
      Page(s):
    579-587

    As the data size of Web-related multi-label classification problems continues to increase, the label space has also grown extremely large. For example, the number of labels appearing in Web page tagging and E-commerce recommendation tasks reaches hundreds of thousands or even millions. In this paper, we propose a graph partitioning tree (GPT), which is a novel approach for extreme multi-label learning. At each internal node of the tree, the GPT learns a linear separator that partitions the feature space while taking into account an approximate k-nearest neighbor graph of the label vectors. We also developed a simple sequential optimization procedure for learning the linear binary classifiers. Extensive experiments on large-scale real-world data sets showed that our method achieves better prediction accuracy than state-of-the-art tree-based methods, while maintaining fast prediction.

  • Development of License Plate Recognition on Complex Scene with Plate-Style Classification and Confidence Scoring Based on KNN

    Vince Jebryl MONTERO  Yong-Jin JEONG  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2018/08/24
      Vol:
    E101-D No:12
      Page(s):
    3181-3189

    This paper presents an approach to developing an automatic license plate recognition (ALPR) system for complex scenes. A plate-style classification method is also proposed to address the inherent challenges for ALPR in a system that uses multiple plate styles (e.g., different fonts, multiple plate layouts, variations in character sequences), as is the case in the current Philippine license plate system. Methods are proposed for each ALPR module: plate detection, character segmentation, and character recognition. K-nearest neighbor (KNN) is used as a classifier for character recognition, together with a proposed confidence score that rates the decision made by the classifier. A small dataset of Philippine license plates that nonetheless covers the relevant features of complex ALPR scenarios is prepared. Using the proposed system on the prepared dataset, the performance of the system is evaluated on different categories of complex scenes. The proposed algorithm structure shows promising results and yields a higher overall accuracy than existing ALPR systems on the dataset, which consists mostly of complex scenes.

  • Speeding up Extreme Multi-Label Classifier by Approximate Nearest Neighbor Search

    Yukihiro TAGAMI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2018/08/06
      Vol:
    E101-D No:11
      Page(s):
    2784-2794

    Extreme multi-label classification methods have been widely used in Web-scale classification tasks such as Web page tagging and product recommendation. In this paper, we present a novel graph embedding method called “AnnexML”. At the training step, AnnexML constructs a k-nearest neighbor graph of label vectors and attempts to reproduce the graph structure in the embedding space. Prediction is performed by an approximate nearest neighbor search method that efficiently explores the learned k-nearest neighbor graph in the embedding space. We conducted evaluations on several large-scale real-world data sets and compared our method with recent state-of-the-art methods. Experimental results show that AnnexML can significantly improve prediction accuracy, especially on data sets that have a larger label space. In addition, AnnexML improves the trade-off between prediction time and accuracy. At the same level of accuracy, prediction with AnnexML was up to 58 times faster than with SLEEC, a state-of-the-art embedding-based method.

  • Scalable and Parameterized Architecture for Efficient Stream Mining

    Li ZHANG  Dawei LI  Xuecheng ZOU  Yu HU  Xiaowei XU  

     
    PAPER-Systems and Control

      Vol:
    E101-A No:1
      Page(s):
    219-231

    With sensor-based devices growing by billions each year, there is an urgent need for stream mining of the massive data streams these devices produce. Cloud computing is a competitive choice for this task thanks to its powerful computational capabilities, but it sacrifices real-time capability and energy efficiency. Application-specific integrated circuits (ASICs) offer high performance and efficiency but are not cost-effective for diverse applications. General-purpose microcontrollers offer low performance. Therefore, it is a challenge to perform stream mining on these low-cost devices with scalability and efficiency. In this paper, we introduce an FPGA-based scalable and parameterized architecture for stream mining. In particular, Dynamic Time Warping (DTW) based k-Nearest Neighbor (kNN) is adopted in the architecture. Two processing element (PE) rings for DTW and kNN are designed to achieve parameterization and scalability with high performance. We implement the proposed architecture on an FPGA and perform a comprehensive performance evaluation. The experimental results indicate that, compared to a multi-core CPU-based implementation, our approach achieves a speedup of over one order of magnitude and an improvement in energy efficiency of three orders of magnitude.
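
    To make the DTW-based kNN step concrete, here is a minimal software sketch. It is not the paper's hardware design: the absolute-difference local cost, the function names, and the toy sequences are assumptions made for illustration.

```python
import math
from collections import Counter


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    Uses |x - y| as the local cost and keeps only two rows of the
    cost table, so memory grows with len(b) rather than len(a) * len(b).
    """
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def knn_dtw_classify(train, query, k=3):
    """Majority label among the k training sequences nearest to query under DTW.

    train is a list of (sequence, label) pairs.
    """
    nearest = sorted(train, key=lambda pair: dtw_distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labeled sequences (hypothetical data).
train = [
    ([0, 1, 2, 3], "rising"),
    ([0, 0, 1, 2, 3], "rising"),
    ([3, 2, 1, 0], "falling"),
    ([3, 3, 2, 1, 0], "falling"),
]
label = knn_dtw_classify(train, [0, 1, 1, 2, 3], k=3)
```

    The inner loop's dependence on the three neighboring cells is what makes DTW a natural fit for pipelined or ring-connected processing elements, although the mapping used in the paper is not described in the abstract.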

  • Multiple k-Nearest Neighbor Classifier and Its Application to Tissue Characterization of Coronary Plaque

    Eiji UCHINO  Ryosuke KUBOTA  Takanori KOGA  Hideaki MISAWA  Noriaki SUETAKE  

     
    PAPER-Biological Engineering

      Publicized:
    2016/04/15
      Vol:
    E99-D No:7
      Page(s):
    1920-1927

    In this paper, we propose a novel classification method based on the multiple k-nearest neighbor (MkNN) classifier and show its practical application to medical image processing. The proposed method performs fine classification when each observation is given as a pair consisting of its spatial coordinate in the observation space and its corresponding feature vector in the feature space. The proposed MkNN classifier uses the continuity of the distribution of features of the same class not only in the feature space but also in the observation space. To validate the performance of the proposed method, it is applied to the tissue characterization problem of coronary plaque. The quantitative and qualitative validity of the proposed MkNN classifier has been confirmed by actual experiments.

  • Efficient K-Nearest Neighbor Graph Construction Using MapReduce for Large-Scale Data Sets

    Tomohiro WARASHINA  Kazuo AOYAMA  Hiroshi SAWADA  Takashi HATTORI  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E97-D No:12
      Page(s):
    3142-3154

    This paper presents an efficient method using Hadoop MapReduce for constructing a K-nearest neighbor graph (K-NNG) from a large-scale data set. K-NNG has been utilized as a data structure for data analysis techniques in various applications. To apply these techniques to a large-scale data set, an efficient K-NNG construction method is desirable. We focus on NN-Descent, which is a recently proposed method that efficiently constructs an approximate K-NNG. NN-Descent is implemented on a shared-memory system with OpenMP-based parallelization, and its extension to the Hadoop MapReduce framework is implied for data sets too large for a shared-memory system to handle. However, a simple extension to the Hadoop MapReduce framework is impractical, since it requires extremely high system performance because of the high memory consumption and the low data transmission efficiency of MapReduce jobs. The proposed method relaxes this requirement by improving the MapReduce jobs, which employ an appropriate key-value pair format and an efficient sampling strategy. Experiments on large-scale data sets demonstrate that the proposed method works efficiently and is scalable in terms of data size, the number of machine nodes, and the graph structural parameter K.
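
    For reference, the data structure discussed in this abstract can be built exactly with a brute-force pass, as in the sketch below. This is a plain quadratic-time baseline, not NN-Descent and not the paper's MapReduce method; approximate methods exist precisely because this all-pairs comparison does not scale. The function name and sample points are hypothetical.

```python
import math


def brute_force_knn_graph(points, k):
    """Exact K-NNG: map each point index to the indices of its k nearest points.

    Compares every pair of points, so the cost grows quadratically
    with the number of points.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance to every other point, paired with that point's index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()  # ties are broken by the smaller index
        graph[i] = [j for _, j in dists[:k]]
    return graph


# Two small clusters of 2-D points (made-up data).
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
graph = brute_force_knn_graph(points, k=2)
```

    An exact graph like this is also useful as ground truth for measuring the recall of an approximate K-NNG on a small sample.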

  • Accurate and Real-Time Pedestrian Classification Based on UWB Doppler Radar Images and Their Radial Velocity Features

    Kenshi SAHO  Takuya SAKAMOTO  Toru SATO  Kenichi INOUE  Takeshi FUKUDA  

     
    PAPER-Sensing

      Vol:
    E96-B No:10
      Page(s):
    2563-2572

    The classification of human motion is an important aspect of monitoring pedestrian traffic. This requires the development of advanced surveillance and monitoring systems. Methods to achieve this have been proposed using micro-Doppler radars. However, reliable long-term data and/or complicated procedures are needed to classify motion accurately with these conventional methods because their accuracy and real-time capabilities are invariably inadequate. This paper proposes an accurate and real-time method for classifying the movements of pedestrians using ultra wide-band (UWB) Doppler radar to overcome these problems. The classification of various movements is achieved by extracting feature parameters based on UWB Doppler radar images and their radial velocity distributions. Experiments were carried out assuming six types of pedestrian movements (pedestrians swinging both arms, swinging only one arm, swinging no arms, on crutches, pushing wheelchairs, and seated in wheelchairs). We found they could be classified using the proposed feature parameters and a k-nearest neighbor algorithm. A classification accuracy of 96% was achieved with a mean calculation time of 0.55s. Moreover, the classification accuracy was 99% using our proposed method for classifying three groups of pedestrian movements (normal walkers, those on crutches, and those in wheelchairs).

  • Two-Stage Block-Based Whitened Principal Component Analysis with Application to Single Sample Face Recognition

    Biao WANG  Wenming YANG  Weifeng LI  Qingmin LIAO  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E95-D No:3
      Page(s):
    853-860

    In the task of face recognition, a challenging issue is the one sample problem, in which there is only one training sample per person. Principal component analysis (PCA) seeks a low-dimensional representation that maximizes the global scatter of the training samples, and thus is suitable for the one sample problem. However, standard PCA is sensitive to outliers and places more emphasis on relatively distant sample pairs, which implies that close samples belonging to different classes tend to be merged together. In this paper, we propose two-stage block-based whitened PCA (TS-BWPCA) to address this problem. For a specific probe image, in the first stage, we seek the K-Nearest Neighbors (K-NNs) in the whitened PCA space and thus exclude most of the samples that are distant from the probe. In the second stage, we maximize the “local” scatter by performing whitened PCA on the K nearest samples, which can extract the most discriminative information for similar classes. Moreover, a block-based scheme is incorporated to address the small sample problem. This two-stage process is in effect a coarse-to-fine scheme that can maximize both global and local scatter, and thus overcomes the aforementioned shortcomings of PCA. Experimental results on the FERET face database show that our proposed algorithm is better than several representative approaches.

  • Dual Two-Dimensional Fuzzy Class Preserving Projections for Facial Expression Recognition

    Ruicong ZHI  Qiuqi RUAN  Jiying WU  

     
    LETTER-Pattern Recognition

      Vol:
    E91-D No:12
      Page(s):
    2880-2883

    This paper proposes a novel algorithm for image feature extraction: the dual two-dimensional fuzzy class preserving projections ((2D)²FCPP). The main advantages of (2D)²FCPP over two-dimensional locality preserving projections (2DLPP) are: (1) utilizing the fuzzy assignation mechanisms to construct the weight matrix, which can improve the classification results; (2) incorporating 2DLPP and alternative 2DLPP to get a more efficient dimensionality reduction method, (2D)²LPP.