The search functionality is under construction.

Keyword Search Result

[Keyword] unsupervised(46hit)

1-20hit(46hit)

  • Semantic Relationship-Based Unsupervised Representation Learning of Multivariate Time Series

    Chengyang YE  Qiang MA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/11/16
      Vol:
    E107-D No:2
      Page(s):
    191-200

    Representation learning is a crucial and complex task for multivariate time series data analysis, with a wide range of applications including trend analysis, time series data search, and forecasting. In practice, unsupervised learning is strongly preferred owing to sparse labeling. However, most existing studies focus on the representation of individual subseries without considering relationships between different subseries. In certain scenarios, this may lead to downstream task failures. Here, an unsupervised representation learning model is proposed for multivariate time series that considers the semantic relationship among subseries of time series. Specifically, the covariance calculated by the Gaussian process (GP) is introduced to the self-attention mechanism, capturing relationship features of the subseries. Additionally, a novel unsupervised method is designed to learn the representation of multivariate time series. To address the challenge of variable lengths of input subseries, a temporal pyramid pooling (TPP) method is applied to construct input vectors of equal length. The experimental results show that our model has substantial advantages compared with other representation learning models. We conducted experiments on the proposed algorithm and baseline algorithms in two downstream tasks: classification and retrieval. In the classification task, the proposed model demonstrated the best performance on seven of ten datasets, achieving an average accuracy of 76%. In the retrieval task, the proposed algorithm achieved the best performance under different datasets and hidden sizes. The results of the ablation study also demonstrate the significance of the semantic relationship in multivariate time series representation learning.
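
    The abstract above mentions temporal pyramid pooling (TPP) as a way to turn variable-length subseries into equal-length vectors. The following is a minimal sketch of that general idea only, not the authors' implementation: the function name `temporal_pyramid_pool`, the pyramid levels (1, 2, 4), and the choice of max pooling are all assumptions made for illustration.

```python
def temporal_pyramid_pool(series, levels=(1, 2, 4)):
    """Max-pool a T x d series into a vector of length d * sum(levels).

    Hypothetical helper: the levels and the use of max pooling are
    illustrative assumptions, not details taken from the paper.
    """
    if not series:
        raise ValueError("series must contain at least one time step")
    t = len(series)
    d = len(series[0])
    pooled = []
    for n_bins in levels:
        for i in range(n_bins):
            # Bin boundaries: floor for the start, ceiling for the end,
            # so every bin is non-empty even when t < n_bins.
            start = (i * t) // n_bins
            end = ((i + 1) * t + n_bins - 1) // n_bins
            window = series[start:end]
            pooled.extend(max(step[k] for step in window) for k in range(d))
    return pooled


# Two subseries of different lengths map to vectors of the same length.
short = [[1.0, 0.0], [3.0, 2.0], [2.0, 5.0]]
long = [[float(i), float(-i)] for i in range(10)]
vec_short = temporal_pyramid_pool(short)
vec_long = temporal_pyramid_pool(long)
```

    Because the output length depends only on the feature dimension and the pyramid levels, inputs of any length can be fed to a fixed-size downstream layer.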

  • Data Augmented Incremental Learning (DAIL) for Unsupervised Data

    Sathya MADHUSUDHANAN  Suresh JAGANATHAN  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2022/03/14
      Vol:
    E105-D No:6
      Page(s):
    1185-1195

    Incremental Learning, a machine learning methodology, trains on continuously arriving input data and extends the model's knowledge. When it comes to unlabeled data streams, the incremental learning task becomes more challenging. Our newly proposed incremental learning methodology, Data Augmented Incremental Learning (DAIL), learns the ever-increasing real-time streams with reduced memory resources and time. Initially, the unlabeled batches of data streams are clustered using the proposed clustering algorithm, Clustering based on Autoencoder and Gaussian Model (CLAG). Later, DAIL creates an updated incremental model for the labelled clusters using data augmentation. DAIL avoids the retraining of old samples and retains only the most recently updated incremental model holding all old class information. The use of data augmentation in DAIL combines the similar clusters generated with different data batches. A series of experiments verified the significant performance of CLAG and DAIL, producing a scalable and efficient incremental model.

  • Latent Space Virtual Adversarial Training for Supervised and Semi-Supervised Learning

    Genki OSADA  Budrul AHSAN  Revoti PRASAD BORA  Takashi NISHIDE  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/12/09
      Vol:
    E105-D No:3
      Page(s):
    667-678

    Virtual Adversarial Training (VAT) has shown impressive results among recently developed regularization methods called consistency regularization. VAT utilizes adversarial samples, generated by injecting perturbation in the input space, for training and thereby enhances the generalization ability of a classifier. However, such adversarial samples can be generated only within a very small area around the input data point, which limits the adversarial effectiveness of such samples. To address this problem, we propose LVAT (Latent space VAT), which injects perturbation in the latent space instead of the input space. LVAT can generate adversarial samples flexibly, resulting in a more adverse effect and thus more effective regularization. The latent space is built by a generative model, and in this paper we examine two different types of models: variational auto-encoder and normalizing flow, specifically Glow. We evaluated the performance of our method in both supervised and semi-supervised learning scenarios for an image classification task using the SVHN and CIFAR-10 datasets. In our evaluation, we found that our method outperforms VAT and other state-of-the-art methods.

  • Energy-Efficient ECG Signals Outlier Detection Hardware Using a Sparse Robust Deep Autoencoder

    Naoto SOGA  Shimpei SATO  Hiroki NAKAHARA  

     
    PAPER-Logic Design

      Publicized:
    2021/05/17
      Vol:
    E104-D No:8
      Page(s):
    1121-1129

    Advancements in portable electrocardiographs have allowed electrocardiogram (ECG) signals to be recorded in everyday life. Machine-learning techniques, including deep learning, have been used in numerous studies to analyze ECG signals because they exhibit superior performance to conventional methods. A mobile ECG analysis device is needed so that abnormal ECG waves can be detected anywhere. Such a mobile device requires real-time performance and low power consumption. However, deep-learning based models often have too many parameters to implement on mobile hardware: the amount of hardware becomes too large and it dissipates too much power. We propose a design flow to implement the outlier detector using an autoencoder on a low-end FPGA. To shorten the preparation time of ECG data used in training an autoencoder, an unsupervised learning technique is applied. Additionally, to minimize the volume of the weight parameters, a weight sparseness technique is applied, and all the parameters are converted into fixed-point values. We show that even if the parameters are reduced and converted into fixed-point values, the outlier detection performance degradation is only 0.83 points. By reducing the volume of the weight parameters, all the parameters can be stored in on-chip memory. We design the architecture according to the CRS format, which is a well-known data structure for sparse matrices, minimizing the hardware size and reducing the power consumption. We use weight sharing to further reduce the weight-parameter volumes. By using weight sharing, we could reduce the bit width of the memories by 60% while maintaining the outlier detection performance. We implemented the autoencoder on a Digilent Inc. ZedBoard and compared the results with those for the ARM mobile CPU for a built-in device. The results indicated that our FPGA implementation of the outlier detector was 12 times faster and 106 times more energy-efficient.
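
    The abstract above refers to the CRS (compressed row storage) format for sparse weight matrices. As a rough software illustration of that data structure, and not a description of the authors' hardware design, the sketch below stores only the nonzero weights and computes a matrix-vector product from them. The helper names `to_crs` and `crs_matvec` and the sample weights are hypothetical.

```python
def to_crs(dense):
    """Convert a dense row-major matrix into CRS arrays.

    Returns (values, col_idx, row_ptr): the nonzero entries, their column
    indices, and for each row the offset of its first entry in `values`.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, w in enumerate(row):
            if w != 0:
                values.append(w)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def crs_matvec(values, col_idx, row_ptr, x):
    """Multiply a CRS matrix by vector x, touching only nonzero weights."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Illustrative sparse weight matrix (made-up values), including an empty row.
weights = [[0, 2, 0], [1, 0, 0], [0, 0, 0], [0, 3, 4]]
values, col_idx, row_ptr = to_crs(weights)
y = crs_matvec(values, col_idx, row_ptr, [1, 10, 100])
```

    Only four of the twelve weights are stored here, which is the property that lets a sparse model fit into a small memory.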

  • Learning-Based WiFi Traffic Load Estimation in NR-U Systems

    Rui YIN  Zhiqun ZOU  Celimuge WU  Jiantao YUAN  Xianfu CHEN  Guanding YU  

     
    PAPER-Mobile Information Network and Personal Communications

      Publicized:
    2020/08/20
      Vol:
    E104-A No:2
      Page(s):
    542-549

    The unlicensed spectrum has been utilized to make up for the shortage of frequency spectrum in new radio (NR) systems. To fully exploit the advantages brought by the unlicensed bands, one of the key issues is to guarantee fair coexistence with WiFi systems. To reach this goal, timely and accurate estimation of the WiFi traffic load is an important prerequisite. In this paper, a machine learning (ML) based method is proposed to detect the number of WiFi users on the unlicensed bands. An unsupervised Neural Network (NN) structure is applied to filter the detected transmission collision probability on the unlicensed spectrum, which enables the NR users to precisely rectify the measurement error and estimate the number of active WiFi users. Moreover, the NN is trained online, and its related parameters and learning rate are jointly optimized to estimate the number of WiFi users adaptively with high accuracy. Simulation results demonstrate that, compared with the conventional Kalman Filter based detection mechanism, the proposed approach has lower complexity and can achieve a more stable and accurate estimation.

  • Unsupervised Deep Embedded Hashing for Large-Scale Image Retrieval Open Access

    Huanmin WANG  

     
    LETTER-Image

      Publicized:
    2020/07/14
      Vol:
    E104-A No:1
      Page(s):
    343-346

    Hashing methods have proven to be effective algorithms for image retrieval. However, learning discriminative hash codes is challenging for unsupervised models. In this paper, we propose a novel distinguishable image retrieval framework, named Unsupervised Deep Embedded Hashing (UDEH), to recursively learn discriminative clustering through soft clustering models and generate highly similar binary codes. We reduce the data dimension with an auto-encoder and apply a binary constraint loss to reduce quantization error. UDEH can be jointly optimized by standard stochastic gradient descent (SGD) in the embedding layer. We conducted a comprehensive experiment on two popular datasets.

  • Improved LDA Model for Credibility Evaluation of Online Product Reviews

    Xuan WANG  Bofeng ZHANG  Mingqing HUANG  Furong CHANG  Zhuocheng ZHOU  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2019/08/22
      Vol:
    E102-D No:11
      Page(s):
    2148-2158

    When individuals make a purchase from online sources, they may lack first-hand knowledge of the product. In such cases, they will judge the quality of the item by the reviews other consumers have posted. Therefore, it is important to determine whether comments about a product are credible. Most often, conventional research on comment credibility has employed supervised machine learning methods, which have the disadvantage of needing large quantities of training data. This paper proposes an unsupervised method for judging comment credibility based on the Biterm Sentiment Latent Dirichlet Allocation (BS-LDA) model. Using this approach, we first derived some distributions and used them to calculate each comment's credibility score. A comment's credibility was judged based on whether it achieved a threshold score. Our experimental results using comments from Amazon.com demonstrated that our approach can play an important role in determining the credibility of comments in some situations.

  • Feature Based Domain Adaptation for Neural Network Language Models with Factorised Hidden Layers

    Michael HENTSCHEL  Marc DELCROIX  Atsunori OGAWA  Tomoharu IWATA  Tomohiro NAKATANI  

     
    PAPER-Speech and Hearing

      Publicized:
    2018/12/04
      Vol:
    E102-D No:3
      Page(s):
    598-608

    Language models are a key technology in various tasks, such as speech recognition and machine translation. They are usually used on texts covering various domains, and as a result domain adaptation has been a long-standing challenge in language model research. With the rising popularity of neural network based language models, many methods have been proposed in recent years. These methods can be separated into two categories: model based and feature based adaptation methods. Compared to model based domain adaptation, feature based domain adaptation has the advantage that it does not require domain labels in the corpus. Most existing feature based adaptation methods are based on bias adaptation. We propose a novel feature based domain adaptation technique using hidden layer factorisation. This method is fundamentally different from existing methods because we use the domain features to calculate a linear combination of linear layers. These linear layers can capture domain specific information and information common to different domains. In the experiments, we compare our proposed method with existing adaptation methods. The compared adaptation techniques are based on two different ideas, that is, bias based adaptation and gating of hidden units. All language models in our comparison use state-of-the-art long short-term memory based recurrent neural networks. We demonstrate the effectiveness of the proposed method with perplexity results for the well-known Penn Treebank and speech recognition results for a corpus of TED talks.
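
    The abstract above describes using domain features to compute a linear combination of linear layers. The sketch below shows one plausible reading of that phrase: the output is the sum of several linear layers' outputs, each scaled by one domain feature. The names `linear` and `factorised_layer`, the toy weights, and the exact form of the combination are assumptions; the abstract does not give the paper's parameterisation.

```python
def linear(weight, bias, x):
    """Plain affine map: weight @ x + bias, with lists as vectors."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def factorised_layer(layers, domain_features, x):
    """Sum the outputs of several linear layers, each scaled by one
    domain feature (an assumed form of the 'linear combination')."""
    out = [0.0] * len(layers[0][1])
    for (weight, bias), d in zip(layers, domain_features):
        y = linear(weight, bias, x)
        out = [o + d * yi for o, yi in zip(out, y)]
    return out


# Two toy basis layers (made-up weights): identity, and swap-plus-bias.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[0.0, 1.0], [1.0, 0.0]], [1.0, 1.0]),
]
x = [2.0, 3.0]
h_mixed = factorised_layer(layers, [0.5, 0.5], x)   # equal mix of both layers
h_first = factorised_layer(layers, [1.0, 0.0], x)   # only the first layer
```

    With a one-hot domain feature vector the layer reduces to a single basis layer, while intermediate values blend the bases, which is how shared and domain-specific behaviour could coexist in one set of parameters.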

  • Gender Attribute Mining with Hand-Dorsa Vein Image Based on Unsupervised Sparse Feature Learning

    Jun WANG  Guoqing WANG  Zaiyu PAN  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2017/10/12
      Vol:
    E101-D No:1
      Page(s):
    257-260

    Gender classification with hand-dorsa vein information, a new soft biometric trait, is solved with the proposed unsupervised sparse feature learning model; state-of-the-art accuracy demonstrates the effectiveness of the proposed model. We also argue that the proposed data reconstruction model is applicable to age estimation when a comprehensive database covering different ages is accessible.

  • Learning Supervised Feature Transformations on Zero Resources for Improved Acoustic Unit Discovery

    Michael HECK  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Publicized:
    2017/10/20
      Vol:
    E101-D No:1
      Page(s):
    205-214

    In this work we utilize feature transformations that are common in supervised learning without having prior supervision, with the goal of improving Dirichlet process Gaussian mixture model (DPGMM) based acoustic unit discovery. The motivation for using such transformations is to create feature vectors that are more suitable for clustering. The need for labels in these methods makes it difficult to use them in a zero resource setting. To overcome this issue, we utilize a first iteration of DPGMM clustering to generate frame based class labels for the target data. The labels serve as the basis for learning linear discriminant analysis (LDA), maximum likelihood linear transform (MLLT) and feature-space maximum likelihood linear regression (fMLLR) based feature transformations. The novelty of our approach is the way we use a traditional acoustic model training pipeline for supervised learning to estimate feature transformations in a zero resource scenario. We show that the learned transformations greatly support the DPGMM sampler in finding better clusters, according to the performance of the DPGMM posteriorgrams on the ABX sound class discriminability task. We also introduce a method for combining posteriorgram outputs of multiple clusterings and demonstrate that such combinations can further improve sound class discriminability.

  • An Extreme Learning Machine Architecture Based on Volterra Filtering and PCA

    Li CHEN  Ling YANG  Juan DU  Chao SUN  Shenglei DU  Haipeng XI  

     
    PAPER-Information Network

      Publicized:
    2017/08/02
      Vol:
    E100-D No:11
      Page(s):
    2690-2701

    Extreme learning machine (ELM) has recently attracted many researchers' interest due to its very fast learning speed, good generalization ability, and ease of implementation. However, it has a linear output layer which may limit the capability of exploring the available information, since higher-order statistics of the signals are not taken into account. To address this, we propose a novel ELM architecture in which the linear output layer is replaced by a Volterra filter structure. Additionally, the principal component analysis (PCA) technique is used to reduce the number of effective signals transmitted to the output layer. This idea not only improves the processing capability of the network, but also preserves the simplicity of the training process. We then carry out performance evaluation and application analysis for the proposed architecture in the context of supervised classification and unsupervised equalization, respectively. The results obtained on publicly available datasets and on various channels, when compared with those produced by previously proposed ELM versions and a state-of-the-art algorithm, the support vector machine (SVM), highlight the adequacy and the advantages of the proposed architecture and characterize it as a promising tool for signal processing tasks.

  • Construction of Latent Descriptor Space and Inference Model of Hand-Object Interactions

    Tadashi MATSUO  Nobutaka SHIMADA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2017/03/13
      Vol:
    E100-D No:6
      Page(s):
    1350-1359

    Appearance-based generic object recognition is a challenging problem because all possible appearances of objects cannot be registered, especially as new objects are produced every day. The functions of objects, however, have a comparatively small number of prototypes. Therefore, function-based classification of new objects could be a valuable tool for generic object recognition. Object functions are closely related to hand-object interactions during handling of a functional object; i.e., how the hand approaches the object, which parts of the object contact the hand, and the shape of the hand during interaction. Hand-object interactions are helpful for modeling object functions. However, it is difficult to assign discrete labels to interactions because object shapes and grasping hand postures intrinsically have continuous variations. To describe these interactions, we propose the interaction descriptor space, which is acquired from unlabeled appearances of human hand-object interactions. By using interaction descriptors, we can numerically describe the relation between an object's appearance and its possible interaction with the hand. The model infers the quantitative state of the interaction from the object image alone. It also identifies the parts of objects designed for hand interactions, such as grips and handles. We demonstrate that the proposed method can generate, without supervision, interaction descriptors that form clusters corresponding to interaction types. We also demonstrate that the model can infer possible hand-object interactions.

  • Unsupervised Image Steganalysis Method Using Self-Learning Ensemble Discriminant Clustering

    Bing CAO  Guorui FENG  Zhaoxia YIN  Lingyan FAN  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2017/02/18
      Vol:
    E100-D No:5
      Page(s):
    1144-1147

    Image steganography is a technique of embedding a secret message into a digital image to send the information securely. In contrast, steganalysis focuses on detecting the presence of secret messages hidden by steganography. The modern approach in steganalysis is based on supervised learning, where the training set must include both steganographic and natural image features. However, if a new steganographic method is proposed, a detector still trained on existing methods will generally suffer a serious drop in detection accuracy due to the mismatch between the steganographic methods used in training and in detection. In this paper, we address the unsupervised learning problem and propose a detection model called self-learning ensemble discriminant clustering (SEDC), which aims at taking full advantage of the statistical properties of the natural and testing images to estimate the optimal projection vector. This method can adaptively select the most discriminative subspace and then use K-means clustering to generate the final class labels. Experimental results on the J-UNIWARD and nsF5 steganographic methods with three feature extraction methods (CC-JRM, DCTR, and GFR) show that the proposed scheme classifies better than blind guessing.

  • Another Fuzzy Anomaly Detection System Based on Ant Clustering Algorithm

    Muhamad Erza AMINANTO  HakJu KIM  Kyung-Min KIM  Kwangjo KIM  

     
    PAPER

      Vol:
    E100-A No:1
      Page(s):
    176-183

    Attacks against computer networks are evolving rapidly. Conventional intrusion detection systems based on pattern matching and static signatures have a significant limitation, since the signature database must be updated frequently. Unsupervised learning algorithms can overcome this limitation. Ant Clustering Algorithm (ACA) is a popular unsupervised learning algorithm for classifying data into different categories. However, ACA needs to be complemented with other algorithms for the classification process. In this paper, we present a fuzzy anomaly detection system that works in two phases. In the first phase, the training phase, we use ACA to determine clusters. In the second phase, the classification phase, we exploit a fuzzy approach that combines two distance-based methods to detect anomalies in newly monitored data. We validate our hybrid approach using the KDD Cup'99 dataset. The results indicate that, compared to several traditional and new techniques, the proposed hybrid approach achieves a higher detection rate and a lower false positive rate.

  • Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion

    Kazuhiro KOBAYASHI  Tomoki TODA  Tomoyasu NAKANO  Masataka GOTO  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Publicized:
    2016/07/21
      Vol:
    E99-D No:11
      Page(s):
    2767-2777

    As one of the techniques enabling individual singers to produce varieties of voice timbre beyond their own physical constraints, a statistical voice timbre control technique based on the perceived age has been developed. In this technique, the perceived age of a singing voice, which is the age of the singer as perceived by the listener, is used as one of the intuitively understandable measures to describe voice characteristics of the singing voice. The use of statistical voice conversion (SVC) with a singer-dependent multiple-regression Gaussian mixture model (MR-GMM), which effectively models the voice timbre variations caused by a change of the perceived age, makes it possible for individual singers to manipulate the perceived ages of their own singing voices while retaining their own singer identities. However, there still remain several issues; e.g., 1) the controllable range of the perceived age is limited; 2) the quality of the converted singing voice is significantly degraded compared to that of a natural singing voice; and 3) each singer needs to sing the same phrase set as sung by a reference singer to develop the singer-dependent MR-GMM. To address these issues, we propose the following three methods: 1) a method using gender-dependent modeling to expand the controllable range of the perceived age; 2) a method using direct waveform modification based on spectrum differential to improve the quality of the converted singing voice; and 3) a rapid unsupervised adaptation method based on maximum a posteriori (MAP) estimation to easily develop the singer-dependent MR-GMM. The experimental results show that the proposed methods achieve a wider controllable range of the perceived age, a significant quality improvement of the converted singing voice, and the development of the singer-dependent MR-GMM using only a few arbitrary phrases as adaptation data.

  • Investigation of Combining Various Major Language Model Technologies including Data Expansion and Adaptation Open Access

    Ryo MASUMURA  Taichi ASAMI  Takanobu OBA  Hirokazu MASATAKI  Sumitaka SAKAUCHI  Akinori ITO  

     
    PAPER-Language modeling

      Publicized:
    2016/07/19
      Vol:
    E99-D No:10
      Page(s):
    2452-2461

    This paper aims to investigate the performance improvements made possible by combining various major language model (LM) technologies together and to reveal the interactions between LM technologies in spontaneous automatic speech recognition tasks. While it is clear that recent practical LMs have several problems, isolated use of major LM technologies does not appear to offer sufficient performance. In consideration of this fact, combining various LM technologies has been also examined. However, previous works only focused on modeling technologies with limited text resources, and did not consider other important technologies in practical language modeling, i.e., use of external text resources and unsupervised adaptation. This paper, therefore, employs not only manual transcriptions of target speech recognition tasks but also external text resources. In addition, unsupervised LM adaptation based on multi-pass decoding is also added to the combination. We divide LM technologies into three categories and employ key ones including recurrent neural network LMs or discriminative LMs. Our experiments show the effectiveness of combining various LM technologies in not only in-domain tasks, the subject of our previous work, but also out-of-domain tasks. Furthermore, we also reveal the relationships between the technologies in both tasks.

  • Unsupervised Learning of Continuous Density HMM for Variable-Length Spoken Unit Discovery

    Meng SUN  Hugo VAN HAMME  Yimin WANG  Xiongwei ZHANG  

     
    LETTER-Speech and Hearing

      Publicized:
    2015/10/21
      Vol:
    E99-D No:1
      Page(s):
    296-299

    Unsupervised spoken unit discovery or zero-source speech recognition is an emerging research topic which is important for spoken document analysis of languages or dialects with little human annotation. In this paper, we extend our earlier joint training framework for unsupervised learning of discrete density HMM to continuous density HMM (CDHMM) and apply it to spoken unit discovery. In the proposed recipe, we first cluster a group of Gaussians which then act as initializations to the joint training framework of nonnegative matrix factorization and semi-continuous density HMM (SCDHMM). In SCDHMM, all the hidden states share the same group of Gaussians but with different mixture weights. A CDHMM is subsequently constructed by tying the top-N activated Gaussians to each hidden state. Baum-Welch training is finally conducted to update the parameters of the Gaussians, mixture weights and HMM transition probabilities. Experiments were conducted on word discovery from TIDIGITS and phone discovery from TIMIT. For TIDIGITS, units were modeled by 10 states which turn out to be strongly related to words; while for TIMIT, units were modeled by 3 states which are likely to be phonemes.

  • Discriminative Middle-Level Parts Mining for Object Detection

    Dong LI  Yali LI  Shengjin WANG  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2015/08/03
      Vol:
    E98-D No:11
      Page(s):
    1950-1957

    Middle-level parts have attracted great attention in the computer vision community, acting as discriminative elements for objects. In this paper we propose an unsupervised approach to mine discriminative parts for object detection. This work features three aspects. First, we introduce an unsupervised, exemplar-based training process for part detection. We generate initial parts by selective search and then train part detectors by exemplar SVM. Second, a part selection model based on consistency and distinctiveness is constructed to select effective parts from the candidate pool. Third, we combine discriminative part mining with the deformable part model (DPM) for object detection. The proposed method is evaluated on the PASCAL VOC2007 and VOC2010 datasets. The experimental results demonstrate the effectiveness of our method for object detection.

  • Video Object Segmentation of Dynamic Scenes with Large Displacements

    Yinhui ZHANG  Zifen HE  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2015/06/17
      Vol:
    E98-D No:9
      Page(s):
    1719-1723

    Segmenting foreground objects in unconstrained dynamic scenes remains a difficult problem. We present a novel unsupervised segmentation approach that allows robust object segmentation of dynamic scenes with large displacements. To make this possible, we project motion based foreground region hypotheses generated via standard optical flow onto visual saliency regions. The motion hypotheses correspond to inside seeds mapping of the motion boundary. For visual saliency, we generalize the image signature method from images to videos to delineate saliency mapping of object proposals. The mapping of image signatures estimated in the Discrete Cosine Transform (DCT) domain favors stand-out regions in the human visual system. We leverage a Markov random field built on superpixels to impose both spatial and temporal consistency constraints on the motion-saliency combined segments. Projecting salient regions via an image signature with inside mapping seeds facilitates segmenting ambiguous objects from unconstrained dynamic scenes in the presence of large displacements. We demonstrate the performance on fourteen challenging unconstrained dynamic scenes, compare our method with two state-of-the-art unsupervised video segmentation algorithms, and provide quantitative and qualitative performance comparisons.

  • Unsupervised Dimension Reduction via Least-Squares Quadratic Mutual Information

    Janya SAINUI  Masashi SUGIYAMA  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2014/07/22
      Vol:
    E97-D No:10
      Page(s):
    2806-2809

    The goal of dimension reduction is to represent high-dimensional data in a lower-dimensional subspace, while intrinsic properties of the original data are kept as much as possible. An important challenge in unsupervised dimension reduction is the choice of tuning parameters, because no supervised information is available and thus parameter selection tends to be subjective and heuristic. In this paper, we propose an information-theoretic approach to unsupervised dimension reduction that allows objective tuning parameter selection. We employ quadratic mutual information (QMI) as our information measure, which is known to be less sensitive to outliers than ordinary mutual information, and QMI is estimated analytically by a least-squares method in a computationally efficient way. Then, we provide an eigenvector-based efficient implementation for performing unsupervised dimension reduction based on the QMI estimator. The usefulness of the proposed method is demonstrated through experiments.
