Xianmei FANG Xiaobo GAO Yuting WANG Zhouyu LIAO Yue MA
Fault localization analyzes the runtime information of two classes of test cases (i.e., passing and failing test cases) to identify suspicious statements potentially responsible for a failure. In practice, however, failing test cases are typically far fewer than passing test cases, and this class imbalance degrades fault localization effectiveness. To address this issue, we propose a data augmentation approach that uses a conditional variational autoencoder to synthesize new failing test cases for fault localization. The experimental results show that our approach significantly improves six state-of-the-art fault localization techniques.
Yaying SHEN Qun LI Ding XU Ziyi ZHANG Rui YANG
A triplet-loss-based framework for generalized zero-shot learning is presented in this letter. The approach learns a shared latent space for image features and attributes by using aligned variational autoencoders and variants of the triplet loss. A classifier is then trained in the latent space. The experimental results demonstrate that the proposed framework achieves a marked improvement.
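The triplet loss named in this abstract can be illustrated with a minimal sketch. It uses the standard margin-based form with squared Euclidean distance; the specific variants used in the letter are not described here, and the function name, margin value, and vectors are hypothetical.

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (an assumption, not the letter's variant).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin` in squared Euclidean distance.
    """
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)
```

In a shared latent space, the anchor could be an image embedding, the positive the attribute embedding of the same class, and the negative the attribute embedding of another class.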
The spectral envelope parameter strongly affects the speech quality of a vocoder. The Vector Quantized Variational AutoEncoder (VQ-VAE) has recently emerged as a state-of-the-art end-to-end quantization method based on deep learning. This paper proposes a new technique, called VQ-VAE-EMGAN, that improves the embedding space learning of VQ-VAE with a Generative Adversarial Network for quantizing the spectral envelope parameter. In experiments, we designed the quantizer for the spectral envelope parameters of the WORLD vocoder extracted from 16 kHz speech waveforms. The results show that, compared to the conventional VQ-VAE, the proposed technique reduced the Log Spectral Distortion (LSD) by around 0.5 dB and increased the PESQ by around 0.17 on average for four target bit operations.
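For reference, the Log Spectral Distortion reported above is commonly computed per frame as the root-mean-square difference between two power spectra on a dB scale. The sketch below follows that common definition; the paper's exact evaluation settings are not given here, and the function name and inputs are hypothetical.

```python
import math


def log_spectral_distortion(ref_power, est_power):
    """RMS difference in dB between two power spectra of a single frame.

    Both inputs are sequences of strictly positive power values sampled at
    the same frequency bins.
    """
    diffs = [10.0 * math.log10(r / e) for r, e in zip(ref_power, est_power)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

An utterance-level score is usually obtained by averaging this value over all frames.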
Masaki TAKANASHI Shu-ichi SATO Kentaro INDO Nozomu NISHIHARA Hiroki HAYASHI Toru SUZUKI
Predicting the malfunction timing of wind turbines is essential for maintaining the high profitability of the wind power generation industry. Studies have been conducted on machine learning methods that use condition monitoring system data, such as vibration data, and supervisory control and data acquisition (SCADA) data to detect and predict anomalies in wind turbines automatically. Autoencoder-based techniques, which use unsupervised learning when the anomaly pattern is unknown, have attracted significant interest in the area of anomaly detection and prediction. In particular, vibration data are considered useful because they include the changes that occur in the early stages of a malfunction. However, when autoencoder-based techniques are applied for prediction purposes, it is difficult to distinguish between operating and non-operating condition data in the training process, which degrades the prediction performance. In this letter, we propose a method that utilizes both vibration data and SCADA data to improve the prediction performance, namely, a method that uses a power curve composed of active power and wind speed. We evaluated the method's performance using vibration and SCADA data obtained from an actual wind farm.
Yue LI Xiaosheng YU Haijun CAO Ming XU
An autoencoder is trained to generate the background from the surveillance image by setting the training label as the shuffled input, instead of the input itself in a traditional autoencoder. Then the multi-scale features are extracted by a sparse autoencoder from the surveillance image and the corresponding background to detect foreground.
Masaki TAKANASHI Shu-ichi SATO Kentaro INDO Nozomu NISHIHARA Hiroto ICHIKAWA Hirohisa WATANABE
Predicting the malfunction timing of wind turbines is essential for maintaining the high profitability of the wind power generation business. Machine learning methods have been studied using condition monitoring system data, such as vibration data, and supervisory control and data acquisition (SCADA) data, to detect and predict anomalies in wind turbines automatically. Autoencoder-based techniques have attracted significant interest in the detection or prediction of anomalies through unsupervised learning, in which the anomaly pattern is unknown. Although autoencoder-based techniques have been proven to detect anomalies effectively using relatively stable SCADA data, they perform poorly in the case of deteriorated SCADA data. In this letter, we propose a power-curve filtering method, which is a preprocessing technique used before the application of an autoencoder-based technique, to mitigate the dirtiness of SCADA data and improve the prediction performance of wind turbine degradation. We have evaluated its performance using SCADA data obtained from a real wind-farm.
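As a rough illustration of power-curve filtering as a preprocessing step, the sketch below keeps only SCADA samples whose active power lies close to a reference power curve. The letter's actual filtering rule is not given in this abstract, so the idealized cubic curve, the tolerance, and all names and parameter values are assumptions.

```python
def reference_power(wind_speed, cut_in=3.0, rated_speed=12.0, rated_power=2000.0):
    """Hypothetical idealized power curve in kW: cubic ramp up to rated power."""
    if wind_speed < cut_in:
        return 0.0
    if wind_speed >= rated_speed:
        return rated_power
    frac = (wind_speed ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)
    return rated_power * frac


def power_curve_filter(samples, tolerance=200.0):
    """Keep (wind_speed, active_power) samples within `tolerance` kW of the curve."""
    return [(v, p) for v, p in samples if abs(p - reference_power(v)) <= tolerance]
```

The retained samples would then be passed to the autoencoder for training, while samples far from the curve (for example, zero output at high wind speed) are discarded.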
Naoto SOGA Shimpei SATO Hiroki NAKAHARA
Advancements in portable electrocardiographs have allowed electrocardiogram (ECG) signals to be recorded in everyday life. Machine-learning techniques, including deep learning, have been used in numerous studies to analyze ECG signals because they exhibit superior performance to conventional methods. A mobile ECG analysis device is needed so that abnormal ECG waves can be detected anywhere. Such a mobile device requires real-time performance and low power consumption. However, deep-learning-based models often have too many parameters to implement on mobile hardware: the required hardware is too large and consumes too much power. We propose a design flow to implement an outlier detector using an autoencoder on a low-end FPGA. To shorten the preparation time of the ECG data used in training the autoencoder, an unsupervised learning technique is applied. Additionally, to minimize the volume of the weight parameters, a weight sparseness technique is applied, and all the parameters are converted into fixed-point values. We show that even if the parameters are reduced and converted into fixed-point values, the outlier detection performance degradation is only 0.83 points. By reducing the volume of the weight parameters, all the parameters can be stored in on-chip memory. We design the architecture according to the CRS format, a well-known data structure for sparse matrices, minimizing the hardware size and reducing the power consumption. We use weight sharing to further reduce the weight-parameter volume. By using weight sharing, we could reduce the bit width of the memories by 60% while maintaining the outlier detection performance. We implemented the autoencoder on a Digilent Inc. ZedBoard and compared the results with those for an ARM mobile CPU for a built-in device. The results indicated that our FPGA implementation of the outlier detector was 12 times faster and 106 times more energy-efficient.
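The CRS (compressed row storage) format mentioned above stores only the non-zero weights together with their column indices and per-row offsets. The following software sketch shows the data structure and a matrix-vector product over it; it is not the paper's FPGA architecture, and the function names are hypothetical.

```python
def dense_to_crs(matrix):
    """Convert a dense row-major matrix into CRS arrays.

    Returns (values, col_idx, row_ptr), where row i occupies the slice
    values[row_ptr[i]:row_ptr[i + 1]].
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, w in enumerate(row):
            if w != 0:
                values.append(w)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def crs_matvec(values, col_idx, row_ptr, x):
    """Multiply a CRS matrix by a dense vector, touching only non-zero weights."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y
```

Because pruned (zero) weights are never stored or multiplied, both memory and arithmetic scale with the number of non-zero weights rather than with the full layer size.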
Byeonghak KIM Murray LOEW David K. HAN Hanseok KO
To date, many studies have employed clustering for the classification of unlabeled data. Deep separate clustering applies several deep learning models to conventional clustering algorithms to more clearly separate the distribution of the clusters. In this paper, we employ a convolutional autoencoder to learn the features of input images. Following this, k-means clustering is conducted using the encoded layer features learned by the convolutional autoencoder. A center loss function is then added to aggregate the data points into clusters to increase the intra-cluster homogeneity. Finally, we calculate and increase the inter-cluster separability. We combine all loss functions into a single global objective function. Our new deep clustering method surpasses the performance of existing clustering approaches when compared in experiments under the same conditions.
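The interplay between k-means assignment and a center loss can be sketched as follows. This assumes the common center-loss form (half the summed squared distance of each feature to its assigned cluster center); the paper's exact loss terms and weighting are not given in this abstract, and all names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def assign_clusters(features, centers):
    """k-means assignment step: index of the nearest center for each feature."""
    return [min(range(len(centers)), key=lambda k: sq_dist(f, centers[k]))
            for f in features]


def center_loss(features, labels, centers):
    """Half the summed squared distance of each feature to its assigned center."""
    return 0.5 * sum(sq_dist(f, centers[y]) for f, y in zip(features, labels))
```

Minimizing this term with respect to the encoder pulls encoded features toward their cluster centers, which is what increases intra-cluster homogeneity.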
Lianqiang LI Kangbo SUN Jie ZHU
Knowledge distillation approaches can transfer information from a large network (teacher network) to a small network (student network) to compress and accelerate deep neural networks. This paper proposes a novel knowledge distillation approach called multi-knowledge distillation (MKD). MKD consists of two stages. In the first stage, it employs autoencoders to learn compact and precise representations of the feature maps (FM) from the teacher network and the student network. These representations can be treated as the essence of the FM, i.e., EFM. In the second stage, MKD utilizes multiple kinds of knowledge, i.e., the magnitude of an individual sample's EFM and the similarity relationships among several samples' EFM, to enhance the generalization ability of the student network. Compared with previous approaches that employ FM or handcrafted features from FM, the EFM learned from autoencoders can be transferred more efficiently and reliably. Furthermore, the rich information provided by the multiple kinds of knowledge helps the student network mimic the teacher network as closely as possible. Experimental results also show that MKD is superior to state-of-the-art approaches.
This letter proposes an autoencoder model supervised by semantic similarity for zero-shot learning. With the help of semantic similarity vectors of seen and unseen classes and a classification branch, our experimental results on two datasets are 7.3% and 4% better than the state of the art on conventional zero-shot learning in terms of averaged top-1 accuracy.
Minhae JANG Yeonseung RYU Jik-Soo KIM Minkyoung CHO
Internal user threats such as information leakage or system destruction can cause significant damage to an organization; however, it is very difficult to prevent or detect such attacks in advance. In this paper, we propose an anomaly-based insider threat detection method with local features and global statistics, under the assumption that a user shows patterns different from regular behavior during harmful actions. We experimentally show that our detection mechanism achieves superior performance compared to state-of-the-art approaches on the CMU CERT dataset.
Yoonhee KIM Deokgyu YUN Hannah LEE Seung Ho CHOI
This paper presents a deep learning-based non-intrusive speech intelligibility estimation method using the bottleneck features of an autoencoder. The conventional standard non-intrusive speech intelligibility estimation method, P.563, lacks intelligibility estimation performance in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input and output are autoencoder bottleneck features and a short-time objective intelligibility (STOI) score, respectively, where STOI is a standard tool for measuring intrusive speech intelligibility with reference speech signals. We show that the proposed method outperforms the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation methods for speech signals in various noise environments.
Kazuki OTOMO Satoru KOBAYASHI Kensuke FUKUDA Hiroshi ESAKI
System logs are useful for understanding the status of, and detecting faults in, large-scale networks. However, due to the diversity and volume of these logs, log analysis requires much time and effort. In this paper, we propose a log event anomaly detection method for large-scale networks without pre-processing and feature extraction. The key idea is to embed a large amount of diverse data into hidden states by using latent variables. We evaluate our method with 12 months of system logs obtained from a nation-wide academic network in Japan. Through comparisons with Kleinberg's univariate burst detection and a traditional multivariate analysis (i.e., PCA), we demonstrate that our proposed method achieves 14.5% higher recall and 3% higher precision than PCA. A case study shows that the detected anomalies provide effective information for troubleshooting network system faults.
Tsuneo KATO Atsushi NAGAI Naoki NODA Jianming WU Seiichi YAMAMOTO
Data-driven untying of a recursive autoencoder (RAE) is proposed for utterance intent classification in spoken dialogue systems. Although an RAE applies a single nonlinear operation to two neighboring child nodes in a parse tree when used for spoken language understanding (SLU) in spoken dialogue systems, the nonlinear operation is considered to be intrinsically different depending on the types of child nodes. To reduce the gap between the single nonlinear operation of an RAE and the intrinsically different operations depending on the node types, a data-driven untying of autoencoders using part-of-speech (PoS) tags at leaf nodes is proposed. Experimental results on two corpora, the ATIS English data set and a Japanese data set from a smartphone-based spoken dialogue system, showed that the proposed method improved accuracy over the tied RAE and revealed a reasonable difference in untying between the two languages.
Ippei HAMAMOTO Masaki KAWAMURA
An autoencoder has the potential ability to compress and decompress information. In this work, we consider the process of generating a stego-image from an original image and watermarks as compression, and the process of recovering the original image and watermarks from the stego-image as decompression. We propose embedder and extractor neural networks based on the autoencoder. The embedder network learns a mapping from the DCT coefficients of the original image and a watermark to those of the stego-image. The extractor network learns a mapping from the DCT coefficients of the stego-image to the watermark. Once the proposed neural network has been trained, it can embed watermarks into, and extract them from, unlearned test images. We investigated the relation between the number of neurons and network performance by computer simulations and found that the trained neural network could provide high-quality stego-images and watermarks with few errors. We also evaluated the robustness against JPEG compression and found that, when suitable parameters were used, the watermarks were extracted with an average BER lower than 0.01 and image quality over 35 dB when the quality factor Q was over 50. We also investigated how our neural network represents the watermarks in the stego-image. There are two possibilities: distributed representation and sparse representation. From an investigation of the output of the stego layer (3rd layer), we found that the distributed representation emerged at an early learning step and that the sparse representation appeared at a later step.
Hosung PARK Seungsoo NAM Eun Man CHOI Daeseon CHOI
Hidden Singer is a television program in Korea. In the show, the original singer and four imitating singers sing a song while hidden behind a screen. The audience and TV viewers attempt to guess who the original singer is by listening to the singing voices. Usually, there are few correct answers from the audience, because the imitators are well trained and highly skilled. We propose a computerized system for distinguishing the original singer from the imitating singers. During the training phase, the system learns only the original singer's song because it is the one the audience has heard before. During the testing phase, the songs of five candidates are provided to the system, and the system then determines the original singer. The system uses a 1-class authentication method, in which only a subject model is made. The subject model is used for measuring similarities between the candidate songs. In this problem, unlike existing studies on artist identification, we cannot utilize multi-class classifiers and supervised learning because the songs of the imitators and the labels are not provided during the training phase. Therefore, we evaluate the performance of several 1-class learning algorithms to determine which is most effective in distinguishing an original singer from among highly skilled imitators. The experimental results show that the proposed system using the autoencoder performs better (63.33%) than other 1-class learning algorithms: Gaussian mixture model (GMM) (50%) and one-class support vector machines (OCSVM) (26.67%). We also conduct a human contest to compare the performance of the proposed system with human perception. The accuracy of the proposed system is found to be better (63.33%) than the average accuracy of human perception (33.48%).
In this paper, we propose a novel primary user detection scheme for spectrum sensing in cognitive radio. Inspired by the conventional signal classification approach, spectrum sensing is translated into a classification problem. On the basis of feature-based classification, the spectral correlation of a second-order cyclostationary analysis is applied as the feature extraction method, whereas a stacked denoising autoencoder network is applied as the classifier. Two training methods for signal detection, interception-based detection and simulation-based detection, are considered for different prior information and implementation conditions. In the interception-based detection method, inspired by two-step sensing, we obtain training data from the interception of actual signals after a sophisticated sensing procedure, to achieve detection without a priori information. In addition, benefiting from practical training data, interception-based detection is superior under actual transmission environment conditions. The alternative, a simulation-based detection method, utilizes some undisguised parameters of the primary user in the spectrum of interest. Owing to the diversified predetermined training data, simulation-based detection exhibits superior robustness against harsh noise environments, although it demands a more complicated classifier network structure. Additionally, for the above-described training methods, we discuss the classifier complexity under different implementation conditions and the trade-off between robustness and detection performance. The simulation results show the advantages of the proposed method over conventional spectrum-sensing schemes.
This paper investigates the effect of noise added to the hidden units of autoencoders linked to multilayer perceptrons. It is shown that an internal representation of learned features emerges and the sparsity of hidden units increases when independent Gaussian noise is added to the inputs of hidden units during deep network training. It is also shown that the weights that connect the contaminated hidden units with the next layer have smaller values and that the outputs of hidden units tend to be more definite (0 or 1). This automatic structuring induced by the added noise is expected to improve the generalization ability of the network. The network structuring was confirmed by experiments on MNIST digit classification with a deep neural network model.
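The noise injection described above can be sketched as a forward pass in which independent Gaussian noise is added to each hidden unit's input before the activation. The sigmoid activation, the layer shape, and all names here are assumptions for illustration, not details taken from the paper.

```python
import math
import random


def noisy_hidden_layer(x, weights, biases, sigma, rng):
    """Forward pass of one sigmoid hidden layer with noisy pre-activations.

    `weights` holds one row of input weights per hidden unit. `rng` is a
    random.Random instance; sigma = 0 recovers the noise-free layer.
    """
    outputs = []
    for w_row, b in zip(weights, biases):
        pre = sum(w * xi for w, xi in zip(w_row, x)) + b
        pre += rng.gauss(0.0, sigma)  # independent Gaussian noise per unit
        outputs.append(1.0 / (1.0 + math.exp(-pre)))
    return outputs
```

The noise would be applied only during training; at test time the layer is evaluated with `sigma=0`.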
Yundong LI Jiyue ZHANG Yubing LIN
In this letter, we propose a novel discriminative representation for patterned fabric defect inspection when only limited negative samples are available. The Fisher criterion is introduced into the loss function of deep learning, which can guide the learning direction of deep networks and make the extracted features more discriminative. A deep neural network constructed from the encoder part of trained autoencoders is utilized to classify each pixel in the images as defective or defect-free, using a patch centered on the pixel as context. Subsequently, the confidence map is processed by median filtering and binary thresholding, and the defect areas are then located. Experimental results demonstrate that our method achieves state-of-the-art performance on the benchmark fabric images.
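The post-processing of the confidence map (median filtering followed by binary thresholding) can be sketched as follows. The 3x3 window, the border handling, and the 0.5 threshold are assumptions, since the letter's settings are not stated in this abstract.

```python
from statistics import median


def median_filter3(image):
    """3x3 median filter; border pixels use only the neighbors that exist."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [image[r][c]
                      for r in range(max(0, i - 1), min(h, i + 2))
                      for c in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = median(window)
    return out


def binarize(image, threshold=0.5):
    """Mark pixels whose confidence exceeds the threshold as defective (1)."""
    return [[1 if v > threshold else 0 for v in row] for row in image]
```

The median filter suppresses isolated high-confidence pixels, so only spatially coherent regions survive the thresholding and are reported as defect areas.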