Radar emitter identification (REI) is a crucial function of electronic radar warfare support systems. The challenge emphasizes identifying and locating unique transmitters, avoiding potential threats, and preparing countermeasures. Due to the remarkable effectiveness of deep learning (DL) in uncovering latent features within data and performing classification, deep neural networks (DNNs) have seen widespread application in REI. In many real-world scenarios, obtaining a large number of annotated radar transmitter samples for training identification models is essential yet challenging. Given that labeled data are scarce while unlabeled training data are abundant, we propose a novel REI method based on a semi-supervised learning (SSL) framework with virtual adversarial training (VAT). Specifically, two objective functions are designed to extract the semantic features of radar signals: a cross-entropy loss computed on labeled samples and a virtual adversarial training loss computed on all samples. Additionally, a pseudo-labeling approach is employed for unlabeled samples. The proposed VAT-based SS-REI method is evaluated on a radar dataset. Simulation results indicate that it outperforms the latest SS-REI method in recognition performance.
Ryota HIGASHIMOTO Soh YOSHIDA Takashi HORIHATA Mitsuji MUNEYASU
Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
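The pseudo-labeling step described above can be sketched as follows. This is a minimal illustration, not the authors' method: the function names and the 0.95 confidence threshold are assumptions, and the causal-inference debiasing is not reproduced. The class histogram shows how pseudo-labels can become imbalanced even when the classes themselves are balanced.

```python
import numpy as np

def pseudo_label(probs, threshold=0.95):
    """Keep samples whose top predicted probability meets the threshold.

    probs: (n_samples, n_classes) array of model class probabilities.
    threshold: hypothetical confidence cut-off (an assumption, not from the paper).
    Returns the kept sample indices and their pseudo-labels.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)       # highest class probability per sample
    labels = probs.argmax(axis=1)        # class with that probability
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]

def label_histogram(labels, num_classes):
    """Count how many pseudo-labels were assigned to each class."""
    return np.bincount(labels, minlength=num_classes)

probs = [
    [0.97, 0.02, 0.01],
    [0.40, 0.35, 0.25],   # not confident enough, so it is discarded
    [0.01, 0.98, 0.01],
    [0.96, 0.03, 0.01],
]
keep, labels = pseudo_label(probs)
counts = label_histogram(labels, num_classes=3)
```

In this toy input, class 0 receives two pseudo-labels, class 1 receives one, and class 2 receives none, which is the kind of skew the abstract refers to.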
Takefumi KAWAKAMI Takanori IDE Kunihito HOKI Masakazu MURAMATSU
In this paper, we apply two machine learning methods, dropout and semi-supervised learning, to a recently proposed method called CSQ-SDL, which uses deep neural networks to evaluate shift quality from time-series measurement data. When developing a new Automatic Transmission (AT), calibration takes place in which many parameters of the AT are adjusted to realize a pleasant driving experience in all situations that occur on all roads around the world. Calibration requires an expert to visually assess the shift quality from the time-series measurement data of the experiments each time the parameters are changed, which is iterative and time-consuming. The CSQ-SDL was developed to shorten the time consumed by the visual assessment, and its effectiveness depends on acquiring a sufficient number of data points. In practice, however, the amount of data is often insufficient. The methods proposed here can handle such cases. For cases in which only a small number of labeled data points is available, we propose a method that uses dropout. For cases in which the number of labeled data points is small but the number of unlabeled data points is sufficient, we propose a method that uses semi-supervised learning. Experiments show that while the former gives a moderate improvement, the latter offers a significant performance improvement.
Yuzhuo LIU Hangting CHEN Qingwei ZHAO Pengyuan ZHANG
Weakly labelled semi-supervised audio tagging (AT) and sound event detection (SED) have become significant in real-world applications. A popular method is teacher-student learning, making student models learn from pseudo-labels generated by teacher models from unlabelled data. To generate high-quality pseudo-labels, we propose a master-teacher-student framework trained with a dual-lead policy. Our experiments illustrate that our model outperforms the state-of-the-art model on both tasks.
Kazuhiko MURASAKI Shingo ANDO Jun SHIMAMURA
In this paper, we propose a semi-supervised triplet loss function that realizes semi-supervised representation learning in a novel manner. We extend conventional triplet loss, which uses labeled data to achieve representation learning, so that it can deal with unlabeled data. We estimate, in advance, the degree to which each label applies to each unlabeled data point, and optimize the loss function with unlabeled features according to the resulting ratios. Since the proposed loss function has the effect of adjusting the distribution of all unlabeled data, it complements methods based on consistency regularization, which has been extensively studied in recent years. Combined with a consistency regularization-based method, our method achieves more accurate semi-supervised learning. Experiments show that the proposed loss function achieves a higher accuracy than the conventional fine-tuning method.
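For reference, the conventional (fully labeled) triplet loss that this work extends can be sketched as below. This is one common formulation using squared Euclidean distance and a hinge with margin; the function name and the margin value are assumptions, and the proposed semi-supervised extension with per-label ratios is not reproduced here.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Conventional triplet loss on batches of feature vectors.

    anchor, positive, negative: (batch, dim) arrays, where each positive
    shares the anchor's label and each negative has a different label.
    margin: hypothetical margin value (an assumption).
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # squared distance to negative
    # Penalize triplets where the negative is not at least `margin` farther away.
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

easy = triplet_loss([[0.0, 0.0]], [[1.0, 0.0]], [[3.0, 0.0]])   # negative is far away
hard = triplet_loss([[0.0, 0.0]], [[1.0, 0.0]], [[0.5, 0.0]])   # negative is too close
```

The loss is zero once the negative lies sufficiently farther from the anchor than the positive, so only violating triplets contribute to training.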
Genki OSADA Budrul AHSAN Revoti PRASAD BORA Takashi NISHIDE
Virtual Adversarial Training (VAT) has shown impressive results among recently developed regularization methods called consistency regularization. VAT utilizes adversarial samples, generated by injecting perturbation in the input space, for training and thereby enhances the generalization ability of a classifier. However, such adversarial samples can be generated only within a very small area around the input data point, which limits the adversarial effectiveness of such samples. To address this problem, we propose LVAT (Latent space VAT), which injects perturbation in the latent space instead of the input space. LVAT can generate adversarial samples flexibly, resulting in a more adverse effect and thus more effective regularization. The latent space is built by a generative model, and in this paper we examine two different types of models: a variational auto-encoder and a normalizing flow, specifically Glow. We evaluated the performance of our method in both supervised and semi-supervised learning scenarios for an image classification task using the SVHN and CIFAR-10 datasets. In our evaluation, we found that our method outperforms VAT and other state-of-the-art methods.
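To make the input-space perturbation concrete, here is a minimal sketch of the standard VAT loss for a linear softmax classifier, where the gradient of the KL divergence with respect to the perturbation is available in closed form. The linear model, the function names, and the values of `epsilon`, `xi`, and `n_power` are all assumptions chosen for illustration; a real DNN would use automatic differentiation, and LVAT's latent-space variant is not reproduced.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def vat_loss_linear(x, W, b, epsilon=1.0, xi=1e-6, n_power=1, seed=0):
    """VAT loss for logits = x @ W.T + b (no labels are needed).

    x: (n, features), W: (classes, features), b: (classes,).
    epsilon, xi, n_power: hypothetical hyperparameters (assumptions).
    """
    rng = np.random.default_rng(seed)
    p = softmax(x @ W.T + b)                      # predictions on the clean inputs
    d = rng.normal(size=x.shape)                  # random initial direction
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    for _ in range(n_power):                      # power iteration
        q = softmax((x + xi * d) @ W.T + b)
        grad = (q - p) @ W                        # gradient of KL(p || q) w.r.t. the perturbation
        d = grad / (np.linalg.norm(grad, axis=1, keepdims=True) + 1e-12)
    r_adv = epsilon * d                           # virtual adversarial perturbation
    q_adv = softmax((x + r_adv) @ W.T + b)
    return kl_divergence(p, q_adv).mean()

W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([[0.5, -0.2], [1.0, 2.0]])
loss = vat_loss_linear(x, W, b)
loss_no_perturbation = vat_loss_linear(x, W, b, epsilon=0.0)
```

Because the perturbation norm is capped at `epsilon`, the adversarial sample stays in a small ball around the input, which is the limitation the abstract points out.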
Danlei XING Fei WU Ying SUN Xiao-Yuan JING
Cross-project defect prediction (CPDP) is a feasible solution for building an accurate prediction model without enough historical data. Although existing CPDP methods that use only labeled data to build the prediction model achieve great results, there is still much room to improve prediction performance. In this paper, we propose a Semi-Supervised Discriminative Feature Learning (SSDFL) approach for CPDP. SSDFL first transfers knowledge of source and target data into a common space by using a fully-connected neural network to mine potential similarities between source and target data. Next, we reduce the differences in both the marginal distributions and the conditional distributions between the mapped source and target data. We also introduce discriminative feature learning to make full use of label information, so that instances from the same class are close to each other and instances from different classes are distant from each other. Extensive experiments are conducted on 10 projects from the AEEEM and NASA datasets, and the experimental results indicate that our approach obtains better prediction performance than the baselines.
This paper proposes a method to create diverse training images for instance segmentation in a semi-supervised manner. In our proposed learning scheme, a few 3D CG models of target objects and a large number of images retrieved by keywords from the Internet are employed for initial model training and model update, respectively. Instance segmentation requires pixel-level annotations as well as object class labels in all training images. A possible solution to reduce the huge annotation cost is to use synthesized images as training images. While image synthesis using a 3D CG simulator can generate the annotations automatically, it is difficult to prepare a variety of 3D object models for the simulator. Another possible solution is semi-supervised learning. Semi-supervised learning such as self-training uses a small set of supervised data and a huge amount of unsupervised data. In our method, the supervised images are given by the 3D CG simulator. From the unsupervised images, we have to select only correctly detected annotations. To select them, we propose to quantify the reliability of each detected annotation based on its silhouette as well as its textures. Experimental results demonstrate that the proposed method can generate more varied images for improving instance segmentation.
Jing SUN Yi-mu JI Shangdong LIU Fei WU
Software defect prediction (SDP) plays a vital role in allocating testing resources reasonably and ensuring software quality. For cases in which there are not enough labeled historical modules, a considerable number of semi-supervised SDP methods have been proposed, and these methods utilize limited labeled modules and abundant unlabeled modules simultaneously. Nevertheless, most of them make use of traditional features rather than powerful deep feature representations. Besides, the cost of misclassifying defective modules is higher than that of misclassifying defect-free ones, and the number of defective modules available for training is small. Taking the above issues into account, we propose a cost-sensitive and sparse ladder network (CSLN) for SDP. We first introduce the semi-supervised ladder network to extract deep feature representations. We then introduce cost-sensitive learning to set different misclassification costs for defect-prone and defect-free instances to alleviate the class imbalance problem. A sparse constraint is added on the hidden nodes in the ladder network when the number of hidden nodes is large, which enables the model to find robust structures in the data. Extensive experiments on the AEEEM dataset show that the CSLN outperforms several state-of-the-art semi-supervised SDP methods.
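The idea of assigning different misclassification costs can be illustrated with a weighted binary cross-entropy. This is a generic sketch of cost-sensitive learning, not the CSLN objective itself: the function name and the cost values (5.0 for a missed defect, 1.0 for a false alarm) are assumptions.

```python
import numpy as np

def cost_sensitive_bce(y_true, p_defect, cost_fn=5.0, cost_fp=1.0, eps=1e-12):
    """Binary cross-entropy with class-dependent misclassification costs.

    y_true: 1 for defective modules, 0 for defect-free modules.
    p_defect: predicted probability that each module is defective.
    cost_fn: hypothetical cost of missing a defective module (assumption).
    cost_fp: hypothetical cost of flagging a defect-free module (assumption).
    """
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_defect, dtype=float), eps, 1.0 - eps)
    loss = -(cost_fn * y * np.log(p) + cost_fp * (1.0 - y) * np.log(1.0 - p))
    return loss.mean()

missed_defect = cost_sensitive_bce([1], [0.1])   # defective module scored as likely clean
false_alarm = cost_sensitive_bce([0], [0.9])     # clean module scored as likely defective
```

With these costs, an equally confident mistake on a defective module is penalized five times as heavily as one on a defect-free module, which pushes the model toward the minority class.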
Yuichiro WADA Siqiang SU Wataru KUMAGAI Takafumi KANAMORI
This paper proposes a computationally efficient offline semi-supervised algorithm that yields a more accurate prediction than the label propagation algorithm, which is commonly used in online graph-based semi-supervised learning (SSL). Our proposed method is an offline method that is intended to assist online graph-based SSL algorithms. The efficacy of the tool in creating new learning algorithms of this type is demonstrated in numerical experiments.
Xiaotao CHENG Lixin JI Ruiyang HUANG Ruifei CUI
Network embedding has attracted an increasing amount of attention in recent years due to its wide-ranging applications in graph mining tasks such as vertex classification, community detection, and network visualization. Network embedding is an important method for learning low-dimensional representations of vertices in networks, aiming to capture and preserve the network structure. Almost all existing network embedding methods adopt the so-called Skip-gram model in Word2vec. However, as a bag-of-words model, the Skip-gram model mainly utilizes local structure information. The lack of information metrics for vertices in the global network leads to the mixing of vertices with different labels in the new embedding space. To solve this problem, in this paper we propose a Network Representation Learning method with Deep Metric Learning, namely DML-NRL. By setting the initialized anchor vertices and adding the similarity measure in the training process, the distance information between vertices with different labels in the network is integrated into the vertex representation, which effectively improves the accuracy of the network embedding algorithm. We compare our method with baselines by applying them to the tasks of multi-label classification and data visualization of vertices. The experimental results show that our method outperforms the baselines on all three datasets and that it is effective and robust.
Peng SONG Shifeng OU Xinran ZHANG Yun JIN Wenming ZHENG Jinglei LIU Yanwei YU
In practice, emotional speech utterances are often collected from different devices or under different conditions, which leads to a discrepancy between the training and testing data and results in a sharp decrease in recognition rates. To solve this problem, in this letter, a novel transfer semi-supervised non-negative matrix factorization (TSNMF) method is presented. A semi-supervised non-negative matrix factorization (SNMF) algorithm, utilizing both labeled source and unlabeled target data, is adopted to learn common feature representations. Meanwhile, the maximum mean discrepancy (MMD) is employed as a similarity measure to reduce the distance between the feature distributions of the two databases. Finally, the TSNMF algorithm, which optimizes the SNMF and MMD functions together, is proposed to obtain robust feature representations across databases. Extensive experiments demonstrate that, in comparison to state-of-the-art approaches, our proposed method can significantly improve cross-corpus recognition rates.
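The MMD term can be illustrated with its simplest empirical form, which uses a linear kernel and reduces to the squared distance between the feature means of the two databases. This is a sketch under that assumption; the function name is hypothetical, and the abstract does not state which kernel TSNMF uses.

```python
import numpy as np

def linear_mmd2(source, target):
    """Squared empirical MMD with a linear kernel.

    source: (n_source, dim) feature matrix from the training corpus.
    target: (n_target, dim) feature matrix from the testing corpus.
    With a linear kernel, MMD^2 equals the squared Euclidean distance
    between the two sample means.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    delta = source.mean(axis=0) - target.mean(axis=0)
    return float(delta @ delta)

shifted = linear_mmd2([[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [3.0, 3.0]])
matched = linear_mmd2([[0.0, 0.0], [2.0, 2.0]], [[2.0, 2.0], [0.0, 0.0]])
```

Minimizing such a term alongside the factorization objective pulls the learned source and target representations toward a shared distribution.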
Zhen GUO Yujie ZHANG Chen SU Jinan XU Hitoshi ISAHARA
Recent work on joint word segmentation, POS (Part Of Speech) tagging, and dependency parsing in Chinese has two key problems: the first is that character-based word segmentation and word-based dependency parsing were not combined well in the transition-based framework, and the second is that the joint model suffers from the insufficiency of annotated corpora. To resolve the first problem, we propose to transform the traditional word-based dependency tree into a character-based dependency tree by using the internal structure of words, and then propose a novel character-level joint model for the three tasks. To resolve the second problem, we propose a novel semi-supervised joint model that exploits n-gram features and dependency subtree features from a partially annotated corpus. Experimental results on the Chinese Treebank show that our joint model achieved 98.31%, 94.84% and 81.71% for Chinese word segmentation, POS tagging, and dependency parsing, respectively. Our model outperforms the pipeline model of the three tasks by 0.92%, 1.77% and 3.95%, respectively. In particular, the F1 values for word segmentation and POS tagging are the best among those reported to date.
Junjun GUO Zhiyong LI Jianjun MU
In this letter, a novel collaborative representation graph based on the local and global consistency label propagation method, denoted as CRLGC, is proposed. The collaborative representation graph is used to reduce the time cost of obtaining the graph, which evaluates the similarity of samples. Considering the lack of labeled samples in real applications, a semi-supervised label propagation method is utilized to transmit the labels from the labeled samples to the unlabeled samples. Experimental results on three image data sets demonstrate that the proposed method provides the best accuracy in most cases when compared with other traditional graph-based semi-supervised classification methods.
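The label propagation step can be sketched with the standard closed-form solution of the local and global consistency (LGC) method, F = (I - alpha * S)^(-1) Y, where S is the symmetrically normalized affinity matrix. The hand-written affinity matrix below stands in for the collaborative representation graph, which is not reproduced; the function name and the value of `alpha` are assumptions.

```python
import numpy as np

def lgc_propagate(W, Y, alpha=0.99):
    """Label propagation with local and global consistency.

    W: (n, n) symmetric affinity matrix with zero diagonal and no isolated nodes.
    Y: (n, classes) one-hot rows for labeled samples, zero rows for unlabeled ones.
    alpha: hypothetical propagation weight in (0, 1) (assumption).
    Returns the predicted class index of every sample.
    """
    W = np.asarray(W, dtype=float)
    Y = np.asarray(Y, dtype=float)
    inv_sqrt_degree = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * inv_sqrt_degree[:, None] * inv_sqrt_degree[None, :]   # D^(-1/2) W D^(-1/2)
    F = np.linalg.solve(np.eye(len(W)) - alpha * S, Y)            # closed-form solution
    return F.argmax(axis=1)

# Two tightly connected groups of three samples joined by one weak edge.
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
Y = np.zeros((6, 2))
Y[0, 0] = 1.0   # sample 0 is labeled as class 0
Y[5, 1] = 1.0   # sample 5 is labeled as class 1
predicted = lgc_propagate(W, Y)
```

Only two of the six samples carry labels, yet each unlabeled sample inherits the label of the group it is most strongly connected to.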
Junyang QIU Yibing WANG Zhisong PAN Bo JIA
Independent and identically distributed (i.i.d.) assumptions are commonly used in the machine learning community. However, social media data violate this assumption because of the linkages between samples. Meanwhile, given the variety of data, there exist many samples, i.e., Universum samples, that do not belong to either class of interest. These characteristics pose great challenges in dealing with social media data. In this letter, we take full advantage of Universum samples to make the model more discriminative. In addition, the linkages are taken into consideration by means of social dimensions. To this end, we propose the algorithm Semi-Supervised Linked samples Feature Selection with Universum (U-SSLFS), which integrates the linking information and Universum simultaneously to select robust features. The empirical study shows that U-SSLFS outperforms state-of-the-art algorithms on the Flickr and BlogCatalog datasets.
Jianqiao WANG Yuehua LI Jianfei CHEN Yuanjiang LI
The label estimation technique provides a new way to design semi-supervised learning algorithms. If the labels of the unlabeled data can be estimated correctly, the semi-supervised methods can be replaced by the corresponding supervised versions. In this paper, we propose a novel semi-supervised learning algorithm, called Geodesic Weighted Sparse Representation (GWSR), to estimate the labels of the unlabeled data. First, the geodesic distance and geodesic weight are calculated. The geodesic weight is utilized to reconstruct the labeled samples. The Euclidean distance between the reconstructed labeled sample and the unlabeled sample equals the geodesic distance between the original labeled sample and the unlabeled sample. Then, the unlabeled samples are sparsely reconstructed and the sparse reconstruction weight is obtained by minimizing the L1-norm. Finally, the sparse reconstruction weight is utilized to estimate the labels of the unlabeled samples. Experiments on synthetic data and the USPS handwritten digit database demonstrate the effectiveness of our method.
Yong REN Nobuhiro KAJI Naoki YOSHINAGA Masaru KITSUREGAWA
In sentiment classification, conventional supervised approaches rely heavily on a large amount of linguistic resources, which are costly to obtain for under-resourced languages. To overcome this resource scarcity problem, several methods exploit graph-based semi-supervised learning (SSL). However, fundamental issues such as controlling label propagation, choosing the initial seeds, and selecting edges have barely been studied. Our evaluation on three real datasets demonstrates that manipulating the label propagation behavior and choosing labeled seeds appropriately play a critical role in adopting graph-based SSL approaches for this task.
Xianglei XING Sidan DU Hua JIANG
We extend the Nonparametric Discriminant Analysis (NDA) algorithm to a semi-supervised dimensionality reduction technique, called Semi-supervised Nonparametric Discriminant Analysis (SNDA). SNDA preserves the inherent advantages of NDA, that is, relaxing the Gaussian assumption required for the traditional LDA-based methods. SNDA takes advantage of both the discriminating power provided by the NDA method and the locality-preserving power provided by the manifold learning. Specifically, the labeled data points are used to maximize the separability between different classes and both the labeled and unlabeled data points are used to build a graph incorporating neighborhood information of the data set. Experiments on synthetic as well as real datasets demonstrate the effectiveness of the proposed approach.
Tsubasa KOBAYASHI Masashi SUGIYAMA
The objective of pool-based incremental active learning is to choose a sample to label from a pool of unlabeled samples in an incremental manner so that the generalization error is minimized. In this scenario, the generalization error often hits a minimum in the middle of the incremental active learning procedure and then it starts to increase. In this paper, we address the problem of early labeling stopping in probabilistic classification for minimizing the generalization error and the labeling cost. Among several possible strategies, we propose to stop labeling when the empirical class-posterior approximation error is maximized. Experiments on benchmark datasets demonstrate the usefulness of the proposed strategy.
Viet Cuong NGUYEN Le Minh NGUYEN Akira SHIMAZU
In the text summarization field, a table-of-contents is a type of indicative summary that is especially suited for locating information in a long document or a set of documents. It is also a useful summary that lets a reader quickly get an overview of the entire contents. Current models for generating a table-of-contents produce relatively low-quality output with many meaningless titles, or titles whose meaning does not overlap with the corresponding contents. This problem may be due to the lack of semantic information and topic information in those models. In this research, we propose to integrate supportive knowledge into the learning models to improve the quality of titles in a generated table-of-contents. The supportive knowledge is derived from a hierarchical clustering of words, which is built from a large collection of raw text, and a topic model, which is directly estimated from the training data. The relatively good experimental results show that the semantic and topic information supplied by the supportive knowledge has a positive effect on title generation and therefore helps to improve the quality of the generated table-of-contents.