Hongcui WANG Shanshan LIU Di JIN Lantian LI Jianwu DANG
Recognizing which speech segments belong to the same speaker is an important speech analysis task in various applications. Recent work has shown that speaker utterances lie on an underlying manifold in the model-parameter space. However, most speaker clustering methods operate in Euclidean space, and hence often fail to discover the intrinsic geometrical structure of the data space or to exploit it. To address this problem, we convert the i-vector representations of utterances in Euclidean space into a network constructed from the local k-nearest-neighbor relationships among these signals. We then propose an efficient community detection model on this speaker content network for clustering the signals. The new model is based on probabilistic community memberships and is further refined with the idea that if two connected nodes have a high similarity, their community membership distributions in the model should be close. This refinement enforces the local invariance assumption, and thus respects the structure of the underlying manifold better than existing community detection methods. Experiments are conducted on graphs built from two Chinese speech databases and the NIST 2008 Speaker Recognition Evaluation (SRE) data. The results provide insight into the structure of the speakers present in the data and confirm the effectiveness of the proposed method, which outperforms other state-of-the-art clustering algorithms. Metrics for constructing the speaker content graph are also discussed.
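The graph-construction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_graph`, the use of cosine similarity as the metric, the rule of keeping an edge when either endpoint selects the other, and the toy 2-D vectors standing in for i-vectors are all assumptions made for illustration.

```python
import numpy as np

def knn_graph(vectors, k):
    """Symmetric 0/1 adjacency matrix linking each row to its k most cosine-similar rows.

    Hypothetical helper; assumes no row is the zero vector.
    """
    x = np.asarray(vectors, dtype=float)
    # Length-normalise so that the dot product equals cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbour
    # Indices of the k largest similarities in each row.
    neighbours = np.argsort(-sim, axis=1)[:, :k]
    n = x.shape[0]
    adj = np.zeros((n, n), dtype=int)
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1
    # Keep an edge if either endpoint selected the other (assumed convention).
    return np.maximum(adj, adj.T)

# Toy stand-in for i-vectors: two tight groups of three 2-D points.
toy = np.array([[1.0, 0.1], [1.0, 0.0], [0.9, 0.15],
                [0.1, 1.0], [0.0, 1.0], [0.15, 0.9]])
graph = knn_graph(toy, k=2)
```

On this toy input no edge joins the two groups, which is the kind of structure a community detection model would then be expected to recover.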
Yuichiro WADA Siqiang SU Wataru KUMAGAI Takafumi KANAMORI
This paper proposes a computationally efficient offline semi-supervised algorithm that yields a more accurate prediction than the label propagation algorithm, which is commonly used in online graph-based semi-supervised learning (SSL). Our proposed method is an offline method that is intended to assist online graph-based SSL algorithms. The efficacy of the tool in creating new learning algorithms of this type is demonstrated in numerical experiments.
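For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of classical label propagation on a weighted graph, in the iterate-and-clamp form. It illustrates the baseline only, not the method proposed in the paper; the function name, the fixed iteration count, and the toy chain graph are assumptions.

```python
import numpy as np

def label_propagation(weights, labels, n_classes, n_iter=200):
    """Propagate known labels over a weighted graph.

    weights: symmetric non-negative (n, n) matrix with no isolated nodes.
    labels:  length-n integer array, with -1 marking unlabeled nodes.
    """
    w = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    p = w / w.sum(axis=1, keepdims=True)  # row-normalised transition matrix
    labelled = labels >= 0
    f = np.zeros((len(labels), n_classes))
    f[labelled, labels[labelled]] = 1.0   # one-hot scores for known labels
    clamp = f[labelled].copy()
    for _ in range(n_iter):
        f = p @ f            # each node averages its neighbours' scores
        f[labelled] = clamp  # known labels are reset after every step
    return f.argmax(axis=1)

# Chain 0 - 1 - 2 - 3 with a weak link between nodes 1 and 2.
w = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
predicted = label_propagation(w, labels=[0, -1, -1, 1], n_classes=2)
```

Here node 1 takes the label of node 0 and node 2 takes the label of node 3, because the weak middle edge carries little influence.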
Ruisheng RAN Bin FANG Xuegang WU
Neighborhood preserving embedding (NPE) is a widely used manifold dimensionality reduction technique. However, NPE faces two problems. First, it suffers from the small-sample-size (SSS) problem. Second, its performance is highly sensitive to the neighborhood size k. To overcome both problems, this paper proposes exponential neighborhood preserving embedding (ENPE). The main idea of ENPE is to introduce the matrix exponential into NPE, which avoids the SSS problem and lowers the sensitivity to the neighborhood size k. Experiments are conducted on the ORL, Georgia Tech, and AR face databases. The results show that ENPE outperforms other unsupervised methods such as PCA, LPP, ELPP, and NPE, and that it is much less sensitive to the neighborhood parameter k than the unsupervised manifold learning methods LPP, ELPP, and NPE.
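One reason a matrix exponential can sidestep the SSS problem is that the exponential of any square matrix is invertible, even when the matrix itself is singular. The sketch below illustrates only this general fact, not the ENPE algorithm; the helper `sym_expm`, the random data, and the scaling factor are assumptions chosen for the demonstration.

```python
import numpy as np

def sym_expm(a):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    # V diag(exp(vals)) V^T
    return (vecs * np.exp(vals)) @ vecs.T

rng = np.random.default_rng(0)
# 20 features but only 5 samples (columns): the small-sample-size setting.
# The data are scaled down so that exp() stays well conditioned in floating point.
x = rng.standard_normal((20, 5)) / 10.0
scatter = x @ x.T                # 20 x 20, but rank at most 5
exp_scatter = sym_expm(scatter)  # eigenvalues exp(lambda) >= 1, hence full rank

rank_before = np.linalg.matrix_rank(scatter)
rank_after = np.linalg.matrix_rank(exp_scatter)
```

The zero eigenvalues of the singular matrix become eigenvalues equal to 1 after exponentiation, so a generalized eigenproblem built on the exponentiated matrix no longer requires inverting a singular matrix.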
Lu SUN Mineichi KUDO Keigo KIMURA
Multi-label classification is an appealing and challenging supervised learning problem in which multiple labels, rather than a single label, are associated with an unseen test instance. To remove possible noise in labels and high-dimensional features, multi-label dimension reduction has attracted increasing attention in recent years. Existing methods usually suffer from several problems, such as ignoring label outliers and label correlations. In addition, most of them perform dimension reduction in either an unsupervised or a supervised way, and are therefore unable to exploit the label information or a large amount of unlabeled data to improve performance. To cope with these problems, we propose a novel method termed Robust sEmi-supervised multi-lAbel DimEnsion Reduction, or READER for short. From the viewpoint of empirical risk minimization, READER selects the most discriminative features for all the labels in a semi-supervised way. Specifically, the ℓ2,1-norm induced loss function and regularization term make READER robust to outliers among the data points. READER finds a feature subspace that keeps originally neighboring instances close, and embeds labels into a low-dimensional latent space nonlinearly. To optimize the objective function, an efficient, convergent algorithm is developed. Extensive empirical studies on real-world datasets demonstrate the superior performance of the proposed method.
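The ℓ2,1-norm mentioned in this abstract is the sum of the Euclidean norms of a matrix's rows. The sketch below shows the norm itself and why it penalizes an outlying row less heavily than a squared Frobenius loss; it is not the READER objective, and the residual matrix is a made-up example.

```python
import numpy as np

def l21_norm(m):
    """Sum over rows of each row's Euclidean norm."""
    m = np.asarray(m, dtype=float)
    return float(np.sqrt((m ** 2).sum(axis=1)).sum())

# Hypothetical residual matrix, one row per data point; the last row is an outlier.
residual = np.array([[3.0, 4.0],
                     [0.0, 0.0],
                     [6.0, 8.0]])

l21 = l21_norm(residual)                      # 5 + 0 + 10
squared_frobenius = float((residual ** 2).sum())  # 25 + 0 + 100
```

The outlier row contributes linearly (10 of 15) to the ℓ2,1-norm but quadratically (100 of 125) to the squared Frobenius norm, which is the usual argument for the robustness of ℓ2,1-based losses.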
Daisuke TANAKA Takamitsu MATSUBARA Kenji SUGIMOTO
In this paper, we consider the problem of system identification from high-dimensional input and output data. If the relationship between the features extracted from the data is represented as a linear time-invariant dynamical system, the input-output manifold learning method has been shown to be a powerful tool for solving such a system identification problem. However, in the previous study, the system is assumed to be initially relaxed because a transfer function model is used for system representation. This assumption may not hold in several tasks. To handle initially non-relaxed systems, we propose an alternative approach to input-output manifold learning that uses a state space model for the system representation. The effectiveness of our proposed method is confirmed by experiments with synthetic data and motion capture data of human-human conversation.
In image classification applications, a test sample with multiple handcrafted descriptions can be sparsely represented by a few training subjects. Our paper is motivated by the success of multi-task joint sparse representation (MTJSR), and considers that the different feature modalities are constrained not only by joint sparsity across tasks but also by local manifold structure across features. We introduce the local manifold structure constraint into the MTJSR framework and propose the locality-constrained multi-task joint sparse representation method (LC-MTJSR). To optimize the formulated objective, stochastic gradient descent is used to obtain a fast convergence rate, which is essential for large-scale image categorization. Experiments on several challenging object classification datasets show that the proposed algorithm outperforms MTJSR and is competitive with state-of-the-art multiple kernel learning methods.
Xianglei XING Sidan DU Hua JIANG
We extend the Nonparametric Discriminant Analysis (NDA) algorithm to a semi-supervised dimensionality reduction technique called Semi-supervised Nonparametric Discriminant Analysis (SNDA). SNDA preserves the inherent advantage of NDA, namely relaxing the Gaussian assumption required by traditional LDA-based methods. SNDA combines the discriminating power of the NDA method with the locality-preserving power of manifold learning. Specifically, the labeled data points are used to maximize the separability between different classes, and both the labeled and unlabeled data points are used to build a graph that incorporates neighborhood information of the data set. Experiments on synthetic as well as real datasets demonstrate the effectiveness of the proposed approach.
Jin-Ping HE Guang-Da SU Jian-Sheng CHEN
To reconstruct low-resolution facial photographs that are in focus and free of motion blur, a novel algorithm based on local similarity preservation is proposed. It builds on the theory of local manifold learning. The innovations of the new method include combining point-based entropy with Euclidean distance to search for the nearest points, adding a point-to-patch degradation model to constrain the linear weights, and compensating the fused patch to maintain energy coherence. The compensation reduces the algorithm's dependence on training sets and keeps the luminance of the reconstruction constant. Experiments show that our method can effectively reconstruct 16×12 images at a magnification of 8×8, as well as 32×24 facial photographs that are in focus and free of motion blur.
Qian LIU Chao LAN Xiao Yuan JING Shi Qiang GAO David ZHANG Jing Yu YANG
In the past few years, discriminant analysis and manifold learning have been widely used in feature extraction. Recently, the sparse representation technique has advanced the development of pattern recognition. In this paper, we combine discriminant analysis and manifold learning with the sparse representation technique and propose a novel feature extraction approach named sparsity preserving embedding with manifold learning and discriminant analysis. It seeks an embedded space in which not only are the sparse reconstructive relations among the original samples preserved, but the manifold and discriminant information of both the original sample set and the corresponding reconstructed sample set is also maintained. Experimental results on the public AR and FERET face databases show that our approach outperforms related methods in recognition performance.
Zhengming MA Jing CHEN Shuaibin LIAN
Locally linear embedding (LLE) is a well-known method for nonlinear dimensionality reduction. The mathematical proof and experimental results presented in this paper show that the neighborhood size in LLE must be smaller than the dimension of the input data space; otherwise LLE degenerates from a nonlinear dimensionality reduction method into a linear one. Furthermore, when the neighborhood size is larger than the dimension of the input data space, the solutions to LLE are not unique. In these cases, adding a regularization term is often proposed. The experimental results presented in this paper show that regularization is not robust: regularization parameters that are too large or too small cannot unwrap the S-curve, and although a moderate regularization parameter can unwrap it, the relative distances in the input data are distorted in the process. Therefore, to let LLE fully exploit its advantage in nonlinear dimensionality reduction and to avoid multiple solutions, the best approach is to ensure that the neighborhood size is smaller than the dimension of the input data space.
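The non-uniqueness discussed in this abstract can be seen in the local Gram matrix that LLE inverts to obtain reconstruction weights. The sketch below is an illustration under assumptions, not the paper's proof: the helper names, the random data, and the trace-scaled regularizer (one common convention) are all hypothetical choices.

```python
import numpy as np

def local_gram(point, neighbours):
    """k x k Gram matrix of the neighbours after centring them on `point`."""
    diff = neighbours - point
    return diff @ diff.T

def lle_weights(point, neighbours, reg=1e-3):
    """Reconstruction weights summing to 1, using a trace-scaled ridge term (assumed)."""
    g = local_gram(point, neighbours)
    k = g.shape[0]
    g = g + reg * np.trace(g) * np.eye(k)  # makes the system positive definite
    w = np.linalg.solve(g, np.ones(k))
    return w / w.sum()

rng = np.random.default_rng(0)
dim = 3
point = rng.standard_normal(dim)
few = rng.standard_normal((2, dim))   # neighbourhood size 2 < dimension 3
many = rng.standard_normal((6, dim))  # neighbourhood size 6 > dimension 3

rank_few = np.linalg.matrix_rank(local_gram(point, few))    # full rank for generic data
rank_many = np.linalg.matrix_rank(local_gram(point, many))  # capped by the dimension
weights = lle_weights(point, many)
```

With six neighbours in three dimensions the 6 x 6 Gram matrix has rank at most 3, so the unregularized weight system has infinitely many solutions, and the weights returned above depend on the value chosen for `reg`.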
Xiaokan WANG Xia MAO Catalin-Daniel CALEANU
To improve the nonlinear alignment performance of Active Appearance Models (AAM), we apply a variant of the nonlinear manifold learning algorithm Locally Linear Embedding to model the shape-texture manifold. Experiments show that our method maintains a lower alignment residual than the traditional AAM based on Principal Component Analysis (PCA) for small-scale movements, and aligns successfully for large-scale motions where PCA-AAM fails.