
Author Search Result

[Author] Yun JIN (9 hits)

Showing 1-9 of 9
  • A Novel Bayes' Theorem-Based Saliency Detection Model

    Xin HE  Huiyun JING  Qi HAN  Xiamu NIU  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E94-D No:12  Page(s): 2545-2548

    We propose a novel saliency detection model based on Bayes' theorem. The model integrates the two parts of Bayes' equation to measure saliency, whereas previous models considered each part only separately. The proposed model measures saliency by computing the local kernel density estimation of features in the center-surround region and the global kernel density estimation of features at each pixel across the whole image. Under the proposed model, a saliency detection method is presented that extracts the DCT (Discrete Cosine Transform) magnitude of the local region around each pixel as the feature. Experiments show that the proposed model not only performs competitively on psychological patterns and better than current state-of-the-art models on human visual fixation data, but is also robust against signal uncertainty.
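
    As a rough illustration of the measurement described above, the sketch below (not the authors' implementation) computes per-block DCT-magnitude features and scores each block by a local, center-surround kernel density estimate relative to a global one. The patch size, Gaussian bandwidth, and the ratio used to combine the two densities are assumptions made for this example; the paper combines the two parts through Bayes' equation.

    ```python
    import numpy as np
    import cv2

    def dct_magnitude_features(gray, patch=8):
        """DCT-magnitude feature of each non-overlapping patch (assumed feature form)."""
        h, w = gray.shape
        gh, gw = h // patch, w // patch
        feats = np.zeros((gh * gw, patch * patch), np.float32)
        for i in range(gh):
            for j in range(gw):
                block = gray[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch].astype(np.float32)
                feats[i * gw + j] = np.abs(cv2.dct(block)).ravel()
        return feats, (gh, gw)

    def kde(x, samples, bandwidth=10.0):
        """Gaussian kernel density estimate of feature x given a sample set."""
        d2 = np.sum((samples - x) ** 2, axis=1)
        return np.mean(np.exp(-d2 / (2.0 * bandwidth ** 2)))

    def bayes_like_saliency(gray, patch=8, radius=2, bandwidth=10.0):
        feats, (gh, gw) = dct_magnitude_features(gray, patch)
        sal = np.zeros(gh * gw, np.float32)
        for idx, x in enumerate(feats):
            i, j = divmod(idx, gw)
            # local (center-surround) density: neighbouring blocks only
            neigh = [r * gw + c
                     for r in range(max(0, i - radius), min(gh, i + radius + 1))
                     for c in range(max(0, j - radius), min(gw, j + radius + 1))
                     if r * gw + c != idx]
            local = kde(x, feats[neigh], bandwidth)   # density within the surround
            glob = kde(x, feats, bandwidth)           # density over the whole image
            sal[idx] = local / (glob + 1e-8)          # illustrative combination of the two parts
        return sal.reshape(gh, gw)
    ```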

  • CBRISK: Colored Binary Robust Invariant Scalable Keypoints

    Huiyun JING  Xin HE  Qi HAN  Xiamu NIU  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E96-D No:2  Page(s): 392-395

    BRISK (Binary Robust Invariant Scalable Keypoints) works dramatically faster than well-established algorithms (SIFT and SURF) while maintaining matching performance. However, BRISK relies on intensity alone; color information in the image is ignored. In view of the importance of color information in vision applications, we propose CBRISK, a novel method that takes color information into account during keypoint detection and description. Instead of the grayscale intensity image, the proposed approach detects keypoints in a photometric invariant color space. On the basis of the binary intensity (original BRISK) descriptor, the proposed approach embeds a binary invariant color representation in the CBRISK descriptor. Experimental results show that CBRISK is more discriminative and robust than BRISK with respect to photometric variation.
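
    A schematic sketch of the idea, assuming OpenCV's BRISK implementation is available: keypoints are detected on a crude photometric-invariant channel rather than the grayscale image, and a binary colour part (here approximated by BRISK bits computed on that invariant channel) is concatenated with the original intensity descriptor. Both the invariant channel and the colour part are simplifications for illustration, not the paper's exact construction.

    ```python
    import numpy as np
    import cv2

    def opponent_angle(bgr):
        """A crude photometric-invariant channel (opponent-colour angle), as a stand-in."""
        b, g, r = [c.astype(np.float32) for c in cv2.split(bgr)]
        o1 = (r - g) / np.sqrt(2.0)
        o2 = (r + g - 2.0 * b) / np.sqrt(6.0)
        ang = np.arctan2(o1, o2 + 1e-6)
        return cv2.normalize(ang, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    def cbrisk_like_descriptors(bgr):
        inv = opponent_angle(bgr)
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        brisk = cv2.BRISK_create()
        kps = brisk.detect(inv, None)                    # keypoints on the invariant channel
        kps, intensity_desc = brisk.compute(gray, kps)   # original binary BRISK part
        _, color_desc = brisk.compute(inv, kps)          # toy binary colour part
        return kps, np.hstack([intensity_desc, color_desc])   # concatenated descriptor
    ```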

  • Saliency Density and Edge Response Based Salient Object Detection

    Huiyun JING  Qi HAN  Xin HE  Xiamu NIU  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E96-D No:5  Page(s): 1243-1246

    We propose a novel threshold-free salient object detection approach that integrates both saliency density and edge response. A salient object with a well-defined boundary can be detected automatically by our approach. Maximization of saliency density and edge response is used as the quality function to direct the salient object discovery. The globally optimal window containing a salient object is efficiently located through the proposed saliency density and edge response based branch-and-bound search. To extract the salient object with a well-defined boundary, the GrabCut method is applied, initialized by the located window. Experimental results show that our approach outperforms methods using only saliency or edge response and achieves performance comparable with the best state-of-the-art method, while requiring neither a threshold nor multiple iterations of GrabCut.
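
    The sketch below illustrates one plausible form of such a quality function, scoring a window by its saliency density plus an edge-response term, each evaluated in O(1) per window via integral images. The exact formulation and the branch-and-bound search of the paper are not reproduced; the scan over a coarse grid of windows here is only a stand-in.

    ```python
    import numpy as np
    import cv2

    def best_window(saliency, edges, lam=0.5, step=16):
        """saliency, edges: float32 maps in [0, 1]; returns (x0, y0, x1, y1)."""
        sal_int = cv2.integral(saliency)    # (h+1, w+1) integral images for O(1) window sums
        edge_int = cv2.integral(edges)
        h, w = saliency.shape
        best, best_score = None, -np.inf
        for y0 in range(0, h - step, step):
            for x0 in range(0, w - step, step):
                for y1 in range(y0 + step, h, step):
                    for x1 in range(x0 + step, w, step):
                        area = (y1 - y0) * (x1 - x0)
                        sal = (sal_int[y1, x1] - sal_int[y0, x1]
                               - sal_int[y1, x0] + sal_int[y0, x0])
                        edge = (edge_int[y1, x1] - edge_int[y0, x1]
                                - edge_int[y1, x0] + edge_int[y0, x0])
                        score = sal / area + lam * edge / area   # density + edge response
                        if score > best_score:
                            best, best_score = (x0, y0, x1, y1), score
        return best
    ```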

  • Speech Emotion Recognition Using Transfer Learning

    Peng SONG  Yun JIN  Li ZHAO  Minghai XIN  

     
    LETTER-Speech and Hearing

    Vol: E97-D No:9  Page(s): 2530-2532

    A major challenge for speech emotion recognition is that when the training and testing data come from different speech corpora, recognition rates drop markedly. Transfer learning, which has successfully addressed cross-domain classification and recognition problems, is applied here to cross-corpus speech emotion recognition. First, by using the maximum mean discrepancy embedding (MMDE) optimization and dimension reduction algorithms, two close low-dimensional feature spaces are obtained for the source and target speech corpora, respectively. Then, a classifier is trained using the learned low-dimensional features in the labeled source corpus and directly applied to the unlabeled target corpus for emotion label recognition. Experimental results demonstrate that the transfer learning method significantly outperforms the traditional automatic recognition technique for cross-corpus speech emotion recognition.
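
    For concreteness, a minimal sketch of the maximum mean discrepancy criterion that the MMDE step minimizes, together with the train-on-source, test-on-target protocol. The RBF bandwidth and the SVM classifier are assumptions for illustration, and the learned shared low-dimensional embedding itself is not reproduced.

    ```python
    import numpy as np
    from sklearn.svm import SVC

    def rbf_mmd2(X, Y, gamma=1.0):
        """Squared MMD between source samples X and target samples Y under an RBF kernel."""
        def k(A, B):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)
        return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

    def cross_corpus_recognition(X_src, y_src, X_tgt):
        """Fit on the labeled source corpus, predict emotion labels on the target corpus."""
        clf = SVC(kernel="rbf").fit(X_src, y_src)
        return clf.predict(X_tgt)
    ```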

  • Speaker-Independent Speech Emotion Recognition Based on Two-Layer Multiple Kernel Learning

    Yun JIN  Peng SONG  Wenming ZHENG  Li ZHAO  Minghai XIN  

     
    LETTER-Speech and Hearing

    Vol: E96-D No:10  Page(s): 2286-2289

    In this paper, a two-layer Multiple Kernel Learning (MKL) scheme for speaker-independent speech emotion recognition is presented. In the first layer, MKL is used for feature selection. The training samples are separated into n groups according to certain rules, and each group is used for feature selection, yielding n sparse feature subsets. The intersection and the union of all feature subsets form the results of our feature selection method. In the second layer, MKL is used again for speech emotion classification with the selected features. To evaluate the effectiveness of the proposed two-layer MKL scheme, we compare it with state-of-the-art results and show that it yields a large gain in performance. Furthermore, an additional experiment comparing our feature selection method with other popular ones confirms its effectiveness.
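
    A toy sketch of the first-layer selection idea follows, using an L1-penalized linear SVM as a stand-in for MKL-based feature selection (the paper uses MKL in both layers). The grouping of training samples, the penalty, and the RBF SVM in the second layer are assumptions for illustration only.

    ```python
    import numpy as np
    from sklearn.svm import LinearSVC, SVC

    def groupwise_select(X, y, n_groups=5, C=0.1, seed=0):
        """Split training samples into groups, select a sparse feature subset per group."""
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(X))
        subsets = []
        for g in np.array_split(idx, n_groups):
            clf = LinearSVC(penalty="l1", dual=False, C=C, max_iter=5000).fit(X[g], y[g])
            selected = np.flatnonzero(np.any(np.abs(clf.coef_) > 1e-6, axis=0))
            subsets.append(set(selected.tolist()))
        inter = set.intersection(*subsets)   # features every group agrees on
        union = set.union(*subsets)          # features any group selects
        return sorted(inter), sorted(union)

    def second_layer_classify(X_tr, y_tr, X_te, features):
        """Second layer: classify with the selected features (RBF SVM as a stand-in for MKL)."""
        clf = SVC(kernel="rbf").fit(X_tr[:, features], y_tr)
        return clf.predict(X_te[:, features])
    ```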

  • A Novel Iterative Speaker Model Alignment Method from Non-Parallel Speech for Voice Conversion

    Peng SONG  Wenming ZHENG  Xinran ZHANG  Yun JIN  Cheng ZHA  Minghai XIN  

     
    LETTER-Speech and Hearing

    Vol: E98-A No:10  Page(s): 2178-2181

    Most current voice conversion methods are based on parallel speech, which is not easily obtained in practice. In this letter, a novel iterative speaker model alignment (ISMA) method is proposed to address this problem. First, the source and target speaker models are each trained from a background model by adopting the maximum a posteriori (MAP) algorithm. Then, the ISMA method is presented for alignment and transformation of spectral features. Finally, the proposed ISMA approach is further combined with a Gaussian mixture model (GMM) to improve the conversion performance. A series of objective and subjective experiments are carried out on the CMU ARCTIC dataset, and the results demonstrate that the proposed method significantly outperforms the state-of-the-art approach.
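
    As a sketch of one ingredient, the code below shows MAP adaptation of a background GMM's means to a speaker's non-parallel features, which is one way the source and target speaker models could be derived before alignment. The relevance factor and mixture size are assumptions, and the iterative alignment and GMM-based conversion stages are not reproduced.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def map_adapt_means(ubm: GaussianMixture, feats: np.ndarray, r: float = 16.0):
        """Return MAP-adapted component means for one speaker's features."""
        post = ubm.predict_proba(feats)                  # responsibilities, shape (T, K)
        n_k = post.sum(axis=0)                           # soft counts per component
        ex_k = post.T @ feats                            # first-order statistics
        alpha = (n_k / (n_k + r))[:, None]               # data-dependent adaptation weights
        mu_ml = ex_k / np.maximum(n_k[:, None], 1e-8)    # per-speaker ML means
        return alpha * mu_ml + (1.0 - alpha) * ubm.means_

    # usage sketch:
    #   ubm = GaussianMixture(n_components=64, covariance_type="diag").fit(background_feats)
    #   src_means = map_adapt_means(ubm, src_feats)
    #   tgt_means = map_adapt_means(ubm, tgt_feats)
    ```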

  • Transfer Semi-Supervised Non-Negative Matrix Factorization for Speech Emotion Recognition

    Peng SONG  Shifeng OU  Xinran ZHANG  Yun JIN  Wenming ZHENG  Jinglei LIU  Yanwei YU  

     
    LETTER-Speech and Hearing

    Publicized: 2016/07/01  Vol: E99-D No:10  Page(s): 2647-2650

    In practice, emotional speech utterances are often collected from different devices or under different conditions, which leads to discrepancies between the training and testing data and results in a sharp decrease in recognition rates. To solve this problem, a novel transfer semi-supervised non-negative matrix factorization (TSNMF) method is presented in this letter. A semi-supervised non-negative matrix factorization (SNMF) algorithm, utilizing both labeled source and unlabeled target data, is adopted to learn common feature representations. Meanwhile, the maximum mean discrepancy (MMD) is employed as a similarity measure to reduce the distance between the feature distributions of the two databases. Finally, the TSNMF algorithm, which optimizes the SNMF and MMD objectives jointly, is proposed to obtain robust feature representations across databases. Extensive experiments demonstrate that, in comparison with state-of-the-art approaches, the proposed method significantly improves cross-corpus recognition rates.
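
    The sketch below shows the two ingredients separately rather than jointly: a shared NMF basis learned over the stacked source and target features, and a linear-kernel MMD between the resulting coefficient distributions. The full TSNMF method optimizes both objectives together with label information, which is not reproduced here; the component count is an assumption.

    ```python
    import numpy as np
    from sklearn.decomposition import NMF

    def shared_nmf_coefficients(X_src, X_tgt, n_components=30):
        """Factorize the stacked corpora with one basis; return per-corpus coefficients."""
        X = np.vstack([X_src, X_tgt])          # rows = utterances, non-negative features
        model = NMF(n_components=n_components, init="nndsvda", max_iter=500)
        W = model.fit_transform(X)             # per-utterance coefficients in the shared space
        return W[:len(X_src)], W[len(X_src):]

    def linear_mmd2(W_src, W_tgt):
        """Linear-kernel MMD^2 between the two coefficient distributions."""
        diff = W_src.mean(axis=0) - W_tgt.mean(axis=0)
        return float(diff @ diff)
    ```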

  • Region Diversity Based Saliency Density Maximization for Salient Object Detection

    Xin HE  Huiyun JING  Qi HAN  Xiamu NIU  

     
    LETTER-Image

    Vol: E96-A No:1  Page(s): 394-397

    Existing salient object detection methods either simply use a threshold to detect desired salient objects from the saliency map or search for the most promising rectangular window covering salient objects on the saliency map. There are two problems with the existing methods: 1) the performance of threshold-dependent methods depends on threshold selection, and it is difficult to select an appropriate threshold value; 2) the rectangular window covers not only the salient object but also background pixels, which leads to imprecise salient object detection. To solve these problems, a novel saliency threshold-free method for detecting a salient object with a well-defined boundary is proposed in this paper. We propose a novel window search algorithm to locate a rectangular window on our saliency map that contains as many pixels belonging to the salient object, and as few background pixels, as possible. Once the window is determined, GrabCut is applied to extract the salient object with a well-defined boundary. Compared with existing methods, our approach needs neither a threshold to binarize the saliency map nor any additional operations. Experimental results show that our approach outperforms four state-of-the-art salient object detection methods, yielding higher precision and a better F-Measure.
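
    As a small illustration of the final extraction step, the sketch below initializes OpenCV's GrabCut with a rectangular window assumed to have already been located by the window search algorithm described above; the iteration count is an assumption.

    ```python
    import numpy as np
    import cv2

    def extract_salient_object(bgr, window, iters=5):
        """window = (x, y, width, height) covering the salient object."""
        mask = np.zeros(bgr.shape[:2], np.uint8)
        bgd = np.zeros((1, 65), np.float64)
        fgd = np.zeros((1, 65), np.float64)
        cv2.grabCut(bgr, mask, window, bgd, fgd, iters, cv2.GC_INIT_WITH_RECT)
        fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
        return cv2.bitwise_and(bgr, bgr, mask=fg)   # salient object pixels only
    ```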

  • Co-saliency Detection Linearly Combining Single-View Saliency and Foreground Correspondence

    Huiyun JING  Xin HE  Qi HAN  Xiamu NIU  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2015/01/05  Vol: E98-D No:4  Page(s): 985-988

    Research on detecting co-saliency across multiple images is just beginning. Existing methods multiply the saliency on a single image by the correspondence across multiple images to estimate co-saliency, and they have difficulty in highlighting co-salient objects that are not salient in a single image. This is caused by two problems: (1) the correspondence computation lacks precision, and (2) the multiplicative co-saliency formulation does not fully account for the effect of correspondence on co-saliency. In this paper, we propose a novel co-saliency detection scheme that linearly combines foreground correspondence and single-view saliency. A progressive graph matching based foreground correspondence method is proposed to improve the precision of the correspondence computation. The foreground correspondence is then linearly combined with single-view saliency to compute co-saliency. Under the linear combination formulation, high correspondence can produce high co-saliency even when single-view saliency is low. Experiments show that our method outperforms previous state-of-the-art co-saliency methods.
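
    A minimal sketch of the combination rule described above: per-pixel co-saliency as a weighted linear combination of the single-view saliency map and the foreground-correspondence map, so that strong correspondence can yield high co-saliency even where single-view saliency is low. The weight alpha is an assumption, and the correspondence map itself (from the paper's progressive graph matching) is taken as given.

    ```python
    import numpy as np

    def co_saliency(single_view, correspondence, alpha=0.5):
        """Both inputs are per-pixel maps; returns the combined co-saliency map."""
        s = (single_view - single_view.min()) / (np.ptp(single_view) + 1e-8)
        c = (correspondence - correspondence.min()) / (np.ptp(correspondence) + 1e-8)
        return alpha * c + (1.0 - alpha) * s   # linear combination, not multiplication
    ```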