Author Search Result

[Author] Ji-Hyun SONG (3 hits)

  • Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP

    Ji-Hyun SONG  Sangmin LEE  

     
    LETTER-Speech and Hearing

    Vol: E96-D  No: 12  Page(s): 2888-2891

    In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variances of the speech and noise in the GNL distribution are estimated using higher-order moments. After in-depth analysis of the estimated variances, a feature that is useful for discriminating between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. To account for the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of the proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. The results show that the proposed method outperforms conventional VAD algorithms.

  • Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Deep Belief Networks

    Ji-Hyun SONG  Hong-Sub AN  Sangmin LEE  

     
    LETTER-Speech and Hearing

    Vol: E97-A  No: 2  Page(s): 661-664

    In this paper, we propose a robust speech/music classification algorithm that uses deep belief networks (DBNs) to improve the performance of speech/music classification in the selectable mode vocoder (SMV) of 3GPP2. DBNs are powerful hierarchical generative models for feature extraction and can determine the underlying discriminative characteristics of the extracted features. The six feature vectors selected from the relevant parameters of the SMV are applied to the visible layer in the proposed DBN-based method. The performance of the proposed algorithm is evaluated using the detection accuracy and error probability of speech and music for various music genres. The proposed algorithm yields better results than both the original SMV method and a support vector machine (SVM)-based method.

  • Efficient Implementation of Voiced/Unvoiced Sounds Classification Based on GMM for SMV Codec

    Ji-Hyun SONG  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E92-A  No: 8  Page(s): 2120-2123

    In this letter, we propose an efficient method to improve the performance of the voiced/unvoiced (V/UV) sound decision for the selectable mode vocoder (SMV) of 3GPP2 using the Gaussian mixture model (GMM). We first present an effective analysis of the features and the classification method adopted in the SMV. Feature vectors to be applied to the GMM are then selected from relevant parameters of the SMV for efficient V/UV classification. The performance of the proposed algorithm is evaluated under various conditions, and it yields better results than the conventional method of the SMV.
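
    As a rough illustration of the GMM-based V/UV decision described in the last entry, the sketch below scores a feature vector under two diagonal-covariance GMMs and compares the log-likelihood ratio with a threshold. It is not the authors' implementation: the function names, the two-dimensional features, the threshold, and all model parameters are hypothetical placeholders, since the abstract does not state which SMV parameters are used or how the models are trained.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            # Log of a univariate Gaussian density, summed over dimensions.
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def classify_frame(x, voiced_gmm, unvoiced_gmm, threshold=0.0):
    """Label a frame by thresholding the voiced/unvoiced log-likelihood ratio."""
    llr = gmm_log_likelihood(x, *voiced_gmm) - gmm_log_likelihood(x, *unvoiced_gmm)
    return "voiced" if llr > threshold else "unvoiced"


# Hypothetical toy models as (weights, means, variances); not values from the letter.
VOICED_GMM = ([0.6, 0.4], [[1.0, 0.8], [1.5, 0.5]], [[0.1, 0.1], [0.2, 0.1]])
UNVOICED_GMM = ([0.5, 0.5], [[0.1, 0.1], [0.3, 0.2]], [[0.1, 0.1], [0.1, 0.2]])

if __name__ == "__main__":
    for frame in ([1.1, 0.7], [0.1, 0.1]):
        print(frame, classify_frame(frame, VOICED_GMM, UNVOICED_GMM))
```

    In practice the GMM parameters would be fitted to labeled training frames (for example with the EM algorithm), and the threshold would be tuned to trade off voiced and unvoiced error rates.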