Author Search Result

[Author] Joon-Hyuk CHANG(24hit)

1-20hit(24hit)

  • Multiband Vector Quantization Based on Inner Product for Wideband Speech Coding

    Joon-Hyuk CHANG  Sanjit K. MITRA  

     
    LETTER-Speech and Hearing

    Vol: E88-D  No: 11  Page(s): 2606-2608

    This paper describes a multiband vector quantization (VQ) technique based on the inner product for wideband speech coding at 16 kb/s. Our approach splits the input speech into two separate bands and applies an independent coding scheme to each band. A code-excited linear prediction (CELP) coder is used in the lower band, while a transform-based coding strategy is applied in the higher band. The spectral components in the higher frequency band are represented by a set of modulated lapped transform (MLT) coefficients. The higher frequency band is divided into three subbands, and the MLT coefficients form a vector for each subband. For the VQ of these vectors, an inner product-based distance measure is proposed as a new strategy. The proposed 16 kb/s coder with the inner product-based distortion measure achieves better performance than the 48 kb/s ITU-T G.722 in subjective quality tests.

  • A Support Vector Machine-Based Voice Activity Detection Employing Effective Feature Vectors

    Q-Haing JO  Yun-Sik PARK  Kye-Hwan LEE  Joon-Hyuk CHANG  

     
    LETTER-Multimedia Systems for Communications

    Vol: E91-B  No: 6  Page(s): 2090-2093

    In this letter, we propose effective feature vectors to improve the performance of voice activity detection (VAD) employing a support vector machine (SVM), which is known to incorporate an optimized nonlinear decision over two different classes. To extract the effective feature vectors, we present a novel scheme that combines the a posteriori SNR, a priori SNR, and predicted SNR, widely adopted in conventional statistical model-based VAD.
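    As a rough illustration of two of the SNR quantities named in the abstract above, the following sketch computes a per-bin a posteriori SNR and a decision-directed a priori SNR for one frame. The function name `snr_features`, the weighting factor `alpha`, and the toy input values are assumptions made for this example; the predicted SNR and the letter's actual feature combination are not reproduced here.

```python
def snr_features(noisy_power, noise_power, prev_clean_power, alpha=0.98, floor=1e-12):
    """Return per-bin (a posteriori SNR, a priori SNR) lists for one frame.

    noisy_power:      |Y_k|^2 of the current noisy frame
    noise_power:      estimated noise power per bin
    prev_clean_power: estimated clean-speech power of the previous frame
    alpha:            decision-directed weighting factor (assumed value)
    """
    gammas, xis = [], []
    for y, n, s in zip(noisy_power, noise_power, prev_clean_power):
        n = max(n, floor)                      # guard against division by zero
        gamma = y / n                          # a posteriori SNR
        # Decision-directed a priori SNR estimate
        xi = alpha * s / n + (1.0 - alpha) * max(gamma - 1.0, 0.0)
        gammas.append(gamma)
        xis.append(xi)
    return gammas, xis


# Toy three-bin frame (hypothetical values)
gamma, xi = snr_features([4.0, 1.0, 9.0], [1.0, 1.0, 3.0], [2.0, 0.0, 6.0], alpha=0.5)
feature_vector = gamma + xi   # concatenated features that could be fed to a classifier
```

    A feature vector of this kind could then be passed to any SVM implementation; the choice of classifier library is left open here.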

  • Improved Speech-Presence Uncertainty Estimation Based on Spectral Gradient for Global Soft Decision-Based Speech Enhancement

    Jong-Woong KIM  Joon-Hyuk CHANG  Sang Won NAM  Dong Kook KIM  Jong Won SHIN  

     
    LETTER-Speech and Hearing

    Vol: E96-A  No: 10  Page(s): 2025-2028

    In this paper, we propose a speech-presence uncertainty estimation method that uses the spectral gradient to improve the global soft decision-based speech enhancement technique. The conventional soft decision-based speech enhancement technique uses a fixed ratio (Q) of the a priori speech-presence and speech-absence probabilities to derive the speech-absence probability (SAP). In contrast, we adaptively change Q according to the spectral gradient between the current and past frames as well as the voice activity status in the previous two frames. As a result, distinct values of Q are assigned to each frequency in each frame, which improves the performance of the SAP by robustly tracking the a priori speech-presence information over time.

  • Speech Enhancement Based on Adaptive Noise Power Estimation Using Spectral Difference

    Jae-Hun CHOI  Joon-Hyuk CHANG  Dong Kook KIM  Suhyun KIM  

     
    LETTER-Speech and Hearing

    Vol: E94-A  No: 10  Page(s): 2031-2034

    In this paper, we propose a spectral difference approach for noise power estimation in speech enhancement. The noise power estimate is given by recursively averaging past spectral power values using a smoothing parameter based on the current observation. The smoothing parameter in time and frequency is adjusted by the spectral difference between consecutive frames that can efficiently characterize noise variation. Specifically, we propose an effective technique based on a sigmoid-type function in order to adaptively determine the smoothing parameter based on the spectral difference. Compared to a conventional method, the proposed noise estimate is computationally efficient and able to effectively follow noise changes under various noise conditions.
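    The recursive averaging described in the abstract above can be sketched as follows. This is a minimal illustration, not the letter's algorithm: the normalized spectral difference, the sigmoid constants (`slope`, `center`, `alpha_min`, `alpha_max`), and the direction of the mapping (a larger difference gives a smoothing parameter closer to 1, that is, a slower noise update) are all assumptions of this sketch.

```python
import math


def smoothing_from_difference(diff, slope=4.0, center=0.5, alpha_min=0.5, alpha_max=0.98):
    """Map a spectral difference to a smoothing parameter with a sigmoid.

    All constants are hypothetical. The output always lies strictly
    between alpha_min and alpha_max and increases with diff.
    """
    s = 1.0 / (1.0 + math.exp(-slope * (diff - center)))
    return alpha_min + (alpha_max - alpha_min) * s


def update_noise(noise_est, prev_power, cur_power, eps=1e-12):
    """One recursive-averaging step of the per-bin noise power estimate."""
    updated = []
    for lam, p_prev, p_cur in zip(noise_est, prev_power, cur_power):
        # Assumed definition: relative power change between consecutive frames
        diff = abs(p_cur - p_prev) / (p_prev + eps)
        a = smoothing_from_difference(diff)
        # Recursive average of the previous estimate and the current observation
        updated.append(a * lam + (1.0 - a) * p_cur)
    return updated


# Toy usage: a stationary frame pulls the estimate toward the observed power
noise = update_noise([1.0, 1.0], [2.0, 2.0], [2.0, 8.0])
```

    In this toy call the first bin is unchanged between frames, so it receives a smaller smoothing parameter and moves further toward the observation than the second bin does in relative terms.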

  • Improved Global Soft Decision Using Smoothed Global Likelihood Ratio for Speech Enhancement

    Joon-Hyuk CHANG  Dong Seok JEONG  Nam Soo KIM  Sangki KANG  

     
    LETTER-Multimedia Systems for Communications

    Vol: E90-B  No: 8  Page(s): 2186-2189

    In this letter, we propose an improved global soft decision for noisy speech enhancement. An investigation of statistical model-based speech enhancement shows that the global soft decision has a fundamental drawback at the tail regions of speech signals. We therefore propose a new solution based on a smoothed likelihood ratio for the global soft decision. The performance of the proposed method is evaluated by subjective tests under various environments and shows better results than our previous work.

  • Frame Splitting Scheme for Error-Robust Audio Streaming over Packet-Switching Networks

    Jong Kyu KIM  Jung Su KIM  Hwan Sik YUN  Joon-Hyuk CHANG  Nam Soo KIM  

     
    LETTER-Multimedia Systems for Communications

    Vol: E91-B  No: 2  Page(s): 677-680

    This letter presents a novel frame splitting scheme for error-robust audio streaming over packet-switching networks. In our approach to perceptual audio coding, an audio frame is split into several subframes based on the network configuration such that each packet can be decoded independently at the receiver. A subjective comparison category rating (CCR) test shows that our approach enhances the quality of the decoded audio signal in lossy packet-switching network environments.

  • Distorted Speech Rejection for Automatic Speech Recognition in Wireless Communication

    Joon-Hyuk CHANG  Nam Soo KIM  

     
    LETTER-Speech and Hearing

    Vol: E87-D  No: 7  Page(s): 1978-1981

    This letter introduces a pre-rejection technique for wireless channel-distorted speech with application to automatic speech recognition (ASR). Based on an analysis of speech signals distorted over a wireless communication channel, we propose a method to reject the channel-distorted speech with a small computational load. A number of simulation results show that the pre-rejection algorithm enhances the robustness of speech recognition.

  • Discriminative Weight Training for Support Vector Machine-Based Speech/Music Classification in 3GPP2 SMV Codec

    Sang-Kyun KIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E93-A  No: 1  Page(s): 316-319

    In this study, discriminative weight training is applied to support vector machine (SVM)-based speech/music classification for the 3GPP2 selectable mode vocoder (SMV). In the proposed approach, the speech/music decision rule is derived by the SVM by incorporating optimally weighted features derived from the SMV based on a minimum classification error (MCE) method. This method differs from that of the previous work in that different weights are assigned to each feature of the SMV, which is a novel process. According to the experimental results, the proposed approach is effective for speech/music classification using the SVM.

  • Improvement of SVM-Based Speech/Music Classification Using Adaptive Kernel Technique

    Chungsoo LIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E95-D  No: 3  Page(s): 888-891

    In this paper, we propose a way to improve the classification performance of support vector machines (SVMs), especially for speech and music frames within a selectable mode vocoder (SMV) framework. A myriad of techniques have been proposed for SVMs, and most of them are employed during the training phase of SVMs. Instead, the proposed algorithm is applied during the test phase and works with existing schemes. The proposed algorithm modifies a kernel parameter in the decision function of SVMs to alter SVM decisions for better classification accuracy based on the previous outputs of SVMs. Since speech and music frames exhibit strong inter-frame correlation, the outputs of SVMs can guide the kernel parameter modification. Our experimental results show that the proposed algorithm has the potential for adaptively tuning classifications of support vector machines for better performance.

  • Speech Enhancement: New Approaches to Soft Decision

    Joon-Hyuk CHANG  Nam Soo KIM  

     
    PAPER-Speech and Hearing

    Vol: E84-D  No: 9  Page(s): 1231-1240

    In this paper, we propose new approaches to speech enhancement based on soft decision. In order to enhance the statistical reliability in estimating speech activity, we introduce the concept of a global speech absence probability (GSAP). First, we compute the conventional speech absence probability (SAP) and then modify it according to the newly proposed GSAP. The modification is made in such a way that the SAP takes the same value as the GSAP in the case of speech absence, while it retains its original value when speech is present. Moreover, to improve the performance of the SAPs at voice tails (transition periods from speech to silence), we revise the SAPs using a hang-over scheme based on the hidden Markov model (HMM). In addition, we suggest a robust noise update algorithm in which the noise power is estimated not only in periods of speech absence but also during speech activity, based on soft decision. To improve the SAP determination and noise update routines, we also present a new signal-to-noise ratio (SNR) concept, called the predicted SNR in this paper. Furthermore, we demonstrate that the discrete cosine transform (DCT) enhances the accuracy of the SAP estimation. A number of tests show that the proposed method, called the speech enhancement based on soft decision (SESD) algorithm, yields better performance than the conventional approaches.

  • Improved Global Soft Decision Incorporating Second-Order Conditional MAP in Speech Enhancement

    Jong-Mo KUM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E93-D  No: 6  Page(s): 1652-1655

    In this paper, we propose a novel method based on the second-order conditional maximum a posteriori (CMAP) criterion to improve the performance of the global soft decision in speech enhancement. Our investigation shows that the conventional global soft decision scheme has a disadvantage: the global speech absence probability (GSAP) in that scheme is adjusted by a fixed parameter, which can be a restrictive assumption when speech frames occur consecutively. To address this problem, we devise a method that incorporates the second-order CMAP in determining the GSAP. It clearly differs from the previous approach in that it exploits not only the current observation but also the speech activity decisions of the previous two frames. The performance of the proposed method is evaluated by a number of tests in various environments and shows better results than previous work.

  • Speech Enhancement Based on Data-Driven Residual Gain Estimation

    Yu Gwang JIN  Nam Soo KIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E94-D  No: 12  Page(s): 2537-2540

    In this letter, we propose a novel speech enhancement algorithm based on data-driven residual gain estimation. The entire system consists of two stages. In the first stage, a conventional speech enhancement algorithm enhances the input signal while estimating several signal-to-noise ratio (SNR)-related parameters. In the second stage, the residual gain, which is estimated by a data-driven method, is applied to further enhance the signal. A number of experimental results show that the proposed speech enhancement algorithm outperforms the conventional speech enhancement technique based on soft decision and the data-driven approach using an SNR grid look-up table.

  • Efficient Speech Reinforcement Based on Low-Bit-Rate Speech Coding Parameters

    Jae-Hun CHOI  Joon-Hyuk CHANG  Seong-Ro LEE  

     
    LETTER-Speech and Hearing

    Vol: E93-A  No: 9  Page(s): 1684-1687

    In this paper, a novel approach to speech reinforcement in a low-bit-rate speech coder under ambient noise environments is proposed. The excitation vector of ambient noise is efficiently obtained at the near-end and then combined with the excitation signal of the far-end for a suitable reinforcement gain within the G.729 CS-ACELP Annex B framework. The present approach therefore differs clearly from previous approaches in that it does not require an additional arithmetic step such as the discrete Fourier transform (DFT). Experimental results indicate that the proposed method performs better than, or at least comparably to, conventional approaches with a lower computational burden.

  • Online Sparse Volterra System Identification Using Projections onto Weighted l1 Balls

    Tae-Ho JUNG  Jung-Hee KIM  Joon-Hyuk CHANG  Sang Won NAM  

     
    PAPER

    Vol: E96-A  No: 10  Page(s): 1980-1983

    In this paper, online sparse Volterra system identification is proposed. For that purpose, the conventional adaptive projection-based algorithm with weighted l1 balls (APWL1) is revisited for nonlinear system identification, whereby the linear-in-parameters nature of Volterra systems is utilized. Sparsity-aware recursive least squares (RLS) based algorithms require higher computational complexity and, owing to their long memory, show faster convergence and lower steady-state error in time-invariant cases. In comparison, the proposed approach yields better tracking capability in time-varying cases due to short-term data dependence in updating the weight. Also, when N is the number of sparse Volterra kernels and q is the number of input vectors involved in updating the weight, the proposed algorithm requires O(qN) multiplication complexity and O(N log₂ N) sorting-operation complexity. Furthermore, sparsity-aware least mean-squares and affine projection based algorithms are also tested.

  • Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine

    Sang-Kyun KIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E92-A  No: 2  Page(s): 630-632

    In this letter, we propose a novel approach to speech/music classification based on the support vector machine (SVM) to improve the performance of the 3GPP2 selectable mode vocoder (SMV) codec. We first analyze the features and the classification method used in the real-time speech/music classification algorithm of the SMV, and then apply the SVM for enhanced speech/music classification. To evaluate performance, we compare the proposed algorithm with the traditional algorithm of the SMV. The proposed system is evaluated under various environments and performs better than the original method in the SMV.

  • Efficient Implementation of Statistical Model-Based Voice Activity Detection Using Taylor Series Approximation

    Chungsoo LIM  Soojeong LEE  Jae-Hun CHOI  Joon-Hyuk CHANG  

     
    LETTER-Digital Signal Processing

    Vol: E97-A  No: 3  Page(s): 865-868

    In this letter, we propose a simple but effective technique that improves statistical model-based voice activity detection (VAD) by both reducing computational complexity and increasing detection accuracy. The improvements are made by applying Taylor series approximations to the exponential and logarithmic functions in the VAD algorithm based on an in-depth analysis of the algorithm. Experiments performed on a smartphone as well as on a desktop computer with various background noises confirm the effectiveness of the proposed technique.
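    To illustrate the general idea of replacing exponential and logarithmic evaluations with truncated Taylor series, here is a minimal sketch. The function names, the expansion points (around 0 for exp(x) and for ln(1+x)), and the default order are assumptions of this example; the abstract does not state which expansions or orders the letter uses.

```python
import math


def exp_taylor(x, order=3):
    """Truncated Maclaurin series of exp(x): sum of x^n / n! for n = 0..order."""
    term, total = 1.0, 1.0
    for n in range(1, order + 1):
        term *= x / n          # builds x^n / n! incrementally
        total += term
    return total


def log1p_taylor(x, order=3):
    """Truncated series of ln(1 + x) = x - x^2/2 + x^3/3 - ..., valid for |x| < 1."""
    total, power, sign = 0.0, 1.0, 1.0
    for n in range(1, order + 1):
        power *= x             # x^n
        total += sign * power / n
        sign = -sign           # alternate the sign of successive terms
    return total


# Compare against the library functions for a small argument
x = 0.1
exp_error = abs(exp_taylor(x) - math.exp(x))
log_error = abs(log1p_taylor(x) - math.log1p(x))
```

    The approximations are only accurate near the expansion point, so a practical implementation would need to confirm that the arguments seen at run time stay within a suitable range.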

  • Speech Enhancement Based on Perceptually Comfortable Residual Noise

    Jong Won SHIN  Joon-Hyuk CHANG  Nam Soo KIM  

     
    LETTER-Multimedia Systems for Communications

    Vol: E90-B  No: 11  Page(s): 3323-3326

    In this letter, we propose a novel approach to speech enhancement, which incorporates a new criterion based on residual noise shaping. In the proposed approach, our goal is to make the residual noise perceptually comfortable instead of making it less audible. A predetermined 'comfort noise' is provided as a target for the spectral shaping. Based on some assumptions, the resulting spectral gain function turns out to be a slight modification of the Wiener filter while requiring very low computational complexity. A subjective listening test shows that the proposed algorithm outperforms the conventional spectral enhancement technique based on soft decision and the noise suppression implemented in the IS-893 Selectable Mode Vocoder.

  • Efficient Implementation of Voiced/Unvoiced Sounds Classification Based on GMM for SMV Codec

    Ji-Hyun SONG  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E92-A  No: 8  Page(s): 2120-2123

    In this letter, we propose an efficient method to improve the performance of the voiced/unvoiced (V/UV) sound decision for the selectable mode vocoder (SMV) of 3GPP2 using the Gaussian mixture model (GMM). We first present an effective analysis of the features and the classification method adopted in the SMV. Feature vectors applied to the GMM are then selected from relevant parameters of the SMV for efficient V/UV classification. The performance of the proposed algorithm is evaluated under various conditions and yields better results than the conventional method of the SMV.

  • A Statistical Model-Based V/UV Decision under Background Noise Environments

    Joon-Hyuk CHANG  Nam Soo KIM  Sanjit K. MITRA  

     
    LETTER-Speech and Hearing

    Vol: E87-D  No: 12  Page(s): 2885-2887

    In this letter, we propose an approach that incorporates a statistical model into the voiced/unvoiced (V/UV) speech decision under background noise environments. Our approach splits the input noisy speech into two separate bands and applies a statistical model to each band. For the V/UV decision, we compute and compare the likelihood ratio (LR) of each band based on the statistical model and the estimated noise statistics. According to the simulation test, the proposed V/UV decision performs better than the selectable mode vocoder (SMV) V/UV decision algorithm, particularly in clean and white noise environments.

  • A Statistical Model-Based Speech Enhancement Using Acoustic Noise Classification for Robust Speech Communication

    Jae-Hun CHOI  Joon-Hyuk CHANG  

     
    LETTER-Multimedia Systems for Communications

    Vol: E95-B  No: 7  Page(s): 2513-2516

    In this paper, we present a speech enhancement technique based on ambient noise classification that incorporates the Gaussian mixture model (GMM). The principal parameters of the statistical model-based speech enhancement algorithm, such as the weighting parameter in the decision-directed (DD) method and the long-term smoothing parameter of the noise estimation, are set according to the classified context to ensure the best performance under each noise type. For real-time context awareness, the noise classification is performed on a frame-by-frame basis using the GMM with the soft decision framework. The speech absence probability (SAP) is used in detecting the speech absence periods and updating the likelihood of the GMM.
