The search functionality is under construction.

Keyword Search Result

[Keyword] bandwidth extension (7 hits)

1-7 hits
  • Artificial Bandwidth Extension for Lower Bandwidth Using Sinusoidal Synthesis based on First Formant Location

    Yuya HOSODA  Arata KAWAMURA  Youji IIGUNI  

     
    PAPER-Engineering Acoustics

    Publicized: 2021/10/12  Vol: E105-A No:4  Page(s): 664-672

    The narrow bandwidth limitation of 300-3400Hz on the public switched telephone network results in speech quality deterioration. In this paper, we propose an artificial bandwidth extension approach that reconstructs the missing lower bandwidth of 50-300Hz using sinusoidal synthesis based on the first formant location. Sinusoidal synthesis generates sinusoidal waves with a harmonic structure. The proposed method detects the fundamental frequency using an autocorrelation method based on the YIN algorithm, where threshold processing avoids false fundamental frequency detection on unvoiced sounds. The amplitude of the sinusoidal waves is calculated in the time domain from the weighted energy of 300-600Hz. Since the first formant location corresponds to the first peak of the spectral envelope, we reconstruct the harmonic structure so as to avoid both attenuation and overemphasis, increasing the weight when the first formant location is lower and decreasing it when the location is higher. The subjective and objective evaluations show that the proposed method reduces the speech quality difference between the original speech signal and the bandwidth-extended speech signal.
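
The harmonic synthesis step described in this abstract can be illustrated with a minimal sketch. It assumes the fundamental frequency and the amplitude have already been estimated elsewhere (the YIN-based detection and the formant-dependent weighting are not reproduced here), and the function name, frame length, and sample rate are illustrative choices, not taken from the paper.

```python
import math


def synthesize_low_band(f0_hz, amplitude, sample_rate=8000, num_samples=160,
                        band=(50.0, 300.0)):
    """Sum sinusoids at the harmonics of f0_hz that lie inside `band`.

    Hypothetical helper: `amplitude` is a single gain shared by all
    harmonics, standing in for the energy-based amplitude of the paper.
    A non-positive f0_hz marks an unvoiced frame and yields silence.
    """
    lo, hi = band
    if f0_hz <= 0:
        return [], [0.0] * num_samples

    # Keep harmonics k * f0 with lo <= k * f0 < hi, so the synthesized
    # band stops where the 300 Hz telephone band begins.
    harmonics = [k * f0_hz for k in range(1, int(hi // f0_hz) + 1)
                 if lo <= k * f0_hz < hi]

    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        frame.append(sum(amplitude * math.sin(2 * math.pi * f * t)
                         for f in harmonics))
    return harmonics, frame
```

For a 100 Hz fundamental, only the 100 Hz and 200 Hz harmonics fall inside the missing band; the 300 Hz harmonic is already carried by the narrowband signal.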

  • Blind Bandwidth Extension with a Non-Linear Function and Its Evaluation on Automatic Speaker Verification

    Ryota KAMINISHI  Haruna MIYAMOTO  Sayaka SHIOTA  Hitoshi KIYA  

     
    PAPER

    Publicized: 2019/10/25  Vol: E103-D No:1  Page(s): 42-49

    This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and light-weight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train the state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in the x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We also examined the relationship between objective measurements and EERs. The N-BWE method produced the lowest EERs on both ASV systems and obtained a lower RMS-LSD value and a higher STOI score.
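
The equal error rate used as the evaluation metric here can be sketched as follows. This is a simple threshold sweep over observed scores, assuming higher scores mean "same speaker"; the function name and the toy scores are illustrative and are not data from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false acceptance rate (FAR) and the
    false rejection rate (FRR) are closest, as the mean of the two rates.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = None, None
    for thr in thresholds:
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

Production toolkits usually interpolate between operating points rather than picking the nearest one, so values from this sketch can differ slightly on small trial lists.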

  • Statistical Bandwidth Extension for Speech Synthesis Based on Gaussian Mixture Model with Sub-Band Basis Spectrum Model

    Yamato OHTANI  Masatsune TAMURA  Masahiro MORITA  Masami AKAMINE  

     
    PAPER-Voice conversion

    Publicized: 2016/07/19  Vol: E99-D No:10  Page(s): 2481-2489

    This paper describes a novel statistical bandwidth extension (BWE) technique based on a Gaussian mixture model (GMM) and a sub-band basis spectrum model (SBM), in which each dimensional component represents a specific acoustic space in the frequency domain. The proposed method can achieve BWE from speech data with an arbitrary frequency bandwidth, whereas the conventional methods perform the conversion from fixed narrow-band data. In the proposed method, we train a GMM with SBM parameters extracted from full-band spectra in advance. According to the bandwidth of the input signal, the trained GMM is reconstructed into the GMM of the joint probability density between low-band SBM and high-band SBM components. Then, high-band SBM components are estimated from the low-band SBM components of the input signal based on the reconstructed GMM. Finally, BWE is achieved by adding the spectra decoded from the estimated high-band SBM components to those of the input signal. To construct the full-band signal from the narrow-band one, we apply this method to log-amplitude spectra and aperiodic components. Objective and subjective evaluation results show that the proposed method extends the bandwidth of speech data robustly for the log-amplitude spectra. Experimental results also indicate that the aperiodic component extracted from the upsampled narrow-band signal realizes the same performance as the restored and the full-band aperiodic components in the proposed method.
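
Estimating high-band components from low-band components with a joint GMM can be sketched with the standard conditional-mean (minimum mean square error) mapping. This is an assumption for illustration: the abstract does not state which estimator the authors use, and the sketch reduces both bands to a single scalar feature, whereas SBM parameters are vectors. All names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def estimate_high_band(x_low, components):
    """Conditional mean E[high | low] under a joint GMM.

    Each component is a dict with keys: weight, mean_low, mean_high,
    var_low, cov_high_low (scalar toy version of the joint density).
    """
    # Posterior responsibility of each mixture component given x_low.
    resp = [c["weight"] * gaussian_pdf(x_low, c["mean_low"], c["var_low"])
            for c in components]
    total = sum(resp)

    estimate = 0.0
    for r, c in zip(resp, components):
        # Per-component linear regression from the low band to the high band.
        cond_mean = (c["mean_high"]
                     + c["cov_high_low"] / c["var_low"] * (x_low - c["mean_low"]))
        estimate += (r / total) * cond_mean
    return estimate
```

Splitting a trained full-band GMM into low and high parts at the input bandwidth, as the abstract describes, corresponds to choosing which dimensions play the role of `low` and `high` in this mapping.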

  • Search-Free Codebook Mapping for Artificial Bandwidth Extension

    Heewan PARK  Byungsik YOON  Sangwon KANG  Andreas SPANIAS  

     
    LETTER-Multimedia Systems for Communications

    Vol: E95-B No:4  Page(s): 1479-1482

    A new codebook mapping algorithm for artificial bandwidth extension (ABE) is introduced in this paper. We design a wideband line spectrum pair (LSP) codebook which is coupled with the same index as the LSP codebook of a narrowband speech codec. The received narrowband LSP codebook indices are used to directly induce wideband LSP codewords. Thus, the proposed scheme eliminates codebook search processing to estimate the wideband spectrum envelope. We apply the proposed scheme to bandwidth extension in the adaptive multi-rate (AMR) compressed domain. Its performance is assessed via the perceptual evaluation of speech quality (PESQ), informal listening tests, and weighted million operations per second (WMOPS) calculations.
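
The difference between conventional and search-free codebook mapping can be sketched with toy codebooks. The two-dimensional codewords below are made-up placeholders, not LSP values from any codec, and both function names are hypothetical.

```python
def search_based_mapping(nb_vector, nb_codebook, wb_codebook):
    """Conventional mapping: find the nearest narrowband codeword, then
    return the wideband codeword stored under the same index."""
    best_index = min(
        range(len(nb_codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(nb_vector, nb_codebook[i])),
    )
    return wb_codebook[best_index]


def search_free_mapping(nb_index, wb_codebook):
    """Search-free mapping: the index received in the narrowband bitstream
    addresses the coupled wideband codebook directly."""
    return wb_codebook[nb_index]


# Toy coupled codebooks (placeholder values): entry i of the wideband
# codebook is paired with entry i of the narrowband codebook.
NB_CODEBOOK = [[0.1, 0.2], [0.3, 0.5], [0.6, 0.8]]
WB_CODEBOOK = [[0.1, 0.2, 0.6, 0.7], [0.3, 0.5, 0.7, 0.8], [0.6, 0.8, 0.9, 0.95]]
```

The search-based path costs one distance computation per codebook entry, while the search-free path is a single table lookup, which is where the complexity saving comes from.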

  • A Low Power Bandwidth Extension Technique

    Byungsik YOON  Heewan PARK  Sangwon KANG  

     
    LETTER-Multimedia Systems for Communications

    Vol: E95-B No:1  Page(s): 358-361

    This paper proposes a low-power artificial bandwidth extension (ABE) technique that reduces computational complexity by introducing a fast codebook mapping method. We also introduce a weighted classified codebook mapping method for constructing the spectral envelope of the wideband speech signal. Classified codebooks are used to reduce spectrum mapping errors caused by characteristic differences among voiced, unvoiced, and onset sounds. The weighted distortion measure is also used to handle the spectral sensitivity. The performance of the proposed ABE system is evaluated by spectral distortion (SD), perceptual evaluation of speech quality (PESQ), informal listening tests, and weighted million operations per second (WMOPS) calculations. With the use of fast codebook mapping, the WMOPS complexity of the codebook mapping module is reduced by 45.17%.

  • A Bandwidth Extension Scheme for G.711 Speech by Embedding Multiple Highband Gains

    Hae-Yong YANG  Kyung-Hoon LEE  Sung-Jea KO  

     
    LETTER-Multimedia Systems for Communications

    Vol: E94-B No:10  Page(s): 2941-2944

    We present an improvement to the existing steganography-based bandwidth extension scheme. Enhanced WB (wideband) speech quality is achieved by embedding multiple highband spectral gains into a G.711 bitstream. The number of spectral gains is selected by optimizing the quantity of the embedding data with respect to the quality of the extended WB speech. Compared to the existing method, the proposed scheme improves the WB PESQ (Perceptual Evaluation of Speech Quality) score by 0.334 with negligible degradation of the embedded narrowband speech.
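
Embedding side information into a G.711 stream can be illustrated with least-significant-bit substitution on the 8-bit codewords. This is an assumed embedding rule chosen for simplicity; the abstract does not specify how the gains are hidden, and the 4-bit gain indices and helper names are hypothetical.

```python
def int_to_bits(value, width):
    """Most-significant-bit-first binary expansion of a quantized gain index."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def bits_to_int(bits):
    """Inverse of int_to_bits."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def embed_gains(codewords, gain_indices, width=4):
    """Hide quantized highband gain indices in the LSBs of 8-bit codewords."""
    payload = [b for g in gain_indices for b in int_to_bits(g, width)]
    if len(payload) > len(codewords):
        raise ValueError("not enough codewords to carry the payload")
    stego = list(codewords)
    for i, bit in enumerate(payload):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return stego


def extract_gains(codewords, num_gains, width=4):
    """Recover the gain indices at the decoder from the codeword LSBs."""
    bits = [cw & 1 for cw in codewords[:num_gains * width]]
    return [bits_to_int(bits[i:i + width]) for i in range(0, len(bits), width)]
```

Carrying more gains uses more codeword bits per frame, which is the trade-off between embedded data quantity and narrowband degradation that the abstract says is optimized.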

  • Bandwidth Extension with Hybrid Signal Extrapolation for Audio Coding

    Chatree BUDSABATHON  Akinori NISHIHARA  

     
    PAPER

    Vol: E90-A No:8  Page(s): 1564-1569

    In this paper, we propose a blind method that uses hybrid signal extrapolation at the decoder to regenerate the high-frequency components removed by encoders. First, the spectral resolution of the decoded signal is enhanced by time-domain linear predictive extrapolation, and the cutoff frequency of each frame is then estimated to avoid a spectral gap between the end of the original low-frequency spectrum and the beginning of the reconstructed high-frequency spectrum. By exploiting the correlation between the high-frequency and low-frequency spectra, the low-frequency spectral components are used to reconstruct the high-frequency spectral components by frequency-domain linear predictive extrapolation. Experimental results show that the proposed method yields an effective improvement in terms of SNR and human listening test results. The proposed method can be used to reconstruct the lost high-frequency components and improve the perceptual quality of audio independently of the compression method.
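
Linear predictive extrapolation, the building block named in this abstract, can be sketched as follows. The sketch fits predictor coefficients by least squares (the covariance method, an assumption made here; the abstract does not say how the coefficients are obtained) and then extends a sequence sample by sample. The same routine applies to a time-domain frame or to a sequence of spectral coefficients.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def lpc_covariance(signal, order):
    """Least-squares coefficients a_k such that x[n] ~ sum_k a_k * x[n - k]."""
    n_samples = len(signal)
    phi = [[sum(signal[n - i] * signal[n - j] for n in range(order, n_samples))
            for j in range(1, order + 1)] for i in range(1, order + 1)]
    rhs = [sum(signal[n] * signal[n - i] for n in range(order, n_samples))
           for i in range(1, order + 1)]
    return solve_linear(phi, rhs)


def extrapolate(signal, coeffs, count):
    """Append `count` predicted samples and return only the new ones."""
    out = list(signal)
    for _ in range(count):
        out.append(sum(a * out[-k] for k, a in enumerate(coeffs, start=1)))
    return out[len(signal):]
```

A pure sinusoid satisfies an exact second-order recursion, so an order-2 predictor continues it almost perfectly; real audio needs a higher order and extrapolates less accurately.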