
Keyword Search Result

[Keyword] G.711 (5 hits)

1-5 of 5 hits
  • Adaptive Spectral Masking of AVQ Coding and Sparseness Detection for ITU-T G.711.1 Annex D and G.722 Annex B Standards

    Masahiro FUKUI  Shigeaki SASAKI  Yusuke HIWASAKI  Kimitaka TSUTSUMI  Sachiko KURIHARA  Hitoshi OHMURO  Yoichi HANEDA  

     
    PAPER-Speech and Hearing

    Vol: E97-D  No: 5  Page(s): 1264-1272

    We propose a new adaptive spectral masking method of algebraic vector quantization (AVQ) for non-sparse signals in the modified discrete cosine transform (MDCT) domain. This paper also proposes switching the adaptive spectral masking on and off depending on whether or not the target signal is non-sparse. The switching decision is based on the results of MDCT-domain sparseness analysis. When the target signal is categorized as non-sparse, the masking level of the target MDCT coefficients is adaptively controlled using spectral envelope information. The performance of the proposed method, as a part of ITU-T G.711.1 Annex D, is evaluated in comparison with conventional AVQ. Subjective listening test results showed that the proposed method improves sound quality by more than 0.1 points on a five-point scale on average for speech, music, and mixed content, indicating a significant improvement.

  • A Bandwidth Extension Scheme for G.711 Speech by Embedding Multiple Highband Gains

    Hae-Yong YANG  Kyung-Hoon LEE  Sung-Jea KO  

     
    LETTER-Multimedia Systems for Communications

    Vol: E94-B  No: 10  Page(s): 2941-2944

    We present an improvement to the existing steganography-based bandwidth extension scheme. Enhanced WB (wideband) speech quality is achieved by embedding multiple highband spectral gains into a G.711 bitstream. The number of spectral gains is selected by optimizing the quantity of the embedding data with respect to the quality of the extended WB speech. Compared to the existing method, the proposed scheme improves the WB PESQ (Perceptual Evaluation of Speech Quality) score by 0.334 with negligible degradation of the embedded narrowband speech.

  • Information Hiding for G.711 Speech Based on Substitution of Least Significant Bits and Estimation of Tolerable Distortion

    Akinori ITO  Shun'ichiro ABE  Yoiti SUZUKI  

     
    PAPER-Speech and Hearing

    Vol: E93-A  No: 7  Page(s): 1279-1286

    In this paper, we propose a novel data hiding technique for G.711-coded speech based on the LSB substitution method. The novel feature of the proposed method is that a low-bitrate encoder, G.726 ADPCM, is used as a reference for deciding how many bits can be embedded in a sample. Experiments showed that the method outperformed the simple LSB substitution method and the selective embedding method proposed by Aoki. We achieved embedding at 4 kbit/s with almost no subjective degradation of speech quality, and at 10 kbit/s while maintaining good quality.
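
    For orientation, the simple LSB substitution baseline mentioned in this abstract can be sketched as follows. This is not the proposed method: the G.726-based estimate of how many bits each sample can tolerate is not described here and is not reproduced. The function names `embed_lsb` and `extract_lsb` are hypothetical, and the sketch assumes one payload bit per 8-bit G.711 codeword.

```python
# Baseline only: fixed 1-bit LSB substitution on 8-bit G.711 codewords.
# embed_lsb / extract_lsb are hypothetical names used for illustration.

def embed_lsb(codewords: bytes, payload_bits: list[int]) -> bytes:
    """Overwrite the least significant bit of each codeword with one payload bit."""
    out = bytearray(codewords)
    for i, bit in enumerate(payload_bits[:len(out)]):
        out[i] = (out[i] & 0xFE) | (bit & 1)  # clear the LSB, then set it to the payload bit
    return bytes(out)


def extract_lsb(codewords: bytes, n_bits: int) -> list[int]:
    """Read back the first n_bits embedded bits."""
    return [b & 1 for b in codewords[:n_bits]]


frame = bytes([0xFF, 0x7E, 0x00, 0x55])   # four arbitrary codewords
bits = [0, 1, 1, 0]
stego = embed_lsb(frame, bits)
recovered = extract_lsb(stego, len(bits))
```

    Since G.711 carries 8000 codewords per second, one bit per codeword corresponds to 8 kbit/s of hidden data; a scheme that varies the number of embedded bits per sample can land above or below that figure.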

  • A Technique of Lossless Steganography for G.711

    Naofumi AOKI  

     
    LETTER-Network

    Vol: E90-B  No: 11  Page(s): 3271-3273

    This study proposes a technique of lossless steganography for G.711, the most common codec for digital speech communications systems such as VoIP. The proposed technique exploits the characteristics of G.711 for embedding steganogram information without degradation. This paper shows the capacity of the proposed technique.

  • A G.711 Embedded Wideband Speech Coding for VoIP Conferences

    Yusuke HIWASAKI  Hitoshi OHMURO  Takeshi MORI  Sachiko KURIHARA  Akitoshi KATAOKA  

     
    PAPER-Speech and Hearing

    Vol: E89-D  No: 9  Page(s): 2542-2552

    This paper proposes a wideband speech coder in which a G.711 bitstream is embedded. This coder has an advantage over conventional coders in that it is highly interoperable with existing terminals, so that costly transcoding involving decoding and re-encoding can be avoided. We also propose a partial mixing method that effectively reduces the mixing complexity in multipoint remote conferences. To reduce the complexity, we take advantage of the scalable structure of the bitstream and mix only the lower band of the signal. For the higher band, the location of the main speaker is selected from among the remote locations, and its higher-band signal is redistributed together with the mixed lower-band signal. Through subjective evaluations, we show that the speech quality can be maintained even when the speech signals are partially mixed.
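
    The partial mixing idea can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the main speaker is chosen, so the sketch assumes the site with the highest lower-band energy; it also assumes each site's frame is already split into decoded lower-band and higher-band sample lists. The name `partial_mix` is hypothetical.

```python
# Sketch of partial mixing: sum only the lower band, forward one site's higher band.
# Assumption (not from the abstract): main speaker = site with highest lower-band energy.

Frame = tuple[list[float], list[float]]  # (lower_band, higher_band) samples of one frame


def partial_mix(sites: dict[str, Frame]) -> tuple[list[float], list[float], str]:
    names = list(sites)
    n = len(sites[names[0]][0])
    # Mix (sum) the lower-band signals of all sites sample by sample.
    mixed_low = [sum(sites[s][0][i] for s in names) for i in range(n)]
    # Pick the main speaker by lower-band energy (assumed criterion).
    main = max(names, key=lambda s: sum(x * x for x in sites[s][0]))
    # The higher band is taken from the main speaker only; it is not mixed.
    return mixed_low, sites[main][1], main


sites = {
    "A": ([1.0, -1.0], [0.1, 0.2]),
    "B": ([0.5, 0.5], [0.3, 0.4]),
}
mixed_low, high, main = partial_mix(sites)
```

    Only the lower band needs per-sample summation in this arrangement; the higher band is passed through from a single site, which is where the reduction in mixing complexity comes from.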