This paper proposes a wideband speech coder in which a G.711 bitstream is embedded. This coder has an advantage over conventional coders in that it offers high interoperability with existing terminals, so costly transcoding involving decoding and re-encoding can be avoided. We also propose a partial mixing method that effectively reduces the mixing complexity in multiple-point remote conferences. To reduce the complexity, we take advantage of the scalable structure of the bitstream and mix only the lower band of the signal. For the higher band, the main speaker location is selected among the remote locations, and its higher-band signal is redistributed together with the mixed lower-band signal. Subjective evaluations show that the speech quality can be maintained even when the speech signals are only partially mixed.
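The partial mixing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Frame` type, the function names, and the use of lower-band energy to pick the main speaker are all assumptions made for illustration, since the abstract does not state the selection criterion. The sketch also assumes the listener's own signal is excluded from what is sent back to that listener, and it works on already-decoded lower-band PCM rather than on G.711 codewords.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Frame:
    """One frame from one location (hypothetical layout, for illustration only)."""
    low: List[int]   # decoded lower-band PCM samples, 16-bit range
    high: bytes      # higher-band enhancement payload, treated as opaque


def energy(samples: List[int]) -> int:
    """Sum of squared samples, used here as an assumed speaker-activity measure."""
    return sum(s * s for s in samples)


def clip16(x: int) -> int:
    """Saturate a mixed sample to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def partial_mix(frames: Dict[str, Frame], listener: str) -> Tuple[List[int], Optional[bytes]]:
    """Mix only the lower band; forward the higher band of one main speaker.

    Returns (mixed lower-band samples, higher-band payload of the main speaker).
    """
    # Assumption: a listener does not receive its own signal.
    others = {name: f for name, f in frames.items() if name != listener}
    if not others:
        return [], None

    # Lower band: sample-wise sum over the remote locations, with saturation.
    n = min(len(f.low) for f in others.values())
    mixed = [clip16(sum(f.low[i] for f in others.values())) for i in range(n)]

    # Higher band: no mixing; pick the location with the largest lower-band
    # energy (assumed criterion) and pass its payload through unchanged.
    main = max(others, key=lambda name: energy(others[name].low))
    return mixed, others[main].high


if __name__ == "__main__":
    frames = {
        "A": Frame(low=[100, -100], high=b"A"),
        "B": Frame(low=[1000, 2000], high=b"B"),
        "C": Frame(low=[10, 10], high=b"C"),
    }
    print(partial_mix(frames, "A"))
```

Because the higher band is forwarded rather than decoded, summed, and re-encoded, the per-frame mixing work in this sketch is limited to the lower band, which is the complexity saving the abstract describes.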
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yusuke HIWASAKI, Hitoshi OHMURO, Takeshi MORI, Sachiko KURIHARA, Akitoshi KATAOKA, "A G.711 Embedded Wideband Speech Coding for VoIP Conferences" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 9, pp. 2542-2552, September 2006, doi: 10.1093/ietisy/e89-d.9.2542.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.9.2542/_p
@ARTICLE{e89-d_9_2542,
author={Yusuke HIWASAKI and Hitoshi OHMURO and Takeshi MORI and Sachiko KURIHARA and Akitoshi KATAOKA},
journal={IEICE TRANSACTIONS on Information},
title={A G.711 Embedded Wideband Speech Coding for VoIP Conferences},
year={2006},
volume={E89-D},
number={9},
pages={2542--2552},
abstract={This paper proposes a wideband speech coder in which a G.711 bitstream is embedded. This coder has an advantage over conventional coders in that it has a high interoperability with existing terminals so costly transcoding involving decoding and re-encoding can be avoided. We also propose a partial mixing method that effectively reduces the mixing complexity in multiple-point remote conferences. To reduce the complexity, we take advantage of the scalable structure of the bitstream and mix only the lower band of the signal. For the higher band, the main speaker location is selected among remote locations and is redistributed with the mixed lower-band signal. By subjective evaluations, we show that the speech quality can be maintained even when the speech signals are partially mixed.},
keywords={},
doi={10.1093/ietisy/e89-d.9.2542},
ISSN={1745-1361},
month={September},
}
TY - JOUR
TI - A G.711 Embedded Wideband Speech Coding for VoIP Conferences
T2 - IEICE TRANSACTIONS on Information
SP - 2542
EP - 2552
AU - Yusuke HIWASAKI
AU - Hitoshi OHMURO
AU - Takeshi MORI
AU - Sachiko KURIHARA
AU - Akitoshi KATAOKA
PY - 2006
DO - 10.1093/ietisy/e89-d.9.2542
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E89-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2006
AB - This paper proposes a wideband speech coder in which a G.711 bitstream is embedded. This coder has an advantage over conventional coders in that it has a high interoperability with existing terminals so costly transcoding involving decoding and re-encoding can be avoided. We also propose a partial mixing method that effectively reduces the mixing complexity in multiple-point remote conferences. To reduce the complexity, we take advantage of the scalable structure of the bitstream and mix only the lower band of the signal. For the higher band, the main speaker location is selected among remote locations and is redistributed with the mixed lower-band signal. By subjective evaluations, we show that the speech quality can be maintained even when the speech signals are partially mixed.
ER -