Toshiro WATANABE Shinji HAYASHI
We propose an objective measure for assessing low-rate coded speech. The model underlying this measure emulates several known features of the perceptual processing of speech sounds by the human ear. It is based on the Hertz-to-Bark transformation, critical-band filtering with preemphasis to boost higher frequencies, nonlinear conversion for subjective loudness, and temporal (forward) masking. The effectiveness of the measure, called the Bark spectral distortion rating (BSDR), was validated by second-order polynomial regression analysis between the computed BSDR values and subjective MOS ratings. The ratings were obtained for a large number of utterances coded by several versions of CELP coders and one VSELP coder under three degradation conditions: input speech levels, transmission error rates, and background noise levels. The BSDR values correspond better to MOS ratings than the values of several commonly used measures. Thus, BSDR can be used to accurately predict subjective scores.
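Two of the steps named above, the Hertz-to-Bark transformation and the second-order polynomial regression against MOS, can be illustrated with a minimal sketch. It is not the authors' implementation. The Zwicker-Terhardt approximation used for the Bark mapping is an assumed choice, since the abstract does not state which formula is used. The function names `hz_to_bark`, `fit_quadratic`, and `predict_mos` are hypothetical, and the distortion and MOS values are synthetic placeholders, not data from the study.

```python
import numpy as np


def hz_to_bark(f_hz):
    """Map frequency in Hz to the Bark scale.

    Uses the Zwicker-Terhardt approximation (an assumption; the abstract
    does not specify the formula).
    """
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)


def fit_quadratic(distortion, mos):
    """Least-squares fit of MOS as a second-order polynomial of distortion.

    Returns coefficients ordered from the highest power to the constant.
    """
    return np.polyfit(distortion, mos, 2)


def predict_mos(coeffs, distortion):
    """Evaluate the fitted polynomial at the given distortion values."""
    return np.polyval(coeffs, distortion)


# Synthetic placeholder data generated from a known quadratic,
# used only to show that the fit recovers the curve.
d = np.linspace(0.0, 4.0, 9)
mos = 4.5 - 0.9 * d + 0.05 * d ** 2

coeffs = fit_quadratic(d, mos)
predicted = predict_mos(coeffs, d)
```

In practice, the fitted curve would be judged by how closely `predicted` tracks the listener scores, for example through the correlation coefficient or the standard error of the estimate.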