Comparison of Output Devices for Augmented Audio Reality

Kazuhiro KONDO, Naoya ANAZAWA, Yosuke KOBAYASHI

Summary:

We compared two audio output devices for augmented audio reality applications. In these applications, we plan to overlay speech annotations on the actual ambient environment, so the output device must deliver intelligible speech annotations while passing the environmental auditory scene transparently. The first candidate is the bone-conduction headphone, which delivers speech signals by vibrating the skull; because it leaves the ear canals open, normal hearing of the surrounding sound is left intact. The second is the binaural microphone/earphone combo, which has a form factor similar to a regular earphone but integrates a small microphone at the ear canal entrance. The input from these microphones can be fed back to the earphones along with the annotation speech. For reference, we also compared both devices to normal hearing (i.e., without headphones or earphones). We measured speech intelligibility while competing babble noise was presented simultaneously from the surrounding environment. The binaural combo generally delivered speech at intelligibility comparable to or higher than that of the bone-conduction headphones. However, with the binaural combo, we found that closing the ear canals with the earphones significantly altered the ear canal transfer characteristics. When we employed a compensation filter to account for this transfer function deviation, the resultant speech intelligibility was significantly higher. Nevertheless, both devices were found to be acceptable as audio output devices for augmented audio reality applications, since both deliver speech signals at high intelligibility even when a significant amount of competing noise is present. In fact, both speech output methods delivered speech at higher intelligibility than natural speech, especially when the SNR was low.
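
The summary reports intelligibility as a function of SNR but does not say how the speech and babble noise levels were set. As a rough illustration only, the sketch below shows one common way to scale a noise signal so that a speech-plus-noise mixture has a chosen SNR, using RMS levels. The function names (`rms`, `mix_at_snr`, `measured_snr_db`) and the synthetic test signals are assumptions made for this example, not taken from the paper.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise RMS ratio equals `snr_db`,
    then add it to `speech`. Returns (mixture, noise_gain)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    # SNR in dB is 20*log10(rms_speech / rms_scaled_noise); solve for the gain.
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    mixture = [s + gain * n for s, n in zip(speech, noise)]
    return mixture, gain


def measured_snr_db(speech, noise, gain):
    """SNR in dB of `speech` against `noise` scaled by `gain`."""
    return 20 * math.log10(rms(speech) / (gain * rms(noise)))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 16000  # one second at a hypothetical 16 kHz sampling rate
    # Stand-ins for real recordings: a tone for speech, Gaussian noise for babble.
    speech = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mixture, gain = mix_at_snr(speech, noise, snr_db=-5.0)
    print(round(measured_snr_db(speech, noise, gain), 6))
```

In a listening test, real speech and babble recordings would replace the synthetic signals, and the gain would typically be computed over the active speech portion rather than the whole file.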

Publication
IEICE TRANSACTIONS on Information Vol.E97-D No.8 pp.2114-2123
Publication Date
2014/08/01
Online ISSN
1745-1361
DOI
10.1587/transinf.E97.D.2114
Type of Manuscript
PAPER
Category
Speech and Hearing

Authors

Kazuhiro KONDO
  Yamagata University
Naoya ANAZAWA
  Yamagata University
Yosuke KOBAYASHI
  Yamagata University
