Graph Cepstrum: Spatial Feature Extracted from Partially Connected Microphones

Keisuke IMOTO

doi:10.1587/transinf.2019EDP7162

IEICE TRANSACTIONS on Information


Summary:

This paper proposes an effective and robust method of spatial feature extraction for acoustic scene analysis using distributed microphones that are partially synchronized and/or closely located. The method introduces a new cepstrum feature, the graph-based cepstrum, which uses a graph-based basis transformation to extract spatial information from distributed microphones while taking into account whether each pair of microphones is synchronized and/or closely located. Specifically, the log-amplitude of a multichannel observation is converted to a feature vector by the inverse graph Fourier transform, a basis transformation of a signal on a graph. Experiments using real environmental sounds show that the proposed graph-based cepstrum robustly extracts spatial information while accounting for the microphone connections. Moreover, the results indicate that the proposed method classifies acoustic scenes more robustly than conventional spatial features when the observed sounds have a large synchronization mismatch between partially synchronized microphone groups.
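
The transformation described in the summary can be illustrated with a minimal sketch. This is not the paper's implementation; it rests on several assumptions: the microphone connections are encoded as a binary adjacency matrix, the graph Fourier basis is taken from the eigenvectors of the combinatorial Laplacian L = D - A, the forward transform is taken to be U^T x so that the inverse multiplies by U (conventions differ between references), and the function names and example amplitudes are hypothetical.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected microphone graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def graph_cepstrum(amplitudes, adjacency, eps=1e-12):
    """Sketch of a graph-based cepstrum (assumed convention, see lead-in).

    amplitudes: one non-negative amplitude per microphone.
    adjacency:  symmetric 0/1 matrix; 1 means the pair of microphones
                is synchronized and/or closely located.
    Returns the feature vector and the eigenvector matrix U.
    """
    laplacian = graph_laplacian(adjacency)
    # eigh handles symmetric matrices and returns orthonormal eigenvectors.
    _, U = np.linalg.eigh(laplacian)
    # eps avoids log(0) for silent channels (an assumption of this sketch).
    log_amp = np.log(np.asarray(amplitudes, dtype=float) + eps)
    # Assumed inverse graph Fourier transform: multiply by U.
    return U @ log_amp, U


# Hypothetical example: four microphones forming two connected pairs.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])
amplitudes = np.array([0.8, 0.5, 0.1, 0.2])
feature, U = graph_cepstrum(amplitudes, adjacency)
```

Because U is orthogonal, the transformation only changes the basis: the log-amplitude vector can be recovered as `U.T @ feature`. The individual entries of `feature` depend on the sign and ordering of the eigenvectors returned by the solver, so they are not printed here.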

Publication: IEICE TRANSACTIONS on Information Vol.E103-D No.3 pp.631-638

Publication Date: 2020/03/01

Publicized: 2019/12/09

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2019EDP7162

Type of Manuscript: PAPER

Category: Speech and Hearing

Authors

Keisuke IMOTO
Ritsumeikan University

Keywords

graph cepstrum, graph signal processing, acoustic scene analysis, spatial cepstrum

Cite this

Keisuke IMOTO, "Graph Cepstrum: Spatial Feature Extracted from Partially Connected Microphones" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 3, pp. 631-638, March 2020, doi: 10.1587/transinf.2019EDP7162.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019EDP7162/_p

@ARTICLE{e103-d_3_631,
author={Keisuke IMOTO},
journal={IEICE TRANSACTIONS on Information},
title={Graph Cepstrum: Spatial Feature Extracted from Partially Connected Microphones},
year={2020},
volume={E103-D},
number={3},
pages={631--638},
abstract={In this paper, we propose an effective and robust method of spatial feature extraction for acoustic scene analysis utilizing partially synchronized and/or closely located distributed microphones. In the proposed method, a new cepstrum feature utilizing a graph-based basis transformation to extract spatial information from distributed microphones, while taking into account whether any pairs of microphones are synchronized and/or closely located, is introduced. Specifically, in the proposed graph-based cepstrum, the log-amplitude of a multichannel observation is converted to a feature vector utilizing the inverse graph Fourier transform, which is a method of basis transformation of a signal on a graph. Results of experiments using real environmental sounds show that the proposed graph-based cepstrum robustly extracts spatial information with consideration of the microphone connections. Moreover, the results indicate that the proposed method more robustly classifies acoustic scenes than conventional spatial features when the observed sounds have a large synchronization mismatch between partially synchronized microphone groups.},
keywords={graph cepstrum, graph signal processing, acoustic scene analysis, spatial cepstrum},
doi={10.1587/transinf.2019EDP7162},
ISSN={1745-1361},
month={March}
}

TY - JOUR
TI - Graph Cepstrum: Spatial Feature Extracted from Partially Connected Microphones
T2 - IEICE TRANSACTIONS on Information
SP - 631
EP - 638
AU - Keisuke IMOTO
PY - 2020
DO - 10.1587/transinf.2019EDP7162
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2020
AB - In this paper, we propose an effective and robust method of spatial feature extraction for acoustic scene analysis utilizing partially synchronized and/or closely located distributed microphones. In the proposed method, a new cepstrum feature utilizing a graph-based basis transformation to extract spatial information from distributed microphones, while taking into account whether any pairs of microphones are synchronized and/or closely located, is introduced. Specifically, in the proposed graph-based cepstrum, the log-amplitude of a multichannel observation is converted to a feature vector utilizing the inverse graph Fourier transform, which is a method of basis transformation of a signal on a graph. Results of experiments using real environmental sounds show that the proposed graph-based cepstrum robustly extracts spatial information with consideration of the microphone connections. Moreover, the results indicate that the proposed method more robustly classifies acoustic scenes than conventional spatial features when the observed sounds have a large synchronization mismatch between partially synchronized microphone groups.
ER -
