Constant-Q Deep Coefficients for Playback Attack Detection

Jichen YANG, Longting XU, Bo REN

Summary:

Within the framework of traditional power-spectrum-based feature extraction, this paper proposes a feature that extracts more discriminative information for playback attack detection by using a deep neural network to describe the nonlinear relationship between the power spectrum and that discriminative information. The feature, named constant-Q deep coefficients (CQDC), relies on the constant-Q transform, a deep neural network and the discrete cosine transform. The constant-Q transform converts the signal from the time domain into the frequency domain; it is chosen because it is a long-term transform that can provide more frequency detail. The deep neural network extracts more discriminative information to distinguish playback speech from genuine speech, and the discrete cosine transform decorrelates the feature dimensions. The ASVspoof 2017 corpus version 2.0 is used to evaluate the performance of CQDC. The experimental results show that CQDC outperforms existing features based on the power spectrum obtained from the constant-Q transform, with equal error rate reductions ranging from 19.18% to 51.56%. In addition, we found that the discriminative information of CQDC is spread across all frequency bins, which differs from commonly used features.
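
The data flow described in the summary (constant-Q power spectrum, then a nonlinear mapping, then a discrete cosine transform) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter values, the log compression and the single sigmoid layer standing in for the trained deep neural network are all assumptions made for illustration, and the weights would have to come from training that is not shown here.

```python
import cmath
import math


def constant_q_power(frame, fs, f_min, bins_per_octave, n_bins):
    """Naive constant-Q power spectrum of one frame (direct summation)."""
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)  # constant centre-frequency/bandwidth ratio
    power = []
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / bins_per_octave)  # geometrically spaced centre frequency
        n_k = min(len(frame), math.ceil(q * fs / f_k))  # window gets shorter as f_k rises
        acc = 0j
        for n in range(n_k):  # window covers the first n_k samples of the frame
            hann = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_k)
            acc += hann * frame[n] * cmath.exp(-2j * math.pi * f_k * n / fs)
        power.append(abs(acc / n_k) ** 2)
    return power


def dense_sigmoid(inputs, weights, biases):
    """Hypothetical one-layer stand-in for the paper's trained deep network."""
    out = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out


def dct_ii(values):
    """Unnormalised DCT-II, used here to decorrelate the feature dimensions."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def cqdc_like(frame, fs, weights, biases, f_min=250.0, bins_per_octave=12, n_bins=36):
    """Chain the three stages; every default value is an illustrative assumption."""
    power = constant_q_power(frame, fs, f_min, bins_per_octave, n_bins)
    log_power = [math.log(p + 1e-12) for p in power]  # assumed log compression
    hidden = dense_sigmoid(log_power, weights, biases)
    return dct_ii(hidden)
```

Calling `cqdc_like(frame, fs, weights, biases)` on one frame of samples returns one coefficient per row of `weights`. The direct summation is slow and is used only to keep the sketch free of external dependencies.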

Publication
IEICE TRANSACTIONS on Information Vol.E103-D No.2 pp.464-468
Publication Date
2020/02/01
Publicized
2019/11/14
Online ISSN
1745-1361
DOI
10.1587/transinf.2019EDL8115
Type of Manuscript
LETTER
Category
Speech and Hearing

Authors

Jichen YANG
  National University of Singapore
Longting XU
  Donghua University
Bo REN
  Microsoft Search Technology Center Asia

Keyword