
Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis

Xin WANG, Shinji TAKAKI, Junichi YAMAGISHI

Summary:

Building high-quality text-to-speech (TTS) systems without expert knowledge of the target language and/or time-consuming manual annotation of speech and text data is an important yet challenging research topic. In this kind of TTS system, it is vital to find a representation of the input text that is both effective and easy to acquire. Recently, the continuous representation of raw word inputs, called “word embedding”, has been successfully used in various natural language processing tasks. It has also been used as an additional or alternative linguistic input feature to a neural-network-based acoustic model for TTS systems. In this paper, we further investigate the use of this embedding technique to represent phonemes, syllables and phrases for acoustic models based on recurrent and feed-forward neural networks. Experimental results show that most of these continuous representations do not significantly improve the system's performance when they are fed into the acoustic model, either as an additional component or as a replacement for the conventional prosodic context. However, subjective evaluation shows that the continuous representation of phrases achieves a significant improvement when it is combined with the prosodic context as input to the acoustic model based on the feed-forward neural network.

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.10 pp.2471-2480
Publication Date
2016/10/01
Publicized
2016/07/19
Online ISSN
1745-1361
DOI
10.1587/transinf.2016SLP0011
Type of Manuscript
Special Section PAPER (Special Section on Recent Advances in Machine Learning for Spoken Language Processing)
Category
Speech synthesis

Authors

Xin WANG
  National Institute of Informatics, SOKENDAI
Shinji TAKAKI
  National Institute of Informatics
Junichi YAMAGISHI
  National Institute of Informatics, SOKENDAI, University of Edinburgh

Keyword