Corpus-based concatenative speech synthesis has been widely investigated and deployed in recent years because it produces highly natural synthesized speech. The amount of computation required at run time, however, can often be quite large. In this paper, we propose early stopping schemes for Viterbi beam search in unit selection, with which we can stop early both in the local Viterbi minimization for each unit and in the exploration of candidate units for a given target. The schemes take advantage of the fact that the space of acoustic parameters of the database units is fixed, so certain lower bounds on the concatenation costs can be precomputed. The proposed early stopping is admissible in that it does not change the result of the Viterbi beam search. Experiments using probability-based as well as distance-based concatenation costs show that the proposed admissible stopping methods effectively reduce the amount of computation required in the Viterbi beam search while keeping its result unchanged. Furthermore, the reduction in computation turned out to be much larger when the available lower bound on the concatenation costs is tighter.
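The idea of admissible stopping in the local Viterbi minimization can be illustrated with a small sketch. This is not the authors' formulation, only one way the principle could work under stated assumptions: the predecessor candidates are visited in ascending order of accumulated cost, and a lower bound on the concatenation cost into the current unit has been precomputed over the whole database. The function name `local_viterbi_min`, the toy features `END_FEAT` and `START_FEAT`, and the absolute-difference cost are all hypothetical. The second scheme mentioned in the abstract (stopping the exploration of candidate units) is not covered here.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical 1-D acoustic features for a toy unit database.
END_FEAT: Dict[str, float] = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 9.0}
START_FEAT: Dict[str, float] = {"x": 3.0, "y": 8.0}


def concat_cost(prev_unit: str, unit: str) -> float:
    """Toy distance-based concatenation cost (an assumption, not the paper's cost)."""
    return abs(END_FEAT[prev_unit] - START_FEAT[unit])


# Because the database is fixed, a lower bound on the cost of joining
# any unit to `unit` can be computed once, offline.
LOWER_BOUND: Dict[str, float] = {
    u: min(concat_cost(p, u) for p in END_FEAT) for u in START_FEAT
}


def local_viterbi_min(
    prev: List[Tuple[str, float]],
    unit: str,
    cost: Callable[[str, str], float],
    lower_bound: Optional[float] = None,
) -> Tuple[float, Optional[str], int]:
    """Return (best cost, best predecessor, number of cost evaluations).

    `prev` holds (predecessor, accumulated cost) pairs sorted by ascending
    accumulated cost. If `lower_bound` is given, the loop stops as soon as
    no remaining predecessor can beat the current best.
    """
    best_cost, best_prev, evaluations = math.inf, None, 0
    for prev_unit, acc in prev:
        # Every remaining predecessor has accumulated cost >= acc, and every
        # concatenation cost is >= lower_bound, so none can improve on best_cost.
        if lower_bound is not None and acc + lower_bound >= best_cost:
            break
        evaluations += 1
        total = acc + cost(prev_unit, unit)
        if total < best_cost:
            best_cost, best_prev = total, prev_unit
    return best_cost, best_prev, evaluations


# Predecessor candidates, already sorted by accumulated cost.
PREV: List[Tuple[str, float]] = [("b", 0.5), ("a", 1.0), ("c", 4.0), ("d", 6.0)]

if __name__ == "__main__":
    for u in START_FEAT:
        full = local_viterbi_min(PREV, u, concat_cost)
        early = local_viterbi_min(PREV, u, concat_cost, LOWER_BOUND[u])
        print(u, "exhaustive:", full, "early stop:", early)
```

In this sketch the stopped search returns the same minimum and the same predecessor as the exhaustive one while evaluating fewer concatenation costs. A tighter lower bound triggers the stopping condition sooner, which is consistent with the abstract's closing observation.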
Shinsuke SAKAI
Kyoto University
Tatsuya KAWAHARA
Kyoto University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Shinsuke SAKAI, Tatsuya KAWAHARA, "Admissible Stopping in Viterbi Beam Search for Unit Selection Speech Synthesis" in IEICE TRANSACTIONS on Information,
vol. E96-D, no. 6, pp. 1359-1367, June 2013, doi: 10.1587/transinf.E96.D.1359.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E96.D.1359/_p
@ARTICLE{e96-d_6_1359,
author={Shinsuke SAKAI and Tatsuya KAWAHARA},
journal={IEICE TRANSACTIONS on Information},
title={Admissible Stopping in Viterbi Beam Search for Unit Selection Speech Synthesis},
year={2013},
volume={E96-D},
number={6},
pages={1359-1367},
abstract={Corpus-based concatenative speech synthesis has been widely investigated and deployed in recent years since it provides a highly natural synthesized speech quality. The amount of computation required in the run time, however, can often be quite large. In this paper, we propose early stopping schemes for Viterbi beam search in the unit selection, with which we can stop early in the local Viterbi minimization for each unit as well as in the exploration of candidate units for a given target. It takes advantage of the fact that the space of the acoustic parameters of the database units is fixed and certain lower bounds of the concatenation costs can be precomputed. The proposed method for early stopping is admissible in that it does not change the result of the Viterbi beam search. Experiments using probability-based concatenation costs as well as distance-based costs show that the proposed methods of admissible stopping effectively reduce the amount of computation required in the Viterbi beam search while keeping its result unchanged. Furthermore, the reduction effect of computation turned out to be much larger if the available lower bound for concatenation costs is tighter.},
keywords={},
doi={10.1587/transinf.E96.D.1359},
ISSN={1745-1361},
month={June},
}
TY  - JOUR
TI  - Admissible Stopping in Viterbi Beam Search for Unit Selection Speech Synthesis
T2  - IEICE TRANSACTIONS on Information
SP  - 1359
EP  - 1367
AU  - Shinsuke SAKAI
AU  - Tatsuya KAWAHARA
PY  - 2013
DO  - 10.1587/transinf.E96.D.1359
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E96-D
IS  - 6
JA  - IEICE TRANSACTIONS on Information
Y1  - June 2013
AB  - Corpus-based concatenative speech synthesis has been widely investigated and deployed in recent years since it provides a highly natural synthesized speech quality. The amount of computation required in the run time, however, can often be quite large. In this paper, we propose early stopping schemes for Viterbi beam search in the unit selection, with which we can stop early in the local Viterbi minimization for each unit as well as in the exploration of candidate units for a given target. It takes advantage of the fact that the space of the acoustic parameters of the database units is fixed and certain lower bounds of the concatenation costs can be precomputed. The proposed method for early stopping is admissible in that it does not change the result of the Viterbi beam search. Experiments using probability-based concatenation costs as well as distance-based costs show that the proposed methods of admissible stopping effectively reduce the amount of computation required in the Viterbi beam search while keeping its result unchanged. Furthermore, the reduction effect of computation turned out to be much larger if the available lower bound for concatenation costs is tighter.
ER  - 