Recurrent neural networks (RNNs) have been used in audio and speech processing tasks such as language translation and speech recognition. Although RNN-based architectures can be applied to speech synthesis, long computing time remains the primary concern. This research proposes a fast gated recurrent neural network, a fast RNN-based architecture for speech synthesis based on the minimal gated unit (MGU). Our architecture removes the unit state history from some equations in the MGU. It is about twice as fast as other MGU-based architectures, with equally good sound quality.
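To make the idea of removing the unit state history concrete, here is a minimal pure-Python sketch of one MGU time step. The standard MGU computes a forget gate f_t = sigmoid(W_f x_t + U_f h_{t-1} + b_f), a candidate state tanh(W_h x_t + U_h (f_t * h_{t-1}) + b_h), and the new state h_t = (1 - f_t) * h_{t-1} + f_t * candidate. The abstract does not say which equations lose the history term, so the `gate_uses_history=False` branch below, which drops h_{t-1} from the forget gate only, is an assumption made for illustration and not necessarily the authors' design. The names `mgu_step`, `init_params`, and the parameter keys are likewise hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(input_dim, hidden_dim, seed=0):
    """Small random weights and zero biases (illustrative initialisation)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "Wf": mat(hidden_dim, input_dim),   # gate, input weights
        "Uf": mat(hidden_dim, hidden_dim),  # gate, recurrent weights
        "bf": [0.0] * hidden_dim,
        "Wh": mat(hidden_dim, input_dim),   # candidate, input weights
        "Uh": mat(hidden_dim, hidden_dim),  # candidate, recurrent weights
        "bh": [0.0] * hidden_dim,
    }


def mgu_step(x, h_prev, p, gate_uses_history=True):
    """One MGU step. With gate_uses_history=False the forget gate ignores
    h_prev (ASSUMED simplification, not confirmed by the abstract)."""
    gate_in = matvec(p["Wf"], x)
    if gate_uses_history:
        # Standard MGU: the gate also sees the previous state.
        rec = matvec(p["Uf"], h_prev)
        gate_in = [a + b for a, b in zip(gate_in, rec)]
    f = [sigmoid(a + b) for a, b in zip(gate_in, p["bf"])]

    # Candidate state uses the gated previous state.
    gated_h = [fi * hi for fi, hi in zip(f, h_prev)]
    cand_in = [
        a + b + c
        for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated_h), p["bh"])
    ]
    h_tilde = [math.tanh(c) for c in cand_in]

    # Blend previous state and candidate with the single gate.
    return [(1.0 - fi) * hi + fi * ti for fi, hi, ti in zip(f, h_prev, h_tilde)]


# Usage: run two frames of a toy 3-dimensional input through a 4-unit cell.
params = init_params(input_dim=3, hidden_dim=4)
h = [0.0] * 4
for frame in ([0.1, 0.2, 0.3], [0.0, -0.1, 0.5]):
    h = mgu_step(frame, h, params, gate_uses_history=False)
```

In this sketch the simplified branch skips one hidden-by-hidden matrix-vector product per step. That only shows where such a simplification could save work; it does not reproduce or measure the speed-up reported in the paper.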
Bima PRIHASTO
National Central University
Tzu-Chiang TAI
Providence University
Pao-Chi CHANG
National Central University
Jia-Ching WANG
National Central University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Bima PRIHASTO, Tzu-Chiang TAI, Pao-Chi CHANG, Jia-Ching WANG, "Fast Gated Recurrent Network for Speech Synthesis," in IEICE TRANSACTIONS on Information, vol. E105-D, no. 9, pp. 1634-1638, September 2022, doi: 10.1587/transinf.2021EDL8032.
Abstract: Recurrent neural networks (RNNs) have been used in audio and speech processing tasks such as language translation and speech recognition. Although RNN-based architectures can be applied to speech synthesis, long computing time remains the primary concern. This research proposes a fast gated recurrent neural network, a fast RNN-based architecture for speech synthesis based on the minimal gated unit (MGU). Our architecture removes the unit state history from some equations in the MGU. It is about twice as fast as other MGU-based architectures, with equally good sound quality.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDL8032/_p
@ARTICLE{e105-d_9_1634,
author={Bima PRIHASTO and Tzu-Chiang TAI and Pao-Chi CHANG and Jia-Ching WANG},
journal={IEICE TRANSACTIONS on Information},
title={Fast Gated Recurrent Network for Speech Synthesis},
year={2022},
volume={E105-D},
number={9},
pages={1634--1638},
abstract={Recurrent neural networks (RNNs) have been used in audio and speech processing tasks such as language translation and speech recognition. Although RNN-based architectures can be applied to speech synthesis, long computing time remains the primary concern. This research proposes a fast gated recurrent neural network, a fast RNN-based architecture for speech synthesis based on the minimal gated unit (MGU). Our architecture removes the unit state history from some equations in the MGU. It is about twice as fast as other MGU-based architectures, with equally good sound quality.},
keywords={},
doi={10.1587/transinf.2021EDL8032},
ISSN={1745-1361},
month={September},}
TY - JOUR
TI - Fast Gated Recurrent Network for Speech Synthesis
T2 - IEICE TRANSACTIONS on Information
SP - 1634
EP - 1638
AU - Bima PRIHASTO
AU - Tzu-Chiang TAI
AU - Pao-Chi CHANG
AU - Jia-Ching WANG
PY - 2022
DO - 10.1587/transinf.2021EDL8032
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E105-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2022
AB - Recurrent neural networks (RNNs) have been used in audio and speech processing tasks such as language translation and speech recognition. Although RNN-based architectures can be applied to speech synthesis, long computing time remains the primary concern. This research proposes a fast gated recurrent neural network, a fast RNN-based architecture for speech synthesis based on the minimal gated unit (MGU). Our architecture removes the unit state history from some equations in the MGU. It is about twice as fast as other MGU-based architectures, with equally good sound quality.
ER -