Developing an HMM-based speech synthesis system for a new language requires resources such as a speech database and segment-phonetic labels. As an under-resourced language, Malay lacks these resources, especially segment-phonetic labels. This research aims to develop an HMM-based speech synthesis system for Malay. We propose two ways of training the HMMs: the benchmark iterative training, which incorporates the DAEM algorithm, and isolated unit training, which applies segment-phonetic labels of Malay. The preferred method for preparing segment-phonetic labels is automatic segmentation. The Malay speech database is segmented automatically using two approaches: uniform segmentation, which applies a fixed phone duration, and a cross-lingual approach, which adopts the acoustic model of English. We measured the segmentation error of the two approaches to ascertain their relative effectiveness. A listening test was used to evaluate the intelligibility and naturalness of the synthetic speech produced by iterative and isolated unit training. We also compare the performance of the HMM-based speech synthesis system with that of existing Malay speech synthesis systems.
Mumtaz Begum MUSTAFA
University of Malaya
Zuraidah Mohd DON
University of Malaya
Raja Noor AINON
University of Malaya
Roziati ZAINUDDIN
University of Malaya
Gerry KNOWLES
Lingenium Sdn Bhd
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Mumtaz Begum MUSTAFA, Zuraidah Mohd DON, Raja Noor AINON, Roziati ZAINUDDIN, Gerry KNOWLES, "Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training," in IEICE TRANSACTIONS on Information, vol. E97-D, no. 5, pp. 1273-1282, May 2014, doi: 10.1587/transinf.E97.D.1273.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E97.D.1273/_p
@ARTICLE{e97-d_5_1273,
author={Mumtaz Begum MUSTAFA and Zuraidah Mohd DON and Raja Noor AINON and Roziati ZAINUDDIN and Gerry KNOWLES},
journal={IEICE TRANSACTIONS on Information},
title={Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training},
year={2014},
volume={E97-D},
number={5},
pages={1273-1282},
keywords={},
doi={10.1587/transinf.E97.D.1273},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training
T2 - IEICE TRANSACTIONS on Information
SP - 1273
EP - 1282
AU - Mumtaz Begum MUSTAFA
AU - Zuraidah Mohd DON
AU - Raja Noor AINON
AU - Roziati ZAINUDDIN
AU - Gerry KNOWLES
PY - 2014
DO - 10.1587/transinf.E97.D.1273
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2014
ER -