Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training

Mumtaz Begum MUSTAFA; Zuraidah Mohd DON; Raja Noor AINON; Roziati ZAINUDDIN; Gerry KNOWLES

doi:10.1587/transinf.E97.D.1273

Summary:

Developing an HMM-based speech synthesis system for a new language requires resources such as a speech database and segment-phonetic labels. As an under-resourced language, Malay lacks the resources needed to develop such a system, especially segment-phonetic labels. This research aims to develop an HMM-based speech synthesis system for Malay. We propose two methods of training the HMMs: the benchmark iterative training, which incorporates the DAEM (deterministic annealing EM) algorithm, and isolated unit training, which uses segment-phonetic labels of Malay. The preferred method for preparing segment-phonetic labels is automatic segmentation. We segment the Malay speech database automatically using two approaches: uniform segmentation, which applies a fixed phone duration, and a cross-lingual approach, which adopts an acoustic model of English. We measured the segmentation error of the two approaches to ascertain their relative effectiveness. A listening test was used to evaluate the intelligibility and naturalness of the synthetic speech produced by iterative and isolated unit training. We also compare the performance of the HMM-based speech synthesis system with that of existing Malay speech synthesis systems.
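
As a rough illustration of the uniform segmentation idea mentioned in the summary, the sketch below splits an utterance into equal-length phone segments and scores a set of boundaries against a reference. It is not the authors' implementation: the function names, the choice to divide the utterance duration equally among its phones, and the mean absolute boundary error are all assumptions made for this example, and the phone sequence is purely illustrative.

```python
from typing import List, Tuple

# (start_sec, end_sec, phone); this tuple layout is an assumption of the sketch.
Segment = Tuple[float, float, str]


def uniform_segmentation(phones: List[str], utterance_sec: float) -> List[Segment]:
    """Give every phone the same duration: utterance length / number of phones."""
    if not phones:
        return []
    step = utterance_sec / len(phones)
    return [(i * step, (i + 1) * step, p) for i, p in enumerate(phones)]


def mean_boundary_error(hyp: List[Segment], ref: List[Segment]) -> float:
    """Mean absolute difference between hypothesised and reference end boundaries.

    This is one plausible error measure, assumed here for illustration only.
    """
    if len(hyp) != len(ref):
        raise ValueError("segment counts differ")
    if not hyp:
        return 0.0
    return sum(abs(h[1] - r[1]) for h, r in zip(hyp, ref)) / len(hyp)


if __name__ == "__main__":
    # Hypothetical four-phone utterance lasting 0.4 seconds.
    for start, end, phone in uniform_segmentation(["s", "a", "j", "a"], 0.4):
        print(f"{start:.3f} {end:.3f} {phone}")
```

Labels produced this way are only a crude starting point, since real phone durations vary widely; a cross-lingual approach would instead use an existing acoustic model to place the boundaries.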

Publication: IEICE TRANSACTIONS on Information Vol.E97-D No.5 pp.1273-1282

Publication Date: 2014/05/01

Online ISSN: 1745-1361

DOI: 10.1587/transinf.E97.D.1273

Type of Manuscript: PAPER

Category: Speech and Hearing

Authors

Mumtaz Begum MUSTAFA
  University of Malaya
Zuraidah Mohd DON
  University of Malaya
Raja Noor AINON
  University of Malaya
Roziati ZAINUDDIN
  University of Malaya
Gerry KNOWLES
  Lingenium Sdn Bhd

Keywords

iterative training, isolated unit training, cross lingual approach, uniform segmentation, segment-phonetic labels

Cite this

Mumtaz Begum MUSTAFA, Zuraidah Mohd DON, Raja Noor AINON, Roziati ZAINUDDIN, Gerry KNOWLES, "Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training" in IEICE TRANSACTIONS on Information, vol. E97-D, no. 5, pp. 1273-1282, May 2014, doi: 10.1587/transinf.E97.D.1273.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E97.D.1273/_p

@ARTICLE{e97-d_5_1273,
author={Mumtaz Begum MUSTAFA and Zuraidah Mohd DON and Raja Noor AINON and Roziati ZAINUDDIN and Gerry KNOWLES},
journal={IEICE TRANSACTIONS on Information},
title={Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training},
year={2014},
volume={E97-D},
number={5},
pages={1273--1282},
abstract={The development of an HMM-based speech synthesis system for a new language requires resources like speech database and segment-phonetic labels. As an under-resourced language, Malay lacks the necessary resources for the development of such a system, especially segment-phonetic labels. This research aims at developing an HMM-based speech synthesis system for Malay. We are proposing the use of two types of training HMMs, which are the benchmark iterative training incorporating the DAEM algorithm and isolated unit training applying segment-phonetic labels of Malay. The preferred method for preparing segment-phonetic labels is the automatic segmentation. The automatic segmentation of Malay speech database is performed using two approaches which are uniform segmentation that applies fixed phone duration, and a cross-lingual approach that adopts the acoustic model of English. We have measured the segmentation error of the two segmentation approaches to ascertain their relative effectiveness. A listening test was used to evaluate the intelligibility and naturalness of the synthetic speech produced from the iterative and isolated unit training. We also compare the performance of the HMM-based speech synthesis system with existing Malay speech synthesis systems.},
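keywords={iterative training, isolated unit training, cross lingual approach, uniform segmentation, segment-phonetic labels},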
doi={10.1587/transinf.E97.D.1273},
ISSN={1745-1361},
month={May},
}

TY  - JOUR
TI  - Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training
T2  - IEICE TRANSACTIONS on Information
SP  - 1273
EP  - 1282
AU  - Mumtaz Begum MUSTAFA
AU  - Zuraidah Mohd DON
AU  - Raja Noor AINON
AU  - Roziati ZAINUDDIN
AU  - Gerry KNOWLES
PY  - 2014
DO  - 10.1587/transinf.E97.D.1273
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E97-D
IS  - 5
JA  - IEICE TRANSACTIONS on Information
Y1  - May 2014
AB  - The development of an HMM-based speech synthesis system for a new language requires resources like speech database and segment-phonetic labels. As an under-resourced language, Malay lacks the necessary resources for the development of such a system, especially segment-phonetic labels. This research aims at developing an HMM-based speech synthesis system for Malay. We are proposing the use of two types of training HMMs, which are the benchmark iterative training incorporating the DAEM algorithm and isolated unit training applying segment-phonetic labels of Malay. The preferred method for preparing segment-phonetic labels is the automatic segmentation. The automatic segmentation of Malay speech database is performed using two approaches which are uniform segmentation that applies fixed phone duration, and a cross-lingual approach that adopts the acoustic model of English. We have measured the segmentation error of the two segmentation approaches to ascertain their relative effectiveness. A listening test was used to evaluate the intelligibility and naturalness of the synthetic speech produced from the iterative and isolated unit training. We also compare the performance of the HMM-based speech synthesis system with existing Malay speech synthesis systems.
ER  -