The search functionality is under construction.

Keyword Search Result

[Keyword] unit training (2 hits)

Hits 1-2
  • Developing an HMM-Based Speech Synthesis System for Malay: A Comparison of Iterative and Isolated Unit Training

    Mumtaz Begum MUSTAFA  Zuraidah Mohd DON  Raja Noor AINON  Roziati ZAINUDDIN  Gerry KNOWLES  

     
    PAPER-Speech and Hearing

    Vol: E97-D  No: 5  Page(s): 1273-1282

    Developing an HMM-based speech synthesis system for a new language requires resources such as a speech database and segment-phonetic labels. As an under-resourced language, Malay lacks these resources, especially segment-phonetic labels. This research aims to develop an HMM-based speech synthesis system for Malay. We propose two ways of training the HMMs: the benchmark iterative training, which incorporates the DAEM algorithm, and isolated unit training, which applies segment-phonetic labels of Malay. The preferred method for preparing segment-phonetic labels is automatic segmentation. The Malay speech database is segmented automatically using two approaches: uniform segmentation, which applies a fixed phone duration, and a cross-lingual approach, which adopts an English acoustic model. We measured the segmentation error of the two approaches to ascertain their relative effectiveness. A listening test was used to evaluate the intelligibility and naturalness of the synthetic speech produced by iterative and isolated unit training. We also compare the performance of the HMM-based speech synthesis system with that of existing Malay speech synthesis systems.
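The uniform segmentation mentioned in this abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption: the function name `uniform_segmentation`, the choice to divide the utterance duration equally among its phones, and the `(phone, start, end)` label layout are all hypothetical.

```python
def uniform_segmentation(phones, total_duration):
    """Assign every phone the same duration within one utterance.

    Assumption: "fixed phone duration" is read here as an equal share of
    the utterance length (in seconds) for each phone. The paper may define
    it differently.
    """
    if not phones:
        return []
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")

    step = total_duration / len(phones)
    labels = []
    for i, phone in enumerate(phones):
        start = i * step
        # Pin the final boundary to the utterance end to avoid rounding drift.
        end = total_duration if i == len(phones) - 1 else (i + 1) * step
        labels.append((phone, start, end))
    return labels


if __name__ == "__main__":
    # Hypothetical phone sequence for a 2-second utterance.
    for label in uniform_segmentation(["s", "a", "y", "a"], 2.0):
        print(label)
```

Labels produced this way ignore the acoustics entirely, which is presumably why the abstract measures segmentation error against the cross-lingual alternative.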

  • Concatenative Speech Synthesis Based on the Plural Unit Selection and Fusion Method

    Tatsuya MIZUTANI  Takehiko KAGOSHIMA  

     
    PAPER-Speech and Hearing

    Vol: E88-D  No: 11  Page(s): 2565-2572

    This paper proposes a novel speech synthesis method that generates human-like, natural speech. The conventional unit-selection-based synthesis method selects speech units from a large database and concatenates them, with or without modifying the prosody, to generate synthetic speech. This method yields highly human-like voice quality. However, a suitable speech unit is not necessarily selected. Because an unsuitable speech unit causes discontinuity between consecutive speech units, the quality of the synthesized speech deteriorates. The conventional method might attain higher speech quality with a larger database. However, preparing a larger database requires a longer recording time, and the narrator's voice quality does not remain constant throughout the recording period. This degrades the database quality and still leaves the problem of unsuitable selection. We propose the plural unit selection and fusion method, which avoids this problem. This method integrates the unit fusion used in the unit-training-based method with the conventional unit-selection-based method. The proposed method selects plural speech units for each segment, fuses the selected units, modifies the prosody of the fused units, and concatenates them to generate synthetic speech. Unit fusion creates speech units that connect to one another with much less voice discontinuity, and thus realizes high-quality speech. A subjective evaluation test showed that the proposed method greatly improves speech quality compared with the conventional method. It also showed that the speech quality of the proposed method remains high regardless of the database size, from small (10 minutes) to large (40 minutes). The proposed method is a new framework in the sense that it is a hybrid of the unit-selection-based method and the unit-training-based method. In this framework, the unit selection and unit fusion algorithms can be replaced with more efficient techniques. Thus, the framework is expected to lead to new synthesis methods.
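The select, fuse, and concatenate pipeline described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the abstract does not specify the selection cost or the fusion algorithm, so a hypothetical pitch-distance cost and a plain element-wise average stand in for them, and the prosody modification step is omitted. All names (`select_plural_units`, `fuse_units`, `synthesize`) and the database layout are invented for the example.

```python
def select_plural_units(candidates, target_pitch, n_units):
    """Pick the n_units candidates closest to the target.

    Assumption: cost is the absolute pitch difference. The paper's actual
    cost function is not given in the abstract.
    """
    ranked = sorted(candidates, key=lambda u: abs(u["pitch"] - target_pitch))
    return ranked[:n_units]


def fuse_units(units):
    """Fuse several units into one by element-wise averaging.

    Assumption: averaging equal-length sample lists is a stand-in for the
    unspecified fusion algorithm.
    """
    if not units:
        raise ValueError("no units to fuse")
    length = len(units[0]["samples"])
    if any(len(u["samples"]) != length for u in units):
        raise ValueError("units must have equal length to be averaged")
    return [sum(u["samples"][i] for u in units) / len(units) for i in range(length)]


def synthesize(segments, database, n_units=2):
    """Select plural units per segment, fuse them, and concatenate the results."""
    output = []
    for phone, target_pitch in segments:
        selected = select_plural_units(database[phone], target_pitch, n_units)
        output.extend(fuse_units(selected))
    return output


if __name__ == "__main__":
    # Toy database: phone -> candidate units (hypothetical values).
    database = {
        "a": [
            {"pitch": 100, "samples": [1.0, 1.0]},
            {"pitch": 120, "samples": [3.0, 3.0]},
            {"pitch": 200, "samples": [9.0, 9.0]},
        ],
        "k": [{"pitch": 110, "samples": [2.0]}],
    }
    print(synthesize([("a", 110), ("k", 110)], database, n_units=2))
```

Because selection and fusion are separate functions, either one can be swapped out independently, which mirrors the abstract's remark that the two algorithms are exchangeable within the framework.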