The search functionality is under construction.

Keyword Search Result

[Keyword] duration modeling (2 hits)

Hits 1-2
  • State Duration Modeling for HMM-Based Speech Synthesis

    Heiga ZEN  Takashi MASUKO  Keiichi TOKUDA  Takayoshi YOSHIMURA  Takao KOBAYASHI  Tadashi KITAMURA  

     
    LETTER-Speech and Hearing

    Vol: E90-D  No: 3
    Page(s): 692-693

    This paper describes the explicit modeling of a state duration's probability density function in HMM-based speech synthesis. We redefine, in a statistically correct manner, the probability of staying in a state for a time interval used to obtain the state duration PDF and demonstrate improvements in the duration of synthesized speech.

  • Duration Modeling Using Cumulative Duration Probability

    Tae-Young YANG  Chungyong LEE  Dae-Hee YOUN  

     
    LETTER-Speech and Hearing

    Vol: E85-D  No: 9
    Page(s): 1452-1454

    A duration modeling technique is proposed for an HMM-based connected digit recognizer. The technique uses a cumulative duration probability, defined as the partial sum of the duration probabilities, which can be estimated from the training speech data. Two approaches to using it are presented. In the first, the cumulative duration probability is used as a weighting factor on the state transition probability of the HMM. In the second, it replaces the conventional state transition probability. In both approaches, the cumulative duration probability is incorporated directly into the Viterbi decoding procedure, and a modified Viterbi decoding procedure is also presented. One advantage of the proposed technique is that the cumulative duration probability governs the transitions between states and words at each frame, so no additional post-processing step is required. The technique was evaluated in recognition experiments on Korean connected digits. Experimental results showed that the two approaches achieved almost the same performance and that the average recognition accuracy improved from 83.60% to 93.12%.
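
    The partial-sum definition in the abstract above can be illustrated with a minimal sketch. It is one plausible reading only, not the authors' implementation: the function names, the relative-frequency estimate, the clipping at `max_duration`, and the sample durations are all assumptions made for illustration. The abstract does not say how the cumulative probability enters the Viterbi recursion, so that step is left out.

```python
from collections import Counter


def duration_probabilities(durations, max_duration):
    """Relative-frequency estimate of P(d) for d = 1..max_duration.

    `durations` is a list of observed state durations in frames
    (hypothetical training data). Durations longer than `max_duration`
    are clipped into the last bin, an assumption made here for simplicity.
    """
    counts = Counter(min(d, max_duration) for d in durations)
    total = len(durations)
    return [counts.get(d, 0) / total for d in range(1, max_duration + 1)]


def cumulative_duration_probabilities(probs):
    """Partial sums C(d) = P(1) + ... + P(d), stored at index d - 1."""
    cumulative = []
    running = 0.0
    for p in probs:
        running += p
        cumulative.append(running)
    return cumulative


# Made-up durations (in frames) for a single HMM state.
durations = [2, 3, 3, 4, 4, 4, 5, 5, 6, 8]
probs = duration_probabilities(durations, max_duration=8)
cdf = cumulative_duration_probabilities(probs)

for d, (p, c) in enumerate(zip(probs, cdf), start=1):
    print(f"d={d}  P(d)={p:.2f}  C(d)={c:.2f}")
```

    Because C(d) is non-decreasing and approaches 1 as d grows, it can serve as a per-frame score that increasingly favors leaving a state the longer the state has been occupied, which is consistent with the abstract's description of it governing transitions at each frame.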