
Author Search Result

[Author] Takehiko KAGOSHIMA (3 hits)

  • Design of Three-Dimensional Digital Filters for Video Signal Processing via Decomposition of Magnitude Specifications

    Masayuki KAWAMATA  Takehiko KAGOSHIMA  Tatsuo HIGUCHI  

     
    PAPER-Design and Implementation of Multidimensional Digital Filters
    Vol: E75-A No:7  Page(s): 821-829

    This paper proposes an efficient design method for three-dimensional (3-D) recursive digital filters for video signal processing via decomposition of magnitude specifications. A given magnitude specification of a 3-D digital filter is decomposed into specifications of 1-D digital filters in three different directions (horizontal, vertical, and temporal). This decomposition reduces the design problem of a 3-D digital filter to design problems of 1-D digital filters, which can be solved easily by conventional methods. Consequently, 3-D digital filters can be designed efficiently without complicated stability tests or a large amount of computation. To process video signals in real time, the 1-D digital filters in the temporal direction must be causal, a constraint that does not apply to the horizontal and vertical directions. Since the proposed method can approximate the negative magnitude specifications obtained by the decomposition with causal 1-D IIR filters, the 1-D digital filters in the temporal direction can be causal. Therefore, the 3-D digital filters designed by the proposed method are suitable for real-time video signal processing. The designed 3-D digital filters have a parallel separable structure with high parallelism, regularity, and modularity, and are thus suitable for high-speed VLSI implementation.
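
    A minimal sketch of the parallel separable structure described above, not the paper's decomposition-based design: placeholder SciPy Butterworth filters stand in for the designed 1-D filters, each separable branch cascades 1-D filtering along the three axes of a video volume with the temporal branch kept causal, and the branch outputs are summed in parallel.

```python
import numpy as np
from scipy import signal

def separable_branch(video, wc_h, wc_v, wc_t):
    """One separable branch: cascaded 1-D filters along the three axes
    of a (frames, rows, cols) video volume. Butterworth designs are
    placeholders for the paper's decomposition-derived 1-D filters."""
    b, a = signal.butter(2, wc_h)
    y = signal.filtfilt(b, a, video, axis=2)   # horizontal: non-causal allowed
    b, a = signal.butter(2, wc_v)
    y = signal.filtfilt(b, a, y, axis=1)       # vertical: non-causal allowed
    b, a = signal.butter(2, wc_t)
    return signal.lfilter(b, a, y, axis=0)     # temporal: causal, for real time

def parallel_separable_filter(video, branch_cutoffs):
    """Parallel separable structure: independent separable branches are
    summed; each branch could run in parallel in hardware."""
    return sum(separable_branch(video, *c) for c in branch_cutoffs)

# Toy usage with two hypothetical branches (normalized cutoff frequencies).
video = np.random.rand(30, 64, 64)
out = parallel_separable_filter(video, [(0.5, 0.5, 0.3), (0.2, 0.2, 0.1)])
print(out.shape)  # (30, 64, 64)
```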

  • Concatenative Speech Synthesis Based on the Plural Unit Selection and Fusion Method

    Tatsuya MIZUTANI  Takehiko KAGOSHIMA  

     
    PAPER-Speech and Hearing
    Vol: E88-D No:11  Page(s): 2565-2572

    This paper proposes a novel speech synthesis method to generate human-like natural speech. The conventional unit-selection-based synthesis method selects speech units from a large database and concatenates them, with or without modifying the prosody, to generate synthetic speech. This approach yields highly human-like voice quality; however, a suitable speech unit is not always selected. Since selecting an unsuitable unit causes discontinuity between consecutive speech units, the quality of the synthesized speech deteriorates. One might expect the conventional method to attain higher speech quality as the database grows. However, preparing a larger database requires a longer recording time, and the narrator's voice quality does not remain constant throughout the recording period. This degrades the database quality and leaves the problem of unsuitable selection unsolved. We propose the plural unit selection and fusion method, which avoids this problem. This method integrates the unit fusion used in the unit-training-based method into the conventional unit-selection-based method. The proposed method selects plural speech units for each segment, fuses the selected units of each segment, modifies the prosody of the fused units, and concatenates them to generate synthetic speech. This unit fusion creates speech units that connect to one another with much less voice discontinuity, realizing high-quality speech. A subjective evaluation test showed that the proposed method greatly improves speech quality compared with the conventional method, and that the speech quality of the proposed method remains high regardless of the database size, from small (10 minutes) to large (40 minutes). The proposed method is a new framework in the sense that it is a hybrid of the unit-selection-based and unit-training-based methods. Within this framework, the unit selection and unit fusion algorithms can be exchanged for more efficient techniques, so the framework is expected to lead to new synthesis methods.
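
    A rough, self-contained sketch of the select-then-fuse idea above, using toy data structures: the cost function is a placeholder distance, candidates are assumed pre-aligned to the same pitch-cycle shape, and prosody modification is omitted, so this illustrates the control flow rather than the paper's actual algorithm.

```python
import numpy as np

def unit_cost(unit, target_feat):
    # Placeholder cost: distance between a unit's mean pitch-cycle
    # waveform and the target feature for this segment.
    return np.linalg.norm(unit.mean(axis=0) - target_feat)

def select_and_fuse(candidates, target_feat, n_best=3):
    # 1. Select the n_best candidate units with the lowest cost.
    ranked = sorted(candidates, key=lambda u: unit_cost(u, target_feat))
    best = ranked[:n_best]
    # 2. Fuse by averaging corresponding pitch-cycle waveforms
    #    (candidates are assumed pre-aligned to identical shapes here).
    return np.mean(np.stack(best), axis=0)

def synthesize(segments):
    # Fuse per segment, then concatenate; prosody modification omitted.
    fused = [select_and_fuse(cands, feat) for cands, feat in segments]
    return np.concatenate([u.ravel() for u in fused])

# Toy usage: two segments, four candidates each, 5 pitch cycles x 80 samples.
rng = np.random.default_rng(0)
segments = [([rng.standard_normal((5, 80)) for _ in range(4)],
             rng.standard_normal(80)) for _ in range(2)]
print(synthesize(segments).shape)  # (800,)
```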

  • Fast Concatenative Speech Synthesis Using Pre-Fused Speech Units Based on the Plural Unit Selection and Fusion Method

    Masatsune TAMURA  Tatsuya MIZUTANI  Takehiko KAGOSHIMA  

     
    PAPER-Speech and Hearing
    Vol: E90-D No:2  Page(s): 544-553

    We have previously developed a concatenative speech synthesizer based on the plural speech unit selection and fusion method, which can synthesize stable and human-like speech. In this method, plural speech units for each speech segment are selected using a cost function and fused by averaging pitch-cycle waveforms. The method has a large computational cost, yet some platforms require a speech synthesis system that works within limited hardware resources. In this paper, we propose an offline unit fusion method that reduces the computational cost. In the proposed method, speech units are fused in advance to build a pre-fused speech unit database. At synthesis time, a speech unit for each segment is selected from the pre-fused database, and the speech waveform is synthesized by prosodic modification and concatenation, without the computationally expensive unit fusion process. We compared several algorithms for constructing the pre-fused speech unit database. Subjective and objective evaluations confirm the effectiveness of the proposed method: the quality of synthetic speech from the offline unit fusion method with a 100 MB database is close to that of the online unit fusion method with a 93 MB JP database and slightly lower than that with a 390 MB US database, while the computation time is reduced by 80%. We also show that the frequency-weighted VQ-based method is effective for constructing the pre-fused speech unit database.
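
    A minimal sketch of the offline pre-fusion idea, under strong simplifying assumptions: units are clustered by plain k-means VQ over feature vectors (the paper's frequency weighting is not reproduced), each cluster is pre-fused by averaging its members, and the online step reduces to a nearest-codeword lookup with no fusion at synthesis time.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def build_prefused_db(units, feats, n_codewords=8):
    """Offline step: cluster unit features with VQ and pre-fuse each
    cluster by averaging its members' pitch-cycle waveforms."""
    centroids, labels = kmeans2(feats, n_codewords, minit='++', seed=0)
    db = {}
    for k in range(n_codewords):
        members = [u for u, lbl in zip(units, labels) if lbl == k]
        if members:                      # guard against empty VQ cells
            db[k] = np.mean(np.stack(members), axis=0)
    return centroids, db

def select_prefused(centroids, db, target_feat):
    """Online step: nearest-codeword lookup; no fusion is performed."""
    for k in np.argsort(np.linalg.norm(centroids - target_feat, axis=1)):
        if int(k) in db:                 # skip codewords with empty cells
            return db[int(k)]

# Toy usage: 100 candidate units of 5 pitch cycles x 80 samples each.
rng = np.random.default_rng(1)
units = [rng.standard_normal((5, 80)) for _ in range(100)]
feats = rng.standard_normal((100, 16))
centroids, db = build_prefused_db(units, feats)
print(select_prefused(centroids, db, rng.standard_normal(16)).shape)  # (5, 80)
```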