IEICE global.ieice.org Site

Author Search Result

[Author] Takemi MOCHIDA(2hit)

1-2hit

High Quality Synthetic Speech Generation Using Synchronized Oscillators
Kenji HASHIMOTO Takemi MOCHIDA Yasuaki SATO Tetsunori KOBAYASHI Katsuhiko SHIRAI

PAPER

Vol: E76-A No: 11
Page(s): 1949-1956
Producing high-quality synthetic sound in a text-to-speech system requires an excellent speech synthesis method. This paper proposes a new speech analysis-synthesis method for text-to-speech systems. Voiced speech signals, which have a line spectrum structure at pitch intervals in the linear frequency domain, can be represented approximately by a superposition of sinusoidal waves. Our system performs analysis and synthesis using this harmonic structure. In the analysis phase, an exact harmonic structure model at pitch intervals is assumed for the fine structure of the short-time power spectrum, and the fundamental frequency f0 is chosen to minimize the error of the log-power spectrum at each peak position. At the same time, the rate of periodicity of the speech signal is determined from the value of this minimized error. The log-power spectrum envelope is then represented by a cosine series interpolating the data sampled at every pitch period. In the synthesis phase, numerical solutions of non-linear differential equations that generate sinusoidal waves are used. For voiced sounds, these equations behave as a group of mutually synchronized oscillators, and the resulting sinusoidal waves are superposed to reconstruct the line spectrum structure. For voiceless sounds, the same non-linear differential equations work as passive filters with input noise sources. The system has the following characteristics. (1) Voiced and voiceless sounds can be treated in the same framework. (2) Since the phase and power of each sinusoidal wave can be easily controlled, periodic waveforms in voiced sounds can, if necessary, be precisely reproduced in the time domain. (3) The fundamental frequency f0 and phoneme duration can be easily changed without much degradation of the original sound quality.
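The harmonic representation mentioned in the abstract (voiced speech as a superposition of sinusoids at multiples of f0) can be illustrated with a minimal sketch. This is not the paper's synthesizer: the abstract describes sinusoids generated by numerically solving non-linear differential equations, whereas the sketch below simply evaluates cosines directly. The function name `synthesize_voiced`, its parameters, the 16 kHz sample rate, and the example amplitudes are all assumptions chosen for illustration.

```python
import math

def synthesize_voiced(f0, amplitudes, phases, duration, sample_rate=16000):
    """Sum harmonics k*f0 (k = 1, 2, ...) with the given amplitudes and phases.

    Hypothetical helper for illustration only; it evaluates cosines in closed
    form rather than integrating oscillator equations.
    """
    n_samples = int(round(duration * sample_rate))
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        sample = 0.0
        for k, (amp, phi) in enumerate(zip(amplitudes, phases), start=1):
            if k * f0 >= sample_rate / 2:
                break  # drop harmonics at or above the Nyquist frequency
            sample += amp * math.cos(2 * math.pi * k * f0 * t + phi)
        signal.append(sample)
    return signal

# Example: 20 ms of a 100 Hz tone with three harmonics (assumed values).
frame = synthesize_voiced(
    f0=100.0,
    amplitudes=[1.0, 0.5, 0.25],
    phases=[0.0, 0.0, 0.0],
    duration=0.02,
)
```

Because each harmonic's amplitude and phase is an explicit argument, changing f0 or the duration only requires calling the function with different values, which loosely mirrors characteristics (2) and (3) of the abstract.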
Changes in Calling Parties' Behavior Caused by Settings for Indirect Control of Call Duration under Disaster Congestion
Open Access
Daisuke SATOH Takemi MOCHIDA

PAPER-General Fundamentals and Boundaries

Publicized: 2022/05/10
Vol: E105-A No: 9
Page(s): 1358-1371
The road space rationing (RSR) method regulates a period in which a user group can make telephone calls in order to decrease the call attempt rate and induce calling parties to shorten their calls during disaster congestion. This paper investigates what settings of this indirect control induce more self-restraint and how the settings change calling parties' behavior using experimental psychology. Our experiments revealed that the length of the regulated period differently affected calling parties' behavior (call duration and call attempt rate) and indicated that the 60-min RSR method (i.e., 10 six-min periods) is the most effective setting against disaster congestion.
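The 60-min setting named in the abstract (10 six-min periods) can be sketched as a simple rotation schedule. The abstract does not say how user groups are formed or ordered, so the sketch assumes one group per period, served in order 0 to 9 and repeating every cycle; the function names `allowed_group` and `may_call` are hypothetical.

```python
def allowed_group(minutes_since_start, cycle_minutes=60, n_periods=10):
    """Return the index of the group allowed to call at the given time.

    Assumption: each cycle is split into equal periods and group i owns
    period i. The real assignment rule is not given in the abstract.
    """
    period_minutes = cycle_minutes / n_periods        # 6 min for the 60-min setting
    minute_in_cycle = minutes_since_start % cycle_minutes
    return int(minute_in_cycle // period_minutes)

def may_call(group, minutes_since_start, cycle_minutes=60, n_periods=10):
    """True if the (hypothetical) group index is in its allowed period."""
    return group == allowed_group(minutes_since_start, cycle_minutes, n_periods)
```

Under these assumptions, a caller in group 1 may call from minute 6 up to (but not including) minute 12 of each hour and must wait during the remaining nine periods.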
