IEICE TRANSACTIONS on Fundamentals


Volume E76-A No.11  (Publication Date:1993/11/25)

    Special Section on Speech Synthesis: Current Technologies and Their Application
  • FOREWORD

    Hirokazu SATO  

     
    FOREWORD

      Page(s):
    1891-1892
  • Significance of Suitability Assessment in Speech Synthesis Applications

    Hideki KASUYA  

     
    INVITED PAPER

      Page(s):
    1893-1897

    The paper indicates the importance of suitability assessment in speech synthesis applications. Human factors involved in the use of synthetic speech are first discussed on the basis of an example of a newspaper company where synthetic speech is extensively used as an aid for proofreading a manuscript. Some findings obtained from perceptual experiments on the subjects' preference for paralinguistic properties of synthetic speech are then described, focusing primarily on the suitability of pitch characteristics, speaker's gender, and speaking rates in the task where subjects are asked to proofread a printed text while listening to the speech. The paper finally claims the need for a flexible speech synthesis system which helps users create their own synthetic speech.

  • Physiologically-Based Speech Synthesis Using Neural Networks

    Makoto HIRAYAMA  Eric Vatikiotis-BATESON  Mitsuo KAWATO  

     
    PAPER

      Page(s):
    1898-1910

    This paper focuses on two areas in our effort to synthesize speech from neuromotor input using neural network models that effect transforms between cognitive intentions to speak, their physiological effects on vocal tract structures, and subsequent realization as acoustic signals. The first area concerns the biomechanical transform between motor commands to muscles and the ensuing articulator behavior. Using physiological data of muscle EMG (electromyography) and articulator movements during natural English speech utterances, three articulator-specific neural networks learn the forward dynamics that relate motor commands to the muscles and motion of the tongue, jaw, and lips. Compared to a fully-connected network, mapping muscle EMG and motion for all three sets of articulators at once, this modular approach has improved performance by reducing network complexity and has eliminated some of the confounding influence of functional coupling among articulators. Network independence has also allowed us to identify and assess the effects of technical and empirical limitations on an articulator-by-articulator basis. This is particularly important for modeling the tongue, whose complex structure is very difficult to examine empirically. The second area of progress concerns the transform between articulator motion and the speech acoustics. From the articulatory movement trajectories, a second neural network generates PARCOR (partial correlation) coefficients which are then used to synthesize the speech acoustics. In the current implementation, articulator velocities have been added as inputs to the network. As a result, the model now follows the fast changes of the coefficients for consonants generated by relatively slow articulatory movements during natural English utterances. Although much work still needs to be done, progress in these areas brings us closer to our goal of emulating speech production processes computationally.
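
    The PARCOR coefficients mentioned above drive an all-pole lattice synthesis filter. The following is a minimal sketch of that synthesis step under one common sign convention, not the authors' implementation (sign conventions for reflection coefficients vary between texts):

```python
def parcor_synthesize(excitation, k):
    # All-pole lattice synthesis filter driven by reflection (PARCOR)
    # coefficients k[0..M-1]; |k[i]| < 1 guarantees a stable filter.
    M = len(k)
    b = [0.0] * (M + 1)   # delayed backward residuals, one per stage
    out = []
    for x in excitation:
        f = x
        for i in range(M - 1, -1, -1):
            f = f - k[i] * b[i]           # forward path recursion
            b[i + 1] = b[i] + k[i] * f    # update backward path (delayed)
        b[0] = f                          # feed output back as stage-0 memory
        out.append(f)
    return out
```

    For a single coefficient k1 this reduces to y[n] = x[n] - k1*y[n-1], i.e. a first-order all-pole filter; with all coefficients zero the excitation passes through unchanged.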

  • Phoneme Power Control for Speech Synthesis

    Kenzo ITOH  Tomohisa HIROKAWA  Hirokazu SATO  

     
    PAPER

      Page(s):
    1911-1918

    This paper proposes a new method of phoneme power control for speech synthesis by rule. The innovation of this method lies in its use of the phoneme environment and the relationship between speech power and pitch frequency. First, the permissible threshold (PT) for power modification is measured by subjective experiments using power-manipulated speech material. As a result, it is concluded that the PT of power modification is 4.1 dB. This experimental result is significant when discussing power control and gives a criterion for power control accuracy. Next, the relationship between speech power and pitch frequency is analyzed using a very large speech database. The results show that the relationship between phoneme power and pitch frequency is affected by the kind of phoneme, the adjoining phonemes, rising or falling pitch, and initial or final position in the sentence. Finally, we propose that the phoneme power should be controlled by pitch frequency and phoneme environment. This proposal is implemented in a waveform concatenation type text-to-speech synthesizer. This new method yields an averaged root mean square error between real and estimated speech power of 2.17 dB. This value indicates that 94% of the estimated power values are within the permissible threshold of human perception.

  • Manifestation of Linguistic Information in the Voice Fundamental Frequency Contours of Spoken Japanese

    Hiroya FUJISAKI  Keikichi HIROSE  Noboru TAKAHASHI  

     
    PAPER

      Page(s):
    1919-1926

    Prosodic features of spoken Japanese play an important role in the transmission of linguistic information concerning the lexical word accent, the sentence structure and the discourse structure. In order to construct prosodic rules for synthesizing high-quality speech, therefore, prosodic features of speech should be quantitatively analyzed with respect to the linguistic information. With a special focus on the fundamental frequency contour, we first define four prosodic units for spoken Japanese, viz., prosodic word, prosodic phrase, prosodic clause and prosodic sentence, based on a decomposition of the fundamental frequency contour using a functional model for the generation process. Syntactic units are also introduced which have rough correspondence to these prosodic units. The relationships between the linguistic information and the characteristics of the components of the fundamental frequency contour are then described on the basis of results obtained by the analysis of two sets of speech material. Analysis of weathercast and newscast sentences showed that prosodic boundaries given by the manner of continuation/termination of phrase components fall into three categories, and are primarily related to the syntactic boundaries. On the other hand, analysis of noun phrases with various combinations of word accent types, syntactic structures, and focal conditions, indicated that the magnitude and the shape of the accent components, which of course reflect the information concerning the lexical accent types of constituent words, are largely influenced by the focal structure. The results also indicated that there are cases where prosody fails to meet all the requirements presented by word accent, syntax and discourse.
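
    The functional generation-process model referred to here is commonly formulated as a superposition, in the log-frequency domain, of a baseline value, phrase components, and accent components. A sketch of that formulation follows; the time constants and the accent ceiling are illustrative assumed values, not parameters taken from the paper:

```python
import math

def phrase(t, alpha=3.0):
    # Impulse response of the phrase control mechanism (critically damped,
    # second order); alpha is an assumed illustrative time constant.
    return alpha ** 2 * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent(t, beta=20.0, ceiling=0.9):
    # Step response of the accent control mechanism, clipped at a ceiling.
    return min(1 - (1 + beta * t) * math.exp(-beta * t), ceiling) if t >= 0 else 0.0

def f0(t, fb, phrases, accents):
    # phrases: list of (Ap, T0) command pairs; accents: list of (Aa, T1, T2).
    # Components superpose in the log domain on top of the baseline fb (Hz).
    v = math.log(fb)
    v += sum(ap * phrase(t - t0) for ap, t0 in phrases)
    v += sum(aa * (accent(t - t1) - accent(t - t2)) for aa, t1, t2 in accents)
    return math.exp(v)
```

    With no commands the contour stays at the baseline; a phrase or accent command raises the contour after its onset time and then decays.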

  • Prosodic Characteristics of Japanese Conversational Speech

    Nobuyoshi KAIKI  Yoshinori SAGISAKA  

     
    PAPER

      Page(s):
    1927-1933

    In this paper, we quantitatively analyzed speech data in seven different styles for natural Japanese conversational speech synthesis. Three reading styles were produced at different speeds (slow, normal and fast), and four speaking styles were produced by enacting conversation in different situations (free, hurried, angry and polite). To clarify the differences in prosodic characteristics between conversational speech and read speech, means and standard deviations of vowel duration, vowel amplitude and fundamental frequency (F0) were analyzed. We found large variation in these prosodic parameters. To look more precisely at the segmental duration and segmental amplitude differences between conversational speech and read speech, control rules of prosodic parameters in reading styles were applied to conversational speech. F0 contours of different speaking styles were superposed by normalizing the segmental duration. The differences between estimated values and actual values were analyzed. Large differences were found at sentence-final and key (focused) phrases. Sentence-final positions showed lengthening of segmental vowel duration and increased segmental vowel amplitude. Key phrase positions featured a raised F0.

  • Tree-Based Approaches to Automatic Generation of Speech Synthesis Rules for Prosodic Parameters

    Yoichi YAMASHITA  Manabu TANAKA  Yoshitake AMAKO  Yasuo NOMURA  Yoshikazu OHTA  Atsunori KITOH  Osamu KAKUSHO  Riichiro MIZOGUCHI  

     
    PAPER

      Page(s):
    1934-1941

    This paper describes the automatic generation of speech synthesis rules which predict a stress level for each bunsetsu in long noun phrases. The rules are inductively inferred from a large amount of speech data by using two kinds of tree-based methods, the conventional decision tree and the SBR-tree methods. The rule sets automatically generated by the two methods have almost the same performance and reduce the prediction error of the accent component value from 23 Hz to about 14 Hz. The rate of correct reproduction of the change for adjacent bunsetsu pairs is also used as a measure for evaluating the generated rule sets; they correctly reproduce about 80% of the changes. The effectiveness of the rule sets is verified through listening tests. With regard to the comprehensibility of the generated rules, the rules produced by the SBR-tree method are very compact, easy for human experts to interpret, and consistent with former studies.

  • Speech Segment Selection for Concatenative Synthesis Based on Spectral Distortion Minimization

    Naoto IWAHASHI  Nobuyoshi KAIKI  Yoshinori SAGISAKA  

     
    PAPER

      Page(s):
    1942-1948

    This paper proposes a new scheme for concatenative speech synthesis to improve the speech segment selection procedure. The proposed scheme selects a segment sequence for concatenation by minimizing acoustic distortions between the selected segment and the desired spectrum for the target without the use of heuristics. Four types of distortion, a) the spectral prototypicality of a segment, b) the spectral difference between the source and target contexts, c) the degradation resulting from concatenation of phonemes, and d) the acoustic discontinuity between the concatenated segments, are formulated as acoustic quantities, and used as measures for minimization. A search method for selecting segments from a large speech database is also described. In this method, a three-step optimization using dynamic programming is used to minimize the four types of distortion. A perceptual test shows that this proposed segment selection method with minimum distortion criteria produces high quality synthesized speech, and that contextual spectral difference and acoustic discontinuity at the segment boundary are important measures for improving the quality.
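
    The paper's three-step optimization is more elaborate, but its core idea, picking one segment per target position so that summed unit and concatenation distortions are minimal, can be sketched as a single Viterbi-style dynamic program. Here `unit_cost` and `join_cost` are placeholder callables standing in for the four distortion measures:

```python
def select_segments(candidates, unit_cost, join_cost):
    # candidates: one list of candidate segment ids per target position.
    # unit_cost(t, c): distortion between candidate c and target position t.
    # join_cost(a, b): discontinuity penalty for concatenating a then b.
    best = [{c: (unit_cost(0, c), None) for c in candidates[0]}]
    for t in range(1, len(candidates)):
        layer = {}
        for c in candidates[t]:
            prev, score = min(
                ((p, s + join_cost(p, c)) for p, (s, _) in best[t - 1].items()),
                key=lambda pair: pair[1])
            layer[c] = (score + unit_cost(t, c), prev)
        best.append(layer)
    # Backtrack from the cheapest final candidate.
    last, (total, _) = min(best[-1].items(), key=lambda kv: kv[1][0])
    path = [last]
    for t in range(len(candidates) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path, total
```

    The search is O(T·C²) for T targets with C candidates each; a large speech database mainly enlarges C, which is why the paper prunes with a multi-step search.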

  • High Quality Synthetic Speech Generation Using Synchronized Oscillators

    Kenji HASHIMOTO  Takemi MOCHIDA  Yasuaki SATO  Tetsunori KOBAYASHI  Katsuhiko SHIRAI  

     
    PAPER

      Page(s):
    1949-1956

    For the production of high quality synthetic sounds in a text-to-speech system, an excellent method of synthesizing speech signals is indispensable. In this paper, a new speech analysis-synthesis method for the text-to-speech system is proposed. The signals of voiced speech, which have a line spectrum structure at intervals of pitch in the linear frequency domain, can be represented approximately by the superposition of sinusoidal waves. In our system, analysis and synthesis are performed using such a harmonic structure of the signals of voiced speech. In the analysis phase, assuming an exact harmonic structure model at intervals of pitch against the fine structure of the short-time power spectrum, the fundamental frequency f0 is decided so as to minimize the error of the log-power spectrum at each peak position. At the same time, according to the value of the above minimized error, the rate of periodicity of the speech signal is determined. Then the log-power spectrum envelope is represented by a cosine series interpolating the data which are sampled at every pitch period. In the synthesis phase, numerical solutions of non-linear differential equations which generate sinusoidal waves are used. For voiced sounds, those equations behave as a group of mutually synchronized oscillators. These sinusoidal waves are superposed so as to reconstruct the line spectrum structure. For voiceless sounds, those non-linear differential equations work as passive filters with input noise sources. Our system has the following characteristics. (1) Voiced and voiceless sounds can be treated in the same framework. (2) Since the phase and the power information of each sinusoidal wave can be easily controlled, if necessary, periodic waveforms in the voiced sounds can be precisely reproduced in the time domain. (3) The fundamental frequency f0 and phoneme duration can be easily changed without much degradation of original sound quality.
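
    The superposition of sinusoids at harmonics of f0 can be sketched directly; this is plain additive synthesis, with explicit sine terms standing in for the paper's coupled non-linear oscillators:

```python
import math

def harmonic_synth(f0, amps, phases, sr, n_samples):
    # Superpose sinusoids at integer multiples of f0; amps[k] and phases[k]
    # belong to harmonic k+1, sampled from the spectral envelope.
    out = []
    for n in range(n_samples):
        t = n / sr
        out.append(sum(a * math.sin(2 * math.pi * (k + 1) * f0 * t + ph)
                       for k, (a, ph) in enumerate(zip(amps, phases))))
    return out
```

    Because each harmonic's amplitude and phase is an explicit parameter, pitch and duration changes amount to rescaling f0 and the time axis, which is the flexibility the abstract claims for the oscillator formulation.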

  • Power Control of a Terminal Analog Synthesizer Using a Glottal Model

    Mikio YAMAGUCHI  

     
    PAPER

      Page(s):
    1957-1963

    A terminal-analog synthesizer which uses a glottal model has already been proposed for rule-based speech synthesis, but the control strategy for glottal source intensity levels has not yet been defined. On the other hand, power-control rules which determine the target segmental power of synthetic speech have been proposed, based on statistical analysis of the power in natural speech. It is pointed out that there is a close correlation between observed fundamental frequency and power levels in natural speech; however, the theoretical reasons for this correlation have not been explained. This paper shows the relationship between fundamental frequency and resultant power in a terminal-analog synthesizer which uses a glottal model. From the equations it can be deduced that the tendency in natural speech for power to increase with fundamental frequency can be closely simulated by the sum of the effect of the radiation characteristic and the effect of the synthesis system's vocal tract transfer function. In addition, this paper proposes a method for adjusting the power of synthetic speech to any desired value. This control method can be executed in real-time.

  • High Quality Speech Synthesis System Based on Waveform Concatenation of Phoneme Segment

    Tomohisa HIROKAWA  Kenzo ITOH  Hirokazu SATO  

     
    PAPER

      Page(s):
    1964-1970

    A new system for speech synthesis by concatenating waveforms selected from a dictionary is described. The dictionary is constructed from two hours of speech that include isolated words and sentences uttered by one male speaker, and contains over 45,000 entries which are identified by their average pitch, a dynamic pitch parameter which represents micro pitch structure in a segment, duration and average amplitude. Phoneme duration is set according to phoneme environment, and phoneme power is controlled by both pitch frequency and phoneme environment. Tests show the average errors in vowel duration and consonant duration are 28.8 ms and 16.8 ms respectively, and the vowel power average error is 2.9 dB. The pitch frequency patterns are calculated according to a conventional model in which the accent component is added to a gross phrase component. Given a phoneme string and prosody information, the optimum waveforms are selected from the dictionary by matching their attributes with the given phonetic and prosodic information. A waveform selection function, which has two terms corresponding to prosodic and phonological coincidence between rule-set values and waveform values from the dictionary, is proposed. The weight coefficients used in the selection function are determined through subjective hearing tests. The selected waveform segments are then modified in the waveform domain to further adjust for the desired prosody. A pitch frequency modification method based on the pitch synchronous overlap-add technique is introduced into the system. Lastly, the waveforms are interpolated between voiced waveforms to avoid abrupt changes in voice spectrum and waveform shape. A five-grade absolute evaluation test is performed on the synthesized voice; the mean score is 3.1, above "good," and the original speaker quality is retained.
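
    The pitch synchronous overlap-add step can be sketched as follows: Hanning-windowed two-period grains are cut around the original pitch marks and re-spaced at the desired period. This is a toy sketch assuming a constant original pitch period, with simplified boundary handling:

```python
import math

def psola(signal, marks, new_period):
    # marks: sample indices of the original pitch marks (assumed evenly
    # spaced); new_period: desired pitch period in samples.
    period = marks[1] - marks[0]              # assumed constant original period
    grains = []
    for m in marks:
        g = []
        for i in range(-period, period):      # two-period Hanning grain
            w = 0.5 * (1 + math.cos(math.pi * i / period))
            s = signal[m + i] if 0 <= m + i < len(signal) else 0.0
            g.append(w * s)
        grains.append(g)
    out = [0.0] * (new_period * (len(marks) - 1) + 2 * period)
    for j, g in enumerate(grains):            # re-space grains at new period
        start = j * new_period
        for i, v in enumerate(g):
            out[start + i] += v
    return out
```

    With new_period equal to the original period the overlapping Hanning windows sum to one, so the interior of the signal is reproduced; a smaller new_period raises the pitch while reusing the original spectral envelope of each grain.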

  • A System for the Synthesis of High-Quality Speech from Texts on General Weather Conditions

    Keikichi HIROSE  Hiroya FUJISAKI  

     
    PAPER

      Page(s):
    1971-1980

    A text-to-speech conversion system for Japanese has been developed for the purpose of producing high-quality speech output. This system consists of four processing stages: 1) linguistic processing, 2) phonological processing, 3) control parameter generation, and 4) speech waveform generation. Although the processing at the first stage is restricted to texts on general weather conditions, the other three stages can also cope with texts of news and narrations on other topics. Since the prosodic features of speech are largely related to the linguistic information, such as word accent, syntactic structure and discourse structure, linguistic processing over a wider range than before, at least a full sentence, is indispensable to obtain good quality speech with respect to the prosody. From this point of view, input text was restricted to weather forecast sentences and a method for linguistic processing was developed to conduct morphological, syntactic and semantic analyses simultaneously. A quantitative model for generating fundamental frequency contours was adopted so that the linguistic information is well reflected in the prosody of the synthetic speech. A set of prosodic rules was constructed to generate prosodic symbols representing prosodic structures of the text from the linguistic information obtained at the first stage. A new speech synthesizer based on the terminal analog method was also developed to improve the segmental quality of synthetic speech. It consists of four paths of cascade connection of pole/zero filters and three waveform generators. The four paths are respectively used for the synthesis of vowels and vowel-like sounds, nasal murmur and buzz bar, friction, and plosion, while the three generators produce a voicing source waveform approximated by polynomials, a white Gaussian noise source for fricatives and an impulse source for plosives.
The validity of the approach above has been confirmed by the listening tests using speech synthesized by the developed system. Improvements both in the quality of prosodic features and in the quality of segmental features were realized for the synthetic speech.

  • A Portable Text-to-Speech System Using a Pocket-Sized Formant Speech Synthesizer

    Norio HIGUCHI  Tohru SHIMIZU  Hisashi KAWAI  Seiichi YAMAMOTO  

     
    PAPER

      Page(s):
    1981-1989

    The authors developed a portable Japanese text-to-speech system using a pocket-sized formant speech synthesizer. It consists of a linguistic processor and an acoustic processor. The linguistic processor runs on an MS-DOS personal computer and has functions to determine readings and prosodic information for input sentences written in kana-kanji-mixed style. New techniques, such as minimization of a cost function for phrases, a rare-compound flag, semantic information, information on reading selection and restriction by associated particles, are used to increase the accuracy of readings and accent positions. The accuracy of determining readings and accent positions is 98.6% for sentences in newspaper articles. It is possible to use the linguistic processor through an interface library which has also been developed by the authors. Consequently, it has become possible not only to convert whole texts stored in text files but also to convert parts of sentences sent by the interface library sequentially, and the readings and prosodic information are optimized for the whole sentence at one time. The acoustic processor is custom-made hardware, and it adopts new techniques for the improvement of rules for vowel devoicing, control of phoneme durations, control of the phrase components of voice fundamental frequency and the construction of the acoustic parameter database. Due to the above-mentioned modifications, the naturalness of synthetic speech generated by a Klatt-type formant speech synthesizer was improved. On a naturalness test it was rated 3.61 on a 5-point scale from 0 to 4.

  • Development of a Rule-Based Speech Synthesizer Module for Embedded Use

    Mikio YAMAGUCHI  John-Paul HOSOM  

     
    PAPER

      Page(s):
    1990-1998

    A module for rule-based Japanese speech synthesis has been developed. The synthesizer was constructed using the Multiple-Cascade Terminal Analog (MCTA) structure, and this structure has been improved in three respects: the voicing-source model has an increased number of variable parameters which allows for voicing-source waveforms that better approximate natural speech; the spectral characteristics of the fricative source have been improved; and the path used for nasal consonants has an increased number of resonators to better conform to theory. The current synthesis system uses a modified stored-pattern data structure which allows better transitions between syllables; however, time-invariant values are used in certain cases in order to decrease the amount of required memory. This system also has a new consolidated method for generating geminate obstruents and syllabic nasals. This synthesizer and synthesis system have been implemented in a re-developed rule-based speech-synthesis module. This module has been constructed using ASIC technology and has both small size (56 × 36 × 8 mm) and light weight (19 g); it is therefore possible to embed it in various types of portable or moving machinery. The module can be connected directly to a microprocessor bus and accepts as input sentences which are generated by the host computer. The input sentences are written with the Japanese katakana or romaji syllabaries and other symbols which describe the sentence structure. The syllable articulation rate for one hundred Japanese syllables (including palatalized sounds) is 65% and for sixty-seven syllables (not including palatalized sounds) is 74%. The word intelligibility, measured using phonetically-balanced words, is 88%.

  • Development of TTS Card for PCs and TTS Software for WSs

    Yoshiyuki HARA  Tsuneo NITTA  Hiroyoshi SAITO  Ken'ichiro KOBAYASHI  

     
    PAPER

      Page(s):
    1999-2007

    Text-to-speech synthesis (TTS) is currently one of the most important media conversion techniques. In this paper, we describe a Japanese TTS card developed for constructing a personal-computer-based multimedia platform, and a TTS software package developed for a workstation-based multimedia platform. Some applications of this hardware and software are also discussed. The TTS consists of a linguistic processing stage for converting text into phonetic and prosodic information, and a speech processing stage for producing speech from the phonetic and prosodic symbols. The linguistic processing stage uses morphological analysis, rewriting rules for accent movement and pause insertion, and other techniques to impart correct accentuation and a natural-sounding intonation to the synthesized speech. The speech processing stage employs the cepstrum method with consonant-vowel (CV) syllables as the synthesis unit to achieve clear and smooth synthesized speech. All of the processing for converting Japanese text (consisting of mixed Japanese Kanji and Kana characters) to synthesized speech is done internally on the TTS card. This allows the card to be used widely in various applications, including electronic mail and telephone service systems without placing any processing burden on the personal computer. The TTS software was used for an E-mail reading tool on a workstation.

  • Regular Section
  • Optimal Sorting Algorithms on Bus-Connected Processor Arrays

    Koji NAKANO  

     
    PAPER-Computer Aided Design (CAD)

      Page(s):
    2008-2015

    This paper presents a parallel sorting algorithm which sorts n elements in O(n/w + (n log n)/p) time using p (p ≤ n) processors arranged in a 1-dimensional grid with w (w ≤ n^(1-ε)) buses, for every fixed ε > 0. Furthermore, it is shown that n·p elements can be sorted in O(n/w + (n log n)/p) time on p × p (p ≤ n) processors arranged in a 2-dimensional grid with w (w ≤ n^(1-ε)) buses in each column and in each row. These algorithms are optimal because their time complexities match the lower bounds.

  • Soft-Decision Decoding Algorithm for Binary Linear Block Codes

    Yong Geol SHIM  Choong Woong LEE  

     
    PAPER-Information Theory and Coding Theory

      Page(s):
    2016-2021

    A soft-decision decoding algorithm for binary linear block codes is proposed. This algorithm seeks to minimize the block error probability. With careful examination of the first hard-decision decoding results, the candidate codewords are searched for efficiently. Thus, we can reduce the decoding complexity (the number of hard-decision decodings) and lower the block error probability. Computer simulation results are presented for the (23, 12) Golay code. They show that the decoding complexity is considerably reduced and the block error probability is close to that of the maximum likelihood decoder.
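
    The idea of searching candidate codewords around a first hard decision can be illustrated in the style of a Chase decoder: flip subsets of the least reliable positions, hard-decode each trial, and keep the codeword closest to the received soft values. The paper's own search strategy differs in detail; `hard_decode` is a placeholder, shown below with a trivial repetition-code majority decoder rather than a Golay decoder:

```python
from itertools import combinations

def chase_decode(r, hard_decode, n_flip=2):
    # r: received soft values under BPSK mapping bit b -> 1 - 2b.
    # hard_decode: maps a hard bit vector to a codeword (or None on failure).
    hard = [1 if x < 0 else 0 for x in r]
    # Positions with the smallest |r[i]| are the least reliable.
    weakest = sorted(range(len(r)), key=lambda i: abs(r[i]))[:n_flip]
    best, best_d = None, float('inf')
    for k in range(n_flip + 1):
        for pos in combinations(weakest, k):
            trial = hard[:]
            for i in pos:
                trial[i] ^= 1
            cw = hard_decode(trial)
            if cw is None:
                continue
            # Squared Euclidean distance between r and the modulated codeword.
            d = sum((x - (1 - 2 * b)) ** 2 for x, b in zip(r, cw))
            if d < best_d:
                best, best_d = cw, d
    return best
```

    Only 2^n_flip hard decodings are performed per block, which is the complexity-versus-error-rate trade-off the abstract refers to.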

  • Design of a Multiplier-Accumulator for High Speed Image Filtering

    Farhad Fuad ISLAM  Keikichi TAMARU  

     
    PAPER-VLSI Design Technology

      Page(s):
    2022-2032

    Multiplication-accumulation is the basic computation required for image filtering operations. For real-time image filtering, very high throughput computation is essential. This work proposes a hardware algorithm for an application-specific VLSI architecture which realizes an area-efficient, high-throughput multiplier-accumulator. The proposed algorithm utilizes a priori knowledge of the filter mask coefficients and optimizes the number of basic hardware components (e.g., full adders, pipeline latches, etc.). This results in a minimum-area VLSI architecture under certain input/output constraints.

  • Using FFT for Error Correction Decoders

    Farokh MARVASTI  

     
    LETTER-Analog Circuits and Signal Processing

      Page(s):
    2033-2035

    The Discrete Fourier Transform (DFT) is used for error detection and correction. An iterative decoder is proposed for erasures and impulsive noise which also works with a moderate amount of additive random noise. The iterative method is very simple and efficient, consisting of modules of Fast Fourier Transforms (FFTs) and inverse FFTs. This iterative decoder can be implemented in a feedback configuration.

  • Single-Shot Evaluation of Stability Hypercube and Hyperball in Polynomial Coefficient Space

    Takehiro MORI  Hideki KOKAME  

     
    LETTER-Control and Computing

      Page(s):
    2036-2038

    A quick evaluation method is proposed to obtain stability robustness measures in polynomial coefficient space, based on knowledge of the coefficients of a Hurwitz-stable nominal polynomial. Two norms are employed, the l∞-norm and the l2-norm, which correspond to the stability hypercube and hyperball in the space, respectively. Simply inverting the Hurwitz matrix of the nominal polynomial immediately yields closed-form estimates for the size of the hypercube and hyperball.