Ryuki TACHIBANA Tohru NAGANO Gakuto KURATA Masafumi NISHIMURA Noboru BABAGUCHI
Automatic prosody labeling is the task of automatically annotating prosodic labels such as syllable stresses or break indices into speech corpora. Prosody-labeled corpora are important for speech synthesis and automatic speech understanding. However, the subtlety of the physical features makes accurate labeling difficult. Since errors in the prosodic labels can lead to incorrect prosody estimation and unnatural synthetic sound, the accuracy of the labels is a key factor for text-to-speech (TTS) systems. In particular, mora accent labels relevant to pitch are very important for Japanese, since Japanese is a pitch-accent language and Japanese people have a particularly keen sense of pitch accents. However, determining the mora accents of Japanese is in some respects more difficult than English stress detection. This is because the context of a word changes the mora accents within the word, unlike English, where the stress normally falls on the lexical primary stress of a word. In this paper, we propose a method that can accurately determine the prosodic labels of Japanese using both acoustic and linguistic models. A speaker-independent linguistic model provides mora-level knowledge about the possible correct accentuations in Japanese, and helps reduce the required size of the speaker-dependent speech corpus for training the other stochastic models. Our experiments show the effectiveness of the combination of models.
Open-vocabulary name recognition is one of the most challenging tasks in the application of automatic Chinese speech recognition technology. It can be used as a free name input method for telephony speech applications and automatic directory assistance systems. A Chinese name usually has two to three characters, each of which is pronounced as a single tonal syllable. Recognizing a three-syllable word among millions to billions of possible candidates is therefore very difficult. A novel interactive automatic-speech-recognition system is proposed to resolve this highly challenging task. This system was built as an open-vocabulary Chinese name recognition system using character-based approaches. Two important character-input speech-recognition modules were designed as backoff approaches in this system to complete the name input or to correct any misrecognized characters. Finite-state networks were compiled from regular grammars of syllable spellings and character descriptions for these two speech recognition modules. The possible candidate names number more than five billion. This system has been tested publicly and proved to be a robust way to interact with the speaker. An 86.7% name recognition success rate was achieved by the interactive open-vocabulary Chinese name input system.
Incheol KIM Kicheol KIM Youbean KIM HyeonUk SON Sungho KANG
A new BIST (Built-in Self-test) method for static ADC testing is proposed. The proposed method detects offset, gain, INL (Integral Non-linearity) and DNL (Differential Non-linearity) errors with a low hardware overhead. Moreover, it can solve a transient zone problem which is derived from the ADC noise in real test environments.
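As a rough illustration of the four static errors named above, the following Python sketch computes offset, gain, DNL and INL from a list of measured code-transition voltages. It uses common textbook definitions and is not the BIST circuit of the paper; the function name `static_errors`, its parameters and the sample values are hypothetical.

```python
def static_errors(transitions, ideal_lsb, ideal_first):
    """Compute static ADC errors (all in LSB) from code-transition voltages.

    transitions: measured voltages at which the output code changes (ascending).
    ideal_lsb:   ideal code width in volts.
    ideal_first: ideal voltage of the first transition.
    """
    # Offset error: shift of the first transition from its ideal position.
    offset = (transitions[0] - ideal_first) / ideal_lsb
    # Gain error: deviation of the measured span from the ideal span.
    n_steps = len(transitions) - 1
    gain = (transitions[-1] - transitions[0]) / ideal_lsb - n_steps
    # DNL: deviation of each code width from one ideal LSB.
    dnl = [(b - a) / ideal_lsb - 1.0
           for a, b in zip(transitions, transitions[1:])]
    # INL: running sum of DNL, i.e. accumulated deviation of each
    # transition relative to the first one.
    inl = []
    total = 0.0
    for d in dnl:
        total += d
        inl.append(total)
    return offset, gain, dnl, inl


# Hypothetical 2-bit example with a 1 V ideal code width.
offset, gain, dnl, inl = static_errors([0.5, 1.5, 2.75, 3.5], 1.0, 0.5)
print(offset, gain, dnl, inl)  # 0.0 0.0 [0.0, 0.25, -0.25] [0.0, 0.25, 0.0]
```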
Young-In SONG Kyoung-Soo HAN So-Young PARK Sang-Bum KIM Hae-Chang RIM
In this paper, we propose two weighting techniques to improve the performance of query expansion in biomedical document retrieval, especially when a short biomedical term in a query is expanded with its synonymous multi-word terms. When a query contains synonymous terms of different lengths, a traditional IR model ranks a document containing the longer term more highly, because a longer term has more chances to match the query. However, such a preference is clearly inappropriate and often yields unsatisfactory results. To alleviate this weighting bias, we devise a method of normalizing the weights of query terms in a long multi-word biomedical term, and a method of discriminating terms by using inverse terminology frequency, a novel statistic estimated in the query domain. Experimental results on the MEDLINE corpus show that our two simple techniques improve retrieval performance by correcting the undue preference for long multi-word terms in an expanded query.
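The length bias described above can be illustrated with a minimal Python sketch. The abstract does not give the normalization formula, so the rule used here (each word of an n-word term carries 1/n of that term's unit weight) is only an assumed example; `normalized_query_weights` is a hypothetical name, and inverse terminology frequency is not modeled.

```python
def normalized_query_weights(synonyms):
    """Give every synonym the same total weight by splitting it across its words."""
    weights = {}
    for term in synonyms:
        words = term.lower().split()
        for w in words:
            # Assumption: each word of an n-word term gets weight 1/n,
            # so a long synonym cannot outweigh a short one in total.
            weights[w] = weights.get(w, 0.0) + 1.0 / len(words)
    return weights


# A short term expanded with a four-word synonym.
print(normalized_query_weights(["AIDS", "acquired immune deficiency syndrome"]))
```

Without normalization the four-word synonym would contribute four unit-weight query words against one for the short form, which is the preference the abstract calls inappropriate.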
Woo-Seob KIM Jong-Hwan OH Chan-Ho HAN Kil-Houm PARK
We propose a filtering method for optimal estimation of the TFT-LCD surface region excluding defect regions. To estimate the non-uniform intensity variation over the TFT-LCD surface region, a 4-directional Gaussian filter based on an image pyramid structure is proposed. The experimental results verify the performance of the proposed method.
Shinkichi INAGAKI Koudai HAYASHI Tatsuya SUZUKI
This paper presents a new strategy to detect and diagnose faults of a manipulator based on an expression with a Probabilistic Production Rule (PPR). The Production Rule (PR) is widely used in the field of computer science as a tool for formal verification. In this work, PR is first used to represent the mapping between highly quantized input and output signals of the dynamical system. By using the PR expression, the fault detection and diagnosis algorithm can be implemented with less computational effort. In addition, we introduce a new system description with Probabilistic PR (PPR), in which an occurrence probability is assigned to each PR to improve robustness at a small computational cost. The probability is derived from the statistical characteristics of the observed input and output signals. Then, the fault detection and diagnosis algorithm is developed based on calculating the log-likelihood of the measured data for the designed PPR. Finally, experiments on a controlled manipulator are presented to confirm the usefulness of the proposed method.
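The log-likelihood step can be sketched as follows in Python. This is one plausible reading of the abstract, not the authors' algorithm: each model is assumed to be a table mapping a quantized (input, output) symbol pair to an occurrence probability, and diagnosis picks the model that best explains the data. The names `log_likelihood` and `diagnose`, the floor probability and the sample tables are all hypothetical.

```python
import math


def log_likelihood(observations, rule_prob, floor=1e-6):
    """Sum of log-probabilities of observed (input, output) symbol pairs.

    rule_prob maps (input_symbol, output_symbol) -> occurrence probability.
    Pairs not covered by any rule get a small floor probability (an assumption).
    """
    return sum(math.log(rule_prob.get(pair, floor)) for pair in observations)


def diagnose(observations, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(observations, models[name]))


# Hypothetical rule tables for a normal mode and one fault mode.
models = {
    "normal": {("u0", "y0"): 0.9, ("u0", "y1"): 0.1},
    "fault":  {("u0", "y0"): 0.2, ("u0", "y1"): 0.8},
}
observed = [("u0", "y1")] * 3 + [("u0", "y0")]
print(diagnose(observed, models))  # fault
```

Because the signals are quantized to a few symbols, evaluating a model reduces to table lookups and additions, which is consistent with the low computational burden the abstract emphasizes.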
Yusuke HIROTA Hideki TODE Koso MURAKAMI
In Optical Burst Switching (OBS) networks, one of the main problems is collision between bursts. Most of the previous collision avoidance algorithms divide the Routing and Wavelength Assignment (RWA) problem into two partial problems and treat them separately. This paper focuses on the collision avoidance problem in distributed OBS networks. Our proposal involves cooperation between the routing and the wavelength assignment tasks. The main idea is to classify each wavelength at an output link of a node as suited either to sending or to relaying data bursts. The wavelength most suitable for transmitting bursts changes along the transmission route. Thus, we introduced a novel index called the "Suitability Index" (SI). The SI is a priority index assigned to each pair of output link and wavelength, and its value represents the suitability of that pair for sending or relaying data bursts. The proposed method uses the SI for both routing selection and wavelength assignment. Simulation results show that the proposed method can reduce the burst loss probability, particularly for long distance transmissions. As a result, unfairness in the treatment of short hop and long hop bursts can be reduced.
Jong-Hwan OH Woo-Seob KIM Chan-Ho HAN Kil-Houm PARK
The thin film transistor liquid crystal display (TFT-LCD) image has nonuniform brightness, which is a major difficulty in finding the Mura defect region. To facilitate Mura segmentation, the widely varying global background signal must be flattened and the Mura signal must then be enhanced. In this paper, Mura signal enhancement and background-signal-flattening methods using wavelet coefficient processing are proposed. The wavelet approximation coefficients are used for background-signal flattening, while the wavelet detail coefficients are employed to magnify the Mura signal on the basis of an adapted contrast sensitivity function (CSF). Then, for the enhanced image, a trimodal thresholding segmentation technique and a false-region elimination method based on the human visual system (HVS) are employed for reliable Mura segmentation. The experimental results show that the proposed algorithms produce promising results and can be applied to automated inspection systems for finding Muras in a TFT-LCD image.
In OFDM systems, pilot-signal-averaging channel estimation is generally used to identify the channel state information (CSI). In this case, a large number of pilot symbols is required to obtain an accurate CSI. As a result, the total transmission rate is degraded by the transmission of many pilot symbols. To mitigate this problem, in this paper we propose time-frequency interferometry (TFI) for OFDM to achieve an accurate CSI.
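For context, the conventional pilot-averaging baseline (not the proposed TFI scheme) can be sketched in a few lines of Python. The sketch assumes a channel that stays constant over the pilot symbols and a known non-zero pilot on every subcarrier; `estimate_channel` and the sample values are hypothetical.

```python
def estimate_channel(received_pilots, known_pilot):
    """Average least-squares channel estimates over repeated pilot symbols.

    received_pilots: list of pilot symbols, each a list of complex values
                     (one per subcarrier).
    known_pilot:     transmitted pilot values per subcarrier (non-zero).
    """
    n_sym = len(received_pilots)
    estimate = []
    for k, x in enumerate(known_pilot):
        # The per-symbol least-squares estimate is Y/X; averaging over
        # symbols suppresses the additive noise.
        acc = sum(received_pilots[m][k] / x for m in range(n_sym))
        estimate.append(acc / n_sym)
    return estimate


# Two subcarriers, two pilot symbols with opposite noise samples.
pilot = [1 + 0j, -1 + 0j]
rx = [[0.75 + 0.5j, -1.75j], [0.25 + 0.5j, -2.25j]]
print(estimate_channel(rx, pilot))
```

The noise variance of the averaged estimate falls only in proportion to the number of pilot symbols, which is why accuracy under this baseline costs transmission rate.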
Koichi KITAMURA Yukitoshi SANADA
Impulse Radio (IR)-Ultra Wideband (UWB) enables accurate ranging owing to its very short pulses. Therefore, UWB may provide accurate positioning capability. To reduce the complexity of circuit implementation, UWB systems with low-resolution analog-to-digital converters (ADCs) have been investigated. In this paper, the accuracy of UWB positioning with comparators is investigated through experiment. The accuracy of positioning with comparators is compared to that with 8 [bit] ADCs, and the effectiveness of the system with comparators is confirmed within an area of 1.8 × 1.8 [m].
Manabu ITO Masato KON Chihiro MIYAZAKI Noriaki IKEDA Mamoru ISHIZAKI Yoshiko UGAJIN Norimasa SEKINE
We demonstrate a novel display structure for color electronic paper for the first time. A fully transparent amorphous oxide TFT array is directly deposited onto a color filter array and combined with E Ink Imaging Film. Taking advantage of the transparency of the oxide TFT, the color filter and TFT array are positioned at the viewing side of the display. This novel "Front Drive" display structure greatly facilitates the alignment of the color filter and the TFT array.
Chung-chi LIN Ming-hwa SHEU Huann-keng CHIANG Chih-Jen WEI Chishyan LIAW
Scene changes occur frequently in film broadcasting, and tend to degrade performance with blurring, jaggedness, and other artifacts when de-interlacing methods are applied. This paper presents an efficient VLSI architecture for video de-interlacing that takes scene changes into account to improve the quality of the video results. The de-interlacing architecture contains three main parts. The first is scene change detection, which is based on examining the absolute pixel difference between two adjacent even or odd fields. The second is a background index mechanism for classifying motion and non-motion pixels of the input field. The third component, a spatial-temporal edge-based median filter, handles the interpolation of the motion pixels. Compared with existing de-interlacing approaches, our architecture significantly improves the PSNRs of video sequences with various scene changes; in other situations, it also maintains better performance. The proposed architecture has been implemented as a VLSI chip based on a UMC 0.18-µm CMOS technology process. The total gate count is 30114 and the layout area is about 710 × 710 µm. The power consumption is 39.78 mW at a working frequency of 128.2 MHz, which is sufficient to de-interlace HDTV in real time.
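The first stage, scene change detection from the absolute pixel difference of two adjacent same-parity fields, might look like the Python sketch below. The abstract does not say how the differences are aggregated or thresholded, so the mean absolute difference and the threshold value used here are assumptions, and `is_scene_change` is a hypothetical name.

```python
def is_scene_change(prev_field, curr_field, threshold):
    """Flag a scene change when the mean absolute pixel difference between
    two adjacent same-parity fields exceeds a threshold (assumed criterion)."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev_field, curr_field):
        for p, c in zip(row_prev, row_curr):
            total += abs(p - c)
            count += 1
    return count > 0 and total / count > threshold


# Tiny 2x2 luminance fields: a small change versus a cut to a bright scene.
base = [[10, 10], [10, 10]]
similar = [[12, 9], [10, 11]]
cut = [[200, 180], [190, 210]]
print(is_scene_change(base, similar, 20))  # False
print(is_scene_change(base, cut, 20))      # True
```

A hardware version would accumulate the same sum with a subtractor, an absolute-value unit and an adder per pixel clock.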
Hideaki KURATA Satoshi NODA Yoshitaka SASAGO Kazuo OTSUGA Tsuyoshi ARIGANE Tetsufumi KAWAMURA Takashi KOBAYASHI Hitoshi KUME Kazuki HOMMA Teruhiko ITO Yoshinori SAKAMOTO Masahiro SHIMIZU Yoshinori IKEDA Osamu TSUCHIYA Kazunori FURUSAWA
A 4-Gb AG-AND flash memory was fabricated by using a 90-nm CMOS technology. To reduce cell size, an inversion-layer-bit-line technology was developed, enabling the elimination of both shallow trench isolations and diffusion layers from the memory cells. The inversion-layer-bit-line technology combined with a multilevel cell technique achieved a bit area of 2F² (0.0162 µm²), resulting in a chip size of 126 mm². Both address and temperature compensation techniques control the resistance of the inversion-layer local bit line. Source-side hot-electron injection programming with self-boosted charge, accumulated in inversion-layer bit lines under assist gates, reduces the dispersion of programming characteristics and also reduces the time overhead of pre-charging the bit lines. This self-boosted charge-injection scheme achieves a programming throughput of 10 MB/s.
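The quoted bit area can be checked with simple arithmetic, shown here as a short Python sketch. Taking F as the 90-nm feature size reproduces the stated 0.0162 µm²; the cell-array area that follows is only an estimate that assumes 1 Gb = 2**30 bits and ignores all peripheral circuitry.

```python
F_UM = 0.09                       # feature size: 90 nm expressed in micrometres
bit_area_um2 = 2 * F_UM ** 2      # 2F^2 per bit

BITS = 4 * 2 ** 30                # 4 Gb, assuming 1 Gb = 2**30 bits
array_area_mm2 = bit_area_um2 * BITS / 1e6   # 1 mm^2 = 1e6 um^2

print(round(bit_area_um2, 4))     # 0.0162
print(round(array_area_mm2, 1))   # 69.6
```

Under these assumptions the memory cells alone would occupy roughly 70 mm² of the 126 mm² chip.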
This paper discusses the design of a Generalized Predictive Control (GPC) scheme. GPC is designed for two cases: the first is a dual-rate (DR) system, where the sampling interval of the plant output is an integer multiple of the holding interval of the control input, and the second is a fast-rate single-rate (FR-SR) system, where both the holding and sampling intervals are equal to the holding interval of the DR system. Furthermore, the relation between them is investigated, and this study gives the conditions under which FR-SR and DR GPC become equivalent. To this end, the future reference trajectory of DR GPC is rewritten, and the future predictive output of FR-SR GPC is rearranged.
Chul Ho WON Dong Hoon KIM Jyung Hyun LEE Sang Hyo WOO Yeon Kwan MOON Jinho CHO
This paper proposes a region-based curve control function to detect the brain ventricle area by utilizing a geodesic active contour model. The function is based on the average brightness of the brain ventricle area, which appears brighter in MRI images. Numerical comparison using various measures shows that the proposed method detects the brain ventricle area better than existing methods.
Al-Sakib Khan PATHAN Choong Seon HONG
The intent of this letter is to propose an efficient timestamp-based password authentication scheme using smart cards. We show various types of forgery attacks against Shen et al.'s timestamp-based password authentication scheme and improve their scheme to ensure robust security for the remote authentication process, while keeping all the advantages of their scheme. Our scheme successfully defends against the attacks that could be launched against other related previous schemes.
Kiyoshi NOSU Ayako KANDA Takeshi KOIKE
Eye tracking is a useful tool for accurately mapping where and for how long an individual learner looks at a video/image, in order to obtain immediate information regarding the distribution of a learner's attention among the elements of a video/image. This paper describes a quantitative investigation into the effect of voice navigation in web-based learning materials.
Ayu PURWARIANTI Masatoshi TSUCHIYA Seiichi NAKAGAWA
We have built a CLQA (Cross Language Question Answering) system for a source language with limited data resources (e.g. Indonesian) using a machine learning approach. The CLQA system consists of four modules: question analyzer, keyword translator, passage retriever and answer finder. We used machine learning in two modules, the question classifier (part of the question analyzer) and the answer finder. In the question classifier, we classify the EAT (Expected Answer Type) of a question by using an SVM (Support Vector Machine). Features for the classification module are basically the output of our shallow question parsing module. To improve the classification score, we use statistical information extracted from our Indonesian corpus. In the answer finder module, instead of the common approach in which the answer is located by matching the named entity of the word corpus with the EAT of the question, we locate the answer by text chunking the word corpus. The features for the SVM-based text chunking process consist of question features, word corpus features and similarity scores between the word corpus and the question keyword. In this way, we eliminate the named entity tagging process for the target document. As for the keyword translator module, we use an Indonesian-English dictionary to translate Indonesian keywords into English. We also use some simple patterns to transform some borrowed English words. The keywords are then combined in boolean queries in order to retrieve relevant passages using IDF scores. We first conducted an experiment using 2,837 questions (about 10% of which are used as the test data) obtained from 18 Indonesian college students. We next conducted a similar experiment using the NTCIR (NII Test Collection for IR Systems) 2005 CLQA task by translating the English questions into Indonesian.
Compared to the Japanese-English and Chinese-English CLQA results in NTCIR 2005, we found that our system is superior to the others except for one system that uses richer data resources, employing three dictionaries. Further, a rough comparison with two other Indonesian-English CLQA systems revealed that our system achieved a higher accuracy score.
Pairing based cryptography has been researched intensively due to its beneficial properties. In 2005, Wu et al. [3] proposed an identity-based key agreement for peer group communication from pairings. In this letter, we propose attacks on their scheme, by which the group fails to agree upon a common communication key.
The support vector machine has received wide acceptance for its high generalization ability in real-world classification applications. A drawback, however, is that it assigns each pattern to exactly one class or to none. This is not appropriate for classification problems that involve overlapping patterns. In this paper, a novel multi-model classifier (DR-SVM), which combines an SVM classifier with the kNN algorithm under the rough set technique, is proposed. Instead of classifying the patterns directly, patterns lying in the overlapped region are first extracted. Then, upper and lower approximations of each class are defined on the basis of the rough set technique. The classification operation is carried out on these new sets. Simulation results on a synthetic data set and benchmark data sets indicate that, compared with conventional classifiers, DR-SVM yields more reasonable and accurate information about a pattern's category.