Yasunari OBUCHI Nobuo HATAOKA Richard M. STERN
In this paper we describe a new feature compensation framework for robust speech recognition that is especially suitable for small devices. We introduce Delta-cepstrum Normalization (DCN), which normalizes not only the cepstral coefficients but also their time derivatives. Cepstral Mean Normalization (CMN) and Mean and Variance Normalization (MVN) are fast and efficient environmental adaptation algorithms and have been widely used. In those algorithms, normalization is applied to the cepstral coefficients to reduce the irrelevant information in them, but it has not been applied to the time-derivative parameters because the resulting reduction of irrelevant information was insufficient. Histogram Equalization (HEQ), however, provides better compensation and can be applied even to the delta and delta-delta cepstra. We investigate various implementations of DCN and show that the best performance is achieved when the normalizations of the cepstra and the delta cepstra are mutually interdependent. We evaluate DCN using speech data recorded by a PDA and show that it provides a significant improvement over HEQ, namely a 15% relative reduction in word error rate. We also examine the possibility of combining Vector Taylor Series (VTS) with DCN. Although some combinations do not improve on the performance of VTS, the best combination performs better than VTS alone. Finally, the advantage of DCN in terms of computation speed is discussed.
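As a rough illustration of the baseline normalizations named in this abstract, the sketch below applies CMN and MVN to a single cepstral coefficient track and then applies the same variance normalization to its delta track. It is not the DCN algorithm itself, which builds on HEQ and on interdependent normalization of the cepstra and delta cepstra. The function names, the regression window width, and the edge handling are assumptions made for this example.

```python
from statistics import fmean, pstdev

def cmn(track):
    """Cepstral Mean Normalization: subtract the utterance mean."""
    mu = fmean(track)
    return [x - mu for x in track]

def mvn(track, eps=1e-12):
    """Mean and Variance Normalization: zero mean, unit variance."""
    mu = fmean(track)
    sigma = pstdev(track)
    return [(x - mu) / (sigma + eps) for x in track]

def delta(track, width=2):
    """Regression-based time derivative (assumed window of +/- width frames).

    Frames beyond the utterance boundaries are replaced by the nearest
    edge frame, which is one common convention among several.
    """
    n = len(track)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        acc = 0.0
        for k in range(1, width + 1):
            ahead = track[min(t + k, n - 1)]
            behind = track[max(t - k, 0)]
            acc += k * (ahead - behind)
        out.append(acc / denom)
    return out

def normalize_static_and_delta(track):
    """Normalize the static track and its delta track independently."""
    return mvn(track), mvn(delta(track))
```

Here the static and delta tracks are normalized independently of each other; the abstract reports that the best DCN variant instead makes the two normalizations mutually interdependent, which this sketch does not attempt.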
Giedre SABALIAUSKAITE Shinji KUSUMOTO Katsuro INOUE
For more than twenty-five years, software inspections have been considered an effective method for defect detection. Inspections have been investigated through controlled experiments in university environments and through industry case studies. However, in most cases software inspections have been used for defect detection in documents of conventional structured development processes. Therefore, there is a significant lack of information about how inspections should be applied to Object-Oriented artifacts, such as Object-Oriented code and design diagrams. In addition, extensive work is needed to determine whether some inspection techniques are more beneficial than others. Most inspection experiments include inspection meetings after individual inspection is completed. However, several researchers have suggested that inspection meetings may not be necessary, since only an insignificant number of new defects are found as a result of the meeting. Moreover, inspection meetings have been found to suffer from process loss. This paper presents the findings of a controlled experiment conducted to investigate the performance of individual inspectors as well as 3-person teams in Object-Oriented design document inspection. The documents were written using the notation of the Unified Modelling Language. Two reading techniques, Checklist-based reading (CBR) and Perspective-based reading (PBR), were used during the experiment. We found that the two techniques are similar with respect to defect detection effectiveness during individual inspection as well as during inspection meetings.
Investigating the usefulness of inspection meetings, we found that the teams that used the CBR technique exhibited significantly smaller meeting gains (the number of new defects first found during the team meeting) than meeting losses (the number of defects first identified by an individual but never included in the defect list by the team), whereas for the teams that used the PBR technique the meeting gains were similar to the meeting losses. Consequently, CBR 3-person team meetings turned out to be less beneficial than PBR 3-person team meetings.
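The two meeting metrics defined above can be expressed as set differences. The following is a minimal sketch under the assumption that each defect has a unique identifier and that the individual and team defect lists are available as sets; the function name and the data layout are hypothetical and are not taken from the experiment.

```python
def meeting_gain_and_loss(individual_findings, team_findings):
    """Return (gain, loss) for one team.

    individual_findings: iterable of sets, one per inspector, holding the
        defect identifiers found during individual inspection.
    team_findings: set of defect identifiers on the team's defect list
        after the meeting.
    """
    found_individually = set().union(*individual_findings)
    team = set(team_findings)
    # Meeting gain: defects first found during the team meeting.
    gain = team - found_individually
    # Meeting loss: defects found by an individual but missing from the team list.
    loss = found_individually - team
    return gain, loss
```

For example, with individual findings `{"D1", "D2"}`, `{"D2", "D3"}`, `{"D4"}` and a team list `{"D1", "D2", "D5"}`, the gain is `{"D5"}` and the loss is `{"D3", "D4"}`.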
A new speaker feature extracted by multi-wavelet decomposition is described for speaker recognition. The multi-wavelet decomposition is a multi-scale representation of the covariance matrix. We combine the wavelet transform with the multi-resolution singular value algorithm to decompose eigenvectors for speaker feature extraction when the matrix is not square. Our results show that this multi-wavelet feature achieves higher recognition rates than the cepstrum and Δ-cepstrum.
Kenji TAKATSUKASA Shinya MATSUFUJI Yoshihiro TANADA
This paper formulates functions that generate four kinds of binary sequence sets of length 2^n with a zero correlation zone, which have been discussed for approximately synchronized CDMA systems free of co-channel interference and multipath effects. The functions are logic functions of a binary vector of order n, expressed by EXOR and AND operations.
Hisakazu SATO Yasuhiro NUNOMURA Niichi ITOH Koji NII Kanako YOSHIDA Hironobu ITO Jingo NAKANISHI Hidehiro TAKATA Yasunobu NAKASE Hiroshi MAKINO Akira YAMADA Takahiko ARAKAWA Toru SHIMIZU Yuichi HIRANO Takashi IPPOSHI Shuhei IWADE
A low-power microcontroller has been developed with 0.10 µm bulk-compatible body-tied SOI technology. Only two new masks are required for this work; for the other layers, the existing masks of a prior design developed with 0.18 µm bulk CMOS technology can be used without any changes. With the SOI technology, high-speed operation at over 600 MHz has been achieved at a supply voltage of 1.2 V, which is 1.5 times faster than the prior work. A fivefold improvement in the power-delay product has also been achieved at a supply voltage of 0.8 V. Moreover, the compatibility of the SOI technology with bulk CMOS has been verified: all circuit blocks of the chip, including logic, memory, analog circuits, and the PLL, are completely functional even though only two new masks are used.
Christopher J. HOGGER Frank R. KRIWACZEK
We describe a framework for deriving specifications of wizard-like tools by detecting coherent patterns of behaviour among user actions observed in a portal environment. Implementation in the portal of tools compliant with these specifications can then provide useful support for the kind of work patterns observed. The derivation process employs a customizable knowledge base which defines coherent patterns and seeks concrete instances of them among series of actions that occur with sufficient frequency among those observed.
Hary BUDIARTO Kenshi HORIHATA Katsuyuki HANEDA Jun-ichi TAKADA
In urban areas, buildings are the main scatterers that dominate mobile propagation characteristics. Reflection, diffraction, and scattering on building surfaces in the radio environment induce undesirable multipath propagation. Multipath prediction with respect to a building surface has conventionally been based on the assumption that reflection from the surface has a substantial specular direction. However, non-specular scattering from the building surface can also affect the channel characteristics, just as specular scattering does. This paper presents the multipath characteristics of non-specular wave scattering from building surface roughness, based on experimental results. A superresolution method was applied to resolve the signal parameters (DoA, ToA) of the individual incoming waves reflected from the rough building surface. The results show that multipaths can be detected from many scatterers, such as the ground, window glass, window frames, and brick surfaces, as well as directly from the transmitter. Most of the scattered waves arrive from directions close to the specular direction. The measured reflection coefficients were well bounded by the theoretical reflection coefficients of smooth and randomly rough surfaces. The Fresnel reflection coefficient formula, taking into account the finite thickness of the building surface and a Gaussian scattering correction, gives a better prediction of the measured reflection coefficients of glass and brick.
A method for estimating the bit-error rate (BER) of turbo codes called the Monte Carlo distance spectrum method is proposed. Testing this method shows that the estimated BER curves closely approximate the results of a Monte Carlo simulation.
Atsumu ISENO Yukihiro IGUCHI Tsutomu SASAO
In this paper, we show a method to locate a single stuck-at fault in a random access memory (RAM). From the fail-bitmaps of the RAM, we obtain their Walsh spectrum. For a single stuck-at fault, we show that the fault can be identified and located by using only the 0th and 1st coefficients of the spectrum. We also show a circuit to compute these coefficients. The computation time is O(2^n), where n is the number of bits in the address of the RAM, which is much shorter than that of a method based on logic minimization.
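To make the idea of locating a fault from low-order Walsh coefficients concrete, here is a simplified sketch. It assumes that the fail-bitmap is a list of 2^n entries containing a 1 at exactly one failing address and 0 elsewhere, and that the coefficients are defined as W_0 = Σ f(a) and W_i = Σ f(a)·(-1)^(a_i), where a_i is bit i of address a. The fault model, coefficient definitions, and circuit in the paper may differ, and the function names are hypothetical.

```python
def walsh_low_order(fail_bitmap, n):
    """Compute the zeroth coefficient and the n first-order Walsh coefficients.

    One pass over the 2**n bitmap entries accumulates all coefficients.
    """
    if len(fail_bitmap) != 1 << n:
        raise ValueError("fail_bitmap must have 2**n entries")
    w0 = 0
    w1 = [0] * n
    for address, fail in enumerate(fail_bitmap):
        if not fail:
            continue
        w0 += fail
        for i in range(n):
            # Sign is +1 when address bit i is 0 and -1 when it is 1.
            w1[i] += -fail if (address >> i) & 1 else fail
    return w0, w1

def locate_single_fault(fail_bitmap, n):
    """Return the failing address, or None if the bitmap does not hold exactly one failure."""
    w0, w1 = walsh_low_order(fail_bitmap, n)
    if w0 != 1:
        return None  # zero or multiple failures: outside this sketch
    address = 0
    for i, coeff in enumerate(w1):
        if coeff == -1:
            address |= 1 << i
    return address
```

With a single failing entry, W_0 equals 1 and each first-order coefficient is +1 or -1 according to the corresponding address bit, so the n + 1 coefficients suffice to reconstruct the address under the stated assumption.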
We propose Optimal Temporal Decomposition (OTD) of speech for voice morphing that preserves the Δ cepstrum. OTD is an optimal modification of the original Temporal Decomposition (TD) by B. Atal. It is shown theoretically that OTD achieves minimal spectral distortion for the TD-based approximation of time-varying LPC parameters. Moreover, by applying OTD to Δ cepstrum preservation, it is also shown theoretically that the Δ cepstrum of a target speaker can be reflected in that of a source speaker. In frequency domain interpolation, the Laplacian Spectral Distortion (LSD) measure is introduced to improve the non-uniform frequency warping based on the Inverse Function of Integrated Spectrum (IFIS). Experimental results indicate that the Δ cepstrum of the OTD-based morphing spectra of a source speaker is nearly equal to that of a target speaker except for a piecewise constant factor. Subjective listening tests show that the speech intelligibility of the proposed morphing method is superior to that of the conventional method.
Kenichi ICHINO Ko-ichi WATANABE Masayuki ARAI Satoshi FUKUMOTO Kazuhiko IWASAKI
The partially rotational scan (PRS) technique greatly reduces the amount of data needed for n-detection testing. It also enables at-speed testing using low-speed testers. We designed tester intellectual properties (tester IP) with PRS for Viper and COMET II processors. When PRS was applied to a Viper processor, we obtained test data that provided the same fault coverage as with a set of automatic test pattern generation (ATPG) test vectors, although the amount of test data was 16% that of the ATPG. When the PRS technique was applied to a COMET II processor with full-scan design, we obtained test data that provided the same fault coverage as with a set of ATPG test vectors, although the amount of test data was 10% that of the ATPG. We also estimated hardware overhead and test time.
Yonghui LI Branka VUCETIC Qishan ZHANG
Channel estimation is one of the key technologies in mobile communications. It is critical for providing high data rate services and for overcoming fast fading in very high-speed mobile communications. This paper presents a novel channel estimation scheme based on hybrid spreading of I and Q signals (CEHS). Simulation results show that it effectively mitigates the influence of fast fading and makes it possible to provide high data rates in very high-speed mobile systems.
Rajkishore PRASAD Hiroshi SARUWATARI Kiyohiro SHIKANO
This paper deals with the statistical modeling of a Time-Frequency Series of Speech (TFSS), obtained by Short-Time Fourier Transform (STFT) analysis of the speech signal picked up by a linear microphone array with two elements. We have attempted to find the closest match between the distribution of the TFSS and theoretical distributions such as the Laplacian Distribution (LD), the Gaussian Distribution (GD), and the Generalized Gaussian Distribution (GGD), with parameters estimated from the TFSS data. We found that the GGD provides the best model for the real part, the imaginary part, and the polar magnitude of the time series of the spectral components. The distribution of the polar magnitude is closer to the LD than the distributions of the real and imaginary parts are. The distributions of the real and imaginary parts of the TFSS correspond to strongly Laplacian distributions. The phase of the TFSS was found to be uniformly distributed. Using the GGD-based model as the PDF in fixed-point Frequency Domain Independent Component Analysis (FDICA) provides better separation performance and significantly improves the convergence speed.
Francisco MESEGUER Hernan MIGUEZ
Colloidal crystallization is one of the most promising approaches to the fabrication of photonic crystals with periodicity at the submicron length scale. Several approaches have been explored to enhance the optical quality of these materials and, at the same time, to integrate these materials in substrates of interest in current technology. In this paper we review some of the most promising advances recently made in this direction, as well as some achievements towards the creation of new colloidal structures.
Shu-Min TSAI Jia-Ching WANG Jar-Ferr YANG Jhing-Fa WANG
In this paper, we propose a speech coding translation scheme that transfers coding parameters between the GSM half rate and G.729 coders. Compared with the conventional decode-then-encode (DTE) scheme, the proposed parameter conversions provide speech interoperability between mobile and IP networks with reduced computational complexity and coding delay. Simulation results show that the proposed methods reduce the computational load and coding delay incurred in the target encoders by about 30% with almost imperceptible degradation in performance.
Pi-Chung WANG Chia-Tai CHAN Shuo-Cheng HU Chun-Liang LEE
Rectangle search is a well-known packet classification scheme based on multiple hash accesses for different filter lengths. It scales well with the number of filters; however, it is not fast enough to meet the high-speed requirements of packet classification. In this paper, we propose a lookahead caching scheme that can significantly improve the performance of hash-based algorithms. The basic idea is to filter out unmatched probes by using a dual-hash architecture. The experimental results indicate that the proposed scheme improves performance by a factor of two for a two-dimensional (source prefix, destination prefix) filter database.
Takashi YAMAMOTO Hirokazu KUBOTA Satoki KAWANISHI Masatoshi TANAKA Syun-ichiro YAMAGUCHI
We describe the first highly nonlinear dispersion-flattened polarization-maintaining photonic crystal fiber designed for nonlinear optics applications in the 1.55 µm region. The nonlinear coefficient of the fiber is 19 W^-1 km^-1, which is ten times that of dispersion-shifted fiber. The chromatic dispersion and dispersion slope of the fiber at 1.55 µm are -0.23 ps/km/nm and 0.01 ps/km/nm^2, respectively. We demonstrate the generation of a supercontinuum using the photonic crystal fiber. A symmetrical supercontinuum over 40 nm is obtained by injecting 1562 nm, 2.2 ps, and 40 GHz optical pulses into the 200 m-long photonic crystal fiber.
Takeshi MASUYAMA Hiroshi NAKAGAWA
Although many researchers have verified the superiority of the Support Vector Machine (SVM) on text categorization tasks, some recent papers have reported much lower performance of SVM-based text categorization methods when all types of parts of speech (POS) are used as input words and large numbers of training documents are treated. This was caused by overfitting, in which the SVM sometimes selected unsuitable support vectors for each category in the training set. To avoid the overfitting problem, we propose a two-step text categorization method with variable cascaded feature selection (VCFS) using SVM. The VCFS method selects the best pair of number of words and POS combination for each category at each step of the cascade. We made use of the differences among the words with the highest mutual information for each category on each POS combination. Through experiments, we confirmed the validity of the VCFS method in comparison with other SVM-based text categorization methods: the macro-averaged F1 measure (64.8%) of the VCFS method was significantly better than any reported F1 measure, while its micro-averaged F1 measure (85.4%) was similar to the reported ones.
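The abstract does not state which mutual information formula is used for ranking words. As an illustration only, the sketch below ranks words for one category with the commonly used pointwise estimate MI(w, c) = log(N · A / (df(w) · N_c)), where N is the number of documents, A is the number of documents in category c containing word w, df(w) is the number of documents containing w, and N_c is the number of documents in c. The function name and the input layout are hypothetical, and POS filtering is omitted.

```python
import math
from collections import Counter

def top_words_by_mi(docs, category, k):
    """Return the k words with the highest pointwise MI for a category.

    docs: list of (words, label) pairs, where words is an iterable of tokens.
    Ties are broken alphabetically so that the result is deterministic.
    """
    n = len(docs)
    n_cat = sum(1 for _, label in docs if label == category)
    df = Counter()      # documents containing each word
    df_cat = Counter()  # documents of the category containing each word
    for words, label in docs:
        for w in set(words):
            df[w] += 1
            if label == category:
                df_cat[w] += 1
    scores = {
        w: math.log(n * a / (df[w] * n_cat))
        for w, a in df_cat.items()
    }
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

This estimate is known to favor rare words, which is one reason the number of selected words per category matters; the selection of that number and of the POS combination in VCFS is not reproduced here.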
Giedre SABALIAUSKAITE Shinji KUSUMOTO Katsuro INOUE
Software inspection is one of the most effective methods for detecting defects. However, inspections are not always worthwhile. This letter proposes an inspection cost model that describes inspection-related costs, together with extended metrics for evaluating the cost effectiveness of software inspections.
Erdenebat DASHTSEREN Shigeyoshi KITAZAWA Satoshi IWASAKI Shinya KIRIYAMA
Our study focuses on an evaluation of a novel speech processing strategy for multi-channel cochlear implant speech processors. Stimulation pulse trains for the Nucleus 24CI speech processor were generated in a way different from the speech processing strategies implemented in this processor. The distinctive features of the novel strategy are: 1) electrode stimulation order driven by location of maximum instantaneous frequency amplitude; 2) variable stimulation rates on electrodes; 3) variable number of selected channels within a cycle of signal processing schema. Within-subject designed tests on Japanese initial, medial and final consonants in CV, VCV and CV/N context tokens were carried out with cochlear implant patients using the Cochlear ACE™ strategy, and results were compared with those of normal hearing listeners. Results of the initial and medial consonant tests showed significantly better performance with the novel strategy than with the ACE strategy for both the cochlear implant and normal hearing listener groups. Results of the final consonant tests showed a slightly better performance with the ACE strategy for cochlear implant listeners while showing a slightly better performance with the novel strategy for normal hearing listeners.