Keiichi FUNAKI Tatsuhiko KINJO
Complex speech analysis of an analytic speech signal can accurately estimate the spectrum at low frequencies, since the analytic signal provides a spectrum only over positive frequencies. This feature makes it possible to realize more accurate F0 estimation using the complex residual signal extracted by complex-valued speech analysis. We have already proposed F0 estimation using the complex LPC residual, in which the autocorrelation function weighted by the AMDF was adopted as the criterion. That method adopted MMSE-based complex LPC analysis, and it has been reported that it estimates F0 more accurately for IRS-filtered speech corrupted by white Gaussian noise, although it does not perform better for IRS-filtered speech corrupted by pink noise. In this paper, robust complex speech analysis based on the ELS (Extended Least Squares) method is introduced in order to overcome this drawback. The experimental results for additive white Gaussian or pink noise demonstrate that the proposed algorithm based on robust ELS-based complex AR analysis can perform better than other methods.
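The AMDF-weighted autocorrelation criterion mentioned above can be illustrated with a minimal sketch. It assumes a plain real-valued frame rather than the complex LPC residual used in the paper, and the function name `estimate_f0`, the search range, and the stabilizing constant `eps` are illustrative choices, not values taken from the abstract.

```python
import math

def estimate_f0(x, fs, f0_min=50.0, f0_max=400.0, eps=1.0):
    """Pick the lag that maximizes autocorrelation / (AMDF + eps).

    x   : list of real samples (stand-in for a residual frame)
    fs  : sampling rate in Hz
    eps : assumed constant that keeps the denominator away from zero
    """
    n = len(x)
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), n - 1)
    best_lag, best_score = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        m = n - lag
        # Unnormalized autocorrelation: large at multiples of the pitch period.
        acf = sum(x[i] * x[i + lag] for i in range(m))
        # Average magnitude difference: small at multiples of the pitch period.
        amdf = sum(abs(x[i] - x[i + lag]) for i in range(m)) / m
        score = acf / (amdf + eps)
        if score > best_score:
            best_score, best_lag = score, lag
    return fs / best_lag

# Usage: a 200 Hz sinusoid sampled at 8 kHz (50 ms frame).
frame = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(400)]
f0 = estimate_f0(frame, 8000)
```

Dividing by the AMDF sharpens the peak at the true period, because the autocorrelation is large there while the AMDF is close to zero.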
Kenji INOMATA Takashi HIRAI Yoshio YAMAGUCHI Hiroyoshi YAMADA
This paper presents a target location estimation method that uses a pair of leaky coaxial cables to determine the 2D coordinates of the target. Since conventional location techniques using leaky coaxial cables find a target location only along the cable in 1D, they have been unable to locate it in a 2D plane. The proposed method enables us to estimate the target location on a 2D plane using 1) a beam-forming technique and 2) a reconstruction technique based on the Hough transform. Leaky coaxial cables are equipped with numerous slots at regular intervals, which can be utilized as antenna arrays that act as both transmitters and receivers. By fully exploiting this specific characteristic of leaky coaxial cables, we carried out an antenna array analysis that operates in a beam-forming fashion. Simulation and experimental results support the effectiveness of the proposed target location method.
This letter proposes a new H∞ smoother (HIS) with a finite impulse response (FIR) structure for discrete-time state-space models. This smoother is called an H∞ FIR smoother (HIFS). Constraints such as linearity, the quasi-deadbeat property, the FIR structure, and independence of the initial state information are required in advance. Among smoothers satisfying these requirements, we choose the HIFS that optimizes the H∞ performance criterion. The HIFS is obtained by solving a linear matrix inequality (LMI) problem with a parametrization of a linear equality constraint. It is shown through simulation that the proposed HIFS is more robust against uncertainties and converges faster than the conventional HIS.
We propose a fast decoding algorithm for the p-ary first-order Reed-Muller code guaranteeing correction of up to errors and having complexity proportional to n log n, where n = p^m is the code length and p is an odd prime. This algorithm is an extension in the complex domain of the fast Hadamard transform decoding algorithm applicable to the binary case.
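For orientation, the binary case that the proposed algorithm extends can be sketched as follows. This is a minimal illustration of fast Hadamard transform decoding for the binary first-order Reed-Muller code only, not the p-ary algorithm of the letter; the helper names `fwht`, `encode_rm1`, and `decode_rm1` are hypothetical.

```python
def fwht(v):
    """Fast Walsh-Hadamard transform (natural order), O(n log n) for n = 2^m."""
    v = list(v)
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v

def encode_rm1(a0, a, m):
    """Codeword bits c_x = a0 + <a, x> mod 2 for x = 0 .. 2^m - 1."""
    return [(a0 + bin(a & x).count("1")) % 2 for x in range(1 << m)]

def decode_rm1(received):
    """Return (a0, a) of the codeword closest to the received bit list."""
    signs = [1 - 2 * b for b in received]      # map bit 0 -> +1, bit 1 -> -1
    spectrum = fwht(signs)
    # The largest-magnitude coefficient identifies the linear part a.
    idx = max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))
    a0 = 0 if spectrum[idx] > 0 else 1         # the sign carries the constant term
    return a0, idx

# Usage: length-16 code (m = 4) with three bit errors.
word = encode_rm1(1, 0b1011, 4)
for pos in (0, 5, 9):
    word[pos] ^= 1
decoded = decode_rm1(word)   # (1, 11)
```

Each transform coefficient is the correlation of the received word with one linear function, so a single transform evaluates all candidate codewords at once.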
Yuki DENDA Takanobu NISHIURA Yoichi YAMASHITA
This paper proposes a robust omnidirectional audio-visual (AV) talker localizer for AV applications. The proposed localizer consists of two innovations. One of them is robust omnidirectional audio and visual features. The direction of arrival (DOA) estimation using an equilateral triangular microphone array, and human position estimation using an omnidirectional video camera extract the AV features. The other is a dynamic fusion of the AV features. The validity criterion, called the audio- or visual-localization counter, validates each audio- or visual-feature. The reliability criterion, called the speech arriving evaluator, acts as a dynamic weight to eliminate any prior statistical properties from its fusion procedure. The proposed localizer can compatibly achieve talker localization in a speech activity and user localization in a non-speech activity under the identical fusion rule. Talker localization experiments were conducted in an actual room to evaluate the effectiveness of the proposed localizer. The results confirmed that the talker localization performance of the proposed AV localizer using the validity and reliability criteria is superior to that of conventional localizers.
Youhua SHI Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
In this paper, we present a Design-for-Secure-Test (DFST) technique for pipelined AES that guarantees both security and test quality during testing. Unlike previous work, the proposed method keeps all secrets inside the chip while also providing high test quality and fault diagnosis ability. Furthermore, the proposed DFST technique can significantly reduce test application time, test data volume, and test generation effort as additional benefits.
In MC-CDMA systems, subcarriers can be shared by different users. In this letter, we exploit the shared nature of subcarriers and propose a user grouping and subcarrier allocation algorithm for grouped MC-CDMA systems. The scheme aims at maximizing the total system throughput while providing bandwidth-fairness among groups. Simulation results are given to demonstrate the performance of the proposed algorithm in terms of sum capacity and per-user throughput.
Seiichi NAKAMORI María J. GARCIA-LIGERO Aurora HERMOSO-CARAZO Josefa LINARES-PEREZ
In this paper, we propose a recursive filtering algorithm to restore monochromatic images which are corrupted by general dependent additive noise. It is assumed that the equation which describes the image field is not available and a filtering algorithm is obtained using the information provided by the covariance functions of the signal, noise that affects the measurement equation, and the fourth-order moments of the signal. The proposed algorithm is obtained by an innovation approach which provides a simple derivation of the least mean-squared error linear estimators. The estimation of the grey level in each spatial coordinate is made taking into account the information provided by the grey levels located on the row of the pixel to be estimated. The proposed filtering algorithm is applied to restore images which are affected by general signal-dependent additive noise.
Masayuki SHIMIZU Makoto NAKASHIZUKA Youji IIGUNI
In this paper, we propose an image enlargement method using morphological operators. Our enlargement method is based on the nonlinear frequency extrapolation method (Greenspan et al., 2000), which uses a Laplacian pyramid image representation. In this method, the sampling process of input images is modeled as the Laplacian pyramid. A high-resolution image is obtained with the finer-scale Laplacian, which is extrapolated by a nonlinear operation from a low-resolution Laplacian. We propose a novel nonlinear operation for extrapolating the finer-scale Laplacian. Our nonlinear operation is realized by morphological operators and is capable of generating a finer-scale Laplacian whose amplitude is proportional to the contrast of the edges that appear in the low-resolution image. Enlargement results obtained with the proposed method are demonstrated in experiments. Compared with Greenspan's method, the proposed method can recover sharp intensity transitions at image edges with small artifacts.
Chuang LIN Jeng-Shyang PAN Chia-An HUANG
The letter proposes a novel subsampling-based digital image watermarking scheme resisting the permutation attack. The subsampling-based watermarking schemes have drawn great attention for their convenience and effectiveness in recent years, but the traditional subsampling-based watermarking schemes are very vulnerable to the permutation attack. In this letter, the watermark information is embedded in the average values of the 1-level DWT coefficients to resist the permutation attack. The concrete embedding process is achieved by the quantization-based method. Experimental results show that the proposed scheme can resist not only the permutation attack but also some common image processing attacks.
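The quantization-based embedding step can be illustrated with a minimal sketch. It assumes a simple parity-of-bin rule on the mean of a group of coefficients; the step size, the bin-center target, and the function names `embed_bit` and `extract_bit` are illustrative assumptions and are not taken from the letter.

```python
import math

def embed_bit(coeffs, bit, step=8.0):
    """Shift all coefficients so their mean lands in a bin whose index parity equals bit."""
    avg = sum(coeffs) / len(coeffs)
    q = math.floor(avg / step)
    if q % 2 != bit:
        q += 1                       # move to the neighbouring bin with the right parity
    target = (q + 0.5) * step        # bin center leaves a margin of step/2 on each side
    shift = target - avg
    return [c + shift for c in coeffs]

def extract_bit(coeffs, step=8.0):
    """Read the bit back from the parity of the bin that contains the mean."""
    avg = sum(coeffs) / len(coeffs)
    return math.floor(avg / step) % 2

# Usage: the list stands in for a group of 1-level DWT coefficients.
marked = embed_bit([10.0, 12.5, -3.0, 7.25], 1)
bit = extract_bit(list(reversed(marked)))   # reordering leaves the mean unchanged
```

Because a mean does not depend on the order of its terms, the extracted bit survives any reordering of the coefficients within the group, and the bin-center target tolerates mean perturbations smaller than half the step.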
Yutaka KAMAMOTO Noboru HARADA Takehiro MORIYA
A new linear prediction analysis method for multichannel signals was devised, with the goal of enhancing the compression performance of the MPEG-4 Audio Lossless Coding (ALS) compliant encoder and decoder. The multichannel coding tool for this standard carries out an adaptively weighted subtraction of the residual signals of the coding channel from those of the reference channel, both of which are produced by independent linear prediction. Our linear prediction method tries to directly minimize the amplitude of the predicted residual signal after subtraction of the signals of the coding channel, and the method has been implemented in the MPEG-4 ALS codec software. The results of a comprehensive evaluation show that this method reduces the size of a compressed file. The maximum improvement of the compression ratio is 14.6% which is achieved at the cost of a small increase in computational complexity at the encoder and without increase in decoding time. This is a practical method because the compressed bitstream remains compliant with the MPEG-4 ALS standard.
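The weighted subtraction between channel residuals described above can be sketched as follows. This is a simplified single-weight illustration of the inter-channel step only, assuming a least-squares weight; it is neither the proposed linear prediction method nor the exact tool defined by the standard, and the function name `weighted_subtraction` is hypothetical.

```python
def weighted_subtraction(coding_res, reference_res):
    """Subtract a least-squares-weighted reference residual from the coding residual.

    Returns (gamma, difference), where gamma minimizes the energy of
    coding_res - gamma * reference_res.
    """
    ref_energy = sum(r * r for r in reference_res)
    if ref_energy == 0.0:
        return 0.0, list(coding_res)     # nothing to subtract
    cross = sum(c * r for c, r in zip(coding_res, reference_res))
    gamma = cross / ref_energy
    diff = [c - gamma * r for c, r in zip(coding_res, reference_res)]
    return gamma, diff

# Usage: two correlated residual sequences (made-up numbers).
ref = [1.0, -2.0, 3.0, -1.0, 0.5]
cod = [0.5 * r + d for r, d in zip(ref, [0.1, -0.1, 0.05, 0.0, -0.05])]
gamma, diff = weighted_subtraction(cod, ref)
```

With a least-squares weight, the energy of the difference signal never exceeds that of the original coding-channel residual, which is what makes the subtraction useful for compression.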
Taegyun LIM Keunsung BAE Chansik HWANG Hyeonguk LEE
This paper presents a new method for classification of underwater transient signals, which employs a binary image pattern of the mel-frequency cepstral coefficients as a feature vector and a feed-forward neural network as a classifier. The feature vector is obtained by taking DCT and 1-bit quantization for the square matrix of the mel-frequency cepstral coefficients that is derived from the frame based cepstral analysis. The classifier is a feed-forward neural network having one hidden layer and one output layer, and a back propagation algorithm is used to update the weighting vector of each layer. Experimental results with underwater transient signals demonstrate that the proposed method is very promising for classification of underwater transient signals.
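The feature extraction step (DCT followed by 1-bit quantization of a square coefficient matrix) can be sketched as follows. The sketch assumes an unnormalized 2D DCT-II and a sign threshold at zero, neither of which is specified in the abstract; the input matrix stands in for the square matrix of mel-frequency cepstral coefficients, and the function names are hypothetical.

```python
import math

def dct_1d(v):
    """Unnormalized DCT-II of a list of numbers."""
    n = len(v)
    return [sum(v[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                for t in range(n))
            for k in range(n)]

def dct_2d(mat):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in mat]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def binary_pattern(mat):
    """1-bit quantization: 1 where the DCT coefficient is positive, else 0 (assumed rule)."""
    return [[1 if c > 0 else 0 for c in row] for row in dct_2d(mat)]

# Usage: a small made-up 4x4 matrix in place of real cepstral coefficients.
pattern = binary_pattern([[4, 3, 2, 1],
                          [3, 3, 2, 1],
                          [2, 2, 2, 1],
                          [1, 1, 1, 1]])
```

The resulting 0/1 matrix can be flattened into a fixed-length vector and passed to a classifier such as the feed-forward network described in the abstract.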
Amin SAEEDFAR Hiroyasu SATO Kunio SAWAYA
This paper presents different approaches for the analysis of a thin-wire antenna in the presence of a box of de-ionized water at different temperatures, which acts as a high-permittivity three-dimensional dielectric body. Continuing the authors' previous work, the coupled tensor-volume/line integral equations are first solved by using a Galerkin-based method of moments (MoM) that combines entire-domain and sub-domain basis functions, including three-dimensional polynomials of different degrees. Then, the accuracy of this MoM, specifically for a high-permittivity dielectric scatterer, is substantiated by comparing its numerical results with those of the FDTD method and with some experimental data.
Akinori ITO Takanobu OBA Takashi KONASHI Motoyuki SUZUKI Shozo MAKINO
Speech recognition in a noisy environment is one of the hottest topics in speech recognition research. Noise-tolerant acoustic models or noise reduction techniques are often used to improve recognition accuracy. In this paper, we propose a method to improve the accuracy of a spoken dialog system from a language model point of view. In the proposed method, the dialog system automatically changes its language model and dialog strategy according to the estimated recognition accuracy in a noisy environment in order to keep the performance of the system high. In a noise-free environment, the system accepts any utterance from a user. In a noisy environment, on the other hand, the system restricts its grammar and vocabulary. To realize this strategy, we investigated a method to avoid the user's out-of-grammar utterances through an instruction given by the system to the user. Furthermore, we developed a method to estimate recognition accuracy from features extracted from noise signals. Finally, we realized the proposed dialog system on the basis of these investigations.
In this paper, signal processing techniques which can be applied to automatic speech recognition to improve its robustness are reviewed. The choice of signal processing techniques is strongly dependent on the scenario of the applications. The analysis of scenario and the choice of suitable signal processing techniques are shown through two examples.
Kohki OHBA Takaya MIYAZAWA Iwao SASASE
In this paper, we propose a mitigation system of high-level multiple access interference (MAI) for multimedia optical Code-Division Multiple-Access (CDMA) systems using the optical power selector (OPS). The proposed system can eliminate high-intensity MAI at the receiver for low-priority users. Moreover, the proposed system can reduce by half the required number of code sequences compared to the conventional scheme. As a result, the proposed system can increase the number of weights at the same code-length and, thus, obtain higher code spreading gain. We analyze performances of the proposed system and show that both high-priority users and low-priority users achieve lower bit error rates in comparison to the conventional scheme.
Shoei SATO Akio KOBAYASHI Kazuo ONOE Shinichi HOMMA Toru IMAI Tohru TAKAGI Tetsunori KOBAYASHI
We present a novel method of integrating the likelihoods of multiple feature streams, representing different acoustic aspects, for robust speech recognition. The integration algorithm dynamically calculates a frame-wise stream weight so that a higher weight is given to a stream that is robust to a variety of noisy environments or speaking styles. Such a robust stream is expected to show discriminative ability. A conventional method proposed for the recognition of spoken digits calculates the weights from the entropy of the whole set of HMM states. This paper extends the dynamic weighting to a real-time large-vocabulary continuous speech recognition (LVCSR) system. The proposed weight is calculated in real-time from mutual information between an input stream and active HMM states in a search space without an additional likelihood calculation. Furthermore, the mutual information takes the width of the search space into account by calculating the marginal entropy from the number of active states. In this paper, we integrate three features that are extracted through auditory filters by taking into account the human auditory system's ability to extract amplitude and frequency modulations. Due to this, features representing energy, amplitude drift, and resonant frequency drifts, are integrated. These features are expected to provide complementary clues for speech recognition. Speech recognition experiments on field reports and spontaneous commentary from Japanese broadcast news showed that the proposed method reduced error words by 9.2% in field reports and 4.7% in spontaneous commentaries relative to the best result obtained from a single stream.
Nari TANABE Toshihiro FURUKAWA Shigeo TSUJII
We propose a noise suppression algorithm based on Kalman filter theory. The algorithm aims to achieve robust noise suppression for additive white and colored disturbances using canonical state space models with (i) a state equation composed of the speech signal and (ii) an observation equation composed of the speech signal and additive noise. The remarkable features of the proposed algorithm are (1) its application to adaptive white and colored noises, where babble noise is used as the additive colored noise, and (2) the realization of high-performance noise suppression without sacrificing the quality of the speech signal, despite using only the Kalman filter algorithm, whereas many conventional methods based on Kalman filter theory perform noise suppression using both a parameter estimation algorithm for the AR (auto-regressive) system and the Kalman filter algorithm. We show the effectiveness of the proposed method, which applies Kalman filter theory to the proposed canonical state space model with a colored driving source, using numerical results and subjective evaluation results.
Keiichi TANABE Hironori WAKANA Koji TSUBONE Yoshinobu TARUTANI Seiji ADACHI Yoshihiro ISHIMARU Michitaka MARUYAMA Tsunehiro HATO Akira YOSHIDA Hideo SUZUKI
We have developed the fabrication process, the circuit design technology, and the cryopackaging technology for high-Tc single flux quantum (SFQ) devices with the aim of application to an analog-to-digital (A/D) converter circuit for future wireless communication and a sampler system for high-speed measurements. Reproducibility of fabricating ramp-edge Josephson junctions with IcRn products above 1 mV at 40 K and small Ic spreads on a superconducting groundplane was much improved by employing smooth multilayer structures and optimizing the junction fabrication process. The separated base-electrode layout (SBL) method, which suppresses the Jc spread for interface-modified junctions in circuits, was developed. This method enabled low-frequency logic operations of various elementary SFQ circuits with relatively wide bias current margins and operation of a toggle-flip-flop (T-FF) above 200 GHz at 40 K. Operation of a 1:2 demultiplexer, one of the main elements of a hybrid-type Σ-Δ A/D converter circuit, was also demonstrated. We developed a sampler system in which a sampler circuit with a potential bandwidth over 100 GHz was cooled by a compact Stirling cooler, and waveform observation experiments confirmed an actual system bandwidth well over 50 GHz.
Optimal savings in TCAM search power can be achieved by combining hardware-based techniques with a power-friendly TCAM configuration. This letter proposes that a conditional precharging hardware scheme be used with a power-aware TCAM configuration. In traffic simulation results for a sample design of a 512×72 TCAM block, the proposed scheme conservatively saved 72% of energy with unbiased traffic compared with a design that uses no energy-saving scheme.