Bing-Fei WU Li-Shan MA Jau-Woei PERNG
This investigation applies the adaptive fuzzy-neural observer (AFNO) to synchronize a class of unknown chaotic systems using only a scalar transmitted signal. The proposed method can be used for synchronization if the nonlinear chaotic system can be transformed into the canonical form of a Lur'e-type system by the differential geometric method. In this approach, the adaptive fuzzy-neural network (FNN) in the AFNO is used online to model the nonlinear term in the transmitter. Additionally, the master's unknown states can be reconstructed from one transmitted state using an observer design at the slave end. Synchronization is achieved when all states are observed. The scheme can adaptively estimate the transmitter states online, even if the transmitter is changed to another chaotic system. Moreover, the robustness of the AFNO can be guaranteed with respect to modeling error and bounded external disturbance. Simulation results confirm that the AFNO design is valid for chaos synchronization.
A robust routing algorithm was developed based on reinforcement learning that uses (1) reward-weighted principal component analysis, which compresses the state space of a network with a large number of nodes and eliminates the adverse effects of various types of attacks or disturbance noises, (2) activity-oriented index allocation, which adaptively constructs a basis that is used for approximating routing probabilities, and (3) newly developed space compression based on a potential model that reduces the space for routing probabilities. This algorithm takes all the network states into account and reduces the adverse effects of disturbance noises. The algorithm thus works well, and the frequencies of causing routing loops and falling to a local optimum are reduced even if the routing information is disturbed.
Suehiro SHIMAUCHI Yoichi HANEDA Akitoshi KATAOKA
We propose a new robust frequency-domain acoustic echo cancellation filter that employs normalized residual echo enhancement. By interpreting conventional robust step-size control approaches as a statistical-model-based residual echo enhancement problem, we show that the optimal step size introduced in most conventional approaches is optimal only on the assumption that both the residual echo and the outlier in the error output signal are described by Gaussian distributions. However, the Gaussian-Gaussian mixture assumption does not always hold well, especially when both the residual echo and the outlier are speech signals (known as a double-talk situation). The proposed filtering scheme is based on the Gaussian-Laplacian mixture assumption for the signals normalized by the reference input signal amplitude. By comparing the performance of the proposed and conventional approaches through simulations, we show that the Gaussian-Laplacian mixture assumption for the normalized signals can provide a better control scheme for acoustic echo cancellation.
Chirawat KOTCHASARN Poompat SAENGUDOMLERT
We investigate the problem of joint transmitter and receiver power allocation with the minimax mean square error (MSE) criterion for uplink transmissions in a multi-carrier code division multiple access (MC-CDMA) system. The objective of power allocation is to minimize the maximum MSE among all users each of which has limited transmit power. This problem is a nonlinear optimization problem. Using the Lagrange multiplier method, we derive the Karush-Kuhn-Tucker (KKT) conditions which are necessary for a power allocation to be optimal. Numerical results indicate that, compared to the minimum total MSE criterion, the minimax MSE criterion yields a higher total MSE but provides a fairer treatment across the users. The advantages of the minimax MSE criterion are more evident when we consider the bit error rate (BER) estimates. Numerical results show that the minimax MSE criterion yields a lower maximum BER and a lower average BER. We also observe that, with the minimax MSE criterion, some users do not transmit at full power. For comparison, with the minimum total MSE criterion, all users transmit at full power. In addition, we investigate robust joint transmitter and receiver power allocation where the channel state information (CSI) is not perfect. The CSI error is assumed to be unknown but bounded by a deterministic value. This problem is formulated as a semidefinite programming (SDP) problem with bilinear matrix inequality (BMI) constraints. Numerical results show that, with imperfect CSI, the minimax MSE criterion also outperforms the minimum total MSE criterion in terms of the maximum and average BERs.
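As a reading aid, the minimax MSE criterion described above can be written in a generic form. The symbols below are assumptions introduced for illustration and are not taken from the paper: K users, transmit power p_k with limit P_k^max, receiver-side variables collected in c, and an auxiliary bound t that turns the problem into its epigraph form, which is the form to which Lagrange multipliers and KKT conditions are usually applied.

```latex
\begin{aligned}
\min_{\mathbf{p},\,\mathbf{c}} \;\; & \max_{k \in \{1,\dots,K\}} \mathrm{MSE}_k(\mathbf{p},\mathbf{c})
  && \text{s.t. } 0 \le p_k \le P_k^{\max}, \quad k = 1,\dots,K \\[4pt]
\Longleftrightarrow \quad
\min_{\mathbf{p},\,\mathbf{c},\,t} \;\; & t
  && \text{s.t. } \mathrm{MSE}_k(\mathbf{p},\mathbf{c}) \le t, \quad 0 \le p_k \le P_k^{\max}, \quad k = 1,\dots,K
\end{aligned}
```

By contrast, the minimum total MSE criterion would replace the maximum over k with a sum over k under the same power limits.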
Yosuke TATEKURA Takeshi WATANABE
A robust multichannel sound reproduction system that utilizes the relationship between the width of the actual control area and the control frequency of the control points is proposed. The reproduction accuracy of a conventional sound reproduction system is reduced by room environment variations when fixed inverse filter coefficients are used. This tendency becomes more significant when control points are arranged more closely. To resolve this problem, the frequency control band at every control point is switched to avoid degrading the reproduced sound at low frequencies: the pass band of the control points at both ears covers only the high range, while that of the other control points covers the entire control range. Numerical simulation with real environmental data showed that the reproduction accuracy improves by about 6.1 dB on average, even with a temperature fluctuation of 5 °C as an environmental variation in the listening room.
This paper presents an algorithm for the robust watermarking of 3D polygonal mesh models. The proposed algorithm embeds the watermark into a 2D image extracted from the 3D model, rather than directly embedding it into the 3D geometry. The proposed embedding domain, i.e., the 2D image, is devised to be robust against attacks such as mesh simplification, which severely modify the vertices and connectivity while preserving the appearance of the model. The watermark-embedded model is obtained by using a simple vertex perturbation algorithm without iterative optimization. Two exemplary watermark applications using the proposed method are also presented: one embeds several bits into 3D models, and the other detects only the existence of a watermark. The experimental results show that the proposed algorithm is robust against similarity transforms, mesh simplification, additive Gaussian noise, quantization of vertex coordinates, and mesh smoothing, and that its computational complexity is lower than that of conventional methods.
Kyung-Su KIM Hae-Yeoun LEE Dong-Hyuck IM Heung-Kyu LEE
Commercial markets employ digital rights management (DRM) systems to protect valuable high-definition (HD) quality videos. DRM systems use watermarking to provide copyright protection and ownership authentication of multimedia content. We propose a real-time video watermarking scheme for HD video in the uncompressed domain. In particular, our approach takes a practical perspective and aims to satisfy perceptual quality, real-time processing, and robustness requirements. We simplify and optimize the human visual system mask for real-time performance and also apply a dithering technique for invisibility. Extensive experiments are performed to show that the proposed scheme satisfies the invisibility, real-time processing, and robustness requirements against video processing attacks. We concentrate on video processing attacks that commonly occur when HD quality videos are displayed on portable devices. These attacks include not only scaling and low bit-rate encoding, but also malicious attacks such as format conversion and frame rate change.
In this paper, signal processing techniques which can be applied to automatic speech recognition to improve its robustness are reviewed. The choice of signal processing techniques is strongly dependent on the scenario of the applications. The analysis of scenario and the choice of suitable signal processing techniques are shown through two examples.
Keiichi FUNAKI Tatsuhiko KINJO
Complex speech analysis for an analytic speech signal can accurately estimate the spectrum at low frequencies, since the analytic signal provides a spectrum only over positive frequencies. This remarkable feature makes it possible to realize more accurate F0 estimation using the complex residual signal extracted by complex-valued speech analysis. We have already proposed F0 estimation using the complex LPC residual, in which the autocorrelation function weighted by AMDF was adopted as the criterion. The method adopted MMSE-based complex LPC analysis, and it has been reported that it estimates F0 more accurately for IRS-filtered speech corrupted by white Gaussian noise, although it does not perform better for IRS-filtered speech corrupted by pink noise. In this paper, robust complex speech analysis based on the ELS (Extended Least Squares) method is introduced in order to overcome this drawback. The experimental results for additive white Gaussian or pink noise demonstrate that the proposed algorithm based on robust ELS-based complex AR analysis can perform better than other methods.
Longbiao WANG Seiichi NAKAGAWA Norihide KITAOKA
In a distant-talking environment, the length of the channel impulse response is longer than the short-term spectral analysis window. Conventional short-term spectrum based Cepstral Mean Normalization (CMN) is therefore not effective under these conditions. In this paper, we propose a robust speech recognition method that combines a short-term spectrum based CMN with a long-term one. We assume that a static speech segment (such as a vowel) affected by reverberation can be modeled by a long-term cepstral analysis. Thus, the effect of long reverberation on a static speech segment may be compensated by the long-term spectrum based CMN. The cepstral distance between neighboring frames is used to discriminate the static speech segment (long-term spectrum) from the non-static speech segment (short-term spectrum). The cepstra of the static and non-static speech segments are normalized by the corresponding cepstral means. In a previous study, we proposed an environmentally robust speech recognition method based on Position-Dependent CMN (PDCMN), which compensates for channel distortion depending on speaker position and is more efficient than conventional CMN. In this paper, the concept of combining short-term and long-term spectrum based CMN is extended to PDCMN. We call this Variable Term spectrum based PDCMN (VT-PDCMN). Since PDCMN/VT-PDCMN cannot normalize speaker variations, because a position-dependent cepstral mean contains the average speaker characteristics over all speakers, we also combine PDCMN/VT-PDCMN with conventional CMN in this study. We conducted experiments with the proposed method on limited-vocabulary (100 words) distant-talking isolated word recognition in a real environment. The proposed method achieved a relative error reduction rate of 60.9% over the conventional short-term spectrum based CMN and 30.6% over the short-term spectrum based PDCMN.
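The split between static and non-static frames described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the fixed distance threshold, the use of Euclidean distance, and the assumption that short-term and long-term cepstra are already aligned frame by frame are all hypothetical.

```python
import math

def subtract_mean(frames):
    """Plain CMN: subtract the per-coefficient mean taken over the given frames."""
    n = len(frames)
    mean = [sum(col) / n for col in zip(*frames)]
    return [[c - m for c, m in zip(frame, mean)] for frame in frames]

def static_mask(frames, threshold):
    """Mark a frame as static when its cepstral distance to the previous frame is small.

    The first frame has no predecessor, so it reuses the first distance (assumption).
    Requires at least two frames.
    """
    dists = [math.dist(a, b) for a, b in zip(frames, frames[1:])]
    dists = [dists[0]] + dists
    return [d < threshold for d in dists]

def variable_term_cmn(short_cep, long_cep, threshold):
    """Normalize static frames with long-term cepstra, the rest with short-term cepstra."""
    mask = static_mask(short_cep, threshold)
    static_frames = subtract_mean([f for f, s in zip(long_cep, mask) if s])
    moving_frames = subtract_mean([f for f, s in zip(short_cep, mask) if not s])
    static_iter, moving_iter = iter(static_frames), iter(moving_frames)
    normalized = [next(static_iter) if s else next(moving_iter) for s in mask]
    return normalized, mask

# Toy data: two nearly identical frames (static) followed by two changing frames.
short = [[1.0, 1.0], [1.0, 1.0], [5.0, 5.0], [9.0, 1.0]]
long_ = [[11.0, 11.0], [11.0, 11.0], [15.0, 15.0], [19.0, 11.0]]
normalized, mask = variable_term_cmn(short, long_, threshold=1.0)
```

Each group is normalized by its own cepstral mean, which mirrors the statement that static and non-static segments use the corresponding means. A position-dependent variant would replace these means with ones estimated beforehand for each speaker position.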
Xin XU Noboru HAYASAKA Yoshikazu MIYANAGA
This paper proposes a new algorithm named Adaptive Running Spectrum Filtering (ARSF) to restore the amplitude spectra of speech corrupted by additive noise. Based on a prior noise estimate, adaptive filtering is applied to the speech modulation spectra according to the noise conditions. The periodic structures in the amplitude spectra are preserved against noise distortion. Since the amplitude spectral structures contain information on the fundamental frequency, which is the inverse of the pitch period, the ARSF algorithm is incorporated into robust pitch detection to increase its accuracy. Experimental results show that, compared with conventional methods, the proposed method significantly improves the robustness of pitch detection under several noise types and SNRs.
Nari TANABE Toshihiro FURUKAWA Shigeo TSUJII
We propose a noise suppression algorithm based on Kalman filter theory. The algorithm aims to achieve robust noise suppression for additive white and colored disturbances using canonical state space models with (i) a state equation composed of the speech signal and (ii) an observation equation composed of the speech signal and additive noise. The remarkable features of the proposed algorithm are that (1) it is applicable to additive white and colored noise, where babble noise is used as the additive colored noise, and (2) it realizes high-performance noise suppression without sacrificing the quality of the speech signal, despite using only the Kalman filter algorithm, whereas many conventional methods based on Kalman filter theory perform noise suppression using both a parameter estimation algorithm for the AR (auto-regressive) system and the Kalman filter algorithm. We show the effectiveness of the proposed method, which applies Kalman filter theory to the proposed canonical state space model with a colored driving source, using numerical results and subjective evaluation results.
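For readers unfamiliar with the predict/update cycle, a scalar Kalman filter on a generic state space model can be sketched as below. This is not the canonical model proposed in the abstract: the first-order state equation x[k] = a*x[k-1] + w[k], the observation y[k] = x[k] + v[k], and all parameter values (a, q, r, x0, p0) are assumptions chosen only to show the recursion.

```python
def kalman_denoise(observations, a=1.0, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[k] = a*x[k-1] + w[k], y[k] = x[k] + v[k].

    q is the assumed driving-noise variance and r the assumed observation-noise variance.
    Returns the filtered state estimate for every observation.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Predict: propagate the state estimate and its error variance.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update: blend the prediction with the new observation via the Kalman gain.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates

# A constant signal observed without noise: the estimate should approach 1.0.
estimates = kalman_denoise([1.0] * 200)
```

With q much smaller than r, the gain settles to a small value, so the filter averages over many observations and suppresses observation noise at the cost of slower tracking.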
Satoshi KOBASHIKAWA Satoshi TAKAHASHI
Users require speech recognition systems that offer rapid response and high accuracy concurrently. Speech recognition accuracy is degraded by additive noise, imposed by ambient noise, and convolutional noise, created by space transfer characteristics, especially in distant-talking situations. Against each type of noise, existing model adaptation techniques achieve robustness by using HMM composition and CMN (cepstral mean normalization). Since they need an additive noise sample as well as a user speech sample to generate the required models, they cannot achieve rapid response, though it may be possible to capture just the additive noise in a previous step. In the previous step, the technique proposed herein uses just the additive noise to generate an adapted and normalized model against both types of noise. When the user's speech sample is captured, only online CMN needs to be performed to start the recognition processing, so the technique offers rapid response. In addition, to cover the unpredictable S/N values possible in real applications, the technique creates HMMs for several S/N values. Simulations using artificial speech data show that the proposed technique increased the character correct rate by 11.62% compared to CMN.
Ilmu BYUN Hae Gwang HWANG Young Jin SANG Kwang Soon KIM
Various space-time code (STC) designs have been proposed to obtain full diversity at full rate in multiple-input multiple-output (MIMO) channels for uncoded systems. However, commercial wireless systems typically employ powerful channel codes such as turbo codes and low-density parity-check (LDPC) codes together with an STC. For these applications, an STC optimized for uncoded systems may not provide the best performance. In this paper, an STC with relatively good performance over a wide range of code rates is proposed. Simulation results show that the performance of the proposed robust STC is very close to the best performance of the SM and the Golden code at various code rates.
Chunsheng HUA Qian CHEN Haiyuan WU Toshikazu WADA
This paper presents an RK-means clustering algorithm, developed for reliable data grouping by introducing a new reliability evaluation into the K-means clustering algorithm. The conventional K-means clustering algorithm has two shortfalls: 1) the clustering result becomes unreliable if the assumed number of clusters is incorrect; 2) during the update of a cluster center, all the data points belonging to that cluster are used equally, without considering how distant they are from the cluster center. In this paper, we introduce a new reliability evaluation into the K-means clustering algorithm by considering the triangular relationship among each data point and its two nearest cluster centers. We applied the proposed algorithm to tracking objects in video sequences and confirmed its effectiveness and advantages.
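One way to picture a reliability-weighted center update is sketched below. The abstract does not give the reliability formula, so the measure used here, (d2 - d1) / d12 built from the triangle formed by a point and its two nearest centers, is a hypothetical stand-in and not the RK-means definition. The function names are likewise illustrative.

```python
import math

def reliability(point, nearest, second):
    """Hypothetical reliability in [0, 1] from the point/two-centers triangle.

    d1 and d2 are the distances to the nearest and second-nearest centers, d12 the
    distance between those centers. By the triangle inequality (d2 - d1) <= d12,
    so the value is 0 for a point equidistant from both centers and approaches 1
    for a point that clearly belongs to the nearest one.
    """
    d1, d2 = math.dist(point, nearest), math.dist(point, second)
    d12 = math.dist(nearest, second)
    return (d2 - d1) / d12 if d12 > 0 else 1.0

def weighted_center_update(points, centers):
    """One K-means update step where each point is weighted by its reliability.

    Requires at least two centers. A center with zero total weight is left unchanged.
    """
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    weights = [0.0] * len(centers)
    for p in points:
        order = sorted(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        near, second = order[0], order[1]
        w = reliability(p, centers[near], centers[second])
        weights[near] += w
        for j, value in enumerate(p):
            sums[near][j] += w * value
    return [
        [s / weights[i] for s in sums[i]] if weights[i] > 0 else list(centers[i])
        for i in range(len(centers))
    ]

centers = [[0.0, 0.0], [10.0, 0.0]]
points = [[1.0, 0.0], [5.0, 0.0], [9.0, 0.0]]
new_centers = weighted_center_update(points, centers)
```

In this toy case the ambiguous midpoint at (5, 0) receives zero weight and does not pull either center, whereas plain K-means would average it into one of the clusters.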
Shouhei KIDERA Takuya SAKAMOTO Toru SATO
Target shape estimation with UWB pulse radars is a promising imaging technique for household robots. We have already proposed a fast imaging algorithm, SEABED, that is based on a reversible transform BST (Boundary Scattering Transform) between the received signals and the target shape. However, the target image obtained by SEABED deteriorates in a noisy environment because it utilizes a derivative of received data. In this paper, we propose a robust imaging method with an envelope of circles. We clarify by numerical simulation that the proposed method can realize a level of robust and fast imaging that cannot be achieved by the original SEABED.
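The idea of imaging with an envelope of circles can be illustrated geometrically. The sketch below rests on several assumptions that the abstract does not state: a single antenna that transmits and receives while scanning along the line z = 0, a convex target in z > 0, and noise-free range measurements. Each measurement (X, R) then defines a circle of radius R centered at (X, 0) that touches the target, and the boundary is taken as the upper envelope of those circles. The function name and the circular test target are illustrative.

```python
import math

def envelope_of_circles(antenna_xs, ranges, x):
    """Height at abscissa x of the upper envelope of circles centered at (X, 0).

    Returns None when no circle covers x. No derivative of the measured ranges
    is needed, which is the property that makes an envelope attractive in noise.
    """
    heights = []
    for cx, r in zip(antenna_xs, ranges):
        inside = r * r - (x - cx) ** 2
        if inside >= 0.0:
            heights.append(math.sqrt(inside))
    return max(heights) if heights else None

# Synthetic convex target: a circle of radius 0.5 centered at (0, 2).
antenna_xs = [i * 0.05 - 2.0 for i in range(81)]          # scan from -2 to 2
ranges = [math.hypot(cx, 2.0) - 0.5 for cx in antenna_xs]  # ideal ranges to the target

z_at_0 = envelope_of_circles(antenna_xs, ranges, 0.0)   # true boundary height: 1.5
z_at_03 = envelope_of_circles(antenna_xs, ranges, 0.3)  # true boundary height: 1.6
```

Because the estimate is a maximum over circles rather than a function of the derivative of the range curve, small range errors shift the envelope slightly instead of being amplified.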
To design a controller with a block-diagonal structure for trajectory sensitivity minimization, we propose a method based on linear matrix inequalities (LMIs). In order to reduce the trajectory sensitivity, linear quadratic regulator theory is adopted, and the resulting problem is solved using an LMI optimization technique.
A new quadruple watermarking scheme for digital images that resists geometrical attacks is proposed in this letter. We treat the center and the four vertices of the original image as reference points and embed the same watermark four times by means of polar coordinates, which are geometrically invariant. The center of an image is assumed not to be removed by rotation, scaling, or local distortion in general practical image processing. In the watermark extraction process, the vertices of the image are found by a searching method, and watermark synchronization is thus obtained. Experimental results show that the scheme is robust to geometrical distortions including rotation, scaling, cropping, and local distortions.
Shouhei KIDERA Takuya SAKAMOTO Toru SATO
UWB pulse radars enable us to measure a target location with high range resolution, and so are applicable to measurement systems for robots and automobiles. We have already proposed a robust and fast imaging algorithm with an envelope of circles, which is suitable for these applications. In this method, we determine time delays from received signals with a filter matched to the transmitted waveform. However, scattered waveforms differ from the transmitted one depending on the target shape. Therefore, the resolution of the target edges deteriorates due to these waveform distortions. In this paper, a high-resolution imaging algorithm for convex targets is proposed that iterates shape and waveform estimation. We show application examples with numerical simulations and experiments, and confirm its capability to detect the edges of an object.
Suehiro SHIMAUCHI Yoichi HANEDA Akitoshi KATAOKA Akinori NISHIHARA
We propose a gradient-limited affine projection algorithm (GL-APA), which can achieve fast and double-talk-robust convergence in acoustic echo cancellation. GL-APA is derived from the M-estimation-based nonlinear cost function extended for evaluating multiple error signals dealt with in the affine projection algorithm (APA). By considering the nonlinearity of the gradient, we carefully formulate an update equation consistent with multiple input-output relationships, which the conventional APA inherently satisfies to achieve fast convergence. We also newly introduce a scaling rule for the nonlinearity, so we can easily implement GL-APA by using a predetermined primary function as a basis of scaling with any projection order. This guarantees a linkage between GL-APA and the gradient-limited normalized least-mean-squares algorithm (GL-NLMS), which is a conventional algorithm that corresponds to the GL-APA of the first order. The performance of GL-APA is demonstrated with simulation results.
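To make the notion of a limited gradient concrete, the sketch below shows a normalized LMS update whose error term is clipped before it drives the coefficient change. This is only one simple way to bound the gradient: the abstract does not specify the nonlinearity or the scaling rule of GL-NLMS or GL-APA, so the hard clipping, the function name, and the parameters mu, limit, and eps are assumptions.

```python
def clipped_nlms_step(weights, x, desired, mu=0.5, limit=1.0, eps=1e-8):
    """One NLMS update with the error clipped to [-limit, limit] (assumed nonlinearity).

    weights: current filter coefficients
    x:       most recent reference input samples, same length as weights
    desired: microphone sample containing the echo (plus any near-end speech)
    Returns the updated coefficients and the unclipped error.
    """
    estimate = sum(w * xi for w, xi in zip(weights, x))
    error = desired - estimate
    # Bound the error so a burst of near-end speech cannot cause a large update.
    limited = max(-limit, min(limit, error))
    power = sum(xi * xi for xi in x) + eps
    new_weights = [w + mu * limited * xi / power for w, xi in zip(weights, x)]
    return new_weights, error

# Large error (e.g. double talk): the update is bounded by the limit.
w_big, e_big = clipped_nlms_step([0.0, 0.0], [1.0, 0.0], desired=10.0, mu=1.0, limit=1.0)
# Small error: the update is the ordinary NLMS step.
w_small, e_small = clipped_nlms_step([0.0, 0.0], [1.0, 0.0], desired=0.5, mu=1.0, limit=1.0)
```

An affine projection version would apply a comparable bound to a vector of several recent errors rather than a single one, which is where a consistent scaling rule across projection orders becomes necessary.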