This paper presents a new statistical model-based voice activity detection (VAD) algorithm in the wavelet domain to improve performance in non-stationary environments. Owing to their efficient time-frequency localization and multi-resolution characteristics, wavelet transforms are well suited to processing non-stationary signals such as speech. To exploit the fact that the wavelet packet is a very efficient approximation of the discrete Fourier transform and has built-in de-noising capability, we first apply wavelet packet decomposition to effectively localize the energy in frequency space, use spectral subtraction, and employ matched filtering to enhance the SNR. Since conventional wavelet-based spectral subtraction eliminates the low-power speech signal in onset and offset regions and generates musical noise, we derive an improved multi-band spectral subtraction. Furthermore, noting that a fixed threshold cannot follow fluctuations of time-varying noise power, and that this inability to adapt to a time-varying environment severely limits VAD performance, we propose a statistical model-based VAD algorithm in the wavelet domain with an adaptive threshold. We perform extensive computer simulations and compare against conventional algorithms to demonstrate the performance improvement of the proposed algorithm under various noise environments.
Hyunchul KU Kang-Yoon LEE Young Beom KIM
This paper investigates the limitations of adjacent channel power ratio (ACPR) improvement in a predistortion (pre-D) linearizer used with nonlinear RF power amplifiers (PAs) when the PA model is not perfectly acquired in the pre-D design. The error between the physical PA and the nonlinear model is expanded by the pre-D function, and its power spectral density (PSD) acts as a limit on the ACPR improvement of the pre-D linearizer. An analytical estimate of the ACPR limitations in RF PAs driven by a digitally modulated input signal is derived using a formulation of the autocorrelation function. The analysis technique is validated with the example of a memory polynomial PA model with a quasi-memoryless pre-D linearizer. The technique is also verified by comparing the predicted ACPR limitation with the measured limitation for an RF PA with an 802.11g input signal.
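As a reminder of the quantity being bounded, ACPR can be read off a sampled PSD as the ratio of adjacent-channel power to main-channel power. The sketch below is only an illustration of that definition; the function names, the rectangle-rule integration, and the toy PSD are assumptions and are not taken from the paper.

```python
import math

def band_power(freqs, psd, f_lo, f_hi):
    """Integrate a uniformly sampled PSD over [f_lo, f_hi) with a rectangle rule."""
    df = freqs[1] - freqs[0]  # uniform bin spacing is assumed
    return sum(p * df for f, p in zip(freqs, psd) if f_lo <= f < f_hi)

def acpr_db(freqs, psd, main_band, adj_band):
    """Adjacent channel power ratio in dB: adjacent-band power over main-band power."""
    p_main = band_power(freqs, psd, *main_band)
    p_adj = band_power(freqs, psd, *adj_band)
    return 10.0 * math.log10(p_adj / p_main)

# Toy PSD (hypothetical numbers): flat main channel with a flat residual skirt
# that is 30 dB lower in the adjacent channel.
freqs = [float(f) for f in range(-30, 30)]
psd = [1.0 if -10 <= f < 10 else 0.001 for f in freqs]
value = acpr_db(freqs, psd, (-10.0, 10.0), (10.0, 30.0))
```

In this picture, the PSD of the modeling error described above would sit in the adjacent band and set a floor below which `acpr_db` cannot be pushed by the linearizer.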
Tran Huy DAT Kazuya TAKEDA Fumitada ITAKURA
This study shows the effectiveness of using a gamma distribution in the speech power domain as a more general prior distribution for model-based speech enhancement approaches. This model is a superset of the conventional Gaussian model of the complex spectrum and provides more accurate prior modeling when the optimal parameters are estimated. We develop a method to adapt the modeled distribution parameters to the actual noisy speech in a frame-by-frame manner. Next, we derive and investigate the minimum mean square error (MMSE) and maximum a posteriori (MAP) estimations in different domains of speech spectral magnitude, generalized power, and its logarithm, using the proposed gamma modeling. Finally, a comparative evaluation of the MAP and MMSE filters is conducted. While the MMSE estimations tend to become more complicated with more general prior distributions, the MAP estimations are given in closed form and are therefore suitable for implementation. The adaptive estimation of the modeled distribution parameters provides more accurate prior modeling; this is the principal merit of the proposed method and the reason for its better performance. Based on the experiments, the MAP estimation is recommended for its high efficiency and low complexity. Among the MAP-based systems, the estimation in the log-magnitude domain is shown to be the best for speech recognition, whereas the estimation in the power domain is superior for noise reduction.
Hossein SHAMSI Omid SHOAEI Roghayeh DOOST
In this paper, using an exact analytic approach, the clock jitter in the feedback path of continuous-time Delta-Sigma modulators (CT DSMs) is modeled as an additive jitter noise, providing a time-invariant model for a jittery CT DSM. Then, for various DAC waveforms, the power spectral density (PSD) of the clock jitter at the output of the DAC is derived, and by using an approximation the in-band power of the clock jitter at the output of the modulator is extracted. The simplicity and generality of the proposed approach are the main advantages of this paper. MATLAB and HSPICE simulation results confirm the validity of the proposed formulas.
Khalid Mahmood AAMIR Mohammad Ali MAUD Arif ZAMAN Asim LOAN
When the signal is Gaussian and the data are noisy, the power spectral density (PSD) computed by taking the Fourier transform of the autocorrelation function (Wiener-Khintchine theorem) gives better results than the periodogram approach. However, the computational complexity of the Wiener-Khintchine approach is higher than that of the periodogram approach. For the computation of the short-time Fourier transform (STFT), this problem becomes even more prominent, because the PSD must be recomputed after every shift of the analysis window. This paper presents a recursive form of the PSD to reduce the complexity. If the signal is not Gaussian, the PSD approach is insufficient and we estimate the higher-order spectra of the signal. Estimation of higher-order spectra is even more time-consuming. In this paper, a recursive version for computation of the bispectrum is presented as well. The computational complexities of the PSD and the bispectrum for a window size of N are O(N) and O(N^2), respectively.
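To make the idea of a recursive update concrete, the sketch below slides a length-n window by one sample and updates each autocorrelation lag sum in constant time, so all lags cost O(n) per shift instead of O(n^2). This is a generic sliding-window identity offered as an illustration; it is not claimed to be the recursion derived in the paper, and the function names are hypothetical.

```python
def autocorr_direct(x, start, n, max_lag):
    """Unnormalized lag sums r[k] = sum x[i] * x[i + k] inside window [start, start + n)."""
    return [sum(x[i] * x[i + k] for i in range(start, start + n - k))
            for k in range(max_lag + 1)]

def autocorr_slide(r, x, start, n):
    """Move the window from [start, start + n) to [start + 1, start + n + 1).

    Each lag drops the product that leaves the window and adds the one that enters it.
    """
    return [rk - x[start] * x[start + k] + x[start + n - k] * x[start + n]
            for k, rk in enumerate(r)]

x = [0.5, -1.0, 2.0, 1.5, -0.5, 3.0, 0.25, -2.0]
n, max_lag = 4, 3
r = autocorr_direct(x, 0, n, max_lag)
for start in range(len(x) - n):
    r = autocorr_slide(r, x, start, n)  # O(1) work per lag
final_direct = autocorr_direct(x, len(x) - n, n, max_lag)
```

A PSD estimate would then follow from a Fourier transform of the updated lag sums, as in the Wiener-Khintchine approach described above.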
We investigate the impact of symbol rate control, modulation level control, and the number of hops on the area spectral efficiency of interference-limited multihop radio networks. By controlling symbol rate and modulation level, the data rate can be adapted according to received power. In addition, varying the number of hops can control received power. First, we evaluate the achievable end-to-end throughput of multihop transmission assuming symbol rate and modulation level control. Numerical results reveal that by controlling symbol rate or using multihop transmission, the end-to-end communication range can be extended at the cost of end-to-end throughput, and this may result in lower area spectral efficiency. Next, an expression for the area spectral efficiency of multihop radio networks is derived as a function of the number of hops and the end-to-end throughput. Numerical results also reveal that the resulting area spectral efficiency depends on the specific circumstances but can be increased only by using multihop transmission.
A method of evaluating the wavelength filter spectrum response is introduced. The increase of the crosstalk level due to the filtering and the relation between the total crosstalk and the spectral efficiency are derived in detail using the Gaussian filter. Since this method can be applied to various kinds of filter spectrum responses, the ultimate spectral efficiencies of filters are compared. In this comparison, the problem of the box-like filter, which has been considered to be desirable, is revealed, and this is improved by cascading the filter spectrum. The requirement on the rejection floor that inheres in the filter is also made clear.
Sangwon KANG Yongwon SHIN Changyong SON Thomas R. FISCHER
A fast encoding technique is described for vector quantization (VQ) of line spectral frequency parameters. A reduction in VQ encoding complexity is achieved by using a preliminary test that reduces the necessary codebook search range. The test is performed based on two criteria. One criterion uses the distance between a specific single element of the input vector and the corresponding element of the codevectors in the codebook. The other criterion makes use of the ordering property of LSF parameters. The fast encoding technique is implemented in the enhanced variable rate codec (EVRC) encoding algorithm. Simulation results show that the average searching range of the codebook can be reduced by 44.50% for the EVRC without degradation of spectral distortion (SD).
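The first criterion can be illustrated with a single-element pre-test: if the squared difference in one chosen element already reaches the best distance found so far, the full distance cannot be smaller, so the codevector is skipped. The following is a minimal sketch under that assumption; the element index `k`, the toy codebook, and the function names are hypothetical, and the second criterion (the LSF ordering property) is not implemented here.

```python
def full_search(x, codebook):
    """Exhaustive nearest-codevector search by squared Euclidean distance."""
    best_i, best_d = -1, float("inf")
    for i, c in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if d < best_d:
            best_i, best_d = i, d
    return best_i

def fast_search(x, codebook, k=0):
    """Same result as full_search, but skips codevectors rejected by element k."""
    best_i, best_d, full_evals = -1, float("inf"), 0
    for i, c in enumerate(codebook):
        # One-element lower bound on the full distance: cheap rejection test.
        if (x[k] - c[k]) ** 2 >= best_d:
            continue
        full_evals += 1
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, full_evals

# Toy LSF-like codebook (each vector is in ascending order); values are made up.
codebook = [
    [0.10, 0.25, 0.40],
    [0.12, 0.30, 0.55],
    [0.30, 0.45, 0.60],
    [0.50, 0.65, 0.80],
    [0.70, 0.80, 0.90],
]
x = [0.11, 0.27, 0.45]
index, full_evals = fast_search(x, codebook)
```

Because the pre-test is a lower bound on the full distance, the selected index is unchanged, which is consistent with the claim of no added spectral distortion; only the number of full distance evaluations drops.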
Osamu ICHIKAWA Masafumi NISHIMURA
Recently, automatic speech recognition in a car has practical uses for applications like car-navigation and hands-free telephone dialers. For noise robustness, the current successes are based on the assumption that there is only a stationary cruising noise. Therefore, the recognition rate is greatly reduced when there is music or news coming from a radio or a CD player in the car. Since reference signals are available from such in-vehicle units, there is great hope that echo cancellers can eliminate the echo component in the observed noisy signals. However, previous research reported that the performance of an echo canceller is degraded in very noisy conditions. This implies it is desirable to combine the processes of echo cancellation and noise reduction. In this paper, we propose a system that uses echo cancellation and spectral subtraction simultaneously. A stationary noise component for spectral subtraction is estimated through the adaptation of an echo canceller. In our experiments, this system significantly reduced the errors in automatic speech recognition compared with the conventional combination of echo cancellation and spectral subtraction.
Takahiro MURAKAMI Tetsuya HOYA Yoshihisa ISHIDA
This paper presents a novel algorithm for spectral subtraction (SS). The method is derived from a relation between the spectrum obtained by the discrete Fourier transform (DFT) and that obtained by a subspace decomposition method. Using this relation, it is shown that a noise reduction algorithm based on subspace decomposition leads to an SS method in which noise components in an observed signal are eliminated by subtracting the variance of the noise process in the frequency domain. Moreover, it is shown that the method can significantly reduce computational complexity in comparison with the method based on the standard subspace decomposition. As in conventional SS methods, our method also exploits the variance of the noise process estimated from a preceding segment where speech is absent but noise is present. In order to detect such non-speech segments more reliably, a novel robust voice activity detector (VAD) is then proposed. The VAD utilizes the spread of eigenvalues of an autocorrelation matrix corresponding to the observed signal. Simulation results show that the proposed method yields improved enhancement quality in comparison with conventional SS-based schemes.
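For orientation, conventional power spectral subtraction, the baseline the abstract compares against, can be sketched as follows: the noise power per frequency bin is averaged over non-speech frames and subtracted from each noisy frame, with a spectral floor to avoid negative power. This is a generic textbook form and not the subspace-derived method of the paper; the function names and the floor value are assumptions.

```python
def estimate_noise_power(noise_frames):
    """Average per-bin power over frames assumed to contain noise only."""
    n_frames = len(noise_frames)
    n_bins = len(noise_frames[0])
    return [sum(frame[b] for frame in noise_frames) / n_frames for b in range(n_bins)]

def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract the noise power per bin, keeping at least floor * noisy power."""
    return [max(p - n, floor * p) for p, n in zip(noisy_power, noise_power)]

# Hypothetical two-bin power spectra.
noise_est = estimate_noise_power([[1.0, 2.0], [3.0, 2.0]])
enhanced = spectral_subtract([10.0, 1.0], noise_est)
```

The quality of the result depends on which frames are labeled as non-speech, which is why the abstract pairs the subtraction with a dedicated VAD.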
We propose a novel approach based on wavelet decomposition for progressive full spectral rendering. In the fourth progressive stage, our method renders an image that is 95% similar to the result of the final non-progressive approach but requires less than 70% of the execution time. The rendered image is visually plausible and indistinguishable from that of the non-progressive method. Our approach is graceful, efficient, progressive, and flexible for full spectral rendering.
Masahiro OGUSU Kazuhiko IDE Shigeru OHSHIMA
An inverse-RZ modulation scheme for dense WDM systems is proposed. Inverse-RZ signals have tolerances to chromatic dispersion and optical bandwidth limitation. The strongly pre-filtered inverse-RZ signals can be adapted to ultra-dense WDM systems, in which the spectral efficiencies are over 1.0 b/s/Hz. We have confirmed the error-free transmission of pre-filtered and co-polarized 40-Gb/s inverse-RZ signals where the channel intervals were 37.5 GHz.
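The quoted spectral efficiency can be checked from the reported figures. As a rough worked example, assuming each 37.5 GHz channel slot carries one 40-Gb/s signal and ignoring any coding overhead (the symbols R_b and Δf are introduced here for illustration):

```latex
\eta = \frac{R_b}{\Delta f}
     = \frac{40\ \mathrm{Gb/s}}{37.5\ \mathrm{GHz}}
     \approx 1.07\ \mathrm{b/s/Hz}
```

This is consistent with the stated spectral efficiency of over 1.0 b/s/Hz.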
Hyuk-Choon KWON Won-Seok JANG Sang-Kook HAN
We have proposed and experimentally demonstrated a novel WDM-PON downstream optical link. It is composed of a wavelength-locked FP-LD, with a spectrally sliced FP-LD as an external-injection optical source, and is directly modulated as a downstream-traffic transmitter. Downstream transmissions at 622 Mbps and 2.5 Gbps were performed for four channels over 25 km. The proposed WDM-PON downstream transmitter can be expanded up to eight channels by controlling the external-injection optical source, a spectrally sliced FP-LD. The transmitter also supports multi-channel selection by temperature control. We verified the potential of the transmitter in a WDM-PON optical link.
Yuki HASEGAWA Shigehiro ASADA Teruaki KATSUBE Tohru IKEGUCHI
Some plants have air purification ability. This purification ability of plants is considered a promising method for indoor air purification because of the low cost and high purification performance. Therefore, several studies have been carried out to investigate the relationship between the air purification ability of plants and environmental conditions. Nevertheless, the purification mechanism and process have not been clarified yet. In this paper, we investigated the air purification process in plants by bioelectrical potential analysis using linear and nonlinear analysis methods. First, we showed that two types of plants have a high air purification ability: Schefflera and Boston fern. Next, we measured the AC bioelectrical potential during the purifying process of plants for pollutant gas. Then, we evaluated the power spectra of time series data of the bioelectrical potential. We found that the power spectra shifted to a lower level after gas injection over all frequencies. Thus, the higher power spectrum came from possible higher physiological activities of the plant. Finally, we introduced a nonlinear analysis method from dynamical systems theory. We transformed the time series data of the potential to a higher-dimensional state space using delay coordinates, which are often used in the field of nonlinear time series analysis. The results show that the orbits in the reconstructed state space have a large variation during gas injection. These experimental results suggest that the measurement of bioelectrical potential could become a useful method for evaluating the air purification ability of plants.
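The delay-coordinate transformation mentioned above maps a scalar time series to points in a higher-dimensional state space by stacking lagged samples. A minimal sketch follows; the embedding dimension, the delay, and the toy series are placeholders, since the abstract does not state the values used.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    n_points = len(series) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for this dim and tau")
    return [tuple(series[i + j * tau] for j in range(dim)) for i in range(n_points)]

# Toy series standing in for sampled bioelectrical potential (hypothetical values).
orbit = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
```

The spread of the resulting points, for example before and during gas injection, is the kind of orbit variation the abstract refers to.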
Masaki OKAMOTO Yoshihiro INOUE Koichi YOSHIHARA Toshio KAWAHARA Jun MORIMOTO
Photoacoustic (PA) spectra of 3,4,9,10-perylenetetracarboxylic dianhydride (PTCDA) films deposited by vacuum evaporation were measured. The films have layered structures built from the planar structure of the perylene molecule. The crystal quality depended on the substrate, and photoacoustic spectroscopy (PAS) appears to be a very useful tool for evaluating these properties from the non-radiative features. The films deposited on the three different substrates had almost the same PL spectra, but the films deposited on the glass substrate had large non-radiative peaks in the PA spectra, in contrast to the films deposited on alumina or crystalline Si (100), whose non-radiative peaks were observed only in the short-wavelength region.
Kenichi YOSHIDA Kazuyuki TAKAGI Kazuhiko OZEKI
This paper is concerned with improving noise-robustness of a multi-SNR multi-band speaker identification system by introducing automatic adjustment of subband likelihood recombination weights. The adjustment is performed on the basis of subband power calculated from the noise observed just before the speech starts in the input signal. To evaluate the noise-robustness of this system, text-independent speaker identification experiments were conducted on speech data corrupted with noises recorded in five environments: "bus," "car," "office," "lobby," and "restaurant". It was found that the present method reduces the identification error by 15.9% compared with the multi-SNR multi-band method with equal recombination weights at 0 dB SNR. The performance of the present method was compared with a clean fullband method in which a speaker model training is performed on clean speech data, and spectral subtraction is applied to the input signal in the speaker identification stage. When the clean fullband method without spectral subtraction is taken as a baseline, the multi-SNR multi-band method with automatic adjustment of recombination weights attained 56.8% error reduction on average, while the average error reduction rate of the clean fullband method with spectral subtraction was 11.4% at 0 dB SNR.
Jen-Fa HUANG Yao-Tang CHANG Song-Ming LIN
Spectral-amplitude coding (SAC) techniques in fiber-Bragg-grating (FBG)-based optical code-division multiple-access (OCDMA) systems were investigated in our previous work. This paper adopts the same network architecture to investigate the simultaneous reduction of multiple-access interference (MAI) and optical beat interference (OBI). The MAI is caused by overlapping wavelengths from undesired network coder/decoders (codecs), while the OBI is induced by the interaction of simultaneous chips at adjacent gratings. It is proposed that MAI and OBI reductions may be obtained by use of: 1) a source spectrum that is divided into equal chip spacing; 2) coded FBGs characterized by approximately the same number of "0" and "1" code elements; and 3) spectrally balanced photo-detectors. With quasi-orthogonal Walsh-Hadamard coded FBGs, complementary spectral chips are employed as signal pairs to be recombined and detected in balanced photo-detectors, thus achieving simultaneous suppression of both MAI and OBI. Simulation results showed that the proposed OCDMA spectral-amplitude coding scheme achieves significant MAI and OBI reductions.
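The MAI cancellation by balanced detection can be illustrated with idealized unipolar Walsh-Hadamard codes: correlating the received chip powers with a code and with its complement, then taking the difference, leaves a nonzero output only when that code's user is active. The sketch below assumes ideal, equal-power chips and incoherent power addition; it does not model OBI or the FBG hardware, and all names are hypothetical.

```python
def hadamard(order):
    """Sylvester construction of a +1/-1 Hadamard matrix; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def unipolar_codes(order):
    """Drop the all-ones row and map +1 -> 1 (chip on), -1 -> 0 (chip off)."""
    return [[(v + 1) // 2 for v in row] for row in hadamard(order)[1:]]

def balanced_detect(received, code):
    """Difference between the code-branch and complement-branch photocurrents."""
    direct = sum(r * c for r, c in zip(received, code))
    complement = sum(r * (1 - c) for r, c in zip(received, code))
    return direct - complement

codes = unipolar_codes(8)   # 7 codes, each with 4 of 8 chips on
active = [0, 2, 5]          # indices of users currently sending a "1"
received = [sum(codes[u][i] for u in active) for i in range(8)]
outputs = [balanced_detect(received, c) for c in codes]
```

Each code here has equal numbers of "0" and "1" chips, matching condition 2) above, and any interfering code overlaps a decoder's code and its complement equally, which is why the two branches cancel.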
Gen-ichiro OHTA Mitsuru UESUGI Takuro SATO Hideyoshi TOMINAGA
This paper proposes a new SSB-QPSK modulation/demodulation method. The present method multiplexes the USB (Upper Side Band) and LSB (Lower Side Band) of a QPSK-modulated SSB (Single Side Band) on the same SSB complex frequency band. The present method thus achieves 2 bit/s/Hz. This method is an orthogonal SSB-QPSK method, because the multiplexed signals are orthogonal to each other. The demodulator consists of two SSB demodulators. A simulation result under AWGN conditions shows that the proposed method has better BER (Bit Error Rate) performance than 16 QAM. The degradation of BER in comparison with QPSK is less than 0.2 dB in Eb/No (bit-energy-to-noise-power ratio). In a fading/Doppler environment, the BER performance of the orthogonal SSB-QPSK is the same as that of QPSK.
Hideaki WAKABAYASHI Jiro YAMAKITA Masamitsu ASAI Hiroshi INAI
The problem of scattering by metallic gratings has become one of the fundamental problems in electromagnetics. In this paper, a thin metallic grating placed in conical mounting is treated as a lossy dielectric grating expressed by complex permittivity and thickness. The solution for the metallic grating obtained by matrix eigenvalue calculations is compared with that for the plane grating obtained by the resistive boundary condition and the spectral Galerkin procedure, and the applicability of the resistive boundary condition to thin metallic gratings in conical mounting is investigated. In order to improve the convergence of the solutions for thin metallic gratings, the spatial harmonics of flux densities, which are continuous functions, are used instead of those of the electromagnetic fields.
Ryo MUKAI Hiroshi SAWADA Shoko ARAKI Shoji MAKINO
This paper describes a real-time blind source separation (BSS) method for moving speech signals in a room. Our method employs frequency-domain independent component analysis (ICA) using a blockwise batch algorithm in the first stage, and the separated signals are refined by postprocessing using crosstalk component estimation and non-stationary spectral subtraction in the second stage. The blockwise batch algorithm achieves better performance than an online algorithm when sources are fixed, and the postprocessing compensates for performance degradation caused by source movement. Experimental results using speech signals recorded in a real room show that the proposed method realizes robust real-time separation for moving sources. Our method is implemented on a standard PC and runs in real time.