Xian-Hua HAN Yen-Wei CHEN Zensho NAKAO
We propose a robust edge detection method based on independent component analysis (ICA). It is known that most of the basis functions extracted from natural images by ICA are sparse and similar to localized and oriented receptive fields, and in the proposed edge detection method, a target image is first transformed by ICA basis functions and then the edges are detected or reconstructed with sparse components only. Furthermore, by applying a shrinkage algorithm to filter out the components of noise in the ICA domain, we can readily obtain the sparse components of the original image, resulting in a kind of robust edge detection even for a noisy image with a very low SN ratio. The efficiency of the proposed method is demonstrated by experiments with some natural images.
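The shrinkage step in the ICA domain can be illustrated with a simple soft-threshold rule. This is only a minimal sketch: the abstract does not say which shrinkage function is used, so soft thresholding is a stand-in, and the function name `soft_shrink` and the threshold value are hypothetical.

```python
def soft_shrink(coeffs, threshold):
    """Soft-threshold a list of transform-domain coefficients.

    Coefficients with magnitude at or below `threshold` are treated as
    noise and set to zero; larger ones are pulled toward zero by
    `threshold`. (Assumed stand-in for the paper's shrinkage rule.)
    """
    out = []
    for c in coeffs:
        if abs(c) <= threshold:
            out.append(0.0)
        elif c > 0:
            out.append(c - threshold)
        else:
            out.append(c + threshold)
    return out


# Hypothetical ICA coefficients of one image patch.
print(soft_shrink([3.0, -0.2, 0.5, -2.0], 0.5))  # [2.5, 0.0, 0.0, -1.5]
```

Only the large (sparse) components survive, which is the property the edge detector relies on.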
Fabian J. THEIS Wakako NAKAMURA
The transformation of a data set using a second-order polynomial mapping to find statistically independent components is considered (quadratic independent component analysis or ICA). Based on overdetermined linear ICA, an algorithm is given together with separability conditions via linearization reduction. The linearization is achieved using a higher dimensional embedding defined by the linear parametrization of the monomials, which can also be applied for higher-order polynomials. The paper finishes with simulations for artificial data and natural images.
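The embedding idea can be sketched as follows: a quadratic map of x is linear in the vector of monomials of x, so the problem can be handed to a linear method. This is an illustrative sketch only. The function name `quadratic_embedding`, the ordering of the monomials, and the omission of a constant term are assumptions, not details taken from the paper.

```python
from itertools import combinations_with_replacement


def quadratic_embedding(x):
    """Map x = (x1, ..., xn) to its monomials of degree 1 and 2.

    The result has n + n(n+1)/2 entries. Any second-order polynomial
    without a constant term is a linear function of this vector.
    """
    linear = list(x)
    quadratic = [
        x[i] * x[j]
        for i, j in combinations_with_replacement(range(len(x)), 2)
    ]
    return linear + quadratic


# x = (2, 3) maps to (x1, x2, x1*x1, x1*x2, x2*x2).
print(quadratic_embedding([2.0, 3.0]))  # [2.0, 3.0, 4.0, 6.0, 9.0]
```

For higher-order polynomials the same construction would extend to monomials of higher degree, at the cost of a rapidly growing embedding dimension.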
Ryo MUKAI Hiroshi SAWADA Shoko ARAKI Shoji MAKINO
This paper describes a real-time blind source separation (BSS) method for moving speech signals in a room. Our method employs frequency domain independent component analysis (ICA) using a blockwise batch algorithm in the first stage, and the separated signals are refined by postprocessing using crosstalk component estimation and non-stationary spectral subtraction in the second stage. The blockwise batch algorithm achieves better performance than an online algorithm when sources are fixed, and the postprocessing compensates for performance degradation caused by source movement. Experimental results using speech signals recorded in a real room show that the proposed method realizes robust real-time separation for moving sources. Our method is implemented on a standard PC and works in realtime.
Takayuki SUGAWARA Keisuke IDE Tomoyoshi SATO
The DAPDNA®-2 is the world's first general purpose dynamically reconfigurable processor for commercial usage. It is a dual-core processor consisting of a custom RISC core called the Digital Application Processor (DAP), and a two dimensional array of dynamically reconfigurable processing elements referred to as the Distributed Network Architecture (DNA). The DAP has a 32 bit instruction set architecture with an 8 KB instruction cache and 8 KB data cache that can be accessed in one clock cycle. It has an interrupt control function to detect data processing completion in the DNA-Matrix. The DNA-Matrix has different types of data processing elements such as ALU, delay, and memory elements to process fully parallel computations. The DNA-Matrix includes 32 independent 16 KB high speed SRAM elements (in total 512 KB). The DNA-Matrix, even with its parallel computational capability, can be synchronized and co-work at the same clock frequency as the DAP. The processor operates at a 166 MHz working frequency and is fabricated with a 0.11 µm CMOS process. The DAPDNA-2 device can be connected directly with up to 16 units with linear scalability in processing performance, provided the bandwidth requirement is within the maximum communication speed between DNAs, which is 32 Gbps. The DAPDNA-2 performs at a level that is two orders of magnitude higher than conventional high performance processors.
Kenbu TERAMOTO Kohsuke TSURUTA
This paper provides a novel signal processing method for detecting defects based on the spatio-temporal gradient analysis over the Lamb-wave field. The proposed method classifies the wave field through the rank of the covariance matrix which is defined by the four-dimensional vector with the following components: a vertical displacement, its vertical velocity, and a pair of out-of-plane shearing strains. The covariance matrix provides the information about defects. Its determinant, therefore, is proposed as the inhomogeneity-index of the object surface. In this study, the physical meanings of the proposed index are shown, the computational process in the Lamb-wave field near the defects is discussed and their behaviors are investigated through FDTD-simulations and acoustic experiments.
The development of new sliding contacts usable under severe conditions, such as high temperature, extremely low temperature, or high vacuum, has recently become an urgent necessity. This research mainly examined the contact resistance and coefficient of friction of 3 kinds of self-lubricant composite materials with electrical conductivity and mechanical stiffness. The result showed that a composite material (CMML-1) containing the least quantity of solid lubricants [WS2, Gr.(Graphite)] among them was low in both contact resistance and coefficient of friction and showed less fluctuation. EPMA analysis suggested that Sn contributes to electrical conductivity.
Tomoya TAKATANI Tsuyoki NISHIKAWA Hiroshi SARUWATARI Kiyohiro SHIKANO
We propose a novel blind separation framework for Single-Input Multiple-Output (SIMO)-model-based acoustic signals using an extended ICA algorithm, SIMO-ICA. The SIMO-ICA consists of multiple ICAs and a fidelity controller, and each ICA runs in parallel under the fidelity control of the entire separation system. The SIMO-ICA can separate the mixed signals, not into monaural source signals but into SIMO-model-based signals from independent sources as they are at the microphones. Thus, the separated signals of SIMO-ICA can maintain the spatial qualities of each sound source. In order to evaluate its effectiveness, separation experiments are carried out under both nonreverberant and reverberant conditions. The experimental results reveal that the signal separation performance of the proposed SIMO-ICA is the same as that of the conventional ICA-based method, and that the spatial quality of the separated sound in SIMO-ICA is remarkably superior to that of the conventional method, particularly for the fidelity of the sound reproduction.
Tsuyoki NISHIKAWA Hiroshi ABE Hiroshi SARUWATARI Kiyohiro SHIKANO Atsunobu KAMINUMA
We propose a new algorithm for overdetermined blind source separation (BSS) based on multistage independent component analysis (MSICA). To improve the separation performance, we have proposed MSICA in which frequency-domain ICA and time-domain ICA are cascaded. In the original MSICA, the specific mixing model, where the number of microphones is equal to that of sources, was assumed. However, additional microphones are required to achieve an improved separation performance under reverberant environments. This leads to alternative problems, e.g., a complication of the permutation problem. In order to solve them, we propose a new extended MSICA using subarray processing, where the number of microphones and that of sources are set to be the same in every subarray. The experimental results obtained under the real environment reveal that the separation performance of the proposed MSICA is improved as the number of microphones is increased.
In this paper we apply a parallel adaptive solution algorithm to simulate nanoscale double-gate metal-oxide-semiconductor field effect transistors (MOSFETs) on a personal computer (PC)-based Linux cluster with the message passing interface (MPI) libraries. Based on a posteriori error estimation, the triangular mesh generation, the adaptive finite volume method, the monotone iterative method, and the parallel domain decomposition algorithm, a set of two-dimensional quantum correction hydrodynamic (HD) equations is solved numerically on our constructed cluster system. This parallel adaptive simulation methodology with 1-irregular mesh was successfully developed and applied to deep-submicron semiconductor device simulation in our recent work. A 10 nm n-type double-gate MOSFET is simulated with the developed parallel adaptive simulator. In terms of physical quantities and refined adaptive mesh, simulation results demonstrate very good accuracy and computational efficiency. Benchmark results, such as load-balancing, speedup, and parallel efficiency are achieved and exhibit excellent parallel performance. On a 16-node PC-based Linux cluster, the maximum difference among CPUs is less than 6%. A 12.8 times speedup and 80% parallel efficiency are simultaneously attained with respect to different simulation cases.
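The reported speedup and efficiency figures are consistent with the usual definition of parallel efficiency as speedup divided by the number of processors. The short check below assumes that definition and one processor per node; the function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup, num_processors):
    """Parallel efficiency under the standard definition: speedup / processors."""
    return speedup / num_processors


# Figures quoted in the abstract: 12.8x speedup on a 16-node cluster.
efficiency = parallel_efficiency(12.8, 16)
print(f"{efficiency:.0%}")  # 80%
```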
Kodo KAWASE Yuichi OGAWA Yuuki WATANABE
We have developed a novel basic technology for terahertz (THz) imaging, which allows detection and identification of chemicals by introducing the component spatial pattern analysis. The spatial distributions of the chemicals were obtained from terahertz multispectral transillumination images, using absorption spectra previously measured with a widely tunable THz-wave parametric oscillator. We have also separated the component spatial patterns of frequency-dependent absorptions in chemicals and frequency-independent components such as plastic, paper and measurement noise in THz spectroscopic images. Further, we have applied this technique to the detection and identification of illicit drugs concealed in envelopes.
In this paper we present a component approach for configurable network processing for active documents. The approach has two key ideas. The first is to enable documents to process themselves on networks. That is, documents can define their own itineraries, like the notion of active packets in active network technology. The second is to enable documents to transmit other documents to their destinations as first-class objects, like the notion of active nodes in active network technology. The approach also enables building and managing active documents as compound documents. The dynamic deployment of network processing for exchanging documents can be defined and achieved by means of GUI-based manipulation of compound documents. Therefore, the approach allows a user to easily and rapidly develop and customize network processing in the same way as if that user had edited the documents. A prototype implementation of the approach and its applications were constructed on a Java-based mobile agent system to evaluate the effectiveness of the approach.
Wataru TERAMOTO Hiroshi WATANABE Hiroyuki UMEMURA Katsunori MATSUOKA Shinichi KITA
A virtual reality system is one of the most useful tools for investigating the characteristics of human perception in a dynamic visual environment because we can easily and appropriately manipulate parameters of three-dimensional stimuli of vision in accordance with our purpose. In the present study we examined how the brain processes local stimuli during the global sensation of self-motion (vection) in view of temporal information processing -- perceptual latency -- with a temporal order judgment task. In Experiment 1 we demonstrated that the targets in the left visual field were perceived prior to those in the right visual field when an observer stared at rightward optokinetic stimuli or perceived self-motion leftward, and vice versa. Especially at 16.0 deg of target eccentricity the biases were much larger with the continuous exposure of optokinetic stimuli than with their intermittent exposure; the former compelled observers to perceive self-motion and the latter hardly did. In Experiment 2 we examined the relationship between the occurrence of vection and temporal order judgments as the exposure duration of optokinetic stimuli was fixed between conditions, and showed that the biases were larger when vection occurred than when it did not. In Experiment 3 we showed that the biases were not modulated by the speed of optokinetic stimuli and not related to the speed of perceived self-motion. This phenomenon can be explained based on exogenous components of attention, the shift of the reference frame for determining the order in which objects come into awareness and imbalance between hemispheric activities. The mechanism is ecologically reasonable in that it allows us to be aware of the incoming events as soon as possible and to avoid any dangerous situations.
Byeong-Seob KO Ryouichi NISHIMURA Yoiti SUZUKI
A robust watermarking scheme based on the time-spread echo method is proposed in this letter. The embedding process is achieved by subband decomposition of a host signal and by controlling the amount of distortion, i.e., power of watermark, of each subband according to the Signal to Mask Ratio (SMR) calculated from MPEG psychoacoustic model. The decoding performance and robustness of the proposed method were evaluated.
Masashi YAMADA Rahmat BUDIARTO Mamoru ENDO Shinya MIYAZAKI
This paper presents a system for reading comics on cellular phones. It is necessary for comic images to be divided into frames and the contents such as speech text to be displayed at a comfortable reading size, since it is difficult to display high-resolution images in a low resolution cellular phone environment. We have developed a scheme for decomposing comic images into their constituent elements: frames, speech text, and drawings. We implemented a system on the internet for a cellular phone company in our country that provides downloadable comic data and a program for reading.
Ching-Tang HSIEH Eugene LAI Wan-Chen CHEN
This paper presents some effective methods for improving the performance of a speaker identification system. Based on the multiresolution property of the wavelet transform, the input speech signal is decomposed into various frequency subbands in order not to spread noise distortions over the entire feature space. For capturing the characteristics of the vocal tract, the linear predictive cepstral coefficients (LPCC) of the lower frequency subband for each decomposition process are calculated. In addition, a hard threshold technique for the lower frequency subband in each decomposition process is also applied to eliminate the effect of noise interference. Furthermore, cepstral domain feature vector normalization is applied to all computed features in order to provide similar parameter statistics in all acoustic environments. In order to effectively utilize all these multiband speech features, we propose a modified vector quantization as the identifier. This model uses the multilayer concept to eliminate the interference among the multiband speech features and then uses the principal component analysis (PCA) method to evaluate the codebooks for capturing a more detailed distribution of the speaker's phoneme characteristics. The proposed method is evaluated using the KING speech database for text-independent speaker identification. Experimental results show that the recognition performance of the proposed method is better than those of the vector quantization (VQ) and the Gaussian mixture model (GMM) using full-band LPCC and mel-frequency cepstral coefficients (MFCC) features in both clean and noisy environments. Also, a satisfactory performance can be achieved in low SNR environments.
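The feature normalization step can be illustrated with per-dimension mean and variance normalization across frames. This is a sketch under an assumption: the abstract does not state which normalization is used, so mean and variance normalization is shown as one common choice, and `normalize_features` is a hypothetical name.

```python
import statistics


def normalize_features(frames):
    """Normalize each feature dimension to zero mean and unit variance.

    `frames` is a list of equal-length feature vectors (one per frame).
    A dimension with zero spread is left centered but unscaled.
    (Assumed normalization; the paper's exact scheme is not given.)
    """
    dims = list(zip(*frames))  # one tuple of values per feature dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) or 1.0 for d in dims]
    return [
        [(value - mean) / std for value, mean, std in zip(frame, means, stds)]
        for frame in frames
    ]


# Two hypothetical frames with two cepstral features each.
print(normalize_features([[1.0, 10.0], [3.0, 10.0]]))
```

After this step the features from different recording conditions share comparable statistics, which is the stated purpose of the normalization.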
Taoi HSU Wen-Liang HWANG Jiann-Ling KUO Der-Kuo TUNG
In this paper, a novel Wold decomposition algorithm is proposed to address the issue of deterministic component extraction for texture images. This algorithm exploits the wavelet-based singularity detection theory to process both harmonic and evanescent features from frequency domain. This exploitation is based on the 2D Lebesgue decomposition theory. When applying multiresolution analysis technique to the power spectrum density (PSD) of a regular homogeneous random field, its indeterministic component will be effectively smoothed, and its deterministic component will remain dominant at coarse scale. By means of propagating these positions to the finest scale, the deterministic component can be properly extracted. Experiments show that the proposed algorithm obtains results that satisfactorily ensure its robustness and efficiency.
The paper presents a novel stroke decomposition approach based on a directional filtering technique for recognizing Chinese characters. The proposed filtering technique uses a set of the second-order Gaussian derivative (SOGD) filters to decompose a character into a number of stroke segments. Moreover, a new Gaussian function is proposed to overcome the general limitation in extracting stroke segments along some fixed and given orientations. The Gaussian function is designed to model the relationship between the orientation and power response of the stroke segment in the filter output. Then, an optimal orientation of the stroke segment can be estimated by finding the maximal power response of the stroke segment. Finally, the effects of the decomposition process are analyzed using some simple structural and statistical features extracted from the stroke segments. Experimental results indicate that the proposed SOGD filtering-based approach is very efficient at decomposing noisy and degraded character images into a number of stroke segments along an arbitrary orientation. Furthermore, applying the decomposition process improves the recognition performance by about 17.31% on the test character set.
The classification time required by conventional multi-class SVMs greatly increases as the number of pattern classes increases. This is due to the fact that the needed set of binary class SVMs gets quite large. In this paper, we propose a method to reduce the number of classes by using the nearest neighbor rule (NNR) in the principal component analysis and linear discriminant analysis (PCA+LDA) feature subspace. The proposed method reduces the number of face classes by selecting a few classes closest to the test data projected in the PCA+LDA feature subspace. Experimental results show that our proposed method has a lower error rate than the nearest neighbor classification (NNC) method. Though our error rate is comparable to that of conventional multi-class SVMs, the classification process of our method is much faster.
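The class-reduction step can be sketched as follows: each class is ranked by the distance from the projected test vector to that class's nearest training sample, and only the closest few classes are kept for the SVM stage. This is an illustrative sketch, not the paper's implementation. It assumes the vectors have already been projected into the feature subspace and that Euclidean distance is used, and `candidate_classes` is a hypothetical name.

```python
import math


def candidate_classes(test_vec, train_vecs, train_labels, k):
    """Return the k class labels nearest to test_vec.

    A class's distance is the distance from test_vec to its closest
    training sample (nearest neighbor rule, Euclidean distance assumed).
    """
    best = {}  # label -> smallest distance seen so far
    for vec, label in zip(train_vecs, train_labels):
        d = math.dist(test_vec, vec)
        if label not in best or d < best[label]:
            best[label] = d
    return sorted(best, key=best.get)[:k]


# Hypothetical 2-D projected training samples for four classes.
train_vecs = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (1.0, 1.0), (9.0, 0.0)]
train_labels = ["A", "A", "B", "C", "D"]
print(candidate_classes((0.9, 0.8), train_vecs, train_labels, 2))  # ['C', 'A']
```

Only the binary SVMs involving the returned classes would then need to be evaluated, which is where the reduction in classification time comes from.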
Xiang-Yan ZENG Yen-Wei CHEN Zensho NAKAO Jian CHENG Hanqing LU
Color histograms are effective for representing color visual features. However, the high dimensionality of feature vectors results in high computational cost. Several transformations, including singular value decomposition (SVD) and principal component analysis (PCA), have been proposed to reduce the dimensionality. In PCA, the dimensionality reduction is achieved by projecting the data to a subspace which contains most of the variance. As a common observation, the PCA basis function with the lowest frequency accounts for the highest variance. Therefore, the PCA subspace may not be the optimal one to represent the intrinsic features of data. In this paper, we apply independent component analysis (ICA) to extract the features in color histograms. PCA is applied to reduce the dimensionality and then ICA is performed on the low-dimensional PCA subspace. The experimental results show that the proposed method (1) significantly reduces the feature dimensions compared with the original color histograms and (2) outperforms other dimension reduction techniques, namely the method based on SVD of quadratic matrix and PCA, in terms of retrieval accuracy.
A new speaker feature extracted from multi-wavelet decomposition for speaker recognition is described. The multi-wavelet decomposition is a multi-scale representation of the covariance matrix. We have combined wavelet transform and the multi-resolution singular value algorithm to decompose eigenvector for speaker feature extraction not at the square matrix. Our results show that this multi-wavelet feature achieves better performance than the cepstrum and Δ-cepstrum in terms of recognition rate.