Satoshi TAOKA Daisuke TAKAFUJI Toshimasa WATANABE
A branch-and-bound algorithm (BB for short) is the most general technique for dealing with various combinatorial optimization problems. Even with BB, however, computation time is likely to increase exponentially, so we consider parallelizing it to reduce computation time. It has been reported that the computation time of a parallel BB depends heavily on node-variable selection strategies. In a parallel BB, it is also necessary to keep communication time from increasing, so it is important to pay attention to how many and what kind of nodes are transferred (called the sending-node selection strategy). In this paper, for the graph coloring problem, we propose several sending-node selection strategies for a parallel BB algorithm that uses MPI for parallelization, and we experimentally evaluate how these strategies affect the computation time of a parallel BB on a PC cluster network.
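To make the branch-and-bound idea concrete, the following is a minimal sequential sketch for graph coloring. It is not the parallel algorithm of the paper and does not model sending-node selection or MPI. The function name `color_graph_bb`, the adjacency-list input, and the fixed vertex order are all assumptions chosen for illustration.

```python
def color_graph_bb(adj):
    """Sequential branch-and-bound for minimum graph coloring.

    adj is a list where adj[v] holds the neighbours of vertex v.
    Returns (k, colors): the minimum number of colors and one optimal coloring.
    """
    n = len(adj)
    best = {"k": n + 1, "colors": None}  # incumbent (best solution so far)
    colors = [-1] * n                    # -1 means "not yet colored"

    def branch(v, used):
        # Bound: a partial coloring that already uses as many colors as the
        # incumbent cannot lead to an improvement, so prune this node.
        if used >= best["k"]:
            return
        if v == n:
            best["k"], best["colors"] = used, colors[:]
            return
        forbidden = {colors[u] for u in adj[v] if colors[u] != -1}
        # Branch: try every color already in use plus exactly one new color.
        for c in range(used + 1):
            if c not in forbidden:
                colors[v] = c
                branch(v + 1, max(used, c + 1))
                colors[v] = -1

    branch(0, 0)
    return best["k"], best["colors"]
```

In a parallel setting, each recursive call corresponds to a search-tree node that could in principle be handed to another process; which nodes to hand over, and how many, is the question the abstract studies.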
Satoshi OHTA Yoshinobu KAJIKAWA Yasuo NOMURA
In an acoustic echo canceller (AEC), the step-size parameter of the adaptive filter must be varied according to the situation when double talk occurs and/or the echo path changes. We propose an AEC that uses a sub-adaptive filter. The proposed AEC can control the step-size parameter according to the situation. Moreover, it offers superior convergence compared to the conventional AEC even when double talk and an echo path change occur simultaneously. Simulations demonstrate that the proposed AEC can achieve higher echo return loss enhancement (ERLE) and faster convergence than the conventional AEC. The computational complexity of the proposed AEC can be reduced by reducing the number of taps of the sub-adaptive filter.
Won-Young JUNG Hyungon KIM Yong-Ju KIM Jae-Kyung WEE
In order for interconnect effects due to process-induced variations to be taken into account in designs at 0.13 µm and below, it is necessary to determine and characterize realistic interconnect worstcase models with high accuracy and speed. This paper proposes new statistically-based approaches to the characterization of realistic interconnect worstcase models which take into account process-induced variations. The Effective Common Geometry (ECG) and Accumulated Maximum Probability (AMP) algorithms have been developed and implemented in the new statistical interconnect worstcase design environment. To verify this statistical interconnect worstcase design environment, 31-stage ring oscillators are fabricated and measured with the UMC 0.13 µm Logic process. In addition, 15-stage ring oscillators are fabricated and measured with a 0.18 µm standard CMOS process to investigate its flexibility in other technologies. The results show that the relative errors of the new method are less than 1.00%, which is two times more accurate than the conventional worstcase method. Furthermore, the new interconnect worstcase design environment improves optimization speed by 29.61-32.01% compared to that of the conventional worstcase optimization. The new statistical interconnect worstcase design environment accurately predicts the worstcase and bestcase corners of non-normal distributions, where conventional methods perform poorly.
The present paper describes a method for the construction of a zero-correlation zone sequence set from a perfect sequence. Both the cross-correlation function and the side-lobe of the auto-correlation function of the proposed sequence sets are zero for phase shifts within the zero-correlation zone. These sets can be generated from an arbitrary perfect sequence whose length is the product of a pair of odd integers, (2n+1)(2k+1) for k ≥ 1 and n ≥ 0. The proposed sequence construction method can generate an optimal zero-correlation zone sequence set that achieves the theoretical bounds of the sequence member size given the size of the zero-correlation zone and the sequence period. The peak in the out-of-phase correlation function of the constructed sequences is restricted to be lower than half the power of the sequence itself. The proposed sequence sets could successfully provide CDMA communication without co-channel interference, or, in an ultrasonic synthetic aperture imaging system, improve the signal-to-noise ratio of the acquired image.
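As a small illustration of the starting point of this construction, the sketch below checks numerically that a sequence is perfect, that is, that its periodic auto-correlation is zero at every non-zero shift. The abstract does not say which perfect sequence is used; the Zadoff-Chu sequence of odd length 15 = (2·1+1)(2·2+1) is an assumption made here only to have a concrete example, and the function names are hypothetical. The sketch does not implement the zero-correlation zone construction itself.

```python
import cmath


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence; assumes odd length and gcd(root, length) == 1."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def periodic_autocorrelation(seq, shift):
    """Periodic (cyclic) auto-correlation of seq at the given shift."""
    n = len(seq)
    return sum(seq[i] * seq[(i + shift) % n].conjugate() for i in range(n))


def is_perfect(seq, tol=1e-9):
    """True if every out-of-phase periodic auto-correlation value is ~0."""
    return all(abs(periodic_autocorrelation(seq, s)) < tol
               for s in range(1, len(seq)))


# Example: a length-15 sequence, 15 being a product of two odd integers.
example = zadoff_chu(15)
example_is_perfect = is_perfect(example)
```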
Lazaro S.P. BUSAGALA Wataru OHYAMA Tetsushi WAKABAYASHI Fumitaka KIMURA
Feature transformation in automatic text classification (ATC) can lead to better classification performance. Furthermore, dimensionality reduction is important in ATC. Hence, feature transformation and dimensionality reduction are performed to obtain lower computational costs with improved classification performance. However, feature transformation and dimension reduction techniques have conventionally been considered in isolation. In such cases, classification performance can be lower than when the two are integrated. Therefore, we propose an integrated feature analysis approach which improves the classification performance at lower dimensionality. Moreover, we propose a multiple feature integration technique which also improves classification effectiveness.
Fair queueing is a service scheduling discipline that pursues fairness among users in packet communication networks. Many fair queueing algorithms, however, suffer from computational overhead because the central scheduler has to maintain a performance counter for each flow of user packets based on the global virtual time. Moreover, they are not suitable for wireless networks with a high probability of input channel errors, because the compensation mechanism for recovery from the error state is either lacking or complex. In this paper, we propose a new, computationally efficient, distributed fair queueing scheme, which we call Channel-Aware Throughput Fair Queueing (CATFQ), that is applicable to both wired and wireless packet networks. In our CATFQ scheme, each flow is equipped with a counter that measures the weighted throughput achievement while it has a backlog of packets. At the end of every service to a packet, the scheduler simply selects a flow with the minimum counter value as the one from which a packet is served next. We show that the difference between any two throughput counters is bounded. Our scheme significantly reduces the scheduler's computational overhead and guarantees fair throughput for all flows. For wireless networks with error-prone channels, the service chance lost in a bad channel condition is compensated quickly as the channel recovers. Our scheme suppresses the service for leading flows, brings short-term fairness for flows without channel errors, and achieves long-term fairness for all flows. These merits are verified by simulation.
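The selection rule described above (serve the backlogged flow with the minimum counter) can be sketched as follows. This is only an illustration under stated assumptions, not the CATFQ algorithm as published: the counter update `length / weight`, the `channel_ok` flags used to skip flows in a bad channel state, and the names `Flow` and `select_and_serve` are all hypothetical, and counter handling for flows that become backlogged again after an idle period is omitted.

```python
from collections import deque


class Flow:
    """One flow: a weight, a throughput counter, and a queue of packet lengths."""

    def __init__(self, weight):
        self.weight = weight
        self.counter = 0.0      # weighted throughput achieved so far (assumed form)
        self.queue = deque()    # packet lengths waiting for service


def select_and_serve(flows, channel_ok=None):
    """Serve one packet from the eligible flow with the minimum counter.

    A flow is eligible if it is backlogged and, when channel_ok is given,
    its channel is currently good. Returns (flow_index, length) or None.
    """
    candidates = [i for i, f in enumerate(flows)
                  if f.queue and (channel_ok is None or channel_ok[i])]
    if not candidates:
        return None
    i = min(candidates, key=lambda j: flows[j].counter)
    length = flows[i].queue.popleft()
    # Assumed update: normalize the served amount by the flow's weight.
    flows[i].counter += length / flows[i].weight
    return i, length
```

With this update rule, a flow with twice the weight of another receives roughly twice the service while both stay backlogged, and a flow skipped during a bad channel period keeps a low counter, so it is favoured once its channel recovers.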
Hiroaki SHIKANO Jun SHIRAKO Yasutaka WADA Keiji KIMURA Hironori KASAHARA
A power-aware, compiler-controllable chip multiprocessor (CMP) is presented, and its performance and power consumption are evaluated with the optimally scheduled advanced multiprocessor (OSCAR) parallelizing compiler. The CMP is equipped with power control registers that change the clock frequency and power supply voltage of functional units including processor cores, memories, and an interconnection network. The OSCAR compiler carries out coarse-grain task parallelization of programs and reduces power consumption using architectural power control support and the compiler's power saving scheme. The performance evaluation shows that MPEG-2 encoding on the proposed CMP with four CPUs results in 82.6% power reduction in real-time execution mode with a deadline constraint on its sequential execution time. Furthermore, MP3 encoding on a heterogeneous CMP with four CPUs and four accelerators results in 53.9% power reduction at 21.1-fold speed-up in performance against its sequential execution in the fastest execution mode.
A packet detection method for zero-padded orthogonal frequency division multiplexing (OFDM) transmission is presented. The proposed algorithm effectively conducts packet detection by employing both an M-sample time delayed cross correlation value, and a received signal power calculated by using the received input samples corresponding to the zero padding (ZP) intervals or less.
Jeong-Yong AHN Kill-Sung MUN Young-Hyun KIM Sun-Young OH Beom-Soo HAN
In this note, we propose a fuzzy diagnosis of headache. The method is based on the relations between symptoms and diseases. For this purpose, we suggest a new diagnosis measure using the occurrence information of a patient's symptoms and develop an improved interview chart with fuzzy degrees assigned according to the relation among symptoms and three labels of headache. The proposed method is illustrated by two examples.
Orthogonal frequency-division multiplexing (OFDM) systems often use a cyclic prefix (CP) to simplify the equalization design at the cost of bandwidth efficiency. To increase the bandwidth efficiency, in this paper we study blind equalization with linear smoothing [1] for single-input multiple-output (SIMO) OFDM systems without CP insertion. Owing to the block Toeplitz structure of the channel matrix, the block matrix scheme is applied to the linear smoothing channel estimation, which equivalently increases the number of sample vectors and thus reduces the perturbation of the sample autocorrelation matrix. Compared with the linear smoothing and subspace methods, the proposed block linear smoothing requires the lowest computational complexity. Computer simulations show that the block linear smoothing yields a channel estimation error smaller than that from linear smoothing, and close to that of the subspace method. When evaluated with the minimum mean-square error (MMSE) equalizer, the block linear smoothing and subspace methods have nearly the same bit-error-rates (BERs).
Conventional narrowband interference (NBI) rejection algorithms often assume perfect pseudo-noise (PN) code synchronization. The functions of NBI rejection and code tracking are performed separately and independently by an adaptive filter and a code tracking loop, respectively. This paper presents two new receiver structures for direct sequence spread spectrum (DS/SS) systems: one operates in coherent mode and the other in noncoherent mode. Both receivers are designed to suppress NBI and minimize tracking jitter. Numerical results show that the proposed coherent receiver performs as well as the conventional receiver that uses an LMS NBI rejection filter with zero tracking jitter. The noncoherent receiver, when compared with the coherent one, suffers less than 3 dB degradation for bit error probabilities smaller than 10⁻³.
Jung-Shan LIN Hong-Yu CHEN Jia-Chin LIN
This paper proposes a channel estimation technique which uses a postfixed pseudo-noise (PN) sequence combined with zero padding to accurately estimate the channel impulse response for mobile orthogonal frequency division multiplexing (OFDM) communications. The major advantage of the proposed technique is the periodic insertion of PN sequences after each OFDM symbol within the original guard interval in conventional zero-padded OFDM or within the original cyclic prefix (CP) in conventional CP-OFDM. In addition, the proposed technique takes advantage of null samples padded after the PN sequences to reduce the inter-symbol interference that occurs with information detection in conventional pseudo-random-postfix OFDM. The proposed technique successfully applies either (1) a least-squares algorithm with decision-directed data assistance, (2) approximate least-squares estimation, or (3) a maximum-likelihood scheme with various observation windows for the purpose of improving channel estimation performance. Some comparative simulations are given to illustrate the excellent performance of the proposed channel estimation technique in mobile environments.
Hiroshi YASUDA Ryota KAIHARA Suguru SAITO Masayuki NAKAJIMA
Because motion capture systems have enabled us to capture a large number of human motions, the demand for a method to easily browse a captured motion database has been increasing. In this paper, we propose a method to generate simple visual outlines of motion clips for the purpose of efficient motion data browsing. Our method unfolds a motion clip into a 2D stripe of keyframes along a timeline, based on semantic keyframe extraction and best viewpoint selection for each keyframe. With our visualization, the timing and order of actions in the motions are clearly visible and the contents of multiple motions are easily comparable. In addition, because our method is applicable to a wide variety of motions, it can generate outlines for a large number of motions fully automatically.
Previous approaches for modeling Intrusion Detection Systems (IDS) have been twofold: improving detection model(s) in terms of (i) feature selection of audit data through wrapper and filter methods and (ii) parameter optimization of the detection model design, based on classification, clustering algorithms, etc. In this paper, we present three approaches to modeling IDS in the context of feature selection and parameter optimization. First, we present Fusion of Genetic Algorithm (GA) and Support Vector Machines (SVM) (FuGAS), which combines GA and SVM through genetic operations and is capable of building an optimal detection model with only selected important features and optimal parameter values. Second, we present Correlation-based Hybrid Feature Selection (CoHyFS), which utilizes a filter method in conjunction with GA for feature selection in order to reduce long training time. Third, we present Simultaneous Intrinsic Model Identification (SIMI), which adopts Random Forest (RF) and shows better intrusion detection rates and feature selection results, with no additional computational overhead. We show the experimental results and analysis of the three approaches on the KDD 1999 intrusion detection datasets.
Yiqing HUANG Zhenyu LIU Yang SONG Satoshi GOTO Takeshi IKENAGA
A hardware-efficient, high-speed architecture for variable block size motion estimation (VBSME) in H.264 is presented in this paper. By improving the pipeline structure and processing element (PE) circuits, the system latency and hardware cost are reduced, which makes this structure more hardware efficient than the original Propagate Partial SAD architecture. For coding of small and medium frame size pictures, the proposed structure can save 12.1% of the hardware cost compared with the original Propagate Partial SAD structure. In the case of HDTV, since small inter modes contribute only trivially to the coding quality, we remove modes below 8×8 in our design. By adopting the mode reduction technique, when the set number of the PE array is less than 8, the proposed mode reduction based Propagate Partial SAD structure can work at a faster clock speed and with lower hardware cost than the widely used SAD Tree architecture. It is more robust to the high-speed timing constraint when parallel processing is considered. With TSMC 0.18 µm technology in worst-case operating conditions (1.62 V, 125 °C), the peak throughput of the 8-set PE array structure is 720p@30 Hz with a 128×64 search range and 5 reference frames. Our design reduces the hardware cost by 12 k gates compared with the parallel SAD Tree architecture.
Tobias CINCAREK Hiromichi KAWANAMI Ryuichi NISIMURA Akinobu LEE Hiroshi SARUWATARI Kiyohiro SHIKANO
In this paper, the development, long-term operation and portability of a practical ASR application in a real environment is investigated. The target application is a speech-oriented guidance system installed at the local community center. The system has been exposed to ordinary people since November 2002. More than 300 hours or more than 700,000 inputs have been collected during four years. The outcome is a rare example of a large scale real-environment speech database. A simulation experiment is carried out with this database to investigate how the system's performance improves during the first two years of operation. The purpose is to determine empirically the amount of real-environment data which has to be prepared to build a system with reasonable speech recognition performance and response accuracy. Furthermore, the relative importance of developing the main system components, i.e. speech recognizer and the response generation module, is assessed. Although depending on the system's modeling capacities and domain complexity, experimental results show that overall performance stagnates after employing about 10-15 k utterances for training the acoustic model, 40-50 k utterances for training the language model and 40 k-50 k utterances for compiling the question and answer database. The Q&A database was most important for improving the system's response accuracy. Finally, the portability of the well-trained first system prototype for a different environment, a local subway station, is investigated. Since collection and preparation of large amounts of real data is impractical in general, only one month of data from the new environment is employed for system adaptation. While the speech recognition component of the first prototype has a high degree of portability, the response accuracy is lower than in the first environment. The main reason is a domain difference between the two systems, since they are installed in different environments. 
This implies that it is imperative to take the behavior of users under real conditions into account to build a system with high user satisfaction.
The ultra-fast switching speed of superconducting digital circuits enables the realization of digital signal processors with performance unattainable by any other technology. Based on rapid single-flux-quantum (RSFQ) logic, these integrated circuits are capable of delivering high computation capacity of up to 30 GOPS on a single processor and a very short latency of 0.1 ns. There are two main applications of such hardware for practical telecommunication systems: filters for superconducting ADCs operating with digital RF data, and recursive filters at baseband. The latter allows functions such as multiuser detection for 3G WCDMA, equalization and channel precoding for 4G OFDM MIMO, and general blind detection. The performance gain is an increase in the cell capacity, quality of service, and transmitted data rate. The current status of the development of the RSFQ baseband DSP is discussed. Major components with an operating speed of 30 GHz have been developed. Designs, test results, and future development of the complete systems, including cryopackaging and CMOS interface, are reviewed.
Taichi ASAMI Koji IWANO Sadaoki FURUI
We have previously proposed a noise-robust speaker verification method using fundamental frequency (F0) extracted using the Hough transform. The method also incorporates an automatic stream-weight and decision threshold estimation technique. It has been confirmed that the proposed method is effective for white noise at various SNR conditions. This paper evaluates the proposed method in more practical in-car and elevator-hall noise conditions. The paper first describes the noise-robust F0 extraction method and details of our robust speaker verification method using multi-stream HMMs for integrating the extracted F0 and cepstral features. Details of the automatic stream-weight and threshold estimation method for the multi-stream speaker verification framework are also explained. This method simultaneously optimizes stream-weights and a decision threshold by combining linear discriminant analysis (LDA) and the AdaBoost technique. Experiments were conducted using Japanese connected digit speech contaminated by white, in-car, or elevator-hall noise at various SNRs. Experimental results show that the F0 features improve the verification performance in various noisy environments, and that our stream-weight and threshold optimization method effectively estimates control parameters so that false acceptance rates (FARs) and false rejection rates (FRRs) are adjusted to achieve equal error rates (EERs) under various noisy conditions.
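The relationship between the decision threshold, FAR, FRR, and EER mentioned above can be illustrated with a simple threshold sweep. This is only a sketch of the definitions, not the LDA and AdaBoost based estimation method of the paper; the function names and the convention that a trial is accepted when its score is at least the threshold are assumptions.

```python
def far_frr(genuine, impostor, threshold):
    """False acceptance and false rejection rates at a given threshold.

    A trial is accepted when its score is >= threshold (assumed convention).
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep observed scores as thresholds; return (eer, threshold).

    The EER is approximated at the threshold where |FAR - FRR| is smallest,
    using the mean of the two rates at that point.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

For example, with genuine scores `[0.9, 0.8, 0.7, 0.4]` and impostor scores `[0.1, 0.2, 0.3, 0.6]`, the sweep finds FAR and FRR both equal to 0.25 at a threshold of 0.6.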
Hongbin SUO Ming LI Ping LU Yonghong YAN
Robust automatic language identification (LID) is the task of identifying the language from a short utterance spoken by an unknown speaker. The mainstream approaches include parallel phone recognition language modeling (PPRLM), support vector machines (SVMs), and general Gaussian mixture models (GMMs). These systems map the cepstral features of spoken utterances into high-level scores by classifiers. In this paper, in order to increase the dimension of the score vector and alleviate the inter-speaker variability within the same language, multiple data groups based on supervised speaker clustering are employed to generate discriminative language characterization score vectors (DLCSV). Back-end SVM classifiers are used to model the probability distribution of each target language in the DLCSV space. Finally, the output scores of the back-end classifiers are calibrated by a pair-wise posterior probability estimation (PPPE) algorithm. The proposed language identification frameworks are evaluated on the 2003 NIST Language Recognition Evaluation (LRE) databases, and the experiments show that the system described in this paper produces results comparable to those of existing systems. In particular, the SVM framework achieves an equal error rate (EER) of 4.0% in the 30-second task and outperforms the state-of-the-art systems by more than 30% relative error reduction. In addition, the proposed PPRLM and GMM algorithms achieve EERs of 5.1% and 5.0%, respectively.
Nobuyuki IWANAGA Tomoya MATSUMURA Akihiro YOSHIDA Wataru KOBAYASHI Takao ONOYE
A sound localization method for the proximal region is proposed, which is based on a low-cost 3D sound localization algorithm using head-related transfer functions (HRTFs). The auditory parallax model is applied to the current algorithm so that more accurate HRTFs can be used for sound localization in the proximal region. In addition, head-shadowing effects based on a rigid-sphere model are reproduced in the proximal region by means of a second-order IIR filter. A subjective listening test demonstrates the effectiveness of the proposed method. An embedded system implementation of the proposed method is also described, which indicates that the proposed method improves sound effects in the proximal region with only a 5.1% increase in memory capacity and an 8.3% increase in computational cost.