Chao LIAO Guijin WANG Quan MIAO Zhiguo WANG Chenbo SHI Xinggang LIN
Robust local image features have become crucial components of many state-of-the-art computer vision algorithms. Because hardware resources are limited, computing local features on embedded systems is not an easy task. In this paper, we propose an efficient parallel computing framework for speeded-up robust features (SURF) targeted at multi-DSP-based embedded systems. We optimize the modules of SURF to better utilize the capabilities of DSP chips. We also design a compact data layout to fit the limited memory and to increase data access bandwidth. A data-driven barrier and workload-balancing schemes are presented to synchronize the chips working in parallel and to reduce overall cost. The experiment shows that our implementation achieves competitive time efficiency compared with related work.
Sangho LEE Jeonghyun HA Jaekeun HONG
This paper presents a new feature extraction method for robust speech recognition based on autocorrelation mel-frequency cepstral coefficients (AMFCCs) and a variable window. The AMFCC feature extraction method applies a fixed double-dynamic-range (DDR) Hamming window to the higher-lag autocorrelation coefficients, which are least affected by noise; the proposed method instead applies a variable window that depends on the frame energy and periodicity. The performance of the proposed method is verified using an Aurora-2 task, and the results confirm significantly improved performance under noisy conditions.
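To make "higher-lag autocorrelation coefficients" concrete, the following is a minimal sketch of that one step only. It is not the authors' implementation: the function names, the `min_lag` parameter, and the use of an ordinary Hamming window as a stand-in for the DDR Hamming window are all assumptions made for illustration. The abstract does not state the rule for choosing the variable window from frame energy and periodicity, so that part is not shown.

```python
import math
from typing import List


def autocorrelation(frame: List[float]) -> List[float]:
    """Biased one-sided autocorrelation r[0..n-1] of a single frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / n
        for lag in range(n)
    ]


def hamming(length: int) -> List[float]:
    """Ordinary Hamming window (a stand-in; NOT the DDR window of the paper)."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
        for i in range(length)
    ]


def windowed_higher_lags(frame: List[float], min_lag: int) -> List[float]:
    """Drop lags below min_lag (hypothetical cut-off) and window the rest."""
    if min_lag < 0:
        raise ValueError("min_lag must be non-negative")
    tail = autocorrelation(frame)[min_lag:]
    window = hamming(len(tail))
    return [r * w for r, w in zip(tail, window)]


if __name__ == "__main__":
    frame = [1.0, 0.0, -1.0, 0.0] * 4  # toy periodic frame, 16 samples
    print(windowed_higher_lags(frame, min_lag=4))
```

In a full front end the windowed sequence would then go through a spectrum estimate, a mel filterbank, and a cepstral transform; those stages are omitted here.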
Shingo YOSHIZAWA Noboru HAYASAKA Naoya WADA Yoshikazu MIYANAGA
This paper describes a noise robustness technique that normalizes the cepstral amplitude range in order to remove the influence of additive noise. Additive noise causes speech feature mismatches between testing and training environments and degrades recognition accuracy in noisy environments. We presume an approximate model that expresses this influence as a change in the amplitude range and the DC component of the log-spectra. Based on this model, we propose cepstral amplitude range normalization (CARN), which normalizes the cepstral distance between the maximum and minimum values. It can estimate noise-robust features without prior knowledge or adaptation. We evaluated its performance in an isolated word recognition task using the NOISEX-92 database. Compared with combinations of conventional methods, CARN improved recognition accuracy under various SNR conditions.
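The idea of normalizing the distance between maximum and minimum cepstral values can be sketched as follows. This is an illustrative reading of the abstract, not the published CARN formula: it assumes normalization is done per cepstral dimension over one utterance, that the offset is removed by mean subtraction, and that each trajectory is scaled to a max-min distance of 1. The function name and the `eps` guard are hypothetical.

```python
from typing import List


def normalize_amplitude_range(
    frames: List[List[float]], eps: float = 1e-12
) -> List[List[float]]:
    """Range-normalize each cepstral dimension over an utterance.

    frames[t][k] is cepstral coefficient k at frame t.
    """
    if not frames:
        return []
    n_dims = len(frames[0])
    out = [[0.0] * n_dims for _ in frames]
    for k in range(n_dims):
        track = [frame[k] for frame in frames]
        span = max(track) - min(track)      # max-min cepstral distance
        mean = sum(track) / len(track)      # offset (DC-like shift)
        for t, value in enumerate(track):
            # Remove the offset, then scale so that max - min equals 1.
            # A constant trajectory carries no range information: emit 0.
            out[t][k] = (value - mean) / span if span > eps else 0.0
    return out


if __name__ == "__main__":
    clean = [[0.0, 10.0], [2.0, 10.0], [4.0, 10.0]]
    # Same trajectory with a different amplitude range and offset.
    shifted = [[1.0, 10.0], [7.0, 10.0], [13.0, 10.0]]
    print(normalize_amplitude_range(clean))
    print(normalize_amplitude_range(shifted))
```

Under these assumptions, two trajectories that differ only by a gain and an offset map to the same normalized features, which is the kind of mismatch the abstract attributes to additive noise.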
Konstantin MARKOV Tomoko MATSUI Rainer GRUHN Jinsong ZHANG Satoshi NAKAMURA
This paper presents the ATR speech recognition system designed for the DARPA SPINE2 evaluation task. The system is capable of dealing with speech from highly variable, real-world noisy conditions and communication channels. A number of robust techniques are implemented, such as differential spectrum mel-scale cepstrum features, on-line MLLR adaptation, and word-level hypothesis combination, which led to a significant reduction in the word error rate.