Keyword Search Result

[Keyword] EE(4073hit)

2141-2160hit(4073hit)

  • A Novel Energy Saving Algorithm with Frame Response Delay Constraint in IEEE 802.16e

    Dinh Thi Thuy NGA  MinGon KIM  Minho KANG  

     
    LETTER-Wireless Communication Technologies

    Vol: E91-B  No: 4  Page(s): 1190-1193

    Sleep-mode operation of a Mobile Subscriber Station (MSS) in IEEE 802.16e effectively reduces energy consumption; however, it introduces frame response delay. In this letter, we propose an algorithm that quickly finds the optimal value of the final sleep interval in sleep mode so as to minimize energy consumption under a given frame response delay constraint. Validation of the proposed algorithm through analytical and simulation results suggests that it provides useful guidance for energy saving.

  • A High-Speed Design of Montgomery Multiplier

    Yibo FAN  Takeshi IKENAGA  Satoshi GOTO  

     
    PAPER

    Vol: E91-A  No: 4  Page(s): 971-977

    With the increase in key length used in public-key cryptographic algorithms such as RSA and ECC, the speed of Montgomery multiplication becomes a bottleneck. This paper proposes a high-speed design of a Montgomery multiplier. First, a modified scalable high-radix Montgomery algorithm is proposed to shorten the critical path. Second, a high-radix clock-saving dataflow is proposed to support high-radix operation and a one-clock-cycle delay in the dataflow. Finally, a hardware-reuse architecture is proposed to reduce the hardware cost, and a parallel radix-16 data path design is proposed to increase the speed. Implementation results using the HHNEC 0.25 µm standard cell library show that the total cost of the Montgomery multiplier is 130 KGates, the clock frequency is 180 MHz, and the throughput of 1024-bit RSA encryption is 352 kbps. The design is suitable for high-speed RSA or ECC encryption/decryption. As a scalable design, it supports encryption/decryption with any key length up to the size of the on-chip memory.
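
    For readers unfamiliar with the operation this abstract accelerates, the following is a minimal software sketch of textbook Montgomery multiplication and its use in modular exponentiation. It is not the scalable high-radix hardware algorithm of the paper; the function names (`montgomery_setup`, `redc`, `mont_mul`, `mod_exp`) and the choice R = 2^k with k equal to the bit length of the modulus are assumptions made for illustration.

```python
def montgomery_setup(n: int, k: int):
    """Precompute constants for an odd modulus n with n < R = 2**k."""
    if n % 2 == 0 or n >= (1 << k):
        raise ValueError("n must be odd and smaller than 2**k")
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime == -1 (mod R)
    r2 = (r * r) % n                # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Montgomery reduction: return t * R^-1 mod n, for 0 <= t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n' mod R
    u = (t + m * n) >> k               # exact division by R, no trial division
    return u - n if u >= n else u


def mont_mul(a_bar: int, b_bar: int, n: int, k: int, n_prime: int) -> int:
    """Multiply two values that are already in Montgomery form."""
    return redc(a_bar * b_bar, n, k, n_prime)


def mod_exp(base: int, exp: int, n: int) -> int:
    """Left-to-right square-and-multiply built on Montgomery multiplication."""
    k = n.bit_length()
    n_prime, r2 = montgomery_setup(n, k)
    x_bar = redc((base % n) * r2, n, k, n_prime)  # base * R mod n
    acc = redc(r2, n, k, n_prime)                 # 1 * R mod n
    for bit in bin(exp)[2:]:
        acc = mont_mul(acc, acc, n, k, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x_bar, n, k, n_prime)
    return redc(acc, n, k, n_prime)               # leave Montgomery form
```

    The point of the reduction step is that division by the modulus is replaced by masking and shifting by k bits, which is what makes the operation attractive for hardware. The sketch requires Python 3.8 or later for `pow(n, -1, r)`.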

  • Development, Long-Term Operation and Portability of a Real-Environment Speech-Oriented Guidance System

    Tobias CINCAREK  Hiromichi KAWANAMI  Ryuichi NISIMURA  Akinobu LEE  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER-Applications

    Vol: E91-D  No: 3  Page(s): 576-587

    In this paper, the development, long-term operation and portability of a practical ASR application in a real environment are investigated. The target application is a speech-oriented guidance system installed at a local community center. The system has been open to the general public since November 2002. More than 300 hours of speech, or more than 700,000 inputs, have been collected over four years. The outcome is a rare example of a large-scale real-environment speech database. A simulation experiment is carried out with this database to investigate how the system's performance improves during the first two years of operation. The purpose is to determine empirically the amount of real-environment data that has to be prepared to build a system with reasonable speech recognition performance and response accuracy. Furthermore, the relative importance of developing the main system components, i.e. the speech recognizer and the response generation module, is assessed. Although the results depend on the system's modeling capacities and domain complexity, experiments show that overall performance stagnates after employing about 10-15 k utterances for training the acoustic model, 40-50 k utterances for training the language model and 40-50 k utterances for compiling the question and answer (Q&A) database. The Q&A database was most important for improving the system's response accuracy. Finally, the portability of the well-trained first system prototype to a different environment, a local subway station, is investigated. Since collection and preparation of large amounts of real data is impractical in general, only one month of data from the new environment is employed for system adaptation. While the speech recognition component of the first prototype has a high degree of portability, the response accuracy is lower than in the first environment. The main reason is a domain difference between the two systems, since they are installed in different environments. This implies that it is imperative to take the behavior of users under real conditions into account to build a system with high user satisfaction.

  • A High-Speed Pipelined Degree-Computationless Modified Euclidean Algorithm Architecture for Reed-Solomon Decoders

    Seungbeom LEE  Hanho LEE  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E91-A  No: 3  Page(s): 830-835

    This paper presents a novel high-speed low-complexity pipelined degree-computationless modified Euclidean (pDCME) algorithm architecture for high-speed RS decoders. The pDCME algorithm eliminates the degree computation, which reduces hardware complexity and enables high-speed processing. A high-speed RS decoder based on the pDCME algorithm has been designed and implemented in 0.13-µm CMOS standard cell technology at a supply voltage of 1.1 V. The proposed RS decoder operates at a clock frequency of 660 MHz and has a throughput of 5.3 Gb/s. The proposed architecture requires approximately 15% fewer gates and simpler control logic than architectures based on the popular modified Euclidean algorithm.

  • Building an Effective Speech Corpus by Utilizing Statistical Multidimensional Scaling Method

    Goshu NAGINO  Makoto SHOZAKAI  Tomoki TODA  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER-Corpus

    Vol: E91-D  No: 3  Page(s): 607-614

    This paper proposes a technique for building an effective speech corpus at lower cost by utilizing a statistical multidimensional scaling method. The statistical multidimensional scaling method visualizes multiple HMM acoustic models in two-dimensional space. First, a small number of voice samples per speaker is collected; speaker-adapted acoustic models trained with the collected utterances are mapped into two-dimensional space by the statistical multidimensional scaling method. Next, speakers located on the periphery of the distribution in the plotted map are selected; a speech corpus is built by collecting enough voice samples from the selected speakers. In an experiment on building an isolated-word speech corpus, the performance of an acoustic model trained with 200 selected speakers was equivalent to that of an acoustic model trained with 533 non-selected speakers. This means that a cost reduction of more than 62% was achieved. In an experiment on building a continuous-word speech corpus, the performance of an acoustic model trained with 500 selected speakers was equivalent to that of an acoustic model trained with 1179 non-selected speakers. This means that a cost reduction of more than 57% was achieved.
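
    The two cost-reduction figures can be checked with a short calculation, under the assumption (not stated in the abstract) that corpus-building cost is proportional to the number of speakers recorded:

```latex
\[
1 - \frac{200}{533} \approx 0.625 \quad (\text{more than } 62\%),
\qquad
1 - \frac{500}{1179} \approx 0.576 \quad (\text{more than } 57\%).
\]
```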

  • Improved Noise Reduction with Packet Loss Recovery Based on Post-Filtering over IP Networks

    Jinsul KIM  Hyunwoo LEE  Won RYU  Seungho HAN  Minsoo HAHN  

     
    LETTER-Multimedia Systems for Communications

    Vol: E91-B  No: 3  Page(s): 975-979

    This letter focuses on improving current noise reduction methods to address critical speech distortion problems, providing robust noise reduction of noisy speech signals for speech enhancement over IP networks. For robust noise reduction with packet loss recovery, we propose a novel optimized Wiener filtering technique that uses the estimated SNR (Signal-to-Noise Ratio) together with a packet loss recovery method, applied as post-filtering over IP networks. Simulation results demonstrate that, across the packet loss and SNR conditions considered, the proposed scheme provides better noise reduction and recovery rates than other methods.

  • Local Peak Enhancement for In-Car Speech Recognition in Noisy Environment

    Osamu ICHIKAWA  Takashi FUKUDA  Masafumi NISHIMURA  

     
    LETTER

    Vol: E91-D  No: 3  Page(s): 635-639

    The accuracy of automatic speech recognition in a car is significantly degraded in a very low SNR (Signal to Noise Ratio) situation such as "Fan high" or "Window open". In such cases, speech signals are often buried in broadband noise. Although several existing noise reduction algorithms are known to improve the accuracy, other approaches that can work with them are still required for further improvement. One of the candidates is enhancement of the harmonic structures in human voices. However, most conventional approaches are based on comb filtering, and it is difficult to use them in practical situations, because their assumptions for F0 detection and for voiced/unvoiced detection are not accurate enough in realistic noisy environments. In this paper, we propose a new approach that does not rely on such detection. An observed power spectrum is directly converted into a filter for speech enhancement, by retaining only the local peaks considered to be harmonic structures in the human voice. In our experiments, this approach reduced the word error rate by 17% in realistic automobile environments. Also, it showed further improvement when used with existing noise reduction methods.

  • Language Modeling Using PLSA-Based Topic HMM

    Atsushi SAKO  Tetsuya TAKIGUCHI  Yasuo ARIKI  

     
    PAPER-Language Modeling

    Vol: E91-D  No: 3  Page(s): 522-528

    In this paper, we propose a PLSA-based language model for sports-related live speech. This model is implemented using a unigram rescaling technique that combines a topic model and an n-gram. In the conventional method, unigram rescaling is performed with a topic distribution estimated from a recognized transcription history. This method can improve the performance, but it cannot express topic transition. By incorporating the concept of topic transition, it is expected that the recognition performance will be improved. Thus, the proposed method employs a "Topic HMM" instead of a history to estimate the topic distribution. The Topic HMM is an Ergodic HMM that expresses typical topic distributions as well as topic transition probabilities. Word accuracy results from our experiments confirmed the superiority of the proposed method over a trigram and a PLSA-based conventional method that uses a recognized history.

  • Noisy Speech Recognition Based on Integration/Selection of Multiple Noise Suppression Methods Using Noise GMMs

    Norihide KITAOKA  Souta HAMAGUCHI  Seiichi NAKAGAWA  

     
    PAPER-Noisy Speech Recognition

    Vol: E91-D  No: 3  Page(s): 411-421

    To achieve high recognition performance for a wide variety of noise types and a wide range of signal-to-noise ratios, this paper presents methods for integrating four noise reduction algorithms: spectral subtraction with smoothing in the time direction, temporal-domain SVD-based speech enhancement, GMM-based speech estimation and KLT-based comb filtering. We propose two types of methods for combining noise suppression algorithms: selection of the front-end processor and combination of results from multiple recognition processes. Recognition results on the CENSREC-1 task showed the effectiveness of the proposed methods.

  • Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition

    Wooil KIM  John H.L. HANSEN  

     
    PAPER-Noisy Speech Recognition

    Vol: E91-D  No: 3  Page(s): 430-438

    An effective feature compensation method is developed for reliable speech recognition in real-life in-vehicle environments. The CU-Move corpus, used for evaluation, contains a range of speech and noise signals collected from a number of speakers under actual driving conditions. PCGMM-based feature compensation, considered in this paper, utilizes parallel model combination to generate a noise-corrupted speech model by combining the clean speech model and the noise model. To address unknown time-varying background noise, a method for interpolating multiple environmental models is employed. To reduce the computational expense of multiple models, an Environment Transition Model is employed, motivated by the Noise Language Model used in Environmental Sniffing. An environment-dependent mixture sharing scheme is proposed and shown to be more effective in reducing computational complexity. A smaller environmental model set is determined by the environment transition model for mixture sharing. The proposed scheme is evaluated on the connected single digits portion of the CU-Move database using the Aurora2 evaluation toolkit. Experimental results indicate that our feature compensation method is effective for improving speech recognition in real-life in-vehicle conditions. A 73.10% reduction in computational requirements was obtained by employing the environment-dependent mixture sharing scheme, with only a slight change in recognition performance. This demonstrates that the proposed method is effective in maintaining the distinctive characteristics among the different environmental models, even when selecting a large number of Gaussian components for mixture sharing.

  • Design for Testability Method to Avoid Error Masking of Software-Based Self-Test for Processors

    Masato NAKAZATO  Michiko INOUE  Satoshi OHTAKE  Hideo FUJIWARA  

     
    PAPER-High-Level Testing

    Vol: E91-D  No: 3  Page(s): 763-770

    In this paper, we propose a design for testability method for test programs of software-based self-test using test program templates. Software-based self-test using templates has a problem of error masking where some faults detected in a test generation for a module are not detected by the test program synthesized from the test. The proposed method achieves 100% template level fault efficiency, that is, it completely avoids the error masking. Moreover, the proposed method has no performance degradation (adds only observation points) and enables at-speed testing.

  • Selection of Optimum Vocabulary and Dialog Strategy for Noise-Robust Spoken Dialog Systems

    Akinori ITO  Takanobu OBA  Takashi KONASHI  Motoyuki SUZUKI  Shozo MAKINO  

     
    PAPER-ASR System Architecture

    Vol: E91-D  No: 3  Page(s): 538-548

    Speech recognition in a noisy environment is one of the hottest topics in speech recognition research. Noise-tolerant acoustic models or noise reduction techniques are often used to improve recognition accuracy. In this paper, we propose a method to improve the accuracy of a spoken dialog system from a language model point of view. In the proposed method, the dialog system automatically changes its language model and dialog strategy according to the estimated recognition accuracy in a noisy environment in order to keep the performance of the system high. In a noise-free environment, the system accepts any utterance from a user. In a noisy environment, on the other hand, the system restricts its grammar and vocabulary. To realize this strategy, we investigated a method to avoid the user's out-of-grammar utterances through an instruction given by the system to the user. Furthermore, we developed a method to estimate recognition accuracy from features extracted from noise signals. Finally, we implemented the proposed dialog system based on these investigations.

  • Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition

    Jin-Song ZHANG  Xin-Hui HU  Satoshi NAKAMURA  

     
    PAPER-Acoustic Modeling

    Vol: E91-D  No: 3  Page(s): 508-513

    Chinese is a representative tonal language, and how to process tone information in state-of-the-art large-vocabulary speech recognition systems has been an attractive topic. This paper presents a novel way to derive an efficient phoneme set of tone-dependent units for building a recognition system, by iteratively merging a pair of tone-dependent units according to the principle of minimal loss of Mutual Information (MI). The mutual information is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system lexicon and language model. The approach is able to keep the discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating words that become homophones when tones are ignored, and to merge those tonal (and phoneme) contrasts that are not important for word disambiguation in the recognition task. This enables a flexible selection of the phoneme set according to a balance between the amount of MI and the number of phonemes. We applied the method to the traditional phoneme set of Initials/Finals, and derived several phoneme sets with different numbers of units. Speech recognition experiments using the derived sets showed the effectiveness of the method.

  • 6-bit 1.6-GS/s 85-mW Flash Analog to Digital Converter Using Symmetric Three-Input Comparator

    Yun-Jeong KIM  Jong-Ho LEE  Ja-Hyun KOO  Kwang-Hyun BAEK  Suki KIM  

     
    LETTER-Electronic Circuits

    Vol: E91-C  No: 3  Page(s): 392-395

    In this paper, we describe a 6-bit 1.6-GS/s flash analog to digital converter (ADC). To reduce the power consumption and active area, we propose a new interpolation architecture using a symmetric three-input comparator. This ADC achieves 5.56 effective bits for input frequencies up to 220 MHz at 1.6 GS/s, and almost five effective bits for 660 MHz input at 1.6 GS/s. Peak INL and DNL are less than 0.5 LSB and 0.45 LSB, respectively. This ADC consumes 85 mW from 1.8 V at 1.6 GS/s and occupies an active area of 0.27 mm². It is fabricated in 0.18-µm CMOS.

  • A Conservative Framework for Safety-Failure Checking

    Frederic BEAL  Tomohiro YONEDA  Chris J. MYERS  

     
    PAPER-Verification and Timing Analysis

    Vol: E91-D  No: 3  Page(s): 642-654

    We present a new framework for checking safety failures. The approach is based on conservative inference of the internal states of a system by observing its interaction with its environment. It relies on two similar mechanisms: forward implication, which analyzes the consequences of an input applied to the system, and backward implication, which performs the same task for an output transition. Although the approach is very simple, it is general, and we believe it can yield efficient algorithms for different safety-failure checking problems. As a case study, we have applied this framework to an existing problem, hazard checking in (speed-independent) asynchronous circuits. Our new methodology yields an efficient algorithm that performs as well as or better than all existing algorithms, while being more general than the fastest one.

  • Analysis of Adaptive Control Scheme in IEEE 802.11 and IEEE 802.11e Wireless LANs

    Bih-Hwang LEE  Hui-Cheng LAI  

     
    PAPER-Terrestrial Radio Communications

    Vol: E91-B  No: 3  Page(s): 862-870

    To achieve prioritized quality of service (QoS) guarantees, the IEEE 802.11e EDCAF (enhanced distributed channel access function) provides differentiated services by configuring different QoS parameters for different access categories (ACs). An admission control scheme is needed to maximize the utilization of the wireless channel. Most papers study throughput improvement by solving a complicated multidimensional Markov-chain model. In this paper, we introduce a backoff model to study the transmission probability for different arbitration interframe space numbers (AIFSN) and minimum contention window sizes (CWmin). We propose an adaptive control scheme (ACS) that dynamically updates AIFSN and CWmin based on periodic monitoring of the current channel status and QoS requirements, to achieve the specified service differentiation at access points (APs). This paper provides an effective tuning mechanism for improving QoS in WLANs. Analytical and simulation results show that the proposed scheme outperforms the basic EDCAF in terms of throughput and service differentiation, especially at high collision rates.

  • Multichannel Speech Enhancement Based on Generalized Gamma Prior Distribution with Its Online Adaptive Estimation

    Tran HUY DAT  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Enhancement

    Vol: E91-D  No: 3  Page(s): 439-447

    We present a multichannel speech enhancement method based on MAP speech spectral magnitude estimation using a generalized gamma model of the speech prior distribution, where the model parameters are adapted from actual noisy speech in a frame-by-frame manner. The use of a more general prior distribution with online adaptive estimation is shown to be effective for speech spectral estimation in noisy environments. Furthermore, multichannel information in the form of cross-channel statistics is shown to be useful for better adapting the prior distribution parameters to the actual observation, resulting in better performance of the speech enhancement algorithm. We tested the proposed algorithm on an in-car speech database and obtained significant improvements in speech recognition performance, particularly under non-stationary noise conditions such as music, air conditioner and open window.

  • Robust F0 Estimation Using ELS-Based Robust Complex Speech Analysis

    Keiichi FUNAKI  Tatsuhiko KINJO  

     
    LETTER-Digital Signal Processing

    Vol: E91-A  No: 3  Page(s): 868-871

    Complex speech analysis of an analytic speech signal can accurately estimate the spectrum at low frequencies, since the analytic signal provides a spectrum only over positive frequencies. This feature makes it possible to realize more accurate F0 estimation using the complex residual signal extracted by complex-valued speech analysis. We have already proposed F0 estimation using the complex LPC residual, in which the autocorrelation function weighted by AMDF was adopted as the criterion. That method adopted MMSE-based complex LPC analysis, and it has been reported to estimate F0 more accurately for IRS-filtered speech corrupted by white Gaussian noise, although it does not perform better for IRS-filtered speech corrupted by pink noise. In this paper, robust complex speech analysis based on the ELS (Extended Least Squares) method is introduced to overcome this drawback. The experimental results for additive white Gaussian or pink noise demonstrate that the proposed algorithm, based on robust ELS-based complex AR analysis, can perform better than other methods.

  • Fast Decoding of the p-Ary First-Order Reed-Muller Codes Based On Jacket Transform

    Moon Ho LEE  Yuri L. BORISSOV  

     
    LETTER-Coding Theory

    Vol: E91-A  No: 3  Page(s): 901-904

    We propose a fast decoding algorithm for the p-ary first-order Reed-Muller code guaranteeing correction of up to errors and having complexity proportional to n log n, where n = p^m is the code length and p is an odd prime. This algorithm is an extension in the complex domain of the fast Hadamard transform decoding algorithm applicable to the binary case.
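
    The binary baseline mentioned in the last sentence can be sketched as follows. This is the classical fast Hadamard transform decoder for the binary first-order Reed-Muller code, not the p-ary jacket-transform algorithm of the letter; the function names and the bit ordering of the message are assumptions made for illustration.

```python
def fast_hadamard_transform(values):
    """Walsh-Hadamard transform in natural order; length must be a power of two."""
    out = list(values)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b  # butterfly step
        h *= 2
    return out  # about n * log2(n) additions in total


def rm1_encode(a0, coeffs):
    """Codeword c[x] = a0 XOR <a, x> for every x in {0, 1}^m."""
    m = len(coeffs)
    a = sum(bit << i for i, bit in enumerate(coeffs))
    return [a0 ^ (bin(a & x).count("1") & 1) for x in range(1 << m)]


def rm1_decode(received):
    """Pick the affine function whose +/-1 image correlates best with the input."""
    n = len(received)
    m = n.bit_length() - 1
    # Map bits to +/-1 so that the transform computes all n correlations at once.
    spectrum = fast_hadamard_transform([1 - 2 * b for b in received])
    a = max(range(n), key=lambda j: abs(spectrum[j]))
    a0 = 0 if spectrum[a] > 0 else 1  # a negative peak means the complemented word
    return a0, [(a >> i) & 1 for i in range(m)]
```

    For example, with m = 4 the code has length 16 and minimum distance 8, so a codeword with three flipped bits still decodes to the original message, because the correct spectral peak has magnitude 10 while every other coefficient has magnitude at most 6.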

  • Bi-Spectral Acoustic Features for Robust Speech Recognition

    Kazuo ONOE  Shoei SATO  Shinichi HOMMA  Akio KOBAYASHI  Toru IMAI  Tohru TAKAGI  

     
    LETTER

    Vol: E91-D  No: 3  Page(s): 631-634

    The extraction of acoustic features for robust speech recognition is very important for improving its performance in realistic environments. The bi-spectrum, based on the Fourier transform of the third-order cumulants, expresses the non-Gaussianity and the phase information of the speech signal, showing the dependency between frequency components. In this letter, we propose a method of extracting short-time bi-spectral acoustic features by averaging features within a single frame. Merged by principal component analysis (PCA) with the conventional Mel frequency cepstral coefficients (MFCC) based on the power spectrum, the proposed features yielded a 6.9% relative reduction in word error rate in Japanese broadcast news transcription experiments.
