The search functionality is under construction.

Author Search Result

[Author] Takashi FUKUDA(8hit)

1-8hit
  • Eigenpolarization States of Photoinduced Anisotropy in Azobenzene Film

    Daisuke BARADA  Kiyonobu TAMURA  Takashi FUKUDA  Akira EMOTO  Toyohiko YATAGAI  

     
    LETTER

      Vol:
    E91-C No:10
      Page(s):
    1675-1676

    Optical anisotropy was induced in azobenzene copolymer film using linearly or elliptically polarized laser beams. The Jones matrix of the anisotropic film was calculated from the change in polarization of the probe light. Two eigenpolarization states were obtained from the matrix. These two eigenstates are useful as a polarization basis for a polarization-tunable component.

  • Confidence Scoring for Accurate HMM-Based Speech Recognition by Using Monophone-Level Normalization Based on Subspace Method

    Muhammad GHULAM  Takaharu SATO  Takashi FUKUDA  Tsuneo NITTA  

     
    PAPER-Speech and Speaker Recognition

      Vol:
    E86-D No:3
      Page(s):
    430-437

    In this paper, a novel confidence scoring method that is applied to N-best hypotheses (word candidates) output from an HMM-based classifier is proposed. In the first pass of the proposed method, the HMM-based classifier with monophone models outputs N-best hypotheses and boundaries of all monophones in the hypotheses. In the second pass, an SM (Subspace Method)-based verifier tests the hypotheses by comparing confidence scores. To test the hypotheses, at first, the SM-based verifier calculates the similarity between phone vectors and an eigen vector set of monophones, then this similarity score is converted into a likelihood score with normalization of acoustic quality, and finally, an HMM-based likelihood of word level and an SM-based likelihood of monophone level are combined to formulate the confidence measure. Two kinds of experiments were performed to evaluate this confidence measure on speaker-independent word recognition. The results showed that the proposed confidence scoring method significantly reduced the word error rate from 4.7% obtained by the standard HMM classifier to 2.0%, and in an unknown word rejection, it reduced the equal error rate from 9.0% to 6.5%.

  • An Inductive Method to Select Simulation Points

    MinSeong CHOI  Takashi FUKUDA  Masahiro GOSHIMA  Shuichi SAKAI  

     
    PAPER-Architecture

      Pubricized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    2891-2900

    The time taken for processor simulation can be drastically reduced by selecting simulation points, which are dynamic sections obtained from the simulation result of processors. The overall behavior of the program can be estimated by simulating only these sections. The existing methods to select simulation points, such as SimPoint, used for selecting simulation points are deductive and based on the idea that dynamic sections executing the same static section of the program are of the same phase. However, there are counterexamples for this idea. This paper proposes an inductive method, which selects simulation points from the results obtained by pre-simulating several processors with distinctive microarchitectures, based on assumption that sections in which all the distinctive processors have similar istructions per cycle (IPC) values are of the same phase. We evaluated the first 100G instructions of SPEC 2006 programs. Our method achieved an IPC estimation error of approximately 0.1% by simulating approximately 0.05% of the 100G instructions.

  • Estimation of Body Structure by Biomechanical Impedance

    Hisao OKA  Masakazu YASUNA  Shun–ya SAKAMOTO  Takashi FUKUDA  

     
    LETTER

      Vol:
    E77-A No:11
      Page(s):
    1872-1874

    The mechanical impedance of silicone–gel model or chest surface has been measured and the viscoelasticity and effective vibrating radius have been obtained from the impedance. They depend on the distance between the internal block of the silicone–gel/ribs of right chest and the gel surface/skin surface. The 3–D image of internal structure is reconstructed, based on the relation between the distance from the surface and the effective vibrating radius.

  • Canonicalization of Feature Parameters for Robust Speech Recognition Based on Distinctive Phonetic Feature (DPF) Vectors

    Mohammad NURUL HUDA  Muhammad GHULAM  Takashi FUKUDA  Kouichi KATSURADA  Tsuneo NITTA  

     
    PAPER-Feature Extraction

      Vol:
    E91-D No:3
      Page(s):
    488-498

    This paper describes a robust automatic speech recognition (ASR) system with less computation. Acoustic models of a hidden Markov model (HMM)-based classifier include various types of hidden factors such as speaker-specific characteristics, coarticulation, and an acoustic environment, etc. If there exists a canonicalization process that can recover the degraded margin of acoustic likelihoods between correct phonemes and other ones caused by hidden factors, the robustness of ASR systems can be improved. In this paper, we introduce a canonicalization method that is composed of multiple distinctive phonetic feature (DPF) extractors corresponding to each hidden factor canonicalization, and a DPF selector which selects an optimum DPF vector as an input of the HMM-based classifier. The proposed method resolves gender factors and speaker variability, and eliminates noise factors by applying the canonicalzation based on the DPF extractors and two-stage Wiener filtering. In the experiment on AURORA-2J, the proposed method provides higher word accuracy under clean training and significant improvement of word accuracy in low signal-to-noise ratio (SNR) under multi-condition training compared to a standard ASR system with mel frequency ceptral coeffient (MFCC) parameters. Moreover, the proposed method requires a reduced, two-fifth, Gaussian mixture components and less memory to achieve accurate ASR.

  • PS-ZCPA Based Feature Extraction with Auditory Masking, Modulation Enhancement and Noise Reduction for Robust ASR

    Muhammad GHULAM  Takashi FUKUDA  Kouichi KATSURADA  Junsei HORIKAWA  Tsuneo NITTA  

     
    PAPER-Speech Recognition

      Vol:
    E89-D No:3
      Page(s):
    1015-1023

    A pitch-synchronous (PS) auditory feature extraction method based on ZCPA (Zero-Crossings Peak-Amplitudes) was proposed previously and showed more robustness over a conventional ZCPA and MFCC based features. In this paper, firstly, a non-linear adaptive threshold adjustment procedure is introduced into the PS-ZCPA method to get optimal results in noisy conditions with different signal-to-noise ratio (SNR). Next, auditory masking, a well-known auditory perception, and modulation enhancement that simulates a strong relationship between modulation spectrums and intelligibility of speech are embedded into the PS-ZCPA method. Finally, a Wiener filter based noise reduction procedure is integrated into the method to make it more noise-robust, and the performance is evaluated against ETSI ES202 (WI008), which is a standard front-end for distributed speech recognition. All the experiments were carried out on Aurora-2J database. The experimental results demonstrated improved performance of the PS-ZCPA method by embedding auditory masking into it, and a slightly improved performance by using modulation enhancement. The PS-ZCPA method with Wiener filter based noise reduction also showed better performance than ETSI ES202 (WI008).

  • Orthogonalized Distinctive Phonetic Feature Extraction for Noise-Robust Automatic Speech Recognition

    Takashi FUKUDA  Tsuneo NITTA  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1110-1118

    In this paper, we propose a noise-robust automatic speech recognition system that uses orthogonalized distinctive phonetic features (DPFs) as input of HMM with diagonal covariance. In an orthogonalized DPF extraction stage, first, a speech signal is converted to acoustic features composed of local features (LFs) and ΔP, then a multilayer neural network (MLN) with 153 output units composed of context-dependent DPFs of a preceding context DPF vector, a current DPF vector, and a following context DPF vector maps the LFs to DPFs. Karhunen-Loeve transform (KLT) is then applied to orthogonalize each DPF vector in the context-dependent DPFs, using orthogonal bases calculated from a DPF vector that represents 38 Japanese phonemes. Each orthogonalized DPF vector is finally decorrelated one another by using Gram-Schmidt orthogonalization procedure. In experiments, after evaluating the parameters of the MLN input and output units in the DPF extractor, the orthogonalized DPFs are compared with original DPFs. The orthogonalized DPFs are then evaluated in comparison with a standard parameter set of MFCCs and dynamic features. Next, noise robustness is tested using four types of additive noise. The experimental results show that the use of the proposed orthogonalized DPFs can significantly reduce the error rate in an isolated spoken-word recognition task both with clean speech and with speech contaminated by additive noise. Furthermore, we achieved significant improvements when combining the orthogonalized DPFs with conventional static MFCCs and ΔP.

  • Local Peak Enhancement for In-Car Speech Recognition in Noisy Environment

    Osamu ICHIKAWA  Takashi FUKUDA  Masafumi NISHIMURA  

     
    LETTER

      Vol:
    E91-D No:3
      Page(s):
    635-639

    The accuracy of automatic speech recognition in a car is significantly degraded in a very low SNR (Signal to Noise Ratio) situation such as "Fan high" or "Window open". In such cases, speech signals are often buried in broadband noise. Although several existing noise reduction algorithms are known to improve the accuracy, other approaches that can work with them are still required for further improvement. One of the candidates is enhancement of the harmonic structures in human voices. However, most conventional approaches are based on comb filtering, and it is difficult to use them in practical situations, because their assumptions for F0 detection and for voiced/unvoiced detection are not accurate enough in realistic noisy environments. In this paper, we propose a new approach that does not rely on such detection. An observed power spectrum is directly converted into a filter for speech enhancement, by retaining only the local peaks considered to be harmonic structures in the human voice. In our experiments, this approach reduced the word error rate by 17% in realistic automobile environments. Also, it showed further improvement when used with existing noise reduction methods.