
Keyword Search Result

[Keyword] Al(20498hit)

12581-12600hit(20498hit)

  • Parameter Sharing in Mixture of Factor Analyzers for Speaker Identification

    Hiroyoshi YAMAMOTO  Yoshihiko NANKAKU  Chiyomi MIYAJIMA  Keiichi TOKUDA  Tadashi KITAMURA  

     
    PAPER-Feature Extraction and Acoustic Modelings

      Vol:
    E88-D No:3
      Page(s):
    418-424

    This paper investigates the parameter tying structures of a mixture of factor analyzers (MFA) and discriminative training of MFA for speaker identification. The parameters of factor loading matrices or diagonal matrices are shared among different mixtures of the MFA. Then, minimum classification error (MCE) training is applied to the MFA parameters to enhance discrimination ability. The result of a text-independent speaker identification experiment shows that MFA outperforms the conventional Gaussian mixture model (GMM) with diagonal or full covariance matrices and achieves the best performance when sharing the diagonal matrices, resulting in a relative gain of 26% over the GMM with diagonal covariance matrices. The improvement is especially significant when training data are sparse. Recognition performance is further improved by MCE training, with an additional 3% error reduction.

  • High-Speed Optical Packet Processing Technologies for Optical Packet-Switched Networks

    Hirokazu TAKENOUCHI  Tatsushi NAKAHARA  Kiyoto TAKAHATA  Ryo TAKAHASHI  Hiroyuki SUZUKI  

     
    INVITED PAPER

      Vol:
    E88-C No:3
      Page(s):
    286-294

    Asynchronous optical packet switching (OPS) is a promising solution to support the continuous growth of transmission capacity demand. It has been, however, quite difficult to implement key functions needed at the node of such networks with all-optical approaches. We have proposed a new optoelectronic system composed of a packet-by-packet optical clock-pulse generator (OCG), an all-optical serial-to-parallel converter (SPC), a photonic parallel-to-serial converter (PSC), and CMOS circuitry. The system makes it possible to carry out various required functions such as buffering (random access memory), optical packet compression/decompression, and optical label swapping for high-speed asynchronous optical packets.

  • Developments in Corpus-Based Speech Synthesis: Approaching Natural Conversational Speech

    Nick CAMPBELL  

     
    INVITED PAPER

      Vol:
    E88-D No:3
      Page(s):
    376-383

    This paper describes the special demands of conversational speech in the context of corpus-based speech synthesis. The author proposed the CHATR system of prosody-based unit-selection for concatenative waveform synthesis seven years ago, and now extends this work to incorporate the results of an analysis of five years of recordings of spontaneous conversational speech in a wide range of actual daily-life situations. The paper proposes that the expression of affect (often translated as 'kansei' in Japanese) is the main factor differentiating laboratory speech from real-world conversational speech, and presents a framework for the specification of affect through differences in speaking style and voice quality. Having an enormous corpus of speech samples available for concatenation allows the selection of complete phrase-sized utterance segments, and changes the focus of unit selection from segmental or phonetic continuity to prosodic and discoursal appropriateness. Samples of the resulting large-corpus-based synthesis can be heard at http://feast.his.atr.jp/AESOP.

  • Game-Theoretic Approach to Capacity and Stability Evaluations of Decentralized Adaptive Route Selections in Wireless Ad Hoc Networks

    Koji YAMAMOTO  Susumu YOSHIDA  

     
    PAPER-Network

      Vol:
    E88-B No:3
      Page(s):
    1009-1016

    A game-theoretic analysis is applied to the evaluation of capacity and stability of a wireless ad hoc network in which each source node independently chooses a route to the destination node so as to enhance throughput. First, the throughput of individual multihop transmission with rate adaptation is evaluated. Observations from this evaluation indicate that the optimal number of hops in terms of the achievable end-to-end throughput depends on the received signal-to-noise ratio. Next, the decentralized adaptive route selection problem in which each source node competes for resources over arbitrary topologies is defined as a game. Numerical results reveal that in some cases this game has no Nash equilibria; i.e., each rational source node cannot determine a unique route. The occurrence of such cases depends on both the transmit power and spatial arrangement of the nodes. Then, the obtained network throughput under the equilibrium conditions is compared to the capacity under centralized scheduling. Numerical results reveal that when the transmit power is low, decentralized adaptive route selection may attain throughput near the capacity.
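
    The statement that a game may have no Nash equilibria can be illustrated with a small sketch. It assumes a two-player game with pure strategies only and uses hypothetical payoff matrices; it is not the route-selection game or the throughput model of the paper. The function name `pure_nash_equilibria` is introduced here for illustration.

```python
from itertools import product


def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoff_a[r][c] and payoff_b[r][c] are the payoffs of players A and B
    when A plays row r and B plays column c (hypothetical values).
    """
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for r, c in product(range(rows), range(cols)):
        # A cannot gain by switching rows while B keeps column c.
        a_is_best = all(payoff_a[r][c] >= payoff_a[r2][c] for r2 in range(rows))
        # B cannot gain by switching columns while A keeps row r.
        b_is_best = all(payoff_b[r][c] >= payoff_b[r][c2] for c2 in range(cols))
        if a_is_best and b_is_best:
            equilibria.append((r, c))
    return equilibria
```

    An empty result means that, for every strategy pair, at least one player would prefer to deviate, so no player can settle on a single choice.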

  • New Switching Control for Synchronous Rectifications in Low-Voltage Paralleled Converter System without Voltage and Current Fluctuations

    Hiroshi SHIMAMORI  Teruhiko KOHAMA  Tamotsu NINOMIYA  

     
    PAPER-Electronic Circuits

      Vol:
    E88-C No:3
      Page(s):
    395-402

    A paralleled converter system with synchronous rectifiers (SRs) causes several problems, such as surge voltage, inhalation current, and circulating current. Generally, the system stops operating the SRs under light load to avoid these problems. At the same time, however, large voltage fluctuations occur in the output of the modules due to the forward voltage drop of the diode. These fluctuations cause serious faults in semiconductor devices that operate at very low voltage, such as CPUs and VLSI circuits. Moreover, the voltage fluctuations generate unstable current fluctuations in a paralleled converter system with current-sharing control. This paper proposes new switching control methods for rectifiers to reduce the voltage and current fluctuations. The effectiveness of the proposed methods is confirmed by computer simulation and experimental results.

  • Spectrum Tuning of Fiber Bragg Gratings by Strain Distributions and Its Applications

    Chee Seong GOH  Sze Yun SET  Kazuro KIKUCHI  

     
    PAPER

      Vol:
    E88-C No:3
      Page(s):
    363-371

    We report tunable optical devices based on fiber Bragg gratings (FBGs), whose filtering characteristics are controlled by strain distributions. These devices include a widely wavelength tunable filter, a tunable group-velocity dispersion (GVD) compensator, a tunable dispersion slope (DS) compensator, and a variable-bandwidth optical add/drop multiplexer (OADM), which will play important roles for next-generation reconfigurable optical networks.

  • Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones

    Weifeng LI  Tetsuya SHINDE  Hiroshi FUJIMURA  Chiyomi MIYAJIMA  Takanori NISHINO  Katunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Feature Extraction and Acoustic Modelings

      Vol:
    E88-D No:3
      Page(s):
    384-390

    This paper describes a new multi-channel method of noisy speech recognition, which estimates the log spectrum of speech at a close-talking microphone based on the multiple regression of the log spectra (MRLS) of noisy signals captured by distributed microphones. The advantages of the proposed method are as follows: 1) the method requires neither a sensitive geometric layout, nor calibration of the sensors, nor additional pre-processing for tracking the speech source; 2) the system requires very little computation; and 3) the regression weights can be statistically optimized over the given training data. Once the optimal regression weights are obtained by regression learning, they can be used to generate the estimated log spectrum in the recognition phase, where close-talking speech is no longer required. The performance of the proposed method is illustrated by speech recognition of real in-car dialogue data. In comparison to the nearest distant microphone and a multi-microphone adaptive beamformer, the proposed approach obtains relative word error rate (WER) reductions of 9.8% and 3.6%, respectively.

  • Applying Sparse KPCA for Feature Extraction in Speech Recognition

    Amaro LIMA  Heiga ZEN  Yoshihiko NANKAKU  Keiichi TOKUDA  Tadashi KITAMURA  Fernando G. RESENDE  

     
    PAPER-Feature Extraction and Acoustic Modelings

      Vol:
    E88-D No:3
      Page(s):
    401-409

    This paper presents an analysis of the applicability of Sparse Kernel Principal Component Analysis (SKPCA) for feature extraction in speech recognition, as well as a proposed approach to make the SKPCA technique realizable for a large amount of training data, which is a usual context in speech recognition systems. Although Kernel Principal Component Analysis (KPCA) has proved to be an efficient technique for speech recognition, it has the disadvantage of requiring training data reduction when the amount of data is excessively large. This data reduction is important to avoid computational infeasibility and/or an extremely high computational burden related to the feature representation step of the training and the test data evaluations. The standard approach to this data reduction is to randomly choose frames from the original data set, which does not necessarily provide a good statistical representation of the original data set. To solve this problem, a likelihood-related re-estimation procedure was applied to the KPCA framework, thus creating the SKPCA, which nevertheless is not realizable for large training databases. The proposed approach consists of clustering the training data and applying to these clusters an SKPCA-like data reduction technique, generating reduced data clusters. These reduced data clusters are merged and reduced in a recursive procedure until just one cluster is obtained, making the SKPCA approach realizable for a large amount of training data. The experimental results show the efficiency of the SKPCA technique with the proposed approach over the KPCA with the standard sparse solution using randomly chosen frames and over the standard feature extraction techniques.

  • Spectrum Estimation by Noise-Compensated Data Extrapolation

    Jonah GAMBA  Tetsuya SHIMAMURA  

     
    PAPER-Digital Signal Processing

      Vol:
    E88-A No:3
      Page(s):
    702-711

    High-resolution spectrum estimation techniques have been extensively studied in recent publications. Knowledge of the noise variance is vital for spectrum estimation from noise-corrupted observations. This paper presents the use of noise compensation and data extrapolation for spectrum estimation. We assume that the observed data sequence can be represented by a set of autoregressive parameters. A recently proposed iterative algorithm is then used for noise variance estimation while autoregressive parameters are used for data extrapolation. We also present analytical results to show the exponential decay characteristics of the extrapolated samples and the frequency domain smoothing effect of data extrapolation. Some statistical results are also derived. The proposed noise-compensated data extrapolation approach is applied to both the autoregressive and FFT-based spectrum estimation methods. Finally, simulation results show the superiority of the method in terms of bias reduction and resolution improvement for sinusoids buried in noise.

  • Tracking of Speaker Direction by Integrated Use of Microphone Pairs in Equilateral-Triangle

    Yusuke HIOKA  Nozomu HAMADA  

     
    PAPER

      Vol:
    E88-A No:3
      Page(s):
    633-641

    In this report, we propose a tracking algorithm of speaker direction using microphones located at vertices of an equilateral triangle. The method realizes tracking by minimizing a performance index that consists of the cross spectra at three different microphone pairs in the triangular array. We adopt the steepest descent method to minimize it, and for guaranteeing global convergence to the correct direction with high accuracy, we alter the performance index during the adaptation depending on the convergence state. Through some computer simulation and experiments in a real acoustic environment, we show the effectiveness of the proposed method.

  • An Objective Method for Evaluating Speech Translation System: Using a Second Language Learner's Corpus

    Keiji YASUDA  Fumiaki SUGAYA  Toshiyuki TAKEZAWA  Genichiro KIKUI  Seiichi YAMAMOTO  Masuzo YANAGIDA  

     
    PAPER-Speech Corpora and Related Topics

      Vol:
    E88-D No:3
      Page(s):
    569-577

    In this paper, we propose an objective method for assessing the capability of a speech translation system. It automates the translation paired comparison method proposed by Sugaya et al., which yields a simple, easy-to-understand TOEIC score for succinctly evaluating a speech translation system. To avoid the high evaluation cost of the original method, which requires a large manual effort, the new method automates the procedure by employing objective metrics such as BLEU and a DP-based measure. The evaluation results obtained by the proposed method are similar to those of the original method. The proposed method is also used to evaluate the usefulness of a speech translation system. It is found that our speech translation system is useful in general, even to users whose TOEIC scores are higher than the system's.

  • Automatic Generation of Non-uniform and Context-Dependent HMMs Based on the Variational Bayesian Approach

    Takatoshi JITSUHIRO  Satoshi NAKAMURA  

     
    PAPER-Feature Extraction and Acoustic Modelings

      Vol:
    E88-D No:3
      Page(s):
    391-400

    We propose a new method both for automatically creating non-uniform, context-dependent HMM topologies and for selecting the number of mixture components, based on the Variational Bayesian (VB) approach. Although the Maximum Likelihood (ML) criterion is generally used to create HMM topologies, it has an over-fitting problem. Recently, to avoid this problem, the VB approach has been applied to create acoustic models for speech recognition. We introduce the VB approach to the Successive State Splitting (SSS) algorithm, which can create both contextual and temporal variations for HMMs. Experimental results indicate that the proposed method can automatically create a more efficient model than the original method. We also evaluated a method to increase the number of mixture components by using the VB approach and considering temporal structures. The VB approach obtained almost the same performance with a smaller number of mixture components than ML-based methods.

  • Location-Aware Power-Efficient Directional MAC Protocol in Ad Hoc Networks Using Directional Antenna

    Tetsuro UEDA  Shinsuke TANAKA  Dola SAHA  Siuli ROY  Somprakash BANDYOPADHYAY  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E88-B No:3
      Page(s):
    1169-1181

    The use of directional antennas in ad hoc wireless networks can largely reduce radio interference, thereby improving the utilization of the wireless medium. Our major contribution in this paper is a MAC protocol that exploits the advantages of directional antennas in ad hoc networks for improved system performance, with the objective of effective utilization of the shared wireless medium. To implement an effective MAC protocol in this context, a node should know how to set its transmission direction to transmit a packet to its neighbors and how to avoid transmission in other directions where data communications are already in progress. We propose a receiver-centric approach for location tracking and the MAC protocol, so that nodes become aware of their neighborhood and also of the direction of the nodes for communicating directionally. A node develops its location-awareness from this neighborhood-awareness and direction-awareness. In this context, researchers usually assume that the gain of directional antennas is equal to the gain of the corresponding omni-directional antenna. However, for a given amount of input power, the range R with a directional antenna will be much larger than that with an omni-directional antenna. We therefore also propose a two-level transmit power control mechanism to approximately equalize the transmission range R of an antenna operating in omni-directional and directional modes. This will not only improve medium utilization but also help to conserve the power of the transmitting node during directional transmission. Our proposed directional MAC protocol can be effective both in ITS (Intelligent Transportation System), which we simulate in String and Parallel Topology, and in any community network, which we simulate in Random Topology. The performance evaluation on the QualNet network simulator clearly indicates the efficiency of our protocol.

  • Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task

    Masahiko MATSUSHITA  Hiromitsu NISHIZAKI  Takehito UTSURO  Seiichi NAKAGAWA  

     
    PAPER-Spoken Language Systems

      Vol:
    E88-D No:3
      Page(s):
    472-480

    This paper presents speech-driven Web retrieval models which accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving speech recognition accuracy of spoken queries and then improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluated techniques for combining the outputs of multiple LVCSR models in the recognition of spoken queries. As model combination techniques, we compared the SVM learning technique with conventional voting schemes such as ROVER. In addition, to investigate the effect of the language model's vocabulary size on retrieval performance, we prepared two language models, one with a vocabulary size of 20,000 and the other with a vocabulary size of 60,000. Then, we evaluated the differences in the recognition rates of the spoken queries and the retrieval performance. We showed that combining multiple LVCSR models could improve both speech recognition and retrieval accuracies in speech-driven text retrieval. Comparing the retrieval accuracies obtained when an LM with a vocabulary size of 20,000 or 60,000 is used in an LVCSR system, we found that the larger the vocabulary size is, the better the retrieval accuracy is.

  • A Compact Normal Walk Modeling in PCS Networks with Mesh Cells

    Chiu-Ching TUAN  Chen-Chau YANG  

     
    PAPER-Mobile Information Network and Personal Communications

      Vol:
    E88-A No:3
      Page(s):
    761-769

    Model-based movement patterns play a crucial role in evaluating the performance of mobility-dependent Personal Communication Service (PCS) strategies. This study proposes a new normal walk model that represents the daily movement patterns of a mobile station (MS) in PCS networks more closely than a conventional random walk model. A drift angle θ in this model determines the relative direction in which an MS hands off in the next step, based on the concepts that most real trips follow the shortest path and that the directions of daily motion are mostly symmetric. Hence, θ is assumed to approach the normal distribution with the parameters µ set to 0° and σ in the range of 5° to 90°. Varying σ thus redistributes the probabilities associated with θ to make the normal mobility patterns more realistic than the random ones. Experimental results verify that the proposed normal walk is correct and valid for modeling an n-layer mesh cluster of PCS networks. Moreover, when σ = 79.5°, a normal walk can almost represent, and even replace, a random walk.
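
    The role of the drift angle can be sketched as follows. This is a minimal illustration, not the paper's model: it assumes a square mesh with four neighbor cells, draws θ in degrees from a normal distribution with mean 0 and standard deviation σ, and rounds θ to the nearest multiple of 90 degrees to pick the next cell. The rounding rule and the names `STEPS`, `next_direction`, and `normal_walk` are assumptions made for this example.

```python
import random

# Unit steps for the four mesh directions: east, north, west, south.
STEPS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def next_direction(direction, sigma_deg, rng):
    """Turn relative to the current direction by a normally distributed drift."""
    theta = rng.gauss(0.0, sigma_deg)   # drift angle in degrees, mean 0
    turn = round(theta / 90.0)          # assumed quantization to mesh directions
    return (direction + turn) % 4


def normal_walk(num_steps, sigma_deg, rng):
    """Return the list of visited cells, starting at (0, 0) and heading east."""
    x, y, direction = 0, 0, 0
    path = [(x, y)]
    for _ in range(num_steps):
        direction = next_direction(direction, sigma_deg, rng)
        dx, dy = STEPS[direction]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path
```

    With a small σ the walk in this sketch keeps going nearly straight, while a larger σ makes turns more frequent, which mirrors the qualitative behavior described in the abstract.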

  • Framed ALOHA for Multiple RFID Objects Identification

    Bin ZHEN  Mamoru KOBAYASHI  Masashi SHIMIZU  

     
    PAPER-Network

      Vol:
    E88-B No:3
      Page(s):
    991-999

    Radio frequency identification (RFID) enables everyday objects to be identified, tracked, and recorded. RFID tags must be extremely simple and low in cost to be suitable for large-scale application. An efficient RFID anti-collision mechanism must have low access latency and low power consumption. This paper investigates how to recognize multiple RFID tags within the reader's interrogation range, without knowing the number of tags in advance, by using framed ALOHA. To optimize power consumption and overall tag read time, a combinatory model is proposed to analyze both passive and active tags, taking into account the capture effect over wireless fading channels. Using the model, the parameters for tag set estimation and frame size update are presented. Simulations were conducted to verify the analysis. In addition, we propose a way to combat the capture effect in deterministic anti-collision algorithms.
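
    The basic framed ALOHA mechanism can be sketched as follows. This is a simplified illustration that assumes each tag picks one slot uniformly at random and that any slot chosen by two or more tags is lost; it ignores the capture effect and fading channels that the paper's combinatory model considers. The names `simulate_frame` and `expected_singles` are introduced here for illustration.

```python
import random
from collections import Counter


def simulate_frame(num_tags, frame_size, rng):
    """Simulate one frame and return (idle, single, collided) slot counts."""
    # Each tag independently picks one slot of the frame.
    choices = Counter(rng.randrange(frame_size) for _ in range(num_tags))
    singles = sum(1 for count in choices.values() if count == 1)
    collisions = sum(1 for count in choices.values() if count > 1)
    idle = frame_size - singles - collisions
    return idle, singles, collisions


def expected_singles(num_tags, frame_size):
    """Expected number of slots holding exactly one tag: n * (1 - 1/L)^(n-1)."""
    return num_tags * (1 - 1 / frame_size) ** (num_tags - 1)
```

    A reader that does not know the number of tags can compare the observed idle, single, and collided counts against such expectations to estimate the tag population and adjust the next frame size.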

  • Enhanced Flooding Algorithms Introducing the Concept of Biotic Growth

    Hideki TODE  Makoto WADA  Kazuhiko KINOSHITA  Toshihiro MASAKI  Koso MURAKAMI  

     
    PAPER-Software Platform Technologies

      Vol:
    E88-B No:3
      Page(s):
    903-910

    A flooding algorithm is an indispensable and fundamental network control mechanism for achieving tasks such as notifying all nodes of some information, transferring data with high reliability, getting some information from all nodes, or reserving a route by flooding messages in the network. In particular, the flooding algorithm is highly effective in heterogeneous and dynamic network environments such as so-called ubiquitous networks, whose topology is indefinite or changes dynamically and whose nodal functions may be simple and less intelligent. In practice, it is applied to grasp the network topology in a sensor network or an ad hoc network, or to retrieve content information by mobile agent systems. A flooding algorithm has the advantages of robustness and optimality through parallel processing of messages. However, the flooding mechanism has a fundamental disadvantage: it causes message congestion in the network, and eventually increases the processing time until the flooding control is finished. In this paper, we propose and evaluate methods for producing a more efficient flooding algorithm by adopting the growth processes of primitive creatures, such as molds or microbes.
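
    The message overhead of flooding can be seen in a minimal sketch of plain flooding with duplicate suppression. This is the conventional baseline only, not the growth-inspired methods proposed in the paper; the adjacency-dictionary representation and the function name `flood` are assumptions made for this example.

```python
from collections import deque


def flood(adjacency, source):
    """Flood from source; return (set of reached nodes, messages sent).

    adjacency maps each node to the list of its neighbors.
    """
    reached = {source}
    messages = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # A node forwards the message once to every neighbor.
        for neighbor in adjacency[node]:
            messages += 1
            if neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached, messages
```

    In this sketch every node transmits to all of its neighbors, so a connected undirected network carries two messages per link even though far fewer would suffice to reach every node, which is the congestion problem the abstract refers to.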

  • Cyclic Codes over $F_p + uF_p + \cdots + u^{k-1}F_p$

    Jian-Fa QIAN  Li-Na ZHANG  Shi-Xin ZHU  

     
    LETTER-Coding Theory

      Vol:
    E88-A No:3
      Page(s):
    795-797

    The ring $F_p + uF_p + \cdots + u^{k-1}F_p$, which has already been used in the construction of optimal frequency-hopping sequences, may be of interest in coding theory. In this work, cyclic codes over $F_p + uF_p + \cdots + u^{k-1}F_p$, an open problem posed in [1], are considered. Namely, the structures of cyclic codes over $F_p + uF_p + \cdots + u^{k-1}F_p$ and of their duals are derived.

  • Robust Dependency Parsing of Spontaneous Japanese Spoken Language

    Tomohiro OHNO  Shigeki MATSUBARA  Nobuo KAWAGUCHI  Yasuyoshi INAGAKI  

     
    PAPER-Speech Corpora and Related Topics

      Vol:
    E88-D No:3
      Page(s):
    545-552

    Spontaneously spoken Japanese includes a lot of grammatically ill-formed linguistic phenomena such as fillers, hesitations, inversions, and so on, which do not appear in written language. This paper proposes a novel method of robust dependency parsing using a large-scale spoken language corpus, and evaluates the availability and robustness of the method using spontaneously spoken dialogue sentences. By utilizing stochastic information about the appearance of ill-formed phenomena, the method can robustly parse spoken Japanese including fillers, inversions, or dependencies over utterance units. Experimental results reveal that the parsing accuracy reached 87.0%, and we confirmed that it is effective to utilize the location information of a bunsetsu, and the distance information between bunsetsus as stochastic information.

  • All-Optical Regeneration by Electro-Absorption Modulator

    Kohsuke NISHIMURA  Ryo INOHARA  Masashi USAMI  Shigeyuki AKIBA  

     
    INVITED PAPER

      Vol:
    E88-C No:3
      Page(s):
    319-326

    Optical regeneration technique using an electro-absorption modulator (EAM) is reviewed. Simple 3R optical regeneration using an EAM was proposed and verified at 20 Gbit/s. The optical nonlinearities including cross-absorption modulation (XAM) and cross-phase modulation (XPM) induced in an EAM were quantitatively characterized by experiment. High bit-rate 2R type all-optical regeneration (wavelength conversion) at 100 Gbit/s was demonstrated by an EAM in conjunction with a delayed interferometer (DI) with required optical pulse energy of 1.5 pJ. It was verified that the operable bandwidth of the EAM-DI wavelength converter at 40 Gbit/s covered almost full range of C-band without tuning operation conditions.
