Keyword Search Result

[Keyword] ATI(18690hit)

7501-7520hit(18690hit)

  • Intentional Voice Command Detection for Trigger-Free Speech Interface

    Yasunari OBUCHI  Takashi SUMIYOSHI  

     
    PAPER-Robust Speech Recognition

    Vol: E93-D  No: 9  Page(s): 2440-2450

    In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.

  • Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition

    Tetsuo KOSAKA  Yuui TAKEDA  Takashi ITO  Masaharu KATO  Masaki KOHDA  

     
    PAPER-Adaptation

    Vol: E93-D  No: 9  Page(s): 2363-2369

    In this paper, we propose a new speaker-class modeling technique and an associated adaptation method for LVCSR systems, and evaluate them on the Corpus of Spontaneous Japanese (CSJ). In this method, for each evaluation speaker, the closest speakers are selected from the training speakers and the acoustic models are trained on their utterances. One of the major issues of the speaker-class model is determining the selection range of speakers. To solve this problem, several models covering different speaker ranges are prepared in advance for each evaluation speaker, and the most suitable model is selected on a likelihood basis in the recognition step. In addition, we improved the recognition performance using unsupervised speaker adaptation with the speaker-class models. In the recognition experiments, the proposed speaker adaptation based on speaker-class models yielded a significant improvement over the conventional adaptation method.

  • Intra-Cell Partial Spectrum Reuse Scheme for Cellular OFDM-Relay Networks

    Tong WU  Ying WANG  Yushan PEI  Gen LI  Ping ZHANG  

     
    LETTER-Wireless Communication Technologies

    Vol: E93-B  No: 9  Page(s): 2462-2464

    This letter proposes an intra-cell partial spectrum reuse (PSR) scheme for cellular OFDM-relay networks. The proposed method aims to increase the system throughput, while the SINR of the cell edge users can also be improved by utilizing the PSR scheme. The novel pre-allocation factor γ not only indicates the flexibility of PSR, but also decreases the complexity of the reuse mechanism. Through simulations, the proposed scheme is shown to offer superior performance in terms of system throughput and the SINR of the last 5% of users.

  • A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition

    Yoo Rhee OH  Hong Kook KIM  

     
    PAPER-Adaptation

    Vol: E93-D  No: 9  Page(s): 2379-2387

    In this paper, we propose a hybrid model adaptation approach in which pronunciation and acoustic models are adapted by incorporating the pronunciation and acoustic variabilities of non-native speech in order to improve the performance of non-native automatic speech recognition (ASR). Specifically, the proposed hybrid model adaptation can be performed at either the state-tying or triphone-modeling level, depending on the level at which acoustic model adaptation is performed. In both methods, we first analyze the pronunciation variant rules of non-native speakers and then classify each rule as either a pronunciation variant or an acoustic variant. The state-tying level hybrid method then adapts the pronunciation models by accommodating the pronunciation variants in the pronunciation dictionary, and adapts the acoustic models by clustering the states of triphone acoustic models using the acoustic variants. The triphone-modeling level hybrid method, on the other hand, initially adapts pronunciation models in the same way as the state-tying level hybrid method; for the acoustic model adaptation, however, the triphone acoustic models are re-estimated based on the adapted pronunciation models, and the states of the re-estimated triphone acoustic models are then clustered using the acoustic variants. Korean-spoken English speech recognition experiments show that ASR systems employing the state-tying and triphone-modeling level adaptation methods achieve relative reductions in average word error rate (WER) of 17.1% and 22.1%, respectively, for non-native speech when compared to a baseline ASR system.

  • Color Independent Components Based SIFT Descriptors for Object/Scene Classification

    Dan-ni AI  Xian-hua HAN  Xiang RUAN  Yen-wei CHEN  

     
    PAPER-Pattern Recognition

    Vol: E93-D  No: 9  Page(s): 2577-2586

    In this paper, we present a novel color independent components based SIFT descriptor (termed CIC-SIFT) for object/scene classification. We first learn an efficient color transformation matrix based on independent component analysis (ICA), which is adaptive to each category in a database. The ICA-based color transformation can enhance contrast between the objects and the background in an image. Then we compute CIC-SIFT descriptors over all three transformed color independent components. Since the ICA-based color transformation can boost the objects and suppress the background, the proposed CIC-SIFT can extract more effective and discriminative local features for object/scene classification. The comparison is performed among seven SIFT descriptors, and the experimental classification results show that our proposed CIC-SIFT is superior to other conventional SIFT descriptors.

  • Reliable Wireless Broadcast with Linear Network Coding for Multipoint-to-Multipoint Real-Time Communications

    Yoshihisa KONDO  Hiroyuki YOMO  Shinji YAMAGUCHI  Peter DAVIS  Ryu MIURA  Sadao OBANA  Seiichi SAMPEI  

     
    PAPER-Network

    Vol: E93-B  No: 9  Page(s): 2316-2325

    This paper proposes multipoint-to-multipoint (MPtoMP) real-time broadcast transmission using network coding for ad-hoc networks such as video game networks. We aim to achieve highly reliable MPtoMP broadcasting using IEEE 802.11 media access control (MAC) that does not include a retransmission mechanism. When each node detects packets from the other nodes in a sequence, the correctly detected packets are network-encoded, and the encoded packet is broadcast in the next sequence as a piggy-back for its native packet. To prevent an increase in the overhead of each packet due to piggy-back packet transmission, the network coding vector for each node is exchanged between all nodes in the negotiation phase. Each user keeps using the same coding vector generated in the negotiation phase, and only the coding information that represents which user signals are included in the network coding process is transmitted along with the piggy-back packet. Our simulation results show that the proposed method can provide higher reliability than other schemes using multipoint relay (MPR) or redundant transmissions such as forward error correction (FEC). We also implement the proposed method in a wireless testbed, and show that it achieves high reliability in a real-world environment with a practical degree of complexity when installed on current wireless devices.

  • MV-OPES: Multivalued-Order Preserving Encryption Scheme: A Novel Scheme for Encrypting Integer Value to Many Different Values

    Hasan KADHEM  Toshiyuki AMAGASA  Hiroyuki KITAGAWA  

     
    PAPER-Data Engineering, Web Information Systems

    Vol: E93-D  No: 9  Page(s): 2520-2533

    Encryption can provide strong security for sensitive data against inside and outside attacks. This is especially true in the "Database as Service" model, where confidentiality and privacy are important issues for the client. In fact, existing encryption approaches are vulnerable to a statistical attack because each value is encrypted to another fixed value. This paper presents a novel database encryption scheme called MV-OPES (Multivalued-Order Preserving Encryption Scheme), which allows privacy-preserving queries over encrypted databases with an improved security level. Our idea is to encrypt a value to multiple different values to prevent statistical attacks. At the same time, MV-OPES preserves the order of the integer values to allow comparison operations to be directly applied on encrypted data. Using calculated distance (range), we propose a novel method that allows a join query between relations based on inequality over encrypted values. We also present techniques to offload as much of the query execution load as possible to a database server, thereby making better use of server resources in a database outsourcing environment. Our scheme can easily be integrated with current database systems as it is designed to work with existing indexing structures. It is robust against statistical attack and the estimation of true values. MV-OPES experiments show that security for sensitive data can be achieved with reasonable overhead, establishing the practicability of the scheme.
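The idea of encrypting one integer to many values while preserving order can be illustrated with a toy sketch. This is not the MV-OPES construction, whose details the abstract does not give, and it is not secure (the bucket boundaries are trivially inferable). The names `BUCKET_WIDTH`, `toy_encrypt`, and `toy_decrypt` are hypothetical and exist only for this illustration.

```python
import random

# Hypothetical parameter: how many distinct ciphertexts each plaintext can take.
BUCKET_WIDTH = 1000


def toy_encrypt(value: int, rng: random.Random) -> int:
    """Map an integer to a random point inside its own disjoint bucket."""
    return value * BUCKET_WIDTH + rng.randrange(BUCKET_WIDTH)


def toy_decrypt(cipher: int) -> int:
    """Recover the plaintext by identifying which bucket the ciphertext is in."""
    return cipher // BUCKET_WIDTH


rng = random.Random(0)
ciphers_for_7 = {toy_encrypt(7, rng) for _ in range(50)}
ciphers_for_8 = {toy_encrypt(8, rng) for _ in range(50)}

# Every ciphertext of 7 is smaller than every ciphertext of 8,
# so a comparison such as "<" can be evaluated on ciphertexts directly.
print(max(ciphers_for_7) < min(ciphers_for_8))
```

Because the buckets are disjoint and ordered, repeated encryptions of the same value differ, yet comparisons between different values give the same result on ciphertexts as on plaintexts.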

  • A Parallel Transmission Scheme for All-to-All Broadcast in Underwater Sensor Networks

    Soonchul PARK  Jaesung LIM  

     
    PAPER-Network

    Vol: E93-B  No: 9  Page(s): 2309-2315

    This paper is concerned with the packet transmission scheduling problem for repeating all-to-all broadcasts in Underwater Sensor Networks (USN) in which there are n nodes in a transmission range. All-to-all communication is one of the densest communication patterns. It is assumed that all nodes have packets of the same size. Unlike in terrestrial scenarios, the propagation time in underwater communications is not negligible. We define an all-to-all broadcast as one in which every node transmits packets to all the other nodes in the network except itself. There are therefore n(n - 1) packets in total to be transmitted for an all-to-all broadcast. The optimal transmission schedule is one in which all packets are transmitted within the minimum time. In this paper, we propose an efficient packet transmission scheduling algorithm for underwater acoustic communications that exploits the long propagation delay.
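The n(n - 1) packet count and the scale of the propagation delay can be checked with a short sketch. It only enumerates (sender, receiver) pairs and does not implement the paper's scheduling algorithm. The function names and the nominal sound speed of 1500 m/s are assumptions made for illustration.

```python
from itertools import permutations

# Assumed nominal speed of sound in water, in metres per second.
NOMINAL_SOUND_SPEED_M_S = 1500.0


def all_to_all_deliveries(n: int) -> list:
    """List every ordered (sender, receiver) pair among n nodes, excluding self-pairs."""
    return list(permutations(range(n), 2))


def propagation_delay_s(distance_m: float, speed_m_s: float = NOMINAL_SOUND_SPEED_M_S) -> float:
    """One-way acoustic propagation delay over the given distance."""
    return distance_m / speed_m_s


n = 5
print(len(all_to_all_deliveries(n)), n * (n - 1))  # both counts agree
print(propagation_delay_s(1500.0))                 # one second for 1.5 km
```

At these speeds a link of a kilometre or so already costs on the order of a second, which is why the delay cannot be ignored as it usually is in terrestrial radio scheduling.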

  • Extended Single Parity Check Product Codes that Achieve Close-to-Capacity Performance in High Coding Rate

    Akira SHIOZAKI  Masashi KISHIMOTO  Genmon MARUOKA  

     
    LETTER-Coding Theory

    Vol: E93-A  No: 9  Page(s): 1693-1696

    This letter proposes extended single parity check product codes and presents their empirical performance on a Gaussian channel under the belief propagation (BP) decoding algorithm. The simulation results show that the codes can achieve close-to-capacity performance at high coding rates. The code of length 9603 and rate 0.96 is only 0.77 dB away from the Shannon limit for a BER of 10^-5.
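For orientation, the ordinary two-dimensional single parity check (SPC) product code can be sketched as follows. This is the textbook construction only; the abstract does not describe how the letter extends it, and BP decoding is not shown. The function names are hypothetical.

```python
def spc_product_encode(info):
    """Encode a k x k list of bits into a (k+1) x (k+1) SPC product codeword.

    Each row gets an even-parity bit, then a parity row is appended so that
    every column (including the column of row parities) also has even parity.
    """
    rows = [row + [sum(row) % 2] for row in info]
    parity_row = [sum(col) % 2 for col in zip(*rows)]
    return rows + [parity_row]


def code_rate(k: int) -> float:
    """Rate of the plain 2-D SPC product code with k x k information bits."""
    return (k / (k + 1)) ** 2


codeword = spc_product_encode([[1, 0, 1],
                               [0, 1, 1],
                               [1, 1, 1]])
for row in codeword:
    print(row)
print(code_rate(3))
```

The rate (k / (k + 1))^2 approaches 1 as k grows, which is why SPC product codes are natural candidates in the high-rate regime the letter targets.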

  • Performance of Coded CS-CDMA/CP with M-ZCZ Code over a Fast Fading Channel

    Li YUE  Chenggao HAN  Nalin S. WEERASINGHE  Takeshi HASHIMOTO  

     
    PAPER-Wireless Communication Technologies

    Vol: E93-B  No: 9  Page(s): 2381-2388

    This paper studies the performance of a coded convolutional spreading CDMA system with cyclic prefix (CS-CDMA/CP) combined with the zero correlation zone code generated from the M-sequence (M-ZCZ code) for downlink transmission over a multipath fast fading channel. In particular, we propose a new pilot-aided channel estimation scheme based on the shift property of the M-ZCZ code and show the robustness of the scheme against fast fading through comparison with the W-CDMA system employing time-multiplexed pilot signals.

  • Enhancing the Robustness of the Posterior-Based Confidence Measures Using Entropy Information for Speech Recognition

    Yanqing SUN  Yu ZHOU  Qingwei ZHAO  Pengyuan ZHANG  Fuping PAN  Yonghong YAN  

     
    PAPER-Robust Speech Recognition

    Vol: E93-D  No: 9  Page(s): 2431-2439

    In this paper, the robustness of the posterior-based confidence measures is improved by utilizing entropy information, which is calculated for speech-unit-level posteriors using only the best recognition result, without requiring a larger computational load than conventional methods. Using different normalization methods, two posterior-based entropy confidence measures are proposed. Practical details are discussed for two typical levels of hidden Markov model (HMM)-based posterior confidence measures, and both levels are compared in terms of their performances. Experiments show that the entropy information results in significant improvements in the posterior-based confidence measures. The absolute improvements of the out-of-vocabulary (OOV) rejection rate are more than 20% for both the phoneme-level confidence measures and the state-level confidence measures for our embedded test sets, without a significant decline of the in-vocabulary accuracy.
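As a rough illustration of what entropy adds to a posterior, the sketch below computes the Shannon entropy of a posterior distribution over competing speech units and one possible normalization. The abstract does not specify the paper's two normalization methods, so dividing by the log of the number of candidates is an assumption, as are the function names.

```python
import math


def posterior_entropy(posteriors):
    """Shannon entropy (in nats) of a posterior distribution; zero terms are skipped."""
    return -sum(p * math.log(p) for p in posteriors if p > 0)


def normalized_entropy(posteriors):
    """Entropy divided by its maximum, log(N); an assumed normalization."""
    n = len(posteriors)
    if n < 2:
        return 0.0
    return posterior_entropy(posteriors) / math.log(n)


confident = [0.97, 0.01, 0.01, 0.01]   # one unit dominates
confused = [0.25, 0.25, 0.25, 0.25]    # all units equally likely
print(normalized_entropy(confident))
print(normalized_entropy(confused))
```

A peaked posterior gives a value near 0 and a flat one gives a value near 1, so entropy captures how contested the best hypothesis is, which the top posterior alone does not.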

  • A Study on Wear of Brush and Carbon Flat Commutator of DC Motor for Automotive Fuel Pump

    Koichiro SAWA  Takahiro UENO  Hidenori TANAKA  

     
    PAPER

    Vol: E93-C  No: 9  Page(s): 1443-1448

    In an automotive fuel pump system, a small DC motor powered by an automotive battery is widely used to drive the pump. Recently, bio-fuel, usually a mixture of gasoline and ethanol, has come into use because of gasoline shortages and environmental concerns. The kind of fuel used strongly affects the performance of the DC motor, especially its commutation phenomena. The authors have therefore started to investigate the influence of ethanol on the commutation phenomena, and have so far reported on the wear of the brush and carbon flat commutator in gasoline and ethanol. In this paper, commutation period, arc duration, and brush and commutator wear are examined in a mixture of 50% ethanol and 50% gasoline. Brush wear is very small compared with the previous results. That is, in the present test mechanical sliding wear predominates over erosion by arc because of the short arc duration. Further, an area eroded by arc is observed to re-appear as a sliding surface. From these results a threshold arc energy between arc erosion and mechanical sliding wear is obtained, and a wear model is proposed to explain the above wear pattern on the sliding surface.

  • Strongly Secure Privacy Amplification Cannot Be Obtained by Encoder of Slepian-Wolf Code

    Shun WATANABE  Ryutaroh MATSUMOTO  Tomohiko UYEMATSU  

     
    PAPER-Information Theory

    Vol: E93-A  No: 9  Page(s): 1650-1659

    Privacy amplification is a technique to distill a secret key from a random variable by a function so that the distilled key and the eavesdropper's random variable are statistically independent. There are three kinds of security criteria for the key distilled by privacy amplification: the normalized divergence criterion, which is also known as the weak security criterion; the variational distance criterion; and the divergence criterion, which is also known as the strong security criterion. It is known that the encoder of a Slepian-Wolf code (source coding with full side-information at the decoder) can be used as a function for privacy amplification if we employ the weak security criterion. In this paper, we show that the encoder of a Slepian-Wolf code cannot be used as a function for privacy amplification if we employ criteria other than the weak one.

  • Commercial Shot Classification Based on Multiple Features Combination

    Nan LIU  Yao ZHAO  Zhenfeng ZHU  Rongrong NI  

     
    LETTER-Image Processing and Video Processing

    Vol: E93-D  No: 9  Page(s): 2651-2655

    This paper presents a commercial shot classification scheme combining well-designed visual and textual features to automatically detect TV commercials. To identify the inherent difference between commercials and general programs, a special mid-level textual descriptor is proposed, aiming to capture the spatio-temporal properties of the video texts typical of commercials. In addition, we introduce an ensemble-learning based combination method, named Co-AdaBoost, to interactively exploit the intrinsic relations between the visual and textual features employed.

  • Practical Power Allocation for Cooperative Distributed Antenna Systems

    Wei FENG  Yanmin WANG  Yunzhou LI  Shidong ZHOU  Jing WANG  

     
    LETTER-Fundamental Theories for Communications

    Vol: E93-B  No: 9  Page(s): 2424-2427

    In this letter, we address the problem of downlink power allocation for the generalized distributed antenna system (DAS) with cooperative clusters. Considering practical applications, we assume that only the large-scale channel state information is available at the transmitter. The power allocation scheme is investigated with the target of maximizing the ergodic achievable sum rate. Based on some approximations and the Rayleigh Quotient Theory, simple selective power allocation schemes are derived for the low-SNR scenario and the high-SNR scenario, respectively. The methods are applicable in practice due to their low complexity.

  • Adaptive Arbitration of Fair QoS Based Resource Allocation in Multi-Tier Computing Systems

    Naoki HAYASHI  Toshimitsu USHIO  Takafumi KANAZAWA  

     
    PAPER-Concurrent Systems

    Vol: E93-A  No: 9  Page(s): 1678-1683

    This paper proposes an adaptive resource allocation for multi-tier computing systems to guarantee a fair QoS level under resource constraints of tiers. We introduce a multi-tier computing architecture which consists of a group of resource managers and an arbiter. Resource allocation of each client is managed by a dedicated resource manager. Each resource manager updates resources allocated to subtasks of its client by locally exchanging QoS levels with other resource managers. An arbiter compensates the updated resources to avoid overload conditions in tiers. Based on the compensation by the arbiter, the subtasks of each client are executed in corresponding tiers. We derive sufficient conditions for the proposed resource allocation to achieve a fair QoS level avoiding overload conditions in all tiers with some assumptions on a QoS function and a resource consumption function of each client. We conduct a simulation to demonstrate that the proposed resource allocation can adaptively achieve a fair QoS level without causing any overload condition.

  • Cooperative Coding Using Cyclic Delay Diversity for OFDM Systems

    Dongwoo LEE  Young Seok JUNG  Jae Hong LEE  

     
    PAPER-Wireless Communication Technologies

    Vol: E93-B  No: 9  Page(s): 2354-2362

    This paper proposes cooperative coding using cyclic delay diversity (CDD) for OFDM systems. The cooperative diversity is combined with channel coding while CDD is applied to the cooperative transmission of the multiple relays to improve the beneficial effects of the cooperating relays. Analyses of frame error probability (FEP) and the average channel power of the proposed scheme are shown. Simulation results show the frame error rate (FER) of the proposed scheme. The proposed scheme provides not only a simple code design and low system complexity compared to conventional space-time processing, but better FER and diversity gain compared to direct transmission and conventional cooperative coding without CDD.

  • Phase Offsets for Binary Sequences Using Order and Index

    Young-Joon SONG  

     
    LETTER-Coding Theory

    Vol: E93-A  No: 9  Page(s): 1697-1699

    When a zero offset reference sequence is defined, the i-bit shifted sequence has phase offset i with respect to the reference sequence. In this letter, we propose a new algorithm to compute phase offsets for a periodic binary sequence using the concept of order and index of an integer based on the number theoretical approach. We define an offset evaluation function that is used to calculate the phase offset, and derive properties of the function. Once the function is computed, the phase offset of the sequence is simply obtained by taking the index of it. The new algorithm overcomes the restrictions found in conventional methods on the length and the number of '0's and '1's in binary codes. Its application to the code acquisition is also investigated to show the proposed method is useful.
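The definition of phase offset in the first sentence can be made concrete with a brute-force baseline. This is not the order-and-index algorithm proposed in the letter; it simply tries every shift. The left-shift convention and the function names are assumptions, and the 7-bit sequence is only an example input.

```python
def shift_left(bits: str, i: int) -> str:
    """Cyclically shift a bit string left by i positions."""
    i %= len(bits)
    return bits[i:] + bits[:i]


def brute_force_phase_offset(reference: str, shifted: str):
    """Return the smallest i such that shifting the reference by i gives `shifted`.

    Returns None when `shifted` is not a cyclic shift of the reference.
    """
    if len(reference) != len(shifted):
        return None
    for i in range(len(reference)):
        if shift_left(reference, i) == shifted:
            return i
    return None


reference = "1110100"               # example period-7 binary sequence
observed = shift_left(reference, 3)
print(observed, brute_force_phase_offset(reference, observed))
```

The search costs up to one full comparison per candidate shift, which is the kind of cost a closed-form offset computation is meant to avoid for long sequences.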

  • Operating Characteristics for 50 kW Utility Interactive Photovoltaic System in Chosun University, Korea

    Youn-Ok CHOI  Zheng-Guo PIAO  Geum-Bae CHO  

     
    PAPER

    Vol: E93-B  No: 9  Page(s): 2239-2243

    This study examined the performance improvement of a photovoltaic (PV) array and inverter as well as their design, construction, and post-operation and management, which will become the key elements in future PV systems. In addition, it evaluated the performance characteristics of a 50 kW grid-connected PV system in Korea. According to the evaluation, the PV array showed approximately 10% efficiency. The inverter regularly operated at > 90% efficiency at > 400 W/m² irradiation. The capture losses (Lc), system losses (Ls) and performance ratio were approximately 0.9 h/d, 0.3 h/d, and > 70%, respectively, indicating that the system was operating stably. In addition, while the Ls decreased rapidly due to the efficiency of the inverter, the performance ratio decreased markedly with increasing Lc due to the increase in temperature when the reference yield was > 5.0 h/d.
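The quantities Lc, Ls, and performance ratio are related through the reference, array, and final yields. The sketch below uses the definitions common in PV monitoring practice (Lc = Yr - Ya, Ls = Ya - Yf, PR = Yf / Yr); the abstract does not state its definitions, so this is an assumption. The yield values are hypothetical, chosen only so that the results fall in the range the abstract reports.

```python
def pv_indices(reference_yield: float, array_yield: float, final_yield: float):
    """Return (capture losses, system losses, performance ratio).

    Yields are in h/d. Uses the conventional definitions
    Lc = Yr - Ya, Ls = Ya - Yf, PR = Yf / Yr (assumed, not from the source).
    """
    capture_losses = reference_yield - array_yield
    system_losses = array_yield - final_yield
    performance_ratio = final_yield / reference_yield
    return capture_losses, system_losses, performance_ratio


# Hypothetical daily yields in h/d, not measurements from the study.
lc, ls, pr = pv_indices(reference_yield=5.0, array_yield=4.1, final_yield=3.8)
print(round(lc, 2), round(ls, 2), round(pr, 2))
```

Under these definitions the reference yield splits exactly into final yield plus the two losses, so a rise in Lc at a fixed reference yield lowers the performance ratio directly.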

  • A Comparative Investigation of Several Frequency Modulation Profiles for Programmed Switching Controllers Targeted Conducted-Noise Reduction in DC-DC Converters

    Gamal M. DOUSOKY  Masahito SHOYAMA  Tamotsu NINOMIYA  

     
    PAPER

    Vol: E93-B  No: 9  Page(s): 2265-2272

    This paper investigates the effect of several frequency modulation profiles on conducted-noise reduction in dc-dc converters with a programmed switching controller. The converter is operated in a variable frequency modulation regime. Twelve switching frequency modulation profiles have been studied. Some of the modulation data are prepared using MATLAB software, and others are generated online. Moreover, all the frequency profiles have been designed and implemented using an FPGA and experimentally investigated. The experimental results show that the conducted-noise spreading depends on both the modulation sequence profile and the statistical characteristics of the sequence. A substantial part of the manufacturing cost of power converters for telecommunication applications involves designing filters to comply with the EMI limits. Taking the results of this investigation into account can significantly reduce the filter size.
