IEICE global.ieice.org Site

Keyword Search Result

[Keyword] ATI(18690hit)

7501-7520hit(18690hit)

Intentional Voice Command Detection for Trigger-Free Speech Interface
Yasunari OBUCHI Takashi SUMIYOSHI

PAPER-Robust Speech Recognition

Vol: E93-D No:9, Page(s): 2440-2450
In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.
Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition
Tetsuo KOSAKA Yuui TAKEDA Takashi ITO Masaharu KATO Masaki KOHDA

PAPER-Adaptation

Vol: E93-D No:9, Page(s): 2363-2369
In this paper, we propose a new speaker-class modeling method and an associated adaptation method for the LVCSR system, and evaluate them on the Corpus of Spontaneous Japanese (CSJ). In this method, the training speakers closest to each evaluation speaker are selected, and the acoustic models for that evaluation speaker are trained on their utterances. One of the major issues of the speaker-class model is determining the selection range of speakers. To solve this problem, several models covering different speaker ranges are prepared for each evaluation speaker in advance, and the most suitable model is selected on a likelihood basis in the recognition step. In addition, we improved the recognition performance using unsupervised speaker adaptation with the speaker-class models. In the recognition experiments, the proposed speaker adaptation based on speaker-class models yielded a significant improvement over the conventional adaptation method.
Intra-Cell Partial Spectrum Reuse Scheme for Cellular OFDM-Relay Networks
Tong WU Ying WANG Yushan PEI Gen LI Ping ZHANG

LETTER-Wireless Communication Technologies

Vol: E93-B No:9, Page(s): 2462-2464
This letter proposes an intra-cell partial spectrum reuse (PSR) scheme for cellular OFDM-relay networks. The proposed method aims to increase the system throughput, while the SINR of the cell edge users can also be improved by utilizing the PSR scheme. The novel pre-allocation factor γ not only indicates the flexibility of PSR, but also decreases the complexity of the reuse mechanism. Through simulations, the proposed scheme is shown to offer superior performance in terms of system throughput and the SINR of the last 5% of users.
A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition
Yoo Rhee OH Hong Kook KIM

PAPER-Adaptation

Vol: E93-D No:9, Page(s): 2379-2387
In this paper, we propose a hybrid model adaptation approach in which pronunciation and acoustic models are adapted by incorporating the pronunciation and acoustic variabilities of non-native speech in order to improve the performance of non-native automatic speech recognition (ASR). Specifically, the proposed hybrid model adaptation can be performed at either the state-tying or the triphone-modeling level, depending on the level at which acoustic model adaptation is performed. In both methods, we first analyze the pronunciation variant rules of non-native speakers and then classify each rule as either a pronunciation variant or an acoustic variant. The state-tying level hybrid method then adapts the pronunciation models by accommodating the pronunciation variants in the pronunciation dictionary, and adapts the acoustic models by clustering the states of triphone acoustic models using the acoustic variants. The triphone-modeling level hybrid method, on the other hand, initially adapts pronunciation models in the same way as the state-tying level hybrid method; for the acoustic model adaptation, however, the triphone acoustic models are re-estimated based on the adapted pronunciation models, and the states of the re-estimated triphone acoustic models are then clustered using the acoustic variants. Korean-spoken English speech recognition experiments show that ASR systems employing the state-tying and triphone-modeling level adaptation methods achieve relative reductions in average word error rate (WER) of 17.1% and 22.1%, respectively, for non-native speech when compared to a baseline ASR system.
Color Independent Components Based SIFT Descriptors for Object/Scene Classification
Dan-ni AI Xian-hua HAN Xiang RUAN Yen-wei CHEN

PAPER-Pattern Recognition

Vol: E93-D No:9, Page(s): 2577-2586
In this paper, we present a novel color independent components based SIFT descriptor (termed CIC-SIFT) for object/scene classification. We first learn an efficient color transformation matrix based on independent component analysis (ICA), which is adaptive to each category in a database. The ICA-based color transformation can enhance contrast between the objects and the background in an image. Then we compute CIC-SIFT descriptors over all three transformed color independent components. Since the ICA-based color transformation can boost the objects and suppress the background, the proposed CIC-SIFT can extract more effective and discriminative local features for object/scene classification. The comparison is performed among seven SIFT descriptors, and the experimental classification results show that our proposed CIC-SIFT is superior to other conventional SIFT descriptors.
Reliable Wireless Broadcast with Linear Network Coding for Multipoint-to-Multipoint Real-Time Communications
Yoshihisa KONDO Hiroyuki YOMO Shinji YAMAGUCHI Peter DAVIS Ryu MIURA Sadao OBANA Seiichi SAMPEI

PAPER-Network

Vol: E93-B No:9, Page(s): 2316-2325
This paper proposes multipoint-to-multipoint (MPtoMP) real-time broadcast transmission using network coding for ad-hoc networks such as video game networks. We aim to achieve highly reliable MPtoMP broadcasting using IEEE 802.11 media access control (MAC), which does not include a retransmission mechanism. When each node detects packets from the other nodes in a sequence, the correctly detected packets are network-encoded, and the encoded packet is broadcast in the next sequence as a piggy-back for its native packet. To prevent an increase in the overhead of each packet due to piggy-back packet transmission, the network coding vector for each node is exchanged between all nodes in the negotiation phase. Each user keeps using the same coding vector generated in the negotiation phase, and only the coding information that represents which user signals are included in the network coding process is transmitted along with the piggy-back packet. Our simulation results show that the proposed method can provide higher reliability than other schemes using multi point relay (MPR) or redundant transmissions such as forward error correction (FEC). We also implement the proposed method in a wireless testbed, and show that it achieves high reliability in a real-world environment with a practical degree of complexity when installed on current wireless devices.
MV-OPES: Multivalued-Order Preserving Encryption Scheme: A Novel Scheme for Encrypting Integer Value to Many Different Values
Hasan KADHEM Toshiyuki AMAGASA Hiroyuki KITAGAWA

PAPER-Data Engineering, Web Information Systems

Vol: E93-D No:9, Page(s): 2520-2533
Encryption can provide strong security for sensitive data against inside and outside attacks. This is especially true in the "Database as Service" model, where confidentiality and privacy are important issues for the client. However, existing encryption approaches are vulnerable to a statistical attack because each value is encrypted to another fixed value. This paper presents a novel database encryption scheme called MV-OPES (Multivalued-Order Preserving Encryption Scheme), which allows privacy-preserving queries over encrypted databases with an improved security level. Our idea is to encrypt a value to multiple different values to prevent statistical attacks. At the same time, MV-OPES preserves the order of the integer values to allow comparison operations to be directly applied on encrypted data. Using calculated distance (range), we propose a novel method that allows a join query between relations based on inequality over encrypted values. We also present techniques to offload query execution load to a database server as much as possible, thereby making better use of server resources in a database outsourcing environment. Our scheme can easily be integrated with current database systems as it is designed to work with existing indexing structures. It is robust against statistical attack and the estimation of true values. MV-OPES experiments show that security for sensitive data can be achieved with reasonable overhead, establishing the practicability of the scheme.
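The two properties named in the abstract above (one plaintext maps to many ciphertexts, and integer order is preserved) can be illustrated with a toy mapping. This is not the MV-OPES construction, which the abstract does not specify. The fixed bucket width, the function names, and the restriction to non-negative integers are all assumptions made for illustration, and a keyless fixed-width mapping like this offers no real security.

```python
import secrets

# Hypothetical parameter: how many distinct ciphertexts each plaintext can take.
BUCKET_WIDTH = 1000


def encrypt(value: int) -> int:
    """Map a non-negative integer to a random point inside its own bucket.

    Buckets for different plaintexts never overlap, so value order is kept,
    while repeated encryptions of the same value usually differ.
    """
    if value < 0:
        raise ValueError("this toy mapping only handles non-negative integers")
    return value * BUCKET_WIDTH + secrets.randbelow(BUCKET_WIDTH)


def decrypt(ciphertext: int) -> int:
    """Recover the plaintext by identifying the bucket the ciphertext lies in."""
    return ciphertext // BUCKET_WIDTH


# Comparisons can be evaluated directly on the encrypted values.
low, high = encrypt(41), encrypt(42)
print(low < high)
```

Because every ciphertext of 41 is smaller than every ciphertext of 42, a range predicate such as `salary < 42` could be answered on the encrypted column without decryption, which is the kind of server-side comparison the abstract describes.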
A Parallel Transmission Scheme for All-to-All Broadcast in Underwater Sensor Networks
Soonchul PARK Jaesung LIM

PAPER-Network

Vol: E93-B No:9, Page(s): 2309-2315
This paper is concerned with the packet transmission scheduling problem for repeating all-to-all broadcasts in Underwater Sensor Networks (USN) in which there are n nodes in a transmission range. All-to-all communication is one of the most dense communication patterns. It is assumed that each node has the same size packet. Unlike the terrestrial scenarios, the propagation time in underwater communications is not negligible. We define all-to-all broadcast as the one where every node transmits packets to all the other nodes in the network except itself. So, there are in total n(n - 1) packets to be transmitted for an all-to-all broadcast. The optimal transmission scheduling is to schedule in a way that all packets can be transmitted within the minimum time. In this paper, we propose an efficient packet transmission scheduling algorithm for underwater acoustic communications using the property of long propagation delay.
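The n(n - 1) count in the abstract above follows from pairing every node with every other node. A minimal sketch that enumerates those sender/receiver pairs is given here; the function name and the example value of n are assumptions for illustration, and the sketch does not model propagation delay or the proposed scheduling algorithm.

```python
from itertools import permutations


def all_to_all_deliveries(n: int) -> list[tuple[int, int]]:
    """List every (sender, receiver) pair with sender != receiver."""
    return list(permutations(range(n), 2))


n = 5  # illustrative number of nodes in the transmission range
pairs = all_to_all_deliveries(n)
# The enumeration and the closed form n(n - 1) give the same count.
print(len(pairs), n * (n - 1))
```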
Extended Single Parity Check Product Codes that Achieve Close-to-Capacity Performance in High Coding Rate
Akira SHIOZAKI Masashi KISHIMOTO Genmon MARUOKA

LETTER-Coding Theory

Vol: E93-A No:9, Page(s): 1693-1696
This letter proposes extended single parity check product codes and presents their empirical performance on a Gaussian channel under the belief propagation (BP) decoding algorithm. The simulation results show that the codes can achieve close-to-capacity performance at high coding rates. The code of length 9603 and rate 0.96 is only 0.77 dB away from the Shannon limit for a BER of 10^-5.
Performance of Coded CS-CDMA/CP with M-ZCZ Code over a Fast Fading Channel
Li YUE Chenggao HAN Nalin S. WEERASINGHE Takeshi HASHIMOTO

PAPER-Wireless Communication Technologies

Vol: E93-B No:9, Page(s): 2381-2388
This paper studies the performance of a coded convolutional spreading CDMA system with cyclic prefix (CS-CDMA/CP) combined with the zero correlation zone code generated from the M-sequence (M-ZCZ code) for downlink transmission over a multipath fast fading channel. In particular, we propose a new pilot-aided channel estimation scheme based on the shift property of the M-ZCZ code and show the robustness of the scheme against fast fading through comparison with the W-CDMA system employing time-multiplexed pilot signals.
Enhancing the Robustness of the Posterior-Based Confidence Measures Using Entropy Information for Speech Recognition
Yanqing SUN Yu ZHOU Qingwei ZHAO Pengyuan ZHANG Fuping PAN Yonghong YAN

PAPER-Robust Speech Recognition

Vol: E93-D No:9, Page(s): 2431-2439
In this paper, the robustness of the posterior-based confidence measures is improved by utilizing entropy information, which is calculated for speech-unit-level posteriors using only the best recognition result, without requiring a larger computational load than conventional methods. Using different normalization methods, two posterior-based entropy confidence measures are proposed. Practical details are discussed for two typical levels of hidden Markov model (HMM)-based posterior confidence measures, and both levels are compared in terms of their performances. Experiments show that the entropy information results in significant improvements in the posterior-based confidence measures. The absolute improvements of the out-of-vocabulary (OOV) rejection rate are more than 20% for both the phoneme-level confidence measures and the state-level confidence measures for our embedded test sets, without a significant decline of the in-vocabulary accuracy.
A Study on Wear of Brush and Carbon Flat Commutator of DC Motor for Automotive Fuel Pump
Koichiro SAWA Takahiro UENO Hidenori TANAKA

PAPER

Vol: E93-C No:9, Page(s): 1443-1448
In an automotive fuel pump system, a small DC motor driven by an automotive battery is widely used to drive the pump. Recently, bio-fuel, usually a mixture of gasoline and ethanol, has come into use because of gasoline shortages and environmental concerns. The kind of fuel used strongly affects the performance of the DC motor, especially the commutation phenomena. Therefore, the authors have started to investigate the influence of ethanol on the commutation phenomena, and have so far reported on the wear of the brush and carbon flat commutator in gasoline and ethanol. In this paper, commutation period, arc duration, and brush and commutator wear are examined in a mixture of 50% ethanol and 50% gasoline. Brush wear is very small compared with the previous results. Namely, in the present test, mechanical sliding wear predominates over arc erosion because of the short arc duration. Further, an area eroded by arc is observed to re-appear as a sliding surface. From these results, a threshold arc energy between arc erosion and mechanical sliding wear is obtained, and a wear model is proposed to explain the above wear pattern on the sliding surface.
Strongly Secure Privacy Amplification Cannot Be Obtained by Encoder of Slepian-Wolf Code
Shun WATANABE Ryutaroh MATSUMOTO Tomohiko UYEMATSU

PAPER-Information Theory

Vol: E93-A No:9, Page(s): 1650-1659
Privacy amplification is a technique to distill a secret key from a random variable by a function so that the distilled key and eavesdropper's random variable are statistically independent. There are three kinds of security criteria for the key distilled by privacy amplification: the normalized divergence criterion, which is also known as the weak security criterion, the variational distance criterion, and the divergence criterion, which is also known as the strong security criterion. As a technique to distill a secret key, it is known that the encoder of a Slepian-Wolf (the source coding with full side-information at the decoder) code can be used as a function for privacy amplification if we employ the weak security criterion. In this paper, we show that the encoder of a Slepian-Wolf code cannot be used as a function for privacy amplification if we employ the criteria other than the weak one.
Commercial Shot Classification Based on Multiple Features Combination
Nan LIU Yao ZHAO Zhenfeng ZHU Rongrong NI

LETTER-Image Processing and Video Processing

Vol: E93-D No:9, Page(s): 2651-2655
This paper presents a commercial shot classification scheme combining well-designed visual and textual features to automatically detect TV commercials. To identify the inherent difference between commercials and general programs, a special mid-level textual descriptor is proposed, aiming to capture the spatio-temporal properties of the video texts typical of commercials. In addition, we introduce an ensemble-learning based combination method, named Co-AdaBoost, to interactively exploit the intrinsic relations between the visual and textual features employed.
Practical Power Allocation for Cooperative Distributed Antenna Systems
Wei FENG Yanmin WANG Yunzhou LI Shidong ZHOU Jing WANG

LETTER-Fundamental Theories for Communications

Vol: E93-B No:9, Page(s): 2424-2427
In this letter, we address the problem of downlink power allocation for the generalized distributed antenna system (DAS) with cooperative clusters. Considering practical applications, we assume that only the large-scale channel state information is available at the transmitter. The power allocation scheme is investigated with the target of ergodic achievable sum rate maximization. Based on some approximations and the Rayleigh Quotient Theory, the simple selective power allocation scheme is derived for the low SNR scenario and the high SNR scenario, respectively. The methods are applicable in practice due to their low complexity.
Adaptive Arbitration of Fair QoS Based Resource Allocation in Multi-Tier Computing Systems
Naoki HAYASHI Toshimitsu USHIO Takafumi KANAZAWA

PAPER-Concurrent Systems

Vol: E93-A No:9, Page(s): 1678-1683
This paper proposes an adaptive resource allocation for multi-tier computing systems to guarantee a fair QoS level under resource constraints of tiers. We introduce a multi-tier computing architecture which consists of a group of resource managers and an arbiter. Resource allocation of each client is managed by a dedicated resource manager. Each resource manager updates resources allocated to subtasks of its client by locally exchanging QoS levels with other resource managers. An arbiter compensates the updated resources to avoid overload conditions in tiers. Based on the compensation by the arbiter, the subtasks of each client are executed in corresponding tiers. We derive sufficient conditions for the proposed resource allocation to achieve a fair QoS level avoiding overload conditions in all tiers with some assumptions on a QoS function and a resource consumption function of each client. We conduct a simulation to demonstrate that the proposed resource allocation can adaptively achieve a fair QoS level without causing any overload condition.
Cooperative Coding Using Cyclic Delay Diversity for OFDM Systems
Dongwoo LEE Young Seok JUNG Jae Hong LEE

PAPER-Wireless Communication Technologies

Vol: E93-B No:9, Page(s): 2354-2362
This paper proposes cooperative coding using cyclic delay diversity (CDD) for OFDM systems. The cooperative diversity is combined with channel coding while CDD is applied to the cooperative transmission of the multiple relays to improve the beneficial effects of the cooperating relays. Analyses of frame error probability (FEP) and the average channel power of the proposed scheme are shown. Simulation results show the frame error rate (FER) of the proposed scheme. The proposed scheme provides not only a simple code design and low system complexity compared to conventional space-time processing, but better FER and diversity gain compared to direct transmission and conventional cooperative coding without CDD.
Phase Offsets for Binary Sequences Using Order and Index
Young-Joon SONG

LETTER-Coding Theory

Vol: E93-A No:9, Page(s): 1697-1699
When a zero offset reference sequence is defined, the i-bit shifted sequence has phase offset i with respect to the reference sequence. In this letter, we propose a new algorithm to compute phase offsets for a periodic binary sequence using the concept of order and index of an integer based on the number theoretical approach. We define an offset evaluation function that is used to calculate the phase offset, and derive properties of the function. Once the function is computed, the phase offset of the sequence is simply obtained by taking the index of it. The new algorithm overcomes the restrictions found in conventional methods on the length and the number of '0's and '1's in binary codes. Its application to the code acquisition is also investigated to show the proposed method is useful.
Operating Characteristics for 50 kW Utility Interactive Photovoltaic System in Chosun University, Korea
Youn-Ok CHOI Zheng-Guo PIAO Geum-Bae CHO

PAPER

Vol: E93-B No:9, Page(s): 2239-2243
This study examined the performance improvement of a photovoltaic (PV) array and inverter as well as their design, construction, and post-operation and management, which will become the key elements in future PV systems. In addition, it evaluated the performance characteristics of a 50 kW grid-connected PV system in Korea. According to the result of the evaluation, the PV array showed approximately 10% efficiency. The inverter was shown to operate regularly at > 90% efficiency at > 400 W/m² irradiation. The capture losses (Lc), system losses (Ls) and performance ratio were approximately 0.9 h/d, 0.3 h/d, and > 70%, respectively, indicating that the system was operating stably. In addition, while the Ls decreased rapidly due to the efficiency of the inverter, the performance ratio decreased markedly with increasing Lc due to the increase in temperature when the reference yield was > 5.0 h/d.
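The loss and yield figures in the abstract above can be related through the conventional PV yield definitions, assumed here because the abstract does not state them: capture losses Lc = Yr - Ya, system losses Ls = Ya - Yf, and performance ratio PR = Yf / Yr, where Yr, Ya and Yf are the reference, array and final yields in h/d. The function name and the reference yield of 5.0 h/d used below are illustrative choices, not measurements from the study.

```python
def pv_yield_summary(reference_yield: float,
                     capture_losses: float,
                     system_losses: float) -> dict[str, float]:
    """Derive array yield, final yield and performance ratio (all yields in h/d).

    Assumes the conventional definitions Lc = Yr - Ya, Ls = Ya - Yf, PR = Yf / Yr.
    """
    array_yield = reference_yield - capture_losses
    final_yield = array_yield - system_losses
    return {
        "array_yield": array_yield,
        "final_yield": final_yield,
        "performance_ratio": final_yield / reference_yield,
    }


# Lc and Ls are the approximate values quoted in the abstract;
# the reference yield of 5.0 h/d is an assumed input for illustration.
summary = pv_yield_summary(reference_yield=5.0, capture_losses=0.9, system_losses=0.3)
print(f"{summary['performance_ratio']:.0%}")
```

Under these assumptions the performance ratio lands above 70%, which is consistent with the range the abstract reports.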
A Comparative Investigation of Several Frequency Modulation Profiles for Programmed Switching Controllers Targeted Conducted-Noise Reduction in DC-DC Converters
Gamal M. DOUSOKY Masahito SHOYAMA Tamotsu NINOMIYA

PAPER

Vol: E93-B No:9, Page(s): 2265-2272
This paper investigates the effect of several frequency modulation profiles on conducted-noise reduction in dc-dc converters with a programmed switching controller. The converter is operated in a variable frequency modulation regime. Twelve switching frequency modulation profiles have been studied. Some of the modulation data are prepared using MATLAB software, and others are generated online. Moreover, all the frequency profiles have been designed and implemented using an FPGA and experimentally investigated. The experimental results show that the conducted-noise spreading depends on both the modulation sequence profile and the statistical characteristics of the sequence. A substantial part of the manufacturing cost of power converters for telecommunication applications involves designing filters to comply with the EMI limits. Taking the findings of this investigation into account can significantly reduce the filter size.
