The search functionality is under construction.

Keyword Search Result

[Keyword] SPE(2504hit)

1381-1400hit(2504hit)

  • A Speech Packet Loss Concealment Method Using Linear Prediction

    Kazuhiro KONDO  Kiyoshi NAKAGAWA  

     
    PAPER-Speech and Hearing

      Vol:
    E89-D No:2
      Page(s):
    806-813

    We proposed and evaluated a speech packet loss concealment method which predicts lost segments from speech included in packets either before, or both before and after the lost packet. The lost segments are predicted recursively by using linear prediction both in the forward direction from the packet preceding the loss, and in the backward direction from the packet succeeding the lost segment. Predicted samples in each direction are smoothed by averaging using linear weights to obtain the final interpolated signal. The adjacent segments are also smoothed extensively to significantly reduce the speech quality discontinuity between the interpolated signal and the received speech signal. Subjective quality comparisons between the proposed method and the packet loss concealment algorithm described in the ITU standard G.711 Appendix I showed similar scores up to about 10% packet loss. However, the proposed method showed higher scores above this loss rate, with Mean Opinion Score ratings exceeding 2.4, even at an extremely high packet loss rate of 30%. Packet loss concealment of speech degraded with G.729 coding, and babble noise mixed speech showed similar trends, with the proposed method showing higher quality at high loss rates. We plan to further improve the performance by using adaptive LPC prediction order depending on the estimated pitch, and adaptive LPC bandwidth expansion depending on the number of consecutive repeated predictions, among many other improvements. We also plan to investigate complexity reduction using gradient LPC coefficient updates, and processing delay reduction using adaptive forward/bidirectional prediction modes depending on the measured packet loss ratio.
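
    The bidirectional idea in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the autocorrelation/Levinson-Durbin estimation of the predictor, the default order, and the exact weighting are all assumptions, and the smoothing of adjacent segments is omitted.

```python
def autocorr(x, order):
    # Autocorrelation lags r[0..order] of the sample list x.
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]

def levinson(r, order):
    # Levinson-Durbin recursion; returns a[1..order] with x[n] ~ sum_k a[k] * x[n-k].
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:          # silent or perfectly predicted signal: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def extrapolate(history, coeffs, count):
    # Recursively predict `count` samples, feeding predictions back in.
    buf = list(history)
    for _ in range(count):
        buf.append(sum(c * buf[-1 - k] for k, c in enumerate(coeffs)))
    return buf[len(history):]

def conceal(before, after, gap, order=8):
    # Forward prediction from the packet preceding the loss.
    fwd = extrapolate(before, levinson(autocorr(before, order), order), gap)
    # Backward prediction: run the same predictor on the time-reversed next packet.
    rev = after[::-1]
    bwd = extrapolate(rev, levinson(autocorr(rev, order), order), gap)[::-1]
    out = []
    for i in range(gap):
        w = (i + 1) / (gap + 1)   # linear weight: backward estimate dominates near the end
        out.append((1.0 - w) * fwd[i] + w * bwd[i])
    return out
```

    Here `before` and `after` are lists of samples from the received packets on either side of the loss, each at least `order` samples long, and `gap` is the number of lost samples.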

  • High-Speed Human Motion Recognition Based on a Motion History Image and an Eigenspace

    Takehito OGATA  Joo Kooi TAN  Seiji ISHIKAWA  

     
    PAPER-Pattern Recognition

      Vol:
    E89-D No:1
      Page(s):
    281-289

    This paper proposes an efficient technique for human motion recognition based on motion history images and an eigenspace technique. In recent years, human motion recognition has become one of the most popular research fields. It is expected to be applied in a security system, man-machine communication, and so on. In the proposed technique, we use two feature images and the eigenspace technique to realize high-speed recognition. An experiment was performed on recognizing six human motions and the results showed satisfactory performance of the technique.
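
    For readers unfamiliar with motion history images, the standard update rule can be sketched as below. The abstract does not specify the paper's two feature images or its eigenspace step, so this shows only the generic MHI update; `tau` and `threshold` are illustrative values.

```python
def update_mhi(mhi, prev_frame, frame, tau=10, threshold=15):
    # Pixels that changed between frames are stamped with tau;
    # all other pixels decay by one step toward zero.
    return [
        [tau if abs(cur - prev) > threshold else max(0, hist - 1)
         for hist, prev, cur in zip(mhi_row, prev_row, cur_row)]
        for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame)
    ]
```

    Frames and the history image are plain nested lists of grey levels here; recent motion appears bright and older motion fades.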

  • Impact of the Line-of-Sight Propagation Component on the Orthogonality Factor of the Synchronous DS-CDMA Uplink

    Seung-Hoon HWANG  Lajos HANZO  

     
    LETTER-Terrestrial Radio Communications

      Vol:
    E89-B No:1
      Page(s):
    247-249

    This paper investigates a modified orthogonality factor for the synchronous DS-CDMA uplink in dispersive Rician multipath fading channels, which reflects the effects of specular path power as well as decaying channel characteristics. Using this investigation, the orthogonality factors in indoor environments are evaluated and compared for various parameters such as the decaying factor, the line-of-sight component, and the number of multipaths.

  • Remote Monitoring Scheme for Output Video of Standards Convertors

    Ryoichi KAWADA  Osamu SUGIMOTO  Atsushi KOIKE  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E89-B No:1
      Page(s):
    254-258

    As digital television transmission is becoming ubiquitous, a method that can remotely monitor the quality of the final and intermediate pictures is urgently needed. In particular, the case where standards conversion is included in the transmission chain is a serious issue as the input and output cannot simply be compared. This letter proposes a novel method to solve this issue. The combination of skipping fields/pixels and the previously proposed SSSWHT-RR method, using the information of correlation coefficients and variance of the picture, achieves accurate detection of picture failure.

  • Non-Audible Murmur (NAM) Recognition

    Yoshitaka NAKAJIMA  Hideki KASHIOKA  Nick CAMPBELL  Kiyohiro SHIKANO  

     
    PAPER

      Vol:
    E89-D No:1
      Page(s):
    1-8

    We propose a new practical input interface for the recognition of Non-Audible Murmur (NAM), which is defined as articulated respiratory sound without vocal-fold vibration transmitted through the soft tissues of the head. We developed a microphone attachment, which adheres to the skin, by applying the principle of a medical stethoscope, found the ideal position for sampling flesh-conducted NAM sound vibration and retrained an acoustic model with NAM samples. Then using the Julius Japanese Dictation Toolkit, we tested the feasibility of using this method in place of an external microphone for analyzing air-conducted voice sound.

  • C/V Segmentation on Mandarin Spontaneous Spoken Speech Signals Using SNR Improvement and Energy Variation

    Ching-Ta LU  Hsiao-Chuan WANG  

     
    LETTER-Speech and Hearing

      Vol:
    E89-D No:1
      Page(s):
    363-366

    An efficient and simple approach to consonant/vowel (C/V) segmentation by incorporating the SNR improvement of a speech enhancement system with the energy variation of two adjacent frames is proposed. Experimental results show that the proposed scheme performs well in segmenting C/V for a spontaneously spoken utterance.

  • Recursive Computation of Wiener-Khintchine Theorem and Bispectrum

    Khalid Mahmood AAMIR  Mohammad Ali MAUD  Arif ZAMAN  Asim LOAN  

     
    LETTER-Digital Signal Processing

      Vol:
    E89-A No:1
      Page(s):
    321-323

    Power Spectral Density (PSD) computed by taking the Fourier transform of the auto-correlation function (Wiener-Khintchine theorem) gives better results for noisy data than the periodogram approach when the signal is Gaussian. However, the computational complexity of the Wiener-Khintchine approach is higher than that of the periodogram approach. For the computation of the short-time Fourier transform (STFT), this problem becomes even more prominent, since the PSD must be recomputed after every shift of the analysis window. This paper presents a recursive form of the PSD to reduce the complexity. If the signal is not Gaussian, the PSD approach is insufficient and we estimate the higher-order spectra of the signal. Estimation of higher-order spectra is even more time-consuming. In this paper, a recursive version for the computation of the bispectrum is presented as well. The computational complexities of the PSD and the bispectrum for a window size of N are O(N) and O(N^2), respectively.
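
    One way to see why a recursive form helps: when the window slides by one sample, each autocorrelation lag changes by only two products. The sketch below illustrates that idea under assumed names (`autocorr`, `slide`, `psd`); it is not the recursion derived in the paper, whose exact form the abstract does not give.

```python
import math

def autocorr(window, max_lag):
    # Direct (unnormalised) autocorrelation lags r[0..max_lag].
    n = len(window)
    return [sum(window[i] * window[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def slide(r, window, new_sample):
    # Drop window[0], append new_sample, and fix up every lag in O(1).
    n = len(window)
    shifted = window[1:] + [new_sample]
    updated = [r[k] - window[0] * window[k] + shifted[n - 1 - k] * new_sample
               for k in range(len(r))]
    return updated, shifted

def psd(r, omega):
    # Wiener-Khintchine: PSD as the cosine transform of the symmetric autocorrelation.
    return r[0] + 2.0 * sum(r[k] * math.cos(omega * k) for k in range(1, len(r)))
```

    `slide` assumes `max_lag` is at most `len(window) - 1`; the updated lags can then be passed to `psd` for each frequency of interest.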

  • Autonomous Decentralized High-Speed Processing Technology and the Application in an Integrated IC Card Fixed-Line and Wireless System

    Akio SHIIBASHI  

     
    PAPER

      Vol:
    E88-D No:12
      Page(s):
    2699-2707

    Improving the processing speed of the automatic fare collection gate (AFC gate) is one of the important problems in handling passengers getting on and off in high-density transportation at peak times. On the other hand, reliability is indispensable because the ticket has monetary value. Therefore, a ticket system that offers both high-speed processing and high reliability is necessary. For passenger convenience and maintenance cost reduction, a wireless IC card ticket system is desired. However, high-speed processing and high reliability conflict in such a system because of the wireless communication between an IC card and an AFC gate; the faster the AFC gate processes the ticket, the poorer the reliability becomes. This paper proposes an autonomous decentralized processing technology that meets the requirements of both high-speed processing and high reliability in a wireless IC ticket system. The "IC card", the "AFC gate" and the "central server" are each assumed to be an autonomous system. A "decentralized algorithm of the fare calculation by IC card and the AFC gate" is proposed to achieve high-speed processing. Moreover, an "autonomous decentralized consistency technology" in each subsystem is presented for high reliability. In addition, to make these effective, a "wireless communication area enhancing technology (touch & go method)" and a "command system for high-speed data processing" are presented. These technologies have been introduced into the Suica system of East Japan Railway, and their effectiveness has been proven.

  • Robust Speech Recognition Using Discrete-Mixture HMMs

    Tetsuo KOSAKA  Masaharu KATOH  Masaki KOHDA  

     
    PAPER-Speech and Hearing

      Vol:
    E88-D No:12
      Page(s):
    2811-2818

    This paper introduces new methods of robust speech recognition using discrete-mixture HMMs (DMHMMs). The aim of this work is to develop robust speech recognition for adverse conditions that contain both stationary and non-stationary noise. In particular, we focus on the issue of impulsive noise, which is a major problem in practical speech recognition systems. In this paper, two strategies are utilized to solve the problem. In the first strategy, adverse conditions are represented by an acoustic model. In this case, a large amount of training data and accurate acoustic models are required to represent a variety of acoustic environments. This strategy is suitable for recognition in stationary or slowly varying noise conditions. The second is based on the idea that corrupted frames are processed by a compensation method to reduce their adverse effect. Since impulsive noise has a wide variety of features and is difficult to model, the second strategy is employed for it. To realize these strategies, we propose two methods, both based on the DMHMM framework, which is one type of discrete HMM (DHMM). First, an estimation method for DMHMM parameters based on MAP is proposed, aiming to improve trainability. The second is a method of compensating the observation probabilities of DMHMMs with a threshold to reduce the adverse effect of outlier values. Observation probabilities of impulsive noise tend to be much smaller than those of normal speech. The motivation for this approach is that flooring the observation probability reduces the adverse effect caused by impulsive noise. Experimental evaluations on Japanese LVCSR for read newspaper speech showed that the proposed method achieved an average error rate reduction of 48.5% in impulsive noise conditions. The experimental results in adverse conditions that contain both stationary and impulsive noise also showed that the proposed method achieved an average error rate reduction of 28.1%.
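
    The flooring idea mentioned in this abstract can be illustrated with a toy score computation. The function name, the floor value, and the per-frame probabilities below are made up for illustration; the paper's actual threshold and DMHMM scoring are not given in the abstract.

```python
import math

def floored_log_likelihood(frame_probs, floor=1e-4):
    # Sum of per-frame log observation probabilities, with each
    # probability clamped from below so one outlier frame cannot dominate.
    return sum(math.log(max(p, floor)) for p in frame_probs)

# Toy example: the third frame is hit by an impulse.
correct = [0.2, 0.2, 1e-9, 0.2]      # hypothetical scores of the right model
wrong = [0.05, 0.05, 1e-5, 0.05]     # hypothetical scores of a competing model
```

    Without a floor (`floor=0.0`) the single corrupted frame makes the competing model score higher; with the floor, the remaining frames decide and the right model wins.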

  • Primitive Inductive Theorems Bridge Implicit Induction Methods and Inductive Theorems in Higher-Order Rewriting

    Keiichirou KUSAKARI  Masahiko SAKAI  Toshiki SAKABE  

     
    PAPER-Computation and Computational Models

      Vol:
    E88-D No:12
      Page(s):
    2715-2726

    Automated reasoning of inductive theorems is considered important in program verification. To verify inductive theorems automatically, several implicit induction methods like the inductionless induction and the rewriting induction methods have been proposed. In studying inductive theorems on higher-order rewritings, we found that the class of the theorems shown by known implicit induction methods does not coincide with that of inductive theorems, and the gap between them is a barrier in developing mechanized methods for disproving inductive theorems. This paper fills this gap by introducing the notion of primitive inductive theorems, and clarifying the relation between inductive theorems and primitive inductive theorems. Based on this relation, we achieve mechanized methods for proving and disproving inductive theorems.

  • The Future of High-Speed Train

    Takashi ENDO  

     
    INVITED PAPER

      Vol:
    E88-D No:12
      Page(s):
    2625-2629

    High-speed intercity railways have grown into a profitable business, achieving a renaissance in rail transport. High-speed railways need constant updating to new systems if they are to be winners in this age of competing transportation modes. In view of that situation, JR East started an R&D project to achieve even higher speeds of more than 300 km/h. A test train that can run at an operational speed of 360 km/h is under development, and JR East plans to commence high-speed tests in the summer of 2005.

  • Exact Minimization of FPRMs for Incompletely Specified Functions by Using MTBDDs

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER-Logic Synthesis

      Vol:
    E88-A No:12
      Page(s):
    3332-3341

    Fixed polarity Reed-Muller expressions (FPRMs) exhibit several useful properties that make them suitable for many practical applications. This paper presents an exact minimization algorithm for FPRMs for incompletely specified functions. For an n-variable function with α unspecified minterms there are 2^(n+α) distinct FPRMs, and a minimum FPRM is one with the fewest product terms. To find a minimum FPRM, the algorithm must determine an assignment of the unspecified minterms. This is accomplished by using the concept of integer-valued functions in conjunction with an extended truth vector and a weight vector. The vectors help formulate the problem as an assignment of the variables of integer-valued functions, which are then efficiently manipulated by using multi-terminal binary decision diagrams for finding an assignment of the unspecified minterms. The effectiveness of the algorithm is demonstrated through experimental results for code converters, adders, and randomly generated functions.
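
    To make the size of the search space concrete, the following brute-force sketch enumerates every polarity and every assignment of the unspecified minterms and counts product terms. It is only a reference baseline for tiny functions, not the MTBDD-based algorithm of the paper; the function names and the use of `None` for an unspecified minterm are assumptions.

```python
from itertools import product

def fprm_terms(truth, polarity):
    # truth: list of 0/1 of length 2**n (bit i of the index is variable i).
    # polarity[i] == 0: variable i is used uncomplemented; 1: complemented.
    coeffs = list(truth)
    for i in range(len(polarity)):
        step = 1 << i
        for idx in range(len(coeffs)):
            if idx & step == 0:
                f0, f1 = coeffs[idx], coeffs[idx | step]
                # Positive Davio: f = f0 ^ x(f0 ^ f1); negative: f = f1 ^ x'(f0 ^ f1).
                coeffs[idx] = f1 if polarity[i] else f0
                coeffs[idx | step] = f0 ^ f1
    return sum(coeffs)            # number of product terms

def minimum_fprm(spec):
    # spec: list of 0, 1 or None (unspecified); tries all 2**(n + alpha) cases.
    n = (len(spec) - 1).bit_length()
    free = [i for i, v in enumerate(spec) if v is None]
    best = None
    for fill in product((0, 1), repeat=len(free)):
        truth = list(spec)
        for i, v in zip(free, fill):
            truth[i] = v
        for polarity in product((0, 1), repeat=n):
            terms = fprm_terms(truth, polarity)
            if best is None or terms < best[0]:
                best = (terms, polarity, tuple(truth))
    return best
```

    For two-input OR, `minimum_fprm([0, 1, 1, 1])` needs two products, while leaving minterm 0 unspecified lets the search pick the constant-1 function with a single product.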

  • A Binary Tree Based Methodology for Designing an Application Specific Network-on-Chip (ASNOC)

    Yuan-Long JEANG  Jer-Min JOU  Win-Hsien HUANG  

     
    PAPER-VLSI Architecture

      Vol:
    E88-A No:12
      Page(s):
    3531-3538

    In this paper, a methodology based on a mix-mode interconnection architecture is proposed for constructing an application specific network on chip to minimize the total communication time. The proposed architecture uses a globally asynchronous communication network and a locally synchronous bus (or cross-bar or multistage interconnection network MIN). First, a local bus is given for a group of IP cores so that the communications within this local bus can be arranged to be exclusive in time. If the communications of some IP cores should be required to be completed within a given amount of time, then a non-blocking MIN or a crossbar switch should be made for those IP cores instead of a bus. Then, a communication ratio (CR) for each pair of local buses is provided by users, and based on the Huffman coding philosophy, a process is applied to construct a binary tree (BT) with switches on the internal nodes and buses on the leaves. Since the binary tree system is deadlock free (no cycle exists in any path), the router is just a relatively simple and cheap switch. Simulation results show that the proposed methodology and architecture of NOC is better on switching circuit cost and performance than the SPIN and the mesh architecture using our developed deadlock-free router.
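
    The abstract does not spell out the tree-construction procedure, so the sketch below is only one plausible reading of a Huffman-like process: repeatedly join the two groups of buses that exchange the most traffic under a new switch, so that heavily communicating buses end up close together. The function name, the data layout, and the merge criterion are all assumptions.

```python
def build_tree(buses, cr):
    # buses: list of bus names (leaves).
    # cr: dict mapping frozenset({a, b}) to the communication ratio of that pair.
    clusters = [(name, frozenset([name])) for name in buses]   # (subtree, members)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                traffic = sum(cr.get(frozenset((a, b)), 0.0)
                              for a in clusters[i][1] for b in clusters[j][1])
                if best is None or traffic > best[0]:
                    best = (traffic, i, j)
        _, i, j = best
        # A new internal node (switch) with the two subtrees as children.
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] | clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]
```

    The result is a nested tuple in which each tuple stands for a switch and each string for a local bus.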

  • Subband-Based Blind Separation for Convolutive Mixtures of Speech

    Shoko ARAKI  Shoji MAKINO  Robert AICHNER  Tsuyoki NISHIKAWA  Hiroshi SARUWATARI  

     
    PAPER-Engineering Acoustics

      Vol:
    E88-A No:12
      Page(s):
    3593-3603

    We propose utilizing subband-based blind source separation (BSS) for convolutive mixtures of speech. This is motivated by the drawback of frequency-domain BSS, i.e., when a long frame with a fixed long frame-shift is used to cover reverberation, the number of samples in each frequency decreases and the separation performance is degraded. In subband BSS, (1) by using a moderate number of subbands, a sufficient number of samples can be held in each subband, and (2) by using FIR filters in each subband, we can manage long reverberation. We confirm that subband BSS achieves better performance than frequency-domain BSS. Moreover, subband BSS allows us to select a separation method suited to each subband. Using this advantage, we propose efficient separation procedures that consider the frequency characteristics of room reverberation and speech signals (3) by using longer unmixing filters in low frequency bands and (4) by adopting an overlap-blockshift in BSS's batch adaptation in low frequency bands. Consequently, frequency-dependent subband processing is successfully realized with the proposed subband BSS.

  • Logic Synthesis Technique for High Speed Differential Dynamic Logic with Asymmetric Slope Transition

    Masao MORIMOTO  Yoshinori TANAKA  Makoto NAGATA  Kazuo TAKI  

     
    PAPER-Logic Synthesis

      Vol:
    E88-A No:12
      Page(s):
    3324-3331

    This paper proposes a logic synthesis technique for asymmetric slope differential dynamic logic (ASDDL) circuits. The technique utilizes a commercially available logic synthesis tool that is well established for static CMOS logic design: an intermediate library is devised so that logic synthesis proceeds as it would for static CMOS, and the resulting synthesized circuit is then translated automatically into an ASDDL implementation at the gate-level schematic level as well as at the physical-layout level. A design example of an ASDDL 16-bit multiplier synthesized in a 0.18-µm CMOS technology shows an operation delay time of 1.82 ns, which is a 32% improvement over a static CMOS design with a static logic standard-cell library that is finely tuned for energy-delay products. For the 16-bit multiplier, the design time for an ASDDL-based dynamic digital circuit was 300 times shorter than that of a fully handcrafted design, and comparable with that of a static CMOS design.

  • Symbolic Reachability Analysis of Probabilistic Linear Hybrid Automata

    Yosuke MUTSUDA  Takaaki KATO  Satoshi YAMANE  

     
    PAPER

      Vol:
    E88-A No:11
      Page(s):
    2972-2981

    We can model embedded systems as hybrid systems. Moreover, they are distributed and real-time systems. Therefore, it is important to specify and verify randomness and soft real-time properties. For the purpose of system verification, we formally define a probabilistic linear hybrid automaton and its symbolic reachability analysis method. The automaton can describe uncertainties and soft real-time characteristics.

  • Application of Cognitive Radio Technology across the Wireless Stack

    Paul KOLODZY  

     
    INVITED PAPER

      Vol:
    E88-B No:11
      Page(s):
    4158-4162

    The RF environment in the future will consist of many mobile devices operating across a wide range of applications. Most radio developments assume a static operating environment. The physical layer, MAC layer, and network protocols are optimized for that specific environment. However, this new RF environment consisting of many mobile devices will be very dynamic. Radios will need the capacity to sense and adapt to changing environmental conditions. That characteristic is generally associated with cognitive radio. This paper will provide an introduction to new strategies for designing systems for this new, dynamic environment using cognitive radio technology.

  • Intra-Cell Allocation Information and Inter-Cell Interference Distribution Based TPC for High-Speed CDMA Packet Radio

    Heng QIU  Hidetoshi KAYAMA  Narumi UMEDA  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E88-B No:11
      Page(s):
    4301-4308

    We aim to establish a highly efficient transmitting power control (TPC) scheme suitable for the reverse link of high-speed CDMA packet communication systems. Reservation-based access is assumed to be used for packet transmission in the reverse link. First, we describe a hybrid TPC that we created to cope with average interference changes. The target receiving power in the hybrid TPC is set according to the interference averaged over a comparatively long period of time. We show, using experiments on our high-speed packet communication experimental system, that hybrid TPC can effectively reduce transmission power consumption and PER compared with basic receiving power based TPC. Furthermore, we need to change the transmitting power according to the instantaneous interference to cope with instantaneous interference changes slot by slot. However, in a high-speed packet communication system, the interference level can change dramatically in a very short period of time. The TPC of cdma2000 or W-CDMA cannot efficiently cope with rapidly and greatly changing interference levels. Therefore, we created another two novel TPCs. Interference is divided in these TPCs into intra-cell and inter-cell interference. The supposed inter-cell interference level is changed according to the change in the probability distribution of the inter-cell interference, and the necessary transmitting power for a packet is calculated based on intra-cell allocation information and the supposed inter-cell interference level. Computer simulations show that, with the proposed TPCs, throughput can be increased by more than 200% compared with the type of TPC used in cdma2000 or W-CDMA, and the transmitting power consumption in a mobile host (MH) can also be vastly reduced.

  • Concatenative Speech Synthesis Based on the Plural Unit Selection and Fusion Method

    Tatsuya MIZUTANI  Takehiko KAGOSHIMA  

     
    PAPER-Speech and Hearing

      Vol:
    E88-D No:11
      Page(s):
    2565-2572

    This paper proposes a novel speech synthesis method to generate human-like natural speech. The conventional unit-selection-based synthesis method selects speech units from a large database, and concatenates them with or without modifying the prosody to generate synthetic speech. This method features highly human-like voice quality. The method, however, has a problem that a suitable speech unit is not necessarily selected. Since the unsuitable speech unit selection causes discontinuity between the consecutive speech units, the synthesized speech quality deteriorates. It might be considered that the conventional method can attain higher speech quality if the database size increases. However, preparation of a larger database requires a longer recording time. The narrator's voice quality does not remain constant throughout the recording period. This fact deteriorates the database quality, and still leaves the problem of unsuitable selection. We propose the plural unit selection and fusion method which avoids this problem. This method integrates the unit fusion used in the unit-training-based method with the conventional unit-selection-based method. The proposed method selects plural speech units for each segment, fuses the selected speech units for each segment, modifies the prosody of the fused speech units, and concatenates them to generate synthetic speech. This unit fusion creates speech units which are connected to one another with much less voice discontinuity, and realizes high quality speech. A subjective evaluation test showed that the proposed method greatly improves the speech quality compared with the conventional method. Also, it showed that the speech quality of the proposed method is kept high regardless of the database size, from small (10 minutes) to large (40 minutes). The proposed method is a new framework in the sense that it is a hybrid method between the unit-selection-based method and the unit-training-based method. In the framework, the algorithms of the unit selection and the unit fusion are exchangeable for more efficient techniques. Thus, the framework is expected to lead to new synthesis methods.

  • Speech Synthesis with Various Emotional Expressions and Speaking Styles by Style Interpolation and Morphing

    Makoto TACHIBANA  Junichi YAMAGISHI  Takashi MASUKO  Takao KOBAYASHI  

     
    PAPER

      Vol:
    E88-D No:11
      Page(s):
    2484-2491

    This paper describes an approach to generating speech with emotional expressivity and speaking style variability. The approach is based on a speaking style and emotional expression modeling technique for HMM-based speech synthesis. We first model several representative styles, each of which is a speaking style and/or an emotional expression, in an HMM-based speech synthesis framework. Then, to generate synthetic speech with an intermediate style from representative ones, we synthesize speech from a model obtained by interpolating representative style models using a model interpolation technique. We assess the style interpolation technique with subjective evaluation tests using four representative styles, i.e., neutral, joyful, sad, and rough in read speech and synthesized speech from models obtained by interpolating models for all combinations of two styles. The results show that speech synthesized from the interpolated model has a style in between the two representative ones. Moreover, we can control the degree of expressivity for speaking styles or emotions in synthesized speech by changing the interpolation ratio in interpolation between neutral and other representative styles. We also show that we can achieve style morphing in speech synthesis, namely, changing style smoothly from one representative style to another by gradually changing the interpolation ratio.
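
    The role of the interpolation ratio can be illustrated with a toy example that blends the parameters of two Gaussian output distributions. The abstract does not state which model interpolation technique is used, so the plain linear blend of means and variances below, and the function name, are assumptions made only for illustration.

```python
def interpolate_gaussian(style_a, style_b, ratio):
    # style_a, style_b: (mean_vector, variance_vector) of one HMM state.
    # ratio = 0.0 gives style_a, ratio = 1.0 gives style_b.
    (mean_a, var_a), (mean_b, var_b) = style_a, style_b
    mean = [(1.0 - ratio) * a + ratio * b for a, b in zip(mean_a, mean_b)]
    var = [(1.0 - ratio) * a + ratio * b for a, b in zip(var_a, var_b)]
    return mean, var
```

    Sweeping `ratio` gradually from 0 to 1 over an utterance corresponds to the style morphing described above, for example from a neutral model toward a joyful one.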
