The search functionality is under construction.

Keyword Search Result

[Keyword] Z(5900hit)

4041-4060hit(5900hit)

  • Sounds of Speech Based Spoken Document Categorization: A Subword Representation Method

    Weidong QU  Katsuhiko SHIRAI  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1175-1184

    In this paper, we explore a method for spoken document categorization, the task of automatically assigning spoken documents to a set of predetermined categories. To categorize spoken documents, subword unit representations are used as an alternative to word units generated by either keyword spotting or large vocabulary continuous speech recognition (LVCSR). An advantage of using subword acoustic unit representations for spoken document categorization is that it requires no prior knowledge of the contents of the spoken documents and addresses the out-of-vocabulary (OOV) problem. Moreover, this method relies on the sounds of speech rather than exact orthography. Using subword units instead of words allows approximate matching on inaccurate transcriptions and makes "sounds-like" spoken document categorization possible. We also explore the performance of our method when the training set contains both perfect and errorful phonetic transcriptions, in the hope that the classifiers can learn from the confusion characteristics of the recognizer and the pronunciation variants of words to improve the robustness of the whole system. Our experiments on both artificially corrupted and real corrupted data sets show that the proposed method is more effective and robust than the word-based method.
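
    The idea of approximate matching on subword units can be illustrated with a minimal sketch. It assumes a document is available as a sequence of phone labels and represents it as a bag of overlapping phone trigrams compared by cosine similarity. The function names, the trigram order, and the phone strings are illustrative assumptions, not the representation or classifier used in the paper.

```python
from collections import Counter
from math import sqrt


def phone_ngrams(phones, n=3):
    """Count overlapping phone n-grams in a phone sequence (hypothetical helper)."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up phone strings: a reference transcription, the same utterance with
# one substituted phone (a recognition error), and an unrelated utterance.
reference = "s p iy ch r eh k ax g n ih sh ax n".split()
errorful = "s p iy ch r ih k ax g n ih sh ax n".split()
unrelated = "m y uw z ih k".split()

sim_errorful = cosine(phone_ngrams(reference), phone_ngrams(errorful))
sim_unrelated = cosine(phone_ngrams(reference), phone_ngrams(unrelated))
```

    A single substituted phone disturbs only the trigrams that span it, so the errorful transcription stays close to the reference, whereas a word-level match would fail outright on the misrecognized word.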

  • Robust Speaker Identification System Based on Multilayer Eigen-Codebook Vector Quantization

    Ching-Tang HSIEH  Eugene LAI  Wan-Chen CHEN  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1185-1193

    This paper presents some effective methods for improving the performance of a speaker identification system. Based on the multiresolution property of the wavelet transform, the input speech signal is decomposed into various frequency subbands in order not to spread noise distortions over the entire feature space. For capturing the characteristics of the vocal tract, the linear predictive cepstral coefficients (LPCC) of the lower frequency subband for each decomposition process are calculated. In addition, a hard threshold technique for the lower frequency subband in each decomposition process is also applied to eliminate the effect of noise interference. Furthermore, cepstral domain feature vector normalization is applied to all computed features in order to provide similar parameter statistics in all acoustic environments. In order to effectively utilize all these multiband speech features, we propose a modified vector quantization as the identifier. This model uses the multilayer concept to eliminate the interference among the multiband speech features and then uses the principal component analysis (PCA) method to evaluate the codebooks for capturing a more detailed distribution of the speaker's phoneme characteristics. The proposed method is evaluated using the KING speech database for text-independent speaker identification. Experimental results show that the recognition performance of the proposed method is better than those of the vector quantization (VQ) and the Gaussian mixture model (GMM) using full-band LPCC and mel-frequency cepstral coefficients (MFCC) features in both clean and noisy environments. Also, a satisfactory performance can be achieved in low SNR environments.

  • Wavelet Coding of Structured Geometry Data on Triangular Lattice Plane Considering Rate-Distortion Properties

    Hiroyuki KANEKO  Koichi FUKUDA  Akira KAWANAKA  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E87-D No:5
      Page(s):
    1238-1246

    Efficient representations of a 3-D object shape and its texture data have attracted wide attention for the transmission of computer graphics data and for the development of multi-view real image rendering systems on computer networks. Polygonal mesh data, which consist of connectivity information, geometry data, and texture data, are often used for representing 3-D objects in many applications. This paper presents a wavelet coding technique for coding the geometry data structured on a triangular lattice plane obtained by structuring the connectivity of the polygonal mesh data. Since the structured geometry data have an arbitrarily-shaped support on the triangular lattice plane, a shape-adaptive wavelet transform was used to obtain the wavelet coefficients, whose number is identical to the number of original data, while preserving the self-similarity of the wavelet coefficients across subbands. In addition, the wavelet coding technique includes extensions of the zerotree entropy (ZTE) coding for taking into account the rate-distortion properties of the structured geometry data. The parent-children dependencies are defined as the set of wavelet coefficients from different bands that represent the same spatial region in the triangular lattice plane, and the wavelet coefficients in the spatial tree are optimally pruned based on the rate-distortion properties of the geometry data. Experiments in which proposed wavelet coding was applied to some sets of polygonal mesh data showed that the proposed wavelet coding achieved better coding efficiency than the Topologically Assisted Geometry Compression scheme adopted in the MPEG-4 standard.

  • Standardization of Measurement Methods of Low-Loss Dielectrics and High-Temperature Superconducting Films

    Yoshio KOBAYASHI  

     
    INVITED PAPER

      Vol:
    E87-C No:5
      Page(s):
    652-656

    The present state of IEC and JIS standards for measurement methods of low-loss dielectric and high-temperature superconductor (HTS) materials in the microwave and millimeter wave range is reviewed. Four resonance methods are discussed: a two-dielectric resonator method for dielectric rod measurements, a two-sapphire resonator method for HTS film measurements, a cavity resonator method for microwave measurements of dielectric plates, and a cutoff circular waveguide method for millimeter wave measurements of dielectric plates. These methods achieve accuracy sufficient for measuring the temperature dependence of material properties.

  • Fundamental Properties of M-Convex and L-Convex Functions in Continuous Variables

    Kazuo MUROTA  Akiyoshi SHIOURA  

     
    PAPER

      Vol:
    E87-A No:5
      Page(s):
    1042-1052

    The concepts of M-convexity and L-convexity, introduced by Murota (1996, 1998) for functions on the integer lattice, extract combinatorial structures in well-solved nonlinear combinatorial optimization problems. These concepts are extended to polyhedral convex functions and quadratic functions on the real space by Murota-Shioura (2000, 2001). In this paper, we consider a further extension to general convex functions. The main aim of this paper is to provide rigorous proofs for fundamental properties of general M-convex and L-convex functions.

  • Multiple-Value Exclusive-Or Sum-Of-Products Minimization Algorithms

    Stergios STERGIOU  Dimitris VOUDOURIS  George PAPAKONSTANTINOU  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E87-A No:5
      Page(s):
    1226-1234

    In this work, a novel Multiple Valued Exclusive-Or Sum Of Products (MVESOP) minimization formulation is analyzed and an algorithm is presented that detects minimum MVESOP expressions when the weight of the function is less than eight. A heuristic MVESOP algorithm based on a novel cube transformation operation is then presented. Experimental results on MCNC benchmarks and randomly generated functions indicate that the algorithm matches or outperforms the quality of the state of the art in ESOP minimizers.

  • Sound Source Localization Using a Profile Fitting Method with Sound Reflectors

    Osamu ICHIKAWA  Tetsuya TAKIGUCHI  Masafumi NISHIMURA  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1138-1145

    In a two-microphone approach, interchannel differences in time (ICTD) and interchannel differences in sound level (ICLD) have generally been used for sound source localization. But those cues are not effective for vertical localization in the median plane (direct front). For that purpose, spectral cues based on features of head-related transfer functions (HRTF) have been investigated, but they are not robust enough against signal variations and environmental noise. In this paper, we use a "profile" as a cue while using a combination of reflectors specially designed for vertical localization. The observed sound is converted into a profile containing information about reflections as well as ICTD and ICLD data. The observed profile is decomposed into signal and noise by using template profiles associated with sound source locations. The template minimizing the residual of the decomposition gives the estimated sound source location. Experiments show this method can correctly provide a rough estimate of the vertical location even in a noisy environment.
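
    The interchannel time difference mentioned above is commonly estimated by cross-correlating the two microphone signals and picking the lag with the highest score. The following is a minimal sketch of that generic technique only; it does not implement the profile fitting method of the paper, and the function name and the sign convention for the lag are assumptions.

```python
def estimate_ictd(left, right, max_lag):
    """Return the lag in samples at which `right` best matches `left`.

    A positive result means the right channel lags the left channel.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += value * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic example: the same short pulse arrives 3 samples later on the right.
left = [0.0] * 32
right = [0.0] * 32
left[10], left[11] = 1.0, 0.5
right[13], right[14] = 1.0, 0.5
lag = estimate_ictd(left, right, max_lag=8)
```

    Dividing the lag by the sampling rate gives the time difference in seconds. For a source in the median plane this lag is close to zero at every elevation, which is why the abstract notes that such cues are not effective for vertical localization.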

  • Size-Reduced Visual Secret Sharing Scheme

    Hidenori KUWAKADO  Hatsukazu TANAKA  

     
    LETTER

      Vol:
    E87-A No:5
      Page(s):
    1193-1197

    We propose a method for reducing the size of a share in visual secret sharing schemes. The proposed method causes neither leakage nor loss of the original image. The quality of the recovered image is almost the same as that of previous schemes.

  • Missing Feature Theory Applied to Robust Speech Recognition over IP Network

    Toshiki ENDO  Shingo KUROIWA  Satoshi NAKAMURA  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1119-1126

    This paper addresses problems involved in performing speech recognition over mobile and IP networks. The main problem is speech data loss caused by packet loss in the network. We present two missing-feature-based approaches that recover lost regions of speech data. These approaches are based on the reconstruction of missing frames or on marginal distributions. For comparison, we also use a packing method, which skips lost data. We evaluate these approaches with two packet loss models, namely random loss and Gilbert loss models. The results show that the marginal-distribution-based technique is the most effective in a packet loss environment; with a DSR front-end, the degradation of word accuracy is only 5% when the packet loss rate is 30% and only 3% when the mean burst loss length is 24 frames. The simple data imputation method is also effective in the case of clean speech.
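
    A Gilbert loss model is usually described as a two-state Markov chain with a "good" state in which frames arrive and a "bad" state in which they are lost. The sketch below simulates such a chain under that textbook formulation. The parameter names and values are illustrative assumptions and are not taken from the paper.

```python
import random


def gilbert_loss_mask(n_frames, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where a frame is lost.

    Assumed two-state chain: every frame is lost in the bad state and
    received in the good state.
    """
    rng = random.Random(seed)
    in_bad = False
    mask = []
    for _ in range(n_frames):
        if in_bad:
            if rng.random() < p_bad_to_good:
                in_bad = False
        elif rng.random() < p_good_to_bad:
            in_bad = True
        mask.append(in_bad)
    return mask


def loss_rate(mask):
    """Fraction of frames that were lost."""
    return sum(mask) / len(mask)


def mean_burst_length(mask):
    """Average length of consecutive runs of lost frames."""
    bursts, run = [], 0
    for lost in mask:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0


mask = gilbert_loss_mask(200_000, p_good_to_bad=0.05, p_bad_to_good=0.25)
```

    In this formulation the long-run loss rate tends to p_good_to_bad / (p_good_to_bad + p_bad_to_good) and the mean burst length to 1 / p_bad_to_good, so the two probabilities can be chosen to hit a target loss rate and burst length. A random loss model corresponds to losing each frame independently with a fixed probability.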

  • Single Probe Method with Vector Detection for Measuring Microwave Reflection Coefficient

    Takashi IWASAKI  Makoto TAKASHIMA  

     
    PAPER-General Methods, Materials, and Passive Circuits

      Vol:
    E87-C No:5
      Page(s):
    665-671

    A novel method for measuring microwave reflection coefficients without the open and load standards is proposed. In this method, a single probe is inserted into an air line and the output wave is detected by a vector detector. Offset shorts are used for the calibration. The measurement system is constructed using a 7 mm coaxial line and APC7 connectors. Measurement results in the frequency range 1-9 GHz demonstrate the feasibility of the proposed method. All the major systematic errors can be estimated from data that are easily obtainable.

  • Decoding Algorithms for Low-Density Parity-Check Codes with Multilevel Modulations

    Hisashi FUTAKI  Tomoaki OHTSUKI  

     
    PAPER-Fundamental Theories

      Vol:
    E87-B No:5
      Page(s):
    1282-1289

    Recently, low-density parity-check (LDPC) codes have attracted much attention. Like turbo codes, LDPC codes can achieve near-Shannon-limit performance. For LDPC codes, reduced-complexity decoding algorithms referred to as the uniformly most powerful (UMP) BP-based and normalized BP-based algorithms were proposed for BPSK on an additive white Gaussian noise (AWGN) channel. The conventional BP and BP-based algorithms can be applied to BPSK modulation. For high bit-rate transmission, multilevel modulation is preferred. Thus, the BP algorithm for multilevel modulations is proposed in . In this paper, we propose a BP algorithm with reduced complexity for multilevel modulations, in which the first likelihood of the proposed BP algorithm is modified to suit multilevel modulations. We compare the error rate performance of the proposed algorithm with that of the conventional algorithm on AWGN and flat Rayleigh fading channels. We also propose the UMP BP-based and normalized BP-based algorithms for multilevel modulations on AWGN and flat Rayleigh fading channels. We show that the error rate performance of the proposed BP algorithm is almost identical to that of the algorithm in, where the decoding complexity of the proposed BP algorithm is less than that of the algorithm in. We also show that the proposed BP-based algorithms can achieve a good trade-off between complexity and error rate performance.

  • A Traffic-Based Bandwidth Reservation Scheme for QoS Sensitive Mobile Multimedia Wireless Networks

    Jau-Yang CHANG  Hsing-Lung CHEN  

     
    PAPER-Mobility Management

      Vol:
    E87-B No:5
      Page(s):
    1166-1176

    Future mobile communication systems are expected to support multimedia applications (audio phone, video on demand, video conference, file transfer, etc.). Multimedia applications place great demands on bandwidth and impose stringent quality of service requirements on mobile wireless networks. To provide mobile hosts with high quality of service in next-generation mobile multimedia wireless networks, more efficient bandwidth reservation schemes must be developed. A novel traffic-based bandwidth reservation scheme is proposed in this paper as a solution to support quality of service guarantees in mobile multimedia wireless networks. Based on the existing network conditions, the proposed scheme makes an adaptive decision for bandwidth reservation and call admission by employing a fuzzy inference mechanism, a timing-based reservation strategy, and a round-borrowing strategy in each base station. The amount of reserved bandwidth for each base station is dynamically adjusted according to the on-line traffic information of that base station. We use this dynamically adaptive approach to reduce the connection-blocking probability and connection-dropping probability while increasing the bandwidth utilization for quality of service sensitive mobile multimedia wireless networks. Simulation results show that our traffic-based bandwidth reservation scheme outperforms previously known schemes in terms of connection-blocking probability, connection-dropping probability, and bandwidth utilization.

  • Inter-Cell Interference of Approximately Synchronized CDMA Systems in Asynchronous Condition

    Hideyuki TORII  Makoto NAKAMURA  

     
    PAPER-Wireless Communication Technology

      Vol:
    E87-B No:5
      Page(s):
    1318-1327

    In the present paper, we evaluate the inter-cell interference of AS-CDMA systems. First, the cross-correlation property of AS-CDMA systems is examined by theoretical study in order to clarify the fundamental feature of the inter-cell interference. The result shows that the influence of one interference terminal in each adjacent cell is dominant regardless of whether approximate synchronization is maintained. Next, the ratio of interference signal power and desired signal power is evaluated by computer simulation. The simulation result shows that total interference power does not increase even when approximate synchronization is not maintained.

  • Performance of Chaos and Burst Noises Injected to the Hopfield NN for Quadratic Assignment Problems

    Yoko UWATE  Yoshifumi NISHIO  Tetsushi UETA  Tohru KAWABE  Tohru IKEGUCHI  

     
    PAPER-Neural Networks and Bioengineering

      Vol:
    E87-A No:4
      Page(s):
    937-943

    In this paper, the performance of chaos and burst noises injected into the Hopfield Neural Network for quadratic assignment problems is investigated. To evaluate the noises, two methods for assessing how many nearly optimal solutions are found are proposed. Computer simulations confirm that the burst noise generated by the Gilbert model with a laminar part and a burst part achieves performance as good as that of the intermittency chaos noise near the three-periodic window.

  • Blind Adaptive Equalizer Based on CMA and LMS Algorithm

    James OKELLO  Kenji UEDA  Hiroshi OCHI  

     
    LETTER-Fundamental Theories

      Vol:
    E87-B No:4
      Page(s):
    1012-1015

    In this letter we verify that a blind adaptive algorithm operating at a low intermediate frequency (Low-IF) can be applied to a system where carrier phase synchronization has not been achieved. We consider a quadrature amplitude shift keyed (QPSK) signal as the transmitted signal, and assume that the orthogonal low intermediate sinusoidal frequency used to generate the transmitted signal is well known. The proposed algorithm combines two algorithms, namely the least mean square (LMS) algorithm, which has a cost function with a unique minimum, and the constant modulus algorithm (CMA), which was first proposed by Godard. By doing this and operating the equalizer at a rate greater than the symbol rate, we take advantage of the variable amplitude of the sub-carriers and the fast convergence of the LMS algorithm to achieve a faster convergence speed. When the computer simulation results of the proposed algorithm were compared with those of the constant modulus algorithm (CMA) and the modified CMA (MCMA), we observed that the proposed algorithm exhibited a faster convergence speed.
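
    For orientation, the classical constant modulus update can be sketched as follows. This is the textbook stochastic-gradient CMA that penalizes the deviation of the squared output magnitude from a constant, not the combined CMA and LMS Low-IF algorithm proposed in the letter. The function name, step size, tap count, and test signal are illustrative assumptions.

```python
def cma_equalize(samples, num_taps=1, step=0.1, radius2=1.0):
    """Run a classical CMA equalizer over complex baseband samples.

    Returns the final tap weights and the equalizer outputs.
    radius2 is the target squared modulus of the output.
    """
    taps = [0j] * num_taps
    taps[num_taps // 2] = 1 + 0j          # centre-spike initialization
    delay_line = [0j] * num_taps
    outputs = []
    for sample in samples:
        delay_line = [sample] + delay_line[:-1]
        y = sum(w * x for w, x in zip(taps, delay_line))
        # Gradient of (|y|^2 - radius2)^2 with respect to the conjugate taps.
        err = (abs(y) ** 2 - radius2) * y
        taps = [w - step * err * x.conjugate() for w, x in zip(taps, delay_line)]
        outputs.append(y)
    return taps, outputs


# Unit-modulus symbols attenuated by an unknown gain of 0.5.
symbols = [1, 1j, -1, -1j]
received = [0.5 * symbols[i % 4] for i in range(2000)]
taps, outputs = cma_equalize(received)
```

    With a single tap and a pure gain error, the tap magnitude settles near 2 so that the output modulus returns to 1, without any training symbols. The update depends only on the output magnitude, which is why CMA does not resolve carrier phase on its own.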

  • Normalization of Time-Derivative Parameters for Robust Speech Recognition in Small Devices

    Yasunari OBUCHI  Nobuo HATAOKA  Richard M. STERN  

     
    PAPER-Speech and Hearing

      Vol:
    E87-D No:4
      Page(s):
    1004-1011

    In this paper we describe a new framework of feature compensation for robust speech recognition, which is especially suitable for small devices. We introduce Delta-cepstrum Normalization (DCN), which normalizes not only cepstral coefficients but also their time derivatives. Cepstral Mean Normalization (CMN) and Mean and Variance Normalization (MVN) are fast and efficient algorithms for environmental adaptation and have been widely used. In those algorithms, normalization was applied to cepstral coefficients to reduce the irrelevant information in them, but such normalization was not applied to time-derivative parameters because the reduction of the irrelevant information was not sufficient. However, Histogram Equalization (HEQ) provides better compensation and can be applied even to the delta and delta-delta cepstra. We investigate various implementations of DCN and show that the best performance is achieved when the normalizations of the cepstra and the delta cepstra are mutually interdependent. We evaluate the performance of DCN using speech data recorded by a PDA. DCN provides significant improvements over HEQ, giving a 15% relative word error rate reduction. We also examine the possibility of combining Vector Taylor Series (VTS) and DCN. Even though some combinations do not improve the performance of VTS, the best combination gives better performance than VTS alone. Finally, the advantage of DCN in terms of computation speed is also discussed.
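
    As a point of reference, the sketch below applies plain mean and variance normalization to both the static features and their first-order time derivatives. It only illustrates what normalizing the delta parameters means. It is not the DCN algorithm of the paper, which builds on histogram equalization and interdependent normalization, and the function names and the two-frame delta formula are assumptions.

```python
from statistics import mean, pstdev


def mvn(frames):
    """Normalize each feature dimension to zero mean and unit variance."""
    dims = list(zip(*frames))
    mus = [mean(d) for d in dims]
    sigmas = [pstdev(d) or 1.0 for d in dims]   # guard constant dimensions
    return [[(x - m) / s for x, m, s in zip(frame, mus, sigmas)]
            for frame in frames]


def deltas(frames):
    """First-order time derivative by a central difference, edges replicated."""
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return out


def normalize_static_and_delta(frames):
    """Return frames of [normalized static | normalized delta] features."""
    static = mvn(frames)
    delta = mvn(deltas(frames))
    return [s + d for s, d in zip(static, delta)]


# Toy two-dimensional "cepstral" frames, made up for illustration.
frames = [[1.0, 10.0], [2.0, 20.0], [4.0, 40.0], [7.0, 70.0]]
features = normalize_static_and_delta(frames)
```

    Subtracting the mean from the static coefficients corresponds to CMN, and additionally dividing by the standard deviation gives MVN. Here the same operation is simply repeated on the delta stream, independently of the static stream.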

  • Selective-Sets Resizable Cache Memory Design for High-Performance and Low-Power CPU Core

    Takashi KURAFUJI  Yasunobu NAKASE  Hidehiro TAKATA  Yukinaga IMAMURA  Rei AKIYAMA  Tadao YAMANAKA  Atsushi IWABU  Shutarou YASUDA  Toshitsugu MIWA  Yasuhiro NUNOMURA  Niichi ITOH  Tetsuya KAGEMOTO  Nobuharu YOSHIOKA  Takeshi SHIBAGAKI  Hiroyuki KONDO  Masayuki KOYAMA  Takahiko ARAKAWA  Shuhei IWADE  

     
    PAPER

      Vol:
    E87-C No:4
      Page(s):
    535-542

    We apply a selective-sets resizable cache and a complete hierarchy SRAM for the high-performance and low-power RISC CPU core. The selective-sets resizable cache can change the cache memory size by varying the number of cache sets. It reduces the leakage current by 23% with slight degradation of the worst case operating speed from 213 MHz to 210 MHz. The complete hierarchy SRAM enables the partial swing operation not only in the bit lines, but also in the global signal lines. It reduces the current consumption of the memory by 4.6%, and attains the high-speed access of 1.4 ns in the typical case.

  • Comparison of Efficiency in Key Entry among Young, Middle-Aged and Elderly Groups: Effects of Aging and Size of Keyboard Letters on Work Efficiency

    Atsuo MURATA  Yoshitomo OKADA  

     
    PAPER-Human-computer Interaction

      Vol:
    E87-D No:4
      Page(s):
    985-991

    Making information technology (IT) more accessible to elderly users is an important objective, particularly with regard to input devices. This study investigated how aging and the letter (character) size of a keyboard affect the efficiency of data entry. In addition, the computer experience of elderly users was examined in relation to efficiency. The performance measures (entry speed and number of correct entries per minute) were twice as good in a young group of computer users as in middle-aged and elderly groups. An effect of the size of the keyboard letters on performance was observed for the middle-aged and elderly groups who had no experience using a computer. The young, middle-aged, and elderly groups with computer experience were not affected by the size of the keyboard letters.

  • Deriving Tool Specifications from User Actions

    Christopher J. HOGGER  Frank R. KRIWACZEK  

     
    PAPER-Requirement Engineering

      Vol:
    E87-D No:4
      Page(s):
    831-837

    We describe a framework for deriving specifications of wizard-like tools by detecting coherent patterns of behaviour among user actions observed in a portal environment. Implementation in the portal of tools compliant with these specifications can then provide useful support for the kind of work patterns observed. The derivation process employs a customizable knowledge base which defines coherent patterns and seeks concrete instances of them among series of actions that occur with sufficient frequency among those observed.

  • Dynamic Bit-Rate Reduction Based on Requantization and Frame-Skipping for MPEG-1 to MPEG-4 Transcoder

    Kwang-deok SEO  Seong-cheol HEO  Soon-kak KWON  Jae-kyoon KIM  

     
    PAPER-Image

      Vol:
    E87-A No:4
      Page(s):
    903-911

    In this paper, we propose a dynamic bit-rate reduction scheme for transcoding an MPEG-1 bitstream into an MPEG-4 simple profile bitstream with a typical bit-rate of 384 kbps. For dynamic bit-rate reduction, a significant reduction in the bit-rate is achieved by combining the processes of requantization and frame-skipping. Conventional requantization methods for a homogeneous transcoder cannot be used directly for a heterogeneous transcoder due to the mismatch in the quantization parameters between the MPEG-1 and MPEG-4 syntax and the difference in the compression efficiency between MPEG-1 and MPEG-4. Accordingly, to solve these problems, a new requantization method is proposed for an MPEG-1 to MPEG-4 transcoder consisting of R-Q (rate-quantization) modeling with a simple feedback and an adjustment of the quantization parameters to compensate for the different coding efficiency between MPEG-1 and MPEG-4. For bit-rate reduction by frame-skipping, an efficient method is proposed for estimating the relevant motion vectors from the skipped frames. The conventional FDVS (forward dominant vector selection) method is improved to reflect the effect of the macroblock types in the skipped frames. Simulation results demonstrated that the proposed method combining requantization and frame-skipping can generate a transcoded MPEG-4 bitstream that is much closer to the desired low bit-rate than the conventional method along with a superior objective quality.
