The search functionality is under construction.

Keyword Search Result

[Keyword] VQ(31hit)

21-31hit(31hit)

  • A Low-Power High-Performance Vector-Pipeline DSP for Low-Rate Videophones

    Kazutoshi KOBAYASHI  Makoto EGUCHI  Takuya IWAHASHI  Takehide SHIBAYAMA  Xiang LI  Kosuke TAKAI  Hidetoshi ONODERA  

     
    PAPER

      Vol:
    E84-C No:2
      Page(s):
    193-201

    We propose a vector-pipeline processor, VP-DSP, for low-rate videophones. It can encode and decode QCIF video at 10 frames/s over a 29.2 kbps low-rate line. We have already fabricated a VP-DSP LSI in a 0.35 µm CMOS process. The area of the VP-DSP core is 4.26 mm². It works properly at 25 MHz/1.6 V with a power consumption of 49 mW. Its peak performance is 400 MOPS, which corresponds to 8.2 GOPS/W.
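
    As a quick consistency check on the figures quoted in the abstract above, the efficiency can be recomputed from the peak performance and the power consumption. This is only a minimal sketch; the variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract.
peak_mops = 400.0   # peak performance in MOPS
power_mw = 49.0     # power consumption at 25 MHz / 1.6 V, in mW

# MOPS per mW is numerically equal to GOPS per W (10^6 / 10^-3 = 10^9).
gops_per_watt = peak_mops / power_mw

print(f"{gops_per_watt:.1f} GOPS/W")  # prints "8.2 GOPS/W"
```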

  • High Speed and High Accuracy Rough Classification for Handwritten Characters Using Hierarchical Learning Vector Quantization

    Yuji WAIZUMI  Nei KATO  Kazuki SARUTA  Yoshiaki NEMOTO  

     
    PAPER-Biocybernetics, Neurocomputing

      Vol:
    E83-D No:6
      Page(s):
    1282-1290

    We propose a rough classification system using Hierarchical Learning Vector Quantization (HLVQ) for large-scale classification problems that involve many categories. In the proposed system, HLVQ divides the categories hierarchically in the feature space, building a tree whose nodes multiply down the hierarchy. In each layer the feature space is divided by a few codebook vectors, and adjacent feature spaces overlap at their borders. HLVQ classification is both fast and accurate owing to the hierarchical architecture and the overlapping technique. In a classification experiment using ETL9B, the largest database of handwritten characters in Japan (607,200 samples in total from 3036 categories), the speed and accuracy of classification by HLVQ were found to be higher than those of the Self-Organizing feature Map (SOM) and Learning Vector Quantization methods. We also demonstrate that the proposed system, which uses multiple codebook vectors for each category under HLVQ, achieves higher speed and accuracy than systems that use average vectors.
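
    The idea of a layered search with overlapping regions, as described in the abstract above, can be sketched roughly as follows. This is not the authors' HLVQ learning procedure: the two-layer structure, the `overlap` ratio, and the function names are assumptions made for illustration, and labelled sample vectors stand in for trained lower-layer codebook vectors.

```python
import math

def build_overlapping_nodes(samples, top_codebook, overlap=1.2):
    """Assign each (vector, label) sample to its nearest top-layer node and to
    any other node whose distance is within `overlap` times the nearest one.
    The `overlap` ratio is a hypothetical parameter, not taken from the paper."""
    nodes = [[] for _ in top_codebook]
    for vec, label in samples:
        dists = [math.dist(vec, c) for c in top_codebook]
        nearest = min(dists)
        for i, d in enumerate(dists):
            if d <= overlap * nearest:
                nodes[i].append((vec, label))
    return nodes

def classify(vec, top_codebook, nodes):
    """Pick the nearest top-layer node, then search only that node's vectors."""
    i = min(range(len(top_codebook)), key=lambda k: math.dist(vec, top_codebook[k]))
    best_vec, best_label = min(nodes[i], key=lambda s: math.dist(vec, s[0]))
    return best_label

top_codebook = [(0.0, 0.0), (10.0, 0.0)]
samples = [((1.0, 0.0), "A"), ((4.8, 0.0), "B"), ((9.0, 0.0), "C")]
nodes = build_overlapping_nodes(samples, top_codebook)

# "B" lies near the border, so it is stored in both nodes.
label = classify((5.1, 0.0), top_codebook, nodes)
```

    In this toy setup the query at (5.1, 0) falls into the second node, yet its true nearest neighbour "B" is closer to the first node's centre. Because "B" is stored in both nodes, the restricted search still finds it; with `overlap=1.0` the query would be labelled "C" instead.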

  • A Bit Rate Reduction Technique for Vector Quantization Image Data Compression

    Yung-Gi WU  Shen-Chuan TAI  

     
    PAPER-Source Coding/Image Processing

      Vol:
    E82-A No:10
      Page(s):
    2147-2153

    In this paper, we develop a technique to reduce the overhead of Vector Quantization (VQ) coding. Our method exploits the correlation between indices to reduce the overhead of transmitting the encoded indices, and uses the Discrete Cosine Transform (DCT) to decorrelate them for further bit rate reduction. The codewords in a codebook generated by the conventional LBG algorithm have no particular order, so the indices of the codewords selected for adjacent blocks are essentially randomly distributed. However, because adjacent regions of the original image tend to be homogeneous, we re-arrange the codebook according to a predefined weighting criterion so that the indices selected for neighboring blocks reflect this homogeneity as well. The DCT is then used to compress the VQ-encoded indices. Because the adjacent indices are homogeneous after the codebook permutation, the DCT achieves better compression efficiency. The quantization step of the DCT introduces distortion, however, which yields erroneously decoded indices. We therefore use an index residue compensation method to correct the erroneously decoded indices that have a high complexity deviation, reducing the unpleasant visual effects caused by distorted indices. Statistical illustrations and tables are provided to demonstrate the performance of the proposed method. Experiments are carried out on Lena and other natural gray-level images to support our claims. Simulation results show that our method saves more than 50% of the bit rate for some images while preserving the same reconstructed image quality as the standard VQ coding scheme.
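
    The codebook permutation step described in the abstract above can be illustrated with a small sketch. The abstract does not state the weighting criterion, so sorting codewords by mean intensity is an assumption made here purely for illustration; the function names and toy data are likewise hypothetical, and the DCT and residue compensation stages are not shown.

```python
def reorder_codebook(codebook):
    """Sort codewords by mean intensity (an assumed stand-in for the paper's
    weighting criterion). Returns the new codebook and an old->new index map."""
    order = sorted(range(len(codebook)),
                   key=lambda i: sum(codebook[i]) / len(codebook[i]))
    new_codebook = [codebook[i] for i in order]
    remap = {old: new for new, old in enumerate(order)}
    return new_codebook, remap

def total_variation(indices):
    """Sum of absolute differences between neighbouring indices."""
    return sum(abs(a - b) for a, b in zip(indices, indices[1:]))

# Toy codebook in arbitrary order, as an LBG-style design might leave it.
codebook = [(200, 210), (10, 12), (120, 118), (60, 64)]
# Indices chosen for a run of adjacent blocks.
indices = [1, 3, 3, 2, 0, 0, 2, 3]

new_codebook, remap = reorder_codebook(codebook)
new_indices = [remap[i] for i in indices]

tv_before = total_variation(indices)      # 8
tv_after = total_variation(new_indices)   # 5
```

    The decoded blocks are unchanged, since `new_codebook[remap[i]]` is the same codeword as `codebook[i]`. Only the index sequence becomes smoother, which is the property a transform such as the DCT can exploit.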

  • A Real-Time Low-Rate Video Compression Algorithm Using Multi-Stage Hierarchical Vector Quantization

    Kazutoshi KOBAYASHI  Kazuhiko TERADA  Hidetoshi ONODERA  Keikichi TAMARU  

     
    PAPER

      Vol:
    E82-A No:2
      Page(s):
    215-222

    We propose a real-time low-rate video compression algorithm using fixed-rate multi-stage hierarchical vector quantization. Vector quantization is suitable for mobile computing, since it requires little computation for decoding. The proposed algorithm enables transmission of 10 QCIF frames per second over a low-rate 29.2 kbps mobile channel. Each frame is hierarchically divided into sub-blocks and is compressed at a fixed rate regardless of video activity. For active frames, mainly large sub-blocks at low resolution are transmitted. For inactive frames, smaller sub-blocks at high resolution can be transmitted successively after a motion-compensated frame. We develop a compression system that consists of a host computer and a memory-based processor for the nearest-neighbor search in VQ. Our algorithm guarantees real-time decoding on a low-performance CPU.
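
    The asymmetry mentioned in the abstract above, where encoding needs a nearest-neighbor search but decoding is cheap, can be seen in a plain single-stage VQ sketch. This is generic VQ, not the multi-stage hierarchical scheme of the paper, and the codebook and block values are made up for illustration.

```python
def encode(block, codebook):
    """Nearest-neighbour search: the block is compared with every codeword."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(block, codebook[i]))

def decode(index, codebook):
    """Decoding is a single table lookup."""
    return codebook[index]

# Toy codebook of three 4-pixel codewords.
codebook = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
block = (10, 240, 5, 250)

index = encode(block, codebook)           # 2
reconstructed = decode(index, codebook)   # (0, 255, 0, 255)
```

    For the channel quoted in the abstract, 29.2 kbps at 10 frames per second leaves 29,200 / 10 = 2,920 bits per frame.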

  • Comparison of Two Speech and Audio Coders at 8 kb/s from the Viewpoints of Coding Scheme and Quality

    Nobuhiko KITAWAKI  Takehiro MORIYA  Takao KANEKO  Naoki IWAKAMI  

     
    PAPER-Media Management

      Vol:
    E81-B No:11
      Page(s):
    2007-2012

    Low bit-rate speech and audio coding are key technologies for multimedia communications. A number of coding schemes have been developed for various applications. In Internet applications, good speech and audio quality at very low bit rates (8-16 kb/s) is valuable. Two recently proposed speech and audio coding schemes, CS-ACELP (Conjugate Structure Algebraic Code Excited Linear Prediction, standardized by the ITU-T in Recommendation G.729) and TwinVQ (Transform-domain Weighted INterleave Vector Quantization, one of the candidates for MPEG-4 audio), were compared from the viewpoints of coding scheme and quality. Although there are significant differences in their basic structures and frame lengths, this paper shows that both use the same compression techniques, such as LPC (Linear Predictive Coding) analysis, pitch-period estimation, and vector quantization. While CS-ACELP provides toll quality for speech at 8 kb/s, the quality it provides for music signals is insufficient. The TwinVQ transform coder is based on LPC and vector quantization and is also capable of operating at 8 kb/s. Evaluation of these two schemes in terms of their fundamental technologies, quality, delay, and complexity showed that the quality of TwinVQ is better for music signals and that the quality of CS-ACELP is better for speech signals. Therefore, TwinVQ may be better suited for one-directional Internet applications, and CS-ACELP may be better for two-directional communication.

  • Combining Multiple Classifiers in a Hybrid System for High Performance Chinese Syllable Recognition

    Liang ZHOU  Satoshi IMAI  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E79-D No:11
      Page(s):
    1570-1578

    A multiple classifier system can be a powerful solution for robust pattern recognition. An appropriate combination of multiple classifiers is expected to reduce errors, provide robustness, and achieve higher performance. In this paper, high-performance Chinese syllable recognition is presented using combinations of multiple classifiers. Chinese syllable recognition is divided into base syllable recognition (disregarding the tones) and recognition of the 4 tones. For base syllable recognition, we used a combination of two multisegment vector quantization (MSVQ) classifiers based on different features (instantaneous and transitional features of speech). For tone recognition, a vector quantization (VQ) classifier was first used, and its performance was comparable to that of a multilayer perceptron (MLP) classifier. To obtain more robust or better performance, a combination of a distortion-based classifier (VQ) and a discriminant-based classifier (MLP) is proposed. The evaluations were carried out using the standard syllable database CRDB in China, and the experimental results show that combining multiple classifiers with different features or different methodologies can improve recognition performance. Recognition accuracy for base syllables, tones, and tonal syllables is 96.79%, 99.82%, and 96.24%, respectively. Since these results were evaluated on a standard database, they can be used as a benchmark that allows direct comparison against other approaches.

  • A Pattern Vector Quantization Scheme for Mid-range Frequency DCT Coefficients

    Dennis Chileshe MWANSA  Satoshi MIZUNO  Makoto FUJIMURA  Hideo KURODA  

     
    PAPER

      Vol:
    E79-B No:10
      Page(s):
    1452-1458

    In DCT transform coding it is usually necessary, for data compression reasons, to discard some of the AC coefficients obtained after the transform operation. Although most of the energy is usually compacted in the few coefficients that are transmitted, there are many instances where the discarded coefficients contain significant information. The absence of these coefficients at the decoder can lead to visible degradation of the reconstructed image, especially around slow-moving objects. We propose a simple but effective method which uses an indirect form of vector quantization to supplement scalar quantization in the transform domain. The distribution pattern of coefficients that fall below a fixed threshold is vector quantized, and an index of the pattern chosen from a codebook is transmitted together with two averages: one for the positive coefficients and the other for the negative coefficients. In the reconstruction, the average values are used instead of setting the corresponding coefficients to zero. This is tantamount to quantizing the mid-range frequency coefficients with 1 bit, but the resulting bit rate is much lower. Our aim is to offer an alternative to traditional vector quantization, which entails high computational complexity and large time and memory requirements.
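
    A rough sketch of the pattern-plus-two-averages idea from the abstract above is given below. Several details are assumptions made for illustration rather than taken from the paper: the pattern is represented as a sign vector (+1, -1, 0), the codebook match uses Hamming distance, and the function names, threshold, and toy data are hypothetical.

```python
def encode_pattern(coeffs, threshold, pattern_codebook):
    """Return (pattern index, positive average, negative average) for the
    coefficients whose magnitude is below `threshold`."""
    # Coefficients at or above the threshold are left to scalar quantization.
    small = [c if abs(c) < threshold else 0 for c in coeffs]
    pattern = [(c > 0) - (c < 0) for c in small]   # sign: +1, -1 or 0
    pos = [c for c in small if c > 0]
    neg = [c for c in small if c < 0]
    pos_avg = sum(pos) / len(pos) if pos else 0.0
    neg_avg = sum(neg) / len(neg) if neg else 0.0
    # Nearest codebook pattern by Hamming distance (an assumed measure).
    index = min(range(len(pattern_codebook)),
                key=lambda i: sum(p != q for p, q in zip(pattern, pattern_codebook[i])))
    return index, pos_avg, neg_avg

def decode_pattern(index, pos_avg, neg_avg, pattern_codebook):
    """Fill each position with the matching average instead of zero."""
    return [pos_avg if s > 0 else neg_avg if s < 0 else 0.0
            for s in pattern_codebook[index]]

pattern_codebook = [
    [1, 1, 0, -1, -1, 0],
    [1, -1, 0, 1, -1, 0],
    [0, 0, 0, 0, 0, 0],
]
coeffs = [3, -2, 40, 1, -4, 0]   # 40 exceeds the threshold and is excluded

index, pos_avg, neg_avg = encode_pattern(coeffs, 8, pattern_codebook)
approx = decode_pattern(index, pos_avg, neg_avg, pattern_codebook)
```

    Only the pattern index and the two averages need to be sent for the small coefficients, and the decoder substitutes the averages where plain thresholding would have left zeros.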

  • Projective Image Representation and Its Application to Image Compression

    Kyeong-Hoon JUNG  Choong Woong LEE  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E79-D No:2
      Page(s):
    136-142

    This paper introduces a new image representation method named the projective image representation (PIR). We consider an image as a collage of symmetric segments, each of which can be well represented by its projection data along a single orientation. A quadtree-based method is adopted to decompose an image into variable-sized segments according to the complexity within each segment. We also apply the PIR to image compression and propose an efficient algorithm, the quadtree-structured projection vector quantization (QTPVQ), which combines the PIR with VQ. As VQ is carried out on the projection data instead of the pixel intensities of the segment, the QTPVQ successfully overcomes drawbacks of conventional VQ algorithms such as blocking artifacts and the difficulty of handling large vector dimensions. Above all, the QTPVQ greatly improves subjective quality, especially at low bit rates, which makes it applicable to low bit rate image coding.
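
    To make the notion of representing a segment by a single-orientation projection more concrete, the sketch below uses a block that is constant along the vertical direction. The choice of column averages as the projection and the function names are assumptions for illustration only; the paper's projection and quadtree decomposition are not reproduced here.

```python
def column_projection(block):
    """Average each column, turning an N x N block into N values."""
    n_rows = len(block)
    n_cols = len(block[0])
    return [sum(row[j] for row in block) / n_rows for j in range(n_cols)]

def back_project(projection, n_rows):
    """Rebuild a block by repeating the projection along every row."""
    return [list(projection) for _ in range(n_rows)]

# A 4 x 4 block containing a vertical edge: every row is identical.
block = [[10, 10, 200, 200] for _ in range(4)]

projection = column_projection(block)   # 4 values instead of 16
reconstructed = back_project(projection, len(block))
```

    For such a block the 4-element projection reconstructs all 16 pixels exactly, so a vector quantizer would work on vectors of dimension N rather than N × N.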

  • Multisegment Multiple VQ Codebooks-Based Speaker Independent Isolated-Word Recognition Using Unbiased Mel Cepstrum

    Liang ZHOU  Satoshi IMAI  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E78-D No:9
      Page(s):
    1178-1187

    In this paper, we propose a new approach to speaker-independent isolated-word speech recognition using multisegment multiple vector quantization (VQ) codebooks. In this approach, separate multisegment multiple VQ codebooks are designed for each word in the recognition vocabulary: the word is divided equally into multiple segments, whose number is correlated with the number of syllables or phonemes in the word, and two individual VQ codebooks, for instantaneous and transitional speech features respectively, are designed for each segment. Using this approach, the influence of within-word coarticulation can be minimized, the time-sequence information of speech can be used, and differences in word length across the vocabulary and variations in speaking rate can be accommodated automatically. Moreover, mel-cepstral coefficients based on unbiased estimation of log spectrum (UELS) are used, and a comparison experiment with LPC-derived mel-cepstral coefficients is made. Recognition experiments using testing databases consisting of 100 Japanese words (Waseda database) and 216 phonetically balanced words (ATR database) confirmed the effectiveness of the new method and the new speech features. The approach is described, its computational complexity and memory requirements are analyzed, and the experimental results are presented.

  • A Comparative Study of Output Probability Functions in HMMs

    Seiichi NAKAGAWA  Li ZHAO  Hideyuki SUZUKI  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    669-675

    One of the most effective methods in speech recognition is the HMM, which models speech statistically. Discrete-distribution and continuous-distribution HMMs have been widely used in various applications. In recent years, however, HMMs with various output probability functions have been proposed to further improve recognition performance, e.g. the Gaussian mixture continuous and the semi-continuous distributed HMMs. We have also recently proposed the RBF (radial basis function)-based HMM and the VQ-distortion based HMM, which use an RBF function and a VQ-distortion measure at each state instead of the output probability density function used by traditional HMMs. In this paper, we describe the RBF-based HMM and the VQ-distortion based HMM and compare them with the discrete distributed, the Gaussian mixture distributed, and the semi-continuous distributed HMMs in terms of recognition rates in experiments on speaker-independent spoken digit recognition. Our results confirmed that the RBF-based and VQ-distortion based HMMs are more robust than, and superior to, traditional HMMs.

  • Image Compression by Vector Quantization of Projection Data

    Hee Bok PARK  Choong Woong LEE  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E75-D No:1
      Page(s):
    148-155

    In this paper, we present a new image compression scheme, Projection-VQ, based on reconstruction from vector-quantized projections. Projection-VQ can easily handle larger blocks than conventional VQ schemes, because the dimension of the vectors in the projection domain is, in general, much smaller than that in the spatial domain. In Projection-VQ, the image can be reconstructed without destroying edge sharpness, since the projection data carrying the edge information are preferentially transmitted. There are several good algorithms for reconstructing an image from projections. However, we use a new modified reconstruction algorithm suitable for a variable bit rate image coding system. We allocate the bits depending on the characteristics of the block images. Our simulation results show that the proposed scheme outperforms ordinary VQ schemes in PSNR, and that the improvement in subjective image quality is substantial.
