
Keyword Search Result

[Keyword] vector quantization (101 hits)

1-20 of 101 hits

  • Vector Quantization of Speech Spectrum Based on the VQ-VAE Embedding Space Learning by GAN Technique

    Tanasan SRIKOTR  Kazunori MANO  

     
    PAPER-Speech and Hearing, Digital Signal Processing

    Publicized: 2021/09/30  Vol: E105-A No:4  Page(s): 647-654

    The spectral envelope parameter is a speech parameter that significantly affects vocoder quality. The Vector Quantized Variational AutoEncoder (VQ-VAE) is a recent state-of-the-art end-to-end quantization method based on a deep learning model. This paper proposes a new technique, called VQ-VAE-EMGAN, that improves the embedding space learning of VQ-VAE with a Generative Adversarial Network for quantizing the spectral envelope parameter. In the experiments, we designed the quantizer for the spectral envelope parameters of the WORLD vocoder extracted from 16 kHz speech waveforms. The results show that the proposed technique reduces the Log Spectral Distortion (LSD) by around 0.5 dB and increases the PESQ by around 0.17 on average over the four target bit operations compared to the conventional VQ-VAE.
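
    The VQ-VAE-EMGAN model itself is not specified in the abstract; as a rough, hedged illustration of the core VQ-VAE quantization step it builds on, the Python sketch below maps each encoder output vector to its nearest codebook embedding. The codebook size and embedding dimension are arbitrary assumptions, not values from the paper.

      import numpy as np

      def vq_vae_quantize(z_e, codebook):
          """Map each encoder output vector to its nearest codebook embedding.

          z_e      : (num_frames, dim) encoder outputs, e.g. spectral-envelope encodings
          codebook : (num_codewords, dim) learned embedding vectors
          returns  : quantized vectors z_q and the chosen indices
          """
          # Squared Euclidean distance between every frame and every codeword.
          dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
          indices = dists.argmin(axis=1)      # nearest-codeword index per frame
          z_q = codebook[indices]             # quantized embeddings
          return z_q, indices

      # Toy usage with arbitrary sizes (hypothetical values, not from the paper).
      rng = np.random.default_rng(0)
      codebook = rng.normal(size=(256, 64))   # 256 codewords, 64-dim embeddings
      z_e = rng.normal(size=(100, 64))        # 100 frames of encoder output
      z_q, idx = vq_vae_quantize(z_e, codebook)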

  • Vector Quantization of High-Dimensional Speech Spectra Using Deep Neural Network

    JianFeng WU  HuiBin QIN  YongZhu HUA  LiHuan SHAO  Ji HU  ShengYing YANG  

     
    LETTER-Artificial Intelligence, Data Mining

    Publicized: 2019/07/02  Vol: E102-D No:10  Page(s): 2047-2050

    This paper proposes a deep neural network (DNN) based framework to address the problem of vector quantization (VQ) for high-dimensional data. The main challenge of applying a DNN to VQ is how to reduce the binary coding error of the auto-encoder when the distribution of the coding units is far from binary. To address this problem, three fine-tuning methods are adopted: 1) adding Gaussian noise to the input of the coding layer, 2) forcing the output of the coding layer to be binary, and 3) adding a non-binary penalty term to the loss function. These fine-tuning methods were extensively evaluated on quantizing speech magnitude spectra. The results demonstrate that each of the methods improves the coding performance. When quantizing 968-dimensional speech spectra using only 18 bits, the DNN-based VQ framework achieved an average PESQ of about 2.09, which is far beyond the capability of conventional VQ methods.
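
    The network architecture and loss weights are not given in the abstract; as a hedged sketch of two of the fine-tuning ideas, the snippet below implements an assumed non-binary penalty of the form h*(1-h) on the coding-layer activations (method 3) and Gaussian noise added to the coding-layer input (method 1). The penalty form, noise level, and loss weight are illustrative assumptions.

      import numpy as np

      def non_binary_penalty(code_activations):
          """Penalty that is zero when activations are exactly 0 or 1 and largest
          near 0.5, pushing the coding layer toward binary codes.
          (Assumed form h*(1-h); the paper may use a different penalty.)"""
          h = code_activations
          return np.mean(h * (1.0 - h))

      def noisy_coding_input(pre_activations, sigma=0.3, rng=None):
          """Fine-tuning method 1: add Gaussian noise before the coding layer so
          the nonlinearity is driven toward saturation. sigma is an assumption."""
          rng = np.random.default_rng() if rng is None else rng
          return pre_activations + rng.normal(scale=sigma, size=pre_activations.shape)

      # Toy usage: total loss = reconstruction error + lambda * penalty.
      rng = np.random.default_rng(1)
      h = 1.0 / (1.0 + np.exp(-noisy_coding_input(rng.normal(size=(4, 18)), rng=rng)))
      loss_penalty = 0.01 * non_binary_penalty(h)   # lambda = 0.01 is arbitrary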

  • Rapid Generation of the State Codebook in Side Match Vector Quantization

    Hanhoon PARK  Jong-Il PARK  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2017/05/16  Vol: E100-D No:8  Page(s): 1934-1937

    Side match vector quantization (SMVQ) was originally developed for image compression and is also useful for steganography. SMVQ requires creating its own state codebook for each block in both the encoding and decoding phases. Since the conventional method of state codebook generation is extremely time-consuming, this letter proposes a fast generation method. The proposed method is tens of times faster than the conventional one without loss of perceptual visual quality.
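
    The letter's fast method is not detailed in the abstract; as a hedged sketch of the conventional state-codebook construction it speeds up, the snippet below ranks master-codebook codewords by their side-match distortion against the bottom row of the upper neighbor block and the right column of the left neighbor block, and keeps the best few as the state codebook. The block size and state-codebook size are assumptions.

      import numpy as np

      def state_codebook(master_codebook, upper_block, left_block, state_size=8):
          """Conventional SMVQ state-codebook selection (sketch).

          master_codebook : (N, k, k) codewords as k x k pixel blocks
          upper_block     : (k, k) already-decoded block above the current block
          left_block      : (k, k) already-decoded block to its left
          Returns the state_size codewords with the smallest side-match distortion.
          """
          top_border = upper_block[-1, :]     # row adjacent to the current block
          left_border = left_block[:, -1]     # column adjacent to the current block
          # Side-match distortion: mismatch of each codeword's first row/column
          # against the neighboring borders.
          d = (((master_codebook[:, 0, :] - top_border) ** 2).sum(axis=1)
               + ((master_codebook[:, :, 0] - left_border) ** 2).sum(axis=1))
          best = np.argsort(d)[:state_size]
          return master_codebook[best]

      # Toy usage with a random 256-codeword master codebook of 4x4 blocks.
      rng = np.random.default_rng(2)
      cb = rng.integers(0, 256, size=(256, 4, 4)).astype(float)
      sc = state_codebook(cb, cb[0], cb[1])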

  • Performance of Partitioned Vector Quantization with Optimized Feedback Budget Allocation

    Mirza Golam KIBRIA  Hidekazu MURATA  Susumu YOSHIDA  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

    Vol: E97-B No:6  Page(s): 1184-1194

    This study analyzes the performance of a downlink beamformer with partitioned vector quantization under optimized feedback budget allocation. A multiuser multiple-input single-output downlink precoding system with perfect channel state information at the mobile stations is considered. The numbers of feedback bits allocated to the channel quality indicator (CQI) and the channel direction indicator (CDI) of each partition are optimized by exploiting the quantization mean square error. In addition, the effects of equal and unequal partitioning on codebook memory and system capacity are studied and elucidated through simulations. The results show that with optimized CQI-CDI allocation, the feedback budget distributions of equal or unequal partitions are proportional to the size ratios of the partitioned subvectors. Furthermore, it is observed that for large partitions, the ratio of optimal CDI to CQI bits is much higher than that for small partitions.

  • Adaptive Spectral Masking of AVQ Coding and Sparseness Detection for ITU-T G.711.1 Annex D and G.722 Annex B Standards

    Masahiro FUKUI  Shigeaki SASAKI  Yusuke HIWASAKI  Kimitaka TSUTSUMI  Sachiko KURIHARA  Hitoshi OHMURO  Yoichi HANEDA  

     
    PAPER-Speech and Hearing

    Vol: E97-D No:5  Page(s): 1264-1272

    We propose a new adaptive spectral masking method of algebraic vector quantization (AVQ) for non-sparse signals in the modified discrete cosine transform (MDCT) domain. This paper also proposes switching the adaptive spectral masking on and off depending on whether or not the target signal is non-sparse. The switching decision is based on the results of an MDCT-domain sparseness analysis. When the target signal is categorized as non-sparse, the masking level of the target MDCT coefficients is adaptively controlled using spectral envelope information. The performance of the proposed method, as a part of ITU-T G.711.1 Annex D, is evaluated in comparison with conventional AVQ. Subjective listening test results show that the proposed method improves sound quality by more than 0.1 points on a five-point scale on average for speech, music, and mixed content, which indicates a significant improvement.

  • Bit-Plane Coding of Lattice Codevectors

    Wisarn PATCHOO  Thomas R. FISCHER  

     
    LETTER-Coding Theory

    Vol: E96-A No:8  Page(s): 1817-1820

    In a sign-magnitude representation of binary lattice codevectors, only a few least significant bit-planes are constrained due to the structure of the lattice, while there is no restriction on other more significant bit-planes. Hence, any convenient bit-plane coding method can be used to encode the lattice codevectors, with modification required only for the lattice-defining, least-significant bit-planes. Simple encoding methods for the lattice-defining bit-planes of the D4, RE8, and Barnes-Wall 16-dimensional lattices are described. Simulation results for the encoding of a uniform source show that standard bit-plane coding together with the proposed encoding provide about the same performance as integer lattice vector quantization when the bit-stream is truncated. When the entire bit-stream is fully decoded, the granular gain of the lattice is realized.
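
    The letter's bit-plane coder itself is not reproduced in the abstract; as a small hedged illustration of the lattice-defining least-significant bit-plane it mentions, the sketch below checks D4 membership (integer vectors of length 4 with an even coordinate sum) and shows that the four magnitude LSBs of a D4 codevector always have even parity, so one of them is redundant. The example vector is arbitrary.

      import numpy as np

      def is_d4(v):
          """D4 lattice membership: length-4 integer vector with even coordinate sum."""
          v = np.asarray(v, dtype=int)
          return v.shape == (4,) and (v.sum() % 2 == 0)

      def lsb_plane(v):
          """Least-significant bit-plane of the magnitudes in a sign-magnitude
          representation; for a D4 codevector these four bits have even parity,
          so the fourth LSB is determined by the other three."""
          mags = np.abs(np.asarray(v, dtype=int))
          return mags & 1

      v = np.array([2, -1, 3, 0])          # 2 - 1 + 3 + 0 = 4, even, so v is in D4
      assert is_d4(v)
      bits = lsb_plane(v)                  # [0, 1, 1, 0]; parity of the bits is even
      assert bits.sum() % 2 == 0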

  • Development of a Robust and Compact On-Line Handwritten Japanese Text Recognizer for Hand-Held Devices

    Jinfeng GAO  Bilan ZHU  Masaki NAKAGAWA  

     
    PAPER-Pattern Recognition

    Vol: E96-D No:4  Page(s): 927-938

    This paper describes how a robust and compact on-line handwritten Japanese text recognizer was developed by compressing each component of an integrated text recognition system, including an SVM classifier that evaluates segmentation points, an on-line and off-line combined character recognizer, a linguistic context processor, and a geometric context evaluation module, in order to deploy it on hand-held devices. Selecting an elastic-matching based on-line recognizer and compressing MQDF2 via a combination of LDA, vector quantization, and data type transformation have contributed to building a remarkably small yet robust recognizer. The compact text recognizer, covering 7,097 character classes, requires only about 15 MB of memory to maintain 93.11% accuracy on horizontal text lines extracted from the TUAT Kondate database. Compared with the original full-scale Japanese text recognizer, the memory size is reduced from 64.1 MB to 14.9 MB while the accuracy loss is only 0.5%, from 93.6% to 93.11%. The method is scalable, so even systems of less than 11 MB or less than 6 MB still retain 92.80% or 90.02% accuracy, respectively.

  • A Method for Improving TIE-Based VQ Encoding Introducing RI Rules

    Chi-Jung HUANG  Shaw-Hwa HWANG  Cheng-Yu YEH  

     
    LETTER-Pattern Recognition

    Vol: E96-D No:1  Page(s): 151-154

    This study proposes an improvement to the Triangular Inequality Elimination (TIE) algorithm for vector quantization (VQ). The proposed approach uses recursive and intersection (RI) rules to complement and enhance the TIE algorithm. The recursive rule changes the reference codewords dynamically and produces the smallest candidate group. The intersection rule removes redundant codewords from these candidate groups. The RI-TIE approach avoids over-reliance on the continuity of the input signal. This study tests the contribution of the RI rules using the VQ-based G.729 standard LSP encoder and some classic images. The results show that the RI rules perform excellently within the TIE algorithm.
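
    The RI rules themselves are not detailed in the abstract; as a hedged sketch of the baseline TIE idea they build on, the snippet below skips codewords whose triangle-inequality lower bound, computed from precomputed inter-codeword distances and one reference distance, already exceeds the current best distance. The fixed reference codeword is a simplification; the letter's recursive rule would update the reference dynamically.

      import numpy as np

      def tie_search(x, codebook, cw_dists, ref_idx=0):
          """Nearest-codeword search with Triangular Inequality Elimination (sketch).

          cw_dists : (N, N) precomputed Euclidean distances between codewords
          ref_idx  : index of the reference codeword (fixed here for brevity)
          """
          d_ref = np.linalg.norm(x - codebook[ref_idx])
          best_idx, best_d = ref_idx, d_ref
          for j in range(len(codebook)):
              # Triangle inequality: d(x, c_j) >= |d(c_j, c_ref) - d(x, c_ref)|,
              # so c_j cannot beat the current best if that bound already exceeds it.
              if abs(cw_dists[j, ref_idx] - d_ref) >= best_d:
                  continue                  # eliminated without a distance computation
              d = np.linalg.norm(x - codebook[j])
              if d < best_d:
                  best_idx, best_d = j, d
          return best_idx, best_d

      rng = np.random.default_rng(3)
      cb = rng.normal(size=(128, 10))
      cw_dists = np.linalg.norm(cb[:, None] - cb[None, :], axis=-1)
      idx, dist = tie_search(rng.normal(size=10), cb, cw_dists)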

  • Designing Algebraic Trellis Vector Code as an Efficient Excitation Codebook for ACELP Coder

    Sungjin KIM  Sangwon KANG  

     
    LETTER-Multimedia Systems for Communications

    Vol: E95-B No:11  Page(s): 3642-3645

    In this paper, a block-constrained trellis coded vector quantization (BC-TCVQ) algorithm is combined with an algebraic codebook to produce an algebraic trellis vector code (ATVC) for use in ACELP coding. ATVC expands the set of allowed algebraic codebook pulse positions, and the trellis branches are labeled with subsets of these positions. The Viterbi algorithm is used to select the excitation codevector. A fast codebook search method using an efficient non-exhaustive search technique is also proposed to reduce the complexity of the ATVC search procedure while maintaining the quality of the reconstructed speech. The ATVC block code is used as the fixed codebook of AMR-NB (12.2 kbps), which reduces the computational complexity compared to the conventional algebraic codebook.

  • Scene Categorization with Classified Codebook Model

    Xu YANG  De XU  Songhe FENG  Yingjun TANG  Shuoyan LIU  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E94-D No:6  Page(s): 1349-1352

    This paper presents an efficient yet powerful codebook model, named the classified codebook model, for natural scene categorization. Current codebook models typically resort to a large codebook to obtain higher categorization performance, which severely limits their practical applicability. Our model formulates the codebook model with the theory of vector quantization, and thus uses the well-known technique of classified vector quantization for scene-category modeling. The significant feature of our model is that it benefits scene categorization, especially at small codebook sizes, while saving much of the computational cost of quantization. We evaluate the proposed model on a well-known challenging scene dataset, 15 Natural Scenes. The experiments demonstrate that our model decreases the computation time for codebook generation. Moreover, it achieves better performance for scene categorization, and the performance gain becomes more pronounced at small codebook sizes.
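
    The paper's class definitions are not given in the abstract; as a hedged sketch of classified vector quantization in general, the snippet below routes each feature vector to a sub-codebook chosen by a simple classifier and searches only within that sub-codebook, which is what keeps the quantization cheaper than searching one large codebook. The classifier rule and sub-codebook sizes are illustrative assumptions.

      import numpy as np

      def classify(feature):
          """Toy classifier routing a descriptor to one of two classes
          (hypothetical rule; the paper's classes are not specified here)."""
          return 0 if np.linalg.norm(feature) < 1.0 else 1

      def classified_vq(feature, sub_codebooks):
          """Classified VQ: quantize within the sub-codebook of the feature's class,
          so the search cost is the sub-codebook size, not the full codebook size."""
          c = classify(feature)
          cb = sub_codebooks[c]
          idx = np.argmin(((cb - feature) ** 2).sum(axis=1))
          return c, idx

      rng = np.random.default_rng(4)
      sub_codebooks = [rng.normal(size=(32, 128)), rng.normal(size=(32, 128))]
      cls, idx = classified_vq(rng.normal(size=128), sub_codebooks)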

  • Secret Image Transmission Scheme Using Secret Codebook

    Shih-Chieh SHIE  Ji-Han JIANG  Long-Tai CHEN  Zeng-Hui HUANG  

     
    LETTER-Image Processing and Video Processing

    Vol: E93-D No:2  Page(s): 399-402

    A secret image transmission scheme based on vector quantization (VQ) and a secret codebook is proposed in this article. The goal of this scheme is to transmit a set of good-quality images secretly via another high-quality cover image with the same image size. In order to reduce the data size of secret images, the images are encoded by an adaptive codebook. To guarantee the visual quality of secret images, the adaptive codebook applied at the transmitter is transmitted to the receiver secretly as well. Moreover, to enhance the security of the proposed scheme and to compact the data size of the codebook, the adaptive codebook is encoded based on VQ using another codebook generated from the cover image. Experiments show impressive results.

  • Shift-Invariant Sparse Image Representations Using Tree-Structured Dictionaries

    Makoto NAKASHIZUKA  Hidenari NISHIURA  Youji IIGUNI  

     
    PAPER-Image Processing

    Vol: E92-A No:11  Page(s): 2809-2818

    In this study, we introduce shift-invariant sparse image representations using tree-structured dictionaries. Sparse coding is a generative signal model that approximates signals by linear combinations of atoms in a dictionary. Since a sparsity penalty is introduced during signal approximation and dictionary learning, the dictionary represents the primal structures of the signals. Under the shift-invariance constraint, the dictionary comprises translated structuring elements (SEs). The computational cost and the number of atoms in the dictionary increase with the number of SEs. In this paper, we propose an algorithm for shift-invariant sparse image representation in which the SEs are learnt with a tree-structured approach. By using a tree-structured dictionary, we can reduce the computational cost of image decomposition to the logarithmic order of the number of SEs. We also present the results of our experiments on SE learning and the use of our algorithm in image recovery applications.

  • A Hybrid Technique for Thickness-Map Visualization of the Hip Cartilages in MRI

    Mahdieh KHANMOHAMMADI  Reza AGHAIEZADEH ZOROOFI  Takashi NISHII  Hisashi TANAKA  Yoshinobu SATO  

     
    PAPER-Biological Engineering

    Vol: E92-D No:11  Page(s): 2253-2263

    Quantification of the hip cartilages is clinically important. In this study, we propose an automatic technique for segmentation and visualization of the acetabular and femoral head cartilages based on clinically obtained multi-slice T1-weighted MR data and a hybrid approach. We follow a knowledge-based approach by employing several features, such as the anatomical shapes of the hip femoral and acetabular cartilages and the corresponding image intensities. We estimate the center of the femoral head by a Hough transform and then automatically select the volume of interest. We then automatically segment the hip bones by a self-adaptive vector quantization technique. Next, we localize the articular central line by a modified Canny edge detector based on first- and second-derivative filters along radial lines originating from the femoral head center, together with anatomical constraints. We then roughly segment the acetabular and femoral head cartilages using the derivative images obtained in the previous step and a top-hat filter. Final masks of the acetabular and femoral head cartilages are generated automatically by employing the rough results, the estimated articular central line, and anatomical knowledge. Next, we generate a thickness map for each cartilage in the radial direction based on the Euclidean distance. The three-dimensional pelvic bones, acetabular and femoral cartilages, and the corresponding thickness maps are overlaid and visualized. The techniques have been implemented in a C++ and MATLAB environment. We have evaluated and confirmed the usefulness of the proposed techniques on 40 clinical multi-slice MR images of hips.

  • Color Image Retrieval Based on Distance-Weighted Boundary Predictive Vector Quantization Index Histograms

    Zhen SUN  Zhe-Ming LU  Hao LUO  

     
    LETTER-Image Processing and Video Processing

    Vol: E92-D No:9  Page(s): 1803-1806

    This Letter proposes a new kind of feature for color image retrieval based on Distance-weighted Boundary Predictive Vector Quantization (DWBPVQ) index histograms. For each color image in the database, six histograms (two for each color component) are calculated from the six corresponding DWBPVQ index sequences. The retrieval simulation results show that, compared with the traditional spatial-domain color-histogram-based (SCH) features and the DCTVQ index histogram-based (DCTVQIH) features, the proposed DWBPVQIH features greatly improve recall and precision performance.

  • Color Image Classification Using Block Matching and Learning

    Kazuki KONDO  Seiji HOTTA  

     
    LETTER-Pattern Recognition

    Vol: E92-D No:7  Page(s): 1484-1487

    In this paper, we propose block matching and learning for color image classification. In our method, training images are partitioned into small blocks. A test image is also partitioned into small blocks, and mean blocks corresponding to each test block are calculated from neighboring training blocks. Our method classifies a test image into the class that has the smallest total sum of distances between mean blocks and test blocks. We also propose a learning method that reduces the memory requirement. Experimental results show that our classification outperforms other classifiers such as a support vector machine with bag of keypoints.
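
    The neighborhood definition and learning rule are not given in the abstract; as a hedged sketch of the block-matching classification idea, the snippet below partitions images into blocks, builds per-class mean blocks for each block position from that class's training images, and assigns the class with the smallest total block distance. The block size and the use of same-position blocks (rather than a spatial neighborhood) are simplifying assumptions.

      import numpy as np

      def to_blocks(img, k=8):
          """Partition an HxWxC image into non-overlapping k x k blocks."""
          h, w, c = img.shape
          img = img[:h - h % k, :w - w % k]
          return (img.reshape(h // k, k, w // k, k, c)
                     .swapaxes(1, 2).reshape(-1, k, k, c))

      def classify_image(test_img, train_imgs_by_class, k=8):
          """Assign the class whose mean blocks are closest to the test blocks in total."""
          test_blocks = to_blocks(test_img, k)
          best_class, best_score = None, np.inf
          for label, imgs in train_imgs_by_class.items():
              # Mean block at each position over this class's training images
              # (simplified: same-position blocks instead of a learned neighborhood).
              mean_blocks = np.mean([to_blocks(im, k) for im in imgs], axis=0)
              score = np.sum((test_blocks - mean_blocks) ** 2)
              if score < best_score:
                  best_class, best_score = label, score
          return best_class

      rng = np.random.default_rng(5)
      train = {0: [rng.random((32, 32, 3)) for _ in range(4)],
               1: [rng.random((32, 32, 3)) + 0.2 for _ in range(4)]}
      print(classify_image(rng.random((32, 32, 3)), train))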

  • Segmentation of Arteries in Minimally Invasive Surgery Using Change Detection

    Hamed AKBARI  Yukio KOSUGI  Kazuyuki KOJIMA  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E92-D No:3  Page(s): 498-505

    In laparoscopic surgery, the lack of tactile sensation and 3D visual feedback makes it difficult to identify the position of a blood vessel intraoperatively. An unintentional partial tear or complete rupture of a blood vessel may result in a serious complication; moreover, if the surgeon cannot manage this situation, open surgery will be necessary. Differentiating arteries from veins and other structures and detecting them independently has a variety of applications in surgical procedures involving the head, neck, lung, heart, abdomen, and extremities. We use the artery's pulsatile movement to detect arteries and differentiate them from veins. The change detection algorithm in this study uses edge detection for unsupervised image registration. Changed regions are identified by subtracting the systolic and diastolic images. As a post-processing step, region properties, including color average, area, major and minor axis lengths, perimeter, and solidity, are used as inputs to an LVQ (Learning Vector Quantization) network. The output consists of two object classes: arteries and non-artery regions. After post-processing, arteries can be detected in the laparoscopic field. The registration method used here is evaluated in comparison with other linear and nonlinear elastic methods. The performance of the method is evaluated for the detection of arteries in several laparoscopic surgeries on an animal model and on eleven human patients. The performance evaluation criteria are based on false negative and false positive rates. The algorithm is able to detect artery regions even in cases where the arteries are obscured by other tissues.
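
    The abstract does not give the LVQ variant or its parameters; as a hedged sketch of a basic LVQ1 classifier of the kind described (region-property features in, artery/non-artery label out), the snippet below moves the winning prototype toward a training sample of the same class and away otherwise. The learning rate, prototype counts, and synthetic feature data are assumptions.

      import numpy as np

      def train_lvq1(X, y, n_protos_per_class=2, lr=0.05, epochs=30, seed=0):
          """Basic LVQ1: prototypes move toward same-class samples, away otherwise."""
          rng = np.random.default_rng(seed)
          protos, labels = [], []
          for c in np.unique(y):
              idx = rng.choice(np.flatnonzero(y == c), n_protos_per_class, replace=False)
              protos.append(X[idx].copy())
              labels.append(np.full(n_protos_per_class, c))
          protos, labels = np.vstack(protos), np.concatenate(labels)
          for _ in range(epochs):
              for i in rng.permutation(len(X)):
                  w = np.argmin(((protos - X[i]) ** 2).sum(axis=1))   # winning prototype
                  step = lr * (X[i] - protos[w])
                  protos[w] += step if labels[w] == y[i] else -step
          return protos, labels

      def predict_lvq(x, protos, labels):
          return labels[np.argmin(((protos - x) ** 2).sum(axis=1))]

      # Toy usage: 6-dim region-property vectors (color average, area, axis lengths,
      # perimeter, solidity); labels 1 = artery, 0 = non-artery (synthetic data).
      rng = np.random.default_rng(6)
      X = np.vstack([rng.normal(0, 1, (50, 6)), rng.normal(2, 1, (50, 6))])
      y = np.concatenate([np.zeros(50, int), np.ones(50, int)])
      protos, labels = train_lvq1(X, y)
      print(predict_lvq(rng.normal(2, 1, 6), protos, labels))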

  • Global Motion Representation of Video Shot Based on Vector Quantization Index Histogram

    Fa-Xin YU  Zhe-Ming LU  Zhen LI  Hao LUO  

     
    LETTER-Image Processing and Video Processing

    Vol: E92-D No:1  Page(s): 90-92

    In this Letter, we propose a novel method of low-level global motion feature description based on Vector Quantization (VQ) index histograms of motion feature vectors (MFVVQIH) for video shot retrieval. The contribution lies in three aspects: first, we use VQ to eliminate singular points in the motion feature vector space; second, we use the global motion feature vector index histogram of a video shot as its global motion signature; third, video shot retrieval based on index histograms instead of the original motion feature vectors guarantees low computational complexity and thus enables real-time video shot retrieval. Experimental results show that the proposed scheme achieves high accuracy with low computational complexity.
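
    The codebook and similarity measure used in the Letter are not given in the abstract; as a hedged sketch of the index-histogram signature idea, the snippet below quantizes each frame's motion feature vector to its nearest codeword and uses the normalized histogram of indices as the shot signature, compared here with an L1 distance (an assumption). The codebook size and feature dimension are arbitrary.

      import numpy as np

      def vq_index_histogram(motion_features, codebook):
          """Quantize each motion feature vector and return the normalized
          histogram of codeword indices as the shot's global motion signature."""
          d = ((motion_features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
          idx = d.argmin(axis=1)
          hist = np.bincount(idx, minlength=len(codebook)).astype(float)
          return hist / hist.sum()

      def shot_distance(hist_a, hist_b):
          """L1 distance between index histograms (assumed similarity measure)."""
          return np.abs(hist_a - hist_b).sum()

      rng = np.random.default_rng(7)
      codebook = rng.normal(size=(64, 6))    # 64 codewords, 6-dim motion features
      shot_a = vq_index_histogram(rng.normal(size=(200, 6)), codebook)
      shot_b = vq_index_histogram(rng.normal(size=(150, 6)), codebook)
      print(shot_distance(shot_a, shot_b))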

  • Unitary Space Vector Quantization Codebook Design for Precoding MIMO System

    Ping WU  Lihua LI  Ping ZHANG  

     
    PAPER-Wireless Communication Technologies

    Vol: E91-B No:9  Page(s): 2917-2924

    In a codebook based precoding MIMO system, the precoding codebook largely determines the system performance. Consequently, it is crucial to design the precoding codebook, which depends on the channel fading, the number of antennas, the spatial correlation, and so on; specific channel conditions therefore call for their own optimum codebooks. In this paper, in order to obtain such optimum codebooks, a universal unitary space vector quantization (USVQ) codebook design criterion is provided, which can design optimum codebooks for various fading and spatially correlated channels with arbitrary antenna configurations. Furthermore, the unitary space K-mean (USK) algorithm is proposed to generate the USVQ codebook; it is iterative and convergent. Simulations show that the capacities of precoding MIMO schemes using USVQ codebooks are very close to those of ideal precoding and outperform those of schemes using traditional Grassmannian codebooks and the 3GPP LTE DFT (discrete Fourier transform) codebooks.

  • Initial Codebook Algorithm of Vector Quantization

    ShanXue CHEN  FangWei LI  WeiLe ZHU  TianQi ZHANG  

     
    LETTER-Algorithm Theory

    Vol: E91-D No:8  Page(s): 2189-2191

    A simple and successful design of the initial codebook for vector quantization (VQ) is presented. In existing initial codebook algorithms, such as the random method, the initial codebook is strongly influenced by the selection of initial codewords and is difficult to match to the features of the training vectors. In the proposed method, the training vectors are sorted according to their norms. The ordered vectors are then partitioned into N groups, where N is the size of the codebook. The initial codewords are obtained by calculating the centroid of each group. This initialization method has robust performance and can be combined with the VQ algorithm to further improve the quality of the codebook.
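
    As a minimal sketch of the initialization described above (sort by norm, partition into N groups, take centroids), assuming Euclidean norms and equal-sized groups, the Python snippet below produces an initial codebook that can then be refined by a standard VQ design algorithm such as LBG. The training set size and dimension are arbitrary.

      import numpy as np

      def initial_codebook_by_norm(training_vectors, N):
          """Initial codebook: sort training vectors by norm, split the ordered list
          into N equal-sized groups, and use each group's centroid as a codeword."""
          order = np.argsort(np.linalg.norm(training_vectors, axis=1))
          groups = np.array_split(training_vectors[order], N)
          return np.vstack([g.mean(axis=0) for g in groups])

      # Toy usage: 1000 training vectors of dimension 16, codebook size 64.
      rng = np.random.default_rng(8)
      train = rng.normal(size=(1000, 16))
      codebook0 = initial_codebook_by_norm(train, 64)   # feed this to LBG/k-means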

  • Design of Asymmetric VQ Codebooks Incorporating Channel Coding

    Jong-Ki HAN  Jae-Gon KIM  

     
    PAPER-Communication Theory and Signals

    Vol: E91-A No:8  Page(s): 2195-2204

    In this paper, a communication system using vector quantization (VQ) and channel coding is considered. A design scheme is proposed to optimize the source codebooks at the transmitter and the receiver. In the proposed algorithm, the overall distortion, including both the quantization error and the channel distortion, is minimized. The proposed algorithm differs from previous work in that a channel encoder is used in the VQ-based communication system and the source VQ codebook used at the transmitter differs from the one used at the receiver, i.e., an asymmetric VQ system. The bounded-distance decoding (BDD) technique is used to resolve ambiguity in the channel decoder. Computer simulations show that the optimized system based on the proposed algorithm outperforms a conventional system based on a symmetric VQ codebook. The proposed algorithm also enables reliable image communication over noisy channels.
