Author Search Result

[Author] Ching-Tang HSIEH (6 hits)

  • Generalized Fuzzy Kohonen Clustering Networks

    Ching-Tang HSIEH  Chieh-Ching CHIN  Kuang-Ming SHEN  

     
    PAPER-Neural Networks/Signal Processing/Information Storage
    Vol: E81-A No:10  Page(s): 2144-2150

    A fuzzy Kohonen clustering network was proposed that integrates the fuzzy c-means (FCM) model into the learning rate and updating strategies of the Kohonen network. This yields an optimization problem related to FCM, and the numerical results show improved convergence as well as reduced labeling error. However, because the clusters may be either hyperspherical or hyperellipsoidal in shape, we use a generalized objective function involving a collection of linear varieties. In this way the model is distributed and consists of a series of `local' linear-type models (based on the revealed clusters). We propose a method to generalize the fuzzy Kohonen clustering networks. Anderson's IRIS data and an artificial data set are used to illustrate this method, and the results are compared with the standard Kohonen approach and the fuzzy Kohonen clustering networks.
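A minimal sketch of the fuzzy Kohonen idea summarized above, assuming the standard batch formulation in which the FCM memberships drive the per-sample learning rates of the prototype update. The fuzzifier m, the toy data and all variable names are illustrative; the paper's generalized linear-variety objective is not reproduced here.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0, eps=1e-9):
    """Fuzzy c-means memberships u[i, k] of sample k in cluster i."""
    d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + eps   # (c, n)
    power = 2.0 / (m - 1.0)
    return 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** power, axis=1)

def fuzzy_kohonen_step(X, V, m=2.0):
    """One batch update: the fuzzified memberships u**m act as learning rates."""
    u = fcm_memberships(X, V, m)
    alpha = u ** m                                     # (c, n) learning rates
    V_new = (alpha @ X) / alpha.sum(axis=1, keepdims=True)
    return V_new, u

# Toy usage: two Gaussian blobs, two prototypes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
V = X[rng.choice(len(X), 2, replace=False)]
for _ in range(20):
    V, u = fuzzy_kohonen_step(X, V)
print(V)        # prototypes converge near the two blob centers
```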

  • Continuous Speech Segmentation Based on a Self-Learning Neuro-Fuzzy System

    Ching-Tang HSIEH  Mu-Chun SU  Chih-Hsu HSU  

     
    PAPER
    Vol: E79-A No:8  Page(s): 1180-1187

    Speech segmentation plays an important role in reducing the memory requirements and computational complexity of a large-vocabulary continuous speech recognition system. In this paper, we formulate speech segmentation as a two-phase problem. Phase 1 (frame labeling) labels the frames of the speech data: frames are classified into three types, (1) silence, (2) consonant and (3) vowel, according to two segmentation features. In phase 2 (syllabic unit segmentation) we apply the concept of transition states to segment the continuous speech data into syllabic units based on the labeled frames. A novel class of hyperrectangular composite neural networks (HRCNNs) is used to cluster the frames. The HRCNNs integrate the rule-based approach and the neural network paradigm; this hybrid system can therefore neutralize the disadvantages of each alternative. The parameters of the trained HRCNNs are utilized to extract both crisp and fuzzy classification rules. In our experiments, a database containing continuous reading-rate Mandarin speech recorded from newscasts was used to illustrate the performance of the proposed speaker-independent speech segmentation system. The effectiveness of the proposed segmentation system is confirmed by the experimental results.
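A minimal sketch of phase 1 (frame labeling), assuming short-time energy and zero-crossing rate as stand-in frame features with hand-picked thresholds; the paper itself performs this step with a trained hyperrectangular composite neural network, which is not reproduced here.

```python
import numpy as np

def label_frames(signal, sr, frame_ms=20, energy_sil=1e-4, zcr_cons=0.15):
    """Label each frame as 'silence', 'consonant' or 'vowel'."""
    frame_len = int(sr * frame_ms / 1000)
    labels = []
    for i in range(len(signal) // frame_len):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        energy = np.mean(frame ** 2)                          # short-time energy
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2    # zero-crossing rate
        if energy < energy_sil:
            labels.append('silence')
        elif zcr > zcr_cons:
            labels.append('consonant')    # noise-like, high-ZCR frames
        else:
            labels.append('vowel')        # voiced, low-ZCR, high-energy frames
    return labels

# Toy usage: 0.2 s of noise (consonant-like) followed by 0.2 s of a 200 Hz tone.
sr = 16000
t = np.arange(int(0.2 * sr)) / sr
x = np.concatenate([0.05 * np.random.randn(len(t)), 0.5 * np.sin(2 * np.pi * 200 * t)])
labels = label_frames(x, sr)
print(labels[:3], labels[-3:])            # mostly 'consonant' first, 'vowel' last
```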

  • Progressive Image Inpainting Based on Wavelet Transform

    Yen-Liang CHEN  Ching-Tang HSIEH  Chih-Hsu HSU  

     
    PAPER-Image Coding
    Vol: E88-A No:10  Page(s): 2826-2834

    Current automatic image inpainting methods emphasize either global or local inpainting techniques and do not let the merits of the two compensate for each other. Artists, by contrast, first repair an image from a global view and then focus on its local features. This paper proposes a progressive image inpainting method based on multi-resolution analysis that imitates these artistic techniques in damaged and defective areas to approach the effectiveness of image inpainting in human vision. First, we use the multi-resolution characteristics of the wavelet transform to analyze the image progressively from the global to the local level, from the lowest spatial-frequency layer to the higher ones. Then, we use the variance of the energy of the wavelet coefficients within each image block to decide the inpainting priority of the blocks. Finally, we extract the multi-resolution features of each block, taking the correlation among the horizontal, vertical and diagonal directions into account, to determine the strategy for filling in image pixels and to approximate high-quality image inpainting as perceived by human vision. In our experiments, the performance of the proposed method is superior to that of existing methods.
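A minimal sketch of the block-priority step described above, assuming an 8x8 block size and a Haar wavelet (both illustrative): the image is decomposed with a 2-D wavelet transform and blocks are ranked by the variance of their detail-band energy, so that high-activity blocks are inpainted first. The feature extraction and pixel-filling stages are not reproduced here.

```python
import numpy as np
import pywt

def block_priority(image, block=8, wavelet='haar', level=2):
    """Return block coordinates sorted by descending variance of wavelet energy."""
    coeffs = pywt.wavedec2(image.astype(float), wavelet, level=level)
    energy = np.zeros(image.shape, dtype=float)
    for detail in coeffs[1:]:                     # skip the approximation band
        for band in detail:                       # (cH, cV, cD) at this level
            scale = (image.shape[0] // band.shape[0], image.shape[1] // band.shape[1])
            up = np.kron(band ** 2, np.ones(scale))     # back to full resolution
            energy += up[:image.shape[0], :image.shape[1]]
    scores = {}
    for r in range(0, image.shape[0], block):
        for c in range(0, image.shape[1], block):
            scores[(r, c)] = energy[r:r + block, c:c + block].var()
    return sorted(scores, key=scores.get, reverse=True)

# Toy usage on a random 64x64 "image".
img = np.random.rand(64, 64)
print(block_priority(img)[:3])                    # three highest-priority blocks
```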

  • A Robust Speaker Identification System Based on Wavelet Transform

    Ching-Tang HSIEH  You-Chuang WANG  

     
    PAPER
    Vol: E84-D No:7  Page(s): 839-846

    A new approach for extracting the significant characteristics of a speech signal that distinguish individual speakers is presented. Based on the multiresolution property of the wavelet transform, the quadrature mirror filters (QMFs) derived by Daubechies are used to decompose the input signal into several frequency channels. Owing to the decorrelation between the resolutions produced by the QMFs, the linear predictive coding cepstrum (LPCC) of the lower-frequency region and the entropy of the higher-frequency region are calculated as the speech feature vector at each decomposition step. In addition, a hard-thresholding technique applied to the lower resolution at each decomposition step removes the effect of noise interference. The experimental results show that this mechanism not only effectively reduces the effect of noise interference but also improves the recognition rate. The proposed feature extraction algorithm is evaluated on the MAT telephone speech database for text-independent speaker identification using vector quantization (VQ). Some popular existing methods are also evaluated for comparison. Experimental results show that the proposed method is more effective and robust than the existing methods: for 80 speakers and 2-second utterances, the identification rate is 98.52%. In addition, the performance of our method remains satisfactory even at low SNR.
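A minimal sketch of the subband feature extraction described above, assuming PyWavelets' Daubechies filters as the QMF pair and a textbook Levinson-Durbin / LPC-cepstrum recursion; the frame length, wavelet and model order are illustrative, and the paper's hard thresholding and VQ back end are omitted.

```python
import numpy as np
import pywt

def lpc_coeffs(frame, order):
    """LPC polynomial [1, a1, ..., ap] via the autocorrelation (Levinson-Durbin) method."""
    r = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a[1:i + 1] += k * a[i - 1::-1]
        err *= 1.0 - k * k
    return a

def lpcc(frame, order=12):
    """LPC cepstral coefficients c1..cp derived from the LPC polynomial."""
    a = lpc_coeffs(frame, order)
    c = np.zeros(order + 1)
    for n in range(1, order + 1):
        c[n] = -a[n] - sum(k / n * c[k] * a[n - k] for k in range(1, n))
    return c[1:]

def band_entropy(coeffs, eps=1e-12):
    """Shannon entropy of the normalized energy of a subband."""
    p = coeffs ** 2
    p = p / (p.sum() + eps)
    return -np.sum(p * np.log(p + eps))

def subband_features(frame, wavelet='db4', levels=2, order=12):
    """LPCC of the low band plus entropy of the high band at each decomposition level."""
    feats, approx = [], frame
    for _ in range(levels):
        approx, detail = pywt.dwt(approx, wavelet)
        feats.append(np.concatenate([lpcc(approx, order), [band_entropy(detail)]]))
    return np.concatenate(feats)

# Toy usage on a 20 ms frame of a synthetic speech-like signal.
sr = 8000
t = np.arange(int(0.02 * sr)) / sr
frame = np.sin(2 * np.pi * 300 * t) + 0.1 * np.random.randn(len(t))
print(subband_features(frame).shape)      # (levels * (order + 1),) = (26,)
```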

  • A Novel Bandelet-Based Image Inpainting

    Kuo-Ming HUNG  Yen-Liang CHEN  Ching-Tang HSIEH  

     
    PAPER-Image Coding and Processing
    Vol: E92-A No:10  Page(s): 2471-2478

    This paper proposes a novel image inpainting method based on the bandelet transform. The technique performs image restoration over multi-resolution layers and mainly utilizes the geometrical flow of the texture neighboring the damaged regions as the basis of restoration. By performing a warp transform along the geometrical flow, it maps the textural variation onto the nearby domain axis; the bandelet decomposition then separates the uncorrelated textures into different bands, which are combined with an affine search method to perform the restoration. The experimental results show that the proposed method simplifies the repair decision process and improves the perceived (HVS) quality of the repaired results, both for images containing sharply changing contours and for texture images with high-frequency variation. These repair results reach state-of-the-art quality.
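A minimal sketch of one ingredient of the approach above: estimating the dominant geometrical-flow direction of a texture block from its gradient structure tensor. The warp transform, the bandelet decomposition itself and the affine search are not reproduced here, and the structure-tensor estimate is only a stand-in for however the paper computes the flow.

```python
import numpy as np

def flow_direction(block):
    """Dominant texture orientation (radians) of an image block."""
    gy, gx = np.gradient(block.astype(float))
    # 2x2 structure tensor accumulated over the block.
    jxx, jyy, jxy = np.sum(gx * gx), np.sum(gy * gy), np.sum(gx * gy)
    # Orientation of maximal gradient variation.
    theta = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    return theta + np.pi / 2.0        # the flow runs along the edges, not across them

# Toy usage: stripes running along the 135-degree diagonal give a flow near 135 degrees.
y, x = np.mgrid[0:32, 0:32]
stripes = np.sin((x + y) * 0.5)
print(np.degrees(flow_direction(stripes)))
```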

  • Robust Speaker Identification System Based on Multilayer Eigen-Codebook Vector Quantization

    Ching-Tang HSIEH  Eugene LAI  Wan-Chen CHEN  

     
    PAPER
    Vol: E87-D No:5  Page(s): 1185-1193

    This paper presents several effective methods for improving the performance of a speaker identification system. Based on the multiresolution property of the wavelet transform, the input speech signal is decomposed into various frequency subbands so that noise distortions are not spread over the entire feature space. To capture the characteristics of the vocal tract, the linear predictive cepstral coefficients (LPCC) of the lower-frequency subband are calculated at each decomposition step. In addition, a hard-thresholding technique applied to the lower-frequency subband at each decomposition step eliminates the effect of noise interference. Furthermore, cepstral-domain feature vector normalization is applied to all computed features in order to provide similar parameter statistics in all acoustic environments. To utilize all of these multiband speech features effectively, we propose a modified vector quantization scheme as the identifier. This model uses a multilayer concept to eliminate the interference among the multiband speech features and then uses principal component analysis (PCA) to derive the codebooks, capturing a more detailed distribution of the speaker's phoneme characteristics. The proposed method is evaluated using the KING speech database for text-independent speaker identification. Experimental results show that the recognition performance of the proposed method is better than that of vector quantization (VQ) and the Gaussian mixture model (GMM) using full-band LPCC and mel-frequency cepstral coefficient (MFCC) features in both clean and noisy environments. Satisfactory performance is also achieved in low-SNR environments.
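A minimal sketch of the PCA-plus-VQ identification idea described above: each speaker's feature vectors are projected with PCA and quantized with a per-speaker codebook, and a test utterance is assigned to the speaker whose codebook gives the lowest average distortion. The paper's multilayer structure treats each subband separately; the codebook size, PCA dimension and synthetic data here are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def train_speaker_model(features, n_codes=16, n_components=8):
    """Fit a PCA projection and a VQ codebook on one speaker's feature vectors."""
    pca = PCA(n_components=n_components).fit(features)
    codebook = KMeans(n_clusters=n_codes, n_init=10).fit(pca.transform(features))
    return pca, codebook

def distortion(features, model):
    """Average distance of the projected features to their nearest codeword."""
    pca, codebook = model
    d = codebook.transform(pca.transform(features))    # distances to all codewords
    return d.min(axis=1).mean()

def identify(features, models):
    """Return the speaker whose model yields the lowest distortion."""
    return min(models, key=lambda spk: distortion(features, models[spk]))

# Toy usage with two synthetic "speakers".
rng = np.random.default_rng(1)
train = {'spk_a': rng.normal(0, 1, (400, 13)), 'spk_b': rng.normal(2, 1, (400, 13))}
models = {spk: train_speaker_model(X) for spk, X in train.items()}
test = rng.normal(2, 1, (50, 13))
print(identify(test, models))                          # expected: 'spk_b'
```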