The search functionality is under construction.

Keyword Search Result

[Keyword] Ti(30728hit)

20261-20280hit(30728hit)

  • Recognition of Continuous Korean Sign Language Using Gesture Tension Model and Soft Computing Technique

    Jung-Bae KIM  Zeungnam BIEN  

     
    LETTER-Human-computer Interaction

      Vol:
    E87-D No:5
      Page(s):
    1265-1270

    We present a method for recognition of continuous Korean Sign Language (KSL). In the paper, we consider the segmentation problem of a continuous hand motion pattern in KSL. For this, we first extract sign sentences by removing linking gestures between sign sentences. We use a gesture tension model and fuzzy partitioning. Then, each sign sentence is disassembled into a set of elementary motions (EMs) according to its geometric pattern. The hidden Markov model is adopted to classify the segmented individual EMs.

  • F0 Dynamics in Singing: Evidence from the Data of a Baritone Singer

    Hiroki MORI  Wakana ODAGIRI  Hideki KASUYA  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1086-1092

    Transitional fundamental frequency (F0) characteristics comprise a crucial part of F0 dynamics in singing. This paper examines the F0 characteristics during the note transition period. An analysis of the singing voice of a professional baritone strongly suggests that asymmetries exist in the mechanisms used for controlling rising and falling. Specifically, the F0 contour in rising transitions can be modeled as a step response from a critically-damped second-order linear system with fixed average/maximum speed of change, whereas that in falling transitions can be modeled as a step response from an underdamped second-order linear system with fixed transition time. The validity of the model is examined through auditory experiments using synthesized singing voice.
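    The two step-response shapes named in the abstract above can be compared with a minimal sketch. This uses the textbook unit step response of a second-order linear system; the function names, the note frequencies, the natural frequency `omega`, and the damping ratio 0.5 are illustrative assumptions, not values from the paper.

```python
import math


def step_response(t, omega, zeta):
    """Unit step response of a second-order linear system.

    omega: natural frequency in rad/s (assumed value, not from the paper)
    zeta:  damping ratio, 0 < zeta <= 1 (1.0 means critically damped)
    """
    if t <= 0:
        return 0.0
    if zeta == 1.0:
        # Critically damped: approaches 1 without overshoot.
        return 1.0 - (1.0 + omega * t) * math.exp(-omega * t)
    # Underdamped: decaying oscillation around 1.
    root = math.sqrt(1.0 - zeta ** 2)
    omega_d = omega * root
    decay = math.exp(-zeta * omega * t)
    return 1.0 - decay * (math.cos(omega_d * t) + (zeta / root) * math.sin(omega_d * t))


def f0_transition(t, f_start, f_end, omega, zeta):
    """F0 contour (Hz) moving from f_start to f_end as a scaled step response."""
    return f_start + (f_end - f_start) * step_response(t, omega, zeta)


# Hypothetical note transition between 196 Hz and 220 Hz, sampled every 1 ms.
times = [i / 1000.0 for i in range(501)]
rising = [f0_transition(t, 196.0, 220.0, 40.0, 1.0) for t in times]
falling = [f0_transition(t, 220.0, 196.0, 40.0, 0.5) for t in times]
```

    In this sketch the critically damped rising contour never exceeds the target note, while the underdamped falling contour dips below the target before settling.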

  • Some Relations between Watson-Crick Finite Automata and Chomsky Hierarchy

    Sadaki HIROSE  Kunifumi TSUDA  Yasuhiro OGOSHI  Haruhiko KIMURA  

     
    LETTER-Automata and Formal Language Theory

      Vol:
    E87-D No:5
      Page(s):
    1261-1264

    Watson-Crick automata, recently introduced in, are new types of automata in the DNA computing framework, working on tapes which are double-stranded sequences of symbols related by a complementarity relation, similar to a DNA molecule. The automata scan each of the two strands separately in a correlated manner. Some restricted variants of them were also introduced and the relationships between the families of languages recognized by them were investigated in. In this paper, we clarify some relations between the families of languages recognized by the restricted variants of Watson-Crick finite automata and the families in the Chomsky hierarchy.

  • Pattern-Based Features vs. Statistical-Based Features in Decision Trees for Word Segmentation

    Thanaruk THEERAMUNKONG  Thanasan TANHERMHONG  

     
    PAPER-Natural Language Processing

      Vol:
    E87-D No:5
      Page(s):
    1254-1260

    This paper proposes two alternative approaches that do not make use of a dictionary but instead utilize different types of learned features to segment words in a language that has no explicit word boundary. Both methods utilize decision trees as the knowledge representation acquired from a training corpus in the segmentation process. The first method, a language-dependent technique, applies a set of constructed feature patterns based on character types to generate a set of heuristic segmentation rules. It separates a running text into a sequence of small chunks based on the given patterns, and constructs a decision tree for word segmentation. The second method extracts statistics of character sequences from a training corpus and uses them as features for constructing a set of rules by decision tree induction. The latter needs no linguistic knowledge. In experiments on the Thai language, both methods achieve relatively high accuracy, but the latter performs much better.

  • Orthogonalized Distinctive Phonetic Feature Extraction for Noise-Robust Automatic Speech Recognition

    Takashi FUKUDA  Tsuneo NITTA  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1110-1118

    In this paper, we propose a noise-robust automatic speech recognition system that uses orthogonalized distinctive phonetic features (DPFs) as input to an HMM with diagonal covariance. In the orthogonalized DPF extraction stage, a speech signal is first converted to acoustic features composed of local features (LFs) and ΔP; then a multilayer neural network (MLN) with 153 output units, composed of context-dependent DPFs of a preceding context DPF vector, a current DPF vector, and a following context DPF vector, maps the LFs to DPFs. The Karhunen-Loeve transform (KLT) is then applied to orthogonalize each DPF vector in the context-dependent DPFs, using orthogonal bases calculated from a DPF vector that represents 38 Japanese phonemes. Finally, the orthogonalized DPF vectors are decorrelated from one another using the Gram-Schmidt orthogonalization procedure. In experiments, after evaluating the parameters of the MLN input and output units in the DPF extractor, the orthogonalized DPFs are compared with the original DPFs. The orthogonalized DPFs are then evaluated in comparison with a standard parameter set of MFCCs and dynamic features. Next, noise robustness is tested using four types of additive noise. The experimental results show that the use of the proposed orthogonalized DPFs can significantly reduce the error rate in an isolated spoken-word recognition task, both with clean speech and with speech contaminated by additive noise. Furthermore, we achieved significant improvements when combining the orthogonalized DPFs with conventional static MFCCs and ΔP.
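    The Gram-Schmidt procedure mentioned in the abstract above can be sketched generically as follows. This is the standard (modified) Gram-Schmidt algorithm on plain Python lists; how the paper applies it to the three context-dependent DPF vectors is not stated in the abstract, so the function name and the example vectors are assumptions for illustration only.

```python
def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis spanning the given vectors.

    Each vector has the components along the already accepted basis
    vectors subtracted (modified Gram-Schmidt), then is normalized.
    Vectors that become numerically zero are dropped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = sum(wi * wi for wi in w) ** 0.5
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical stand-ins for three correlated feature vectors.
example = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ortho = gram_schmidt(example)
```

    After the call, every pair of vectors in `ortho` has a dot product close to zero and each vector has unit length.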

  • A Study on Acoustic Modeling for Speech Recognition of Predominantly Monosyllabic Languages

    Ekkarit MANEENOI  Visarut AHKUPUTRA  Sudaporn LUKSANEEYANAWIN  Somchai JITAPUNKUL  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1146-1163

    This paper presents a study on acoustic modeling for speech recognition of predominantly monosyllabic languages. Various speech units used in speech recognition systems have been investigated. To evaluate the effectiveness of these acoustic models, the Thai language is selected, since it is a predominantly monosyllabic language and has a complex vowel system. Several experiments have been carried out to find the proper speech unit that can accurately create an acoustic model and give a higher recognition rate. Results of recognition rates under different acoustic models are given and compared. In addition, this paper proposes a new speech unit for speech recognition, namely the onset-rhyme unit. Two models are proposed: the Phonotactic Onset-Rhyme Model (PORM) and the Contextual Onset-Rhyme Model (CORM). The models comprise a pair of onset and rhyme units, which make up a syllable. An onset comprises an initial consonant and its transition towards the following vowel. Together with the onset, the rhyme consists of a steady vowel segment and a final consonant. Experimental results show that the onset-rhyme model improves on the efficiency of other speech units. The onset-rhyme model improves on the accuracy of the inter-syllable triphone model by nearly 9.3% and of the context-dependent Initial-Final model by nearly 4.7% for the speaker-dependent systems using only an acoustic model, and by 5.6% and 4.5%, respectively, for the speaker-dependent systems using both acoustic and language models. The results show that the onset-rhyme models attain a high recognition rate. Moreover, they are also more efficient in terms of system complexity.

  • Sounds of Speech Based Spoken Document Categorization: A Subword Representation Method

    Weidong QU  Katsuhiko SHIRAI  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1175-1184

    In this paper, we explore a method for the problem of spoken document categorization, which is the task of automatically assigning spoken documents to a set of predetermined categories. To categorize spoken documents, subword unit representations are used as an alternative to word units generated by either keyword spotting or large vocabulary continuous speech recognition (LVCSR). An advantage of using subword acoustic unit representations for spoken document categorization is that it does not require prior knowledge about the contents of the spoken documents and addresses the out-of-vocabulary (OOV) problem. Moreover, this method relies on the sounds of speech rather than exact orthography. The use of subword units instead of words allows approximate matching on inaccurate transcriptions, making "sounds-like" spoken document categorization possible. We also explore the performance of our method when the training set contains both perfect and errorful phonetic transcriptions, in the hope that the classifiers can learn from the confusion characteristics of the recognizer and the pronunciation variants of words to improve the robustness of the whole system. Our experiments based on both artificial and real corrupted data sets show that the proposed method is more effective and robust than the word-based method.

  • Robust Speaker Identification System Based on Multilayer Eigen-Codebook Vector Quantization

    Ching-Tang HSIEH  Eugene LAI  Wan-Chen CHEN  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1185-1193

    This paper presents some effective methods for improving the performance of a speaker identification system. Based on the multiresolution property of the wavelet transform, the input speech signal is decomposed into various frequency subbands in order not to spread noise distortions over the entire feature space. For capturing the characteristics of the vocal tract, the linear predictive cepstral coefficients (LPCC) of the lower frequency subband for each decomposition process are calculated. In addition, a hard threshold technique for the lower frequency subband in each decomposition process is also applied to eliminate the effect of noise interference. Furthermore, cepstral domain feature vector normalization is applied to all computed features in order to provide similar parameter statistics in all acoustic environments. In order to effectively utilize all these multiband speech features, we propose a modified vector quantization as the identifier. This model uses the multilayer concept to eliminate the interference among the multiband speech features and then uses the principal component analysis (PCA) method to evaluate the codebooks for capturing a more detailed distribution of the speaker's phoneme characteristics. The proposed method is evaluated using the KING speech database for text-independent speaker identification. Experimental results show that the recognition performance of the proposed method is better than those of the vector quantization (VQ) and the Gaussian mixture model (GMM) using full-band LPCC and mel-frequency cepstral coefficients (MFCC) features in both clean and noisy environments. Also, a satisfactory performance can be achieved in low SNR environments.

  • One-Pass Semi-Dynamic Network Decoding Using a Subnetwork Caching Model for Large Vocabulary Continuous Speech Recognition

    Dong-Hoon AHN  Minhwa CHUNG  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1164-1174

    This paper presents a new decoding framework for large vocabulary continuous speech recognition that can handle a static search network dynamically. Generally, a static network decoder can use a search space that is globally optimized in advance, and therefore it can run at high speed during decoding. However, its large memory requirement due to the large network size or the spatial complexity of the optimization algorithm often makes it impractical. Our new one-pass semi-dynamic network decoding scheme aims at incorporating such an optimized search network with memory efficiency, but without losing speed. In this framework, a complete search network is organized on the basis of self-structuring subnetworks and is nearly minimized using a modified tail-sharing algorithm. While the decoder runs, it caches subnetworks needed for decoding in memory, whereas static network decoders keep the complete network in memory. The subnetwork caching model is controlled by two levels of caches: local cache obtained by subnetwork caching operations and global cache obtained by subnetwork preloading operations. The model can also be controlled adaptively by using subnetwork profiling operations. Furthermore, it is made simple and fast with compactly designed self-structuring subnetworks. Experimental results on a 25 k-word Korean broadcast news transcription task show that the semi-dynamic decoder can run almost as fast as an equivalent static network decoder under various memory configurations by using the subnetwork caching model.

  • Phoneme-Balanced and Digit-Sequence-Preserving Connected Digit Patterns for Text-Prompted Speaker Verification

    Tsuneo KATO  Tohru SHIMIZU  

     
    PAPER

      Vol:
    E87-D No:5
      Page(s):
    1194-1199

    This paper presents a novel design of connected digit patterns to achieve high-accuracy text-prompted speaker verification over a cellular phone network. To reduce the error rate, a phoneme-balanced connected digit pattern for enrollment, and digit-sequence-preserving connected digit patterns for verification (i.e. patterns preserving partial digit sequences of the enrollment pattern) are proposed. In addition to these, a decision procedure using multiple patterns has been designed to overcome the low quality of cellular phone speech. Experimental results on cellular phone speech showed that the phoneme-balanced patterns for enrollment and digit-sequence-preserving patterns for verification reduced the equal error rate by more than 50% compared to the conventional method using randomly-selected and randomly-reordered digit patterns. The decision procedure reduced the error rate by 60%. In addition, this paper shows that verification patterns depending on the pattern of a preceding utterance reduced the error rate by 10%. Overall, the error rate obtained by the proposed method was 1% for 99% of clients and 95% of impostors.

  • Wavelet Coding of Structured Geometry Data on Triangular Lattice Plane Considering Rate-Distortion Properties

    Hiroyuki KANEKO  Koichi FUKUDA  Akira KAWANAKA  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E87-D No:5
      Page(s):
    1238-1246

    Efficient representations of a 3-D object shape and its texture data have attracted wide attention for the transmission of computer graphics data and for the development of multi-view real image rendering systems on computer networks. Polygonal mesh data, which consist of connectivity information, geometry data, and texture data, are often used for representing 3-D objects in many applications. This paper presents a wavelet coding technique for coding the geometry data structured on a triangular lattice plane obtained by structuring the connectivity of the polygonal mesh data. Since the structured geometry data have an arbitrarily-shaped support on the triangular lattice plane, a shape-adaptive wavelet transform was used to obtain the wavelet coefficients, whose number is identical to the number of original data, while preserving the self-similarity of the wavelet coefficients across subbands. In addition, the wavelet coding technique includes extensions of the zerotree entropy (ZTE) coding for taking into account the rate-distortion properties of the structured geometry data. The parent-children dependencies are defined as the set of wavelet coefficients from different bands that represent the same spatial region in the triangular lattice plane, and the wavelet coefficients in the spatial tree are optimally pruned based on the rate-distortion properties of the geometry data. Experiments in which proposed wavelet coding was applied to some sets of polygonal mesh data showed that the proposed wavelet coding achieved better coding efficiency than the Topologically Assisted Geometry Compression scheme adopted in the MPEG-4 standard.

  • Negation as Failure through a Network

    Kazunori IRIYA  Susumu YAMASAKI  

     
    PAPER-Computation and Computational Models

      Vol:
    E87-D No:5
      Page(s):
    1200-1207

    This paper deals with distributed procedures, caused by negation as failure through a network, where general logic programs are distributed so that they communicate with each other in terms of negation as failure inquiries and responses, but not in terms of derivations of SLD resolutions. Common variables shared as channels among distributed programs are not treated; instead, negation as failure validated in the whole network is the object of communication among distributed programs. We can define the semantics for the distributed programs in a network. At the same time, we have distributed proof procedures for distributed programs, by means of negation as failure implemented through the network, where the soundness of the procedure is guaranteed by the defined semantics.

  • Design of a Differential Electromagnetic Transducer for Use in IME System

    Byung-Seop SONG  Min-Kyu KIM  Young-Ho YOON  Sang-Heun LEE  Jin-Ho CHO  

     
    PAPER-Speech and Hearing

      Vol:
    E87-D No:5
      Page(s):
    1231-1237

    A differential electromagnetic transducer (DET) was implemented using micro electro mechanical system (MEMS) technology for use in an implantable middle ear (IME) system. The DET is designed to have good vibration efficiency and a structure that is not affected by external environmental magnetic fields. In order to preserve uniform vibration performance, MEMS technology was introduced to manufacture the elastic membrane using polyimide, which is softer than silicon. Using finite element analysis (FEA), the vibration characteristics are simulated and designed so that the resonance frequency of the membrane is close to that of the middle ear. Vibration experiments on the developed DET showed excellent results. We implemented the IME system using a DET and implanted it into a dog. This showed that the IME system performed well in a living body.

  • Novel Thresholding Algorithm for Change Detection in Video Sequence

    Byung-Gyu KIM  Dong-Jo PARK  

     
    LETTER-Pattern Recognition

      Vol:
    E87-D No:5
      Page(s):
    1271-1275

    A novel thresholding algorithm for change detection in video sequences is proposed. The method is based on image differencing and the intensity distribution of a difference image. With a difference image between two consecutive images, we prepare a new image model for the distribution of stationary pixels. The distribution of moving pixels is then separated by extracting the distribution of stationary pixels from the overall distribution of the difference image. Pixels that exhibit a significant change in intensity are classified using a likelihood criterion. The proposed algorithm is tested on the standard MPEG sequences and verified to have reliable performance.
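    The general idea of thresholding a difference image can be illustrated with a short sketch. This is not the paper's stationary-pixel model or likelihood criterion; it swaps in a simpler, commonly used stand-in that estimates the spread of stationary pixels from the median absolute deviation (MAD) of the difference image. The function name, the factor `k`, and the `min_sigma` floor are assumptions.

```python
import statistics


def change_mask(prev, curr, k=3.0, min_sigma=1.0):
    """Mark pixels whose intensity change is large relative to the noise.

    prev, curr: two consecutive grayscale frames as lists of rows.
    The noise level of stationary pixels is estimated robustly from the
    difference image via the MAD (a stand-in, not the paper's method).
    """
    diff = [[c - p for p, c in zip(row_p, row_c)] for row_p, row_c in zip(prev, curr)]
    flat = [d for row in diff for d in row]
    med = statistics.median(flat)
    mad = statistics.median([abs(d - med) for d in flat])
    sigma = max(1.4826 * mad, min_sigma)  # 1.4826 scales MAD to a Gaussian sigma
    threshold = k * sigma
    return [[abs(d - med) > threshold for d in row] for row in diff]


# Hypothetical 4x4 frames: small noise everywhere, one pixel changes strongly.
prev_frame = [[10, 10, 10, 10] for _ in range(4)]
curr_frame = [[10, 11, 10, 10], [10, 10, 80, 10], [9, 10, 10, 10], [10, 10, 10, 11]]
mask = change_mask(prev_frame, curr_frame)
```

    In this example only the pixel at row 1, column 2 is flagged as changed; the one-level fluctuations elsewhere stay below the threshold.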

  • Fixed-Interval Smoothing from Uncertain Observations with White Plus Coloured Noises Using Covariance Information

    Seiichi NAKAMORI  Raquel CABALLERO-AGUILA  Aurora HERMOSO-CARAZO  Josefa LINARES-PEREZ  

     
    PAPER-Digital Signal Processing

      Vol:
    E87-A No:5
      Page(s):
    1209-1218

    This paper presents recursive algorithms for the least mean-squared error linear filtering and fixed-interval smoothing estimators, from uncertain observations for the case of white and white plus coloured observation noises. The estimators are obtained by an innovation approach and do not use the state-space model, but only covariance information about the signal and the observation noises, as well as the probability that the signal exists in the observed values. Therefore the algorithms are applicable not only to signal processes that can be estimated by the conventional formulation using the state-space model but also to those for which a realization of the state-space model is not available. It is assumed that both the signal and the coloured noise autocovariance functions are expressed in a semi-degenerate kernel form. Since the semi-degenerate kernel is suitable for expressing autocovariance functions of non-stationary or stationary signal processes, the proposed estimators provide estimates of general signal processes.

  • Analysis and Experiments of a TM010 Mode Cylindrical Cavity to Measure Accurate Complex Permittivity of Liquid

    Hirokazu KAWABATA  Hiroshi TANPO  Yoshio KOBAYASHI  

     
    PAPER-General Methods, Materials, and Passive Circuits

      Vol:
    E87-C No:5
      Page(s):
    694-699

    A rigorous analysis for a TM010 mode cylindrical cavity with insertion holes is presented on the basis of the Ritz-Galerkin method to realize accurate measurements of the complex permittivity of liquid. The effects of sample insertion holes, a dielectric tube, and air-gaps between a dielectric tube and sample insertion holes are taken into account in this analysis. The validity of this method is verified by measurements of several kinds of liquid.

  • Two Factor Authenticated Key Exchange (TAKE) Protocol in Public Wireless LANs

    Young Man PARK  Sang Kyu PARK  

     
    LETTER-Fundamental Theories

      Vol:
    E87-B No:5
      Page(s):
    1382-1385

    We propose a new authentication and key establishment (AKE) protocol that can be applied to low-power PDAs in Public Wireless LANs (PWLANs), using two factor authentication and precomputation. This protocol provides mutual authentication, identity privacy, and half forward-secrecy. The computational complexity that the client must perform is just one symmetric key encryption and five hash functions during the runtime of the protocol.

  • Traffic Engineering with Constrained Multipath Routing in MPLS Networks

    Youngseok LEE  Yongho SEOK  Yanghee CHOI  

     
    PAPER-Network

      Vol:
    E87-B No:5
      Page(s):
    1346-1356

    A traffic engineering problem in a network consists of setting up paths between the edge nodes of the network to meet traffic demands while optimizing network performance. It is known that total traffic throughput in a network, or resource utilization, can be maximized if a traffic demand is split over multiple paths. However, the problem formulation and practical algorithms, which calculate the paths and the load-splitting ratios by taking bandwidth, the route constraints or policies into consideration, have received little attention. In this paper, we formulate the constrained multipath-routing problems with the objective of minimizing the maximum link utilization, while satisfying bandwidth, the maximum hop count, and the not-preferred node/link list in Linear Programming (LP). Optimal solutions of paths and load-splitting ratios found by an LP solver are shown to be superior to the conventional shortest path algorithm in terms of maximum link utilization, total traffic volume, and number of required paths. Then, we propose a heuristic algorithm with low computational complexity that finds near-optimal paths and load-splitting ratios satisfying the given constraints. The proposed algorithm is applied to Multi-Protocol Label Switching (MPLS), which permits explicit path setup, and it is tested in a fictitious backbone network. The experimental results show that the heuristic algorithm finds near-optimal solutions.
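    The objective of minimizing the maximum link utilization can be illustrated on a deliberately simplified case. The sketch below is neither the paper's LP formulation nor its heuristic; it assumes a single demand, link-disjoint candidate paths, and no other traffic, in which case splitting the demand in proportion to each path's bottleneck capacity equalizes, and therefore minimizes, the maximum utilization. The function name and all numbers are hypothetical.

```python
def split_demand(demand, path_capacities):
    """Split one demand over link-disjoint paths to minimize max utilization.

    demand:          traffic volume to carry (e.g. Mbit/s)
    path_capacities: bottleneck capacity of each candidate path
    Returns (per-path loads, resulting maximum utilization).
    """
    total = sum(path_capacities)
    loads = [demand * c / total for c in path_capacities]
    return loads, demand / total


# Hypothetical demand of 60 units over two disjoint paths.
loads, max_util = split_demand(60.0, [100.0, 50.0])

# Sending everything over the first path alone, for comparison.
single_path_util = 60.0 / 100.0
```

    Here the split carries 40 and 20 units on the two paths at a utilization of 0.4 each, compared with 0.6 when the whole demand follows a single path.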

  • A New Method to Extract MOSFET Threshold Voltage, Effective Channel Length, and Channel Mobility Using S-parameter Measurement

    Han-Yu CHEN  Kun-Ming CHEN  Guo-Wei HUANG  Chun-Yen CHANG  Tiao-Yuan HUANG  

     
    PAPER-Active Devices and Circuits

      Vol:
    E87-C No:5
      Page(s):
    726-732

    In this work, a simple method for extracting MOSFET threshold voltage, effective channel length and channel mobility by using S-parameter measurement is presented. In the new method, the dependence between the channel conductivity and applied gate voltage of the MOSFET device is cleverly utilized to extract the threshold voltage, while biasing the drain node of the device at zero voltage during measurement. Moreover, the effective channel length and channel mobility can also be obtained with the same measurement. Furthermore, all the physical parameters can be extracted directly on the modeling devices without relying on specifically designed test devices. Most important of all, only one S-parameter measurement is required for each device under test (DUT), making the proposed extraction method promising for automatic measurement applications.

  • A Time-Interleaved Switched-Capacitor Band-Pass Delta-Sigma Modulator with Recursive Loop

    Minho KWON  Jungyoon LEE  Gunhee HAN  

     
    PAPER-Electronic Circuits

      Vol:
    E87-C No:5
      Page(s):
    785-790

    A band-pass delta-sigma modulator (BPDSM) is a key building block for implementing a digital intermediate frequency (IF) receiver in a wireless communication system. This paper proposes a time-interleaved (TI) switched-capacitor (SC) BPDSM architecture that consists of 5-stage TI blocks with a recursive loop. The proposed TI BPDSM reduces the clock frequency requirement by a factor of 5 and relaxes the settling time requirement to one-fourth of that of the conventional approach. The test chip was designed and fabricated for a 30-MHz IF system with a 0.35-µm CMOS process. The measured peak SNR for a 200-kHz bandwidth is 63 dB while dissipating 75 mW from a 3.3-V supply and occupying 1.3 mm².
