The search functionality is under construction.

Keyword Search Result

[Keyword] SCO(484hit)

261-280hit(484hit)

  • Verification of Speech Recognition Results Incorporating In-domain Confidence and Discourse Coherence Measures

    Ian R. LANE  Tatsuya KAWAHARA  

     
    PAPER-Speech Recognition

      Vol:
    E89-D No:3
      Page(s):
    931-938

    Conventional confidence measures for assessing the reliability of ASR (automatic speech recognition) output are typically derived from "low-level" information obtained during speech recognition decoding. In contrast to these approaches, we propose a novel utterance verification framework which incorporates "high-level" knowledge sources. Specifically, we investigate two application-independent measures: in-domain confidence, the degree of match between the input utterance and the application domain of the back-end system, and discourse coherence, the consistency between consecutive utterances in a dialogue session. A joint confidence score is generated by combining these two measures with an orthodox measure based on GPP (generalized posterior probability). The proposed framework was evaluated on an utterance verification task for spontaneous dialogue performed via an English/Japanese speech-to-speech translation system. Incorporating the two proposed measures significantly improved utterance verification accuracy compared to using GPP alone, realizing reductions in CER (confidence error-rate) of 11.4% and 8.1% for the English and Japanese sides, respectively. When negligible ASR errors (those that do not affect translation) were ignored, further improvement was achieved for the English side, realizing a reduction in CER of up to 14.6% compared to the GPP case.
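
As a rough illustration of the idea in this abstract, the sketch below combines three per-utterance measures into a joint confidence score and computes a confidence error rate (falsely accepted plus falsely rejected utterances, divided by the total). The linear combination, the weights, the threshold, and all function names are assumptions made for illustration; the abstract does not say how the paper combines the measures.

```python
def joint_confidence(gpp, in_domain, coherence, weights=(0.6, 0.2, 0.2)):
    """Linearly combine three measures, each assumed to lie in [0, 1].

    The weights are illustrative placeholders, not values from the paper.
    """
    w_gpp, w_dom, w_coh = weights
    return w_gpp * gpp + w_dom * in_domain + w_coh * coherence


def confidence_error_rate(scores, is_correct, threshold=0.5):
    """CER = (false acceptances + false rejections) / number of utterances."""
    errors = 0
    for score, correct in zip(scores, is_correct):
        accepted = score >= threshold
        if accepted != correct:
            errors += 1
    return errors / len(scores)


def relative_reduction(baseline, improved):
    """Relative reduction in percent against a non-zero baseline."""
    return 100.0 * (baseline - improved) / baseline


# Toy data (invented): (GPP, in-domain, coherence, recognition was correct)
utterances = [
    (0.9, 0.8, 0.9, True),
    (0.6, 0.1, 0.2, False),
    (0.4, 0.9, 0.9, True),
    (0.3, 0.2, 0.1, False),
]
labels = [u[3] for u in utterances]
cer_gpp = confidence_error_rate([u[0] for u in utterances], labels)
cer_joint = confidence_error_rate(
    [joint_confidence(u[0], u[1], u[2]) for u in utterances], labels
)
```

On this invented data the joint score corrects both mistakes made by thresholding GPP alone; real gains depend entirely on the data and on how the measures are combined.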

  • Single-Channel Multiple Regression for In-Car Speech Enhancement

    Weifeng LI  Katsunobu ITOU  Kazuya TAKEDA  Fumitada ITAKURA  

     
    PAPER-Speech Enhancement

      Vol:
    E89-D No:3
      Page(s):
    1032-1039

    We address issues in improving hands-free speech enhancement and speech recognition performance in different car environments using a single distant microphone. This paper describes a new single-channel in-car speech enhancement method that estimates the log spectra of speech at a close-talking microphone based on the nonlinear regression of the log spectra of the noisy signal captured by a distant microphone and the estimated noise. The proposed method provides significant overall quality improvements in our subjective evaluation of the regression-enhanced speech, and performed best in most objective measures. In our isolated word recognition experiments conducted under 15 real car environments, the proposed adaptive nonlinear regression approach achieves average relative word error rate (WER) reductions of 50.8% and 13.1% compared to the original noisy speech and the ETSI advanced front-end (ETSI ES 202 050), respectively.

  • Improving Acoustic Model Precision by Incorporating a Wide Phonetic Context Based on a Bayesian Framework

    Sakriani SAKTI  Satoshi NAKAMURA  Konstantin MARKOV  

     
    PAPER-Speech Recognition

      Vol:
    E89-D No:3
      Page(s):
    946-953

    Over the last decade, the Bayesian approach has increased in popularity in many application areas. It uses a probabilistic framework which encodes our beliefs or actions in situations of uncertainty. Information from several models can also be combined based on the Bayesian framework to achieve better inference and to better account for modeling uncertainty. The approach we adopted here is to utilize the benefits of the Bayesian framework to improve acoustic model precision in speech recognition systems by modeling a wider-than-triphone context, approximating it using several less context-dependent models. Such a composition was developed in order to avoid the crucial problem of limited training data and to reduce the model complexity. To improve model reliability in the presence of unseen contexts and limited training data, flooring and smoothing techniques are applied. Experimental results show that the proposed Bayesian pentaphone model improves word accuracy in comparison with the standard triphone model.

  • Robust Speech Recognition by Using Compensated Acoustic Scores

    Shoei SATO  Kazuo ONOE  Akio KOBAYASHI  Toru IMAI  

     
    PAPER-Speech Recognition

      Vol:
    E89-D No:3
      Page(s):
    915-921

    This paper proposes a new method for compensating acoustic scores in the Viterbi search for robust speech recognition. The method introduces noise models to represent a wide variety of noises and realizes robust decoding together with conventional subtraction and adaptation techniques. It uses the likelihoods of the noise models in two ways. One is to calculate a confidence factor for each input frame by comparing the likelihoods of the speech models and the noise models. The weight of the acoustic score for a noisy frame is then reduced according to the value of the confidence factor. The other is to use the likelihood of a noise model as an alternative to that of a silence model when the input is noisy. Since a lower confidence factor compresses the acoustic scores, the decoder relies more on language scores and keeps more hypotheses within a fixed search depth for a noisy frame. An experiment using commentary transcriptions of a broadcast sports program (MLB: Major League Baseball) showed that the proposed method obtained a 6.7% relative word error reduction. The method also reduced the relative error rate of key words by 17.9%, which is expected to lead to an improvement in metadata extraction accuracy.
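
The two uses of noise-model likelihoods described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the logistic mapping from the speech/noise log-likelihood difference to a confidence factor, the `slope` parameter, and the use of `max` to let a noise model stand in for silence are all hypothetical choices, not the paper's formulas.

```python
import math


def confidence_factor(speech_loglik, noise_loglik, slope=1.0):
    """Map the speech-vs-noise log-likelihood difference to (0, 1).

    The logistic function is an assumed mapping chosen for illustration.
    """
    diff = slope * (speech_loglik - noise_loglik)
    diff = max(-50.0, min(50.0, diff))  # clamp to avoid overflow in exp()
    return 1.0 / (1.0 + math.exp(-diff))


def compensated_score(acoustic_logscore, speech_loglik, noise_loglik):
    """Scale the frame's acoustic log score by the confidence factor.

    A low factor compresses the score, so language scores weigh more.
    """
    return confidence_factor(speech_loglik, noise_loglik) * acoustic_logscore


def silence_score(silence_loglik, noise_loglik):
    """Let a noise model substitute for silence when it fits the frame better."""
    return max(silence_loglik, noise_loglik)
```

For example, a frame where speech and noise models are equally likely gets a factor of 0.5, halving its acoustic log score, while a clearly clean frame keeps its score almost unchanged.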

  • Efficient Motion Vector Composition Algorithm by Activity Measurement for Downscaled Video Transcoder

    Ching-Ting HSU  Mei-Juan CHEN  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E89-B No:3
      Page(s):
    1036-1039

    When the frame size is downscaled for video transcoding, the new motion vector (MV) must be computed. This paper presents an algorithm to utilize the activity measurement by DC value and the number of non-zero quantized DCT coefficients in the residual macroblock to compose the motion vector. It can reduce the complexity for motion estimation and improve the performance of the spatial domain video transcoder.
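
One plausible reading of activity-based motion vector composition is an activity-weighted average of the source macroblocks' vectors, scaled to the smaller frame. The sketch below follows that reading. The way DC value and non-zero coefficient count are merged into one activity figure, the fallback to a plain average, and all names are assumptions; the letter's actual composition rule is not given in the abstract.

```python
def activity(dc_value, nonzero_coeffs, dc_weight=1.0, count_weight=1.0):
    """Hypothetical activity measure for one residual macroblock.

    Combines the magnitude of the DC value with the number of non-zero
    quantized DCT coefficients; the weights are illustrative only.
    """
    return dc_weight * abs(dc_value) + count_weight * nonzero_coeffs


def compose_motion_vector(mvs, activities, scale=2):
    """Activity-weighted average of source MVs, divided by the downscale factor."""
    total = sum(activities)
    if total == 0:
        # No residual activity anywhere: fall back to a plain average.
        weights = [1.0 / len(mvs)] * len(mvs)
    else:
        weights = [a / total for a in activities]
    x = sum(w * mv[0] for w, mv in zip(weights, mvs)) / scale
    y = sum(w * mv[1] for w, mv in zip(weights, mvs)) / scale
    return (x, y)
```

With 2:1 downscaling, `mvs` would hold the vectors of the four source macroblocks that map to one target macroblock; reusing them avoids a fresh motion search in the transcoder.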

  • Frequency Domain Multiplexing of TES Signals by Magnetic Field Summation

    Noriko Y. YAMASAKI  Yoh TAKEI  Kensuke MASUI  Kazuhisa MITSUDA  Toshimitsu MOROOKA  Satoshi NAKAYAMA  

     
    INVITED PAPER

      Vol:
    E89-C No:2
      Page(s):
    98-105

    In frequency-domain multiplexing (FDM) for TES signals, a magnetic field summation method utilizing a multi-input SQUID has the fundamental merit of small degradation of the signal-to-noise ratio. We formulated the shifts of the operating point due to a common impedance and crosstalk currents. These effects are evaluated for several FDM methods, and the requirements for the bandwidth and filters are summarized. The design parameters of multi-input SQUIDs and flux-locked loop driving circuits are also presented.

  • Four-Quadrant-Input Linear Transconductor Employing Source and Sink Currents Pair for Analog Multiplier

    Masakazu MIZOKAMI  Kawori TAKAKUBO  Hajime TAKAKUBO  

     
    PAPER

      Vol:
    E89-A No:2
      Page(s):
    362-368

    A four-quadrant-input linear transconductor generating a product or a product-sum current is proposed. The proposed circuit eliminates the influence of channel length modulation and expands the dynamic input voltage range. As an application of the proposed circuit, a four-quadrant analog multiplier is designed. The multiplier consists of the proposed circuit, an input circuit and a class AB current buffer. HSPICE simulation results with 0.35 µm n-well single CMOS process parameters are shown in order to evaluate the proposed circuit.

  • A New Linear Transconductor Combining a Source Coupled Pair with a Transconductor Using Bias-Offset Technique

    Isamu YAMAGUCHI  Fujihiko MATSUMOTO  Makoto IZUMA  Yasuaki NOGUCHI  

     
    PAPER

      Vol:
    E89-A No:2
      Page(s):
    369-376

    In practice, the linearity of a transconductor with a theoretically linear characteristic is deteriorated by mobility degradation. In this paper, a technique to improve the linearity by combining a source-coupled pair with the transconductor is proposed. In the proposed transconductor, the deteriorated linearity of the conventional part is compensated by the transconductance characteristic of the source-coupled pair. In order to confirm the validity of the proposed technique, SPICE simulation is carried out. The transconductance change ratio of the proposed technique is about 1%, which is 1/10 or less of that of the conventional circuit.

  • Design of a Mobile Application Framework with Context Sensitivities

    Hyung-Min YOON  Woo-Shik KANG  Oh-Young KWON  Seong-Hun JEONG  Bum-Seok KANG  Tack-Don HAN  

     
    PAPER-Mobile Computing

      Vol:
    E89-D No:2
      Page(s):
    508-515

    New service concepts involving mobile devices with a diverse range of embedded sensors are emerging that share contexts supporting communication on a wireless network infrastructure. To promote these services in mobile devices, we propose a method that can efficiently detect a context provider by partitioning the location, time, speed, and discovery sensitivities.

  • HiPeer: A Highly Reliable P2P System

    Giscard WEPIWE  Plamen L. SIMEONOV  

     
    PAPER-Peer-to-Peer Computing

      Vol:
    E89-D No:2
      Page(s):
    570-580

    The paper presents HiPeer, a robust resource distribution and discovery algorithm that can be used for fast and fault-tolerant location of resources in P2P network environments. HiPeer defines a concentric multi-ring overlay networking topology, whereon dynamic network management methods are deployed. In terms of performance, HiPeer delivers a number of lowest bounds. We demonstrate that for any De Bruijn digraph of degree d ≥ 2 and diameter D_DB, HiPeer constructs a highly reliable network, where each node maintains a routing table with at most 2d+2 entries independent of the number N of nodes in the system. Further, we show that any existing resource in the network with at most d nodes can be found within at most D_HiPeer = log_d(N(d-1)+d) - 1 overlay hops. This result is as close to the Moore bound [1] as the query path length in other outstanding P2P proposals based on De Bruijn digraphs. Thus, we argue that HiPeer defines a highly connected network with connectivity d and the lowest yet known lookup bound D_HiPeer. Moreover, we show that any node's "join or leave" operation in HiPeer implies a constant expected reorganization cost on the order of O(d) control messages.
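
The two bounds quoted in this abstract are easy to evaluate numerically. The sketch below reads the lookup bound as D_HiPeer = log_d(N(d-1)+d) - 1, that is, a base-d logarithm; the function names are illustrative. As a pure arithmetic check, if N = d + d^2 + ... + d^D then N(d-1)+d = d^(D+1), so the formula returns exactly D. Whether HiPeer's rings are populated that way is an assumption, not something the abstract states.

```python
import math


def routing_table_max_entries(d):
    """Upper bound on routing table size per node: 2d + 2, independent of N."""
    return 2 * d + 2


def lookup_bound(n_nodes, d):
    """Evaluate D_HiPeer = log_d(N(d-1)+d) - 1 for N nodes and degree d >= 2."""
    if d < 2 or n_nodes < 1:
        raise ValueError("requires d >= 2 and at least one node")
    return math.log(n_nodes * (d - 1) + d, d) - 1


# Example: degree 3 with N = 3 + 9 + 27 = 39 nodes.
hops = lookup_bound(39, 3)
table = routing_table_max_entries(3)
```

Because `math.log` works in floating point, results for exact powers may differ from the integer value by a tiny rounding error; round or compare with a tolerance when an integer hop count is needed.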

  • Anti-Parallel Dipole Coupling of Quantum Dots via an Optical Near-Field Interaction

    Tadashi KAWAZOE  Kiyoshi KOBAYASHI  Motoichi OHTSU  

     
    PAPER

      Vol:
    E88-C No:9
      Page(s):
    1845-1849

    We observed the optically forbidden energy transfer between cubic CuCl quantum dots (QDs) coupled via an optical near-field interaction using time-resolved near-field photoluminescence (PL) spectroscopy. The energy transfer time and exciton lifetime were estimated from the rise and decay times of the PL pump-probe signal, respectively. We found that the exciton lifetime increased as the energy transfer time fell. This result strongly supports the notion that the near-field interaction between QDs produces anti-parallel dipole coupling. Namely, a quantum-dot pair coupled by an optical near field has a long exciton lifetime, which indicates anti-parallel coupling of the QDs forming a weakly radiative quadrupole state.

  • A Proposal on a New Algorithm for Volume Calculation Based on Laser Microscope Data

    Makoto HASEGAWA  Masato AKITA  Kazutaka IZUMI  Takayoshi KUBONO  

     
    LETTER-Contact Phenomena

      Vol:
    E88-C No:8
      Page(s):
    1573-1576

    We have begun developing our own data processing software for laser microscope data in the C# language. The software provides a volume calculation function for a target portion, based on a new calculation algorithm that can precisely handle the volume calculation of a portion located on a tilted or distorted surface. In this paper, this algorithm and some exemplary results obtained with it, as well as some further development aims, are briefly described.

  • Observations of the Eroded Surfaces and the Motion of Arc Spots at Each Breaking Operation of Silver Electrical Contacts

    Junya SEKIKAWA  Tetsuya KITAJIMA  Takayoshi ENDO  Takayoshi KUBONO  

     
    PAPER-Arc Discharge & Related Phenomena

      Vol:
    E88-C No:8
      Page(s):
    1590-1595

    The motion of the arc spots of a breaking arc is investigated for Ag electrical contacts in a DC 42 V/10 A resistive circuit using a high-speed camera. The eroded contact surfaces are also observed with a microscope after each breaking operation. As a result, several different kinds of films and eroded regions are distinguished. The diameters of these regions correspond to the widths of the cathode and anode spot regions obtained with the high-speed camera. It is found that the films and eroded regions on the electrical contacts are generated at different stages of the breaking arc.

  • Extraction of Desired Spectra Using ICA Regression with DOAS

    Hyeon-Ho KIM  Sung-Hwan HAN  Hyeon-Deok BAE  

     
    LETTER-Measurement Technology

      Vol:
    E88-A No:8
      Page(s):
    2244-2246

    Recently, DOAS (differential optical absorption spectroscopy) has been used for nondestructive air monitoring, in which the LS (least squares) method is used to calculate trace gas concentrations because of its computational simplicity. This paper applies the ICA (independent component analysis) method to the DOAS system for air monitoring, since the LS method is insufficient to recover the desired spectra perfectly due to the sparsity characteristic. If the sparsity of the reference spectra in the DOAS system imposes the assumption of independence, the ICA algorithm can be used. The proposed method regresses the observed spectrum on the estimates of the reference spectra. The ICA algorithm can be seen as a preprocessing method in which the ICs of the references are used as the input to the regression. The performance of the proposed method is evaluated in simulation studies using synthetic data.

  • High-Speed Distributed Video Transcoding for Multiple Rates and Formats

    Yasuo SAMBE  Shintaro WATANABE  Dong YU  Taichi NAKAMURA  Naoki WAKAMIYA  

     
    PAPER-Computer Systems

      Vol:
    E88-D No:8
      Page(s):
    1923-1931

    This paper describes a distributed video transcoding system that can simultaneously transcode an MPEG-2 video file into various video coding formats with different rates. The transcoder divides the MPEG-2 file into small segments along the time axis and transcodes them in parallel. Efficient video segment handling methods are proposed that minimize the inter-processor communication overhead and eliminate temporal discontinuities from the re-encoded video. We investigate how segment transcoding should be distributed to obtain the shortest total transcoding time. Experimental results show that implementing distributed transcoding on 10 PCs can decrease the total transcoding time by a factor of about 7 for single transcoding and by a factor of 9.5 for simultaneous transcoding at three rates.
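
The general idea of splitting a file along the time axis and distributing the segments can be sketched as below. The greedy earliest-finish assignment, the `worker_speeds` parameter, and the function names are assumptions made for illustration and should not be read as the scheduling policy studied in the paper. The model also ignores communication and segment-boundary overhead, so it reports ideal speedups only.

```python
def split_into_segments(total_seconds, segment_seconds):
    """Cut [0, total_seconds) into consecutive (start, end) segments."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    segments = []
    start = 0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments


def schedule(segments, worker_speeds):
    """Assign each segment to the worker that would finish it earliest.

    worker_speeds[i] is seconds of video transcoded per second on worker i.
    Returns (assignment, makespan): per-worker segment lists and total time.
    """
    finish_times = [0.0] * len(worker_speeds)
    assignment = [[] for _ in worker_speeds]
    for seg in segments:
        length = seg[1] - seg[0]
        best = min(
            range(len(worker_speeds)),
            key=lambda w: finish_times[w] + length / worker_speeds[w],
        )
        finish_times[best] += length / worker_speeds[best]
        assignment[best].append(seg)
    return assignment, max(finish_times)


# Example: a 100-second file in 10-second segments on three equal workers.
segments = split_into_segments(100, 10)
assignment, makespan = schedule(segments, [1.0, 1.0, 1.0])
ideal_speedup = 100 / makespan
```

Shorter segments balance the load better but increase per-segment overhead in a real system, which is the kind of trade-off the abstract's investigation of segment distribution addresses.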

  • An Approach to Ultra-Broadband Medium-Power MMIC Cascode-Pair Distributed Amplifier Design

    Qun WU  Yu-Ming WU  Jia-Hui FU  Bo-Shi JIN  Jong-Chul LEE  

     
    INVITED PAPER

      Vol:
    E88-C No:7
      Page(s):
    1353-1357

    This paper presents a cascode-pair distributed amplifier design approach using 0.25 µm GaAs-based PHEMT MMIC technology, which covers 2-32 GHz. Electromagnetic simulation results show that this amplifier achieves 18 dB gain from 2 to 32 GHz and 0.5 dB gain flatness over the band. The reflection coefficients at the input and output ports are below -10 dB up to 27 GHz. The output power at 1 dB compression is greater than 24 dBm at 20 GHz. An appropriate feedback resistance can be utilized to improve P1dB by about 6 dBm. The DOE (design of experiment) approach is carried out with a simulation tool for better performance, and the tolerance of the devices is also analyzed. The circuit configuration is capable of ultra-broadband amplification.

  • Quantization/DCT Conversion Scheme for DCT-Domain MPEG-2 to H.264/AVC Transcoding

    Joo-Kyong LEE  Ki-Dong CHUNG  

     
    PAPER

      Vol:
    E88-B No:7
      Page(s):
    2856-2863

    The latest video coding standard, H.264/AVC, adopts a 4×4 approximate transform instead of the 8×8 discrete cosine transform (DCT) to avoid the inverse transform mismatch problem. However, that is only one of the factors that make it difficult to transcode video content pre-coded with previous standards to H.264/AVC in the common domain without resorting to cascaded pixel-domain transcoding. In this paper, to support existing DCT-domain transcoding schemes and to reduce computational complexity, we propose an efficient algorithm that converts a quantized 8×8 DCT block into four newly quantized 4×4 transformed blocks. The experimental results show that the proposed scheme reduces computational complexity by 5-11% and improves video quality by 0.1-0.5 dB compared with the cascaded pixel-domain transcoding scheme that uses inverse quantization (IQ), inverse DCT (IDCT), DCT, and re-quantization (re-Q).

  • An Efficient Matrix-Based 2-D DCT Splitter and Merger for SIMD Instructions

    Yuh-Jue CHUANG  Ja-Ling WU  

     
    PAPER-Image Processing and Multimedia Systems

      Vol:
    E88-D No:7
      Page(s):
    1569-1577

    Recent microprocessors have included SIMD (single instruction multiple data) extensions in their instruction set architectures to improve the performance of multimedia applications. SIMD instructions speed up the execution of programs but pose many challenges to software developers. This paper presents an efficient matrix-based splitter (or merger), specialized for SIMD architectures, which can split an N×N 2-D DCT block into four N/2×N/2 or two N×N/2 (or N/2×N) 2-D DCT blocks (or merge small blocks into a large one). The programming-level complexity of the proposed methods is lower than that of the direct approach. Furthermore, even without using SIMD instructions, the algorithmic-level complexity of the proposed DCT splitter/merger is still lower than that of the direct one and is the same as that of the most efficient approach in the literature. When N = 8, our method can act as a transcoder between the latest video coding standard, AVC/H.264, and older ones such as MPEG-1, MPEG-2 and MPEG-4 Part 2. We also provide image quality tests to show the performance of the proposed 2-D DCT splitter and merger.

  • Consideration of Contents Utilization Time in Multi-Quality Video Content Delivery Methods with Scalable Transcoding

    Mei KODAMA  Shunya SUZUKI  

     
    PAPER-Image Processing and Multimedia Systems

      Vol:
    E88-D No:7
      Page(s):
    1587-1597

    When video data are transmitted over a network, the video quality must be chosen carefully so that it is the best achievable without the transmission being affected by other Internet services. For ease of implementation, systems often use the simulcast approach, in which an independent stream is stored and transmitted for each quality level. In contrast, we previously proposed a scalable structure consisting of base and enhancement data; when high-quality video is required, these data are combined using transcoding methods. In this paper, we propose video content delivery methods with scalable transcoding, in which users can upgrade the quality of video data even after transmission by using the base data and differential data. To reduce the total time, including not only users' access time but also their watching time, we compare the simulcast method with the proposed methods in terms of total content utilization time using a video content access model, and evaluate the transcoding time required to reduce users' waiting time.

  • A 0.18 µm CMOS 3rd-Order Digitally Programmable Gm-C Filter for VHF Applications

    Aranzazu OTIN  Santiago CELMA  Concepcion ALDEA  

     
    LETTER-Digital Circuits and Computer Arithmetic

      Vol:
    E88-D No:7
      Page(s):
    1509-1510

    In this paper we report a 3rd-order Gm-C filter based on pseudo-differential continuous-time transconductors for applications in low-voltage systems over VHF range. By using a 0.18 µm pure digital CMOS process, a prototype low pass filter with -3 dB frequency programmable from 38 MHz to 213 MHz confirms the feasibility of the proposed filter in applications such as data storage systems.
