Keyword Search Result

[Keyword] SI (16314 hits)

9261-9280 of 16314 hits

  • Generalization Performance of Subspace Bayes Approach in Linear Neural Networks

    Shinichi NAKAJIMA  Sumio WATANABE  

     
    PAPER-Algorithm Theory

    Vol: E89-D No:3
    Page(s): 1128-1138

    In unidentifiable models, Bayes estimation has an advantage in generalization performance over maximum likelihood estimation. However, accurate approximation of the posterior distribution requires huge computational costs. In this paper, we consider an alternative approximation method, which we call a subspace Bayes approach. A subspace Bayes approach is an empirical Bayes approach in which some of the parameters are regarded as hyperparameters. Consequently, in some three-layer models, this approach requires much lower computational cost than Markov chain Monte Carlo methods. We show that, in three-layer linear neural networks, a subspace Bayes approach is asymptotically equivalent to a positive-part James-Stein type shrinkage estimation, and theoretically clarify its generalization error and training error. We also discuss the domination over the maximum likelihood estimation and the relation to the variational Bayes approach.
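
For readers unfamiliar with the positive-part James-Stein shrinkage mentioned in the abstract above, the following is a minimal sketch of the classical estimator for a Gaussian mean vector. It is not the paper's subspace Bayes derivation; the function name `positive_part_james_stein` and the known-variance parameter `sigma2` are assumptions made for illustration.

```python
def positive_part_james_stein(x, sigma2=1.0):
    """Classical positive-part James-Stein estimate of a mean vector.

    x      : list of floats, one noisy observation per coordinate
    sigma2 : assumed known noise variance (hypothetical parameter)
    """
    d = len(x)
    if d < 3:
        # The classical estimator is only defined to shrink for d >= 3.
        raise ValueError("James-Stein shrinkage needs at least 3 dimensions")
    norm2 = sum(v * v for v in x)
    if norm2 == 0.0:
        return [0.0] * d
    # Shrinkage factor, clipped at zero (the "positive part").
    factor = max(0.0, 1.0 - (d - 2) * sigma2 / norm2)
    return [factor * v for v in x]
```

When the observation is small relative to the noise, the factor is clipped to zero and the estimate collapses to the origin; otherwise the observation is scaled down slightly.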

  • Error Identification in At-Speed Scan BIST Environment in the Presence of Circuit and Tester Speed Mismatch

    Yoshiyuki NAKAMURA  Thomas CLOUQUEUR  Kewal K. SALUJA  Hideo FUJIWARA  

     
    PAPER-Dependable Computing

    Vol: E89-D No:3
    Page(s): 1165-1172

    In this paper, we provide a practical formulation of the problem of identifying all error occurrences and all failed scan cells in an at-speed scan-based BIST environment. We propose a method that can be used to identify every error when the circuit test frequency is higher than the tester frequency. Our approach requires very little extra hardware for diagnosis, and the test application time required to identify errors is a linear function of the frequency ratio between the circuit under test (CUT) and the tester.

  • Design of MIMO Communication Systems Using Tapped Delay Line Structure in Receiver Side

    Tetsuki TANIGUCHI  Hoang Huy PHAM  Nam Xuan TRAN  Yoshio KARASAWA  

     
    PAPER-Communications

    Vol: E89-A No:3
    Page(s): 670-677

    This paper presents a simple method for determining the weights of single-carrier multiple-input multiple-output (MIMO) broadband communication systems that adopt a tapped delay line (TDL) structure on the receiver side for effective communication in a frequency selective fading (FSF) environment. First, assuming perfect knowledge of the channel matrix at both arrays, an iterative design method for the transmitter and receiver weights is proposed. In this approach, both weights are determined alternately to maximize the signal to noise plus interference ratio (SINR) by fixing the weight of one side while optimizing the other, and this operation is repeated until the SINR converges. Next, considering the case of an uninformed transmitter, the maximum SINR design method for MIMO systems is extended to a space time block coding (STBC) scheme working under FSF. Computer simulations demonstrate that the proposed schemes achieve higher SINR than the conventional method with a delay-less structure, particularly for fading with long duration.

  • A New Fusion Based Blind Logo-Watermarking Algorithm

    Gui XIE  Hong SHEN  

     
    PAPER-Application Information Security

    Vol: E89-D No:3
    Page(s): 1173-1180

    We propose a novel blind watermarking algorithm, called XFuseMark, which can hide a small, visually meaningful, grayscale logo in a host image instead of using a random-noise-like sequence based on the multiresolution fusion principles, and extract a recognizable version of the embedded logo even without reference to the original host data at the receiving end. XFuseMark is not only secure, i.e., only authorized users holding a private key are able to conduct the logo extraction operation, but also robust against noise addition and image compression. Experiments verify the practical performance of XFuseMark.

  • Integrated Connection Admission Control and Bandwidth on Demand Algorithm for a Broadband Satellite Network with Heterogeneous Traffic

    Yi QIAN  Rose Qingyang HU  Catherine ROSENBERG  

     
    PAPER-Satellite Communication

    Vol: E89-B No:3
    Page(s): 895-905

    There are many system proposals for satellite-based broadband communications that promise high capacity and ease of access. Many of these proposals require advanced switching technology and signal processing on-board the satellite(s). One solution is based on a geo-synchronous (GEO) satellite system equipped with on-board processing and on-board switching. An important feature of this system is allowing for a maximum number of simultaneous users, hence, requiring effective medium access control (MAC) layer protocols for connection admission control (CAC) and bandwidth on demand (BoD) algorithms. In this paper, an integrated CAC and BoD algorithm is proposed for a broadband satellite communication system with heterogeneous traffic. A detailed modeling and simulation approach is presented for performance evaluation of the integrated CAC and BoD algorithm based on heterogeneous traffic types. The proposed CAC and BoD scheme is shown to be able to efficiently utilize available bandwidth and to gain high throughput, and also to maintain good Grade of Service (GoS) for all the traffic types. The end-to-end delay for real-time traffic in the system falls well within ITU's Quality of Service (QoS) specification for GEO-based satellite systems.

  • Boundary-Active-Only Adaptive Power-Reduction Scheme for Region-Growing Video-Segmentation

    Takashi MORIMOTO  Hidekazu ADACHI  Osamu KIRIYAMA  Tetsushi KOIDE  Hans Jurgen MATTAUSCH  

     
    LETTER-Image Processing and Video Processing

    Vol: E89-D No:3
    Page(s): 1299-1302

    This letter presents a boundary-active-only (BAO) power reduction technique for cell-network-based region-growing video segmentation. The key approach is an adaptive situation-dependent power switching of each network cell, namely only cells at the boundary of currently grown regions are activated, and all the other cells are kept in low-power stand-by mode. The effectiveness of the proposed technique is experimentally confirmed with CMOS test-chips having small-scale cell networks of up to 4133 cells, where an average of only 1.7% of the cells remains active after application of the proposed approach. About 85% power reduction is thus achievable without sacrificing real-time processing.

  • Wideband Signal DOA Estimation Based on Modified Quantum Genetic Algorithm

    Feng LIU  Shaoqian LI  Min LIANG  Laizhao HU  

     
    PAPER-Communications

    Vol: E89-A No:3
    Page(s): 648-653

    A new wideband signal DOA estimation algorithm based on a modified quantum genetic algorithm (MQGA) is proposed for use in the presence of errors and mutual coupling between array elements. In the algorithm, the narrowband signal subspace fitting method is generalized to wideband signal DOA finding according to the characteristics of the spatial spectrum of the wideband signal, and the rule function is constructed accordingly. Then, the solutions are encoded onto chromosomes as binary strings, and the variable quantum rotation angle is defined according to the distribution of the optimization solutions. Finally, we use the MQGA to solve the nonlinear global azimuth optimization problem and obtain the optimized azimuths from the fitness values. Computer simulation results illustrate that the new algorithm has good estimation performance.

  • Detection of Moving Cast Shadows for Traffic Monitoring System

    Jeong-Hoon CHO  Dae-Geun JANG  Chan-Sik HWANG  

     
    LETTER-Image/Vision Processing

    Vol: E89-A No:3
    Page(s): 747-750

    Shadow detection and removal are important when dealing with traffic image sequences. The cast shadow of a vehicle may lead to inaccurate object feature extraction and erroneous scene analysis. Furthermore, separate vehicles can be connected through shadows. Both can confuse object recognition systems. In this paper, a robust method is proposed for detecting and removing active cast shadows in monocular color image sequences. A background subtraction method is used to extract moving blobs in color and gradient dimensions, and the YCrCb color space is adopted for detecting and removing the cast shadow. Even when shadows link different vehicles, the method can detect each vehicle figure using a mask modified by a shadow bar. Experimental results from town scenes show that the proposed method is effective and that the classification accuracy is sufficient for general vehicle type classification.

  • Spatial Fading Simulator Using a Cavity-Excited Circular Array (CECA) for Performance Evaluation of Antenna Arrays

    Chulgyun PARK  Jun-ichi TAKADA  Kei SAKAGUCHI  Takashi OHIRA  

     
    PAPER-Antennas and Propagation

    Vol: E89-B No:3
    Page(s): 906-913

    In this paper we propose a novel spatial fading simulator to evaluate the performance of an array antenna and show its spatial stochastic characteristics by computer simulation based on parameters verified by experimental data. We introduce a cavity-excited circular array (CECA) as a fading simulator that can simulate realistic mobile communication environments. To evaluate the antenna array, two stochastic characteristics are necessary. The first one is the fading phenomenon and the second is the angular spread (AS) of the incident wave. The computer simulation results with respect to fading and AS show that CECA works well as a spatial fading simulator for performance evaluation of an antenna array. We first present the basic structure, features and design methodology of CECA, and then show computer simulation results of the spatial stochastic characteristics. The results convince us that CECA is useful to evaluate performance of antenna arrays.

  • PS-ZCPA Based Feature Extraction with Auditory Masking, Modulation Enhancement and Noise Reduction for Robust ASR

    Muhammad GHULAM  Takashi FUKUDA  Kouichi KATSURADA  Junsei HORIKAWA  Tsuneo NITTA  

     
    PAPER-Speech Recognition

    Vol: E89-D No:3
    Page(s): 1015-1023

    A pitch-synchronous (PS) auditory feature extraction method based on ZCPA (Zero-Crossings Peak-Amplitudes) was proposed previously and showed greater robustness than conventional ZCPA- and MFCC-based features. In this paper, first, a non-linear adaptive threshold adjustment procedure is introduced into the PS-ZCPA method to obtain optimal results in noisy conditions with different signal-to-noise ratios (SNRs). Next, auditory masking, a well-known auditory perception, and modulation enhancement, which simulates a strong relationship between modulation spectra and speech intelligibility, are embedded into the PS-ZCPA method. Finally, a Wiener filter based noise reduction procedure is integrated into the method to make it more noise-robust, and the performance is evaluated against ETSI ES202 (WI008), which is a standard front-end for distributed speech recognition. All the experiments were carried out on the Aurora-2J database. The experimental results demonstrated improved performance of the PS-ZCPA method when auditory masking was embedded into it, and slightly improved performance when modulation enhancement was used. The PS-ZCPA method with Wiener filter based noise reduction also showed better performance than ETSI ES202 (WI008).

  • Comparative Study of Speaker Identification Methods: dPLRM, SVM and GMM

    Tomoko MATSUI  Kunio TANABE  

     
    PAPER-Speaker Recognition

    Vol: E89-D No:3
    Page(s): 1066-1073

    The performance of three text-independent speaker identification methods, based on dual Penalized Logistic Regression Machine (dPLRM), Support Vector Machine (SVM) and Gaussian Mixture Model (GMM), is compared in experiments with 10 male speakers. The methods are compared on speech data collected over a period of 13 months in 6 utterance sessions, of which the earlier 3 sessions were used to obtain training data of 12 seconds' utterances. Comparisons are made between the Mel-frequency cepstrum (MFC) data and the log-power spectrum data, and also between training data from a single session and from multiple sessions. It is shown that dPLRM with the log-power spectrum data is competitive with the SVM and GMM methods with MFC data when trained on the combined data collected in the earlier three sessions. dPLRM outperforms the GMM method, especially as the amount of training data becomes smaller. Some of these findings have already been reported in [1]-[3].

  • A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features

    Makoto TACHIBANA  Junichi YAMAGISHI  Takashi MASUKO  Takao KOBAYASHI  

     
    PAPER-Speech Synthesis

    Vol: E89-D No:3
    Page(s): 1092-1099

    This paper proposes a technique for synthesizing speech with a desired speaking style and/or emotional expression, based on model adaptation in an HMM-based speech synthesis framework. Speaking styles and emotional expressions are characterized by many segmental and suprasegmental features in both spectral and prosodic features. Therefore, it is essential to take account of these features in the model adaptation. The proposed technique, called style adaptation, deals with this issue. First, the maximum likelihood linear regression (MLLR) algorithm, based on a hidden semi-Markov model (HSMM) framework, is presented to provide a mathematically rigorous and robust adaptation of state duration and to adapt both the spectral and prosodic features. Then, a novel tying method for the regression matrices of the MLLR algorithm is presented to allow the incorporation of both the segmental and suprasegmental speech features into the style adaptation. The proposed tying method uses regression class trees with contextual information. From the results of several subjective tests, we show that these techniques can perform style adaptation while maintaining the naturalness of the synthetic speech.

  • Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes

    Takashi SAITO  

     
    PAPER-Speech Analysis

    Vol: E89-D No:3
    Page(s): 1100-1106

    This paper describes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of the F0 units are basically held invariant by eliminating any averaging operations in the analysis phase and by minimizing modification operations in the synthesis phase. The use of natural F0 shapes has great potential to cover a wide variety of speaking styles with the same framework, including not only read-aloud speech, but also dialogues and emotional speech. A linear-regression statistical model is used to "manipulate" the stored raw F0 shapes to build them up into a sentential F0 contour. Through experimental evaluations, the proposed model is shown to provide stable and robust F0 contour prediction for various speakers. By using this model, linguistically derived information about a sentence can be directly mapped, in a purely data-driven manner, to acoustic F0 values of the sentential intonation contour for a given target speaker.

  • Implementation and Evaluation of an HMM-Based Korean Speech Synthesis System

    Sang-Jin KIM  Jong-Jin KIM  Minsoo HAHN  

     
    LETTER

    Vol: E89-D No:3
    Page(s): 1116-1119

    The development of a hidden Markov model (HMM)-based Korean speech synthesis system and its evaluation are described. Statistical HMM models for Korean speech units are trained with a hand-labeled speech database that includes contextual information about phoneme, morpheme, word phrase, utterance, and break strength. The developed system produced speech with fairly good prosody. The synthesized speech is evaluated and compared with that of our corpus-based unit-concatenating Korean text-to-speech system. The two systems were trained with the same manually labeled speech database.

  • Progressive Transform-Based Phase Unwrapping Utilizing a Recursive Structure

    Andriyan Bayu SUKSMONO  Akira HIROSE  

     
    PAPER-Sensing

    Vol: E89-B No:3
    Page(s): 929-936

    We propose a progressive transform-based phase unwrapping (PU) technique that employs a recursive structure. Each stage, which is identical in construction to the others, performs PU by an FFT method that yields both a solution and a residual phase error. The residual phase error is then reprocessed by the following stages. This scheme effectively improves the gradient estimate of the noisy wrapped phase image, which is unrecoverable by conventional global PU methods. Additionally, by incorporating the computational strength of the transform PU method in a recursive system, we can realize a progressive PU system for prospective near real-time topographic-mapping radar and near real-time medical imaging systems (such as MRI thermometry and MRI flow imagers). The PU performance of the proposed system and of the conventional PU methods is evaluated by comparing their residual errors quantitatively with a fringe-density-related error metric called the FZX (fringe's zero-crossing) number. Experimental results for simulated and real InSAR phase images show significant, progressive improvement over conventional single-stage systems, which demonstrates the high applicability of the proposed method.

  • Using Hybrid HMM/BN Acoustic Models: Design and Implementation Issues

    Konstantin MARKOV  Satoshi NAKAMURA  

     
    PAPER-Speech Recognition

    Vol: E89-D No:3
    Page(s): 981-988

    In recent years, the number of studies investigating new directions in speech modeling that go beyond the conventional HMM has increased considerably. One promising approach is to use Bayesian Networks (BN) as speech models. Full recognition systems based on Dynamic BN as well as acoustic models using BN have been proposed lately. Our group at ATR has been developing a hybrid HMM/BN model, which is an HMM where the state probability distribution is modeled by a BN instead of the commonly used mixtures of Gaussian functions. In this paper, we describe how to use the hybrid HMM/BN acoustic models, with particular emphasis on some design and implementation issues. The most essential part of HMM/BN model building is the choice of the state BN topology. As it is manually chosen, there are some factors that should be considered in this process. They include, but are not limited to, the type of data, the task and the available additional information. When context-dependent models are used, the state-level structure can be obtained by traditional methods. The HMM/BN parameter learning is based on the Viterbi training paradigm and consists of two alternating steps: BN training and HMM transition updates. For recognition, in some cases, BN inference is computationally equivalent to a mixture of Gaussians, which allows the HMM/BN model to be used in existing decoders without any modification. We present two examples of HMM/BN model applications in speech recognition systems. Evaluations under various conditions and for different tasks showed that the HMM/BN model gives consistently better performance than the conventional HMM.

  • Speech Recognition Based on Student's t-Distribution Derived from Total Bayesian Framework

    Shinji WATANABE  Atsushi NAKAMURA  

     
    PAPER-Speech Recognition

    Vol: E89-D No:3
    Page(s): 970-980

    We introduce a robust classification method based on the Bayesian predictive distribution (Bayesian Predictive Classification, referred to as BPC) for speech recognition. We and others have recently proposed a total Bayesian framework named Variational Bayesian Estimation and Clustering for speech recognition (VBEC). VBEC includes the practical computation of approximate posterior distributions that are essential for BPC, based on variational Bayes (VB). BPC using VB posterior distributions (VB-BPC) provides an analytical solution for the predictive distribution as the Student's t-distribution, which can mitigate the over-training effects by marginalizing the model parameters of an output distribution. We address the sparse data problem in speech recognition, and show experimentally that VB-BPC is robust against data sparseness.

  • Cryptanalysis of Tzeng-Tzeng Forward-Secure Signature Schemes

    Hong WANG  Gang QIU  Deng-Guo FENG  Guo-Zhen XIAO  

     
    LETTER-Information Security

    Vol: E89-A No:3
    Page(s): 822-825

    In PKC'01, Tzeng et al. proposed two robust forward-secure signature schemes with proactive security: one is an efficient scheme, but it requires a manager; the other is a new construction based on distributed multiplication procedures. In this paper, we point out that their new distributed multiplication procedure is not secure, which makes the whole new construction insecure. Finally, we present an improved forward-secure signature scheme without a manager.

  • Improving Acoustic Model Precision by Incorporating a Wide Phonetic Context Based on a Bayesian Framework

    Sakriani SAKTI  Satoshi NAKAMURA  Konstantin MARKOV  

     
    PAPER-Speech Recognition

    Vol: E89-D No:3
    Page(s): 946-953

    Over the last decade, the Bayesian approach has increased in popularity in many application areas. It uses a probabilistic framework which encodes our beliefs or actions in situations of uncertainty. Information from several models can also be combined based on the Bayesian framework to achieve better inference and to better account for modeling uncertainty. The approach we adopt here is to utilize the benefits of the Bayesian framework to improve acoustic model precision in speech recognition systems by modeling a wider-than-triphone context, approximating it using several less context-dependent models. Such a composition was developed in order to avoid the crucial problem of limited training data and to reduce the model complexity. To enhance model reliability in the presence of unseen contexts and limited training data, flooring and smoothing techniques are applied. Experimental results show that the proposed Bayesian pentaphone model improves word accuracy in comparison with the standard triphone model.

  • DSRED: A New Queue Management Scheme for the Next Generation Internet

    Bing ZHENG  Mohammed ATIQUZZAMAN  

     
    PAPER-Internet

    Vol: E89-B No:3
    Page(s): 764-774

    Random Early Detection (RED), an active queue management scheme, has been recommended by the Internet Engineering Task Force (IETF) for next generation routers. RED suffers from a number of performance problems, such as low throughput, large delay/jitter, and induced instability in networks. Many of the previous attempts to improve the performance of RED have been based on optimizing the values of the RED parameters. However, results have shown that such optimizations yield only limited improvement in performance. In this paper, we propose Double Slope RED (DSRED), a new active queue management scheme to improve the performance of RED. The proposed scheme is based on dynamically changing the slope of the packet drop probability curve as a function of the level of congestion in the buffer. Results show that our proposed scheme achieves better performance than the original RED.
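
As a rough illustration of a drop-probability curve whose slope depends on the congestion level, the sketch below uses two linear segments joined at a middle threshold. The abstract does not give the actual DSRED formula, so the function name, the thresholds `min_th`, `mid_th`, `max_th`, and the breakpoint probability `p_mid` are all hypothetical parameters chosen for illustration.

```python
def double_slope_drop_probability(avg_queue, min_th, mid_th, max_th, p_mid):
    """Two-segment piecewise-linear packet drop probability.

    avg_queue : average queue length (same units as the thresholds)
    min_th    : below this, no packets are dropped
    mid_th    : point where the slope changes (hypothetical parameter)
    max_th    : at or above this, every packet is dropped
    p_mid     : drop probability reached at mid_th (hypothetical parameter)
    """
    if not (min_th < mid_th < max_th):
        raise ValueError("thresholds must satisfy min_th < mid_th < max_th")
    if not (0.0 <= p_mid <= 1.0):
        raise ValueError("p_mid must lie in [0, 1]")
    if avg_queue < min_th:
        return 0.0
    if avg_queue < mid_th:
        # First segment: rises from 0 to p_mid.
        return p_mid * (avg_queue - min_th) / (mid_th - min_th)
    if avg_queue < max_th:
        # Second segment: rises from p_mid to 1 with a different slope.
        return p_mid + (1.0 - p_mid) * (avg_queue - mid_th) / (max_th - mid_th)
    return 1.0
```

With a small `p_mid`, the curve is gentle under light congestion and steep under heavy congestion, which is one way to realize a congestion-dependent slope.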
