Keyword Search Result

[Keyword] SI(16314hit)

4081-4100hit(16314hit)

  • A Low-Cost Stimulus Design for Linearity Test in SAR ADCs

    An-Sheng CHAO  Cheng-Wu LIN  Hsin-Wen TING  Soon-Jyh CHANG  

     
    PAPER

    Vol: E97-C No:6  Page(s): 538-545

    The proposed stimulus design for linearity testing is embedded in a differential successive approximation register analog-to-digital converter (SAR ADC) as a design for testability (DFT). The proposed DFT is compatible with the pattern generator (PG) and output response analyzer (ORA), at a cost of 12.4% of the SAR ADC area. The 10-bit SAR ADC prototype is verified in a 0.18-µm CMOS technology, and the measured differential nonlinearity (DNL) error is between -0.386 and 0.281 LSB at 1 MS/s.

  • High Capacity Mobile Multi-Hop Relay Network for Temporary Traffic Surge

    Ju-Ho LEE  Goo-Yeon LEE  Choong-Kyo JEONG  

     
    LETTER-Information Network

    Vol: E97-D No:6  Page(s): 1661-1663

    Mobile Multi-hop Relay (MMR) technology is usually used to increase the transmission rate or to extend communication coverage. In this work, we show that MMR technology can also be used to raise the network capacity. Because Relay Stations (RS) are connected to the Base Station (BS) wirelessly and controlled by the BS, an MMR network can easily be deployed when necessary. High capacity MMR networks thus provide a good candidate solution for coping with temporary traffic surges. To enhance the capacity of the MMR network, we suggest a novel scheme that parallelizes cell transmissions while controlling the interference between transmissions. Using a numerical example for a typical network that conforms to IEEE 802.16j, we find that the network capacity increases by 88 percent.

  • Unsupervised Prosodic Labeling of Speech Synthesis Databases Using Context-Dependent HMMs

    Chen-Yu YANG  Zhen-Hua LING  Li-Rong DAI  

     
    PAPER-Speech Synthesis and Related Topics

    Vol: E97-D No:6  Page(s): 1449-1460

    In this paper, an automatic and unsupervised method using context-dependent hidden Markov models (CD-HMMs) is proposed for the prosodic labeling of speech synthesis databases. This method consists of three main steps, i.e., initialization, model training and prosodic labeling. The initial prosodic labels are obtained by unsupervised clustering using the acoustic features designed according to the characteristics of the prosodic descriptor to be labeled. Then, CD-HMMs of the spectral parameters, F0s and phone durations are estimated by a means similar to the HMM-based parametric speech synthesis using the initial prosodic labels. These labels are further updated by Viterbi decoding under the maximum likelihood criterion given the acoustic feature sequences and the trained CD-HMMs. The model training and prosodic labeling procedures are conducted iteratively until convergence. The performance of the proposed method is evaluated on Mandarin speech synthesis databases and two prosodic descriptors are investigated, i.e., the prosodic phrase boundary and the emphasis expression. In our implementation, the prosodic phrase boundary labels are initialized by clustering the durations of the pauses between every two consecutive prosodic words, and the emphasis expression labels are initialized by examining the differences between the original and the synthetic F0 trajectories. Experimental results show that the proposed method is able to label the prosodic phrase boundary positions much more accurately than the text-analysis-based method without requiring any manually labeled training data. The unit selection speech synthesis system constructed using the prosodic phrase boundary labels generated by our proposed method achieves similar performance to that using the manual labels. 
    Furthermore, the unit selection speech synthesis system constructed using the emphasis expression labels generated by our proposed method can convey the emphasis information effectively while maintaining the naturalness of synthetic speech.

  • A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation

    Kou TANAKA  Tomoki TODA  Graham NEUBIG  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Voice Conversion and Speech Enhancement

    Vol: E97-D No:6  Page(s): 1429-1437

    This paper presents an electrolaryngeal (EL) speech enhancement method capable of significantly improving the naturalness of EL speech while causing no degradation in its intelligibility. An electrolarynx is an external device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produced by the device. Moreover, the excitation sounds produced by the device often leak outside, adding to EL speech as noise. To address these issues, there are two main conventional approaches to EL speech enhancement: noise reduction and statistical voice conversion (VC). The former approach usually causes no degradation in intelligibility but yields only small improvements in naturalness, as the mechanical excitation sounds remain essentially unchanged. On the other hand, the latter approach significantly improves the naturalness of EL speech using spectral and excitation parameters of natural voices converted from acoustic parameters of EL speech, but it usually causes degradation in intelligibility owing to errors in conversion. We propose a hybrid approach using a noise reduction method for enhancing spectral parameters and a statistical voice conversion method for predicting excitation parameters. Moreover, we further modify the prediction process of the excitation parameters to improve its prediction accuracy and reduce adverse effects caused by unvoiced/voiced prediction errors. The experimental results demonstrate that the proposed method yields significant improvements in naturalness compared with EL speech while keeping intelligibility high enough.

  • Diagnosis of Signaling and Power Noise Using In-Place Waveform Capturing for 3D Chip Stacking (Open Access)

    Satoshi TAKAYA  Hiroaki IKEDA  Makoto NAGATA  

     
    PAPER

    Vol: E97-C No:6  Page(s): 557-565

    A three-dimensional (3D) chip stack featuring a 4096-bit wide I/O demonstrator incorporates an in-place waveform capturer on an intermediate interposer within the stack. The capturer includes probing channels on signaling paths as well as in power delivery and collects analog waveforms for diagnosing circuits within 3D integration. The collection of in-place waveforms on vertical channels with through silicon vias (TSVs) is demonstrated among 128 vertical I/O channels distributed in 8 banks in a 9.9mm × 9.9mm die area. The analog waveforms confirm a full 1.2-V swing of signaling at the maximum data transmission bandwidth of 100GByte/sec with sufficiently small deviations of signal skews and slews among the vertical channels. In addition, it is experimentally confirmed that the signal swing can be reduced to 0.75V for error-free data transfer at 100GByte/sec, achieving an energy efficiency of 0.21pJ/bit.

  • A Novel Adaptive Unambiguous Acquisition Scheme for CBOC Signal Based on Galileo

    Ce LIANG  Xiyan SUN  Yuanfa JI  Qinghua LIU  Guisheng LIAO  

     
    PAPER-Wireless Communication Technologies

    Vol: E97-B No:6  Page(s): 1157-1165

    The composite binary offset carrier (CBOC) modulated signal contains multiple peaks in its auto-correlation function, which brings ambiguity to the signal acquisition process of a GNSS receiver. Currently, most traditional ambiguity-removing schemes for CBOC signal acquisition approximate the CBOC signal as a BOC signal, which may incur performance degradation. Based on the Galileo E1 CBOC signal, this paper proposes a novel adaptive ambiguity-removing acquisition scheme which does not adopt the approximation used in traditional schemes. According to the energy ratio of each sub-code of the CBOC signal, the proposed scheme can self-adjust its local reference code to achieve unambiguous and precise signal synchronization. Monte Carlo simulation is conducted to analyze the performance of the proposed scheme and three traditional schemes. Simulation results show that the proposed scheme has a higher detection probability and a shorter mean acquisition time than the other three schemes, which verifies the superiority of the proposed scheme.

  • Multi-Source Tri-Training Transfer Learning

    Yuhu CHENG  Xuesong WANG  Ge CAO  

     
    LETTER-Artificial Intelligence, Data Mining

    Vol: E97-D No:6  Page(s): 1668-1672

    A multi-source Tri-Training transfer learning algorithm is proposed by integrating transfer learning and semi-supervised learning. First, multiple weak classifiers are respectively trained by using both weighted source and target training samples. Then, based on the idea of co-training, each target testing sample is labeled by using the trained weak classifiers, and the sample with the same label is selected as the high-confidence sample to be added into the target training sample set. Finally, we can obtain a target domain classifier based on the updated target training samples. The above steps are iterated until the high-confidence samples selected at two successive iterations become the same. At each iteration, source training samples are tested by using the target domain classifier, and the samples tested as correct continue with training, while the weights of samples tested as incorrect are lowered. Experimental results on a text classification dataset demonstrate the effectiveness and superiority of the proposed algorithm.

  • A Correctness Assurance Approach to Automatic Synthesis of Composite Web Services

    Dajuan FAN  Zhiqiu HUANG  Lei TANG  

     
    PAPER-Data Engineering, Web Information Systems

    Vol: E97-D No:6  Page(s): 1535-1545

    One of the most important problems in web services applications is the integration of different existing services into a new composite service. Existing work has the following disadvantages: (i) developers are often required to provide a composite service model first and perform formal verifications to check whether the model is correct. This makes the synthesis process of composite services semi-automatic, complex and inefficient; (ii) there is no assurance that composite services synthesized by using the fully-automatic approaches are correct; (iii) some approaches only handle simple composition problems where existing services are atomic. To address these problems, we propose a correctness assurance approach for automatically synthesizing composite services based on a finite state machine model. The syntax and semantics of the requirement model specifying composition requirements are also proposed. Given a set of abstract BPEL descriptions of existing services and a composition requirement, our approach automatically generates the BPEL implementation of the composite service. Compared with existing approaches, the composite service generated by our proposed approach is guaranteed to be correct and does not require any formal verification. The correctness of our approach is proved. Moreover, the case analysis indicates that our approach is feasible and effective.

  • Noise-Robust Voice Conversion Based on Sparse Spectral Mapping Using Non-negative Matrix Factorization

    Ryo AIHARA  Ryoichi TAKASHIMA  Tetsuya TAKIGUCHI  Yasuo ARIKI  

     
    PAPER-Voice Conversion and Speech Enhancement

    Vol: E97-D No:6  Page(s): 1411-1418

    This paper presents a voice conversion (VC) technique for noisy environments based on a sparse representation of speech. Sparse representation-based VC using non-negative matrix factorization (NMF) is employed for noise-added spectral conversion between different speakers. In our previous exemplar-based VC method, source exemplars and target exemplars are extracted from parallel training data, having the same texts uttered by the source and target speakers. The input source signal is represented using the source exemplars and their weights. Then, the converted speech is constructed from the target exemplars and the weights related to the source exemplars. However, this exemplar-based approach needs to hold all training exemplars (frames), and it requires high computation times to obtain the weights of the source exemplars. In this paper, we propose a framework to train the basis matrices of the source and target exemplars so that they have a common weight matrix. By using the basis matrices instead of the exemplars, the VC is performed with lower computation times than with the exemplar-based method. The effectiveness of this method was confirmed by comparing it, in speaker conversion experiments using noise-added speech data, with an exemplar-based method and a conventional Gaussian mixture model (GMM)-based method.

  • Structured Adaptive Regularization of Weight Vectors for a Robust Grapheme-to-Phoneme Conversion Model

    Keigo KUBO  Sakriani SAKTI  Graham NEUBIG  Tomoki TODA  Satoshi NAKAMURA  

     
    PAPER-Speech Synthesis and Related Topics

    Vol: E97-D No:6  Page(s): 1468-1476

    Grapheme-to-phoneme (g2p) conversion, used to estimate the pronunciations of out-of-vocabulary (OOV) words, is a highly important part of recognition systems, as well as text-to-speech systems. The current state-of-the-art approach in g2p conversion is structured learning based on the Margin Infused Relaxed Algorithm (MIRA), which is an online discriminative training method for multiclass classification. However, it is known that the aggressive weight update method of MIRA is prone to overfitting, even if the current example is an outlier or noisy. Adaptive Regularization of Weight Vectors (AROW) has been proposed to resolve this problem for binary classification. In addition, AROW's update rule is simpler and more efficient than that of MIRA, allowing for more efficient training. Although AROW has these advantages, it has not been applied to g2p conversion yet. In this paper, we present the first application of AROW to the g2p conversion task, which is a structured learning problem. In an evaluation that employed a dataset generated from the collective knowledge on the Web, our proposed approach achieves a 6.8% error reduction rate compared to MIRA in terms of phoneme error rate. In addition, the learning time of our proposed approach was shorter than that of MIRA on almost all datasets.

  • Cooperative Bayesian Compressed Spectrum Sensing for Correlated Wideband Signals

    Honggyu JUNG  Kwang-Yul KIM  Yoan SHIN  

     
    LETTER-Communication Theory and Signals

    Vol: E97-A No:6  Page(s): 1434-1438

    We propose a cooperative compressed spectrum sensing scheme for correlated signals in wideband cognitive radio networks. In order to design a reconstruction algorithm which accurately recovers the wideband signals from the compressed samples in low SNR (Signal-to-Noise Ratio) environments, we consider the multiple measurement vector model exploiting a sequence of input signals and propose a cooperative sparse Bayesian learning algorithm which models the temporal correlation of the input signals. Simulation results show that the proposed scheme outperforms existing compressed sensing algorithms for low SNRs.

  • Real Time Spectroscopic Observation of Contact Surfaces Being Eroded by Break Arcs

    Masato NAKAMURA  Junya SEKIKAWA  

     
    PAPER-Electromechanical Devices and Components

    Vol: E97-C No:6  Page(s): 592-598

    Break arcs are generated in a 48-V DC, 12-A resistive circuit. Silver electrical contacts are separated at a constant opening speed. The cathode contact surface is irradiated by a blue LED. The center wavelength of the emission of the LED is 470nm. There is no spectral line of the light emitted from the break arcs. Only the images of the contact surface are observed by a high-speed camera and an optical band pass filter. Another high-speed camera observes only the images of the break arc. Time evolutions of the cathode surface morphology being eroded by the break arcs and the motion of the break arcs are observed with these cameras simultaneously. The images of the cathode surface are investigated by an image analysis technique. The results show the moments at which the expanded regions on the cathode surface are formed during the occurrence of the break arcs. In addition, it is shown that the expanded regions are not in direct contact with the cathode roots of the break arcs.

  • A Single Opamp Third-Order Low-Distortion Delta-Sigma Modulator with SAR Quantizer Embedded Passive Adder

    I-Jen CHAO  Ching-Wen HOU  Bin-Da LIU  Soon-Jyh CHANG  Chun-Yueh HUANG  

     
    PAPER

    Vol: E97-C No:6  Page(s): 526-537

    A third-order low-distortion delta-sigma modulator (DSM), whose third-order noise-shaping ability is achieved by just a single opamp, is proposed. Since only one amplifier is required in the whole circuit, the designed DSM is very power efficient. To realize the adder in front of the quantizer without employing a power-hungry opamp, a capacitive passive adder, which is the digital-to-analog converter (DAC) array of a successive-approximation-type quantizer, is used. In addition, the feedback path timing is extended from a nonoverlapping interval for the conventional low-distortion structure to half of the clock period, so that the strict operation timing issue with regard to quantization and the dynamic element matching (DEM) logic operation can be solved. In the proposed DSM structure, the features of the unity-gain signal transfer function (STF) and finite-impulse-response (FIR) noise transfer function (NTF) are still preserved, and thus advantages such as a relaxed opamp slew rate and reduced output swing are also maintained, as with the conventional low-distortion DSM. Moreover, the memory effect in the proposed DSM is analyzed when employing opamp sharing for integrators. The proposed third-order DSM with a 4-bit SAR ADC as the quantizer is implemented in a 90-nm CMOS process. The post-layout simulations show a 79.8-dB signal-to-noise and distortion ratio (SNDR) in the 1.875-MHz signal bandwidth (OSR=16). The active area of the circuit is 0.35 mm² and the total power consumption is 2.85mW, resulting in a figure of merit (FOM) of 95 fJ/conversion-step.
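    The figure of merit quoted in the abstract above can be cross-checked with a short calculation. The sketch below assumes the commonly used Walden definition, FOM = P / (2^ENOB × 2 × BW), with ENOB = (SNDR − 1.76) / 6.02; the abstract does not state which definition the authors used, and the function names are hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, sndr_db, bandwidth_hz):
    """Walden figure of merit in joules per conversion step.

    Assumption: this is the definition behind the abstract's number;
    the abstract itself does not say so.
    """
    return power_w / (2 ** enob_from_sndr(sndr_db) * 2 * bandwidth_hz)


# Values taken from the abstract: 2.85 mW, 79.8 dB SNDR, 1.875 MHz bandwidth.
fom = walden_fom(2.85e-3, 79.8, 1.875e6)
print(f"FOM = {fom * 1e15:.1f} fJ/conversion-step")

# Sampling rate implied by OSR = 16 and the 1.875 MHz signal bandwidth.
sampling_rate_hz = 2 * 16 * 1.875e6
print(f"fs = {sampling_rate_hz / 1e6:.0f} MHz")
```

    Under these assumptions the result lands close to the 95 fJ/conversion-step reported in the abstract, which suggests (but does not prove) that the Walden form was used.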

  • Automatic Vocabulary Adaptation Based on Semantic and Acoustic Similarities

    Shoko YAMAHATA  Yoshikazu YAMAGUCHI  Atsunori OGAWA  Hirokazu MASATAKI  Osamu YOSHIOKA  Satoshi TAKAHASHI  

     
    PAPER-Speech Recognition

    Vol: E97-D No:6  Page(s): 1488-1496

    Recognition errors caused by out-of-vocabulary (OOV) words lead to critical problems when developing spoken language understanding systems based on automatic speech recognition technology. Automatic vocabulary adaptation is an essential technique for solving these problems. In this paper, we propose a novel and effective automatic vocabulary adaptation method. Our method selects OOV words from relevant documents using combined scores of semantic and acoustic similarities. Using this combined score that reflects both semantic and acoustic aspects, only necessary OOV words can be selected without registering redundant words. In addition, our method estimates probabilities of OOV words using semantic similarity and a class-based N-gram language model. These probabilities will be appropriate since they are estimated by considering both frequencies of OOV words in target speech data and the stable class N-gram probabilities. Experimental results show that our method improves OOV selection accuracy and recognition accuracy of newly registered words in comparison with conventional methods.

  • Weakened Anonymity of Group Signature and Its Application to Subscription Services

    Kazuto OGAWA  Go OHTAKE  Arisa FUJII  Goichiro HANAOKA  

     
    PAPER

    Vol: E97-A No:6  Page(s): 1240-1258

    For the sake of privacy preservation, services that are offered with reference to individual user preferences should do so with a sufficient degree of anonymity. We surveyed various tools that meet requirements of such services and decided that group signature schemes with weakened anonymity (without unlinkability) are adequate. Then, we investigated a theoretical gap between unlinkability of group signature schemes and their other requirements. We show that this gap is significantly large. Specifically, we clarify that if unlinkability can be achieved from any other property of group signature schemes, it becomes possible to construct a chosen-ciphertext secure cryptosystem from any one-way function. This result implies that the efficiency of group signature schemes can be drastically improved if unlinkability is not taken into account. We also demonstrate a way to construct a scheme without unlinkability that is significantly more efficient than the best known full-fledged scheme.

  • Adaptive Spectral Masking of AVQ Coding and Sparseness Detection for ITU-T G.711.1 Annex D and G.722 Annex B Standards

    Masahiro FUKUI  Shigeaki SASAKI  Yusuke HIWASAKI  Kimitaka TSUTSUMI  Sachiko KURIHARA  Hitoshi OHMURO  Yoichi HANEDA  

     
    PAPER-Speech and Hearing

    Vol: E97-D No:5  Page(s): 1264-1272

    We propose a new adaptive spectral masking method of algebraic vector quantization (AVQ) for non-sparse signals in the modified discrete cosine transform (MDCT) domain. This paper also proposes switching the adaptive spectral masking on and off depending on whether or not the target signal is non-sparse. The switching decision is based on the results of MDCT-domain sparseness analysis. When the target signal is categorized as non-sparse, the masking level of the target MDCT coefficients is adaptively controlled using spectral envelope information. The performance of the proposed method, as a part of ITU-T G.711.1 Annex D, is evaluated in comparison with conventional AVQ. Subjective listening test results showed that the proposed method improves sound quality by more than 0.1 points on a five-point scale on average for speech, music, and mixed content, which indicates significant improvement.

  • An Efficient Strategy for Bit-Quad-Based Euler Number Computing Algorithm

    Bin YAO  Hua WU  Yun YANG  Yuyan CHAO  Atsushi OHTA  Haruki KAWANAKA  Lifeng HE  

     
    LETTER-Pattern Recognition

    Vol: E97-D No:5  Page(s): 1374-1378

    The Euler number of a binary image is an important topological property for pattern recognition, and can be calculated by counting certain bit-quads in the image. This paper proposes an efficient strategy for improving the bit-quad-based Euler number computing algorithm. By use of the information obtained when processing the previous bit quad, the number of times that pixels must be checked in processing a bit quad decreases from 4 to 2. Experiments demonstrate that an algorithm with our strategy significantly outperforms conventional Euler number computing algorithms.
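    For readers unfamiliar with bit-quad counting, the sketch below shows the conventional baseline that the abstract above refers to: every 2×2 window (bit quad) is inspected in full, and the Euler number follows from the counts of windows with one foreground pixel (Q1), three foreground pixels (Q3), and two diagonal foreground pixels (QD). This is the classical formula usually attributed to Gray, not the paper's optimized strategy, which the abstract does not describe in enough detail to reproduce. The function name and the list-of-lists image format are assumptions made for illustration.

```python
def euler_number(image, connectivity=8):
    """Euler number of a binary image by classical bit-quad counting.

    image: list of equal-length rows of 0/1 values (hypothetical input format).
    connectivity: 4 or 8, the connectivity assumed for foreground pixels.
    """
    rows, cols = len(image), len(image[0])
    # Pad with a background border so that every foreground pixel
    # appears in exactly four 2x2 windows.
    padded = [[0] * (cols + 2)]
    padded += [[0] + [1 if v else 0 for v in row] + [0] for row in image]
    padded.append([0] * (cols + 2))

    q1 = q3 = qd = 0
    for r in range(rows + 1):
        for c in range(cols + 1):
            top_left, top_right = padded[r][c], padded[r][c + 1]
            bottom_left, bottom_right = padded[r + 1][c], padded[r + 1][c + 1]
            total = top_left + top_right + bottom_left + bottom_right
            if total == 1:
                q1 += 1
            elif total == 3:
                q3 += 1
            elif total == 2 and top_left == bottom_right:
                # Two foreground pixels on the same diagonal.
                qd += 1

    if connectivity == 8:
        return (q1 - q3 - 2 * qd) // 4
    return (q1 - q3 + 2 * qd) // 4


ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
diagonal = [[1, 0],
            [0, 1]]
print(euler_number(ring))         # one component, one hole
print(euler_number(diagonal, 8))  # one 8-connected component
print(euler_number(diagonal, 4))  # two 4-connected components
```

    This baseline reads all four pixels of each window; the abstract's claim is that reusing information from the previous bit quad reduces that to two pixel checks per window.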

  • Area-Efficient Microarchitecture for Reinforcement of Turbo Mode

    Shinobu MIWA  Takara INOUE  Hiroshi NAKAMURA  

     
    PAPER-Computer System

    Vol: E97-D No:5  Page(s): 1196-1210

    Turbo mode, which accelerates many applications without major changes to existing systems, is widely used in commercial processors. Since the duration or strength of turbo mode depends on the peak temperature of a processor chip, reducing the peak temperature can reinforce turbo mode. This paper shows that adding a small amount of hardware allows microprocessors to reduce the peak temperature drastically and thereby reinforce turbo mode successfully. Our approach is to identify a few small units that become heat sources in a processor and to appropriately duplicate them to reduce their power density. By duplicating the limited units and using the copies evenly, the processor can show significant performance improvement while remaining area-efficient. The experimental results show that the proposed method achieves a performance improvement of up to 14.5% in exchange for a 2.8% area increase.

  • A Novel Method of Deinterleaving Pulse Repetition Interval Modulated Sparse Sequences in Noisy Environments

    Mahmoud KESHAVARZI  Delaram AMIRI  Amir Mansour PEZESHK  Forouhar FARZANEH  

     
    LETTER-Digital Signal Processing

    Vol: E97-A No:5  Page(s): 1136-1139

    This letter presents a novel method, based on sparsity, to solve the problem of deinterleaving pulse trains. The proposed method models the problem of deinterleaving pulse trains as an underdetermined system of linear equations. After determining the mixing matrix, we find the sparsest solution of the underdetermined system of linear equations using basis pursuit denoising. This method is superior to previous ones in a number of aspects. First, spurious and missing pulses do not cause any performance reduction in the algorithm. Second, the algorithm works well regardless of the type of pulse repetition interval modulation that is used. Third, the proposed method is able to separate similar sources.

  • Music Signal Separation Based on Supervised Nonnegative Matrix Factorization with Orthogonality and Maximum-Divergence Penalties

    Daichi KITAMURA  Hiroshi SARUWATARI  Kosuke YAGI  Kiyohiro SHIKANO  Yu TAKAHASHI  Kazunobu KONDO  

     
    LETTER-Engineering Acoustics

    Vol: E97-A No:5  Page(s): 1113-1118

    In this letter, we address monaural source separation based on supervised nonnegative matrix factorization (SNMF) and propose a new penalized SNMF. Conventional SNMF often degrades the separation performance owing to the basis-sharing problem. Our penalized SNMF forces nontarget bases to become different from the target bases, which increases the separated sound quality.
