Keyword Search Result

[Keyword] SPE (2504 hits)

Hits 1041-1060 of 2504

  • Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    Prakasith KAYASITH  Thanaruk THEERAMUNKONG  

     
    PAPER-Speech and Hearing

    Vol: E92-D  No: 3  Page(s): 460-468

    Measuring the severity of dysarthria by manually evaluating a speaker's speech with the available standard assessment methods, which are based on human perception, is a tedious and subjective task. This paper presents an automated approach to assessing the speech quality of a dysarthric speaker with cerebral palsy. Considering two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a given word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and to forecast the speech recognition rate for an individual dysarthric speaker before an exhaustive implementation of an automatic speech recognition system for that speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluation compares its predicted recognition rates with those predicted by the standard methods, namely the articulatory and intelligibility tests, using two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were conducted on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.

  • Transition Edge Sensor-Energy Dispersive Spectrometer (TES-EDS) and Its Applications (Open Access)

    Keiichi TANAKA  Akikazu ODAWARA  Atsushi NAGATA  Yukari BABA  Satoshi NAKAYAMA  Shigenori AIDA  Toshimitsu MOROOKA  Yoshikazu HOMMA  Izumi NAKAI  Kazuo CHINONE  

     
    INVITED PAPER

    Vol: E92-C  No: 3  Page(s): 334-340

    The Transition Edge Sensor (TES)-Energy Dispersive Spectrometer (EDS) is an X-ray detector with high energy resolution (12.8 eV). The TES can be mounted on a scanning electron microscope (SEM). The TES-EDS is based on a cryogen-free dilution refrigerator. The high energy resolution enables analysis of the distribution of various elements in samples under low acceleration voltage (typically under 5 keV) by using the K lines of light elements and the M lines of heavy elements. For example, the energy of the arsenic L line differs from that of the magnesium K line by 28 eV. When used to analyze the spore of the Pteris vittata L. plant, the TES-EDS clearly reveals different distributions of As and Mg in the micro region of the plant. The TES-EDS with SEM yields detailed information about the distribution of multiple elements in a sample.

  • A High Precision Ranging Scheme for IEEE802.15.4a Chirp Spread Spectrum System

    Na Young KIM  Sujin KIM  Youngok KIM  Joonhyuk KANG  

     
    LETTER-Sensing

    Vol: E92-B  No: 3  Page(s): 1057-1061

    This letter proposes a high-precision ranging scheme based on the time-of-arrival estimation technique for the IEEE 802.15.4a chirp spread spectrum system. The proposed scheme consists of a linear channel impulse response estimation process using the zero-forcing or minimum mean square error technique and a multipath delay estimation process using the matrix pencil algorithm. The performance of the proposed scheme is compared with that of the well-known MUSIC algorithm in terms of computational complexity and ranging precision. Simulation results demonstrate that the proposed scheme outperforms the MUSIC algorithm even though it has comparatively lower computational complexity.

  • HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation

    Takashi NOSE  Makoto TACHIBANA  Takao KOBAYASHI  

     
    PAPER-Speech and Hearing

    Vol: E92-D  No: 3  Page(s): 489-497

    This paper presents methods for controlling the intensity of emotional expressions and speaking styles of an arbitrary speaker's synthetic speech by using a small amount of his/her speech data in HMM-based speech synthesis. Model adaptation approaches are introduced into the style control technique based on the multiple-regression hidden semi-Markov model (MRHSMM). Two different approaches are proposed for training a target speaker's MRHSMMs. The first one is MRHSMM-based model adaptation in which the pretrained MRHSMM is adapted to the target speaker's model. For this purpose, we formulate the MLLR adaptation algorithm for the MRHSMM. The second method utilizes simultaneous adaptation of speaker and style from an average voice model to obtain the target speaker's style-dependent HSMMs which are used for the initialization of the MRHSMM. From the result of subjective evaluation using adaptation data of 50 sentences of each style, we show that the proposed methods outperform the conventional speaker-dependent model training when using the same size of speech data of the target speaker.

  • Training Set Selection for Building Compact and Efficient Language Models

    Keiji YASUDA  Hirofumi YAMAMOTO  Eiichiro SUMITA  

     
    PAPER-Natural Language Processing

    Vol: E92-D  No: 3  Page(s): 506-511

    Statistical language model training requires corpora matched to the target domain. However, training corpora sometimes include both sentences that match the target domain and sentences that do not. In such a case, training set selection is effective both for reducing model size and for improving model performance. This paper describes a training set selection method for statistical language model training. The method provides two advantages for training a language model. One is its capacity to improve the language model performance, and the other is its capacity to reduce computational loads for the language model. The method has four steps. 1) Sentence clustering is applied to all available corpora. 2) Language models are trained on each cluster. 3) Perplexity on the development set is calculated using the language models. 4) For the final language model training, we use the clusters whose language models yield low perplexities. The experimental results indicate that the language model trained on the data selected by our method gives lower perplexity on an open test set than a language model trained on all available corpora.
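
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the clustering of step 1 has already been done and is passed in as `clusters`, it substitutes an add-one-smoothed unigram model for whatever language model the paper uses, and the function names, the `keep` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Step 2: count word frequencies over a list of tokenized sentences."""
    counts = Counter(word for sent in sentences for word in sent)
    return counts, sum(counts.values())


def perplexity(model, sentences, vocab_size):
    """Step 3: perplexity of an add-one-smoothed unigram model (assumed model type)."""
    counts, total = model
    log_prob, n_words = 0.0, 0
    for sent in sentences:
        for word in sent:
            log_prob += math.log((counts[word] + 1) / (total + vocab_size))
            n_words += 1
    return math.exp(-log_prob / n_words)


def select_training_set(clusters, dev_set, keep=1):
    """Steps 2-4: score each cluster's model on the development set and
    pool the sentences of the `keep` lowest-perplexity clusters."""
    vocab = {w for c in clusters for s in c for w in s}
    vocab |= {w for s in dev_set for w in s}
    scored = sorted(
        (perplexity(train_unigram(cluster), dev_set, len(vocab)), idx)
        for idx, cluster in enumerate(clusters)
    )
    chosen = [idx for _, idx in scored[:keep]]
    selected = [sent for idx in chosen for sent in clusters[idx]]
    return chosen, selected


# Toy data (hypothetical): cluster 0 matches the development domain, cluster 1 does not.
clusters = [
    [["book", "a", "hotel", "room"], ["book", "a", "flight", "ticket"]],
    [["the", "stock", "market", "fell"], ["the", "market", "rose", "today"]],
]
dev_set = [["book", "a", "room"], ["a", "flight", "ticket"]]

chosen, selected = select_training_set(clusters, dev_set, keep=1)
print(chosen)  # [0]
```

In practice the final language model would then be trained on `selected`; how many clusters to keep, or which perplexity threshold to use, is a tuning choice the abstract does not specify.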

  • Consolidation-Based Speech Translation and Evaluation Approach

    Chiori HORI  Bing ZHAO  Stephan VOGEL  Alex WAIBEL  Hideki KASHIOKA  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

    Vol: E92-D  No: 3  Page(s): 477-488

    The performance of speech translation systems combining automatic speech recognition (ASR) and machine translation (MT) systems is degraded by redundant and irrelevant information caused by speaker disfluency and recognition errors. This paper proposes a new approach to translating speech recognition results through speech consolidation, which removes ASR errors and disfluencies and extracts meaningful phrases. The consolidation approach is spun off from speech summarization by word extraction from ASR 1-best. We extended the consolidation approach to confusion networks (CNs), tested its performance using TED speech, and confirmed that the consolidation results preserved more meaningful phrases than the original ASR results. We then applied the consolidation technique to speech translation. To test the performance of consolidation-based speech translation, Chinese broadcast news (BN) speech in RT04 was recognized, consolidated, and then translated. The speech translation results obtained via consolidation cannot be directly compared with gold standards in which all words in the speech are translated, because consolidation-based translations are partial translations. We therefore propose a new evaluation framework for partial translations that compares them with the most similar set of words extracted from a word network created by merging gradual summarizations of the gold standard translation. The performance of consolidation-based MT results was evaluated using BLEU. We also propose Information Preservation Accuracy (IPAccy) and Meaning Preservation Accuracy (MPAccy) to evaluate consolidation and consolidation-based MT. We confirmed that consolidation contributed to the performance of speech translation.

  • High Tc SQUID Detector for Magnetic Metallic Particles in Products (Open Access)

    Saburo TANAKA  Tomonori AKAI  Yoshimi HATSUKADE  Shuichi SUZUKI  

     
    INVITED PAPER

    Vol: E92-C  No: 3  Page(s): 323-326

    A high-Tc superconducting quantum interference device (SQUID) is an ultra-sensitive magnetic sensor. Since the discovery of high-Tc superconducting materials, the performance of the high-Tc SQUID has been improved and stabilized. One strong candidate application is a system for detecting magnetic foreign matter in industrial products. Ultra-small metallic foreign matter may be accidentally mixed into industrial products such as lithium ion batteries. If this happens, the manufacturer of the product suffers a great loss in recalling products. Metallic particles with outer dimensions of less than 100 microns cannot be detected using X-ray imaging, which is commonly used for inspection. Therefore, a highly sensitive detection system for small foreign matter is required. We developed detection systems based on a high-Tc SQUID for industrial products. We successfully detected small iron particles of less than 50 microns on a belt conveyor. These detection levels are hard to achieve using conventional X-ray detection or other methods.

  • Real-Time Spectral Moments Estimation and Ground Clutter Suppression for Precipitation Radar with High Resolution

    Eiichi YOSHIKAWA  Tomoaki MEGA  Takeshi MORIMOTO  Tomoo USHIO  Zen KAWASAKI  

     
    PAPER-Sensing

    Vol: E92-B  No: 2  Page(s): 578-584

    The purpose of this study is the real-time estimation of Doppler spectral moments for precipitation in the presence of ground clutter overlap. The proposed method is a frequency domain approach that uses a Gaussian model both to remove the clutter spectrum and to estimate the weather spectrum. The main advantage of this method is that it does not require processes such as multiple fitting procedures and therefore can estimate profiles of precipitation in a short processing time. This makes the method efficient for real-time radar observation with high range and time resolution. The performance of this method is evaluated based on simulation data and observation data acquired by the Ku-band broad band radar (BBR) [1].

  • Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Support Vector Machine

    Sang-Kyun KIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

    Vol: E92-A  No: 2  Page(s): 630-632

    In this letter, we propose a novel approach to speech/music classification based on the support vector machine (SVM) to improve the performance of the 3GPP2 selectable mode vocoder (SMV) codec. We first analyze the features and the classification method used in the real-time speech/music classification algorithm of the SMV, and then apply the SVM for enhanced speech/music classification. To evaluate performance, we compare the proposed algorithm with the traditional algorithm of the SMV. The proposed system is evaluated under various environments and shows better performance than the original method in the SMV.

  • Simulation of SAR in the Human Body to Determine Effects of RF Heating

    Tetsuyuki MICHIYAMA  Yoshio NIKAWA  

     
    LETTER

    Vol: E92-B  No: 2  Page(s): 440-444

    The body area network (BAN) has attracted attention because of its potential for high-grade wireless communication technology and its safety and high durability. Also, human area transmission of a BAN propagating at an ultra-wide band (UWB) has been demonstrated recently. When considering the efficiency of electromagnetic (EM) propagation inside the human body for BAN and hyperthermia treatment using RF, it is important to determine the mechanism of EM dissipation in the human body. A body heating system for hyperthermia must deposit EM energy deep inside the body. Also, it is important that the EM field generated by the implant system is sufficiently strong. In this study, the specific absorption rate (SAR) distribution is simulated using an EM simulator to consider the biological transmission mechanism and its effects. To utilize the EM field distribution using an implant system for hyperthermia treatment, the SAR distribution inside the human body is simulated. As a result, the SAR distribution is concentrated on the surface of human tissue, the muscle-bolus interface, the pancreas, the stomach, the spleen and the regions around bones. It can also be concentrated in bone marrow and cartilage. From these results, the appropriate location for the implant system is revealed on the basis of the current distribution and differences in the wave impedance of interfacing tissues. The possibility of accurate data transmission and suitable treatment planning is confirmed.

  • SAR Computation inside Fetus by RF Coil during MR Imaging Employing Realistic Numerical Pregnant Woman Model

    Satoru KIKUCHI  Kazuyuki SAITO  Masaharu TAKAHASHI  Koichi ITO  Hiroo IKEHIRA  

     
    PAPER

    Vol: E92-B  No: 2  Page(s): 431-439

    This paper presents computational electromagnetic dosimetry inside anatomically based pregnant woman models exposed to electromagnetic waves during magnetic resonance imaging. Two types of pregnant woman models, corresponding to early gestation and 26 weeks of gestation, were used for this study. The specific absorption rate (SAR) in and around the fetus was calculated for electromagnetic waves radiated from high-pass and low-pass birdcage coils. Numerical calculation results showed that a high-SAR region is observed in the body in the vicinity of the gaps of the coil, and that it is related to the concentrated electric field in gaps of the human body such as the armpit and thigh. Moreover, it was confirmed that the SAR in the fetus is less than the International Electrotechnical Commission limit of 10 W/kg when the whole-body average SARs are 2 W/kg and 4 W/kg, which correspond to the normal operating mode and the first level controlled operating mode, respectively.

  • A Variable Break Prediction Method Using CART in a Japanese Text-to-Speech System

    Deok-Su NA  Myung-Jin BAE  

     
    LETTER-Speech and Hearing

    Vol: E92-D  No: 2  Page(s): 349-352

    Break prediction is an important step in text-to-speech systems, as break indices (BIs) have a great influence on how correctly prosodic phrase boundaries are represented. However, accurate prediction is difficult since BIs are often chosen according to the meaning of a sentence or the reading style of the speaker. In Japanese, the prediction of an accentual phrase boundary (APB) and a major phrase boundary (MPB) is particularly difficult. Thus, this paper presents a method to compensate for the prediction errors of an APB and MPB. First, we define a subtle BI, for which it is difficult to decide clearly between an APB and an MPB, as a variable break (VB), and an explicit BI as a fixed break (FB). The VB is chosen using the classification and regression tree (CART), and multiple prosodic targets in relation to the pitch and duration are then generated. Finally, unit selection is conducted using the multiple prosodic targets. The experimental results show that the proposed method improves the naturalness of synthesized speech.

  • A 5-bit 4.2-GS/s Flash ADC in 0.13-µm CMOS Process (Open Access)

    Ying-Zu LIN  Soon-Jyh CHANG  Yen-Ting LIU  

     
    PAPER-Electronic Circuits

    Vol: E92-C  No: 2  Page(s): 258-268

    This paper investigates and analyzes the resistive averaging network and interpolation technique to estimate the power consumption of preamplifier arrays in a flash analog-to-digital converter (ADC). By comparing the relative power consumption of various configurations, flash ADC designers can select the most power-efficient architecture when the operation speed and resolution of a flash ADC are specified. Based on the quantitative analysis, a compact 5-bit flash ADC is designed and fabricated in a 0.13-µm CMOS process. The proposed ADC consumes 180 mW from a 1.2-V supply and occupies an active area of 0.16 mm². Operating at 3.2 GS/s, the ENOB is 4.44 bits and the ERBW is 1.65 GHz. At 4.2 GS/s, the ENOB is 4.20 bits and the ERBW is 1.75 GHz. This ADC achieves FOMs of 2.59 and 2.80 pJ/conversion-step at 3.2 and 4.2 GS/s, respectively.
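
The abstract does not state how the FOM is computed. As a rough check, the sketch below assumes a Walden-style figure of merit, P / (2^ENOB × min(fs, 2·ERBW)); under that assumption the power, ENOB, sampling rate, and ERBW figures quoted above reproduce both reported values. The function name and the choice of `min(fs, 2*ERBW)` as the effective rate are assumptions, not something the source confirms.

```python
def walden_fom(power_w, enob_bits, fs_hz, erbw_hz):
    """Energy per conversion step in joules.

    Assumed definition: P / (2**ENOB * min(fs, 2*ERBW)).
    """
    effective_rate = min(fs_hz, 2.0 * erbw_hz)
    return power_w / (2.0 ** enob_bits * effective_rate)


# Figures taken from the abstract: 180 mW; (3.2 GS/s, 4.44 bit, 1.65 GHz)
# and (4.2 GS/s, 4.20 bit, 1.75 GHz).
fom_3p2 = walden_fom(0.180, 4.44, 3.2e9, 1.65e9)
fom_4p2 = walden_fom(0.180, 4.20, 4.2e9, 1.75e9)

print(f"{fom_3p2 * 1e12:.2f} pJ/conversion-step")  # 2.59
print(f"{fom_4p2 * 1e12:.2f} pJ/conversion-step")  # 2.80
```

At 3.2 GS/s the sampling rate is the smaller term (3.2 GHz versus 2 × 1.65 GHz), while at 4.2 GS/s twice the ERBW is (3.5 GHz versus 4.2 GHz), which is why the second FOM is higher despite the faster clock.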

  • A Filter Method for Feature Selection for SELDI-TOF Mass Spectrum

    Trung-Nghia VU  Syng-Yup OHN  

     
    LETTER-Pattern Recognition

    Vol: E92-D  No: 2  Page(s): 346-348

    We propose a new filter method for feature selection for SELDI-TOF mass spectrum datasets. In the method, a new relevance index is defined to represent the goodness of a feature by considering the distribution of samples based on counts. The relevance index can be used to obtain the feature sets for classification. Because it is based on simple counting of samples, our method can be applied to mass spectrum datasets with extremely high dimensions and can process clinical datasets of practical size in acceptable calculation time. The new method was applied to three public mass spectrum datasets and showed results better than or comparable to those of conventional filter methods.

  • A Novel Search and Selection Method for Spreading Code of UWB System and Its Application to IEEE 802.15.4a IR-UWB System

    Daegun OH  Jong-Wha CHONG  

     
    LETTER-Wireless Communication Technologies

    Vol: E92-B  No: 2  Page(s): 675-678

    In this paper, we propose a novel search and selection method for the spreading code set of a UWB system and apply it to the IEEE 802.15.4a IR-UWB system. To find a spreading code with a low spectral peak to average ratio (SPAR) and a good auto-correlation property, the proposed method searches for spreading codes in the frequency domain based on the time-frequency relation of the spreading code. Using evaluation parameters, we selected a code set whose SPAR is reduced by about 1.1079 dB, whose Golay merit factor (GMF) is improved by 49%, and whose modified Golay merit factor (MGMF) is almost the same, compared to the code set used as preambles for the IR-UWB system.
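
The abstract does not define the GMF it evaluates. For orientation, the sketch below uses the commonly cited definition of the Golay merit factor, the squared autocorrelation peak divided by twice the aperiodic sidelobe energy, and applies it to the length-13 Barker code. The Barker code is an illustrative input only and is not one of the codes studied in the letter; the function names are hypothetical, and the letter's MGMF and SPAR measures are not reproduced here.

```python
def aperiodic_autocorrelation(code, shift):
    """Aperiodic autocorrelation of a real-valued code at a non-negative shift."""
    return sum(code[i] * code[i + shift] for i in range(len(code) - shift))


def golay_merit_factor(code):
    """Assumed definition: C(0)**2 / (2 * sum of C(k)**2 for k = 1..N-1).

    Raises ZeroDivisionError if every sidelobe is zero (e.g. a length-1 code).
    """
    peak = aperiodic_autocorrelation(code, 0)
    sidelobe_energy = sum(
        aperiodic_autocorrelation(code, k) ** 2 for k in range(1, len(code))
    )
    return peak ** 2 / (2 * sidelobe_energy)


# Length-13 Barker code, used here only as a well-known test input.
barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
print(round(golay_merit_factor(barker13), 2))  # 14.08
```

A higher merit factor means less energy in the autocorrelation sidelobes relative to the peak, which is the sense in which the abstract treats a GMF improvement as a better auto-correlation property.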

  • SA and SAR Analysis for Wearable UWB Body Area Applications

    Qiong WANG  Jianqing WANG  

     
    PAPER

    Vol: E92-B  No: 2  Page(s): 425-430

    With the rapid progress of electronic and information technology, expectations have risen for the realization of a body area network (BAN) by means of ultra wide band (UWB) techniques. Although the signal from a single UWB device is very low, the energy absorption may increase significantly when many UWB devices are simultaneously attached to a human body. An analysis method is therefore required from the point of view of biological safety evaluation. In this study, two approaches, one in the time domain and the other in the frequency domain, are proposed for calculating the specific energy absorption (SA) and the specific absorption rate (SAR). It is shown that the two approaches have the same accuracy but that the time-domain approach is more straightforward in the numerical analysis. Using the time-domain approach, SA and SAR calculation results are given for multiple UWB pulse exposure of an anatomical human body model under the Federal Communications Commission (FCC) UWB limit.

  • Analysis and Uniform Design of a Single-Layer Slotted Waveguide Array Antenna with Baffles

    Takehito SUZUKI  Jiro HIROKAWA  Makoto ANDO  

     
    PAPER-Devices/Circuits for Communications

    Vol: E92-B  No: 1  Page(s): 150-158

    This paper presents a formulation for evaluating external coupling in the alternating-phase feed single-layer slotted waveguide array antenna with baffles by using the Spectrum of Two-Dimensional Solutions (S2DS) method. A one-dimensional slot array is extracted from the array by assuming periodicity in the transverse direction and introducing perfect electric conductors in the external region. Uniform excitation over the finite array is synthesized iteratively to demonstrate the fast and accurate results obtained by S2DS. A unit design model with the baffles is introduced to determine initial parameters of the slot pair that accelerate the iteration. An experiment at 25.3 GHz demonstrates good uniformity of the aperture field distribution as well as the effects of the baffles. The directivity is 32.7 dB, which corresponds to an aperture efficiency of 90.5%, and the reflection is below -15.0 dB over 1.3 GHz.

  • A Subtractive-Type Speech Enhancement Using the Perceptual Frequency-Weighting Function

    Seiji HAYASHI  Hiroyuki INUKAI  Masahiro SUGUIMOTO  

     
    PAPER-Speech and Hearing

    Vol: E92-A  No: 1  Page(s): 226-234

    The present paper describes quality enhancement of speech corrupted by an additive background noise in a single-channel system. The proposed approach is based on the introduction of a perceptual criterion using a frequency-weighting filter in a subtractive-type enhancement process. Although this subtractive-type method is very attractive because of its simplicity, it produces an unnatural and unpleasant residual noise. Thus, it is difficult to select fixed optimized parameters for all speech and noise conditions. A new and effective algorithm is thus developed based on the masking properties of the human ear. This newly developed algorithm allows for an automatic adaptation in the time and frequency of the enhancement system and determines a suitable noise estimate according to the frequency of the noisy input speech. Experimental results demonstrate that the proposed approach can efficiently remove additive noise related to various kinds of noise corruption.

  • Realtime Joint Speech Coding and Transmission Algorithm for High Packet Loss Rate Wireless Channels

    Tan PENG  Huijuan CUI  Kun TANG  Wei MIAO  

     
    LETTER-Speech and Hearing

    Vol: E91-D  No: 12  Page(s): 2892-2896

    In digital speech communication over noisy wireless channels with high packet loss rates, improving the overall performance of the realtime speech coding and transmission system is of great importance. A novel joint speech coding and transmission algorithm is proposed that fully exploits the correlation between speech coding, channel coding, and the transmission process. The proposed algorithm introduces no algorithmic delay and requires less bandwidth expansion, while greatly enhancing the error correcting performance and the reconstructed speech quality compared with conventional algorithms. Simulations show that the residual error rate is reduced by 84.36% and the MOS (Mean Opinion Score) is improved by over 38.86%.

  • An Experimental Study of Head Instabilities in TMR Sensors for Magnetic Recording Heads with Adaptive Flying Height

    Damrongsak TONGSOMPORN  Nitin AFZULPURKAR  Brent BARGMANN  Lertsak LEKAWAT  Apirat SIRITARATIWAT  

     
    PAPER-Storage Technology

    Vol: E91-C  No: 12  Page(s): 1958-1965

    We conducted an experimental study to investigate the effect of the thermal stress due to the heater used for adjusting adaptive flying height (AFH) on the readability and instability of tunneling magnetoresistance (TMR) sensors. The slider head contains a small heater near the read/write elements for controlling the clearance between the read/write elements and the recording medium of the magnetic recording system. We report for the first time that the thermal stress from the AFH heater induces instabilities and causes head degradation. The thermal stress degrades the reader performance by inducing voltage fluctuations and large noise spikes that cause the magnetic recording system to have a poor bit error rate (BER). The open loop of the transfer curve indicates that the flipping of a synthetic antiferromagnet (SAF) edge magnetization causes these instabilities. The thermal stress reduces the exchange bias field and the energy barrier to flop the SAF edge magnetization. The dispersion and thermal stability of the antiferromagnetic (AFM) layer are the potential root causes of these SAF instabilities, because the larger AFM dispersion in these heads gives a smaller net stabilizing field to the SAF layers, which lowers the energy barrier to flop the SAF edge magnetization. Scanning electron microscope (SEM) images of these weak heads show a rough surface and scratches close to the sensor element. The mechanical stress due to these scratches may additionally affect the stabilizing field of the SAF.