The search functionality is under construction.

Keyword Search Result

[Keyword] SPE(2504hit)

141-160hit(2504hit)

  • A Node-Grouping Based Spatial Spectrum Reuse Method for WLANs in Dense Residential Scenarios

    Jin LIU  Masahide HATANAKA  Takao ONOYE  

     
    PAPER-Mobile Information Network and Personal Communications

      Vol:
    E103-A No:7
      Page(s):
    917-927

    Lately, an increasing number of wireless local area network (WLAN) access points (APs) are deployed to serve an ever increasing number of mobile stations (STAs). Due to the limited frequency spectrum, more and more AP and STA nodes try to access the same channel. Spatial spectrum reuse is promoted by the IEEE 802.11ax task group through dynamic sensitivity control (DSC), which permits cochannel operation when the received signal power at the prospective transmitting node (PTN) is lower than an adjusted carrier sensing threshold (CST). Previously proposed DSC approaches typically calculate the CST without node grouping by using a margin parameter that remains fixed during operation. Setting the margin has previously been done heuristically, and finding a suitable value has remained an open problem. Therefore, herein, we propose a DSC approach that employs a node grouping method for adaptive calculation of the CST at the PTN with a channel-aware and margin-free formula. Numerical simulations of a dense residential WLAN scenario reveal total throughput and Jain's fairness index gains of 8.4% and 7.6%, respectively, vs. no DSC (as in WLANs deployed to date).

  • Siamese Attention-Based LSTM for Speech Emotion Recognition

    Tashpolat NIZAMIDIN  Li ZHAO  Ruiyu LIANG  Yue XIE  Askar HAMDULLA  

     
    LETTER-Engineering Acoustics

      Vol:
    E103-A No:7
      Page(s):
    937-941

    As one of the popular topics in the field of human-computer interaction, Speech Emotion Recognition (SER) aims to classify the emotional tendency from the speakers' utterances. Using the existing deep learning methods, and with a large amount of training data, we can achieve a highly accurate performance result. Unfortunately, it is a time-consuming and difficult job to build such a huge emotional speech database that can be applicable universally. However, the Siamese Neural Network (SNN), which we discuss in this paper, can yield extremely precise results with just a limited amount of training data through pairwise training, which mitigates the impacts of sample deficiency and provides enough iterations. To obtain enough SER training, this study proposes a novel method which uses Siamese Attention-based Long Short-Term Memory Networks. In this framework, we designed two Attention-based Long Short-Term Memory Networks that share the same weights, and we input frame-level acoustic emotional features to the Siamese network rather than utterance-level emotional features. The proposed solution has been evaluated on the EMODB, ABC and UYGSEDB corpora, and showed significant improvement in SER results compared to conventional deep learning methods.

  • Systematic Detection of State Variable Corruptions in Discrete Event System Specification Based Simulation

    Hae Young LEE  Jin Myoung KIM  

     
    LETTER-Software System

      Publicized:
    2020/04/17
      Vol:
    E103-D No:7
      Page(s):
    1769-1772

    In this letter, we propose a more secure modeling and simulation approach that can systematically detect state variable corruptions caused by buffer overflows in simulation models. Using our approach, developers need not consider secure coding practices related to the corruptions. We have implemented a prototype of the approach based on a modeling and simulation formalism and an open source simulator. Through optimization, the prototype could show better performance compared to the original simulator while detecting state variable corruptions.

  • Improving the Accuracy of Spectrum-Based Fault Localization Using Multiple Rules

    Rongcun WANG  Shujuan JIANG  Kun ZHANG  Qiao YU  

     
    PAPER-Software Engineering

      Publicized:
    2020/02/26
      Vol:
    E103-D No:6
      Page(s):
    1328-1338

    Software fault localization, as one of the essential activities in program debugging, helps software developers identify the locations of faults in a program, thus reducing the cost of program debugging. Spectrum-based fault localization (SBFL), as one of the representative localization techniques, has been intensively studied. The technique calculates the probability that each program entity is faulty using a certain suspiciousness formula. The accuracy of SBFL is not always as satisfactory as expected because it neglects the contextual information of statement executions. Therefore, we propose 5 rules, i.e., random, the maximum coverage, the minimum coverage, the maximum distance, and the minimum distance, to further improve the accuracy of SBFL. The 5 rules can effectively use the contextual information of statement executions. Moreover, they can be implemented on the traditional SBFL techniques using suspiciousness formulas with little effort. We empirically evaluated the impacts of the rules on 17 suspiciousness formulas. The results show that all 5 rules can significantly improve the ranking of faulty statements. In particular, for the faults that are difficult to locate, the improvement is more remarkable. Generally, the rules can effectively reduce the number of statements examined by an average of more than 19%. Compared with the other rules, the minimum coverage rule generates better results. This indicates that applying the test case having the minimum coverage capability to fault localization is more effective.
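
    As a rough illustration of what a suspiciousness formula in SBFL computes, the sketch below ranks statements from a coverage spectrum. The abstract does not name its 17 formulas or specify how the 5 rules are applied, so the Ochiai formula is used here only as one well-known example, and the function names `ochiai` and `rank_statements` are hypothetical. The rules themselves are not implemented.

```python
import math


def ochiai(ef, ep, nf):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep)).

    ef: failing tests that execute the statement
    ep: passing tests that execute the statement
    nf: failing tests that do not execute the statement
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def rank_statements(coverage, outcomes):
    """Rank statement ids from most to least suspicious.

    coverage: list of sets, the statement ids executed by each test
    outcomes: list of bools, True if the corresponding test passed
    """
    total_failed = sum(1 for passed in outcomes if not passed)
    statements = set().union(*coverage)
    scores = {}
    for s in statements:
        ef = sum(1 for cov, ok in zip(coverage, outcomes) if s in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, outcomes) if s in cov and ok)
        scores[s] = ochiai(ef, ep, total_failed - ef)
    # Higher score first; ties broken by statement id for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Toy spectrum (made-up data): one failing test and two passing tests.
coverage = [{1, 2, 3}, {1, 2}, {1}]
outcomes = [False, True, True]
ranking = rank_statements(coverage, outcomes)
```

    In this toy spectrum, statement 3 is executed only by the failing test, so it ranks first, followed by statements 2 and 1.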

  • End-to-End Multilingual Speech Recognition System with Language Supervision Training

    Danyang LIU  Ji XU  Pengyuan ZHANG  

     
    LETTER-Speech and Hearing

      Publicized:
    2020/03/19
      Vol:
    E103-D No:6
      Page(s):
    1427-1430

    End-to-end (E2E) multilingual automatic speech recognition (ASR) systems aim to recognize multilingual speeches in a unified framework. In the current E2E multilingual ASR framework, the output prediction for a specific language lacks constraints on the output scope of modeling units. In this paper, a language supervision training strategy is proposed with language masks to constrain the neural network output distribution. To simulate the multilingual ASR scenario with unknown language identity information, a language identification (LID) classifier is applied to estimate the language masks. On four Babel corpora, the proposed E2E multilingual ASR system achieved an average absolute word error rate (WER) reduction of 2.6% compared with the multilingual baseline system.

  • A Unified Decision Scheme for Classification and Localization of Cable Faults

    So Ryoung PARK  Iickho SONG  Seokho YOON  

     
    LETTER-Measurement Technology

      Vol:
    E103-A No:6
      Page(s):
    865-871

    A unified decision scheme for the classification and localization of cable faults is proposed based on a two-step procedure. Having basis in the time domain reflectometry (TDR), the proposed scheme is capable of determining not only the locations but also types of faults in a cable without an excessive additional computational burden compared to other TDR-based schemes. Results from simulation and experiments with measured real data demonstrate that the proposed scheme exhibits a higher rate of correct decision than the conventional schemes in localizing and classifying faults over a wide range of the location of faults.

  • Voice Conversion for Improving Perceived Likability of Uttered Speech

    Shinya HORIIKE  Masanori MORISE  

     
    LETTER-Speech and Hearing

      Publicized:
    2020/01/23
      Vol:
    E103-D No:5
      Page(s):
    1199-1202

    To improve the likability of speech, we propose a voice conversion algorithm by controlling the fundamental frequency (F0) and the spectral envelope and carry out a subjective evaluation. The subjects can manipulate these two speech parameters. From the result, the subjects preferred speech with a parameter related to higher brightness.

  • Mimicking Lombard Effect: An Analysis and Reconstruction

    Thuan Van NGO  Rieko KUBO  Masato AKAGI  

     
    PAPER-Speech and Hearing

      Publicized:
    2020/02/13
      Vol:
    E103-D No:5
      Page(s):
    1108-1117

    Lombard speech is produced in noisy environments due to the Lombard effect and is intelligible in adverse environments. To adaptively control the intelligibility of transmitted speech for public announcement systems, in this study, we focus on perceptually mimicking Lombard speech under backgrounds with varying noise levels. Other approaches map corresponding neutral speech features to Lombard speech features, but as this can only be applied to one noise level at a time, it is unsuitable for varying noise levels because the characteristics of Lombard speech are varied according to noise level. Instead, we utilize a rule-based method that automatically generates rules and flexibly controls features with any change of noise level. Specifically, we conduct a feature tendency analysis and propose a continuous rule generation model to estimate the effect of varying noise levels on features. The proposed techniques, which are based on a coarticulation model, MRTD, and spectral-GMM, can easily modify neutral speech features by following the generated rules. Voices having these features are then synthesized by STRAIGHT to obtain Lombard speech fitting to noises with varying levels. To validate our proposed method, the quality of mimicking speech is evaluated in subjective listening experiments on similarity, intelligibility, and naturalness. In varying noise levels, the results show equal similarity with Lombard speech between the proposed method and a state-of-the-art method. Intelligibility and naturalness are comparable with some feature modifications.

  • Continuous Noise Masking Based Vocoder for Statistical Parametric Speech Synthesis

    Mohammed Salah AL-RADHI  Tamás Gábor CSAPÓ  Géza NÉMETH  

     
    PAPER-Speech and Hearing

      Publicized:
    2020/02/10
      Vol:
    E103-D No:5
      Page(s):
    1099-1107

    In this article, we propose a method called “continuous noise masking (cNM)” that eliminates residual buzziness in a continuous vocoder, i.e., one in which all parameters are continuous and which offers a simple and flexible speech analysis and synthesis system. Traditional parametric vocoders generally show a perceptible deterioration in the quality of the synthesized speech due to different processing algorithms. Furthermore, an inaccurate noise resynthesis (e.g. in breathiness or hoarseness) is also considered to be one of the main underlying causes of performance degradation, leading to noisy transients and temporal discontinuity in the synthesized speech. To overcome these issues, a new cNM is developed based on the phase distortion deviation in order to reduce the perceptual effect of the residual noise, allow a proper reconstruction of noise characteristics, and better model the creaky voice segments that may occur in natural speech. To this end, the cNM is designed to keep only voice components under a condition of the cNM threshold while discarding others. We evaluate the proposed approach and compare it with state-of-the-art vocoders using objective and subjective listening tests. Experimental results show that the proposed method can reduce the effect of residual noise and can reach the quality of other sophisticated approaches like STRAIGHT and log domain pulse model (PML).

  • Patient-Specific ECG Classification with Integrated Long Short-Term Memory and Convolutional Neural Networks

    Jiaquan WU  Feiteng LI  Zhijian CHEN  Xiaoyan XIANG  Yu PU  

     
    PAPER-Biological Engineering

      Publicized:
    2020/02/13
      Vol:
    E103-D No:5
      Page(s):
    1153-1163

    This paper presents an automated patient-specific ECG classification algorithm, which integrates long short-term memory (LSTM) and convolutional neural networks (CNN). While LSTM extracts the temporal features, such as the heart rate variance (HRV) and beat-to-beat correlation from sequential heartbeats, CNN captures detailed morphological characteristics of the current heartbeat. To further improve the classification performance, adaptive segmentation and re-sampling are applied to align the heartbeats of different patients with various heart rates. In addition, a novel clustering method is proposed to identify the most representative patterns from the common training data. Evaluated on the MIT-BIH arrhythmia database, our algorithm shows the superior accuracy for both ventricular ectopic beats (VEB) and supraventricular ectopic beats (SVEB) recognition. In particular, the sensitivity and positive predictive rate for SVEB increase by more than 8.2% and 8.8%, respectively, compared with the prior works. Since our patient-specific classification does not require manual feature extraction, it is potentially applicable to embedded devices for automatic and accurate arrhythmia monitoring.

  • Orthogonal Gradient Penalty for Fast Training of Wasserstein GAN Based Multi-Task Autoencoder toward Robust Speech Recognition

    Chao-Yuan KAO  Sangwook PARK  Alzahra BADI  David K. HAN  Hanseok KO  

     
    LETTER-Speech and Hearing

      Publicized:
    2020/01/27
      Vol:
    E103-D No:5
      Page(s):
    1195-1198

    Performance in Automatic Speech Recognition (ASR) degrades dramatically in noisy environments. To alleviate this problem, a variety of deep networks based on convolutional neural networks and recurrent neural networks were proposed by applying L1 or L2 loss. In this Letter, we propose a new orthogonal gradient penalty (OGP) method for Wasserstein Generative Adversarial Networks (WGAN) applied to denoising and despeeching models. WGAN integrates a multi-task autoencoder which estimates not only speech features but also noise features from noisy speech. While achieving 14.1% improvement in Wasserstein distance convergence rate, the proposed OGP enhanced features are tested in ASR and achieve 9.7%, 8.6%, 6.2%, and 4.8% WER improvements over DDAE, MTAE, R-CED(CNN) and RNN models.

  • Efficient Computation of Boomerang Connection Probability for ARX-Based Block Ciphers with Application to SPECK and LEA

    Dongyeong KIM  Dawoon KWON  Junghwan SONG  

     
    PAPER-Cryptography and Information Security

      Vol:
    E103-A No:4
      Page(s):
    677-685

    The boomerang connectivity table (BCT) was introduced by C. Cid et al. Using the BCT, for SPN block ciphers, the dependency between sub-ciphers in the boomerang structure can be computed more precisely. However, the existing method to generate the BCT is difficult to apply to ARX-based ciphers because of the huge domain size. In this paper, we show a method to compute the dependency between sub-ciphers in the boomerang structure for modular addition. Using bit relations in modular addition, we compute the dependency sequentially, bit by bit. Using this method, we find boomerang characteristics and amplified boomerang characteristics for the ARX-based ciphers LEA and SPECK. For LEA-128, we find a reduced 15-round boomerang characteristic and a reduced 16-round amplified boomerang characteristic, which is two rounds longer than the previous boomerang characteristic. Also, for SPECK64/128, we find a reduced 13-round amplified boomerang characteristic, which is one round longer than the previous rectangle characteristic.

  • Linear Constellation Precoded OFDM with Index Modulation Based Orthogonal Cooperative System

    Qingbo WANG  Gaoqi DOU  Ran DENG  Jun GAO  

     
    PAPER

      Publicized:
    2019/10/15
      Vol:
    E103-B No:4
      Page(s):
    312-320

    The current orthogonal cooperative system (OCS) achieves diversity through the use of relays and the consumption of an additional time slot (TS). To guarantee the orthogonality of the received signal and avoid the mutual interference at the destination, the source has to be mute in the second TS. Consequently, the spectral efficiency (SE) is halved. In this paper, linear constellation precoded orthogonal frequency division multiplexing with index modulation (LCP-OFDM-IM) based OCS is proposed, where the source activates the complementary subcarriers to convey the symbols over two TSs. Hence the source can consecutively transmit information to the destination without the mutual interference. Compared with the current OFDM based OCS, the LCP-OFDM-IM based OCS can achieve a higher SE, since the subcarrier activation patterns (SAPs) can be exploited to convey additional information. Furthermore, the optimal precoder, in the sense of maximizing the minimum Euclidean distance of the symbols conveyed on each subcarrier over two TSs, is provided. Simulation results show the superiority of the LCP-OFDM-IM based OCS over the current OFDM based OCS.

  • Angular Momentum Spectrum of Electromagnetic Wave

    Chao ZHANG  Jin JIANG  

     
    LETTER-Analog Signal Processing

      Vol:
    E103-A No:4
      Page(s):
    715-717

    Angular Momentum (AM) has been considered as a new dimension of wireless transmissions as well as the intrinsic property of Electro-Magnetic (EM) waves. So far, AM is utilized as a discrete mode not only in the quantum states, but also in the statistical beam forming. Traditionally, the continuous value of AM is ignored and only the quantized mode number is identified. However, the recent discovery on electrons in spiral motion producing twisted radiation with AM, including Spin Angular Momentum (SAM) and Orbital Angular Momentum (OAM), proves that the continuous value of AM is available in the statistical EM wave beam. This is also revealed by the so-called fractional OAM, which is reported in optical OAM beams. Then, as the new dimension with continuous real number field, AM should turn out to be a certain spectrum, similar to the frequency spectrum usually in the wireless signal processing. In this letter, we mathematically define the AM spectrum and show the applications in the information theory analysis, which is expected to be an efficient tool for the future wireless communications with AM.

  • Korean-Vietnamese Neural Machine Translation with Named Entity Recognition and Part-of-Speech Tags

    Van-Hai VU  Quang-Phuoc NGUYEN  Kiem-Hieu NGUYEN  Joon-Choul SHIN  Cheol-Young OCK  

     
    PAPER-Natural Language Processing

      Publicized:
    2020/01/15
      Vol:
    E103-D No:4
      Page(s):
    866-873

    Since deep learning was introduced, a series of achievements has been published in the field of automatic machine translation (MT). However, Korean-Vietnamese MT systems face many challenges because of a lack of data, multiple meanings of individual words, and grammatical diversity that depends on context. Therefore, the quality of Korean-Vietnamese MT systems is still sub-optimal. This paper discusses a method for applying Named Entity Recognition (NER) and Part-of-Speech (POS) tagging to Vietnamese sentences to improve the performance of Korean-Vietnamese MT systems. In terms of implementation, we used a tool to tag NER and POS in Vietnamese sentences. In addition, we had access to a Korean-Vietnamese parallel corpus with more than 450K paired sentences from our previous research paper. The experimental results indicate that tagging NER and POS in Vietnamese sentences can improve the quality of Korean-Vietnamese Neural MT (NMT) in terms of the Bi-Lingual Evaluation Understudy (BLEU) and Translation Error Rate (TER) score. On average, our MT system improved by 1.21 BLEU points or 2.33 TER scores after applying both NER and POS tagging to the Vietnamese corpus. Due to the structural features of language, the MT systems in the Korean to Vietnamese direction always give better BLEU and TER results than translation machines in the reverse direction.

  • An Approximation Algorithm for the 2-Dispersion Problem

    Kazuyuki AMANO  Shin-ichi NAKANO  

     
    PAPER

      Publicized:
    2019/11/28
      Vol:
    E103-D No:3
      Page(s):
    506-508

    Let P be a set of points on the plane, and d(p, q) be the distance between a pair of points p, q in P. For a point p ∈ P and a subset S ⊂ P with |S| ≥ 3, the 2-dispersion cost, denoted by cost_2(p, S), of p with respect to S is the sum of (1) the distance from p to the nearest point in S∖{p} and (2) the distance from p to the second nearest point in S∖{p}. The 2-dispersion cost cost_2(S) of S ⊂ P with |S| ≥ 3 is min_{p∈S} cost_2(p, S). Given a set P of n points and an integer k, we wish to compute a k-point subset S of P with maximum cost_2(S). In this paper we give a simple 1/(4√3)-approximation algorithm for the problem.
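
    The definition of the 2-dispersion cost above can be made concrete with a short sketch. It assumes Euclidean distance and points given as coordinate tuples; the function names are hypothetical. The exhaustive search at the end is only a reference baseline for tiny inputs and is not the approximation algorithm of the paper, which the abstract does not describe.

```python
import math
from itertools import combinations


def cost2_point(p, S):
    """Distance from p to its nearest point in S without p, plus the
    distance from p to its second nearest point in S without p."""
    dists = sorted(math.dist(p, q) for q in S if q != p)
    return dists[0] + dists[1]


def cost2(S):
    """2-dispersion cost of S: the minimum of cost2_point over all p in S."""
    if len(S) < 3:
        raise ValueError("the 2-dispersion cost needs at least 3 points")
    return min(cost2_point(p, S) for p in S)


def best_subset_bruteforce(P, k):
    """Exact k-point subset with maximum cost2, by exhaustive search.

    Exponential in general; a baseline for tiny inputs only, NOT the
    approximation algorithm from the paper.
    """
    return max(combinations(P, k), key=cost2)


# A 3-4-5 right triangle plus one extra point close to the origin.
triangle = [(0, 0), (3, 0), (0, 4)]
points = triangle + [(1, 0)]
best = best_subset_bruteforce(points, 3)
```

    For the triangle, the per-point costs are 3 + 4, 3 + 5, and 4 + 5, so its 2-dispersion cost is 7. Among all 3-point subsets of `points`, the triangle itself has the largest cost, because every subset containing (1, 0) has a point whose two nearest neighbors are closer.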

  • Analysis of Antenna Performance Degradation due to Coupled Electromagnetic Interference from Nearby Circuits

    Hosang LEE  Jawad YOUSAF  Kwangho KIM  Seongjin MUN  Chanseok HWANG  Wansoo NAH  

     
    PAPER-Electromagnetic Theory

      Publicized:
    2019/08/27
      Vol:
    E103-C No:3
      Page(s):
    110-118

    This paper analyzes and compares two methods to estimate, at the circuit design stage, electromagnetically coupled noise introduced to an antenna by nearby circuits. One method estimates the power spectrum and the other estimates the active S11 parameter at the victim antenna; both use simulated standard S-parameters for the electromagnetic coupling in the circuit. They also need the assumed or measured excitation of noise sources. To confirm the validity of the two methods, an evaluation board consisting of an antenna and noise sources was designed and fabricated, in which voltage controlled oscillator (VCO) chips are placed as noise sources. The generated electromagnetic noise is transferred to the antenna via loop-shaped transmission lines, degrading the performance of the antenna. In this paper, detailed analysis procedures are described using the evaluation board, and it is shown that the two methods are equivalent to each other in terms of the induced voltages in the antenna. Finally, a procedure to estimate antenna performance degradation at the design stage is summarized.

  • Range Points Migration Based Spectroscopic Imaging Algorithm for Wide-Beam Terahertz Subsurface Sensor (Open Access)

    Takamaru MATSUI  Shouhei KIDERA  

     
    BRIEF PAPER-Electromagnetic Theory

      Publicized:
    2019/09/25
      Vol:
    E103-C No:3
      Page(s):
    127-130

    Here, we present a novel spectroscopic imaging method based on the boundary-extraction scheme for wide-beam terahertz (THz) three-dimensional imaging. Optical-lens-focusing systems for THz subsurface imaging generally require the depth of the object from the surface to be input beforehand to achieve the desired azimuth resolution. This limitation can be alleviated by incorporating a wide-beam THz transmitter into the synthetic aperture to automatically change the focusing depth in the post-signal processing. The range point migration (RPM) method has been demonstrated to have significant advantages in terms of imaging accuracy over the synthetic-aperture method. Moreover, in the RPM scheme, spectroscopic information can be easily associated with each scattering center. Thus, we propose an RPM-based terahertz spectroscopic imaging method. The finite-difference time-domain-based numerical analysis shows that the proposed algorithm provides accurate target boundary imaging associated with each frequency-dependent characteristic.

  • A High-Speed Method for Generating Edge-Preserving Bubble Images

    Toru HIRAOKA  

     
    LETTER-Computer Graphics

      Publicized:
    2019/11/29
      Vol:
    E103-D No:3
      Page(s):
    724-727

    We propose a non-photorealistic rendering method for generating edge-preserving bubble images from gray-scale photographic images. Bubble images are non-photorealistic images embedded with many bubbles, and edge-preserving bubble images are bubble images in which the edges of the photographic images are preserved. The proposed method is executed by iterative processing using the absolute difference in a window, and its processing is simple and fast. To validate the effectiveness of the proposed method, experiments using various photographic images are conducted. Results show that the proposed method can generate edge-preserving bubble images that preserve the edges of photographic images and that its processing speed is high.

  • An Efficient Image to Sound Mapping Method Preserving Speech Spectral Envelope

    Yuya HOSODA  Arata KAWAMURA  Youji IIGUNI  

     
    LETTER-Digital Signal Processing

      Vol:
    E103-A No:3
      Page(s):
    629-630

    In this paper, we propose an image to sound mapping method. This technique treats an image as a spectrogram and maps it to a sound by taking inverse FFT of the spectrogram. Amplitude spectra of a speech signal are embedded to the spectrogram to give speech intelligibility for the mapped sound. Specifically, we hold amplitude spectra of a speech signal with strong power and embed the image brightness in other frequency bands. Holding amplitude spectra of a speech signal with strong power preserves a speech spectral envelope and improves the speech quality of the mapped sound. The amplitude spectra of the mapped sound with weak power represent the image brightness, and then the image is successfully reconstructed from the mapped sound. Simulation results show that the proposed method achieves sufficient speech quality.
