IEICE global.ieice.org Site

Keyword Search Result

[Keyword] SPE(2504hit)

141-160hit(2504hit)

A Node-Grouping Based Spatial Spectrum Reuse Method for WLANs in Dense Residential Scenarios
Jin LIU Masahide HATANAKA Takao ONOYE

PAPER-Mobile Information Network and Personal Communications

Vol:
E103-A No:7
Page(s):
917-927
Lately, an increasing number of wireless local area network (WLAN) access points (APs) are deployed to serve an ever increasing number of mobile stations (STAs). Due to the limited frequency spectrum, more and more AP and STA nodes try to access the same channel. Spatial spectrum reuse is promoted by the IEEE 802.11ax task group through dynamic sensitivity control (DSC), which permits cochannel operation when the received signal power at the prospective transmitting node (PTN) is lower than an adjusted carrier sensing threshold (CST). Previously proposed DSC approaches typically calculate the CST without node grouping by using a margin parameter that remains fixed during operation. Setting the margin has previously been done heuristically. Finding a suitable value has remained an open problem. Therefore, herein, we propose a DSC approach that employs a node grouping method for adaptive calculation of the CST at the PTN with a channel-aware and margin-free formula. Numerical simulations for a dense residential WLAN scenario reveal total throughput and Jain's fairness index gains of 8.4% and 7.6%, respectively, vs. no DSC (as in WLANs deployed to date).
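The abstract reports gains in Jain's fairness index. As a minimal sketch of how that metric is computed, the snippet below implements the standard definition, (sum of x)^2 / (n * sum of x^2). The function name and the throughput values are hypothetical and are not taken from the paper.

```python
def jain_fairness_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * sum_sq)


# Hypothetical per-node throughputs (illustrative values only).
equal = jain_fairness_index([10.0, 10.0, 10.0, 10.0])   # every node gets the same share
skewed = jain_fairness_index([40.0, 0.0, 0.0, 0.0])     # one node takes everything
print(equal, skewed)
```

An index of 1 means all nodes receive the same throughput, and the index falls toward 1/n as the allocation concentrates on a single node.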
Siamese Attention-Based LSTM for Speech Emotion Recognition
Tashpolat NIZAMIDIN Li ZHAO Ruiyu LIANG Yue XIE Askar HAMDULLA

LETTER-Engineering Acoustics

Vol:
E103-A No:7
Page(s):
937-941
As one of the popular topics in the field of human-computer interaction, Speech Emotion Recognition (SER) aims to classify the emotional tendency from the speakers' utterances. Using the existing deep learning methods, and with a large amount of training data, we can achieve a highly accurate performance result. Unfortunately, it is a time-consuming and difficult job to build such a huge emotional speech database that is universally applicable. However, the Siamese Neural Network (SNN), which we discuss in this paper, can yield extremely precise results with just a limited amount of training data through pairwise training, which mitigates the impact of sample deficiency and provides enough iterations. To obtain enough SER training, this study proposes a novel method which uses Siamese Attention-based Long Short-Term Memory Networks. In this framework, we designed two Attention-based Long Short-Term Memory Networks that share the same weights, and we input frame-level acoustic emotional features to the Siamese network rather than utterance-level emotional features. The proposed solution has been evaluated on the EMODB, ABC and UYGSEDB corpora, and showed significant improvement in SER results compared to conventional deep learning methods.
Systematic Detection of State Variable Corruptions in Discrete Event System Specification Based Simulation
Hae Young LEE Jin Myoung KIM

LETTER-Software System

Publicized:
2020/04/17
Vol:
E103-D No:7
Page(s):
1769-1772
In this letter, we propose a more secure modeling and simulation approach that can systematically detect state variable corruptions caused by buffer overflows in simulation models. Using our approach, developers need not consider secure coding practices related to such corruptions. We have implemented a prototype of the approach based on a modeling and simulation formalism and an open source simulator. Through optimization, the prototype showed better performance than the original simulator while detecting state variable corruptions.
Improving the Accuracy of Spectrum-Based Fault Localization Using Multiple Rules
Rongcun WANG Shujuan JIANG Kun ZHANG Qiao YU

PAPER-Software Engineering

Publicized:
2020/02/26
Vol:
E103-D No:6
Page(s):
1328-1338
Software fault localization, as one of the essential activities in program debugging, helps software developers identify the locations of faults in a program, thus reducing the cost of program debugging. Spectrum-based fault localization (SBFL), as one of the representative localization techniques, has been intensively studied. The localization technique calculates the probability that each program entity is faulty using a certain suspiciousness formula. The accuracy of SBFL is not always as satisfactory as expected because it neglects the contextual information of statement executions. Therefore, we proposed 5 rules, i.e., random, the maximum coverage, the minimum coverage, the maximum distance, and the minimum distance, to further improve the accuracy of SBFL. The 5 rules can effectively use the contextual information of statement executions. Moreover, they can be implemented on the traditional SBFL techniques using suspiciousness formulas with little effort. We empirically evaluated the impacts of the rules on 17 suspiciousness formulas. The results show that all 5 rules can significantly improve the ranking of faulty statements. In particular, the improvement is more remarkable for faults that are difficult to locate. Generally, the rules can effectively reduce the number of statements examined by an average of more than 19%. Compared with the other rules, the minimum coverage rule generates better results. This indicates that applying the test case with the minimum coverage capability is more effective for fault localization.
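To illustrate what a suspiciousness formula does in SBFL, here is a minimal sketch that ranks statements with the Ochiai formula. Ochiai is a well-known SBFL formula chosen here as an assumption; the abstract does not name the 17 formulas evaluated, and the five proposed rules are not implemented. The coverage matrix and test outcomes are hypothetical.

```python
import math


def ochiai_scores(coverage, failed):
    """Score each statement with Ochiai: ef / sqrt(total_failed * (ef + ep)).

    coverage[t][s] is True if test t executed statement s;
    failed[t] is True if test t failed.
    """
    total_failed = sum(failed)
    scores = []
    for s in range(len(coverage[0])):
        ef = sum(1 for t, row in enumerate(coverage) if row[s] and failed[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[s] and not failed[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom > 0 else 0.0)
    return scores


# Hypothetical program spectrum: 4 tests x 3 statements.
coverage = [
    [True, True, False],   # test 0 (passes)
    [True, False, True],   # test 1 (fails)
    [True, False, True],   # test 2 (fails)
    [True, True, False],   # test 3 (passes)
]
failed = [False, True, True, False]

scores = ochiai_scores(coverage, failed)
# Statements sorted from most to least suspicious.
ranking = sorted(range(len(scores)), key=lambda s: scores[s], reverse=True)
print(scores, ranking)
```

Statement 2 is executed only by failing tests, so it ranks first; statement 0 is executed by every test and ranks in the middle.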
End-to-End Multilingual Speech Recognition System with Language Supervision Training
Danyang LIU Ji XU Pengyuan ZHANG

LETTER-Speech and Hearing

Publicized:
2020/03/19
Vol:
E103-D No:6
Page(s):
1427-1430
End-to-end (E2E) multilingual automatic speech recognition (ASR) systems aim to recognize multilingual speech in a unified framework. In the current E2E multilingual ASR framework, the output prediction for a specific language lacks constraints on the output scope of modeling units. In this paper, a language supervision training strategy is proposed with language masks to constrain the neural network output distribution. To simulate the multilingual ASR scenario with unknown language identity information, a language identification (LID) classifier is applied to estimate the language masks. On four Babel corpora, the proposed E2E multilingual ASR system achieved an average absolute word error rate (WER) reduction of 2.6% compared with the multilingual baseline system.
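One plausible way a language mask can constrain an output distribution is a masked softmax, sketched below. This is an assumption for illustration only: the abstract does not say how the masks are applied, and the function name, logits, and mask are hypothetical.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over the units where mask is True; masked-out units get probability 0."""
    allowed = [x for x, m in zip(logits, mask) if m]
    if not allowed:
        raise ValueError("mask must allow at least one unit")
    peak = max(allowed)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) if m else 0.0 for x, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical shared output layer with 4 modeling units;
# units 0 and 1 belong to language A, units 2 and 3 to language B.
logits = [2.0, 1.0, 3.0, 0.5]
mask_lang_a = [True, True, False, False]

probs = masked_softmax(logits, mask_lang_a)
print(probs)
```

Even though unit 2 has the highest logit, it receives zero probability because it lies outside the assumed output scope of language A.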
A Unified Decision Scheme for Classification and Localization of Cable Faults
So Ryoung PARK Iickho SONG Seokho YOON

LETTER-Measurement Technology

Vol:
E103-A No:6
Page(s):
865-871
A unified decision scheme for the classification and localization of cable faults is proposed based on a two-step procedure. Based on time domain reflectometry (TDR), the proposed scheme can determine not only the locations but also the types of faults in a cable without an excessive additional computational burden compared to other TDR-based schemes. Results from simulations and experiments with real measured data demonstrate that the proposed scheme achieves a higher rate of correct decisions than conventional schemes in localizing and classifying faults over a wide range of fault locations.
Voice Conversion for Improving Perceived Likability of Uttered Speech
Shinya HORIIKE Masanori MORISE

LETTER-Speech and Hearing

Publicized:
2020/01/23
Vol:
E103-D No:5
Page(s):
1199-1202
To improve the likability of speech, we propose a voice conversion algorithm by controlling the fundamental frequency (F0) and the spectral envelope and carry out a subjective evaluation. The subjects can manipulate these two speech parameters. From the result, the subjects preferred speech with a parameter related to higher brightness.
Mimicking Lombard Effect: An Analysis and Reconstruction
Thuan Van NGO Rieko KUBO Masato AKAGI

PAPER-Speech and Hearing

Publicized:
2020/02/13
Vol:
E103-D No:5
Page(s):
1108-1117
Lombard speech is produced in noisy environments due to the Lombard effect and is intelligible in adverse environments. To adaptively control the intelligibility of transmitted speech for public announcement systems, in this study, we focus on perceptually mimicking Lombard speech under backgrounds with varying noise levels. Other approaches map corresponding neutral speech features to Lombard speech features, but as this can only be applied to one noise level at a time, it is unsuitable for varying noise levels because the characteristics of Lombard speech are varied according to noise level. Instead, we utilize a rule-based method that automatically generates rules and flexibly controls features with any change of noise level. Specifically, we conduct a feature tendency analysis and propose a continuous rule generation model to estimate the effect of varying noise levels on features. The proposed techniques, which are based on a coarticulation model, MRTD, and spectral-GMM, can easily modify neutral speech features by following the generated rules. Voices having these features are then synthesized by STRAIGHT to obtain Lombard speech fitting to noises with varying levels. To validate our proposed method, the quality of mimicking speech is evaluated in subjective listening experiments on similarity, intelligibility, and naturalness. In varying noise levels, the results show equal similarity with Lombard speech between the proposed method and a state-of-the-art method. Intelligibility and naturalness are comparable with some feature modifications.
Continuous Noise Masking Based Vocoder for Statistical Parametric Speech Synthesis
Mohammed Salah AL-RADHI Tamás Gábor CSAPÓ Géza NÉMETH

PAPER-Speech and Hearing

Publicized:
2020/02/10
Vol:
E103-D No:5
Page(s):
1099-1107
In this article, we propose a method called “continuous noise masking (cNM)” that eliminates residual buzziness in a continuous vocoder, i.e., one in which all parameters are continuous and which offers a simple and flexible speech analysis and synthesis system. Traditional parametric vocoders generally show a perceptible deterioration in the quality of the synthesized speech due to different processing algorithms. Furthermore, an inaccurate noise resynthesis (e.g. in breathiness or hoarseness) is also considered to be one of the main underlying causes of performance degradation, leading to noisy transients and temporal discontinuity in the synthesized speech. To overcome these issues, a new cNM is developed based on the phase distortion deviation in order to reduce the perceptual effect of the residual noise, allow a proper reconstruction of noise characteristics, and better model the creaky voice segments that may occur in natural speech. To this end, the cNM is designed to keep only voice components under a condition of the cNM threshold while discarding others. We evaluate the proposed approach and compare it with state-of-the-art vocoders using objective and subjective listening tests. Experimental results show that the proposed method can reduce the effect of residual noise and can reach the quality of other sophisticated approaches like STRAIGHT and log domain pulse model (PML).
Patient-Specific ECG Classification with Integrated Long Short-Term Memory and Convolutional Neural Networks
Jiaquan WU Feiteng LI Zhijian CHEN Xiaoyan XIANG Yu PU

PAPER-Biological Engineering

Publicized:
2020/02/13
Vol:
E103-D No:5
Page(s):
1153-1163
This paper presents an automated patient-specific ECG classification algorithm, which integrates long short-term memory (LSTM) and convolutional neural networks (CNN). While LSTM extracts the temporal features, such as the heart rate variance (HRV) and beat-to-beat correlation from sequential heartbeats, CNN captures detailed morphological characteristics of the current heartbeat. To further improve the classification performance, adaptive segmentation and re-sampling are applied to align the heartbeats of different patients with various heart rates. In addition, a novel clustering method is proposed to identify the most representative patterns from the common training data. Evaluated on the MIT-BIH arrhythmia database, our algorithm shows superior accuracy for both ventricular ectopic beat (VEB) and supraventricular ectopic beat (SVEB) recognition. In particular, the sensitivity and positive predictive rate for SVEB increase by more than 8.2% and 8.8%, respectively, compared with prior works. Since our patient-specific classification does not require manual feature extraction, it is potentially applicable to embedded devices for automatic and accurate arrhythmia monitoring.
Orthogonal Gradient Penalty for Fast Training of Wasserstein GAN Based Multi-Task Autoencoder toward Robust Speech Recognition
Chao-Yuan KAO Sangwook PARK Alzahra BADI David K. HAN Hanseok KO

LETTER-Speech and Hearing

Publicized:
2020/01/27
Vol:
E103-D No:5
Page(s):
1195-1198
Performance in Automatic Speech Recognition (ASR) degrades dramatically in noisy environments. To alleviate this problem, a variety of deep networks based on convolutional neural networks and recurrent neural networks were proposed by applying L1 or L2 loss. In this Letter, we propose a new orthogonal gradient penalty (OGP) method for Wasserstein Generative Adversarial Networks (WGAN) applied to denoising and despeeching models. WGAN integrates a multi-task autoencoder which estimates not only speech features but also noise features from noisy speech. While achieving 14.1% improvement in Wasserstein distance convergence rate, the proposed OGP enhanced features are tested in ASR and achieve 9.7%, 8.6%, 6.2%, and 4.8% WER improvements over DDAE, MTAE, R-CED(CNN) and RNN models.
Efficient Computation of Boomerang Connection Probability for ARX-Based Block Ciphers with Application to SPECK and LEA
Dongyeong KIM Dawoon KWON Junghwan SONG

PAPER-Cryptography and Information Security

Vol:
E103-A No:4
Page(s):
677-685
The boomerang connectivity table (BCT) was introduced by C. Cid et al. Using the BCT, for SPN block ciphers, the dependency between sub-ciphers in a boomerang structure can be computed more precisely. However, the existing method of generating the BCT is difficult to apply to ARX-based ciphers because of the huge domain size. In this paper, we show a method to compute the dependency between sub-ciphers in a boomerang structure for modular addition. Using bit relations in modular addition, we compute the dependency sequentially, bit by bit. Using this method, we find boomerang characteristics and amplified boomerang characteristics for the ARX-based ciphers LEA and SPECK. For LEA-128, we find a reduced 15-round boomerang characteristic and a reduced 16-round amplified boomerang characteristic, which is two rounds longer than the previous boomerang characteristic. Also, for SPECK64/128, we find a reduced 13-round amplified boomerang characteristic, which is one round longer than the previous rectangle characteristic.
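For context on why the domain size matters, the sketch below builds a BCT by exhaustive enumeration for a small S-box, using the usual definition BCT(Δi, ∇o) = #{x : S⁻¹(S(x) ⊕ ∇o) ⊕ S⁻¹(S(x ⊕ Δi) ⊕ ∇o) = Δi}. It is not the paper's bitwise method for modular addition. The 4-bit permutation and all names are chosen purely for illustration.

```python
def boomerang_connectivity_table(sbox):
    """Exhaustively build the BCT of a permutation on 0..len(sbox)-1 (length a power of two)."""
    n = len(sbox)
    inverse = [0] * n
    for x, y in enumerate(sbox):
        inverse[y] = x
    table = [[0] * n for _ in range(n)]
    for delta_in in range(n):
        for nabla_out in range(n):
            # Count inputs x for which the boomerang returns with difference delta_in.
            table[delta_in][nabla_out] = sum(
                1
                for x in range(n)
                if inverse[sbox[x] ^ nabla_out]
                ^ inverse[sbox[x ^ delta_in] ^ nabla_out]
                == delta_in
            )
    return table


# An illustrative 4-bit permutation (assumption: any bijection on 0..15 would do).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

bct = boomerang_connectivity_table(SBOX)
print(bct[1][:4])
```

The table has one entry per (input difference, output difference) pair and each entry loops over every input, so the cost grows with the cube of the domain size. That is manageable for 4-bit S-boxes but not for additions on 32-bit words, which is the obstacle the abstract describes.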
Linear Constellation Precoded OFDM with Index Modulation Based Orthogonal Cooperative System
Qingbo WANG Gaoqi DOU Ran DENG Jun GAO

PAPER

Publicized:
2019/10/15
Vol:
E103-B No:4
Page(s):
312-320
The current orthogonal cooperative system (OCS) achieves diversity through the use of relays and the consumption of an additional time slot (TS). To guarantee the orthogonality of the received signal and avoid the mutual interference at the destination, the source has to be mute in the second TS. Consequently, the spectral efficiency (SE) is halved. In this paper, linear constellation precoded orthogonal frequency division multiplexing with index modulation (LCP-OFDM-IM) based OCS is proposed, where the source activates the complementary subcarriers to convey the symbols over two TSs. Hence the source can consecutively transmit information to the destination without the mutual interference. Compared with the current OFDM based OCS, the LCP-OFDM-IM based OCS can achieve a higher SE, since the subcarrier activation patterns (SAPs) can be exploited to convey additional information. Furthermore, the optimal precoder, in the sense of maximizing the minimum Euclidean distance of the symbols conveyed on each subcarrier over two TSs, is provided. Simulation results show the superiority of the LCP-OFDM-IM based OCS over the current OFDM based OCS.
Angular Momentum Spectrum of Electromagnetic Wave
Chao ZHANG Jin JIANG

LETTER-Analog Signal Processing

Vol:
E103-A No:4
Page(s):
715-717
Angular Momentum (AM) has been considered as a new dimension of wireless transmissions as well as the intrinsic property of Electro-Magnetic (EM) waves. So far, AM is utilized as a discrete mode not only in the quantum states, but also in the statistical beam forming. Traditionally, the continuous value of AM is ignored and only the quantized mode number is identified. However, the recent discovery on electrons in spiral motion producing twisted radiation with AM, including Spin Angular Momentum (SAM) and Orbital Angular Momentum (OAM), proves that the continuous value of AM is available in the statistical EM wave beam. This is also revealed by the so-called fractional OAM, which is reported in optical OAM beams. Then, as the new dimension with continuous real number field, AM should turn out to be a certain spectrum, similar to the frequency spectrum usually in the wireless signal processing. In this letter, we mathematically define the AM spectrum and show the applications in the information theory analysis, which is expected to be an efficient tool for the future wireless communications with AM.
Korean-Vietnamese Neural Machine Translation with Named Entity Recognition and Part-of-Speech Tags
Van-Hai VU Quang-Phuoc NGUYEN Kiem-Hieu NGUYEN Joon-Choul SHIN Cheol-Young OCK

PAPER-Natural Language Processing

Publicized:
2020/01/15
Vol:
E103-D No:4
Page(s):
866-873
Since deep learning was introduced, a series of achievements has been published in the field of automatic machine translation (MT). However, Korean-Vietnamese MT systems face many challenges because of a lack of data, multiple meanings of individual words, and grammatical diversity that depends on context. Therefore, the quality of Korean-Vietnamese MT systems is still sub-optimal. This paper discusses a method for applying Named Entity Recognition (NER) and Part-of-Speech (POS) tagging to Vietnamese sentences to improve the performance of Korean-Vietnamese MT systems. In terms of implementation, we used a tool to tag NER and POS in Vietnamese sentences. In addition, we had access to a Korean-Vietnamese parallel corpus with more than 450K paired sentences from our previous research paper. The experimental results indicate that tagging NER and POS in Vietnamese sentences can improve the quality of Korean-Vietnamese Neural MT (NMT) in terms of the Bi-Lingual Evaluation Understudy (BLEU) and Translation Error Rate (TER) score. On average, our MT system improved by 1.21 BLEU points or 2.33 TER scores after applying both NER and POS tagging to the Vietnamese corpus. Due to the structural features of language, the MT systems in the Korean to Vietnamese direction always give better BLEU and TER results than translation machines in the reverse direction.
An Approximation Algorithm for the 2-Dispersion Problem
Kazuyuki AMANO Shin-ichi NAKANO

PAPER

Publicized:
2019/11/28
Vol:
E103-D No:3
Page(s):
506-508
Let P be a set of points on the plane, and d(p, q) be the distance between a pair of points p, q in P. For a point p ∈ P and a subset S ⊂ P with |S| ≥ 3, the 2-dispersion cost, denoted by cost2(p, S), of p with respect to S is the sum of (1) the distance from p to the nearest point in S∖{p} and (2) the distance from p to the second nearest point in S∖{p}. The 2-dispersion cost cost2(S) of S ⊂ P with |S| ≥ 3 is min_{p∈S} cost2(p, S). Given a set P of n points and an integer k, we wish to compute a k-point subset S of P with maximum cost2(S). In this paper we give a simple 1/(4√3) approximation algorithm for the problem.
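The objective defined in this abstract can be evaluated directly. The following minimal sketch computes cost2(p, S) and cost2(S) for a given subset; it only evaluates the objective and is not the approximation algorithm from the paper. It assumes Euclidean distance and distinct points, and the function names and sample points are hypothetical.

```python
import math


def cost2_point(p, S):
    """Sum of distances from p to its nearest and second-nearest points in S without p."""
    distances = sorted(math.dist(p, q) for q in S if q != p)
    return distances[0] + distances[1]


def cost2(S):
    """2-dispersion cost of S: the minimum of cost2_point over all points in S."""
    if len(S) < 3:
        raise ValueError("S must contain at least 3 points")
    return min(cost2_point(p, S) for p in S)


# Hypothetical subset: a 3-4-5 right triangle.
S = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
print(cost2_point((0.0, 0.0), S), cost2(S))
```

For the triangle, the per-point costs are 7, 8, and 9, so cost2(S) is 7, attained at the right-angle vertex.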
Analysis of Antenna Performance Degradation due to Coupled Electromagnetic Interference from Nearby Circuits
Hosang LEE Jawad YOUSAF Kwangho KIM Seongjin MUN Chanseok HWANG Wansoo NAH

PAPER-Electromagnetic Theory

Publicized:
2019/08/27
Vol:
E103-C No:3
Page(s):
110-118
This paper analyzes and compares two methods for estimating, at the circuit design stage, the electromagnetically coupled noise introduced to an antenna by nearby circuits. One method estimates the power spectrum and the other estimates the active S11 parameter at the victim antenna; both use simulated standard S-parameters for the electromagnetic coupling in the circuit. They also need the assumed or measured excitation of the noise sources. To confirm the validity of the two methods, an evaluation board consisting of an antenna and noise sources was designed and fabricated, in which voltage controlled oscillator (VCO) chips are placed as noise sources. The generated electromagnetic noise is transferred to the antenna via loop-shaped transmission lines, degrading the performance of the antenna. In this paper, detailed analysis procedures are described using the evaluation board, and it is shown that the two methods are equivalent to each other in terms of the induced voltages in the antenna. Finally, a procedure to estimate antenna performance degradation at the design stage is summarized.
Range Points Migration Based Spectroscopic Imaging Algorithm for Wide-Beam Terahertz Subsurface Sensor Open Access
Takamaru MATSUI Shouhei KIDERA

BRIEF PAPER-Electromagnetic Theory

Publicized:
2019/09/25
Vol:
E103-C No:3
Page(s):
127-130
Here, we present a novel spectroscopic imaging method based on the boundary-extraction scheme for wide-beam terahertz (THz) three-dimensional imaging. Optical-lens-focusing systems for THz subsurface imaging generally require the depth of the object from the surface to be input beforehand to achieve the desired azimuth resolution. This limitation can be alleviated by incorporating a wide-beam THz transmitter into the synthetic aperture to automatically change the focusing depth in the post-signal processing. The range point migration (RPM) method has been demonstrated to have significant advantages in terms of imaging accuracy over the synthetic-aperture method. Moreover, in the RPM scheme, spectroscopic information can be easily associated with each scattering center. Thus, we propose an RPM-based terahertz spectroscopic imaging method. The finite-difference time-domain-based numerical analysis shows that the proposed algorithm provides accurate target boundary imaging associated with each frequency-dependent characteristic.
A High-Speed Method for Generating Edge-Preserving Bubble Images
Toru HIRAOKA

LETTER-Computer Graphics

Publicized:
2019/11/29
Vol:
E103-D No:3
Page(s):
724-727
We propose a non-photorealistic rendering method for generating edge-preserving bubble images from gray-scale photographic images. Bubble images are non-photorealistic images embedded in many bubbles, and edge-preserving bubble images are bubble images where edges in photographic images are preserved. The proposed method is executed by iterative processing using absolute differences in a window. Its processing is simple and fast. To validate the effectiveness of the proposed method, experiments using various photographic images are conducted. Results show that the proposed method can generate edge-preserving bubble images by preserving the edges of photographic images and that the processing speed is high.
An Efficient Image to Sound Mapping Method Preserving Speech Spectral Envelope
Yuya HOSODA Arata KAWAMURA Youji IIGUNI

LETTER-Digital Signal Processing

Vol:
E103-A No:3
Page(s):
629-630
In this paper, we propose an image to sound mapping method. This technique treats an image as a spectrogram and maps it to a sound by taking the inverse FFT of the spectrogram. Amplitude spectra of a speech signal are embedded in the spectrogram to make the mapped sound intelligible as speech. Specifically, we hold amplitude spectra of a speech signal with strong power and embed the image brightness in other frequency bands. Holding amplitude spectra of a speech signal with strong power preserves a speech spectral envelope and improves the speech quality of the mapped sound. The amplitude spectra of the mapped sound with weak power represent the image brightness, and thus the image is successfully reconstructed from the mapped sound. Simulation results show that the proposed method achieves sufficient speech quality.
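The core idea of treating an image as a spectrogram can be sketched with a plain inverse DFT. The example below maps each image column to one sound frame and then recovers the column from the frame's amplitude spectrum. It deliberately omits the speech-spectrum embedding that the paper proposes. The zero-phase spectrum, the frame layout, and all names and pixel values are assumptions made for illustration.

```python
import cmath


def column_to_frame(magnitudes):
    """Inverse DFT of a zero-phase, conjugate-symmetric spectrum built from one image column."""
    # Bins 0..k-1 hold the column; mirroring bins k-2..1 makes the time signal real.
    spectrum = list(magnitudes) + list(magnitudes[-2:0:-1])
    n = len(spectrum)
    return [
        sum(spectrum[b] * cmath.exp(2j * cmath.pi * b * t / n) for b in range(n)).real / n
        for t in range(n)
    ]


def frame_to_column(frame, k):
    """Forward DFT amplitudes of the first k bins, i.e. the recovered image column."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * b * t / n) for t in range(n)))
        for b in range(k)
    ]


# Hypothetical 4-pixel-high, 2-pixel-wide image stored as a list of brightness columns.
image = [[0.0, 1.0, 0.5, 0.0], [0.2, 0.0, 0.8, 0.1]]

frames = [column_to_frame(col) for col in image]       # one sound frame per column
sound = [sample for frame in frames for sample in frame]
recovered = [frame_to_column(frame, len(image[0])) for frame in frames]
print(len(sound), recovered[0])
```

Because the brightness values are used directly as non-negative amplitudes, taking the amplitude spectrum of each frame returns the original column up to rounding error.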
