Mohammed Salah AL-RADHI Tamás Gábor CSAPÓ Géza NÉMETH
In this article, we propose a method called “continuous noise masking (cNM)” that eliminates residual buzziness in a continuous vocoder, i.e. a vocoder in which all parameters are continuous and which offers a simple and flexible speech analysis and synthesis system. Traditional parametric vocoders generally show a perceptible deterioration in the quality of the synthesized speech due to different processing algorithms. Furthermore, inaccurate noise resynthesis (e.g. in breathiness or hoarseness) is also considered to be one of the main underlying causes of performance degradation, leading to noisy transients and temporal discontinuity in the synthesized speech. To overcome these issues, a new cNM is developed based on the phase distortion deviation in order to reduce the perceptual effect of the residual noise, allow a proper reconstruction of noise characteristics, and better model the creaky voice segments that may occur in natural speech. To this end, the cNM is designed to keep only voice components under a condition of the cNM threshold while discarding others. We evaluate the proposed approach and compare it with state-of-the-art vocoders using objective measures and subjective listening tests. Experimental results show that the proposed method can reduce the effect of residual noise and can reach the quality of other sophisticated approaches like STRAIGHT and log domain pulse model (PML).
Lin JIANG Xin WU Yun ZHU Yu WANG
For high definition (HD) videos, the 3D-High Efficiency Video Coding (3D-HEVC) reference algorithm incurs a dramatically high computational load. Therefore, to meet the demand for real-time processing of HD video, a hardware implementation is necessary. In this paper, a reconfigurable architecture is proposed that can support both median filtering preprocessing and mean filtering preprocessing to suit different scene depth maps. The architecture sends different instructions to the corresponding processing elements according to the scenario. The mean filter is used to process near-range images, and the median filter is used to process long-range images. The simulation results show that the designed architecture achieves an average PSNR of 34.55 dB for the tested images. The hardware design for the proposed virtual view synthesis system operates at a maximum clock frequency of 160 MHz on the BEE4 platform, which is equipped with four Virtex-6 FF1759 LX550T field-programmable gate arrays (FPGAs), and outputs 720p (1024×768) video at 124 fps.
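The two preprocessing filters and the PSNR figure of merit mentioned above can be illustrated in software. The following is only a minimal sketch, not the reconfigurable hardware architecture of the paper: the function names `filter3x3` and `psnr`, the 3×3 window, the border replication, and the 8-bit peak value of 255 are all assumptions made for illustration.

```python
import math
from statistics import mean, median


def filter3x3(image, reducer):
    """Apply `reducer` (e.g. mean or median) over each 3x3 neighbourhood.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = reducer(window)
    return out


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    height, width = len(reference), len(reference[0])
    mse = sum(
        (reference[y][x] - test[y][x]) ** 2
        for y in range(height)
        for x in range(width)
    ) / (height * width)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical depth map: a flat region with one impulsive outlier.
clean = [[100] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = 255

median_psnr = psnr(clean, filter3x3(noisy, median))  # outlier removed entirely
mean_psnr = psnr(clean, filter3x3(noisy, mean))      # outlier smeared over neighbours
```

In this toy case the median filter removes the isolated outlier completely, whereas the mean filter spreads it over the neighbouring pixels, which is one reason the two filters suit different kinds of depth maps.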
Masahiro TAKIGAWA Shinsuke IBI Seiichi SAMPEI
This paper proposes successive interference cancellation (SIC) for independent component analysis (ICA) aided spatial division multiple access (SDMA) with Gaussian filtered frequency shift keying (GFSK) in Bluetooth low energy (BLE) systems. The typical SDMA scheme requires estimation of channel state information (CSI) using orthogonal pilot sequences. However, no orthogonal pilot is embedded in the BLE packet. This fact motivates us to add an ICA detector to BLE systems. In this paper, by focusing on the covariance matrix of the ICA outputs, SIC is applied with Cholesky decomposition. Then, in order to address the phase ambiguity problems created by the ICA process, we propose a differential detection scheme based on the MAP algorithm. In practical scenarios, the system is subject to carrier frequency offset (CFO) as well as symbol timing offset (STO) induced by the hardware impairments present in the BLE peripherals. The packet error rate (PER) performance is evaluated by computer simulations in which BLE peripherals communicate simultaneously in the presence of CFO and STO.
Wei GE Shenghua CHEN Benyu LIU Min ZHU Bo LIU
Side-channel attacks, such as simple power analysis and differential power analysis (DPA), are an efficient way to recover the key and thus challenge the security of crypto chips. A side-channel attack logs the power traces of the crypto chip and infers the key by statistical analysis. To reduce the threat of power analysis attacks, an innovative method based on random execution and register randomization is proposed in this paper. To enhance resistance against DPA, the method breaks the correspondence between power trace and operands by scrambling the data execution sequence randomly and dynamically, and it randomizes the data operation path so that the registers storing intermediate data are randomized. Experiments and verification are done on the Sakura-G FPGA platform. The results show that, with the proposed method, the key is not revealed even after 2 million power traces, at a cost of only 7.23% slice overhead and 3.4% throughput reduction. Compared to an unprotected chip, the measurements to disclosure increase by more than 4000×.
For embedded systems, verifying both real-time properties and logical validity is important. An embedded system is required not only to operate correctly but also to satisfy strict real-time properties. Verifying real-time properties is a key problem in model checking. In order to verify real-time properties of assembly programs, we propose a model checking method for assembly programs and develop a simulator for it. Specifically, we propose a timed Kripke structure, which extends the Kripke structure with execution time, and implement a simulator of the robot's processor to be verified. For an input assembly program, the simulator generates a timed Kripke structure by dynamic program analysis. We also implement a model checker that verifies whether the generated timed Kripke structure satisfies RTCTL formulas. Finally, to evaluate the proposed method, we conduct experiments with the implementation of the verification system. To address a real problem, we have experimented with real microcontroller software.
Dongyeong KIM Dawoon KWON Junghwan SONG
The boomerang connectivity table (BCT) was introduced by C. Cid et al. Using the BCT, the dependency between sub-ciphers in a boomerang structure can be computed more precisely for SPN block ciphers. However, the existing method for generating the BCT is difficult to apply to ARX-based ciphers because of the huge domain size. In this paper, we show a method to compute the dependency between sub-ciphers in a boomerang structure for modular addition. Using bit relations in modular addition, we compute the dependency sequentially, bit by bit. Using this method, we find boomerang characteristics and amplified boomerang characteristics for the ARX-based ciphers LEA and SPECK. For LEA-128, we find a reduced 15-round boomerang characteristic and a reduced 16-round amplified boomerang characteristic, which is two rounds longer than the previous boomerang characteristic. Also, for SPECK64/128, we find a reduced 13-round amplified boomerang characteristic, which is one round longer than the previous rectangle characteristic.
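To make the domain-size issue concrete, the following minimal sketch builds the BCT of a small S-box by exhaustive search, using the definition BCT(Δ, ∇) = #{x : S⁻¹(S(x) ⊕ ∇) ⊕ S⁻¹(S(x ⊕ Δ) ⊕ ∇) = Δ}. It illustrates only the table-based approach for S-boxes, not the bitwise method for modular addition proposed in the paper; the 4-bit permutation `SBOX` and the function name are chosen purely for illustration.

```python
# Illustrative 4-bit permutation (an assumption of this sketch).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def boomerang_connectivity_table(sbox):
    """Return table[delta_in][nabla_out] computed by exhaustive search."""
    size = len(sbox)
    # The S-box must be a permutation so that its inverse exists.
    assert sorted(sbox) == list(range(size))
    inverse = [0] * size
    for x, y in enumerate(sbox):
        inverse[y] = x

    table = [[0] * size for _ in range(size)]
    for delta_in in range(size):
        for nabla_out in range(size):
            # Count inputs x for which the boomerang returns with difference delta_in.
            table[delta_in][nabla_out] = sum(
                1
                for x in range(size)
                if inverse[sbox[x] ^ nabla_out]
                ^ inverse[sbox[x ^ delta_in] ^ nabla_out]
                == delta_in
            )
    return table


bct = boomerang_connectivity_table(SBOX)
```

For an n-bit S-box this search costs on the order of 2^(3n) S-box evaluations, which is trivial for n = 4 but out of reach when the nonlinear component is a 32-bit modular addition, as in ARX ciphers.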
Rachasak SOMYANONTHANAKUL Thanaruk THEERAMUNKONG
Objective interestingness measures play a vital role in association rule mining of large-scale databases because they are used for extracting, filtering, and ranking patterns. In the past, several measures have been proposed, but their similarities and relations have not been sufficiently explored. This work investigates sixty-one objective interestingness measures on the pattern A → B to analyze their similarity and dissimilarity as well as their relationships. Three-probability patterns, P(A), P(B), and P(AB), are enumerated in both linear and exponential scales, and the value of each measure under those conditions is calculated, forming synthetic data for investigation. The behavior of each measure is explored by pairwise comparison based on these three-probability patterns. The relationships among the sixty-one interestingness measures are characterized with correlation analysis and association rule mining. In the experiment, the relationships are summarized using a heat map and mined association rules. As a result, an appropriate interestingness measure can be selected using the generated heat map and association rules.
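As a small illustration of how measures can be evaluated from one three-probability pattern, the sketch below computes four widely used measures (support, confidence, lift, and leverage) for a rule A → B. These four are chosen only as familiar examples and are not claimed to be the paper's exact set of sixty-one; the function name is hypothetical.

```python
def interestingness(p_a, p_b, p_ab):
    """Compute a few standard measures for A -> B from P(A), P(B), P(AB)."""
    if not (0.0 < p_a <= 1.0 and 0.0 < p_b <= 1.0):
        raise ValueError("P(A) and P(B) must lie in (0, 1]")
    # P(AB) is bounded by the Frechet inequalities.
    if not (max(0.0, p_a + p_b - 1.0) <= p_ab <= min(p_a, p_b)):
        raise ValueError("P(AB) is inconsistent with P(A) and P(B)")
    return {
        "support": p_ab,                  # P(AB)
        "confidence": p_ab / p_a,         # P(B | A)
        "lift": p_ab / (p_a * p_b),       # 1.0 under independence
        "leverage": p_ab - p_a * p_b,     # 0.0 under independence
    }


independent = interestingness(0.5, 0.4, 0.2)
positive = interestingness(0.5, 0.4, 0.3)
```

Evaluating such a function over a grid of (P(A), P(B), P(AB)) triples yields one vector of values per measure, and those vectors can then be compared pairwise, for example by correlation.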
Konlakorn WONGAPTIKASEREE Panida YOMABOOT Kantinee KATCHAPAKIRIN Yongyos KAEWPITAKKUN
Depression is a major mental health problem in Thailand, and depression rates have been rapidly increasing. Over 1.17 million Thai people suffer from this mental illness. It is important that a reliable depression screening tool is made available so that depression can be detected early. Given that Facebook is the most popular social network platform in Thailand, it could be a large-scale resource for developing a depression detection tool. This research develops a depression detection algorithm for the Thai language on Facebook, which people use as a tool for sharing opinions, feelings, and life events. To establish a reliable result, the Thai Mental Health Questionnaire (TMHQ), a standardized psychological inventory that measures major mental health problems including depression, is used as the baseline for concluding the result; its depression scale comprises 20 items. Furthermore, this study also aims to perform factor analysis and reduce the number of depression items. Data were collected from over 600 Facebook users. Descriptive statistics, exploratory factor analysis, and internal consistency analysis were conducted. The results provide an optimized version of the TMHQ-depression that contains 9 items. The 9 items are categorized into four factors: suicidal ideation, sleep problems, anhedonia, and guilty feelings. Internal consistency analysis shows that this short version of the TMHQ-depression has good to excellent reliability (Cronbach's alpha > .80). The findings suggest that this optimized TMHQ-depression questionnaire has good psychometric properties and can be used for depression detection.
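For readers unfamiliar with the reliability statistic quoted above, Cronbach's alpha for k items is α = k/(k−1) · (1 − Σ item variances / variance of total score). The following minimal sketch computes it with the standard library; the function name and the toy response matrix are illustrative assumptions and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the total score per respondent.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to three items.
toy_responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 4],
]
alpha = cronbach_alpha(toy_responses)
```

When all items move together perfectly, the statistic equals 1; values above about .80 are conventionally read as good internal consistency, in line with the threshold reported in the abstract.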
In this paper, we propose an effective and robust method of spatial feature extraction for acoustic scene analysis utilizing partially synchronized and/or closely located distributed microphones. In the proposed method, a new cepstrum feature utilizing a graph-based basis transformation to extract spatial information from distributed microphones, while taking into account whether any pairs of microphones are synchronized and/or closely located, is introduced. Specifically, in the proposed graph-based cepstrum, the log-amplitude of a multichannel observation is converted to a feature vector utilizing the inverse graph Fourier transform, which is a method of basis transformation of a signal on a graph. Results of experiments using real environmental sounds show that the proposed graph-based cepstrum robustly extracts spatial information with consideration of the microphone connections. Moreover, the results indicate that the proposed method more robustly classifies acoustic scenes than conventional spatial features when the observed sounds have a large synchronization mismatch between partially synchronized microphone groups.
Hiroshi FUJIWARA Kei SHIBUSAWA Kouki YAMAMOTO Hiroaki YAMAMOTO
The multislope ski-rental problem is an online optimization problem that generalizes the classical ski-rental problem. The player is offered not only a buy option and a rent option but also other options that charge both initial and per-time fees. The competitive ratio of the classical ski-rental problem is known to be 2. In contrast, the best bounds known so far on the competitive ratio of the multislope ski-rental problem are an upper bound of 4 and a lower bound of 3.62. In this paper, we consider a parametric version of the multislope ski-rental problem, regarding the number of options as a parameter. We prove an upper bound for the parametric problem that is strictly less than 4. Moreover, we give a simple recurrence relation that yields an equation having a lower bound value as its root.
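The competitive ratio of 2 for the classical problem can be checked numerically. The sketch below covers only the classical two-option case, not the multislope problem, and assumes a rental fee of 1 per day and an integer purchase price B; the function names are hypothetical. It implements the well-known break-even strategy: rent for B − 1 days and buy on day B.

```python
def online_cost(buy_price, ski_days):
    """Cost of the break-even strategy: rent for buy_price - 1 days, then buy."""
    rented_days = min(ski_days, buy_price - 1)
    bought = ski_days >= buy_price
    return rented_days + (buy_price if bought else 0)


def offline_cost(buy_price, ski_days):
    """Optimal cost when the number of ski days is known in advance."""
    return min(ski_days, buy_price)


def worst_case_ratio(buy_price, horizon):
    """Largest online/offline cost ratio over 1..horizon ski days."""
    return max(
        online_cost(buy_price, days) / offline_cost(buy_price, days)
        for days in range(1, horizon + 1)
    )


ratio = worst_case_ratio(buy_price=10, horizon=100)
```

The worst case occurs when the season lasts at least B days: the online player pays 2B − 1 against an optimum of B, so the ratio is (2B − 1)/B, which approaches 2 as B grows.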
Hiroki TAMARU Yuki SAITO Shinnosuke TAKAMICHI Tomoki KORIYAMA Hiroshi SARUWATARI
This paper proposes a generative moment matching network (GMMN)-based post-filtering method for providing inter-utterance pitch variation to singing voices and discusses its application to our developed mixing method called neural double-tracking (NDT). When a human singer sings and records the same song twice, there is a difference between the two recordings. The difference, which is called inter-utterance variation, enriches the performer's musical expression and the audience's experience. For example, it makes every concert special because it never recurs in exactly the same manner. Inter-utterance variation enables a mixing method called double-tracking (DT). With DT, the same phrase is recorded twice, then the two recordings are mixed to give richness to singing voices. However, in synthesized singing voices, which are commonly used to create music, there is no inter-utterance variation because the synthesis process is deterministic. There is also no inter-utterance variation when only one voice is recorded. Although there is a signal processing-based method called artificial DT (ADT) to layer singing voices, the signal processing results in unnatural sound artifacts. To solve these problems, we propose a post-filtering method for randomly modulating synthesized or natural singing voices as if the singer sang again. The post-filter built with our method models the inter-utterance pitch variation of human singing voices using a conditional GMMN. Evaluation results indicate that 1) the proposed method provides perceptible and natural inter-utterance variation to synthesized singing voices and that 2) our NDT exhibits higher double-trackedness than ADT when applied to both synthesized and natural singing voices.
Apinporn METHAWACHANANONT Marut BURANARACH Pakaimart AMSURIYA Sompol CHAIMONGKHON Kamthorn KRAIRAKSA Thepchai SUPNITHI
A key driver of software business growth in developing countries is the survival of software small and medium-sized enterprises (SMEs). Product quality is a critical factor that can determine the future of the business by building customer confidence. Software development agencies need to be aware of meeting international standards in the software development process. In practice, consultants and assessors are usually employed as the primary solution, which can strain the budget of small businesses. Self-assessment tools for the software development process can potentially reduce the time and cost of formal assessment for software SMEs. However, the existing support methods and tools are largely insufficient in terms of process coverage and semi-automated evaluation. This paper proposes to apply a knowledge-based approach to the development of a self-assessment and gap analysis support system for the ISO/IEC 29110 standard. The approach has the advantage that insights from domain experts and the standard are captured in the knowledge base in the form of decision tables that can be flexibly managed. Our knowledge base is unique in that the task lists and work products defined in the standard are broken down into task and work product characteristics, respectively. Their relation provides the links between Task List and Work Product, which help users better understand and carry out self-assessment. A prototype support system was developed to assess the level of software development capability of the agencies based on the ISO/IEC 29110 standard. A preliminary evaluation study showed that, compared to the traditional self-assessment method, the system can improve the performance of users who are inexperienced in applying the ISO/IEC 29110 standard in terms of task coverage and the user's time and effort.
S-shaped nonlinearity is found in the electrical resistance-length relationship in an electroactive supercoiled polymer artificial muscle. The modulation of the electrical resistance is mainly caused by the change in the contact condition of coils in the artificial muscle upon deformation. A mathematical model based on logistic function fairly reproduces the experimental data of electrical resistance-length relationship.
Yuhei WATANABE Hideki YAMAMOTO Hirotaka YOSHIDA
As Internet-connected services emerge, there is a need for use cases in which a lightweight cryptographic primitive meets both a constrained hardware implementation requirement and a constrained embedded software requirement. One example of these use cases is the PKES (Passive Keyless Entry and Start) system in the automotive domain. From the perspective of these use cases, one interesting direction is to investigate how small the memory (RAM/ROM) requirement of ARM implementations of hardware-oriented stream ciphers can be. In this paper, we propose implementation techniques for memory-optimized implementations of lightweight hardware-oriented stream ciphers, including Grain-128a, which is specified in ISO/IEC 29167-13 for RFID protocols. Our techniques include data-dependency analysis, which takes a close look at how and when certain variables are updated, and also take into account the structure of registers on the target microcontroller. In order to minimize RAM size, we reduce the number of general purpose registers used for the computation of Grain-128a's update and pre-output values. We present results of our memory-optimized implementations of Grain-128a, one of which requires 84 RAM bytes on ARM Cortex-M3.
Kenji KITA Hiroshi GOTOH Hiroyasu ISHIKAWA Hideyuki SHINONAGA
Power line communications (PLC) is a communication technology that uses a power line as a transmission medium. Previous studies have shown that connecting an AC adapter such as a mobile phone charger to the power line affects signal quality. Therefore, in this paper, the authors analyze the influence of chargers on inter-computer communications, using packet capture to evaluate communications quality. The analysis results indicate that a short duration in which no packets are detected occurs once in every half period of the power-line supply; this duration is named the communication forbidden time. For visualizing the communication forbidden time and for evaluating the communications quality of inter-computer communications using PLC, the authors propose an instantaneous power-line frequency synchronized superimposed chart and its plotting algorithm. Further, for accurate analysis, the position of the communication forbidden time in the chart can be changed by altering the initial burst signal plotting position. The difference in the chart that occurs when the plotting start position changes is also discussed. We show analysis examples that apply the chart to test bed data assuming an ideal environment, and show the effectiveness of the chart for analyzing PLC inter-computer communications.
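The idea of superimposing packet arrivals on the half period of the supply can be sketched as follows. This is a simplified illustration and not the authors' plotting algorithm: it assumes a constant nominal 50 Hz supply (half period of 10,000 µs) instead of tracking the instantaneous power-line frequency, and the function names, bin count, and synthetic timestamps are all hypothetical.

```python
def fold_into_half_period(timestamps_us, half_period_us=10_000, n_bins=20):
    """Histogram packet timestamps by their phase within the half period."""
    counts = [0] * n_bins
    for t in timestamps_us:
        phase = t % half_period_us                       # position inside the half period
        counts[phase * n_bins // half_period_us] += 1    # integer bin index
    return counts


def empty_bins(counts):
    """Indices of phase bins in which no packet was ever observed."""
    return [index for index, count in enumerate(counts) if count == 0]


# Synthetic capture: one packet every 700 us for one second, except that
# nothing is received while the phase lies in [4000, 5000) us.
synthetic = [
    t for t in range(0, 1_000_000, 700)
    if not 4000 <= t % 10_000 < 5000
]
forbidden = empty_bins(fold_into_half_period(synthetic))
```

Phase bins that stay empty over many half periods correspond to a candidate communication forbidden time; shifting the time origin moves where that gap appears in the folded chart.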
The spectrum sensing of the orthogonal frequency division multiplexing (OFDM) system in cognitive radio (CR) has always been challenging, especially for user terminals that utilize the full-duplex (FD) mode. We herein propose an advanced FD spectrum-sensing scheme that can be successfully performed even when severe self-interference is encountered from the user terminal. Based on the “classification-converted sensing” framework, the cyclostationary periodogram generated by OFDM pilots is exhibited in the form of images. These images are subsequently plugged into convolutional neural networks (CNNs) for classifications owing to the CNN's strength in image recognition. More importantly, to realize spectrum sensing against residual self-interference, noise pollution, and channel fading, we used adversarial training, where a CR-specific, modified training database was proposed. We analyzed the performances exhibited by the different architectures of the CNN and the different resolutions of the input image to balance the detection performance with computing capability. We proposed a design plan of the signal structure for the CR transmitting terminal that can fit into the proposed spectrum-sensing scheme while benefiting from its own transmission. The simulation results prove that our method has excellent sensing capability for the FD system; furthermore, our method achieves a higher detection accuracy than the conventional method.
Qian CHENG Jiang ZHU Tao XIE Junshan LUO Zuohong XU
A low-complexity time-invariant angle-range dependent directional modulation (DM) based on time-modulated frequency diverse array (TM-FDA-DM) is proposed to achieve point-to-point physical layer security communications. The principle of TM-FDA is elaborated and the vector synthesis method is utilized to realize the proposal, TM-FDA-DM, where normalization and orthogonal matrices are designed to modulate the useful baseband symbols and inserted artificial noise, respectively. Since the two designed matrices are time-invariant fixed values, which avoid real-time calculation, the proposed TM-FDA-DM is much easier to implement than time-invariant DMs based on conventional linear FDA or logarithmical FDA, and it also outperforms the time-invariant angle-range dependent DM that utilizes genetic algorithm (GA) to optimize phase shifters on radio frequency (RF) frontend. Additionally, a robust synthesis method for TM-FDA-DM with imperfect angle and range estimations is proposed by optimizing normalization matrix. Simulations demonstrate that the proposed TM-FDA-DM exhibits time-invariant and angle-range dependent characteristics, and the proposed robust TM-FDA-DM can achieve better BER performance than the non-robust method when the maximum range error is larger than 7km and the maximum angle error is larger than 4°.
Yusuke KIMURA Amir Masoud GHAREHBAGHI Masahiro FUJITA
This paper introduces methods to modify a buggy sequential gate-level circuit to conform to the specification. In order to preserve the optimization efforts, the modifications should be as small as possible. Assuming that the locations to be modified are given, our proposed method finds an appropriate set of fan-in signals for the patch function of those locations by iteratively calculating the state correspondence between the specification and the buggy circuit and applying a method for debugging combinational circuits. The experiments are conducted on ITC99 benchmark circuits, and it is shown that our proposed method can work when there are at most 30,000 corresponding reachable state pairs between two circuits. Moreover, a heuristic method using the information of data-path FFs is proposed, which can find a correct set of fan-ins for all the benchmark circuits within practical time.
Xina CHENG Yang LIU Takeshi IKENAGA
Volleyball video analysis plays an important role in providing data for TV content and in developing strategies. Among all the topics of volleyball analysis, qualitative player action recognition is essential because it potentially provides not only the action that is being performed but also its quality, that is, how well the action is performed. However, most action recognition research focuses on the discrimination between different actions. The quality of an action, which is helpful for evaluating and training player skill, has received little attention so far. The vital problems in qualitative action recognition include occlusion, small inter-class differences, and the variety of appearances caused by player changes. This paper proposes a recognition framework that combines 3D global features and multi-view local features, using a global team formation feature, a ball state feature, and abrupt pose features. The above problems are solved by the combination of 3D global features (which hide the unstable and incomplete 2D motion features caused by occlusion) and multi-view local features (which capture detailed local motion features of body parts from multiple viewpoints). Firstly, the team formation feature extracts the 3D trajectories of all team members rather than of a single target player. This proposal focuses on the feature of the team as a whole while eliminating individual effects. Secondly, the ball motion state feature extracts features from the 3D ball trajectory. The ball motion is not affected by personal appearance, so this proposal ignores the influence of the players' appearance and is more robust to target player changes. Lastly, the abrupt pose feature consists of two parts: the abrupt hit frame pose (which extracts the contour shape of the player's pose at the hit time) and the abrupt pose variation (which extracts the pose variation between the preparation pose and the ending pose during the action). These two features make the differences in action quality more distinguishable by focusing on how standard and stable the motion is across actions of different quality. Experiments are conducted on game videos from the Semifinal and Final Game of the 2014 Japan Inter High School Games of Men's Volleyball in Tokyo Metropolitan Gymnasium. The experimental results show that the accuracy reaches 97.26% for action discrimination, an improvement of 11.33%, and 91.76% for action quality evaluation, an improvement of 13.72%.
Linear Prediction (LP) analysis is commonly used in speech processing. LP is based on the Auto-Regressive (AR) model and estimates the AR model parameters from signals with l2-norm optimization. Recently, sparse estimation has attracted attention since it can extract significant features from big data. Sparse estimation is realized by l1- or l0-norm optimization or regularization. Sparse LP analysis methods based on l1-norm optimization have been proposed. Since the excitation of speech is not white Gaussian, sparse LP estimation can estimate more accurate parameters than the conventional l2-norm based LP. These are time-invariant and real-valued analyses. We have studied Time-Varying Complex AR (TV-CAR) analysis for an analytic signal and have evaluated its performance on speech processing. The TV-CAR methods are l2-norm methods. In this paper, we propose sparse TV-CAR analysis based on adaptive LASSO (Least Absolute Shrinkage and Selection Operator), which is an l1-norm regularization, and evaluate its performance on F0 estimation of speech using IRAPT (Instantaneous RAPT). The experimental results show that the sparse TV-CAR methods perform better for a high level of additive pink noise.
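As background for the l2-norm baseline mentioned above, the following minimal sketch estimates real-valued, time-invariant AR parameters by the autocorrelation method with the Levinson-Durbin recursion. It illustrates conventional LP only, not the proposed sparse TV-CAR analysis; the function names and the test signal (the impulse response of a hypothetical AR(2) filter with coefficients 1.2 and −0.81) are assumptions made for illustration.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite signal (no normalisation)."""
    return [
        sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for the l2-optimal predictor.

    Returns (coeffs, error), where x[n] is predicted as
    sum(coeffs[k - 1] * x[n - k] for k in 1..order).
    """
    coeffs = [0.0] * (order + 1)  # index 0 is unused
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        k = (r[i] - sum(coeffs[j] * r[i - j] for j in range(1, i))) / error
        updated = coeffs[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = coeffs[j] - k * coeffs[i - j]
        coeffs = updated
        error *= 1.0 - k * k
    return coeffs[1:], error


# Impulse response of x[n] = 1.2 x[n-1] - 0.81 x[n-2] + delta[n] (stable, pole radius 0.9).
true_coeffs = (1.2, -0.81)
signal = [1.0, 1.2]
for _ in range(398):
    signal.append(true_coeffs[0] * signal[-1] + true_coeffs[1] * signal[-2])

estimated, residual_energy = levinson_durbin(autocorrelation(signal, 2), order=2)
```

Because the quadratic criterion treats the excitation as if it were white Gaussian, this estimator is the natural reference point against which l1-regularized (sparse) variants are compared.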