Wei HAN Xiongwei ZHANG Meng SUN Li LI Wenhua SHI
In this letter, we propose a novel speech separation method based on a perceptually weighted deep recurrent neural network (DRNN) that incorporates the masking properties of the human auditory system. In the supervised training stage, we first use the clean label speech of two different speakers to calculate two perceptual weighting matrices. The two perceptual weighting matrices are then used to adjust the mean squared error between the network outputs and the reference features of the two clean speech signals, so that the two speech signals can mask each other. Experimental results on the TSP speech corpus demonstrate that the proposed speech separation approach achieves significant improvements over state-of-the-art methods when tested with different mixing cases.
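The weighted training objective described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the perceptual weighting matrices are computed, so `w1` and `w2` are treated as given inputs, and the function names are hypothetical.

```python
def weighted_mse(outputs, targets, weights):
    """Mean of w * (y - t)^2 over all time-frequency bins.

    outputs, targets, weights: lists of rows (frames) of equal shape.
    The weights stand in for a perceptual weighting matrix; how they are
    derived from the clean speech is NOT specified here (assumption).
    """
    total = 0.0
    count = 0
    for out_row, tgt_row, w_row in zip(outputs, targets, weights):
        for y, t, w in zip(out_row, tgt_row, w_row):
            total += w * (y - t) ** 2
            count += 1
    return total / count


def two_speaker_loss(out1, out2, ref1, ref2, w1, w2):
    """Sum of the weighted errors for the two separated sources."""
    return weighted_mse(out1, ref1, w1) + weighted_mse(out2, ref2, w2)


# Tiny example: one frame, two frequency bins per speaker.
loss = two_speaker_loss(
    [[1.0, 2.0]], [[0.0, 1.0]],   # network outputs for speakers 1 and 2
    [[0.0, 0.0]], [[0.0, 0.0]],   # reference features
    [[1.0, 0.5]], [[1.0, 1.0]],   # placeholder weighting matrices
)
```

With all weights set to 1 the loss reduces to the ordinary mean squared error, so the weighting only changes how strongly each time-frequency bin contributes.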
Hafiz Farooq AHMAD Hiroki SUGURI Muhammad Qaisar CHOUDHARY Ammar HASSAN Ali LIAQAT Muhammad Umer KHAN
Wireless technology has become widely popular and an important means of communication. A key issue in delivering wireless services is congestion, which has an adverse impact on the Quality of Service (QoS), especially timeliness. Although a lot of work has been done in the context of Radio Resource Management (RRM), delivering quality service to the end user remains a challenge. There is therefore a need for a system that provides real-time services to users with high assurance. We propose an intelligent agent-based approach that guarantees a predefined Service Level Agreement (SLA) with heterogeneous user requirements for appropriate bandwidth allocation in QoS-sensitive cellular networks. The proposed system architecture exploits the Case-Based Reasoning (CBR) technique to handle the RRM process of congestion management. The system fulfils the predefined SLA through a Retrieval and Adaptation Algorithm based on a CBR case library. The proposed intelligent agent architecture gives the Radio Network Controller (RNC) or Base Station (BS) autonomy in accepting, rejecting or buffering a connection request to manage system bandwidth. Instead of simply blocking connection requests when congestion hits the system, different buffering durations are allocated to different classes of users based on their SLA. This increases the chance of connection establishment and substantially reduces the call blocking rate in a changing environment. We simulate the proposed system and verify its efficient performance in congestion handling. The results also show the built-in dynamism of our system in catering for a variety of SLA requirements.
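The accept, buffer or reject decision described above might look like the following minimal sketch. It deliberately omits the CBR retrieval and adaptation steps, and the SLA class names, buffering durations and function names are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical maximum buffering time in seconds per SLA class (assumption).
BUFFER_LIMIT_S = {"gold": 10.0, "silver": 5.0, "bronze": 0.0}


@dataclass
class Request:
    sla_class: str     # e.g. "gold", "silver", "bronze" (illustrative names)
    bandwidth: float   # requested bandwidth, arbitrary units


def decide(request, free_bandwidth):
    """Return (decision, buffering_time_s) for one connection request."""
    if request.bandwidth <= free_bandwidth:
        return ("accept", 0.0)
    # Congested: buffer the request for a class-dependent duration
    # instead of blocking it immediately.
    wait = BUFFER_LIMIT_S.get(request.sla_class, 0.0)
    if wait > 0.0:
        return ("buffer", wait)
    return ("reject", 0.0)
```

For example, `decide(Request("gold", 5.0), 2.0)` buffers the request for the gold-class duration, while the same request from a class with no buffering allowance is rejected.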
Yinan LI Xiongwei ZHANG Meng SUN Yonggang HU Li LI
An online version of convolutive non-negative sparse coding (CNSC) with the generalized Kullback-Leibler (K-L) divergence is proposed to adaptively learn spectral-temporal bases from speech streams. The proposed scheme processes training data piece-by-piece and incrementally updates learned bases with accumulated statistics to overcome the inefficiency of its offline counterpart in processing large scale or streaming data. Compared to conventional non-negative sparse coding, we utilize the convolutive model within bases, so that each basis is capable of describing a relatively long temporal span of signals, which helps to improve the representation power of the model. Moreover, by incorporating a voice activity detector (VAD), we propose an unsupervised enhancement algorithm that updates the noise dictionary adaptively from non-speech intervals. Meanwhile, for the speech intervals, one can adaptively learn the speech bases by keeping the noise ones fixed. Experimental results show that the proposed algorithm outperforms the competing algorithms substantially, especially when the background noise is highly non-stationary.
Zhouwen TAN Ziji MA Hongli LIU Keli PENG Xun SHAO
Impulsive noise (IN) is the dominant factor degrading the performance of communication systems over power lines. To improve the performance of high-speed power line communication (PLC), this work focuses on mitigating burst IN effects based on compressive sensing (CS), and an adaptive burst IN mitigation method is designed that combines an adaptive interleaver with a permutation of null carriers. First, the long burst IN is dispersed by an interleaver at the receiver, and the characteristics of the noise are estimated by the method of moments. Then, the resulting sparse noise is reconstructed by adaptively changing the number of null carriers (NNC) according to the noise environment. Simulation results show that the proposed IN mitigation technique is simple and effective for mitigating burst IN in PLC systems: it reduces the burst IN and improves the overall system throughput. In addition, the proposed technique outperforms other known nonlinear noise mitigation methods and CS methods.
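To illustrate how interleaving disperses a long burst into isolated samples, here is a minimal sketch using a plain block interleaver. The abstract does not specify the interleaver structure, so the row/column design and the function names are assumptions.

```python
def interleave(symbols, rows, cols):
    """Block interleaver: write row by row, read column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave()."""
    assert len(symbols) == rows * cols
    out = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = symbols[i]
            i += 1
    return out


# A burst hitting four consecutive samples of the interleaved stream ...
rows, cols = 4, 4
noise = [0] * (rows * cols)
for i in range(4, 8):
    noise[i] = 1

# ... lands on samples that are `cols` positions apart after deinterleaving.
spread = deinterleave(noise, rows, cols)
hit = [i for i, v in enumerate(spread) if v]
```

After deinterleaving, the burst samples are no longer adjacent, which is what makes the noise look sparse to a subsequent reconstruction step.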
Hanli LIU Teerachot SIRIBURANON Kengo NAKATA Wei DENG Ju Ho SON Dae Young LEE Kenichi OKADA Akira MATSUZAWA
This paper presents a 27.5-29.6GHz fractional-N frequency synthesizer using reference and frequency doublers to achieve low in-band and out-of-band phase noise for 5G mobile communications. A consideration of the baseband carrier recovery circuit helps estimate the phase noise requirement for high-order modulation schemes. The push-push amplifier and 28GHz balun help achieve differential signals with low out-of-band phase noise while consuming low power. A charge pump with gated offset and a reference doubler help reduce PD noise, resulting in low in-band phase noise, while a sampling loop filter helps reduce spurs. The proposed synthesizer has been implemented in 65nm CMOS technology, achieving in-band and out-of-band phase noise of -78dBc/Hz and -126dBc/Hz, respectively. It consumes a total power of only 33mW. The jitter-power figure-of-merit (FOM) is -231dB, which is the best among state-of-the-art >20GHz fractional-N PLLs using a low reference clock (<200MHz). The measured reference spurs are less than -80dBc.
Jianwei LIU Hongli LIU Xuefeng NI Ziji MA Chao WANG Xun SHAO
Automatic disassembly of railway fasteners is of great significance for improving the efficiency of rail replacement. Accurate positioning of the fastener is the key to automatic disassembly. However, most of the existing literature focuses on fastener region positioning, and literature on accurate positioning of fasteners is scarce. Therefore, this paper constructs a visual inspection system for accurate positioning of fasteners (VISP). First, VISP acquires a railway image through its image acquisition subsystem and obtains the fastener subimage by a coarse-to-fine method. Subsequently, accurate positioning of the fastener is completed in three steps: contrast enhancement, binarization and spike region extraction. The validity and robustness of VISP were verified by extensive experiments. The results show that VISP has competitive performance for accurate positioning of fasteners. A single positioning takes about 260ms, and the average positioning accuracy is above 90%. The system is thus of theoretical interest and has potential industrial application.
In a 1-out-of-n oblivious signature scheme, a user provides a set of messages to a signer for signatures but he/she can only obtain a valid signature for a specific message chosen from the message set. There are two security requirements for 1-out-of-n oblivious signature. The first is ambiguity, which requires that the signer is not aware which message among the set is signed. The other one is unforgeability which requires that the user is not able to derive any other valid signature for any messages beyond the one that he/she has chosen. In this paper, we provide a generic construction of 1-out-of-n oblivious signature. Our generic construction consists of two building blocks, a commitment scheme and a standard signature scheme. Our construction is round efficient since it only asks one interaction (i.e., two rounds) between the user and signer. Meanwhile, in our construction, the ambiguity of the 1-out-of-n oblivious signature scheme is based on the hiding property of the underlying commitment, while the unforgeability is based on the binding property of the underlying commitment scheme and the unforgeability of the underlying signature scheme. Moreover, our construction can also enjoy strong unforgeability as long as the underlying building blocks have strong binding property and strong unforgeability respectively. Given the fact that commitment and digital signature are well-studied topics in cryptography and numerous concrete schemes have been proposed in the standard model, our generic construction immediately yields a bunch of instantiations in the standard model based on well-known assumptions, including not only traditional assumptions like Decision Diffie-Hellman (DDH), Decision Composite Residue (DCR), etc., but also some post-quantum assumption like Learning with Errors (LWE). As far as we know, our construction admits the first 1-out-of-n oblivious signature schemes based on the standard model.
Zheng SUN Dingxin XU Hongye HUANG Zheng LI Hanli LIU Bangan LIU Jian PANG Teruki SOMEYA Atsushi SHIRANE Kenichi OKADA
This paper presents a miniaturized transformer-based ultra-low-power (ULP) LC-VCO with embedded supply pushing reduction techniques for IoT applications in a 65-nm CMOS process. To reduce the on-chip area, a compact transformer patterned ground shield (PGS) is implemented. The transistors with switchable capacitor banks and associated components are placed underneath the transformer, which further shrinks the on-chip area. To lower the power consumption of the VCO, a gm-stacked LC-VCO using the transformer embedded with PGS is proposed. The transformer is designed to provide a large inductance to obtain a robust start-up within limited power consumption. To avoid implementing an off-chip or on-chip low-dropout regulator (LDO), which requires additional voltage headroom, a low-power supply pushing reduction feedback loop is integrated to mitigate the current variation, so that the oscillation amplitude and frequency can be stabilized. The proposed ULP TF-based LC-VCO achieves a phase noise of -114.8dBc/Hz at 1MHz frequency offset and a 16kHz flicker corner with 103µW power consumption at 2.6GHz oscillation frequency, which corresponds to a -193dBc/Hz VCO figure-of-merit (FoM), and occupies only 0.12mm² of on-chip area. The supply pushing is reduced to 2MHz/V, resulting in a -50dBc spur when 5MHz sinusoidal ripples with 50mVPP are added on the DC supply.
Hongwei YANG Fucheng XUE Dan LIU Li LI Jiahui FENG
Service composition optimization is a classic NP-hard problem. How to quickly select high-quality services that meet user needs from a large number of candidate services is a hot topic in cloud service composition research. In this study, an efficient second-order beetle swarm optimization with global search ability is proposed to solve the problem of cloud service composition optimization. First, the beetle antennae search algorithm is introduced into the modified particle swarm optimization algorithm, the population is initialized using a chaotic sequence, and modified nonlinear dynamic trigonometric learning factors are adopted to control the expanding capacity of the particles and the global convergence capability. Second, modified secondary oscillation factors are incorporated, increasing the search precision and global searching ability of the algorithm. An adaptive step adjustment is used to improve the stability of the algorithm. Experimental results based on a real data set indicate that the proposed global optimization algorithm can solve web service composition optimization problems in a cloud environment. It exhibits excellent global searching ability, comparatively fast convergence, favorable stability, and a lower time cost.
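Chaotic initialization of a swarm can be sketched as below. The abstract does not say which chaotic map is used; the logistic map, the seed `x0` and the function name are assumptions made purely for illustration.

```python
def chaotic_population(n_particles, dim, lower, upper, x0=0.37, mu=4.0):
    """Initialize particle positions from a logistic-map sequence.

    The logistic map x <- mu * x * (1 - x) with mu = 4 is one common
    choice of chaotic sequence (assumed here, not taken from the source).
    Each iterate lies in [0, 1] and is scaled into [lower, upper].
    """
    x = x0
    population = []
    for _ in range(n_particles):
        particle = []
        for _ in range(dim):
            x = mu * x * (1.0 - x)
            particle.append(lower + (upper - lower) * x)
        population.append(particle)
    return population


# Example: 5 particles in a 3-dimensional search space bounded by [-10, 10].
swarm = chaotic_population(5, 3, -10.0, 10.0)
```

The sequence is deterministic for a given `x0`, so a run can be reproduced, while the positions still cover the search range irregularly.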
This letter proposes a spread spectrum audio watermarking robust against playback speed modification (PSM) attack which introduces both time-scale modification and pitch shifting. Two important improvements are exploited to achieve this robustness. The first one is selecting an embedding region according to the stable characteristic of the audio energy. The second one is stretching the pseudo-random noise sequence to match the length of the embedding region before embedding and detection. Experimental results show that our method is highly robust to common audio signal processing attacks and synchronization attacks including PSM, cropping, trimming and jittering.
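The second improvement, stretching the pseudo-random noise (PN) sequence to the length of the embedding region before correlating, can be sketched as follows. Linear interpolation is an assumption, since the abstract does not state how the sequence is stretched, and the function names are hypothetical.

```python
import random


def stretch(seq, new_len):
    """Resample seq to new_len samples by linear interpolation (assumed method)."""
    if new_len == 1:
        return [seq[0]]
    out = []
    scale = (len(seq) - 1) / (new_len - 1)
    for i in range(new_len):
        pos = i * scale
        lo = min(int(pos), len(seq) - 1)
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append((1 - frac) * seq[lo] + frac * seq[hi])
    return out


def correlation(signal, pn):
    """Normalized correlation between a signal segment and a PN sequence."""
    return sum(s * p for s, p in zip(signal, pn)) / len(pn)


# Example: a 100-chip PN sequence stretched to a 150-sample embedding region.
rng = random.Random(0)
pn = [rng.choice((-1.0, 1.0)) for _ in range(100)]
stretched = stretch(pn, 150)
watermarked = [0.1 * p for p in stretched]   # host signal omitted for brevity
score = correlation(watermarked, stretched)
```

Because the detector stretches the same PN sequence to the length of the region it finds, the two sequences stay aligned even if playback speed modification has changed that length.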
Middle-level parts have attracted great attention in the computer vision community, acting as discriminative elements for objects. In this paper we propose an unsupervised approach to mine discriminative parts for object detection. This work features three aspects. First, we introduce an unsupervised, exemplar-based training process for part detection. We generate initial parts by selective search and then train part detectors by exemplar SVM. Second, a part selection model based on consistency and distinctiveness is constructed to select effective parts from the candidate pool. Third, we combine discriminative part mining with the deformable part model (DPM) for object detection. The proposed method is evaluated on the PASCAL VOC2007 and VOC2010 datasets. The experimental results demonstrate the effectiveness of our method for object detection.
As a giant in the open source community, OpenOffice.org has become the most popular office suite within the Linux community. However, OpenOffice.org is relatively slow when loading documents. Research shows that the most time-consuming part is importing one page of the whole document. If a document has many pages, the accumulated time can be considerable. Therefore, this paper proposes a solution that improves document loading speed through an asynchronous importing mechanism: the document is not imported as a whole; instead, only part of it is imported at first for display, and then a background mechanism is started to asynchronously import the remaining parts and insert them into the drawing queue of OpenOffice.org for display. In this way, users do not have to wait for a long time. An application start-up time testing tool was used to measure the time consumed in loading documents with different numbers of pages before and after the optimization of OpenOffice.org, and regression analysis was then used to analyse the correlation between the number of pages and the loading time. In addition, the experimental data were visualized with the aid of MATLAB. A comparison of the loading times before and after the solution is adopted shows an obvious increase in loading speed. Moreover, the loading speed of the optimized OpenOffice.org is almost the same as that of Microsoft Office. The experimental results show the effectiveness of this solution.
In this letter we propose a robust detection algorithm for audio watermarking for copyright protection. The watermark is embedded in the time domain of an audio signal by the commonly used spread spectrum technique. The detection scheme is an improvement of the conventional correlation detector. A high-pass filter is applied along with the linear prediction error filter to whiten the audio signal, and an adaptive threshold is chosen for the detection decision. Experimental results show that our detection algorithm outperforms the conventional one, not only because it improves the robustness to normal attacks but also because it provides robustness to time-invariant pitch-scale modification.
Song JIA Li LIU Xiayu LI Fengfeng WU Yuan WANG Ganggang ZHANG
Information security has been seriously threatened by the differential power analysis (DPA). Delay-based dual-rail precharge logic (DDPL) is an effective solution to resist these attacks. However, conventional DDPL convertors have some shortcomings. In this paper, we propose improved convertor pairs based on dynamic logic and a sense amplifier (SA). Compared with the reference CMOS-to-DDPL convertor, our scheme could save 69% power consumption. As to the comparison of DDPL-to-CMOS convertor, the speed and power performances could be improved by 39% and 54%, respectively.
Yali LI Hongma LIU Shengjin WANG
A brain-computer interface (BCI) translates brain activity into commands to control external devices. P300 speller based character recognition is an important kind of BCI application. In this paper, we propose a framework to integrate channel correlation analysis into P300 detection. This work is distinguished by two key contributions. First, a coefficient matrix is introduced and constructed for multiple channels, with the elements indicating channel correlations. Agglomerative clustering is applied to group correlated channels. Second, statistics of central tendency are used to fuse the information of correlated channels and generate virtual channels. The generated virtual channels extend the EEG signals and raise the signal-to-noise ratio. The correlated features from the virtual channels are combined with the original signals for classification, and the outputs of a discriminative classifier are used to determine the characters for spelling. Experimental results demonstrate the effectiveness and efficiency of the framework based on channel correlation analysis. Compared with the state of the art, the proposed framework increased the recognition rate by 6% with both 5 and 10 epochs.
In cloud computing environments, data processing systems with strong and stochastic stream data processing capabilities are highly desired by multi-service oriented computing-intensive applications. The independence of different business data streams makes these services very suitable for parallel processing with the aid of multicore processors. Furthermore, because data streams cross randomly between different services, data synchronization is required. Aiming at the stochastic cross-service stream, we propose a hardware synchronization mechanism based on index tables. By using a specifically designed hardware synchronization circuit, we can record the business index number (BIN) of the input and output data flow of the processing unit. By doing so, we can not only obtain the flow control of the job package accessing the processing units, but also guarantee that the work of the processing units is single and continuous. This approach overcomes the high complexity and low reliability of programming in software synchronization. As demonstrated by numerical experiments, the proposed scheme can ensure the validity of the cross-service stream, and its processing speed is better than that of the lock-based synchronization scheme. This scheme is applied to a cryptographic server and accelerates the processing speed of the cryptographic service.
Zheng SUN Hanli LIU Dingxin XU Hongye HUANG Bangan LIU Zheng LI Jian PANG Teruki SOMEYA Atsushi SHIRANE Kenichi OKADA
This paper presents a high jitter performance injection-locked clock multiplier (ILCM) using an ultra-low power (ULP) voltage-controlled oscillator (VCO) for IoT applications in 65-nm CMOS. The proposed transformer-based VCO achieves a low flicker noise corner and sub-100µW power consumption. Double cross-coupled NMOS transistors sharing the same current provide high transconductance. The network using a high-Q factor transformer (TF) provides a large tank impedance to minimize the current requirement. Thanks to the low current bias with a small conduction angle in the ULP VCO design, the proposed TF-based VCO's flicker noise can be suppressed, and a good PN can be achieved in the flicker region (1/f³) with sub-100µW power consumption. Thus, a high figure-of-merit (FoM) can be obtained at both 100kHz and 1MHz without an additional inductor. The proposed VCO achieves a phase noise of -94.5/-115.3dBc/Hz at 100kHz/1MHz frequency offset with 97µW power consumption, which corresponds to a -193/-194dBc/Hz VCO FoM at 2.62GHz oscillation frequency. The measurement results show that the 1/f³ corner is below 60kHz over the tuning range from 2.57GHz to 3.40GHz. Thanks to the proposed low power VCO, the total ILCM achieves 78 fs RMS jitter while using a high reference clock. A 960 fs RMS jitter can be achieved with a 40MHz common reference and 107µW corresponding power.
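The relation between the reported phase noise, power and FoM can be checked with the widely used VCO figure-of-merit definition, FoM = PN - 20·log10(f_osc/f_offset) + 10·log10(P/1mW). The abstract does not state which definition it uses, so this formula is an assumption, and the function name is hypothetical.

```python
import math


def vco_fom(phase_noise_dbc_hz, f_osc_hz, f_offset_hz, power_w):
    """Conventional VCO figure of merit in dBc/Hz (assumed definition).

    FoM = PN - 20*log10(f_osc / f_offset) + 10*log10(P / 1 mW)
    """
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f_osc_hz / f_offset_hz)
            + 10.0 * math.log10(power_w / 1e-3))


# Values quoted in the abstract: 2.62 GHz carrier, 97 uW power consumption.
fom_100k = vco_fom(-94.5, 2.62e9, 100e3, 97e-6)
fom_1m = vco_fom(-115.3, 2.62e9, 1e6, 97e-6)
```

Under this definition, the two results round to the -193 and -194 dBc/Hz figures quoted above.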
In 2004, Tsuji and Shimizu proposed a one-time password authentication protocol, named 2GR (Two-Gene-Relation password authentication protocol). The design goal of the 2GR protocol is to eliminate the stolen-verifier attack on SAS-2 (Simple And Secure password authentication protocol, ver.2) and the theft attack on ROSI (RObust and SImple password authentication protocol). Tsuji and Shimizu claimed that in the 2GR an attacker who has stolen the verifiers from the server cannot impersonate a legitimate user. This paper, however, will point out that the 2GR protocol is still vulnerable to an impersonation attack, in which any attacker can, without stealing the verifiers, masquerade as a legitimate user.
Chun-Li LIN Hung-Min SUN Tzonelih HWANG
A password-based mechanism is the most widely used method of authentication in distributed environments. However, because people are used to choosing easy-to-remember passwords, so-called "weak-passwords," dictionary attacks on them can succeed. The techniques used to prevent dictionary attacks lead to a heavy computational load. Indeed, forcing people to use well-chosen passwords, so-called "strong passwords," with the assistance of tamper-resistant hardware devices can be regarded as another fine authentication solution. In this paper, we examine a recent solution, the SAS protocol, and demonstrate that it is vulnerable to replay and denial of service attacks. We also propose an Optimal Strong-Password Authentication (OSPA) protocol that is secure against stolen-verifier, replay, and denial of service attacks, and minimizes computation, storage, and transmission overheads.
In this letter, we consider the problem of joint selection of transmitters and receivers in a distributed multi-input multi-output radar network for localization. Unlike previous works, we consider a more mathematically challenging but more general situation in which the transmitted signals are not perfectly orthogonal. Taking the Cramér-Rao lower bound as the performance metric, we propose a scheme for joint selection of transmitters and receivers (JSTR) that aims to optimize the localization performance under a limited number of nodes. We propose a bi-convex relaxation to replace the resulting NP-hard non-convex problem. Using the bi-convexity, the surrogate problem can be solved efficiently by the nonlinear alternating direction method of multipliers. Simulation results reveal that the performance of the proposed algorithm is very close to that of the computationally intensive but globally optimal exhaustive search method.