Ryo AIHARA Ryoichi TAKASHIMA Tetsuya TAKIGUCHI Yasuo ARIKI
This paper presents a voice conversion (VC) technique for noisy environments based on a sparse representation of speech. Sparse representation-based VC using non-negative matrix factorization (NMF) is employed for noise-added spectral conversion between different speakers. In our previous exemplar-based VC method, source exemplars and target exemplars are extracted from parallel training data in which the source and target speakers utter the same texts. The input source signal is represented using the source exemplars and their weights. Then, the converted speech is constructed from the target exemplars and the weights related to the source exemplars. However, this exemplar-based approach needs to hold all training exemplars (frames), and it requires a long computation time to obtain the weights of the source exemplars. In this paper, we propose a framework to train the basis matrices of the source and target exemplars so that they have a common weight matrix. By using the basis matrices instead of the exemplars, VC is performed with a shorter computation time than with the exemplar-based method. The effectiveness of this method was confirmed in speaker conversion experiments using noise-added speech data, by comparing it with an exemplar-based method and a conventional Gaussian mixture model (GMM)-based method.
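The exemplar-based conversion step described in this abstract can be sketched for a single spectral frame as follows. This is a minimal illustration and not the authors' implementation: the toy dictionaries, the function names `estimate_weights` and `convert_frame`, the iteration count, and the choice of the KL-divergence multiplicative update are all assumptions made for the example.

```python
def estimate_weights(frame, src_dict, n_iter=200, eps=1e-12):
    """Estimate non-negative weights h so that src_dict * h approximates frame.

    src_dict is a list of rows (frequency bins x exemplars). The update is the
    standard multiplicative rule for the KL divergence with the dictionary fixed.
    """
    n_bins, n_atoms = len(src_dict), len(src_dict[0])
    h = [1.0] * n_atoms  # a strictly positive start keeps the weights non-negative
    col_sums = [sum(src_dict[f][k] for f in range(n_bins)) for k in range(n_atoms)]
    for _ in range(n_iter):
        approx = [sum(src_dict[f][k] * h[k] for k in range(n_atoms)) for f in range(n_bins)]
        ratio = [frame[f] / (approx[f] + eps) for f in range(n_bins)]
        h = [
            h[k] * sum(src_dict[f][k] * ratio[f] for f in range(n_bins)) / (col_sums[k] + eps)
            for k in range(n_atoms)
        ]
    return h


def convert_frame(frame, src_dict, tgt_dict):
    """Reuse the weights found on the source side with the paired target dictionary."""
    h = estimate_weights(frame, src_dict)
    return [sum(row[k] * h[k] for k in range(len(h))) for row in tgt_dict]


# Hypothetical toy dictionaries: 3 source bins, 2 target bins, 2 paired exemplars.
SRC = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
TGT = [[2.0, 0.0], [0.0, 3.0]]
converted = convert_frame([2.0, 1.0, 3.0], SRC, TGT)
```

In this toy case the input frame is an exact combination of the two source exemplars, so the estimated weights approach (2, 1) and the converted frame approaches (4, 3). The cost of keeping every training frame as a dictionary column is what motivates the compact basis matrices proposed in the paper.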
Lechang LIU Keisuke ISHIKAWA Tadahiro KURODA
Parametric resonance-based solutions for a sub-gigahertz radio frequency transceiver with a 0.3V supply voltage are proposed in this paper. As an implementation example, a 0.3V 720µW variation-tolerant injection-locked frequency multiplier is developed in 90nm CMOS. It features a parametric resonance-based multi-phase synthesis scheme, thereby achieving the lowest supply voltage among state-of-the-art frequency synthesizers, with -110dBc @ 600kHz phase noise and an 873MHz-1.008GHz locking range.
Qianjian XING Feng YU Xiaobo YIN Bei ZHAO
In this letter, we present a family of radix-R factorizations for the WHT-FFT with an identical stage-to-stage interconnection pattern in a unified form, where R is any power of 2. This family of algorithms has an identical sparse matrix factorization in each stage and can be implemented in a merged butterfly structure, which is conducive to regular and efficient memory management that scales to high radices. In each stage, the butterflies with the same twiddle factor set are aggregated, which can reduce the number of twiddle factor evaluations or lookup table accesses. This kind of factorization can also be extended to the FFT, WHT and SCHT with an identical stage-to-stage interconnection pattern.
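The idea of an identical stage-to-stage interconnection pattern can be illustrated with a plain radix-2 Walsh-Hadamard transform. The sketch below is a generic constant-geometry WHT and not the radix-R WHT-FFT factorization of the letter; the function names are hypothetical, and the O(n^2) reference is included only to check the result.

```python
def wht_constant_geometry(x):
    """Walsh-Hadamard transform (natural/Hadamard order) of a length-2^k list.

    Every stage uses the same wiring: output i takes the sum of inputs 2i and
    2i+1, and output i + n/2 takes their difference.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(x)
    half = n // 2
    for _ in range(n.bit_length() - 1):  # log2(n) identical stages
        nxt = [0] * n
        for i in range(half):
            a, b = data[2 * i], data[2 * i + 1]
            nxt[i] = a + b
            nxt[i + half] = a - b
        data = nxt
    return data


def wht_direct(x):
    """O(n^2) reference: entry (i, j) of the Hadamard matrix is (-1)^popcount(i & j)."""
    n = len(x)
    return [
        sum(v if bin(i & j).count("1") % 2 == 0 else -v for j, v in enumerate(x))
        for i in range(n)
    ]
```

Each stage transforms the lowest index bit and rotates it to the top position, so after log2(n) stages every bit has been processed and the output is back in natural order. Because the read and write addresses never change between stages, the same addressing logic can be reused throughout.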
Mahmoud KESHAVARZI Delaram AMIRI Amir Mansour PEZESHK Forouhar FARZANEH
This letter presents a novel sparsity-based method for solving the problem of deinterleaving pulse trains. The proposed method models the deinterleaving problem as an underdetermined system of linear equations. After determining the mixing matrix, we find the sparsest solution of this underdetermined system using basis pursuit denoising. The method is superior to previous ones in a number of aspects. First, spurious and missing pulses do not cause any performance reduction in the algorithm. Second, the algorithm works well regardless of the type of pulse repetition interval modulation that is used. Third, the proposed method is able to separate similar sources.
Chang-shuai WANG Jong-wha CHONG
In this paper, a novel White-RGB (WRGB) color filter array-based imaging system for cell phones is presented to reduce noise and reproduce color in low illumination. The core process is based on adaptive diagonal color separation, which recovers color components from a white signal using diagonal reference blocks and location-based color ratio estimation in the luminance space. The experiments, in which the proposed system is compared with RGB and state-of-the-art WRGB approaches, show that our imaging system performs well for images of various spatial frequencies and for color restoration in low-light environments.
Recently, the wavelet-based estimation method has gradually become popular as a new tool for software reliability assessment. The wavelet transform possesses both spatial and temporal resolution, which makes the wavelet-based estimation method powerful in extracting necessary information from observed software fault data from global and local points of view at the same time. This enables us to estimate the software reliability measures with higher accuracy. However, existing works have focused only on point estimation in the wavelet-based approach, where the underlying stochastic process describing the software-fault detection phenomenon is modeled by a non-homogeneous Poisson process. In this paper, we propose an interval estimation method for the wavelet-based approach, aiming to take account of the uncertainty that was left out of consideration in point estimation. More specifically, we employ the simulation-based bootstrap method and derive the confidence intervals of software reliability measures such as the software intensity function and the expected cumulative number of software faults. To this end, we extend the well-known thinning algorithm for the purpose of generating multiple sample data sets from one set of software-fault count data. The results of numerical analysis with real software fault data make it clear that our proposal is a decision support method that enables practitioners to make flexible decisions in software development project management.
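For reference, the basic thinning algorithm that the paper extends can be sketched as below. This is the classical procedure for simulating a non-homogeneous Poisson process from a known intensity function, not the authors' extension to fault count data. The function names, the decaying example intensity, and the number of replicates are assumptions chosen only for illustration.

```python
import math
import random


def simulate_nhpp_by_thinning(intensity, lam_max, horizon, rng):
    """Draw event times of a non-homogeneous Poisson process on [0, horizon].

    intensity(t) must satisfy 0 <= intensity(t) <= lam_max on the interval.
    """
    times = []
    t = 0.0
    while True:
        # Candidate event from a homogeneous process with rate lam_max.
        t += rng.expovariate(lam_max)
        if t > horizon:
            return times
        # Keep the candidate with probability intensity(t) / lam_max.
        if rng.random() * lam_max < intensity(t):
            times.append(t)


def example_intensity(t):
    # Hypothetical decaying fault-detection rate, chosen only for illustration.
    return 5.0 * math.exp(-0.5 * t)


rng = random.Random(0)
samples = [simulate_nhpp_by_thinning(example_intensity, 5.0, 10.0, rng) for _ in range(200)]
mean_count = sum(len(s) for s in samples) / len(samples)
```

Repeating the simulation yields many replicated fault histories, from which percentile-type bootstrap intervals for a reliability measure could be read off. For this example intensity, the average count per replicate should be close to the integral of the intensity over the interval, 10(1 - e^-5).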
A suffix tree is widely adopted for indexing genome sequences. While supporting highly efficient search, the suffix tree has a few shortcomings such as very large size and very long construction time. In this paper, we propose a very fast parallel algorithm to construct a disk-based suffix tree for human genome sequences. Our algorithm constructs a suffix array for part of the suffixes in the human genome sequence and then converts it into a suffix tree very quickly. It outperformed the previous algorithms by Loh et al. and Barsky et al. by up to 2.09 and 3.04 times, respectively.
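The two ingredients usually needed to go from a suffix array to a suffix tree, namely the sorted suffix positions and the longest-common-prefix (LCP) values between neighbouring suffixes, can be sketched as follows. This is a naive in-memory illustration with hypothetical function names and a made-up DNA string; it is not the parallel disk-based algorithm proposed in the paper.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes in lexicographic order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, suffix_array):
    """lcp[r] = length of the common prefix of the suffixes ranked r and r + 1."""
    lcp = []
    for a, b in zip(suffix_array, suffix_array[1:]):
        k = 0
        while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
            k += 1
        lcp.append(k)
    return lcp


# Hypothetical short DNA fragment, used only to show the calls.
fragment = "GATTACAGATTACA"
sa = build_suffix_array(fragment)
lcp = build_lcp_array(fragment, sa)
```

The LCP values give the string depths at which neighbouring suffixes branch, which is the information a suffix tree stores in its internal nodes. The naive sort shown here copies every suffix and is only practical for short strings.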
To better support data-intensive workflows, which are typically built out of various independently developed executables, this paper proposes extensions to parallel database systems called User-Defined eXecutables (UDX) and collective queries. UDX facilitates the description of workflows by enabling seamless integration of external executables into SQL statements, without any effort to write programs conforming to the strict specifications of databases. A collective query is an SQL query whose results are distributed to multiple clients and then processed by them in parallel, using arbitrary UDX. It provides efficient parallelization of executables through data transfer optimization algorithms that distribute query results to multiple clients, taking both communication cost and computational loads into account. We implement this concept in a system called ParaLite, a parallel database system based on the popular lightweight database SQLite. Our experiments show that ParaLite achieves several times higher performance than Hive for typical SQL tasks and a 10x speedup compared to a commercial DBMS for executables. In addition, this paper studies a real-world text processing workflow and builds it on top of ParaLite, Hadoop, Hive and general files. Our experience indicates that ParaLite outperforms the other systems in both productivity and performance for the workflow.
Zhiwei RUAN Guijin WANG Xinggang LIN Jing-Hao XUE Yong JIANG
The transfer of prior knowledge from source domains can improve the performance of learning when the training data in a target domain are insufficient. In this paper we propose a new strategy to transfer deformable part models (DPMs) for object detection, using offline-trained auxiliary DPMs of similar categories as source models to improve the performance of the target object detector. A DPM represents an object by using a root filter and several part filters. We use these filters of the auxiliary DPMs as prior knowledge and adapt the filters to the target object. With a latent transfer learning method, appropriate local features are extracted for the transfer of part filters. Our experiments demonstrate that this strategy can lead to a detector superior to some state-of-the-art methods.
Daichi KITAMURA Hiroshi SARUWATARI Kosuke YAGI Kiyohiro SHIKANO Yu TAKAHASHI Kazunobu KONDO
In this letter, we address monaural source separation based on supervised nonnegative matrix factorization (SNMF) and propose a new penalized SNMF. Conventional SNMF often degrades the separation performance owing to the basis-sharing problem. Our penalized SNMF forces nontarget bases to become different from the target bases, which increases the separated sound quality.
A new scheme based on multi-order visual comparison is proposed for full-reference image quality assessment. Inspired by the observation that various image derivatives have great but different effects on visual perception, we compare each order of image derivatives separately. To obtain an overall image quality score, we adaptively integrate the results of the different comparisons via a perception-inspired strategy. Experimental results on public databases demonstrate that the proposed method is more competitive than some state-of-the-art methods when benchmarked against subjective assessments given by human beings.
Hongliang XU Fei ZHOU Fan YANG Qingmin LIAO
We propose a parameterized multisurface fitting method for multi-frame super-resolution (SR) processing. A parameter assumed for the unknown high-resolution (HR) pixel is used for multisurface fitting. Each surface fitted at each low-resolution (LR) pixel is an expression of the parameter. The final SR result is obtained by fusing the sampling values from these surfaces in a maximum a posteriori fashion. Experimental results demonstrate the superiority of the proposed method.
Gibran BENITEZ-GARCIA Gabriel SANCHEZ-PEREZ Hector PEREZ-MEANA Keita TAKAHASHI Masahide KANEKO
This paper presents a facial expression recognition algorithm based on segmentation of a face image into four facial regions (eyes-eyebrows, forehead, mouth and nose). In order to unify the different results obtained from facial region combinations, a modal value approach that employs the most frequent decision of the classifiers is proposed. The robustness of the algorithm is also evaluated under partial occlusion, using four different types of occlusion (half left/right, eyes and mouth occlusion). The proposed method employs a sub-block eigenphases algorithm that uses the phase spectrum and principal component analysis (PCA) to estimate the feature vector, which is fed to a support vector machine (SVM) for classification. Experimental results show that the modal value approach improves the average recognition rate to more than 90%, and that the performance can be kept high even in the case of partial occlusion by excluding occluded parts from the feature extraction process.
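The modal value fusion step, which takes the most frequent decision among the classifiers, can be sketched in a few lines. The function name, the expression labels, and the tie-breaking rule are assumptions for illustration; the paper does not state how ties are resolved.

```python
from collections import Counter


def modal_decision(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    if not predictions:
        raise ValueError("need at least one classifier decision")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical decisions from classifiers trained on different region combinations.
region_decisions = ["happy", "surprise", "happy", "sad", "happy"]
final_label = modal_decision(region_decisions)
```

Under partial occlusion, the decisions of classifiers that depend on the occluded regions would simply be left out of the list before voting.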
Tsukasa OMOTO Koji EGUCHI Shotaro TORA
The hierarchical Dirichlet process (HDP) can provide a nonparametric prior for a mixture model with grouped data, where mixture components are shared across groups. However, the computational cost is generally very high in terms of both time and space complexity. Therefore, developing a method for fast inference of HDP remains a challenge. In this paper, we assume a symmetric multiprocessing (SMP) cluster, which has been widely used in recent years. To speed up the inference on an SMP cluster, we explore hybrid two-level parallelization of the Chinese restaurant franchise sampling scheme for HDP, focusing especially on the application to topic modeling. The methods we developed, Hybrid-AD-HDP and Hybrid-Diff-AD-HDP, make better use of SMP clusters, resulting in faster HDP inference. While the conventional parallel algorithms with a full message-passing interface do not benefit from using SMP clusters due to higher communication costs, the proposed hybrid parallel algorithms have lower communication costs and make better use of the computational resources.
A new type of affine projection (AP) algorithm that incorporates the sparsity condition of a system is presented. To exploit the sparsity of the system, a weighted l1-norm regularization is imposed on the cost function of the AP algorithm. By minimizing the cost function with subgradient calculus and choosing two distinct weightings for the l1-norm, two stochastic gradient-based sparsity-regularized AP (SR-AP) algorithms are developed. Experimental results show that the SR-AP algorithms outperform their typical AP counterparts in identifying sparse systems.
Junjun GUO Jianjun MU Xiaopeng JIAO Guiping LI
In this letter, we present a new scheme to find small fundamental instantons (SFIs) of regular low-density parity-check (LDPC) codes for the linear programming (LP) decoding over the binary symmetric channel (BSC). Based on the fact that each instanton-induced graph (IIG) contains at least one short cycle, we determine potential instantons by constructing possible IIGs which contain short cycles and additional paths connected to the cycles. Then we identify actual instantons from potential ones under the LP decoding. Simulation results on some typical LDPC codes show that our scheme is effective, and more instantons can be obtained by the proposed scheme when compared with the existing instanton search method.
Lijian ZHOU Wanquan LIU Zhe-Ming LU Tingyuan NIE
In this letter, a new face recognition approach based on curvelets and local ternary patterns (LTP) is proposed. First, we observe that the curvelet transform is a new anisotropic multi-resolution transform that can efficiently represent edge discontinuities in face images, and that the LTP operator is one of the best texture descriptors in terms of characterizing face image details. This motivated us to decompose the image using the curvelet transform and to extract features in different frequency bands. As revealed by the properties of the curvelet transform, the highest frequency band mainly represents noise, so we drop it from feature selection. The lowest frequency band mainly contains coarse image information, and thus we process it more precisely, using LTP to extract features that capture the face's details. The remaining frequency bands mainly represent edge information, and we normalize them to obtain explicit structure information. Then, all the extracted features are put together as the elementary feature set. We reduce the dimension of these features using PCA and then use the sparse sensing technique for face recognition. Experiments on the Yale database, the extended Yale B database, and the CMU PIE database show the effectiveness of the proposed method.
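As a reminder of how the LTP operator works, the following sketch computes the ternary code of the centre pixel of a 3x3 patch and splits it into the usual upper and lower binary patterns. The function name, the neighbour ordering, the threshold, and the example patch are assumptions; the letter applies LTP to the lowest-frequency curvelet band, which is not reproduced here.

```python
def ltp_codes(patch, threshold):
    """Return (upper, lower) 8-bit LTP codes for the centre of a 3x3 patch."""
    center = patch[1][1]
    # Clockwise neighbour order starting at the top-left pixel (an assumed convention).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    upper = lower = 0
    for bit, (r, c) in enumerate(offsets):
        diff = patch[r][c] - center
        if diff >= threshold:      # ternary value +1
            upper |= 1 << bit
        elif diff <= -threshold:   # ternary value -1
            lower |= 1 << bit
        # Otherwise the ternary value is 0 and neither code gets a bit.
    return upper, lower


# Hypothetical grey-level patch with centre value 50.
example_patch = [[54, 57, 60], [45, 50, 44], [58, 52, 43]]
upper_code, lower_code = ltp_codes(example_patch, 5)
```

Neighbours within the threshold of the centre value contribute to neither pattern, which is what makes LTP less sensitive than a plain binary pattern to small intensity fluctuations in near-uniform regions.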
Chongjing SUN Hui GAO Junlin ZHOU Yan FU Li SHE
As distributed data mining has come to be widely used in a variety of fields, privacy preservation for sensitive data has attracted more and more attention in recent years. Our major concern in privacy-preserving distributed data mining is the accuracy of the data mining results while privacy is ensured. For horizontally partitioned data, this paper presents a new hybrid algorithm for privacy-preserving distributed data mining. The main idea of the algorithm is to combine random orthogonal matrix transformation with the proposed secure multi-party matrix product protocol to achieve zero loss of accuracy in most data mining implementations.
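The reason a random orthogonal transformation can hide raw records without hurting accuracy is that it preserves inner products and Euclidean distances, on which many mining algorithms depend. The sketch below illustrates only that property; the helper names and sample records are hypothetical, and the secure multi-party matrix product protocol of the paper is not shown.

```python
import math
import random


def random_orthogonal_matrix(n, rng):
    """Build an n x n orthogonal matrix by Gram-Schmidt on Gaussian random rows."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in rows:  # remove the components along the rows found so far
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip the (unlikely) nearly dependent draw
            rows.append([a / norm for a in v])
    return rows


def transform(q, x):
    """Apply the orthogonal matrix q to the record x."""
    return [sum(a * b for a, b in zip(row, x)) for row in q]


def distance(x, y):
    """Euclidean distance between two records."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


rng = random.Random(42)
q = random_orthogonal_matrix(4, rng)
record_a = [1.0, 2.0, 3.0, 4.0]
record_b = [0.5, -1.0, 2.5, 0.0]
original = distance(record_a, record_b)
perturbed = distance(transform(q, record_a), transform(q, record_b))
```

Up to floating-point rounding, `original` and `perturbed` agree, so a distance-based method such as k-means or nearest-neighbour classification would give the same result on the transformed records.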
Participatory sensing is an emerging system that allows the increasing number of smartphone users to effectively share the minute statistical information they collect. This system relies on participants' active contribution, including intentionally input data. However, a number of privacy concerns will hinder the spread of participatory sensing applications. It is difficult for resource-constrained mobile phones to rely on complicated encryption schemes, so a privacy-preserving participatory sensing scheme with low computational complexity is needed. Moreover, an environment that can reassure participants and encourage their participation in participatory sensing is strongly required, because the quality of the statistical data depends on the active contribution of general users. In this article, we present the MNS-RRT algorithms, which combine negative surveys and randomized response techniques, for preserving privacy in participatory sensing with high levels of data integrity. With our method, participatory sensing applications can handle data having two selections in a dimension. We evaluated how well this scheme preserves privacy while ensuring data integrity.
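The randomized response half of this idea can be illustrated with the classical binary scheme, in which each participant reports the true answer only with a known probability. The sketch below is that textbook scheme, not the MNS-RRT algorithms themselves; the function names, the truth probability of 0.8, and the simulated population are assumptions.

```python
import random


def randomize(true_answer, p_truth, rng):
    """Report the true binary answer with probability p_truth, else its opposite."""
    return true_answer if rng.random() < p_truth else not true_answer


def estimate_proportion(reports, p_truth):
    """Unbiased estimate of the true 'yes' proportion from randomized reports."""
    if p_truth == 0.5:
        raise ValueError("p_truth = 0.5 carries no information")
    observed = sum(reports) / len(reports)
    # P(report yes) = (2 * p_truth - 1) * pi + (1 - p_truth); solve for pi.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


# Hypothetical population in which 30% of participants hold the sensitive value.
rng = random.Random(1)
truths = [i < 6000 for i in range(20000)]
reports = [randomize(t, 0.8, rng) for t in truths]
estimate = estimate_proportion(reports, 0.8)
```

No individual report reveals the participant's true value with certainty, yet the aggregate proportion can still be recovered. The computation on the phone is a single random draw, which suits resource-constrained devices.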
Sumxin JIANG Rendong YING Peilin LIU Zhenqi LU Zenghui ZHANG
This paper describes a new method for lossy audio signal compression via compressive sensing (CS). In this method, a structured shrinkage operator is employed to decompose the audio signal into three layers, with two sparse layers, tonal and transient, and additive noise, and then, both the tonal and transient layers are compressed using CS. Since the shrinkage operator is able to take into account the structure information of the coefficients in the transform domain, it is able to achieve a better sparse approximation of the audio signal than traditional methods do. In addition, we propose a sparsity allocation algorithm, which adjusts the sparsity between the two layers, thus improving the performance of CS. Experimental results demonstrated that the new method provided a better compression performance than conventional methods did.