Taku ODAKA Wannida SAE-TANG Masaaki FUJIYOSHI Hiroyuki KOBAYASHI Masahiro IWAHASHI Hitoshi KIYA
This letter proposes an efficient lossless compression method for high dynamic range (HDR) images in OpenEXR format. The proposed method transforms an HDR image to an indexed image and packs the histogram of the indexed image. Finally, the packed image is losslessly compressed using any existing lossless compression algorithm, such as JPEG 2000. Experimental results show that the proposed method reduces the bit rate of compressed OpenEXR images compared with the lossless compression methods built into the OpenEXR format.
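The histogram packing step can be illustrated with a minimal sketch. It assumes that "packing" means remapping the pixel values that actually occur to consecutive indices, so that unused values no longer leave gaps in the histogram; the abstract does not spell out the exact mapping. The function names and the sample values are hypothetical.

```python
def pack_histogram(pixels):
    """Remap each pixel value to its rank among the distinct values present.

    Returns the packed (indexed) pixels and the palette needed to undo
    the mapping. This is an assumed form of histogram packing, not
    necessarily the exact mapping used in the letter.
    """
    palette = sorted(set(pixels))                      # distinct values, ascending
    index_of = {value: i for i, value in enumerate(palette)}
    packed = [index_of[value] for value in pixels]
    return packed, palette


def unpack_histogram(packed, palette):
    """Invert pack_histogram exactly, so the round trip is lossless."""
    return [palette[i] for i in packed]


# Hypothetical 16-bit integer codes standing in for HDR pixel data.
pixels = [15360, 0, 15360, 31743, 14336, 0]
packed, palette = pack_histogram(pixels)
restored = unpack_histogram(packed, palette)
```

The packed values span only as many levels as there are distinct inputs, which is what makes a subsequent general-purpose lossless coder more effective; the palette must be stored alongside the packed image.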
Kazuma SHIMADA Katsumi KONISHI Kazunori URUMA Tomohiro TAKAHASHI Toshihiro FURUKAWA
This paper deals with the problem of reconstructing a high-resolution digital image from a single low-resolution digital image and proposes a new intra-frame super-resolution algorithm based on the mixed lp/l1 norm minimization. Introducing some assumptions, this paper formulates the super-resolution problem as a mixed l0/l1 norm minimization and relaxes the l0 norm term to the lp norm to avoid ill-posedness. A heuristic iterative algorithm is proposed based on the iterative reweighted least squares (IRLS). Numerical examples show that the proposed algorithm achieves super-resolution efficiently.
Tsubasa TERADA Toshihiko NISHIMURA Yasutaka OGAWA Takeo OHGANE Hiroyoshi YAMADA
Much attention has recently been paid to direction of arrival (DOA) estimation using compressed sensing (CS) techniques, which are sparse signal reconstruction methods. In our previous study, we developed a method for estimating the DOAs of multi-band signals that uses CS processing and that is based on the assumption that incident signals have the same complex amplitudes in all the bands. That method has a higher probability of correct estimation than a single-band DOA estimation method using CS. In this paper, we propose novel DOA estimation methods for multi-band signals with frequency characteristics using the Khatri-Rao product. First, we formulate a method that can estimate DOAs of multi-band signals whose phases alone have frequency dependence. Second, we extend the scheme in such a way that we can estimate DOAs of multi-band signals whose amplitudes and phases both depend on frequency. Finally, we evaluate the performance of the proposed methods through computer simulations and reveal the improvement in estimation performance.
Shuang BAI Jianjun HOU Noboru OHNISHI
The local descriptors Local Binary Pattern (LBP) and Scale Invariant Feature Transform (SIFT) are widely used in various computer applications. They emphasize different aspects of image contents. In this letter, we propose to combine them in sparse coding for categorizing scene images. First, we regularly extract LBP and SIFT features from training images. Then, corresponding to each feature, a visual word codebook is constructed. The obtained LBP and SIFT codebooks are used to create a two-dimensional table, in which each entry corresponds to an LBP visual word and a SIFT visual word. Given an input image, LBP and SIFT features extracted from the same positions of this image are encoded together based on sparse coding. After that, spatial max pooling is adopted to determine the image representation. The obtained image representations are converted into one-dimensional features and classified using SVM classifiers. Finally, we conduct extensive experiments on the Scene Categories 8 and MIT 67 Indoor Scene datasets to evaluate the proposed method. The results demonstrate that combining features in the proposed manner is effective for scene categorization.
By exploiting the inherent sparsity of wireless propagation channels, the theory of compressive sensing (CS) provides us with novel technologies to estimate the channel state information (CSI) that require considerably fewer samples than traditional pilot-aided estimation methods. In this paper, we describe the block-sparse structure of the fast time-varying channel and apply the model-based CS (MCS) for channel estimation in orthogonal frequency division multiplexing (OFDM) systems. By exploiting the structured sparsity, the proposed MCS-based method can further compress the channel information, thereby allowing a more efficient and precise estimation of the CSI compared with conventional CS-based approaches. Furthermore, a specific pilot arrangement is tailored for the proposed estimation scheme. This so-called random grouped pilot pattern can not only effectively protect the measurements from the inter-carrier interference (ICI) caused by Doppler spreading but can also enable the measurement matrix to meet the conditions required for MCS with relatively high probability. Simulation results demonstrate that our method has good performance at high Doppler frequencies.
Jingjie YAN Wenming ZHENG Minghai XIN Jingwei YAN
In this letter, a new sparse locality preserving projection (SLPP) algorithm is developed and applied to facial expression recognition. In comparison with the original locality preserving projection (LPP) algorithm, the presented SLPP algorithm is able to simultaneously find the intrinsic manifold of facial feature vectors and deal with facial feature selection. This is realized by the use of l1-norm regularization in the LPP objective function, which is directly formulated as a least squares regression pattern. We use two real facial expression databases (JAFFE and Ekman's POFA) to test the proposed SLPP method, and experiments show that it achieves 77.60% and 82.29% on the JAFFE and POFA databases, respectively.
Song GAO Chunheng WANG Baihua XIAO Cunzhao SHI Wen ZHOU Zhong ZHANG
In this paper, we propose a representation method based on local spatial strokes for scene character recognition. High-level semantic information, namely the co-occurrence of several strokes, is incorporated by learning a sparse dictionary, which can further suppress the noise introduced by single stroke detectors. The proposed method achieves encouraging results and outperforms state-of-the-art algorithms.
In holographic data storage, information is recorded within the volume of a holographic medium. Typically, the data is presented as an array of pixels with modulation in amplitude and/or phase. In the 4-f orientation, the Fourier domain representation of the data array is produced optically, and this image is recorded. If the Fourier image contains large peaks, the recording material can saturate, which leads to errors in the read-out data array. In this paper, we present a coding process that produces sparse ternary data arrays. Ternary modulation is used because it inherently provides Fourier domain smoothing and allows more data to be stored per array in comparison to binary modulation. Sparse arrays contain fewer on-pixels than dense arrays, and thus contain less power overall, which reduces the severity of peaks in the Fourier domain. The coding process first converts binary data to a sequence of ternary symbols via a high-rate block code, and then uses guided scrambling to produce a set of candidate codewords, from which the most sparse is selected to complete the encoding process. Our analysis of the guided scrambling division and selection processes demonstrates that, with primitive scrambling polynomials, a sparsity greater than 1/3 is guaranteed for all encoded arrays, and that the probability of this worst-case sparsity decreases with increasing block size.
Honggyu JUNG Kwang-Yul KIM Yoan SHIN
We propose a cooperative compressed spectrum sensing scheme for correlated signals in wideband cognitive radio networks. In order to design a reconstruction algorithm that accurately recovers the wideband signals from the compressed samples in low SNR (signal-to-noise ratio) environments, we consider the multiple measurement vector model exploiting a sequence of input signals and propose a cooperative sparse Bayesian learning algorithm which models the temporal correlation of the input signals. Simulation results show that the proposed scheme outperforms existing compressed sensing algorithms for low SNRs.
Duck-Ho BAE Jong-Min LEE Sang-Wook KIM Youngjoon WON Yongsu PARK
The surge in social network services increases the need for in-depth analysis of network activities. Privacy breaches affecting network participants are a concern in such analysis efforts. This paper investigates the structural and property changes caused by several privacy-preserving (anonymization) methods for social networks. The anonymized social network does not follow the power law for node degree distribution as the original network does. The peak-hop for node connectivity increases by at most 1, and the clustering coefficient of neighbor nodes shows a 6.5-fold increase after anonymization. Thus, we observe inconsistency of privacy-preserving methods in social network analysis.
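To make one of the compared metrics concrete, the following is a minimal sketch of how a clustering coefficient can be measured on a graph before and after anonymization. It assumes the standard local clustering coefficient (the fraction of a node's neighbor pairs that are themselves connected); the abstract does not state which definition the paper uses. The function names and the toy graph are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbors that are directly linked.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Nodes with fewer than two neighbors are given a coefficient of 0.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Toy graph: a triangle 0-1-2 with one extra node 3 attached to node 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
coefficient = average_clustering(graph)
```

Running the same measurement on the original and the anonymized graph gives the kind of ratio reported in the abstract.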
Jianqiao WANG Yuehua LI Jianfei CHEN Yuanjiang LI
The label estimation technique provides a new way to design semi-supervised learning algorithms. If the labels of the unlabeled data can be estimated correctly, the semi-supervised methods can be replaced by the corresponding supervised versions. In this paper, we propose a novel semi-supervised learning algorithm, called Geodesic Weighted Sparse Representation (GWSR), to estimate the labels of the unlabeled data. First, the geodesic distance and geodesic weight are calculated. The geodesic weight is utilized to reconstruct the labeled samples. The Euclidean distance between the reconstructed labeled sample and the unlabeled sample equals the geodesic distance between the original labeled sample and the unlabeled sample. Then, the unlabeled samples are sparsely reconstructed and the sparse reconstruction weight is obtained by minimizing the L1-norm. Finally, the sparse reconstruction weight is utilized to estimate the labels of the unlabeled samples. Experiments on synthetic data and the USPS handwritten digit database demonstrate the effectiveness of our method.
Ryo AIHARA Ryoichi TAKASHIMA Tetsuya TAKIGUCHI Yasuo ARIKI
This paper presents a voice conversion (VC) technique for noisy environments based on a sparse representation of speech. Sparse representation-based VC using non-negative matrix factorization (NMF) is employed for noise-added spectral conversion between different speakers. In our previous exemplar-based VC method, source exemplars and target exemplars are extracted from parallel training data, having the same texts uttered by the source and target speakers. The input source signal is represented using the source exemplars and their weights. Then, the converted speech is constructed from the target exemplars and the weights related to the source exemplars. However, this exemplar-based approach needs to hold all training exemplars (frames), and it requires high computation times to obtain the weights of the source exemplars. In this paper, we propose a framework to train the basis matrices of the source and target exemplars so that they have a common weight matrix. By using the basis matrices instead of the exemplars, the VC is performed with lower computation times than with the exemplar-based method. The effectiveness of this method was confirmed by comparing it, in speaker conversion experiments using noise-added speech data, with an exemplar-based method and a conventional Gaussian mixture model (GMM)-based method.
Qianjian XING Feng YU Xiaobo YIN Bei ZHAO
In this letter, we present, in a unified form, a family of radix-R factorizations for the WHT-FFT with an identical stage-to-stage interconnection pattern, where R is any power of 2. Every stage of this family of algorithms has the same sparse matrix factorization and can be implemented in a merged butterfly structure, which leads to regular and efficient memory management that scales to high radices. In each stage, the butterflies that share a twiddle factor set are grouped together, which reduces the number of twiddle factor evaluations or lookup table accesses. These factorizations can also be extended to the FFT, WHT, and SCHT with an identical stage-to-stage interconnection pattern.
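For orientation, here is a minimal sketch of a plain radix-2, in-place fast Walsh-Hadamard transform. It is not the radix-R family described in the letter: the butterfly stride below changes from stage to stage, which is exactly the irregularity that an identical stage-to-stage interconnection pattern is meant to remove. The function name `fwht` is hypothetical, and the transform is left unnormalized.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    Baseline radix-2 sketch: each stage pairs elements that are `h`
    positions apart, and `h` doubles after every stage.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y   # one butterfly, no twiddle factor
        h *= 2
    return a


signal = [1, 0, 1, 0, 0, 1, 1, 0]
spectrum = fwht(signal)
# Applying the unnormalized transform twice scales the input by its length.
round_trip = fwht(spectrum)
```

The WHT butterflies need no twiddle factors; the twiddle factor grouping mentioned in the abstract matters for the FFT-type stages, which this sketch does not cover.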
Mahmoud KESHAVARZI Delaram AMIRI Amir Mansour PEZESHK Forouhar FARZANEH
This letter presents a novel sparsity-based method for deinterleaving pulse trains. The proposed method models the deinterleaving problem as an underdetermined system of linear equations. After determining the mixing matrix, we find the sparsest solution of this system using basis pursuit denoising. This method is superior to previous ones in several respects. First, spurious and missing pulses do not degrade the performance of the algorithm. Second, the algorithm works well regardless of the type of pulse repetition interval modulation used. Third, the proposed method is able to separate similar sources.
A new type of affine projection (AP) algorithm that incorporates the sparsity condition of a system is presented. To exploit the sparsity of the system, a weighted l1-norm regularization is imposed on the cost function of the AP algorithm. By minimizing the cost function with subgradient calculus and choosing two distinct weightings for the l1-norm, two stochastic-gradient-based sparsity-regularized AP (SR-AP) algorithms are developed. Experimental results show that the SR-AP algorithms outperform their typical AP counterparts in identifying sparse systems.
Sumxin JIANG Rendong YING Peilin LIU Zhenqi LU Zenghui ZHANG
This paper describes a new method for lossy audio signal compression via compressive sensing (CS). In this method, a structured shrinkage operator is employed to decompose the audio signal into three layers, with two sparse layers, tonal and transient, and additive noise, and then, both the tonal and transient layers are compressed using CS. Since the shrinkage operator is able to take into account the structure information of the coefficients in the transform domain, it is able to achieve a better sparse approximation of the audio signal than traditional methods do. In addition, we propose a sparsity allocation algorithm, which adjusts the sparsity between the two layers, thus improving the performance of CS. Experimental results demonstrated that the new method provided a better compression performance than conventional methods did.
Lijian ZHOU Wanquan LIU Zhe-Ming LU Tingyuan NIE
In this Letter, a new face recognition approach based on curvelets and local ternary patterns (LTP) is proposed. First, we observe that the curvelet transform is a new anisotropic multi-resolution transform and can efficiently represent edge discontinuities in face images, and that the LTP operator is one of the best texture descriptors in terms of characterizing face image details. This motivated us to decompose the image using the curvelet transform, and extract the features in different frequency bands. As revealed by curvelet transform properties, the highest frequency band information represents the noisy information, so we directly drop it from feature selection. The lowest frequency band mainly contains coarse image information, and thus we deal with it more precisely to extract features as the face's details using LTP. The remaining frequency bands mainly represent edge information, and we normalize them for achieving explicit structure information. Then, all the extracted features are put together as the elementary feature set. With these features, we can reduce the features' dimension using PCA, and then use the sparse sensing technique for face recognition. Experiments on the Yale database, the extended Yale B database, and the CMU PIE database show the effectiveness of the proposed methods.
Jingjie YAN Wenming ZHENG Minhai XIN Jingwei YAN
In this letter, we study the use of face and gesture image sequences for video-based bimodal emotion recognition, applying both the Harris plus cuboids spatio-temporal feature (HST) and the sparse canonical correlation analysis (SCCA) fusion method. To extract the spatio-temporal features effectively, we adopt the Harris 3D feature detector proposed by Laptev and Lindeberg to find interest points in both face and gesture videos, and then apply the cuboids feature descriptor to extract the facial expression and gesture emotion features [1],[2]. To further extract the common emotion features from the facial expression feature set and the gesture feature set, the SCCA method is applied, and the extracted emotion features are used for bimodal emotion classification with the K-nearest neighbor and SVM classifiers. We test this method on the bimodal face and body gesture (FABO) database, and the experimental results demonstrate better recognition accuracy than other methods.
Li ZENG Xiongwei ZHANG Liang CHEN Weiwei YANG
Presented is a new measuring and reconstruction framework for Compressed Sensing (CS), aimed at reducing the number of measurements required to ensure faithful reconstruction. A sparse vector is segmented into sparser vectors, which are then randomly sensed. For recovery, we reconstruct these vectors individually and assemble them to obtain the original signal. We show that the proposed scheme, referred to as SegOMP, yields a higher probability of exact recovery in theory. It requires far fewer measurements than the canonical greedy algorithms to achieve the same reconstruction quality. Extensive experiments verify the validity of SegOMP and demonstrate its potential.
Yuan TAO Yangdong DENG Shuai MU Zhenzhong ZHANG Mingfa ZHU Limin XIAO Li RUAN
The sparse matrix operation y ← y + A^T A x, where A is a sparse matrix and x and y are dense vectors, is a widely used computing pattern in High Performance Computing (HPC) applications. The pattern poses a challenge to efficient solutions because both a matrix and its transpose are involved. An efficient sparse matrix format, Compressed Sparse Blocks (CSB), has been proposed to provide nearly the same performance for both Ax and A^T x. We develop a multithreaded implementation for the CSB format and apply it to compute y ← y + A^T A x. Experiments show that our technique outperforms the Compressed Sparse Row (CSR) based solution in POSKI by up to 2.5-fold on over 70% of the benchmark matrices.
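The operation itself can be sketched in a few lines. The example below is a serial, pure-Python illustration on the CSR format, not the multithreaded CSB implementation described in the abstract. It shows why the transpose never has to be formed explicitly: row r of A is column r of A^T, so each row can be used once to compute (Ax)[r] and once more to scatter the result into y. The function name and the small test matrix are hypothetical.

```python
def csr_ata_update(n_rows, indptr, indices, data, x, y):
    """Update y in place with y <- y + A^T (A x) for a CSR matrix A.

    indptr, indices, data are the usual CSR arrays: the nonzeros of
    row r are data[indptr[r]:indptr[r + 1]], located in the columns
    indices[indptr[r]:indptr[r + 1]].
    """
    for r in range(n_rows):
        lo, hi = indptr[r], indptr[r + 1]
        # t = (A x)[r]: dot product of row r with x.
        t = sum(data[k] * x[indices[k]] for k in range(lo, hi))
        # Row r of A is column r of A^T, so add t * (row r) to y.
        for k in range(lo, hi):
            y[indices[k]] += data[k] * t
    return y


# A = [[1, 0, 2],
#      [0, 3, 0]] stored in CSR form.
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [1, 2, 3]
x = [1, 1, 1]
y = csr_ata_update(2, indptr, indices, data, x, [0, 0, 0])
```

In this serial form the fusion is straightforward; in a parallel setting the scattered writes to y by different rows can collide, which is one reason a format that handles Ax and A^T x equally well is attractive.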