Song GAO Chunheng WANG Baihua XIAO Cunzhao SHI Wen ZHOU Zhong ZHANG
This paper tries to model spatial layout beyond the traditional spatial pyramid (SP) in the coding/pooling scheme for scene text character recognition. Specifically, we propose a novel method to build a dictionary called spatiality embedded dictionary (SED) in which each codeword represents a particular character stroke and is associated with a local response region. The promising results outperform other state-of-the-art algorithms.
Yazhong ZHANG Jinjian WU Guangming SHI Xuemei XIE Yi NIU Chunxiao FAN
Reduced-reference (RR) image quality assessment (IQA) algorithms aim to automatically evaluate the quality of a distorted image using only partial reference data. The goal of an RR IQA metric is to achieve higher quality prediction accuracy using less reference information. In this paper, we introduce a new RR IQA metric that quantifies the difference in discrete cosine transform (DCT) entropy features between the reference and distorted images. Neurophysiological evidence indicates that the human visual system has different sensitivities to different frequency bands. Moreover, distortions in different bands result in distinct quality degradations. Therefore, we propose calculating the information degradation on each band separately for quality assessment. The information degradations are first measured by the entropy difference of reorganized DCT coefficients. Then, the entropy differences on all bands are pooled to obtain the quality score. Experimental results on the LIVE, CSIQ, TID2008, Toyama and IVC databases show that the proposed method is highly consistent with human perception while using limited reference data (8 values).
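The per-band measurement described above can be illustrated with a minimal sketch. It assumes the DCT coefficients have already been computed and grouped into bands. The function names, the uniform quantization step, and the plain-sum pooling are illustrative assumptions, not the metric defined in the paper.

```python
import math
from collections import Counter


def entropy(values, step=1.0):
    """Shannon entropy in bits of values after uniform quantization (assumed step)."""
    if not values:
        return 0.0
    counts = Counter(round(v / step) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def band_entropy_differences(ref_bands, dist_bands, step=1.0):
    """Absolute entropy difference for each pair of corresponding bands."""
    return [abs(entropy(r, step) - entropy(d, step))
            for r, d in zip(ref_bands, dist_bands)]


def pooled_score(diffs):
    """Assumed pooling rule: a plain sum over bands (the paper's rule may differ)."""
    return sum(diffs)


# Two hypothetical bands of already-reorganized DCT coefficients.
ref_bands = [[0, 0, 1, 1], [2, 2, 2, 2]]
dist_bands = [[0, 0, 0, 0], [2, 2, 3, 3]]
diffs = band_entropy_differences(ref_bands, dist_bands)
score = pooled_score(diffs)
```

In this sketch, only the per-band entropies of the reference image would need to reach the receiver, which is what keeps the reference data small.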
Shuang LIU Zhong ZHANG Baihua XIAO Xiaozhong CAO
Texture feature descriptors such as local binary patterns (LBP) have proven effective for ground-based cloud classification. Traditionally, these texture feature descriptors are predefined in a handcrafted way. In this paper, we propose a novel method which automatically learns discriminative features from labeled samples for ground-based cloud classification. Our key idea is to learn these features through mutual information maximization, which learns a transformation matrix for the local difference vectors of LBP. The experimental results show that our learned features greatly improve the performance of ground-based cloud classification compared with other state-of-the-art methods.
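To make the terms concrete, here is a small sketch of a local difference vector and the standard handcrafted LBP code for a 3x3 patch. The `transform` helper shows only how a transformation matrix would be applied to a difference vector. The matrix `W` is a hypothetical placeholder, and the mutual-information learning step from the paper is not reproduced.

```python
# Neighbor order: clockwise, starting from the top-left pixel (an assumed convention).
NEIGHBORS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def local_difference_vector(patch):
    """Differences between the 8 neighbors and the center of a 3x3 patch."""
    center = patch[1][1]
    return [patch[r][c] - center for r, c in NEIGHBORS]


def lbp_code(diffs):
    """Handcrafted LBP: threshold each difference at zero and pack the bits."""
    return sum(1 << i for i, d in enumerate(diffs) if d >= 0)


def transform(matrix, diffs):
    """Apply a transformation matrix (one row per output feature) to the differences."""
    return [sum(w * d for w, d in zip(row, diffs)) for row in matrix]


patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 6, 3]]
diffs = local_difference_vector(patch)
code = lbp_code(diffs)

# Hypothetical 2x8 matrix standing in for a learned transformation.
W = [[1, 0, 0, 0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0, 0, 0]]
features = transform(W, diffs)
```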
Wen ZHOU Chunheng WANG Baihua XIAO Zhong ZHANG Yunxue SHAO
Recognizing human action in complex scenes is a challenging problem in computer vision. Some action-unrelated concepts, such as camera position, can significantly affect the appearance of local spatio-temporal features, and therefore degrade the performance of methods based on low-level features. In this letter, we define one action-unrelated concept, the position of the camera, as a high-level feature. We observe that camera position features can serve as a prior to local spatio-temporal features for human action recognition. We encode this prior by modeling interactions between spatio-temporal features and camera position features. We infer camera position features from local spatio-temporal features via these interactions. The parameters of this model are estimated by a new max-margin algorithm. We evaluate the proposed method on the KTH, IXMAS and Youtube actions datasets. Experimental results show the effectiveness of the proposed method.
Shuang LIU Zhong ZHANG Xiaozhong CAO
Although sparse coding has emerged as an extremely powerful tool for texture and image classification, it neglects the relationship of coding coefficients from the same class in the training stage, which may cause a decline in the classification performance. In this paper, we propose a novel coding strategy named compact sparse coding for ground-based cloud classification. We add a constraint on coding coefficients into the objective function of traditional sparse coding. In this way, coding coefficients from the same class can be forced to their mean vector, making them more compact and discriminative. Experiments demonstrate that our method achieves better performance than the state-of-the-art methods.
Yingzhong ZHANG Xiaoni DU Wengang JIN Xingbin QIAO
Boolean functions with a few Walsh spectral values have important applications in sequence ciphers and coding theory. In this paper, we first construct a class of Boolean functions with at most five-valued Walsh spectra by using the secondary construction of Boolean functions; in particular, plateaued functions are included. Then, we construct three classes of Boolean functions with five-valued Walsh spectra using Kasami functions and investigate the Walsh spectrum distributions of the new functions. Finally, three classes of minimal linear codes with five weights are obtained, which can be used to design secret sharing schemes with good access structures.
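As background for the terminology, the Walsh spectrum of a Boolean function f on n variables is W_f(a) = sum over x of (-1)^(f(x) + a·x). The sketch below computes it from a truth table with the fast Walsh-Hadamard transform. The two example functions are standard textbook cases chosen for illustration, not the constructions from the paper.

```python
from collections import Counter


def walsh_spectrum(truth_table):
    """Walsh spectrum of a Boolean function given as a 0/1 truth table of length 2^n."""
    n = len(truth_table)
    if n == 0 or n & (n - 1):
        raise ValueError("truth table length must be a power of two")
    w = [1 - 2 * bit for bit in truth_table]  # map 0 -> +1, 1 -> -1
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = w[i], w[i + h]
                w[i], w[i + h] = a + b, a - b  # butterfly step
        h *= 2
    return w


def bit(x, i):
    return (x >> i) & 1


# Bent function x1*x2 + x3*x4 on 4 variables: two spectral values.
bent = [(bit(x, 0) & bit(x, 1)) ^ (bit(x, 2) & bit(x, 3)) for x in range(16)]
# x1*x2 viewed as a function on 3 variables: plateaued, three spectral values.
plateaued = [bit(x, 0) & bit(x, 1) for x in range(8)]

bent_values = set(walsh_spectrum(bent))
plateaued_distribution = Counter(walsh_spectrum(plateaued))
```

Counting how often each value occurs, as `plateaued_distribution` does, gives the spectrum distribution in the sense used above.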
Zhong ZHANG Shuang LIU Xing MEI
The bag-of-words model (BOW) has been extensively adopted by recent human action recognition methods. The pooling operation, which aggregates local descriptor encodings into a single representation, is a key determinant of the performance of BOW-based methods. However, the spatio-temporal relationship among interest points has rarely been considered in the pooling step, which results in an imprecise representation of human actions. In this paper, we propose a novel pooling strategy named contextual max pooling (CMP) to overcome this limitation. We add a constraint term into the objective function under the framework of max pooling, which forces the weights of interest points to be consistent with their probabilities. In this way, CMP explicitly considers the spatio-temporal contextual relationships among interest points and inherits the positive properties of max pooling. Our method is verified on three challenging datasets (KTH, UCF Sports and UCF Films datasets), and the results demonstrate that our method achieves better results than the state-of-the-art methods in human action recognition.
Hao CHI Qingji ZENG Huandong ZHAO Jiangtao LUO Zhizhong ZHANG
The conservative mode and greedy mode scheduling algorithms for an OBS switch with a shared buffer are presented and discussed. Their performance, along with that of the greedy mode with a void-filling algorithm, is evaluated by computer simulation. Simulation results show that the conservative mode and the greedy mode have different characteristics under different input loads. The greedy mode and the conservative mode are more applicable in a real system than the void-filling variant, owing to their lower computational complexity and FIFO characteristic. Finally, a composite algorithm that integrates the conservative mode and the greedy mode is proposed; it adapts to the input load with the help of an input load monitor. The simulation results reveal that it performs favorably under different loads.
Lizhong ZHANG Yuan WANG Yandong HE
This work reports a new technique to suppress the undesirable multiple-triggering effect in the typical diode-triggered silicon controlled rectifier (DTSCR), which is frequently used as an ESD protection element in advanced CMOS technologies. The technique inserts additional N-Well areas under the N+ region of the intrinsic SCR, which helps to improve the substrate resistance. As a consequence, the delay of the intrinsic SCR is reduced because the required triggering current is greatly decreased, and the higher trigger voltage associated with multiple triggering is removed. The novel DTSCR structures can adjust the stacked diodes to achieve a precise trigger voltage that meets different ESD protection requirements. All explored DTSCR structures are fabricated in a 65-nm CMOS process. Transmission-line-pulsing (TLP) and Very-Fast-Transmission-line-pulsing (VF-TLP) test systems are used to confirm the validity of this technique, and the test results agree well with our analysis.
Song GAO Chunheng WANG Baihua XIAO Cunzhao SHI Wen ZHOU Zhong ZHANG
In this paper, we propose a representation method based on local spatial strokes for scene character recognition. High-level semantic information, namely the co-occurrence of several strokes, is incorporated by learning a sparse dictionary, which further suppresses the noise introduced by single stroke detectors. The encouraging results outperform state-of-the-art algorithms.
Zhong ZHANG Hong WANG Shuang LIU Tariq S. DURRANI
A rich and robust representation for scene characters plays a significant role in automatically understanding the text in images. In this letter, we focus on the issue of feature representation, and propose a novel encoding method named bilateral convolutional activations encoded with Fisher vectors (BCA-FV) for scene character recognition. Concretely, we first extract convolutional activation descriptors from convolutional maps and then build a bilateral convolutional activation map (BCAM) to capture the relationship between the convolutional activation response and the spatial structure information. Finally, in order to obtain the global feature representation, the BCAM is injected into FV to encode convolutional activation descriptors. Hence, the BCA-FV can effectively integrate the prominent features and spatial structure information for character representation. We verify our method on two widely used databases (ICDAR2003 and Chars74K), and the experimental results demonstrate that our method achieves better results than the state-of-the-art methods. In addition, we further validate the proposed BCA-FV on the “Pan+ChiPhoto” database for Chinese scene character recognition, and the experimental results show the good generalization ability of the proposed BCA-FV.
Zhong ZHANG Shuang LIU Zhiwei ZHANG
Sparsity-based methods have recently been applied to abnormal event detection and have achieved impressive results. However, most such methods suffer from the curse of dimensionality; furthermore, they do not consider the relationship among coefficient vectors. In this paper, we propose a novel method called consistent sparse representation (CSR) to overcome these drawbacks. We first reconstruct each feature in the space spanned by the clustering centers of training features so as to reduce the dimensionality of features and preserve the neighboring structure. Then, a consistent regularization is added to the sparse representation model, which explicitly considers the relationship of coefficient vectors. Our method is verified on two challenging databases (the UCSD Ped1 database and the Subway database), and the experimental results demonstrate that our method obtains better results than previous methods in abnormal event detection.
Jiawei DU Xiaoni DU Wengang JIN Yingzhong ZHANG
Linear codes with few weights have important applications in combinatorial design, strongly regular graphs and cryptography. In this paper, we first construct a class of Boolean functions with at most five-valued Walsh spectra, and determine their spectrum distribution. Then, we derive two classes of linear codes with at most six weights from the new functions. Meanwhile, the length, dimension and weight distributions of the codes are obtained. Results show that both of the new codes are minimal; one is a wide minimal code and the other is a narrow minimal code, and thus they can be used to design secret sharing schemes with good access structures. Finally, some Magma programs are used to verify the correctness of our results.
Guizhong ZHANG Baoxian WANG Zhaobo YAN Yiqiang LI Huaizhi YANG
In this work, we present a novel rust detection method based upon one-class classification and L2 sparse representation (SR) with decision fusion. Firstly, a new color contrast descriptor is proposed for extracting the rust features of steel structure images. Considering that the patterns of rust features are simpler than those of non-rust ones, a one-class support vector machine (SVM) classifier and an L2 SR classifier are each designed with these rust image features. After that, a multiplicative fusion rule is adopted for combining the one-class SVM and L2 SR modules, thereby achieving more accurate rust detection results. We conduct numerous experiments, and compared with other rust detectors, the presented method offers better rust detection performance.
Zhaoxi FANG Feng LIANG Shaozhong ZHANG Xiaolin ZHOU
Timing asynchronism strongly degrades the performance of analog network coded (ANC) bi-directional transmission. This letter investigates receiver design for asynchronous broadband bi-directional transmission over frequency selective fading channels. Based on time domain oversampling, we propose fractionally spaced frequency domain minimum mean square error (MMSE) equalizers for bi-directional ANC based on orthogonal frequency division multiplexing (OFDM) and cyclic prefixed single carrier (CP-SC) radio access. Simulation results show that the proposed fractionally spaced equalizer (FSE) can eliminate the negative effect of timing misalignment in bi-directional transmissions.
Zhong ZHANG Hong WANG Shuang LIU Liang ZHENG
Feature representation, as a key component of scene character recognition, has been widely studied and a number of effective methods have been proposed. In this letter, we propose a novel method named coupled spatial learning (CSL) for scene character representation. Different from the existing methods, the proposed CSL method simultaneously discovers the spatial context in both the dictionary learning and coding stages. Concretely, we propose to build the spatial dictionary by preserving the corresponding positions of the codewords. Correspondingly, we introduce a spatial coding strategy which utilizes spatiality regularization to consider the relationship among features in the Euclidean space. Based on the spatial dictionary and spatial coding, the spatial context can be effectively integrated into the visual representations. We verify our method on two widely used databases (ICDAR2003 and Chars74K), and the experimental results demonstrate that our method achieves competitive results compared with the state-of-the-art methods. In addition, we further validate the proposed CSL method on the Caltech-101 database for the image classification task, and the experimental results show the good generalization ability of the proposed CSL.
Yuan TAO Yangdong DENG Shuai MU Zhenzhong ZHANG Mingfa ZHU Limin XIAO Li RUAN
The sparse matrix operation y ← y + A^T A x, where A is a sparse matrix and x and y are dense vectors, is a widely used computing pattern in High Performance Computing (HPC) applications. The pattern poses a challenge to efficient solutions because both a matrix and its transpose are involved. An efficient sparse matrix format, Compressed Sparse Blocks (CSB), has been proposed to provide nearly the same performance for both Ax and A^T x. We develop a multithreaded implementation for the CSB format and apply it to compute y ← y + A^T A x. Experiments show that our technique outperforms the Compressed Sparse Row (CSR) based solution in POSKI by up to 2.5-fold on over 70% of the benchmarking matrices.
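For orientation, the update can be sketched serially over a plain CSR matrix without ever forming the transpose: for each row i, compute t = (Ax)_i, then add t times that row into y. This is a baseline illustration only. It uses CSR rather than CSB, it is single-threaded, and the function and argument names are assumptions.

```python
def ata_x_update(n_rows, indptr, indices, data, x, y):
    """In-place y <- y + A^T (A x) for a CSR matrix A, one pass over the rows."""
    for i in range(n_rows):
        lo, hi = indptr[i], indptr[i + 1]
        # t is the i-th entry of A x.
        t = sum(data[k] * x[indices[k]] for k in range(lo, hi))
        # Row i of A is column i of A^T, so scatter t * row_i into y.
        for k in range(lo, hi):
            y[indices[k]] += data[k] * t
    return y


# A = [[1, 0, 2],
#      [0, 3, 0]] in CSR form.
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [1, 2, 3]
x = [1, 1, 1]
y = ata_x_update(2, indptr, indices, data, x, [1, 1, 1])
```

The scatter into y is the step that makes a row-parallel version awkward, since different rows can write to the same entry of y.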
Wei FENG Suili FENG Yuehua DING Yongzhong ZHANG
The rapid variation of wireless channels and feedback delay make the available channel state information (CSI) outdated in dynamic wireless multi-hop networks, which significantly degrades the accuracy of cross-layer resource allocation. To deal with this problem, a cross-layer resource allocation scheme is proposed for wireless multi-hop networks by taking the outdated CSI into account and basing compensation on the results of channel prediction. The cross-layer resource allocation is formulated as a network utility maximization problem, which jointly considers congestion control, channel allocation, power control, scheduling and routing with the compensated CSI. Based on a dual decomposition approach, the problem is solved in a distributed manner. Simulation results show that the proposed algorithm can reasonably allocate the resources, and significantly improve the throughput and energy efficiency in the network.
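The dual decomposition idea can be illustrated on a deliberately small network utility maximization problem: several flows with log utilities sharing one link of fixed capacity. Each source reacts only to a link price, and the link adjusts the price from its own excess load. This toy omits channel prediction, CSI compensation, power control, scheduling and routing, and all names and parameter values are assumptions.

```python
def solve_num(capacity, n_flows, step=0.1, iters=2000):
    """Maximize sum(log x_i) subject to sum(x_i) <= capacity by dual subgradient ascent."""
    price = 1.0  # dual variable (link price), arbitrary positive start
    rates = []
    for _ in range(iters):
        # Each source independently maximizes log(x) - price * x, giving x = 1 / price.
        rates = [1.0 / price] * n_flows
        # The link raises the price when overloaded and lowers it otherwise.
        excess = sum(rates) - capacity
        price = max(1e-9, price + step * excess)
    return rates, price


rates, price = solve_num(capacity=1.0, n_flows=2)
```

For this instance the rates settle near an equal split of the capacity, which is the known optimum for identical log utilities.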