Yan CHEN Jing ZHANG Yuebing XU Yingjie ZHANG Renyuan ZHANG Yasuhiko NAKASHIMA
An efficient resistive random access memory (ReRAM) structure is developed for accelerating convolutional neural networks (CNNs) through in-memory computation. A novel ReRAM cell circuit is designed with two-directional (2-D) accessibility. The entire memory system is organized as a 2-D array, in which specific memory cells can be accessed identically by either column or row locality. For the in-memory computations of CNNs, only the relevant cells in an identical sub-array are accessed by 2-D read-out operations, which is difficult to implement with conventional ReRAM cells. In this manner, the redundant (column or row) accesses of conventional ReRAM structures are avoided, eliminating unnecessary data movement when CNNs are processed in-memory. According to the simulation results, the energy and bandwidth efficiency of the proposed memory structure are 1.4x and 5x those of a state-of-the-art ReRAM architecture, respectively.
Dongliang CHEN Peng SONG Wenjing ZHANG Weijian ZHANG Bingui XU Xuan ZHOU
In this letter, we propose a novel robust transferable subspace learning (RTSL) method for cross-corpus facial expression recognition. On one hand, we present a novel distance metric algorithm, which jointly considers the local and global distance distribution measures, to reduce the cross-corpus mismatch. On the other hand, we design a label guidance strategy to improve the discriminative ability of the subspace. As a result, RTSL is much more robust to the cross-corpus recognition problem than traditional transfer learning methods. We conduct extensive experiments on several facial expression corpora to evaluate the recognition performance of RTSL. The results demonstrate the superiority of the proposed method over some state-of-the-art methods.
Weiguo ZHANG Jiaqi LU Jing ZHANG Xuewen LI Qi ZHAO
Haze seriously degrades the quality of license plate recognition and reduces the performance of visual processing algorithms. To improve the quality of hazy images, this paper proposes a license plate recognition algorithm for hazy weather. The algorithm consists of two parts. The first part is MPGAN image dehazing, which uses a generative adversarial network to dehaze the image and combines multi-scale convolution with a perceptual loss. Multi-scale convolution is conducive to better feature extraction, and the perceptual loss compensates for the sensitivity of the mean square error (MSE) to outliers. The second part is license plate recognition: YOLOv3 first locates the license plate, an STN network then rectifies it, and the rectified plate is finally fed into the improved LPRNet network to obtain the license plate information. Experimental results show that the proposed dehazing model achieves good results, with better PSNR and SSIM than other representative algorithms. In a comparison with the LPRNet algorithm, the proposed license plate recognition algorithm reaches an average accuracy of 93.9%.
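As a rough illustration of how the PSNR indicator mentioned above relates to the MSE, the following minimal sketch computes both for two images given as flat lists of pixel intensities. The function names and the 8-bit peak value of 255 are assumptions made for illustration; they are not taken from the paper.

```python
import math


def mse(reference, restored):
    """Mean square error between two equally sized flat pixel lists."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE).

    max_value=255 assumes 8-bit images (an assumption of this sketch).
    """
    err = mse(reference, restored)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    clean = [52, 55, 61, 59, 79, 61, 76, 61]
    dehazed = [54, 55, 60, 59, 80, 62, 75, 61]
    print(f"MSE  = {mse(clean, dehazed):.3f}")
    print(f"PSNR = {psnr(clean, dehazed):.2f} dB")
```

Because the squared error grows quadratically, a single outlier pixel can dominate the MSE, which is the weakness the abstract attributes to MSE-based losses.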
Jing ZHANG Degen HUANG Kaiyu HUANG Zhuang LIU Fuji REN
Microblog data contains rich information about real-world events with great commercial value, so microblog-oriented natural language processing (NLP) tasks have attracted considerable attention from researchers. However, the performance of microblog-oriented Chinese Word Segmentation (CWS) based on deep neural networks (DNNs) is still not satisfactory. One critical reason is that the existing microblog-oriented training corpus is inadequate to train effective weight matrices for DNNs. In this paper, we propose a novel active learning method to extend the scale of the training corpus for DNNs. However, because microblogs contain a large number of partially overlapping sentences, it is difficult to select samples with high annotation values from raw microblogs during the active learning procedure. To select samples with higher annotation values, a parameter λ is introduced to control the number of repeatedly selected samples. Meanwhile, various strategies are adopted to measure the overall annotation value of a sample during the active learning procedure. Experiments on the benchmark datasets of NLPCC 2015 show that our λ-active learning method outperforms the baseline system and the state-of-the-art method. The results also demonstrate that the performance of the DNNs trained on the extended corpus is significantly improved.
Keke ZHAO Peng SONG Shaokai LI Wenjing ZHANG Wenming ZHENG
In this letter, we present an adaptive weighted transfer subspace learning (AWTSL) method for cross-database speech emotion recognition (SER), which can efficiently eliminate the discrepancy between source and target databases. Specifically, on one hand, a subspace projection matrix is first learned to project the cross-database features into a common subspace. At the same time, each target sample can be represented by the source samples by using a sparse reconstruction matrix. On the other hand, we design an adaptive weighted matrix learning strategy, which can improve the reconstruction contribution of important features and eliminate the negative influence of redundant features. Finally, we conduct extensive experiments on four benchmark databases, and the experimental results demonstrate the efficacy of the proposed method.
Jing ZHANG Dan LI Hong-an LI Xuewen LI Lizhi ZHANG
To address the low-quality problems of night images, such as low brightness, poor contrast, noise interference and color imbalance, a night image enhancement algorithm based on MDIFE-Net curve estimation is presented. The algorithm consists of three parts. Firstly, we design an illumination estimation curve (IEC), which adjusts the pixel levels of the low-illumination image domain through a non-linear fitting function, maps them to the enhanced image domain, and effectively eliminates the effect of illumination loss. Secondly, the DCE-Net is improved by replacing the original ReLU activation function with the smoother Mish activation function, so that the parameters can be better updated. Finally, an illumination estimation loss function, which combines image attributes with fidelity, is designed to drive the no-reference image enhancement, preserving more image details while enhancing the night image. The experimental results show that our method not only effectively improves the image contrast, but also makes the details of the target more prominent and improves the visual quality of the image. Compared with four existing low-illumination image enhancement algorithms, the proposed method achieves better NIQE and STD evaluation index values, which verifies the feasibility and validity of the algorithm; ablation experiments verify the rationality and necessity of each component design.
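To make the ReLU-to-Mish replacement concrete, here is a minimal scalar sketch of the two activation functions using the standard definition Mish(x) = x · tanh(softplus(x)). It only illustrates the smoothness difference and is not the network code from the paper; the helper names are assumptions.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)). Smooth everywhere."""
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    """ReLU activation: zero for negative inputs, identity otherwise."""
    return max(x, 0.0)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  mish={mish(x):.4f}")
```

Unlike ReLU, Mish passes small negative values through instead of clamping them to zero, so gradients do not vanish abruptly for negative inputs.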
Wenjing ZHANG Peng SONG Wenming ZHENG
In this letter, we propose a novel transferable sparse regression (TSR) method for cross-database facial expression recognition (FER). In TSR, we first present a novel regression function to regress the data into a latent representation space instead of a strict binary label space. To further alleviate the influence of outliers and overfitting, we impose a row sparsity constraint on the regression term. In addition, a pairwise relation term is introduced to guide the feature transfer learning. Secondly, we design a global graph to transfer knowledge, which can well preserve the cross-database manifold structure. Moreover, we introduce a low-rank constraint on the graph regularization term to uncover additional structural information. Finally, several experiments are conducted on three popular facial expression databases, and the results validate that the proposed TSR method is superior to other non-deep and deep transfer learning methods.
Jianxiong HUANG Taiyi ZHANG Runping YUAN Jing ZHANG
In this letter, the performance of opportunistic-based two-way relaying with beamforming over Nakagami-m fading channels is investigated. We provide an approximate expression for the cumulative distribution function of the end-to-end signal-to-noise ratio to derive the closed-form lower bounds for the outage probability and average bit error probability as well as the closed-form upper bound for the ergodic capacity. Simulation results demonstrate the tightness of the derived bounds.
Mathieu STOFFEL Jing ZHANG Oliver G. SCHMIDT
We present room temperature current-voltage characteristics from SiGe interband tunneling diodes epitaxially grown on highly resistive Si(001) substrates. In this case, a maximum peak to valley current ratio (PVCR) of 5.65 was obtained. The possible integration of a SiGe tunnel diode with a strained Si transistor led us to investigate the growth of SiGe interband tunneling diodes on Si0.7Ge0.3 virtual substrates. A careful optimization of the layer structure leads to a maximum PVCR of 1.36 at room temperature. The latter value can be further increased to 2.26 at 3.7 K. Our results demonstrate that high quality SiGe interband tunneling diodes can be realized, which is of great interest for future memory and high speed applications.
Runping YUAN Taiyi ZHANG Jing ZHANG Jianxiong HUANG Zhenjie FENG
In this letter, a dual-hop wireless communication network with opportunistic amplify-and-forward (O-AF) relaying is investigated over independent and non-identically distributed Nakagami-m fading channels. Employing a Maclaurin series expansion around zero to derive the approximate probability density function of the normalized instantaneous signal-to-noise ratio (SNR), the asymptotic symbol error rate (SER) and outage probability expressions are presented. Simulation results indicate that the derived expressions match the results of Monte-Carlo simulations well in the medium and high SNR regions. By comparing the O-AF with all AF relaying analyzed previously, it can be concluded that the former has significantly better performance than the latter in many cases.
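The kind of Monte-Carlo check referred to above can be sketched as follows. This is a simplified illustration under stated assumptions, not the letter's simulation setup: all hops share the same fading parameter m and average SNR (the letter treats non-identical links), the end-to-end SNR uses the common variable-gain AF form γ1γ2/(γ1+γ2+1), and the function names are hypothetical.

```python
import random


def nakagami_snr(m: float, avg_snr: float, rng: random.Random) -> float:
    """Draw an instantaneous SNR for a Nakagami-m link.

    Under Nakagami-m fading the SNR is gamma distributed with
    shape m and mean avg_snr (scale avg_snr / m).
    """
    return rng.gammavariate(m, avg_snr / m)


def af_end_to_end_snr(g1: float, g2: float) -> float:
    """End-to-end SNR of a dual-hop variable-gain AF link (assumed form)."""
    return g1 * g2 / (g1 + g2 + 1.0)


def outage_probability(num_relays, m, avg_snr, threshold, trials=20000, seed=0):
    """Estimate P(best end-to-end SNR < threshold) by simulation.

    Opportunistic relaying: in each trial only the relay with the
    largest end-to-end SNR forwards the signal.
    """
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        best = max(
            af_end_to_end_snr(nakagami_snr(m, avg_snr, rng),
                              nakagami_snr(m, avg_snr, rng))
            for _ in range(num_relays)
        )
        if best < threshold:
            outages += 1
    return outages / trials


if __name__ == "__main__":
    # Linear-scale SNR values; m = 2 is an arbitrary illustrative choice.
    for k in (1, 2, 3):
        p = outage_probability(k, m=2.0, avg_snr=10.0, threshold=1.0)
        print(f"relays={k}  estimated outage={p:.4f}")
```

Adding candidate relays lowers the estimated outage because the best of several independent links is less likely to fall below the threshold.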
Zizheng JI Zhengchao LEI Tingting SHEN Jing ZHANG
Joint representations of knowledge graphs have become an important approach to improving the quality of knowledge graphs, which is beneficial to machine learning, data mining, and artificial intelligence applications. However, previous work suffers severely from noise in the text when modeling text information. To overcome this problem, this paper mines high-quality reference sentences of the entities in the knowledge graph to enhance the representation ability of the entities. A novel framework for joint representation learning of knowledge graphs and text information based on reference sentence noise reduction is proposed, which embeds the entities, the relations, and the words into a unified vector space. The proposed framework consists of a knowledge graph representation learning module, a textual relation representation learning module, and a textual entity representation learning module. Experiments on entity prediction, relation prediction, and triple classification tasks are conducted, and the results show that the proposed framework can significantly improve the performance of mining and fusing the text information. In particular, compared with the state-of-the-art method[15], the proposed framework improves the H@10 metric by 5.08% and 3.93% in the entity prediction task and the relation prediction task, respectively, and improves accuracy by 5.08% in the triple classification task.
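For readers unfamiliar with the H@10 (Hits@10) metric cited above, the sketch below shows how it is usually computed: the fraction of test queries whose correct answer is ranked within the top 10 candidates. The function name and the example ranks are illustrative assumptions, not data from the paper.

```python
def hits_at_k(ranks, k=10):
    """Fraction of queries whose correct answer has rank <= k.

    `ranks` holds the 1-based rank of the correct entity (or relation)
    for each test query.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    example_ranks = [1, 5, 11, 20]  # made-up ranks for four queries
    print(f"Hits@10 = {hits_at_k(example_ranks):.2f}")
    print(f"Hits@1  = {hits_at_k(example_ranks, k=1):.2f}")
```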
Pengtao JIA Qi ZHAO Boze LI Jing ZHANG
Gait recognition distinguishes one individual from others according to the natural patterns of human gaits. It is a challenging signal processing technology for biometric identification due to the ambiguity of contours and the complex feature extraction procedure. In this work, we propose a new model, the convolutional neural network (CNN) joint attention mechanism (CJAM), to classify gait sequences and conduct person identification using the CASIA-A and CASIA-B gait datasets. The CNN model extracts gait features, and the attention mechanism continuously focuses on the most discriminative area to achieve person identification. We present a comprehensive pipeline from gait image preprocessing to final identification. The results from 12 experiments show that the new attention model leads to a lower error rate than others. The CJAM model improves on the 3D-CNN, the CNN-LSTM (long short-term memory), and the simple CNN by 8.44%, 2.94% and 1.45%, respectively.
Zhengwei GONG Taiyi ZHANG Jing ZHANG
The subspace algorithm can be utilized for the blind detection of space-time block codes (STBC) without knowledge of channel state information (CSI) at either the transmitter or the receiver. However, its performance degrades when the channels are correlated. In this letter, we analyze the impact of channel correlation in terms of the orthogonality loss between the transmit signal subspace (TSS) and the statistical noise subspace (SNS). Based on the decoding property of the subspace algorithm, we propose a revised detection that exploits the channel correlation matrix (CCM) when it is known only to the receiver. Then, a joint transmit-receive preprocessing scheme is derived to obtain a further performance improvement when the CCM is available at both the transmitter and the receiver. Analysis and simulation results indicate that the proposed methods can significantly improve the blind detection performance of STBC over correlated channels.
Zihao SONG Peng SONG Chao SHENG Wenming ZHENG Wenjing ZHANG Shaokai LI
Unsupervised feature selection is an important dimensionality reduction technique for coping with high-dimensional data. It does not require prior label information, and has recently attracted much attention. However, it cannot fully utilize the discriminative information of samples, which may affect the feature selection performance. To tackle this problem, in this letter, we propose a novel discriminative virtual label regression (DVLR) method for unsupervised feature selection. In DVLR, we develop a virtual label regression function to guide the subspace learning based feature selection, which can select more discriminative features. Moreover, a linear discriminant analysis (LDA) term is used to make the model more discriminative. To make the model more robust and select more representative features, we impose the ℓ2,1-norm on the regression and feature selection terms. Finally, extensive experiments are carried out on several public datasets, and the results demonstrate that the proposed DVLR achieves better performance than several state-of-the-art unsupervised feature selection methods.
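The ℓ2,1-norm mentioned above is the sum of the ℓ2 norms of a matrix's rows, so penalizing it pushes entire rows toward zero. The sketch below computes it and ranks features by row norm, which is a common way such a penalty is used for feature selection. The example matrix, the function names, and the assumption that each row corresponds to one feature are illustrative and not taken from the letter.

```python
import math


def row_norms(matrix):
    """Euclidean (l2) norm of each row of a matrix given as a list of lists."""
    return [math.sqrt(sum(v * v for v in row)) for row in matrix]


def l21_norm(matrix):
    """l2,1-norm: sum over rows of the row-wise l2 norms."""
    return sum(row_norms(matrix))


def rank_features(matrix):
    """Feature indices sorted by decreasing row norm.

    Assumes row i of the projection matrix belongs to feature i.
    """
    norms = row_norms(matrix)
    return sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical 3-feature projection matrix; the all-zero row marks
    # a feature that the row-sparse penalty has effectively discarded.
    W = [[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]]
    print("l2,1-norm:", l21_norm(W))
    print("feature ranking:", rank_features(W))
```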
Huaning WU Yalong YAN Chao LIU Jing ZHANG
This paper introduces and uses spider monkey optimization (SMO) for the synthesis of sparse linear arrays, which are composed of a uniformly spaced core subarray and an extended sparse subarray. The amplitudes of all the elements and the locations of the elements in the extended sparse subarray are optimized by the SMO algorithm to reduce the side lobe levels of the whole array under a set of practical constraints. To show the efficiency of SMO, different examples are presented and solved. Simulation results of the sparse arrays designed by SMO are compared with published results to verify the effectiveness of the SMO method.
Jianxiong HUANG Taiyi ZHANG Runping YUAN Jing ZHANG
This letter investigates the performance of amplify-and-forward relaying systems using maximum ratio transmission at the source. A closed-form expression for the outage probability and a closed-form lower bound for the average bit error probability of the system are derived. Also, the approximate expressions for the outage probability and average bit error probability in the high signal-to-noise ratio regime are given, based on which the optimal power allocation strategies to minimize the outage probability and average bit error probability are developed. Furthermore, numerical results illustrate that optimizing the allocation of power can improve the system performance, especially in the high signal-to-noise ratio regime.