Jie SUN Lijian ZHOU Zhe-Ming LU Tingyuan NIE
In this Letter, a new iris recognition approach based on local Gabor orientation features is proposed. On one hand, iris feature extraction with traditional Gabor filters is time-consuming and produces high-dimensional features. On the other hand, observation of iris images shows that the original iris texture changes more markedly in the angular and radial directions than in other directions. In the preprocessed iris images, these changes appear mainly in the vertical and horizontal directions. Therefore, local directional Gabor filters are constructed to extract the horizontal and vertical texture characteristics of the iris. First, the iris images are preprocessed by iris and eyelash location, iris segmentation, normalization and zooming. After analyzing the variation of iris texture and 2D Gabor filters, we construct the local directional Gabor filters to extract the local Gabor features of the iris. Then, the Gabor & Fisher features are obtained by Linear Discriminant Analysis (LDA). Finally, the nearest neighbor method is used to recognize the iris. Experimental results show that the proposed method achieves better iris recognition performance with a lower feature dimension and less computation time.
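The idea of restricting Gabor filtering to two orientations can be sketched as follows. This is only an illustrative sketch, not the authors' filter design: the kernel parameters (`wavelength`, `sigma`, `gamma`, `half_size`), the block-averaging step, and all function names are assumptions, and the LDA and nearest-neighbor stages are omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gabor_kernel(theta, wavelength=8.0, sigma=4.0, gamma=1.0, half_size=7):
    """Real part of a 2D Gabor kernel whose carrier varies along angle theta.

    All parameter values are illustrative assumptions.
    """
    ys, xs = np.mgrid[-half_size:half_size + 1, -half_size:half_size + 1]
    x_rot = xs * np.cos(theta) + ys * np.sin(theta)
    y_rot = -xs * np.sin(theta) + ys * np.cos(theta)
    envelope = np.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    # Remove the mean so that flat image regions give (almost) no response.
    return kernel - kernel.mean()


def filter_valid(image, kernel):
    """Correlate image with kernel, keeping only fully overlapping positions."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def directional_features(image, block=8):
    """Block-averaged response magnitudes for two orthogonal orientations."""
    feats = []
    for theta in (0.0, np.pi / 2.0):
        resp = np.abs(filter_valid(image, gabor_kernel(theta)))
        h, w = resp.shape
        h, w = h - h % block, w - w % block
        blocks = resp[:h, :w].reshape(h // block, block, w // block, block)
        feats.append(blocks.mean(axis=(1, 3)).ravel())
    return np.concatenate(feats)


# Usage: a synthetic texture that varies only along the x axis.
columns = np.arange(40)
stripes = np.tile(np.cos(2.0 * np.pi * columns / 8.0), (40, 1))
features = directional_features(stripes)
```

Using only two orientations keeps the feature vector short, which is the motivation stated in the abstract; the resulting vector would then be passed to LDA and a nearest-neighbor classifier.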
Takashi IMAGAWA Masayuki HIROMOTO Hiroyuki OCHI Takashi SATO
Time redundancy is sometimes the only option for enhancing circuit reliability when the circuit area is severely restricted. In this paper, a time-redundant error-correction scheme that is particularly suitable for coarse-grained reconfigurable arrays (CGRAs) is proposed. It judges the correctness of an execution by comparing the results of two identical runs. Once a mismatch is found, the second run is terminated immediately and a third run is started so that the correct result can be selected from the three runs; this relies on the assumption that errors tend to persist in many applications. The circuit area and reliability of the proposed method are compared with those of a straightforward implementation of time redundancy and of selective triple modular redundancy (TMR). A case study on a CGRA revealed that the area of the proposed method is 1% larger than that of the selective TMR implementation. The study also shows that the proposed scheme is up to 2.6x more reliable than full TMR when persistent errors are predominant.
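The control flow of such a compare-then-rerun scheme can be sketched in software. This is a hedged illustration only: the abstract describes a hardware scheme, and the step-wise trace comparison, the rule for choosing among the three runs, and the names `time_redundant_run` and `make_run_factory` are assumptions made for this example.

```python
from typing import Callable, Iterator, List, Set


def time_redundant_run(make_run: Callable[[], Iterator[int]]) -> List[int]:
    """Run a task twice, compare step by step, and rerun on a mismatch.

    make_run() must return a fresh iterator over the per-step outputs of
    one execution of the task.
    """
    first = list(make_run())
    second: List[int] = []
    mismatch = False
    for step, value in enumerate(make_run()):
        second.append(value)
        if step >= len(first) or value != first[step]:
            mismatch = True
            break  # abandon the second run as soon as it disagrees
    if not mismatch and len(second) == len(first):
        return first

    # Third run: accept whichever earlier run it agrees with (assumed rule).
    third = list(make_run())
    if third == first:
        return first
    if third[:len(second)] == second:
        return third
    raise RuntimeError("no two runs agree")


def make_run_factory(faulty_runs: Set[int]) -> Callable[[], Iterator[int]]:
    """Hypothetical task (squares of 0..4) with a fault in the listed runs."""
    counter = {"n": 0}

    def make_run() -> Iterator[int]:
        counter["n"] += 1
        run_id = counter["n"]
        for i in range(5):
            yield i * i + (1 if run_id in faulty_runs and i == 2 else 0)

    return make_run


clean = time_redundant_run(make_run_factory(set()))
first_bad = time_redundant_run(make_run_factory({1}))
second_bad = time_redundant_run(make_run_factory({2}))
```

In the error-free case only two runs are executed, so the time overhead of the third run is paid only when a mismatch is actually observed.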
Shuang LIU Zhong ZHANG Baihua XIAO Xiaozhong CAO
Texture feature descriptors such as local binary patterns (LBP) have proven effective for ground-based cloud classification. Traditionally, these texture feature descriptors are predefined in a handcrafted way. In this paper, we propose a novel method that automatically learns discriminative features from labeled samples for ground-based cloud classification. Our key idea is to learn these features through mutual information maximization, which yields a transformation matrix for the local difference vectors of LBP. The experimental results show that the learned features greatly improve the performance of ground-based cloud classification compared with other state-of-the-art methods.
Takashi HIROSE Fusao NUNO Masashi NAKATSUGAWA
This paper presents wireless systems for use in disaster recovery operations. The Great East Japan Earthquake of March 11, 2011 reinforced the importance of communications in, to, and between disaster areas as lifelines. It also revealed that conventional wireless systems used for disaster recovery need to be renovated to cope with technological changes and to provide their services with easier operations. To address this need we have developed new systems, which include a relay wireless system, subscriber wireless systems, business radio systems, and satellite communication systems. They will be chosen and used depending on the situations in disaster areas as well as on the required services.
In this study, a Si(100) surface flattening process using the sacrificial oxidation method was investigated to improve Metal-Insulator-Semiconductor (MIS) diode characteristics. Etching away the 100 nm-thick sacrificial oxide formed by thermal oxidation at 1100°C reduced the surface roughness of the Si substrate. The Root-Mean-Square (RMS) roughness decreased from 0.15 nm (as-cleaned) to 0.07 nm when the sacrificial oxide was formed by wet oxidation, and to 0.10 nm when it was formed by dry oxidation. Furthermore, the time-dependent dielectric breakdown (TDDB) characteristics of Al/SiO2(10 nm)/p-Si(100) MIS diode structures were found to improve as the RMS roughness of the Si surface was reduced.
An LIU Maoyin CHEN Donghua ZHOU
Robust crater recognition is a research focus in deep space exploration missions, and sparse representation methods can achieve desirable robustness and accuracy. To cope with the corruption and noise caused by complex topography and varied illumination in planetary images, a robust crater recognition approach is proposed based on dictionary learning with a low-rank error correction model in a sparse representation framework. In this approach, a compact and discriminative dictionary is learned from all the training images. A low-rank error correction term is introduced into the dictionary learning to deal with gross errors and corruption. Experimental results on crater images show that the proposed method achieves competitive performance in both recognition accuracy and efficiency.
For robust visual tracking, the main challenge of a subspace representation model is the difficulty of handling the various appearances of the target object. Traditional subspace learning tracking algorithms neglect the discriminative correlation between different multi-view target samples and the effectiveness of sparse subspace learning. To learn a better subspace representation model, we design a discriminative graph that models both the labeled target samples with various appearances and the updated foreground and background samples, which are selected using an incremental updating scheme. The proposed discriminative graph structure not only explicitly captures multi-modal intraclass correlations within labeled samples but also balances the within-class local manifold against the global discriminative information from foreground and background samples. Based on the discriminative graph, we achieve a sparse embedding by using the L2,1-norm, which is incorporated to select relevant features and learn the transformation in a unified framework. In the tracking procedure, the subspace learning is embedded into a Bayesian inference framework using compound motion estimation and a discriminative observation model, which makes localization considerably more effective and accurate. Experiments on several videos demonstrate that the proposed algorithm is robust to various appearance changes, especially in dynamically changing and cluttered scenes, and performs better than alternatives reported in the recent literature.
Yuko OZASA Mikio NAKANO Yasuo ARIKI Naoto IWAHASHI
This paper addresses the problem of a robot identifying which object a human asks it, by voice, to bring from a set of objects that both the human and the robot can see. When the robot knows the requested object, it must identify it; when it does not know the object, it must say so. This paper presents a new method for discriminating unknown objects from known objects using object images and human speech. It uses a confidence measure that integrates image recognition confidences and speech recognition confidences based on logistic regression.
Meixu SONG Jielin PAN Qingwei ZHAO Yonghong YAN
Introducing pronunciation models into decoding has proven beneficial to LVCSR. In this paper, a discriminative pronunciation modeling method is presented within the framework of Minimum Phone Error (MPE) training for HMM/GMM. To bring the pronunciation models into MPE training, the auxiliary function is rewritten at the word level and decomposed into two parts: one for co-training the acoustic models, and the other for discriminatively training the pronunciation models. On a Mandarin conversational telephone speech recognition task, compared with a baseline using a canonical lexicon, the discriminative pronunciation models reduced the Character Error Rate (CER) by 0.7% absolute on the LDC test set, and acoustic model co-training yielded an additional 0.8% CER reduction.
Speeded up robust features (SURF) can detect/describe scale- and rotation-invariant features at high speed by relying on integral images for image convolutions. However, the time taken for matching SURF descriptors is still long, and this has been an obstacle for use in real-time applications. In addition, the matching time further increases in proportion to the number of features and the dimensionality of the descriptor. Therefore, we propose a fast matching method that rearranges the elements of SURF descriptors based on their entropies, divides SURF descriptors into sub-descriptors, and sequentially and analytically matches them to each other. Our results show that the matching time could be reduced by about 75% at the expense of a small drop in accuracy.
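The rearrange, split, and match-sequentially idea can be illustrated with a partial-distance search. This is a sketch under stated assumptions, not the method of the abstract: the histogram-based entropy estimate, the number of sub-descriptors, and the names `entropy_order` and `match_descriptor` are all introduced here for illustration. Unlike the method described above, which trades a small drop in accuracy for speed, this particular sketch still returns the exact nearest neighbor.

```python
import numpy as np


def entropy_order(descriptors, bins=16):
    """Return dimension indices sorted from highest to lowest entropy.

    Entropy is estimated per dimension from a histogram over the database
    descriptors (an assumed estimator).
    """
    entropies = []
    for column in descriptors.T:
        hist, _ = np.histogram(column, bins=bins)
        p = hist[hist > 0] / hist.sum()
        entropies.append(float(-(p * np.log2(p)).sum()))
    return np.argsort(entropies)[::-1]


def match_descriptor(query, database, order, n_parts=4):
    """Nearest neighbor by accumulating distance over sub-descriptors.

    A candidate is rejected as soon as its partial squared distance reaches
    the best full distance found so far.
    """
    q = query[order]
    db = database[:, order]
    parts = np.array_split(np.arange(q.size), n_parts)
    best_index, best_dist = -1, np.inf
    for index, candidate in enumerate(db):
        partial = 0.0
        for part in parts:
            diff = q[part] - candidate[part]
            partial += float(diff @ diff)
            if partial >= best_dist:
                break  # cannot beat the current best; skip remaining parts
        else:
            best_index, best_dist = index, partial
    return best_index, best_dist


# Usage with random 64-dimensional stand-ins for SURF descriptors.
rng = np.random.default_rng(0)
database = rng.normal(size=(200, 64))
queries = rng.normal(size=(5, 64))
order = entropy_order(database)
matches = [match_descriptor(q, database, order)[0] for q in queries]
```

Placing high-entropy dimensions first is intended to make the partial distance grow quickly for poor candidates, so they are rejected after only one or two sub-descriptors.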
Hirofumi SHIMIZU Hiromitsu AWANO Masayuki HIROMOTO Takashi SATO
The modeling of random telegraph noise (RTN) of MOS transistors is becoming increasingly important. In this paper, a novel method is proposed for automated estimation of two important RTN-model parameters: the number of interface states and the corresponding threshold voltage shifts. The proposed method utilizes a Gaussian mixture model (GMM) to represent the voltage distributions, and estimates their parameters using the expectation-maximization (EM) algorithm. Using information criteria, the optimal estimation is automatically obtained while avoiding overfitting. In addition, we use a shared variance for all the Gaussian components in the GMM to deal with the noise in RTN signals. The proposed method improves estimation accuracy when large measurement noise is present.
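A shared-variance GMM fitted by EM and selected by an information criterion can be sketched as below. This is a minimal one-dimensional illustration, not the authors' estimator: the choice of BIC as the criterion, the quantile-based initialization, the fixed iteration count, and the function names are assumptions.

```python
import numpy as np


def _log_joint(x, weights, means, var):
    """log(w_k * N(x_i | mu_k, var)) for every sample i and component k."""
    return (np.log(weights)
            - 0.5 * np.log(2.0 * np.pi * var)
            - 0.5 * (x[:, None] - means) ** 2 / var)


def fit_shared_variance_gmm(x, k, n_iter=200):
    """EM for a 1D GMM in which all components share a single variance."""
    means = np.quantile(x, (np.arange(k) + 0.5) / k)  # deterministic init
    weights = np.full(k, 1.0 / k)
    var = np.var(x) / k + 1e-12
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each sample.
        log_p = _log_joint(x, weights, means, var)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: the variance is pooled over all components.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means) ** 2).sum() / x.size + 1e-12
    log_lik = np.logaddexp.reduce(_log_joint(x, weights, means, var), axis=1).sum()
    n_params = (k - 1) + k + 1  # weights, means, one shared variance
    bic = -2.0 * log_lik + n_params * np.log(x.size)
    return {"k": k, "weights": weights, "means": means, "var": var, "bic": bic}


def select_by_bic(x, candidate_ks):
    """Fit one model per candidate component count and keep the lowest BIC."""
    return min((fit_shared_variance_gmm(x, k) for k in candidate_ks),
               key=lambda model: model["bic"])


# Usage on synthetic data: three discrete levels plus Gaussian noise.
rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(level, 0.1, 300) for level in (0.0, 1.0, 2.0)])
best = select_by_bic(samples, range(1, 5))
```

Sharing the variance reflects the assumption that the same measurement noise is superimposed on every discrete level, and it reduces the number of free parameters that the information criterion has to penalize.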
A discriminative reference-based method for scene image categorization is presented in this letter. The reference-based image classification approach combined with K-SVD has proven to be a simple, efficient, and effective method for scene image categorization. It learns a subspace by means of a randomly selected reference-set and uses it to represent images. A good reference-set should be both representative and discriminative. More specifically, the reference-set subspace should span the data space well while maintaining low redundancy. To select reference images automatically, we adapt the affinity propagation algorithm based on data similarity to gather a reference-set that is both representative and discriminative. We apply the discriminative reference-based method to the task of scene categorization on several benchmark datasets. Extensive experimental results demonstrate that the proposed scene categorization method with the selected reference-set achieves better performance and higher efficiency than state-of-the-art methods.
Xue CHEN Chunheng WANG Baihua XIAO Yunxue SHAO
In Still-to-Video (S2V) face recognition, only a few high-resolution images are registered for each subject, while the probes are video clips with complex variations. As faces present distinct characteristics under different scenarios, recognition in the original space is inefficient. Thus, in this paper, we propose a novel discriminant analysis method that learns separate mappings for the different scenario patterns (still, video) and further pursues a common discriminant space based on these mappings. Concretely, by modeling each video as a manifold and each image as point data, we formulate the scenario-oriented mapping learning as a Point-Manifold Discriminant Analysis (PMDA) framework. The learning objective incorporates intra-class compactness and inter-class separability for good discrimination. Experiments on the COX-S2V dataset demonstrate the effectiveness of the proposed method.
Runtime analysis aims to enhance the safety of critical systems by monitoring changes in their external environments. In this paper, a modified FTA approach that makes full use of existing safety analysis results is put forward to achieve runtime safety analysis. The procedures of the approach are given in detail. This approach could be widely used in the safety engineering of critical systems.
Ruicong ZHI Lei ZHAO Bolin SHI Yi JIN
A novel Two-dimensional Fuzzy Discriminant Locality Preserving Projections (2D-FDLPP) algorithm is proposed for learning an effective subspace of two-dimensional images. The 2D-FDLPP algorithm is derived from Two-dimensional Locality Preserving Projections (2D-LPP) by exploiting both fuzzy and discriminant properties. The 2D-FDLPP algorithm preserves the degree to which each sample belongs to the given classes by using a fuzzy k-nearest neighbor classifier. It also introduces a between-class scatter constraint and label information into the 2D-LPP algorithm. The 2D-FDLPP algorithm finds the subspace that best discriminates different pattern classes and weakens environmental factors through soft assignment. Therefore, the 2D-FDLPP algorithm has more discriminant power than 2D-LPP and is more suitable for recognition tasks. Experiments are conducted on the MNIST database for handwritten image classification, the JAFFE and Cohn-Kanade databases for facial expression recognition, and the ORL database for face recognition. Experimental results demonstrate the effectiveness of the proposed algorithm.
Thao-Ngoc NGUYEN Bac LE Kazunori MIYATA
This paper introduces a novel approach to feature description that integrates the intensity order and textures in different support regions into a compact vector. We first propose the Intensity Order Local Binary Pattern (IO-LBP) operator, which simultaneously encodes the gradient and texture information in the local neighborhood of a pixel. We divide each region of interest into segments according to the order of pixel intensities, build one histogram of IO-LBP patterns for each segment, and then concatenate all histograms to obtain a feature descriptor. Furthermore, multiple support regions are adopted to enhance distinctiveness. The proposed descriptor effectively describes a region at both local and global levels, and thus high performance is expected. Experimental results on the Oxford benchmark and on images of cast shadows show that our approach is invariant to common photometric and geometric transformations, such as illumination change and image rotation, and robust to complex lighting effects caused by shadows. It achieves accuracy comparable to that of state-of-the-art methods while running considerably faster.
Yaohui QI Fuping PAN Fengpei GE Qingwei ZHAO Yonghong YAN
A smoothing method for minimum phone error linear regression (MPELR) is proposed in this paper. We show that the objective function for minimum phone error (MPE) can be combined with a prior mean distribution. When the prior mean distribution is based on maximum likelihood (ML) estimates, the proposed method is the same as the previous smoothing technique for MPELR. Instead of ML estimates, a maximum a posteriori (MAP) parameter estimate is used to define the mode of the prior mean distribution in order to improve the performance of MPELR. Experiments on a large vocabulary speech recognition task show that the proposed method obtains an 8.4% relative reduction in word error rate when the amount of data is limited, while retaining the same asymptotic performance as conventional MPELR. Compared with discriminative maximum a posteriori linear regression (DMAPLR), the proposed method shows an improvement except in the case of limited adaptation data for supervised adaptation.
Hao HAN Yinxing XUE Keizo OYAMA Yang LIU
The rendering mechanism plays an indispensable role in browser-based Web applications. It generates active webpages dynamically and provides a human-readable layout through template engines, which are used as a standard programming model to separate the business logic and data computations from the webpage presentation. The client-side rendering mechanism, owing to advances in rich application technologies, has been widely adopted. The adoption of client-side rendering brings not only various merits but also new problems. In this paper, we propose and construct “pagelet”, a segment-based template engine for developing flexible and extensible Web applications. By presenting the principles, practice and usage experience of pagelet, we conduct a comprehensive analysis of the possible advantages and disadvantages brought by the client-side rendering mechanism from the viewpoints of both developers and end-users.
Hyunwook YANG Yeongyu HAN Seungwon CHOI
In a multi-user multiple-input multiple-output (MU-MIMO) system that adopts zero-forcing (ZF) as a precoder, the best selection is the combination of users that provides the smallest trace of the inverse of the channel auto-correlation matrix. Noting that the trace of the matrix is closely related to the determinant, we search for users that yield the largest determinant of their channel auto-correlation matrix. The proposed technique utilizes the determinant row-exchange criterion (DREC) for computing the determinant-changing ratio, which is generated whenever a user is replaced by one of a group of pre-selected users. Based on the ratio computed by the DREC, the combination of users providing the largest changing ratio is selected. In order to identify the optimal combination, the DREC procedure is repeated until user replacement provides no increase in the determinant. Through computer simulations with four transmit antennas, we show that the bit error rate (BER) versus signal-to-noise ratio (SNR) as well as the sum-rate performance provided by the proposed method is comparable to that of the full search method. Furthermore, using the proposed method, a partial replacement of users can be performed easily with a new user who provides the largest determinant.
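The replace-until-no-improvement search can be sketched as a simple local search over user sets. This is an illustration under assumptions: determinants are recomputed directly with `numpy.linalg.det` rather than through the DREC ratio formula, which the abstract does not spell out, and the initial user set, the channel model, and the function names are hypothetical.

```python
import numpy as np


def gram_det(h, users):
    """Determinant of H_S H_S^H for the rows of h indexed by users."""
    hs = h[list(users)]
    return float(np.linalg.det(hs @ hs.conj().T).real)


def select_users(h, n_select):
    """Swap one user at a time while the determinant keeps increasing.

    h has one row per user and one column per transmit antenna. The search
    starts from the first n_select users (an arbitrary assumed choice).
    """
    n_users = h.shape[0]
    selected = list(range(n_select))
    current = gram_det(h, selected)
    while True:
        best_ratio, best_swap = 1.0 + 1e-9, None
        for position in range(n_select):
            for candidate in range(n_users):
                if candidate in selected:
                    continue
                trial = selected.copy()
                trial[position] = candidate
                ratio = gram_det(h, trial) / current  # determinant-changing ratio
                if ratio > best_ratio:
                    best_ratio, best_swap = ratio, trial
        if best_swap is None:
            return selected, current  # no replacement increases the determinant
        selected, current = best_swap, gram_det(h, best_swap)


# Usage: 8 candidate users, 4 transmit antennas, random complex channel.
rng = np.random.default_rng(2)
channel = rng.normal(size=(8, 4)) + 1j * rng.normal(size=(8, 4))
chosen, chosen_det = select_users(channel, 4)
```

Because each accepted swap strictly increases the determinant and the number of user sets is finite, the loop terminates; it may stop at a local optimum, which is consistent with the abstract describing performance as comparable to, rather than identical to, the full search.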
Keigo KUBO Sakriani SAKTI Graham NEUBIG Tomoki TODA Satoshi NAKAMURA
Grapheme-to-phoneme (g2p) conversion, used to estimate the pronunciations of out-of-vocabulary (OOV) words, is a highly important part of recognition systems as well as text-to-speech systems. The current state-of-the-art approach to g2p conversion is structured learning based on the Margin Infused Relaxed Algorithm (MIRA), which is an online discriminative training method for multiclass classification. However, it is known that the aggressive weight update method of MIRA is prone to overfitting, particularly when the current example is an outlier or noisy. Adaptive Regularization of Weight Vectors (AROW) has been proposed to resolve this problem for binary classification. In addition, AROW's update rule is simpler and more efficient than that of MIRA, allowing for more efficient training. Although AROW has these advantages, it has not yet been applied to g2p conversion. In this paper, we apply AROW for the first time to the g2p conversion task, which is a structured learning problem. In an evaluation that employed a dataset generated from collective knowledge on the Web, our proposed approach achieves a 6.8% error reduction rate compared to MIRA in terms of phoneme error rate. In addition, the learning time of our proposed approach was shorter than that of MIRA on almost all datasets.
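For reference, the AROW update for plain binary classification, the setting the abstract says it was originally proposed for, can be sketched as follows. This is not the structured g2p extension described above; the class name `BinaryAROW`, the regularization value `r=1.0`, and the toy data are assumptions for illustration.

```python
import numpy as np


class BinaryAROW:
    """AROW for binary labels in {-1, +1} with a full covariance matrix."""

    def __init__(self, n_features, r=1.0):
        self.mean = np.zeros(n_features)  # weight vector estimate
        self.cov = np.eye(n_features)     # per-direction uncertainty
        self.r = r                        # regularization (assumed value)

    def predict(self, x):
        return 1 if self.mean @ x >= 0.0 else -1

    def update(self, x, y):
        margin = y * (self.mean @ x)
        if margin >= 1.0:
            return  # hinge loss is zero: no update
        cov_x = self.cov @ x
        confidence = x @ cov_x
        beta = 1.0 / (confidence + self.r)
        alpha = (1.0 - margin) * beta
        # The step is scaled by the covariance, so directions that are
        # already well estimated move less than uncertain ones.
        self.mean += alpha * y * cov_x
        self.cov -= beta * np.outer(cov_x, cov_x)


# Usage on a hypothetical, linearly separable toy problem.
rng = np.random.default_rng(3)
points = rng.uniform(-1.0, 1.0, size=(300, 2))
points = points[np.abs(points[:, 0] - points[:, 1]) > 0.2]
labels = np.where(points[:, 0] > points[:, 1], 1, -1)

model = BinaryAROW(n_features=2)
for _ in range(5):
    for x, y in zip(points, labels):
        model.update(x, y)
accuracy = np.mean([model.predict(x) == y for x, y in zip(points, labels)])
```

The regularization term `r` in the denominator of `beta` is what softens the update relative to an aggressive rule that fully corrects every example.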