Fourier transform is a significant tool in image processing and pattern recognition. By introducing a hypercomplex number, hypercomplex Fourier transform treats a signal as a vector field and generalizes the conventional Fourier transform. Inspired from that, hypercomplex polar Fourier analysis that extends conventional polar Fourier analysis is proposed in this paper. The proposed method can handle signals represented by hypercomplex numbers as color images. The hypercomplex polar Fourier analysis is reversible that means it can be used to reconstruct image. The hypercomplex polar Fourier descriptor has rotation invariance property that can be used for feature extraction. Due to the noncommutative property of quaternion multiplication, both left-side and right-side hypercomplex polar Fourier analysis are discussed and their relationships are also established in this paper. The experimental results on image reconstruction, rotation invariance, color plate test and image retrieval are given to illustrate the usefulness of the proposed method as an image analysis tool.
In this paper, an adaptive block-based text line extraction algorithm is proposed. Three global and two local parameters are defined to adapt the method to various handwritings in different languages. A document image is segmented into several overlapping blocks. The skew of each block is estimated. Text block is de-skewed by using the estimated skew angle. Text regions are detected in the de-skewed text block. A number of data points are extracted from the detected text regions in each block. These data points are used to estimate the paths of text lines. By thinning the background of the image including text line paths, text line boundaries or separators are estimated. Furthermore, an algorithm is proposed to assign to the extracted text lines the connected components which have intersections with the estimated separators. Extensive experiments on different standard datasets in various languages demonstrate that the proposed algorithm outperforms previous methods.
Kohei INOUE Kenji HARA Kiichi URAHAMA
Linear discriminant analysis (LDA) is one of the well-known schemes for feature extraction and dimensionality reduction of labeled data. Recently, two-dimensional LDA (2DLDA) for matrices such as images has been reformulated into symmetric 2DLDA (S2DLDA), which is solved by an iterative algorithm. In this paper, we propose a non-iterative S2DLDA and experimentally show that the proposed method achieves comparable classification accuracy with the conventional S2DLDA, while the proposed method is computationally more efficient than the conventional S2DLDA.
Peerasak INTARAPAIBOON Ekawit NANTAJEEWARAWAT Thanaruk THEERAMUNKONG
Based on sliding-window rule application and extraction filtering, we present a framework for extracting multi-slot frames describing chemical reactions from Thai free text with unknown target-phrase boundaries. A supervised rule learning algorithm is employed for automatic construction of pattern-based extraction rules from hand-tagged training phrases. A filtering method is devised for removal of incorrect extraction results based on features observed from text portions appearing between adjacent slot fillers in source documents. Extracted reaction frames are represented as concept expressions in description logics and are used as metadata for document indexing. A document knowledge base supporting semantics-based information retrieval is constructed by integrating document metadata with domain-specific ontologies.
Liang SHA Guijin WANG Xinggang LIN Kongqiao WANG
This paper presents a robust framework of human-computer interaction from the hand gesture vision in the presence of realistic and challenging scenarios. To this end, several novel components are proposed. A hybrid approach is first proposed to automatically infer the beginning position of hand gestures of interest via jointly optimizing the regions given by an offline skin model trained from Gaussian mixture models and a specific hand gesture classifier trained from the Adaboost technique. To consistently track the hand in the context of using kernel based tracking, a semi-supervised feature selection strategy is further presented to choose the feature subspaces which appropriately represent the properties of offline hand skin cues and online foreground-background-classification cues. Taking the histogram of oriented gradients as the descriptor to represent hand gestures, a soft-decision approach is finally proposed for recognizing static hand gestures at the locations where severe ambiguity occurs and hidden Markov model based dynamic gestures are employed for interaction. Experiments on various real video sequences show the superior performance of the proposed components. In addition, the whole framework is applicable to real-time applications on general computing platforms.
Chizu MATSUMOTO Yuichi HAMAMURA Yoshiyuki TSUNODA Hiroshi UOZAKI Isao MIYAZAKI Shiro KAMOHARA Yoshiyuki KANEKO Kenji KANAMITSU
In order to accelerate yield improvement in semiconductor manufacturing, it is important to prevent the root causes of product-specific failures, such as systematic defects and parametric defects, which are different for each product. We herein propose a method for the investigation of product-specific failures by estimating differences between the actual failing bit signatures (FBSs) and the predicted FBSs caused by random defects. In order to estimate these differences accurately, we have developed a novel algorithm by which to extract the critical area for each FBS. The total failure rate errors of FBSs are within 0.5% for embedded SRAMs. The proposed method identified the root causes of product-specific failures in 150 and 65 nm technology node products.
Go TANAKA Noriaki SUETAKE Eiji UCHINO
A new recoloring method to improve visibility of indiscriminable colors for protanopes or deuteranopes is proposed. In the proposed method, yellow-blue components of a color image perceived by protanopes/deuteranopes are adequately modified. Moreover, the gamut mapping is considered to obtain proper output color values in this method.
Our research is focused on examining the Image Quality Assessment Model based on the MPEG-7 descriptor and the No Reference model. The model retrieves a reference image using image search and evaluate its subject score as a pseudo Reduced Reference model. The MPEG-7 descriptor was originally used for content retrieval, but we discovered that the MPEG-7 descriptor can also be used for image quality assessment. We examined the performance of the proposed model and the results revealed that this method has a higher performance rating than the SSIM.
Owing to the high expressiveness of regular expression, it is frequently used in searching and manipulation of text based data. Regular expression is highly applicable in processing Latin alphabet based text, but the same cannot be said for Hangeul*, the writing system for Korean language. Although Hangeul possesses alphabetic features within the script, expressiveness of regular expression pattern using Hangeul is hindered by the absence of syllable decomposition. Without decomposition support in regular expression, searching through Hangeul text is limited to string literal matching. Literal matching has made enumeration of syllable candidates in regular expression pattern definition indispensable, albeit impractical, especially for a large set of syllable candidates. Although the existing implementation of canonical decomposition in Unicode standard does reduce a pre-composed Hangeul syllable into smaller unit of consonant-vowel or consonant-vowel-consonant letters, it still leaves quite a number of the individual letters in compounded form. We have observed that there is a necessity to further reduce the compounded letters into unit of basic letters to properly represent the Korean script in regular expression. We look at how the new canonical decomposition technique proposed by Kim can help in handling Hangeul in regular expression. In this paper, we examine several of the performance indicators of full decomposition of Hangeul syllable to better understand the overhead that might incur, if a full decomposition were to be implemented in a regular expression engine. For efficiency considerations, we propose a semi decomposition technique alongside with a notation for defining Hangeul syllables. The semi decomposition functions as an enhancement to the existing regular expression syntax by taking in some of the special constructs and features of the Korean language. This proposed technique intends to allow an end user to have a greater freedom to define regular expression syntax for Hangeul.
Xin NIE Jianhua ZHANG Ping ZHANG
Relay, which promises to enhance the performance of future communication networks, is one of the most promising techniques for IMT-Advanced systems. In this paper, multiple-input multiple-output (MIMO) relay channels based on outdoor measurements are investigated. We focus on the link between the base station (BS) and the relay station (RS) as well as the link between the RS and the mobile station (MS). First of all, the channels were measured employing a real-time channel sounder in IMT-Advanced frequency band (2.35 GHz with 50 MHz bandwidth). Then, the parameters of multipath components (MPCs) are extracted utilizing space-alternating generalized expectation algorithm. MPC parameters of the two links are statistically analyzed and compared. The polarization and spatial statistics are gotten. The trends of power azimuth spectrum (PAS) and cross-polarization discrimination (XPD) with the separation between the RS and the MS are investigated. Based on the PAS, the propagation mechanisms of line-of-sight and non-line-of-sight scenarios are analyzed. Furthermore, an approximate closed-form expression of channel correlation is derived. The impacts of PAS and XPD on the channel correlation are studied. Finally, some guidelines for the antenna configurations of the BS, the RS and the MS are presented. The results reveal the different characteristics of relay channels and provide the basis for the practical deployment of relay systems.
In the field of software reengineering, many component identification approaches have been proposed for evolving legacy systems into component-based systems. Understanding the behaviors of various component identification approaches is the first important step to meaningfully employ them for legacy systems evolution, therefore we performed an empirical study on component identification technology with considerations of their similarity measures, clustering approaches and stopping criteria. We proposed a set of evaluation criteria and developed the tool CIETool to automate the process of component identification and evaluation. The experimental results revealed that many components of poor quality were produced by the employed component identification approaches; that is, many of the identified components were tightly coupled, weakly cohesive, or had inappropriate numbers of implementation classes and interface operations. Finally, we presented an analysis on the component identification approaches according to the proposed evaluation criteria, which suggested that the weaknesses of these clustering approaches were the major reasons that caused components of poor-quality.
Kuo-Chen HUNG Yuan-Cheng TSAI Kuo-Ping LIN Peterson JULIAN
Several papers have presented measured function to handle multi-criteria fuzzy decision-making problems based on interval-valued intuitionistic fuzzy sets. However, in some cases, the proposed function cannot give sufficient information about alternatives. Consequently, in this paper, we will overcome previous insufficient problem and provide a novel accuracy function to measure the degree of the interval-valued intuitionistic fuzzy information. And a practical example has been provided to demonstrate our proposed approach. In addition, to make computing and ranking results easier and to increase the recruiting productivity, a computer-based interface system has been developed for decision makers to make decisions more efficiently.
Apisak WORAPISHET Tanee DEMEECHAI
The noise performances under AWGN channel of the IF-derivative and the quadrature differential FM discriminators, which are widely utilized in modern low power wireless radios, are analyzed and compared. The analysis relies upon the time-domain multi-sinusoidal representation of the noise that facilitates accurate and closed-form analytical SNR characteristics. Derivation of the SNR equations is detailed and discussion based on the analysis results is given to provide insights into the discriminators' performance limitation where it is demonstrated that the differential scheme is considerably more advantageous. Simulated SNR characteristics of practical continuous-phase frequency shift keying (CPFSK) systems using both the FM discriminators are presented as analysis verification.
Yasunari OBUCHI Takashi SUMIYOSHI
In this paper we introduce a new framework of audio processing, which is essential to achieve a trigger-free speech interface for home appliances. If the speech interface works continually in real environments, it must extract occasional voice commands and reject everything else. It is extremely important to reduce the number of false alarms because the number of irrelevant inputs is much larger than the number of voice commands even for heavy users of appliances. The framework, called Intentional Voice Command Detection, is based on voice activity detection, but enhanced by various speech/audio processing techniques such as emotion recognition. The effectiveness of the proposed framework is evaluated using a newly-collected large-scale corpus. The advantages of combining various features were tested and confirmed, and the simple LDA-based classifier demonstrated acceptable performance. The effectiveness of various methods of user adaptation is also discussed.
Dan-ni AI Xian-hua HAN Xiang RUAN Yen-wei CHEN
In this paper, we present a novel color independent components based SIFT descriptor (termed CIC-SIFT) for object/scene classification. We first learn an efficient color transformation matrix based on independent component analysis (ICA), which is adaptive to each category in a database. The ICA-based color transformation can enhance contrast between the objects and the background in an image. Then we compute CIC-SIFT descriptors over all three transformed color independent components. Since the ICA-based color transformation can boost the objects and suppress the background, the proposed CIC-SIFT can extract more effective and discriminative local features for object/scene classification. The comparison is performed among seven SIFT descriptors, and the experimental classification results show that our proposed CIC-SIFT is superior to other conventional SIFT descriptors.
Eka FIRMANSYAH Satoshi TOMIOKA Seiya ABE Masahito SHOYAMA Tamotsu NINOMIYA
This paper proposes a new power-factor-correction (PFC) topology, and explains its operation principle, its control mechanism, related application problems followed by experimental results. In this proposed topology, critical-conduction-mode (CRM) interleaved technique is applied to a bridgeless PFC in order to achieve high efficiency by combining benefits of each topology. This application is targeted toward low to middle power applications that normally employs continuous-conduction-mode boost converter.
Shoei SATO Takahiro OKU Shinichi HOMMA Akio KOBAYASHI Toru IMAI
We present a new discriminative method of acoustic model adaptation that deals with a task-dependent speech variability. We have focused on differences of expressions or speaking styles between tasks and set the objective of this method as improving the recognition accuracy of indistinctly pronounced phrases dependent on a speaking style. The adaptation appends subword models for frequently observable variants of subwords in the task. To find the task-dependent variants, low-confidence words are statistically selected from words with higher frequency in the task's adaptation data by using their word lattices. HMM parameters of subword models dependent on the words are discriminatively trained by using linear transforms with a minimum phoneme error (MPE) criterion. For the MPE training, subword accuracy discriminating between the variants and the originals is also investigated. In speech recognition experiments, the proposed adaptation with the subword variants reduced the word error rate by 12.0% relative in a Japanese conversational broadcast task.
Jungsuk SONG Hiroki TAKAKURA Yasuo OKABE Daisuke INOUE Masashi ETO Koji NAKAO
Intrusion Detection Systems (IDS) have been received considerable attention among the network security researchers as one of the most promising countermeasures to defend our crucial computer systems or networks against attackers on the Internet. Over the past few years, many machine learning techniques have been applied to IDSs so as to improve their performance and to construct them with low cost and effort. Especially, unsupervised anomaly detection techniques have a significant advantage in their capability to identify unforeseen attacks, i.e., 0-day attacks, and to build intrusion detection models without any labeled (i.e., pre-classified) training data in an automated manner. In this paper, we conduct a set of experiments to evaluate and analyze performance of the major unsupervised anomaly detection techniques using real traffic data which are obtained at our honeypots deployed inside and outside of the campus network of Kyoto University, and using various evaluation criteria, i.e., performance evaluation by similarity measurements and the size of training data, overall performance, detection ability for unknown attacks, and time complexity. Our experimental results give some practical and useful guidelines to IDS researchers and operators, so that they can acquire insight to apply these techniques to the area of intrusion detection, and devise more effective intrusion detection models.
Nan LIU Yao ZHAO Zhenfeng ZHU Rongrong NI
This paper presents a commercial shot classification scheme combining well-designed visual and textual features to automatically detect TV commercials. To identify the inherent difference between commercials and general programs, a special mid-level textual descriptor is proposed, aiming to capture the spatio-temporal properties of the video texts typical of commercials. In addition, we introduce an ensemble-learning based combination method, named Co-AdaBoost, to interactively exploit the intrinsic relations between the visual and textual features employed.
Polar Fourier Descriptor(PFD) and Spherical Fourier Descriptor(SFD) are rotation invariant feature descriptors for two dimensional(2D) and three dimensional(3D) image retrieval and pattern recognition tasks. They are demonstrated to show superiorities compared with other methods on describing rotation invariant features of 2D and 3D images. However in order to increase the computation speed, fast computation method is needed especially for machine vision applications like realtime systems, limited computing environments and large image databases. This paper presents fast computation method for PFD and SFD that are deduced based on mathematical properties of trigonometric functions and associated Legendre polynomials. Proposed fast PFD and SFD are 8 and 16 times faster than direct calculation that significantly boost computation process. Furthermore, the proposed methods are also compact for memory requirements for storing PFD and SFD basis in lookup tables. The experimental results on both synthetic and real data are given to illustrate the efficiency of the proposed method.