Rajashree S. SOKASANE Kyungbaek KIM
These days, recognizing a user's personality is an important issue for supporting various personalized services. Besides conventional phone usage data such as call logs, SMS logs and application usage, smartphones can capture user behavior by polling embedded sensors such as GPS sensors. In this paper, we focus on how to predict user attitude from GPS log data by applying location clustering techniques and extracting features from the location clusters. An evaluation with one month of GPS log data shows that location-based features, such as the number of clusters and the coverage of clusters, are correlated with user attitude to some extent. In particular, when an SVM is used as the classifier for predicting the dichotomy of user attitudes in MBTI, an F-measure of over 90% is achieved.
Support Vector Machine (SVM) is one of the most widely used classifiers to categorize observations. This classifier deterministically selects a class that has the largest score for a classification output. In this letter, we propose a multiclass probabilistic classification method that reflects the degree of confidence. We apply the proposed method to age group classification and verify the performance.
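The contrast between a hard decision and a confidence-aware one can be illustrated with a minimal sketch. The abstract does not say how the proposed method derives its probabilities, so the softmax mapping below is an assumption chosen for illustration, and the age-group labels and score values are hypothetical.

```python
import math

def scores_to_probabilities(scores, temperature=1.0):
    """Map per-class decision scores to a probability distribution.

    Softmax is an assumed mapping used for illustration; it is not
    taken from the abstract.
    """
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical per-class SVM scores for four age groups.
age_groups = ["child", "young adult", "middle-aged", "senior"]
scores = [0.2, 1.5, 1.3, -0.7]

probs = scores_to_probabilities(scores)
best = max(range(len(probs)), key=probs.__getitem__)

# A deterministic classifier reports only the winning label; the
# probabilistic view also shows how close the runner-up is.
for label, p in zip(age_groups, probs):
    print(f"{label:12s} {p:.3f}")
print("hard decision:", age_groups[best])
```

Here the two highest scores are close, so the hard decision hides the fact that the classifier is far from certain, which is the kind of information a probabilistic output is meant to expose.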
Jie LIU Linlin QIN Jing GAO Aidong ZHANG
Ontology mapping is important in many areas, such as information integration, the semantic web and knowledge management, so its effectiveness needs further study. This paper puts forward a mapping method between different ontology concepts in the same field. First, algorithms for calculating four individual similarities between two concepts (the similarities of concept name, property, instance and structure) are proposed. Their features are as follows: a new WordNet-based method is used to compute the semantic similarity between concept names; a property similarity algorithm forms a property similarity matrix between concepts, which is then reduced to a numerical similarity; a new vector space model algorithm is proposed to compute the instance similarity; and structure parameters are added to the structure similarity calculation, namely the number of properties, instances and sub-concepts, and the hierarchy depth of the two concepts. The similarity of each ontology concept pair is then represented by a vector. Finally, a Support Vector Machine (SVM) is used to accomplish mapping discovery by training on and learning from the similarity vectors. In this algorithm, harmony and reliability are used as the weights of the four individual similarities, which increases the accuracy and reliability of the algorithm. Experiments show that the proposed method outperforms many other similarity-based methods.
In this letter, we propose a new no-reference blur estimation method in the frequency domain. It is based on computing the cumulative distribution function (CDF) of the Fourier transform spectrum of the blurred image and analyzing the relationship between its shape and the blur strength. From the analysis, we propose and evaluate six curve-shaped analytic metrics for estimating blur strength. Also, we employ an SVM-based learning scheme to improve the accuracy and robustness of the proposed metrics. In our experiments on Gaussian blurred images, one of the six metrics outperformed the others and the standard deviation values between 0 and 6 could be estimated with an estimation error of 0.31 on average.
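The link between blur strength and the shape of the spectrum CDF can be seen in a small one-dimensional sketch. Everything here is a simplification made for illustration: a 1-D step edge stands in for an image, the DFT is a naive implementation, and `bins_to_reach` is a hypothetical summary statistic, not one of the six metrics from the letter.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Naive DFT magnitudes for frequencies 0..N/2 (fine for short signals)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]

def spectrum_cdf(signal):
    """Cumulative, normalised magnitude spectrum, skipping the DC term."""
    mags = magnitude_spectrum(signal)[1:]
    total = sum(mags)
    cdf, running = [], 0.0
    for m in mags:
        running += m
        cdf.append(running / total)
    return cdf

def gaussian_blur(signal, sigma):
    """Circular convolution with a Gaussian kernel truncated at 3 sigma."""
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in offsets]
    norm = sum(weights)
    n = len(signal)
    return [
        sum(w * signal[(i + d) % n] for w, d in zip(weights, offsets)) / norm
        for i in range(n)
    ]

def bins_to_reach(cdf, level=0.9):
    """Hypothetical metric: number of frequency bins needed to reach `level`."""
    for index, value in enumerate(cdf):
        if value >= level:
            return index + 1
    return len(cdf)

sharp = [0.0] * 32 + [1.0] * 32          # a step edge
blurred = gaussian_blur(sharp, sigma=2.0)

sharp_cdf = spectrum_cdf(sharp)
blurred_cdf = spectrum_cdf(blurred)
print("bins to reach 90%:", bins_to_reach(sharp_cdf), "(sharp) vs",
      bins_to_reach(blurred_cdf), "(blurred)")
```

Blurring attenuates high frequencies, so the blurred signal's CDF rises earlier than the sharp one. A shape statistic of this kind could then serve as an input feature to a learned regressor, which is the role the abstract assigns to the SVM-based scheme.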
This paper proposes a so-called quasi-linear support vector machine (SVM), which is an SVM with a composite quasi-linear kernel. In the quasi-linear SVM model, the nonlinear separation hyperplane is approximated by multiple local linear models with interpolation. Instead of building multiple local SVM models separately, the quasi-linear SVM realizes the multiple-local-linear-model approach at the kernel level. That is, it is built in exactly the same way as a single SVM model, by composing a quasi-linear kernel. A guided partitioning method is proposed to obtain the local partitions for the composition of the quasi-linear kernel function. Experimental results on artificial data and benchmark datasets show that the proposed method is effective and improves classification performance.
Artificial blurring is a typical operation in image forging. Most existing image forgery detection methods consider only a single feature of the artificial blurring operation. In this manuscript, we propose to fuse multiple features of the artificial blurring operation in image tampering to improve the accuracy of forgery detection. First, three feature vectors that address the singular values of the gray image matrix, correlation coefficients for the double blurring operation, and image quality metrics (IQM) are extracted and fused using principal component analysis (PCA), and then a support vector machine (SVM) classifier is trained using the fused feature extracted from training images or image patches containing artificial blurring operations. Finally, the same feature extraction and feature fusion procedures are carried out on the suspected image or image patch, which is then classified by the trained SVM as forged or non-forged. Experimental results show the feasibility of the proposed method for image tampering feature fusion and forgery detection.
Kenshi SAHO Hiroaki HOMMA Takuya SAKAMOTO Toru SATO Kenichi INOUE Takeshi FUKUDA
Recent studies have focused on developing security systems using micro-Doppler radars to detect human bodies. However, the resolution of these conventional methods is unsuitable for identifying bodies, and moreover, most of these methods were designed for solitary or sufficiently well-spaced targets. This paper proposes a solution to these problems with an image separation method for two closely spaced pedestrian targets. The proposed method first develops an image of the targets using ultra-wide-band (UWB) Doppler imaging radar. Next, the targets in the image are separated using a supervised learning-based separation method trained on a data set extracted using a range profile. We experimentally evaluated the performance of the image separation using some representative supervised separation methods and selected the most appropriate method. Finally, we reject false points caused by target interference based on the separation result. The experiment, assuming two pedestrians with a body separation of 0.44 m, shows that our method accurately separates their images using a UWB Doppler radar with a nominal down-range resolution of 0.3 m. We describe applications using various target positions, establish the performance, and derive optimal settings for our method.
Mitsuru SHIOZAKI Kousuke OGAWA Kota FURUHASHI Takahiko MURAYAMA Masaya YOSHIKAWA Takeshi FUJINO
In modern hardware security applications, silicon physical unclonable functions (PUFs) are of interest for their potential use as a unique identity or secret key that is generated from inherent characteristics caused by process variations. However, arbiter-based PUFs, which utilize the relative delay-time difference between equivalent paths, have a security issue in that the generated challenge-response pairs (CRPs) can be predicted by a machine learning attack. We previously proposed the RG-DTM PUF, in which a response is decided from divided time domains allocated to response 0 or 1, to improve the uniqueness of the conventional arbiter-PUF in a small circuit. However, its resistance against machine learning attacks has not yet been studied. In this paper, we evaluate the resistance against machine learning attacks by using a support vector machine (SVM) and logistic regression (LR) in both simulations and measurements, and compare the RG-DTM PUF with the conventional arbiter-PUF and with the XOR arbiter-PUF, which strengthens the resistance by XORing the outputs of multiple arbiter-PUFs. In numerical simulations, prediction rates using both SVM and LR were above 90% within 1,000 training CRPs on the arbiter-PUF. The machine learning attack using the SVM could never predict responses on the XOR arbiter-PUF with over six arbiter-PUFs, whereas the prediction rate eventually reached 95% using the LR and many training CRPs. On the RG-DTM PUF, when the division number of the time domains was over eight, the prediction rates using the SVM were equal to the probability of guessing. The machine learning attack using LR has the potential to predict responses, although an adversary would need to steal a significant number of CRPs. However, the resistance can be strengthened exponentially with an increase in the division number, just as with the XOR arbiter-PUF. Over one million CRPs are required to attack the 16-divided RG-DTM PUF.
Differences between the RG-DTM PUF and the XOR arbiter-PUF relate to the area penalty and the power penalty. Specifically, the XOR arbiter-PUF has to make up for resistance against machine learning attacks by increasing the circuit area, while the RG-DTM PUF is resistant against machine learning attacks with less area penalty and power penalty since only capacitors are added to the conventional arbiter-PUF. We also attacked RG-DTM PUF chips, which were fabricated with 0.18-µm CMOS technology, to evaluate the effect of physical variations and unstable responses. The resistance against machine learning attacks was related to the delay-time difference distribution, but unstable responses had little influence on the attack results.
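The style of attack evaluated on the conventional arbiter-PUF can be sketched with the widely used additive delay model, in which the response is the sign of a weighted sum of parity features of the challenge. This is a minimal simulation under stated assumptions: the Gaussian delay weights, the 16-stage size, the learning rate and the epoch count are all illustrative choices, and the RG-DTM PUF itself is not modelled because the abstract does not give enough detail to do so.

```python
import math
import random

def parity_features(challenge):
    """Challenge bits -> parity features of the additive delay model, plus a bias."""
    features = []
    for i in range(len(challenge)):
        product = 1
        for bit in challenge[i:]:
            product *= 1 - 2 * bit  # bit 0 -> +1, bit 1 -> -1
        features.append(product)
    features.append(1)  # bias term
    return features

def arbiter_response(weights, challenge):
    """Response is the sign of the accumulated delay difference."""
    delay = sum(w * f for w, f in zip(weights, parity_features(challenge)))
    return 1 if delay > 0 else 0

def xor_arbiter_response(weight_sets, challenge):
    """XOR the responses of several arbiter-PUFs fed the same challenge."""
    response = 0
    for weights in weight_sets:
        response ^= arbiter_response(weights, challenge)
    return response

def train_logistic_regression(crps, epochs=20, rate=0.05):
    """Plain stochastic-gradient logistic regression on parity features."""
    model = [0.0] * (len(crps[0][0]) + 1)
    for _ in range(epochs):
        for challenge, response in crps:
            feats = parity_features(challenge)
            z = sum(w * f for w, f in zip(model, feats))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            predicted = 1.0 / (1.0 + math.exp(-z))
            for j, f in enumerate(feats):
                model[j] += rate * (response - predicted) * f
    return model

rng = random.Random(0)
STAGES = 16  # assumed PUF size for this sketch
secret = [rng.gauss(0.0, 1.0) for _ in range(STAGES + 1)]  # assumed delay weights

def random_crps(count):
    """Collect challenge-response pairs from the simulated arbiter-PUF."""
    pairs = []
    for _ in range(count):
        challenge = [rng.randint(0, 1) for _ in range(STAGES)]
        pairs.append((challenge, arbiter_response(secret, challenge)))
    return pairs

training, held_out = random_crps(1000), random_crps(1000)
model = train_logistic_regression(training)
hits = sum(arbiter_response(model, c) == r for c, r in held_out)
prediction_rate = hits / len(held_out)
print(f"prediction rate on held-out CRPs: {prediction_rate:.3f}")
```

Because a single arbiter-PUF is linear in the parity features, a linear learner fits it from relatively few CRPs. XORing several instances, as in `xor_arbiter_response`, removes that linearity, which is one way to read the abstract's observation that the XOR arbiter-PUF needs far more CRPs to attack.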
Ryoichi ISAWA Tao BAN Shanqing GUO Daisuke INOUE Koji NAKAO
PEiD is a packer identification tool widely used for malware analysis, but its accuracy has been declining recently. There are two major reasons for this. The first is that PEiD does not provide a way to create signatures, even though it adopts a signature-based approach. Signatures must be created manually, and it is difficult to keep up with packers that are created or upgraded rapidly. The second is that PEiD uses exact matching. If a signature contains any error, PEiD cannot identify the packer that corresponds to the signature. In this paper, we propose a new automated packer identification method to overcome the limitations of PEiD and report the results of our numerical study. Our method applies a string-kernel-based support vector machine (SVM): it can measure the similarity between packed programs without manual operations such as creating signatures, and it provides an error-tolerant mechanism that can significantly reduce detection failures caused by minor signature violations. In addition, we use the byte sequence starting from the entry point of a packed program as the packer's feature given to the SVM. That is, our method combines the advantages of the signature-based approach and the machine learning (ML) based approach. The numerical results on 3902 samples with 26 packer classes and 3 unpacked (not-packed) classes show that our method achieves a high accuracy of 99.46%, outperforming PEiD and an existing ML-based method proposed by Sun et al.
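How a string kernel tolerates small byte differences where exact matching fails can be shown with a short sketch. The abstract does not name the specific string kernel, so the k-spectrum kernel below is an assumed stand-in, and the three byte sequences are made-up examples rather than real packer signatures.

```python
from collections import Counter
import math

def kgram_counts(data: bytes, k: int) -> Counter:
    """Count every contiguous k-byte substring of `data`."""
    return Counter(data[i:i + k] for i in range(len(data) - k + 1))

def spectrum_kernel(a: bytes, b: bytes, k: int = 3) -> float:
    """k-spectrum kernel: inner product of the two k-gram count vectors."""
    counts_a, counts_b = kgram_counts(a, k), kgram_counts(b, k)
    return float(sum(count * counts_b[gram] for gram, count in counts_a.items()))

def normalized_kernel(a: bytes, b: bytes, k: int = 3) -> float:
    """Cosine-normalised kernel value in [0, 1]."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0

# Made-up byte sequences standing in for code at a program's entry point.
stub_a = bytes.fromhex("60be001040008dbe00f0ffff57")
stub_b = bytes.fromhex("60be002041008dbe00e0ffff57")  # stub_a with a few bytes changed
stub_c = bytes.fromhex("558bec83ec10535657e8000000")  # unrelated sequence

print("a vs b:", round(normalized_kernel(stub_a, stub_b), 3))
print("a vs c:", round(normalized_kernel(stub_a, stub_c), 3))
print("exact match a == b:", stub_a == stub_b)
```

Exact matching treats `stub_a` and `stub_b` as simply different, while the kernel still assigns them a non-zero similarity through their shared 3-grams. A kernel of this form can be supplied to an SVM as a precomputed Gram matrix.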
Yoonjae CHOI Pum-Mo RYU Hyunki KIM Changki LEE
Event extraction is vital to social media monitoring and social event prediction. In this paper, we propose a method for social event extraction from web documents by identifying binary relations between named entities. There have been many studies on relation extraction, but their aims were mostly academic. For practical application, we try to identify 130 relation types that comprise 31 predefined event types, which address business and public issues. We use a structured Support Vector Machine, a state-of-the-art classifier, to capture relations. We apply our method to news, blogs and tweets collected from the Internet and discuss the results.
Santi NURATCH Panuthat BOONPRAMUK Chai WUTIWIWATCHAI
This paper presents a new technique to smooth speech feature vectors for text-independent speaker verification using an adaptive band-pass IIR filter. The filter is designed by considering the probability density of modulation-frequency components of an M-dimensional feature vector. Each dimension of the feature vector is processed and filtered separately. The initial filter parameters, the low and high cut-off frequencies, are first determined by the global mean of the probability densities computed from all feature vectors of a given speech utterance. Then, the cut-off frequencies are adapted over time, i.e., at every frame vector, in both the low-frequency and high-frequency bands, again based on the global mean and the standard deviation of the feature vectors. The filtered feature vectors are used in an SVM-GMM Supervector speaker verification system. The NIST Speaker Recognition Evaluation 2006 (SRE06) core test is used in the evaluation. Experimental results show that the proposed technique clearly outperforms a baseline system using a conventional RelAtive SpecTrA (RASTA) filter.
Xuefeng BAI Tiejun ZHANG Chuanjun WANG Ahmed A. ABD EL-LATIF Xiamu NIU
Player detection is an important part of sports video analysis. Over the past few years, several learning-based detection methods using various supervised two-class techniques have been presented. Although satisfactory results can be obtained, a lot of manual labor is needed to construct the training set. To overcome this drawback, this letter proposes a player detection method based on one-class SVM (OCSVM) using automatically generated training data. The proposed method is evaluated using several video clips captured from World Cup 2010, and experimental results show that our approach achieves a high detection rate while keeping the cost of training set construction low.
In this paper, we propose a way to improve the classification performance of support vector machines (SVMs), especially for speech and music frames within a selectable mode vocoder (SMV) framework. A myriad of techniques have been proposed for SVMs, and most of them are employed during the training phase of SVMs. Instead, the proposed algorithm is applied during the test phase and works with existing schemes. The proposed algorithm modifies a kernel parameter in the decision function of SVMs to alter SVM decisions for better classification accuracy based on the previous outputs of SVMs. Since speech and music frames exhibit strong inter-frame correlation, the outputs of SVMs can guide the kernel parameter modification. Our experimental results show that the proposed algorithm has the potential for adaptively tuning classifications of support vector machines for better performance.
Yonggang HUANG Jun ZHANG Yongwang ZHAO Dianfu MA
We propose a novel re-ranking method for content-based medical image retrieval based on the idea of pseudo-relevance feedback (PRF). Since the highest ranked images in original retrieval results are not always relevant, a naive PRF based re-ranking approach is not capable of producing a satisfactory result. We employ a two-step approach to address this issue. In step 1, a Pearson's correlation coefficient based similarity update method is used to re-rank the high ranked images. In step 2, after estimating a relevance probability for each of the highest ranked images, a fuzzy SVM ensemble based approach is adopted to re-rank the images. The experiments demonstrate that the proposed method outperforms two other re-ranking methods.
Yonggang HUANG Dianfu MA Jun ZHANG Yongwang ZHAO
We propose a novel query-dependent feature aggregation (QDFA) method for medical image retrieval. The QDFA method can learn an optimal feature aggregation function for a multi-example query, which takes into account multiple features and multiple examples with different importance. The experiments demonstrate that the QDFA method outperforms three other feature aggregation methods.
Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in Information Retrieval, Natural Language Processing, and Data Mining. Intensive studies have been conducted on the problem and significant progress has been made [1],[2]. This short paper gives an introduction to learning to rank, and it specifically explains the fundamental problems, existing approaches, and future work of learning to rank. Several learning to rank methods using SVM techniques are described in detail.
Chang LIU Guijin WANG Wenxin NING Xinggang LIN
A novel approach for detecting anomalies in visual surveillance systems is proposed in this paper. It is composed of three parts: (a) a dense motion field and motion statistics method, (b) motion directional PCA for feature dimensionality reduction, and (c) an improved one-class SVM for one-class classification. Experiments demonstrate the effectiveness of the proposed algorithm in detecting abnormal events in surveillance video while keeping a low false alarm rate. Our scheme works well in complicated situations that common tracking or detection modules cannot handle.
Guangyi ZHOU Yi CUI Yumeng LIU Jian YANG
In this letter, a new terrain type classifier is proposed for polarimetric Synthetic Aperture Radar (Pol-SAR) images. This classifier uses a binary tree structure. The homogeneous and inhomogeneous areas are first separated by a support vector machine (SVM) classifier based on the texture features extracted from the span image. Then the homogeneous and inhomogeneous areas are classified, respectively, by the traditional Wishart classifier and by the SVM classifier based on the texture features. Using a NASA/JPL AIRSAR image, the authors achieve a classification accuracy of up to 98%, demonstrating the effectiveness of the proposed method.
Masahiro TSUKADA Yuya UTSUMI Hirokazu MADOKORO Kazuhito SATO
This paper presents an unsupervised learning-based method for selection of feature points and object category classification without previous setting of the number of categories. Our method consists of the following procedures: 1) detection of feature points and description of features using a Scale-Invariant Feature Transform (SIFT), 2) selection of target feature points using One Class-Support Vector Machines (OC-SVMs), 3) generation of visual words of all SIFT descriptors and histograms in each image of selected feature points using Self-Organizing Maps (SOMs), 4) formation of labels using Adaptive Resonance Theory-2 (ART-2), and 5) creation and classification of categories on a category map of Counter Propagation Networks (CPNs) for visualizing spatial relations between categories. Classification results for static images from the Caltech-256 object category dataset and for dynamic images, namely time-series images obtained by a robot as it moves, demonstrate that our method can visualize spatial relations of categories while maintaining time-series characteristics. Moreover, we emphasize the effectiveness of our method for category classification of appearance changes of objects.
Yuan HU Li LU Jingqi YAN Zhi LIU Pengfei SHI
In this paper, we present a sexual dimorphism analysis of 3D human faces and perform gender classification based on the result of this analysis. Four types of features are extracted from a 3D human-face image. By using statistical methods, the existence of sexual dimorphism in 3D human faces is demonstrated based on these features. The contribution of each feature to sexual dimorphism is quantified according to a novel criterion. The best gender classification rate, 94%, is obtained by using SVMs and the Matcher Weighting fusion method. This research adds to the knowledge of sexual dimorphism in 3D faces and affords a foundation that could be used to distinguish between male and female 3D faces.