Sang-Chul LEE Christos FALOUTSOS Dong-Kyu CHAE Sang-Wook KIM
This paper addresses a novel problem: detecting fraudsters in comparison-shopping services (CSS). In CSS, fraudsters click excessively on a target item to make it look popular, so that it is ranked high in search and recommendation results. As a result, they may degrade the quality of recommendations and searches. We propose an approach that detects such fraudsters by analyzing the click behavior of users in CSS, and we evaluate its effectiveness on a real-world clickstream dataset.
Lei LI Bin FU Christos FALOUTSOS
Quad-core CPUs have become a common desktop configuration in today's offices. The increasing number of processors on a single chip opens new opportunities for parallel computing. Our goal is to make use of multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. In particular, we propose two model-specific variants: CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method for handling the data dependencies that arise from the chain structure of the hidden variables, so that the EM-based parameter learning algorithm can be parallelized. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that the parallel algorithms using Cut-And-Stitch achieve accuracy comparable to, and almost linear speedups over, the traditional serial version.
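The chain dependency mentioned in the abstract can be illustrated with a small sketch. This is not the Cut-And-Stitch algorithm itself, whose details the abstract does not give; it is a simple, exact block-wise scheme for the HMM forward pass only, written in Python rather than OpenMP. All function names (`serial_likelihood`, `block_product`, `blockwise_likelihood`) and the toy parameters are assumptions made for illustration. The serial recursion needs the result of step t-1 before it can compute step t; the block-wise version cuts the sequence into blocks, summarizes each block independently as a matrix product, and then combines the block summaries in order.

```python
from concurrent.futures import ThreadPoolExecutor


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def step_matrix(trans, emit, obs):
    """M[i][j] = trans[i][j] * emit[j][obs]; one forward step is alpha @ M."""
    n = len(trans)
    return [[trans[i][j] * emit[j][obs] for j in range(n)] for i in range(n)]


def serial_likelihood(init, trans, emit, seq):
    """Standard forward algorithm: each step depends on the previous one."""
    n = len(init)
    alpha = [init[i] * emit[i][seq[0]] for i in range(n)]
    for obs in seq[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs]
                 for j in range(n)]
    return sum(alpha)


def block_product(trans, emit, block):
    """Summarize one block as the ordered product of its step matrices."""
    n = len(trans)
    prod = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for obs in block:
        prod = mat_mul(prod, step_matrix(trans, emit, obs))
    return prod


def blockwise_likelihood(init, trans, emit, seq, num_blocks=4):
    """Cut the sequence into blocks, process blocks independently, then combine."""
    n = len(init)
    rest = seq[1:]
    size = max(1, -(-len(rest) // num_blocks))  # ceiling division
    blocks = [rest[k:k + size] for k in range(0, len(rest), size)]
    # Blocks do not depend on each other, so they can be handled concurrently.
    with ThreadPoolExecutor() as pool:
        products = list(pool.map(lambda b: block_product(trans, emit, b), blocks))
    # Combining the block summaries is sequential but has only len(blocks) steps.
    alpha = [init[i] * emit[i][seq[0]] for i in range(n)]
    for prod in products:
        alpha = [sum(alpha[i] * prod[i][j] for i in range(n)) for j in range(n)]
    return sum(alpha)


# Toy two-state HMM (hypothetical parameters).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.5], [0.1, 0.9]]
seq = [0, 1, 1, 0, 1, 0, 0, 1, 1]

serial = serial_likelihood(init, trans, emit, seq)
blocked = blockwise_likelihood(init, trans, emit, seq, num_blocks=3)
```

Both functions return the same likelihood up to floating-point rounding. Python threads are used here only to show which parts are independent; because of the interpreter lock they would not be expected to yield the speedups reported for the OpenMP implementation.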