Author Search Result

[Author] Sang-Wook KIM (9 hits)

  • Fraud Detection in Comparison-Shopping Services: Patterns and Anomalies in User Click Behaviors

    Sang-Chul LEE  Christos FALOUTSOS  Dong-Kyu CHAE  Sang-Wook KIM  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2017/07/10
      Vol:
    E100-D No:10
      Page(s):
    2659-2663

    This paper deals with a novel and interesting problem: detecting fraud in comparison-shopping services (CSS). In CSS, some fraudulent users perform excessive clicks on a target item. They aim to make the item look very popular so that it is subsequently ranked high in search and recommendation results. As a result, such users may degrade the quality of recommendations and searches. We propose an approach to detecting them by analyzing the click behaviors of users in CSS. We evaluate the effectiveness of the proposed approach on a real-world clickstream dataset.

  • ACE-INPUTS: A Cost-Effective Intelligent Public Transportation System

    Jongchan LEE  Sanghyun PARK  Minkoo SEO  Sang-Wook KIM  

     
    PAPER-Distributed Cooperation and Agents

      Vol:
    E90-D No:8
      Page(s):
    1251-1261

    With the rapid adoption of mobile devices and location-based services (LBS), applications that provide nearby information, such as recommending sightseeing resorts, are becoming more and more popular. In the meantime, traffic congestion in cities has led to the development of mobile public transportation systems. In such applications, mobile devices need to communicate with servers via wireless communications, and servers must process queries from a very large number of devices. However, because users cannot ignore the cost of wireless communications and server capacities are limited, there is strong demand for decreasing the communication between central servers and devices and for reducing the burden on servers. Therefore, in this paper, we propose a cost-effective intelligent public transportation system, ACE-INPUTS, which uses a mobile device to retrieve the bus routes from the current location to a destination at the lowest wireless communication cost. To accomplish this task, ACE-INPUTS maintains a small amount of information on bus stops and bus routes in the mobile device and runs a heuristic routing algorithm based on that information. Only when a user asks for more accurate route information or issues a "leave later query" does ACE-INPUTS entrust the task to a server, which collects real-time traffic and bus location information. By separating the roles of mobile devices and servers, ACE-INPUTS is able to provide bus routes at the lowest wireless communication cost and reduces the burden on servers. Experimental results show that ACE-INPUTS is effective and scalable in most experimental settings.

  • Efficient Storage and Querying of Horizontal Tables Using a PIVOT Operation in Commercial Relational DBMSs

    Sung-Hyun SHIN  Yang-Sae MOON  Jinho KIM  Sang-Wook KIM  

     
    PAPER-Database

      Vol:
    E91-D No:6
      Page(s):
    1719-1729

    In recent years, horizontal tables with a large number of attributes have been widely used in OLAP and e-business applications to analyze multidimensional data efficiently. To store and query horizontal tables efficiently, recent works have tried to transform a horizontal table into a traditional vertical table. Existing works, however, have the drawback of not considering the optimized PIVOT operation provided (or to be provided) in recent commercial RDBMSs. In this paper we propose a formal approach that exploits the optimized PIVOT operation of commercial RDBMSs for storing and querying horizontal tables. To achieve this goal, we first provide an overall framework that stores and queries a horizontal table using an equivalent vertical table. Under the proposed framework, we then formally define 1) a method that stores a horizontal table in an equivalent vertical table and 2) a PIVOT operation that converts a stored vertical table to an equivalent horizontal view. Next, we propose a novel method that transforms a user-specified query on horizontal tables into an equivalent PIVOT-included query on vertical tables. In particular, by providing transformation rules for all five elementary operations in relational algebra as theorems, we prove that our method is theoretically applicable to commercial RDBMSs. Experimental results show that, compared with the earlier work, our method reduces storage space significantly and also improves average performance by several orders of magnitude. These results indicate that our method provides an excellent framework for maximizing performance in handling horizontal tables by exploiting the optimized PIVOT operation in commercial RDBMSs.
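The store-then-PIVOT idea in the abstract above can be illustrated with a minimal in-memory sketch. It is not the paper's formal definition or any RDBMS's PIVOT operator: the function names `to_vertical` and `pivot`, the `(id, attribute, value)` triple layout, and the `id` key column are all assumptions made for illustration.

```python
from collections import defaultdict


def to_vertical(horizontal_rows, key="id"):
    """Store every non-null cell of a horizontal row as an (id, attribute, value) triple."""
    triples = []
    for row in horizontal_rows:
        for attr, value in row.items():
            if attr != key and value is not None:
                triples.append((row[key], attr, value))
    return triples


def pivot(triples, attributes, key="id"):
    """Rebuild a horizontal view from triples; attributes with no triple become None."""
    cells_by_id = defaultdict(dict)
    for oid, attr, value in triples:
        cells_by_id[oid][attr] = value
    return [
        {key: oid, **{attr: cells.get(attr) for attr in attributes}}
        for oid, cells in sorted(cells_by_id.items())
    ]


horizontal = [
    {"id": 1, "color": "red", "weight": None},
    {"id": 2, "color": None, "weight": 3.5},
]
vertical = to_vertical(horizontal)
restored = pivot(vertical, ["color", "weight"])
```

In this sketch a row whose attributes are all null produces no triples and therefore does not reappear after `pivot`; a real storage scheme would need to handle that case separately.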

  • TL-Rank: A Blend of Text and Link Information for Measuring Similarity in Scientific Literature Databases

    Seok-Ho YOON  Ji-Su KIM  Sang-Wook KIM  Choonhwa LEE  

     
    LETTER-Artificial Intelligence, Data Mining

      Vol:
    E95-D No:10
      Page(s):
    2556-2559

    This paper presents a novel similarity measure that computes similarity scores among scientific research papers. The text of a given paper in online scientific literature is often found to be incomplete in terms of its potential to be compared with others, which likely leads to inaccurate results. Our solution to this problem uses both the text and the link information of the paper in question: the text used for comparison is strengthened by adding the text of papers related to it. More accurate similarity scores can be computed by reinforcing the input with the citations of the paper as well as the citations included within the paper. The efficacy of the proposed measure is validated through an extensive performance evaluation study, which demonstrates a substantial gain.

  • An Approach to Effective Recommendation Considering User Preference and Diversity Simultaneously

    Sang-Chul LEE  Sang-Wook KIM  Sunju PARK  Dong-Kyu CHAE  

     
    LETTER-Data Engineering, Web Information Systems

      Publicized:
    2017/09/28
      Vol:
    E101-D No:1
      Page(s):
    244-248

    This paper addresses recommendation diversification. Existing diversification methods have difficulty dealing with the tradeoff between accuracy and diversity. We point out the root of the problem in diversification methods and propose a novel method that avoids it. Our method aims to find an optimal solution of an objective function that is carefully designed to consider user preference and the diversity among recommended items simultaneously. In addition, we propose item clustering and a greedy approximation to achieve efficiency in recommendation.

  • Analyzing Network Privacy Preserving Methods: A Perspective of Social Network Characteristics

    Duck-Ho BAE  Jong-Min LEE  Sang-Wook KIM  Youngjoon WON  Yongsu PARK  

     
    LETTER-Artificial Intelligence, Data Mining

      Vol:
    E97-D No:6
      Page(s):
    1664-1667

    A burst of social network services increases the need for in-depth analysis of network activities. Privacy breaches for network participants are a concern in such analysis efforts. This paper investigates the structural and property changes caused by several privacy preserving methods (anonymization) for social networks. The anonymized social network does not follow the power law for node degree distribution as the original network does. The peak hop for node connectivity increases by at most 1, and the clustering coefficient of neighbor nodes shows a 6.5-fold increase after anonymization. Thus, we observe inconsistency among privacy preserving methods in social network analysis.

  • Index Interpolation: A Subsequence Matching Algorithm Supporting Moving Average Transform of Arbitrary Order in Time-Series Databases

    Woong-Kee LOH  Sang-Wook KIM  Kyu-Young WHANG  

     
    PAPER-Databases

      Vol:
    E84-D No:1
      Page(s):
    76-86

    In this paper we propose a subsequence matching algorithm that supports moving average transform of arbitrary order in time-series databases. Moving average transform reduces the effect of noise and has been used in many areas such as econometrics since it is useful in finding overall trends. The proposed algorithm extends the existing subsequence matching algorithm proposed by Faloutsos et al. (SUB94 for short). If we applied that algorithm without any extension, we would have to generate an index for each moving average order and would incur serious storage and CPU time overhead. In this paper we tackle the problem using the notion of index interpolation. Index interpolation is defined as a searching method that uses one or more indexes generated for a few selected cases and performs searching for all the cases satisfying some criteria. The proposed algorithm, which is based on index interpolation, can use only one index for a pre-selected moving average order k and performs subsequence matching for arbitrary order m (≥ k). We prove that the proposed algorithm causes no false dismissal. The proposed algorithm can also use more than one index to improve search performance. The algorithm works better with smaller selectivities. For selectivities less than 10^-2, the degradation of search performance compared with the fully-indexed case, which is equivalent to SUB94, is no more than 33.0% when one index is used, and 17.2% when two indexes are used. Since queries with smaller selectivities are much more frequent in general database applications, the proposed algorithm is suitable for practical situations.
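For readers unfamiliar with the transform the abstract above refers to, here is a minimal sketch of a moving average transform of order k, using the common sliding-window definition (each output value is the mean of k consecutive input values). It only illustrates the transform itself, not index interpolation or the matching algorithm; the function name `moving_average` is an assumption.

```python
def moving_average(seq, k):
    """Return the order-k moving average of seq: len(seq) - k + 1 window means."""
    if not 1 <= k <= len(seq):
        raise ValueError("k must be between 1 and len(seq)")
    window_sum = sum(seq[:k])
    result = [window_sum / k]
    for i in range(k, len(seq)):
        # Slide the window one step: add the new value, drop the oldest one.
        window_sum += seq[i] - seq[i - k]
        result.append(window_sum / k)
    return result


smoothed = moving_average([1, 2, 3, 4, 5], 3)
```

A larger order k smooths more aggressively and shortens the output, which is why a separate index per order would be costly.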

  • SimCS: An Effective Method to Compute Similarity of Scientific Papers Based on Contribution Scores

    Masoud REYHANI HAMEDANI  Sang-Wook KIM  

     
    LETTER-Data Engineering, Web Information Systems

      Publicized:
    2015/09/14
      Vol:
    E98-D No:12
      Page(s):
    2328-2332

    In this paper, we propose SimCS (similarity based on contribution scores) to compute the similarity of scientific papers. For similarity computation, we exploit the notion of a contribution score, which indicates how much a paper contributes to another paper citing it. We also consider the author dominance of papers in computing contribution scores. We perform extensive experiments with a real-world dataset to show the superiority of SimCS. In comparison with SimCC, the state-of-the-art method, SimCS not only requires no extra parameter tuning but also shows higher accuracy in similarity computation.

  • Physical Database Design for Efficient Time-Series Similarity Search

    Sang-Wook KIM  Jinho KIM  Sanghyun PARK  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E91-B No:4
      Page(s):
    1251-1254

    Similarity search in time-series databases finds data sequences whose changing patterns are similar to that of a query sequence. For efficient processing, it normally employs a multi-dimensional index. To alleviate the well-known dimensionality curse, previous methods for similarity search apply the Discrete Fourier Transform (DFT) to data sequences and take only the first two or three DFT coefficients as organizing attributes. Other than this ad-hoc approach, there has been no research effort to devise a systematic guideline for choosing the best organizing attributes. This paper first points out the problems occurring in the previous methods, and then proposes a novel solution for constructing optimal multi-dimensional indexes. The proposed method analyzes the characteristics of a target time-series database and identifies the organizing attributes with the best discrimination power. It also determines the optimal number of organizing attributes for efficient similarity search by using a cost model. Through a series of experiments, we show that the proposed method outperforms the previous ones significantly.
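The "previous methods" mentioned in the abstract above, which keep only the first few DFT coefficients as organizing attributes, can be sketched as follows. This shows the ad-hoc baseline, not the paper's attribute-selection method. The function names and the choice of three coefficients are assumptions; the DFT is scaled by 1/sqrt(n) so that, by Parseval's theorem, the distance over a subset of coefficients never exceeds the Euclidean distance between the original sequences.

```python
import cmath
import math


def dft_features(seq, num_coeffs):
    """Return the first num_coeffs coefficients of the orthonormal DFT of seq."""
    n = len(seq)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * f * t / n) for t, x in enumerate(seq))
        / math.sqrt(n)
        for f in range(num_coeffs)
    ]


def distance(a, b):
    """Euclidean distance between two equal-length real or complex vectors."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
query = [2.0, 2.0, 4.0, 3.0, 6.0, 5.0, 8.0, 9.0]
feature_dist = distance(dft_features(data, 3), dft_features(query, 3))
true_dist = distance(data, query)
```

Because `feature_dist` is a lower bound on `true_dist`, filtering candidates in the low-dimensional feature space cannot discard a true match; it can only admit false alarms that are removed in a later exact check.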