Author Search Result

[Author] Kazumi SAITO (5 hits)

  • Pivot Generation Algorithm with a Complete Binary Tree for Efficient Exact Similarity Search

    Yuki YAMAGISHI  Kazuo AOYAMA  Kazumi SAITO  Tetsuo IKEDA  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2017/10/20
      Vol:
    E101-D No:1
      Page(s):
    142-151

    This paper presents a pivot-set generation algorithm for accelerating exact similarity search in a large-scale data set. To deal with a large-scale data set, it is important to construct a search index efficiently offline as well as to perform fast exact similarity search online. Our proposed algorithm efficiently generates competent pivots with two novel techniques: hierarchical data partitioning and fast pivot optimization. To make effective use of a small number of pivots, the former recursively partitions a data set into two equal-sized subsets according to the rank order from each of two assigned pivots, resulting in a complete binary tree. The latter calculates a defined objective function for pivot optimization at a low computational cost by operating on data objects mapped into a pivot space. Since the generated pivots provide tight lower bounds on the distances between a query object and the data objects, an exact similarity search algorithm effectively avoids unnecessary distance calculations. We demonstrate that the search algorithm using the pivots generated by the proposed algorithm reduces distance calculations at an extremely high rate for a range query problem on real large-scale image data sets.
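
The pruning idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the pivots here are chosen by hand rather than generated and optimized, and the names `build_pivot_table` and `range_query` are hypothetical. It only shows how precomputed object-to-pivot distances give a triangle-inequality lower bound that lets a range query skip distance calculations while staying exact.

```python
import math


def build_pivot_table(data, pivots, dist=math.dist):
    """Offline step: store the distance from every object to every pivot."""
    return [[dist(x, p) for p in pivots] for x in data]


def range_query(query, radius, data, pivots, table, dist=math.dist):
    """Exact range query; returns (indices of hits, number of distances computed).

    For any pivot p, |d(q, p) - d(x, p)| <= d(q, x) by the triangle
    inequality, so the maximum over pivots is a lower bound on d(q, x).
    """
    q_to_p = [dist(query, p) for p in pivots]
    hits, computed = [], 0
    for i, x in enumerate(data):
        lower = max(abs(qp - xp) for qp, xp in zip(q_to_p, table[i]))
        if lower > radius:
            continue  # x cannot be within the radius; skip the real distance
        computed += 1
        if dist(query, x) <= radius:
            hits.append(i)
    return hits, computed


# Assumed toy data: four 2-D points and two hand-picked pivots.
data = [(0, 0), (1, 0), (5, 5), (10, 10)]
pivots = [(0, 0), (10, 10)]
table = build_pivot_table(data, pivots)
hits, computed = range_query((0.5, 0), 1.0, data, pivots, table)
```

The tighter the lower bound, the more objects are skipped, which is why the quality of the pivot set matters for search cost.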

  • Extracting Communities from Complex Networks by the k-Dense Method

    Kazumi SAITO  Takeshi YAMADA  Kazuhiro KAZAMA  

     
    PAPER-Graphs and Networks

      Vol:
    E91-A No:11
      Page(s):
    3304-3311

    To understand the structural and functional properties of large-scale complex networks, it is crucial to efficiently extract a set of cohesive subnetworks as communities. Several such community extraction methods have been proposed in the literature, including the classical k-core decomposition method and, more recently, the k-clique based community extraction method. The k-core method, although computationally efficient, is often not powerful enough to uncover a detailed community structure, and it produces only coarse-grained and loosely connected communities. The k-clique method, on the other hand, can extract fine-grained and tightly connected communities but imposes a substantial computational load for large-scale complex networks. In this paper, we present a new notion of a subnetwork called k-dense, and propose an efficient algorithm for extracting k-dense communities. We applied our method to three different types of networks assembled from real data, namely, from blog trackbacks, word associations, and Wikipedia references, and demonstrated that the k-dense method could extract communities almost as efficiently as the k-core method, while the quality of the extracted communities was comparable to that obtained by the k-clique method.
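
As a point of reference for the efficiency claim, the classical k-core baseline mentioned in this abstract can be sketched as a simple peeling procedure. This illustrates k-core only, not the k-dense method; the function name `k_core` and the toy graph are assumptions made for the example.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the node set of the k-core of an undirected graph.

    Nodes with fewer than k remaining neighbours are removed repeatedly
    until every surviving node has degree at least k.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    queue = [n for n in adj if len(adj[n]) < k]
    removed = set()
    while queue:
        n = queue.pop()
        if n in removed:
            continue
        removed.add(n)
        for m in adj[n]:
            adj[m].discard(n)
            if m not in removed and len(adj[m]) < k:
                queue.append(m)
        adj[n].clear()
    return {n for n in adj if n not in removed}


# Assumed toy graph: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
core2 = k_core(edges, 2)
```

Each node and edge is handled a bounded number of times, which is why k-core decomposition is cheap; the price, as the abstract notes, is that the resulting communities can be coarse and loosely connected.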

  • Multimedia HTML Layout Method

    Toshimitsu SUZUKI  Kazumi SAITO  Sadao YASHIRO  Takahide MURAMOTO  

     
    PAPER

      Vol:
    E79-B No:8
      Page(s):
    1076-1082

    We propose a graphical user interface (GUI) that provides users with multimedia information, including dynamic images. On the Internet, there are many WWW browsers that read the Hypertext Markup Language (HTML). As various browsers extend the HTML tags and attributes independently to expand and/or improve layout, HTML compatibility between browsers is lost. We have developed a WWW browser to solve this problem. Our browser presents all multimedia information, including text, images, and dynamic images, as blocks and renders them without the need to extend the HTML specifications. It independently interprets and draws HTML objects using a layout manager. The layout manager has a layout rule, and manages the hierarchical data structure and the block data of HTML documents. This browser also allows the layout rule to be changed. The layout manager efficiently displays information while checking the available display area size. The browser is structured so that the portion that manages the formatting of the document is separated from the portion that displays the individual parts. In this browser, the layout rule allows text to be placed around an image without the need to modify the existing HTML contents. It is also relatively easy to switch the presentation between multiple screens, such as a two-page book-like layout, and the conventional single-page scroll-bar format by changing the layout rule. The incorporation of media decoders into the browser enables the display of various multimedia information, such as sounds, pictures, and moving images.
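
The separation between a layout rule and the component that applies it can be pictured with a small sketch. Everything here is hypothetical (the `Block` tuple, `flow_rule`, `stack_rule`, and `LayoutManager` are not taken from the paper); it only shows how swapping a rule changes block placement without touching the content.

```python
from collections import namedtuple

# Hypothetical block record: every piece of content is a sized rectangle.
Block = namedtuple("Block", "kind width height")


def flow_rule(blocks, area_width):
    """Place blocks left to right, wrapping when the row would overflow."""
    placed, x, y, row_height = [], 0, 0, 0
    for b in blocks:
        if x > 0 and x + b.width > area_width:
            x, y, row_height = 0, y + row_height, 0
        placed.append((b.kind, x, y))
        x += b.width
        row_height = max(row_height, b.height)
    return placed


def stack_rule(blocks, area_width):
    """Place every block on its own row (single-column scroll layout)."""
    placed, y = [], 0
    for b in blocks:
        placed.append((b.kind, 0, y))
        y += b.height
    return placed


class LayoutManager:
    """Applies whichever layout rule it is given to a list of blocks."""

    def __init__(self, rule):
        self.rule = rule

    def layout(self, blocks, area_width):
        return self.rule(blocks, area_width)


blocks = [Block("image", 60, 40), Block("text", 40, 20), Block("text", 50, 20)]
flowed = LayoutManager(flow_rule).layout(blocks, 100)
stacked = LayoutManager(stack_rule).layout(blocks, 100)
```

With the flow rule the first text block sits beside the image; with the stack rule the same blocks form a single column, and the block data is unchanged in both cases.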

  • Efficient Similarity Search with a Pivot-Based Complete Binary Tree

    Yuki YAMAGISHI  Kazuo AOYAMA  Kazumi SAITO  Tetsuo IKEDA  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2017/07/04
      Vol:
    E100-D No:10
      Page(s):
    2526-2536

    This paper presents an efficient similarity search method for a large-scale and high-dimensional data set that uses as its index a complete binary tree (CBT) based on optimized pivots. A similarity search method, in general, requires high-speed performance in both off-line index construction and the online similarity search itself. To fulfill this requirement, we introduce novel techniques into the index construction and similarity search algorithms of the proposed method for a range query. The index construction algorithm recursively employs the following two main functions, resulting in a CBT index. One is a pivot generation function that obtains one effective pivot at each node by efficiently maximizing a defined objective function. The other is a node bisection function that partitions the set of objects at a node into two almost equal-sized subsets based on the optimized pivot. The similarity search algorithm employs a three-stage process that narrows down candidate objects within a given range by pruning unnecessary branches and filtering objects in each stage. Experimental results on a high-dimensional real image data set of one million objects demonstrate that the proposed method finds an exact solution for a range query at around one-quarter to one-half of the computational cost of one of the state-of-the-art methods, by using a CBT index constructed off-line at a reasonable computational cost.
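
The node bisection and branch pruning described here can be sketched as follows. This is a simplified illustration under stated assumptions, not the proposed method: the pivot is simply the first object at each node instead of an optimized one, the three-stage search is reduced to a single pruning test per branch, and `build` and `range_search` are hypothetical names.

```python
import math


def build(objs, leaf_size=2, dist=math.dist):
    """Recursively bisect objects by their distance rank from a pivot."""
    if len(objs) <= leaf_size:
        return {"leaf": objs}
    pivot = objs[0]  # assumption: stand-in for an optimized pivot
    ranked = sorted(objs, key=lambda x: dist(x, pivot))
    mid = len(ranked) // 2
    left, right = ranked[:mid], ranked[mid:]  # almost equal-sized halves
    return {
        "pivot": pivot,
        "left_max": dist(left[-1], pivot),   # farthest object in the near half
        "right_min": dist(right[0], pivot),  # nearest object in the far half
        "left": build(left, leaf_size, dist),
        "right": build(right, leaf_size, dist),
    }


def range_search(node, query, radius, dist=math.dist):
    """Exact range query that skips branches ruled out by the pivot distance."""
    if "leaf" in node:
        return [x for x in node["leaf"] if dist(query, x) <= radius]
    dq = dist(query, node["pivot"])
    hits = []
    # Near half: every x has d(x, pivot) <= left_max, so d(q, x) >= dq - left_max.
    if dq - radius <= node["left_max"]:
        hits += range_search(node["left"], query, radius, dist)
    # Far half: every x has d(x, pivot) >= right_min, so d(q, x) >= right_min - dq.
    if dq + radius >= node["right_min"]:
        hits += range_search(node["right"], query, radius, dist)
    return hits


# Assumed toy data: a 5 x 5 grid of 2-D points.
points = [(float(i), float(j)) for i in range(5) for j in range(5)]
tree = build(points)
found = range_search(tree, (2.1, 2.0), 1.0)
```

Splitting at the median rank keeps the tree balanced regardless of how the distances are distributed, so the depth stays logarithmic in the number of objects.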

  • Accelerating a Lloyd-Type k-Means Clustering Algorithm with Summable Lower Bounds in a Lower-Dimensional Space

    Kazuo AOYAMA  Kazumi SAITO  Tetsuo IKEDA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2018/08/02
      Vol:
    E101-D No:11
      Page(s):
    2773-2783

    This paper presents an efficient acceleration algorithm for Lloyd-type k-means clustering, which is suitable for a large-scale and high-dimensional data set with potentially numerous classes. The algorithm employs a novel projection-based filter (PRJ) to avoid unnecessary distance calculations, resulting in high-speed performance while producing the same results as the standard Lloyd's algorithm. The PRJ exploits a summable lower bound on a squared distance defined in a lower-dimensional space to which data points are projected. The summable lower bound can be tightened dynamically by incrementally adding components in the lower-dimensional space within each iteration, whereas the existing lower bounds used in other acceleration algorithms work only once as a fixed filter. Experimental results on large-scale and high-dimensional real image data sets demonstrate that the proposed algorithm works at high speed and with low memory consumption when large k values are given, compared with the state-of-the-art algorithms.
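
The notion of a lower bound that tightens as components are added can be illustrated with a minimal sketch of the assignment step. It is not the PRJ filter itself: it assumes the points and centroids are already expressed in an orthonormal basis (for example after a rotation), so that a partial sum of squared coordinate differences is a valid lower bound on the full squared distance. The name `assign` is hypothetical.

```python
def assign(points, centroids):
    """Assign each point to its nearest centroid with early rejection.

    Returns (labels, pruned), where pruned counts the centroid candidates
    rejected because the running partial sum reached the best squared
    distance found so far.
    """
    labels, pruned = [], 0
    for x in points:
        best_j, best_d2 = -1, float("inf")
        for j, c in enumerate(centroids):
            partial = 0.0
            rejected = False
            for xi, ci in zip(x, c):
                partial += (xi - ci) ** 2  # the bound grows with each component
                if partial >= best_d2:
                    rejected = True  # this centroid cannot beat the current best
                    break
            if rejected:
                pruned += 1
            else:
                best_j, best_d2 = j, partial
        labels.append(best_j)
    return labels, pruned


# Assumed toy data: four 3-D points and three centroids.
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (9.0, 9.0, 9.0), (10.0, 10.0, 8.0)]
centroids = [(0.0, 0.0, 1.0), (10.0, 10.0, 10.0), (5.0, 5.0, 5.0)]
labels, pruned = assign(points, centroids)
```

Because a candidate is rejected only when its lower bound already reaches the best known distance, the labels match those of an exhaustive assignment, which is the sense in which such filters keep the same results as the standard algorithm.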