The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] structured data(7hit)

1-7hit
  • Recent Advances and Trends in Large-Scale Kernel Methods

    Hisashi KASHIMA  Tsuyoshi IDE  Tsuyoshi KATO  Masashi SUGIYAMA  

     
    INVITED PAPER

      Vol:
    E92-D No:7
      Page(s):
    1338-1353

    Kernel methods such as the support vector machine are one of the most successful algorithms in modern machine learning. Their advantage is that linear algorithms are extended to non-linear scenarios in a straightforward way by the use of the kernel trick. However, naive use of kernel methods is computationally expensive since the computational complexity typically scales cubically with respect to the number of training samples. In this article, we review recent advances in the kernel methods, with emphasis on scalability for massive problems.

  • Efficient Substructure Discovery from Large Semi-Structured Data

    Tatsuya ASAI  Kenji ABE  Shinji KAWASOE  Hiroshi SAKAMOTO  Hiroki ARIMURA  Setsuo ARIKAWA  

     
    PAPER-Data Mining

      Vol:
    E87-D No:12
      Page(s):
    2754-2763

    In this paper, we consider a data mining problem for semi-structured data. Modeling semi-structured data as labeled ordered trees, we present an efficient algorithm for discovering frequent substructures from a large collection of semi-structured data. By extending the enumeration technique developed by Bayardo (SIGMOD'98) for discovering long itemsets, our algorithm scales almost linearly in the total size of maximal tree patterns contained in an input collection depending mildly on the size of the longest pattern. We also developed several pruning techniques that significantly speed-up the search. Experiments on Web data show that our algorithm runs efficiently on real-life datasets combined with proposed pruning techniques in the wide range of parameters.

  • Adaptive Data Class Structure for Efficient Inter-Vehicle Communication

    Lachlan B. MICHAEL  Ryuji KOHNO  

     
    PAPER

      Vol:
    E85-D No:11
      Page(s):
    1830-1838

    To reduce the bandwidth needed for data transmission in an ad-hoc communication network, such as an Intelligent Transportation System (ITS) inter-vehicle communication network, a broadcast scheme is proposed where the data to be transmitted is arranged into several classes. Each class contains more specific and detailed information. Since information to be transmitted often has a geographical relevance, the classes can be structured to represent this relationship. As data is routed through the ad-hoc network, the total amount of transmitted data is reduced by removing the data contained in one class on each hop. The class structure is adaptive so that in unforeseen situations the relative importance of transmitted data can be dynamically adjusted. Furthermore different manufacturers can implement different classes structures, and total length of data may be different. By computer simulation it was shown that in the proposed system the required bandwidth for transmission to achieve similar data reception rates to conventional non-structured data schemes can be reduced to less than one third, resulting in a more efficient transmission scheme. In addition a packet structure similar to IP packets is proposed which will enable easily integration of multimedia transmissions into vehicle to vehicle communications.

  • Complexity and a Method of Extracting a Database Schema over Semistructured Documents

    Nobutaka SUZUKI  Yoichirou SATO  Michiyoshi HAYASE  

     
    PAPER-Databases

      Vol:
    E85-D No:6
      Page(s):
    940-949

    Semistructured data comprises irregular structure and has no a-priori database schema, therefore we encounter several problems such as inefficient data retrieval and wasteful data storage. To cope with such problems, some schema extraction algorithms over semistructured data have been proposed, in which data is modeled as an unordered tree. However, the order of elements is indispensable for document data, therefore we consider extracting an optimal database schema over an ordered tree. We consider an optimization problem to extract a smallest database schema such that the density of each class is no less than a given threshold, where the density of a class represents a similarity between the type of the class and those of the objects in the class. We first prove that the corresponding decision problem is strongly NP-complete, and show that another version of the problem is strongly NP-hard and belongs to Δ2 P. Then we show that for any r < 3/2, there is no polynomial-time r-approximation algorithm that solves the optimization problem unless P = NP. Finally, we propose a kind of class called bounded class that can be constructed efficiently, then show a polynomial-time algorithm for constructing a database schema by using bounded classes.

  • Extracting Typical Classes and a Database Schema from Semistructured Data

    Nobutaka SUZUKI  Yoichirou SATO  Michiyoshi HAYASE  

     
    PAPER-Databases

      Vol:
    E84-D No:1
      Page(s):
    100-112

    Semistructured data has no a-priori schema information, which causes some problems such as inefficient storage and query execution. To cope with such problems, extracting schema information from semistructured data has been an important issue. However, in most cases optimal schema information cannot be extracted efficiently, and few efficient approximation algorithms have been proposed. In this paper, we consider an approximation algorithm for extracting "typical" classes from semistructured data. Intuitively, a class C is said to be typical if the structure of C is "similar" to those of "many" objects. We present the following results. First, we prove that the problem of deciding if a typical class can be extracted from given semistructured data is NP-complete. Second, we present an approximation algorithm for extracting typical classes from given semistructured data, and show a sufficient condition for the approximation algorithm to run in polynomial time. Finally, by using extracted classes obtained by the approximation algorithm, we propose a polynomial-time algorithm for constructing a set R of classes such that R covers all the objects to form a database schema.

  • Discovering Knowledge from Graph Structured Data by Using Refutably Inductive Inference of Formal Graph Systems

    Tetsuhiro MIYAHARA  Tomoyuki UCHIDA  Takayoshi SHOUDAI  Tetsuji KUBOYAMA  Kenichi TAKAHASHI  Hiroaki UEDA  

     
    PAPER

      Vol:
    E84-D No:1
      Page(s):
    48-56

    We present a new method for discovering knowledge from structured data which are represented by graphs in the framework of Inductive Logic Programming. A graph, or network, is widely used for representing relations between various data and expressing a small and easily understandable hypothesis. The analyzing system directly manipulating graphs is useful for knowledge discovery. Our method uses Formal Graph System (FGS) as a knowledge representation language for graph structured data. FGS is a kind of logic programming system which directly deals with graphs just like first order terms. And our method employs a refutably inductive inference algorithm as a learning algorithm. A refutably inductive inference algorithm is a special type of inductive inference algorithm with refutability of hypothesis spaces, and is suitable for knowledge discovery. We give a sufficiently large hypothesis space, the set of weakly reducing FGS programs. And we show that this hypothesis space is refutably inferable from complete data. We have designed and implemented a prototype of a knowledge discovery system KD-FGS, which is based on our method and acquires knowledge directly from graph structured data. Finally we discuss the applicability of our method for graph structured data with experimental results on some graph theoretical notions.

  • A Structured Video Handling Technique for Multimedia Systems

    Yoshinobu TONOMURA  Akihito AKUTSU  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E78-D No:6
      Page(s):
    764-777

    This paper proposes a functional video handling technique based on structured video. The video handling architecture, which includes a video data structure, file management structure, and visual interface structure, is introduced as the core concept of this technique. One of the key features of this architecture is that the newly proposed video indexing method is performed automatically based on image processing. The video data structure, which plays an important role in the architecture, has two kinds of data structures: content and node. The central idea behind these structures is to separate the video contents from the processing operations and to create links between them. Video indexes work as a backend mechanism in structuring video content. A prototype video handling system called the MediaBENCH, a hypermedia basic environment for computer and human interactions, which demonstrates the actual implementation of the proposed concept and technique, is described. Basic functions such as browsing and editing, which are achieved based on the architecture, exhibit the advantages of structured video handling. The concept and the methods proposed in this paper assure various video-computer applications, which will play major roles in the multimedia field.