IEICE global.ieice.org Site

Keyword Search Result

[Keyword] structured data(7hit)

1-7hit

Recent Advances and Trends in Large-Scale Kernel Methods
Hisashi KASHIMA Tsuyoshi IDE Tsuyoshi KATO Masashi SUGIYAMA

INVITED PAPER

Vol:
E92-D No:7
Page(s):
1338-1353
Kernel methods such as the support vector machine are one of the most successful algorithms in modern machine learning. Their advantage is that linear algorithms are extended to non-linear scenarios in a straightforward way by the use of the kernel trick. However, naive use of kernel methods is computationally expensive since the computational complexity typically scales cubically with respect to the number of training samples. In this article, we review recent advances in the kernel methods, with emphasis on scalability for massive problems.
Efficient Substructure Discovery from Large Semi-Structured Data
Tatsuya ASAI Kenji ABE Shinji KAWASOE Hiroshi SAKAMOTO Hiroki ARIMURA Setsuo ARIKAWA

PAPER-Data Mining

Vol:
E87-D No:12
Page(s):
2754-2763
In this paper, we consider a data mining problem for semi-structured data. Modeling semi-structured data as labeled ordered trees, we present an efficient algorithm for discovering frequent substructures from a large collection of semi-structured data. By extending the enumeration technique developed by Bayardo (SIGMOD'98) for discovering long itemsets, our algorithm scales almost linearly in the total size of maximal tree patterns contained in an input collection depending mildly on the size of the longest pattern. We also developed several pruning techniques that significantly speed-up the search. Experiments on Web data show that our algorithm runs efficiently on real-life datasets combined with proposed pruning techniques in the wide range of parameters.
Adaptive Data Class Structure for Efficient Inter-Vehicle Communication
Lachlan B. MICHAEL Ryuji KOHNO

PAPER

Vol:
E85-D No:11
Page(s):
1830-1838
To reduce the bandwidth needed for data transmission in an ad-hoc communication network, such as an Intelligent Transportation System (ITS) inter-vehicle communication network, a broadcast scheme is proposed where the data to be transmitted is arranged into several classes. Each class contains more specific and detailed information. Since information to be transmitted often has a geographical relevance, the classes can be structured to represent this relationship. As data is routed through the ad-hoc network, the total amount of transmitted data is reduced by removing the data contained in one class on each hop. The class structure is adaptive so that in unforeseen situations the relative importance of transmitted data can be dynamically adjusted. Furthermore different manufacturers can implement different classes structures, and total length of data may be different. By computer simulation it was shown that in the proposed system the required bandwidth for transmission to achieve similar data reception rates to conventional non-structured data schemes can be reduced to less than one third, resulting in a more efficient transmission scheme. In addition a packet structure similar to IP packets is proposed which will enable easily integration of multimedia transmissions into vehicle to vehicle communications.
Complexity and a Method of Extracting a Database Schema over Semistructured Documents
Nobutaka SUZUKI Yoichirou SATO Michiyoshi HAYASE

PAPER-Databases

Vol:
E85-D No:6
Page(s):
940-949
Semistructured data comprises irregular structure and has no a-priori database schema, therefore we encounter several problems such as inefficient data retrieval and wasteful data storage. To cope with such problems, some schema extraction algorithms over semistructured data have been proposed, in which data is modeled as an unordered tree. However, the order of elements is indispensable for document data, therefore we consider extracting an optimal database schema over an ordered tree. We consider an optimization problem to extract a smallest database schema such that the density of each class is no less than a given threshold, where the density of a class represents a similarity between the type of the class and those of the objects in the class. We first prove that the corresponding decision problem is strongly NP-complete, and show that another version of the problem is strongly NP-hard and belongs to Δ2 P. Then we show that for any r < 3/2, there is no polynomial-time r-approximation algorithm that solves the optimization problem unless P = NP. Finally, we propose a kind of class called bounded class that can be constructed efficiently, then show a polynomial-time algorithm for constructing a database schema by using bounded classes.
Extracting Typical Classes and a Database Schema from Semistructured Data
Nobutaka SUZUKI Yoichirou SATO Michiyoshi HAYASE

PAPER-Databases

Vol:
E84-D No:1
Page(s):
100-112
Semistructured data has no a-priori schema information, which causes some problems such as inefficient storage and query execution. To cope with such problems, extracting schema information from semistructured data has been an important issue. However, in most cases optimal schema information cannot be extracted efficiently, and few efficient approximation algorithms have been proposed. In this paper, we consider an approximation algorithm for extracting "typical" classes from semistructured data. Intuitively, a class C is said to be typical if the structure of C is "similar" to those of "many" objects. We present the following results. First, we prove that the problem of deciding if a typical class can be extracted from given semistructured data is NP-complete. Second, we present an approximation algorithm for extracting typical classes from given semistructured data, and show a sufficient condition for the approximation algorithm to run in polynomial time. Finally, by using extracted classes obtained by the approximation algorithm, we propose a polynomial-time algorithm for constructing a set R of classes such that R covers all the objects to form a database schema.
Discovering Knowledge from Graph Structured Data by Using Refutably Inductive Inference of Formal Graph Systems
Tetsuhiro MIYAHARA Tomoyuki UCHIDA Takayoshi SHOUDAI Tetsuji KUBOYAMA Kenichi TAKAHASHI Hiroaki UEDA

PAPER

Vol:
E84-D No:1
Page(s):
48-56
We present a new method for discovering knowledge from structured data which are represented by graphs in the framework of Inductive Logic Programming. A graph, or network, is widely used for representing relations between various data and expressing a small and easily understandable hypothesis. The analyzing system directly manipulating graphs is useful for knowledge discovery. Our method uses Formal Graph System (FGS) as a knowledge representation language for graph structured data. FGS is a kind of logic programming system which directly deals with graphs just like first order terms. And our method employs a refutably inductive inference algorithm as a learning algorithm. A refutably inductive inference algorithm is a special type of inductive inference algorithm with refutability of hypothesis spaces, and is suitable for knowledge discovery. We give a sufficiently large hypothesis space, the set of weakly reducing FGS programs. And we show that this hypothesis space is refutably inferable from complete data. We have designed and implemented a prototype of a knowledge discovery system KD-FGS, which is based on our method and acquires knowledge directly from graph structured data. Finally we discuss the applicability of our method for graph structured data with experimental results on some graph theoretical notions.
A Structured Video Handling Technique for Multimedia Systems
Yoshinobu TONOMURA Akihito AKUTSU

PAPER-Image Processing, Computer Graphics and Pattern Recognition

Vol:
E78-D No:6
Page(s):
764-777
This paper proposes a functional video handling technique based on structured video. The video handling architecture, which includes a video data structure, file management structure, and visual interface structure, is introduced as the core concept of this technique. One of the key features of this architecture is that the newly proposed video indexing method is performed automatically based on image processing. The video data structure, which plays an important role in the architecture, has two kinds of data structures: content and node. The central idea behind these structures is to separate the video contents from the processing operations and to create links between them. Video indexes work as a backend mechanism in structuring video content. A prototype video handling system called the MediaBENCH, a hypermedia basic environment for computer and human interactions, which demonstrates the actual implementation of the proposed concept and technique, is described. Basic functions such as browsing and editing, which are achieved based on the architecture, exhibit the advantages of structured video handling. The concept and the methods proposed in this paper assure various video-computer applications, which will play major roles in the multimedia field.