Author Search Result

[Author] Takashi WASHIO(6hit)

  • Efficient Graph Sequence Mining Using Reverse Search

    Akihiro INOKUCHI  Hiroaki IKUTA  Takashi WASHIO  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E95-D  No: 7  Page(s): 1947-1958

    The mining of frequent subgraphs from labeled graph data has been studied extensively. Furthermore, much attention has recently been paid to frequent pattern mining from graph sequences. A method, called GTRACE, has been proposed to mine frequent patterns from graph sequences under the assumption that changes in graphs are gradual. Although GTRACE mines the frequent patterns efficiently, it still needs substantial computation time to mine the patterns from graph sequences containing large graphs and long sequences. In this paper, we propose a new version of GTRACE that permits efficient mining of frequent patterns based on the principle of a reverse search. The underlying concept of the reverse search is a general scheme for designing efficient algorithms for hard enumeration problems. Our performance study shows that the proposed method is efficient and scalable for mining both long and large graph sequence patterns and is several orders of magnitude faster than the original GTRACE.

  • Discovery of Laws

    Hiroshi MOTODA  Takashi WASHIO  

     
    INVITED PAPER

    Vol: E83-D  No: 1  Page(s): 44-51

    Methods for discovering laws are reviewed from both the statistical and the artificial intelligence approaches, with more emphasis placed on the latter. The dimensions discussed are variable dependency checking, passive or active data gathering, single or multiple law discovery, static (equilibrium) or dynamic (transient) behavior, quantitative (numeric), qualitative, or structural law discovery, and the use of domain-general knowledge. Some representative discovery systems are also briefly discussed in conjunction with the methods used along these dimensions.

  • FRISSMiner: Mining Frequent Graph Sequence Patterns Induced by Vertices

    Akihiro INOKUCHI  Takashi WASHIO  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E95-D  No: 6  Page(s): 1590-1602

    The mining of a complete set of frequent subgraphs from labeled graph data has been studied extensively. Furthermore, much attention has recently been paid to frequent pattern mining from graph sequences (dynamic graphs or evolving graphs). In this paper, we define a novel subgraph subsequence class called an “induced subgraph subsequence” to enable the efficient mining of a complete set of frequent patterns from graph sequences containing large graphs and long sequences. We also propose an efficient method for mining frequent patterns, called “FRISSs (Frequent, Relevant, and Induced Subgraph Subsequences)”, from graph sequences. The fundamental performance of the method is evaluated using artificial datasets, and its practicality is confirmed through experiments using a real-world dataset.

  • Fast and Accurate PSD Matrix Estimation by Row Reduction

    Hiroshi KUWAJIMA  Takashi WASHIO  Ee-Peng LIM  

     
    PAPER-Fundamentals of Information Systems

    Vol: E95-D  No: 11  Page(s): 2599-2612

    Fast and accurate estimation of missing relations among objects, e.g., similarity, distance, and kernel values, is now one of the most important techniques required by major data mining tasks, because the missing relation information is needed in many application areas such as economics, psychology, and social network communities. Though some approaches have been proposed in the last several years, the practical balance between their required amount of computation and the accuracy obtained is insufficient for some classes of relation estimation. The objective of this paper is to formalize the problem of quickly and efficiently estimating missing relations among objects from the other known relations among the objects, and to propose techniques called “PSD Estimation” and “Row Reduction” for this estimation problem. These techniques use a characteristic of the relations named “Positive Semi-Definiteness (PSD)” and a special assumption for known relations in a matrix. The superior performance of our approach in both efficiency and accuracy is demonstrated through an evaluation based on artificial and real-world data sets.

  • GTRACE: Mining Frequent Subsequences from Graph Sequences

    Akihiro INOKUCHI  Takashi WASHIO  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E93-D  No: 10  Page(s): 2792-2804

    In recent years, the mining of a complete set of frequent subgraphs from labeled graph data has been studied extensively. However, to the best of our knowledge, no method has been proposed for finding frequent subsequences of graphs from a set of graph sequences. In this paper, we define a novel class of graph subsequences by introducing axiomatic rules for graph transformations, their admissibility constraints, and a union graph. Then we propose an efficient approach named "GTRACE" for enumerating frequent transformation subsequences (FTSs) of graphs from a given set of graph sequences. The fundamental performance of the proposed method is evaluated using artificial datasets, and its practicality is confirmed by experiments using real-world datasets.

  • Density-Based Spam Detector

    Kenichi YOSHIDA  Fuminori ADACHI  Takashi WASHIO  Hiroshi MOTODA  Teruaki HOMMA  Akihiro NAKASHIMA  Hiromitsu FUJIKAWA  Katsuyuki YAMAZAKI  

     
    PAPER-Internet Systems

    Vol: E87-D  No: 12  Page(s): 2678-2688

    The volume of mass unsolicited electronic mail, often known as spam, has recently increased enormously and has become a serious threat not only to the Internet but also to society. This paper proposes a new spam detection method that uses document space density information. Although the proposed method requires extensive e-mail traffic to acquire the necessary information, it can achieve perfect detection (i.e., both recall and precision are 100%) under practical conditions. A direct-mapped cache method contributes to the handling of over 13,000 e-mail messages per second. Results of experiments conducted using over 50 million actual e-mail messages are also reported in this paper.
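
The abstract above mentions a direct-mapped cache but does not describe its layout. As a rough illustration only, the sketch below shows how a generic direct-mapped table could count repeated message fingerprints in constant time per message. The class name `DirectMappedCounter`, the SHA-1 fingerprint, the slot count, and the evict-on-collision policy are all assumptions made for this example and are not taken from the paper.

```python
import hashlib


class DirectMappedCounter:
    """Hypothetical direct-mapped table: each fingerprint maps to exactly one slot.

    There is no probing or chaining, so every lookup touches a single slot.
    A colliding fingerprint simply evicts the previous occupant.
    """

    def __init__(self, num_slots: int = 1024) -> None:
        self.num_slots = num_slots
        self.tags = [None] * num_slots   # fingerprint currently held by each slot
        self.counts = [0] * num_slots    # occurrences seen for that fingerprint

    @staticmethod
    def _fingerprint(text: str) -> int:
        # Assumption: a plain hash of the message text stands in for whatever
        # document representation the real detector uses.
        digest = hashlib.sha1(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def observe(self, text: str) -> int:
        """Record one message and return how often its fingerprint has been seen."""
        fp = self._fingerprint(text)
        slot = fp % self.num_slots
        if self.tags[slot] == fp:
            self.counts[slot] += 1
        else:
            # Collision or empty slot: overwrite the old entry and restart the count.
            self.tags[slot] = fp
            self.counts[slot] = 1
        return self.counts[slot]


if __name__ == "__main__":
    counter = DirectMappedCounter(num_slots=8)
    for message in ["buy now", "hello Bob", "buy now", "buy now"]:
        print(message, counter.observe(message))
```

A high count for one fingerprint indicates many near-identical messages, which is one simple reading of "density"; the eviction policy trades some counting accuracy for a fixed memory footprint and a single slot access per message.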