
Author Search Result

[Author] Tatsuya AKUTSU (21 hits)


  • A Linear Time Pattern Matching Algorithm between a String and a Tree

    Tatsuya AKUTSU  

    PAPER-Algorithm and Computational Complexity

    E77-D No:3

    This paper presents a linear time algorithm for testing whether or not there is a path $\langle v_1, \ldots, v_m \rangle$ of an undirected tree $T$ ($|V(T)| = n$) that coincides with a string $s = s_1 \cdots s_m$ (i.e., $label(v_1) \cdots label(v_m) = s_1 \cdots s_m$). Since any path of the tree is allowed, linear time substring matching algorithms cannot be directly applied and a new method is developed. In the algorithm, $O(n/m)$ vertices are selected from $V(T)$ such that any path of length more than $m - 2$ must contain at least one of the selected vertices. A search is performed using the selected vertices as 'bases' and two tables of size $O(m)$ are constructed for each of the selected vertices. A suffix tree, which is a well-known data structure in string matching, is used effectively in the algorithm. From each of the selected vertices, a search is performed while traversing the suffix tree associated with $s$. Although the size of the alphabet is assumed to be bounded by a constant in this paper, the algorithm can be applied to the case of unbounded alphabets by increasing the time complexity to $O(n \log m)$.
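
    To make the problem statement concrete, the following is a minimal brute-force sketch, not the paper's linear time algorithm. It tries every vertex as $v_1$ and extends the match without stepping back to the previous vertex, which in a tree always yields a simple path. The function name, the edge-list input, and the `labels` dictionary are assumptions made for illustration. The worst-case running time is quadratic in $n$.

```python
from collections import defaultdict


def path_spells_string(edges, labels, s):
    """Return True if some simple path v_1..v_m of the tree spells the string s.

    edges  -- list of (u, v) pairs describing an undirected tree (hypothetical input format)
    labels -- dict mapping each vertex to a one-character label
    """
    if not s:
        return True
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def extend(vertex, parent, i):
        # Invariant: labels[vertex] == s[i]. Try to match s[i+1:] without going back.
        if i == len(s) - 1:
            return True
        return any(
            labels[w] == s[i + 1] and extend(w, vertex, i + 1)
            for w in adj[vertex]
            if w != parent
        )

    return any(labels[v] == s[0] and extend(v, None, 0) for v in labels)


# A star-like tree: 0 - 1, 1 - 2, 1 - 3 with labels a, b, c, d.
tree_edges = [(0, 1), (1, 2), (1, 3)]
tree_labels = {0: "a", 1: "b", 2: "c", 3: "d"}
```

    For this tree, `path_spells_string(tree_edges, tree_labels, "cbd")` succeeds via the path 2, 1, 3, while `"acb"` fails because vertices 0 and 2 are not adjacent.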

  • On Finding a Fixed Point in a Boolean Network with Maximum Indegree 2

    Tatsuya AKUTSU  Takeyuki TAMURA  


    E92-A No:8

    Finding fixed points in discrete dynamical systems is important because fixed points correspond to steady states. The Boolean network is considered as one of the simplest discrete dynamical systems and is often used as a model of genetic networks. It is known that detection of a fixed point in a Boolean network with $n$ nodes and maximum indegree $K$ can be polynomially transformed into $(K+1)$-SAT with $n$ variables. In this paper, we focus on the case of $K=2$ and present an $O(1.3171^n)$ expected time algorithm, which is faster than the naive algorithm based on a reduction to 3-SAT, where we assume that nodes with indegree 2 do not contain self-loops. We also show an algorithm for the general case of $K=2$ that is slightly faster than the naive algorithm.
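
    The transformation into $(K+1)$-SAT mentioned in the abstract can be sketched as follows. This is one straightforward encoding under stated assumptions, not necessarily the construction used in the paper: for each node $i$ and each assignment of its inputs, one clause forbids the combination "inputs take these values and $x_i$ differs from the function output". The network representation (a list of `(inputs, f)` pairs) and the helper names are hypothetical.

```python
from itertools import product


def fixed_point_cnf(network):
    """Encode the condition x_i = f_i(inputs) for all i as CNF clauses.

    network[i] = (inputs, f): node i is updated to f(*values of the input nodes).
    Clauses are lists of signed 1-based literals (DIMACS style); each clause
    has at most K+1 literals when every node has at most K inputs.
    """
    clauses = []
    for i, (inputs, f) in enumerate(network):
        for values in product((0, 1), repeat=len(inputs)):
            out = f(*values)
            # Literal that is true when input j does NOT take value v.
            clause = [-(j + 1) if v else (j + 1) for j, v in zip(inputs, values)]
            # Literal that is true when x_i equals the function output.
            clause.append((i + 1) if out else -(i + 1))
            clauses.append(clause)
    return clauses


def models(clauses, n):
    """Brute-force all satisfying assignments (only for checking small examples)."""
    def holds(lit, state):
        return state[abs(lit) - 1] == (1 if lit > 0 else 0)
    return [
        state
        for state in product((0, 1), repeat=n)
        if all(any(holds(lit, state) for lit in clause) for clause in clauses)
    ]


# Hypothetical network with K = 2: x0' = x1 AND x2, x1' = x0 OR x2, x2' = x0 AND x1.
example_network = [
    ((1, 2), lambda a, b: a & b),
    ((0, 2), lambda a, b: a | b),
    ((0, 1), lambda a, b: a & b),
]
example_clauses = fixed_point_cnf(example_network)
```

    Each node with two inputs contributes four clauses of three literals, so the satisfying assignments of `example_clauses` are exactly the fixed points of the example network.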

  • Integer Programming-Based Approach to Attractor Detection and Control of Boolean Networks

    Tatsuya AKUTSU  Yang ZHAO  Morihiro HAYASHIDA  Takeyuki TAMURA  

    PAPER-Fundamentals of Information Systems

    E95-D No:12

    The Boolean network (BN) can be used to create discrete mathematical models of gene regulatory networks. In this paper, we consider three problems on BNs that are known to be NP-hard: detection of a singleton attractor, finding a control strategy that shifts a BN from a given initial state to the desired state, and control of attractors. We propose integer programming-based methods which solve these problems in a unified manner. Then, we present results of computational experiments which suggest that the proposed methods are useful for solving moderate size instances of these problems. We also show that control of attractors is $\Sigma_2^p$-hard, which suggests that control of attractors is harder than the other two problems.

  • An RNC Algorithm for Finding a Largest Common Subtree of Two Trees

    Tatsuya AKUTSU  


    E75-D No:1

    It is known that the problem of finding a largest common subgraph is NP-hard for general graphs even if the number of input graphs is two. It is also known that the problem can be solved in polynomial time if the input is restricted to two trees. In this paper, a randomized parallel (RNC) algorithm for finding a largest common subtree of two trees is presented. The dynamic tree contraction technique and the RNC minimum weight perfect matching algorithm are used to obtain the RNC algorithm. Moreover, an efficient NC algorithm is presented for the case where input trees are of bounded vertex degree. It works in $O(\log(n_1)\log(n_2))$ time using $O(n_1 n_2)$ processors on a CREW PRAM, where $n_1$ and $n_2$ denote the numbers of vertices of the input trees. It is also proved that the problem is NP-hard if the number of input trees is more than two. The three-dimensional matching problem, a well-known NP-complete problem, is reduced to the problem of finding a largest common subtree of three trees.

  • Detecting a Singleton Attractor in a Boolean Network Utilizing SAT Algorithms

    Takeyuki TAMURA  Tatsuya AKUTSU  

    PAPER-Algorithms and Data Structures

    E92-A No:2

    The Boolean network (BN) is a mathematical model of genetic networks. It is known that detecting a singleton attractor, which is also called a fixed point, is NP-hard even for AND/OR BNs (i.e., BNs consisting of AND/OR nodes), where singleton attractors correspond to steady states. Though a naive algorithm can detect a singleton attractor for an AND/OR BN in $O(n 2^n)$ time, no $O((2-\varepsilon)^n)$ ($\varepsilon > 0$) time algorithm was known even for an AND/OR BN with non-restricted indegree, where $n$ is the number of nodes in a BN. In this paper, we present an $O(1.787^n)$ time algorithm for detecting a singleton attractor of a given AND/OR BN, along with related results. We also show that detection of a singleton attractor in a BN with maximum indegree two is NP-hard and can be polynomially reduced to a satisfiability problem.
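
    The naive exhaustive approach mentioned in the abstract can be sketched as below: enumerate all $2^n$ states and keep those that every node's update function maps to itself. This is only the baseline, not the paper's faster algorithm. The function name and the representation of a BN as a list of Python callables are assumptions for illustration.

```python
from itertools import product


def singleton_attractors(functions):
    """Return all states x with f_i(x) == x_i for every node i (exhaustive search).

    functions[i] takes the full state tuple and returns the next value of node i.
    """
    n = len(functions)
    return [
        state
        for state in product((0, 1), repeat=n)
        if all(f(state) == state[i] for i, f in enumerate(functions))
    ]


# Hypothetical 3-node AND/OR network:
#   x0' = x1 AND x2,  x1' = x0 OR x2,  x2' = x0 AND x1
and_or_bn = [
    lambda x: x[1] & x[2],
    lambda x: x[0] | x[2],
    lambda x: x[0] & x[1],
]
```

    For this small network the search finds the two fixed points `(0, 0, 0)` and `(1, 1, 1)`. The loop over states is what makes the cost grow as $2^n$.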

  • Kernel Methods for Chemical Compounds: From Classification to Design (Open Access)

    Tatsuya AKUTSU  Hiroshi NAGAMOCHI  


    E94-D No:10

    In this paper, we briefly review kernel methods for analysis of chemical compounds, focusing on the authors' works. We begin with a brief review of existing kernel functions that are used for classification of chemical compounds and prediction of their activities. Then, we focus on the pre-image problem for chemical compounds, which is to infer a chemical structure that is mapped to a given feature vector, and which has a potential application to the design of novel chemical compounds. In particular, we consider the pre-image problem for feature vectors consisting of frequencies of labeled paths of length at most $K$. We present several time complexity results, which include an NP-hardness result for a general case, a polynomial time algorithm for tree structured compounds with fixed $K$, and a polynomial time algorithm for $K=1$ based on graph detachment. Then we review practical algorithms for the pre-image problem, which are based on enumeration of chemical structures satisfying given constraints. We also briefly review related results, which include efficient enumeration of stereoisomers of tree-like chemical compounds and efficient enumeration of outerplanar graphs.

  • A Polynomial Time Algorithm for Finding a Largest Common Subgraph of almost Trees of Bounded Degree

    Tatsuya AKUTSU  

    PAPER-Algorithms, Data Structures and Computational Complexity

    E76-A No:9

    This paper considers the problem of finding a largest common subgraph of graphs, which is an important problem in chemical synthesis. It is known that the problem is NP-hard even if graphs are restricted to planar graphs of vertex degree at most three. A graph is called an almost tree if $|E(B)| \le |V(B)| + K$ holds for every block $B$, where $K$ is a constant. In this paper, a polynomial time algorithm is presented for finding a largest common subgraph of two graphs that are connected almost trees of bounded vertex degree. The algorithm is an extension of a subtree isomorphism algorithm which is based on dynamic programming. Moreover, it is shown that the degree bound is essential. That is, the problem of finding a largest common subgraph of two connected almost trees is proved to be NP-hard for any $K \ge 0$ if the degree is not bounded. The three-dimensional matching problem, a well-known NP-complete problem, is reduced to the problem.

  • A Parallel Algorithm for Determining the Congruence of Point Sets in Three-Dimensions

    Tatsuya AKUTSU  

    PAPER-Algorithm and Computational Complexity

    E78-D No:4

    This paper describes an $O(\log^3 n)$ time, $O(n/\log n)$ processor parallel algorithm for determining the congruence (exact matching) of two point sets in three dimensions on a CREW PRAM, where $n$ is the maximum size of the input point sets. Although optimal $O(n \log n)$ time sequential algorithms were developed for this problem, no efficient parallel algorithm was known previously. In the algorithm, the original problem is reduced to the two-dimensional congruence problem by computing a three-dimensional point set $cps(S)$ for each input point set $S$, where $cps(S)$ satisfies the following conditions: $0 < |cps(S)| \le 12$; $cps(T(S)) = T(cps(S))$ for all isometric transformations $T$. The two-dimensional problem can be solved efficiently in parallel using a parallel version of a previously known sequential algorithm. $cps(S)$ is computed recursively in the following way: the size of a point set is reduced by a constant factor in each recursive step. To reduce the size of a point set, a convex hull is constructed and then regarded as a planar graph, so that combinatorial properties of a planar graph are used effectively. A sequential version of the algorithm works in $O(n \log n)$ time, so this paper gives another optimal sequential algorithm. The presented algorithm can be applied to graphs such that each vertex corresponds to a point and each edge corresponds to a line segment connecting its endpoints. Moreover, the algorithm can be modified for computing the canonical form of a point set or a graph.

  • Approximation Algorithms for Optimal RNA Secondary Structures Common to Multiple Sequences

    Takeyuki TAMURA  Tatsuya AKUTSU  


    E90-A No:5

    It is well known that a basic version (i.e., maximizing the number of base-pairs) of the RNA secondary structure prediction problem can be solved in $O(n^3)$ time by using simple dynamic programming procedures. For this problem, an $O(n^3 (\log \log n)^{1/2} / (\log n)^{1/2})$ time exact algorithm and an $O(n^{2.776} + (1/\varepsilon)^{O(1)})$ time approximation algorithm which has guaranteed approximation ratio $1-\varepsilon$ for any positive constant $\varepsilon$ are also known. Moreover, when two RNA sequences are given, there is an $O(n^6)$ time exact algorithm which can optimize structure and alignments. In this paper, we show an $O(n^5)$ time approximation algorithm for optimizing structure and alignments of two RNA sequences, assuming that the optimal number of base-pairs is more than $O(n^{0.75})$. We also show that the problem of optimizing structure and alignments for given $N$ sequences is NP-hard and introduce a constant-factor approximation algorithm.
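
    The simple cubic-time dynamic programming procedure for the basic version can be sketched as follows. This is the textbook Nussinov-style recurrence for maximizing the number of non-crossing base-pairs, shown only as background; it is not the approximation algorithm of the paper. Allowing only the A-U and G-C pairs and using a `min_loop` parameter for the minimum hairpin length are assumptions of this sketch.

```python
def max_base_pairs(seq, min_loop=0):
    """Maximum number of non-crossing base-pairs in seq, in O(n^3) time.

    dp[i][j] holds the optimum for the substring seq[i..j].
    min_loop is the minimum number of unpaired bases between paired positions.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case 1: position i stays unpaired
            # case 2: position i pairs with some k, splitting the interval in two
            for k in range(i + 1 + min_loop, j + 1):
                if (seq[i], seq[k]) in pairs:
                    inside = dp[i + 1][k - 1] if k - 1 >= i + 1 else 0
                    outside = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + outside)
            dp[i][j] = best
    return dp[0][n - 1]
```

    For example, `max_base_pairs("GGGAAACCC")` pairs each G with a C in a nested fashion and leaves the three A bases unpaired. The three nested loops over `span`, `i`, and `k` account for the $O(n^3)$ bound.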

  • On the Complexity of Inference and Completion of Boolean Networks from Given Singleton Attractors

    Hao JIANG  Takeyuki TAMURA  Wai-Ki CHING  Tatsuya AKUTSU  

    PAPER-General Fundamentals and Boundaries

    E96-A No:11

    In this paper, we consider the problem of inferring a Boolean network (BN) from a given set of singleton attractors, where it is required that the resulting BN has the same set of singleton attractors as the given one. We show that the problem can be solved in linear time if the number of singleton attractors is at most two and each Boolean function is restricted to be a conjunction or disjunction of literals. We also show that the problem can be solved in polynomial time if more general Boolean functions can be used. In addition to the inference problem, we study two network completion problems from a given set of singleton attractors: adding the minimum number of edges to a given network, and assigning Boolean functions to all nodes when only the network structure of a BN is given. In particular, we show that the latter problem cannot be solved in polynomial time unless P=NP, by means of a polynomial-time Turing reduction from the complement of the another solution problem for the Boolean satisfiability problem.

  • Algorithms for Finding the Largest Subtree whose Copies Cover All the Leaves

    Tatsuya AKUTSU  Satoshi KOBAYASHI  Koichi HORI  Setsuo OHSUGA  

    LETTER-Algorithm and Computational Complexity

    E76-D No:6

    This paper presents efficient algorithms for finding the largest tree $S$ such that there are vertex-disjoint subtrees $S_1, \ldots, S_k$ ($k > 1$) of $T$, each of which is isomorphic to $S$, and every leaf of $T$ is a leaf of some $S_i$. The algorithms are useful for learning a macro table.

  • Measuring the Similarity of Protein Structures Using Image Compression Algorithms

    Morihiro HAYASHIDA  Tatsuya AKUTSU  

    PAPER-Artificial Intelligence, Data Mining

    E94-D No:12

    For measuring the similarity of biological sequences and structures such as DNA sequences, protein sequences, and tertiary structures, several compression-based methods have been developed. However, they are based on compression algorithms only for sequential data. Protein structures, for instance, can be represented by two-dimensional distance matrices. Therefore, it is expected that image compression is useful for measuring the similarity of protein structures because image compression algorithms compress data horizontally and vertically. This paper proposes a series of methods for measuring the similarity of protein structures. In the methods, an original protein structure is transformed into a distance matrix, which is regarded as a two-dimensional image. Then, the similarity of two protein structures is measured by a kind of compression ratio of the concatenated image. We employed several image compression algorithms: JPEG, GIF, PNG, IFS, and SPC. Since SPC often gave better results than the other image compression methods and is simple and easy to modify, we modified SPC and obtained MSPC. We applied the proposed methods to clustering of protein structures, and performed Receiver Operating Characteristic (ROC) analysis. The results of computational experiments suggest that MSPC has the best performance among existing compression-based methods. We also present some theoretical results on the time complexity and Kolmogorov complexity of image compression-based protein structure comparison.
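
    The general idea of comparing structures through the compressed size of concatenated distance matrices can be illustrated with a minimal sketch. It uses the normalized compression distance with `zlib` as a stand-in compressor. Both choices are assumptions made here; this is not the paper's MSPC method, and `zlib` is a sequential compressor rather than an image compressor. The quantization to one byte per distance and the `max_dist` scale are likewise hypothetical.

```python
import math
import zlib


def distance_matrix_bytes(points, max_dist=64.0):
    """Flatten the pairwise distance matrix into bytes, one grey level per entry."""
    out = bytearray()
    for p in points:
        for q in points:
            out.append(min(255, int(255 * math.dist(p, q) / max_dist)))
    return bytes(out)


def ncd(x, y):
    """Normalized compression distance: small when y adds little beyond x."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)


# Two made-up point sequences: a regular helix and an irregular scatter.
helix = [(math.cos(0.6 * i), math.sin(0.6 * i), 0.3 * i) for i in range(30)]
scatter = [((7 * i * i) % 23, (11 * i) % 17, (5 * i ** 3) % 19) for i in range(30)]
img_helix = distance_matrix_bytes(helix)
img_scatter = distance_matrix_bytes(scatter)
```

    Comparing `ncd(img_helix, img_helix)` with `ncd(img_helix, img_scatter)` shows the intended behavior: a structure concatenated with itself compresses almost as well as a single copy, while two unrelated structures do not share much redundancy.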

  • An Efficient Method of Computing Impact Degrees for Multiple Reactions in Metabolic Networks with Cycles

    Takeyuki TAMURA  Yang CONG  Tatsuya AKUTSU  Wai-Ki CHING  

    PAPER-Fundamentals of Information Systems

    E94-D No:12

    The impact degree is a measure of the robustness of a metabolic network against deletion of single or multiple reaction(s). Although such a measure is useful for mining important enzymes/genes, it was defined only for networks without cycles. In this paper, we extend the impact degree to metabolic networks containing cycles and develop a simple algorithm to calculate the impact degree. Furthermore, we improve this algorithm to reduce the computation time for the impact degree under deletions of multiple reactions. We applied our method to the metabolic network of E. coli, which includes reference pathways and consists of 3281 reaction nodes and 2444 compound nodes, downloaded from the KEGG database, and calculated the distribution of the impact degree. The results of our computational experiments show that the improved algorithm is 18.4 times faster than the simple algorithm for deletion of reaction-pairs and 11.4 times faster for deletion of reaction-triplets. We also enumerate genes with high impact degrees for single and multiple reaction deletions.

  • Exact Algorithms for Finding a Minimum Reaction Cut under a Boolean Model of Metabolic Networks

    Takeyuki TAMURA  Tatsuya AKUTSU  

    PAPER-Algorithms and Data Structures

    E93-A No:8

    A reaction cut is a set of chemical reactions whose deletion blocks the operation of given reactions or the production of given chemical compounds. In this paper, we study two problems, ReactionCut and MD-ReactionCut, for calculating the minimum reaction cut of a metabolic network under a Boolean model. These problems are based on the flux balance model and the minimal damage model, respectively. We show that ReactionCut and MD-ReactionCut are NP-hard even if the maximum outdegree of reaction nodes ($K_{out}$) is one. We also present $O(1.822^n)$, $O(1.959^n)$ and $o(2^n)$ time algorithms for MD-ReactionCut with $K_{out} = 2, 3, k$ respectively, where $n$ is the number of reaction nodes and $k$ is a constant. The same algorithms also work for ReactionCut if there is no directed cycle. Furthermore, we present a $2^{O((\log n)\sqrt{n})}$ time algorithm, which is faster than $O((1+\varepsilon)^n)$ for any positive constant $\varepsilon$, for the planar case of MD-ReactionCut under a reasonable constraint, utilizing Lipton and Tarjan's separator algorithm.

  • Protein Structure Alignment Using Dynamic Programing and Iterative Improvement

    Tatsuya AKUTSU  

    PAPER-Algorithm and Computational Complexity

    E79-D No:12

    In this paper, we consider the protein structure alignment problem, which is a very important problem in molecular biology. Since an outline of protein structure is represented by a sequence of points in three-dimensional space, this problem is defined as the following geometric pattern matching problem: given two point sequences $P$ and $Q$ in three dimensions and a real number $\delta > 0$, find a maximum-cardinality set of point pairs such that the distance between each pair is at most $\delta$ under the condition that any translation and rotation can be applied to $P$. Since it is very difficult to solve this problem exactly, we consider algorithms that solve it approximately. We propose three algorithms, BASICALIGN, RANDALIGN and FRAGALIGN, whose worst case time complexities are $O(n^8)$, $O((n^7/k^3)\,\mathrm{polylog}(n))$ and $O(n^4)$ respectively, where $n$ denotes the size of the larger input structure and $k$ denotes the minimum size of the alignment to be obtained. All of these have the following common framework: a series of initial superpositions is computed; for each such superposition, a rough alignment is first computed using a dynamic programming technique, and then it is refined through an iterative improvement procedure which also uses dynamic programming; the best alignment among them is selected as the output. The difference among the three algorithms lies in the methods of finding initial superpositions. BASICALIGN, RANDALIGN and FRAGALIGN use exhaustive search, a random sampling technique and fragment-based search, respectively. We prove guaranteed approximation ratios (in the sense of distances between point pairs) for theoretical versions of BASICALIGN and RANDALIGN. Practical versions of RANDALIGN and FRAGALIGN were implemented and compared with a previous algorithm using real protein structure data. The experimental results show that FRAGALIGN is the best among them and that it outputs good alignments quickly.
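
    The dynamic programming step of the common framework can be illustrated with a minimal sketch for a single, already fixed superposition. It assumes that the alignment must preserve the order of both point sequences and that $P$ has already been translated and rotated. Finding good superpositions and the iterative refinement are not shown, and the function name is hypothetical. The recurrence is the usual longest-common-subsequence scheme with "distance at most $\delta$" in place of character equality.

```python
import math


def align_fixed_superposition(P, Q, delta):
    """Largest order-preserving set of index pairs (i, j) with dist(P[i], Q[j]) <= delta."""
    n, m = len(P), len(Q)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
            if math.dist(P[i - 1], Q[j - 1]) <= delta:
                dp[i][j] = max(dp[i][j], dp[i - 1][j - 1] + 1)

    # Trace back through the table to recover one optimal set of pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        close = math.dist(P[i - 1], Q[j - 1]) <= delta
        if close and dp[i][j] == dp[i - 1][j - 1] + 1:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


# Made-up example: Q roughly follows P but skips its second point.
P_points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
Q_points = [(0.1, 0, 0), (2.1, 0, 0), (3.1, 0, 0)]
```

    With `delta = 0.2`, the call `align_fixed_superposition(P_points, Q_points, 0.2)` matches the first, third, and fourth points of `P_points` to the three points of `Q_points`. The table has $(n+1)(m+1)$ entries, so one such alignment costs $O(nm)$ time per superposition.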

  • A Fixed-Parameter Algorithm for Detecting a Singleton Attractor in an AND/OR Boolean Network with Bounded Treewidth

    Chia-Jung CHANG  Takeyuki TAMURA  Kun-Mao CHAO  Tatsuya AKUTSU  

    PAPER-Algorithms and Data Structures

    E98-A No:1

    The Boolean network can be used as a mathematical model for gene regulatory networks. An attractor, which is a state of a Boolean network repeating itself periodically, can represent a stable stage of a gene regulatory network. It is known that the problem of finding an attractor of the shortest period is NP-hard. In this article, we give a fixed-parameter algorithm for detecting a singleton attractor (SA) for a Boolean network that has only AND and OR Boolean functions of literals and has bounded treewidth k. The algorithm is further extended to detect an SA for a constant-depth nested canalyzing Boolean network with bounded treewidth. We also prove the fixed-parameter intractability of the detection of an SA for a general Boolean network with bounded treewidth.

  • Tree Edit Distance Problems: Algorithms and Applications to Bioinformatics

    Tatsuya AKUTSU  


    E93-D No:2

    Tree structured data often appear in bioinformatics. For example, glycans, RNA secondary structures and phylogenetic trees usually have tree structures. Comparison of trees is one of the fundamental tasks in the analysis of these data. Various distance measures have been proposed and utilized for comparison of trees, among which tree edit distance has been studied extensively. In this paper, we review key results and our recent results on the tree edit distance problem and related problems. In particular, we review polynomial time exact algorithms and more efficient approximation algorithms for the edit distance problem for ordered trees, and approximation algorithms for the largest common subtree problem for unordered trees. We also review applications of tree edit distance and its variants to bioinformatics, focusing on comparison of glycan structures.

  • Dynamic Programming and Clique Based Approaches for Protein Threading with Profiles and Constraints

    Tatsuya AKUTSU  Morihiro HAYASHIDA  Dukka Bahadur K.C.  Etsuji TOMITA  Jun'ichi SUZUKI  Katsuhisa HORIMOTO  


    E89-A No:5

    The protein threading problem with profiles is known to be efficiently solvable using dynamic programming. In this paper, we consider a variant of the protein threading problem with profiles in which constraints on distances between residues are given. We prove that protein threading with profiles and constraints is NP-hard. Moreover, we show a strong hardness result on the approximation of an optimal threading satisfying all the constraints. On the other hand, we develop two practical algorithms: CLIQUETHREAD and BBDPTHREAD. CLIQUETHREAD reduces the threading problem to the maximum edge-weight clique problem, whereas BBDPTHREAD combines dynamic programming and branch-and-bound techniques. We perform computational experiments using protein structure data in PDB (Protein Data Bank) using simulated distance constraints. The results show that constraints are useful to improve the alignment accuracy of the target sequence and the template structure. Moreover, these results also show that BBDPTHREAD is in general faster than CLIQUETHREAD for larger size proteins whereas CLIQUETHREAD is useful if there does not exist a feasible threading.

  • Approximate String Matching with Variable Length Don't Care Characters

    Tatsuya AKUTSU  

    LETTER-Algorithm and Computational Complexity

    E79-D No:9

    This paper presents an O(mn log n) time algorithm for an approximate string matching problem, in which a pattern string may contain variable length don't care characters. This problem is important for searching DNA sequences or amino acid sequences.

  • A Minimum Path Decomposition of the Hasse Diagram for Testing the Consistency of Functional Dependencies

    Atsuhiro TAKASU  Tatsuya AKUTSU  

    LETTER-Algorithm and Computational Complexity

    E76-D No:2

    An optimal algorithm for decomposing a special type of the Hasse diagram into a minimum set of disjoint paths is described. It is useful for testing the consistency of functional dependencies.
