A Linear Time Pattern Matching Algorithm between a String and a Tree

Tatsuya AKUTSU

A Linear Time Pattern Matching Algorithm between a String and a Tree

Tatsuya AKUTSU

Full Text Views

0

Cite this

Summary :

This paper presents a linear time algorithm for testing whether or not there is a path <v₁,・・・,v_m> of an undiercted tree T (|V(T)|n) that coincides with a string ss₁・・・s_m (i.e., label(v₁)・・・label(v_m)s₁・・・s_m). Since any path of the tree is allowed, linear time substring matching algorithms can not be directly applied and a new method is developed. In the algorithm, O(n/m) vertices are selected from V(T) such that any path pf length more than m 2 must contain at least one of the selected vertices. A search is performed using the selected vertices as 'bases' and two tables of size O(m) are constructed for each of the selected vertices. A suffix tree, which is a well-known-data structure in string matching, is used effectively in the algorithm. From each of the selected vertices, a search is performed with traversing the suffix tree associated with s. Although the size of the alphabet is assumed to be bounded by a constant in this paper, the algorithm can be applied to the case of unbounded alphabets by increasing the time complexity to O(n log m).

Publication: IEICE TRANSACTIONS on Information Vol.E77-D No.3 pp.281-287

Publication Date: 1994/03/25

Publicized

Online ISSN

DOI

Type of Manuscript: PAPER

Category: Algorithm and Computational Complexity

Cite this

Copy

Tatsuya AKUTSU, "A Linear Time Pattern Matching Algorithm between a String and a Tree" in IEICE TRANSACTIONS on Information, vol. E77-D, no. 3, pp. 281-287, March 1994, doi: .
Abstract: This paper presents a linear time algorithm for testing whether or not there is a path <v₁,・・・,v_m> of an undiercted tree T (|V(T)|n) that coincides with a string ss₁・・・s_m (i.e., label(v₁)・・・label(v_m)s₁・・・s_m). Since any path of the tree is allowed, linear time substring matching algorithms can not be directly applied and a new method is developed. In the algorithm, O(n/m) vertices are selected from V(T) such that any path pf length more than m 2 must contain at least one of the selected vertices. A search is performed using the selected vertices as 'bases' and two tables of size O(m) are constructed for each of the selected vertices. A suffix tree, which is a well-known-data structure in string matching, is used effectively in the algorithm. From each of the selected vertices, a search is performed with traversing the suffix tree associated with s. Although the size of the alphabet is assumed to be bounded by a constant in this paper, the algorithm can be applied to the case of unbounded alphabets by increasing the time complexity to O(n log m).
URL: https://global.ieice.org/en_transactions/information/10.1587/e77-d_3_281/_p

Copy

@ARTICLE{e77-d_3_281,
author={Tatsuya AKUTSU, },
journal={IEICE TRANSACTIONS on Information},
title={A Linear Time Pattern Matching Algorithm between a String and a Tree},
year={1994},
volume={E77-D},
number={3},
pages={281-287},
abstract={This paper presents a linear time algorithm for testing whether or not there is a path <v₁,・・・,v_m> of an undiercted tree T (|V(T)|n) that coincides with a string ss₁・・・s_m (i.e., label(v₁)・・・label(v_m)s₁・・・s_m). Since any path of the tree is allowed, linear time substring matching algorithms can not be directly applied and a new method is developed. In the algorithm, O(n/m) vertices are selected from V(T) such that any path pf length more than m 2 must contain at least one of the selected vertices. A search is performed using the selected vertices as 'bases' and two tables of size O(m) are constructed for each of the selected vertices. A suffix tree, which is a well-known-data structure in string matching, is used effectively in the algorithm. From each of the selected vertices, a search is performed with traversing the suffix tree associated with s. Although the size of the alphabet is assumed to be bounded by a constant in this paper, the algorithm can be applied to the case of unbounded alphabets by increasing the time complexity to O(n log m).},
keywords={},
doi={},
ISSN={},
month={March},}

Copy

TY - JOUR
TI - A Linear Time Pattern Matching Algorithm between a String and a Tree
T2 - IEICE TRANSACTIONS on Information
SP - 281
EP - 287
AU - Tatsuya AKUTSU
PY - 1994
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E77-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 1994
AB - This paper presents a linear time algorithm for testing whether or not there is a path <v₁,・・・,v_m> of an undiercted tree T (|V(T)|n) that coincides with a string ss₁・・・s_m (i.e., label(v₁)・・・label(v_m)s₁・・・s_m). Since any path of the tree is allowed, linear time substring matching algorithms can not be directly applied and a new method is developed. In the algorithm, O(n/m) vertices are selected from V(T) such that any path pf length more than m 2 must contain at least one of the selected vertices. A search is performed using the selected vertices as 'bases' and two tables of size O(m) are constructed for each of the selected vertices. A suffix tree, which is a well-known-data structure in string matching, is used effectively in the algorithm. From each of the selected vertices, a search is performed with traversing the suffix tree associated with s. Although the size of the alphabet is assumed to be bounded by a constant in this paper, the algorithm can be applied to the case of unbounded alphabets by increasing the time complexity to O(n log m).
ER -