Consider a regular expression r of length m and a text string T of length n over an alphabet Σ. The RE minimal substring search problem is to find all minimal substrings of T that match r. Yamamoto proposed an O(mn)-time, O(m)-space algorithm based on a Thompson automaton. In this paper, we improve Yamamoto's algorithm by introducing parallelism. The proposed algorithm runs in O(mn) time in the worst case and in O(mn/p) time in the best case, where p denotes the number of processors. In addition, we present a parameter related to the parallel running time of the proposed algorithm. We also evaluate the algorithm experimentally.
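To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the Thompson-automaton algorithm of Yamamoto, nor the parallel algorithm of this paper, and it is far slower than O(mn). It assumes that a matching substring is "minimal" when it contains no other matching substring, and that empty matches are ignored; the abstract does not spell out either point. The function name `minimal_matches` is hypothetical, and Python's `re` syntax stands in for the regular expressions of the paper.

```python
import re


def minimal_matches(pattern: str, text: str) -> list[tuple[int, int]]:
    """Return half-open (start, end) spans of minimal matching substrings.

    Assumption: a span is minimal if no other matching span lies inside it.
    Empty substrings are skipped.
    """
    regex = re.compile(pattern)
    n = len(text)

    # Brute force: test every non-empty substring against the whole pattern.
    matching = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n + 1)
        if regex.fullmatch(text[i:j])
    ]

    # Keep only spans that do not contain a different matching span.
    return [
        (i, j)
        for (i, j) in matching
        if not any((a, b) != (i, j) and i <= a and b <= j for (a, b) in matching)
    ]


if __name__ == "__main__":
    print(minimal_matches("a.*b", "aabab"))
```

For r = `a.*b` and T = `aabab`, the matching substrings are `aab`, `aabab`, `ab` (positions 1 to 2), `abab`, and `ab` (positions 3 to 4); only the two occurrences of `ab` are minimal under the assumed definition, so the sketch reports the spans (1, 3) and (3, 5).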
Yosuke OBE
SCSK Corporation
Hiroaki YAMAMOTO
Shinshu University
Hiroshi FUJIWARA
Shinshu University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yosuke OBE, Hiroaki YAMAMOTO, Hiroshi FUJIWARA, "Parallelization on a Minimal Substring Search Algorithm for Regular Expressions" in IEICE TRANSACTIONS on Information, vol. E106-D, no. 5, pp. 952-958, May 2023, doi: 10.1587/transinf.2022EDP7105.
Abstract: Consider a regular expression r of length m and a text string T of length n over an alphabet Σ. The RE minimal substring search problem is to find all minimal substrings of T that match r. Yamamoto proposed an O(mn)-time, O(m)-space algorithm based on a Thompson automaton. In this paper, we improve Yamamoto's algorithm by introducing parallelism. The proposed algorithm runs in O(mn) time in the worst case and in O(mn/p) time in the best case, where p denotes the number of processors. In addition, we present a parameter related to the parallel running time of the proposed algorithm. We also evaluate the algorithm experimentally.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDP7105/_p
@ARTICLE{e106-d_5_952,
author={Yosuke OBE and Hiroaki YAMAMOTO and Hiroshi FUJIWARA},
journal={IEICE TRANSACTIONS on Information},
title={Parallelization on a Minimal Substring Search Algorithm for Regular Expressions},
year={2023},
volume={E106-D},
number={5},
pages={952-958},
abstract={Consider a regular expression r of length m and a text string T of length n over an alphabet Σ. The RE minimal substring search problem is to find all minimal substrings of T that match r. Yamamoto proposed an O(mn)-time, O(m)-space algorithm based on a Thompson automaton. In this paper, we improve Yamamoto's algorithm by introducing parallelism. The proposed algorithm runs in O(mn) time in the worst case and in O(mn/p) time in the best case, where p denotes the number of processors. In addition, we present a parameter related to the parallel running time of the proposed algorithm. We also evaluate the algorithm experimentally.},
keywords={},
doi={10.1587/transinf.2022EDP7105},
ISSN={1745-1361},
month={May},
}
TY - JOUR
TI - Parallelization on a Minimal Substring Search Algorithm for Regular Expressions
T2 - IEICE TRANSACTIONS on Information
SP - 952
EP - 958
AU - Yosuke OBE
AU - Hiroaki YAMAMOTO
AU - Hiroshi FUJIWARA
PY - 2023
DO - 10.1587/transinf.2022EDP7105
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - Consider a regular expression r of length m and a text string T of length n over an alphabet Σ. The RE minimal substring search problem is to find all minimal substrings of T that match r. Yamamoto proposed an O(mn)-time, O(m)-space algorithm based on a Thompson automaton. In this paper, we improve Yamamoto's algorithm by introducing parallelism. The proposed algorithm runs in O(mn) time in the worst case and in O(mn/p) time in the best case, where p denotes the number of processors. In addition, we present a parameter related to the parallel running time of the proposed algorithm. We also evaluate the algorithm experimentally.
ER -