The closeness of a match is an important measure with a number of practical applications, including computational biology, signal processing and text retrieval. The approximate string matching (ASM) problem asks to find a substring of string Y of length n that is most similar to string X of length m. It is well-know that the ASM can be solved by dynamic programming technique by computing a table of size m×n. The main contribution of this work is to present a memory-access-efficient implementation for computing the ASM on a GPU. The proposed GPU implementation relies on warp shuffle instructions which are used to accelerate the communication between threads without resorting to shared memory access. Despite the fact that O(mn) memory access operations are necessary to access all elements of a table with size n×m, the proposed implementation performs only $O(rac{mn}{w})$ memory access operations, where w is the warp size. Experimental results carried out on a GeForce GTX 980 GPU show that the proposed implementation, called w-SCAN, provides speed-up of over two fold in computing the ASM as compared to another prominent alternative.
Lucas Saad Nogueira NUNES
University of Brasilia
Jacir Luiz BORDIM
University of Brasilia
Yasuaki ITO
Hiroshima University
Koji NAKANO
Hiroshima University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Lucas Saad Nogueira NUNES, Jacir Luiz BORDIM, Yasuaki ITO, Koji NAKANO, "A Memory-Access-Efficient Implementation for Computing the Approximate String Matching Algorithm on GPUs" in IEICE TRANSACTIONS on Information,
vol. E99-D, no. 12, pp. 2995-3003, December 2016, doi: 10.1587/transinf.2016PAP0024.
Abstract: The closeness of a match is an important measure with a number of practical applications, including computational biology, signal processing and text retrieval. The approximate string matching (ASM) problem asks to find a substring of string Y of length n that is most similar to string X of length m. It is well-know that the ASM can be solved by dynamic programming technique by computing a table of size m×n. The main contribution of this work is to present a memory-access-efficient implementation for computing the ASM on a GPU. The proposed GPU implementation relies on warp shuffle instructions which are used to accelerate the communication between threads without resorting to shared memory access. Despite the fact that O(mn) memory access operations are necessary to access all elements of a table with size n×m, the proposed implementation performs only $O(rac{mn}{w})$ memory access operations, where w is the warp size. Experimental results carried out on a GeForce GTX 980 GPU show that the proposed implementation, called w-SCAN, provides speed-up of over two fold in computing the ASM as compared to another prominent alternative.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2016PAP0024/_p
Copy
@ARTICLE{e99-d_12_2995,
author={Lucas Saad Nogueira NUNES, Jacir Luiz BORDIM, Yasuaki ITO, Koji NAKANO, },
journal={IEICE TRANSACTIONS on Information},
title={A Memory-Access-Efficient Implementation for Computing the Approximate String Matching Algorithm on GPUs},
year={2016},
volume={E99-D},
number={12},
pages={2995-3003},
abstract={The closeness of a match is an important measure with a number of practical applications, including computational biology, signal processing and text retrieval. The approximate string matching (ASM) problem asks to find a substring of string Y of length n that is most similar to string X of length m. It is well-know that the ASM can be solved by dynamic programming technique by computing a table of size m×n. The main contribution of this work is to present a memory-access-efficient implementation for computing the ASM on a GPU. The proposed GPU implementation relies on warp shuffle instructions which are used to accelerate the communication between threads without resorting to shared memory access. Despite the fact that O(mn) memory access operations are necessary to access all elements of a table with size n×m, the proposed implementation performs only $O(rac{mn}{w})$ memory access operations, where w is the warp size. Experimental results carried out on a GeForce GTX 980 GPU show that the proposed implementation, called w-SCAN, provides speed-up of over two fold in computing the ASM as compared to another prominent alternative.},
keywords={},
doi={10.1587/transinf.2016PAP0024},
ISSN={1745-1361},
month={December},}
Copy
TY - JOUR
TI - A Memory-Access-Efficient Implementation for Computing the Approximate String Matching Algorithm on GPUs
T2 - IEICE TRANSACTIONS on Information
SP - 2995
EP - 3003
AU - Lucas Saad Nogueira NUNES
AU - Jacir Luiz BORDIM
AU - Yasuaki ITO
AU - Koji NAKANO
PY - 2016
DO - 10.1587/transinf.2016PAP0024
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E99-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2016
AB - The closeness of a match is an important measure with a number of practical applications, including computational biology, signal processing and text retrieval. The approximate string matching (ASM) problem asks to find a substring of string Y of length n that is most similar to string X of length m. It is well-know that the ASM can be solved by dynamic programming technique by computing a table of size m×n. The main contribution of this work is to present a memory-access-efficient implementation for computing the ASM on a GPU. The proposed GPU implementation relies on warp shuffle instructions which are used to accelerate the communication between threads without resorting to shared memory access. Despite the fact that O(mn) memory access operations are necessary to access all elements of a table with size n×m, the proposed implementation performs only $O(rac{mn}{w})$ memory access operations, where w is the warp size. Experimental results carried out on a GeForce GTX 980 GPU show that the proposed implementation, called w-SCAN, provides speed-up of over two fold in computing the ASM as compared to another prominent alternative.
ER -