This paper presents a blocked united algorithm for the all-pairs shortest paths (APSP) problem. This algorithm simultaneously computes both the shortest-path distance matrix and the shortest-path construction matrix for a graph. It is designed for a high-speed APSP solution on hybrid CPU-GPU systems. In our implementation, two most compute intensive parts of the algorithm are performed on the GPU. The first part is to solve the APSP sub-problem for a block of sub-matrices, and the other part is a matrix-matrix “multiplication” for the APSP problem. Moreover, the amount of data communication between CPU (host) memory and GPU memory is reduced by reusing blocks once sent to the GPU. When a problem size (the number of vertices in a graph) is large enough compared to a block size, our implementation of the blocked algorithm requires CPU
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Kazuya MATSUMOTO, Naohito NAKASATO, Stanislav G. SEDUKHIN, "Blocked United Algorithm for the All-Pairs Shortest Paths Problem on Hybrid CPU-GPU Systems" in IEICE TRANSACTIONS on Information,
vol. E95-D, no. 12, pp. 2759-2768, December 2012, doi: 10.1587/transinf.E95.D.2759.
Abstract: This paper presents a blocked united algorithm for the all-pairs shortest paths (APSP) problem. This algorithm simultaneously computes both the shortest-path distance matrix and the shortest-path construction matrix for a graph. It is designed for a high-speed APSP solution on hybrid CPU-GPU systems. In our implementation, two most compute intensive parts of the algorithm are performed on the GPU. The first part is to solve the APSP sub-problem for a block of sub-matrices, and the other part is a matrix-matrix “multiplication” for the APSP problem. Moreover, the amount of data communication between CPU (host) memory and GPU memory is reduced by reusing blocks once sent to the GPU. When a problem size (the number of vertices in a graph) is large enough compared to a block size, our implementation of the blocked algorithm requires CPU
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E95.D.2759/_p
Copy
@ARTICLE{e95-d_12_2759,
author={Kazuya MATSUMOTO, Naohito NAKASATO, Stanislav G. SEDUKHIN, },
journal={IEICE TRANSACTIONS on Information},
title={Blocked United Algorithm for the All-Pairs Shortest Paths Problem on Hybrid CPU-GPU Systems},
year={2012},
volume={E95-D},
number={12},
pages={2759-2768},
abstract={This paper presents a blocked united algorithm for the all-pairs shortest paths (APSP) problem. This algorithm simultaneously computes both the shortest-path distance matrix and the shortest-path construction matrix for a graph. It is designed for a high-speed APSP solution on hybrid CPU-GPU systems. In our implementation, two most compute intensive parts of the algorithm are performed on the GPU. The first part is to solve the APSP sub-problem for a block of sub-matrices, and the other part is a matrix-matrix “multiplication” for the APSP problem. Moreover, the amount of data communication between CPU (host) memory and GPU memory is reduced by reusing blocks once sent to the GPU. When a problem size (the number of vertices in a graph) is large enough compared to a block size, our implementation of the blocked algorithm requires CPU
keywords={},
doi={10.1587/transinf.E95.D.2759},
ISSN={1745-1361},
month={December},}
Copy
TY - JOUR
TI - Blocked United Algorithm for the All-Pairs Shortest Paths Problem on Hybrid CPU-GPU Systems
T2 - IEICE TRANSACTIONS on Information
SP - 2759
EP - 2768
AU - Kazuya MATSUMOTO
AU - Naohito NAKASATO
AU - Stanislav G. SEDUKHIN
PY - 2012
DO - 10.1587/transinf.E95.D.2759
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E95-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2012
AB - This paper presents a blocked united algorithm for the all-pairs shortest paths (APSP) problem. This algorithm simultaneously computes both the shortest-path distance matrix and the shortest-path construction matrix for a graph. It is designed for a high-speed APSP solution on hybrid CPU-GPU systems. In our implementation, two most compute intensive parts of the algorithm are performed on the GPU. The first part is to solve the APSP sub-problem for a block of sub-matrices, and the other part is a matrix-matrix “multiplication” for the APSP problem. Moreover, the amount of data communication between CPU (host) memory and GPU memory is reduced by reusing blocks once sent to the GPU. When a problem size (the number of vertices in a graph) is large enough compared to a block size, our implementation of the blocked algorithm requires CPU
ER -