IEICE TRANSACTIONS on Information

An Efficient Method for Training Deep Learning Networks Distributed

Chenxu WANG, Yutong LU, Zhiguang CHEN, Junnan LI

Summary:

Training deep learning (DL) networks is computationally intensive, and training times can become long enough to impede the development of DL. High-performance computing clusters, especially supercomputers, offer abundant computing and storage resources together with efficient interconnects, which allows DL networks to be trained better and faster. In this paper, we propose a method for training DL networks in a distributed manner with high efficiency. First, we propose a hierarchical synchronous Stochastic Gradient Descent (SGD) strategy, which makes full use of hardware resources and greatly increases computational efficiency. Second, we present a two-level parameter synchronization scheme that reduces communication overhead by transmitting the parameters of the first-layer models through shared memory. Third, we optimize parallel I/O by having each reader read data as contiguously as possible, avoiding the high overhead of discontinuous reads. Finally, we integrate the LARS algorithm into our system. The experimental results demonstrate that our approach has substantial performance advantages over unoptimized methods. Compared with the native distributed strategy, our hierarchical synchronous SGD strategy (HSGD) increases computing efficiency by about 20 times.
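
The two-level synchronization idea in the summary can be illustrated with a small sketch. This is not the authors' implementation, only one possible reading of it: gradients are first averaged among the workers of a node (the step the paper performs through shared memory), and the per-node averages are then averaged across nodes (the step that would use the interconnect). The function names, the toy gradient values, and the assumption that every node hosts the same number of workers are all hypothetical and chosen for illustration.

```python
from statistics import fmean


def average(vectors):
    """Element-wise mean of equally sized gradient vectors."""
    return [fmean(column) for column in zip(*vectors)]


def two_level_average(node_gradients):
    """Average within each node first, then across nodes.

    node_gradients: list of nodes, each a list of per-worker gradient vectors.
    Assumption: every node hosts the same number of workers, so the mean of
    the node means equals the global mean over all workers.
    """
    # Level 1: intra-node reduction (stand-in for the shared-memory step).
    node_means = [average(workers) for workers in node_gradients]
    # Level 2: inter-node reduction (stand-in for the network step).
    return average(node_means)


def sgd_step(params, grad, lr):
    """Plain synchronous SGD update applied with the synchronized gradient."""
    return [p - lr * g for p, g in zip(params, grad)]


# Hypothetical toy data: 2 nodes x 2 workers, 3 parameters each.
node_gradients = [
    [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]],
    [[0.0, 4.0, 2.0], [4.0, 0.0, 2.0]],
]

global_grad = two_level_average(node_gradients)
# Reference result: a flat average over all four workers.
flat_grad = average([g for node in node_gradients for g in node])
new_params = sgd_step([0.5, 0.5, 0.5], global_grad, lr=0.1)
```

With equal worker counts per node, the two-level result matches the flat average, so the hierarchy changes where the communication happens rather than what is computed. Only the per-node averages need to cross the network in this sketch.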

Publication
IEICE TRANSACTIONS on Information Vol.E103-D No.12 pp.2444-2456
Publication Date
2020/12/01
Publicized
2020/09/07
Online ISSN
1745-1361
DOI
10.1587/transinf.2020PAP0007
Type of Manuscript
Special Section PAPER (Special Section on Parallel, Distributed, and Reconfigurable Computing, and Networking)
Category
Fundamentals of Information Systems

Authors

Chenxu WANG
  National University of Defense Technology
Yutong LU
  National Supercomputer Center in Guangzhou, Sun Yat-sen University
Zhiguang CHEN
  National Supercomputer Center in Guangzhou, Sun Yat-sen University
Junnan LI
  National University of Defense Technology

Keyword