IEICE global.ieice.org Site

Author Search Result

[Author] Junnan LI(2hit)


Exploiting Packet-Level Parallelism of Packet Parsing for FPGA-Based Switches
Junnan LI Biao HAN Zhigang SUN Tao LI Xiaoyan WANG

PAPER-Transmission Systems and Transmission Equipment for Communications

Publicized: 2019/03/18
Vol: E102-B No: 9
Page(s): 1862-1874
FPGA-based switches are appealing because they balance hardware performance with software flexibility. The packet parser, a foundational component of FPGA-based switches, identifies and extracts the specific fields used in forwarding decisions, e.g., the destination IP address. However, traditional parsers are too rigid to accommodate new protocols. In addition, FPGAs usually have a much lower clock frequency and fewer hardware resources than ASICs. In this paper, we present PLANET, a programmable packet-level parallel parsing architecture for FPGA-based switches, to overcome these two limitations. First, PLANET is programmable: its parsing algorithms can be updated at run-time. Second, PLANET exploits parallelism inside packet parsing to compensate for the FPGA's low clock frequency and reduces resource consumption with a one-block recycling design. We implemented PLANET on an FPGA-based switch prototype with well-integrated datacenter protocols. Evaluation results show that our design can parse packets at up to 100 Gbps while maintaining relatively low parsing latency and using fewer hardware resources than existing proposals.
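
To make the parser's job concrete, the following is a minimal software sketch of extracting the destination IP address from an Ethernet frame carrying IPv4. It is not the PLANET design, which is a hardware architecture; the function names `extract_dst_ip` and `build_demo_frame` are hypothetical, and the sketch assumes an untagged frame (no VLAN header) and ignores IPv4 options, since the destination address sits at a fixed offset regardless.

```python
import ipaddress
import struct

ETH_HEADER_LEN = 14        # dst MAC (6) + src MAC (6) + EtherType (2)
IPV4_MIN_HEADER_LEN = 20   # IPv4 header without options
ETHERTYPE_IPV4 = 0x0800


def extract_dst_ip(frame: bytes):
    """Return the IPv4 destination address as a string, or None.

    Illustrative only: assumes an untagged Ethernet frame.
    """
    if len(frame) < ETH_HEADER_LEN + IPV4_MIN_HEADER_LEN:
        return None
    # EtherType is a big-endian 16-bit field at byte offset 12.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    # The destination address occupies bytes 16..19 of the IPv4 header.
    start = ETH_HEADER_LEN + 16
    return str(ipaddress.IPv4Address(frame[start:start + 4]))


def build_demo_frame() -> bytes:
    """Build a zero-filled demo frame with made-up addresses."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = bytearray(IPV4_MIN_HEADER_LEN)
    ip[0] = 0x45                            # version 4, header length 5 words
    ip[12:16] = bytes([10, 0, 0, 1])        # source address (demo value)
    ip[16:20] = bytes([192, 168, 1, 20])    # destination address (demo value)
    return eth + bytes(ip)


if __name__ == "__main__":
    print(extract_dst_ip(build_demo_frame()))
```

A fixed-offset parser like this one is exactly the kind of rigid design the abstract contrasts with: supporting a new protocol means changing the code rather than updating a parsing table at run-time.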
An Efficient Method for Training Deep Learning Networks Distributed
Chenxu WANG Yutong LU Zhiguang CHEN Junnan LI

PAPER-Fundamentals of Information Systems

Publicized: 2020/09/07
Vol: E103-D No: 12
Page(s): 2444-2456
Training deep learning (DL) networks is a computationally intensive process; as a result, training time can become so long that it impedes the development of DL. High-performance computing clusters, especially supercomputers, are equipped with large amounts of computing and storage resources and efficient interconnects, which can train DL networks better and faster. In this paper, we propose a method to train DL networks in a distributed manner with high efficiency. First, we propose a hierarchical synchronous Stochastic Gradient Descent (SGD) strategy, which can make full use of hardware resources and greatly increase computational efficiency. Second, we present a two-level parameter synchronization scheme, which reduces communication overhead by transmitting the parameters of the first-layer models in shared memory. Third, we optimize parallel I/O by making each reader read data as contiguously as possible to avoid the high overhead of discontinuous data reading. Finally, we integrate the LARS algorithm into our system. The experimental results demonstrate that our approach has tremendous performance advantages relative to unoptimized methods. Compared with the native distributed strategy, our hierarchical synchronous SGD strategy (HSGD) can increase computing efficiency by about 20 times.
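
The abstract does not spell out the synchronization scheme, so the following is only a rough sketch of what a two-level average could look like: gradients are first averaged among the workers on one node (the step that could use shared memory), and the per-node results are then averaged across nodes. The function names and the nested-list data layout are assumptions made for illustration, and the sketch assumes every node hosts the same number of workers, in which case the two-level result equals the flat average over all workers.

```python
def mean_vectors(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def two_level_average(grads_by_node):
    """Average gradients within each node, then across nodes.

    grads_by_node[i][j] is the gradient vector of worker j on node i
    (a hypothetical layout, not taken from the paper).
    """
    # Level 1: intra-node averaging, one result per node.
    node_means = [mean_vectors(worker_grads) for worker_grads in grads_by_node]
    # Level 2: inter-node averaging over the per-node results.
    return mean_vectors(node_means)


if __name__ == "__main__":
    grads = [
        [[1.0, 2.0], [3.0, 4.0]],   # node 0, two workers
        [[5.0, 6.0], [7.0, 8.0]],   # node 1, two workers
    ]
    print(two_level_average(grads))
```

Only one vector per node crosses the network in the second step, which is where a scheme of this shape would save communication compared with every worker exchanging gradients with every other worker.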
