A sparse memory access architecture proposed to achieve a high-computational-speed neural-network LSI is described in detail. This architecture uses two key techniques, compressible synapse-weight neuron calculation and differential neuron operation, to reduce the number of accesses to synapse-weight memories and the number of neuron calculations without incurring an accuracy penalty. The test chip based on this architecture has 96 parallel data-driven processing units and enough memory for 12,288 synapse weights. In a pattern recognition example, the number of memory accesses and neuron calculations was reduced to 0.87% of that needed by the conventional method, and the practical performance was 18 GCPS. The sparse memory access architecture is also effective when the synapse weights are stored in off-chip memory.
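The abstract does not give implementation details, but the two techniques it names can be illustrated in outline. The Python sketch below is a rough software analogue, not the chip's actual datapath; the function names, the tanh activation, and the dense baseline are all illustrative assumptions. It shows the shared idea: fetch only nonzero (compressible) synapse weights, and recompute only the contributions of inputs that changed since the previous pattern (differential neuron operation).

    import numpy as np

    def dense_forward(x, W):
        # Conventional method: every synapse weight is fetched and every
        # neuron sum is recomputed for every input pattern.
        return np.tanh(W @ x)

    def sparse_differential_forward(x, x_prev, z_prev, W, tol=0.0):
        # Rough software analogue (assumed, not from the paper) of the two
        # techniques: the running pre-activation sums z are carried over
        # and updated only for input components that changed (differential
        # neuron operation), and zero weights are never fetched or
        # multiplied (compressible synapse-weight calculation).
        z = z_prev.copy()
        delta = x - x_prev
        for j in np.nonzero(np.abs(delta) > tol)[0]:   # changed inputs only
            col = W[:, j]
            nz = np.nonzero(col)[0]                    # skip zero weights entirely
            z[nz] += col[nz] * delta[j]                # weight fetches ~ len(nz)
        return z, np.tanh(z)

With sparse weight matrices and inputs that change little between successive patterns, the inner loop touches only a small fraction of the weight memory on each pattern, which is the kind of reduction (to 0.87% of the conventional method) the abstract reports for a pattern recognition example.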
Kimihisa AIHARA, Osamu FUJITA, Kuniharu UCHIMURA, "A Sparse Memory Access Architecture for Digital Neural Network LSIs" in IEICE TRANSACTIONS on Electronics, vol. E80-C, no. 7, pp. 996-1002, July 1997.
URL: https://global.ieice.org/en_transactions/electronics/10.1587/e80-c_7_996/_p
@ARTICLE{e80-c_7_996,
author={Kimihisa AIHARA and Osamu FUJITA and Kuniharu UCHIMURA},
journal={IEICE TRANSACTIONS on Electronics},
title={A Sparse Memory Access Architecture for Digital Neural Network LSIs},
year={1997},
volume={E80-C},
number={7},
pages={996-1002},
abstract={A sparse memory access architecture proposed to achieve a high-computational-speed neural-network LSI is described in detail. This architecture uses two key techniques, compressible synapse-weight neuron calculation and differential neuron operation, to reduce the number of accesses to synapse-weight memories and the number of neuron calculations without incurring an accuracy penalty. The test chip based on this architecture has 96 parallel data-driven processing units and enough memory for 12,288 synapse weights. In a pattern recognition example, the number of memory accesses and neuron calculations was reduced to 0.87% of that needed by the conventional method, and the practical performance was 18 GCPS. The sparse memory access architecture is also effective when the synapse weights are stored in off-chip memory.},
month={July}
}
TY - JOUR
TI - A Sparse Memory Access Architecture for Digital Neural Network LSIs
T2 - IEICE TRANSACTIONS on Electronics
SP - 996
EP - 1002
AU - Kimihisa AIHARA
AU - Osamu FUJITA
AU - Kuniharu UCHIMURA
PY - 1997
JO - IEICE TRANSACTIONS on Electronics
VL - E80-C
IS - 7
JA - IEICE TRANSACTIONS on Electronics
Y1 - July 1997
AB - A sparse memory access architecture proposed to achieve a high-computational-speed neural-network LSI is described in detail. This architecture uses two key techniques, compressible synapse-weight neuron calculation and differential neuron operation, to reduce the number of accesses to synapse-weight memories and the number of neuron calculations without incurring an accuracy penalty. The test chip based on this architecture has 96 parallel data-driven processing units and enough memory for 12,288 synapse weights. In a pattern recognition example, the number of memory accesses and neuron calculations was reduced to 0.87% of that needed by the conventional method, and the practical performance was 18 GCPS. The sparse memory access architecture is also effective when the synapse weights are stored in off-chip memory.
ER -