New media processing applications such as image recognition and AR (Augmented Reality) have become practical on embedded systems for automotive, digital-consumer, and mobile products. Many-core processors have been proposed to achieve much higher performance than multi-core processors. We have developed a low-power many-core SoC for multimedia applications in 40 nm CMOS technology. Within a 210 mm² die, two 32-core clusters are integrated with dynamically reconfigurable processors, hardware accelerators, 2-channel DDR3 I/Fs, and other peripherals. Processor cores in a cluster share a 2 MB L2 cache connected through a tree-based Network-on-Chip (NoC). The total peak performance of the SoC exceeds 1.5 TOPS (Tera Operations Per Second). High scalability and low power consumption are achieved by parallelized software for multimedia applications. For face detection, performance scales up to 64 cores while the SoC consumes only 2.21 W. Moreover, the SoC can execute 1080p 48 fps H.264 decoding at about 520 mW using 28 cores, and 4K2K 15 fps super resolution at about 770 mW using 32 cores in one cluster. By exploiting parallelism across low-power processor cores, the many-core SoC provides several tens of times better energy efficiency than a high-performance desktop quad-core processor.
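The power figures quoted above can be turned into a rough energy-per-frame estimate by dividing average power by frame rate. The sketch below is only an illustration of that arithmetic, not a calculation taken from the paper: the function name is hypothetical, and it assumes the quoted power is the average drawn while sustaining the stated frame rate.

```python
def energy_per_frame_mj(power_w: float, fps: float) -> float:
    """Average energy per frame in millijoules, using E = P / f."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return power_w / fps * 1000.0


# Figures quoted in the abstract; both power values are approximate ("about").
h264_mj = energy_per_frame_mj(0.520, 48)  # 1080p H.264 decoding, 28 cores
sr_mj = energy_per_frame_mj(0.770, 15)    # 4K2K super resolution, 32 cores

print(f"H.264 decode:     {h264_mj:.1f} mJ/frame")
print(f"Super resolution: {sr_mj:.1f} mJ/frame")
```

Under these assumptions the estimate comes to roughly 10.8 mJ per decoded 1080p frame and roughly 51.3 mJ per super-resolved 4K2K frame.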
Takashi MIYAMORI
Toshiba Corporation Semiconductor & Storage Products Company
Hui XU
Toshiba Corporation Semiconductor & Storage Products Company
Hiroyuki USUI
Toshiba Corporation Semiconductor & Storage Products Company
Soichiro HOSODA
Toshiba Corporation Semiconductor & Storage Products Company
Toru SANO
Toshiba Corporation Semiconductor & Storage Products Company
Kazumasa YAMAMOTO
Toshiba Corporation Semiconductor & Storage Products Company
Takeshi KODAKA
Toshiba Corporation Semiconductor & Storage Products Company
Nobuhiro NONOGAKI
Toshiba Corporation Semiconductor & Storage Products Company
Nau OZAKI
Toshiba Corporation Semiconductor & Storage Products Company
Jun TANABE
Toshiba Corporation Semiconductor & Storage Products Company
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Takashi MIYAMORI, Hui XU, Hiroyuki USUI, Soichiro HOSODA, Toru SANO, Kazumasa YAMAMOTO, Takeshi KODAKA, Nobuhiro NONOGAKI, Nau OZAKI, Jun TANABE, "Architecture and Evaluation of Low Power Many-Core SoC with Two 32-Core Clusters" in IEICE TRANSACTIONS on Electronics,
vol. E97-C, no. 4, pp. 360-368, April 2014, doi: 10.1587/transele.E97.C.360.
URL: https://global.ieice.org/en_transactions/electronics/10.1587/transele.E97.C.360/_p
@ARTICLE{e97-c_4_360,
author={Takashi MIYAMORI and Hui XU and Hiroyuki USUI and Soichiro HOSODA and Toru SANO and Kazumasa YAMAMOTO and Takeshi KODAKA and Nobuhiro NONOGAKI and Nau OZAKI and Jun TANABE},
journal={IEICE TRANSACTIONS on Electronics},
title={Architecture and Evaluation of Low Power Many-Core SoC with Two 32-Core Clusters},
year={2014},
volume={E97-C},
number={4},
pages={360-368},
keywords={},
doi={10.1587/transele.E97.C.360},
ISSN={1745-1353},
month={April},}
TY - JOUR
TI - Architecture and Evaluation of Low Power Many-Core SoC with Two 32-Core Clusters
T2 - IEICE TRANSACTIONS on Electronics
SP - 360
EP - 368
AU - Takashi MIYAMORI
AU - Hui XU
AU - Hiroyuki USUI
AU - Soichiro HOSODA
AU - Toru SANO
AU - Kazumasa YAMAMOTO
AU - Takeshi KODAKA
AU - Nobuhiro NONOGAKI
AU - Nau OZAKI
AU - Jun TANABE
PY - 2014
DO - 10.1587/transele.E97.C.360
JO - IEICE TRANSACTIONS on Electronics
SN - 1745-1353
VL - E97-C
IS - 4
JA - IEICE TRANSACTIONS on Electronics
Y1 - April 2014
ER -