A 5.6 GOPS/1.4 GFLOPS 350 MHz, four-way very long instruction word (VLIW) microprocessor is developed for embedded applications in a 0.18 µm five-layer-metal CMOS process. This processor features a two-way integer pipeline and two-way floating/media pipelines. Each floating pipeline and media pipeline has two-parallel and four-parallel single instruction multiple-data (SIMD) mechanisms, respectively. The processor has separate instruction and data caches, each of 16 KB in size and having four-way set associative. The data cache employs a non-blocking technique and can process two load instructions in parallel. The processor had about a 50% clock net power reduction compared with one without power optimization. 6.7 million transistors are integrated in an area of 7.5 mm
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Hiroshi OKANO, Atsuhiro SUGA, Hideo MIYAKE, Yoshimasa TAKEBE, Yasuki NAKAMURA, Hiromasa TAKAHASHI, "A 350 MHz 5.6 GOPS/1.4 GFLOPS 4-Way VLIW Embedded Microprocessor" in IEICE TRANSACTIONS on Electronics,
vol. E84-C, no. 2, pp. 150-156, February 2001, doi: .
Abstract: A 5.6 GOPS/1.4 GFLOPS 350 MHz, four-way very long instruction word (VLIW) microprocessor is developed for embedded applications in a 0.18 µm five-layer-metal CMOS process. This processor features a two-way integer pipeline and two-way floating/media pipelines. Each floating pipeline and media pipeline has two-parallel and four-parallel single instruction multiple-data (SIMD) mechanisms, respectively. The processor has separate instruction and data caches, each of 16 KB in size and having four-way set associative. The data cache employs a non-blocking technique and can process two load instructions in parallel. The processor had about a 50% clock net power reduction compared with one without power optimization. 6.7 million transistors are integrated in an area of 7.5 mm
URL: https://global.ieice.org/en_transactions/electronics/10.1587/e84-c_2_150/_p
Copy
@ARTICLE{e84-c_2_150,
author={Hiroshi OKANO, Atsuhiro SUGA, Hideo MIYAKE, Yoshimasa TAKEBE, Yasuki NAKAMURA, Hiromasa TAKAHASHI, },
journal={IEICE TRANSACTIONS on Electronics},
title={A 350 MHz 5.6 GOPS/1.4 GFLOPS 4-Way VLIW Embedded Microprocessor},
year={2001},
volume={E84-C},
number={2},
pages={150-156},
abstract={A 5.6 GOPS/1.4 GFLOPS 350 MHz, four-way very long instruction word (VLIW) microprocessor is developed for embedded applications in a 0.18 µm five-layer-metal CMOS process. This processor features a two-way integer pipeline and two-way floating/media pipelines. Each floating pipeline and media pipeline has two-parallel and four-parallel single instruction multiple-data (SIMD) mechanisms, respectively. The processor has separate instruction and data caches, each of 16 KB in size and having four-way set associative. The data cache employs a non-blocking technique and can process two load instructions in parallel. The processor had about a 50% clock net power reduction compared with one without power optimization. 6.7 million transistors are integrated in an area of 7.5 mm
keywords={},
doi={},
ISSN={},
month={February},}
Copy
TY - JOUR
TI - A 350 MHz 5.6 GOPS/1.4 GFLOPS 4-Way VLIW Embedded Microprocessor
T2 - IEICE TRANSACTIONS on Electronics
SP - 150
EP - 156
AU - Hiroshi OKANO
AU - Atsuhiro SUGA
AU - Hideo MIYAKE
AU - Yoshimasa TAKEBE
AU - Yasuki NAKAMURA
AU - Hiromasa TAKAHASHI
PY - 2001
DO -
JO - IEICE TRANSACTIONS on Electronics
SN -
VL - E84-C
IS - 2
JA - IEICE TRANSACTIONS on Electronics
Y1 - February 2001
AB - A 5.6 GOPS/1.4 GFLOPS 350 MHz, four-way very long instruction word (VLIW) microprocessor is developed for embedded applications in a 0.18 µm five-layer-metal CMOS process. This processor features a two-way integer pipeline and two-way floating/media pipelines. Each floating pipeline and media pipeline has two-parallel and four-parallel single instruction multiple-data (SIMD) mechanisms, respectively. The processor has separate instruction and data caches, each of 16 KB in size and having four-way set associative. The data cache employs a non-blocking technique and can process two load instructions in parallel. The processor had about a 50% clock net power reduction compared with one without power optimization. 6.7 million transistors are integrated in an area of 7.5 mm
ER -