This paper describes a 16-b fixed-point digital signal processor (DSP), focusing on its multiply-accumulate (MAC) unit, memories, and instruction set. By adopting a redundant binary multiplier and a variable pipeline structure, this DSP's MAC unit consumes about 15% less power and operates 24% faster than a conventional MAC unit. Furthermore, its double-speed MAC mechanism delivers twice the performance of a single MAC operation while consuming only 69% more power. By controlling more finely which portions of memory are activated, we reduced the precharge current of the data ROM and data RAM to about 1/8 that of conventional ROM and RAM. We redesigned the instruction set and reduced its width from 32 b to 24 b, based on an analysis of data generated by simulating an application program on our previous DSP. The reduction in instruction width made our on-chip instruction memory 33% smaller than the previous one. The chip is fabricated in a 0.5-µm double-metal-layer CMOS process and achieves 80-MOPS peak double-speed multiply-accumulate performance.
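The power and speed figures quoted in the abstract can be combined into rough energy-per-operation ratios. The following is a minimal sketch of that arithmetic, not a result from the paper: the function name `relative_energy_per_op` is hypothetical, and reading "24% faster" as a 1.24x throughput ratio is an assumption made here for illustration.

```python
def relative_energy_per_op(throughput_ratio: float, power_ratio: float) -> float:
    """Energy per operation relative to a baseline.

    Energy per operation is power divided by throughput, so the relative
    value is (power ratio) / (throughput ratio).
    """
    return power_ratio / throughput_ratio


# Double-speed MAC vs. single MAC: 2x performance at 69% more power
# (both figures taken from the abstract).
double_speed = relative_energy_per_op(throughput_ratio=2.0, power_ratio=1.69)

# MAC unit vs. a conventional MAC unit: about 15% less power, 24% faster.
# ASSUMPTION: "24% faster" is read as a throughput ratio of 1.24.
single_mac = relative_energy_per_op(throughput_ratio=1.24, power_ratio=0.85)

# Instruction width: 24 b versus the previous 32 b.
width_ratio = 24 / 32

print(f"double-speed MAC energy/op vs. single MAC: {double_speed:.3f}")
print(f"MAC energy/op vs. conventional (assumed reading): {single_mac:.3f}")
print(f"instruction width ratio, new/old: {width_ratio:.2f}")
```

Under these assumptions, the double-speed mode spends less energy per MAC than the single mode even though its total power is higher. The width ratio covers only the bits per instruction; the abstract's 33% memory-size figure is quoted as reported and is not derived here.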
Hideyuki KABUO
Minoru OKAMOTO
Isao TANAKA
Hiroyuki YASOSHIMA
Shinichi MARUI
Masayuki YAMASAKI
Toshio SUGIMURA
Katsuhiko UEDA
Toshihiro ISHIKAWA
Hidetoshi SUZUKI
Ryuichi ASAHI
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hideyuki KABUO, Minoru OKAMOTO, Isao TANAKA, Hiroyuki YASOSHIMA, Shinichi MARUI, Masayuki YAMASAKI, Toshio SUGIMURA, Katsuhiko UEDA, Toshihiro ISHIKAWA, Hidetoshi SUZUKI, Ryuichi ASAHI, "An 80-MOPS-Peak High-Speed and Low-Power-Consumption 16-b Digital Signal Processor" in IEICE TRANSACTIONS on Electronics,
vol. E79-C, no. 7, pp. 905-914, July 1996.
URL: https://global.ieice.org/en_transactions/electronics/10.1587/e79-c_7_905/_p
@ARTICLE{e79-c_7_905,
author={Hideyuki KABUO and Minoru OKAMOTO and Isao TANAKA and Hiroyuki YASOSHIMA and Shinichi MARUI and Masayuki YAMASAKI and Toshio SUGIMURA and Katsuhiko UEDA and Toshihiro ISHIKAWA and Hidetoshi SUZUKI and Ryuichi ASAHI},
journal={IEICE TRANSACTIONS on Electronics},
title={An 80-MOPS-Peak High-Speed and Low-Power-Consumption 16-b Digital Signal Processor},
year={1996},
volume={E79-C},
number={7},
pages={905-914},
month={July},}
TY - JOUR
TI - An 80-MOPS-Peak High-Speed and Low-Power-Consumption 16-b Digital Signal Processor
T2 - IEICE TRANSACTIONS on Electronics
SP - 905
EP - 914
AU - Hideyuki KABUO
AU - Minoru OKAMOTO
AU - Isao TANAKA
AU - Hiroyuki YASOSHIMA
AU - Shinichi MARUI
AU - Masayuki YAMASAKI
AU - Toshio SUGIMURA
AU - Katsuhiko UEDA
AU - Toshihiro ISHIKAWA
AU - Hidetoshi SUZUKI
AU - Ryuichi ASAHI
PY - 1996
JO - IEICE TRANSACTIONS on Electronics
VL - E79-C
IS - 7
JA - IEICE TRANSACTIONS on Electronics
Y1 - July 1996
ER -