The new technique for reducing the load latency is presented. This technique, named tunneling-load, utilizes the register specifier buffer in order to reduce the load latency without fetching the data cache speculatively, and thus eliminates the drawback of any load address prediction techniques. As a consequence of the trend toward increasing clock frequency, the internal cache is no longer able to fill the speed gap between the processor and the external memory, and the data cache latency degrades the processor performance. In order to hide this latency, several techniques predicting the load address have been proposed. These techniques carry out the speculative data cache fetching, which causes the explosion of the memory traffic and the pollution of the data cache. The tunneling-load solves these problems. We have evaluated the effects of the tunneling-load, and found that in an in-order-issue superscalar platform the instruction level parallelism is increased by approximately 10%.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Toshinori SATO, "Resolving Load Data Dependency Using Tunneling-Load Technique" in IEICE TRANSACTIONS on Information,
vol. E81-D, no. 8, pp. 829-838, August 1998, doi: .
Abstract: The new technique for reducing the load latency is presented. This technique, named tunneling-load, utilizes the register specifier buffer in order to reduce the load latency without fetching the data cache speculatively, and thus eliminates the drawback of any load address prediction techniques. As a consequence of the trend toward increasing clock frequency, the internal cache is no longer able to fill the speed gap between the processor and the external memory, and the data cache latency degrades the processor performance. In order to hide this latency, several techniques predicting the load address have been proposed. These techniques carry out the speculative data cache fetching, which causes the explosion of the memory traffic and the pollution of the data cache. The tunneling-load solves these problems. We have evaluated the effects of the tunneling-load, and found that in an in-order-issue superscalar platform the instruction level parallelism is increased by approximately 10%.
URL: https://global.ieice.org/en_transactions/information/10.1587/e81-d_8_829/_p
Copy
@ARTICLE{e81-d_8_829,
author={Toshinori SATO, },
journal={IEICE TRANSACTIONS on Information},
title={Resolving Load Data Dependency Using Tunneling-Load Technique},
year={1998},
volume={E81-D},
number={8},
pages={829-838},
abstract={The new technique for reducing the load latency is presented. This technique, named tunneling-load, utilizes the register specifier buffer in order to reduce the load latency without fetching the data cache speculatively, and thus eliminates the drawback of any load address prediction techniques. As a consequence of the trend toward increasing clock frequency, the internal cache is no longer able to fill the speed gap between the processor and the external memory, and the data cache latency degrades the processor performance. In order to hide this latency, several techniques predicting the load address have been proposed. These techniques carry out the speculative data cache fetching, which causes the explosion of the memory traffic and the pollution of the data cache. The tunneling-load solves these problems. We have evaluated the effects of the tunneling-load, and found that in an in-order-issue superscalar platform the instruction level parallelism is increased by approximately 10%.},
keywords={},
doi={},
ISSN={},
month={August},}
Copy
TY - JOUR
TI - Resolving Load Data Dependency Using Tunneling-Load Technique
T2 - IEICE TRANSACTIONS on Information
SP - 829
EP - 838
AU - Toshinori SATO
PY - 1998
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E81-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 1998
AB - The new technique for reducing the load latency is presented. This technique, named tunneling-load, utilizes the register specifier buffer in order to reduce the load latency without fetching the data cache speculatively, and thus eliminates the drawback of any load address prediction techniques. As a consequence of the trend toward increasing clock frequency, the internal cache is no longer able to fill the speed gap between the processor and the external memory, and the data cache latency degrades the processor performance. In order to hide this latency, several techniques predicting the load address have been proposed. These techniques carry out the speculative data cache fetching, which causes the explosion of the memory traffic and the pollution of the data cache. The tunneling-load solves these problems. We have evaluated the effects of the tunneling-load, and found that in an in-order-issue superscalar platform the instruction level parallelism is increased by approximately 10%.
ER -