Prefetching is a promising approach to the memory latency problem. The two basic variants of hardware data prefetching are sequential prefetching and stride prefetching. The latter, which calculates the stride of future references, has the potential to outperform the former, which relies on data locality. In this paper, a typical stride prefetching scheme and its improved version, adaptive stride prefetching, are compared quantitatively by simulating several parallel benchmark programs on uniform memory access and non-uniform memory access architectures. The simulation results show that stride adaptability is essential: the proposed adaptive scheme reduces the pending stall time, which is large in the typical scheme.
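The difference between the two basic variants can be illustrated with a minimal sketch. It is not the scheme evaluated in the paper: the per-instruction table, the two-matches confirmation rule, the 32-byte line size, and all names (`StridePrefetcher`, `RptEntry`, `sequential_prefetch`) are assumptions made for illustration, and the adaptive variant is not modelled because the abstract does not describe its mechanism.

```python
from dataclasses import dataclass

LINE_SIZE = 32  # bytes per cache line (assumed value, not from the paper)


@dataclass
class RptEntry:
    """One table entry, keyed by the address (PC) of a load instruction."""
    last_addr: int
    stride: int = 0
    confirmed: bool = False


class StridePrefetcher:
    """Hypothetical stride prefetcher: predicts addr + stride once the
    same non-zero stride has been observed twice in a row."""

    def __init__(self):
        self.table = {}  # PC -> RptEntry

    def access(self, pc, addr):
        """Record a reference; return an address to prefetch, or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = RptEntry(last_addr=addr)
            return None
        stride = addr - entry.last_addr
        # Confirm only when the new stride repeats the previous one.
        entry.confirmed = stride != 0 and stride == entry.stride
        entry.stride = stride
        entry.last_addr = addr
        return addr + stride if entry.confirmed else None


def sequential_prefetch(addr):
    """Sequential prefetching: always fetch the next cache line."""
    return (addr // LINE_SIZE + 1) * LINE_SIZE
```

For a load that walks memory in 256-byte steps, the sequential variant keeps requesting the adjacent line, which is never used, whereas the stride variant starts issuing useful prefetches from the third reference onward.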
Chang-Jae PARK Ando KI In-Cheol PARK Chong-Min KYUNG
This paper describes an automatic interface insertion scheme for in-system verification of algorithm models. To insert the interface, an algorithm model described in C is translated into new source code that communicates with the hardware components of the target system against which the algorithm model is to be validated. This communication is carried out by transactors, which translate between access operations and bus cycle transactions. An I/O terminal is introduced as an interface model that relates the transactions to access operations during execution of the algorithm model: accesses to I/O terminals invoke bus cycle transactions in hardware, and vice versa. An automatic interface insertion tool based on source-to-source translation is developed to identify the I/O terminals and insert interface function calls into the source code. The proposed scheme is validated by emulating several multimedia algorithms written in C on real target systems.
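The relationship between I/O terminals, transactors, and bus cycle transactions can be pictured with a toy model. This sketch is only an assumption-laden illustration in Python, not the paper's C-based tool: the `Transactor` and `IOTerminal` classes, the dictionary standing in for hardware registers, and the `scale_sample` algorithm are all hypothetical.

```python
class Transactor:
    """Hypothetical transactor: turns access operations into logged
    bus transactions against a dict that stands in for hardware."""

    def __init__(self, device):
        self.device = device  # address -> value, a stand-in for registers
        self.cycles = []      # record of bus transactions performed

    def bus_read(self, addr):
        value = self.device.get(addr, 0)
        self.cycles.append(("READ", addr, value))
        return value

    def bus_write(self, addr, value):
        self.cycles.append(("WRITE", addr, value))
        self.device[addr] = value


class IOTerminal:
    """Interface model: every access to the terminal invokes a
    bus transaction through the transactor."""

    def __init__(self, transactor, addr):
        self.transactor = transactor
        self.addr = addr

    def read(self):
        return self.transactor.bus_read(self.addr)

    def write(self, value):
        self.transactor.bus_write(self.addr, value)


def scale_sample(src, dst, gain):
    """Toy algorithm model after interface insertion: its input and
    output accesses go through I/O terminals instead of plain variables."""
    dst.write(src.read() * gain)
```

Calling `scale_sample` with two terminals that share one transactor produces one read transaction followed by one write transaction in `cycles`, which is the kind of access-to-transaction mapping the abstract describes.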