Broadcast and synchronization techniques are used for cache coherence control in conventional larger scale snoop-based SMP systems. The penalty for synchronization is directly proportional to system size. Meanwhile, advances in LSI technology now enable placing a memory controller on a CPU die. The latency to access directly linked memory is drastically reduced by an on-die controller. Developing an enterprise server system with these CPUs allows us an opportunity to achieve higher performance. Though the penalty of synchronization is counted whenever a cache miss occurs, it is necessary to improve the coherence method to receive the full benefit of this effect. In this paper, we demonstrate a coherence directory organization that fits into DSM enterprise server systems. Originally, a directory-based method was adopted in high performance computing systems because of its huge scalability in comparison with snoop-based method. Though directory capacity miss and long directory access latency are the major problems of this method, the relaxed scalability requirement of enterprise servers is advantageous to us to solve these problems along with an advanced LSI technology. Our proposed directory solves both problems by implementing a full bit vector level map of the coherence directory on an LSI chip. Our experimental results validate that a system controlled by our proposed directory can surpass a snoop-based system in performance even without applying data localization optimization to an online transaction processing (OLTP) workload.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Mariko SAKAMOTO, Akira KATSUNO, Go SUGIZAKI, Toshio YOSHIDA, Aiichiro INOUE, Koji INOUE, Kazuaki MURAKAMI, "A Next-Generation Enterprise Server System with Advanced Cache Coherence Chips" in IEICE TRANSACTIONS on Electronics,
vol. E90-C, no. 10, pp. 1972-1982, October 2007, doi: 10.1093/ietele/e90-c.10.1972.
Abstract: Broadcast and synchronization techniques are used for cache coherence control in conventional larger scale snoop-based SMP systems. The penalty for synchronization is directly proportional to system size. Meanwhile, advances in LSI technology now enable placing a memory controller on a CPU die. The latency to access directly linked memory is drastically reduced by an on-die controller. Developing an enterprise server system with these CPUs allows us an opportunity to achieve higher performance. Though the penalty of synchronization is counted whenever a cache miss occurs, it is necessary to improve the coherence method to receive the full benefit of this effect. In this paper, we demonstrate a coherence directory organization that fits into DSM enterprise server systems. Originally, a directory-based method was adopted in high performance computing systems because of its huge scalability in comparison with snoop-based method. Though directory capacity miss and long directory access latency are the major problems of this method, the relaxed scalability requirement of enterprise servers is advantageous to us to solve these problems along with an advanced LSI technology. Our proposed directory solves both problems by implementing a full bit vector level map of the coherence directory on an LSI chip. Our experimental results validate that a system controlled by our proposed directory can surpass a snoop-based system in performance even without applying data localization optimization to an online transaction processing (OLTP) workload.
URL: https://global.ieice.org/en_transactions/electronics/10.1093/ietele/e90-c.10.1972/_p
Copy
@ARTICLE{e90-c_10_1972,
author={Mariko SAKAMOTO, Akira KATSUNO, Go SUGIZAKI, Toshio YOSHIDA, Aiichiro INOUE, Koji INOUE, Kazuaki MURAKAMI, },
journal={IEICE TRANSACTIONS on Electronics},
title={A Next-Generation Enterprise Server System with Advanced Cache Coherence Chips},
year={2007},
volume={E90-C},
number={10},
pages={1972-1982},
abstract={Broadcast and synchronization techniques are used for cache coherence control in conventional larger scale snoop-based SMP systems. The penalty for synchronization is directly proportional to system size. Meanwhile, advances in LSI technology now enable placing a memory controller on a CPU die. The latency to access directly linked memory is drastically reduced by an on-die controller. Developing an enterprise server system with these CPUs allows us an opportunity to achieve higher performance. Though the penalty of synchronization is counted whenever a cache miss occurs, it is necessary to improve the coherence method to receive the full benefit of this effect. In this paper, we demonstrate a coherence directory organization that fits into DSM enterprise server systems. Originally, a directory-based method was adopted in high performance computing systems because of its huge scalability in comparison with snoop-based method. Though directory capacity miss and long directory access latency are the major problems of this method, the relaxed scalability requirement of enterprise servers is advantageous to us to solve these problems along with an advanced LSI technology. Our proposed directory solves both problems by implementing a full bit vector level map of the coherence directory on an LSI chip. Our experimental results validate that a system controlled by our proposed directory can surpass a snoop-based system in performance even without applying data localization optimization to an online transaction processing (OLTP) workload.},
keywords={},
doi={10.1093/ietele/e90-c.10.1972},
ISSN={1745-1353},
month={October},}
Copy
TY - JOUR
TI - A Next-Generation Enterprise Server System with Advanced Cache Coherence Chips
T2 - IEICE TRANSACTIONS on Electronics
SP - 1972
EP - 1982
AU - Mariko SAKAMOTO
AU - Akira KATSUNO
AU - Go SUGIZAKI
AU - Toshio YOSHIDA
AU - Aiichiro INOUE
AU - Koji INOUE
AU - Kazuaki MURAKAMI
PY - 2007
DO - 10.1093/ietele/e90-c.10.1972
JO - IEICE TRANSACTIONS on Electronics
SN - 1745-1353
VL - E90-C
IS - 10
JA - IEICE TRANSACTIONS on Electronics
Y1 - October 2007
AB - Broadcast and synchronization techniques are used for cache coherence control in conventional larger scale snoop-based SMP systems. The penalty for synchronization is directly proportional to system size. Meanwhile, advances in LSI technology now enable placing a memory controller on a CPU die. The latency to access directly linked memory is drastically reduced by an on-die controller. Developing an enterprise server system with these CPUs allows us an opportunity to achieve higher performance. Though the penalty of synchronization is counted whenever a cache miss occurs, it is necessary to improve the coherence method to receive the full benefit of this effect. In this paper, we demonstrate a coherence directory organization that fits into DSM enterprise server systems. Originally, a directory-based method was adopted in high performance computing systems because of its huge scalability in comparison with snoop-based method. Though directory capacity miss and long directory access latency are the major problems of this method, the relaxed scalability requirement of enterprise servers is advantageous to us to solve these problems along with an advanced LSI technology. Our proposed directory solves both problems by implementing a full bit vector level map of the coherence directory on an LSI chip. Our experimental results validate that a system controlled by our proposed directory can surpass a snoop-based system in performance even without applying data localization optimization to an online transaction processing (OLTP) workload.
ER -