The problems of cache coherency in multiprocessor systems are directly related to their architectural structures. Small scale multiprocessor systems have focused on the use of bus based memory interconnection networks using centrally shared memory and a sequential consistency model for coherency. This has limited scalability to but a few tens of processors due to the limited bus bandwidth used for both coherency updates and memory traffic. Recently, large scale multiprocessor systems have been proposed that use general interconnection networks and distributed shared memory. These architectures have been proposed using weak consistency models and various directory map schemes to hide the overhead for coherency maintenance within the memory hieratchy, interconnection network or process context switch latencies. The coherency and memory traffic are still maintained over the same interconnection network. In this paper, we present the architecture of a new general purpose medium scale multiprocessor system. This Cache Coherent Multiprocessor System (C2MP), supports distributed shared memory using a general memory interconnection network for memory traffic and a separate bus based coherency interconnection network for coherency maintenance. Through the use of a special directory based coherency protocol and cache oriented distributed coherency controllers, direct cache-to-cache coherency maintenance is performed over the dedicated coherency bus. This minimizes coherency updates to only those processor nodes needing coherency maintenance. An aggressive sequential coherncy model is used, which reduces the hardware penalty to support an ideal sequential consistency programmers model. The system can scale up to 256-512 processors depending on the degree of shared data and is expected to have higher per processor utilization in this range than currently proposed medium and large scale multiprocessor systems. The C2MP system is analyzed utilizing a Generalized Timed Petri-Net model of a processor node. A stochastic model for internode interactions over the general memory interconnection network and coherency bus are used . The model of the proposed architecture is analyzed under steady-state conditions for varying system work load parameters.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Douglas E. MARQUARDT, Hasan S. ALKHATIB, "A Cache-Coherent, Distributed Memory Multiprocessor System and Its Performance Analysis" in IEICE TRANSACTIONS on Information,
vol. E75-D, no. 3, pp. 274-290, May 1992, doi: .
Abstract: The problems of cache coherency in multiprocessor systems are directly related to their architectural structures. Small scale multiprocessor systems have focused on the use of bus based memory interconnection networks using centrally shared memory and a sequential consistency model for coherency. This has limited scalability to but a few tens of processors due to the limited bus bandwidth used for both coherency updates and memory traffic. Recently, large scale multiprocessor systems have been proposed that use general interconnection networks and distributed shared memory. These architectures have been proposed using weak consistency models and various directory map schemes to hide the overhead for coherency maintenance within the memory hieratchy, interconnection network or process context switch latencies. The coherency and memory traffic are still maintained over the same interconnection network. In this paper, we present the architecture of a new general purpose medium scale multiprocessor system. This Cache Coherent Multiprocessor System (C2MP), supports distributed shared memory using a general memory interconnection network for memory traffic and a separate bus based coherency interconnection network for coherency maintenance. Through the use of a special directory based coherency protocol and cache oriented distributed coherency controllers, direct cache-to-cache coherency maintenance is performed over the dedicated coherency bus. This minimizes coherency updates to only those processor nodes needing coherency maintenance. An aggressive sequential coherncy model is used, which reduces the hardware penalty to support an ideal sequential consistency programmers model. The system can scale up to 256-512 processors depending on the degree of shared data and is expected to have higher per processor utilization in this range than currently proposed medium and large scale multiprocessor systems. The C2MP system is analyzed utilizing a Generalized Timed Petri-Net model of a processor node. A stochastic model for internode interactions over the general memory interconnection network and coherency bus are used . The model of the proposed architecture is analyzed under steady-state conditions for varying system work load parameters.
URL: https://global.ieice.org/en_transactions/information/10.1587/e75-d_3_274/_p
Copy
@ARTICLE{e75-d_3_274,
author={Douglas E. MARQUARDT, Hasan S. ALKHATIB, },
journal={IEICE TRANSACTIONS on Information},
title={A Cache-Coherent, Distributed Memory Multiprocessor System and Its Performance Analysis},
year={1992},
volume={E75-D},
number={3},
pages={274-290},
abstract={The problems of cache coherency in multiprocessor systems are directly related to their architectural structures. Small scale multiprocessor systems have focused on the use of bus based memory interconnection networks using centrally shared memory and a sequential consistency model for coherency. This has limited scalability to but a few tens of processors due to the limited bus bandwidth used for both coherency updates and memory traffic. Recently, large scale multiprocessor systems have been proposed that use general interconnection networks and distributed shared memory. These architectures have been proposed using weak consistency models and various directory map schemes to hide the overhead for coherency maintenance within the memory hieratchy, interconnection network or process context switch latencies. The coherency and memory traffic are still maintained over the same interconnection network. In this paper, we present the architecture of a new general purpose medium scale multiprocessor system. This Cache Coherent Multiprocessor System (C2MP), supports distributed shared memory using a general memory interconnection network for memory traffic and a separate bus based coherency interconnection network for coherency maintenance. Through the use of a special directory based coherency protocol and cache oriented distributed coherency controllers, direct cache-to-cache coherency maintenance is performed over the dedicated coherency bus. This minimizes coherency updates to only those processor nodes needing coherency maintenance. An aggressive sequential coherncy model is used, which reduces the hardware penalty to support an ideal sequential consistency programmers model. The system can scale up to 256-512 processors depending on the degree of shared data and is expected to have higher per processor utilization in this range than currently proposed medium and large scale multiprocessor systems. The C2MP system is analyzed utilizing a Generalized Timed Petri-Net model of a processor node. A stochastic model for internode interactions over the general memory interconnection network and coherency bus are used . The model of the proposed architecture is analyzed under steady-state conditions for varying system work load parameters.},
keywords={},
doi={},
ISSN={},
month={May},}
Copy
TY - JOUR
TI - A Cache-Coherent, Distributed Memory Multiprocessor System and Its Performance Analysis
T2 - IEICE TRANSACTIONS on Information
SP - 274
EP - 290
AU - Douglas E. MARQUARDT
AU - Hasan S. ALKHATIB
PY - 1992
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E75-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - May 1992
AB - The problems of cache coherency in multiprocessor systems are directly related to their architectural structures. Small scale multiprocessor systems have focused on the use of bus based memory interconnection networks using centrally shared memory and a sequential consistency model for coherency. This has limited scalability to but a few tens of processors due to the limited bus bandwidth used for both coherency updates and memory traffic. Recently, large scale multiprocessor systems have been proposed that use general interconnection networks and distributed shared memory. These architectures have been proposed using weak consistency models and various directory map schemes to hide the overhead for coherency maintenance within the memory hieratchy, interconnection network or process context switch latencies. The coherency and memory traffic are still maintained over the same interconnection network. In this paper, we present the architecture of a new general purpose medium scale multiprocessor system. This Cache Coherent Multiprocessor System (C2MP), supports distributed shared memory using a general memory interconnection network for memory traffic and a separate bus based coherency interconnection network for coherency maintenance. Through the use of a special directory based coherency protocol and cache oriented distributed coherency controllers, direct cache-to-cache coherency maintenance is performed over the dedicated coherency bus. This minimizes coherency updates to only those processor nodes needing coherency maintenance. An aggressive sequential coherncy model is used, which reduces the hardware penalty to support an ideal sequential consistency programmers model. The system can scale up to 256-512 processors depending on the degree of shared data and is expected to have higher per processor utilization in this range than currently proposed medium and large scale multiprocessor systems. The C2MP system is analyzed utilizing a Generalized Timed Petri-Net model of a processor node. A stochastic model for internode interactions over the general memory interconnection network and coherency bus are used . The model of the proposed architecture is analyzed under steady-state conditions for varying system work load parameters.
ER -