
Keyword Search Result

[Keyword] false sharing (3 hits)

Hits 1-3
  • A False-Sharing Free Distributed Shared Memory Management Scheme

    Alexander I-Chi LAI  Chin-Laung LEI  Hann-Huei CHIOU  

     
    PAPER-Computer Systems

    Vol: E83-D  No: 4  Page(s): 777-788

    Distributed shared memory (DSM) systems built on networks of workstations are especially vulnerable to false sharing because their higher memory transaction overheads raise the penalty of each false-sharing event. In this paper we develop a dynamic-granularity shared memory management scheme that eliminates false sharing without sacrificing transparency to conventional shared-memory applications. Our approach uses a special threaded splay tree (TST) to manage shared memory information and a dynamic token-based path-compression synchronization algorithm to transfer data. The combination of the TST and path compression is efficient: asymptotically, in an n-processor system with m shared memory segments, synchronizing at most s segments takes O(s log m log n) amortized computation steps and generates O(s log n) communication messages. Based on the proposed scheme we constructed an experimental DSM prototype consisting of several Ethernet-connected Pentium-based computers running Linux. Preliminary benchmark results on our prototype indicate that our scheme is quite efficient, significantly outperforming traditional schemes and scaling up well.
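To get a feel for the asymptotic bounds quoted in this abstract, the sketch below simply evaluates the two expressions for concrete values. It assumes base-2 logarithms and drops all constant factors, neither of which the abstract specifies; the function name `sync_bounds` is hypothetical, and the numbers are growth-rate illustrations, not measurements from the paper.

```python
import math


def sync_bounds(s, m, n):
    """Evaluate the quoted bounds with constants dropped (an assumption).

    s: number of segments being synchronized
    m: number of shared memory segments in the system
    n: number of processors
    Returns (computation_steps, messages) as unitless growth figures.
    """
    steps = s * math.log2(m) * math.log2(n)   # O(s log m log n)
    messages = s * math.log2(n)               # O(s log n)
    return steps, messages


# Example: 8 segments synchronized, 1024 segments total, 16 processors.
steps, messages = sync_bounds(8, 1024, 16)
print(steps, messages)
```

Doubling the processor count from 16 to 32 adds only one to the `log n` factor, which is one way to read the abstract's statement that the scheme scales well.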

  • Buddy Coherence: An Adaptive Granularity Handling Scheme for Page-Based DSM

    Sangbum LEE  Inbum JUNG  Joonwon LEE  

     
    PAPER-Computer Systems

    Vol: E81-D  No: 12  Page(s): 1473-1482

    Page-based DSM systems suffer from false sharing because they use a large page as the coherence unit. The optimal page size depends on application characteristics and changes dynamically, so a fixed-size page cannot suit every application, even if it is as small as a cache line. In this paper we present a software-only coherence protocol called BCP (Buddy Coherence Protocol) that supports multiple page sizes, which vary adaptively according to the behavior of each application at run time. In BCP, the address of a remote access is compared with the address of the most recent local access. If they fall in different halves of a page, BCP treats the access as false sharing and demotes the page to two subpages of equal size. If two contiguous pages belong to the same node, BCP promotes them to a superpage to reduce the number of subsequent coherence operations. We also suggest a mechanism that detects data sharing patterns to optimize the protocol. It detects and records the sharing pattern of each page through a state transition mechanism. By referring to those patterns, BCP demotes pages selectively and makes each demotion more effective. Self-invalidation of migratorily shared pages is also employed to reduce the number of invalidations. Our simulations show that the optimized BCP outperforms almost all the best cases of the write-invalidate protocols using fixed-size pages. For some applications, BCP improves performance by 42.2% compared with fixed-size pages.
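The demotion and promotion rules described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(base, size)` tuple representation of a page, and the integer node identifiers are all hypothetical, and the alignment check in `can_promote` is an assumption borrowed from classic buddy allocation rather than something the abstract states.

```python
def in_different_halves(page_base, page_size, addr_a, addr_b):
    """True if the two addresses fall in different halves of the page."""
    half = page_size // 2
    return (addr_a - page_base) // half != (addr_b - page_base) // half


def demote(page_base, page_size):
    """Split a page into two equal subpages, returned as (base, size) pairs."""
    half = page_size // 2
    return [(page_base, half), (page_base + half, half)]


def can_promote(page_a, page_b, owner_a, owner_b):
    """Decide whether two pages may be merged into one superpage.

    From the abstract: the pages must be contiguous and belong to the
    same node. Assumed here: they must also have equal size and the
    merged page must be aligned to its own size (buddy-style).
    """
    (base_a, size_a), (base_b, size_b) = page_a, page_b
    return (
        owner_a == owner_b
        and size_a == size_b
        and base_b == base_a + size_a
        and base_a % (2 * size_a) == 0
    )


# A remote access at offset 3000 and a local access at offset 100 hit
# different halves of a 4096-byte page, so the page would be demoted.
if in_different_halves(0, 4096, 100, 3000):
    print(demote(0, 4096))
```

The sharing-pattern detection and self-invalidation mentioned in the abstract are deliberately left out, since the abstract does not give enough detail to sketch them.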

  • Two-Tier Paging and Its Performance Analysis for Network-based Distributed Shared Memory Systems

    Chi-Jiunn JOU  Hasan S. ALKHATIB  Qiang LI  

     
    PAPER-Computer Networks

    Vol: E78-D  No: 8  Page(s): 1021-1031

    Distributed computing over a network of workstations continues to be an elusive goal. Its main obstacle is the delay penalty caused by network protocol and OS overhead. In this paper we present a low-level, hardware-supported scheme for managing distributed shared memory (DSM) as an underlying paradigm for distributed computing. The proposed DSM is novel in that it employs a two-tier paging scheme that reduces the probability of false sharing and facilitates an efficient hardware implementation. The scheme takes a standard OS page and divides it into smaller fixed-size memory units called paragraphs, similar to cache lines. It manages only the shared data regions, while other regions are handled by the OS in the standard manner without modification. A hardware extension of a traditional MMU, called the Distributed MMU (DMMU), is introduced to support the DSM. Shared memory coherency is maintained through a write-invalidate protocol. An analytical model is built to evaluate the system's sensitivity to various parameters and to assess its performance.
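The effect of two-tier paging on false sharing can be illustrated with a small address calculation. The sketch below is an assumption-laden toy: the 4096-byte page and 64-byte paragraph sizes are hypothetical (the abstract gives no sizes), and `locate` and `shares_unit` are illustrative helpers, not part of the DMMU design.

```python
PAGE_SIZE = 4096       # assumed OS page size in bytes
PARAGRAPH_SIZE = 64    # assumed paragraph size in bytes


def locate(addr):
    """Map an address to (page number, paragraph index within the page)."""
    return addr // PAGE_SIZE, (addr % PAGE_SIZE) // PARAGRAPH_SIZE


def shares_unit(addr_a, addr_b, unit_size):
    """True if two distinct addresses land in the same coherence unit.

    When two nodes write to such addresses, a write-invalidate protocol
    invalidates the whole unit even though the data are unrelated,
    which is false sharing.
    """
    return addr_a != addr_b and addr_a // unit_size == addr_b // unit_size


a, b = 0x1000, 0x1100  # same page, 256 bytes apart
print(locate(a), locate(b))
print(shares_unit(a, b, PAGE_SIZE))       # conflict at page granularity
print(shares_unit(a, b, PARAGRAPH_SIZE))  # no conflict at paragraph granularity
```

With coherence tracked per paragraph, the two writes above no longer invalidate each other, while address translation can still use the standard OS page.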