Keyword Search Result

[Keyword] LRU (7 hits)

Results 1-7 of 7
  • Fogcached: A DRAM/NVMM Hybrid KVS Server for Edge Computing

    Kouki OZAWA  Takahiro HIROFUCHI  Ryousei TAKANO  Midori SUGAYA  

    PAPER

    Publicized: 2021/08/18
    Vol: E104-D No:12
    Page(s): 2089-2096

    With the development of IoT devices and sensors, edge computing is leading towards new services like autonomous cars and smart cities. Low-latency data access is an essential requirement for such services, and a large-capacity cache server is needed on the edge side. However, it is not realistic to build a large-capacity cache server using only DRAM, because DRAM is expensive and consumes substantial power. A hybrid main memory system, in which main memory consists of DRAM and non-volatile memory, is a promising way to address this issue: it achieves a large main-memory capacity within the power supply capabilities of current servers. In this paper, we propose Fogcached, an extension of the widely used KVS (Key-Value Store) server program Memcached that exploits both DRAM and non-volatile main memory (NVMM). We used Intel Optane DCPM as the NVMM in our prototype. Fogcached implements a Dual-LRU (Least Recently Used) mechanism that seamlessly extends the memory management of Memcached to hybrid main memory. Fogcached reuses the segmented LRU of Memcached to manage cached objects in DRAM, adds another segmented LRU for objects in DCPM, and bridges the two LRUs with a mechanism that automatically replaces cached objects between DRAM and DCPM. Cached objects are autonomously moved between the two memory devices according to their access frequencies. Through experiments, we confirmed that Fogcached improved the peak value of the latency distribution by about 40% compared to Memcached.
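
    The two-tier idea can be sketched compactly. The following Python model is a minimal illustration under assumed capacities and a simple promote-on-hit rule; the class name and methods are hypothetical, and the real Fogcached builds on Memcached's segmented LRU and slab allocation, which this sketch omits.

      from collections import OrderedDict

      class DualLRU:
          """Two-tier LRU sketch: a small DRAM tier backed by a larger NVMM tier."""

          def __init__(self, dram_cap, nvmm_cap):
              self.dram = OrderedDict()   # hot tier; most recent entry at the end
              self.nvmm = OrderedDict()   # capacity tier
              self.dram_cap, self.nvmm_cap = dram_cap, nvmm_cap

          def get(self, key):
              if key in self.dram:        # DRAM hit: refresh recency
                  self.dram.move_to_end(key)
                  return self.dram[key]
              if key in self.nvmm:        # NVMM hit: promote the object to DRAM
                  return self._insert_dram(key, self.nvmm.pop(key))
              return None                 # miss

          def put(self, key, value):
              self.nvmm.pop(key, None)    # avoid duplicates across the tiers
              self._insert_dram(key, value)

          def _insert_dram(self, key, value):
              self.dram[key] = value
              self.dram.move_to_end(key)
              if len(self.dram) > self.dram_cap:        # demote coldest DRAM object
                  old_key, old_value = self.dram.popitem(last=False)
                  self.nvmm[old_key] = old_value
                  if len(self.nvmm) > self.nvmm_cap:    # evict coldest NVMM object
                      self.nvmm.popitem(last=False)
              return value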

  • A SOI Cache-Tag Memory with Dual-Rail Wordline Scheme

    Nobutaro SHIBATA  Takako ISHIHARA  

    PAPER-Integrated Electronics

    Vol: E99-C No:2
    Page(s): 316-330

    Cache memories are the major application of high-speed SRAMs, and they are frequently installed in high-performance logic VLSIs, including microprocessors. This paper presents a 4-way set-associative SOI cache-tag memory. To obtain higher operating speed with less power dissipation, we devised an I/O-separated memory cell with a dual-rail wordline, which is used to transmit complementary selection signals. The address decoding delay was shortened using CMOS dual-rail logic. To enhance the maximum operating frequency, bitline recovery operations after data writes were eliminated using a memory array configuration without half-selected cells. Moreover, the conventional, sensitive but slow differential amplifiers were removed from the data I/O circuitry by means of a hierarchical bitline scheme. As regards stored-data management, we devised a new hardware-oriented LRU data replacement algorithm on the basis of a 6-bit directed graph. Experimental results obtained with a test chip fabricated in a 0.25-µm CMOS/SIMOX process show that the core of the cache-tag memory with a 1024-set configuration can achieve a 1.5-ns address access time under typical conditions of a 2-V power supply and 25°C. The power dissipation during standby was less than 14 µW, and that at 500-MHz operation was 13-83 mW, depending on the bit-stream data pattern.
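
    The 6-bit directed-graph replacement can be modeled in software: a 4-way set has six unordered pairs of ways, and one bit per pair records which way of the pair was used more recently, so the LRU way is the one that loses all three of its comparisons. The sketch below is an illustrative software model only; the chip's exact encoding may differ.

      from itertools import combinations

      class PairwiseLRU4:
          PAIRS = list(combinations(range(4), 2))   # 6 pairs for 4 ways

          def __init__(self):
              # bit[(i, j)] is True when way i was used more recently than way j
              self.bit = {pair: False for pair in self.PAIRS}

          def touch(self, way):
              """Record `way` as the most recently used."""
              for (i, j) in self.PAIRS:
                  if i == way:
                      self.bit[(i, j)] = True
                  elif j == way:
                      self.bit[(i, j)] = False

          def victim(self):
              """The LRU way is the one that loses every pairwise comparison."""
              for way in range(4):
                  if all(self.bit[(i, j)] == (j == way)
                         for (i, j) in self.PAIRS if way in (i, j)):
                      return way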

  • Improving Cache Partitioning Algorithms for Pseudo-LRU Policies

    Xi ZHANG  Chuanyi LIU  Zhenyu LIU  Dongsheng WANG  

    PAPER

    Vol: E96-D No:12
    Page(s): 2514-2523

    As the number of concurrently running applications on chip multiprocessors (CMPs) increases, efficient management of the shared last-level cache (LLC) is crucial to guarantee overall performance. Recent studies have shown that cache partitioning can provide benefits in throughput, fairness and quality of service. Most prior art applies true Least Recently Used (LRU) as the underlying cache replacement policy and relies on its stack property to work properly. However, in commodity processors, pseudo-LRU policies without the stack property are commonly used instead of LRU for their simplicity and low storage overhead. Therefore, this study sets out to understand whether LRU-based cache partitioning techniques can be applied to commodity processors. In this work, we propose a cache partitioning mechanism for two popular pseudo-LRU policies: Not Recently Used (NRU) and Binary Tree (BT). Without the help of true LRU's stack property, we propose profiling logic that applies curve approximation methods to derive the hit curve (hit counts under varied way allocations) for an application. We then propose a hybrid partitioning mechanism that mitigates the gap between the predicted hit curve and the actual statistics. Simulation results demonstrate that our proposal improves throughput by 15.3% on average and outperforms the stack-estimate proposal by 12.6% on average. Similar results are achieved in weighted speedup. For the cache configurations under study, the mechanism requires less than 0.5% storage overhead relative to the last-level cache. In addition, we show that a profiling mechanism with only one true LRU ATD achieves comparable performance and can further reduce the hardware cost by nearly two thirds compared with the hybrid mechanism.
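
    For reference, the Binary Tree (BT) policy targeted here keeps one bit per internal node of a binary tree over the ways. A sketch for a 4-way set follows, with an illustrative convention (bit value 0 means the victim lies in the left subtree); real implementations vary in encoding details.

      class TreePLRU4:
          """Binary-Tree pseudo-LRU for a 4-way set: 3 bits instead of a full LRU order."""

          def __init__(self):
              self.bits = [0, 0, 0]   # bits[0]: root; bits[1]: ways 0/1; bits[2]: ways 2/3

          def touch(self, way):
              # Point every bit on the access path away from the touched way.
              self.bits[0] = 0 if way >= 2 else 1
              if way < 2:
                  self.bits[1] = 0 if way == 1 else 1
              else:
                  self.bits[2] = 0 if way == 3 else 1

          def victim(self):
              # Follow the bits toward the pseudo-LRU leaf.
              if self.bits[0] == 0:
                  return 0 if self.bits[1] == 0 else 1
              return 2 if self.bits[2] == 0 else 3

    Because only a partial recency order is kept, the stack property that LRU-based partitioning relies on is lost, which is exactly the gap this paper addresses.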

  • On Temporal Locality in IP Address Sequences

    Weiguang SHI  Mike H. MACGREGOR  Pawel GBURZYNSKI  

    LETTER-Internet

    Vol: E86-B No:11
    Page(s): 3352-3354

    Temporal locality in IP destination address sequences can be captured by the addresses' reuse distance distribution. Based on measurements from a wide range of networks, we propose an accurate empirical model, in contrast to results derived from the stationarity assumption of address generation processes.
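
    Reuse distance is straightforward to measure. A minimal O(n·d) Python sketch follows (the function name is illustrative; production tools use tree-based O(n log n) algorithms):

      def reuse_distances(addresses):
          """Stack (reuse) distance of each reference: the number of distinct
          addresses seen since the previous reference to the same address,
          or None for a first-time reference."""
          stack, distances = [], []       # stack[0] is the most recent address
          for addr in addresses:
              if addr in stack:
                  depth = stack.index(addr)
                  distances.append(depth)
                  stack.pop(depth)
              else:
                  distances.append(None)  # cold reference
              stack.insert(0, addr)
          return distances

      # reuse_distances(list("abcba")) -> [None, None, None, 1, 2]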

  • Stochastic Model of Internet Access Patterns: Coexistence of Stationarity and Zipf-Type Distributions

    Masaki AIDA  Tetsuya ABE  

    PAPER-Fundamental Theories

    Vol: E85-B No:8
    Page(s): 1469-1478

    This paper investigates the stochastic property of packet destinations in order to describe Internet access patterns. If we assume a sort of stationary condition for the address generation process, the process is an LRU stack model. Although the LRU stack model gives appropriate descriptions of address generation on a medium/long time-scale, address sequences generated from the LRU stack model do not reproduce Zipf-type distributions, which appear frequently in Internet access patterns. This implies that the address generation behavior on a short time-scale has a strong influence on the shape of the distributions that describe frequency of address appearances. This paper proposes an address generation algorithm that does not meet the stationary condition on the short time-scale, but restores it on the medium/long time-scale, and shows that the proposed algorithm reproduces Zipf-type distributions.
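
    For illustration, the LRU stack model referenced here can be simulated directly: a fixed depth distribution is sampled, the address at that stack depth is emitted, and that address is moved to the top. The sketch below is a hypothetical rendering of the stationary model only; the paper's contribution is an algorithm that deliberately violates this stationarity on short time-scales.

      import random

      def lru_stack_model(stack, depth_probs, n, seed=0):
          """Emit n addresses by repeatedly drawing a stack depth from the
          fixed distribution depth_probs and moving the chosen address to
          the top (the stationary LRU stack model)."""
          assert len(stack) >= len(depth_probs)
          rng = random.Random(seed)
          depths = list(range(len(depth_probs)))
          out = []
          for _ in range(n):
              d = rng.choices(depths, weights=depth_probs)[0]
              addr = stack.pop(d)
              stack.insert(0, addr)       # the referenced address becomes MRU
              out.append(addr)
          return out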

  • Stochastic Model of Internet Access Patterns

    Masaki AIDA  Tetsuya ABE  

    PAPER-Traffic Measurement and Analysis

    Vol: E84-B No:8
    Page(s): 2142-2150

    This paper investigates the stochastic properties of packet destinations and proposes an address generation algorithm that is applicable to describing various Internet access patterns. We assume that the stochastic process of Internet access satisfies a stationary condition and derive the fundamental structure of the address generation algorithm. Pseudo IP-address sequences generated by our algorithm give dependable cache performance and reproduce the results obtained from trace-driven simulation. The proposed algorithm is applicable not only to destination IP addresses but also to the destination URLs of packets, and is useful for simulation studies of Internet performance, Web caching, DNS, and so on.
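
    The "dependable cache performance" comparison amounts to trace-driven simulation: feed a generated sequence and a real trace through the same LRU cache and compare hit ratios. A minimal sketch (function name illustrative):

      from collections import OrderedDict

      def lru_hit_ratio(addresses, capacity):
          """Hit ratio of an LRU cache of the given capacity over a trace."""
          cache, hits = OrderedDict(), 0
          for addr in addresses:
              if addr in cache:
                  hits += 1
                  cache.move_to_end(addr)         # refresh recency on a hit
              else:
                  cache[addr] = True
                  if len(cache) > capacity:
                      cache.popitem(last=False)   # evict the LRU entry
          return hits / len(addresses) if addresses else 0.0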

  • VLRU: Buffer Management in Client-Server Systems

    Sung-Jin LEE  Chin-Wan CHUNG  

    PAPER-Databases

    Vol: E83-D No:6
    Page(s): 1245-1254

    In a client-server system, when LRU or a variant buffer replacement strategy is used on both the client and the server, the cache performance on the server side is very poor, mainly because of pages duplicated in both systems. This paper introduces a server buffer replacement strategy that uses the replaced page-id, rather than the requested page-id, as the primary information for its operations. The importance of the corresponding pages in the server cache is decided according to the replaced page-ids delivered from clients to the server, and the pages' positions in the cache are adjusted accordingly. Consequently, if a client uses LRU as its buffer replacement strategy, the server cache is seen by the client as a long virtual client LRU cache extended to the server. Since the replaced page-id is sent to the server only by piggybacking whenever a new page fetch request is sent, delivering the replaced page-id is simple and incurs minimal overhead. We show that the proposed strategy exhibits good performance characteristics in diverse situations, such as single and multiple clients, as well as with various access patterns.
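
    The mechanism can be sketched roughly as below. The class and method names and the disk-read stand-in are illustrative assumptions, and real VLRU details (e.g., exactly how freshly fetched, client-duplicated pages are demoted) are simplified away.

      from collections import OrderedDict

      class VLRUServerCache:
          """Server cache ordered mainly by client-reported *replaced* page-ids,
          so it behaves like an extension of the client's LRU stack."""

          def __init__(self, capacity):
              self.cache = OrderedDict()   # LRU order: oldest entry first
              self.capacity = capacity

          def fetch(self, page_id, replaced_page_id=None):
              # A page just evicted by the client heads the virtual extension
              # of the client's LRU, so promote it on the server.
              if replaced_page_id in self.cache:
                  self.cache.move_to_end(replaced_page_id)
              if page_id in self.cache:
                  # The client will now cache this page itself; its server copy
                  # is a duplicate, so its recency is not refreshed here.
                  return self.cache[page_id]
              page = self._read_page(page_id)            # hypothetical disk I/O
              self.cache[page_id] = page
              if len(self.cache) > self.capacity:
                  self.cache.popitem(last=False)         # evict server-side LRU
              return page

          def _read_page(self, page_id):
              return "page-%s" % page_id                 # stand-in for real I/O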