Author Search Result

[Author] Koji KAI (6 hits)

Hits 1-6
  • Dynamically Variable Line-Size Cache Architecture for Merged DRAM/Logic LSIs

    Koji INOUE  Koji KAI  Kazuaki MURAKAMI  

     
    PAPER-Computer System Element

    Vol: E83-D  No: 5  Page(s): 1048-1057

    This paper proposes a novel cache architecture suitable for merged DRAM/logic LSIs, called the "dynamically variable line-size cache" (D-VLS cache). The D-VLS cache optimizes its line size according to the characteristics of programs and attempts to improve performance by appropriately exploiting the high on-chip memory bandwidth available on merged DRAM/logic LSIs. In our evaluation, a direct-mapped D-VLS cache improves the average memory-access time by about 20% compared to a conventional direct-mapped cache with fixed 32-byte lines. This improvement is better than that of a conventional direct-mapped cache of twice the size.

  • High Bandwidth, Variable Line-Size Cache Architecture for Merged DRAM/Logic LSIs

    Koji INOUE  Koji KAI  Kazuaki MURAKAMI  

     
    PAPER

    Vol: E81-C  No: 9  Page(s): 1438-1447

    Merged DRAM/logic LSIs can provide high on-chip memory bandwidth by interconnecting the logic portions and the DRAM with wider on-chip buses. For merged DRAM/logic LSIs whose memory hierarchy includes cache memory, we can exploit this bandwidth by replacing a whole cache line (or cache block) at a time on a cache miss. This approach tends to increase the cache-line size as we attempt to improve the attainable memory bandwidth. Larger cache lines, however, might worsen system performance if the programs running on the LSIs do not have enough spatial locality of reference and cache misses occur frequently. This paper describes a novel cache architecture suitable for merged DRAM/logic LSIs, called the variable line-size cache or VLS cache, that resolves this dilemma. The VLS cache makes good use of the high on-chip memory bandwidth by means of larger cache lines and, at the same time, alleviates the negative effects of a larger cache-line size by partitioning each large cache line into multiple sub-lines and allowing every sub-line to work as an independent cache line. The number of sub-lines involved in a cache replacement can be determined according to the characteristics of programs. This paper also evaluates the cost/performance improvements attainable by the VLS cache and compares them with those of conventional cache architectures. The results show that a VLS cache reduces the average memory-access time by 16.4% while increasing the hardware cost by only 13%, compared to a conventional direct-mapped cache with fixed 32-byte lines.
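The sub-line idea in the abstract above can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the authors' design: the class name `ToyVLSCache`, the helper `count_misses`, the geometry (4 sets, 4 sub-lines of 32 bytes each), and the aligned-group fill policy are all hypothetical choices made for illustration.

```python
class ToyVLSCache:
    """Toy direct-mapped cache: each large line is split into sub-lines,
    and every sub-line keeps its own tag (assumed model, not the paper's)."""

    def __init__(self, num_sets=4, sublines_per_line=4, subline_bytes=32):
        self.num_sets = num_sets
        self.n = sublines_per_line
        self.sub = subline_bytes
        # tags[s][k] is the tag held by sub-line k of set s (None = empty).
        self.tags = [[None] * self.n for _ in range(num_sets)]

    def access(self, addr, fill_sublines):
        """Return True on a hit. On a miss, fill an aligned group of
        `fill_sublines` adjacent sub-lines that contains the requested one."""
        if self.n % fill_sublines != 0:
            raise ValueError("fill_sublines must divide sublines_per_line")
        block = addr // self.sub        # sub-line-sized block number
        k = block % self.n              # sub-line slot inside the large line
        line = block // self.n          # large-line number
        s = line % self.num_sets        # direct-mapped set index
        tag = line // self.num_sets
        if self.tags[s][k] == tag:
            return True
        start = (k // fill_sublines) * fill_sublines
        for j in range(start, start + fill_sublines):
            self.tags[s][j] = tag       # overwrites whatever was there
        return False


def count_misses(trace, fill_sublines):
    """Run a list of byte addresses through a fresh cache, count misses."""
    cache = ToyVLSCache()
    return sum(0 if cache.access(a, fill_sublines) else 1 for a in trace)


sequential = [0, 32, 64, 96]            # high spatial locality
conflicting = [32, 512, 32, 512, 32]    # two lines fighting over one set

for name, trace in (("sequential", sequential), ("conflicting", conflicting)):
    print(name, {f: count_misses(trace, f) for f in (1, 4)})
```

In this toy model the sequential trace benefits from filling all four sub-lines at once, while the conflicting trace is better served by replacing a single sub-line, which is the trade-off the abstract describes.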

  • Analyzing and Reducing the Impact of Shorter Data Retention Time on the Performance of Merged DRAM/Logic LSIs

    Koji KAI  Akihiko INOUE  Taku OHSAWA  Kazuaki MURAKAMI  

     
    PAPER

    Vol: E81-C  No: 9  Page(s): 1448-1454

    In merged DRAM/logic LSIs, the DRAM portion could suffer from shorter data retention time because of heat and noise caused by the logic portion. To reconsider the DRAM data retention characteristics, this paper formulates and evaluates the performance degradation due to conflicts between normal DRAM accesses and refresh operations. It then proposes a new DRAM refresh architecture intended to reduce unnecessary refreshes. This architecture exploits multiple refresh periods: each row is refreshed with the most appropriate of these periods. Reducing the number of refreshes improves the accessibility of the DRAM. It is shown that the method reduces both the number of refreshes and the degree of performance degradation of the logic portion.
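A rough way to see why multiple refresh periods help is to count refreshes per row. The Python sketch below assumes that each row is assigned the longest available period that does not exceed its retention time; the function name `refresh_counts`, that selection rule, and all numeric values are hypothetical and are not taken from the paper.

```python
def refresh_counts(retention_ms, periods_ms, window_ms):
    """Compare refresh counts over one time window.

    Returns (single, multi): `single` refreshes every row at the shortest
    period, `multi` refreshes each row at the longest period that is still
    no longer than that row's retention time (assumed policy)."""
    shortest = min(periods_ms)
    single = 0
    multi = 0
    for retention in retention_ms:
        usable = [p for p in periods_ms if p <= retention]
        if not usable:
            raise ValueError("no refresh period is short enough for this row")
        single += window_ms // shortest
        multi += window_ms // max(usable)
    return single, multi


# Hypothetical retention times for four rows, in milliseconds.
rows = [16, 40, 70, 130]
print(refresh_counts(rows, periods_ms=[16, 32, 64, 128], window_ms=128))
# prints (32, 15)
```

With these made-up numbers, per-row periods cut the refresh count in the window from 32 to 15, because only the weakest row still needs the shortest period.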

  • Evaluating DRAM Refresh Architectures for Merged DRAM/Logic LSIs

    Taku OHSAWA  Koji KAI  Kazuaki MURAKAMI  

     
    PAPER

    Vol: E81-C  No: 9  Page(s): 1455-1462

    In merged DRAM/logic LSIs, it is necessary to reduce the number of DRAM refreshes because of higher heat dissipation caused by the logic portion on the same chip. To address this problem, we propose several DRAM refresh architectures. The basic idea is to eliminate unnecessary DRAM refreshes. In addition, we propose a method for reducing the number of DRAM refreshes by relocating data. To evaluate these architectures and this method, we have estimated the DRAM refresh count when executing benchmark programs under several models, each simulating one combination of them. As a result, the most effective combination achieves a reduction of more than 80% relative to a conventional DRAM refresh architecture for most of the benchmark programs. Even when normal DRAM accesses are taken into account, we obtain a reduction of more than 50% for several benchmarks.

  • A High-Performance/Low-Power On-Chip Memory-Path Architecture with Variable Cache-Line Size

    Koji INOUE  Koji KAI  Kazuaki MURAKAMI  

     
    PAPER

    Vol: E83-C  No: 11  Page(s): 1716-1723

    This paper proposes an on-chip memory-path architecture employing the dynamically variable line-size (D-VLS) cache for high performance and low energy consumption. The D-VLS cache exploits the high on-chip memory bandwidth attainable on merged DRAM/logic LSIs by replacing a whole large cache line in one cycle. At the same time, it attempts to avoid frequent evictions by decreasing the cache-line size when programs have poor spatial locality. Activating only on-chip DRAM subarrays corresponding to a replaced cache-line size produces a significant energy reduction. In our simulation, it is observed that our proposed on-chip memory-path architecture, which employs a direct-mapped D-VLS cache, improves the ED (Energy Delay) product by more than 75% over a conventional memory-path model.

  • Prospective Silicon Applications and Technologies in 2025 (Open Access)

    Koji KAI  Minoru FUJISHIMA  

     
    INVITED PAPER

    Vol: E94-C  No: 4  Page(s): 386-393

    Today, practical semiconductor products are an integral part of our lives and the infrastructure of society, and this trend will continue in the future. New areas of application will expand into medical, environmental, and agriculture (food)-related fields in addition to the conventional information and communication technology (ICT)-related field. Low-cost semiconductor devices with advanced functions have thus far been realized by miniaturization. However, we are now approaching the physical limit of miniaturization, and also, the investment required for new semiconductor manufacturing facilities has become huge. Under such circumstances, we propose an approach based on semiconductor devices called microcube chips and ideas of semiconductor development, i.e., agile integration and "inch-fab." Our approach is expected to contribute to expanding the range of companies that can fabricate semiconductor devices to include small-size companies, exploring new applications of semiconductor devices, and providing a wide variety of semiconductor devices at a low cost from the semiconductor industry.