
Keyword Search Result

[Keyword] prefetching (11 hits)

Hits 1-11 of 11
  • DCUIP Poisoning Attack in Intel x86 Processors

    Youngjoo SHIN  

     
    LETTER-Dependable Computing

    Publicized: 2021/05/13
    Vol: E104-D No:8
    Page(s): 1386-1390

    Cache prefetching brings huge performance benefits, but it comes at the cost of microarchitectural security. In this letter, we take a deep dive into the internal workings of the DCUIP prefetcher, one of the prefetchers in Intel processors. We discover that the DCUIP table is shared among different execution contexts in hyperthreading-enabled processors, which leads to another microarchitectural vulnerability. Exploiting this vulnerability, we propose a DCUIP poisoning attack. We demonstrate that an AES encryption key can be extracted from an AES-NI implementation by mounting the proposed attack.
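
    The letter gives no code, but the sharing problem it exploits can be shown with a toy model: a prediction table indexed by load-IP bits and shared between two hyperthreads lets one context plant predictions that the other consumes. The table layout, indexing, and attack flow below are hypothetical simplifications for illustration, not Intel's actual DCUIP design.

    ```python
    # Toy model of a prefetch table shared between two hardware threads.
    # Everything here (entry count, indexing, addresses) is hypothetical.
    TABLE_ENTRIES = 16

    class SharedPrefetchTable:
        """Direct-mapped table indexed by low bits of the load IP.
        Both hyperthreads index the same physical table, so one context
        can plant predictions that the other consumes."""
        def __init__(self):
            self.entries = {}  # table index -> predicted cache line

        def train(self, load_ip, next_line):
            self.entries[load_ip % TABLE_ENTRIES] = next_line

        def predict(self, load_ip):
            return self.entries.get(load_ip % TABLE_ENTRIES)

    table = SharedPrefetchTable()
    cache = set()

    # Attacker thread: trains an entry that aliases with the victim's
    # load IP, steering the prefetcher toward an attacker-chosen line.
    VICTIM_LOAD_IP = 0x401234
    attacker_ip = VICTIM_LOAD_IP + TABLE_ENTRIES  # same table index
    table.train(attacker_ip, next_line=0xDEAD)

    # Victim thread: its load now triggers a prefetch of the planted line.
    predicted = table.predict(VICTIM_LOAD_IP)
    if predicted is not None:
        cache.add(predicted)      # speculative fill, observable via timing

    # Attacker thread: probes the cache; a hit reveals the victim executed
    # the aliased load, a prime+probe style observation.
    print("planted line cached:", 0xDEAD in cache)
    ```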

  • Experimental Verification of SDN/NFV in Integrated mmWave Access and Mesh Backhaul Networks

    Makoto NAKAMURA  Hiroaki NISHIUCHI  Jin NAKAZATO  Konstantin KOSLOWSKI  Julian DAUBE  Ricardo SANTOS  Gia Khanh TRAN  Kei SAKAGUCHI  

     
    PAPER-Network

    Publicized: 2020/09/29
    Vol: E104-B No:3
    Page(s): 217-228

    In this paper, a Proof-of-Concept (PoC) architecture is constructed, and the effectiveness of a mmWave overlay heterogeneous network (HetNet) with a mesh backhaul using route-multiplexing, combined with Multi-access Edge Computing (MEC) using a prefetching algorithm, is verified by measuring the throughput and the download time of real content. The architecture can cope with intensive mobile data traffic because data delivery exploits multiple backhaul routes over the mesh topology, i.e., the route-multiplexing mechanism. In addition, MEC deploys to the network edge the content requested in advance by nearby User Equipment (UE), based on pre-registered context information such as location, destination, and demanded applications; this is the prefetching algorithm. Therefore, mmWave access can be fully exploited even with capacity-limited backhaul networks. These technologies solve the problems of conventional mmWave HetNets by reducing mobile data traffic on the backhaul networks toward cloud networks. In addition, the proposed architecture is realized by introducing a wireless Software Defined Network (SDN) and Network Function Virtualization (NFV). In our architecture, the network is dynamically controlled via wide-coverage microwave-band links, over which UE context information is collected to optimize network resources and to control the network infrastructure when establishing backhaul routes and MEC servers. In this paper, we develop the hardware equipment and middleware systems and implement these algorithms on top of an IEEE 802.11ad driver and open-source software. For 5G and beyond, an architecture integrating mmWave backhaul, MEC, and SDN/NFV will support a range of scenarios and use cases.
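
    The abstract describes the prefetching algorithm only in prose: content demanded by nearby UEs is pushed to the MEC server serving their registered destination before they arrive. Below is a minimal Python sketch of that decision, assuming hypothetical context records, server assignments, and content names; the paper's actual algorithm and data model are not given here.

    ```python
    # Pre-registered UE context: destination area and demanded application.
    # All names and data structures below are hypothetical.
    ue_context = {
        "ue-1": {"destination": "station_B", "app": "video"},
        "ue-2": {"destination": "station_A", "app": "map"},
    }

    # Which MEC server serves which area, and what content each app needs.
    mec_for_area = {"station_A": "mec-A", "station_B": "mec-B"}
    content_for_app = {"video": "popular_clips.bin", "map": "city_tiles.bin"}

    def prefetch_plan(contexts):
        """Return (mec_server, content) pairs to fetch in advance, so the
        mmWave access link is fed from the edge, not the backhaul."""
        plan = []
        for ue, ctx in contexts.items():
            server = mec_for_area[ctx["destination"]]
            plan.append((server, content_for_app[ctx["app"]]))
        return plan

    print(prefetch_plan(ue_context))
    # [('mec-B', 'popular_clips.bin'), ('mec-A', 'city_tiles.bin')]
    ```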

  • PMOP: Efficient Per-Page Most-Offset Prefetcher

    Kanghee KIM  Wooseok LEE  Sangbang CHOI  

     
    PAPER-Computer System

    Publicized: 2019/04/12
    Vol: E102-D No:7
    Page(s): 1271-1279

    Hardware prefetching involves a sophisticated balance between accuracy, coverage, and timeliness while minimizing hardware cost. Recent prefetchers have achieved these goals, but they still require complex hardware and a significant amount of storage. In this paper, we propose an efficient Per-page Most-Offset Prefetcher (PMOP) that minimizes hardware cost and simultaneously improves accuracy while maintaining coverage and timeliness. We achieve these objectives using an enhanced offset prefetcher that performs well with a reasonable hardware cost. Our approach first addresses coverage and timeliness by allowing multiple Most-Offset predictions. To minimize offset interference between pages, the PMOP leverages a fine-grain per-page offset filter. This filter records the access history with page-IDs, which enables efficient mapping and tracking of multiple offset streams from diverse pages. Analysis results show that PMOP outperforms the state-of-the-art Signature Path Prefetcher while reducing storage overhead by a factor of 3.4.
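
    As a rough illustration of the per-page most-offset idea, the Python sketch below keeps the last offset seen per page-ID, scores the offset deltas observed within each page, and issues prefetches for the top-scoring offsets. The table sizes, scoring rule, and prefetch degree are illustrative assumptions, not the parameters evaluated in the paper.

    ```python
    from collections import defaultdict

    PAGE_SIZE = 64          # cache lines per page (assumption)
    DEGREE = 2              # number of most-offset predictions issued

    class PMOPSketch:
        def __init__(self):
            self.last_offset = {}                 # page-ID -> last offset
            self.offset_score = defaultdict(int)  # candidate offset -> score

        def access(self, addr):
            page, off = divmod(addr, PAGE_SIZE)
            prev = self.last_offset.get(page)
            if prev is not None:
                # Per-page filtering: only deltas within one page train the
                # scores, so streams from different pages do not interfere.
                self.offset_score[off - prev] += 1
            self.last_offset[page] = off
            # Issue prefetches for the DEGREE best offsets seen so far.
            best = sorted(self.offset_score, key=self.offset_score.get,
                          reverse=True)[:DEGREE]
            return [addr + d for d in best if 0 <= off + d < PAGE_SIZE]

    pf = PMOPSketch()
    for a in [0, 3, 6, 9, 12]:          # a stride-3 stream in page 0
        prefetches = pf.access(a)
    print(prefetches)                   # [15]: offset +3 dominates
    ```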

  • Application Prefetcher Design Using both I/O Reordering and I/O Interleaving

    Yongsoo JOO  Sangsoo PARK  Hyokyung BAHN  

     
    LETTER-Computer System

    Publicized: 2015/08/20
    Vol: E98-D No:12
    Page(s): 2317-2321

    Application prefetchers improve application launch performance on HDDs through either I/O reordering or I/O interleaving, but there has been no proposal to combine the two techniques. We present a new algorithm to combine both approaches, and demonstrate that it reduces cold start launch time by 50%.
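
    A minimal sketch of the combination, under stated assumptions: prefetch requests recorded in a launch trace are sorted by block address (I/O reordering) and issued on a separate thread concurrently with the application's startup work (I/O interleaving). The trace format and threading scheme are illustrative, not the letter's actual algorithm.

    ```python
    import threading

    # Recorded launch trace: blocks in the order the app requests them.
    launch_trace = [880, 120, 560, 130, 890, 125]

    def prefetcher(trace, cache):
        # Reordering: issue the recorded blocks in ascending address order,
        # approximating a single sweep of the disk head on an HDD.
        for block in sorted(trace):
            cache.add(block)       # stands in for a read into the page cache

    def application(trace, cache):
        for block in trace:
            hit = block in cache   # interleaving hides misses behind CPU work
            print(f"block {block}: {'hit' if hit else 'miss'}")

    cache = set()
    # Interleaving: the prefetch thread runs concurrently with app startup
    # instead of strictly before it, overlapping cold-start I/O with CPU.
    # (Hit/miss output depends on timing; this is only an illustration.)
    t = threading.Thread(target=prefetcher, args=(launch_trace, cache))
    t.start()
    application(launch_trace, cache)
    t.join()
    ```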

  • An Asynchronous Striping-Aware Readahead Framework for Disk Arrays in Linux

    Sung Hoon BAEK  

     
    PAPER-Software System

    Vol: E96-D No:1
    Page(s): 19-27

    Disk arrays and prefetching schemes are used to mitigate the performance gap between main memory and disks. This paper presents a new problem that arises when the prefetching schemes widely used in operating systems are applied to disk arrays. The key point is that the block address space is contiguous from the viewpoint of the host but discontiguous from that of the disk array, so more disk accesses than expected are required. This paper presents two ways to resolve this problem in the Linux readahead framework. The proposed scheme prevents a readahead window from being split into multiple requests from the viewpoint of the disk array, though not from the viewpoint of the host, thereby reducing disk head movements. In addition, it outperforms the prior work by adopting an asynchronous solution, improving performance for fragmented files, eliminating the readahead size restriction, and improving disk parallelism. We implemented the proposed scheme and integrated it with Linux. Our experiments show that the solution significantly improves the original Linux readahead framework when a storage server processes multiple concurrent requests.
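
    The core idea, sizing and aligning readahead windows so that a host-contiguous range does not split into partial requests on the member disks, can be sketched as follows. The stripe geometry and the trim-to-stripe-boundary rule are assumptions for illustration; the paper's framework additionally handles asynchrony and fragmented files.

    ```python
    STRIPE_UNIT = 64        # blocks per disk chunk (assumption)
    NUM_DISKS = 4
    STRIPE_SIZE = STRIPE_UNIT * NUM_DISKS   # blocks per full stripe

    def align_readahead(start, want):
        """Clamp a readahead window [start, start+want) so it ends on a
        full-stripe boundary, avoiding a request that straddles stripes."""
        end = start + want
        aligned_end = (end // STRIPE_SIZE) * STRIPE_SIZE
        if aligned_end <= start:        # window smaller than one stripe:
            return start, end           # leave it alone
        return start, aligned_end

    def disk_requests(start, end):
        """Map a host-contiguous range to per-disk (disk, stripe) requests."""
        reqs = set()
        for block in range(start, end):
            stripe, off = divmod(block, STRIPE_SIZE)
            reqs.add((off // STRIPE_UNIT, stripe))   # (disk, stripe row)
        return reqs

    naive = disk_requests(0, 300)          # unaligned 300-block window
    s, e = align_readahead(0, 300)         # trimmed to 256 blocks
    print(len(naive), "disk requests naive,",
          len(disk_requests(s, e)), "aligned")   # 5 naive, 4 aligned
    ```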

  • RPP: Reference Pattern Based Kernel Prefetching Controller

    Hyo J. LEE  In Hwan DOH  Eunsam KIM  Sam H. NOH  

     
    LETTER-System Programs

    Vol: E92-D No:12
    Page(s): 2512-2515

    Conventional kernel prefetching schemes have focused on taking advantage of sequential access patterns, which are easy to detect. However, on random and even sequential references, they may cause performance degradation due to inaccurate pattern prediction and overshooting. To address these problems, we propose a novel approach that works with existing kernel prefetching schemes, called Reference Pattern based kernel Prefetching (RPP). RPP reduces the negative effects of existing schemes by identifying one more reference pattern, looping, in addition to the random and sequential patterns, and by delaying prefetching until a pattern is confirmed to be sequential or looping.
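
    A toy classifier in the spirit of RPP is sketched below: it watches a short window of block references, labels the stream sequential, looping, or random, and permits prefetching only once a sequential or looping pattern is confirmed. The window length and classification rules are illustrative assumptions.

    ```python
    from collections import deque

    class RPPSketch:
        def __init__(self, window=8):
            self.history = deque(maxlen=window)

        def classify(self):
            h = list(self.history)
            if len(h) < self.history.maxlen:
                return "undecided"        # delay prefetching until confirmed
            if all(b - a == 1 for a, b in zip(h, h[1:])):
                return "sequential"
            if len(set(h)) < len(h):      # revisits inside the window
                return "looping"
            return "random"

        def reference(self, block):
            self.history.append(block)
            pattern = self.classify()
            # Prefetch only for confirmed sequential/looping streams,
            # avoiding the mispredictions and overshooting described above.
            return pattern in ("sequential", "looping"), pattern

    rpp = RPPSketch()
    for blk in [5, 6, 7, 8, 9, 10, 11, 12, 13]:
        do_prefetch, pattern = rpp.reference(blk)
    print(pattern, "-> prefetch" if do_prefetch else "-> hold off")
    ```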

  • Improving the Performance of Linux Operating System via Buffer Cache Partitioning and Prefetching

    Heung Seok JEON  Sam H. NOH  

     
    PAPER-Software Systems

    Vol: E86-D No:3
    Page(s): 616-622

    Buffer caching is an integral part of the operating system. In this paper, we propose a scheme that integrates buffer cache management and prefetching via cache partitioning. The scheme, which we call SA-W2R, is simple to implement, making it a feasible solution for real systems. In its basic form it uses the LRU policy for buffer replacement, but its modular design allows any replacement policy to be incorporated. For prefetching, it uses the LRU-One Block Lookahead (LRU-OBL) approach, eliminating the extra burden that is generally necessary in other prefetching approaches. Implementation studies based on the GNU/Linux kernel version 2.2.14 show that SA-W2R performs better than the scheme currently used, with a maximum improvement of 23% for the workloads considered.
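
    The following Python sketch illustrates the partitioned design as the abstract describes it: a demand partition managed by LRU plus a prefetch partition filled by one block lookahead (on a reference to block b, fetch b+1). Partition sizes and the promotion rule are assumptions; the paper's W2R partitioning details are not reproduced here.

    ```python
    from collections import OrderedDict

    class SAW2RSketch:
        def __init__(self, demand_size=4, prefetch_size=2):
            self.demand = OrderedDict()     # LRU order: oldest entry first
            self.prefetch = OrderedDict()
            self.demand_size, self.prefetch_size = demand_size, prefetch_size

        def _put(self, part, size, block):
            part[block] = True
            part.move_to_end(block)
            if len(part) > size:
                part.popitem(last=False)    # evict this partition's LRU entry

        def reference(self, block):
            hit = block in self.demand or block in self.prefetch
            if block in self.prefetch:      # promote a useful prefetch
                del self.prefetch[block]
            self._put(self.demand, self.demand_size, block)
            # LRU-OBL: one block lookahead, no pattern detection needed.
            self._put(self.prefetch, self.prefetch_size, block + 1)
            return hit

    cache = SAW2RSketch()
    hits = [cache.reference(b) for b in [10, 11, 12, 13]]
    print(hits)   # [False, True, True, True]: OBL turns the scan into hits
    ```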

  • Dynamic File Prefetching Scheme Based on File Access Patterns in VIA-Based Parallel File System

    Yoon-Young LEE  Chei-Yol KIM  Dae-Wha SEO  

     
    PAPER-Computer Systems

    Vol: E85-D No:4
    Page(s): 714-721

    A parallel file system is normally used to support heavy file requests from parallel applications in a cluster system, and prefetching is useful for improving file system performance. This paper proposes a dynamic file prefetching scheme based on file access patterns, named the table-comparison prefetching policy, that is particularly suitable for parallel scientific applications and multimedia web services in a VIA-based parallel file system. VIA relieves the communication overhead of traditional protocols such as TCP/IP. The proposed policy introduces a table-comparison method to predict the data to prefetch. In addition, it includes an algorithm that determines whether and when to prefetch based on the currently available I/O bandwidth. Experimental results confirm that the proposed prefetching policy in a VIA-based parallel file system yields higher file system performance for various file access patterns.
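
    A toy version of the table-comparison idea: known access patterns are stored as tables, the observed prefix is compared against them, and on a match the remainder of the matching pattern is prefetched, gated by a bandwidth check. The pattern contents and the bandwidth rule below are illustrative assumptions, not the paper's algorithm.

    ```python
    stored_patterns = [
        [1, 2, 4, 8, 16, 32],      # e.g. a strided scientific access pattern
        [3, 5, 7, 9, 11, 13],
    ]

    def predict(current_prefix, patterns):
        """Return blocks to prefetch from the first stored table whose
        prefix matches the accesses observed so far."""
        n = len(current_prefix)
        for pat in patterns:
            if pat[:n] == current_prefix:
                return pat[n:]
        return []

    def should_prefetch(available_bw, needed_bw):
        # Prefetch only when the currently available I/O bandwidth covers it.
        return available_bw >= needed_bw

    prefix = [1, 2, 4]
    candidates = predict(prefix, stored_patterns)
    if should_prefetch(available_bw=100, needed_bw=10 * len(candidates)):
        print("prefetch:", candidates)      # prefetch: [8, 16, 32]
    ```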

  • Adaptive Stride Prefetching for the Secondary Data Cache of UMA and NUMA

    Ando KI  

     
    PAPER-Computer Systems

    Vol: E83-D No:2
    Page(s): 168-176

    Prefetching is a promising approach to the memory latency problem. Two basic variants of hardware data prefetching are sequential prefetching and stride prefetching. The latter, based on calculating the stride of future references, has the potential to outperform the former, which is based on data locality. In this paper, a typical stride prefetching scheme and an improved version, adaptive stride prefetching, are compared quantitatively by simulating several parallel benchmark programs on uniform memory access and non-uniform memory access architectures. The simulation results show that stride adaptability is essential, since the proposed adaptive scheme can reduce the pending stall time, which is large in the typical scheme.
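
    The textbook mechanism behind such schemes is a reference prediction table that tracks, per load instruction, the last address and stride, and confirms a stride before prefetching with it. The sketch below uses a simplified two-state confirmation rule; the paper's adaptive scheme is more elaborate.

    ```python
    class StridePrefetcher:
        def __init__(self):
            self.rpt = {}   # load PC -> [last_addr, stride, confirmed?]

        def access(self, pc, addr):
            entry = self.rpt.get(pc)
            if entry is None:
                self.rpt[pc] = [addr, 0, False]
                return None
            last, stride, confirmed = entry
            new_stride = addr - last
            if new_stride == stride and stride != 0:
                self.rpt[pc] = [addr, stride, True]
                return addr + stride      # confirmed: prefetch next element
            # Stride changed: retrain, and hold off prefetching until the
            # new stride repeats, avoiding useless fetches.
            self.rpt[pc] = [addr, new_stride, False]
            return None

    pf = StridePrefetcher()
    for a in [100, 108, 116, 124]:        # stride-8 stream from one load PC
        target = pf.access(pc=0x400, addr=a)
    print(hex(target))                    # 0x84: next prefetch is 124 + 8
    ```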

  • A Simple Hardware Prefetching Scheme Using Sequentiality for Shared-Memory Multiprocessors

    Myoung Kwon TCHEUN  Seung Ryoul MAENG  Jung Wan CHO  

     
    PAPER-Computer Hardware and Design

    Vol: E80-D No:11
    Page(s): 1055-1063

    To reduce memory access latency on shared-memory multiprocessors, several prefetching schemes have been proposed. The sequential prefetching scheme is a simple hardware-controlled scheme that exploits the sequentiality of memory accesses to predict which blocks will be read in the near future. Aggressive sequential prefetching fetches many blocks on each miss to reduce the miss rate, which gives good performance for application programs with high sequentiality. Conservative sequential prefetching, in contrast, fetches only a few blocks on each miss to avoid prefetching useless blocks, and shows better performance than aggressive sequential prefetching for programs with low sequentiality. We analyze the relationship between the sequentiality of application programs and the effectiveness of sequential prefetching under various memory and network latencies, and propose a new adaptive sequential prefetching scheme. By simply adding a small table to the sequential prefetching scheme, the proposed scheme prefetches many blocks for programs with high sequentiality, reducing miss rates significantly, and prefetches few blocks for programs with low sequentiality, avoiding the loading of useless blocks.
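
    The adaptive idea can be sketched as a sequential prefetcher whose degree (blocks fetched per miss) grows when prefetched blocks are actually used and shrinks when they are not, so highly sequential programs get aggressive prefetching and others do not. The doubling/halving rule and bounds below are illustrative assumptions, not the paper's table-based mechanism.

    ```python
    class AdaptiveSequentialPrefetcher:
        def __init__(self):
            self.degree = 1                    # blocks prefetched per miss
            self.pending = set()               # prefetched, not yet referenced

        def miss(self, block):
            fetched = {block + i for i in range(1, self.degree + 1)}
            self.pending |= fetched
            return fetched

        def reference(self, block):
            if block in self.pending:          # a prefetch proved useful
                self.pending.discard(block)
                self.degree = min(self.degree * 2, 16)
            elif self.pending:                 # prefetches going unused
                self.degree = max(self.degree // 2, 1)

    pf = AdaptiveSequentialPrefetcher()
    block = 0
    for _ in range(4):                         # a perfectly sequential stream
        pf.miss(block)
        block += 1
        pf.reference(block)
    print("degree after sequential phase:", pf.degree)   # ramps up to 16
    ```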

  • High Speed DRAMs with Innovative Architectures

    Shigeo OHSHIMA  Tohru FURUYAMA  

     
    INVITED PAPER-DRAM

    Vol: E77-C No:8
    Page(s): 1303-1315

    Newly developed high-speed DRAMs are introduced and their innovative circuit techniques for achieving high data bandwidth are described: the synchronous DRAM, the cache DRAM, and the Rambus DRAM. They are all designed to fill the performance gap between MPUs and the main memory of computer systems, which is expected to widen in the '90s. Although these high-speed DRAMs share the goal of increasing data bandwidth, their approaches differ, which in turn leads to different advantages, disadvantages, and fields of application. The paper is intended not only to give a technical overview but also to serve as a guide for DRAM users choosing the device that best fits their systems.