The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] parallel I/O(4hit)

1-4hit
  • Parallel Geospatial Raster Data I/O Using File View

    Wei XIONG  Ye WU  Luo CHEN  Ning JING  

     
    LETTER-Storage System

      Pubricized:
    2015/09/15
      Vol:
    E98-D No:12
      Page(s):
    2192-2195

    The challenges of providing a divide-and-conquer strategy for tackling large geospatial raster data input/output (I/O) are longstanding. Solutions need to change with advances in the technology and hardware. After analyzing the reason for the problems of traditional parallel raster I/O mode, a parallel I/O strategy using file view is proposed to solve these problems. Message Passing Interface I/O (MPI-IO) is used to implement this strategy. Experimental results show how a file view approach can be effectively married to General Parallel File System (GPFS). A suitable file view setting provides an efficient solution to parallel geospatial raster data I/O.

  • An Efficient I/O Aggregator Assignment Scheme for Multi-Core Cluster Systems

    Kwangho CHA  

     
    PAPER-Fundamentals of Information Systems

      Vol:
    E96-D No:2
      Page(s):
    259-269

    As the number of nodes in high-performance computing (HPC) systems increases, parallel I/O becomes an important issue: collective I/O is the specialized parallel I/O that provides the function of single-file based parallel I/O. Collective I/O in most message passing interface (MPI) libraries follows a two-phase I/O scheme in which the particular processes, namely I/O aggregators, perform important roles by engaging the communications and I/O operations. This approach, however, is based on a single-core architecture. Because modern HPC systems use multi-core computational nodes, the roles of I/O aggregators need to be re-evaluated. Although there have been many previous studies that have focused on the improvement of the performance of collective I/O, it is difficult to locate a study regarding the assignment scheme for I/O aggregators that considers multi-core architectures. In this research, it was discovered that the communication costs in collective I/O differed according to the placement of the I/O aggregators, where each node had multiple I/O aggregators. The performance with the two processor affinity rules was measured and the results demonstrated that the distributed affinity rule used to locate the I/O aggregators in different sockets was appropriate for collective I/O. Because there may be some applications that cannot use the distributed affinity rule, the collective I/O scheme was modified in order to guarantee the appropriate placement of the I/O aggregators for the accumulated affinity rule. The performance of the proposed scheme was examined using two Linux cluster systems, and the results demonstrated that the performance improvements were more clearly evident when the computational node of a given cluster system had a complicated architecture. Under the accumulated affinity rule, the performance improvements between the proposed scheme and the original MPI-IO were up to approximately 26.25% for the read operation and up to approximately 31.27% for the write operation.

  • An Asynchronous Striping-Aware Readahead Framework for Disk Arrays in Linux

    Sung Hoon BAEK  

     
    PAPER-Software System

      Vol:
    E96-D No:1
      Page(s):
    19-27

    Disk arrays and prefetching schemes are used to mitigate the performance gap between main memory and disks. This paper presents a new problem that arises if prefetching schemes that are widely used in operation systems are applied to disk arrays. The key point of the problem is that block address space from the viewpoint of the host is contiguous but from that of the disk array it is discontiguous and thus more disk accesses than expected are required. This paper presents two ways to resolve the problem that arises from the Linux readahead framework. The proposed scheme prevents a readahead window from being split into multiple requests from the viewpoint of the disk array but not from the viewpoint of the host thereby reducing disk head movements. In addition, it outperforms the prior work by adopting an asynchronous solution, improving performance for fragmented files, eliminating readahead size restriction, and improving disk parallelism. We implemented the proposed scheme and integrated it with Linux. Our experiment shows that the solution significantly improved the original Linux readahead framework when a storage server processes multiple concurrent requests.

  • Disk Allocation Methods Using Genetic Algorithm

    Dae-Young AHN  Kyu-Ho PARK  

     
    PAPER-Computer Systems

      Vol:
    E82-D No:1
      Page(s):
    291-300

    The disk allocation problem examined in this paper is finding a method to distribute a Binary Cartesian Product File on multiple disks to maximize parallel disk I/O accesses for partial match retrieval. This problem is known to be NP-hard, and heuristic approaches have been applied to obtain suboptimal solutions. Recently, efficient methods such as Binary Disk Modulo (BDM) and Error Correcting Code (ECC) methods have been proposed along with the restrictions that the number of disks in which files are stored should be a power of 2. In this paper, a new Disk Allocation method based on Genetic Algorithm (DAGA) is proposed. The DAGA does not place restrictions on the number of disks to be applied and it can allocate the disks adaptively by taking into account the data access patterns. Using the schema theory, it is proven that the DAGA can realize a near-optimal solution with high probability. Comparing the quality of solution derived by the DAGA with the General Disk Modulo (GDM), BDM, and ECC methods through the simulation, shows that 1) the DAGA is superior to the GDM method in all the cases and 2) with the restrictions being placed on the number of disks, the average response time of the DAGA is always less than that of the BDM method and greater than that of the ECC method in the absence of data skew and 3) when data skew is considered, the DAGA performs better than or equal to both BDM and ECC methods, even when restrictions on the number of disks are enforced.