The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] GPU computing(5hit)

1-5hit
  • Accelerating Stochastic Simulations on GPUs Using OpenCL

    Pilsung KANG  

     
    LETTER-Fundamentals of Information Systems

      Pubricized:
    2019/07/23
      Vol:
    E102-D No:11
      Page(s):
    2253-2256

    Since first introduced in 2008 with the 1.0 specification, OpenCL has steadily evolved over the decade to increase its support for heterogeneous parallel systems. In this paper, we accelerate stochastic simulation of biochemical reaction networks on modern GPUs (graphics processing units) by means of the OpenCL programming language. In implementing the OpenCL version of the stochastic simulation algorithm, we carefully apply its data-parallel execution model to optimize the performance provided by the underlying hardware parallelism of the modern GPUs. To evaluate our OpenCL implementation of the stochastic simulation algorithm, we perform a comparative analysis in terms of the performance using the CPU-based cluster implementation and the NVidia CUDA implementation. In addition to the initial report on the performance of OpenCL on GPUs, we also discuss applicability and programmability of OpenCL in the context of GPU-based scientific computing.

  • OpenACC Parallelization of Stochastic Simulations on GPUs

    Pilsung KANG  

     
    LETTER-Fundamentals of Information Systems

      Pubricized:
    2019/05/17
      Vol:
    E102-D No:8
      Page(s):
    1565-1568

    We present an OpenACC-based parallelization implementation of stochastic algorithms for simulating biochemical reaction networks on modern GPUs (graphics processing units). To investigate the effectiveness of using OpenACC for leveraging the massive hardware parallelism of the GPU architecture, we carefully apply OpenACC's language constructs and mechanisms to implementing a parallel version of stochastic simulation algorithms on the GPU. Using our OpenACC implementation in comparison to both the NVidia CUDA and the CPU-based implementations, we report our initial experiences on OpenACC's performance and programming productivity in the context of GPU-accelerated scientific computing.

  • GPU-Accelerated Stochastic Simulation of Biochemical Networks

    Pilsung KANG  

     
    LETTER-Fundamentals of Information Systems

      Pubricized:
    2017/12/20
      Vol:
    E101-D No:3
      Page(s):
    786-790

    We present a GPU (graphics processing unit) accelerated stochastic algorithm implementation for simulating biochemical reaction networks using the latest NVidia architecture. To effectively utilize the massive parallelism offered by the NVidia Pascal hardware, we apply a set of performance tuning methods and guidelines such as exploiting the architecture's memory hierarchy in our algorithm implementation. Based on our experimentation results as well as comparative analysis using CPU-based implementations, we report our initial experiences on the performance of modern GPUs in the context of scientific computing.

  • Parallel Dynamic Cloud Rendering Method Based on Physical Cellular Automata Model

    Liqiang ZHANG  Chao LI  Haoliang SUN  Changwen ZHENG  Pin LV  

     
    PAPER-Parallel and Distributed Computing

      Vol:
    E95-D No:12
      Page(s):
    2750-2758

    Due to the complicated composition of cloud and its disordered transformation, the rendering of cloud does not perfectly meet actual prospect by current methods. Based on physical characteristics of cloud, a physical cellular automata model of Dynamic cloud is designed according to intrinsic factor of cloud, which describes the rules of hydro-movement, deposition and accumulation and diffusion. Then a parallel computing architecture is designed to compute the large-scale data set required by the rendering of dynamical cloud, and a GPU-based ray-casting algorithm is implemented to render the cloud volume data. The experiment shows that cloud rendering method based on physical cellular automata model is very efficient and able to adequately exhibit the detail of cloud.

  • Parallel Implementation Strategy for CoHOG-Based Pedestrian Detection Using a Multi-Core Processor

    Ryusuke MIYAMOTO  Hiroki SUGANO  

     
    PAPER-Image Processing

      Vol:
    E94-A No:11
      Page(s):
    2315-2322

    Pedestrian detection from visual images, which is used for driver assistance or video surveillance, is a recent challenging problem. Co-occurrence histograms of oriented gradients (CoHOG) is a powerful feature descriptor for pedestrian detection and achieves the highest detection accuracy. However, its calculation cost is too large to calculate it in real-time on state-of-the-art processors. In this paper, to obtain optimal parallel implementation for an NVIDIA GPU, several kinds of parallelism of CoHOG-based detection are shown and evaluated suitability for implementation. The experimental result shows that the detection process can be performed at 16.5 fps in QVGA images on NVIDIA Tesla C1060 by optimized parallel implementation. By our evaluation, it is shown that the optimal strategy of parallel implementation for an NVIDIA GPU is different from that of FPGA. We discuss about the reason and show the advantages of each device. To show the scalability and portability of GPU implementation, the same object code is executed on other NVIDA GPUs. The experimental result shows that GTX570 can perform the CoHOG-based pedestiran detection 21.3 fps in QVGA images.