The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] graphics processing unit (GPU)(5hit)

1-5hit
  • Multiparallel MMT: Faster ISD Algorithm Solving High-Dimensional Syndrome Decoding Problem

    Shintaro NARISADA  Kazuhide FUKUSHIMA  Shinsaku KIYOMOTO  

     
    PAPER

      Pubricized:
    2022/11/09
      Vol:
    E106-A No:3
      Page(s):
    241-252

    The hardness of the syndrome decoding problem (SDP) is the primary evidence for the security of code-based cryptosystems, which are one of the finalists in a project to standardize post-quantum cryptography conducted by the U.S. National Institute of Standards and Technology (NIST-PQC). Information set decoding (ISD) is a general term for algorithms that solve SDP efficiently. In this paper, we conducted a concrete analysis of the time complexity of the latest ISD algorithms under the limitation of memory using the syndrome decoding estimator proposed by Esser et al. As a result, we present that theoretically nonoptimal ISDs, such as May-Meurer-Thomae (MMT) and May-Ozerov, have lower time complexity than other ISDs in some actual SDP instances. Based on these facts, we further studied the possibility of multiple parallelization for these ISDs and proposed the first GPU algorithm for MMT, the multiparallel MMT algorithm. In the experiments, we show that the multiparallel MMT algorithm is faster than existing ISD algorithms. In addition, we report the first successful attempts to solve the 510-, 530-, 540- and 550-dimensional SDP instances in the Decoding Challenge contest using the multiparallel MMT.

  • View Priority Based Threads Allocation and Binary Search Oriented Reweight for GPU Accelerated Real-Time 3D Ball Tracking

    Yilin HOU  Ziwei DENG  Xina CHENG  Takeshi IKENAGA  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2018/08/31
      Vol:
    E101-D No:12
      Page(s):
    3190-3198

    In real-time 3D ball tracking of sports analysis in computer vision technology, complex algorithms which assure the accuracy could be time-consuming. Particle filter based algorithm has a large potential to accelerate since the algorithm between particles has the chance to be paralleled in heterogeneous CPU-GPU platform. Still, with the target multi-view 3D ball tracking algorithm, challenges exist: 1) serial flowchart for each step in the algorithm; 2) repeated processing for multiple views' processing; 3) the low degree of parallelism in reweight and resampling steps for sequential processing. On the CPU-GPU platform, this paper proposes the double stream system flow, the view priority based threads allocation, and the binary search oriented reweight. Double stream system flow assigns tasks which there is no data dependency exists into different streams for each frame processing to achieve parallelism in system structure level. View priority based threads allocation manipulates threads in multi-view observation task. Threads number is view number multiplied by particles number, and with view priority assigning, which could help both memory accessing and computing achieving parallelism. Binary search oriented reweight reduces the time complexity by avoiding to generate cumulative distribution function and uses an unordered array to implement a binary search. The experiment is based on videos which record the final game of an official volleyball match (2014 Inter-High School Games of Men's Volleyball held in Tokyo Metropolitan Gymnasium in Aug. 2014) and the test sequences are taken by multiple-view system which is made of 4 cameras locating at the four corners of the court. The success rate achieves 99.23% which is the same as target algorithm while the time consumption has been accelerated from 75.1ms/frame in CPU environment to 3.05ms/frame in the proposed system which is 24.62 times speed up, also, it achieves 2.33 times speedup compared with basic GPU implemented work.

  • Parallelization of Dynamic Time Warping on a Heterogeneous Platform

    Yao ZHENG  Limin XIAO  Wenqi TANG  Lihong SHANG  Guangchao YAO  Li RUAN  

     
    LETTER-Algorithms and Data Structures

      Vol:
    E97-A No:11
      Page(s):
    2258-2262

    The dynamic time warping (DTW) algorithm is widely used to determine time series similarity search. As DTW has quadratic time complexity, the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms. In this paper, we present a parallel approach for DTW on a heterogeneous platform with a graphics processing unit (GPU). In order to exploit fine-grained data-level parallelism, we propose a specific parallel decomposition in DTW. Furthermore, we introduce an optimization technique called diamond tiling to improve the utilization of threads. Results show that our approach substantially reduces computational time.

  • NUFFT- & GPU-Based Fast Imaging of Vegetation

    Amedeo CAPOZZOLI  Claudio CURCIO  Antonio DI VICO  Angelo LISENO  

     
    PAPER-Sensing

      Vol:
    E94-B No:7
      Page(s):
    2092-2103

    We develop an effective algorithm, based on the filtered backprojection (FBP) approach, for the imaging of vegetation. Under the FBP scheme, the reconstruction amounts at a non-trivial Fourier inversion, since the data are Fourier samples arranged on a non-Cartesian grid. The computational issue is efficiently tackled by Non-Uniform Fast Fourier Transforms (NUFFTs), whose complexity grows asymptotically as that of a standard FFT. Furthermore, significant speed-ups, as compared to fast CPU implementations, are obtained by a parallel versions of the NUFFT algorithm, purposely designed to be run on Graphic Processing Units (GPUs) by using the CUDA language. The performance of the parallel algorithm has been assessed in comparison to a CPU-multicore accelerated, Matlab implementation of the same routine, to other CPU-multicore accelerated implementations based on standard FFT and employing linear, cubic, spline and sinc interpolations and to a different, parallel algorithm exploiting a parallel linear interpolation stage. The proposed approach has resulted the most computationally convenient. Furthermore, an indoor, polarimetric experimental setup is developed, capable to isolate and introduce, one at a time, different non-idealities of a real acquisition, as the sources (wind, rain) of temporal decorrelation. Experimental far-field polarimetric measurements on a thuja plicata (western redcedar) tree point out the performance of the set up algorithm, its robustness against data truncation and temporal decorrelation as well as the possibility of discriminating scatterers with different features within the investigated scene.

  • Media Processing LSI Architectures for Automotives -- Challenges and Future Trends --

    Ichiro KURODA  Shorin KYO  

     
    INVITED PAPER

      Vol:
    E90-C No:10
      Page(s):
    1850-1857

    This paper presents media processor architectures for automotive applications. Media processing applications with their requirements for LSI implementations are first described for vision based driver assistance as well as graphical user interface for car navigation using 3D graphics. Then, parallel processing architectures for vision and graphics in these applications are reviewed with their performance and cost. After that, future trends of automotive media processing such as integration of vision and 3D graphics functions are shown with their applications and the required performance. Moreover, parallel processing architectures are discussed for the integration of vision and graphics. Finally, an prospect of a next-generation media processing LSI for automotives is provided.