The search functionality is under construction.

Author Search Result

[Author] Tsutomu MARUYAMA(4hit)

1-4hit
  • A Fast and Accurate FPGA System for Short Read Mapping Based on Parallel Comparison on Hash Table

    Yoko SOGABE  Tsutomu MARUYAMA  

     
    PAPER-Computer System

      Pubricized:
    2017/01/30
      Vol:
    E100-D No:5
      Page(s):
    1016-1025

    The purpose of DNA sequencing is to determine the order of nucleotides within a DNA molecule of target. The target DNA molecules are fragmented into short reads, which are short fixed-length subsequences composed of ‘A’, ‘C’, ‘G’ ‘T’, by next generation sequencing (NGS) machine. To reconstruct the target DNA from the short reads using a reference genome, which is a representative example of a species that was constructed in advance, it is necessary to determine their locations in the target DNA from where they have been extracted by aligning them onto the reference genome. This process is called short read mapping, and it is important to improve the performance of the short read mapping to realize fast DNA sequencing. We propose three types of FPGA acceleration methods based on hash table; (1) sorting and parallel comparison, (2) matching that allows one mutation to reduce the number of the candidates, (3) optimized hash function using variable masks. The first one reduces the number of accesses to off-chip memory to avoid the bottleneck by access latency. The second one enables to reduce the number of the candidates without degrading mapping sensitivity by allowing one mutation in the comparison. The last one reduces hash collisions using a table that was calculated from the reference genome in advance. We implemented the three methods on Xilinx Virtex-7 and evaluated them to show their effectiveness of them. In our experiments, our system achieves 20 fold of processing speed compared with BWA, which is one of the most popular mapping tools. Furthermore, we shows that the our system outperforms one of the fastest FPGA short read mapping systems.

  • Multiple Sequence Alignment Based on Dynamic Programming Using FPGA

    Shingo MASUNO  Tsutomu MARUYAMA  Yoshiki YAMAGUCHI  Akihiko KONAGAYA  

     
    PAPER-Reconfigurable System and Applications

      Vol:
    E90-D No:12
      Page(s):
    1939-1946

    Multiple sequence alignment problems in computational biology have been focused recently because of the rapid growth of sequence databases. By computing alignment, we can understand similarity among the sequences. Many hardware systems for alignment have been proposed to date, but most of them are designed for two-dimensional alignment (alignment between two sequences) because of the complexity to calculate alignment among more than two sequences under limited hardware resources. In this paper, we describe a compact system with an off-the-shelf FPGA board and a host computer for more than three-dimensional alignment based on dynamic programming. In our approach, high performance is achieved (1) by configuring optimal circuit for each dimensional alignment, and (2) by two phase search in each dimension by reconfiguration. In order to realize multidimensional search with a common architecture, two-dimensional dynamic programming is repeated along other dimensions. With this approach, we can minimize the size of units for alignment and achieve high parallelism. Our system with one XC2V6000 enables about 300-fold speedup as compared with single Intel Pentium4 2 GHz processor for four-dimensional alignment, and 100-fold speedup for five-dimensional alignment.

  • FPGA Hardware Acceleration of a Phylogenetic Tree Reconstruction with Maximum Parsimony Algorithm

    Henry BLOCK  Tsutomu MARUYAMA  

     
    PAPER-Computer System

      Pubricized:
    2016/11/14
      Vol:
    E100-D No:2
      Page(s):
    256-264

    In this paper, we present an FPGA hardware implementation for a phylogenetic tree reconstruction with a maximum parsimony algorithm. We base our approach on a particular stochastic local search algorithm that uses the Progressive Neighborhood and the Indirect Calculation of Tree Lengths method. This method is widely used for the acceleration of the phylogenetic tree reconstruction algorithm in software. In our implementation, we define a tree structure and accelerate the search by parallel and pipeline processing. We show results for eight real-world biological datasets. We compare execution times against our previous hardware approach, and TNT, the fastest available parsimony program, which is also accelerated by the Indirect Calculation of Tree Lengths method. Acceleration rates between 34 to 45 per rearrangement, and 2 to 6 for the whole search, are obtained against our previous hardware approach. Acceleration rates between 2 to 36 per rearrangement, and 18 to 112 for the whole search, are obtained against TNT.

  • An Approach for Solving SAT/MaxSAT-Encoded Formal Verification Problems on FPGA

    Kenji KANAZAWA  Tsutomu MARUYAMA  

     
    PAPER-Computer System

      Pubricized:
    2017/05/12
      Vol:
    E100-D No:8
      Page(s):
    1807-1818

    WalkSAT (WSAT) is one of the best performing stochastic local search algorithms for the Boolean Satisfiability (SAT) and the Maximum Boolean Satisfiability (MaxSAT). WSAT is very suitable for hardware acceleration because of its high inherent parallelism. Formal verification of digital circuits is one of the most important applications of SAT and MaxSAT. Structural knowledge such as logic gates and their dependencies can be derived from SAT/MaxSAT instances generated from formal verification of digital circuits. Such that knowledge is useful to solve these instances efficiently. In this paper, we first discuss a heuristic to utilize the structural knowledge for solving these problems by using WSAT. Then, we show its implementation on FPGA. The problem size of the formal verification is typically very large, and most data have to be placed in off-chip DRAMs. In this situation, the acceleration by FPGA is limited by the throughput and access latency of the DRAMs. In our implementation, data are carefully mapped on the on-chip memory banks and off-chip DRAMs so that most data in the off-chip DRAMs can be continuously accessed using burst-read. Furthermore, a variable-way cache memory comprised of the on-chip memory banks is used in order to hide the DRAM access latency by caching the head portion of the continuous read from the DRAMs and giving them to the circuit till the rest portion is started to be given by the burst-read. We evaluate the performance of our proposed method by changing configuration of the variable-way cache and the processing parallelism, and discuss how much acceleration can be achieved.