IEICE global.ieice.org Site

Author Search Result

[Author] Taito MANABE(6hit)


FPGA Implementation of a Real-Time Super-Resolution System Using Flips and an RNS-Based CNN
Taito MANABE Yuichiro SHIBATA Kiyoshi OGURI

PAPER

Vol: E101-A No: 12
Page(s): 2280-2289
Super-resolution technology is one of the solutions for filling the gap between high-resolution displays and lower-resolution images. There are various algorithms for interpolating the lost information, one of which uses a convolutional neural network (CNN). This paper presents an FPGA implementation and a performance evaluation of a novel CNN-based super-resolution system that can process moving images in real time. We apply horizontal and vertical flips to input images instead of enlargement. This flip method prevents information loss and enables the network to make the best use of its patch size. In addition, we adopt the residue number system (RNS) in the network to reduce FPGA resource utilization. Efficient multiplication and addition with LUTs increase the network scale that can be implemented on the same FPGA by approximately 54% compared to an implementation with fixed-point operations. The proposed system can perform super-resolution from 960×540 to 1920×1080 at 60 fps with a latency of less than 1 ms. Despite the resource restrictions of the FPGA, the system can generate clear super-resolution images with smooth edges. The evaluation results also reveal superior quality in terms of the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index compared to systems using other methods.
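As a rough illustration of why RNS arithmetic suits small lookup tables, the sketch below represents an integer by its residues modulo a few pairwise-coprime moduli, so that addition and multiplication operate on each small residue independently. It is not the authors' implementation: the moduli `(7, 11, 13, 15)`, the signed-range convention, and all function names are assumptions chosen for the example.

```python
from math import prod

# Hypothetical pairwise-coprime moduli (not taken from the paper).
# Their product M = 15015 fixes the representable range.
MODULI = (7, 11, 13, 15)


def to_rns(x, moduli=MODULI):
    """Encode an integer as a tuple of residues, one per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Channel-wise addition; no carries cross between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Channel-wise multiplication; each channel is a small table lookup in hardware."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(r, moduli=MODULI):
    """Decode with the Chinese remainder theorem into the signed range around zero."""
    M = prod(moduli)
    total = 0
    for ri, m in zip(r, moduli):
        Mi = M // m
        total += ri * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    x = total % M
    return x - M if x > M // 2 else x


def rns_dot(xs, ws):
    """Dot product computed entirely in RNS, decoded once at the end."""
    acc = to_rns(0)
    for x, w in zip(xs, ws):
        acc = rns_add(acc, rns_mul(to_rns(x), to_rns(w)))
    return from_rns(acc)
```

The result is exact only while the true value stays within about ±M/2, so a real design would have to size the moduli to the dynamic range of the network's accumulations.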
Real-Time Image-Based Vibration Extraction with Memory-Efficient Optical Flow and Block-Based Adaptive Filter
Taito MANABE Yuichiro SHIBATA

PAPER

Publicized: 2022/09/05
Vol: E106-A No: 3
Page(s): 504-513
In this paper, we propose a real-time vibration extraction system, which extracts a vibration component within a given frequency range from videos in real time, for realizing tremor suppression in microsurgery assistance systems. To overcome the problems of our previous system, which was based on the mean Lucas-Kanade (LK) optical flow of the whole frame, we introduce a new architecture combining dense optical flow calculated with simple feature matching and block-based band-pass filtering using the band-limited multiple Fourier linear combiner (BMFLC). As the feature for optical flow calculation, we use the simplified rotation-invariant histogram of oriented gradients (RIHOG) based on a gradient angle quantized to 1, 2, or 3 bits, which greatly reduces the memory resources required for the frame buffer. The obtained optical flow map is then divided into multiple blocks, and BMFLC is applied to the mean optical flow of each block independently. By using the L1-norm of the adaptive weight vectors in BMFLC as a criterion, blocks belonging to vibrating objects can be isolated from the background at low cost, leading to better extraction accuracy than the previous system. The whole system for 480p and 720p resolutions can be implemented on a single Xilinx Zynq-7000 XC7Z020 FPGA without any external memory, and can process a video stream supplied directly from a camera at 60 fps.
Pipelined ADPCM Compression for HDR Synthesis on an FPGA
Masahiro NISHIMURA Taito MANABE Yuichiro SHIBATA

PAPER-VLSI Design Technology and CAD

Publicized: 2023/08/31
Vol: E107-A No: 3
Page(s): 531-539
This paper presents an FPGA implementation of real-time high dynamic range (HDR) synthesis, which expresses a wide dynamic range by combining multiple images with different exposures using image pyramids. We have implemented a pipeline that performs streaming processing on images without using external memory. However, implementation for high-resolution images has been difficult due to large memory usage for line buffers. Therefore, we propose an image compression algorithm based on adaptive differential pulse code modulation (ADPCM). Compression modules based on the algorithm can be easily integrated into the pipeline. When the image resolution is 4K and the pyramid depth is 7, memory usage can be halved from 168.48% to 84.32% by introducing the compression modules, resulting in better quality.
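To make the idea of ADPCM concrete, here is a minimal generic sketch for one row of 8-bit pixels: each pixel is predicted from the previously reconstructed pixel, the prediction error is quantized to a 4-bit signed code, and the quantizer step adapts to the size of recent codes. This is not the algorithm proposed in the paper; the code width, initial state, and step-adaptation rule are all assumptions made for illustration.

```python
QMAX = 7          # assumed 4-bit signed codes in [-7, 7]
INIT_PRED = 128   # assumed initial prediction (mid-gray)
INIT_STEP = 4     # assumed initial quantizer step


def _clamp(v, lo, hi):
    return max(lo, min(hi, v))


def _next_step(step, code):
    """Adapt the step: grow after large codes, shrink after small ones (assumed rule)."""
    if abs(code) >= 6:
        step *= 2
    elif abs(code) <= 1:
        step //= 2
    return _clamp(step, 1, 32)


def adpcm_encode(pixels):
    """Encode 8-bit pixels into small signed codes."""
    pred, step = INIT_PRED, INIT_STEP
    codes = []
    for p in pixels:
        code = _clamp(round((p - pred) / step), -QMAX, QMAX)
        codes.append(code)
        # Track the decoder's reconstruction so both sides stay in sync.
        pred = _clamp(pred + code * step, 0, 255)
        step = _next_step(step, code)
    return codes


def adpcm_decode(codes):
    """Rebuild pixels by replaying the same predictor and step updates."""
    pred, step = INIT_PRED, INIT_STEP
    out = []
    for code in codes:
        pred = _clamp(pred + code * step, 0, 255)
        out.append(pred)
        step = _next_step(step, code)
    return out
```

Because the encoder updates its state from the reconstructed value rather than the original pixel, quantization error does not accumulate along the row. Storing 4-bit codes in place of 8-bit pixels halves the buffer size in this sketch, at the cost of lossy reconstruction on sharp edges.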
FPGA Implementation of a Stream-Based Real-Time Hardware Line Segment Detector
Taito MANABE Taichi KATAYAMA Yuichiro SHIBATA

PAPER

Publicized: 2021/09/02
Vol: E105-A No: 3
Page(s): 468-477
Line detection is a fundamental image processing technique with various applications in the field of computer vision. For example, the lane keeping required for autonomous vehicles can be implemented using line detection techniques. For such purposes, however, low detection latency and low power consumption are essential. Hardware-based stream processing is considered an effective way to achieve these properties since it eliminates the need to store the whole frame in energy-consuming external memory. In addition, adopting FPGAs enables us to retain the flexibility of software processing. The line segment detector (LSD) is an algorithm based on the intensity gradient, and performs better than the well-known Hough transform in terms of processing speed and accuracy. However, implementing the original LSD on FPGAs as a pipeline structure is difficult, mainly because of its iterative region growing approach. Therefore, we propose a simple and stream-friendly line segment detection algorithm based on the concept of LSD. The whole system is implemented on a Xilinx Zynq-7000 XC7Z020-1CLG400C FPGA without any external memory. Evaluation results reveal that the implemented system is able to detect line segments successfully and is compact, using 7.5% of Block RAM and less than 7.0% of the other resources, while maintaining 60 fps throughput for VGA videos. It is also shown that the system is power-efficient compared to software processing on CPUs.
A Hardware Oriented Approximate Convex Hull Algorithm and its FPGA Implementation (Open Access)
Tatsuma MORI Taito MANABE Yuichiro SHIBATA

PAPER

Publicized: 2021/09/02
Vol: E105-A No: 3
Page(s): 459-467
The convex hull is the minimum convex set surrounding a given set of points. Since the process of finding convex hulls has various practical application fields including embedded real-time systems, efficient acceleration of convex hull algorithms is an important problem in computational geometry. In this paper, we discuss an FPGA acceleration approach to address this problem. In order to compute the convex hull of an unsorted point set, it is necessary to store all the points during the computation, and thus the capacity of on-chip memory is likely to be a major constraint for efficient FPGA implementation. On the other hand, approximate convex hulls are often sufficient for practical applications. Therefore, we propose a hardware oriented approximate convex hull algorithm, which can process the input points as a stream without storing all the points in memory. We also propose some computation reduction techniques for efficient FPGA implementation. Then, we present an FPGA implementation of the proposed algorithm, which is parallelized in both the temporal and spatial domains, and evaluate its effectiveness in terms of performance and accuracy. As a result, we demonstrate 11 to 30 times faster performance compared to the widely used convex hull software library Qhull. In addition, accuracy assessment reveals that the maximum approximation error normalized to the diameters of the point sets is 0.038%, which is reasonably small for practical use cases.
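One well-known way to approximate a convex hull over a stream with constant memory is to keep only the most extreme point seen so far along each of k fixed directions, then compute an exact hull of those k candidates. The sketch below illustrates that generic technique; it is not necessarily the algorithm proposed in the paper, and the function names and the choice of k are assumptions.

```python
import math


def _cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Exact hull via Andrew's monotone chain, returned counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def approx_hull(stream, k=8):
    """Approximate hull of a point stream using O(k) memory.

    For each of k evenly spaced directions, remember only the point with the
    largest projection onto that direction; the input is never buffered.
    """
    dirs = [(math.cos(2 * math.pi * i / k), math.sin(2 * math.pi * i / k))
            for i in range(k)]
    best = [-math.inf] * k
    cand = [None] * k
    for x, y in stream:
        for i, (dx, dy) in enumerate(dirs):
            proj = x * dx + y * dy
            if proj > best[i]:
                best[i], cand[i] = proj, (x, y)
    return convex_hull([c for c in cand if c is not None])
```

Increasing k tightens the approximation at the cost of more comparisons per point; the k projections are independent, which is what makes this style of algorithm easy to parallelize in hardware.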
FPGA Implementation and Evaluation of a Real-Time Image-Based Vibration Detection System with Adaptive Filtering
Taito MANABE Kazuya UETSUHARA Akane TAHARA Yuichiro SHIBATA

PAPER

Vol: E103-A No: 12
Page(s): 1472-1480
This paper presents the design and implementation of an image-based vibration detection system on a field-programmable gate array (FPGA), aiming at application to tremor suppression for microsurgery assistance systems. The system can extract a vibration component within a user-specified frequency band from moving images in real time. For fast and robust detection, we employ a statistical approach using dense optical flow to derive the vibration component, and design custom hardware based on the Lucas-Kanade (LK) method to compute the optical flow. For band-pass filtering without phase delay, we implement the band-limited multiple Fourier linear combiner (BMFLC), a kind of adaptive band-pass filter that recomposes an input signal as a mixture of sinusoidal signals with multiple frequencies within the specified band. The whole system is implemented as a deep pipeline on a Xilinx Kintex-7 XC7K325T FPGA without using any external memory. We employ fixed-point arithmetic to reduce resource utilization while maintaining accuracy close to that of double-precision floating-point arithmetic. Empirical experiments reveal that the proposed system extracts a high-frequency tremor component from hand motions, with intentional low-frequency motions successfully filtered out. The system can process VGA moving images at 60 fps, with a delay of less than 1 µs for the BMFLC, suggesting the effectiveness of the deep pipelined architecture. In addition, we are planning to integrate a CNN-based segmentation system to improve detection accuracy, and show preliminary software evaluation results.
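The BMFLC described above can be sketched in a few lines of floating-point software: the input is modeled as a weighted sum of sines and cosines at evenly spaced frequencies inside the band, and the weights are adapted sample by sample with the least-mean-squares (LMS) rule. This is a generic textbook-style sketch, not the authors' fixed-point hardware design; the function name, frequency spacing `df`, and step size `mu` are assumptions.

```python
import math


def bmflc_filter(signal, fs, f_lo, f_hi, df=1.0, mu=0.05):
    """Estimate the component of `signal` lying in [f_lo, f_hi] Hz.

    fs is the sampling rate in Hz. Returns (estimates, final_weights).
    With n frequencies the reference vector has squared norm n, so the
    LMS update is stable only if mu < 1 / n (assumed here: n = 5, mu = 0.05).
    """
    n = int(round((f_hi - f_lo) / df)) + 1
    freqs = [f_lo + i * df for i in range(n)]
    w = [0.0] * (2 * n)  # n sine weights followed by n cosine weights
    estimates = []
    for k, s in enumerate(signal):
        t = k / fs
        x = ([math.sin(2 * math.pi * f * t) for f in freqs]
             + [math.cos(2 * math.pi * f * t) for f in freqs])
        y = sum(wi * xi for wi, xi in zip(w, x))  # band-limited estimate
        e = s - y                                  # estimation error
        w = [wi + 2 * mu * e * xi for wi, xi in zip(w, x)]  # LMS update
        estimates.append(y)
    return estimates, w
```

Because the estimate is rebuilt from references that are phase-locked to the current sample time, the output tracks the in-band component without the group delay of a conventional band-pass filter. Components outside the band, such as slow intentional motion, cannot be represented by the reference sinusoids and are left in the error term.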
