IEICE global.ieice.org Site

Author Search Result

[Author] Yuta UKON(3hit)

1-3hit

A Low Area Overhead Design Method for High-Performance General-Synchronous Circuits with Speculative Execution
Shimpei SATO Eijiro SASSA Yuta UKON Atsushi TAKAHASHI

PAPER

Vol:
E102-A No:12
Page(s):
1760-1769
In order to obtain high-performance circuits in advanced technology nodes, design methodology has to take the existence of large delay variations into account. Clock scheduling and speculative execution have overheads to realize them, but have potential to improve the performance by averaging the imbalance of maximum delay among paths and by utilizing valid data available earlier than worst-case scenarios, respectively. In this paper, we propose a high-performance digital circuit design method with speculative executions with less overhead by utilizing clock scheduling with delay insertions effectively. The necessity of speculations that cause overheads is effectively reduced by clock scheduling with delay insertion. Experiments show that a generated circuit achieves 26% performance improvement with 1.3% area overhead compared to a circuit without clock scheduling and without speculative execution.
Real-Time Image Processing Based on Service Function Chaining Using CPU-FPGA Architecture
Yuta UKON Koji YAMAZAKI Koyo NITTA

PAPER-Network System

Pubricized:
2019/08/05
Vol:
E103-B No:1
Page(s):
11-19
Advanced information-processing services based on cloud computing are in great demand. However, users want to be able to customize cloud services for their own purposes. To provide image-processing services that can be optimized for the purpose of each user, we propose a technique for chaining image-processing functions in a CPU-field programmable gate array (FPGA) coupled server architecture. One of the most important requirements for combining multiple image-processing functions on a network, is low latency in server nodes. However, large delay occurs in the conventional CPU-FPGA architecture due to the overheads of packet reordering for ensuring the correctness of image processing and data transfer between the CPU and FPGA at the application level. This paper presents a CPU-FPGA server architecture with a real-time packet reordering circuit for low-latency image processing. In order to confirm the efficiency of our idea, we evaluated the latency of histogram of oriented gradients (HOG) feature calculation as an offloaded image-processing function. The results show that the latency is about 26 times lower than that of the conventional CPU-FPGA architecture. Moreover, the throughput decreased by less than 3.7% under the worst-case condition where 90 percent of the packets are randomly swapped at a 40-Gbps input rate. Finally, we demonstrated that a real-time video monitoring service can be provided by combining image processing functions using our architecture.
Design Method of Variable-Latency Circuit with Tunable Approximate Completion-Detection Mechanism
Yuta UKON Shimpei SATO Atsushi TAKAHASHI

PAPER

Pubricized:
2020/12/21
Vol:
E104-C No:7
Page(s):
309-318
Advanced information-processing services such as computer vision require a high-performance digital circuit to perform high-load processing at high speed. To achieve high-speed processing, several image-processing applications use an approximate computing technique to reduce idle time of the circuit. However, it is difficult to design the high-speed image-processing circuit while controlling the error rate so as not to degrade service quality, and this technique is used for only a few applications. In this paper, we propose a method that achieves high-speed processing effectively in which processing time for each task is changed by roughly detecting its completion. Using this method, a high-speed processing circuit with a low error rate can be designed. The error rate is controllable, and a circuit design method to minimize the error rate is also presented in this paper. To confirm the effectiveness of our proposal, a ripple-carry adder (RCA), 2-dimensional discrete cosine transform (2D-DCT) circuit, and histogram of oriented gradients (HOG) feature calculation circuit are evaluated. Effective clock periods of these circuits obtained by our method with around 1% error rate are improved about 64%, 6%, and 12%, respectively, compared with circuits without error. Furthermore, the impact of the miscalculation on a video monitoring service using an object detection application is investigated. As a result, more than 99% of detection points required to be obtained are detected, and it is confirmed the miscalculation hardly degrades the service quality.

Author Search Result

[Author] Yuta UKON(3hit)

A Low Area Overhead Design Method for High-Performance General-Synchronous Circuits with Speculative Execution

Real-Time Image Processing Based on Service Function Chaining Using CPU-FPGA Architecture

Design Method of Variable-Latency Circuit with Tunable Approximate Completion-Detection Mechanism

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles