Shu TAJIMA Yusuke KAMEDA Ichiro MATSUDA Susumu ITOH
This paper proposes an efficient lossless coding scheme for color video in RGB 4:4:4 format. For the R signal, which is encoded before the other signals in each frame, we employ a block-adaptive prediction technique originally developed for monochrome video. The prediction technique used for the remaining G and B signals is extended to exploit inter-color correlations as well as inter- and intra-frame correlations. In both cases, multiple predictors are adaptively selected on a block-by-block basis. To design a set of predictors well suited to the local properties of video signals, we also explore an appropriate setting for the spatiotemporal partitioning of a video volume.
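The block-by-block predictor selection described above can be illustrated with a minimal sketch. It is not the authors' method: the three fixed intra-frame predictors, the sum-of-absolute-errors cost, and the names `PREDICTORS` and `select_predictors` are all assumptions made for illustration, and the inter-frame and inter-color terms are omitted.

```python
# Hypothetical illustration of block-adaptive predictor selection.
# The predictor set and the cost measure are assumptions, not the
# predictors designed in the paper.

def predict_left(img, y, x):
    # Use the pixel to the left; fall back to 0 at the image border.
    return img[y][x - 1] if x > 0 else 0

def predict_up(img, y, x):
    # Use the pixel above; fall back to 0 at the image border.
    return img[y - 1][x] if y > 0 else 0

def predict_avg(img, y, x):
    # Integer average of the left and upper predictions.
    return (predict_left(img, y, x) + predict_up(img, y, x)) // 2

PREDICTORS = [predict_left, predict_up, predict_avg]

def select_predictors(img, block):
    """Map each block (row, col) to the index of the predictor
    with the smallest sum of absolute prediction errors."""
    h, w = len(img), len(img[0])
    choice = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            costs = []
            for predictor in PREDICTORS:
                cost = sum(
                    abs(img[y][x] - predictor(img, y, x))
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))
                )
                costs.append(cost)
            # Ties resolve to the lowest predictor index.
            choice[(by // block, bx // block)] = costs.index(min(costs))
    return choice
```

In a real lossless coder the chosen index would be sent as side information for each block, and the prediction errors would then be entropy coded.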
Farhad MEHDIPOUR Hamid NOORI Morteza SAHEB ZAMANI Koji INOUE Kazuaki MURAKAMI
Extracting the frequently executed (hot) portions of an application and executing their corresponding data flow graphs (DFGs) on a hardware accelerator yields speedup and energy savings for embedded systems that comprise a base processor integrated with a tightly coupled accelerator. Extending DFGs to support control instructions, that is, using Control DFGs (CDFGs) instead of DFGs, increases the portion of application code that can be accelerated and hence yields further speedup and energy savings. In this paper, the motivations for extending DFGs to CDFGs and handling control instructions are introduced. In addition, basic requirements for an accelerator with conditional execution support are proposed. Then, two algorithms are presented for temporal partitioning of CDFGs under the architectural constraints of the target accelerator. To demonstrate the effectiveness of the proposed ideas, they are applied to the accelerator of a reconfigurable processor called AMBER. Experimental results confirm the remarkable effectiveness of covering control instructions and using CDFGs rather than DFGs in terms of performance and energy reduction.
Jinhwan KIM Jeonghun CHO Tag Gon KIM
Many dynamically reconfigurable architectures have recently been introduced to fill the gap between ASICs and software-programmed processors such as GPPs and DSPs. These reconfigurable architectures have been shown to achieve higher performance than software-programmed processors. However, they suffer from significant reconfiguration overhead and limited speedup. Reducing the reconfiguration overhead improves the overall performance of reconfigurable architectures. We therefore describe a temporal partitioning methodology that amortizes the reconfiguration overhead at the synthesis phase or at compilation time by splitting a configuration context into temporal partitions. We then present benchmark results to demonstrate the effectiveness of our methodology.
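As a rough illustration of what temporal partitioning involves, the following sketch greedily splits a dependency graph into sequential partitions that each fit within a resource budget. It is a generic textbook-style approach, not the methodology of the abstract above: the function name `temporal_partition`, the per-node `area` cost, and the single `capacity` limit are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def temporal_partition(deps, area, capacity):
    """Greedy sketch of temporal partitioning (hypothetical helper).

    deps:     node -> set of predecessor nodes
    area:     node -> resource cost of the node (assumed unit)
    capacity: resource budget of one configuration (assumed limit)
    """
    # Visiting nodes in topological order guarantees that every
    # predecessor lands in the same or an earlier partition.
    order = list(TopologicalSorter(deps).static_order())
    partitions, current, used = [], [], 0
    for node in order:
        if area[node] > capacity:
            raise ValueError(f"node {node!r} does not fit in one partition")
        if used + area[node] > capacity:
            # Close the current partition; this is a reconfiguration point.
            partitions.append(current)
            current, used = [], 0
        current.append(node)
        used += area[node]
    if current:
        partitions.append(current)
    return partitions
```

Each partition boundary corresponds to one reconfiguration, so fewer and fuller partitions mean fewer reconfigurations. A practical method would also weigh the data transferred between partitions, which this sketch ignores.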