Object detection is one of the most important tasks in computer vision, and CNN-based object detection has yielded substantial results in a variety of fields. However, the fixed sampling in standard convolution layers restricts receptive fields to fixed locations and limits the ability of CNNs to handle geometric transformations. This leads to poor performance of CNNs in slender object detection. To achieve better accuracy and efficiency in slender object detection, the proposed detector, DFAM-DETR, not only adjusts the sampling points adaptively but also, through an attention mechanism, enhances the ability to focus on slender object features and extracts essential information from the image, from global to local. This study uses images of slender objects from the MS-COCO dataset. The experimental results show that DFAM-DETR achieves excellent detection performance on slender objects compared with CNN-based and transformer-based detectors.
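The contrast between fixed and adaptive sampling points can be illustrated with a minimal, standard-library sketch. It does not reproduce the DFAM-DETR architecture: the function names `bilinear_sample` and `deformable_conv_at` are hypothetical, and the offsets are set by hand here, whereas in a trained detector they would be predicted by the network.

```python
import math

def bilinear_sample(image, y, x):
    """Sample a 2D list `image` at fractional (y, x); out-of-bounds reads count as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * image[yy][xx]
    return value

def deformable_conv_at(image, kernel, cy, cx, offsets=None):
    """One output value of a 3x3 convolution centred at (cy, cx).

    `offsets` holds nine (dy, dx) pairs, one per kernel tap (assumed hand-set here).
    With offsets=None the taps sit on the usual fixed 3x3 grid.
    """
    taps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    if offsets is None:
        offsets = [(0.0, 0.0)] * 9
    out = 0.0
    for k, ((dy, dx), (oy, ox)) in enumerate(zip(taps, offsets)):
        out += kernel[k // 3][k % 3] * bilinear_sample(image, cy + dy + oy, cx + dx + ox)
    return out

# A 5x5 image containing a one-pixel-wide vertical line (a "slender object") in column 2.
image = [[1.0 if col == 2 else 0.0 for col in range(5)] for _ in range(5)]
kernel = [[1.0] * 3 for _ in range(3)]

# Hand-set offsets: pull the left and right tap columns onto the line and
# spread them along it, so all nine taps land on the object.
offsets = [(-1.0, 1.0) if dx == -1 else (1.0, -1.0) if dx == 1 else (0.0, 0.0)
           for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

fixed_response = deformable_conv_at(image, kernel, 2, 2)            # 3.0
adaptive_response = deformable_conv_at(image, kernel, 2, 2, offsets)  # 9.0
```

With the fixed grid only three of the nine taps fall on the line, while the shifted taps all do, which is the intuition behind adapting sampling points to elongated shapes.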
Yusuke HARA Xueting WANG Toshihiko YAMASAKI
Video inpainting is the task of filling in missing regions in videos. In this task, it is important to use information from other frames efficiently and to generate plausible results with sufficient temporal consistency. In this paper, we present a video inpainting method that jointly uses affine transformations and deformable convolutions for frame alignment. The former is responsible for rough, frame-scale alignment, and the latter performs fine, pixel-level alignment. Our model depends neither on 3D convolutions, which limit the temporal window, nor on troublesome flow estimation. The proposed method achieves improved object removal results and better PSNR and SSIM values than previous learning-based methods.
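For readers unfamiliar with the PSNR metric mentioned above, the standard definition, PSNR = 10 log10(MAX^2 / MSE), can be sketched as follows. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for illustration; the paper's own evaluation code is not shown in the abstract.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D lists of pixels.

    `max_value` is the assumed peak pixel value (255 for 8-bit images).
    """
    flat_ref = [p for row in reference for p in row]
    flat_res = [p for row in restored for p in row]
    if len(flat_ref) != len(flat_res):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_res)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images: no error to measure
    return 10 * math.log10(max_value ** 2 / mse)

# Toy example: a ground-truth frame and an inpainted frame that is off by 1 everywhere.
ground_truth = [[10, 20], [30, 40]]
inpainted = [[11, 21], [31, 41]]
score = psnr(ground_truth, inpainted)  # about 48.13 dB
```

Higher values indicate that the restored frame is closer to the ground truth; SSIM, the other metric cited, compares local structure rather than per-pixel error and is not covered by this sketch.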