1-2hit |
Sota MORIYAMA Koichi ICHIGE Yuichi HORI Masayuki TACHI
In this paper, we propose a method for video reflection removal using a video restoration framework with enhanced deformable networks (EDVR). We examine the effect of each module in EDVR on video reflection removal and modify the models using 3D convolutions. The performance of each modified model is evaluated in terms of the RMSE between the structural similarity (SSIM) and the smoothed SSIM representing temporal consistency.
The performance of video action recognition has improved significantly in recent decades. Current recognition approaches mainly utilize convolutional neural networks to acquire video feature representations. In addition to the spatial information of video frames, temporal information such as motions and changes is important for recognizing videos. Therefore, the use of convolutions in a spatiotemporal three-dimensional (3D) space for representing spatiotemporal features has garnered significant attention. Herein, we introduce recent advances in 3D convolutions for video action recognition.