
Keyword Search Result

[Keyword] depth(97hit)

1-20hit(97hit)

  • 2D Human Skeleton Action Recognition Based on Depth Estimation Open Access

    Lei WANG  Shanmin YANG  Jianwei ZHANG  Song GU  

     
    PAPER-Image Recognition, Computer Vision

  Publicized:
    2024/02/27
      Vol:
    E107-D No:7
      Page(s):
    869-877

Human action recognition (HAR) exhibits limited accuracy in video surveillance due to the 2D information captured with monocular cameras. To address this problem, a depth estimation-based human skeleton action recognition method (SARDE) is proposed in this study, with the aim of transforming 2D human action data into 3D format to uncover action clues hidden in the 2D data. SARDE comprises two tasks, i.e., human skeleton action recognition and monocular depth estimation. The two tasks are integrated in a multi-task manner with end-to-end training to fully exploit the correlation between action recognition and depth estimation: by sharing parameters, the network learns depth features effective for human action recognition. In this study, graph-structured networks with inception blocks and skip connections are investigated for depth estimation. The experimental results verify the effectiveness and superiority of the proposed method in skeleton action recognition, showing that it reaches state-of-the-art performance on the datasets.

  • Projection-Based Physical Adversarial Attack for Monocular Depth Estimation

    Renya DAIMO  Satoshi ONO  

     
    LETTER

  Publicized:
    2022/10/17
      Vol:
    E106-D No:1
      Page(s):
    31-35

Monocular depth estimation has improved drastically due to the development of deep neural networks (DNNs). However, recent studies have revealed that DNNs for monocular depth estimation contain vulnerabilities that can lead to misestimation when perturbations are added to the input. This study investigates whether DNNs for monocular depth estimation are vulnerable to misestimation when patterned light is projected onto an object using a video projector. To this end, this study proposes an evolutionary adversarial attack method with a multi-fidelity evaluation scheme that allows creating adversarial examples under black-box conditions while suppressing the computational cost. Experiments in both simulated and real scenes showed that the designed light pattern caused a DNN to misestimate objects as if they had moved to the back.
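The black-box attack loop described above can be sketched as a simple (1+1)-style evolutionary search. A toy stand-in objective replaces the depth-estimation DNN, the multi-fidelity evaluation is omitted, and all names and hyperparameters here are illustrative, not the paper's:

```python
import random

def evolve_pattern(score, dim=16, iters=200, sigma=0.1, seed=0):
    """Minimal (1+1)-style evolutionary search: mutate a candidate light
    pattern and keep the mutant when it raises the black-box score,
    e.g. the depth misestimation it induces. The paper's multi-fidelity
    evaluation scheme is omitted; this sketch queries at one fidelity."""
    rng = random.Random(seed)
    best = [rng.random() for _ in range(dim)]
    best_score = score(best)
    for _ in range(iters):
        # mutate every gene, clipping intensities to the projector range [0, 1]
        cand = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in best]
        s = score(cand)
        if s > best_score:  # keep mutants that fool the model more
            best, best_score = cand, s
    return best, best_score

# Toy stand-in for the black-box DNN error the attack maximizes:
toy_score = lambda p: sum(p) / len(p)
pattern, attacked = evolve_pattern(toy_score)
```

Because only the score of each candidate is used, the same loop applies to any black-box model; the real method adds cheaper low-fidelity evaluations to cut the query cost.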

  • Depth Image Noise Reduction and Super-Resolution by Pixel-Wise Multi-Frame Fusion

    Masahiro MURAYAMA  Toyohiro HIGASHIYAMA  Yuki HARAZONO  Hirotake ISHII  Hiroshi SHIMODA  Shinobu OKIDO  Yasuyoshi TARUTA  

     
    PAPER-Image Processing and Video Processing

  Publicized:
    2022/03/04
      Vol:
    E105-D No:6
      Page(s):
    1211-1224

High-quality depth images are required for stable and accurate computer vision. Depth images captured by depth cameras tend to be noisy, incomplete, and of low resolution. Therefore, increasing the accuracy and resolution of depth images is desirable. We propose a method for reducing noise and filling holes in depth images pixel by pixel, and for increasing their resolution. For each pixel in the target image, the linear space from the focal point of the camera through that pixel to the existing object is divided into equally spaced grids. For each grid, the distance to the object surface is obtained from multiple tracked depth images, whose pixels carry noisy depth values. The coordinates of the correct object surface are then obtained by reducing the random depth noise, and missing values are filled in. The resolution can also be increased by creating new pixels between existing pixels and then applying the same process as that used for noise reduction. Evaluation results have demonstrated that the proposed method requires less GPU memory than the conventional method, reduces noise more accurately, especially around edges, and preserves more object detail. The super-resolution of the proposed method also produced high-resolution depth images with smoother and more accurate edges than the conventional methods.
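The per-pixel fusion along the viewing ray can be sketched as a voting scheme over equally spaced bins. The bin count and depth range below are illustrative; the paper's grid spacing is not given in the abstract:

```python
def fuse_depth_samples(samples, near=0.5, far=5.0, n_bins=64):
    """Fuse noisy depth samples for one pixel: the ray through the pixel
    is divided into equally spaced bins, samples vote for bins, and the
    surface depth is the mean of the samples in the most-voted bin.
    Outliers land in sparse bins and are discarded; holes (None) are
    filled whenever other frames supply valid samples."""
    width = (far - near) / n_bins
    bins = {}
    for d in samples:
        if d is None or not (near <= d < far):  # hole or invalid reading
            continue
        bins.setdefault(int((d - near) / width), []).append(d)
    if not bins:
        return None  # the pixel stays a hole
    votes = max(bins.values(), key=len)
    return sum(votes) / len(votes)

# Noisy samples around a true surface at 2.0 m, with an outlier and a hole:
est = fuse_depth_samples([2.01, 1.98, 2.03, 4.7, None, 2.00])
```

Super-resolution follows the same path: a new pixel between existing ones defines a new ray, fused the same way from the tracked frames.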

  • Smaller Residual Network for Single Image Depth Estimation

    Andi HENDRA  Yasushi KANAZAWA  

     
    PAPER-Image Recognition, Computer Vision

  Publicized:
    2021/08/17
      Vol:
    E104-D No:11
      Page(s):
    1992-2001

We propose a new framework for estimating depth information from a single image. Our framework is relatively small and straightforward, employing a two-stage architecture: a residual network and a simple decoder network. The residual network is a remodeled version of the original ResNet-50 architecture, consisting of only thirty-eight convolution layers in the residual blocks followed by a pair of up-sampling layers. The simple decoder network, a stack of five convolution layers, accepts the initial depth and refines it into the final output depth. During training, we monitor the loss behavior and adjust the learning-rate hyperparameter to improve performance. Furthermore, instead of using a single common pixel-wise loss, we also compute losses based on gradient direction and structural similarity. This setting significantly reduces the number of network parameters while simultaneously producing a more accurate depth map. The performance of our approach has been evaluated through both quantitative and qualitative comparisons with several prior methods on the public NYU and KITTI datasets.
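The multi-term training objective can be sketched as pixel-wise L1 plus a gradient term; the structural-similarity term is omitted for brevity, and the 0.5 weight is an assumed value since the abstract does not give the exact formulation:

```python
def depth_loss(pred, gt, w_grad=0.5):
    """Combined loss sketch: pixel-wise L1 plus an L1 penalty on
    horizontal/vertical gradient differences, echoing the gradient-based
    terms used alongside the pixel-wise loss. Inputs are 2-D lists of
    depth values; w_grad is an assumed weight."""
    h, w = len(pred), len(pred[0])
    pix = sum(abs(pred[y][x] - gt[y][x]) for y in range(h) for x in range(w))
    grad = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal gradient difference
                grad += abs((pred[y][x+1] - pred[y][x]) - (gt[y][x+1] - gt[y][x]))
            if y + 1 < h:  # vertical gradient difference
                grad += abs((pred[y+1][x] - pred[y][x]) - (gt[y+1][x] - gt[y][x]))
    return (pix + w_grad * grad) / (h * w)

gt = [[0.0, 1.0], [2.0, 3.0]]
perfect = depth_loss(gt, gt)                         # identical maps
shifted = depth_loss([[0.5, 1.5], [2.5, 3.5]], gt)   # constant offset
```

Note that a constant depth offset is penalized only by the pixel term, since the gradients of the two maps agree.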

  • Effects of Initial Configuration on Attentive Tracking of Moving Objects Whose Depth in 3D Changes

    Anis Ur REHMAN  Ken KIHARA  Sakuichi OHTSUKA  

     
    PAPER-Vision

  Publicized:
    2021/02/25
      Vol:
    E104-A No:9
      Page(s):
    1339-1344

In daily life, people often pay attention to several objects that change position while being observed. In the laboratory, this process is investigated with a phenomenon known as multiple object tracking (MOT), a task that evaluates attentive tracking performance. Recent findings suggest that the attentional set for multiple moving objects whose depth changes in three dimensions from one plane to another is influenced by the initial configuration of the objects. When tracking objects, it is difficult for people to expand their attentional set to multiple depth planes once attention has been focused on a single plane. However, less is known about contracting the attentional set from multiple depth planes to a single plane. In two experiments, we examined tracking accuracy when four targets or four distractors, initially distributed on two planes, came together on one of the planes during an MOT task. The results suggest that people have difficulty changing the depth range of their attention during attentive tracking, and that attentive tracking performance depends on the initial attentional set formed by the configuration prior to tracking.

  • Simultaneous Attack on CNN-Based Monocular Depth Estimation and Optical Flow Estimation

    Koichiro YAMANAKA  Keita TAKAHASHI  Toshiaki FUJII  Ryutaroh MATSUMOTO  

     
    LETTER-Image Recognition, Computer Vision

  Publicized:
    2021/02/08
      Vol:
    E104-D No:5
      Page(s):
    785-788

    Thanks to the excellent learning capability of deep convolutional neural networks (CNNs), CNN-based methods have achieved great success in computer vision and image recognition tasks. However, it has turned out that these methods often have inherent vulnerabilities, which makes us cautious of the potential risks of using them for real-world applications such as autonomous driving. To reveal such vulnerabilities, we propose a method of simultaneously attacking monocular depth estimation and optical flow estimation, both of which are common artificial-intelligence-based tasks that are intensively investigated for autonomous driving scenarios. Our method can generate an adversarial patch that can fool CNN-based monocular depth estimation and optical flow estimation methods simultaneously by simply placing the patch in the input images. To the best of our knowledge, this is the first work to achieve simultaneous patch attacks on two or more CNNs developed for different tasks.

  • Depth Range Control in Visually Equivalent Light Field 3D Open Access

    Munekazu DATE  Shinya SHIMIZU  Hideaki KIMATA  Dan MIKAMI  Yoshinori KUSACHI  

     
    INVITED PAPER-Electronic Displays

  Publicized:
    2020/08/13
      Vol:
    E104-C No:2
      Page(s):
    52-58

3D video content depends on the shooting conditions, i.e., camera positioning. Depth range control in the post-processing stage is not easy, but it is essential because video from arbitrary camera positions must be generated. If light field information can be obtained, video from any viewpoint can be generated exactly, and such post-processing becomes possible. However, a light field holds a huge amount of data, and capturing one is not easy. To compress the data quantity, we proposed the visually equivalent light field (VELF), which exploits the characteristics of human vision. Though a number of cameras are needed, VELF can be captured with a camera array. Since camera interpolation uses linear blending, the calculation is simple enough that we can construct the ray distribution field of VELF by optical interpolation in the VELF3D display, which produces high image quality due to its high pixel-usage efficiency. In this paper, we summarize the relationship between the characteristics of human vision, VELF, and the VELF3D display. We then propose a method to control the depth range of the observed image on the VELF3D display and discuss the effectiveness and limitations of displaying the processed image on it. Our method can be applied to other 3D displays. Since the calculation is just weighted averaging, it is suitable for real-time applications.
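The linear blending used for camera interpolation, and by extension the weighted averaging behind the depth range control, can be sketched as a per-pixel weighted average between two neighboring camera images (the arrays and the parameter t are illustrative):

```python
def blend_views(left, right, t):
    """Linear blending between two neighboring camera images, the simple
    interpolation the VELF construction relies on; t in [0, 1] is the
    virtual viewpoint's position between the two cameras. Inputs are
    2-D lists of per-pixel intensities."""
    return [[(1.0 - t) * l + t * r for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]

# A virtual view halfway between two 1x2 images:
mid = blend_views([[0.0, 100.0]], [[100.0, 0.0]], 0.5)
```

Because each output pixel is just a weighted average, the same operation can be realized optically in the display, which is what keeps the approach real-time.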

  • A Simple Depth-Key-Based Image Composition Considering Object Movement in Depth Direction

    Mami NAGOYA  Tomoaki KIMURA  Hiroyuki TSUJI  

     
    LETTER-Computer Graphics

      Vol:
    E103-A No:12
      Page(s):
    1603-1608

A simple depth-key-based image composition method is proposed that uses two still images with depth information: a background and a foreground object. The proposed method can place the object at various locations in the background, accounting for depth in the 3D world coordinate system. Its main feature is a simple algorithm that achieves depthward movement within the camera plane without requiring awareness of the 3D world coordinate system. Two algorithms based on the pin-hole camera model are proposed (P-OMDD and O-OMDD). As an advantage, neither requires camera calibration before application. Since a single image is used to represent the object, each of the proposed methods has limitations in terms of the fidelity of the composite image: P-OMDD faithfully reproduces the angle at which the object is seen, but the pixels of the hidden surface are missing; conversely, O-OMDD avoids the hidden-surface problem, but the angle of the object is fixed wherever it moves. Several experiments verify that, when using O-OMDD, subjectively natural composite images can be obtained under any object movement, in terms of size and position in the camera plane. Future tasks include handling the change in illumination due to positional changes and the partial loss of objects due to noise in depth images.
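Under the pin-hole camera model, moving an object from depth z0 to z1 scales its projection by z0/z1 about the principal point, which is the geometric core of the depthward movement. A hypothetical sketch (the principal point values are illustrative defaults, not the paper's parameters):

```python
def move_in_depth(u, v, size, z0, z1, cx=320.0, cy=240.0):
    """Pin-hole camera sketch of depthward object movement: an object at
    image position (u, v) with projected size `size` at depth z0, moved
    to depth z1, reprojects scaled by z0/z1 about the principal point
    (cx, cy). Assumed principal point for a 640x480 image."""
    s = z0 / z1
    return cx + (u - cx) * s, cy + (v - cy) * s, size * s

# Doubling the depth halves both the offset from the image center and the size:
u, v, size = move_in_depth(420.0, 240.0, 50.0, 1.0, 2.0)
```

This is why the algorithms need no calibration beyond the pin-hole assumption: only the depth ratio appears, not the focal length itself.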

  • Multi-Layered DP Quantization Algorithm Open Access

    Yukihiro BANDOH  Seishi TAKAMURA  Hideaki KIMATA  

     
    PAPER-Image

      Vol:
    E103-A No:12
      Page(s):
    1552-1561

Designing an optimum quantizer can be treated as the optimization problem of finding the quantization indices that minimize the quantization error. One solution to this problem, DP quantization, is based on dynamic programming. Some applications, such as bit-depth scalable codecs and tone mapping, require the construction of multiple quantizers with different quantization levels, for example, from 12 bit/channel to 10 bit/channel and 8 bit/channel. Unfortunately, the above-mentioned DP quantization optimizes the quantizer for just one quantization level; that is, it is unable to optimize multiple quantizers simultaneously. Therefore, when DP quantization is used to design multiple quantizers, the optimization process contains many redundant computations. This paper proposes an extended DP quantization with a complexity reduction algorithm for the optimal design of multiple quantizers. Experiments show that the proposed algorithm reduces complexity by 20.8% on average compared to conventional DP quantization.
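Single-level DP quantization, the building block the paper extends, can be sketched as the classic dynamic program below: partition the sorted samples into contiguous cells, each reproduced by its mean, minimizing total squared error. The cell/mean formulation is the textbook version; the paper's exact cost is not given in the abstract:

```python
def dp_quantize(values, levels):
    """Optimal scalar quantizer by dynamic programming: split the sorted
    samples into `levels` contiguous cells, each represented by its
    mean, minimizing the total squared quantization error."""
    xs = sorted(values)
    n = len(xs)
    ps = [0.0] * (n + 1)   # prefix sums of values
    ps2 = [0.0] * (n + 1)  # prefix sums of squared values
    for i, x in enumerate(xs):
        ps[i + 1] = ps[i] + x
        ps2[i + 1] = ps2[i] + x * x
    def sse(i, j):  # squared error of cell xs[i:j] around its mean, O(1)
        s, s2, m = ps[j] - ps[i], ps2[j] - ps2[i], j - i
        return s2 - s * s / m
    INF = float("inf")
    cost = [[INF] * (n + 1) for _ in range(levels + 1)]
    cost[0][0] = 0.0
    split = [[0] * (n + 1) for _ in range(levels + 1)]
    for k in range(1, levels + 1):
        for j in range(1, n + 1):
            for i in range(k - 1, j):
                if cost[k - 1][i] == INF:
                    continue
                c = cost[k - 1][i] + sse(i, j)
                if c < cost[k][j]:
                    cost[k][j], split[k][j] = c, i
    reps, j = [], n  # backtrack to recover the representative levels
    for k in range(levels, 0, -1):
        i = split[k][j]
        reps.append((ps[j] - ps[i]) / (j - i))
        j = i
    return sorted(reps), cost[levels][n]

reps, err = dp_quantize([0, 1, 2, 10, 11, 12], 2)
```

Running this once per target bit depth repeats most of the table computation, which is exactly the redundancy the proposed multi-layered algorithm removes.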

  • Simultaneous Estimation of Object Region and Depth in Participating Media Using a ToF Camera

    Yuki FUJIMURA  Motoharu SONOGASHIRA  Masaaki IIYAMA  

     
    PAPER-Image Recognition, Computer Vision

  Publicized:
    2019/12/03
      Vol:
    E103-D No:3
      Page(s):
    660-673

Three-dimensional (3D) reconstruction and scene depth estimation from 2-dimensional (2D) images are major tasks in computer vision. However, conventional 3D reconstruction techniques become challenging in participating media such as murky water, fog, or smoke. We have developed a method that uses a continuous-wave time-of-flight (ToF) camera to estimate an object region and its depth in participating media simultaneously. The scattered light observed by the camera is saturated, so it does not depend on the scene depth. In addition, signals bouncing off distant points are negligible due to light attenuation, so the observation of such a point contains only a scattering component. These phenomena enable us to estimate the scattering component in the object region from a background that contains only the scattering component. The problem is formulated as robust estimation in which the object region is treated as outliers, enabling the simultaneous estimation of object region and depth via an iteratively reweighted least squares (IRLS) optimization scheme. We demonstrate the effectiveness of the proposed method using images captured with a ToF camera in real foggy scenes and evaluate its applicability with synthesized data.
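The robust-estimation step can be sketched with a scalar IRLS loop, a stand-in for the paper's per-pixel formulation; the Cauchy-type weight function and the scale constant c are assumed choices, not taken from the paper:

```python
def irls_mean(obs, iters=20, c=1.0):
    """IRLS sketch: robustly estimate the background scattering level,
    treating object-region observations as outliers. Each iteration
    reweights residuals with Cauchy-type weights w = 1 / (1 + (r/c)^2)
    and refits a weighted mean."""
    mu = sum(obs) / len(obs)  # ordinary least-squares initialization
    for _ in range(iters):
        w = [1.0 / (1.0 + ((x - mu) / c) ** 2) for x in obs]
        mu = sum(wi * xi for wi, xi in zip(w, obs)) / sum(w)
    return mu

# Background scattering around 10 with object-region outliers near 30:
bg = irls_mean([10.1, 9.9, 10.0, 10.2, 29.8, 30.1, 9.8])
```

The small final weights mark the outliers, i.e., the object region, which is how the same loop yields both the segmentation and the background estimate.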

  • Posture Recognition Technology Based on Kinect

    Yan LI  Zhijie CHU  Yizhong XIN  

     
    PAPER-Human-computer Interaction

  Publicized:
    2019/12/12
      Vol:
    E103-D No:3
      Page(s):
    621-630

To address the complexity of posture recognition with Kinect, a method using distance characteristics is proposed. First, depth image data were collected with Kinect, and the three-dimensional coordinates of 20 skeleton joints were obtained. Second, according to the contribution of each joint to posture expression, the 60-dimensional Kinect skeleton joint data were transformed into a vector of 24 distance characteristics, normalized according to the human body structure. Third, a static posture recognition method based on the shortest distance and a dynamic posture recognition method based on the minimum accumulative distance with dynamic time warping (DTW) were proposed. The experimental results showed that the recognition rates for static postures, non-cross-subject dynamic postures, and cross-subject dynamic postures were 95.9%, 93.6%, and 89.8%, respectively. Finally, posture selection, Kinect placement, and comparisons with the literature are discussed, providing a reference for Kinect-based posture recognition and interaction design.
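The minimum-accumulative-distance matching with DTW is the classic dynamic program below. Scalar features keep the sketch short; the method itself compares 24-dimensional distance-feature vectors per frame:

```python
def dtw_distance(a, b):
    """Classic dynamic time warping: minimum accumulative distance
    between two sequences, allowing frames to be stretched or
    compressed in time. Cost here is the absolute difference of
    scalar features."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

# A time-warped copy of a gesture matches with zero accumulative distance:
same = dtw_distance([1, 2, 3], [1, 1, 2, 2, 3])
```

Classification then picks the template posture with the smallest DTW distance to the observed sequence.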

  • Cauchy Aperture and Perfect Reconstruction Filters for Extending Depth-of-Field from Focal Stack Open Access

    Akira KUBOTA  Kazuya KODAMA  Asami ITO  

     
    PAPER

  Publicized:
    2019/08/16
      Vol:
    E102-D No:11
      Page(s):
    2093-2100

A pupil function of the aperture in image capturing systems is theoretically derived such that one can perfectly reconstruct the all-in-focus image through linear filtering of the focal stack. The perfect reconstruction filters are also designed based on the derived pupil function. The designed filters are space-invariant; hence the presented method does not require region segmentation. Simulation results using synthetic scenes show the effectiveness of the derived pupil function and filters.

  • Depth from Defocus Technique Based on Cross Reblurring

    Kazumi TAKEMURA  Toshiyuki YOSHIDA  

     
    PAPER

  Publicized:
    2019/07/11
      Vol:
    E102-D No:11
      Page(s):
    2083-2092

This paper proposes a novel Depth From Defocus (DFD) technique based on the property that two images having different focus settings coincide if each is reblurred with the other's focus setting, referred to as the “cross reblurring” property in this paper. Based on this property, the proposed technique estimates a block-wise depth profile for a target object by minimizing the mean squared error between the cross-reblurred images. Unlike existing DFD techniques, the proposed technique is free of lens parameters and independent of point spread function models. A compensation technique for possible pixel misalignment between images is also proposed to improve depth estimation accuracy. The experimental results and comparisons with other DFD techniques show the advantages of our technique.
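The cross-reblurring idea can be sketched in a toy 1-D model: at the correct depth, image 1 reblurred with image 2's blur equals image 2 reblurred with image 1's blur, so the depth minimizing their MSE wins. The box blurs and the depth-to-radius mappings below are assumptions for illustration, not the paper's optics:

```python
def box_blur(sig, r):
    """Simple 1-D box blur with integer radius r (clamped borders)."""
    n = len(sig)
    return [sum(sig[max(0, i - r):min(n, i + r + 1)]) /
            (min(n, i + r + 1) - max(0, i - r)) for i in range(n)]

def estimate_depth(img1, img2, radius1, radius2, depths):
    """Cross-reblurring sketch: for each candidate depth, reblur image 1
    with image 2's blur radius and vice versa; the depth whose
    cross-reblurred pair agrees best (smallest MSE) is returned.
    radius1/radius2 map depth -> blur radius per focus setting."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return min(depths, key=lambda d: mse(box_blur(img1, radius2(d)),
                                         box_blur(img2, radius1(d))))

# Synthesize two observations of a step edge at an assumed true depth of 3:
sharp = [0.0] * 8 + [1.0] * 8
r1 = lambda d: abs(d - 2)  # toy mapping: setting 1 focused at depth 2
r2 = lambda d: abs(d - 5)  # toy mapping: setting 2 focused at depth 5
img1, img2 = box_blur(sharp, r1(3)), box_blur(sharp, r2(3))
est = estimate_depth(img1, img2, r1, r2, depths=range(1, 7))
```

Note that only the two observed images and the candidate radii are used: no lens parameters or PSF model enter the comparison, matching the property the technique exploits.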

  • Automatic and Accurate 3D Measurement Based on RGBD Saliency Detection

    Yibo JIANG  Hui BI  Hui LI  Zhihao XU  

     
    LETTER-Image Recognition, Computer Vision

  Publicized:
    2018/12/21
      Vol:
    E102-D No:3
      Page(s):
    688-689

3D measurement is widely required in modern industries. In this letter, a method based on RGBD saliency detection with depth range adjusting (RGBD-DRA) is proposed for 3D measurement. Using superpixels and prior maps, RGBD saliency detection detects and measures the target object automatically. Meanwhile, the proposed depth range adjusting operates during measurement to further improve measuring accuracy. The experimental results demonstrate that the proposed method is automatic and accurate, with a maximum deviation of 3 mm (3.77%).

  • A Robust Depth Image Based Rendering Scheme for Stereoscopic View Synthesis with Adaptive Domain Transform Based Filtering Framework

    Wei LIU  Yun Qi TANG  Jian Wei DING  Ming Yue CUI  

     
    PAPER-Image Processing and Video Processing

  Publicized:
    2018/08/31
      Vol:
    E101-D No:12
      Page(s):
    3138-3149

Depth image based rendering (DIBR), which renders virtual views from a color image and the corresponding depth map, is one of the key procedures in the 2D-to-3D conversion process. However, several troubling problems, such as depth edge misalignment, disocclusions, and resampling cracks, still exist in current DIBR systems. To solve these problems, we present in this paper a robust depth image based rendering scheme for stereoscopic view synthesis. At the core of the proposed scheme are two depth map filters that share a common domain-transform-based filtering framework. As a first step, one filter of this framework realizes texture-depth boundary alignment and directional disocclusion-reducing smoothing simultaneously. Then, after depth map 3D warping, another adaptive filter is applied to the warped depth maps with the delivered scene gradient structures to further diminish the remaining cracks and noise. Finally, with the optimized depth map of the virtual view, backward texture warping retrieves the final virtual texture view. The proposed scheme yields visually satisfactory results for high-quality 2D-to-3D conversion. Experimental results demonstrate the excellent performance of the proposed approach.

  • A Secure In-Depth File System Concealed by GPS-Based Mounting Authentication for Mobile Devices

    Yong JIN  Masahiko TOMOISHI  Satoshi MATSUURA  Yoshiaki KITAGUCHI  

     
    PAPER-Mobile Application and Web Security

  Publicized:
    2018/08/22
      Vol:
    E101-D No:11
      Page(s):
    2612-2621

Data breaches and data destruction attacks have become critical security threats to the ICT (Information and Communication Technology) infrastructure. Both Internet service providers and users suffer from cyber threats, especially those against confidential data and private information. Human social activities require people to move while carrying confidential data, and data breaches often happen during transportation. Internet connectivity and cryptographic technology have made the use of confidential data much more secure. However, even with the high deployment rate of the Internet infrastructure, concerns about a lack of Internet connectivity lead people to carry data on their mobile devices. In this paper, we describe the main patterns of data breach on mobile devices and propose a secure in-depth file system concealed by GPS-based mounting authentication to mitigate data breaches on mobile devices. In the proposed in-depth file system, data can be stored based on the level of credential with a corresponding authentication policy, and the mounting operation succeeds only at designated locations. We implemented a prototype system using VeraCrypt and Perl and confirmed through evaluations at two locations that the in-depth file system works exactly as expected. The contributions of this paper include the clarification that GPS-based mounting authentication for a file system can reduce the risk of data breach for mobile devices, and the realization of a prototype system.
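The GPS-based mounting authentication can be sketched as a location gate using the haversine great-circle distance. The 100 m radius is an assumed policy value, and this is only the location check, not the VeraCrypt-based prototype itself:

```python
import math

def within_mount_zone(lat, lon, site_lat, site_lon, radius_m=100.0):
    """GPS-based mounting gate sketch: allow mounting only when the
    device's GPS fix lies within radius_m of a designated location.
    Distance is the haversine great-circle distance on a spherical
    Earth; radius_m is an assumed policy value."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat), math.radians(site_lat)
    dp = math.radians(site_lat - lat)
    dl = math.radians(site_lon - lon)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    dist = 2 * r * math.asin(math.sqrt(a))
    return dist <= radius_m

ok = within_mount_zone(35.6810, 139.7670, 35.6812, 139.7671)   # tens of meters away
far = within_mount_zone(35.6810, 139.7670, 35.7000, 139.7670)  # kilometers away
```

In a real deployment this check would be combined with the credential-level authentication policy before the mount command is issued.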

  • Advanced DBS (Direct-Binary Search) Method for Compensating Spatial Chromatic Errors on RGB Digital Holograms in a Wide-Depth Range with Binary Holograms

    Thibault LEPORTIER  Min-Chul PARK  

     
    LETTER-Digital Signal Processing

      Vol:
    E101-A No:5
      Page(s):
    848-849

The direct-binary search method has been used for converting complex holograms into binary format. However, this algorithm is optimized for reconstructing monochromatic digital holograms and is accurate only in a narrow depth range. In this paper, we propose an advanced direct-binary search method to increase the depth of field of 3D scenes reconstructed in RGB by binary holograms.

  • Measurement of Accommodation and Convergence Eye Movement when a Display and 3D Movie Move in the Depth Direction Simultaneously

    Shinya MOCHIDUKI  Yuki YOKOYAMA  Keigo SUKEGAWA  Hiroki SATO  Miyuki SUGANUMA  Mitsuho YAMADA  

     
    PAPER-Image

      Vol:
    E101-A No:2
      Page(s):
    488-498

In this study, we first developed a simultaneous measurement system for accommodation and convergence eye movement and evaluated its precision. Then, using a stuffed animal as the target, whose depth should be relatively easy to perceive, we measured convergence eye movement and accommodation simultaneously while a tablet displaying a 3D movie was moved in the depth direction. When the real depth movement of the display was added to the movement of the 3D image, subjects showed convergence eye movement that corresponded appropriately to the dual change of parallax in the 3D movie and the real display, even when a subject's convergence changed very little. Accommodation also changed appropriately according to the change in depth.

  • Single Image Dehazing Using Invariance Principle

    Mingye JU  Zhenfei GU  Dengyin ZHANG  Jian LIU  

     
    LETTER-Image Processing and Video Processing

  Publicized:
    2017/09/01
      Vol:
    E100-D No:12
      Page(s):
    3068-3072

In this letter, we propose a novel technique to increase the visibility of hazy images. Benefiting from the atmospheric scattering model and the invariance principle for scene structure, we formulate structure constraint equations derived from two simulated inputs obtained by performing gamma correction on the input image. Relying on the inherent boundary constraint of the scattering function, the expected scene albedo can be well restored via these constraint equations. Extensive experimental results verify the power of the proposed dehazing technique.
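Generating the two simulated inputs by gamma correction can be sketched as below; the gamma values are illustrative, not the paper's choices, and the structure-constraint equations built from these inputs are not reproduced here:

```python
def gamma_correct(img, g):
    """Gamma correction on intensities normalized to [0, 1], used to
    derive simulated inputs from a single hazy image; scene structure
    is invariant across the gamma-corrected copies, which is what the
    constraint equations exploit."""
    return [[p ** g for p in row] for row in img]

hazy = [[0.25, 0.5], [0.75, 1.0]]
sim1 = gamma_correct(hazy, 0.5)  # brightened copy
sim2 = gamma_correct(hazy, 2.0)  # darkened copy
```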

  • Depth Map Estimation Using Census Transform for Light Field Cameras

    Takayuki TOMIOKA  Kazu MISHIBA  Yuji OYAMADA  Katsuya KONDO  

     
    PAPER-Image Recognition, Computer Vision

  Publicized:
    2017/08/02
      Vol:
    E100-D No:11
      Page(s):
    2711-2720

Depth estimation for a lens-array type light field camera is a challenging problem because of sensor noise and radiometric distortion, a global brightness change among sub-aperture images caused by the vignetting effect of the micro-lenses. We propose a depth map estimation method that is robust against sensor noise and radiometric distortion. Our method first binarizes the sub-aperture images by applying the census transform. Next, the binarized images are matched by computing majority operations between corresponding bits and summing the Hamming distances. The initial depth obtained by matching is ambiguous because of the extremely short baselines among sub-aperture images, so we refine it in the following steps: we first approximate the initial depth as a set of depth planes, and then optimize the plane-fitting result with an edge-preserving smoothness term. Experiments show that our method outperforms conventional methods.
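The census-transform binarization and Hamming-distance matching can be sketched as follows, using a 3×3 window for brevity; the majority operation across many sub-aperture images is omitted. Because only the intensity ordering within the window matters, a global brightness change leaves the signature untouched:

```python
def census(img, y, x):
    """3x3 census transform at pixel (y, x): each neighbor contributes a
    1 bit if it is darker than the center. Only intensity order matters,
    so the signature is invariant to global brightness changes."""
    bits = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            bits = (bits << 1) | (img[y + dy][x + dx] < img[y][x])
    return bits

def hamming(a, b):
    """Matching cost: number of differing bits between two signatures."""
    return bin(a ^ b).count("1")

base = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[p + 100 for p in row] for row in base]  # radiometric offset
```

Matching by summing these Hamming costs over a window gives the radiometric robustness the abstract describes, since `census(base, 1, 1)` and `census(brighter, 1, 1)` are identical.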
