
Author Search Result

[Author] Keita TAKAHASHI (18 hits)

1-18 of 18 hits
  • Reconstruction of Compressively Sampled Ray Space by Statistically Weighted Model

    Qiang YAO  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER-Image

      Vol:
    E97-A No:10
      Page(s):
    2064-2073

In recent years, ray space (or light field, in other literature) photography has become popular in computer vision and image processing, and capturing a ray space has become increasingly important for practical applications. To handle the huge amount of data in the acquisition stage, the original data are compressively sampled first and completely reconstructed later. In this paper, to achieve better reconstruction quality and faster reconstruction speed, we propose a statistically weighted model for the reconstruction of a compressively sampled ray space. This model captures the structure of ray space data in an orthogonal basis and integrates this structure into the reconstruction. In our experiments, the proposed model achieves much better reconstruction quality in both the 2D image patch and 3D image cube cases. Even at a relatively low sensing ratio of about 10%, the proposed method can still recover most of the low-frequency components, which are the most significant for representing ray space data. Moreover, the proposed method is almost as good as the state-of-the-art dictionary-learning-based method in terms of reconstruction quality, while its reconstruction speed is much faster. Therefore, our method achieves a better trade-off between reconstruction quality and reconstruction time and is more suitable for practical applications.

  • Sheared EPI Analysis for Disparity Estimation from Light Fields

    Takahiro SUZUKI  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    1984-1993

Structure tensor analysis on epipolar plane images (EPIs) is a successful approach for estimating disparity from a light field, i.e., a dense set of multi-view images. However, the disparity range allowable for the light field is limited because the estimation becomes less accurate as the range of disparities becomes larger. To overcome this limitation, we developed a new method called sheared EPI analysis, in which EPIs are sheared before the structure tensor analysis. The analysis results obtained with different shear values are integrated into a final disparity map through a smoothing process, which is the key idea of our method. In this paper, we closely investigate the performance of sheared EPI analysis and demonstrate the effectiveness of the smoothing process by extensively evaluating the proposed method on 15 datasets with large disparity ranges.
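As a rough illustration of the shearing step named in the abstract (a minimal sketch with an invented toy EPI; the actual method additionally applies structure tensor analysis and smoothing-based integration over many shear values):

```python
import numpy as np

def shear_epi(epi, d):
    """Shear an EPI: shift each view row u horizontally by round(d * u) pixels.

    After shearing with the true disparity d, a scene point's slanted line
    in the EPI becomes vertical, which a structure tensor detects reliably
    even when the original slope was large.
    """
    sheared = np.empty_like(epi)
    for u in range(epi.shape[0]):
        sheared[u] = np.roll(epi[u], -int(round(d * u)))
    return sheared

# Toy EPI: a single line with slope (disparity) 2 pixels per view.
n_views, width, d_true = 5, 32, 2
epi = np.zeros((n_views, width))
for u in range(n_views):
    epi[u, 10 + d_true * u] = 1.0

# Shearing with the matching disparity aligns the line vertically at x = 10.
aligned = shear_epi(epi, d_true)
```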

  • Binary and Rotational Coded-Aperture Imaging for Dynamic Light Fields

    Kohei SAKAI  Keita TAKAHASHI  Toshiaki FUJII  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2021/04/28
      Vol:
    E104-D No:8
      Page(s):
    1395-1398

Coded-aperture imaging has been utilized for compressive light field acquisition; several images are captured using different aperture patterns, and from those images, an entire light field is computationally reconstructed. This method has also been extended to dynamic light fields (moving scenes). However, previous methods assumed that the aperture patterns were gray-valued and of arbitrary shapes. Implementing such patterns requires a special device such as a liquid-crystal-on-silicon (LCoS) display, which makes the imaging system costly and prone to noise. To address this problem, we propose the use of a binary aperture pattern that rotates over time, which can be implemented with a rotating plate with a hole. We demonstrate that although such a pattern limits the design space, our method still achieves reconstruction quality comparable to that of the original method.

  • Simultaneous Attack on CNN-Based Monocular Depth Estimation and Optical Flow Estimation

    Koichiro YAMANAKA  Keita TAKAHASHI  Toshiaki FUJII  Ryutaroh MATSUMOTO  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2021/02/08
      Vol:
    E104-D No:5
      Page(s):
    785-788

    Thanks to the excellent learning capability of deep convolutional neural networks (CNNs), CNN-based methods have achieved great success in computer vision and image recognition tasks. However, it has turned out that these methods often have inherent vulnerabilities, which makes us cautious of the potential risks of using them for real-world applications such as autonomous driving. To reveal such vulnerabilities, we propose a method of simultaneously attacking monocular depth estimation and optical flow estimation, both of which are common artificial-intelligence-based tasks that are intensively investigated for autonomous driving scenarios. Our method can generate an adversarial patch that can fool CNN-based monocular depth estimation and optical flow estimation methods simultaneously by simply placing the patch in the input images. To the best of our knowledge, this is the first work to achieve simultaneous patch attacks on two or more CNNs developed for different tasks.

  • Good Group Sparsity Prior for Light Field Interpolation Open Access

    Shu FUJITA  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER-Image

      Vol:
    E103-A No:1
      Page(s):
    346-355

A light field, which is equivalent to a dense set of multi-view images, has various applications such as depth estimation and 3D display. One of the essential problems in light field applications is light field interpolation, i.e., view interpolation. The interpolation accuracy is enhanced by exploiting an inherent property of a light field. One example is that an epipolar plane image (EPI), which is a 2D subset of the 4D light field, consists of many lines, and these lines have almost the same slope in a local region. This structure induces a sparse representation in the frequency domain, where most of the energy resides on a line passing through the origin. On the basis of this observation, we propose a group sparsity prior suitable for light fields that fully exploits their line structure for interpolation. Specifically, we designed directional groups in the discrete Fourier transform (DFT) domain so that the groups can represent the concentration of the energy, and we thereby formulated the LF interpolation problem as an overlapping group lasso. We also introduce several techniques to improve the interpolation accuracy, such as applying a window function, determining group weights, expanding processing blocks, and merging blocks. Our experimental results show that the proposed method achieves quality better than or comparable to that of state-of-the-art LF interpolation methods such as convolutional neural network (CNN)-based methods.
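To give a concrete feel for the group-sparsity machinery named above, here is a minimal sketch of the proximal operator of a (non-overlapping) group lasso penalty, which jointly shrinks each coefficient group by its l2 norm. The paper's formulation uses overlapping directional groups in the DFT domain; the names and toy numbers below are purely illustrative:

```python
import numpy as np

def group_soft_threshold(coeffs, groups, lam):
    """Proximal operator of the (non-overlapping) group lasso penalty:
    each group of coefficients is shrunk jointly according to its l2 norm,
    and groups whose norm falls below lam are zeroed out entirely."""
    out = np.zeros_like(coeffs)
    for g in groups:
        norm = np.linalg.norm(coeffs[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * coeffs[g]
    return out

# Two groups: one with strong energy (kept, shrunk), one weak (zeroed).
c = np.array([3.0, 4.0, 0.1, 0.1])
groups = [np.array([0, 1]), np.array([2, 3])]
shrunk = group_soft_threshold(c, groups, lam=1.0)
```

Applied to DFT coefficients with directional groups, this kind of operator keeps energy concentrated along the dominant line through the origin and suppresses the rest.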

  • Unrolled Network for Light Field Display

    Kotaro MATSUURA  Chihiro TSUTAKE  Keita TAKAHASHI  Toshiaki FUJII  

     
    LETTER

      Publicized:
    2022/05/06
      Vol:
    E105-D No:10
      Page(s):
    1721-1725

    Inspired by the framework of algorithm unrolling, we propose a scalable network architecture that computes layer patterns for light field displays, enabling control of the trade-off between the display quality and the computational cost on a single pre-trained network.

  • High-Quality Multi-View Image Extraction from a Light Field Camera Considering Its Physical Pixel Arrangement

    Shu FUJITA  Keita TAKAHASHI  Toshiaki FUJII  

     
    INVITED PAPER

      Publicized:
    2019/01/28
      Vol:
    E102-D No:4
      Page(s):
    702-714

We propose a method for extracting multi-view images from a light field (plenoptic) camera that accurately handles the physical pixel arrangement of the camera. We use a Lytro Illum camera to obtain 4D light field data (a set of multi-viewpoint images) through a micro-lens array. The light field data are multiplexed on a single image sensor, and thus the data are first demultiplexed into a set of multi-viewpoint (sub-aperture) images. However, the demultiplexing process usually involves interpolation of the original data, such as demosaicing for the color filter array and pixel resampling for the hexagonal pixel arrangement of the original sub-aperture images. If this interpolation is performed, some information is added to or lost from the original data. In contrast, we preserve the original data as faithfully as possible and use them directly for super-resolution reconstruction, where the super-resolved image and the corresponding depth map are alternately refined. We experimentally demonstrate the effectiveness of our method in resolution enhancement through comparisons with the Light Field Toolbox and the Lytro Desktop Application. Moreover, we also discuss another type of light field camera, the Raytrix camera, and describe how it can be handled to extract high-quality multi-view images.

  • Estimation of Dense Displacement by Scale Invariant Polynomial Expansion of Heterogeneous Multi-View Images

    Kazuki SHIBATA  Mehrdad PANAHPOUR TEHRANI  Keita TAKAHASHI  Toshiaki FUJII  

     
    LETTER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    2048-2051

Several applications for 3-D visualization require dense correspondence detection for displacement estimation among heterogeneous multi-view images. Due to differences in resolution (or sampling density) and field of view among the images, estimating dense displacement is not straightforward. We therefore propose a scale-invariant polynomial expansion method that can estimate dense displacement between two heterogeneous views. Evaluation on heterogeneous images verifies the accuracy of our approach.

  • Stereo Image Retargeting with Shift-Map

    Ryo NAKASHIMA  Kei UTSUGI  Keita TAKAHASHI  Takeshi NAEMURA  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E94-D No:6
      Page(s):
    1345-1348

    We propose a new stereo image retargeting method based on the framework of shift-map image editing. Retargeting is the process of changing the image size according to the target display while preserving as much of the richness of the image as possible, and is often applied to monocular images and videos. Retargeting stereo images poses a new challenge because pixel correspondences between the stereo pair should be preserved to keep the scene's structure. The main contribution of this paper is integrating a stereo correspondence constraint into the retargeting process. Among several retargeting methods, we adopt shift-map image editing because this framework can be extended naturally to stereo images, as we show in this paper. We confirmed the effectiveness of our method through experiments.

  • Time-Multiplexed Coded Aperture and Coded Focal Stack -Comparative Study on Snapshot Compressive Light Field Imaging Open Access

    Kohei TATEISHI  Chihiro TSUTAKE  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2022/05/26
      Vol:
    E105-D No:10
      Page(s):
    1679-1690

    A light field (LF), which is represented as a set of dense, multi-view images, has been used in various 3D applications. To make LF acquisition more efficient, researchers have investigated compressive sensing methods by incorporating certain coding functionalities into a camera. In this paper, we focus on a challenging case called snapshot compressive LF imaging, in which an entire LF is reconstructed from only a single acquired image. To embed a large amount of LF information in a single image, we consider two promising methods based on rapid optical control during a single exposure: time-multiplexed coded aperture (TMCA) and coded focal stack (CFS), which were proposed individually in previous works. Both TMCA and CFS can be interpreted in a unified manner as extensions of the coded aperture (CA) and focal stack (FS) methods, respectively. By developing a unified algorithm pipeline for TMCA and CFS, based on deep neural networks, we evaluated their performance with respect to other possible imaging methods. We found that both TMCA and CFS can achieve better reconstruction quality than the other snapshot methods, and they also perform reasonably well compared to methods using multiple acquired images. To our knowledge, we are the first to present an overall discussion of TMCA and CFS and to compare and validate their effectiveness in the context of compressive LF imaging.

  • Physically-Correct Light-Field Factorization for Perspective Images

    Shu KONDO  Yuto KOBAYASHI  Keita TAKAHASHI  Toshiaki FUJII  

     
    LETTER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    2052-2055

    A layered light-field display based on light-field factorization is considered. In the original work, the factorization is formulated under the assumption that the light field is captured with orthographic cameras. In this paper, we introduce a generalized framework for light-field factorization that can handle both the orthographic and perspective camera projection models. With our framework, a light field captured with perspective cameras can be displayed accurately.

  • Light Field Coding Using Weighted Binary Images

    Koji KOMATSU  Kohei ISECHI  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2019/07/03
      Vol:
    E102-D No:11
      Page(s):
    2110-2119

We propose an efficient coding scheme for a dense light field, i.e., a set of multi-viewpoint images taken with very small viewpoint intervals. The key idea behind our proposal is that a light field is represented using only weighted binary images, where several binary images and corresponding weight values are chosen so as to optimally approximate the light field. The proposed coding scheme is completely different from those of modern image/video coding standards, which involve more complex procedures such as intra/inter-frame prediction and transforms. One advantage of our method is the extreme simplicity of the decoding process, which will lead to a faster and less power-hungry decoder than those of standard codecs. Another useful aspect of our proposal is that our coding method can be made scalable, where the accuracy of the decoded light field improves progressively as more encoded information is used. Thanks to the divide-and-conquer strategy adopted for the scalable coding, we can also substantially reduce the computational complexity of the encoding process. Although our method is still in an early research phase, experimental results demonstrate that it achieves reasonable rate-distortion performance compared with that of standard video codecs.
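The core representation idea above — approximating an image as a weighted sum of binary images — can be sketched with a naive greedy loop (the paper's actual optimization of patterns and weights is more elaborate; everything below is an invented illustration):

```python
import numpy as np

def binary_decompose(img, n_terms):
    """Greedily approximate img as sum_k w_k * b_k with binary masks b_k.

    Each step thresholds the current residual at its midrange to obtain a
    binary mask, then uses the least-squares optimal scalar weight for
    that mask (the mean of the residual over the mask).
    """
    residual = img.astype(float).copy()
    terms = []
    for _ in range(n_terms):
        thresh = 0.5 * (residual.min() + residual.max())
        b = residual > thresh
        if not b.any():
            break
        w = residual[b].mean()      # least-squares weight for mask b
        terms.append((w, b))
        residual[b] -= w
    return terms

# A tiny image that one weighted binary mask reproduces exactly.
img = np.array([[0.0, 1.0], [1.0, 1.0]])
terms = binary_decompose(img, 3)
recon = sum(w * b for w, b in terms)
```

Decoding is just the weighted sum of binary masks, which hints at why the decoder can be so simple.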

  • Facial Expression Recognition Based on Facial Region Segmentation and Modal Value Approach

    Gibran BENITEZ-GARCIA  Gabriel SANCHEZ-PEREZ  Hector PEREZ-MEANA  Keita TAKAHASHI  Masahide KANEKO  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E97-D No:4
      Page(s):
    928-935

This paper presents a facial expression recognition algorithm based on segmentation of a face image into four facial regions (eyes-eyebrows, forehead, mouth, and nose). In order to unify the different results obtained from facial region combinations, a modal value approach that uses the most frequent decision of the classifiers is proposed. The robustness of the algorithm is also evaluated under partial occlusion, using four different types of occlusion (left half, right half, eyes, and mouth). The proposed method employs the sub-block eigenphases algorithm, which uses the phase spectrum and principal component analysis (PCA) for feature vector estimation; the resulting features are fed to a support vector machine (SVM) for classification. Experimental results show that the modal value approach improves the average recognition rate to more than 90%, and the performance remains high even under partial occlusion by excluding occluded parts from the feature extraction process.
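The "modal value" fusion described above is simply a majority vote over the per-region classifier decisions. A minimal sketch (the labels and votes are invented for illustration):

```python
from collections import Counter

def modal_decision(decisions):
    """Return the most frequent class label among per-region classifier
    outputs (the modal value of the decisions)."""
    return Counter(decisions).most_common(1)[0][0]

# Four facial regions each vote for an expression label.
votes = ["happy", "happy", "surprise", "happy"]
label = modal_decision(votes)
```

Under partial occlusion, the occluded region's vote would simply be dropped from the list before taking the mode.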

  • Designing Coded Aperture Camera Based on PCA and NMF for Light Field Acquisition

    Yusuke YAGI  Keita TAKAHASHI  Toshiaki FUJII  Toshiki SONODA  Hajime NAGAHARA  

     
    PAPER

      Publicized:
    2018/06/20
      Vol:
    E101-D No:9
      Page(s):
    2190-2200

A light field, which is often understood as a set of dense multi-view images, has been utilized in various 2D/3D applications. Efficient light field acquisition using a coded aperture camera is the target problem considered in this paper. Specifically, the entire light field, which consists of many images, should be reconstructed from only a few images captured through different aperture patterns. In previous work, this problem has often been discussed in the context of compressed sensing (CS), where sparse representations on a pre-trained dictionary or basis are explored to reconstruct the light field. In contrast, we formulated this problem from the perspective of principal component analysis (PCA) and non-negative matrix factorization (NMF), where only a small number of basis vectors are selected in advance based on an analysis of the training dataset. From this formulation, we derived optimal non-negative aperture patterns and a straightforward reconstruction algorithm. Even though our method is based on conventional techniques, it has proven to be more accurate and much faster than a state-of-the-art CS-based method.
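The PCA side of the formulation can be sketched as follows: learn a few principal components from training light fields, then reconstruct a vectorized light field from a handful of linear measurements by least squares restricted to that subspace. This omits the paper's derivation of non-negative aperture patterns; the matrices and dimensions below are invented toy values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training light fields (vectorized, 16-dimensional toy signals)
# used to learn a small PCA basis.
train = rng.normal(size=(100, 16))
mean = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
B = Vt[:4].T                        # keep K = 4 principal components

# A: sensing matrix standing in for a few coded-aperture measurements.
A = rng.normal(size=(6, 16))

x = mean + B @ rng.normal(size=4)   # a light field lying in the PCA subspace
y = A @ x                           # the captured measurements

# Linear least-squares reconstruction restricted to the PCA subspace:
# solve (A B) c = y - A mean, then map back with x_hat = mean + B c.
c = np.linalg.lstsq(A @ B, y - A @ mean, rcond=None)[0]
x_hat = mean + B @ c
```

Because the unknown is only the K-dimensional coefficient vector, reconstruction is a small linear solve, which is consistent with the speed advantage claimed over iterative CS solvers.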

  • Frequency-Domain EMI Simulation of Power Electronic Converter with Voltage-Source and Current-Source Noise Models

    Keita TAKAHASHI  Takaaki IBUCHI  Tsuyoshi FUNAKI  

     
    PAPER-Energy in Electronics Communications

      Publicized:
    2019/03/14
      Vol:
    E102-B No:9
      Page(s):
    1853-1861

The electromagnetic interference (EMI) generated by power electronic converters is largely influenced by the parasitic inductances and capacitances of the converter. One of the most popular EMI simulation methods that can take these parasitic parameters into account is three-dimensional electromagnetic simulation by the finite element method (FEM). A noise-source model should be given in the frequency domain in comprehensive FEM simulations. However, the internal impedance of the noise source is static in the frequency domain, whereas the transient switching of a power semiconductor changes its internal resistance in the time domain. In this paper, we propose the use of a voltage-source noise model and a current-source noise model to simulate EMI noise with its two components, voltage-dependent noise and current-dependent noise, in the frequency domain. To simulate voltage-dependent EMI noise, we model the power semiconductor that is turning on as a voltage source, whose internal impedance is low; the voltage-source noise is proportional to the amplitude of the voltage. To simulate current-dependent EMI noise, we model the power semiconductor that is turning off as a current source, whose internal impedance is high; the current-source noise is proportional to the amplitude of the current. The measured and simulated conducted EMI agreed very well.

  • Weighted 4D-DCT Basis for Compressively Sampled Light Fields

    Yusuke MIYAGI  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Vol:
    E99-A No:9
      Page(s):
    1655-1664

Light field data, which are composed of multi-view images, have various 3D applications. However, the cost of acquiring many images from slightly different viewpoints sometimes makes the use of light fields impractical. Compressive sensing offers a new way to obtain the entire light field data from only a few camera shots instead of capturing all the images individually. In particular, the coded aperture/mask technique enables us to capture light field data compressively with a single camera. A pixel value recorded by such a camera is a sum of the light rays that pass through different positions on the coded aperture/mask. The target light field can be reconstructed from the recorded pixel values by using prior information on the light field signal. As prior information, the current state of the art uses a dictionary (light field atoms) learned from training datasets. Meanwhile, it has been reported that general bases such as those of the discrete cosine transform (DCT) are not suitable for efficiently representing this prior information. In this study, however, we demonstrate that a 4D-DCT basis works surprisingly well when combined with a weighting scheme that accounts for the amplitude differences between DCT coefficients. Simulations using 18 light field datasets show the superiority of the weighted 4D-DCT basis over the learned dictionary. Furthermore, we analyzed a disparity-dependent property of the reconstructed data that is unique to light fields.
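One common way to realize the kind of amplitude weighting described above is a weighted soft-thresholding of transform coefficients, where coefficients expected to be large (e.g., low-frequency DCT terms) receive a smaller threshold. This is a generic sketch, not the paper's exact scheme; the toy numbers are invented:

```python
import numpy as np

def weighted_soft_threshold(coeffs, weights, lam):
    """Soft-threshold each transform coefficient with a weight-dependent
    threshold lam / w_i: coefficients whose amplitudes are statistically
    expected to be large (large w_i) are shrunk less."""
    t = lam / weights
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

# A low-frequency coefficient (expected large) and a high-frequency one.
coeffs = np.array([5.0, 0.5])
weights = np.array([10.0, 1.0])
shrunk = weighted_soft_threshold(coeffs, weights, lam=1.0)
```

With equal weights this reduces to ordinary soft thresholding; the statistics of the training light fields determine how unequal the weights should be.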

  • Design and Implementation of a Real-Time Video-Based Rendering System Using a Network Camera Array

    Yuichi TAGUCHI  Keita TAKAHASHI  Takeshi NAEMURA  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E92-D No:7
      Page(s):
    1442-1452

    We present a real-time video-based rendering system using a network camera array. Our system consists of 64 commodity network cameras that are connected to a single PC through a gigabit Ethernet. To render a high-quality novel view, our system estimates a view-dependent per-pixel depth map in real time by using a layered representation. The rendering algorithm is fully implemented on the GPU, which allows our system to efficiently perform capturing and rendering processes as a pipeline by using the CPU and GPU independently. Using QVGA input video resolution, our system renders a free-viewpoint video at up to 30 frames per second, depending on the output video resolution and the number of depth layers. Experimental results show high-quality images synthesized from various scenes.

  • Fast and Robust Disparity Estimation from Noisy Light Fields Using 1-D Slanted Filters

    Gou HOUBEN  Shu FUJITA  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2019/07/03
      Vol:
    E102-D No:11
      Page(s):
    2101-2109

    Depth (disparity) estimation from a light field (a set of dense multi-view images) is currently attracting much research interest. This paper focuses on how to handle a noisy light field for disparity estimation, because if left as it is, the noise deteriorates the accuracy of estimated disparity maps. Several researchers have worked on this problem, e.g., by introducing disparity cues that are robust to noise. However, it is not easy to break the trade-off between the accuracy and computational speed. To tackle this trade-off, we have integrated a fast denoising scheme in a fast disparity estimation framework that works in the epipolar plane image (EPI) domain. Specifically, we found that a simple 1-D slanted filter is very effective for reducing noise while preserving the underlying structure in an EPI. Moreover, this simple filtering does not require elaborate parameter configurations in accordance with the target noise level. Experimental results including real-world inputs show that our method can achieve good accuracy with much less computational time compared to some state-of-the-art methods.