
Author Search Result

[Author] Toshiaki FUJII (29 hits)

Showing 1-20 of 29 hits

  • Light Field Coding Using Weighted Binary Images

    Koji KOMATSU  Kohei ISECHI  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2019/07/03
      Vol:
    E102-D No:11
      Page(s):
    2110-2119

    We propose an efficient coding scheme for a dense light field, i.e., a set of multi-viewpoint images taken with very small viewpoint intervals. The key idea behind our proposal is that a light field is represented using only weighted binary images, where several binary images and corresponding weight values are chosen so as to optimally approximate the light field. The proposed coding scheme is completely different from those of modern image/video coding standards, which involve more complex procedures such as intra/inter-frame prediction and transforms. One advantage of our method is the extreme simplicity of the decoding process, which will lead to a faster and less power-hungry decoder than those of the standard codecs. Another useful aspect of our proposal is that our coding method can be made scalable, where the accuracy of the decoded light field is improved in a progressive manner as more encoded information is used. Thanks to the divide-and-conquer strategy adopted for the scalable coding, we can also substantially reduce the computational complexity of the encoding process. Although our method is still in the early research phase, experimental results demonstrate that it achieves reasonable rate-distortion performance compared with that of the standard video codecs.
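
Because decoding reduces to summing binary images scaled by their weights, the core of the decoder fits in a few lines. The sketch below only illustrates that weighted-sum idea, not the authors' implementation; the array names and sizes are hypothetical.

```python
import numpy as np

def decode_light_field(binary_images, weights):
    """Reconstruct a light field as a weighted sum of binary images.

    binary_images: array of shape (K, V, H, W) with values in {0, 1}
    weights:       array of shape (K,) -- one scalar weight per binary image
    Returns an approximation of the light field of shape (V, H, W).
    """
    # Weighted sum over the K binary images; this is the entire decoding step.
    return np.tensordot(weights, binary_images, axes=1)

# Minimal usage example with random data (illustrative only).
rng = np.random.default_rng(0)
binary_images = rng.integers(0, 2, size=(8, 5, 64, 64)).astype(np.float32)
weights = rng.uniform(0.0, 1.0, size=8).astype(np.float32)
light_field = decode_light_field(binary_images, weights)
print(light_field.shape)  # (5, 64, 64)
```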

  • FOREWORD Open Access

    Toshiaki FUJII  

     
    FOREWORD

      Vol:
    E102-D No:11
      Page(s):
    2082-2082
  • Simultaneous Visible Light Communication and Ranging Using High-Speed Stereo Cameras Based on Bicubic Interpolation Considering Multi-Level Pulse-Width Modulation

    Ruiyi HUANG  Masayuki KINOSHITA  Takaya YAMAZATO  Hiraku OKADA  Koji KAMAKURA  Shintaro ARAI  Tomohiro YENDO  Toshiaki FUJII  

     
    PAPER-Communication Theory and Signals

      Publicized:
    2022/12/26
      Vol:
    E106-A No:7
      Page(s):
    990-997

    Visible light communication (VLC) and visible light ranging are applicable techniques for intelligent transportation systems (ITS). They use the light-emitting diodes (LEDs) found on roads for data transmission and range estimation. Performing VLC and ranging simultaneously can improve the performance of both, but doing so requires a high data rate and high-accuracy ranging. We use pulse-width modulation (PWM) to increase the data rate. However, when PWM is used for VLC data transmission, images of the LED transmitters are captured at different luminance levels and are easily saturated, and LED saturation leads to inaccurate range estimation. In this paper, we establish a novel simultaneous visible light communication and ranging system for ITS using PWM. We analyze the LED saturation problem and apply bicubic interpolation to solve it, thereby improving both the communication and the ranging performance. Simultaneous communication and ranging are enabled using a stereo camera: communication is realized using maximal-ratio combining (MRC), while ranging is achieved using phase-only correlation (POC) and sinc function approximation. Furthermore, we measured the performance of the proposed system in a field trial. The results show that error-free performance can be achieved up to a communication distance of 55 m and that the range estimation errors are below 0.5 m within 60 m.
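
Of the components listed above, phase-only correlation (POC) is the easiest to show compactly. The snippet below is a generic POC sketch on synthetic data, not the paper's implementation; the sinc-based sub-pixel fitting and the MRC communication path are omitted.

```python
import numpy as np

def phase_only_correlation(img_a, img_b):
    """Phase-only correlation between two image patches.

    Returns the POC surface; the location of its peak gives the integer
    displacement between the patches, and fitting the peak neighborhood
    (e.g., with a sinc model) yields sub-pixel shifts.
    """
    F = np.fft.fft2(img_a)
    G = np.fft.fft2(img_b)
    cross = F * np.conj(G)
    # Normalize to keep only phase information (eps avoids division by zero).
    cross /= np.abs(cross) + 1e-12
    poc = np.fft.ifft2(cross).real
    return np.fft.fftshift(poc)

# Illustrative usage: a patch and a horizontally shifted copy.
rng = np.random.default_rng(1)
a = rng.random((64, 64))
b = np.roll(a, shift=(0, 5), axis=(0, 1))
poc = phase_only_correlation(a, b)
peak = np.unravel_index(np.argmax(poc), poc.shape)
print(peak)  # offset of the peak from the center encodes the displacement
```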

  • The Adaptive Distributed Source Coding of Multi-View Images in Camera Sensor Networks

    Mehrdad PANAHPOUR TEHRANI  Toshiaki FUJII  Masayuki TANIMOTO  

     
    PAPER-Image Coding

      Vol:
    E88-A No:10
      Page(s):
    2835-2843

    We show that distributed source coding of multi-view images in camera sensor networks (CSNs) using adaptive modules can come close to the Slepian-Wolf bound. In a systematic scenario with limited node capabilities, the work of Slepian and Wolf suggests that statistically dependent signals can be encoded in a distributed manner at the same rate as a system in which the signals are encoded jointly. We consider three statistically dependent nodes (PN, CN, and CNs). Different distributed architectures are proposed based on a parent-node and child-node framework. The PN sends the whole image, whereas a CNs or CN sends only part of it, using adaptive coding based on an adaptive module operation at a rate close to the theoretical bounds H(CNs|PN) and H(CN|PN, CNs). The CNs sends a sub-sampled image and encodes the rest of the image, whereas the CN encodes the whole image. In other words, the proposed scheme allows the views to be encoded independently and decoded jointly. Experimental results show performance close to the information-theoretic limit. Furthermore, the good performance of the proposed architecture with the adaptive scheme shows a significant improvement over previous work.
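
The paper's adaptive module operation is not reproduced here, but the underlying Slepian-Wolf idea, encoding a child-node image at a reduced rate and recovering it at the decoder using the parent-node image as side information, can be illustrated with a simple modulo (coset) code. All names and the modulus below are hypothetical.

```python
import numpy as np

def dsc_encode(child_pixels, modulus):
    """Child node: transmit only the coset index (value mod modulus)."""
    return child_pixels % modulus

def dsc_decode(cosets, side_info, modulus):
    """Decoder: in each coset, pick the value closest to the parent-node
    side information (the correlated reference image)."""
    # Two candidate values congruent to the coset index that bracket side_info.
    base = side_info - ((side_info - cosets) % modulus)
    candidates = np.stack([base, base + modulus], axis=0)
    best = np.argmin(np.abs(candidates - side_info), axis=0)
    return np.take_along_axis(candidates, best[None], axis=0)[0]

rng = np.random.default_rng(2)
parent = rng.integers(0, 256, size=1000)          # PN image (side information)
child = parent + rng.integers(-7, 8, size=1000)   # correlated CN image
coded = dsc_encode(child, modulus=16)             # 4 bits/pixel instead of 8
decoded = dsc_decode(coded, parent, modulus=16)
print(np.mean(decoded == child))                  # 1.0 when |difference| < 8
```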

  • FOREWORD Open Access

    Toshiaki FUJII  

     
    FOREWORD

      Vol:
    E104-D No:10
      Page(s):
    1544-1544
  • FOREWORD Open Access

    Toshiaki FUJII  

     
    FOREWORD

      Vol:
    E103-D No:10
      Page(s):
    2035-2035
  • FOREWORD

    Toshiaki FUJII  

     
    FOREWORD

      Vol:
    E101-D No:9
      Page(s):
    2178-2178
  • Designing Coded Aperture Camera Based on PCA and NMF for Light Field Acquisition

    Yusuke YAGI  Keita TAKAHASHI  Toshiaki FUJII  Toshiki SONODA  Hajime NAGAHARA  

     
    PAPER

      Publicized:
    2018/06/20
      Vol:
    E101-D No:9
      Page(s):
    2190-2200

    A light field, which is often understood as a set of dense multi-view images, has been utilized in various 2D/3D applications. Efficient light field acquisition using a coded aperture camera is the target problem considered in this paper. Specifically, the entire light field, which consists of many images, should be reconstructed from only a few images that are captured through different aperture patterns. In previous work, this problem has often been discussed in the context of compressed sensing (CS), where sparse representations on a pre-trained dictionary or basis are explored to reconstruct the light field. In contrast, we formulate this problem from the perspective of principal component analysis (PCA) and non-negative matrix factorization (NMF), where only a small number of basis vectors are selected in advance based on an analysis of the training dataset. From this formulation, we derive optimal non-negative aperture patterns and a straightforward reconstruction algorithm. Even though our method is based on conventional techniques, it proves to be more accurate and much faster than a state-of-the-art CS-based method.
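
As a rough illustration of the PCA/NMF formulation, the sketch below reconstructs a light field from a few coded-aperture measurements by fitting the coefficients of a small, pre-selected basis with least squares. It is a simplified stand-in for the paper's method; the basis, the aperture patterns, and all array shapes are assumptions.

```python
import numpy as np

def reconstruct_light_field(measurements, aperture_patterns, basis):
    """Least-squares reconstruction of a light field from coded-aperture shots.

    measurements:      (M, P)  M captured images, each flattened to P pixels
    aperture_patterns: (M, V)  transmittance of each of the V sub-aperture
                               views for each of the M shots
    basis:             (V, K)  K basis vectors over the V views (e.g., from
                               PCA/NMF on training light fields)
    Returns an estimate of the light field with shape (V, P).
    """
    # Each measurement is a weighted sum of views: y_m = a_m^T L, with L (V, P).
    # Model the light field as L = basis @ C and solve for the coefficients C.
    A = aperture_patterns @ basis                    # (M, K)
    C, *_ = np.linalg.lstsq(A, measurements, rcond=None)
    return basis @ C                                 # (V, P)
```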

  • Real-Time View-Interpolation System for Super Multi-View 3D Display

    Tadahiko HAMAGUCHI  Toshiaki FUJII  Toshio HONDA  

     
    PAPER-Image Processing, Image Pattern Recognition

      Vol:
    E86-D No:1
      Page(s):
    109-116

    A 3D display using super-high-density multi-view images should enable reproduction of natural stereoscopic views. In the super multi-view display system, viewpoints are sampled at an interval narrower than the diameter of the pupil of the viewer's eye. Because parallax is then produced even within a single eye, the system can draw the eye's accommodation to the displayed object image. We are now working on a real-time view-interpolation system for the super multi-view 3D display. A multi-view camera using convergence capturing, which prevents resolution degradation, captures multi-view images of an object. Most of the data processing is devoted to view interpolation and rectification. View interpolation is performed on a high-speed image-processing board with digital signal processor (DSP) chips or single-instruction-stream, multiple-data-stream (SIMD) parallel processor chips. The view-interpolation algorithm is based on adaptive filtering of the epipolar plane images (EPIs): the multi-view images are interpolated using the filters best suited to each EPI. Rectification, performed as a preprocess, converts the multi-view images obtained by convergence capturing into equivalent images obtained by parallel capturing. Using rectified multi-view images improves the processing speed by confining the interpolation to individual EPIs.

  • Weighted 4D-DCT Basis for Compressively Sampled Light Fields

    Yusuke MIYAGI  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Vol:
    E99-A No:9
      Page(s):
    1655-1664

    Light field data, which consist of multi-view images, have various 3D applications. However, the cost of acquiring many images from slightly different viewpoints sometimes makes the use of light fields impractical. Compressive sensing is a new way to obtain the entire light field data from only a few camera shots instead of capturing all the images individually. In particular, the coded aperture/mask technique enables us to capture light field data in a compressive way through a single camera. A pixel value recorded by such a camera is a sum of the light rays that pass through different positions on the coded aperture/mask. The target light field can be reconstructed from the recorded pixel values by using prior information on the light field signal. As prior information, the current state of the art uses a dictionary (light field atoms) learned from training datasets. Meanwhile, it has been reported that general bases such as those of the discrete cosine transform (DCT) are not suitable for efficiently representing such prior information. In this study, however, we demonstrate that a 4D-DCT basis works surprisingly well when it is combined with a weighting scheme that accounts for the amplitude differences between DCT coefficients. Simulations using 18 light field datasets show the superiority of the weighted 4D-DCT basis over the learned dictionary. Furthermore, we analyze a disparity-dependent property of the reconstructed data that is unique to light fields.
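
The weighting idea can be sketched as follows: per-frequency weights are estimated as the mean absolute 4D-DCT coefficients of training light fields, and frequencies with typically large amplitudes are then shrunk less during reconstruction. This is a hypothetical ISTA-style fragment, not the paper's reconstruction algorithm; `scipy.fft.dctn`/`idctn` are used for the 4D transform.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct4d_weights(training_light_fields):
    """Per-frequency weights from training light fields.

    training_light_fields: array (N, U, V, H, W) -- N light fields with
    U x V views of H x W pixels. The weight of each 4D-DCT frequency is its
    mean absolute coefficient over the training set, reflecting the typical
    amplitude differences between low and high frequencies.
    """
    coeffs = np.stack([dctn(lf, norm='ortho') for lf in training_light_fields])
    return np.mean(np.abs(coeffs), axis=0)           # shape (U, V, H, W)

def weighted_shrinkage(light_field_estimate, weights, threshold):
    """One proximal step of a weighted 4D-DCT prior: coefficients with large
    learned weights (typically low frequencies) are shrunk less."""
    c = dctn(light_field_estimate, norm='ortho')
    t = threshold / (weights + 1e-8)
    c = np.sign(c) * np.maximum(np.abs(c) - t, 0.0)  # weighted soft threshold
    return idctn(c, norm='ortho')
```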

  • Fast and Robust Disparity Estimation from Noisy Light Fields Using 1-D Slanted Filters

    Gou HOUBEN  Shu FUJITA  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2019/07/03
      Vol:
    E102-D No:11
      Page(s):
    2101-2109

    Depth (disparity) estimation from a light field (a set of dense multi-view images) is currently attracting much research interest. This paper focuses on how to handle a noisy light field for disparity estimation, because noise, if left untreated, deteriorates the accuracy of the estimated disparity maps. Several researchers have worked on this problem, e.g., by introducing disparity cues that are robust to noise. However, it is not easy to break the trade-off between accuracy and computational speed. To tackle this trade-off, we integrated a fast denoising scheme into a fast disparity estimation framework that works in the epipolar plane image (EPI) domain. Specifically, we found that a simple 1-D slanted filter is very effective for reducing noise while preserving the underlying structure in an EPI. Moreover, this simple filtering does not require elaborate parameter tuning for the target noise level. Experimental results, including real-world inputs, show that our method achieves good accuracy with much less computational time than some state-of-the-art methods.
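
A minimal version of a 1-D slanted filter can be written by aligning the EPI rows along a disparity slope and averaging them. In the actual method the slope is chosen locally; the sketch below uses a single global slope purely for illustration.

```python
import numpy as np
from scipy.ndimage import shift as subpixel_shift

def slanted_filter_epi(epi, slope):
    """Denoise an EPI with a 1-D filter slanted along a given disparity slope.

    epi:   (S, W) epipolar-plane image (S views stacked vertically)
    slope: disparity in pixels per view step; the trace of a scene point in
           the EPI has this slope, so averaging along it preserves structure.
    Returns the denoised center row of the EPI.
    """
    S = epi.shape[0]
    center = (S - 1) / 2.0
    # Shift each row so that the slanted line becomes vertical, then average.
    aligned = np.stack([
        subpixel_shift(epi[s], (center - s) * slope, order=1, mode='nearest')
        for s in range(S)
    ])
    return aligned.mean(axis=0)
```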

  • FOREWORD Open Access

Toshiaki FUJII  

     
    FOREWORD

      Vol:
    E100-D No:9
      Page(s):
    1943-1943
  • A Segmentation-Based Multiple-Baseline Stereo (SMBS) Scheme for Acquisition of Depth in 3-D Scenes

    Takashi IMORI  Tadahiko KIMOTO  Bunpei TOUJI  Toshiaki FUJII  Masayuki TANIMOTO  

     
    PAPER-Image Processing,Computer Graphics and Pattern Recognition

      Vol:
    E81-D No:2
      Page(s):
    215-223

    This paper presents a new scheme to estimate depth in a natural three-dimensional scene using a multi-viewpoint image set. In the conventional Multiple-Baseline Stereo (MBS) scheme for such an image set, although stereo-matching errors are somewhat reduced by using multiple stereo pairs, the use of square blocks of fixed size sometimes causes false matching, especially in image areas where occlusion occurs and in areas with little variance in brightness. In the proposed scheme, the reference image is segmented into arbitrarily shaped regions, and a depth value is estimated for each region. In addition, by comparing an image generated by projection with the original image, depth values are re-estimated in a top-down manner so that errors in the previously estimated depth values are detected and corrected. Experimental results show the advantages of the proposed scheme over the MBS scheme.
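
For reference, the conventional MBS matching cost that the proposed segmentation-based scheme builds on, the sum of SSDs over multiple baselines evaluated at common inverse-depth candidates, can be sketched as follows. This is not the proposed SMBS scheme itself; the function and its parameters are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter, shift as subpixel_shift

def sssd_cost_volume(reference, others, baselines, inv_depths, window=5):
    """Sum of SSDs over multiple baselines (the conventional MBS matching cost).

    reference:  (H, W) reference image
    others:     list of (H, W) images taken at different baselines
    baselines:  baseline length of each image in `others`
    inv_depths: 1-D array of inverse-depth candidates; for rectified cameras
                the disparity of image i is baselines[i] * inverse depth.
    Returns a cost volume (len(inv_depths), H, W); the argmin over axis 0
    gives the per-pixel inverse-depth estimate.
    """
    cost = np.zeros((len(inv_depths),) + reference.shape)
    for c, inv_d in enumerate(inv_depths):
        for img, b in zip(others, baselines):
            # Warp the other image toward the reference by the candidate disparity.
            warped = subpixel_shift(img, (0.0, b * inv_d), order=1, mode='nearest')
            # Windowed mean of squared differences (proportional to the SSD),
            # accumulated over all baselines.
            cost[c] += uniform_filter((reference - warped) ** 2, size=window)
    return cost
```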

  • Simplified Vehicle Vibration Modeling for Image Sensor Communication

    Masayuki KINOSHITA  Takaya YAMAZATO  Hiraku OKADA  Toshiaki FUJII  Shintaro ARAI  Tomohiro YENDO  Koji KAMAKURA  

     
    PAPER

      Vol:
    E101-A No:1
      Page(s):
    176-184

    Image sensor communication (ISC), derived from visible light communication (VLC), is an attractive solution for outdoor mobile environments, particularly for intelligent transport systems (ITS). In ITS-ISC, tracking the transmitter in the image plane is a critical issue, since vehicle vibrations make it difficult to select the correct pixels for data reception. Our goal in this study is to develop a precise tracking method. This requires modeling the vehicle vibration and estimating its parameters, i.e., the representative frequencies and amplitudes of the inherent vehicle vibration and the variance of the Gaussian random process representing road surface irregularity. In this paper, we measured actual vehicle vibration in a driving situation and determined the parameters from its frequency characteristics. We then demonstrate that the vehicle vibration that induces transmitter displacement in the image plane can be modeled by only Gaussian random processes representing road surface irregularity when a high-frame-rate (e.g., 1000 fps) image sensor is used as the ISC receiver. The simplified vehicle vibration model and its parameters are evaluated by numerical analysis and experimental measurement, and the results show that the proposed model reproduces the characteristics of the transmitter displacement sufficiently well.
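
Under the simplified model, the frame-to-frame transmitter displacement at a high frame rate is treated as samples of a zero-mean Gaussian process. The toy generator below only illustrates that assumption; the standard deviation is a made-up value, not one estimated in the paper.

```python
import numpy as np

def simulate_displacement(duration_s=1.0, fps=1000, sigma_pixels=0.8, seed=0):
    """Simulate transmitter displacement in the image plane as a zero-mean
    Gaussian random process (road-surface irregularity only), following the
    simplification suggested for high-frame-rate receivers.

    Returns the frame-by-frame vertical displacement in pixels.
    """
    rng = np.random.default_rng(seed)
    n_frames = int(duration_s * fps)
    return rng.normal(0.0, sigma_pixels, size=n_frames)

displacement = simulate_displacement()
print(displacement.std())  # close to sigma_pixels
```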

  • Reconstruction of Compressively Sampled Ray Space by Statistically Weighted Model

    Qiang YAO  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER-Image

      Vol:
    E97-A No:10
      Page(s):
    2064-2073

    In recent years, ray space (also called light field in other literature) photography has become popular in computer vision and image processing, and capturing a ray space has become more significant for these practical applications. In order to handle the huge amount of data in the acquisition stage, the original data are compressively sampled first and completely reconstructed later. In this paper, in order to achieve better reconstruction quality and faster reconstruction speed, we propose a statistically weighted model for the reconstruction of a compressively sampled ray space. This model captures the structure of ray space data in an orthogonal basis and integrates this structure into the reconstruction. In our experiments, the proposed model achieves much better reconstruction quality for both 2D image patches and 3D image cubes. In particular, at a relatively low sensing ratio of about 10%, the proposed method can still recover most of the low-frequency components, which are the most significant for representing ray space data. Moreover, the proposed method is almost as good as the state-of-the-art dictionary-learning-based method in terms of reconstruction quality, while its reconstruction speed is much faster. Therefore, our proposed method achieves a better trade-off between reconstruction quality and reconstruction time and is more suitable for practical applications.

  • Sheared EPI Analysis for Disparity Estimation from Light Fields

    Takahiro SUZUKI  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    1984-1993

    Structure tensor analysis on epipolar plane images (EPIs) is a successful approach to estimating disparity from a light field, i.e., a dense set of multi-view images. However, the disparity range allowable for the light field is limited because the estimation becomes less accurate as the range of disparities becomes larger. To overcome this limitation, we developed a new method called sheared EPI analysis, in which EPIs are sheared before the structure tensor analysis. The results obtained with different shear values are integrated into a final disparity map through a smoothing process, which is the key idea of our method. In this paper, we closely investigate the performance of sheared EPI analysis and demonstrate the effectiveness of the smoothing process by extensively evaluating the proposed method on 15 datasets with large disparity ranges.
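
The two building blocks, shearing an EPI and estimating the local slope from the structure tensor, can be sketched as below. The smoothing-based integration over multiple shear values, which the abstract identifies as the key idea, is omitted, and the parameters are placeholders.

```python
import numpy as np
from scipy.ndimage import shift as subpixel_shift, gaussian_filter

def shear_epi(epi, shear):
    """Shear an EPI: shift each view row by an amount proportional to its
    distance from the center row, re-centering the disparity range around
    the given shear value."""
    S = epi.shape[0]
    center = (S - 1) / 2.0
    return np.stack([
        subpixel_shift(epi[s], (s - center) * shear, order=1, mode='nearest')
        for s in range(S)
    ])

def epi_orientation(epi, sigma=1.5):
    """Local orientation angle from the 2x2 structure tensor of an EPI;
    the slope of the local EPI structure follows from this angle."""
    gs, gx = np.gradient(epi)            # derivatives along view / x axes
    Jxx = gaussian_filter(gx * gx, sigma)
    Jss = gaussian_filter(gs * gs, sigma)
    Jxs = gaussian_filter(gx * gs, sigma)
    return 0.5 * np.arctan2(2.0 * Jxs, Jxx - Jss)
```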

  • Binary and Rotational Coded-Aperture Imaging for Dynamic Light Fields

    Kohei SAKAI  Keita TAKAHASHI  Toshiaki FUJII  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2021/04/28
      Vol:
    E104-D No:8
      Page(s):
    1395-1398

    Coded-aperture imaging has been utilized for compressive light field acquisition; several images are captured using different aperture patterns, and from those images an entire light field is computationally reconstructed. This method has also been extended to dynamic light fields (moving scenes). However, it assumed that the patterns were gray-valued and of arbitrary shape. Implementing such patterns requires a special device such as a liquid crystal on silicon (LCoS) display, which makes the imaging system costly and prone to noise. To address this problem, we propose the use of a binary aperture pattern that rotates over time, which can be implemented with a rotating plate with a hole. We demonstrate that although such a pattern limits the design space, our method still achieves a reconstruction quality comparable to that of the original method.
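
A rotating-plate aperture can be simulated as a sequence of binary masks with an off-center circular hole whose angular position advances each frame. The sketch below is only meant to visualize such patterns; the geometry (mask size, hole radius, orbit radius) is hypothetical.

```python
import numpy as np

def rotating_hole_aperture(size=64, hole_radius=6, orbit_radius=20, n_frames=16):
    """Binary aperture patterns for a plate with a single off-center hole
    rotating over time (one pattern per captured frame)."""
    yy, xx = np.mgrid[:size, :size]
    center = (size - 1) / 2.0
    patterns = []
    for k in range(n_frames):
        theta = 2.0 * np.pi * k / n_frames
        cx = center + orbit_radius * np.cos(theta)
        cy = center + orbit_radius * np.sin(theta)
        hole = (xx - cx) ** 2 + (yy - cy) ** 2 <= hole_radius ** 2
        patterns.append(hole.astype(np.float32))
    return np.stack(patterns)   # (n_frames, size, size), values in {0, 1}

masks = rotating_hole_aperture()
print(masks.shape, masks.min(), masks.max())
```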

  • Fractal Image Coding Based on Classified Range Regions

    Hiroshi OHYAMA  Tadahiko KIMOTO  Shin'ichi USUI  Toshiaki FUJII  Masayuki TANIMOTO  

     
    PAPER-Image Coding

      Vol:
    E81-B No:12
      Page(s):
    2257-2268

    A fractal image coding scheme using classified range regions is proposed. Two classes of range regions, shade and nonshade, are defined. A shade range region is encoded by its average gray level, while a nonshade range region is encoded by IFS parameters. To obtain classified range regions, a two-stage block merging scheme is proposed in which each range region is produced by merging primitive square blocks: shade range regions are obtained at the first stage, and nonshade range regions are obtained from the remaining primitive blocks at the second stage. Furthermore, to increase the variety of region shapes, an 8-directional block merging scheme is defined as an extension of the 4-directional scheme. Two similar schemes for encoding region shapes, corresponding to the 4-directional and 8-directional block merging schemes, are also proposed. Simulation results on a test image demonstrate that the variety of region shapes allows large shade range regions to be extracted efficiently, and that these large shade range regions reduce the total number of code bits more effectively, with less degradation of reconstructed image quality, than large nonshade range regions. The 8-directional merging and coding scheme and the 4-directional scheme show almost the same coding performance, which is better than that of the quad-tree partitioning scheme, and they achieve almost the same reconstructed image quality.
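
The shade/nonshade distinction can be illustrated with a simple classifier that thresholds the variance of each primitive block; the actual criterion and threshold used in the paper may differ, so treat this as a sketch.

```python
import numpy as np

def classify_range_blocks(image, block_size=8, shade_threshold=25.0):
    """Classify primitive square blocks as 'shade' (nearly uniform, encodable
    by their mean gray level) or 'nonshade' (encoded by IFS parameters),
    using the block variance as the shade criterion."""
    h, w = image.shape
    labels = {}
    for y in range(0, h - block_size + 1, block_size):
        for x in range(0, w - block_size + 1, block_size):
            block = image[y:y + block_size, x:x + block_size]
            labels[(y, x)] = 'shade' if block.var() < shade_threshold else 'nonshade'
    return labels
```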

  • Simultaneous Attack on CNN-Based Monocular Depth Estimation and Optical Flow Estimation

    Koichiro YAMANAKA  Keita TAKAHASHI  Toshiaki FUJII  Ryutaroh MATSUMOTO  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2021/02/08
      Vol:
    E104-D No:5
      Page(s):
    785-788

    Thanks to the excellent learning capability of deep convolutional neural networks (CNNs), CNN-based methods have achieved great success in computer vision and image recognition tasks. However, it has turned out that these methods often have inherent vulnerabilities, which makes us cautious of the potential risks of using them for real-world applications such as autonomous driving. To reveal such vulnerabilities, we propose a method of simultaneously attacking monocular depth estimation and optical flow estimation, both of which are common artificial-intelligence-based tasks that are intensively investigated for autonomous driving scenarios. Our method can generate an adversarial patch that can fool CNN-based monocular depth estimation and optical flow estimation methods simultaneously by simply placing the patch in the input images. To the best of our knowledge, this is the first work to achieve simultaneous patch attacks on two or more CNNs developed for different tasks.

  • The Optimization of Distributed Processing for Arbitrary View Generation in Camera Sensor Networks

    Mehrdad PANAHPOUR TEHRANI  Purim NA BANGCHANG  Toshiaki FUJII  Masayuki TANIMOTO  

     
    PAPER-Image/Visual Signal Processing

      Vol:
    E87-A No:8
      Page(s):
    1863-1870

    A camera sensor network is an emerging technology in which each sensor node can capture a video signal, process it, and communicate with other nodes. We investigated a dense node configuration in which the requested processing task is arbitrary view generation from the nodes' views. To avoid unnecessary communication between nodes and to speed up the processing, we propose a distributed processing architecture in which the number of nodes sharing image data is optimized. Each sensor node therefore processes part of the interpolation algorithm with local communication between sensor nodes. Two processing methods are used, depending on how much of the image is shared: F-DP (fully image-shared distributed processing) and P-DP (partially image-shared distributed processing). In this research, the network processing time is analyzed theoretically for one user, and the theoretical results agree with the experimental results. In addition, the performance of the proposed DP methods is compared with centralized processing (CP). As a result, the best processing method and the optimum number of nodes can be chosen based on (i) the communication delay of the network, (ii) whether the network has one or more channels for communication among nodes, and (iii) the processing ability of the nodes.
