
Author Search Result

[Author] Hideaki KIMATA(7hit)

  • Stabilization Technique for Region-of-Interest Trajectories Made from Video Watching Manipulations

    Daisuke OCHI  Hideaki KIMATA  Yoshinori KUSACHI  Kosuke TAKAHASHI  Akira KOJIMA  

     
    PAPER-Human-computer Interaction

    Vol: E97-D  No: 2  Page(s): 266-274

    Thanks to recent progress in camera and network environments, online video services let people around the world watch and share high-quality HD videos that capture a wide angle without losing the details of objects in each image. As a result, users can watch the same video in different ways, with different ROIs (Regions of Interest), especially when a scene contains multiple objects, and so they have few common ways to convey their impressions of each scene directly. Posting messages is currently the usual way, but it does not sufficiently enable all users to convey their impressions. To convey a user's impressions directly and provide a richer video watching experience, we propose a system that enables users to extract their favorite parts of videos as ROI trajectories through simple and intuitive manipulation of a tablet device. It also enables them to share a recorded trajectory with others after stabilizing it in a manner that should be satisfactory to every user. Using statistical analysis of user manipulations, we have demonstrated an approach to trajectory stabilization that can eliminate undesirable or uncomfortable elements caused by tablet-specific manipulations. The system's validity has been confirmed by subjective evaluations.

  • Depth Range Control in Visually Equivalent Light Field 3D Open Access

    Munekazu DATE  Shinya SHIMIZU  Hideaki KIMATA  Dan MIKAMI  Yoshinori KUSACHI  

     
    INVITED PAPER-Electronic Displays

    Publicized: 2020/08/13  Vol: E104-C  No: 2  Page(s): 52-58

    3D video content depends on the shooting conditions, namely the camera positioning. Depth range control in the post-processing stage is not easy, but it is essential, as video from arbitrary camera positions must be generated. If light field information can be obtained, video from any viewpoint can be generated exactly and post-processing is possible. However, a light field contains a huge amount of data, and capturing one is not easy. To reduce the amount of data, we proposed the visually equivalent light field (VELF), which exploits the characteristics of human vision. Although a number of cameras are needed, VELF can be captured by a camera array. Because camera interpolation uses linear blending, the calculation is simple enough that we can construct a ray distribution field of VELF by optical interpolation in the VELF3D display. It produces high image quality due to its high pixel usage efficiency. In this paper, we summarize the relationship between the characteristics of human vision, VELF, and the VELF3D display. We then propose a method to control the depth range of the image observed on the VELF3D display and discuss the effectiveness and limitations of displaying the processed image on the VELF3D display. Our method can be applied to other 3D displays. Since the calculation is just weighted averaging, it is suitable for real-time applications.
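The linear blending mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' depth range control method; it only shows a weighted average of two adjacent camera views for an intermediate viewpoint. The function name `blend_views`, the list-of-rows image representation, and the parameter `t` are assumptions made for illustration.

```python
def blend_views(left, right, t):
    """Linearly blend two equally sized grayscale views (lists of rows).

    t = 0 returns the left view, t = 1 the right view; values in between
    give a weighted average, i.e. a hypothetical intermediate viewpoint.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(left) != len(right) or any(
        len(a) != len(b) for a, b in zip(left, right)
    ):
        raise ValueError("views must have the same shape")
    # Per-pixel weighted average: (1 - t) * left + t * right.
    return [
        [(1.0 - t) * a + t * b for a, b in zip(row_l, row_r)]
        for row_l, row_r in zip(left, right)
    ]


# Two tiny 2x2 "camera images" and a viewpoint a quarter of the way
# from the left camera to the right camera.
left = [[0.0, 100.0], [50.0, 200.0]]
right = [[100.0, 100.0], [150.0, 0.0]]
middle = blend_views(left, right, 0.25)
```

Each output pixel costs two multiplications and one addition, which is consistent with the abstract's remark that weighted averaging suits real-time use.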

  • Extrinsic Camera Calibration of Display-Camera System with Cornea Reflections

    Kosuke TAKAHASHI  Dan MIKAMI  Mariko ISOGAWA  Akira KOJIMA  Hideaki KIMATA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2018/09/26  Vol: E101-D  No: 12  Page(s): 3199-3208

    In this paper, we propose a novel method to extrinsically calibrate a camera to a 3D reference object that is not directly visible from the camera. We use a human cornea as a spherical mirror and calibrate the extrinsic parameters from the reflections of the reference points. The main contribution of this paper is to present a cornea-reflection-based calibration algorithm with a simple configuration: five reference points on a single plane and one mirror pose. In this paper, we derive a linear equation and obtain a closed-form solution of extrinsic calibration by introducing two ideas. The first is to model the cornea as a virtual sphere, which enables us to estimate the center of the cornea sphere from its projection. The second is to use basis vectors to represent the position of the reference points, which enables us to deal with 3D information of reference points compactly. We demonstrate the performance of the proposed method with qualitative and quantitative evaluations using synthesized and real data.

  • GAN-Based Image Compression Using Mutual Information for Optimizing Subjective Image Similarity

    Shinobu KUDO  Shota ORIHASHI  Ryuichi TANIDA  Seishi TAKAMURA  Hideaki KIMATA  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2020/12/02  Vol: E104-D  No: 3  Page(s): 450-460

    Recently, image compression systems based on convolutional neural networks that use flexible nonlinear analysis and synthesis transformations have been developed to improve the restoration accuracy of decoded images. Although these methods, which use objective metrics such as peak signal-to-noise ratio and multi-scale structural similarity for optimization, attain high objective results, such metrics may not reflect human visual characteristics and may thus degrade subjective image quality. A method using a framework called a generative adversarial network (GAN) has been reported as one of the methods aiming to improve subjective image quality. It optimizes the distribution of restored images to be close to that of natural images; thus it suppresses visual artifacts such as blurring, ringing, and blocking. However, since methods of this type are optimized to focus on whether the restored image is subjectively natural or not, components that are not correlated with the original image are mixed into the restored image during the decoding process. Thus, even though the result looks natural, subjective similarity may be degraded. In this paper, we investigated why conventional GAN-based compression techniques degrade subjective similarity, then tackled this problem by rethinking how to handle image generation in the GAN framework between image sources with different probability distributions. The paper describes a method to maximize mutual information between the coding features and the restored images. Experimental results show that the proposed mutual information amount is clearly correlated with subjective similarity and that the method makes it possible to develop image compression systems with high subjective similarity.

  • Local Riesz Pyramid for Faster Phase-Based Video Magnification

    Shoichiro TAKEDA  Megumi ISOGAI  Shinya SHIMIZU  Hideaki KIMATA  

     
    PAPER

    Publicized: 2020/06/22  Vol: E103-D  No: 10  Page(s): 2036-2046

    Phase-based video magnification methods can magnify and reveal subtle motion changes invisible to the naked eye. In these methods, each image frame in a video is decomposed into an image pyramid, and subtle motion changes are then detected as local phase changes with arbitrary orientations at each pixel and each pyramid level. One problem with this process is a long computational time to calculate the local phase changes, which makes high-speed processing of video magnification difficult. Recently, a decomposition technique called the Riesz pyramid has been proposed that detects only local phase changes in the dominant orientation. This technique can remove the arbitrariness of orientations and lower the over-completeness, thus achieving high-speed processing. However, as the resolution of input video increases, a large amount of data must be processed, requiring a long computational time. In this paper, we focus on the correlation of local phase changes between adjacent pyramid levels and present a novel decomposition technique called the local Riesz pyramid that enables faster phase-based video magnification by automatically processing the minimum number of sufficient local image areas at several pyramid levels. Through this minimum pyramid processing, our proposed phase-based video magnification method using the local Riesz pyramid achieves good magnification results within a short computational time.

  • Image Based Coding of Spatial Probability Distribution on Human Dynamics Data

    Hideaki KIMATA  Xiaojun WU  Ryuichi TANIDA  

     
    PAPER

    Publicized: 2021/06/24  Vol: E104-D  No: 10  Page(s): 1545-1554

    The need for real-time use of human dynamics data is increasing. The technical requirements for this include improved databases for handling large amounts of data as well as highly accurate sensing of people's movements. A bitmap index format has been proposed for high-speed processing of data spread over a two-dimensional space. Using the same format is expected to enable a service that handles search queries, reads out the desired data, and visualizes and analyzes it. In this study, we propose a coding format that compresses human dynamics data to a target data size, in order to save storage as real-time human dynamics data continues to accumulate. In the proposed method, the spatial population distribution, which is expressed as a probability distribution, is approximated and compressed using the one-pixel one-byte data format normally used for image coding. We utilize two kinds of approximation, accuracy of probability and precision of spatial location, in order to control the data size and the amount of information. For accuracy of probability, we propose a non-linear mapping method for the spatial distribution, and for precision of spatial location, we propose spatial scalable layered coding to refine the mesh level of the spatial distribution. Also, in order to enable additional detailed analysis, we propose another scalable layered coding that improves the accuracy of the distribution. We demonstrate through experiments that the proposed data approximation and coding format achieve a sufficient approximation of the spatial population distribution under a given target data size.
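The idea of mapping a probability to a one-byte pixel value through a non-linear function, and of changing the mesh level, can be sketched as follows. The abstract does not specify the mapping, so square-root companding is used here purely as an assumed stand-in; the names `encode_cell`, `decode_cell`, and `coarsen`, and the 2x2 merging rule, are likewise hypothetical and are not taken from the paper.

```python
import math


def encode_cell(p, p_max):
    """Map a probability in [0, p_max] to an integer code in 0..255.

    Square-root companding (an assumption, not the paper's mapping)
    spends more of the 256 codes on small probabilities.
    """
    if p_max <= 0 or not 0.0 <= p <= p_max:
        raise ValueError("need 0 <= p <= p_max and p_max > 0")
    return round(255 * math.sqrt(p / p_max))


def decode_cell(code, p_max):
    """Invert encode_cell up to quantization error."""
    if not 0 <= code <= 255:
        raise ValueError("code must fit in one byte")
    return (code / 255) ** 2 * p_max


def coarsen(grid):
    """Halve the mesh resolution by summing each 2x2 block of cells."""
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    return [
        [
            grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# A small probability survives the non-linear mapping, whereas a linear
# mapping round(255 * p / p_max) would collapse it to code 0.
small_code = encode_cell(0.0001, 1.0)
linear_code = round(255 * 0.0001)

# A 2x2 uniform distribution merges into a single coarse cell.
coarse = coarsen([[0.25, 0.25], [0.25, 0.25]])
```

In this sketch the non-linear mapping trades accuracy at high probabilities for resolution at low ones, while `coarsen` trades spatial precision for fewer cells; both reduce the data needed per distribution.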

  • Multi-Layered DP Quantization Algorithm Open Access

    Yukihiro BANDOH  Seishi TAKAMURA  Hideaki KIMATA  

     
    PAPER-Image

    Vol: E103-A  No: 12  Page(s): 1552-1561

    Designing an optimum quantizer can be treated as the optimization problem of finding the quantization indices that minimize the quantization error. One solution to the optimization problem, DP quantization, is based on dynamic programming. Some applications, such as bit-depth scalable codecs and tone mapping, require the construction of multiple quantizers with different quantization levels, for example, from 12 bits/channel to 10 bits/channel and 8 bits/channel. Unfortunately, the above-mentioned DP quantization optimizes the quantizer for just one quantization level. That is, it is unable to simultaneously optimize multiple quantizers. Therefore, when DP quantization is used to design multiple quantizers, there are many redundant computations in the optimization process. This paper proposes an extended DP quantization with a complexity reduction algorithm for the optimal design of multiple quantizers. Experiments show that the proposed algorithm reduces complexity by 20.8%, on average, compared to conventional DP quantization.
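For readers unfamiliar with dynamic-programming quantizer design, the following is a minimal sketch of a textbook single-level formulation: split a histogram into a fixed number of contiguous bins so that the total squared error about each bin's mean is minimal. It does not implement the paper's multi-layered extension or its complexity reduction; the function name `dp_quantize` and the histogram input format are assumptions.

```python
def dp_quantize(hist, levels):
    """Optimal contiguous binning of values 0..len(hist)-1 into `levels` bins.

    hist[v] is the number of samples with value v. Returns (error, bounds),
    where bounds = [0, b1, ..., len(hist)] and bin k covers
    values bounds[k] .. bounds[k+1]-1.
    """
    n = len(hist)
    if not 1 <= levels <= n:
        raise ValueError("levels must be between 1 and len(hist)")

    # Prefix sums of count, count*value and count*value^2 give O(1) bin cost.
    cnt = [0.0] * (n + 1)
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for v, h in enumerate(hist):
        cnt[v + 1] = cnt[v] + h
        s1[v + 1] = s1[v] + h * v
        s2[v + 1] = s2[v] + h * v * v

    def cost(i, j):
        """Squared error of values i..j-1 represented by their mean."""
        c = cnt[j] - cnt[i]
        if c == 0:
            return 0.0
        s = s1[j] - s1[i]
        return (s2[j] - s2[i]) - s * s / c

    inf = float("inf")
    # best[k][j]: minimum error covering values 0..j-1 with k bins.
    best = [[inf] * (n + 1) for _ in range(levels + 1)]
    back = [[0] * (n + 1) for _ in range(levels + 1)]
    best[0][0] = 0.0
    for k in range(1, levels + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                cand = best[k - 1][i] + cost(i, j)
                if cand < best[k][j]:
                    best[k][j] = cand
                    back[k][j] = i

    # Walk the back-pointers from the last bin to the first.
    bounds = [n]
    j = n
    for k in range(levels, 0, -1):
        j = back[k][j]
        bounds.append(j)
    bounds.reverse()
    return best[levels][n], bounds


# Two clusters of samples, at values {0, 1} and {6, 7}.
hist = [5, 5, 0, 0, 0, 0, 5, 5]
error_two, bounds_two = dp_quantize(hist, 2)
error_one, bounds_one = dp_quantize(hist, 1)
```

With two bins the sketch separates the two clusters; with one bin a single representative value must serve all samples, so the error is much larger. The triple loop costs on the order of `levels * n^2` operations for one target level count.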