
Author Search Result

[Author] Toru TAMAKI (10 hits)

1-10 of 10 hits
  • Rephrasing Visual Questions by Specifying the Entropy of the Answer Distribution

    Kento TERAO  Toru TAMAKI  Bisser RAYTCHEV  Kazufumi KANEDA  Shin'ichi SATOH  

     
    PAPER-Image Recognition, Computer Vision

      Publicized: 2020/08/20
      Vol: E103-D No:11
      Page(s): 2362-2370

    Visual question answering (VQA) is the task of answering a visual question, i.e., a pair of a question and an image. Some visual questions are ambiguous and others are clear, and it may be appropriate to change the ambiguity of a question depending on the situation. However, this issue has not been addressed by prior work. We propose a novel task of rephrasing questions while controlling their ambiguity. The ambiguity of a visual question is defined using the entropy of the answer distribution predicted by a VQA model. The proposed model rephrases a source question, given together with an image, so that the rephrased question has the ambiguity (or entropy) specified by the user. We propose two learning strategies to train the proposed model on the VQA v2 dataset, which has no ambiguity annotations. We demonstrate that our approach can control the ambiguity of the rephrased questions, along with the interesting observation that increasing ambiguity is harder than reducing it.
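
    As a rough illustration of the ambiguity measure used in this abstract, the following minimal Python sketch computes the entropy of a predicted answer distribution; the softmax over answer scores and the variable names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def answer_entropy(answer_logits):
    """Entropy of the answer distribution predicted by a VQA model.

    answer_logits: 1-D array of unnormalized scores over candidate answers
    (a hypothetical output of some VQA model).
    """
    # Softmax to obtain a probability distribution over answers.
    z = answer_logits - np.max(answer_logits)
    p = np.exp(z) / np.sum(np.exp(z))
    # Shannon entropy in nats; higher entropy = more ambiguous question.
    return float(-np.sum(p * np.log(p + 1e-12)))

# A peaked distribution (clear question) has lower entropy than a flat one.
print(answer_entropy(np.array([8.0, 0.1, 0.1, 0.1])))  # low entropy
print(answer_entropy(np.array([1.0, 1.0, 1.0, 1.0])))  # high entropy
```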

  • Unified Approach to Image Distortion: D-U and U-D Models

    Toru TAMAKI  

     
    LETTER-Image Recognition, Computer Vision

      Vol: E88-D No:5
      Page(s): 1086-1090

    We propose a unified view of two formulations of image distortion, which have so far been developed separately, together with a method for estimating the distortion parameters for both formulations. The proposed method is based on image registration and uses nonlinear optimization to estimate parameters including view change and radial distortion. Experimental results demonstrate that our approach can handle the two formulations simultaneously.
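
    For readers unfamiliar with the two formulations, the sketch below contrasts a distorted-to-undistorted (D-U) and an undistorted-to-distorted (U-D) radial distortion mapping using a single-coefficient model; the exact parameterization in the letter may differ, so treat this only as an assumed illustration.

```python
import numpy as np

def du_model(xd, yd, k):
    """D-U: map distorted image coordinates to undistorted ones
    with a single radial coefficient k (assumed one-parameter model)."""
    r2 = xd**2 + yd**2
    s = 1.0 + k * r2
    return xd * s, yd * s

def ud_model(xu, yu, k):
    """U-D: map undistorted coordinates to distorted ones with the
    same one-parameter form, applied in the opposite direction."""
    r2 = xu**2 + yu**2
    s = 1.0 + k * r2
    return xu * s, yu * s

# The two models are not exact inverses of each other, which is why
# estimating parameters for both within one framework is non-trivial.
x, y = du_model(*ud_model(0.5, 0.5, -0.1), k=0.1)
print(x, y)  # close to, but not exactly, (0.5, 0.5)
```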

  • Feasibility Study for Computer-Aided Diagnosis System with Navigation Function of Clear Region for Real-Time Endoscopic Video Image on Customizable Embedded DSP Cores

    Masayuki ODAGAWA  Tetsushi KOIDE  Toru TAMAKI  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  

     
    LETTER-VLSI Design Technology and CAD

      Publicized: 2021/07/08
      Vol: E105-A No:1
      Page(s): 58-62

    This paper presents the results of a feasibility study on automatic unclear-region detection in a CAD system for colorectal tumors using real-time endoscopic video images. We confirmed that a CAD system with a clear-region navigation function, consisting of unclear-region detection by YOLO2 and classification by AlexNet and SVMs, can be realized on customizable embedded DSP cores. Moreover, we confirmed that the real-time CAD system can be built as a low-power ASIC using customizable embedded DSP cores.
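
    The per-frame flow described above (unclear-region detection followed by classification of the remaining clear regions) could be organized roughly as in the Python sketch below; the boxes, the overlap threshold, and the filtering rule are hypothetical placeholders, not the embedded DSP implementation.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / float(area(a) + area(b) - inter + 1e-9)

def clear_regions(candidate_boxes, unclear_boxes, iou_thresh=0.5):
    """Keep candidate lesion regions that do not overlap any unclear region.

    candidate_boxes would come from a hypothetical region-proposal stage and
    unclear_boxes from the YOLO2-style unclear-region detector; only the
    surviving regions are passed on to AlexNet + SVM classification.
    """
    return [b for b in candidate_boxes
            if all(iou(b, u) <= iou_thresh for u in unclear_boxes)]

# Example: one of two candidates overlaps a blurred area and is dropped.
print(clear_regions([(0, 0, 50, 50), (50, 50, 110, 110)],
                    [(40, 40, 100, 100)]))
```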

  • Trajectory-Set Feature for Action Recognition

    Kenji MATSUI  Toru TAMAKI  Bisser RAYTCHEV  Kazufumi KANEDA  

     
    LETTER-Pattern Recognition

      Publicized: 2017/05/10
      Vol: E100-D No:8
      Page(s): 1922-1924

    We propose a feature for action recognition called Trajectory-Set (TS), built on top of the improved Dense Trajectories (iDT). The TS feature encodes only the trajectories around densely sampled interest points, without any appearance features. Experimental results on the UCF50 action dataset demonstrate that TS is comparable to the state of the art and outperforms iDT, with an accuracy of 95.0% compared to 91.7% for iDT.
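
    A very rough sketch of how a trajectory-set style descriptor might be formed is given below: displacement vectors of a small group of neighboring trajectories are concatenated, with no appearance features. The grouping and normalization here are assumptions for illustration and not the exact TS construction.

```python
import numpy as np

def trajectory_displacements(traj):
    """traj: (L+1, 2) array of (x, y) points over L+1 frames.
    Returns the L displacement vectors, flattened."""
    traj = np.asarray(traj, dtype=float)
    d = np.diff(traj, axis=0)
    # Normalize by the total displacement magnitude (common in iDT-style
    # descriptors; assumed here for illustration).
    norm = np.sum(np.linalg.norm(d, axis=1)) + 1e-12
    return (d / norm).ravel()

def trajectory_set_feature(trajs):
    """Concatenate the displacement descriptors of a set of trajectories
    sampled around one interest point (a hypothetical grouping)."""
    return np.concatenate([trajectory_displacements(t) for t in trajs])

# Two toy 4-frame trajectories around one point -> one TS-style vector.
ts = trajectory_set_feature([[(0, 0), (1, 0), (2, 1), (3, 1)],
                             [(5, 5), (5, 6), (6, 7), (6, 8)]])
print(ts.shape)  # (12,)
```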

  • Classification with CNN features and SVM on Embedded DSP Core for Colorectal Magnified NBI Endoscopic Video Image

    Masayuki ODAGAWA  Takumi OKAMOTO  Tetsushi KOIDE  Toru TAMAKI  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  

     
    PAPER-VLSI Design Technology and CAD

      Publicized: 2021/07/21
      Vol: E105-A No:1
      Page(s): 25-34

    In this paper, we present a classification method for a Computer-Aided Diagnosis (CAD) system for colorectal magnified Narrow Band Imaging (NBI) endoscopy. In endoscopic video images, color shift, blurring, or light reflection occurs in lesion areas, which affects the discrimination result produced by a computer. Therefore, to identify lesions robustly and classify them stably despite these video-frame-specific artifacts, we implement a CAD system for colorectal endoscopic images with Convolutional Neural Network (CNN) features and Support Vector Machine (SVM) classification on an embedded DSP core. To improve the robustness of the CAD system, we train the SVM on data sets of multiple image sizes so that it adapts to the noise peculiar to video images. We confirmed that the proposed method achieves robust, stable, and highly accurate classification on endoscopic video images. The proposed method can also cope with differences in resolution between old and new endoscopes and performs stably on the input endoscopic video.
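
    The multi-size training idea summarized above can be sketched in Python with scikit-learn as follows; the feature extractor, image sizes, and labels are placeholders, so this is only an assumed outline of CNN-feature extraction followed by SVM training on several resized versions of each image, not the system in the paper.

```python
import numpy as np
from sklearn.svm import SVC

POOL = 8  # fixed feature grid, so all image sizes map to the same length

def extract_feature(image):
    """Stand-in for a CNN feature extractor: block-average the image to a
    fixed POOL x POOL grid and flatten. A real system would use CNN features."""
    h, w = image.shape
    bh, bw = h // POOL, w // POOL
    small = image[:bh * POOL, :bw * POOL].reshape(POOL, bh, POOL, bw).mean(axis=(1, 3))
    return small.ravel()

def resize_nn(image, size):
    """Nearest-neighbour resize, used to mimic frames of different resolutions."""
    idx_r = np.arange(size) * image.shape[0] // size
    idx_c = np.arange(size) * image.shape[1] // size
    return image[np.ix_(idx_r, idx_c)]

rng = np.random.default_rng(0)
images = rng.random((20, 64, 64))          # toy "frames"
labels = rng.integers(0, 2, size=20)       # toy lesion / non-lesion labels

# Train the SVM on features from several image sizes of the same frames,
# mimicking the multi-size training described in the abstract.
X, y = [], []
for img, lab in zip(images, labels):
    for size in (32, 48, 64):              # assumed sizes for illustration
        X.append(extract_feature(resize_nn(img, size)))
        y.append(lab)

clf = SVC(kernel="rbf").fit(np.array(X), np.array(y))
print(clf.predict([extract_feature(images[0])]))
```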

  • Model-Agnostic Multi-Domain Learning with Domain-Specific Adapters for Action Recognition

    Kazuki OMI  Jun KIMATA  Toru TAMAKI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized: 2022/09/15
      Vol: E105-D No:12
      Page(s): 2119-2126

    In this paper, we propose a multi-domain learning model for action recognition. The proposed method inserts domain-specific adapters between the domain-independent layers of a backbone network. Unlike a multi-head network that switches only classification heads, our model switches not only the heads but also the adapters, facilitating the learning of feature representations that are universal across multiple domains. Unlike prior work, the proposed method is model-agnostic and makes no assumptions about the model structure. Experimental results on three popular action recognition datasets (HMDB51, UCF101, and Kinetics-400) demonstrate that the proposed method is more effective than a multi-head architecture and more efficient than training a separate model for each domain.
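
    As a concrete picture of the adapter idea, here is a minimal PyTorch-style sketch that wraps shared (domain-independent) layers with per-domain adapters and per-domain heads; the adapter design (a small residual bottleneck) and all dimensions are assumptions, not the exact architecture in the paper.

```python
import torch
import torch.nn as nn

class DomainAdapter(nn.Module):
    """A small residual bottleneck adapter (an assumed design)."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.down = nn.Linear(dim, hidden)
        self.up = nn.Linear(hidden, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

class MultiDomainModel(nn.Module):
    """Shared backbone layers with domain-specific adapters and heads."""
    def __init__(self, dim, num_classes_per_domain):
        super().__init__()
        self.shared = nn.ModuleList([nn.Linear(dim, dim) for _ in range(2)])
        self.adapters = nn.ModuleList([
            nn.ModuleList([DomainAdapter(dim) for _ in self.shared])
            for _ in num_classes_per_domain])
        self.heads = nn.ModuleList(
            [nn.Linear(dim, c) for c in num_classes_per_domain])

    def forward(self, x, domain):
        # Switch both the adapters and the head according to the domain.
        for layer, adapter in zip(self.shared, self.adapters[domain]):
            x = adapter(torch.relu(layer(x)))
        return self.heads[domain](x)

model = MultiDomainModel(dim=128, num_classes_per_domain=[51, 101, 400])
video_feature = torch.randn(4, 128)       # toy clip-level features
logits = model(video_feature, domain=1)   # e.g. domain 1 = UCF101
print(logits.shape)                       # torch.Size([4, 101])
```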

  • A Hardware Implementation on Customizable Embedded DSP Core for Colorectal Tumor Classification with Endoscopic Video toward Real-Time Computer-Aided Diagnosis System

    Masayuki ODAGAWA  Takumi OKAMOTO  Tetsushi KOIDE  Toru TAMAKI  Bisser RAYTCHEV  Kazufumi KANEDA  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  Takayuki SUGAWARA  Hiroshi TOISHI  Masayuki TSUJI  Nobuo TAMBA  

     
    PAPER-VLSI Design Technology and CAD

      Publicized: 2020/10/06
      Vol: E104-A No:4
      Page(s): 691-701

    In this paper, we present a hardware implementation of a colorectal cancer diagnosis support system for colorectal endoscopic video images on a customizable embedded DSP. In endoscopic video images, color shift, blurring, or light reflection occurs in lesion areas, which affects the discrimination result produced by a computer. Therefore, to identify lesions robustly and classify them stably despite these video-frame-specific artifacts, we implement a computer-aided diagnosis (CAD) system for colorectal endoscopic images with Narrow Band Imaging (NBI) magnification, using Convolutional Neural Network (CNN) features and Support Vector Machine (SVM) classification. Since the CNN and SVM require many multiply-accumulate (MAC) operations, we implement the proposed system on a customizable embedded DSP, which realizes high-speed MAC operations and parallel processing with a Very Long Instruction Word (VLIW) architecture. Before implementation on the customizable embedded DSP, we profile and analyze the processing cycles of the CAD system and optimize the bottlenecks. We show the effectiveness of the real-time diagnosis support system on the embedded system for endoscopic video images. The prototyped system demonstrated real-time processing at video frame rate (over 30 fps at 200 MHz) and more than 90% accuracy.
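
    To give a feel for the profiling step mentioned above, the sketch below estimates multiply-accumulate (MAC) counts per layer of a small CNN to locate bottlenecks before an embedded implementation; the layer list and shapes are invented for illustration and are not the authors' network.

```python
# Rough MAC-count estimation for convolution / fully connected layers,
# the kind of analysis used to find bottlenecks before mapping a CNN + SVM
# pipeline onto an embedded DSP. All layer shapes below are hypothetical.

def conv_macs(h, w, cin, cout, k):
    """MACs of a k x k convolution producing an h x w x cout output."""
    return h * w * cout * cin * k * k

def fc_macs(n_in, n_out):
    """MACs of a fully connected layer."""
    return n_in * n_out

layers = [
    ("conv1", conv_macs(56, 56, 3, 64, 3)),
    ("conv2", conv_macs(28, 28, 64, 128, 3)),
    ("conv3", conv_macs(14, 14, 128, 256, 3)),
    ("fc",    fc_macs(14 * 14 * 256, 4096)),
    ("svm",   fc_macs(4096, 2)),          # linear SVM decision ~ dot product
]

total = sum(m for _, m in layers)
for name, m in layers:
    print(f"{name:6s} {m:12d} MACs  ({100.0 * m / total:5.1f}% of total)")
```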

  • Calibration Method by Image Registration with Synthetic Image of 3D Model

    Toru TAMAKI  Masanobu YAMAMOTO  

     
    LETTER-Image Processing, Image Pattern Recognition

      Vol: E86-D No:5
      Page(s): 981-985

    We propose a method for camera calibration based on image registration. The method registers two images: a real image of a calibration object with known shape and texture captured by a camera, and a synthetic image containing the same object. The proposed method estimates the rotation and translation parameters of the object by using the depth information of the synthetic image. The Gauss-Newton method is used to minimize the residuals of the intensities of the two images. The proposed method does not depend on the initial values of the minimization and is applicable to images with considerable noise. Experimental results using real images demonstrate robustness against the initial state and image noise.
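
    The Gauss-Newton minimization of intensity residuals mentioned above can be illustrated with a tiny 1-D registration example; the 1-D signal, single translation parameter, and step count are simplifications chosen only to show the structure of the update, not the actual 3-D pose estimation.

```python
import numpy as np

# Tiny 1-D analogue of intensity-based registration: estimate a shift t so
# that the warped observed signal matches the template, by Gauss-Newton on
# the intensity residuals. (The paper estimates 3-D rotation/translation;
# this 1-D version only illustrates the update.)
x = np.linspace(0, 2 * np.pi, 200)
template = np.sin(x)
true_shift = 0.4
observed = np.sin(x + true_shift)

def sample(signal, xi):
    return np.interp(xi, x, signal)

t = 0.0  # initial shift estimate
for _ in range(20):
    warped = sample(observed, x - t)               # warp observation by -t
    r = warped - template                          # intensity residuals
    # Jacobian of the residual w.r.t. t: numerical derivative of the warp.
    eps = 1e-4
    j = (sample(observed, x - (t + eps)) - warped) / eps
    # Gauss-Newton update: t <- t - (J^T J)^{-1} J^T r
    t -= (j @ r) / (j @ j)

print(round(t, 3))  # close to the true shift 0.4
```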

  • A Method for Compensation of Image Distortion with Image Registration Technique

    Toru TAMAKI  Tsuyoshi YAMAMURA  Noboru OHNISHI  

     
    PAPER

      Vol: E84-D No:8
      Page(s): 990-998

    We propose a method for compensating image distortion by calibrating intrinsic camera parameters through image registration, which does not require point-to-point correspondence. The proposed method divides the registration between a calibration pattern and a distorted image observed by a camera into two steps. The first step is the straightforward registration from the pattern, which corrects the displacement due to projection. The second step is the backward registration from the observed image, which compensates for the image distortion. Both steps use the Gauss-Newton method, a nonlinear optimization technique, to minimize the residuals of intensities so that the pattern and the observed image coincide. Experimental results show the usefulness of the proposed method. Finally, we discuss the convergence of the proposed method, which consists of the two registration steps.
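
    Once the intrinsic distortion parameter has been estimated, compensation amounts to resampling the observed image through the distortion model; the sketch below undistorts an image with a single radial coefficient via inverse mapping, which is an assumed simplification of the compensation step rather than the paper's full model.

```python
import numpy as np

def undistort(image, k):
    """Compensate single-coefficient radial distortion by inverse mapping:
    for every undistorted pixel, look up the corresponding distorted pixel.
    (An assumed one-parameter model; the paper's full model may differ.)"""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    # Normalized coordinates of the undistorted (output) image.
    xn, yn = (xx - cx) / cx, (yy - cy) / cy
    r2 = xn**2 + yn**2
    s = 1.0 + k * r2                     # forward distortion factor
    xd = np.clip(xn * s * cx + cx, 0, w - 1).astype(int)
    yd = np.clip(yn * s * cy + cy, 0, h - 1).astype(int)
    return image[yd, xd]                 # nearest-neighbour resampling

grid = (np.indices((100, 100)).sum(axis=0) % 20 < 10).astype(float)
print(undistort(grid, k=-0.1).shape)     # (100, 100)
```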

  • Object-ABN: Learning to Generate Sharp Attention Maps for Action Recognition

    Tomoya NITTA  Tsubasa HIRAKAWA  Hironobu FUJIYOSHI  Toru TAMAKI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized: 2022/12/14
      Vol: E106-D No:3
      Page(s): 391-400

    In this paper, we propose an extension of the Attention Branch Network (ABN) that uses instance segmentation to generate sharper attention maps for action recognition. Visual explanation methods such as Grad-CAM usually generate blurry maps that are not intuitive for humans to understand, particularly when recognizing the actions of people in videos. Our proposed method, Object-ABN, tackles this issue by introducing a new mask loss that makes the generated attention maps close to the instance segmentation result. Furthermore, the Prototype Conformity (PC) loss and multiple attention maps are introduced to enhance the sharpness of the maps and improve classification performance. Experimental results on UCF101 and SSv2 show that the maps generated by the proposed method are much clearer, both qualitatively and quantitatively, than those of the original ABN.
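
    The mask loss described above, which pulls attention maps toward instance segmentation masks, might look roughly like the following PyTorch sketch; the choice of an L1 penalty after per-map normalization is an assumption, not necessarily the loss used in the paper.

```python
import torch
import torch.nn.functional as F

def mask_loss(attention, instance_mask):
    """Penalize the distance between a generated attention map and a binary
    instance segmentation mask of the people in the frame.

    attention:     (B, 1, H, W) raw attention maps from the attention branch
    instance_mask: (B, 1, H, W) binary person masks (e.g., from an
                   off-the-shelf instance segmentation model)
    """
    # Normalize each attention map to [0, 1] so it is comparable to the mask.
    a = attention.flatten(2)
    a = (a - a.min(dim=2, keepdim=True).values) / (
        a.max(dim=2, keepdim=True).values - a.min(dim=2, keepdim=True).values + 1e-6)
    a = a.view_as(attention)
    # L1 distance to the mask (an assumed choice of penalty).
    return F.l1_loss(a, instance_mask)

att = torch.rand(2, 1, 14, 14, requires_grad=True)
mask = (torch.rand(2, 1, 14, 14) > 0.5).float()
loss = mask_loss(att, mask)
loss.backward()
print(float(loss))
```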