Author Search Result

[Author] Satoshi KOMORITA (2 hits)

Hits 1-2
  • SDOF-Tracker: Fast and Accurate Multiple Human Tracking by Skipped-Detection and Optical-Flow

    Hitoshi NISHIMURA  Satoshi KOMORITA  Yasutomo KAWANISHI  Hiroshi MURASE  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2022/08/01
    Vol: E105-D  No: 11
    Page(s): 1938-1946

    Multiple human tracking is a fundamental problem in understanding the context of a visual scene. Although real-world applications require both accuracy and speed, recent tracking methods based on deep learning focus on accuracy and require a substantial amount of running time. We aim to speed up tracking by performing human detection only at certain frame intervals, because detection accounts for most of the running time. The question is how to maintain accuracy while skipping human detection. In this paper, we propose a method that interpolates the detection results using optical flow, based on the fact that a person's appearance does not change much between adjacent frames. To maintain tracking accuracy, we introduce robust interest point detection within the human regions and a tracking termination metric defined by the distribution of the interest points. On the MOT17 and MOT20 datasets of the MOTChallenge, the proposed SDOF-Tracker achieved the best performance in terms of total running time while maintaining the MOTA metric. Our code is available at https://github.com/hitottiez/sdof-tracker.
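
The idea of skipping detection and carrying boxes forward with optical flow can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed detection interval, the median-displacement update, and the point-count termination rule are all assumptions made for illustration, and the point correspondences are taken as already computed by some optical-flow routine.

```python
from statistics import median


def should_run_detector(frame_idx, interval):
    """Run the (expensive) detector only every `interval` frames (assumed schedule)."""
    return frame_idx % interval == 0


def shift_box(box, prev_pts, curr_pts):
    """Move a box (x1, y1, x2, y2) by the median displacement of its tracked points.

    prev_pts and curr_pts are matched lists of (x, y) interest points from the
    previous and current frame; both must be non-empty.
    """
    dx = median(c[0] - p[0] for p, c in zip(prev_pts, curr_pts))
    dy = median(c[1] - p[1] for p, c in zip(prev_pts, curr_pts))
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def keep_track(box, curr_pts, min_points=3):
    """Hypothetical termination rule: keep the track only while enough points lie in the box.

    The paper defines its metric on the distribution of interest points;
    a simple count is used here as a stand-in.
    """
    x1, y1, x2, y2 = box
    inside = [p for p in curr_pts if x1 <= p[0] <= x2 and y1 <= p[1] <= y2]
    return len(inside) >= min_points


if __name__ == "__main__":
    box = (0, 0, 10, 10)
    prev_pts = [(1, 1), (2, 2), (3, 3)]
    curr_pts = [(3, 2), (4, 3), (5, 4)]
    new_box = shift_box(box, prev_pts, curr_pts)
    print(new_box, keep_track(new_box, curr_pts))
```

On frames where `should_run_detector` returns false, each existing box would be updated with `shift_box` and dropped when `keep_track` fails; on the remaining frames the detector output would replace the interpolated boxes.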

  • FSPose: A Heterogeneous Framework with Fast and Slow Networks for Human Pose Estimation in Videos

    Jianfeng XU  Satoshi KOMORITA  Kei KAWAMURA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2023/03/20
    Vol: E106-D  No: 6
    Page(s): 1165-1174

    We propose a framework for integrating heterogeneous networks in human pose estimation (HPE) with the aim of balancing accuracy and computational complexity. Although many existing methods can improve the accuracy of HPE by using multiple frames in videos, they also increase the computational complexity. The key difference is that the proposed heterogeneous framework uses different networks for different types of frames, while existing methods use the same networks for all frames. In particular, we propose to divide the video frames into two types, key frames and non-key frames, and to adopt three kinds of networks in our heterogeneous framework: slow networks, fast networks, and transfer networks. For key frames, a slow network is used that has high accuracy but high computational complexity. For non-key frames that follow a key frame, we propose to warp the heatmap of a slow network from a key frame via a transfer network and fuse it with the heatmap of a fast network that has low accuracy but low computational complexity. Furthermore, when a large number of non-key frames follow a key frame, the temporal correlation with the key frame decreases. Therefore, when necessary, we use an additional transfer network that warps the heatmap from a neighboring non-key frame. The experimental results on the PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed FSPose achieves a better balance between accuracy and computational complexity than the competing method. Our source code is available at https://github.com/Fenax79/fspose.
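
The split between key frames and non-key frames can be illustrated with a minimal sketch. This is not the FSPose implementation: the fixed key-frame interval, the function names, and the weighted-average fusion are assumptions made for illustration, and heatmaps are represented as plain nested lists rather than network outputs.

```python
def is_key_frame(frame_idx, key_interval):
    """Assumed schedule: every `key_interval`-th frame is a key frame."""
    return frame_idx % key_interval == 0


def plan_networks(num_frames, key_interval):
    """Label each frame with the kind of processing it would receive."""
    return [
        "slow" if is_key_frame(i, key_interval) else "fast+transfer"
        for i in range(num_frames)
    ]


def fuse_heatmaps(warped, fast, weight=0.5):
    """Blend a warped key-frame heatmap with a fast-network heatmap.

    A fixed weighted average is a stand-in here; the abstract does not
    specify how the two heatmaps are fused.
    """
    return [
        [weight * w + (1.0 - weight) * f for w, f in zip(w_row, f_row)]
        for w_row, f_row in zip(warped, fast)
    ]


if __name__ == "__main__":
    print(plan_networks(5, 3))
    print(fuse_heatmaps([[1.0, 0.0]], [[0.0, 1.0]]))
```

With a longer run of non-key frames, the `warped` input would come from a neighboring non-key frame rather than the key frame, as the abstract describes for long-term usage.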