Author Search Result

[Author] Songlin DU(6hit)

  • Temporally Forward Nonlinear Scale Space for High Frame Rate and Ultra-Low Delay A-KAZE Matching System

    Songlin DU  Yuan LI  Takeshi IKENAGA  

     
    PAPER

      Publicized:
    2020/03/06
      Vol:
    E103-D No:6
      Page(s):
    1226-1235

    High frame rate and ultra-low delay are the most essential requirements for building excellent human-machine-interaction systems. As a state-of-the-art local keypoint detection and feature extraction algorithm, A-KAZE shows high accuracy and robustness. Nonlinear scale space is one of the most important modules in A-KAZE, but it not only incurs a delay of at least one frame but also is not hardware friendly. This paper proposes a hardware-oriented nonlinear scale space for a high frame rate and ultra-low delay A-KAZE matching system. In the proposed matching system, one part of the nonlinear scale space is moved temporally forward and calculated in the previous frame (proposal #1), so that the processing delay is reduced to less than 1 ms. To recover the matching accuracy affected by proposal #1, pre-adjustment of nonlinear scale (proposal #2) is proposed. The previous two frames are used for motion estimation to predict the motion vector between the previous frame and the current frame. For further improvement of matching accuracy, pixel-level pre-adjustment (proposal #3) is proposed. The pre-adjustment changes from block level to pixel level, so that each pixel is assigned a unique motion vector. Experimental results show that the proposed matching system achieves an average matching accuracy higher than 95%, which is 5.88% higher than that of the existing high frame rate and ultra-low delay matching system. As for hardware performance, the proposed matching system processes VGA videos (640×480 pixels/frame) at a speed of 784 frames/second (fps) with a delay of 0.978 ms/frame.
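The abstract does not say how the motion vector is estimated from the previous two frames. As a rough illustration only, the sketch below assumes constant velocity: the displacement observed between the two previous frames is reused as the prediction for the step from the previous frame to the current one. The function names and the constant-velocity model are assumptions made for this example, not the authors' method.

```python
def predict_motion_vector(pos_two_back, pos_prev):
    """Predict the motion vector from the previous frame to the current frame.

    Assumption (not from the paper): constant velocity, so the displacement
    between the two previous frames is reused as the prediction.
    """
    dx = pos_prev[0] - pos_two_back[0]
    dy = pos_prev[1] - pos_two_back[1]
    return (dx, dy)


def pre_adjust(pos_prev, motion_vector):
    """Shift a position from the previous frame by the predicted motion vector."""
    return (pos_prev[0] + motion_vector[0], pos_prev[1] + motion_vector[1])


# A block (or, at pixel level, a single pixel) seen at (10, 20) and then (13, 18).
mv = predict_motion_vector((10, 20), (13, 18))
predicted = pre_adjust((13, 18), mv)
```

In a block-level variant one vector would be shared by all pixels of a block, while a pixel-level variant would call the same routine once per pixel.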

  • Temporal Constraints and Block Weighting Judgement Based High Frame Rate and Ultra-Low Delay Mismatch Removal System

    Songlin DU  Zhe WANG  Takeshi IKENAGA  

     
    PAPER

      Publicized:
    2020/03/18
      Vol:
    E103-D No:6
      Page(s):
    1236-1246

    High frame rate and ultra-low delay matching systems play an increasingly important role in human-machine interactions, because they guarantee high-quality experiences for users. Existing image matching algorithms always generate mismatches, which heavily weaken the performance of human-machine-interactive systems. Although many mismatch removal algorithms have been proposed, few of them achieve real-time speed with high frame rate and low delay, because of complicated arithmetic operations and iterations. This paper proposes a temporal constraints and block weighting judgement based high frame rate and ultra-low delay mismatch removal system. The proposed method uses two temporal constraints (proposal #1 and proposal #2) to first find some true matches, and then uses these true matches to generate block weighting (proposal #3). Proposal #1 finds some correct matches by checking a triangle route formed by three adjacent frames. Proposal #2 further reduces mismatch risk by adding one more round of matching in the opposite matching direction. Finally, proposal #3 classifies the unverified matches as correct or incorrect through the weighting of each block. Software experiments show that the proposed mismatch removal system achieves state-of-the-art accuracy in mismatch removal. Hardware experiments indicate that the designed image processing core successfully achieves real-time processing of 784 fps VGA (640×480 pixels/frame) video on a field programmable gate array (FPGA), with a delay of 0.858 ms/frame.
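The two temporal constraints can be pictured with a small sketch. It is one plausible reading of the abstract, not the paper's exact procedure: matches are assumed to be stored as dictionaries from keypoint index to keypoint index, a "triangle route" is taken to mean that going A to B to C must land on the same keypoint as going A to C directly, and the opposite-direction check is taken to be an ordinary forward/backward cross-check. All names are hypothetical.

```python
def triangle_consistent(m_ab, m_bc, m_ac):
    """Keep A->B matches whose route A->B->C agrees with the direct match A->C.

    Assumed reading of the triangle-route check over three adjacent frames.
    """
    verified = {}
    for a, b in m_ab.items():
        c = m_bc.get(b)
        if c is not None and m_ac.get(a) == c:
            verified[a] = b
    return verified


def cross_checked(forward, backward):
    """Keep forward matches a->b that the opposite-direction matching maps back to a."""
    return {a: b for a, b in forward.items() if backward.get(b) == a}


# Toy data: keypoint indices in three adjacent frames A, B, C.
m_ab = {0: 5, 1: 6, 2: 7}
m_bc = {5: 9, 6: 8, 7: 3}
m_ac = {0: 9, 1: 2}
print(triangle_consistent(m_ab, m_bc, m_ac))          # {0: 5}
print(cross_checked({0: 5, 1: 6}, {5: 0, 6: 2}))      # {0: 5}
```

Matches that pass neither check are not discarded here; in the abstract they remain "unverified" and are judged afterwards by the block weighting.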

  • AIGIF: Adaptively Integrated Gradient and Intensity Feature for Robust and Low-Dimensional Description of Local Keypoint

    Songlin DU  Takeshi IKENAGA  

     
    PAPER-Vision

      Vol:
    E100-A No:11
      Page(s):
    2275-2284

    Establishing local visual correspondences between images taken under different conditions is an important and challenging task in computer vision. A common solution for this task is detecting keypoints in images and then matching the keypoints with a feature descriptor. This paper proposes a robust and low-dimensional local feature descriptor named Adaptively Integrated Gradient and Intensity Feature (AIGIF). The proposed AIGIF descriptor partitions the support region surrounding each keypoint into sub-regions, and classifies the sub-regions into two categories: edge-dominated ones and smoothness-dominated ones. For edge-dominated sub-regions, gradient magnitude and orientation features are extracted; for smoothness-dominated sub-regions, an intensity feature is extracted. The gradient and intensity features are integrated to generate the descriptor. Experiments on image matching were conducted to evaluate the performance of the proposed AIGIF. Compared with SIFT, the proposed AIGIF achieves a 75% reduction of feature dimension (from 128 bytes to 32 bytes); compared with SURF, the proposed AIGIF achieves an 87.5% reduction of feature dimension (from 256 bytes to 32 bytes); compared with the state-of-the-art ORB descriptor, which has the same feature dimension as AIGIF, AIGIF achieves higher accuracy and robustness. In summary, AIGIF combines the advantages of gradient and intensity features, and achieves relatively high accuracy and robustness with a low feature dimension.
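The reduction percentages quoted in the abstract follow directly from the descriptor sizes it gives. The helper below is only an arithmetic check; its name is made up for this example.

```python
def reduction_percent(old_bytes, new_bytes):
    """Relative size reduction, in percent, when going from old_bytes to new_bytes."""
    return (old_bytes - new_bytes) / old_bytes * 100


print(reduction_percent(128, 32))  # SIFT (128 bytes) -> AIGIF (32 bytes): 75.0
print(reduction_percent(256, 32))  # SURF (256 bytes) -> AIGIF (32 bytes): 87.5
```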

  • Hierarchical Progressive Trust Model for Mismatch Removal under Both Rigid and Non-Rigid Transformations

    Songlin DU  Takeshi IKENAGA  

     
    PAPER-Image, Vision

      Vol:
    E101-A No:11
      Page(s):
    1786-1794

    Accurate visual correspondence is the foundation of many computer-vision-based applications. Since existing image matching algorithms inevitably generate mismatches, a reliable mismatch-removal algorithm is highly desired to remove mismatches and preserve true matches. This paper proposes a hierarchical progressive trust (HPT) model to solve this problem. The HPT model first adopts a “trust the most trustworthy ones” strategy to select anchor inliers in its bottom layer, and then progressively propagates the trust from the bottom layer to the other layers in a bottom-up way: 1) the bottom layer verifies anchor inliers with the guidance of local features; 2) the middle layers progressively estimate local transformations and perform local verifications; 3) the top layer estimates a global transformation with an anchor-inliers-guided expectation maximization (EM) algorithm and performs global verifications. Experimental results show that the proposed HPT model achieves higher performance than state-of-the-art mismatch-removal methods under both rigid transformations and non-rigid deformations.

  • Body Part Connection, Categorization and Occlusion Based Tracking with Correction by Temporal Positions for Volleyball Spike Height Analysis

    Xina CHENG  Ziken LI  Songlin DU  Takeshi IKENAGA  

     
    PAPER-Vision

      Vol:
    E103-A No:12
      Page(s):
    1503-1511

    The spike height of volleyball players is important in volleyball analysis as a quantitative criterion for evaluating players' motions. It not only provides rich information to audiences in live broadcasts of sports events but also helps to evaluate and improve the performance of players in strategy analysis and player training. In the volleyball game scene, the high similarity between hands, deformation, and occlusion are three main problems that degrade the acquisition performance of spike height. To solve these problems, this paper proposes a body part connection, categorization and occlusion based observation model and a temporal position based correction method. Firstly, skin pixel filter based connection detection solves the problem of high similarity between hands by judging whether a hand is connected to the spike player. Secondly, the body part categorization based observation uses the probability distribution map of the hand to determine the category of each body part, which solves the deformation problem. Thirdly, the occlusion part detection based observation eliminates the influence of views with occluded body parts by detecting the occluded views with a trained body part classifier. Finally, the temporal position based result correction combines the estimated result, which refers to the historical positions, and the posterior result to obtain an optimal result according to the degree of confidence. The experiments are based on videos of the final and semi-final games of the 2014 Japan Inter High School Men's Volleyball in Tokyo Metropolitan Gymnasium, which include 196 spike sequences of 4 teams. The experimental results show that the spike height is successfully detected in 93.37% of the test sequences, and that the average spike height error in these sequences is 5.96 cm.

  • Adaptive-Partial Template Update with Center-Shifting Recovery for High Frame Rate and Ultra-Low Delay Deformation Matching

    Songlin DU  Yuhao XU  Tingting HU  Takeshi IKENAGA  

     
    PAPER-Image

      Vol:
    E102-A No:12
      Page(s):
    1872-1881

    High frame rate and ultra-low delay matching systems play an important role in various human-machine interactive applications, which demand better performance in matching deformable and out-of-plane rotating objects. Although many algorithms have been proposed for deformation tracking and matching, few of them are suitable for hardware implementation due to complicated operations and large time consumption. This paper proposes a hardware-oriented template update and recovery method for a high frame rate and ultra-low delay deformation matching system. In the proposed method, the new template is generated in real time by partially updating the template descriptor and adding new keypoints simultaneously with the matching process in pixels (proposal #1), which avoids a large inter-frame delay. The size and shape of the region of interest (ROI) are made flexible, and the Hamming threshold used for brute-force matching is adjusted according to pixel position and the flexible ROI (proposal #2), which solves the problem of template drift. The template is recovered from the previous one with a relative center-shifting vector when it is judged as lost via a region-wise difference check (proposal #3). Evaluation results indicate that the proposed method successfully achieves real-time processing of 784 fps at a resolution of 640×480 on a field-programmable gate array (FPGA), with a delay of 0.808 ms/frame, and achieves satisfactory deformation matching results in comparison with other general methods.
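For readers unfamiliar with brute-force matching under a Hamming threshold, here is a minimal software sketch. It assumes binary descriptors packed into Python integers and a single fixed threshold; the abstract's position-dependent threshold adjustment and flexible ROI are not modeled, and all names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def brute_force_match(query, template, threshold):
    """For each query descriptor, find the nearest template descriptor.

    A match is kept only if its Hamming distance is at most `threshold`.
    Returns {query_index: (template_index, distance)}.
    """
    matches = {}
    for qi, q in enumerate(query):
        best_j, best_d = None, threshold + 1
        for tj, t in enumerate(template):
            d = hamming(q, t)
            if d < best_d:
                best_j, best_d = tj, d
        if best_j is not None:
            matches[qi] = (best_j, best_d)
    return matches


query = [0b11110000, 0b00001111]
template = [0b11110001, 0b10101010]
print(brute_force_match(query, template, threshold=2))  # {0: (0, 1)}
```

A position-dependent variant could pass a different `threshold` for each query keypoint; how the paper derives that value from pixel position and the ROI is not stated in the abstract.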