Author Search Result

[Author] Xudong ZHANG (4 hits)

1-4 of 4 hits
  • Attention-Guided Region Proposal Network for Pedestrian Detection

    Rui SUN  Huihui WANG  Jun ZHANG  Xudong ZHANG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/07/08
      Vol:
    E102-D No:10
      Page(s):
    2072-2076

    Pedestrian detection, an active and challenging research topic in computer vision, has been widely used in intelligent driving and traffic monitoring. The currently popular detection approach uses a region proposal network (RPN) to generate candidate regions and then classifies those regions. However, the RPN produces many erroneous candidate regions, which increases the number of false-positive region proposals. This letter uses an improved residual attention network to capture the visual attention map of an image, which is then normalized to obtain the attention score map. The attention score map is used to guide the RPN to generate more precise candidate regions containing potential target objects. The region proposals, confidence scores, and features generated by the RPN are used to train a cascaded boosted forest classifier to obtain the final results. The experimental results show that the proposed approach achieves highly competitive results on the Caltech and ETH datasets.

  • An Error Detection Method Based on Coded Block Pattern Information Verification for Wireless Video Communication

    Yu CHEN  XuDong ZHANG  DeSheng WANG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E89-B No:2
      Page(s):
    629-632

    A novel error detection method based on coded block pattern (CBP) information verification is proposed for error concealment of inter-coded video frames transmitted over wireless channels. The method first modifies the original video stream structure by aggregating certain important information, and then inserts error verification bits into the video stream for each encoded macroblock (MB); these bits serve as reference information to determine whether each encoded MB is corrupted. Experimental results on a simulated additive white Gaussian noise wireless channel with the H.263+ baseline codec show that the proposed method outperforms other reference approaches in error detection performance. In addition, it preserves the original video quality with only a small increase in coding overhead.

  • Foveation Based Error Resilience Optimization for H.264 Intra Coded Frame in Wireless Communication

    Yu CHEN  XuDong ZHANG  DeSheng WANG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E89-B No:2
      Page(s):
    633-636

    Based on the observation that foveation analysis can be used to find the most critical content in video and images in terms of human visual perception, an effective error resilience method is proposed for robust transmission of H.264 intra-coded frames over wireless channels. The method first exploits the results of foveation analysis to find the foveated area in a picture, and then considers the results of pre-error concealment effect analysis to search for the center of the foveation macroblocks (MBs) in the foveated area. Finally, a new error-resilient alignment order and a new coding order of MBs are proposed for use in the video encoder. Extensive experimental results on different portrait video sequences over a random bit error wireless channel demonstrate that the proposed method achieves better subjective and objective results than the original JM 8.2 H.264 video codec, with little effect on coding rate and image quality.

  • Triplet Attention Network for Video-Based Person Re-Identification

    Rui SUN  Qili LIANG  Zi YANG  Zhenghui ZHAO  Xudong ZHANG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2021/07/21
      Vol:
    E104-D No:10
      Page(s):
    1775-1779

    Video-based person re-identification (re-ID) aims to retrieve a person across non-overlapping cameras and has achieved promising results owing to deep convolutional neural networks. Due to the dynamic properties of video, the problems of background clutter and occlusion are more serious than in image-based person re-ID. In this letter, we present a novel triple attention network (TriANet) that simultaneously utilizes temporal, spatial, and channel context information by employing the self-attention mechanism to obtain robust and discriminative features. Specifically, the network has two parts. The first part introduces a residual attention subnetwork, which contains a channel attention module that captures cross-dimension dependencies by using rotation and transformation, and a spatial attention module that focuses on pedestrian features. In the second part, a time attention module is designed to judge the quality score of each pedestrian image and to reduce the weight of incomplete pedestrian images, alleviating the occlusion problem. We evaluate the proposed architecture on three datasets: iLIDS-VID, PRID2011, and MARS. Extensive comparative experimental results show that the proposed method achieves state-of-the-art results.
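
The last abstract above describes a time attention module that scores each pedestrian frame and down-weights incomplete (occluded) ones. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes per-frame feature vectors and per-frame quality scores are already available, and it assumes the scores are turned into weights with a softmax. The function name `temporal_attention_pool` and the example values are hypothetical.

```python
import math


def temporal_attention_pool(frame_features, quality_scores):
    """Pool per-frame features into one sequence-level feature.

    Assumption (not from the abstract): quality scores are converted to
    weights with a softmax, so low-quality frames contribute less.
    Returns (pooled_feature, weights).
    """
    if not frame_features or len(frame_features) != len(quality_scores):
        raise ValueError("need one quality score per frame, and at least one frame")

    # Numerically stable softmax over the quality scores.
    max_score = max(quality_scores)
    exps = [math.exp(s - max_score) for s in quality_scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frame features, dimension by dimension.
    dim = len(frame_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, frame_features))
        for d in range(dim)
    ]
    return pooled, weights


# Hypothetical example: the third frame is heavily occluded (low score).
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
scores = [2.0, 2.0, -1.0]
pooled, weights = temporal_attention_pool(features, scores)
print(weights)
print(pooled)
```

With these hypothetical inputs, the occluded frame receives a much smaller weight than the two clean frames, so the pooled feature stays close to the clean frames' features.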