Author Search Result

[Author] Zongliang GAN(5hit)

1-5hit
  • Wavelet Pyramid Based Multi-Resolution Bilateral Motion Estimation for Frame Rate Up-Conversion

    Ran LI  Hongbing LIU  Jie CHEN  Zongliang GAN  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2015/06/03
      Vol:
    E99-D No:1
      Page(s):
    208-218

    Conventional bilateral motion estimation (BME) for motion-compensated frame rate up-conversion (MC-FRUC) avoids the problem of overlapped areas and holes but usually produces many inaccurate motion vectors (MVs), because 1) the MV of an object between the previous and following frames is likely to have no temporal symmetry with respect to the target block of the interpolated frame, and 2) repetitive patterns in the video frame cause mismatches because the interpolated block itself is unavailable. In this paper, a new BME algorithm with low computational complexity is proposed to resolve these problems. The proposed algorithm incorporates multi-resolution search into BME, since it can easily use the MV consistency between two adjacent pyramid levels and spatially neighboring MVs to correct the inaccurate MVs resulting from the lack of temporal symmetry while keeping the computational cost low. In addition, the multi-resolution search uses the fast wavelet transform to construct the wavelet pyramid, which not only keeps the computational complexity low but also preserves the high-frequency components of the image at each level during sub-sampling. The high-frequency components are used to regularize the traditional block matching criterion, reducing the probability of mismatch in BME. Experiments show that the proposed algorithm significantly improves both the objective and subjective quality of the interpolated frame with low computational complexity and outperforms existing BME algorithms.
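
    As a rough illustration of the bilateral matching idea described in this abstract, the following sketch runs a plain full search with a sum of absolute differences (SAD) cost. It is not the wavelet-pyramid algorithm of the paper: the function names, the toy frames, and the full-search strategy are all assumptions made for the example.

```python
def bilateral_sad(prev, nxt, y, x, dy, dx, size):
    """SAD between the block at (y-dy, x-dx) in prev and (y+dy, x+dx) in nxt."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(prev[y - dy + i][x - dx + j] - nxt[y + dy + i][x + dx + j])
    return total


def bilateral_search(prev, nxt, y, x, size, radius):
    """Full search for the symmetric MV of the interpolated block at (y, x)."""
    h, w = len(prev), len(prev[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates whose blocks would fall outside either frame.
            if min(y - dy, y + dy) < 0 or max(y - dy, y + dy) + size > h:
                continue
            if min(x - dx, x + dx) < 0 or max(x - dx, x + dx) + size > w:
                continue
            cost = bilateral_sad(prev, nxt, y, x, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


# Hypothetical toy frames: the whole scene shifts 2 pixels to the right
# between prev and nxt, so the symmetric MV for the middle frame is (0, 1).
prev = [[16 * i + j * j for j in range(10)] for i in range(8)]
nxt = [[16 * i + (j - 2) ** 2 for j in range(10)] for i in range(8)]

mv, cost = bilateral_search(prev, nxt, y=3, x=4, size=2, radius=2)
print((mv, cost))  # prints ((0, 1), 0)
```

    The toy texture is chosen so that only one candidate gives a zero cost. On real video with repetitive patterns, several candidates can score equally well, which is the mismatch problem the abstract refers to.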

  • Collective Activity Recognition by Attribute-Based Spatio-Temporal Descriptor

    Changhong CHEN  Hehe DOU  Zongliang GAN  

     
    LETTER-Pattern Recognition

      Publicized:
    2015/07/22
      Vol:
    E98-D No:10
      Page(s):
    1875-1878

    Collective activity recognition plays an important role in high-level video analysis. Most current feature representations rely on contextual information extracted from the behaviour of nearby people, so every person must be detected and their pose estimated. After feature extraction, hierarchical graphical models are always employed to model the spatio-temporal patterns of individuals and their interactions, so complex preprocessing and inference operations cannot be avoided. To overcome these drawbacks, we present a new feature representation method, called the attribute-based spatio-temporal (AST) descriptor. First, two types of information, spatio-temporal (ST) features and attribute features, are exploited. Attribute-based features are manually specified. An attribute classifier is trained to model the relationship between the ST features and the attribute-based features, according to which the attribute features are refreshed. Then, the ST features, the attribute features, and the relationship between the attributes are combined to form the AST descriptor. An objective classifier can be specified on the AST descriptor, and the weight parameters of the classifier are used for recognition. Experiments on standard collective activity benchmark sets show the effectiveness of the proposed descriptor.

  • Topic-Based Knowledge Transfer Algorithm for Cross-View Action Recognition

    Changhong CHEN  Shunqing YANG  Zongliang GAN  

     
    LETTER-Pattern Recognition

      Vol:
    E97-D No:3
      Page(s):
    614-617

    Cross-view action recognition is a challenging research field in human motion analysis. Appearance-based features are not reliable when the viewpoint changes. In this paper, a new framework is proposed for cross-view action recognition by topic-based knowledge transfer. First, spatio-temporal descriptors are extracted from the action videos, and each video is modeled by a bag of visual words (BoVW) based on a codebook constructed by the k-means clustering algorithm. Second, Latent Dirichlet Allocation (LDA) is employed to assign topics for the BoVW representation. The topic distribution of visual words (ToVW) is normalized and taken as the feature vector. Third, in order to bridge different views, we transform ToVW into bilingual ToVW by constructing bilingual dictionaries, which guarantee that the same action has the same representation from different views. We demonstrate the effectiveness of the proposed algorithm on the IXMAS multi-view dataset.
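
    The bag-of-visual-words step mentioned in this abstract can be sketched as follows. This is a minimal illustration only: it assumes a codebook is already available (the paper builds one with k-means), it uses squared Euclidean distance for word assignment, and it does not cover the LDA or bilingual dictionary stages. The function names and sample values are hypothetical.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def bovw_histogram(descriptors, codebook):
    """Normalized histogram of visual-word counts for one video."""
    counts = [0] * len(codebook)
    for desc in descriptors:
        counts[nearest_word(desc, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Hypothetical 2-D descriptors and a two-word codebook.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (2.0, 2.0), (9.0, 9.0)]

histogram = bovw_histogram(descriptors, codebook)
print(histogram)  # prints [0.75, 0.25]
```

    In a fuller pipeline, the per-video word counts would be the input to the topic model rather than the final feature.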

  • Low Bit-Rate Compression Image Restoration through Subspace Joint Regression Learning

    Zongliang GAN  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2018/06/28
      Vol:
    E101-D No:10
      Page(s):
    2539-2542

    In this letter, an effective low bit-rate image restoration method is proposed, in which image denoising and subspace regression learning are combined. The proposed framework has two parts: image main structure estimation by classical NLM denoising and texture component prediction by subspace joint regression learning. The local regression functions are learned from the denoised patch to the original patch in each subspace, where the corresponding compressed image patches are employed to generate anchoring points by the dictionary learning approach. Moreover, we extend Extreme Support Vector Regression (ESVR) to multi-variable nonlinear regression to obtain more robust results. Experimental results demonstrate that the proposed method achieves favorable performance compared with other leading methods.

  • Low Complexity Image/Video Super Resolution Using Edge and Nonlocal Self-Similarity Constraint

    Zongliang GAN  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E96-D No:7
      Page(s):
    1569-1572

    In this letter, we present a fast image/video super resolution framework using edge and nonlocal constraints. The proposed method has three steps. First, we improve the initial estimation using content-adaptive bilateral filtering to strengthen edges. Second, the high resolution image is estimated using the classical back projection method. Third, we use joint content-adaptive nonlocal means filtering to get the final result, where the self-similarity structures are obtained from the low resolution image. Furthermore, content-adaptive filtering and a fast self-similarity search strategy effectively reduce computational complexity. The experimental results show that the proposed method has good performance with low complexity and can be used in real-time environments.
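
    The back projection step in the second stage can be illustrated with a one-dimensional toy version. The sketch below assumes a degradation model of pairwise averaging and an upsampler based on sample replication. Both are simplifications chosen for the example and are not taken from the letter, and the bilateral and nonlocal means filtering stages are not shown.

```python
def downsample(signal):
    """Assumed degradation model: average each pair of neighboring samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Assumed back-projection kernel: repeat every sample twice."""
    return [value for value in signal for _ in range(2)]


def back_project(low_res, initial_high_res, iterations=1, step=1.0):
    """Refine a high-resolution estimate so that it re-degrades to low_res."""
    high_res = list(initial_high_res)
    for _ in range(iterations):
        simulated = downsample(high_res)
        error = [l - s for l, s in zip(low_res, simulated)]
        correction = upsample(error)
        high_res = [h + step * c for h, c in zip(high_res, correction)]
    return high_res


# Hypothetical signals: the initial estimate has an even length and does not
# yet agree with the observed low-resolution samples.
low_res = [1.0, 3.0]
initial = [0.0, 2.0, 2.0, 6.0]

refined = back_project(low_res, initial)
print(refined)              # prints [0.0, 2.0, 1.0, 5.0]
print(downsample(refined))  # prints [1.0, 3.0]
```

    With this particular pair of operators, one iteration is enough to make the estimate consistent with the observation. With blur kernels that are not exact inverses of each other, several iterations and a smaller step are typically needed.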