Author Search Result

[Author] Zhenglong YANG (6 hits)

1-6 hits
  • Loosening Bolts Detection of Bogie Box in Metro Vehicles Based on Deep Learning

    Weiwei QI  Shubin ZHENG  Liming LI  Zhenglong YANG  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2022/07/28  Vol: E105-D No:11  Page(s): 1990-1993

    Bolts in the bogie box of metro vehicles are fasteners that are critical to the bogie box structure. Detecting loosened bolts at an early stage can prevent bolt loss and accidents. Recently, machine-vision-based methods have been developed for detecting bolt loosening. However, traditional image processing and machine learning methods suffer from high miss rates and false detection rates because the bolts are small and the background is complex. To address this problem, a loosened-bolt detection method based on deep learning is proposed. The proposed method cascades two stages in a coarse-to-fine manner: a localization stage, in which the Single Shot Multibox Detector (SSD) and an improved SSD sequentially localize the bogie box and the bolts, and a semantic segmentation stage, in which the U-shaped Network (U-Net) detects the looseness of the bolts. The accuracy and effectiveness of the proposed method are verified with images captured from Shanghai Metro Line 9. The results show that the proposed method detects bolt loosening with higher accuracy, which can guarantee the stable operation of metro vehicles.

  • Lookahead Search-Based Low-Complexity Multi-Type Tree Pruning Method for Versatile Video Coding (VVC) Intra Coding

    Qi TENG  Guowei TENG  Xiang LI  Ran MA  Ping AN  Zhenglong YANG  

     
    PAPER-Coding Theory

    Publicized: 2022/08/24  Vol: E106-A No:3  Page(s): 606-615

    The latest versatile video coding (VVC) introduces some novel techniques such as quadtree with nested multi-type tree (QTMT), multiple transform selection (MTS) and multiple reference line (MRL). These tools improve compression efficiency compared with the previous standard H.265/HEVC, but they suffer from very high computational complexity. One of the most time-consuming parts of VVC intra coding is the coding tree unit (CTU) structure decision. In this paper, we propose a low-complexity multi-type tree (MT) pruning method for VVC intra coding. This method consists of lookahead search and MT pruning. The lookahead search process is performed to derive the approximate rate-distortion (RD) cost of each MT node at depth 2 or 3. Subsequently, the improbable MT nodes are pruned by different strategies under different cost errors. These strategies are designed according to the priority of the node. Experimental results show that the overall proposed algorithm can achieve 47.15% time saving with only 0.93% Bjøntegaard delta bit rate (BDBR) increase over natural scene sequences, and 45.39% time saving with 1.55% BDBR increase over screen content sequences, compared with the VVC reference software VTM 10.0. Such results demonstrate that our method achieves a good trade-off between computational complexity and compression quality compared to recent methods.
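
    The pruning step can be pictured with a minimal sketch, assuming each candidate MT split already has an approximate RD cost from the lookahead search. The split labels, the cost values, and the single relative tolerance are hypothetical; the paper applies different strategies depending on cost error and node priority, which this sketch does not reproduce.

```python
def prune_mt_candidates(approx_costs: dict[str, float], tolerance: float = 0.10) -> list[str]:
    """Keep splits whose approximate RD cost is within `tolerance` of the best one."""
    if not approx_costs:
        return []
    best = min(approx_costs.values())
    limit = best * (1.0 + tolerance)
    # Drop improbable splits, then order survivors from cheapest to most expensive.
    kept = [name for name, cost in approx_costs.items() if cost <= limit]
    return sorted(kept, key=approx_costs.__getitem__)


# Hypothetical approximate RD costs for binary (BT) and ternary (TT) splits.
costs = {"BT_H": 1040.0, "BT_V": 1000.0, "TT_H": 1250.0, "TT_V": 1090.0}
print(prune_mt_candidates(costs))
```

    Only the surviving splits would then go through the full RD evaluation, which is where the time saving in such a scheme would come from.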

  • A CNN-Based Optimal CTU λ Decision for HEVC Intra Rate Control

    Lili WEI  Zhenglong YANG  Zhenming WANG  Guozhong WANG  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2021/07/19  Vol: E104-D No:10  Page(s): 1766-1769

    Since HEVC intra rate control has no prior information to rely on for coding, it is difficult to obtain the optimal λ for every coding tree unit (CTU). In this paper, a convolutional neural network (CNN) based intra rate control is proposed. Firstly, a CNN with two output channels in its last layer is used to predict the key parameters of the CTU R-λ curve. To train the CNN well, a combined loss function is built and the balance factor γ is explored to achieve the minimum loss. Secondly, the initial CTU λ is calculated from the predicted results of the CNN and the allocated bits per pixel (bpp). According to the rate distortion optimization (RDO) of a frame, a spatial equation is derived between the CTU λ and the frame λ. Lastly, a CTU clipping function is used to obtain the optimal CTU λ for the intra rate control. The experimental results show that the proposed algorithm improves the intra rate control performance significantly with a good rate control accuracy.
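
    As a rough illustration of the second and last steps, the sketch below assumes the widely used hyperbolic R-λ model, λ = α · bpp^β, with α and β standing in for the two CNN outputs. The function names, the sample values, and the clipping range around the frame λ are hypothetical and are not taken from the paper.

```python
def ctu_lambda(alpha: float, beta: float, bpp: float) -> float:
    """Initial CTU lambda from the R-lambda model: lambda = alpha * bpp ** beta."""
    if bpp <= 0:
        raise ValueError("bpp must be positive")
    return alpha * bpp ** beta


def clip_lambda(lam: float, frame_lambda: float, ratio: float = 2.0) -> float:
    """Keep the CTU lambda within [frame_lambda / ratio, frame_lambda * ratio].

    The ratio is an illustrative assumption, not the paper's clipping rule.
    """
    low, high = frame_lambda / ratio, frame_lambda * ratio
    return max(low, min(high, lam))


# Hypothetical values standing in for the CNN outputs and the bit allocation.
alpha, beta = 3.2, -1.367
initial = ctu_lambda(alpha, beta, bpp=0.05)
final = clip_lambda(initial, frame_lambda=60.0)
print(f"initial={initial:.2f} clipped={final:.2f}")
```

    Because β is negative, a smaller bit budget per pixel yields a larger λ, and the clipping keeps any single CTU from drifting too far from the frame-level value.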

  • Temporal Domain Difference Based Secondary Background Modeling Algorithm

    Guowei TENG  Hao LI  Zhenglong YANG  

     
    LETTER-Communication Theory and Signals

    Vol: E103-A No:2  Page(s): 571-575

    This paper proposes a temporal domain difference based secondary background modeling algorithm for surveillance video coding. The proposed algorithm has three key technical contributions, as follows. Firstly, the LDBCBR (Long Distance Block Composed Background Reference) algorithm is proposed, which exploits IBBS (interval of background blocks searching) to weaken the temporal correlation of the foreground. Secondly, both BCBR (Block Composed Background Reference) and LDBCBR are exploited at the same time to generate the temporary background reference frame. The secondary modeling algorithm utilizes the temporary background blocks generated by BCBR and LDBCBR to get the final background frame. Thirdly, monitoring the background reference frame after it is generated is also important. We update background blocks immediately when they change substantially, shorten the modeling period of areas where the foreground moves frequently, and check the stable background regularly. The proposed algorithm is implemented on the IEEE 1857 platform, and the experimental results demonstrate a significant improvement in coding efficiency. On the surveillance test sequences recommended by the China AVS (Advanced Audio Video Standard) working group, our method achieves BD-Rate gains of 6.81% and 27.30% compared with BCBR and the baseline profile, respectively.
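
    The idea of using temporal differences to find background blocks can be sketched as follows. This is a generic illustration, not the BCBR or LDBCBR procedure: the block size, the threshold, and the function names are hypothetical, and a block is simply treated as a background candidate when its mean absolute difference from the co-located block in an earlier frame is small.

```python
Frame = list[list[int]]  # grayscale frame as rows of pixel values


def block_is_static(cur: Frame, ref: Frame, top: int, left: int,
                    size: int = 4, threshold: float = 2.0) -> bool:
    """True when the mean absolute temporal difference of a block is small."""
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            total += abs(cur[y][x] - ref[y][x])
    return total / (size * size) < threshold


def static_block_map(cur: Frame, ref: Frame, size: int = 4,
                     threshold: float = 2.0) -> list[list[bool]]:
    """Classify every size x size block as static (background candidate) or not."""
    rows, cols = len(cur) // size, len(cur[0]) // size
    return [[block_is_static(cur, ref, r * size, c * size, size, threshold)
             for c in range(cols)] for r in range(rows)]


# Two hypothetical 4x8 frames: the right-hand block changes, the left does not.
ref = [[10] * 8 for _ in range(4)]
cur = [row[:4] + [90] * 4 for row in ref]
print(static_block_map(cur, ref))
```

    Comparing against a frame that lies further back in time, rather than the immediately preceding one, makes slowly moving foreground less likely to be mistaken for background.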

  • A Foreground-Background-Based CTU λ Decision Algorithm for HEVC Rate Control of Surveillance Videos

    Zhenglong YANG  Guozhong WANG  Guowei TENG  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2018/12/18  Vol: E102-D No:3  Page(s): 670-674

    Although HEVC rate control can achieve high coding efficiency, it still does not fully utilize the special characteristics of surveillance videos, which typically have a moving foreground and relatively static background. For surveillance videos, it is usually necessary to provide a better coding quality of the moving foreground. In this paper, a foreground-background CTU λ separate decision scheme is proposed. First, low-complexity pixel-based segmentation is presented to obtain the foreground and the background. Second, the rate distortion (RD) characteristics of the foreground and the background are explored. With the rate distortion optimization (RDO) process, the average CTU λ value of the foreground or the background should be equal to the frame λ. Then, a separate optimal CTU λ decision is proposed with a separate λ clipping method. Finally, a separate updating process is used to obtain reasonable parameters for the foreground and the background. The experimental results show that the quality of the foreground is improved by 0.30 dB in the random access configuration and 0.45 dB in the low delay configuration without degradation of either the rate control accuracy or whole frame quality.

  • A Weighted Forward-Backward Spatial Smoothing DOA Estimation Algorithm Based on TLS-ESPRIT

    Manlin XIAO  Zhibo DUAN  Zhenglong YANG  

     
    LETTER-Fundamentals of Information Systems

    Publicized: 2021/03/16  Vol: E104-D No:6  Page(s): 881-884

    Based on the TLS-ESPRIT algorithm, this paper proposes a weighted spatial smoothing DOA estimation algorithm to address the problem that the conventional TLS-ESPRIT algorithm fails to estimate the direction of arrival (DOA) when the sources are coherent. The proposed method divides the received signal array into several subarrays with a special structural feature. Then, using these subarrays, it constructs a new weighted covariance matrix to estimate the DOA based on TLS-ESPRIT. The proposed algorithm makes full use of the auto-correlation and cross-correlation information of the subarrays, improving the orthogonality between the signal subspace and the noise subspace so that the DOA of coherent sources can be estimated accurately. The simulations show that the proposed algorithm is superior to the conventional spatial smoothing algorithms under different signal-to-noise ratios (SNRs) and snapshot numbers with coherent sources.
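
    For context, the sketch below shows conventional, unweighted forward-backward spatial smoothing, the kind of baseline the abstract compares against. It is not the paper's weighted covariance construction. The function name and the sample matrix are hypothetical, and pure Python lists are used only to keep the example self-contained.

```python
Matrix = list[list[complex]]


def fb_spatial_smoothing(R: Matrix, sub_size: int) -> Matrix:
    """Average the forward and backward subarray covariance blocks of R."""
    n = len(R)
    count = n - sub_size + 1  # number of overlapping subarrays
    out = [[0j] * sub_size for _ in range(sub_size)]
    for start in range(count):
        for i in range(sub_size):
            for j in range(sub_size):
                forward = R[start + i][start + j]
                # Backward term J * conj(R_l) * J: flip both indices, then conjugate.
                backward = R[start + sub_size - 1 - i][start + sub_size - 1 - j].conjugate()
                out[i][j] += (forward + backward) / (2 * count)
    return out


# Hypothetical 3-element array covariance (Hermitian), smoothed with 2-element subarrays.
R = [[2 + 0j, 1 + 1j, 0j],
     [1 - 1j, 2 + 0j, 1 + 1j],
     [0j, 1 - 1j, 2 + 0j]]
smoothed = fb_spatial_smoothing(R, sub_size=2)
print(smoothed)
```

    Averaging over subarrays is what restores the rank of the source covariance when sources are coherent, at the cost of a smaller effective aperture; a subspace method such as TLS-ESPRIT would then be applied to the smoothed matrix.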