Author Search Result

[Author] Yao ZHAO(13hit)

1-13hit
  • Saccade Information Based Directional Heat Map Generation for Gaze Data Visualization

    Yinwei ZHAN  Yaodong LI  Zhuo YANG  Yao ZHAO  Huaiyu WU  

     
    LETTER-Computer Graphics

      Publicized:
    2019/05/15
      Vol:
    E102-D No:8
      Page(s):
    1602-1605

    The heat map is an important tool for eye tracking data analysis and visualization. It expresses the areas watched by an observer intuitively, but ignores the saccade information that expresses gaze shifts. Building on the conventional heat map generation method, this paper presents a novel heat map generation method for eye tracking data. The proposed method introduces a mixed data structure of fixation points and saccades, and applies heat map deformation to saccade-type data. Its advantage is that it indicates the direction of gaze transitions while visualizing the gaze region.
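
As a point of reference for the conventional method mentioned in the abstract above, the following is a minimal sketch of a fixation-only heat map, assuming one isotropic Gaussian per fixation weighted by its duration. The function name, the `sigma` value, and the sample fixations are hypothetical, and the saccade-based deformation proposed in the paper is not reproduced here.

```python
import math

def fixation_heat_map(fixations, width, height, sigma=2.0):
    """Accumulate one isotropic Gaussian per fixation, weighted by its duration."""
    heat = [[0.0] * width for _ in range(height)]
    for fx, fy, duration in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                heat[y][x] += duration * math.exp(-d2 / (2.0 * sigma ** 2))
    return heat

# Hypothetical fixations: (x, y, duration in seconds).
fixations = [(5, 5, 0.4), (14, 8, 0.2)]
heat = fixation_heat_map(fixations, width=20, height=12)

# Hottest cell as (value, x, y); it should coincide with the longest fixation.
peak = max((v, x, y) for y, row in enumerate(heat) for x, v in enumerate(row))
```

A map built this way shows where the observer looked but carries no direction, which is the gap the abstract describes.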

  • Compatible Stereo Video Coding with Adaptive Prediction Structure

    Lili MENG  Yao ZHAO  Anhong WANG  Jeng-Shyang PAN  Huihui BAI  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E94-D No:7
      Page(s):
    1506-1509

    A stereo video coding scheme that is compatible with monoview processors is presented in this paper. In addition, this paper proposes an adaptive prediction structure that applies different prediction modes to different groups of pictures (GOPs) according to their temporal and inter-view correlations, in order to improve coding efficiency. Moreover, the most advanced video coding standard, H.264, is used to maximize coding efficiency. Finally, the effectiveness of the proposed scheme is verified by extensive experimental results.

  • MR-MIL: Manifold Ranking Based Multiple-Instance Learning for Automatic Image Annotation

    Yufeng ZHAO  Yao ZHAO  Zhenfeng ZHU  Jeng-Shyang PAN  

     
    LETTER-Image

      Vol:
    E91-A No:10
      Page(s):
    3088-3089

    A novel automatic image annotation (AIA) scheme is proposed based on multiple-instance learning (MIL). For a given concept, manifold ranking (MR) is first applied to MIL (referred to as MR-MIL) to effectively mine the positive instances (i.e. regions in images) embedded in the positive bags (i.e. images). With the mined positive instances, the semantic model of the concept is built from the probabilistic output of an SVM classifier. The experimental results reveal that high annotation accuracy can be achieved at the region level.

  • Security Consideration for Deep Learning-Based Image Forensics

    Wei ZHAO  Pengpeng YANG  Rongrong NI  Yao ZHAO  Haorui WU  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2018/08/24
      Vol:
    E101-D No:12
      Page(s):
    3263-3266

    Recently, the image forensics community has paid attention to the design of effective algorithms based on deep learning. Results have shown that combining domain knowledge of image forensics with deep learning achieves more robust and better performance than traditional schemes. Instead of improving algorithm performance, this paper considers the security of deep learning-based methods in the field of image forensics. To the best of our knowledge, this is the first work focusing on this topic. Specifically, we find experimentally that a deep learning-based method fails when slight noise is added to the images (adversarial images). Furthermore, two strategies are proposed to strengthen the security of deep learning-based methods. First, a penalty term is added to the loss function, namely the 2-norm of the gradient of the loss with respect to the input images. Second, a novel training method is adopted that trains the model on a mix of normal and adversarial images. Experimental results show that the proposed algorithm achieves good performance even on adversarial images and provides a security consideration for deep learning-based image forensics.
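
The penalty term described in the abstract above can be illustrated on a toy model. This is a minimal sketch, assuming a single-sample logistic regression in place of a deep network so that the gradient of the loss with respect to the input has a closed form. The weight `lam`, the parameters, and the sample are hypothetical and do not come from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def penalized_loss(w, b, x, y, lam=0.1):
    """Cross-entropy loss plus lam * ||dL/dx||_2 for one sample of a logistic model."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # For logistic regression, dL/dx = (p - y) * w in closed form.
    grad_x = [(p - y) * wi for wi in w]
    grad_norm = math.sqrt(sum(g * g for g in grad_x))
    return loss + lam * grad_norm, loss, grad_norm

w, b = [0.8, -0.5, 0.3], 0.1   # hypothetical model parameters
x, y = [1.0, 2.0, -1.0], 1     # hypothetical input and label
total, loss, grad_norm = penalized_loss(w, b, x, y)
```

Minimizing `total` rather than `loss` favors parameters for which a small change in the input changes the loss only slightly, which is the intuition behind penalizing the input gradient.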

  • Commercial Shot Classification Based on Multiple Features Combination

    Nan LIU  Yao ZHAO  Zhenfeng ZHU  Rongrong NI  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E93-D No:9
      Page(s):
    2651-2655

    This paper presents a commercial shot classification scheme combining well-designed visual and textual features to automatically detect TV commercials. To identify the inherent difference between commercials and general programs, a special mid-level textual descriptor is proposed, aiming to capture the spatio-temporal properties of the video texts typical of commercials. In addition, we introduce an ensemble-learning based combination method, named Co-AdaBoost, to interactively exploit the intrinsic relations between the visual and textual features employed.

  • Multiple Description Video Coding Using Inter- and Intra-Description Correlation at Macro Block Level

    Huihui BAI  Mengmeng ZHANG  Anhong WANG  Meiqin LIU  Yao ZHAO  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E97-D No:2
      Page(s):
    384-387

    A novel standard-compliant multiple description (MD) video codec is proposed in this paper, which aims to achieve effective redundancy allocation using inter- and intra-description correlation. The inter-description correlation at the macro block (MB) level is used to produce side information in different modes, which helps improve side decoding quality. Furthermore, the intra-description correlation at the MB level is exploited to design an adaptive skip mode for higher compression efficiency. The experimental results show better rate versus side and central distortion performance than other relevant MDC schemes.

  • Just Noticeable Difference Based Fast Coding Unit Partition in HEVC Intra Coding

    Meng ZHANG  Huihui BAI  Meiqin LIU  Anhong WANG  Mengmeng ZHANG  Yao ZHAO  

     
    LETTER-Image

      Vol:
    E97-A No:12
      Page(s):
    2680-2683

    As an ongoing video compression standard, High Efficiency Video Coding (HEVC) has achieved better rate distortion performance than H.264, but it also leads to enormous encoding complexity. In this paper, we propose a novel fast coding unit partition algorithm for the intra prediction of HEVC. First, instead of the time-consuming rate distortion optimization for coding mode decision, just-noticeable-difference (JND) values are exploited to partition the coding unit according to human visual system characteristics. Furthermore, the coding bits in HEVC are used as auxiliary information to refine the partition results. Compared with the HEVC test model HM10.1, the experimental results show that the fast intra mode decision algorithm saves over 28% of encoding time on average with comparable rate distortion performance.

  • Edge-Based Adaptive Sampling for Image Block Compressive Sensing

    Lijing MA  Huihui BAI  Mengmeng ZHANG  Yao ZHAO  

     
    LETTER-Image

      Vol:
    E99-A No:11
      Page(s):
    2095-2098

    In this paper, a novel adaptive sampling scheme for block compressive sensing is proposed for natural images. In view of the image content, the edge proportion in a block can be used to represent its sparsity. Furthermore, the sampling rate can be adaptively allocated according to the edge proportion for better compressive sensing recovery. Because an image contains many blocks, recording the measurement ratio of each block may incur an overhead cost. Therefore, the K-means method is applied to classify the blocks into clusters, and one measurement ratio is allocated to each cluster. In addition, we design an iterative termination condition to reduce the time consumed by the iterations of compressive sensing recovery. The experimental results show that, compared with the corresponding methods, the proposed scheme obtains a better reconstructed image at the same sampling rate.
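
The clustering step in the abstract above can be sketched as follows. This is an illustration only, assuming a plain one-dimensional K-means over per-block edge proportions. The sample values, the three rates, and the rule that edge-rich clusters receive higher rates are hypothetical choices, not details taken from the paper.

```python
def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's algorithm on scalars; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

# Hypothetical edge proportion of each image block.
edge_proportion = [0.02, 0.03, 0.05, 0.30, 0.33, 0.35, 0.70, 0.75]
centroids, labels = kmeans_1d(edge_proportion, k=3)

# Assumed rule: rank clusters by centroid and give edge-rich clusters a higher rate.
rates = [0.1, 0.25, 0.4]
order = sorted(range(3), key=lambda c: centroids[c])
rate_of_cluster = {c: rates[rank] for rank, c in enumerate(order)}
block_rates = [rate_of_cluster[lab] for lab in labels]
```

Only the cluster label of each block and one rate per cluster need to be recorded, instead of one rate per block.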

  • Robust Multi-Bit Watermarking for Free-View Television Using Light Field Rendering

    Huawei TIAN  Yao ZHAO  Zheng WANG  Rongrong NI  Lunming QIN  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E96-D No:12
      Page(s):
    2820-2829

    With the rapid development of multi-view video coding (MVC) and light field rendering (LFR), Free-View Television (FTV) has emerged as new entertainment equipment that gives TV viewers a more immersive and realistic experience. In an FTV broadcasting system, the viewer can freely watch a realistic arbitrary view of a scene generated from a number of original views. In such a scenario, the ownership of the multi-view video should be verified not only on the original views, but also on any virtual view. However, the capacity of existing watermarking schemes used as copyright protection methods for LFR-based FTV is only one bit, i.e., the presence or absence of the watermark, which seriously limits their use in practical scenarios. In this paper, we propose a robust multi-bit watermarking scheme for LFR-based free-view video. The direct-sequence code division multiple access (DS-CDMA) watermark is constructed according to the multi-bit message and embedded into the DCT domain of each view frame. The message can be extracted bit by bit with a correlation detector from a virtual frame generated at an arbitrary viewpoint. Furthermore, we mathematically prove that the watermark can be detected from any virtual view. Experimental results also show that the watermark in FTV can be successfully detected from a virtual view. Moreover, the proposed watermarking method is robust against common signal processing attacks, such as Gaussian filtering, salt-and-pepper noise, JPEG compression, and center cropping.
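
The DS-CDMA embedding and correlation detection described in the abstract above can be sketched in one dimension. This is a simplified illustration under stated assumptions: orthogonal Hadamard rows stand in for the spreading codes, a random vector stands in for the DCT coefficients of a frame, and the strength `alpha` is hypothetical. View rendering and attacks are not modeled.

```python
import random

def hadamard(n):
    """Sylvester construction; n must be a power of two. Rows are mutually orthogonal."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def embed(host, bits, codes, alpha):
    """Add alpha * (+1 or -1 per bit) * code to the host vector for every message bit."""
    marked = list(host)
    for bit, code in zip(bits, codes):
        sign = 1 if bit else -1
        for i, chip in enumerate(code):
            marked[i] += alpha * sign * chip
    return marked

def detect(signal, codes):
    """Correlate with each code; the sign of the correlation gives the bit."""
    return [1 if sum(s * c for s, c in zip(signal, code)) > 0 else 0 for code in codes]

random.seed(0)
n = 64
host = [random.uniform(-1.0, 1.0) for _ in range(n)]  # stand-in for DCT coefficients
bits = [1, 0, 1, 1, 0, 0, 1, 0]                        # hypothetical message
codes = hadamard(n)[1:1 + len(bits)]                   # skip the all-ones row
marked = embed(host, bits, codes, alpha=2.0)
recovered = detect(marked, codes)
```

Because the codes are orthogonal, each correlation isolates one bit. Here `alpha` is set large enough that the host contribution can never flip the sign; a practical scheme trades this strength against visibility.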

  • SLA-Aware and Energy-Efficient VM Consolidation in Cloud Data Centers Using Host State Binary Decision Tree Prediction Model Open Access

    Lianpeng LI  Jian DONG  Decheng ZUO  Yao ZHAO  Tianyang LI  

     
    PAPER-Computer System

      Publicized:
    2019/07/11
      Vol:
    E102-D No:10
      Page(s):
    1942-1951

    In cloud data centers, Virtual Machine (VM) consolidation is an effective way to save energy and improve efficiency. However, inappropriate consolidation of VMs, especially aggressive consolidation, can lead to performance problems and, more seriously, to Service Level Agreement (SLA) violations. Therefore, it is very important to balance the tradeoff between reducing energy use and reducing the level of SLA violations. In this paper, we propose two Host State Detection algorithms and an improved VM placement algorithm based on our proposed Host State Binary Decision Tree Prediction model for SLA-aware and energy-efficient consolidation of VMs in cloud data centers. We propose two condition formulas for host state estimation, and our model uses them to manually build a Binary Decision Tree for host state detection. We extend the CloudSim simulator to evaluate our algorithms using a PlanetLab workload and a random workload. The experimental results show that our proposed model can significantly reduce SLA violation rates while keeping energy cost efficient: for the real-world workload, it reduces the SLAV metric by up to 98.12% and the Energy metric by up to 33.96%.

  • Fast Intra Coding Algorithm for HEVC Based on Decision Tree

    Jia QIN  Huihui BAI  Mengmeng ZHANG  Yao ZHAO  

     
    LETTER-Image

      Vol:
    E100-A No:5
      Page(s):
    1274-1278

    High Efficiency Video Coding (HEVC) is the latest coding standard. Compared with Advanced Video Coding (H.264/AVC), HEVC offers about a 50% bitrate reduction at the same reconstructed video quality. However, this new coding standard leads to enormous computational complexity, which makes it difficult to encode video in real time. Therefore, to address the high complexity of intra coding in HEVC, this paper proposes a new fast coding unit (CU) splitting algorithm based on a decision tree. A decision tree, as a machine learning method, can be designed to determine the size of CUs adaptively. Here, two significant features, the Just Noticeable Difference (JND) values and the coding bits of each CU, are extracted to train the decision tree according to their relationships with the CU partitions. The experimental results show that, compared with the HEVC reference software, the proposed algorithm saves about 34% of the time on average with only a small increase in BD-rate under the “All_Intra” setting.

  • Lossless Data Hiding Based on Companding Technique and Difference Expansion of Triplets

    ShaoWei WENG  Yao ZHAO  Jeng-Shyang PAN  

     
    LETTER-Image

      Vol:
    E90-A No:8
      Page(s):
    1717-1718

    A reversible data hiding scheme based on the companding technique and the difference expansion (DE) of triplets is proposed in this paper. The companding technique is employed to increase the number of the expandable triplets. The capacity consumed by the location map recording the expanded positions is largely decreased. As a result, the hiding capacity is considerably increased. The experimental results reveal that high hiding capacity can be achieved at low embedding distortion.
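
For readers unfamiliar with difference expansion, the following minimal sketch shows the classical pair-based form of the technique: one bit is embedded into two integers and both the bit and the original pair are recovered exactly. It is a simpler stand-in, not the triplet-based scheme of the paper, and it omits the companding step, the overflow check for the pixel range, and the location map. The function names are hypothetical.

```python
def de_embed(x, y, bit):
    """Embed one bit into an integer pair by expanding their difference."""
    avg = (x + y) // 2        # integer average (floor)
    diff = x - y              # difference
    diff2 = 2 * diff + bit    # expanded difference carries the bit in its lowest bit
    return avg + (diff2 + 1) // 2, avg - diff2 // 2

def de_extract(x2, y2):
    """Recover the bit and the original pair from a marked pair."""
    avg = (x2 + y2) // 2      # the integer average is unchanged by embedding
    diff2 = x2 - y2
    bit = diff2 % 2
    diff = diff2 // 2
    return bit, avg + (diff + 1) // 2, avg - diff // 2

marked = de_embed(206, 201, 1)
restored = de_extract(*marked)
```

Embedding roughly doubles the difference, so pairs with small differences distort least. This is consistent with the abstract's interest in increasing the number of expandable units.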

  • Standard-Compliant Multiple Description Image Coding Based on Convolutional Neural Networks

    Ting ZHANG  Huihui BAI  Mengmeng ZHANG  Yao ZHAO  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2018/07/19
      Vol:
    E101-D No:10
      Page(s):
    2543-2546

    Multiple description (MD) coding is an attractive framework for robust information transmission over non-prioritized and unpredictable networks. In this paper, a novel MD image coding scheme is proposed based on convolutional neural networks (CNNs), which aims to improve the reconstruction quality of the side and central decoders. To this end, a given image is first encoded into two independent descriptions by sub-sampling. Such a design makes the proposed method compatible with existing image coding standards. At the decoder, in order to achieve high-quality side and central image reconstruction, three CNNs, comprising two side decoder sub-networks and one central decoder sub-network, are integrated into an end-to-end reconstruction framework. Experimental results show the improvement achieved by the proposed scheme in terms of both peak signal-to-noise ratio values and subjective quality. The proposed method demonstrates better rate versus central and side distortion performance.