Keyword Search Result

[Keyword] video(613hit)

201-220hit(613hit)

  • Facial Micro-Expression Detection in Hi-Speed Video Based on Facial Action Coding System (FACS)

    Senya POLIKOVSKY  Yoshinari KAMEDA  Yuichi OHTA  

     
    PAPER-Pattern Recognition

    Vol: E96-D No:1  Page(s): 81-92

    Facial micro-expressions are fast and subtle facial motions that are considered one of the most useful external signs for detecting hidden emotional changes in a person. However, they are not easy to detect and measure, as they appear only for a short time, with small muscle contractions in facial areas where salient features are not available. We propose a new computer vision method for detecting and measuring the timing characteristics of facial micro-expressions. The core of this method is a descriptor that combines pre-processing masks, histograms, and concatenation of spatial-temporal gradient vectors. The presented 3D gradient histogram descriptor is able to detect and measure the timing characteristics of fast and subtle changes of the facial skin surface. The method is specifically designed for the analysis of videos recorded with a hi-speed 200 fps camera. Final classification of micro-expressions is done using a k-means classifier and a voting procedure. The Facial Action Coding System was used to annotate the appearance and dynamics of the expressions in our new hi-speed micro-expression video database. The efficiency of the proposed approach was validated using this database.

  • Optimization Algorithm for SVC Multicast with Light-Weight Feedback

    Hao ZHOU  Yu GU  Yusheng JI  Baohua ZHAO  

     
    PAPER-Communication Theory and Signals

    Vol: E95-A No:11  Page(s): 1946-1954

    Scalable video coding with different modulation and coding schemes (MCSs) applied to different video layers is well suited to wireless multicast services because it can provide different video quality to different users according to their channel conditions. A promising way to handle packet losses induced by fading wireless channels is a layered hybrid FEC/ARQ scheme driven by light-weight feedback messages in which users report how many packets they have received. To apply scalable video coding to wireless multicast streaming, it is important to choose an appropriate MCS for each layer, decide how many parity packets should be transmitted in each layer, and determine the resources allocated to multiple video sessions. We prove that this resource allocation problem is NP-hard and propose an approximately optimal algorithm with polynomial run time. The algorithm can obtain the optimal transmission configuration to maximize the expected utility for all users, where the utility can be a generic non-negative, non-decreasing function of the received rate. Simulation results show that our algorithm offers significant improvements in video quality over a naive algorithm, an optimal algorithm without feedback from users, and an algorithm with feedback from designated users, especially in scenarios with multiple video sessions and limited radio resources.

  • Performance of Spatial and Temporal Error Concealment Method for 3D DWT Video Coding in Packet Loss Channel

    Hirokazu TANAKA  Sunmi KIM  Takahiro OGAWA  Miki HASEYAMA  

     
    PAPER-Image Processing

    Vol: E95-A No:11  Page(s): 2015-2022

    A new spatial and temporal error concealment method for three-dimensional discrete wavelet transform (3D DWT) video coding is analyzed. 3D DWT video coding employing dispersive grouping (DG) and two-step error concealment is an efficient method in a packet loss channel [20],[21]. In the two-step error concealment method, the interpolations are applied only spatially; however, higher interpolation efficiency can be expected by exploiting both spatial and temporal similarities. In this paper, we propose an enhanced spatial and temporal error concealment method in order to achieve higher error concealment (EC) performance in packet loss networks. In the temporal error concealment method, the structural similarity (SSIM) index is employed for inter group of pictures (GOP) EC, and minimum mean square error (MMSE) is used for intra GOP EC. Experimental results show that the proposed method achieves remarkable performance compared with conventional methods.

  • Selective Intra Block Size Decision and Fast Intra Mode Decision Algorithms for H.264/AVC Encoder

    Karunanithi BHARANITHARAN  Jiun-Ren DING  Bo-Wei CHEN  Jhing-Fa WANG  

     
    LETTER-Data Engineering, Web Information Systems

    Vol: E95-D No:11  Page(s): 2720-2723

    In H.264/AVC intra frame coding, rate-distortion optimization (RDO) is employed to select the optimal coding mode, i.e., the one that achieves the minimum rate-distortion cost. Because of the large number of coding mode combinations, the computational burden of RDO becomes extremely high in intra prediction. In this paper, we propose an efficient selective intra block size decision (SIB) that selects the appropriate block size for intra prediction. In addition, the proposed fast intra prediction algorithm reduces the number of modes required for RDO, which significantly reduces encoder complexity. Experimental results show that the proposed fast mode decision algorithm reduces the encoding time by up to 68% with negligible video quality degradation.
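
The mode selection that RDO performs can be illustrated with a minimal sketch. It assumes the commonly used Lagrangian cost J = D + λR; the mode names, distortion values, rate values, and the function `select_mode` are hypothetical and are not taken from the paper, and the sketch does not implement the proposed SIB or fast mode decision.

```python
def select_mode(candidates, lam):
    """Pick the coding mode with the smallest Lagrangian cost J = D + lam * R.

    candidates: dict mapping a mode name to (distortion, rate_in_bits).
    lam: Lagrange multiplier trading rate against distortion (assumed given).
    Returns (best_mode, best_cost).
    """
    best_mode, best_cost = None, float("inf")
    for mode, (distortion, rate) in candidates.items():
        cost = distortion + lam * rate
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Hypothetical measurements for two intra block sizes.
modes = {"intra4x4": (100, 50), "intra16x16": (140, 20)}
print(select_mode(modes, lam=1))  # low lam favours the low-distortion mode
print(select_mode(modes, lam=2))  # high lam favours the low-rate mode
```

Every candidate evaluated this way requires an encode to measure D and R, which is why reducing the number of candidate modes lowers encoder complexity.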

  • Burst Error Resilient Channel Coding for SVC over Mobile Networks

    GunWoo KIM  Yongwoo CHO  Jihyeok YUN  DougYoung SUH  

     
    LETTER-Multimedia Environment Technology

    Vol: E95-A No:11  Page(s): 2032-2035

    This paper proposes Burst Error Resilient coding (BRC) technology for mobile broadcasting networks. The proposed method uses Scalable Video Coding (SVC) and Forward Error Correction (FEC) to overcome service outages caused by burst loss in mobile networks. Performance is evaluated by comparing the PSNR of SVC and of the proposed method under an MBSFN simulation channel. The simulation results show the PSNR of SVC with equal error protection (EEP), with unequal error protection (UEP), and with the proposed BRC using a Raptor FEC code.

  • No-Reference Quality Estimation for Video-Streaming Services Based on Error-Concealment Effectiveness

    Toru YAMADA  Yoshihiro MIYAMOTO  Takao NISHITANI  

     
    PAPER-Multimedia Environment Technology

    Vol: E95-A No:11  Page(s): 2007-2014

    This paper proposes a video-quality estimation method based on a no-reference model for real-time quality monitoring in video-streaming services. The proposed method analyzes both bitstream information and decoded pixel information to estimate video-quality degradation caused by transmission errors. Video quality, in terms of the mean squared error (MSE) between degraded video frames and error-free video frames, is estimated on the basis of the number of impaired macroblocks in which the quality degradation could not be concealed. Error-concealment effectiveness is evaluated using motion information and luminance discontinuity at the boundaries of impaired regions. Simulation results show a high correlation (a correlation coefficient of 0.93) between the actual MSE and the number of macroblocks in which error concealment was not effective. These results show that the proposed method works well in real-time quality monitoring for video-streaming services.
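
For reference, the MSE that the method estimates can be computed directly when both frames are available. The sketch below shows only that ground-truth quantity, not the no-reference estimator; the function name `frame_mse` and the sample luminance values are hypothetical.

```python
def frame_mse(reference, degraded):
    """Mean squared error between two equally sized luminance frames.

    Each frame is a list of rows, and each row is a list of pixel values.
    """
    total = 0
    count = 0
    for ref_row, deg_row in zip(reference, degraded):
        for ref_px, deg_px in zip(ref_row, deg_row):
            total += (ref_px - deg_px) ** 2
            count += 1
    return total / count


error_free = [[10, 20], [30, 40]]
degraded = [[10, 22], [30, 36]]
print(frame_mse(error_free, degraded))  # squared errors 0, 4, 0, 16 average to 5.0
```

A no-reference method cannot evaluate this expression at the receiver because the error-free frame is unavailable there, which is why the abstract relies on counting macroblocks whose concealment failed.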

  • Super-Resolution Reconstruction for Spatio-Temporal Resolution Enhancement of Video Sequences

    Miki HASEYAMA  Daisuke IZUMI  Makoto TAKIZAWA  

     
    LETTER-Image Processing and Video Processing

    Vol: E95-D No:9  Page(s): 2355-2358

    A method for spatio-temporal resolution enhancement of video sequences based on super-resolution reconstruction is proposed. A new observation model is defined for accurate resolution enhancement, which enables subpixel motion in intermediate frames to be obtained. A modified optimization formula for obtaining a high-resolution sequence is also adopted.

  • Resource Allocation for Scalable Video Multicast over OFDMA Systems

    Zan YANG  Yuping ZHAO  

     
    LETTER-Network

    Vol: E95-B No:9  Page(s): 2948-2951

    In this letter, we propose a framework for scalable video multicast, which exploits the scalability of scalable video and the multiuser diversity of OFDMA systems. We further propose a resource allocation algorithm which guarantees the base-quality video for all users, and improves the transmission efficiency for users with good channel conditions.

  • Real Time Aerial Video Stitching via Sensor Refinement and Priority Scan

    Chao LIAO  Guijin WANG  Bei HE  Chenbo SHI  Yongling SHEN  Xinggang LIN  

     
    LETTER-Image Processing and Video Processing

    Vol: E95-D No:8  Page(s): 2146-2149

    The time efficiency of aerial video stitching is still an open problem because of the large number of input frames, which usually results in prohibitive complexity in both image registration and blending. In this paper, we propose an efficient framework for stitching aerial videos in real time. Reasonable distortions are allowed as a tradeoff for acceleration. Instead of searching for globally optimized solutions, we directly refine frame positions with sensor data to compensate for the accumulated alignment error. A priority scan method is proposed to select pixels within the overlapping area for blending into the final panorama, which avoids complicated operations such as weighting or averaging pixels. Experiments show that our method can generate satisfactory results at very competitive speed.

  • No-Reference Quality Estimation for Compressed Videos Based on Inter-Frame Activity Difference

    Toru YAMADA  Takao NISHITANI  

     
    PAPER-Quality Metrics

    Vol: E95-A No:8  Page(s): 1240-1246

    This paper presents a no-reference (NR) video-quality estimation method for videos compressed with inter-frame prediction. The proposed method does not need bitstream information; only the pixel information of decoded videos is used for the video-quality estimation. An activity value, which indicates the variance of luminance values, is calculated for every pixel block of a given size. The activity difference between an intra-coded frame and its adjacent frame is calculated and employed for the video-quality estimation. In addition, a blockiness level and a blur level are estimated for every frame by analyzing pixel information only, and both are taken into account to improve quality-estimation accuracy. Experimental results show that the proposed method achieves accurate video-quality estimation without access to the original video, which is free of compression artifacts. The correlation coefficient between subjective video quality and estimated quality is 0.925. The proposed method is suitable for automatic video-quality checks when service providers cannot access the original videos.
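
The block activity described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how per-block differences are aggregated, so the mean absolute difference used here is an assumption, and the function names, block size, and sample frames are hypothetical.

```python
def block_activities(frame, block=2):
    """Variance of luminance for each block x block tile, in raster order.

    frame is a list of rows of luminance values; partial tiles at the
    right and bottom edges are ignored.
    """
    height, width = len(frame), len(frame[0])
    activities = []
    for y in range(0, height - block + 1, block):
        for x in range(0, width - block + 1, block):
            pixels = [frame[y + dy][x + dx]
                      for dy in range(block) for dx in range(block)]
            mean = sum(pixels) / len(pixels)
            activities.append(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    return activities


def mean_activity_difference(frame_a, frame_b, block=2):
    """Mean absolute difference of block activities (assumed aggregation)."""
    acts_a = block_activities(frame_a, block)
    acts_b = block_activities(frame_b, block)
    return sum(abs(a - b) for a, b in zip(acts_a, acts_b)) / len(acts_a)


intra_frame = [[0, 2], [2, 0]]     # textured block, variance 1.0
adjacent_frame = [[5, 5], [5, 5]]  # flat block, variance 0.0
print(mean_activity_difference(intra_frame, adjacent_frame))
```

Only decoded pixels are needed for this computation, which matches the abstract's point that no bitstream information is required.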

  • High ESD Breakdown-Voltage InP HBT Transimpedance Amplifier IC for Optical Video Distribution Systems

    Kimikazu SANO  Munehiko NAGATANI  Miwa MUTOH  Koichi MURATA  

     
    PAPER-III-V High-Speed Devices and Circuits

    Vol: E95-C No:8  Page(s): 1317-1322

    This paper reports a high ESD breakdown-voltage InP HBT transimpedance amplifier (TIA) IC for optical video distribution systems. To raise the ESD breakdown voltage, we designed ESD protection circuits integrated in the TIA IC using base-collector/base-emitter diodes of InP HBTs and resistors. These components already exist in the employed InP HBT IC process, so no process modifications were needed. Furthermore, to meet the requirements of optical video distribution systems, we studied circuit design techniques to obtain good input-output linearity and a low-noise characteristic. The fabricated InP HBT TIA IC exhibited high human-body-model ESD breakdown voltages (±1000 V for power supply terminals, ±200 V for high-speed input/output terminals), good input-output linearity (less than 2.9% duty-cycle distortion), and a low-noise characteristic (10.7 pA/√Hz averaged input-referred noise current density) with a -3-dB-down higher frequency of 6.9 GHz. To the best of our knowledge, this is the first paper describing InP ICs with high ESD breakdown voltages.

  • Dynamic Resource Management in Clouds: A Probabilistic Approach [Open Access]

    Paulo GONÇALVES  Shubhabrata ROY  Thomas BEGIN  Patrick LOISEAU  

     
    INVITED PAPER

    Vol: E95-B No:8  Page(s): 2522-2529

    Dynamic resource management has become an active area of research in the Cloud Computing paradigm. The cost of resources varies significantly depending on the configuration in which they are used. Hence, efficient management of resources is of prime interest to both Cloud Providers and Cloud Users. In this work, we suggest a probabilistic resource provisioning approach that can be exploited as the input of a dynamic resource management scheme. Using a Video on Demand (VoD) use case to justify our claims, we propose an analytical model, inspired by standard models of epidemic spreading, to represent sudden and intense workload variations. We show that the resulting model satisfies a Large Deviation Principle that statistically characterizes extreme rare events, such as those produced by “buzz/flash crowd effects” that may cause workload overflow in the VoD context. This analysis provides valuable insight into the abnormal behaviors that can be expected of such systems. We exploit the information obtained from the Large Deviation Principle for the proposed Video on Demand use case to define policies (Service Level Agreements). We believe these policies for elastic resource provisioning and usage may be of interest to all stakeholders in the emerging context of cloud networking.

  • Reduced-Reference Objective Quality Assessment Model of Coded Video Sequences Based on the MPEG-7 Descriptor

    Masaharu SATO  Yuukou HORITA  

     
    LETTER-Quality Metrics

    Vol: E95-A No:8  Page(s): 1259-1263

    Our research examines a video quality assessment model based on the MPEG-7 descriptor. Video quality is estimated using several features derived from the predicted frame quality, such as the average value, worst value, best value, and standard deviation, together with the predicted frame rate obtained from descriptor information. As a result, video quality can be assessed with high prediction accuracy: correlation coefficient = 0.94, standard deviation of error = 0.24, maximum error = 0.68, and outlier ratio = 0.23.

  • Recent Advances on Scalable Video Coding

    Kazuya HAYASE  Hiroshi FUJII  Yukihiro BANDOH  Hirohisa JOZAWA  

     
    INVITED PAPER

    Vol: E95-A No:8  Page(s): 1230-1239

    Scalable video coding offers efficient video transmission to a variety of display devices over heterogeneous and error-prone networks. It has been intensively researched in recent years, and state-of-the-art international coding with scalability has been standardized as SVC, an extension of H.264/AVC. This paper summarizes recent advanced research on improving the quality and reducing the complexity of scalable video coding (including SVC), as well as on improving quality assessment techniques. It is intended to give researchers a critical, technical overview of what is required to develop more efficient scalable video coding in the future.

  • Discriminative Textural Features for Image and Video Colorization

    Michal KAWULOK  Jolanta KAWULOK  Bogdan SMOLKA  

     
    PAPER-Image Synthesis

    Vol: E95-D No:7  Page(s): 1722-1730

    Image colorization is a semi-automatic process of adding colors to monochrome images and videos. With existing methods, the required human assistance can be limited to annotating the image with color scribbles or selecting a reference image from which the colors are transferred to a source image or video sequence. In the work reported here, we explore how to exploit textural information to improve this process. For every scribbled image, we determine the discriminative textural feature domain. The whole image is then projected onto the feature space, which makes it possible to estimate the textural similarity between any two pixels. For single-image colorization based on a set of color scribbles, our contribution lies in using the proposed feature space domain rather than the luminance channel. In the case of color transfer used for colorization of video sequences, the feature space is generated from a reference image, and textural similarity is used to match pixels between the reference and source images. We have conducted extensive experimental validation, which confirmed the importance of using textural information and demonstrated that our method significantly improves colorization results.

  • Early Termination of CU Encoding to Reduce HEVC Complexity

    Ryeong-hee GWEON  Yung-Lyul LEE  

     
    LETTER-Image

    Vol: E95-A No:7  Page(s): 1215-1218

    The next-generation video coding standard HEVC shows high coding performance compared with the H.264/AVC standard, but the computational complexity of the HEVC encoder (HM3.0) is significantly higher. In this letter, an early termination algorithm for CU encoding is proposed to reduce the computational complexity of the HEVC encoder. The proposed method reduces the encoder complexity by 58.7% while maintaining the same level of coding efficiency.

  • A Direct Inter-Mode Selection Algorithm for P-Frames in Fast H.264/AVC Transcoding

    Bin SONG  Haixiao LIU  Hao QIN  Jie QIN  

     
    PAPER-Multimedia Systems for Communications

    Vol: E95-B No:6  Page(s): 2101-2108

    A direct inter-mode selection algorithm for P-frames in fast homogeneous H.264/AVC bit-rate reduction transcoding is proposed in this paper. To achieve direct inter-mode selection, we first develop a low-complexity distortion estimation method for fast transcoding, in which the distortion is calculated directly from the decoded residual together with the reference frames. We also present a linear estimation method to approximate the coding rate. With the estimated distortion and rate, the rate-distortion cost can be easily computed in the transcoder. In our algorithm, a method based on the normalized rate difference of P-frames (RP) is used to detect high-motion scenes. To achieve fast transcoding, the rate-distortion optimized (RDO) mode decision is performed only for P-frames whose RP is larger than a threshold; meanwhile, the average cost of each inter-mode (ACM) is calculated. Then, when subsequent frames are transcoded, the optimal coding mode can be selected directly using the estimated cost and the ACM threshold. Experiments show that the proposed method can significantly simplify the complex RDO mode decision and achieve transcoding time reductions of up to 62% with a small loss of rate-distortion performance.

  • Low-Complexity Coarse-Level Mode-Mapping Based H.264/AVC to H.264/SVC Spatial Transcoding for Video Conferencing

    Lei SUN  Jie LENG  Jia SU  Yiqing HUANG  Hiroomi MOTOHASHI  Takeshi IKENAGA  

     
    PAPER-Video Processing

    Vol: E95-D No:5  Page(s): 1313-1323

    Scalable Video Coding (SVC) was standardized as an extension of H.264/AVC with the intention of providing flexible adaptation to heterogeneous networks and different end-user requirements, which gives it great scalability in multi-point applications such as video conferencing. However, because H.264/AVC-based systems already exist, transcoding between AVC and SVC becomes necessary. Most existing works focus on temporal transcoding, quality transcoding, or SVC-to-AVC spatial transcoding, while the straightforward re-encoding method has a high computational cost. This paper proposes a low-complexity AVC-to-SVC spatial transcoder based on coarse-level mode mapping for video conferencing scenes. First, to omit unnecessary motion estimations (ME) for layers with reduced resolution, an ME skipping scheme based on AVC mode distribution is proposed with an adaptive search range. Then a probability-profile based scheme is proposed for further mode skipping. After that, three coarse-level mode-mapping methods are presented for fast mode decision, and the adaptive usage of the three methods is discussed. Finally, motion vector (MV) refinement is introduced for further lower-layer time reduction. As for the top layer, direct encapsulation is proposed to preserve better quality, and another scheme involving inter-layer predictions is also provided for bandwidth-crucial applications. Simulation results show that the proposed transcoder achieves up to 92.6% time reduction without significant coding efficiency loss compared with the re-encoding method.

  • Efficient Tracking of News Topics Based on Chronological Semantic Structures in a Large-Scale News Video Archive

    Ichiro IDE  Tomoyoshi KINOSHITA  Tomokazu TAKAHASHI  Hiroshi MO  Norio KATAYAMA  Shin'ichi SATOH  Hiroshi MURASE  

     
    PAPER-Video Processing

    Vol: E95-D No:5  Page(s): 1288-1300

    Recent advances in digital storage technology have enabled us to archive a large volume of video data. Thanks to this trend, we have archived more than 1,800 hours of video data from a daily Japanese news show over the last ten years. For the effective use of such a large news video archive, we assume that analysis of its chronological and semantic structure is important. We also consider that showing users how news topics develop helps their understanding of current affairs more than providing a list of relevant news stories, as most current news video retrieval systems do. Therefore, in this paper, we propose a structuring method for a news video archive, together with an interface that visualizes the structure, so that users can efficiently track the development of news topics according to their interest. The proposed news video structure, namely the “topic thread structure”, is obtained by analyzing the chronological and semantic relations between news stories. The proposed interface, namely “mediaWalker II”, allows users to track the development of news topics along the topic thread structure and at the same time watch the video footage corresponding to each news story. Analyses of the topic thread structures obtained by applying the proposed method to actual news video footage revealed interesting and comprehensible relations between news topics in the real world. At the same time, analyses of their size quantified the efficiency of tracking a user's topic of interest based on the proposed topic thread structure. We consider this a first step towards facilitating video authoring by users based on existing content in a large-scale news video archive.

  • Efficiently Finding Individuals from Video Dataset

    Pengyi HAO  Sei-ichiro KAMATA  

     
    PAPER-Video Processing

    Vol: E95-D No:5  Page(s): 1280-1287

    We are interested in retrieving video shots or videos containing particular people from a video dataset. Owing to the large variations in pose, illumination conditions, occlusions, hairstyles and facial expressions, face tracks have recently been researched in the fields of face recognition, face retrieval and name labeling from videos. However, when the number of face tracks is very large, conventional methods, which match all or some pairs of faces in face tracks, will not be effective. Therefore, in this paper, an efficient method for finding a given person in a video dataset is presented. In our study, in addition to performing research on face tracks in a single video, we also consider how to organize all the faces in the videos of a dataset and how to improve the search quality in the query process. Different videos may include the same person; thus, the management of individuals across different videos is useful for their retrieval. The proposed method includes the following three points. (i) Face tracks of the same person appearing for a period in each video are first connected on the basis of scene information with a time constraint, then all the people in one video are organized by a proposed hierarchical clustering method. (ii) After obtaining the organizational structure of all the people in one video, the people are organized into an upper layer by affinity propagation. (iii) Finally, in the process of querying, a remeasuring method based on the index structure of videos is performed to improve the retrieval accuracy. We also build a video dataset that contains six types of videos: films, TV shows, educational videos, interviews, press conferences and domestic activities. The formation of face tracks in the six types of videos is first researched, then experiments are performed on this video dataset containing more than 1 million faces and 218,786 face tracks. The results show that the proposed approach has high search quality and a short search time.
