
Keyword Search Result

[Keyword] video(613hit)

101-120hit(613hit)

  • Hardware Oriented Low-Complexity Intra Coding Algorithm for SHVC

    Takafumi KATAYAMA  Tian SONG  Wen SHI  Gen FUJITA  Xiantao JIANG  Takashi SHIMAMOTO  

     
    PAPER-Digital Signal Processing

      Vol:
    E100-A No:12
      Page(s):
    2936-2947

    Scalable high efficiency video coding (SHVC) can provide variable video quality according to terminal devices. However, the computational complexity of SHVC is increased by introducing new techniques based on high efficiency video coding (HEVC). In this paper, a hardware oriented low complexity algorithm is proposed. The hardware oriented proposals have two key points. Firstly, the coding unit depth is determined by analyzing the boundary correlation between coding units before the encoding process starts. Secondly, the redundant calculation of R-D optimization is reduced by adaptively using the information of the neighboring coding units and the co-located units in the base layer. The simulation results show that the proposed algorithm can achieve over 62% computation complexity reduction compared to the original SHM11.0. Compared with other related work, over 11% time saving has been achieved without PSNR loss. Furthermore, the proposed algorithm is hardware friendly and can be implemented in a small area.

  • Joint Transmission and Coding Scheme for High-Resolution Video Streams over Multiuser MIMO-OFDM Systems

    Koji TASHIRO  Leonardo LANANTE  Masayuki KUROSAKI  Hiroshi OCHI  

     
    PAPER-Communication Systems

      Vol:
    E100-A No:11
      Page(s):
    2304-2313

    High-resolution image and video communication in home networks is expected to proliferate with the spread of Wi-Fi devices and the introduction of multiple-input multiple-output (MIMO) systems. This paper proposes a joint transmission and coding scheme for broadcasting high-resolution video streams over multiuser MIMO systems with an eigenbeam-space division multiplexing (E-SDM) technique. Scalable video coding makes it possible to produce a code stream comprising multiple layers that contribute unequally to image quality. The proposed scheme jointly assigns the data of scalable code streams to subcarriers and spatial streams based on their signal-to-noise ratio (SNR) values in order to transmit visually important data with high reliability. Simulation results show that the proposed scheme surpasses the conventional unequal power allocation (UPA) approach in terms of both peak signal-to-noise ratio (PSNR) of received images and correct decoding probability. PSNR performance of the proposed scheme exceeds 35dB with a probability of over 95% when received SNR is higher than 6dB. The improvement in average PSNR by the proposed scheme compared to the conventional UPA reaches approximately 20dB at a received SNR of 6dB. Furthermore, correct decoding probability reaches 95% when received SNR is greater than 4dB.

  • A New Scheme of Distributed Video Coding Based on Compressive Sensing and Intra-Predictive Coding

    Shin KURIHARA  Suguru HIROKAWA  Hisakazu KIKUCHI  

     
    PAPER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    1944-1952

    Compressive sensing is attractive to distributed video coding with respect to two issues: low complexity in encoding and low data rate in transmission. In this paper, a novel compressive sensing-based distributed video coding system is presented based on a combination of predictive coding and Wyner-Ziv difference coding of compressively sampled frames. Experimental results show that the data volume in transmission in the proposed method is less than one tenth of the distributed compressive video sensing. The quality of decoded video was evaluated in terms of PSNR and structural similarity index as well as visual inspections.

  • Visual Indexing of Large Scale Train-Borne Video for Rail Condition Perceiving

    Peng DAI  Shengchun WANG  Yaping HUANG  Hao WANG  Xinyu DU  Qiang HAN  

     
    PAPER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    2017-2026

    Train-borne video captured from a camera installed in the front or back of the train has been used for railway environment surveillance, including missing communication units and bolts on the track, broken fences, and unpredictable objects falling into the rail area or hanging on wires above the rails. Moreover, the track condition can be perceived visually from the video by observing and analyzing the train-swaying arising from track irregularity. However, frequently examining the whole large scale video, which can run to dozens of hours, is time-consuming and labor-intensive. In this paper, we propose a simple and effective method to detect the train-swaying quickly and automatically. We first generate the long rail track panorama (RTP) by stitching the stripes cut from the video frames, and then extract the track profile to perform the unevenness detection algorithm on the RTP. The experimental results show that RTP, the compact video representation, allows the visual train-swaying information to be examined quickly for track condition perceiving, on which we detect the irregular spots with 92.86% recall and 82.98% precision in only 2 minutes of computation from a video close to 1 hour long.

  • 3D Tracker-Level Fusion for Robust RGB-D Tracking

    Ning AN  Xiao-Guang ZHAO  Zeng-Guang HOU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2017/05/16
      Vol:
    E100-D No:8
      Page(s):
    1870-1881

    In this study, we address the problem of online RGB-D tracking, which is confronted with various challenges caused by deformation, occlusion, background clutter, and abrupt motion. Various trackers have different strengths and weaknesses, and thus a single tracker can perform well only in specific scenarios. We propose a 3D tracker-level fusion algorithm (TLF3D) which enhances the strengths of different trackers and suppresses their weaknesses to achieve robust tracking performance in various scenarios. The fusion result is generated from outputs of base trackers by optimizing an energy function considering both the 3D cube attraction and 3D trajectory smoothness. In addition, three complementary base RGB-D trackers with intrinsically different tracking components are proposed for the fusion algorithm. We perform extensive experiments on a large-scale RGB-D benchmark dataset. The evaluation results demonstrate the effectiveness of the proposed fusion algorithm and the superior performance of the proposed TLF3D tracker against state-of-the-art RGB-D trackers.

  • A High-Throughput and Compact Hardware Implementation for the Reconstruction Loop in HEVC Intra Encoding

    Yibo FAN  Leilei HUANG  Zheng XIE  Xiaoyang ZENG  

     
    PAPER-Integrated Electronics

      Vol:
    E100-C No:6
      Page(s):
    643-654

    In the newly finalized video coding standard, namely high efficiency video coding (HEVC), new notations like coding unit (CU), prediction unit (PU) and transformation unit (TU) are introduced to improve the coding performance. As a result, the reconstruction loop in intra encoding is heavily burdened to choose the best partitions or modes for them. In order to solve the bottleneck problems in cycle and hardware cost, this paper proposes a high-throughput and compact implementation for such a reconstruction loop. By “high-throughput”, we mean that it has a fixed throughput of 32 pixel/cycle independent of the TU/PU size (except for 4×4 TUs). By “compact”, we mean that it fully explores the reusability between discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) as well as that between quantization (Q) and de-quantization (IQ). Besides the contributions made in designing related hardware, this paper also provides a universal formula to analyze the cycle cost of the reconstruction loop and proposes a parallel-process scheme to further reduce the cycle cost. This design is verified on the Stratix IV FPGA. The basic structure achieves a maximum frequency of 150MHz and a hardware cost of 64K ALUTs, which can support the real time TU/PU partition decision for 4K×2K@20fps videos.

  • HVTS: Hadoop-Based Video Transcoding System for Media Services

    Seokhyun SON  Myoungjin KIM  

     
    LETTER-Graphs and Networks

      Vol:
    E100-A No:5
      Page(s):
    1248-1253

    In this letter, we propose a Hadoop-based Video Transcoding System (HVTS), which is designed to run on all major cloud computing services. HVTS is highly adapted to the structure and policies of Hadoop, and thus has additional capacities for transcoding, task distribution, load balancing, and content replication and distribution. To evaluate our proposed system, we carry out two performance tests on our local testbed: transcoding and robustness to data node and task failures. The results confirm that our system delivers satisfactory performance in facilitating seamless streaming services in cloud computing environments.

  • Perceptual Distributed Compressive Video Sensing via Reweighted Sampling and Rate-Distortion Optimized Measurements Allocation

    Jin XU  Yan ZHANG  Zhizhong FU  Ning ZHOU  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2017/01/06
      Vol:
    E100-D No:4
      Page(s):
    918-922

    Distributed compressive video sensing (DCVS) is a new paradigm for low-complexity video compression. To achieve the highest possible perceptual coding performance under the measurements budget constraint, we propose a perceptually optimized DCVS codec by jointly exploiting the reweighted sampling and rate-distortion optimized measurements allocation technologies. A visual saliency modulated just-noticeable distortion (VS-JND) profile is first developed based on the side information (SI) at the decoder side. Then the estimated correlation noise (CN) between each non-key frame and its SI is suppressed by the VS-JND. Subsequently, the suppressed CN is utilized to determine the weighting matrix for the reweighted sampling as well as to design a perceptual rate-distortion optimization model to calculate the optimal measurements allocation for each non-key frame. Experimental results indicate that the proposed DCVS codec outperforms the other existing DCVS codecs in terms of both the objective and subjective performance.

  • Frame Popularity-Aware Loss-Resilient Interactive Multi-View Video Streaming

    Takuya FUJIHASHI  Yusuke HIROTA  Takashi WATANABE  

     
    PAPER-Multimedia Systems for Communications

      Publicized:
    2016/10/20
      Vol:
    E100-B No:4
      Page(s):
    646-656

    Multi-view video streaming plays an important role in new interactive and augmented video applications such as telepresence, remote surgery, and entertainment. For those applications, interactive multi-view video transmission schemes have been proposed that aim to reduce the amount of video traffic. Specifically, these schemes encode and transmit only the video frames that are potentially displayed by users, based on periodical feedback from the users. However, existing schemes are vulnerable to frame loss, which often occurs during transmissions, because they encode most video frames using inter prediction and inter-view prediction to reduce traffic. Frame losses induce significant quality degradation due to the collapse of the decoding operations. To improve the loss resilience, we propose an encoding/decoding system, Frame Popularity-based Multi-view Video Streaming (FP-MVS), for interactive multi-view video streaming services. The main idea of FP-MVS is to assign intra (I) frames in the prediction structure for less/more popular (i.e., few/many observed users) potential frames in order to mitigate the impact of a frame loss. In addition, FP-MVS utilizes overlapping and non-overlapping areas between all users' potential frames to prevent redundant video transmission. Although each intra-frame has a large data size, the video traffic can be reduced within a network constraint by combining multicast and unicast for overlapping and non-overlapping area transmissions. Evaluations using Joint Multi-view Video Coding (JMVC) demonstrated that FP-MVS achieves higher video quality even in loss-prone environments. For example, our scheme improves video quality by 11.81dB compared to the standard multi-view video encoding schemes at a loss rate of 5%.

  • On Scheduling Delay-Sensitive SVC Multicast over Wireless Networks with Network Coding

    Shujuan WANG  Chunting YAN  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2016/09/12
      Vol:
    E100-B No:3
      Page(s):
    407-416

    In this work, we study efficient scheduling with network coding in a scalable video coding (SVC) multicast system. Transmission consists of two stages. The original SVC packets are multicast by the server in the first stage and the lost packets are retransmitted in the second stage. With a deadline constraint, the consumer can only be satisfied when the requested packets are received before expiration. Further, the hierarchical encoding architecture of SVC introduces extra decoding delay, which poses a challenge for providing acceptable reconstructed video quality. To solve these problems, instantly decodable network coding is applied to reduce the decoding delay, and a novel packet weighting policy is designed to better describe the contribution a packet can make in upgrading the recovered video quality. Finally, an online packet scheduling algorithm based on the maximal weighted clique is proposed to improve the delay, deadline miss ratio and users' experience. Multiple characteristics of SVC packets, such as the packet utility, the slack time and the number of undelivered/wanted packets, are jointly considered. Simulation results show that the proposed algorithm requires fewer retransmissions and achieves a lower deadline miss ratio. Moreover, the algorithm achieves good recovered video quality and provides high user satisfaction.

  • A Loitering Discovery System Using Efficient Similarity Search Based on Similarity Hierarchy

    Jianquan LIU  Shoji NISHIMURA  Takuya ARAKI  Yuichi NAKAMURA  

     
    INVITED PAPER

      Vol:
    E100-A No:2
      Page(s):
    367-375

    Similarity search is an important and fundamental problem, and is thus widely used in various fields of computer science including multimedia, computer vision, databases, information retrieval, etc. Recently, since loitering behavior often leads to abnormal situations, such as pickpocketing and terrorist attacks, its analysis has attracted increasing attention from research communities. In this paper, we present AntiLoiter, a loitering discovery system adopting efficient similarity search on surveillance videos. To our knowledge, most existing systems for loitering analysis mainly focus on how to detect or identify loiterers by behavior tracking techniques. However, a known difficulty of tracking-based methods is that their analysis results are heavily influenced by occlusions, overlaps, and shadows. Moreover, tracking-based methods need to track the human appearance continuously. Therefore, existing methods are not readily applied to real-world surveillance cameras due to the appearance discontinuity of criminal loiterers. To solve this problem, we abandon the tracking method and instead propose AntiLoiter to efficiently discover loiterers based on their frequent appearance patterns in long-duration videos from multiple surveillance cameras. In AntiLoiter, we propose a novel data structure, Luigi, that indexes data using only the similarity value returned by a corresponding function (e.g., face matching). Luigi is adopted to perform efficient similarity search to realize loitering discovery. We conducted extensive experiments on both synthetic and real surveillance videos to evaluate the efficiency and efficacy of our approach. The experimental results show that our system can find loitering candidates correctly and outperforms the existing method by a factor of 100 in terms of runtime.

  • A Histogram-Based Quality Model for HTTP Adaptive Streaming

    Huyen T. T. TRAN  Nam PHAM NGOC  Yong Ju JUNG  Anh T. PHAM  Truong Cong THANG  

     
    PAPER-VIDEO CODING

      Vol:
    E100-A No:2
      Page(s):
    555-564

    HTTP Adaptive Streaming (HAS) has become a popular solution for multimedia delivery. Because of throughput variations, video quality fluctuates during a streaming session. Therefore, a main challenge in HAS is how to evaluate the overall video quality of a session. In this paper, we explore the impacts of quality values and quality variations in HAS. We propose to use the histogram of segment quality values and the histogram of quality gradients in a session to model the overall video quality. Subjective test results show that the proposed model has very high prediction performance for different videos. In particular, the proposed model provides insights into the factors influencing the overall quality, thus leading to suggestions for improving the quality of streaming video.

  • A Video Salient Region Detection Framework Using Spatiotemporal Consistency Optimization

    Yunfei ZHENG  Xiongwei ZHANG  Lei BAO  Tieyong CAO  Yonggang HU  Meng SUN  

     
    PAPER-Image

      Vol:
    E100-A No:2
      Page(s):
    688-701

    Labeling a salient region accurately in video with cluttered background and complex motion conditions is still a challenging task. Most existing video salient region detection models mainly extract the stimulus-driven saliency features to detect the salient region in video. They are easily influenced by the cluttered background and complex motion conditions, which may lead to incomplete or wrong detection results. In this paper, we propose a video salient region detection framework that fuses the stimulus-driven saliency features and a spatiotemporal consistency cue to improve the performance of detection under these complex conditions. On one hand, stimulus-driven spatial saliency features and temporal saliency features are extracted effectively to derive the initial spatial and temporal salient region map. On the other hand, in order to make use of the spatiotemporal consistency cue, an effective spatiotemporal consistency optimization model is presented. We use this model to optimize the initial spatial and temporal salient region map. Then the superpixel-level spatiotemporal salient region map is derived by optimizing the initial spatiotemporal salient region map. Finally, the pixel-level spatiotemporal salient region map is derived by solving a self-defined energy model. Experimental results on challenging video datasets demonstrate that the proposed video salient region detection framework outperforms state-of-the-art methods.

  • Semantic Motion Signature for Segmentation of High Speed Large Displacement Objects

    Yinhui ZHANG  Zifen HE  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2016/10/05
      Vol:
    E100-D No:1
      Page(s):
    220-224

    This paper presents a novel method for unsupervised segmentation of objects with large displacements in high speed video sequences. Our general framework introduces a new foreground object predicting method that finds object hypotheses by encoding both spatial and temporal features via a semantic motion signature scheme. More specifically, temporal cues of object hypotheses are captured by the motion signature proposed in this paper, which is derived from a sparse saliency representation imposed on the magnitude of the optical flow field. We integrate semantic scores derived from deep networks with location priors, which allows us to directly estimate appearance potentials of foreground hypotheses. A unified MRF energy functional is proposed to simultaneously incorporate the information from the motion signature and semantic prediction features. The functional enforces both spatial and temporal consistency and imposes appearance constancy and spatio-temporal smoothness constraints directly on the object hypotheses. It inherently handles the challenges of segmenting ambiguous objects with large displacements in high speed videos. Our experiments on video object segmentation benchmarks demonstrate the effectiveness of the proposed method for segmenting high speed objects despite the complicated scene dynamics and large displacements.

  • Quality Improvement for Video On-Demand Streaming over HTTP

    Huyen T. T. TRAN  Hung T. LE  Nam PHAM NGOC  Anh T. PHAM  Truong Cong THANG  

     
    LETTER

      Publicized:
    2016/10/07
      Vol:
    E100-D No:1
      Page(s):
    61-64

    It is crucial to provide Internet videos with the best possible content value (or quality) to users. To adapt to network fluctuations, existing solutions provide various client-based heuristics to change video versions without considering the actual quality. In this work, we present for the first time the use of a quality model in making adaptation decisions to improve the overall quality. The proposed method also estimates the buffer level in the near future to protect the client from buffer underflows. Experimental results show that the proposed method is able to provide high and consistent video quality under strongly fluctuating bandwidths.

  • Optimizing Video Delivery for Enhancing User Experience in Wireless Networks

    Jongwon YOON  

     
    PAPER-Network

      Publicized:
    2016/08/04
      Vol:
    E100-B No:1
      Page(s):
    131-139

    With the proliferation of hand-held devices in recent years, mobile video streaming has become an extremely popular application. However, Internet video streaming to mobile devices faces several problems, such as unstable connections, long latency, high jitter, etc. We present a system, OptVid, which enhances the user's experiences of video streaming service on cellular networks. OptVid takes the user's profile and provides seamless adaptive bitrate streaming by leveraging the video transcoding solution. It provides very agile bitrate adaptation, especially in the mobile scenario where the wireless channel is not stable. We prototype video transcoding on a WiMAX testbed to bridge the gap between the wireless channel capacity and the video quality. Our evaluations reveal that OptVid provides better user experience than conventional schemes in terms of PSNR, video stalls, and buffering time. OptVid does not require any additional storage since it transcodes videos on-the-fly upon receiving requests and delivers them directly to the client.

  • Information Hiding and Its Criteria for Evaluation (Open Access)

    Keiichi IWAMURA  Masaki KAWAMURA  Minoru KURIBAYASHI  Motoi IWATA  Hyunho KANG  Seiichi GOHSHI  Akira NISHIMURA  

     
    INVITED PAPER

      Publicized:
    2016/10/07
      Vol:
    E100-D No:1
      Page(s):
    2-12

    Within information hiding technology, digital watermarking is one of the most important technologies for copyright protection of digital content. Many digital watermarking schemes have been proposed in academia. However, these schemes are not used, because they are not practical; one reason for this is that the evaluation criteria are loosely defined. To make the evaluation more concrete and improve the practicality of digital watermarking, watermarking schemes must use common evaluation criteria. To realize such criteria, we organized the Information Hiding and its Criteria for Evaluation (IHC) Committee to create useful, globally accepted evaluation criteria for information hiding technology. The IHC Committee improves their evaluation criteria every year, and holds a competition for digital watermarking based on state-of-the-art evaluation criteria. In this paper, we describe the activities of the IHC Committee and its evaluation criteria for digital watermarking of still images, videos, and audio.

  • Practical Watermarking Method Estimating Watermarked Region from Recaptured Videos on Smartphone

    Motoi IWATA  Naoyoshi MIZUSHIMA  Koichi KISE  

     
    PAPER

      Publicized:
    2016/10/07
      Vol:
    E100-D No:1
      Page(s):
    24-32

    These days, digital signage can be seen in many places, for example, inside stations or trains, distributing attractive promotional video clips. Users can easily get additional information related to such video clips via mobile devices such as smartphones by using websites for retrieval. However, such retrieval is time-consuming and sometimes leads users to incorrect information. Therefore, it is desirable that the additional information can be directly obtained from the video clips. We implement a suitable digital watermarking method on a smartphone to extract watermarks from video clips on signage in real time. The experimental results show that the proposed method correctly extracts watermarks within a second on a smartphone.

  • Performance Improvement of Error-Resilient 3D DWT Video Transmission Using Invertible Codes

    Kotoku OMURA  Shoichiro YAMASAKI  Tomoko K. MATSUSHIMA  Hirokazu TANAKA  Miki HASEYAMA  

     
    PAPER-Video Coding

      Vol:
    E99-A No:12
      Page(s):
    2256-2265

    Many studies have applied the three-dimensional discrete wavelet transform (3D DWT) to video coding. It is known that corruptions of the lowest frequency sub-band (LL) coefficients of 3D DWT severely affect the visual quality of video. Recently, we proposed an error resilient 3D DWT video coding method (the conventional method) that employs dispersive grouping and an error concealment (EC). The EC scheme of our conventional method adopts a replacement technique of the lost LL coefficients. In this paper, we propose a new 3D DWT video transmission method in order to enhance error resilience. The proposed method adopts an error correction scheme using invertible codes to protect LL coefficients. We use half-rate Reed-Solomon (RS) codes as invertible codes. Additionally, to improve performance by using the effect of interleave, we adopt a new configuration scheme at the RS encoding stage. The evaluation by computer simulation compares the performance of the proposed method with that of other EC methods, and indicates the advantage of the proposed method.

  • A Low-Power VLSI Architecture for HEVC De-Quantization and Inverse Transform

    Heming SUN  Dajiang ZHOU  Shuping ZHANG  Shinji KIMURA  

     
    PAPER

      Vol:
    E99-A No:12
      Page(s):
    2375-2387

    In this paper, we present a low-power system for the de-quantization and inverse transform of HEVC. Firstly, we present a low-delay circuit to process the coded results of the syntax elements, and then reduce the number of multipliers from 16 to 4 for the de-quantization process of each 4x4 block. Secondly, we give two efficient data mapping schemes for the memory between de-quantization and inverse transform, and the memory for transpose. Thirdly, the zero information is utilized throughout the whole system. For the two memory parts, the write and read operations of zero blocks/rows/coefficients can all be skipped to save power. The results show that up to 86% of the power consumption can be saved for the memory part under the configuration of “Random-access” and common QPs. For the logic part, the proposed architecture for de-quantization can reduce area consumption by 77%. Overall, our system can support real-time coding for 8K x 4K 120fps video sequences and the normalized area consumption can be reduced by 68% compared with the latest work.
