
Keyword Search Result

[Keyword] video (613 hits)

Hits 81-100 of 613

  • Infants' Pain Recognition Based on Facial Expression: Dynamic Hybrid Descriptions

    Ruicong ZHI  Ghada ZAMZMI  Dmitry GOLDGOF  Terri ASHMEADE  Tingting LI  Yu SUN  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2018/04/20
    Vol: E101-D No:7
    Page(s): 1860-1869

    The accurate assessment of infants' pain is important for understanding their medical conditions and developing suitable treatment. Pediatric studies have reported that inadequate treatment of infants' pain might cause various neuroanatomical and psychological problems. The fact that infants cannot communicate verbally motivates increasing interest in developing automatic pain assessment systems that provide continuous and accurate pain assessment. In this paper, we propose a new set of pain facial activity features to describe infants' facial expressions of pain. Both dynamic facial texture features and dynamic geometric features are extracted from video sequences and used to classify infants' facial expressions as pain or no pain. For the dynamic analysis of facial expression, we construct a spatiotemporal domain representation for texture features and a time series representation (i.e., a time series of frame-level features) for geometric features. Multiple facial features are combined through both feature fusion and decision fusion schemes to evaluate their effectiveness in infants' pain assessment. Experiments are conducted on video acquired from NICU infants, and the best accuracy of the proposed pain assessment approaches is 95.6%. Moreover, we find that although decision fusion does not perform better than feature fusion, the False Negative Rate of decision fusion (6.2%) is much lower than that of feature fusion (25%).
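
    The False Negative Rate quoted in this abstract is the share of true pain cases that a classifier labels as no pain, FN / (FN + TP). A minimal Python sketch of that calculation follows; the function name and the toy labels are illustrative assumptions, not data from the paper.

```python
def false_negative_rate(y_true, y_pred, positive=1):
    """Return FN / (FN + TP) for the given positive class label."""
    # False negatives: truly positive samples predicted as something else.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    # True positives: truly positive samples predicted as positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    if fn + tp == 0:
        raise ValueError("no positive samples in y_true")
    return fn / (fn + tp)


# Made-up labels for illustration only: 1 = pain, 0 = no pain.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 1]
print(false_negative_rate(y_true, y_pred))
```

    A low False Negative Rate matters in this setting because a missed pain episode goes untreated, which is why the abstract reports it separately from overall accuracy.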

  • Enhancement of Video Streaming QoE by Considering Burst Loss in Wireless LANs

    Toshiro NUNOME  Yuta MATSUI  

     
    PAPER

    Publicized: 2018/01/22
    Vol: E101-B No:7
    Page(s): 1653-1660

    In order to enhance QoE of audio and video IP transmission, this paper proposes a method for mitigating the spatial quality impairment during burst loss periods over the wireless networks in the video output scheme SCS, which is a QoE-based video output scheme. SCS switches between two common video output schemes: frame skipping and error concealment. The proposed method pauses video output with an undamaged frame during the burst loss period in order not to pause video output on a degraded frame. We perform an experiment with constant thresholds, the table-lookup method, and the proposed method under various network conditions. The result shows that the effect of the proposed method on QoE can differ with the contents and GOP structures.

  • Energy-Efficient Mobile Video Delivery Utilizing Moving Route Navigation and Video Playout Buffer Control

    Kenji KANAI  Sakiko TAKENAKA  Jiro KATTO  Tutomu MURASE  

     
    PAPER

    Publicized: 2018/01/22
    Vol: E101-B No:7
    Page(s): 1635-1644

    Because mobile users demand a high quality and energy-friendly video delivery service that efficiently uses wireless resources, we introduce an energy-efficient video delivery system by applying moving route navigation and playout buffer control based on the mobile throughput history data. The proposed system first determines the optimal travel route to achieve high-speed and energy-efficient communications. Then when a user enters a high throughput area, our system temporarily extends the video playout buffer size, and the user aggressively downloads video segments via a high-speed and energy-efficient wireless connection until the extended buffer is filled. After leaving this area, the user consumes video segments from the extended buffer in order to keep smooth video playback without wireless communications. We carry out computer simulations, laboratory and field experiments and confirm that the proposed system can achieve energy-efficient mobile video delivery.

  • QoE Enhancement of Audio-Video Reliable Groupcast with IEEE 802.11aa

    Toshiro NUNOME  Takuya KOMATSU  

     
    PAPER

    Publicized: 2018/01/22
    Vol: E101-B No:7
    Page(s): 1645-1652

    This paper enhances the QoE of audio and video multicast transmission over a wireless LAN by means of reliable groupcast schemes. We use GCR (GroupCast with Retries) Unsolicited Retry and GCR Block ACK as reliable groupcast schemes; they are standardized by IEEE 802.11aa. We assume that a wireless access point transmits audio and video streams to several terminals connected to the access point by groupcast. We compare three schemes: Groupcast with EDCA (Enhanced Distributed Channel Access), GCR Unsolicited Retry and GCR Block ACK. We perform computer simulations under various network conditions to assess application-level QoS and evaluate QoE by a subjective experiment. As a result, we find that the most effective scheme depends on network conditions.

  • Self-Supervised Learning of Video Representation for Anticipating Actions in Early Stage

    Yinan LIU  Qingbo WU  Liangzhi TANG  Linfeng XU  

     
    LETTER-Pattern Recognition

    Publicized: 2018/02/21
    Vol: E101-D No:5
    Page(s): 1449-1452

    In this paper, we propose a novel self-supervised method for learning a video representation that is capable of anticipating the video category from only a short clip. The key idea is to employ a Siamese convolutional network to model self-supervised feature learning as two different image matching problems. By using frame encoding, the proposed video representation can be extracted at different temporal scales. We refine the training process via a motion-based temporal segmentation strategy. The learned video representations can be applied not only to action anticipation but also to action recognition. We verify the effectiveness of the proposed approach on both action anticipation and action recognition using two datasets, namely UCF101 and HMDB51. The experiments show that we can achieve results comparable with state-of-the-art self-supervised learning methods on both tasks.

  • Forecasting Service Performance on the Basis of Temporal Information by the Conditional Restricted Boltzmann Machine

    Jiali YOU  Hanxing XUE  Yu ZHUO  Xin ZHANG  Jinlin WANG  

     
    PAPER-Network

    Publicized: 2017/11/10
    Vol: E101-B No:5
    Page(s): 1210-1221

    Predicting the service performance of Internet applications is important in service selection, especially for video services. In order to design a predictor for forecasting video service performance in third-party application, two famous service providers in China, Iqiyi and Letv, are monitored and analyzed. The study highlights that the measured performance in the observation period is time-series data, and it has strong autocorrelation, which means it is predictable. In order to combine the temporal information and map the measured data to a proper feature space, the authors propose a predictor based on a Conditional Restricted Boltzmann Machine (CRBM), which can capture the potential temporal relationship of the historical information. Meanwhile, the measured data of different sources are combined to enhance the training process, which can enlarge the training size and avoid the over-fit problem. Experiments show that combining the measured results from different resolutions for a video can raise prediction performance, and the CRBM algorithm shows better prediction ability and more stable performance than the baseline algorithms.

  • Real-Time Color Image Improvement System for Visual Testing of Nuclear Reactors

    Naoki HOSOYA  Atsushi MIYAMOTO  Junichiro NAGANUMA  

     
    PAPER-Machine Vision and its Applications

    Publicized: 2018/02/16
    Vol: E101-D No:5
    Page(s): 1243-1250

    Nuclear power plants require in-vessel inspections for soundness checks and preventive maintenance. One inspection procedure is visual testing (VT), which is based on video images of an underwater camera in a nuclear reactor. However, a lot of noise is superimposed on VT images due to radiation exposure. We propose a technique for improving the quality of those images by image processing that reduces radiation noise and enhances signals. Real-time video processing was achieved by applying the proposed technique with a parallel processing unit. Improving the clarity of VT images will lead to reducing the burden on inspectors.

  • Graph-Based Video Search Reranking with Local and Global Consistency Analysis

    Soh YOSHIDA  Takahiro OGAWA  Miki HASEYAMA  Mitsuji MUNEYASU  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2018/01/30
    Vol: E101-D No:5
    Page(s): 1430-1440

    Video reranking is an effective way of improving the retrieval performance of text-based video search engines. This paper proposes a graph-based Web video search reranking method with local and global consistency analysis. Generally, the graph-based reranking approach constructs a graph whose nodes and edges respectively correspond to videos and their pairwise similarities. Many reranking methods are built on a scheme that regularizes the smoothness of pairwise relevance scores between adjacent nodes with regard to a user's query. However, since the overall consistency is measured by aggregating only the local consistency over each pair, errors in score estimation increase when noisy samples are included among the neighbors of query-relevant videos. To deal with the noisy samples, the proposed method leverages the global consistency of the graph structure, unlike conventional methods. Specifically, to detect this consistency, the proposed method introduces a spectral clustering algorithm that can detect, on the graph, groups of videos with strong semantic correlation. Furthermore, a new regularization term, which smooths ranking scores within the same group, is introduced into the reranking framework. Since the score regularization is performed from both local and global aspects simultaneously, accurate score estimation becomes feasible. Experimental results obtained by applying the proposed method to a real-world video collection show its effectiveness.
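
    The local smoothness regularization that this abstract attributes to conventional graph-based reranking is commonly realized by propagating scores over a normalized similarity graph. The Python sketch below illustrates that generic idea only. It is not the authors' method: it omits the spectral clustering step and the group-wise regularization term, and the similarity matrix, `alpha`, and the iteration count are assumptions chosen for illustration.

```python
import math


def propagate_scores(similarity, initial, alpha=0.8, iterations=50):
    """Iterate f <- alpha * S f + (1 - alpha) * y over a similarity graph.

    S is the symmetrically normalized similarity matrix D^-1/2 W D^-1/2,
    and y holds the initial (e.g. text-based) relevance scores.
    """
    n = len(similarity)
    degree = [sum(row) for row in similarity]
    # Normalize each edge weight; isolated nodes (degree 0) get zero weights.
    norm = [[similarity[i][j] / math.sqrt(degree[i] * degree[j])
             if degree[i] > 0 and degree[j] > 0 else 0.0
             for j in range(n)]
            for i in range(n)]
    scores = list(initial)
    for _ in range(iterations):
        scores = [alpha * sum(norm[i][j] * scores[j] for j in range(n))
                  + (1 - alpha) * initial[i]
                  for i in range(n)]
    return scores


# Hypothetical graph: videos 0-2 are mutually similar, video 3 has no neighbors.
W = [[0.0, 1.0, 1.0, 0.0],
     [1.0, 0.0, 1.0, 0.0],
     [1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]
y = [1.0, 0.0, 0.0, 0.0]  # only video 0 starts with a relevance score
print(propagate_scores(W, y))
```

    In this toy graph, videos 1 and 2 inherit relevance from their neighbor, video 0, while the isolated video 3 stays at zero. A noisy neighbor would receive score in the same way, which is the weakness of purely local consistency that the abstract points out.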

  • Improving Recommendation via Inference of User Popularity Preference in Sparse Data Environment

    Xiaoying TAN  Yuchun GUO  Yishuai CHEN  Wei ZHU  

     
    PAPER

    Publicized: 2018/01/18
    Vol: E101-D No:4
    Page(s): 1088-1095

    Collaborative Filtering (CF) algorithms work fairly well in personalized recommendation except in sparse data environments. To deal with the sparsity problem, researchers either take into account auxiliary information extracted from additional data resources, or set the missing ratings to default values, e.g., video popularity. Nevertheless, the former is often costly and makes knowledge transfer difficult, whereas the latter degrades the accuracy and coverage of recommendation results. To the best of our knowledge, few studies take advantage of users' preferences regarding video popularity to tackle this problem. In this paper, we aim to enhance the performance of recommendation algorithms via the inference of users' popularity preferences (PPs), especially in a sparse data environment. We propose a scheme to aggregate users' PPs and a Collaborative Filtering based algorithm that makes the inference of PPs feasible and effective from a small number of watching records. We modify a k-Nearest-Neighbor recommendation algorithm and a Matrix Factorization algorithm by introducing the inferred PP. Experiments on a large-scale commercial dataset show that the modified algorithms outperform the original CF algorithms in both recommendation accuracy and coverage. The improvement is especially significant when the data are sparse.

  • A Video-Quality Controller for QoE Enhancement in HTTP Adaptive Streaming

    Takumi KUROSAKA  Shungo MORI  Masaki BANDAI  

     
    PAPER-Multimedia Systems for Communications

    Publicized: 2017/10/17
    Vol: E101-B No:4
    Page(s): 1163-1174

    In this paper, we propose a quality-level control method based on quality of experience (QoE) characteristics for HTTP adaptive streaming (HAS). The proposed method works as an adaptive bitrate controller on the HAS client. The proposed method consists of two operations: buffer-aware control and QoE-aware control. We implement the proposed method on an actual dynamic adaptive streaming over HTTP (DASH) program and evaluate the QoE performance of the proposed method via both objective and subjective evaluations. The results show that the proposed method effectively improves both objective and subjective QoE performances by preventing stalling events and quality-level switchings that have a negative influence on subjective QoE performance.
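
    A client-side bitrate controller that combines a buffer check with a limit on quality switches can be sketched as follows. This is a generic illustration in Python, not the paper's algorithm: the bitrate ladder, the thresholds, and the rule that caps upward switches at one level per segment (a simple stand-in for QoE-aware control) are all assumptions.

```python
def select_quality(ladder_kbps, current_index, throughput_kbps, buffer_s,
                   low_buffer_s=5.0, safety=0.9):
    """Pick the quality-level index for the next segment.

    ladder_kbps: available bitrates in ascending order (hypothetical ladder).
    current_index: index of the level used for the previous segment.
    """
    if buffer_s < low_buffer_s:
        # Buffer-aware step: fall back to the lowest level to avoid a stall.
        return 0
    # Highest level whose bitrate fits under the discounted throughput estimate.
    budget = throughput_kbps * safety
    affordable = 0
    for i, rate in enumerate(ladder_kbps):
        if rate <= budget:
            affordable = i
    # Cap upward switches at one level per segment to limit quality oscillation.
    if affordable > current_index:
        return current_index + 1
    return affordable


ladder = [500, 1000, 2500, 5000]  # kbps, illustrative values
print(select_quality(ladder, current_index=0, throughput_kbps=6000, buffer_s=20.0))
```

    Downward switches are applied immediately in this sketch, on the assumption that a stall costs more than a quality drop; a controller tuned against subjective scores might weigh the two differently.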

  • Segment Scheduling for Progressive Download-Based Multi-View Video Delivery under Successive View Switching

    Takahito KITO  Iori OTOMO  Takuya FUJIHASHI  Yusuke HIROTA  Takashi WATANABE  

     
    PAPER-Multimedia Systems for Communications

    Publicized: 2017/10/04
    Vol: E101-B No:4
    Page(s): 1152-1162

    In conventional multiview video systems using progressive download, a user downloads videos of all viewpoints of one content to realize smooth view switching. This, however, increases the video traffic, and if the available download rate is low, the video quality suffers. Downloading only the desired viewpoint is one approach for reducing the traffic. However, in this case, playback stalls will occur after view switching. These stalls degrade the user's satisfaction with the application. In this paper, we aim at two objectives: 1) to reduce video traffic and 2) to minimize the number of playback stalls. To this end, we propose a new multiview video delivery scheme for progressive download. The main idea of the proposed scheme is that the user downloads only the subset of viewpoints that the user is likely to play back, to realize both traffic reduction and smooth view switching. In addition, we propose two download-scheduling algorithms to prevent playback stalls even at low download rates. The first algorithm prevents stalls in cases with frequent view switching, such as zapping, while the second prevents stalls in gazing cases. Evaluations using a Joint Multiview Video Coding (JMVC) encoder and multiview video sequences show that our scheme achieves not only reduced video traffic but also a decreased number of playback stalls, regardless of the user's view-switching model or download rate. In addition, we demonstrate that the proposed method does not cause playback stalls for either high- or low-motion video content.

  • Performance Comparison of Subjective Quality Assessment Methods for 4k Video

    Kimiko KAWASHIMA  Kazuhisa YAMAGISHI  Takanori HAYASHI  

     
    PAPER-Multimedia Systems for Communications

    Publicized: 2017/08/29
    Vol: E101-B No:3
    Page(s): 933-945

    Many subjective quality assessment methods have been standardized. Experimenters can select a method from these methods in accordance with the aim of the planned subjective assessment experiment. It is often argued that the results of subjective quality assessment are affected by range effects that are caused by the quality distribution of the assessment videos. However, there are no studies on the double stimulus continuous quality-scale (DSCQS) and absolute category rating with hidden reference (ACR-HR) methods that investigate range effects in the high-quality range. Therefore, we conduct experiments using high-quality assessment videos (high-quality experiment) and low-to-high-quality assessment videos (low-to-high-quality experiment) and compare the DSCQS and ACR-HR methods in terms of accuracy, stability, and discrimination ability. Regarding accuracy, we find that the mean opinion scores of the DSCQS and ACR-HR methods were marginally affected by range effects, although almost all common processed video sequences showed no significant difference for the high- and low-to-high-quality experiments. Second, the DSCQS and ACR-HR methods were equally stable in the low-to-high-quality experiment, whereas the DSCQS method was more stable than the ACR-HR method in the high-quality experiment. Finally, the DSCQS method had higher discrimination ability than the ACR-HR method in the low-to-high-quality experiment, whereas both methods had almost the same discrimination ability for the high-quality experiment. We thus determined that the DSCQS method is better at minimizing the range effects than the ACR-HR method in the high-quality range.

  • Intelligent Video Surveillance System Based on Event Detection and Rate Adaptation by Using Multiple Sensors

    Kenji KANAI  Keigo OGAWA  Masaru TAKEUCHI  Jiro KATTO  Toshitaka TSUDA  

     
    PAPER

    Publicized: 2017/09/19
    Vol: E101-B No:3
    Page(s): 688-697

    To reduce the backbone video traffic generated by video surveillance, we propose an intelligent video surveillance system that offers multi-modal sensor-based event detection and event-driven video rate adaptation. Our proposed system can detect pedestrian existence and movements in the monitoring area by using multi-modal sensors (camera, laser scanner and infrared distance sensor) and control surveillance video quality according to the detected events. We evaluate event detection accuracy and video traffic volume in the experiment scenarios where up to six pedestrians pass through and/or stop at the monitoring area. Evaluation results conclude that our system can significantly reduce video traffic while ensuring high-quality surveillance.

  • Accurate Estimation of Personalized Video Preference Using Multiple Users' Viewing Behavior

    Yoshiki ITO  Takahiro OGAWA  Miki HASEYAMA  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2017/11/22
    Vol: E101-D No:2
    Page(s): 481-490

    A method for accurate estimation of personalized video preference using multiple users' viewing behavior is presented in this paper. The proposed method uses three kinds of features: a video, user's viewing behavior and evaluation scores for the video given by a target user. First, the proposed method applies Supervised Multiview Spectral Embedding (SMSE) to obtain lower-dimensional video features suitable for the following correlation analysis. Next, supervised Multi-View Canonical Correlation Analysis (sMVCCA) is applied to integrate the three kinds of features. Then we can get optimal projections to obtain new visual features, “canonical video features” reflecting the target user's individual preference for a video based on sMVCCA. Furthermore, in our method, we use not only the target user's viewing behavior but also other users' viewing behavior for obtaining the optimal canonical video features of the target user. This unique approach is the biggest contribution of this paper. Finally, by integrating these canonical video features, Support Vector Ordinal Regression with Implicit Constraints (SVORIM) is trained in our method. Consequently, the target user's preference for a video can be estimated by using the trained SVORIM. Experimental results show the effectiveness of our method.

  • A Study on Quality Metrics for 360 Video Communications

    Huyen T. T. TRAN  Cuong T. PHAM  Nam PHAM NGOC  Anh T. PHAM  Truong Cong THANG  

     
    PAPER

    Publicized: 2017/10/16
    Vol: E101-D No:1
    Page(s): 28-36

    360 videos have recently become a popular virtual reality content type. However, a good quality metric for 360 videos is still an open issue. In this work, our goal is to identify appropriate objective quality metrics for 360 video communications. Specifically, fourteen objective quality measures at different processing phases are considered. A subjective test is also conducted, and the relationship between objective quality and subjective quality is investigated. It is found that most of the PSNR-related quality measures are well correlated with subjective quality. However, for evaluating video quality across different contents, a content-based quality metric is needed.
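
    For reference, PSNR, on which the PSNR-related measures in this abstract build, is defined as 10 * log10(MAX^2 / MSE). The Python sketch below computes it for two equally sized 8-bit frames given as flat lists of pixel values; the function name and sample values are illustrative assumptions, and 360-specific variants that weight pixels by their position on the sphere are not shown.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


# Illustrative pixel values only.
print(psnr([100, 100], [110, 90]))
```

    Because PSNR depends only on pixel error, two contents with the same PSNR can look quite different in quality, which is consistent with the abstract's remark that cross-content evaluation needs a content-based metric.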

  • Personal Viewpoint Navigation Based on Object Trajectory Distribution for Multi-View Videos

    Xueting WANG  Kensho HARA  Yu ENOKIBORI  Takatsugu HIRAYAMA  Kenji MASE  

     
    PAPER-Human-computer Interaction

    Publicized: 2017/10/12
    Vol: E101-D No:1
    Page(s): 193-204

    Multi-camera videos with abundant information and high flexibility are useful in a wide range of applications, such as surveillance systems, web lectures, news broadcasting, concerts and sports viewing. Viewers can enjoy an enhanced viewing experience by choosing their own viewpoint through viewing interfaces. However, some viewers may feel annoyed by the need for continual manual viewpoint selection, especially when the number of selectable viewpoints is relatively large. In order to solve this issue, we propose an automatic viewpoint navigation method designed especially for sports. This method focuses on a viewer's personal preference for viewpoint selection, instead of common and professional editing rules. We assume that different trajectory distributions of viewing objects cause a difference in the viewpoint selection according to personal preference. We learn the relationship between the viewer's personal viewpoint-selection tendency and the spatio-temporal game context represented by the objects' trajectories. We compare three methods, based on a Gaussian mixture model, an SVM with a general histogram, and an SVM with a bag-of-words, to seek the best learning scheme for this relationship. The performance of the proposed methods is evaluated by assessing the degree of similarity between the selected viewpoints and the viewers' edited records.

  • Scalable Distributed Video Coding for Wireless Video Sensor Networks

    Hong YANG  Linbo QING  Xiaohai HE  Shuhua XIONG  

     
    PAPER

    Publicized: 2017/10/16
    Vol: E101-D No:1
    Page(s): 20-27

    Wireless video sensor networks face constraints such as the need for low power consumption at sensor nodes, the low computing capacity of nodes, and unstable channel bandwidth. To transmit video of distributed video coding in wireless video sensor networks, we propose an efficient scalable distributed video coding scheme. In this scheme, the scalable Wyner-Ziv frame is based on transmission of different wavelet information, while the Key frame is based on transmission of different residual information. A successive refinement of side information for the Wyner-Ziv and Key frames is also proposed in this scheme. Test results show that both the Wyner-Ziv and Key frames provide four layers of quality and bit-rate scalability, with no increase in encoder complexity.

  • Achievable Rate Regions of Cache-Aided Broadcast Networks for Delivering Content with a Multilayer Structure

    Tetsunao MATSUTA  Tomohiko UYEMATSU  

     
    PAPER-Shannon Theory

    Vol: E100-A No:12
    Page(s): 2629-2640

    This paper deals with a broadcast network with a server and many users. The server has files of content such as music and videos, and each user requests one of these files, where each file consists of some separated layers like a file encoded by a scalable video coding. On the other hand, each user has a local memory, and a part of information of the files is cached (i.e., stored) in these memories in advance of users' requests. By using the cached information as side information, the server encodes files based on users' requests. Then, it sends a codeword through an error-free shared link for which all users can receive a common codeword from the server without error. We assume that the server transmits some layers up to a certain level of requested files at each different transmission rate (i.e., the codeword length per file size) corresponding to each level. In this paper, we focus on the region of tuples of these rates such that layers up to any level of requested files are recovered at users with an arbitrarily small error probability. Then, we give inner and outer bounds on this region.

  • A 197mW 70ms-Latency Full-HD 12-Channel Video-Processing SoC in 16nm CMOS for In-Vehicle Information Systems

    Seiji MOCHIZUKI  Katsushige MATSUBARA  Keisuke MATSUMOTO  Chi Lan Phuong NGUYEN  Tetsuya SHIBAYAMA  Kenichi IWATA  Katsuya MIZUMOTO  Takahiro IRITA  Hirotaka HARA  Toshihiro HATTORI  

     
    PAPER

    Vol: E100-A No:12
    Page(s): 2878-2887

    A 197mW 70ms-latency Full-HD 12-channel video-processing SoC for in-vehicle information systems has been implemented in 16nm CMOS. The SoC integrates 17 video processors of 6 types to operate video processing independently of other processing in CPU/GPU. The synchronous scheme between the video processors achieves 70ms low-latency for driver assistance. The optimized implementation of lossy and lossless video-data compression reduces memory access data by half and power consumption by 20%.

  • Resample-Based Hybrid Multi-Hypothesis Scheme for Distributed Compressive Video Sensing

    Can CHEN  Dengyin ZHANG  Jian LIU  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2017/09/08
    Vol: E100-D No:12
    Page(s): 3073-3076

    The multi-hypothesis prediction technique, which exploits inter-frame correlation efficiently, is widely used in block-based distributed compressive video sensing. To solve the problem of inaccurate prediction in the multi-hypothesis prediction technique at a low sampling rate and to enhance the reconstruction quality of non-key frames, we present a resample-based hybrid multi-hypothesis scheme for block-based distributed compressive video sensing. The innovations in this paper include: (1) multi-hypothesis reconstruction based on measurement reorganization (MR-MH), which integrates side information into the original measurements; and (2) hybrid multi-hypothesis (H-MH) reconstruction, which adaptively mixes multiple multi-hypothesis reconstructions by resampling each reconstruction. Experimental results show that the proposed scheme outperforms the state-of-the-art technique at the same low sampling rate.
