
Keyword Search Result

[Keyword] video analysis (8 hits)

Results 1-8 of 8
  • Local Riesz Pyramid for Faster Phase-Based Video Magnification

    Shoichiro TAKEDA  Megumi ISOGAI  Shinya SHIMIZU  Hideaki KIMATA  

     
    PAPER

    Publicized: 2020/06/22
    Vol: E103-D No:10
    Page(s): 2036-2046

Phase-based video magnification methods can magnify and reveal subtle motion changes that are invisible to the naked eye. In these methods, each frame of a video is decomposed into an image pyramid, and subtle motion changes are then detected as local phase changes with arbitrary orientations at each pixel and each pyramid level. One problem with this process is the long computational time needed to calculate the local phase changes, which makes high-speed video magnification difficult. Recently, a decomposition technique called the Riesz pyramid has been proposed that detects only local phase changes in the dominant orientation. This technique removes the arbitrariness of orientations and lowers the over-completeness, thus achieving high-speed processing. However, as the resolution of the input video increases, a large amount of data must still be processed, again requiring a long computational time. In this paper, we focus on the correlation of local phase changes between adjacent pyramid levels and present a novel decomposition technique called the local Riesz pyramid that enables faster phase-based video magnification by automatically processing the minimum number of sufficient local image areas at several pyramid levels. Through this minimal pyramid processing, our phase-based video magnification method using the local Riesz pyramid achieves good magnification results within a short computational time.
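The local phase at the heart of this line of work comes from the Riesz transform. The following is a minimal numpy sketch of how a single-level Riesz transform and per-pixel dominant-orientation phase could be computed via the FFT; the function names are illustrative, and a real Riesz pyramid would apply this per band-pass level rather than to the raw image.

```python
import numpy as np

def riesz_transform(img):
    """Approximate 2-D Riesz transform via the FFT.

    Frequency response of the two components: (-i*wx/|w|, -i*wy/|w|).
    """
    h, w = img.shape
    wy = np.fft.fftfreq(h).reshape(-1, 1)
    wx = np.fft.fftfreq(w).reshape(1, -1)
    mag = np.sqrt(wx ** 2 + wy ** 2)
    mag[0, 0] = 1.0  # avoid division by zero at the DC bin
    F = np.fft.fft2(img)
    rx = np.real(np.fft.ifft2(F * (-1j * wx / mag)))
    ry = np.real(np.fft.ifft2(F * (-1j * wy / mag)))
    return rx, ry

def local_phase(img):
    """Per-pixel local phase in the dominant orientation."""
    rx, ry = riesz_transform(img)
    q = np.sqrt(rx ** 2 + ry ** 2)   # quadrature (odd) magnitude
    return np.arctan2(q, img)        # phase relative to the even part
```

Magnification would then temporally filter these phase maps and amplify the filtered changes before reconstruction.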

  • Attentive Sequences Recurrent Network for Social Relation Recognition from Video Open Access

    Jinna LV  Bin WU  Yunlei ZHANG  Yunpeng XIAO  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2019/09/02
    Vol: E102-D No:12
    Page(s): 2568-2576

Recently, social relation analysis has received increasing attention, moving from text to image data. However, social relation analysis from video, an important problem, is lacking in the current literature. Two challenges remain: 1) it is hard to learn a satisfactory mapping function from low-level pixels to the high-level social relation space; 2) it is unclear how to efficiently select the most relevant information from noisy and unsegmented video. In this paper, we present an Attentive Sequences Recurrent Network model, called ASRN, to deal with these challenges. First, to exploit multiple clues, we design a Multiple Feature Attention (MFA) mechanism to fuse multiple visual features (i.e., image, motion, body, and face). In this manner, we can generate an appropriate mapping function from low-level video pixels to the high-level social relation space. Second, we design a sequence recurrent network based on a Global and Local Attention (GLA) mechanism. Specifically, an attention mechanism is used in GLA to integrate the global feature with local sequence features and select the sequences most relevant to the recognition task; the GLA module can therefore better deal with noisy and unsegmented video. Finally, extensive experiments on the SRIV dataset demonstrate the effectiveness of our ASRN model.
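The multi-feature attention idea can be sketched in a few lines: score each modality's feature vector with a small learned function, softmax the scores, and take the weighted sum. This is a generic additive-attention stand-in, not the paper's exact MFA parameterization; `w` and `v` stand for learned parameters.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fuse(features, w, v):
    """Fuse several feature vectors with attention weights.

    features: list of (d,) arrays (e.g. image, motion, body, face descriptors)
    w: (d, d) projection, v: (d,) scoring vector -- stand-ins for learned params
    """
    scores = np.array([v @ np.tanh(w @ f) for f in features])
    alpha = softmax(scores)                       # attention over modalities
    fused = sum(a * f for a, f in zip(alpha, features))
    return fused, alpha
```

A sequence model (the recurrent part of ASRN) would consume the fused vector per frame or segment.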

  • Infants' Pain Recognition Based on Facial Expression: Dynamic Hybrid Descriptions

    Ruicong ZHI  Ghada ZAMZMI  Dmitry GOLDGOF  Terri ASHMEADE  Tingting LI  Yu SUN  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2018/04/20
    Vol: E101-D No:7
    Page(s): 1860-1869

The accurate assessment of infants' pain is important for understanding their medical conditions and developing suitable treatment. Pediatric studies have reported that inadequate treatment of infants' pain can cause various neuroanatomical and psychological problems. The fact that infants cannot communicate verbally motivates increasing interest in developing automatic pain assessment systems that provide continuous and accurate pain assessment. In this paper, we propose a new set of pain facial activity features to describe infants' facial expressions of pain. Both dynamic facial texture features and dynamic geometric features are extracted from video sequences and used to classify infants' facial expressions as pain or no pain. For the dynamic analysis of facial expression, we construct a spatiotemporal-domain representation for the texture features and a time-series representation (i.e., a time series of frame-level features) for the geometric features. Multiple facial features are combined through both feature-fusion and decision-fusion schemes to evaluate their effectiveness in infants' pain assessment. Experiments are conducted on video acquired from NICU infants, and the best accuracy of the proposed pain assessment approaches is 95.6%. Moreover, we find that although decision fusion does not outperform feature fusion in accuracy, its False Negative Rate (6.2%) is much lower than that of feature fusion (25%).
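The distinction between the two fusion schemes is simple to state in code. The following toy sketch uses threshold classifiers as stand-ins for the paper's actual (unspecified here) classifiers, purely to show where the combination happens: before classification (feature fusion) versus after (decision fusion).

```python
import numpy as np

# stand-in per-modality classifiers: output in {0, 1}, 1 = "pain"
clf_texture = lambda x: float(x.mean() > 0.5)
clf_geom = lambda x: float(x.mean() > 0.5)

def feature_fusion(texture, geometry, clf):
    """Concatenate the modalities first, then classify once."""
    return clf(np.concatenate([texture, geometry]))

def decision_fusion(texture, geometry, clf_t, clf_g, w=0.5):
    """Classify each modality separately, then combine the scores."""
    return w * clf_t(texture) + (1 - w) * clf_g(geometry)
```

Decision fusion keeps a modality's confident vote alive even when the other modality dilutes the pooled feature vector, which is one plausible reading of the lower False Negative Rate reported above.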

  • A Fully Automatic Player Detection Method Based on One-Class SVM

    Xuefeng BAI  Tiejun ZHANG  Chuanjun WANG  Ahmed A. ABD EL-LATIF  Xiamu NIU  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E96-D No:2
    Page(s): 387-391

Player detection is an important part of sports video analysis. Over the past few years, several learning-based detection methods using various supervised two-class techniques have been presented. Although they obtain satisfactory results, a great deal of manual labor is needed to construct the training set. To overcome this drawback, this letter proposes a player detection method based on a one-class SVM (OCSVM) using automatically generated training data. The proposed method is evaluated on several video clips captured from World Cup 2010, and experimental results show that our approach achieves a high detection rate while keeping the cost of training set construction low.
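The one-class idea is that the model sees only positive (player-like) samples and flags everything far from them. As a dependency-free sketch, the following uses a Mahalanobis-distance Gaussian model in place of an OCSVM (scikit-learn's `OneClassSVM` would be the direct analogue); the synthetic training vectors stand in for features sampled automatically, e.g. from color patches left after masking out the dominant field color.

```python
import numpy as np

class OneClassGaussian:
    """Mahalanobis-distance one-class model, a simple stand-in for OCSVM."""

    def fit(self, X, quantile=0.95):
        self.mu = X.mean(axis=0)
        self.prec = np.linalg.inv(np.cov(X.T) + 1e-6 * np.eye(X.shape[1]))
        self.thresh = np.quantile(self._dist(X), quantile)
        return self

    def _dist(self, X):
        diff = X - self.mu
        return np.einsum('ij,jk,ik->i', diff, self.prec, diff)

    def predict(self, X):
        return self._dist(X) <= self.thresh  # True = in-class (player-like)

# training vectors generated automatically (no manual labels) -- here synthetic
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, (200, 3))
model = OneClassGaussian().fit(train)
```

The `quantile` parameter plays the role of OCSVM's nu: it bounds the fraction of training samples treated as outliers.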

  • Commercial Shot Classification Based on Multiple Features Combination

    Nan LIU  Yao ZHAO  Zhenfeng ZHU  Rongrong NI  

     
    LETTER-Image Processing and Video Processing

    Vol: E93-D No:9
    Page(s): 2651-2655

This paper presents a commercial shot classification scheme that combines well-designed visual and textual features to automatically detect TV commercials. To identify the inherent differences between commercials and general programs, a special mid-level textual descriptor is proposed, aiming to capture the spatio-temporal properties of the video text typical of commercials. In addition, we introduce an ensemble-learning-based combination method, named Co-AdaBoost, to interactively exploit the intrinsic relations between the visual and textual features employed.
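Co-AdaBoost's exact training procedure is not given in this abstract; as a generic stand-in, the sketch below combines a visual-view and a textual-view classifier with AdaBoost-style weights derived from each view's error rate, which conveys how an ensemble can let the two feature families vote with learned influence.

```python
import numpy as np

def adaboost_alpha(err):
    """AdaBoost weight for a weak classifier with error rate `err`."""
    err = np.clip(err, 1e-9, 1 - 1e-9)
    return 0.5 * np.log((1 - err) / err)

def combine_views(visual_pred, textual_pred, visual_err, textual_err):
    """Weighted vote of the two views (+1 = commercial, -1 = general program)."""
    a_v = adaboost_alpha(visual_err)
    a_t = adaboost_alpha(textual_err)
    return np.sign(a_v * visual_pred + a_t * textual_pred)
```

With unequal error rates, the more reliable view dominates disagreements, which is the basic benefit of weighting over a plain majority vote.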

  • A Sieving ANN for Emotion-Based Movie Clip Classification

    Saowaluk C. WATANAPA  Bundit THIPAKORN  Nipon CHAROENKITKARN  

     
    PAPER-Biocybernetics, Neurocomputing

    Vol: E91-D No:5
    Page(s): 1562-1572

Effective classification and analysis of semantic content are very important for the content-based indexing and retrieval of video databases. Our research attempts to classify movie clips into three groups of commonly elicited emotions, namely excitement, joy, and sadness, based on a set of abstract-level semantic features extracted from the film sequence. In particular, these features consist of six visual and audio measures grounded in artistic film theory. A unique sieving-structured neural network is proposed as the classifying model owing to its robustness. The performance of the proposed model is tested on 101 movie clips excerpted from 24 award-winning and well-known Hollywood feature films. The experimental result, a 97.8% correct classification rate measured against collected human judgments, indicates the great potential of abstract-level semantic features as an engineered tool for video-content retrieval and indexing.

  • A Visual Attention Based Region-of-Interest Determination Framework for Video Sequences

    Wen-Huang CHENG  Wei-Ta CHU  Ja-Ling WU  

     
    PAPER-Image Processing and Multimedia Systems

    Vol: E88-D No:7
    Page(s): 1578-1586

This paper presents a framework for automatic video region-of-interest determination based on a visual attention model. We view this work as a preliminary step towards high-level semantic video analysis. To address this challenging issue, we make use of video attention features together with knowledge of computational media aesthetics. The three types of visual attention features we use are intensity, color, and motion. Following aesthetic principles, these features are combined according to camera motion type on the basis of a newly proposed video analysis unit, the frame-segment. We conduct subjective experiments on several kinds of video data and demonstrate the effectiveness of the proposed framework.
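A minimal version of this pipeline normalizes the three feature maps, combines them with weights that depend on the camera motion type, and reads off a region of interest from the combined saliency map. The weights below are illustrative placeholders, not the paper's learned or aesthetically derived values.

```python
import numpy as np

def normalize(m):
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def saliency(intensity, color, motion, camera_moving):
    """Combine attention feature maps; the weighting switches with camera
    motion, echoing the camera-motion-dependent combination (weights are
    illustrative)."""
    w = (0.2, 0.2, 0.6) if camera_moving else (0.4, 0.4, 0.2)
    maps = [normalize(m) for m in (intensity, color, motion)]
    return sum(wi * mi for wi, mi in zip(w, maps))

def roi_box(sal, frac=0.8):
    """Bounding box of pixels above `frac` of the saliency peak."""
    ys, xs = np.where(sal >= frac * sal.max())
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

Smoothing the map and tracking the box across a frame-segment would be the natural next steps.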

  • PanoramaExcerpts: Video Cataloging by Automatic Synthesis and Layout of Panoramic Images

    Yukinobu TANIGUCHI  Akihito AKUTSU  Yoshinobu TONOMURA  

     
    INVITED PAPER-Image Processing, Image Pattern Recognition

    Vol: E83-D No:12
    Page(s): 2039-2046

    Browsing is an important function supporting efficient access to relevant information in video archives. In this paper, we present PanoramaExcerpts -- a video browsing interface that shows a catalogue of two types of video icons: panoramic and keyframe icons. A panoramic icon is automatically synthesized from a video segment taken with camera pan or tilt using a camera parameter estimation technique. One keyframe icon is extracted for each shot to supplement the panoramic icons. A panoramic icon represents the entire visible contents of a scene extended with a camera pan or tilt, which is difficult to represent using a single keyframe. A graphical representation, called camera-work trajectory, is also proposed to show the direction and the speed of camera operation. For the automatic generation of PanoramaExcerpts, we propose an approach to integrate the following: (a) a shot-change detection method; (b) a method for locating segments that contain smooth camera operations; (c) a layout method for packing icons in a space-efficient manner. In this paper, we mainly describe (b) and (c) with experimental results.