Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Computer-aided detection (CADe) and diagnosis (CAD) has been a rapidly growing, active area of research in medical imaging. Machine learning (ML) plays an essential role in CAD, because objects such as lesions and organs may not be represented accurately by a simple equation; thus, medical pattern recognition essentially requires “learning from examples.” One of the most popular uses of ML is the classification of objects such as lesion candidates into certain classes (e.g., abnormal or normal, and lesions or non-lesions) based on input features (e.g., contrast and area) obtained from segmented lesion candidates. The task of ML is to determine “optimal” boundaries for separating classes in the multi-dimensional feature space formed by the input features. ML algorithms for classification include linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), multilayer perceptrons, and support vector machines (SVM). Recently, pixel/voxel-based ML (PML) emerged in medical image processing/analysis, which uses pixel/voxel values in images directly, instead of features calculated from segmented lesions, as input information; thus, feature calculation or segmentation is not required. In this paper, ML techniques used in CAD schemes for detection and diagnosis of lung nodules in thoracic CT and for detection of polyps in CT colonography (CTC) are surveyed and reviewed.
Ken'ichi MOROOKA Masahiko NAKAMOTO Yoshinobu SATO
This paper reviews methods for computer assisted medical intervention using statistical models and machine learning technologies, which would be particularly useful for representing prior information of anatomical shape, motion, and deformation to extrapolate intraoperative sparse data as well as surgeons' expertise and pathology to optimize interventions. Firstly, we present a review of methods for recovery of static anatomical structures by only using intraoperative data without any preoperative patient-specific information. Then, methods for recovery of intraoperative motion and deformation by combining intraoperative sparse data with preoperative patient-specific stationary data are reviewed, followed by a survey of articles that incorporated biomechanics. Furthermore, articles that addressed the use of statistical models for optimization of interventions are reviewed. Finally, we conclude the survey by describing the future perspective.
Amir H. FORUZAN Yen-Wei CHEN Reza A. ZOROOFI Akira FURUKAWA Yoshinobu SATO Masatoshi HORI Noriyuki TOMIYAMA
In this paper, we present an algorithm to segment the liver in low-contrast CT images. As the first step of our algorithm, we define a search range for the liver boundary. Then, the EM algorithm is utilized to estimate parameters of a 'Gaussian Mixture' model that conforms to the intensity distribution of the liver. Using the statistical parameters of the intensity distribution, we introduce a new thresholding technique to classify image pixels. We assign a distance feature vector to each pixel and segment the liver by a K-means clustering scheme. This initial boundary of the liver is conditioned by the Fourier transform. Then, a Geodesic Active Contour algorithm uses the boundaries to find the final surface. The novelty in our method is the proper selection and combination of sub-algorithms so as to find the border of an object in a low-contrast image. The number of parameters in the proposed method is low and the parameters have a low range of variations. We applied our method to 30 datasets including normal and abnormal cases of low-contrast/high-contrast images and it was extensively evaluated both quantitatively and qualitatively. The minimum Dice similarity measure among the results is 0.89. Assessment of the results demonstrates the potential of the proposed method for segmentation in low-contrast images.
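The Dice similarity measure quoted in the abstract above compares an automatic segmentation with a reference one. As a minimal sketch, assuming both are binary masks flattened to equal-length sequences of 0/1 (the function name and the toy masks are illustrative and not taken from the paper):

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity 2|A∩B| / (|A| + |B|) of two flat binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of elements")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy example: 3 pixels segmented automatically, 2 manually, 2 in common.
automatic = [0, 1, 1, 1, 0, 0]
manual = [0, 1, 1, 0, 0, 0]
print(dice_coefficient(automatic, manual))  # 2*2 / (3+2) = 0.8
```

A value of 1 means perfect overlap and 0 means none, so a minimum of 0.89 over all cases indicates that even the worst case overlaps closely with the reference.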
Masahiro ODA Takayuki KITASAKA Kazuhiro FURUKAWA Osamu WATANABE Takafumi ANDO Hidemi GOTO Kensaku MORI
Crohn's disease commonly affects the small and large intestines. Its symptoms include ulcers and intestinal stenosis, and its diagnosis is currently performed using an endoscope. However, because the endoscope cannot pass through the stenosed parts of the intestines, diagnosis of the entire intestines is difficult. A CT image-based method is expected to become an alternative way for the diagnosis of Crohn's disease because it enables observation of the entire intestine even if stenosis exists. To achieve efficient CT image-based diagnosis, diagnostic-aid by computers is required. This paper presents an automated detection method of the surface of ulcers in the small and large intestines from fecal tagging CT images. Ulcers cause rough surfaces on the intestinal wall and consist of small convex and concave (CC) regions. We detect them by blob and inverse-blob structure enhancement filters. A roughness value is utilized to reduce the false positives of the detection results. Many CC regions are concentrated in ulcers. The roughness value evaluates the concentration ratio of the detected regions. Detected regions with low roughness values are removed by a thresholding process. The thickness of the intestinal lumen and the CT values of the surrounding tissue of the intestinal lumen are also used to reduce false positives. Experimental results using ten cases of CT images showed that our proposed method detects 70.6% of ulcers with 12.7 FPs/case. The proposed method detected most of the ulcers.
Tatsuya KON Takashi OBI Hideaki TASHIMA Nagaaki OHYAMA
Parametric images can help investigate disease mechanisms and vital functions. To estimate parametric images, it is necessary to obtain the tissue time activity curves (tTACs), which express temporal changes of tracer activity in human tissue. In general, the tTACs are calculated from each voxel's value of the time sequential PET images estimated from dynamic PET data. Recently, spatio-temporal PET reconstruction methods have been proposed in order to take into account the temporal correlation within each tTAC. Such spatio-temporal algorithms are generally quite computationally intensive. On the other hand, typical algorithms such as the preconditioned conjugate gradient (PCG) method still do not provide good accuracy in estimation. To overcome these problems, we propose a new spatio-temporal reconstruction method based on the dynamic row-action maximum-likelihood algorithm (DRAMA). As the original algorithm does, the proposed method takes into account the noise propagation, but it achieves much faster convergence. Performance of the method is evaluated with digital phantom simulations and it is shown that the proposed method requires only a few reconstruction processes, thereby remarkably reducing the computational cost required to estimate the tTACs. The results also show that the tTACs and parametric images from the proposed method have better accuracy.
Yuichiro TAJIMA Kinya FUDANO Koichi ITO Takafumi AOKI
This paper presents a fast and accurate volume correspondence matching method using 3D Phase-Only Correlation (POC). The proposed method employs (i) a coarse-to-fine strategy using multi-scale volume pyramids for correspondence search and (ii) high-accuracy POC-based local block matching for finding dense volume correspondence with sub-voxel displacement accuracy. This paper also proposes its GPU implementation to achieve fast and practical computation of volume registration. Experimental evaluation shows that the proposed approach exhibits higher accuracy and lower computational cost than conventional methods. We also demonstrate that the GPU implementation of the proposed method can align two volume data sets in several seconds, which is suitable for practical use in image-guided radiation therapy.
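To illustrate the principle behind Phase-Only Correlation, the sketch below estimates an integer circular shift between two 1D signals. It is a deliberately reduced illustration, not the 3D, sub-voxel, GPU-based method of the paper: the naive DFT, the function names, and the test signal are all assumptions made for this example.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def poc_shift(f, g, eps=1e-12):
    """Estimate the integer d such that g[n] = f[n - d] (circularly)."""
    F, G = dft(f), dft(g)
    cross = [a.conjugate() * b for a, b in zip(F, G)]
    # Discard the magnitude and keep only the phase of the cross spectrum;
    # bins with (near-)zero magnitude carry no phase and are zeroed.
    phase = [c / abs(c) if abs(c) > eps else 0 for c in cross]
    poc = [v.real for v in dft(phase, inverse=True)]
    peak = max(range(len(poc)), key=poc.__getitem__)
    n = len(poc)
    # Map peak positions in the upper half to negative shifts.
    return peak if peak <= n // 2 else peak - n


signal = [0, 1, 3, 7, 2, 0, 0, 0]
delayed = signal[-2:] + signal[:-2]   # signal delayed by 2 samples
advanced = signal[1:] + signal[:1]    # signal advanced by 1 sample
print(poc_shift(signal, delayed))     # 2
print(poc_shift(signal, advanced))    # -1
```

Because only phase is retained, the correlation surface for a pure shift is a sharp peak at the displacement; sub-sample accuracy is usually obtained by fitting a peak model around that maximum, which this sketch omits.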
Wei ZHAO Rui XU Yasushi HIRANO Rie TACHIBANA Shoji KIDO Narufumi SUGANUMA
This paper describes a computer-aided diagnosis (CAD) method to classify pneumoconiosis on HRCT images. In Japan, pneumoconiosis is divided into 4 types according to the density of nodules: Type 1 (no nodules), Type 2 (few small nodules), Type 3-a (numerous small nodules) and Type 3-b (numerous small nodules and presence of large nodules). Because most pneumoconiotic nodules are small and irregularly shaped, only a few nodules can be detected by conventional nodule extraction methods, which affects the classification of pneumoconiosis. To improve the performance of nodule extraction, we propose a filter based on analysis of the eigenvalues of the Hessian matrix. The classification of pneumoconiosis is performed in the following steps. Firstly, the large nodules are extracted and cases of Type 3-b are recognized. Secondly, for the remaining cases, the small nodules are detected and false positives are eliminated. Thirdly, we adopt a bag-of-features-based method to generate input vectors for a support vector machine (SVM) classifier. Finally, cases of Type 1, 2 and 3-a are classified. The proposed method was evaluated on 175 HRCT scans of 112 subjects. The average classification accuracy is 90.6%. The experimental results show that our method would be helpful for classifying pneumoconiosis on HRCT.
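The idea of analysing Hessian eigenvalues to pick out nodule-like structures can be sketched in 2D. The abstract does not give the filter's formula, so the response function below is a hypothetical stand-in, and the finite-difference Hessian, the function names, and the toy images are assumptions for illustration only (the paper works on 3D HRCT data).

```python
import math


def hessian_eigenvalues(img, y, x):
    """Eigenvalues (l1 <= l2) of the 2D Hessian at an interior pixel,
    using central finite differences."""
    ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    mean = (ixx + iyy) / 2.0
    root = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    return mean - root, mean + root


def blob_response(img, y, x):
    """Hypothetical bright-blob measure: nonzero only when both
    eigenvalues are negative, i.e. intensity falls off in every direction."""
    _, l2 = hessian_eigenvalues(img, y, x)
    return -l2 if l2 < 0 else 0.0


blob = [[0, 0, 0, 0, 0],
        [0, 1, 2, 1, 0],
        [0, 2, 4, 2, 0],
        [0, 1, 2, 1, 0],
        [0, 0, 0, 0, 0]]
ridge = [[0, 0, 4, 0, 0] for _ in range(5)]  # vertical line, not a blob

print(blob_response(blob, 2, 2))   # 4.0
print(blob_response(ridge, 2, 2))  # 0.0
```

The line-like structure is suppressed because one of its eigenvalues is zero, which is the property that lets such filters separate small nodules from vessels and other elongated structures.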
Rui XU Yasushi HIRANO Rie TACHIBANA Shoji KIDO
Computer-aided diagnosis (CAD) systems for diffuse lung diseases (DLD) are required to help radiologists read high-resolution computed tomography (HRCT) scans. An important task in developing such CAD systems is to make computers automatically recognize typical pulmonary textures of DLD on HRCT. In this work, we propose a bag-of-features based method for the classification of six kinds of DLD patterns: consolidation (CON), ground-glass opacity (GGO), honeycombing (HCM), emphysema (EMP), nodular (NOD) and normal tissue (NOR). In order to successfully apply the bag-of-features based method to this task, we focused on designing suitable local features and the classifier. Considering that the pulmonary textures are characterized by not only CT values but also shapes, we propose a set of local features based on statistical measures calculated from both CT values and eigenvalues of Hessian matrices. Additionally, we designed a support vector machine (SVM) classifier by optimizing parameters related to both kernels and the soft-margin penalty constant. We collected 117 HRCT scans from 117 subjects for experiments. Three experienced radiologists were asked to review the data, and the regions on which they agreed that typical textures existed were used to generate 3009 3D volumes-of-interest (VOIs) with the size of 32
Hiroyuki NOZAKA Tomisato MIURA Zhongxi ZHENG
Objective: Virtual slides are high-magnification whole digital images of histopathological tissue sections. The existing virtual slide system is optimized for scanning flat and smooth plane slides such as histopathological paraffin-embedded tissue sections, but is unsuitable for scanning irregular plane slides such as cytological smear slides. This study aims to develop a virtual slide system suitable for cytopathology slide scanning and to evaluate the effectiveness of multi-focus image fusion (MF) in cytopathological diagnosis. Study Design: We developed a multi-layer virtual slide scanning system with MF technology. Tumors for this study were collected from 21 patients diagnosed with primary breast cancer. After surgical extraction, smear slides for cytopathological diagnosis were prepared by the conventional stamp method, fine needle aspiration method (FNA), and tissue washing method. The stamp slides were fixed in 95% ethanol. FNA and tissue washing samples were fixed in CytoRich RED Preservative Fluid, a liquid-based cytopathology (LBC) fixative. These slides were stained with Papanicolaou stain and scanned by the virtual slide system. To evaluate the suitability of MF technology in cytopathological diagnosis, we compared single focus (SF) virtual slides with MF virtual slides. Cytopathological evaluation was carried out by 5 pathologists and cytotechnologists. Results: The virtual slide system with MF provided better results than the conventional SF virtual slide system with regard to viewing inside cell clusters and image file size. Liquid-based cytology was more suitable than the stamp method for virtual slides with MF. Conclusion: The virtual slide system with MF is a useful technique for digitization in cytopathology, and this technology could be applied to tele-cytology and e-learning by virtual slide system.
Akinobu SHIMIZU Takuya NARIHIRA Hidefumi KOBATAKE Daisuke FURUKAWA Shigeru NAWANO Kenji SHINOZAKI
This paper presents an ensemble learning algorithm for liver tumour segmentation from a CT volume in the form of U-Boost and extends the loss functions to improve performance. Five segmentation algorithms trained by the ensemble learning algorithm with different loss functions are compared in terms of error rate and Jaccard Index between the extracted regions and true ones.
Naoki KAMIYA Xiangrong ZHOU Huayue CHEN Chisako MURAMATSU Takeshi HARA Hiroshi FUJITA
Our purpose in this study is to develop a scheme to segment the rectus abdominis muscle region in X-ray CT images. We propose a new muscle recognition method based on the shape model. In this method, three steps are included in the segmentation process. The first is to generate a shape model for representing the rectus abdominis muscle. The second is to recognize anatomical feature points corresponding to the origin and insertion of the muscle, and the third is to segment the rectus abdominis muscles using the shape model. We generated the shape model from 20 CT cases and tested the model to recognize the muscle in 10 other CT cases. The average value of the Jaccard similarity coefficient (JSC) between the manually and automatically segmented regions was 0.843. The results suggest the validity of the model-based segmentation for the rectus abdominis muscle.
Qing LIU Tomohiro ODAKA Jousuke KUROIWA Hisakazu OGURA
An artificial fish swarm algorithm (AFSA) for solving symbolic regression problems is introduced in this paper. In the proposed AFSA, artificial fish (AF) individuals represent candidate solutions, which are encoded by the gene expression scheme in GEP. For evaluating AF individuals, a penalty-based fitness function, in which the node number of the parse tree is considered to be a constraint, was designed in order to obtain a solution expression that not only fits the given data well but is also compact. A number of important concepts are defined, including distance, partners, congestion degree, and feature code. Based on the above concepts, we designed four behaviors, namely, randomly moving behavior, preying behavior, following behavior, and avoiding behavior, and present their respective formalized descriptions. The exhaustive simulation results demonstrate that the proposed algorithm can not only obtain a high-quality solution expression but also provide remarkable robustness and quick convergence.
Alberto CALIXTO SIMON Saul E. POMARES HERNANDEZ Jose Roberto PEREZ CRUZ Pilar GOMEZ-GIL Khalil DRIRA
Communication-induced checkpointing (CIC) has two main advantages: first, it allows processes in a distributed computation to take asynchronous checkpoints, and second, it avoids the domino effect. To achieve these, CIC algorithms piggyback information on the application messages and take forced local checkpoints when they recognize potentially dangerous patterns. The main disadvantages of CIC algorithms are the amount of overhead per message and the induced storage overhead. In this paper we present a communication-induced checkpointing algorithm called Scalable Fully-Informed (S-FI) that attacks the problem of message overhead. For this, our algorithm modifies the Fully-Informed algorithm by integrating it with the immediate dependency principle. The S-FI algorithm was simulated and the results show that the algorithm is scalable, since the message overhead grows sublinearly as the number of processes and/or the message density increases.
Identification of early aspects is a critical problem in aspect-oriented requirement engineering. However, crosscutting concerns are represented in various ways, which makes the identification difficult. To address the problem, this paper proposes the AspectQuery method based on a goal model. We analyze four kinds of goal decomposition models, then summarize the main factors in the identification of crosscutting concerns and derive the identification rules based on a goal model. A goal is a crosscutting concern when it satisfies one of the following conditions: i) the goal contributes to realizing a soft-goal; ii) a parent goal of the goal is a candidate crosscutting concern; iii) the goal has at least two parent goals. AspectQuery includes four steps: building the goal model, transforming the goal model, identifying the crosscutting concerns by identification rules, and composing the crosscutting concerns with the goals affected by them. We illustrate the AspectQuery method through a case study (a ticket booking management system). The results show the effectiveness of AspectQuery in identifying crosscutting concerns in the requirement phase.
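The three identification rules stated in the abstract can be read as a small fixed-point computation over the goal graph. The sketch below is one possible reading, not the authors' implementation: the data layout (a dictionary from each goal to its parent goals plus a set of soft-goal contributors), the function name, and the ticket-booking goal names are all hypothetical.

```python
def find_crosscutting(parents, softgoal_contributors):
    """Return the goals flagged as crosscutting concerns.

    parents: dict mapping each goal to the list of its parent goals.
    softgoal_contributors: set of goals that contribute to a soft-goal.
    """
    # Rules i) and iii): direct evidence on the goal itself.
    candidates = {goal for goal, ps in parents.items()
                  if goal in softgoal_contributors or len(ps) >= 2}
    # Rule ii): propagate from candidate parents until nothing changes.
    changed = True
    while changed:
        changed = False
        for goal, ps in parents.items():
            if goal not in candidates and any(p in candidates for p in ps):
                candidates.add(goal)
                changed = True
    return candidates


# Hypothetical fragment of a ticket booking goal model.
goal_parents = {
    "BookTicket": [],
    "SelectSeat": ["BookTicket"],
    "Pay": ["BookTicket"],
    "LogAction": ["SelectSeat", "Pay"],   # rule iii: two parents
    "Encrypt": ["Pay"],                   # rule i: serves a soft-goal
    "UseTLS": ["Encrypt"],                # rule ii: parent is a candidate
}
print(sorted(find_crosscutting(goal_parents, {"Encrypt"})))
# ['Encrypt', 'LogAction', 'UseTLS']
```

Rule ii) is applied repeatedly because a goal flagged through its parent can in turn flag its own children.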
Hsu-Kuang CHANG King-Chu HUNG I-Chang JOU
Compiling documents in extensible markup language (XML) increasingly requires access to data services which provide both rapid response and the precise use of search engines. Efficient data service should be based on a skillful representation that can support low complexity and high precision search capabilities. In this paper, a novel complete path representation (CPR) associated with a modified inverted index is presented to provide efficient XML data services, where queries can be versatile in terms of predicates. CPR can completely preserve hierarchical information, and the new index is used to save semantic information. The CPR approach can provide template-based indexing for fast data searches. An experiment is also conducted for the evaluation of the CPR approach.
Hongjun LIU Baokang ZHAO Xiaofeng HU Dan ZHAO Xicheng LU
Root cause analysis of BGP updates is the key to debugging and troubleshooting BGP routing problems. However, it is a challenge to precisely diagnose the cause and the origin of routing instability. In this paper, we are the first to distinguish link failure events from policy change events based on BGP updates from single vantage points by analyzing the relationship between the closed loops formed through intersecting all the transient paths during instability and the length variation of the stable paths after instability. Once link failure events are recognized, their origins are precisely inferred with 100% accuracy. Simulation shows that our method effectively distinguishes link failure events from link restoration events and policy related events, and reduces the size of the candidate set of origins.
Jinfeng GAO Bilan ZHU Masaki NAKAGAWA
The paper describes how a robust and compact on-line handwritten Japanese text recognizer was developed for deployment on hand-held devices by compressing each component of an integrated text recognition system, including an SVM classifier to evaluate segmentation points, an on-line and off-line combined character recognizer, a linguistic context processor, and a geometric context evaluation module. Selecting an elastic-matching based on-line recognizer and compressing MQDF2 via a combination of LDA, vector quantization and data type transformation have contributed to building a remarkably small yet robust recognizer. The compact text recognizer covering 7,097 character classes requires only about 15 MB of memory to keep 93.11% accuracy on horizontal text lines extracted from the TUAT Kondate database. Compared with the original full-scale Japanese text recognizer, the memory size is reduced from 64.1 MB to 14.9 MB while the accuracy loss is only 0.5%, from 93.6% to 93.11%. The method is scalable, so even systems of less than 11 MB or less than 6 MB still retain 92.80% or 90.02% accuracy, respectively.
Sayaka SHIOTA Kei HASHIMOTO Yoshihiko NANKAKU Keiichi TOKUDA
This paper proposes an acoustic modeling technique based on the Bayesian framework using multiple model structures for speech recognition. The aim of the Bayesian approach is to obtain good prediction of observation by marginalizing all variables related to generative processes. Although the effectiveness of marginalizing model parameters was recently reported in speech recognition, most of these systems use only “one” model structure, e.g., topologies of HMMs, the number of states and mixtures, types of state output distributions, and parameter tying structures. However, this is insufficient to represent a true model distribution, because a family of such models usually does not include a true distribution in most practical cases. One solution to this problem is to use multiple model structures. Although several approaches using multiple model structures have already been proposed, the consistent integration of multiple model structures based on the Bayesian approach has not been seen in speech recognition. This paper focuses on integrating multiple phonetic decision trees based on the Bayesian framework in HMM based acoustic modeling. The proposed method is derived from a new marginal likelihood function which includes the model structures as a latent variable in addition to HMM state sequences and model parameters, and the posterior distributions of these latent variables are obtained using the variational Bayesian method. Furthermore, to improve the optimization algorithm, the deterministic annealing EM (DAEM) algorithm is applied to the training process. The proposed method effectively utilizes multiple model structures, especially in the early stage of training, and this leads to better predictive distributions and improvement of recognition performance.
During the production of speech signals, the vowel onset point is an important event containing important information for many speech processing tasks, such as consonant-vowel unit recognition and speech end-points detection. In order to realize accurate automatic detection of vowel onset points, this paper proposes a reliable method using the energy characteristics of homomorphic filtered spectral peaks. The homomorphic filtering helps to separate the slowly varying vocal tract system characteristics from the rapidly fluctuating excitation characteristics in the cepstral domain. The distinct vocal tract shape related to vowels is obtained and the peaks in the estimated vocal tract spectrum provide accurate and stable information for VOP detection. Performance of the proposed method is compared with the existing method which uses the combination of evidence from the excitation source, spectral peaks, and modulation spectrum energies. The detection rate with different time resolutions, together with the missing rate and spurious rate, are used for comprehensive evaluation of the performance on continuous speech taken from the TIMIT database. The detection accuracy of the proposed method is 74.14% for ±10 ms resolution and it increases to 96.33% for ±40 ms resolution with 3.67% missing error and 4.14% spurious error, much better than the results obtained by the combined approach at each specified time resolution, especially the higher resolutions of ±10
Lin-bo LUO Jun CHEN Sang-woo AN Chang-shuai WANG Jong-joo PARK Ying-chun LI Jong-wha CHONG
In low-light conditions, images taken by phone cameras usually have too much noise, while images taken using a flash have a high signal-to-noise ratio (SNR) but look unnatural. This paper proposes a novel imaging method using flash/no-flash image pairs. By transferring the natural tone of the former to the latter, the resulting image has a high SNR and maintains a natural appearance. For real-time implementation, we use two preview images, which are taken with and without flash, to estimate the transformation function in advance. Then we use this function to adjust the tone of the image captured with flash in real time. Thus, the method does not require a frame memory and is suitable for cell phone cameras.
Hyuk-Jun LEE Seung-Chul KIM Eui-Young CHUNG
A packet memory stores packets in internet routers and it requires typically RTT
In this paper, we propose an improved face clustering method using a weighted graph-based approach. We combine two parameters as the weight of a graph to improve clustering performance. One is average similarity, which is calculated with two constraints of geometric and symmetric properties, and the other is a newly proposed parameter called the orientation matching ratio, which is calculated from orientation analysis for matched keypoints in the face region. According to the results of face clustering for several datasets, the proposed method shows improved results compared to the previous method.
The emerging High Efficiency Video Coding (HEVC) standard attempts to improve the coding efficiency by a factor of two over H.264/AVC through the use of new compression tools with high computational complexity. Although multiple-directional prediction is one of the features contributing to the improved compression efficiency, the computational complexity for prediction increases significantly. This paper presents an early uni-directional prediction decision algorithm. The proposed algorithm takes advantage of the property of HEVC that it supports a deep quad-tree block structure. Statistical observation shows that the correlation of prediction direction among different blocks which share the same area is very high. Based on this observation, the mode of the current block is determined early according to the mode of upper blocks. Bi-directional prediction is not performed when the upper block is encoded in the uni-directional prediction mode. A simulation shows that it reduces ME operation time by about 22.7% with a marginal drop in compression efficiency.
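The early decision rule described above amounts to pruning the set of prediction modes that are searched for a block once the enclosing (upper) block in the quad-tree is known to be uni-directional. A minimal sketch of that pruning follows; the mode labels and the function name are hypothetical and do not correspond to identifiers in any HEVC reference software.

```python
UNI_MODES = ["uni_L0", "uni_L1"]   # hypothetical labels for the two uni-directional modes
BI_MODE = "bi"                     # hypothetical label for bi-directional prediction


def modes_to_evaluate(upper_block_mode):
    """Prediction modes to search for the current block.

    upper_block_mode is the mode chosen for the enclosing block in the
    quad-tree, or None when the current block has no upper block.
    """
    if upper_block_mode in UNI_MODES:
        # Early decision: skip the bi-directional search entirely.
        return list(UNI_MODES)
    return UNI_MODES + [BI_MODE]


print(modes_to_evaluate("uni_L0"))  # ['uni_L0', 'uni_L1']
print(modes_to_evaluate("bi"))      # ['uni_L0', 'uni_L1', 'bi']
print(modes_to_evaluate(None))      # ['uni_L0', 'uni_L1', 'bi']
```

The time saving comes from the motion estimation work that is never started for the skipped mode, at the cost of occasionally missing a block where bi-directional prediction would have been better.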
Ran LI Zong-Liang GAN Zi-Guan CUI Xiu-Chang ZHU
Novel joint motion-compensated interpolation using eight-neighbor block motion vectors (8J-MCI) is presented. The proposed method uses bi-directional motion estimation (BME) to obtain the motion vector field of the interpolated frame and adopts motion vectors of the interpolated block and its 8-neighbor blocks to jointly predict the target block. Since the smoothness of the motion vector field makes the motion vectors of 8-neighbor blocks quite close to the true motion vector of the interpolated block, the proposed algorithm has better fault tolerance than traditional ones. Experiments show that the proposed algorithm outperforms the motion-aligned auto-regressive algorithm (MAAR, one of the state-of-the-art frame rate up-conversion (FRUC) schemes) in terms of the average PSNR for the test image sequence and offers better subjective visual quality.
Qingbo WU Linfeng XU Zhengning WANG
In this letter, we propose a novel intra prediction coding scheme for H.264/AVC. Based on our proposed minimum distance prediction (MDP) scheme, the optimal reference samples for predicting the current pixel can be adaptively updated corresponding to different video contents. The experimental results show that up to 2 dB and 1 dB coding gains can be achieved with the proposed method for QCIF and CIF sequences respectively.
Rong WANG Zhiliang WANG Xirong MA
For the problem of Indoor Home Scene Classification, this paper proposes the BOW Model of Local Feature Information Gain. The experimental results show that not only is the performance improved but the computation is also reduced. Consequently, this method outperforms the state-of-the-art approach.
In object tracking, a recent trend is to use the “Tracking by Detection” technique, which trains a discriminative online classifier to separate objects from the background. However, incorrect updating of the online classifier and insufficient features used during online learning often lead to drift problems. In this work we propose an online random fern classifier with a simple but effective compressive feature in a framework integrating the online classifier, the optical-flow tracker and an update model. The compressive feature is a random projection from a high-dimensional multi-scale image feature space to a low-dimensional representation by a sparse measurement matrix, which is expected to contain more information. An update model is proposed to detect tracker failure, correct the tracker result and constrain the updating of the online classifier, thus reducing the chance of wrong updating in online training. Our method runs in real time and the experimental results show performance improvement compared to other state-of-the-art approaches on several challenging video clips.
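A random projection through a sparse measurement matrix can be sketched as follows. The abstract does not specify how the matrix is built, so this example assumes a common construction in which each entry is +sqrt(s) or -sqrt(s) with probability 1/(2s) each and zero otherwise; the sparsity parameter `s`, the seed, the function names, and the toy feature vector are all illustrative assumptions.

```python
import math
import random


def sparse_measurement_matrix(rows, cols, s=3, seed=0):
    """Sparse random matrix stored row-wise as {column: weight} dicts.

    Assumed construction: entries are +sqrt(s) or -sqrt(s) with
    probability 1/(2s) each, and zero with probability 1 - 1/s.
    """
    rng = random.Random(seed)
    scale = math.sqrt(s)
    matrix = []
    for _ in range(rows):
        row = {}
        for j in range(cols):
            u = rng.random()
            if u < 1 / (2 * s):
                row[j] = scale
            elif u < 1 / s:
                row[j] = -scale
        matrix.append(row)
    return matrix


def project(matrix, x):
    """Compressive feature v = R x, touching only the nonzero entries."""
    return [sum(w * x[j] for j, w in row.items()) for row in matrix]


# Toy stand-in for a high-dimensional multi-scale image feature vector.
features = [float(i % 7) for i in range(50)]
R = sparse_measurement_matrix(rows=4, cols=50)
compressed = project(R, features)
print(len(features), "->", len(compressed))  # 50 -> 4
```

Storing only the nonzero weights keeps the projection cheap, which matters when it has to be evaluated for many candidate windows per frame.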
Toshihiko YAMASAKI Tomoaki MATSUNAMI Tuhan CHEN
This paper presents a technique that analyzes pedestrians' attributes such as gender and bag-possession status from surveillance video. One of the technically challenging issues is that we use only top-view camera images to protect privacy. The shape features over the frames are extracted by bag-of-features (BoF) using histogram of oriented gradients (HoG) vectors. In order to enhance the classification accuracy, a two-staged classification framework is presented. Multiple classifiers are trained by changing the parameters in the first stage. The outputs from the first stage are further trained and classified in the second stage classifier. The experiments using 60-minute video captured at Haneda Airport, Japan, show that the accuracies for the gender classification and the bag-possession classification were 95.8% and 97.2%, respectively, which is a significant improvement from our previous work.