Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
This paper reports on the trending literature of occlusion handling in the task of online visual tracking. The discussion first explores the visual tracking realm and pinpoints the need for dedicated attention to the occlusion problem. The findings suggest that although occlusion detection has facilitated tracking impressively, it has been largely ignored. The literature further shows that the mainstream of the research is centered on human tracking and crowd analysis. This is followed by a novel taxonomy of types of occlusion and the challenges arising from it, during and after the emergence of an occlusion. The discussion then investigates approaches for handling occlusion on a frame-by-frame basis. Literature analysis reveals that researchers have examined every aspect of tracker design that is hypothesized to be beneficial for robust tracking under occlusion. State-of-the-art solutions identified in the literature involve various camera settings, simplifying assumptions, appearance and motion models, target state representations and observation models. The identified clusters are then analyzed and discussed, and their merits and demerits are explained. Finally, areas of potential for future research are presented.
Masatoshi KAWARASAKI Hyuma WATANABE
MapReduce and its open-source implementation Hadoop are now widely deployed for big data analysis. As MapReduce runs over a cluster of many machines, data transfer often becomes a bottleneck in job processing. In this paper, we explore the influence of data transfer on job processing performance and analyze the mechanism of job performance deterioration caused by data-transfer-oriented congestion at disk I/O and/or network I/O. Based on this analysis, we update Hadoop's Heartbeat messages to contain the real-time system status of each machine, such as disk I/O and link usage rate. This enhancement makes Hadoop's scheduler aware of each machine's workload so that it can make more accurate scheduling decisions. Experiments evaluate the effectiveness of the enhanced scheduling methods, and the several proposed scheduling policies are compared and discussed.
Taek LEE Jung-Been LEE Hoh Peter IN
Adherence to coding conventions during the code production stage of software development is essential. Benefits include enabling programmers to quickly understand the context of shared code, communicate with one another in a consistent manner, and easily maintain the source code at low costs. In reality, however, programmers tend to doubt or ignore the degree to which the quality of their code is affected by adherence to these guidelines. This paper addresses research questions such as “Do violations of coding conventions affect the readability of the produced code?”, “What kinds of coding violations reduce code readability?”, and “How much do variable factors such as developer experience, project size, team size, and project maturity influence coding violations?” To respond to these research questions, we explored 210 open-source Java projects with 117 coding conventions from the Sun standard checklist. We believe our findings and the analysis approach used in the paper will encourage programmers and QA managers to develop their own customized and effective coding style guidelines.
Tugkan TUGLULAR Arda MUFTUOGLU Fevzi BELLI Michael LINSCHULTE
Graphical User Interfaces (GUIs) are critical for the security, safety and reliability of software systems. Injection attacks, for instance via SQL, succeed due to insufficient input validation and can be avoided if contract-based approaches, such as Design by Contract, are followed in the software development lifecycle of GUIs. This paper proposes a model-based testing approach for detecting GUI data contract violations, which may result in serious failures such as system crashes. A contract-based model of GUI data specifications is used to develop test scenarios and to serve as the test oracle. The technique introduced uses multi-terminal binary decision diagrams, which are designed as an integral part of decision table-augmented event sequence graphs, to implement a GUI testing process. A case study, which validates the presented approach on a port scanner written in the Java programming language, is presented.
Takahiro YAMAMOTO Masaki KAWAMURA
We propose a method of spread spectrum digital watermarking with quantization index modulation (QIM) and evaluate the method on the basis of IHC evaluation criteria. The spread spectrum technique can make watermarks robust by using spread codes. Since watermarks can have redundancy, messages can be decoded from a degraded stego-image. Under IHC evaluation criteria, it is necessary to decode the messages without the original image. To do so, we propose a method in which watermarks are generated by using the spread spectrum technique and are embedded by QIM. QIM is an embedding method that can decode without an original image. The IHC evaluation criteria include JPEG compression and cropping as attacks. JPEG compression is lossy compression. Therefore, errors occur in watermarks. Since watermarks in stego-images are out of synchronization due to cropping, the position of embedded watermarks may be unclear. Detecting this position is needed while decoding. Therefore, both error correction and synchronization are required for digital watermarking methods. As countermeasures against cropping, the original image is divided into segments to embed watermarks. Moreover, each segment is divided into 8×8 pixel blocks. A watermark is embedded into a DCT coefficient in a block by QIM. To synchronize in decoding, the proposed method uses the correlation between watermarks and spread codes. After synchronization, watermarks are extracted by QIM, and then, messages are estimated from the watermarks. The proposed method was evaluated on the basis of the IHC evaluation criteria. The PSNR had to be higher than 30 dB. Ten 1920×1080 rectangular regions were cropped from each stego-image, and 200-bit messages were decoded from these regions. Their BERs were calculated to assess the tolerance. As a result, the BERs were less than 1.0%, and the average PSNR was 46.70 dB. Therefore, our method achieved a high image quality when using the IHC evaluation criteria. 
In addition, the proposed method was also evaluated by using StirMark 4.0. As a result, we found that our method is robust against not only JPEG compression and cropping but also added noise and Gaussian filtering. Moreover, the method has the advantage that detection time is short, since the synchronization is processed in 8×8 pixel blocks.
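The QIM step described above can be illustrated with a minimal sketch of scalar QIM on a single coefficient. This is not the authors' implementation: the function names, the step size, and the choice of two lattices offset by half a step are assumptions made for illustration, and the spread-spectrum coding and synchronization stages are left out.

```python
def qim_embed(coeff: float, bit: int, step: float) -> float:
    """Quantize coeff onto the lattice that represents bit (0 or 1).

    Lattice 0 holds multiples of step; lattice 1 is shifted by step / 2.
    """
    offset = bit * step / 2.0
    return step * round((coeff - offset) / step) + offset


def qim_extract(coeff: float, step: float) -> int:
    """Decode by choosing the lattice closest to coeff (no original image needed)."""
    d0 = abs(coeff - qim_embed(coeff, 0, step))
    d1 = abs(coeff - qim_embed(coeff, 1, step))
    return 0 if d0 <= d1 else 1


if __name__ == "__main__":
    step = 8.0                      # hypothetical quantization step
    marked = qim_embed(13.7, 1, step)
    noisy = marked + 1.5            # distortion smaller than step / 4
    print(marked, qim_extract(noisy, step))
```

Because decoding only compares distances to the two lattices, it works without the original coefficient, which is the property the abstract relies on. In this sketch a distortion below a quarter of the step leaves the decoded bit unchanged.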
A two-handed distance control method is proposed for precisely and efficiently manipulating a virtual 3D object by hand in an immersive virtual reality environment. The proposed method enhances direct manipulation by hand and is used to precisely control and efficiently adjust the position of an object and the viewpoint using the distance between the two hands. The two-handed method is evaluated and compared with the previously proposed one-handed speed control method, which adjusts the position of an object in accordance with the speed of one hand. The results from experimental evaluation show that two-handed methods, which make position and viewpoint adjustments, are the best among six combinations of control and adjustment methods.
Wyllian B. da SILVA Keiko V. O. FONSECA Alexandre de A. P. POHL
Digital video signals are subject to several distortions due to compression processes, transmission over noisy channels or video processing. Therefore, video quality evaluation has become a necessity for broadcasters and content providers interested in offering high video quality to their customers. To this end, an objective no-reference video quality assessment metric is proposed based on the sigmoid model using spatial-temporal features weighted by parameters obtained through the solution of a nonlinear least squares problem using the Levenberg-Marquardt algorithm. Experimental results show that when it is applied to MPEG-2 streams our method presents better linearity than full-reference metrics, and its performance is close to that achieved with full-reference metrics for H.264 streams.
Xin TAN Yu LIU Huaxin XIAO Maojun ZHANG
A cascaded video denoising method based on frame averaging is proposed in this paper. A novel segmentation approach using intensity and structure tensor is used for change compensation, which can effectively suppress noise while preserving the structure of an image. The cascaded framework solves the problem of residual noise caused by single-frame averaging. The classical Wiener filter is used for spatial denoising in changing areas. Our algorithm works in real time on an FPGA, since it does not involve future frames. Experiments on standard grayscale videos for various noise levels demonstrate that the proposed method is competitive with current state-of-the-art video denoising algorithms on both peak signal-to-noise ratio and structural similarity evaluations, particularly when dealing with large-scale noise.
In this paper, we propose a method for reconstructing 3D sequential patterns from multiple images without knowing exact image correspondences and without calibrating linear camera sensitivity parameters on intensity. The sequential pattern is defined as a series of colored 3D points. We assume that the series of the points are obtained in multiple images, but the correspondence of individual points is not known among multiple images. For reconstructing sequential patterns, we consider a camera projection model which combines geometric and photometric information of objects. Furthermore, we consider camera projections in the frequency space. By considering the multi-view relationship on the new projection model, we show that the 3D sequential patterns can be reconstructed without knowing exact correspondence of individual image points in the sequential patterns; moreover, the recovered 3D patterns do not suffer from changes in linear camera sensitivity parameters. The efficiency of the proposed method is tested using real images.
Fumi KAWAI Satoshi KONDO Keisuke HAYATA Jun OHMIYA Kiyoko ISHIKAWA Masahiro YAMAMOTO
We propose a fully automatic method for detecting the carotid artery from volumetric ultrasound images as a preprocessing stage for building three-dimensional images of the structure of the carotid artery. The proposed detector utilizes support vector machine classifiers to discriminate between carotid artery images and non-carotid artery images using two kinds of LBP-based features. The detector switches between these features depending on the anatomical position along the carotid artery. We evaluate our proposed method using actual clinical cases. Accuracies of detection are 100%, 87.5% and 68.8% for the common carotid artery, internal carotid artery, and external carotid artery sections, respectively.
Zijun SHA Lin HU Yuki TODO Junkai JI Shangce GAO Zheng TANG
Breast cancer is a serious disease across the world, and it is one of the largest causes of cancer death for women. The traditional diagnosis is not only time consuming but also easily affected. Hence, artificial intelligence (AI), especially neural networks, has been widely used to assist in detecting cancer. However, in recent years, the computational ability of a neuron has attracted more and more attention. The main computational capacity of a neuron is located in the dendrites. In this paper, a novel neuron model with dendritic nonlinearity (NMDN) is proposed to classify breast cancer in the Wisconsin Breast Cancer Database (WBCD). In NMDN, the dendrites possess nonlinearity when realizing the excitatory synapses, inhibitory synapses, constant-1 synapses and constant-0 synapses instead of being simply weighted. Furthermore, the nonlinear interaction among the synapses on a dendrite is defined as a product of the synaptic inputs. The soma adds all of the products of the branches to produce an output. A back-propagation-based learning algorithm is introduced to train the NMDN. The performance of the NMDN is compared with classic back propagation neural networks (BPNNs). Simulation results indicate that NMDN possesses superior capability in terms of accuracy, convergence rate, stability and area under the ROC curve (AUC). Moreover, regarding ROC, for continuum values, the 0-connection branches that exist after evolving can be eliminated from the dendrite morphology to reduce the computational load, with no influence on classification performance. The results disclose that the computational ability of the neuron has been undervalued, and the proposed NMDN can be an interesting choice for medical researchers in further research.
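The structure described above (nonlinear synapses, a product along each dendrite, a sum at the soma) can be sketched as follows. The abstract does not give the synaptic function, so the sigmoid form, the steepness k, and all weight and threshold values below are assumptions chosen only to show how the four synapse types can arise from one parameterized function. Training by back-propagation is omitted.

```python
import math


def synapse(x: float, w: float, theta: float, k: float = 10.0) -> float:
    """Assumed sigmoid synapse; (w, theta) decide which connection type it realizes."""
    return 1.0 / (1.0 + math.exp(-k * (w * x - theta)))


def dendrite(xs, ws, thetas) -> float:
    """Nonlinear interaction on one branch: product of its synaptic outputs."""
    out = 1.0
    for x, w, theta in zip(xs, ws, thetas):
        out *= synapse(x, w, theta)
    return out


def soma(xs, branches) -> float:
    """The soma adds the products of all branches."""
    return sum(dendrite(xs, ws, thetas) for ws, thetas in branches)


# Hypothetical parameter pairs (w, theta) for inputs in [0, 1]:
EXCITATORY = (1.0, 0.5)    # output follows the input
INHIBITORY = (-1.0, -0.5)  # output is the inverted input
CONSTANT_1 = (1.0, -0.5)   # output stays near 1 for any input
CONSTANT_0 = (1.0, 1.5)    # output stays near 0 for any input

if __name__ == "__main__":
    # One branch computing roughly "x0 AND NOT x1".
    branch = ([EXCITATORY[0], INHIBITORY[0]], [EXCITATORY[1], INHIBITORY[1]])
    print(soma([1.0, 0.0], [branch]), soma([1.0, 1.0], [branch]))
```

In this sketch a branch that contains a constant-0 synapse always outputs a value near zero, which is consistent with the abstract's remark that such branches can be removed without affecting classification.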
Yoji YAMATO Shinichiro KATSURAGI Shinji NAGAO Norihiro MIURA
We evaluated software maintenance of an open source cloud platform system we developed using an agile software development method. We previously reported on a rapid service launch using the agile software development method in spite of large-scale development. For this study, we analyzed inquiries and the defect removal efficiency of our recently developed software throughout one year of operation. We found that the defect removal efficiency of our recently developed software was 98%. This indicates that we could achieve sufficient quality in spite of large-scale agile development. In terms of the maintenance process, we could answer all inquiries within three business days and could conduct version upgrades quickly. Thus, we conclude that software maintenance of agile software development is not ineffective.
Xinjie WANG Yuzhen HUANG Yansheng LI Zhe-Ming LU
In this Letter, we investigate the outage performance of MIMO amplify-and-forward (AF) multihop relay networks with maximum ratio transmission/receiver antenna selection (MRT/RAS) over Nakagami-m fading channels with and without co-channel interference (CCI). In particular, the lower bounds for the outage probability of MIMO AF multihop relay networks with/without CCI are derived, which provide an efficient means to evaluate the joint effects of key system parameters, such as the number of antennas, the interfering power, and the severity of channel fading. In addition, the asymptotic behavior of the outage probability is investigated, and the results reveal that the full diversity order can be achieved regardless of CCI. Finally, simulation results are provided to show the correctness of our derived analytical results.
Sen ZHONG Wei XIA Lingfeng ZHU Zishu HE
In localization systems based on time difference of arrival (TDOA), multipath fading and interference sources deteriorate the localization performance. In response to this situation, TDOA estimation based on blind beamforming is proposed in the frequency domain. An additional constraint condition is designed for blind beamforming based on maximum power collecting (MPC). The relationship between the weight coefficients of the beamformer and TDOA is revealed. According to this relationship, TDOA is estimated by discrete Fourier transform (DFT). The efficiency of the proposed estimator is demonstrated by simulation results.
We present a new framework for embedding holographic halftone watermarking data into images by fusion of scale-related wavelet coefficients. The halftone watermarking image is obtained by using the error-diffusion method and converted into a Fresnel hologram, which is considered to be the initial password. After encryption, a watermarking image scrambled through the Arnold transform is embedded into the host image during the halftoning process. We characterize the multi-scale representation of the original image using the discrete wavelet transform. The boundary information of the target image is fused by correlation of wavelet coefficients across wavelet transform layers to increase the pixel resolution scale. We apply the inter-scale fusion method to obtain the fusion coefficients of the fine scale, which takes into account both the detail of the image and approximate information. Using the proposed method, the watermarking information can be embedded into the host image with recovery against the halftoning operation. The experimental results show that the proposed approach provides security and robustness against JPEG compression and different attacks compared to previous alternatives.
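The Arnold scrambling step mentioned above can be sketched with the common form of the Arnold (cat map) transform on a square N×N image, which moves pixel (x, y) to ((x + y) mod N, (x + 2y) mod N). The abstract does not state which variant or how many iterations are used, so the map coefficients and the round count here are assumptions.

```python
def arnold(img, rounds=1):
    """Scramble a square image (list of rows) with the Arnold cat map."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_inverse(img, rounds=1):
    """Undo arnold(): pixel (x, y) returns to ((2x - y) mod N, (y - x) mod N)."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(2 * x - y) % n][(y - x) % n] = img[x][y]
        img = out
    return img


if __name__ == "__main__":
    watermark = [[4 * r + c for c in range(4)] for r in range(4)]
    scrambled = arnold(watermark, rounds=3)   # round count acts like a key
    assert arnold_inverse(scrambled, rounds=3) == watermark
```

The map is a permutation of pixel positions, so scrambling changes where values sit but not the values themselves, and the receiver only needs the round count to restore the watermark.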
An appropriate similarity measure between images is one of the key techniques in search-based image annotation models. In order to capture the nonlinear relationships between visual features and image semantics, many kernel distance metric learning (KML) algorithms have been developed. However, when challenged with large-scale image annotation, their metrics cannot explicitly represent the similarity between image semantics, and their algorithms suffer from high computation cost. As a result, they lose efficiency. In this paper, we propose a manifold kernel metric learning (M_KML) algorithm. Our M_KML algorithm simultaneously learns the manifold structure and the image annotation metrics. The main merit of our M_KML algorithm is that the distance metrics are built on the interior manifold structure of the image features, and the dimensionality reduction on the manifold structure can handle the high-dimensionality challenge faced by KML. Final experiments verify our method's efficiency and effectiveness by comparing it with state-of-the-art image annotation approaches.
Leigang HUO Xiangchu FENG Chunlei HUO Chunhong PAN
Using traditional single-layer dictionary learning methods, it is difficult to reveal the complex structures hidden in hyperspectral images. Motivated by deep learning techniques, a deep dictionary learning approach is proposed for hyperspectral image denoising, which consists of hierarchical dictionary learning, feature denoising and fine-tuning. Hierarchical dictionary learning is helpful for uncovering the hidden factors in the spectral dimension, and fine-tuning is beneficial for preserving the spectral structure. Experiments demonstrate the effectiveness of the proposed approach.
This paper describes an evaluation of a temporally stable spectral envelope estimator proposed in our past research. The past research demonstrated that the proposed algorithm can synthesize speech that is as natural as the input speech. This paper focuses on an objective comparison, in which the proposed algorithm is compared with two modern estimation algorithms in terms of estimation performance and temporal stability. The results show that the proposed algorithm is superior to the others in both aspects.
Peng SONG Wenming ZHENG Ruiyu LIANG
In traditional speech emotion recognition systems, when the training and testing utterances are obtained from different corpora, the recognition rates decrease dramatically. To tackle this problem, in this letter, inspired by the recent developments of sparse coding and transfer learning, a novel sparse transfer learning method is presented for speech emotion recognition. Firstly, a sparse coding algorithm is employed to learn a robust sparse representation of emotional features. Then, a novel sparse transfer learning approach is presented, where the distance between the feature distributions of source and target datasets is considered and used to regularize the objective function of sparse coding. The experimental results demonstrate that, compared with the automatic recognition approach, the proposed method achieves promising improvements on recognition rates and significantly outperforms the classic dimension reduction based transfer learning approach.
Barrel distortion is a critical problem that can hinder the successful application of wide-angle cameras. This letter presents an implementation method for fast correction of the barrel distortion. In the proposed method, the required scaling factor is obtained by interpolating a mapping polynomial with a non-uniform spline instead of calculating it directly, which reduces the number of computations required for the distortion correction. This reduction in the number of computations leads to faster correction while maintaining quality: when compared to the conventional method, the reduction ratio of the correction time is about 89%, and the correction quality is 35.3 dB in terms of the average peak signal-to-noise ratio.
Chao WANG Xuanqin MOU Lei ZHANG
In this letter, we study the R-D properties of independent sources based on MSE and SSIM, and compare the bit allocation performance under the MINAVE and MINMAX criteria in video encoding. The results show that, when SSIM is used, MINMAX yields an average distortion similar to that of MINAVE, which illustrates the consistency between these two criteria in independent perceptual video coding. Furthermore, MINMAX results in lower quality fluctuation, which shows its advantage for perceptual video coding.
Shuang LIU Zhong ZHANG Baihua XIAO Xiaozhong CAO
Texture feature descriptors such as local binary patterns (LBP) have proven effective for ground-based cloud classification. Traditionally, these texture feature descriptors are predefined in a handcrafted way. In this paper, we propose a novel method which automatically learns discriminative features from labeled samples for ground-based cloud classification. Our key idea is to learn these features through mutual information maximization, which learns a transformation matrix for the local difference vectors of LBP. The experimental results show that our learned features greatly improve the performance of ground-based cloud classification when compared to other state-of-the-art methods.
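As background for the local difference vectors mentioned above, the following sketch computes the handcrafted baseline: the basic 3×3 LBP code of one pixel. The neighbor ordering and bit weights are one common convention and are assumptions here. The learned transformation matrix proposed in the paper is not reproduced.

```python
# Clockwise neighbor offsets (row, col), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def local_difference_vector(img, r, c):
    """Differences between the 8 neighbors of (r, c) and the center pixel."""
    center = img[r][c]
    return [img[r + dr][c + dc] - center for dr, dc in OFFSETS]


def lbp_code(img, r, c):
    """Handcrafted LBP: threshold each difference at 0 and pack the signs into a byte."""
    diffs = local_difference_vector(img, r, c)
    return sum(1 << i for i, d in enumerate(diffs) if d >= 0)


if __name__ == "__main__":
    patch = [[5, 9, 1],
             [4, 6, 7],
             [2, 3, 8]]
    print(local_difference_vector(patch, 1, 1), lbp_code(patch, 1, 1))
```

Classic LBP fixes the mapping from difference vector to code by sign thresholding. The abstract's method instead learns how the difference vector is transformed before encoding.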
The paper proposes an algorithm to expose spliced photographs. Firstly, a graph-based segmentation, which defines a predictor to measure boundary evidence between two neighboring regions, is used to make greedy decisions. Then the algorithm obtains the prediction error image using non-negative linear least-squares prediction. For each pair of segmented neighboring regions, the proposed algorithm gathers their statistical features and calculates features of the gray level co-occurrence matrix. K-means clustering is applied to create a dictionary, and the vector quantization histogram is taken as the fixed-length result vector. For a tampered image, its noise satisfies a Gaussian distribution with zero mean. The proposed method checks the similarity between the noise distribution and a zero-mean Gaussian distribution, followed by local flatness and texture measurements. Finally, all features are fed to a support vector machine classifier. The algorithm has low computational cost. Experiments show its effectiveness in exposing forgery.
Kyungsun LEE Minseok KEUM David K. HAN Hanseok KO
It is unclear whether Hidden Markov Model (HMM) or Dynamic Time Warping (DTW) mapping is more appropriate for visual speech recognition when only small data samples are available. In this letter, the two approaches are compared in terms of sensitivity to the amount of training samples and computing time with the objective of determining the tipping point. The limited training data problem is addressed by exploiting a straightforward template matching via weighted DTW (WDTW). The proposed framework refines DTW by adjusting the warping paths with judiciously injected weights to ensure a smooth diagonal path for accurate alignment without added computational load. The proposed WDTW is evaluated on three databases (two in the public domain and one developed in-house) for visual recognition performance. Subsequent experiments indicate that the proposed WDTW significantly enhances the recognition rate compared to the DTW and HMM based algorithms, especially under limited data samples.
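The idea of steering the warping path toward the diagonal can be sketched with a small dynamic-programming routine. The abstract does not specify the weights, so the linear penalty on the distance from the diagonal used here, the `penalty` parameter, and the one-dimensional feature sequences are all assumptions. With `penalty=0` the routine reduces to classical DTW.

```python
def weighted_dtw(a, b, penalty=0.0):
    """DTW distance between sequences a and b with a diagonal-favoring weight.

    Each local cost |a[i] - b[j]| is scaled by 1 + penalty * |i - j|, a
    hypothetical weighting that makes off-diagonal alignments more expensive.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            weight = 1.0 + penalty * abs(i - j)
            cost = weight * abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def classify(sample, templates, penalty=0.0):
    """Template matching: return the label of the closest stored template."""
    return min(templates, key=lambda label: weighted_dtw(sample, templates[label], penalty))


if __name__ == "__main__":
    templates = {"a": [0, 1, 2, 1, 0], "b": [2, 2, 0, 0, 2]}  # toy features
    print(classify([0, 1, 1, 2, 1, 0], templates, penalty=0.1))
```

The weighting adds one multiplication per cell, so the cost stays at O(nm) as in plain DTW, in line with the abstract's claim of no added computational load.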
In this letter, we propose a new semantic parts learning approach to address the object detection problem with only the bounding boxes of object category labels. Our main observation is that even though the appearance and arrangement of object parts might vary across the instances of different object categories, the constituent parts still maintain geometric consistency. Specifically, we propose a discriminative clustering method with sparse representation refinement to discover the mid-level semantic part set automatically. Then each semantic part detector is learned by a linear SVM in a one-vs-all manner. Finally, we utilize the learned part detectors to score the test image and integrate all the response maps of the part detectors to obtain the detection result. The learned class-generic part detectors are able to capture objects across different categories. Experimental results show that our approach outperforms some recent competing methods.