Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Kazuki MIYAHARA Kenji HASHIMOTO Hiroyuki SEKI
We consider the problem of deciding whether a query can be rewritten by a nondeterministic view. It is known that rewriting is decidable if views are given by single-valued non-copying devices such as compositions of single-valued extended linear top-down tree transducers with regular look-ahead, and queries are given by deterministic MSO tree transducers. In this paper, we extend the result to the case where views are given by nondeterministic devices that are not always single-valued. We define two variants of rewriting, universal preservation and existential preservation, and discuss their decidability.
For a service-oriented architecture-based system, the problem of synthesizing a concrete model (i.e., a behavioral model) for each peer configuring the system from an abstract specification, referred to as choreography, is known as the choreography realization problem. In this paper, we consider the condition on the behavioral model when choreography is given by an acyclic relation. A new notion called re-constructible decomposition of acyclic relations is introduced, and a necessary and sufficient condition for a decomposed relation to be re-constructible is shown. The condition provides lower and upper bounds on the acyclic relation for the behavioral model. The degree of freedom for behavioral models thus increases, which makes it possible to develop algorithms that synthesize models intelligible to users. The condition is also expected to apply to the case where choreography is given by a set of acyclic relations.
Sasinee PRUEKPRASERT Toshimitsu USHIO
This paper considers an optimal stabilization problem of quantitative discrete event systems (DESs) under the influence of disturbances. We model a DES by a deterministic weighted automaton. The control cost is the sum of the weights along the generated trajectories that reach the target state. The region of weak attraction is the set of states of the system such that all trajectories starting from them can be controlled to reach a specified set of target states and stay there indefinitely. An optimal stabilizing controller is a controller that drives the states in this region to the set of target states with minimum control cost and keeps them there. We consider two control objectives: to minimize the worst-case control cost (1) over all enabled trajectories and (2) over the enabled trajectories starting with controllable events. Moreover, we consider disturbances, i.e., uncontrollable events that rarely occur in the real system but may degrade control performance when they do occur. We propose a linearithmic-time algorithm for synthesizing an optimal stabilizing controller that is robust to disturbances.
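As a rough illustration of the cost structure involved (not the authors' algorithm, which additionally handles uncontrollable events and disturbances), the following sketch computes, for a deterministic weighted automaton, the minimum accumulated weight from every state to a set of target states via a backward Dijkstra-style search; the state names, weights, and transition-map layout are hypothetical.

```python
import heapq

def min_cost_to_targets(transitions, targets):
    """Backward Dijkstra over a deterministic weighted automaton.

    transitions: dict mapping (state, event) -> (next_state, weight >= 0)
    targets:     set of target states
    Returns dict: state -> minimum accumulated weight to reach a target.
    """
    # Build reverse adjacency: next_state -> list of (prev_state, weight)
    reverse = {}
    for (src, _event), (dst, w) in transitions.items():
        reverse.setdefault(dst, []).append((src, w))

    dist = {t: 0 for t in targets}
    heap = [(0, t) for t in targets]
    heapq.heapify(heap)
    while heap:
        d, state = heapq.heappop(heap)
        if d > dist.get(state, float("inf")):
            continue
        for prev, w in reverse.get(state, []):
            nd = d + w
            if nd < dist.get(prev, float("inf")):
                dist[prev] = nd
                heapq.heappush(heap, (nd, prev))
    return dist

# Hypothetical three-state example
trans = {("s0", "a"): ("s1", 2), ("s1", "b"): ("s2", 1), ("s0", "c"): ("s2", 5)}
print(min_cost_to_targets(trans, {"s2"}))  # {'s2': 0, 's1': 1, 's0': 3}
```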
Hayato MAKI Tomoki TODA Sakriani SAKTI Graham NEUBIG Satoshi NAKAMURA
In this paper, a new method is proposed for removing noise from single-trial event-related potentials recorded with a multi-channel electroencephalogram. An observed signal is separated into multiple signals with a multi-channel Wiener filter whose coefficients are estimated based on parameter estimation of a probabilistic generative model that locally models the amplitude of each separated signal in the time-frequency domain. The effectiveness of using prior information about covariance matrices, as well as frequency-dependent covariance matrices, for estimating the model parameters was shown through an experiment with a simulated event-related potential data set.
Liang-Bi CHEN Wan-Jung CHANG Kuen-Min LEE Chi-Wei HUANG Katherine Shu-Min LI
Residents living in a nursing home usually have established medical histories in multiple sources, and most previous medicine management systems have only focused on the integration of prescriptions and the identification of repeated drug uses. Therefore, a comprehensive medicine management system is proposed to integrate medical information from different sources. The proposed system not only detects inappropriate drugs automatically but also allows users to input such information for any non-prescription medicines that the residents take. Every participant can fully track the residents' latest medicine use online and in real time. Pharmacists are able to issue requests for suggestions on medicine use, and residents can also have a comprehensive understanding of their medicine use. The proposed scheme has been practically implemented in a nursing home in Taiwan. The evaluation results show that the average time to detect an inappropriate drug use and complete a medicine record is reduced. With automatic and precise comparisons, repeated drugs and drug side effects are identified effectively, so the medicine costs for the residents are also reduced. Consequently, the proactive feedback, real-time tracking, and interactive consulting mechanisms bind all parties together to realize a comprehensive medicine management system.
The present study investigated the performance of text-based explanation for a large number of learners in an online tutoring task guided by a Pedagogical Conversational Agent (PCA). In the study, a lexical network analysis focusing on the co-occurrence of keywords in the learners' explanation texts was performed, and the resulting measures were used as dependent variables. This method was used to investigate how the independent variables, consisting of expressions of emotion, embodied characteristics of the PCA, and personal characteristics of the learner, influenced the performance of the explanation text. The learners (participants) were students enrolled in a psychology class. The learners provided explanations to a PCA one-on-one as an after-school activity. In this activity, the PCA, playing the role of a questioner, asked the learners to explain a key concept taught in their class. The students were randomly assigned one key term out of 30 and were asked to formulate explanations by answering different types of questions. The task consisted of 17 trials. More than 300 text-based explanation dialogues were collected from learners using a web-based explanation system, and the factors influencing learner performance were investigated. Machine learning results showed that during the explanation activity, the expressions used and the gender of the PCA influenced learner performance. Results showed that (1) learners performed better when a male PCA expressed negative emotions than when a female PCA expressed negative emotions, and (2) learners performed better when a female PCA expressed positive emotions than when a female PCA expressed negative emotions. This paper provides insight into capturing the behavior of humans performing online tasks, and it puts forward suggestions related to the design of an efficient online tutoring system using a PCA.
Shogo OKADA Mi HANG Katsumi NITTA
This study focuses on modeling the storytelling performance of the participants in a group conversation. Storytelling performance is one of the fundamental communication techniques for providing information and entertainment effectively to a listener. We present a multimodal analysis of the storytelling performance in a group conversation, as evaluated by external observers. A new multimodal data corpus is collected through this group storytelling task, which includes the participants' performance scores. We extract multimodal (verbal and nonverbal) features regarding storytellers and listeners from a manual description of spoken dialog and from various nonverbal patterns, including each participant's speaking turn, utterance prosody, head gesture, hand gesture, and head direction. We also extract multimodal co-occurrence features, such as head gestures, and interaction features, such as storyteller utterances overlapping with listener backchannels. In the experiment, we modeled the relationship between the performance indices and the multimodal features using machine-learning techniques. Experimental results show that the highest accuracy (R2) is 0.299 for the total storytelling performance (sum of index scores) obtained with a combination of verbal and nonverbal features in a regression task.
Most existing algorithms cannot effectively solve the data sparsity problem of trajectory prediction. This paper proposes a novel sparse trajectory prediction method based on L-Z entropy estimation. Firstly, the moving region of the trajectories is divided into a two-dimensional plane grid graph, and the original trajectories are mapped to the grid graph so that each trajectory can be represented as a grid sequence. Secondly, an L-Z entropy estimator is used to calculate the entropy value of each grid sequence, and the trajectory that has a comparatively low entropy value is segmented into several sub-trajectories. A new trajectory space is synthesized from these sub-trajectories based on trajectory entropy. The trajectory synthesis not only resolves the sparsity problem of trajectory data but also makes the new trajectory space more credible. In addition, the trajectory scale is limited to a certain range. Finally, in the new trajectory space, a Markov model and Bayesian inference are applied to trajectory prediction under data sparsity. Experiments based on the taxi trajectory dataset of Microsoft Research Asia show that the proposed method can make effective predictions for sparse trajectories. Compared with existing methods, our method needs a smaller trajectory space and provides a much wider prediction range, faster prediction speed, and better prediction accuracy.
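Assuming "L-Z entropy" refers to Lempel-Ziv style entropy estimation, one widely used estimator for a symbol sequence (here, a grid-cell sequence) is H ~= n*log2(n) / sum(Lambda_i), where Lambda_i is the length of the shortest substring starting at position i that has not appeared before position i. The sketch below is a minimal version of that generic estimator, not code taken from the paper.

```python
import math

def lz_entropy(seq):
    """Lempel-Ziv entropy estimate (bits/symbol) of a symbol sequence."""
    n = len(seq)
    lambdas = 0
    for i in range(n):
        length = 1
        # Grow the substring until it no longer occurs in seq[:i]
        while i + length <= n and _contains(seq[:i], seq[i:i + length]):
            length += 1
        lambdas += length
    return n * math.log2(n) / lambdas if lambdas else 0.0

def _contains(prefix, pattern):
    m, k = len(prefix), len(pattern)
    return any(prefix[j:j + k] == pattern for j in range(m - k + 1))

# Hypothetical grid-cell sequence of a trajectory
cells = [3, 7, 3, 7, 3, 7, 9, 3, 7]
print(round(lz_entropy(cells), 3))  # entropy estimate in bits per symbol (about 1.5 here)
```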
Yoshitaka OTANI Osamu AOKI Tomohiro HIROTA Hiroshi ANDO
The purpose of this study is to provide a fall risk assessment for stroke patients during walking using an accelerometer. We assessed gait parameters, normalized root mean squared acceleration (NRMSA), and Berg Balance Scale (BBS) values. Walking dynamics related to the risk of falls during walking were better reflected by NRMSA than by the BBS.
We present a hierarchical replicated state machine (H-RSM) and its corresponding consensus protocol D-Paxos for replication across multiple data centers in the cloud. Our H-RSM is based on the idea of parallel processing and aims to improve resource utilization. We detail D-Paxos and theoretically prove that D-Paxos implements an H-RSM. With batching and logical pipelining, D-Paxos efficiently utilizes the idle time caused by high-latency message transmission in a wide-area network and available bandwidth in a local-area network. Experiments show that D-Paxos provides higher throughput and better scalability than other Paxos variants for replication across multiple data centers. To predict the optimal batch sizes when D-Paxos reaches its maximum throughput, an analytical model is developed theoretically and validated experimentally.
Dongchul PARK Biplob DEBNATH David H.C. DU
The Flash Translation Layer (FTL) is a firmware layer inside NAND flash memory that allows existing disk-based applications to use flash memory without any significant modifications. Since the FTL has a critical impact on the performance and reliability of flash-based storage, a variety of FTLs have been proposed. The existing FTLs, however, are designed to perform well for either a read-intensive workload or a write-intensive workload, but not for both, due to their internal address mapping schemes. To overcome this limitation, we propose a novel hybrid FTL scheme named Convertible Flash Translation Layer (CFTL). CFTL adapts to data access patterns with the help of our unique hot data identification design that adopts multiple bloom filters. Thus, CFTL can dynamically switch its mapping scheme to either page-level mapping or block-level mapping to fully exploit the benefits of both schemes. In addition, we design a spatial-locality-aware caching mechanism and adaptive cache partitioning to further improve CFTL performance. Consequently, both the adaptive switching scheme and the judicious caching mechanism empower CFTL to achieve good read and write performance. Our extensive evaluations demonstrate that CFTL outperforms existing FTLs. In particular, our specially designed caching mechanism remarkably improves the cache hit ratio, by an average of 2.4×, and achieves much higher hit ratios (up to 8.4×), especially for random-read-intensive workloads.
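The abstract only says that hot data identification "adopts multiple bloom filters"; a common realization of that idea (sketched below with hypothetical parameters, not CFTL's actual design) keeps several small Bloom filters as recency windows, periodically rotates the oldest one out, and treats a logical block address as hot when it appears in at least a threshold number of windows.

```python
import hashlib

class MultiBloomHotness:
    """Hot-data identifier with several rotating Bloom filters (a sketch)."""

    def __init__(self, num_filters=4, bits=4096, hashes=2, hot_threshold=3):
        self.filters = [0] * num_filters   # each filter: a bit array packed into an int
        self.bits = bits
        self.hashes = hashes
        self.hot_threshold = hot_threshold
        self.current = 0

    def _positions(self, lba):
        for k in range(self.hashes):
            digest = hashlib.md5(f"{lba}:{k}".encode()).hexdigest()
            yield int(digest, 16) % self.bits

    def record_write(self, lba):
        for pos in self._positions(lba):
            self.filters[self.current] |= 1 << pos

    def rotate(self):
        """Call periodically; the oldest window is cleared and reused."""
        self.current = (self.current + 1) % len(self.filters)
        self.filters[self.current] = 0

    def is_hot(self, lba):
        hits = sum(
            all(f >> pos & 1 for pos in self._positions(lba))
            for f in self.filters
        )
        return hits >= self.hot_threshold

tracker = MultiBloomHotness()
for _ in range(3):
    tracker.record_write(42)
    tracker.rotate()
print(tracker.is_hot(42))  # True: LBA 42 appeared in 3 recent windows
```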
Tinghuai MA Limin GUO Meili TANG Yuan TIAN Mznah AL-RODHAAN Abdullah AL-DHELAAN
User-based and item-based collaborative filtering (CF) are two of the most important and popular techniques in recommender systems. Although they are widely used, they still have some limitations, such as not adapting well to sparse data sets, failing to consider the hierarchical structure of the items, and ignoring changes in users' interests when calculating the similarity of items. To overcome these shortcomings, we propose an evolutionary approach based on hierarchical structure for dynamic recommendation systems, named Hierarchical Temporal Collaborative Filtering (HTCF). The main contributions of this paper are twofold. One is the exploration of the hierarchical structure between items to improve similarity computation, and the other is the improvement of prediction accuracy by utilizing a time weight function. A unique feature of our method is that it selects neighbors mainly based on the hierarchical structure between items, which is more reliable than the co-rated items used in traditional CF. To the best of our knowledge, there is little previous work on CF algorithms that combine implicit or latent object-structure relations. The experimental results show that our method outperforms several current recommendation algorithms in recommendation accuracy (in terms of MAE).
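The abstract mentions a time weight function but does not give its form; a common choice in temporal CF (shown here as a hedged sketch, not necessarily HTCF's function) is an exponential decay that down-weights old ratings when predicting a user's score.

```python
def time_weight(rating_time, now, half_life_days=90.0):
    """Exponential decay: a rating half_life_days old counts half as much."""
    age = now - rating_time                      # age in days
    return 0.5 ** (age / half_life_days)

def predict(neighbor_ratings, now):
    """Weighted average over neighbor ratings.

    neighbor_ratings: list of (rating, similarity, rating_time_in_days)
    """
    num = den = 0.0
    for rating, sim, t in neighbor_ratings:
        w = sim * time_weight(t, now)
        num += w * rating
        den += abs(w)
    return num / den if den else 0.0

# Hypothetical ratings: the recent rating dominates the prediction
print(predict([(5.0, 0.9, 995.0), (2.0, 0.8, 700.0)], now=1000.0))
```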
Linked data entity resolution is the detection of instances that reside in different repositories but co-describe the same topic. The quality of the resolution result depends on the appropriateness of the configuration, including the selected matching properties and the similarity measures. Because such configuration details are currently set differently across domains and repositories, a general resolution approach for every repository is necessary. In this paper, we present cLink, a system that can perform entity resolution on any input effectively by using a learning algorithm to find the optimal configuration. Experiments show that cLink achieves high performance even when being given only a small amount of training data. cLink also outperforms recent systems, including the ones that use the supervised learning approach.
Zhili ZHOU Ching-Nung YANG Beijing CHEN Xingming SUN Qi LIU Q.M. Jonathan WU
For detecting image copies of a given original image generated by arbitrary rotation, the existing image copy detection methods cannot simultaneously achieve desirable performance in terms of both accuracy and efficiency. To address this challenge, a novel effective and efficient image copy detection method is proposed based on two global features extracted from rotation-invariant partitions. Firstly, candidate images are preprocessed by an averaging operation to suppress noise. Secondly, the rotation-invariant partitions of the preprocessed images are constructed based on pixel intensity orders. Thirdly, two global features are extracted from these partitions by utilizing image gradient magnitudes and orientations, respectively. Finally, the extracted features of the images are compared to implement copy detection. Promising experimental results demonstrate that the proposed method can effectively and efficiently resist rotations of arbitrary degrees. Furthermore, the performance of the proposed method is also desirable for resisting other typical copy attacks, such as flipping, rescaling, illumination and contrast change, as well as Gaussian noising.
Antonio CEDILLO-HERNANDEZ Manuel CEDILLO-HERNANDEZ Francisco GARCIA-UGALDE Mariko NAKANO-MIYATAKE Hector PEREZ-MEANA
A visible watermarking technique to provide copyright protection for portrait images is proposed in this paper. The proposal is focused on real-world applications where a portrait image is printed and illegitimately used for commercial purposes. Proving ownership in such scenarios is well known to be one of the most difficult challenges for current watermarking techniques. We propose an original approach that avoids the deficiencies of typical watermarking methods in practical scenarios by introducing a smart process to automatically detect the most suitable region of the portrait image, where the visible watermark goes unnoticed by the naked eye of a viewer and is robust enough to remain visible when printed. The position of the watermark is determined by analyzing the portrait image characteristics, taking into account several conditions of their spatial information together with human visual system properties. Once the location is set, the watermark embedding process is performed adaptively by creating a contrast effect between the watermark and its background. Several experiments are performed to illustrate the proper functioning of the proposed watermark algorithm on portrait images with different characteristics, including dimensions, backgrounds, illumination, and texture, with the conclusion that it can be applied in many practical situations.
Kenji FUJIKAWA Hiroaki HARAI Motoyuki OHMORI Masataka OHTA
We have developed an automatic network configuration technology for flexible and robust network construction. In this paper, we propose a two-or-more-level hierarchical link-state routing protocol in the Hierarchical QoS Link Information Protocol (HQLIP). The hierarchical routing easily scales up the network by combining and stacking configured networks. HQLIP is designed not to recompute shortest-path trees from topology information, in order to achieve high-speed convergence of the forwarding information base (FIB), especially when renumbering occurs in the network. In addition, we propose a fixed-midfix renumbering (FMR) method. FMR enables even faster convergence when HQLIP is synchronized with Hierarchical/Automatic Number Allocation (HANA). Experiments demonstrate that HQLIP incorporating FMR achieves convergence within one second in a network where 22 switches and 800 server terminals are placed, and is superior to Open Shortest Path First (OSPF) in terms of convergence time. This shows that a combination of HQLIP and HANA performs stable renumbering in link-state routing protocol networks.
Seokjoon HONG Ducsun LIM Inwhee JOE
The high-availability seamless redundancy (HSR) protocol is a representative protocol that fulfills the reliability requirements of the IEC61850-based substation automation system (SAS). However, it has the drawback of creating unnecessary traffic in a network. To solve this problem, a dual virtual path (DVP) algorithm based on HSR was recently presented. Although this algorithm dramatically reduces network traffic, it does not consider the substation timing requirements of messages in an SAS. To reduce unnecessary network traffic in an HSR ring network, we introduced a novel packet transmission (NPT) algorithm in a previous work that considers IEC61850 message types. To further reduce unnecessary network traffic, in this paper we propose an extended dual virtual paths (EDVP) algorithm that considers the timing requirements of IEC61850 message types. We also include sending delay (SD), delay queue (DQ), and traffic flow latency (TFL) features in our proposal. The source node sends data frames without SDs on the primary paths, and it transmits the duplicate data frames with SDs on the secondary paths. Since the EDVP algorithm discards all of the delayed data frames in DQs when there is no link or node failure, unnecessary network traffic can be reduced. We demonstrate the principle of the EDVP algorithm and its performance in terms of network traffic compared to the standard HSR, NPT, and DVP algorithms using the OPNET network simulator. Across the simulation results, the EDVP algorithm shows better traffic performance than the other algorithms while guaranteeing the timing requirements of IEC61850 message types. Most importantly, when the source node transmits heavy data traffic, the EDVP algorithm shows greater than 80% and 40% network traffic reduction compared to the HSR and DVP approaches, respectively.
Cong Minh DINH Hyung Jeong YANG Guee Sang LEE Soo Hyung KIM
In recent years, optical music recognition (OMR) has been extensively developed, particularly for use with mobile devices that require fast processing to recognize the notes in images captured from sheet music and play them live. However, most techniques developed thus far have focused on playing back instrumental music and have ignored the importance of lyric extraction, which is time consuming and affects the accuracy of OMR tools. The text of the lyrics adds complexity to the page layout, particularly when lyrics touch or overlap musical symbols, in which case it is very difficult to separate them from each other. In addition, the distortion that appears in captured musical images makes the lyric lines curved or skewed, making the lyric extraction problem more complicated. This paper proposes a new approach in which lyrics are detected and extracted quickly and effectively. First, in order to resolve the distortion problem, the image is undistorted by a method using information from stave lines and bar lines. Then, through the use of a frequency count method and heuristic rules based on projection, the lyric areas are extracted, the cases where symbols touch the lyrics are resolved, and most of the information from the musical notation is kept even when the lyrics and music notes overlap. Our algorithm demonstrated a short processing time and remarkable accuracy on two test datasets of images of printed Korean musical scores: the first set included three hundred scanned musical images; the second set had two hundred musical images captured by a digital camera.
Antoine TROUVÉ Arnaldo J. CRUZ Kazuaki J. MURAKAMI Masaki ARAI Tadashi NAKAHIRA Eiji YAMANAKA
Modern optimizing compilers tend to be conservative and often fail to vectorize programs that would have benefited from it. In this paper, we propose a way to predict the relevant command-line options of the compiler so that it chooses the most profitable vectorization strategy. Machine learning has proven to be a relevant approach for this matter: fed with features that describe the software given to the compiler, a machine learning device is trained to predict an appropriate optimization strategy. Related work relies on the control and data flow graphs as software features. In this article, we consider tensor contraction programs, which are useful in various scientific simulations, especially in chemistry. Depending on how they access memory, different tensor contraction kernels may yield very different performance figures. However, they exhibit identical control and data flow graphs, making them completely out of reach of the related work. In this paper, we propose an original set of software features that capture the important properties of the tensor contraction kernels. Considering the Intel Merom processor architecture with the Intel Compiler, we model the problem as a classification problem and solve it using a support vector machine. Our technique predicts the best-suited vectorization options of the compiler with a cross-validation accuracy of 93.4%, leading to up to a 3-times speedup compared to the default behavior of the Intel Compiler. This article ends with an original qualitative discussion on the performance of software metrics by means of visualization. All our measurements are made available for the sake of reproducibility.
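As a generic illustration of the classification setup described (software features in, best compiler option out), the snippet below trains a support vector machine with cross-validation using scikit-learn; the feature values and option labels are invented placeholders, not the paper's tensor-contraction features or the actual Intel Compiler flags.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical feature vectors describing each kernel (e.g., loop sizes, stride of
# the innermost access, reuse distance); the label is the vectorization strategy
# that gave the best performance for that kernel.
X = np.random.default_rng(0).random((40, 3))
y = np.repeat(["strategy_A", "strategy_B", "strategy_A", "strategy_B"], 10)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(model, X, y, cv=5)
print("cross-validation accuracy:", scores.mean())
```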
Xiaolei LIU Xiaosong ZHANG Yiqi JIANG Qingxin ZHU
Optimizing the deployment of wireless sensor networks, which is one of the key issues in wireless sensor network research, helps improve the coverage of the networks and the system reliability. In this paper, we propose an evolutionary algorithm based on a modified t-distribution for wireless sensor deployment by introducing a deployment optimization operator and an intelligent allocation operator. A directed perturbation operator is applied in the algorithm to guide the evolution of the node deployment and to speed up convergence. In addition, with a new geometric sensor detection model replacing the old probability model, the computing speed is increased by 20 times. The simulation results show that when this algorithm is utilized in an actual scene, it can obtain the minimum number of nodes and the optimal deployment quickly and effectively. Compared with existing mainstream swarm intelligence algorithms, this method satisfies the need for fast convergence and achieves better coverage, which is closer to the theoretical coverage value.
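A geometric (binary disc) detection model can be evaluated very cheaply, which is presumably where the reported speedup over a probabilistic model comes from. The sketch below, with hypothetical radii, area, and grid resolution, computes the coverage ratio that such a deployment optimization would maximize; it is an illustration of the model class, not the authors' operators.

```python
import numpy as np

def coverage_ratio(nodes, radius, area=(100.0, 100.0), resolution=200):
    """Fraction of grid points covered by at least one sensing disc."""
    xs = np.linspace(0.0, area[0], resolution)
    ys = np.linspace(0.0, area[1], resolution)
    gx, gy = np.meshgrid(xs, ys)
    covered = np.zeros(gx.shape, dtype=bool)
    for nx, ny in nodes:
        covered |= (gx - nx) ** 2 + (gy - ny) ** 2 <= radius ** 2
    return covered.mean()

# Hypothetical deployment of four nodes with a 30 m sensing radius
deployment = [(25, 25), (75, 25), (25, 75), (75, 75)]
print(round(coverage_ratio(deployment, radius=30.0), 3))
```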
Bima Sena Bayu DEWANTARA Jun MIURA
This paper proposes a novel appearance-based descriptor for estimating head orientation. Our descriptor is inspired by the Weber-based feature, which has been successfully implemented for robust texture analysis, and the gradient, which performs well for shape analysis. To further enhance the orientation differences, we combine them with an analysis of the intensity deviation. The position of a pixel and its intrinsic intensity are also considered. All features are then composed into a feature vector for each pixel. The information carried by each pixel is combined using a covariance matrix to alleviate the influence of rotations and illumination. As a result, our descriptor is compact and works at high speed. We also apply a weighting scheme, called Block Importance Feature using Genetic Algorithm (BIF-GA), to improve the performance of our descriptor by selecting and accentuating the important blocks. Experiments on three head pose databases demonstrate that the proposed method outperforms the current state-of-the-art methods. Also, we can extend the proposed method by combining it with a head detection and tracking system to enable it to estimate human head orientation in real applications.
Hangyu LI Hajime KIRA Shinobu HASEGAWA
This paper aims to support the cultivation of proper cognitive skills for academic English listening. First, this paper identified, through past research, several listening strategies proven to be effective for cultivating listening skills and built the respective strategy models. Based on these models, we designed and developed various functional units as strategy objects, as well as a mashup environment where these functional units can be assembled to serve as a personal learning environment. We also attached listening strategies and tactics to each object, in order to make learners aware of the related strategies and tactics applied during learning. Both short-term and mid-term case studies were carried out, and the collected data showed several positive results and some interesting indications.
Fine-grained visual categorization (FGVC) has drawn increasing attention as an emerging research field in recent years. In contrast to generic-domain visual recognition, FGVC is characterized by high intra-class and subtle inter-class variations. To distinguish conceptually and visually similar categories, highly discriminative visual features must be extracted. Moreover, FGVC has a highly specialized and task-specific nature. It is not always easy to obtain a sufficiently large-scale training dataset. Therefore, the key to success in practical FGVC systems is to efficiently exploit discriminative features from a limited number of training examples. In this paper, we propose an efficient two-step dimensionality compression method to derive compact middle-level part-based features. To do this, we compare both space-first and feature-first convolution schemes and investigate their effectiveness. Our approach is based on simple linear algebra and analytic solutions, and is highly scalable compared with the current one-vs-one or one-vs-all approaches, making it possible to quickly train middle-level features from a number of pairwise part regions. We experimentally show the effectiveness of our method using the standard Caltech-Birds and Stanford-Cars datasets.
Hamed ESLAMI Abolghasem A. RAIE Karim FAEZ
Today, computer vision is used in different applications for intelligent transportation systems, such as traffic surveillance, driver assistance, and law enforcement. Among these applications, we concentrate on speed measurement for law enforcement. In law enforcement applications, the presence of the license plate in the scene is a presupposition, and metric parameters such as the vehicle's speed are to be estimated with a high degree of precision. The novelty of this paper is to propose a new precise, practical, and fast procedure, with a hierarchical architecture, to estimate the homographic transform of the license plate and to use this transform to estimate the vehicle's speed. The proposed method uses the RANSAC algorithm to improve the robustness of the estimation. Hence, it is possible to replace the peripheral equipment with vision-based systems, or, in conjunction with these peripherals, to improve the accuracy and reliability of the system. Results of experiments on different datasets, with different specifications, show that the proposed method can be used in law enforcement applications to measure the vehicle's speed.
Sentence similarity computation is an increasingly important task in applications of natural language processing such as information retrieval, machine translation, and text summarization. From the viewpoint of information theory, the essential attribute of natural language is that it is a carrier of information, and the amount of information it carries can be measured by information content, which has already been used successfully for word similarity computation in simple ways. Existing sentence similarity methods do not emphasize the information contained in the sentence, and the complicated models they employ often require empirical or trained parameters. This paper presents a fully unsupervised computational model of sentence semantic similarity. It is also a simple and straightforward model that neither needs any empirical parameters nor relies on other NLP tools. The method obtains state-of-the-art experimental results, which show that sentence similarity evaluated by the model is closer to human judgment than multiple competing baselines. The paper also examines the influence of the external corpus, the performance of various sizes of the semantic net, and the relationship between efficiency and accuracy.
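The abstract does not spell out the formula; one common way to use information content at the sentence level (sketched below with a toy corpus, and not necessarily the authors' model) is to weight each word by IC(w) = -log p(w) and compare the IC-weighted overlap of the two word sets.

```python
import math
from collections import Counter

def information_content(word, corpus_counts, total):
    """IC(w) = -log p(w), with add-one smoothing for unseen words."""
    return -math.log((corpus_counts.get(word, 0) + 1) / (total + len(corpus_counts)))

def ic_weighted_similarity(s1, s2, corpus_counts, total):
    """IC-weighted Jaccard overlap between two tokenized sentences."""
    w1, w2 = set(s1), set(s2)

    def ic(w):
        return information_content(w, corpus_counts, total)

    inter = sum(ic(w) for w in w1 & w2)
    union = sum(ic(w) for w in w1 | w2)
    return inter / union if union else 0.0

# Toy corpus statistics (hypothetical)
corpus = Counter("the cat sat on the mat the dog sat on the rug".split())
total = sum(corpus.values())
print(ic_weighted_similarity("the cat sat".split(), "the dog sat".split(), corpus, total))
```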
Yali LI Hongma LIU Shengjin WANG
A brain-computer interface (BCI) translates brain activity into commands to control external devices. P300 speller based character recognition is an important kind of application system in BCI. In this paper, we propose a framework that integrates channel correlation analysis into P300 detection. This work is distinguished by two key contributions. First, a coefficient matrix is introduced and constructed for multiple channels, with the elements indicating channel correlations. Agglomerative clustering is applied to group correlated channels. Second, statistics of central tendency are used to fuse the information of correlated channels and generate virtual channels. The generated virtual channels can extend the EEG signals and raise the signal-to-noise ratio. The correlated features from the virtual channels are combined with the original signals for classification, and the outputs of the discriminative classifier are used to determine the characters for spelling. Experimental results prove the effectiveness and efficiency of the channel correlation analysis based framework. Compared with the state of the art, the proposed framework increased the recognition rate by 6% with both 5 and 10 epochs.
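A minimal sketch of the pipeline stages the abstract names (a channel-correlation matrix, agglomerative clustering of correlated channels, and virtual channels formed with a central-tendency statistic). The median is used here as the central-tendency choice and the clustering parameters are assumptions, so this is an illustration rather than the paper's exact procedure.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def virtual_channels(eeg, n_groups=4):
    """eeg: array of shape (channels, samples). Returns (labels, virtual signals)."""
    # 1) Channel-correlation matrix and a correlation-based distance
    corr = np.corrcoef(eeg)
    dist = np.clip(1.0 - corr, 0.0, None)
    # 2) Agglomerative clustering of correlated channels
    condensed = dist[np.triu_indices_from(dist, k=1)]
    labels = fcluster(linkage(condensed, method="average"),
                      t=n_groups, criterion="maxclust")
    # 3) One virtual channel per group via a central-tendency statistic (median)
    virtual = np.stack([np.median(eeg[labels == g], axis=0)
                        for g in np.unique(labels)])
    return labels, virtual

rng = np.random.default_rng(0)
eeg = rng.standard_normal((8, 1000))   # 8 hypothetical channels
eeg[4:] += eeg[:4]                     # make pairs of channels correlated
labels, virtual = virtual_channels(eeg)
print(labels, virtual.shape)
```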
Hyun-chong CHO Lubomir HADJIISKI Berkman SAHINER Heang-Ping CHAN Chintana PARAMAGUL Mark HELVIE Alexis V. NEES Hyun Chin CHO
To study the similarity between queries and retrieved masses, we design an interactive CBIR (Content-based Image Retrieval) CADx (Computer-aided Diagnosis) system using relevance feedback for the characterization of breast masses in ultrasound (US) images based on radiologists' visual similarity assessment. The CADx system retrieves masses that are similar to query masses from a reference library based on six computer-extracted features that describe the texture, width-to-height, and posterior shadowing of the mass. The k-NN retrieval with Euclidean distance similarity measure and the Rocchio relevance feedback algorithm (RRF) are used. To train the RRF parameters, the similarities of 1891 image pairs from 62 (31 malignant and 31 benign) masses are rated by 3 MQSA (Mammography Quality Standards Act) radiologists using a 9-point scale (9=most similar). The best RRF parameters are chosen based on 3 observer experiments. For testing, 100 independent query masses (49 malignant and 51 benign) and 121 reference masses on 230 (79 malignant and 151 benign) images were collected. Three radiologists rated the similarity between the query masses and the computer-retrieved masses. Average similarity ratings without and with RRF were 5.39 and 5.64 for the training set and 5.78 and 6.02 for the test set, respectively. Average AUC values without and with RRF were, respectively, 0.86±0.03 and 0.87±0.03 for the training set and 0.91±0.03 and 0.90±0.03 for the test set. On average, masses retrieved using the CBIR system were moderately similar to the query masses based on radiologists' similarity assessments. RRF improved the similarity of the retrieved masses.
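For reference, the classical Rocchio update that the RRF algorithm builds on moves the query feature vector toward masses marked as relevant and away from non-relevant ones. The sketch below shows the textbook form with hypothetical weights (alpha, beta, gamma) and feature values; it is not a claim about the exact parameterization used in the paper.

```python
import numpy as np

def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Classical Rocchio relevance-feedback update of a query feature vector.

    query:        1-D feature vector of the query mass
    relevant:     2-D array of feature vectors rated similar by the radiologist
    non_relevant: 2-D array of feature vectors rated dissimilar
    """
    new_query = alpha * query
    if len(relevant):
        new_query = new_query + beta * relevant.mean(axis=0)
    if len(non_relevant):
        new_query = new_query - gamma * non_relevant.mean(axis=0)
    return new_query

# Six hypothetical computer-extracted features per mass
query = np.array([0.3, 0.5, 0.2, 0.7, 0.1, 0.4])
rel = np.array([[0.35, 0.55, 0.25, 0.65, 0.15, 0.45]])
non = np.array([[0.9, 0.1, 0.8, 0.2, 0.9, 0.1]])
print(rocchio_update(query, rel, non))
```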
The separation of signals with temporal structure from mixed sources is a challenging problem in signal processing. For this problem, blind source extraction (BSE) is more suitable than blind source separation (BSS) because it has a lower computational cost. Nowadays, many BSE algorithms can be used to extract signals with temporal structure. However, some of them are not robust because they depend too heavily on the estimation precision of the time delay, while others require parameters to be chosen before extraction, so arbitrariness cannot be avoided. To solve these problems, we propose a robust source extraction algorithm whose performance does not rely on the choice of parameters. The algorithm is realized by maximizing an objective function that we develop based on the non-Gaussianity and the temporal structure of the source signals. Furthermore, we analyze the stability of the algorithm. Simulation results show that the algorithm can extract the desired signal from a large number of observed sensor signals and is very robust to errors in the estimation of the time delay.
Wenzhu WANG Kun JIANG Yusong TAN Qingbo WU
Hierarchical scheduling for multiple resources is partially responsible for the performance achievements in large-scale datacenters. However, the latest scheduling technique, Hierarchy Dominant Resource Fairness (H-DRF) [1], has some shortcomings in heterogeneous environments, such as starving certain jobs or unfair resource allocation. This is because a heterogeneous environment brings new challenges. In this paper, we propose a novel scheduling algorithm called Dominant Fairness Fairness (DFF). DFF tries to keep resource allocation fair, avoid job starvation, and improve system resource utilization. We implement DFF in the YARN system, one of the most commonly used schedulers for large-scale clusters. The experimental results show that our proposed algorithm leads to higher resource utilization and better throughput than H-DRF.
Based on the completeness of the real-valued discrete Gabor transform, a new biorthogonal relationship between the analysis window and the synthesis window is derived, and a fast algorithm for computing the analysis window is presented for any given synthesis window. The new biorthogonal relationship can be expressed as a linear equation set, which can be separated into a certain number of independent sub-equation sets, each of which can be solved quickly and independently by using convolution operations and the FFT to obtain the analysis window for any given synthesis window. Computational complexity analysis and comparison indicate that the proposed algorithm saves a considerable amount of computation and is more efficient than the existing algorithms.
Zhihong LIU Aimal KHAN Peixin CHEN Yaping LIU Zhenghu GONG
MapReduce still suffers from a problem known as skew, where load is unevenly distributed among tasks. Existing solutions follow a similar pattern that estimates the load of each task and then rebalances the load among tasks. However, these solutions often incur heavy overhead due to the load estimation and rebalancing. In this paper, we present DynamicAdjust, a dynamic resource adjustment technique for mitigating skew in MapReduce. Instead of rebalancing the load among tasks, DynamicAdjust adjusts resources dynamically for the tasks that need more computation, thereby accelerating these tasks. Through experiments using real MapReduce workloads on a 21-node Hadoop cluster, we show that DynamicAdjust can effectively mitigate the skew and speed up the job completion time by up to 37.27% compared to the native Hadoop YARN.
Zhaofeng WU Guyu HU Fenglin JIN Yinjin FU Jianxin LUO Tingting ZHANG
Stability-featured dynamic multi-path routing (SDMR), based on the existing Traffic engineering eXplicit Control Protocol (TeXCP), is proposed and evaluated for traffic engineering in terrestrial networks. SDMR abandons the sophisticated stability maintenance mechanisms of TeXCP, whose load balancing scheme is also modified in the proposed mechanism. SDMR is proved to converge to a unique equilibrium state, which has been corroborated by simulations.
Ping LU Wenming ZHENG Ziyan WANG Qiang LI Yuan ZONG Minghai XIN Lenan WU
In this letter, a micro-expression recognition method is investigated that integrates both spatio-temporal facial features and a regression model. To this end, we first perform a multi-scale facial region division for each facial image and then extract a set of local binary patterns on three orthogonal planes (LBP-TOP) features corresponding to the divided facial regions of the micro-expression videos. Furthermore, we use a GSLSR model to build the linear regression relationship between the LBP-TOP facial feature vectors and the micro-expression label vectors. Finally, the learned GSLSR model is applied to predict the micro-expression categories for each test micro-expression video. Experiments are conducted on both the CASME II and SMIC micro-expression databases to evaluate the performance of the proposed method, and the results demonstrate that the proposed method outperforms the baseline micro-expression recognition method.
Jaeyong JU Taeyup SONG Bonhwa KU Hanseok KO
Key frame based video summarization has emerged as an important task for efficient video data management. This paper proposes a novel technique for key frame extraction based on chaos theory and color information. By applying chaos theory, a large content change between frames becomes more chaos-like and results in a more complex fractal trajectory in phase space. By exploiting the fractality measured in the phase space between frames, it is possible to evaluate inter-frame content changes invariant to the effects of fades and illumination change. In addition to this measure, a color histogram-based measure is also used to complement the chaos-based measure, which is sensitive to changes in camera/object motion. By comparing the last key frame with the current frame based on the proposed frame difference measure combining these two complementary measures, the key frames are robustly selected even in the presence of video fades, changes of illumination, and camera/object motion. The experimental results demonstrate its effectiveness, with significant improvement over the conventional method.
Jin XU Yuansong QIAO Zhizhong FU
Because the perceptual compressive sensing framework can achieve much better performance than the legacy compressive sensing framework, it is very promising for compressive sensing based image compression systems. In this paper, we propose an innovative adaptive perceptual block compressive sensing scheme. Firstly, a new block-based statistical metric is devised that can more appropriately measure each block's sparsity and perceptual sensibility. Then, the approximated theoretical minimum measurement number for each block is derived from the new block-based metric and used as a weight for adaptive measurement allocation. The experimental results show that our scheme can significantly enhance both the objective and subjective performance of a perceptual compressive sensing framework.
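As a generic illustration of weight-based measurement allocation (the per-block metric itself is the paper's contribution and is not reproduced here), the sketch below distributes a total measurement budget across blocks in proportion to per-block weights, subject to a minimum per block; the weights and budget are hypothetical.

```python
import numpy as np

def allocate_measurements(block_weights, total_measurements, min_per_block=4):
    """Split a global measurement budget across blocks proportionally to weights."""
    w = np.asarray(block_weights, dtype=float)
    w = w / w.sum()
    alloc = np.maximum(np.round(w * total_measurements).astype(int), min_per_block)
    # Trim or pad so the allocations sum exactly to the budget
    while alloc.sum() > total_measurements:
        alloc[np.argmax(alloc)] -= 1
    while alloc.sum() < total_measurements:
        alloc[np.argmin(alloc)] += 1
    return alloc

# Hypothetical per-block weights (e.g., larger for less sparse, more salient blocks)
print(allocate_measurements([0.5, 2.0, 1.0, 4.0], total_measurements=120))
```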
A non-linear extension of the generalized hyperplane approximation (GHA) method is introduced in this letter. Although GHA achieved high-confidence results in motion parameter estimation by utilizing a supervised learning scheme in the histogram of oriented gradients (HOG) feature space, it still has an unstable convergence range because it approximates the non-linear regression function from the feature space to the motion parameter space as a linear plane. To extend GHA into a non-linear regression with a larger convergence range, we derive the theoretical equations and verify this extension's effectiveness and efficiency over GHA through experimental results.
Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNN) trained with various kinds of images. However, the CNN representation is not very suitable for fine-grained image recognition tasks involving food image recognition. For improving performance of the CNN representation in food image recognition, we propose a novel image representation that is comprised of the covariances of convolutional layer feature maps. In the experiment on the ETHZ Food-101 dataset, our method achieved 58.65% averaged accuracy, which outperforms the previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.
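Computing the covariance of convolutional feature maps is straightforward to sketch: treat each of the C channels as a variable observed at H*W spatial positions and take the C x C covariance matrix, whose upper triangle gives a fixed-length descriptor. The shapes below are hypothetical and the sketch does not reproduce the paper's full pipeline.

```python
import numpy as np

def covariance_descriptor(feature_maps):
    """feature_maps: array of shape (C, H, W) from a convolutional layer.

    Returns the upper triangle of the C x C channel covariance matrix
    as a fixed-length image representation.
    """
    c = feature_maps.shape[0]
    x = feature_maps.reshape(c, -1)              # C variables, H*W observations
    cov = np.cov(x)                              # (C, C) covariance matrix
    iu = np.triu_indices(c)
    return cov[iu]                               # length C*(C+1)/2 descriptor

maps = np.random.default_rng(0).standard_normal((64, 14, 14))  # hypothetical layer output
print(covariance_descriptor(maps).shape)         # (2080,) = 64*65/2
```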
The Laplacian operator is a basic tool for image processing. For an image with regular pixels, the Laplacian operator can be represented as a stencil in which constant weights are arranged spatially to indicate which picture cells they apply to. However, in a discrete spherical image the pixels are irregular; thus, a stencil with constant weights is not suitable. In this paper, a spherical Laplacian operator is derived from Gauss's theorem, which makes it suitable for images with irregular pixels. The effectiveness of the proposed discrete spherical Laplacian operator is shown by the experimental results.
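For context, Gauss's (divergence) theorem leads to finite-volume Laplacian discretizations of the general form below, in which each irregular pixel i is treated as a small cell of area A_i with neighbors j across edges of length l_ij at distance d_ij; this is the standard construction such derivations start from, not the paper's specific operator.

```latex
\Delta f_i \;\approx\; \frac{1}{A_i} \sum_{j \in N(i)} \frac{l_{ij}}{d_{ij}} \left( f_j - f_i \right)
```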
Jiatian PI Keli HU Xiaolin ZHANG Yuzhang GU Yunlong ZHAN
Object tracking is one of the fundamental problems in computer vision. However, there is still a need to improve the overall capability in various tracking circumstances. In this letter, a patches-collaborative compressive tracking (PCCT) algorithm is presented. Experiments on various challenging benchmark sequences demonstrate that the proposed algorithm performs favorably against several state-of-the-art algorithms.
A non-photorealistic rendering method is presented that creates oil-film-like images, expressed with colorful, smooth curves similar to the oil films that form on the surface of glass or water, from color photo images. The proposed method generates oil-film-like images through iterative processing between a bilateral infra-envelope filter and an unsharp mask. In order to verify the effectiveness of the proposed method, tests using a Lena image were performed, and a visual assessment of the oil-film-like images was conducted for changes in appearance as the parameter values of the proposed method were varied. As a result of the tests, optimal parameter values for generating oil-film patterns were found.
Jaeyong JU Murray LOEW Bonhwa KU Hanseok KO
This paper presents a method for registering retinal images. Retinal image registration is crucial for the diagnoses and treatments of various eye conditions and diseases such as myopia and diabetic retinopathy. Retinal image registration is challenging because the images have non-uniform contrasts and intensity distributions, as well as having large homogeneous non-vascular regions. This paper provides a new retinal image registration method by effectively combining expectation maximization principal component analysis based mutual information (EMPCA-MI) with salient features. Experimental results show that our method is more efficient and robust than the conventional EMPCA-MI method.