Lihan TONG Weijia LI Qingxia YANG Liyuan CHEN Peng CHEN
Yinan YANG
Myung-Hyun KIM Seungkwang LEE
Shuoyan LIU Chao LI Yuxin LIU Yanqiu WANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Martin LUKAC Saadat NURSULTAN Georgiy KRYLOV Oliver KESZOCZE Abilmansur RAKHMETTULAYEV Michitaka KAMEYAMA
Zheqing ZHANG Hao ZHOU Chuan LI Weiwei JIANG
Liu ZHANG Zilong WANG Yindong CHEN
Wenxia Bao An Lin Hua Huang Xianjun Yang Hemu Chen
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Haruhiko KAIYA Shinpei OGATA Shinpei HAYASHI
Jiakai LI Jianyong DUAN Hao WANG Li HE Qing ZHANG
Yuxin HUANG Yuanlin YANG Enchang ZHU Yin LIANG Yantuan XIAN
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Na XING Lu LI Ye ZHANG Shiyi YANG
Zhe Wang Zhe-Ming Lu Hao Luo Yang-Ming Zheng
Rina TAGAMI Hiroki KOBAYASHI Shuichi AKIZUKI Manabu HASHIMOTO
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Hongzhi XU Binlian ZHANG
Weizhi WANG Lei XIA Zhuo ZHANG Xiankai MENG
Yuka KO Katsuhito SUDOH Sakriani SAKTI Satoshi NAKAMURA
Rinka KAWANO Masaki KAWAMURA
Zhishuo ZHANG Chengxiang TAN Xueyan ZHAO Min YANG
Peng WANG Guifen CHEN Zhiyao SUN
Zeyuan JU Zhipeng LIU Yu GAO Haotian LI Qianhang DU Kota YOSHIKAWA Shangce GAO
Ji WU Ruoxi YU Kazuteru NAMBA
Hao WANG Yao Ma Jianyong Duan Li HE Xin Li
Shijie WANG Xuejiao HU Sheng LIU Ming LI Yang LI Sidan DU
Arata KANEKO Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Qi LIU Bo WANG Shihan TAN Shurong ZOU Wenyi GE
HanYu Zhang Tomoji Kishi
Shinobu NAGAYAMA Tsutomu SASAO Jon T. BUTLER
Yoon Hak KIM
Takashi HIRAYAMA Rin SUZUKI Katsuhisa YAMANAKA Yasuaki NISHITANI
Yosuke IIJIMA Atsunori OKADA Yasushi YUMINAKA
Batnasan Luvaanjalba Elaine Yi-Ling Wu
KuanChao CHU Satoshi YAMAZAKI Hideki NAKAYAMA
Shenglei LI Haoran LUO Tengfei SHAO Reiko HISHIYAMA
Yasushi YUMINAKA Kazuharu NAKAJIMA Yosuke IIJIMA
Chunbo Liu Liyin Wang Zhikai Zhang Chunmiao Xiang Zhaojun Gu Zhi Wang Shuang Wang
Jia-ji JIANG Hai-bin WAN Hong-min SUN Tuan-fa QIN Zheng-qiang WANG
Yuhao LIU Zhenzhong CHU Lifei WEI
Ken ASANO Masanori NATSUI Takahiro HANYU
Shuto HASEGAWA Koichiro ENOMOTO Taeko MIZUTANI Yuri OKANO Takenori TANAKA Osamu SAKAI
Zhewei XU Mizuho IWAIHARA
Takao WAHO Akihisa KOYAMA Hitoshi HAYASHI
Taisei SAITO Kota ANDO Tetsuya ASAI
Shiyu YANG Tetsuya KANDA Daniel M. GERMAN Yoshiki HIGO
Tsutomu SASAO
Jiyeon LEE
Koichi MORIYAMA Akira OTSUKA
Hongliang FU Qianqian LI Huawei TAO Chunhua ZHU Yue XIE Ruxue GUO
Gao WANG Gaoli WANG Siwei SUN
Hua HUANG Yiwen SHAN Chuan LI Zhi WANG
Zhi LIU Heng WANG Yuan LI Hongyun LU Hongyuan JING Mengmeng ZHANG
Tomoyasu NAKANO Masataka GOTO
Hyebong CHOI Joel SHIN Jeongho KIM Samuel YOON Hyeonmin PARK Hyejin CHO Jiyoung JUNG
Xianglong LI Yuan LI Jieyuan ZHANG Xinhai XU Donghong LIU
Haoran LUO Tengfei SHAO Shenglei LI Reiko HISHIYAMA
Chang SUN Yitong LIU Hongwen YANG
Ji XI Yue XIE Pengxu JIANG Wei JIANG
Ming PAN
Shinpei HAYASHI Keisuke ASANO Motoshi SAEKI
Goal refinement is a crucial step in goal-oriented requirements analysis to create a goal model of high quality. Poor goal refinement leads to missing requirements and eliciting incorrect requirements as well as less comprehensiveness of produced goal models. This paper proposes a technique to automate detecting bad smells of goal refinement, symptoms of poor goal refinement. At first, to clarify bad smells, we asked subjects to discover poor goal refinement concretely. Based on the classification of the specified poor refinement, we defined four types of bad smells of goal refinement: Low Semantic Relation, Many Siblings, Few Siblings, and Coarse Grained Leaf, and developed two types of measures to detect them: measures on the graph structure of a goal model and semantic similarity of goal descriptions. We have implemented a supporting tool to detect bad smells and assessed its usefulness by an experiment.
Yotaro SEKI Shinpei HAYASHI Motoshi SAEKI
Use case modeling is popular to represent the functionality of the system to be developed, and it consists of two parts: a use case diagram and use case descriptions. Use case descriptions are structured text written in natural language, and the usage of natural language can lead to poor descriptions such as ambiguous, inconsistent, and/or incomplete descriptions. Poor descriptions lead to missing requirements and eliciting incorrect requirements as well as less comprehensiveness of the produced use case model. This paper proposes a technique to automate detecting bad smells of use case descriptions, i.e., symptoms of poor descriptions. At first, to clarify bad smells, we analyzed existing use case models to discover poor use case descriptions concretely and developed the list of bad smells, i.e., a catalog of bad smells. Some of the bad smells can be refined into measures using the Goal-Question-Metric paradigm to automate their detection. The main contributions of this paper are the developed catalog of bad smells and the automated detection of these bad smells. We have implemented an automated smell detector for 22 bad smells at first and assessed its usefulness by an experiment. As a result, the first version of our tool got a precision ratio of 0.591 and a recall ratio of 0.981. Through evaluating our catalog and the automated tool, we found additional six bad smells and two metrics. Then, we obtained the precision of 0.596 and the recall of 1.000 by our final version of the automated tool.
With the rapid growth of computation requirements in the Internet of Things, resource-limited edge servers usually need to cooperate to perform tasks. Most related studies assume a static cooperation approach, which might not suit the dynamic environment of edge computing. In this paper, we consider a dynamic cooperation approach that guides edge servers to form coalitions dynamically. This raises two issues: 1) how to guide them to form coalitions optimally and 2) how to cope with the dynamic feature whereby server statuses change as tasks are performed. The coalitional Markov decision process (CMDP) model proposed in our previous work can handle these issues well. However, its basic solution, coalitional Q-learning, cannot handle large-scale problems in which the number of tasks is large. In response, we propose a novel algorithm called deep coalitional Q-learning (DCQL). To sum up, we first formulate the dynamic cooperation problem of edge servers as a CMDP: each edge server is regarded as an agent, and the dynamic process is modeled as an MDP in which the agents observe the current state to form several coalitions. Each coalition takes an action that affects the environment, which then transitions to the next state, and the process repeats. Then, we propose DCQL, which includes a deep neural network and can therefore cope with large-scale problems. DCQL can guide the edge servers to form coalitions dynamically with the aim of optimizing a given goal. Furthermore, we run experiments to verify the effectiveness of the proposed algorithm in different settings.
Rizal Setya PERDANA Yoshiteru ISHIDA
This study presents a formulation for generating context-aware natural language by machine from visual representation. Given an image sequence input, the visual storytelling task (VST) aims to generate a coherent, object-focused, and contextualized sentence story. Previous works in this domain faced a problem in modeling an architecture that works in temporal multi-modal data, which led to a low-quality output, such as low lexical diversity, monotonous sentences, and inaccurate context. This study introduces a further improvement, that is, an end-to-end architecture, called cross-modal contextualize attention, optimized to extract visual-temporal features and generate a plausible story. Visual object and non-visual concept features are encoded from the convolutional feature map, and object detection features are joined with language features. Three scenarios are defined in decoding language generation by incorporating weights from a pre-trained language generation model. Extensive experiments are conducted to confirm that the proposed model outperforms other models in terms of automatic metrics and manual human evaluation.
Hideaki OHASHI Toshiyuki SHIMIZU Masatoshi YOSHIKAWA
Peer assessment in education has pedagogical benefits and is a promising method for grading a large number of submissions. At the same time, student reliability has been regarded as a problem; consequently, various methods of estimating highly reliable grades from scores given by multiple students have been proposed. Under most of the existing methods, a nonadaptive allocation pattern, which performs allocation in advance, is assumed. In this study, we analyze the effect of student-submission allocation on score estimation in peer assessment under a nonadaptive allocation setting. We examine three types of nonadaptive allocation methods, random allocation, circular allocation and group allocation, which are considered the commonly used approaches among the existing nonadaptive peer assessment methods. Through simulation experiments, we show that circular allocation and group allocation tend to yield lower accuracy than random allocation. Then, we utilize this result to improve the existing adaptive allocation method, which performs allocation and assessment in parallel and tends to make similar allocation result to circular allocation. We propose the method to replace part of the allocation with random allocation, and show that the method is effective through experiments.
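The three nonadaptive allocation patterns named above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not define the patterns, so the definitions used here (a circular shift over student indices, disjoint consecutive groups, and unbalanced uniform sampling) are assumptions, as are all function and parameter names.

```python
import random


def random_allocation(n, k, seed=0):
    """Assumed definition: each student reviews k distinct submissions,
    sampled uniformly from all submissions except their own.
    Review counts per submission are not balanced in this sketch."""
    rng = random.Random(seed)
    return {s: rng.sample([t for t in range(n) if t != s], k) for s in range(n)}


def circular_allocation(n, k):
    """Assumed definition: student s reviews the next k submissions
    in index order, wrapping around at n."""
    return {s: [(s + d) % n for d in range(1, k + 1)] for s in range(n)}


def group_allocation(n, group_size):
    """Assumed definition: students are split into consecutive groups and
    each student reviews every other submission in the same group."""
    allocation = {}
    for start in range(0, n, group_size):
        group = list(range(start, min(start + group_size, n)))
        for s in group:
            allocation[s] = [t for t in group if t != s]
    return allocation


if __name__ == "__main__":
    print(circular_allocation(5, 2))
    print(group_allocation(6, 3))
    print(random_allocation(5, 2))
```

Under these assumed definitions, circular and group allocation fix which submissions are compared with which, whereas random allocation mixes reviewers across the whole class.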
Tomohiro YAMAZAKI Hisashi KOGA
We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.
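The idea of touching only the sets whose similarity can change between T-1 and T can be illustrated with a small sketch. It is not the paper's algorithm: the similarity measure and the frequency-based inverted lists are not specified in the abstract, so this example assumes a stand-in similarity (the number of window positions whose item belongs to the set) and a plain inverted index. The class name `SlidingOverlap` and its methods are hypothetical.

```python
from collections import defaultdict, deque
import heapq


class SlidingOverlap:
    """Maintains, for each database set, how many of the latest `window`
    stream items belong to it (a stand-in similarity, assumed here)."""

    def __init__(self, database, window):
        self.window = window
        self.items = deque()
        self.overlap = [0] * len(database)
        # Plain inverted index: item -> indices of the sets containing it.
        self.inverted = defaultdict(list)
        for idx, members in enumerate(database):
            for item in set(members):
                self.inverted[item].append(idx)

    def push(self, item):
        """Advance the stream by one item; only sets containing the
        arriving or the expiring item are updated, each in O(1)."""
        self.items.append(item)
        for idx in self.inverted.get(item, ()):
            self.overlap[idx] += 1
        if len(self.items) > self.window:
            expired = self.items.popleft()
            for idx in self.inverted.get(expired, ()):
                self.overlap[idx] -= 1

    def top_k(self, k):
        """Indices of the k most similar sets (ties broken by lower index).
        This recomputes the ranking; it is not an incremental top-k."""
        return heapq.nlargest(
            k, range(len(self.overlap)), key=lambda i: (self.overlap[i], -i)
        )


if __name__ == "__main__":
    tracker = SlidingOverlap([{1, 2}, {2, 3}, {4}], window=2)
    for x in (1, 2, 3):
        tracker.push(x)
    print(tracker.overlap, tracker.top_k(1))
```

Each `push` costs time proportional to the number of sets containing the arriving and expiring items, rather than to the size of the database.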
Yutaro BESSHO Yuto HAYAMIZU Kazuo GODA Masaru KITSUREGAWA
Parallel processing is a typical approach to answering analytical queries on large databases. As the size of the database increases, we often try to increase the parallelism by incorporating more processing nodes. However, this approach also increases the possibility of node failure. According to conventional practice, if a failure occurs during query processing, the database system restarts the query processing from the beginning. Such a temporal cost may be unacceptable to the user. This paper proposes a fault-tolerant query processing mechanism, named PhoeniQ, for analytical parallel database systems. PhoeniQ continuously takes a checkpoint for every operator pipeline and replicates the output of each stateful operator among different processing nodes. If a single processing node fails during query processing, another can promptly take over the processing. Hence, PhoeniQ allows the database system to efficiently resume query processing after a partial failure event. This paper presents the key design of PhoeniQ and prototype-based experiments demonstrating that PhoeniQ imposes negligible performance overhead and efficiently continues query processing in the face of node failure.
Da LI Yuanyuan WANG Rikuya YAMAMOTO Yukiko KAWAI Kazutoshi SUMIYA
Recently, machine learning approaches and user movement history analysis on mobile devices have attracted much attention. Generally, we need to apply text data into the word embedding tool for acquiring word vectors as the preprocessing of machine learning approaches. However, it is difficult for mobile devices to afford the huge cost of high-dimensional vector calculation. Thus, a low-cost user behavior and user movement history analysis approach should be considered. To address this issue, firstly, we convert the zip code and street house number into vectors instead of textual address information to reduce the cost of spatial vector calculation. Secondly, we propose a low-cost high-performance semantic and physical distance (real distance) calculation method that applied zip-code-based vectors. Finally, to verify the validity of our proposed method, we utilize the US zip code data to calculate both semantic and physical distances and compare their results with the previous method. The experimental results showed that our proposed method could significantly improve the performance of distance calculation and effectively control the cost to a low level.
Tomoya HASHIGUCHI Takehiro YAMAMOTO Sumio FUJITA Hiroaki OHSHIMA
In this study, we generate dialogue contents in which two systems discuss their distress with each other. The user inputs sentences that include environment and feelings of distress. The system generates the dialogue content from the input. In this study, we created dialogue data about distress in order to generate them using deep learning. The generative model fine-tunes the GPT of the pre-trained model using the TransferTransfo method. The contribution of this study is the creation of a conversational dataset using publicly available data. This study used EmpatheticDialogues, an existing empathetic dialogue dataset, and Reddit r/offmychest, a public data set of distress. The models fine-tuned with each data were evaluated both automatically (such as by the BLEU and ROUGE scores) and manually (such as by relevance and empathy) by human assessors.
Distributed edge cloud computing is an important computation infrastructure for the Internet of Things (IoT), and its task offloading problem has attracted much attention recently. Most existing work on task offloading in distributed edge cloud computing assumes that each self-interested user owns one edge server and chooses whether to execute its tasks locally or to offload them to cloud servers. The goal of each edge server is to maximize its own interest, such as a low delay cost, which corresponds to a non-cooperative setting. However, with the strong development of smart IoT communities such as smart hospitals and smart factories, all edge and cloud servers can belong to one organization, such as a technology company. This corresponds to a cooperative setting in which the goal of the organization is to maximize the team interest of the overall edge cloud computing system. In this paper, we consider a new problem called cooperative task offloading, in which all edge servers cooperate so that the entire edge cloud computing system achieves good performance, such as low delay cost and low energy cost. However, this problem is hard to solve because of two issues: 1) the status of each edge server changes dynamically and task arrival is uncertain; 2) each edge server can observe only its own status, which makes it hard to optimize the team interest because global information is unavailable. To address these issues, we formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP), which can handle the dynamic features under partial observations. Then, we apply a multi-agent reinforcement learning algorithm called value decomposition network (VDN) and propose a VDN-based task offloading algorithm (VDN-TO) to solve the problem. Specifically, we use a team value function to evaluate the team interest, which is then divided into individual value functions for each edge server. Then, each edge server updates its individual value function in the direction that maximizes the team interest. Finally, we use part of a real dataset to evaluate our algorithm, and the results show its effectiveness in comparison with several existing methods.
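The division of a team value function into individual value functions can be illustrated with the additive decomposition used by value decomposition networks, where the team value of a joint action is the sum of the individual values. The sketch below is illustrative only and does not reproduce VDN-TO: the Q-values are fixed lists rather than learned networks, and the action encoding (0 = execute locally, 1 = offload) and function names are assumptions.

```python
from itertools import product


def team_value(individual_q, joint_action):
    """Additive decomposition: the team value of a joint action is the
    sum of each agent's own value for its chosen action."""
    return sum(q[a] for q, a in zip(individual_q, joint_action))


def greedy_joint_action(individual_q):
    """Each agent picks its own best action using only its own values.
    Because the team value is a sum, this also maximizes the team value."""
    return [max(range(len(q)), key=q.__getitem__) for q in individual_q]


if __name__ == "__main__":
    # Hypothetical per-server values; index 0 = run locally, 1 = offload.
    individual_q = [[0.2, 1.0], [0.5, 0.1], [0.3, 0.4]]
    joint = greedy_joint_action(individual_q)
    print(joint, team_value(individual_q, joint))

    # Brute-force check over all joint actions.
    best = max(
        product(*(range(len(q)) for q in individual_q)),
        key=lambda ja: team_value(individual_q, ja),
    )
    print(list(best) == joint)
```

The additive form is what allows decentralized execution: each server needs only its own observation and its own value function to act, while training can still target the team objective.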
Kento SUGIURA Yoshiharu ISHIKAWA
With the rapid increase in the number of CPU cores, software that can utilize these many cores is required. A lock-free algorithm based on compare-and-swap (CAS) operations is one of the concurrency control methods used to implement such multi-threaded software. A multi-word CAS (MwCAS) operation is an extension of a CAS operation that swaps multiple words atomically. However, we noticed that the performance of the existing MwCAS implementation is limited by garbage collection even in a low-contention environment. To achieve high performance in low-contention workloads, we propose a new MwCAS algorithm without garbage collection. Experimental results show that our approach is three to five times faster than the implementation with garbage collection in low-contention workloads. Moreover, the performance of the proposed method is also superior in a high-contention environment.
Hiroshi UEHARA Yasuhiro IUCHI Yusuke FUKAZAWA Yoshihiro KANETA
This study aims to predict the date of ear emergence of rice plants based on cropping records covering 25 years. Predicting ear emergence is known to be crucial for achieving good harvest quality, and it has long depended on experienced farmers who have acquired intuitive prediction skills through long-term practice. As farmers age, data-driven approaches to this prediction have been pursued. Nevertheless, they are not necessarily sufficient for practical use. One issue is that they adopt the weather forecast as a feature, so the predictive performance varies with the accuracy of the forecast. The other issue is that the performance varies by region, and regional characteristics have not been used as features for the prediction. Against this background, we propose a feature engineering method that quantifies hidden regional characteristics as a feature for the prediction. Furthermore, the feature is engineered based only on observational data, without any forecast. Applying our proposal to the cropping records resulted in sufficient predictive performance, with an RMSE of ±2.69 days.
Tongzhou QU Zibin DAI Yanjiang LIU Lin CHEN Xianzhao XIA
The existing research on Amdahl's law is limited to multi/many-core processors, and cannot be applied to the important parallel processing architecture of coarse-grained reconfigurable arrays. This paper studies the relation between the multi-level parallelism of block cipher algorithms and the architectural characteristics of coarse-grain reconfigurable arrays. We introduce the key variables that affect the performance of reconfigurable arrays, such as communication overhead and configuration overhead, into Amdahl's law. On this basis, we propose a performance model for coarse-grain reconfigurable block cipher array (CGRBA) based on the extended Amdahl's law. In addition, this paper establishes the optimal integer nonlinear programming model, which can provide a parameter reference for the architecture design of CGRBA. The experimental results show that: (1) reducing the communication workload ratio and increasing the number of configuration pages reasonably can significantly improve the algorithm performance on CGRBA; (2) the communication workload ratio has a linear effect on the execution time.
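For reference, the classical form of Amdahl's law that the paper extends can be computed as below. This sketch covers only the textbook law; the communication and configuration overhead terms of the extended model are not given in the abstract and are not reproduced here. The function and parameter names are illustrative.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Classical Amdahl's law: speedup of a workload whose parallelizable
    fraction is `parallel_fraction` when run on `n_units` processing units."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(0.9, n), 3))
```

The serial fraction bounds the achievable speedup at 1 / (1 - parallel_fraction) regardless of how many units are added, which is why additional overhead terms matter when modeling reconfigurable arrays.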
Takashi ISHIO Naoto MAEDA Kensuke SHIBUYA Kenho IWAMOTO Katsuro INOUE
Software developers may write a number of similar source code fragments that include the same mistake in software products. To remove such faulty code fragments, developers inspect code clones when they find a bug in their code. While various code clone detection methods have been proposed to identify clones of either code blocks or functions, those tools do not always fit the code inspection task because a faulty code fragment may be much smaller than a code block, e.g., a single line of code. To enable developers to search for code clones of such a small faulty code fragment in a large-scale software product, we propose a method using the Lempel-Ziv Jaccard Distance, which is an approximation of the Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The results show that our method efficiently reports cloned faulty code fragments and that its performance is acceptable for software developers.
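The distance measure named above can be sketched in a few lines. This is a simplified illustration of the Lempel-Ziv Jaccard Distance, not the authors' search method: it builds the set of Lempel-Ziv phrases for each byte string and takes the exact Jaccard distance between the two sets, omitting the hashing and min-hash sampling that practical implementations use for speed. The function names and the sample code fragments are hypothetical.

```python
def lz_set(data):
    """Set of phrases from a Lempel-Ziv style parse: extend the current
    phrase until it has not been seen before, record it, then start anew."""
    phrases = set()
    start = 0
    end = 1
    while end <= len(data):
        phrase = data[start:end]
        if phrase not in phrases:
            phrases.add(phrase)
            start = end
        end += 1
    return phrases


def lzjd(a, b):
    """Jaccard distance between the Lempel-Ziv phrase sets of a and b."""
    set_a, set_b = lz_set(a), lz_set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty inputs are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


if __name__ == "__main__":
    faulty = b"if (ptr = NULL) return;"
    clone = b"if (buf = NULL) return;"
    unrelated = b"for (i = 0; i < n; i++) sum += a[i];"
    print(lzjd(faulty, clone), lzjd(faulty, unrelated))
```

Because the distance works on raw bytes, it can be applied to fragments as small as a single line without parsing the code into blocks or functions.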
Xudong YANG Ling GAO Yan LI Jipeng XU Jie ZHENG Hai WANG Quanli GAO
With the popularity and development of Location-Based Services (LBS), location privacy preservation has become a hot research topic in recent years, especially research on k-anonymity. Although previous studies have done considerable work on anonymity-based privacy protection, several challenges remain far from solved, such as the negative impact on the security of anonymity of the semantic information that comes from anonymous locations and query content. To address these semantic challenges, we propose a dual privacy preservation scheme based on a multi-anonymizer architecture. Unlike existing approaches, our method enhances location privacy by integrating location anonymity with encrypted queries. First, a query encryption method that combines an improved Shamir mechanism with multiple anonymizers is proposed to enhance query safety. Second, we design an anonymity method that enhances semantic location privacy through anonymous locations that satisfy personal semantic diversity and replace sensitive semantic locations. Finally, experiments on a real dataset show that our algorithms provide much better privacy and usability than previous solutions.
Ruijun MA Stefan HOLST Xiaoqing WEN Aibin YAN Hui XU
As modern CMOS circuits fabricated with advanced technology nodes are becoming more and more susceptible to soft-errors, many hardened latches have been proposed for reliable LSI designs. We reveal for the first time that production defects in such hardened latches can cause two serious problems: (1) these production defects are difficult to detect with conventional scan test and (2) these production defects can reduce the reliability of hardened latches. This paper systematically addresses these two problems with three major contributions: (1) Post-Test Vulnerability Factor (PTVF), a first-of-its-kind metric for quantifying the impact of production defects on hardened latches, (2) a novel Scan-Test-Aware Hardened Latch (STAHL) design that has the highest defect coverage compared to state-of-the-art hardened latch designs, and (3) an STAHL-based scan test procedure. Comprehensive simulation results demonstrate the accuracy of the proposed PTVF metric and the effectiveness of the STAHL-based scan test. As the first comprehensive study bridging the gap between hardened latch design and LSI testing, the findings of this paper will significantly improve the soft-error-related reliability of LSI designs for safety-critical applications.
Ryota YOSHIMURA Ichiro MARUTA Kenji FUJIMOTO Ken SATO Yusuke KOBAYASHI
Particle filters have been widely used for state estimation problems in nonlinear and non-Gaussian systems. Their performance depends on the given system and measurement models, which need to be designed by the user for each target system. This paper proposes a novel method to design these models for a particle filter. This is a numerical optimization method, where the particle filter design process is interpreted into the framework of reinforcement learning by assigning the randomnesses included in both models of the particle filter to the policy of reinforcement learning. In this method, estimation by the particle filter is repeatedly performed and the parameters that determine both models are gradually updated according to the estimation results. The advantage is that it can optimize various objective functions, such as the estimation accuracy of the particle filter, the variance of the particles, the likelihood of the parameters, and the regularization term of the parameters. We derive the conditions to guarantee that the optimization calculation converges with probability 1. Furthermore, in order to show that the proposed method can be applied to practical-scale problems, we design the particle filter for mobile robot localization, which is an essential technology for autonomous navigation. By numerical simulations, it is demonstrated that the proposed method further improves the localization accuracy compared to the conventional method.
Fei ZHANG Peining ZHEN Dishan JING Xiaotang TANG Hai-Bao CHEN Jie YAN
Intrusion is one of the major security issues of the Internet given the rapid growth in smart and Internet of Things (IoT) devices, and it has become important to detect attacks and raise alarms in IoT systems. In this paper, a method based on the support vector machine (SVM) and principal component analysis (PCA) is used to detect attacks in smart IoT systems. An SVM with a nonlinear scheme is used for intrusion classification, and PCA is adopted for feature selection on the training and testing datasets. Experiments on the NSL-KDD dataset show that the test accuracy of the proposed method reaches 82.2% with 16 features selected by PCA for binary classification, which is almost the same as the result obtained with all 41 features, and that the test accuracy reaches 78.3% with 29 features selected by PCA for multi-class classification, compared with 79.6% without feature selection. The Denial of Service (DoS) attack detection accuracy of the proposed method achieves an 8.8% improvement over an existing artificial neural network based method.
Although deep neural networks (DNNs) have achieved high performance across a variety of applications, they can often be deceived by adversarial examples that are generated by adding small perturbations to the original images. Adversaries may generate adversarial examples using the property of transferability, in which adversarial examples that deceive one model can also deceive other models because adversaries do not obtain any information on the DNNs deployed in real scenarios. Recent studies show that adversarial examples with feature space perturbations are more transferable than others. Adversarial training is an effective method to defend against adversarial attacks. However, it results in a decrease in the classification accuracy for natural images, and it is not sufficiently robust against transferable adversarial examples because it does not consider adversarial examples with feature space perturbations. We propose a novel adversarial training method to train DNNs to be robust against transferable adversarial examples and maximize their classification accuracy for natural images. The proposed method trains DNNs to correctly classify natural images and adversarial examples and also minimize the feature differences between them. The robustness of the proposed method was similar to those of the previous adversarial training methods for MNIST dataset and was up to average 6.13% and 9.24% more robust against transfer adversarial examples for CIFAR-10 and CIFAR-100 datasets, respectively. In addition, the proposed method yielded an average classification accuracy that was approximately 0.53%, 6.82%, and 10.60% greater than some state-of-the-art adversarial training methods for all datasets, respectively. The proposed method is robust against a variety of transferable adversarial examples, which enables its implementation in security applications that may benefit from high-performance classification but are at high risk of attack.
Kana MIYAMOTO Hiroki TANAKA Satoshi NAKAMURA
Music is often used for emotion induction because it can change the emotions of people. However, since we subjectively feel different emotions when listening to music, we propose an emotion induction system that generates music that is adapted to each individual. Our system automatically generates suitable music for emotion induction based on the emotions predicted from an electroencephalogram (EEG). We examined three elements for constructing our system: 1) a music generator that creates music that induces emotions that resemble the inputs, 2) emotion prediction using EEG in real-time, and 3) the control of a music generator using the predicted emotions for making music that is suitable for inducing emotions. We constructed our proposed system using these elements and evaluated it. The results showed its effectiveness for inducing emotions and suggest that feedback loops that tailor stimuli to individuals can successfully induce emotions.
Tingting HU Ryuji FUCHIKAMI Takeshi IKENAGA
High frame rate and ultra-low delay vision systems, which can finish reading and processing a 1000 fps sequence within 1 ms/frame, are drawing increasing attention in robotics, which requires immediate feedback from the image processing core. Meanwhile, tracking plays an important role in many computer vision applications. Among various tracking algorithms, Lucas-Kanade (LK)-based template tracking, which tracks targets with high, sub-pixel-level accuracy, is key to robotic applications such as factory automation (FA). However, the substantial spatial iterative processing and complex computation in the LK algorithm make it difficult to achieve high frame rate and ultra-low delay tracking with limited resources. Aiming at an LK-based template tracking system that reads and processes 1000 fps sequences within 1 ms/frame at a small resource cost, this paper proposes: 1) high temporal resolution-based temporal iterative tracking, which maps the spatial iterations into the temporal domain and thereby efficiently reduces the resource cost and delay caused by spatial iterative processing; 2) label scanner-based multi-stream spatial processing, which maps the local spatial processing onto the labeled input pixel stream and aggregates the results with a label scanner, making it possible to implement the local spatial processing of the LK algorithm at a small resource cost. Algorithm evaluation shows that the proposed temporal iterative tracking performs dynamic tracking: it tracks an object with coarse accuracy when the object is moving fast and achieves higher accuracy when it slows down. Hardware evaluation shows that the proposed label scanner-based multi-stream architecture allows the system to be implemented on an FPGA (zcu102) with a resource cost of less than 20%, and that the designed tracking system supports reading and processing a 1000 fps sequence within 1 ms/frame.
Rubin ZHAO Xiaolong ZHENG Zhihua YING Lingyan FAN
Most existing object detection and text detection methods are designed to detect either text or objects. In scenarios where the task is to find the target word pointed at by an object, the results of existing methods are far from satisfactory. However, such scenarios occur often in human-computer interaction, when the computer needs to figure out which word the user is pointing at. Compared with object detection, pointed-at word localization (PAWL) requires higher accuracy, especially in dense text scenarios. Moreover, in printed documents, characters are much smaller than those in scene text detection datasets such as ICDAR-2013, ICDAR-2015, and ICPR-2018. To address these problems, the authors propose a novel target word localization network (TWLN) to detect the pointed-at word in printed documents. In this work, a single deep neural network is trained to extract the features of markers and text sequentially. For each image, the location of the marker is predicted first; according to the predicted location, a smaller image is cropped from the original image and fed into the same network, and the location of the pointed-at word is then predicted. To train and test the networks, an efficient approach is proposed to generate the dataset from PDF documents by inserting markers pointing at words in the documents, which avoids laborious labeling work. Experiments on the proposed dataset demonstrate that TWLN outperforms the compared object detection method and optical character recognition method on every category of targets, especially when the target is a single character that occupies only several pixels in the image. TWLN is also tested with real photographs, and the accuracy shows no significant difference, which supports the validity of the generating method used to construct the dataset.
Weiguo ZHANG Jiaqi LU Jing ZHANG Xuewen LI Qi ZHAO
Haze seriously degrades the quality of license plate recognition and reduces the performance of visual processing algorithms. To improve the quality of hazy images, this paper proposes a license plate recognition algorithm for hazy weather. The algorithm consists of two parts. The first part is MPGAN image dehazing, which uses a generative adversarial network to dehaze the image and combines multi-scale convolution with a perceptual loss. Multi-scale convolution is conducive to better feature extraction, and the perceptual loss compensates for the shortcoming that the mean square error (MSE) is greatly affected by outliers. The second part recognizes the license plate: YOLOv3 first locates the license plate, the STN network then corrects it, and finally the corrected plate is fed into the improved LPRNet network to obtain the license plate information. Experimental results show that the proposed dehazing model achieves good results, with PSNR and SSIM scores better than those of other representative algorithms. Compared against the LPRNet algorithm, the proposed license plate recognition algorithm reaches an average accuracy of 93.9%.
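The remark about MSE and outliers, and the PSNR metric used in the evaluation, can be made concrete with a small sketch. The pixel values below are invented for illustration and have nothing to do with the paper's data; `max_value=255.0` assumes 8-bit images. PSNR is computed with its standard definition, 10·log10(MAX² / MSE).

```python
import math


def mse(reference, distorted):
    """Mean square error between two equally sized pixel sequences."""
    diffs = [(r - d) ** 2 for r, d in zip(reference, distorted)]
    return sum(diffs) / len(diffs)


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical inputs)."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical 4-pixel images with the same total absolute error (40).
reference = [0, 0, 0, 0]
uniform_noise = [10, 10, 10, 10]   # error spread evenly over all pixels
single_outlier = [0, 0, 0, 40]     # same total error in a single pixel

print(mse(reference, uniform_noise), mse(reference, single_outlier))
print(psnr(reference, uniform_noise), psnr(reference, single_outlier))
```

Although both distorted images deviate from the reference by the same total amount, the single outlier yields four times the MSE and therefore a lower PSNR, which is the sensitivity that a perceptual loss is meant to offset.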
Wen SHAO Rei KAWAKAMI Takeshi NAEMURA
Previous studies on anomaly detection in videos have trained detectors to perform reconstruction and prediction tasks on normal data, so that frames on which task performance is low are detected as anomalies during testing. This paper proposes a new approach that involves sorting video clips by using a generative network structure. Our approach learns spatial contexts from appearances and temporal contexts from the order relationship of the frames. Experiments were conducted on four datasets, and we categorized the anomalous sequences by appearance and motion. Evaluations were conducted not only on each total dataset but also on each of the categories. Our method improved detection performance both on anomalies whose appearance differs from normality and on anomalies whose motion differs from normality. Moreover, combining our approach with a prediction method produced improvements in precision at a high recall.
Sejin JUNG Eui-Sub KIM Junbeom YOO
Traditional safety analysis techniques have difficulty incorporating the dynamically changing structures of CPSs (Cyber-Physical Systems). STPA (System-Theoretic Process Analysis), one of the most widely used techniques, needs to unfold and arrange all hidden structures before a full-fledged analysis can begin. This paper proposes an intermediate model, the “Information Unfolding Model (IUM)”, and a process, the “Information Unfolding Process (IUP)”, to unfold the dynamic structures hidden in CPSs and thus help analysts construct control structures in STPA thoroughly.
Convolutional neural networks (CNNs) have made extraordinary progress in image classification tasks. However, using a CNN directly to detect image manipulation is less effective. To address this problem, we propose an image filtering layer and a multi-scale feature fusion module that guide the model to perform image manipulation detection more accurately and effectively. A series of experiments shows that our model improves image manipulation detection compared with previous research.
Hao FANG Chi-Hua CHEN Dewang CHEN Feng-Jang HWANG
Aiming for accurate data-driven predictions of passenger walking time, this study proposes a novel neuron-network-based mixture probability (NNBMP) model with repetition learning (RL) to estimate the probability density distribution of passenger walking time (PWT) in the metro station. Experiments conducted for Fuzhou metro stations demonstrate that the proposed NNBMP-RL model achieved a mean absolute error, mean square error, and mean absolute percentage error of 0.0078, 1.33 × 10^-4, and 19.41%, respectively, and that it outperformed all seven compared models. The developed NNBMP model, which accurately fits the PWT distribution in the metro station, is readily applicable to microscopic analyses of passenger flow.
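For reference, the three error measures reported above follow their standard definitions, which can be sketched as below. The abstract does not state which quantities were compared, so the sample values are purely hypothetical and are not the study's data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_square_error(y_true, y_pred):
    """Average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred):
    """Average of |true - predicted| / |true|, in percent.

    Assumes that no true value is zero.
    """
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
y_true = [2.0, 4.0]
y_pred = [1.0, 5.0]

print(mean_absolute_error(y_true, y_pred))             # 1.0
print(mean_square_error(y_true, y_pred))               # 1.0
print(mean_absolute_percentage_error(y_true, y_pred))  # 37.5
```

Because the percentage error divides by the true value, small true values can make it large even when the absolute and square errors are small.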
This letter proposes a post-processing method to improve the smoothness and safety of the path for an autonomous vehicle navigating in an urban environment. The proposed method transforms the initial path given by local path planning algorithms using a stochastic approach to improve its smoothness and safety. Using the proposed method, the initial path is efficiently transformed by iteratively updating the position of each waypoint within it. The proposed method also guarantees the feasibility of the transformed path. Experimental results verify that the proposed method can improve the smoothness and safety of the initial path and ensure the feasibility of the transformed path.
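The general pattern of iteratively updating one waypoint at a time with a stochastic rule can be sketched as follows. This is a generic random-perturbation hill climb, not the letter's method: the cost terms, the circular obstacle model `(x, y, radius)`, the parameter values and all function names are assumptions. Feasibility is checked at waypoints only, not along the segments between them.

```python
import math
import random


def smoothness_cost(path):
    """Sum of squared second differences (a discrete curvature proxy)."""
    total = 0.0
    for i in range(1, len(path) - 1):
        ax = path[i - 1][0] - 2.0 * path[i][0] + path[i + 1][0]
        ay = path[i - 1][1] - 2.0 * path[i][1] + path[i + 1][1]
        total += ax * ax + ay * ay
    return total


def clearance(point, obstacles):
    """Distance from a point to the nearest obstacle boundary."""
    return min((math.hypot(point[0] - ox, point[1] - oy) - r
                for ox, oy, r in obstacles), default=math.inf)


def safety_cost(path, obstacles, safe_dist):
    """Penalise waypoints that are closer than safe_dist to an obstacle."""
    return sum(max(0.0, safe_dist - clearance(p, obstacles)) ** 2
               for p in path)


def total_cost(path, obstacles, safe_dist, w_safe):
    return smoothness_cost(path) + w_safe * safety_cost(path, obstacles, safe_dist)


def is_feasible(point, obstacles):
    """A waypoint is feasible if it lies outside every obstacle."""
    return clearance(point, obstacles) > 0.0


def refine_path(path, obstacles, iterations=2000, step=0.2,
                safe_dist=1.0, w_safe=1.0, seed=0):
    """Randomly perturb interior waypoints, keeping feasible improvements."""
    rng = random.Random(seed)
    path = [tuple(p) for p in path]
    if len(path) < 3:
        return path  # no interior waypoint to move
    best = total_cost(path, obstacles, safe_dist, w_safe)
    for _ in range(iterations):
        i = rng.randrange(1, len(path) - 1)  # endpoints stay fixed
        cand = (path[i][0] + rng.uniform(-step, step),
                path[i][1] + rng.uniform(-step, step))
        if not is_feasible(cand, obstacles):
            continue  # reject moves that would enter an obstacle
        trial = path[:i] + [cand] + path[i + 1:]
        cost = total_cost(trial, obstacles, safe_dist, w_safe)
        if cost < best:
            path, best = trial, cost
    return path


initial = [(0, 0), (1, 1), (2, -1), (3, 1), (4, 0)]
obstacles = [(2.0, 2.0, 0.5)]
refined = refine_path(initial, obstacles)
print(refined)
```

Since a candidate is accepted only when it is feasible and lowers the combined cost, the sketch never makes a feasible initial path infeasible at its waypoints and never increases the cost.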
Zhimin GUO Jianfei CHEN Sheng ZHANG
Millimeter wave synthetic aperture interferometric radiometers (SAIR) are very powerful instruments that can effectively realize high-precision imaging detection. However, due to interference factors and complex near-field errors, the imaging quality of near-field SAIR is usually not ideal. To achieve better imaging results, a new fully connected imaging network (FCIN) is proposed for near-field SAIR. In FCIN, a fully connected network is first used to reconstruct the image directly from the visibility function, and a residual dense network is then used for image denoising and enhancement. The simulation results show that the proposed FCIN method achieves high imaging accuracy and a shorter imaging time.
In this letter, we propose a deep neural network and semi-supervised learning based dehazing algorithm. The dehazing network uses a pyramidal architecture to recover the haze-free scene from a single hazy image in a coarse-to-fine order. To faithfully restore the objects with different scales, we incorporate cascaded multi-scale convolutional blocks into each level of the pyramid. Feature fusion and transfer in the network are achieved using the paths constructed by interleaved residual connections. For better generalization to the complicated haze in real-world environments, we also devise a discriminator that enables semi-supervised adversarial training. Experimental results demonstrate that the proposed work outperforms comparative ones with higher quantitative metrics and more visually pleasant outputs. It can also enhance the robustness of object detection under haze.
A hubness-score-based normalization of the pairwise similarity is proposed for sequence-alignment-based cover song retrieval. Hubness, the tendency of some data points in high-dimensional data sets to link to other points more frequently than the rest of the points in the set, is widely known to degrade information retrieval accuracy. This paper aims to relieve the performance degradation caused by hubness by normalizing the pairwise similarity with a hubness score. Experiments on two cover song datasets confirm that the proposed similarity normalization improves cover song retrieval accuracy.
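A rough sketch of the idea follows. The abstract does not give the normalization formula, so two assumptions are made here: the hubness score of an item is taken to be its k-occurrence (how often it appears among the k most similar items of the other items), and the similarity to an item is divided by one plus that score. The similarity matrix is invented for illustration, and the function names are hypothetical.

```python
def k_occurrence(sim, k):
    """Count how often each item appears in other items' top-k lists."""
    n = len(sim)
    counts = [0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        top_k = sorted(others, key=lambda j: sim[i][j], reverse=True)[:k]
        for j in top_k:
            counts[j] += 1
    return counts


def normalize_by_hubness(sim, k):
    """Illustrative normalization: divide by (1 + hubness of the target)."""
    counts = k_occurrence(sim, k)
    n = len(sim)
    return [[sim[i][j] / (1.0 + counts[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical pairwise similarities; item 0 acts as a hub.
sim = [
    [1.0, 0.9, 0.8, 0.7],
    [0.9, 1.0, 0.2, 0.1],
    [0.8, 0.2, 1.0, 0.3],
    [0.7, 0.1, 0.3, 1.0],
]

print(k_occurrence(sim, 1))   # [3, 1, 0, 0]
normalized = normalize_by_hubness(sim, 1)
print(normalized[3])
```

Before normalization, item 0 is the nearest neighbour of every other item. After normalization, item 3 ranks item 2 above the hub, which is the kind of re-ranking that hubness reduction aims for.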