Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Chengcheng JI Masahito KURIHARA Haruhiko SATO
We present an automated lemma generation method for equational, inductive theorem proving based on the term rewriting induction of Reddy and Aoto as well as the divergence critic framework of Walsh. The method effectively works by using the divergence-detection technique to locate differences in diverging sequences, and generates potential lemmas automatically by analyzing these differences. We have incorporated this method in the multi-context inductive theorem prover of Sato and Kurihara to overcome the strategic problems resulting from the unsoundness of the method. The experimental results show that our method is effective especially for some problems diverging with complex differences (i.e., parallel and nested differences).
In this paper, we consider cloud-assisted Peer-to-Peer (P2P) video streaming systems, in which a given video stream is divided into several sub-streams called stripes and those stripes are delivered to all subscribers through different spanning trees of height two, with the aid of cloud upload capacity. We call such a low-latency delivery of stripes a 2-hop delivery. This paper proves that if the average upload capacity of the peers equals the bit rate of the video stream and the video stream is divided into a stripes, then 2-hop delivery of all stripes to n peers is possible if the upload capacity provided by the cloud is 3n/a. If those peers have a uniform upload capacity, then the amount of cloud assistance necessary for the 2-hop delivery reduces to n/a.
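As a worked illustration of the two bounds stated above, the following minimal sketch evaluates 3n/a and n/a for given n and a. The function name `cloud_capacity_bound` and the sample values (n = 1000 peers, a = 10 stripes) are hypothetical, and the result is in whatever capacity unit the paper uses, which the abstract does not state.

```python
from fractions import Fraction


def cloud_capacity_bound(n_peers, n_stripes, uniform=False):
    """Cloud upload capacity quoted in the abstract for 2-hop delivery.

    Returns n/a when peers have uniform upload capacity and 3n/a otherwise.
    The name and signature are illustrative assumptions, not the paper's.
    """
    if n_peers <= 0 or n_stripes <= 0:
        raise ValueError("n_peers and n_stripes must be positive")
    factor = 1 if uniform else 3
    # Fraction keeps the ratio exact instead of rounding to a float.
    return Fraction(factor * n_peers, n_stripes)


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the factor-of-three gap.
    print(cloud_capacity_bound(1000, 10))                # general case: 3n/a
    print(cloud_capacity_bound(1000, 10, uniform=True))  # uniform case: n/a
```

Under these assumptions, the uniform-capacity case needs one third of the cloud assistance of the general case for any n and a.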
In parallel computing systems, the interconnection network forms the critical infrastructure that enables robust and scalable communication between hundreds of thousands of nodes. The traditional packet-switched network tends to suffer from long communication times when network congestion occurs. In this context, we explore the use of circuit switching (CS) to replace packet switches with custom hardware that supports circuit-based switching efficiently with low latency. In our target CS network, a certain amount of bandwidth is guaranteed for each communication pair so that the network latency is predictable when a limited number of node pairs exchange messages. Because the number of allocated time slots in every switch directly affects the end-to-end latency, we improve the slot utilization and develop a network topology generator that minimizes the number of time slots for target applications whose communication patterns are predictable. Through a quantitative discrete-event simulation, we show that the minimum necessary number of slots can be reduced to a small number in a topology generated by our design methodology while keeping the network cost 50% lower than that of standard torus topologies.
This paper proposes a method to absorb flash crowds in P2P video streaming systems. The idea of the proposed method is to reduce the time before a newly arrived node becomes an uploader by explicitly constructing a group of newly arrived nodes called a flash crowd absorber (FCA). The FCA grows continuously while serving a video stream to the members of the group, and it is explicitly controlled so that the upload capacity of the nodes is fully utilized and it attains a nearly optimal latency of the stream during a flash crowd. A numerical comparison with a naive tree-based scheme is also given.
Modern file systems, such as ext4, btrfs, and XFS, are evolving and enable the introduction of new features to meet ever-changing demands and improve reliability. File system developers are struggling to eliminate all software bugs, but the operating system community points out that file systems are a hotbed of critical software bugs. This paper analyzes the code coverage of xfstests, a widely used suite of file system tests, on three major file systems (ext4, btrfs, and XFS). The coverage is 72.34%, and the uncovered code runs into 23,232 lines of code. To understand why the code coverage is low, the uncovered code is manually examined line by line. We identified three major causes, peculiar to file systems, that hinder higher coverage. First, covering all the features is difficult because each file system provides a wide variety of file-system specific features, and some features can be tested only on special storage devices. Second, covering all the execution paths is difficult because they depend on file system configurations and internal on-disk states. Finally, the code for maintaining backward-compatibility is executed only when a file system encounters old formats. Our findings will help file system developers improve the coverage of test suites and provide insights into fostering the development of new methodologies for testing file systems.
Shohei IKEDA Akinori IHARA Raula Gaikovina KULA Kenichi MATSUMOTO
Contemporary software projects often utilize a README.md to share crucial information such as installation and usage examples related to their software. Furthermore, these files serve as an important source of updated and useful documentation for developers and prospective users of the software. Nonetheless, both novice and seasoned developers are sometimes unsure of what is required for a good README file. To understand the contents of README, we investigate the contents of 43,900 JavaScript packages. Results show that these packages contain common content themes (i.e., ‘usage’, ‘install’ and ‘license’). Furthermore, we find that application-specific packages more frequently included content themes such as ‘options’, while library-based packages more frequently included other specific content themes (i.e., ‘install’ and ‘license’).
Minseok LEE Jihoon AN Younghee LEE
Data generated from Internet of Things (IoT) devices in smart spaces are utilized in a variety of fields such as context recognition, service recommendation, and anomaly detection. However, missing values in the data streams of the IoT devices remain a challenging problem owing to various missing patterns and heterogeneous data types from many different data streams. While analyzing a dataset collected from a smart space with multiple IoT devices, we found a continuous missing pattern that is quite different from the existing missing-value patterns. The pattern has blocks of consecutive missing values lasting from a few seconds up to a few hours. The pattern is therefore a vital factor in the availability and reliability of IoT applications; yet, it cannot be handled by the existing missing-value imputation methods. Hence, a novel approach for missing-value imputation of the continuous missing pattern is required. We posit that even if the missing values of the continuous missing pattern occur in one data stream, missing-value imputation is possible by learning from other data streams correlated with this data stream. To address the continuous missing pattern problem, we analyzed multiple IoT data streams in a smart space and identified the correlations between them, that is, the interdependencies among the data streams of the IoT devices in a smart space. To impute missing values of the continuous missing pattern, we propose a deep learning-based missing-value imputation model exploiting correlation information, namely, the deep imputation network (DeepIN), in a smart space. The DeepIN uses multiple long short-term memory (LSTM) networks constructed according to the correlation information of each IoT data stream.
We evaluated the DeepIN on a real dataset from our campus IoT testbed, and the experimental results show that our proposed approach improves the imputation performance by 57.36% over the state-of-the-art missing-value imputation algorithm. Thus, our approach can be a promising methodology that enables IoT applications and services with a reasonable missing-value imputation accuracy (80∼85%) on average, even if a long-term block of values is missing in IoT environments.
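The idea that a block of consecutive missing values in one stream can be filled from a correlated stream can be sketched with a deliberately simple stand-in. The example below uses ordinary least-squares linear regression on a single reference stream, not the LSTM-based DeepIN model, and the function name, `None` as the missing-value marker, and sample data are all hypothetical.

```python
def impute_from_correlated(target, reference):
    """Fill None entries in `target` using a line fitted against `reference`.

    Simplified stand-in for correlation-based imputation: fit
    target = slope * reference + intercept on the observed positions, then
    predict the missing positions. This is NOT the DeepIN model.
    """
    pairs = [(x, y) for x, y in zip(reference, target) if y is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two observed values to fit a line")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        raise ValueError("reference stream is constant on observed positions")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / var_x
    intercept = mean_y - slope * mean_x
    # Keep observed values; predict only where the target is missing.
    return [y if y is not None else slope * x + intercept
            for x, y in zip(reference, target)]


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5, 6]            # fully observed stream
    target = [3, 5, None, None, None, 13]     # block of consecutive gaps
    print(impute_from_correlated(target, reference))
```

A single linear fit only captures one pairwise, time-invariant relationship; the abstract's approach instead learns from multiple correlated streams with recurrent networks.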
Yang GAO Yong-juan WANG Qing-jun YUAN Tao WANG Xiang-bin WANG
We propose a new method of differential fault attack, which is based on the nibble-group differential diffusion property of the lightweight block cipher MIBS. On the basis of the statistical regularity of the differential distribution of the S-box, we establish a statistical model and then analyze the relationship between the number of fault injections, the probability of attack success, and the number of key bits recovered. Theoretically, the time complexity of recovering the main key reduces to 2^2 when injecting 3 groups of faults (12 nibbles in total) in rounds 30, 31, and 32, which is the optimal condition. Furthermore, we calculate the expected number of fault injection groups needed to recover 62 bits of the main key, which is 3.87. Finally, experimental data verify the correctness of the theoretical model.
Na WU Decheng ZUO Zhan ZHANG Peng ZHOU Yan ZHAO
Cloud computing has attracted a growing number of enterprises to move their business to the cloud because of the associated operational and cost benefits. Improving availability is one of the major concerns of cloud application owners because modern applications generally comprise a large number of components and failures are common at scale. Fault tolerance enables an application to continue operating properly when a failure occurs, but a fault tolerance strategy is typically employed only for the most important components because of financial concerns. Therefore, identifying important components has become a critical research issue. To address this problem, we propose a failure-sensitive structure-based component ranking approach (FSCRank), which integrates component failure impact and application structure information into component importance evaluation. An iterative ranking algorithm is developed according to the structural characteristics of cloud applications. The experimental results show that FSCRank outperforms the other two structure-based ranking algorithms for cloud applications. In addition, factors that affect application availability optimization are analyzed and summarized. The experimental results suggest that the availability of cloud applications can be greatly improved by implementing a fault tolerance strategy for the important components identified by FSCRank.
Longfei CHEN Yuichi NAKAMURA Kazuaki KONDO Walterio MAYOL-CUEVAS
This paper presents an approach to analyze and model tasks of machines being operated. The executions of the tasks were captured through egocentric vision. Each task was decomposed into a sequence of physical hand-machine interactions, which are described with touch-based hotspots and interaction patterns. Modeling the tasks was achieved by integrating the experiences of multiple experts and using a hidden Markov model (HMM). Here, we present the results of more than 70 recorded egocentric experiences of the operation of a sewing machine. Our methods show good potential for the detection of hand-machine interactions and modeling of machine operation tasks.
Hitomi YOKOYAMA Masano NAKAYAMA Hiroaki MURATA Kinya FUJITA
Aimed at long-term monitoring of daily office conversations without recording the conversational content, a system is presented for estimating acoustic nonverbal information such as utterance duration, utterance frequency, and turn-taking. The system combines a sound localization technique based on the sound energy distribution with 16 beam-forming microphone-array modules mounted in the ceiling to reduce the influence of multiple sound reflections. Furthermore, human detection using a wide field-of-view camera is integrated into the system for more robust speaker estimation. The system estimates the speaker for each utterance and calculates nonverbal information based on it. An evaluation analyzing data collected over ten 12-hour workdays in an office with three assigned workers showed that the system had 72% speech segmentation detection accuracy and 86% speaker identification accuracy when utterances were correctly detected. Even with false voice detection and incorrect speaker identification, and even in cases where the participants frequently made noise or where seven participants had gathered together for a discussion, the order of the amount of calculated acoustic nonverbal information uttered by the participants coincided with that based on human-coded acoustic nonverbal information. Continuous analysis of communication dynamics such as dominance and conversation participation roles through nonverbal information will reveal the dynamics of a group. The main contribution of this study is to demonstrate the feasibility of unconstrained long-term monitoring of daily office activity through acoustic nonverbal information.
Shengyu YAO Ruohua ZHOU Pengyuan ZHANG
This paper proposes a speaker-phonetic i-vector modeling method for text-dependent speaker verification with random digit strings, in which enrollment and test utterances are not of the same phrase. The core of the proposed method is making use of digit alignment information in the i-vector framework. By utilizing forced alignment information, verification scores of the testing trials can be computed in the fixed-phrase situation, in which the compared speech segments between the enrollment and test utterances are of the same phonetic content. Specifically, utterances are segmented into digits, then a unique phonetically-constrained i-vector extractor is applied to obtain a speaker and channel variability representation for every digit segment. Probabilistic linear discriminant analysis (PLDA) and s-norm are subsequently used for channel compensation and score normalization, respectively. The final score is obtained by combining the digit scores, which are computed by scoring individual digit segments of the test utterance against the corresponding ones of the enrollment. Experimental results on Part 3 of the Robust Speaker Recognition (RSR2015) database demonstrate that the proposed approach significantly outperforms GMM-UBM by 52.3% and 53.5% relative in equal error rate (EER) for male and female speakers, respectively.
Gaofeng CHENG Pengyuan ZHANG Ji XU
The long short-term memory recurrent neural network (LSTM) has achieved tremendous success for automatic speech recognition (ASR). However, the complicated gating mechanism of LSTM introduces a massive computational cost and limits the application of LSTM in some scenarios. In this paper, we describe our work on accelerating the decoding speed and improving the decoding accuracy. First, we propose an architecture, which is called Projected Gated Recurrent Unit (PGRU), for ASR tasks, and show that the PGRU can consistently outperform the standard GRU. Second, to improve the PGRU generalization, particularly on large-scale ASR tasks, we propose the Output-gate PGRU (OPGRU). In addition, the time delay neural network (TDNN) and normalization methods are found beneficial for OPGRU. In this paper, we apply the OPGRU for both the acoustic model and recurrent neural network language model (RNN-LM). Finally, we evaluate the PGRU on the total Eval2000 / RT03 test sets, and the proposed OPGRU single ASR system achieves 0.9% / 0.9% absolute (8.2% / 8.6% relative) reduction in word error rate (WER) compared to our previous best LSTM single ASR system. Furthermore, the OPGRU ASR system achieves significant speed-up on both acoustic model and language model rescoring.
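The paired absolute and relative figures above are linked by a simple ratio, which the following sketch makes explicit. The function names are hypothetical, and the baseline word error rates it derives are only implied by the rounded numbers in the abstract, not reported there, so they should be read as rough estimates.

```python
def relative_reduction_pct(baseline_wer, new_wer):
    """Relative WER reduction in percent: (baseline - new) / baseline * 100."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


def implied_baseline_wer(absolute_reduction, relative_pct):
    """Baseline WER implied by an absolute and a relative reduction."""
    if relative_pct <= 0:
        raise ValueError("relative reduction must be positive")
    return absolute_reduction / (relative_pct / 100.0)


if __name__ == "__main__":
    # Figures quoted in the abstract: 0.9% absolute, 8.2% / 8.6% relative.
    for name, rel in (("Eval2000", 8.2), ("RT03", 8.6)):
        baseline = implied_baseline_wer(0.9, rel)
        print(f"{name}: implied LSTM baseline WER is roughly {baseline:.1f}%")
```

Because the quoted reductions are rounded to one decimal place, the implied baselines (about 11.0% and 10.5%) carry an uncertainty of several tenths of a point.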
Hiroshi SEKI Kazumasa YAMAMOTO Tomoyosi AKIBA Seiichi NAKAGAWA
Deep neural networks (DNNs) have achieved significant success in the field of automatic speech recognition. One main advantage of DNNs is automatic feature extraction without human intervention. However, adaptation under limited available data remains a major challenge for DNN-based systems because of their enormous number of free parameters. In this paper, we propose a filterbank-incorporated DNN that combines a filterbank layer, which represents the filter shape/center frequency, with a DNN-based acoustic model. The filterbank layer and the following networks of the proposed model are trained jointly by exploiting the advantages of hierarchical feature extraction, whereas most systems use pre-defined mel-scale filterbank features as input acoustic features to DNNs. Filters in the filterbank layer are parameterized to represent speaker characteristics while minimizing the number of parameters. The optimization of one type of parameter corresponds to Vocal Tract Length Normalization (VTLN), and another type corresponds to feature-space Maximum Likelihood Linear Regression (fMLLR) and feature-space Discriminative Linear Regression (fDLR). Since the filterbank layer consists of just a few parameters, it is advantageous in adaptation under limited available data. In the experiments, filterbank-incorporated DNNs were effective in speaker/gender adaptation under limited adaptation data. Experimental results on the CSJ task demonstrate that adaptation of the proposed model with 10 utterances achieved a 5.8% word error reduction ratio against the un-adapted model.
Huu-Anh TRAN Heyan HUANG Phuoc TRAN Shumin SHI Huu NGUYEN
Word order is one of the most significant differences between Chinese and Vietnamese. In phrase-based statistical machine translation, the reordering model learns reordering rules from bilingual corpora. If the bilingual corpora are large and good enough, the reordering rules are accurate and have good coverage. However, because Chinese-Vietnamese is a low-resource language pair, the extraction of reordering rules is limited. As a result, the quality of reordering in Chinese-Vietnamese machine translation is not high. In this paper, we combine Chinese dependency relations and Chinese-Vietnamese word alignment results in order to pre-order Chinese sentences so that their word order suits Vietnamese. The experimental results show that our methodology improves the machine translation performance compared to a translation system using only the reordering models of phrase-based statistical machine translation.
Hiroki WATANABE Hiroki TANAKA Sakriani SAKTI Satoshi NAKAMURA
Brain-computer interfaces (BCIs) have been used by users to convey their intentions directly with brain signals. For example, a spelling system that uses EEGs allows letters on a display to be selected. In comparison, previous studies have investigated decoding speech information such as syllables, words from single-trial brain signals during speech comprehension, or articulatory imagination. Such decoding realizes speech recognition with a relatively short time-lag and without relying on a display. Previous magnetoencephalogram (MEG) research showed that a template matching method could be used to classify three English sentences by using phase patterns in theta oscillations. This method is based on the synchronization between speech rhythms and neural oscillations during speech processing, that is, theta oscillations synchronized with syllabic rhythms and low-gamma oscillations with phonemic rhythms. The present study aimed to approximate this classification method to a BCI application. To this end, (1) we investigated the performance of the EEG-based classification of three Japanese sentences and (2) evaluated the generalizability of our models to other different users. For the purpose of improving accuracy, (3) we investigated the performances of four classifiers: template matching (baseline), logistic regression, support vector machine, and random forest. In addition, (4) we propose using novel features including phase patterns in a higher frequency range. Our proposed features were constructed in order to capture synchronization in a low-gamma band, that is, (i) phases in EEG oscillations in the range of 2-50 Hz from all electrodes used for measuring EEG data (all) and (ii) phases selected on the basis of feature importance (selected). The classification results showed that, except for random forest, most classifiers perform similarly. 
Our proposed features improved the classification accuracy with statistical significance compared with a baseline feature, which is a phase pattern in neural oscillations in the range of 4-8 Hz from the right hemisphere. The best mean accuracy across folds was 55.9% using template matching trained by all features. We concluded that the use of phase information in a higher frequency band improves the performance of EEG-based sentence classification and that this model is applicable to other different users.
Yuliang WEI Guodong XIN Wei WANG Fang LV Bailing WANG
Web person search often returns web pages related to several distinct namesakes. This paper proposes a new web page model for template-free person data extraction, and uses a Dirichlet process mixture model to solve name disambiguation. The results show that our method works best on web pages with complex structure.
In this study, we propose a statistical reputation approach for constructing a reliable packet route in ad-hoc sensor networks. The proposed method uses reputation as a measurement for router node selection, through which a reliable data route is constructed for packet delivery. To refine the reputation, a transaction density is defined to capture the influence of node transaction frequency on the reputation. To balance the energy consumption and avoid repeatedly choosing the same node with high reputation, the remaining energy of a node is also considered as a reputation factor in the selection process. Further, a shortest-path-tree routing protocol is designed so that data packets can reach the base station through the minimum number of intermediate nodes. Simulation tests illustrate the improvements in the packet delivery ratio and the energy utilization.
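The shortest-path-tree idea mentioned above can be illustrated with a breadth-first search rooted at the base station, which yields minimum-hop routes in an unweighted graph. This is a generic sketch, not the protocol designed in the paper: it ignores reputation, transaction density, and remaining energy, and the node names and topology are hypothetical.

```python
from collections import deque


def shortest_path_tree(adjacency, base_station):
    """Return a parent map describing a minimum-hop tree rooted at the base.

    `adjacency` maps each node to an iterable of its neighbors. Breadth-first
    search visits nodes in order of hop count, so the first time a node is
    reached its parent lies on a minimum-hop route to the base station.
    """
    parent = {base_station: None}
    queue = deque([base_station])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def route_to_base(parent, source):
    """Follow parent links from `source` up to the base station."""
    if source not in parent:
        raise ValueError(f"{source!r} cannot reach the base station")
    path = []
    node = source
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


if __name__ == "__main__":
    # Hypothetical five-node topology; "BS" is the base station.
    topology = {
        "BS": ["A", "B"],
        "A": ["BS", "C"],
        "B": ["BS", "C"],
        "C": ["A", "B", "D"],
        "D": ["C"],
    }
    tree = shortest_path_tree(topology, "BS")
    print(route_to_base(tree, "D"))
```

A reputation-aware variant would choose among equal-hop parents using node reputation and remaining energy rather than discovery order; how the paper combines these factors is not specified in the abstract.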
This letter proposes a comprehensive assessment of the mission-level damage caused by cyberattacks on an entire defense mission system. We experimentally prove that our method produces swift and accurate assessment results and that it can be applied to actual defense applications. This study contributes to the enhancement of cyber damage assessment with a faster and more accurate method.
This letter proposes a new face sketch recognition method. Given a query sketch and face photos in a database, the proposed method first synthesizes pseudo sketches by computing the locality sensitive histogram and dense illumination invariant features from the resized face photos, then extracts discriminative features by computing the histogram of averaged oriented gradients on the query sketch and pseudo sketches, and finally finds the match with the smallest cosine distance in the feature space. It achieves accuracy comparable to the state of the art while showing much more robustness than the existing face sketch recognition methods.
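The final matching step described above, picking the gallery entry closest to the query in cosine distance, can be sketched as follows. The feature vectors here are tiny made-up lists rather than real histogram-of-averaged-oriented-gradients descriptors, and the function names and gallery labels are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def best_match(query, gallery):
    """Return the gallery key whose feature vector is nearest to `query`."""
    if not gallery:
        raise ValueError("gallery is empty")
    return min(gallery, key=lambda name: cosine_distance(query, gallery[name]))


if __name__ == "__main__":
    # Hypothetical 2-D features standing in for pseudo-sketch descriptors.
    gallery = {
        "person_a": [1.0, 0.0],
        "person_b": [0.0, 1.0],
        "person_c": [1.0, 1.0],
    }
    print(best_match([2.0, 0.1], gallery))
```

Cosine distance depends only on the angle between vectors, so scaling a descriptor by a positive constant does not change which gallery entry is selected.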
Masashi ANZAWA Sosuke AMANO Yoko YAMAKATA Keiko MOTONAGA Akiko KAMEI Kiyoharu AIZAWA
We investigate image recognition of multiple food items in a single photo, focusing on a buffet restaurant application, where the menu changes at every meal and only a few images per class are available. After detecting food areas, we perform hierarchical recognition. We evaluate our results by comparing them with two baseline methods.