Qin DAI Naoya INOUE Paul REISERT Kentaro INUI
A tremendous amount of knowledge is present in the ever-growing scientific literature. To grasp such knowledge efficiently, various computational tasks have been proposed that train machines to read and analyze scientific documents. One of these tasks, Scientific Relation Extraction, aims at automatically capturing scientific semantic relationships among entities in scientific documents. Conventionally, only a limited number of commonly used knowledge bases, such as Wikipedia, are used as a source of background knowledge for relation extraction. In this work, we hypothesize that unannotated scientific papers could also be utilized as a source of external background information for relation extraction. Based on this hypothesis, we propose a model that is capable of extracting background information from unannotated scientific papers. Our experiments on the RANIS corpus [1] demonstrate the effectiveness of the proposed model on relation extraction from scientific articles.
Kazuyoshi TSUCHIYA Chiaki OGAWA Yasuyuki NOGAMI Satoshi UEHARA
In cryptography, pseudorandom number generators are required to generate pseudorandom numbers that have good statistical properties as well as unpredictability. An m-sequence is a linear feedback shift register sequence with maximal period over a finite field. M-sequences have good statistical properties; however, we must nonlinearize m-sequences for cryptographic purposes. A geometric sequence is a sequence given by applying a nonlinear feedforward function to an m-sequence. Nogami, Tada and Uehara proposed a geometric sequence whose nonlinear feedforward function is given by the Legendre symbol, and showed the period, periodic autocorrelation and linear complexity of the sequence. Furthermore, Nogami et al. proposed a generalization of the sequence, and showed its period and periodic autocorrelation. In this paper, we first investigate the linear complexity of the geometric sequences. In the case that the Chan-Games formula, which describes the linear complexity of geometric sequences, does not hold, we give a new formula by considering the sequence of complement numbers, the Hasse derivative and cyclotomic classes. Under some conditions, we can ensure that the geometric sequences have a large linear complexity from the results on the linear complexity of Sidel'nikov sequences. The geometric sequences have a long period and large linear complexity under some conditions; however, they do not have the balance property. In order to construct sequences that have the balance property, we propose interleaved sequences of the geometric sequence and its complement. Furthermore, we show the periodic autocorrelation and linear complexity of the proposed sequences. The proposed sequences have the balance property, and have a large linear complexity if the geometric sequences have a large one.
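The construction described above (an m-sequence followed by a Legendre-symbol feedforward function) can be illustrated with a small sketch. This is not the authors' exact definition: the parameters p = 7 and m = 2, the brute-force search for a maximal-period recurrence, and the mapping "quadratic non-residue gives 1, everything else gives 0" are assumptions chosen for illustration, and all function names are hypothetical.

```python
from itertools import product

def step(coeffs, state, p):
    """Advance the LFSR state by one symbol over the prime field F_p."""
    nxt = sum(c * x for c, x in zip(coeffs, state)) % p
    return state[1:] + (nxt,)

def state_period(coeffs, p):
    """Steps until the state (0, ..., 0, 1) recurs, or None if it never does."""
    m = len(coeffs)
    start = (0,) * (m - 1) + (1,)
    state = start
    for n in range(1, p ** m):
        state = step(coeffs, state, p)
        if state == start:
            return n
    return None

def find_maximal_lfsr(p, m):
    """Brute-force search for recurrence coefficients with period p^m - 1."""
    for coeffs in product(range(p), repeat=m):
        if state_period(coeffs, p) == p ** m - 1:
            return coeffs
    return None

def m_sequence(coeffs, p, length):
    """Output symbols of the LFSR started from the state (0, ..., 0, 1)."""
    state = (0,) * (len(coeffs) - 1) + (1,)
    out = []
    for _ in range(length):
        out.append(state[0])
        state = step(coeffs, state, p)
    return out

def legendre_bit(a, p):
    """Assumed mapping: 1 for a quadratic non-residue mod p, 0 otherwise."""
    return 1 if pow(a, (p - 1) // 2, p) == p - 1 else 0

p, m = 7, 2
coeffs = find_maximal_lfsr(p, m)
seq = m_sequence(coeffs, p, p ** m - 1)       # one period of the m-sequence
bits = [legendre_bit(a, p) for a in seq]      # nonlinear feedforward output
print(sum(bits), len(bits) - sum(bits))       # counts of ones and zeros
```

Under this mapping, the counts of ones and zeros over the 48 symbols are unequal, which is consistent with the remark that the geometric sequences lack the balance property.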
Kazuichi OE Mitsuru SATO Takeshi NANRI
The response times of solid state drives (SSDs) have decreased dramatically due to the growing use of non-volatile memory express (NVMe) devices. Such devices have response times of less than 100 microseconds on average. The response times of all-flash-array systems have also decreased dramatically through the use of NVMe SSDs. However, there are applications, particularly virtual desktop infrastructure and in-memory database systems, that require storage systems with even shorter response times. Their workloads tend to contain many input-output (IO) concentrations, which are aggregations of IO accesses. These concentrations target narrow regions of the storage volume and can continue for up to an hour. These narrow regions occupy a few percent of the logical unit number capacity, are the target of most IO accesses, and appear at unpredictable logical block addresses. To drastically reduce the response times for such workloads, we developed an automated tiered storage system called “automated tiered storage with fast memory and slow flash storage” (ATSMF) in which the data in targeted regions are migrated between storage devices depending on the predicted remaining duration of the concentration. The assumed environment is a server with non-volatile memory and directly attached SSDs, with the user applications executed on the server, as this reduces the average response time. Our system predicts the effect of migration by using the previously monitored values of the increase in response time during migration and the change in response time after migration. These values are consistent for each type of workload if the system is built using both non-volatile memory and SSDs.
In particular, the system predicts the remaining duration of an IO concentration, calculates the expected response-time increase during migration and the expected response-time decrease after migration, and migrates the data in the targeted regions if the sum of response-time decrease after migration exceeds the sum of response-time increase during migration. Experimental results indicate that ATSMF is at least 20% faster than flash storage only and that its memory access ratio is more than 50%.
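The migration rule stated above compares the total response-time decrease after migration with the total response-time increase during migration. A minimal sketch of that comparison follows; the function name, the parameter names, and the assumption of a constant IO rate are hypothetical and are not taken from ATSMF itself.

```python
def should_migrate(remaining_s, migration_s, iops, increase_us, decrease_us):
    """Return True if migrating a hot region is predicted to pay off.

    remaining_s  -- predicted remaining duration of the IO concentration (s)
    migration_s  -- time needed to migrate the region (s)
    iops         -- assumed constant IO rate on the region (IOs per second)
    increase_us  -- extra response time per IO while migrating (microseconds)
    decrease_us  -- response time saved per IO after migrating (microseconds)
    """
    # Total extra latency paid while the data is being copied.
    cost = migration_s * iops * increase_us
    # Total latency saved after the copy, until the concentration ends.
    time_after = max(0.0, remaining_s - migration_s)
    gain = time_after * iops * decrease_us
    return gain > cost

# A long-lived concentration justifies migration; a short one does not.
print(should_migrate(600, 30, 1000, 20, 50))
print(should_migrate(40, 30, 1000, 20, 50))
```

In this sketch the decision depends most strongly on the predicted remaining duration, which is why a prediction of that duration is needed before migrating.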
A parallel phrase matching (PM) engine for dictionary compression is presented. A hardware-based parallel chaining hash eliminates erroneous PM results caused by hash collisions, while a newly designed storage architecture holding PM results solves the data dependency issue. As a result, the average compression speed is increased by 53%.
Shanding XU Xiwang CAO Jian GAO Chunming TANG
As an optimal combinatorial object, the cyclic perfect Mendelsohn difference family (CPMDF) was introduced by Fuji-Hara and Miao to construct optimal optical orthogonal codes. In this paper, we propose a direct construction of disjoint CPMDFs from the Zeng-Cai-Tang-Yang cyclotomy. Compared with a recent work of Fan, Cai, and Tang, our construction does not depend on a cyclic difference matrix. Furthermore, strictly optimal frequency-hopping sequences (FHSs) are optimal FHSs that have optimal Hamming auto-correlation for any correlation window. As an application of our disjoint CPMDFs, we present more flexible combinatorial constructions of strictly optimal FHSs, which interpret the previous construction proposed by Cai, Zhou, Yang, and Tang.
Yubo LI Shuonan LI Hongqian XUAN Xiuping PENG
In this letter, a generic method to construct mutually orthogonal binary zero correlation zone (ZCZ) sequence sets from mutually orthogonal complementary sequence sets (MOCSSs) with certain properties is first presented. Then MOCSSs satisfying the required conditions are generated from binary orthogonal matrices of order N×N, where N=p-1 and p is a prime. As a result, mutually orthogonal binary ZCZ sequence sets with parameters (2N^2, N, N+1)-ZCZ can be obtained, and the number of ZCZ sets is N. Note that each single ZCZ sequence set is optimal with respect to the theoretical bound.
Kento HASEGAWA Masao YANAGISAWA Nozomu TOGAWA
Recently, it has been reported that malicious third-party IC vendors often insert hardware Trojans into their products. Especially in the IC design step, malicious third-party vendors can easily insert hardware Trojans into their products, and thus we have to detect them efficiently. In this paper, we propose a machine-learning-based hardware-Trojan detection method for gate-level netlists using multi-layer neural networks. First, we extract 11 Trojan-net feature values for each net in a netlist. After that, we classify the nets in an unknown netlist into a set of Trojan nets and a set of normal nets using multi-layer neural networks. By experimentally optimizing the structure of the multi-layer neural networks, we obtain an average true positive rate of 84.8% and an average true negative rate of 70.1%, with a 100% true positive rate in some of the benchmarks, which outperforms the existing methods in most cases.
Wei LI Huajun GONG Chunlin SHEN Yi WU
Surface light field advances conventional light field rendering techniques by utilizing geometry information. Using surface light field, real-world objects with complex appearance can be faithfully represented. This capability could play an important role in many VR/AR applications. However, an accurate geometric model is needed for surface light field sampling and processing, which limits its wide usage, since many objects of interest are difficult to reconstruct because of their usually very complex appearances. We propose a novel two-step optimization framework to reduce the dependency on accurate geometry. The key insight is to treat surface light field sampling as a multi-view multi-texture optimization problem. Our approach can deal with both model inaccuracy and image-to-model misalignment, making it possible to create high-fidelity surface light field models without using high-precision special hardware.
Kehai CHEN Tiejun ZHAO Muyun YANG
Learning semantic representation for translation context is beneficial to statistical machine translation (SMT). Previous efforts have focused on implicitly encoding syntactic and semantic knowledge in translation context by neural networks, which are weak in capturing explicit structural syntax information. In this paper, we propose a new neural network with a tree-based convolutional architecture to explicitly learn structural syntax information in translation context, thus improving translation prediction. Specifically, we first convert parallel sentences with source parse trees into syntax-based linear sequences based on a minimum syntax subtree algorithm, and then define a tree-based convolutional network over the linear sequences to learn syntax-based context representation and translation prediction jointly. To verify the effectiveness, the proposed model is integrated into phrase-based SMT. Experiments on large-scale Chinese-to-English and German-to-English translation tasks show that the proposed approach can achieve a substantial and significant improvement over several baseline systems.
Recent studies utilize multiple kernel learning to deal with the incomplete-data problem. In this study, we introduce new methods that not only complete multiple incomplete kernel matrices simultaneously, but also allow control of the flexibility of the model by parameterizing the model matrix. By imposing restrictions on the model covariance, overfitting of the data is avoided. A limitation of kernel matrix estimation done via optimization of an objective function is that the positive definiteness of the result is not guaranteed. In view of this limitation, our proposed methods employ the LogDet divergence, which ensures the positive definiteness of the resulting inferred kernel matrix. We empirically show that our proposed restricted covariance models, employed with LogDet divergence, yield significant improvements in the generalization performance of previous completion methods.
Seiji MIYOSHI Yoshinobu KAJIKAWA
We analyze the behaviors of the FXLMS algorithm using a statistical-mechanical method. The cross-correlation between a primary path and an adaptive filter and the autocorrelation of the adaptive filter are treated as macroscopic variables. We obtain simultaneous differential equations that describe the dynamical behaviors of the macroscopic variables under the condition that the tapped-delay line is sufficiently long. The obtained equations are deterministic and closed-form. We analytically solve the equations to obtain the correlations and finally compute the mean-square error. The obtained theory can quantitatively predict the behaviors of computer simulations for both white and nonwhite reference signals. The theory also gives the upper limit of the step size in the FXLMS algorithm.
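For readers unfamiliar with the algorithm being analyzed, the following is a minimal simulation sketch of a filtered-x LMS (FXLMS) update loop with a white reference signal. It is not the statistical-mechanical analysis itself. The path coefficients, the tap count, the step size, and the assumption that the secondary path is known exactly are all illustrative choices, and the function names are hypothetical.

```python
import random

def fir(taps, history):
    """Dot product of filter taps with past samples (newest sample first)."""
    return sum(t * h for t, h in zip(taps, history))

def fxlms(primary, secondary, n_taps=8, mu=0.01, steps=5000, seed=0):
    """Simulate FXLMS, assuming the secondary path is known exactly."""
    rng = random.Random(seed)
    w = [0.0] * n_taps                                  # adaptive filter
    x_hist = [0.0] * max(len(primary), len(secondary), n_taps)
    y_hist = [0.0] * len(secondary)                     # adaptive filter outputs
    fx_hist = [0.0] * n_taps                            # filtered reference
    errors = []
    for _ in range(steps):
        x = rng.gauss(0.0, 1.0)                         # white reference signal
        x_hist = [x] + x_hist[:-1]
        d = fir(primary, x_hist)                        # primary-path output
        y_hist = [fir(w, x_hist)] + y_hist[:-1]
        e = d - fir(secondary, y_hist)                  # residual error
        fx_hist = [fir(secondary, x_hist)] + fx_hist[:-1]
        # Gradient step using the reference filtered by the secondary path.
        w = [wi + mu * e * fxi for wi, fxi in zip(w, fx_hist)]
        errors.append(e)
    return w, errors

w, errors = fxlms(primary=[0.0, 0.5, -0.3], secondary=[1.0, 0.5])
early = sum(e * e for e in errors[:100]) / 100
late = sum(e * e for e in errors[-100:]) / 100
print(early, late)   # mean-square error at the start and at the end
```

Increasing `mu` far enough makes such a simulation diverge, which is the behavior the upper limit on the step size describes.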
Makoto TAKITA Masanori HIROTOMO Masakatu MORII
The network load is increasing due to the spread of content distribution services. Caching is known as a technique to reduce the peak network load by prefetching popular contents into the memories of users. Coded caching is a new caching approach based on a carefully designed content placement in order to create coded multicasting opportunities. Recent works have discussed single-layer caching systems, but many networks consist of multiple layers of cache. In this paper, we discuss a coded caching problem for a hierarchical network with differing numbers of cache layers. The network has users who connect to an origin server via a mirror server and users who directly connect to the origin server. We provide lower bounds of the rates for this problem setting based on the cut-set bound. In addition, we propose three basic coded caching schemes and characterize these schemes. Also, we propose a new coded caching scheme by combining two basic schemes and provide achievable rates of the combination coded caching scheme. Finally, we show by a numerical result that the proposed combination scheme achieves good performance.
Jung-Hyun KIM Min Kyu SONG Hong-Yeop SONG
In this paper, we investigate how to obtain binary locally repairable codes (LRCs) with good locality and availability from binary Simplex codes. We first propose a Combination code having a generator matrix that consists of all the columns of positive weight less than or equal to a given value. Such a code can also be obtained by puncturing all the columns of weight larger than a given value from a binary Simplex code. We call such a puncturing method block-puncturing. Furthermore, we suggest a heuristic puncturing method, called subblock-puncturing, that punctures a few more columns of the largest weight from the Combination code. We determine the minimum distance, locality, availability, joint information locality, and joint information availability of Combination codes in closed form. We also demonstrate the optimality of the proposed codes with certain choices of parameters in terms of some well-known bounds.
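The block-puncturing idea above can be illustrated by listing the generator-matrix columns of weight at most t and checking the minimum distance by exhaustive search. The sketch below does this for dimension k = 4; the function names and the brute-force approach are assumptions for illustration and do not reproduce the closed-form results of the paper.

```python
from itertools import combinations, product

def combination_columns(k, t):
    """All binary columns of length k with weight between 1 and t."""
    cols = []
    for w in range(1, t + 1):
        for support in combinations(range(k), w):
            cols.append(tuple(1 if i in support else 0 for i in range(k)))
    return cols

def minimum_distance(cols, k):
    """Smallest weight of a nonzero codeword, by exhaustive search."""
    best = len(cols)
    for msg in product((0, 1), repeat=k):
        if not any(msg):
            continue
        # Each code symbol is the inner product of the message and a column.
        weight = sum(sum(m & c for m, c in zip(msg, col)) % 2 for col in cols)
        best = min(best, weight)
    return best

k = 4
simplex = combination_columns(k, k)      # no puncturing: binary Simplex code
punctured = combination_columns(k, 2)    # keep only columns of weight <= 2
print(len(simplex), minimum_distance(simplex, k))       # 15 8
print(len(punctured), minimum_distance(punctured, k))   # 10 4
```

Keeping only the low-weight columns shortens the code from length 15 to length 10 in this example, at the cost of a smaller minimum distance.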
Kosuke TAKAHASHI Dan MIKAMI Mariko ISOGAWA Akira KOJIMA Hideaki KIMATA
In this paper, we propose a novel method to extrinsically calibrate a camera to a 3D reference object that is not directly visible from the camera. We use a human cornea as a spherical mirror and calibrate the extrinsic parameters from the reflections of the reference points. The main contribution of this paper is to present a cornea-reflection-based calibration algorithm with a simple configuration: five reference points on a single plane and one mirror pose. In this paper, we derive a linear equation and obtain a closed-form solution of extrinsic calibration by introducing two ideas. The first is to model the cornea as a virtual sphere, which enables us to estimate the center of the cornea sphere from its projection. The second is to use basis vectors to represent the position of the reference points, which enables us to deal with 3D information of reference points compactly. We demonstrate the performance of the proposed method with qualitative and quantitative evaluations using synthesized and real data.
Yundong LI Weigang ZHAO Xueyan ZHANG Qichen ZHOU
Crack detection is a vital task for maintaining a bridge's health and safety condition. Traditional computer-vision-based methods are easily disturbed by noise and clutter in a real bridge inspection. To address this limitation, we propose a two-stage crack detection approach based on Convolutional Neural Networks (CNN) in this letter. A predictor with a small receptive field is exploited in the first detection stage, while another predictor with a large receptive field is used to refine the detection results in the second stage. Benefiting from data fusion of the confidence maps produced by both predictors, our method can accurately predict the probability that each pixel belongs to a cracked area. Experimental results show that the proposed method is superior to an up-to-date method on real concrete surface images.
Yuhua SUN Qiang WANG Qiuyan WANG Tongjiang YAN
In the past two decades, many generalized cyclotomic sequences have been constructed, and they have been used in cryptography and communication systems for their high linear complexity and low autocorrelation. However, few papers have focused on the 2-adic complexities of such sequences. In this paper, we first give a property of a class of Gaussian periods based on Whiteman's generalized cyclotomic classes of order 4. Then, as an application of this property, we study the 2-adic complexity of a class of Whiteman's generalized cyclotomic sequences constructed from two distinct primes p and q. We prove that the 2-adic complexity of this class of sequences of period pq is lower bounded by pq-p-q-1. This lower bound is greater than one half of the period, and thus it shows that this class of sequences can resist the rational approximation algorithm (RAA) attack.
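As background for the quantity being bounded, the sketch below computes the 2-adic complexity of a binary sequence from one period, using one common definition: with S(2) the sum of s_i 2^i over a period of length N, the complexity is log2((2^N - 1) / gcd(2^N - 1, S(2))). The function name is hypothetical, the input is assumed to be exactly one period, and the sketch does not construct Whiteman's generalized cyclotomic sequences.

```python
from math import gcd, log2

def two_adic_complexity(period_bits):
    """2-adic complexity of a binary sequence, given one full period."""
    n = len(period_bits)
    # S(2): the period read as an integer, least significant bit first.
    s = sum(bit << i for i, bit in enumerate(period_bits))
    modulus = (1 << n) - 1
    return log2(modulus // gcd(modulus, s))

print(two_adic_complexity([1, 1, 1]))   # constant sequence: 0.0
print(two_adic_complexity([1, 0, 0]))   # equals log2(7)
```

A sequence is usually regarded as resistant to the rational approximation algorithm when this value exceeds half of its period, which is the comparison made above for the bound pq-p-q-1.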
Takahiro MATSUMOTO Hideyuki TORII Yuta IDA Shinya MATSUFUJI
In this paper, we propose a method for generating a new mutually zero-correlation zone set of optical orthogonal sequences (MZCZ-OOS) consisting of binary and bi-phase sequence pairs based on the optical zero-correlation zone (ZCZ) sequence set. The MZCZ-OOS is composed of several small orthogonal sequence sets. The sequences that belong to the same subset are orthogonal, and there is a ZCZ between sequences that belong to different subsets. The set is suitable for the M-ary quasi-synchronous optical code-division multiple access (M-ary/QS-OCDMA) system. The product of the set size S and the family size M of the proposed MZCZ-OOS is more than the upper bound of the optical ZCZ sequence set, and is less than that of the optical orthogonal sequence set.
In this paper, a perceptual-distortion-based rate-distortion optimized video coding scheme for High Efficiency Video Coding (HEVC) is proposed. The Structural Similarity Index (SSIM) in the transform domain, which is known as a distortion metric that better reflects human perception, is derived for the perceptual distortion model to be applied to the hierarchical coding block structure of HEVC. An SSIM-quantization model is proposed using the properties of the DCT and the high-resolution quantization assumption. The SSIM model is obtained as the sum of SSIM in each Coding Unit (CU) depth of HEVC, which precisely predicts SSIM values for the hierarchical quadtree structure of CUs in HEVC. The rate model is derived from the entropy, based on Laplacian distributions of transform residual coefficients. It is jointly combined with the SSIM-based distortion model for rate-distortion optimization (RDO) in an HEVC video codec and can be compliantly applied to HEVC. The experimental results demonstrate that the proposed method achieves 8.1% and 4.0% average bit rate reductions in rate-SSIM performance for low-delay and random access configurations, respectively, outperforming other existing methods. The proposed method provides better visual quality than the conventional mean square error (MSE)-based RDO coding scheme.
Naoto ISHIDA Takashi ISHIO Yuta NAKAMURA Shinji KAWAGUCHI Tetsuya KANDA Katsuro INOUE
Defects in spacecraft software may result in loss of life and serious economic damage. To avoid such consequences, the software development process incorporates code review activity. A code review conducted by a third-party organization independently of a software development team can effectively identify defects in software. However, such review activity is difficult for third-party reviewers, because they need to understand the entire structure of the code within a limited time and without prior knowledge. In this study, we propose a tool to visualize inter-module dataflow for source code of spacecraft software systems. To evaluate the method, an autonomous rover control program was reviewed using this visualization. While the tool does not decrease the time required for a code review, the reviewers considered the visualization to be effective for reviewing code.
Boolean functions used in stream ciphers and block ciphers should have high second-order nonlinearity to resist several known attacks and some potential attacks which may exist but are not yet efficient and might be improved in the future. The second-order nonlinearity of Boolean functions also plays an important role in coding theory, since its maximal value equals the covering radius of the second-order Reed-Muller code. However, it is an extremely hard task to calculate or even to bound the second-order nonlinearity of Boolean functions. In this paper, we present a lower bound on the second-order nonlinearity of the generalized Maiorana-McFarland Boolean functions. As applications of our bound, we provide simpler and more direct proofs for two known lower bounds on the second-order nonlinearity of functions in the class of Maiorana-McFarland bent functions. By further employing our bound, we also derive a lower bound on the second-order nonlinearity of the functions which were conjectured to be bent by Canteaut and whose bentness was proved by Leander.
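To make the quantity concrete, the second-order nonlinearity of a Boolean function is its minimum Hamming distance to all functions of algebraic degree at most 2. The sketch below computes it by exhaustive search, which is feasible only for very small n and illustrates why bounds are needed in general. The function name and the brute-force approach are assumptions for illustration; the sketch is unrelated to the proof technique of the paper.

```python
from itertools import combinations, product

def second_order_nonlinearity(f, n):
    """Minimum distance from f to all functions of degree <= 2 (brute force)."""
    points = list(product((0, 1), repeat=n))
    target = [f(x) for x in points]
    # Monomials of degree <= 2: the constant 1, x_i, and x_i * x_j.
    monomials = [()] + [(i,) for i in range(n)] + list(combinations(range(n), 2))
    tables = [[int(all(x[i] for i in mono)) for x in points] for mono in monomials]
    best = len(points)
    for coeffs in product((0, 1), repeat=len(monomials)):
        dist = 0
        for idx, t in enumerate(target):
            g = 0
            for c, table in zip(coeffs, tables):
                if c:
                    g ^= table[idx]       # evaluate the quadratic candidate
            dist += g != t
        best = min(best, dist)
    return best

# The cubic monomial x1*x2*x3 is at distance 1 from the zero function.
print(second_order_nonlinearity(lambda x: x[0] & x[1] & x[2], 3))   # 1
# A quadratic function has second-order nonlinearity 0.
print(second_order_nonlinearity(lambda x: x[0] & x[1], 3))          # 0
```

The search visits 2 to the power 1 + n + n(n-1)/2 candidate functions, so it becomes impractical already for moderate n.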