Ai-ichiro SASAKI Ken FUKUSHIMA
Magnetic fields are often utilized for position sensing of mobile devices. In typical sensing systems, multiple sensors are used to detect magnetic fields generated by target devices. To determine the positions of the devices, magnetic-field data detected by the sensors must be converted to device-position data. The data conversion is not trivial because it is a nonlinear inverse problem. In this study, we propose a machine-learning approach suitable for data conversion required in the magnetic-field-based position sensing of target devices. In our approach, two different sets of training data are used. One of the training datasets is composed of raw data of magnetic fields to be detected by sensors. The other set is composed of logarithmically represented data of the fields. We can obtain two different predictor functions by learning with these training datasets. Results show that the prediction accuracy of the target position improves when the two different predictor functions are used. Based on our simulation, the error of the target position estimated with the predictor functions is within 10cm in a 2m × 2m × 2m cubic space for 87% of all the cases of the target device states. The computational time required for predicting the positions of the target device is 4ms. As the prediction method is accurate and rapid, it can be utilized for the real-time tracking of moving objects and people.
The objective of critical nodes problem is to minimize pair-wise connectivity as a result of removing a specific number of nodes in the residual graph. From a mathematical modeling perspective, it comes the truth that the more the number of fragmented components and the evenly distributed of disconnected sub-graphs, the better the quality of the solution. Basing on this conclusion, we proposed a new Cluster Expansion Method for Critical Node Problem (CEMCNP), which on the one hand exploits a contraction mechanism to greedy simplify the complexity of sparse graph model, and on the other hand adopts an incremental cluster expansion approach in order to maintain the size of formed component within reasonable limitation. The proposed algorithm also relies heavily on the idea of multi-start iterative local search algorithm, whereas brings in a diversified late acceptance local search strategy to keep the balance between interleaving diversification and intensification in the process of neighborhood search. Extensive evaluations show that CEMCNP running on 35 of total 42 benchmark instances are superior to the outcome of KBV, while holding 3 previous best results out of the challenging instances. In addition, CEMCNP also demonstrates equivalent performance in comparison with the existing MANCNP and VPMS algorithms over 22 of total 42 graph models with fewer number of node exchange operations.
Takashi ISHIO Naoto MAEDA Kensuke SHIBUYA Kenho IWAMOTO Katsuro INOUE
Software developers may write a number of similar source code fragments including the same mistake in software products. To remove such faulty code fragments, developers inspect code clones if they found a bug in their code. While various code clone detection methods have been proposed to identify clones of either code blocks or functions, those tools do not always fit the code inspection task because a faulty code fragment may be much smaller than code blocks, e.g. a single line of code. To enable developers to search code clones of such a small faulty code fragment in a large-scale software product, we propose a method using Lempel-Ziv Jaccard Distance, which is an approximation of Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The result shows our method efficiently reports cloned faulty code fragments and the performance is acceptable for software developers.
Tomohiro YAMAZAKI Hisashi KOGA
We study the continuous similarity search problem for evolving queries which has recently been formulated. Given a data stream and a database composed of n sets of items, the purpose of this problem is to maintain the top-k most similar sets to the query which evolves over time and consists of the latest W items in the data stream. For this problem, the previous exact algorithm adopts a pruning strategy which, at the present time T, decides the candidates of the top-k most similar sets from past similarity values and computes the similarity values only for them. This paper proposes a new exact algorithm which shortens the execution time by computing the similarity values only for sets whose similarity values at T can change from time T-1. We identify such sets very fast with frequency-based inverted lists (FIL). Moreover, we derive the similarity values at T in O(1) time by updating the previous values computed at time T-1. Experimentally, our exact algorithm runs faster than the previous exact algorithm by one order of magnitude and as fast as the previous approximation algorithm.
Shucong TIAN Meng YANG Jianpeng WANG Rui WANG Avik R. ADHIKARY
AlphaSeq is a new paradigm to design sequencess with desired properties based on deep reinforcement learning (DRL). In this work, we propose a new metric function and a new reward function, to design an improved version of AlphaSeq. We show analytically and also through numerical simulations that the proposed algorithm can discover sequence sets with preferable properties faster than that of the previous algorithm.
Taishu ITO Yusuke SANO Katsuhisa YAMANAKA Takashi HIRAYAMA
The problem of enumerating connected induced subgraphs of a given graph is classical and studied well. It is known that connected induced subgraphs can be enumerated in constant time for each subgraph. In this paper, we focus on highly connected induced subgraphs. The most major concept of connectivity on graphs is vertex connectivity. For vertex connectivity, some enumeration problem settings and enumeration algorithms have been proposed, such as k-vertex connected spanning subgraphs. In this paper, we focus on another major concept of graph connectivity, edge-connectivity. This is motivated by the problem of finding evacuation routes in road networks. In evacuation routes, edge-connectivity is important, since highly edge-connected subgraphs ensure multiple routes between two vertices. In this paper, we consider the problem of enumerating 2-edge-connected induced subgraphs of a given graph. We present an algorithm that enumerates 2-edge-connected induced subgraphs of an input graph G with n vertices and m edges. Our algorithm enumerates all the 2-edge-connected induced subgraphs in O(n3m|SG|) time, where SG is the set of the 2-edge-connected induced subgraphs of G. Moreover, by slightly modifying the algorithm, we have a O(n3m)-delay enumeration algorithm for 2-edge-connected induced subgraphs.
Keitaro NAKASAI Masateru TSUNODA Kenichi MATSUMOTO
Software developers often use a web search engine to improve work efficiency. However, web search strategies (e.g., frequently changing web search keywords) may be different for each developer. In this study, we attempted to define a better web search strategy. Although many previous studies analyzed web search behavior in programming, they did not provide guidelines for web search strategies. To suggest guidelines for web search strategies, we asked 10 subjects four questions about programming which they had to solve, and analyzed their behavior. In the analysis, we focused on the subjects' task time and the web search metrics defined by us. Based on our experiment, to enhance the effectiveness of the search, we suggest (1) that one should not go through the next search result pages, (2) the number of keywords in queries should be suppressed, and (3) previously used keywords must be avoided when creating a new query.
Enze YANG Shuoyan LIU Yuxin LIU Kai FANG
Crowd flow prediction in high density urban scenes is involved in a wide range of intelligent transportation and smart city applications, and it has become a significant topic in urban computing. In this letter, a CNN-based framework called Pyramidal Spatio-Temporal Network (PSTNet) for crowd flow prediction is proposed. Spatial encoding is employed for spatial representation of external factors, while prior pyramid enhances feature dependence of spatial scale distances and temporal spans, after that, post pyramid is proposed to fuse the heterogeneous spatio-temporal features of multiple scales. Experimental results based on TaxiBJ and MobileBJ demonstrate that proposed PSTNet outperforms the state-of-the-art methods.
Hongwei YANG Fucheng XUE Dan LIU Li LI Jiahui FENG
Service composition optimization is a classic NP-hard problem. How to quickly select high-quality services that meet user needs from a large number of candidate services is a hot topic in cloud service composition research. An efficient second-order beetle swarm optimization is proposed with a global search ability to solve the problem of cloud service composition optimization in this study. First, the beetle antennae search algorithm is introduced into the modified particle swarm optimization algorithm, initialize the population bying using a chaotic sequence, and the modified nonlinear dynamic trigonometric learning factors are adopted to control the expanding capacity of particles and global convergence capability. Second, modified secondary oscillation factors are incorporated, increasing the search precision of the algorithm and global searching ability. An adaptive step adjustment is utilized to improve the stability of the algorithm. Experimental results founded on a real data set indicated that the proposed global optimization algorithm can solve web service composition optimization problems in a cloud environment. It exhibits excellent global searching ability, has comparatively fast convergence speed, favorable stability, and requires less time cost.
To realize an information-centric networking, IPFS (InterPlanetary File System) generates a unique ContentID for each content by applying a cryptographic hash to the content itself. Although it could improve the security against attacks such as falsification, it makes difficult to realize a similarity search in the framework of IPFS, since the similarity of contents is not reflected in the proximity of ContentIDs. To overcome this issue, we propose a method to apply a locality sensitive hash (LSH) to feature vectors extracted from contents as the key of indexes stored in IPFS. By conducting experiments with 10,000 random points corresponding to stored contents, we found that more than half of randomly given queries return a non-empty result for the similarity search, and yield an accurate result which is outside the σ confidence interval of an ordinary flooding-based method. Note that such a collection of random points corresponds to the worst case scenario for the proposed scheme since the performance of similarity search could improve when points and queries follow an uneven distribution.
Yida HONG Yanlei YIN Cheng GUO Xiaobao LIU
Many scientific and technological resources (STR) cannot meet the needs of real demand-based industrial services. To address this issue, the characteristics of scientific and technological resource services (STRS) are analyzed, and a method of the optimal combination of demand-based STR based on multi-community collaborative search is then put forward. An optimal combined evaluative system that includes various indexes, namely response time, innovation, composability, and correlation, is developed for multi-services of STR, and a hybrid optimal combined model for STR is constructed. An evaluative algorithm of multi-community collaborative search is used to study the interactions between general communities and model communities, thereby improving the adaptive ability of the algorithm to random dynamic resource services. The average convergence value CMCCSA=0.00274 is obtained by the convergence measurement function, which exceeds other comparison algorithms. The findings of this study indicate that the proposed methods can preferably reach the maximum efficiency of demand-based STR, and new ideas and methods for implementing demand-based real industrial services for STR are provided.
Masakazu IWAMURA Shunsuke MORI Koichiro NAKAMURA Takuya TANOUE Yuzuko UTSUMI Yasushi MAKIHARA Daigo MURAMATSU Koichi KISE Yasushi YAGI
Most gait recognition approaches rely on silhouette-based representations due to high recognition accuracy and computational efficiency. A fundamental problem for those approaches is how to extract individuality-preserved silhouettes from real scenes accurately. Foreground colors may be similar to background colors, and the background is cluttered. Therefore, we propose a method of individuality-preserving silhouette extraction for gait recognition using standard gait models (SGMs) composed of clean silhouette sequences of various training subjects as shape priors. The SGMs are smoothly introduced into a well-established graph-cut segmentation framework. Experiments showed that the proposed method achieved better silhouette extraction accuracy by more than 2.3% than representative methods and better identification rate of gait recognition (improved by more than 11.0% at rank 20). Besides, to reduce the computation cost, we introduced approximation in the calculation of dynamic programming. As a result, without reducing the segmentation accuracy, we reduced 85.0% of the computational cost.
Scalable networking for scientific research data transfer is a vital factor in the progress of data-intensive research, such as collaborative research on observation of black hole. In this paper, investigations of the nature of practical research traffic allow us to introduce optical flow switching (OFS) and contents delivery network (CDN) technologies into a wide area network (WAN) to realize highly scalable networking. To measure the scalability of networks, energy consumption in the WAN is evaluated by considering the practical networking equipment as well as reasonable assumptions on scientific research data transfer networks. In this study, we explore the energy consumption performance of diverse Japan and US topologies and reveal that the energy consumption of a routing and wavelength assignment algorithm in an OFS scheduler becomes the major hurdle when the number of nodes is high, for example, as high as that of the United States of America layer 1 topology. To provide computational scalability of a network dimensioning algorithm for the CDN based WAN, a simple heuristic algorithm for a surrogate location problem is proposed and compared with an optimal algorithm. This paper provides intuitions and design rules for highly scalable research data transfer networks, and thus, it can accelerate technology advancements against the encountering big-science problems.
Kaisei KAJITA Go OHTAKE Kazuto OGAWA
In this study, we propose a secure data-providing system by using a verifiable attribute-based keyword search (VABKS), which also has the functions of privacy preservation and feedback to providers with IP anonymous server. We give both theoretic and experimental result, which show that our proposed system is a secure system with real-time property. One potential application of the system is to Integrated Broadcast-Broadband (IBB) services, which acquire information related to broadcast programs via broadband networks. One such service is a recommendation service that delivers recommendations matching user preferences (such as to TV programs) determined from the user's viewing history. We have developed a real-time system outsourcing data to the cloud and performing keyword searches on it by dividing the search process into two stages and performing heavy processing on the cloud side.
Kohei NAKAI Takashi MATSUBARA Kuniaki UEHARA
The recent development of neural architecture search (NAS) has enabled us to automatically discover architectures of neural networks with high performance within a few days. Convolutional neural networks extract fruitful features by repeatedly applying standard operations (convolutions and poolings). However, these operations also extract useless or even disturbing features. Attention mechanisms enable neural networks to discard information of no interest, having achieved the state-of-the-art performance. While a variety of attentions for CNNs have been proposed, current NAS methods have paid a little attention to them. In this study, we propose a novel NAS method that searches attentions as well as operations. We examined several patterns to arrange attentions and operations, and found that attentions work better when they have their own search space and follow operations. We demonstrate the superior performance of our method in experiments on CIFAR-10, CIFAR-100, and ImageNet datasets. The found architecture achieved lower classification error rates and required fewer parameters compared to those found by current NAS methods.
Maximum inner product search (MIPS) problem has gained much attention in a wide range of applications. In order to overcome the curse of dimensionality in high-dimensional spaces, most of existing methods first transform the MIPS problem into another approximate nearest neighbor search (ANNS) problem and then solve it by Locality Sensitive Hashing (LSH). However, due to the error incurred by the transmission and incomprehensive search strategies, these methods suffer from low precision and have loose probability guarantees. In this paper, we propose a novel search method named Adaptive-LSH (AdaLSH) to solve MIPS problem more efficiently and more precisely. AdaLSH examines objects in the descending order of both norms and (the probably correctly estimated) cosine angles with a query object in support of LSH with extendable windows. Such extendable windows bring not only efficiency in searching but also the probability guarantee of finding exact or approximate MIP objects. AdaLSH gives a better probability guarantee of success than those in conventional algorithms, bringing less running times on various datasets compared with them. In addition, AdaLSH can even support exact MIPS with probability guarantee.
Yanxi YANG Jinguang JIANG Meilin HE
The carrier-phase multipath effect can seriously affect the accuracy of GPS-based positioning in static short baseline applications. Although several kinds of methods based on time domain and spatial domain techniques have been proposed to mitigate this error, they are still limited by the accuracy of the multipath model and the effectiveness of the correction strategy. After analyzing the existing methods, a new method based on adaptive thresholding wavelet packet transform (AW) and time domain bootstrap spatial domain search strategy (TB) is presented (AWTB). Taking advantage of adaptive thresholding wavelet packet transform, we enhance the precision of the correction model and the efficiency of the extraction method. In addition, by adopting the proposed time domain bootstrap spatial domain strategy, the accuracy and efficiency of subsequent multipath correction are improved significantly. Specifically, after applying the adaptive thresholding wavelet packet method, the mean improvement rate in the RMS values of the single-difference L1 residuals is about 27.93% compared with the original results. Furthermore, after applying the proposed AWTB method, experiments show that the 3D positioning precision is improved by about 38.51% compared with the original results. Even compared with the method based on stationary wavelet transform (SWT), and the method based on wavelet packets denoising (WPD), the 3D precision is improved by about 26.94% over the SWT method and about 22.96% over the WPD method, respectively. It is worth noting that, although the mean time consumption of the proposed algorithm is larger than the original method, the increased time consumption is not a serious burden for overall performance.
Yu ZHANG Yansong ZHAO Yifan WANG Yin LI
Searchable encryption with advanced query function is an important technique in today's cloud environment. To date, in the public key setting, the best query function supported by the previous schemes are conjunctive or disjunctive keyword search, which are elementary but not enough to satisfy the user's query requirements. In this paper, we make a progress for constructing a searchable public key encryption scheme with advanced query function called simple Boolean keyword search. To create our scheme, we proposed a keywords conversion method that projects the index and query keywords into a group of vectors. Based on a combination of these obtained vectors and an adaptively secure inner product encryption scheme, a public key encryption with simple Boolean keyword search scheme is proposed. We also present both theoretical and experimental analysis to show the effectiveness of this scheme. To the best of our knowledge, it is the first time to give a searchable public key encryption scheme supporting queries like q1op1q2op2…opi-1qiopi…opn-1qn, where opi is a logical operator which can be and(∨) or or(∧) and qi is a keyword.
Junichiro HAYATA Masahito ISHIZAKA Yusuke SAKAI Goichiro HANAOKA Kanta MATSUURA
Public-key encryption with keyword search (PEKS) is a cryptographic primitive that allows us to search for particular keywords over ciphertexts without recovering plaintexts. By using PEKS in cloud services, users can outsource their data in encrypted form without sacrificing search functionality. Concerning PEKS that can specify logical disjunctions and logical conjunctions as a search condition, it is known that such PEKS can be (generically) constructed from anonymous attribute-based encryption (ABE). However, it is not clear whether it is possible to construct this types of PEKS without using ABE which may require large computational/communication costs and strong mathematical assumptions. In this paper, we show that ABE is crucial for constructing PEKS with the above functionality. More specifically, we give a generic construction of anonymous key-policy ABE from PEKS whose search condition is specified by logical disjunctions and logical conjunctions. Our result implies such PEKS always requires large computational/communication costs and strong mathematical assumptions corresponding to those of ABE.
This paper proposes a method for searching for graphs in the database which are contained as subgraphs by a given query. In the proposed method, the search index does not require any knowledge of the query set or the frequent subgraph patterns. In conventional techniques, enumerating and selecting frequent subgraph patterns is computationally expensive, and the distribution of the query set must be known in advance. Subsequent changes to the query set require the frequent patterns to be selected again and the index to be reconstructed. The proposed method overcomes these difficulties through graph coding, using a tree structured index that contains infrequent subgraph patterns in the shallow part of the tree. By traversing this code tree, we are able to rapidly determine whether multiple graphs in the database contain subgraphs that match the query, producing a powerful pruning or filtering effect. Furthermore, the filtering and verification steps of the graph search can be conducted concurrently, rather than requiring separate algorithms. As the proposed method does not require the frequent subgraph patterns and the query set, it is significantly faster than previous techniques; this independence from the query set also means that there is no need to reconstruct the search index when the query set changes. A series of experiments using a real-world dataset demonstrate the efficiency of the proposed method, achieving a search speed several orders of magnitude faster than the previous best.