Atsushi OOKA Eum SUYONG Shingo ATA Masayuki MURATA
Information-centric networking (ICN) has received increasing attention from all over the world. The novel aspects of ICN (e.g., the combination of caching, multicasting, and aggregating requests) are based on names that act as addresses for content. Name-based communication has the potential to cope with growing and increasingly complex Internet technology, for example, the Internet of Things, cloud computing, and a smart society. To realize ICN, router hardware must implement an innovative cache replacement algorithm that offers performance far superior to a simple policy-based algorithm while still operating with feasible computational and memory overhead. However, most previous studies on cache replacement policies in ICN have proposed either policies that are too blunt to achieve significant performance improvement, such as first-in first-out (FIFO) and random policies, or policies that are impractical in a resource-restricted environment, such as least recently used (LRU). Thus, we propose CLOCK-Pro Using Switching Hash-tables (CUSH) as a suitable policy for network caching. CUSH can identify and keep popular content worth caching in a network environment. CUSH also employs CLOCK and hash tables, which are low-overhead data structures, to satisfy the cost requirement. We numerically evaluate our proposed approach, showing that it can achieve cache hits against traffic traces for which simple conventional algorithms produce hardly any hits.
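To illustrate why CLOCK combined with a hash table is considered low-overhead, the following is a minimal sketch of the basic CLOCK (second-chance) policy only. It is not CUSH or CLOCK-Pro; the class name `ClockCache` and its methods are hypothetical names chosen for this illustration.

```python
class ClockCache:
    """Basic CLOCK (second-chance) cache. Illustrative only; this is not CUSH."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.slots = [None] * capacity      # content names arranged as a ring
        self.ref_bits = [False] * capacity  # one reference bit per slot
        self.index = {}                     # hash table: name -> slot position
        self.hand = 0                       # position of the clock hand

    def access(self, name):
        """Return True on a cache hit, False on a miss (the name is then cached)."""
        pos = self.index.get(name)
        if pos is not None:
            # A hit only sets a bit; no list reordering as in LRU.
            self.ref_bits[pos] = True
            return True
        # Miss: sweep the hand, clearing reference bits, until a victim is found.
        while self.slots[self.hand] is not None and self.ref_bits[self.hand]:
            self.ref_bits[self.hand] = False
            self.hand = (self.hand + 1) % self.capacity
        victim = self.slots[self.hand]
        if victim is not None:
            del self.index[victim]
        self.slots[self.hand] = name
        self.ref_bits[self.hand] = False
        self.index[name] = self.hand
        self.hand = (self.hand + 1) % self.capacity
        return False


cache = ClockCache(2)
results = [cache.access(n) for n in ["a", "b", "a", "c", "a", "b"]]
```

In this sketch a hit costs one hash lookup and one bit write, which is the property that makes CLOCK-style policies attractive when per-packet processing time is limited.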
Risa YASHIRO Takeshi SUGAWARA Mitsugu IWAMOTO Kazuo SAKIYAMA
A Physically Unclonable Function (PUF) is a cryptographic primitive that is based on the physical properties of each entity or Integrated Circuit (IC) chip. PUFs are expected to be used in security applications such as ID generation and authentication. Some responses from a PUF are unreliable, and they are usually discarded. In this paper, we propose a new PUF-based authentication system that exploits information from unreliable responses. In the proposed method, each response is categorized into one of multiple classes according to its unreliability, which is evaluated by feeding the same challenge several times. This authentication system is named Q-class authentication, where Q is the number of classes. We perform experiments assuming a challenge-response authentication system with a certain threshold of errors. Considering 4-class separation for the 4-1 Double Arbiter PUF, we find that the advantage of a legitimate prover against a clone is improved from 24% to 36% in terms of success rate. In other words, it is possible to improve the tolerance to machine-learning attacks by using unreliable information that was previously regarded as disadvantageous to authentication systems.
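The classification step can be pictured with a small sketch. The abstract does not state how class boundaries are chosen, so the equal-width binning of the fraction of 1-responses used here is purely an assumption, and `classify_response` is a hypothetical name.

```python
def classify_response(samples, q=4):
    """Map repeated 1-bit responses for one challenge to a class in 0..q-1.

    Assumption for illustration: classes are equal-width bins over the
    fraction of 1s observed, so class 0 is "stable 0" and class q-1 is
    "stable 1", with the unreliable responses falling in between.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if q < 2:
        raise ValueError("q must be at least 2")
    ones = sum(samples)
    # Integer arithmetic avoids floating-point edge cases at bin boundaries.
    return min(ones * q // len(samples), q - 1)


stable_zero = classify_response([0] * 10)
leaning_zero = classify_response([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
leaning_one = classify_response([1, 1, 1, 1, 1, 1, 1, 0, 0, 0])
stable_one = classify_response([1] * 10)
```

With Q = 2 this reduces to an ordinary majority-style 1-bit response; the extra classes are what carry the unreliability information.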
Takuma NAKAJIMA Masato YOSHIMI Celimuge WU Tsutomu YOSHINAGA
Cooperative caching is a key technique for reducing the rapidly growing traffic of video-on-demand services by aggregating multiple cache storages. Existing strategies periodically calculate a sub-optimal allocation of the content caches in the network. Although such techniques can reduce the traffic generated between servers, they come at the cost of a large computational overhead. This overhead prevents the caches from following rapid changes in the access pattern. In this paper, we propose a light-weight scheme for cooperative caching that groups contents and servers with color tags. In our proposal, we associate servers and caches through a color tag, with the aim of increasing the effective cache capacity by storing different contents among servers. In addition to the color tags, we propose a novel hybrid caching scheme that divides its storage area into colored LFU (Least Frequently Used) and no-color LRU (Least Recently Used) areas. The colored LFU area stores color-matching contents to increase the cache hit rate, and the no-color LRU area follows rapid changes in access patterns by storing popular contents regardless of their tags. On top of the proposed architecture, we also present a new routing algorithm that takes advantage of the color tag information to reduce traffic by fetching cached contents from the nearest server. Evaluation results using a backbone network topology showed that our color-tag-based caching scheme could achieve performance close to the sub-optimal one obtained with a genetic algorithm calculation, with only a few seconds of computational overhead. Furthermore, the proposed hybrid caching could limit the degradation of hit rate from 13.9% in conventional non-colored LFU to only 2.3%, which demonstrates the capability of our scheme to follow rapid insertions of new popular contents. Finally, the color-based routing scheme could reduce the traffic by up to 31.9% compared with shortest-path routing.
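The hybrid storage layout described above might be sketched as follows. This is one possible reading under stated assumptions, not the authors' implementation: the class name `HybridColorCache`, the admission rule for the LFU area (replace the least frequently requested entry only when the newcomer has a strictly higher count), and the fallback of rejected color-matching content to the LRU area are all assumptions made for illustration.

```python
from collections import OrderedDict


class HybridColorCache:
    """One server's cache split into a colored LFU area and a no-color LRU area."""

    def __init__(self, color, lfu_size, lru_size):
        if lfu_size < 1 or lru_size < 1:
            raise ValueError("both areas need at least one slot")
        self.color = color
        self.lfu_size = lfu_size
        self.lru_size = lru_size
        self.freq = {}            # request counts for color-matching content
        self.lfu = set()          # colored LFU area
        self.lru = OrderedDict()  # no-color LRU area (oldest entry first)

    def _admit_to_lfu(self, content_id):
        """Try to place color-matching content in the LFU area."""
        if content_id in self.lfu:
            return True
        if len(self.lfu) < self.lfu_size:
            self.lfu.add(content_id)
            return True
        coldest = min(self.lfu, key=self.freq.__getitem__)
        if self.freq[content_id] > self.freq[coldest]:
            self.lfu.remove(coldest)
            self.lfu.add(content_id)
            return True
        return False

    def request(self, content_id, content_color):
        """Return True if the content was already cached, then update the cache."""
        hit = content_id in self.lfu or content_id in self.lru
        if content_color == self.color:
            self.freq[content_id] = self.freq.get(content_id, 0) + 1
            if self._admit_to_lfu(content_id):
                self.lru.pop(content_id, None)  # avoid storing it twice
                return hit
        # Non-matching (or not yet frequent enough) content uses the LRU area.
        if content_id in self.lru:
            self.lru.move_to_end(content_id)
        else:
            self.lru[content_id] = True
            if len(self.lru) > self.lru_size:
                self.lru.popitem(last=False)  # evict the least recently used
        return hit


cache = HybridColorCache("red", lfu_size=1, lru_size=1)
trace = [("a", "red"), ("b", "blue"), ("a", "red"), ("b", "blue"),
         ("c", "blue"), ("b", "blue"), ("d", "red"), ("d", "red")]
hits = [cache.request(cid, col) for cid, col in trace]
```

The LRU area reacts to a newly popular item on its first request, while the LFU area changes only when request counts justify it, which mirrors the division of roles stated in the abstract.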
Tetsunao MATSUTA Tomohiko UYEMATSU
This paper deals with a broadcast network with a server and many users. The server has files of content such as music and videos, and each user requests one of these files, where each file consists of several separate layers, as in a file encoded by scalable video coding. Each user has a local memory, and a part of the information of the files is cached (i.e., stored) in these memories in advance of users' requests. By using the cached information as side information, the server encodes files based on users' requests. Then, it sends a codeword through an error-free shared link over which all users can receive a common codeword from the server without error. We assume that the server transmits layers up to a certain level of the requested files at a different transmission rate (i.e., the codeword length per file size) for each level. In this paper, we focus on the region of tuples of these rates such that layers up to any level of the requested files are recovered at the users with an arbitrarily small error probability. Then, we give inner and outer bounds on this region.
Tomohiko UYEMATSU Tetsunao MATSUTA
We consider the intrinsic randomness problem for correlated sources. Specifically, there are three correlated sources, and we want to extract two mutually independent random numbers by using two separate mappings, where each mapping converts one of the output sequences from two correlated sources into a random number. In addition, we assume that the obtained pair of random numbers is also independent of the output sequence from the third source. We first show the δ-achievable rate region, which the rate pair of the two mappings must satisfy in order to keep the approximation error within δ ∈ [0,1), and the second-order achievable rate region for correlated general sources. Then, we apply our results to non-mixed and mixed independently and identically distributed (i.i.d.) correlated sources, and reveal that the second-order achievable rate region for these sources can be represented in terms of the sum of normal distributions.
Feng LIU Yanli XU Conggai LI Xuan GENG
The hidden terminal (HT) effect in multi-hop cascaded wireless networks with omni-directional full-duplex relays causes data collisions. To avoid this HT problem, we allocate the frequency band among different hops in an orthogonal way based on a link grouping strategy. In order to maximize the achievable rate, an optimal frequency allocation scheme based on boundary alignment is proposed. Performance analyses are provided and further validated by simulation results.
Kento HASEGAWA Masao YANAGISAWA Nozomu TOGAWA
It has been reported that malicious third-party IC vendors often insert hardware Trojans into their IC products. How to detect them is a critical concern in the IC design process. Machine-learning-based hardware-Trojan detection offers a strong solution to this problem. Hardware-Trojan-infected nets (or Trojan nets) in ICs must have particular Trojan-net features that differ from those of normal nets. In order to classify all the nets in a netlist designed by third-party vendors into Trojan nets and normal ones by machine learning, we have to extract effective Trojan-net features from Trojan nets. In this paper, we first propose 51 Trojan-net features that describe Trojan nets well. After that, we select random forest as one of the best candidates for machine learning and optimize it for hardware-Trojan detection. Based on the importance values obtained from the optimized random forest classifier, we extract the best set of 11 Trojan-net features out of the 51 features, which can effectively classify the nets into Trojan ones and normal ones while maximizing the F-measure. By using the 11 extracted Trojan-net features, our optimized random forest classifier has achieved up to a 100% true positive rate as well as a 100% true negative rate in several Trust-HUB benchmarks and obtained an average F-measure of 79.3% and an accuracy of 99.2%, which are the best values among existing machine-learning-based hardware-Trojan detection methods.
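The gap between the reported accuracy and F-measure is typical when one class is rare, as Trojan nets are in a netlist. The sketch below computes both metrics from a confusion matrix; the counts are hypothetical numbers chosen for illustration and are not taken from the paper, and the function names are likewise assumptions.

```python
def f_measure(tp, fp, fn):
    """F-measure (harmonic mean of precision and recall) from raw counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all nets classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical netlist: 1000 nets, 10 of them Trojan nets.
# All 10 Trojan nets are found (tp=10, fn=0) with 5 false alarms (fp=5).
tp, fn, fp, tn = 10, 0, 5, 985
f1 = f_measure(tp, fp, fn)
acc = accuracy(tp, tn, fp, fn)
```

In this made-up case a handful of false alarms barely moves the accuracy but lowers the F-measure noticeably, which is why the F-measure is the more demanding target when selecting features.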
Runzi ZHANG Jinlin WANG Yiqiang SHENG Xiao CHEN Xiaozhou YE
Cache affinity has been proved to have great impact on the performance of packet processing applications on multi-core platforms. Flow-based packet scheduling can make the best of data cache affinity with flow associated data and context structures. However, little work on packet scheduling algorithms has been conducted when it comes to instruction cache (I-Cache) affinity in modified pipelining (MPL) architecture for multi-core systems. In this paper, we propose a protocol-aware packet scheduling (PAPS) algorithm aiming at maximizing I-Cache affinity at protocol dependent stages in MPL architecture for multi-protocol processing (MPP) scenario. The characteristics of applications in MPL are analyzed and a mapping model is introduced to illustrate the procedure of MPP. Besides, a stage processing time model for MPL is presented based on the analysis of multi-core cache hierarchy. PAPS is a kind of flow-based packet scheduling algorithm and it schedules flows in consideration of both application-level protocol of flows and load balancing. Experiments demonstrate that PAPS outperforms the Round Robin algorithm and the HRW-based (HRW) algorithm for MPP applications. In particular, PAPS can eliminate all I-Cache misses at protocol dependent stage and reduce the average CPU cycle consumption per packet by more than 10% in comparison with HRW.
Kha Cong NGUYEN Cuong Tuan NGUYEN Masaki NAKAGAWA
This paper presents a method to segment single- and multiple-touching characters in offline handwritten Japanese text recognition with practical speed. Distortions due to handwriting and a mix of complex Chinese characters with simple phonetic and alphanumeric characters leave optical handwritten text recognition (OHTR) for Japanese still far from perfect. Segmentation of characters that touch their neighbors at multiple points is a serious unsolved problem. Therefore, we propose a method that segments them in two steps: coarse segmentation and fine segmentation. The coarse segmentation employs vertical projection and stroke-width estimation, while the fine segmentation takes a graph-based approach to thinned text images, employing a new bridge-finding process and Voronoi diagrams with two improvements. Unlike previous methods, it locates character centers and seeks segmentation candidates between them. It draws vertical lines explicitly at estimated character centers in order to prevent vertically unconnected components from being left behind in the bridge finding. Multiple separation candidates are produced by removing touching points combinatorially. An SVM is applied to discard improbable segmentation boundaries. Then, the remaining ambiguities are resolved by text recognition that employs linguistic context and geometric context to recognize the segmented characters. The results of our experiments show that the proposed method can segment not only single-touching characters but also multiple-touching characters, and that each component of the proposed method contributes to the improvement of segmentation and recognition rates.
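As background for the coarse step, the following sketch shows plain vertical projection on a binary image: ink pixels are counted per column and the line is split wherever a column is empty. It covers only this textbook building block, assuming a horizontal text line stored as rows of 0/1 values; the stroke-width estimation and the fine segmentation of touching characters are not reproduced, and the function names are hypothetical.

```python
def vertical_projection(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def split_at_gaps(image):
    """Return (start, end) column ranges separated by empty columns.

    Each range is half-open: columns start..end-1 contain ink.
    """
    profile = vertical_projection(image)
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a run of ink columns begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column closes the run
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


image = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
]
profile = vertical_projection(image)
segments = split_at_gaps(image)
```

Touching characters produce no empty column between them, so a projection of this kind cannot separate them by itself, which is the case the fine segmentation is meant to handle.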
Li CHEN Ling YANG Juan DU Chao SUN Shenglei DU Haipeng XI
Extreme learning machine (ELM) has recently attracted many researchers' interest due to its very fast learning speed, good generalization ability, and ease of implementation. However, it has a linear output layer which may limit the capability of exploring the available information, since higher-order statistics of the signals are not taken into account. To address this, we propose a novel ELM architecture in which the linear output layer is replaced by a Volterra filter structure. Additionally, the principal component analysis (PCA) technique is used to reduce the number of effective signals transmitted to the output layer. This idea not only improves the processing capability of the network, but also preserves the simplicity of the training process. Then we carry out performance evaluation and application analysis for the proposed architecture in the context of supervised classification and unsupervised equalization respectively, and the obtained results either on publicly available datasets or various channels, when compared to those produced by already proposed ELM versions and a state-of-the-art algorithm: support vector machine (SVM), highlight the adequacy and the advantages of the proposed architecture and characterize it as a promising tool to deal with signal processing tasks.
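To make the role of the Volterra output layer and of PCA more concrete, here is a minimal sketch of a truncated Volterra expansion: the output is a weighted sum of a constant, the inputs, and their products up to a chosen order. The function names and the choice of order 2 as the default are assumptions for illustration; the abstract does not specify the order or the training procedure.

```python
from itertools import combinations_with_replacement


def volterra_features(x, order=2):
    """Expand inputs into a constant, linear terms, and products up to `order`."""
    features = [1.0]
    for k in range(1, order + 1):
        for idx in combinations_with_replacement(range(len(x)), k):
            term = 1.0
            for i in idx:
                term *= x[i]
            features.append(term)
    return features


def volterra_output(x, weights, order=2):
    """Output that is linear in the weights but nonlinear in the inputs."""
    features = volterra_features(x, order)
    if len(weights) != len(features):
        raise ValueError("one weight per expanded feature is required")
    return sum(w * f for w, f in zip(weights, features))


expanded = volterra_features([2.0, 3.0])
n_terms_small = len(volterra_features([0.0] * 3))
n_terms_large = len(volterra_features([0.0] * 20))
```

Because the output stays linear in the weights, they can still be fitted by least squares, as in a standard ELM. The number of terms grows quickly with the number of inputs, which is consistent with the abstract's use of PCA to reduce the signals passed to the output layer.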
We propose a method for automatic emphasis estimation using conditional random fields. In our experiments, the F-measure obtained using our proposed method (0.31) was higher than that obtained using a random emphasis method (0.20), a method using TF-IDF (0.21), and a method based on LexRank (0.26). On the other hand, the F-measure obtained using our proposed method (0.28) was slightly worse than that obtained using manual estimation (0.26-0.40, with an average of 0.35).
Sadahiro TANI Toshimasa MATSUOKA Yusaku HIRAI Toshifumi KURATA Keiji TATSUMI Tomohiro ASANO Masayuki UEDA Takatsugu KAMATA
In the present paper, we propose a novel high-resolution analog-to-digital converter (ADC) for low-power biomedical analog front-ends, which we call the successive stochastic approximation ADC. The proposed ADC uses a stochastic flash ADC (SF-ADC) to realize a digitally controlled variable-threshold comparator in a successive-approximation-register ADC (SAR-ADC), which can correct errors originating from the internal digital-to-analog converter in the SAR-ADC. For the residual error after SAR-ADC operation, which can be smaller than thermal noise, the SF-ADC uses the statistical characteristics of noise to achieve high resolution. The SF-ADC output for the residual signal is combined with the SAR-ADC output to obtain high-precision output data using the supervised machine learning method.
Kohei TATENO Takahiro OGAWA Miki HASEYAMA
A novel dimensionality reduction method, Fisher Discriminant Locality Preserving Canonical Correlation Analysis (FDLP-CCA), for visualizing Web images is presented in this paper. FDLP-CCA can integrate two modalities and discriminate target items in terms of their semantics by considering the unique characteristics of the two modalities. In this paper, we focus on Web images with text uploaded to Social Networking Services as these two modalities. Specifically, text features have high discriminative power in terms of semantics. On the other hand, visual features of images capture their perceptual relationships. In order to consider both of these unique characteristics, FDLP-CCA estimates the correlation between the text and visual features while taking into account the cluster structure based on the text features and the local structures based on the visual features. Thus, FDLP-CCA can integrate the different modalities and provide separated manifolds with enhanced compactness within each natural cluster.
Li WANG Xiaoan TANG Junda ZHANG Dongdong GUAN
Volume segmentation is of great significance for feature visualization and feature extraction. Essentially, volume segmentation can be viewed as generalized clustering. This paper proposes a hybrid approach to volume segmentation that combines symmetric region growing (SRG) and information diffusion estimation (IDE). The volume dataset is first over-segmented into a series of subsets by SRG, and the subsets are then clustered by K-Means based on a distance metric derived from IDE. Experiments illustrate the superiority of the hybrid approach, which achieves better segmentation performance.
Rachelle RIVERO Richard LEMENCE Tsuyoshi KATO
With the huge influx of various data nowadays, extracting knowledge from them has become an interesting but tedious task among data scientists, particularly when the data come in heterogeneous form and have missing information. Many data completion techniques have been introduced, especially since the advent of kernel methods, which provide a way to represent heterogeneous data sets in a single form: as kernel matrices. However, among the many data completion techniques available in the literature, the mutual completion of several incomplete kernel matrices has not yet been given much attention. In this paper, we present a new method, called the Mutual Kernel Matrix Completion (MKMC) algorithm, that tackles this problem of mutually inferring the missing entries of multiple kernel matrices by combining the notions of data fusion and kernel matrix completion, and we apply it to biological data sets to be used for a classification task. We first introduce an objective function that is minimized by exploiting the EM algorithm, which in turn results in an estimate of the missing entries of the kernel matrices involved. The completed kernel matrices are then combined to produce a model matrix that can be used to further improve the obtained estimates. An interesting result of our study is that the E-step and the M-step are given in closed form, which makes our algorithm efficient in terms of time and memory. After completion, the completed kernel matrices are used to train an SVM classifier to test how well the relationships among the entries are preserved. Our empirical results show that the proposed algorithm outperforms the traditional completion techniques in preserving the relationships among the data points and in accurately recovering the missing kernel matrix entries. Thus, MKMC offers a promising solution to the problem of mutual estimation of a number of relevant incomplete kernel matrices.
Weiwei XING Shibo ZHAO Shunli ZHANG Yuanyuan CAI
Crowd modeling and simulation is an active research field that has drawn increasing attention from industry, academia and government recently. In this paper, we present a generic data-driven approach to generate crowd behaviors that can match the video data. The proposed approach is a bi-layer model to simulate crowd behaviors in pedestrian traffic in terms of exclusion statistics, parallel dynamics and social psychology. The bottom layer models the microscopic collision avoidance behaviors, while the top one focuses on the macroscopic pedestrian behaviors. To validate its effectiveness, the approach is applied to generate collective behaviors and re-create scenarios in the Informatics Forum, the main building of the School of Informatics at the University of Edinburgh. The simulation results demonstrate that the proposed approach is able to generate desirable crowd behaviors and offer promising prediction performance.
To drastically increase the splitting ratio of extended-reach (40km span) time- and wavelength-division multiplexed passive optical networks (WDM/TDM-PONs), we modify the gain control scheme of our automatic gain controlled semiconductor optical amplifiers (AGC-SOAs) that were developed to support upstream transmission in long-reach systems. While the original AGC-SOAs are located outside the central office (CO) as repeaters, the new AGC-SOAs are located inside the CO and connected to each branch of an optical splitter in the CO. This arrangement has the potential to greatly reduce the costs of CO-sited equipment as they are shared by many more users if the new gain control scheme works properly even when the input optical powers are low. We develop a prototype and experimentally confirm its effectiveness in increasing the splitting ratio of extended-reach systems to 512.
Shunichi BUSHISUE Satoshi SUYAMA Satoshi NAGATA Nobuhiko MIKI
In future 5G radio access, support for the Internet of Things (IoT), which is called machine type communications, is becoming more important. Unlike current mobile communication systems, machine type communications generate relatively small packets. In order to support such small packets with high reliability, channel coding techniques are indispensable. One of the most effective channel codes in such conditions is the tail-biting convolutional code, which is used in LTE systems because of its good performance for small packet sizes. By employing a list Viterbi algorithm for the tail-biting convolutional code, the block error rate (BLER) performance is further improved. Therefore, this paper evaluates the BLER performance of several list Viterbi algorithms, i.e., the circular parallel list Viterbi algorithm (CPLVA), per stage CPLVA (PSCPLVA), and successive state and sequence estimation (SSSE). In the evaluation, computational complexity is also taken into account. It is shown that the performance of the CPLVA is better over a wide range of the computational complexity defined in this paper.
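For readers unfamiliar with tail biting, the sketch below encodes a block with a rate-1/3, constraint-length-7 convolutional code whose shift register is preloaded with the last six information bits, so the encoder starts and ends in the same state and no tail bits are spent. The generator polynomials 133, 171, and 165 (octal) are assumed here as the values commonly cited for LTE; the bit-ordering convention and the per-bit interleaving of the three outputs are choices of this sketch and have not been checked against the specification. The decoders compared in the paper (CPLVA, PSCPLVA, SSSE) are not shown.

```python
GENERATORS = (0o133, 0o171, 0o165)  # assumed generator polynomials (octal)
CONSTRAINT_LENGTH = 7
MEMORY = CONSTRAINT_LENGTH - 1


def _parity(value):
    """Return 1 if `value` has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


def initial_state(bits):
    """Preload the register with the last MEMORY bits (most recent bit on top)."""
    state = 0
    for b in bits[-MEMORY:]:
        state = (b << (MEMORY - 1)) | (state >> 1)
    return state


def tail_biting_encode(bits):
    """Encode 0/1 bits; returns three output bits per input bit, no tail bits."""
    if len(bits) < MEMORY:
        raise ValueError("block must contain at least %d bits" % MEMORY)
    state = initial_state(bits)
    coded = []
    for b in bits:
        window = (b << MEMORY) | state  # current bit plus the six previous bits
        coded.extend(_parity(window & g) for g in GENERATORS)
        state = window >> 1             # shift the current bit into the register
    # Tail-biting property: the final state equals the initial state.
    assert state == initial_state(bits)
    return coded


message = [1, 0, 1, 1, 0, 0, 1, 0]
coded = tail_biting_encode(message)
rotated = tail_biting_encode(message[1:] + message[:1])
```

A consequence of the circular structure is that rotating the message rotates the codeword, which is what the circular variants of the Viterbi algorithm exploit. Saving the six tail bits matters most for the small packets discussed in the abstract, where they would otherwise be a noticeable fraction of the block.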
Ying MA Shunzhi ZHU Yumin CHEN Jingjing LI
A transfer learning method, called Kernel Canonical Correlation Analysis plus (KCCA+), is proposed for heterogeneous cross-company defect prediction. By combining the kernel method with transfer learning techniques, this method improves the performance of the predictor and gives it greater adaptive ability in nonlinearly separable scenarios. Experiments validate its effectiveness.
Masakazu MORIMOTO Naotake KAMIURA Yutaka HATA Ichiro YAMAMOTO
To promote effective guidance based on health checkup results, this paper predicts the likelihood of developing lifestyle-related diseases from health checkup data. In this paper, we focus on the fluctuation of the hemoglobin A1c (HbA1c) value, which is deeply connected with diabetes onset. Here we predict increases in the HbA1c value and examine which health checkup items play an important role in HbA1c fluctuation. Our experimental results show that, when we classify the subjects according to their gender and triglyceride (TG) fluctuation value, we can effectively evaluate the risk of diabetes onset for each class.