Sufen ZHAO Rong PENG Meng ZHANG Liansheng TAN
It is of great importance to recommend collaborators for scholars in academic social networks, which can benefit more scientific research results. Facing the problem of data sparsity of co-author recommendation in academic social networks, a novel recommendation algorithm named HeteroRWR (Heterogeneous Random Walk with Restart) is proposed. Different from the basic Random Walk with Restart (RWR) model which only walks in homogeneous networks, HeteroRWR implements multiple random walks in a heterogeneous network which integrates a citation network and a co-authorship network to mine the k mostly valuable co-authors for target users. By introducing the citation network, HeteroRWR algorithm can find more suitable candidate authors when the co-authorship network is extremely sparse. Candidate recommenders will not only have high topic similarities with target users, but also have good community centralities. Analyses on the convergence and time efficiency of the proposed approach are presented. Extensive experiments have been conducted on DBLP and CiteSeerX datasets. Experimental results demonstrate that HeteroRWR outperforms state-of-the-art baseline methods in terms of precision and recall rate even in the case of incorporating an incomplete citation dataset.
This paper proposes a method for searching for graphs in the database which are contained as subgraphs by a given query. In the proposed method, the search index does not require any knowledge of the query set or the frequent subgraph patterns. In conventional techniques, enumerating and selecting frequent subgraph patterns is computationally expensive, and the distribution of the query set must be known in advance. Subsequent changes to the query set require the frequent patterns to be selected again and the index to be reconstructed. The proposed method overcomes these difficulties through graph coding, using a tree structured index that contains infrequent subgraph patterns in the shallow part of the tree. By traversing this code tree, we are able to rapidly determine whether multiple graphs in the database contain subgraphs that match the query, producing a powerful pruning or filtering effect. Furthermore, the filtering and verification steps of the graph search can be conducted concurrently, rather than requiring separate algorithms. As the proposed method does not require the frequent subgraph patterns and the query set, it is significantly faster than previous techniques; this independence from the query set also means that there is no need to reconstruct the search index when the query set changes. A series of experiments using a real-world dataset demonstrate the efficiency of the proposed method, achieving a search speed several orders of magnitude faster than the previous best.
This paper proposes a visual analytics (VA) interface for time-series data so that it can solve the problems arising from the property of time-series data: a collision between interaction and animation on the temporal aspect, collision of interaction between the temporal and spatial aspects, and the trade-off of exploration accuracy, efficiency, and scalability between different visualization methods. To solve these problems, this paper proposes a VA interface that can handle temporal and spatial changes uniformly. Trajectories can show temporal changes spatially, of which direct manipulation enables to examine the relationship among objects either at a certain time point or throughout the entire time range. The usefulness of the proposed interface is demonstrated through experiments.
Jianyong DUAN Yuwei WU Mingli WU Hao WANG
The similarity of words extracted from the rich text relation network is the main way to calculate the semantic similarity. Complex relational information and text content in Wikipedia website, Community Question Answering and social network, provide abundant corpus for semantic similarity calculation. However, most typical research only focused on single relationship. In this paper, we propose a semantic similarity calculation model which integrates multiple relational information, and map multiple relationship to the same semantic space through learning representing matrix and semantic matrix to improve the accuracy of semantic similarity calculation. In experiments, we confirm that the semantic calculation method which integrates many kinds of relationships can improve the accuracy of semantic calculation, compared with other semantic calculation methods.
Huaizhe ZHOU Haihe BA Yongjun WANG Tie HONG
The arms race between offense and defense in the cloud impels the innovation of techniques for monitoring attacks and unauthorized activities. The promising technique of virtual machine introspection (VMI) becomes prevalent for its tamper-resistant capability. However, some elaborate exploitations are capable of invalidating VMI-based tools by breaking the assumption of a trusted guest kernel. To achieve a more reliable and robust introspection, we introduce a practical approach to monitor and detect attacks that attempt to subvert VMI in this paper. Our approach combines supervised machine learning and hardware architectural events to identify those malicious behaviors which are targeted at VMI techniques. To demonstrate the feasibility, we implement a prototype named HyperMon on the Xen hypervisor. The results of our evaluation show the effectiveness of HyperMon in detecting malicious behaviors with an average accuracy of 90.51% (AUC).
Pengyu WANG Hongqing ZHU Ning CHEN
A novel superpixel segmentation approach driven by uniform mixture model with spatially constrained (UMMS) is proposed. Under this algorithm, each observation, i.e. pixel is first represented as a five-dimensional vector which consists of colour in CLELAB space and position information. And then, we define a new uniform distribution through adding pixel position, so that this distribution can describe each pixel in input image. Applied weighted 1-Norm to difference between pixels and mean to control the compactness of superpixel. In addition, an effective parameter estimation scheme is introduced to reduce computational complexity. Specifically, the invariant prior probability and parameter range restrict the locality of superpixels, and the robust mean optimization technique ensures the accuracy of superpixel boundaries. Finally, each defined uniform distribution is associated with a superpixel and the proposed UMMS successfully implements superpixel segmentation. The experiments on BSDS500 dataset verify that UMMS outperforms most of the state-of-the-art approaches in terms of segmentation accuracy, regularity, and rapidity.
We propose an image identification scheme for double-compressed encrypted JPEG images that aims to identify encrypted JPEG images that are generated from an original JPEG image. To store images without any visual sensitive information on photo sharing services, encrypted JPEG images are generated by using a block-scrambling-based encryption method that has been proposed for Encryption-then-Compression systems with JPEG compression. In addition, feature vectors robust against JPEG compression are extracted from encrypted JPEG images. The use of the image encryption and feature vectors allows us to identify encrypted images recompressed multiple times. Moreover, the proposed scheme is designed to identify images re-encrypted with different keys. The results of a simulation show that the identification performance of the scheme is high even when images are recompressed and re-encrypted.
In DNA data storage and computation, DNA strands are required to meet certain combinatorial constraints. This paper shows how some of these constraints can be achieved simultaneously. First, we use the algebraic structure of irreducible cyclic codes over finite fields to generate cyclic DNA codes that satisfy reverse and complement properties. We show how such DNA codes can meet constant guanine-cytosine content constraint by MacWilliams-Seery algorithm. Second, we consider fulfilling the run-length constraint in parallel with the above constraints, which allows a maximum predetermined number of consecutive duplicates of the same symbol in each DNA strand. Since irreducible cyclic codes can be represented in terms of the trace function over finite field extensions, the linearity of the trace function is used to fulfill a predefined run-length constraint. Thus, we provide an algorithm for constructing cyclic DNA codes with the above properties including run-length constraint. We show numerical examples to demonstrate our algorithms generating such a set of DNA strands with all the prescribed constraints.
Xingyu ZHANG Xia ZOU Meng SUN Penglong WU Yimin WANG Jun HE
In order to improve the noise robustness of automatic speaker recognition, many techniques on speech/feature enhancement have been explored by using deep neural networks (DNN). In this work, a DNN multi-level enhancement (DNN-ME), which consists of the stages of signal enhancement, cepstrum enhancement and i-vector enhancement, is proposed for text-independent speaker recognition. Given the fact that these enhancement methods are applied in different stages of the speaker recognition pipeline, it is worth exploring the complementary role of these methods, which benefits the understanding of the pros and cons of the enhancements of different stages. In order to use the capabilities of DNN-ME as much as possible, two kinds of methods called Cascaded DNN-ME and joint input of DNNs are studied. Weighted Gaussian mixture models (WGMMs) proposed in our previous work is also applied to further improve the model's performance. Experiments conducted on the Speakers in the Wild (SITW) database have shown that DNN-ME demonstrated significant superiority over the systems with only a single enhancement for noise robust speaker recognition. Compared with the i-vector baseline, the equal error rate (EER) was reduced from 5.75 to 4.01.
Yun ZHANG Bingrui LI Shujuan YU Meisheng ZHAO
In this paper, we propose a new scheme which uses blind detection algorithm for recovering the conventional user signal in a system which the sporadic machine-to-machine (M2M) communication share the same spectrum with the conventional user. Compressive sensing techniques are used to estimate the M2M devices signals. Based on the Hopfield neural network (HNN), the blind detection algorithm is used to recover the conventional user signal. The simulation results show that the conventional user signal can be effectively restored under an unknown channel. Compared with the existing methods, such as using the training sequence to estimate the channel in advance, the blind detection algorithm used in this paper with no need for identifying the channel, and can directly detect the transmitted signal blindly.
Chun-Jung WU Shin-Ying HUANG Katsunari YOSHIOKA Tsutomu MATSUMOTO
A drastic increase in cyberattacks targeting Internet of Things (IoT) devices using telnet protocols has been observed. IoT malware continues to evolve, and the diversity of OS and environments increases the difficulty of executing malware samples in an observation setting. To address this problem, we sought to develop an alternative means of investigation by using the telnet logs of IoT honeypots and analyzing malware without executing it. In this paper, we present a malware classification method based on malware binaries, command sequences, and meta-features. We employ both unsupervised or supervised learning algorithms and text-mining algorithms for handling unstructured data. Clustering analysis is applied for finding malware family members and revealing their inherent features for better explanation. First, the malware binaries are grouped using similarity analysis. Then, we extract key patterns of interaction behavior using an N-gram model. We also train a multiclass classifier to identify IoT malware categories based on common infection behavior. For misclassified subclasses, second-stage sub-training is performed using a file meta-feature. Our results demonstrate 96.70% accuracy, with high precision and recall. The clustering results reveal variant attack vectors and one denial of service (DoS) attack that used pure Linux commands.
Shaojing FU Yunpeng YU Ming XU
Cloud computing enables computational resource-limited devices to economically outsource much computations to the cloud. Modular exponentiation is one of the most expensive operations in public key cryptographic protocols, and such operation may be a heavy burden for the resource-constraint devices. Previous works for secure outsourcing modular exponentiation which use one or two untrusted cloud server model or have a relatively large computational overhead, or do not support the 100% possibility for the checkability. In this letter, we propose a new efficient and verifiable algorithm for securely outsourcing modular exponentiation in the two untrusted cloud server model. The algorithm improves efficiency by generating random pairs based on EBPV generators, and the algorithm has 100% probability for the checkability while preserving the data privacy.
We propose a method of non-blind speech watermarking based on direct spread spectrum (DSS) using a linear prediction scheme to solve sound distortion due to spread spectrum. Results of evaluation simulations revealed that the proposed method had much lower sound-quality distortion than the DSS method while having almost the same bit error ratios (BERs) against various attacks as the DSS method.
Deng-Fong LU Chin HSIA Kun-Chu LEE
The paper presents a low power, wideband operational trans-conductance amplifier (OTA) for applications to drive large capacitive loads. In order to satisfy the low static power dissipation, high-speed, while reserving high current driving capability, the complementary slew-rate enhancer in conjunction with a dual class AB input stage to improve the slew-rate of a rail-to-rail two-stage OTA is proposed. The proposed architecture was implemented using 0.5µm CMOS process with a supply voltage of 5V. The slew-rate can achieve 68V/µsec at static power dissipation of 0.9mW, which can be used to efficiently drive larger than 6 nF capacitive load. The measured output has a total harmonic distortion of less than 5%.
Tung Thanh VU Duy Trong NGO Minh N. DAO Quang-Thang DUONG Minoru OKADA Hung NGUYEN-LE Richard H. MIDDLETON
This paper studies the joint optimization of precoding, transmit power and data rate allocation for energy-efficient full-duplex (FD) cloud radio access networks (C-RANs). A new nonconvex problem is formulated, where the ratio of total sum rate to total power consumption is maximized, subject to the maximum transmit powers of remote radio heads and uplink users. An iterative algorithm based on successive convex programming is proposed with guaranteed convergence to the Karush-Kuhn-Tucker solutions of the formulated problem. Numerical examples confirm the effectiveness of the proposed algorithm and show that the FD C-RANs can achieve a large gain over half-duplex C-RANs in terms of energy efficiency at low self-interference power levels.
Kazuki NORITAKE Shouhei KIDERA
Microwave mammography is a promising alternative to X-ray based imaging modalities, because of its small size, low cost, and cell-friendly exposure. More importantly, this modality enables the suppression of surface reflection clutter, which can be enhanced by introducing accurate surface shape estimations. However, near-field measurements can reduce the shape estimation accuracy, due to a mismatch between the reference and observed waveforms. To mitigate this problem, this study incorporates envelope-based shape estimation and finite-difference time-domain (FDTD)-based waveform correction with a fractional derivative adjustment. Numerical simulations based on realistic breast phantoms derived from magnetic resonance imaging (MRI) show that the proposed method significantly enhances the accuracy of breast surface imaging and the performance of surface clutter rejection.
In this paper, we propose the decomposition ring homomorphic encryption scheme, that is a homomorphic encryption scheme built on the decomposition ring, which is a subring of cyclotomic ring. By using the decomposition ring the structure of plaintext slot becomes ℤpl, instead of GF(pd) in conventional schemes on the cyclotomic ring. For homomorphic multiplication of integers, one can use the full of ℤpl slots using the proposed scheme, although in conventional schemes one can use only one-dimensional subspace GF(p) in each GF(pd) slot. This allows us to realize fast and compact homomorphic encryption for integer plaintexts. In fact, our benchmark results indicate that our decomposition ring homomorphic encryption schemes are several times faster than HElib for integer plaintexts due to its higher parallel computation.
In 2015, Carlet and Tang [Des. Codes Cryptogr. 76(3): 571-587, 2015] proposed a concept called enhanced Boolean functions and a class of such kind of functions on odd number of variables was constructed. They proved that the constructed functions in this class have optimal algebraic immunity if the numbers of variables are a power of 2 plus 1 and at least sub-optimal algebraic immunity otherwise. In addition, an open problem that if there are enhanced Boolean functions with optimal algebraic immunity and maximal algebraic degree n-1 on odd variables n≠2k+1 was proposed. In this letter, we give a negative answer to the open problem, that is, we prove that there is no enhanced Boolean function on odd n≠2k+1 variables with optimal algebraic immunity and maximal algebraic degree n-1.