Qian HU Muqing WU Song GUO Hailong HAN Chaoyi ZHANG
Information-centric networking (ICN) is a promising architecture and has attracted much attention in the area of future Internet architectures. As one of the key technologies in ICN, in-network caching can enhance content retrieval at a global scale without requiring any special infrastructure. In this paper, we propose a workload-aware caching policy, LRU-GT, which allows cache nodes to protect newly cached contents for a period of time (guard time) during which contents are protected from being replaced. LRU-GT can utilize the temporal locality and distinguish contents of different popularity, which are both the characteristics of the workload. Cache replacement is modeled as a semi-Markov process under the Independent Reference Model (IRM) assumption and a theoretical analysis proves that popular contents have longer sojourn time in the cache compared with unpopular ones in LRU-GT and the value of guard time can affect the cache hit ratio. We also propose a dynamic guard time adjustment algorithm to optimize the performance. Simulation results show that LRU-GT can reduce the average hops to get contents and improve cache hit ratio.
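The guard-time idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `LRUGuardTimeCache` is hypothetical, guard time is assumed to be counted in requests rather than seconds, a hit is assumed not to renew the guard time, and a new content is assumed to be left uncached when every resident content is still protected. The dynamic guard time adjustment algorithm is not modeled.

```python
from collections import OrderedDict


class LRUGuardTimeCache:
    """LRU cache in which newly cached contents are protected for a guard time.

    Assumptions (not taken from the paper): time is measured in requests,
    and a miss is not cached if all resident contents are still protected.
    """

    def __init__(self, capacity, guard_time):
        self.capacity = capacity
        self.guard_time = guard_time
        self.clock = 0
        # key -> time of insertion; iteration order is least to most recent.
        self.entries = OrderedDict()

    def request(self, key):
        """Process one request and return True on a cache hit."""
        self.clock += 1
        if key in self.entries:
            self.entries.move_to_end(key)  # refresh recency, keep insertion time
            return True
        if len(self.entries) < self.capacity:
            self.entries[key] = self.clock
            return False
        # Cache is full: pick the least recently used unprotected content.
        victim = None
        for candidate, inserted_at in self.entries.items():
            if self.clock - inserted_at >= self.guard_time:
                victim = candidate
                break
        if victim is not None:
            del self.entries[victim]
            self.entries[key] = self.clock
        return False


cache = LRUGuardTimeCache(capacity=2, guard_time=3)
hits = [cache.request(k) for k in ["a", "b", "c", "a", "d"]]
```

In this run, the request for `c` finds both `a` and `b` still inside their guard time, so `c` is not cached and the later request for `a` is a hit. Plain LRU would have evicted `a` to make room for `c`.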
Template tracking has been extensively studied in computer vision and has a wide range of applications. A general framework is to construct a parametric model to predict movement and to track the target. The difference in intensity between the pixels belonging to the current region and the pixels of the selected target allows a straightforward prediction of the region position in the current image. Traditional methods track the object based on the assumption that the relationship between the intensity difference and the region position is either linear or non-linear. These methods track poorly when only a single model is adopted. This paper proposes a method, called Mixture Hyperplanes Approximation, which is based on a finite mixture of generalized linear regression models to perform robust tracking. Moreover, a fast learning strategy is discussed, which improves the robustness against noise. Experiments demonstrate the performance and stability of Mixture Hyperplanes Approximation.
Many applications of wireless sensor networks (WSNs) require secure group communications. WSNs are normally operated in unattended, harsh, or hostile environments. Adversaries may easily compromise some sensor nodes and abuse their shared keys to inject false sensing reports or modify the reports sent by other nodes. Once a malicious node is detected, the group key should be renewed immediately to preserve network security. Some strategies have been proposed for group rekeying protocols, but most existing schemes are not suitable for sensor networks due to their high overhead and poor scalability. In this paper, we propose a new group rekeying protocol for hierarchical WSNs with renewable network devices. Compared with existing schemes, our rekeying method possesses the following features that are particularly beneficial to resource-constrained large-scale WSNs: (1) robustness to the node capture attack, (2) reactive rekeying capability in response to malicious nodes, and (3) low communication and storage overhead.
Shilei CHENG Mei XIE Zheng MA Siqi LI Song GU Feng YANG
Characterizing videos simultaneously from spatial and temporal cues has been shown to be crucial for video processing. Because its soft assignment lacks temporal information, the vector of locally aggregated descriptors (VLAD) should be considered a suboptimal framework for learning spatio-temporal video representations. Motivated by the development of attention mechanisms in natural language processing, we present a novel model in which VLAD follows spatio-temporal self-attention operations, named spatio-temporal self-attention weighted VLAD (ST-SAWVLAD). In particular, sequential convolutional feature maps extracted from two modalities, i.e., RGB and Flow, are respectively fed into the self-attention module to learn soft spatio-temporal assignment parameters, which enables the aggregation of not only detailed spatial information but also fine motion information from successive video frames. In experiments, we evaluate ST-SAWVLAD on two competitive action recognition datasets, UCF101 and HMDB51, and the results show outstanding performance. The source code is available at: https://github.com/badstones/st-sawvlad.
Feilong TANG Minyi GUO Song GUO
Multi-hop routing in homogeneous sensor networks with a single sink suffers performance degradation and severe security threats as the size of the network increases. Large-scale sensor networks need to be deployed with multiple powerful nodes as sinks, and these sinks should be scheduled to move to different places during the lifetime of the network. Existing routing mechanisms lack such support for large-scale sensor networks. In this paper, we propose a heterogeneous network model where multiple mesh nodes are deployed in a sensor network, and sensed data are collected through two tiers: first from a source sensor node to the closest mesh node in a multi-hop fashion (called sensor routing), and then from the mesh node to the base station through long-distance mesh routing (called mesh routing). Based on this network model, we propose an energy-efficient and secure protocol for sensor routing that works well in large-scale sensor networks and resists most attacks. Experiments demonstrate that our routing protocol significantly reduces the average number of hops for data transmission. Our lightweight security mechanism enables the routing protocol to defend against most attacks on sensor networks.
Rongchun LI Yong DOU Yuanwu LEI Shice NI Song GUO
This paper presents a parameterized multi-standard adaptive radix-4 Viterbi decoder with high throughput and low complexity. The proposed Viterbi decoder supports constraint lengths from 3 to 9, code rates from 1/2 to 1/3, and arbitrary truncation lengths. We present a novel fabric for the Add-Compare-Select Unit (ACSU) and methods of unsigned quantization and efficient normalization that shorten the critical path. The decoder achieves a low bit error ratio in multiple standards, such as GPRS, WiMax, LTE, CDMA, and 3G. The proposed decoder is implemented on a Xilinx XC5VLX330 device and achieves a frequency of 181.7 MHz. The throughput of the proposed decoder can reach 363 Mbps, which is superior to other current multi-standard Viterbi decoders and radix-4 Viterbi decoders on the FPGA platform.
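The reported clock frequency and throughput can be related with a short back-of-the-envelope check. The sketch below assumes that a radix-4 add-compare-select step covers two trellis stages, and therefore two decoded bits, per clock cycle, and that one such step completes every cycle. The abstract does not state this, and the function name `radix4_throughput_mbps` is hypothetical.

```python
def radix4_throughput_mbps(clock_mhz, bits_per_cycle=2):
    """Estimate decoded throughput in Mbps from the clock frequency in MHz.

    Assumption: a radix-4 decoder emits two decoded bits per clock cycle.
    """
    return clock_mhz * bits_per_cycle


estimate = radix4_throughput_mbps(181.7)
print(f"Estimated throughput: {estimate:.1f} Mbps")
```

Under these assumptions the estimate is about 363.4 Mbps, in line with the 363 Mbps quoted above.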
Lei WANG Shanmin YANG Jianwei ZHANG Song GU
Human action recognition (HAR) exhibits limited accuracy in video surveillance because monocular cameras capture only 2D information. To address this problem, a depth estimation-based human skeleton action recognition method (SARDE) is proposed in this study, with the aim of transforming 2D human action data into a 3D format to uncover hidden action clues in the 2D data. SARDE comprises two tasks, i.e., human skeleton action recognition and monocular depth estimation. The two tasks are integrated in a multi-task manner in end-to-end training; by sharing parameters, the model exploits the correlation between action recognition and depth estimation and learns depth features effectively for human action recognition. In this study, graph-structured networks with inception blocks and skip connections are investigated for depth estimation. The experimental results verify the effectiveness of the proposed method for skeleton action recognition: it reaches state-of-the-art performance on the evaluated datasets.
Haisong GU Yoshiaki SHIRAI Minoru ASADA
This paper presents a method for spatial and temporal segmentation of long image sequences that include multiple independently moving objects, based on the Minimum Description Length (MDL) principle. By obtaining an optimal motion description, we extract spatiotemporal (ST) segments in the image sequence, each of which consists of edge segments with similar motions. First, we construct a family of 2D motion models, each of which is completely determined by its specified set of equations. Then, based on these sets of equations, we formulate the motion description length in a long sequence. The motion state of one object at one moment is determined by finding the model with the shortest description length. Temporal segmentation is carried out when the motion state is found to have changed. At the same time, the spatial segmentation is globally optimized in such a way that the motion description of the entire scene reaches a minimum.
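The step of choosing the motion model with the shortest description length can be sketched in Python as follows. This is a generic two-part MDL score (parameter cost plus residual cost under an assumed Gaussian noise model), not the description length formulated in the paper. The two candidate models, the `noise_floor` constant, and all function names are hypothetical choices made for illustration.

```python
import math


def residual_sum_of_squares(displacements, predicted):
    """Squared error between observed 2D displacements and a model prediction."""
    px, py = predicted
    return sum((dx - px) ** 2 + (dy - py) ** 2 for dx, dy in displacements)


def description_length(rss, n, k, noise_floor=1e-6):
    """Two-part code length in bits: k model parameters plus residuals.

    Assumed form: (k / 2) * log2(n) + (n / 2) * log2(rss / n + noise_floor).
    """
    return 0.5 * k * math.log2(n) + 0.5 * n * math.log2(rss / n + noise_floor)


def select_motion_model(displacements):
    """Return the name of the model with the shortest description length.

    Expects a non-empty list of (dx, dy) displacements of edge points.
    """
    n = len(displacements)
    mean = (
        sum(dx for dx, _ in displacements) / n,
        sum(dy for _, dy in displacements) / n,
    )
    # Each candidate: (predicted displacement, number of free parameters).
    candidates = {
        "stationary": ((0.0, 0.0), 0),
        "translation": (mean, 2),
    }
    lengths = {
        name: description_length(
            residual_sum_of_squares(displacements, pred), n, k
        )
        for name, (pred, k) in candidates.items()
    }
    return min(lengths, key=lengths.get), lengths


moving = [(2.0, 0.1), (2.1, -0.1), (1.9, 0.0), (2.0, 0.0)]
jitter = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
moving_model, _ = select_motion_model(moving)
jitter_model, _ = select_motion_model(jitter)
```

For the consistently shifted points the translation model wins despite its extra parameters, while for pure jitter the parameter-free stationary model gives the shorter code. A change in the selected model over time is the kind of event that would trigger temporal segmentation.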
Shilei CHENG Song GU Maoquan YE Mei XIE
Human action recognition in videos attracts great research interest in computer vision. The Bag-of-Words (BoW) model is commonly used to obtain video-level representations. However, the BoW model roughly assigns each feature vector to its nearest visual word, and the collection of unordered words ignores the spatial information of the interest points, which inevitably causes nontrivial quantization errors and limits improvements in classification rates. To address these drawbacks, we propose an approach for action recognition that encodes spatio-temporal log-Euclidean covariance matrix (ST-LECM) features within a low-rank and sparse representation framework. Motivated by low-rank matrix recovery, we assume that local descriptors in a spatio-temporal neighborhood have similar representations and should be approximately low rank. The learned coefficients not only capture the global data structures but also preserve consistency. Experimental results show that the proposed approach yields excellent recognition performance on synthetic video datasets and is robust to action variability, view variations, and partial occlusion.