Yuan TAO Yangdong DENG Shuai MU Zhenzhong ZHANG Mingfa ZHU Limin XIAO Li RUAN
The sparse matrix operation y ← y + A^T A x, where A is a sparse matrix and x and y are dense vectors, is a widely used computing pattern in High Performance Computing (HPC) applications. The pattern poses a challenge to efficient solutions because both a matrix and its transpose are involved. An efficient sparse matrix format, Compressed Sparse Blocks (CSB), has been proposed to provide nearly the same performance for both Ax and A^T x. We develop a multithreaded implementation for the CSB format and apply it to compute y ← y + A^T A x. Experiments show that our technique outperforms the Compressed Sparse Row (CSR) based solution in POSKI by up to 2.5-fold on over 70% of the benchmark matrices.
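As a point of reference for the update y ← y + A^T(Ax), the following minimal sketch computes it over a plain CSR matrix without ever forming the transpose. It is not the CSB-based multithreaded method described in the abstract. The function name `csr_ata_update` and the argument layout (`indptr`, `indices`, `data`) are assumptions chosen for illustration. The sketch relies on the identity A^T A x = Σ_i a_i (a_i · x), where a_i is row i of A.

```python
def csr_ata_update(n_rows, indptr, indices, data, x, y):
    """Update y in place so that y <- y + A^T (A x), with A stored in CSR form.

    indptr[i]:indptr[i + 1] delimits the nonzeros of row i;
    indices holds their column numbers and data holds their values.
    (Hypothetical helper, serial and unoptimized.)
    """
    for i in range(n_rows):
        start, end = indptr[i], indptr[i + 1]
        # t = (A x)_i: dot product of row i with x.
        t = 0.0
        for k in range(start, end):
            t += data[k] * x[indices[k]]
        # Scatter t back along the same row: y += t * a_i.
        for k in range(start, end):
            y[indices[k]] += data[k] * t
    return y


# A = [[1, 0, 2],
#      [0, 3, 0]]
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [1.0, 2.0, 3.0]
result = csr_ata_update(2, indptr, indices, data, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
print(result)
```

Each row is read twice in a row-wise sweep, so a row-oriented format suffices for this serial version. The scatter into y is the step that makes a straightforward parallelization over rows unsafe without synchronization.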
Peng WANG Xiaofeng ZHONG Limin XIAO Shidong ZHOU Jing WANG Yong BAI
In this letter, the performance improvement achieved by deploying multiple antennas in cognitive radio systems is studied from a system-level view. Opportunistic spectrum efficiency (OSE) is defined as the performance metric to evaluate the spectrum opportunities that can actually be exploited by the secondary user (SU). By applying a simple energy combining detector, we show that deploying multiple antennas at the SU transceiver can significantly improve the maximum achievable OSE. Numerical results also reveal that the improvement comes from the reduction of both the detection overhead and the false alarm probability.
Zhisheng HUO Limin XIAO Zhenxue HE Xiaoling RONG Bing WEI
Previous work has studied throughput allocation for heterogeneous storage systems consisting of SSDs and HDDs in the dynamic setting, where users are not all present in the system simultaneously. However, those studies treat multiple servers as one large resource pool and cannot cope with the multi-server environment. We design a dynamic throughput allocation mechanism named DAM, which handles throughput allocation across multiple heterogeneous servers in the dynamic setting and provides a number of desirable properties. The experimental results show that DAM can perform dynamic throughput allocation across multiple servers while ensuring users' local allocations on each server, and can provide an efficient and fair throughput allocation for the whole system.
Min HUANG Limin XIAO Yunzhou LI Shidong ZHOU Jing WANG
In this letter, we investigate the application of Tomlinson-Harashima precoding (THP) in the downlink of multiuser multiple-input multiple-output (MIMO) systems, where multiple antennas are located at all the transceivers. Based on the criterion of maximum system sum-capacity, a per-layer optimization scheme is proposed, which determines the subchannel ordering and the transceiver filter design. In the proposed scheme, the successive nature of THP can be fully exploited, so that both the minimum cost of interference suppression and the maximum power and diversity gains can be achieved, and hence the system sum-capacity can be improved effectively.
Yao ZHENG Limin XIAO Wenqi TANG Lihong SHANG Guangchao YAO Li RUAN
The dynamic time warping (DTW) algorithm is widely used in time series similarity search. As DTW has quadratic time complexity, the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms. In this paper, we present a parallel approach for DTW on a heterogeneous platform with a graphics processing unit (GPU). In order to exploit fine-grained data-level parallelism, we propose a specific parallel decomposition of DTW. Furthermore, we introduce an optimization technique called diamond tiling to improve thread utilization. Results show that our approach substantially reduces computation time.
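For readers unfamiliar with where the quadratic cost comes from, the following is a minimal serial sketch of the textbook DTW recurrence. It is not the GPU decomposition or the diamond tiling described in the abstract. The function name `dtw_distance` and the choice of absolute difference as the local cost are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (hypothetical helper).

    cost[i][j] is the cheapest warping cost aligning a[:i] with b[:j].
    Filling the (n + 1) x (m + 1) table is what makes DTW quadratic.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            # Each cell depends on its upper, left, and upper-left neighbors.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
print(dtw_distance([0, 0], [1, 1]))
```

Because each cell depends only on its upper, left, and upper-left neighbors, all cells on the same anti-diagonal are mutually independent. That independence is one common source of fine-grained parallelism in DTW. Whether the authors' decomposition is organized this way is not stated in the abstract.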
Wei ZHANG Li RUAN Mingfa ZHU Limin XIAO Jiajun LIU Xiaolan TANG Yiduo MEI Ying SONG Yuzhong SUN
In order to reduce cost and improve efficiency, many data centers adopt virtualization solutions. Virtualization allows multiple virtual machines to be hosted on a single physical server. However, this poses new challenges for resource management. Web workloads, which are dominant in data centers, are known to vary dynamically with time. How to allocate resources to virtual machines so as to meet each application's service level agreement (SLA) has become an important challenge in virtualized server environments, especially when dealing with fluctuating workloads and complex server applications. User experience is an important manifestation of the SLA and is attracting increasing attention. In this paper, the SLA is defined by server-side response time. Traditional resource allocation based on resource utilization has some drawbacks. We argue that dynamic resource allocation based directly on real-time user experience is more reasonable and also has practical significance. To address the problem, we propose a system architecture that combines response time measurements and analysis of user experience for resource allocation. An optimization model is introduced to dynamically allocate resources among virtual machines. When resources are insufficient, we provide service differentiation and first guarantee the resource requirements of applications that have higher priorities. We evaluate our proposal using TPC-W and Webbench. The experimental results show that our system can judiciously allocate system resources, helps stabilize applications' user experience, and can reduce the mean deviation of user experience from desired targets.