Kazuei HIRONAKA Kensuke IIZUKA Miho YAMAKURA Akram BEN AHMED Hideharu AMANO
Multi-FPGA systems have been receiving a lot of attention as low-cost and energy-efficient systems for Multi-access Edge Computing (MEC). For this purpose, a bare-metal multi-FPGA system called FiC (Flow-in-Cloud) is under development. In this paper, we introduce the FiC multi-FPGA cluster, which applies a partial reconfiguration (PR) FPGA design flow to support online replacement of user-defined accelerators while the FPGA interconnection network keeps running, together with its low-level multi-FPGA management software, called the remote PR manager. With the remote PR manager, the user can define the FiC FPGA cluster setup in JSON and control the cluster from a user application through the cooperation of a simple cluster management tool and library called ficmgr on the client host and a REST API service provider called ficwww running on the Raspberry Pi 3 (RPi3) of each node. According to evaluation results on a prototype FiC FPGA cluster with 12 nodes, online application replacement by PR combined with on-the-fly FPGA bitstream compression reduced the FPGA bitstream distribution time to 1/17 and the total cluster setup time by 21∼57% compared to cluster setup with full-configuration FPGA bitstreams.
Ryuta KAWANO Ryota YASUDO Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO
Network throughput has become an important issue for big-data analysis on Warehouse-Scale Computing (WSC) systems. It has been reported that randomly connected inter-switch networks can increase network throughput. For irregular networks, a multi-path routing method called k-shortest path routing is conventionally used. However, it cannot efficiently exploit longer-than-shortest paths that could serve as detours around bottlenecks. In this work, a novel routing method called k-optimized path routing is proposed to achieve high throughput on irregular networks. We introduce a heuristic that selects detour paths able to avoid bottlenecks in the network, improving the average-case network throughput. Network simulation results show that the proposed k-optimized path routing can improve the saturation throughput by up to 18.2% compared to the conventional k-shortest path routing. Moreover, it can reduce the computation time required for optimization to 1/2760 at a minimum compared to our previously proposed method.
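As a point of reference for the conventional baseline named in this abstract, the following is a minimal sketch of k-shortest path selection on an unweighted switch graph. It is not the proposed k-optimized path routing, whose heuristic the abstract does not specify. The function name `k_shortest_paths` and the 5-switch topology are hypothetical, and the best-first enumeration used here is a simple stand-in that is only practical for small graphs.

```python
import heapq

def k_shortest_paths(adj, src, dst, k):
    """Return up to k loop-free paths from src to dst, fewest hops first.

    adj maps each switch to a list of its neighbours (hypothetical topology).
    Partial paths are expanded in order of hop count, so complete paths are
    popped from the heap in non-decreasing length.
    """
    heap = [(0, (src,))]  # entries are (hop_count, path_so_far)
    found = []
    while heap and len(found) < k:
        hops, path = heapq.heappop(heap)
        node = path[-1]
        if node == dst:
            found.append(list(path))
            continue
        for nxt in adj[node]:
            if nxt not in path:  # keep paths loop-free
                heapq.heappush(heap, (hops + 1, path + (nxt,)))
    return found

# Hypothetical irregular topology with 5 switches.
adj = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3, 4],
    3: [1, 2, 4],
    4: [2, 3],
}
paths = k_shortest_paths(adj, 0, 4, 3)
print(paths)
```

With k = 3 this selects the single 2-hop path and both 3-hop paths between switches 0 and 4. The remaining 4-hop detour is never selected, which is the kind of longer-than-shortest path the abstract says the conventional method fails to exploit.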
Lianpeng LI Jian DONG Decheng ZUO Yao ZHAO Tianyang LI
In cloud data centers, Virtual Machine (VM) consolidation is an effective way to save energy and improve efficiency. However, inappropriate consolidation of VMs, especially aggressive consolidation, can lead to performance problems and, more seriously, to Service Level Agreement (SLA) violations. It is therefore very important to balance the tradeoff between reducing energy use and reducing the level of SLA violations. In this paper, we propose two Host State Detection algorithms and an improved VM placement algorithm, based on our proposed Host State Binary Decision Tree Prediction model, for SLA-aware and energy-efficient consolidation of VMs in cloud data centers. We propose two condition formulas for host state estimation, and our model uses them to manually build a Binary Decision Tree for host state detection. We extend the CloudSim simulator to evaluate our algorithms using a PlanetLab workload and a random workload. The experimental results show that our proposed model can significantly reduce SLA violation rates while keeping energy cost low: for the real-world workload, it reduces the SLAV metric by up to 98.12% and the Energy metric by up to 33.96%.
This paper covers new architectures, technologies, and performance benchmarking, together with prospects for high-productivity and high-performance computing enabled by photonics. The exponential and sustained increases in computing and data center needs are driving the demand for future exascale computing. Power-efficient, parallel computing with a balanced system design is essential for reaching that goal, as such a system should support ∼billion total concurrencies and ∼billion core interconnections with ∼exabyte/second bisection bandwidth. Photonic interconnects offer a disruptive technology solution that fundamentally changes computing architecture design considerations. Optics provide ultra-high throughput, massive parallelism, minimal access latencies, and low power dissipation that remains independent of capacity and distance. In addition to improving energy efficiency and addressing many of the fundamental physical problems, optics will bring high-productivity computing in which programmers can ignore locality between billions of processors and the memory where data resides. Repeaterless interconnection links across the entire computing system and all-to-all massively parallel interconnection switches will significantly transform not only the hardware aspects of computing but also the way people program and harness computing capability, which in turn affects the programmability and productivity of computing. Benchmarking and optimization of the computing system configuration are very important. Practical and scalable deployment of photonic interconnected computing systems is likely to be aided by the emergence of athermal silicon photonics and hybrid integration technologies.
Guillermo IBÁÑEZ Iván MARSÁ-MAESTRE Miguel A. LOPEZ-CARMONA Ignacio PÉREZ-IBÁÑEZ Jun TANAKA Jon CROWCROFT
This paper describes Path-Moose, a scalable tree-based shortest path bridging protocol. Both the ARP-Path and Path-Moose protocols belong to a new category of bridges that we name All-path, because all paths of the network are explored simultaneously by a broadcast frame distributed over all network links to find a path or set up a multicast tree. Path-Moose employs the ARP-based low-latency routing mechanism of the ARP-Path protocol on a per-bridge basis instead of a per-host basis. This increases scalability by reducing forwarding table entries at core bridges by a factor of fifteen for big data center networks, and it achieves faster reconfiguration by a factor of approximately ten. Reconfiguration time is significantly shorter than with ARP-Path (zero in many cases) because hosts connected to the same edge bridge share network paths: by the time a host needs a path, it has already been recovered by another user of that path. Evaluation through simulations shows protocol correctness and confirms the theoretical evaluation results.
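The scalability argument can be illustrated with a small back-of-the-envelope sketch. It assumes a worst case in which a core bridge learns one forwarding entry per remote host under per-host learning, and one entry per edge bridge under per-bridge learning. The function name and the topology figures (100 edge bridges, 15 hosts each) are hypothetical and are not taken from the paper's evaluation.

```python
def core_table_entries(edge_bridges, hosts_per_edge_bridge):
    """Worst-case forwarding entries at a core bridge (illustrative model).

    per_host:   one entry per host reachable through the core bridge.
    per_bridge: one entry per edge bridge, shared by all hosts behind it.
    """
    per_host = edge_bridges * hosts_per_edge_bridge
    per_bridge = edge_bridges
    return per_host, per_bridge

# Hypothetical data center: 100 edge bridges with 15 hosts each.
per_host, per_bridge = core_table_entries(100, 15)
reduction = per_host / per_bridge
print(per_host, per_bridge, reduction)
```

Under this simplified model the reduction factor equals the number of hosts per edge bridge, so a fifteenfold reduction would correspond to roughly fifteen hosts behind each edge bridge.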