Tesshu HANAKA Nicolás HONORATO DROGUETT Kazuhiro KURITA Hirotaka ONO Yota OTACHI
In this paper, we study BALL COLLECTING WITH LIMITED ENERGY, which is a problem of scheduling robots with limited energy confined to a line to catch moving balls that eventually cross the line. For this problem, we show the NP-completeness of the general case and some algorithmic results for some cases with a small number of robots.
Hiroki NISHIKAWA Kana SHIMADA Ittetsu TANIGUCHI Hiroyuki TOMIYAMA
With the demand for energy-efficient and high- performance computing, multicore architecture has become more appealing than ever. Multicore task scheduling is one of domains in parallel computing which exploits the parallelism of multicore. Unlike traditional scheduling, multicore task scheduling has recently been studied on the assumption that tasks have inherent parallelism and can be split into multiple sub-tasks in data parallel fashion. However, it is still challenging to properly determine the degree of parallelism of tasks and mapping on multicores. Our proposed scheduling techniques determine the degree of parallelism of tasks, and sub-tasks are decided which type of cores to be assigned to heterogeneous multicores. In addition, two approaches to hardware/software codesign for heterogeneous multicore systems are proposed. The works optimize the types of cores organized in the architecture simultaneously with scheduling of the tasks such that the overall energy consumption is minimized under a deadline constraint, a warm start approach is also presented to effectively solve the problem. The experimental results show the simultaneous scheduling and core-type optimization technique remarkably reduces the energy consumption.
Takashi NAKADA Hiroyuki YANAGIHASHI Kunimaro IMAI Hiroshi UEKI Takashi TSUCHIYA Masanori HAYASHIKOSHI Hiroshi NAKAMURA
Near real-time periodic tasks, which are popular in multimedia streaming applications, have deadline periods that are longer than the input intervals thanks to buffering. For such applications, the conventional frame-based schedulings cannot realize the optimal scheduling due to their shortsighted deadline assumptions. To realize globally energy-efficient executions of these applications, we propose a novel task scheduling algorithm, which takes advantage of the long deadline period. We confirm our approach can take advantage of the longer deadline period and reduce the average power consumption by up to 18%.
Koichi MITSUNARI Jaehoon YU Takao ONOYE Masanori HASHIMOTO
Visual object detection on embedded systems involves a multi-objective optimization problem in the presence of trade-offs between power consumption, processing performance, and detection accuracy. For a new Pareto solution with high processing performance and low power consumption, this paper proposes a hardware architecture for decision tree ensemble using multiple channels of features. For efficient detection, the proposed architecture utilizes the dimensionality of feature channels in addition to parallelism in image space and adopts task scheduling to attain random memory access without conflict. Evaluation results show that an FPGA implementation of the proposed architecture with an aggregated channel features pedestrian detector can process 229 million samples per second at 100MHz operation frequency while it requires a relatively small amount of resources. Consequently, the proposed architecture achieves 350fps processing performance for 1080P Full HD images and outperforms conventional object detection hardware architectures developed for embedded systems.
Takashi NAKADA Tomoki HATANAKA Hiroshi UEKI Masanori HAYASHIKOSHI Toru SHIMIZU Hiroshi NAKAMURA
Improving energy efficiency is critical for embedded systems in our rapidly evolving information society. Near real-time data processing tasks, such as multimedia streaming applications, exhibit a common fact that their deadline periods are longer than their input intervals due to buffering. In general, executing tasks at lower performance is more energy efficient. On the other hand, higher performance is necessary for huge tasks to meet their deadlines. To minimize the energy consumption while meeting deadlines strictly, adaptive task scheduling including dynamic performance mode selection is very important. In this work, we propose an energy efficient slack-based task scheduling algorithm for such tasks by adapting to task size variations and applying DVFS with the help of statistical analysis. We confirmed that our proposal can further reduce the energy consumption when compared to oracle frame-based scheduling.
Hiroshi SAITO Masashi IMAI Tomohiro YONEDA
In this paper, we propose a redundant task allocation method for multi-core systems based on the Duplication with Temporary Triple-Modular Redundancy and Reconfiguration (DTTR) scheme. The proposed method determines task allocation of a given task graph to a given multi-core system model from task scheduling in given fault patterns. Fault patterns defined in this paper consist of a set of faulty cores and a set of surviving cores. To optimize the average failure rate of the system, task scheduling minimizes the execution time of the task graph preserving the property of the DTTR scheme. In addition, we propose a selection method of fault patterns to be scheduled to reduce the task allocation time. In the experiments, at first, we evaluate the proposed selection method of fault patterns in terms of the task allocation time. Then, we compare the average failure rate among the proposed method, a task allocation method which packs tasks into particular cores as much as possible, a task allocation method based on Simulated Annealing (SA), a task allocation method based on Integer Linear Programming (ILP), and a task allocation method based on task scheduling without considering the property of the DTTR scheme. The experimental results show that task allocation by the proposed method results in nearly the same average failure rate by the SA based method with shorter task allocation time.
Kana SHIMADA Shogo KITANO Ittetsu TANIGUCHI Hiroyuki TOMIYAMA
Task scheduling is one of the most important processes in the design of multicore computing systems. This paper presents a technique for scheduling of malleable tasks. Our scheduling technique decides not only the execution order of the tasks but also the number of cores assigned to the individual tasks, simultaneously. We formulate the scheduling problem as an integer linear programming (ILP) problem, and the optimal schedule can be obtained by solving the ILP problem. Experiments using a standard task-set suite clarify the strength of this work.
Byungnam LIM Yeeun SHIM Yon Dohn CHUNG
For an efficient processing of large data in a distributed system, Hadoop MapReduce performs task scheduling such that tasks are distributed with consideration of the data locality. The data locality, however, is limitedly exploited, since it is pursued one node at a time basis without considering the global optimality. In this paper, we propose a novel task scheduling algorithm that globally considers the data locality. Through experiments, we show our algorithm improves the performance of MapReduce in various situations.
Da-Ren CHEN Chiun-Chieh HSU Hon-Chan CHEN
Dynamic Voltage/Frequency Scaling (DVFS) allows designers to improve energy efficiency through adjusting supply voltage at runtime in order to meet the workload demand. Previous works solving real-time DVFS problems often refer to the canonical schedules with the exponential length. Other solutions for online scheduling depend on empirical or stochastic heuristics, which potentially result in frequent fluctuations of voltage/speed scaling. This paper aims at increasing the schedule predictability using period transformation in the pinwheel task model and improves the control on power-awareness by decreasing the speeds of as many tasks as possible to the same level. Experimental results show the maximum energy savings of 6% over the recent Dynamic Power Management (DPM) method and 12% over other slack reclamation algorithms.
Masatoshi KAWARASAKI Hyuma WATANABE
MapReduce and its open software implementation Hadoop are now widely deployed for big data analysis. As MapReduce runs over a cluster of massive machines, data transfer often becomes a bottleneck in job processing. In this paper, we explore the influence of data transfer to job processing performance and analyze the mechanism of job performance deterioration caused by data transfer oriented congestion at disk I/O and/or network I/O. Based on this analysis, we update Hadoop's Heartbeat messages to contain the real time system status for each machine, like disk I/O and link usage rate. This enhancement makes Hadoop's scheduler be aware of each machine's workload and make more accurate decision of scheduling. The experiment has been done to evaluate the effectiveness of enhanced scheduling methods and discussions are provided to compare the several proposed scheduling policies.
Energy-harvesting devices are materials that allow ambient energy sources to be converters into usable electrical power. While a battery powers the modern embedded systems, these energy-harvesting devices power the energy-harvesting embedded systems. This claims a new energy efficient management techniques for the energy-harvesting systems dislike the previous management techniques. The higher entire system efficiency in an energy-harvesting system can be obtained by a higher generating efficiency, a higher consuming efficiency, or a higher transferring efficiency. This paper presents a generalized technique for a dynamic reconfiguration and a task scheduling considering the power loss in DC-DC converters in the system. The proposed technique minimizes the power loss in the DC-DC converter and charger of the system. The proposed technique minimizes the power loss in the DC-DC converters and charger of the system. Experiments with actual application demonstrate that our approach reduces the total energy consumption by 22% in average over the conventional approach.
Fumihiko INO Shinta NAKAGAWA Kenichi HAGIHARA
This paper presents a stream programming framework, named GPU-chariot, for accelerating stream applications running on graphics processing units (GPUs). The main contribution of our framework is that it realizes efficient software pipelines on multi-GPU systems by enabling out-of-order execution of CPU functions, kernels, and data transfers. To achieve this out-of-order execution, we apply a runtime scheduler that not only maximizes the utilization of system resources but also encapsulates the number of GPUs available in the system. In addition, we implement a load-balancing capability to flow data efficiently through multiple GPUs. Furthermore, a callback interface enables overlapping execution of functions in third-party libraries. By using kernels with different performance bottlenecks, we show that our out-of-order execution is up to 20% faster than in-order execution. Finally, we conduct several case studies on a 4-GPU system and demonstrate the advantages of GPU-chariot over a manually pipelined code. We conclude that GPU-chariot can be useful when developing stream applications with software pipelines on multiple GPUs and CPUs.
Mahmoud MOMTAZPOUR Maziar GOUDARZI Esmaeil SANAEI
Parameter variations reveal themselves as different frequency and leakage powers per instances of the same MPSoC. By the increasing variation with technology scaling, worst-case-based scheduling algorithms result in either increasingly less optimal schedules or otherwise more lost yield. To address this problem, this paper introduces a variation-aware task and communication scheduling algorithm for multiprocessor system-on-chip (MPSoC). We consider both delay and leakage power variations during the process of finding the best schedule so that leakier processors are less utilized and can be more frequently put in sleep mode to reduce power. Our algorithm takes advantage of event tables to accelerate the statistical timing and power analysis. We use genetic algorithm to find the best schedule that maximizes power-yield under a performance-yield constraint. Experimental results on real world benchmarks show that our proposed algorithm achieves 16.6% power-yield improvement on average over deterministic worst-case-based scheduling.
Utilizing a heterogeneous multiprocessor system has become a popular design paradigm to build an embedded system at a cheap cost within short development time. A reliability issue for embedded systems, which is vulnerability to single event upsets (SEUs), has become a matter of concern as technology proceeds. This paper discusses reliability inherent in heterogeneous multiprocessors and proposes task scheduling for minimizing SEU vulnerability of them. This paper experimentally shows that increasing performance of a CPU core deteriorates its reliability. Based on the experimental observation, we propose task scheduling for reducing SEU vulnerability of a heterogeneous multiprocessor system. The experimental results demonstrate that our task scheduling technique can reduce much of SEU vulnerability under real-time constraints.
This paper investigates the problem of nonpreemptively scheduling independent parallel tasks in an environment with multiple machines, which is motivated from the recent studies in scheduling tasks in a multi-machine environment. In this scheduling environment, each machine contains a number of identical processors and each parallel task can simultaneously require a number of processors for its processing in any single machine. Whenever tasks are processed in parallel in a parallel machine, message communication among processors is often inevitable. The problem of finding a shortest schedule length on scheduling independent parallel tasks with the consideration of communication overhead in a multi-machine environment is NP-hard. The aim of this paper is to propose a heuristic algorithm for this kind of problem and to analyze the performance bound of this heuristic algorithm.
Makoto SUGIHARA Tohru ISHIHARA Kazuaki MURAKAMI
This paper proposes a task scheduling approach for reliable cache architectures (RCAs) of multiprocessor systems. The RCAs dynamically switch their operation modes for reducing the usage of vulnerable SRAMs under real-time constraints. A mixed integer programming model has been built for minimizing vulnerability under real-time constraints. Experimental results have shown that our task scheduling approach achieved 47.7-99.9% less vulnerability than a conventional one.
Wei SUN Chen YU Xavier DEFAGO Yasushi INOGUCHI
The scheduling of real-time tasks with fault-tolerant requirements has been an important problem in multiprocessor systems. The primary-backup (PB) approach is often used as a fault-tolerant technique to guarantee the deadlines of tasks despite the presence of faults. In this paper we propose a dynamic PB-based task scheduling approach, wherein an allocation parameter is used to search the available time slots for a newly arriving task, and the previously scheduled tasks can be re-scheduled when there is no available time slot for the newly arriving task. In order to improve the schedulability we also propose an overloading strategy for PB-overloading and Backup-backup (BB) overloading. Our proposed task scheduling algorithm is compared with some existing scheduling algorithms in the literature through simulation studies. The results have shown that the task rejection ratio of our real-time task scheduling algorithm is almost 50% lower than the compared algorithms.
Wei SUN Yuanyuan ZHANG Yasushi INOGUCHI
Heterogeneous distributed computing environments are well suited to meet the fast increasing computational demands. Task scheduling is very important for a heterogeneous distributed system to satisfy the large computational demands of applications. The performance of a scheduler in a heterogeneous distributed system normally has something to do with the dynamic task flow, that is, the scheduler always suffers from the heterogeneity of task sizes and the variety of task arrivals. From the long-term viewpoint it is necessary and possible to improve the performance of the scheduler serving the dynamic task flow. In this paper we propose a task scheduling method including a scheduling strategy which adapts to the dynamic task flow and a genetic algorithm which can achieve the short completion time of a batch of tasks. The strategy and the genetic algorithm work with each other to enhance the scheduler's efficiency and performance. We simulated a task flow with enough tasks, the scheduler with our strategy and algorithm, and the schedulers with other strategies and algorithms. We also simulated a complex scenario including the variant arrival rate of tasks and the heterogeneous computational nodes. The simulation results show that our scheduler achieves much better scheduling results than the others, in terms of the average waiting time, the average response time, and the finish time of all tasks.
Yuanyuan ZHANG Wei SUN Yasushi INOGUCHI
To make the best use of the resources in a shared grid environment, an application scheduler must make a prediction of available performance on each resource. In this paper, we examine the problem of predicting available CPU performance in time-shared grid system. We present and evaluate a new and innovative method to predict the one-step-ahead CPU load in a grid. Our prediction strategy forecasts the future CPU load based on the variety tendency in several past steps and in previous similar patterns, and uses a polynomial fitting method. Our experimental results on large load traces collected from four different kinds of machines demonstrate that this new prediction strategy achieves average prediction errors which are between 22% and 86% less than those incurred by four previous methods.
Yuanyuan ZHANG Yasushi INOGUCHI
Efficient task scheduling is critical for achieving high performance in grid computing systems. Existing task scheduling algorithms for grid environments usually assume that the performance prediction for both tasks and resources is perfectly accurate. In practice, however, it is very difficult to achieve such an accurate prediction in a heterogeneous and dynamic grid environment. Therefore, the performance of a task scheduling algorithm may be significantly influenced by prediction inaccuracy. In this paper, we study the influence of inaccurate predictions on task scheduling in the contexts of task selection and processor selection, which are two critical phases in task scheduling algorithms. We develop formulas for the misprediction degree, which is defined as the probability that the predicted values for the performances of tasks and processors reveal different orders from their real values. Based on these formulas, we also investigate the effect of several key parameters on the misprediction degree. Finally, we conduct extensive simulation for the sensitivities of some existing task scheduling algorithms to the prediction errors.