Biplab KUMER SARKER Anil KUMAR TRIPATHI Deo PRAKASH VIDYARTHI Kuniaki UEHARA
A Distributed Computing System (DCS) relies on properly partitioning tasks into modules and allocating those modules to various nodes so that individual processing nodes of the system can execute them in parallel. The scheduling of modules on particular processing nodes should be preceded by an appropriate allocation of the modules of the different tasks to the processing nodes; only then can the desired execution characteristics be obtained. A number of algorithms have been proposed for allocation of tasks in a DCS. Most of the proposed solutions make simplifying assumptions: first, they consider only a single task with its corresponding modules; second, they do not consider the status of the processing nodes in terms of the previously allocated modules of various tasks; and third, they do not consider the capacity and capability of the processing nodes. This work proposes algorithms for a realistic situation wherein multiple tasks with their modules compete dynamically for execution on a DCS, taking its architectural capability into account. We propose two algorithms for the task allocation models, based on the well-known A* algorithm and the genetic algorithm (GA). The paper explains the algorithms in detail with illustrated examples and presents a comparative performance study of our algorithms and the task allocation algorithms proposed in the literature. The results demonstrate that our GA-based task allocation algorithm achieves better performance than the other algorithms.
Most heuristics for the task scheduling problem employ a simple model of the target system, assuming fully connected processors, a dedicated communication subsystem and no contention for communication resources. A small number of algorithms are aware of contention, using an undirected graph model of the communication network. Although many scheduling algorithms have been compared in the literature, little is known about the accuracy and appropriateness of the employed models. This article evaluates the accuracy of task scheduling algorithms on generic parallel systems. The experiments show that the schedules produced are significantly inaccurate. In an extensive analysis, the reasons for these results are identified and the implications for the scheduling model are discussed.
In the last three decades, problems of scheduling tasks onto parallel processing systems have been extensively studied. Some of those problems take communication delays into account. In most previous work, the parallel processing system in the scheduling problem is restricted to be fully connected. However, realistic models of parallel processing systems, such as hypercubes, grids, tori, and so forth, are not fully connected, and the communication delay has a great effect on the completion time of tasks. In this paper, we show that the problem of scheduling tasks onto a hypercube/grid is NP-complete even if the task set forms an out- or in-tree and the execution of each task and each communication takes one unit of time. Moreover, we construct linear time algorithms for computing an optimal schedule of some classes of binary and ternary trees onto a hypercube if each communication takes one unit of time.
Yoshihiro MURATA Yasunori ISHIHARA Minoru ITO
The Task-Coalition Assignment Problem (TCAP) is a formalization of the distributed computation problem. In TCAP, a set of agents and a set of tasks are given. A subset of the agents processes a task to produce benefit. The goal of TCAP is to find the combination of the tasks and the subsets of the agents that maximizes the sum of the benefit. In this paper, we define 1-TCAP, which is a practical subclass of TCAP. In 1-TCAP, tasks and agents are characterized by scalar values. We propose a polynomial-time approximation algorithm for 1-TCAP, and show that this algorithm achieves an approximation ratio 9/4. Here, an algorithm achieves an approximation ratio α for a maximization problem if, for every instance, it produces a solution of value at least OPT/α, where OPT is the value of the optimal solution.
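The approximation-ratio definition above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function name `meets_ratio` and the benefit values are hypothetical and chosen for illustration, and it says nothing about how the 1-TCAP algorithm itself works.

```python
from fractions import Fraction


def meets_ratio(value, opt, alpha=Fraction(9, 4)):
    """Return True if a solution of benefit `value` is within ratio `alpha`
    of the optimum `opt` for a maximization problem, i.e. value >= opt / alpha.

    Fractions are used so the comparison is exact (no floating-point rounding).
    """
    return value >= Fraction(opt) / alpha


# Hypothetical numbers: with OPT = 90 and alpha = 9/4, the threshold is 90 / (9/4) = 40.
print(meets_ratio(40, 90))  # True: 40 >= 40
print(meets_ratio(39, 90))  # False: 39 < 40
```

With ratio 9/4, a solution is therefore guaranteed to retain at least 4/9 of the optimal total benefit.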
Koji HASHIMOTO Tatsuhiro TSUCHIYA Tohru KIKUNO
In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is O(|V|^4), where V is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of a small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.
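The height-based partitioning step can be illustrated with a small sketch. The abstract does not define height, so this assumes the common convention that an entry task has height 0 and any other task has height one more than the maximum height of its predecessors. The function name `partition_by_height` and the example graph are hypothetical, and the duplication and scheduling steps of the proposed algorithm are not shown.

```python
from collections import defaultdict


def partition_by_height(nodes, edges):
    """Group the nodes of a task graph (a DAG) into subsets of equal height.

    Assumed definition: height is 0 for tasks without predecessors,
    otherwise 1 + the maximum height among the predecessors.
    Returns a list of subsets ordered by increasing height.
    """
    parents = defaultdict(list)
    for u, v in edges:  # edge (u, v) means u must finish before v starts
        parents[v].append(u)

    height = {}

    def h(n):
        # Memoized recursion over predecessors (fine for small graphs).
        if n not in height:
            height[n] = 0 if not parents[n] else 1 + max(h(p) for p in parents[n])
        return height[n]

    levels = defaultdict(list)
    for n in nodes:
        levels[h(n)].append(n)
    return [levels[k] for k in sorted(levels)]


# Diamond-shaped example graph: a -> b, a -> c, b -> d, c -> d
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(partition_by_height(nodes, edges))  # [['a'], ['b', 'c'], ['d']]
```

Under this definition, two tasks of the same height are never connected by an edge, so each subset contains only mutually independent tasks.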
Tree task structures occur frequently in many applications where parallelization may be desirable. We present a formal treatment of non-preemptively scheduling task trees on distributed memory multiprocessors and show that the fundamental problems of scheduling (i) a task tree in the absence of any inter-task communication on a fixed number of processors and (ii) a task tree with inter-task communication on an unbounded number of processors are NP-complete. For task trees that satisfy certain constraints, we present an optimal scheduling algorithm. The algorithm is shown to be optimal over a wider set of task trees than in previous work.
Koji HASHIMOTO Tatsuhiro TSUCHIYA Tohru KIKUNO
A schedule for a parallel program is said to be 1-fault-secure if a system that uses the schedule can either produce correct output for the program or detect the presence of any faults in a single processor. Although several fault-secure scheduling algorithms have been proposed, they can all only be applied to a class of tree-structured task graphs with a uniform computation cost. Besides, they assume a stringent error model, called the redeemable error model, that considers extremely unlikely cases. In this paper, we first propose two new plausible error models which restrict the manner of error propagation. Then we present three fault-secure scheduling algorithms, one for each of the three models. Unlike previous algorithms, the proposed algorithms can deal with any task graphs with arbitrary computation and communication costs. Through experiments, we evaluate these algorithms and study the impact of the error models on the lengths of fault-secure schedules.
Atsushi NAKAMURA Masaki NAITO Hajime TSUKADA Rainer GRUHN Eiichiro SUMITA Hideki KASHIOKA Hideharu NAKAJIMA Tohru SHIMIZU Yoshinori SAGISAKA
This paper describes an application of a speech translation system to another task/domain in the real world by using developmental data collected from real-world interactions. The total cost for this task alteration was calculated to be 9 person-months. The newly applied system was also evaluated by using speech data collected from real-world interactions. For real-world speech having a machine-friendly speaking style, the newly applied system could recognize typical sentences with a word accuracy of 90% or better. We also found that, concerning the overall speech translation performance, the system could translate about 80% of the input Japanese speech into acceptable English sentences.
Sufang CHEN Xiangshi REN HunSoo KIM Yoshio MACHI
An experiment was conducted to measure and compare the physiological effects of three types of CRT on users. We proposed a new strategy for measuring the user's level of relaxation. In this strategy, called "Task Break Monitoring (TBM)," the subjects took a break with eyes closed after each interaction with the computer. During each break, electroencephalogram (EEG), especially alpha 1 waves, electrocardiogram (ECG) and galvanic skin resistance (GSR) were monitored and recorded. The results show that the type of CRT display which emits far-infrared rays modulated by a FIR-fan induces less fatigue in users while they are working and reduces the recovery time after the task is completed. We believe "TBM" to be an important innovation in human-computer research and development because the aftereffects of computer use have an obvious bearing on recovery time, user endurance and psychological attitude to the technology in general.
Bong-Joon JUNG Kwang-Il PARK Kyu Ho PARK
In static multiprocessor scheduling, heuristic algorithms have been widely used. In exchange for execution speed, most of them yield unpromising solutions because they search only a part of the solution space. In this paper, we propose a scheduling algorithm using the genetic algorithm (GA), which is a well-known stochastic search algorithm. The proposed algorithm, named ordered-deme GA (OGA), is based on the multiple subpopulation GA, where a global population is divided into several subpopulations (demes) and each deme evolves independently. To find better schedules, the OGA orders demes from the highest to the lowest deme and migrates both the best and the worst individuals at the same time. In addition, the OGA adaptively assigns different mutation probabilities to each deme to improve search capability. We compare the OGA with well-known heuristic algorithms and other GAs for random task graphs and the task graphs from real numerical problems. The results indicate that the OGA mostly finds better schedules than the others, although it is slower in terms of execution time.
This paper presents a novel technique for analyzing and designing local communication systems for distributed mobile robotic systems (DMRS). Our goal is to provide an analysis-based guideline for designing local communication systems to efficiently transmit task information to the appropriate robots. In this paper, we propose a layered methodology, i.e., design from spatial and temporal aspects based on analysis of information diffusion by local communication between robots. The task environment is classified so that each analysis and design is applied in a systematic way. The spatial design gives the optimal communication area for minimizing transmission time for various cooperative tasks. In the temporal design, we derive the information announcing time to avoid excessive information diffusion. The designed local communication is evaluated in comparison with global communication. Finally, we performed simulations and experiments to demonstrate that the analysis and design technique is effective for constructing an efficient local communication system.
In this paper, we propose an efficient task scheduling scheme, called CTS (Class-based Task Scheduling), to obtain high performance in terms of high system utilization and low waiting times for tasks. While a better submesh allocation scheme can improve system performance, an allocation policy alone cannot improve performance significantly. This is due to the fact that the FCFS task scheduling policy leads to large external fragmentation. The CTS strategy maintains four separate queues, one for each incoming task class. This avoids the blocking property incurred in FCFS scheduling. To reduce external fragmentation, a job in the CTS strategy tends to wait for an occupied submesh of the same size instead of using a new submesh. Simulation results indicate that the proposed scheduling strategy improves performance compared to the FCFS scheduling policy by reducing the average waiting delay significantly.
Ara KHIL Seungryoul MAENG Jung Wan CHO
The problem of non-preemptive scheduling of real-time periodic tasks with specified release times on a uniprocessor system is known to be NP-hard. In this paper we propose a new non-preemptive scheduling algorithm and a new static scheduling strategy which use the repetitiveness and the predictability of periodic tasks in order to improve the schedulability of real-time periodic tasks with specified release times. The proposed scheduling algorithm uses a heuristic that precalculates whether scheduling the selected task would cause some task to miss a deadline when tasks are scheduled by the non-preemptive EDF algorithm. If so, it defers the scheduling of the selected task to avoid the precalculated deadline miss. Otherwise, it schedules the selected task in the same way as the non-preemptive EDF algorithm. Our scheduling algorithm can always find a feasible schedule for any set of periodic tasks with specified release times that is schedulable by the non-preemptive EDF algorithm. Our static scheduling strategy transforms the problem of non-preemptive scheduling for periodic tasks with specified release times into one with the same release time for all tasks. It suggests dividing the given problem into two subproblems, using a non-preemptive scheduling algorithm to find two feasible subschedules for the two subproblems by forward or backward scheduling within specific time intervals, and then combining the two feasible subschedules into a complete feasible schedule for the given problem. We present the release times as a function of periods for efficient problem division. Finally, we show by simulation results that our scheduling algorithm and scheduling strategy improve schedulability.
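As background for the deferral heuristic, the following is a minimal sketch of the baseline non-preemptive EDF policy only, not of the proposed algorithm. The job tuple layout, the function name `np_edf`, and the two example jobs are assumptions made for illustration.

```python
def np_edf(jobs):
    """Baseline non-preemptive EDF on one processor.

    jobs: list of (name, release, exec_time, deadline) tuples.
    Whenever the processor is free, the released job with the earliest
    deadline is started and runs to completion without preemption.
    Returns (schedule, missed): schedule holds (name, start, finish)
    tuples, missed holds the names of jobs finishing after their deadline.
    """
    pending = list(jobs)
    t = 0
    schedule, missed = [], []
    while pending:
        ready = [j for j in pending if j[1] <= t]
        if not ready:
            # Processor idles until the next release.
            t = min(j[1] for j in pending)
            continue
        job = min(ready, key=lambda j: j[3])  # earliest deadline first
        pending.remove(job)
        start = t
        t += job[2]
        schedule.append((job[0], start, t))
        if t > job[3]:
            missed.append(job[0])
    return schedule, missed


# Hypothetical jobs: A is released first, but B has a much tighter deadline.
jobs = [("A", 0, 3, 10), ("B", 1, 1, 3)]
print(np_edf(jobs))  # ([('A', 0, 3), ('B', 3, 4)], ['B'])
```

In this example the baseline starts A at time 0 and B then misses its deadline, whereas deferring A until B has run (B in [1, 2], A in [2, 5]) meets both deadlines. This is the kind of situation in which deferring the selected task can help.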
Akimasa YOSHIDA Ken'ichi KOSHIZUKA Wataru OGATA Hironori KASAHARA
This paper proposes a data-localization scheduling scheme inside a processor-cluster for multigrain parallel processing, which hierarchically exploits parallelism among coarse-grain tasks like loops, medium-grain tasks like loop iterations and near-fine-grain tasks like statements. The proposed scheme assigns near-fine-grain or medium-grain tasks inside coarse-grain tasks onto processors inside a processor-cluster so that maximum parallelism can be exploited and inter-processor data transfer is minimized after data-localization for coarse-grain tasks across processor-clusters. Performance evaluation on a multiprocessor system OSCAR shows that multigrain parallel processing with the proposed data-localization scheduling can reduce execution time for application programs by 10% compared with multigrain parallel processing without data-localization.
Dingchao LI Akira MIZUNO Yuji IWAHORI Naohiro ISHII
This paper describes a new approach to the scheduling problem that assigns tasks of a parallel program described as a task graph onto parallel machines. The approach handles interprocessor communication and heterogeneity, based on using both the theoretical results developed so far and a lookahead scheduling strategy. The experimental results on randomly generated task graphs demonstrate the effectiveness of this scheduling heuristic.
Recently, various systems based on agent model architecture have been developed. In these systems, 'agents' with their own goals and functions are embedded, and perform their own tasks through collaboration among them by communication to achieve a goal as the system requires. Using this agent model for the construction of educational systems, adaptive configuration of the system is achieved. The purpose of this study is to propose a methodology for the design of an educational system based on agent model architecture. This paper describes the configuration of the agent model and the communication language and protocol used to represent collaboration among the agents necessary for performing a cooperative task. Moreover, we explain how to organize these agents as an educational system. As a case to show the organization of agents, we discuss the configuration of an intelligent learning environment to support C shell programming in UNIX and explain the collaborative behavior of embedded agents.
Naoshi UCHIHIRA Shinichi HONIDEN
This paper concerns a Petri-net-based model for describing reactive and concurrent systems. Although many high-level Petri nets have been proposed, they are insufficiently practical to describe reactive and concurrent systems in the detailed modeling, design and implementation phases. They are mainly intended to describe concurrent systems in the rough modeling phase and lack several important features (e.g., concurrent tasks, task communication/synchronization, I/O interface, task scheduling) that most actual implementations of reactive and concurrent systems have. Therefore it is impossible to simulate and analyze the systems accurately without explicitly modeling these features. On the other hand, programming languages based on Petri nets are deeply dependent on their execution environments and are not sophisticated enough as modeling and specification languages. This paper proposes the MENDEL net, which is a high-level Petri net extended by incorporating concurrent tasks, task communication/synchronization, I/O interface, and task scheduling in a sophisticated manner. MENDEL nets are a wide-spectrum modeling language, that is, they are suitable not only for modeling but also for designing and implementing reactive and concurrent systems.
Paulo LORENZO Munehiro GOTO Arthur J. CATTO
The Manchester Dataflow Machine (MDFM) works with tasks whose size is a single instruction. This fine granularity aims at exploiting all parallelism at the instruction level. However, this design decision increases the instruction communication cost, which ends up jamming the interconnection network and reducing system performance. One way to avoid this problem is to adopt variable-size tasks instead of working with such a small task size. In this paper, in order to study whether the use of such variable-size tasks in the MDFM architecture improves performance, we run simulations with toy programs. In the simulations, variable-size tasks are realized by packing sequential instruction stretches into one task. To manage this packing, the Sequential Block (SB) technique is developed. The simulation of the packed and unpacked programs gives an outline of the advantages and disadvantages of working with variable-size tasks, and of how the SB technique should be implemented in the system.
Jiann-Fu LIN Win-Bin SEE Sao-Jie CHEN
This paper investigates the problem of scheduling "parallel tasks" with consideration of communication cost on an m-processor system, where processors are assumed to be identical and the tasks being scheduled are independent and can run on more than one processor simultaneously. Once a task is processed in parallel, its execution is sped up, but communication cost is also incurred and should be taken into account. Finding a schedule with minimum finish time for the parallel task scheduling problem is NP-hard. Therefore, in this paper, we propose a heuristic algorithm for this kind of problem and derive its performance bounds for two different cases of applications.
Tsuyoshi KAWAGUCHI Yoshinori TAMURA Kouichi UTSUMIYA
The linear array processor architecture is an important class of interconnection structures that are suitable for VLSI. In this paper we study the problem of mapping a task tree onto a linear array to minimize the total execution time. First, an optimization algorithm is presented for a message scheduling problem which occurs in the task tree mapping problem. Next, we give a heuristic algorithm for the task tree mapping problem. The algorithm partitions the node set of a task tree into clusters and maps these clusters onto processors. Simulation experiments showed that the proposed algorithm is much more efficient than a conventional algorithm.