Hai JIN, Xuanhua SHI, Weizhong QIANG, Deqing ZOU
Grid computing is an emerging trend in distributed and Internet computing that coordinates large-scale resource sharing and problem solving in dynamic, multi-institutional virtual organizations. Because grid environments exhibit diverse failures and error conditions, developing, deploying, and executing applications on the grid is challenging, which makes dependability a key requirement for grid computing. This paper presents a dependable grid computing framework, called DRIC, that provides an adaptive failure detection service and a policy-based failure handling mechanism. The failure detection service in DRIC adapts to users' QoS requirements and system conditions, and a policy engine optimizes the failure handling mechanism using a decision-making method. The performance evaluation results show that the framework is scalable and efficient and incurs low overhead.
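To make the idea of adaptive failure detection concrete, the following is a minimal sketch of a heartbeat-based detector whose timeout follows the observed arrival pattern. It is an illustration only and is not taken from DRIC: the class name `AdaptiveHeartbeatDetector`, the sliding `window`, and the `safety_margin` parameter (standing in for a user QoS setting) are all assumptions made for this example.

```python
from collections import deque
from statistics import mean


class AdaptiveHeartbeatDetector:
    """Hypothetical detector: suspects a node once its next heartbeat
    is later than a deadline learned from recent arrivals."""

    def __init__(self, window=5, safety_margin=0.5):
        # Keep only the most recent arrival times (system conditions).
        self.arrivals = deque(maxlen=window)
        # Assumed QoS knob in seconds: larger means fewer false suspicions
        # but slower detection.
        self.safety_margin = safety_margin

    def heartbeat(self, arrival_time):
        """Record the arrival time of a heartbeat message."""
        self.arrivals.append(arrival_time)

    def deadline(self):
        """Latest time the next heartbeat is expected, or None if unknown."""
        if len(self.arrivals) < 2:
            return None
        times = list(self.arrivals)
        intervals = [later - earlier for earlier, later in zip(times, times[1:])]
        return times[-1] + mean(intervals) + self.safety_margin

    def suspects(self, now):
        """True if the monitored node should be suspected at time `now`."""
        limit = self.deadline()
        return limit is not None and now > limit


detector = AdaptiveHeartbeatDetector(window=5, safety_margin=0.5)
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)
first_deadline = detector.deadline()

# Heartbeats slow down; the deadline stretches instead of raising an alarm.
for t in (5.0, 7.0):
    detector.heartbeat(t)
second_deadline = detector.deadline()
```

In this sketch the timeout grows when heartbeats arrive more slowly, so a congested but live node is not immediately suspected, while a tighter `safety_margin` trades accuracy for faster detection.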