Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
424753 | Future Generation Computer Systems | 2010 | 8 Pages |
A major performance issue in large-scale decentralized distributed systems, such as grids, is how to ensure that jobs finish their execution within the estimated completion times in the presence of resource performance fluctuations. Previously, several techniques including advance reservation, rescheduling and migration have been adopted to resolve/relieve this issue; however, they have some non-negligent practicality hurdles. The use of clouds may be an attractive alternative, since resources in clouds are much more reliable than those in grids. This paper investigates the effectiveness of rescheduling using cloud resources to increase the reliability of job completion. Specifically, schedules are initially generated using grid resources, and cloud resources (relatively costlier) are used only for rescheduling to cope with a delay in job completion. A job in our study refers to a bag-of-tasks (BoT) application that consists of a large number of independent tasks; this job model is common in many science and engineering applications. We have devised a novel rescheduling technique, called rescheduling using clouds for reliable completion (RC2RC2) and applied it to three well-known existing heuristics. Our experimental results reveal that RC2RC2 significantly reduces delay in job completion.