Article code | Journal code | Publication year | English article | Full-text version
425514 | 685761 | 2008 | 11 pages, PDF | Free download
English title of the ISI article
Migol: A fault-tolerant service framework for MPI applications in the grid
Related topics
Engineering and Basic Sciences; Computer Engineering; Computational Theory and Mathematics
English abstract

Especially for sciences the provision of massive parallel CPU capacity is one of the most attractive features of a grid. A major challenge in a distributed, inherently dynamic grid is fault tolerance. The more resources and components involved, the more complicated and error-prone becomes the system. In a grid with potentially thousands of machines connected to each other the reliability of individual resources cannot be guaranteed. The benefit of the grid is that in case of a failure an application may be migrated and restarted from a checkpoint file on another site. This approach requires a service infrastructure which handles the necessary activities transparently. In this article, we present Migol, a fault-tolerant and self-healing grid middleware for MPI applications. Migol is based on open standards and extends the services of the Globus toolkit to support the fault tolerance of grid applications. Further, the Migol framework itself is designed with special focus on fault tolerance. For example, Migol replicates critical services and uses a ring-based replication protocol to achieve data consistency.
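
The abstract does not describe how the ring-based replication protocol works, so the following is only a minimal sketch of the general idea, not Migol's actual protocol: replicas are arranged in a logical ring, an update is passed from one replica to its successor until it has gone once around, and failed replicas are skipped. The names `Replica`, `Ring`, and `update`, and the example key and value, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of a replicated service's data (hypothetical)."""
    name: str
    store: dict = field(default_factory=dict)
    alive: bool = True


class Ring:
    """Replicas in a logical ring; an update travels from successor to successor."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def update(self, origin_index, key, value):
        """Pass one update once around the ring, starting at the origin.

        Failed replicas are skipped. Returns the names of the replicas
        that applied the update, in the order they were visited.
        """
        applied = []
        n = len(self.replicas)
        for step in range(n):
            replica = self.replicas[(origin_index + step) % n]
            if not replica.alive:
                continue  # skip a failed node, forward to the next successor
            replica.store[key] = value
            applied.append(replica.name)
        return applied


# Three replicas, one of which has failed before the update arrives.
ring = Ring([Replica("a"), Replica("b"), Replica("c")])
ring.replicas[1].alive = False
applied = ring.update(0, "job-42", "checkpoint-7")
```

In this sketch every surviving replica ends up with the same entry, which is the consistency goal the abstract mentions. A real protocol would also need to handle recovery of the skipped replica and concurrent updates, which are left out here.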

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 24, Issue 2, February 2008, Pages 142–152