Article code: 865981
Journal code: 909690
Publication year: 2007
English article: 6-page PDF
English title of the ISI article
Fault-Tolerant Mechanism of the Distributed Cluster Computers
Related topics
Engineering and Basic Sciences, Other Engineering Fields, Engineering (General)
Abstract
Distributed systems that offer high performance and stability are widely used in large-scale scientific and engineering computing. This paper discusses a fault-tolerant mechanism for the Linux environment, namely a scheme and framework for building a stable computing platform, that improves the system's fault tolerance. Based on the structure and function of the distributed system, active-list and file-invocation strategies are employed in task management. Multilevel fault tolerance is achieved by repeated processes on a single node and by task migration across multiple nodes. The manager node agent introduced in this paper administers the nodes through the list and assigns tasks according to each node's performance, thereby making full use of the cluster's resources. An evaluation method is proposed to appraise performance. The analysis shows that the proposed scheme is useful, at the cost of some additional memory overhead.
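The two fault-tolerance levels described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Node`, `ManagerAgent`, `max_retries`, and the numeric `performance` score are all hypothetical, and "repeated processes" is interpreted here, as an assumption, as re-running a failed task on the same node before migrating it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    performance: float   # hypothetical score; higher means a faster node
    alive: bool = True


class ManagerAgent:
    """Illustrative manager-node agent that keeps an active list of nodes."""

    def __init__(self, nodes: List[Node], max_retries: int = 2):
        self.active = list(nodes)        # the "active list" (assumed structure)
        self.max_retries = max_retries   # attempts per node before migrating

    def ranked_nodes(self) -> List[Node]:
        # Live nodes, best performance first.
        live = [n for n in self.active if n.alive]
        return sorted(live, key=lambda n: n.performance, reverse=True)

    def run(self, task: Callable[[Node], str]) -> Optional[str]:
        for node in self.ranked_nodes():
            # Level 1: repeat the task on the same node.
            for _ in range(self.max_retries):
                try:
                    return task(node)
                except RuntimeError:
                    continue
            # Level 2: mark the node as failed and migrate to the next one.
            node.alive = False
        return None   # every node in the active list failed


def flaky_task(node: Node) -> str:
    # Hypothetical task that always fails on node "n1".
    if node.name == "n1":
        raise RuntimeError("node failure")
    return f"done on {node.name}"


agent = ManagerAgent([Node("n1", 3.0), Node("n2", 2.0)])
result = agent.run(flaky_task)
```

In this sketch the task is first tried twice on the fastest node, `n1`, then migrated to `n2`, and `n1` is excluded from later rankings. A real cluster would detect failures through process exit status or heartbeats rather than Python exceptions.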
Publisher
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Tsinghua Science & Technology - Volume 12, Supplement 1, July 2007, Pages 186-191