Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
425305 | Future Generation Computer Systems | 2008 | 7 Pages |
Abstract
As distributed storage systems grow, the time that elapses between the occurrence of a fault, its detection, and its repair becomes significant. Systems built on shared servers face additional complexity because of their high rate of service outages and revocation. Maintaining high replica counts in this environment is costly in both the storage required and the bandwidth consumed by file copies. The storage challenge can thus be phrased as an attempt to operate inexpensively with respect to cost constraints such as disk utilization, network bandwidth consumption, and server CPU time. The GEMS (Grid Enabled Molecular Simulation) storage system provides a replicated and shared workspace for large-scale molecular dynamics simulations and exemplifies these issues. The framework addresses the problem by prioritizing observed faults and repairing them in an intelligent manner. In this paper, we provide observations from the operation of GEMS and compare its error handling to that of similar storage systems.
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics
Authors
J.M. Wozniak, P. Brenner, D. Thain, A. Striegel, J.A. Izaguirre