Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
425162 | 685694 | 2016 | 11-page PDF | Free download |

• We combine production experiments, modeling and simulation of applications deployed on distributed resources.
• We exhaustively evaluate an analytical model of the application makespan.
• We study the influence of the application workflow parameters and calibrate them.
• Our simulator is thoroughly validated and uses real production traces as inputs.
Evaluating the performance of distributed systems through real experimentation is resource-intensive and inherently difficult to reproduce. Conversely, analytical modeling and simulation facilitate investigation, but their level of realism needs to be evaluated to avoid misinterpretation. In this paper, we combine production experiments and realistic simulation for performance modeling and optimization of application workflows deployed on the European Grid Infrastructure (EGI), one of the largest distributed systems in the world. We use a validated simulator to (i) exhaustively evaluate an analytical model of the application makespan and (ii) study the influence of the application workflow parameters, in particular the checkpointing period, and calibrate them. Experimental results show that the model fits the simulated makespan with a relative error of at most 15%, and that simulation allows us to validate analytical models more exhaustively than is possible with production experiments. Results also show that, provided that the simulator is correctly validated and instantiated, simulation can be safely used for exhaustive parameter studies, allowing quick, fine-grained tuning of sensitive application parameters.
Journal: Future Generation Computer Systems - Volume 57, April 2016, Pages 13–23