Article code: 493905
Journal code: 723152
Publication year: 2015
English article: 10 pages, PDF
Full-text version: free download
English title of the ISI article
Simulation and optimization of HPC job allocation for jointly reducing communication and cooling costs
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science (General)
English abstract


• Design a job allocation policy to jointly optimize performance and cooling energy of data centers.
• Implement our evaluation models and allocation policies in the Structural Simulation Toolkit (SST).
• Propose an approach for modeling the impact of communication cost on system performance.
• Evaluate our policy using traces extracted from real-world parallel workloads.

Performance and energy are critical concerns in high performance computing (HPC) data centers. Highly parallel HPC applications that require multiple nodes usually run for long durations, ranging from minutes to hours or days. Because the threads of parallel applications communicate with each other intensively, the communication cost of these applications has a significant impact on data center performance. Energy consumption has also become a first-order constraint of HPC data centers: nearly half of the energy in today's computing clusters is consumed by the cooling infrastructure. Existing job allocation policies target either improving system performance or reducing the cooling energy cost of the server nodes. How to optimize system performance while minimizing cooling energy consumption is still an open question. This paper proposes a job allocation methodology aimed at jointly reducing the communication cost and the cooling energy of HPC data centers. To evaluate and validate our optimization algorithm, we implement our joint job allocation methodology in the Structural Simulation Toolkit (SST), a simulation framework for large-scale data centers. We evaluate our joint optimization algorithm using traces extracted from real-world workloads. Experimental results show that, compared to performance-aware job allocation algorithms, our algorithm achieves comparable running times and reduces the cooling power by up to 42.21% across all the jobs.

Publisher
Database: Elsevier - ScienceDirect
Journal: Sustainable Computing: Informatics and Systems - Volume 6, June 2015, Pages 48–57