Article code: 6935179
Journal code: 868488
Publication year: 2016
English article: 17-page PDF
Full-text version: free download
English title of the ISI article
TracSim: Simulating and scheduling trapped power capacity to maximize machine room throughput
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science Software
English abstract
The power supplied to machine rooms tends to be over-provisioned because it is specified in practice not by workload demands but rather by high-energy LINPACK runs or nameplate power estimates. This results in a considerable amount of trapped power capacity, that is, excess power infrastructure. Instead of being wasted, this trapped power capacity should be reclaimed to accommodate more compute nodes in the machine room and thereby increase system throughput. But to do this we need the ability to enforce a system-wide power cap. In this paper, we present TracSim, a full-system simulator that enables users to measure trapped power capacity and evaluate the performance of different policies for scheduling parallel tasks under a power cap. TracSim simulates the execution environment of a production HPC cluster at Los Alamos National Laboratory (LANL). TracSim enables users to specify the system topology, hardware configuration, power cap, and task workload, and to develop resource configuration and task scheduling policies aimed at maximizing machine-room throughput while keeping power consumption under a power cap by exploiting CPU throttling techniques. We use real measurements from the LANL cluster to set TracSim's configuration parameters. We leverage TracSim to implement and evaluate four resource scheduling policies. Simulation results indicate the performance of those policies and quantify the amount of trapped capacity that can effectively be reclaimed.
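To make the idea of scheduling under a power cap concrete, the following is a minimal illustrative sketch and not TracSim's actual interface or any of the four policies evaluated in the paper. It assumes a hypothetical `Task` record, hypothetical throttle levels expressed as fractions of full-speed node power, and a simple greedy first-fit rule: each task is admitted at the fastest throttle level that keeps total draw under the cap. Trapped capacity is approximated here as provisioned power minus peak observed power.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    """Hypothetical parallel task description (not TracSim's data model)."""
    name: str
    nodes: int
    watts_per_node: float  # assumed draw per node at full CPU frequency


# Assumed throttle levels: fraction of full-speed power drawn per node.
THROTTLE_LEVELS = (1.0, 0.8, 0.6)


def trapped_capacity(provisioned_watts: float, peak_observed_watts: float) -> float:
    """Provisioned power that the workload never actually draws."""
    return max(0.0, provisioned_watts - peak_observed_watts)


def admit_under_cap(tasks: List[Task], power_cap_watts: float) -> Tuple[List[Tuple[str, float]], float]:
    """Greedy first-fit: admit each task at the fastest level that fits under the cap."""
    used = 0.0
    schedule: List[Tuple[str, float]] = []
    for task in tasks:
        for level in THROTTLE_LEVELS:
            demand = task.nodes * task.watts_per_node * level
            if used + demand <= power_cap_watts:
                used += demand
                schedule.append((task.name, level))
                break  # task placed; tasks that fit at no level are skipped
    return schedule, used


if __name__ == "__main__":
    workload = [Task("a", 4, 100.0), Task("b", 4, 100.0), Task("c", 2, 100.0)]
    schedule, used = admit_under_cap(workload, power_cap_watts=700.0)
    print(schedule, used, trapped_capacity(1000.0, used))
```

In this toy workload, task `b` only fits once it is throttled and task `c` does not fit at any level, which is the kind of trade-off between throttling and admission that a policy under a power cap has to weigh.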
Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 57, September 2016, Pages 108-124