Hierarchical power management of a system with autonomously power-managed components using reinforcement learning

Article ID	Journal	Published Year	Pages	File Type
539625	Integration, the VLSI Journal	2015	11 Pages	PDF

Abstract

• A dynamic power management framework based on reinforcement learning is proposed.
• The framework is model-free and independent of pre-designed policies.
• We perform learning and power management in continuous time.
• The proposed framework can simultaneously consider power and performance.
• An effective application-level scheduling scheme is integrated for further energy savings.

This paper presents a hierarchical dynamic power management (DPM) framework based on reinforcement learning (RL), which aims at power savings in a computer system with multiple I/O devices running a number of heterogeneous applications. The proposed framework interacts with the CPU scheduler to perform effective application-level scheduling, thereby enabling further power savings. Moreover, it considers non-stationary workloads and differentiates between the service request generation rates of the various software applications. The online adaptive DPM technique consists of two layers: a component-level local power manager and a system-level global power manager. The component-level PM policy is pre-specified and fixed, whereas the system-level PM employs temporal difference learning on a semi-Markov decision process as the model-free RL technique and is specifically optimized for a heterogeneous application pool. Experiments show that the proposed approach considerably enhances power savings while maintaining good performance levels. In comparison with other reference systems, the proposed RL-based DPM approach further enhances power savings, performs well under various workloads, can simultaneously consider power and performance, and achieves wide and deep power-performance tradeoff curves. Experiments conducted with multiple service providers confirm that energy savings of up to 63% per service provider can be achieved.
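To make the phrase "temporal difference learning on a semi-Markov decision process" more concrete, the following is a minimal sketch of a generic continuous-time Q-learning update, not the paper's actual formulation, which the abstract does not give. The class name `SMDPQLearner`, the state and action labels, the scalar `cost_rate` (assumed here to combine power and a performance penalty), and all parameter values are hypothetical choices made for illustration.

```python
import math
import random
from collections import defaultdict


class SMDPQLearner:
    """Generic tabular Q-learning for a semi-Markov decision process.

    Hypothetical sketch: costs are minimized, and each decision epoch
    lasts a variable sojourn time tau (continuous time).
    """

    def __init__(self, actions, alpha=0.1, beta=0.5, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.beta = beta        # continuous-time discount rate (> 0)
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)  # Q[(state, action)], initialized to 0
        self.rng = random.Random(seed)

    def choose(self, state):
        # Epsilon-greedy: explore at random, otherwise pick the lowest cost.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return min(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, cost_rate, tau, next_state):
        # Discount factor accumulated over the sojourn time tau.
        decay = math.exp(-self.beta * tau)
        # Cost accrued at a constant rate over [0, tau], discounted at beta.
        discounted_cost = cost_rate * (1.0 - decay) / self.beta
        best_next = min(self.q[(next_state, a)] for a in self.actions)
        target = discounted_cost + decay * best_next
        # Temporal difference step toward the target.
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        return self.q[(state, action)]
```

In this sketch the learner needs no model of request arrivals: it only observes the cost rate and the time spent in each state, which is what "model-free" refers to. A global power manager in the style described above would call `choose` at each decision epoch and `update` once the next epoch begins.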

Keywords

Semi-Markov decision process; Power management; Temporal difference learning; Reinforcement learning