Core frequency adjustment to optimize Time Warp on many-core processors

Article ID	Journal	Published Year	Pages	File Type
493611	Simulation Modelling Practice and Theory	2012	10 Pages	PDF

Abstract

Time Warp synchronized parallel discrete event simulators are organized to operate asynchronously and aggressively without explicit synchronization between the concurrently executing simulation objects. In place of an explicit synchronization mechanism, the concurrent simulators implement an independent but common virtual clock model and a rollback/recovery mechanism to restore causal order when out-of-order events are detected. When the critical path of execution of the simulation is balanced across this parallel threads of execution, this can result in a highly effective, lightweight synchronization mechanism to implement parallel simulation. However, imbalances in the workload across the threads can result in excessive rollback in some threads and slowed progress of the critical path. On small shared memory multi-core systems, a lowest time-stamp scheduling policy can effectively balance the workload. However, on larger many-core chips, conventional load balancing and workload migration will once again become necessary. Fortunately, emerging many-core chips contain some interesting features that can potentially be exploited to improve the performance of parallel simulations. In particular, the recently developed Intel Single-chip Cloud Computer (SCC) provides mechanisms for the runtime control of the frequency and voltage settings of the chip. Furthermore, the frequency and voltage settings are independently set within different regions (called islands) of the chip. Thus, in a Time Warp simulation, one could increase the frequency of the cores executing threads on the critical path (those experiencing infrequent rollback) and decrease the frequency of the cores executing threads off the critical path (those experiencing excessive rollback). This paper investigates the run-time control and adjustment of core frequency in some contemporary x86 multi-core processors to identify the platforms that can support the exploration of dynamic run-time control of core frequency settings. The results show that while all multi-core processors have software controllable core frequency modulation capabilities, they are generally not fully independent as the system comes under load and are therefore unsuitable for these studies. Fortunately, one processor, the AMD X6 line, provides software control for core frequencies that can be fixed (by software) even as the system operates under load.

► Dynamic core frequency tuning is explored to accelerate the critical path of parallel simulation. ► Parallel simulations on many-core show widely varying total rollbacks per core. ► Idealized theoretical analysis shows that speedups of up to 6.79 are possible. ► Only the AMD X6 chip supports software based dynamic core frequency studies.

Keywords

parallel simulation Many-core processors