Adapting attackers and defenders patrolling strategies: A reinforcement learning approach for Stackelberg security games

Article ID	Journal	Published Year	Pages	File Type
6874671	Journal of Computer and System Sciences	2018	20 Pages	PDF

Abstract

This paper presents a novel approach for adapting attackers' and defenders' preferred patrolling strategies using reinforcement learning (RL) based on average rewards in Stackelberg security games. We propose a framework that combines three different paradigms: prior knowledge, imitation, and temporal-difference methods. The overall RL architecture involves two top-level components: the Adaptive Primary Learning architecture and the Actor-critic architecture. In this work we consider that defenders and attackers form coalitions in the Stackelberg security game; these coalitions are reached by computing the Strong Lp-Stackelberg/Nash equilibrium. We present a numerical example that validates the proposed RL approach and measures its benefits for security resource allocation.
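
To make the average-reward, temporal-difference actor-critic idea in the abstract concrete, the following is a minimal generic sketch. It is not the paper's Adaptive Primary Learning architecture and does not compute a Stackelberg/Nash equilibrium. It assumes a single learner, a tabular softmax policy, and a hypothetical two-target patrolling environment (`toy_step`) invented purely for illustration. All function names, step sizes, and reward values are assumptions.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def average_reward_actor_critic(step, n_states, n_actions, steps=20000,
                                alpha_v=0.1, alpha_pi=0.05, alpha_rho=0.01,
                                seed=0):
    """Tabular actor-critic with an average-reward TD error (illustrative)."""
    rng = random.Random(seed)
    v = [0.0] * n_states                                   # critic: differential values
    prefs = [[0.0] * n_actions for _ in range(n_states)]   # actor: action preferences
    rho = 0.0                                              # running average-reward estimate
    s = 0
    for _ in range(steps):
        probs = softmax(prefs[s])
        a = rng.choices(range(n_actions), weights=probs)[0]
        r, s_next = step(s, a, rng)
        # Average-reward TD error: the reward is measured relative to rho.
        delta = r - rho + v[s_next] - v[s]
        rho += alpha_rho * delta
        v[s] += alpha_v * delta
        # Policy-gradient step for a softmax policy.
        for b in range(n_actions):
            grad = (1.0 if b == a else 0.0) - probs[b]
            prefs[s][b] += alpha_pi * delta * grad
        s = s_next
    return rho, [softmax(p) for p in prefs]


def toy_step(state, action, rng):
    """Hypothetical environment: the action is the target patrolled next.

    Patrolling target 1 pays 1.0, target 0 pays 0.2 (assumed values);
    the next state is the target just patrolled.
    """
    reward = 1.0 if action == 1 else 0.2
    return reward, action
```

Calling `average_reward_actor_critic(toy_step, n_states=2, n_actions=2)` returns the average-reward estimate and the learned per-state action probabilities. In this toy setting the learner should come to prefer the higher-paying target, since the TD error of the lower-paying action turns negative once the average-reward estimate rises above 0.2.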

Keywords

Stackelberg games; Security games; Reinforcement learning