Article ID: 409206
Journal: Neurocomputing
Published Year: 2014
Pages: 8
File Type: PDF
Abstract

We consider state-dependent pricing in a two-player service-market stochastic game in which the state of the game and its transition dynamics are modeled using a semi-Markovian queue. We propose a multi-timescale actor–critic reinforcement learning algorithm for multi-agent learning under self-play and provide experimental results on convergence to a Nash equilibrium.
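
The abstract names the approach but gives no algorithmic detail, so the following is only a minimal illustrative sketch of a generic two-timescale actor–critic under self-play, not the authors' algorithm. Everything concrete in it is an assumption made for illustration: the price menu `PRICES`, the queue capacity `CAP`, the arrival and service probabilities, the waiting cost, and the step-size schedules. In particular, a toy discrete-time queue stands in for the semi-Markovian queue of the paper. The critic (state values) uses a step size that decays more slowly than the actor's, so the critic moves on the faster timescale.

```python
import math
import random

PRICES = [1.0, 2.0]   # hypothetical price menu for each provider
CAP = 3               # hypothetical queue capacity per provider
GAMMA = 0.9           # assumed discount factor
STATES = [(q0, q1) for q0 in range(CAP + 1) for q1 in range(CAP + 1)]


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, actions, rng):
    """Toy discrete-time queue transition; returns (next_state, rewards)."""
    queues = list(state)
    rewards = [0.0, 0.0]
    # Each busy provider completes a service with probability 0.5 (assumed).
    for i in range(2):
        if queues[i] > 0 and rng.random() < 0.5:
            queues[i] -= 1
    # One customer arrives with probability 0.7 (assumed) and joins the
    # provider with the lower price plus waiting cost; ties break at random.
    if rng.random() < 0.7:
        costs = [PRICES[actions[i]] + 0.5 * queues[i] for i in range(2)]
        choice = min(range(2), key=lambda i: (costs[i], rng.random()))
        if queues[choice] < CAP:
            queues[choice] += 1
            rewards[choice] = PRICES[actions[choice]]
    return tuple(queues), rewards


def self_play(num_steps=20000, seed=0):
    """Both players run the same tabular actor-critic against each other."""
    rng = random.Random(seed)
    values = [{s: 0.0 for s in STATES} for _ in range(2)]                 # critics
    prefs = [{s: [0.0] * len(PRICES) for s in STATES} for _ in range(2)]  # actors
    state = (0, 0)
    for n in range(1, num_steps + 1):
        critic_lr = 1.0 / n ** 0.6   # faster timescale
        actor_lr = 1.0 / n           # slower timescale: actor_lr / critic_lr -> 0
        policies = [softmax(prefs[i][state]) for i in range(2)]
        actions = [
            rng.choices(range(len(PRICES)), weights=policies[i])[0]
            for i in range(2)
        ]
        next_state, rewards = step(state, actions, rng)
        for i in range(2):
            # Temporal-difference error drives both the critic and the actor.
            td = rewards[i] + GAMMA * values[i][next_state] - values[i][state]
            values[i][state] += critic_lr * td
            for a in range(len(PRICES)):
                grad = (1.0 if a == actions[i] else 0.0) - policies[i][a]
                prefs[i][state][a] += actor_lr * td * grad
        state = next_state
    return values, prefs
```

Calling `self_play()` returns each player's learned state values and action preferences; applying `softmax` to a preference list gives that player's pricing policy in the corresponding state. Whether such policies approach a Nash equilibrium is an empirical question that this toy setup does not settle.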

Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence