A constrained optimization perspective on actor–critic algorithms and application to network routing

Article ID	Journal	Published Year	Pages	File Type
751933	Systems & Control Letters	2016	6 Pages	PDF

Abstract

We propose a novel actor–critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routing application.

Keywords

Constrained optimization; Reinforcement learning

Related Topics

Physical Sciences and Engineering > Engineering > Control and Systems Engineering

Authors

Prashanth L.A., Prasad H.L., Shalabh Bhatnagar, Prakash Chandra