Article ID: 720516
Journal: IFAC Proceedings Volumes
Published Year: 2007
Pages: 6
File Type: PDF
Abstract

This work proposes a methodology for generating risk-averse policies for Markov Decision Processes (MDPs). The methodology modifies the one-stage reward or cost so that it weighs the trade-off between expected performance and downside risk, where downside risk is measured by the conditional value at risk (CVaR_α). The modified stage-wise utility function is then used within dynamic programming to generate a set of policies, each representing a different level of the trade-off. The approach is demonstrated on a shortest-path optimal control problem and on a project management problem modeled as a constrained MDP. To address a more complex management problem, we utilize the Real Time Approximate Dynamic Programming algorithm.
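The idea of a modified stage-wise cost can be illustrated with a small sketch. The abstract does not give the exact form of the modification, so the code below assumes one plausible choice: a convex combination (1 - w) * E[c] + w * CVaR_α(c) of the expected one-stage cost and its CVaR, with a hypothetical weight w. The toy two-action shortest-path problem, the function names `cvar` and `solve`, and all numbers are illustrative assumptions, not taken from the paper.

```python
def cvar(outcomes, alpha):
    """Mean cost over the worst (1 - alpha) probability tail.

    outcomes: list of (probability, cost) pairs of a discrete distribution.
    """
    tail = 1.0 - alpha
    remaining = tail
    total = 0.0
    # Walk through the outcomes from the most to the least costly.
    for p, c in sorted(outcomes, key=lambda pc: pc[1], reverse=True):
        take = min(p, remaining)
        total += take * c
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / tail


def solve(mdp, terminal, w, alpha, sweeps=100):
    """Value iteration on an assumed risk-adjusted stage cost.

    mdp: {state: {action: [(probability, next_state, cost), ...]}}
    w:   hypothetical trade-off weight in [0, 1]; w = 0 is risk-neutral.
    """
    V = {s: 0.0 for s in mdp}
    V[terminal] = 0.0
    policy = {}
    for _ in range(sweeps):
        for s, actions in mdp.items():
            best = None
            for a, outcomes in actions.items():
                stage = [(p, c) for p, _, c in outcomes]
                expected = sum(p * c for p, c in stage)
                # Assumed modification: blend expected cost with CVaR.
                g = (1.0 - w) * expected + w * cvar(stage, alpha)
                q = g + sum(p * V[ns] for p, ns, _ in outcomes)
                if best is None or q < best[0]:
                    best = (q, a)
            V[s], policy[s] = best
    return V, policy


# Toy problem: a cheap-on-average but risky edge versus a safe edge.
MDP = {
    "start": {
        "risky": [(0.9, "goal", 1.0), (0.1, "goal", 10.0)],
        "safe": [(1.0, "goal", 3.0)],
    }
}

if __name__ == "__main__":
    # Sweeping w yields a set of policies, one per trade-off level.
    for w in (0.0, 0.1, 0.5, 1.0):
        V, policy = solve(MDP, "goal", w, alpha=0.9)
        print(w, policy["start"], round(V["start"], 3))
```

Under these assumptions, small values of w favor the action with the lower expected cost, while larger values shift the policy toward the action with the lighter cost tail. Sweeping w is one simple way to obtain the family of policies the abstract describes.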
