Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
1133566 | Computers & Industrial Engineering | 2015 | 6 Pages |
•Compared computational performance of linear programming and the policy iteration.•Considered only discrete-time infinite-horizon MDPs with discounted reward.•Used randomly generated test problems and a real-life health-care problem.•Showed that, unlike previously reported, barrier methods for LP provide a viable tool.•LP is more effective when the transition probability matrix has a diagonal structure.
We compare the computational performance of linear programming (LP) and the policy iteration algorithm (PIA) for solving discrete-time infinite-horizon Markov decision process (MDP) models with total expected discounted reward. We use randomly generated test problems as well as a real-life health-care problem to empirically show that, unlike previously reported, barrier methods for LP provide a viable tool for optimally solving such MDPs. The dimensions of comparison include transition probability matrix structure, state and action size, and the LP solution method.