Article code: 7151536
Journal code: 1462283
Publication year: 2018
English article: 7-page PDF
Full-text version: free download
English title of the ISI article
Q-learning for Markov decision processes with a satisfiability criterion
Related topics
Engineering and basic sciences > Other engineering disciplines > Control and systems engineering
English abstract
A reinforcement learning algorithm is proposed in order to solve a multi-criterion Markov decision process, i.e., an MDP with a vector running cost. Specifically, it combines a Q-learning scheme for a weighted linear combination of the prescribed running costs with an incremental version of replicator dynamics that updates the weights. The objective is that the time averaged vector cost meets prescribed asymptotic bounds. Under mild assumptions, it is shown that the scheme achieves the desired objective.
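The abstract does not state the update rules, so the following is only a rough sketch of how the two ingredients it names could fit together: a Q-learning step on a weighted linear combination of the running costs, and an incremental replicator-style step on the weights. Everything specific here is an assumption rather than the paper's scheme: the discounted tabular Q-update, the epsilon-greedy exploration, the use of "running average cost minus bound" as the replicator payoff, the step sizes, and the hypothetical two-state toy MDP together with all function names.

```python
import random


def scalarize(costs, weights):
    """Weighted linear combination of a vector running cost."""
    return sum(w * c for w, c in zip(weights, costs))


def replicator_step(weights, payoffs, step):
    """One incremental replicator update on the probability simplex.

    Each weight grows in proportion to how far its payoff exceeds the
    weighted average payoff; the result is floored and renormalized.
    """
    avg = sum(w * p for w, p in zip(weights, payoffs))
    raw = [max(w * (1.0 + step * (p - avg)), 1e-12)
           for w, p in zip(weights, payoffs)]
    total = sum(raw)
    return [r / total for r in raw]


def q_update(q, s, a, cost, s_next, alpha, gamma):
    """One tabular Q-learning update for cost minimization.

    Assumption: a discounted criterion is used here for simplicity.
    """
    target = cost + gamma * min(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def run_toy(steps=5000, seed=0):
    """Run the coupled updates on a hypothetical 2-state, 2-action MDP."""
    rng = random.Random(seed)
    # costs[s][a] is a 2-vector: action 0 is cheap in cost 0, action 1 in cost 1.
    costs = [[(0.0, 1.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 0.0)]]
    bounds = (0.6, 0.6)  # assumed bounds on the time-averaged costs
    q = [[0.0, 0.0], [0.0, 0.0]]
    weights = [0.5, 0.5]
    avg_cost = [0.0, 0.0]
    s = 0
    for n in range(1, steps + 1):
        # Epsilon-greedy action selection on the scalarized Q-values.
        if rng.random() < 0.1:
            a = rng.randrange(2)
        else:
            a = min(range(2), key=lambda b: q[s][b])
        # Action a leads to state a with probability 0.9.
        s_next = a if rng.random() < 0.9 else 1 - a
        c = costs[s][a]
        q_update(q, s, a, scalarize(c, weights), s_next, alpha=0.1, gamma=0.9)
        # Running time average of each cost component.
        avg_cost = [m + (ci - m) / n for m, ci in zip(avg_cost, c)]
        # Assumed payoff: constraint violation, so violated costs gain weight.
        violation = [m - b for m, b in zip(avg_cost, bounds)]
        weights = replicator_step(weights, violation, step=0.01)
        s = s_next
    return weights, avg_cost
```

Calling `run_toy()` returns the final weights and the time-averaged cost vector, which can be compared against `bounds` by hand. The sketch makes no claim about convergence; the paper's result holds for its own scheme under its own assumptions.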
Publisher
Database: Elsevier - ScienceDirect
Journal: Systems & Control Letters - Volume 113, March 2018, Pages 45-51