Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
7108261 | 1460620 | 2018 | 14-page PDF | Free download |
English title of the ISI article
Multiple stopping time POMDPs: Structural results & application in interactive advertising on social media
Related topics
Engineering and basic sciences
Other engineering disciplines
Control and systems engineering

English abstract
This paper considers a multiple stopping time problem for a Markov chain observed in noise, where a decision maker chooses at most L stopping times to maximize a cumulative objective. We formulate the problem as a Partially Observed Markov Decision Process (POMDP) and derive structural results for the optimal multiple stopping policy. The main results are as follows: (i) The optimal multiple stopping policy is shown to be characterized by threshold curves $\Gamma_l$, for $l = 1, \ldots, L$, in the unit simplex of Bayesian posteriors. (ii) The stopping sets $S^l$ (defined by the threshold curves $\Gamma_l$) are shown to exhibit the following nested structure: $S^{l-1} \subset S^l$. (iii) The optimal cumulative reward is shown to be monotone with respect to the copositive ordering of the transition matrix. (iv) A stochastic gradient algorithm is provided for estimating linear threshold policies by exploiting the structural results. These linear threshold policies approximate the threshold curves $\Gamma_l$, and share the monotone structure of the optimal multiple stopping policy. (v) Application of the multiple stopping framework to interactively schedule advertisements in live online social media. It is shown that advertisement scheduling using multiple stopping performs significantly better than currently used methods.
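To make the abstract's setting more concrete, the following is a minimal sketch of a belief-based multiple stopping rule, not the authors' algorithm. It assumes a two-state Markov chain, so the unit simplex reduces to the scalar posterior of state 1 and each threshold curve reduces to a single number. It also assumes that the index l counts the stops still available, that the chain and the filter keep running after each stop, and that thresholds decrease as l grows so the stopping sets are nested. The names `belief_update`, `run_threshold_policy`, and all numeric values are hypothetical.

```python
def belief_update(belief, P, B, obs):
    """One step of the hidden Markov model filter.

    belief: current posterior over states (list summing to 1)
    P:      transition matrix, P[i][j] = Pr(next state j | state i)
    B:      observation likelihoods, B[j][y] = Pr(observation y | state j)
    obs:    index of the observation just received
    """
    n = len(belief)
    # Prediction step: push the belief through the transition matrix.
    predicted = [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Correction step: weight by the observation likelihood, then normalize.
    unnormalized = [predicted[j] * B[j][obs] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def run_threshold_policy(observations, P, B, belief0, thresholds):
    """Apply a threshold stopping rule with at most len(thresholds) stops.

    thresholds maps the number of stops remaining (assumed meaning of l)
    to a threshold on the posterior of state 1. Returns the stopping times.
    """
    belief = list(belief0)
    remaining = len(thresholds)
    stops = []
    for t, obs in enumerate(observations):
        belief = belief_update(belief, P, B, obs)
        if remaining > 0 and belief[1] >= thresholds[remaining]:
            stops.append(t)
            remaining -= 1  # one fewer stop available from now on
    return stops


# Hypothetical model: a slowly mixing chain with noisy binary observations.
P = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
# Lower threshold when more stops remain, so the stopping sets are nested.
thresholds = {2: 0.7, 1: 0.9}

stops = run_threshold_policy([1, 1, 0, 1, 1], P, B, [0.5, 0.5], thresholds)
print(stops)
```

With more than two states the scalar thresholds would be replaced by curves (or, in the linear approximation mentioned in the abstract, hyperplanes) in the belief simplex; the stochastic gradient step that estimates them is not sketched here.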
Publisher
Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 95, September 2018, Pages 385-398
Authors
Vikram Krishnamurthy, Anup Aprem, Sujay Bhatt