کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4952490 1442041 2016 13 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
LinUCB applied to Monte Carlo tree search
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
LinUCB applied to Monte Carlo tree search
چکیده انگلیسی
UCT is a standard method of Monte Carlo tree search (MCTS) algorithms, which have been applied to various domains and have achieved remarkable success. This study proposes a family of LinUCT algorithms that incorporate LinUCB into MCTS algorithms. LinUCB is a recently developed method that generalizes past episodes by ridge regression with feature vectors and rewards. LinUCB outperforms UCB1 in contextual multi-armed bandit problems. We introduce a straightforward application of LinUCB, LinUCTPLAIN by substituting UCB1 with LinUCB in UCT. We show that it does not work well owing to the minimax structure of game trees. To better handle such tree structures, we present LinUCTRAVE and LinUCTFP by further incorporating two existing techniques, rapid action value estimation (RAVE) and feature propagation, which recursively propagates the feature vector of a node to that of its parent. Experiments were conducted with a synthetic model, which is an extension of the standard incremental random tree model in which each node has a feature vector that represents the characteristics of the corresponding position, and Finnsson's shock step game which is used to empirically analyze the performance of UCT with respect to the distribution of suboptimal moves. The experiments results indicate that LinUCTRAVE and LinUCTFP outperform UCT, especially when the branching factor is relatively large.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Theoretical Computer Science - Volume 644, 6 September 2016, Pages 114-126
نویسندگان
, ,