کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
381090 1437461 2013 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
An automated signalized junction controller that learns strategies by temporal difference reinforcement learning
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
An automated signalized junction controller that learns strategies by temporal difference reinforcement learning
چکیده انگلیسی

This paper shows how temporal difference learning can be used to build a signalized junction controller that will learn its own strategies through experience. Simulation tests detailed here show that the learned strategies can have high performance. This work builds upon previous work where a neural network based junction controller that can learn strategies from a human expert was developed (Box and Waterson, 2012). In the simulations presented, vehicles are assumed to be broadcasting their position over WiFi giving the junction controller rich information. The vehicle's position data are pre-processed to describe a simplified state. The state-space is classified into regions associated with junction control decisions using a neural network. This classification is the strategy and is parametrized by the weights of the neural network. The weights can be learned either through supervised learning with a human trainer or reinforcement learning by temporal difference (TD). Tests on a model of an isolated T junction show an average delay of 14.12 s and 14.36 s respectively for the human trained and TD trained networks. Tests on a model of a pair of closely spaced junctions show 17.44 s and 20.82 s respectively. Both methods of training produced strategies that were approximately equivalent in their equitable treatment of vehicles, defined here as the variance over the journey time distributions.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Engineering Applications of Artificial Intelligence - Volume 26, Issue 1, January 2013, Pages 652–659
نویسندگان
, ,