
During the mid-course phase of an air-to-air missile engagement, choosing the optimal Guidance Point (GP) so as to maximize lock-on success and minimize intercept time is critical. Given the low computational resources available on board and a tightly constrained maneuvering time frame, GP-selection algorithms must be efficient. Rather than hand-crafting a GP-picking algorithm for every combination of sensor and aircraft configuration, one promising alternative is to model the missile-target engagement as a Partially Observable Markov Decision Process (POMDP) and automatically generate a controller for picking the best GP by solving the POMDP model. We propose an approach based on Reinforcement Learning (RL) that produces finite state controllers which can be executed efficiently, via table lookup, to meet the strict time limits of a target engagement. Using a recently developed offline algorithm called Monte Carlo Value Iteration (MCVI), we constructed continuous-state POMDP models and solved them directly, without discretizing the entire state space.

Invited session "Missile Guidance Navigation & Control" (pm846)
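The abstract notes that the offline solver outputs a finite state controller that can be executed online by table lookup. A minimal sketch of what such execution looks like is below; the node structure, the `GP_lead`/`GP_lag` action names, and the coarse observation labels are all illustrative assumptions, not details taken from the paper.

```python
# Sketch: executing a finite-state controller (policy graph) for a POMDP
# by table lookup. Each node stores one action (here, a guidance-point
# choice) and a transition table mapping observations to successor nodes,
# so each online step costs only O(1) lookups -- no planning at runtime.
# All names here are hypothetical placeholders, not from the paper.

from dataclasses import dataclass, field

@dataclass
class FSCNode:
    action: str                      # guidance point commanded at this node
    next_node: dict = field(default_factory=dict)  # observation -> node index

# Toy two-node controller over coarse engagement-geometry observations.
controller = [
    FSCNode(action="GP_lead", next_node={"closing": 0, "crossing": 1}),
    FSCNode(action="GP_lag",  next_node={"closing": 0, "crossing": 1}),
]

def run_controller(observations, start=0):
    """Step the controller through an observation stream, emitting actions."""
    node = start
    actions = []
    for obs in observations:
        actions.append(controller[node].action)   # act at current node
        node = controller[node].next_node[obs]    # table-lookup transition
    return actions
```

An offline solver like MCVI would produce the node/transition tables; the point of the sketch is that the online controller itself involves no search or numerical optimization, which is what makes it suitable for a tightly time-constrained engagement.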
Journal: IFAC Proceedings Volumes - Volume 46, Issue 19, 2013, Pages 295-300