Comparing a class of dynamic model-based reinforcement learning schemes for handoff prioritization in mobile communication networks

Article ID	Journal	Published Year	Pages	File Type
385701	Expert Systems with Applications	2011	8 Pages	PDF

Abstract

This paper presents and compares three model-based reinforcement learning schemes for admission control with handoff prioritization in mobile communication networks. The goal is to reduce handoff failures while making efficient use of wireless network resources. The performance measure is a weighted linear function of the blocking probability of new connection requests and the handoff failure probability. The problem is formulated as a semi-Markov decision process with an average cost criterion, and a simulation-based learning algorithm is developed to approximate the optimal control policy. Each proposed scheme is driven by a dynamic model that is estimated while the control policy is being learned, using samples generated from direct interaction with the network. Extensive simulations assess the effectiveness of the schemes under a variety of traffic conditions and compare them with some well-known policies.
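The weighted linear performance measure described above can be illustrated with a minimal sketch. The function name, argument names, and default weights below are assumptions made for illustration only; the abstract does not state the weights or the exact form used in the paper.

```python
def weighted_cost(p_block: float, p_handoff_fail: float,
                  w_block: float = 1.0, w_handoff: float = 10.0) -> float:
    """Weighted linear cost of new-call blocking and handoff failure.

    The weights are hypothetical; a larger w_handoff expresses that a
    dropped handoff is considered worse than a blocked new call.
    """
    # Both inputs are probabilities, so reject values outside [0, 1].
    for name, p in (("p_block", p_block), ("p_handoff_fail", p_handoff_fail)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {p}")
    return w_block * p_block + w_handoff * p_handoff_fail


if __name__ == "__main__":
    # Example: 5% new-call blocking and 1% handoff failure.
    print(weighted_cost(0.05, 0.01))
```

Under these assumed weights, a policy that lowers the handoff failure probability at the expense of slightly higher new-call blocking can still achieve a lower overall cost, which is the trade-off that handoff prioritization targets.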

Research highlights

► Formulate call admission with handoff prioritization as an average cost semi-Markov decision process.
► Present and compare three model-based reinforcement learning schemes to solve this problem.
► Assess the performance of the schemes under a variety of traffic conditions and compare it with some existing policies.

Keywords

Semi-Markov decision process; Cellular systems; Resource management; Reinforcement learning