Article ID: 6856393
Journal: Information Sciences
Published Year: 2018
Pages: 38 Pages
File Type: PDF
Abstract
In this study, an adversarial multiplayer multiarmed bandit (MAB) game is employed to model the problem of joint channel and power allocation in multiuser underwater acoustic communication networks (UACNs). The study also presents a distributed hierarchical learning algorithm that requires neither prior environmental information nor direct information exchange among users. The algorithm uses a two-tier learning approach that improves each user's learning ability and shortens learning time. In the upper tier, each user formulates its strategy by learning from the reward it actually received for the strategy it played. In the lower tier, the user learns from outdated virtual information, which can be obtained as the reward of a previously played strategy. A dynamic lower-tier learning mechanism is proposed to keep the algorithm from becoming trapped in a poor local extremum. Because of this learning behavior, the algorithm tolerates delay and incomplete information well. Simulation results show that the proposed algorithm achieves a high level of performance.
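To make the bandit framing concrete, the following is a minimal sketch of a generic Exp3 learner, a standard algorithm for adversarial bandits, applied to a joint (channel, power) choice. It is not the paper's hierarchical two-tier algorithm, whose update rules are not given in the abstract. The class name `Exp3`, the function `toy_reward`, the two-channel, three-power-level setup, and the collision-based reward are all assumptions made purely for illustration.

```python
import math
import random
from itertools import product


class Exp3:
    """Generic Exp3 learner for adversarial bandits (rewards assumed in [0, 1])."""

    def __init__(self, arms, gamma=0.1, seed=None):
        self.arms = list(arms)
        self.gamma = gamma                      # exploration rate
        self.weights = [1.0] * len(self.arms)
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        k = len(self.arms)
        return [(1 - self.gamma) * w / total + self.gamma / k
                for w in self.weights]

    def select(self):
        # Sample an arm index; return it with the probability it was drawn at.
        probs = self.probabilities()
        idx = self.rng.choices(range(len(self.arms)), weights=probs)[0]
        return idx, probs[idx]

    def update(self, idx, prob, reward):
        # Importance-weighted estimate: only the played arm is updated.
        estimate = reward / prob
        self.weights[idx] *= math.exp(self.gamma * estimate / len(self.arms))
        # Rescale by the largest weight to avoid overflow; this leaves the
        # resulting probabilities unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def toy_reward(my_arm, other_arm):
    """Hypothetical reward: 0 on a channel collision, else penalise power."""
    (channel, power), (other_channel, _) = my_arm, other_arm
    if channel == other_channel:
        return 0.0
    return 1.0 - 0.2 * power                    # power in {0, 1, 2}


def simulate(rounds=2000, seed=0):
    # Two users, each choosing among 2 channels x 3 power levels.
    arms = list(product(range(2), range(3)))
    users = [Exp3(arms, seed=seed + i) for i in range(2)]
    for _ in range(rounds):
        picks = [user.select() for user in users]
        for i, user in enumerate(users):
            idx, prob = picks[i]
            other_idx = picks[1 - i][0]
            # Each user sees only its own reward; nothing is exchanged.
            user.update(idx, prob, toy_reward(arms[idx], arms[other_idx]))
    return users
```

Each user here observes only the reward of the arm it played, which mirrors the abstract's setting of no prior environmental information and no direct exchange among users. The lower-tier learning from outdated virtual rewards is deliberately left out, since the abstract does not specify how it works.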
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence