Article ID | Journal | Published Year | Pages |
---|---|---|---|
6856393 | Information Sciences | 2018 | 38 |
Abstract
In this study, an adversarial multiplayer multi-armed bandit (MAB) game is used to model the problem of joint channel and power allocation in multiuser underwater acoustic communication networks (UACNs). The study also presents a distributed hierarchical learning algorithm that requires neither prior environmental information nor direct information exchange among users. The algorithm takes a two-tier learning approach that improves user learning ability and shortens learning time. In upper learning, each user formulates a strategy by learning from the reward of the strategy actually played. In lower learning, the user learns from outdated virtual information, which can be obtained as the reward of a previously played strategy. A dynamic lower learning mechanism is proposed to keep the algorithm from becoming trapped in a poor local extremum. Because of this learning behavior, the algorithm has a high tolerance for delay and incomplete information. Simulation results show that the proposed algorithm achieves a high level of performance.
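To make the adversarial bandit framing concrete, the sketch below shows how a single user might pick a joint (channel, power) arm with the standard Exp3 algorithm. This is not the hierarchical two-tier algorithm described in the abstract: Exp3 is swapped in as a well-known adversarial MAB baseline, and the channel set, power levels, exploration rate `gamma`, and `toy_reward` function are all hypothetical values chosen for illustration.

```python
import itertools
import math
import random

# Hypothetical action space: each arm is a (channel, power level) pair.
CHANNELS = [0, 1, 2]
POWER_LEVELS = [0.5, 1.0, 2.0]
ARMS = list(itertools.product(CHANNELS, POWER_LEVELS))


class Exp3:
    """Standard Exp3 learner over a finite arm set; rewards must lie in [0, 1]."""

    def __init__(self, arms, gamma=0.1, seed=None):
        self.arms = list(arms)
        self.gamma = gamma  # exploration rate (assumed value)
        self.weights = [1.0] * len(self.arms)
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        k = len(self.arms)
        return [(1 - self.gamma) * w / total + self.gamma / k for w in self.weights]

    def select(self):
        # Sample an arm index and return it with its selection probability.
        probs = self.probabilities()
        index = self.rng.choices(range(len(self.arms)), weights=probs, k=1)[0]
        return index, probs[index]

    def update(self, index, prob, reward):
        # Importance-weighted estimate: only the played arm is updated.
        estimate = reward / prob
        self.weights[index] *= math.exp(self.gamma * estimate / len(self.arms))
        # Rescale so the largest weight is 1; probabilities are unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def toy_reward(arm, rng):
    """Hypothetical reward in [0, 1]: channel 1 is least contended, power costs energy."""
    channel, power = arm
    base = 0.8 if channel == 1 else 0.3
    return max(0.0, min(1.0, base - 0.1 * power + rng.uniform(-0.05, 0.05)))


if __name__ == "__main__":
    env_rng = random.Random(1)
    learner = Exp3(ARMS, gamma=0.1, seed=0)
    for _ in range(2000):
        idx, prob = learner.select()
        learner.update(idx, prob, toy_reward(ARMS[idx], env_rng))
    probs = learner.probabilities()
    print("most likely arm:", ARMS[probs.index(max(probs))])
```

Because the update uses only the reward of the arm the user actually played, the learner needs no prior model of the environment and no messages from other users, which matches the information assumptions stated in the abstract. The lower-tier learning from outdated virtual rewards is not modeled here.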
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Song Han, Xinbin Li, Lei Yan, Jiajie Xu, Zhixin Liu, Xinping Guan