Reinforcement learning, Cognitive radio network, Collaborative learning
Designing an efficient routing protocol for cognitive radio networks is critical because of the dynamic behavior of the primary users. Empirical studies show that primary user activity on the licensed channels is periodic, with each period comprising several stages, and that the model of primary user activity may change from one stage to another. This paper identifies two main challenges facing protocol designers: how to transmit packets over a stable route, and how to keep the interference imposed on the primary users to a minimum. To address these, the paper proposes a routing protocol that is based on a generalized version of Q-learning and exploits this multi-stage model of primary user behavior. Degradation of the quality of service (QoS) of secondary users stems from a lack of attention to the multi-stage periodic behavior of primary users.