Reinforcement Learning

  • Unique Paper ID: 198267
  • Volume: 12
  • Issue: 11
  • PageNo: 8046-8050
  • Abstract: Reinforcement Learning (RL) is a paradigm of machine learning wherein an agent learns to make sequential decisions by interacting with an environment to maximize cumulative rewards. This paper presents a comprehensive survey of RL, covering its theoretical foundations in Markov Decision Processes, core algorithms including Q-Learning, SARSA, and Monte Carlo methods, and advanced deep RL frameworks such as Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Actor-Critic architectures. Applications across robotics, game-playing, healthcare, and autonomous systems are reviewed, alongside open challenges including sample inefficiency, reward specification, and safe exploration. The paper concludes with directions for future research bridging RL with large foundation models.

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{198267,
        author = {Shahid Mulani and Rudra Bundele and Shruti Tiwari and Neha Shinde},
        title = {Reinforcement Learning},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {11},
        pages = {8046--8050},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=198267},
        abstract = {Reinforcement Learning (RL) is a paradigm of machine learning wherein an agent learns to make sequential decisions by interacting with an environment to maximize cumulative rewards. This paper presents a comprehensive survey of RL, covering its theoretical foundations in Markov Decision Processes, core algorithms including Q-Learning, SARSA, and Monte Carlo methods, and advanced deep RL frameworks such as Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), and Actor-Critic architectures. Applications across robotics, game-playing, healthcare, and autonomous systems are reviewed, alongside open challenges including sample inefficiency, reward specification, and safe exploration. The paper concludes with directions for future research bridging RL with large foundation models.},
        keywords = {Reinforcement Learning, Deep Q-Networks, Markov Decision Processes, Policy Gradient Methods, Actor-Critic, Reward Shaping, Autonomous Systems, Machine Learning},
        month = {April},
}

Cite This Article

Mulani, S., Bundele, R., Tiwari, S., & Shinde, N. (2026). Reinforcement Learning. International Journal of Innovative Research in Technology (IJIRT), 12(11), 8046–8050.
