A Comparison of Learning Strategies in Freeze Tag: Behavior Cloning vs Curriculum Learning with RL Agents

  • Unique Paper ID: 180811
  • PageNo: 2456-2463
  • Abstract: This study compares Curriculum Learning (CL) and Behavior Cloning (BC) for training intelligent agents in a multi-agent Freeze Tag game environment built with Unity ML-Agents. The game has two types of agents, taggers and runners, and involves both cooperative and competitive interactions. We use Proximal Policy Optimization (PPO) as the base reinforcement learning algorithm, supplemented with CL for progressive skill learning and BC for imitation learning from expert demonstrations. Training used parallel environments to maximize data throughput, and agents were evaluated systematically through agent-vs-agent games. Results show that the BC-trained taggers achieved marginally better win rates, while the CL-trained taggers showed better strategic behavior, resource use, and learnability. These results highlight the trade-offs between imitation learning and curriculum-guided progression in large-scale multi-agent systems.

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{180811,
        author = {Yash Nikum and Nilam Honmane and Pranav Potdar and Vedant Pawashe and Harsh Sarda},
        title = {A Comparison of Learning Strategies in Freeze Tag: Behavior Cloning vs Curriculum Learning with RL Agents},
        journal = {International Journal of Innovative Research in Technology},
        year = {2025},
        volume = {12},
        number = {1},
        pages = {2456--2463},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=180811},
        abstract = {This study compares Curriculum Learning (CL) and Behavior Cloning (BC) for training intelligent agents in a multi-agent Freeze Tag game environment built with Unity ML-Agents. The game has two types of agents, taggers and runners, and involves both cooperative and competitive interactions. We use Proximal Policy Optimization (PPO) as the base reinforcement learning algorithm, supplemented with CL for progressive skill learning and BC for imitation learning from expert demonstrations. Training used parallel environments to maximize data throughput, and agents were evaluated systematically through agent-vs-agent games. Results show that the BC-trained taggers achieved marginally better win rates, while the CL-trained taggers showed better strategic behavior, resource use, and learnability. These results highlight the trade-offs between imitation learning and curriculum-guided progression in large-scale multi-agent systems.},
        keywords = {Multi-Agent Learning, Curriculum Learning, Behavior Cloning, Reinforcement Learning, PPO, Unity ML-Agents},
        month = {June},
}

Cite This Article

Nikum, Y., Honmane, N., Potdar, P., Pawashe, V., & Sarda, H. (2025). A Comparison of Learning Strategies in Freeze Tag: Behavior Cloning vs Curriculum Learning with RL Agents. International Journal of Innovative Research in Technology (IJIRT), 12(1), 2456–2463.
