AI-Driven Video Violence Detection Using CNN-LSTM Architecture and Generative AI for Incident Explanation

  • Unique Paper ID: 197128
  • Volume: 12
  • Issue: 11
  • PageNo: 5952-5958
  • Abstract: Violence detection in surveillance videos is a critical requirement for ensuring public safety in institutional, urban, and commercial environments. Manual monitoring of video feeds is inefficient, error-prone, and not scalable for modern security operations. This paper presents an AI-based Video Violence Detection System designed to automatically analyze pre-recorded surveillance videos and identify violent activities using a hybrid CNN-LSTM deep learning architecture. Convolutional Neural Networks (CNN) extract spatial features from individual video frames, while Long Short-Term Memory (LSTM) networks capture temporal dependencies across frame sequences. A Generative AI (GenAI) module is integrated to generate human-readable incident explanations, enhancing interpretability and supporting operational decision-making. The system is built on a scalable N-Tier architecture utilizing FastAPI, React.js, and PostgreSQL, with JWT-based authentication and Role-Based Access Control (RBAC). Experimental evaluation demonstrates an accuracy of approximately 86%, with precision, recall, and F1-score values of 0.87, 0.85, and 0.86, respectively.
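
The reported F1-score can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only a consistency check on the figures quoted in the abstract. The function name `f1_score` and the variable names are illustrative assumptions and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1-score)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract (hypothetical variable names).
reported_precision = 0.87
reported_recall = 0.85

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.86
```

Rounded to two decimals, the result matches the F1-score of 0.86 stated in the abstract.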

Copyright & License

Copyright © 2026. The authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{197128,
        author = {Yalla Venkat and M. Chaitrika and V. Veera Sai Manikanta and Y. Prudhvi and S. Asha Jyothi},
        title = {AI-Driven Video Violence Detection Using CNN-LSTM Architecture and Generative AI for Incident Explanation},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {11},
        pages = {5952--5958},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=197128},
        abstract = {Violence detection in surveillance videos is a critical requirement for ensuring public safety in institutional, urban, and commercial environments. Manual monitoring of video feeds is inefficient, error-prone, and not scalable for modern security operations. This paper presents an AI-based Video Violence Detection System designed to automatically analyze pre-recorded surveillance videos and identify violent activities using a hybrid CNN-LSTM deep learning architecture. Convolutional Neural Networks (CNN) extract spatial features from individual video frames, while Long Short-Term Memory (LSTM) networks capture temporal dependencies across frame sequences. A Generative AI (GenAI) module is integrated to generate human-readable incident explanations, enhancing interpretability and supporting operational decision-making. The system is built on a scalable N-Tier architecture utilizing FastAPI, React.js, and PostgreSQL, with JWT-based authentication and Role-Based Access Control (RBAC). Experimental evaluation demonstrates an accuracy of approximately 86%, with precision, recall, and F1-score values of 0.87, 0.85, and 0.86, respectively.},
        keywords = {Artificial Intelligence; Violence Detection; Deep Learning; CNN-LSTM; Video Analysis; Generative AI; Explainable AI; N-Tier Architecture; FastAPI; React; PostgreSQL; Surveillance Systems},
        month = {April},
        }

Cite This Article

Venkat, Y., Chaitrika, M., Manikanta, V. V. S., Prudhvi, Y., & Jyothi, S. A. (2026). AI-Driven Video Violence Detection Using CNN-LSTM Architecture and Generative AI for Incident Explanation. International Journal of Innovative Research in Technology (IJIRT), 12(11), 5952–5958.
