Driver Drowsiness Detection using Hybrid VGG19 and Vision Transformer

  • Unique Paper ID: 201032
  • Pages: 192–202
  • Abstract: Driver drowsiness is a major contributor to road accidents, reducing attention, reaction time, and decision-making ability. Conventional detection methods based on Convolutional Neural Networks (CNNs) mainly rely on eye-closure analysis and often suffer from limited generalization and overfitting under diverse real-world conditions. This paper proposes a hybrid deep learning framework combining VGG19 and Vision Transformer (ViT) for robust multi-feature drowsiness detection. The VGG19 network extracts detailed local spatial features such as eyelid and mouth movements, while the ViT captures global contextual relationships using self-attention mechanisms. The integration of these complementary representations enables simultaneous analysis of eye closure and yawning, improving detection reliability. The proposed model enhances robustness against illumination variation, facial orientation changes, and environmental noise. Experimental evaluation demonstrates improved accuracy, recall, and generalization compared with conventional CNN approaches. The system operates in real time and provides timely alerts, offering a scalable and practical solution for intelligent driver monitoring systems and improved road safety.
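
The two-branch idea in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the abstract does not say how the VGG19 and ViT outputs are combined, so concatenation is an assumption here, as are mean pooling of the attended tokens, the identity query/key/value projections, and every name and number in the code. The sketch only shows how self-attention lets each image patch weigh every other patch, and how a local feature vector and a global one might be joined before classification.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    A real ViT learns separate projection matrices and uses several heads;
    both are omitted here to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tokens:
        # Each token scores every token, which is the "global" context.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        attended.append([
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ])
    return attended


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector (assumed pooling)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fuse(local_features, global_features):
    """Concatenate the two branch outputs (assumed fusion strategy)."""
    return list(local_features) + list(global_features)


# Hypothetical inputs: a 3-value vector standing in for pooled VGG19 features,
# and four 2-value patch embeddings standing in for ViT tokens.
cnn_features = [0.2, 0.7, 0.1]
patch_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

vit_features = mean_pool(self_attention(patch_tokens))
fused = fuse(cnn_features, vit_features)
print(len(fused))
```

In a full model the fused vector would feed a classifier head that scores states such as eye closure and yawning. That head is left out because the abstract gives no details about it.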

Copyright & License

Copyright © 2026. The authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{201032,
        author = {J. Veerendeswari and B. Celin Julie and Kishore Kumar K and Praveen R and Dhivesh V},
        title = {Driver Drowsiness Detection using Hybrid VGG19 and Vision Transformer},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        pages = {192--202},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=201032},
        abstract = {Driver drowsiness is a major contributor to road accidents, reducing attention, reaction time, and decision-making ability. Conventional detection methods based on Convolutional Neural Networks (CNNs) mainly rely on eye-closure analysis and often suffer from limited generalization and overfitting under diverse real-world conditions. This paper proposes a hybrid deep learning framework combining VGG19 and Vision Transformer (ViT) for robust multi-feature drowsiness detection. The VGG19 network extracts detailed local spatial features such as eyelid and mouth movements, while the ViT captures global contextual relationships using self-attention mechanisms. The integration of these complementary representations enables simultaneous analysis of eye closure and yawning, improving detection reliability. The proposed model enhances robustness against illumination variation, facial orientation changes, and environmental noise. Experimental evaluation demonstrates improved accuracy, recall, and generalization compared with conventional CNN approaches. The system operates in real time and provides timely alerts, offering a scalable and practical solution for intelligent driver monitoring systems and improved road safety.},
        keywords = {Driver drowsiness detection, Vision Transformer, VGG19, fatigue monitoring},
        month = {May},
        }

Cite This Article

Veerendeswari, J., Julie, B. C., K, K. K., R, P., & V, D. (2026). Driver Drowsiness Detection using Hybrid VGG19 and Vision Transformer. International Journal of Innovative Research in Technology (IJIRT), 12, 192–202.
