Deepfake detection using XAI vision transformer

  • Unique Paper ID: 194465
  • PageNo: 5546-5550
  • Abstract:
  • Deepfakes are rapidly growing in sophistication and are becoming much harder for the average internet user to spot, especially on popular social media platforms. As users lose trust in the content presented online, the need for an effective, reliable countermeasure to deepfake technology grows more urgent. Defenders need technology that can identify the subtle artifacts left by various face-swapping methods in images that are compressed, altered, and captured under a wide range of conditions. This project uses Vision Transformers: a detected face image is divided into patches, and the model's self-attention can highlight a region of the face that appears inconsistent with the others. The pipeline is straightforward: detect a face, split it into patches, pass them through the transformer, and receive a real-or-fake classification. The model achieved strong performance on the FaceForensics++ dataset, and with training modifications aimed at real-world conditions it trained stably, with a smooth, steadily decreasing loss. In short, a basic Vision Transformer is a solid starting point for a deepfake detection model on the common benchmarks.
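The pipeline the abstract describes, dividing a face crop into patches and letting self-attention weigh each patch against the others, can be sketched in a few lines. This is a minimal illustrative sketch in NumPy, not the authors' implementation: the patch size, random weights, and the mean-pooled "score head" are all assumptions made for the example.

```python
import numpy as np

def patchify(img, patch=4):
    # Split an HxW grayscale face crop into non-overlapping patch x patch
    # tiles, each flattened into a vector -- the "components" of the face.
    h, w = img.shape
    return (img.reshape(h // patch, patch, w // patch, patch)
               .transpose(0, 2, 1, 3)
               .reshape(-1, patch * patch))

def self_attention(x, rng):
    # Single-head self-attention: every patch attends to every other patch,
    # so a region inconsistent with the rest of the face (e.g. a blended
    # jaw line) can stand out in the attention weights.
    d = x.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)      # softmax over patches
    return weights @ v, weights

rng = np.random.default_rng(0)
face = rng.random((16, 16))            # stand-in for a detected face crop
patches = patchify(face)               # 16 patches of 16 pixel values each
attended, attn = self_attention(patches, rng)
score = attended.mean()                # toy real-vs-fake score head
print(patches.shape, attn.shape)
```

A real detector would replace the random projections with trained weights, stack several attention layers with MLP blocks, and classify from a learned token, but the patch-then-attend structure is the same.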

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{194465,
        author = {Saloni and Shweta Garg and Archna Jain and Deepak Gupta},
        title = {Deepfake detection using XAI vision transformer},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {10},
        pages = {5546-5550},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=194465},
        abstract = {Deepfakes are rapidly growing in sophistication and are becoming much harder for the average internet user to spot, especially on popular social media platforms. As users lose trust in the content presented online, the need for an effective, reliable countermeasure to deepfake technology grows more urgent. Defenders need technology that can identify the subtle artifacts left by various face-swapping methods in images that are compressed, altered, and captured under a wide range of conditions. This project uses Vision Transformers: a detected face image is divided into patches, and the model's self-attention can highlight a region of the face that appears inconsistent with the others. The pipeline is straightforward: detect a face, split it into patches, pass them through the transformer, and receive a real-or-fake classification. The model achieved strong performance on the FaceForensics++ dataset, and with training modifications aimed at real-world conditions it trained stably, with a smooth, steadily decreasing loss. In short, a basic Vision Transformer is a solid starting point for a deepfake detection model on the common benchmarks.},
        keywords = {FaceForensics++, Image forensics, Self-attention, Vision Transformers, Deepfake detection},
        month = {March},
        }

Cite This Article

Saloni, Garg, S., Jain, A., & Gupta, D. (2026). Deepfake detection using XAI vision transformer. International Journal of Innovative Research in Technology (IJIRT). https://doi.org/10.64643/IJIRTV12I10-194465-459
