Fake Job Detection System Using Machine Learning and NLP

  • Unique Paper ID: 206170
  • Volume: 13
  • Issue: 2
  • PageNo: 359-363
  • Abstract:
  • Online job portals have made job searching easier, but they also give fraudsters a channel for spreading fake job listings, and thousands of job seekers are deceived every year. In this paper, we propose a Fake Job Detection System, built with Machine Learning (ML) and Natural Language Processing (NLP) techniques, that categorizes job postings as FAKE, SUSPICIOUS, or GENUINE. For training and evaluation, we use the Kaggle Fake Job Postings dataset, which contains 17,880 records, only 4.84% of which are fake. The job title, company profile, description, and requirements are combined, tokenized, and pre-processed, then passed through a TF-IDF vectorizer. We experimented with three models: Logistic Regression, Random Forest with equal weights, and Random Forest with SMOTE. The best-performing model achieved 99% accuracy, 86% precision, 84% recall, an F1-score of 0.85, and a ROC-AUC of 0.984. The decision threshold is set at 0.32. The system is deployed as a Streamlit web application with confidence meters, keyword highlighting, PDF report generation, and email notifications.
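
The figures quoted in the abstract can be sanity-checked with a few lines of Python. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, and the `>=` comparison against the 0.32 threshold is an assumption. The abstract does not say how the threshold relates to the three labels (FAKE, SUSPICIOUS, GENUINE), so only a binary "flagged" decision is shown.

```python
# Illustrative sketch only; all names are hypothetical and not from the paper.

FAKE_THRESHOLD = 0.32   # decision threshold reported in the abstract
FAKE_FRACTION = 0.0484  # share of fake records reported in the abstract


def is_flagged(fake_probability: float, threshold: float = FAKE_THRESHOLD) -> bool:
    """Flag a posting when its predicted fake probability reaches the threshold.

    Assumption: the comparison is inclusive (>=); the abstract does not specify.
    """
    return fake_probability >= threshold


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def majority_baseline_accuracy(fake_fraction: float) -> float:
    """Accuracy of a trivial model that labels every posting as genuine."""
    return 1.0 - fake_fraction


if __name__ == "__main__":
    # A posting scored at 0.40 is flagged under a 0.32 threshold,
    # although it would pass under the conventional 0.5 cutoff.
    print(is_flagged(0.40))                      # True
    print(is_flagged(0.40, threshold=0.5))       # False

    # Reported precision and recall reproduce the reported F1-score.
    print(round(f1_score(0.86, 0.84), 2))        # 0.85

    # With 4.84% fake records, predicting "genuine" everywhere already
    # scores about 95% accuracy, so precision and recall are more informative.
    print(round(majority_baseline_accuracy(FAKE_FRACTION), 4))  # 0.9516
```

The baseline figure shows why accuracy alone says little on a dataset this imbalanced, and why a threshold below 0.5 can be a reasonable way to trade some precision for recall on the rare fake class.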

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{206170,
        author = {DALSTON JOJU},
        title = {Fake Job Detection System Using Machine Learning and NLP},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {13},
        number = {2},
        pages = {359-363},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=206170},
        abstract = {Online job portals have made job searching easier, but they also give fraudsters a channel for spreading fake job listings, and thousands of job seekers are deceived every year. In this paper, we propose a Fake Job Detection System, built with Machine Learning (ML) and Natural Language Processing (NLP) techniques, that categorizes job postings as FAKE, SUSPICIOUS, or GENUINE. For training and evaluation, we use the Kaggle Fake Job Postings dataset, which contains 17,880 records, only 4.84% of which are fake. The job title, company profile, description, and requirements are combined, tokenized, and pre-processed, then passed through a TF-IDF vectorizer. We experimented with three models: Logistic Regression, Random Forest with equal weights, and Random Forest with SMOTE. The best-performing model achieved 99% accuracy, 86% precision, 84% recall, an F1-score of 0.85, and a ROC-AUC of 0.984. The decision threshold is set at 0.32. The system is deployed as a Streamlit web application with confidence meters, keyword highlighting, PDF report generation, and email notifications.},
        keywords = {employment fraud, fake job detection, machine learning, natural language processing, Random Forest, SMOTE, Streamlit, text classification, TF-IDF, threshold tuning},
        month = {July},
        }

Cite This Article

JOJU, D. (2026). Fake Job Detection System Using Machine Learning and NLP. International Journal of Innovative Research in Technology (IJIRT), 13(2), 359–363.
