Comparative Analysis of machine learning model twitter spam detection

  • Unique Paper ID: 167958
  • Volume: 11
  • Issue: 4
  • PageNo: 1459-1463
  • Abstract: The growing use of social media platforms such as Twitter for sharing content, news, and opinions makes it harder to maintain the integrity of online interactions. This paper presents a comparative analysis of six machine learning classifiers (Logistic Regression, Naive Bayes, K-Nearest Neighbors (KNN), Random Forest, AdaBoost, and XGBoost) for detecting spam tweets. We collected a balanced dataset and labelled it manually to ensure input quality. The results indicate that the ensemble models, particularly AdaBoost and XGBoost, outperformed the traditional classifiers, with accuracies of 98% to 100%, while Logistic Regression also achieved a near-perfect accuracy of 99.70% after feature scaling. In contrast, Naive Bayes and KNN performed worse, with accuracies ranging from 84% to 87%. The study highlights the role of data preprocessing, feature scaling, and ensemble techniques in improving spam detection models, and offers guidance for developing more robust machine learning solutions for real-time social media applications such as Twitter.
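
The abstract credits feature scaling with part of the accuracy gain. The following is a minimal sketch of why scaling can matter for a distance-based classifier such as KNN. It is not the paper's pipeline: the two features (follower count, URLs per tweet), the eight training rows, and the four test rows are invented purely for illustration, and the helper names `standardize`, `knn_predict`, and `accuracy` are hypothetical.

```python
import math
from collections import Counter


def standardize(train, rows):
    """Scale each column to zero mean and unit variance using training statistics."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Guard against a zero standard deviation for constant columns.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def knn_predict(train_X, train_y, x, k=1):
    """Return the majority label among the k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=1):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Invented toy data: (follower_count, urls_per_tweet). Not from the paper.
train_X = [[100, 0], [5000, 1], [900, 0], [3000, 1],   # ham
           [200, 3], [4800, 4], [1000, 3], [2900, 4]]  # spam
train_y = ["ham"] * 4 + ["spam"] * 4
test_X = [[190, 0], [4990, 4], [910, 3], [2990, 0]]
test_y = ["ham", "spam", "spam", "ham"]

# Without scaling, follower_count dominates every distance computation.
raw_acc = accuracy(train_X, train_y, test_X, test_y, k=1)

# With scaling, both features contribute on a comparable scale.
scaled_acc = accuracy(standardize(train_X, train_X), train_y,
                      standardize(train_X, test_X), test_y, k=1)

print(f"raw accuracy: {raw_acc:.2f}, scaled accuracy: {scaled_acc:.2f}")
```

On this toy data the unscaled run scores 0.25 and the scaled run scores 1.00, because the large-valued follower count no longer swamps the URL count. These numbers illustrate the mechanism only and say nothing about the accuracies reported in the paper.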

  • ISSN: 2349-6002