Spam Email Detection in Machine Learning

  • Unique Paper ID: 178237
  • Volume: 11
  • Issue: 12
  • PageNo: 5034-5039
  • Abstract:
  • Email is one of the most widely used digital communication methods, allowing users to exchange messages, documents, and multimedia over the internet. Email spam remains a major problem, affecting user safety and productivity. Conventional spam detection systems rely heavily on supervised learning models, which require large labelled data sets that are not always scalable or accessible. The proposed hybrid model first uses TF-IDF to extract important text features from emails and then applies XGBoost to classify them as spam or ham with high accuracy. In the second phase, a semi-supervised self-training algorithm selects high-confidence predictions, exploiting unlabelled data by pseudo-labelling and retraining, which improves generalization. Additionally, we employ a graph-based learning approach in which emails are represented as nodes and connections are formed on the basis of similarity or sender metadata. Graph classifiers such as label propagation increase detection accuracy by exploiting structural relationships within the data. Experimental results show that the proposed approach achieves more than 95% classification accuracy, reduces dependence on labelled data by up to 30%, and improves robustness against sophisticated spam strategies. These results confirm that our system provides a reliable and scalable solution for real-world email spam detection.
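
The three stages named in the abstract (TF-IDF features, self-training on confident predictions, and label spreading over a similarity graph) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the abstract gives no hyperparameters or graph construction details, so a nearest-centroid classifier stands in for XGBoost, a hard-vote neighbour rule stands in for the graph classifier, and the sample emails, function names, and thresholds (`threshold`, `min_sim`) are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one L2-normalised TF-IDF dict (term -> weight) per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    doc_freq = Counter(t for tokens in tokenised for t in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Smoothed IDF; the exact weighting used in the paper is not stated.
        vec = {t: (c / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine(a, b):
    """Dot product of two sparse vectors (cosine if both have unit length)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def centroid(vectors):
    """Unit-length mean direction of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}

def self_train(vectors, labels, threshold=0.5, max_rounds=5):
    """Pseudo-label unlabelled items whose class margin is at least `threshold`.

    Stand-in classifier: nearest centroid (the paper uses XGBoost).
    `labels` holds "spam", "ham", or None for unlabelled items.
    """
    labels = list(labels)
    for _ in range(max_rounds):
        cents = {c: centroid([v for v, l in zip(vectors, labels) if l == c])
                 for c in ("spam", "ham")}
        added = False
        for i, v in enumerate(vectors):
            if labels[i] is not None:
                continue
            spam_score, ham_score = cosine(v, cents["spam"]), cosine(v, cents["ham"])
            if abs(spam_score - ham_score) >= threshold:
                labels[i] = "spam" if spam_score > ham_score else "ham"
                added = True
        if not added:  # nothing confident left: stop retraining
            break
    return labels

def propagate_labels(vectors, labels, min_sim=0.1, max_rounds=10):
    """Spread labels along edges of a similarity graph (hard-vote simplification)."""
    labels = list(labels)
    for _ in range(max_rounds):
        updates = {}
        for i, v in enumerate(vectors):
            if labels[i] is not None:
                continue
            votes = Counter()
            for j, u in enumerate(vectors):
                sim = cosine(v, u)
                # An edge exists only between sufficiently similar emails.
                if j != i and labels[j] is not None and sim >= min_sim:
                    votes[labels[j]] += sim
            if votes:
                updates[i] = votes.most_common(1)[0][0]
        if not updates:
            break
        for i, lab in updates.items():
            labels[i] = lab
    return labels

# Hypothetical toy data: four labelled emails and three unlabelled ones.
emails = [
    "win free prize money now",
    "free money claim your prize",
    "meeting agenda for monday project",
    "project report attached for review",
    "claim free prize now",
    "monday meeting project review",
    "agenda attached",
]
seed_labels = ["spam", "spam", "ham", "ham", None, None, None]

vectors = tfidf_vectors(emails)
after_self_training = self_train(vectors, seed_labels, threshold=0.5)
final_labels = propagate_labels(vectors, after_self_training)
print(after_self_training)
print(final_labels)
```

In this toy run the short email "agenda attached" is too ambiguous for the self-training step to label confidently, so it is left for the graph step, which assigns it the label of its most similar labelled neighbours. With scikit-learn and XGBoost installed, the same structure could be expressed with a TF-IDF vectorizer, a gradient-boosted classifier, and a library label propagation model instead of these hand-written stand-ins.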

Cite This Article

  • ISSN: 2349-6002
