A Hybrid Machine Learning Framework Using Random Forest and XGBoost for Software Bug Prediction

  • Unique Paper ID: 175775
  • Pages: 4664-4667
  • Abstract: Code smells, which indicate poor design or implementation choices, can harm software maintainability and increase bug-proneness. This study explores the significance of code smell metrics in prediction models for detecting bug-prone code modules. By incorporating smell-based metrics, we aim to enhance bug prediction accuracy. Using 14 open-source Java projects from the PROMISE repository, we trained models and evaluated them using F1-score, accuracy, precision, and recall. The classifiers applied were Naïve Bayes, Random Forest (RF), Support Vector Machine (SVM), Logistic Regression, and k-Nearest Neighbor. RF and SVM outperformed the other methods, delivering higher accuracy both within versions and across projects, which indicates their effectiveness in predicting buggy components.

Copyright & License

Copyright © 2026. Authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{175775,
        author = {P. C. SANDHYA and I. NASRULLA},
        title = {A Hybrid Machine Learning Framework Using Random Forest and XGBoost for Software Bug Prediction},
        journal = {International Journal of Innovative Research in Technology},
        year = {2025},
        volume = {11},
        number = {11},
        pages = {4664--4667},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=175775},
        abstract = {Code smells, which indicate poor design or implementation choices, can harm software maintainability and increase bug-proneness. This study explores the significance of code smell metrics in prediction models for detecting bug-prone code modules. By incorporating smell-based metrics, we aim to enhance bug prediction accuracy. Using 14 open-source Java projects from the PROMISE repository, we trained models and evaluated them using F1-score, accuracy, precision, and recall. The classifiers applied were Naïve Bayes, Random Forest (RF), Support Vector Machine (SVM), Logistic Regression, and k-Nearest Neighbor. RF and SVM outperformed the other methods, delivering higher accuracy both within versions and across projects, which indicates their effectiveness in predicting buggy components.},
        keywords = {Code smell, source code, smell-aware, bugs classification},
        month = {April},
        }

Cite This Article

SANDHYA, P. C., & NASRULLA, I. (2025). A Hybrid Machine Learning Framework Using Random Forest and XGBoost for Software Bug Prediction. International Journal of Innovative Research in Technology (IJIRT), 11(11), 4664–4667.
