Automated Root Cause Analysis in Distributed Microservices Systems Using Hybrid AI Techniques

  • Unique Paper ID: 195542
  • PageNo: 746-754
  • Abstract:
  • Modern distributed systems and cloud-native microservices architectures generate massive volumes of log data and telemetry metrics daily, making manual Root Cause Analysis (RCA) nearly impossible for operations teams. Traditional rule-based methods and threshold-driven alerting systems fail to capture the dynamic and temporal dependencies inherent in such architectures, resulting in prolonged downtimes and high Mean Time To Repair (MTTR). This paper proposes an Intelligent Root Cause Analysis System that leverages a hybrid Artificial Intelligence pipeline to fully automate the diagnostics lifecycle. The system integrates Unsupervised Machine Learning, specifically Isolation Forest, for real-time anomaly detection; Deep Learning using Long Short-Term Memory (LSTM) networks for proactive failure prediction from time-series data; and Graph Neural Networks, particularly Graph Convolutional Networks (GCN), to model system topology and accurately identify the faulty component responsible for cascading failures. To address the critical 'black-box' limitation of deep AI models, the system further incorporates SHAP (SHapley Additive exPlanations), providing human-readable, interpretable evidence for every automated diagnosis. The proposed solution achieves 94% accuracy in anomaly detection and significantly reduces MTTR, transitioning system maintenance from reactive firefighting to predictive and explainable reliability engineering. Experimental results demonstrate that the hybrid approach substantially outperforms traditional monitoring tools in both speed and accuracy of fault localization.
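
The anomaly detection stage named in the abstract can be illustrated with a minimal, standard-library-only sketch of the Isolation Forest idea: points that random splits isolate after only a few steps receive a higher anomaly score. This is not the authors' implementation. The two telemetry features (CPU percent, latency in ms), the synthetic data, and all function names and parameters below are assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # feature is constant in this subset, so no split is possible
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, plus a correction for unsplit points."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample (needs >= 2 points)."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    forest = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
              for _ in range(n_trees)]
    return forest, sample_size


def anomaly_score(forest, sample_size, x):
    """Score in (0, 1); values closer to 1 mean x was isolated unusually quickly."""
    mean_depth = sum(path_length(tree, x) for tree in forest) / len(forest)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Hypothetical telemetry: (cpu_percent, latency_ms) samples from a healthy service.
data_rng = random.Random(42)
telemetry = [(data_rng.gauss(50, 5), data_rng.gauss(120, 15)) for _ in range(300)]

forest, sample_size = fit_forest(telemetry)
normal_score = anomaly_score(forest, sample_size, (50.0, 120.0))
spike_score = anomaly_score(forest, sample_size, (98.0, 900.0))
print(f"typical sample: {normal_score:.3f}, latency spike: {spike_score:.3f}")
```

In this sketch the latency spike scores higher than the typical sample because it is separated from the bulk of the data after fewer random splits. A production pipeline would more likely rely on an established library implementation and would pass flagged points on to the prediction and localization stages the abstract describes.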

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{195542,
        author = {Umamaheswararao Mogili},
        title = {Automated Root Cause Analysis in Distributed Microservices Systems Using Hybrid AI Techniques},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {11},
        pages = {746--754},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=195542},
        abstract = {Modern distributed systems and cloud-native microservices architectures generate massive volumes of log data and telemetry metrics daily, making manual Root Cause Analysis (RCA) nearly impossible for operations teams. Traditional rule-based methods and threshold-driven alerting systems fail to capture the dynamic and temporal dependencies inherent in such architectures, resulting in prolonged downtimes and high Mean Time To Repair (MTTR). This paper proposes an Intelligent Root Cause Analysis System that leverages a hybrid Artificial Intelligence pipeline to fully automate the diagnostics lifecycle. The system integrates Unsupervised Machine Learning, specifically Isolation Forest, for real-time anomaly detection; Deep Learning using Long Short-Term Memory (LSTM) networks for proactive failure prediction from time-series data; and Graph Neural Networks, particularly Graph Convolutional Networks (GCN), to model system topology and accurately identify the faulty component responsible for cascading failures. To address the critical 'black-box' limitation of deep AI models, the system further incorporates SHAP (SHapley Additive exPlanations), providing human-readable, interpretable evidence for every automated diagnosis. The proposed solution achieves 94% accuracy in anomaly detection and significantly reduces MTTR, transitioning system maintenance from reactive firefighting to predictive and explainable reliability engineering. Experimental results demonstrate that the hybrid approach substantially outperforms traditional monitoring tools in both speed and accuracy of fault localization.},
        keywords = {Intelligent Root Cause Analysis, Isolation Forest, LSTM Networks, Graph Convolutional Networks (GCN), SHAP Explainability},
        month = {April},
        }

Cite This Article

Mogili, U. (2026). Automated Root Cause Analysis in Distributed Microservices Systems Using Hybrid AI Techniques. International Journal of Innovative Research in Technology (IJIRT). https://doi.org/10.64643/IJIRTV12I11-195542-459
