Integrated Air Quality Analytics: Forecasting and Pollution Source Identification Using Machine Learning and Deep Learning Approaches

  • Unique Paper ID: 190057
  • Volume: 12
  • Issue: 8
  • PageNo: 2804-2809
  • Abstract: Air pollution has evolved from a seasonal environmental challenge to a perennial public health crisis, particularly in rapidly urbanizing megacities. While traditional chemical transport models offer physical insights, they often lack the computational agility required for real-time, hyper-local forecasting. This research proposes a robust, hybrid framework for Air Quality Index (AQI) forecasting and pollutant source characterization. We introduce an Attention-based Bi-Directional Long Short-Term Memory (Attention-Bi-LSTM) network, designed to capture long-term temporal dependencies and weigh critical historical pollution events. Utilizing a comprehensive dataset spanning nine years (2017-2025) from New Delhi (CPCB) and comparative insights from the USA (EPA), the proposed model demonstrates significant empirical validity. On the test set (2024-2025), the model achieved a Root Mean Square Error (RMSE) of 44.59 and an R² score of 0.82, effectively predicting extreme pollution spikes associated with winter inversion and anthropogenic activities. Furthermore, this study moves beyond mere prediction to source attribution. Analysis of prominent pollutants reveals that while Particulate Matter (PM10/PM2.5) remains the primary driver of toxicity, secondary pollutants like Ozone (O3) and Nitrogen Dioxide (NO2) have emerged as significant contributors, accounting for over 25% of hazardous days. These findings provide policymakers with explainable, data-driven insights to transition from reactive measures to proactive air quality management.
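
The two test-set metrics quoted in the abstract, RMSE and R², follow standard definitions. The sketch below is a minimal illustration of those definitions only; it is not the authors' evaluation code. The function names `rmse` and `r2_score` and the sample AQI values are hypothetical and do not come from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical AQI values for illustration only (not data from the paper).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(rmse(observed, predicted))
print(r2_score(observed, predicted))
```

RMSE is expressed in the same units as the AQI itself, so it is easiest to interpret relative to the range of AQI values in the test set, whereas R² is unitless and measures the share of variance in the observations that the predictions account for.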

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{190057,
        author = {Nikesh Yadav and Sandhya Kaprawan},
        title = {Integrated Air Quality Analytics: Forecasting and Pollution Source Identification Using Machine Learning and Deep Learning Approaches},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {8},
        pages = {2804--2809},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=190057},
        abstract = {Air pollution has evolved from a seasonal environmental challenge to a perennial public health crisis, particularly in rapidly urbanizing megacities. While traditional chemical transport models offer physical insights, they often lack the computational agility required for real-time, hyper-local forecasting. This research proposes a robust, hybrid framework for Air Quality Index (AQI) forecasting and pollutant source characterization. We introduce an Attention-based Bi-Directional Long Short-Term Memory (Attention-Bi-LSTM) network, designed to capture long-term temporal dependencies and weigh critical historical pollution events. Utilizing a comprehensive dataset spanning nine years (2017-2025) from New Delhi (CPCB) and comparative insights from the USA (EPA), the proposed model demonstrates significant empirical validity. On the test set (2024-2025), the model achieved a Root Mean Square Error (RMSE) of 44.59 and an R$^2$ score of 0.82, effectively predicting extreme pollution spikes associated with winter inversion and anthropogenic activities. Furthermore, this study moves beyond mere prediction to source attribution. Analysis of prominent pollutants reveals that while Particulate Matter (PM10/PM2.5) remains the primary driver of toxicity, secondary pollutants like Ozone (O3) and Nitrogen Dioxide (NO2) have emerged as significant contributors, accounting for over 25\% of hazardous days. These findings provide policymakers with explainable, data-driven insights to transition from reactive measures to proactive air quality management.},
        keywords = {Air Quality Forecasting, Attention-Bi-LSTM, Deep Learning, Source Apportionment, Explainable AI (XAI), Urban Health Policy},
        month = {January},
        }

Cite This Article

Yadav, N., & Kaprawan, S. (2026). Integrated Air Quality Analytics: Forecasting and Pollution Source Identification Using Machine Learning and Deep Learning Approaches. International Journal of Innovative Research in Technology (IJIRT), 12(8), 2804–2809.
