Prediction of ambient air pollution with regression approach using machine learning

Research Article

DOI: 10.1080/20421338.2025.2577980
Author(s): Bachandeep Singh Bhathal (University Institute of Computing, Chandigarh University, India), Gaurav Gupta (Punjabi University, India), Brahmaleen Kaur Sidhu (Punjabi University, India)

Abstract

Accurate air quality prediction is vital for managing pollution and protecting public health. This study evaluates the performance of three machine learning models – Decision Tree Regression (DTR), Linear Regression (LR), and Random Forest Regression (RFR) – in forecasting the Air Quality Index (AQI) from pollutants such as SO2, CO, O3, NO2, PM2.5, and PM10. Using data from the Central Pollution Control Board (CPCB), the research identifies RFR as the most reliable model. RFR achieved the highest accuracy scores: 0.872 (SO2), 0.82 (O3), 0.71 (NO2), 0.91 (PM2.5), and 0.82 (PM10), outperforming DTR and LR. RFR's ensemble learning approach effectively captures complex patterns and reduces overfitting. The study highlights that RFR-based forecasting can support real-time pollution monitoring in Punjab. Integrating such models into environmental policies can lead to better-informed and more timely decisions. This research offers a comparative framework for the three models and is a step toward developing AI-based air pollution warning systems for regional applications.
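
The three-model comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic data in place of the CPCB measurements, and scores models with R² on a held-out split, which is an assumption because the abstract does not name its accuracy metric. The function names, the feature order, and the formula for the synthetic target are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumed feature order; the real CPCB column layout is not given in the abstract.
POLLUTANTS = ["SO2", "CO", "O3", "NO2", "PM2.5", "PM10"]


def make_synthetic_data(n_samples=500, seed=0):
    """Generate placeholder pollutant readings and an AQI-like target.

    The target formula is invented for illustration only; it is not how
    AQI is actually computed.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 100.0, size=(n_samples, len(POLLUTANTS)))
    y = (
        0.5 * X[:, 4]                  # PM2.5 contribution (hypothetical weight)
        + 0.3 * X[:, 5]                # PM10 contribution (hypothetical weight)
        + 0.002 * X[:, 0] * X[:, 3]    # nonlinear SO2 x NO2 interaction
        + rng.normal(0.0, 5.0, n_samples)  # measurement noise
    )
    return X, y


def compare_models(X, y, seed=0):
    """Fit DTR, LR, and RFR on a train split and return test-set R² per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "DTR": DecisionTreeRegressor(random_state=seed),
        "LR": LinearRegression(),
        "RFR": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, score in compare_models(X, y).items():
        print(f"{name}: R^2 = {score:.3f}")
```

Which model ranks highest on this synthetic target says nothing about the CPCB results reported above; the sketch only shows the shape of such a comparison. Swapping in real station data would mean replacing make_synthetic_data with a loader for the measured pollutant columns and the recorded AQI.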

Journal: African Journal of Science, Technology, Innovation and Development