PM2.5 Estimation in the Czech Republic using Extremely Randomized Trees: A Comprehensive Data Analysis


  • Saleem Ibrahim Department of Geomatics, Faculty of Civil Engineering, Czech Technical University in Prague
  • Martin Landa Department of Geomatics, Faculty of Civil Engineering, Czech Technical University in Prague
  • Eva Matoušková Department of Geomatics, Faculty of Civil Engineering, Czech Technical University in Prague
  • Lukáš Brodský
  • Lena Halounová



Air quality, artificial intelligence, spatial autocorrelation, PM2.5, Czech Republic


The accuracy of artificial intelligence techniques in estimating air quality depends on many influencing factors. Unlike our previous study, which examined PM2.5 over the whole of Europe using spatially and temporally unbalanced data, this study focuses on estimating PM2.5 specifically over the Czech Republic, using a more balanced dataset to train and evaluate the model. In addition, the spatial autocorrelation between the ground-based stations was taken into account when building the model. The feature importance obtained while developing the Extra Trees model revealed that spatial autocorrelation was more significant than commonly used inputs such as elevation and NDVI. The R² of the 10-fold cross-validation for the new model was 16% higher than that of the previous one, and R² reached 0.85 when predicting unseen data at new locations. The developed spatiotemporal model was used to generate comprehensive daily maps covering the entire study area for 2018–2020. The temporal analysis showed that PM2.5 levels exceeded the recommended limit of 20 µg/m³ in many regions during 2018. The eastern part of the country suffered from the highest concentrations, especially the Zlín and Moravian-Silesian Regions, where in winter 2018 the values reached risky average concentrations of 30 µg/m³ and 35 µg/m³, respectively. Air quality improved over the next two years in all regions, reaching promising levels in 2020, when almost all regions had average concentrations below 20 µg/m³. The generated dataset will be made available for future air quality studies.
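The abstract does not state how the spatial autocorrelation input was computed. As an illustration only, one common way to encode it is a spatial lag: for each station, the inverse-distance-weighted mean of PM2.5 measured at the other stations on the same day. The sketch below assumes that form; the function name `spatial_lag`, the `power` parameter, and the station coordinates and values are hypothetical and not taken from the study.

```python
import math


def spatial_lag(stations, power=2.0):
    """Inverse-distance-weighted mean of PM2.5 at all *other* stations.

    stations: list of (x_km, y_km, pm25) tuples in a projected coordinate
    system (assumed; the study's actual inputs are not given in the abstract).
    Returns one lag value per station, in input order.
    """
    lags = []
    for i, (xi, yi, _) in enumerate(stations):
        weighted_sum = 0.0
        weight_total = 0.0
        for j, (xj, yj, pm_j) in enumerate(stations):
            if i == j:
                continue  # a station never contributes to its own lag
            dist = math.hypot(xi - xj, yi - yj)
            if dist == 0.0:
                continue  # skip co-located stations to avoid division by zero
            weight = 1.0 / dist ** power
            weighted_sum += weight * pm_j
            weight_total += weight
        # NaN when a station has no usable neighbours
        lags.append(weighted_sum / weight_total if weight_total else float("nan"))
    return lags


# Hypothetical stations: (x_km, y_km, daily PM2.5 in µg/m3)
stations = [(0.0, 0.0, 30.0), (10.0, 0.0, 20.0), (0.0, 10.0, 10.0)]
lags = spatial_lag(stations)
```

A value like this could then be supplied to a tree-based regressor alongside inputs such as elevation and NDVI. When evaluating on new locations, the lag for a held-out station should be computed from training stations only, otherwise the validation score would be inflated by leakage.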



