eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Previously Published Works

A novel ensemble-based statistical approach to estimate daily wildfire-specific PM2.5 in California (2006-2020).

Abstract

Though fine particulate matter (PM2.5) concentrations have decreased in the United States (U.S.) over the past two decades, the increasing frequency, duration, and severity of wildfires significantly (though episodically) impair air quality in wildfire-prone regions and beyond. Increasing PM2.5 concentrations derived from wildfire smoke, and their associated impacts on public health, require dedicated epidemiological studies. The main sources of PM2.5 data are government-operated monitors sparsely located across the U.S., leaving several regions and potentially vulnerable populations unmonitored. Current approaches to estimating PM2.5 concentrations in unmonitored areas often rely on big data, such as satellite-derived aerosol properties and meteorological variables, apply computationally intensive deterministic modeling, and do not distinguish wildfire-specific PM2.5 from PM2.5 from other emission sources such as traffic and industry. Furthermore, modeling wildfire-specific PM2.5 is challenging because measurements of the smoke contribution to PM2.5 pollution are not available. Here, we use statistical methods to isolate wildfire-specific PM2.5 from other sources of emissions. Our study presents an ensemble model that optimally combines multiple machine learning algorithms (including gradient boosting machine, random forest, and deep learning) and a large set of explanatory variables to first estimate daily PM2.5 concentrations at the ZIP code level, a relevant spatiotemporal resolution for epidemiological studies. Subsequently, we propose a novel implementation of an imputation approach to estimate wildfire-specific PM2.5 concentrations, which could be applied to other geographical regions in the U.S. or worldwide. Our ensemble model achieved results comparable to previous machine learning studies for PM2.5 prediction while avoiding the processing of larger, computationally intensive datasets. Our study is the first to apply a suite of statistical models using readily available datasets to provide daily wildfire-specific PM2.5 at a fine spatial scale for a 15-year period, thus providing a relevant spatiotemporal resolution and a timely contribution for epidemiological studies.
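
The abstract does not spell out how the ensemble weights are chosen or how the imputation step works, so the following is only an illustrative sketch of the two-step idea, not the authors' implementation. It assumes (hypothetically) that the base learners are blended with convex weights chosen on held-out data, and that a smoke-day flag is available so that non-smoke PM2.5 on smoke days can be imputed from non-smoke days. The function names, the grid search, and the mean-based imputation are all assumptions made for illustration.

```python
from itertools import product
from statistics import mean


def fit_ensemble_weights(base_preds, observed, step=0.1):
    """Grid-search convex weights (non-negative, summing to 1) that minimize
    the mean squared error of the weighted average of base-learner predictions.
    Assumption: a simple stand-in for whatever combiner the paper uses."""
    ticks = round(1 / step)
    best_w, best_mse = None, float("inf")
    for combo in product(range(ticks + 1), repeat=len(base_preds)):
        if sum(combo) != ticks:  # keep only weight vectors that sum to 1
            continue
        w = [c / ticks for c in combo]
        blended = [
            sum(wi * preds[i] for wi, preds in zip(w, base_preds))
            for i in range(len(observed))
        ]
        mse = mean((b - o) ** 2 for b, o in zip(blended, observed))
        if mse < best_mse:
            best_w, best_mse = w, mse
    return best_w


def wildfire_specific(total_pm25, smoke_day):
    """Impute non-smoke PM2.5 on smoke days as the mean of non-smoke days
    (a deliberately crude, hypothetical imputation), then take the
    non-negative difference as the wildfire-specific part."""
    background = mean(v for v, s in zip(total_pm25, smoke_day) if not s)
    return [
        max(0.0, v - background) if s else 0.0
        for v, s in zip(total_pm25, smoke_day)
    ]


# Toy data: three base learners' predictions at three held-out monitor-days.
weights = fit_ensemble_weights(
    [[10, 20, 30], [12, 18, 33], [0, 0, 0]], observed=[10, 20, 30]
)
# Toy daily PM2.5 series for one ZIP code with one flagged smoke day.
smoke_pm25 = wildfire_specific(
    [8.0, 10.0, 35.0, 12.0], [False, False, True, False]
)
```

In practice the imputation would draw on spatial and temporal neighbors rather than a single mean, and the smoke-day flag would have to come from an external data source; both are left out here to keep the sketch short.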

