1.
Methods Ecol Evol ; 12(11): 2117-2128, 2021 Nov.
Article in English | MEDLINE | ID: mdl-35874972

ABSTRACT

The ecological and environmental science communities have embraced machine learning (ML) for empirical modelling and prediction. However, going beyond prediction to draw insights into underlying functional relationships between response variables and environmental 'drivers' is less straightforward. Deriving ecological insights from fitted ML models requires techniques to extract the 'learning' hidden in the ML models. We revisit the theoretical background and effectiveness of four approaches for ranking independent variable importance (Gini importance, GI; permutation importance, PI; split importance, SI; and conditional permutation importance, CPI) and two approaches for inferring bivariate functional relationships (partial dependence plots, PDP; and accumulated local effect plots, ALE). We also explore the use of a surrogate model for visualization and interpretation of complex multivariate relationships between response variables and environmental drivers. We examine the challenges and opportunities for extracting ecological insights with these interpretation approaches. Specifically, we aim to improve interpretation of ML models by investigating how effectiveness relates to (a) interpretation algorithm, (b) sample size and (c) the presence of spurious explanatory variables. We base the analysis on simulations with known underlying functional relationships between response and predictor variables, with added white noise and the presence of correlated but non-influential variables. The results indicate that deriving ecological insight is strongly affected by interpretation algorithm and spurious variables, and moderately affected by sample size. Removing spurious variables improves interpretation of ML models. Increasing sample size has limited value in the presence of spurious variables, but it does improve performance once spurious variables are omitted.
Among the four ranking methods, SI is slightly more effective than the other methods in the presence of spurious variables, while GI and SI yield higher accuracy when spurious variables are removed. PDP is more effective in retrieving underlying functional relationships than ALE, but its reliability declines sharply in the presence of spurious variables. Visualization and interpretation of the interactive effects of predictors and the response variable can be enhanced using surrogate models, including three-dimensional visualizations and use of loess planes to represent independent variable effects and interactions. Machine learning analysts should be aware that including correlated independent variables in ML models with no clear causal relationship to response variables can interfere with ecological inference. When ecological inference is important, ML models should be constructed with independent variables that have clear causal effects on response variables. While interpreting ML models for ecological inference remains challenging, we show that careful choice of interpretation methods, exclusion of spurious variables and adequate sample size can provide more and better opportunities to 'learn from machine learning'.
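
As an illustration of one of the ranking methods named in this abstract, the following is a minimal sketch of permutation importance (PI) on simulated data with one influential predictor and one correlated but non-influential predictor. It is not the authors' code: the data-generating process, the coefficients, and the `model` function (a hand-written stand-in for a fitted ML model that leans partly on the spurious variable) are all assumptions made for the example.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` over rows of X against targets y."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each column is shuffled independently."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between column j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(predict, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical simulation: y depends on x0 only; x1 is correlated with x0
# but has no effect on y (a "spurious" variable), plus white noise.
data_rng = random.Random(42)
X, y = [], []
for _ in range(500):
    x0 = data_rng.gauss(0, 1)
    x1 = 0.8 * x0 + 0.6 * data_rng.gauss(0, 1)
    X.append([x0, x1])
    y.append(2.0 * x0 + data_rng.gauss(0, 0.1))


def model(row):
    """Assumed stand-in for a fitted model that partly relies on x1."""
    return 1.5 * row[0] + 0.5 * row[1]


importances = permutation_importance(model, X, y)
print(importances)
```

In this sketch the spurious variable receives a clearly non-zero importance even though it has no causal effect on the response, which is the kind of interference the abstract warns about.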

2.
Nat Clim Chang ; 11: 449-455, 2021.
Article in English | MEDLINE | ID: mdl-35136420

ABSTRACT

Africa's ecosystems have an important role in global carbon dynamics, yet consensus is lacking regarding the amount of carbon stored in woody vegetation and the potential impacts on carbon storage of changes in climate, land use, and other Anthropocene risks. Here, we explore the socio-environmental conditions that shaped the contemporary distribution of woody vegetation across sub-Saharan Africa and evaluate ecosystem response to multiple scenarios of climate change, anthropogenic pressures, and fire disturbance. Our projections suggest climate change will have a small but negative effect on aboveground woody biomass at the continental scale, and the compounding effects of population growth, increasing human pressures, and socio-climatically driven changes in fire behavior further exacerbate climate-driven trends. Relatively modest continental-scale trends obscure much larger regional perturbations, with climatic and anthropogenic factors leading to increased carbon storage potential in East Africa, offset by large deficits in West, Central, and Southern Africa.

3.
Sci Total Environ ; 703: 134615, 2020 Feb 10.
Article in English | MEDLINE | ID: mdl-31767338

ABSTRACT

The pedosphere is the largest terrestrial reservoir of organic carbon, yet soil-carbon variability and its representation in Earth system models are a large source of uncertainty for carbon-cycle science and climate projections. Much of this uncertainty is attributed to local and regional-scale variability, and predicting this variation can be challenging if variable selection is based solely on a priori assumptions, because environmental determinants are scale dependent. Data mining can optimize predictive modeling by allowing machine-learning algorithms to learn from and discover complex patterns in large datasets that might otherwise go unnoticed, thus increasing the potential for knowledge discovery. In this analysis, we identify important, regional-scale determinants for top- and subsoil-carbon stabilization in production forestland across the southeastern US. Specifically, we apply recursive feature elimination to a large suite of socio-environmental data to strategically select a parsimonious, yet highly predictive covariate set. This is achieved by recursively considering smaller and smaller sets of covariates (features): the estimator is first trained on the full set to obtain feature importance, the least important features are pruned, and the procedure is repeated until a desired number of covariates is identified. We show that although carbon ranges from 0.3 to 8.2 kg m⁻² in the topsoil (0 to 20 cm), and from 0.4 to 17.6 kg m⁻² in the subsoil (20 to 100 cm), this variability is predictably distributed with precipitation, soil moisture, nitrogen and sand content, gamma ray emissions, mean annual minimum temperature, and elevation. From our spatial predictions, we estimate that 2.6 Pg of soil carbon is currently stabilized in the upper 100 cm of production forestland, which covers 34.7 million ha in the southeastern US.
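
The recursive feature elimination loop described in this abstract (score the current covariate set, prune the least important features, repeat until the desired number remains) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the importance measure (absolute Pearson correlation with the target) is a stand-in for the importance reported by a trained estimator, and the covariate names and simulated data are hypothetical.

```python
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return abs(cov) / (vx * vy) ** 0.5


def correlation_importance(columns, y):
    """Stand-in for 'train the estimator and read off feature importance'."""
    return {name: abs_correlation(values, y) for name, values in columns.items()}


def recursive_feature_elimination(columns, y, importance_fn, n_keep, step=1):
    """Repeatedly re-score the remaining features and drop the weakest ones.

    columns: dict mapping feature name -> list of values.
    Returns the sorted names of the features that survive.
    """
    kept = dict(columns)
    while len(kept) > n_keep:
        scores = importance_fn(kept, y)  # re-evaluated on the current set
        n_drop = min(step, len(kept) - n_keep)
        for name in sorted(kept, key=scores.get)[:n_drop]:
            del kept[name]
    return sorted(kept)


# Hypothetical data: two informative covariates and two pure-noise covariates.
rng = random.Random(0)
n = 300
names = ("precipitation", "sand_content", "noise_a", "noise_b")
columns = {name: [rng.gauss(0, 1) for _ in range(n)] for name in names}
y = [
    3.0 * p - 2.0 * s + rng.gauss(0, 0.5)
    for p, s in zip(columns["precipitation"], columns["sand_content"])
]

selected = recursive_feature_elimination(columns, y, correlation_importance, n_keep=2)
print(selected)
```

Because the stand-in importance here does not change as features are removed, the loop reduces to a simple ranking; with a real estimator the importances are recomputed after each pruning step, which is the point of the recursion.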

4.
Sci Data ; 6(1): 5, 2019 Feb 26.
Article in English | MEDLINE | ID: mdl-30808877

ABSTRACT

The original version of this Data Descriptor incorrectly referenced the "United Nations (UN) Food and Agriculture Organization (FAO) soilGrids250m system". This has been corrected to "SoilGrids predictions" throughout the text in both the HTML and PDF versions.

5.
Sci Data ; 5: 180091, 2018 May 15.
Article in English | MEDLINE | ID: mdl-29762550

ABSTRACT

Hydrologic soil groups (HSGs) are a fundamental component of the USDA curve-number (CN) method for estimation of rainfall runoff; yet these data are not readily available in a format or spatial resolution suitable for regional- and global-scale modeling applications. We developed a globally consistent, gridded dataset defining HSGs from soil texture, bedrock depth, and groundwater. The resulting data product, HYSOGs250m, represents runoff potential at 250 m spatial resolution. Our analysis indicates that the global distribution of soil is dominated by moderately high runoff potential, followed by moderately low, high, and low runoff potential. Sandy soils with low runoff potential are found primarily in parts of the Sahara and Arabian Deserts. Soils with high runoff potential occur predominantly within tropical and sub-tropical regions. No clear pattern could be discerned for soils with moderately low runoff potential, as they occur in arid and humid environments and at both high and low elevations. Potential applications of these data include CN-based runoff modeling, flood risk assessment, and use as a covariate for biogeographical analysis of vegetation distributions.
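
For context on the CN-based runoff modeling mentioned in this abstract, the sketch below implements the standard curve-number runoff relation in US customary units: potential retention S = 1000/CN - 10 (inches), initial abstraction Ia = 0.2 S, and runoff Q = (P - Ia)^2 / (P - Ia + S) when P > Ia, otherwise 0. The function name and the example curve numbers are assumptions for illustration; actual curve numbers depend on both the hydrologic soil group and the land cover and are not taken from the dataset described here.

```python
def runoff_depth_inches(rainfall_in, curve_number, ia_ratio=0.2):
    """Direct runoff depth (inches) from the curve-number relation.

    rainfall_in: storm rainfall depth P in inches.
    curve_number: CN in the interval (0, 100].
    ia_ratio: initial abstraction as a fraction of S (0.2 is the usual default).
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in (0, 100]")
    retention = 1000.0 / curve_number - 10.0  # potential maximum retention S
    initial_abstraction = ia_ratio * retention  # Ia
    if rainfall_in <= initial_abstraction:
        return 0.0  # all rainfall is abstracted before runoff begins
    excess = rainfall_in - initial_abstraction
    return excess ** 2 / (excess + retention)


# Illustrative (hypothetical) curve numbers for a low- and a high-runoff soil.
example_curve_numbers = {"low runoff potential": 60, "high runoff potential": 90}
for label, cn in example_curve_numbers.items():
    print(label, round(runoff_depth_inches(3.0, cn), 2))
```

A higher curve number means less retention and therefore more runoff for the same storm, which is why a gridded map of soil groups is a useful input to this kind of model.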

6.
Sci Total Environ ; 493: 974-82, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-25010945

ABSTRACT

Historically, Florida soils stored the largest amount of soil organic carbon (SOC) among the conterminous U.S. states (2.26 Pg). This region has experienced rapid land use/land cover (LULC) shifts and climate change in the past decades. The effects of these changes on SOC sequestration are unknown. The objectives of this study were to 1) investigate the change in SOC stocks in Florida to determine if soils have acted as a net sink or net source for carbon (C) over the past four decades and 2) identify the concomitant effects of LULC, LULC change, and climate on the SOC change. A total of 1080 sites were sampled in the topsoil (0-20 cm) between 2008 and 2009 to represent current SOC stocks, 194 of which were selected to collocate with historical sites (n = 1251) from the Florida Soil Characterization Database (1965-1996) for direct comparison. Results show that SOC stocks differed significantly among LULC classes: sugarcane and wetland contained the highest SOC, followed by improved pasture, urban, mesic upland forest, rangeland, and pineland, while crop, citrus and xeric upland forest had the lowest. The surface 20 cm of soil acted as a net sink for C, with the median SOC significantly increasing from 2.69 to 3.40 kg m⁻² over the past decades. The SOC sequestration rate was LULC dependent and controlled by climate factors interacting with LULC. Higher temperature tended to accelerate SOC accumulation, while higher precipitation reduced the SOC sequestration rate. Land use/land cover change observed over the past four decades also favored C sequestration in soils due to the increase in the C-rich wetland area by ~140% and the decrease in the C-poor agricultural area by ~20%. Soils are likely to provide a substantial C sink considering the climate and LULC projections for this region.
