Neighborhood size of training data influences soil map disaggregation

Title: Neighborhood size of training data influences soil map disaggregation
Publication Type: Journal Article
Year of Publication: 2017
Authors: Levi MR
Journal: Soil Science Society of America Journal
Date Published: 04/2017
ARIS Log Number: 332246
Soil class mapping relies on the ability of sample locations to represent portions of the landscape with similar soil types; however, most digital soil mapping (DSM) approaches intersect sample locations with one raster pixel per covariate layer regardless of pixel size. This approach does not take into account the variability of covariate information adjacent to training data that represent the polypedon. My objective was to disaggregate a soil map in a semiarid Arizona rangeland (78,569 ha) by exploring different neighborhood sizes for extracting covariate data to points. Eight machine learning algorithms were compared to assess the influence of aggregating covariate data within neighborhoods of 0 to 180 m radius, along with a multi-scale model. Kappa values of all models ranged between 0.24 and 0.44 and increased with buffer radius up to a radius of 150 m. Support vector machine and random forest algorithms performed best across all scales. The radial support vector machine model using 150 m aggregations of covariates had the highest kappa and produced a more generalized map than the best multi-resolution model (random forest), which produced a mix of general and detailed soil features. Evaluating a range of neighborhood sizes for aggregating covariate data provides a method of accounting for multi-scale processes important for predicting soil patterns without modifying the pixel size of final maps. Incorporating polypedon concepts from traditional soil survey into DSM approaches can strengthen ties between them and optimize the extraction of landscape information for predicting soil properties.
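
The two quantities at the center of the abstract, a covariate value aggregated over a circular neighborhood around a training point and the kappa statistic used to compare models, can be illustrated with a minimal sketch. This is not the author's workflow: the abstract does not state which software, aggregation statistic, or covariates were used, so the use of a mean, the function names `neighborhood_mean` and `cohen_kappa`, the 10 m pixel size, and the toy raster are all assumptions made for illustration.

```python
import numpy as np


def neighborhood_mean(raster, row, col, radius_m, pixel_size_m):
    """Mean of the cells whose centers lie within radius_m of cell (row, col).

    Hypothetical helper: the mean is an assumed aggregation statistic.
    A radius of 0 reduces to the usual single-pixel intersection.
    """
    r_px = int(np.floor(radius_m / pixel_size_m))
    # Clip the search window to the raster extent.
    r0, r1 = max(row - r_px, 0), min(row + r_px + 1, raster.shape[0])
    c0, c1 = max(col - r_px, 0), min(col + r_px + 1, raster.shape[1])
    window = raster[r0:r1, c0:c1]
    # Distance (in meters) from each cell center in the window to the point's cell.
    rr, cc = np.ogrid[r0:r1, c0:c1]
    dist_m = np.hypot(rr - row, cc - col) * pixel_size_m
    return float(np.nanmean(window[dist_m <= radius_m]))


def cohen_kappa(observed, predicted):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    observed = np.asarray(observed)
    predicted = np.asarray(predicted)
    classes = np.union1d(observed, predicted)
    p_o = np.mean(observed == predicted)  # observed agreement
    # Agreement expected by chance from the marginal class frequencies.
    p_e = sum(np.mean(observed == c) * np.mean(predicted == c) for c in classes)
    # Undefined (division by zero) when both labelings contain a single shared class.
    return float((p_o - p_e) / (1.0 - p_e))


# Toy covariate raster (assumed 10 m pixels) with one high value at the center.
covariate = np.zeros((5, 5))
covariate[2, 2] = 5.0

# The same training point sampled at increasing neighborhood radii.
for radius in (0, 10, 15):
    value = neighborhood_mean(covariate, 2, 2, radius, pixel_size_m=10)
    print(radius, value)
```

In this sketch the sampled value changes with the radius while the raster itself is left untouched, which mirrors the abstract's point that neighborhood aggregation accounts for multi-scale variation without modifying the pixel size of the final map.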