Mapping Soil Organic Carbon stocks and estimating uncertainties at the regional scale following a legacy sampling strategy (Southern Belgium, Wallonia)

Mapping Soil Organic Carbon stocks and estimating uncertainties at the regional scale following a legacy sampling strategy (Southern Belgium, Wallonia)

Accepted Manuscript Estimating and spatialising soil organic carbon stocks and uncertainties based on a convenience sampling scheme - a case study fro...

2MB Sizes 0 Downloads 8 Views

Recommend Documents

Geological control of soil organic carbon and nitrogen stocks at the landscape scale
•SOM stocks in the top 30 cm are controlled by parent material in forest soils.•SOM stock differences between forest and

Modelling the three-dimensional spatial distribution of soil organic carbon (SOC) at the regional scale (Flanders, Belgium)
The rate of exchange of CO2 between the soil and the atmosphere depends on the stability of the organic carbon stored in

Mapping mean total annual precipitation in Belgium, by investigating the scale of topographic control at the regional scale
•Geoscientific models require annual precipitation data with high spatial resolution.•Simple method predicts precipitati

Towards mapping soil carbon landscapes: Issues of sampling scale and transferability
•Different modelled patterns can have similar levels of performance.•Spatial regression approaches can extrapolate spati

On stability sampling strategy at the slope scale
Snow slope stability evaluation is often based on a single test location within a slope. However, we know that snow cove

A simplified approach for estimating soil carbon and nitrogen stocks in semi-arid complex terrain
We investigated soil carbon (C) and nitrogen (N) distribution and developed a model, using readily available geospatial

Soil organic carbon content indicators and web mapping applications
Distributing geographic information via the Internet allows interoperability with similar information and real-time inte

Mapping soil carbon stocks across Scotland using a neural network model
•A neural network model was trained to predict soil organic matter across Scotland.•The data used was developed from the

National and sub-national assessments of soil organic carbon stocks and changes: The GEFSOC modelling system
Soil organic carbon (SOC) plays a vital role in ecosystem function, determining soil fertility, water holding capacity a

Modelling soil organic carbon distribution in blanket peatlands at a landscape scale
•A new methodology for mapping blanket peatland SOC is applied to Dartmoor, UK.•Strong relationships between bulk densit

Accepted Manuscript Estimating and spatialising soil organic carbon stocks and uncertainties based on a convenience sampling scheme - a case study from southern Belgium

Caroline Chartin, Antoine Stevens, Esther Goidts, Inken Kruger, Monique Carnol, Bas van Wesemael PII: DOI: Reference:

S2352-0094(16)30049-9 doi: 10.1016/j.geodrs.2016.12.006 GEODRS 112

To appear in:

Geoderma Regional

Received date: Revised date: Accepted date:

15 June 2016 1 December 2016 21 December 2016

Please cite this article as: Caroline Chartin, Antoine Stevens, Esther Goidts, Inken Kruger, Monique Carnol, Bas van Wesemael , Estimating and spatialising soil organic carbon stocks and uncertainties based on a convenience sampling scheme - a case study from southern Belgium. The address for the corresponding author was captured as affiliation for all authors. Please check if appropriate. Geodrs(2016), doi: 10.1016/j.geodrs.2016.12.006

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT Estimating and spatialising Soil Organic Carbon stocks and uncertainties based on a convenience sampling scheme - a case study from southern Belgium

PT

Caroline Chartin1,*, Antoine Stevens1, Esther Goidts2, Inken Kruger3, Monique Carnol3 &

1

RI

Bas van Wesemael1

George Lemaître Centre for Earth and Climate Research, Earth & Life Institute, Université catholique

Soil Protection Direction, Direction générale Agriculture, Natural Resources and Environment, Public

NU

2

SC

de Louvain, Place Louis Pasteur 3, 1348 Louvain-la-Neuve, Belgium.

3

MA

Administration of Wallonia, B-5100, Jambes, Belgium.

Laboratoire d'Ecologie Végétale et Microbienne, Université de Liège, Botanique B22, Quartier Vallée

D

1, Chemin de la Vallée 4, 4000 Liège, Belgium.

PT E

* Corresponding author at: TECLIM, Université catholique de Louvain, Place Louis Pasteur 3, 1348 Louvain-la-Neuve, Belgium. Tel.: +32 10/47 26 73; fax: +32 10/47 28 77.

CE

E-mail adresses : [email protected]; [email protected];

Abstract

AC

[email protected]; [email protected]; [email protected]

The quantification and the spatialisation (i.e., addressing the questions ‘How much?’ and ’Where?’ respectively ) of reliable SOC densities (Mg C ha-1) and stocks (Tg C) baselines and associated uncertainties are fundamental to detect the gains or losses in SOC, and to locate sensitive areas with low SOC density levels. Here, we aim to both quantify SOC stocks and spatialize SOC density at regional scale (southern Belgium) based on data from one non-design-based nor model-based sampling scheme. To this end, we developed a computation procedure based on Digital Soil Mapping 1

ACCEPTED MANUSCRIPT techniques and stochastic simulations (Monte-Carlo) allowing the estimation of multiple (here, 10,000) independent spatialized datasets. This procedure accounts for the uncertainties associated to the both estimations of i) SOC density at the pixel-related area scale and ii) parameters of the spatial model. Based on these 10,000 individuals, mean SOC stocks and confidence intervals were computed at the pixel scale, as well as SOC budgets and their confidence intervals for selected sub-

PT

areas and for the entire study area. Hence, a Generalised Additive Model (GAM) explaining 67.4% of the SOC density variance was fitted and validated (R² = 0.64). A positive gradient of SOC density

RI

occurred from the northwest to the center of Wallonia with a slight decrease on the southernmost

SC

part, correlating the evolution of precipitation and temperature (along with elevation) and dominant

NU

land use. At the catchment scale higher SOC density were predicted on valley bottoms, especially for poorly drained soils under grassland. Mean predicted SOC stocks for cropland and grassland in

MA

Wallonia were of 25.93 Tg C (SD 2.30) and 46.16 Tg C (8.02), respectively. The model induced an underestimation of the SOC density and consequently SOC stocks for grassland of Wallonia, whereas

D

it was not the case for cropland. The procedure developed here allowed to predict realistic spatial

PT E

patterns of SOC density all over agricultural lands of southern Belgium (Where?) and to produce reliable statistics of SOC stocks for each of the 20 combinations of land use / agricultural regions of Wallonia (How much ?). This procedure appears useful to produce soil maps as policy tools in

CE

conducting sustainable management at regional and national scales, and to compute statistics which

AC

comply with specific requirements of reporting activities. Keywords. Soil organic carbon stocks; Digital Soil Mapping; Soil Monitoring Network; Luvisol; Cambisol

2

ACCEPTED MANUSCRIPT 1. Introduction Soil Organic Carbon (SOC) is a key element for soil quality and fertility, as it improves aggregate stability, nutrient availability and water retention capacity (Hallet et al., 2012; Robinson et al., 2012). Thus, SOC contributes to many ecosystem services, such as food production, maintenance of soil structure and the prevention of soil erosion (FAO, 2015). The sequestration of CO2 in the form of SOC

PT

in particular through agricultural practices that increase organic matter inputs and reduce their decomposition is considered to be a potential offset to current anthropogenic CO2 emissions

SC

RI

(Paustian et al., 2016).

The quantification and the spatialisation (i.e., addressing the questions ‘How much?’ and ’Where?’ )

NU

of reliable SOC densities (Mg C ha-1) and stocks (Tg C) baselines and associated uncertainties appear therefore fundamental to: i/ detect the trends of gains or losses in SOC (i.e., how much?) linked to

MA

recent and future national, European or global policy decisions related to soil quality and carbon sequestration, and, ii/ locate sensitive areas characterized with low SOC density levels (i.e., where?)

PT E

D

for prioritizing conservation practices.

In order to monitor SOC stocks at regional or national scales, a Soil Monitoring Network (SMN) has to be developed as a set of sites or areas where changes in soil characteristics can be observed through

CE

regular assessment of different soil properties (Morvan et al., 2008). An important decision related to the design of a SMN is the choice of the sampling locations that can be executed by probability

AC

sampling (random sampling with known inclusion probabilities) or by non-probability sampling. Hence, two main types of soil sampling strategies exist, i.e. the design-based strategy and the modelbased strategy (Brus and de Gruijter, 1993; de Gruijter et al., 2006). The design-based strategy is a classical sampling theory based on probability sampling: i.e. random sampling within the area of interest. The population is then assumed independent and identically distributed. For the modelbased strategy, also called geostatistical modelling, randomness is introduced via the model, which is

3

ACCEPTED MANUSCRIPT a stochastic model and therefore contains a random error term. The inference is based on assuming the validity of the stochastic model. Hence, design-based sampling is a combination of probability sampling and design-based inference, whereas model-based sampling is a combination of purposive sampling and interpolation (de Gruijter et al., 2006). The choice between these two sampling strategies relies finally on the purpose of the

PT

SMN itself, i.e., to quantify or to spatialize the target variable. Design-based sampling is more

RI

suitable to deal with the ‘How much?’ question, calculating global mean, quantiles, standard

SC

deviation, i.e. the frequency distribution of the target variable. The model-based sampling is more suitable to answer the ‘Where?’ question allowing the prediction of values at unsampled locations,

NU

and so the mapping of the target variable in the whole study area along with its confidence interval

MA

through the associated stochastic model parameters (Wang et al., 2012). In practice, the means (e.g., funding, time, labour) needed to develop SMNs are far from being ideal for the proper development of these types of soil sampling networks, i.e. design-based and model-

PT E

D

based. A random sampling (design-based sampling) is difficult to achieve because of the impossibility to access some sites, and/or because the environment (forest, buildings…) can occasionally hamper a precise localization of random selected sites by GPS devices (Kidd et al., 2015). Concerning the

CE

realization of model-based sampling, it requires the prior knowledge of some information about the spatial variability of the target parameter, which often does not exist or is too scarce. Simpler

AC

approaches for a spatial sampling to be used in a model-based strategy can be designed, such as a regular grid or spatial coverage samples (de Gruijter et al., 2006). These are often more timeconsuming and costly, as their density are higher due to the lack of a priori knowledge of the spatial variability or due to the existence of conflicting trends in the spatial variability of different soil parameters. Many western European countries have been developing their own SMN within the last decades such as Netherlands, France and Denmark (Bloem et al., 2005; Jolivet et al.,. 2006; Østergaard, 1989), 4

ACCEPTED MANUSCRIPT mostly choosing one particular type of SMN amongst design-based or model-based ones, or capitalizing on legacy database such as in southern Belgium (Goidts et al, 2009). The Soil Monitoring Network ‘CARBOSOL’ was developed to survey the quality of agricultural soil in Wallonia by quantifying SOC densities and their evolution (Goidts & van Wesemael, 2007). A reasonable number of locations from a previous national sampling campaign from 1949-1965 (used to map soils in

PT

Belgium) were chosen according to their belonging to 15 spatial LandScape Units (LSU) and resampled from 2005 to 2009. The LSU are characterized by homogeneous climate conditions, soil

RI

type (texture, drainage and stoniness), land use (cropland or grassland) and main management

SC

practices (inherent to the agricultural regions). Two main limitations arise for answering the ‘How

NU

much?’ question: the 15 LSU considered here all together covered only 63 % of the Walloon agricultural areas, and the legacy database on which the SMN is based was created as a convenience

MA

design sampling. Hence, SOC stocks were only computed on a limited area in Wallonia and this, based on a non-independent sampling data invalidating the statistical inference.

D

The first attempts to map SOC density in Belgium answering the ‘Where?’ question consisted of

PT E

assigning the mean and associated uncertainties observed in each LSU to the corresponding spatial polygons (Goidts et al., 2009; Lettens et al., 2005). Thus, the process of mapping lacking statistical

CE

inference, this approach can be considered as a digitising not resulting in a proper understanding of SOC density spatial variability. To produce more realistic results, a Digital Soil Mapping (DSM)

AC

technique was first applied to Flanders (northern Belgium; Meersmans et al., 2008) and later, to the entire Belgian territory for two different periods (Meersmans et al., 2011). The DSM technique consisted in fitting a multiple linear regression to observed data using a set of spatialized layers of covariates, i.e. environmental factors controlling soil formation and explaining the spatial variability of the target variable (Lagacherie, 2008; McBratney et al., 2003). However, Meersmans et al. (2011) used a multiple linear regression preventing the representation of potential non-linear relations existing between SOC and environmental covariates. They also applied a PTF (PedoTransfer Function)

5

ACCEPTED MANUSCRIPT to estimate soil bulk density, increasing considerably the uncertainties (2.6%–29.0%) for SOC density calculation. Here, we aim to both quantify SOC stocks (i.e., How much?) and spatialize SOC densities (i.e., Where?) at regional scale (southern Belgium) based on data from a non design-based nor modelbased SMN. To this end, we developed a computation procedure based on Digital Soil Mapping

PT

techniques and stochastic statistic approaches allowing the estimation of multiple independent

RI

spatialized datasets and their uncertainties associated to both SOC density calculation and spatial

SC

modelling. Hence, reliable statistics, mean or median SOC density and confidence intervals, can be computed at the pixel scale, as well as SOC stocks and their confidence intervals for selected sub-

NU

areas or the entire study area.

MA

2. Materials and methods 2.1. Study area

D

The study area covers southern Belgium, also called Wallonia, which covers about 16,900 km². From

PT E

the northwest to the southeast, there is an increase in precipitation (from 800 to 1200mm) along with elevation (from 180 to 690 m; Fig. 1A) and a decrease of mean annual temperature (from 10 to

CE

8 °C). In the same direction , a shift occurs from deep sandy loam and silty soils to shallow silt loam and stony soils, from Haplic Luvisol to Distric Cambisol mainly (with Fluvisol on valley bottoms), along

AC

with a shift from intensive arable agriculture to more extensive cattle breeding (Fig. 1B; for further details, see Goidts & van Wesemael, 2007). Ten agricultural regions (Fig. 1) are defined on homogeneous characteristics in terms of agricultural practices (types of crops and/or livestock) and crop yields, both directly linked to soil types (e.g., texture, fertility, stoniness, drainage) and climatic characteristics (Lettens et al., 2005a). Agricultural regions are now used as reference in most of the recent reporting activities linked to agricultural

6

ACCEPTED MANUSCRIPT activities and diagnosis of soil quality evolution (such as National Inventory Report –NIR- commanded by EU or the state of the environment report published by the Public Administration of Wallonia). 2.2. Soil data SOC Monitoring Network in Wallonia

PT

Between 1949 and 1965, the National Soil Survey (NSS) resulted in the description and analysis of about 13,000 soil profiles to create the Belgian soil map (De Leenheer et al., 1968). The resulting

RI

digitized legacy database (“Aardewerk”, Van Orshoven et al., 1988) contains precious information per

SC

horizon (up to a depth of about 1.5m) for each soil profile, e.g., texture, SOC content and rock fragment content (RF). Other information such as “soil series”, land use, and coordinates of the

NU

profile, is also available.

MA

Between 2005 and 2014, 591 locations from the original NSS sampling network were resampled to form the CARBOSOL SMN dedicated to agricultural soils (croplands and grasslands). Firstly, 434

D

locations were sampled from 2005 to 2009 within 15 homogeneous LSUs (Goidts, 2009) to study SOC

PT E

stock changes between 1955 and 2005, then completed in 2012 - 2014 by 157 locations sampled within 30 additional LSUs to improve the spatial coverage of the network. As the CARBOSOL SMN has

CE

not been originally designed to assess SOC stocks by agricultural regions and main agricultural land use (i.e., cropland and grassland), the spread of the samples through the territory is not optimal to

AC

produce significant and reliable statistics of SOC stocks for the 20 existing agricultural region/land use combinations (Tab. 1). Sampling method At each location, composite samples of the 0-30cm topsoil layer were taken from 5 points sampled by auger and located within a circle of 4 m radius centered on the known geographical coordinates. When the first 30cm were composed of different horizons, the sampling was done by horizon. The samples were air-dried, gently crushed with a mortar and pestle, sieved at 2 mm and stored in plastic

7

ACCEPTED MANUSCRIPT bags at room temperature. Rock fragment content was expressed as mass fraction. Three intact soil cores of 100 cm3 (Øcore = 5.3 cm) were taken in the middle of each sampled layer to measure the corresponding bulk density. For thorough explanations, please refer to Goidts et al. (2009). Carbon analysis

PT

SOC content of fine earth (< 2mm) was measured using the classic dichromate oxidation method of Walkley & Black (1934) (Cwb) for soils sampled between 2005 and 2009. This method is known as

RI

requiring the application of a correction factor to raw measures in order to correct for incomplete

SC

oxidation and estimate total organic carbon concentration. The total C content (TC) was measured in soils sampled since 2009 using a VarioMax CN dry combustion Analyzer (Elementar GmbH, Germany)

NU

and was taken as SOC content (Cdc) unless a HCl test showed the presence of carbonates. In which

MA

case, Inorganic Carbon (IC) content was then determined separately by a pressure method (Sherrod et al 2009) and Cdc was taken as the difference between TC and IC.

D

Amongst the 434 locations originally analysed for SOC content by Walkley & Black method between

PT E

2005 and 2008, 100 were randomly selected and analysed by dry combustion and if required corrected for carbonates. These 100 measurements were used to determine a specific correction to

CE

apply to Walkey & Black raw data. To this end, a linear model was fitted between raw data obtained by Walkley and Black method (Cwb) and by dry combustion analysis (Cdc), using lm function from the

AC

stats package in R (Eq. 1): 𝐶𝑑𝑐 = 𝑎 ∗ 𝐶𝑤𝑏 + 𝑏

(Eq. 1)

Where a and b are the constant parameters of the equation. SOC density calculation The SOC density (SOC in Mg.ha-1) has been calculated for the 0-30cm soil depth according to Eq. (2): 𝑆𝑂𝐶 =

𝑑∗𝐶∗[1−𝑅𝐹]∗𝐵𝐷 100

(Eq. 2) 8

ACCEPTED MANUSCRIPT where d is the depth (m), C is the content in organic carbon (gC.kg-1), RF is the mass fraction of rock fragments (-), and BD is the bulk density (kg.m-3). 2.3. Environmental covariates considered in the DSM approach The digital soil mapping (DSM) approach used in this study fitted a statistical regression model between the soil property to predict (SOC density in Mg C ha-1) and independent environmental

PT

covariates at the same location. SOC density values can be predicted at unsampled locations by

RI

applying the fitted model to the spatial continuous layers of covariates. A spatial resolution of 40m

SC

was chosen as a compromise between the different original resolutions of covariates already continuous in space (varying from 20 to 90m), the scale of the sampling sites (i.e., a circle of 4m

NU

radius) and the limitation of the computation time. So, all the continuous variables used as environmental covariates in our mapping procedure were first rescaled via ArcGIS10 (ESRI, 2011) by

MA

bilinear interpolation, which determined the output value of a cell based on a weighted distance average of the four nearest input cell centers. Once the stack of layers was constituted, the values of

D

these covariates were extracted at each of the 591 locations constituting the CARBOSOL SMN, then

PT E

creating the dataset used in the computation procedure of SOC densities and uncertainties. The environmental covariates considered in this study are discussed below.

CE

Digital Elevation Model and derivates

AC

A Digital Elevation Model (DEM) with a resolution of 20m provided by the National Geographic Institute (NGI) of Belgium was used, from which different morphologic and hydrologic derivatives were computed in ArcGIS10 (ESRI, 2011): slope, aspect, profile, plan and general curvatures, Topographic Position Index (TPI; Jenness, 2006), Wetness Index (WI; Phillipps, 1990), flow accumulation, upstream flow length and LS-factor as used in the RUSLE equation (Renard et al., 1991; Fig. 1A). Finally, we converted aspect into easting and northing which represent the degree to which aspect is close to the east and the north, respectively (Zar, 1999).

9

ACCEPTED MANUSCRIPT Climate data Annual mean temperature and precipitation data were extracted from weather stations in Belgium and neighboring countries for the period 1971-2000. These data were supplied by the national meteorological institutes of Belgium (Koninklijk Meteorologisch Instituut, KMI), the Netherlands (Koninklijk Nederlands Meteorologisch Instuut, KNMI), Germany (Deutscher Wetterdienst, DWD),

PT

France (Meteo France) and Grand Duchy of Luxemburg (Observatory for Climate and Environment).

RI

Based on these data, climatic maps were created by modeling temperature and precipitation with

SC

altitude using thin-plate splines regression (Fig. 2). Using altitude as covariate for mapping climatic variables can improve predictions dramatically (Boer et al., 2001). The smoothing parameter is

NU

chosen automatically by generalized cross-validation. Here, elevation data were derived from the Shuttle Radar Topography Mission (SRTM) mission of the NASA (http://srtm.csi.cgiar.org) to cover

MA

Belgium and neighboring countries.

D

Soil parameters

PT E

Numerous soil parameters were measured during the NSS and compiled in the digitized database AardeWerk and used to produce the Digital Soil Map of Wallonia (DSMW). All the polygons of the

CE

DSMW corresponding to mineral soils were selected and converted to a raster format with a resolution of 40m. Minimum and maximum ground water levels, i.e. the position of the winter and

AC

summer ground water table, were mapped as proposed by Meersmans et al. (2008). Four maps of soil texture were created using the data of 6129 historical soil profiles from the AardeWerk database (Fig. 3). We computed three maps corresponding to the classical texture classes (sand 50 µm-2 mm, silt 2-50 µm, clay < 2 µm), and one map corresponding to the clay and fine silt class (i.e., particles <20 µm) generally assumed to be linked to the stable SOC fraction (Hassink, 1997). Only the first soil horizon was kept in the AardeWerk database to compute these four maps. We spatialized the observations using regression kriging (Pebesma, 2006). First, we computed the mean sand, silt and clay content per soil association and assigned the resulting values to the corresponding spatial 10

ACCEPTED MANUSCRIPT polygons. Then, we computed the residuals between the observations and the mean and kriged the residuals in space by regression kriging. Finally, the obtained spatial layers of residuals were added to the corresponding map of mean values. Anthropogenic influences

PT

Land cover data were extracted from the Carte d’Occupation du Sol de Wallonie (COSW), a land cover map. Land cover classes were reclassified into cropland, (permanent) grassland, forest and

RI

others (Fig. 1). The data is in polygon format and was therefore converted to raster with a 40 m

SC

resolution. We computed the mean annual input of organic carbon by organic amendments (Mg C ha-1 yr-1), e.g. farmyard manure and slurry, for each municipality of Wallonia based on data provided

NU

by the National Institute of Statistics for the period 2007-2012 and the method proposed by

MA

Dendoncker et al. (2004). Resulting calculations were then applied to the corresponding polygons of the Walloon municipalities and, transformed to raster format with a resolution of 40m.

D

2.4. Simulation procedure of SOC densities and uncertainties at the regional scale

PT E

After the creation of the database (step 1 in Fig. 4), two consecutive methods were applied to, first, fit the mean trend model linking SOC density spatial variability to the most significant environmental

CE

covariates (step 2 in Fig. 4) and to, secondly, assess the uncertainties associated with SOC density

AC

calculation and spatial modelling (step 3 in Fig. 4).

Generalized Additive Model The initial dataset was split into a calibration (2/3) and a validation (1/3) set by Latin hypercube sampling stratified by SOC density values and land use (Minasny and McBratney, 2006). Then, a Generalized Additive Model (GAM; Hastie and Tibshirani, 1996) was fitted on the calibration dataset. This regression technique is a generalization of linear regression models in which the coefficients can be a set of smoothing functions, then accounting for the non-linearity that could exist between the dependent variable and the covariates (Eq. 3): 11

ACCEPTED MANUSCRIPT 𝐸(𝑌 | 𝑋1 , 𝑋2 , … , 𝑋𝑝 ) = 𝛼 + 𝑓1 𝑋1 + 𝑓2 𝑋2 + ⋯ + 𝑓𝑝 𝑋𝑝

(Eq. 3)

where Y is the dependent variable, X1,X2,…Xp represent the covariates and the fi's are the smooth (non-parametric) functions. As for generalized linear models, the GAM approach specifies a distribution for the conditional mean µ (Y) along with a link function g relating the latter to an

PT

addictive function of the covariates (Eq. 4): (Eq. 4)

RI

𝑔[µ (𝑌)] = 𝛼 + 𝑓1 𝑋1 + 𝑓2 𝑋2 + ⋯ + 𝑓𝑝 𝑋𝑝

The GAM model was built using regression splines, and the smoothing parameters were estimated by

SC

penalized Maximum Likelihood to avoid an over-fitting (Wood, 2001). An extra penalty added to each

NU

smoothing term allowed each of them to be set to zero during the fitting process in case of multicollinearity or multi-concurvity. Finally, the interaction of geographical coordinates (X, Y) was added

MA

to each model (as a two-dimensional spline on latitude and longitude) to account for the spatial dependence and trends of the target variable at the regional scale.

D

A model including all the covariates was first developed (the ‘full’ model). Then a backward stepwise

PT E

procedure was applied to select the appropriate covariates where each of the covariates were sequentially removed from the full model (de Brogniez et al., 2014; Poggio et al., 2013). The

CE

difference (dAIC) between AIC (Akaike’s Information Criterion; Akaike (1974)) calculated for the ‘full model’ and each AIC obtained when removing sequentially one of the covariate from the model was

AC

computed to evaluate the influence of each covariate on the overall capabilities of model prediction. For validation, predicted values of the independent validation set were compared to observed ones and mean error (ME) and root mean square error (RMSE) were calculated to appreciate the goodness of fit of the model, as well as the coefficient of determination. After this validation phase, a final model was built with all samples (i.e. in both the calibration and validation sets) using the covariates selected by the backward stepwise procedure in order to improve model accuracy and representativity over the entire territory.

12

ACCEPTED MANUSCRIPT Estimation of the uncertainties The uncertainties due to SOC density measurement and those associated with the GAM model (i.e., uncertainties of the parameters) were both considered in the SOC density mapping procedure (Fig. 4). For this purpose, two consecutive Monte Carlo simulations (Steps 3.1 and 3.2 in Figure 4) were



PT

used (Hammersley and Handscomb, 1964; Rubinstein, 1981). Step 3.1: Applying a regression model to spatial continuous layers implies that only one value

RI

would be predicted for the pixel centroid. As one of our aims was to compute SOC stocks for

SC

continuous surfaces, it is necessary to account in the simulation procedure for the spatial variability of SOC density within the pixel related area. The uncertainties associated with the SOC

NU

density calculation at the sub-field scale in the specific context of the CARBOSOL SMN were

MA

estimated by Goidts et al. in 2009 considering the land use (cropland or grassland) and soil type (texture and stoniness). The coefficients of variation (CV = 100*sd/µ) of SOC density calculation derived from Goidts et al. (2009) were assumed to be representative for the different agricultural

PT E

D

soils in southern Belgium, and were used as estimators of SOC density uncertainties at the 40*40m pixel resolution (Table 2). The SOC density values were then randomly varied following a normal probability distribution with a mean equal to the parameter value and a standard

CE

deviation based on the CV proposed in Table 2. Hundred MC simulations were performed and, based on each of these new 100 datasets, the parameters of the mean trend model were



AC

readjusted to create 100 new GAMs (i from 1 to100; step 3.1 in Fig. 4). Step 3.2: To take into account the uncertainties associated with the modelling procedure, the prediction matrix of each of the 100 GAMs obtained above were used to obtain multiple realizations of the trend, according to the approach proposed by Wood (2006). For each of the 100 GAM models produced by the MC1 simulation, confidence intervals of the model parameters were estimated. For each of these models, 100 replicate sets of the associated parameters were created through MC simulations considering the mean value of the parameters, their confidence

13

ACCEPTED MANUSCRIPT intervals and a normal distribution. To finish, the prediction matrix relating the model parameters to the predictor were used to obtain 100 replicates (j) of each GAMi prediction on the 40 m grid (i.e. i and j from 1 to 100; step 3.2 in Fig. 4). Hence, the simulation led to the computation of 10,000 (100*100) independent realizations of the trend model. SOC density values were then predicted at unsampled locations of Walloon agricultural

PT

areas by applying separately each of the 10,000 models to the spatial continuous layers of concerned

RI

environmental covariates (step 4 in Fig. 4).

SC

Spatial estimation (Where?)

NU

Based on the 10,000 independent spatial estimations of SOC density values produced above, we computed the median and upper (Q95) and lower (Q5) percentiles for each pixel. Resulting median

MA

values were used to map SOC density values (Mg C ha-1) all over the territory whereas the variability of this parameter (dSOC) was estimated as the difference between the 95th and the 5th percentiles

D

(Eq. 5):

(Eq. 5)

PT E

dSOC = Q95 - Q5

CE

SOC stock calculation (How Much?)

For each of the 10,000 independent spatial estimations of SOC density values, SOC stocks (Mt C)

AC

were computed by summing pixels values multiplied by the pixel surface (40*40m) from the areas requested by GHG reporting , i.e. the 20 combinations of land use/agricultural regions (Tab. 1). Finally, summary statistics of SOC stocks (mean and standard deviation; n = 10,000) were computed in order to be comparable with results from the literature. The simulation presented in Figure 4 and the summary statistics were run in R software.

14

ACCEPTED MANUSCRIPT 3. Results and discussion 3.1. Descriptive statistics Based on 100 randomly selected locations, a linear model was fitted between raw data obtained by Walkley and Black method (Cwb) and those obtained by dry combustion method (Cdc) (Fig. 5). Meersmans et al. (2009) also found significant positive intercepts (Equation Fig. 5) of similar values

PT

for soils under cropland and dry silt loam grassland soils located in Northern Belgium. The positive

RI

significant intercept suggested the presence of recalcitrant organic matter resulting in an incomplete

SC

oxidation by the Walkley and Black procedure (Meersman et al., 2009). The mean error between Cdc and Cwb* (i.e., Cwb corrected for incomplete oxidation by the linear model fitted above) was of

NU

6*10-5 g C kg-1 and non-significant (paired-sample t-test P= 0.73, n=100; Fig. 5). From these results, we assumed that the difference between SOC content values obtained from these two different

spatial trends and SOC stocks calculation.

MA

methods were comparable after correction of the Cwb values and induced no bias in the SOC density

D

Descriptive statistics of the total dataset (n=591) showed minimum and maximum SOC density values

PT E

of 28.61 and 161.63 Mg C ha-1 with a mean of 68.31 Mg C ha-1 (Fig. 5). The dataset showed a positively-skewed unimodal distribution (p=0.98 at the Dip Test; Hartigan and Hartigan, 1985). The

CE

mode was located around 55 Mg C ha-1 and related to locations sampled under the predominant land use in the dataset, i.e. cropland (n=371; median=53.38 Mg C ha-1; mean=54.51 Mg C ha-1; Fig. 5).

AC

Locations under grassland which represented ca. 37% of the total dataset showed higher values (n=220; median=88.41 Mg C ha-1; mean=91.59 Mg C ha-1) than those under cropland. A correlation matrix was first computed by including SOC density and all the environmental covariates in order to study potential multicollinearity. From this first matrix (data not shown), we pre-selected environmental covariates showing a minimum correlation coefficient of abs(0.10) with SOC density to produce a simplified matrix (Table 3). As the GAM fitting procedure used in this study already deals with collinearity and coconcurvity between covariates, the correlation matrix allows a 15

ACCEPTED MANUSCRIPT first investigation on the relations between all the parameters and later to support the interpretation of the results from the backward stepwise procedure (section 2.4). The highest correlations observed between SOC and environmental covariates were with the Clay + fine Silt (C+fS; ρ=-0.46) and then the clay content (Clay; ρ=-0.42) (Table 3). The most significant correlations between SOC density and morphometric parameters were 0.31 for elevation and 0.25 for ls-factor. Precipitation and mean

PT

temperature showed correlation coefficients of 0.35 and -0.27 with SOC density, respectively. Amongst the independent covariates, strong correlations were observed between elevation and

RI

temperature (-0.99), rain and temperature (-0.86), and unsurprisingly between ls-factor and slope

SC

(0.88). Latitude and longitude also showed high correlation values with several covariates such as

NU

elevation and climatic covariates.

Minimum, mean and maximum SOC density values were of 28.6, 68.5 and 161.6 Mg C ha-1 in the

MA

calibration dataset (n=395), and of 31.2, 67.9 and 151.4 Mg C ha-1 in validation dataset (n=196; Tab.

PT E

3.2. SOC density model

D

4).

Model calibration

CE

The SOC density dataset being continuous and strictly positive, we applied a Gamma distribution in the GAM model. The log-link function was chosen for the model fitting considering the positively-

AC

skewed unimodal characteristic of the SOC density distribution. The Q-Q plot showed that the calibration residuals appeared to be mostly aligned along a Gamma distribution (Fig. 7). The backward stepwise procedure showed that the land use was the covariate influencing the most the prediction model, followed by the coordinates interaction (X,Y), the LS-factor, the maximum ground water table depth (GWL max), the clay + fine Silt content (C+fS) and the temperature (Fig. 8). The other covariates considered in this study appeared to not significantly influence the model performance.

16

ACCEPTED MANUSCRIPT Several studies have already demonstrated the importance of land use/land cover on SOC content (e.g., Jones et al., 2005; Goidts et al., 2009). Meersmans et al (2011) demonstrated that the ground water table depth, especially within grassland, influenced greatly SOC contents. Bulk SOC consists of heterogeneous pools with different turnover rates and stabilities depending on complex interactions between biological, chemical, and physical processes in the soil (Post & Kwon, 2000). The stable

PT

fraction of SOC, which has slower turnover (from years to decades), has been demonstrated to be associated to clay and fine silt content (Hassink, 1997). The stable fraction can represent around

RI

more than 60% of the bulk SOC in cropland of the Northern Wallonia, i.e. the Loam agricultural

SC

region (Fig. 1 region 2) (Trigalet et al., 2014). This could explain the selection of the clay and fine silt

NU

content as the significant texture covariate selected to model the spatial variation of SOC density instead of the classical clay contents.

MA

The LS-factor as used in the RUSLE equation combines flow length and slope gradient, explaining the removal of both of them from the model by the backward stepwise procedure. In the model, the LS-

D

factor and the temperature impacted significantly SOC density prediction for cropland but not for

PT E

grassland (Fig. 9). SOC density increased with LS-factor in cropland, very slightly between 0 and ca. 11 and then very rapidly. Fields located in Sandy-loam (n°1 in Fig. 1) and Loam (n°2) agricultural regions

CE

(intensive agriculture) are prone to severe water erosion processes which transport fine soil particles (rich in SOC) and deposit them temporarily downslope where LS-factor values are high due to flow

AC

length. Moreover, as highlighted in Table 3, LS-factor increases with elevation (ρ=0.29). Some locations sampled in cropland with a high LS-factor are located in hilly areas where soils are managed very differently (extensive agriculture) than in flatter areas (intensive agriculture) because of the relief and the stoniness of the soils. Soils in hilly areas are mainly used as grassland especially in very stony areas, but it is not unusual that crop fields located in less stony areas include a grass ley in the rotation. For mean annual temperatures from 7 to 8.5-9.0°C, SOC density decreased with mean annual temperature increase which is in agreement with literature (e.g., Kirschbaum, 1995; Wang et al., 2013), strongly for cropland and slightly for grassland (Fig. 9). These mean annual temperature 17

ACCEPTED MANUSCRIPT occurred mainly in hilly areas between plateaus and valleys (Fig. 1A) where crop fields are temporary used as meadows. This could explain why the slope of the relation between temperature for the range 7 to 8.5-9.0°C and SOC density in cropland is much more important than the slope for grassland. Areas with mean annual temperatures ranging from 8.5 to 10°C are concentrated in the plan northwestern part of the territory dominated by intensive managed croplands and where

PT

temperature gradients are not as contrasted in space as in southeastern hilly areas. However, we observed a slight increase in SOC density values in grassland along with temperature (8.5 to 10°C) in

RI

this region that could be explained by the fact that soils of these regions are not stony. Then, the 0-

SC

30cm soil depth under cropland could potentially store more SOC in these conditions than in areas

NU

with lower temperature and higher rainfall.

MA

Finally, the fitted GAM model explained 67.4% of the SOC density variance in the calibration dataset. Model validation

D

The validation of the model gave satisfactory results with a R² value of 0.64 and a RMSE of 14.1 Mg C

PT E

ha-1 (10.6 Mg C ha-1 for cropland and 19.1 Mg C ha-1 for grassland; Fig. 10). The model showed a trend to overestimate low SOC density (below 50 Mg C ha-1) and to underestimate high SOC density

CE

(especially those above 100 Mg C kg-1).

Goidts et al.(2009) explained 12 to 29% of the SOC density variance in southern Belgium by a

AC

digitizing approach, whereas Meersmans et al. (2011) explained 36% of the variance in northern Belgium by using a multiple linear regression approach based on land use, texture and soil drainage. For the entire Belgian territory, Meersmans et al. (2011) obtained a R² of 0.42 (for a legacy data set of the 1960s) and 0.65 for a data set from 2005. Studies performed in other closed European countries showed similar results. Schulp and Verburg (2009) explained variances of 21 to 43% of SOC contents and stocks in different study areas in the Netherlands while extensive data on soil properties, topography, environmental conditions and

18

ACCEPTED MANUSCRIPT recent historic land use were used as predictors. Wiesmeier et al. (2014) produced a spatial prediction using regression tree technique explaining 39% of the spatial variability of SOC density in Bavaria (Germany). Despite the incorporation of numerous environmental covariates and the diversity of modelling techniques used in these different approaches, capturing the complexity of the processes controlling

PT

SOC storage and mineralization at regional scale remains challenging. Only low levels of explanation

RI

are expected for SOC predictions due to an inherent high spatial variability of SOC as suggested by

3.3. Spatial variability of SOC density: ‘Where?’

SC

(Schulp and Verburg, 2009; Schulp et al., 2013).

NU

Median values of the 100,000 independent spatial simulations of SOC density were applied to map

MA

the SOC density (Mg C ha-1; Figs. 4 and 10A). A positive gradient of SOC density occurred from the northwest to the center of Wallonia with a slight decrease on the southernmost part, i.e. the Ardenne and locally in Jurassic agricultural regions (9 and 10 in Fig. 1A). This regional trend is mainly

PT E

D

correlated with the evolution of elevation (Fig. 1A), precipitation and temperature (Fig. 2) and dominant land use (Fig. 1B). The visual comparison of the spatial distribution of SOC density produced here and this of SOC content produced by Meersmans et al. in 2011 were in good

CE

agreement for the main regional trends described above. However, while highest values appeared predominantly in Ardenne (8) and Haute Ardenne (10; Fig. 1A) in Meersmans et al. (2011), here the

AC

highest values were in Herbagère Fagnes (6), Famenne (7), Herbagère Liège (5) and Haute Ardenne (10; Fig 1A). These four agricultural regions are dominated by grassland (Fig. 1B), characterized by high clay and fine silt contents in topsoil (especially for regions 6 and 7; Fig. 3). The differences observed could probably come from the fact that Meersmans et al. (2011) predicted SOC content (gC kg-1) whereas, here, we predicted SOC density (MgC ha-1). Also, we used different methods for the spatial modelling, i.e. multiple linear regression for Meersman et al. (2011) and GAM in this study. Based on the 10,000 independent spatial simulations produced in this study, the 5th and 95th

19

ACCEPTED MANUSCRIPT percentiles were computed for each pixel to map the errors linked to both SOC density calculation and spatial modelling (Mg C ha-1; Fig. 10B). Uncertainties for SOC density values exerted the same main regional spatial trends as those observed for SOC density values. Indeed, the most reliable predictions were obtained for northwestern agricultural regions dominated by cropland, i.e. the Sandy-loam (n°1), Loam (n°2) and Condroz (n°4; Fig. 1A).

PT

In 2009, Goidts et al. characterised the variability of SOC density at different spatial scales in Wallonia

RI

and showed that SOC variation in few meters could be locally on the same order of magnitude as the

SC

variation at the regional scale. This results from the multiple processes acting simultaneously on SOC balance (i.e. between SOC stabilisation and mineralisation) in topsoil but at different spatial scales.

NU

Indeed, these processes are controlled by environmental factors which show scale-dependent links to SOC content and density. Stevens et al. (2015a) demonstrated that SOC content variance over

MA

central Belgium could be decomposed in three main -signals acting at different spatial ranges, and potentially corresponding to different environmental factors controlling the entire SOC variability; i.e.

D

long-range (~30km, i.e. regional influences), medium-range (~5km, i.e. linked to watershed systems)

PT E

and short-range (~700m, i.e. related to field / farm management). By separating the variation of the SOC at these three spatial scales, the authors disentangled the combined effect of the controlling

CE

factors and improve the qualitative and quantitative characterization of the controlling factors separately. In this study, we explained 67% of the spatial variability of SOC density in agricultural

AC

topsoil through a GAM including five significant covariates, i.e. the land use, the LS-factor, the maximum Ground Water Level, the clay and fine silt content, and precipitation in decreasing order of importance (Fig. 7). Besides the land use, these covariates hence act mainly at two different scales on SOC density variation, the long-range scale for precipitation and the clay and fine silt content, and the medium-range scale for the maximum ground water level and the LS-factor. None of the covariates used initially in this study could reflected field or farm management as this type of information is difficult to synthetize and moreover at the regional scale. However, a study from Luxembourg, and based on the use of mixed-effects models and airborne imagery showed that field 20

ACCEPTED MANUSCRIPT effect explained locally from 23.3 ± 4.1% up to 68.8 ± 12.0% of the SOC variance (Stevens et al., 2015b). Based on DSM, the modelling procedure applied here allowed to explain 67% of the SOC density variance during the calibration procedure and to depict realistic spatial patterns. However, a better performance could not be achieved given the sampling density that is too coarse to detect the field

PT

scale variability.

RI

3.4. SOC stocks in Wallonia: ‘How much?’

SC

SOC stocks for the top 0-30 cm depth of soil were computed based on the 100,000 independent spatial simulations for each of the 20 land use/agricultural region combinations. Descriptive statistics

NU

of predicted SOC stocks per combination and the entire Wallonia, as well as observed and predicted

MA

SOC densities, are presented in Table 5.

Mean predicted SOC stocks for cropland and grassland in Wallonia were of 25.93 Tg C (SD 2.30) and

D

46.16 Tg C (8.02), respectively. This corresponds to a mean predicted SOC density of 54.7 Mg C ha-1

PT E

(SD 4.9) in cropland, equivalent to the observed value of 54.5 Mg C ha-1 (SD 11.3), and to a mean predicted SOC density of 80.9 Mg C ha-1 (SD 14.1) in grassland, which appeared lower than the

CE

observed value of 92.1 Mg C ha-1 (SD 22.0). Six of the seven combinations of grassland/agricultural regions sampled in this study showed predicted mean SOC density values inferior to the observed

AC

values (Table 5). Indeed, the model developed in this study tended to underestimate high SOC density values, especially for validation sites with observed SOC densities above 100 Mg C ha-1 (Fig. 10). For example, the agricultural regions ‘Herbagère Liège’ and ‘Famenne’ (respectively 5 and 7 in Fig. 1A) which showed the higher mean observed SOC density values, i.e. 106.8 Mg C ha-1 (SD 21.2) and 94.6 Mg C ha-1 (SD 24.9), represent 27.3% of the total grassland area in Wallonia (Table 1). Then, the model induced an overall underestimation of the SOC density and consequently SOC stocks for grassland of Wallonia, whereas it is not the case for cropland.

21

ACCEPTED MANUSCRIPT Lettens et al. (2005b) measured mean SOC density values of 48 Mg C ha-1 in cropland and 75 Mg C ha1

in grassland for Wallonia which appeared lower than the observed and predicted values calculated

here (Table 5). Lettens et al. (2005b) used different sampling methods and estimated the soil bulk density through a PTF. At the whole Belgian territory scale, Meersmans et al. (2011) predicted SOC density values of 49.5 Mg C ha-1 (SD 0.90) for cropland and 79.9 Mg C ha-1 (SD 1.0) for grassland using

PT

a multiple linear regression. These results for Belgium are non-significantly different from the predictions made here for the southern part of the country (reminder: 54.7 Mg C ha-1 (SD 4.9) in

RI

cropland and 80.9 Mg C ha-1 (SD 14.1) for grassland). However, the predictions for entire Belgium

SC

(Meersmans et al., 2011) showed lower uncertainties than the predictions produced here for

NU

Wallonia (Table 5). We considered here the uncertainties associated to the SOC density calculation at the pixel-area related scale (40x40m) and the uncertainties associated with the estimation of the

MA

model parameters, while only uncertainties associated to the model parameters were considered for the study consecrated to the whole Belgium. In France, mean SOC density values of 55.7 Mg C ha-1

D

and 85.8 Mg C ha-1 were predicted for cropland and grassland respectively (Meersmans et al., 2012),

PT E

which are similar to our predictions for Wallonia. Similarly, SOC stocks were estimated for France where the relative model error (i.e. model error divided by predicted value) on the total national SOC stock was 33.9% (Meersmans et al., 2012), whereas here the relative model error was 8.9% for

CE

cropland and 17.4% for grassland. These differences could be explained by the fact that for France

AC

the model included not only cropland and grassland but also other land uses as vineyards and forests, and by the different methods used to compute the SOC stocks and associated uncertainties. Meersmans et al. (2012) also used a MC simulation but first perturbed the parameters of the model estimating SOC contents, and then the bulk density (as estimated by a PTF) and rock matter content were perturbed to finally obtain SOC stocks. The combination of DSM technique and Monte-Carlo simulations led to the computation of reliable statistics for SOC stocks for each of the 20 combinations of land use / agricultural regions, including combinations which had no or very few sampling sites amongst the CARBOSOL SMN. Hence, the 22

ACCEPTED MANUSCRIPT modelling procedure developed here allows to comply with specific requirements for reporting activities (e.g. NIR).

4. Conclusion A methodology combining Digital Soil Mapping technique and stochastic simulations (Monte-Carlo)

PT

was developed to both quantify SOC stocks (i.e., How much?) and spatialize SOC density (i.e., Where?) for southern Belgium based on a convenience sampling scheme. A Generalised Additive

RI

Model (GAM) explaining 67.4% of the SOC density variance was fitted and then validated (R² = 0.64).

SC

During the model fitting procedure, the landuse (cropland or grassland), the LS-factor, the maximum ground water level, the clay and fine silt content and the temperature were selected as the most

NU

significant covariates explaining SOC density variation at the regional scale. A positive gradient of

MA

SOC density occurred from the northwest to the center of Wallonia with a slight decrease on the southernmost part. This regional trend is mainly correlated with the evolution precipitation and

D

temperature (along with elevation) and dominant land use. At the catchment scale higher SOC

PT E

density were predicted on poorly drained soils of valley bottoms, especially under grassland. A better performance for SOC density spatial prediction could not have been achieved given the sampling density of the SMN CARBOSOL that is too coarse to detect the field scale variability. Two successive

CE

Monte-Carlo simulations produced 10,000 spatialized datasets for SOC density all over Wallonia and

AC

allowed to consider uncertainties due to SOC density measurement and those associated with the GAM model (i.e., uncertainties of the parameters). Based on these 10,000 independent datasets, reliable statistics for SOC stocks were computed. Mean predicted SOC stocks for cropland and grassland in Wallonia were of 25.93 Tg C (SD 2.30) and 46.16 Tg C (8.02), respectively. The model induced an underestimation of the SOC density and consequently SOC stocks for grassland of Wallonia, whereas it was not the case for cropland. The procedure developed here allowed to predict realistic spatial patterns of SOC density all over agricultural lands of southern Belgium (Where?) and to produce reliable statistics of SOC stocks for 23

ACCEPTED MANUSCRIPT each of the 20 combinations of land use / agricultural regions of Wallonia (How much ?). This procedure appears useful to produce soil maps as policy tools in conducting sustainable management at regional and national scales, and to produce statistics which comply with specific requirements of reporting activities.

PT

Acknowledgement

SC

RI

This research was funding by the Public Administration of Wallonia (SPW-DGO3).

References

NU

Akaike, H., 1974. A new look at the statistical model identification. IEEE Transactions on Automatic

MA

Control 19, 716–723.

Bloem, J., Schouten, J. A., Sorensen, S. J., Rutgers, M., van der Werf, A., Breure, A. M., 2006.

D

Monitoring and evaluating soil quality, in: Bloem, J., Hopkins, D.W., Benedetti, A. (Eds),

PT E

Microbiological methods for assessing soil quality. CABI Publishing, 23-49. Boer, E.P.J., de Beurs, K.M., Hartkamp, A.D. 2001. Kriging and thin plate splines for mapping climate

CE

variables. Int. J. Appl. Earth Obs. Geoinf. 3(2), 146-154.

AC

Brus, D.J., de Gruijter, J.J., 1993. Design-based versus model-based estimates of spatial means. Theory and application in environmental soil science. Environmetrics 4, 123–152. de Brogniez, D., Ballabio, C., Stevens, A., Jones, R.J.A., Montanarella, L., van Wesemael, B., 2014. Eur. J. Soil Sci. 66(1), 121-134. De Leenheer, L., Appelmans, F., and Vandamme, J. (1968). Système des cartes perforées de la section caractérisation du sol" de la cartographie des sols de Belgique. Pédologie 18, 208-227.

24

ACCEPTED MANUSCRIPT de Gruijter, J.J., Brus, D.J., Bierkens, M.F.P., Knotters, M., 2006. Sampling for Natural Resource Monitoring. Springer-Verlag, New York. Dendoncker, N., van Wesemael, B., Rounsevell, M., Roelandt, C., 2004. Belgium’s CO2 mitigation potential under improved cropland management. Agric. Ecosystems Environ. 103, 101–116.

PT

DGATLP, 2001. Limites administratives – Direction générale de l’Aménagement du Territoire, du

RI

Logement et du Patrimoine de la Région Wallonne.

SC

ESRI 2011. ArcGIS Desktop: Release 10. Redlands, CA: Environmental Systems Research Institute. FAO, 2015. International Year of Soil. http://www.fao.org/soils-2015/

NU

Goidts, E., van Wesemael, B., Crucifix, M., 2009. Magnitude and sources of uncertainties in soil

MA

organic carbon (SOC) stock assessments at various scales. Eur. J. Soil Sci. 60(5), 723-739. Goidts, E., van Wesemael, B., 2007. Regional assessment of soil organic carbon changes under

D

agriculture in Southern Belgium (1955-2005). Geoderma 141 (3), 341-354.

PT E

Hallett, P.D., Loades, K.W. & Krümmelbein, J. 2012. Soil physical degradation: threats and opportunities to food security, in: Soils and Food Security, Hester, R.E. and Harrison, R.M. (Eds). Royal

CE

Society of Chemistry, Cambridge, pp. 198–226.

AC

Hammersley, J. M., Handscomb, D. C. 1964. Monte Carlo methods. Methuen. Hartigan, J.A., Hartigan, P.M., 1985. The dip test of unimodality. The Annals of Statistics 13(1), 70-84. Hastie, T., Tibshirani R., 1986. Generalized additive models. Statistical science 1(3), 297-310. Hassink, J., 1997. The capacity of soils to preserve organic C and N by their association with clay and silt particles. Plant Soil 191, 77–87.

25

ACCEPTED MANUSCRIPT Jenness, J., 2006. Topographic Position Index (tpi_jen. avx) extension for ArcView 3. x, v. 1.3 a. Jenness Enterprises. Jolivet, C., Arrouays, D., Boulonne, L., Ratié, C., Saby, N., 2006. Le réseau de mesures de la qualité des sols de France (RMQS). État d’avancement et premiers résultats. Étude et Gestion des Sols 13 (3),

PT

149-164. Jones, R.J.A., Hiederer, R., Rusco, E. & Montanarella, L., 2005. Estimating organic carbon in the soils

RI

of Europe for policy support. Eur. J. Soil Sci. 56, 655–671.

SC

Kidd, D., Malone, B., McBratney, A., Minasny, B., Web, M., 2015. Operational sampling challenges to

NU

digital soil mapping in Tasmania, Australia. Geoderma Regional 4, 1-10. Kirschbaum, M.U.F, 1995. The temperature dependence of soil organic matter decomposition and

MA

the effect of global warming on soil organic C storage. Soil Biology and Biochemistry 27, 753–760

D

Lagacherie, P., 2008. Digital soil mapping: a state of the art, in: Hartemink, A.E., McBratney, A.,

PT E

Mindonça-Santos, M.L., Digital soil mapping with limited data. Springer, Netherlands, pp. 3-14. Lettens, S., Van Orshoven, J., van Wesemael, B., De Vos, B., Muys, B., 2005a. Stocks and fluxes of soil

CE

organic carbon for landscape units in Belgium derived from heterogeneous data sets for 1990 and 2000. Geoderma 127 (1-2), 11–23.

AC

Lettens, S., Van Orshoven J., Van Wesemael, B., Muys, B., Perrin, D., 2005b. Soil organic carbon changes in landscape units of Belgium between 1960 and 2000 with reference to 1990. Glob. Chang. Biol. 11(12), 2128-2140. McBratney, A.B., Mendonça Santos, M.L., Minasny, B. 2003. On digital soil mapping. Geoderma 117 (1-2), 3-52.

26

ACCEPTED MANUSCRIPT Meersmans, J., De Ridder, F., Canters, F., De Baets, S., Van Molle, M., 2008. A multiple regression approach to assess the spatial distribution of soil organic carbon (SOC) at the regional scale (Flanders, Belgium). Geoderma 143(1-2), 1-13. Meersmans, J., van Wesemael, B., Goidts, E., van Molle, M., De Baets, S., De Ridder, F., 2011. Spatial analysis of soil organic carbon evolution in Belgian croplands and grasslands, 1960-2006. Glob.

PT

Chang. Biol. 17, 466-479.

RI

Meersmans, J., Martin, P.M., Lacarce, E., De Baets, S., Jolivet, C., Boulonne, L., Lehmann, S., Saby,

SC

N.P.A., Bispo, A., Arrouays, D., 2012. A high resolution map of French soil organic carbon. Agron.

NU

Sutain. Dev. 32, 841-851.

Minasny, B., McBratney, A.B. 2006. A conditioned Latin hypercube method for sampling in the

MA

presence of ancillary information. Computers & Geosciences 32, 1378–1388. Morvan, X., Saby, N.P.A., Arrouays, D., Le Bas, C., Jones, R.J.A., Verheijen, F.G.A. et al. 2008. Soil

PT E

Environ.ent 391, 1–12.

D

monitoring in Europe: a review of existing systems and requirements for harmonisation. Sci. Total

Østergaard, H.S. 1989. Analytical methods for optimization of nitrogen fertilization in agriculture, in:

CE

Germon, J.C. (Ed.), Management Methods to Reduce Impact of Nitrates. Elsevier, London, pp. 224–

AC

234.

Paustian, K., Lehmann, J., Ogle, S., Reay, D., Robertson, G. P., Smith P., 2016. Climate-smart soils. Nature 532, 49-57.

Pebesma, E.J., 2006. The Role of External Variables and GIS Databases in Geostatistical Analysis. Trans. GIS 10(4), 615–632. Phillips, J. D. 1990. A saturation-based model of relative wetness for wetland identification. Water Resources Bull. 26, 333-342. 27

ACCEPTED MANUSCRIPT Poggio, L., Gimona,A., Brewer,M.J., 2013. Regional scale mapping of soil properties and their uncertainty with a large number of satellite-derived covariates. Geoderma 209–210, 1–14. Renard, K.G., Foster, G.R., Weesies, G.A., Porter, J.P., 1991. RUSLE: Revised Universal Soil Loss equation. J. Soil Water Conserv. 46, 30-33.

PT

Rubinstein, R. Y., 1981. Simulation and the Monte Carlo method. John Wiley & Sons, New York.

RI

Robinson, D.A., Emmett, B.A., Reynolds, B., Rowe, E.C., Spurgeon, D., Keith, A.M. et al., 2012. Soil natural capital and ecosystem service delivery in a world of global soil change, in: Hester, R.E.,

SC

Harrison, R.M. (Eds.), Soils and Food Security. Royal Society of Chemistry, Cambridge, pp. 41–68..

NU

Schulp, C.J.E., Verburg, P.H., 2009. Effect of land use history and site factors on spatial variation of

MA

soil organic carbon across a physiographic region. Agric. Ecosyst. Environ. 133 (1–2), 86–97. Schulp, C.J.E., Verburg, P.H., Kuikman, P.J., Nabuurs, G.-J., Olivier, J.G.J., de Vries, W., Veldkamp, T.,

PT E

Environ. Manag. 51, 709–723.

D

2013. Improving national-scale carbon stock inventories using knowledge on land use history.

Stevens, F., Bogaert , P., van Wesemael, B., 2015a. Spatial filtering of a legacy dataset to characterize

CE

relationships between soil organic carbon and soil texture, Geoderma 237-238, 224-236. Stevens, F., Bogaert, P., van Wesemael, B., 2015b. Detecting and quantifying field-related spatial

93-103.

AC

variation of soil organic carbon using mixed-effect models and airborne imagery, Geoderma 259-260,

Trigalet, S., van Oost, K., Roisin, C. van Wesemael, B., 2014. Carbon associated with clay and fine silt as indicator for SOC decadal evolution under different residue management practices. Agric. Ecosyst. Environ. 196, 1-9. Van Orshoven, J., Maes, J., Vereecken, H., Feyen, J., and Didal, R. 1988. A structured database of Belgian soil prole data. Pédologie 38, 191-206. 28

ACCEPTED MANUSCRIPT Walkley, A., Black, I.A., 1934. An examination of Degtjareff method for determining soil organic matter and a proposed modification of the chromic acid titration method. Soil Sci. 37, 29-37. Wang, G., Zhou, Y., Xu, X., Ruan, H., Wang, J., 2013. Temperature sensitivity of soil organic carbon mineralization along an elevation gradient in the Wuyi Mountains, China. PloS one 8(1), e53914.

PT

Wang, J.-F., Stein, A., Gao, B.-B., Ge, Y., 2012. A review of spatial sampling. Spatial Statistics 2, 1-14.

RI

Wiesmeier, M., Barthold, F., Spörlein, P., Geuβ, U., Hangen, E., Reischl, A., Schilling, B., Angst, G., von Lützow, M., Kögel-Knabner, I., 2014. Estimation of total organic carbon storage and its driving factors

SC

in soils of Bavaria (southeast Germany). Geoderma Regional 1, 67-78.

NU

Wood, S.N., 2001. mgcv: GAMs and generalized ridge regression for r. R news 1(2), 20-25.

MA

Wood, S.N., 2006. Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press.

AC

CE

PT E

D

Zar, J.H., 1999. Biostatistical Analysis. Prentice Hall, New Jersey.

29

Fig. 1

AC

CE

PT E

D

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

30

RI

PT

ACCEPTED MANUSCRIPT

AC

CE

PT E

D

MA

NU

SC

Fig. 2

31

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

AC

CE

PT E

D

Fig. 3

32

AC

CE

PT E

D

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

Fig. 4

33

CE AC

Fig. 5

PT E

D

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

34

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

AC

CE

PT E

D

Fig. 6

35

PT E

D

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

AC

CE

Fig. 7

36

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

AC

CE

PT E

D

Fig. 8

37

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

AC

CE

PT E

D

Fig. 9

38

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

AC

CE

PT E

D

Fig. 10

39

AC

CE

PT E

D

MA

NU

SC

RI

PT

ACCEPTED MANUSCRIPT

Fig. 11

40

ACCEPTED MANUSCRIPT Figure 1: Elevation (A) and main land uses (http://cartopro3.wallonie.be/) with sample sites location (B) in southern Belgium (Wallonia) with delineation of the 10 official agricultural regions (1: Sandyloam, 2: Loam, 3: Campine Hennuyère, 4: Condroz, 5: Herbagère Liège, 6: Herbagère Fagnes, 7: Famenne, 8: Ardenne, 9: Jurassic, and 10: Haute Ardenne). Figure 2. Annual precipitation (mm) and mean temperature (°C) of southern Belgium (Wallonia) with delineation of the 10 official agricultural regions (see Fig. 1)

PT

Figure 3. Clay, clay and fine silt (C-fS), silt and sand contents (%) in topsoil of Southern Belgium (Wallonia) with delineation of the 10 official agricultural regions (see Fig. 1). Grey color corresponds to built up areas.

RI

Figure 4. Workflow of the consecutive procedures used to assess the SOC densities and associated uncertainties for agricultural soils of southern Belgium.

NU

SC

Figure 5: Carbon content measured by dry combustion measurement (Cdc; of which inorganic carbon content has been subtracted) plotted against organic carbon content (Cwb*) measured by Walkley and Black method and corrected for incomplete oxidation. The equation used to correct from the incomplete oxidation is displayed above, with Cwb the raw carbon content obtained by Walkley and Black method. (Red line is the 1:1 line).

D

MA

Figure 6: Histogram of SOC stocks measured at the 591 investigated locations (A) and boxplot of SOC density by land use, i.e. cropland and grassland (B). In A, the small bars at the bottom of the graph indicate the presence or absence of samples. In B, the horizontal line in the middle of the boxes refers to the median, while their lower and upper limits are to the first and third quartiles, and the red point represents the mean.

PT E

Figure 7: Q-Q plot of the calibration residuals. Figure 8: Importance of covariates selected in the final GAM model for spatial interpolation of SOC density based on the difference in Akaike Information Criterion (abs(dAIC)).

AC

CE

Figure 9: Smooth functions of LS-factor as produced by the GAM model for cropland and grassland (land use is implemented as a categorical variable in the model). The dashed lines represent the 95% confidence interval. The bar at the bottom of the graph indicates the presence or absence of a sample. Figure 10. Predicted SOC stock plotted against observed SOC stock for the validation dataset (n=196). (red line is the 1:1 line). Figure 11: Topsoil organic carbon density (SOC stock in Mg C ha-1) (A) and relative prediction uncertainties (%; ((Q95-Q05)/SOC)*/100) (B) in Southern Belgium (Wallonia). Grey areas correspond to environments where no prediction of organic carbon stock could be made (e.g. built areas and forests). The agricultural regions are delineated by dark grey lines (1: Sandy-loam, 2: Loam, 3: Campine Hennuyère, 4: Condroz, 5: Herbagère Liège, 6: Herbagère Fagnes, 7: Famenne, 8: Ardenne, 9: Jurassic, and 10: Haute Ardenne).

41

ACCEPTED MANUSCRIPT Table 1: Number of soil samples from SMN CARBOSOL per landuse/agricultural region combinations in Wallonia. (Areas were computed using spatial layers of landuses and agricultural regions; § 2.3).

NU

SC

RI

PT

59 133 127 4 18 23 6 1 371 49 22 40 41 42 4 23 221

MA

Grassland

9.3 53.2 0.1 19.8 2.1 1.0 4.9 5.6 2.7 1.2 100 4.6 15.4 0.1 14.3 14.5 3.1 11.8 22.4 6.9 6.9 100

Sandy-loam (1) Loam (2) Camipne Hennuyère (3) Condroz (4) Herbagère Liège (5) Herbagère Fagnes (6) Famenne (7) Ardenne (8) Jurassic (9) Haute Ardenne (10) All (~ 4745 Km²) Sandy-loam (1) Loam (2) Camipne Hennuyère (3) Condroz (4) Herbagère Liège (5) Herbagère Fagnes (6) Famenne (7) Ardenne (8) Jurassic (9) Haute Ardenne (10) All (~ 5710 Km²)

D

Grassland

number of samples

PT E

Cropland

% of total area per landuse

CE

Cropland

Agricultural Region

AC

Landuse

42

ACCEPTED MANUSCRIPT Table 2: Estimators used as uncertainties for SOC stock calculation considering land use and stoniness at the sampling site scale, i.e. a 4-m radius circle (after Goidts et al., 2009).

Landuse Cropland Grassland

Soil texture

CV* of SOC stock (%)

Silt (loam) Stony loam Silt (loam) Stony loam

8 8 11 17

AC

CE

PT E

D

MA

NU

SC

RI

PT

*CV = Coefficient of Variation

43

ACCEPTED MANUSCRIPT Table 3: Correlation coefficients between SOC density and different environmental covariates.

1.0 0

Y

1.0 0

C+fS (1)

0.4 8

0.4 3

0.5 7 0.6 0

0.2 8 0.3 9

1.0 0

0.7 7

Clays

1.0 0

C-factor

0.1 9 0.0 8 0.3 3 0.1 5 0.4 2 1.0 0

0.3 0 0.3 6

-0.65

0.3 5 0.2 8

0.65

0.2 5

0.1 4

0.78

-0.45

0.38

0.1 9

1.00

0.3 1 1.0 0

PT E

Flow acc.(2)

0.19 0.26

Rai n

Temp. (3)

0.3 6

-0.29

0.6 5 0.6 0.67 0.24 2 0.34 0.33

0.5 2

-0.71

0.71

0.17 0.19

0.3 5

-0.35

0.62 0.29

0.8 7

-0.99

0.2 2

0.3 0 0.4 0.25 0.36 3 0.24 0.13

0.2 7 0.0 0.05 0.06 2 0.5 1.00 0.18 4

-0.64

-0.28

0.42

-0.69

0.2 1 0.1 9 0.0 3

0.2 2

-0.27

0.1 3

1.0 0

-0.86

0.17 0.88

-0.29

0.04

1.00 GWL

(1) Clay + fine Silt content (2) Flow accumulation (3) Temperature

44

GW GW L L ma min x 0.1 0.1 4 4 0.2 0.0 8 5 0.0 0.0 6 4 0.0 3 0.0 8 0.0 9 0.0 4

0.54 0.26

1.00

AC

LS-factor

0.2 3 0.0 9 0.0 2 0.0 5 0.0 6 1.0 0

CE

Flow length

0.2 4

0.1 6 0.3 5

0.30

D

Slope

Temp.(3)

0.3 4

1.0 0

Elevation

Rain

0.1 7 0.2 8

0.1 6 0.0 2 0.0 5

0.33

MA

C-input

0.3 0

Flo Flow LSElevati Slop w leng fact on e acc. th or (2)

PT

X

0.1 9 0.3 9

Silt

Cinp ut

RI

1.00

0.3 9

Y

Cla ys

SC

SOC density

X

C+f S (1)

NU

SOC stock

0.1 3 0.2 1 1.0 0

0.0 4

0.0 0 0.0 5 0.0 7 0.0 5 0.1 0 0.0 5 0.0 1 0.1 0 1.0 0

0.0 3 0.0 1 0.0 4

ACCEPTED MANUSCRIPT

Dataset for

Calibration

Validation

n

396

196

min

28.6

31.3

Q25

50.6

50.6

60.6

60.5

68.7

70.4

Q75

84.9

84.8

Max

163.8

161.5

sd

24.3

25.6

AC

CE

PT E

D

MA

NU

SC

RI

Q50 mean

PT

Table 4: Summary descriptive statistics of measured SOC density (Mg C ha-1) in calibration and validation datasets.

45

ACCEPTED MANUSCRIPT Table 5. Mean observed and predicted SOC stocks (Mg C ha-1) and predicted total stocks (Tg C) in cropland and grassland of the ten agricultural regions of Wallonia (§ Fig. 1A).

Sandy-loam

Grassland

51.0 (8.8)

53.5 (1.8)

2.19 (0.07)

(2) 133

52.0 (12.5)

54.5 (1.6)

13.18 (0.39)

-

53.1 (4.8)

0.02 (0.00)

56.2 (9.4)

60.7 (2.2)

5.04 (0.18)

65.02 (8.2)

83.6 (5.1)

0.75 (0.05)

-

Herbagère Fagnes

(6)

-

-

69.5 (11.7)

0.34 (0.06)

Famenne

(7)

18

58.8 (13.7)

71.7 (7.8)

1.59 (0.17)

Ardenne

(8)

23

74.5 (9.1)

88.1 (9.2)

2.16 (0.23)

Jurassic

(9)

6

56.1 (7.5)

67.4 (6.6)

0.81 (0.08)

Haute Ardenne

(10)

1

87.7 (-)

107.0 (16.5)

0.50 (0.08)

371

55.3 (12.1)

59.9 (2.8)

26.58 (1.52)

-

86.4 (7.6)

1.76 (0.15)

SC

All

PT

Herbagère Liège

RI

(3)

(4) 127 (5) 4

(1)

-

Loam Camipne Hennuyère Condroz

(2)

49

90.3 (25.2)

90.5 (4.9)

6.26 (0.34)

-

-

94.0 (50.5)

0.04 (0.02)

(3) (4)

22

Herbagère Liège

(5)

40

Herbagère Fagnes

(6)

-

Famenne

(7)

Ardenne

(8)

Jurassic

(9)

84.4 (12.5)

87.1 (4.4)

5.28 (0.27)

109.0 (21.0)

103.4 (5.5)

7.32 (0.39)

-

103.9 (20.4)

1.66 (0.33)

41

99.4 (24.3)

94.9 (15.3)

5.63 (0.91)

42

86.0 (15.1)

86.7 (5.1)

9.43 (0.55)

68.3 (10.4)

79.1 (9.4)

2.76 (0.33)

90.0 (16.3)

91.5 (6.5)

3.16 (0.22)

93.5 (22.2)

92.3 (6.2)

43.30 (2.93)

4

(10) 23 221

AC

CE

All

NU

Sandy-loam

Haute Ardenne Grassland

59

MA

Cropland

Mean (SD) total SOC stocks - Tg C

PT E

Cropland

Mean (SD) SOC stocks Mg C / ha

n (1)

Loam Camipne Hennuyère Condroz

Predicted

Observed Mean (SD) SOC density Mg C / ha

Agricultural Region

D

Landuse

46

ACCEPTED MANUSCRIPT Table 6. Mean predicted SOC stocks (Mg C ha-1) and predicted total stocks (Tg C) in cropland and grassland for the different types of soils (according to WRB) of Wallonia.

Agricultural Region Eutric Fluvisol

49.0 (2.7)

1.10 (0.06)

38.66

57.7 (1.6)

20.45 (0.55)

Dystric Cambisol

3.64

91.6 (9.0)

3.06 (0.30)

Gleyic Luvisol

0.75

70.0 (5.7)

0.48 (0.04)

Haplic Arenosol

0.31

51.7 (4.5)

0.15 (0.01)

Dystric Histosol

< 0.01

96.4 (38.8)

< 0.01 (< 0.001)

Distric Gleysol

< 0.01

87.4 (39.6)

Calcaric Luvisol

< 0.01

74.1 (11.0)

Eutric Cambisol

0.07

84.3 (6.7)

0.06 (< 0.01)

Arenic Luvisol

0.41

61.3 (7.2)

0.23 (0.03)

Calcaric Cambisol

0.01

71.7 (9.0)

< 0.01 (< 0.001)

Eutric Fluvisol

1.54

Haplic Retisol

0.83

Haplic Luvisol

23.31

Dystric Cambisol

20.02

Gleyic Luvisol

4.54

Haplic Arenosol

0.27

Dystric Histosol Distric Gleysol Eutric Cambisol Arenic Luvisol

Calcaric Cambisol

RI

< 0.01 (< 0.001)

NU

SC

< 0.01 (< 0.001)

1.28 (0.08)

86.9 (9.6)

0.66 (0.07)

91.7 (3.5)

19.61 (0.76)

89.4 (3.8)

16.43 (0.69)

100.8 (6.1)

4.20 (0.25)

90.4 (36.2)

0.23 (0.09)

0.13

103.2 (13.0)

0.12 (0.01)

< 0.01

95.5 (14.1)

< 0.01 (< 0.001)

0.01

77.7 (15.4)

< 0.01 (0.001)

0.24

80.2 (12.8)

0.18 (0.03)

0.78

70.9 (20.7)

0.51 (0.15)

0.02

77.6 (14.6)

0.01 (< 0.01)

MA

91.1 (5.9)

PT E

Calcaric Luvisol

PT

2.46

Haplic Luvisol

CE

Grassland

Haplic Retisol

AC

Cropland

2.01

Predicted Mean (SD) SOC Mean (SD) total stocks Mg C / ha SOC stocks - Tg C 55.9 (2.1) 1.03 (0.04)

D

Landuse

% of total area per landuse

47