Research Journal of Biological Sciences

Year: 2009
Volume: 4
Issue: 1
Page No. 93 - 102

Evaluation and Comparison of Ordinary Kriging and Inverse Distance Weighting Methods for Prediction of Spatial Variability of Some Soil Chemical Parameters

Authors : Jafar Yasrebi , Mahboub Saffari , Hamed Fathi , Najafali Karimian , Masome Moazallahi and Reza Gazni

Abstract: Analysis and interpretation of the spatial variability of soil properties is a keystone of site-specific management. The objective of this study was to determine the degree of spatial variability of soil chemical properties with the Ordinary Kriging (OK) and Inverse Distance Weighting (IDW) methods. Spatial distributions of 6 soil chemical properties were examined in a fallow land in Bajgah, Fars province, Iran. Soil samples were collected on an approximately 60×60 m grid at 0-30 cm depth and the coordinates of each of the 100 points were recorded with GPS. Kriging and inverse distance weighting are two commonly used techniques for characterizing this spatial variability and interpolating between sampled points. Data were interpolated with OK and with IDW using powers of 1-5. All studied soil chemical parameters were strongly spatially dependent, but the range of spatial dependence varied among the soil parameters. Phosphorus had the shortest range of spatial dependence (49.50 m) and pH had the longest (109.50 m). The accuracy of OK predictions was generally unaffected by the coefficient of variation. We concluded that, for all soil chemical properties, OK performed much better than the five IDW procedures in this study.

How to cite this article:

Jafar Yasrebi , Mahboub Saffari , Hamed Fathi , Najafali Karimian , Masome Moazallahi and Reza Gazni , 2009. Evaluation and Comparison of Ordinary Kriging and Inverse Distance Weighting Methods for Prediction of Spatial Variability of Some Soil Chemical Parameters. Research Journal of Biological Sciences, 4: 93-102.

INTRODUCTION

Site-specific management has received considerable attention because of its three main potential benefits: increasing input efficiency, improving the economic margins of crop production and reducing environmental risks. Uniform management of crops grown under spatially variable conditions can result in less than optimum yields due to nutrient deficiencies, as well as in excessive fertilizer application that may reduce environmental quality (Redulla et al., 1996). Geostatistical methods can provide reliable estimates at unsampled locations, provided that the sampling interval resolves the variation at the level of interest (Kerry and Oliver, 2004). Spatial prediction techniques, also known as spatial interpolation techniques, differ from classical modeling approaches in that they incorporate information on the geographic position of the sample data points (Cressie, 1993). The most common interpolation techniques calculate the estimate of a property at any given location as a weighted average of nearby data. Weights are assigned according to either deterministic or statistical criteria. A number of factors affect map quality, including the nature of the soil variability (Sadler et al., 1998), the intensity of sampling and the method of interpolation. The variety of available interpolation methods has led to questions about which is most appropriate in different contexts and has stimulated several comparative studies of relative accuracy. Among statistical methods, geostatistical kriging-based techniques, including simple and ordinary kriging, universal kriging and simple cokriging, have often been used for spatial analysis (Deutsch, 2002). Among deterministic interpolation methods, the inverse distance weighting method and its modifications (Nalder and Wein, 1998) are the most often applied. Kriging and IDW are the most commonly used methods in agricultural practice (Franzen and Peck, 1995; Weisz et al., 1995). 
Kriging requires a preliminary step of modeling the variance-distance relationship, whereas IDW requires no such step and is very simple and quick. Both methods estimate values at unsampled locations based on the measurements at surrounding locations, with a weight assigned to each measurement. Creutin and Obled (1982) and Tabios and Salas (1985) compared kriging with several other interpolation techniques, including IDW, for annual precipitation distributions and found kriging to be superior to IDW. Many studies have compared IDW and kriging. In some cases, the performance of kriging was generally better than that of IDW (Hosseini et al., 1994; Dalthorp et al., 1999; Kravchenko and Bullock, 1999; Kravchenko, 2003; Reinstorf et al., 2005). Warrick et al. (1988) also reported kriging to be better than inverse distance weighting for mapping potato yield and soil properties such as percent of sand, Ca content and infiltration rate. In other studies, IDW generally out-performed kriging (Weisz et al., 1995; Nalder and Wein, 1998). Gotway et al. (1996) observed the best results in mapping soil organic matter contents and soil NO3- levels for several fields when IDW was used as the interpolation technique. Often, however, the results have been mixed (Schloeder et al., 2001; Mueller et al., 2001; Lapen and Hayhoe, 2003). Kriging performance can be significantly affected by the variability and spatial structure of the data (Leenaers et al., 1990) and by the choice of variogram model, search radius and number of closest neighboring points used for estimation. As might be expected, the performance of kriging improved relative to IDW when the spatial structure was known. The objective of this study was to describe and compare the relative performance of Ordinary Kriging (OK) and Inverse Distance Weighting (IDW) and to assess the quality of the resulting maps of some soil fertility indicators at field scale in Bajgah, Fars province, Iran.

MATERIALS AND METHODS

Study area, sampling design and laboratory analysis: The study was conducted in a fallow land in Bajgah (N 29°36', E 52°32'), about 15 km northeast of Shiraz, Fars province, Iran (Fig. 1). According to the USDA Soil Taxonomy (Soil Survey Staff, 2006), the soil of the study region was classified as Fine, mixed, mesic, Typic Calcixerepts. One hundred soil samples were collected (September 2007) from the cross-line nodes of an approximately 60×60 m grid at 0-30 cm depth (Fig. 1) and the coordinates of each sampling point were recorded with GPS. The soil samples were taken to the laboratory and passed through a 2 mm sieve. Available Phosphorus (P) was determined by the Olsen method (1982); available potassium (K) was determined by extraction with ammonium acetate (Richards, 1954); Total Nitrogen (TN) was determined using the Kjeldahl method (Bremner, 1996); Organic Matter (OM) content was determined using the oxidation method (Walkley and Black, 1934); pH was determined in saturated paste; Electrical Conductivity (ECe) was determined with a conductivity meter.

Description of prediction methods: Statistical analyses were done in 3 stages. First, the frequency distributions were analyzed and normality was tested using the Kolmogorov-Smirnov test (SAS, 1999). Secondly, the distribution of the data was described using conventional statistics such as mean, maximum, minimum, median, Standard Deviation (SD), Coefficient of Variation (CV), skewness and kurtosis. These analyses were conducted using the STATISTICA software package (StatSoft Inc, 2004). Thirdly, geostatistical analysis was performed using GS+ 9 (Gamma Design Software, 2008) to determine the spatial dependency of the soil properties. Isotropic semivariograms for the soil parameters were computed to determine any spatially dependent variance within the field.


Fig. 1: Location of the study area and sampling pattern in the 46.7 ha area

A semivariogram was calculated for each soil property as follows (Isaaks and Srivastava, 1989; Journel and Huijbregts, 1978):

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ z(x_i) - z(x_i + h) \right]^2 \qquad (1)$$

where:

$\gamma(h)$ = The experimental semivariogram value at distance interval h.
$N(h)$ = Number of sample pairs within the distance interval h.
$z(x_i)$, $z(x_i + h)$ = Sample values at two points separated by the distance interval h.
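As an illustration of how an experimental semivariogram of this kind can be computed, the following is a minimal sketch in Python. It is not the GS+ implementation used in the study; the function name, the `lag_width` and `max_lag` parameters and the way pairs are binned into lag classes are assumptions made for the example.

```python
import math
from collections import defaultdict


def experimental_semivariogram(coords, values, lag_width, max_lag):
    """Isotropic experimental semivariogram.

    Returns {lag_class: (mean_pair_distance, gamma, n_pairs)}, where lag
    class k collects pairs with k*lag_width <= distance < (k+1)*lag_width.
    The binning rule is an assumption of this sketch.
    """
    sq_diff = defaultdict(float)   # sum of squared differences per class
    dist_sum = defaultdict(float)  # sum of pair distances per class
    n_pairs = defaultdict(int)     # N(h): number of pairs per class

    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            if h > max_lag:
                continue  # pair lies beyond the active lag distance
            k = int(h // lag_width)
            sq_diff[k] += (values[i] - values[j]) ** 2
            dist_sum[k] += h
            n_pairs[k] += 1

    # gamma(h) = sum of squared differences / (2 * N(h))
    return {
        k: (dist_sum[k] / n_pairs[k], sq_diff[k] / (2 * n_pairs[k]), n_pairs[k])
        for k in sorted(n_pairs)
    }
```

A model (spherical, exponential or Gaussian) would then be fitted to the returned distance and semivariance pairs; that fitting step is not shown here.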

Experimental semivariograms were fitted separately with candidate models (i.e., exponential, spherical and Gaussian) and the best-fitting model was selected. Using the model semivariogram, basic spatial parameters such as nugget variance (Co), structural variance (C), range (A) and sill (C + Co) were calculated. The nugget variance is the variance at zero distance, the sill is the semivariance value at which the semivariogram levels off and the range is the lag distance beyond which values of a variable become spatially independent of one another, so that one value no longer influences neighboring values (Lopez-Granadoz et al., 2002). Different classes of spatial dependence for the soil variables were evaluated by the ratio between the nugget semivariance and the total semivariance (Cambardella et al., 1994). For a ratio <25%, the variable was considered to be strongly spatially dependent, or strongly distributed in patches; for a ratio between 25 and 75%, the soil variable was considered to be moderately spatially dependent; for a ratio >75%, the soil variable was considered weakly spatially dependent; and for a ratio of 100%, or if the slope of the semivariogram was close to zero, the soil variable was considered non-spatially correlated (pure nugget). In the process of calculating the experimental semivariograms, the active lag distance and the lag class distance interval were changed until the smallest nugget variances in the best model semivariograms were achieved (Mapa and Kumaragamage, 1996).
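The classification by nugget-to-sill ratio described above can be sketched as a small helper. The function name is hypothetical, and the treatment of values falling exactly on the class boundaries (25 and 75%) is an assumption of this sketch, since the text does not assign them explicitly.

```python
def spatial_dependence_class(nugget, sill):
    """Classify spatial dependence from the nugget/sill ratio (in %).

    sill is the total semivariance (Co + C). Boundary handling
    (exactly 25% -> strong, exactly 75% -> moderate) is an assumption.
    """
    if sill <= 0:
        raise ValueError("sill must be positive")
    ratio = 100.0 * nugget / sill
    if ratio >= 100.0:
        label = "pure nugget"
    elif ratio <= 25.0:
        label = "strong"
    elif ratio <= 75.0:
        label = "moderate"
    else:
        label = "weak"
    return ratio, label
```

For example, a fitted model with a nugget of 2 and a sill of 4 (hypothetical values) has a ratio of 50% and would be classed as moderately spatially dependent.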

Ordinary Kriging (OK): Ordinary kriging is one of the most basic kriging methods. It provides an estimate of variable z at an unobserved location, based on the weighted average of adjacent observed sites within a given area. The theory is derived from that of regionalized variables (Matheron, 1965, 1971) and can be briefly described by considering an intrinsic random function denoted by $z(s_i)$, where $s_i$ represents the sample locations, i = 1, 2,…, n. The ordinary kriging predictor at an unsampled site $s_0$ is the weighted average defined by:

$$\hat{z}(s_0) = \sum_{i=1}^{n} \lambda_i \, z(s_i) \qquad (2)$$

where, $\lambda_i$ are the weights assigned to each of the observed samples. These weights sum to unity so that the predictor provides an unbiased estimation:

$$\sum_{i=1}^{n} \lambda_i = 1 \qquad (3)$$

The weights are calculated from the matrix equation:

$$\mathbf{c} = \mathbf{A}^{-1} \mathbf{b}$$

where:

A = A matrix of semivariances between the data points.
b = A vector of estimated semivariances between the data points and the point at which the variable z is to be predicted.
c = The resulting weights.
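To make the matrix equation concrete, here is a minimal ordinary kriging sketch using NumPy. It is an illustration only, not the software used in the study. It assumes a spherical semivariogram model with hypothetical parameters, uses all data points as neighbors and enforces the sum-to-one constraint in the usual way, by augmenting A and b with a row and column of ones for a Lagrange multiplier (a standard formulation that the text does not spell out). The function names are made up for the example.

```python
import numpy as np


def spherical(h, nugget, sill, rng):
    """Spherical semivariogram model, with gamma(0) = 0 by convention."""
    h = np.asarray(h, dtype=float)
    c = sill - nugget  # structural variance
    g = np.where(h < rng,
                 nugget + c * (1.5 * h / rng - 0.5 * (h / rng) ** 3),
                 sill)
    return np.where(h == 0, 0.0, g)


def ok_predict(coords, values, target, nugget, sill, rng):
    """Ordinary kriging estimate at `target`; returns (estimate, weights)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)

    # A: semivariances between data points, augmented for the constraint
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical(d, nugget, sill, rng)
    A[n, n] = 0.0

    # b: semivariances between data points and the prediction point
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    b = np.ones(n + 1)
    b[:n] = spherical(d0, nugget, sill, rng)

    sol = np.linalg.solve(A, b)  # c = A^-1 b, solved without forming A^-1
    weights = sol[:n]            # last entry is the Lagrange multiplier
    return float(weights @ values), weights
```

With a zero nugget, the sketch reproduces the measured value when the prediction point coincides with a sample, which is a quick sanity check on the weights.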

Inverse Distance Weighting (IDW): All interpolation methods are based on the premise that points closer to each other are more correlated and more similar than points farther apart. The IDW method assumes that the correlation and similarity between neighbors decrease with the distance between them, so the weight of each neighboring point is defined as an inverse function of its distance from the point being estimated. The choice of the neighborhood radius and of the power applied to the inverse distance function are the important issues in this method. The method is suitable when there are enough sample points (at least 14) with a suitable dispersion at the local scale. The main factor affecting the accuracy of the inverse distance interpolator is the value of the power parameter p (Isaaks and Srivastava, 1989). In this study, we compared estimates of the inverse distance interpolator using the integer power parameters 1, 2, 3, 4 and 5, which are the most commonly used in the literature (Kravchenko and Bullock, 1999). Since the goal of using inverse distance functions as estimators is to give more weight (importance) to the closest sampled points (Webster and Oliver, 2001), we considered only integer values of p of 1 or greater, because values lower than 1 give estimates close to a simple average (Isaaks and Srivastava, 1989). In addition, the size of the neighborhood and the number of neighbors are also relevant to the accuracy of the results.

$$Z_0 = \frac{\sum_{i=1}^{n} Z_i \, d_i^{-N}}{\sum_{i=1}^{n} d_i^{-N}} \qquad (4)$$

where:

$Z_0$ = The estimated value of variable z at the prediction point.
$Z_i$ = The sample value at point i.
$d_i$ = The distance from sample point i to the estimated point.
N = The coefficient (power) that determines the weight based on distance.
n = The number of sample points used for the prediction.
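The inverse distance estimator can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name and the optional `max_neighbors` argument are hypothetical, no search radius is applied and a prediction point that coincides with a sample simply returns that sample's value.

```python
import math


def idw_predict(coords, values, target, power=2, max_neighbors=None):
    """Inverse distance weighted estimate at `target`.

    Each sample is weighted by distance ** -power; a larger power gives
    more weight to the closest samples.
    """
    # (distance, value) pairs, nearest first
    pairs = sorted((math.dist(c, target), v) for c, v in zip(coords, values))
    if max_neighbors is not None:
        pairs = pairs[:max_neighbors]
    if pairs[0][0] == 0.0:
        return pairs[0][1]  # target coincides with a sample point
    weights = [d ** -power for d, _ in pairs]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, pairs))
    return weighted_sum / sum(weights)
```

Raising the power from 1 to 4 in this sketch pulls the estimate toward the value of the nearest sample, which is the behavior the power parameter is meant to control.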

Forecasting evaluation methods: The performance of each interpolation technique, in terms of the accuracy of estimates, was assessed by comparing the deviation of estimates from the measured data through the use of a jackknifing technique or cross-validation (Isaaks and Srivastava, 1989; Webster and Oliver, 2001). In such a procedure, sample values are deleted from the data set one at a time and each deleted value is then interpolated by running the interpolation algorithm on the remaining sample values. This yields a list of estimated values paired with the values measured at the sampled locations. The performance of the interpolation techniques was then compared using the following statistics: the coefficient of determination between measured and estimated values, the Mean Error (ME), the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) (Zar, 1999).
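The leave-one-out procedure described above can be sketched generically. The function below is a hypothetical helper, not the routine of any particular package: it accepts any predictor with the assumed signature `predictor(coords, values, target)`, and the `mean_predictor` shown is only a trivial stand-in for an interpolator such as OK or IDW.

```python
def leave_one_out(coords, values, predictor):
    """Return one cross-validated estimate per sample.

    Each sample is removed in turn and re-estimated from the remaining
    samples using `predictor(coords, values, target)`.
    """
    coords = list(coords)
    values = list(values)
    estimates = []
    for i in range(len(values)):
        rest_coords = coords[:i] + coords[i + 1:]
        rest_values = values[:i] + values[i + 1:]
        estimates.append(predictor(rest_coords, rest_values, coords[i]))
    return estimates


def mean_predictor(coords, values, target):
    """Trivial stand-in predictor: the mean of the remaining samples."""
    return sum(values) / len(values)
```

The returned estimates are paired, in order, with the measured values and can then be passed to the error statistics.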

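The ME is used for determining the degree of bias in the estimates and is calculated with the equation:

$$\mathrm{ME} = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat{Z}(x_i) - Z(x_i) \right] \qquad (5)$$

The MAE provides an absolute measure of the size of the error. MAE is calculated with the equation:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{Z}(x_i) - Z(x_i) \right| \qquad (6)$$

The RMSE provides a measure of the error size that is sensitive to outliers. RMSE values can be calculated with the equation:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ \hat{Z}(x_i) - Z(x_i) \right]^2} \qquad (7)$$

where:

$\hat{Z}(x_i)$ = The predicted values.
$Z(x_i)$ = The measured values.
n = The total number of predictions for each validation case.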
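The three error statistics are straightforward to compute from the paired cross-validation results. The sketch below assumes the error is taken as predicted minus measured, so a positive ME indicates over-estimation; the function name is hypothetical.

```python
import math


def error_statistics(predicted, measured):
    """Return (ME, MAE, RMSE) for paired predicted and measured values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - m for p, m in zip(predicted, measured)]
    n = len(errors)
    me = sum(errors) / n                             # bias
    mae = sum(abs(e) for e in errors) / n            # average error size
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # penalizes large errors
    return me, mae, rmse
```

Because the errors are squared before averaging, a single large error raises the RMSE much more than it raises the MAE.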

The coefficient of determination, R2, between the predicted and the measured values was also used as a measure of performance for each method:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (o_i - t_i)^2}{\sum_{i=1}^{N} (o_i - \bar{o})^2} \qquad (8)$$

where:

$o_i$ and $t_i$ = The observed and predicted values for the ith output, respectively.
$\bar{o}$ = The mean of observed values.
N = The total number of events considered.
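A coefficient of determination built from the symbols defined above (observed values, predicted values and the mean of the observed values) can be sketched as follows. The form used here, one minus the ratio of the residual to the total sum of squares, is an assumption of this sketch; it is not necessarily identical to the squared correlation of a fitted regression line.

```python
def coefficient_of_determination(observed, predicted):
    """R2 = 1 - sum((o - t)^2) / sum((o - mean(o))^2) (assumed form)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - t) ** 2 for o, t in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot
```

Under this form, perfect predictions give a value of 1 and predicting the observed mean everywhere gives 0.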

Finally, the Relative Improvement (RI) of the best method compared with the others was calculated with the equation:

(9)

where:

RMSEbest = The minimum value of RMSE.
RMSEcurrent = The RMSE of the current model.

RESULTS AND DISCUSSION

Statistical analysis: The summary statistics of the soil parameters are shown in Table 1. The descriptive statistics suggested that all soil data were normally distributed (according to the Kolmogorov-Smirnov test). The Coefficient of Variation (CV) differed markedly among the variables. The greatest and the smallest variation were observed in total nitrogen (CV = 29.6%) and pH (CV = 1.7%), respectively. Available phosphorus, pH and K had low variation (CV <15%), whereas all other soil parameters exhibited medium variation (CV 15-50%) according to the guidelines provided by Warrick (1998) for the variability of soil properties. In order to identify the possible spatial structure of the different soil properties, semivariograms were calculated and the models that best describe these spatial structures were identified. The spatial variation depicted by the semivariogram models is shown in Table 2. Spherical, Gaussian and exponential models were found to fit the experimental semivariograms well (Fig. 2). The geostatistical analysis revealed different spatial distribution models and spatial dependence levels for the soil properties.


Table 1: Descriptive statistics for selected soil properties

Table 2: Parameters of variogram models for studied soil properties

Spatial ratio = nugget semivariance/total semivariance, where total semivariance = nugget variance (Co) + structural variance (C), i.e., the sill. Spatial class: S = strong spatial dependency

Fig. 2: Omnidirectional semivariograms for soil parameters

As seen in Table 2, the ranges of spatial dependence varied widely (from 49.50 m for P up to 109.50 m for pH). Knowledge of the range of influence for various soil properties allows one to construct independent datasets to perform classical statistical analysis. Furthermore, it aids in determining where to resample if necessary and in the design of future field experiments to avoid spatial dependency. Such large differences between the ranges of different soil variables have already been reported in several studies. Weitz et al. (1993) found that most soil properties had ranges between 30 and 100 m. Doberman (1994) fitted spherical models to variograms with ranges between 80 and 140 m. Cambardella et al. (1994) reported a range of spatial dependence of 80 m for total organic N on a farm in Iowa, USA. In site-specific management it is always advantageous, for practical reasons, to look for a soil property with a greater spatial correlation. Lauzon et al. (2005) observed that the current 100 m sampling grid used in southern Ontario for site-specific P fertilizer management is not reliable because available P lacks spatial correlation at distances >30 m. The different ranges of spatial correlation for nutrients may be related to the mobility of the ions in the soil. In the present study, the spatial distribution of TN appeared to be correlated with that of OM: the variogram ranges of TN and OM were the same in the studied area (Table 2). These results are in accordance with the results of Cahn et al. (1994). A large range indicates that observed values of a soil variable are influenced by other values of this variable over greater distances than is the case for soil variables with smaller ranges (Lopez-Granadoz et al., 2002). 
Thus, the range of 109.5 m for pH indicates that values of this variable influenced neighboring pH values over greater distances than was the case for the other soil variables (Table 2). The interpolation maps of all soil parameters estimated by ordinary kriging can be seen in Fig. 3.


Fig. 3: Map of ordinary kriging interpolated soil properties

Table 3: Results of mean error, mean absolute error, root mean square error and coefficient of determination for different soil chemical properties, using kriging

Table 4: Results of mean error, mean absolute error, root mean square error and coefficient of determination for different soil chemical properties, using IDW with power of 1

Inverse Distance Weighting (IDW): Inverse distance weighting predictions were performed with powers from 1 to 5, and the best result for each power was obtained by trying different search radii and numbers of neighbors. The results of the cross-validation procedure, in terms of the accuracy of estimates (estimation errors), are presented in Tables 4-8. The Mean Error (ME), the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) were generally lower for IDW with a power of 4 than for the other powers. The Relative Improvement (RI) of the interpolation techniques is shown in Table 9. The inverse distance weighting procedure with a power of 4 resulted in a marked reduction in error, as expressed by RI (about 22% for pH, 15% for EC, TN, K and OM and 20% for P), compared with the other IDW powers.


Table 5: Results of mean error, mean absolute error, root mean square error and coefficient of determination for different soil chemical properties, using IDW with power of 2

Table 6: Results of mean error, mean absolute error, root mean square error and coefficient of determination for different soil chemical properties, using IDW with power of 3

Table 7: Results of mean error, mean absolute error, root mean square error and coefficient of determination for different soil chemical properties, using IDW with power of 4

Table 8: Results of mean error, mean absolute error, root mean square error and coefficient of determination for different soil chemical properties, using IDW with power of 5

Fig. 4: Map of IDW (with power of 4) interpolated soil properties

Table 9: Relative improvement (RI) for selected soil properties, using the Ordinary Kriging and IDW (powers of 1-5) interpolation methods

The interpolation maps of all soil properties using IDW with a power of 4 can be seen in Fig. 4. Kravchenko and Bullock (1999) reported a significant improvement in the accuracy of soil properties interpolated using IDW when the exponent value was adjusted. They found that data with high skewness (>2.5) were often best estimated with a power of four (5 out of 8 datasets), whereas for most of the soil properties with low skewness (<1), a power of one yielded the most accurate estimates (9 out of 15 datasets). In contrast, Weber and Englund (1994) reported that IDW with a power of one gave better estimates for data with skewness coefficients of 4-6 and that a larger exponent produced better estimates when the data had low skewness. In this study, the power of four was the best choice in all implementations of IDW, which is possibly due to the relatively low inherent skewness of all modelled soil properties (as also found by Weber and Englund, 1994). Tables 3-9 summarize the results for all prediction methods and were used to select the most accurate one. The results indicated that kriging was the most accurate of the methods applied.

CONCLUSION

The generation of soil property maps is the first and most important step in precision agriculture. These maps quantify spatial variability and provide the basis for managing it. Site-specific management aims to optimize crop production and minimize soil fertility losses. For this purpose, we must identify the best method for determining the spatial variability of soil chemical properties. Kriging and inverse distance weighting are two commonly used techniques for characterizing this spatial variability and interpolating between sampled points. The accuracy of ordinary kriging predictions was generally unaffected by the coefficient of variation and was relatively high for all of the sampling configurations considered in this study. The range of spatial dependence was found to vary among the soil parameters. Overall, the comparison of the two interpolation methods indicated that kriging was the most suitable method for predicting and mapping the spatial distribution of soil chemical properties in this area. The results showed that IDW with powers of 4 and 5 had almost the same precision, but IDW with a power of 4 was better than IDW with a power of 5. The results also revealed that, although IDW is relatively simple and easy to use, it is less accurate than OK. In comparison with OK, IDW increased the error by more than 22% for pH, 15% for EC, TN, K and OM and 20% for available P. The results of the present study are in agreement with other studies (Hernandez-Stefanoni and Ponce-Hernandez, 2006; Voltz and Webster, 1990).
