Evaluation of Artificial Neural Network Models for Prediction of Spatial Variability of Some Soil Chemical Properties

Authors: Mahboub Saffari, Jafar Yasrebi, Farkhonde Sarikhani, Reza Gazni, Masome Moazallahi, Hamed Fathi and Mostafa Emadi

Abstract: Analysis and interpretation of the spatial variability of soil properties is a keystone of site-specific management. The objective of this study was to evaluate two Artificial Neural Network (ANN) structures, single hidden-layer and multiple hidden-layer, for estimating the spatial variability of some soil chemical properties. Soil samples were collected on an approximately 60x60 m grid at 0-30 cm depth, and the coordinates of each of the 100 points were recorded with GPS. ANN models, applicable to each of these soil properties and consisting of two input parameters (X and Y coordinates), were developed. The whole data set of 100 data points was separated randomly into two parts: a training set consisting of 80% of the data points and a validation or testing set consisting of 20% of the data points. On the whole, the study highlights the superiority of the multiple hidden-layer ANN model over the single hidden-layer ANN model (except for Ca) for determining the soil properties.

How to cite this article:

Mahboub Saffari, Jafar Yasrebi, Farkhonde Sarikhani, Reza Gazni, Masome Moazallahi, Hamed Fathi and Mostafa Emadi, 2009. Evaluation of Artificial Neural Network Models for Prediction of Spatial Variability of Some Soil Chemical Properties. Research Journal of Biological Sciences, 4: 815-820.

URL: https://medwelljournals.com/abstract/?doi=rjbsci.2009.815.820

INTRODUCTION

Soil fertility properties vary spatially and temporally. Therefore, the available methods for estimating soil fertility properties in site-specific management are either not suitable or have restrictions for use under field conditions. Before the development of a property, it has been prudent practice to undertake a soil survey to provide this understanding. Soil properties change from place to place, even for the same soil type (Warrick and Nielsen, 1980), and the analysis of soil data requires a mathematical model that can usefully be assumed to underlie the observed variation and that then provides a basis for generalization, prediction and interpretation (Heuvelink and Webster, 2001). It is for these reasons that extensive experimental investigation has been conducted in an attempt to determine a method for predicting soil fertility properties.

Artificial Neural Networks (ANN) are, in general, software systems that imitate the neural networks of the human brain (Trippi and Turban, 1996). Neural networks are powerful tools that can identify highly complex underlying relationships from input-output data alone (Haykin, 1999). Studies indicate that expert systems such as ANN are efficient in simulating complicated phenomena because of their non-linear structure. Such a model learns the historical pattern of a phenomenon during the training process and uses it to simulate the results for new inputs. Merdum et al. (2006) employed ANN and regression pedotransfer functions to predict soil water retention and saturated hydraulic conductivity and indicated that the differences between the two methods were not statistically significant. Salam et al. (2006) used an artificial neural network approach to model and predict the relationship between the grounding resistance and the length of the electrode buried in the soil, based on experimental data, and indicated that the model can be used to predict the grounding resistance with high accuracy. Some recent studies have shown that ANN are not purely black-box models and that it is possible to shed some light on the hydrological processes inherent in an ANN if its architectural features are explored further (Wilby et al., 2003; Jain and Ormsbee, 2004; Sudheer and Jain, 2004). In recent years, several soil and water studies have used artificial neural networks and neuro-fuzzy techniques to make predictions. These techniques are capable of dealing with uncertainties in the inputs and can extract information from incomplete or contradictory data sets (Rashid et al., 1992; Rogers et al., 1995; Tamari et al., 1996; Holger and Dandy, 1996; Woldt et al., 1996; Schaap et al., 1998; Dixon, 2005; Chang and Chao, 2006; Islam et al., 2006; Ahmad et al., 2007).

The objective of this study was to evaluate the accuracy of Artificial Neural Networks (ANN) for estimating soil fertility properties using two different network structures: single hidden-layer and multiple hidden-layer.

MATERIALS AND METHODS

Study area, sampling design and laboratory analysis: The study was conducted on fallow land in Bajgah, about 15 km northeast of Shiraz, in Fars Province, Iran (Fig. 1). According to the USDA Soil Taxonomy (Soil Survey Staff, 2006), the soil of the study region was classified as fine, mixed, mesic, Fluventic Calcixerepts. Soil samples were collected (September 2007) on an approximately 60x60 m grid at 0-30 cm depth, and the coordinates of each of the 100 points were recorded with GPS (Fig. 1). The soil samples were taken to the laboratory, air-dried overnight and passed through a 2 mm sieve. Electrical Conductivity (ECe) was measured with an electroconductimeter; available potassium (K) was measured by extraction with ammonium acetate (1 N) (Richards, 1954); calcium and magnesium were measured by the titration method (Richards, 1954).

Descriptive statistics: To determine the degree of variability of the soil chemical properties, the data were analyzed statistically. Classical descriptors such as the mean, median, minimum, maximum, Coefficient of Variation (CV%), Standard Deviation (SD), skewness and kurtosis of the data distribution were determined using the Statistical Analysis System. These analyses were conducted using the STATISTICA software package (StatSoft Inc., 2001).

Development of estimation methods: In this study, two types of ANN models were developed: single hidden-layer ANN models, consisting of only one hidden layer, and multiple hidden-layer ANN models, consisting of two or three hidden layers. Identifying the number of neurons in the input and output layers is normally simple, as it is dictated by the input and output variables considered in modeling the physical process. The number of neurons in the hidden layer(s), however, has to be determined by a trial-and-error procedure. The optimal architecture was determined by varying the number of hidden neurons (from 1-20), and the best structure was selected. Training of the ANN models was stopped when either an acceptable level of error was achieved or the number of iterations exceeded a prescribed maximum of 2500. A feed-forward back-propagation neural network consists of an input layer of nodes, an output layer and one or more layers of nodes in between. The middle layers are called hidden layers. The number of nodes in the input and output layers is determined by the nature of the problem under consideration. Figure 2 shows the schematics of a three-layer neural network with a feed-forward configuration. The ANN was implemented using the MATLAB software package (MATLAB version 7.2 with the neural network toolbox).
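The trial-and-error procedure described above can be sketched in a few lines. The following is a minimal illustration in plain Python, not the MATLAB toolbox implementation used in the study. The tanh hidden activation, the linear output, the learning rate, the error threshold, selection on training error and the demonstration data are all assumptions made for this sketch; only the two inputs, the single output, the 1-20 range of hidden neurons and the 2500-iteration limit come from the text.

```python
import math
import random


def init_network(layer_sizes, rng):
    """Random weights and zero biases for a net such as [2, 4, 1]."""
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(layers, x):
    """Activations of every layer: tanh in hidden layers, linear output."""
    acts = [list(x)]
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + b
             for row, b in zip(weights, biases)]
        last = idx == len(layers) - 1
        acts.append(z if last else [math.tanh(v) for v in z])
    return acts


def train(layers, data, lr=0.05, max_iter=2500, target_mse=1e-4):
    """Online back-propagation; stops at target_mse or after max_iter epochs."""
    mse = float("inf")
    for _ in range(max_iter):
        sq_err = 0.0
        for x, y in data:
            acts = forward(layers, x)
            delta = [acts[-1][0] - y]  # error at the single output node
            sq_err += delta[0] ** 2
            for idx in range(len(layers) - 1, -1, -1):
                weights, biases = layers[idx]
                prev = acts[idx]
                # Error for the layer below, computed before the update.
                below = [
                    sum(weights[j][i] * delta[j] for j in range(len(delta)))
                    * (1.0 - prev[i] ** 2)
                    for i in range(len(prev))
                ] if idx > 0 else []
                for j, d in enumerate(delta):
                    for i, a in enumerate(prev):
                        weights[j][i] -= lr * d * a
                    biases[j] -= lr * d
                delta = below
        mse = sq_err / len(data)  # running error over this epoch
        if mse <= target_mse:
            break
    return mse


def search_hidden_size(data, sizes=range(1, 21), seed=0, **train_kwargs):
    """Try each hidden-layer size and keep the one with the lowest error."""
    best = None
    for n_hidden in sizes:
        layers = init_network([2, n_hidden, 1], random.Random(seed))
        mse = train(layers, data, **train_kwargs)
        if best is None or mse < best[1]:
            best = (n_hidden, mse)
    return best


# Hypothetical data: 16 scaled (x, y) coordinates with a made-up response.
demo = [((i / 3, j / 3), 0.5 * i / 3 - 0.3 * j / 3)
        for i in range(4) for j in range(4)]
# Reduced search range and iteration count so the demonstration runs quickly.
best_size, best_mse = search_hidden_size(demo, sizes=range(1, 4), max_iter=200)
```

Calling `search_hidden_size(data)` with the defaults covers the 1-20 neuron range with at most 2500 iterations per candidate. The sketch ranks candidates by training error only; a fuller comparison would also use the held-out points and the criteria listed under accuracy determination.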

Accuracy determination: The performance of all methods was assessed based on calculating the Akaike’s Information Criterion (AIC), Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE).


Fig. 1:	Location of the study area and sampling pattern in 46.7 ha area

The coefficient of determination, R², of the linear regression line between the values predicted by each method and the measured values was also used as a measure of performance. The statistical parameters used to compare the performance of the various method configurations are as follows:

(1)

(2)

(3)

(4)

Where:

ESS	=	The Error Sum of Squares
q	=	The number of model parameters
O_i and t_i	=	The observed and predicted values for the ith output
Ō	=	The mean of the observed values
N	=	The total number of events considered

The smaller the AIC value, the better the model.
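As an illustration of how these criteria can be computed from the symbols defined above, the sketch below uses the commonly cited least-squares forms: AIC = N ln(ESS/N) + 2q, MAE as the mean absolute difference, RMSE as the square root of ESS/N and R² = 1 - ESS/SS_tot, where SS_tot is the sum of squared deviations of the observations from their mean. These exact forms are assumptions, as is counting q as all weights and biases of a fully connected network; the function names and the sample values are hypothetical.

```python
import math


def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected net, e.g. [2, 4, 1]."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


def performance(observed, predicted, q):
    """Return AIC, MAE, RMSE and R2 for paired observed/predicted values.

    Assumes ESS > 0 (a perfect fit would make the logarithm undefined).
    """
    n = len(observed)
    errors = [o - t for o, t in zip(observed, predicted)]
    ess = sum(e * e for e in errors)                 # error sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "AIC": n * math.log(ess / n) + 2 * q,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ess / n),
        "R2": 1.0 - ess / ss_tot,
    }


# Hypothetical values, only to show the call.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = performance(observed, predicted, q=count_parameters([2, 4, 1]))
```

Under this parameter count, a 2-4-1 network has 17 parameters and a 2-6-20-1 network has 179, so with equal error the AIC penalty term favors the smaller network.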


Fig. 2:	Schematics of a three-layer neural network model

RESULTS AND DISCUSSION

Soil characteristics: A summary of the statistics of the soil parameters is shown in Table 1. The Coefficient of Variation (CV) differed among the variables; the greatest variation was observed in magnesium, whereas the smallest was in K. Calcium and electrical conductivity showed medium variation (CV 15-50%) according to the guidelines provided by Warrick (1998) for the variability of soil properties.
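The CV-based classification can be illustrated with a short sketch. Only the medium range (15-50%) is stated in the text; treating values below 15% as low and above 50% as high is an inference from that range. The use of the sample standard deviation and the measurement values are likewise assumptions for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def variability_class(cv_percent):
    """Classify variability; only the 15-50% medium band is from the text."""
    if cv_percent < 15.0:
        return "low"
    if cv_percent <= 50.0:
        return "medium"
    return "high"


# Hypothetical measurements of one soil property.
sample = [10.0, 12.0, 14.0, 16.0, 18.0]
cv = coefficient_of_variation(sample)
label = variability_class(cv)
```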

Estimation of Ca at randomly selected data: The main purpose of this research was to determine the ability of ANN to predict soil chemical properties at unsampled points. The whole data set of 100 data points was divided randomly into two parts: a training set consisting of 80% of the data points and a validation or testing set consisting of 20% of the data points. In this prediction method, the optimal architecture was determined by varying the number of hidden neurons (from 1-20), and the best structure was selected. For the single hidden-layer model, the most accurate results were obtained using feed-forward back-propagation with a 2-4-1 architecture. For the multiple hidden-layer ANN model, the appropriate structure was 2-6-20-1. To get reliable results, the input data always need to be trustworthy, too.
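The random 80/20 partition can be sketched as follows. The 10 x 10 layout of the grid nodes, the fixed seed and the function name are assumptions for illustration; the text states only that 100 points on an approximately 60x60 m grid were split randomly.

```python
import random


def split_points(points, train_fraction=0.8, seed=42):
    """Shuffle a copy of the points and cut it into training and testing sets."""
    rng = random.Random(seed)          # fixed seed makes the split repeatable
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sampling grid: 100 nodes at 60 m spacing.
points = [(60.0 * col, 60.0 * row) for row in range(10) for col in range(10)]
train_set, test_set = split_points(points)
```

Because the same split would be reused for every soil property and both network types, fixing the seed keeps the comparison between models on identical training and testing points.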

To evaluate the performance of the ANN, Table 2 shows the performance indices between predicted and observed data for the training and testing data sets. Table 2 shows that the multiple hidden-layer model has a higher error in validation than the single hidden-layer model, while the difference between the methods is not significant in training. On the whole, the ANN with a single hidden layer performed better in predicting Ca than the ANN with multiple hidden layers (Table 2).

Estimation of K at randomly selected data: As in the previous case, the whole data set of 100 data points was divided randomly into two parts: a training set consisting of 80% of the data points and a validation or testing set consisting of 20% of the data points. In the ANN prediction, the optimal architecture was determined by varying the number of hidden neurons, and the best structure was selected. The most accurate results were obtained using feed-forward back-propagation with two hidden layers and a 2-4-7-1 architecture; the most accurate single hidden-layer architecture was 2-20-1. In this case, there was no significant difference between the validation results of the two methods, but the multiple hidden-layer model had better accuracy than the single hidden-layer model (Table 3).

Table 1:	Descriptive statistics for variables within the field grid to a depth of 0.3 m

Estimation of Mg at randomly selected data: The whole data set in this section is composed of 100 data points, which were separated randomly into two parts: a training set consisting of 80% of the data points and a validation or testing set consisting of 20% of the data points.

Table 2:	Statistical results of the ANN models for estimation of Ca

In this case, the most accurate results of the two types of ANN models in training and validation were obtained using feed-forward back-propagation with multiple hidden layers and a 2-4-7-1 architecture. For the single hidden-layer ANN models, the appropriate structure was 2-16-1. The results show that the multiple hidden-layer model has higher accuracy in both training and validation (Table 4).

Estimation of ECe at randomly selected data: As in the other cases, the whole data set of 100 data points was divided randomly into two parts: a training set consisting of 80% of the data points and a validation or testing set consisting of 20% of the data points. The best single hidden-layer structure was 2-20-1 and the best multiple hidden-layer structure was 2-4-5-1.

Table 3:	Statistical results of the ANN models for estimation of K


Fig. 3:	Contour maps of soil properties prepared by ANN models

Table 4:	Statistical results of the ANN models for estimation of Mg

Table 5:	Statistical results of the ANN models for estimation of ECe

In the ECe estimation, validation gave good results for both types of ANN model, but training of the single hidden-layer model showed a higher error than that of the multiple hidden-layer model (Table 5). Figure 3 shows the contour maps (generated using SURFER8, Golden Software, 2002) obtained by the ANN models for the soil properties. Comparison of these maps may be useful in the interpretation of the results.

CONCLUSION

In this study, ANN models that can be used for determining soil properties were developed. For this purpose, experimental results for Ca, K, Mg and ECe were used. ANN models, applicable to each of these soil properties and consisting of two input parameters (X and Y coordinates), were developed. Two types of ANN model were considered: a single hidden-layer model and a multiple hidden-layer model. All these models have one output parameter. On the whole, the study highlights the superiority of the multiple hidden-layer ANN model over the single hidden-layer ANN model for determining the soil properties.
