Throughout this document, packages, programs and external functions are written in sans serif font. R input commands, R functions and their arguments are typed in slanted typewriter font, corresponding output, if any, in upright typewriter font.
The package KriSp (Kriging with Sparse matrices) is considered as an add-on to the package fields. KriSp has a similar class to one of fields and uses some of its methods. Further, KriSp uses the package SparseM to handle the sparse matrix techniques. Thus the package is not exhaustive in its functionality compared to other geostatistical packages like geoR. Also, most functions are not fully optimized in order to enhance readability of the code.
The reader should be familiar with R as well as with standard geostatistical terms and modeling (see for instance Cressie, 1993 or Stein, 1999, for a detailed discussion on that topic). Insight of the packages fields and SparseM is beneficial but not mandatory. This tutorial should give insight in how to use the different functions and methods of KriSp. Typically, default arguments are used for the function calls. The user is strongly encouraged to examine the function arguments using the the R functions args and help (the help files of major functions included in KriSp are given in the Appendix). The commands used in this tutorial are also available at http://www.mines.edu/~rfurrer/software/KriSp/KriSp.tutorial.R. Please send any comments concerning this document or the package to rfurrer@mines.edu.
$ R CMD INSTALL -l /path/to/library KriSp_0.4.tar.gzFor Windows, the equivalent command to be executed at the DOS prompt is
$ Rcmd INSTALL -l /path/to/library KriSp_0.4.tar.gzAlternatively, the package can also be installed within an R session (see the R manual for more details). Start by opening an R session (version 1.7) and load the package
> library("KriSp", lib.loc="/path/to/library")The required libraries fields and SparseM are automatically loaded, provided they have been previously installed.
The main class of KriSp is sparse. In order use the methods defined for the Krig class in fields, the class sparse has the secondary class Krig. The main functions of the package KriSP are
> Krig.simple.sparse() > Krig.sparse() > predict()The first function performs simple kriging, the second universal kriging. The third function is a method to perform predictions on a given set of locations. In order to simplify the coding, the simple and the universal kriging approach have been separated. Both approaches are illustrated in the subsequent sections. In Section 4, the tapering technique is further illustrated with two examples.
To illustrate the capacity of sparse matrix techniques, we start with the simple spatial model
There are a few datasets included in the package distribution. To illustrate the simple kriging approach, we use the dataset simple.data. This artificial dataset consists of 100 locations randomly distributed in a unit longitude-latitude square. The spatial Gaussian process has an exponential covariance structure with a range of 10 miles and a sill of 1 and no measurement error. The data are loaded with
> data(simple) > attach(simple.data)Figure 1 shows the locations and a histogram of the values.
> look <- as.image( Y, x=x) > image.plot( look) > hist( Y)Kriging in KriSp is performed by creating a sparse object with a call to a kriging function such as Krig.simple.sparse and a subsequent call to a prediction function like predict.
> obj <- Krig.simple.sparse(x, Y)The variable obj contains among other quantities the vector and the predictions at the observed locations. As there is no measurement error, the predictions are identic to the observations. Prediction of the proces at an unobserved location is obtained simply by
> pre <- predict(obj, x=cbind(-105.5,40.5))For prediction at several points, we just specify the arguments x with the locations in a matrix.
If we want to predict on a fine grid, we use fields function predict.surface (Figure 2).
> surf <- predict.surface( obj) > image.plot( surf)There exists a rudimentary plott and print methods for the sparse object (Figure 3).
> plot( obj) > obj
Call: Krig.simple.sparse(x = x, Y = Y) Covariance: expo.cov Taper: spher.cov with range 10 Number of obs. = 100The method summary is just a more extended version of print. Other methods for the sparse class include residuals, fitted and coef. Those are included, as simple and universal kriging can be considered as a linear model.
KriSp is not about parameter estimation, thus we use our a priori knowledge for the parameter specification in the kriging routines. Typically, the covariance parameters are different to the default values. The exponential covariance and its parameters range=10, sill=1 and nugget=0 are the default values for the argument cov.fun="expo.cov" and cov.fun.args respectively. We also used the default taper function and taper parameter. The previous Krig.simple.sparse is acutally identical to
> obj <- Krig.simple.sparse(x, Y, cov.fun = "expo.cov", + cov.fun.args = list(range = 10, sill = 1, nugget = 0), + taper.fun = "spher.cov", taper.fun.args = list(range = 50))Note that the variance of the measurement error is given by the nugget argument in cov.fun.args.
Too small values for the arguments tmpmax (working array for chol) or nnzmax (upper bound of nonzero elements in ) result in a warning or core dump respectively. Too big values do not hurt, just consume some computing time (see alse Section 4.2).
A little clean up, prior to the universal kriging example.
> detach(simple.data) > rm("obj","pre","look","surf")
The function Krig.sparse loops over the regression and simple kriging steps until convergence. Consider the example dataset universal.data. The artificial data is similar to simple.data except that we have a linear trend (Figure 4).
> data(universal) > attach(universal.data) > look <- as.image( Y, x=x) > image.plot( look)We create a sparse object by calling
> obj <- Krig.sparse(x, Y, + cov.fun.args=list(range=10,sill=.9,nugget=.1))The error norm between the consecutive coefficients are in out$coef
The universal kriging object obj contains more information than in the simple kriging case.
> summary( obj)
Call: Krig.sparse(x = x, Y = Y, cov.fun.args = list(range = 10, sill = 0.9, nugget = 0.1)) Covariance: expo.cov Taper: spher.cov with range 10 Number of obs. =100 nnz(sigma) =820, (8.2%) nnz(Chol sigma)=645, (6.45%) Spatial trend: order m=2, betahat=(-0.5179,4.319,-166.8438) convergence with MSE=0.008 after 25 steps (criterion maxiter>=25, or MSE<0.001) Residuals: Min 1Q Median 3Q Max -0.168041 -0.045088 0.002130 0.047612 0.189869 Timing: Covariance Cholesky Backsolve Iterations 0.07 0.05 0.00 0.07
The predicted surface is in out$fitted, which is the sum of the trend, out$trend, and the spatial field, out$spatial.
As in the simple kriging case, the predict method returns a vector, the prediction on some specified locations. If trend.only=TRUE, predict returns the mean structure of the field only (Figure 5).
> image.plot( predict.surface(obj)) > image.plot( predict.surface(obj, trend.only=TRUE))
Finally, proper clean up.
> detach(universal.data) > rm("obj","look","surf")
In this section we use the universal dataset to compare interpolation results obtained from different approaches, namely the straightforward naive approach (i.e. using equations (3) and (4)), fields' Krig and KriSps Krig.sparse. Refer to the fields for a detailed discussion of the use of Krig function.
To get started, we load the data and and set the covariance parameters..
> data(universal) > attach(universal.data) > range <- 10 > nugget <- 0.1 > sill <- 0.9Prediction is performed on a grid (depending on the available computing power, you might want to lower the grid size to ).
> nx <- ny <- 50 > xgrid <- make.surface.grid( grid.list=list(lon='x',lat='y'), + X=x, nx=nx, ny=ny)To predict the surface using fields, we type
> out.fields <- Krig( x, Y, expo.earth.cov, + sigma2=nugget, rho=sill, lambda=nugget/sill) > fields <- predict.surface( out.fields, nx=nx, ny=ny)The sparse approach (with taper range 30) is
> out.sparse <- Krig.sparse( x, Y,cov.fun.args=list(range = range, + sill=sill, nugget=nugget), + taper.fun.args = list(range = 30), + scale.type ="range") > sparse <- predict.surface( out.sparse, nx=nx, ny=ny)We used the scale.type argument to transform the locations to obtain better numerical stability. The naive approach is somewhat longer to calculate.
> lagC <- rdist.earth(x) > lagc <- rdist.earth(x, xgrid) > C <- expo.cov( lagC, range=range, sill=sill, nugget=nugget) > c <- expo.cov( lagc, range=range, sill=sill, nugget=nugget) > > invC <- solve(C) > M <- out.sparse$xM > scaledxgrid <- scale(xgrid, center = out.sparse$x.center, + scale = out.sparse$x.scale) > > m0 <- t(cbind(scaledxgrid, 1)) > > betahat <- solve(t(M) %*% invC %*% M) %*% t(M) %*% invC %*% Y > naive <- c( t(c) %*% invC %*% (Y-M%*%betahat )+t(m0)%*%betahat )The results and the differences in the predicted fields are obtained with the following commands. (Figures 6 and 7).
> fields.naive <- fields > fields.naive$z <- fields$z - array( naive, c(nx,ny)) > fields.sparse <- fields > fields.sparse$z <- fields$z - sparse$z > sparse.naive <- fields > sparse.naive$z <- sparse$z - array(naive, c(nx,ny)) > > image.plot( sparse) > image.plot( fields) > > image.plot( fields.naive) > image.plot( fields.sparse) > image.plot( sparse.naive)
The differences seem nevertheless remarkably high. With higher taper ranges, the difference between approaches KriSp and naive can be considerably lowered. Note the difference in the fitted coefficients for the mean structure.
> c(betahat)
[1] -0.1607431 4.0057863 60.2022324
> out.sparse$coef
[1] -0.5928071 4.2595088 60.3194296
For this illustrative example we used a small dataset such that the computing performance of KriSp is not as impressive as it could be.
In this section we look at the large dataset (anomaly.data) to illustrate the capacity of KriSp. The data are anomalies of aggregated monthly precipitation for April 1948 at 11,918 stations in the US (for more details about the data refer to Johns et al., 2003 and Furrer et al., 2006). We assume that the data is second order stationary. To simplify the document, we also suppose that the underlying isotropic structure is a mixture of two exponential covariance fucntions with range parameters of 40 and 520 miles with respective sill of 0.28 and 0.72.
> data(anomaly) > attach(anomaly.data) > US() > points(x, pch=".", col=3)According to Furrer et al. (2006) a tapering radius with 16 to 24 points within the support is a sufficiently precise approximation. With our observation density, a tapering range of 40 miles is sufficient (cf. Figure 8 and 10). We first create the covariance function, being a mixture of two exponential ones. Typical KriSp covariance functions take the arguments distance, range and eps. If some are not used, just include the ``dot'' construction ... in the function definition. The mixture covariance could be defined by
> exp.mix.cov <- function(distance, ...) + 0.28 * exp( -abs(distance)/40) + 0.72 * exp( -abs(distance)/520)The sparse object is created with the following arguments.
> timing <- numeric(3) > dummy <- gc() # clean up before the heavy work... > obj <- Krig.simple.sparse(x, Y, cov.fun = "exp.mix.cov", + taper.fun.args = list(range=40), covfun = T, + nnzmax = 320000, tmpmax = 12000) > timing[1] <- (proc.time()-prtm)[1] > summary( obj)
Call: Krig.simple.sparse(x = x, Y = Y, cov.fun = "exp.mix.cov", taper.fun.args = list(range = 40), nnzmax = 320000, tmpmax = 15000, covfun = T) Covariance: exp.mix.cov Taper: spher.cov with range 40 Number of obs. =11918 nnz(sigma) =317832, (0.2238%) nnz(Chol sigma)=720234, (0.5071%) Residuals: Min 1Q Median 3Q Max -2.146e-04 -2.220e-16 0.000e+00 2.220e-16 5.786e-05 Timing: Covariance Cholesky Backsolve Fitting 21.38 1.50 0.03 21.06Note that we had to increase the values of nnzmax and tmpmax to perform the kriging.
Be aware that the next step might be very time consuming. We perform a prediction and visualisation step (Figure 9).
> prtm <- proc.time() > lon=seq( -126, -66, length=500) > lat=seq( 24, 50, length=500) > pred.surf <- predict.surface(obj, + grid.list = list(lon=lon, lat=lat), extrap=T) > timing[2] <- (proc.time()-prtm)[1] > > prtm <- proc.time() > image.plot(pred.surf, xaxt="n", yaxt="n") > US(add=T) > timing[3] <- (proc.time()-prtm)[1] > timing # in seconds
[1] 43.99 308.37 0.98(The calculations were performed on a Linux powered Xeon processor with 2Gbytes RAM.)
Even if extrap or chull.mask is altered, the computation time is not reduced as fields' predict.surface calculates the values for the entire grid and sets the values not falling in the chull.mask region to NA afterwards.
For readers that are familiar with the storage structure of sparse matrices, the following lines create Figure 10.
> ia <- slot( obj$sigma, "ia") > n.tap <- diff( ia) - 1 > hist( n.tap)
As the we have 11.913 observations, the full covariance matrix takes more than 1136 Mbytes compared to 3.86 Mbytes, if we have ``typical'' precision with 8-byte reals and 4-byte integers.
The method predict is essentially a wrapper to a Fortran function, which loops over all points at which to predict. For each point it calculates , where given by a sparse object. There are several ways to optimize this costly procedure:
Evaluating the Matérn covariance function is computationally heavy. R essentially uses the Netlib function rkbesl.f. Instead of using the same or similar functions in the Fortran code of KriSp, a linear approximation scheme is used. All covariances are evaluated on a fine grid with R functions such as mater.cov, expo.cov, etc. Those function values are then passed to the corresponding Fortran routine in where a linear interpolation is made. Of course, this method could be refined to a cubic spline approximation (using fields' css.f) or some other interpolation scheme. This approach also eliminates the need of providing a large database of covariance functions coded in Fortran. Note that in Furrer et al. (2006), no covariance approximations were made. The the following lines display the logarithm of the maximum error of the linear covariance approximation as a function of the number of grid points (Figure 11).
> nseq <- seq(10,1000,by=25) > maxerr <- sapply(nseq, "covapprox.error") > plot( nseq, log(unlist(maxerr[1,]),10), type="l", + ylab="log( max error), base 10")
The functions of KriSp are written in a linear, sequential way, calling as few functions as possible in order to save memory. Of course, there would be further gain in ``unrolling'' some of the remaining functions. When dealing with massive data sets, I recommend to use the source of KriSp and to hard code the functions to completely adjust it to the specific problem. In such cases, it is possible to work with datasets involving up to 400,000 locations (Sain and Furrer, 2004).
A possible extension might be to uniquely use S4 classes and methods. As the current version of fields uses S3 classes, KriSp does not entirely use the new class concept.
The next major version of fields (version>3.2) will contain a sparse matrix module and extends the capacities of KriSp. Therefore, KriSp will no longer be significantly extended and future releases will most likely consist of bug fixes only.
Another possible extension might be to uniquely use S4 classes and methods. As the current version of fields uses S3 classes, KriSp does not entirely use the new class concept.