VARIOGRAM MODELING AND ESTIMATION
by
Norman E. LeMay, Jr.
B.S., University of Colorado at Denver, 1995
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Applied Mathematics
1998
This thesis for the Master of Science
degree by
Norman E. LeMay, Jr.
has been approved
by
Weldon Lodwick
LeMay, Norman E.(M.S., Applied Mathematics)
Variogram Modeling and Estimation
Thesis directed by Professor James R. Koehler
ABSTRACT
Kriging is a form of spatial interpolation. A model, Z(x) = /j,(x) + J(x) is
considered for the region D, where ju(x) represents the global variability and
J(x) is a zero mean, stationary random process which represents small scale
variability. An estimator for Z{xo) is Z(x0) A'Z(x). It will be shown that
the As depend on the spatial covariance function. Kriging assumes a known
covariance function, but in practice, the covariance function is unknown, and
is estimated from the data.
An alternative function to represent spatial variability is the variogram,
from which the covariance, under certain conditions can be derived. The
main emphasis of this thesis is estimating the variogram used in Kriging. The
first chapter presents background to the geostatistical problem. Chapter two
presents the theoretical framework for variogram modeling. Random functions,
moments of random functions, stationary, along with variogram functions and
in
graphs are defined and discussed in this chapter. Chapter three discusses es
timation of the variogram, including sampling, terminology used in estimating
variograms, estimators and how to fit a true variogram to a sample variogram.
Chapter four illustrates this material by analyzing a topographical data set.
An analysis of this data set will show trend removal, simulations to aid in as
sessing anisotropy, and the final kriging surface along with the standard error
surface.
This abstract accurately represents the content of the candidates thesis. I
recommend its publication.
Signed
f James R. Koehler
IV
ACKNOWLEDGEMENTS
My sincere thanks to my parents, Norman E. and Deanna F. LeMay. With
out their loving support, kindness and encouragement, I would have not been
able to complete this goal. A great debt of gratitude is owed to my long suffer
ing advisor, Dr. James R. Koehler. His encouragement, advising and patience
has gone above and beyond the call of duty as mentor and friend. Thanks Dr.
Koehler from the bottom of my heart.
v
CONTENTS
Chapter
1. Introduction............................................... 1
1.1 Motivation................................................ 1
1.2 Outline................................................... 3
2. Terminology................................................ 5
2.1 Random Function........................................... 6
2.2 Properties of Random Functions............................ 7
2.2.1 First Moments....................................... 7
2.2.2 Second Moments...................................... 7
2.3 Stationarity.............................................. 8
2.3.1 Strict Stationarity................................. 8
2.3.2 Covariance Stationarity............................. 9
2.3.3 Intrinsic Stationarity ............................ 11
2.4 Variogram Properties..................................... 11
2.4.1 Ordinary Kriging .................................. 11
2.4.2 Functions and Graphs............................... 17
vi
2.4.3 Isotropic Transition Models....................... 17
2.4.4 Nontransition models............................. 19
2.4.5 Hole Effect Models................................ 21
2.5 Anisotropy............................................... 21
2.5.1 Geometric ........................................ 24
2.5.2 Sill Anisotropy................................... 27
2.5.3 Range Anisotropy.................................. 29
2.5.4 Nugget Anisotropy................................. 30
3. Data Analysis ................................................ 31
3.1 Sample Design............................................ 32
3.1.1 Gridded Data ..................................... 32
3.1.2 NonGridded Data.................................. 34
3.2 Estimators............................................... 36
3.3 Computing Sample Variograms.............................. 37
3.4 Diagnostic Tools......................................... 39
3.5 Fitting Sample Variograms................................ 41
4. Analysis of Topographical Data Set ...................... 43
4.1 Exploratory Data Analysis................................ 43
4.2 Variogram Analysis....................................... 46
4.3 Simulation............................................... 51
vii
4.4 Completion of Original Data............................. 55
4.5 Kriging................................................. 55
4.6 Concluding remarks and open questions.............. 58
References........................................................... 60
vm
FIGURES
Figure
2.1 Spherical model.............................................. 18
2.2 Exponential Model ........................................... 19
2.3 Gaussian Model............................................... 20
2.4 Power model: 7(h) = x2....................................... 20
2.5 Logarithmic Model ........................................... 21
2.6 HoleEffect Model............................................ 22
2.7 Variogram contour surface with anisotropy.................... 23
2.8 Variograms representing geometric anisotropy. ............... 24
2.9 Transformed Axis............................................. 25
2.10 Sill anisotropy variograms.................................. 27
2.11 Nugget Anisotropy........................................... 30
3.1 Coal Ash Data Set............................................ 33
3.2 Data which are not on a grid................................. 34
3.3 NonGridded Data............................................. 35
IX
3.4 Omnidirectional variogram of elevation data. The parameters
need to be adjusted to smooth the sample variogram........ 38
3.5 Omnidirectional variogram of elevation data after parameter
adjustment................................................... 39
4.1 Various Plots of Topographical Data.......................... 45
4.2 Plots of GAM output.......................................... 47
4.3 Various Plots of Residuals................................... 48
4.4 Directional Variogram of Residuals........................... 49
4.5 Contour of the Variogram Surface............................. 50
4.6 Omnidirectional variogram, number of lags equal 10........... 52
4.7 Directional variogram, nlag=10, tol.azimuth=40............... 52
4.8 Directional Variogram and Variogram Surface Plots............ 54
4.9 Weighted least squares fit of omnidirectional variogram ... 56
4.10 Results from Kriging........................................ 57
x
1. Introduction
1.1 Motivation
In my first semester of graduate school, I was working with my advisor on
propagation of errors in maps. We were continuing some research that he had
done previously and this led me into spatial statistics and Geographical Infor
mation Systems (GIS). Wanting to further my understanding of these spatial
problems, I took some statistics courses and an independent study in geostatis
tics, after which I transfered into the statistics program. I found the problems
in spatial statistics, especially geostatistics, intriguing and wanted to further
my knowledge. The problems encountered in geostatistics involve spatial in
terpolation and structural analysis. Spatial interpolation, known as kriging,
and structural modeling, known as variogram estimation are two particular
problems in geostatistics that seemed interesting.
Kriging is a method of spatial interpolation used in geostatistics. The
kriging model is posed as follows. Let D, a design space, be a fixed subset
of 9?d. A location in D is denoted as x. Sampling cannot be done at every
point in the design region, so a finite sampling is done in D at locations x*
1
for i = 1,..., n. An unsampled location is denoted x0. We want to measure
some feature in the design region such as carbon monoxide, elevation, depths to
underground water. Thus, the measurement is unknown and Z(xi) is treated
as a random variable. The objective is to estimate Z(x0).
A spatial random process, Z(x), is modeled as, Z(x) = /x(x) + 5(x) where
/j,(x) models the large scale variability (trend) of the spatial random process
>>
Z(x) and 5(x) is a zero mean stationary stochastic process, with covariance
function, (7(h). It is obvious that the random variables Z(x{) and Z(xj) are
similar at a small separation distance. For the unsampled location, we can
form the estimator Z(x0) == A'Z(x), where Aj are weights.
Three models for kriging are: simple, ordinary and universal. In simple
kriging, the mean, fj,(x) is known and constant, /i(x) = (j,, and the covariance
function, (7(h) is known. In Ordinary kriging, the mean is unknown but as
sumed constant, and the covariance function (7(h) is known. Ordinary kriging
is presented in Section 2.4. Finally, in universal kriging, the mean n(x) is un
known and is estimated by a finite polynomial, and the covariance function is
known.
In practice, the covariance function is usually unknown, and is estimated
from the data. I will show in section 2.4 that the kriging weights, A, are in
terms of the covariance between all sampled locations,C(Z(x + h), Z(x)), and
2
the covariance between the sampled locations and the unsampled locations
C(Z(x0),Z(xi)).
The variogram is an alternative method to estimate variability in the design
region D. In practice, the variogram is estimated, and the covariance function,
under second order stationarity assumptions, is derived from the variogram.
This derivation and definition is found in chapter two. In fact, kriging can
be done in terms of the variogram function, 7(h), rather than the covariance
function, (7(h).
1.2 Outline
Stochastic modeling concepts used in geostatistics, e.g. random functions,
moments of a random function, stationarity, the variogram, isotropy and anisotropy
are defined in chapter two. Chapter three presents methods to estimate the
variogram from data. For example, the classical estimator due to Matheron [5]
and a robust estimator due to Cressie [5] are introduced. How to assess the
assumption of stationarity, how to compute a sample variogram if the data are
on a grid or not on a grid, different sampling schemes to collect data in the
design region are also discussed in this chapter. Finally in chapter four, an
example using topographical data is presented showing how the techniques in
chapters two and three are implemented. I also use simulation to aid in the
3
analysis of the variogram modeling process. Finally, ordinary kriging is used to
obtain the surface of the topological data and is shown along with the standard
error surface.
4
2. Terminology
Let xi,... ,xn be spatial locations in a design region. The design region is
denoted as D and is contained in *R.d. At each location, x*, i = 1,... ,n, a mea
surement is taken. Denote the sample measurement as z(xi). Stochastically,
let Z(xi) denote a random variable, which z(X{) is a realization. Within the
design region, these random variables are spatially correlated. Measurements
that are close together tend to be more similar in value than measurements fur
ther apart. Statistically, random variables closer together are more correlated
than random variables further apart. I will use the above notation through
out this thesis to denote spatial locations, xiy sample measurements, z(xj) and
random variables, Z(xj). Furthermore, locations Xj and Xj are separated by
a distance vector h. Here, h includes both magnitude and direction. Alter
natively, two locations are denoted by x, x I h. The language of geostatistics
includes terms such as random functions, stationarity, variogram, covariogram,
isotropy and anisotropy. All these terms are defined in this chapter along with
other geostatistical terminology.
5
2.1 Random Function
Randomness in geostatistics is described by a Random Function. In geo
statistics, a random function is defined on all x Â£ D and produces random
variables for each x. In general, a random function, Z(x) is characterized
by the set of all its nvariate cdfs for any number n and any choice of the n
locations, xi5 i = 1,..., n:
F(xl,...,xn,zu...,zn) =P{Z(xl)
In particular, we will be concerned with the bivariate cdf of two random
variables, Z(xx) and Z(x2):
F(xi,x2; zi, z2) =F{Z(x1) < zx,Z(x2) < z2)} (2.2)
More general properties of random functions can be found in [26]
A few observations are in order. Two locations x* and Xj are separated by
a distance vector h has both magnitude and direction. The two locations will
be denoted as x and x + h. The random variables are dependent for small
h. Realizations of the random variable are denoted, z(x). Lastly, we have
only one realization of the random variables (i.e. we do not have replication).
For example, if we take elevation measurements and repeat the measurements,
baring any measurement errors, we will have the same results. This leads
to the needed statistical assumption of stationarity, presented in section 2.3.
6
Although in practice we do not know the underlying random function, we will
use the data to find two properties of the random function, namely the first
and second moments.
2.2 Properties of Random Functions
2.2.1 First Moments Recall from classical statistics, the first moment
of a random variable is defined by the formula
Similarly in geostatistics, when we impose a distribution, the first moment
is defined by the formula:
when it exists. As seen, the mean of spatial data is a function of the locations,
x.
2.2.2 Second Moments Second moments of spatial data, when they ex
ist, have three forms. First is the a priori Variance, when it exists and is
defined,
Definition 1 (Variance)
E[Z{yi)\ m(x)
Var[Z(x.)] = E[(Z{x) m(x))2]
(2.3)
7
If Var[2(xj)] and Var[Z(xj)] exists, then the Covariance exists and is defined
as,
Definition 2 (Covariance)
C(xj, Xj) = E [(Zfc) mfc)) (Zfc) mfc))] (2.4)
Lastly, the variance of the difference of two random variables known as the
Variogram, and is defined as,
Definition 3 (Variogram)
2y(x;, Xj) = Var [Zfc) Zfc)} (2.5)
Cressie [5] notes that some people have defined the variogram as, 2y(xi,x2) =
E [(Z(x + h) Zfc)2j. This works fine when mfc is constant Vx G D. Note
that all the above functions depend on locations x, and Xj.
2.3 Stationarity
Recall, there is only one realization of Zfc. Since there are no replications,
we need some sort of assumptions for statistical inference. The assumptions
made regard various degrees of spatial homogeneity. Homogeneity allows us
to view zfc and z(x + h) as two different realizations of the same random
variable.
2.3.1 Strict Stationarity
8
Definition 4 (Strict Stationarity) A random function {Z(x),x 6 D} is
said to be Strictly Stationary within the field D if its multivariate cdf is invari
ant under any translation of the N coordinate vectors, that is,
Fxi,X2,,xn (Z1) z2i j zn) Fx\+h,X2+h,,Xn+h{zl> z2i ' > zn) (26)
I should mention that strict stationarity is also referred as strong station
arity or wide sense stationarity.
2.3.2 Covariance Stationarity Covariance stationarity, also known in
the literature as second order stationarity, is a weaker form of stationarity
than strict stationarity.
Definition 5 (Covariance Stationarity) A random function Z(x) satisfy
ing
E[Z(x.i) Z(xj)] = 0 V xi}Xj and
For Z(x), Z(x + h), the covariance exists and depends only on the sepa
ration distance h.
A property from the definition 5 is,
C(h) = E[Z(x + h)Z{x) m2} V x (2.7)
where m is the true mean of the distribution.
Having defined covariance stationary, we can notice some important conse
quences. For example, assuming the random function is covariance stationary,
9
we can find a relationship between the covariance function and the variogram.
Recall equation 2.3:
Var[Z(x)\ = E[(Z{x)m)2}
= E [Z(x)2 2Z{x)m + to2]
= E[Z(x)2} 2mE[Z{x)\ + m2
= E[Z{x)2] m2
From equation 2.7, we have
(7(0) = E[Z(x)2] m2
This states the variance of a spatial random variable, under the covariance
stationarity assumption is the covariance function with no translation.
A relationship exists between the variogram and the covariogram (see equa
tion 2.7). Under covariance stationarity:
7(h) = lE[{Z(x + h)Z{x))2
= \ {E[Z{x + h)2] 2E[Z{x + h)Z(x)} + E[Z{x)2})
= ^ {E[Z(x + h)2] 2E[Z(x + h)Z(x)] + (7(0) + to2)
= 1 (E[Z(x + h)2] 2(<7(h) + m2) + (7(0) + to2)
= ^ [(7(0) + m2 2(7(h) 2m2 + (7(0) + to2]
= (7(0) (7(h) (2.8)
10
2.3.3 Intrinsic Stationarity An even weaker form of stationarity is that
of Intrinsic Stationarity.
Definition 6 (Intrinsic Stationarity) A random variable Z(x) is intrinsi
cally stationary if the following two conditions hold
E[Z{s)} m V x.
Z(x + h) Z(x) has a finite variance and does not depend on x.
Hence the following notation,
2.4 Variogram Properties
The variogram was defined in equation 2.5. Under intrinsic stationarity
conditions, the variogram can also be defined as follows:
The former definition is a more universal definition. The intrinsic assumption
allows for infinite variance, whereas the covariance hypothesis allows only for
finite variance. For the rest of this thesis, I will be using the definition in
equation 2.10, since I will be assuming intrinsic stationarity, which is least
restrictive of the three forms of stationarity.
2.4.1 Ordinary Kriging Variography [15] is important for kriging. Con
sider the following model:
Var[Z(x + h) Z(x)] = 27(h) V x
(2.9)
(2.10)
11
Z(x) = + 5(x) xÂ£D
(2.11)
where <5() is a zeromean stochastic process with known covariance function.
We will assume the model in equation 2.11 along with the model assump
tions that S() is a zero mean stochastic process with variance modeled by a
known covariance function. Further assume the first order component, p, is
constant in the region, but unknown. Using the data, we would like to predict
at an unsampled location using a linear estimator. Form the linear estimator,
Z{x o) = A*Z(x)
We would like our estimator to be unbiased and have minimum variance. For
the predictor to be unbiased this means, E z(:ro) Z(x) = 0, or
E[Â£(x0)] = ^ = E[z(xq)
= E
EW
Li=0
= 2>E[Z(Xi)]
i=0
=
1=0
=> ^2 Aj 1
i=0
Since E[Z(x)] = n Vx eD. The mean squared error is:
E [(Z(x) Z{xjf = E [Z(x)2] 2E [Z(x)Z(x)] + E [Z{x)2]
12
evaluating each term,
E[Z2] = E[(A'Z)2]
= E[ A'ZZ'A]
= A'E[ZZ']A
= A'(C + /t2[l]n)A (2.12)
Where C is a n x n covariance matrix for all sample locations. The next term,
E [ZZ\ = E[X'ZZ]
= A'E[ZZ]
= A'(c + /i2l) (2.13)
c is a n x 1 vector of covariance between the unsampled location and sample
locations. The final term,
E [Z2] = a2 + /n2
(2.14)
in combination with equations 2.12 2.13 and 2.14, leads to:
Q(A) = A'(C + ti2[l]n)A 2(A'(c + /i2l)) + a2 + fj,2 (2.15)
We have a constraint on the As which must sum to one. Using Lagrange
multipliers1, we can take the derivative of Q and solve for A:
Q(A) = A'(C + /j2[l]n)A 2(A'(c + fx21)) + a2 + /i2 + 2{X' l)v
1The Lagrange multipliers will be denoted by u.
13
Taking the derivative and setting it equal to zero,
= [C + fj?[l]n]\ c ij?1 + vi = 0
CA c + ul  0
CA + vl = c
and including the unbiased constraint gives
C+A+ c+
where:
C+ =
Cn Cln
Cn i Cnn
1 1
and Cij is the covariance between location i and j,
Ao
A+ =
An
v
Coi
C+ =
Co n
0
14
hence if C+ is invertible,
A+ = C;1^ (2.16)
Cressie [5] does the above analysis in terms of the variogram. Also, the solution
to the kriging system yields a best linear unbiased predictor (BLUP) of Z(x0).
The solution to the kriging system requires C to be invertible. Let Y be a
linear combination of Z(xj)s and let Z(xj) have expectation m and covariance
C(h) or semivariogram 7(h). Y is a random variable, hence, the variance of
Y must be positive or 0.
m
y = 5> jZ(Xi) (2.17)
1=1
for any weight A* The variance of Y can be written as:
Var(Y) =^2^2 XiXjC(xi Xj) > 0 (2.18)
i j
Because C is a covariance matrix it must be positive definite. All positive
definite matrices are invertible.
Now, take into account equation 2.8 and the above relation, and rewrite
the variance of Y in terms of the variogram.
Var[y] = C(0)EAiEAiEEAiAi7(ii^) (2.19)
i j i j
In the case where the variance C(0) does not exist, only intrinsic stationarity
is assumed, and the variance of Y is defined on the condition that the sum of
15
the As is 0, and the term C(0) is eliminated, leaving only
Var[Y] = ~Y,Y, AiAr/(^ ~ Xj)
i j
(2.20)
Thus, the variogram function must have the condition of conditional positive
definite (negative) function. The above discussion is found in [15].
Cressie [5] cites three important properties of the variogram.
(1) 2y() is continuous at the origin, then Z() is L2 continuous. That is,
E [(Z(x + h) Z(x))2 v 0 if and only if 27(h) > 0 as h y 0.
(2) If 27(h) does not approach 0 as h approaches the origin, then Z() is
not L2 continuous and is highly irregular. This discontinuity of 7 at
the origin is referred to as the nugget effect.
(3) If 27(h) is a positive constant, then Z(xj)s are uncorrelated for all x
regardless of how close they are; Z() is called white noise.
Further properties of the covariance function and the variogram are:
C(0) = Var(Z(x)) > 0 a priori variance cannot be negative
C(h) = C(h) covariance is an even function
C(h) < (7(0) Schwarzs inequality
7(0) = 0
lih) = 7 (h)
16
2.4.2 Functions and Graphs There are a number of known functions
that are useful as model for the variogram and assure negative definiteness.
Graphically, if the variogram stops increasing beyond a certain distance, this
value is known as a sill. Mathematically, this is denoted 7(00) = M. Infor
mally, this says that two random variables separated by a large distance are
uncorrelated. The distance at which the variogram stops increasing is known
as the range. Variograms with a sill are known as transition models. If an
intrinsic random function reaches a sill, then the random function is second or
der stationary. Then we can have the following relationship from equation 2.8
C(h) = 7(00) 7(h) From this relationship, 7(00) = (7(0).
Finally, if the variogram 7(h) is a function only of magnitude and not of
direction, then the variogram is said to be isotropic (this is also true for the
covariance function). A function which is not isotropic is said to be anisotropic.
Anisotropy will be discussed in the next section.
2.4.3 Isotropic Transition Models Three typical isotropic transition
models are are given below along with their graphs.
(1) Spherical Model
f
0
h = 0
V h [0, a] (2.21)
Cq + C7
V h> a
17
where cq + a is the sill, c0 is a nugget effect and a is the range. Further
more, a, Co and a are all positive. The following plot is standardized
to reach a sill value of 1 at a range of 5 with no nugget effect.
Figure 2.1. Spherical model
(2) Exponential Model
0 h = 0
II f ( 3*M > h^O (2.22)
Co + O' < 1 ("T 1
The exponential model will reach its sill asymptotically. Again, c0 + a
is the sill, cq is the nugget effect and a is the range, all positive real
numbers. For this model, the range is the value where the variogram
reaches 95% of the sill, since this model reaches the sill asymptoticly.
A graph of the exponential function has a range of 5 along with a sill
of 1, and no nugget effect.
18
Figure 2.2. Exponential Model
(3) Gaussian Model
0
h = 0
7 (h) = <
Co + o < 1 exp
m
a*
0 < h <
(2.23)
where c0, a and a have the same interpretation as above, where the
range is 3.
2.4.4 Nontransition models Models without a sill are intrinsic only.
These models allow for infinite variance. Frequently used models are:
(1) Power Model
(2)
7(h) = <
0 h = 0
(2.24)
c0 + bp\h\e h 7^ 0
where c0 is the nugget effect. A special case is the linear model, where
9 is 1. bp is the slope in the linear model. A graph of this model is
shown in Figure 2.4.
Logarithmic Models
7(h) = log(h)
(2.25)
19
variogran variogram
0.8
Figure 2.3. Gaussian Model
Figure 2.4. Power model: 'y(h) = x2
20
1.8
Figure 2.5. Logarithmic Model
With this model, lim/^o h = oo. Journal and Huijbregts point out this model
must be regularized, see page 82 [15].
2.4.5 Hole Effect Models A third type of models are known as Hole
Effect Models. These models are used when growth is not monotonic and can
occur with or without sills. They are of the form:
7(h) = 1 ^ (2.26)
2.5 Anisotropy
The Random Variable is said to be anisotropic when its variability is not
the same in every direction. Using Figure 2.7 below, variability is dependent
on both magnitude and direction. For example, a nice form of anisotropy is
21
1.4
Figure 2.6. HoleEffect Model
when a contour of the variogram surface will yield elliptical shapes at various
distances. Along the major axis of the ellipse, variability increases slowly, and
along the minor axis of the ellipse, variability rises rapidly. These will be
referred to as major and minor axes of anisotropy, respectively.
A complete understanding of the region of interest is needed to correct for
anisotropy. Anisotropy corresponds to the existence of preferential directions at
the time of the genesis of the studied phenomenon [15]. Examples of anisotropy
are:
Monitoring air pollution with a prevailing wind.
Ore grades along a river bed.
Contamination along porous media
Journel and Huijbregts [15] state, Anisotropies will be represented by the
method of reducing them to the isotropic case either by a linear transformation
22
Figure 2.7. Variogram contour surface with anisotropy
of the rectangular coordinates (hu, hv, hw) of the vector h in the case of geomet
ric anisotropy, or by representing separately each of the directional variabilities
concerned in the case of zonal anisotropy
Traditionally, anisotropy has been classified as either geometric or zonal.
In the classical book by Isaaks and Srivastava [13] along with Journel and Hui
jbregts [15], geometrical anisotropy is characterized by directional variogram
that have approximately the same sill but at different ranges. Geometrical
anisotropy can be corrected by simple linear transformations, whereas zonal
anisotropy cannot [15]. Isaaks and Srivastava [13] describe zonal anisotropy
as one in which the sill value changes with direction while the range remains
constant. I agree with Zimmerman [27] that the above definitions of zonal
anisotropy are inadequate. This nomenclature does not describe the type of
23
anisotropy that characterizes the region of study. Zimmerman suggests aban
doning the term zonal anisotropy in favor of more descriptive terms, such as
range anisotropy, sill anisotropy, and nugget anisotropy. The classical term
for geometrical anisotropy should be retained. Each of the different types of
anisotropy is discussed below.
2.5.1 Geometric An excellent discussion of Geometric Anisotropy is dis
cussed in chapter 16 of Isaaks and Srivastava [13], as well as in Journel and
Huijbregts [15]. Basically, variograms are computed and plotted in various di
rections, and in each direction, the sill value will be the same, but the range
value will be different. Below is a graph to demonstrate this concept.
Figure 2.8. Variograms representing geometric anisotropy.
Both variograms have a sill value of 1. The E/W direction reaches its sill at a
range of 3, the N/S reaches its sill at a range of 6.
Using Figure 2.7 the correction for geometrical anisotropy is as follows:
24
(1) Rotation of coordinate axis
(2) Transformation of locations.
The rotation of coordinate axis uses the following transformation matrix in 2
and 3 dimensions, respectively
cos(0) sm(6)
R =
sin (6) cos (9)
where 9 is the angle of rotation, and
R =
cos(a) cos((f)) sin(a) cos(^) sin(^)
sin (a;) cos (a) 0
cos(a) sin(0) sin(a;) sin((/>) cos(a)
(2.27)
(2.28)
where a is the rotational angle in the xydirection and
zx direction.
For simplicity, the figure below shows the rotated axis aligned along the
major and minor axis of anisotropy in two dimensions.
Figure 2.9. Transformed Axis
Axis aligned along major and minor axis of anisotropy
25
Now that we have addressed rotation of coordinates, we next address trans
formation of the coordinates. Isaaks and Srivastava talk about standardizing
the range. That is, we can reduce the range in which the variogram reaches
its sill value to 1. Refer to Figure 2.8, suppose that each of these variograms
represent a variogram in the x direction and y direction. Suppose that in the
x direction, the sill is reached at the range value of 3, and the y direction, the
sill value is reached at the range of 6. We can divide each of the x coordinates
by the range value of 3 and the y coordinates by the range value of 6. Now,
in each direction, the sill value in each direction will reached at a range of 1.
We can consider the coordinate transformation matrix for 2 and 3 dimensions,
respectively.
(2.29)
and for 3 dimensions
T =
r 0
Q>x
0
0
0
0 0 r
(2.30)
Putting the two transformations together, we have the new coordinates give
by h',
h' = TRh
(2.31)
Note that T and R are not reversible, the transformations must be in this
order.
26
25.2 Sill Anisotropy Figure 2.10 below is an example of anisotropy in
the sill. Here we see the sill values are different in two directions.
Figure 2.10. Sill anisotropy variograms.
The semivariogram in the EW direction given by the dotted line and the
semivariogram in the NS direction is given by the solid line.
Since the variogram attains a sill, this means the region has second order
stationarity. Zimmerman [27] shows that sill anisotropy may be evidenced by
a trend in the data, nonvanishing spatial correlation or measurement errors
which are correlated or have unequal means. If one were computing an ex
perimental variogram that has unequal sills, then exploratory analysis of the
data may yield a trend in the data. This may explain the first reason for
sill anisotropy. Zimmerman states experience shows this to be most likely the
case. A trend in the data is a violation of the stationarity hypothesis, and may
cause the non vanishing spatial correlation. The analyst should abandon the
27
stationarity hypothesis, estimate the trend, remove it and proceed with ana
lyzing the residuals as a new data set. This should yield isotropic directional
experimental variograms.
Correlated measurement errors or measurement errors having unequal means
was another reason for sill anisotropy. Zimmerman uses an example from a drill
hole data set. The samples can be analyzed on different occasions, even per
haps by different individuals resulting in correlated measurement errors, or
measurement errors of unequal mean. Also, it may be reasonable to assume
that measurements from different drill holes are uncorrelated, but measure
ments from the same drill hole are equally and positively correlated. This
being the case, the only correction the analyst can use is a nested model.
Nested variograms are of the form:
7*(k) =Â£7i(lAih)
i=l
(2.32)
where
Ai,..., Ar
are matrices that define appropriate directions
A1h/A1h,...,Aroh/Amh
and
7i ()> >7n()
28
are isotropic variograms.
2.5.3 Range Anisotropy In Section 2.5.1, geometric anisotropy and its
correction was discussed. Referring to Figure 2.8, the variograms in two differ
ent directions reach the same sill, but at different ranges. Range anisotropy is
the type of anisotropy which is not corrected by linear transformations. Here
we are contributing the range anisotropy solely to measurement error and not
to any physical considerations of the region of study.
Zimmerman states that the current practice is to fit a nested variogram
model which imposes no constraints on the ranges of the directional variograms
or their sills. Zimmerman believes it is advisable, based on the assumption
and plausibility of second order stationarity, vanishing spatial correlation and
white noise measurement error, that constraints should be imposed on the
range and sills to ensure the fitted sills are equal. In the example he uses,
he supposes empirical variograms are fitted in the following directions: N/S,
E/W, NE/SW, NW/SE. Furthermore, the empirical variograms resemble the
exponential model. Then,
EW direction 72(fi) = Q\\l exp(a\h)]
NESW direction 72 (/i) = d2\l exp(a2h)\
NS direction 72 (fi) = 03[1 exp(a3h)}
29
NWSE direction 7z{h) = 04[ 1 exp(04/1)]
Under the assumptions of second order stationarity, vanishing spatial correla
tion and whitenoise measurement error, we must require 61 = 02 = O3 #4
2.5.4 Nugget Anisotropy Finally, for completeness I would like to show
and mention nugget anisotropy. Figure 2.11 is an example of nugget anisotropy.
A nugget effect is discontinuity at the origin. This is attributed to microscale
variability in the design region. Directional variograms not only have a nugget
effect, but they are different in each direction. Nugget anisotropy is attributed
to correlated measurement error. This measurement error is inadequately de
scribed by white noise. Correlated measurement error is found between rows,
between columns or both. Below is a graph of nugget anisotropy.
Figure 2.11. Nugget Anisotropy
30
3. Data Analysis
The previous chapter concentrated on the theory of geostatistics. This
chapter is primarily concerned with the practice of geostatistics. Some of the
information that the analyst should know, includes:
How the data were collected
The region of study
The variable (s) of interest
Who collected the data
Goal of the study
Assessment of stationarity
An exploratory data analysis will reveal interesting aspects of the data.
Before computing any experimental variograms, trends should be checked to
assess stationary assumptions. Recall in section 2.3, the mean of the data is
constant and does not depend on location. A trend in the data implies a non
constant mean, depending on the location. If there is a trend, it should be
estimated by either median polish, or polynomial regression. Once identified,
31
the trend can be removed, and the residuals used as a new data set. This chap
ter begins with identifying different sample designs. Second, the terminology
used to compute the experimental variogram are discussed. Third, the meth
ods used to compute the experimental variogram and finally, methods used to
fit variograms are discussed.
3.1 Sample Design
The sample design for spatial data may or may not be on a grid. In chapter
3 of Ripley [21], he shows four ways to sample spatial data. Uniform Sampling
chooses locations uniformly throughout the region and independently of each
other. Stratified random sampling takes a uniform random sample of size k
from each of m strata or subareas, so n=km. Centric Sampling is where the
samples are taken on an equally spaced grid in the region. Finally, NonAligned
Sampling is where the sample locations are off the nodes of a grid. Of these,
Uniform Sampling, Stratified Random Sampling and NonAligned Sampling
result in sampling that are not on a grid. Centric Sampling is on a grid. For
the rest of this chapter, I will differentiate gridded sampling and nongridded
sampling.
3.1.1 Gridded Data The following figure is an example of data which
are on a grid. This example is from Cressie [5], chap. 2. These data are
32
0
0 0 0
0 0 0 0 0
0000000
0000000 0 0 0
00000000 0 0000000
00000000 0 0 0 0000000 0 0
0000000 0 0 0 0000000 0 0
0000000 0 0 0 0000000 0 0 0 0
0000000 0 0 0 000000 0 0 0 0
000000000000000000000
000000 0 0 0 0000000 0 0 0 0 0
0 0 0 0 0 0 0 0 0000000 0 0 0 0 0
0 0 0 0 0 0000000 0 0
0 0 0 0 0 0 0
0 0 0
1 02 1 SI 1 (H 1 s
+3
CD Ti
CO 3
$ bO
cd cfl
Q bO
rO Â§
co ,0
'Trt 03
CO +?
O c3
O to
CO
CD
fn
3
CO
O
o
r1
tJ
a3
O
bO o
*h ^
& ^
qyou
sampled on equidistant transects of a design region. The number of partitions
along the x and y axis are denoted as nx, ny respectively. The separation
between each adjacent partition along the x and y directions are denoted xsiz,
ysiz, respectively. A lag is the distance between two locations and should not
exceed Lj2, where L is the maximum distance of any two locations in the
design region.
3.1.2 NonGridded Data The figure below shows a sample area that
the sample data are not on a grid.
6.50.:
: 8
5.50 J <3
:
4.50.1 < 0
3.50 J 2.50.1 ( D
1.50 J
.50J
.50
.00 1.00 2.00 3.00 4.00 5.00 6.00 7.00
Figure 3.2. Data which are not on a grid
Figure 3.3 should help clarify some of the following terminolog needed to
compute the sample variogram with nongridded data. Using Figure 3.3, the
origin will represent a location in the design region. The azimuth is the mea
sured angle from the yaxis, clockwise. We can partition the direction vector
into nlag, the number of lags to compute the sample variogram. In Figure 3.3,
34
Y axis (North)
Figure 3.3. NonGridded Data
This figure shows the terminology used in computing the experimental
semivariogram for nongridded data.
35
there are 5 lags. The maximum distance used to compute the sample variogram
is L/2, where L is the maximum distance between two locations in the sample
region. The lag separation hiag, is determined by dividing the maximum dis
tance, L/2 by the number of lags. Two important tolerance intervals are the
lag tolerance and azimuth tolerance. The lag tolerance is some 5 amount from
the current lag, such that any location within h(a5 5 interval is considered to
be at the current lag. Next is the azimuth tolerance, this is a e amount such
that any point within the interval azimuth e is considered in the current
azimuth.
3.2 Estimators
Now, I will introduce some methods of calculating spatial variability from
data. The first estimator is known as Matherons estimator. This estimator is
based on method of moments, and is an unbiased estimator of 7(h), where h
is magnitude, \\h\\.
1 Nh
2 = aT Â£(Z(x + h) z(*))2 C31)
i=1
where Nh is the number of pairs of locations separated by a distance h.
The next estimator is a robust estimator derived from Cressie and Hawkins [3].
It is recommended to use this estimator when there are many outliers in the
36
sample data.
= feESll^(x + h)Z(x)r)
71 } 0.457 + 0.494/A^
(3.2)
Deutsch and Journel [8] give nine other methods to measure spatial variability.
3.3 Computing Sample Variograms
The variogram modeling process can begin with an omnidirectional vari
ogram. An omnidirectional variogram can be thought of loosely as an average of
the various directional variograms [13]. The omnidirectional variogram serves
a useful starting point for establishing some of the parameters required for
sample variogram calculations. It has been shown to be useful when the data
are not on a grid to establish the lag tolerance. Once the lag tolerance is es
tablished, then directional sample variograms are computed, and the azimuth
tolerance can be established. This does not necessarily need to be done when
the locations are on a grid since the number of pairs to compute the sample
variogram is highest when the variogram is computed along the North/South
and East/West directions, respectively. Once the parameters are established,
the analyst can proceed with assessing anisotropy, if needed.
Figure 3.4 shows an omnidirectional variogram before adjusting the lag
tolerances.
The number of lags in the computed omnidirectional variogram has been
37
Semivariogram
direction omnidirectional
Y
T
3.00
Figure 3.4. Omnidirectional variogram of elevation data. The
parameters need to be adjusted to smooth the sample variogram.
38
reduced to produce the sample variogram in Figure 3.5. By reducing the num
ber of lags used to compute the sample variogram, we increase the lag spacing,
which in turn includes more pairs in estimating each point in the sample vari
ogram. The result is a smoother sample variogram.
Figure 3.5. Omnidirectional variogram of elevation data after
parameter adjustment
3.4 Diagnostic Tools
When the experimental variogram is computed in various directions, and,
if they are different, the analyst must determine the reason. There may be a
number of reasons for the apparent anisotropy such as outliers in the data or
a trend in the data may result in violations in the stationarity assumptions.
This section will deal with ways of handling the apparent anisotropy.
Graphical tools help in identifying trends in the data, particularly spin plots
and scatter pair plots. GSLIB [8] has a routine called locmap which color scales
the location of the data. Another useful plot is a text plot, which plots the
39
value of the data at the location instead of a dot. One technique of removing
the trend is by Least Squares. The trend is estimated by a polynomial function
and removed. The residuals are obtained and used as a new data set. Hence, an
experimental variogram is computed on the residuals, and if the trend removal
has been successful, the process yields an isotopic experimental variogram. As
previously mentioned, outliers must be taken into consideration. If the analyst
can justify the removal of the data, they can be removed; otherwise more robust
methods should be used in the analysis.
One method of removing certain twodimensional trends uses median pol
ish [5]. When the data are on a regular grid, they may be treated as row and
columns of data. Assuming there are no row and column interactions, median
polish gives a twoway fit, and the residuals become the new data set. In either
method, anisotropy may still be an issue. In some cases, the removal of the
trend may yield an isotropic data set, but in others it may not. The analyst
must employ methods of dealing with anisotropy in the new data set by the
methods outlined in section 2.5, or other creative ways.
With nongridded data, if there is a trend in the data, it must be removed.
Cressie [5] mentions that a grid can be imposed over a region of study, and
the locations are referenced to the nodes of the imposed grid. Median pol
ish with missing values is again employed, and residuals obtained. Another
40
method uses linear smoothers. If there is no interaction in the northern and
eastern directions, smoothing techniques may be used. Specifically, General
ized Additive Models [11] are nice to use. If a plot in the data shows a linear
trend in the north direction and east direction, the trend is removed by the
deterministic part and a smooth of the residuals in each direction is used to
assess the removal of the trend. An example of using this method is presented
in the next chapter.
3.5 Fitting Sample Variograms
Once the sample variogram has been computed and anisotropy has been
detected, and plotted, the next major task is to fit a parametric variogram func
tion. As several authors have mentioned, fitting a variogram function to a sam
ple variogram has been done by adhoc techniques, such as eyeballing the
sample variogram and guessing parameters. One software package, GeoEas [1],
employs such a method. Another method is using interactive software, such as
infomap (Bailey and Gatrell [2]). A sample variogram is computed, the user
enters a variogram function, then interactively tweaks the parameters until the
curve looks like it fits the sample variogram. These two packages are limited
to the exponential, spherical and gaussian models. SPLUS [12] also has this
capability, but has the power and flexibility to execute the methods mentioned
41
below.
A number of methods to fit a sample variogram are outlined in section 4.6
of Cressie [5]. A comparison of the methods are found in Zimmerman and
Zimmerman [28] as applied to ordinary kriging. The methods are: maximum
likelihood, restricted maximum likelihood (REML), minimum norm quadratic
estimation (MINQ), least squares, generalized least squares. Further infor
mation on maximum likelihood is found in [17], REML and MINQ in [5]. As
Zimmerman and Zimmerman have demonstrated, the method of weighted least
squares using Matherons estimator, weighted least squares using Cressies ro
bust estimator, or ordinary least squares are preferred. This assessment is
based on kriging variances and ease of computation.
Are using the adhoc techniques such a bad idea? Cressie and Zimmer
man [4] discuss the stability of variogram parameters. Grondona and Cressie [10]
have shown the parameters of the variogram to be more stable than estimat
ing the covariogram. I have found that this area has great potential for large
amounts of research, as well as good techniques for fitting valid variogram
functions to sample variograms.
42
4. Analysis of Topographical Data Set
This chapter presents an example using a topographical data set to il
lustrate topics mentioned in the previous two chapters. The objective is to
estimate the variogram from the data and to use Kriging to reconstruct the
surface. The data set is taken from Davis [7], and the measurements are eleva
tion data taken on a 310 x 310 area in 9?2, where the units are in yards. The
locations in the design region were chosen randomly. Furthermore, the region
is scaled to 0 and 6 both in the North/South and East/West directions.
4.1 Exploratory Data Analysis
The data are displayed using various plots in Figure 4.1. Figure 4.1(a)
is a contour plot of the data, along with the locations of the data. Two hills
appear in the lower right corner and lower left corners, with a valley separating
them and continuing North. The values are higher on the left and right side
of the picture, decreasing towards the center. This observation may suggest a
quadratic nature in the East/West direction. Furthermore the values increase
moving from the North towards the South, indicating a linear structure in
the North/South direction. Figure 4.1(b) is a text plot of the values of the
43
data. The following two graphs are perspective plots viewing South and East,
respectively. In the East/West direction, the trend appears quadratic, along
the North direction, it appears to have a linear component. Care must be
taken using these graphs as they have been preprocessed using an interpolation
algorithm. A brush and spin plot verifies the above assessment. Because of
these trends, I have decided to remove them before any further analysis. I used
a Generalized Additive Model [11] to model the trend since the data are not on
a grid, and I assumed no interaction between the North/South and East/West
directions based on the above plot. The model consists of a second degree
polynomial in the East/West direction, and a first degree polynomial in the
North/South direction. A second component was added to the model to check
for any bias. This component consists of a loess smooth of the residuals on
both the North/South and East/West. The output of the GAM fit is shown in
Figure 4.2.
The loess fit confirms a quadratic component in the East/West direction.
A regression line of the quadratic fit is added in Figure 4.2(a). The linear com
ponent in the North/South is captured by the fit and added to Figure 4.2(b).
The residuals plot in Figure 4.2(c) and Figure 4.2(d) show no structure in
the East/West and North/South directions. The loess smooth is added to the
residual plots to assess the fit of the model. The smooth shows no structure.
44
650 700 750 800 850 900 950 1000
(a) Contour and point plot of data
(b) Text plot of data.
4 3 2
East Direction
(c) Perspective plot of data looking (d) Perspective plot looking East
South.
Figure 4.1. Various Plots of Topographical Data
45
Therefore, I have modeled the trend successfully. Figure 4.3 displays some plots
to visualize how well the GAM model worked. Now that the trend has been
identified and removed, I will use the residuals as a new data set to compute a
variogram.
4.2 Variogram Analysis
The next step is to compute an omnidirectional variogram using the residu
als as a new data set. This iterative procedure will help obtain the settings for
the number of lags and the angular tolerance. After three iterations, I found
that using 10 lags produced a nice omnidirectional variogram. The next step
is to determine if the data are anisotropic. This is accomplished by computing a
directional variogram with a small angular tolerance. This iterative procedure
started with 5 angular tolerance and I incremented in steps of 10. A direc
tional variogram of the residuals is shown in Figure 4.4 using an angular toler
ance of 45. The directions computed are 0, 30, 45, 90, 120, 135, and 150.
It appears, in the 90 through 150 degree directions, the sill of the variogram
is between 1000 and 1500, with a range of 3 to 4. The results appear to be
different in the 0 to 60 degree direction. A conclusion of anisotropy may be
assessed at this stage of analysis. To aid in the assessment, Figure 4.5 is a
contour of the variogram surface based on the directional variogram.
46
lo(x) poly(x. 2)
X
(a) Quadratic component along
East/West
X
(c) Residuals along East/West
Figure 4.2. Plots
y
(b) Linear component along
North/South
y
(d) Residuals along North/South
GAM output
47
Residuals
(a) Contour plot of the residuals.
(b) Perspective plot of residuals.
(c) Perspective plot of residuals look (d) Perspective plot of the residuals
ing South. looking looking West.
Figure 4.3. Various Plots of Residuals
48
gamma
0.0 OS 1.0 is 2.0 2.5 3.0 0.0 0.5 t.O 14 2.0 24 40
distance
Figure 4.4. Directional Variogram of Residuals
49
N/S distance
E/W distance
Figure 4.5. Contour of the Variogram Surface
50
From Figure 4.5, one may assess geometrical anisotropy due to the ellipti
cal form of the variogram surface. Recall from section 2.5.1 that geometrical
anisotropy is corrected by rotating and extending one of the coordinates. If the
transformation works, then we can conclude geometrical anisotropy. However,
apparent anisotropy may be due to the sampling design or noise. To assess
the apparent anisotropy, I simulated a new response using a known covariance
structure that produces isotropic realizations.
4.3 Simulation
The omnidirectional variogram computed in section 4.2 is shown in Fig
ure 4.6. From this graph, it appears to have a range of 2 and sill of 1200.
Using the coordinates from the topological data set, I simulated values using
an exponential covariance function with range of two and a sill of 1200. The
simulator returns values from an isotropic random field.
The directional variogram using the simulate data is shown in Figure 4.7.
The angular tolerance is 45 degrees and the number of lags is 10 as before.
We can see in the 30, 120, 135 and 150 degree azimuths there is some clear
structure: the sample variogram appears to reach a sill. Knowing the simulated
values are stationary and isotropic, the variograms should be the same in each
direction computed, which clearly is not occurring.
51
The number of pairs used to compute each point on the sample directional
variogram is less than 30. A rule of thumb is that the number of pairs used to
estimate each point in the sample variogram should have at least 30 to 50 pairs.
I chose a minimum of 40. To increase the number of pairs, one can increase
the lag tolerance, which by default is half the lag separation. I increased the
lag tolerance for a number of instances, but the results were not favorable, the
number of pairs did not increase. The next method is to increase the angular
tolerance. I increased the angular tolerance in increments of 5. Finally at
the 70 angular tolerance I achieved favorable results. The number of pairs
to estimate each point was 40 and the sample directional variogram looked
favorable.
The results of increasing the angular tolerance show that in each direction,
the sample variograms are similar, and the variogram surface is fairly circular.
It was interesting to note the angular tolerance used to produce the results
shown in Figure 4.8 is 70 degrees. I performed this simulation 5 times and each
time came to the same conclusion. The conclusion drawn from this analysis, is
that I have an anomalous sample data set, in which these high tolerances are
needed to produce the known isotropic results. I need to increase the angular
tolerance to produce a reliable sample variogram.
53
to).aÂ£nurtft70 degrees
(a)
Variogram Surface
E/W
(b)
Figure 4.8. Directional Variogram and Variogram Surface Plots
54
gamma
(9
E
E
a
a
dstance
Figure 4.6. Omnidirectional variogram, number of lags equal 10
distance
Figure 4.7. Directional variogram, nlag=10, tol.azimuth=40
52
4.4 Completion of Original Data
Going back to the original data set, I used the omnidirectional variogram
in Figure 4.6. I believe by the simulations, the original data set does not
have any anisotropy, and the results are unique due to the sampling and the
measurements under study.
Using Cressies weighted least squares, I fit an isotropic gaussian variogram
to the residuals. The fitting procedure returned the values: range = 1.522 ,
sill = 943.724 and the nugget = 18.07. The fit looks good and is shown in
Figure 4.9
4.5 Kriging
Now that the variogram parameters are estimated, we can Krige the surface.
The SPlus [12] algorithm chooses a 30 x 30 grid in the design region along with
the data locations to perform kriging. After kriging the residuals, I will add
in the estimated trend from section 4.1. This will give me the predictions at
unsampled locations. A contour of the surface along with the contour of the
standard errors are shown in Figure 4.10.
The small standard errors show the kriging routine produced reasonable
results. The surface along the North/South direction and the East/West di
rections are smooth as compared to Figure 4.1. Overall, I am comfortable with
55
gamma
Figure 4.9. Weighted least squares fit of omnidirectional variogram
56
EifcWe* QLwW EkaWcdDndon
(a) Kriging surface (b) Standard errors of krig
ing surface
i a a < s
NotvSaihCMcCen
(c) Perspective plot of Krige
surface
4 3 .
EasVWest Direction
(d) Kriged surface along
East/West
Figure 4.10. Results from Kriging.
57
the results starting with the estimation of the variogram and the surface results
produced in Figure 4.10.
4.6 Concluding remarks and open questions
The above analysis was an application of some of the geostatistical tech
niques used in practice. The data was plotted and trends were identified. The
next step was to remove the trend. My research has shown that analysts are
looking for robust methods to remove the trend. Robust methods produce
smoother Kriging surfaces. I chose to use a generalized additive model (GAM)
to remove the trend. The basic reason for choosing this method is because
I wanted to verify the trend had been correctly identified and removed. A
nonparametric tool for this assessment was to use a loess smooth [11] on the
residuals. Generalized additive models has become an interest to me and I plan
to further my knowledge of this tool. After many steps of computing the om
nidirectional sample variogram, I computed the directional variogram to assess
anisotropy. Simulation helped to reject anisotropy. Finally, a parametric vari
ogram was fitted to the sample variogram. This completed the variogram study
of the region, which is the longest part of any geostatistical analysis. Kriging
the region was a simple task to complete. After completion of the Kriging
routine, questions remain as to the fit of the surface. We have not accounted
58
for variability from the variogram fit and the trend fit when we reproduced the
Kriged surface. This area is open for further research.
59
REFERENCES
[1] U.S. Environmental Protection Agency. GeoEas 1.2.1 Users Guide. 1991.
[2] Trevor Bailey and Anthony C. Gatrell. Interactive Spatial Data Analysis.
Longman Group Limited England, 1995.
[3] Noel Cressie and Douglas Hawkins. Robust estimation of the variogram.
Mathematical Geology, 12(2), 1980.
[4] Noel Cressie and Dale L. Zimmerman. On the stability of the geostatistical
method. Mathematical Geology, 1992.
[5] Noel A.C. Cressie. Statistics For Spatial Data. John Wiley &: Sons USA,
1993.
[61 Noel A.C. Cressie. The origins of kriging. Mathematical Geology, v 87(n
3):239252, Apr 01, 1990.
[7] John C. Davis. Statistics and Data Analysis in Geology. John Wiley and
Sons, New York, 1986.
[8] Clayton Deutsch and Andre Journel. GSLIB: Geostatistical Software and
Users Guide. Oxford New York, 1992.
[9] Clayton V. Deutsch. Kriging with strings of data. Mathematical Geology,
1994.
[10] Martin 0. Grondona and Noel Cressie. Residuals based estimators of the
covariogram. Statistics, 1995.
[11] T.J. Hastie and R.J. Tibshirani. Generalized Additive Models. Chapman
and Hall, 1990.
[12] Mathsoft Inc. S+SpatialStats. 1996.
[13] Edward H. Isaaks and R. Mohan Srivastava. An Introduction to Applied
Geostatistics. Oxford New York, 1989.
60
[14] E.H. Isaaks and Mohan Srivastava. Spatial continuity measures for proba
bilistic and deterministic geostatistics. Mathematical Geology, 20(4):313
341, 1988.
[15] A. Journel and H. Huijbregts. Mining Geo statistics. Academic Press, New
York, 1978.
[16] P.K. Kitanidis. Introduction to Geo statistics. Cambridge, 1 edition, 1997.
[17] K.V. Mardia and R. J. Marshall. Maximum likelihood estimation of models
for residual covariance in spatial regression. Biometrika, 71:135146,1984.
[18] K.V. Mardia and A. J. Watkins. On multimodality of the likelihood in the
spatial linear model. Biometrika, 76(2):289295, 1989.
[19] G. Matheron. Principles of geostatistics. Economic Geology, 58:1246
1266, 1963.
[20] G. Matheron. The intrinsic random functions and their applications. Ad
vances in Applied Probability, 5:439468, 1972.
[21] Brian Ripley. Spatial Statistics. Wiley and Sons, 1982.
[22] Michael Stein and Mark Handcock. Some asymptotic properties of krig
ing when the covariance function is misspecified. Mathematical Geology,
21(2):171190, 1989.
[23] J.J. Warnes and B.D. Ripley. Problems with likelihood estimation of
covariance functions of spatial gaussian processes. Biometrika, 73(3):640
642, 1987.
[24] A.J. Watkins. On models of spatial covariance. Computational Statistics
and Data Analysis, 13:474481, 1992.
[25] A.J. Watkins and F.H.M. AlBoutiahi. On maximum likelihood estimation
of parameters in incorrectly specified models of covariance for spatial data.
Mathematical Geology, 22(2), 1990.
[26] A.M. Yaglom. An Introduction to the Theory of Stationary Random Func
tions. PrenticeHall, 1962.
[27] Dale L. Zimmerman. Another look at anisotropy in geostatistics. Mathe
matical Geology, 1993.
61
[28] Dale L. Zimmerman and M. Bridget Zimmerman. A comparison of spatial
semivariogram estimators and corresponding ordinary kriging predictors.
Technometrics, Feb 1991.
62

PAGE 1
VARIOGRAM MODELING AND ESTIMATION by Norman E. LeMay, Jr. B.S., University of Colorado at Denver, 1995 A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science Applied Mathematics 1998
PAGE 2
This thesis for the Master of Science degree by Norman E. LeMay, Jr. has been approved by Weldon Lodwick s /b/ct 5 Date
PAGE 3
LeMay, Norman E.(M.S., Applied Mathematics) Variogram Modeling and Estimation Thesis directed by Professor James R. Koehler ABSTRACT Kriging is a form of spatial interpolation. A model, Z(x) = JJ(x) + c5(x) is considered for the region D, where JJ(x) represents the global variability and c5(x) is a zero mean, stationary random process which represents small scale variability. An estimator for Z(x0 ) is Z(x0 ) = XZ(x). It will be shown that the A's depend on the spatial covariance function. Kriging assumes a known covariance function, but in practice, the covariance function is unknown, and is estimated from the data. A.n alternative function to represent spatial variability is the variogram, hom which the covariance, under certain conditions can be derived. The main emphasis of this thesis is estimating the variogram used in Kriging. The first chapter presents background to the geostatistical problem. Chapter two presents the theoretical framework for variogram modeling. Random functions, moments of random functions, stationary, along with variogram functions and lll
PAGE 4
graphs are defined and discussed in this chapter. Chapter three discusses estimation of the variogram, including sampling, terminology used in estimating variograms, estimators and how to fit a true variogram to a sample variogram. Chapter four illustrates this material by analyzing a topographical data set. An analysis of this data set will show trend removal, simulations to aid in as sessing anisotropy, and the final kriging surface along with the standard error surface. This abstract accurately represents the content of the candidate's thesis. I recommend its publication. James R. Koehler !V
PAGE 5
ACKNOWLEDGEMENTS My sincere thanks to my parents, Norman E. and Deanna F. LeMay. With out their loving support, kindness and encouragement, I would have not been able to complete this goal. A great debt of gratitude is owed to my long suffer ing advisor, Dr. James R. Koehler. His encouragement, advising and patience has gone above and beyond the call of duty as mentor and friend. Thanks Dr. Koehler from the bottom of my heart. v
PAGE 6
CONTENTS Chapter l. Introduction .......................... 1.1 1.2 Motivation .......................... Outline ............ ......... .... 2. Terminology ............................ 2.1 2.2 Random Function ....... ............. Properties of Random Functions 2.2.1 2.2.2 First Moments Second Moments ' ........... .... . ........... .... 2.3 Stationarity ...... ........... ..... 2.4 2.3.1 2.3.2 2.3.3 Strict Stationarity Covariance Stationarity Intrinsic Stationarity ............ ' Variograrn Properties .. ..... ........... 2.4.1 2.4.2 Ordinary Kriging Functions and Graphs VI 1 1 3 5 6 7 7 7 8 8 9 11 11 11 17
PAGE 7
2.4.3 2.4.4 2.4.5 Isotropic Transition Models Nontransition models Hole Effect Models 2.5 Anisotropy .... 2.5.1 2.5.2 2.5.3 2.5.4 Geometric Sill Anisotropy Range Anisotropy Nugget Anisotropy 3. Data Analysis ... 3.1 Sample Design 3.2 3.3 3.4 3.5 3.1.1 3.1.2 Gridded Data NonGridded Data Estimators . . . . Computing Sample Variograms Diagnostic Tools . . . Fitting Sample Variograms 4. Analysis of Topographical Data Set 4.1 4.2 4.3 Exploratory Data Analysis Variogram Analysis Simulation . . Vll 17 19 21 21 24 27 29 30 31 32 32 34 36 37 39 41 43 43 46 51
PAGE 8
4.4 Completion of Original Data 4.5 Kriging ...... 4.6 Concluding remarks and open questions References Vlll 55 55 58 60
PAGE 9
FIGURES Figure 2.1 Spherical model .. 2.2 Exponential Model 2.3 Gaussian Model .. 2.4 Power model: ol( h) = x2 2.5 Logarithmic Model 2.6 HoleEffect Model 2. 7 Variogram contour surface with anisotropy 2.8 Variograms representing geometric anisotropy. 2.9 Transformed Axis . . 2.10 Sill anisotropy variograms. 2.11 Nugget Anisotropy. Coal Ash Data Set 3.1 3.2 Data which are not on a grid 3.3 NonGridded Data ..... !X 18 19 20 20 21 22 23 24 25 27 30 33 34 35
PAGE 10
3.4 Omnidirectional variogram of elevation data. The parameters need to be adjusted to smooth the sample variogram. 3.5 Omnidirectional variograrn of elevation data after parameter 4.1 4.2 adjustment . . . . . . . Various Plots of Topographical Data Plots of GAM output ... 4.3 Various Plots of Residuals 4.4 4.5 4.6 4.7 Directional Variogram of Residuals Contour of the Variogram Surface Omnidirectional variogram, number of lags equal 10 Directional variogram, nlag=10, tol.azimuth=40 .. 4.8 Directional Variogram and Variograrn Surface Plots 4.9 Weighted least squares fit of omnidirectional variograrn 38 39 45 47 48 49 50 52 52 54 56 4.10 Results from Kriging. . . . . . . . . . 57 X
PAGE 11
1. Introduction 1.1 Motivation In my first semester of graduate school, I was working with my advisor on propagation of errors in maps. We were continuing some research that he had done previously and this led me into spatial statistics and Geographical Infor mation Systems (GIS). Wanting to further my understanding of these spatial problems, I took some statistics courses and an independent study in geostatis tics, after which I transfered into the statistics program. I found the problems in spatial statistics, especially geostatistics, intriguing and wanted to further my knowledge. The problems encountered in geostatistics involve spatial in terpolation and structural analysis. Spatial interpolation, known as kriging, and structural modeling, known as variogram estimation are two particular problems in geostatistics that seemed interesting. Kriging is a method of spatial interpolation used in geostatistics. The kriging model is posed as follows. Let D, a design space, be a fixed subset of iRd A location in D is denoted as x. Sampling cannot be done at every point in the design region, so a finite sampling is done in D at locations x, 1
PAGE 12
for i = 1, ... n. An unsampled location is denoted x0 We want to measure some feature in the design region such as carbon monoxide, elevation, depths to underground water. Thus, the measurement is unknown and Z(xi) is treated as a random variable. The objective is to estimate Z(x0). A spatial random process, Z(x), is modeled as, Z(x) = p,(x) + o(x) where p,(x) models the large scale variability (trend) of the spatial random process Z(x) and o'(x) is a zero mean stationary stochastic process, with covariance function, C(h). It is obvious that the random variables Z(xi) and Z(xj) are similar at a small separation distance. For the unsampled location, we can form the estimator Z(x0 ) = XZ(x), where Ai are weights. Three models for kriging are: simple, ordinary and universaL In simple kriging, the mean, p,(x) is known and constant, p,(x) = Jl, and the covariance function, C(h) is known. In Ordinary kriging, the mean is unknown but as sumed constant, and the covariance function C(h) is known. Ordinary kriging is presented in Section 2.4. Finally, in universal kriging, the mean p,(x) is un known and is estimated by a finite polynomial, and the covariance function is known. In practice, the covariance function is usually unknown, and is estimated from the data. I will show in section 2.4 that the kriging weights; A, are in terms of the covariance between all sampled locations,C(Z(x +h), Z(x)), and 2
PAGE 13
the covariance between the sampled locations and the unsampled locations C(Z(xo), Z(xi)). The variogram is an alternative method to estimate variability in the design region D. In practice, the variogram is estimated, and the covariance function, under second order stationarity assumptions, is derived from the variogram. This derivation and definition is found in chapter two. In fact, kriging can be done in terms of the variogram function, 1(h), rather than the covariance function, C(h). 1.2 Outline Stochastic modeling concepts used in geostatistics, e.g. random functions, moments of a random function, stationarity, the variogram, isotropy and anisotropy are defined in chapter two. Chapter three presents methods to estimate the variogra.m from data. For example, the classical estimator clue to Matheron [5] and a robust estimator clue to Cressie [5] are introduced. How to assess the assumption of stationarity, how to compute a sample variogram if the data are on a grid or not on a grid, different sampling schemes to collect data in the design region are also discussed in this chapter. Finally in chapter four, an example using topographical data is presented showing how the techniques in chapters two and three are implemented. I also use simulation to aid in the 3
PAGE 14
analysis of the variogram modeling process. Finally, ordinary kriging is used to obtain the surface of the topological data and is shown along with the standard error surface. 4
PAGE 15
2. Terminology Let x1 ... Xn be spatial locations in a design region. The design region is denoted as D and is contained in 3td At each location, xi, i = 1, ... n, a mea surement is taken. Denote the sample measurement as z(x;). Stochastically, let Z(x;) denote a random variable, which z(x.i) is a realization. Within the design region, these random variables are spatially correlated. Measurements that are close together tend to be more similar in value than measurements further apart. Statistically, random variables closer together are more correlated than random variables further apart. I will use the above notation throughout this thesis to denote spatial locations, x;, sample measurements, z(x.;) and random variables, Z(x;). Furthermore, locations X; and Xj are separated by a distance vector h. Here, h includes both magnitude and direction. Alter natively, two locations are denoted by x, x +h. The language of geostatistics includes terms such as random functions, stationarity, variogram, covariogram, isotropy and anisotropy. All these terms are defined in this chapter along with other geostatistical terminology. 5
PAGE 16
2.1 Random Function Randomness in geostatistics is described by a Random }unction. In geo statistics, a random function is defined on all x E D and produces random variables for each x. In general, a random function, Z(x) is characterized by the set of all its nvariate cdfs for any number n and any choice of the n locations, X;, i = 1, ... n: In particular, we will be concerned with the bivariate cdf of two random variables, Z(xi) and Z(x2): (2.2) More general properties of random functions can be found in [26] A few observations are in order. Two locations x; and Xj are separated by a distance vector h has both magnitude and direction. The two locations will be denoted as x and x + h. The random variables are dependent for small llhll Realizations of the random variable are denoted, z(x). Lastly, we have only one realization of the random variables (i.e. we do not have replication). For example, if we take elevation measurements and repeat the measurements, baring any measurement errors, we will have the same results. This leads to the needed statistical assumption of stationarity, presented in section 2.3. 6
PAGE 17
Although in practice we do not know the underlying random function, we will use the data to find two properties of the random function, namely the first and second moments. 2.2 Properties of Random Functions 2.2.1 First Moments Recall from classical statistics, the first moment of a random variable is defined by the formula E[X] = J: xf(x)dx = Jl Similarly in geostatistics, when we impose a distribution, the first moment is defined by the formula: E[Z(x)] = m(x) when it exists. As seen, the mean of spatial data is a function of the locations, X. 2.2.2 Second Moments Second moments of spatial data, when they exist, have three forms. First is the a priori Variance, when it exists and is defined, Definition 1 (Variance) Var[Z(x)] = E[(Z(x)m(x))2 ] (2.3) 7
PAGE 18
If Var[Z(xi)] and Var[Z(x1)] exists, then the Covariance exists and is defined as, Definition 2 (Covariance) C(x.;, x1 ) = E [(Z(xi)m(xi)) (Z(xJ)m(x1))] (2.4) Lastly, the variance of the difference of two random variables known as the Variogram, and is defined as, Definition 3 (Variogram) (2.5) Cressie [5] notes that some people have defined the variogram as, 21(x1 x2 ) = E [ (Z(x +h) Z(x)/]. This works fine when m(x) is constant Vx ED. Note that all the above functions depend on locations xi and x1 2.3 Stationarity Recall, there is only one realization of Z(x). Since there are no replications, we need some sort of assumptions for statistical inference. The assumptions made regard various degrees of spatial homogeneity. Homogeneity allows us to view z(x) and z(x + h) as two different realizations of the same random variable. 2.3.1 Strict Stationarity 8
PAGE 19
Definition 4 (Strict Stationarity) A random function {Z(x),x E D} is said to be Strictly Stationary within the field D if its multivariate cdf is invariant under any translation of the N coordinate vectors, that is, I should mention that strict stationarity is also referred as strong station arity or wide sense stationarity. 2.3.2 Covariance Stationarity Covariance stationarity, also known in the literature as second order stationarity, is a weaker form of stationarity than strict stationarity. Definition 5 (Covariance Stationarity) A random function Z(x) satisfy mg E[Z(xi)Z(xJ)] = 0 V Xi, Xj and For Z(x), Z(x + h), the covariance exists and depends only on the sepa ration distance h. A property from the definition 5 is, C(h) = E[Z(x + h)Z(x) m2 ] V x (2.7) where m is the true mean of the distribution. Having defined covariance stationary, we can notice some important conse quences. For example, assuming the random function is covariance stationary, 9
PAGE 20
we can find a relationship between the covariance function and the variogram. Recall equation 2.3: Var[Z(x)] From equation 2.7, we have = E [ Z(x)2 2Z(x)m + m2 ] E[Z(x?J 2mE[Z(x)] + m2 E[Z(x)2 ] m2 This states the variance of a spatial random variable, under the covariance stationarity assumption is the covariance function with no translation. A relationship exists between the variogram and the covariogram (see equation 2. 7). Under covariance stationarity: !(h) = [(Z(x+ h)Z(x))2 ] ( E[Z(x + h)2 ] 2E[Z(x + h)Z(x)] + E[Z(x)2J) (E[Z(x + h)2 ] 2E[Z(x + h)Z(x)J + C(O) + m2 ) (E[Z(x + h)2]2(C(h) + m2 ) + C(O) + m2 ) [c(o) + m2 2C(h)2m2+ C(O) + m2 ] C(O)C(h) 10 (2.8)
PAGE 21
2.3.3 Intrinsic Stationarity An even weaker form of stationarity is that of Intrinsic Stationarity. Definition 6 (Intrinsic Stationarity) A random variable Z(x) is intrinsi cally stationary if the following two conditions hold E[Z(x)] = m \f x. Z(x +h) Z(x) has a finite variance and does not depend on x. Hence the following notation, Var[Z(x +h) Z(x)] = 2r(h) \f x (2.9) 2.4 Variograrn Properties The variogram was defined in equation 2.5. Under intrinsic stationarity conditions, the variogram can also be defined as follows: 2r(h) = E [(Z(x +h)Z(x))2 ] (2.10) The former definition is a more universal definition. The intrinsic assumption allows for infinite variance, whereas the covariance hypothesis allows only for finite variance. For the rest of this thesis, I will be using the definition in equation 2.10, since I will be assuming intrinsic stationarity, which is least restrictive of the three forms of stationarity. 2.4.1 Ordinary Kriging Variography [15] is irnportant for kriging. Con sider the following model: 11
PAGE 22
Z(x) = !L + o(x) X E D (2.11) where o() is a zeromean stochastic process with known covariance function. We will assume the model in equation 2.11 along with the model assumptions that o() is a zero mean stochastic process with variance modeled by a known covariance function. Further assume the first order component, /L, is constant in the region, but unknown. Using the data, we would like to predict at an unsampled location using a linear estimator. }orm the linear estimator, Z(xo) = xtz(x) We would like our estimator to be unbiased and have minimum variance. For the predictor to be unbiased this means, E [z(x0)Z(x)] = 0, or E [Z(xo)] = !L = E [z(xol] E n L XiE [Z(x;)] i=O i=O Since E[Z(x)] = !L 1;/x ED. The mean squared error is: 12
PAGE 23
evaluating each term, E[Z2 ] E[(XZ)2 ] E[XZZ',\] XE[ZZ'],\ X(C + .u2[1]n),\ (2.12) Where C is a n x n covariance matrix for all sample locations. The next term, E[ZZJ = E[XZZ] XE[ZZJ X(c + .u21) (2.13) c is a n x 1 vector of covariance between the unsampled location and sample locations. The final term, (2.14) in combination with equations 2.12 2.13 and 2.14, leads to: Q(,\) = X(C + .u2[1]n),\2(X(c + .u21)) + u2 + .u2 (2.15) We have a constraint on the A's which must sum to one. Using Lagrange multipliers1 we can take the derivative of Q and solve for ,\: 1The Lagrange multipliers will be denoted by v. 13
PAGE 24
Taking the derivative and setting it equal to zero, G\c+ vl 0 C.\+ vl c and including the unbiased constraint gives where: 1 1 0 and Cij is the covariance between location i and j, Ao ,\+ = An v co1 c+ = Con 0 14
PAGE 25
hence if c+ is invertible, (2.16) Cressie [5] does the above analysis in terms of the variogram. Also, the solution to the kriging system yields a best linear unbiased predictor (BLUP) of Z(x0). The solution to the kriging system requires C to be invertible. Let Y be a linear combination of Z(xi)'s and let Z(xi) have expectation m and covariance C(h) or semivariogram r(h). Y is a random variable, hence, the variance of Y must be positive or 0. rn Y = :z= ),iZ(xi) i=l for any weight ),i The variance of Y can be written as: Var(Y) = :z= :z= ),i),JC(xix1 ) 2: 0 j (2.17) (2.18) Because C is a covariance matrix it must be positive definite. All positive definite matrices are invertible. Now, take into account equation 2.8 and the above relation, and rewrite the variance of Y in terms of the variograrn. (2.19) j j In the case where the variance C(O) does not exist, only intrinsic stationarity is assumed, and the variance of Y is defined on the condition that the sum of 15
PAGE 26
the Xs is 0, and the term C(O) is eliminated, leaving only Var[Y] =LLAiAJI(XixJ) i j (2.20) Thus, the variogram function must have the condition of conditional positive definite (negative) function. The above discussion is found in [15]. Cressie [5] cites three important properties of the variogram. (1) 21() is continuous at the origin, then Z() is L2 continuous. That is, E [(Z(x +h)Z(x))2]t 0 if and only if 21(h) t 0 as llhl[t 0. (2) If 21(h) does not approach 0 as h approaches the origin, then Z() is not L2 continuous and is highly irregular. This discontinuity of 1 at the origin is referred to as the nugget ejJect. (3) If 21(h) is a positive constant, then Z(xi)'s are uncorrelated for all x regardless of how close they are; Z ( ) is called white noise. Further properties of the covariance function and the variograrn are: C(O) = Var(Z(x)) 0 a priori variance cannot be negative C(h) = C( h) covariance is an even function [C(h) I :S C(O) Schwarz's inequality 1(0) = o 1(h) = 1( h) 16
PAGE 27
2.4.2 Functions and Graphs There are a number of known functions that are useful as model for the variogram and assure negative definiteness. Graphically, if the variogram stops increasing beyond a certain distance, this value is known as a sill. Mathematically, this is denoted 1( oo) = M. lnformally, this says that two random variables separated by a large distance are uncorrelated. The distance at which the variogram stops increasing is known as the range. Variograms with a sill are known as transition models. If an intrinsic random function reaches a sill, then the random function is second order stationary. Then we can have the following relationship from equation 2.8 C(h) = 1(oo)AI( h) From this relationship, 1(oo) = C(O). Finally, if the variogram 1(h) is a function only of magnitude and not of direction, then the variogram is said to be isotropic (this is also true for the covariance function). A function which is not isotropic is said to be anisotropic. Anisotropy will be discussed in the next section. 2.4.3 Isotropic Transition Models Three typical isotropic transition models are are given below along with their graphs. (1) Spherical Model 0 h=O (2.21) co+ a Vh>a 17
PAGE 28
where c0 +cr is the sill, c0 is a nugget effect and a is the range. Furthermore, cr, c0 and a are all positive. The following plot is standardized to reach a sill value of 1 at a range of 5 with no nugget efiect. o. e Figure 2.1. Spherical model (2) Exponential Model h=O (2.22) hfO The exponential model will reach it's sill asymptotically. Again, c0 + cr is the sill, c0 is the nugget effect and a is the range, all positive real numbers. For this model, the range is the value where the variogram reaches 95% of the sill, since this model reaches the sill asymptoticly. A graph of the exponential function has a range of 5 along with a sill of 1, and no nugget effect. 18
PAGE 29
0.8 0.6 0.4 0.2 6 7 distance Figure 2.2. Exponential Model (3) Gaussian Model 1(h) = { 0 ((3h)2)} c0 + J 1 exp 2 a h=O O
PAGE 30
' ' ,, "" c 0.6 ""' 0.2 .. cu.""''"""'"" Figure 2.5. Logarithmic Model With this model, lim1,_,0 h = oo. Journal and Huijbregts point out this model must be "regularized", see page 82 [15]. 2.4.5 Hole Effect Models A third type of models are known as HoleEffect Models. These models are used when growth is not monotonic and can occur with or without sills. They are of the form: 2.5 Anisotropy (h) sin(h) I h (2.26) The Random Variable is said to be anisotropic when its variability is not the same in every direction. Using Figure 2.7 below, variability is xiependent on both magnitude and direction. For example, a "nice" form of anisotropy is 21
PAGE 31
E O.B 0.6 0.4 0.2 0 0 2 4 0 B distance Figure 2.6. HoleEffect Model when a contour of the variogram surface will yield elliptical shapes at various distances. Along the major axis of the ellipse, variability increases slowly, and along the minor axis of the e1lipse, variability rises rapidly. These will be referred to as major and minor axes of anisotropy, respectively. A complete understanding of the region of interest is needed to correct for anisotropy. Anisotropy corresponds to the existence of preferential directions at the time of the genesis of the studied phenomenon [15]. Examples of anisotropy are: Monitoring air pollution with a prevailing wind. Ore grades along a river bed. Contamination along porous media Journel and Huijbregts [15] state, "Anisotropies will be represented by the method of reducing them to the isotropic case either by a linear transformation 22
PAGE 32
Figure 2.7. Variogram contour surface with anisotropy of the rectangular coordinates (hu, hu, hw) of the vector h in the case of geometric anisotropy, or by representing separately each of the directional variabilities concerned in the case of zonal anisotropy." Traditionally, anisotropy has been classified as either geometric or zonal. In the classical book by Isaaks and Srivastava [13] along with Journel and Hui jbregts [15], geometrical anisotropy is characterized by directional variogram that have approximately the same sill but at different ranges. Geometrical anisotropy can be corrected by simple linear transformations, whereas zonal anisotropy cannot [15]. Isaaks and Srivastava [13] describe zonal anisotropy as one in which the sill value changes with direction while the range remains constant. I agree with Zimmerman [27] that the above definitions of zonal anisotropy are inadequate. This nomenclature does not describe the type of 23
PAGE 33
anisotropy that characterizes the region of study. Zimmerman suggests abandoning the term zonal anisotropy in favor of more descriptive terms, such as range anisotmpy, sill anisotmpy, and nugget anisotropy. The classical term for geometrical anisotropy should be retained. Each of the different types of anisotropy is discussed below. 2.5.1 Geometric An excellent discussion of Geometric Anisotropy is discussed in chapter 16 of Isaaks and Srivastava [13], as well as in Journel and Huijbregts [15]. Basically, variograms are computed and plotted in various elirections, and in each direction, the sill value will be the same, but the range value will be different. Below is a graph to demonstrate this concept. 08 0.6 E .' !" 04 02 0 0 2 3 4 5 6 7 8 Distance Figure 2.8. Variograms representing geometric anisotropy. Both variograms have a sill value of 1. The E/W direction reaches its sill at a range of 3, the N /S reaches its sill at a range of 6. Using Figure 2. 7 the correction for geometrical anisotropy is as follows: 24
PAGE 34
(1) Rotation of coordinate axis (2) Transformation of locations. The rotation of coordinate axis uses the following transformation matrix in 2 and 3 dimensions, respectively [ cos( 0) R= sin( 0) where f) is the angle of rotation, and sin( 0) l cos( fi) cos( a) cos() sin(a) sin( a) cos(4') sin() R= cos(a) 0 cos(a) sin(cp) sin(a) sin() cos( a) (2.27) (2.28) where a is the rotational angle in the xydirection and (p is the rotation in the zxdirection. For simplicity, the figure below shows the rotated axis aligned along the major and minor axis of anisotropy in two dimensions. y Y' X' Figure 2.9. Transformed Axis Axis aligned along major and minor axis of anisotropy 25
PAGE 35
Now that we have addressed rotation of coordinates, we next address transformation of the coordinates. Isaaks and Srivastava talk about standardizing the range. That is, we can reduce the range in which the variogra.m reaches it's sill value to 1. Refer to Figure 2.8, suppose that each of these variograms represent a variogram in the x direction and y direction. Suppose that in the x direction, the sill is reached at the range value of 3, and the y direction, the sill value is reached at the range of 6. We can divide each of the x coordinates by the range value of 3 and the y coordinates by the range value of 6. Now, in each direction, the sill value in each direction will reached at a range of 1. We can consider the coordinate transformation matrix for 2 and 3 dimensions, respectively. [ l l T= Gx (2.29) 0 and for 3 dimensions l 0 0 a, T= 0 l 0 (2.30) ay 0 0 l a, Putting the two transformations together, we have the new coordinates give by h', h' =TRh (2.31) Note that T and R are not reversible, the transformations must be in this order. 26
PAGE 36
2.5.2 Sill Anisotropy Figure 2.10 below is an example of anisotropy in the sill. Here we see the sill values are different in two directions. 3.5 '' j '' 0.0 0 Distance Figure 2.10. Sill anisotropy variograms. The semivariogram in the EW direction given by the dotted line and the semivariograrn in the NS direction is given by the solid line. Since the variogram attains a sill, this means the region has second order stationarity. Zimmerman [27] shows that sill anisotropy may be evidenced by a trend in the data, nonvanishing spatial correlation or measurement errors which are correlated or have unequal means. If one were computing an experimental variogram that has unequal sills, then exploratory analysis of the data may yield a trend in the data. This may explain the first reason for sill anisotropy. Zimmerman states experience shows this to be most likely the case. A trend in the data is a violation of the stationarity hypothesi&, and may cause the non vanishing spatial correlation. The analyst should abandon the 27
PAGE 37
stationarity hypothesis, estimate the trend, remove it and proceed with analyzing the residuals as a new data set. This should yield isotropic directional experimental variograms. Correlated measurement errors or measurement errors having unequal means was another reason for sill anisotropy. Zimmerman uses an example from a drill hole data set. The samples can be analyzed on different occasions, even perhaps by different individuals resulting in correlated measurement errors, or measurement errors of unequal mean. Also, it may be reasonable to assume that measurements from different drill holes are uncorrelated, but measurements from the same drill hole are equally and positively correlated. This being the case, the only correction the analyst can use is a nested model. Nested variograrns are of the form: rn lz(h) = Lli(IIA;hll) (2.32) i=l where are matrices that define appropriate directions and /1(), rrn() 28
PAGE 38
are isotropic variograms. 2.5.3 Range Anisotropy In Section 2.5.1, geometric anisotropy and its correction was discussed. Referring to Figure 2.8, the variograms in two differ ent directions reach the same sill, but at different ranges. Range anisotropy is the type of anisotropy which is not corrected by linear transformations. Here we are contributing the range anisotropy solely to measurement error and not to any physical considerations of the region of study. Zimmerman states that the current practice is to fit a nested variogram model which imposes no constraints on the ranges of the directional variograms or their sills. Zimmerman believes it is advisable, based on the assumption and plausibility of second order stationarity, vanishing spatial correlation and white noise measurement error, that constraints should be imposed on the range and sills to ensure the fitted sills are equal. In the example he uses, he supposes empirical variograms are fitted in the following directions: N /S, E/W, NE/SW, NW /SE. Furthermore, the empirical variograms resemble the exponential model. Then, EW direction lz(h) NESW direction lz(h) N S direction 1 z (h) 29 111[1exp( a1h)] 112[1exp( a2h)] e3[1exp( aah)]
PAGE 39
NWSE direction !z(h) Under the assumptions of second order stationarity, vanishing spatial correlation and whitenoise measurement error, we must require 01 = 02 = 03 = 1!4 2.5.4 Nugget Anisotropy Finally, for completeness I would like to show and mention nugget anisotropy. Figure 2.11 is an example of nugget anisotropy. A nugget effect is discontinuity at the origin. This is attributed to microscale variability in the design region. Directional variograms not only have a nugget effect, but they are different in each direction. Nugget anisotropy is attributed to correlated measurement error. This measurement error is inadequately described by white noise. Correlated measurement error is found between rows, between columns or both. Below is a graph of nugget anisotropy. .. 5 4 } 3 2 Distance Figure 2.11. Nugget Anisotropy 30
PAGE 40
3. Data Analysis The prevwus chapter concentrated on the theory of geostatistics. This chapter is primarily concerned with the practice of geostatistics. Some of the information that the analyst should know, includes: How the data were collected The region of study The variable(s) of interest Who collected the data Goal of the study Assessment of stationarity An exploratory data analysis will reveal interesting aspects of the data. Before computing any experimental variograms, trends should be checked to assess stationary assumptions. Recall in section 2.3, the mean of the data is constant and does not depend on location. A trend in the data implies a non constant mean, depending on the location. If there is a trend, it should be estimated by either median polish, or polynomial regression. Once identified, 31
PAGE 41
the trend can be removed, and the residuals used as a new data set. This chap ter begins with identifying different sample designs. Second, the terminology used to compute the experimental variogram are discussed. Third, the meth ods used to compute the experimental variogram and finally, methods used to fit variograms are discussed. 3.1 Sample Design The sample design for spatial data may or may not be on a grid. In chapter 3 of Ripley [21], he shows four ways to sample spatial data. Uniform Sampling chooses locations uniformly throughout the region and independently of each other. Stratified random sampling takes a uniform random sample of size k from each of m stmta or subareas, so n=km. Centric Sampling is where the samples are taken on an equally spaced grid in the region. Finally, NonAligned Sampling is where the sample locations are off the nodes of a grid. Of these, Uniform Sampling, Stratified Random Sampling and NonAligned Sampling result in sampling that are not on a grid. Centric Sampling is on a grid. For the rest of this chapter, I will differentiate gridded sampling and nongridded sampling. 3.1.1 Gridded Data The following figure is an example of data which are on a grid. This example is from Cressie [5], chap. 2. These data are 32
PAGE 42
0 0 0 0 0 0 0 0 0 0 0 0 c 0 "' 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 "' 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 g 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 i 0 0 0 0 0 0 0 0 0 0 0 '' 0 5 '0 >5 east Figure 3.1. Coal Ash Data Set Locations of data along a grid. 33
PAGE 43
sampled on equidistant transects of a design region. The number of partitions along the x and y axis are denoted as nx, ny respectively. The separation between each adjacent partition along the x and y directions are denoted X8iz, ysiz, respectively. A lag is the distance between two locations and should not exceed L/2, where L is the maximum distance of any two locations in the design region. 3.1.2 NonGridded Data The figure below shows a sample area that the sample data are not on a grid. 6.50_ "' "' "' !II 5.50:: "' "' "' @) 4.50 "" @} "' @} "' "' 3.50_ "' "' "' "' "' @) @ 2.50: "' @ "' "' 1.50 @ "' "' @ "' "' "' @ .50: "' "' @ CD @ CD "' @ .50 .00 1.00 2.00 3.00 4.00 5.00 6.00 7.00 Figure 3.2. Data which are not on a grid Figure 3.3 should help clarify some of the following terminolog needed to compute the sample variogram with nongridded data. Using Figure 3.3, the origin will represent a location in the design region. The azimuth is the measured angle from the yaxis, clockwise. We can partition the direction vector into nlag, the number of lags to compute the sample variogram. In Figure 3.3, 34
PAGE 44
Y (Nmth) / I Azimuth I / X axis (Eas\) Figure 3.3. NonGridded Data This figure shows the terminology used in computing the experimental semivariogram for nongridded data. 35
PAGE 45
there are 5 lags. The maximum distance used to compute the sample variogram is L/2, where Lis the maximum distance between two locations in the sample region. The lag separation htag, is determined by dividing the maximum distance, L /2 by the number of lags. Two important tolerance intervals are the lag tolerance and azimuth tolerance. The lag tolerance is some /i amount from the current lag, such that any location within h1a.9 5 interval is considered to be at the current lag. Next is the azimuth tolerance, this is a e amount such that any point within the interval azimuth e is considered in the current azimuth. 3.2 Estimators Now, I will introduce some methods of calculating spatial variability from data. The first estimator is known as Mathemn's estimator. This estimator is based on method of moments, and is an unbiased estimator of r(h), where h is magnitude, llhll1 Nh 2 :Y(h) =l::(Z(x +h)Z(x))2 Nh i=J (3.1) where Nh is the number of pairs of locations separated by a distance llhll The next estimator is a robust estimator derived from Cressie and Hawkins [3]. It is recommended to use this estimator when there are many outliers in the 36
PAGE 46
sample data. (N1 I:f':;\ IZ(x +h) Z(x) 1112 ) 4 2 (h) "'=" 1 0.457 + 0.494/Nh (3.2) Deutsch and Journel [8] give nine other methods to measure spatial variability. 3.3 Computing Sample Variograms The variogram modeling process can begin with an omnidirectional variogmm. An omnidirectional variogram can be thought ofloosely as an average of the various directional variograms [13]. The omnidirectional variogram serves a useful starting point for establishing some of the parameters required for sample va.riogra.rn calculations. It has been shown to be useful when the data. are not on a. grid to establish the lag tolerance. Once the lag tolerance is established, then directional sample variogra.rns are computed, and the azimuth tolerance can be established. This does not necessarily need to be done when the locations are on a. grid since the number of pairs to compute the sample variogra.m is highest when the va.riogra.m is computed along the North/South and East/West directions, respectively. Once the parameters are established, the analyst can proceed with assessing anisotropy, if needed. Figure 3.4 shows a.n omnidirectional va.riogra.m before adjusting the lag tolerances. The number of lags in the computed omnidirectional va.riogra.m has been 37
PAGE 47
Semlvariogram direction omnidirectional 3000. 2000. y 1000. o;stance Figure 3.4. Omnidirectional variogram of elevation data. The parameters need to be adjusted to smooth the sample variogram. 38
PAGE 48
reduced to produce the sample variogram in Figure 3.5. By reducing the number of lags used to compute the sample variogram, we increase the lag spacing, which in turn includes more pairs in estimating each point in the sample variogram. The result is a smoother sample variogram. Semivariogram taff:elevation head:elevation direction 1 3000. 2000. y 1000 .oo .so 1.0o 1.50 2.00 2.so Distance Figure 3.5. Omnidirectional variogram of elevation data after parameter adjustment 3.4 Diagnostic Tools vVhen the experimental variogram is computed in various directions, and, if they are different, the analyst must determine the reason. There may be a number of reasons for the apparent anisotropy such as outliers in the data or a trend in the data may result in violations in the stationarity assumptions. This section will deal with ways of handling the apparent anisotropy. Graphical tools help in identifying trends in the data, particularly spin plots and scatter pair plots. GSLIB [8] has a routine called locmap which color scales the location of the data. Another nsefnl plot is a text plot, which plots the 39
PAGE 49
value of the data at the location instead of a dot. One technique of removing the trend is by Least Squares. The trend is estimated by a polynomial function and removed. The residuals are obtained and used as a new data set. Hence, an experimental variogram is computed on the residuals, and if the trend removal has been successful, the process yields an isotopic experimental variogram. As previously mentioned, outliers must be taken into consideration. If the analyst can justify the removal of the data, they can be removed; otherwise more robust methods should be used in the analysis. One method of removing certain twodimensional trends uses median polish [5]. When the data are on a regular grid, they may be treated as row and columns of data. Assuming there are no row and column interactions, median polish gives a twoway fit, and the residuals become the new data set. In either method, anisotropy may still be an issue. In some cases, the removal of the trend may yield an isotropic data set, but in others it may not. The analyst must employ methods of dealing with anisotropy in the new data set by the methods outlined in section 2.5, or other creative ways. With nongridded data, if there is a trend in the data, it must be removed. Cressie [5] mentions that a grid can be imposed over a region of study, and the locations are referenced to the nodes of the imposed grid. Median pol ish with missing values is again employed, and residuals obtained. Another 40
PAGE 50
method uses linear smoothers. If there is no interaction in the northern and eastern directions, smoothing techniques may be used. Specifically, General ized Additive Models [11 J are nice to use. If a plot in the data shows a linear trend in the north direction and east direction, the trend is removed by the deterministic part and a smooth of the residuals in each direction is used to assess the removal of the trend. An example of using this method is presented in the next chapter. 3.5 Fitting Sample Variograms Once the sample variogram has been computed and anisotropy has been detected, and plotted, the next major task is to fit a parametric variogram func tion. As several authors have mentioned, fitting a variogram function to a sam ple variogram has been done by adhoc techniques, such as "eyeballing" the sample variogram and guessing parameters. One software package, GeoEas [1], employs such a method. Another method is using interactive software, such as infomap (Bailey and Gatrell [2]). A sample variograrn is computed, the user enters a variogram function, then interactively tweaks the parameters until the curve looks like it fits the sample variogram. These two packages are limited to the exponential, spherical and gaussian models. SPLUS [12] also has this capability, but has the power and flexibility to execute the methods mentioned 41
PAGE 51
below. A number of methods to fit a sample variogram are outlined in section 4.6 of Cressie [5]. A comparison of the methods are found in Zimmerman and Zimmerman [28] as applied to ordinary kriging. The methods are: maximum likelihood, restricted maximum likelihood (REML), minimum norm quadratic estimation (MINQ), least squares, generalized least squares. Further information on maximum likelihood is found in [17], REML and MINQ in [5]. As Zimmerman and Zimmerman have demonstrated, the method of weighted least squares using Matheron's estimator, weighted least squares using Cressie's ro bust estimator, or ordinary least squares are preferred. This assessment is based on kriging variances and ease of computation. Are using the adhoc techniques such a bad idea? Cressie and Zimmerman [4] discuss the stability of variogram parameters. Grondona and Cressie [10] have shown the parameters of the variogram to be more stable than estimat ing the covariogram. I have found that this area has great potential for large amounts of research, as well as "good" techniques for fitting valid variogram functions to sample variograms. 42
PAGE 52
4. Analysis of Topographical Data Set This chapter presents an example using a topographical data set to il lustrate topics mentioned in the previous two chapters. The objective is to estimate the variogram from the data and to use Kriging to reconstruct the surface. The data set is taken from Davis [7], and the measurements are eleva tion data taken on a 310 x 310 area in where the units are in yards. The locations in the design region were chosen randomly. Furthermore, the region is scaled to 0 and 6 both in the North/South and East/West directions. 4.1 Exploratory Data Analysis The data are displayed using various plots in Figure 4.1. Figure 4.1 (a) is a contour plot of the data, along with the locations of the data. Two hills appear in the lower right corner and lower left corners, with a valley separating them and continuing North. The values are higher on the left and right side of the picture, decreasing towards the center. This observation may suggest a quadratic nature in the East/West direction. Furthermore the values increase moving from the North towards the South, indicating a linear structure in the North/South direction. Figure 4.1(b) is a text plot of the values of the 43
PAGE 53
data. The following two graphs are perspective plots viewing South and East, respectively. In the East/West direction, the trend appears quadratic, along the North direction, it appears to have a linear component. Care must be taken using these graphs as they have been preprocessed using an interpolation algorithm. A brush and spin plot verifies the above assessment. Because of these trends, I have decided to remove them before any further analysis. I used a Generalized Additive Model [11] to model the trend since the data are not on a grid, and I assumed no interaction between the North/South and East/West directions based on the above plot. The model consists of a second degree polynomial in the East/West direction, and a first degree polynomial in the North/South direction. A second component was added to the model to check for any bias. This component consists of a loess smooth of the residuals on both the North/South and East/West. The output of the GAM fit is shown in Figure 4.2. The loess fit confirms a quadratic component in the East/West direction. A regression line of the quadratic fit is added in Figure 4.2(a). The linear com ponent in the North/South is captured by the fit and added to Figure 4.2(b). The residuals plot in Figure 4.2(c) and Figure 4.2(d) show no structure in the East/West and North/South directions. The loess smooth is added to the residual plots to assess the fit of the model. The smooth shows no structure. 44
PAGE 54
Eas:Pir&::lion (a) Contour and point plot of data (c) Perspective plot of data looking South. yg:; 755 800 l 01 710 780 8551 72B "i 800 7JO 80. 762 780 i I 830 785 7'0 765 813 790 '"I 812 773 812 1l55 827 805 830 i Nj 800 873 0751 873 885 841 855 BOO 1 862 908 882 910 i 830 000 940 915 870 0 i 890 ,860 c 4 5 8 East Direction (b) Text plot of data. (d) Perspective plot looking East Figure 4.1. Various Plots of Topographical Data 45
PAGE 55
Therefore, I have modeled the trend successfully. Figure 4.3 displays some plots to visualize how well the GAM model worked. Now that the trend has been identified and removed, I will use the residuals as a new data set to compute a vanogram. 4.2 Variogram Analysis The next step is to compute an omnidirectional variogram using the residu als as a new data set. This iterative procedure will help obtain the settings for the number of lags and the angular tolerance. After three iterations, I found that using 10 lags produced a "nice" omnidirectional variogram. The next step is to determine if the data are anisotropic. This is accomplished by computing a directional variogram with a small angular tolerance. This iterative procedure started with 5 angular tolerance and I incremented in steps of 10. A direc tional variogram of the residuals is shown in Figure 4.4 using an angular toler ance of 45. The directions computed are 0, 30, 45', 90, 120', 135, and 150. It appears, in the 90 through 150 degree directions, the sill of the variogram is between 1000 and 1500, with a range of 3 to 4. The results appear to be different in the 0 to 60 degree direction. A conclusion of anisotropy may be assessed at this stage of analysis. To aid in the assessment, Figure 4.5 rs a contour of the variogram surface based on the directional variogram. 46
PAGE 56
Oj .. (a) Quadratic component along East/West . : : : .. .. . : . (c) Residuals along East/West 0r i gj ":i .. 3 y :. I (b) Linear component along North/South 0 3 y (d) Residuals along North/South Figure 4.2. Plots of GAM output 47
PAGE 57
Easli)recl'on (a) Contour plot of the residuals. 3 East Direction (c) Perspective plot of residuals look ing South. (b) Perspective plot of residuals. 3 North Direction (d) Perspective plot of the residuals looking looking West. Figure 4.3. Various Plots of Residuals 48
PAGE 58
00 05 IO 15 lO 25 30 oo oo 10 ts 25 3.o 0 D t 0 0 0 0 0 0 0 0 Q Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0.5 I o 15 20 25 3.0 distance Figure 4.4. Directional Variogram of Residuals 49
PAGE 59
3 2 i 0 2 3 EIW distance Figure 4.5. Contour of the Variogram Surface 50
PAGE 60
From Figure 4.5, one may assess geometrical anisotropy due to the ellipti cal form of the variogram surface. Recall from section 2.5.1 that geometrical anisotropy is corrected by rotating and extending one of the coordinates. If the transformation works, then we can conclude geometrical anisotropy. However, apparent anisotropy may be due to the sampling design or noise. To assess the apparent anisotropy, I simulated a new response using a known covariance structure that produces isotropic realizations. 4.3 Simulation The omnidirectional variogram computed in section 4.2 is shown in Fig ure 4.6. From this graph, it appears to have a range of 2 and sill of 1200. using the coordinates from the topological data set, I simulated values using an exponential covariance function with range of two and a sill of 1200. The simulator returns values from an isotropic random field. The directional variogram using the simulate data is shown in Figure tl.7. The angular tolerance is 45 degrees and the number of lags is 10 as before. We can see in the 30, 120, 135 and 150 degree azimuths there is some clear structure: the sample variogram appears to reach a sill. Knowing the simulated values are stationary and isotropic, the variograms should be the same in each direction computed, which clearly is not occurring. 51
PAGE 61
The number of pairs used to compute each point on the sample directional variogram is less than 30. A rule of thumb is that the number of pairs used to estimate each point in the sample variogram should have at least 30 to 50 pairs. I chose a minimum of 40. To increase the number of pairs, one can increase the lag tolerance, which by default is half the lag separation. I increased the lag tolerance for a number of instances, but the results were not favorable, the number of pairs did not increase. The next method is to increase the angular tolerance. I increased the angular tolerance in increments of 5o. Finally at the 70 angular tolerance I achieved favorable results. The number of pairs to estimate each point was 40 and the sample directional variogram looked favorable. The results of increasing the angular tolerance show that in each direction, the sample variograms are similar, and the variogram surface is fairly circular. It was interesting to note the angular tolerance used to produce the results shown in Figure 4.8 is 70 degrees. I performed this simulation 5 times and each time came to the same conclusion. The conclusion drawn from this analysis, is that I have an anomalous sample data set, in which these high tolerances are needed to produce the known isotropic results. I need to increase the angular tolerance to produce a reliable sample variogram. 53
PAGE 62
5 .............. .. olstanoa (a.) Variogram Surface (b) Figure 4.8. Directional Variogram and Va.riogram Surface Plots 54
PAGE 63
0 8 0 8 0 0 E E g 0 g 0 00 0.5 0.0 05 20 2.5 3D distance Figure 4.6. Omnidirectional variogram, number of lags equal 10 0 0 0 0 0 0 0 0 0 0 o,; 00 0 {J 0 0 oo 00 0 0 0 0 0 00 0 00 0 0 0 0 1000 0 0 0 0 0 H 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 ,, 0 0o00 o () 0{) 0 0 0 0 0 00 oo 0 distance Figure 4.7. Directional variograrn, nlag=lO, tol.azimuth=40 52
PAGE 64
4.4 Completion of Original Data Going back to the original data set, I used the omnidirectional variogram m Figure 4.6. I believe by the simulations, the original data set does not have any anisotropy, and the results are unique due to the sampling and the measurements under study. Using Cressie's weighted least squares, I fit an isotropic gaussian variogram to the residuals. The fitting procedure returned the values: range = 1.522 sill = 943.724 and the nugget = 18.07. The fit looks good and is shown in Figure 4.9 4.5 Kriging Now that the variogram parameters are estimated, we can Krige the surface. The SPlus [12] algorithm chooses a 30 x 30 grid in the design region along with the data locations to perform kriging. After kriging the residuals, I will add in the estimated trend from section 4.1. This will give me the predictions at unsampled locations. A contour of the surface along with the contour of the standard errors are shown in Figure 4.10. The small standard errors show the kriging routine produced reasonable results. The surface along the North/South direction and the East/West di rections are smooth as compared to Figure 4.1. Overall, I am comfortable with 55
PAGE 65
0 0 :::' 0 0 0 "' 0 0 0 <0 D E E 0 0 .. 0 0 0 "' 0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 distance Figure 4.9. Weighted least squares fit of omnidirectional variogram 56
PAGE 66
(a) Kriging surface (c) Perspective plot of Krige surface (b) Standard errors of kriging surface (d) Kriged surface along East/West Figure 4.10. Results from Kriging. 57
PAGE 67
the results starting with the estimation of the variogram and the surface results produced in Figure 4.10. 4.6 Concluding remarks and open questions The above analysis was an application of some of the geostatistical tech niques used in practice. The data was plotted and trends were identified. The next step was to remove the trend. My research has shown that analysts are looking for robust methods to remove the trend. Robust methods produce smoother Kriging surfaces. I chose to use a generalized additive model (GA.M) to remove the trend. The basic reason for choosing this method is because I wanted to verify the trend had been correctly identified and removed. A nonparametric tool for this assessment was to use a loess smooth [11] on the residuals. Generalized additive models has become an interest to me and I plan to further my knowledge of this tool. After many steps of computing the om nidirectional sample variogram, I computed the directional variogram to assess anisotropy. Simulation helped to reject anisotropy. Finally, a parametric vari ogram was fitted to the sample variogram. This completed the variogram study of the region, which is the longest part of any geostatistical analysis. Kriging the region was a simple task to complete. After completion of the. Kriging routine, questions remain as to the fit of the surface. We have not accounted 58
PAGE 68
for variability from the variogram fit and the trend fit when we reproduced the Kriged surface. This area is open for further research. 59
PAGE 69
REFEHENCES [1] U.S. Environmental Protection Agency. GeoEas 1.2.1 Users Guide. 1991. [2] Trevor Bailey and Anthony C. Gatrell. Interactive Spatial Data Analysis. Longman Group Limited England, 1995. [3] Noel Cressie and Douglas Hawkins. Robust estimation of the variogram. Mathematical Geology, 12(2), 1980. [4] Noel Cressie and Dale L. Zimmerman. On the stability of the geostatistical method. Mathematical Geology, 1992. [5] Noel A.C. Cressie. Statistics For Spatial Data .John Wiley & Sons USA, 1993. [6] Noel A. C. Cressie. The origins of kriging. Mathematical Geology, v 87(n 3):239252, Apr 01, 1990. [7] John C. Davis. Statistics and Data Analysis in Geology. John Wiley and Sons, New York, 1986. [8] Clayton Deutsch and Andre Journel. GSLIB: Geostatistical Software and User's Guide. Oxford New York, 1992. [9] Clayton V. Deutsch. Kriging with strings of data. Mathematical Geology, 1994. [10] Martin 0. Grondona and Noel Cressie. Residuals based estimators of the covariogram. Statistics, 1995. [11] T.J. Hastie and R.J. Tibshirani. Generalized Additive Models. Chapman and Hall, 1990. [12] Mathsoft Inc. S+SpatialStats. 1996. [13] Edward H. Isaaks and R. Mohan Srivastava. An Introduction Jo Applied Geostatistics. Oxford New York, 1989. 60
PAGE 70
[14] E. H. Isaaks and Mohan Srivastava. Spatial continuity measures for proba bilistic and deterministic geostatistics. Mathematical Geology, 20(4):313341, 1988. [15] A. Journel and H. Huijbregts. Mining Geostatistics. Academic Press, New York, 1978. [16] P.K. Kitanidis. Introdnction to Geostatistics. Cambridge, 1 edition, 1997. [17] K.V. Mardia and R.J. Marshall. Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika, 71:135146, 1984. [18] K.V. Mardia and A.J. Watkins. On multimodality of the likelihood in the spatial linear model. Biometrika, 76(2):289295, 1989. [19] G. Matheron. Principles of geostatistics. Economic Geology, 58:12461266, 1963. [20] G. Matheron. The intrinsic random functions and their applications. Ad vances in Applied Probability, 5:439468, 1972. [21] Brian R.ipley. Spatial Statistics. Wiley and Sons, 1982. [22] Michael Stein and Mark Handcock. Some asymptotic properties of krig ing when the covariance function is misspecified. Mathematical Geology, 21(2):171190, 1989. [23] J.J. Warnes and B.D. Ripley. Problems with likelihood estimation of covariance functions of spatial gaussian processes. Biometrika, 73(3):640642, 1987. [24] A.J. Watkins. On models of spatial covariance. Compntational StatiBtics and Data Analysis, 13:474481, 1992. [25] A.J. Watkins and F.H.M. AlBoutiahi. On maximum likelihood estimation of parameters in incorrectly specified models of covariance for spatial data. Mathematical Geology, 22(2), 1990. [26] A.M. Yaglom. An Introdnction to the Theory of Stationary Random Fnnc tions. PrenticeHall, 1962. [27] Dale L. Zimmerman. Another look at anisotropy in geostatistics. Mathe matical Geology, 1993. 61
PAGE 71
[28] Dale L. Zimmerman and M. Bridget Zimmerman. A comparison of spatial semivariogram estimators and corresponding ordinary kriging predictors. Technometrics, Feb 1991. 62
