Citation
Efficient solver for mixed and control-volume mixed finite element methods in three dimensions

Material Information

Title:
Efficient solver for mixed and control-volume mixed finite element methods in three dimensions
Creator:
Wilson, John D ( John David ), 1960-
Publication Date:
Language:
English
Physical Description:
154 leaves : ; 28 cm

Subjects

Subjects / Keywords:
Groundwater flow -- Mathematical models ( lcsh )
Fluid dynamics ( lcsh )
Finite element method ( lcsh )
Finite element method ( fast )
Fluid dynamics ( fast )
Groundwater flow -- Mathematical models ( fast )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Bibliography:
Includes bibliographical references (leaves 149-154).
General Note:
Department of Mathematical and Statistical Sciences
Statement of Responsibility:
by John D. Wilson.

Record Information

Source Institution:
|University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
48713584 ( OCLC )
ocm48713584
Classification:
LD1190.L622 2001d .W54 ( lcc )

Full Text
EFFICIENT SOLVER FOR MIXED AND CONTROL-VOLUME MIXED
FINITE ELEMENT METHODS IN THREE DIMENSIONS
by
John D. Wilson
B.S., University of Colorado at Denver, 1991
M.S., University of Colorado at Denver, 1994
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
Applied Mathematics
2001


This thesis for the Doctor of Philosophy
degree by
John D. Wilson
has been approved
by
Thomas F. Russell
Lynn S. Bennethum
Gita Alaghband
Date


Wilson, John D. (Ph.D., Applied Mathematics)
Efficient Solver for Mixed and Control-Volume Mixed Finite Element Methods
in Three Dimensions
Thesis directed by Professor Thomas F. Russell
ABSTRACT
In simulations of ground water flow in heterogeneous aquifers, accuracy of
velocities can be substantially enhanced by discretizations based on mixed or
control-volume mixed finite element methods. A potential side benefit is im-
proved accuracy of transport calculations that depend on the flow field. Mixed
finite element linear systems are not positive definite and cannot be solved by
straightforward applications of well-known iterative solvers. We present a spe-
cialized iterative solver for mixed methods in 3-D, built around a decomposition
of the discrete velocity space that leads to a symmetric positive definite system
(the pressure variable is eliminated). Our discretization uses distorted hexahedral
elements. Hexahedral elements are often used in practice for modeling the sub-
surface hydrogeology. Lowest-order Raviart-Thomas vector functions are used for
the velocity space. These vector functions have continuous normal components
which gives us conservation of mass.
Since the reduced system is symmetric positive definite we can use the precon-
ditioned conjugate gradient method. An additive domain decomposition precon-
ditioner is described and the uniform convergence of the method is established for
m


small overlap. Numerical results confirm the uniform convergence. Parallelization
of the problem is also demonstrated. A similar decomposition of the velocity space
is being investigated for the control-volume mixed discretization, hoping to obtain
a positive definite system, although the system is not, in general, symmetric.
This abstract accurately represents the content of the candidates thesis,
ommend its publication.
I rec-
Signed
Thomas F. Russell


ACKNOWLEDGMENT
The research of the author was supported in part by National Science Foundation
Grant No. DMS-9706866 and Army Research Office Grant No. 37119-GS-AAS.


CONTENTS
Figures ................................................................. ix
Tables................................................................... xi
Chapter
1. Introduction.......................................................... 1
1.1 Existing Methods...................................................... 2
1.2 The New Results in this Thesis........................................ 5
1.3 Thesis Outline ....................................................... 6
1.4 Future Work .......................................................... 8
2. Governing Equations.................................................. 10
2.1 Saturated Flow....................................................... 10
2.1.1 Conservation of Mass.............................................. 12
2.1.2 Darcy Flow........................................................ 13
2.2 Flow Equation ..................................................... 14
3. Mixed Finite Element Method.......................................... 16
3.1 Mixed Variational Formulation ....................................... 16
3.2 Divergence-Free Subspace............................................. 19
3.3 Existence and Uniqueness............................................. 20
4. Discretization....................................................... 25
4.1 Trilinear Hexahedral Elements........................................ 27
4.2 Approximation Spaces................................................. 34
4.2.1 Properties of the Piola Transformation............................ 38
vi


4.2.2 Interpolation of Velocity Functions................................... 41
4.3 Divergence-Free Basis ................................................ 45
4.3.1 Dirichlet Boundary Conditions ........................................ 50
4.4 Initial Guess Vector Vj............................................... 61
4.5 Mass and Stiffness Matrices........................................... 64
4.5.1 Mass Matrix .......................................................... 68
4.5.1.1 Mass Matrix with Boundary Fluxes ............................... 73
4.5.2 Stiffness Matrix...................................................... 79
4.6 Numerical Integration ................................................ 84
5. Preconditioners.......................................................... 89
5.1 Vector Potential Space.............................................. 90
5.2 Condition Number of Stiffness Matrix.................................. 95
5.3 Domain Decomposition Preconditioner................................... 97
5.3.1 Convergence Analysis ................................................ 102
5.4 Preconditioner for Bordered Matrix................................... 106
5.4.1 Analysis of Bordered Matrix Preconditioner........................... 110
6. Numerical Results....................................................... 115
6.1 PCG Performance ..................................................... 117
6.1.1 Test Case I....................................................... 118
6.1.2 Test Case II....................................................... 119
6.1.3 Test Case III...................................................... 119
6.1.4 Test Case IV....................................................... 122
6.2 Model Problem I...................................................... 124
6.2.1 Parallelization of Model Problem I................................... 128
vii


6.3 Model Problem II
130
6.4 Error Analysis....................................................... 136
Appendix
A. Subdomain Restriction and Interpolation.............................. 139
B. Coarse-Grid Restriction and Interpolation ............................ 144
References................................................................ 149
viii


FIGURES
Figure
2.1 DArcys Experiment................................................. 14
4.1 Trilinear Mapping of Reference Cube................................. 28
4.2 Random Hexahedral Grid.............................................. 28
4.3 Uniform Flow on Distorted Element .................................. 43
4.4 Relative Error of Interpolated Flux................................. 44
4.5 Boundary Divergence-Free Basis Vector .............................. 51
4.6 Divergence-Free Pipe Function ...................................... 55
4.7 Divergence-Free Corner Functions.................................... 56
4.8 Connectivity Graph of Dirichlet Boundaries.......................... 59
4.9 Spanning Tree ...................................................... 60
4.10 Non-Zero Structure of MO on Non-Orthogonal Grid..................... 73
4.11 Non-Zero Structure of A1............................................ 76
4.12 Non-Zero Structure of A2............................................ 77
4.13 Non-Zero Structure of M ............................................ 78
4.14 Non-Zero Structure of CMC7......................................... 79
4.15 Non-Zero Structure of CMC7 with Boundary Contributions............. 80
4.16 Matrix Permutation................................................. 83
5.1 Partitioned of Domain into Subdomains............................... 98
6.1 Reduction Histories for Test Case I................................ 118
6.2 Low Conductivity Block............................................. 120
IX


6.3 Test Case III on 4 x 4 x 4 Coarse-Grid............................. 121
6.4 Test Case III on 8 x 8 x 8 Coarse-Grid............................. 121
6.5 Reduction Histories for Test Case IV............................... 123
6.6 Cylinder on 4 x 4 x 4 Grid ........................................ 126
6.7 Cylinder on 16 x 16 x 16 Grid..................................... 127
6.8 Flow Around Cylinder............................................. 127
6.9 Computed Flow Lines for Tank Experiment............................ 131
6.10 Computed Flow Lines for Tank Experiment with Sink Term............. 134
B.l Coarse Grid rc-Slice Function ..................................... 145
x


TABLES
Table
4.1 Reference Coordinates and Physical Coordinates ................ 29
5.1 Minimum and Maximum Eigenvalues................................ 97
6.1 Preconditioner Assembly Time ................................. 116
6.2 PCG Method for Test Case I.................................... 118
6.3 PCG Method for Test Case II................................... 119
6.4 PCG Method for Test Case III ................................. 120
6.5 PCG Method for Test Case IV .................................. 122
6.6 PCG Method for Cylinder Problem............................... 126
6.7 Speedup on 16 x 16 x 16 Grid.................................. 128
6.8 Speedup on 32 x 32 x 16 Grid.................................. 128
6.9 Sand Characteristics.......................................... 132
6.10 Computing PI ................................................. 132
6.11 Computing P2 ................................................. 133
6.12 PCG Method Using PI .......................................... 133
6.13 PCG Method Using P2 .......................................... 133
6.14 PCG Method for Model Problem II with Sink Using PI ........... 135
6.15 PCG Method for Model Problem II with Sink Using P2 ........... 135
6.16 Extrapolated Error Estimate ................................. 138
xi


1. Introduction
Our goal is to develop an efficient solver for numerical methods modeling
three-dimensional saturated flow in a heterogeneous anisotropic porous medium.
Saturated flow is modeled by a second order partial differential equation (PDE).
For example, we will consider
KVj) = f in fi (1.1)
where the pressure, p, or hydraulic head is the primary variable to be solved for
and Q is a bounded three-dimensional domain. Approximating a solution to a
PDE, such as (1.1), is typically done through a discretization either using a finite
difference or finite element method. The result of the discretization is a large
sparse matrix problem. It is desirable to use an iterative method for sparse matrix
problems. If the matrix is symmetric positive definite, then the preconditioned
conjugate gradient (PCG) method can be used. The efficiency of the PCG method
depends on the condition number of the matrix. The coefficient K can be highly
discontinuous and this can cause problems for standard numerical methods.
The velocity, or flux, can be of primary interest in hydrogeological models.
An equation for the velocity in terms of the pressure is often written as:
v = -KV p. (1.2)
This leads to a system of first order equations which can then be solved using
a mixed finite element (MFE) method or a control-volume mixed finite element
1


(CVMFE) method. Simultaneous approximations for the velocity and the pres-
sure can then be computed. This approach will lead to better approximations
of the velocity. A sophisticated implementation of the three-dimensional solute-
transport equation has recently been done in [42], This type of implementation
requires an accurate description of the flow field for mass tracking.
The discretization of either the MFE method or the CVMFE method leads to
a semidefinite matrix problem. The PCG method can no longer be used because
the matrix is not positive definite. Our method is to eliminate the pressure through
a decomposition of the velocity space using a divergence-free subspace. When we
eliminate the pressure variable from the MFE method, and discretize the problem,
we end up with a symmetric positive definite matrix. If the diameters of the
elements are order h, then the condition number of the matrix is of order h-2.
This means that the PCG method will perform poorly for small h. Therefore, a key
part of our method will be to implement a domain decomposition preconditioner
which will make the convergence rate of the PCG method independent of grid
size.
1.1 Existing Methods
The earliest known domain decomposition method was introduced by Schwarz
in 1870. The Schwarz method is an overlapping method, and thus, all overlapping
methods can be referred to as Schwarz methods. The two level additive Schwarz
method is due to Dryja and Widlund [24], Glowinski and Wheeler formulated
the first domain decomposition methods for MFE in [37]. Domain decomposition
methods for MFE have also been analyzed in [29],[20], and [17].
2


Convergence analyses for Schwarz methods are often dependent on a large
overlap. However, numerical results usually show that the rate of convergence for
a Schwarz method is quite satisfactory even for a small overlap of size h. Dryja
and Widlund in [27] were able to analyze this behavior.
The construction of a divergence-free subspace in three dimensions on a rect-
angular grid with pure homogeneous Neumann boundary conditions is described
in [17]. A convergence analysis for the domain decomposition method for this
problem is also given. This convergence analysis relies on a sufficiently large over-
lap of order H which is the approximate diameter of a coarse-grid element.
More recently, a decoupling of the pressure and velocity spaces using the
divergence-free subspace was described by Scheichl in [54], The method was im-
plemented on triangle elements in two dimensions and on tetrahedra in three
dimensions. A bound on the eigenvalues of the resulting stiffness matrix is ana-
lyzed giving a condition number of order h-2. Various iterative solvers were tested
and compared. It was clear that the PCG method using a domain decomposition
preconditioner performed the best for the two-dimensional case. Results for do-
main decomposition in three dimensions were not given. The PCG method with
additive Schwarz preconditioner was implemented using the DOUG package [40].
One of the important differences between the method in [17] and the method
in [54] is that the divergence-free subspace is known a priori in [17]. In [54], a
decoupling stage is performed as part of the assembly process, determining the
divergence-free basis using a graph theoretical approach.
A mixed finite element method for second order elliptic problems is described
by Raviart and Thomas in [52], This early work sets the framework for ana-
3


lyzing the mixed finite element method. A discretization of the velocity space
and pressure space is described in two dimensions for triangles and rectangles.
The triangles and rectangles are affine mappings of a reference element. The
Piola transformation is used to map the velocity from the reference element to
the physical element. The use of the Piola transformation is critical. Using the
properties of the Piola transformation, one can describe an interpolation operator
and approximations for smooth vector valued functions onto the discrete velocity
space. It also leads to satisfying the discrete inf-sup condition. This work was
later extended to three dimensions by Nedelec in [51]. For three-dimensional dis-
torted hexahedral grids the mapping of the reference cube to a hexahedral element
is not affine. In this case the theory does not apply. In Thomass thesis [57] the
interpolation of smooth vector valued functions is described for two-dimensional
quadrilaterals and error estimates are given. If an interpolation estimate where
known for three-dimensional hexahedral elements, then it could be used along the
lines of Thomass thesis to verify the inf-sup condition and obtain error estimates.
In particular, for the special case of three-dimensional elements that are products
of two-dimensional quadrilaterals with a one-dimensional interval (quadrilateral
prisms), Thomass theory would apply.
A more complete mathematical theory of mixed finite element methods by
Brezzi and Fortin can be found in [15]. Many different methods are discussed
using model problems. The goal was to provide an analysis of the methods in
order to understand their properties as much as possible. A discussion of the
properties of the if (div, fi) velocity space is given. Mixed finite element methods
constantly make use of this space. A very general framework is given for existence
4


and uniqueness of solutions in [52] along with approximation properties.
One may also find the analysis given by Brenner and Scott in [12] to be useful
for understanding existence and uniqueness of the mixed finite element method.
Between [52], [15], and [12], the mathematical theory of mixed finite element
methods is almost complete. Practical implementation details can be found in
[44] and [46],
1.2 The New Results in this Thesis
We extend the work in [17] by constructing a divergence-free subspace in three
dimensions for general hexahedral elements which are trilinear mappings of a unit
cube. We also introduce Dirichlet and Neumann boundary conditions. The dis-
torted hexahedral elements are considered so as to better model the structure of
the subsurface heterogeneity. Hexahedral elements are widely used in hydrogeol-
ogy therefore they have an advantage over tetrahedral elements.
We analyze the upper and lower bounds on the eigenvalues of the stiffness
matrix. These bounds agree with those found in [54] for tetrahedra and are
confirmed through numerical tests. A convergence analysis for the PCG method
using a two level additive Schwarz method is then given for a small overlap of order
h. Although this analysis is standard, we are able to confirm, through numerical
results, that the convergence rate of the preconditioned system is independent of
grid size and number of subdomains. We also show the effects of heterogeneity
and anisotropy.
Finally, it is shown (through numerical results) that the method is highly
parallelizable. This is due to being able to solve the subdomain problems inde-
pendently.
5


Our implementation is done completely from scratch using the C language.
The only exception is the use of a Cholesky factorization module down-loaded
from the Netlib repository [1], The Cholesky factorization is used for solving
subdomain problems and the coarse-grid problem. Very close attention is paid
to structured programming techniques. Structured programming using the C
language can be very much like object-oriented programming. This is due to
the ability to use callback functions. Callback functions are functions which are
passed as arguments into other functions. These callback functions can also be
represented as data members of a structure. Therefore, C structures are really
the building blocks for classes in object-oriented programming.
The implementation of our method along with the use of hexahedral elements
constitutes new and original research. This research contributes to the body of
scientific knowledge by providing a model, with numerical results, which can be
extended and compared to existing and future implementations.
1.3 Thesis Outline
In Chapter 2 we derive the governing equations which give rise to a second
order PDE. The derivation is fairly standard. However, more sophisticated models
can arise from variations of these governing equations. It is important to under-
stand the physical principles involved. Boundary conditions are also given. The
second order PDE is then transformed into a system of first order PDEs using
the flux variable.
The MFE method, described in Chapter 3, is used to solve the system of first
order PDEs. Approximation spaces for the velocity and pressure are defined.
Existence and uniqueness are well known for the MFE method using these spaces.
6


We discuss the existence and uniqueness proof for the MFE method as given in
[12] where coercivity of the problem is shown using a divergence-free subspace.
The divergence-free subspace is also a primary feature of our method.
We begin the discussion of the discretization of the MFE method in Chapter
4 by first describing the trilinear mapping of the reference cube to a hexahedral
element. Next, we describe the lowest-order Raviart-Thomas basis vector func-
tions for the velocity space. These vector functions are defined on a reference
cube and then mapped to the hexahedral elements using the Piola transforma-
tion. The properties of the Piola transformation are discussed. As mentioned
above, the interpolation estimate for three-dimensional non-affine elements is still
an open question. We describe the theory presented in Thomass thesis for two-
dimensional quadrilaterals and note that in special cases the theory will apply to
three-dimensional elements.
Continuing with chapter 4 we describe the discrete divergence-free subspace.
We extend the work done in [17] to non-homogeneous boundary conditions. Once
the discrete divergence-free subspaces have been defined we can derive the matrix
equation which is to be solved using the PCG method. Some discussion about
the non-zero structure of the stiffness matrix is also given.
In Chapter 5 we analyze the condition number of the stiffness matrix. The
domain decomposition preconditioner is defined and a convergence analysis is
given. If we have two Dirichlet boundaries which are disconnected, then we end
up with a single dense column and row in the stiffness matrix. We describe this
bordered matrix in Chapter 5 and we define a bordered matrix preconditioner.
We are able to analyze the convergence properties of the preconditioned bordered
7


stiffness matrix.
Finally, in Chapter 6 we demonstrate the efficiency of the method in terms of
iteration counts and CPU times. We also describe two model problems. The first
model problem is a vertical cylinder of low permeability. We demonstrate that
the cylinder problem can be parallelized with near-perfect speed up. The second
model problem is a laboratory tank experiment. This model results in a bordered
stiffness matrix. The numerical results of the bordered matrix problem confirm
the analysis given in Chapter 5. We also extrapolate error estimates for the three-
dimensional problem. At present we do not have a good three-dimensional model
problem where the exact solution is known. Therefore, we use an extrapolation
technique and show that the convergence of the error is of order h which agrees
with Thomass theory for lowest-order Raviart-Thomas velocities.
1.4 Future Work
The analysis for domain decomposition algorithms with small overlap is given
by Dryja and Widlund in [27]. Domain decomposition methods for divergence-
free subspaces have been analyzed before, for example, in [13]. However, these
methods are usually defined on triangle elements in 2-D. Sharp bounds on the
minimum eigenvalue of the preconditioned system for divergence-free subspaces
can be shown in 2-D for triangles and quadrilaterals. A sharp bound in three
dimensions on hexahedral elements may still be an open question and needs to be
investigated further.
Even though we were able to show the ability to parallelize our method, we
have not implemented it on a large parallel machine. It will be important to test
the method on a larger computer.
8


Another extension to the work in this thesis would be to implement the
CVMFE method using our solver. It is shown in [16] that the CVMFE method
in two dimensions leads to more accurate velocities even in the presence of dis-
continuous anisotropic coefficients on a distorted grid. The implementation of
the CVMFE method in three dimensions would be new. In the CVMFE method,
the test space for the velocity has discontinuous normal components and thus,
is not in if (div, fi). We propose a divergence-free test space which is weakly
divergence-free. A proposal to implement the CVMFE method using the solver
described in this thesis has been submitted, and accepted, by the National Re-
search Council. Work on this project should begin soon at the U.S. Geological
Survey Hydrogeology Division.
9


2. Governing Equations
We will derive a second order PDE for the pressure. This particular PDE is
referred to as a second order elliptic boundary value problem. The resulting flow
equation is given in Section 2.2. A flux variable will be introduced in the derivation
of the second order PDE. This flux variable will later be used to transform the
second order PDE into a system of first order equations.
The problem we are interested in modeling is saturated subsurface flow. Sat-
urated flow means that whatever fraction of the volume is available to a fluid is
filled by that fluid. The fraction of the volume available to the fluid is called the
effective porosity of the medium. We call this type of medium a porous medium.
2.1 Saturated Flow
The derivation of the second order elliptic boundary value problem is fairly
standard and can be found in most PDE textbooks, for example, [39]. However,
we want to emphasize the physical properties specific to hydrogeology. Units of
measure and field techniques specific to hydrogeology may be found in [30].
We will begin by considering an underground volume, V, along with its bound-
ary dV. The underground volume will consist of soil and rock and will have certain
physical properties. The fluid, water in our case, will also have physical proper-
ties. The physical properties of the saturated flow system are given below. The
dimensions are given in SI units.
Effective Porosity

10


Intrinsic Permeability K, (m2)
Specific Discharge v (m/s)
Pressure P (kg/(ms2))
Density of Fluid p (kg/m3)
Viscosity p (kg/(ms))
Source/Sink / (kg/(m3s))
Gravitational Constant a (m/s2)
The effective porosity is the fraction of the porous medium which is available
to hold a fluid. This is not the same as porosity which is the fraction of the porous
medium which is void of material. Some of the voids may not be interconnected
thus, fluid may not flow in or out of these voids. When we say saturated flow
we mean that the fraction of fluid in the porous medium is equal to the effective
porosity. We assume that the effective porosity does not change with time.
The intrinsic permeability is a property of the porous medium alone. It is
basically a measure of the square of the mean pore diameter and thus, has the
units of area. The permeability is a symmetric 3x3 matrix which is spatially
dependent. We assume that it satisfies the following ellipticity condition:
a\wtw < wjKjW < Q!2w*w, Vw e 5R3 (2.1)
Later, we will define a hydraulic conductivity which is measured in units of length
over time.
11


The specific discharge has units of velocity. However, it is not a true velocity.
Specific discharge is a measure of volume per time per cross-sectional area. The
cross-sectional area is partially blocked by the porous medium thus, reducing the
volume of fluid which passes through it. To get an average velocity or seepage
velocity we would need to divide the specific discharge by the effective porosity.
The pressure is measured as force over cross-sectional area. The standard
unit of measure is the Pascal (Pa). Hydraulic head is often used instead of pres-
sure. Hydraulic head is measured in units of length and the hydraulic gradient is
dimensionless.
We assume that gravity is in the vertical direction. Therefore, we can write
the gravitational vector g as:
2.1.1 Conservation of Mass
The rate of change of the mass of the fluid in volume V is given by
This rate of change in the mass must be equal to the net rate at which mass
crosses the boundary of the volume minus the rate at which mass is produced
from the volume plus the rate at which mass is injected into the volume. This
relationship is given by
g = 0
(2,2)
(2.3)
(2.4)
12


We can use the divergence theorem to write the boundary integral as a volume
integral giving
J pt(j)dV = J div (pv)dV + j f dV. (2.5)
V V V
2.1.2 Darcy Flow
Darcys law is often expressed as:
v = ^ (VP + pg) (2.6)
/i
In the mid-1800s, a French engineer, Henri DArcy, made the first systematic
study of the movement of water through a porous medium. We can demonstrate
Darcys experiment by considering a horizontal pipe filled with sand. Water is
applied under pressure at one end of the pipe at point A. The pressure can be
measured through a vertical tube at that end of the pipe. Another vertical tube
at the other end of the pipe can be used to measure the pressure there. The
pressure is a measurement of how high the water is forced up these vertical tubes
(see Figure 2.1).
Darcy found experimentally that the discharge, Q, is proportional to the gradi-
ent of the pressure, (ha^hb)/L. The discharge also depends on the cross-sectional
area, A. Combined with a proportionality constant K we have:
= (2-7)
We define the specific discharge as v = Q/A.
The proportionality constant, K, is a property of the sand. In general, we will
use the hydraulic conductivity tensor to represent this proportionality constant.
13


Figure 2.1: I)'A rev's Experiment
With this experiment in mind, and assuming constant density of water,
equation (2.6) as:
v = ~KV/i
where the hydraulic conductivity, K, is given by
K = Kj
and the hydraulic head, h, is given by
, P
h =-----h z.
pa
2.2 Flow Equation
We substitute (2.8) into (2.5) to get
0 = j div(pKVZi) dV + j f dV
v v
where the left-hand side is zero because the density is constant, i.e, pt
we rewrite
(2.8)
(2,9)
(2.10)
(2.11)
0.
14


We now consider a three-dimensional domain Q and its boundary <9Q =
dVtD [JdQN. Since (2.11) is defined for any arbitrary volume, V C Q, we drop the
integrals and add boundary conditions to get
KV/i = (f/p) in 0 (2.12)
h = Q on dVtD (2.13)
(KV/i) n = 9 on dVtN (2.14)
where <9Qd is the Dirichlet boundary and <9Qn is the Neumann boundary. Equa-
tions (2.12)-(2.14) define the second order boundary value problem, or in our case,
the flow equation.
We can rewrite (2.12) as a system of first order equations using the flux
variable, v, defined in equation (2.8). First, we rewrite (2.8) as
K 'v = Vh. (2.15)
Then, we substitute (2.15) into (2.12) to get the first order equation
V-v =(f/p) infl (2.16)
Combining (2.15) with (2.16) and the boundary conditions (2.13)-(2.14) we get
our system of first order equations
W + V/i = 0
in 0 (2.17)
V v = (f/p)
h = q on dVtD (2.18)
(KV/i) n = g on <9QN (2.19)
We will use the MFE method to define a solution for this problem. From now on
we will write f/p as just / and we will write the hydraulic head, h, as p.
15


3. Mixed Finite Element Method
To compute an approximate solution to the first order system of equations
defined in (2.17)-(2.19) we use the MFE method. We first define a weak form of
the system of equations. Existence and uniqueness is then shown for the weak
form. The weak form is also sometimes called the variational form or mixed
variational form. This is because the weak form can be derived from the calculus
of variations [39].
3.1 Mixed Variational Formulation
The variational form will require us to define appropriate function spaces. For
mixed finite element methods we use constantly the space
V = H(div, Q) = {w 6 (X2(0))3 : div w e L2(tt)} (3.1)
with the norm
11 w 11 v = ||w|||2(n)3 + 11 div w|||2(n) (3.2)
llwlli2(n)3 = J w-wdfi (3.3)
11div w|||2(n) = J (div w) 2 dQ. (3.4)
It is possible to define the normal trace w n of a function w e V on <9fi, in
Lemma 3.1 For w e V, we can define w-n|5n £ H^l!2(dVt) and we have Greens
formula,
/divwP n n
16


Proof. See Brezzi and Fortin [15, Lemma III. 1.1]
The duality < w n,p >qq, between H^l/2(dVt) and Hl/2(dVt) is defined as
< w n.p >,')<>= / w-npds. (3.6)
an
We can use this definition of w n|5n to define the following subspace of V:
V0 = ffo,iv(div, 0) = {w6V :< w n ,p >dn= 0 Vp 6 (3.7)
where
KD(Sl) = {pe HHn) : p\dnD=o} (3.8)
The space (3.7) contains functions of V whose normal traces vanish on 8Qn-
It is stated in [15] that, for reasons related to pathological properties of Hl!2(dVtD)
and H-^2(dQN), it is necessary to use definition (3.7) and not an expression such
as w n = 0 in II l:2(()il \ ).
For example, we would not, at this time, define Vo to be the space given by
Hq{div, fi) = {w e V : w n = 0 on dQN} (3.9)
We will define the pressure space as:
A = L2(tt) (3.10)
with norm
11A11 a = || A||x,2(j7) . (3.11)
In the case of a purely homogeneous Neumann boundary condition, we have the
additional constraint given by:
A0 = {A 6 L2(tt) : [ \dtt = 0} (3.12)
in
17


which factors out the constant functions from L2(Vt).
We multiply the first equation in (2.17) by a vector function w e V0 and
integrate over Q giving
j K 'v w dQ + j Vp w dfi = 0. (3.13)
n n
We use integration by parts and the boundary condition, (2.18), to get
J K V w (M J (V w)pdQ = < w n. q >qq (3.14)
n n
Note that the Neumann boundary condition is an essential boundary condition
and we specify a homogeneous Neumann boundary in the test space so that it
disappears when we integrate by parts. The Dirichlet boundary is referred to as
a natural boundary condition.
Next, we multiply the second equation in (2.17) by a test function in A (or
A0 for pure homogeneous Neumann boundary) and integrate over il to get
J(V-v)Xdtt = J fXdtt. (3.15)
n n
We introduce the following bilinear forms and linear functionals:
o(v, w) = / K W w dVt
Jn
b(w,p) = J V-wp dQ
G(w) = < w n, q >qq
F(A) = [ fX dtt.
Jn
(3.16)
(3.17)
(3.18)
(3.19)
Now, problem (2.17) is stated in the weak form as: Find v6V0 and p E A such
that
a(v, w) b(w,p) = G(w), Vw e V0
b(v, A) = F(X), VAeA
(3.20)
18


If we have non-zero Neumann boundary conditions, say v n = g on dQN,
then we will need to consider veV given by:
v = v0 + v9 (3.21)
with v0 G V0 and v9 an arbitrary hxed element of VV where
V v = {w //(div. <>=< g.p ><>. Mp e (3.22)
We substitute this into (3.20) to get
{a(v0, w) b(w,p) = -G(w) a(v9, w), Vw e V0
b{v0,\) = F(\) b(vg,\), VA 6 A
3.2 Divergence-Free Subspace
Problem (3.23) is symmetric, but indefinite. We can decompose the velocity
space using a divergence-free subspace. In this way, we can eliminate the pressure
from problem (3.23). The result is a symmetric positive definite form of the
problem.
The divergence-free subspace is defined as:
D = {w 6 V0 : b{w, A) = 0 VA 6 A}. (3.24)
We want to find a vector, vj + v9 eV, where v9 e VN and vj e V0 satisfies
6(vj, A) = F(X) b(vg, A), VAeA. (3.25)
There are many such vectors and similarly we can easily construct one which
satisfies:
div vj = / div v9 (3.26)
19


and thus satisfies (3.25). We then write v e V as a vector with the correct
divergence, vj + v9, plus a divergence-free correction as:
v = vD + (vj + v9). (3.27)
The divergence-free correction, vD e D, satisfies
a(v£>, w) = a(v/ + v9, w) G(w), Vw e D (3.28)
It can be shown that a(-, ) is a coercive, continuous bilinear form, and by the
Lax-Milgram theorem, (3.28) is well posed.
With vD well-defined as the solution to (3.28), we then determine p E A such
that
b(w,p) = a(v, w) G(w) Vw e V0. (3.29)
The well-posedness of this problem will rely on a special kind of coercivity for
3.3 Existence and Uniqueness
We will first show that the two bilinear forms, a(-, ) and £>(, ), are continuous.
Lemma 3.2 There exist positive constants, Ca and Ch, such that for all u, v e V
and for all p E A we have
a(u,v) < Ca||u||v||v||v (3.30)
Kup) ^ ^6IIvl|v||p||A- (3.31)
Proof. The proof for (3.30) and (3.31) is essentially an application of the Schwarz
inequality. Clearly, ||u||v > ||u|||2(n)3- Therefore,
|U||v||v||v ^ l|u||7.2C013 ||V|| r,2fo13
\2
lL2(n)3
\2
lL2(n)3
20


>
2
(3.32)
luvl dVt
a
> ala(u, v)
The constant, aQ, comes from condition (2.1) on the permeability tensor. As a
result, the constant, Ca, in (3.30) is p/(apao). Again, it is clear that ||v||y >
11div v11jy(o)3 Therefore,
MIvIWIa 11 di v v||x,2(n)3 ||p||a
>
n
| (div v)p| dVt
> b(v,pYz
(3.33)
Next, we show that a(-, ) is coercive on D.
Lemma 3.3 The bilinear form, a(-, ), is coercive on D. Namely, there exists a
positive constant, 7, such that for all v e D we have
7||vllv ^ a(v,v). (3.34)
Proof. Since div v e we have div v = A e A. Therefore, 11div v|||2(n) =
£>(v, A) = 0, and
llvllv = ||v||i2(n)s < Q!ia(V,v) VveD. (3.35)
The constant, a\, again comes from the condition, (2.1), on the permeability ten-
sor and the constant, 7, in (3.34) is equal to p{apa 1).
21


This proves the existence and uniqueness of a solution for problem (3.28).
Now, we want to show existence and uniqueness of problem (3.29). First, we need
the following:
Lemma 3.4 There exists a positive constant, /3, such that for every j) 6 A we
have
P\\p\\a < sup -rn. (3.36)
wev |1 w||v
Proof. Let p e A. There exists a unique $ E H2(Q) which satisfies
AT = p
$ = o on an
By elliptic regularity we have
(3.37)
||$||i?2(n) < C\\p\\L2{Q) (3.38)
Since $ E H2(Q) we have V$ e V. Therefore, we let wp = and thus,
div wp = p. Furthermore,
||wp||v < C||p||a (3.39)
Now, we can finish the proof.
b(w,p) b(wp,p) HpIII 1 i, i,
sup -r--ji > ~r---~7j ~ = ' > ||p||a
wev ||w||v ||Wp||v ||wp||v F
Therefore, we let /3 = 1/C and we are done.
(3.40)
The proof of Lemma 3.4 can be modified in the case when p e A0, i.e, ho-
mogeneous Neumann boundary conditions. In this case we take $ e Hq(Q) and
wp e V0. (See [12, Lemma 9.2.3].)
22


a(v, w) G(w). Then, we rewrite equation (3.29) as
Let F(w)
b(w,p) = F(w) Vw E V0
(3.41)
Now, we prove existence and uniqueness of (3.41).
Theorem 3.5 Problem (3.41) has a unique solution.
Proof. Let D1- denote the orthogonal complement of D in V. We need only
consider w e D1- since the behavior of b(w,p) on D is trivial. The Eiesz Repre-
sentation Theorem implies that there exists a linear operator, T : A D-1, such
that b(w,p) = (w,Tp) and ||b||v' = ||Tp||v- Therefore,
The inequality in (3.42) is due to the continuity of £>(, ).
We need to show that T is bijective. We show first that the image, R, of
T is closed in DA Suppose that pj 6 A is a sequence with the property that
Tpj we D Then {Tpj} is Cauchy in D and
The inequality, (3.43), is from (3.36) and the equality (3.44) is because of (3.42).
(By a similar argument, T is one-to-one.) Therefore, {pj} is Cauchy in A. Let
q = limj^ooPj. By the continuity of T. 7V/ = w. This shows that R is closed.
Therefore, the space D1- can be decomposed into R and its orthogonal complement
RL. If w 6 R then b(w,q) = (w.Tq) = 0 for all q E A which implies w e D.
(3.42)
P\\Pi~Pk\\k <
__ ||w || v
Tpj ^ Tpk||v
(3.43)
(3.44)
23


However, w e /? implies w e D which forces w = 0. Thus D = R, and T is
onto.
Using the Eiesz Representation Theorem one more time gives us a unique
v e D such that
/'(w) = (v. w)r Vw 6 D1. (3.45)
We just let Tp = v, which uniquely determines p by the bijectivity of T. a
24


4. Discretization
In this chapter we describe the discretization of problem (3.28). We begin
by defining a partition of fi into hexahedral elements. Hexahedral elements are
trilinear mappings of the reference cube.
We then define finite-dimensional velocity spaces, Vj C V0, and C VN,
and Vh C V, along with a hnite-dimensional pressure space, Ah C A. The discrete
pressure space consists of piecewise constant functions. The discrete velocity space
consists of lowest-order Raviart-Thomas velocity functions.
The discrete velocity space must be treated in a special way. We define a
transformation of a velocity function on the reference cube to a velocity function
on the hexahedral element via the Piola transformation. Raviart and Thomas
demonstrated in [52] the existence of a projection operator Uh : V Vh such
that, for any v e V,
b(Uhv, A) = b(v, A), VA e Ah, (4.1)
and
||nftv v||L2(n)3 < C'/is||v||iJS(n)3, s = 0,1. (4.2)
The inf-sup condition can then be verified and error estimates given. This anal-
ysis was done for affine elements where the Jacobi matrix and the Jacobian are
constant. Results for 2-D quadrilateral elements where later given in Thomass
thesis [57]. Results for non-affine 3-D elements is still an open question. If an
interpolation estimate where known, it could be used along the lines of Thomass
thesis to verify the inf-sup condition and obtain error estimates. In particular,
25


for the special case of 3-D elements that are products of 2-D quadrilaterals with
a 1-D interval (quadrilateral prisms), Thomass theory applies.
There are numerous Piola-type transformations which satisfy the properties
of the Piola transformation. The different Piola-type transformations will have
certain interpolation properties which need to be considered. For example, the flux
of a velocity vector representing uniform flow can be interpolated to the interior of
a two-dimensional quadrilateral exactly. However, in the three-dimensional case
an exact interpolation of the uniform flow, in general, can not be obtained.
The discrete velocity space can be decomposed using a discrete divergence-free
velocity space, Dh C Vf. The construction of the discrete divergence-free velocity
space on a rectangular grid in three dimensions is described in [17]. We will see
here that this construction is also valid for the hexahedral grid. We will extend
the work in [17] to hexahedral grids with non-homogeneous boundary conditions.
We now consider vf e Vf and vf e Vfr and seek a solution to the divergence
equation as before:
div vf = / div vf. (4.3)
Then, we seek the divergence-free correction vf, e D/l given by
a(vf,, w) = a(vf + vf, w) G(w), Vw e Dft. (4.4)
We write our approximate solution as:
vfe = vf, + (vf + vf). (4.5)
The discretization results in a symmetric positive definite matrix problem.
This matrix is defined and a right-hand side is given. The PCG method can be
used to solve for vf,.
26


4.1 Trilinear Hexahedral Elements
Figure 4.1 shows the image of the unit cube under the trilinear mapping and
Figure 4.2 shows a random hexahedral grid in three dimensions. We partition fi
into hexahedral elements as follows:
a = U Qi,,,k (4.6)
i,j,k
Each element, Qij,k, is a trilinear mapping of the reference cube, Q. We will call
such a partition Of1. The superscript h refers to mesh size. We will usually think
of h as being the maximum diameter of the elements of the decomposition.
A physical coordinate, (x,y,z), in the hexahedral element, Qi,j,k, is given in
terms of a reference coordinate, (x,y,z), through the trilinear mapping given by
where
and
x{x,y,z) = W(P b(.r. //..;)
y(x,y,z) = -b(x,y,z)
z(x,y,z) = a-(PF;.() -b(.r.//..;)
b7 (.r. y, z) = (1,3;, y, z, xy, yz, xz, xyz)
(4.7)
(4.8)



( /-fX X X' X' X' X' X' X' \
\a0 > al ] a2 > a3 5 a4 5 a5 5 a6 5 a7J
(4,af,a|,a|,al,a|,a|,af)
(ag,af,a|,a|,a|,af,a|,af).
(4.9)
The vector, pf *., represents the eight x coordinates of the corners of hexahedral
element Qijtk- Likewise, p|jfeji and p/. ( (- are the y and z coordinates, respectively.
The vectors a,. ay and az are uniquely determined by these points. The corre-
spondence between the vertices of the reference cube and the vertices of Qij.k is
summarized in the Table 4.1.
27


Figure 4.2: Random Hexahedral Grid
We can immediately write down eight equations
MPij,*) b(> > ) = *G1/2j l/2,fc1/2
W(P';(./,) -b( 1.0.0) = Xi+1/2J-l/2,k-l/2
Wtp';,./,) -b((). 1.0) = Zj_i/2j+i/2,*-i/2
a^(Pfj,fe) M1 !> ) = WK/2j+l/2,fc1/2
a< (P';(./,) b(0. 0. I ) = a;i-l/2j-l/2,fe+l/2
aic(Pi,j,k) 1)( l 0. I ) Xjijt\j2J 1/2,fc+1/2
a*(p?j,*) -b(0, R!) = G-i/2j+i/2,fe+i/2
a^(pfj,fe) M1 !) !) = ^i+l/2j+l/2,fc+1/2
(4.10)
28


(x,y,z) /v-jT Pi,j,k pyh . Pk,ij
(0,0,0) xi-l/2,j-l/2,k-l/2 Ujl/2,k l/2,il/2 zk-l/2,i-l/2,j-l/2
(1,0,0) xi+l/2,j-l/2,k-l/2 Vj-l/2,k-l/2,i+l/2 zk-l/2,i+l/2,j-l/2
(0,1,0) xi-l/2,j+l/2,k-l/2 Vj+l/2,k-l/2,i-l/2 zk-l/2,i-l/2,j+l/2
(1,1,0) xi+l/2,j+l/2,k-l/2 Vj+l/2,k-l/2,i+l/2 zk-l/2,i+l/2,j+l/2
(0,0,1) xi-l/2,j-l/2,k+l/2 Vj-l/2,k+l/2,i-l/2 zk+l/2,i-l/2,j-l/2
(1,0,1) xi+l/2,j l/2,k+l/2 Vj-l/2,k+l/2,i+l/2 zk+l/2,i+l/2,j-l/2
(0,1,1) xil/2,j+l/2,k+l/2 yj+l/2,k+l/2,i-l/2 zk+l/2,i-l/2,j+l/2
(1,1,1) Vj+l/2,k+l/2,x+l/2 zk+l/2,i+l/2,j+l/2
Table 4.1: Reference Coordinates and Physical Coordinates
and solve the system of equations for p[ ( /. giving

i,j,k)
"
ao
af
a2
f
a|
of
Of
1 p i
xi-l/2,j-l/2,k-l/2
xi+l/2,j-l/2,k-l/2 ~ xi-l/2,j-l/2,k-l/2
xi-l/2,j+l/2,k-l/2 ~ xi-l/2,j-l/2,k-l/2
xi-l/2,j-l/2,k+l/2 ~ xi-l/2,j-l/2,k-l/2
xi+l/2,j+l/2,k-l/2 ~ xi-l/2,j+l/2,k-l/2
xi-l/2,j+l/2,k+l/2 ~ xi-l/2,j-l/2,k+l/2
xi+l/2,j-l/2,k+l/2 ~ xi+l/2,j-l/2,k-l/2
6
xi+l/2,j+l/2,k+l/2 ~ E 0XS
s=0
(4,11)
29


In the same manner we get and az(p|jij) as follows:


1 o af
af
a3
af
V a5
V al
i i
Vj-l/2,k-l/2,i-l/2
yj-l/2,k-l/2,i+l/2 ~ Uj-l/2,k-l/2,i-l/2
yj+l/2,k-l/2,i-l/2 ~ yj l/2,kl/2,il/2
yj-l/2,k+l/2,i-l/2 ~ yj l/2,kl/2,il/2
Vj+l/2,k-l/2,i+l/2 ~ yj+l/2,k-l/2,i-l/2
V
af
y
Vj+l/2,k+l/2,i-l/2 ~ yj-l/2,k+l/2,i-l/2 ~ 0-2
yj-l/2,k+l/2,i+l/2 ~ yj-l/2,k-l/2,i+l/2 af
6
?/j+l/2,fc+l/2,i+l/2 S Og
s=0
(4.12)
MpI,,,
1 P OM 1 Zk-l/2,i-l/2,j-l/2
af zk-l/2,i+l/2,j-l/2 ~ Zk-l/2,i-l/2,j-l/2
a2 zk-l/2,i-l/2,j+l/2 ~ zk-l/2,i-l/2,j-l/2
a3 zk+l/2,i-l/2,j-l/2 ~ Zk-l/2,i-l/2,j-l/2
af zk-l/2,i+l/2,j+l/2 ~ zk-l/2,i-l/2,j+l/2 ~ af
a5 Zk+l/2,i-l/2,j+l/2 ~ Zk+l/2,i-l/2,j-l/2 ~ af
a6 Zk+l/2,i+l/2,j-l/2 ~ zk-l/2,i+l/2,j-l/2 ~ af
1 1 6 Zk+l/2,i+l/2,j+l/2 E af L s=0 J
We can define a 3 x 8 matrix, agiven by
ai,j,k
MPij*))T
(ay(Plk,i))

MpLu))

(4.13)
(4.14)
Once &ij^k has been computed for a particular element Qij,k, we can then compute
the physical coordinates of a point in in terms of reference coordinates by
30


using
/ \
x
y
Ti,j,k(x, // //. 2).
W
(4.15)
We will call our trilinear mapping from Q to Qi,j,k- We will also assume we
have an inverse mapping, Txk from Qij,k to ().
The covariant, or coordinate, basis vectors for element Qij,k are given by
xlj,SB) dx (ax{Pi,j,k) b (x,y,z)) dx dx
xlj,SB) = b (x,y,z)) = dy dx (4.16)
_xlj,SB) _i(a,(p k,ij) b (x,y,z)) 1
1 1 §gMPi,j,k) b(x,y,z)) 1 1
Xj,k,i(z, X) ^ **0 = ^MPjVm) -b(x,y,z)) = dy dy (4.17)
1 _^MPk,i,j) 'b{x,y,z)) _ I 1
ZlLj(x,y) dz (ax(Pij,fe) b (x,y,z)) 1
Zk,iA*>y) ^k,i,j (x^y) = mMPxkj) b (x,y,z)) = dy dz . (4.18)
1 aj(az(Pfe,ij) b (x,y,z)) dz dz
We will allow ourselves to drop the dependence of the reference variables x, y,
and z when convenient. The unit normal vector (orthogonal to and
is given by
n;
i;j;k
(x,y,z) = ,
Xj,k,i x Z
k,i,j
Xj,k,i x
(4.19)
The sign would be negative if we are talking about an outward normal at
x = 0 and would be positive if we are talking about an outward normal at x = 1.
31


Likewise, we have a unit normal vector orthogonal to and Zfejij- given by
S,x) = 1 Z*'uXXii'*
x X,fe||
Finally, we have a unit normal vector orthogonal to X^*, and Yj^j given by
(4.20)
nzk-^z,x,y) = -
We also define the area Jacobians as:
Xj,j,fc x Yj;fc;j
|X> x Yjykyi\
(4.21)
Tljjk(x,y,z)
rlki(y,z,x)
Tz
1 k,id
(z, x, z)
Yj,k,i x ZkM\\ (4.22)
^k,i,j x Xjj^ll (4.23)
X,fe x Y^^||. (4.24)
The Jacobi matrix is dehned as:
\^0 7
Ai,j,k 1j,k,i
vl V"1 71
Ai,j,k 1j,k,i
v2 v2 72
Ai,j,k j-k-i _
and its determinant, the volume Jacobian Ji;j;k(x,y, z), is given by
%(hid) = ~^-i,j,k (Yjtk,i X Zfe^j)
= X Xjj^)
= Z (Xjj^ x Yj,fe,j).
We denote the faces with their respective unit normals in as:

(4.25)
(4.26)
Fi,j,k(%) , with normal n^jk(x,y,z) (4.27)
Fj,k,i(y) , with normal n|jfeji(^, f, 7) (4.28)
Fk ij{z) , with normal nk ij{z, x, y). (4.29)
32


The areas of the faces would then be given by
1 1
= J dvdz = 11 vihkdvdz
o o
i i
Ay;h,:(y) = / dxdz = / / Tyh-dxdz
jpV , j j j,n,i
o o
i i
Ak,i,j(z) = / dxdy= / / Tzk i jdxdy.
(4.30)
(4.31)
(4.32)
/ 0 0
Note that throughout we have used {i, j, k} ordering for the X coordinates,
{j, k, i} ordering for the Y coordinates and {k, i, j} ordering for the Z coordinates.
This leads to some permutations that may be used to advantage during the im-
plementation. For example, we define a function which computes a*^ using the
physical coordinates pf *., p?k i, and pkij as follows:
compute_a(a, px, py, pz, i,j, k).
(4.33)
Next, we dehne a function which takes the matrix ai;j;k and computes Xijjfe(^, z)
for a particular y and z as follows:
compute_X(X, a, y, z).
(4.34)
Now, suppose we call the same functions, but permute the arguments as follows:
compute_a(a, py, pz, px,j, k, i) (4.35)
compute_X(Y, a, 2, x). (4.36)
This will produce the vector, Yjjkji(z, x), with a left cyclic shift on the components,
that is, a {YYk,Y?jk,YAk) ordering.
33


4.2 Approximation Spaces
The pressure space, Ah, consists of piecewise constants. A pressure on the
reference element is a constant scalar function. It, therefore, has one degree of
freedom. We map this scalar function to a hexahedral element in an isoparametric
way. This means that the mapping will only depend on the coordinate transforma-
tion given by the trilinear mapping. Since the scalar function is constant, there
will be no dependence on the coordinates and, therefore, it will have the same
constant value when mapped to a hexahedral element. The dimension of Ah on
a. l x m x n grid is Inm. In the case of a pure homogeneous Neumann boundary
condition, we will have the additional constraint specified for Aq. In this case the
dimension of Aq is Imn 1.
To define the finite-dimensional velocity space, Vq C V0, we consider the
lowest-order Raviart-Thomas vector function on a reference cube, Q. We denote
this space as 1ZT[q] (Q). We then use the Piola transformation to define the velocity
space on 7ZT[o](Qi,j,k)- For v E 7ZT[o](Q), the Piola transformation is defined as:
v = -J Bi,3,kA o Trl (4.37)
and the inverse is given by
v = JijfiBfJfiV o Y(4.38)
Raviart-Thomas elements were first introduced in [52] for triangles and rect-
angles in two dimensions. This was later extended to three dimensions by Nedelec
[51]-
34


A velocity function on the reference cube has the following form:
a + bx
v(x,y,z)= c + dy (4.39)
e + fz
This vector function has six unknowns. Therefore, we can associate a value on
each face of the reference cube for v. These values will represent the fluxes across
each face of the reference cube. We can define a basis vector function as a vector
function with a flux of magnitude one on one of the faces and zero flux on the
other faces. This will give us six basis vector functions as follows:
v-1/2,0,0 = ((1^),0,0)T (4.40)
vi/2,0,0 = (A0,0f (4.41)
Vo,-1/2,0 = (0,l-y,0)T (4.42)
vo,i/2,o = (0,y,0)T (4.43)
Vo,0,-1/2 = (0, 0,1 zf (4.44)
Vo,0,1/2 = (0,0, zf. (4.45)
These basis vectors when mapped to Qij,k using the Piola transformation are
given by
.X _ (1 (4.46)
1/2
vx 1/2 (4.47)
.y _ (1 y)Y j.k,i (4.48)
1/2 Jj,k,i
vy (4.49)
* j,k,v, 1/2 Jj,k,i
35


(4.50)
v
Z
V
Z
k,i,j;l/2
(1 Zrj Zfcij
^k,i,j
(4.51)
For now, we specify zero fluxes on the boundary of fi. Due to the continuity
of fluxes across faces, the dimension of the global velocity space on a l x m x n
grid will be (l 1 )nm + l(m 1 )n + lm(n 1). That is, each basis vector function
will have a support of two elements. The global velocity basis functions are thus
defined as follows:
X
+l/2,j,A:
'i,j,k; 1/2
i+l,j,k;-l/2 >
on Q
on Q
i-j,k
i+l,j,k
(4.52)
Vj+l/2,fe,j
vf,iM;i/2 011 Qi-j,k
V?
j+l,k,i;-l/2 >
on Q
i,j+l,k
(4.53)
' k+l/2,i,j
'k,i,j; 1/2 5
'k+l,i,j;-l/2 >
on Q
on Q
i-j,k
i;j;k +1'
(4.54)
Our discrete velocity space, Vq, is given by
VS = sPan M+i/2,j,k^y3+i/2,k.pvk+i/2,i,j}-
(4.55)
The volumes associated with two elements that have a face in common are given
by
Qi+l/2,j,k = Qi.j.k U Q/ 1 .././ (4.56)
Qi,j+l/2,k = Qi,j,k U Qi,j+l,k (4.57)
Qi,j,k+1/2 = Qi.j.k [J Qi.j.k 1 (4.58)
36


Define the unit normals of the reference cube as follows:
fi-1/2,0,0 = (-1,0, Of (4.59)
hi/2,0,0 = (1,0, 0)T (4.60)
Ho,-1/2,0 = (0,-1, of (4.61)
Ho,1/2,0 = (0,1, Of (4.62)
no,o,-i/2 = (0,0,-If (4.63)
Ho,0,1/2 = (0,0, If. (4.64)
We can define an interpolation operator if : V(Q) 7ZT[o](Q)- Let v e V(Q).
Then, we can uniquely define 7rv e 'RT[q](Q) by
7TV = /-l/2,0,0V-i/2,0,0 + /l/2,0,0W/2,0,0
+ /o,-1/2,0V0,-1/2,0 + /o,1/2,0 Vo,1/2,0 (4.65)
+ /o,0,1/2 Vo,0,1/2 + /o,0,1/2 Vo,0,1/2
where
1 1
/-1/2,0,0 = / / V H1/2,0,o dydz,
0 0
1 1
/o,-1/2,0 = / / V Ho-i/2,0
0 0
1 1
/o,0-1/2 = / / V H0,0-1/2 dxdy,
o o
Then, for any A e Tq{Q) we have:
i i
/i/2,o,o = / / v Hi/2,0,0 dydz
o o
i i
/o,1/2,0 = / / V Ho,1/2,0 (4.66)
0 0
1 1
/o,0,1/2 = / / V H0,0,1/2 dxdy.
o o
Furthermore, if we let
(fv v) nAds = 0.

/)(V v) = V 7TV
(4.67)
(4.68)
37


then, we have:
I p(V v)A d() = / V vA dQ. (4.69)
Q Q
In other words, V (fv) is the L2 projection of V v onto the space Tq{Q).
Note that we have used V- to represent the divergence operator. We will use this
notation constantly.
4.2.1 Properties of the Piola Transformation
The Piola transformation results in the following:
Lemma 4.1 For any v e V(Q) we have
V v)A dxdydz
(V v)A dxdydz, VA E L2(Q)
Q
Qi,j,k
(4.70)
n)A ds
(v n)A ds, VA e L2(dQ)
(4.71)
dQ dQi.j.k
Proof. Equation (4.71) is essentially a definition of the Piola transformation. For
example, let v = {v\,V2,vQT. Then,
v n;
L\fc(0) dydz
rpx
(o)
Vi
X

ij'A
FQ.kW
i i
J jvi\x=o dydz
0 0
61/2,0,0 d§-
Q
x z-:,dydz
1 'hjx A /
(4.72)
(4.73)
We use (4.71) and Greens formula to show (4.70) as follows:
V v)A dxdydz
Q
dQ
n)A ds / v VA dxdydz
Q
38


= J (v n)A ds J v VA dxdydz (4.74)
9Qi,j,k Q
= J (V v)A ds + J v VA dxdydz J v VA dxdydz
Qi.j.k Qi.j.k Q
= j (V v)A ds.
Qi,j,k
To get the last equality we need to show that
J v VA dxdydz J v VA dxdydz = 0. (4.75)
Q
In the case of lowest-order Raviart-Thomas spaces we can use the fact that the
scalar functions, A and A, are constant. However, we can also show it for higher
order spaces. To show this we will look at a simple example with lowest-order v
and arbitrary A; an analogous, but more tedious, argument works for higher-order
v. Suppose v = vy/2,0,0- Then,
v VA dxdydz
x
Qi.j.k
Qi.j.k
Ji.i
Xj VA dxdydz
i,j,k
Qi.j.k
x dX
Ji,j,k 9x
dxdydz
Q
x dxdydz
ox
v VA dxdydz.
Q
Note also that (4.70) implies that
V-v
T-V-voT^.
Jh],k
(4.76)
39


We now give the results described by Thomas in [57] for the approximation of
smooth vector functions on 2-D quadrilaterals. As noted above, the analysis for
non-affine 3-D elements is still an open question. However, Thomass theory does
apply for the special case of 3-D elements that are products of 2-D quadrilaterals
with a 1-D interval (quadrilateral prisms).
We will use C to denote a generic positive constant independent of the mesh
size h. We have for 2-D quadrilaterals the following:
Theorem 4.2 There exists an operator 7rijjfe such that, for all v e Hl(Qijjfe)2
with V v e Hl(Q,ij^k) we have
ll7rij,*v VIU2(Qij.k)2 ^ Ch\v\HHQij.k)2 + Ch2\V (4.77)
and
IIv (TTjj.fcV v)||x,2(Q_-fc) < C||V v\\L2{Qi:j:k) + Chl\V v\Hi{Qij k). (4.78)
Proof. The proof is given in [57, Theorem III.4.3 (pg. III-26)].
A result of Theorem 4.2 is the error estimates given by
Theorem 4.3 For p E ff1(Q), v e ff^fi)2, and V v e H:(Q) we have
\\p ^Ph\\L'2(Q) + ||vfe|U2(n)2 < Ch(\p\Hi(fy + |v|ffi(n)2) + Ch2\V v|ffi(n) (4.79)
and
||V (v v/i)||i2(n) < C|v|iji(n)2 + Ch(\p\ffi^ + |V v|ffqn))- (4.80)
40


Proof. The proof is given in [57, Theorem IX.3.2 (pg. IX-17)] and [57, Theorem
IX.4.1 (pg. IX-19)].
Note that V (v vh) does not necessarily go to zero as h goes to zero. For
higher-order elements, V (v vh) does go to zero as long as p E H2(Q) and
v, V v E ff2(fi); with 7ZT[i] instead of 777[o], all the above estimates hold with
one more power of h.
4.2.2 Interpolation of Velocity Functions
We now look at how the flux gets interpolated to the interior of a hexahedral
element. We can define numerous Piola-type transformations in three dimen-
sions which still have the properties of the Piola transformation discussed above.
The difference between these various Piola-type mappings is how the flux is in-
terpolated to the interior of an element. One simple modification of the Piola
transformation is to include an area correction factor as follows:
v'
hJ-
A;-1/2 Pi,j,k;-l/2 j
o1
(1
i,j,k
7X,
hj
,fc;l/2 Pi-j-M 1/2 t
A i A
i;j;k
i,j,k
Vj,k,i;~ 1/2 Pj,k,i\-1/2 j
0J,k,i
vy ny y^j,k,i
Vj,k,i; 1/2 Pj,k,i; 1/2 j
j,k,i
vz _ nz (1 z)Zk,j,j
Vk,i,j; 1/2 Pk,i,j\-1/2 j
h
_ z '~~k,j,j
Vk,i,j; 1/2 Pk,i,j\ 1/2 j
k.t. i
zZfr
1k,i,j
where the area correction factors are given by
Pi,j, A;1/2

(4.81)
(4.82)
(4.83)
(4.84)
(4.85)
(4.86)
(4.87)
41


(4.88)
X
Pt,j,k;l/2
ny
Pj,k,r-1/2
ny
Pj,k,r, 1/2
Pk,i,r-1/2
Pk,1/2
Yj,k,i x Z

44)
4*4)
44)
441)
The velocity, interior to a cell, Qij,k, is approximated by
(4.89)
(4.90)
(4.91)
(4.92)
_i_ f-y vy
^ j-l/2,k,iw j,k,r-1/2
+ fk-l/2,i,j'Vk,i,j;-l/2
The nodal values /f 1/2j-fe, fi+i/2,j,k
(x, y, z) + /f+i/2j>vfjijfe.1/2(x, y, z)
(y, z, x) + /j'+i/2,jfc,ivJ)fc)j.1/2(y, x) (4-93)
(z,x,y) + /Jfc+i/2,ijv^jj.1/2(^,x,y).
fy fV fz arlr| fz
^j l/2,k,i'> 2j+l/2,k,iZ Jk1/2,i,j cillu Jk+1/2,i,j
are the fluxes on faces fffO),
^ij,*
(i),
^..(0), FJ^l), Fr
k,i,j
M,
id
respectively. For example,
__ fx
U/t./4: 11/'!./!A: J i l/2Jtk *
*%.*()
The scalar value, \Xitjtk{x,y,z) nfj^(x,y,z) is given by
XX XX
uij,*(£> &f n44 y, f = ^1/24 (1 f 4-
(4.94)
(4.95)
Equation (4.95) is a linear interpolation of the average normal velocities from the
two faces at either end of the element. Suppose we have uniform flow given by
v = (c, 0, Of. Furthermore, suppose nf is in the direction of v on element
Qi,j,k- Then, the flux across face Ffjk(x) is cAfjk(x). Substituting this flux
42


at the two faces, F*jjk(0) and Ff^;A.(1), into equation (4.95) gives a value of c.
Therefore, when we integrate over the face, Ffjk(x), we get the correct flux. This
would not have been the case with the standard Piola mapping, which would give
c(-4fJlfc(0)(l x) + Af j fe(l)*^)i this is equal to cA^j k{x^ if Aijtk(x) varies linearly
with x as it must in 2-D, but in general this is not the same.
Now, we look at a different example. Again, we will assume a uniform flow,
v = (c,0,0f. This time, however, we will have a face where nfjk(x,y,z) is not
parallel to v. In particular, consider the element shown in Figure 4.3.
h h
h
Figure 4.3: Uniform Flow on Distorted Element
We will consider the flux in only this one element, so we will drop the i,j, k
subscripts. This element has a face at x = 0 with area h2 and a normal, n'. in the
43


direction of the flow. The face at x = 1 has area y/2h2 with normal at an angle
at 45 degrees with the flow. In general, the flux across the face, Fx(x), will be
ch2 and the area will be h2-\/x2 + 1. Substituting these expressions into equation
(4.95) and integrating over the face, Fx(x), gives the interpolated flux as:
J u n* d>Ax = ch2 fl + ~ *') Vi2 + 1. (4.96)
F*($) \ \ / /
The relative error of this interpolation is plotted in Figure 4.4. In this case,
because the flux varies linearly with x (actually, it is constant), the standard
Piola mapping gives the correct interpolated flux.
Figure 4.4: Relative Error of Interpolated Flux
For general hexahedral elements, especially when the faces are not planar, the
flux will not always be interpolated exactly for uniform flow. This situation is
44


particular to 3-D. In two dimensions we have no problem interpolating uniform
flow. This is due to the use of linear interpolation which works fine in 2-D, but not
in 3-D. We perhaps need a quadratic term in the interpolation. Different velocity
trial functions are being investigated by Richard Naff, [50], at the U.S. Geological
Survey. It appears, at this point, that there will be no way to exactly interpolate
uniform flow for general hexahedral elements. We will have to be satisfied with a
close approximation on the order of h2 when using lowest-order Raviart-Thomas
velocity functions.
4.3 Divergence-Free Basis
We need a computationally convenient basis for the divergence-free velocity
space. The construction of such a basis for a rectangular grid in three dimensions
for homogeneous boundary conditions is described in [17]. More recently, a com-
putationally convenient divergence-free basis is described by Scheichl in [54] for
triangles in two dimensions and tetrahedra in three dimensions. Scheichl uses a
spanning tree algorithm to compute the divergence-free basis. Our method is to
define the divergence-free basis a priori.
We will see here that the divergence-free basis described in [17] is also valid
for distorted grids using hexahedral elements. We will extend this basis to include
Dirichlet boundary conditions. It became convenient to define a small part of the
divergence-free basis for Dirichlet boundary conditions using the spanning tree
algorithm described in [54].
A divergence-free vector function is the curl of a vector potential function.
We wish to construct the divergence-free subspace such that it is contained in Vq.
45


Consider the bilinear basis on the reference cube, Q, consisting of scalar func-
tions which are bilinear in y and z and constant in x given by
*1/2,1/2 = (4.97)
*-1/2,1/2 = (! y)z (4-98)
*-1/2,-1/2 = (l-y)(l-) (4.99)
*1/2,-1/2 = $(1 ^)- (4.100)
We now define the vector potential functions on the reference cube as:
*i/2,l/2 = (^1/2,1/2) 0) 0) (4.101)
*-1/2,1/2 = (**1/2,1/20,0)T (4.102)
*-1/2,-1/2 = (*-1/2,-1/2) 0) 0) (4.103)
*1/2,1/2 = (**1/2,1/2) 0) 0) (4.104)
We take the curl of each of these vector functions giving divergence-free vector
functions on the reference cube as:
wf/2,1/2 = curl $?/2jl/2
0
y
v0,1/2,0 v0,0,1/2
(4.105)
w
1/2,1/2
curl
1/2,1/2
W
1/2,-1/2
curl
1/2,-1/2
0
i -y
z
0
(! -y)
1 z
Vo,-1/2,0 + V0,0,1/2 (4.106)
-V0,-1/2,0 + V0,o,-1/2 (4.107)
46


0
Wi/2,-1/2 = CUrl *1/2-1/2 = ~jj VO,1/2,0 V0,0-1/2- (4.108)
(1 z)
Now, we use a Piola-type mapping to map w^2jl/2, w1^2)1^2, w1^2)_1^2, and
^1/2,-i/2 t0 tlie physical elements Qij,k, Qij+i,k, Qij+i,k+u and Qij,k+i respec-
tively giving
(4.109)
(4.110)
(4.111)
(4.112)
Note that if we use the standard Piola mapping, then by (4.76) the w'"s will
be divergence-free. If we use a different Piola-type mapping, for example (4.81)-
(4.86), then the w'"s may not, in general, be strongly divergence-free. However,
the Piola-type mapping still preserves the face fluxes. Therefore, the w'"s will
be weakly divergence-free. From now on when we define a divergence-free vector
basis function we will mean weakly divergence-free.
We use continuity across faces to define the divergence-free vector function
with support on Qj+i/2,k+i/2,i = Qi,j,kUQj+i,k,iUQj+i,k+hi^Qi,j,k+i as:
We can think of (4.113) as an x-slice function and recognize it as such by the third
index i. That is, an x-slice function has its support on four elements contained
within a vertical slice of the domain for a given i. With a moments thought we
(4.113)
Vfc+l/2,ij 4^ Vfc+l/2,i,j+l-
47


can see that the divergence-free basis vector function must have support on four
elements in order to be weakly divergence-free. This is because the magnitude of
the fluxes on the four faces must be equal in order for them to cancel out. The
above construction also assures that a divergence-free vector function is contained
m v
o-
We define analogous divergence-free vector functions for //-slices and ^-slices
as follows:
y 2/%
Wk+l/2,i+l/2,j = Vfc+1 /2,i,j ~ Vk+l/2,i+l,j
(4.114)
T2C I rX
' i+l/2,j,k r" vi+l/2j,fc+l
Wi+l/2,j+l/2,k
\t2C __ rX
vi+l/2j,fc vi+l/2j+l,fc
*2+1/2,fc,i
Vy
i+l/2,&,x+l'
(4.115)
The divergence-free vector functions (4.113), (4.114), and(4.115) span the divergence-
free subspace. We can write the a+slice function, w|+1//2 fe+i/2 i, as w|+i/2fe+i/2o
plus a linear combination of //-slice functions and z-slice functions as follows:
WJ+l/2,fc+l/2,i
Wi+l/2,fe+1/2,0
i-1
l-l /
E
1 A V
i-1
y y
^ k-\-l/2./is-\-l/2.J ^k-\-l/2./is-\-l/2.J-\-l
(4.116)
t-i /
e H
1 A V
is= 0
+ l/2J+l/2,fc
isJr\/2 j+1/2,^+1
We can show similar linear dependencies for the //-slice and z-slice functions.
These linear dependencies are given by
w
fc+l/2,i+l/2j
W
fc+l/2,i+l/2,0
i-1
E (w,
1 A V
ja= 0
i-i
i+l/2,is+l/2,fe
W
E (w|
3s=0 V
is+l/2,fe+l/2,i
i+l/2,is+l/2,fe+l
^Es+l/2,fc+l/2,i+l
(4.117)
48


and
Wi+l/2j+l/2,fc Wi+l/2j+1/2,0
k-1
Z (w*
ka =0
fe-1
v-1 /
£
A V
ks=0
j+l/2,fcs + l/2,i j+l/2,fes + l/2,i+l
y y
^ks+l/2,i+l/2,j ^rfes+l/2,i+l/2j+l
(4.118)
respectively.
Therefore, we can eliminate one of the above sets of linearly dependent func-
tions and still span the divergence-free subspace. We choose to eliminate the
linearly dependent rc-slice functions to end up with the divergence-free basis given
by
wi+i/2,fc+i/2,i ; = 0. 0<./< m ^ 2. () < A- < /; 2 (4.119)
wI+i/2i+i/2j : 0 < i < l 2, 0 w*Z+i/2j+i/2,fc J 0< The pressure space, Aq, can be defined as the divergence of Vj. This means
that Aq is the range of the divergence operator and the divergence-free subspace
is the kernel of this operator. Therefore, the dimension of the divergence free
subspace is given by
dimDfe = dimVj dimA(j
= (l 1 )mn + l(m 1 )n + lm(n 1) (Imn 1) (4.122)
= 2 Imn Im In nm + 1.
The number of divergence-free vector functions given in (4.119)-(4.121) is equal
to (4.122). This will form our divergence-free basis.
49


4.3.1 Dirichlet Boundary Conditions
We now consider the case when part of the boundary has Dirichlet boundary
conditions. In this case we will allow our velocity space, Vo, to have non-zero
fluxes on the Dirichlet boundary. We will still require that vector functions in Vq
have zero fluxes on the Neumann boundary. We no longer have a constraint on
the pressure space. In other words, we now define our pressure space, Ah, as being
spanned by piecewise constants and allow global constant functions.
We assume that if a given side of the domain has Dirichlet boundary condi-
tions, then all the faces on that side have a Dirichlet boundary condition. This
assumption is not necessary for the method to work, but simplifies the develop-
ment to be described subsequently. For example, if we impose Dirichlet boundary
conditions at x = 0, then all the faces at x = 0 will have unknown fluxes. These
divergence-free boundary vector functions are illustrated for a single z-slice in
Figure 4.5.
For each unknown flux on the boundary, there will be a corresponding basis
vector function in Vq given by
,ra;0 vx ~ V0j,fc;-l/2> on Qo,j,k (4.123)
xl Vj,k on Qi-ij'k (4.124)
yo Yk,i y ~ V0,fc,i; 1/2 on Qi,o,k (4.125)
vyl Yk,i vy vml,fc,i;l/2 , on Qijm-ijk (4.126)
v? hJ ** o' > on Qi.jp (4.127)
V'1 TJZ vnl,ij;l/2> On QiJ.n1- (4.128)
50


y
X
Figure 4.5: Boundary Divergence-Free Basis Vector
The divergence-free vector functions on the boundary are written in terms of
vector functions from Vq. For example, on the x = 0 boundary shown in Figure
4.5, we have the z-slice divergence-free vector functions given by
w.
.z,xQ _rx0 xO y
Vj+l,fe ^ vj+l/2,fe,0-
vj+l/2,k vj,k
Similarly, we have y-slice divergence-free vector functions given by
(4.129)
w
V,xO _____ xQ xQ z
k+1/2 ,j Vj,fe+1 + Vj,fe Vfe+l/2,0j-
(4.130)
Equations (4.129)-(4.130) dehne a divergence-free vector function for each edge on
the boundary x = 0 and involve fluxes across faces on the boundary in addition
to fluxes across internal faces. Note that divergence-free vector functions on edges
not on the boundary dont involve any boundary fluxes.
The divergence-free vector functions defined in (4.129)-(4.130) span the divergence-
free subspace on the boundary x = 0. However, we note that w 1+ij2j can
51


written in terms of wf+i/20 plus a linear combination of other divergence-free
vector functions. This relationship is given by
w
y,x,0
k+1/2 ,j
= w
y,x,0
k+1/2,0
z,x 0
s+l/2,k
W
z,x 0
s+l/2,fc+l
Ws+l/2,fe+l/2,0
(4.131)
Alternatively, we could have written wk+1 in terms of wj+i/2 plus a linear
combination of other divergence-free vector functions using the relationship given
by
z,x 0
W/+l/2,k
z,x 0
WJ+l/2,0
k-1
v-1 /
E (w
s=0
y,x0
s+l/2,j
W
y,x0
s+l/2j+l
W
J+l/2,s+l/2,0
(4.132)
Therefore, the divergence-free subspace on the boundary x = 0 is spanned by
z,x 0
W/+l/2,k
; 0 < j < m 2. () w
y,x0
k+1/2,j
; j = 0, 0 < k < n 1
(4.133)
or
TZ+ 0
Wj+l/2,k
:()<./< m ^ 2. A- = 0
w
y,x 0
fe+l/2j
; 0 < j < m 1, 0 < k < n 1.
(4.134)
Which ever set of functions we choose, they both have dimension mn 1. We
no longer have the constraint on the pressure space. Therefore, the dimension of
A h is Imn. In general, the dimension of Dh increases by the number of unknown
fluxes on the boundary (because of the new degrees of freedom in Vq), minus 1
(because of the increase of 1 in the dimension of Ah). Therefore, the number of
linearly independent divergence-free vector functions on the boundary is nm 1,
which is the number of divergence-free vector functions given in (4.133), or (4.134).
For our implementation we use the set of vector functions defined in (4.133) as a
52


divergence-free basis for the boundary x = 0. The divergence-free basis for all of
the boundaries is summarized as:
wz.o ,,xo _ I vy
Wj+l/2,k Vj,k Vj+l,fe + Vj+l/2,fc,0
()<./< m 2. 0 y,xQ
T+l/2,0
> for x = 0
rx0 _____ _.x0
/0,A+1 v0 ,k yk+1/2,0,0
j = 0. 0 (4.135)
z,xi xi xi v
Wi+l/2,k = Vlk ~ Vj+l,k ~ Vf+l/2,iM1
()<./< m 2. 0 < A' w
2/,£l
fe+l/2,0
> for x = 1
vxlo,fe+i vgjfc + v|+1/2)t_1)0
(4.136)
j = 0. 0 £,2/0 ,,2/0 0 z
1/2,x *&+l/2,x,0
0 < / wz,2/ _ v2/0 _ 2/0 x
Wi+l/2,0 V0,i+1 v0,i Vi+l/2,0,0
0 < i < I 2. /,' = 0
> for y = 0
(4.137)
,£,2/1
,2/1
.2/1
wfe+i/2,i = ni n,+i,i -
0 w.
2,2/1
t-2/1
t-2/1
> for y = 1
'i+l/2,0 V0,i+1 v0,i
0 < i < I 2. /,' = 0
i+l/2,m1,0
(4.138)
w
2/,z0
i+l/2j
,z0
ij
V
zO
i+ij
+ v
X
i+1/2,J',0
0 < / < / 2. 0 < ./ < /// 1
. ,z0 _ z0
j+1/2,0 V0j+1
v5;j
V
2/
j+1/2,0,0
> for z
0
* = 0,0 (4.139)
53


() > for 2: = 1.
(4.140)
,y
j+1/2,n1,0
i = 0. 0 < j < /// 2
Suppose we have Dirichlet boundary conditions on .r = 0 and x = 1. This
gives us 2nm unknown boundary fluxes. So, we need 2nm 1 divergence-free
basis vector functions for the boundary. If we add up the number of divergence-
free vector functions given in (4.135) and (4.136), then we have 2nm 2 vector
functions. Therefore, we need one additional divergence-free vector function which
is linearly independent from all the rest. The only such vector function is a global
vector function, that is, one which connects both Dirichlet boundaries. We call
this type of vector function a pipe function. We want to choose a basis which best
models the physics of the equation. Imposing pressure conditions on opposite sides
of the domain induces a Darcy flow across the domain. If the flow is constant in
the positive x direction and zero in the other two directions, then the vector
function describing this would look like the one in Figure 4.6.
In fact, if the conductivity tensor, K, were diagonal and constant, then this
pipe function would be a(-, ^-orthogonal to all the other divergence-free vector
functions and we would have the solution in one iteration. Therefore, we think
of the ideal pipe function as an a(-, ^-orthogonal projection on to the locally
divergence-free subspace. Approximating the orthogonality of the pipe function
plays an important role in the preconditioner. In general, for each additional
Dirichlet boundary region we add, which is disconnected from the other Dirichlet
boundaries, we need an additional pipe function. In our implementation, we can
54


y=l boundary
o
x
o
X
------------------=s
X
y=0 boundary
Figure 4.6: Divergence-Free Pipe Function
have at most two disconnected components of the Dirichlet boundary, requiring
one pipe function. This is because if we have three sides of the domain which are
Dirichlet, then they will be connected. The more general case would occur when
we allow patches of Dirichlet boundaries on a given side of the domain rather than
the whole side being Dirichlet.
If two adjacent sides of the domain both have Dirichlet boundary conditions,
then we must define divergence-free vector functions on the edges connecting the
two sides which involve fluxes from both the boundaries. These divergence-free
corner functions are illustrated in Figure 4.7.
55


y=l boundary
X
y=0 boundary
Figure 4.7: Divergence-Free Corner Functions
The divergence-free corner function on the .r = 0. // = 0 boundary is given by
= _
0 < k < n
+ v
I/O
k, 0
(4.141)
The vector function, wly0, can be written as a linear combination of Wq0,?/0 plus
other vector functions using the relationship
(4.142)
Therefore, we only keep the vector function wly0, for k = 0. In the case of
Dirichlet boundary conditions on the two sides, x = 0 and y = 0, we have mn + ln
unknown fluxes on the boundary. Therefore, the number of divergence-free vector
functions on the boundary should be nm + In 1. If we add up the number of
divergence-free functions given in (4.135) and (4.137) and add the divergence-free
56


corner function given in (4.141), for k = 0, we get nm + In 1 divergence-free
vector functions. Similar results hold for the other x, ^-boundaries. We also have
corner functions for y, ^-boundaries and z, ^-boundaries. There are a total of 12
corner vector functions to consider given by
w
, £0,2/0
£0
'0,0
ry0
'0,0)
for ./ = 0. // = 0
w
£1,2/0
£l
'0,0
ryl
'0,0)
for x = 1, y = 0
W£,2/i = v^q + v^o, for ./ = 0. // = I
W£'- w
w
w
w
w
w
w
w
2/0, zO
2/1, zQ
2/0, zl
2/1,zl
z0,£0
zl,x0
Z 0,£l
Zl,£l
^V00 + V00) for y = 0, z = 0
^vo,o^vo,0) for y = 1, z = 0
voo + vo,0) for y = 0,2=1
Voi-VoJ), for y= 1,2 = 1
^v00 + Vq0, for = 0. .r = 0
^vo,o vo,0) for 2 = 1, a; = 0
v5> + vo,o> for 2: = 0, a; = 1
vo,o^vo,0) for 2 = 1, a; = 1.
(4.143)
(4.144)
(4.145)
(4.146)
(4.147)
(4.148)
(4.149)
(4.150)
(4.151)
(4.152)
(4.153)
(4.154)
If the four sides, x = 0, x = 1, y = 0, and y = 1 are all Dirichlet bound-
aries, then we would have 2nm + 2In unknown fluxes. If we add up the number
of divergence-free vector functions given in (4.135)-(4.138) and (4.143)-(4.146) we
end up with 2nm + 2In divergence-free boundary vector functions which is one
too many. It can be shown that the corner vector function, (4.146), can be writ-
ten as a linear combination of other divergence-free basis vector functions. This
57


relationship is given by
__ ___
rri2 12
Y, Y Wi+l/2j+l/2,0
j=0 i=0 '
(4.155)
Therefore, we eliminate (4.146) from the basis. If we have Dirichlet boundaries
on all six sides of the domain, then a dimensional count would tell us that we
only need five out of the twelve corner vector functions. It becomes inconvenient
to determine the divergence-free basis a priori for every possible combination of
Dirichlet boundaries. Knowing the correct dimension is not good enough; we need
to know which corner vector functions to throw out to keep a linearly independent
set. To make it computationally convenient, we use the spanning tree algorithm
described in [54], In this algorithm we define a set a vertices and a set of edges
connecting the vertices. This represents a graph (see Figure 4.8). It is assumed
that the graph is connected. That is, there is a series of edges, or a path, which
connects any two vertices. The algorithm produces a connected graph with the
same vertices, but only a subset of the edges, in such a way that there are no
cycles. This is called a tree. In the context of finding a divergence-free basis we let
the sides of the domain which have Dirichlet boundary conditions be represented
by the vertices of the graph. Therefore, there are a maximum of six vertices in
the graph. The edges of the graph correspond to the vector functions in (4.143)-
(4.154). For example, the face at x = 0 and the face at y = 0 are connected by the
edge corresponding to w310^0. The spanning tree algorithm will produce the correct
number of linearly independent vector functions. With Dirichlet boundaries on all
58


six sides of the domain, our current implementation of the spanning tree algorithm
would compute the tree shown in Figure 4.9. Based on this we would keep the
vector functions (4.143), (4.144), (4.146), (4.148), and (4.150).
zl
Figure 4.8: Connectivity Graph of Dirichlet Boundaries
59


zl
Figure 4.9: Spanning Tree
60


4.4 Initial Guess Vector Vj
We define a vector, f = fij.k, which represents a source/sink in each element
Qi,j,k- We think of this value as being a point source/sink averaged over the
volume of the element. We want to compute a vector, v/ e Vj, which satisfies
(4.3). We will do this element by element. The vector function, vf, in element,
Qijje, is given by
(4.156)
+ uk-i/2,i,jvk,i,r-i/2 + uk+i/2,i,jvk,i,r,1/2-
The vector function, v^, is given by
(4.157)
Substituting (4.156) and (4.157) into (4.3) gives:
y , y
Uj-l/2,k,i + Uj+l/2,k,i
Uk-l/2,i,j + Uk+l/2,i,j)^i,j,k
(4.158)
61


For now, assume <9fi]v = <9fi. It is assumed that f = satisfies the com-
patibility condition given by
n1 m1 11
EEE hi* = o.
(4.159)
k=Q j=0 i=0
Then, the strategy is to satisfy (4.158) element by element. We start at the x = 0
side and push the mass through the domain by assigning fluxes to the faces with
normals in the x direction. Each time we go to the next element in the x direction,
we push the mass of the source term plus the flux from the previous element as a
flux into the next element. When we get to the x = 1 side we push the mass plus
flux along that side in the y direction going from y = 0 to y = 1. Finally, when
we get to the y = 1 side we push the mass plus flux up a column from z = 0 to
z = 1. These steps are summarized as:
Ul/2,j,k ~ fo,j,k
(4.160)
1, 0 < A; < n 1
Ui+l/2,j,k ~ fi,j,k + ui-l/2,j,k
(4.161)
1 < i < l ^ 2, 0 < j Ul/2,k,l-l ~ f(1,0,A: + ul-3/2,0,k
0 < k < n 1
(4.162)
U^+l/2,k,l-l ~ fl-hhk + f-3/2,j,k + u!j-l/2,k, 1-1
(4.163)
1 < ./ < /// 2. 0 < A- < /; 1
z j? % y
ul/2,l-l,m-l = J il,m1,0 + 1-3/2,171-1,0 + m-3/2,0,1-1
fe+1/2,ll,ml = fll,ml,k T f3/2,171l,fe
+ Um-3/2,k,l-l + i-1/2,1-1,in-1
(4.164)
(4.165)
62


1 < k < n 2.
The remaining fluxes are assumed to be zero. The vector function, vf, is now
given by
12 m1 n 1
E E E f+i/2j,fevf+1/2j,fe
i=0 j=0 fe=0
m2 n1 I1
E E E ^i+l/2,fe,iVj+l/2,fe,i
j=0 fc=0 ?=0
n2 11 m1
,jVk + l/2,i,j
k=0 z=0 j=0
Define the residual on element Q;-i,m-i,n-i as:
(4.166)
fll,ml,n lEl,ml,n 1 &(Vj ,
= (Uf-3/2,m-l,n-l + Um-3/2,n-l, 1-1 (4.167)
+Un-3/2,l-l,m-l + /f-l,m-l,n-l)A;-i)m-i,ra-l-
We need to show that the residual on the element Qi-i,m-i,n-i given by (4.167)
is zero. (The residuals on the other elements are zero by construction.) Using
equations (4.160)-(4.165) we can write the fluxes in (4.167) in terms of source
functions fij,k as follows:
u
X
3/2,ml,n1
Um-3/2,n-l,l-1
U
z
n3/2,1l,m1
1-2
53 fi,m-
i=0
l.n1
m2
53 + ul-3/2,j,n-1
i=o
m2 12
53 + + 53 fhJ,n~l)
j=0 i=0
n2
53 + ul-3/2,m-l,k + Um-3/2,k,l-l
k=0
n2 12
5 ) (/ll,m l,fe 5 ) fi,ml,k
k=0 i=0
m2 12
£(/< -1J,* +
j=0 i=0
(4.168)
(4.169)
(4.170)
63


Substituting (4.168)-(4.170) into (4.167) gives:
6(Vj, /;_i ,m l,n1 A|l,ml,n1
n1 m1 1-1 (4.171)
( X) X) X)
k=0 j=0 i=0
By (4.159) the residual in (4.171) is zero.
If we have Dirichlet boundaries, then (4.159) no longer holds. To insure a
zero residual, we specify a flux on the Dirichlet boundary. The total flux on the
Dirichlet boundary is set to be equal to the sum in (4.159). This way we get
consistency and then, we use the same algorithm as before. This is summarized
in the following algorithm:
Algorithm 4.4 Solving divergence equation.
1. Assemble the source/sink, /, and Vg.
2. Compute / = / div v%.
3. Compute the residual, r = ]T /.
4. Add the residual to the Dirichlet boundary fluxes, go-
5. Recompute / = / go-
6. Compute vf using (4.160)-(4.165).
4.5 Mass and Stiffness Matrices
Now that we have a discretization of the velocity space and a construction
of the divergence-free subspace, we can define the system of equations which
represents the discrete bilinear form.
64


For now, we will assume a pure homogeneous Neumann boundary condition.
We look for vectors vf e Vj and vhD E D/l such that
&(v?,A) = F(A), VA 6 Aq
a(v^,w) = a(vj, w), Vw e Dfe.
The divergence-free vector function, v^, is given by
n2 m2
= E E ow|+1/2jfe+1/2jo
=0 j=0
m1 i=l2 n2
^ .j^7h ) 1 /2./ ) 1 /2.J
J=0 ?=0 fe=0
n1 m2 12
+ £ ^ S rffj!fewf+1/ ; fe.
k=0 j=0 ?=0
(4.172)
(4.173)
(4.174)
Dehne a multi-index, J, such that (4.174) can be written with a single index
as:
VD
N
N-1
E djwj
j=o
(4.175)
(m l)(n 1) + (l l)m(n 1) + (I l)(m 1 )n.
The vector, vf, which is a solution to (4.172) is given by
Th
n 1 m1 12
vj = E EE fe=0 j=0 i=0 ;
11 n1 m2
E E E Mil,/
i=0 k=0 j=0
m1 11 n2
'j,k;i j+l/2,k,i
(4.176)
+ E E E
J=0 i=0 fe=0
The multi-index, 77, is used in (4.177) to write (4.176) using a single index as
follows:
(4.177)
M-i
Vj = E uKvK
K=0
M = (l l)nm + l(m l)n + lm(n 1).
We note that M > N, i.e., Vj belongs to the space Vq, which is larger than the
divergence-free subspace Dh.
65


Let L be another multi-index like J. Then, for each wL e Dh we substitute
(4.175) and (4.177) into (4.173) to get:
N-1 M-1
E a(wj,wL)dj = E a(vK,wL)uK- (4.178)
J=0 K=0
We introduce two more multi-indices, P and Q, which go from 0 to M 1 like
the multi-index K. We then write wj and wL in terms of basis vector functions
in Vq as:
M-1
Wj = E cj;pVp (4.179)
p=o
M-1
Wi = E L,QVQ- (4.180)
Q=0
Substitution of (4.179) and (4.180) into (4.178) gives:
M-1 M-1 N-1 M-1 M-1
E cl,q E o(vp,vq) ^ cj'pdj E C£,Q E fl(vjr, vq)uK- (4.181)
Q=0 P=0 J=0 Q=0 AT=0
This gives us N equations with N unknowns. We deline the N x M matrix CO,
the M x M mass matrix MO, and the two coefficient vectors d of length N and
u of length M as:
CO = cL.Q (4.182)
MO = a(vp, vq) (4.183)
C0M0C0t = cpjqo(v p, 'Vq)cjjp (4.184)
d = dj (4.185)
u = uk- (4.186)
Substituting (4.182)-(4.186) into (4.181) for all L gives the matrix equation given
by
C0M0C0Td = ^COMOu. (4.187)
66


The vector of coefficients, v = CO7 d u. represents the fluxes of the approximate
solution vh.
The CO matrix consists of 1 and is given by the relationships (4.113), (4.114),
and (4.115). We dont explicitly form the matrix C0M0C0T. Instead we assemble
only MO and then define operators CO and C0T which give the action of these
matrices on a vector.
When we want to include boundary fluxes into the mass matrix we need to
augment it. We will call this augmented matrix M which is given by
M
MO A1
A I 7 A2
(4.188)
Likewise, we will augment our CO operator as follows:
C
CO 0
Cl C2
(4.189)
The Cl matrix represents contributions to the divergence-free vector functions
on the boundary from fluxes across interior faces, and the C2 matrix represents
contributions to the divergence-free vector functions on the boundary from fluxes
across faces on the boundary.
For a Dirichlet boundary condition, we have a specified pressure, q, on the
boundary which gets added to the right-hand side of equation (4.187). For ex-
ample, suppose we have a Dirichlet boundary at x = 0. Then, we will define the
vector q = as the specified pressure on each face, i'o^O), f this boundary.
When we substitute this in to the functional, G(wjj,), for each j and k we get
G(wf
hk)
T'O
n;
Oj+l,fc
(0, y, z) dydz
F-
J- r
0,j+l,k
(o)
67


/

- Qj,k
C2q.
(4.190)
Equation (4.187) now becomes
CMCTd = -CMu + C2q.
(4.191)
If we have disconnected Dirichlet boundaries, then we will also have to include
the pipe function. The pipe function adds a dense column and dense row of non-
zeros to the matrix CMC7. We will deal with this bordered matrix when we
describe the iterative solver.
4.5.1 Mass Matrix
The mass matrix, M, is assembled from basis vector functions in Vj. The
M matrix can best be visualized in block form. We will describe the submatrices
which make up the blocks. We could define a subroutine to assemble each subma-
trix. However, we take advantage of the type of permutation described in Section
4.1 to minimize the number of subroutines required.
Let MO be the mass matrix assembled from basis vector functions in Vq. The
mass matrix, MO, has (i 1 )mn + l(m 1 )n + im(n 1) rows and columns. On
a rectangular grid with diagonal tensor K, the MO matrix has the diagonal-block
68


form given by
MO
MO^ 0 0
0 0
0 0 MO
(4.192)
Since a basis vector function in Vq has support on two elements we get a tridi-
agonal block form for the sub-matrices MO^, M0OT, and M0ZZ. We define the
weights which will make up the non-zero entries of MO^ as:
?J,*;l/2,-l/2 = a(Vfj,Jfc;l/2Vfj,Jfc;-l/2) (4193)
1/2,1/2 = a(Vi,j,k;l/2ivi,j,k;l/2) (4.194)
mi,j,k;-l/2,-l/2 = a(Vi,j,k;-l/2Vi,j,k;-l/2)- (4.195)
The tridiagonal entries mliyjyk, Tndij^, and muij^ used in assembling MO^ are
given by
MOxx.mli-ut = mlhk.l/2_l/2
MOxX.mdij^ = M0xx.muiJ}k = m*+ltjtk.lj2_lj2.
The weights ml^i and muij^ represent the lower diagonal, the diagonal
and the upper diagonal elements of MO^ respectively. In the MFE method, the
matrix is symmetric so we need to assemble only the upper or lower diagonals
along with the diagonal.
We assume that we are using vector functions defined using the area correction
factors given in (4.87)-(4.92). To assemble the weights in (4.193)-(4.195) for the
MFE method, we need to compute the following integrals:
rn
l
TX
1/2-1/2
X \
i,j,k)
x( I X)
Qi,j,k "i,j,k
(^-i,j,k^-i,j,k) dxdydz
69


Xijjfe dxdydz
(4.197)
TX
1/2,1/2
X
;(r:
2 i,j,k)
Qi,j,h
J
hhk
dxd/ljdz
X
(r
x \2
2 i,j,kJ
dj t
Q
i,j,k
-Xjj,*. dxdydz (4.198)
TX
1/2-1/2
(r? )2
(1 xfK 'f' (k Xidtk dxdydz
Wj'.fc "i,j,k
1 £)'
:(£
Ji
X \
i,j,k)
Q
hhk
(Kij^Xjj,*.) Xijijfe dxdydz. (4.199)
When all the faces are planar we can write analytical forms of (4.197)-(4.199).
In general, we will need to numerically approximate the integrals. The weights
(4.193)-(4.195) are now given by
m
X
i,j,k\ 1/2,1/2
m
x
i,j,k\ 1/2,1/2
m
X
i,j,k\~ 1/2-1/2
U
,j,k\ 1/2,-1/2
a(1)^(0)
;i/2,i/2
fc(l)5
-1/2,-1/2
(4.200)
(4.201)
(4.202)
We use permutations of the data to obtain the weights for M0OT and MO...
In the presence of a non-orthogonal grid or full tensor K, the matrix, MO,
will have off-diagonal blocks representing transverse flux contributions. In this
case the block structure of MO is given by
MO
MO** M0XJ, M0X*
M0j,x M0ra MO**
M0ZX M0zy M0ZZ
(4.203)
70


We define the weights which will make up the non-zero entries of MO^ as:
m
xy
1/2,-1/2

(4.204)
1/2,1/2 a(V?j,fe;l/2 Vj,k,i;l/2)
3U-i/2,-i/2 = a(vfj,*;-i/2vJ)fc)j._i/2)
1/2,1/2 alVi+lj,fe;-l/2,0,0; Vj,fe,i+1;0,1/2,0
(4.205)
(4.206)
(4.207)
The non-zero entries of MO^ are now given by
MOxy.mlj^.k.i = rrf^-. 1/2_1/2
M0xy.m2jj]eji = mJJ)j;1/2)1/2
M0xy.m3i_1)fc)j = JJ)j+1._1/2 _1/2
MOxy.m&jfai = m*J)j+1._1/2)1/2.
(4.208)
These sub-matrices are sparse, but have a large bandwidth due to the ordering of
the unknowns.
To assemble the weights in (4.204)-(4.207) for the MFE method, we need to
compute the following integrals:
jxy
1/2,-1/2
/ £(i y)
pa; pi/
j,k,i
7(1 y)
T2
*,j,k
w.jw
],k,i

Q
hhk
(
' Yj,k;i dxdydz
i,j,k^-i,j,k) ^j,k,i dxdydz
(4.209)
jxy
1j,k,v, 1/2,1/2
pa; pi/
xy-hilLJ** (Krj.fcXjj,dxdydz
Vi.j.fc - /
xy
Q
i,j,k
pa; pi/
5/7 j-,k,i /Vr-1 v
h],k
k) Yjuidxdydz
(4.210)
pa; pi/
JJU-V*../* = L d- *H1 V) "% m (Ka%) Y^ dxdydz
71


c r?
(l-x)(l- y) h3* h'k,t YjAi dxdydz (4.211)
Q
jxy
1j,k,v-l/2,l/2
/ (1 x)y
J Qi,j,k
rwrw
h],k ],k,i
J2
Ji+l,j,k
1 x)y
rwrw
h],k ],k,i
J,
Q
hhk
(
(^-i,j,k^-i,j,k) ^j,k,i dxdydz
i,j,k^-i-j;k) ^j;k,i dxdydz.
(4.212)
The weights (4.204)-(4.207) are now given by
mxy
1/2-1/2
mili;l/2,l/2
jxy
1j,k,v, 1/2,-1/2
^?,(i)4fe,()

j,k,i; 1/2,1/2
rri
xy
jxy
1j,k,v-1/2,-1/2
jxy
mxy _ 1i,k,r-1/2,1/2
(4.213)
(4.214)
(4.215)
(4.216)
We can obtain the weights for the M0I/Z matrix and the M0za; matrix through
permutations of the data. In the MFE method we need not assemble MO^ which
is the transpose of MO-.,-. or MO^ which is the transpose of MO^, or M0Z|/ which
is the transpose of M0I/Z since these are all self-adjoint.
The non-zero structure of the MO matrix, with off-diagonal blocks, is shown
in Figure 4.10. On an orthogonal grid with diagonal tensor, K, we would have
just the tridiagonal blocks shown along the diagonal in Figure 4.10.
72










Ml













































Figure 4.10: Non-Zero Structure of MO on Non-Orthogonal Grid
4.5.1.1 Mass Matrix with Boundary Fluxes
Now we consider the case when we need to compute contributions from fluxes
on the boundary. As mentioned above, this will add additional blocks to our mass
matrix giving the augmented matrix, M, in (4.188). The block structure of A1 is
given by
Alj; Alsj, Alxz
A1 =
Alj,* A1 yy A.lyX
Alza; Aljfj, A1Z0
(4.217)
73


The integrals for assembling the weights for the A1 matrix are analogous to those
computed for MO and we omit them here. Al^ is assembled by computing the
weights given by
'0j\fc;l/2> Yj,k)
* v i
(4.218)
. 1 \ jj, = a(vg
Alxx.milj^k = a(v?-ij,fe;-l/2; vj,kJ
The blocks Alyy and Alzz are assembled through permutations of the data.
The A1XJ, represents the transverse flux contribution from the ^-boundary
to the rr-fluxes. The weights for this matrix are given by
Alxy.mlk,i = a(vfj0jfe;1/2,vg)
Alxy.m2k,t = a(vf+10k._1/2, vfi+1)
Alxy.m3k.i = a(vfm_lk;1/2, vg}
Alxy.mAk,i = a(v?+1)m_1)Jfc._1/2> v£j+1)
The blocks Alyz and A1are assembled through permutations of the data.
The weights for the block, Al^, are given by
(4.219)
Alyx.mlhk = a(v|fej0;1/2,v|0fe)
Alyx.m2hk = a(vJ+ljM;_1/2, v|ljfe)
Alyx.m3hk = a(vJ)JM_1;1/2, vfk)
Alyx.m4hk = a(vJ+w_1;_1/2J
(4.220)
The blocks Alzy and A1.,.. are assembled through permutations of the data.
The block structure of the transpose of the A1 matrix is given as:
A1

A1T Alt A1T
XX xy xz
Alt ail
yx yy yz
A | lx zy A1T zz
(4.221)
74


In the MFE method, the matrix A1 is self adjoint so we dont need to assemble the
transpose. The snbmatriees of A1T will have the same lexicographical ordering as
those in the A1 matrix. The snbmatriees of A1T will jnst be aliases or references
to the A1 snbmatriees and have the same memory addresses.
The block strnetnre of A2 is given by
A2
A2
A2
A2
XX
yx
zx
A2.
xy
A2
yy
A2
zy
A2
A2
A2
xz
yz
zz
The A2;7,7. block is assembled from the weights given by
(4.222)
A2xxQj,k = a(vjk, Vjk)
(4.223)
A2xxljjk = a(vfk,vfk)
The weights for A2yy and A2.. are assembled through permutations in the data.
The off-diagonal submatrices of A2 are from corners where two sides of the
domain are adjacent. For example, the A2xy block represents contributions from
the corners where the x = 0 and x = 1 sides are adjacent to the y = 0 and y = 1
sides. The weights for the A2xy block are given by
A2xy.m\k = a(vj$,v f0)
A2xy.m2k = a(v
rx0 ...2/1 'i
yk,0J
A2xy.m3k = a(vfk,vy0
(4.224)
2/1
A2xy.mAk = a(v%_hk,
The matrices A2yz and A2-.,. are assembled through permutations in the data.
The non-zero structures of the A1 and A2 matrices on a non-orthogonal grid
are shown in Figures 4.11 and 4.12 respectively. The non-zero structure of the M
matrix on a non-orthogonal grid with boundary fluxes is shown in Figure 4.13.
75


















m
























































Figure 4.11: Non-Zero Structure of A1
76


Figure 4.12: Non-Zero Structure of A2


Figure 4.13: Non-Zero Structure of M
78


4.5.2 Stiffness Matrix
We may want to assemble the CMCT matrix explicitly so that we can do a
direct solve on subdomains or the coarse grid. The non-zero structure of CMCr
is shown in Figure 4.14 without boundary contributions and in Figure 4.15 with
boundary contributions. We note that the bandwidth is very large.
Figure 4.14: Non-Zero Structure of CMCr
79


80
I*


The non-zero structure of the stiffness matrix does not depend on the con-
ductivity tensor or the orthogonality of the grid.
Theorem 4.5 The stiffness matrix on a non-orthogonal grid with full tensor, K,
will not have any more non-zero entries than the stiffness matrix on an orthogonal
grid with a diagonal tensor.
Proof. Suppose we have a rectangular grid and the identity matrix for the tensor
K. We will see that any divergence-free vector function in the basis of Dh is
non-orthogonal, with respect to a(-,-), to all its neighbors (we ignore boundary
functions to simplify the proof). Without loss of generality, suppose we choose
wj+i/2,k+i/2,i as our vector function (assume 0 < i < l 1). This vector function
has support on four elements. All the vector functions in the basis of Dh which
have support overlapping the support of w|+1//2 k+1/2 i are given below.
W?_i/2,*+i/2)i,
Wj-l/2,fc+3/2,i>
y
Wk-l/2,i-l/2,j>
y
Wk-l/2,i+l/2,j>
y
Wfcl/2,il/2j+l >
y
Wk-l/2,i+l/2,j>
Wi-l/2,j-l/2,k>
Wi+1 /2,jl/2,k)
W|+ l/2,k-l/2,V
Wi+l/2,fe+l/2,i >
Wi+l/2,fe+3/2,i)
y
Wfe+l/2,i1/2 j >
y
Wfc+l/2,i+l/2 j >
y
Wk+l/2,i-l/2,j+l
y
Wfe+l/2,i+l/2 j >
Wi1/2 j+l/2,fe)
Wi+l/2 j+l/2,fc)
W|+3/2,fe-l/2,i
Wi+3/2,fe+l/2,i
W|+3/2,fe+3/2,i
y
Wfe+3/2,i1/2 j
y
Wfe+3/2,i+l/2j
y
Wfc+3/2,i1/2 j+l
y
Wfe+3/2,i+l/2j
Wil/2j+3/2,fe
Wi+!/2 j+3/2,fe
81


Wil/2j l/2,fc+l) Wf1/2j+1 /2,fc+l > Wi1/2 j+3/2,fe+l
Wi+l/2j-l/2,fc+l! Wi+l/2j+l/2,fe+l5 Wi+l/2j+3/2,fe+l
The inner-product of wj+1^2 fe+1^2 with any one of the vector functions above
is non-zero. For example,
a(Wi-l/2,fe-l/2,i! Wi+l/2,fe+l/2,i) = _a(V.?-l/2,A;,i;0,l/2,0, ^j-l/2,k,v,Q -1/2,o)
= (l/6)(XjJ-)fc XjJ-)fc)/(
0
A similar result can be shown for each of the vector functions above either by
inspection or by using the definition of a divergence-free vector function in the
basis of D\
Before factoring the matrix, it is best to compute a permutation which will
reduce the bandwidth. For example, the Symmetric Reverse Cuthill-McKee or-
dering reduces the bandwidth by about half [48]. The result of the permutation
is shown in Figure 4.16.
82


Figure 4.16: Matrix Permutation
83


4.6 Numerical Integration
As mentioned above, the weights for the mass matrix, M, will, in general,
require the use of numerical integration. The method used in our implementation
is a Gaussian quadrature rule in three dimensions which is a one-dimensional
Gaussian quadrature rule applied on each dimension separately. The number of
integration points in each dimension is a user defined input of 1,2,3,4 or 5 points.
The Gaussian quadrature formulas are found in any numerical analysis book and
many finite element books (for example [44]). We use vectors, x = xsi, and
w = wsi to store the one-dimensional quadrature points and weights respectively.
For example, let N be the number of integration points and suppose we want
to compute the integral in (4.197). Then, this would be represented as a triple
sum given by
TX
N-1
N-1
N-1
Cd E WS?J E Ws2 E
s3=0 s2=0 sl=0
%sl)
("^i j k-lf2 5^53 )) -
Ji,j,k (Efi 1 ?Xs2
%s3)j ' ,s2, ^*3)
(4.225)
The constant c is a constant of integration resulting from a change in variables in
each dimension.
The 3x3 matrix, K( (1/.. is assumed to be constant and symmetric on each
element Qi,j,k- The elements of this matrix are given by
K
-1
i,j,k
Kxx Kxy Kxz
Kyx Kyy Kyz
Kzx Kzy Kzz
(4.226)
84


Since the matrix is symmetric, we only store the diagonal and 3 of the off-
diagonals. The off-diagonals we choose to store and the lexicographical ordering
of all these matrix elements is critical to the implementation of our method. The
off-diagonals we choose to store are Kxy, Kyz, and Kzx. The matrix (4.226) is
now defined as:
The product, Kis given by
l\ xxj jjt. Kxyjfai h
KxVj,k,i Kyyhk, Kyzjs^j
l\ z.ri j k / Kzzk,i,j
(4.227)

KxXi,j,kXjj k + Kxyj;k;iXi j k + l\ A

Kxyhk,iXlhk + KyyhkjiXljk + Kyzk^Xljk
h -''i.j.i,- j.j.k + ^yzk,i,jXij^k + KzzkjijXij^k
(4.228)
We explain why we choose to define the Ki h matrix the way we do. Consider
the permutations given by


-0 V1
KJh (4.229)
Kyyj,k,i Kyzk,ij Kx Uj,k,i
Kyzk,ij K zzk,i,j Kzxijj (4.230)
Kx Uj,k,i I\ l.Vj jk I\ .V.Vjjk
to ro} (4.231)
(4.232)
With these permutations, equation (4.228) now becomes

KyyjAjYjAi + Kyzk^j ^jjkji + Kxyj^^jY k^
Kyzk,i,j^j,k,i d" ^-zzk,i,j^jjkji + ^zxi,j,k^j,k,i
KxyjjkjYj^kj + I\ j j.; + /\ r.r(.(./T j). (
(4.233)
85


Equation (4.233) gives us the vector Kjl^Yjjkji which has the permutation (4.231).
Another, analogous, permutation gives us the permuted vector K^j -Z
When we compute the covariant basis vector, Ywe will use a permutation
of the data which gives the permutation in (4.231). Therefore, the dot-product
will be invariant under these permutations since the vector, Kj^^Ywill have
the same permutation as the covariant basis vector YJjfeji. This means that the
same subroutines for numerical integration in the assembly of MO^a, can be used
for the assembly of M0OT and M0ZZ. Our strategy of using a single subroutine for
the assembly of all our tridiagonal blocks in the M matrix is therefore preserved.
When assembling the MO^ block of the mass matrix, we will need a separate
subroutine for computing the product YThis is because the
covariant basis vectors, and Yneed to have the same ordering. However,
we can permute these vectors in a consistent way when assembling MO^ and
M0*x.
For a given volume, Qijtk, we compute the 3 A'3 matrices XV* and XWi;j,k
as:
(4.234)
XWijijfe(sl,s2,s3) = Tlj k(xsi, xs2, xs3)XLj!k(xs2, xs3).
Now, equation (4.225) is given by
(4.235)
a(Vi,j,k;l/2,Vi;j;k;1/2)
N-1 N-1 N-1
^si^iXVjJ)t(sl, s2, s3) XWijijfe(sl, s2, s3).
86
(4.236)


The area, 0), is computed by
N-1 N-1
Ai,j,k(o) = C2 Y, ws3 Y ws2Tljjk(0,xs2,xs3). (4.237)
s3=0 s2=0
The reason for defining these matrices, XV, and XW, separately is that we may
want to change the definition of the test function and thus the definition of XW.
It is an attempt to make the code as generic as possible so that it can be easily
modified for other numerical methods such as the CVMFE method.
The computation of the other integrals needed for the weights for the tridiag-
onal matrix, MO^^, are given as:

A-

(0)2
N-l N-1 N-1
E WS?J E Ws2 E
s 3=0 s2=0 sl=0
tosi(l xai) XVi+1J)*(sl,s2,s3) XWj_|_ijjj;(sl, s2, s3)
(4.238)
a(v.
Vi+1 j,fe;l/2)
N-l

N-l N-l
E "',3 E Ws2 E
s3=0
s2=0
sl=0
(4.239)
tosi(l ^'si)*^siXVj-(-ij)fc(sl, s2, s3) XWj^-i(si, s2, s3).
To compute the integrals for the assembly of MO^ we will need the 3 A'3 matrix
YW given by
YWijijfe(sl,s2,s3) = r(xs2,xs3,xsl)YjAi(xs3,xsl).
(4.240)
The weights for the matrix, MO^, are given by
ml,-
j l,k,i
N-l N-l
N-l
4- 4-, -fol E ws3 E ^*2(1 ^*2) E
-C.jMO-ys3=0 s2=0 sl=0
^ixsiXVi,j,k(1j s2, s3) YWijjk(sl, s2, s3)
(4.241)
87


(4.242)
m2
j,k,i
N-1
N-1
N-1
4- i(1)4i 411 E ws3 E ws2%s2 E
s3=0 *2=0 si=o
six siX-V %,j,k(1 j 2, s3) YWjjjj;(sl, s2, s3)
N-1
iV-1
N-1
4-j-i .(0)4 i --n(O) E E ws2(l W2) E
^+i,3,h\)^3,k,i+iK) s3=0 s2=0 sl=0
tosi(l xsl)XVi+ltjtk(sl,s2,s3) YWj_(_i^^(sl, s2, s3)
A,;.
+i,j,fc(0)AJ-,fc,l+n
N-1
tjt E tos3 E wS2XS2 Y
1 s3=0
iV-1
E
s2=0
iV-1
E
sl=0
(4.243)
(4.244)
tosi(l xsl)XVi+1J,Jfc(slJs2,s3) Y"Wj-)-i(si) s2, s3).
We can compute the weights for the A1 and A2 matrices in an analogous fashion.


5. Preconditioners
We now discuss the iterative solver used to approximate the divergence-free
solution. Our system is symmetric positive definite which means that we can use
the PCG method. It is well known that the reduction of the error on each iteration
is bounded by:
K
nz i i
(5.1)
where n is the condition number of the preconditioned matrix. We will show that
the condition number of CMC7 can be bounded, at best, by a constant times
h-2.
In our implementation we use a two level additive Schwarz method as a precon-
ditioner. Numerical results show that this domain decomposition method gives a
convergence rate independent of grid size. We will define the method and analyze
its behavior.
We also describe the bordered matrix problem which results from having dis-
connect Dirichlet boundaries. A preconditioner for the bordered matrix is defined
and the condition number of the preconditioned bordered matrix is analyzed.
Most of the theoretical results presented in this section assume an orthogonal
grid. It should be possible to extend these results to a distorted hexahedral grid
although, as mentioned before, interpolation estimates are not known for non-
affine distorted grids in 3-D. A Poincare-type inequality is presented assuming
homogeneous boundary conditions. This is not a problem for Neumann bound-
ary conditions since the Neumann boundary condition is an essential boundary