MULTIGRID METHODS
FOR LINEAR QUADRATIC REGULATOR (LQR) PROBLEMS
OF INFINITE DIMENSIONAL SYSTEMS
by
Tiejun Li
B.Sc., Peking University
Peking, China, 1959
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado in partial fulfillment
of the requirements for the degree of
Master of Science
Applied Mathematics
1995
This thesis for the Master of Science
degree by
Tiejun Li
has been approved
Steve McCormick
Roland Sweet
Date
Li, Tiejun (M.S., Applied Mathematics)
Multigrid Methods for Linear Quadratic Regulator (LQR) Problems of Infinite
Dimensional Systems
Thesis directed by Associate Professor Adjunct John W. Ruge
ABSTRACT
Partly due to great interest in the construction and deployment of large space
structures, problems in the analysis and control of flexible mechanical systems
have become increasingly important over the last several decades. One of the main
problems encountered is the determination of the applied force necessary to move
a structure quickly and accurately into a given position, with minimal force and
excess motion. This can be posed more precisely as a linear quadratic regulator
(LQR) problem: Find the force that, together with the system state, satisfies the
governing time dependent partial differential equation and minimizes an appropriate quadratic cost functional. The minimizing force is a linear function of the state,
with the linear state feedback operator satisfying an associated Riccati equation.
During the last ten years, much effort has gone into developing solution methods
for LQR problems coming from these infinite dimensional (continuous) systems.
The standard approach is to discretize in space, obtaining a finite dimensional
system and a matrix Riccati equation, for which algebraic solution methods have
been devised. Such methods can be quite expensive, and do not take advantage of
the continuous nature of the problem. Our approach is to reformulate the continuous Riccati equation as a nonlinear integro-differential equation, to which we
apply a multigrid solution method. For a problem in which the constraint equation
is first-order in time, we develop and test two multigrid solvers, one based on multilevel projection methods (PML), and the other on the full approximation storage (FAS) scheme. Both are shown to work with typical optimal multigrid efficiency. We also look at a problem that is second-order in time, point out some
difficulties we encountered, and present possible solutions.
This abstract accurately represents the content of the candidate's thesis. I recommend its publication.
Signed
John W. Ruge
CONTENTS
List of Tables..............................................................vi
Acknowledgements...........................................................vii
Chapter
1. Introduction............................................................ 1
2. Multigrid Methods........................................................3
2.1 Basic Multigrid Ideas.................................................3
2.2 Multilevel Projection (PML) Methods...................................5
2.3 Full Approximation Storage (FAS) Scheme............................. 11
2.4 The Full Multigrid (FMG) Algorithm ................................ 12
3. The Control System with First Order Time Derivative.................... 15
3.1 The Operator Riccati Equation and MG Solvers........................ 15
3.2 Multilevel Projection (PML) Solver.................................. 18
3.2.1 The Discretization and Equations.................................... 18
3.2.2 Relaxation.......................................................... 24
3.2.3 Numerical Results....................................................25
3.3 Full Approximation Scheme (FAS) Solver...............................30
3.3.1 The Discretization, Equations and Relaxation.........................30
3.3.2 Numerical Results....................................................32
4. The Control System with Second Order Time Derivative...................36
4.1 Control of a Simply Supported Beam...................................37
4.2 Multigrid Difficulties...............................................39
4.3 Possible Multigrid Approaches........................................40
5. Conclusions ...........................................................42
References.................................................................43
TABLES
Table
3.1. PML V(1,0)-cycle Results..................................................27
3.2. PML V(1,1)-cycle Results..................................................27
3.3. Comparison of Efficiency for PML V-cycles.................................28
3.4. Discretization Error of PML solutions.....................................28
3.5. PML FMG (1 V(1,0) cycle/level)............................................28
3.6. PML FMG (2 V(1,0) cycles/level)...........................................29
3.7. PML FMG (1 V(1,1) cycle/level)............................................29
3.8. PML FMG (2 V(1,1) cycles/level)...........................................29
3.9. FAS V(1,0)-cycle Results..................................................33
3.10. FAS V(1,1)-cycle Results..................................................33
3.11. Comparison of Efficiency for FAS V-cycles.................................34
3.12. Discretization Error of FAS solutions.....................................34
3.13. FAS FMG (1 V(1,0) cycle/level)............................................34
3.14. FAS FMG (2 V(1,0) cycles/level)...........................................35
3.15. FAS FMG (1 V(1,1) cycle/level)............................................35
3.16. FAS FMG (2 V(1,1) cycles/level)...........................................35
ACKNOWLEDGEMENTS
I am deeply indebted to Professor J. Ruge for his great help with my grasp of the multigrid method, for countless discussions, and for his significant guidance. I would
also like to thank Professor S. McCormick, the founder of PML, for his suggestion to apply PML for solving the operator Riccati equations arising from optimal
control problems of infinite dimensional control systems.
I am also indebted to the AFOSR for their support of this work under Contract 91-0156.
1. Introduction
For at least two decades, due to the construction of larger and more flexible engineering structures, especially in space applications, the analysis and design of
flexible mechanical systems have increased in importance. Because of increased
flexibility, it is especially important to know how to effectively control the motion
of these structures. Thus, there has been a corresponding increased interest in
extending optimal control theory, originally developed for finite dimensional
problems, to such infinite dimensional (i.e., continuous) problems (see Lions [1]
and Wang [2]). Here, the behavior of the mechanical system (specified by the state
variable) as a function of the system input (the control) is typically modeled by a
timedependent partial differential equation. In turn, we assume that there is
feedback from the state to the control. The problem is defined as finding the
control (or more precisely, the function defining the control in terms of the state)
so that the state/control pair satisfy the differential equation and, in addition,
minimize a given functional, which generally involves system energy and work
done. Much of the work on the optimal control of these systems is based on
Pontryagin's maximum principle, and Komkov [3] has successfully applied this
principle to the optimal control of transversely vibrating beams and plates.
During the last 10 years, a great deal of effort has been focused on approximation
schemes for the solution of linear quadratic regulator (LQR) problems of infinite
dimensional systems [4-10]. In such problems, the idea is to determine a force, or
control, applied to a system in order to minimize a functional involving the energy
of the system and the applied force. The main idea in the approximation scheme is
to discretize the structure, which produces a finite dimensional system and an
associated finite dimensional Riccati equation to be solved. A number of algorithms have been developed for the solution of these matrix Riccati equations.
However these are mainly algebraic methods, and do not exploit the underlying
continuous nature of the problem. Convergence of these methods can be slow,
and, like many matrix methods, as the problem size is increased (which is necessary for improving accuracy), solution quickly becomes expensive and impractical.
Pontryagin's maximum principle gives the form of optimal control for the infinite
dimensional system as u(t)=CPv(x,t), where u is the system input or control, the
operator C is determined by the system, and v(x,t) is the state. The operator P, to
be determined, then satisfies the operator Riccati equation. In many cases, P can
be written as an integral operator, which makes it possible to transform the Riccati
equation into an integro-partial differential equation for the kernel of P (see
section 2 for details). Multigrid methods, which are often used for fast solution to
partial differential equations, can then be applied.
The purpose of this thesis is to develop multigrid methods for the fast solution of some classes of LQR problems. The main idea is to transform the operator Riccati
equation coming from a continuous LQR problem into a partial differential
equation (PDE) and apply multigrid methods to the solution of the resulting PDE.
In Section 2, we introduce multigrid methods and describe two MG approaches we
use for solution of the nonlinear PDE's. In Section 3, we describe LQR problems
where the equation describing the system to be controlled is first order in time.
Solution methods are described, and numerical results are given. In Section 4, we
discuss problems for which the equations are second-order in time. Finally,
conclusions and remarks on future work are given in Section 5.
2. Multigrid Methods
Multigrid (MG) methods provide optimal solvers for a large class of PDE's, in
the sense that a problem can be solved to the level of discretization error in a time
proportional to the number of points on the finest grid (see [13] and [14]). They
can be applied in cases involving nonlinearities, varying coefficients, and irregular
domains and grids in little more time than is needed for a corresponding
linearized, constantcoefficient problem. For our problem, we are particularly
interested in the ability of MG to solve nonlinear problems. We consider two
approaches: the full approximation storage scheme (FAS) developed by Brandt
[14], and the multilevel projection method (PML) developed by McCormick [13].
In this section, we introduce the basic multigrid algorithm, describe the FAS and
PML approaches for nonlinear problems, and present the full multigrid (FMG)
process for solving the finest level problem to within discretization error.
2.1 Basic Multigrid Ideas
The speed of convergence of an iterative method depends on the nature of the
error in the solution approximation. Some error components converge quickly, but the overall speed is determined by those components which are slowest to converge. For example, Gauss-Seidel iteration, when applied to a discretized Poisson
equation, reduces high frequency error (error that oscillates on the scale of the
grid) very quickly. The smoother the error, the slower relaxation converges. This
is because Gauss-Seidel acts by averaging the error locally, so that the high frequency part is quickly damped. Thus, Gauss-Seidel by itself is not an efficient
solver for this problem. However, smooth error can be well represented on
coarser grids, and an equation for this error can be formulated and solved on
3
coarser levels (where processing is cheaper). The solution there can be interpolat
ed back to the finer level and used to correct the solution approximation. A
recursive application of relaxation and this coarse grid correction is called a
multigrid cycle, and can reduce all error components by some small factor
independent of the size of the fine grid.
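As a small illustration of this smoothing behavior (a sketch we add here, not code from the thesis; the grid size and sweep counts are arbitrary choices), the following applies Gauss-Seidel to the 1D model problem -u'' = 0 with zero boundary values, whose exact solution is zero, so the iterate is itself the error:

```python
import numpy as np

def gauss_seidel(u, f, h, sweeps=1):
    """Lexicographic Gauss-Seidel sweeps for -u'' = f with zero boundary values."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u

n = 64
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
f = np.zeros(n + 1)                      # homogeneous problem: the iterate IS the error

smooth = np.sin(np.pi * x)               # smoothest error mode
rough = np.sin((n - 1) * np.pi * x)      # highly oscillatory error mode

r_smooth = np.linalg.norm(gauss_seidel(smooth.copy(), f, h, 3)) / np.linalg.norm(smooth)
r_rough = np.linalg.norm(gauss_seidel(rough.copy(), f, h, 3)) / np.linalg.norm(rough)
# r_rough is small (oscillatory error damped quickly), while r_smooth stays
# near 1: this is why coarse grids are needed for the smooth components.
```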
To illustrate this, suppose we want to solve a given partial differential equation:

Lu = f on Ω ⊂ ℝ².    (2.1)
In order to apply a multigrid solution method to this problem, we need to define
the following components:
Grids Ω¹ ⊂ Ω² ⊂ ... ⊂ Ω^M, and spaces of grid functions S^k.
Discrete operators L^k: S^k → S^k.
Relaxation on each level.
Interpolation operators I_{k-1}^k: S^{k-1} → S^k, k = 2,3,...,M.
Restriction operators I_k^{k-1}: S^k → S^{k-1}, k = 2,3,...,M.
The coarsest grid typically contains only a few points, so that the cost of direct
solution of a problem there is negligible, and uniform refinement is used to obtain
the finer grids. The operators on all levels are usually finite difference or finite
element discretizations of the continuous operator. The standard interpolation, which takes corrections from a coarser to a finer level, is piecewise bilinear. Restriction, used for transferring residuals to the coarse grid, is a weighted
average called full weighting. Since all error components must be reduced by either relaxation or coarse grid correction, any error that cannot be approximated on coarser levels (i.e., high-frequency error) must be reduced effectively by relaxation if the cycle is to work well. Thus the proper choice of relaxation is important. Gauss-Seidel relaxation works well as a smoother for many problems.
With these components, we define a V(ν1,ν2)-cycle on level k:

If k = 1, solve the problem exactly.
Otherwise:
    Relax ν1 times on L^k u^k = f^k.
    Perform coarse grid correction:
        Define the coarser level problem (i.e., set f^{k-1} and u^{k-1}).
        Perform 1 V(ν1,ν2)-cycle on the problem L^{k-1} u^{k-1} = f^{k-1}.
        Correct the level k approximation: u^k ← u^k + I_{k-1}^k u^{k-1}.
    Relax ν2 times on L^k u^k = f^k.
Here, u^{k-1} is the coarse grid approximation to the level k error. In linear problems, the coarse grid right-hand side is the restricted fine grid residual. In nonlinear problems, as we show later, the form of the correction is somewhat different.
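The cycle above can be sketched in code. The following is a minimal two-grid instance (a hypothetical 1D Poisson setup of our own, not from the thesis), with Gauss-Seidel smoothing, full-weighting restriction, linear interpolation, and an exact coarse solve:

```python
import numpy as np

def relax(u, f, h, sweeps):
    """Gauss-Seidel smoothing sweeps for -u'' = f."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / h**2
    return r

def two_grid_cycle(u, f, h):
    """One V(2,1)-style cycle; the coarse problem is solved exactly."""
    relax(u, f, h, 2)                       # pre-smoothing
    r = residual(u, f, h)
    nc = (len(u) - 1) // 2                  # coarse grid intervals
    rc = np.zeros(nc + 1)                   # full weighting of the residual
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    hc = 2 * h                              # exact coarse solve (tridiagonal system)
    A = (2 * np.eye(nc - 1) - np.eye(nc - 1, k=1) - np.eye(nc - 1, k=-1)) / hc**2
    ec = np.zeros(nc + 1)
    ec[1:-1] = np.linalg.solve(A, rc[1:-1])
    e = np.zeros_like(u)                    # linear interpolation of the correction
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    u += e
    relax(u, f, h, 1)                       # post-smoothing

n = 64
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
f = np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n + 1)
r0 = np.linalg.norm(residual(u, f, h))
for _ in range(6):
    two_grid_cycle(u, f, h)
r6 = np.linalg.norm(residual(u, f, h))
# the residual drops by a roughly fixed factor per cycle, independent of n
```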
2.2 Multilevel Projection (PML) Methods
Multilevel projection (PML) methods were developed by McCormick [13] and can
be applied to functional minimization problems as well as some nonlinear equations. The basic idea is straightforward, and simplifies many choices of multigrid
components. The method generally starts with a choice of a sequence of nested
finite element spaces. The cycle can then be viewed as an approximate sequential
elimination of error in each of these subspaces, by projecting error onto each
subspace and using that to correct the solution approximation. A big advantage of
this method is that relaxation on all levels is performed as if it were done on the finest level, which makes implementation and troubleshooting simpler (at least conceptually), and, under some conditions, can guarantee convergence of PML.
We begin with an abstract description of a cycle, then consider a practical implementation. Suppose we have a continuous (possibly nonlinear) problem

F(p) = 0 on Ω    (2.2)
defined on some space S of functions on domain Q. We choose a sequence of
finite dimensional subspaces S¹ ⊂ S² ⊂ ... ⊂ S^M ⊂ S (typically finite element spaces) with increasingly finer resolution, and S^M the desired solution space. (2.2) will generally not have a solution in S^M. Instead, our problem is to find p^M in S^M so that

(F(p^M), q^M) = 0 for all q^M ∈ S^M    (2.3)
where (·,·) is the standard L² inner product. Thus, the L²-orthogonal projection of
F(pM) onto SM should be zero. (It should be noted that, as in many finite element
applications, integration by parts is used to make sense of (2.3) for the spaces Sk.)
The spaces S^k for k < M serve two purposes. First, the solution of the level k problem corresponding to (2.3) is used as an initial approximation to the level k+1 solution. This process is known as full multigrid (FMG), and will be described later. Second, these spaces are used in a multilevel cycle, similar to that described
in the previous section and defined here formally:
PML V(ν1,ν2)-cycle:

For k = M, M-1, ..., 2:
    Relax ν1 times on (F(Σ_{l=k+1}^M p^l + p^k), q^k) = 0 for all q^k ∈ S^k.
    Set p^{k-1} = 0.
Solve (F(Σ_{l=2}^M p^l + p^1), q^1) = 0 for all q^1 ∈ S^1.
For k = 2, 3, ..., M:
    Set p^k ← p^k + p^{k-1}.
    Relax ν2 times on (F(Σ_{l=k+1}^M p^l + p^k), q^k) = 0 for all q^k ∈ S^k.
In this abstract setting, there are no grids defined, but the subspaces can be identified with levels of resolution. The main idea is that, for each k, a suitable relaxation process applied to the level k problem effectively reduces the error components in S^k that are not in S^{k-1}. A full cycle should then reduce all error components. Here, note that p^M
could be updated immediately after each relaxation sweep on each level,
since the form of the solution, written in this way, is not unique. However, this
form is completely equivalent, and more closely matches the multigrid cycle described earlier.
This cycle, as written, cannot be directly implemented, but serves as a guide for
constructing a practical multilevel algorithm. In obtaining our algorithm, we have
two aims: First, the algorithm should be completely equivalent to the PML cycle
above; and second, the work performed on each level should be proportional to
the dimension of that level's subspace. The reason for the first is that, since all
processes on all levels are performed as if in the continuous space, there is a
common goal of all computation. With the exception of relaxation, this approach
dictates the choice of all the necessary multigrid components, and some problems
that can arise in other multigrid settings can be avoided or more easily diagnosed
and fixed. This first aim could be attained relatively easily and simply by working
on the finest level only. However, this leads to an inefficient algorithm, since the
amount of work needed on each level is proportional to the size of SM, instead of
decreasing on coarser levels. This is the reason for the second aim. The coarser
level problems should be formulated so that, for each k, the level k relaxation and
coarse grid correction never need to appeal to finer level quantities or processes. This
is not possible for general nonlinear operators. For the problems we consider,
which contain a nonlinearity of special form, such a formulation can be done. In
this section, we mainly consider the first task of realizing the algorithm in an
equivalent form. The second question is answered after we introduce our operator.
The first step in obtaining a practical algorithm is to choose bases for the spaces,
which requires the introduction of some operators and notation. Let {φ_i^k} be a basis (usually a nodal basis) for S^k, k = 1,2,...,M, and let n_k = dim(S^k). Then let G^k be the space of coefficient vectors with respect to this basis. For convenience, we use an overbar to denote the vector of coefficients corresponding to a function in S^k, and subscripts to denote the vector entries. That is, p̄^k ∈ G^k is the vector of coefficients corresponding to p^k ∈ S^k, so that

p^k = Σ_i p̄_i^k φ_i^k, where (p̄_1^k, p̄_2^k, ..., p̄_{n_k}^k)^T = p̄^k.
Let I^k: G^k → S^k be the mapping taking p̄^k to p^k:

I^k p̄^k = Σ_i p̄_i^k φ_i^k = p^k.

We also define I_k = (I^k)*. Let {e_i^k} be the usual coordinate basis for G^k, so that I^k e_i^k = φ_i^k, i = 1,2,...,n_k. Since S^{k-1} ⊂ S^k, for some constants w_{ji}^k, the level k-1 basis functions can be expressed in terms of the level k basis functions as follows:

φ_i^{k-1} = Σ_j w_{ji}^k φ_j^k.

Thus, we can write

p^{k-1} = Σ_i p̄_i^{k-1} φ_i^{k-1} = Σ_i p̄_i^{k-1} Σ_j w_{ji}^k φ_j^k = Σ_j ( Σ_i w_{ji}^k p̄_i^{k-1} ) φ_j^k.
This defines an interpolation in the coefficient spaces that corresponds to the natural embedding of p^{k-1} in S^k, with the matrix I_{k-1}^k: G^{k-1} → G^k given by [w_{ji}^k]. Then we define

I_l^k = I (identity) if k = l,
I_l^k = I_{k-1}^k I_{k-2}^{k-1} ⋯ I_l^{l+1} if k > l (interpolation),
I_l^k = (I_k^l)^T if k < l (restriction).

Note that restriction is defined as the transpose of interpolation. It follows that I^k = I^l I_k^l and I_k = I_l^k I_l for k < l.
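A minimal concrete instance of this coefficient-space interpolation (our own 1D example with hat functions, not from the thesis): a coarse hat function equals the fine hat at the same node plus half of each of its two fine-grid neighbors, so the weights w_{ji} form a matrix whose transpose is the restriction:

```python
import numpy as np

def interpolation_matrix(nc):
    """Map coarse coefficients (nc-1 interior hat functions, spacing 1/nc)
    to fine coefficients (2*nc-1 interior hats, spacing 1/(2*nc))."""
    nf = 2 * nc
    W = np.zeros((nf - 1, nc - 1))
    for i in range(1, nc):           # coarse node i sits at fine node 2i
        W[2 * i - 1, i - 1] = 1.0    # same node: weight 1
        W[2 * i - 2, i - 1] = 0.5    # left fine neighbor: weight 1/2
        W[2 * i, i - 1] = 0.5        # right fine neighbor: weight 1/2
    return W

W = interpolation_matrix(4)          # coarse h = 1/4, fine h = 1/8
R = W.T                              # restriction as transpose of interpolation
```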
Now, using the basis functions, we can first write the finest level problem as a system of equations:

F^M(p̄^M) = (F_1^M(p̄^M), F_2^M(p̄^M), ..., F_{n_M}^M(p̄^M))^T = 0,    (2.4)

where

F_i^M(p̄^M) = (F(Σ_j p̄_j^M φ_j^M), φ_i^M).    (2.5)
This is the system of discrete equations to be solved. Now we consider the coarser
level equations, and proceed by induction. Suppose that we have a form for the level k equations, k ≤ M:

F^k(p̄^k) = (F_1^k(p̄^k), F_2^k(p̄^k), ..., F_{n_k}^k(p̄^k))^T = 0    (2.6)

where

F_i^k(p̄^k) = (F(Σ_{l=k+1}^M p^l + I^k p̄^k), φ_i^k).    (2.7)

This is true for k = M, where the sum over l is empty. For k < M, it is important to note that here, F^k depends on the part of the solution approximation from levels k+1,...,M. The level k-1 equations, for i = 1,2,...,n_{k-1}, can then be written as

F_i^{k-1}(p̄^{k-1}) = (F(Σ_{l=k}^M p^l + I^{k-1} p̄^{k-1}), φ_i^{k-1})
    = (F(Σ_{l=k+1}^M p^l + I^k(p̄^k + I_{k-1}^k p̄^{k-1})), Σ_j w_{ji}^k φ_j^k)    (2.8)
    = Σ_j w_{ji}^k F_j^k(p̄^k + I_{k-1}^k p̄^{k-1}).
This gives the full level k-1 system:

F^{k-1}(p̄^{k-1}) = I_k^{k-1} F^k(p̄^k + I_{k-1}^k p̄^{k-1}).    (2.9)

Let ⟨·,·⟩_k denote the usual l² inner product on G^k. Then, we can obtain an alternative form of (2.8) that we find useful later:

F_i^{k-1}(p̄^{k-1}) = Σ_j w_{ji}^k ⟨F^k(p̄^k + I_{k-1}^k p̄^{k-1}), e_j^k⟩_k
    = ⟨F^k(p̄^k + I_{k-1}^k p̄^{k-1}), I_{k-1}^k e_i^{k-1}⟩_k.    (2.10)
As noted above, it is convenient to absorb p̄^k into the definition of F^{k-1}. When dealing with linear operators, however, this would be confusing and we keep this contribution to the equation separate. Thus, if F^k is a linear operator L^k, we define

L^{k-1} = I_k^{k-1} L^k I_{k-1}^k,    (2.11)

and write instead

F_i^{k-1}(p̄^{k-1}) = ⟨I_k^{k-1} L^k I_{k-1}^k p̄^{k-1} + I_k^{k-1} L^k p̄^k, e_i^{k-1}⟩_{k-1}
    = ⟨L^{k-1} p̄^{k-1} + I_k^{k-1} L^k p̄^k, e_i^{k-1}⟩_{k-1},    (2.12)

or more simply

I_k^{k-1} L^k (p̄^k + I_{k-1}^k p̄^{k-1}) = L^{k-1} p̄^{k-1} + I_k^{k-1} L^k p̄^k.    (2.13)
Now, the form of F determines whether or not F^{k-1} in (2.9) or (2.10) can be evaluated (during relaxation, for example) without interpolating p̄^{k-1} to the finer level and applying the operator there. We show that this is possible in our case.
For linear operators, this results in the usual finite element, or Galerkin, formulation. For example, if F(p) = Lp − f, where L is a linear differential operator, and the
spaces chosen are standard piecewise bilinear functions on uniform grids, then the
algorithm is usual multigrid where the operators are the standard finite element
approximations to L, interpolation is bilinear, restriction is full weighting (the
transpose of interpolation), and relaxation is pointwise Gauss-Seidel. In our case,
we will derive the form of the discrete equations in a later section.
The relaxation chosen in PML depends on the character of the discrete equations.
When F is nonlinear, a nonlinear version of Gauss-Seidel relaxation can be defined on each level. This is done by sweeping through the grid in some order, updating p̄_i^k so that F_i^k(p̄^k) = 0. Here, F_i^k is quadratic in p̄_i^k, so the exact solution is used.
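A sketch of such a relaxation (our own model problem, not the thesis's Riccati equations): for −u'' + u² = f in 1D, each point equation is quadratic in u_i, so the sweep can solve it exactly with the quadratic formula, taking the root near the current iterate:

```python
import numpy as np

def nonlinear_gs(u, f, h, sweeps):
    """Pointwise nonlinear Gauss-Seidel for -u'' + u^2 = f, u(0)=u(1)=0.
    Each point equation  u[i]^2 + a*u[i] - c = 0  is solved exactly."""
    a = 2.0 / h**2
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            c = f[i] + (u[i - 1] + u[i + 1]) / h**2
            u[i] = (-a + np.sqrt(a * a + 4.0 * c)) / 2.0  # root near c/a
    return u

n = 16
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
u_exact = x * (1.0 - x)              # manufactured solution
f = 2.0 + u_exact**2                 # chosen so -u'' + u^2 = f holds exactly
u = nonlinear_gs(np.zeros(n + 1), f, h, 300)
err = np.max(np.abs(u - u_exact))    # small: the 3-point stencil is exact for quadratics
```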
2.3 Full Approximation Storage (FAS) Scheme
The original multigrid approach for nonlinear problems was developed by Brandt
[14], and is known as the full approximation storage (FAS) scheme. MG for linear
problems involves solving a coarse grid version of the finer grid residual equation
to obtain an approximation for the error. With nonlinear problems, it is generally
not possible to formulate a residual equation for the error. Instead, we use a
coarse grid version of the full problem, with a change made to the righthand side
so that, at least at convergence, the coarse grid solution coincides with the fine
grid solution on the coarse gridpoints. Assuming that the error is smooth, and that
the solution to the modified coarse grid problem is a better approximation to the
solution at coarse grid points, then the difference between the coarse and fine grid
approximations (taken at coarse grid points) should be a good approximation to the
error. This difference is interpolated and used as the correction we seek.
To make this more precise, suppose we have a nonlinear problem of the form
F(p) = f    (2.14)
and have a sequence of grids and approximations F^k to F, for k = 1,2,...,M. Some relaxation, such as nonlinear Gauss-Seidel, is used, and the usual interpolation and restriction can be defined. The resulting cycle is defined as follows.
FAS V(ν1,ν2)-cycle for nonlinear problems:

If k = 1, solve the problem exactly.
Otherwise:
    1. Relax ν1 times on F^k(p^k) = f^k.
    2. Perform coarse grid correction:
        Set f^{k-1} = I_k^{k-1}(f^k − F^k(p^k)) + F^{k-1}(I_k^{k-1} p^k) and p^{k-1} = I_k^{k-1} p^k.
        Perform 1 FAS V(ν1,ν2)-cycle on the problem F^{k-1}(p^{k-1}) = f^{k-1}.
        Correct the level k approximation: p^k ← p^k + I_{k-1}^k (p^{k-1} − I_k^{k-1} p^k).
    3. Relax ν2 times on F^k(p^k) = f^k.
For linear problems, this is equivalent to the previous coarse grid correction,
although it uses the coarse grid approximation to the full solution, rather than the
error. It should be noted that we can use a different restriction for solution
approximations than for residuals, and simple injection is often used.
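The FAS coarse grid correction above can be sketched for a simple nonlinear model problem (our own illustration, using −u'' + u² = f rather than the thesis's equations). Injection restricts the approximation, full weighting restricts the residual, and the coarse problem is "solved" by many nonlinear sweeps:

```python
import numpy as np

def nl_gs(u, f, h, sweeps):
    """Nonlinear Gauss-Seidel; each point equation is quadratic in u[i]."""
    a = 2.0 / h**2
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            c = f[i] + (u[i - 1] + u[i + 1]) / h**2
            u[i] = (-a + np.sqrt(a * a + 4.0 * c)) / 2.0
    return u

def F_apply(u, h):
    """Discrete nonlinear operator F(u) = -u'' + u^2 at interior points."""
    Fu = np.zeros_like(u)
    Fu[1:-1] = (2 * u[1:-1] - u[:-2] - u[2:]) / h**2 + u[1:-1]**2
    return Fu

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = (f - F_apply(u, h))[1:-1]
    return r

def fas_two_grid(u, f, h):
    nl_gs(u, f, h, 2)                         # pre-smoothing
    r = residual(u, f, h)
    uc0 = u[::2].copy()                       # injection of the approximation
    rc = np.zeros_like(uc0)                   # full weighting of the residual
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    fc = rc + F_apply(uc0, 2 * h)             # FAS coarse right-hand side
    uc = nl_gs(uc0.copy(), fc, 2 * h, 200)    # near-exact coarse solve
    e = uc - uc0                              # coarse approximation to the error
    corr = np.zeros_like(u)
    corr[::2] = e
    corr[1::2] = 0.5 * (e[:-1] + e[1:])       # linear interpolation
    u += corr
    nl_gs(u, f, h, 1)                         # post-smoothing
    return u

n = 32
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
u_exact = x * (1.0 - x)
f = 2.0 + u_exact**2
u = np.zeros(n + 1)
r0 = np.linalg.norm(residual(u, f, h))
for _ in range(5):
    fas_two_grid(u, f, h)
r5 = np.linalg.norm(residual(u, f, h))
```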
2.4 Full Multigrid (FMG) Algorithm
The cycle can reduce all error components by some fixed factor, but the number
of cycles needed depends on the accuracy of the initial approximation and the final
desired accuracy. Usually, it is unnecessary to reduce the algebraic error far below
the level of discretization error, since the extra work involved does not help give a
better solution to the continuous problem. The main idea with full multigrid
(FMG) is to obtain the initial approximation on each level by interpolating the
final solution from the next coarser level, then doing only enough cycles there to
reduce the algebraic error to below the level of discretization error on that level.
An important point is that the initial interpolation to a new finer level must be of
sufficiently high order (a tilde is used in step 3 of the algorithm below to denote that this is different from the standard interpolation used in cycling). Generally,
for a second order problem, cubic interpolation must be used to ensure a good
enough initial approximation. The FMG algorithm (also known as FMV, when V-cycles are used in step 3) is as follows:
Full Multigrid (FMG):

(1) For k = M, M-1, ..., 2, set f^{k-1} = I_k^{k-1} f^k.
(2) Solve the problem L^1(u^1) = f^1.
(3) For k = 2, 3, ..., M:
        Set u^k = Ĩ_{k-1}^k u^{k-1}.
        Perform p cycles on the level k problem.
Here, we want to see the error reduction needed on each level in order to have the final algebraic error in the solution below the level of discretization error. Let P denote the continuous solution, and P^k denote the exact solution to the problem discretized on level k. We make the following simplifying assumptions. We assume O(h²) discretization error, so that ‖P^k − P‖ = c h_k², where c is a constant independent of h; then ‖P^{k-1} − P‖ = 4‖P^k − P‖. We also assume that convergence of P^k to P is nicely monotonic, so that ‖P^{k-1} − P‖ = ‖P^{k-1} − P^k‖ + ‖P^k − P‖.
Now, let p_0^k denote the initial level k approximation, and p_f^k the final approximation after cycling. The goal on level k is to reduce the algebraic error (the error between the approximate solution and the exact level k solution) to below the level of discretization error. That is, we want ‖p_f^k − P^k‖ < ‖P^k − P‖. If P^{k-1} is used as an initial approximation on level k (i.e., p_0^k = P^{k-1}), then, ignoring interpolation error,
under our assumptions, we get

‖p_0^k − P^k‖ = ‖P^{k-1} − P^k‖
             = ‖P^{k-1} − P‖ − ‖P^k − P‖
             = 3 ‖P^k − P‖.
Thus, we would need to reduce the error by a factor of 3 to get within the level of
discretization error. In FMG, however, the level k-1 problem is not solved exactly. Instead, we take p_0^k = Ĩ_{k-1}^k p_f^{k-1}. If the level k-1 problem is solved to the level of discretization error (i.e., ‖p_f^{k-1} − P^{k-1}‖ < ‖P^{k-1} − P‖), then we have instead
‖p_0^k − P^k‖ = ‖p_f^{k-1} − P^k‖
             ≤ ‖p_f^{k-1} − P^{k-1}‖ + ‖P^{k-1} − P^k‖
             ≤ ‖P^{k-1} − P‖ + ‖P^{k-1} − P^k‖
             = 4 ‖P^k − P‖ + 3 ‖P^k − P‖
             = 7 ‖P^k − P‖.
Thus, this argument indicates that a reduction factor of 7 will suffice to stay below
the level of discretization error on all levels, assuming that the initial problem on
the coarsest grid is solved well enough. This can be used as a guide to determining
the number of cycles needed on each level in the FMG process. Denoting this
number by p, and letting ρ be the asymptotic convergence factor of the V-cycle, we only need to find p so that ρ^p ≤ 1/7.
This is only a guide, and, especially on coarser grids, convergence to the continuous
solution may not behave like O(h²), so sometimes extra cycles on finer levels may be
used to ensure that the level of discretization error is reached.
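Given a measured V-cycle convergence factor, the factor-of-7 argument above determines how many cycles per level are needed; a small helper (hypothetical names of our own, not thesis code) makes this concrete:

```python
import math

def fmg_cycles_needed(rho, target=1.0 / 7.0):
    """Smallest p with rho**p <= target, where rho is the asymptotic
    V-cycle convergence factor and 1/7 is the reduction suggested above."""
    return math.ceil(math.log(target) / math.log(rho))

# e.g. a very fast cycle (rho = 0.1) needs one cycle per level,
# while a slower one (rho = 0.5) needs three.
p_fast = fmg_cycles_needed(0.1)
p_slow = fmg_cycles_needed(0.5)
```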
3. The Control System with First Order Time Derivative
3.1 The Operator Riccati Equation and PDE
In this section, we consider the following LQR problem. Let the function v(x,t) be the state of the system and u(t) be a function describing the control applied to the
system. Then we want to minimize the quadratic cost functional:
J(u) = ∫_{t=0}^∞ { ⟨Qv,v⟩ + r u²(t) } dt    (3.1)
subject to the constraint that u and v satisfy
v_t(x,t) = Av(x,t) + Bu(t),  x ∈ [0,1], t > 0
v(0,t) = v(1,t) = 0,  t > 0    (3.2)
v(x,0) = v_0(x),  x ∈ [0,1]
where the state v(x,t) is in the Hilbert space H = {h(x): h(0) = h(1) = 0 and h, h' ∈ L²(0,1)}; ⟨·,·⟩ is the usual L² inner product, and not the inner product on H; r is a positive real number; A and Q are operators on H, and B is an operator from ℝ to H. Let Ω = [0,1]×[0,1]. We assume that Q is a nonnegative symmetric
integral operator with kernel q(x,y), i.e., for any h ∈ H,

Qh = ∫_0^1 q(x,y) h(y) dy,

and

⟨Qh, h⟩ ≥ 0,
q(x,y) = q(y,x) for all (x,y) ∈ Ω = [0,1]×[0,1].
For concreteness, we consider the case A = ∂²/∂x², so that the system is described by the heat equation. Let B be defined in terms of a given function b(x) ∈ H:

Ba = a·b(x).
For the LQR problem (3.1) and (3.2), by the Pontryagin maximum principle [12], the optimal control u(t) which minimizes J has the form of state feedback

u(t) = −r⁻¹ B* P v(x,t)    (3.3)

where the unknown P: H → H is the unique self-adjoint nonnegative solution to the following operator Riccati equation:

A*P + PA − PBB*P + Q = 0.    (3.4)
Here, A* and B* denote the adjoints of A and B, respectively. From the definitions of A and B, we have A* = A, and B*: H → ℝ is defined by

B*h = ∫_0^1 b(x) h(x) dx.
The goal is to find the form of state feedback (3.3), and for this, P must be determined. We assume that P can be expressed as an integral operator

Ph = ∫_0^1 p(x,y) h(y) dy

with kernel p: [0,1]×[0,1] → ℝ, which we must find. The facts that P is nonnegative and self-adjoint imply that p(x,y) = p(y,x) and

∫_0^1 ∫_0^1 p(x,y) h(x) h(y) dx dy ≥ 0 for all h ∈ H.
We also need to determine the appropriate boundary conditions for p. Since Ph ∈ H for any h, we must have Ph(0) = Ph(1) = 0. That is, for any h ∈ H,

∫_0^1 p(0,y) h(y) dy = ∫_0^1 p(1,y) h(y) dy = 0.

These, together with the symmetry of p, give the boundary condition

p(x,y)|_{∂Ω} = 0.    (3.5)
An equation for p is now derived from (3.4) by applying the operator there to an arbitrary h ∈ H. We look at each term separately. The first term applied to h is

A*Ph = APh = (∂²/∂x²) ∫_0^1 p(x,y) h(y) dy = ∫_0^1 p_xx h(y) dy.

Also, applying integration by parts and using (3.5), we get PAh as follows:

PAh = ∫_0^1 p(x,y) (∂²/∂y²) h(y) dy = ∫_0^1 p_yy h(y) dy.
For the nonlinear part PBB*P, by the definitions of B and B* we have

PBB*P(h) = ∫_0^1 p(x,z) [ ∫_0^1 b(s) ∫_0^1 p(s,y) h(y) dy ds ] b(z) dz
         = ∫_0^1 ( ∫_0^1 p(x,z) b(z) dz ) ( ∫_0^1 p(s,y) b(s) ds ) h(y) dy.
Combining these, for all h ∈ H, we obtain

∫_0^1 [ p_xx + p_yy − ( ∫_0^1 b(z) p(x,z) dz ) ( ∫_0^1 b(s) p(s,y) ds ) + q ] h(y) dy = 0.
We thus must have
Pxx +Pyy BJP) 5y(P) + 4 ='0
(3.6)
where
Bx(p)=fb(x)p(x,y)dx and By(p)= J b(y)p(x,y)dy.
o
o
So the LQR problem for (3.1) and (3.2) has been transformed into finding the unique nonnegative symmetric solution to the integro-differential equation (3.6), with boundary condition (3.5), which we restate as

−(p_xx + p_yy) + B_x(p) B_y(p) = q on Ω = [0,1]×[0,1],
p = 0 on ∂Ω.    (3.7)

We consider two multigrid methods, multilevel projection (PML) and a full approximation storage scheme (FAS), for solving the problem.

3.2 Multilevel Projection (PML) Solver

In order to apply PML to the solution of the equation (3.7), we need to specify the sequence of spaces to be used, show that the cycling scheme can be realized by defining the fine and coarser level discrete equations, and specify the relaxation used.

3.2.1 The Discretization and Equations

We consider a finite element discretization of (3.7). Let n_k = 2^k and h_k = 1/n_k for k = 1,2,...,M. Let x_i^k = i h_k for i = 0,1,...,n_k. We can then let φ_i^k be the usual 1D "hat" function supported on [x_{i-1}^k, x_{i+1}^k]. Let (x_i^k, y_j^k) = (i h_k, j h_k). Rather than using a single subscript as in section 2, we use the subscript pair ij to denote a quantity associated with the point (x_i^k, y_j^k), and define the 2D piecewise bilinear functions φ_ij^k(x,y) = φ_i^k(x) φ_j^k(y). Let S^k denote the space spanned by the set of basis functions {φ_ij^k : 1 ≤ i,j ≤ n_k − 1}.
This defines the nested function spaces used. As before, we let G^k be the space of coefficient vectors with respect to the basis. The interpolation I_k^{k+1}: G^k → G^{k+1} is determined by the basis functions, and is piecewise bilinear. We also need the 1D piecewise linear interpolation operator, which we likewise denote by I_k^{k+1}, and we define the corresponding 1D restriction operator I_{k+1}^k = (I_k^{k+1})^T.
We first want to derive the form of the fine level discrete equations, then the coarser level equations. Letting $L = -\Delta$ and rewriting (3.6), we get
$$F(p) = Lp + B_x(p)\,B_y(p) - q = 0. \tag{3.8}$$
As stated in Section 2.2, the discrete equations can be written in the form
$$F_{ij}(p^M) = 0, \qquad 1 \le i,j \le n_M - 1, \tag{3.9}$$
where
$$F_{ij}(p^M) = \big(F(p^M),\ \phi_{ij}^M\big) = \big(Lp^M + B_x(p^M)B_y(p^M) - q,\ \phi_{ij}^M\big). \tag{3.10}$$
Separating the terms of the inner product, this is rewritten as
$$F_{ij}(p^M) = L_{ij}p^M + \big(B_{xj}^M p^M\big)\big(B_{yi}^M p^M\big) - q_{ij}, \tag{3.11}$$
where
$$L_{ij}p^M = \big(Lp^M,\ \phi_{ij}^M\big), \qquad
\big(B_{xj}^M p^M\big)\big(B_{yi}^M p^M\big) = \big(B_x(p^M)B_y(p^M),\ \phi_{ij}^M\big), \qquad
q_{ij} = \big(q,\ \phi_{ij}^M\big). \tag{3.12}$$
The form of the discrete nonlinear term is not obvious, and will be derived below. First, we consider the linear term, which is obtained from integration by parts:
$$L_{ij}p^M = \big(Lp^M,\ \phi_{ij}^M\big)
= \sum_{kl} p_{kl}^M\Big[\big(\phi_k^{M\,\prime},\phi_i^{M\,\prime}\big)\big(\phi_l^M,\phi_j^M\big) + \big(\phi_k^M,\phi_i^M\big)\big(\phi_l^{M\,\prime},\phi_j^{M\,\prime}\big)\Big]$$
$$= \frac{8}{3}\,p_{ij}^M - \frac{1}{3}\big(p_{i-1,j}^M + p_{i+1,j}^M + p_{i,j-1}^M + p_{i,j+1}^M + p_{i-1,j-1}^M + p_{i+1,j-1}^M + p_{i-1,j+1}^M + p_{i+1,j+1}^M\big). \tag{3.13}$$
This follows from the facts:
$$\big(\phi_i^M, \phi_j^M\big) = \begin{cases} \tfrac{2}{3}h_M & i = j \\ \tfrac{1}{6}h_M & |i-j| = 1 \\ 0 & \text{otherwise} \end{cases}
\qquad\text{and}\qquad
\big(\phi_i^{M\,\prime}, \phi_j^{M\,\prime}\big) = \begin{cases} \tfrac{2}{h_M} & i = j \\ -\tfrac{1}{h_M} & |i-j| = 1 \\ 0 & \text{otherwise.} \end{cases} \tag{3.14}$$
The nonlinear term has an especially nice form for our purposes. Because $B_x(p^M)$ is a function of $y$ only and $B_y(p^M)$ is a function of $x$ only, we can rewrite the inner product in (3.12) as follows:
$$\big(B_x(p^M)B_y(p^M),\ \phi_{ij}^M\big) = \big(B_x(p^M),\ \phi_j^M\big)\big(B_y(p^M),\ \phi_i^M\big). \tag{3.15}$$
Thus we can define $B_{xj}^M$ and $B_{yi}^M$ separately:
$$B_{xj}^M p^M = \big(B_x(p^M),\ \phi_j^M\big) = \frac{h_M^2}{6}\sum_k b_k\big(p_{k,j-1}^M + 4p_{kj}^M + p_{k,j+1}^M\big), \tag{3.16a}$$
$$B_{yi}^M p^M = \big(B_y(p^M),\ \phi_i^M\big) = \frac{h_M^2}{6}\sum_k b_k\big(p_{i-1,k}^M + 4p_{ik}^M + p_{i+1,k}^M\big). \tag{3.16b}$$
These follow from (3.14) and the definitions of $B_x$ and $B_y$:
$$\big(B_x(\phi_{kl}^M),\ \phi_j^M\big) = \Big(\int_0^1 b(x)\phi_k^M\,dx\ \phi_l^M,\ \phi_j^M\Big) = \big(b_k h_M\,\phi_l^M,\ \phi_j^M\big) = h_M b_k\big(\phi_l^M,\ \phi_j^M\big), \tag{3.17a}$$
$$\big(B_y(\phi_{kl}^M),\ \phi_i^M\big) = \Big(\int_0^1 b(y)\phi_l^M\,dy\ \phi_k^M,\ \phi_i^M\Big) = \big(b_l h_M\,\phi_k^M,\ \phi_i^M\big) = h_M b_l\big(\phi_k^M,\ \phi_i^M\big). \tag{3.17b}$$
The final term involving q is defined in the straightforward way, completing the
fine level discretization.
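To make the fine-level equations concrete, the following Python sketch (our own illustration, not part of the thesis) applies the stencil (3.13) and the sums (3.16) to a nodal grid function; the function name and test setup are assumptions.

```python
import numpy as np

def pml_fine_operator(p, b, q, h):
    """Evaluate F_ij(p) = L_ij p + (B_xj p)(B_yi p) - q_ij, as in (3.11),
    at the interior nodes of a uniform (n+1) x (n+1) grid with mesh size h.
    p and q hold nodal values (p is zero on the boundary); b holds b(x_i)."""
    n = p.shape[0] - 1
    Bx = np.zeros(n + 1)                  # B_xj p: a function of j only, (3.16a)
    By = np.zeros(n + 1)                  # B_yi p: a function of i only, (3.16b)
    for j in range(1, n):
        Bx[j] = h**2 / 6 * sum(b[k] * (p[k, j-1] + 4*p[k, j] + p[k, j+1])
                               for k in range(1, n))
        By[j] = h**2 / 6 * sum(b[k] * (p[j-1, k] + 4*p[j, k] + p[j+1, k])
                               for k in range(1, n))
    F = np.zeros_like(p)
    for i in range(1, n):
        for j in range(1, n):
            # 9-point bilinear FEM stencil for -Laplacian, (3.13)
            Lp = (8/3) * p[i, j] - (1/3) * (p[i-1, j] + p[i+1, j]
                                            + p[i, j-1] + p[i, j+1]
                                            + p[i-1, j-1] + p[i+1, j-1]
                                            + p[i-1, j+1] + p[i+1, j+1])
            F[i, j] = Lp + Bx[j] * By[i] - q[i, j]
    return F
```

As the derivation requires, the discrete operator preserves symmetry: for symmetric $p$ and $q$, the output is symmetric.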
The next step is to obtain the coarser level discrete equations. For this, we assume that the discrete equations are defined on level $k+1$, and derive the level $k$ equations. We use a more general form than the fine level equations, since extra terms appear on coarser levels. For $1 \le i,j \le n_{k+1} - 1$,
$$F_{ij}^{k+1}(p^{k+1}) = L_{ij}^{k+1}p^{k+1} + \big(B_{xj}^{k+1}p^{k+1} + b_{xj}^{k+1}\big)\big(B_{yi}^{k+1}p^{k+1} + b_{yi}^{k+1}\big) - q_{ij}^{k+1}. \tag{3.18}$$
We write the coarser level equations as in (2.10), separating the terms:
$$F_{ij}^k(p^k) = \big(F^{k+1}(p^{k+1} + I_k^{k+1}p^k),\ \phi_{ij}^k\big)
= \big(L^{k+1}(p^{k+1} + I_k^{k+1}p^k),\ \phi_{ij}^k\big)$$
$$\qquad + \big((B_x^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_x^{k+1})(B_y^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_y^{k+1}),\ \phi_{ij}^k\big) - \big(q^{k+1},\ \phi_{ij}^k\big). \tag{3.19}$$
The first term is treated as in (2.12):
$$\big(L^{k+1}(p^{k+1} + I_k^{k+1}p^k),\ \phi_{ij}^k\big) = L_{ij}^k p^k + \big(I_{k+1}^k L^{k+1}p^{k+1}\big)_{ij}. \tag{3.20}$$
The second term can be separated again into functions in $x$ and $y$ as follows:
$$\big((B_x^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_x^{k+1})(B_y^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_y^{k+1}),\ \phi_{ij}^k\big)$$
$$= \big(B_x^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_x^{k+1},\ \phi_j^k\big)\,\big(B_y^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_y^{k+1},\ \phi_i^k\big). \tag{3.21}$$
The first inner product is over $y$, and the second over $x$. Taking the first, we get
$$\big(B_x^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_x^{k+1},\ \phi_j^k\big) = B_{xj}^k p^k + \big(\bar{I}_{k+1}^k(B_x^{k+1}p^{k+1} + b_x^{k+1})\big)_j. \tag{3.22}$$
Similarly, the second inner product can be written as
$$\big(B_y^{k+1}(p^{k+1} + I_k^{k+1}p^k) + b_y^{k+1},\ \phi_i^k\big) = B_{yi}^k p^k + \big(\bar{I}_{k+1}^k(B_y^{k+1}p^{k+1} + b_y^{k+1})\big)_i. \tag{3.23}$$
We now define the following level $k$ quantities:
$$b_x^k = \bar{I}_{k+1}^k\big(B_x^{k+1}p^{k+1} + b_x^{k+1}\big), \qquad
b_y^k = \bar{I}_{k+1}^k\big(B_y^{k+1}p^{k+1} + b_y^{k+1}\big), \qquad
q^k = I_{k+1}^k\big(q^{k+1} - L^{k+1}p^{k+1}\big). \tag{3.24}$$
Using these, and recombining the inner products in (3.22) and (3.23), we can finally rewrite (3.19) as
$$F_{ij}^k(p^k) = \big(L^k p^k + (B_x^k p^k + b_x^k)(B_y^k p^k + b_y^k) - q^k\big)_{ij}
= L_{ij}^k p^k + \big(B_{xj}^k p^k + b_{xj}^k\big)\big(B_{yi}^k p^k + b_{yi}^k\big) - q_{ij}^k. \tag{3.25}$$
The level $k$ operators $L^k$, $B_x^k$ and $B_y^k$ are defined in the same way as the level $M$ operators:
$$L_{ij}^k p^k = \frac{8}{3}\,p_{ij}^k - \frac{1}{3}\big(p_{i-1,j}^k + p_{i+1,j}^k + p_{i,j-1}^k + p_{i,j+1}^k + p_{i-1,j-1}^k + p_{i+1,j-1}^k + p_{i-1,j+1}^k + p_{i+1,j+1}^k\big),$$
$$B_{xj}^k p^k = \frac{h_k^2}{6}\sum_l b_l^k\big(p_{l,j-1}^k + 4p_{lj}^k + p_{l,j+1}^k\big), \qquad
B_{yi}^k p^k = \frac{h_k^2}{6}\sum_l b_l^k\big(p_{i-1,l}^k + 4p_{il}^k + p_{i+1,l}^k\big). \tag{3.26}$$
Thus, we have reached our goal and shown that the level k problem is well
defined in terms of level k quantities only.
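The 1D transfer operators used in (3.24) can be sketched as follows (a Python illustration of our own; the interior-node indexing and zero boundary values are assumptions):

```python
import numpy as np

def interp_1d(vc):
    """Piecewise linear interpolation of coarse interior values to fine
    interior values; coarse node j coincides with fine node 2j."""
    nc = len(vc) + 1                 # number of coarse intervals
    vf = np.zeros(2 * nc - 1)        # fine interior values
    for j, v in enumerate(vc, start=1):
        vf[2*j - 1] += v             # coincident fine node 2j
        vf[2*j - 2] += 0.5 * v       # fine node 2j - 1
        vf[2*j] += 0.5 * v           # fine node 2j + 1
    return vf

def restrict_1d(vf):
    """The corresponding restriction: the transpose of interp_1d."""
    nc = (len(vf) + 1) // 2
    return np.array([vf[2*j - 1] + 0.5 * (vf[2*j - 2] + vf[2*j])
                     for j in range(1, nc)])
```

Here restrict_1d is exactly the transpose of interp_1d, matching the definition $\bar{I}_{k+1}^k = (\bar{I}_k^{k+1})^T$.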
3.2.2 Relaxation
Because of the quadratic nature of the nonlinear term, we use an exact nonlinear Gauss-Seidel for the relaxation process. As stated previously, relaxation consists of sweeping through all points, and at each point $(i,j)$, $1 \le i,j \le n-1$, replacing the current approximation by
$$p^h \leftarrow p^h + s\,\phi_{ij}^h,$$
where $s$ is chosen to satisfy
$$\big(F^h(p^h + s\,\phi_{ij}^h),\ \phi_{ij}^h\big) = 0. \tag{3.27}$$
This leads to a quadratic equation in $s$:
$$a\,s^2 + b\,s + c = 0. \tag{3.28}$$
The coefficients are defined as follows:
$$a = be_x^h(i)\,be_y^h(j), \tag{3.29}$$
where
$$be_x^h(i) = \big(B_x(\phi_{ij}^h),\ \phi_j^h\big) = \tfrac{2}{3}\,b_i\,h^2 \tag{3.30}$$
and
$$be_y^h(j) = \big(B_y(\phi_{ij}^h),\ \phi_i^h\big) = \tfrac{2}{3}\,b_j\,h^2. \tag{3.31}$$
Also,
$$b = \big(L\phi_{ij}^h,\ \phi_{ij}^h\big) + \big(B_x(p^h)B_y(\phi_{ij}^h) + B_x(\phi_{ij}^h)B_y(p^h),\ \phi_{ij}^h\big)
= \frac{8}{3} + \big[b_x^h(j)\,be_y^h(j) + b_y^h(i)\,be_x^h(i)\big], \tag{3.32}$$
where $b_x^h(j) = B_{xj}^h p^h + b_{xj}^h$ and $b_y^h(i) = B_{yi}^h p^h + b_{yi}^h$, with the $b^h$ terms defined by (3.24), and $be_x^h(i)$, $be_y^h(j)$ given by (3.30) and (3.31). And finally, we have
$$c = L_{ij}p^h + b_x^h(j)\,b_y^h(i) - q_{ij},$$
where
$$q_{ij} = \int_0^1\!\!\int_0^1 q(x,y)\,\phi_i^h(x)\,\phi_j^h(y)\,dx\,dy \approx q(x_i, y_j)\,h^2. \tag{3.33}$$
Now, we solve (3.27) exactly, taking
$$s = \frac{-2c}{b + \sqrt{b^2 - 4ac}}.$$
This form is used since the quadratic coefficient $a$ can be small, and the root chosen is that closest to the solution of the linear part of (3.28). This completes the definition of relaxation on the fine grid.
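The root selection can be written as a small routine (our own sketch; the exact printed form of the formula was reconstructed, so treat it as an assumption):

```python
import math

def step_size(a, b, c):
    """Root of a*s**2 + b*s + c = 0 closest to the linear-part root -c/b,
    computed in a form that remains well defined as a -> 0 (assumes b > 0,
    as here, where b is 8/3 plus a small quadrature term)."""
    return -2.0 * c / (b + math.sqrt(b * b - 4.0 * a * c))
```

As $a \to 0$ this tends to $-c/b$, which is the update that ordinary linearized Gauss-Seidel would make.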
3.2.3 Numerical Results
Here, we present results both from V-cycling, to ensure that multigrid is converging as expected, and from the full multigrid solution process, to show that the algorithm does solve to below the level of discretization error. For our tests, we take $b(x) = 100$ and $p(x,y) = x^2(x^2-1)\,y^2(y^2-1)$, and use this to generate the right-hand side for the continuous problem. This is used to define the right-hand sides for the discrete problems on each level. Tables 3.1 and 3.2 show convergence histories and asymptotic convergence factors for V(1,0) and V(1,1) cycles, respectively. Factors for V(1,0) cycles are 0.20 per cycle for all levels (except for the case h = 1/4), showing that acceptable convergence, independent of mesh size, has been obtained. Factors for V(1,1) cycles also appear to be bounded independent of mesh size, with an average factor for the first 10 cycles of around 0.05. Smaller convergence factors do not necessarily imply greater efficiency. To compare the cycling efficiency, we use the time per cycle $t_c$ and the convergence factor $\rho$ to find the time required to reduce the residual norm by a factor of 10 as follows:
$$t_{0.1} = -t_c/\log_{10}\rho.$$
This measure is shown in Table 3.3 for the V(1,0), the V(1,1) and the V(2,1) cycles. Cycling efficiency is best for the V(1,1) cycle.
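The efficiency measure is easy to reproduce; here is a one-line Python helper (ours, not from the thesis) evaluated with the factors and cycle times of Table 3.3:

```python
import math

def time_to_tenth(t_c, rho):
    """t_0.1 = -t_c / log10(rho): time needed to reduce the residual norm
    by a factor of 10 with cycle time t_c and convergence factor rho."""
    return -t_c / math.log10(rho)

# V(1,1) entry of Table 3.3: t_c = 0.41, rho = 0.065
t = time_to_tenth(0.41, 0.065)
```

This reproduces the 0.35 listed for the V(1,1) cycle (and 0.40 for the V(2,2) cycle).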
In the FMG process, the initial approximation on a given level is the (interpolated) solution from the next coarser grid, and the algebraic error there must be reduced to below the level of discretization error. The size of the discretization error on each level is shown in Table 3.4. For h < 1/4, this error behaves like $O(h^2)$, decreasing by a factor of close to 4 per level. Thus, the analysis of Section 2.4 should apply, and we can use 1/7 as a guide for the desired reduction factor per level. For this, two V(1,0) cycles or one V(1,1) cycle should be sufficient. Tables 3.5-3.8 show the results of FMV cycling. Following our earlier notation, and using a discrete approximation to the continuous $L^2$ norm, the columns in each table indicate $\|p_0^k - p^k\|$ (initial algebraic error), $\|p_f^k - p^k\|$ (final algebraic error), $\|p^k - p\|$ (discretization error), and $\|p_f^k - p^k\|/\|p^k - p\|$. The last quantity is then less than 1 when the final algebraic error is small enough. Tables 3.5 and 3.6 show results for one and two V(1,0) cycles per level, while Tables 3.7 and 3.8 give results for one and two V(1,1) cycles. In all cases, the algebraic error is smaller than the discretization error on all levels, even for one V(1,0) cycle per level, which has a reduction factor larger than 1/7. (Note that although the discretization error $\|p^k - p\|$ behaves as expected, the ratio of initial error to discretization error $\|p_0^k - p^k\|/\|p^k - p\|$ is only slightly larger than 3 on all levels, which is better than predicted.) FMG solution times are given with the tables, and the best time of 0.49 seconds is obtained with one V(1,0) cycle per level.
Cycle   h = 1/4    h = 1/8    h = 1/16   h = 1/32   h = 1/64
0       0.10E+01   0.56E+00   0.28E+00   0.14E+00   0.72E-01
1       0.15E+00   0.35E-01   0.13E-01   0.58E-02   0.28E-02
2       0.74E-01   0.59E-02   0.25E-02   0.11E-02   0.55E-03
3       0.40E-01   0.89E-03   0.48E-03   0.22E-03   0.11E-03
4       0.21E-01   0.12E-03   0.95E-04   0.44E-04   0.22E-04
5       0.11E-01   0.16E-04   0.19E-04   0.89E-05   0.43E-05
6       0.54E-02   0.31E-05   0.36E-05   0.18E-05   0.87E-06
7       0.27E-02   0.45E-06   0.69E-06   0.35E-06   0.17E-06
8       0.13E-02   0.89E-07   0.13E-06   0.71E-07   0.34E-07
9       0.65E-03   0.15E-07   0.23E-07   0.14E-07   0.69E-08
10      0.32E-03   0.31E-08   0.39E-08   0.28E-08   0.14E-08
Factor  0.50       0.20       0.20       0.20       0.20
Table 3.1. PML V(1,0)-cycle convergence histories for the model problem for different mesh sizes. Asymptotic convergence factors for each level are given at the bottom of the columns.
Cycle   h = 1/4    h = 1/8    h = 1/16   h = 1/32   h = 1/64
0       0.10E+01   0.56E+00   0.28E+00   0.14E+00   0.72E-01
1       0.77E-01   0.61E-02   0.26E-02   0.12E-02   0.57E-03
2       0.16E-01   0.13E-03   0.11E-03   0.53E-04   0.25E-04
3       0.25E-02   0.30E-05   0.48E-05   0.26E-05   0.12E-05
4       0.28E-03   0.67E-07   0.22E-06   0.13E-06   0.63E-07
5       0.51E-04   0.18E-08   0.99E-08   0.77E-08   0.41E-08
6       0.13E-04   0.64E-10   0.52E-09   0.47E-09   0.27E-09
7       0.23E-05   0.20E-11   0.24E-10   0.30E-10   0.20E-10
8       0.29E-06   0.57E-13   0.11E-11   0.19E-11   0.15E-11
9       0.42E-07   0.28E-14   0.47E-13   0.13E-12   0.11E-12
10      0.11E-07   0.12E-15   0.18E-14   0.85E-14   0.92E-14
Factor  0.16       0.027      0.038      0.048      0.051
Table 3.2. PML V(1,1)-cycle convergence histories for the model problem for different mesh sizes. Average convergence factors for each level are given at the bottom of the columns.
Cycle    Factor   tc     t_0.1
V(1,0)   0.200    0.25   0.39
V(1,1)   0.065    0.41   0.35
V(2,1)   0.028    0.57   0.37
V(2,2)   0.015    0.73   0.40
Table 3.3. Comparison of efficiency for PML V-cycles for h = 1/64. Shown are convergence factors, times per cycle, and cycling time to reduce residuals by a factor of 10.
Level   ||p||      ||p^k||    ||p^k - p||
1/2     1.76E-02   2.56E-02   8.05E-03
1/4     2.48E-02   3.05E-02   7.23E-03
1/8     2.54E-02   2.66E-02   1.73E-03
1/16    2.54E-02   2.57E-02   4.15E-04
1/32    2.54E-02   2.55E-02   1.02E-04
1/64    2.54E-02   2.54E-02   2.55E-05
Table 3.4. Norms of continuous and PML algebraic solutions and their differences measured on each level.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.16E-01          0.18E-04          0.72E-02      0.25E-02
1/8     0.50E-02          0.94E-03          0.17E-02      0.55E+00
1/16    0.15E-02          0.25E-03          0.42E-03      0.59E+00
1/32    0.36E-03          0.59E-04          0.10E-03      0.58E+00
1/64    0.90E-04          0.15E-04          0.26E-04      0.58E+00
Table 3.5. PML FMG (1 V(1,0) cycle/level). Solution time = 0.49 s.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.16E-01          0.18E-04          0.72E-02      0.25E-02
1/8     0.50E-02          0.17E-03          0.17E-02      0.10E+00
1/16    0.12E-02          0.39E-04          0.42E-03      0.94E-01
1/32    0.30E-03          0.92E-05          0.10E-03      0.90E-01
1/64    0.73E-04          0.23E-05          0.26E-04      0.89E-01
Table 3.6. PML FMG (2 V(1,0) cycles/level). Solution time = 0.71 s.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.16E-01          0.68E-06          0.72E-02      0.95E-04
1/8     0.50E-02          0.19E-03          0.17E-02      0.11E+00
1/16    0.12E-02          0.47E-04          0.42E-03      0.11E+00
1/32    0.30E-03          0.13E-04          0.10E-03      0.13E+00
1/64    0.77E-04          0.39E-05          0.26E-04      0.15E+00
Table 3.7. PML FMG (1 V(1,1) cycle/level). Solution time = 0.88 s.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.16E-01          0.68E-06          0.72E-02      0.95E-04
1/8     0.50E-02          0.50E-05          0.17E-02      0.29E-02
1/16    0.12E-02          0.25E-05          0.42E-03      0.61E-02
1/32    0.30E-03          0.69E-06          0.10E-03      0.67E-02
1/64    0.73E-04          0.20E-06          0.26E-04      0.77E-02
Table 3.8. PML FMG (2 V(1,1) cycles/level). Solution time = 1.32 s.
3.3 Full Approximation Scheme (FAS) Solver
The application of FAS to equation (3.7) is much simpler to describe than the
PML approach. The FAS algorithm given in Section 2.3 is used, with simple finite
difference approximations for the operators on each level. This is described here,
and numerical results are presented.
3.3.1 The Discretization, Equations and Relaxation
We use a sequence of uniform discretizations of the domain $\Omega = [0,1]\times[0,1]$, where $\Omega^k$ has mesh size $h_k = 2^{-k}$. To define our algorithm, we only need to specify the discretization of the nonlinear operator $F$ in (3.7):
$$F(p) = Lp + B_x(p)\,B_y(p),$$
where
$$Lp = -(p_{xx} + p_{yy}), \qquad B_x(p) = \int_0^1 b(x)p(x,y)\,dx \quad\text{and}\quad B_y(p) = \int_0^1 b(y)p(x,y)\,dy.$$
Straightforward finite difference discretizations are used, so that on level $k$,
$$F^k(p^k) = L^k p^k + B_x^k(p^k)\,B_y^k(p^k),$$
where at each point $(x_i, y_j) = (ih_k, jh_k)$, $1 \le i,j \le n_k - 1$,
$$\big(L^k p^k\big)_{ij} = \frac{1}{h_k^2}\big(4p_{ij}^k - p_{i-1,j}^k - p_{i+1,j}^k - p_{i,j-1}^k - p_{i,j+1}^k\big)$$
and
$$B_x^k(p^k)_j = h_k\sum_i b_i\,p_{ij}^k \qquad\text{and}\qquad B_y^k(p^k)_i = h_k\sum_j b_j\,p_{ij}^k.$$
Point evaluation is used for discretizing the right-hand side $q$ on the finest level $M$, which we denote by $f^M$. Now, we have the problem written in the form used in the FAS algorithm in Section 2.3, so that the equations we use on each level are fully specified.
The relaxation we use is nonlinear Gauss-Seidel. The level $k$ equation at a point $(i,j)$ can now be written as
$$\frac{1}{h_k^2}\big(4p_{ij}^k - p_{i-1,j}^k - p_{i+1,j}^k - p_{i,j-1}^k - p_{i,j+1}^k\big)
+ \Big(h_k\sum_l b_l\,p_{lj}^k\Big)\Big(h_k\sum_l b_l\,p_{il}^k\Big) = f_{ij}^k,$$
which is a quadratic equation in $p_{ij}^k$ of the form
$$a\,(p_{ij}^k)^2 + b\,p_{ij}^k + c = 0,$$
with
$$a = h_k^2\,b_i\,b_j, \qquad
b = \frac{4}{h_k^2} + h_k\big(b_i\,b_{y,ij} + b_j\,b_{x,ij}\big), \qquad
c = b_{x,ij}\,b_{y,ij} - \frac{1}{h_k^2}\big(p_{i-1,j}^k + p_{i+1,j}^k + p_{i,j-1}^k + p_{i,j+1}^k\big) - f_{ij}^k,$$
where
$$b_{x,ij} = h_k\sum_{l\ne i} b_l\,p_{lj}^k \qquad\text{and}\qquad b_{y,ij} = h_k\sum_{l\ne j} b_l\,p_{il}^k.$$
We solve exactly, taking the root closest to that of the linear problem, so that
$$p_{ij}^k = \frac{-2c}{b + \sqrt{b^2 - 4ac}}.$$
This finishes the description of our FAS algorithm.
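The fine-level FAS operator and its exact nonlinear Gauss-Seidel sweep fit in a few lines of Python (our own sketch; the array layout and the manufactured test problem are assumptions, not from the thesis):

```python
import numpy as np

def fas_apply(p, b, h):
    """L^k p + B_x^k(p) B_y^k(p) at interior nodes: 5-point -Laplacian
    plus the product of the two discretized integral terms."""
    n = p.shape[0] - 1
    Bx = h * (b[:, None] * p).sum(axis=0)    # B_x(p) at each y_j
    By = h * (b[None, :] * p).sum(axis=1)    # B_y(p) at each x_i
    out = np.zeros_like(p)
    for i in range(1, n):
        for j in range(1, n):
            lap = (4*p[i, j] - p[i-1, j] - p[i+1, j]
                   - p[i, j-1] - p[i, j+1]) / h**2
            out[i, j] = lap + Bx[j] * By[i]
    return out

def fas_sweep(p, b, f, h, sweeps=1):
    """Exact nonlinear point Gauss-Seidel for F^k(p) = f: at each (i,j),
    solve the local quadratic, taking the root nearest the linear update."""
    n = p.shape[0] - 1
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                bx = h * (b @ p[:, j] - b[i] * p[i, j])   # h * sum_{l != i}
                by = h * (b @ p[i, :] - b[j] * p[i, j])   # h * sum_{l != j}
                a = h**2 * b[i] * b[j]
                bb = 4 / h**2 + h * (b[i] * by + b[j] * bx)
                c = (bx * by - (p[i-1, j] + p[i+1, j]
                                + p[i, j-1] + p[i, j+1]) / h**2 - f[i, j])
                p[i, j] = -2 * c / (bb + np.sqrt(bb**2 - 4 * a * c))
    return p
```

On its own this smoother converges slowly; in the thesis it is only the relaxation inside the FAS cycle, with coarse grids doing the rest of the work.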
3.3.2 Numerical Results
Tables 3.9-3.16 contain the results corresponding to Tables 3.1-3.8 for the FAS rather than the PML version of the algorithm. Again, we take $b(x) = 100$ and $p(x,y) = x^2(x^2-1)\,y^2(y^2-1)$. It is interesting to compare the discretization errors $\|p^k - p\|$ for PML and FAS. Tables 3.4 and 3.12 show that on the finest level used, this error is $3.4\times10^{-7}$ for FAS, while it is $2.6\times10^{-5}$ for PML. This merits further study, but will not be examined here. Another difference between PML and FAS is seen in Tables 3.3 and 3.11. Cycle times for FAS are smaller than those for PML, mostly due to the use of smaller finite difference stencils, but this savings is somewhat offset by higher convergence factors, resulting in comparable efficiency. Again, $O(h^2)$ discretization error behavior is seen, so that the reduction per level required in FMG is the same as with PML. Tables 3.9 and 3.10 show convergence histories and asymptotic convergence factors for V(1,0) and V(1,1) cycles, respectively. Generally, the convergence factors are not as good as for the PML algorithm. This is understandable, since the PML algorithm is designed so that coarse grid corrections are optimal in some sense, while FAS uses an approximation for the correction to the right-hand side of the coarser level equations. Asymptotic convergence factors obtained are 0.35 for V(1,0) cycles and 0.15 for V(1,1) cycles (compared to 0.2 and 0.05 for PML). Tables 3.13 and 3.14 show FMG results for one and two V(1,0) cycles per level, while Tables 3.15 and 3.16 give the results for one and two V(1,1) cycles. Here, not unexpectedly, we see that one V(1,0) cycle per level is not sufficient, so that the ratio of algebraic error to discretization error increases on finer levels. Timings given with the tables show that the optimal solution time is 0.44 seconds, obtained with 1 V(1,1) cycle per level. This is slightly better than the optimal PML solution time obtained.
Cycle   h = 1/4    h = 1/8    h = 1/16   h = 1/32   h = 1/64
0       0.19E+01   0.12E+00   0.62E-01   0.24E-01   0.92E-02
1       0.48E+00   0.22E-01   0.11E-01   0.70E-02   0.32E-02
2       0.23E+00   0.48E-02   0.27E-02   0.22E-02   0.12E-02
3       0.97E-01   0.16E-02   0.70E-03   0.71E-03   0.44E-03
4       0.40E-01   0.52E-03   0.20E-03   0.23E-03   0.17E-03
5       0.17E-01   0.18E-03   0.66E-04   0.74E-04   0.63E-04
6       0.69E-02   0.65E-04   0.22E-04   0.25E-04   0.23E-04
7       0.28E-02   0.24E-04   0.75E-05   0.83E-05   0.79E-05
8       0.12E-02   0.89E-05   0.26E-05   0.29E-05   0.28E-05
9       0.49E-03   0.33E-05   0.90E-06   0.10E-05   0.95E-06
10      0.20E-03   0.12E-05   0.32E-06   0.36E-06   0.33E-06
Factor  0.41       0.37       0.35       0.35       0.35
Table 3.9. FAS V(1,0)-cycle convergence histories for the model problem for different mesh sizes. Asymptotic convergence factors for each level are given at the bottom of the columns.
Cycle   h = 1/4    h = 1/8    h = 1/16   h = 1/32   h = 1/64
0       0.56E+00   0.24E-01   0.14E-01   0.69E-02   0.28E-02
1       0.97E-01   0.20E-02   0.87E-03   0.61E-03   0.30E-03
2       0.16E-01   0.28E-03   0.10E-03   0.56E-04   0.34E-04
3       0.27E-02   0.44E-04   0.13E-04   0.56E-05   0.41E-05
4       0.45E-03   0.70E-05   0.20E-05   0.62E-06   0.50E-06
5       0.74E-04   0.11E-05   0.32E-06   0.80E-07   0.62E-07
6       0.12E-04   0.18E-06   0.52E-07   0.11E-07   0.80E-08
7       0.20E-05   0.27E-07   0.89E-08   0.18E-08   0.11E-08
8       0.34E-06   0.42E-08   0.16E-08   0.31E-09   0.15E-09
9       0.56E-07   0.65E-09   0.27E-09   0.53E-10   0.24E-10
10      0.93E-08   0.10E-09   0.43E-10   0.92E-11   0.54E-11
Factor  0.17       0.16       0.17       0.17       0.15
Table 3.10. FAS V(1,1)-cycle convergence histories for the model problem for different mesh sizes. Asymptotic convergence factors for each level are given at the bottom of the columns.
Cycle    Factor   tc     t_0.1
V(1,0)   0.365    0.18   0.41
V(1,1)   0.130    0.25   0.28
V(2,1)   0.088    0.33   0.31
V(2,2)   0.064    0.40   0.33
Table 3.11. Comparison of efficiency for FAS V-cycles for h = 1/64. Shown are convergence factors, times per cycle, and cycling time to reduce residuals by a factor of 10.
Level   ||p||      ||p^k||    ||p^k - p||
1/2     1.76E-02   1.71E-02   4.95E-04
1/4     2.48E-02   2.47E-02   9.57E-05
1/8     2.54E-02   2.53E-02   2.23E-05
1/16    2.54E-02   2.54E-02   5.46E-06
1/32    2.54E-02   2.54E-02   1.36E-06
1/64    2.54E-02   2.54E-02   3.39E-07
Table 3.12. Norms of continuous and FAS algebraic solutions and their differences measured on each level.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.17E-01          0.16E-05          0.96E-04      0.16E-01
1/8     0.49E-03          0.54E-04          0.22E-04      0.24E+01
1/16    0.62E-04          0.18E-04          0.55E-05      0.33E+01
1/32    0.19E-04          0.65E-05          0.14E-05      0.48E+01
1/64    0.66E-05          0.24E-05          0.34E-06      0.71E+01
Table 3.13. FAS FMG (1 V(1,0) cycle/level). Solution time = 0.39 s.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.17E-01          0.16E-05          0.96E-04      0.16E-01
1/8     0.49E-03          0.15E-04          0.22E-04      0.68E+00
1/16    0.29E-04          0.25E-05          0.55E-05      0.45E+00
1/32    0.44E-05          0.41E-06          0.14E-05      0.30E+00
1/64    0.10E-05          0.70E-07          0.34E-06      0.21E+00
Table 3.14. FAS FMG (2 V(1,0) cycles/level). Solution time = 0.60 s.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.17E-01          0.20E-07          0.96E-04      0.21E-03
1/8     0.48E-03          0.18E-04          0.22E-04      0.79E+00
1/16    0.31E-04          0.39E-05          0.55E-05      0.71E+00
1/32    0.53E-05          0.78E-06          0.14E-05      0.58E+00
1/64    0.12E-05          0.16E-06          0.34E-06      0.47E+00
Table 3.15. FAS FMG (1 V(1,1) cycle/level). Solution time = 0.44 s.
Level   ||p_0^k - p^k||   ||p_f^k - p^k||   ||p^k - p||   ||p_f^k - p^k||/||p^k - p||
1/4     0.17E-01          0.20E-07          0.96E-04      0.21E-03
1/8     0.48E-03          0.26E-05          0.22E-04      0.12E+00
1/16    0.26E-04          0.22E-06          0.55E-05      0.41E-01
1/32    0.34E-05          0.28E-07          0.14E-05      0.20E-01
1/64    0.95E-06          0.22E-07          0.34E-06      0.64E-01
Table 3.16. FAS FMG (2 V(1,1) cycles/level). Solution time = 0.82 s.
4. The Control System with Second Order Time Derivative
The governing equation for many systems, including the control of many large space structures, is second order in time and can be described by the following partial differential equation:
$$\ddot{v}(x,t) + D\dot{v}(x,t) + Av(x,t) = Bu(t), \tag{4.1}$$
where the state $v(x,t)$ is in some Hilbert space $H$. The stiffness operator $A$ and the damping operator $D$ are operators on $H$, $u(t) \in M$ is the control input to the system, and $B$ is an operator from $M$ into $H$. The problem is to find the control $u(t)$ that minimizes a cost functional $J(u)$ (similar to (3.1)), subject to the constraint (4.1). The resulting LQR problem can be more difficult to solve than the problem of Section 3, depending on the form of equation (4.1). For completeness, in this section, we present a particular problem and an unsuccessful approach we tried, discuss the difficulties encountered, and outline possible methods to overcome these problems.
Figure 1. The simply supported beam. Applied force is u(t)*b(x), where b(x) is a given function. In practice, b can be a continuous function along the beam or a force applied at one or more points.
4.1 Control of a Simply Supported Beam
An example of a physical problem whose behavior is modeled by (4.1) is illustrated in Figure 1, showing control of a simply supported beam. In this example we have $A = \partial^4/\partial x^4$ and $D = 2\xi A^{1/2} = -2\xi\,\partial^2/\partial x^2$. The damping parameter $\xi$ is generally quite small, and values around $\xi = 0.05$ are typical.
A standard approach for control problems with a constraint equation that is second-order in time is to rewrite it as a system of first-order equations, so that the method of the previous section can be applied. Let
$$w(x,t) = \begin{pmatrix} v(x,t) \\ \dot{v}(x,t) \end{pmatrix}.$$
Then (4.1) becomes
$$\dot{w}(x,t) = \mathcal{A}w(x,t) + \mathcal{B}u(t), \tag{4.2}$$
where
$$\mathcal{A} = \begin{pmatrix} 0 & I \\ -A & -D \end{pmatrix} \qquad\text{and}\qquad \mathcal{B} = \begin{pmatrix} 0 \\ B \end{pmatrix}.$$
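For the beam, $A$ has eigenfunctions $\sin(n\pi x)$ with eigenvalues $(n\pi)^4$, so each mode of the first-order system (4.2) reduces to a 2x2 block. The following Python fragment (an illustration of ours, not from the thesis) builds that block and checks the light damping produced by $\xi = 0.05$:

```python
import numpy as np

def mode_block(n, xi=0.05):
    """2x2 block of the first-order operator in (4.2) restricted to the n-th
    sine mode: stiffness eigenvalue (n*pi)^4, damping eigenvalue 2*xi*(n*pi)^2."""
    lam = (n * np.pi)**4
    d = 2.0 * xi * np.sqrt(lam)
    return np.array([[0.0, 1.0],
                     [-lam, -d]])
```

Its eigenvalues are $-\xi\omega \pm i\omega\sqrt{1-\xi^2}$ with $\omega = (n\pi)^2$, so trajectories oscillate at nearly the undamped frequency and decay slowly; this is one reason the control problem is nontrivial.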
That is, our goal is to find a control $u(t)$ that minimizes a quadratic cost functional of the form
$$J(u) = \int_0^\infty \big\{ (Qw, w) + u^2(t) \big\}\,dt, \tag{4.3}$$
subject to the constraint (4.1), where $Q$ is nonnegative and self-adjoint and may be written as
$$Q = \begin{pmatrix} Q_1 & Q_0 \\ Q_0^* & Q_2 \end{pmatrix}.$$
By the Pontryagin maximum principle, we get the optimal control as
$$u(t) = -\mathcal{B}^*Pw(x,t),$$
with $P$ the unique nonnegative solution to the following operator Riccati equation:
$$\mathcal{A}^*P + P\mathcal{A} - P\mathcal{B}\mathcal{B}^*P + Q = 0. \tag{4.4}$$
Many articles (see [4]-[9]) focus on approximation techniques to solve the LQR problem (4.1) and (4.4). Let
$$P = \begin{pmatrix} P_1 & P_0 \\ P_0^* & P_2 \end{pmatrix},$$
and assume that $P_1$, $P_2$ and $P_0$ are integral operators with kernels $p_1(x,y)$, $p_2(x,y)$ and $p_0(x,y)$, respectively, that $B$ is defined as in Section 3, and that $Q_1$, $Q_2$ and $Q_0$ are integral operators with kernels $q_1(x,y)$, $q_2(x,y)$ and $q_0(x,y)$, respectively.
Our approach was to transform the Riccati equation into a system of PDEs, using the method of Section 3, resulting in the following system for $p_0$, $p_1$ and $p_2$:
$$\begin{aligned}
p^*_{0xxxx} + p_{0yyyy} + B_x(p_0^*)\,B_y(p_0) &= q_1 \\
-p_1 + p_{2xxxx} - 2\xi\,p_{0yy} + B_x(p_2)\,B_y(p_0) &= q_0 \\
-p_1 + p_{2yyyy} - 2\xi\,p^*_{0xx} + B_x(p_0^*)\,B_y(p_2) &= q_0^* \\
-(p_0 + p_0^*) - 2\xi\,(p_{2xx} + p_{2yy}) + B_x(p_2)\,B_y(p_2) &= q_2
\end{aligned} \tag{4.5}$$
We can eliminate $p_1$ in (4.5) to get:
$$\begin{aligned}
p^*_{0xxxx} + p_{0yyyy} + B_x(p_0^*)\,B_y(p_0) &= q_1 \\
2\xi\,(p^*_{0xx} - p_{0yy}) + (p_{2xxxx} - p_{2yyyy}) + B_x(p_2)\,B_y(p_0) - B_x(p_0^*)\,B_y(p_2) &= q_0 - q_0^* \\
-(p_0 + p_0^*) - 2\xi\,(p_{2xx} + p_{2yy}) + B_x(p_2)\,B_y(p_2) &= q_2
\end{aligned} \tag{4.6}$$
with the boundary conditions
$$p_0\big|_{y=0} = p_0\big|_{y=1} = p_{0yy}\big|_{y=0} = p_{0yy}\big|_{y=1} = 0, \qquad
p_2\big|_{\partial\Omega} = p_{2xx}\big|_{\partial\Omega} = p_{2yy}\big|_{\partial\Omega} = 0. \tag{4.7}$$
Here $p_0^*$ is the kernel of $P_0^*$, so that we have
$$p_0^*(x,y) = p_0(y,x) \quad\text{for all } (x,y) \in [0,1]\times[0,1].$$
For a small mesh size h = 1/4, the solution to (4.6) and (4.7) is the same as that from the standard approximation techniques applied directly to the discrete form of (4.4), indicating that our method for transforming the LQR problem into system (4.6) with boundary conditions (4.7) is essentially correct. However, we encountered difficulties in formulating a multigrid method similar to that used in the previous section that converged for this problem with h smaller than 0.25.
4.2 Multigrid Difficulties
For a multigrid solver, we tried a modified application of the FAS approach described in Section 2. A relaxation scheme was devised in which the points $(x_i, y_j)$ and $(x_j, y_i)$ were relaxed simultaneously (because of the equation coupling due to the "adjoint" terms). After tests gave rather strong divergence for small h, we examined the system and found that the problem was with the nonelliptic nature of the system (4.6). This was somewhat hidden by the presence of the adjoint terms (which are not typical in PDEs). The nature of the equations is more easily seen if we rewrite (4.6), ignoring the nonlinear terms, letting $s = (p_0 + p_0^*)/2$ and $t = (p_0 - p_0^*)/2$, and noting that $p_2 = p_2^*$:
$$\begin{aligned}
(\partial_x^4 + \partial_y^4)\,s - (\partial_x^4 - \partial_y^4)\,t &= q_1 \\
2\xi\,(\partial_x^2 - \partial_y^2)\,s - 2\xi\,(\partial_x^2 + \partial_y^2)\,t + (\partial_x^4 - \partial_y^4)\,p_2 &= q_0 - q_0^* \\
-2s - 2\xi\,(\partial_x^2 + \partial_y^2)\,p_2 &= q_2
\end{aligned} \tag{4.8}$$
The operator $\partial_x^2 - \partial_y^2$ is nonelliptic, and is contained in the principal part of the full system operator. Thus, there are oscillatory components that are not smoothed by usual relaxation methods. In addition, when discretized, $\partial_x^2 - \partial_y^2$ gives a zero diagonal coefficient, so that, for small enough h, off-diagonal coefficients dominate and relaxation diverges. Thus our initial multigrid approach was not successful.
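The zero-diagonal observation is easy to verify numerically; the sketch below (our own, with assumed indexing) assembles the 5-point discretization of $\partial_x^2 - \partial_y^2$ and checks both the vanishing diagonal and that a grid function depending only on $i + j$ is annihilated in the interior:

```python
import numpy as np

def wave_op(n):
    """Matrix of the 5-point discretization of d2/dx2 - d2/dy2 on the
    (n-1) x (n-1) interior grid, h = 1/n, Dirichlet boundary."""
    h2 = (1.0 / n)**2
    m = n - 1
    A = np.zeros((m*m, m*m))
    idx = lambda i, j: (i - 1) * m + (j - 1)
    for i in range(1, n):
        for j in range(1, n):
            r = idx(i, j)
            # the diagonal contributions -2/h2 (from x) and +2/h2 (from y) cancel
            if i > 1:     A[r, idx(i-1, j)] += 1.0 / h2
            if i < n-1:   A[r, idx(i+1, j)] += 1.0 / h2
            if j > 1:     A[r, idx(i, j-1)] -= 1.0 / h2
            if j < n-1:   A[r, idx(i, j+1)] -= 1.0 / h2
    return A
```

With a zero diagonal, point Jacobi or Gauss-Seidel relaxation is not even defined for this part of the operator, which is the divergence mechanism described above.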
4.3 Possible Multigrid Approaches
Here, we outline several possible approaches to the problem, which we were unable to explore due to time constraints. These include modifications to the multigrid algorithm, as well as possible changes to the problem formulation itself.
The first idea is to deal with the main difficulty of the nonelliptic operator $\partial_x^2 - \partial_y^2$. This is basically the wave equation operator. Null space components of this operator (ignoring boundary conditions) are those which are constant along the characteristic directions x + y = c or x - y = c. The values along adjacent lines are arbitrary (and thus nonsmooth). A global relaxation step may help, in which constants are added along each of these lines, so that some combination of the equations is satisfied. Other types of relaxation can also be explored. An alternative to eliminating problem components by relaxation is to change interpolation and the coarse grid problem so that these components are contained in the range of interpolation. This could be done by using semicoarsening in a diagonal direction.
Another possibility that may help, although it will not change the character of the system, is to change the formulation of the first-order system. The method shown earlier for reducing the order of the time derivative appears to have been chosen because of its simplicity, and the fact that the first-order formulation fits well with existing algebraic methods for solution of the control problem. The form of the resulting system does not seem to be especially important. In our case, a different formulation may make the system easier to deal with (although it will probably not change its essential character). For example, we can take
$$w(x,t) = \begin{pmatrix} v(x,t) \\ A^{-1/2}\dot{v}(x,t) \end{pmatrix}.$$
Then (4.1) becomes
$$\dot{w}(x,t) = \mathcal{A}w(x,t) + \mathcal{B}u(t), \tag{4.9}$$
where
$$\mathcal{A} = \begin{pmatrix} 0 & A^{1/2} \\ -A^{1/2} & -D \end{pmatrix} \qquad\text{and}\qquad \mathcal{B} = \begin{pmatrix} 0 \\ A^{-1/2}B \end{pmatrix}.$$
Here, note that the order of the operators has been reduced, so that only second- and lower-order operators appear. As stated above, this will not change the basic character of the system, and the need to deal with the "wave equation" operator, but it may simplify the resulting multigrid algorithm.
More time and work is needed to construct an effective multigrid algorithm for this problem.
5. Conclusions
We have formulated a differential/integral equation for the solution of a differential Riccati equation and developed two efficient multigrid solvers for this problem, showing that our basic approach is sound. A second type of problem presented some difficulties, and we have outlined some approaches that may help overcome these and develop an efficient multigrid solver for more general Riccati equations. It is important to note that the problems we encountered were not with the Riccati equation nor with our formulation of a system of PDEs, but with the nature of the original time-dependent constraint. The governing equation, essentially a higher-order wave-type equation, was nonelliptic, and this character was preserved in the resulting system. If the original equation can be effectively solved by multigrid methods, then we should also be able to develop an efficient multigrid solver for the resulting control problem.
References
[1] J.L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, Berlin, 1971.
[2] P.K.C. Wang, Control of distributed parameter systems, in Advances in Control Systems, vol. 1, Academic Press, New York, 1964.
[3] V. Komkov, Formulation of Pontryagin's maximality principle in a problem of structural mechanics, Inter. J. Control, 17 (1973), pp. 455-463.
[4] J.S. Gibson and A. Adamian, Approximation theory for linear quadratic Gaussian optimal control of flexible structures, SIAM J. Contr. & Optim., 29 (1991), pp. 1-37.
[5] J.S. Gibson, Linear-quadratic optimal control of hereditary differential systems: Infinite dimensional Riccati equations and numerical approximations, SIAM J. Contr. & Optim., 21 (1983), pp. 95-115.
[6] J.S. Gibson, An analysis of optimal control regulation: Convergence and stability, SIAM J. Contr. & Optim., 19 (1981), pp. 686-707.
[7] H.T. Banks, I.G. Rosen, and K. Ito, A spline based technique for computing Riccati operators and feedback controls in regulator problems for delay equations, SIAM J. Sci. Statist. Comp., 5 (1984), pp. 830-855.
[8] H.T. Banks and K. Kunisch, The linear regulator problem for parabolic systems, SIAM J. Contr. & Optim., 22 (1984), pp. 684-698.
[9] M.J. Balas, Trends in large space structure control theory: Fondest hopes, wildest dreams, IEEE Trans. on Auto. Contr., AC-27 (1982), pp. 522-535.
[10] M.J. Balas, Feedback control of flexible systems, IEEE Trans. on Auto. Contr., AC-23 (1978), pp. 673-679.
[11] V.M. Alekseev, V.M. Tikhomirov and S.V. Fomin, Optimal Control, Consultants Bureau, New York, 1987.
[12] J. Zabczyk, Mathematical Control Theory: An Introduction, Birkhauser, Boston, 1992.
[13] S.F. McCormick, Multilevel Projection Methods for Partial Differential Equations, SIAM, Philadelphia, 1992.
[14] A. Brandt, Multi-level adaptive solutions to boundary value problems, Math. Comp., 31 (1977), pp. 333-390.
44

PAGE 1
MULTIGRID METHODS FOR LINEAR QUADRATIC REGULATOR (LQR) PROBLEMS OF INFINITE DIMENSIONAL SYSTEMS by Tiejun Li ,:I.Sc., Peking University Peking, China, 1959 A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Master of Science Applied Mathematics 1995
PAGE 2
This thesis for the Master of Science degree by Tiejun Li has been approved by Roland Sweet
PAGE 3
Li, Tiejun (M.S., Applied Mathematics) Multigrid Methods for Linear Quadratic Regulator (LQR) Problems of Infinite Dimensional Systems Thesis directed by Associate Professor Adjunct John W. Ruge ABSTRACT Partly due to great interest in the construction and deployment of large space structures, problems in the analysis and control of flexible mechanical systems have become increasingly important over the last several decades. One of the main problems encountered is the determination of the applied force necessary to move a structure quickly and accurately into a given position, with minimal force and excess motion. This can be posed more precisely as a linear quadratic regulator (LQR) problem: Find the force that, together with the system state, satisfies the governing time dependent partial differential equation and minimizes an appropri ate quadratic cost functional. The minimizing force is a linear function of the state, with the linear state feedback operator satisfying an associated Riccati equation. During the last ten years, a much effort has gone into developing solution methods for LQR problems coming from these infinite dimensional (continuous) systems. The standard approach is to discretize in space, obtaining a finite dimensional system and a matrix Riccati equation, for which algebraic solution methods have been devised. Such methods can be quite expensive, and do not take advantage of the continuous nature of the problem. Our approach is to reformulate the continu ous Riccati equation as a nonlinear integraldifferential equation, to which we apply a multigrid solution method. For a problem in which the constraint equation is firstorder in time, we develop and test two multigrid solvers, one based on iii
PAGE 4
multilevel projection methods (PML), and the other on the full approximation storage (FAS) scheme. Both are shown to work with typical optimal multigrid effi ciency. We also look at a problem that is secondorder in time, point out some difficulties we encountered, and present possible solutions. This abstract accurately represents the content of the candidate's thesis. I recommend its publication. IV Signe"""d John W. Ruge
PAGE 5
CONTENTS List of Tables .................................................................................. vi Acknowledgements ............................................................................ vii Chapter 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 2. Multigrid Methods ........................................................................ 3 2.1 Basic Multigrid Ideas ................................................................. 3 2.2 Multilevel Projection (PML) Methods ............................................ 5 2.3 Full Approximation Storage (FAS) Scheme .................................... 11 2.4 The Full Multigrid (FMG) Algorithm ........................................... 12 3. The Control System with First Order Time Derivative .......................... 15 3.1 The Operator Riccati Equation and MG Solvers .............................. 15 3.2 Multilevel Projection (PML) Solver ............................................. 18 3.2.1 The Discretization and Equations ................................................ 18 3 .2.2 Relaxation ............................................................................. 24 3.2.3 Numerical Results ................................................................... 25 3.3 Full Approximation Scheme (FAS) Solver ..................................... 30 3.3.1 The Discretization, Equations and Relaxation ................................. 30 3.3.2 Numerical Results ................................................................... 32 4. The Control System with Second Order Time Derivative ....................... 36 4.1 Control of a Simply Supported Beam ........................................... 37 4.2 Multigrid Difficulties ............................................................... 39 4.3 Possible Multigrid Approaches ................................................... 40 5. 
Conclusions ..................................................................... .......... 42 References ..................................................................................... 43 v
TABLES

Table
3.1. PML V(1,0)-cycle Results ........ 27
3.2. PML V(1,1)-cycle Results ........ 27
3.3. Comparison of Efficiency for PML V-cycles ........ 28
3.4. Discretization Error of PML Solutions ........ 28
3.5. PML FMG (1 V(1,0) cycle/level) ........ 28
3.6. PML FMG (2 V(1,0) cycles/level) ........ 29
3.7. PML FMG (1 V(1,1) cycle/level) ........ 29
3.8. PML FMG (2 V(1,1) cycles/level) ........ 29
3.9. FAS V(1,0)-cycle Results ........ 33
3.10. FAS V(1,1)-cycle Results ........ 33
3.11. Comparison of Efficiency for FAS V-cycles ........ 34
3.12. Discretization Error of FAS Solutions ........ 34
3.13. FAS FMG (1 V(1,0) cycle/level) ........ 34
3.14. FAS FMG (2 V(1,0) cycles/level) ........ 35
3.15. FAS FMG (1 V(1,1) cycle/level) ........ 35
3.16. FAS FMG (2 V(1,1) cycles/level) ........ 35
ACKNOWLEDGEMENTS

I am deeply indebted to Professor J. Ruge for his great help with my grasp of the multigrid method, countless discussions, and much significant guidance. I would also like to thank Professor S. McCormick, the founder of PML, for his suggestion to apply PML to solving the operator Riccati equations arising from optimal control problems of infinite dimensional control systems. I am also indebted to the AFOSR for their support of this work under Contract 910156.
1. Introduction

For at least two decades, due to the construction of larger and more flexible engineering structures, especially in space applications, the analysis and design of flexible mechanical systems have increased in importance. Because of increased flexibility, it is especially important to know how to effectively control the motion of these structures. Thus, there has been a corresponding increased interest in extending optimal control theory, originally developed for finite dimensional problems, to such infinite dimensional (i.e., continuous) problems (see Lions [1] and Wang [2]). Here, the behavior of the mechanical system (specified by the state variable) as a function of the system input (the control) is typically modeled by a time-dependent partial differential equation. In turn, we assume that there is feedback from the state to the control. The problem is defined as finding the control (or, more precisely, the function defining the control in terms of the state) so that the state/control pair satisfies the differential equation and, in addition, minimizes a given functional, which generally involves system energy and work done. Much of the work on the optimal control of these systems is based on Pontryagin's maximum principle, and Komkov [3] has successfully applied this principle to the optimal control of transversely vibrating beams and plates.

During the last 10 years, a great deal of effort has been focused on approximation schemes for the solution of linear quadratic regulator (LQR) problems of infinite dimensional systems [4-10]. In such problems, the idea is to determine a force, or control, applied to a system in order to minimize a functional involving the energy of the system and the applied force. The main idea in the approximation scheme is to discretize the structure, which produces a finite dimensional system and an associated finite dimensional Riccati equation to be solved. A number of algorithms have been developed for the solution of these matrix Riccati equations. However, these are mainly algebraic methods, and do not exploit the underlying continuous nature of the problem. Convergence of these methods can be slow, and, like many matrix methods, as the problem size is increased (which is necessary for improving accuracy), solution quickly becomes expensive and impractical.

Pontryagin's maximum principle gives the form of the optimal control for the infinite dimensional system as u(t) = C P v(x,t), where u is the system input or control, the operator C is determined by the system, and v(x,t) is the state. The operator P, to be determined, then satisfies the operator Riccati equation. In many cases, P can be written as an integral operator, which makes it possible to transform the Riccati equation into an integral-partial differential equation for the kernel of P (see Section 2 for details). Multigrid methods, which are often used for the fast solution of partial differential equations, can then be applied.

The purpose of this thesis is to develop multigrid methods for the fast solution of some classes of LQR problems. The main idea is to transform the operator Riccati equation coming from a continuous LQR problem into a partial differential equation (PDE) and apply multigrid methods to the solution of the resulting PDE. In Section 2, we introduce multigrid methods and describe two MG approaches we use for the solution of the nonlinear PDEs. In Section 3, we describe LQR problems where the equations describing the system to be controlled are first order in time. Solution methods are described, and numerical results are given. In Section 4, we discuss problems for which the equations are second order in time. Finally, conclusions and remarks on future work are given in Section 5.
2. Multigrid Methods

Multigrid (MG) methods provide optimal solution methods for a large class of PDEs, in the sense that a problem can be solved to the level of discretization error in a time proportional to the number of points on the finest grid (see [13] and [14]). They can be applied in cases involving nonlinearities, varying coefficients, and irregular domains and grids in little more time than is needed for a corresponding linearized, constant-coefficient problem. For our problem, we are particularly interested in the ability of MG to solve nonlinear problems. We consider two approaches: the full approximation storage scheme (FAS) developed by Brandt [14], and the multilevel projection method (PML) developed by McCormick [13]. In this section, we introduce the basic multigrid algorithm, describe the FAS and PML approaches for nonlinear problems, and present the full multigrid (FMG) process for solving the finest level problem to within discretization error.

2.1 Basic Multigrid Ideas

The speed of convergence of an iterative method depends on the nature of the error in the solution approximation. Some error components converge quickly, but the overall speed is determined by those components which are slowest to converge. For example, Gauss-Seidel iteration, when applied to a discretized Poisson equation, reduces high frequency error (error that oscillates on the scale of the grid) very quickly. The smoother the error, the slower relaxation converges. This is because Gauss-Seidel acts by averaging the error locally, so that the high frequency part is quickly damped. Thus, Gauss-Seidel by itself is not an efficient solver for this problem. However, smooth error can be well represented on coarser grids, and an equation for this error can be formulated and solved on
coarser levels (where processing is cheaper). The solution there can be interpolated back to the finer level and used to correct the solution approximation. A recursive application of relaxation and this coarse grid correction is called a multigrid cycle, and can reduce all error components by some small factor independent of the size of the fine grid. To illustrate this, suppose we want to solve a given partial differential equation:

    Lu = f   on   Ω ⊂ R^d.                                             (2.1)

In order to apply a multigrid solution method to this problem, we need to define the following components:

    Grids Ω^1 ⊂ Ω^2 ⊂ ... ⊂ Ω^M, and spaces of grid functions S^1, ..., S^M.
    Discrete operators L^k : S^k → S^k, k = 1, 2, ..., M.
    Relaxation on each level.
    Interpolation operators I_{k-1}^k : S^{k-1} → S^k, k = 2, 3, ..., M.
    Restriction operators I_k^{k-1} : S^k → S^{k-1}, k = 2, 3, ..., M.

The coarsest grid typically contains only a few points, so that the cost of direct solution of a problem there is negligible, and uniform refinement is used to obtain the finer grids. The operators on all levels are usually finite difference or finite element discretizations of the continuous operator. The standard interpolation, which takes corrections from a coarser to a finer level, is piecewise bilinear. Restriction, used for transferring residuals to the coarse grid, is a weighted average called full weighting. Since all error components must be reduced by either relaxation or coarse grid correction, any error that cannot be approximated on coarser levels (i.e., high-frequency error) must be reduced effectively by relaxation if the cycle is to work well. Thus the proper choice of relaxation is important. Gauss-Seidel relaxation works well as a smoother for many problems.
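To see the smoothing effect concretely, the following small sketch (an illustrative 1D model problem of ours, not code from the thesis) applies Gauss-Seidel to the discrete problem −u'' = 0 with zero boundary values; since the exact solution is zero, the iterate itself is the error:

```python
import numpy as np

def gauss_seidel(u, sweeps):
    # Lexicographic Gauss-Seidel for -u'' = 0, u(0) = u(1) = 0:
    # each unknown is replaced by the average of its neighbors.
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1])
    return u

n = 64
x = np.linspace(0.0, 1.0, n + 1)
smooth_error = gauss_seidel(np.sin(np.pi * x), 5)      # low-frequency mode
rough_error = gauss_seidel(np.sin(31 * np.pi * x), 5)  # high-frequency mode

# The oscillatory component is damped sharply; the smooth one barely changes.
print(np.max(np.abs(rough_error)), np.max(np.abs(smooth_error)))
```

Five sweeps nearly annihilate the oscillatory mode while leaving the smooth mode essentially untouched, which is exactly the behavior the coarse grid correction is designed to complement.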
With these components, we define a V(ν1,ν2)-cycle:

    If k = 1, solve the problem exactly. Otherwise:
        Relax ν1 times on L^k u^k = f^k.
        Perform coarse grid correction:
            Define the coarser level problem (i.e., set f^{k-1} and u^{k-1}).
            Perform 1 V(ν1,ν2)-cycle on the problem L^{k-1} u^{k-1} = f^{k-1}.
            Correct the level k approximation: u^k ← u^k + I_{k-1}^k u^{k-1}.
        Relax ν2 times on L^k u^k = f^k.

Here, u^{k-1} is the coarse grid approximation to the level k error. In linear problems, the coarse grid right-hand side is the restricted fine grid residual. In nonlinear problems, as we show later, the form of the correction is somewhat different.

2.2 Multilevel Projection (PML) Methods

Multilevel projection (PML) methods were developed by McCormick [13] and can be applied to functional minimization problems as well as some nonlinear equations. The basic idea is straightforward, and simplifies many choices of multigrid components. The method generally starts with a choice of a sequence of nested finite element spaces. The cycle can then be viewed as an approximate sequential elimination of error in each of these subspaces, by projecting error onto each subspace and using that to correct the solution approximation. A big advantage of this method is that relaxation on all levels is performed as if it were done on the finest level, which makes implementation and troubleshooting simpler (at least conceptually), and, under some conditions, can guarantee convergence of PML. We begin with an abstract description of a cycle, then consider a practical implementation.

Suppose we have a continuous (possibly nonlinear) problem

    F(p) = 0   on   Ω ⊂ R^d,                                           (2.2)

defined on some space S of functions on domain Ω. We choose a sequence of finite dimensional subspaces S^1 ⊂ S^2 ⊂ ... ⊂ S^M ⊂ S (typically finite elements) with increasingly finer resolution, with S^M the desired solution space. (2.2) will generally not have a solution in S^M. Instead, our problem is to find p^M in S^M so that

    (F(p^M), q^M) = 0   for all q^M ∈ S^M,                             (2.3)

where (·,·) is the standard L2 inner product. Thus, the L2-orthogonal projection of F(p^M) onto S^M should be zero. (It should be noted that, as in many finite element applications, integration by parts is used to make sense of (2.3) for the spaces S^k.)

The spaces S^k for k < M are used in two ways. First, the solution to the level k problem corresponding to (2.3) is used as an initial approximation to the level k+1 solution. This process is known as full multigrid (FMG), and will be described later. Second, these spaces are used in a multilevel cycle, similar to that described in the previous section and defined here formally:

PML V(ν1,ν2)-cycle

    For k = M, M-1, ..., 2:
        Relax ν1 times on (F(Σ_{l=k+1}^M p^l + p^k), q^k) = 0 for all q^k ∈ S^k.
        Set p^{k-1} = 0.
    Solve (F(Σ_{l=2}^M p^l + p^1), q^1) = 0 for all q^1 ∈ S^1.
    For k = 2, 3, ..., M:
        Set p^k ← p^k + p^{k-1}.
        Relax ν2 times on (F(Σ_{l=k+1}^M p^l + p^k), q^k) = 0 for all q^k ∈ S^k.
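As a concrete illustration of multigrid cycling, here is a minimal linear V-cycle for the 1D model problem −u'' = f with zero boundary values, using the conventional component choices named in the text (Gauss-Seidel smoothing, full weighting restriction, linear interpolation). This is an illustrative sketch of ours, not code from the thesis:

```python
import numpy as np

def relax(u, f, h, sweeps):
    # Gauss-Seidel on the standard 3-point discretization of -u'' = f
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2.0 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def restrict(r):
    # full weighting: rc[i] = (r[2i-1] + 2 r[2i] + r[2i+1]) / 4
    rc = np.zeros((len(r) - 1) // 2 + 1)
    rc[1:-1] = 0.25 * (r[1:-2:2] + 2.0 * r[2:-1:2] + r[3::2])
    return rc

def interpolate(uc):
    # piecewise linear interpolation to the next finer grid
    u = np.zeros(2 * (len(uc) - 1) + 1)
    u[::2] = uc
    u[1::2] = 0.5 * (uc[:-1] + uc[1:])
    return u

def v_cycle(u, f, h, nu1=1, nu2=1):
    if len(u) == 3:                    # coarsest grid: one unknown, solve exactly
        u[1] = 0.5 * h * h * f[1]
        return u
    relax(u, f, h, nu1)                # pre-smoothing
    rc = restrict(residual(u, f, h))   # coarse right-hand side = restricted residual
    ec = v_cycle(np.zeros_like(rc), rc, 2.0 * h, nu1, nu2)
    u += interpolate(ec)               # coarse grid correction
    relax(u, f, h, nu2)                # post-smoothing
    return u

n = 64
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
f = np.pi ** 2 * np.sin(np.pi * x)     # manufactured so that u = sin(pi x)
u = np.zeros(n + 1)
for _ in range(10):
    v_cycle(u, f, h)
```

After a few cycles the algebraic error is far below the O(h^2) discretization error, with per-cycle reduction independent of n, which is the "optimal solution" property described above.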
In this abstract setting, there are no grids defined, but the subspaces can be identified with levels of resolution. The main idea is that, for each k, a suitable relaxation process applied to the level k problem effectively reduces the error components in S^k that are not represented in S^{k-1}. A full cycle should then reduce all error components. Here, note that p^M could be updated directly immediately after each relaxation sweep on each level, since the form of the solution, written in this way, is not unique. However, this form is completely equivalent, and more closely matches the multigrid cycle described earlier.

This cycle, as written, cannot be directly implemented, but serves as a guide for constructing a practical multilevel algorithm. In obtaining our algorithm, we have two aims: first, the algorithm should be completely equivalent to the PML cycle above; and second, the work performed on each level should be proportional to the dimension of that level's subspace. The reason for the first is that, since all processes on all levels are performed as if in the continuous space, there is a common goal for all computation. With the exception of relaxation, this approach dictates the choice of all the necessary multigrid components, and some problems that can arise in other multigrid settings can be avoided or more easily diagnosed and fixed. This first aim could be attained relatively easily and simply by working on the finest level only. However, this leads to an inefficient algorithm, since the amount of work needed on each level is then proportional to the size of S^M, instead of decreasing on coarser levels. This is the reason for the second aim. The coarser level problems should be formulated so that, for each k, the level k relaxation and coarse grid correction never need appeal to finer level quantities or processes. This is not possible for general nonlinear operators. For the problems we consider, which contain a nonlinearity of special form, such a formulation can be done.

In this section, we mainly consider the first task of realizing the algorithm in an equivalent form. The second question is answered after we introduce our operator.
The first step in obtaining a practical algorithm is to choose bases for the spaces, which requires the introduction of some operators and notation. Let {φ_i^k} be a basis (usually a nodal basis) for S^k, k = 1, 2, ..., M, and let n_k = dim(S^k). Then let G^k be the space of coefficient vectors with respect to this basis. For convenience, we use an overbar to denote the vector of coefficients corresponding to a function in S^k, and subscripts to denote the vector entries. That is, p̄^k ∈ G^k is the vector of coefficients corresponding to p^k ∈ S^k, so that

    p^k = Σ_{i=1}^{n_k} p̄_i^k φ_i^k.

Let {ē_i^k} be the usual coordinate basis for G^k, so that ē_i^k corresponds to φ_i^k, i = 1, 2, ..., n_k. Since S^{k-1} ⊂ S^k, for some constants w_ij the level k-1 basis functions can be expressed in terms of the level k basis functions as follows:

    φ_j^{k-1} = Σ_{i=1}^{n_k} w_ij φ_i^k,   j = 1, ..., n_{k-1}.

Thus, we can write

    p^{k-1} = Σ_j p̄_j^{k-1} φ_j^{k-1} = Σ_i ( Σ_j w_ij p̄_j^{k-1} ) φ_i^k.

This defines an interpolation in the coefficient spaces that corresponds to the natural embedding of p^{k-1} in S^k, with the matrix given by

    (I_{k-1}^k)_{ij} = w_ij,   i = 1, ..., n_k,   j = 1, ..., n_{k-1}.

Then we define

    I_k^l = I                                   if k = l (identity),
    I_k^l = I_{l-1}^l I_{l-2}^{l-1} ... I_k^{k+1}   if k < l (interpolation),
    I_k^l = (I_l^k)^T                           if k > l (restriction).
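For 1D piecewise linear hat functions, the weights w_ij are 1 at the coincident fine node and 1/2 at its two fine neighbors. The following sketch (helper name is ours, not the thesis's) builds the matrix I_{k-1}^k for interior nodes and checks that interpolating the nodal values of a coarse hat function reproduces it exactly on the fine grid, as the nesting S^{k-1} ⊂ S^k requires:

```python
import numpy as np

def interpolation_matrix(n_coarse):
    # Interior nodes only: n_coarse coarse unknowns, 2*n_coarse+1 fine unknowns.
    n_fine = 2 * n_coarse + 1
    I = np.zeros((n_fine, n_coarse))
    for j in range(n_coarse):
        i = 2 * j + 1          # fine index of the j-th coarse node
        I[i, j] = 1.0          # w_ij = 1 at the coincident node
        I[i - 1, j] = 0.5      # w_ij = 1/2 at the two neighbors
        I[i + 1, j] = 0.5
    return I

I = interpolation_matrix(3)    # coarse h = 1/4, fine h = 1/8
R = I.T                        # restriction is the transpose of interpolation

# Coarse hat centered at x = 1/2, i.e. the tent 1 - 2|x - 1/2|,
# sampled at the coarse interior nodes x = 1/4, 1/2, 3/4:
coarse_hat = np.array([0.5, 1.0, 0.5])
fine_vals = I @ coarse_hat
print(fine_vals)
```

The transpose R is exactly the restriction operator used in the variational (Galerkin) setting described below.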
Note that restriction is defined as the transpose of interpolation. It follows that

    I_k^l I_m^k = I_m^l   for m ≤ k ≤ l.

Now, using the basis functions, we can first write the finest level problem as a system of equations:

    F^M(p̄^M) = 0,                                                      (2.4)

where

    F_i^M(p̄^M) = (F(p^M), φ_i^M),   i = 1, ..., n_M.                    (2.5)

This is the system of discrete equations to be solved. Now we consider the coarser level equations, and proceed by induction. Suppose that we have a form for the level k equations that can be expressed only in level k quantities. That is,

    F^k(p̄^k) = (F_1^k(p̄^k), F_2^k(p̄^k), ..., F_{n_k}^k(p̄^k))^T = 0,    (2.6)

where

    F_i^k(p̄^k) = (F(Σ_{l=k+1}^M p^l + p^k), φ_i^k).                     (2.7)

This is true for k = M, where the sum over l is empty. For k < M, expanding the level k-1 basis functions in terms of the level k basis functions gives

    F_j^{k-1}(p̄^{k-1}) = Σ_{i=1}^{n_k} w_ij F_i^k(p̄^k + I_{k-1}^k p̄^{k-1}),   j = 1, ..., n_{k-1}.   (2.8)
This gives the full level k-1 system:

    F^{k-1}(p̄^{k-1}) = (I_{k-1}^k)^T F^k(p̄^k + I_{k-1}^k p̄^{k-1}) = 0.   (2.9)

Let (·,·)_k denote the usual l2 inner product on G^k. Then we can obtain an alternative form of (2.8) that we find useful later:

    F_i^{k-1}(p̄^{k-1}) = (F^k(p̄^k + I_{k-1}^k p̄^{k-1}), Σ_j w_ji ē_j^k)_k
                       = (F^k(p̄^k + I_{k-1}^k p̄^{k-1}), I_{k-1}^k ē_i^{k-1})_k
                       = (I_k^{k-1} F^k(p̄^k + I_{k-1}^k p̄^{k-1}), ē_i^{k-1})_{k-1}.   (2.10)

As noted above, it is convenient to absorb p̄^k into the definition of p̄^{k-1}. When dealing with linear operators, however, this would be confusing, and we keep this contribution to the equation separate. Thus, if F^k is a linear operator L^k, we define

    L^{k-1} = I_k^{k-1} L^k I_{k-1}^k,                                  (2.11)

and write instead

    (I_k^{k-1} L^k I_{k-1}^k p̄^{k-1} + I_k^{k-1} L^k p̄^k, ē_i^{k-1})_{k-1} = 0,   (2.12)

or, more simply,

    I_k^{k-1} L^k (p̄^k + I_{k-1}^k p̄^{k-1}) = L^{k-1} p̄^{k-1} + I_k^{k-1} L^k p̄^k.   (2.13)

Now, the form of F determines whether or not F^{k-1}(p̄^{k-1}) in (2.9) or (2.10) can be evaluated (during relaxation, for example) without interpolating p̄^{k-1} to the finer level and applying the operator there. We show that this is possible in our case.
For linear operators, this results in the usual finite element, or Galerkin, formulation. For example, if F(p) = Lp − f, where L is a linear differential operator, and the spaces chosen are standard piecewise bilinear functions on uniform grids, then the algorithm is usual multigrid where the operators are the standard finite element approximations to L, interpolation is bilinear, restriction is full weighting (the transpose of interpolation), and relaxation is pointwise Gauss-Seidel. In our case, we will derive the form of the discrete equations in a later section.

The relaxation chosen in PML depends on the character of the discrete equations. When F is nonlinear, a nonlinear version of Gauss-Seidel relaxation can be defined on each level. This is done by sweeping through the grid in some order, updating each unknown p̄_i^k so that F_i^k(p̄^k) = 0. Here, F_i^k is quadratic, so the exact solution is used.

2.3 Full Approximation Storage (FAS) Scheme

The original multigrid approach for nonlinear problems was developed by Brandt [14], and is known as the full approximation storage (FAS) scheme. MG for linear problems involves solving a coarse grid version of the finer grid residual equation to obtain an approximation for the error. With nonlinear problems, it is generally not possible to formulate a residual equation for the error. Instead, we use a coarse grid version of the full problem, with a change made to the right-hand side so that, at least at convergence, the coarse grid solution coincides with the fine grid solution on the coarse grid points. Assuming that the error is smooth, and that the solution to the modified coarse grid problem is a better approximation to the solution at coarse grid points, the difference between the coarse and fine grid approximations (taken at coarse grid points) should be a good approximation to the error. This difference is interpolated and used as the correction we seek. To make this more precise, suppose we have a nonlinear problem of the form
    F(p) = f,                                                          (2.14)

and have a sequence of grids and approximations F^k to F, for k = 1, 2, ..., M. Some relaxation such as nonlinear Gauss-Seidel is used, and the usual interpolation and restriction can be defined. The resulting cycle is defined as follows.

FAS V(ν1,ν2)-cycle for nonlinear problems:

    If k = 1, solve the problem exactly. Otherwise:
    1. Relax ν1 times on F^k(p^k) = f^k.
    2. Perform coarse grid correction:
        Set f^{k-1} = I_k^{k-1}(f^k − F^k(p^k)) + F^{k-1}(I_k^{k-1} p^k) and p^{k-1} = I_k^{k-1} p^k.
        Perform 1 FAS V(ν1,ν2)-cycle on the problem F^{k-1}(p^{k-1}) = f^{k-1}.
        Correct the level k approximation: p^k ← p^k + I_{k-1}^k (p^{k-1} − I_k^{k-1} p^k).
    3. Relax ν2 times on F^k(p^k) = f^k.

For linear problems, this is equivalent to the previous coarse grid correction, although it uses the coarse grid approximation to the full solution, rather than the error. It should be noted that we can use a different restriction for solution approximations than for residuals, and simple injection is often used.

2.4 The Full Multigrid (FMG) Algorithm

The cycle can reduce all error components by some fixed factor, but the number of cycles needed depends on the accuracy of the initial approximation and the final desired accuracy. Usually, it is unnecessary to reduce the algebraic error far below the level of discretization error, since the extra work involved does not help give a better solution to the continuous problem. The main idea with full multigrid
(FMG) is to obtain the initial approximation on each level by interpolating the final solution from the next coarser level, then doing only enough cycles there to reduce the algebraic error to below the level of discretization error on that level. An important point is that the initial interpolation to a new finer level must be of sufficiently high order (a tilde is used in step 3 of the algorithm below to denote that this is different from the standard interpolation used in cycling). Generally, for a second order problem, cubic interpolation must be used to ensure a good enough initial approximation. The FMG algorithm (also known as FMV, when V-cycles are used in step 3) is as follows:

Full Multigrid (FMG):

    (1) For k = M, M-1, ..., 2: set f^{k-1} = I_k^{k-1} f^k.
    (2) Solve the problem L^1 u^1 = f^1.
    (3) For k = 2, 3, ..., M:
            Set u^k = Ĩ_{k-1}^k u^{k-1}.
            Perform μ cycles on the level k problem.

Here, we want to see the error reduction needed on each level in order to have the final algebraic error in the solution below the level of discretization error. Let P denote the continuous solution, and P^k denote the exact solution to the problem discretized on level k. We make the following simplifying assumptions. We assume O(h²) discretization error, so that ‖P^k − P‖ = c h_k², where the constant c is independent of h; it follows that ‖P^{k-1} − P‖ = 4‖P^k − P‖. We also assume that convergence of P^k to P is nicely monotonic, so that ‖P^{k-1} − P‖ = ‖P^{k-1} − P^k‖ + ‖P^k − P‖. Now, let p₀^k denote the initial level k approximation, and p_f^k the final approximation after cycling. The goal on level k is to reduce the algebraic error (the error between the approximate solution and the exact level k solution) to below the level of discretization error. That is, we want ‖p_f^k − P^k‖ ≤ ‖P^k − P‖. If P^{k-1} is used as an initial approximation on level k (i.e., p₀^k = P^{k-1}), then, ignoring interpolation error,
under our assumptions, we get

    ‖p₀^k − P^k‖ = ‖P^{k-1} − P^k‖ = ‖P^{k-1} − P‖ − ‖P^k − P‖ = 3‖P^k − P‖.

Thus, we would need to reduce the error by a factor of 3 to get within the level of discretization error. In FMG, however, the level k-1 problem is not solved exactly. Instead, we take p₀^k = p_f^{k-1}. If the level k-1 problem is solved to the level of discretization error (i.e., ‖p_f^{k-1} − P^{k-1}‖ ≤ ‖P^{k-1} − P‖), then we have instead

    ‖p₀^k − P^k‖ = ‖p_f^{k-1} − P^k‖ ≤ ‖p_f^{k-1} − P^{k-1}‖ + ‖P^{k-1} − P^k‖
                 ≤ ‖P^{k-1} − P‖ + ‖P^{k-1} − P^k‖ = 4‖P^k − P‖ + 3‖P^k − P‖ = 7‖P^k − P‖.

Thus, this argument indicates that a reduction factor of 7 will suffice to stay below the level of discretization error on all levels, assuming that the initial problem on the coarsest grid is solved well enough. This can be used as a guide to determining the number of cycles needed on each level in the FMG process. Denoting this number by μ, and letting ρ be the asymptotic convergence factor of the V-cycle, we only need to find μ so that ρ^μ ≤ 1/7. This is only a guide, and, especially on coarser grids, convergence to the continuous solution may not behave like O(h²), so sometimes extra cycles on finer levels may be used to ensure that the level of discretization error is reached.
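Returning to the FAS scheme of Section 2.3, the modified coarse grid right-hand side can be made concrete on a simple model problem, −u'' + u² = f in 1D with zero boundary values (an illustrative example of ours, not the thesis's Riccati equation). Nonlinear Gauss-Seidel solves the scalar quadratic at each point, injection restricts the approximation, and full weighting restricts the residual:

```python
import numpy as np

def F(u, h):
    """Nonlinear operator (-u'' + u^2) at the grid points, zero on the boundary."""
    Fu = np.zeros_like(u)
    Fu[1:-1] = (2.0 * u[1:-1] - u[:-2] - u[2:]) / (h * h) + u[1:-1] ** 2
    return Fu

def relax(u, f, h, sweeps=1):
    # nonlinear Gauss-Seidel: at each point solve
    #   u_i^2 + (2/h^2) u_i - (u_{i-1} + u_{i+1})/h^2 - f_i = 0  exactly
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            b = 2.0 / (h * h)
            c = -(u[i - 1] + u[i + 1]) / (h * h) - f[i]
            u[i] = 0.5 * (-b + np.sqrt(b * b - 4.0 * c))
    return u

def full_weight(r):
    rc = np.zeros((len(r) - 1) // 2 + 1)
    rc[1:-1] = 0.25 * (r[1:-2:2] + 2.0 * r[2:-1:2] + r[3::2])
    return rc

def interpolate(vc):
    v = np.zeros(2 * (len(vc) - 1) + 1)
    v[::2] = vc
    v[1::2] = 0.5 * (vc[:-1] + vc[1:])
    return v

def fas_vcycle(u, f, h):
    if len(u) == 3:
        return relax(u, f, h)             # one unknown: the sweep is exact
    relax(u, f, h)
    uc = u[::2].copy()                    # injection of the current approximation
    fc = full_weight(f - F(u, h)) + F(uc, 2.0 * h)   # modified coarse right-hand side
    vc = fas_vcycle(uc.copy(), fc, 2.0 * h)
    u += interpolate(vc - uc)             # FAS coarse grid correction
    relax(u, f, h)
    return u

n = 64
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
f = np.pi ** 2 * np.sin(np.pi * x) + np.sin(np.pi * x) ** 2   # so u = sin(pi x)
u = np.zeros(n + 1)
for _ in range(10):
    fas_vcycle(u, f, h)
```

At convergence the coarse solution coincides with the injected fine solution, so the correction vanishes, exactly as the construction of f^{k-1} intends.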
3. The Control System with First Order Time Derivative

3.1 The Operator Riccati Equation and PDE

In this section, we consider the following LQR problem. Let the function v(x,t) be the state of the system and u(t) be a function describing the control applied to the system. Then we want to minimize the quadratic cost functional

    J(u) = ∫₀^∞ { (Qv, v) + r u²(t) } dt,                              (3.1)

subject to the constraint that u and v satisfy

    v_t(x,t) = A v(x,t) + B u(t),   x ∈ [0,1], t ≥ 0,
    v(0,t) = v(1,t) = 0,            t ≥ 0,                             (3.2)
    v(x,0) = v₀(x),                 x ∈ [0,1],

where the state v(x,t) is in the Hilbert space H = {h(x) : h(0) = h(1) = 0 and h, h' ∈ L²(0,1)}; (·,·) is the usual L² inner product, and not the inner product on H; r is a positive real number; A and Q are operators on H, and B is an operator from R to H. Let Ω = [0,1]×[0,1]. We assume that Q is a nonnegative symmetric integral operator with kernel q(x,y), i.e., for any h ∈ H,

    Qh = ∫₀¹ q(x,y) h(y) dy,

and

    (Qh, h) ≥ 0,   q(x,y) = q(y,x)   for all (x,y) ∈ Ω.
For concreteness, we consider the case A = ∂²/∂x², so that the system is described by the heat equation. Let B : R → H be defined in terms of a given function b(x) ∈ H:

    Ba = a b(x).

For the LQR problem (3.1) and (3.2), by the Pontryagin maximum principle [12], the optimal control u(t) which minimizes J has the form of state feedback

    u(t) = −r⁻¹ B* P v(x,t),                                           (3.3)

where the unknown P : H → H is the unique self-adjoint nonnegative solution to the following operator Riccati equation:

    A*P + PA − r⁻¹ P B B* P + Q = 0.                                   (3.4)

Here, A* and B* denote the adjoints of A and B, respectively. From the definitions of A and B, we have A* = A, and B* : H → R is defined by

    B*h = ∫₀¹ b(x) h(x) dx.

The goal is to find the form of state feedback (3.3), and for this, P must be determined. We assume that P can be expressed as an integral operator

    Ph = ∫₀¹ p(x,y) h(y) dy

with kernel p : [0,1]×[0,1] → R, which we must find. The facts that P is nonnegative and self-adjoint imply that p(x,y) = p(y,x) and

    ∫₀¹ ∫₀¹ p(x,y) h(x) h(y) dx dy ≥ 0   for all h ∈ H.
We also need to determine the appropriate boundary conditions for p. Since Ph ∈ H for any h, we must have Ph(0) = Ph(1) = 0. That is, for any h ∈ H,

    ∫₀¹ p(0,y) h(y) dy = ∫₀¹ p(1,y) h(y) dy = 0.

These, together with the symmetry of p, give the boundary condition

    p(x,y)|∂Ω = 0.                                                     (3.5)

An equation for p is now derived from (3.4) by applying the operator there to an arbitrary h ∈ H. We look at each term separately. The first term applied to h is

    A*Ph = APh = (∂²/∂x²) ∫₀¹ p(x,y) h(y) dy = ∫₀¹ p_xx h(y) dy.

Also, applying integration by parts and using (3.5), we get PAh as follows:

    PAh = ∫₀¹ p(x,y) h''(y) dy = ∫₀¹ p_yy h(y) dy.

For the nonlinear part P B B* P, by the definitions of B and B*, we have

    P B B* P h = ∫₀¹ p(x,z) [ ∫₀¹ b(s) ∫₀¹ p(s,y) h(y) dy ds ] b(z) dz
               = ∫₀¹ ( ∫₀¹ p(x,z) b(z) dz ) ( ∫₀¹ p(s,y) b(s) ds ) h(y) dy.

Combining these, for all h ∈ H, we obtain

    0 = ∫₀¹ [ p_xx + p_yy − r⁻¹ ( ∫₀¹ b(x) p(x,y) dx ) ( ∫₀¹ b(y) p(x,y) dy ) + q ] h(y) dy.
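The two integral factors appearing in the combined equation can be sketched numerically. The code below (an illustrative discretization choice of ours — trapezoidal rule on a uniform grid, whereas the solver in the thesis uses finite elements) evaluates B_x(p)(y) = ∫ b(x)p(x,y)dx and B_y(p)(x) = ∫ b(y)p(x,y)dy for a grid sample p[i,j] ≈ p(x_i, y_j):

```python
import numpy as np

def trap_weights(n, h):
    # composite trapezoidal rule weights on n+1 equally spaced points
    w = np.full(n + 1, h)
    w[0] = w[-1] = 0.5 * h
    return w

def Bx(p, b, h):
    """B_x(p)(y_j) = integral over x of b(x) p(x, y_j)."""
    return trap_weights(len(b) - 1, h) @ (b[:, None] * p)

def By(p, b, h):
    """B_y(p)(x_i) = integral over y of b(y) p(x_i, y)."""
    return (b[None, :] * p) @ trap_weights(len(b) - 1, h)

n = 64
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
b = np.ones(n + 1)                        # constant b for an easy analytic check
p = np.outer(x * (1 - x), x * (1 - x))    # symmetric kernel vanishing on the boundary

# For b = 1 and p = x(1-x) y(1-y):  B_x(p)(y) = y(1-y)/6, and because p is
# symmetric the two factors coincide as vectors.
print(np.max(np.abs(Bx(p, b, h) - x * (1 - x) / 6.0)))
```

Note how B_x(p) is a function of y only and B_y(p) a function of x only; this separation is what makes the nonlinear term tractable in the discretization below.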
We thus must have

    −(p_xx + p_yy) + r⁻¹ B_x(p) B_y(p) = q,                            (3.6)

where

    B_x(p) = ∫₀¹ b(x) p(x,y) dx   and   B_y(p) = ∫₀¹ b(y) p(x,y) dy.

So the LQR problem for (3.1) and (3.2) has been transformed into finding the unique nonnegative symmetric solution to the integral-differential equation (3.6), with boundary condition (3.5). Taking r = 1 for simplicity, we consider two multigrid methods, multilevel projection (PML) and a full approximation storage scheme (FAS), for solving the problem:

    −(p_xx + p_yy) + B_x(p) B_y(p) = q   on Ω,
    p = 0                                on ∂Ω.                        (3.7)

3.2 Multilevel Projection (PML) Solver

In order to apply PML to the solution of equation (3.7), we need to specify the sequence of spaces to be used, show that the cycling scheme can be realized by defining the fine and coarser level discrete equations, and specify the relaxation used.

3.2.1 The Discretization and Equations

We consider a finite element discretization of (3.7). Let n_k = 2^k, and h_k = 1/n_k for k = 1, 2, ..., M. Let x_i^k = i h_k for i = 0, 1, ..., n_k. We can then let φ_i^k be the usual 1D "hat" function: the continuous piecewise linear function with φ_i^k(x_j^k) equal to 1 for j = i and 0 otherwise, which vanishes outside
[x_{i-1}^k, x_{i+1}^k]. Let (x_i^k, y_j^k) = (i h_k, j h_k). Rather than using a single subscript as in Section 2, we use the subscript pair ij to denote a quantity associated with the point (x_i^k, y_j^k), and define the 2D piecewise bilinear functions

    φ_ij^k(x,y) = φ_i^k(x) φ_j^k(y).

Let S^k denote the space spanned by the set of basis functions {φ_ij^k : 1 ≤ i,j ≤ n_k − 1}. This defines the nested function spaces used. As before, we let G^k be the space of coefficient vectors with respect to the basis. The interpolation is determined by the basis functions, and is piecewise bilinear. We also need the 1D piecewise linear interpolation operator, which we denote by J_{k-1}^k. We define the corresponding 1D restriction operator J_k^{k-1} = (J_{k-1}^k)^T.

We first want to derive the form of the fine level discrete equations, then the coarser level equations. Letting L = −Δ and rewriting (3.6), we get

    L p + B_x(p) B_y(p) = q.                                           (3.8)

As stated in Section 2.2, the discrete equations can be written in the form:

    F^M(p̄^M) = 0,                                                      (3.9)

where

    F_ij^M(p̄^M) = (F(p^M), φ_ij^M).                                     (3.10)

Separating the terms of the inner product, this is rewritten as

    F_ij^M(p̄^M) = L_ij^M(p̄^M) + (B_x(p^M) B_y(p^M), φ_ij^M) − q_ij^M,   (3.11)
where

    q_ij^M = (q, φ_ij^M).                                               (3.12)

The form of the discrete nonlinear term is not obvious, and will be derived below. First, we consider the linear term, which is obtained from integration by parts:

    L_ij^M(p̄^M) = (L p^M, φ_ij^M) = (∇p^M, ∇φ_ij^M).                    (3.13)

This follows from the facts:

    (φ_i^M, φ_j^M)   = 2h_M/3 if i = j;   h_M/6 if |i−j| = 1;   0 otherwise;
    ((φ_i^M)', (φ_j^M)') = 2/h_M if i = j;   −1/h_M if |i−j| = 1;   0 otherwise.   (3.14)
Thus we can define B_{x,j}^M and B_{y,i}^M separately:

    B_{x,j}^M(p̄^M) = Σ_{l,m} b_l^M (φ_m^M, φ_j^M) p̄_{lm}^M,   where b_l^M = ∫₀¹ b(x) φ_l^M(x) dx,   (3.16a)
    B_{y,i}^M(p̄^M) = Σ_{l,m} b_m^M (φ_l^M, φ_i^M) p̄_{lm}^M.                                         (3.16b)

These follow from (3.14) and the definitions of B_x and B_y:

    B_{x,j}^M(p̄^M) = (B_x(p^M), φ_j^M) = ∫₀¹ ( ∫₀¹ b(x) p^M(x,y) dx ) φ_j^M(y) dy,   (3.17a)
    B_{y,i}^M(p̄^M) = (B_y(p^M), φ_i^M) = ∫₀¹ ( ∫₀¹ b(y) p^M(x,y) dy ) φ_i^M(x) dx.   (3.17b)

The final term involving q is defined in the straightforward way, completing the fine level discretization.

The next step is to obtain the coarser level discrete equations. For this, we assume that the discrete equations are defined on level k+1, and derive the level k equations. We use a more general form than the fine level equations, since extra terms
appear on coarser levels. For 1 ≤ i,j ≤ n_{k+1} − 1, we write the level k+1 equations as:

    F_ij^{k+1}(p̄^{k+1}) = L_ij^{k+1}(p̄^{k+1}) − f̄_ij^{k+1}
        + (B_{x,j}^{k+1}(p̄^{k+1}) + bex_j^{k+1})(B_{y,i}^{k+1}(p̄^{k+1}) + bey_i^{k+1}) − q̄_ij^{k+1} = 0,   (3.18)

where, on the finest level, f̄^M, bex^M and bey^M are zero. We write the coarser level equations as in (2.10), separating the terms:

    F_ij^k(p̄^k) = (I_{k+1}^k [L^{k+1}(p̄^{k+1} + I_k^{k+1} p̄^k) − f̄^{k+1}], ē_ij^k)_k
                + (I_{k+1}^k N^{k+1}(p̄^{k+1} + I_k^{k+1} p̄^k), ē_ij^k)_k
                − (I_{k+1}^k q̄^{k+1}, ē_ij^k)_k,                                                    (3.19)

where N^{k+1} denotes the vector of nonlinear terms in (3.18). The first term is treated as in (2.12):

    (I_{k+1}^k [L^{k+1}(p̄^{k+1} + I_k^{k+1} p̄^k) − f̄^{k+1}], ē_ij^k)_k
        = L_ij^k(p̄^k) + (I_{k+1}^k [L^{k+1} p̄^{k+1} − f̄^{k+1}], ē_ij^k)_k,                         (3.20)

with L^k = I_{k+1}^k L^{k+1} I_k^{k+1}. The second term can be separated again into functions in x and y, since the nonlinear term is the product of a factor depending only on j and a factor depending only on i, and the 2D restriction is the tensor product of the 1D restrictions:

    (I_{k+1}^k N^{k+1}(p̄), ē_ij^k)_k
        = (J_{k+1}^k [B_x^{k+1}(p̄) + bex^{k+1}])_j (J_{k+1}^k [B_y^{k+1}(p̄) + bey^{k+1}])_i.       (3.21)

The first inner product is over y, and the second over x. Taking the first, with p̄ = p̄^{k+1} + I_k^{k+1} p̄^k, we get

    (J_{k+1}^k [B_x^{k+1}(p̄^{k+1} + I_k^{k+1} p̄^k) + bex^{k+1}])_j = B_{x,j}^k(p̄^k) + (J_{k+1}^k [B_x^{k+1}(p̄^{k+1}) + bex^{k+1}])_j.   (3.22)

Similarly, the second inner product can be written as:

    (J_{k+1}^k [B_y^{k+1}(p̄^{k+1} + I_k^{k+1} p̄^k) + bey^{k+1}])_i = B_{y,i}^k(p̄^k) + (J_{k+1}^k [B_y^{k+1}(p̄^{k+1}) + bey^{k+1}])_i.   (3.23)
We now define the following level k quantities:

    f̄^k   = I_{k+1}^k ( f̄^{k+1} − L^{k+1} p̄^{k+1} ),
    q̄^k   = I_{k+1}^k q̄^{k+1},
    bex^k = J_{k+1}^k ( B_x^{k+1}(p̄^{k+1}) + bex^{k+1} ),
    bey^k = J_{k+1}^k ( B_y^{k+1}(p̄^{k+1}) + bey^{k+1} ).              (3.24)

Using these, and recombining the inner products in (3.22) and (3.23), we can finally rewrite (3.19) as

    F_ij^k(p̄^k) = L_ij^k(p̄^k) − f̄_ij^k
        + (B_{x,j}^k(p̄^k) + bex_j^k)(B_{y,i}^k(p̄^k) + bey_i^k) − q̄_ij^k = 0.   (3.25)

The level k operators L^k, B_x^k and B_y^k are defined in the same way as the level M operators:

    L_ij^k(p̄^k) = (8/3) p̄_ij^k − (1/3)( p̄_{i+1,j}^k + p̄_{i−1,j}^k + p̄_{i,j+1}^k + p̄_{i,j−1}^k
                 + p̄_{i−1,j−1}^k + p̄_{i−1,j+1}^k + p̄_{i+1,j−1}^k + p̄_{i+1,j+1}^k ).   (3.26)

Thus, we have reached our goal and shown that the level k problem is well defined in terms of level k quantities only.
3.2.2 Relaxation

Because of the quadratic nature of the nonlinear term, we use an exact nonlinear Gauss-Seidel for the relaxation process. As stated previously, relaxation consists of sweeping through all points and, at each point (i,j), 1 ≤ i,j ≤ n_k − 1, making the change

    p̄_ij^k ← p̄_ij^k + s,

where s is chosen to satisfy

    F_ij^k(p̄^k + s ē_ij^k) = 0.                                        (3.27)

This leads to a quadratic equation in s:

    a s² + b s + c = 0.                                                (3.28)

The coefficients are defined as follows:

    a = d_x(j) d_y(i),                                                 (3.29)

where d_x(j) and d_y(i) are the changes in B_{x,j}^k and B_{y,i}^k produced by a unit change in p̄_ij^k, and b_x(j), b_y(i) are their current values:

    d_x(j) = (2h_k/3) b_i^k,   b_x(j) = B_{x,j}^k(p̄^k) + bex_j^k,       (3.30)
    d_y(i) = (2h_k/3) b_j^k,   b_y(i) = B_{y,i}^k(p̄^k) + bey_i^k,       (3.31)

with b_i^k = ∫₀¹ b(x) φ_i^k(x) dx. Also

    b = 8/3 + d_x(j) b_y(i) + d_y(i) b_x(j),                           (3.32)

where b_x(j), b_y(i), bex_j and bey_i are defined by (3.24), (3.30) and (3.31).
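The exact pointwise solve can be implemented with a numerically stable quadratic-root formula (an illustrative sketch of ours; the function name is hypothetical). The root returned is the one closest to the linear-part solution −c/b, and the expression remains well behaved as the quadratic coefficient a tends to zero:

```python
import math

def relaxation_step(a, b, c):
    # Root of a s^2 + b s + c = 0 closest to -c/b, computed without
    # cancellation: s = -2c / (b + sign(b) * sqrt(b^2 - 4ac)).
    disc = math.sqrt(b * b - 4.0 * a * c)
    return -2.0 * c / (b + math.copysign(disc, b))

print(relaxation_step(1.0, -3.0, 2.0))    # roots 1 and 2; -c/b = 2/3, so picks 1
print(relaxation_step(1e-12, 2.0, -3.0))  # essentially the linear solution 1.5
```

Writing the root this way avoids dividing by the small coefficient a, which is exactly the concern addressed by the form of (3.33) below.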
And finally, we have c = F_ij^k(p̄^k), the current value of the left side of the equation at the point, in which

    q_ij = ∫₀¹ ∫₀¹ q(x,y) φ_i(x) φ_j(y) dx dy = q(x_i, y_j) h².

Now, we solve (3.27) exactly, taking

    s = −2c / ( b + sign(b) √(b² − 4ac) ).                             (3.33)

This form is used since the quadratic coefficient a can be small, and the root chosen is that closest to the solution of the linear part of (3.28). This completes the definition of relaxation on the fine grid.

3.2.3 Numerical Results

Here, we present results both from V-cycling, to ensure that multigrid is converging as expected, and from the full multigrid solution process, to show that the algorithm does solve to below the level of discretization error. For our tests, we take b(x) = 100, p(x,y) = x²(x²−1) y²(y²−1), and use this to generate the right-hand side for the continuous problem. This is used to define the right-hand sides for the discrete problems on each level. Tables 3.1 and 3.2 show convergence histories and asymptotic convergence factors for V(1,0) and V(1,1) cycles, respectively. Factors for V(1,0) cycles are 0.20 per cycle for all levels (except for the case h = 1/4), showing that acceptable convergence, independent of mesh size, has been obtained. Factors for V(1,1) cycles also appear to be bounded independent of mesh
size, with an average factor for the first 10 cycles of around 0.05. Smaller convergence factors do not necessarily imply greater efficiency. To compare cycling efficiency, we use the time per cycle t_c and the convergence factor ρ to find the time required to reduce the residual norm by a factor of 10 as follows:

    t₁ = t_c / |log₁₀ ρ|.

This measure is shown in Table 3.3 for the V(1,0), the V(1,1) and the V(2,1) cycles. Cycling efficiency is best for the V(1,1) cycle.

In the FMG process, the initial approximation on a given level is the (interpolated) solution from the next coarser grid, and the algebraic error there must be reduced to below the level of discretization error. The size of the discretization error on each level is shown in Table 3.4. For h < 1/4, this error behaves like O(h²), decreasing by a factor of close to 4 per level. Thus, the analysis of Section 2.4 should apply, and we can use 1/7 as a guide for the desired reduction factor per level. For this, two V(1,0) cycles or one V(1,1) cycle should be sufficient. Tables 3.5-3.8 show the results of FMV cycling. Following our earlier notation, and using a discrete approximation to the continuous L² norm, the columns in each table indicate ‖p₀^k − P^k‖ (initial algebraic error), ‖p_f^k − P^k‖ (final algebraic error), ‖P^k − P‖ (discretization error), and ‖p_f^k − P^k‖/‖P^k − P‖. The last quantity is then less than 1 when the final algebraic error is small enough. Tables 3.5 and 3.6 show results for one and two V(1,0) cycles per level, while Tables 3.7 and 3.8 give results for one and two V(1,1) cycles. In all cases, the algebraic error is smaller than the discretization error on all levels, even for one V(1,0) cycle per level, which has a reduction factor larger than 1/7. (Note that although the discretization error ‖P^k − P‖ behaves as expected, the ratio of initial error to discretization error ‖p₀^k − P^k‖/‖P^k − P‖ is only slightly larger than 3 on all levels, which is smaller than predicted.)
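Both guides used above — the time-per-decade measure t₁ and the ρ^μ ≤ 1/7 criterion of Section 2.4 — are quick computations. The sketch below (illustrative code of ours) reproduces values close to Table 3.3 and the cycle counts quoted in the text:

```python
import math

def time_per_decade(t_cycle, rho):
    # t1 = t_c / |log10(rho)|: time to cut the residual norm by a factor of 10
    return t_cycle / abs(math.log10(rho))

def cycles_needed(rho, target=1.0 / 7.0):
    # smallest mu with rho**mu <= 1/7, the FMG per-level guide of Section 2.4
    return math.ceil(math.log(target) / math.log(rho))

print(round(time_per_decade(0.41, 0.065), 2))    # V(1,1) row of Table 3.3
print(round(time_per_decade(0.57, 0.028), 2))    # V(2,1) row of Table 3.3
print(cycles_needed(0.20), cycles_needed(0.05))  # two V(1,0) cycles, one V(1,1) cycle
```

With ρ = 0.20 two cycles per level suffice, and with ρ ≈ 0.05 a single cycle does, matching the choice discussed above.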
FMG solution times are given with the tables, and the best time of 0.49 seconds is obtained with one V(1,0) cycle per level.
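As a quick illustration (not part of the original computations), the efficiency measure t_1 = t_c / (-log_10 rho) can be evaluated directly; the numbers below are taken from Table 3.3, and the helper function name is ours:

```python
import math

def time_to_reduce_by_10(t_cycle, rho):
    """Time to reduce the residual norm by a factor of 10, given the cost of
    one cycle (seconds) and the per-cycle convergence factor rho < 1."""
    # a 10x reduction takes 1 / (-log10(rho)) cycles
    return t_cycle / (-math.log10(rho))

# Convergence factors and times per cycle from Table 3.3 (PML cycles, h = 1/64)
for name, rho, tc in [("V(1,0)", 0.200, 0.25), ("V(1,1)", 0.065, 0.41),
                      ("V(2,1)", 0.028, 0.57), ("V(2,2)", 0.015, 0.73)]:
    print(f"{name}: t1 = {time_to_reduce_by_10(tc, rho):.2f} s")
```

The V(1,1) entry reproduces the tabulated 0.35 s, showing that the smaller time per decade of reduction, not the raw convergence factor, is what makes it the most efficient cycle.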
Cycle    h = 1/4     h = 1/8     h = 1/16    h = 1/32    h = 1/64
  0      0.10E+01    0.56E+00    0.28E+00    0.14E+00    0.72E-01
  1      0.15E+00    0.35E-01    0.13E-01    0.58E-02    0.28E-02
  2      0.74E-01    0.59E-02    0.25E-02    0.11E-02    0.55E-03
  3      0.40E-01    0.89E-03    0.48E-03    0.22E-03    0.11E-03
  4      0.21E-01    0.12E-03    0.95E-04    0.44E-04    0.22E-04
  5      0.11E-01    0.16E-04    0.19E-04    0.89E-05    0.43E-05
  6      0.54E-02    0.31E-05    0.36E-05    0.18E-05    0.87E-06
  7      0.27E-02    0.45E-06    0.69E-06    0.35E-06    0.17E-06
  8      0.13E-02    0.89E-07    0.13E-06    0.71E-07    0.34E-07
  9      0.65E-03    0.15E-07    0.23E-07    0.14E-07    0.69E-08
 10      0.32E-03    0.31E-08    0.39E-08    0.28E-08    0.14E-08
Factor     0.50        0.20        0.20        0.20        0.20

Table 3.1. PML V(1,0)-cycle convergence histories for the model problem for different mesh sizes. Asymptotic convergence factors for each level are given at the bottom of the columns.

Cycle    h = 1/4     h = 1/8     h = 1/16    h = 1/32    h = 1/64
  0      0.10E+01    0.56E+00    0.28E+00    0.14E+00    0.72E-01
  1      0.77E-01    0.61E-02    0.26E-02    0.12E-02    0.57E-03
  2      0.16E-01    0.13E-03    0.11E-03    0.53E-04    0.25E-04
  3      0.25E-02    0.30E-05    0.48E-05    0.26E-05    0.12E-05
  4      0.28E-03    0.67E-07    0.22E-06    0.13E-06    0.63E-07
  5      0.51E-04    0.18E-08    0.99E-08    0.77E-08    0.41E-08
  6      0.13E-04    0.64E-10    0.52E-09    0.47E-09    0.27E-09
  7      0.23E-05    0.20E-11    0.24E-10    0.30E-10    0.20E-10
  8      0.29E-06    0.57E-13    0.11E-11    0.19E-11    0.15E-11
  9      0.42E-07    0.28E-14    0.47E-13    0.13E-12    0.11E-12
 10      0.11E-07    0.12E-15    0.18E-14    0.85E-14    0.92E-14
Factor     0.16       0.027       0.038       0.048       0.051

Table 3.2. PML V(1,1)-cycle convergence histories for the model problem for different mesh sizes. Average convergence factors for each level are given at the bottom of the columns.
Cycle     Factor    t_c      t_1
V(1,0)    .200      0.25     0.39
V(1,1)    .065      0.41     0.35
V(2,1)    .028      0.57     0.37
V(2,2)    .015      0.73     0.40

Table 3.3. Comparison of efficiency for PML V-cycles for h = 1/64. Shown are convergence factors, times per cycle, and cycling time to reduce residuals by a factor of 10.

Level    ||p||       ||p^k||     ||p^k - p||
1/2      1.76E-02    2.56E-02    8.05E-03
1/4      2.48E-02    3.05E-02    7.23E-03
1/8      2.54E-02    2.66E-02    1.73E-03
1/16     2.54E-02    2.57E-02    4.15E-04
1/32     2.54E-02    2.55E-02    1.02E-04
1/64     2.54E-02    2.54E-02    2.55E-05

Table 3.4. Norms of continuous and PML algebraic solutions and their differences measured on each level.

Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.16E-01           0.18E-04           0.72E-02       0.25E-02
1/8      0.50E-02           0.94E-03           0.17E-02       0.55E+00
1/16     0.15E-02           0.25E-03           0.42E-03       0.59E+00
1/32     0.36E-03           0.59E-04           0.10E-03       0.58E+00
1/64     0.90E-04           0.15E-04           0.26E-04       0.58E+00

Table 3.5. PML FMG (1 V(1,0) cycle/level). Solution time = 0.49 s.
Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.16E-01           0.18E-04           0.72E-02       0.25E-02
1/8      0.50E-02           0.17E-03           0.17E-02       0.10E+00
1/16     0.12E-02           0.39E-04           0.42E-03       0.94E-01
1/32     0.30E-03           0.92E-05           0.10E-03       0.90E-01
1/64     0.73E-04           0.23E-05           0.26E-04       0.89E-01

Table 3.6. PML FMG (2 V(1,0) cycles/level). Solution time = 0.71 s.

Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.16E-01           0.68E-06           0.72E-02       0.95E-04
1/8      0.50E-02           0.19E-03           0.17E-02       0.11E+00
1/16     0.12E-02           0.47E-04           0.42E-03       0.11E+00
1/32     0.30E-03           0.13E-04           0.10E-03       0.13E+00
1/64     0.77E-04           0.39E-05           0.26E-04       0.15E+00

Table 3.7. PML FMG (1 V(1,1) cycle/level). Solution time = 0.88 s.

Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.16E-01           0.68E-06           0.72E-02       0.95E-04
1/8      0.50E-02           0.50E-05           0.17E-02       0.29E-02
1/16     0.12E-02           0.25E-05           0.42E-03       0.61E-02
1/32     0.30E-03           0.69E-06           0.10E-03       0.67E-02
1/64     0.73E-04           0.20E-06           0.26E-04       0.77E-02

Table 3.8. PML FMG (2 V(1,1) cycles/level). Solution time = 1.32 s.
3.3 Full Approximation Scheme (FAS) Solver

The application of FAS to equation (3.7) is much simpler to describe than the PML approach. The FAS algorithm given in Section 2.3 is used, with simple finite difference approximations for the operators on each level. This is described here, and numerical results are presented.

3.3.1 The Discretization, Equations and Relaxation

We use a sequence of uniform discretizations of the domain \Omega = [0,1] \times [0,1], where \Omega_k has mesh size h_k = 2^{-k}. To define our algorithm, we only need to specify the discretization of the nonlinear operator F in (3.7),

    F(p) = -p_{xx} - p_{yy} + B_x(p) B_y(p),

where

    B_x(p) = \int_0^1 b(x) p(x,y)\, dx   and   B_y(p) = \int_0^1 b(y) p(x,y)\, dy.

Straightforward finite difference discretizations are used, so that on level k,

    F^k(p^k)_{ij} = \frac{4 p^k_{ij} - p^k_{i-1,j} - p^k_{i+1,j} - p^k_{i,j-1} - p^k_{i,j+1}}{h_k^2} + B_x^k(p^k)_{ij}\, B_y^k(p^k)_{ij},

with

    B_x^k(p^k)_{ij} = h_k \sum_l b^k_l p^k_{lj}   and   B_y^k(p^k)_{ij} = h_k \sum_l b^k_l p^k_{il}.
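A minimal sketch of these discrete integral operators (the function names and the interior-grid convention are ours; p is assumed to vanish on the boundary, so only interior nodes enter the sums):

```python
import numpy as np

def discrete_Bx(b, p, h):
    """B_x^k(p)_j ~ h * sum_l b(x_l) p(x_l, y_j): integrate out x.

    b: values of b at the n interior nodes; p: (n, n) array of grid values
    p[l, j] ~ p(x_l, y_j). Returns one value per y_j.
    """
    return h * (b @ p)

def discrete_By(b, p, h):
    """B_y^k(p)_i ~ h * sum_l b(y_l) p(x_i, y_l): integrate out y."""
    return h * (p @ b)

# Sanity check with b(x) = 100 and p = 1: the sum approximates the
# integral of 100 over (0, 1), up to the omitted boundary nodes.
n = 63
h = 1.0 / (n + 1)
bx = discrete_Bx(100.0 * np.ones(n), np.ones((n, n)), h)
```

With b(x) = 100 and p = 1, each entry of `bx` is 100 * 63/64, the h-weighted sum over the 63 interior nodes.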
Point evaluation is used for discretizing the right-hand side q on the finest level M, which we denote

    f^M_{ij} = q(x_i, y_j).

Now, we have the problem written in the form used in the FAS algorithm in Section 2.3, so that the equations we use on each level are fully specified. The relaxation we use is nonlinear Gauss-Seidel. The level k equation at a point ij can now be written as a quadratic equation in p^k_{ij} of the form

    a (p^k_{ij})^2 + b\, p^k_{ij} + c = 0,

with

    a = h_k^4\, b^k_i b^k_j,
    b = 4 + h_k^4 (b^k_i\, by_{ij} + b^k_j\, bx_{ij}),
    c = h_k^4\, bx_{ij}\, by_{ij} - (p^k_{i-1,j} + p^k_{i+1,j} + p^k_{i,j-1} + p^k_{i,j+1}) - h_k^2 f^k_{ij},

where

    bx_{ij} = \sum_{l \ne i} b^k_l p^k_{lj}   and   by_{ij} = \sum_{l \ne j} b^k_l p^k_{il}.

We solve exactly, taking the root closest to that of the linear problem, so that

    p^k_{ij} = \frac{-2c}{b + \sqrt{b^2 - 4ac}}.

This finishes the description of our FAS algorithm.
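The point-wise relaxation above can be sketched as follows; this is our own illustrative code rather than the thesis implementation, and it assumes zero boundary values and b > 0 so that the stable root formula applies:

```python
import numpy as np

def nonlinear_gs_sweep(p, bvec, f, h):
    """One lexicographic nonlinear Gauss-Seidel sweep for the level equation
    4 p_ij - (sum of 4 neighbors) + h^4 (sum_l b_l p_lj)(sum_m b_m p_im) = h^2 f_ij.

    p: (n+2, n+2) array with a zero boundary layer; bvec: b at the n interior
    nodes; f: (n, n) right-hand side at interior nodes.
    """
    n = p.shape[0] - 2
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # sums excluding the point being relaxed
            bx = bvec @ p[1:n + 1, j] - bvec[i - 1] * p[i, j]
            by = bvec @ p[i, 1:n + 1] - bvec[j - 1] * p[i, j]
            a = h**4 * bvec[i - 1] * bvec[j - 1]
            b = 4.0 + h**4 * (bvec[i - 1] * by + bvec[j - 1] * bx)
            c = (h**4 * bx * by
                 - (p[i - 1, j] + p[i + 1, j] + p[i, j - 1] + p[i, j + 1])
                 - h * h * f[i - 1, j - 1])
            # root closest to the linear root -c/b, in the form that stays
            # stable when the quadratic coefficient a is small
            p[i, j] = -2.0 * c / (b + np.sqrt(b * b - 4.0 * a * c))
    return p
```

For bvec = 0 this degenerates to ordinary Gauss-Seidel for the 5-point Laplacian, which is a convenient correctness check.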
3.3.2 Numerical Results

Tables 3.9-3.16 contain the results corresponding to Tables 3.1-3.8 for the FAS rather than the PML version of the algorithm. Again, we take b(x) = 100 and p(x,y) = x^2(x^2-1)y^2(y^2-1). It is interesting to compare the discretization errors ||p^k - p|| for PML and FAS. Tables 3.4 and 3.12 show that on the finest level used, this error is 3.4E-07 for FAS, while it is 2.6E-05 for PML. This merits further study, but will not be examined here. Another difference between PML and FAS is seen in Tables 3.3 and 3.11. Cycle times for FAS are smaller than those for PML, mostly due to the use of smaller finite difference stencils, but this savings is somewhat offset by higher convergence factors, resulting in comparable efficiency. Again, O(h^2) discretization error behavior is seen, so the reduction per level required in FMG is the same as with PML. Tables 3.9 and 3.10 show convergence histories and asymptotic convergence factors for V(1,0) and V(1,1) cycles, respectively. Generally, the convergence factors are not as good as for the PML algorithm. This is understandable, since the PML algorithm is designed so that coarse grid corrections are optimal in some sense, while FAS uses an approximation for the correction to the right-hand side of the coarser level equations. Asymptotic convergence factors obtained are 0.35 for V(1,0) cycles and 0.15 for V(1,1) cycles (compared to 0.2 and 0.05 for PML). Tables 3.13 and 3.14 show FMG results for one and two V(1,0) cycles per level, while Tables 3.15 and 3.16 give the results for one and two V(1,1) cycles. Here, not unexpectedly, we see that one V(1,0) cycle per level is not sufficient, so that the ratio of algebraic error to discretization error increases on finer levels. Timings given with the tables show that the optimal solution time is 0.44 seconds, obtained with one V(1,1) cycle per level. This is slightly better than the optimal PML solution time obtained.
Cycle    h = 1/4     h = 1/8     h = 1/16    h = 1/32    h = 1/64
  0      0.19E+01    0.12E+00    0.62E-01    0.24E-01    0.92E-02
  1      0.48E+00    0.22E-01    0.11E-01    0.70E-02    0.32E-02
  2      0.23E+00    0.48E-02    0.27E-02    0.22E-02    0.12E-02
  3      0.97E-01    0.16E-02    0.70E-03    0.71E-03    0.44E-03
  4      0.40E-01    0.52E-03    0.20E-03    0.23E-03    0.17E-03
  5      0.17E-01    0.18E-03    0.66E-04    0.74E-04    0.63E-04
  6      0.69E-02    0.65E-04    0.22E-04    0.25E-04    0.23E-04
  7      0.28E-02    0.24E-04    0.75E-05    0.83E-05    0.79E-05
  8      0.12E-02    0.89E-05    0.26E-05    0.29E-05    0.28E-05
  9      0.49E-03    0.33E-05    0.90E-06    0.10E-05    0.95E-06
 10      0.20E-03    0.12E-05    0.32E-06    0.36E-06    0.33E-06
Factor     0.41        0.37        0.35        0.35        0.35

Table 3.9. FAS V(1,0)-cycle convergence histories for the model problem for different mesh sizes. Asymptotic convergence factors for each level are given at the bottom of the columns.

Cycle    h = 1/4     h = 1/8     h = 1/16    h = 1/32    h = 1/64
  0      0.56E+00    0.24E-01    0.14E-01    0.69E-02    0.28E-02
  1      0.97E-01    0.20E-02    0.87E-03    0.61E-03    0.30E-03
  2      0.16E-01    0.28E-03    0.10E-03    0.56E-04    0.34E-04
  3      0.27E-02    0.44E-04    0.13E-04    0.56E-05    0.41E-05
  4      0.45E-03    0.70E-05    0.20E-05    0.62E-06    0.50E-06
  5      0.74E-04    0.11E-05    0.32E-06    0.80E-07    0.62E-07
  6      0.12E-04    0.18E-06    0.52E-07    0.11E-07    0.80E-08
  7      0.20E-05    0.27E-07    0.89E-08    0.18E-08    0.11E-08
  8      0.34E-06    0.42E-08    0.16E-08    0.31E-09    0.15E-09
  9      0.56E-07    0.65E-09    0.27E-09    0.53E-10    0.24E-10
 10      0.93E-08    0.10E-09    0.43E-10    0.92E-11    0.54E-11
Factor     0.17        0.16        0.17        0.17        0.15

Table 3.10. FAS V(1,1)-cycle convergence histories for the model problem for different mesh sizes. Asymptotic convergence factors for each level are given at the bottom of the columns.
Cycle     Factor    t_c      t_1
V(1,0)    0.365     0.18     0.41
V(1,1)    0.130     0.25     0.28
V(2,1)    0.088     0.33     0.31
V(2,2)    0.064     0.40     0.33

Table 3.11. Comparison of efficiency for FAS V-cycles for h = 1/64. Shown are convergence factors, times per cycle, and cycling time to reduce residuals by a factor of 10.

Level    ||p||       ||p^k||     ||p^k - p||
1/2      1.76E-02    1.71E-02    4.95E-04
1/4      2.48E-02    2.47E-02    9.57E-05
1/8      2.54E-02    2.53E-02    2.23E-05
1/16     2.54E-02    2.54E-02    5.46E-06
1/32     2.54E-02    2.54E-02    1.36E-06
1/64     2.54E-02    2.54E-02    3.39E-07

Table 3.12. Norms of continuous and FAS algebraic solutions and their differences measured on each level.

Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.17E-01           0.16E-05           0.96E-04       0.16E-01
1/8      0.49E-03           0.54E-04           0.22E-04       0.24E+01
1/16     0.62E-04           0.18E-04           0.55E-05       0.33E+01
1/32     0.19E-04           0.65E-05           0.14E-05       0.48E+01
1/64     0.66E-05           0.24E-05           0.34E-06       0.71E+01

Table 3.13. FAS FMG (1 V(1,0) cycle/level). Solution time = 0.39 s.
Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.17E-01           0.16E-05           0.96E-04       0.16E-01
1/8      0.49E-03           0.15E-04           0.22E-04       0.68E+00
1/16     0.29E-04           0.25E-05           0.55E-05       0.45E+00
1/32     0.44E-05           0.41E-06           0.14E-05       0.30E+00
1/64     0.10E-05           0.70E-07           0.34E-06       0.21E+00

Table 3.14. FAS FMG (2 V(1,0) cycles/level). Solution time = 0.60 s.

Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.17E-01           0.20E-07           0.96E-04       0.21E-03
1/8      0.48E-03           0.18E-04           0.22E-04       0.79E+00
1/16     0.31E-04           0.39E-05           0.55E-05       0.71E+00
1/32     0.53E-05           0.78E-06           0.14E-05       0.58E+00
1/64     0.12E-05           0.16E-06           0.34E-06       0.47E+00

Table 3.15. FAS FMG (1 V(1,1) cycle/level). Solution time = 0.44 s.

Level    ||p_0^k - p^k||    ||p_f^k - p^k||    ||p^k - p||    ||p_f^k - p^k|| / ||p^k - p||
1/4      0.17E-01           0.20E-07           0.96E-04       0.21E-03
1/8      0.48E-03           0.26E-05           0.22E-04       0.12E+00
1/16     0.26E-04           0.22E-06           0.55E-05       0.41E-01
1/32     0.34E-05           0.28E-07           0.14E-05       0.20E-01
1/64     0.95E-06           0.22E-07           0.34E-06       0.64E-01

Table 3.16. FAS FMG (2 V(1,1) cycles/level). Solution time = 0.82 s.
4. The Control System with Second Order Time Derivative

The governing equation for many systems, including the control of many large space structures, is second order in time and can be described by the following partial differential equation:

    \ddot{v}(x,t) + D \dot{v}(x,t) + A v(x,t) = B u(t)        (4.1)

where the state v(x,t) is in some Hilbert space H. The stiffness operator A and the damping operator D are operators on H, u(t) \in R is the control input to the system, and B is an operator from R into H. The problem is to find the control u(t) that minimizes a cost functional J(u) (similar to (3.1)), subject to the constraint (4.1). The resulting LQR problem can be more difficult to solve than the problem of Section 3, depending on the form of equation (4.1). For completeness, in this section, we present a particular problem and an unsuccessful approach we tried, discuss the difficulties encountered, and outline possible methods to overcome these problems.

Figure 1. The simply supported beam. Applied force is u(t)·b(x), where b(x) is a given function. In practice, b can be a continuous function along the beam or a force applied at one or more points.
4.1 Control of a Simply Supported Beam

An example of a physical problem whose behavior is modeled by (4.1) is illustrated in Figure 1, showing control of a simply supported beam. In this example we have A = \partial^4 / \partial x^4 and D = 2\zeta A^{1/2} = -2\zeta\, \partial^2 / \partial x^2. The damping parameter \zeta is generally quite small, and values around \zeta = 0.05 are typical. A standard approach for control problems with a constraint equation that is second order in time is to rewrite it as a system of first-order equations, so that the method of the previous section can be applied. Let

    w(x,t) = \begin{bmatrix} v(x,t) \\ \dot{v}(x,t) \end{bmatrix}.

Then (4.1) becomes

    \dot{w}(x,t) = \hat{A} w(x,t) + \hat{B} u(t)        (4.2)

where

    \hat{A} = \begin{bmatrix} 0 & I \\ -A & -D \end{bmatrix}   and   \hat{B} = \begin{bmatrix} 0 \\ B \end{bmatrix}.

That is, our goal is to find a control u(t) that minimizes a quadratic cost functional of the form

    J(u) = \int_0^\infty \{ (Qw, w) + u^2(t) \}\, dt        (4.3)

subject to the constraint (4.1), where Q is nonnegative and self-adjoint and may be written in block form as

    Q = \begin{bmatrix} Q_1 & Q_0 \\ Q_0^* & Q_2 \end{bmatrix}.
By the Pontryagin maximum principle, we get the optimal control as

    u(t) = -\hat{B}^* P w(x,t),

with P the unique nonnegative solution to the following operator Riccati equation:

    \hat{A}^* P + P \hat{A} - P \hat{B} \hat{B}^* P + Q = 0.        (4.4)

Many articles (see [4]-[9]) focus on approximation techniques to solve the LQR problem (4.1) and (4.4). Let

    P = \begin{bmatrix} P_1 & P_0 \\ P_0^* & P_2 \end{bmatrix},

and assume that P_1, P_2 and P_0 are integral operators with kernels p_1(x,y), p_2(x,y) and p_0(x,y), respectively, that B is defined as in Section 3, and that Q_1, Q_2 and Q_0 are integral operators with kernels q_1(x,y), q_2(x,y) and q_0(x,y), respectively. Our approach was to transform the Riccati equation into a system of PDEs, using the method of Section 3, resulting in the following system for p_0, p_1 and p_2:

    p^*_{0,xxxx} + p_{0,yyyy} + B_x(p_0^*) B_y(p_0) = q_1
    -p_1 + p_{2,xxxx} - 2\zeta p_{0,yy} + B_x(p_2) B_y(p_0) = q_0        (4.5)
    -p_1 + p_{2,yyyy} - 2\zeta p^*_{0,xx} + B_x(p_0^*) B_y(p_2) = q_0^*
    -(p_0 + p_0^*) + 2\zeta (p_{2,xx} + p_{2,yy}) + B_x(p_2) B_y(p_2) = q_2
We can eliminate p_1 in (4.5) to get:

    p^*_{0,xxxx} + p_{0,yyyy} + B_x(p_0^*) B_y(p_0) = q_1
    2\zeta (p^*_{0,xx} - p_{0,yy}) + (p_{2,xxxx} - p_{2,yyyy}) + B_x(p_2) B_y(p_0) - B_x(p_0^*) B_y(p_2) = q_0 - q_0^*        (4.6)
    -(p_0 + p_0^*) + 2\zeta (p_{2,xx} + p_{2,yy}) + B_x(p_2) B_y(p_2) = q_2

with the corresponding boundary conditions. Here p_0^* is the kernel of P_0^*, so that we have

    p_0^*(x,y) = p_0(y,x)   for all (x,y) \in [0,1] \times [0,1].        (4.7)

For a small mesh size h = 1/4, the solution to (4.6) and (4.7) is the same as that from the standard approximation techniques applied directly to the discrete form of (4.4), indicating that our method for transforming the LQR problem into system (4.6) with boundary conditions (4.7) is essentially correct. However, we encountered difficulties in formulating a multigrid method similar to that used in the previous section that converged for this problem with h smaller than 0.25.

4.2 Multigrid Difficulties

For a multigrid solver, we tried a modified application of the FAS approach described in Section 2. A relaxation scheme was devised in which the points (x_i, y_j) and (y_j, x_i) were relaxed simultaneously (because of the equation coupling due to the "adjoint" terms). After tests gave rather strong divergence for small h, we examined the system and found that the problem was with the nonelliptic nature of
the system (4.6). This was somewhat hidden by the presence of the adjoint terms (which are not typical in PDEs). The nature of the equations is more easily seen if we rewrite (4.6), ignoring the nonlinear terms, letting s = p_0 + p_0^* and t = p_0 - p_0^*, and noting that p_2 = p_2^*:

    \zeta (\partial_x^2 - \partial_y^2) s - \zeta (\partial_x^2 + \partial_y^2) t + (\partial_x^2 - \partial_y^2) \Delta p_2 = q_0 - q_0^*        (4.8)
    -s + 2\zeta \Delta p_2 = q_2

The operator \partial_x^2 - \partial_y^2 is nonelliptic, and is contained in the principal part of the full system operator. Thus, there are oscillatory components that are not smoothed by usual relaxation methods. In addition, when discretized, this operator has a zero diagonal coefficient, so that, for small enough h, off-diagonal coefficients dominate and relaxation diverges. Thus our initial multigrid approach was not successful.

4.3 Possible Multigrid Approaches

Here, we outline several possible approaches to the problem, which we were unable to explore due to time constraints. These include modifications to the multigrid algorithm, as well as possible changes to the problem formulation itself. The first idea is to deal with the main difficulty, the nonelliptic operator \partial_x^2 - \partial_y^2. This is basically the wave equation operator. Null space components of this operator (ignoring boundary conditions) are those which are constant along the characteristic directions x + y = c or x - y = c. The values along adjacent lines are arbitrary (and thus nonsmooth). A global relaxation step may help, in which constants are added along each of these lines, so that some combination of the equations is satisfied. Other types of relaxation can also be explored. An alternative to eliminating problem components by relaxation is to change interpolation
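The two discrete pathologies noted here, the zero diagonal and the discrete null space along characteristics, are easy to verify numerically. This sketch is ours and uses a standard 5-point differencing of \partial^2/\partial x^2 - \partial^2/\partial y^2:

```python
import numpy as np

# h^2-scaled 5-point stencil of d^2/dx^2 - d^2/dy^2: the two center
# contributions, -2 and +2, cancel, so point relaxation has no diagonal.
stencil = np.array([[0, -1,  0],
                    [1,  0,  1],
                    [0, -1,  0]])

# A highly oscillatory grid function constant along x - y = c is
# annihilated by the discrete operator in the interior.
n = 32
h = 1.0 / n
x = np.arange(n + 1) * h
X, Y = np.meshgrid(x, x, indexing="ij")
g = np.sin(8 * np.pi * (X - Y))
d2x = (g[2:, 1:-1] - 2 * g[1:-1, 1:-1] + g[:-2, 1:-1]) / h**2
d2y = (g[1:-1, 2:] - 2 * g[1:-1, 1:-1] + g[1:-1, :-2]) / h**2
residual = np.abs(d2x - d2y).max()
```

Such components look like worst-case high-frequency error to pointwise relaxation, which is why characteristic-direction relaxation or diagonal semicoarsening is suggested here.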
and the coarse grid problem so that these components are contained in the range of interpolation. This could be done by using semicoarsening in a diagonal direction. Another possibility that may help, although it will not change the character of the system, is to change the formulation of the first-order system. The method shown earlier for reducing the order of the time derivative appears to have been chosen because of its simplicity, and the fact that the first-order formulation fits well with existing algebraic methods for solution of the control problem. The form of the resulting system does not seem to be especially important. In our case, a different formulation may make the system easier to deal with (although probably not change its essential character). For example, we can take

    w(x,t) = \begin{bmatrix} A^{1/2} v(x,t) \\ \dot{v}(x,t) \end{bmatrix}.

Then (4.1) becomes

    \dot{w}(x,t) = \hat{A} w(x,t) + \hat{B} u(t)        (4.9)

where

    \hat{A} = \begin{bmatrix} 0 & A^{1/2} \\ -A^{1/2} & -D \end{bmatrix}   and   \hat{B} = \begin{bmatrix} 0 \\ B \end{bmatrix}.

Here, note that the order of the operators has been reduced, so that only second- and lower-order operators appear. As stated above, this will not change the basic character of the system, and the need to deal with the "wave equation" operator, but it may simplify the resulting multigrid algorithm. More time and work is needed to construct an effective multigrid algorithm for this problem.
5. Conclusions

We have formulated a differential/integral equation for the solution of a differential Riccati equation and developed two efficient multigrid solvers for this problem, showing that our basic approach is sound. A second type of problem presented some difficulties, and we have outlined some approaches that may help overcome these and develop an efficient multigrid solver for more general Riccati equations. It is important to note that the problems we encountered were not with the Riccati equation nor with our formulation of a system of PDEs, but with the nature of the original time-dependent constraint. The governing equation, essentially a higher-order wave-type equation, was nonelliptic, and this character was preserved in the resulting system. If the original equation can be effectively solved by multigrid methods, then we should also be able to develop an efficient multigrid solver for the resulting control problem.
References

[1] J.L. Lions, Optimal control of systems governed by partial differential equations, Springer-Verlag, Berlin, 1971.
[2] P.K.C. Wang, Control of distributed parameter systems, in Advances in Control Systems, vol. 1, Academic Press, New York, 1964.
[3] V. Komkov, Formulation of Pontryagin's maximality principle in a problem of structural mechanics, Inter. J. Control, 17 (1973), pp. 455-463.
[4] J.S. Gibson and A. Adamian, Approximation theory for linear quadratic Gaussian optimal control of flexible structures, SIAM J. Contr. & Optim., 29 (1991), pp. 1-37.
[5] J.S. Gibson, Linear-quadratic optimal control of hereditary differential systems: Infinite dimensional Riccati equations and numerical approximations, SIAM J. Contr. & Optim., 21 (1983), pp. 95-115.
[6] J.S. Gibson, An analysis of optimal control regulation: Convergence and stability, SIAM J. Contr. & Optim., 19 (1981), pp. 686-707.
[7] H.T. Banks, I.G. Rosen, and K. Ito, A spline based technique for computing Riccati operators and feedback controls in regulator problems for delay equations, SIAM J. Sci. Statist. Comp., 5 (1984), pp. 830-855.
[8] H.T. Banks and K. Kunisch, The linear regulator problem for parabolic systems, SIAM J. Contr. & Optim., 22 (1984), pp. 684-698.
[9] M.J. Balas, Trends in large space structure control theory: Fondest hopes, wildest dreams, IEEE Trans. on Auto. Contr., AC-27 (1982), pp. 522-535.
[10] M.J. Balas, Feedback control of flexible systems, IEEE Trans. on Auto. Contr., AC-23 (1978), pp. 673-679.
[11] V.M. Alekseev, V.M. Tikhomirov and S.V. Fomin, Optimal Control, Consultants Bureau, New York, 1987.
[12] J. Zabczyk, Mathematical Control Theory: An Introduction, Birkhäuser, Boston, 1992.
[13] S.F. McCormick, Multilevel Projection Methods for Partial Differential Equations, SIAM, Philadelphia, 1992.
[14] A. Brandt, Multi-level adaptive solutions to boundary value problems, Math. Comp., 31 (1977), pp. 333-390.
