## Citation

- Permanent Link: http://digital.auraria.edu/AA00002127/00001
## Material Information

- Title: Preconditioned iterative methods for linear systems, eigenvalue and singular value problems
- Creator: Vecharynski, Eugene
- Publication Date: 2011
- Language: English
- Physical Description: xiii, 149 leaves ; 28 cm.
## Subjects

- Subjects / Keywords: Iterative methods (Mathematics) (lcsh); Linear systems -- Mathematical models (lcsh); Eigenvalues (lcsh)
- Genre: bibliography (marcgt); theses (marcgt); non-fiction (marcgt)
## Notes

- Thesis: Thesis (Ph.D.)--University of Colorado Denver, 2011.
- Bibliography: Includes bibliographical references (leaves 143-149).
- General Note: Department of Mathematical and Statistical Sciences
- Statement of Responsibility: by Eugene Vecharynski.
## Record Information

- Source Institution: University of Colorado Denver
- Holding Location: Auraria Library
- Rights Management: All applicable rights reserved by the source institution and holding location.
- Resource Identifier: 747540792 (OCLC); ocn747540792
## Full Text
PRECONDITIONED ITERATIVE METHODS FOR LINEAR SYSTEMS, EIGENVALUE AND SINGULAR VALUE PROBLEMS

by Eugene Vecharynski, M.S., Belarus State University, 2006

A thesis submitted to the University of Colorado Denver in partial fulfillment of the requirements for the degree of Doctor of Philosophy, Applied Mathematics, 2011.

2011 by Eugene Vecharynski. All rights reserved.

This thesis for the Doctor of Philosophy degree by Eugene Vecharynski has been approved by [signature page; names and date illegible in the scan].

Vecharynski, Eugene (Ph.D., Applied Mathematics)
Preconditioned Iterative Methods for Linear Systems, Eigenvalue and Singular Value Problems
Thesis directed by Professor Andrew Knyazev

ABSTRACT

In the present dissertation we consider three central problems of numerical linear algebra: the solution of a linear system, an eigenvalue problem, and a singular value problem. We focus on solution methods that are iterative by nature, matrix-free, preconditioned, and require a fixed amount of computational work per iteration. In particular, this manuscript aims to contribute to the areas of research related to the convergence theory of restarted Krylov subspace minimal residual methods, preconditioning for symmetric indefinite linear systems, approximation of interior eigenpairs of symmetric operators, and preconditioned singular value computations.

We first consider solving non-Hermitian linear systems with the restarted generalized minimal residual method (GMRES). We prove that the cycle-convergence of the method applied to a system of linear equations with a normal (preconditioned) coefficient matrix is sublinear. In the general case, however, it is shown that any admissible cycle-convergence behavior is possible for the restarted GMRES at a number of initial cycles; moreover, the spectrum of the coefficient matrix alone does not determine this cycle-convergence.
Next we shift our attention to iterative methods for solving symmetric indefinite systems of linear equations with symmetric positive definite preconditioners. We describe a hierarchy of such methods, from a stationary iteration to the optimal Krylov subspace preconditioned minimal residual method, and suggest a preconditioning strategy based on an approximation of the inverse of the absolute value of the coefficient matrix (absolute value preconditioners). We present an example of a simple (geometric) multigrid absolute value preconditioner for the symmetric model problem of the discretized real Helmholtz (shifted Laplacian) equation in two spatial dimensions with a relatively low wavenumber.

We extend the ideas underlying the methods for solving symmetric indefinite linear systems to the problem of computing an interior eigenpair of a symmetric operator. We present a method that we call the Preconditioned Locally Minimal Residual method (PLMR), which represents a technique for finding an eigenpair corresponding to the smallest, in absolute value, eigenvalue of a (generalized) symmetric matrix pencil. The method is based on the idea of the refined extraction procedure, performed in the preconditioner-based inner product over four-dimensional trial subspaces, and relies on the choice of the (symmetric positive definite) absolute value preconditioner.

Finally, we consider the problem of finding a singular triplet of a matrix. We suggest a preconditioned iterative method called PLMR-SVD for computing a singular triplet corresponding to the smallest singular value, and introduce preconditioning for the problem. At each iteration, the method extracts approximations to the right and left singular vectors from two separate four-dimensional trial subspaces by solving small quadratically constrained quadratic programs.
We illustrate the performance of the method on the model problem of finding the singular triplet corresponding to the smallest singular value of a gradient operator discretized over a two-dimensional domain. We construct a simple multigrid preconditioner for this problem.

This abstract accurately represents the content of the candidate's thesis. I recommend its publication.

Signed: Andrew Knyazev

DEDICATION

To my family and friends.

ACKNOWLEDGMENT

I am deeply grateful to my advisor, Professor Andrew Knyazev, for introducing me to the field. His vision of many problems and insights into their solution have definitely influenced this work. Without his guidance and support the present dissertation would have been impossible.

I would like to extend my deepest thanks to Dr. Julien Langou. His advice and opinion, as a colleague and as a friend, have always been important and timely. Chapter 2 of this dissertation is based on the research that I have performed under his supervision. It was partially written during the three months of support kindly provided by Julien in Summer 2008.

I am also indebted to Professor Jan Mandel for an introduction to the basics of multilevel methods. Several of the ideas I learned there have echoed in this manuscript. I am grateful to Dr. Merico Argentati and Professor Michele Benzi for reading this thesis and agreeing to be on my PhD committee.

I am thankful to my family and friends. Their care and support have been inspiring during all stages of work on this dissertation. Finally, I would like to thank the faculty and my fellow students at the University of Colorado Denver for creating an excellent working atmosphere. I am also grateful to Mr. and Mrs. Warren Bateman and the Department of Mathematical and Statistical Sciences for the financial support.

CONTENTS

Figures .......... xi
Tables .......... xiii

Chapter
1.
Introduction .......... 1
2. Convergence of the restarted GMRES .......... 6
2.1 The sublinear cycle-convergence of GMRES(m) for normal matrices .......... 10
2.1.1 Krylov matrix, its pseudoinverse, and spectral factorization .......... 11
2.1.2 The sublinear cycle-convergence of GMRES(m) .......... 13
2.2 Any admissible cycle-convergence behavior is possible for the restarted GMRES at its initial cycles .......... 24
2.2.1 Outline of the proof of Theorem 2.11 .......... 27
2.2.2 Proof of Theorem 2.11 for the case of a strictly decreasing cycle-convergence .......... 28
2.2.3 Extension to the case of stagnation .......... 37
2.2.4 Difference with the work of Greenbaum, Ptak, and Strakos [34] .......... 38
2.2.5 Generating examples with nonzero $r^{(q+1)}$ .......... 39
2.2.6 Any admissible convergence behavior is possible for full and restarted GMRES (at its q initial cycles) .......... 43
2.2.7 Restarted GMRES with variable restart parameter .......... 45
2.3 Conclusions .......... 45
3. Solution of symmetric indefinite systems with symmetric positive definite preconditioners .......... 47
3.1 Iterative methods for symmetric indefinite systems with SPD preconditioners .......... 50
3.1.1 Stationary iteration for solving symmetric indefinite systems with SPD preconditioners .......... 53
3.1.2 Simple residual-minimizing methods for solving symmetric indefinite systems with SPD preconditioners .......... 58
3.1.3 The second-order and minimal residual methods for solving indefinite systems with SPD preconditioners .......... 61
3.2 Absolute value preconditioners for symmetric indefinite systems .......... 65
3.2.1 Optimal SPD preconditioners for symmetric indefinite systems .......... 65
3.2.2 An absolute value preconditioner for a model problem .......... 69
3.2.2.1 Multigrid absolute value preconditioner .......... 71
3.2.2.2 Numerical examples .......... 77
3.3 Conclusions .......... 81
4. Preconditioned computations of interior eigenpairs of symmetric operators .......... 84
4.1 Idealized preconditioned methods for finding an interior eigenpair .......... 87
4.2 The Preconditioned Locally Minimal Residual method for computing interior eigenpairs .......... 95
4.2.1 PLMR: The choice of trial subspaces .......... 96
4.2.2 PLMR: The choice of iteration parameters .......... 98
4.3 Numerical examples .......... 103
4.4 Conclusions .......... 108
5. Preconditioned singular value computations .......... 110
5.1 Idealized preconditioned methods for finding a singular triplet .......... 117
5.2 The Preconditioned Locally Minimal Residual method for computing the smallest singular triplet .......... 123
5.2.1 PLMR-SVD: The choice of trial subspaces .......... 124
5.2.2 PLMR-SVD: The choice of iteration parameters .......... 127
5.3 Numerical example .......... 133
5.4 Conclusions .......... 141
References .......... 143

FIGURES

Figure
2.1 Cycle-convergence of GMRES(5) applied to a 100-by-100 normal matrix. .......... 18
2.2 Cycle-convergence of GMRES(5) applied to a 100-by-100 diagonalizable (nonnormal) matrix. .......... 22
3.1 Comparison of the MG absolute value and the inverted Laplacian preconditioners for PMINRES applied to the model problem of the size $n = (2^7-1)^2 \approx 1.6 \times 10^4$. .......... 78
3.2 Performance of the MG absolute value preconditioners for the model problem with different shift values. The problem size $n = (2^7-1)^2 \approx 1.6 \times 10^4$. The number of negative eigenvalues varies from 0 to 75. .......... 80
3.3 Comparison of PMINRES with locally optimal methods (3.17), (3.19) and (3.21), (3.24), all with the MG absolute value preconditioners, applied to the model problem of the size $n = (2^7-1)^2 \approx 1.6 \times 10^4$. .......... 81
4.1 Comparison of the PLMR method with the MG absolute value preconditioner versus the idealized eigenvalue solvers, applied to the model eigenproblem of the size $n = (2^7-1)^2 \approx 1.6 \times 10^4$. The targeted eigenpairs correspond to the smallest magnitude eigenvalues of the shifted discrete negative Laplacian (from top left to bottom left, clockwise): $\lambda_{13} \approx 6.33 \times 10^{-4}$, $\lambda_{13} \approx 2.7426$, $\lambda_{15} \approx 3.4268$ and $\lambda_{17} \approx 7.19 \times 10^{-4}$, given by shift values $c^2 = 197.258$, $200$, $250$ and $256.299$, respectively. .......... 105
4.2 Comparison of the PLMR method with and without orthogonalization on the trial subspaces. Both versions of the method are applied to the model eigenproblem of the size $n = (2^7-1)^2 \approx 1.6 \times 10^4$ and use the MG absolute value preconditioner.
The targeted eigenpairs correspond to the smallest magnitude eigenvalues of the shifted discrete negative Laplacian: $\lambda_{13} \approx 2.7426$ (left) and $\lambda_{15} \approx 3.4268$ (right), given by shift values $c^2 = 200$ and $250$, respectively. .......... 107
5.1 Comparison of the PLMR-SVD method with one MG preconditioner versus the idealized singular value solvers, applied to find the smallest singular triplet of the m-by-n discrete gradient operator, $n = (2^7-1)^2 \approx 1.6 \times 10^4$, $m = 2n$. .......... 139

TABLES

Table
3.1 Mesh-independent convergence of PMINRES with the MG absolute value preconditioner. .......... 79

1. Introduction

Complex numerical simulations and solutions of mathematical problems on large-scale data sets have become routine tasks in cutting-edge research and industry, resulting in a broad variety of computational algorithms. The nature of these algorithms can be very diverse; however, their efficiency and robustness often rely, ultimately, on the underlying techniques for solving basic problems of numerical linear algebra.

In this work, we consider the numerical solution of linear systems, eigenvalue problems, and singular value problems; see, e.g., [42]. We assume that the problems are of an extremely large size, and possibly sparse, i.e., the involved coefficient matrices contain a significant number of zero entries. The exact solutions of such problems are rarely needed. Instead, it is desirable to construct computationally inexpensive approximations to the exact solutions. In this context, the use of iterative methods, see, e.g., [3, 33, 59, 56], may be the only option. The study of theoretical and practical aspects of several iterative methods, as well as the introduction of novel iterative techniques for solving the above mentioned problems, constitutes the core of the present dissertation.

The methods that we consider in this work share a number of common characteristics.
First, their mathematical formulations are based on short-term recurrent relations, which allows constructing solvers with a fixed amount of computational work and storage per iteration. Second, the methods are preconditioned, i.e., they can use auxiliary operators, called preconditioners, which, if properly defined, significantly improve the convergence and, ideally, only modestly affect the cost of each iteration.

In the current manuscript, we address a set of computationally challenging problems, such as the numerical solution of symmetric indefinite and nonsymmetric linear systems, the computation of interior eigenpairs of symmetric matrix pencils, and finding the smallest singular triplets of general matrices. Our main results concern the convergence theory of the restarted Krylov subspace minimal residual methods, novel preconditioning strategies for symmetric indefinite linear systems and eigenvalue problems, as well as the extension of the concept of preconditioning to singular value problems.

In Chapter 2, we consider the restarted generalized minimal residual method (GMRES) for non-Hermitian linear systems. We prove that the cycle-convergence of the method applied to a system of linear equations with a normal (preconditioned) coefficient matrix is sublinear. In the general case, however, it is shown that any admissible cycle-convergence behavior is possible for the restarted GMRES at a number of initial cycles; moreover, the spectrum of the coefficient matrix alone does not determine this cycle-convergence. The results of this chapter are mostly published in [77, 76].

In Chapters 3, 4, and 5, we consider the iterative solution of symmetric indefinite systems, symmetric eigenvalue problems, and singular value problems, respectively. The material is presented in such a way that we can emphasize the interconnections between the problems, which allows us to treat their numerical solution within a unified approach.
The results presented in these chapters appear here for the first time. We note that the choice of real vector spaces has been motivated merely by the desire to simplify the presentation; the extension to the complex case is straightforward.

In Chapter 3, we first describe a hierarchy of methods for solving symmetric indefinite linear systems with symmetric positive definite (SPD) preconditioners. These methods are mainly based on the known idea of the minimization of the residual in the preconditioner-based norm. The careful study of such methods is motivated by the search for appropriate iterative schemes which can be extended to the problems of finding interior eigenpairs of symmetric operators, as well as computing the smallest singular triplets of general matrices. For example, we describe a method which can be viewed as a natural analogue of the preconditioned steepest descent algorithm for solving SPD systems, and is the simplest provably convergent residual-minimizing method for solving symmetric indefinite systems with an SPD preconditioner. We use the locally optimal accelerations of this method to construct the base scheme, which is further extended to eigenvalue and singular value problems.

Second, in Chapter 3, we suggest a novel preconditioning strategy, which is based on the idea of approximating the inverse of the absolute value of the coefficient matrix. We call preconditioners obtained using this strategy absolute value preconditioners. We show, for a model problem of the discretized real Helmholtz (shifted Laplacian) equation in two spatial dimensions with a relatively low wavenumber, that such preconditioners can be efficiently constructed, e.g., in the multigrid framework. Significantly, the same preconditioners can be used for finding interior eigenpairs of symmetric matrix pencils, if applied within the scheme described in Chapter 4. The absolute value
The absolute value 3 preconditioners for symmetric indefinite systems also motivate the definition of preconditioners for the singular value problems in Chapter 5. Using the results of Chapter 3, in Chapter 4, we present a new method, that we call the Preconditioned Locally Minimal Residual method (PLMR), which represents a technique for finding an eigenpair corresponding to the smallest, in the absolute value, eigenvalue of a (generalized) symmetric matrix pencil. The method is based on the idea of the refined extraction procedure, also called the refined projection procedure, performed in the preconditioner-based inner prod- uct over four-dimensional trial subspaces, and relies on the choice of the (SPD) absolute value preconditioner. We applied the described technique to the model problem of finding an eigenpair of the two-dimensional discretized Laplace oper- ator, which corresponds to the eigenvalue, closest to a given shift. The method demonstrated a satisfactory convergence behavior, with the convergence rate comparable, at a number of initial steps, to that of the optimal preconditioned minimal residual method, applied to the problem of finding the corresponding null space (eigenspace) of the shifted Laplacian. Finally, in Chapter 5, we consider the problem of finding a singular triplet of a matrix. We suggest a new preconditioned iterative method, that we refer to as PLMR-SVD, for computing a singular triplet corresponding to the smallest singular value. The method has several important features. First, at every step, it uses a pair of separate four-dimensional trial subspaces for extracting the right and left singular vectors, respectively. Second, it admits two SPD preconditioners, designed specifically for a singular value problem. We show that even the proper choice of only one of the two preconditioners can result in 4 a significantly improved convergence behavior. 
As a model problem, we consider computing a singular triplet, corresponding to the smallest singular value, of a discrete two-dimensional gradient operator. We present a simple construction of the multigrid preconditioner for this problem.

Let us summarize the main results obtained within the scope of the present dissertation: we have proved two theoretical results which concern the convergence theory of the restarted GMRES algorithm, introduced a new preconditioning strategy for symmetric indefinite linear systems, suggested a novel preconditioned method for computing interior eigenpairs of symmetric matrix pencils, and described a preconditioned method for finding the smallest singular triplets. This work has been partially supported by NSF-DMS 0612751.

2. Convergence of the restarted GMRES

The generalized minimal residual method (GMRES) was originally introduced by Saad and Schultz [61] in 1986 and has become a popular method for solving non-Hermitian systems of linear equations,

$$Ax = b, \quad A \in \mathbb{C}^{n \times n}, \quad b \in \mathbb{C}^n. \tag{2.1}$$

Without loss of generality, to simplify the presentation below, we assume that system (2.1) is already preconditioned.

GMRES is classified as a Krylov subspace (projection) iterative method. At every new iteration $i$, GMRES constructs an approximation $x^{(i)} \in x^{(0)} + \mathcal{K}_i(A, r^{(0)})$ to the exact solution of (2.1) such that the 2-norm of the corresponding residual vector $r^{(i)} = b - Ax^{(i)}$ is minimized over the affine space $r^{(0)} + A\mathcal{K}_i(A, r^{(0)})$, i.e.,

$$\|r^{(i)}\| = \min_{u \in A\mathcal{K}_i(A, r^{(0)})} \|r^{(0)} - u\|, \tag{2.2}$$

where $\mathcal{K}_i(A, r^{(0)}) = \operatorname{span}\{r^{(0)}, Ar^{(0)}, \ldots, A^{i-1}r^{(0)}\}$ is the $i$-dimensional Krylov subspace induced by the matrix $A$ and the initial residual vector $r^{(0)} = b - Ax^{(0)}$, with $x^{(0)}$ being an initial approximate solution of (2.1), and $A\mathcal{K}_i(A, r^{(0)}) = \operatorname{span}\{Ar^{(0)}, A^2r^{(0)}, \ldots, A^ir^{(0)}\}$.

As usual, in a linear setting, a notion of minimality is associated with some orthogonality condition.
In our case, minimization (2.2) is equivalent to forcing the new residual vector $r^{(i)}$ to be orthogonal to the subspace $A\mathcal{K}_i(A, r^{(0)})$ (also known as the Krylov residual subspace). In practice, for a large problem size, the latter orthogonality condition results in a costly procedure of orthogonalization against the expanding Krylov residual subspace. Orthogonalization, together with the storage requirement, makes the GMRES method complexity- and storage-prohibitive for practical application. A straightforward treatment for this complication is the so-called restarted GMRES [61].

The restarted GMRES, or GMRES(m), is based on restarting GMRES after every m iterations. At each restart, we use the latest approximate solution as the initial approximation for the next GMRES run. Within this framework, a single run of m GMRES iterations is called a GMRES(m) cycle, and m is called the restart parameter. Consequently, restarted GMRES can be regarded as a sequence of GMRES(m) cycles. When the convergence happens without any restart occurring, the algorithm is known as the full GMRES.

In the context of restarted GMRES, our interest shifts towards the residual vectors $r^{(k)}$ at the end of every kth GMRES(m) cycle (as opposed to the residual vectors (2.2) obtained at each iteration of the algorithm).

Definition 2.1 (cycle-convergence) We define the cycle-convergence of the restarted GMRES(m) to be the convergence of the residual norms $\|r^{(k)}\|$, where, for each $k$, $r^{(k)}$ is the residual at the end of the kth GMRES(m) cycle.
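The cycle structure just described can be made concrete with a minimal, self-contained GMRES(m) sketch (an assumed NumPy illustration, not the thesis code): `gmres_cycle` performs one cycle via the Arnoldi process, and `restarted_gmres` strings cycles together, restarting each time from the latest iterate.

```python
import numpy as np

def gmres_cycle(A, b, x0, m):
    """One GMRES(m) cycle: minimize ||b - A x|| over x0 + K_m(A, r0)."""
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    n = len(b)
    Q = np.zeros((n, m + 1))          # orthonormal Krylov basis
    H = np.zeros((m + 1, m))          # upper Hessenberg projection
    Q[:, 0] = r0 / beta
    for j in range(m):                # Arnoldi with modified Gram-Schmidt
        w = A @ Q[:, j]
        for i in range(j + 1):
            H[i, j] = Q[:, i] @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14 * beta:        # lucky breakdown
            m = j + 1
            H, Q = H[:m + 1, :m], Q[:, :m + 1]
            break
        Q[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(H.shape[0])
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)  # small least-squares solve
    return x0 + Q[:, :m] @ y

def restarted_gmres(A, b, m, cycles):
    """GMRES(m): restart from the latest iterate after every m inner steps."""
    x = np.zeros_like(b)
    res = [np.linalg.norm(b - A @ x)]
    for _ in range(cycles):
        x = gmres_cycle(A, b, x, m)
        res.append(np.linalg.norm(b - A @ x))
    return x, res

# Hypothetical normal (diagonal) test system.
A = np.diag(np.linspace(1.0, 10.0, 50))
b = np.ones(50)
x, res = restarted_gmres(A, b, m=5, cycles=10)
# res is monotonically nonincreasing, by the local minimality of each cycle.
```

The list `res` is exactly the cycle-convergence sequence $\{\|r^{(k)}\|\}$ of Definition 2.1.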
GMRES(m) constructs approximations $x^{(k)} \in x^{(k-1)} + \mathcal{K}_m(A, r^{(k-1)})$ to the exact solution of (2.1) such that each residual vector $r^{(k)} = b - Ax^{(k)}$ satisfies the local minimality condition

$$\|r^{(k)}\| = \min_{u \in A\mathcal{K}_m(A, r^{(k-1)})} \|r^{(k-1)} - u\|, \tag{2.3}$$

where $\mathcal{K}_m(A, r^{(k-1)})$ is the m-dimensional Krylov subspace produced at the kth GMRES(m) cycle,

$$\mathcal{K}_m(A, r^{(k-1)}) = \operatorname{span}\{r^{(k-1)}, Ar^{(k-1)}, \ldots, A^{m-1}r^{(k-1)}\}, \tag{2.4}$$

and $A\mathcal{K}_m(A, r^{(k-1)}) = \operatorname{span}\{Ar^{(k-1)}, A^2r^{(k-1)}, \ldots, A^mr^{(k-1)}\}$ is the corresponding Krylov residual subspace.

The price paid for the reduction of the computational work in GMRES(m) is the loss of the global optimality in (2.2). Although (2.3) implies a monotonic decrease of the norms of the residual vectors $r^{(k)}$, GMRES(m) can stagnate [61, 80]. This is in contrast with the full GMRES, which is guaranteed to converge to the exact solution of (2.1) in n steps (assuming exact arithmetic and nonsingular A). In practice, however, if n is too large, proper choices of a preconditioner and a restart parameter, e.g., [25, 26, 46], can significantly accelerate the convergence of GMRES(m), thus making the method attractive for applications.

While a great deal of effort has been devoted to the characterization of the convergence of the full GMRES, e.g., [74, 21, 34, 35, 43, 70, 72], our understanding of the behavior of GMRES(m) is far from complete, leaving us with more questions than answers, e.g., [25].

For a few classes of matrices, convergence estimates are available for the restarted GMRES and/or the full GMRES. For example, for real positive definite matrices (that is, for matrices A for which $H = (A + A^T)/2$ is symmetric positive definite, or, equivalently, for matrices A for which $x^T A x > 0$ for any nonzero $x \in \mathbb{R}^n$), Elman's bound [22, 23, 33, 61] can be stated as follows:

$$\|r^{(k)}\|^2 \le (1 - \rho)^k \|r^{(0)}\|^2, \quad \text{where } 0 < \rho = \left(\lambda_{\min}(H)/\|A\|\right)^2 < 1.$$

The latter guarantees the linear cycle-convergence of GMRES(m) for a positive definite matrix.
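The quantity in Elman's bound is easy to check numerically. The sketch below (an illustrative NumPy example with randomly generated data) builds a real positive definite matrix, computes $\rho = (\lambda_{\min}(H)/\|A\|)^2$, and verifies that a single minimal residual step, i.e., one GMRES(1) cycle, reduces the squared residual norm by at least the factor $1-\rho$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
B = rng.standard_normal((n, n))
S = B @ B.T + np.eye(n)          # SPD matrix
C = rng.standard_normal((n, n))
A = S + (C - C.T)                # real positive definite: (A + A^T)/2 = S

H = 0.5 * (A + A.T)              # symmetric part
rho = (np.linalg.eigvalsh(H).min() / np.linalg.norm(A, 2)) ** 2

# One GMRES(1) cycle (a minimal residual step) from a random residual r:
# r1 = r - alpha*A r with alpha chosen to minimize ||r1||.
r = rng.standard_normal(n)
Ar = A @ r
alpha = (r @ Ar) / (Ar @ Ar)
r1 = r - alpha * Ar
reduction = np.linalg.norm(r1) ** 2 / np.linalg.norm(r) ** 2
```

Here `reduction <= 1 - rho` is exactly the one-step (k = 1, m = 1) case of the bound; a GMRES(m) cycle with m > 1 can only do better than its first inner step.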
Improvements and generalizations of this bound can be found in [8, 63, 82]. For normal matrices the convergence of the full GMRES is well studied. In particular, the convergence is known to be governed solely by the spectrum of A [62, 74]. In Section 2.1 of this manuscript, we prove that the cycle-convergence of restarted GMRES for normal matrices is sublinear. This statement means that, for normal matrices, the reduction in the norm of the residual vector at the current GMRES(m) cycle cannot be better than the reduction at the previous cycle. We would like to mention the simultaneous but independent work [5], where the authors present an alternative proof of the sublinear convergence of the restarted GMRES for normal matrices.

Assuming that the coefficient matrix A is diagonalizable, some characterizations of the convergence of the full GMRES rely on the condition number of the eigenbasis [74]. Other characterizations of the convergence of the full GMRES rely on pseudospectra [52]. More commonly, the field of values is used [8, 22, 23, 33, 61, 63, 82]. A discussion of how descriptive some of these bounds are is given by Embree [24]. In the general case, for the full GMRES, the following theorem shows that we cannot prove convergence results based on the spectrum of the coefficient matrix alone.

Theorem 2.2 (Greenbaum, Ptak, and Strakos, 1996, [34]) Given a nonincreasing positive sequence $f(0) \ge f(1) \ge \cdots \ge f(n-1) > 0$, there exists an n-by-n matrix A and a vector $r^{(0)}$ with $\|r^{(0)}\| = f(0)$ such that $f(i) = \|r^{(i)}\|$, $i = 1, \ldots, n-1$, where $r^{(i)}$ is the residual at step i of the GMRES algorithm applied to the linear system $Ax = b$ with initial residual $r^{(0)} = b - Ax^{(0)}$. Moreover, the matrix A can be chosen to have any desired (nonzero) eigenvalues.

This result states that, in general, eigenvalues alone do not determine the convergence of the full GMRES.
A complete description of the set of all pairs $\{A, b\}$ for which the full GMRES applied to (2.1) generates the prescribed convergence curve, while the matrix A has any (nonzero) eigenvalues, is given in [2]. In Section 2.2, we show that any admissible cycle-convergence behavior is possible for restarted GMRES at a number of initial cycles; moreover, the spectrum of the coefficient matrix alone does not determine this cycle-convergence. The latter can be viewed as an extension of the result of Greenbaum, Ptak, and Strakos, given by Theorem 2.2, to the case of restarted GMRES.

2.1 The sublinear cycle-convergence of GMRES(m) for normal matrices

Throughout this section we assume (unless otherwise explicitly stated) A to be nonsingular and normal, i.e., A admits the decomposition

$$A = V \Lambda V^*, \tag{2.5}$$

where $\Lambda \in \mathbb{C}^{n \times n}$ is a diagonal matrix with the diagonal elements being the nonzero eigenvalues of A, and $V \in \mathbb{C}^{n \times n}$ is a unitary matrix of the corresponding eigenvectors. $\|\cdot\|$ denotes the 2-norm throughout.

2.1.1 Krylov matrix, its pseudoinverse, and spectral factorization

For a given restart parameter m ($1 \le m \le n-1$), we denote the kth cycle of the restarted GMRES by GMRES(A, m, $r^{(k-1)}$). We assume that the residual vector $r^{(k)}$, produced at the end of GMRES(A, m, $r^{(k-1)}$), is nonzero. A run of GMRES(A, m, $r^{(k-1)}$) generates the Krylov subspace $\mathcal{K}_m(A, r^{(k-1)})$ given in (2.4). For each $\mathcal{K}_m(A, r^{(k-1)})$ we define a matrix

$$K(A, r^{(k-1)}) = [r^{(k-1)} \;\; Ar^{(k-1)} \;\; \cdots \;\; A^m r^{(k-1)}] \in \mathbb{C}^{n \times (m+1)}, \tag{2.6}$$

where $k = 1, 2, \ldots, q$, and q is the total number of GMRES(m) cycles. Matrix (2.6) is called the Krylov matrix. We say that $K(A, r^{(k-1)})$ corresponds to the cycle GMRES(A, m, $r^{(k-1)}$). Note that the columns of $K(A, r^{(k-1)})$ span the next (m+1)-dimensional Krylov subspace $\mathcal{K}_{m+1}(A, r^{(k-1)})$.

According to (2.3), the assumption $r^{(k)} \ne 0$ implies that $r^{(k-1)}$ cannot be expressed as a linear combination of vectors in $A\mathcal{K}_m(A, r^{(k-1)})$. Thus, the matrix $K(A, r^{(k-1)})$ in (2.6) is of full rank, and we can introduce the pseudoinverse of the matrix $K(A, r^{(k-1)})$, i.e.,

$$K^\dagger(A, r^{(k-1)}) = \left(K^*(A, r^{(k-1)}) \, K(A, r^{(k-1)})\right)^{-1} K^*(A, r^{(k-1)}) \in \mathbb{C}^{(m+1) \times n},$$

which is well defined and unique. The following lemma shows that the first column of $\left(K^\dagger(A, r^{(k-1)})\right)^*$ is the next residual vector $r^{(k)}$, up to a scaling factor.

Lemma 2.3 Given $A \in \mathbb{C}^{n \times n}$ (not necessarily normal), for any $k = 1, 2, \ldots, q$, we have

$$\frac{r^{(k)}}{\|r^{(k)}\|^2} = \left(K^\dagger(A, r^{(k-1)})\right)^* e_1, \tag{2.7}$$

where $e_1 = [1 \; 0 \; \cdots \; 0]^T \in \mathbb{R}^{m+1}$.

Proof: See Ipsen [43, Theorem 2.1], as well as [17, 65].
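Identity (2.7) is easy to sanity-check numerically. The following sketch (an illustrative NumPy example with hypothetical random data, in real arithmetic) forms the Krylov matrix, performs one GMRES(m) minimization by least squares, and compares the first column of $(K^\dagger)^*$ with $r^{(1)}/\|r^{(1)}\|^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
A = rng.standard_normal((n, n))      # generic (not necessarily normal) matrix
r0 = rng.standard_normal(n)

# Krylov matrix K = [r0, A r0, ..., A^m r0], of size n x (m+1), as in (2.6).
K = np.column_stack([np.linalg.matrix_power(A, j) @ r0 for j in range(m + 1)])

# One GMRES(m) minimization (2.3): r1 = r0 - u with u in A K_m(A, r0).
AKm = A @ K[:, :m]
y, *_ = np.linalg.lstsq(AKm, r0, rcond=None)
r1 = r0 - AKm @ y

# Lemma 2.3: the first column of (K^+)^* equals r1 / ||r1||^2.
v = np.linalg.pinv(K).T[:, 0]        # real data, so conjugate transpose = T
```

The check passes to machine precision for generic data, independently of normality of A, in agreement with the lemma.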
Thus, the matrix K (A, Afe-1)) in (2.6) is of the full rank, rank (A (A,r the matrix K (A,Afc_1)), i.e., K] (A,r(fc_1)) = (K* (A,r(fc_1)) K (A, r(fc~1)))~1 K* (A,r{k~l)) G C(m+1)xn, which is well-defined and unique. The following lemma shows that the first column of (A^ (A, r^-1)))* is the next residual vector r^ up to a scaling factor. 11 Lemma 2.3 Given A G Cnxn (not necessarily normal), for any k = 1,2,..., q, we have (2.7) where e\ = [1 0 0]T G Rm+1. Proof: See Ipsen [43, Theorem 2.1], as well as [17, 65]. Another important ingredient, first described in [43], is the so-called spectral factorization of the Krylov matrix K (A,r^fe_1^). This factorization is made of three components that encapsulate separately the information on eigenvalues of A, its eigenvectors, and the previous residual vector Lemma 2.4 Let A G Cnxn satisfy (2.5). Then the Krylov matrix K (A,Ak~^), for any k = 1,2,... q, can be factorized as K (A,r^k~^) = VDk_iZ, (2.8) where dk-i V*r(k~^ G Cn, Dk-i diag((4_i) G Cnxn, and Z G Cnx^m+1^ is the Vandermonde matrix computed from the eigenvalues of A, Z = [e Ae Ame], (2.9) e = [1 1 l]rGP. 12 Proof: Starting from (2.5) and the definition of the Krylov matrix (2.6), K (A, r^-1)) = Ar^k~1'1 J4mr(fc_1)j = [l/R*r(fe_1) VAV*r{k~l) VAmV*r{k~1)] = V [dk-i Adk-i Amdfc_i] = V [Dk-ie ADk-ie AmDk~ie] = VDk-X[e Ae A me\ = VDk^Z. d It is clear that the statement of Lemma 2.4 can be easily generalized to the case of a diagonalizable (nonnormal) matrix A providing that we define dk-1 = in the lemma. 2.1.2 The sublinear cycle-convergence of GMRES(m) Along with (2.1) let us consider the system = b (2.10) with the matrix A replaced by its conjugate transpose. Clearly, according to (2.5), A* = VAV*. (2.11) It turns out that m steps of GMRES applied to systems (2.1) and (2.10) produce residual vectors of equal norms at each stepprovided that the initial residual vector is identical. 
This observation is crucial for proving the sublinear cycle-convergence of GMRES (m) and is formalized in the following lemma. 13 Lemma 2.5 Assume that A Â£ Cnxn is a nonsingular normal matrix. Let r^ and f be the nonzero residual vectors obtained by applying m steps of GMRES to systems (2.1) and (2.10); 1 < m < n 1. Then provided that the initial approximate solutions of (2.1) and (2.10) induce the same initial residual vector r^. Moreover, if pm{z) and qm(z) are the (GMRES) polynomials which mini- mize, respectively, ||p(j4)r()|| and ||p(A*)r^|| over p(z) Â£ Vm, where Vm is the set of all polynomials of degree at most m defined on the complex plane such that p(0) = 1, then Pm(z) = Qm(z), where p(z) denotes the polynomial obtained from p(z) Â£ Vm by the complex conjugation of its coefficients. Proof: Let us consider a polynomial p(z) G Vm. Let r^ be a nonzero initial residual vector for systems (2.1) and (2.10) simultaneously. Since the matrix A is normal, so is p(A)\ thus p(A) commutes with its conjugate transpose p*(A). We have ||p(j4)r<>||2 = (p{A)r^,p{A)rw) = {r{0\p*(A)p(A)rw) = (rm,p(A)p*{A)r{0)) = (p*(A)r{0\p*{A)rW) = {(Vp(A)V*)* r(0), (Vp(A)V*)* r(0)) = (Vp(K)V*rw,Vp(K)V*rw) = (p(VAV*)r{0\p(VAV*)rw) = ||p(PAW>(0)||2, 14 where p(z) is the polynomial obtained from p(z) by conjugating its coefficients. By (2.11) we conclude that l|pM)r(0>|| = ||pM><0>||. We note that the last statement is true for any polynomial p, for any r0, and for any normal A. Now, let us look at ||r^|| and ||f^||. On the one hand, ||r(m)|| = min ||p(A)r(0)|| = ||pm(A)r{0)|| = ||pm(A*)r(0)|| pEPm > min ||p(v4*)r^|| = min ||p(J4*)r^|| = ||r^||. pVm peVm On the other hand, ||f(m)|| = min ||p(A*)r{0)|| = ||gm(v4*)r(0)|| = || P&Vm > min ||p(.4)r^0)|| = min ||p(.4)r^|| = ||r^||. peVm pePm Combining both results, we conclude that l|r(m)|| = r-ll, which proves the first part of the lemma. 
To prove the second part of the lemma, we consider the following equalities:

$\|q_m(A^{*}) r^{(0)}\| = \min_{p \in \mathcal{P}_m} \|p(A^{*}) r^{(0)}\| = \|\tilde{r}^{(m)}\| = \|r^{(m)}\| = \min_{p \in \mathcal{P}_m} \|p(A) r^{(0)}\| = \|\bar{p}_m(A^{*}) r^{(0)}\|$.

By uniqueness of the GMRES polynomial [36, Theorem 2], we conclude that $\bar{p}_m(z) = q_m(z)$. □

The previous lemma is a general result for the full GMRES, which states that, given a nonsingular normal matrix $A$ and an initial residual vector $r^{(0)}$, GMRES applied to $A$ with $r^{(0)}$ produces the same convergence curve as GMRES applied to $A^{*}$ with $r^{(0)}$. In the framework of restarted GMRES, Lemma 2.5 implies that the cycles GMRES($A$, $m$, $r^{(k)}$) and GMRES($A^{*}$, $m$, $r^{(k)}$) result in residual vectors $r^{(k+1)}$ and $\tilde{r}^{(k+1)}$, respectively, that have the same norm.

Theorem 2.6 (the sublinear cycle-convergence of GMRES(m)) Let $\{r^{(k)}\}_{k=0}^{q}$ be a sequence of nonzero residual vectors produced by GMRES(m) applied to system (2.1) with a nonsingular normal matrix $A \in \mathbb{C}^{n \times n}$, $1 \le m \le n-1$. Then

$\frac{\|r^{(k)}\|}{\|r^{(k-1)}\|} \le \frac{\|r^{(k+1)}\|}{\|r^{(k)}\|}, \quad k = 1, \ldots, q-1$.  (2.12)

Proof: Left multiplication of both parts of (2.7) by $K^{*}(A, r^{(k-1)})$ leads to

$e_1 = \frac{1}{\|r^{(k)}\|^2} K^{*}(A, r^{(k-1)}) \, r^{(k)}$.

By (2.8) in Lemma 2.4, we factorize the Krylov matrix $K(A, r^{(k-1)})$ in the previous equality:

$e_1 = \frac{1}{\|r^{(k)}\|^2} Z^{*} \bar{D}_{k-1} V^{*} r^{(k)} = \frac{1}{\|r^{(k)}\|^2} Z^{*} \bar{D}_{k-1} d_k$.

Applying complex conjugation to this equality (and observing that $e_1$ is real), we get

$e_1 = \frac{1}{\|r^{(k)}\|^2} Z^{T} D_{k-1} \bar{d}_k$.

According to the definition of $D_{k-1}$ in Lemma 2.4, $D_{k-1} \bar{d}_k = \bar{D}_k d_{k-1}$, thus

$e_1 = \frac{1}{\|r^{(k)}\|^2} Z^{T} \bar{D}_k d_{k-1} = \frac{1}{\|r^{(k)}\|^2} \left( Z^{T} \bar{D}_k V^{*} \right) r^{(k-1)}$.

From (2.8) and (2.11) we notice that $K(A^{*}, r^{(k)}) = K(V \bar{\Lambda} V^{*}, r^{(k)}) = V D_k \bar{Z}$, so that $K^{*}(A^{*}, r^{(k)}) = Z^{T} \bar{D}_k V^{*}$, and therefore

$e_1 = \frac{1}{\|r^{(k)}\|^2} K^{*}(A^{*}, r^{(k)}) \, r^{(k-1)}$.  (2.13)

Considering the residual vector $r^{(k-1)}$ as a solution of the underdetermined system (2.13), we can represent the latter as

$r^{(k-1)} = \|r^{(k)}\|^2 \left( K^{*}(A^{*}, r^{(k)}) \right)^{\dagger} e_1 + w_k$,  (2.14)

where $w_k \in \operatorname{null}\left( K^{*}(A^{*}, r^{(k)}) \right)$.
We note that since $r^{(k+1)}$ is nonzero (an assumption in Theorem 2.6), the residual vector $\tilde{r}^{(k+1)}$ at the end of the cycle GMRES($A^{*}$, $m$, $r^{(k)}$) is nonzero as well by Lemma 2.5; hence the corresponding Krylov matrix $K(A^{*}, r^{(k)})$ has full rank, and thus the pseudoinverse in (2.14) is well defined. Moreover, since $w_k \perp \left( K^{*}(A^{*}, r^{(k)}) \right)^{\dagger} e_1$, using the Pythagorean theorem we obtain

$\|r^{(k-1)}\|^2 = \|r^{(k)}\|^4 \left\| \left( K^{*}(A^{*}, r^{(k)}) \right)^{\dagger} e_1 \right\|^2 + \|w_k\|^2$.

Now, since $\left( K^{*}(A^{*}, r^{(k)}) \right)^{\dagger} = \left( K^{\dagger}(A^{*}, r^{(k)}) \right)^{*}$, we get

$\|r^{(k-1)}\|^2 = \|r^{(k)}\|^4 \left\| \left( K^{\dagger}(A^{*}, r^{(k)}) \right)^{*} e_1 \right\|^2 + \|w_k\|^2$,

and then, by (2.7),

$\|r^{(k-1)}\|^2 = \frac{\|r^{(k)}\|^4}{\|\tilde{r}^{(k+1)}\|^2} + \|w_k\|^2 \ge \frac{\|r^{(k)}\|^4}{\|\tilde{r}^{(k+1)}\|^2}$,

where $\tilde{r}^{(k+1)}$ is the residual vector at the end of the cycle GMRES($A^{*}$, $m$, $r^{(k)}$). Finally,

$\frac{\|\tilde{r}^{(k+1)}\|^2}{\|r^{(k)}\|^2} \ge \frac{\|r^{(k)}\|^2}{\|r^{(k-1)}\|^2}$,

so that

$\frac{\|r^{(k)}\|}{\|r^{(k-1)}\|} \le \frac{\|\tilde{r}^{(k+1)}\|}{\|r^{(k)}\|}$.  (2.15)

By Lemma 2.5, the norm of the residual vector $\tilde{r}^{(k+1)}$ at the end of the cycle GMRES($A^{*}$, $m$, $r^{(k)}$) is equal to the norm of the residual vector $r^{(k+1)}$ at the end of the cycle GMRES($A$, $m$, $r^{(k)}$), which completes the proof of the theorem. □

[Figure 2.1: Cycle-convergence of GMRES(5) applied to a 100-by-100 normal matrix.]

Geometrically, Theorem 2.6 suggests that any residual curve of a restarted GMRES, applied to a system with a nonsingular normal matrix, is nonincreasing and concave up (Figure 2.1).

Corollary 2.7 (cycle-convergence of GMRES(m)) Let $\|r^{(0)}\|$ and $\|r^{(1)}\|$ be given. Then, under the assumptions of Theorem 2.6, the norms of the residual vectors $r^{(k)}$ at the end of each GMRES(m) cycle satisfy the following inequality:

$\|r^{(k)}\| \ge \|r^{(0)}\| \left( \frac{\|r^{(1)}\|}{\|r^{(0)}\|} \right)^{k}$.  (2.16)

Proof: Directly follows from (2.12). □

Inequality (2.16) shows that we are able to provide a lower bound for the residual norm at any cycle $k > 1$ after performing only one cycle of GMRES(m) applied to system (2.1) with a nonsingular normal matrix $A$. From the proof of Theorem 2.6 it is clear that, for a fixed $k$, the equality in (2.12) holds if and only if the vector $w_k$ in (2.14) from the null space of the corresponding matrix $K^{*}(A^{*}, r^{(k)})$ is zero.
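Theorem 2.6 and Corollary 2.7 can be illustrated with a toy restarted GMRES driver. The code below is our own sketch (each cycle is again emulated by a dense least-squares solve over the Krylov basis, and the tolerance guards against rounding); it checks that the cycle-to-cycle residual ratios are nondecreasing, which is inequality (2.12), and that the geometric lower bound (2.16) holds.

```python
import numpy as np

# Restarted GMRES(m) on a random normal matrix: verify the sublinear
# cycle-convergence (2.12) and the lower bound (2.16).

def gmres_cycle(A, r, m):
    """One GMRES(m) cycle: minimal residual of r over r - A K_m(A, r)."""
    AK = np.column_stack(
        [np.linalg.matrix_power(A, j) @ r for j in range(1, m + 1)])
    y, *_ = np.linalg.lstsq(AK, r, rcond=None)
    return r - AK @ y

rng = np.random.default_rng(2)
n, m, q = 30, 5, 5
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = 1.0 + rng.random(n) + 1j * rng.standard_normal(n)   # well away from 0
A = V @ np.diag(lam) @ V.conj().T                         # nonsingular, normal

r = rng.standard_normal(n) + 1j * rng.standard_normal(n)
norms = [np.linalg.norm(r)]
for _ in range(q):
    r = gmres_cycle(A, r, m)
    norms.append(np.linalg.norm(r))

ratios = [norms[k + 1] / norms[k] for k in range(q)]
tol = 1e-6
# (2.12): ratios are nondecreasing.
assert all(ratios[k + 1] >= ratios[k] - tol for k in range(q - 1))
# (2.16): geometric lower bound from the first cycle alone.
assert all(norms[k] >= norms[0] * ratios[0]**k - tol for k in range(q + 1))
print("sublinear cycle-convergence confirmed; ratios:", np.round(ratios, 4))
```

Replacing the normal matrix by a nonnormal one (e.g., adding a strictly triangular perturbation before forming `A`) typically breaks the first assertion, in line with the discussion of Figure 2.2 below.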
In particular, if the restart parameter is chosen to be one less than the problem size, i.e., $m = n-1$, the matrix $K^{*}(A^{*}, r^{(k)})$ in (2.13) becomes an $n$-by-$n$ nonsingular matrix, hence with a zero null space, and thus inequality (2.12) is in fact an equality if $m = n-1$. We now show that the cycle-convergence of GMRES(n-1), applied to system (2.1) with a nonsingular normal matrix $A$, is completely determined by the norms of the two initial residual vectors $r^{(0)}$ and $r^{(1)}$.

Corollary 2.8 (the cycle-convergence of GMRES(n-1)) Let us suppose that $\|r^{(0)}\|$ and $\|r^{(1)}\|$ are given. Then, under the assumptions of Theorem 2.6, the norms of the residual vectors $r^{(k)}$ at the end of each GMRES(n-1) cycle obey the following formula:

$\|r^{(k)}\| = \|r^{(0)}\| \left( \frac{\|r^{(1)}\|}{\|r^{(0)}\|} \right)^{k}$.  (2.17)

Proof: Representation (2.14) of the residual vector $r^{(k-1)}$ for $m = n-1$ turns into

$r^{(k-1)} = \|r^{(k)}\|^2 \left( K^{*}(A^{*}, r^{(k)}) \right)^{-1} e_1$,  (2.18)

implying, by the proof of Theorem 2.6, that the equality in (2.12) holds at each GMRES(n-1) cycle. Thus,

$\frac{\|r^{(k+1)}\|}{\|r^{(k)}\|} = \frac{\|r^{(k)}\|}{\|r^{(k-1)}\|}, \quad k = 1, \ldots, q-1$.

We show (2.17) by induction in $k$. Using the formula above, it is easy to verify (2.17) for $\|r^{(2)}\|$ and $\|r^{(3)}\|$ ($k = 2, 3$). Let us assume that for some $k$, $3 \le k \le q-1$, the norms $\|r^{(k-1)}\|$ and $\|r^{(k)}\|$ can also be computed by (2.17). Then

$\|r^{(k+1)}\| = \frac{\|r^{(k)}\|^2}{\|r^{(k-1)}\|} = \|r^{(0)}\| \left( \frac{\|r^{(1)}\|}{\|r^{(0)}\|} \right)^{2k} \left( \frac{\|r^{(1)}\|}{\|r^{(0)}\|} \right)^{-(k-1)} = \|r^{(0)}\| \left( \frac{\|r^{(1)}\|}{\|r^{(0)}\|} \right)^{k+1}$.

Thus, (2.17) holds for all $k = 1, \ldots, q$. □

Another observation in the proof of Theorem 2.6 leads to a result from Baker, Jessup, and Manteuffel [6]. In this paper, the authors prove that, if GMRES(n-1) is applied to a system with a Hermitian or skew-Hermitian matrix, the residual vectors at the end of each restart cycle alternate direction in a cyclic fashion [6, Theorem 2].
In the following corollary we (slightly) refine this result by providing the exact expression for the constants $\alpha_k$ in [6, Theorem 2].

Corollary 2.9 (the alternating residuals) Let $\{r^{(k)}\}_{k=0}^{q}$ be a sequence of nonzero residual vectors produced by GMRES(n-1) applied to system (2.1) with a nonsingular Hermitian or skew-Hermitian matrix $A \in \mathbb{C}^{n \times n}$. Then

$r^{(k+1)} = \alpha_k r^{(k-1)}, \quad \alpha_k = \frac{\|r^{(k+1)}\|^2}{\|r^{(k)}\|^2}$.  (2.19)

Proof: For the case of a Hermitian matrix $A$, i.e., $A^{*} = A$, the proof follows directly from (2.7) and (2.18). Let $A$ be skew-Hermitian, i.e., $A^{*} = -A$. Then, by (2.7) and (2.18),

$r^{(k-1)} = \|r^{(k)}\|^2 \left( K^{*}(A^{*}, r^{(k)}) \right)^{-1} e_1 = \|r^{(k)}\|^2 \left( K^{*}(-A, r^{(k)}) \right)^{-1} e_1 = \frac{\|r^{(k)}\|^2}{\|\tilde{r}^{(k+1)}\|^2} \, \tilde{r}^{(k+1)}$,

where $\tilde{r}^{(k+1)}$ is the residual at the end of the cycle GMRES($-A$, $n-1$, $r^{(k)}$). According to (2.3), the residual vectors $r^{(k+1)}$ and $\tilde{r}^{(k+1)}$ at the end of the cycles GMRES($A$, $n-1$, $r^{(k)}$) and GMRES($-A$, $n-1$, $r^{(k)}$) are obtained by orthogonalizing $r^{(k)}$ against the Krylov residual subspaces $A \mathcal{K}_{n-1}(A, r^{(k)})$ and $(-A) \mathcal{K}_{n-1}(-A, r^{(k)})$, respectively. But $(-A) \mathcal{K}_{n-1}(-A, r^{(k)}) = A \mathcal{K}_{n-1}(A, r^{(k)})$, and hence $\tilde{r}^{(k+1)} = r^{(k+1)}$. □

In general, for systems with nonnormal matrices, the cycle-convergence behavior of the restarted GMRES is not sublinear. In Figure 2.2, we consider a nonnormal diagonalizable matrix and illustrate this claim. Indeed, for nonnormal matrices, it has been observed that the cycle-convergence of restarted GMRES can be superlinear [81].

In this concluding subsection we restrict our attention to the case of a diagonalizable matrix $A$,

$A = V \Lambda V^{-1}, \quad A^{*} = V^{-*} \bar{\Lambda} V^{*}$.  (2.20)

The analysis performed in Theorem 2.6 can be generalized to the case of a diagonalizable matrix [79], resulting in inequality (2.15). However, as we depart from normality, Lemma 2.5 fails to hold, and the norm of the residual vector $\tilde{r}^{(k+1)}$ at the end of the cycle GMRES($A^{*}$, $m$, $r^{(k)}$) is no longer equal to the norm of the vector $r^{(k+1)}$ at the end of GMRES($A$, $m$, $r^{(k)}$).
[Figure 2.2: Cycle-convergence of GMRES(5) applied to a 100-by-100 diagonalizable (nonnormal) matrix. Left panel: residual curve; right panel: rate-of-convergence curve; horizontal axes: GMRES(m) cycle number.]

Moreover, since the eigenvectors of $A$ can be significantly changed by transpose-conjugation, as (2.20) suggests, the matrices $A$ and $A^{*}$ can have almost nothing in common, so that the norms of $\tilde{r}^{(k+1)}$ and $r^{(k+1)}$ are possibly far from being equal. This creates an opportunity to break the sublinear convergence of GMRES(m), provided that the subspace $A \mathcal{K}_m(A, r^{(k)})$ results in a better approximation (2.3) of the vector $r^{(k)}$ than the subspace $A^{*} \mathcal{K}_m(A^{*}, r^{(k)})$.

It is natural to expect that the convergence of the restarted GMRES for almost normal matrices will be almost sublinear. We quantify this statement in the following theorem.

Theorem 2.10 Let $\{r^{(k)}\}_{k=0}^{q}$ be a sequence of nonzero residual vectors produced by GMRES(m) applied to system (2.1) with a nonsingular diagonalizable matrix $A \in \mathbb{C}^{n \times n}$ as in (2.20), $1 \le m \le n-1$. Then

$\frac{\|r^{(k+1)}\|}{\|r^{(k)}\|} \ge \alpha \, \frac{\|r^{(k)}\|}{\|r^{(k-1)}\|} - \frac{\beta_k}{\|r^{(k)}\|}, \quad k = 1, \ldots, q-1$,  (2.21)

where $\alpha = \sigma_{\min}^2(V)$, $\beta_k = \|p_k(A)(I - VV^{*}) r^{(k)}\|$, $p_k(z)$ is the polynomial constructed at the cycle GMRES($A$, $m$, $r^{(k)}$), and $q$ is the total number of GMRES(m) cycles. We note that $0 < \alpha \le 1$, and $\alpha \to 1$, $\beta_k \to 0$ as $V^{*}V \to I$.

Proof: Let us consider the norm of the residual vector $\tilde{r}^{(k+1)}$ at the end of the cycle GMRES($A^{*}$, $m$, $r^{(k)}$). We have

$\|\tilde{r}^{(k+1)}\| = \min_{p \in \mathcal{P}_m} \|p(A^{*}) r^{(k)}\| \le \|\bar{p}(A^{*}) r^{(k)}\|$,

where $p(z) \in \mathcal{P}_m$ is any polynomial of degree at most $m$ such that $p(0) = 1$. Then, using (2.20),

$\|\tilde{r}^{(k+1)}\| \le \|\bar{p}(A^{*}) r^{(k)}\| = \|V^{-*} \bar{p}(\bar{\Lambda}) V^{*} r^{(k)}\| = \|V^{-*} \bar{p}(\bar{\Lambda}) V^{-1} (VV^{*}) r^{(k)}\|$
$= \|V^{-*} \bar{p}(\bar{\Lambda}) V^{-1} (I - (I - VV^{*})) r^{(k)}\|$
$= \|V^{-*} \bar{p}(\bar{\Lambda}) \left( V^{-1} r^{(k)} - V^{-1}(I - VV^{*}) r^{(k)} \right)\|$
$\le \|V^{-*}\| \, \|\bar{p}(\bar{\Lambda}) \left( V^{-1} r^{(k)} - V^{-1}(I - VV^{*}) r^{(k)} \right)\|$.

We note that, since $\Lambda$ is diagonal, $\|\bar{p}(\bar{\Lambda}) y\| = \|p(\Lambda) y\|$ for any vector $y$, because the corresponding diagonal entries of $\bar{p}(\bar{\Lambda})$ and $p(\Lambda)$ are complex conjugates of each other and hence have equal moduli.
Thus,

$\|\tilde{r}^{(k+1)}\| \le \|V^{-*}\| \, \|p(\Lambda) \left( V^{-1} r^{(k)} - V^{-1}(I - VV^{*}) r^{(k)} \right)\|$
$= \|V^{-*}\| \, \|(V^{-1} V) p(\Lambda) \left( V^{-1} r^{(k)} - V^{-1}(I - VV^{*}) r^{(k)} \right)\|$
$\le \|V^{-*}\| \, \|V^{-1}\| \, \|V p(\Lambda) V^{-1} \left( r^{(k)} - (I - VV^{*}) r^{(k)} \right)\|$
$= \frac{1}{\sigma_{\min}^2(V)} \, \|p(A) r^{(k)} - p(A)(I - VV^{*}) r^{(k)}\|$
$\le \frac{1}{\sigma_{\min}^2(V)} \left( \|p(A) r^{(k)}\| + \|p(A)(I - VV^{*}) r^{(k)}\| \right)$,

where $\sigma_{\min}$ is the smallest singular value of $V$ (so that $\|V^{-*}\| = \|V^{-1}\| = 1/\sigma_{\min}(V)$). Since the last inequality holds for any polynomial $p(z) \in \mathcal{P}_m$, it also holds for $p(z) = p_k(z)$, the polynomial constructed at the cycle GMRES($A$, $m$, $r^{(k)}$). Hence,

$\|\tilde{r}^{(k+1)}\| \le \frac{1}{\sigma_{\min}^2(V)} \left( \|r^{(k+1)}\| + \|p_k(A)(I - VV^{*}) r^{(k)}\| \right)$.

Setting $\alpha = \sigma_{\min}^2(V)$ and $\beta_k = \|p_k(A)(I - VV^{*}) r^{(k)}\|$, and observing that $\alpha \to 1$ and $\beta_k \to 0$ as $V^{*}V \to I$, from (2.15) we obtain (2.21). □

2.2 Any admissible cycle-convergence behavior is possible for the restarted GMRES at its initial cycles

In the previous section, we characterized the cycle-convergence of the restarted GMRES applied to a system of linear equations (2.1) with a normal coefficient matrix. Now we turn our attention to the general case. The main result of the current section is stated as the following theorem.

Theorem 2.11 Let us be given an integer $n > 0$, a restart parameter $m$ ($0 < m < n$), and a positive sequence $\{f(k)\}_{k=0}^{q}$, such that $f(0) > f(1) > \cdots > f(s) > 0$ and $f(s) = f(s+1) = \cdots = f(q)$, where $0 < q < n/m$, $0 \le s \le q$. There exists an $n$-by-$n$ matrix $A$ and a vector $r^{(0)}$ with $\|r^{(0)}\| = f(0)$, such that $\|r^{(k)}\| = f(k)$, $k = 1, \ldots, q$, where $r^{(k)}$ is the residual at cycle $k$ of restarted GMRES with restart parameter $m$ applied to the linear system $Ax = b$, with initial residual $r^{(0)} = b - A x^{(0)}$. Moreover, the matrix $A$ can be chosen to have any desired (nonzero) eigenvalues.

The full GMRES has a nonincreasing convergence, $f(i) \ge f(i+1)$ for any $i \ge 0$, and it computes the exact solution in at most $n$ steps ($f(n) = 0$). We note that the assumptions on $\{f(k)\}_{k=0}^{n-1}$ in Theorem 2.2 do not cover the class of convergence sequences corresponding to the convergence to the exact solution before step $n$.
One can see, however, that these assumptions are sufficient to conclude that the theorem holds in this case as well. In this sense it is remarkable that Greenbaum, Ptak, and Strakos are able to prove that any admissible convergence behavior is possible for the full GMRES at its $n$ steps. At the same time, we would like to note that the cycle-convergence of the restarted GMRES can have only two admissible scenarios: either $f(i) > f(i+1)$ for any $i$, in other words, the cycle-convergence is (strictly) decreasing; or there exists $s$ such that $f(i) > f(i+1)$ for any $i < s$ and $f(i) = f(s)$ for any $i \ge s$, in other words, if the restarted GMRES stagnates at cycle $s+1$, it stagnates forever. Thus the assumptions on $\{f(k)\}_{k=0}^{q}$ in Theorem 2.11 reflect any admissible cycle-convergence behavior of restarted GMRES at the first $q$ cycles, except for the case where convergence to the exact solution happens within these $q$ cycles. It turns out that the assumptions are sufficient to guarantee that Theorem 2.11 also holds in the above-mentioned case of early convergence. In Subsection 2.2.6, we point out how exactly the assumptions of Theorem 2.2 and Theorem 2.11 allow us to conclude that any admissible convergence behavior is possible for the full and restarted GMRES (at its $q$ initial cycles).

As mentioned above, the maximum number of iterations of the full GMRES is at most $n$, and the method delivers the exact solution in a finite number of steps. The restarted GMRES, however, may never provide the exact solution. It (hopefully) decreases the residual norm at each cycle, that is, provides a more and more accurate approximation to the exact solution. With $n^2$ parameters in $A$ and $n$ parameters in $b$, we are not able to control the convergence for an infinite number of cycles. For this reason, we consider only the first $q < n/m$ GMRES(m) cycles. We note that, in practice, $n \gg m$, so $q$ is relatively large. The rest of this section concerns the proof of Theorem 2.11.
The proof we provide is constructive and directly inspired by the article of Greenbaum, Ptak, and Strakos [34]. Although Greenbaum, Ptak, and Strakos laid the path, there are several specific difficulties ahead in the analysis of the restarted GMRES.

Let $n$ be the matrix order and $m$ a restart parameter ($m < n$), let $\Lambda = \{\lambda_1, \lambda_2, \ldots, \lambda_n\} \subset \mathbb{C} \setminus \{0\}$ be a set of $n$ nonzero complex numbers, and let $\{f(k)\}_{k=0}^{q}$ be a positive sequence, such that $f(0) > f(1) > \cdots > f(s) > 0$ and $f(s) = f(s+1) = \cdots = f(q)$, where $0 < q < n/m$, $0 \le s \le q$. In this section we construct a matrix $A \in \mathbb{C}^{n \times n}$ and an initial residual vector $r^{(0)} = b - A x^{(0)} \in \mathbb{C}^n$ such that GMRES(m) applied to system (2.1) with the initial approximate solution $x^{(0)}$ produces a sequence $\{x^{(k)}\}_{k=0}^{q}$ of approximate solutions with corresponding residual vectors $\{r^{(k)}\}_{k=0}^{q}$ having the prescribed norms: $\|r^{(k)}\| = f(k)$. Moreover, the spectrum of $A$ is $\Lambda$.

For clarity, we first restrict our attention to the case of a strictly decreasing cycle-convergence and, in Section 2.2.2, prove Theorem 2.11 under the assumption that $f(0) > f(1) > \cdots > f(q) > 0$ (i.e., we assume that $s = q$). Next, in Section 2.2.3, we complete the proof of Theorem 2.11 by handling the (remaining) case of stagnation, i.e., $f(0) > f(1) > \cdots > f(s) > 0$ and $f(s) = f(s+1) = \cdots = f(q)$, $0 \le s < q$.

2.2.1 Outline of the proof of Theorem 2.11

The general approach described here is similar to the approach of Greenbaum, Ptak, and Strakos [34]: we fix an initial residual vector, construct an appropriate basis of $\mathbb{C}^n$, and use this basis to define a linear operator $\mathcal{A}$. This operator is represented by the matrix $A$ in the canonical basis. It has the prescribed spectrum and provides the desired cycle-convergence at the first $q$ cycles of GMRES(m). However, the presence of restarts somewhat complicates the construction: the choice of the basis vectors, as well as the structure of the resulting operator $\mathcal{A}$, becomes less transparent.
Below we briefly describe our three-step construction for the case of a strictly decreasing cycle-convergence and then suggest its easy modification to prove the general case, which includes stagnation.

At the first step we construct $q$ sets of vectors $W_m^{(k)} = \{w_1^{(k)}, \ldots, w_m^{(k)}\}$, $k = 1, \ldots, q$; each set $W_m^{(k)}$ is an orthonormal basis of the Krylov residual subspace $A \mathcal{K}_m(A, r^{(k-1)})$ generated at the $k$th GMRES(m) cycle, such that

$\operatorname{span} W_j^{(k)} = A \mathcal{K}_j(A, r^{(k-1)}), \quad j = 1, \ldots, m$.  (2.22)

The orthonormal bases $W_m^{(k)}$ need to be chosen so as to generate residual vectors $r^{(k)}$ with the prescribed (strictly decreasing) norms $f(k)$ at the end of each cycle, subject to the additional requirement that the set of $mq + 1$ ($\le n$) vectors

$S = \{r^{(0)}, W_{m-1}^{(1)}, r^{(1)}, W_{m-1}^{(2)}, \ldots, r^{(q-1)}, W_{m-1}^{(q)}, r^{(q)}\}$  (2.23)

is linearly independent.

Once we have the set $S$, we complete it to a basis of $\mathbb{C}^n$. If the number of vectors in $S$ is less than $n$, a basis $\hat{S}$ of $\mathbb{C}^n$ is obtained by completing $S$ with a set $\tilde{S}$ of $n - mq - 1$ vectors, i.e., $\hat{S} = \{S, \tilde{S}\}$,

$\mathbb{C}^n = \operatorname{span} \hat{S} = \operatorname{span}\{r^{(0)}, W_{m-1}^{(1)}\} \oplus \cdots \oplus \operatorname{span}\{r^{(q-1)}, W_{m-1}^{(q)}\} \oplus \operatorname{span}\{r^{(q)}, \tilde{S}\}$.

The latter translates, in terms of Krylov subspaces, into

$\mathbb{C}^n = \operatorname{span} \hat{S} = \mathcal{K}_m(A, r^{(0)}) \oplus \cdots \oplus \mathcal{K}_m(A, r^{(q-1)}) \oplus \operatorname{span}\{r^{(q)}, \tilde{S}\}$.

At the second step we define a linear operator $\mathcal{A} : \mathbb{C}^n \to \mathbb{C}^n$ with spectrum $\Lambda$ which generates the Krylov residual subspaces in (2.22) at each GMRES(m) cycle, by its action on the basis vectors $\hat{S}$, such that the desired matrix $A$ is the representation of the operator $\mathcal{A}$ in the canonical basis. The third step accomplishes the construction by a similarity transformation.

In the following subsection we show that this three-step approach indeed allows us to prove Theorem 2.11 in the case of a strictly decreasing positive sequence $\{f(k)\}_{k=0}^{q}$. In order to deal with the particular case of stagnation, i.e., $f(0) > f(1) > \cdots > f(s) > 0$ and $f(s) = f(s+1) = \cdots = f(q)$, we use the same framework but set $q = s+1$ and redefine the vector $r^{(q)}$ ($r^{(q)}$ is the last vector in (2.23)). More details are provided in Subsection 2.2.3.
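The residual-prescription mechanism of the first step can be sketched in a few lines. In the construction below (our own illustration, with two simplifications: real arithmetic, and the auxiliary unit vector $y$ is only kept orthogonal to the current residual rather than to all previously built vectors), the residual is reduced once per cycle by projecting onto a single unit vector chosen at the angle $\psi_k$ to $r^{(k-1)}$ with $\cos \psi_k = \sqrt{f(k-1)^2 - f(k)^2}/f(k-1)$; this is exactly the device formalized as (2.26)-(2.28) in the next subsection.

```python
import numpy as np

# Prescribing GMRES(m) residual norms: reduce the residual only at the
# last inner step, by a unit vector w at angle psi_k to r^(k-1).

rng = np.random.default_rng(3)
n = 12
f = [4.0, 2.5, 1.2, 0.7]            # prescribed strictly decreasing norms

r = rng.standard_normal(n)
r *= f[0] / np.linalg.norm(r)       # ||r^(0)|| = f(0)

norms = [np.linalg.norm(r)]
for k in range(1, len(f)):
    # unit vector y orthogonal to the current residual
    y = rng.standard_normal(n)
    y -= (y @ r) / (r @ r) * r
    y /= np.linalg.norm(y)
    c = np.sqrt(f[k - 1]**2 - f[k]**2) / f[k - 1]   # cos(psi_k) in (0, 1)
    s = np.sqrt(1.0 - c**2)                         # sin(psi_k)
    w = (r / f[k - 1]) * c + y * s                  # unit vector w_m^(k)
    r = r - (r @ w) * w                             # one-shot reduction
    norms.append(np.linalg.norm(r))

assert np.allclose(norms, f)
print(norms)
```

A short calculation confirms why this works: with $u = r^{(k-1)}/f(k-1)$ and $w = u \cos\psi_k + y \sin\psi_k$, the projection leaves $f(k-1)^2 \sin^2\psi_k = f(k-1)^2 - (f(k-1)^2 - f(k)^2) = f(k)^2$, so $\|r^{(k)}\| = f(k)$ exactly.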
2.2.2 Proof of Theorem 2.11 for the case of a strictly decreasing cycle-convergence

Throughout this subsection we let the positive sequence $\{f(k)\}_{k=0}^{q}$ be strictly decreasing. We also assume here that $q = \max\{z \in \mathbb{Z} : z < n/m\}$. This means that for the given $n$ and $m$ we perform our construction along the largest number of initial cycles for which we are able to determine $A$ (having a prescribed spectrum) and $r^{(0)}$ which provide the desired cycle-convergence. Although our proof is formally valid for any $0 < q < n/m$, the assumption emphasizes the extent to which we can take control over the process. We note that any case with $q < \max\{z \in \mathbb{Z} : z < n/m\}$ can be extended to the one assumed above by properly defining a number of additional elements in $\{f(k)\}_{k=0}^{q}$.

Step 1: Construction of a sequence of Krylov subspaces which provide the prescribed cycle-convergence

At the $k$th GMRES(m) cycle, the residual vector $r^{(k)}$ satisfies minimality condition (2.3). We assume that each set $W_m^{(k)}$ is an orthonormal basis of the corresponding Krylov residual subspace $A \mathcal{K}_m(A, r^{(k-1)})$; therefore condition (2.3) implies

$r^{(k)} = r^{(k-1)} - \sum_{j=1}^{m} (r^{(k-1)}, w_j^{(k)}) \, w_j^{(k)}, \quad k = 1, \ldots, q$.  (2.25)

At this stage, in order to simplify the forthcoming justification of the linear independence of the set $S$, we impose a stricter requirement on the residual change inside the cycle. We require that the residual vector remain constant during the first $m-1$ inner steps of GMRES and be reduced only at the last, $m$th, step. Thus, the equality in (2.25) can be written as

$r^{(k)} = r^{(k-1)} - (r^{(k-1)}, w_m^{(k)}) \, w_m^{(k)}, \quad k = 1, \ldots, q$.  (2.26)

This implies that the vectors $w_j^{(k)}$, $j = 1, \ldots, m-1$, are orthogonal to the residual vector $r^{(k-1)}$, i.e.,

$(r^{(k-1)}, w_j^{(k)}) = 0, \quad j = 1, \ldots, m-1, \quad k = 1, \ldots, q$.  (2.27)

From (2.26), using the fact that $r^{(k)} \perp w_m^{(k)}$ and the Pythagorean theorem, we obtain

$|(r^{(k-1)}, w_m^{(k)})| = \sqrt{\|r^{(k-1)}\|^2 - \|r^{(k)}\|^2}$.

By defining the (acute) angles $\psi_k = \angle(r^{(k-1)}, w_m^{(k)})$ and the corresponding cosines $\cos \psi_k = \frac{|(r^{(k-1)}, w_m^{(k)})|}{\|r^{(k-1)}\|}$, we can equivalently rewrite the identity above in the following form:
$\cos \psi_k = \frac{\sqrt{f(k-1)^2 - f(k)^2}}{f(k-1)} \in (0, 1), \quad k = 1, \ldots, q$,  (2.28)

where $f(k-1)$ and $f(k)$ are the prescribed values for the norms of the residual vectors $r^{(k-1)}$ and $r^{(k)}$, respectively. Thus, one way to ensure the desired cycle-convergence at cycle $k$ of GMRES(m) is to choose the unit vectors $w_m^{(k)}$ such that (2.26)-(2.28) hold.

In the following lemma, we show constructively that the described approach (2.26)-(2.28) leads to an appropriate set $S$.

Lemma 2.12 Given a strictly decreasing positive sequence $\{f(k)\}_{k=0}^{q}$ and an initial vector $r^{(0)}$, $\|r^{(0)}\| = f(0)$, there exist vectors $r^{(k)}$, $\|r^{(k)}\| = f(k)$, and orthonormal sets $W_m^{(k)}$ such that (2.26), (2.27), and (2.28) hold, and the set $S$ in (2.23) is linearly independent, $k = 1, \ldots, q$.

Proof: The proof is by induction. Let $k = 1$. Given the initial vector $r^{(0)}$, $\|r^{(0)}\| = f(0)$, we pick $W_{m-1}^{(1)} = \{w_1^{(1)}, \ldots, w_{m-1}^{(1)}\}$ as an orthonormal set in $(r^{(0)})^{\perp}$ in order to satisfy equalities (2.27). The set $\{r^{(0)}, W_{m-1}^{(1)}\}$ is linearly independent. In order to choose the unit vector $w_m^{(1)}$ orthogonal to the previously constructed vectors $W_{m-1}^{(1)}$ and satisfying (2.28), we introduce a unit vector $y^{(1)} \in \{r^{(0)}, W_{m-1}^{(1)}\}^{\perp}$, so that

$w_m^{(1)} = \frac{r^{(0)}}{f(0)} \cos \psi_1 + y^{(1)} \sin \psi_1$.

We find the vector $r^{(1)}$ from (2.26). Equality (2.28) guarantees that $\|r^{(1)}\| = f(1)$, as desired. Finally, we append the constructed vector $r^{(1)}$ to $\{r^{(0)}, W_{m-1}^{(1)}\}$ and get the set $\{r^{(0)}, W_{m-1}^{(1)}, r^{(1)}\}$, which is linearly independent, since, by construction, $r^{(1)}$ is not in $\operatorname{span}\{r^{(0)}, W_{m-1}^{(1)}\}$.

The induction assumption is that we have already constructed $k-1$ vectors $r^{(1)}, \ldots, r^{(k-1)}$ with the prescribed norms $f(1), \ldots, f(k-1)$ and orthonormal sets $W_m^{(1)}, \ldots, W_m^{(k-1)}$, such that equalities (2.26), (2.27), and (2.28) hold, and the set

$\{r^{(0)}, W_{m-1}^{(1)}, \ldots, r^{(k-2)}, W_{m-1}^{(k-1)}, r^{(k-1)}\}$  (2.29)

is linearly independent. We want to show that we can construct the next vector $r^{(k)}$, $\|r^{(k)}\| = f(k)$, and the orthonormal set $W_m^{(k)}$, satisfying (2.26), (2.27), and (2.28), such that

$\{r^{(0)}, W_{m-1}^{(1)}, \ldots, r^{(k-2)}, W_{m-1}^{(k-1)}, r^{(k-1)}, W_{m-1}^{(k)}, r^{(k)}\}$  (2.30)

is linearly independent, $k \le q$. We start by constructing the orthonormal vectors $W_{m-1}^{(k)} = \{w_1^{(k)}, \ldots, w_{m-1}^{(k)}\}$ satisfying (2.27), with the additional requirement that the set $W_{m-1}^{(k)}$ is not
We start by constructing orthonormal vectors W^li = {w[k\ ..., (k) satisfying (2.27), with the additional requirement that the set is not 31 in the span of the previously constructed vectors given in set (2.29). From these considerations, we choose W,^ as an orthonormal set in the orthogonal complement of (2.29), i.e., ,r (fc-2) yV(fc-1) i vm-1 ' S i j = l,...,m- 1. Appending to the set (2.29) gives a linearly independent set. To finish the proof, we need to construct the vector Wm\ satisfying (2.28) and orthogonal to For this reason we introduce a unit vector y^k\ y (k) e (r(0) W(1), r(fc-2) r(fe_1) c l' y vm 1 ) i vm1 > ' i Vm\S > (fc-1) >(*) so that, w (*) JL_____________ f(k -1) cos^fc + ?/(fc)sin^fc, where cos ipk = 1(r<|*r We define the vector with (2.26). Equality (2.28) guarantees ||r^|| = f(k). Set (2.30) is linearly independent, since, by construction, the vector r is not in span{r^\ .... r^fc_2\ W^lj}. Step 2: Definition of a linear operator with any prescribed spectrum So far we have shown that, given a strictly decreasing positive sequence {/mu an<^ an i^ial residual vector r^\ ||r^|| = /(0), it is possible to con- struct vectors r^k\ ||r^|| = f(k), and orthonormal vectors Wm\ k = 1 satisfying equalities (2.26), (2.27) and (2.28), such that the set S of mq + 1 vectors in (2.23) is linearly independent. In order to define a (representation of) unique linear operator, we need to have a valid basis of Cn at hand. Thus, we expand the set S by linearly 32 independent vectors (g-i) v\/?) Art (2.31) so that S is a basis of Cn. Before we define a linear operator A, let us consider the set A {Al5 A2,.. , A} of nonzero numbers in the complex plane that defines the spectrum of A. 
We split $\Lambda$ into $q+1$ disjoint subsets,

$\Lambda = \{\Lambda_1, \Lambda_2, \ldots, \Lambda_q, \Lambda_{q+1}\}$,  (2.32)

such that each $\Lambda_k$, $k = 1, \ldots, q$, contains $m$ elements of $\Lambda$, and the remaining $n - mq$ elements are included in $\Lambda_{q+1}$. For each set $\Lambda_k$, $k = 1, \ldots, q$, we define a monic polynomial $p_k(x)$, such that the roots of this polynomial are exactly the elements of the corresponding $\Lambda_k$:

$p_k(x) = x^m - \sum_{j=0}^{m-1} \alpha_j^{(k)} x^j, \quad k = 1, \ldots, q$,  (2.33)

with $\alpha_j^{(k)}$ being the coefficients of the respective polynomials, $\alpha_0^{(k)} \ne 0$. Each polynomial $p_k(x)$ in (2.33) can be thought of as the characteristic polynomial of an $m$-by-$m$ matrix with spectrum $\Lambda_k$. Let us also introduce an arbitrary $(t+1)$-by-$(t+1)$ matrix $C$ with the spectrum $\Lambda_{q+1}$:

$C = (\beta_{ij}), \quad \Lambda(C) = \Lambda_{q+1}, \quad i, j = 1, \ldots, t+1 = n - mq$.  (2.34)

We define the operator $\mathcal{A} : \mathbb{C}^n \to \mathbb{C}^n$ as follows:

$\mathcal{A} r^{(k-1)} = w_1^{(k)}$, $\quad \mathcal{A} w_1^{(k)} = w_2^{(k)}$, $\; \ldots, \; \mathcal{A} w_{m-2}^{(k)} = w_{m-1}^{(k)}$,
$\mathcal{A} w_{m-1}^{(k)} = -\alpha_0^{(k)} r^{(k)} + \alpha_0^{(k)} r^{(k-1)} + \alpha_1^{(k)} w_1^{(k)} + \cdots + \alpha_{m-1}^{(k)} w_{m-1}^{(k)}$, $\quad k = 1, \ldots, q$;  (2.35)
$\mathcal{A} r^{(q)} = \beta_{11} r^{(q)} + \beta_{21} s_1 + \cdots + \beta_{t+1,1} s_t$,
$\mathcal{A} s_i = \beta_{1,i+1} r^{(q)} + \beta_{2,i+1} s_1 + \cdots + \beta_{t+1,i+1} s_t, \quad i = 1, \ldots, t$,

where the $\alpha_j^{(k)}$'s are the coefficients of polynomials (2.33) and the $\beta_{ij}$'s are the elements of the matrix $C$ in (2.34).

The following lemma shows that, given vectors $r^{(k)}$ and orthonormal sets $W_m^{(k)}$ constructed according to Lemma 2.12, the linear operator $\mathcal{A}$, defined by (2.35) and represented by a matrix $A$ in the canonical basis, generates the desired Krylov residual subspaces given in (2.22); and the spectrum of $A$ can be chosen arbitrarily.

Lemma 2.13 Let the initial residual vector $r^{(0)}$, $\|r^{(0)}\| = f(0)$, as well as the residual vectors $r^{(k)}$ and orthonormal sets $W_m^{(k)}$, be constructed according to Lemma 2.12. Let $\hat{S}$ be the basis of $\mathbb{C}^n$ as defined by (2.31), and let $\Lambda$ be an arbitrary set of $n$ nonzero complex numbers. Then the linear operator $\mathcal{A}$ defined according to (2.32)-(2.35) generates the Krylov residual subspaces given in (2.22), where the matrix $A$ is the representation of $\mathcal{A}$ in the canonical basis. Moreover, the spectrum of $A$ is $\Lambda$.
Proof: From definition (2.35) of the linear operator $\mathcal{A}$, one can notice that

$\mathcal{A} r^{(k-1)} = w_1^{(k)}$, $\quad \mathcal{A}^2 r^{(k-1)} = w_2^{(k)}$, $\; \ldots, \; \mathcal{A}^{m-1} r^{(k-1)} = w_{m-1}^{(k)}$,
$\mathcal{A}^m r^{(k-1)} = \alpha_0^{(k)} (r^{(k-1)} - r^{(k)}) + \alpha_1^{(k)} w_1^{(k)} + \cdots + \alpha_{m-1}^{(k)} w_{m-1}^{(k)}$, $\quad k = 1, \ldots, q$.

Since, by construction in Lemma 2.12, i.e., equality (2.26), $0 \ne \alpha_0^{(k)} (r^{(k-1)} - r^{(k)}) \in \operatorname{span}\{w_m^{(k)}\}$, the above relations immediately imply that for each $k = 1, \ldots, q$,

$\operatorname{span}\{\mathcal{A} r^{(k-1)}, \ldots, \mathcal{A}^j r^{(k-1)}\} = \operatorname{span}\{w_1^{(k)}, \ldots, w_j^{(k)}\}, \quad j = 1, \ldots, m$.

Thus, given the representation $A$ of the linear operator $\mathcal{A}$ in the canonical basis, we have proved that $A$ generates the Krylov residual subspaces given in (2.22).

To prove that an arbitrarily chosen set $\Lambda$ can be the spectrum of $A$, let us consider the matrix $[\mathcal{A}]_{\hat{S}}$ of the operator $\mathcal{A}$ in the basis $\hat{S}$ (see (2.31) and (2.35)). It is block lower triangular: the $k$th diagonal block, $k = 1, \ldots, q$, is the $m$-by-$m$ companion matrix

$\begin{pmatrix} 0 & 0 & \cdots & 0 & \alpha_0^{(k)} \\ 1 & 0 & \cdots & 0 & \alpha_1^{(k)} \\ 0 & 1 & \cdots & 0 & \alpha_2^{(k)} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & \alpha_{m-1}^{(k)} \end{pmatrix}$,  (2.36)

the last diagonal block is the $(t+1)$-by-$(t+1)$ matrix $C = (\beta_{ij})$ from (2.34), and the only nonzero entries below the diagonal blocks are the elements $-\alpha_0^{(k)}$, which couple the last column of the $k$th block to the first row of the $(k+1)$st block, $k = 1, \ldots, q$.

Since the matrix $[\mathcal{A}]_{\hat{S}}$ has a block lower triangular structure, the spectrum of $[\mathcal{A}]_{\hat{S}}$ is the union of the eigenvalues of all diagonal blocks. The first $q$ blocks are the companion matrices corresponding to the sets $\Lambda_k$, $k = 1, \ldots, q$, with characteristic polynomials defined in (2.33). The last block is exactly the matrix $C = (\beta_{ij})$ from (2.34) with the spectrum $\Lambda_{q+1}$. Thus, given partition (2.32) of the set $\Lambda$, we conclude that the spectrum of $\mathcal{A}$ is $\Lambda$. □

Step 3: Conclusion of the proof of Theorem 2.11 for the case of a strictly decreasing cycle-convergence

Finally, we define $A$ as the representation of the operator $\mathcal{A}$ in the canonical basis $\mathcal{E} = \{e_1, e_2, \ldots, e_n\}$,

$A = S [\mathcal{A}]_{\hat{S}} S^{-1}$,  (2.37)

where the square matrix $S$ is formed by the vectors given in (2.31) written as columns and $[\mathcal{A}]_{\hat{S}}$ is defined by (2.36). The constructed matrix $A$ provides the prescribed (strictly decreasing) norms of the residual vectors at the first $q$ GMRES(m) cycles if starting with $r^{(0)}$, and its spectrum is $\Lambda$.
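The spectrum argument behind (2.36) is easy to verify numerically: each diagonal block is a companion matrix whose characteristic polynomial has the prescribed roots, so the assembled block lower triangular matrix has exactly the prescribed eigenvalues regardless of the entries below the diagonal blocks. A small self-contained check (sizes, eigenvalue sets, and the coupling entries are our own choices):

```python
import numpy as np

def companion(roots):
    """Companion matrix (ones on the subdiagonal) whose characteristic
    polynomial has the given roots."""
    coeffs = np.poly(roots)            # monic: [1, c_1, ..., c_m]
    m = len(roots)
    C = np.zeros((m, m), dtype=complex)
    C[1:, :-1] = np.eye(m - 1)         # subdiagonal of ones
    C[:, -1] = -coeffs[1:][::-1]       # last column from the coefficients
    return C

# Two prescribed eigenvalue subsets (playing the roles of Lambda_1 and
# Lambda_{q+1}).
lam1 = np.array([1.0 + 1j, 2.0, -0.5])
lam2 = np.array([3.0, -1.0 - 2j])

# Block lower triangular assembly; the coupling block below the diagonal
# is arbitrary and does not affect the spectrum.
B = np.block([
    [companion(lam1), np.zeros((3, 2))],
    [np.ones((2, 3)), companion(lam2)],
])
got = np.sort_complex(np.linalg.eigvals(B))
want = np.sort_complex(np.concatenate([lam1, lam2]))
assert np.allclose(got, want)
print(got)
```

Note the sign convention: the companion block in (2.36) carries $(\alpha_0^{(k)}, \ldots, \alpha_{m-1}^{(k)})$ in its last column because (2.33) defines $p_k$ with a minus sign in front of the sum; `np.poly` returns the plus-sign convention, hence the negation in the helper.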
We note that the field over which the resulting matrix is defined depends heavily on the partition (2.32) of the set $\Lambda$; e.g., $A$ turns out to be (non-Hermitian) complex if a conjugate pair from $\Lambda$ is not included in the same subset $\Lambda_k$.

2.2.3 Extension to the case of stagnation

In the previous subsection, we proved Theorem 2.11 only for the case of a strictly decreasing positive sequence $\{f(k)\}_{k=0}^{q}$. Now, in order to conclude the rest of Theorem 2.11, we consider the case of stagnation: $f(0) > f(1) > \cdots > f(s) > 0$ and $f(s) = f(s+1) = \cdots = f(q)$. The latter fits well (after a minor modification) into the framework presented above. Let us set $q = s+1$ and, without loss of generality, reduce the problem to constructing a matrix $A$ with a spectrum $\Lambda$ and an initial residual vector $r^{(0)}$, $\|r^{(0)}\| = f(0)$, for which GMRES(m) produces the following sequence of residual norms: $f(0) > f(1) > \cdots > f(q-1) = f(q) > 0$. We observe that the sequence is strictly decreasing up to the last cycle $q$. Thus, by Lemma 2.12, at the initial
The resulting matrix A has the prescribed spectrum A and with the initial residual vector r(), ||r()|| = /(0), provides the desired cycle-convergence of GMRES(ra) with a stagnation starting after cycle s. This concludes the proof of Theorem 2.11. In what follows we suggest several remarks and generalizations related to the result. 2.2.4 Difference with the work of Greenbaum, Ptak, and Strakos [34] For the reader familiar with the work of Greenbaum, Ptak, and Strakos [34], it might be tempting to obtain the present result by pursuing the following scheme: fix and then consider the first restarted GMRES cycle as the initial 38 part of a full GMRES run where the convergence is prescribed for the first m it- erations (and set arbitrarily for the remaining n m iterations). Then, similarly, given the starting residual vector r^ provided by this first cycle, construct the next Krylov residual subspace, which provides the desired convergence following the scheme of Greenbaum, Ptak, and Strakos [34]. Proceed identically for the remaining cycles. This approach, however, does not guarantee the linear inde- pendence of the set S in (2.23) and, hence, one meets the problem of defining the linear operator A. These considerations have been the reason for assump- tions (2.26), (2.27) on the residual reduction inside a cycle, which have allowed us to quite easily justify the linear independence of the set S and to control the spectrum, as well. 2.2.5 Generating examples with nonzero r9+1 We note from definition (2.35) of the operator A in Subsection 2.2.2 that spanjr^, si,..., s)} is an invariant subspace of A and, hence, rWGAKw(4,r<)), where A is the representation of the operator A in the canonical basis and t n mq 1 (< m, by the assumption that q = max {z e Z : z < n/m} at the beginning of Subsection 2.2.2). This implies that at the end of the (q + l)st cycle GMRES(m) converges to the exact solution of system (2.1), i.e., r^9+1^ = 0. 
This fact might seem unnatural and undesirable, e.g., for constructing theoretical examples. The drawback, however, can be easily fixed by a slight correction of the basis S in (2.31)somewhat similarly to how we handled the stagnation case in Theorem 2.11. 39 Given residuals r^ and orthonormal sets W constructed according to Lemma 2.12, instead of considering the set S, we consider the following basis of C": S = {r(0\ ..., ,r(g 1),w\Q>,...,w'Â£>_1,r(q) + ^r(q 1J,si,... ,st}, (2.40) where 7 ^ 1,0. Here we substituted the basis vector in (2.31) by + 7r(9_1). The vector cannot be represented as a linear combination of other vectors in 5, since it contains the component r(q\ which is not represented by these vectors. Hence, S is indeed a basis of Cn. Thus we can define the operator A by its action on S: ,(<7-1) ,( (?) A
$\mathcal{A} r^{(k-1)} = w_1^{(k)}$, $\quad \mathcal{A} w_1^{(k)} = w_2^{(k)}$, $\; \ldots, \; \mathcal{A} w_{m-2}^{(k)} = w_{m-1}^{(k)}$,
$\mathcal{A} w_{m-1}^{(k)} = -\alpha_0^{(k)} r^{(k)} + \alpha_0^{(k)} r^{(k-1)} + \alpha_1^{(k)} w_1^{(k)} + \cdots + \alpha_{m-1}^{(k)} w_{m-1}^{(k)}$, $\quad k = 1, \ldots, q-1$;

$\mathcal{A} r^{(q-1)} = w_1^{(q)}$, $\quad \mathcal{A} w_1^{(q)} = w_2^{(q)}$, $\; \ldots, \; \mathcal{A} w_{m-2}^{(q)} = w_{m-1}^{(q)}$,  (2.41)
$\mathcal{A} w_{m-1}^{(q)} = -\frac{\alpha_0^{(q)}}{1+\gamma} \left( r^{(q)} + \gamma r^{(q-1)} \right) + \alpha_0^{(q)} r^{(q-1)} + \alpha_1^{(q)} w_1^{(q)} + \cdots + \alpha_{m-1}^{(q)} w_{m-1}^{(q)}$;

$\mathcal{A} \left( r^{(q)} + \gamma r^{(q-1)} \right) = \beta_{11} \left( r^{(q)} + \gamma r^{(q-1)} \right) + \beta_{21} s_1 + \cdots + \beta_{t+1,1} s_t$,
$\mathcal{A} s_i = \beta_{1,i+1} \left( r^{(q)} + \gamma r^{(q-1)} \right) + \beta_{2,i+1} s_1 + \cdots + \beta_{t+1,i+1} s_t, \quad i = 1, \ldots, t$,

where the $\alpha_j^{(k)}$'s are the coefficients of the corresponding characteristic polynomials (2.33) and the $\beta_{ij}$'s are the elements of the matrix $C$ in (2.34).

The fact that the operator $\mathcal{A}$ produces the correct Krylov residual subspace at the cycle $q$, i.e., $\operatorname{span}\{\mathcal{A} r^{(q-1)}, \ldots, \mathcal{A}^j r^{(q-1)}\} = \operatorname{span}\{w_1^{(q)}, \ldots, w_j^{(q)}\}$, $j = 1, \ldots, m$, can be observed from the following equalities:

$\mathcal{A}^m r^{(q-1)} = \mathcal{A} w_{m-1}^{(q)} = -\frac{\alpha_0^{(q)}}{1+\gamma} \left( r^{(q)} + \gamma r^{(q-1)} \right) + \alpha_0^{(q)} r^{(q-1)} + \alpha_1^{(q)} w_1^{(q)} + \cdots + \alpha_{m-1}^{(q)} w_{m-1}^{(q)}$
$= \frac{\alpha_0^{(q)}}{1+\gamma} \left( r^{(q-1)} - r^{(q)} \right) + \alpha_1^{(q)} w_1^{(q)} + \cdots + \alpha_{m-1}^{(q)} w_{m-1}^{(q)}$,

where, by (2.26), $0 \ne r^{(q-1)} - r^{(q)} \in \operatorname{span}\{w_m^{(q)}\}$.

The matrix $[\mathcal{A}]_{\hat{S}'}$ of the operator $\mathcal{A}$, defined by (2.41), in the basis $\hat{S}'$ is identical to (2.36), with the only change being the subdiagonal element $-\alpha_0^{(q)}$ replaced by $-\frac{\alpha_0^{(q)}}{1+\gamma}$, $\gamma \ne -1, 0$. Hence, $\mathcal{A}$ has the desired spectrum $\Lambda$. The representation $A$ of the operator $\mathcal{A}$ in the canonical basis is then determined by the similarity transformation in (2.37), with the matrix $S$ formed by the vectors from $\hat{S}'$ in (2.40) written as columns.

Finally, to see that the residual vector $r^{(q+1)}$ is generally nonzero with the new definition of the operator $\mathcal{A}$, we notice from (2.41) that now $\operatorname{span}\{r^{(q)} + \gamma r^{(q-1)}, s_1, \ldots, s_t\}$ is an invariant subspace of $\mathcal{A}$ and, hence,

$r^{(q)} + \gamma r^{(q-1)} \in A \mathcal{K}_{t+1} \left( A, r^{(q)} + \gamma r^{(q-1)} \right), \quad \gamma \ne -1, 0$,

or

$r^{(q)} \in A \mathcal{K}_{t+1}(A, r^{(q)}) + \mathcal{K}_{t+2}(A, r^{(q-1)})$,  (2.42)

where $t+1 < m$ by the assumption that $q = \max\{z \in \mathbb{Z} : z < n/m\}$.
Due to the fact that r^(q+1) = 0 if and only if r^(q) ∈ A K_m(A, r^(q)), it suffices to show that the component of the vector from K_(t+2)(A, r^(q-1)) in representation (2.42) does not generally belong to A K_m(A, r^(q)). It can be guaranteed that the term from K_(t+2)(A, r^(q-1)) in (2.42) contains a nonzero component in the direction r^(0) unless r^(0) is chosen from a specific subspace of C^n, i.e., expressing r^(q-1) in terms of r^(0) and the vectors w_m^(k) by (2.26),

r^(0) ∉ X,   X = span{w_m^(1), ..., w_m^(q-1)} + A K_m(A, r^(q)),

where dim X ≤ q + m - 1 < n/m + m - 1 < n, provided that 1 < m < n.

2.2.6 Any admissible convergence behavior is possible for full and restarted GMRES (at its q initial cycles)

As we pointed out at the beginning of the current section, the convergence behavior of the full GMRES in Theorem 2.2 is restricted to the class of convergence sequences which allow convergence to the exact solution only at step n, i.e., f(0) > f(1) > ... > f(n-1) > 0 (f(n) = 0). Similarly, the cycle-convergence behavior of restarted GMRES in Theorem 2.11 is restricted to cycle-convergence sequences which exclude the possibility of convergence to the exact solution within the initial q cycles, i.e., f(0) > f(1) > ... > f(s) > 0 and f(s) = f(s+1) = ... = f(q). Slight modifications of the proofs of Theorem 2.2 and Theorem 2.11 are sufficient for the theorems also to hold if f(0) > f(1) > ... > f(n-1) ≥ 0 and f(0) > f(1) > ... > f(s) ≥ 0, f(s) = f(s+1) = ... = f(q), respectively.

Given an integer n > 0, assume that we want to construct an n-by-n matrix A with a prescribed spectrum Λ and an initial residual vector r^(0) (or, equivalently, a right-hand side b, since r^(0) = b after setting x^(0) to 0) such that the full GMRES applied to the corresponding system (2.1) results in the following convergence pattern: f(0) > f(1) > ... > f(s-1) > f(s) = f(s+1) = ... = f(n-1) = 0, s < n, ||r^(k)|| = f(k). The construction is straightforward. We first split the set Λ into two disjoint subsets, say, Λ = Λ_s ∪ Λ_(n-s), where Λ_s contains s elements from Λ while the remaining n - s elements are included into Λ_(n-s).
Next, by Theorem 2.2, we construct a matrix A_s ∈ C^(s×s) and a right-hand side b_s ∈ C^s (x^(0) = 0 ∈ C^s), such that the full GMRES applied to the system A_s x = b_s produces the convergence sequence f(0) > f(1) > ... > f(s-1) > 0 (f(s) = 0); moreover, the spectrum of A_s is Λ_s. Finally, we define the resulting matrix A ∈ C^(n×n) and the right-hand side vector b ∈ C^n (x^(0) = 0 ∈ C^n) as follows:

A = [ A_s    0       ]        b = [ b_s ]
    [ 0      A_(n-s) ] ,          [ 0   ] ,     (2.43)

where A_(n-s) ∈ C^((n-s)×(n-s)) is an arbitrary matrix with a spectrum Λ_(n-s). It is easy to see that the full GMRES applied to the system of equations defined by (2.43) produces the desired sequence of residual norms f(0) > f(1) > ... > f(s-1) > f(s) = f(s+1) = ... = f(n-1) = 0, ||r^(k)|| = f(k), r^(0) = b. Clearly, the matrix A in (2.43) has the prescribed spectrum Λ = Λ_s ∪ Λ_(n-s).

For the restarted GMRES, the construction of a matrix A with the spectrum Λ and a right-hand side b (x^(0) = 0) that provide the cycle-convergence sequence f(0) > f(1) > ... > f(s-1) > f(s) = f(s+1) = ... = f(q) = 0 is analogous, s ≤ q, ||r^(k)|| = f(k). Following Theorem 2.11, one constructs a matrix A_s ∈ C^(ms×ms) and a right-hand side vector b_s ∈ C^(ms), such that GMRES(m) applied to the corresponding linear system produces the cycle-convergence curve f(0) > f(1) > ... > f(s-1) > f(s) = 0. The spectrum of A_s is chosen to coincide with a subset of ms elements of Λ. The construction of the matrix A ∈ C^(n×n) and the right-hand side b ∈ C^n is then accomplished by introducing an (n-ms) × (n-ms) diagonal block with eigenvalues from Λ which are not in the spectrum of A_s, and expanding the vector b with n - ms zeros, similarly to (2.43).
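The block-diagonal embedding argument is easy to check numerically. The sketch below is our own illustration, not code from the thesis: it computes full GMRES residual norms (with x^(0) = 0) directly from the minimal-residual definition ||r_k|| = min over x in K_k(A, b) of ||b - Ax|| via dense least squares, and verifies that embedding a small system into a block-diagonal one as in (2.43) leaves the residual-norm sequence unchanged. The helper name and the random test matrices are arbitrary choices.

```python
import numpy as np

def gmres_residuals(A, b, nsteps):
    """Full GMRES residual norms with x0 = 0, computed directly from the
    minimal-residual definition: ||r_k|| = min over x in K_k(A,b) of ||b - A x||."""
    norms = [np.linalg.norm(b)]
    cols = []                      # columns A b, A^2 b, ... span A K_k(A, b)
    v = b
    for _ in range(nsteps):
        v = A @ v
        cols.append(v)
        K = np.column_stack(cols)
        y, *_ = np.linalg.lstsq(K, b, rcond=None)
        norms.append(np.linalg.norm(b - K @ y))
    return np.array(norms)

rng = np.random.default_rng(0)
s, n = 4, 7
A_s = rng.standard_normal((s, s))                # plays the role of A_s in (2.43)
b_s = rng.standard_normal(s)
A_rest = np.diag(rng.uniform(1.0, 2.0, n - s))   # arbitrary block carrying the leftover spectrum
A = np.block([[A_s, np.zeros((s, n - s))],
              [np.zeros((n - s, s)), A_rest]])
b = np.concatenate([b_s, np.zeros(n - s)])

# The embedded system reproduces the small system's residual norms exactly,
# and the residual vanishes at step s.
print(np.allclose(gmres_residuals(A, b, s), gmres_residuals(A_s, b_s, s)))  # True
```

Since A^j b = [A_s^j b_s; 0], the Krylov subspaces of the embedded system are the zero-padded Krylov subspaces of the small one, which is exactly why the two residual sequences coincide.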
2.2.7 Restarted GMRES with variable restart parameter

The result of Theorem 2.11 can be generalized to the case where the restart parameter m is not fixed, but varies over successive cycles according to an a priori prescribed parameter sequence {m_k}_(k=1)^q. The proof basically repeats the one in Subsection 2.2.2, with the difference that the constructed operator A in the corresponding basis has block lower triangular structure with varying diagonal block sizes m_k, rather than the constant size m_k = m as in (2.36).

Corollary 2.14 Let us be given an integer n > 0, a sequence {m_k}_(k=1)^q, 0 < m_k < n, and a positive sequence {f(k)}_(k=0)^q, such that f(0) > f(1) > ... > f(s) > 0 and f(s) = f(s+1) = ... = f(q), where q is defined by the condition Σ_(k=1)^q m_k < n. There exists an n-by-n matrix A and a vector r^(0) with ||r^(0)|| = f(0) such that ||r^(k)|| = f(k), k = 1, ..., q, where r^(k) is the residual at cycle k of restarted GMRES with a restart parameter varying according to the sequence {m_k}_(k=1)^q, applied to the linear system Ax = b, with initial residual r^(0) = b - A x^(0). Moreover, the matrix A can be chosen to have any desired (nonzero) eigenvalues.

2.3 Conclusions

In this chapter we have established several results which address the cycle-convergence behavior of the restarted GMRES. First, we have proved that the cycle-convergence of the method applied to a system of linear equations with a normal coefficient matrix is sublinear, and at best linear. Second, in the general case, we have shown that any admissible cycle-convergence behavior is possible for the restarted GMRES at q initial cycles, regardless of the eigenvalue distribution of the coefficient matrix. This leads to the conclusion that no estimates, which rely solely on the matrix spectrum, can be derived to characterize the cycle-convergence of restarted GMRES at the first q cycles if the method is applied to a linear system with a general nonsingular non-Hermitian matrix.
Though in practice q tends to be reasonably large (q < n/m), it remains an open question whether the above-mentioned estimates hold at cycles which follow the n/m-th GMRES(m) cycle.

3. Solution of symmetric indefinite systems with symmetric positive definite preconditioners

We consider a system of linear equations

Ax = b,   A = A* ∈ R^(n×n),   b ∈ R^n,     (3.1)

where the coefficient matrix A is nonsingular and symmetric indefinite, i.e., the spectrum of A contains both positive and negative eigenvalues. Linear systems with large, possibly sparse, symmetric indefinite coefficient matrices arise in a variety of applications. For example, in the form of saddle point problems (see [10] and references therein), such systems may result from mixed finite element discretizations of underlying differential equations of fluid and solid mechanics. In acoustics, large sparse symmetric indefinite systems may be obtained after discretizing the Helmholtz equation [69] for certain media types and boundary conditions. Sometimes the need to solve the indefinite problem (3.1) arises as an auxiliary task within other computational routines, e.g., an inner Newton step in interior point methods in linear and nonlinear optimization, see [53], or the solution of the correction equation in the Jacobi-Davidson method [64] for a symmetric eigenvalue problem.

Because of the large problem size, direct methods for solving linear systems may become infeasible, which motivates the use of iterative techniques for finding satisfactory approximations to the exact solutions. There are a number of iterative methods developed specifically to solve symmetric indefinite systems, ranging from modifications of Richardson's iteration, e.g., [51, 58, 16], to optimal Krylov subspace methods, see [33, 59].
It is known, however, that in practical problems the coefficient matrix A in (3.1) may be extremely ill-conditioned, which, along with the location of the spectrum of A on both sides of the origin, can make the straightforward application of the existing schemes inefficient due to a slow convergence rate. In order to improve the convergence, one can introduce a matrix T ∈ R^(n×n) and consider the preconditioned system

TAx = Tb.     (3.2)

If T is not symmetric positive definite (SPD), the matrix TA of the preconditioned system (3.2), in general, is not symmetric with respect to any inner product, implying that the specialized methods for solving symmetric indefinite systems are no longer applicable and need to be replaced by methods for solving nonsymmetric systems, e.g., one of the Krylov subspace methods: GMRES or GMRES(m), BiCG, BiCGstab, QMR, etc. (see, e.g., [33, 59]). Though known to be effective for a number of applications, this approach can have several disadvantages. First, in order to maintain the optimality of a Krylov subspace method, one has to allow the computational work to increase at every new iteration, which can become prohibitive for large problems. Second, the convergence behavior of methods for solving nonsymmetric systems may not rely on possibly accessible (estimated) quantities, such as, e.g., the spectrum of the coefficient matrix (see the corresponding results for GMRES [34, 76]), which makes it difficult or even impossible to estimate the computational costs a priori.

If T is chosen to be SPD (T = T* > 0), then the matrix TA of the preconditioned system (3.2) remains symmetric indefinite, however, with respect to the T^(-1)-inner product, i.e., the inner product defined by (x, y)_(T^(-1)) = (x, T^(-1) y) for any x, y ∈ R^n, where (·,·) denotes the Euclidean inner product, in which the matrix A is symmetric.
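This change of geometry is easy to see in a computation. The following sketch is our own illustration with arbitrary test matrices (a random symmetric indefinite A and a random SPD T): it checks that (TAx, y)_(T^(-1)) = (x, TAy)_(T^(-1)), even though TA itself is not a symmetric matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag([-3.0, -1.0, -0.5, 0.5, 1.0, 3.0]) @ Q.T   # symmetric indefinite
M = rng.standard_normal((n, n))
T = M @ M.T + n * np.eye(n)                                # SPD preconditioner

Tinv = np.linalg.inv(T)
ip = lambda x, y: x @ (Tinv @ y)       # (x, y)_{T^{-1}} = (x, T^{-1} y)

x = rng.standard_normal(n)
y = rng.standard_normal(n)
# Both sides reduce to x^T A y, so TA is self-adjoint in the T^{-1}-inner product.
print(np.isclose(ip(T @ A @ x, y), ip(x, T @ A @ y)))  # True
```

The identity holds because (TAx)^T T^(-1) y = x^T A y = x^T T^(-1) (TAy) whenever A and T are symmetric; no property of the spectrum is used.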
In particular, due to this symmetry preservation (though in a different geometry), system (3.2) can be solved using an optimal short-term recurrent Krylov subspace method, e.g., preconditioned MINRES [55], or PMINRES, with the convergence behavior fully described in terms of the (estimates of the) spectrum of the preconditioned matrix TA. Therefore, in the light of the discussion above, the choice of a properly defined SPD preconditioner for solving a symmetric indefinite system can be regarded as natural and favorable.

The goal of this chapter is twofold. First, we describe a hierarchy of methods, from a stationary iteration to an optimal Krylov subspace minimal residual method, which allow solving the symmetric indefinite linear system (3.1) with an SPD preconditioner T. Second, we suggest a new strategy for constructing SPD preconditioners for general symmetric indefinite systems being solved with the described methods. Although the approaches underlying the methods are mostly known, e.g., the minimization of an appropriate norm of the residual vector over a subspace, several of our observations seem to be new and are, primarily, of a theoretical interest. In particular, we determine the smallest possible Krylov subspace which can be used to properly restart the preconditioned minimal residual method applied to a symmetric indefinite linear system. For example, this leads to a scheme which is a natural analogue of the preconditioned steepest descent iteration for solving SPD systems. Independently of a particular implementation scheme, we state and prove simple convergence bounds, and gain an insight into the structure of the local subspaces which determine a new approximation at each step of the selected method. The results of this chapter will motivate the construction of trial subspaces and SPD preconditioners for symmetric eigenvalue (with a targeted interior eigenpair) and singular value problems in Chapter 4 and Chapter 5.
In Section 3.1, we present the simplest iterative scheme with stationary iteration parameters for solving a symmetric indefinite system with an SPD preconditioner. Other methods are obtained essentially by allowing the parameters to vary at each step of this stationary iteration in such a way that a preconditioner-based norm of the residual vector is minimized. In Section 3.2, we present a notion of the optimal SPD preconditioner for a symmetric indefinite system, and suggest constructing preconditioners based on an approximation of the inverse of the absolute value of the coefficient matrix (absolute value preconditioners). We show on the example of a linear system with a discrete real Helmholtz operator (shifted Laplacian) that such preconditioners can be constructed in practice. Moreover, the use of preconditioning techniques based, e.g., on multigrid (MG), can make this construction efficient.

3.1 Iterative methods for symmetric indefinite systems with SPD preconditioners

Given a (possibly nonsymmetric) matrix A ∈ R^(n×n), an initial guess x^(0) ∈ R^n, a (nonzero) iteration parameter α ∈ R, and a (possibly non-SPD) preconditioner T ∈ R^(n×n), the iteration of the form

x^(i+1) = x^(i) + α w^(i),   w^(i) = T r^(i),   r^(i) = b - A x^(i),   i = 0, 1, ...,     (3.3)

where x^(i) ∈ R^n is an approximation to the exact solution of system (3.1) at iteration i, is commonly referred to as a preconditioned stationary iteration, or Richardson's method with a stationary iteration parameter, see, e.g., [3, 33, 59]. We note that stationary iteration (3.3) can be considered as the simplest iterative method for solving linear systems, and, if properly preconditioned, can be efficient and computationally inexpensive. In general, the (asymptotic) convergence rate of method (3.3) is governed by the spectral radius of the iteration matrix M = I - αTA, where T and α (sometimes skipped after replacing αT by T) need to be chosen to make the spectral radius of M strictly less than 1, see, e.g., [3, 33].
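As a simple illustration of (3.3) in the well-behaved SPD case, the sketch below is our own example (the 1D Laplacian test matrix and the Jacobi preconditioner are arbitrary choices, not taken from the text); α is set to 2/(λ_min + λ_max), the value that minimizes the spectral radius of M = I - αTA.

```python
import numpy as np

n = 20
# 1D Laplacian: a standard SPD test matrix (our choice of example).
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
T = np.diag(1.0 / np.diag(A))          # Jacobi preconditioner: SPD
b = np.ones(n)

# The spectral radius of M = I - alpha*TA is minimized by alpha = 2/(lmin + lmax).
lam = np.sort(np.linalg.eigvals(T @ A).real)
alpha = 2.0 / (lam[0] + lam[-1])

x = np.zeros(n)
for _ in range(2000):
    r = b - A @ x                      # residual r^(i)
    x = x + alpha * (T @ r)            # step of (3.3): x^(i+1) = x^(i) + alpha*T*r^(i)

print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))   # small relative residual
```

With this α, the error contracts by roughly (κ(TA) - 1)/(κ(TA) + 1) per step, so the many iterations used here reflect the poor conditioning of the Laplacian under simple diagonal preconditioning.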
If both A and T are SPD, one can always find a sufficiently small value of the step-size parameter α > 0 to ensure that iteration (3.3) monotonically and linearly reduces the A-norm of the error. Moreover, α can be set to the value which provides an optimal convergence rate for the method, with an optimal convergence factor

ρ_opt = (κ(TA) - 1) / (κ(TA) + 1) < 1,

where the condition number κ(TA) is the ratio of the largest and the smallest eigenvalues of the preconditioned matrix TA (for more details see, e.g., [3]).

If A is symmetric indefinite and T is SPD, stationary iteration (3.3), in general, diverges for any choice of the parameter α.

Proposition 3.1 Stationary iteration (3.3) applied to linear system (3.1) with a symmetric indefinite matrix A and an SPD preconditioner T diverges for any α, unless the initial guess x^(0) is specifically chosen.

Proof: Let us consider the preconditioned residual corresponding to iteration (3.3):

T r^(i+1) = (I - αTA)^(i+1) T r^(0),   i = 0, 1, ....     (3.4)

Since T is SPD and A is symmetric indefinite, the preconditioned matrix TA is T^(-1)-symmetric and indefinite, with eigenpairs (λ_j, y_j), where the (real) eigenvalues λ_j are of both signs and the eigenvectors y_j are T^(-1)-orthonormal, j = 1, ..., n. Then the eigenpairs of the iteration matrix I - αTA are (μ_j, y_j), where μ_j = 1 - αλ_j. Let c_j be the coordinates of the preconditioned initial residual vector T r^(0) in the basis of the eigenvectors y_j. Assume that the initial guess is chosen in such a way that T r^(0) has at least two nontrivial components in the directions of eigenvectors corresponding to eigenvalues of TA of the opposite sign. Then we can always fix an eigenvalue λ_j' of the matrix TA such that sign(λ_j') = -sign(α), for any (nonzero) α, and the vector T r^(0) has a nontrivial component in the direction of y_j', i.e., c_j' ≠ 0. Thus, since the corresponding eigenvalue μ_j' = 1 - αλ_j'
of I - αTA is strictly greater than 1, using identity (3.4) and the Pythagorean theorem with respect to the T^(-1)-inner product, we obtain:

||r^(i+1)||²_T = ||T r^(i+1)||²_(T^(-1)) = ||(I - αTA)^(i+1) T r^(0)||²_(T^(-1)) = Σ_(j=1)^n (μ_j^(i+1) c_j)² ≥ (μ_j'^(i+1) c_j')² → ∞

as i → ∞. This proves the divergence of iteration (3.3) with A symmetric indefinite and T SPD for any α, unless the initial guess x^(0) is such that the vector T r^(0) has its nontrivial components only in the directions of eigenvectors corresponding to eigenvalues of the same sign.

Since iteration (3.3) is not applicable for solving symmetric indefinite systems with SPD preconditioners, we next ask what is a correct way to define a simple scheme with stationary iteration parameters, different from using method (3.3) for the normal equations, that can be applied in the described framework.

3.1.1 Stationary iteration for solving symmetric indefinite systems with SPD preconditioners

Given a symmetric indefinite matrix A and an SPD preconditioner T, we consider the iteration of the form

r^(i) = b - A x^(i),   w^(i) = T r^(i),   l^(i) = TA w^(i) - β w^(i),
x^(i+1) = x^(i) + α l^(i),   i = 0, 1, ...,     (3.5)

where α (nonzero) and β are real numbers. Scheme (3.5) can be viewed as preconditioned stationary iteration (3.3) with the search direction w^(i) replaced by the modified direction l^(i), which is a linear combination of the preconditioned residual w^(i) and the vector TA w^(i). We notice that method (3.5) is exactly stationary iteration (3.3) applied to solve the (preconditioned, T^(-1)-symmetric) system

(TA - βI) TA x = (TA - βI) Tb,     (3.6)

or, equivalently, the symmetric system (AT - βI) A x = (AT - βI) b with the preconditioner T. Alternatively, (3.6) can be viewed as an instance of a polynomially preconditioned system, see, e.g., [59]. With β = 0, method (3.5) turns into iteration (3.3) applied to the system of the normal equations (TA)² x = TATb, or, equivalently, to the system ATA x = ATb, with an SPD matrix ATA, preconditioned with T.
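A toy experiment (our own sketch, with a hand-picked diagonal A, the trivial choice T = I, and hand-picked parameters) contrasts the divergence of (3.3) on an indefinite system, in line with Proposition 3.1, with the convergence of (3.5) for a suitable α and β:

```python
import numpy as np

A = np.diag([-2.0, -1.0, 1.0, 2.0, 3.0])   # symmetric indefinite toy matrix
T = np.eye(5)                               # trivial SPD preconditioner
b = np.ones(5)

def richardson(alpha, steps):
    """Plain preconditioned stationary iteration (3.3)."""
    x = np.zeros(5)
    for _ in range(steps):
        x = x + alpha * (T @ (b - A @ x))
    return np.linalg.norm(b - A @ x)

def modified(alpha, beta, steps):
    """Iteration (3.5): search direction TA*w - beta*w instead of w = T*r."""
    x = np.zeros(5)
    for _ in range(steps):
        w = T @ (b - A @ x)
        x = x + alpha * (T @ (A @ w) - beta * w)
    return np.linalg.norm(b - A @ x)

# (3.3) diverges: for alpha > 0 the eigenvalue lambda = -2 gives |1 - alpha*lambda| > 1.
print(richardson(0.1, 50))      # residual norm grows without bound
# (3.5) with beta = 0 iterates on (TA)^2 x = TATb, whose spectrum {1, 4, 9} is positive;
# alpha = 0.2 gives a contraction factor max|1 - 0.2*lambda^2| = 0.8 per step.
print(modified(0.2, 0.0, 50))   # residual norm tends to zero
```

The same run with any other fixed α in `richardson` still diverges, since the spectrum of TA straddles the origin, while `modified` only requires α small enough for the squared (positive) spectrum.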
The optimal choice of the parameter α in this case leads to the (optimal) convergence rate with a factor

ρ = (κ((TA)²) - 1) / (κ((TA)²) + 1).

We next show that for certain choices of the parameters α and β, method (3.5) converges to the solution of system (3.1). Moreover, the (optimal) convergence rate is improved, depending on the eigenvalue distribution of the preconditioned matrix TA, compared to the above discussed approach based on solving the corresponding system of normal equations with method (3.3).

Let us assume that the spectrum of the preconditioned matrix TA, i.e., Λ(TA) = {λ_1, ..., λ_p, λ_(p+1), ..., λ_n}, is located within the union of the two intervals

I = [a, b] ∪ [c, d],     (3.7)

where a ≤ λ_1 ≤ ... ≤ λ_p ≤ b < 0 < c ≤ λ_(p+1) ≤ ... ≤ λ_n ≤ d.

Theorem 3.2 Let us consider method (3.5) applied to solve linear system (3.1) with a symmetric indefinite coefficient matrix A and an SPD preconditioner T. Let us assume that the spectrum of the matrix TA is only known to be enclosed within the pair of intervals I in (3.7). If b < β < c and α > 0, then

||r^(i+1)||_T / ||r^(i)||_T ≤ max_(λ ∈ {a, b, c, d}) |1 - αλ(λ - β)|.
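The contraction factor of Theorem 3.2 reflects the fact that (3.5) is stationary iteration (3.3) applied to the T^(-1)-symmetric system (3.6), so each step reduces the T-norm of the residual at least by the factor max |1 - αλ(λ - β)| taken over the spectrum of TA. The sketch below is our own toy verification, not code from the thesis: the matrix with a known spectrum, T = I (so the T-norm is Euclidean), and the values of α and β are arbitrary choices, with β placed inside the spectral gap (b, c).

```python
import numpy as np

rng = np.random.default_rng(2)
lam = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 4.0])   # spectrum in [a,b] U [c,d] = [-3,-1] U [1,4]
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
A = Q @ np.diag(lam) @ Q.T           # symmetric indefinite with known spectrum
T = np.eye(6)                        # identity SPD preconditioner: ||.||_T is Euclidean
b = rng.standard_normal(6)

alpha, beta = 0.1, 0.5               # beta inside the spectral gap (b, c) = (-1, 1)
rho = np.max(np.abs(1.0 - alpha * lam * (lam - beta)))   # spectral contraction factor

tnorm = lambda r: np.sqrt(r @ (T @ r))
x = np.zeros(6)
r = b.copy()
for _ in range(20):
    w = T @ r
    x = x + alpha * (T @ (A @ w) - beta * w)        # one step of (3.5)
    r_new = b - A @ x
    assert tnorm(r_new) <= rho * tnorm(r) + 1e-12   # per-step contraction holds
    r = r_new
print(rho, tnorm(r) / tnorm(b))
```

With β inside the gap, the quadratic λ(λ - β) is positive on both intervals of the spectrum, which is what makes a contracting α possible at all; the endpoint values of this quadratic on [a, b] and [c, d] yield the {a, b, c, d} maximum in the theorem.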