COMPARISON OF INTERIOR POINT METHODS TO SIMPLEX
METHODS IN APPLICATION TO RADIATION THERAPY
by
Maksym N. Volkov
Diploma of Engineer, Kharkiv National University, 1999
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Applied Mathematics
2005
This thesis for the Master of Science
degree by
Maksym N. Volkov
has been approved
by
Weldon Lodwick
Date
Volkov, Maksym N. (M.S., Applied Mathematics)
Comparison of Interior Point methods to Simplex methods in application to
Radiation Therapy
Thesis directed by Professor Weldon Lodwick
ABSTRACT
An overview of interior point methods and their importance in applications to the radiation therapy planning problem is presented. In particular, the following path-following algorithms are described in detail: Short-Step Path-Following, Long-Step Path-Following, Predictor-Corrector Path-Following, Infeasible Point Path-Following and Mehrotra's algorithms.
Several linear models with different objective functions are constructed and
solved by the interior point and simplex algorithms. The solutions obtained from
the interior point and the simplex algorithms are compared using the qualitative
and quantitative measures also defined in the study. An approach to building
the models for the radiation therapy planning problem is proposed.
This abstract accurately represents the content of the candidate's thesis. I recommend its publication.
Signed
Weldon Lodwick
DEDICATION
This study is dedicated to my parents, who are not yet tired of my 10-year-long university quest; to my fiancée, without whom I would never have made it on time; and to my project managers, who (unknowingly) had to cope with the fact that I worked on the thesis harder than on the projects. Many thanks to all of them for their patience!
ACKNOWLEDGMENT
This thesis would not have been possible without the support of CQG Inc., and the infinite help of my advisor, Prof. Weldon Lodwick, in all the ways possible, including my grammar. It is my pleasure as well to thank Kate Bachman, who provided me with numerous hints, prepared all the attenuation matrices, and helped to integrate GAMS with Matlab. This study has benefitted substantially from the positive critique and suggestions of the committee members. Special thanks to Francis Newman for his openness, support, initial mathematical models and further very fruitful discussions of the model, objectives and comparison criteria. Big thanks to Stephen Billups for numerous corrections and improvements to the text. Suggestions from Prof. Harvey Greenberg added a lot to this study, keeping it focused on mathematics and shaping it in many ways. My gratitude also goes to Harvey Greenberg for introducing me to the area of computational biology, in which I intend to work further. Theresa Ferg has supported and encouraged me through my whole quest at the University of Colorado at Denver, which started as a Certificate in Computational Biology and ended as a Master's in Applied Mathematics. My infinite gratitude goes to her as well.
CONTENTS
Figures ................................................................ x
Tables................................................................ xii
Chapter
1. Introduction........................................................ 1
1.1 Paper Outline....................................................... 1
1.2 Objectives......................................................... 2
1.3 Radiation Therapy Background....................................... 4
1.4 Optimization and Radiation Therapy................................. 9
1.5 The Model......................................................... 10
2. Interior Point Implementations for Radiation Therapy Problems ... 14
2.1 The Interior Point Method Used.................................... 14
2.1.1 LIPSOL......................................................... 14
2.1.2 PCx............................................................ 16
2.2 The Radiation Therapy Model...................................... 17
2.2.1 Example Problems Used.......................................... 20
2.2.2 Results Obtained............................................... 21
2.2.2.1 Matlab Simplex Method Results.................................. 23
2.2.2.2 GAMS Simplex Method Results ................................... 25
2.2.2.3 Interior Point Method Results.................................. 25
2.2.2.4 Results Comparison............................................. 29
2.2.3 Superpixel Reformulation........................................... 38
2.2.3.1 Superpixel Results............................................ 39
2.3 Conclusions......................................................... 41
3. Future Research and Discussions.................................... 44
3.1 Avoiding Overconstrained Systems.................................. 44
3.2 Improved Attenuation Matrix......................................... 44
3.3 Infeasibility/Sensitivity Analysis.................................. 45
3.4 Model Linearity..................................................... 47
3.5 Surprise Function................................................... 48
3.6 Performance Characteristics......................................... 49
3.6.1 Warm start.......................................................... 49
3.6.2 Statistical Solutions Analysis...................................... 50
Appendix
A. Figures with Discussion............................................ 52
A.1 Matlab Deficiency................................................... 52
A.2 Interfacing Matlab with GAMS........................................ 52
A.3 Objective Functions ................................................ 54
A.4 Additional Results Comparison....................................... 56
B. Interior Point Methods and Algorithms Overview..................... 73
B.1 Potential-Reduction Method.......................................... 78
B.2 Central Path........................................................ 80
B.3 Central Path Neighborhoods.......................................... 81
B.4 Short-Step Path-Following Algorithm................................. 82
B.5 Predictor-Corrector Algorithm....................................... 83
B.6 Long-Step Path-Following Algorithm............................... 85
B.7 Infeasible Point Path-Following Algorithms....................... 85
B.8 Mehrotra's Algorithms............................................ 88
B.9 Terminal Step.................................................... 91
C. Glossary......................................................... 93
References.......................................................... 95
FIGURES
Figure
1.1 Desired Density Histogram for Rope Organ.............................. 8
1.2 Desired Density Histogram for Chain Organ............................. 9
2.1 Desired DVH for Tumor................................................ 20
2.2 Critical organs and tumor location. Tumor: blue; critical organs:
cyan, light green, yellow; general body: brown................... 21
2.3 DVH, Simplex Method, Matlab ......................................... 24
2.4 Dosage distribution for Simplex Method, Matlab................... 25
2.5 DVH, Simplex Method, GAMS....................................... 26
2.6 Dosage distribution for Simplex Method, GAMS......................... 27
2.7 DVH for organs, Interior Point Method, PCx........................... 28
2.8 Dosage distribution for Interior Point Method........................ 29
2.9 Dosage Distributions for superpixel model, resolution 128x128 ... 40
2.10 DVHs for superpixel model, resolution 128x128 ...................... 40
A.1 DVHs for GAMS-based solutions for resolution 64x64 .................. 57
A.2 Dose Distribution for GAMS-based solutions for resolution 64x64 58
A.3 DVHs for Matlab Simplex and LIPSOL solutions for resolution 64x64 60
A.4 Dose Distribution for GAMS-based solutions for resolution 64x64 61
A.5 DVHs and Dose Distribution for PCx-based solutions for resolution
64x64 ............................................................... 62
A.6 DVHs for GAMS-based solutions for resolution 128x128 ................ 63
A.7 Dose Distributions for GAMS-based solutions for resolution 128x128 64
A.8 DVHs and Dose Distributions for PCx-based solutions for resolution
128x128 ............................................................ 65
A.9 DVHs for Matlab Simplex and LIPSOL solutions for resolution 128x128 66
A.10 Dose Distributions for Matlab Simplex and LIPSOL solutions for
resolution 128x128 ................................................. 67
TABLES
Table
2.1 Model organ dosage constraints........................................ 22
2.2 Problem constraint matrix sizes per resolution........................ 22
2.3 Objective values...................................................... 34
2.4 Algorithm running times depending on desired resolution............... 34
2.5 Number of zero elements in the solution vectors....................... 35
2.6 Degeneracy count and basis size of solution, GSimplex ................ 36
2.7 Superpixel problem constraint matrix sizes per resolution............ 39
2.8 Solution quality for maximize tumor runs (superpixel), 128x128 . 41
A.1 Quantified criteria for max tumor runs, 64x64 ....................... 69
A.2 Quantified criteria for max tumor runs, 128x128 ..................... 70
A.3 Quantified criteria for max tumor runs, 256x256 ..................... 70
A.4 Quantified criteria for min body runs, 64x64 ........................ 71
A.5 Quantified criteria for min body runs, 128x128....................... 71
A.6 Quantified criteria for min body runs, 256x256....................... 72
1. Introduction
1.1 Paper Outline
The presented work consists of several main parts. In the first part the domain model of the problem is introduced. Radiation therapy terminology is presented along with mathematical terminology and basic notation. The same introduction prepares the background, justifying and explaining why interior-point methods are chosen.
The second chapter contains a more detailed description of LIPSOL ([28]) and PCx [4], since these are the ones chosen for practical application. The differences between these algorithms are covered lightly. The results of applying the methods to the model developed in the first chapter are provided. Dosage distribution histograms and the beam densities clearly show the advantage of interior point solvers.
The concluding chapter outlines some deficiencies of the methods. The issues that are worth additional analysis and prototyping are also posed. The terminology used is mainly taken from [9], [11], [2] and [8], as these works provide an exhaustive and clear treatment of the basics of radiation therapy modeling.
Appendix A contains figures and tables that cover all the models and resolutions obtained in this study, as well as some additional discussion where pertinent. As such, it includes more data and figures than covered in Chapter 3, since the third chapter covers only the most essential cases. In a way, the Appendix has been written to be read semi-independently from the rest of the study, which resulted in some duplication of information, such as some figures, tables and solution comparison criteria definitions (the ones in the Appendix are more detailed).
Appendix B contains an overview of current interior-point algorithms. A synopsis is provided for each method along with some high-level complexity analysis. The Potential-Reduction Method is not covered in depth but is mentioned briefly.
Appendix C contains a glossary of the terms used throughout the text.
1.2 Objectives
The objective of this study is to enhance the understanding of how models for the radiation therapy planning problem should be built. Different models have different features that make them better in some cases and worse in others. This study proposes a few generic objectives that make sense from the medical point of view. The question of which particular method to use to solve a model turns out to be important as well. Out of the numerous methods known today, two methods (or rather classes of methods) have been chosen for comparison: the Simplex method and the Interior Point Method (IPM). There are several reasons why these two are selected. First, they are perhaps the most well-studied methods and as such have good theoretical foundations as well as numerous practical implementations. Second, the Simplex method has been around for so long that its implementations are mostly state-of-the-art. Therefore, this method can be used as a very good benchmark against which to compare the solutions of any other method. Third, the IPM was chosen since it
has some very attractive features from mathematical and medical standpoints:
For mathematicians and computer scientists it is important to know that the complexity of IPM algorithms is polynomial, while for the Simplex method this is not the case (in practice the Simplex runtime is often polynomial, but examples have been constructed that run in exponential time).
From the medical point of view, the problem with the Simplex Method is that it produces vertex (or basic) optimal solutions, which means that more of the physician-specified constraints are attained. This is not medically optimal, since the structure of the constraints implies two things: either the tumor gets the least/greatest dosage specified, or some (up to all) critical or noncritical organs get the maximum doses allowed, or both. Also, if the problem has alternate optima, the one with the most constraints active will be attained, which is not medically optimal as discussed above (this problem could be alleviated, though, by enumerating all alternative optima and selecting the one that makes more medical sense). The IPMs make it possible to stay away from the vertex if, for instance, the optimal solution is a whole facet or an edge. In the case of alternate optima the IPM solution will be in the relative interior of the optimal set. This is not a panacea, but at least it is a chance to get away from some constraint boundaries and be at a radiation-safer middle point. Mathematically speaking, it means that one favors some middle values from the intervals available for the tumor and tries to stay away from the maximum dosages
allowed for other organs. Expressing this in formulas is possible, but it usually means some Gaussian-like function for each parameter in the objective function, and the problem would no longer be linear if formulated precisely. This is a long-discussed deficiency of the linear optimization problem formulation (see [9]).
The question here is what a model should be like to favor the IPM solutions. Some insight into the answer to this question is provided in this study as a result of comparing different models and methods of solving them. Mathematically speaking, the very existence of an IPM solution that is different from the Simplex solution is the direct result of the existence of alternate optima. If the optimal solution is unique then the IPM solution will be the same as the Simplex one. So the question mentioned above can be reformulated: what should a model be like to provide for alternate optima? This study attempts to answer this question.
Some of the auxiliary questions that this work answers as a byproduct are: how the solutions can be compared, which implementations are faster, and which ones are more stable.
1.3 Radiation Therapy Background
This study concentrates on applying radiation to humans, although the particular organism is generally irrelevant. The differences between a human body and an animal would be the organs' locations and the dosage limits; both are parameters of the model. Radiation therapy is a widely used method of treating cancerous tumors. The idea is simple: just like with chemotherapy, a physician wants to apply a dosage of radiation to a tumor to kill the cells it consists of while sparing the normal cells.
It is a well-known fact that the tissues of all organisms are susceptible to radiation. Different organisms, organs and even different tissues of the same organism have different levels of response to radiation. It is a very happy circumstance that cancerous cells, along with dysplastic cells (further both are referred to as cancerous or tumorous cells), are more sensitive to radiation than normal cells (the so-called therapeutic advantage phenomenon). This fact does not really affect the models, but at the same time it makes the treatment safer, as constraints for normal tissue and cancerous tissue might receive values that are further apart.
The radiation is delivered to the tumor in one or several beams, possibly having different intensities, such that the necessary concentration (overall dosage) is achieved on the tumor without damaging other normal organs. The procedure of selecting the spatial positions of the beams, along with their intensities, is called treatment planning. It is the goal of the model covered here to help physicians and technicians conduct such planning in the way that is most efficient in terms of affecting the tumor and normal tissues.
The terms radiation therapy (RT) and radiation surgery are used sometimes
interchangeably. There are at least two things that set them apart.
Radiation surgery is a one-time application of a treatment plan, while radiation therapy usually implies repeated application of the procedure. There are several reasons why this is done. For one, a tumor can be too big to be killed in one shot. For another, some intermediate test might be required to see how the procedure is working, so that adjustments can be made if necessary. Yet another reason is that the patient can be too weak to undertake the full procedure at once. Finally, the machinery needs to be reconfigured manually if the relative organ positions are such that the physician cannot accomplish the objective with one fixed configuration.
Radiation surgery plays the role of a surgeon's scalpel. The application is very similar: it burns away some abnormal tissues that are not necessarily cancer-related, such as a hematoma or the like. While radiation surgery can be applied to both cancerous and noncancerous tissues, radiation therapy is almost exclusively applied to cancerous tumors.
Different setups call for using therapy or surgery. If the cancerous cells are interspersed with healthy cells then therapy is the tool of choice, since a physician is likely to prefer multiple applications of the radiation (this is called the fractionation technique). The cancerous cells are not capable of healing themselves, while the normal cells are. So, typically, therapy is applied daily over several weeks. Between the sessions the normal cells recover from the radiation, while the cancer cells do not and eventually die.
If the cancerous cells occupy some area totally and the physician is not concerned about saving any tissue in the cancer-ridden area, then surgery can be applied, burning out all cells and as such being equivalent to surgical removal of the tumor.
The current work concentrates on a one-time application of the procedure. In this sense it is closer to radiation surgery. However, the objectives stated for the model and the overall approach are more related to treating cancerous tumors by creating radiation levels that are fatal to the tumor but not so intense as to burn the tissue. In fact, burning tissues as done in radiosurgery is to be avoided, since massive necrosis of internal tissues cannot always be handled by the body itself. For the stated reason, surgery is mostly applied to outer levels of tissues, while therapy is concerned with the internal parts of organs/tissues, where external and internal are defined as being or not being easily accessible without major surgical intervention. For example, skull trepanation is not considered major, while cutting the liver or a lung is [5], [6].
The one-time application model has no explicit knowledge of previous exposures and is treated independently of the history. It is the physician's responsibility to provide the constraint numbers and information about previous or future applications of the procedure. For instance, if the physician knows that some organ can only withstand X units of radiation, of which Y units have already been delivered, then this time one will set the upper bound for this organ to something closer to X - Y so as not to exceed the total dosage of X. In the same way, if a tumor is to get at least X units and all the previous plans have delivered Y units, then some last application procedure must deliver at least X - Y units.
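The dose bookkeeping just described can be sketched as a small helper function. The function name, signature and the numbers below are purely illustrative, not part of the thesis model:

```python
def remaining_bounds(upper_total, delivered, lower_total=0.0):
    """Per-organ dose bounds for the next session, given prior exposure.

    upper_total -- maximum cumulative dose X the organ can withstand
    delivered   -- dose Y already delivered by previous plans
    lower_total -- minimum cumulative dose required (nonzero for a tumor)
    Returns (lower, upper) bounds for the current session.
    """
    upper = max(upper_total - delivered, 0.0)   # never exceed X in total
    lower = max(lower_total - delivered, 0.0)   # still deliver at least X - Y
    return lower, upper

# An organ that withstands 50 units, with 30 already delivered, may
# receive at most 20 more; a tumor needing 70 of an allowed 100 units,
# with 40 delivered, must still get between 30 and 60.
print(remaining_bounds(50.0, 30.0))                     # (0.0, 20.0)
print(remaining_bounds(100.0, 40.0, lower_total=70.0))  # (30.0, 60.0)
```

The clamping at zero covers the case where an organ's budget is already exhausted, in which case the only safe per-session bound is zero.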
Different organs react differently to radiation in a spatial sense.
Some organs, called rope organs, are not that sensitive to losing some (sometimes rather big) part of themselves. At the same time they are sensitive to (sometimes relatively small) dosages applied to the whole organ. The kidney, liver and lung are examples. In a model, such organs have to be protected from a uniform dosage delivered to the whole rope organ, or at least the treatment plan has to be checked to avoid such exposures. The dosage histogram expressing this definition is shown in Figure 1.1. Namely, the tail of the distribution is such that some noticeable number of cells (5 to 10 percent) could receive the maximum dosage allowed. At the same time, only a relatively small dose is delivered to 100 percent of the cells. So, ideally, the curve should decrease steeply right from the leftmost point.
Figure 1.1: Desired Density Histogram for Rope Organ
Other organs (chain organs) are very sensitive to losing some parts of themselves, while they are somewhat resistant to relatively high levels of radiation applied to the whole organ. The spinal cord and bowel fall under this category. In a model one needs to watch out for unusually high dosages delivered even to small parts of such organs. The dosage histogram expressing this definition is shown in Figure 1.2. The difference from the rope organ density distribution is that the tail is such that almost no cells receive the maximum allowed dosage. This condition implements the requirement to have no part of the organ exposed to a high dose. At the same time, the left part of the curve is step-like, which expresses the tolerance of exposing the organ overall to some medium level of radiation. As such, 80 to 100 percent of the cells can receive around half of the maximum dose allowed.
Figure 1.2: Desired Density Histogram for Chain Organ
In the results chapter (Chapter 2), the obtained dose-volume histograms will be checked against the limitations mentioned here for rope and chain organs.
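The rope/chain shapes can be checked numerically once per-pixel doses are known. The sketch below shows one common way to compute a cumulative dose-volume histogram; it is illustrative code written for this discussion (the synthetic dose data included), not part of the thesis software:

```python
import numpy as np

def cumulative_dvh(doses, bins=50):
    """Cumulative dose-volume histogram for one organ.

    For each dose level d, report the fraction of the organ's pixels
    receiving at least d.  `doses` is a 1-D array of per-pixel doses.
    """
    doses = np.asarray(doses, dtype=float)
    levels = np.linspace(0.0, doses.max(), bins)
    volume = np.array([(doses >= d).mean() for d in levels])
    return levels, volume

# For a rope organ one would check that the curve drops steeply: only
# some 5-10 percent of pixels may approach the maximum allowed dose.
doses = np.random.default_rng(0).exponential(scale=10.0, size=1000)
levels, volume = cumulative_dvh(doses)
print(volume[0])   # 1.0 -- every pixel receives at least zero dose
```

The resulting curve is monotonically nonincreasing by construction, which is exactly the shape plotted in Figures 1.1 and 1.2.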
1.4 Optimization and Radiation Therapy
The idea of applying methods of linear programming to radiotherapy problems has a good foundation in the fact that the overall dosage to an organ is an additive quantity (neglecting possible cell self-recovery over time). There have been some attempts to build and solve nonlinear models to compensate for some difficulties in formulating the objective for the linear case (see [6]). The results of those attempts are discussed briefly in [9]. The linear case is chosen here as the simplest one, while delivering promising results at the same time.
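By additivity, the dose each pixel receives is a linear function of the beamlet intensities, so the planning problem can be written as a linear program. The sketch below illustrates the idea on a made-up 4-pixel, 3-beamlet deposition matrix; all numbers and bounds, and the use of SciPy's linprog, are assumptions for illustration rather than the thesis model:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical deposition matrix: A[i, j] is the dose pixel i receives
# per unit intensity of beamlet j.
A = np.array([[0.9, 0.1, 0.0],    # pixel 0: tumor
              [0.2, 0.8, 0.1],    # pixel 1: tumor
              [0.1, 0.1, 0.7],    # pixel 2: critical organ
              [0.3, 0.2, 0.2]])   # pixel 3: normal tissue
tumor, organ, body = [0, 1], [2], [3]
t_min, organ_max, body_max = 1.0, 0.4, 0.8    # per-pixel dose bounds

# Objective: minimize total dose to normal tissue, subject to every
# tumor pixel receiving at least t_min and every other pixel staying
# below its limit (a pixel-by-pixel constraint formulation).
c = A[body].sum(axis=0)
A_ub = np.vstack([-A[tumor], A[organ], A[body]])
b_ub = np.concatenate([-t_min * np.ones(len(tumor)),
                       organ_max * np.ones(len(organ)),
                       body_max * np.ones(len(body))])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
print(res.status)   # 0 -- an optimal intensity vector was found
```

The `>=` constraints are passed to linprog as negated `<=` rows, which is why the tumor block of A_ub and b_ub carries minus signs.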
1.5 The Model
One of the cornerstones of the model is the notion of the deposition and attenuation matrix. This matrix describes the dosage each particular pixel (the body part that an area on the image represents) gets under the current configuration of the beams. A beam here is a beam of radiation produced by some ejector (linear accelerator) that has precisely defined boundaries. The beam is usually more than a homogeneous ray of particles/waves. New linear accelerators (linacs) allow for rather intricate configuration of a beam. The collimation device effectively breaks a beam into several parts, called beamlets, that can be operated independently. Each individual beamlet's energy level can be set as desired, or it can be turned off completely. This is done to fine-tune the beam's shape and allows physicians to attain more precise results by varying energies. In turn, the beamlets can be modeled as parallel or as radiating from a point source, producing a cone-shaped projection. In this work parallel beamlets are used. Hereafter the word pixel is used to represent the piece of body seen on the X-ray or tomography image that we take as the smallest and indivisible unit for the purposes of the model. Obviously, the more detailed a model one wants to work with, the more pixels one needs to use to refine the model.
In reality, the model resolution is bounded by the computing power that the researcher has access to. The numerical results are given in Chapter 2, but it can be stated here that even problems with a characteristic size of 512 pixels can cause significant computational difficulties even on the newest available personal computer systems: it takes around 30 hours on a dual Pentium 4 system to solve the problem under rather relaxed constraints. In this model a pixel-by-pixel constraint (that is, the conditions are applied to each pixel individually) is used to get the optimal result. The pixels that are not affected by beams (that is, those that get no dose deposition) are excluded from the problem at the preprocessing step when the attenuation matrix is prepared. This pixel-by-pixel approach ensures the finest possible level of precision but requires many more computations than some alternative approaches, such as pixel grouping ([9]), where some sort of aggregated pixel parameters are employed.
The two main factors that affect the deposition matrix are:
The dose deposited from some beam is proportional to that beam's strength (intensity), which allows the model to be considered linear, unless some other, nonlinear elements such as constraints or objectives are introduced into the model.
Physical beam configuration. The dosage deposited at a pixel is proportional to the area of intersection of the beam with the pixel. From this it follows that a pixel is not considered a sizeless mathematical abstract point but rather an area of real flesh (blood included as a type of tissue as well). It is noted that a beam misses many pixels (in fact, most pixels), and on average every pixel except for the tumor is missed by most beams, so the deposition matrix is necessarily sparse by construction. Here we take a beamlet's (characteristic) size (its width here) to be equal to the pixel size, which for resolutions of 64 and up is very reasonable.
Attenuation effect. As beams propagate through the body, tissues provide some protection/resistance against the radiation as it moves along, so the cells along the beam that are closer to the emitter get a higher dosage. Usually the attenuation is accounted for by terms e^{-μd} ([9]), where d is the distance between a pixel and the tissue boundary where the beam enters the body, and μ is an attenuation coefficient depending on the beam energy and somewhat on the medium. A typical μ value is of the order of 10^{-2} cm^{-1} for water-rich media such as the human body. In reality, the different tissues that a beam goes through have different attenuation coefficients: bone obviously shields more radiation than some semi-empty lung regions, and so forth.
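Under the assumptions above (parallel beamlets, beamlet width equal to the pixel size, a single attenuation coefficient μ), a toy deposition matrix can be built as below. This is an illustrative sketch written for this discussion, with one beamlet per image row entering from the left; it is not the code that produced the matrices actually used in this study:

```python
import numpy as np

def deposition_matrix(mask, mu=1e-2):
    """Toy deposition matrix for parallel beamlets entering from the left.

    `mask` is a 2-D boolean body image (True = tissue).  Beamlet j
    travels along row j; entry D[p, j] is the dose pixel p (flattened
    index) receives per unit intensity of beamlet j, attenuated as
    exp(-mu * d), where d is the depth in tissue from the entry point.
    """
    n_rows, n_cols = mask.shape
    D = np.zeros((mask.size, n_rows))
    for j in range(n_rows):
        depth = 0
        for col in range(n_cols):
            p = j * n_cols + col
            if mask[j, col]:
                D[p, j] = np.exp(-mu * depth)
                depth += 1          # attenuation grows with tissue depth
    return D

body = np.ones((4, 6), dtype=bool)
D = deposition_matrix(body)
# The matrix is sparse by construction: beamlet j deposits dose only in
# row j, and the dose decays with depth along the beam.
print(D.shape)   # (24, 4)
```

Pixels outside the mask receive no dose and would be dropped at the preprocessing step, exactly as described for the pixel-by-pixel model above.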
Modeling more complex cases would also take into account the effect of deflection (scattering). That is, if some organ is located close to the boundary of a bone, then there is some chance that a (relatively small) portion of the beam will be deflected by the bone and as such will add to the pixel dosage, even though the pixel might not be directly affected by the beam. Both of these considerations are rather complicated in practice, since manual piecewise specification of the μ-function is a very tedious job.
There are likely to be programs that automate both organ recognition and μ-function setup. Presumably, a computer queries some database with μ values for each recognized tissue and makes more precise computations for the deposition matrix. The second consideration, accounting for scattering effects, is also rather complex. However, great progress has been made recently by video game developers trying to reconstruct lights and shadows to create realistic effects in video games and movies ([18], http://fraktali.849pm.com/graphengines.html). Similarly, motor engine developers have been fighting the heat dissipation issue for decades now and know how to tackle it. Another group of people who know about this problem are the creators of Stealth technology, who had to calculate heat and wave scattering and dissipation precisely. Most of these algorithms are publicly available and production code can make use of them.
At the same time, the deposition matrix preparation is a secondary problem to the optimization model being developed here. So it is further assumed that all the needed matrices containing organ descriptions are available. Some of the issues that the deposition matrix preparer faces have been listed just to give a sense that it is not really a trivial problem if reliable starting data is desired.
2. Interior Point Implementations for Radiation Therapy Problems
2.1 The Interior Point Method Used
The IPM was chosen to solve the radiation therapy problem in the current work for the reasons described in the Introduction. There are not that many quality IPM solvers freely available. One of the first production-grade solvers of this class was the primal-dual OB1 [15], [16]. It was shown to be a good match for the best simplex solvers available at the time. The next successful production-quality implementation of the IPM was LIPSOL [28]. LIPSOL's author, Yin Zhang, in his work [28] showed that LIPSOL is in many cases (on the Netlib test library) faster than OB1. Another, yet more powerful, solver is PCx ([4]). It has some improvements over LIPSOL, and in this study it proved to be superior, runtime-wise, to the other free solvers used. The commercial production-quality CPLEX solver package (http://www.ilog.com/products/cplex/) was also used, in both Simplex and Barrier modes, and was shown to be the fastest of all tested.
2.1.1 LIPSOL
The LIPSOL algorithm is clearly described in [28]; the source code is available from the author's website (http://www.caam.rice.edu/~zhang/lipsol/) and is also reportedly implemented in MathWorks' Matlab version 7. The Matlab implementation was chosen as the main tool in the current study, because of its better connectivity with existing code and data, such as the attenuation matrices and result representation routines, that are Matlab-based. Both the pre-existing code and data have been used in this study along with the newly developed code. It could be that some future production code will be C- or Fortran-based for better performance, although Zhang in [28] claims that for large problems both the Fortran and the Matlab-based implementations are comparable runtime-wise, with the Matlab version being much easier to implement due to the readily available matrix facilities. In this study it has been observed that the Matlab-based solvers are significantly slower than the precompiled solvers (Windows executables), but since the solvers are totally different, one has to be careful when making performance comparisons. A compelling reason not to use Matlab in the future is that it fails for no apparent reason, and the source code is not available for debugging. It has also been found here that Matlab's problem-size limitations are relatively strict, as will be pointed out below. The GAMS implementation, not based on Matlab, is shown to be very fast in this study (100 times faster than Matlab's on resolutions above 64x64 for radiation therapy planning). So perhaps Zhang's statement no longer holds true.
linprog is Matlab's routine that, with the LargeScale parameter set to 'on', runs the LIPSOL algorithm internally. LIPSOL, as mentioned before, is an implementation of Mehrotra's Predictor-Corrector method (Section B.8) with a few changes aimed at improving numerical stability:
1. The centering parameter σ_k is evaluated as in Mehrotra's algorithm [19], but LIPSOL imposes an additional upper bound. At early stages of the iterations the upper bound is around 0.2. Near the end it is forced to be of the same order as the total relative error. Zhang claims that this approach provides for faster local convergence at the final iterations.
2. The sparse matrix factorization has been changed from the standard sparse
Cholesky procedure to be able to handle the dense columns that can arise.
The Oak Ridge National Laboratory (ORNL) package has been used for
factorization after the Sherman-Morrison formula is applied to get rid of the
dense columns [28].
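The capped centering rule of change 1 can be sketched as follows. This is an illustration in Python rather than LIPSOL's actual Matlab code; the cap of 0.2 comes from the text above, while using the relative error directly as the late-stage bound is an assumption (see Zhang [28] for the exact rule):

```python
def centering_parameter(mu, mu_aff, rel_err, early_cap=0.2):
    """Mehrotra-style centering parameter with a LIPSOL-like upper bound.

    mu is the current duality measure, mu_aff the measure predicted by
    the affine-scaling step; early_cap and the use of rel_err as the
    late-stage bound are illustrative assumptions.
    """
    sigma = (mu_aff / mu) ** 3   # Mehrotra's heuristic [19]
    # Early on rel_err is large, so the bound is early_cap; near the
    # end the bound shrinks to the order of the total relative error.
    upper = min(early_cap, rel_err)
    return min(sigma, upper)
```

With this rule the heuristic value is used unchanged far from the solution, while near convergence the bound, not the heuristic, decides the amount of centering.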
2.1.2 PCx
The same reasons given for using LIPSOL can be given for using the PCx
solver (http://www-fp.mcs.anl.gov/otc/Tools/PCx/). The solver is readily available in two incarnations:
one is a Matlab-compatible routine whose call signature is the same as Matlab's native linprog routine,
and the other is an MPS-input-driven version for MS-DOS. Since the MPS format
is not used here, the Matlab-based implementation has been chosen. Unfortunately,
the version available is C code compilable for 32-bit Linux only. A
version for 64-bit Linux (Fedora Core) could be built but would not run with
Matlab. There is no makefile for Windows either, so all the tests of the PCx solver
have been carried out under 32-bit Linux.
The PCx algorithm, proposed by Wright, Mehrotra, Czyzyk and Wagner in [4],
is based on Mehrotra's algorithm discussed above. The main algorithmic differences
between pure Mehrotra's algorithm and PCx are:
- Gondzio's higher-order corrections (see [7]), in which μ is set as a target
for the complementarity components most distant from μ. The number of such components
is flexible and depends on the execution environment performance: the
faster the environment, the more components can be processed this way
(squeezed toward μ).
- Constraint matrix A scaling, proposed by Curtis and Reid in [3]. To achieve
matrix scaling, the deviation of the nonzero elements from 1 (one) is minimized,
as measured by Σ_{a_ij ≠ 0} (log2 |a_ij|)². To solve this subproblem a conjugate
gradient method has been shown to work efficiently: just a few iterations
are required.
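The scaling objective can be made concrete with the following sketch. Curtis and Reid solve the least-squares subproblem with conjugate gradients; the sketch below (in Python/NumPy, with names of our own choosing) pursues the same least-squares goal with simple alternating row/column averaging, which is easier to read:

```python
import numpy as np

def curtis_reid_scaling(A, iters=10):
    """Least-squares scaling in the spirit of Curtis and Reid [3].

    Chooses powers of two r_i, c_j so that the scaled nonzero entries
    2**r_i * |a_ij| * 2**c_j are close to 1, i.e. it minimizes the sum
    over nonzero a_ij of (log2|a_ij| + r_i + c_j)**2.  Alternating
    averaging stands in for the conjugate gradient solve.
    """
    mask = A != 0
    L = np.zeros(A.shape)
    L[mask] = np.log2(np.abs(A[mask]))
    r = np.zeros(A.shape[0])
    c = np.zeros(A.shape[1])
    for _ in range(iters):
        resid = (L + r[:, None] + c[None, :]) * mask
        r -= resid.sum(axis=1) / np.maximum(mask.sum(axis=1), 1)
        resid = (L + r[:, None] + c[None, :]) * mask
        c -= resid.sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
    return 2.0 ** r, 2.0 ** c   # multiplicative row and column factors
```

For a diagonal matrix such as diag(4, 16) the returned factors bring every nonzero entry exactly to 1.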
2.2 The Radiation Therapy Model
The following model is used:

max Σ_{i=1}^{k} T_i x        (2.1)

TLB_i ≤ T_i x ≤ TUB_i        (2.2)
0 ≤ A_{C_k} x ≤ CUB_k        (2.3)
0 ≤ A_G x ≤ GUB              (2.4)
0 ≤ x ≤ XUB,                 (2.5)
where T_i, A_{C_k}, A_G are the attenuation matrices for the tumors, the critical organs, and the
body/general organ respectively, and x is the beamlet intensity vector. It is assumed for
generality that there can be several tumors and several critical organs.
Having several critical organs is typical, while having several tumors is not.
The developed software handles the most general case of several tumors.
TLB_i, TUB_i, CUB_k, GUB are the i-th tumor lower and upper bounds, the k-th critical organ
upper total dosage bound, and the general organ upper bound, correspondingly. XUB
is the upper bound on the values that x can take. The limit on x
is used to better smear the solution: it is medically preferable to have many
non-zero low-intensity beams rather than a few high-intensity beams. Naturally, some
special model could be set up to encourage the smeared behavior, but this is a
subject for future study.
It is convenient to consider all regular healthy tissues as one general organ
that is not too sensitive to radiation, and the several radiation-sensitive organs as
critical. It is possible to consider the general organ as a special case of a critical
organ too. Both chain and rope critical organs are described by the presented
model alike; there is no rope/chain specificity added to the model. Such
a differentiation is a subject for future research. Here are several ideas
that could be utilized. The difference between a chain and a rope organ can be
accounted for in the treatment plan analysis (the passive way: a technician reviews
the solution and determines whether the proposed treatment plan makes sense given that,
say, organ 1 is the rope organ while organ 2 is the chain organ). Also, some special
artificial measures can be undertaken in the model (the active way) to account for
the difference. In the case of a rope organ, the whole organ can be represented as
one critical organ in the model. A chain critical organ can be represented as
several critical organs, with the key parts assigned to separate critical organs in
the model, each with relatively low upper bounds to prevent their overexposure;
alternatively, a constraint can be added requiring that a certain percentage of the
organ receive less than a given dose, while the rest may receive more.
The rest (the not-so-sensitive part) of the chain organ can be assigned
to one or more critical organs in the model, depending on the spatial position,
to simplify the analysis. For instance, if in a chain organ the key sub-organ
is in the middle, then it makes sense to introduce into the model one critical
organ with a low upper bound for the key part and two more critical organs for
the parts separated by the key sub-organ. The model (2.1)-(2.5) does not make
these distinctions, but can be adjusted.
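As a concrete illustration, the model (2.1)-(2.5) can be assembled and solved with an off-the-shelf LP solver. The sketch below is in Python with SciPy's linprog rather than the Matlab and GAMS tools used in the study, and the tiny attenuation matrices and sizes are made-up stand-ins; the real matrices come from the beamlet geometry:

```python
import numpy as np
from scipy.optimize import linprog

# Toy attenuation matrices: 2 tumor pixels, 1 critical-organ pixel,
# 2 general-body pixels, 3 beamlets.  Values are illustrative only.
T  = np.array([[1.0, 1.0, 0.5],
               [0.5, 1.0, 1.0]])
Ac = np.array([[0.4, 0.1, 0.4]])
Ag = np.array([[0.2, 0.3, 0.2],
               [0.3, 0.2, 0.3]])
TLB, TUB, CUB, GUB, XUB = 55.0, 63.0, 30.0, 40.0, 25.0

# linprog minimizes, so negate the total tumor dose of (2.1).
c = -T.sum(axis=0)
# Stack the upper-bound rows of (2.2)-(2.4); the tumor lower bound
# of (2.2) becomes -T x <= -TLB.
A_ub = np.vstack([T, -T, Ac, Ag])
b_ub = np.concatenate([np.full(2, TUB), np.full(2, -TLB),
                       np.full(1, CUB), np.full(2, GUB)])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, XUB))
print(res.x, -res.fun)   # beamlet intensities and total tumor dose
```

In this toy instance the beam intensity bound XUB is the binding constraint, which mirrors the smearing discussion above: the solver spreads dose across all beamlets instead of concentrating it.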
The particular objective function chosen for this study, maximization of the total dosage delivered
to tumors, is one of the simplest. Such a function
makes medical sense and at the same time is simple enough to show that the
model and the algorithms can be used at all. The model with this function allows
for intuitive analysis. The dose maximization objective function will
push the dosages for tumor organs to the maximum possible, which in the case of a
narrow tumor dosage interval would not differ much from the case when
minimization of the tumor dosage is used. This is the good part. Such an objective
function might also result in maximizing the dosage given to other organs, which
would not be an ideal treatment plan. For the ideal plan the physician would
prefer a dosage somewhere between the minimum and maximum doses
necessary for tumors, and the minimum for other organs. As discussed below, such an
objective would require nonlinear or at least multiple objective functions. The
dose volume histogram (DVH) for the tumor expressing the desired distribution
is shown in figure 2.1.
Physically, the model attempts to determine whether it is possible to give the maximum
allowed radiation dosage to the tumors without exceeding the upper bounds
for the other organs in the given configuration. Numerous runs have shown that
solutions can be found to most of the problems, but the critical organ
bounds are the strictest constraints and can easily turn the problem infeasible.
As discussed below, post-optimization analysis is needed to determine which
particular critical organ is at risk and by how much the dosage is exceeded.
Figure 2.1: Desired DVH for Tumor
2.2.1 Example Problems Used
The prostate tumor configuration used in this study is shown in figure 2.2.
This particular configuration was chosen because of its apparent difficulty: the
tumor (blue) semi-embraces one critical organ (light-green) and in
turn is shielded from beams by critical organs, so that from virtually any beam
direction a critical organ is in the beam's way. The number of beams is set
to 40, which is a realistic number for modern devices. The number of critical
organs is set to 3, while one critical organ is physically represented in 4 locations
(cyan in the figure). Such representation of 4 locations by 1 organ is done to
manifest that the model conditions are the same for all the pieces, which can therefore
be referred to as one organ.
Visually, the problem configuration (the tumors' and critical organs' locations) is
shown in figure 2.2.
Figure 2.2: Critical organs and tumor location. Tumor: blue; critical organs:
cyan, light-green, yellow; general body: brown
The model parameters, such as the organ dosage limits, are given in table
2.1. The model resolutions were chosen to be 64, 128 and 256, and the corresponding
constraint matrix sizes are provided in table 2.2.
2.2.2 Results Obtained
The results are provided for the simplex and IPM solutions separately, in separate
subsections, with a brief individual analysis of each. The comparison of
the results is then presented in the Results Comparison subsection. All the figures,
unless specified otherwise, are given for the 128x128 resolution and the analysis
Constraint Dosage, Gy
tumor lower bound 55
tumor upper bound 63
general tissue upper bound 40
critical organ 1 upper bound 30
critical organ 2 upper bound 30
critical organ 3 upper bound 30
beam intensity upper bound 25
Table 2.1: Model organ dosage constraints
model resolution rows columns
64 1023 254
128 4080 522
Table 2.2: Problem constraint matrix sizes per resolution
is carried out for that resolution. Occasionally the analysis of other resolutions
(with figures from Appendix A) is provided where a distinction from the
mainstream 128x128 resolution needs to be pointed out.
2.2.2.1 Matlab Simplex Method Results
The DVH in figure 2.3 is for the critical organs, general tissue and tumor,
for the constraint values from table 2.1, as obtained using the Simplex Method
implemented in Matlab. The distribution for critical organ 1 shows that half the
cells received around two-thirds of the dosage allowed (20 Gy), with only 10 percent
getting more than 25 of the 30 Gy permitted by the constraints. The distribution
for critical organ 2 is very different: half the cells receive 25 Gy out of the allowed
30 Gy. Such a step-like distribution that reaches the maximum limit allowed
hints at two things: first, that critical organ 2 could potentially render the
solution infeasible if it is a rope organ, since a relatively large percentage of the
cells gets a relatively large dosage; and second, that it is the binding constraint of the model
with these parameters. The distribution for critical organ 3 demonstrates that
half the cells get around 13 Gy, with 5 percent getting 20 Gy and virtually none
getting above 27 Gy of the 30 permitted. This distribution is close to the desired
one. The general body distribution is close to ideal, with the curve going
down steeply. Some healthy cells, though, receive almost the maximum limit of
40 Gy (1 percent gets 38 Gy or more). This hints that the body constraint is
also binding. The narrow range of the permitted tumor dosages does not leave
much freedom for the tumor dose histogram curve. It is step-like as desired, with
around 90 percent of the cells getting almost the maximum allowed dosage from
the permitted range. The solution beam density distribution is shown in figure 2.4.

Figure 2.3: DVH, Simplex Method, Matlab
Figure 2.4: Dosage distribution for Simplex Method, Matlab
2.2.2.2 GAMS Simplex Method Results
The DVH is given in figure 2.5 for the critical organs, general tissue and tumor,
for the constraint values from table 2.1, as obtained using the Simplex Method
implemented in GAMS (Primal and Dual produce identical results). The curves'
behavior will not be considered here, since the results are identical to those of the IPM
and as such will be analyzed in section 2.2.2.3. The solution beam dosage
distribution is shown in figure 2.6.
Figure 2.5: DVH, Simplex Method, GAMS
Figure 2.6: Dosage distribution for Simplex Method, GAMS

2.2.2.3 Interior Point Method Results
The DVH is given in figure 2.7 for the critical organs, general tissue and
tumor, for the constraint values from table 2.1, as obtained using the IPM. The results
obtained with the different IPM solvers (GAMS Barrier, Matlab LIPSOL, PCx) are
almost identical, as shown in Appendix A, and as such are generically referred to
as IPM results; a similar generalization was made for the Simplex results above.
The distribution for critical organ 1 (c1) shows that half the cells received around
two-thirds of the dosage allowed (20 Gy), with none getting more than 28 of
the 30 Gy permitted by the constraints. The distribution for critical organ 2 (c2) is
somewhat different: half the cells receive between 25 and 26 Gy out of the allowed
30 Gy, with some cells getting 30 Gy. Such a semi-step-like shape of the curve, and
the fact that it reaches the maximum limit allowed, hints at two things: first,
that critical organ 2 could potentially render the solution infeasible if it is a
rope organ (compare to the desired rope organ distribution in figure 1.1); and second,
that it is the binding constraint of the model with these parameters. The distribution
for critical organ 3 (c3) demonstrates that half the cells get around 10 Gy, with
15 percent getting above 15 Gy of the 30 permitted. This is almost the ideal
distribution, as covered in the introduction on the rope and chain organs (figure
1.1). The general body distribution is very close to that of critical organ 3, with
some subtle differences: half the cells get less than 10 Gy, but the curve has a
thicker tail, so some cells get up to the maximum limit of 40 Gy. This hints that
the body restriction is also binding for this parameter set. The narrow range
of the permitted tumor dosages does not leave much freedom for the tumor
dosage histogram curve. It is step-like as desired (figure 2.1), with half the cells
getting the median dosage from the permitted range. The solution dose distribution
is shown in figure 2.8.

Figure 2.7: DVH for organs, Interior Point Method, PCx
Figure 2.8: Dosage distribution for Interior Point Method
2.2.2.4 Results Comparison
As before, the term IPM as used here refers to any of the LIPSOL, Barrier
and PCx algorithms, since the results obtained for them are very similar both
with respect to the dose distribution and the DVH (see the figures in Appendix A, for
example). The distinction between the different implementations of Simplex and
IPM is pointed out where necessary (as in the runtime comparison table 2.4).
The model results are provided first, with technical results such as the runtime
details given afterwards.
In the case of the 64x64 resolution, the beam intensities for Simplex show that
not all beams are turned on and that some beams are so intense that they deliver
a large dosage along the way (light-blue and yellow colors along the beam paths), as can be
seen in figures A.1, A.2 and A.3, A.4. This observation is confirmed by the dose
distribution diagram, which shows that in the Simplex solution the critical organs
get exposed to radiation more than in the IPM solution. The beam density for
the IPM is smeared better: more beams are enabled, and the colors for the non-tumor
areas are more toward the blue side of the spectrum, which indicates lower delivered dosages.
In the case of resolution 128x128 and higher, the comparison of the dose
distribution diagrams 2.8 to 2.4 and the DVH diagrams 2.7 to 2.3 does not
reveal any noticeable advantage of the IPM over Matlab's Simplex or vice versa.
The comparison can hardly be made between the GAMS Simplex solution and
the IPM, since they are identical (to some small threshold ε). This observation
is reconfirmed by comparing figures A.6 and A.7 with A.9 and A.10. While not
all runs could be made with the Matlab implementations, for those that were made
the solutions look identical. The solutions are similar not only between Matlab's
Simplex and PCx, but also between GAMS's Simplex and Barrier (implemented
as a Predictor-Corrector algorithm).
The visual comparison of the solution figures has been confirmed by the
following quantitative solution comparison criteria:
1. Percentage of zeros in the solution. The more beams are shut off, the
more likely it is that other beams have high intensity, which might be
undesirable.
2. Average beam intensity, defined as the sum of all intensities divided
by the total number of nonzero beams. A lower number here usually implies better
smearing.
3. Total dosage received by the body, defined as the sum of the dosages
to the tumor, the critical organs and the general organ. Lower values here may imply
better solutions, but the objective function must be taken into account
too.
4. Tumor dosage received, which shows how much radiation a tumor has received.
5. The objective function value, used for comparison of different solvers/
algorithms in the case of the same objective function. A higher value is
better for the maximization case, a lower value for the minimization
case.
6. Tumor dosage deviation, which characterizes how far the dosage delivered to the
tumor is from the middle of the allowed tumor dosage range:
DEV = Σ_{i∈T} ((Tx)_i − 60)².
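The criteria above can be computed directly from a solution vector; the following sketch is ours (the function and field names are not from the study; the mid-range dose of 60 Gy follows the deviation criterion, and the zero threshold anticipates the ε = 0.1 used later in the text):

```python
import numpy as np

def solution_metrics(x, T, Ac, Ag, mid_dose=60.0, eps=0.1):
    """Quantitative comparison criteria for a beamlet intensity vector x.

    T, Ac, Ag are the tumor, critical-organ and general-organ
    attenuation matrices; a beamlet with intensity below eps is
    counted as shut off.
    """
    on = x > eps
    tumor = T @ x
    return {
        "zero_pct":      100.0 * (1.0 - on.mean()),           # criterion 1
        "avg_intensity": x[on].sum() / max(on.sum(), 1),      # criterion 2
        "total_dose":    tumor.sum() + (Ac @ x).sum() + (Ag @ x).sum(),  # 3
        "tumor_dose":    tumor.sum(),                         # criterion 4
        "deviation":     float(((tumor - mid_dose) ** 2).sum())  # criterion 6
    }
```

Criterion 5, the objective value, is reported by the solver itself and so is not recomputed here.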
Tables A.1, A.2, A.3, A.4, A.5, A.6 summarize all the comparison quantities.
It is seen that for all the criteria discussed, the Simplex and IPM solutions
are identical for resolutions 128x128 and up. There are some advantages of the IPM
for the 64x64 solutions that are in good accordance with the goals outlined before:
obtain a solution that is smeared well and spares healthy
organs as much as possible while killing the tumor. It was predicted that the
IPM solution would have fewer zeros among the beam intensities than Simplex, and
this is actually observed (see table 2.5) in the case of Matlab's Simplex. It is
not quite clear why the solutions for all resolutions but 64x64 are so close
that there is no predicted difference between the Simplex and IPM solutions. A few
reasons can be offered:
1. The IPM should give a solution in the middle of the optimal face if the objective
function takes the same value at any point of the face, while the simplex method
returns some corner of the same face (the one with the most active
constraints). If the face is small, then the corner solution is hardly
distinguishable from the mid-face solution. If pixel-by-pixel constraints
are used instead of block-like constraints (for instance, see [9]), then the
problem can potentially become over-constrained and the optimal face can
be small (given that the problem is feasible). For pixel-by-pixel constraint
models such an over-constrained situation is likely, since the constraint matrix
is large: the number of rows exceeds the number of columns by a factor of about 4
at the 64x64 resolution, up to about 50 at the 256x256 resolution (see table 2.2).
2. The solvers fail to produce mid-face solutions (that is, interior face solutions)
somehow, either by crossing over or by some other means. This is unclear,
since in the Barrier runs the crossover option was turned off, while
PCx has no reported crossover.
It can be concluded that no difference between the Simplex and IPM solvers
could be detected for resolutions 128x128 and 256x256, with some indication that the
IPM solutions are better when a difference is detectable (as seen for resolution
64x64). Also, the IPM solution only differs from the Simplex solution if alternate
optima exist. Based on [10], some GAMS code has been reused and modified
to enumerate the alternate optima. So far, for the models studied, no alternate
optima have been shown to exist, despite strong indications of their existence
(the IPM solutions are sometimes (64x64) very different from the Simplex solutions
but have similar objective function values).
The fact that the objective values grow as the resolution grows can be explained
by the increase in the number of beamlets per beam (the attenuation matrix is
built such that the beamlet width equals the pixel width, and the number of pixels
grows as the resolution increases). Each dimension is doubled and the problem is
two-dimensional, so the growth of the objective value with resolution is to be
expected. Also, a finer resolution allows more pixels to sit at the peak of the
allowed dosage, which increases the objective as well. The GAMS Simplex
method is equal to the IPM for every criterion used herein.
The model has been run with pixel resolutions 64, 128 and 256 for similar
constraint sets. The average running times in seconds for the different problem sizes
are provided in table 2.4. These times are not fully consistent, since the server load
on a general-access machine varies unpredictably.
A component of the solution vector is assumed to be zero if its value is below
ε = 0.1. It is assumed that the value 0.1 is small enough compared to the full range
n Simplex(Matlab) Simplex(GAMS) LIPSOL PCx
64 1134 1134 1134 1134
128 4547 4547 4547 4547
256 17102 17102
Table 2.3: Objective values
n rows columns Simplex(M) Simplex(G) LIPSOL PCx
64 1023 254 353 2 49 19
128 4080 522 38,302 15 1502 928
256 11885 242 11318 (64 bit) 228 8780 38110
Table 2.4: Algorithm running times depending on desired resolution
of possible values between 0 and 25. It was established that changing the parameter
ε in the range 0.01 to 1 does not affect the results obtained using this criterion.
The values 0.01 and 1 were chosen as large enough to exclude irregularities such
as possible numerical errors and small enough compared with the full solution
range.
n solution size MSimplex MLIPSOL PCx GSimplex GBarrier
64 254 215 91 109 222 93
128 522 397 397 397 397
256 990 648 648 648
Table 2.5: Number of zero elements in the solution vectors
The difference in the number of zeros between the different solutions can be
partially explained by solution degeneracy. To test this assumption it is
necessary to check how the complementary slackness condition holds. Write
the complementary slackness expression in the form y_i (a_i x − b_i) = 0, where
i is the row index, x the primal solution vector, y the dual solution vector, a_i the i-th row
of the constraint matrix A, and b_i the i-th element of the right-hand side vector b.
The complementary slackness condition dictates that for each row either the dual
solution component is zero or the primal constraint is active. If both terms are
zero at the same time, degeneracy takes place. The degree of degeneracy
can be measured by counting how many times both terms are zero.
For the solutions obtained using different solvers one can thus get the degeneracy
degree. Some particular results are presented in table 2.6 for the GAMS Simplex
algorithm runs. A conclusion can be drawn that the constraint matrix of the
problem is not of full rank, since the basis size is smaller than the smaller of
the two dimensions (in this case, the number of columns is smaller than the number of
rows). Similar numbers can be obtained for the case when the number of rows
is smaller than the number of columns.
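The degeneracy count described above amounts to checking, row by row, whether both complementary slackness factors vanish at once; a short sketch (the tolerance is an assumed numerical zero threshold, and the function name is ours):

```python
import numpy as np

def degeneracy_count(A, b, x, y, tol=1e-8):
    """Count rows where both complementary slackness factors vanish.

    For each row i, complementary slackness requires y_i (a_i x - b_i) = 0;
    when both the dual value y_i and the slack a_i x - b_i are numerically
    zero at once, the row witnesses degeneracy.
    """
    slack = A @ x - b
    both_zero = (np.abs(y) <= tol) & (np.abs(slack) <= tol)
    return int(both_zero.sum())
```

Running this over the primal/dual pairs returned by each solver yields the degeneracy degrees reported in table 2.6.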
n number of variables degeneracies nonzeroes basis size
64 254 12 32 44
128 522 0 125 125
Table 2.6: Degeneracy count and basis size of solution, GSimplex
The current implementation of the IPM in Matlab seems to be defective. In
a Windows environment, a problem of size 512 causes an Out Of Memory error in
all cases. The infeasible case causes the same error even under a 64-bit Linux dual-
Opteron configuration. Since the process is iterative and it crashes after several
iterations, it seems that some memory does not get released back to the system.
In fact, the Matlab LP algorithm implementations are so slow that they are not
really viable for anything but small problems.
Along with the comparison of Simplex to IPMs, a comparison can be made between
the methodologies used in this study and in the elastic constraints study (ECS) [9]:
- The current study uses a constraint for every pixel available, while the
ECS only uses pixels in which high dosage areas (hot spots) are likely to
occur. The pixel-by-pixel constraint approach improves result credibility
and accuracy, since all points are accounted for.
- The grid size in the current work varies from 64 to 256, while the ECS
grid size is 64.
- Running times for comparable size problems (64x64 grid) are of the order of
seconds here, while in the ECS they are in minutes.
- The current study uses the LIPSOL and PCx IPMs, while the ECS arguably uses a
simplex method internally.
- The approach used in the current study produces an infeasibility error if the
constraints are too strict. At the same time, the ECS approach still produces
some solution even under infeasible constraints, resulting in
areas where dosages are above/below the desired levels.
Overall, it can be stated that the elastic constraint method provides a substantial
usability advantage, as it shows the areas that are
overdosed and where the constraints are prohibitively strict. However, the most
fruitful approach would be to combine the best of both approaches, so as to
run the model at resolutions and levels of detail as high as in the current study. Another
feature of the current study that can be used in a future implementation
is the fact that the IPM can potentially be extended to nonlinear objective
functions, which would allow for more realistic formulations of the problem. For
example, it would be desirable to give more weight to the median point of the tumor
dosage interval within the stated objective function. The weight distribution
might be Gaussian-like, with preference for the middle.
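Such a Gaussian-like preference could be realized, for example, as a weight attached to each candidate tumor dose. The sketch below is purely illustrative: the function name is ours, the center is taken as the middle of the 55-63 Gy tumor range from table 2.1, and the width is an arbitrary assumption:

```python
import numpy as np

def gaussian_dose_weights(doses, center=59.0, width=2.0):
    """Hypothetical Gaussian-like weights peaking at the middle of the
    allowed tumor dose range; center and width are illustrative."""
    d = np.asarray(doses, dtype=float)
    return np.exp(-0.5 * ((d - center) / width) ** 2)
```

The weights are symmetric around the center and decay toward both ends of the permitted dose interval, which is exactly the "preference to the middle" described above.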
2.2.3 Superpixel Reformulation
Since it is suspected that the solutions obtained are simplex (vertex) solutions
because the system can be over-constrained though not infeasible, it becomes
interesting to change the model so as to decrease the number of constraints, or
the ratio of the number of constraints to the number of variables, to make the constraint
matrix more square-like or wider in columns. There are two ways of doing this:
1. Increase the number of independent variables. With the resolution fixed, the
only way to do so is to increase the number of angles at which beams
can be located. This is left for future study, since it solves the problem
only to a certain extent: increasing the number of available angles
from 40 to 360 gives a 9-fold increase, while the trend in table 2.2 shows that
this is not enough for the case of the 256x256 resolution and higher.
2. Decrease the number of constraints. This can be accomplished using the so-
called superpixel approach, in which the pixel-by-pixel constraints are (partially)
replaced with constraints over some group(s) of pixels.
The superpixel approach for the case when the whole general body G (the whole
system without the critical organs and the tumor(s)) is considered as one superpixel
is presented below:
max Σ_{i=1}^{k} T_i x        (2.6)

TLB_i ≤ T_i x ≤ TUB_i
A_{C_k} x ≤ CUB_k
Σ(A_G x) ≤ GUB · (number of pixels in the general organ)
0 ≤ x ≤ XUB,
where the notations are as before.
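The passage from the per-pixel constraints of (2.4) to the superpixel constraint of (2.6) amounts to summing the general-organ rows into a single row; a minimal sketch (the function name is ours):

```python
import numpy as np

def general_superpixel(Ag, gub):
    """Collapse the per-pixel constraints 0 <= Ag x <= GUB into the
    single superpixel constraint sum(Ag x) <= GUB * n_pixels of (2.6).
    Returns the aggregated row and its right-hand side.
    """
    n_pixels = Ag.shape[0]
    row = Ag.sum(axis=0)   # one row replacing n_pixels rows
    return row, gub * n_pixels
```

This is what shrinks the row counts from table 2.2 to those of table 2.7: the general-organ block of the constraint matrix collapses to one row per superpixel.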
This model effectively constrains the total dose that all general body cells can
receive altogether. The constraint matrix sizes in this case become as in table 2.7
(the 256x256 resolution has not been considered):
model resolution rows columns
64 100 254
128 408 522
Table 2.7: Superpixel problem constraint matrix sizes per resolution
2.2.3.1 Superpixel Results
Beam densities and DVHs for the superpixel model are presented in figures
2.9 and 2.10.
The observation that the IPM solution looks better than the Simplex one is
reconfirmed by the results in table 2.8.
Comparing the superpixel result figures 2.9 and 2.10 with the results of applying
the full model in the pixel-by-pixel approach, such as figures 2.3, 2.4, 2.7 and 2.8,
leads to the conclusion that the superpixel approach adds value to the model
by providing medically better results; e.g., better smearing is observed in the
case of the IPM solution as compared to the Simplex result. The drawback of the
superpixel generalization of the pixel-by-pixel constraints is that in some pixels
the solution is such that the tissue represented by that pixel can get over- or
under-exposed. In this case some further refinement of the model can be done
Figure 2.9: Dosage distributions for the superpixel model, resolution 128x128.
(a) PCx; (b) Matlab Simplex; both with the maximize tumor dose objective
Figure 2.10: DVHs for the superpixel model, resolution 128x128.
(a) PCx; (b) Matlab Simplex; both with the maximize tumor dose objective
RES ALG ZERS AVG TOTAL TUMOR OBJ DEV
128 MS 79 13 60205 4599 4599 1168
128 GS 76 13 70121 4599 4599 1168
128 GB 0 3 69221 4599 4599 1168
128 PCx 0 3 63478 4599 4599 1168
128 ML 0 3 58363 4599 4599 1168
Table 2.8: Solution quality for maximize tumor runs (superpixel), 128x128
by introducing several superpixels, configured/aligned so as to represent potentially
troublesome areas. The process of finding the optimal solution for the
problem then becomes clearly iterative: obtain the superpixel solution for
the given superpixel configuration, check the solution pixel-wise, and then
refine the superpixel breakup of the system again to avoid the discovered violations
at the individual pixel level, if any. This whole process can be automated or
semi-automated, given that criteria are formulated for how to break the superpixels
down at each step.
2.3 Conclusions
The conclusions of the analysis provided above, along with other ideas expressed
in the study, are as follows:
1. The factor that determines whether the solution is an interior point one or a
simplex (vertex) one is the model itself, not the particular algorithm or
solver used to solve the model. Therefore, more effort needs to be put into
investigating the relationship between the model formulation and the character
of the solution. So far, there have only been hints that to make the model
favor an interior point solution one needs to avoid having the model
over-constrained. That is, the number of columns should be larger than or equal
to the number of rows in the constraint matrix. One way to accomplish
that is to have superpixels in the model. A superpixel is a group of pixels
that are treated as one entity by some constraint. For instance, instead of
requiring that each pixel receive less than some amount of radiation, one
might require that a group of pixels receive less than some (proportionally
larger) amount of radiation. One has to be careful with the superpixel
approach, since it might lead to local pixel-wise violations (overdose, as
seen in figures 2.9 and 2.10). An approach needs to be developed to create
superpixels such that at the pixel level the (implicit) dose constraints are
not violated. For instance, one superpixel can be a 10-pixel-wide layer
around the tumor, then another 10 pixels, and then the rest of the body.
This issue is a subject of further research and looks very promising, despite
the fact that the solutions obtained using the IPM are only partially interior. So
far the superpixel solutions of the IPM are superior to the Simplex solutions,
since they provide better smearing and DVH distributions (see table 2.8
and figures 2.9, 2.10).
2. An RTP model shall account for rope/chain organ differences. More research
needs to be done into how this can be accomplished.
3. Some code needs to be developed to enumerate the alternate solutions of the
model, to check that the solution is indeed an interior-point solution. Further
(medical) analysis of the alternate solutions needs to be conducted to
figure out which one is the best (the comparison criteria need to be developed
as well).
4. The Matlab-based solvers should be replaced with faster production-quality
solvers such as GAMS.
5. More development needs to be conducted to add facilities
such as warm starts to the existing solvers.
3. Future Research and Discussions
The application of the particular IPM implementations (LIPSOL, Barrier and
PCx) to the radiation therapy model as described above has shown that a few
things need to, or rather should, be addressed.
3.1 Avoiding Overconstrained Systems
It has been shown that using the concept of superpixels one can obtain
interior point solutions, or at least partially interior point solutions. This
approach decreases the number of rows in the constraint matrix. In the same vein,
one could increase the number of columns by creating an attenuation matrix
for more angles (360, and perhaps more if the devices can handle more precise
positioning). Finer resolutions could help as well, but they increase the
number of rows at the same time.
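The row-aggregation step behind superpixels can be sketched as follows; the pixel grouping, the matrix sizes and the choice to sum (rather than average) the rows are illustrative assumptions, not the thesis' implementation.

```python
# A toy sketch of the superpixel idea: rows of the attenuation matrix that
# share a superpixel label are summed into one aggregate constraint row,
# shrinking the row count of the LP. The labels are made-up toy data.
import numpy as np

def aggregate_rows(A, labels):
    """Sum the rows of A that share a superpixel label, one row per label."""
    labels = np.array(labels)
    groups = sorted(set(labels.tolist()))
    return np.array([A[labels == g].sum(axis=0) for g in groups])

# 6 pixels x 3 beams, grouped into 2 superpixels of 3 pixels each
A = np.arange(18, dtype=float).reshape(6, 3)
labels = [0, 0, 0, 1, 1, 1]
A_super = aggregate_rows(A, labels)   # shape (2, 3): two aggregate rows
```

A dose bound on each aggregate row then constrains the superpixel as a whole, which is exactly why pixel-level violations inside a superpixel remain possible.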
3.2 Improved Attenuation Matrix
A few things can be done to improve the attenuation matrix, that is, to make
it more realistic:
1. Use more angles (modern devices can have continuous angles)
2. Use a conical instead of a pencil-like beamlet shape, or give the user a choice
3. Account for tissue-related density differences and bone-related scattering
4. Build a more precise model based on 3D beam positioning, not necessarily
planar. Account for possible non-spherical location of the accelerators
3.3 Infeasibility/Sensitivity Analysis
It is not satisfactory to get an answer that a problem is infeasible, without
details on which particular constraints are violated, after possibly hours of
computations. Determining model feasibility is a difficult problem by itself.
In [9] it is proposed to use the concept of elastic constraints to analyze
the feasibility violations in the model. The elastic constraints concept adds
extra variables that measure how far the model has to deviate from the hard
constraint formulation to stay feasible. In the literature, elastic constraints
are also called soft constraints. As a passive method of constraint violation
detection, one can use the GAMS model run report, which contains information
about which constraints have been violated.
The elastic constraints can be applied to the model as follows:

A_C x <= CUB  becomes  A_C x <= CUB + U_C
A_G x <= GUB  becomes  A_G x <= GUB + U_G,

where CUB and GUB are the critical organ upper bound and the general organ
upper bound. Imposing the additional loose constraints U_C >= 0, U_G >= 0, the
solution will tell how far the initial constraints must change to keep the
problem feasible. If the variables U_C, U_G are themselves constrained and the
problem is still infeasible, then the constraints are too strict and the
physician needs to reconsider them. Also see [12].
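The elastic reformulation above can be sketched with a generic LP solver; the matrices, bounds and unit penalties below are toy stand-ins, and scipy's linprog is used in place of GAMS.

```python
# Toy sketch of elastic constraints: slack variables u_c, u_g relax the organ
# upper bounds, and the objective penalizes their use. The data are chosen so
# the hard constraints are infeasible (tumor needs 10, hard bounds admit 5).
import numpy as np
from scipy.optimize import linprog

T = np.array([[1.0, 1.0]]); TLB = 10.0   # tumor requires T x >= 10
A_C = np.array([[1.0, 0.0]]); CUB = 2.0  # hard bound x1 <= 2
A_G = np.array([[0.0, 1.0]]); GUB = 3.0  # hard bound x2 <= 3
gamma, beta = 1.0, 1.0                   # elasticity penalties

n, mc, mg = 2, 1, 1
# variable order: [x, u_c, u_g]; minimize gamma*u_c + beta*u_g
c = np.concatenate([np.zeros(n), gamma * np.ones(mc), beta * np.ones(mg)])
A_ub = np.block([
    [A_C, -np.eye(mc), np.zeros((mc, mg))],   # A_C x - u_c <= CUB
    [A_G, np.zeros((mg, mc)), -np.eye(mg)],   # A_G x - u_g <= GUB
    [-T, np.zeros((1, mc + mg))],             # T x >= TLB
])
b_ub = np.concatenate([[CUB], [GUB], [-TLB]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")  # vars >= 0 by default
```

The optimal objective value measures the total relaxation needed to restore feasibility, which is exactly the diagnostic information a plain "infeasible" answer withholds.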
Imposing a penalty on the objective function for using the elasticity variables,
by adding the term(s) γ^T U_C, β^T U_G, will stimulate the model to refrain
from abusing the elasticity. In fact, the values γ, β can be called the
elasticity parameters or inverse elasticity parameters. The bigger the values
of γ, β in the objective function, the less elastic the problem is. The main
problem with the objective function coefficients of the elastic constraints is
that they are a priori chosen constants.
Two types of coefficients regulating the objective function and constraints
for the elastic constraints have been proposed in [9] that keep the model
linear: Average Analysis and Absolute Analysis. In the Average Analysis the
average amount of discrepancy between desired and attainable dosage limits is
calculated. In this case the objective function does three things:
1. minimize the average amount the tumor is under the minimum required
dose
2. minimize the average amount of radiation that the critical organs receive
3. minimize the average amount a normal organ (pixel) is over the specified
upper limit dose
In the Absolute Analysis the coefficients are chosen to minimize the maximum
amount of discrepancy. The objective function then becomes:
1. minimize the maximum amount the tumor is under the minimum required
dose
2. minimize the maximum amount of radiation that the critical organs receive
3. minimize the maximum amount a normal organ (pixel) is over the specified
upper limit dose
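The Absolute Analysis objective stays linear via the standard auxiliary-variable trick: introduce a variable t and require every discrepancy term to be at most t. A toy sketch, with made-up matrices:

```python
# Toy sketch of "minimize the maximum" kept linear: minimize t subject to
# each critical-organ dose being at most t, plus a tumor lower bound.
# T, A_C and the bounds are illustrative, not the thesis' data.
import numpy as np
from scipy.optimize import linprog

T = np.array([[1.0, 1.0]]); TLB = 4.0   # tumor requires x1 + x2 >= 4
A_C = np.eye(2)                         # two critical-organ dose rows

# variables: [x1, x2, t]; minimize t
c = np.array([0.0, 0.0, 1.0])
A_ub = np.vstack([
    np.hstack([A_C, -np.ones((2, 1))]),  # (A_C x)_j - t <= 0 for each row j
    np.hstack([-T, np.zeros((1, 1))]),   # T x >= TLB
])
b_ub = np.array([0.0, 0.0, -TLB])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
```

At the optimum the maximum per-organ dose is spread evenly (here x1 = x2 = 2), which is the intended equalizing effect of the minimax objective.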
Both approaches are very promising and have been demonstrated to work
in [9]. It would be interesting to modify the model described in this work
to account for elasticity. However, it will not be a seamless integration,
since the current model uses the objective function of maximizing the tumor
dosage. Adding the elastic approach objective function will therefore lead to
a multi-objective function, which is harder to analyze. The generic difficulty
of multi-objective functions is that it is not easy to give a clear physical
meaning to the weight coefficients of each objective function term.
Another approach to analyzing infeasibility is studying the log that some
packages leave behind after a run. For example, GAMS logs allow one to see
which constraints were violated. Unfortunately, for large models the log files
can be megabytes in size, so some homegrown or third-party tool would be
needed to make the analysis viable. The elastic constraints approach seems
preferable, as it has more physical meaning.
3.4 Model Linearity
As discussed in [9] and as shown above, the linearity of the model necessarily
implies some discrepancy between what a physician would like to optimize and
what one can put into the model. For instance, the natural constraint that the
tumor gets a dosage somewhere between the minimum and maximum, so as to
avoid either extreme, cannot be expressed in terms of a linear objective
function. The natural objective function would be some sort of Gaussian
function whose value is at its maximum at the center of the specified interval.
The IPMs, as mentioned in the introduction, are intended to mitigate this
problem by using path-following algorithms that take
the medium-positioned solution. However, such an approach is artificial and
technique-dependent, not to mention that it might depend on the initial point
choice in a not-so-robust implementation. If the starting point was chosen
far from the medium set, then the solution can be close to a vertex. A model
is required to be more solver-independent. In the future, more attention should
be given to methods that can solve nonlinear problems, such as the Log-Barrier
and Potential-Reduction Methods [6], [22].
3.5 Surprise Function
When considering adding nonlinearity to the model, the Surprise Function
approach could be implemented as discussed in [25], or penalized constraint
violation can be used following the discussion in [13]. A goal satisfaction
problem is one in which a set of goals is to be attained. When the goals are
in fact uncertain, a best compromise can be found by minimizing the surprise
function (a barrier-like function) associated with each goal. Each flexible
constraint is considered a goal and has the form Ax = b, where the b are
target values (radiation limits). However, instead of maximizing the levels of
feasibility of all constraints to a specified degree, as in chance constraint
specifications, the surprise function approach seeks the best compromise
solution over all the constraints as a whole. It maximizes the overall combined
feasibility by applying a dynamic penalty to constraint violations as measured
by surprise functions, which are like barrier functions: their limit goes to
infinity quickly when reaching the edge of maximum flexibility and is zero in
the range of feasibility. The surprise function is applied to each pixel, that
is, there is one surprise function for each pixel, and the objective is to
minimize the sum of the surprise functions.
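A barrier-like surprise function for a single pixel might look like the sketch below; the -log form and the flexibility limit are illustrative assumptions, not the exact function of [25].

```python
# A toy barrier-like "surprise" for one pixel: zero while the dose is within
# its target range, growing without bound as the violation approaches the
# maximum allowed flexibility. The -log form is an illustrative choice.
import math

def surprise(dose, target_hi, flex):
    """Penalty for exceeding target_hi, infinite at target_hi + flex."""
    v = dose - target_hi
    if v <= 0:
        return 0.0                      # inside the range of feasibility
    if v >= flex:
        return math.inf                 # edge of maximum flexibility reached
    return -math.log(1.0 - v / flex)    # grows to infinity as v -> flex
```

The objective would then be the sum of one such term per pixel, penalizing violations more and more steeply as they approach the flexibility limit.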
3.6 Performance Characteristics
The computation time for production-size problems (512x512 and higher
resolutions) is too long. Even for a size of 256x256, the running time of the
Matlab-based codes is measured in hours. More production-grade solvers and
packages like CPLEX need to be tested to determine whether they produce results
of the necessary quality in reasonable time.
3.6.1 Warm start
As stated in [26], a warm start does not provide a drastic improvement in the
performance of IPMs. At the same time, even the 2-3 times decrease of
runtime cited in [26] deserves much attention when runtimes reach hours.
The Matlab implementations of the LIPSOL, Simplex and PCx algorithms have
a significant deficiency: they do not allow one to specify the initial/starting
point. Such a lack of warm-start capability is regrettable, since the
algorithms are slow by themselves (compared to GAMS). When computation times
are substantial, it is of great value to be able to use prior knowledge, either
from previous runs or from feasibility analysis, to speed up the solution. In
production, such an approach could be implemented as a library of feasible
(optimal) solutions for each particular tumor location. For instance, if for
some patient the tumor is located in the liver, then it is likely that an
already computed solution for another patient with a similar tumor location
could be a good starting point. Of course, each organism is unique, and the
constraints will vary a bit to account for the patient's size, physical
condition, tumor details, etc. But in general, the new solution is expected to
be somewhere around one that was already computed for someone else. The more
calculations are made, the more solutions are available.
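The solution-library idea could be prototyped as a simple nearest-match lookup; the tumor descriptor fields and the Euclidean distance measure are invented for illustration.

```python
# A toy sketch of a warm-start library: previously computed beam solutions
# are keyed by a crude tumor descriptor, and the closest match supplies the
# starting point for a new run. Descriptor fields are invented examples.
import numpy as np

library = []   # list of (descriptor, solution) pairs from past runs

def store(descriptor, solution):
    library.append((np.asarray(descriptor, float), np.asarray(solution, float)))

def warm_start(descriptor):
    """Return the stored solution whose descriptor is nearest, or None."""
    if not library:
        return None
    d = np.asarray(descriptor, float)
    best = min(library, key=lambda item: np.linalg.norm(item[0] - d))
    return best[1]

# descriptor = (location code, tumor size); solutions are beam intensities
store([1.0, 2.5], [0.0, 3.0, 1.0])
store([4.0, 1.0], [2.0, 0.0, 0.5])
x0 = warm_start([1.1, 2.4])   # nearest match is the first stored entry
```

In a production system the descriptor would be a richer, formalized record, but the lookup mechanism would stay the same.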
3.6.2 Statistical Solutions Analysis
Using statistical methods on the available solutions and the treatment
results, the physician could come up with a better treatment plan. It appears
that future production implementations could benefit from statistical analysis
of available solutions, both for planning and for warm starts of new
computation runs. Such an approach would require some sort of formalized
database that describes tumor characteristics (size, location, etc.) as well
as the relative locations and characteristics of the critical and general
organs. Since the number of organs is limited and tumors tend to appear in
some places more often, such a database, along with treatment outcomes, would
help in future treatment plan analysis.
Further research needs to be undertaken to either locate packages that allow
specifying the starting values, or to modify the source codes of existing
packages. The latter option is viable for the Matlab-based PCx package
discussed in this study, as well as for the original LIPSOL implementation
mentioned here but not tested.
Yet another way of improving the algorithm could be keeping statistics of
where the Central Path has been computed on each iteration of the algorithm.
Since the computed solutions on each iteration tend to lie around the Central
Path, statistical analysis of the points could help in predicting what the
next solution point could be. Such an approach could speed convergence, since
the closer the intermediate solution is to the Central Path, the bigger a step
towards the solution can be taken on the next iteration. Accounting for such
statistics may work as an additional prediction term, so it may be given some
weight to participate in the affine prediction expression. This is worth
further investigation. Just as with starting points, some additional code will
need to be developed to achieve this. Also, full access to the source codes
will be required, since such a change is not as easy as setting a starting
point.
Appendix A. Figures with Discussion
This appendix contains some more figures of the solutions obtained using
different algorithms and solvers. Where necessary, some discussion is provided,
as well as hints on how to reproduce such results. At the very end, tables are
provided summarizing the values of the criteria used for all the runs.
A.1 Matlab Deficiency
It was determined in this study that the LP solvers that are part of
Matlab Release 14 are not quite usable. Namely, as seen from table 2.4,
the runtimes of both the Simplex and Interior Point Method (LIPSOL, PCx)
implementations in that package can reach hours. Also, as mentioned in
chapter 2, for relatively large problems (256x256 and up) the Matlab
implementations run out of memory. Since the intermediate results are not
saved anywhere and the starting point cannot be specified, such crashes are
not recoverable and are a waste of resources. Similar logic holds true for
the Matlab-based PCx solvers: while the PCx solver has not been observed
to crash, its runtime can also reach hours.
A.2 Interfacing Matlab with GAMS
An attempt was made to find an alternative Simplex solver for comparison with
the Interior Point solutions. One system that was found to be adequate and
easy to use is GAMS (www.gams.com). Not only does GAMS produce feasible
solutions that are in good agreement with solutions obtained from other
solvers, but GAMS can also be switched to use another Interior Point
algorithm: the Barrier method, related to the Potential-Reduction method
briefly covered in section B.1, with a predictor-corrector workflow. As seen
from table 2.4, the runtime advantage of GAMS is substantial: it is up to 167
times faster on problems of size 256 and, in fact, the trend of the time
advantage with problem size suggests that on larger problems it could be even
more.
In order to use GAMS with Matlab, which is essential for the reasons already
stated (attenuation matrices are prepared in Matlab, and graphical output as
well as other infrastructure developed in the scope of this study is done in
Matlab), it was necessary to develop a special wrapper GAMS model as well as
some intermediate Matlab code. A more detailed description of how the
interfacing between Matlab and GAMS is done can be found at
http://www.cs.wisc.edu/~ferris/matlab/matlab.html.
GAMS is a system that allows many particular solvers to be plugged in, such
that the user can seamlessly license and use the one he/she likes best.
Such an approach ensures that a model developed in GAMS is independent of the
particular solver. The simplex solver available for this study is CPLEX
version 7.5 (http://www.ilog.com/products/cplex/). The CPLEX package
internally uses a few algorithms that can be invoked on request. Namely, the
dual simplex and barrier algorithms were of particular interest here, since in
the proposed model the number of rows in the constraint matrix is always
bigger than the number of columns, and since the interior point methods are of
particular interest for this study for the reasons already stated.
In order to use the CPLEX package consistently (independently of the
settings that an administrator could select for GAMS on the server), a change
was made to the GAMS model. Namely, a line Option LP=Cplex; was added before
the solve statement. Next, in order to be able to control and switch the
algorithms that CPLEX uses internally, the following changes were made:
1. A line was added to the GAMS model to instruct GAMS to look for the
options file: dosage.OptFile=1; where dosage is the model name. In
response to this line, along with the solver set to CPLEX as described above,
GAMS starts looking for the file cplex.opt
2. A line was added to the cplex.opt file: lpmethod M, where M is a parameter
equal to 2 to invoke the Dual Simplex and 4 to invoke the Barrier algorithm.
3. A line was added to the cplex.opt file: barcrossalg -1, to instruct the
barrier algorithm not to use crossover to a basic solution, since a vertex
solution is not sought here.
For a more detailed discussion of CPLEX options and configuration for GAMS,
see http://www.gams.com/solvers/cplex.pdf. Keep in mind that the referenced
document is for CPLEX version 9.0.
A.3 Objective Functions
Every run of the model under GAMS was carried out using both the Simplex and
Barrier algorithms. Moreover, to provide better insight into the difference
between the solutions obtained using the Simplex and Interior Point methods in
GAMS, two objectives have been formulated and used for the GAMS model:
Maximize the dosage delivered to the tumor. This is the main objective
formulated in chapter 2 and the one originally used with the Matlab-based
solvers. Such an objective is expected to return a non-vertex solution by the
interior point method, since the objective hyperplane is parallel to one of
the faces of the feasible set polyhedron. The model is formulated as:
max Σ_{i=1}^{k} T_i x    (A.1)
subject to
TLB_i <= T_i x <= TUB_i
0 <= A_Cj x <= CUB_j
0 <= A_G x <= GUB
x >= 0,
where all the notations are defined as in equations (2.1)-(2.5).
Minimize the overall dosage delivered to the whole body. This is a new,
additional objective, not the main focus of the model; however, the
Matlab-based solvers have also been applied to the model with this objective.
It is expected to return a vertex solution, and it does indeed. The model is
formulated as:
min Σ_i T_i x + Σ_j A_Cj x + A_G x    (A.2)
subject to
TLB_i <= T_i x <= TUB_i
0 <= A_Cj x <= CUB_j
0 <= A_G x <= GUB
x >= 0,
where all notations are as above.
The solutions obtained using all available algorithms on the objectives stated
here (extended to all solvers) for the model outlined in section 2.2 are
provided below, along with discussion, for the resolutions 64x64 and 128x128.
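Both objective formulations can be sketched with a generic LP solver; the matrices and bounds below are made-up stand-ins for the attenuation data, and scipy's linprog replaces the Matlab/GAMS solvers.

```python
# Toy sketch of objectives (A.1) and (A.2). T, A_C, A_G and all bounds are
# invented small examples, not the thesis' attenuation matrices.
import numpy as np
from scipy.optimize import linprog

T = np.array([[1.0, 0.2], [0.2, 1.0]])   # tumor pixel rows
A_C = np.array([[0.3, 0.1]])             # critical organ row
A_G = np.array([[0.5, 0.5]])             # general body row
TLB, TUB, CUB, GUB = 0.6, 1.2, 0.5, 1.2

# Upper bounds T x <= TUB, A_C x <= CUB, A_G x <= GUB, and the lower bound
# TLB <= T x written as -T x <= -TLB.
A_ub = np.vstack([T, A_C, A_G, -T])
b_ub = np.concatenate([[TUB] * 2, [CUB], [GUB], [-TLB] * 2])
bounds = [(0, None)] * 2

# (A.1): maximize total tumor dose == minimize its negation
res_max = linprog(-T.sum(axis=0), A_ub=A_ub, b_ub=b_ub, bounds=bounds,
                  method="highs")
# (A.2): minimize total dose to tumor, critical and general organs
c_total = T.sum(axis=0) + A_C.sum(axis=0) + A_G.sum(axis=0)
res_min = linprog(c_total, A_ub=A_ub, b_ub=b_ub, bounds=bounds,
                  method="highs")
```

As in the text, the maximization objective is parallel to a face of the feasible polyhedron, so simplex and interior point solvers may return different optimal points for it.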
A.4 Additional Results Comparison
On top of the results already discussed in chapter 2, some more runs are
covered here. First the solutions obtained using one solver are compared, then
the most characteristic solutions of each solver are compared to the solutions
from other solvers. As seen in figure A.1, the solutions for the case when the
objective is to minimize the overall dosage are identical. In the case when
the objective is to maximize the dosage delivered to the tumor, the solutions
differ somewhat. Namely, the Barrier method returns a somewhat higher dosage
for critical organ 1 (compared to the simplex solution), while the simplex
solution returns higher dosages for critical organs 2 and 3.
The beam density figures reflect the similarity of the solutions as well (see
figure A.2). It can be observed that in the maximize tumor dosage case the
horizontal beam in the simplex solution is brighter, which implies that the
beam intensity in that direction is higher; this is naturally reflected in the
higher critical organ dosages on the histogram.
Figures A.3, A.4 and A.5 illustrate once again that for the objective
function of minimizing the overall dosage, both the histograms and the beam
densities are identical between the simplex and interior point solutions.
Likely, this means that in this case the solution is a vertex and unique.
Comparing the simplex solutions for the maximize tumor dosage objective, it is
seen that the GAMS solution is better: the beam density is smeared better and
each critical organ gets a lesser
Figure A.1: DVHs for GAMS-based solutions for resolution 64x64.
(a) GAMS Barrier, maximize tumor dose objective; (b) GAMS Barrier, minimize
overall dose objective; (c) GAMS Simplex, maximize tumor dose objective;
(d) GAMS Simplex, minimize overall dose objective.
Figure A.2: Dose Distributions for GAMS-based solutions for resolution 64x64.
(a) GAMS Barrier, maximize tumor dose objective; (b) GAMS Barrier, minimize
overall dose objective; (c) GAMS Simplex, maximize tumor dose objective;
(d) GAMS Simplex, minimize overall dose objective.
dosage. The interior point solutions (Matlab LIPSOL, GAMS Barrier and
PCx) are identical or very close.
In the case of resolution 128x128 it can be observed that the solutions for
the minimize overall objective are identical for the Simplex and Interior
Point methods (see figures A.6(b), A.6(d), A.8(b), A.9(a)), just as in the
resolution 64x64 case. What is new is that the max tumor dosage objective
solutions for Simplex and Barrier are now identical as well (there is a
difference, but it is not noticeable) (see figures A.6 and A.7). It should
also be noticed in figures A.9(b) and A.9(c) that the Matlab simplex solutions
for both the minimize overall dosage and maximize tumor dosage objectives are
infeasible! The Matlab LIPSOL solution for the minimize overall dosage
objective is identical to both the PCx (A.8(b)) and GAMS (A.6(b), A.6(d))
solutions for the same objective, which is somewhat unexpected, since usually
Simplex solutions differ from Interior Point solutions such as GAMS Barrier,
Matlab LIPSOL and PCx. The Matlab LIPSOL solution for the maximize tumor
dosage objective could not be obtained; a software defect is suspected.
As seen in the figures, the solutions obtained using different methods differ
in opposite ways. By opposite ways it is implied that no method provides an
absolutely better solution for all organs at once, that is, no method produces
a solution that delivers the lowest dosage to all organs compared to the other
solutions. Some solutions deliver less to critical organ 1, while other
solutions deliver less to organ 2, and so on. Depending on what the particular
organs are (rope or chain), a physician can decide which solution fits the
needs better. This in turn implies that in reality several runs using
different algorithms are required to determine the best plan. It is possible
to develop some software for auto
Figure A.3: DVHs for Matlab Simplex and LIPSOL solutions for resolution 64x64.
(a) Matlab LIPSOL, maximize tumor dose objective; (b) Matlab LIPSOL, minimize
overall dose objective; (c) Matlab Simplex, maximize tumor dose objective;
(d) Matlab Simplex, minimize overall dose objective.
Figure A.4: Dose Distributions for Matlab Simplex and LIPSOL solutions for
resolution 64x64.
(a) Matlab LIPSOL, maximize tumor dose objective; (b) Matlab LIPSOL, minimize
overall dose objective; (c) Matlab Simplex, maximize tumor dose objective;
(d) Matlab Simplex, minimize overall dose objective.
Figure A.5: DVHs and Dose Distributions for PCx-based solutions for resolution
64x64.
(a) PCx DVH, maximize tumor dose objective; (b) PCx DVH, minimize overall dose
objective; (c) PCx beam density, maximize tumor dose objective; (d) PCx beam
density, minimize overall dose objective.
Figure A.6: DVHs for GAMS-based solutions for resolution 128x128.
(a) GAMS Barrier, maximize tumor dose objective; (b) GAMS Barrier, minimize
overall dose objective; (c) GAMS Simplex, maximize tumor dose objective;
(d) GAMS Simplex, minimize overall dose objective.
Figure A.7: Dose Distributions for GAMS-based solutions for resolution
128x128.
(a) GAMS Barrier, maximize tumor dose objective; (b) GAMS Barrier, minimize
overall dose objective; (c) GAMS Simplex, maximize tumor dose objective;
(d) GAMS Simplex, minimize overall dose objective.
Figure A.8: DVHs and Dose Distributions for PCx-based solutions for resolution
128x128.
(a) PCx DVH, maximize tumor dose objective; (b) PCx DVH, minimize overall dose
objective; (c) PCx beam density, maximize tumor dose objective; (d) PCx beam
density, minimize overall dose objective.
Figure A.9: DVHs for Matlab Simplex and LIPSOL solutions for resolution
128x128.
(a) Matlab LIPSOL, minimize overall dose objective; (b) Matlab Simplex,
maximize tumor dose objective; (c) Matlab Simplex, minimize overall dose
objective.
Figure A.10: Dose Distributions for Matlab Simplex and LIPSOL solutions for
resolution 128x128.
(a) Matlab LIPSOL, minimize overall dose objective; (b) Matlab Simplex,
maximize tumor dose objective; (c) Matlab Simplex, minimize overall dose
objective.
mated runs and comparison of several algorithms, given that the comparison
criteria are specified.
Some quantitative measurements of the solution quality can be applied:
1. Percentage of zeros in the solution (ZERS). The more beams are shut off,
the more likely it is that other beams have a high intensity, which might be
undesirable. To fully evaluate this criterion one has to look at the average
beam intensity criterion as well. A high average with a large number of zeros
hints that the solution is not smeared well enough; see the dose distribution
diagrams to confirm.
2. Average beam intensity (AVG). Defined as the sum of all intensities divided
by the total number of nonzero beams. A lower number here usually implies
better smearing.
3. Total dosage obtained by the body (TOTAL). Defined as the sum of the
dosages to the tumor, critical organs and general organ. Lower values here may
imply a better solution, but the objective function must be taken into
account: the maximize tumor dosage objective case should be compared to the
minimize overall dosage case with caution. Intended to be used for comparisons
under the body dosage minimization objective function.
4. Tumor dosage (TUMOR). This value is provided for comparison of the
solutions in the case of maximization of the tumor dosage. It can also be used
in other cases to give an idea of how much radiation the tumor has received.
5. Objective value (OBJ). This number is provided for comparison of different
solvers/algorithms for the case of the same objective function. A higher value
is better in the maximization case, and a lower value in the minimization
case. This is a generic characteristic, equal to the values of other criteria
in the corresponding cases.
6. Tumor dosage deviation (DEV). This number characterizes how far the dosage
delivered to the tumor is from the middle of the allowed tumor dosage range:
DEV = Σ_i ((T x)_i - 60)², where the sum runs over the tumor pixels and 60 is
the middle of the allowed range.
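The criteria above can be computed directly from a solution vector; a minimal sketch, with the matrices treated as toy stand-ins and the midpoint 60 taken from the DEV definition:

```python
# A minimal sketch of the quantitative solution-quality criteria. The inputs
# are a beam intensity vector x and the tumor/critical/general matrices.
import numpy as np

def criteria(x, T, A_C, A_G, tumor_mid=60.0):
    nonzero = x[x > 1e-9]
    zers = 100.0 * (len(x) - len(nonzero)) / len(x)    # % of shut-off beams
    avg = nonzero.sum() / max(len(nonzero), 1)         # mean nonzero intensity
    tumor = float((T @ x).sum())                       # total tumor dosage
    total = tumor + float((A_C @ x).sum()) + float((A_G @ x).sum())
    dev = float((((T @ x) - tumor_mid) ** 2).sum())    # deviation from midpoint
    return {"ZERS": zers, "AVG": avg, "TOTAL": total,
            "TUMOR": tumor, "DEV": dev}
```

OBJ is omitted since it is simply the solver's objective value and depends on which of the two objectives was run.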
All the criteria specified above are summarized in tables A.1, A.2, A.3, A.4,
A.5 and A.6 for the following algorithms (ALG): Matlab Simplex (MS), Matlab
LIPSOL (ML), PCx, GAMS Simplex (GS), GAMS Barrier (GB). The rows represent all
runs (each resolution (64, 128, 256) and each objective function (TYPE:
maximize tumor dosage (max), minimize overall dosage (min))); the columns are
the criteria.
RES ALG TYPE ZERS AVG TOTAL TUMOR OBJ DEV
64 MS max 85 10 10726 1134 1134 288
64 GS max 83 9 10715 1134 1134 288
64 GB max 37 2 10562 1134 1134 288
64 ML max 36 2 10657 1134 1134 288
64 PCx max 43 3 10639 1134 1134 288
Table A.l: Quantified criteria for max tumor runs, 64x64
The degeneracy for the 64x64 resolution is 0 for the Barrier solution and 12
for the Simplex solution when obtained in GAMS.
RES ALG TYPE ZERS AVG TOTAL TUMOR OBJ DEV
128 MS max 76 8 44636 4547 4547 1089
128 GS max 76 8 44636 4547 4547 1089
128 GB max 76 8 44636 4547 4547 1089
128 PCx max 76 8 44636 4547 4547 1089
Table A.2: Quantified criteria for max tumor runs, 128x128
The degeneracy for the 128x128 resolution is 0 for both the Barrier and the
Simplex solution when obtained in GAMS.
RES ALG TYPE ZERS AVG TOTAL TUMOR OBJ DEV
256 GS max 65 10 199280 17102 17102 3490
256 GB max 65 10 199281 17102 17102 3490
256 PCx max 65 10 199280 17102 17102 3490
Table A.3: Quantified criteria for max tumor runs, 256x256
A cursory analysis of tables A.1, A.2, A.3, A.4, A.5 and A.6 reveals that as
the resolution increases, the difference in the proposed characteristics
becomes more and more negligible. For instance, for resolutions 64x64 and
128x128 the difference in intensities between the simplex solutions and the
IPM solutions can be around a factor of 4, while for 256x256 the solutions are
almost identical.
RES ALG TYPE ZERS AVG TOTAL TUMOR OBJ DEV
64 MS min 89 11 8024 1014 8024 288
64 GS min 89 11 8024 1014 8024 288
64 GB min 89 11 8024 1014 8024 288
64 ML min 89 11 8024 1014 8024 288
64 PCx min 89 11 8024 1014 8024 288
Table A.4: Quantified criteria for min body runs, 64x64
RES ALG TYPE ZERS AVG TOTAL TUMOR OBJ DEV
128 MS min 83 9 35796 4096 35796 1038
128 GS min 83 9 35796 4096 35796 1038
128 GB min 83 9 35796 4096 35796 1038
128 ML min 83 9 35796 4096 35796 1038
128 PCx min 83 9 35796 4096 35796 1038
Table A.5: Quantified criteria for min body runs, 128x128
RES ALG TYPE ZERS AVG TOTAL TUMOR OBJ DEV
256 GS min 69 11 182311 16280 182311 3520
256 GB min 69 11 182311 16280 182311 3520
256 PCx min 69 11 182311 16280 182311 3520
Table A.6: Quantified criteria for min body runs, 256x256
All source codes developed within the scope of this project are available on
request (subject to approval).
Appendix B. Interior Point Methods and Algorithms Overview
It is a well-known fact that each linear program has an associated program
called the dual linear program. Let the original problem be called primal. In
standard form the primal problem can be formulated as follows:

min c^T x  subject to  Ax = b, x >= 0    (B.1)

Then the problem dual to the primal is:

max b^T λ  subject to  A^T λ + s = c, s >= 0    (B.2)

In these formulas c, x, s are vectors in R^n, while λ, b ∈ R^m. The vector λ
is called the dual variables, and s the dual slacks.
Equations (B.1)-(B.2) are often called the primal-dual pair. It is further
assumed that readers are familiar with duality theory; the primal-dual
equations have been provided here to fix the notation for the variables used
further on. As mentioned in [24], the original primal-dual formulations based
on the normal form lack some symmetry in the treatment of the unknowns, and
the way to improve the solution stability is to make the problem formulation
more symmetric, as in the Karush-Kuhn-Tucker (further KKT) theorem for our
case. The KKT conditions are stated as:
the vector x* is a solution of system (B.1) if and only if there exist vectors
s* ∈ R^n and λ* ∈ R^m such that the following holds:

A^T λ + s = c    (B.3)
Ax = b    (B.4)
x_i s_i = 0,  i = 1, 2, ..., n    (B.5)
x >= 0, s >= 0    (B.6)
Rewriting conditions (B.3)-(B.6) for (B.2), one gets that the vectors
λ* ∈ R^m and s* ∈ R^n are a solution to (B.2) if and only if there exists a
vector x* ∈ R^n such that conditions (B.3)-(B.6) hold for
(x, λ, s) = (x*, λ*, s*). In other words, the vector (x*, λ*, s*) solves
(B.3)-(B.6) if and only if the vectors x* and (λ*, s*) solve systems (B.1)
and (B.2) correspondingly. Such a solution is called a primal-dual solution.
It is seen that the KKT system is nonlinear due to the terms x_i s_i in (B.5).
A typical way of solving nonlinear equations is the Newton Method. For an
arbitrary equation F(v) = 0, where v is the vector of the unknowns, we can
linearize the equation locally as:

v^{k+1} = v^k - F'(v^k)^{-1} F(v^k),    (B.7)

where F'(v^k)^{-1} is the inverse of the Jacobian of F at the (current) point
v^k and v^{k+1} is the new variable value at the next step.
In short, the Newton algorithm can be stated as:
1. Pick an initial point v
2. Use (B.7) to calculate the solution at the next step
3. Evaluate the difference between the solutions at the next and current steps
to determine whether the optimality condition is met and the desired precision
has been attained.
4. Repeat steps 2-3 if necessary (the required precision has not been
obtained)
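The steps above can be sketched as follows; the example system is an arbitrary illustration, and the Jacobian inverse of (B.7) is replaced by a linear solve, as is customary.

```python
# A minimal sketch of the Newton iteration (B.7) for a system F(v) = 0,
# solving J dv = -F(v) instead of inverting the Jacobian explicitly.
import numpy as np

def newton(F, J, v, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        step = np.linalg.solve(J(v), -F(v))   # Newton direction (step 2)
        v = v + step
        if np.linalg.norm(step) < tol:        # precision check (step 3)
            break                             # otherwise repeat (step 4)
    return v

# Illustrative system: x^2 - 2 = 0, y^2 - 9 = 0
F = lambda v: np.array([v[0] ** 2 - 2.0, v[1] ** 2 - 9.0])
J = lambda v: np.diag([2.0 * v[0], 2.0 * v[1]])
root = newton(F, J, np.array([1.0, 1.0]))
```

Starting close enough to a root, the iterates converge quadratically, in line with the local convergence properties discussed next.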
It is well known that the Newton Method, being a local method, has excellent
local convergence properties (quadratic), given that the Jacobian matrix is
nonsingular and Lipschitz-continuous at a solution point v* and the initial
point v is sufficiently close to v*. In general, the global convergence
properties of the Newton Method are not so good: since the method is just a
linearization of a nonlinear system, the algorithm deviates from the exact
solution the further one goes from the solution(s), unless special measures
are taken (see [14]). More details on the Newton Method can be found in [20]
and [14]. The objective then is to find some correction that accounts for the
nonlinearity as the solution proceeds.
Newton's method for the system (B.3)-(B.6) can be rewritten in the standard
function notation given in (B.7) as:

F(x, λ, s) = [ A^T λ + s - c ]
             [ Ax - b        ] = 0,    (B.8)
             [ XSe           ]

x >= 0, s >= 0,    (B.9)

where X = diag(x_1, ..., x_n), S = diag(s_1, ..., s_n) and e = (1, ..., 1)^T.
The positivity conditions (B.9) are not accounted for in Newton's algorithm
and need to be handled separately.
A solution for x, s has to be sought such that these vectors are strictly
positive, which allows one to stay away from the boundaries (stay interior)
and from incorrect and/or spurious solutions that are not useful. For the
radiation therapy problem (RTP) this corresponds to the physical meaning of x
being a radiation level that cannot be negative.
Formally, both regular (feasible) and strictly positive solutions are to
belong to the following sets:

F = {(x, λ, s) | Ax = b, A^T λ + s = c, x >= 0, s >= 0}    (B.10)
F° = {(x, λ, s) | Ax = b, A^T λ + s = c, x > 0, s > 0}    (B.11)

In this notation the set F° is the strictly feasible solution set. Each
iteration of Newton's Method needs to ensure that (x^k, λ^k, s^k) ∈ F°, where
k is the iteration number.
Matrix inversion is neither a fast nor a numerically stable operation, so instead
of (B.7) the search direction in Newton's method is obtained by solving the
following equation:

    J(x, λ, s) [ Δx ]
               [ Δλ ] = −F(x, λ, s),                 (B.12)
               [ Δs ]

where J(x, λ, s) is the Jacobian matrix of F.
Equations (B.1) and (B.2) hold for the point (x, λ, s) ∈ F⁰, so, differentiating
(B.8), system (B.12) can be rewritten as:

    [ 0  Aᵀ  I ] [ Δx ]   [   0  ]
    [ A  0   0 ] [ Δλ ] = [   0  ]                   (B.13)
    [ S  0   X ] [ Δs ]   [ −XSe ]
For sparse A the left-hand side of the linear system above is sparse, which allows
efficient sparse solvers to be applied even when the system size is substantial.
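For illustration, the system (B.13) can be assembled and solved directly with dense linear algebra for a small example; a production code would instead factor the sparse matrix. The function name and the tiny test problem below are illustrative only:

```python
import numpy as np

def newton_kkt_step(A, x, lam, s):
    """Assemble the block matrix of (B.13) and solve for (dx, dlam, ds)
    at a strictly feasible point (x, lam, s)."""
    m, n = A.shape
    K = np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],         # dual feasibility rows
        [A,                np.zeros((m, m)), np.zeros((m, n))],  # primal feasibility rows
        [np.diag(s),       np.zeros((n, m)), np.diag(x)],        # complementarity rows
    ])
    rhs = np.concatenate([np.zeros(n), np.zeros(m), -x * s])     # right-hand side (0, 0, -XSe)
    d = np.linalg.solve(K, rhs)
    return d[:n], d[n:n + m], d[n + m:]

# Illustrative feasible point for A = [1 1], b = 2, c = (1, 2)
A = np.array([[1.0, 1.0]])
x = np.array([1.0, 1.0]); lam = np.array([0.0]); s = np.array([1.0, 2.0])
dx, dlam, ds = newton_kkt_step(A, x, lam, s)
```

By construction A·Δx = 0 and AᵀΔλ + Δs = 0, so the step preserves feasibility; only the positivity of x and s has to be watched.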
Generally speaking, Newton's method knows nothing of the additional limitation
to stay strictly feasible, meaning that a full step in the newly found
direction is not always possible. The usual way to deal with this situation is
to use a multiplier α to shorten the step in order to remain in the feasibility
region:

    (x^{k+1}, λ^{k+1}, s^{k+1}) = (x^k, λ^k, s^k) + α(Δx, Δλ, Δs)      (B.14)

Unfortunately, if this procedure is carried out without modification, the
iterates get very close to the boundary of the feasibility region, where α must
be decreased so much to stay feasible that virtually no progress is made towards
the solution [28].
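The multiplier α in (B.14) is commonly chosen by a ratio test, damped by a safety factor so the iterate stays strictly inside the orthant. A sketch (the factor 0.995 is a typical but illustrative choice):

```python
import numpy as np

def max_step(v, dv, frac=0.995):
    """Largest alpha in (0, 1] keeping v + alpha * dv strictly positive,
    damped by `frac` so the iterate stays off the boundary."""
    neg = dv < 0                      # only negative components limit the step
    if not np.any(neg):
        return 1.0
    return min(1.0, frac * float(np.min(-v[neg] / dv[neg])))
```

For example, with v = (1, 1) and dv = (−2, 0.5) the undamped limit is α = 0.5, so the damped step is 0.4975.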
All the primal-dual methods covered in this chapter modify the basic Newton
Method in two ways:
• They try to stay away from the x, s boundaries, where convergence is slow
unless one is already close to the solution (see, for instance, [28]).
• Various measures are taken to keep the iterates close to the middle of the
interior of the nonnegativity orthant x, s > 0. It is intuitively clear that
the more interior the iterates are, the longer the step that can be taken
before actually hitting x_i = 0 or s_i = 0 (for some arbitrary i). One giant
step cannot be taken, since Newton's method is a linearization of a nonlinear
problem around some point: the further an iterate is from that point, the less
accurate the approximation. So, in particular, one should not use α > 1.
All the methods below are just different ways of implementing what is outlined
above.
The interior point methods can be classified in numerous ways. One way is
to divide them into two broad groups: potential-reduction (or barrier) methods
and path-following methods. It so happens that the majority of the solvers
accessible to the author belong to the path-following group, so this study
concentrates on the path-following algorithms and solvers based on them. There
are also some indications that path-following algorithms tend to be faster, at
least for linear programming problems ([15]). The Potential-Reduction Method is
described briefly below for the sake of completeness of the IPM picture.
B.1 Potential-Reduction Method
Potential-reduction methods are used extensively in quadratic and other
nonlinear problem formulations [6]. Linear problems have more efficient solvers,
such as simplex and path-following algorithms ([16]), but since radiation
therapy problems are oftentimes formulated as nonlinear, it is useful to cover
this method here as well.
The gist of the method is that a potential function of a very generic nature is
introduced, and the algorithm moves inside the strict feasibility region F⁰ in
such a way as to decrease the potential function value on every step by at
least some predefined quantity.
There are two conditions imposed on the potential function Φ:

    Φ → ∞  if x_i s_i → 0 for some i while μ = xᵀs/n ↛ 0,      (B.15)
    Φ → −∞ if and only if (x, λ, s) → Ω,                       (B.16)

where Ω is the set of primal-dual solutions to (B.3)-(B.6).
The physics is clear. One finds a function that is big where one does not want
to go and small where one wants to end up. Succeeding iterates are computed in
such a way as to decrease the potential function value every time and eventually
end up in or near the solution set.
One function that satisfies these conditions is the so-called
Tanabe-Todd-Ye [22] function defined by:

    Φ_ρ(x, s) = ρ log(xᵀs) − Σ_{i=1}^n log(x_i s_i),           (B.17)

where ρ > n.
The solution follows a typical path of solving the system (B.24), with the
exception that each step is modified so as to decrease the potential function
as much as possible by varying the parameter α:

    ᾱ = sup{α ∈ [0, 1] : (x, s) + α(Δx, Δs) > 0}
    α_k = argmin over α ∈ [0, ᾱ] of Φ_ρ(x^k + αΔx, s^k + αΔs)
    (x^{k+1}, s^{k+1}, λ^{k+1}) = (x^k, s^k, λ^k) + α_k(Δx^k, Δs^k, Δλ^k)      (B.18)

The complexity of the potential-reduction method is O(√n log(1/ε)), where ε
measures how close the approximate solution is to the exact one ([26]).
B.2 Central Path
The notion of the Central Path is introduced in order to describe how the
iterates of the algorithm stay in the interior of the nonnegative orthant. This
idea is used extensively in the pathfollowing algorithms described below. It
is also used to describe how one should move around the potential function
mentioned above in Section B.1.
The central path is the set of solutions of the following parametric system for
all τ > 0:

    Aᵀλ + s = c,                                   (B.19)
    Ax = b,                                        (B.20)
    x_i s_i = τ,   i = 1, ..., n,                  (B.21)
    x > 0,  s > 0.                                 (B.22)

So the central path is:

    C = {(x_τ, λ_τ, s_τ) : τ > 0}.                 (B.23)
The only difference from the system (B.3)-(B.6) is that all component products
in (B.21) are equal and positive instead of being zero.
If the Central Path C converges as τ → 0, then (B.19)-(B.21) becomes more
and more like the system (B.3)-(B.6), and therefore the limit of the Central Path
must eventually be the solution of our primal-dual problem, as stated in
[26]. This feature is exploited by moving to a solution along the central path,
since the central path equations guarantee that the iterates stay towards the
middle of the interior of the orthant (again as per [26]). In the description of
the LIPSOL algorithm [28] it is explained why it is also important to approach
the solution with all x_i s_i components decreasing at the same rate. As shown
in [28], if a product x_l s_l becomes 0 at some iteration, it remains 0 at all
subsequent iterations.
The basic Newton equation is modified to keep iterates in the vicinity of
the Central Path:

    [ 0  Aᵀ  I ] [ Δx ]   [     0      ]
    [ A  0   0 ] [ Δλ ] = [     0      ]            (B.24)
    [ S  0   X ] [ Δs ]   [ −XSe + σμe ]

where σ is the centering parameter describing how much one moves towards the
central path. If σ = 0 then no centering attempt is made. If σ = 1 then
the move is towards the path, to the point where each x_i s_i = τ with τ = σμ,
since μ = (1/n) Σ_{i=1}^n x_i s_i. So the closer σ is to 1, the closer τ is to μ,
and μ is the duality measure being decreased.
Different algorithms have different ways of choosing σ and of moving around the
central path so as to stay in the interior while at the same time making as much
progress towards the solution as possible. Moreover, in passing, there is a
connection between the barrier (potential-reduction) methods and path-following
algorithms, as described in [17].
B.3 Central Path Neighborhoods
The iterates of the path-following algorithms described above must, at each
iteration, be close to the central path for the algorithm to be efficient. Under
this condition the maximum possible step length can be taken. Since the iterate
at each iteration generally does not land exactly on the central path (a
nonlinearity effect), the component products x_i s_i are not quite equal to μ.
The distance from the central path can be estimated by comparing every component
product to the average of the products:

    ‖XSe − μe‖,   μ = xᵀs/n.                       (B.25)

Here ‖·‖ denotes a norm, but the particular norm to be used has not been
specified. Two of the most frequently used norms are the 2-norm and the ∞-norm.
The 2-norm neighborhood N₂(θ) for (B.25) is

    N₂(θ) = {(x, λ, s) ∈ F⁰ : ‖XSe − μe‖₂ ≤ θμ},  θ ∈ [0, 1).      (B.26)
The 2-norm has the advantage of simplicity, but it limits the available area
severely (see [26]) by not spanning enough of the area potentially accessible
under the strict positivity constraints, so the algorithm does not have much
room to advance over the 2-norm. The ∞-norm is more complex to compute, but it
allows one to span almost all of F⁰ ([26]).
The ∞-norm is not ideal either, since it is a two-sided bound on x_i s_i, while
the main concern is just not to let some x_i s_i go below a fraction of μ; at
the same time, x_i s_i can be allowed to exceed μ. So, instead of the ∞-norm
neighborhood, one uses a modified neighborhood (a one-sided bound on x_i s_i)
defined as follows (see [26]):

    N₋∞(γ) = {(x, λ, s) ∈ F⁰ : x_i s_i ≥ γμ for all i},  γ ∈ (0, 1).      (B.27)
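The duality measure and the two neighborhood tests are cheap to evaluate. A small sketch (function names are illustrative):

```python
import numpy as np

def mu(x, s):
    """Duality measure: average of the component products x_i * s_i."""
    return x @ s / len(x)

def in_N2(x, s, theta):
    """2-norm neighborhood (B.26): ||XSe - mu e||_2 <= theta * mu."""
    m = mu(x, s)
    return bool(np.linalg.norm(x * s - m) <= theta * m)

def in_Nminus_inf(x, s, gamma):
    """One-sided neighborhood (B.27): x_i * s_i >= gamma * mu for all i."""
    return bool(np.all(x * s >= gamma * mu(x, s)))
```

A point exactly on the central path (all products equal) lies in both neighborhoods for any θ ∈ [0, 1) and γ ∈ (0, 1).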
B.4 Short-Step Path-Following Algorithm
The Short-Step Path-Following Algorithm [26] makes use of the N₂ neighborhood
defined in (B.26). There are some usual (cookbook) values assigned to the
constants σ in (B.24) and θ in (B.26). They have been demonstrated to work in
practice [26], but not much justification has been provided in the literature.
The algorithm's steps are as follows:
1. Set θ = 0.4, σ_k = σ = 1 − 0.4/√n, (x⁰, λ⁰, s⁰) ∈ N₂(θ)
2. While the tolerance is not reached or the cycle limit is not attained:
   solve (B.24) to obtain (Δx^k, Δλ^k, Δs^k)
   set (x^{k+1}, λ^{k+1}, s^{k+1}) = (x^k, λ^k, s^k) + (Δx^k, Δλ^k, Δs^k)
At each iteration σ_k = σ, that is, the same for all iterations.
There are two challenges here. One is to establish the complexity of the
algorithm; the other is to show that the iterates are never outside N₂(θ).
These two results are demonstrated in [26], where the complexity of this
algorithm is shown to be O(√n log(1/ε)). As before, ε is how close one wants to
be to the exact solution.
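Putting the pieces together, the loop above can be sketched for a dense toy problem. This is a didactic sketch, assuming a strictly feasible starting point on the central path, not a production solver; the LP below is illustrative:

```python
import numpy as np

def short_step_ipm(A, b, c, x, lam, s, tol=1e-8, max_iter=500):
    """Short-step path-following sketch: fixed sigma = 1 - 0.4/sqrt(n),
    full Newton steps on the centered system (B.24)."""
    m, n = A.shape
    sigma = 1.0 - 0.4 / np.sqrt(n)
    for _ in range(max_iter):
        mu = x @ s / n
        if mu < tol:                       # duality measure small enough: stop
            break
        K = np.block([
            [np.zeros((n, n)), A.T,              np.eye(n)],
            [A,                np.zeros((m, m)), np.zeros((m, n))],
            [np.diag(s),       np.zeros((n, m)), np.diag(x)],
        ])
        rhs = np.concatenate([np.zeros(n), np.zeros(m), -x * s + sigma * mu])
        d = np.linalg.solve(K, rhs)
        x, lam, s = x + d[:n], lam + d[n:n + m], s + d[n + m:]  # full step, alpha = 1
    return x, lam, s

# Illustrative LP: min 0.5 x1 + 2 x2  s.t.  x1 + x2 = 2.5, x >= 0 (optimum x = (2.5, 0))
A = np.array([[1.0, 1.0]]); b = np.array([2.5]); c = np.array([0.5, 2.0])
x0 = np.array([2.0, 0.5]); lam0 = np.array([0.0]); s0 = np.array([0.5, 2.0])  # on the central path
x, lam, s = short_step_ipm(A, b, c, x0, lam0, s0)
```

For feasible iterates ΔxᵀΔs = 0, so each full Newton step here multiplies μ by exactly σ, which is the mechanism behind the O(√n log(1/ε)) bound.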
B.5 Predictor-Corrector Algorithm
In the Short-Step Path-Following Algorithm the centering parameter σ of (B.24)
is chosen to be fixed so that it provides both progress towards optimality and
proximity to the central path in one shot. The Predictor-Corrector Algorithm
[26] does it another way. One step with σ = 0 is used purely to reduce the
duality gap (the predictor step), while the other with σ = 1 is used purely for
centering (the corrector step). The second feature of the predictor-corrector
technique is that it uses two neighborhoods (N₂(θ_inner), N₂(θ_outer)) instead
of one. The corrector step (defined below) is confined to terminate in the inner
neighborhood, while the predictor (reducer) step can terminate anywhere in the
outer neighborhood, and generally terminates on its boundary, thus attaining the
maximum reduction of the gap.
The algorithm is:
1. Set (x⁰, λ⁰, s⁰) ∈ N₂(θ_inner), with N₂(θ) defined in (B.26)
2. While the tolerance is not reached or the cycle limit is not attained:
   even step number k: predictor (reducer)
     solve (B.24) with σ = 0 to obtain (Δx^k, Δλ^k, Δs^k)
     set α to the largest value in (0, 1] such that (x^k(α), λ^k(α), s^k(α)) ∈ N₂(θ_outer)
     set (x^{k+1}, λ^{k+1}, s^{k+1}) = (x^k(α), λ^k(α), s^k(α))
   odd step number k: corrector
     solve (B.24) with σ = 1 to obtain (Δx^k, Δλ^k, Δs^k)
     set (x^{k+1}, λ^{k+1}, s^{k+1}) = (x^k, λ^k, s^k) + (Δx^k, Δλ^k, Δs^k)
The complexity of this algorithm is O(√n log(1/ε)), where ε is how close one
wants to be to the exact solution (see [26]). It can even be shown that the
convergence is two-step quadratic [26]. Moreover, the corrector step always
takes the solution back to the inner neighborhood, so this class of algorithms
generates strictly interior iterates, and each iteration sets up favorable
conditions for the next one (see [26]).
B.6 Long-Step Path-Following Algorithm
The difference of the Long-Step Path-Following Algorithm from both the
Short-Step Path-Following Algorithm and the Predictor-Corrector Algorithm is that
it makes use of the N₋∞(γ) neighborhood rather than the (perhaps restrictive)
N₂(θ).
The parameter σ of (B.24) is chosen to be fixed for all iterations and strictly
between 0 and 1. The overall procedure resembles a hybrid: it is like the
predictor-corrector method in that α is chosen to be the maximum value that
still keeps the iterate in the neighborhood, and yet it is like the short-step
method in that there is a single fixed value of σ used throughout.
1. Set γ small, σ ∈ (0, 1), (x⁰, λ⁰, s⁰) ∈ N₋∞(γ)
2. While the tolerance is not reached or the cycle limit is not attained:
   solve (B.24) to obtain (Δx^k, Δλ^k, Δs^k)
   set α to the largest value in (0, 1] such that (x^k(α), λ^k(α), s^k(α)) ∈ N₋∞(γ)
   set (x^{k+1}, λ^{k+1}, s^{k+1}) = (x^k(α), λ^k(α), s^k(α))
The complexity of this algorithm is O(n log(1/ε)), where ε is how close one wants
to be to the exact solution (see [26]).
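The inner "largest α" search can be illustrated by a simple grid scan over (0, 1]; real implementations solve for the boundary crossing exactly. The function name and grid resolution are illustrative:

```python
import numpy as np

def longest_step_in_neighborhood(x, s, dx, ds, gamma, grid=200):
    """Largest alpha on a uniform grid such that (x, s) + alpha (dx, ds)
    stays in N_{-inf}(gamma): positive components and x_i s_i >= gamma * mu."""
    n = len(x)
    best = 0.0
    for a in np.linspace(0.0, 1.0, grid + 1)[1:]:   # try alpha = 1/grid, ..., 1
        xn, sn = x + a * dx, s + a * ds
        ok = (np.all(xn > 0) and np.all(sn > 0)
              and np.all(xn * sn >= gamma * (xn @ sn) / n))
        if not ok:
            break                                    # left the neighborhood: stop scanning
        best = a
    return best
```

The scan stops at the first violation, which suffices for this sketch since the admissible α typically form an interval starting at 0.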
B.7 Infeasible Point Path-Following Algorithms
All the algorithms discussed above explicitly used the fact that the initial
point (x⁰, λ⁰, s⁰) is feasible, so that all succeeding iterates remain feasible
until the desired solution precision is attained. In reality, obtaining a
feasible point is difficult. In some problems, like radiotherapy, it is not even
clear whether a solution exists at all, that is, whether the feasibility region
is empty or not.
Knowledge of a feasible point is considered a hot start and can improve
running time drastically (see [23]). Some authors report figures of a 20-30
percent benefit (also in [23]). So whenever such a point is available, it should
be used. Unfortunately, some solvers, like Matlab's LIPSOL, do not let one
specify such a point for a hot start. This is one of the big issues to try to
improve on in both the production and research codes.
A modification of the Long-Step Path-Following Method has been available since
the early 90s to handle problems where the starting point is infeasible. The
Long-Step Algorithms use N₋∞ neighborhoods. In order to provide a convergent
iteration or to confirm that a solution does not exist, the neighborhood is
modified to check that the infeasibility residuals (defined below) decrease at
least as fast as the duality measure.
The infeasibility residuals are defined as

    r_b = Ax − b,   r_c = Aᵀλ + s − c.             (B.28)
At the k-th iteration the residuals carry a superscript k to signify that they
also change as the algorithm progresses. The production codes check that these
residual values indeed decrease, and report an error if they do not decrease
after a preconfigured number of iterations.
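Computing and monitoring the residuals (B.28) is straightforward; a sketch with illustrative names and data:

```python
import numpy as np

def infeasibility_residuals(A, b, c, x, lam, s):
    """Primal and dual infeasibility residuals: r_b = Ax - b, r_c = A^T lam + s - c."""
    return A @ x - b, A.T @ lam + s - c

# Illustrative data: a feasible point gives zero residuals
A = np.array([[1.0, 1.0]]); b = np.array([2.0]); c = np.array([1.0, 2.0])
rb, rc = infeasibility_residuals(A, b, c,
                                 np.array([1.0, 1.0]), np.array([0.0]), np.array([1.0, 2.0]))
```

An infeasible point shows up as a nonzero r_b or r_c, which the algorithm then drives to zero along with μ.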
The modified central path neighborhood is:

    N₋∞(γ, β) = {(x, λ, s) : ‖(r_b, r_c)‖ ≤ [‖(r_b⁰, r_c⁰)‖/μ⁰] βμ,
                 x > 0, s > 0, x_i s_i ≥ γμ for all i},        (B.29)

where r_b⁰, r_c⁰, μ⁰ are the residuals and the duality measure at the starting
point, γ ∈ (0, 1), and β ≥ 1.
The infeasibilities of all points in N₋∞(γ, β) are bounded from above by μ
(see [26]). By decreasing μ at every step down to 0 (within tolerance), the
residuals are driven to zero as well. Other parts of the algorithm are the same
as in the Long-Step Path-Following Algorithm. In the presence of the residuals,
the system (B.24) becomes, at every iteration k:

    [ 0    Aᵀ  I   ] [ Δx^k ]   [        −r_c^k         ]
    [ A    0   0   ] [ Δλ^k ] = [        −r_b^k         ]      (B.30)
    [ S^k  0   X^k ] [ Δs^k ]   [ −X^k S^k e + σ_k μ_k e ]

Then the Long-Step Algorithm can be rewritten as:
1. Set γ small, σ ∈ (0, 1), and (x⁰, λ⁰, s⁰) such that x⁰ > 0, s⁰ > 0
2. While the tolerance is not reached or the cycle limit is not attained:
   solve (B.30) to obtain (Δx^k, Δλ^k, Δs^k)
   set α to the largest value in (0, 1] such that (x^k(α), λ^k(α), s^k(α)) ∈ N₋∞(γ, β)
   and μ_k(α) ≤ (1 − 0.01α)μ_k
   set (x^{k+1}, λ^{k+1}, s^{k+1}) = (x^k(α), λ^k(α), s^k(α)),
where μ_k(α) = x^k(α)ᵀ s^k(α)/n.
It is perhaps the relative cost of finding a proper α that contributes to the
fact that the complexity of this algorithm is O(n² log(1/ε)), where ε is how
close one gets to the exact solution (see [26]). Another explanation may be
that a feasible starting point can be considered a hot start, while there is no
such luxury for an infeasible starting point.
The complexity proof that produces the O(n² log(1/ε)) result relies on the
fact that the initial point is chosen as (x⁰, λ⁰, s⁰) = (ζe, 0, ζe) for some
large ζ. If this is not the case, there is convergence, but it is not guaranteed
to be polynomial, as stated in [26]. In general, the choice of such a symmetric
starting point is essential for obtaining the solution quickly ([24]).
B.8 Mehrotra's Algorithm
Mehrotra's algorithm [19] is of particular importance, since most of the
production interior-point solvers implement it in one way or another (see [26],
[28]). In particular, LIPSOL ([28]) is based on Mehrotra's algorithm with the
addition of some other tricks, like the Terminal Step discussed below, along
with some numerical linear algebra enhancements.
Just as the Infeasible Point Path-Following Algorithm was based on the
Long-Step Path-Following Algorithm, Mehrotra's algorithm is based on the
infeasible point procedure. Moreover, it provides several enhancements, among
which are:
• The σ parameter becomes adaptive and changes at every step. It is known that
little centering is required near the solution, so a value close to 0 (zero)
can be chosen there, increasing convergence speed. The decision as to what
value to assign to σ depends on how much progress has been made on the
affine-scaling (predictor) step in reducing μ.
• An additional correction accounts for the central path curvature. Newton's
method generally provides a first-order approximation to the trajectory, while
the second-order correction allows the iterates to stay closer to the path.
The first step of the algorithm is to use Newton's method to obtain the
affine-scaling (predictor) direction, similarly to (B.30), with the exception
that the centering term is not present.
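The adaptive choice used in most implementations (see [26]) cubes the ratio of the duality measure after a trial affine-scaling step to the current one. A sketch of that heuristic, σ = (μ_aff/μ)³, with illustrative names:

```python
import numpy as np

def mehrotra_sigma(x, s, dx_aff, ds_aff, alpha_p, alpha_d):
    """Mehrotra's centering heuristic: sigma = (mu_aff / mu)^3, where mu_aff is
    the duality measure after the (damped) affine-scaling trial step."""
    n = len(x)
    mu = x @ s / n
    mu_aff = (x + alpha_p * dx_aff) @ (s + alpha_d * ds_aff) / n
    return (mu_aff / mu) ** 3

# If the affine step makes no progress, sigma stays at 1 (heavy centering)
sigma_no_progress = mehrotra_sigma(np.ones(2), np.ones(2), np.zeros(2), np.zeros(2), 1.0, 1.0)
```

When the affine step would cut μ sharply, σ comes out near 0 and the method takes a nearly pure predictor step; when the affine step stalls, σ near 1 pulls the iterate back towards the central path.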
