INVERSION OF THE LAPLACE TRANSFORM USING
SINGULAR VALUE DECOMPOSITION
by
Douglas Melvin Dreher
B.S., University of Colorado, 1982
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado in partial fulfillment
of the requirements for the degree of
Master of Science
Department of Electrical Engineering
and Computer Science
This thesis for the Master of Science degree by
Douglas Melvin Dreher
has been approved for the
Department of
Electrical Engineering and
Computer Science
by
Date
Dreher, Douglas Melvin (M.S., Electrical Engineering)
Inversion of the Laplace Transform using Singular Value
Decomposition
Thesis directed by Associate Professor Douglas A. Ross
Singular value decomposition (SVD) is a powerful method that is often used when other common methods, such as Gaussian elimination or LU decomposition, fail to deal successfully with singular or near-singular sets of equations or matrices, as are found in many linear least squares problems. The inversion of the Laplace transform is an important, and difficult, problem to which both linear and nonlinear least squares minimization techniques have been applied by others in attempts to obtain its solution in specific applications, such as correlation function profile analysis. Since SVD is an integral part of some of the linear methods, it is desirable to have a thorough understanding of its effectiveness when applied to the inversion problem. This is accomplished using a histogram approach to model known simple, unimodal, and bimodal distributions, using generated data corrupted by noise. Several performance measures are defined to evaluate the SVD solutions obtained, and the results of applying SVD are presented, including the effects of data noise, model size, and calculation precision.
The form and content of this abstract are approved.
I recommend its publication.

Signed
Faculty member in charge of thesis
CONTENTS
CHAPTER
I. INTRODUCTION................................... 1
II. SINGULAR VALUE DECOMPOSITION AND ITS APPLICATION
TO THE INVERSION OF THE LAPLACE TRANSFORM..... 6
Inversion of the Laplace Transform............. 6
Singular Value Decomposition................... 9
Mathematical Description..................... 9
Calculation of the Singular Value
Decomposition............................. 13
Stability of the Algorithms................. 14
Application to the Inversion of the Laplace
Transform Involving Sampled Data............. 15
Histogram Model of Sampled Data............. 15
III. LINEWIDTH DISTRIBUTIONS AND
GENERATION OF DATA............................ 21
Linewidth Distributions....................... 21
Staircase Linewidth Distribution........... 22
Pearson Linewidth Distribution.............. 23
Polydisperse Linewidth Distribution......... 26
Assumptions and Limitations................... 28
Generation of Simulated Data.................. 29
IV. MEASUREMENT OF PERFORMANCE..................... 31
Performance Measures.......................... 32
RMS Parameter Error......................... 32
CONTENTS (continued)
CHAPTER
Linewidth Parameter Total Area.............. 33
Computed Baseline........................... 33
RMS Residual................................ 33
Residual Statistic.......................... 34
Plots......................................... 35
Example....................................... 36
V. RESULTS......................................... 42
Staircase Linewidth Distribution.............. 42
Pearson Linewidth Distribution................ 51
Polydisperse Linewidth Distribution........... 59
Precision Considerations...................... 68
VI. CONCLUSION...................................... 74
BIBLIOGRAPHY .......................................... 77
APPENDIX
A. GENERATED DATA SETS.......................... 78
B. GENERATED RANDOM NUMBERS........................ 81
C. MACINTOSH FLOATING POINT ENVIRONMENT............ 82
D. COMPUTER PROGRAM................................ 84
CHAPTER I
INTRODUCTION
Singular value decomposition is a powerful method of dealing with sets of equations or matrices that are singular or near-singular. It can be used when methods such as Gaussian elimination or LU decomposition fail to provide satisfactory solutions, either to provide a solution or to determine the reason for the failure of these other methods. In fact, singular value decomposition (SVD) is very stable in the sense that it will always give a solution, although not always a good one.

A type of problem for which SVD is particularly useful is the linear least squares problem. While it is possible in some cases to solve the set of equations directly, it is generally better to use SVD, especially for larger sets of equations. The reason for this is that frequently the linear least squares problem results in sets of equations that are singular or very nearly so, due to the inability of the data to distinguish between the basis functions provided for the least squares fit.
An example of a problem with this characteristic is the inversion of the Laplace transform where, for example, given some observed set of data output from a process, it is desired to determine the input to, or some characteristic of, that process. Typically, a large or even infinite set of inputs or characteristics can give the same output if even the slightest amount of noise is introduced by the process itself or during the measurement of the output. The problem of the inversion of the Laplace transform can be reformulated into a linear least squares problem, thereby making it a prime candidate for solution by singular value decomposition.
Many methods of solving inverse problems have been tried, several of which have been considered by Chu et al. [1] in correlation function profile analysis, where, from a measured time correlation function, an attempt is made to determine the characteristic linewidth distribution function: an inverse Laplace transform problem. In that paper, both linear and nonlinear least squares minimization methods of solving this problem are discussed. The two nonlinear methods do not address the ill-conditioning of the problem and are found to be primarily useful in providing starting estimates for the linear techniques, two of which, the linear multiexponential and the histogram methods, make use of singular value decomposition to address the ill-conditioning. The singular value decomposition methods require initial estimates to set up the model, require an interactive rank-reduction step to obtain a useful solution, and are not constrained to distributions that are physically reasonable.
It was found that, given the proper knowledge of the physical origins of the characteristic linewidth distribution function, thus allowing the investigator to choose the appropriate experimental conditions and correctly interpret the results, a combination of the methods presented can provide a solution consistent with the available information.

Since singular value decomposition is the primary component in two of the three linear approaches, a thorough understanding of its characteristics and proper use is essential for its successful application to ill-conditioned problems. Because the paper by Chu et al. is essentially a survey of methods that have been tried, many of the details of implementing the various approaches have been omitted. In addition, although several distribution shapes were discussed (unimodal, bimodal, etc.), results were presented for only one case, resulting in solutions resembling the Pearson linewidth distribution to be defined in Chapter III.

The motivation for this thesis was to perform a more extensive evaluation of the use of singular value decomposition on the correlation function profile analysis problem in particular, but applicable to inverse Laplace transform problems in general. A histogram approach similar to that discussed in [1] is used to model distributions representing simple, unimodal, and bimodal cases, with data generated from these distributions being corrupted by varying amounts of random noise to determine the sensitivity of SVD to noise.
In Chapter II the problem of the inversion of the Laplace transform is introduced, along with its reduction to the problem of solving a system of linear algebraic equations. After discussing the instability of the Laplace transform, the method of singular value decomposition is presented, including a discussion of its calculation and stability, followed by the development of its application to sampled data through the use of histogram models.

In Chapter III, three different linewidth distributions are defined and histogram models are developed for them. Because of the difficulty of evaluating the effectiveness of applying algorithms to ill-conditioned problems, it is convenient, and often necessary, to start with a known solution and from it generate the data on which the algorithm will operate, thus allowing comparison of the calculated and known solutions. The generation of data from known solutions is briefly discussed after mention is made of the assumptions and limitations that apply to the remainder of the thesis.
In Chapter IV several performance measures are defined which can be used to determine how close a solution is to the known solution and, since a set of solutions is obtained from the singular value decomposition, which solution is "best."

The results of applying singular value decomposition to data generated from known solutions for each of the distributions defined in Chapter III are presented in Chapter V. Plots summarize the relationships between the data rms noise and the parameter error and rms residual, two of the performance measures defined in Chapter IV. The effects of histogram model size and calculation precision on the SVD solutions are also considered.

Chapter VI ends the thesis with a summary and concluding remarks.
CHAPTER II
SINGULAR VALUE DECOMPOSITION AND ITS APPLICATION
TO THE INVERSION OF THE LAPLACE TRANSFORM
Inversion of the Laplace Transform
The integral equation

U(s) = \int_0^\infty u(t) \exp(-st)\, dt   (2.1)

is the Laplace transform of u(t). Given U(s), it is desired to obtain u(t). If U(s) is one of the simple functions found in a table of Laplace transforms and inverses, then the problem is solved. More commonly, if the analytical inversion of (2.1) is impossible or infeasible to obtain, then numerical inversion is called for, with the integral of (2.1) approximated by a finite sum,

U(s) \approx \sum_{i=1}^{N} k_i(s)\, u(t_i)

reducing the problem to the solution of a system of linear algebraic equations. If s is allowed to assume the N different values s_1, s_2, ..., s_N, then a linear system of N equations in the N unknowns u(t_i), i = 1, 2, ..., N,

U(s_j) = \sum_{i=1}^{N} k_{ij}\, u(t_i), \qquad j = 1, 2, \ldots, N

is obtained, where K = [k_{ij}] is an N x N kernel matrix. Solution of this equation yields the approximation

u(t_i) = \sum_{j=1}^{N} a_{ij}\, U(s_j), \qquad i = 1, 2, \ldots, N

where A = [a_{ij}] = K^{-1}. Thus, the solution of the inversion of the Laplace transform has been essentially reduced to finding K^{-1}.
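The discretization above can be sketched numerically. The following is a minimal illustration, not the thesis's own computation: the sample points, ranges, and rectangle-rule quadrature are assumed for demonstration, and the kernel entries k_{ij} = exp(-s_j t_i) dt follow the exponential kernel of (2.1).

```python
import numpy as np

# Illustrative discretization of U(s) = integral of u(t) exp(-s t) dt.
# Sample points and ranges are assumptions chosen for demonstration.
N = 20
t = np.linspace(0.01, 2.0, N)      # quadrature nodes t_i
s = np.linspace(0.5, 10.0, N)      # transform sample points s_j
dt = t[1] - t[0]

# Rectangle-rule kernel matrix: K[j, i] = exp(-s_j * t_i) * dt
K = np.exp(-np.outer(s, t)) * dt

# The condition number of K grows rapidly with N; this is the
# ill-conditioning of the inversion described in the text.
print(np.linalg.cond(K))
```

Even at this modest size the condition number is enormous, which is why a direct solve of K x = b amplifies any noise in the data.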
Unfortunately, the inverse of the Laplace transform is unstable [2], since the Laplace inverse is an unbounded operator: small changes in U(s) can produce large changes in u(t). This instability results in the matrix K being ill-conditioned, i.e., K is nonsingular but contains elements such that small changes in the vector b produce large changes in the solution of Ax = b. This ill-conditioning worsens rapidly as N increases.

Because of the instability of the inverse Laplace transform, there are infinitely many solutions u(t) that result in the same Laplace transform U(s). Knowledge of the form of u(t) can be used to constrain the possible solutions of Ax = b, although there will still be infinitely many of them. At this point it is usual to consider Ax = b as a linear least squares minimization problem, which is solved for the solution that minimizes \|Ax - b\|^2. The ill-conditioning of this problem, however, renders it unsuitable for solution by standard methods such as LU decomposition or Gaussian elimination. Singular value decomposition [1,3-5] is a method commonly used for just such ill-conditioned or near-singular least squares problems, since, theoretically, it cannot fail to find a solution, provided that it is properly applied. A paper by Chu et al. [1] considered the use of singular value decomposition for the solution of the inverse of the Laplace transform in a specific application, and concluded that the method showed promise. However, since several methods were presented, the discussion of singular value decomposition was somewhat limited. Their apparent success obtained with the application of the SVD method warrants further investigation. The rest of this chapter describes singular value decomposition and its use in the solution of a sampled data problem.
Singular Value Decomposition
Singular value decomposition (SVD) is a method that can be used successfully to solve sets of equations or matrices that are either singular or ill-conditioned, conditions for which other methods, such as Gaussian elimination or LU decomposition, fail. It is also the method to use to solve linear least squares problems under similar adverse conditions. SVD takes longer than the other methods and requires some interaction on the part of the user, but it will always give a solution.
Mathematical Description
The following description of singular value decomposition is geared specifically towards its use in the solution of the linear least squares problem. More general descriptions, along with the underlying theory and proofs, may be found in the references [3-5].

Let A be a real m x n matrix. A may be decomposed into the product of two orthogonal matrices and a diagonal matrix,

A = U S V^T   (2.2)

where U is an m x m orthogonal matrix, V is an n x n orthogonal matrix, and S is an m x n diagonal matrix. The diagonal entries of S, s_i, are called the singular values of A; they are the nonnegative square roots of the eigenvalues of A^T A. Equation (2.2) is called the singular value decomposition of A.

The linear least squares minimization problem may be stated as

Ax = b   (2.3)

where it is desired to find the vector x that minimizes the residual norm

r = \|Ax - b\|

Here, A is an m x n matrix of rank k \le min(m,n), x is an n-vector, and b is an m-vector. Since, for the problem considered in this thesis, A is a square n x n matrix, this will be assumed in the remainder of the discussion. From (2.2) and (2.3),

x = V S^{-1} U^T b

Defining the vectors g and p by

g = U^T b

and

S p = g

the vector x is then found by the orthogonal linear transformation

x = V p

Since S is diagonal, the components of p are found by

p_i = g_i / s_i, \qquad i = 1, \ldots, n   (2.4)
Equation (2.4) assumes that all of the singular values are nonzero, which is the case only if the matrix A is nonsingular with rank k = n. A property of A is that it is generally singular, which results in one or more singular values being zero, or ill-conditioned, where one or more singular values are too small and are dominated by data noise or roundoff error. Therefore, a rank-reduction step is employed which limits the number of singular values used to obtain a solution. Since the pseudorank k of the solution is generally not known ahead of time, a set of candidate solutions

x^{(k)} = V p^{(k)}, \qquad k = 0, 1, \ldots, n

and residual norms

r^{(k)} = \|A x^{(k)} - b\|, \qquad k = 0, 1, \ldots, n

are defined, where

p^{(k)} = [p_1, \ldots, p_k, 0, \ldots, 0]^T

There are several criteria which may be used to select the best value of k, such as selecting the solution x^{(k)} with the smallest norm \|x^{(k)}\| or the smallest residual norm r^{(k)}, or both. Lawson and Hanson [3] discuss these and other techniques, such as Levenberg-Marquardt analysis. Knowledge or reasonable assumptions of the properties of the data, such as the uncertainty of the components of b, is often of much assistance. For instance, if the components g^{(k+1)}, \ldots, g^{(n)} of g = U^T b are smaller than the assumed uncertainty in b, then these components can be considered to be zero. Thus solution k might be chosen as the best solution. Some of the criteria that might be used for choosing the best solution are discussed in a later section and in Chapter IV.
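The rank-reduction step can be sketched with a library SVD routine. This is an illustrative reconstruction, not the thesis's implementation: the test matrix, noise level, and random seed are assumptions chosen only to show how the candidate solutions x^{(k)} and residual norms r^{(k)} are formed.

```python
import numpy as np

# Illustrative ill-conditioned square system (exponential kernel).
rng = np.random.default_rng(0)
n = 8
A = np.exp(-np.outer(np.linspace(0.1, 2, n), np.linspace(0.1, 2, n)))
x_true = rng.uniform(size=n)
b = A @ x_true + 1e-8 * rng.standard_normal(n)   # data with small noise

U, svals, Vt = np.linalg.svd(A)                  # A = U diag(svals) Vt
g = U.T @ b

candidates = []
for k in range(n + 1):
    p = np.zeros(n)
    p[:k] = g[:k] / svals[:k]        # p_i = g_i / s_i for i <= k, else 0
    x_k = Vt.T @ p                   # x(k) = V p(k)
    r_k = np.linalg.norm(A @ x_k - b)
    candidates.append((k, x_k, r_k))

# With noise present, a small residual alone is not enough: the
# high-k solutions can have huge norms dominated by noise, so both
# ||x(k)|| and r(k) are inspected when choosing the pseudorank.
for k, x_k, r_k in candidates:
    print(k, r_k, np.linalg.norm(x_k))
```

The residual norms are nonincreasing in k, while the solution norms can grow explosively once noise-dominated singular values are included, which is exactly the trade-off the selection criteria above address.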
Calculation of the Singular Value Decomposition
The singular value decomposition (2.2) used in the solution of (2.3) is calculated by first constructing a series of Householder transformations [3,5] which are applied to A, reducing it to upper bidiagonal form. That is,

H_n \cdots H_1 A Q_1 \cdots Q_{n-2} = D

where H_n, ..., H_1 and Q_1, ..., Q_{n-2} are two sequences of Householder transformations and D is the n x n upper bidiagonal matrix. Then a singular value decomposition for D is computed by applying a modified QR algorithm [3], adapted to operate on bidiagonal matrices, to D. A QR algorithm decomposes any real matrix into the product of an orthogonal matrix Q and an upper triangular matrix R, hence the name. For a bidiagonal matrix, this results in rotation matrices R_i and P_i being applied to D so that

R_k \cdots R_1 D P_1 \cdots P_k = \mathrm{diag}[s_1, \ldots, s_n]

where k depends on the iterative QR process. The singular values of D, s_i, which are also those of A, are sorted into nonincreasing order. The other results of the SVD computation are

V = Q_1 \cdots Q_{n-2} P_1 \cdots P_k

and

U^T b = R_k \cdots R_1 H_n \cdots H_1 b

From these the solutions x^{(0)}, ..., x^{(n)} may be computed as described in the preceding section.

It is not necessary to write the SVD routines from scratch; they are readily available in numerical software libraries such as IMSL and NAG, and listings of the routines, along with explanations of their proper use, are provided in the references [3-5]. The Pascal source code of these routines as used for this thesis is provided in an appendix.
Stability of the Algorithms
The stability of SVD is assured by the stability of its component algorithms. Wilkinson [6] has demonstrated the stability of the Householder transformations, as well as the global convergence of the QR algorithm in the absence of roundoff, a convergence that is asymptotically almost always cubic.

When there are small values on the diagonal or superdiagonal of the upper bidiagonal matrix, a splitting technique [4] employing stable orthogonal transformations is used to ensure numerical stability, at the cost of possibly slowing down the rate of convergence.
Application to the Inversion of the Laplace Transform
Involving Sampled Data
Histogram Model of Sampled Data
Suppose that the Laplace transform integral

g(t) = \int_0^\infty G(\gamma) \exp(-\gamma t)\, d\gamma   (2.5)

is to be inverted to obtain G(\gamma). If g(t) is given as a set of M data samples D_m, m = 0, 1, ..., M-1, measured at times t_m = m \Delta T, m = 0, 1, ..., M-1, where \Delta T is the sample time increment, then (2.5) can be approximated as

D_m = B + \sum_{n=1}^{N} G_n h_n(t_m)   (2.6)

where B is introduced as a small baseline correction term, G_n = (\gamma_n - \gamma_{n-1}) G(\gamma_n) is the area of a histogram component approximating the area under G(\gamma) on the interval \gamma_{n-1} \le \gamma \le \gamma_n, and

h_n(t) = \frac{1}{\gamma_n - \gamma_{n-1}} \int_{\gamma_{n-1}}^{\gamma_n} \exp(-\gamma t)\, d\gamma
       = \frac{\exp(-\gamma_{n-1} t) - \exp(-\gamma_n t)}{(\gamma_n - \gamma_{n-1})\, t}, \qquad h_n(0) = 1   (2.7)
Such a histogram is shown in Figure 2.1.
Figure 2.1. Histogram model of a
linewidth distribution.
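The kernel h_n(t) of (2.7) is just the average of exp(-gamma t) over the bin, with the limit h_n(0) = 1. A minimal sketch (the bin edges and sample time below are illustrative assumptions):

```python
import math

def h(gamma_lo, gamma_hi, t):
    """Histogram kernel h_n(t) of (2.7): the average of exp(-gamma*t)
    over the bin [gamma_lo, gamma_hi]; h_n(0) = 1 by the limit."""
    if t == 0.0:
        return 1.0
    return (math.exp(-gamma_lo * t) - math.exp(-gamma_hi * t)) / (
        (gamma_hi - gamma_lo) * t)

# Sanity checks: h_n(0) = 1, and for t > 0 the value lies between the
# endpoint exponentials exp(-gamma_hi t) and exp(-gamma_lo t).
print(h(100.0, 200.0, 0.0))
print(h(100.0, 200.0, 0.01))
```

Note that for very small nonzero t the two exponentials nearly cancel; in careful code this region may deserve a series expansion, though the direct formula is adequate at the sample times used here.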
The task is to find B and G_n, n = 1, 2, ..., N, which minimize the residual

R = \sum_{m=0}^{M-1} [D_m - B - C(t_m)]^2   (2.8)

where

C(t_m) = \sum_{n=1}^{N} G_n h_n(t_m)

is the model fit to the data D_m. The minimum of R occurs when

\partial R / \partial B = 0   (2.9)

and

\partial R / \partial G_k = 0, \qquad k = 1, 2, \ldots, N   (2.10)

Solving (2.9) gives

B = \frac{1}{M} \sum_{m=0}^{M-1} [D_m - C(t_m)]   (2.11)

Solving (2.10) gives

\sum_{m=0}^{M-1} D_m h_k(t_m) = B \sum_{m=0}^{M-1} h_k(t_m) + \sum_{m=0}^{M-1} C(t_m) h_k(t_m)
Substituting from (2.11) for B and collecting terms results in

\sum_{m=0}^{M-1} \Big[ D_m - \frac{1}{M} \sum_{m'=0}^{M-1} D_{m'} \Big] h_k(t_m)
  = \sum_{m=0}^{M-1} \Big[ C(t_m) - \frac{1}{M} \sum_{m'=0}^{M-1} C(t_{m'}) \Big] h_k(t_m)   (2.12)

Define

\bar{D}_m = D_m - \frac{1}{M} \sum_{m'=0}^{M-1} D_{m'}

and

\bar{h}_k(t_m) = h_k(t_m) - \frac{1}{M} \sum_{m'=0}^{M-1} h_k(t_{m'})

Then (2.12) becomes, after substituting for C(t_m) and simplifying,

\sum_{m=0}^{M-1} \sum_{n=1}^{N} G_n \bar{h}_n(t_m) \bar{h}_k(t_m) = \sum_{m=0}^{M-1} \bar{D}_m \bar{h}_k(t_m)

Defining

g_k = \sum_{m=0}^{M-1} \bar{D}_m \bar{h}_k(t_m)

and

K_{kn} = \sum_{m=0}^{M-1} \bar{h}_k(t_m) \bar{h}_n(t_m)

gives the matrix equation

\begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_N \end{bmatrix} =
\begin{bmatrix} K_{11} & K_{12} & \cdots & K_{1N} \\
                K_{21} & K_{22} & \cdots & K_{2N} \\
                \vdots &        &        & \vdots \\
                K_{N1} & K_{N2} & \cdots & K_{NN} \end{bmatrix}
\begin{bmatrix} G_1 \\ G_2 \\ \vdots \\ G_N \end{bmatrix}   (2.13)

which may be solved for the desired coefficients G_1, G_2, ..., G_N using singular value decomposition. Note that the matrix K is symmetric; therefore, the upper triangular part may be calculated and then copied into the lower triangular part.
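Assembling g and K can be sketched as follows. The sample count, bin edges, and stand-in data are illustrative assumptions; the point is the mean-removal of (2.12) and the symmetric fill of K as described above.

```python
import numpy as np

# Illustrative setup: M samples, N histogram bins (values assumed).
M, N = 64, 5
dt = 1e-5                                   # sample time increment
t = np.arange(M) * dt
edges = np.linspace(0.0, 5000.0, N + 1)     # bin edges gamma_0..gamma_N

def h(lo, hi, tm):
    """Vectorized histogram kernel h_n(t_m) of (2.7)."""
    out = np.ones_like(tm)
    nz = tm > 0
    out[nz] = (np.exp(-lo * tm[nz]) - np.exp(-hi * tm[nz])) / (
        (hi - lo) * tm[nz])
    return out

H = np.vstack([h(edges[n], edges[n + 1], t) for n in range(N)])  # N x M
D = np.exp(-1000.0 * t)                     # stand-in data samples

Hbar = H - H.mean(axis=1, keepdims=True)    # h-bar_k(t_m)
Dbar = D - D.mean()                         # D-bar_m
g = Hbar @ Dbar                             # g_k = sum_m Dbar_m hbar_k

# Exploit the symmetry of K: fill the upper triangle, mirror it down.
K = np.zeros((N, N))
for k in range(N):
    for n in range(k, N):
        K[k, n] = Hbar[k] @ Hbar[n]
        K[n, k] = K[k, n]
```

The symmetric fill halves the work of forming K, exactly as the text suggests.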
The baseline correction term B may be eliminated from (2.8) by substituting (2.11) for B, giving

R = \sum_{m=0}^{M-1} [\bar{D}_m - \bar{C}(t_m)]^2

where

\bar{C}(t_m) = \sum_{n=1}^{N} G_n \bar{h}_n(t_m)

If the data are normalized such that D_0 = 1, then (2.5) evaluated at t = 0 is

g(0) = \int_0^\infty G(\gamma)\, d\gamma = 1

and (2.6) evaluated at m = 0 is

D_0 = B + \sum_{n=1}^{N} G_n = 1

since h_n(0) = 1. For small B the total area of the histogram is close to 1. This fact can be used as a criterion for evaluating the solutions obtained from the singular value decomposition: those solutions with total area significantly different from 1 are rejected. Other criteria that might be used are the residuals (2.8) and the baselines (2.11) of the solutions.
CHAPTER III
LINEWIDTH DISTRIBUTIONS AND
GENERATION OF DATA
In the previous chapter a histogram model of sampled data was developed. In the problem for which this thesis was originally undertaken, such histograms were referred to as linewidth distributions. Three types of linewidth distributions are defined in this chapter. Data generated from these distributions will be used to investigate the properties and effectiveness of singular value decomposition in obtaining a solution to the linear least squares minimization problem discussed in Chapter II. The advantage of using data generated from known linewidth distributions is the ability to compare solutions obtained by SVD with the correct answer, an ability which is usually nonexistent in real problems.
Linewidth Distributions
In some applications the solution to the inverse of the Laplace transform, G(\gamma), is referred to as a linewidth distribution, a term which will be used for the remainder of this thesis. For this thesis, the "truth" necessary for the evaluation of the inversion of the Laplace transform will consist of known linewidth distributions, from which sampled data will be generated and processed by SVD, with the resulting solutions compared against the original linewidth distributions. Linewidth distributions may be continuous or discrete. Since the SVD solutions are discrete, each continuous distribution will be modeled by a histogram, as discussed in Chapter II, to facilitate comparison of the SVD solutions with the known solution. The areas of the individual components of this histogram, G_n, n = 1, 2, ..., N, will be referred to as linewidth parameters. The linewidth distributions to be considered are the staircase linewidth distribution, the Pearson linewidth distribution, and the polydisperse linewidth distribution.
Staircase Linewidth Distribution
The staircase distribution is discrete with N components of area

G_n = \frac{2n}{N(N+1)}, \qquad n = 1, 2, \ldots, N

Note that

\sum_{n=1}^{N} G_n = 1

Figure 3.1 shows an example with N = 10, G_1 = 1/55, G_2 = 2/55, ..., G_{10} = 10/55, where 1 + 2 + ... + 10 = 55, and each component of the histogram has the same width.
Figure 3.1. Staircase linewidth distribution
with N = 10 and total area 1.
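The staircase parameters and their unit total area can be checked in a few lines; this sketch simply evaluates the formula above for the N = 10 case of Figure 3.1.

```python
# Staircase linewidth parameters G_n = 2n / (N (N + 1)); with N = 10
# this gives 1/55, 2/55, ..., 10/55, which sum to 1.
N = 10
G = [2 * n / (N * (N + 1)) for n in range(1, N + 1)]
print(G[0], G[-1], sum(G))
```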
Pearson Linewidth Distribution
The Pearson linewidth distribution is a
continuous distribution obtained from the Pearson
probability density function
G (y) = ^_(7t0)Lexp(Y:o)
L !
where L and tg are parameters, and
(3.1)
24
g (t)
_Io_
t+to
L+l
(3.2)
This distribution, an example of which is shown in Figure
3.2, is considered to be a more realistic data model than
is the staircase distribution.
with
The maximum of the Pearson distribution occurs at
y = i
'max 4.
t0
G(Yt
max> =^LLexp(L)
The mean and standard deviation are
y = L1
to
25
and
o = Vl+T.
Y t
with mean square value
^2 = (L+l) (L+2)
t2
t0
In order to allow the comparison of the Pearson linewidth distribution with the solutions obtained by SVD, the distribution must be fitted by a histogram. The histogram parameters G_n, n = 1, 2, ..., N, are the probabilities

P(\gamma_{n-1} < \gamma \le \gamma_n) = P(\gamma \le \gamma_n) - P(\gamma \le \gamma_{n-1})

where

P(\gamma \le \gamma_n) = \int_0^{\gamma_n} G(\gamma)\, d\gamma
  = \frac{t_0}{L!} \int_0^{\gamma_n} (\gamma t_0)^L \exp(-\gamma t_0)\, d\gamma

A change of variable, \lambda = \gamma t_0, gives

P(\gamma \le \gamma_n) = \frac{1}{L!} \int_0^{\gamma_n t_0} \lambda^L \exp(-\lambda)\, d\lambda
  = P(L+1, \gamma_n t_0)

which is the incomplete gamma function. Thus

P(\gamma_{n-1} < \gamma \le \gamma_n) = P(L+1, \gamma_n t_0) - P(L+1, \gamma_{n-1} t_0)   (3.3)

for n = 1, 2, ..., N. It can be shown that

P(L+1, x) = 1 - \exp(-x) \left( 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^L}{L!} \right)   (3.4)

thereby providing the means for calculating the G_n.
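The finite series (3.4) and the bin probabilities (3.3) translate directly into code. The parameter values in the example are illustrative assumptions, not values used in the thesis.

```python
import math

def P(Lp1, x):
    """Regularized incomplete gamma P(L+1, x) for integer L+1, via the
    finite series of (3.4): 1 - exp(-x) * sum_{j=0}^{L} x^j / j!."""
    L = Lp1 - 1
    s = sum(x ** j / math.factorial(j) for j in range(L + 1))
    return 1.0 - math.exp(-x) * s

def pearson_bins(L, t0, edges):
    """Histogram parameters G_n of (3.3) for Pearson parameters L, t0."""
    return [P(L + 1, edges[n] * t0) - P(L + 1, edges[n - 1] * t0)
            for n in range(1, len(edges))]

# Illustrative example: L = 4, t0 = 2e-3 (mean (L+1)/t0 = 2500),
# with ten equal bins on 0 <= gamma <= 5000.
G = pearson_bins(L=4, t0=2e-3, edges=[n * 500.0 for n in range(11)])
print(sum(G))
```

The bin areas telescope, so their sum is just P(L+1, gamma_N t_0), which approaches 1 as the upper edge moves past the bulk of the density.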
Polydisperse Linewidth Distribution
An example of a polydisperse linewidth distribution is shown in Figure 3.3. It is a continuous distribution constructed from two Pearson linewidth distributions with parameters L_1, t_1 and L_2, t_2. Choosing two constants \alpha_1 and \alpha_2 such that \alpha_1 + \alpha_2 = 1, form

g(t) = \alpha_1 g_1(t) + \alpha_2 g_2(t)   (3.5)

where

g_1(t) = \left( \frac{t_1}{t + t_1} \right)^{L_1+1}   (3.6)

and

g_2(t) = \left( \frac{t_2}{t + t_2} \right)^{L_2+1}   (3.7)

Figure 3.3. Polydisperse linewidth distribution.

This results in the polydisperse linewidth distribution

G(\gamma) = \alpha_1 G_1(\gamma) + \alpha_2 G_2(\gamma)
  = \frac{\alpha_1 t_1 (\gamma t_1)^{L_1} \exp(-\gamma t_1)}{L_1!}
  + \frac{\alpha_2 t_2 (\gamma t_2)^{L_2} \exp(-\gamma t_2)}{L_2!}   (3.8)

with parameters \alpha_1, L_1, t_1 and \alpha_2, L_2, t_2, which are chosen to produce the desired separation and relative heights of the two peaks of the distribution. Polydisperse distributions with more than two peaks can be formed by extension of this method.

A histogram may be fit to the polydisperse linewidth distribution, where the histogram parameters G_n are given by

P(\gamma_{n-1} < \gamma \le \gamma_n)
  = \alpha_1 [P(L_1+1, \gamma_n t_1) - P(L_1+1, \gamma_{n-1} t_1)]
  + \alpha_2 [P(L_2+1, \gamma_n t_2) - P(L_2+1, \gamma_{n-1} t_2)]   (3.9)

for n = 1, 2, ..., N.
Assumptions and Limitations
The histogram models of the various linewidth distributions do not require equal spacing of the parameters; in fact, it has been stated [1] that an exponential distribution of the parameters may provide somewhat better results. For simplicity, however, only histograms with uniform parameter spacing will be considered in this thesis.

An important part of the solution of the inverse of the Laplace transform is the range of \gamma to use. If the assumed range extends beyond that of the actual linewidth distribution, then the distribution becomes negative, as shown in Figure 3.4. Since the linewidth distribution cannot be negative and is zero at each end of the proper range of \gamma, an iterative search may be used to find the values of \gamma at these end points. One method that has been tried [1] is to use the method of cumulants to obtain a preliminary solution, perform linear interpolation to estimate where this solution crosses the axis, set the upper and lower bounds on \gamma to these values, and repeat until these bounds no longer change significantly. This range of \gamma is then used to set up the matrix equation (2.13) for SVD solution. The scope of this thesis is limited to the evaluation of the use of singular value decomposition in finding a solution of the inverse Laplace transform. Therefore, it is assumed that best estimates of the upper and lower bounds on \gamma have been previously determined; the estimation of these bounds will not be considered further.

Figure 3.4. The linewidth distribution becomes negative when the assumed range of \gamma extends beyond the actual range.
Generation of Simulated Data
After the linewidth distribution model and its associated histogram have been specified, the simulated data must be generated from it. For the staircase distribution the data samples are generated using (2.6) and (2.7) with B = 0, for t_m = m \Delta t, m = 0, 1, ..., M-1. For the Pearson and polydisperse distributions the M data samples are generated directly from (3.2) and (3.5), respectively. Noise is added to the data using random numbers generated from a uniform distribution over the range [-10^{-I}, 10^{-I}], where I > 0 is a parameter controlling the magnitude of the noise. Note that the data samples are automatically normalized such that the first sample, D_0, is 1, since the linewidth distributions from which the data are generated each have area equal to 1.
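The generation procedure for the staircase case can be sketched as follows. The sample count, bin edges, sample time, and noise parameter I are illustrative assumptions chosen to mirror the description above, not the thesis's exact configuration.

```python
import math
import random

# Illustrative settings: N bins, M samples, sample time dt, noise 10**-I.
N, M, dt, I = 5, 128, 1e-5, 5
edges = [n * 1000.0 for n in range(N + 1)]            # 0 .. 5000
G = [2 * n / (N * (N + 1)) for n in range(1, N + 1)]  # staircase areas

def h(lo, hi, t):
    """Histogram kernel h_n(t) of (2.7), with h_n(0) = 1."""
    if t == 0.0:
        return 1.0
    return (math.exp(-lo * t) - math.exp(-hi * t)) / ((hi - lo) * t)

random.seed(0)
D = []
for m in range(M):
    t = m * dt
    clean = sum(G[n] * h(edges[n], edges[n + 1], t) for n in range(N))
    # (2.6) with B = 0, plus uniform noise on [-10**-I, 10**-I].
    D.append(clean + random.uniform(-10.0 ** -I, 10.0 ** -I))

# D[0] equals 1 up to the noise, since the G_n sum to 1 and h_n(0) = 1.
print(D[0])
```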
CHAPTER IV
MEASUREMENT OF PERFORMANCE
In this chapter the various measures of performance will be defined. These measures serve two purposes. First, they can be used to help choose, from among the various SVD solutions, that solution that is to be considered "best." Second, they can be used to measure the accuracy and effectiveness of the SVD method itself. In particular, the parameter error u, to be defined presently, is not obtainable when working with real data since the true solution is not known, thus precluding the use of this measure for choosing the best solution in actual problems. The parameter error is, however, an excellent measure of the performance of the SVD method, as well as a check on the validity and effectiveness of the other performance measures for choosing the best solution.

The performance measures defined in this chapter are all numbers, which may be interpreted manually by the experimenter or possibly used in some form of automated scheme to select the best solution and determine the quality of the solution. The experimenter also has the capability to look at plots of the data, residuals, and candidate solutions. In many cases the use of plots may prove more effective than the numerical measures. Therefore, several types of plots are also discussed in this chapter.

After the performance measures and plots are defined, an example is presented to illustrate their use in selecting and evaluating the best solution.
Performance Measures
RMS Parameter Error
The RMS parameter error, or uncertainty, u is defined as the root-mean-square difference between the normalized linewidth parameters of an SVD solution and the exact parameters of the model, where

\mathrm{area} = \sum_{n=1}^{N} \hat{G}_n   (4.1)

is used to normalize the sum of the \hat{G}_n's to 1, since any valid solution must have this total area. Here G_n is the exact value, from the initial model, of the nth linewidth parameter, and \hat{G}_n is the value for this parameter obtained from the SVD solution. A value of u is computed for each of the N solutions obtained by SVD. A small value of u is considered an indication of a good solution, since this shows close agreement between the linewidth parameters of the solution and of the model.
Linewidth Parameter Total Area
The area computed in (4.1) can itself be used to
help choose the best solution. Since the data are
usually normalized so that the first data sample equals
1, the total area of the linewidth parameters should be
close to 1 for a good solution.
Computed Baseline
The baseline computed by (2.11) should be close
to the initial baseline of the data for a good solution.
For the data used in this thesis the initial baseline B
was set to zero in (2.6); thus, to be considered good, a
candidate SVD solution should have a computed baseline
close to zero.
RMS Residual
The total rms residual between the data and the model may be obtained by taking the square root of (2.8). The better candidate solutions should have smaller values for the rms residual, since a better linewidth parameter estimate should produce a better model fit to the data.
Residual Statistic
A residual statistic based on "runs" may be computed to check for the randomness of the individual data point residuals. For a good model fit to the data the residuals should be random with zero mean. Considering just the signs of the residuals, a run is an unbroken string of positive or negative residuals. To form the residual statistical measurement for randomness, a run test on the signs of the residuals is performed as follows. Defining

n_r = number of runs
n_1 = number of positive residuals
n_2 = number of negative residuals

calculate the quantities [7,8]

\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1

and

\sigma^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}

The run test statistic may be computed as

z = \frac{|n_r - \mu| - 1/2}{\sigma}   (4.2)

If n_1 > 10 and n_2 > 10, or either n_1 > 20 or n_2 > 20, then z is approximately a unit normal deviate for which the probability of occurrence may be found. The smaller z is, the more likely it is that the residuals are random. As an example of how z would be used, suppose that z = 0.9. Then the probability of a unit normal exceeding this value is 0.1841. At the 5% level of significance, the residuals would be considered to be random. The 1/2 in (4.2) is a continuity correction which helps compensate for the approximation of a discrete distribution by a continuous distribution [7].
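The run test above is straightforward to implement; this sketch follows the formulas for mu, sigma, and z directly, and the alternating-sign example is an illustrative extreme case (too many runs to be random) rather than data from the thesis.

```python
import math

def run_test_z(residuals):
    """Run-test statistic z of (4.2) on the signs of the residuals,
    including the 1/2 continuity correction."""
    signs = [r > 0 for r in residuals if r != 0]
    n1 = sum(signs)                     # positive residuals
    n2 = len(signs) - n1                # negative residuals
    nr = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)  # runs
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (abs(nr - mu) - 0.5) / math.sqrt(var)

# Strictly alternating residuals have far more runs than expected
# under randomness, so z is large and randomness would be rejected.
print(run_test_z([(-1.0) ** m for m in range(40)]))
```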
Plots
While the performance measures discussed in the
previous paragraphs can possibly be used for automatic
selection of the best solution, as well as aids to the
experimenter in choosing among the candidate solutions,
the experimenter has the advantage of being able to view
various plots of the data and results. Several examples
of useful plots are presented in the next section. Among
them are plots of the linewidth parameters of the solutions,
plots of the experimental and model data points
for each solution, and plots of the residuals. The
residual plot is especially useful in that any trends or
nonrandomness in the residuals are readily apparent.
Example
The following example illustrates the use of the
performance measures and plots discussed in the previous
section. Starting with a staircase linewidth distribu
tion with N=5, over the range 0 < y < 5000, as shown in
Figure 4.1, and a sample time of 10)is, 128 data points
were generated with added uniformlydistributed noise of
magnitude on the order of 10^. The data were then pro
cessed by the SVD routine, which produced the five candi
date solutions shown in Table 4.1 and plotted in Figure
4.2, where the candidate solutions are plotted with solid
lines and the original solution, repeated in each plot,
is plotted with dotted lines. Table 4.2 lists the
performance measures computed for each of the solutions.
Figure 4.1. Original linewidth distribution.
Table 4.1. Linewidth Solutions

Solution      G1         G2         G3         G4         G5
   1       1.10e-1    2.16e-1    2.40e-1    2.35e-1    2.21e-1
   2       3.70e-2    1.36e-1    2.21e-1    2.78e-1    3.10e-1
   3       6.44e-2    1.36e-1    1.98e-1    2.67e-1    3.34e-1
   4       7.68e-2    1.23e-1    1.98e-1    2.81e-1    3.25e-1
   5       6.23e-1   -9.83e-1    1.82e+0   -9.87e-1    7.20e-1
Figure 4.2. Plots of the candidate
linewidth solutions.
Table 4.2. Performance Measures

Solution   Parameter Error   RMS Residual   Residual Statistic   Baseline   Area
   1           6.94e-2          1.13e-2          1.09e+1         5.57e-2   1.02e+0
   2           2.04e-2          3.85e-4          1.07e+1         1.61e-2   9.83e-1
   3           1.86e-3          6.56e-6          4.20e-1         9.26e-4   9.99e-1
   4           9.63e-3          1.72e-5          8.42e+0         4.24e-3   1.00e+0
   5           9.09e-1          1.02e-4          1.04e+1         1.96e-1   1.20e+0
Evaluating these performance measures, solution 3
has the smallest parameter error, smallest rms residual,
smallest run test residual statistic (and is therefore the
most likely to have random residuals), the baseline closest
to zero, and an area very close to 1. From these
parameters alone it is seen that solution 3 is the best
solution.
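Where automatic selection is wanted, the measures that remain available for real data (rms residual, run test statistic, |baseline|, and |area - 1|) can be combined; a simple rank-sum over the candidates is one possibility. This scoring rule is our illustration only, not a procedure from the thesis:

```python
def pick_best(measures):
    """measures: one (rms_residual, run_stat, baseline, area) tuple per
    candidate. Rank each candidate on each measure (smaller is better;
    area is ranked by distance from 1) and return the 1-based index of
    the candidate with the lowest rank sum."""
    n = len(measures)
    columns = [
        [m[0] for m in measures],             # rms residual
        [m[1] for m in measures],             # run test statistic
        [abs(m[2]) for m in measures],        # |computed baseline|
        [abs(m[3] - 1.0) for m in measures],  # |total area - 1|
    ]
    scores = [0] * n
    for col in columns:
        for rank, i in enumerate(sorted(range(n), key=lambda i: col[i])):
            scores[i] += rank
    return min(range(n), key=lambda i: scores[i]) + 1

# the five candidates of Table 4.2
table_4_2 = [
    (1.13e-2, 1.09e+1, 5.57e-2, 1.02e+0),
    (3.85e-4, 1.07e+1, 1.61e-2, 9.83e-1),
    (6.56e-6, 4.20e-1, 9.26e-4, 9.99e-1),
    (1.72e-5, 8.42e+0, 4.24e-3, 1.00e+0),
    (1.02e-4, 1.04e+1, 1.96e-1, 1.20e+0),
]
```

Applied to the values of Table 4.2, this rule picks solution 3, matching the manual choice.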
Looking at the plots, Figure 4.2 shows that
solution 3 is the closest to the true linewidth
distribution. An enlargement of the plot of solution 3 is
shown in Figure 4.3, where the solution and the true
distribution are almost indistinguishable. Figures 4.4-4.8
show for each of the solutions plots of the original data
(dotted lines starting at 1), the model data (solid
lines), and the residuals, along the horizontal axis,
scaled such that the largest magnitude residual has
magnitude 0.1 on the plot.
Except for solution 1, the original and model
data cannot be distinguished on the plots, and are thus
not useful in this case for selecting the best solution.
In other cases, these plots could be used to weed out the
very bad solutions. Close agreement between the original
and model data should be expected of a good solution, but
does not imply one, as an examination of solution 5 will
show. The residual plots vary considerably among the
solutions, however, with only the solution 3 residuals
appearing to be random, in agreement with the residual run
test statistics in Table 4.2. The residual plots appear
to be much more useful than the data plots when trying to
choose the best solution.
Figure 4.3. Solution 3 linewidth distribution.
Figure 4.4. Data and residuals for solution 1.
Figure 4.5. Data and residuals for solution 2.
Figure 4.6. Data and residuals for solution 3.
Figure 4.7. Data and residuals for solution 4.
Figure 4.8. Data and residuals for solution 5.
CHAPTER V
RESULTS
In this chapter the results of applying singular
value decomposition to various cases are presented.
These cases include the three types of linewidth
distributions defined in the previous chapter. The
magnitude of the noise applied to the data generated from
these distributions was varied to study the sensitivity of
the SVD method to noise. Also, in the case of the
staircase distribution, the size of the distribution,
i.e., the number of linewidth parameters, was varied. The
results are presented primarily in the form of graphs
which summarize these results in a convenient form. In
what follows, each type of distribution is considered
separately.
Staircase Linewidth Distribution
Nine different staircase distributions were
considered, with N = 2, 3, ..., 10. The same range of y,
0 <= y <= 5000, was used in all cases. Table 5.1 lists the
values of y and G for N=10, while Figure 5.1 shows this
distribution graphically. The width of
Table 5.1. Values for the Staircase Distribution
with N=10
n Y(n) G(n)
0 0.00
1 500.00 0.0182
2 1000.00 0.0364
3 1500.00 0.0545
4 2000.00 0.0727
5 2500.00 0.0909
6 3000.00 0.1091
7 3500.00 0.1273
8 4000.00 0.1455
9 4500.00 0.1636
10 5000.00 0.1818
Figure 5.1.
Staircase distribution with N=10.
each linewidth parameter is (5000 - 0)/10 = 500. The
vertical axis has arbitrary units and is scaled such that
the height of each histogram component represents the
area of that component, Gn, n = 1, 2, ..., 10, with total
area equal to 1. Using (2.6) and (2.7) with B = 0, 128
data samples are generated from this histogram and are
listed to six significant digits in Table A.1 in Appendix
A and plotted in Figure 5.2. The time between data
samples is 10 us. The histograms and data for N = 2, 3,
..., 9 are similar.
Figure 5.2. Plot of the data generated from
the staircase distribution with N=10.
For each data set noise is added with magnitude
on the order of 10^-I, I = 1, 2, ..., 7, and the resulting
noisy data are then rounded to five decimal places.
The same set of 128 uniformly-distributed random numbers,
scaled to the correct magnitude, was used in each case.
The particular set of random numbers used produced noise
with mean 6.41 x 10^-(I+2) and rms value 5.50 x 10^-(I+1),
I = 1, 2, ..., 7. The total number of cases was 9 x 7 = 63.
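A sketch of this data-generation step, reconstructed from the description above, is given below. The histogram model g(k dt) = sum_n Gn exp(-gamma_n k dt) with bin-center decay rates and B = 0, the staircase weights Gn proportional to n, and the noise scaling (uniform on (-10^-I, 10^-I)) are all our reading of the text, not quoted equations:

```python
import math
import random

def staircase_data(N=10, gamma_max=5000.0, dt=1e-5, npts=128):
    """Samples of g(k*dt) = sum_n G_n exp(-gamma_n*k*dt) with B = 0."""
    width = gamma_max / N
    G = [n / (N * (N + 1) / 2) for n in range(1, N + 1)]    # areas sum to 1
    centers = [(n - 0.5) * width for n in range(1, N + 1)]  # bin centers
    return [sum(g * math.exp(-c * k * dt) for g, c in zip(G, centers))
            for k in range(npts)]

def add_noise(data, I, seed=1):
    """Add uniform noise of magnitude ~10**-I, round to five decimals."""
    rng = random.Random(seed)
    a = 10.0 ** (-I)
    return [round(d + rng.uniform(-a, a), 5) for d in data]
```

Under these assumptions the noiseless samples reproduce Table A.1; for example the second sample, g(dt), comes out near 0.968095.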
For each case the SVD method was applied to the
noisy data and a set of solutions obtained. The ten
solutions for N=10 and I=4 (noise ~10^-4) are listed in
Table 5.2 and plotted in Figure 5.3 against the original
linewidth distribution. The performance measures defined
in the previous chapter were computed and used to
evaluate the set of solutions and choose the solution that
gave the best estimate of the original distribution.
Table 5.3 shows the performance measure values for N=10
and I=4.
Figure 5.3. The 10 solutions plotted against the
original linewidth distribution.
Table 5.2. Solutions Obtained for N=10 and I=4

 n    Sol. 1    Sol. 2    Sol. 3    Sol. 4    Sol. 5
 1   3.27e-2   4.40e-3   1.56e-2   8.88e-3   3.06e-2
 2   7.79e-2   2.39e-2   3.79e-2   3.43e-2   6.41e-2
 3   1.02e-1   5.03e-2   5.51e-2   5.86e-2   8.26e-2
 4   1.14e-1   7.68e-2   7.21e-2   7.77e-2   8.78e-4
 5   1.19e-1   1.00e-1   9.00e-2   9.27e-2   1.70e-1
 6   1.20e-1   1.20e-1   1.09e-1   1.07e-1   3.76e-2
 7   1.19e-1   1.36e-1   1.28e-1   1.23e-1   1.90e-1
 8   1.16e-1   1.49e-1   1.47e-1   1.41e-1   1.24e-1
 9   1.13e-1   1.58e-1   1.64e-1   1.63e-1   1.38e-1
10   1.09e-1   1.65e-1   1.80e-1   1.87e-1   2.01e-1

 n    Sol. 6    Sol. 7    Sol. 8    Sol. 9    Sol. 10
 1   3.06e-2   3.06e-2   8.47e-1   8.47e-1   8.47e-1
 2   5.53e-1   4.40e-1   1.00e+0   1.00e+0   7.83e-1
 3   7.24e-1   7.84e-1   7.27e-1   7.27e-1   4.38e-1
 4   8.84e-2   4.55e-1   4.40e-1   4.40e-1   1.11e+0
 5   5.48e-1   8.00e-1   4.07e-1   4.07e-1   1.03e+0
 6   2.57e-1   5.07e-1   4.12e-1   4.12e-1   6.26e-1
 7   4.59e-2   3.62e-1   4.21e-1   4.21e-1   9.81e-1
 8   1.75e-1   9.91e-1   1.21e+0   1.21e+0   1.97e+0
 9   2.82e-1   2.66e-1   1.10e-1   1.10e-1   2.26e+0
10   4.95e-1   2.74e-1   7.54e-2   7.54e-2   9.62e-1
Table 5.3. Performance Measures for the Solutions with N=10 and I=4

Solution   Parameter Error   RMS Residual   Residual Statistic   Baseline   Area
    1          3.98e-2          1.35e-2          1.09e+1          6.28e-2   1.02e+0
    2          1.02e-2          4.04e-4          1.04e+1          1.53e-2   9.83e-1
    3          1.21e-3          5.35e-5          1.60e+0          1.31e-3   9.99e-1
    4          4.57e-3          5.40e-5          9.35e-1          6.40e-3   9.94e-1
    5          5.25e-2          5.43e-5          9.12e-1          2.44e-2   9.76e-1
    6          3.66e-1          8.39e-5          5.40e+0          3.38e-2   1.03e+0
    7          5.46e-1          5.72e-5          1.85e+0          9.98e-3   1.01e+0
    8          1.07e+0          7.38e-5          3.63e+0          3.83e-1   6.17e-1
    9          1.07e+0          7.38e-5          3.63e+0          3.83e-1   6.17e-1
   10          2.05e+0          6.56e-5          2.91e+0          4.03e-1   5.96e-1
In this case solution 3 was chosen as best since it had
the smallest parameter error, rms residual, and baseline
and its total area was close to 1. Note that the residual
run test statistic was smallest for solution 5, but
was still reasonably small for solution 3, with a
probability of 0.0548 that the residuals came from a random
distribution. At the 5% level of significance, the
residuals would be considered to be random, as is expected
for a good solution. Figure 5.4 shows the data and
residuals for solution 3 and Figure 5.5 shows, for
comparison, the data and residuals for solution 2. Note that
the residuals have been greatly magnified for viewing, as
the model data actually plots over the original data in
the figure. Figure 5.6 shows an enlarged plot of
solution 3. It is apparent from Figure 5.3 that solution 3
is the best estimate of the original distribution,
followed closely by solution 4. Note in Table 5.3 that all
of the rms residuals starting with solution 3 are small
and of the same magnitude. Because of the ill-conditioned
nature of the problem, the higher-order solutions
will still generate data models that fit the sampled
data very well even though these higher-order solutions
are themselves very bad, as can be seen in Table 5.2 and
Figure 5.3.
in the graphs of Figures 5.7-5.9. Figure 5.7 plots the
parameter error versus the rms noise, Figure 5.8 the rms
residuals versus the noise, and Figure 5.9 the parameter
error versus N.
Figure 5.4. Data and residuals for solution 3.
Figure 5.5. Data and residuals for solution 2.
Figure 5.6. Plot of solution 3.
Figure 5.7. Staircase distribution parameter
error versus rms noise.
Figure 5.8. Staircase distribution
rms residuals versus rms noise.
Figure 5.9. Staircase distribution
parameter error versus N.
Pearson Linewidth Distribution
The Pearson linewidth distribution shown in
Figure 5.10 was considered, with 0 < y < 5000. This
distribution was generated from (3.1) with L=4 and t0=2.5
ms.
Figure 5.10. Pearson distribution with L=4 and t0=2.5 ms.
A distribution histogram with N=10 was generated from
(3.3) and (3.4) as shown in Figure 5.11. Table 5.4 lists
the values of y and G for this histogram, where
Gn = P(y(n-1) <= y <= y(n)). A set of 128 data points was
then generated using (3.2) with dt=10 us. Figure 5.12
shows a plot of this data, which is listed to six
significant figures in Appendix A. Note that the data were
generated from the continuous distribution, not the
histogram, which is used only for comparison with the SVD
solutions.
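The histogram step can be sketched as follows. The Pearson form is assumed here to be the gamma density p(y) = t0^(L+1) y^L exp(-t0 y)/L!, our inference from the listed parameter values, since (3.1) appears earlier in the thesis; for integer L the bin probabilities then come from the Erlang cumulative distribution:

```python
import math

def erlang_cdf(x, shape, rate):
    """P(X <= x) for an Erlang(shape, rate) variable (integer shape)."""
    if x <= 0:
        return 0.0
    s = sum((rate * x) ** k / math.factorial(k) for k in range(shape))
    return 1.0 - math.exp(-rate * x) * s

def pearson_histogram(N=10, gamma_max=5000.0, L=4, t0=2.5e-3):
    """G_n = P(y_{n-1} <= y <= y_n) over N equal bins on [0, gamma_max]."""
    edges = [n * gamma_max / N for n in range(N + 1)]
    # y^L exp(-t0 y) integrates as an Erlang with shape L+1 and rate t0
    return [erlang_cdf(edges[n + 1], L + 1, t0) - erlang_cdf(edges[n], L + 1, t0)
            for n in range(N)]
```

Under this assumed density the values match Table 5.4 to the digits shown (G1 ~ 0.0091, G4 ~ 0.2371).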
Figure 5.11. Histogram for the Pearson
distribution of Figure 5.10 with N=10.
Table 5.4. The Pearson Distribution Histogram with N=10,
L=4, and t0=2.5 ms
n    y(n)      G(n)
0 0.00
1 500.00 0.0091
2 1000.00 0.0997
3 1500.00 0.2136
4 2000.00 0.2371
5 2500.00 0.1875
6 3000.00 0.1209
7 3500.00 0.0681
8 4000.00 0.0348
9 4500.00 0.0165
10 5000.00 0.0074
Figure 5.12. Data generated from the Pearson
linewidth distribution with N=10,
L=4, t0=2.5 ms, and dt=10 us.
Noise was added to the data set with magnitude on the
order of 10^-I, I=1, 2, ..., 7, and the resulting noisy
data were then rounded to five decimal places. The same
set of random numbers used in the staircase distribution
cases was used to generate the noise.
The ten solutions obtained from the application
of the SVD method to the data for the I=4 case are listed
in Table 5.5 and plotted in Figure 5.13 against the
histogram of the original distribution. Table 5.6 shows the
values of the performance measures for this case.
Solution 4 was chosen as best since it had the smallest
parameter error and smallest baseline, an rms residual
that was close to the smallest, and an area close to 1.
Figure 5.13 shows that solution 4 comes closest to
matching the shape of the linewidth histogram. Figure 5.14
is an enlarged plot of
Table 5.5. Solutions Obtained for I=4

 n    Sol. 1    Sol. 2    Sol. 3    Sol. 4    Sol. 5
 1   3.23e-2   8.20e-2   5.52e-2  -1.06e-2   6.13e-2
 2   7.68e-2   1.71e-1   1.38e-1   1.02e-1   4.83e-2
 3   1.01e-1   1.92e-1   1.80e-1   2.15e-1   1.71e-1
 4   1.13e-1   1.79e-1   1.90e-1   2.45e-1   3.88e-1
 5   1.18e-1   1.51e-1   1.76e-1   2.02e-1   6.15e-2
 6   1.19e-1   1.19e-1   1.46e-1   1.30e-1   2.57e-1
 7   1.17e-1   8.69e-2   1.06e-1   5.78e-2   6.46e-2
 8   1.14e-1   5.74e-2   6.22e-2   1.14e-2   4.32e-2
 9   1.11e-1   3.13e-2   1.65e-2   3.09e-3   4.91e-2
10   1.07e-1   8.64e-3   2.85e-2   3.65e-2   1.08e-2

 n    Sol. 6    Sol. 7    Sol. 8    Sol. 9    Sol. 10
 1   6.13e-2   6.13e-2   2.11e-1   2.11e-1   2.11e-1
 2   2.65e-2   1.59e-2   2.03e-1   9.83e-2   1.05e-1
 3   2.07e-1   2.01e-1   2.20e-1   1.22e+0   1.23e+0
 4   3.84e-1   4.19e-1   4.14e-1   8.83e-1   8.61e-1
 5   4.47e-2   6.86e-2   6.23e-2   2.28e-1   1.81e-1
 6   2.47e-1   1.75e-1   2.06e-1   1.15e+0   1.15e+0
 7   5.41e-2   8.41e-2   1.04e-1   1.29e+0   1.24e+0
 8   4.09e-2   1.18e-1   1.91e-1   1.24e+0   1.26e+0
 9   6.77e-2   6.91e-2   1.21e-1   6.87e-1   7.57e-1
10   2.25e-3   2.33e-2   8.94e-2   1.94e-1   2.23e-1
Figure 5.13. The ten solutions for the I=4 case
plotted against the original histogram.
Table 5.6. Performance Measures for the Solutions with I=4

Solution   Parameter Error   RMS Residual   Residual Statistic   Baseline   Area
    1          7.86e-2          2.37e-2          1.09e+1          6.10e-2   1.01e+0
    2          4.16e-2          9.78e-4          1.04e+1          7.62e-2   1.08e+0
    3          3.36e-2          1.62e-4          9.13e+0          4.26e-2   1.04e+0
    4          1.60e-2          5.58e-5          2.14e+0          7.37e-3   9.93e-1
    5          8.88e-2          5.49e-5          1.57e+0          2.54e-2   1.03e+0
    6          8.94e-2          5.48e-5          1.60e+0          2.28e-2   1.02e+0
    7          9.48e-2          5.51e-5          1.47e+0          2.05e-2   1.02e+0
    8          1.74e-1          6.05e-5          1.83e+0          1.10e-1   8.89e-1
    9          9.78e-1          6.10e-5          1.50e+0          1.30e-1   8.70e-1
   10          9.80e-1          6.07e-5          1.49e+0          1.30e-1   8.70e-1
Figure 5.14. Plot of solution 4.
solution 4 with the data and residuals plotted in Figure
5.15. The central peak is modeled very well while the
ends, especially the tail on the right side, are not
modeled as well. Since the linewidth distribution cannot be
negative, the first model element could be set to zero.
If the expected distribution is known to have a
monotonically-decreasing tail-off, then the last element of
the model would probably be disregarded.
Figure 5.15. Data and residuals for solution 4.
Since the linewidth distribution is actually continuous,
a curve could be fit to the model histogram, including
the first element and excluding the last element. The
resulting curve would then be truncated where it crosses
zero. Figure 5.16 shows such a curve fit through the
center of each element, with the height of the last
element set to zero. The curve was formed by a series of
Bezier curves [9] with end points at the centers of each
of the elements and the slopes of the individual curves
set equal to each other where two curves join at the end
points.
Figure 5.16. Curve fit through solution 4.
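One way to realize such a curve is a C1 piecewise cubic in Bezier form, with the interior tangents chosen Catmull-Rom style so that adjoining segments share a slope at each join. The tangent rule below is our choice for illustration; the thesis does not specify how the matched slopes were obtained:

```python
def bezier_through(points, samples_per_seg=16):
    """Piecewise cubic Bezier curve through the given (x, y) points,
    with matched slopes at the joins (Catmull-Rom tangents)."""
    pts = [points[0]] + list(points) + [points[-1]]  # pad the endpoints
    curve = []
    for i in range(1, len(pts) - 2):
        p0, p1, p2, p3 = pts[i - 1], pts[i], pts[i + 1], pts[i + 2]
        # Bezier control points giving tangent (p2 - p0)/2 at p1, etc.
        c1 = (p1[0] + (p2[0] - p0[0]) / 6.0, p1[1] + (p2[1] - p0[1]) / 6.0)
        c2 = (p2[0] - (p3[0] - p1[0]) / 6.0, p2[1] - (p3[1] - p1[1]) / 6.0)
        for k in range(samples_per_seg):
            t = k / samples_per_seg
            u = 1.0 - t
            x = u**3 * p1[0] + 3 * u**2 * t * c1[0] + 3 * u * t**2 * c2[0] + t**3 * p2[0]
            y = u**3 * p1[1] + 3 * u**2 * t * c1[1] + 3 * u * t**2 * c2[1] + t**3 * p2[1]
            curve.append((x, y))
    curve.append(points[-1])
    return curve
```

Feeding it the (bin center, height) pairs of a solution, with the last height set to zero, gives the kind of smooth curve shown in Figure 5.16; negative excursions can then be truncated at zero.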
Figures 5.17 and 5.18 show the results of manually
choosing the best solution for all seven cases, I=1 to
7. The results are similar to those for the staircase
distribution with N=10, except that the minimum parameter
error is an order of magnitude greater than that for the
staircase distribution.
Figure 5.17. Pearson distribution parameter
error versus rms noise.
Figure 5.18. Pearson distribution
rms residual versus rms noise.
Polydisperse Linewidth Distribution
The last linewidth distribution considered was
the polydisperse distribution with 0 < y < 5000. This
distribution is shown in Figure 5.19. The distribution
was generated from (3.8) with a1=0.4, L1=4, t1=6 ms, and
a2=0.6, L2=33, and t2=10 ms. These values were chosen to
give a distribution with two distinct humps. The
distribution histogram generated from (3.9) is shown in Figure
5.20, and the values of y and G are listed in Table 5.7.
Using (3.5)-(3.7) a set of 128 data points was generated
with dt=10 us. The data are listed in Appendix A and
plotted in Figure 5.21. As was the case with the Pearson
distribution, the data were generated from the continuous
distribution, not the histogram. Using the same set of
random numbers as before, noise was added to the data set
with magnitude on the order of 10^-I, I=1, 2, ..., 7,
and then rounded to five decimal places.
Figure 5.19. Polydisperse distribution with a1=0.4,
L1=4, t1=6 ms, and a2=0.6, L2=33, and t2=10 ms.
Figure 5.20. Histogram for the polydisperse
distribution of Figure 5.19 with N=10.
Table 5.7. The Polydisperse Distribution Histogram with
N=10, a1=0.4, L1=4, t1=6 ms, and a2=0.6, L2=33, and t2=10 ms

n    y(n)      G(n)
0 0.00
1 500.00 0.0739
2 1000.00 0.2121
3 1500.00 0.0920
4 2000.00 0.0205
5 2500.00 0.0310
6 3000.00 0.1238
7 3500.00 0.2006
8 4000.00 0.1553
9 4500.00 0.0678
10 5000.00 0.0188
Figure 5.21. Data generated from the polydisperse
linewidth distribution with a1=0.4, L1=4, t1=6 ms,
and a2=0.6, L2=33, and t2=10 ms.
Table 5.8 and Figure 5.22 show the results
obtained from the SVD method for the data with I=4, while
Table 5.9 shows the performance measure values for this
case. Solution 4 was chosen as the best solution because
it had the smallest parameter error and baseline, very
nearly the smallest rms residual, and an area close to 1.
Of course, looking at Figure 5.22, solution 4 is
obviously the best, and is, in fact, the only solution that
comes close to matching the original histogram. Both
humps are fairly well modeled, with a clear separation,
although the model has them somewhat shorter and broader
than the original. Figure 5.23 shows an enlargement of
solution 4 and Figure 5.24 the data and residuals. To
Table 5.8. Solutions Obtained for the Polydisperse
Distribution with I=4

 n    Sol. 1    Sol. 2    Sol. 3    Sol. 4    Sol. 5
 1   3.00e-2   4.47e-2   6.35e-2   1.56e-1   2.14e-1
 2   7.13e-2   9.94e-2   1.23e-1   1.73e-1   1.29e-1
 3   9.36e-2   1.21e-1   1.29e-1   8.10e-2   4.56e-2
 4   1.05e-1   1.24e-1   1.16e-1   3.94e-2   1.55e-1
 5   1.09e-1   1.19e-1   1.02e-1   6.56e-2   4.80e-2
 6   1.10e-1   1.10e-1   9.13e-2   1.13e-1   2.16e-1
 7   1.09e-1   9.98e-2   8.61e-2   1.54e-1   5.50e-2
 8   1.06e-1   8.93e-2   8.59e-2   1.57e-1   1.83e-1
 9   1.03e-1   7.94e-2   8.98e-2   1.09e-1   1.46e-1
10   9.96e-2   7.03e-2   9.65e-2   5.58e-3   1.52e-2

 n    Sol. 6    Sol. 7    Sol. 8    Sol. 9    Sol. 10
 1   2.14e-1   2.14e-1   1.30e-2   1.30e-2   1.30e-2
 2   3.66e-2   1.03e-1   5.27e-2   7.46e-3   2.11e-1
 3   1.98e-1   1.23e-1   1.39e-1   3.39e-1   4.80e-2
 4   1.38e-1   5.94e-1   5.89e-1   3.30e-1   3.39e-1
 5   1.19e-1   1.94e-1   8.45e-2   1.43e-1   1.59e+0
 6   1.74e-1   7.73e-1   7.46e-1   5.57e-1   3.42e-1
 7   9.96e-2   2.93e-1   3.09e-1   5.46e-1   1.96e+0
 8   1.73e-1   1.18e+0   1.25e+0   1.45e+0   6.89e-1
 9   2.25e-1   2.44e-1   2.87e-1   1.26e-1   2.29e+0
10   7.08e-2   3.45e-1   4.01e-1   3.44e-1   1.24e+0
Figure 5.22. The ten solutions for the polydisperse
distribution with I=4 plotted against
the original histogram.
Table 5.9. Performance Measures for the Polydisperse
Distribution Solutions with I=4

Solution   Parameter Error   RMS Residual   Residual Statistic   Baseline   Area
    1          7.37e-2          7.06e-3          1.09e+1          8.19e-2   9.37e-1
    2          7.12e-2          7.18e-4          1.07e+1          4.12e-2   9.58e-1
    3          7.01e-2          2.46e-4          9.85e+0          1.76e-2   9.83e-1
    4          3.73e-2          5.35e-5          9.35e-1          5.23e-2   1.05e+0
    5          9.02e-2          5.34e-5          9.74e-1          7.88e-2   1.08e+0
    6          1.14e-1          5.37e-5          5.98e-1          6.78e-2   1.07e+0
    7          5.05e-1          8.52e-5          5.76e+0          3.82e-2   1.04e+0
    8          5.68e-1          1.00e-4          7.88e+0          7.09e-2   9.29e-1
    9          5.98e-1          9.87e-5          7.88e+0          7.48e-2   9.25e-1
   10          1.26e+0          1.16e-4          7.89e+0          5.42e-2   9.46e-1
Figure 5.23. Plot of solution 4 for the
polydisperse distribution with I=4.
Figure 5.24. Plot of the data and
residuals for solution 4.
determine if a ten-element histogram is adequate to model
this more complex distribution, the same case was run
again with a twenty-element histogram model. The
results, presented in Figure 5.25 and Table 5.10, are
similar to those for the ten-element model, with somewhat
improved parameter error and total area. Looking at
solution 4, shown in Figure 5.26, the rapid rise on the left
side of the distribution is fitted better by the narrower
elements, but otherwise the height and shape of the two
humps remain essentially the same, although, of course,
smoother. For this case the twenty-element histogram
model does offer some improvement.
Figure 5.25. The 20 solutions for the polydisperse
distribution with I=4 obtained using a
twenty-element histogram model.
Table 5.10. Performance Measures for the Polydisperse
Distribution Solutions with I=4 Obtained using a
Twenty-element Histogram Model

Solution   Parameter Error   RMS Residual   Residual Statistic   Baseline   Area
    1          3.87e-2          6.91e-3          1.09e+1          8.41e-2   9.34e-1
    2          3.74e-2          7.02e-4          1.07e+1          4.54e-2   9.53e-1
    3          3.66e-2          2.45e-4          1.02e+1          2.41e-2   9.77e-1
    4          1.97e-2          5.35e-5          9.74e-1          3.96e-2   1.04e+0
    5          5.12e-2          5.50e-5          1.13e+0          1.05e-2   1.01e+0
    6          6.49e-2          5.43e-5          1.17e+0          1.37e-2   1.01e+0
    7          1.08e-1          5.38e-5          8.84e-1          1.52e-2   1.02e+0
    8          1.08e-1          5.38e-5          8.84e-1          1.52e-2   1.02e+0
    9          1.10e-1          5.34e-5          5.26e-1          1.39e-2   1.01e+0
   10          1.12e-1          5.32e-5          5.79e-1          2.35e-2   1.02e+0
   11          1.13e-1          5.33e-5          5.79e-1          1.82e-2   1.02e+0
   12          1.66e-1          5.60e-5          1.09e+0          7.77e-3   9.92e-1
   13          1.72e-1          5.53e-5          1.09e+0          4.92e-3   9.95e-1
   14          2.00e-1          5.33e-5          5.55e-1          8.45e-3   9.91e-1
   15          2.15e-1          5.33e-5          9.35e-1          9.54e-3   1.01e+0
   16          3.28e-1          5.54e-5          9.66e-1          5.85e-2   1.06e+0
   17          3.37e-1          5.37e-5          8.84e-1          5.65e-2   1.06e+0
   18          3.53e-1          5.87e-5          2.22e+0          8.71e-2   1.09e+0
   19          2.66e-1          9.52e-5          7.16e+0          8.62e-1   1.86e+0
   20          2.66e-1          1.13e-4          8.60e+0          8.99e-1   1.90e+0
Figure 5.26. Plot of solution 4 for the polydisperse
distribution with I=4 obtained using a
twenty-element histogram model.
Figures 5.27 and 5.28 show the results from choosing the
best solution for each of the seven cases I=1 to 7. The
plot of the rms residual versus rms noise is virtually
identical to that of the Pearson distribution, but the
plot of parameter error versus rms noise levels off for
I>=3 (rms noise < 10^-3) and at a higher value than that of
the Pearson distribution, where the leveling off does not
occur until I>=5 (rms noise < 10^-5).
Figure 5.27. Polydisperse distribution
parameter error versus rms noise.
Figure 5.28. Polydisperse distribution
rms residual versus rms noise.
Precision Considerations
The results presented so far were calculated
using single precision on an Apple Macintosh Plus
computer (Appendix C). This offers approximately 7 decimal
digits of precision. The same cases were also run using
double precision, giving approximately 16 digits of
precision. Figures 5.29-5.35 show the parameter error and
rms residuals versus rms noise for all three
distributions, and, for the staircase distribution, the
parameter error versus solution histogram size N, when double
precision is used. For each of the distributions the
parameter error is somewhat less for the lower-noise cases, as
is the rms residual. The input data were still rounded to
five decimal places, thus setting a lower bound on the
errors that cannot be improved upon by using more
precision.
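The effect can be demonstrated in a few lines: solve an ill-conditioned system once in full double precision and once with the matrix and right-hand side first rounded to about 7 significant digits, mimicking single-precision storage. A Hilbert matrix stands in for the ill-conditioned systems of this thesis; the solver and rounding scheme are illustrative only:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting (double precision)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def round7(v):
    """Round to ~7 significant digits, as single precision would store it."""
    return float('%.7g' % v)

n = 6
A = [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]  # Hilbert matrix
b = [sum(row) for row in A]          # exact solution is all ones
x_full = solve(A, b)
x_sing = solve([[round7(a) for a in row] for row in A], [round7(v) for v in b])
err_full = max(abs(xi - 1.0) for xi in x_full)
err_sing = max(abs(xi - 1.0) for xi in x_sing)
```

The 7-digit run loses several orders of magnitude of accuracy relative to the double-precision run. With the data themselves rounded to five decimals, though, extra arithmetic precision only helps up to the bound set by the data, as observed above.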
Figure 5.29. Staircase distribution parameter error
versus rms noise, double precision.
Figure 5.30. Staircase distribution rms residuals
versus rms noise, double precision.
Figure 5.31. Staircase distribution parameter
error versus N, double precision.
Figure 5.32. Pearson distribution parameter error
versus rms noise, double precision.
Figure 5.33. Pearson distribution rms residuals
versus rms noise, double precision.
Figure 5.34. Polydisperse distribution parameter
error versus rms noise, double precision.
Figure 5.35. Polydisperse distribution rms residuals
versus rms noise, double precision.
The polydisperse case with N=20 and I=4 using
double precision is presented in Figure 5.36 and Table
5.11. Solutions 1-4 are virtually unchanged, and
solution 6 actually looks worse, but solution 5 is greatly
improved. In fact, as can be seen in Table 5.11,
solution 5 is slightly better than solution 4. Using double
precision, therefore, can make a difference, and is
probably worth using whenever practical.
Figure 5.36. The 20 solutions for the polydisperse
distribution with I=4 obtained using a twenty-element
histogram model, double precision.
Table 5.11. Performance Measures for the Polydisperse
Distribution Solutions with I=4 Obtained using a
Twenty-element Histogram Model, Double Precision

Solution   Parameter Error   RMS Residual   Residual Statistic   Baseline   Area
    1          3.87e-2          6.91e-3          1.09e+1          8.41e-2   9.34e-1
    2          3.74e-2          7.02e-4          1.07e+1          4.54e-2   9.53e-1
    3          3.66e-2          2.45e-4          1.02e+1          2.41e-2   9.77e-1
    4          1.96e-2          5.33e-5          1.31e+0          3.84e-2   1.04e+0
    5          1.93e-2          5.32e-5          1.24e+0          2.75e-2   1.03e+0
    6          1.32e-1          5.29e-5          6.11e-1          2.21e-1   1.22e+0
    7          1.45e-1          5.29e-5          6.19e-1          1.58e-1   1.16e+0
    8          1.08e+0          5.34e-5          1.57e+0          7.39e+1   7.29e+1
    9          6.80e-1          5.40e-5          5.26e-1          1.65e+2   1.64e+2
   10          6.84e-1          5.64e-5          4.70e-1          1.65e+2   1.64e+2
   11          8.15e-1          5.73e-5          7.99e-1          1.52e+2   1.51e+2
   12          1.51e+0          5.56e-5          4.51e-1          1.48e+2   1.47e+2
   13          1.55e+0          5.49e-5          1.51e+0          1.47e+2   1.46e+2
   14          1.55e+0          5.68e-5          1.98e-1          1.47e+2   1.46e+2
   15          1.56e+0          5.75e-5          8.11e-1          1.47e+2   1.46e+2
   16          1.84e+0          5.65e-5          9.15e-2          1.48e+2   1.47e+2
   17          2.04e+0          5.72e-5          4.70e-1          1.50e+2   1.49e+2
   18          2.10e+0          5.36e-5          1.17e+0          1.49e+2   1.48e+2
   19          2.22e+0          5.90e-5          8.73e-1          1.47e+2   1.46e+2
   20          4.07e+0          6.07e-5          1.20e+0          1.39e+2   1.38e+2
CHAPTER VI
CONCLUSION
The application of singular value decomposition
(SVD) to the inversion of the Laplace transform was
presented in this thesis. The Laplace transform inversion
problem was introduced in Chapter II, where it was
reduced to the problem of solving an ill-conditioned
system of linear equations. The SVD method was then
introduced along with the development of its application,
using histogram models, to the sampled data commonly
encountered in inverse problems.
Because of the difficulty in determining the
effectiveness of a method in solving inverse problems,
three distributions, staircase, Pearson, and
polydisperse, referred to as linewidth distributions, were
defined in Chapter III. Histogram models were developed
for these distributions for the purpose of evaluating the
SVD results. The assumptions and limitations of this
thesis were also presented, including the assumption that
the range of the distribution is known. The
determination of this range is a significant problem in its own
right, one that was discussed in [1].
A brief discussion of the generation of simulated
data from the known distributions was presented,
including the addition of uniform random noise of known
magnitude to the data.
Various measures of the performance of the SVD
method were defined in Chapter IV. All but one, the
parameter error, are available when working with real data
for which the actual linewidth distribution is unknown or
for which only the approximate form is known. Plots of
the SVD solutions, the input data and the data model
produced by SVD, and the data residuals were also
introduced. It was determined that these plots can be helpful
when choosing the best one of the set of SVD solutions
and evaluating the quality of that solution.
In Chapter V the results of applying SVD to
multiple cases of each of the three linewidth distributions
were presented, including, for a typical case with
moderate noise, the performance measures and plots introduced
in Chapter IV. The magnitude of the random noise added
to the data from the known distributions was varied to
produce plots of the sensitivity of the SVD method to
noise.
The best results were obtained for the staircase
distribution, followed by the Pearson and polydisperse
distributions, in order of increasing complexity.
The error in the solutions decreases with decreasing
noise in the data, as expected, until the point where the
magnitude of the noise is less than can be seen given the
precision of the data, when the errors level off.
Some improvement can be obtained by using double
precision SVD calculations, though still limited by the
data precision, as well as by using more elements in the
histogram models. The use of more model elements is
particularly useful when more complex distributions, such as
the polydisperse distribution, are being modeled.
Given that suitable limits can be placed on the
proper range of y, singular value decomposition can be
used to solve difficult problems such as the inversion of
the Laplace transform. The usefulness of the SVD results
is highly dependent on the amount of noise in the data,
and, to a lesser extent, on the precision used in the
calculations and on the number of elements in the model.
The quality of the solutions obtained using SVD should be
compared to that of solutions obtained by other methods
before determining if SVD is the method of choice for a
particular class of problem.
BIBLIOGRAPHY
[1] Chu, B., Ford, J. R., and Dhadwal, H. S.,
"Correlation Function Profile Analysis of
Polydisperse Macromolecular Solutions and Colloidal
Suspensions," Methods Enzymol., 117 (Enzyme
Structure, Pt. J.), 1985.
[2] Bellman, R. E., Kalaba, R. E., and Lockett, J.,
Numerical Inversion of the Laplace Transform:
Applications to Biology, Economics, Engineering, and
Physics, American Elsevier Publishing Company, Inc.,
New York, 1966.
[3] Lawson, C. L. and Hanson, R. J., Solving Least
Squares Problems, PrenticeHall, Inc., Englewood
Cliffs, N.J., 1974.
[4] Wilkinson, J. H. and Reinsch, C. (ed. by F. L.
Bauer, et al.), Handbook for Automatic Computation,
II, Linear Algebra, SpringerVerlag, New York, 1971.
[5] Press, W. H., Flannery, B. P., Teukolsky, S. A., and
Vetterling, W. T., Numerical Recipes: The Art of
Scientific Computing, Cambridge University Press,
Cambridge, England, 1986.
[6] Wilkinson, J., "Error analysis of transformations
based on the use of matrices of the form I - 2ww^H,"
Error in Digital Computation, II, (ed. by L. B.
Rall), John Wiley & Sons, Inc., New York, 1965.
[7] Draper, N. R., and Smith, H., Applied Regression
Analysis, John Wiley & Sons, Inc., New York, 1966.
[8] Schmidt, J. W., and Taylor, R. E., Simulation and
Analysis of Industrial Systems, Richard D. Irwin,
Inc., Homewood, Illinois, 1970.
[9] Smith, D. W., "C Workshop: Bezier Curve Ahead!,"
MacTutor, Vol. 5, No. 1, January 1989.
APPENDIX A
GENERATED DATA SETS
Table A.1. Data, with No Noise Added, Generated from the
Staircase Distribution with N=10 (128 Data Points)
1.00000e+0
8.51652el
7.28165el
6.25106el
5.38862el
4.66486el
4.05572el
3.54152el
3.10610el
2.73622el
2.42100el
2.15146el
1.92020el
1.72111el
1.54910el
1.39998el
1.27024el
1.15697el
1.05772el
9.70450e2
8.93451e2
8.25278e2
7.64713e2
7.10729e2
6.62451e2
6.19139e2
9.68095el
8.25121el
7.06046el
6.06615el
5.23361el
4.53455el
3.94585el
3.44859el
3.02726el
2.66911el
2.36369el
2.10236el
1.87799el
1.68468el
1.51757el
1.37258el
1.24635el
1.13607el
1.03936el
9.54278e2
8.79152e2
8.12592e2
7.53420e2
7.00642e2
6.53413e2
6.11015e2
9.37354el
7.99542el
6.84709el
5.88769el
5.08393el
4.40864el
3.83963el
3.35869el
2.95094el
2.60411el
2.30815el
2.05473el
1.83701el
1.64931el
1.48692el
1.34593el
1.22310el
1.11571el
1.02148el
9.38504e2
8.65195e2
8.00200e2
7.42382e2
6.90777e2
6.44568e2
6.03059e2
9.07727el
7.74881el
6.64126el
5.71543el
4.93937el
4.28697el
3.73691el
3.27171el
2.87705el
2.54114el
2.25430el
2.00853el
1.79723el
1.61494el
1.45712el
1.32001el
1.20047el
1.09587el
1.00404el
9.23116e2
8.51570e2
7.88096e2
7.31592e2
6.81127e2
6.35910e2
8.79174el
7.51099el
6.44267el
5.54915el
4.79974el
4.16938el
3.63758el
3.18755el
2.80550el
2.48012el
2.20209el
1.96370el
1.75861el
1.58155el
1.42815el
1.29479el
1.17843el
1.07655el
9.87031e2
9.08102e2
8.38268e2
7.76270e2
7.21043e2
6.71687e2
6.27436e2
Table A.2. Data, with No Noise Added, Generated from the
Pearson Distribution with N=10 (128 Data Points)
1.00000e+0
9.05731el
8.21927el
7.47258el
6.80583el
6.20921el
5.67427el
5.19369el
4.76113el
4.37109el
4.01878el
3.69999el
3.41108el
3.14882el
2.91038el
2.69329el
2.49534el
2.31460el
2.14934el
1.99804el
1.85934el
1.73204el
1.61506el
1.50742el
1.40829el
1.31687el
9.80238el
8.88178el
8.06302el
7.33317el
6.68119el
6.09754el
5.57402el
5.10352el
4.67988el
4.29775el
3.95246el
3.63993el
3.35659el
3.09931el
2.86533el
2.65224el
2.45788el
2.28036el
2.11801el
1.96933el
1.83301el
1.70785el
1.59281el
1.48694el
1.38941el
1.29945el
9.60942el
8.71033el
7.91031el
7.19687el
6.55927el
5.98827el
5.47589el
5.01523el
4.60030el
4.22589el
3.88745el
3.58103el
3.30314el
3.05073el
2.82111el
2.61193el
2.42108el
2.24673el
2.08723el
1.94112el
1.80712el
1.68407el
1.57093el
1.46680el
1.37084el
1.28231el
9.42101el
8.54282el
7.76106el
7.06360el
6.44001el
5.88134el
5.37982el
4.92876el
4.52233el
4.15546el
3.82373el
3.52327el
3.25071el
3.00306el
2.77771el
2.57235el
2.38495el
2.21369el
2.05698el
1.91339el
1.78167el
1.66068el
1.54941el
1.44697el
1.35256el
9.23701el
8.37917el
7.61518el
6.93328el
6.32335el
5.77 669el
5.28577el
4.84407el
4.44594el
4.08643el
3.76125el
3.46663el
3.19928el
2.95629el
2.73511el
2.53350el
2.34946el
2.18123el
2.02725el
1.88614el
1.75664el
1.63768el
1.52825el
1.427 47el
1.33457el
Table A.3. Data, with No Noise Added, Generated from the
Polydisperse Distribution with N=10 (128 Data Points)
1.00000e+0
8.90155e-1
7.96055e-1
7.15206e-1
6.45531e-1
5.85292e-1
5.33037e-1
4.87550e-1
4.47810e-1
4.12961e-1
3.82284e-1
3.55172e-1
3.31115e-1
3.09683e-1
2.90510e-1
2.73289e-1
2.57757e-1
2.43693e-1
2.30908e-1
2.19240e-1
2.08551e-1
1.98725e-1
1.89660e-1
1.81269e-1
1.73478e-1
1.66223e-1
9.76636e-1
8.70162e-1
7.78897e-1
7.00437e-1
6.32779e-1
5.74245e-1
5.23434e-1
4.79172e-1
4.40474e-1
4.06513e-1
3.76594e-1
3.50131e-1
3.26632e-1
3.05678e-1
2.86919e-1
2.70055e-1
2.54833e-1
2.41039e-1
2.28489e-1
2.17028e-1
2.06521e-1
1.96854e-1
1.87931e-1
1.79665e-1
1.71986e-1
1.64831e-1
9.53994e-1
8.50776e-1
7.62250e-1
6.86100e-1
6.20390e-1
5.63505e-1
5.14091e-1
4.71015e-1
4.33326e-1
4.00226e-1
3.71042e-1
3.45209e-1
3.22249e-1
3.01761e-1
2.83403e-1
2.66887e-1
2.51967e-1
2.38435e-1
2.26114e-1
2.14854e-1
2.04523e-1
1.95013e-1
1.86227e-1
1.78085e-1
1.70515e-1
1.63457e-1
9.32049e-1
8.31975e-1
7.46097e-1
6.72178e-1
6.08354e-1
5.53064e-1
5.05002e-1
4.63074e-1
4.26362e-1
3.94095e-1
3.65623e-1
3.40401e-1
3.17966e-1
2.97929e-1
2.79961e-1
2.63782e-1
2.49155e-1
2.35879e-1
2.23782e-1
2.12717e-1
2.02559e-1
1.93201e-1
1.84550e-1
1.76527e-1
1.69064e-1
9.10777e-1
8.13741e-1
7.30421e-1
6.58660e-1
5.96659e-1
5.42911e-1
4.96157e-1
4.55341e-1
4.19576e-1
3.88116e-1
3.60335e-1
3.35704e-1
3.13778e-1
2.94180e-1
2.76590e-1
2.60740e-1
2.46398e-1
2.33371e-1
2.21491e-1
2.10616e-1
2.00626e-1
1.91417e-1
1.82897e-1
1.74992e-1
1.67634e-1
APPENDIX B
GENERATED RANDOM NUMBERS
The pseudorandom numbers in Table B.1 were gener-
ated using the routine Ran2 in [5]. This routine in turn
uses the linear congruential generator I(j+1) = a*I(j) + c
(mod m) with a = 1366, c = 150889, and m = 714025.
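For illustration, the recurrence can be sketched in Python (this re-implementation is not part of the thesis program; Ran2 additionally shuffles its output to break up serial correlations, a step omitted here):

```python
# Illustrative sketch of the linear congruential generator described above:
# I(j+1) = (a*I(j) + c) mod m, with the constants a = 1366, c = 150889,
# m = 714025, and each state scaled by 1/m to give a deviate in [0, 1).
def lcg(seed, count, a=1366, c=150889, m=714025):
    values = []
    state = seed
    for _ in range(count):
        state = (a * state + c) % m
        values.append(state / m)  # scale integer state to [0, 1)
    return values

sample = lcg(seed=1, count=5)  # five uniform deviates in [0, 1)
```

Because the recurrence is deterministic, the same seed always reproduces the same noise sequence, which is what allows the tables in this appendix to be regenerated exactly.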
Table B.1. Random Number Set Used (After Scaling) to Add
Noise to the Original Data Sets
6.65815e-1
8.53280e-1
3.47612e-1
1.44646e-1
4.60093e-1
5.95779e-1
1.66401e-1
7.03087e-1
4.81243e-1
5.65754e-1
7.86913e-1
4.11278e-1
2.59377e-1
6.82989e-1
8.74947e-2
3.69912e-1
5.85250e-1
1.91788e-1
2.95372e-1
4.22138e-1
5.55227e-2
3.77514e-2
7.33879e-1
3.88468e-1
3.02836e-1
9.39633e-1
1.20376e-1
8.83358e-1
4.32499e-1
6.91925e-1
3.50679e-1
6.83428e-1
6.82797e-2
7.48856e-1
4.49062e-1
4.70790e-1
1.78791e-1
9.82718e-1
4.34735e-1
7.82300e-1
1.06281e-1
6.24245e-1
1.40621e-1
2.05470e-1
8.77217e-2
3.37281e-1
1.32568e-1
2.88940e-1
9.22906e-1
2.69641e-1
4.29949e-2
3.76156e-1
5.42705e-1
3.37525e-1
7.65731e-1
5.34281e-1
6.29199e-1
3.15741e-1
4.62608e-2
9.98585e-1
1.86539e-1
3.20792e-1
8.16261e-1
4.89730e-1
9.83111e-1
5.42675e-1
4.09982e-1
1.64060e-1
1.89844e-1
3.71181e-1
6.68544e-1
1.61844e-1
5.92381e-2
7.30162e-1
7.99563e-1
7.50446e-1
3.25162e-1
7.08840e-1
8.14966e-2
2.86049e-2
7.62735e-1
2.88863e-1
3.83537e-1
3.43094e-1
9.90209e-1
7.57444e-1
9.60456e-1
8.48655e-1
8.83406e-2
8.40390e-1
7.29425e-1
3.58208e-1
9.08082e-1
2.50614e-1
7.15706e-1
9.16728e-1
5.20197e-1
1.15683e-1
1.21734e-1
7.50083e-1
3.49317e-1
5.46165e-1
9.05514e-1
9.22076e-1
3.48951e-1
6.45127e-1
7.30606e-1
6.01724e-1
2.33943e-1
3.45798e-1
8.97402e-1
4.06858e-1
8.53715e-1
6.14194e-1
3.78566e-1
7.51406e-1
1.52848e-1
1.57354e-1
3.13801e-1
1.03312e-1
3.92147e-1
6.05515e-1
5.90763e-1
1.04778e-2
2.87751e-1
1.84118e-1
9.11986e-1
1.74890e-1
APPENDIX C
MACINTOSH FLOATING POINT ENVIRONMENT
The Apple Macintosh personal computer uses an im-
plementation of Draft 10.0 of IEEE Standard 754 for
Binary Floating-Point Arithmetic. This implementation is
called the Standard Apple Numeric Environment, or SANE.
The IEEE standard specifies data types, arithmetic, con-
versions, and tools for handling limitations and excep-
tions. SANE extends the Standard by including an ac-
counting data type as well as library functions for both
financial and scientific calculations. Table C.1 summa-
rizes the properties of the various SANE floating-point
data types.
Table C.1. SANE Floating-Point Data Types

                              Single      Double      Extended
Size (bits)                   32          64          80
Significand
  Bits                        24          53          64
  Decimal digits              7.22        15.95       19.27
Approximate magnitude range
  Minimum                     1.5E-45     5.0E-324    1.9E-4951
  Maximum                     3.4E+38     1.7E+308    1.1E+4932
All arithmetic operations are performed in ex-
tended precision, with the results rounded and stored to
the specified precision, thus allowing the maximum accu-
racy to be obtained. SANE includes representations for
infinities and NaNs (Not a Number). It also includes the
capability to represent and operate on denormalized num-
bers. A denormalized number is one in which the leading
significand bit is not 1. The use of denormalized num-
bers allows for gradual underflow and is the reason for
the extremely small minimum magnitudes shown in Table
C.1. A more important consequence of using denormalized
numbers is that "x - y = 0 if and only if x = y" is true
for all machine numbers, a statement that is not true
for most older systems of computer arithmetic. Some pro-
grams that perform tests such as x - y = 0 to determine
if x is very close to y may not work properly when
denormalized numbers are used.
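This property can be checked directly in Python, whose floats are IEEE 754 doubles like SANE's Double type (an illustrative check, not part of the thesis; math.nextafter requires Python 3.9 or later):

```python
# Gradual underflow in action: the difference of two distinct doubles
# near the normalized minimum is a denormalized (subnormal) number,
# not zero, so "x - y = 0 iff x = y" holds.
import math
import sys

x = sys.float_info.min        # smallest *normalized* double, ~2.2E-308
y = math.nextafter(x, 0.0)    # next representable value below x
diff = x - y                  # subnormal, but not zero

assert x != y
assert diff != 0.0                 # would flush to 0 without subnormals
assert diff < sys.float_info.min   # the difference itself is denormalized
```

On a system that flushes subnormal results to zero, the middle assertion would fail: x and y would compare unequal even though x - y evaluated to 0.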
SANE provides environmental controls that specify
rounding direction, rounding precision (used to simulate
arithmetic systems that do not use extended-precision
calculations or denormalized numbers), and error han-
dling.
APPENDIX D
COMPUTER PROGRAM
The computer program listed in this appendix was
compiled and executed in the Lightspeed Pascal
environment on an Apple Macintosh Plus computer. The
code is broken into a main program and several "units,"
which exist in separate source files and can be
compiled independently, provided that any units that a
program or unit "uses" have already been compiled. The main
program and the units are:
SVDStudy Main program
Globals Global definitions
svdcmp SVD routines
Random Random number generator
MathFunctions Miscellaneous math functions
MatRoutines Matrix manipulation routines
VectRoutines Vector manipulation routines
Plot High-level plot routines
PlotRoutines Low-level plot routines
Responses User response routines
Several of the routines call Macintosh Toolbox routines
to perform various graphics and windowing tasks.
PROGRAM SVDStudy (input, output);
{Generate and process data samples using singular value decomposition}
USES
SANE, MatRoutines, VectRoutines, Random, svdcmp, Globals, Plot, Responses;
PROCEDURE Initialize;
VAR
r: rect;
BEGIN
SetCursor(Arrow);
ObscureCursor;
WITH ScreenBits.bounds DO
SetRect(r, left + 2, top + 40, right - 3, bottom - 3);
SetTextRect(r);
ShowText;
new(h);
new(c);
calcPrecision := RealPrecision;
SetPrecision(calcPrecision);
END; {Initialize}
PROCEDURE GetParameters;
BEGIN
writeln;
write('Linewidth distribution type (s=Staircase, p=Pearson,
q=Polydisperse)? ');
readln(distrType);
CASE distrType OF
's', 'S':
BEGIN
distrName := 'Staircase';
write('Enter starting and ending values for N (Nmin, Nmax): ');
readln(Nmin, Nmax)
END;
'p', 'P':
BEGIN
distrName := 'Pearson';
write('Enter value for N: ');
readln(N);
Nmin := N;
Nmax := N
END;
'q', 'Q':
BEGIN
distrName := 'Polydisperse';
write('Enter value for N: ');
readln(N);
Nmin := N;
Nmax := N
END
END; {case}
write('Enter starting and ending values for I (Imin, Imax): ');
readln(Imin, Imax);
write('Number of data points? ');
readln(md);
ms := 0; {number of initial data points to skip}
write('Sample time (dt)? ');
readln(dt);
write('Enter minimum and maximum values for gamma (gamin, gamax):');
readln(gamin, gamax);
IF (distrType = 'p') OR (distrType = 'P') THEN BEGIN
write('Enter parameters L and t0: ');
readln(L, t0)
END;
IF (distrType = 'q') OR (distrType = 'Q') THEN BEGIN
write('Enter parameters alpha, L1, and t1:');
readln(alpha, L1, t1);
write('Enter parameters beta, L2, and t2:');
readln(beta, L2, t2)
END;
write('Initial random seed? ');
readln(initSeed);
IF initSeed = 0 THEN
initSeed := 1;
initSeed := -abs(initSeed); {set < 0 to initialize random number generator}
write('List the data points? ');
IF yes THEN
listTheData := true
ELSE
listTheData := false;
write('Plot original data without noise? ');
IF yes THEN
plotTheOrigData := true
ELSE
plotTheOrigData := false;
write('Plot solutions? ');
IF yes THEN
plotTheSolutions := true
ELSE
plotTheSolutions := false;
write('Plot original (plus noise) and model data? ');
IF yes THEN
plotTheData := true
ELSE
plotTheData := false
END; {GetParameters}
PROCEDURE GenerateData;
VAR
k, m: integer;
FUNCTION p (Lplus1: integer; x: real): real;
{compute the probability integral of the chi-square distribution p(L+1,x)}
VAR
prod: double;
sum: double;
j: integer;
BEGIN
SetPrecision(ExtPrecision); {allows for large Lplus1 factorials}
prod := 1;
sum := 1;
IF Lplus1 > 1 THEN
FOR j := 1 TO Lplus1 - 1 DO BEGIN
prod := prod * (x / j);
sum := sum + prod
END;
p := 1 - sum * exp(-x);
SetPrecision(calcPrecision) {reset precision}
END; {p}
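The probability integral computed by the function p above can be checked with a short Python sketch (illustrative only, not part of the thesis program): for integer L+1 it is p(L+1, x) = 1 - exp(-x) * sum over j = 0..L of x^j / j!, accumulated with a running product exactly as in the Pascal code.

```python
# Sketch of the chi-square probability integral used above.
import math

def chi_square_p(l_plus_1, x):
    prod = 1.0   # running term x^j / j!
    total = 1.0  # partial sum, starting at the j = 0 term
    for j in range(1, l_plus_1):
        prod *= x / j
        total += prod
    return 1.0 - total * math.exp(-x)
```

For L+1 = 1 this reduces to 1 - exp(-x), the usual exponential distribution function, which is a convenient sanity check.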
PROCEDURE StaircaseDistribution;
VAR
k, m: integer;
t: real;
egt, egtp: real;
BEGIN
{compute linewidth intervals}
FOR k := 0 TO N DO
gamma[k] := k * (gamax - gamin) / N + gamin;
{compute linewidth distribution}
FOR k := 1 TO N DO
G[k] := 2.0 * k / (N * (N + 1));
{compute h[k,tm] for use in generating the data}
FOR m := ms TO md - 1 DO BEGIN
t := m * dt;
egtp := exp(-gamma[0] * t);
FOR k := 1 TO N DO BEGIN
IF t = 0.0 THEN BEGIN
hA[k, m] := 1.0;
END
ELSE BEGIN
egt := exp(-gamma[k] * t);
hA[k, m] := (egtp - egt) / ((gamma[k] - gamma[k - 1]) * t);
egtp := egt
END
END {k}
END; {m}
{generate data}
FOR m := 0 TO md - 1 DO
co[m] := 0.0;
FOR m := ms TO md - 1 DO
FOR k := 1 TO N DO
co[m] := co[m] + G[k] * hA[k, m]
END; {StaircaseDistribution}
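The kernel computed in the procedure above, h(k, t) = (exp(-gamma[k-1]*t) - exp(-gamma[k]*t)) / ((gamma[k] - gamma[k-1])*t), is the average of exp(-gamma*t) over the k-th linewidth interval, with limiting value 1 as t -> 0, which is why the Pascal code assigns 1.0 at t = 0. A Python sketch (illustrative, not the thesis routine):

```python
# Sketch of the histogram (staircase) kernel: the mean of exp(-gamma*t)
# over the interval [gamma[k-1], gamma[k]].
import math

def h(gamma, k, t):
    if t == 0.0:
        return 1.0  # limiting value as t -> 0
    return ((math.exp(-gamma[k - 1] * t) - math.exp(-gamma[k] * t))
            / ((gamma[k] - gamma[k - 1]) * t))
```

Evaluating h at a very small t returns a value close to 1, confirming the special case is the correct limit rather than an arbitrary convention.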
PROCEDURE PearsonDistribution;
VAR
k, m: integer;
t: real;
BEGIN
{compute linewidth intervals}
FOR k := 0 TO N DO
gamma[k] := k * (gamax - gamin) / N + gamin;
{compute linewidth distribution}
FOR k := 1 TO N DO
G[k] := p(L + 1, gamma[k] * t0) - p(L + 1, gamma[k - 1] * t0);
{generate data}
FOR m := 0 TO ms - 1 DO
co[m] := 0.0;
FOR m := ms TO md - 1 DO BEGIN
t := m * dt;
co[m] := exp((L + 1) * ln(t0 / (t + t0)))
END
END; {PearsonDistribution}
PROCEDURE PolydisperseDistribution;
{A polydisperse distribution is the sum of two Pearson distributions}
VAR
k, m: integer;
t: real;
G1, G2: real;
c1, c2: real;
BEGIN
{compute linewidth intervals}
FOR k := 0 TO N DO
gamma[k] := k * (gamax - gamin) / N + gamin;
{compute linewidth distribution}
FOR k := 1 TO N DO BEGIN
G1 := p(L1 + 1, gamma[k] * t1) - p(L1 + 1, gamma[k - 1] * t1);
G2 := p(L2 + 1, gamma[k] * t2) - p(L2 + 1, gamma[k - 1] * t2);
G[k] := alpha * G1 + beta * G2
END;
{generate data}
FOR m := 0 TO ms - 1 DO
co[m] := 0.0;
FOR m := ms TO md - 1 DO BEGIN
t := m * dt;
c1 := exp((L1 + 1) * ln(t1 / (t + t1)));
c2 := exp((L2 + 1) * ln(t2 / (t + t2)));
co[m] := alpha * c1 + beta * c2
END
END; {PolydisperseDistribution}
BEGIN
CASE distrType OF
's', 'S':
StaircaseDistribution;
'p', 'P':
PearsonDistribution;
'q', 'Q':
PolydisperseDistribution
END; {case}
page;
writeln(' ', distrName, ' Distribution   N = ', N : 1);
writeln;
write('number of points = ', md : 1, '   dt = ', dt : 12);
IF (distrType = 'p') OR (distrType = 'P') THEN
write('   L = ', L : 1, '   t0 = ', t0 : 12);
IF (distrType = 'q') OR (distrType = 'Q') THEN BEGIN
writeln;
writeln(' alpha = ', alpha : 5 : 3, '   L1 = ', L1 : 2, '   t1 = ', t1 : 12);
writeln(' beta = ', beta : 5 : 3, '   L2 = ', L2 : 2, '   t2 = ', t2 : 12)
END;
writeln;
writeln;
writeln(' Actual linewidth distribution   Precision: ', calcPrecision);
writeln;
writeln('   n   Gamma(n)        G(n)');
writeln(' ', 0 : 3, gamma[0] : 10 : 2);
FOR k := 1 TO N DO
writeln(' ', k : 3, gamma[k] : 10 : 2, G[k] : 12 : 4);
writeln;
writeln(' Original data (no noise added)');
FOR m := 0 TO md - 1 DO BEGIN
IF m MOD 5 = 0 THEN BEGIN
writeln;
write(' ')
END;
write(co[m] : 14)
END;
writeln;
writeln;
IF plotTheOrigData THEN BEGIN
InitPlots;
DrawOrigDataPlot
END
END; {GenerateData}
PROCEDURE ComputeHHat;
{compute the hHat(k,tm) values and the means hb[k]}
VAR
m, k: integer;
d, t: extended;
egt, egtp: extended;
BEGIN
FOR k := 1 TO N DO
hb[k] := zero; {zero means}
FOR m := ms TO md - 1 DO BEGIN
t := m * dt;
egtp := exp(-gamma[0] * t);
FOR k := 1 TO N DO BEGIN
IF t = zero THEN
hA[k, m] := one
ELSE BEGIN
egt := exp(-gamma[k] * t);
hA[k, m] := (egtp - egt) / ((gamma[k] - gamma[k - 1]) * t);
egtp := egt
END;
hb[k] := hb[k] + hA[k, m]
END {k}
END; {m}
d := one / (md - ms);
FOR k := 1 TO N DO BEGIN
hb[k] := hb[k] * d; {compute means}
FOR m := ms TO md - 1 DO BEGIN
hA[k, m] := hA[k, m] - hb[k]; {compute h hats}
END
END {k}
END; {ComputeHHat}
PROCEDURE AddNoise;
{add noise to original data co and place in data array d}
VAR
m: integer;
noise: real;
meanSum: real;
rmsSum: real;
scale: real;
dummy: extended;
BEGIN
seed := initSeed;
dummy := Ran2(seed); {this initializes Ran2}
meanSum := 0.0;
rmsSum := 0.0;
scale := exp(-1.0 * ln(10.0)); {10^(-1)}
FOR m := 0 TO ms - 1 DO
d[m] := co[m];
FOR m := ms TO md - 1 DO BEGIN
noise := (2.0 * Ran2(seed) - 1.0) * scale; { [-1,1) * 10^(-1) }
d[m] := co[m] + noise;
meanSum := meanSum + noise;
rmsSum := rmsSum + noise * noise
END;
meanNoise := meanSum / (md - ms);
rmsNoise := sqrt(rmsSum / (md - ms));
{round d[m] to 5 decimal places}
FOR m := 0 TO md - 1 DO
d[m] := trunc(d[m] * 100000.0 + 0.5) / 100000.0
END; {AddNoise}
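The noise step above maps each uniform deviate from [0, 1) to [-1, 1), scales it by 10^(-1), adds it to the data, and rounds to five decimal places with trunc(x*1e5 + 0.5)/1e5. A Python sketch (the helper name add_noise is hypothetical, for illustration only):

```python
# Sketch of the AddNoise step: uniform noise in [-scale, scale) plus
# round-to-5-decimals via the same trunc-based formula.
import math

def add_noise(data, deviates, scale=0.1):
    out = []
    for c, r in zip(data, deviates):
        noise = (2.0 * r - 1.0) * scale  # uniform on [-scale, scale)
        out.append(math.trunc((c + noise) * 1e5 + 0.5) / 1e5)
    return out
```

Note that the trunc-based formula rounds to nearest only for non-negative values; for negative values it would round toward zero, which matters only if a noisy data point dips below zero.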
PROCEDURE ComputeDHat;
{Compute dHat, where dHat(m) = d(m) - dBar, m = ms...md-1, and }
{dBar = (1/(md-ms)) * sum(d(m), m = ms...md-1), the mean of the data.}
{Note: dHat(m) replaces d(m), m = ms...md-1, on output. }
VAR
m: integer;
BEGIN
dBar := zero;
FOR m := ms TO md - 1 DO
dBar := dBar + d[m];
dBar := dBar / (md - ms);
FOR m := ms TO md - 1 DO
d[m] := d[m] - dBar
END; {ComputeDHat}
PROCEDURE SetupMatrices;
{compute the elements of the gl vector and KM matrix, }
{where gl = KM GM is to be solved for GM. }
VAR
j, k, m: integer;
BEGIN
FOR j := 1 TO N DO BEGIN
gl[j, 1] := zero;
FOR m := ms TO md - 1 DO
gl[j, 1] := gl[j, 1] + d[m] * hA[j, m];
FOR k := j TO N DO BEGIN
KM[j, k] := zero;
FOR m := ms TO md - 1 DO
KM[j, k] := KM[j, k] + hA[j, m] * hA[k, m];
KM[k, j] := KM[j, k]
END {k}
END; {j}
END; {SetupMatrices}
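SetupMatrices builds the normal-equation quantities of the least-squares problem: with H the N x (md-ms) matrix of mean-subtracted kernels hHat and d the mean-subtracted data vector, gl = H d and KM = H trans(H), so that gl = KM GM. A NumPy sketch (illustrative stand-in data, not the thesis routine):

```python
# Sketch of the normal equations formed by SetupMatrices:
# gl[j] = sum_m d[m]*hHat[j,m],  KM[j,k] = sum_m hHat[j,m]*hHat[k,m].
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 16))  # stand-in for the kernel rows hHat[k, m]
d = rng.standard_normal(16)       # stand-in for the data vector dHat

gl = H @ d     # right-hand side of gl = KM GM
KM = H @ H.T   # symmetric, positive semidefinite by construction

assert np.allclose(KM, KM.T)
```

Because KM is formed as H trans(H), it is symmetric (the Pascal code exploits this by computing only the upper triangle and mirroring it) and frequently near-singular, which is exactly why SVD is applied to it next.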
PROCEDURE DoSVDCMPRS;
BEGIN
SVDCMPRS(KM, N, N, N, gl, N, 1, s, fail);
END; {DoSVDCMPRS}
PROCEDURE ComputeGM;
{Compute the model coefficients GM(1)..GM(N).}
{trans(U)*g stored in gl[] }
{V stored in KM[] }
{S stored in s[x,1] (trans(S) = S) }
{GM = V*(S+)*trans(U)*g, where S+ is the pseudoinverse of S.}
{Compute p(k): store in s[k,3]. }
VAR
j, k, m: integer;
BEGIN
FOR k := 1 TO N DO BEGIN
s[k, 2] := zero;
IF s[k, 1] <> zero THEN
s[k, 3] := gl[k, 1] / s[k, 1]
ELSE
s[k, 3] := zero
END; {k}
{compute V*(S+)*trans(U)*g for the possible solutions GMk, k = 1,...,N}
FOR k := 1 TO N DO BEGIN
FOR j := 1 TO N DO
GM[j, k] := zero;
FOR j := 1 TO k DO
FOR m := 1 TO N DO
GM[m, k] := GM[m, k] + s[j, 3] * KM[m, j]
END; {k}
END; {ComputeGM}
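ComputeGM forms, for each k, the truncated-SVD solution that keeps only the k largest singular values: GM_k = sum over j = 1..k of ((u_j . g) / s_j) v_j. A NumPy sketch of the same partial sums (illustrative, not the Pascal SVDCMPRS interface):

```python
# Sketch of the truncated-SVD candidate solutions computed by ComputeGM.
import numpy as np

def truncated_svd_solutions(KM, gl):
    U, s, Vt = np.linalg.svd(KM)       # s is sorted largest-first
    coeffs = (U.T @ gl) / s            # (u_j . gl) / s_j
    # row j of coeffs[:, None] * Vt is coeffs[j] * v_j; cumulative sums
    # give the solutions using the first k singular triplets.
    return np.cumsum(coeffs[:, None] * Vt, axis=0).T

KM = np.array([[4.0, 1.0], [1.0, 3.0]])
gl = np.array([1.0, 2.0])
sols = truncated_svd_solutions(KM, gl)  # column k-1 keeps k singular values
```

The last column (k = N) reproduces the ordinary solution of KM GM = gl; the earlier columns trade fidelity for stability by discarding the smallest singular values, which is the central idea evaluated in this thesis.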
PROCEDURE ComputeBaselines;
VAR
j, k, m: integer;
BEGIN
FOR k := 1 TO N DO BEGIN
B[k] := zero;
FOR m := ms TO md - 1 DO BEGIN
cA[m, k] := zero;
FOR j := 1 TO N DO
cA[m, k] := cA[m, k] + GM[j, k] * (hA[j, m] + hb[j]);
B[k] := B[k] + d[m] + dBar - cA[m, k]
END; {m}
B[k] := B[k] / (md - ms)
END {k}
END; {ComputeBaselines}
PROCEDURE ComputeResiduals;
VAR
k, m: integer;
FUNCTION RunStat (rd: DataArray): real;
VAR
m: integer;
n1: real; {number of positive signs}
n2: real; {number of negative signs}
u: real; {number of runs}
oldSign: integer;
mu: real; {mean of u}
sigma: real; {s.d. of u}
sigNum: real; {numerator of sigma^2}
sigDen: real; {denominator of sigma^2}
FUNCTION Sign (x: real): integer;
BEGIN
IF x >= 0.0 THEN
Sign := +1
ELSE
Sign := -1
END; {Sign}
BEGIN
{count positive and negative signs and number of runs}
n1 := 0.0;
n2 := 0.0;
u := 1.0;
oldSign := Sign(rd[1]);
FOR m := ms TO md - 1 DO BEGIN
IF rd[m] >= 0.0 THEN
n1 := n1 + 1.0
ELSE
n2 := n2 + 1.0;
IF Sign(rd[m]) <> oldSign THEN BEGIN
u := u + 1.0;
oldSign := Sign(rd[m])
END
END;
mu := 2.0 * n1 * n2 / (n1 + n2) + 1.0;
sigNum := 2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2);
sigDen := sqr(n1 + n2) * (n1 + n2 - 1.0);
sigma := sqrt(sigNum / sigDen);
RunStat := (u - mu + 0.5) / sigma
END; {RunStat}
BEGIN
FOR k := 1 TO N DO BEGIN
R[k] := zero;
FOR m := ms TO md - 1 DO BEGIN
rd[m] := d[m] + dBar - B[k] - cA[m, k];
R[k] := R[k] + sqr(rd[m])
END; {m}
R[k] := R[k] / (md - ms);
RS[k] := RunStat(rd);
END {k}
END; {ComputeResiduals}
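The runs statistic computed by RunStat follows the standard runs test: with n1 positive residuals, n2 negative residuals, and u sign runs, mu = 2*n1*n2/(n1+n2) + 1 and sigma^2 = 2*n1*n2*(2*n1*n2 - n1 - n2) / ((n1+n2)^2 * (n1+n2-1)), and the continuity-corrected statistic is (u - mu + 0.5)/sigma. A Python sketch (illustrative re-implementation):

```python
# Sketch of the runs statistic used to judge residual randomness.
import math

def run_stat(residuals):
    signs = [1 if r >= 0.0 else -1 for r in residuals]
    n1 = float(signs.count(1))    # positive signs
    n2 = float(signs.count(-1))   # negative signs
    u = 1.0 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)  # runs
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    sigma = math.sqrt(2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
                      / ((n1 + n2) ** 2 * (n1 + n2 - 1.0)))
    return (u - mu + 0.5) / sigma
```

A value near zero indicates the residual signs alternate about as often as expected for random noise; systematically low run counts (few sign changes) signal a poorly fitting model.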
PROCEDURE ComputeParameterErrors;
VAR
j, k: integer;
usum: real;
