
Citation 
 Permanent Link:
 http://digital.auraria.edu/AA00003406/00001
Material Information
 Title:
 Simulations of prostate biopsy methods
 Creator:
 Pellish, Catherine Colby
 Publication Date:
 1997
 Language:
 English
 Physical Description:
 vi, 70 leaves : illustrations ; 29 cm
Subjects
 Subjects / Keywords:
 Prostate  Biopsy  Computer simulation ( lcsh )
 Genre:
 bibliography ( marcgt )
theses ( marcgt ) nonfiction ( marcgt )
Notes
 Bibliography:
 Includes bibliographical references (leaf 70).
 General Note:
 Submitted in partial fulfillment of the requirements for the degree, Master of Science, Applied Mathematics.
 General Note:
 Department of Mathematical and Statistical Sciences
 Statement of Responsibility:
 by Catherine Colby Pellish.
Record Information
 Source Institution:
 University of Colorado Denver
 Holding Location:
 Auraria Library
 Rights Management:
 All applicable rights reserved by the source institution and holding location.
 Resource Identifier:
 37831972 ( OCLC )
ocm37831972
 Classification:
 LD1190.L622 1997m .P45 ( lcc )

SIMULATIONS OF PROSTATE
BIOPSY METHODS
by
Catherine Colby Pellish
B.S.E.E., Marquette University, 1985
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Applied Mathematics
1997
This thesis for the Master of Science
degree by
Catherine Pellish
has been approved
by
William L. Briggs
James E. Koehler
Weldon A. Lodwick
Date
Pellish, Catherine Colby (M.S., Applied Mathematics)
Simulations of Prostate Biopsy Methods
Thesis directed by Associate Professor William L. Briggs
Abstract
An accepted practice in screening for prostate cancer involves a needle
core biopsy of the prostate gland, which can provide information regarding
whether, and how much, cancer is present in a gland. This paper documents several
investigations into prostate gland biopsy techniques. The first phase of study
involves a geometric model of a prostate gland containing one to three tumors.
This mathematical model of the gland is then used to simulate various
biopsy techniques and compare the resulting data. Secondly, the best biopsy
procedure, as determined from the geometric model, is simulated on actual
specimen data which have been digitized. These specimen data are also used
for simulation of the six random systematic core biopsy technique (SESCB)
currently in clinical use. The results of the geometric model are compared
to the results of the simulation on actual data. Finally, the geometric model
is used in another series of simulations that investigate the number of needle
samples needed to estimate the tumor to gland volume ratio.
This abstract accurately represents the content of the candidate's thesis. I
recommend its publication.
Signed ______________________
William L. Briggs
ACKNOWLEDGEMENTS
I would like to sincerely thank a number of people who consistently
provided me with their support, encouragement and guidance as I pursued the
completion of this thesis. Dr. Bill Briggs, my advisor, served as a constant
source of insight and motivation, as well as providing considerable direction
throughout this process. I am also grateful for the time spent with Dr. Jim
Koehler who had to teach me the finer points of statistics again and again.
My thanks to both of these professors for proving to be excellent academic
sources. I also would like to thank Norm LeMay who, out of the generosity
of his heart and his need for a free lunch, assisted me in running the ANOVA
analysis which this thesis required.
Finally, I must thank my family, Mark, Eric and Corinne, for encouraging
me and making me laugh through every crisis.
CONTENTS
Chapter
1 Introduction
1.1 Clinical Prostate Biopsy Analysis
1.2 Summary of Mathematical Methods
2 The Geometric Model
2.1 Geometric Model of gland and tumor
2.2 Simulations
2.3 Statistical Analysis of Results
2.4 Simulation Results
2.4.1 Applying the ANOVA to the Biopsy Simulation Data
2.4.2 ANOVA Mechanics
2.4.3 Residuals
2.4.4 The Null and Alternate Hypotheses
2.4.5 Are the Main Effects all Equal?
2.4.6 Recognizing Interaction between Factors
2.4.7 Clinical Distribution of Tumors
3 Digitized Specimen Data
3.1 Summary of Software Tool
3.2 Specific Algorithms
3.2.1 Locating the Apex
3.2.2 Establishing Needle Positions
3.3 Simulations
3.4 Geometric Model vs Clinical Model
3.5 Optimal Technique vs SESCB
4 Geometric Model Volume Estimates
4.1 Tumor Volume Estimates
4.1.1 One-Dimensional Analysis Line Model
4.1.2 Two-Dimensional Strip Model
4.1.3 Three-Dimensional Cylinder Model
4.2 Experiment Setup
4.3 Results
4.4 Interactive Utility
Appendix
A ANOVA Definitions
1. Introduction
1.1 Clinical Prostate Biopsy Analysis
Currently the standard method of determining whether a given prostate
gland is cancerous involves two procedures. The first is the prostate-specific
antigen (PSA) test, which measures the level of antigens in the patient's blood,
a high level indicating a higher possibility of cancerous tissue. The second
procedure is the needle biopsy, which is carried out if the PSA test so indicates.
The clinician conducts this biopsy by inserting a needle tool, equipped with
ultrasound capabilities, into the patient's rectum. The gland is located and
the urologist fires three needles into the right lobe of the gland and three
needles into the left lobe at approximately symmetric positions. The left-right
division of the gland is determined by the position of the urethra in the
gland. This physical landmark is used as the visual dividing line, enabling
clinicians to execute the biopsy in a systematic manner. The needle tool is
rotated to the left or right depending on the targeted lobe. This rotation
corresponds to the angle $\phi$. After this
slight rotation, the needles are inserted at a second independent angle, referred
to as $\theta$. The choice of a six-needle biopsy is based on the six random systematic
core biopsies (SESCB) method developed by Hodge et al. [1] and currently
thought to achieve the best detection rates.
The results from this diagnostic biopsy are then analyzed in order to
determine the best treatment plan for the patient. There are several factors
that help the urologist choose the optimal treatment plan. The first factor is
obviously whether the biopsy shows any tumor cells at all. According to the
Hodge study, 96% of the 83 men diagnosed with cancer had the cancer detected
by SESCB. However, as investigated by Daneshgari et al [2], in prostate glands
with low tumor volume, the SESCB fails to achieve such a high percentage of
detection. This study concluded that an improved biopsy strategy may be
needed in detection of CaP (carcinoma of the prostate) in patients with low
volume cancer. Secondly, the volume of the tumor itself is a deciding factor
in determining treatment. Thirdly, the location of the tumor, specifically if
the tumor penetrates the capsule of the gland, can define a specific treatment
plan. Some of this information is available from a single needle-core biopsy;
more information is gleaned from successive, strategically placed biopsies.
1.2 Summary of Mathematical Methods
As an aid in understanding this problem, as well as researching ways
to improve diagnosis, two methods of analysis are undertaken. The first
method relies on a geometric model of the prostate gland containing one to
three tumors. Various biopsy methods are simulated with this mathematical
model and results are tabulated. The second method involves running the
same biopsy simulations on actual prostate glands which have been digitized
and stored as three-dimensional objects in a computer. The experimental
results from these two methods are then compared. All of the simulations were
executed using software created for this purpose primarily by this author,
although the skeletons of these software tools were engineered during the Spring
1995 Math Clinic on this topic by several participants. The simulations are
written in C and C++, running on a UNIX-based computer. They are extensively
documented and flexible enough to be useful in a variety of experiments
within this realm of research.
2. The Geometric Model
2.1 Geometric Model of gland and tumor
An actual prostate gland is about the size of a walnut with volumes
ranging from 22 cc to 61 cc [3]. The geometry of an ellipsoid closely models
this gland and any tumors present within it. Therefore, an ellipsoid of the
form
\[ \frac{x^2}{A^2} + \frac{y^2}{B^2} + \frac{z^2}{C^2} = 1 \]
is used to represent the prostate gland. Ellipsoids are also used to represent
each of the tumors. The dimensions of the gland, A, B, and C, are chosen
randomly in the following experimentally determined ranges:
3.0 cm < A < 4.8 cm
3.8 cm < B < 4.6 cm
3.8 cm < C < 5.2 cm
22 cc < [gland volume] < 61 cc.
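The random choice of gland dimensions can be sketched as follows. This is a minimal illustration, not the thesis software: the uniform distribution, the function names and the rejection step are all assumptions. The sketch also assumes that A, B and C are full axis lengths, so that the semi-axes are A/2, B/2 and C/2; read as semi-axes, the quoted ranges would give volumes well above the stated 61 cc limit.

```python
import math
import random

# Ranges quoted in the text (cm) and the admissible gland volume (cc).
A_RANGE = (3.0, 4.8)
B_RANGE = (3.8, 4.6)
C_RANGE = (3.8, 5.2)
VOLUME_RANGE = (22.0, 61.0)


def ellipsoid_volume(a_len, b_len, c_len):
    """Volume of an ellipsoid given its full axis lengths."""
    # ASSUMPTION: A, B, C are full axis lengths, so the semi-axes are halves.
    return (4.0 / 3.0) * math.pi * (a_len / 2) * (b_len / 2) * (c_len / 2)


def sample_gland(rng):
    """Draw (A, B, C) from the quoted ranges, rejecting out-of-range volumes."""
    while True:
        # ASSUMPTION: uniform sampling; the text only says "chosen randomly".
        dims = (
            rng.uniform(*A_RANGE),
            rng.uniform(*B_RANGE),
            rng.uniform(*C_RANGE),
        )
        if VOLUME_RANGE[0] < ellipsoid_volume(*dims) < VOLUME_RANGE[1]:
            return dims


if __name__ == "__main__":
    gland = sample_gland(random.Random(0))
    print(gland, ellipsoid_volume(*gland))
```

Under the full-axis assumption every draw from the three ranges already satisfies the volume constraint, so the rejection step serves only as a safeguard.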
The prostate is divided into 3 zones: the peripheral, the central and
the transition region. The peripheral zone comprises approximately 70% of
the mass of the prostate gland. It is located in the lower area of the gland,
closest to the rectum. This region is the site of origin of most carcinomas [3].
The central region makes up approximately 25% of the glandular mass and
is resistant to both carcinoma and inflammation [3]. The transition region
contains the remaining 5% of prostate gland tissue and can be the site of some
cancers. Figure 2.1 shows these regions of the prostate gland. Based on this
clinical information, the software-generated tumors are located in the lower
part of the elliptical gland model to simulate tumors residing in the peripheral
zone. Figure 2.2 depicts the geometrical gland and tumor model in the xyz
system. Since the gland model is centered at the origin, the $y$-coordinate of
the tumor center, $y_c$, is always negative in order to place the tumor in the
peripheral zone of the gland. However, other distributions of $y_c$ could be used
to improve the model.
Tumors are modeled by an equation of the form
\[ \frac{(x - x_c)^2}{a^2} + \frac{(y - y_c)^2}{b^2} + \frac{(z - z_c)^2}{c^2} = 1, \]
where $x_c$, $y_c$ and $z_c$ specify the center of the tumor.
The biopsy needle is modeled as a line with the parametric equations
\begin{align*}
x(t) &= x_0 + t \sin\theta \sin\phi, \\
y(t) &= y_0 + t \sin\theta \cos\phi, \\
z(t) &= z_0 + t \cos\theta,
\end{align*}
where $x_0$, $y_0$, and $z_0$ are the coordinates of the entry point of the needles
(Figure 2.3 and Figure 2.4). The angle $\phi$
determines a plane. The angle $\theta$ is then assumed to remain in this plane and
is measured from the $z$-axis. From these definitions, the parametric equations
for the line are determined. The parameter $t$ measures the length of the needle.
Figure 2.1. The peripheral (PZ), central (CZ) and transition (TZ) regions
divide the prostate gland into 3 major zones.
Figure 2.2. The gland and tumor are modeled by ellipsoids in the xyz
coordinate system.
Figure 2.3. This figure of the xy plane and needle illustrates
measurement of $\phi$.
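Substituting the parametric equations of the needle into the equation for the
tumor, it is possible to determine values of $t$ corresponding to an intersection.
The equation of the tumor is
\[ \frac{(x(t) - x_c)^2}{a^2} + \frac{(y(t) - y_c)^2}{b^2} + \frac{(z(t) - z_c)^2}{c^2} = 1. \]
Figure 2.4. This figure of the yz plane and needle illustrates
measurement of $\theta$.
Replacing $x(t)$, $y(t)$ and $z(t)$ by the parametric equations of the needle
gives
\begin{align*}
& t^2 \left( \frac{\sin^2\theta \sin^2\phi}{a^2} + \frac{\sin^2\theta \cos^2\phi}{b^2} + \frac{\cos^2\theta}{c^2} \right) \\
& + t \left( \frac{2(x_0 - x_c) \sin\phi \sin\theta}{a^2} + \frac{2(y_0 - y_c) \sin\theta \cos\phi}{b^2} + \frac{2(z_0 - z_c) \cos\theta}{c^2} \right) \\
& + \left( \frac{(x_0 - x_c)^2}{a^2} + \frac{(y_0 - y_c)^2}{b^2} + \frac{(z_0 - z_c)^2}{c^2} \right) = 1. \tag{2.1}
\end{align*}
If the discriminant $(B^2 - 4AC)$ is positive, two real roots exist. In this case
we have
\begin{align*}
A &= \frac{\sin^2\theta \sin^2\phi}{a^2} + \frac{\sin^2\theta \cos^2\phi}{b^2} + \frac{\cos^2\theta}{c^2}, \\
B &= \frac{2(x_0 - x_c) \sin\phi \sin\theta}{a^2} + \frac{2(y_0 - y_c) \sin\theta \cos\phi}{b^2} + \frac{2(z_0 - z_c) \cos\theta}{c^2}, \\
C &= \frac{(x_0 - x_c)^2}{a^2} + \frac{(y_0 - y_c)^2}{b^2} + \frac{(z_0 - z_c)^2}{c^2} - 1.
\end{align*}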
If real roots $t_1$ and $t_2$ exist, they give the points where the tumor
ellipsoid and the line intersect. If these values are greater than 0 and less than
the actual needle length, the needle has intersected the tumor. The amount
of tumor extracted by the needle is proportional to the difference between the
two roots of the quadratic, $|t_2 - t_1|$. By comparing the two roots, an estimate
of the volume of the tumor that is contained in the needle can be made. If
real roots do not exist, the needle does not intersect the tumor ellipsoid and
no tumor information is gained by that needle.
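The intersection test described above can be sketched in a few lines. This is a minimal illustration rather than the thesis code: the function names and argument layout are hypothetical, and the angles are assumed to be supplied in radians. The coefficients are those of the quadratic in $t$ obtained by substituting the needle equations into the tumor ellipsoid.

```python
import math


def needle_tumor_roots(entry, theta, phi, center, semi_axes):
    """Return the roots (t1, t2) of the needle/tumor quadratic, or None."""
    x0, y0, z0 = entry
    xc, yc, zc = center
    a, b, c = semi_axes
    # Direction of the needle, taken from the parametric equations.
    dx = math.sin(theta) * math.sin(phi)
    dy = math.sin(theta) * math.cos(phi)
    dz = math.cos(theta)
    # Coefficients of qa*t^2 + qb*t + qc = 0.
    qa = dx**2 / a**2 + dy**2 / b**2 + dz**2 / c**2
    qb = 2.0 * (
        (x0 - xc) * dx / a**2 + (y0 - yc) * dy / b**2 + (z0 - zc) * dz / c**2
    )
    qc = (
        (x0 - xc) ** 2 / a**2
        + (y0 - yc) ** 2 / b**2
        + (z0 - zc) ** 2 / c**2
        - 1.0
    )
    disc = qb**2 - 4.0 * qa * qc
    if disc <= 0.0:
        return None  # the needle line misses (or only grazes) the ellipsoid
    root = math.sqrt(disc)
    return ((-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa))


def sampled_length(roots, needle_length):
    """Length of needle track inside the tumor; 0.0 counts as a miss."""
    if roots is None:
        return 0.0
    t1, t2 = roots  # t1 <= t2 by construction
    # Criterion from the text: both roots lie between 0 and the needle length.
    if t1 > 0.0 and t2 < needle_length:
        return t2 - t1
    return 0.0
```

For example, a needle entering at (0, 0, -2) with $\theta = 0$ passes through a unit sphere centered at the origin between $t = 1$ and $t = 3$, so a needle of length 5 samples a track of length 2.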
In this analysis, each biopsy procedure was simulated on 1000 different
gland models and the number of times a tumor was detected per procedure
was recorded. This method does not differentiate between one or more needles
detecting the tumor. It simply records a hit or miss per biopsy procedure.
In addition, an estimate of the tumor volume is made whenever a tumor is
detected.
2.2 Simulations
Since a fundamental goal of any biopsy is to determine whether or
not the gland contains cancerous cells, the first series of simulations is intended
to compare the detection rate of several biopsy techniques. The detection rate
is defined as the ratio of the number of times a biopsy procedure detects a tumor
to the total number of biopsies conducted. A set of 54 different biopsy procedures
is simulated with variation in the following parameters: number of needles,
offset between needles in the $z$ direction, $\theta$, and $\phi$.
The distance in the z direction between needles can be a relative
spacing based on the gland dimension in the z direction or an absolute spacing
of 1 cm between each needle. The first method is referred to as relative
spacing since it depends on the gland size and separates the needles by equal
distance. The second is referred to as the absolute spacing and has its basis
in the SESCB procedure.
As a means of clarification, Figures 2.5 and 2.6 illustrate the analysis
of a single specimen and the execution of the entire experiment. Each of the
54 biopsy procedures is simulated on 1000 different gland models. The random
number generator is seeded once for each series of 1000 simulations using a
specific biopsy technique. Prior to the next technique, the random number
generator is reseeded with the same number, thereby yielding the identical set
of 1000 prostate models. This ensures that each of the biopsies is conducted
on the same set of 1000 simulated glands. The detection rate is determined
for each of these procedures and the results of the simulation are documented
in Table 2.1.
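The reseeding scheme can be illustrated with a short sketch. The names, the seed value and the placeholder gland generator below are hypothetical; the only point shown is that creating the generator from the same seed before each procedure reproduces the same sequence of glands, provided the gland generation consumes the same number of random draws each time.

```python
import random

SEED = 12345     # hypothetical seed value
N_GLANDS = 1000  # glands per biopsy procedure, as in the text


def detection_rate(procedure, simulate_trial, seed=SEED, n_glands=N_GLANDS):
    """Fraction of simulated glands in which `procedure` detects a tumor."""
    rng = random.Random(seed)  # reseed: every procedure sees the same glands
    hits = 0
    for _ in range(n_glands):
        # Placeholder for generating one gland/tumor model from the generator.
        gland = rng.random()
        if simulate_trial(procedure, gland):
            hits += 1
    return hits / n_glands


if __name__ == "__main__":
    # Toy stand-in for a biopsy: "detects" when the gland value is below a
    # threshold attached to the procedure. Purely illustrative.
    toy_trial = lambda procedure, gland: gland < procedure
    for threshold in (0.2, 0.3):
        print(threshold, detection_rate(threshold, toy_trial))
```

Because both toy procedures are evaluated on an identical set of glands, any difference in their detection rates is due to the procedure and not to the random sample.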
Figure 2.5. This flow chart depicts the top-level algorithm
for modeling a single biopsy with several needles.
Figure 2.6. This flow chart depicts the simulation process
for the entire simulation; each biopsy procedure is
simulated on 1000 geometric gland models.
2.3 Statistical Analysis of Results
In order to interpret the output from the simulations legitimately, a
statistical tool is needed. First, we must determine whether or not the various
biopsy settings influence the observed detection rate. In other words, is there a
relationship between the settings of any one or combination of the four factors
(number of needles, spacing, $\theta$ and $\phi$) and the detection rate or are the results
completely random, therefore implying that the biopsy specification does not
determine the detection rate? We need a mathematically sound method to
compare the detection rates provided by the simulation and to infer some
conclusions. The statistical model known as Analysis of Variance (ANOVA)
was used to compare the population means between various treatments, thus
resulting in a statistically valid conclusion. This model can be employed to
determine whether the various factors interact and which factors have the most
impact on the outcome.
In order to describe the ANOVA model, a few definitions are required.
(1) Factors are the independent variables that are under investigation.
In this instance, the biopsy parameters (number of needles, spacing
method, $\theta$ and $\phi$) are the factors.

Factor    Number of Needles    Spacing Method    $\theta$    $\phi$
Levels    4                    Absolute          30          30
          6                    Relative          45          45
          8                                      60          60

(2) Factor levels are the values that each of the factors can take on during
a single simulation. As shown in the list of biopsy simulation factors
and levels, the factors do not all have the same number of factor levels.
The factor Spacing Method only has two factor levels, whereas the
other three factors each have three factor levels.
(3) A treatment is a particular combination of levels of each of the factors
involved in the experiment, where an experiment is the simulation
of the treatment on 1000 geometric specimens. In this example, a
treatment refers to a biopsy with specific settings (for example, 4 needles,
absolute spacing, $\theta$ = 45 and one of the three levels of $\phi$). There
are 54 different treatments and therefore 54 different experiments,
corresponding to all the combinations of the levels of the four factors.
(4) A trial is defined to be a simulation of one treatment on one geometric
model. The outcome of a trial is either 1, the biopsy procedure
detected the tumor, or 0, the tumor remained undetected. The outcome
of the experiment is the detection rate achieved by a specific
treatment simulated on 1000 geometric specimens. In other words, the
outcome of the experiment is the ratio of the number of specimens in which
a tumor is detected to the total number of specimens simulated, and
is referred to as the outcome for the remainder of this thesis.
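The set of treatments is simply the Cartesian product of the factor levels. A minimal sketch follows, with hypothetical dictionary keys and with the angle levels written as the plain numbers that appear in Table 2.1:

```python
from itertools import product

# Factor levels as listed in the text; the key names are illustrative.
FACTOR_LEVELS = {
    "needles": (4, 6, 8),
    "spacing": ("absolute", "relative"),
    "theta": (30, 45, 60),
    "phi": (30, 45, 60),
}


def all_treatments(levels=FACTOR_LEVELS):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


treatments = all_treatments()
print(len(treatments))  # 3 * 2 * 3 * 3 = 54
```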
2.4 Simulation Results
For each of the 54 treatments, the simulation is conducted on 1000
different gland models. The following table summarizes the treatment
parameters as well as the results:

              Treatment Parameters                                 Outcome
Experiment    Number of Needles    Spacing Method    $\theta$    $\phi$    Detection Rate
 1            4                    Relative          45          45        0.252
 2            6                    Relative          45          45        0.307
 3            8                    Relative          45          45        0.335
 4            4                    Absolute          45          45        0.263
 5            6                    Absolute          45          45        0.293
 6            8                    Absolute          45          45        0.298
 7            4                    Relative          60          45        0.267
 8            6                    Relative          60          45        0.341
 9            8                    Relative          60          45        0.369
10            4                    Absolute          60          45        0.270
11            6                    Absolute          60          45        0.320
12            8                    Absolute          60          45        0.339
13            4                    Relative          30          45        0.196
14            6                    Relative          30          45        0.225
15            8                    Relative          30          45        0.255
16            4                    Absolute          30          45        0.207
17            6                    Absolute          30          45        0.221
18            8                    Absolute          30          45        0.221
19            4                    Relative          45          60        0.200
20            6                    Relative          45          60        0.234
21            8                    Relative          45          60        0.268
22            4                    Absolute          45          60        0.211
23            6                    Absolute          45          60        0.225
24            8                    Absolute          45          60        0.228
25            4                    Relative          60          60        0.191
26            6                    Relative          60          60        0.254
27            8                    Relative          60          60        0.268
28            4                    Absolute          60          60        0.209
29            6                    Absolute          60          60        0.240
30            8                    Absolute          60          60        0.246
31            4                    Relative          30          60        0.172
32            6                    Relative          30          60        0.194
33            8                    Relative          30          60        0.219
34            4                    Absolute          30          60        0.188
35            6                    Absolute          30          60        0.197
36            8                    Absolute          30          60        0.197
37            4                    Relative          45          30        0.260
38            6                    Relative          45          30        0.316
39            8                    Relative          45          30        0.341
40            4                    Absolute          45          30        0.264
41            6                    Absolute          45          30        0.305
42            8                    Absolute          45          30        0.316
43            4                    Relative          60          30        0.283
44            6                    Relative          60          30        0.351
45            8                    Relative          60          30        0.385
46            4                    Absolute          60          30        0.279
47            6                    Absolute          60          30        0.346
48            8                    Absolute          60          30        0.372
49            4                    Relative          30          30        0.210
50            6                    Relative          30          30        0.247
51            8                    Relative          30          30        0.273
52            4                    Absolute          30          30        0.225
53            6                    Absolute          30          30        0.245
54            8                    Absolute          30          30        0.247

Table 2.1. The results from the 54 geometric model
experiments are displayed.
2.4.1 Applying the ANOVA to the Biopsy Simulation Data
The biopsy simulation is a multifactored system, in which the four
parameters (number of needles, spacing, $\theta$ and $\phi$) individually and perhaps
in some combinations may have a measurable effect on the detection rate.
Therefore a factor effects model is used in order to determine the impact
of and interactions between these four parameters. This biopsy simulation
is considered a complete factorial study since all possible combinations of the
four parameters were simulated and evaluated. The indices $i, j, k, l$ refer to the
levels of the factors number of needles, spacing method, $\theta$ and $\phi$, respectively.
In this multifactored system, a true overall mean, $\mu$, which is equivalent
to the true overall detection rate, is assumed to exist. The entire simulation
results in 54 observed detection rates, $p_{ijkl}$, each of which indicates the
observed detection rate for a given experiment. This set of 54 observed detection
rates is used in the ANOVA to determine estimated factor effects and an
estimated overall mean, which are used in the factor effects model. The factor
effects model is used to predict a detection rate, a probability of detection,
$\hat{p}_{ijkl}$, given the levels of the four factors.
A factor level mean is the average detection rate for a group of
treatments that have one common factor level held constant while all others
vary. For example, all outcomes from experiments with Number of Needles = 6
are averaged to yield the factor level mean for the factor Number of Needles
at the level 6. The overall mean, $\mu$, is simply the average outcome
of all experiments. The difference between each factor level mean and the
overall mean yields the main effect for that factor level. Because this model
has 4 factors, each with either 2 or 3 levels, the following main effects are
designated.
$\alpha_i$: the main effect for the factor Number of Needles at each of its
levels (4,6,8): $1 \le i \le 3$.
$\beta_j$: the main effect for the factor Spacing Method at each of its levels
(0,1): $1 \le j \le 2$.
$\gamma_k$: the main effect for the factor $\theta$ at each of its levels (30,45,60):
$1 \le k \le 3$.
$\delta_l$: the main effect for the factor $\phi$ at each of its levels (30,45,60):
$1 \le l \le 3$.
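As an illustration of these definitions, the sketch below computes factor level means and main effects from a table of outcomes. The function name and data layout are hypothetical. The data are only experiments 1 through 6 of Table 2.1, on the untransformed detection rates, so the resulting numbers illustrate the arithmetic and are not the effects estimated in the full analysis.

```python
from statistics import mean

# Experiments 1-6 of Table 2.1, keyed by (number of needles, spacing method).
OUTCOMES = {
    (4, "relative"): 0.252,
    (6, "relative"): 0.307,
    (8, "relative"): 0.335,
    (4, "absolute"): 0.263,
    (6, "absolute"): 0.293,
    (8, "absolute"): 0.298,
}


def main_effects(outcomes, factor_index):
    """Main effect of each level: factor level mean minus overall mean."""
    overall = mean(outcomes.values())
    by_level = {}
    for treatment, rate in outcomes.items():
        by_level.setdefault(treatment[factor_index], []).append(rate)
    return {level: mean(rates) - overall for level, rates in by_level.items()}


if __name__ == "__main__":
    print(main_effects(OUTCOMES, 0))  # effects for Number of Needles
    print(main_effects(OUTCOMES, 1))  # effects for Spacing Method
```

In a balanced design such as this one, the main effects of a factor sum to zero across its levels, which provides a quick check on the computation.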
A factor at a particular level may influence another factor either by
inhibiting or enhancing its impact. Because of these interactions between
factors, the interaction effects are included in the model. Pairwise interaction
effects are a measure of the combined effect of two factors, across the different
levels, minus the main effects of these factors. We define these two-way effects
as follows.
$(\alpha\beta)_{ij}$: number of needles and spacing method
$(\alpha\gamma)_{ik}$: number of needles and $\theta$
$(\alpha\delta)_{il}$: number of needles and $\phi$
$(\beta\gamma)_{jk}$: spacing method and $\theta$
$(\beta\delta)_{jl}$: spacing method and $\phi$
$(\gamma\delta)_{kl}$: $\theta$ and $\phi$.
Three-way factor effects are a measure of the interaction effect of three factors.
$(\alpha\beta\gamma)_{ijk}$: number of needles, spacing method and $\theta$
$(\alpha\beta\delta)_{ijl}$: number of needles, spacing method and $\phi$
$(\beta\gamma\delta)_{jkl}$: spacing method, $\theta$ and $\phi$
$(\alpha\gamma\delta)_{ikl}$: number of needles, $\theta$ and $\phi$.
The four-way effect is the measure of the interaction effect of all four factors.
$(\alpha\beta\gamma\delta)_{ijkl}$: number of needles, spacing method, $\theta$ and $\phi$.
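Summary of Variables
True overall mean                                 $\mu$
Estimated overall mean                            $\hat{\mu}$
True treatment mean                               $\mu_{ijkl}$
Estimated treatment mean                          $\hat{\mu}_{ijkl}$
Observed treatment detection rate                 $p_{ijkl}$
Transformed observed treatment detection rate     $Y_{ijkl}$
Estimated treatment detection rate                $\hat{p}_{ijkl}$
Transformed estimated treatment detection rate    $\hat{Y}_{ijkl}$
Average observed detection rate                   $\bar{p}$
True main factor level effects                    $\alpha_i$, $\beta_j$, $\gamma_k$, $\delta_l$
Estimated main factor level effects               $\hat{\alpha}_i$, $\hat{\beta}_j$, $\hat{\gamma}_k$, $\hat{\delta}_l$
True two-way effects                              $(\alpha\beta)_{ij}$, $(\alpha\gamma)_{ik}$, $(\alpha\delta)_{il}$, $(\beta\gamma)_{jk}$, $(\beta\delta)_{jl}$, $(\gamma\delta)_{kl}$
Estimated two-way effects                         $(\widehat{\alpha\beta})_{ij}$, $(\widehat{\alpha\gamma})_{ik}$, $(\widehat{\alpha\delta})_{il}$, $(\widehat{\beta\gamma})_{jk}$, $(\widehat{\beta\delta})_{jl}$, $(\widehat{\gamma\delta})_{kl}$
Table 2.2. A list of the variables used in the ANOVA analysis
is displayed.
The factor effects model takes the general form
\begin{align*}
\mu_{ijkl} = {} & \mu + \alpha_i + \beta_j + \gamma_k + \delta_l + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il} + (\beta\gamma)_{jk} + (\beta\delta)_{jl} + (\gamma\delta)_{kl} \\
& + (\alpha\beta\gamma)_{ijk} + (\alpha\beta\delta)_{ijl} + (\beta\gamma\delta)_{jkl} + (\alpha\gamma\delta)_{ikl} + (\alpha\beta\gamma\delta)_{ijkl}.
\end{align*}
The observed outcome, the detection rate for a particular treatment,
as given in Table 2.1, is $p_{ijkl}$ and is the sum of the true mean for that treatment
and a residual term:
\[ p_{ijkl} = \mu_{ijkl} + \varepsilon_{ijkl}. \]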
The goal of the analysis is to formulate a model that predicts the
outcome of a given treatment. Since the true means and true factor effects are
not known, estimates of these terms are determined from the simulation and
used in the model. Estimated values are indicated with the hat notation. The
predicted outcome is represented by the following relationship:
\begin{align*}
\hat{p}_{ijkl} = {} & \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_k + \hat{\delta}_l + (\widehat{\alpha\beta})_{ij} + (\widehat{\alpha\gamma})_{ik} + (\widehat{\alpha\delta})_{il} + (\widehat{\beta\gamma})_{jk} + (\widehat{\beta\delta})_{jl} + (\widehat{\gamma\delta})_{kl} \\
& + (\widehat{\alpha\beta\gamma})_{ijk} + (\widehat{\alpha\beta\delta})_{ijl} + (\widehat{\beta\gamma\delta})_{jkl} + (\widehat{\alpha\gamma\delta})_{ikl} + (\widehat{\alpha\beta\gamma\delta})_{ijkl}.
\end{align*}
In this equation $\hat{p}_{ijkl}$ is the estimated probability of detecting a tumor at the
factor levels indicated by $i, j, k, l$. This probability is predicted by the model
using least squares estimators for the terms in the equation. The probability
of detection is a function of the estimated overall mean, $\hat{\mu}$, and the estimated
effects from the four factors, alone and in combination with one another. Not
all of these effects may be significant. In order to determine which of the
factors do significantly affect the detection rate and therefore belong in the
final model, various means are evaluated. If all the means for a particular
factor (or combination of factors) are equal, varying a factor level does not
add to or subtract from the overall mean and therefore the factor does not
belong in the final model. This equality question is put, not only to each
factor individually, but to all the combinations of factors as well.
2.4.2 ANOVA Mechanics
Use of the ANOVA model is founded on several assumptions:
(1) The outcomes follow a normal probability distribution.
(2) Each distribution has the same variance.
(3) The outcomes for each factor level are independent of the other factor
level outcomes.
With these assumptions in mind, note that the probability distributions of a
factor at each of its levels differ only with respect to the mean [4]. Therefore,
the first step in executing the analysis is to determine if the detection rates
are statistically different. Secondly, if they are different, one of the intents of
the ANOVA model is to determine if the difference between the detection rate
of two or more treatments is sufficient, after examining the variability within
the treatments, to conclude that one treatment does indeed produce a higher
detection rate. In addition, by evaluating the statistical data, conclusions may
be drawn as to how each factor, both independently and within established
interaction groups (pairwise, three-way or four-way), influences the outcome.
2.4.3 Residuals
We define $\bar{p}$ to be the average of all observations. The model states
that $p_{ijkl} = \mu_{ijkl} + \varepsilon_{ijkl}$; therefore the residual term is
$\varepsilon_{ijkl} = p_{ijkl} - \mu_{ijkl}$. Since $\mu_{ijkl}$ is estimated by
$\hat{\mu}_{ijkl}$, the estimated residual term is $e_{ijkl} = p_{ijkl} - \hat{\mu}_{ijkl}$,
the difference between the observed and the estimated average detection rate.
The set of all 54 residuals, $e_{ijkl}$, for all $i, j, k$ and $l$ are evaluated for three
characteristics which indicate whether the fitted data are well suited for the
analysis. These characteristics are:
1. Normality of error terms.
2. Constancy of error variance.
3. Independence of error terms.
Several statistical tests and plots used on the residual data determine
whether any of these assumptions is violated. These tests revealed that
the error variances were not stable, thus violating the second characteristic. A
transformation was employed to preserve the statistical information in the
output, but stabilize the error variances. Since nothing is lost by employing a
transformation and the error variances are stabilized, the detection rate data
$p$ is transformed to $Y$ via the following relationship:
\[ Y = 2 \arcsin(\sqrt{p}). \]
The outcome from these simulations is the detection rate, the ratio of the
number of specimens where a tumor is detected to the total number of specimens.
The arcsine transformation is the most appropriate transformation when the
outcome is a proportion [4]. All ANOVA data referenced from this point on are
transformed unless noted otherwise. The inverse transformation is calculated
at the conclusion of this analysis to get a true estimate of the probability.
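The transformation and its inverse are short enough to state directly. In the sketch below the function names are illustrative; the inverse, $p = \sin^2(Y/2)$, follows from solving $Y = 2 \arcsin(\sqrt{p})$ for $p$ and is valid for $0 \le Y \le \pi$.

```python
import math


def arcsine_transform(p):
    """Variance-stabilizing transform of a proportion p in [0, 1]."""
    return 2.0 * math.asin(math.sqrt(p))


def inverse_arcsine_transform(y):
    """Recover the proportion from a transformed value y in [0, pi]."""
    return math.sin(y / 2.0) ** 2


if __name__ == "__main__":
    rate = 0.252  # detection rate of experiment 1 in Table 2.1
    y = arcsine_transform(rate)
    print(y, inverse_arcsine_transform(y))
```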
2.4.4 The Null and Alternate Hypotheses
A starting point in the ANOVA process is to establish two hypotheses,
a null and an alternate hypothesis. The null hypothesis assumes that all effects
are equal, therefore indicating that specific factor levels do not influence the
outcome. The alternate hypothesis assumes that at least two of the effects are
not the same.
The F-test is used to decide which of these two hypotheses concerning
the data will be accepted. The test consists of computing the ratio of between-effect
variation to within-effect variation. This between-effect variation, which
changes depending on the effect, is called the treatment sum of squares
and is denoted SSA, SSB, SSC, and SSD (see also the Appendix). It is a
measure of the difference between the detection rate of a set of treatments
and the average detection rate over all treatments. The within-effect variation
is called the error sum of squares and is denoted SSE. It is a measure
of the difference between the individual outcome for a given treatment and
the estimated detection rate over that treatment. The error sum of squares
measures variability that is not explained by the SSA, SSB, SSC, or SSD
terms and therefore occurs within the set of treatments. Both of these variation
measurements are evaluated using sum of squares expressions as detailed
in the Appendix. The means of the SSA, SSB, SSC, SSD and SSE terms
are MSA, MSB, MSC, MSD and MSE respectively, and are computed by
dividing by the degrees of freedom, df, associated with each term. This results
in $F = MSA/MSE$ where $MSA = SSA/df_A$ ($MSB = SSB/df_B$, etc.) and
$MSE = SSE/df_E$. Large values of $F$ tend to support the conclusion that all
the effects are not equal ($H_a$), whereas values of $F$ near 1 support the null
hypothesis ($H_0$). In the event that the alternate hypothesis is indicated via
the F-test, the ANOVA also provides the probability of a Type I error. A
Type I error occurs when it is concluded that differences between means
exist when, in fact, they do not (i.e. accept $H_a$ when in fact $H_0$ is true). This
information is given in the column labelled Pr(F) in the ANOVA output in
Table 2.3.
2.4.5 Are the Main Effects all Equal?
Following the general process of establishing null and alternate
hypotheses as described above, a pair of null and alternate hypotheses is stated
for each factor in the biopsy model. The null hypothesis assumes that the
main effects for a given factor at each of its levels are equivalent. The
alternate hypothesis assumes that the main effects differ.
$H_0$: $\alpha_1 = \alpha_2 = \alpha_3$        $H_a$: not all $\alpha_i$ are equal.
       $\beta_1 = \beta_2$                            not all $\beta_j$ are equal.
       $\gamma_1 = \gamma_2 = \gamma_3$               not all $\gamma_k$ are equal.
       $\delta_1 = \delta_2 = \delta_3$               not all $\delta_l$ are equal.
The F-test statistic is applied to determine which hypothesis to accept
in each case. The factor sum of squares for each factor, number of needles,
spacing, $\theta$ and $\phi$, denoted SSA, SSB, SSC and SSD, respectively,
is computed as shown in the Appendix. The mean of each of these factor
sum of squares terms is computed by dividing each term by its associated
degrees of freedom so that $MSA = SSA/df_A$, $MSB = SSB/df_B$, etc.,
as detailed in the Appendix. The test statistic is formed for each hypothesis
in the following manner. To test the effect of the first factor, Number
of Needles, $F = MSA/MSE$; to test the effect of the spacing factor,
$F = MSB/MSE$; to test the effect of $\theta$, $F = MSC/MSE$; and to test the
effect of $\phi$, $F = MSD/MSE$. Accepting the alternate hypothesis means that
a specific setting of the given factor corresponds to a change in detection rate;
thus that factor has an effect on the overall outcome of the biopsy.
                                   Df    Sum of Sq    Mean Sq    F Value     Pr(F)
Main Effects    Needles             2    0.15862      0.07931     607.427    0.0000000
                Spacing             1    0.00498      0.00498      38.209    0.0000011
                $\theta$            2    0.29249      0.14624    1120.073    0.0000000
                $\phi$              2    0.28115      0.14057    1076.661    0.0000000
2-Way Effects   Needles:Spacing     2    0.01641      0.00820      62.846    0.0000000
                Needles:$\theta$    4    0.01444      0.00361      27.653    0.0000000
                Spacing:$\theta$    2    0.00059      0.00029       2.283    0.1206068
                Needles:$\phi$      4    0.00395      0.00098       7.569    0.0002892
                Spacing:$\phi$      2    0.00046      0.00023       1.794    0.1848710
                $\theta$:$\phi$     4    0.02867      0.00716      54.902    0.0000000
Residuals                          28    0.00365      0.00013
Table 2.3. The output from the ANOVA is displayed above. See
Appendix for details of the calculations.
Referring to this ANOVA output, the column of numbers labelled Sum
of Sq refers to the parameters SSA, SSB, SSC and SSD detailed in the
Appendix. The column labelled Mean Sq lists the parameters MSA, MSB,
MSC and MSD. The F Value column lists the F-test outcome for each row (for
example, Needles F Value = MSA/MSE). The larger values in this column tend
to support the alternate hypothesis that the main effect for a given factor
differs across the possible levels for that factor. The final column, Pr(F),
gives the probability of a Type I error. Again, a Type I error occurs if the
alternate hypothesis is concluded when, in fact, the null hypothesis is true.
The row labelled Residuals gives the error degrees of freedom, the SSE and
the MSE for this analysis.
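The way each F Value in Table 2.3 follows from its row can be sketched in a few lines. This is a minimal illustration, not the thesis software: the function name and data layout are assumptions, and the inputs are the rounded figures printed in the table, so the results agree with the printed F values only to within rounding.

```python
def f_statistic(sum_sq, df, sse, df_error):
    """Return (mean square, F) for one ANOVA row.

    mean square = sum of squares / degrees of freedom
    F           = factor mean square / error mean square (MSE)
    """
    mean_sq = sum_sq / df
    mse = sse / df_error
    return mean_sq, mean_sq / mse


# Rounded values copied from Table 2.3 (Residuals row: SSE = 0.00365, 28 df).
SSE, DF_ERROR = 0.00365, 28
ROWS = {"Needles": (0.15862, 2), "Spacing": (0.00498, 1)}


def f_table():
    """Compute (mean square, F) for each row listed in ROWS."""
    return {name: f_statistic(ss, df, SSE, DF_ERROR)
            for name, (ss, df) in ROWS.items()}


if __name__ == "__main__":
    for name, (ms, f) in f_table().items():
        print(f"{name:8s} MS = {ms:.5f}  F = {f:.1f}")
```

Because MSE is recomputed from the rounded SSE, the F values differ from the printed ones in the last digits; the ordering of the factors is unaffected.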
Based on the numbers in the table, each of the four main effects
has a significant effect on the outcome, with the factor θ having the
greatest influence on the detection rate, followed by the factors φ and
Number of Needles. This fact is indicated by the high F value that
corresponds to each of the four factors. The rows labelled with two factor
names (for example, Needles:Spacing) indicate the ANOVA output corresponding
to pairwise interactions and include the sum of squares computed for
each pair of factors. The sums of squares for all of the pairwise
interaction terms (SSAB, SSAC, SSAD, SSBC, SSBD, SSCD) are computed as
detailed in the Appendix. The total treatment sum of squares is SSTR =
SSA + SSB + SSC + SSD + SSAB + SSAC + SSAD + SSBC + SSBD + SSCD.
This sum does not include the sum of squares terms due to the three-way and
four-way interactions because there are not enough degrees of freedom in the
experiment to use the full model.
2.4.6 Recognizing Interaction between Factors
At this point, the F-test has determined that each of the main factor
effects contributes to the overall detection rate. To evaluate the interaction
effects, the F-test is applied again, in this case to determine interaction
between two, three or four factors. Null and alternate hypotheses are
formulated for all possible combinations of factors, and sum of squares
terms are computed for the factor groups and used in each F-test. The
null and alternate hypotheses are constructed for each of the pairwise
interactions.
H0: all $(\alpha\beta)_{ij} = 0$      Ha: not all $(\alpha\beta)_{ij} = 0$
    all $(\alpha\gamma)_{ik} = 0$         not all $(\alpha\gamma)_{ik} = 0$
    all $(\alpha\delta)_{il} = 0$         not all $(\alpha\delta)_{il} = 0$
    all $(\beta\gamma)_{jk} = 0$          not all $(\beta\gamma)_{jk} = 0$
    all $(\beta\delta)_{jl} = 0$          not all $(\beta\delta)_{jl} = 0$
    all $(\gamma\delta)_{kl} = 0$         not all $(\gamma\delta)_{kl} = 0$
All three-way combinations are formed, hypotheses are constructed and F-test
results are evaluated.
H0: all $(\alpha\beta\gamma)_{ijk} = 0$      Ha: not all $(\alpha\beta\gamma)_{ijk} = 0$
    all $(\alpha\beta\delta)_{ijl} = 0$          not all $(\alpha\beta\delta)_{ijl} = 0$
    all $(\alpha\gamma\delta)_{ikl} = 0$         not all $(\alpha\gamma\delta)_{ikl} = 0$
    all $(\beta\gamma\delta)_{jkl} = 0$          not all $(\beta\gamma\delta)_{jkl} = 0$
The null/alternate set of hypotheses is constructed for the four-way
interaction.
H0: all $(\alpha\beta\gamma\delta)_{ijkl} = 0$
Ha: not all $(\alpha\beta\gamma\delta)_{ijkl}$ equal 0
Based on the actual ANOVA results in the preceding table, four of
the pairwise interactions appear strongly significant: Needles:Spacing,
Needles:θ, Needles:φ and θ:φ. The other two pairwise interactions are
included in the final model even though the strength of their significance is
uncertain. The ANOVA was executed once to include all three-way
interactions. Since these interactions proved insignificant, they are not
included in the model. There are not enough degrees of freedom in the
experiment to estimate the residuals and test for the four-way interaction.
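As stated previously, the Y notation indicates the transformed detection
rate (p). At this point the general model, of the form
$$
\begin{aligned}
Y_{ijklm} = {} & \mu + \alpha_i + \beta_j + \gamma_k + \delta_l && \text{(main effects)} \\
& + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il} + (\beta\gamma)_{jk} + (\beta\delta)_{jl} + (\gamma\delta)_{kl} && \text{(pairwise effects)} \\
& + (\alpha\beta\gamma)_{ijk} + (\alpha\beta\delta)_{ijl} + (\alpha\gamma\delta)_{ikl} + (\beta\gamma\delta)_{jkl} && \text{(three-way effects)} \\
& + (\alpha\beta\gamma\delta)_{ijkl} && \text{(four-way effect)} \\
& + \varepsilon_{ijklm} && \text{(residual error)}
\end{aligned}
$$
is reduced to the final model for this analysis:
$$
\hat{Y}_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \delta_l + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il} + (\beta\gamma)_{jk} + (\beta\delta)_{jl} + (\gamma\delta)_{kl}.
$$
This model yields the transformed probability of detection at the given levels
for i, j, k and l.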
Now that the factor effects have been identified, the analysis revolves
around determining the factor levels that result in the highest detection rate.
For this part of the analysis, the tables of means and tables of effects are
evaluated.
Grand Mean     1.072

Needles          4         6         8
Mean           0.999     1.09      1.128

Spacing       Relative   Absolute
Mean           1.082      1.063

θ               30        45        60
Mean           0.9723    1.098     1.147

φ               30        45        60
Mean           1.14      1.1104    0.9724
Table 2.4. The ANOVA tables of means list the transformed values.
                      θ
Needles        30       45       60
   4         0.926    1.027    1.045
   6         0.979    1.113    1.176
   8         1.012    1.152    1.221

                      θ
Spacing        30       45       60
Relative     0.978    1.111    1.157
Absolute     0.967    1.084    1.137

                      φ
Needles        30       45       60
   4         1.054    1.028    0.915
   6         1.161    1.123    0.985
   8         1.205    1.163    1.017

                      φ
Spacing        30       45       60
Relative     1.148    1.118    0.980
Absolute     1.132    1.091    0.965

                  Spacing
Needles      Relative   Absolute
   4          0.987      1.011
   6          1.099      1.080
   8          1.159      1.097

                      φ
θ              30       45       60
  30         1.026    0.978    0.913
  45         1.159    1.139    0.994
  60         1.235    1.196    1.010
Table 2.5. The transformed values of the pairwise means are shown.
Referring to the ANOVA tables of means, the highest numbers in each
category reflect the best setting for a particular factor. On reading through
the tables of means, the conclusion is that a technique of 8 needles, relative
spacing, θ = 60 and φ = 30 yields the highest detection rate. To
corroborate this more fully, the interactions that are deemed significant are
analysed to verify that the main effect is not contradicted by an interaction.
Therefore, the table for Needles:θ is reviewed and it is found that the setting
of 8 needles and θ = 60 again yields the highest mean. The tables for all of
the pairwise combinations are reviewed to determine that the best settings
yield the highest means in the interaction tables just as they did in the main
effect tables. This proves to be the case, so none of the interactions
contradicts the conclusion drawn from the main effect information.
Number of Needles (4, 6, or 8)      $\alpha_1$     $\alpha_2$     $\alpha_3$
Effect                              -0.07329     0.01723      0.05607

Spacing (Relative or Absolute)      $\beta_1$      $\beta_2$
Effect                               0.009612   -0.009612

θ (30, 45, or 60)                   $\gamma_1$     $\gamma_2$     $\gamma_3$
Effect                              -0.1001      0.02519      0.07486

φ (30, 45, or 60)                   $\delta_1$     $\delta_2$     $\delta_3$
Effect                               0.0678      0.03215     -0.09995
Table 2.6. The main factor level effects from the ANOVA output
are documented.
                   Spacing
Needles      Relative     Absolute
   4         -0.02127      0.02127
   6         -0.00017      0.00017
   8          0.02143     -0.02143

                        θ
Needles        30          45          60
   4         0.02680     0.00244    -0.02925
   6        -0.01031    -0.00127     0.01158
   8        -0.01649    -0.00118     0.01767

                        θ
Spacing        30          45          60
Relative    -0.004354    0.003708    0.000646
Absolute     0.004354   -0.003708   -0.000646

                        φ
Needles        30          45          60
   4        -0.01271    -0.00292     0.01563
   6         0.00363     0.00087    -0.00450
   8         0.00907     0.00206    -0.01113

                        φ
Spacing        30          45          60
Relative    -0.001740    0.004148   -0.002407
Absolute     0.001740   -0.004148    0.002407

                        φ
θ              30          45          60
  30        -0.01404    -0.02664     0.04067
  45        -0.00621     0.00978    -0.00357
  60         0.02025     0.01686    -0.03711
Table 2.7. The ANOVA table of effects for pairwise interactions
is displayed.
By using the values from the tables of effects, a probability of
detection is calculated for the optimal setting:
$$
\hat{Y}_{3131} = \mu + \alpha_3 + \beta_1 + \gamma_3 + \delta_1 + (\alpha\beta)_{31} + (\alpha\gamma)_{33} + (\alpha\delta)_{31} + (\beta\gamma)_{13} + (\beta\delta)_{11} + (\gamma\delta)_{31}
$$
$$
1.347918 = 1.072 + 0.05607 + 0.009612 + 0.07486 + 0.0678 + 0.02143 + 0.01767 + 0.00907 + 0.000646 - 0.00174 + 0.02025
$$
This result of 1.347918 is then transformed back (arcsine equation)
to yield a probability of 0.38948 for this setting:
$$
1.347918 = 2 \arcsin\sqrt{p}
$$
$$
p = \left(\sin(1.347918/2)\right)^2 = 0.38948.
$$
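The back-transformation can be checked numerically. The sketch below is illustrative only: the function names are assumptions, the effect values are the rounded figures from the tables of effects, and the 0.00174 term is read as negative. Summing the rounded terms reproduces the stated 1.347918 only to about three decimal places.

```python
import math


def to_angle(p):
    """Arcsine transform of a detection probability: Y = 2*arcsin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(p))


def to_probability(y):
    """Inverse transform, valid for 0 <= y <= pi: p = sin(y/2)^2."""
    return math.sin(y / 2.0) ** 2


# Grand mean, four main effects and six pairwise effects for the setting
# (8 needles, relative spacing, theta = 60, phi = 30), as rounded in the text.
TERMS = [1.072, 0.05607, 0.009612, 0.07486, 0.0678,
         0.02143, 0.01767, 0.00907, 0.000646, -0.00174, 0.02025]

y_hat = sum(TERMS)                 # transformed prediction from rounded terms
p_hat = to_probability(1.347918)   # probability from the value stated in the text

if __name__ == "__main__":
    print(f"sum of rounded terms = {y_hat:.4f}")
    print(f"back-transformed p   = {p_hat:.4f}")
```

Applying `to_angle` to the resulting probability returns the transformed value, which is a quick way to confirm that the two functions are inverses on this range.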
Therefore, with the factors set to 8 needles, relative spacing, θ = 60 and
φ = 30, the biopsy procedure has a 38.9% probability of detecting the cancer
given the tumor distribution model used. This estimated probability is best
used in comparisons with the other estimated probabilities rather than as
an absolute measure of detection rate. The conclusion from this analysis is
therefore a relative ranking of treatments in terms of their detection rate.
Since the 1000 simulated specimens were the same for each treatment, the
ANOVA model determines the relative differences between detection rates of
various treatments; it does not necessarily provide enough data to draw
conclusions about absolute detection rates. Table 2.8 lists each experiment
and the probability of detection predicted from the factor effects model.
                 Treatment Parameters
            Number of   Spacing                Predicted
Experiment   Needles    Method      θ    φ    Probability
     1          4       Relative   45   45      0.247
     2          6       Relative   45   45      0.297
     3          8       Relative   45   45      0.327
     4          4       Absolute   45   45      0.251
     5          6       Absolute   45   45      0.281
     6          8       Absolute   45   45      0.291
     7          4       Relative   60   45      0.265
     8          6       Relative   60   45      0.337
     9          8       Relative   60   45      0.369
    10          4       Absolute   60   45      0.271
    11          6       Absolute   60   45      0.324
    12          8       Absolute   60   45      0.335
    13          4       Relative   30   45      0.195
    14          6       Relative   30   45      0.227
    15          8       Relative   30   45      0.251
    16          4       Absolute   30   45      0.205
    17          6       Absolute   30   45      0.219
    18          8       Absolute   30   45      0.224
    19          4       Relative   45   60      0.200
    20          6       Relative   45   60      0.236
    21          8       Relative   45   60      0.260
    22          4       Absolute   45   60      0.208
    23          6       Absolute   45   60      0.227
    24          8       Absolute   45   60      0.232
    25          4       Relative   60   60      0.192
    26          6       Relative   60   60      0.247
    27          8       Relative   60   60      0.273
    28          4       Absolute   60   60      0.203
    29          6       Absolute   60   60      0.241
    30          8       Absolute   60   60      0.248
    31          4       Relative   30   60      0.175
    32          6       Relative   30   60      0.196
    33          8       Relative   30   60      0.215
    34          4       Absolute   30   60      0.189
    35          6       Absolute   30   60      0.194
    36          8       Absolute   30   60      0.195
    37          4       Relative   45   30      0.257
    38          6       Relative   45   30      0.314
    39          8       Relative   45   30      0.346
    40          4       Absolute   45   30      0.266
    41          6       Absolute   45   30      0.303
    42          8       Absolute   45   30      0.315
    43          4       Relative   60   30      0.276
    44          6       Relative   60   30      0.354
    45          8       Relative   60   30      0.389
    46          4       Absolute   60   30      0.287
    47          6       Absolute   60   30      0.346
    48          8       Absolute   60   30      0.360
    49          4       Relative   30   30      0.208
    50          6       Relative   30   30      0.246
    51          8       Relative   30   30      0.272
    52          4       Absolute   30   30      0.223
    53          6       Absolute   30   30      0.243
    54          8       Absolute   30   30      0.250
Table 2.8. The probabilities of detection for one tumor
simulations are displayed.
2.4.7 Clinical Distribution of Tumors
The biopsy simulations were conducted a second time on more realistic
geometric glands. By using a clinically derived distribution of the number
of tumors per gland, a better population was available for these biopsy
simulations. A sample size of 1000 was again used, but in this experiment 1/4
of the glands had a single tumor, 1/2 had two tumors and the remaining 1/4
had three tumors. The total gland volume was again held to be less than 6.4
cc. This distribution is based on the analysis done by Daneshagari [2]. The
ANOVA results are found in the Appendix and yield the same optimal biopsy
procedure, with a slightly different probability resulting from the factor
effects model.
By using the values from this second table of effects, a probability
of detection is calculated for the optimal setting:
$$
\hat{Y}_{3131} = \mu + \alpha_3 + \beta_1 + \gamma_3 + \delta_1 + (\alpha\beta)_{31} + (\alpha\gamma)_{33} + (\alpha\delta)_{31} + (\beta\gamma)_{13} + (\beta\delta)_{11} + (\gamma\delta)_{31}
$$
$$
1.7535 = 1.429 + 0.0733 + 0.01507 + 0.07456 + 0.07091 + 0.02321 + 0.02650 + 0.01412 - 0.005442 - 0.004094 + 0.03638
$$
Transforming this value (arcsine) yields a probability of detection for
the optimal setting of 0.5908. As would be expected, this probability of
59.08% is higher than the 38.9% achieved by the simulation using geometric
models of one tumor. The predicted probabilities for each of the 54
experiments given this distribution of tumors are shown in Table 2.9.
                 Treatment Parameters
            Number of   Spacing                Predicted
Experiment   Needles    Method      θ    φ    Probability
     1          4       Relative   45   45      0.417
     2          6       Relative   45   45      0.489
     3          8       Relative   45   45      0.526
     4          4       Absolute   45   45      0.417
     5          6       Absolute   45   45      0.470
     6          8       Absolute   45   45      0.482
     7          4       Relative   60   45      0.427
     8          6       Relative   60   45      0.524
     9          8       Relative   60   45      0.569
    10          4       Absolute   60   45      0.436
    11          6       Absolute   60   45      0.514
    12          8       Absolute   60   45      0.533
    13          4       Relative   30   45      0.353
    14          6       Relative   30   45      0.405
    15          8       Relative   30   45      0.431
    16          4       Absolute   30   45      0.354
    17          6       Absolute   30   45      0.387
    18          8       Absolute   30   45      0.388
    19          4       Relative   45   60      0.358
    20          6       Relative   45   60      0.408
    21          8       Relative   45   60      0.443
    22          4       Absolute   45   60      0.360
    23          6       Absolute   45   60      0.391
    24          8       Absolute   45   60      0.401
    25          4       Relative   60   60      0.322
    26          6       Relative   60   60      0.395
    27          8       Relative   60   60      0.437
    28          4       Absolute   60   60      0.332
    29          6       Absolute   60   60      0.386
    30          8       Absolute   60   60      0.403
    31          4       Relative   30   60      0.326
    32          6       Relative   30   60      0.357
    33          8       Relative   30   60      0.381
    34          4       Absolute   30   60      0.329
    35          6       Absolute   30   60      0.341
    36          8       Absolute   30   60      0.340
    37          4       Relative   45   30      0.417
    38          6       Relative   45   30      0.498
    39          8       Relative   45   30      0.541
    40          4       Absolute   45   30      0.425
    41          6       Absolute   45   30      0.486
    42          8       Absolute   45   30      0.504
    43          4       Relative   60   30      0.436
    44          6       Relative   60   30      0.541
    45          8       Relative   60   30      0.590
    46          4       Absolute   60   30      0.451
    47          6       Absolute   60   30      0.537
    48          8       Absolute   60   30      0.562
    49          4       Relative   30   30      0.351
    50          6       Relative   30   30      0.412
    51          8       Relative   30   30      0.444
    52          4       Absolute   30   30      0.359
    53          6       Absolute   30   30      0.401
    54          8       Absolute   30   30      0.407
Table 2.9. Given the distribution of one to three tumors,
the probabilities of detection predicted by the ANOVA model
are displayed.
A selection of detection rates is graphed in Figure 2.7 to provide
visualization of the relative ranking of various treatments. The plots indicate
6 and 8 needles, relative spacing and all of the levels for θ and φ.
[Figure 2.7 legend: θ = 30, θ = 45 and θ = 60 with 6 needles; θ = 30,
θ = 45 and θ = 60 with 8 needles.]
Figure 2.7. The detection rates for several experiments
are graphed and the common treatment parameters are
noted for each experiment. This gives a visual understanding
of the ranking of these treatments in terms of
their detection rate.
3. Digitized Specimen Data
3.1 Summary of Software Tool
An analysis program, written in C, was created to simulate needle
biopsies on clinical data provided by the University of Colorado Health
Sciences Center, Pathology Department. The clinical data were gathered from
autopsies, pathologically investigated and digitized [2].
The data for each specimen are stored as a 3-dimensional array of
information. The software uses an input file to determine the characteristics
of a given experiment. These characteristics include the number of needles,
the initial placement of the first needle, the angles θ and φ, the spacing
between needles, and the needle diameter and length. In this manner, the
analysis software is flexible enough to handle a variety of simulations. The
goal of this biopsy simulation tool is to provide the means to experiment
realistically with various needle parameters on clinical data in order to
determine any correspondence between biopsy methods and detection rates.
The initial needle position is offset by the distance requested (the
offset entered by the user), with half of the needles entering the right lobe
and the other half entering the left lobe, in symmetry with each other. The
initial position is determined as an absolute offset (in cm) from the apex of
the gland. The other parameters are used to position each needle on the
specimen data set and determine how much of the specimen data is to be
returned in the needle biopsy. The specimen data are analyzed to determine
whether and how much tumor data is present in the needle. This information
is available to the user.
Having read the input file with parameter values, the code begins a
loop on the specimen data files requested for simulation. In this loop, the
three-dimensional specimen data file is opened, the data are read into a 3-d
array with all of the background trimmed off, the apex of the gland is
located, and the needle positions are translated into array coordinates.
These coordinates are fed to the biopsy routine, which extracts the specimen
data coinciding with the needle and analyzes the data for tumor information.
The information for the entire experiment is stored in an output file that
documents the needle parameters and the results for each image data set.
3.2 Specific Algorithms
3.2.1 Locating the Apex
The apex is defined as the first contact with the prostate when
approaching it through the rectum, as done clinically. This location is used
as a landmark for positioning each biopsy needle. In the data set, the
algorithm that searches for this landmark proceeds as follows. The planes are
defined as shown in Figure 3.1.
Each pixel in the three-dimensional specimen file contains a number
indicating the type of data at that location. The possible types are gland,
tumor, capsule or background. Capsule data indicate those pixels defining the
boundary of the gland. The apex is indicated by the first pixel pointing to
capsule data. Therefore one plane of specimen data is evaluated at a time,
until a pixel that points to capsule data is found. This location is recorded
as the apex location.
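The plane-by-plane search can be sketched as follows. This is a minimal illustration rather than the thesis code (which was written in C): the numeric type codes, the `volume[z][y][x]` indexing and the choice of z as the scanning axis are all assumptions made for the example.

```python
# Hypothetical type codes; the actual values stored in the specimen
# files are not given in the text.
BACKGROUND, GLAND, TUMOR, CAPSULE = 0, 1, 2, 3


def find_apex(volume):
    """Return (x, y, z) of the first capsule pixel, scanning one plane at a time.

    `volume` is a nested list indexed as volume[z][y][x] (assumed layout).
    Returns None if no capsule pixel exists.
    """
    for z, plane in enumerate(volume):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                if value == CAPSULE:
                    return (x, y, z)
    return None


if __name__ == "__main__":
    tiny = [
        [[0, 0], [0, 0]],   # plane 0: background only
        [[0, 3], [1, 0]],   # plane 1: first capsule pixel at x=1, y=0
        [[3, 3], [3, 3]],   # plane 2
    ]
    print(find_apex(tiny))
```

Returning as soon as the first capsule pixel is met mirrors the description above; within a plane, the scan order decides which capsule pixel is reported if several are present.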
Figure 3.1. The x, y, z axes, as defined for the digital
data, mimic those defined for the geometric models.
3.2.2 Establishing Needle Positions
The starting position, the location of the apex, serves as the landmark
for each additional needle. From this starting point and the additional
user-supplied parameters (offset, distance between needles), all of the
needle positions are calculated in terms of a vector. This vector,
represented by (x, y, z) coordinates, along with the angles θ and φ, locates
the needle in the image data. The offset is assumed to be in centimeters and
is added to the initial (x, y, z) of the starting position to locate the
first needle position. Each time any coordinate is changed, the new vector
may be pointing to gland, tumor, background, urethra or capsule data. The
pixel represented by the vector is read to ensure that the needle entry
position remains located on capsule data. If it does not, the y coordinate is
adjusted so that the entry position of the needle is on capsule data.
At this point in the algorithm, the first needle position is determined.
There are two ways to space the remaining needles. The user may enter
absolute distances in centimeters or a relative measure taken to be a
percentage of the z dimension of the gland. In addition, a zero percentage
indicates that the spacing is based on the number of needles in the biopsy;
the needles are equally spaced across the z-axis of the gland. The remaining
needle positions are calculated from the initial needle position: half of the
needles are positioned in the right lobe by using φ, and the remainder use θ
to rotate into the left lobe. All of the needles have the x coordinate set to
the midpoint of the gland in the x dimension.
The user-entered distance, in centimeters, is converted to a specific
number of pixels. This z distance is added to the first needle position to
obtain the second needle position, added to the second to obtain the third,
etc. Each time a needle position is calculated, the coordinates are evaluated
to ensure that they point to capsule data. If the gland is too short in the z
direction to handle all the needles requested, the experiment proceeds with
the number of needles that do stay within the gland.
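The absolute-spacing rule can be sketched as a short loop. The sketch is illustrative only and rests on assumptions not stated in the text: the pixel resolution `pixels_per_cm`, the callable `on_capsule` (a stand-in for reading the specimen array and applying the y adjustment), and the treatment of a single lobe.

```python
def absolute_positions(first_z, spacing_cm, n_needles, pixels_per_cm, on_capsule):
    """Return the z pixel indices of the needles that stay within the gland.

    first_z       : z index of the first needle (already placed on the capsule)
    spacing_cm    : user-entered distance between needles, in centimeters
    pixels_per_cm : image resolution (assumed known for the specimen)
    on_capsule    : hypothetical callable, True if a needle at z can enter
                    the gland on capsule data
    """
    step = round(spacing_cm * pixels_per_cm)   # centimeters -> whole pixels
    if step <= 0:
        raise ValueError("spacing must convert to at least one pixel")
    positions = []
    z = first_z
    for _ in range(n_needles):
        if not on_capsule(z):
            break                              # gland too short: keep what fits
        positions.append(z)
        z += step
    return positions


if __name__ == "__main__":
    # Toy gland whose capsule extends to z = 35 pixels.
    print(absolute_positions(10, 0.5, 4, 20, lambda z: z <= 35))
```

Stopping at the first position that falls outside the gland reflects the statement that the experiment proceeds with the needles that do fit.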
The experiments that depend on a relative distance between needles
require additional analysis of the yz slice before determining the z offset.
The z diameter of the particular yz slice is calculated. The z distance
required for a needle of a specific length, inserted at a specific angle, is
then subtracted from this z diameter. Rather than having the last needle
pierce more background than gland data, this subtraction enables the full
number of needles to be inserted into the gland. This new z diameter is then
divided into the number of segments required by the specified percentage. If
the user indicates 0% for the distance spacing, the software calculates the
distance based on the number of needles requested and the diameter of the yz
plane.
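One possible reading of the relative-spacing rule is sketched below. Several details are assumptions made for illustration: the z reach of the needle is passed in directly rather than derived from its length and angle, the percentage is applied to the reduced diameter, and the 0% case spreads `n_positions` needle positions evenly over that reduced diameter.

```python
def relative_step(z_diameter, needle_z_reach, percent, n_positions):
    """Return the z distance between adjacent needles for relative spacing.

    z_diameter     : z extent of the yz slice
    needle_z_reach : z distance consumed by one needle (assumed given)
    percent        : spacing as a percentage of the usable diameter; 0 means
                     "space the needles equally"
    n_positions    : number of needle positions along z (hypothetical
                     parameter, used only when percent is 0)
    """
    usable = z_diameter - needle_z_reach       # keep the last needle inside
    if usable <= 0:
        raise ValueError("gland is shorter than the needle reach")
    if percent == 0:
        # n positions leave n - 1 gaps across the usable diameter.
        return usable / (n_positions - 1) if n_positions > 1 else 0.0
    return usable * percent / 100.0


if __name__ == "__main__":
    print(relative_step(4.0, 1.0, 25, 3))   # 25% of a 3.0 cm usable diameter
    print(relative_step(4.0, 1.0, 0, 4))    # 4 positions, equally spaced
```

The units of the result follow the units of the inputs, so the same function can be used in centimeters or in pixels.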
3.3 Simulations
The 54 treatments used in the geometric model were used as biopsy
procedures on a maximum of 53 digitized clinical specimens. Some of the
biopsy techniques were simulated on only 52 of these clinical specimens. Table
3.1 shows the results from these simulations on the digitized clinical data.
The table documents both the multiple-tumor geometric model hit rate and
the number of hits resulting from the same biopsy on the digitized
clinical data. The first five columns indicate the experiment number and the
biopsy parameter settings for the four variables: number of needles, spacing
method, θ and φ. The column labelled Detection Rate gives the hit rate
per 1000 simulations of the geometric model. The column labelled Number
of Hits is the number of hits per number of digitized clinical samples. Most
experiments were run on all 53 of the digitized specimens. However, some
of the simulations resulted in an error on one or more of the specimens, and
these specimens were then removed from the experiment. The final column,
labelled Clinical Detection Rate, is the rate for the experiments on the
digitized specimens.
            Number of   Spacing                Detection   Clinical
Experiment   Needles    Method      θ    φ       Rate      Detection Rate
     1          4       Relative   45   45      0.417        0.1509
     2          6       Relative   45   45      0.489        0.2075
     3          8       Relative   45   45      0.526        0.1538
     4          4       Absolute   45   45      0.417        0.1698
     5          6       Absolute   45   45      0.470        0.2075
     6          8       Absolute   45   45      0.482        0.1923
     7          4       Relative   60   45      0.427        0.1698
     8          6       Relative   60   45      0.524        0.1731
     9          8       Relative   60   45      0.569        0.2453
    10          4       Absolute   60   45      0.436        0.1887
    11          6       Absolute   60   45      0.514        0.2264
    12          8       Absolute   60   45      0.533        0.2264
    13          4       Relative   30   45      0.353        0.1321
    14          6       Relative   30   45      0.405        0.2264
    15          8       Relative   30   45      0.431        0.1698
    16          4       Absolute   30   45      0.354        0.1321
    17          6       Absolute   30   45      0.387        0.1321
    18          8       Absolute   30   45      0.388        0.1698
    19          4       Relative   45   60      0.358        0.1132
    20          6       Relative   45   60      0.408        0.1698
    21          8       Relative   45   60      0.443        0.2115
    22          4       Absolute   45   60      0.360        0.1509
    23          6       Absolute   45   60      0.391        0.1887
    24          8       Absolute   45   60      0.401        0.1887
    25          4       Relative   60   60      0.322        0.1509
    26          6       Relative   60   60      0.395        0.1538
    27          8       Relative   60   60      0.437        0.1731
    28          4       Absolute   60   60      0.332        0.1154
    29          6       Absolute   60   60      0.386        0.1731
    30          8       Absolute   60   60      0.403        0.1731
    31          4       Relative   30   60      0.326        0.0962
    32          6       Relative   30   60      0.357        0.0962
    33          8       Relative   30   60      0.381        0.1731
    34          4       Absolute   30   60      0.329        0.0769
    35          6       Absolute   30   60      0.341        0.0769
    36          8       Absolute   30   60      0.340        0.0769
    37          4       Relative   45   30      0.417        0.1154
    38          6       Relative   45   30      0.498        0.1923
    39          8       Relative   45   30      0.541        0.2308
    40          4       Absolute   45   30      0.425        0.1538
    41          6       Absolute   45   30      0.486        0.1923
    42          8       Absolute   45   30      0.504        0.2115
    43          4       Relative   60   30      0.436        0.1154
    44          6       Relative   60   30      0.541        0.1923
    45          8       Relative   60   30      0.590        0.1887
    46          4       Absolute   60   30      0.451        0.1538
    47          6       Absolute   60   30      0.537        0.2308
    48          8       Absolute   60   30      0.562        0.2308
    49          4       Relative   30   30      0.351        0.1000
    50          6       Relative   30   30      0.412        0.2115
    51          8       Relative   30   30      0.444        0.1923
    52          4       Absolute   30   30      0.359        0.1154
    53          6       Absolute   30   30      0.401        0.1538
    54          8       Absolute   30   30      0.407        0.1923
[The Number of Hits column is illegible in the source and is not reproduced.]
Table 3.1. The detection rates for the geometric and clinical
simulations are displayed.
3.4 Geometric Model vs Clinical Model
Comparison of the detection rates between the geometric model and
the clinical model reveals that the geometric simulation produces much higher
rates than its clinical counterpart. In attempting to explain this discrepancy,
several characteristics of the experiment are noted.
The distribution of the tumors and the total tumor volume in a given
specimen can impact the detection rate of a treatment. A comparison of the
tumor volumes is graphically displayed in Figures 3.2 and 3.3. As shown by
the histograms, the tumor volumes for the autopsy data tend strongly toward
small (< 0.5 cc) volumes, whereas the geometric model produces tumors
with volumes more evenly spread across the spectrum of possible volumes.
In fact, 80% of the autopsy specimens have a total tumor volume less than 0.5
cc, while only 49% of the geometric gland models have a total tumor
volume in this range. This difference in the size of the tumors can explain
some of the difference in detection rate between the clinical and geometric
models.
A second difference is that the relative ranking of detection rates for
the digital data simulations is different from the ranking of detection rates
for the geometric simulations. An example of this discrepancy is that
experiment 9 (8 needles, relative spacing, θ = 60, φ = 45) has a detection
rate of 0.2453, or 13 hits out of 53 samples. This detection rate is better
than the detection rate of experiment 45 (8 needles, relative spacing, θ = 60,
φ = 30), which is the optimal biopsy as indicated by the geometric simulation.
This difference may be due to the fact that only 53 specimens were used in the
digital simulation, in contrast to the 1000 models constructed for the
geometric simulation.
3.5 Optimal Technique vs SRSCB
The optimal technique, determined by the geometric model, consists
of 8 needles, relative spacing, θ = 60 and φ = 30. The SRSCB procedure
uses 6 needles, absolute spacing, θ = 45 and φ = 45. Both techniques were
simulated on the geometric model as well as the digitized clinical data. The
optimal technique actually proved slightly worse at tumor detection than the
SRSCB procedure when simulated on the clinical data. In fact, the optimal
method detected tumor in 10 out of 53 specimens (0.189). The SRSCB method
detected tumor in 11 out of 53 specimens (0.207). These results compare with
the overall results from the geometric simulation as follows. The SRSCB had
a detection rate of 0.47 and the optimal had a detection rate of 0.59 on the
1000 geometric models. This discrepancy is addressed by noting the sample
size available in the two simulations and the distribution of tumor volumes as
noted earlier.
[Figure 3.2: histogram; horizontal axis: Sum of Tumor Volume, 0.05 to 5.]
Figure 3.2. The histogram of the clinical data shows the
tumor distribution by volume.
Figure 3.3. The histogram of the geometric data shows
the tumor distribution by volume.
4. Geometric Model Volume Estimates
4.1 Tumor Volume Estimates
The total volume of tumor in a gland is an important piece of information
for clinicians, who use it to improve both the diagnosis and treatment
plan for a patient. The ultrasound used during a biopsy accurately measures
the prostate gland volume, so that an approximate ratio of tumor to gland
volume can be used to estimate the volume of tumor in a gland. These
simulations offered an avenue to explore a means of approximating this volume
ratio by using the volume of the needle that contains tumor information and
the total volume of the needle.
Three methods are used to estimate the amount of tumor intersected
by the needle. The needle can be modeled by a line, a strip, or a cylinder
in one, two, and three dimensions, respectively. The length and diameter
of the needle are constant and are set by clinical limits. This incremental
approach began in one dimension in order to simplify aspects of the simulation
during software verification. As the research progressed, the two- and
three-dimensional needles were introduced in order to model the actual biopsy
more closely.
The first method of estimating the volume ratio is
$R = \frac{1}{n} \sum_{i=1}^{n} \frac{v_i}{V_i}$, where $v_i$ represents the
tumor volume within a single needle, $V_i$ represents the volume of that same
needle, and n is the number of needles. This ratio is referred to as
the average of the ratios. A second estimator of the volume ratio is
$r = \frac{\frac{1}{n} \sum_{i=1}^{n} v_i}{\frac{1}{n} \sum_{i=1}^{n} V_i}$,
where $v_i$ is the tumor volume within a single needle and $V_i$ is the total
volume of that needle. This ratio is considered the ratio of the average
volumes since $\frac{1}{n} \sum_{i=1}^{n} v_i$ is the average tumor volume and
$\frac{1}{n} \sum_{i=1}^{n} V_i$ is the average needle volume. This
yields $r = \frac{\sum_{i=1}^{n} v_i}{\sum_{i=1}^{n} V_i}$. Both methods of
estimating the ratio are documented below.
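The two estimators can be compared directly with a short sketch. The function names are chosen here for illustration, and the sample volumes are made-up numbers rather than simulation output.

```python
def average_of_ratios(tumor_vols, needle_vols):
    """R = (1/n) * sum(v_i / V_i): mean of the per-needle ratios."""
    n = len(tumor_vols)
    return sum(v / V for v, V in zip(tumor_vols, needle_vols)) / n


def ratio_of_averages(tumor_vols, needle_vols):
    """r = sum(v_i) / sum(V_i): the 1/n factors cancel."""
    return sum(tumor_vols) / sum(needle_vols)


if __name__ == "__main__":
    # Illustrative values only: tumor volume found in each needle and the
    # corresponding needle volume.
    v = [0.0, 0.5, 1.0]
    V = [1.0, 2.0, 4.0]
    print(average_of_ratios(v, V), ratio_of_averages(v, V))
```

When every needle has the same volume, as with needles of fixed length and diameter, the two estimators coincide; they differ only when the per-needle volumes vary.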
Figure 4.1. This illustration of the gland, tumor and
one-dimensional needle depicts the variables used in
determining the volume ratio estimator.
4.1.1 One-Dimensional Analysis: Line Model
In this first model, we represent the needle by a line segment as shown
in Figure 4.1. The length of the needle that contains tumor pixels, $l_T$, is
the difference between $t_1$ and $t_2$, the two roots of equation (2.1):
$l_T = |t_1 - t_2|$. A needle length, L, of 1.25 cm is used in the estimate of
the volume ratio. Thus the ratio $l_T / L$ is an approximation of the true
volume ratio $\rho_{GV}$; that is, $l_T / L \approx \rho_{GV}$.
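The line model can be illustrated with a small sketch. Equation (2.1) is not reproduced in this section, so the code assumes it is the quadratic obtained by substituting a parametrized line into the equation of an axis-aligned ellipse or ellipsoid; the function names and the example tumor are hypothetical, and the chord is not clipped to the needle length.

```python
import math

NEEDLE_LENGTH_CM = 1.25  # needle length L used in the text


def chord_length(entry, direction, center, semi_axes):
    """Length |t1 - t2| of the line entry + t*direction inside the tumor.

    The tumor is modeled as an axis-aligned ellipse/ellipsoid with the given
    center and semi-axes (an assumption of this sketch). Returns 0.0 when
    the line misses or only grazes the tumor.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    unit = [x / norm for x in direction]        # unit direction: t is a length
    a = b = c = 0.0
    for p, u, m, s in zip(entry, unit, center, semi_axes):
        offset = (p - m) / s
        slope = u / s
        a += slope * slope
        b += 2.0 * offset * slope
        c += offset * offset
    c -= 1.0
    disc = b * b - 4.0 * a * c                  # discriminant of a*t^2 + b*t + c
    if disc <= 0.0:
        return 0.0
    root = math.sqrt(disc)
    t1 = (-b - root) / (2.0 * a)
    t2 = (-b + root) / (2.0 * a)
    return abs(t1 - t2)


def line_ratio(entry, direction, center, semi_axes, needle_length=NEEDLE_LENGTH_CM):
    """Estimate of the volume ratio for the line model: l_T / L."""
    return chord_length(entry, direction, center, semi_axes) / needle_length


if __name__ == "__main__":
    # Needle along +x through the center of a 0.25 cm radius spherical tumor.
    print(line_ratio((-2.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                     (0.0, 0.0, 0.0), (0.25, 0.25, 0.25)))
```

A positive discriminant also signals a detection, so the same roots serve both to count hits and to measure the tumor length on the needle.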
4.1.2 Two-Dimensional Strip Model
In the two-dimensional case we represent the needle by a strip. The
needle entry point $(x_0, y_0, z_0)$ is used as a starting point in the
two-dimensional analysis. Two lines are created, each offset from this
starting coordinate by the needle radius. The intersection between these two
lines and the tumor ellipse is determined, and the roots of the two resulting
quadratics are used to compute both the occurrence of a detection and the
amount of tumor within the needle. In this case, the estimate of the volume
ratio is the area of the tumor over the area of the needle. Figure 4.2
defines the lengths used in determining the area. The area of the tumor is
calculated by estimating the needle length which contains tumor data with
the roots of intersection: $l_{t1} = |t_{11} - t_{12}|$;
$l_{t2} = |t_{21} - t_{22}|$. The area of tumor is then given by
$a_T = \frac{1}{2}(l_{t1} + l_{t2})\,d$, where d is the diameter
of the needle. The area of the needle is calculated in the same way using the
length of the needle: $a_N = \frac{1}{2}(L + L)\,d$. Thus $a_T / a_N$ serves
as an estimate of the true tumor to gland volume ratio, $\rho_{GV}$.
Figure 4.2. This illustration of the gland, tumor and
two-dimensional needle depicts the variables used in
determining the volume ratio estimator.
4.1.3 Three-Dimensional Cylinder Model
The three-dimensional analysis models the needle as a cylinder and is
similar to the two-dimensional case in that the entry point of the needle is
again used as a center coordinate, this time for four line needles. The four
needles are constructed symmetrically about this point to generate a
cylindrical needle. Then intersections and roots are computed. A more
accurate representation of the volume ratio is obtained using the volume of
the tumor within the needle over the volume of the needle. In this case, the
length is estimated to be the maximum of the lengths determined from the four
sets of intersection roots:
$l_t = \max(|t_{11} - t_{12}|, |t_{21} - t_{22}|, |t_{31} - t_{32}|, |t_{41} - t_{42}|)$.
The volume of the needle depends on the known diameter d and length L:
$v_N = \pi (d/2)^2 L$. The estimated volume of the tumor depends on the needle
lengths which contain tumor data as shown in Figure 4.3. This leads to the
tumor volume estimate $v_T = \pi (d/2)^2 l_t$. The ratio $v_T / v_N$ estimates
the true volume ratio, $\rho_{GV}$.
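The cylinder estimate can be written out in one line. This is a worked restatement under the assumption that the needle cross-section is a circle of diameter $d$ (a symbol introduced here for illustration), with $l_t$ the maximum tumor length over the four lines and $L$ the needle length:

```latex
\frac{v_T}{v_N}
  = \frac{\pi \left(\tfrac{d}{2}\right)^{2} l_t}{\pi \left(\tfrac{d}{2}\right)^{2} L}
  = \frac{l_t}{L},
\qquad
l_t = \max_{k = 1,\dots,4} \left| t_{k1} - t_{k2} \right| .
```

Under these assumptions the cross-sectional area cancels, so the cylinder model differs from the line model only in taking the longest of four chord lengths rather than a single one.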
4.2 Experiment Setup
A second set of experiments utilizing the geometric model involved
exploring the question of accurately estimating the tumor volume to gland vol
ume ratio. The experiment simulated a biopsy on a single specimen, increasing
the number of needles each iteration and comparing the volume ratio obtained
from the biopsy sample to the known volume ratio. The parameters for the
biopsy include the optimal angles 9 and
vestigation The optimal number of needles and distancing method determined
from the ANOVA analysis do not apply to this experiment since the number
of needles increases from 6 to 20 and the distancing of these needles is done so
that the maximum1 number, 20, are equally spaced. The maximum number
Figure 4.3. This illustration of the gland, tumor and
three-dimensional needle depicts the variables used in
determining the volume ratio estimator.
of needles was set at 20 due to clinical limitations. The spacing of the needles
is dependent on the maximum number so that from one iteration to the
next 2 needles are in the same exact location, yielding the same detection
information. In this manner the comparison between a specimen biopsied by
6 needles and the same specimen biopsied by 10 needles is not dependent on
needle position, but instead compares the gain made by the four additional
needles.
The simulation is executed on 1000 specimens, varying the number
of needles from 6 to 20 in increments of 2. The output from this experiment
consists of a file for each specimen that contains the results of each set of
needles including the tumor to needle volume ratio achieved and the associated
estimates (R = ^ XX'itAn) and r = Â§y). In addition, the actual tumor to
gland volume ratio is noted.
4.3 Results
The results of this experiment were not as anticipated, as there appears
to be no pattern of convergence to the actual tumor to gland volume
ratio within the limit of 20 total needles. However, much was learned from
this exercise that provided insight into the next series of investigations. First,
it is noted that in the great majority of cases, a single 8-needle biopsy tends
to overestimate the true tumor to gland volume ratio. Secondly, a comparison
between the two methods of calculating the error leads to the conclusion that
the sum of the ratios is the more accurate method, at least in this set of limited
trials.
4.4 Interactive Utility
Using the preceding idea as a starting point, an interactive software
tool was created to investigate the volume ratio question in greater detail.
This tool prompts the user for a random number, seeds the random number
generator, creates a gland containing a single tumor and conducts the optimal
8-needle biopsy. This optimal biopsy has 8 needles, relative spacing between
the needles, $\theta = 60^\circ$ and $\phi = 30^\circ$. The needle
position, the amount of tumor volume contained in the needle and an estimate
of the volume ratio of tumor to gland are displayed for the user. At this
point, the user is able to choose the location for the next needle. This new
needle is then simulated and the tumor volume information it retrieves is
incorporated into the volume ratio. The user can continue this process of
requesting additional needles and evaluate the estimated volume ratio and its
error from the true ratio. A maximum of 20 needles can be simulated on
a single gland, beginning with the 8 original needles and accumulating the
additional 12 based on user specifications.
This area of research is full of open-ended questions where tools such
as this interactive utility can help shed light on answers. With involvement
from clinicians and medical researchers, experiments can be designed to gather
more information regarding the two issues of volume ratio and optimal biopsy
technique. In addition, using the results of this body of research, more
realistic tumor distributions and geometric models can be constructed to better
understand the impact of treatment parameters on detection rate.
Appendix A. ANOVA Definitions
A dot in the subscript indicates averaging over the variable represented
by that index.
The number of levels for Number of Needles: $a = 3$.
The number of levels for Distancing Method: $b = 2$.
The number of levels for $\theta$: $c = 3$.
The number of levels for $\phi$: $d = 3$.
The number of specimens = 1000.
The number of experiments: $abcd = 54$.
In general, $Y$ is an observation, $\bar{Y}$ is the mean of observations, $\mu$ is the true
mean and $\hat{\mu}$ is the least squares estimate of the true mean.
$Y_{ijkl}$ is the observed detection rate at the factor levels indicated by
$i, j, k$ and $l$.
$\bar{Y}_{....}$ is the mean of all specimens over all treatment levels $i, j, k, l$. It
indicates the overall detection rate for the entire experiment.
$$\bar{Y}_{....} = \frac{1}{abcd} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} Y_{ijkl}$$
SSTO, or total sum of squares, is a measure of the total variability
of the observations without consideration of factor level.
$$SSTO = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} (Y_{ijkl} - \bar{Y}_{....})^2$$
$df_{SSTO}$ is the total degrees of freedom. The SSTO has $abcd - 1 = 54 - 1$
degrees of freedom. One degree of freedom is lost due to the lack of
independence between the deviations.
SSTR, or treatment sum of squares, measures the extent of differences
between estimated factor level means and the mean over all treatments.
The greater the difference between factor level means (treatment means), the
greater the value of SSTR.
$$SSTR = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} (\hat{Y}_{ijkl} - \bar{Y}_{....})^2$$
$df_{SSTR}$ is the degrees of freedom. There are $r - 1$ degrees of freedom
for the SSTR, where $r$ is the number of parameters in the model. In the full
model, $r = abcd = 54$, the total combinations of factor levels. In the model
used for this simulation, counting the overall mean as one parameter,
$r = 1 + (a-1) + (b-1) + (c-1) + (d-1) + (a-1)(b-1) + (a-1)(c-1) + (a-1)(d-1)
+ (b-1)(c-1) + (b-1)(d-1) + (c-1)(d-1) = 26$.
One degree of freedom is lost due to the lack of independence between the
deviations.
SSE, or error sum of squares, measures variability which is not explained
by the differences between sample means. It is a measure of the variation
within treatments. A smaller value of SSE indicates less variation within
simulations at the same factor level.
$$SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} (Y_{ijkl} - \hat{Y}_{ijkl})^2$$
$df_{SSE}$ is the degrees of freedom. Since SSE is the sum of the errors
across factor level, the degrees of freedom is the sum of the degrees of freedom
for each factor level. It is the total number of simulations minus $r$: $abcd - r$.
MSE is the mean square for error, defined by $MSE = SSE / df_{SSE}$.
Note: The above definitions imply SSTO = SSTR + SSE. Due to this
relationship, this process is referred to as the partitioning of the total sum of
the squares.
In order to measure the variability within a factor level, the factor
sum of squares terms are computed. These terms are integral in the test
statistic applied to determine whether a factor main effect is significant. In
addition, interaction sums of squares are computed to measure variability of
the interactions.
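The factor A sum of squares corresponds to the number of needles
factor.
$$SSA = bcd \sum_{i=1}^{a} (\bar{Y}_{i...} - \bar{Y}_{....})^2$$
Similar factor sums of squares are computed for each of the factors:
Factor               Sum of Squares                                                      Mean Square
Number of Needles    $SSA = bcd \sum_{i=1}^{a} (\bar{Y}_{i...} - \bar{Y}_{....})^2$    $MSA = SSA/(a-1)$
Spacing Method       $SSB = acd \sum_{j=1}^{b} (\bar{Y}_{.j..} - \bar{Y}_{....})^2$    $MSB = SSB/(b-1)$
$\theta$             $SSC = abd \sum_{k=1}^{c} (\bar{Y}_{..k.} - \bar{Y}_{....})^2$    $MSC = SSC/(c-1)$
$\phi$               $SSD = abc \sum_{l=1}^{d} (\bar{Y}_{...l} - \bar{Y}_{....})^2$    $MSD = SSD/(d-1)$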
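As an illustration of the factor sum of squares defined above, the following Python sketch computes SSA and MSA from one observation per treatment. The dictionary layout, the zero-based indices and the function name are assumptions made for this example, not part of the thesis software.

```python
from itertools import product


def factor_a_sum_of_squares(y, a, b, c, d):
    """Return (SSA, MSA) for the first factor.

    y maps a zero-based index tuple (i, j, k, l) to one observation Y_ijkl;
    a, b, c, d are the numbers of levels of the four factors.
    """
    cells = list(product(range(a), range(b), range(c), range(d)))
    grand_mean = sum(y[cell] for cell in cells) / (a * b * c * d)

    ssa = 0.0
    for i in range(a):
        # Mean over j, k, l with the first factor held at level i.
        others = product(range(b), range(c), range(d))
        level_mean = sum(y[(i, j, k, l)] for j, k, l in others) / (b * c * d)
        ssa += (level_mean - grand_mean) ** 2
    ssa *= b * c * d

    msa = ssa / (a - 1)
    return ssa, msa


# Hypothetical data: the observation depends only on the level of factor A.
y = {(i, j, 0, 0): float(i) for i in range(3) for j in range(2)}
ssa, msa = factor_a_sum_of_squares(y, 3, 2, 1, 1)
```

The sums for the other three factors follow the same pattern with the roles of the indices exchanged.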
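The interaction sums of squares are computed as well for use in the
F-test on the interactions. The first three pairwise interaction sums of squares
are shown below. The others are computed in the same manner.
Number of Needles : Spacing
$$SSAB = cd \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{Y}_{ij..} - \bar{Y}_{i...} - \bar{Y}_{.j..} + \bar{Y}_{....})^2$$
$MSAB = SSAB / ((a-1)(b-1))$
Number of Needles : $\theta$
$$SSAC = bd \sum_{i=1}^{a} \sum_{k=1}^{c} (\bar{Y}_{i.k.} - \bar{Y}_{i...} - \bar{Y}_{..k.} + \bar{Y}_{....})^2$$
$MSAC = SSAC / ((a-1)(c-1))$
Number of Needles : $\phi$
$$SSAD = bc \sum_{i=1}^{a} \sum_{l=1}^{d} (\bar{Y}_{i..l} - \bar{Y}_{i...} - \bar{Y}_{...l} + \bar{Y}_{....})^2$$
$MSAD = SSAD / ((a-1)(d-1))$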
The treatment means, $\mu_{ijkl}$, indicate the mean for the treatment at
the $ijkl$ levels of the respective factors.
The overall mean, $\mu$, is the mean across all factors and all levels
(across all $i, j, k, l$).
References
(1) Hodge K.K., McNeal J.E., Terris M.K., Stamey T.A. Random systematic
versus directed ultrasound guided transrectal core biopsies of
the prostate. Journal of Urology 142 (1989): 71-74.
(2) Daneshgari, Firouz M.D., Taylor, Gerald D. PhD, Miller, Gary J.
M.D., PhD, Crawford, E. David M.D. Computer Simulation of the
Probability of Detecting Low Volume Carcinoma of the Prostate with
Six Random Systematic Core Biopsies. Urology 45 (April 1989): 604-609.
(3) McNeal, John M.D. Normal Histology of the Prostate. The American
Journal of Surgical Pathology (1988): 619-633.
(4) Neter, John, Wasserman, William. Applied Linear Statistical Models.
Richard D. Irwin, Inc., 1974.
SIMULATIONS OF PROSTATE BIOPSY METHODS
by
Catherine Colby Pellish
B.S.E.E., Marquette University, 1985
A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science, Applied Mathematics, 1997
This thesis for the Master of Science degree by Catherine Pellish has been approved by
William L. Briggs
James R. Koehler
Weldon A. Lodwick
Date
Pellish, Catherine Colby (M.S., Applied Mathematics)
Simulations of Prostate Biopsy Methods
Thesis directed by Associate Professor William L. Briggs
Abstract
An accepted practice in screening for prostate cancer involves a needle core biopsy of the prostate gland, which can provide information regarding if, and how much, cancer is present in a gland. This paper documents several investigations into prostate gland biopsy techniques. The first phase of study involves a geometric model of a prostate gland containing one to three tumors. This mathematical model of the gland is then used to simulate various biopsy techniques and compare the resulting data. Secondly, the best biopsy procedure, as determined from the geometric model, is simulated on actual specimen data which have been digitized. These specimen data are also used for simulation of the six random systematic core biopsy technique (SRSCB) currently in clinical use. The results of the geometric model are compared to the results of the simulation on actual data. Finally, the geometric model is used in another series of simulations that investigate the number of needle samples needed to estimate the tumor to gland volume ratio.
This abstract accurately represents the content of the candidate's thesis. I recommend its publication.
Signed
William L. Briggs
ACKNOWLEDGEMENTS
I would like to sincerely thank a number of people who consistently provided me with their support, encouragement and guidance as I pursued the completion of this thesis. Dr. Bill Briggs, my advisor, served as a constant source of insight and motivation, as well as providing considerable direction throughout this process. I am also grateful for the time spent with Dr. Jim Koehler, who had to teach me the finer points of statistics again and again. My thanks to both of these professors for proving to be excellent academic sources. I also would like to thank Norm LeMay who, out of the generosity of his heart and his need for a free lunch, assisted me in running the ANOVA analysis which this thesis required. Finally, I must thank my family, Mark, Eric and Corinne, for encouraging me and making me laugh through every crisis.
CONTENTS
Chapter
1 Introduction
1.1 Clinical Prostate Biopsy Analysis
1.2 Summary of Mathematical Methods
2 The Geometric Model
2.1 Geometric Model of gland and tumor
2.2 Simulations
2.3 Statistical Analysis of Results
2.4 Simulation Results
2.4.1 Applying the ANOVA to the Biopsy Simulation Data
2.4.2 ANOVA Mechanics
2.4.3 Residuals
2.4.4 The Null and Alternate Hypotheses
2.4.5 Are the Main Effects all Equal?
2.4.6 Recognizing Interaction between Factors
2.4.7 Clinical Distribution of Tumors
3 Digitized Specimen Data
3.1 Summary of Software Tool
3.2 Specific Algorithms
3.2.1 Locating the Apex
3.2.2 Establishing Needle Positions
3.3 Simulations
3.4 Geometric Model vs Clinical Model
3.5 Optimal Technique vs SRSCB
4 Geometric Model Volume Estimates
4.1 Tumor Volume Estimates
4.1.1 One-Dimensional Analysis Line Model
4.1.2 Two-Dimensional Strip Model
4.1.3 Three-Dimensional Cylinder Model
4.2 Experiment Setup
4.3 Results
4.4 Interactive Utility
Appendix
A ANOVA Definitions
1. Introduction
1.1 Clinical Prostate Biopsy Analysis
Currently, the standard method of determining if a given prostate gland is cancerous involves two procedures. The first is the prostate-specific antigen (PSA) test, which measures the level of antigens in the patient's blood, a high level indicating a higher possibility of cancerous tissue. The second procedure is the needle biopsy, which is carried out if the PSA test so indicates. The clinician conducts this biopsy by inserting a needle-tool, equipped with ultrasound capabilities, into the patient's rectum. The gland is located and the urologist fires three needles into the right lobe of the gland and three needles into the left lobe at approximately symmetric positions. The left-right division of the gland is determined by the position of the urethra in the gland. This physical landmark is used as the visual dividing line, enabling clinicians to execute the biopsy in a systematic manner. The needle-tool is rotated to the left or right depending on the targeted lobe. This rotation corresponds to the angle $\phi$ used in the mathematical analysis. Following this slight rotation, the needles are inserted at a second independent angle, referred to as $\theta$.
The choice of a six-needle biopsy is based on the six random systematic core biopsies (SRSCB) method developed by Hodge et al [1] and currently thought to achieve the best detection rates. The results from this diagnostic biopsy are then analyzed in order to determine the best treatment plan for the patient. There are several factors that help the urologist choose the optimal treatment plan. The first factor is obviously whether the biopsy shows any tumor cells at all. According to the Hodge study, 96% of the 83 men diagnosed with cancer had the cancer detected by SRSCB. However, as investigated by Daneshgari et al [2], in prostate glands with low tumor volume, the SRSCB fails to achieve such a high percentage of detection. This study concluded that "an improved biopsy strategy may be needed in detection of CaP (carcinoma of the prostate) in patients with low volume cancer". Secondly, the volume of the tumor itself is a deciding factor in determining treatment. Thirdly, the location of the tumor, specifically if the tumor penetrates the capsule of the gland, can define a specific treatment plan. Some of this information is available from a single needle-core biopsy; more information is gleaned from successive, strategically placed biopsies.
1.2 Summary of Mathematical Methods
As an aid in understanding this problem, as well as researching ways to improve diagnosis, two methods of analysis are undertaken. The first method relies on a geometric model of the prostate gland with from one to three tumors. Various biopsy methods are simulated with this mathematical model and results are tabulated. The second method involves running the same biopsy simulations on actual prostate glands which have been digitized and stored as three-dimensional objects in a computer. The experimental results from these two methods are then compared.
All of the simulations were executed using software created for this purpose primarily by this author, although the skeletons of these software tools were engineered during the Spring 1995 Math Clinic on this topic by several participants. The simulations are written in C and C++, running on a UNIX-based computer. They are extensively documented and flexible enough to be useful in a variety of experiments within this realm of research.
2. The Geometric Model
2.1 Geometric Model of gland and tumor
An actual prostate gland is about the size of a walnut with volumes ranging from 22 cc to 61 cc [3]. The geometry of an ellipsoid closely models this gland and any tumors present within it. Therefore, an ellipsoid of the form
$$\frac{x^2}{A^2} + \frac{y^2}{B^2} + \frac{z^2}{C^2} = 1$$
is used to represent the prostate gland. Ellipsoids are also used to represent each of the tumors. The dimensions of the gland, $A$, $B$ and $C$, are chosen randomly in the following experimentally determined ranges:
3.0 cm < A < 4.8 cm
3.8 cm < B < 4.6 cm
3.8 cm < C < 5.2 cm
22 cc < [gland volume] < 61 cc.
The prostate is divided into 3 zones: the peripheral, the central and the transition region. The peripheral zone comprises approximately 70% of the mass of the prostate gland. It is located in the lower area of the gland, closest to the rectum. This region is the "site of origin of most carcinomas" [3]. The central region makes up approximately 25% of the glandular mass and is "resistant to both carcinoma and inflammation" [3]. The transition region contains the remaining 5% of prostate gland tissue and can be the site of some cancers. Figure 2.1 shows these regions of the prostate gland.
Based on this clinical information, the software-generated tumors are located in the lower part of the elliptical gland model to simulate tumors residing in the peripheral zone. Figure 2.2 depicts the geometrical gland and tumor model in the xyz system. Since the gland model is centered at the origin, the y coordinate of the tumor center, $y_c$, is always negative in order to place the tumor in the peripheral zone of the gland. However, other distributions of $y$ could be used to improve the model. Tumors are modeled by an equation of the form
$$\frac{(x - x_c)^2}{a^2} + \frac{(y - y_c)^2}{b^2} + \frac{(z - z_c)^2}{c^2} = 1$$
where $x_c$, $y_c$ and $z_c$ specify the center of the tumor. The biopsy needle is modeled as a line with the parametric equations
$$x(t) = x_0 + t \sin\theta \sin\phi$$
$$y(t) = y_0 + t \sin\theta \cos\phi$$
$$z(t) = z_0 + t \cos\theta,$$
Figure 2.1. The peripheral (PZ), central (CZ) and transition (TZ) regions divide the prostate gland into 3 major zones.
Figure 2.2. The gland and tumor are modeled by ellipsoids in the xyz coordinate system.
where $x_0$, $y_0$ and $z_0$ are the coordinates of the entry point of the needles (Figure 2.3 and Figure 2.4). The angle $\phi$ is measured from the y axis and determines a plane. The angle $\theta$ is then assumed to remain in this plane and is measured from the z axis. From these definitions, the parametric equations for the line are determined. The parameter $t$ measures the length of the needle.
Figure 2.3. This figure of the xy plane and needle illustrates measurement of $\phi$.
Substituting the parametric equations of the needle into the equation for the tumor, it is possible to determine values of $t$ corresponding to an intersection. The equation of the tumor is
$$\frac{(x(t) - x_c)^2}{a^2} + \frac{(y(t) - y_c)^2}{b^2} + \frac{(z(t) - z_c)^2}{c^2} = 1.$$
Figure 2.4. This figure of the yz plane and needle illustrates measurement of $\theta$.
Replacing $x(t)$, $y(t)$ and $z(t)$ by the parametric equations of the needle gives
$$t^2 \left( \frac{\sin^2\theta \sin^2\phi}{a^2} + \frac{\sin^2\theta \cos^2\phi}{b^2} + \frac{\cos^2\theta}{c^2} \right) + t \left( \frac{2(x_0 - x_c)\sin\theta\sin\phi}{a^2} + \frac{2(y_0 - y_c)\sin\theta\cos\phi}{b^2} + \frac{2(z_0 - z_c)\cos\theta}{c^2} \right) + \left( \frac{(x_0 - x_c)^2}{a^2} + \frac{(y_0 - y_c)^2}{b^2} + \frac{(z_0 - z_c)^2}{c^2} \right) = 1. \quad (2.1)$$
Moving the 1 to the left-hand side writes equation (2.1) as the quadratic $A' t^2 + B' t + C' = 0$. If the discriminant $(B'^2 - 4A'C')$ is positive, two real roots exist. In this case we have
$$A' = \frac{\sin^2\theta \sin^2\phi}{a^2} + \frac{\sin^2\theta \cos^2\phi}{b^2} + \frac{\cos^2\theta}{c^2}$$
$$B' = \frac{2(x_0 - x_c)\sin\theta\sin\phi}{a^2} + \frac{2(y_0 - y_c)\sin\theta\cos\phi}{b^2} + \frac{2(z_0 - z_c)\cos\theta}{c^2}$$
$$C' = \frac{(x_0 - x_c)^2}{a^2} + \frac{(y_0 - y_c)^2}{b^2} + \frac{(z_0 - z_c)^2}{c^2} - 1.$$
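The substitution above amounts to solving one quadratic per needle and tumor. A minimal Python sketch follows, assuming angles in radians, that the constant 1 of equation (2.1) has been moved into the constant term of the quadratic, and that the function and argument names are illustrative only; the thesis software itself was written in C and C++ and is not reproduced here.

```python
import math


def needle_tumor_roots(entry, theta, phi, center, semi_axes):
    """Return the roots (t1, t2) where the needle line meets the tumor, or None.

    entry     -- (x0, y0, z0), the needle entry point
    theta     -- angle measured from the z axis, in radians
    phi       -- angle measured from the y axis, in radians
    center    -- (xc, yc, zc), the tumor center
    semi_axes -- (a, b, c), the tumor ellipsoid dimensions
    """
    x0, y0, z0 = entry
    xc, yc, zc = center
    a, b, c = semi_axes

    # Direction of the needle from the parametric equations.
    dx = math.sin(theta) * math.sin(phi)
    dy = math.sin(theta) * math.cos(phi)
    dz = math.cos(theta)

    qa = dx * dx / a**2 + dy * dy / b**2 + dz * dz / c**2
    qb = (2 * (x0 - xc) * dx / a**2
          + 2 * (y0 - yc) * dy / b**2
          + 2 * (z0 - zc) * dz / c**2)
    # The right-hand side 1 of equation (2.1) is moved into the constant term.
    qc = (x0 - xc)**2 / a**2 + (y0 - yc)**2 / b**2 + (z0 - zc)**2 / c**2 - 1.0

    disc = qb * qb - 4 * qa * qc
    if disc <= 0:
        return None  # the line misses (or only grazes) the tumor
    root = math.sqrt(disc)
    return ((-qb - root) / (2 * qa), (-qb + root) / (2 * qa))


def needle_hits_tumor(roots, needle_length):
    """A hit requires both roots to lie strictly inside the needle length."""
    return roots is not None and all(0.0 < t < needle_length for t in roots)


# Hypothetical check: a needle along the z axis through a unit-sphere tumor.
roots = needle_tumor_roots((0.0, 0.0, -2.0), 0.0, 0.0, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
```

In this check the needle enters 2 units below the tumor center, so the roots are the distances along the needle at which it enters and leaves the sphere.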
If real roots $t_1$ and $t_2$ exist, they give the points where the tumor ellipsoid and the line intersect. If these values are greater than 0 and less than the actual needle length, the needle has intersected the tumor. The amount of tumor extracted by the needle is proportional to the difference between the two roots of the quadratic, $|t_1 - t_2|$. By comparing the two roots, an estimate of the volume of the tumor that is contained in the needle can be made. If real roots do not exist, the needle does not intersect the tumor ellipsoid and no tumor information is gained by that needle.
In this analysis, each biopsy procedure was simulated on 1000 different gland models and the number of times a tumor was detected per procedure was recorded. This method does not differentiate between one or more needles detecting the tumor. It simply records a hit or miss per biopsy procedure. In addition, an estimate of the tumor volume is made whenever a tumor is detected.
2.2 Simulations
Since a fundamental goal of any biopsy is to determine whether or not the gland contains cancerous cells, the first series of simulations is intended to compare the detection rate of several biopsy techniques. The detection rate is defined as the ratio of the number of times a biopsy procedure detects a tumor to the total number of biopsies conducted. A set of 54 different biopsy procedures is simulated with variation in the following parameters: number of needles, offset between needles in the z direction, $\theta$ and $\phi$. The distance in the z direction between needles can be a relative spacing based on the gland dimension in the z direction or an absolute spacing of 1 cm between each needle. The first method is referred to as relative spacing since it depends on the gland size and separates the needles by equal distance. The second is referred to as the absolute spacing and has its basis in the SRSCB procedure.
As a means of clarification, Figures 2.5 and 2.6 illustrate the analysis of a single specimen and the execution of the entire experiment. Each of the 54 biopsy procedures is simulated on 1000 different gland models. The random number generator is seeded once for each series of 1000 simulations using a specific biopsy technique. Prior to the next technique, the random number generator is reseeded with the same number, thereby yielding the identical set of 1000 prostate models. This ensures that each of the biopsies is conducted on the same set of 1000 simulated glands. The detection rate is determined for each of these procedures and the results of the simulation are documented in Table 2.1.
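The reseeding scheme described above can be sketched as follows. This is a toy illustration only: `make_gland` and `detects` are hypothetical stand-ins for the gland generator and the biopsy simulation, and the `coverage` parameter is invented for the example.

```python
import random


def make_gland(rng):
    """Hypothetical stand-in for the gland and tumor generator: one random draw."""
    return rng.random()


def detects(procedure, gland):
    """Hypothetical stand-in for one trial: 1 if the biopsy finds the tumor."""
    return 1 if gland < procedure["coverage"] else 0


def detection_rates(procedures, seed, n_glands=1000):
    """Simulate every procedure on the same sequence of random glands."""
    rates = []
    for procedure in procedures:
        # Reseeding with the same number reproduces the identical gland set.
        rng = random.Random(seed)
        hits = sum(detects(procedure, make_gland(rng)) for _ in range(n_glands))
        rates.append(hits / n_glands)
    return rates


rates = detection_rates(
    [{"coverage": 0.2}, {"coverage": 0.3}, {"coverage": 0.2}], seed=7
)
```

Because every procedure sees the same glands, differences between detection rates reflect the procedures themselves and not the luck of the random draw.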
Figure 2.5. This flow chart depicts the top-level algorithm for modeling a single biopsy with several needles: make a gland model; make tumor(s); determine the starting location for all needles; simulate a single needle biopsy (solve equation (2.1) for roots using the initial needle position; store hit and volume results); repeat until all needles are done.
Figure 2.6. This flow chart depicts the simulation process for the entire simulation; each biopsy procedure is simulated on 1000 geometric gland models: read in the biopsy parameters for a given procedure; simulate this biopsy on a single gland model; repeat until 1000 glands are done; repeat until all 54 biopsy procedures are done.
2.3 Statistical Analysis of Results
In order to interpret the output from the simulations legitimately, a statistical tool is needed. First, we must determine whether or not the various biopsy settings influence the observed detection rate. In other words, is there a relationship between the settings of any one or combination of the four factors (number of needles, z spacing, $\theta$ and $\phi$) and the detection rate, or are the results completely random, therefore implying that the biopsy specification does not determine the detection rate? We need a mathematically sound method to compare the detection rates provided by the simulation and to infer some conclusions. The statistical model known as Analysis of Variance (ANOVA) was used to compare the population means between various treatments, thus resulting in a statistically valid conclusion. This model can be employed to determine whether the various factors interact and which factors have the most impact on the outcome. In order to describe the ANOVA model, a few definitions are required.
(1) Factors are the independent variables that are under investigation. In this instance, the biopsy parameters (number of needles, spacing method, $\theta$ and $\phi$) are the factors for the ANOVA model.
Factor    Number of Needles    Spacing Method    $\theta$    $\phi$
Levels    4                    Absolute          30°         30°
          6                    Relative          45°         45°
          8                                      60°         60°
(2) Factor levels are the values that each of the factors can take on during a single simulation. As shown in the list of biopsy simulation factors and levels, each factor does not have the same number of factor levels. The factor Spacing Method only has two factor levels, whereas the other three factors each have three factor levels.
(3) A treatment is a particular combination of levels of each of the factors involved in the experiment, where an experiment is the simulation of the treatment on 1000 geometric specimens. In this example, a treatment refers to a biopsy with specific settings (for example, 4 needles, absolute spacing, $\theta = 45^\circ$, $\phi = 45^\circ$). For the simulation, there are 54 different treatments and therefore 54 different experiments corresponding to all the combinations of the levels of the four factors.
(4) A trial is defined to be a simulation of one treatment on one geometric model. The outcome of a trial is either 1, the biopsy procedure detected the tumor, or 0, the tumor remained undetected. The outcome of the experiment is the detection rate achieved by a specific treatment simulated on 1000 geometric specimens. In other words, the outcome of the experiment is the number of specimens in which tumor is detected versus the total number of specimens simulated and is referred to as outcome for the remainder of this thesis.
2.4 Simulation Results
For each of the 54 treatments, the simulation is conducted on 1000 different gland models. The following table summarizes the treatment parameters as well as the results:
Experiment    Number of Needles    Spacing Method    $\theta$    $\phi$    Detection Rate
1             4                    Relative          45          45        0.252
2             6                    Relative          45          45        0.307
3             8                    Relative          45          45        0.335
4             4                    Absolute          45          45        0.263
5             6                    Absolute          45          45        0.293
6             8                    Absolute          45          45        0.298
7             4                    Relative          60          45        0.267
8             6                    Relative          60          45        0.341
9             8                    Relative          60          45        0.369
10            4                    Absolute          60          45        0.270
11            6                    Absolute          60          45        0.320
12            8                    Absolute          60          45        0.339
13            4                    Relative          30          45        0.196
14            6                    Relative          30          45        0.225
15            8                    Relative          30          45        0.255
16            4                    Absolute          30          45        0.207
17            6                    Absolute          30          45        0.221
18            8                    Absolute          30          45        0.221
Table 2.1. The results from the 54 geometric model experiments are displayed.
Experiment    Number of Needles    Spacing Method    $\theta$    $\phi$    Detection Rate
19            4                    Relative          45          60        0.200
20            6                    Relative          45          60        0.234
21            8                    Relative          45          60        0.268
22            4                    Absolute          45          60        0.211
23            6                    Absolute          45          60        0.225
24            8                    Absolute          45          60        0.228
25            4                    Relative          60          60        0.191
26            6                    Relative          60          60        0.254
27            8                    Relative          60          60        0.268
28            4                    Absolute          60          60        0.209
29            6                    Absolute          60          60        0.240
30            8                    Absolute          60          60        0.246
31            4                    Relative          30          60        0.172
32            6                    Relative          30          60        0.194
33            8                    Relative          30          60        0.219
34            4                    Absolute          30          60        0.188
35            6                    Absolute          30          60        0.197
36            8                    Absolute          30          60        0.197
37            4                    Relative          45          30        0.260
38            6                    Relative          45          30        0.316
39            8                    Relative          45          30        0.341
40            4                    Absolute          45          30        0.264
41            6                    Absolute          45          30        0.305
42            8                    Absolute          45          30        0.316
43            4                    Relative          60          30        0.283
44            6                    Relative          60          30        0.351
45            8                    Relative          60          30        0.385
46            4                    Absolute          60          30        0.279
47            6                    Absolute          60          30        0.346
48            8                    Absolute          60          30        0.372
49            4                    Relative          30          30        0.210
50            6                    Relative          30          30        0.247
51            8                    Relative          30          30        0.273
52            4                    Absolute          30          30        0.225
53            6                    Absolute          30          30        0.245
54            8                    Absolute          30          30        0.247
Table 2.1. (Cont.) The results from the 54 geometric model experiments are displayed.
2.4.1 Applying the ANOVA to the Biopsy Simulation Data
The biopsy simulation is a multifactored system, in which the four parameters (number of needles, spacing, $\theta$ and $\phi$) individually and perhaps in some combinations may have a measurable effect on the detection rate. Therefore a factor effects model is used in order to determine the impact of and interactions between these four parameters. This biopsy simulation is considered a complete factorial study since all possible combinations of the four parameters were simulated and evaluated. The indices $i, j, k, l$ refer to the levels of the factors number of needles, spacing method, $\theta$ and $\phi$ respectively.
In this multifactored system, a true overall mean, $\mu$, which is equivalent to the true overall detection rate, is assumed to exist. The entire simulation results in 54 observed detection rates, $p_{ijkl}$, each of which indicates the observed detection rate for a given experiment. This set of 54 observed detection rates is used in the ANOVA to determine estimated factor effects and an estimated overall mean which are used in the factor effects model. The factor effects model is used to predict a detection rate, a probability of detection, $\hat{p}_{ijkl}$, given the levels of the four factors.
A factor level mean is the average detection rate for a group of treatments that have one common factor level held constant while all others vary. For example, all outcomes from experiments with Number of Needles = 6 are averaged to yield the factor level mean for the factor Number of Needles at the level of 6 needles ($i = 2$). The overall mean, $\mu$, is simply the average outcome of all experiments. The difference between each factor level mean and the overall mean yields the main effect for that factor level. Because this model has 4 factors each with either 2 or 3 levels, the following main effects are designated.
$\alpha_i$: the main effect for the factor Number of Needles at each of its levels (4, 6, 8): $1 \le i \le 3$.
$\beta_j$: the main effect for the factor Spacing Method at each of its levels (0, 1): $1 \le j \le 2$.
$\gamma_k$: the main effect for the factor $\theta$ at each of its levels (30°, 45°, 60°): $1 \le k \le 3$.
$\delta_l$: the main effect for the factor $\phi$ at each of its levels (30°, 45°, 60°): $1 \le l \le 3$.
A factor at a particular level may influence another factor either by inhibiting or enhancing its impact. Because of these interactions between factors, the interaction effects are included in the model. Pairwise interaction effects are a measure of the combined effect of two factors, across the different levels, minus the main effects of these factors. We define these two-way effects as follows.
$(\alpha\beta)_{ij}$: number of needles and spacing method
$(\alpha\gamma)_{ik}$: number of needles and $\theta$
$(\alpha\delta)_{il}$: number of needles and $\phi$
$(\beta\gamma)_{jk}$: spacing method and $\theta$
$(\beta\delta)_{jl}$: spacing method and $\phi$
$(\gamma\delta)_{kl}$: $\theta$ and $\phi$
Three-way factor effects are a measure of the interaction effect of three factors.
$(\alpha\beta\gamma)_{ijk}$: number of needles, spacing method and $\theta$
$(\alpha\beta\delta)_{ijl}$: number of needles, spacing method and $\phi$
$(\beta\gamma\delta)_{jkl}$: spacing method, $\theta$ and $\phi$
$(\alpha\gamma\delta)_{ikl}$: number of needles, $\theta$ and $\phi$
The four-way effect is the measure of the interaction effect of all four factors.
$(\alpha\beta\gamma\delta)_{ijkl}$: number of needles, spacing method, $\theta$ and $\phi$
PAGE 27
Summary of V ariables T rue o v erall mean Estimated o v erall mean ^ T rue treatmen t mean ijkl Estimated treatmen t mean ^ ijkl Observ ed treatmen t detection rate p ijkl T ransformed observ ed treatmen t detection rate Y ijkl Estimated treatmen t detection rate ^ p ijkl T ransformed estimated treatmen t detection rate ^ Y ijkl Av erage observ ed detection rate p T rue main factor lev el eects i j r k l Estimated main factor lev el eects ^ i ^ j ^ r k ^ l T rue t w ow a y eects ( ) ij ( r ) ik ( ) il ( r ) jk ( ) jl ( r ) kl Estimated t w ow a y eects d ( ) ij d ( r ) ik d ( ) il d ( r ) jk d ( ) jl d ( r ) kl T able 2.2. A list of the v ariables used in the ANO V A analysis is displa y ed. The factor eects model tak es the general form ijkl = + i + j + r k + l +( ) ij +( r ) ik +( ) il +( r ) jk +( ) jl +( r ) kl +( r ) ijk + ( ) ijl + ( r ) jkl + ( r ) ikl + ( r ) ijkl : The observ ed outcome, the detection rate for a particular treatmen t, as giv en in T able 2.1, is p ijkl and is the sum of the true mean for that treatmen t and a residual term: p ijkl = ijkl + ijkl : 21
The goal of the analysis is to formulate a model that predicts the outcome of a given treatment. Since the true means and true factor effects are not known, estimates of these terms are determined from the simulation and used in the model. Estimated values are indicated with the hat (^) notation. The predicted outcome $\hat{p}_{ijkl}$ is represented by the following relationship:

$$\hat{p}_{ijkl} = \hat{\mu}_{....} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_k + \hat{\delta}_l + \widehat{(\alpha\beta)}_{ij} + \widehat{(\alpha\gamma)}_{ik} + \widehat{(\alpha\delta)}_{il} + \widehat{(\beta\gamma)}_{jk} + \widehat{(\beta\delta)}_{jl} + \widehat{(\gamma\delta)}_{kl} + \widehat{(\alpha\beta\gamma)}_{ijk} + \widehat{(\alpha\beta\delta)}_{ijl} + \widehat{(\beta\gamma\delta)}_{jkl} + \widehat{(\alpha\gamma\delta)}_{ikl} + \widehat{(\alpha\beta\gamma\delta)}_{ijkl}.$$

In this equation $\hat{p}_{ijkl}$ is the estimated probability of detecting a tumor at the factor levels indicated by $i, j, k, l$. This probability is predicted by the model using least squares estimators for the terms in the equation. The probability of detection is a function of the estimated overall mean, $\hat{\mu}_{....}$, and the estimated effects from the four factors, alone and in combination with one another. Some of these effects may not be significant. In order to determine which of the factors significantly affect the detection rate and therefore belong in the final model, various means are evaluated. If all the means for a particular factor (or combination of factors) are equal, varying the level of that factor does not add to or subtract from the overall mean, and therefore the factor does not belong in the final model. This equality question is put not only to each factor individually but to all the combinations of factors as well.
2.4.2 ANOVA Mechanics

Use of the ANOVA model is founded on several assumptions: (1) The outcomes follow a normal probability distribution. (2) Each distribution has the same variance. (3) The outcomes for each factor level are independent of the other factor level outcomes. With these assumptions in mind, note that the probability distributions of a factor at each of its levels differ only with respect to the mean [4]. Therefore, the first step in executing the analysis is to determine whether the detection rates are statistically different. Secondly, if they are different, one of the intents of the ANOVA model is to determine whether the difference between the detection rates of two or more treatments is sufficient, after examining the variability within the treatments, to conclude that one treatment does indeed produce a higher detection rate. In addition, by evaluating the statistical data, conclusions may be drawn as to how each factor, both independently and within established interaction groups (pairwise, three-way or four-way), influences the outcome.
2.4.3 Residuals

We define $\bar{p}$ to be the average of all observations. The model states that $p_{ijkl} = \mu_{ijkl} + \epsilon_{ijkl}$; therefore the residual term is $\epsilon_{ijkl} = p_{ijkl} - \mu_{ijkl}$. Since $\mu_{ijkl}$ is estimated by $\hat{\mu}_{ijkl}$, the estimated residual term is $e_{ijkl} = p_{ijkl} - \hat{\mu}_{ijkl}$, the difference between the observed and the estimated average detection rate. The set of all 54 residuals, $e_{ijkl}$ for all $i$, $j$, $k$ and $l$, is evaluated for three characteristics which indicate whether the fitted data are well suited for the analysis. These characteristics are:

1. Normality of error terms.
2. Constancy of error variance.
3. Independence of error terms.

Several statistical tests and plots applied to the residual data determine whether any of these assumptions is violated. These tests revealed that the error variances were not stable, thus violating the second characteristic. A transformation was employed to preserve the statistical information in the output while stabilizing the error variances. Since nothing is lost by employing a transformation and the error variances are stabilized, the detection rate data $p$ are transformed to $Y$ via the following relationship:

$$Y = 2 \arcsin(\sqrt{p}).$$
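The transformation and its inverse can be illustrated with a short Python sketch. This is only an illustration of the stated formula, not the software used for the thesis; the function names are hypothetical.

```python
import math

def arcsine_transform(p):
    """Variance-stabilizing transform Y = 2*arcsin(sqrt(p)) for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p))

def inverse_arcsine_transform(y):
    """Back-transform p = sin(Y/2)**2 for a transformed value Y in [0, pi]."""
    if not 0.0 <= y <= math.pi:
        raise ValueError("y must lie in [0, pi]")
    return math.sin(y / 2.0) ** 2

# A detection rate of 0.25: sqrt(0.25) = 0.5 and arcsin(0.5) = pi/6, so Y = pi/3.
y = arcsine_transform(0.25)
p_back = inverse_arcsine_transform(y)  # recovers the original proportion
```

Applying the inverse function to a model prediction on the transformed scale returns an estimated detection probability on the original scale.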
The outcome from these simulations is the detection rate, the ratio of the number of specimens in which tumor is detected to the total number of specimens. The arcsine transformation is the most appropriate transformation when the outcome is a proportion [4]. All ANOVA data referenced from this point on are transformed unless noted otherwise. The inverse transformation is calculated at the conclusion of this analysis to get a true estimate of the probability.

2.4.4 The Null and Alternate Hypotheses

A starting point in the ANOVA process is to establish two hypotheses, a null and an alternate hypothesis. The null hypothesis assumes that all effects are equal, indicating that specific factor levels do not influence the outcome. The alternate hypothesis assumes that at least two of the effects are not the same. The F-test is used to decide which of these two hypotheses concerning the data will be accepted. The test consists of computing the ratio of between-effect variation to within-effect variation. This between-effect variation, which changes depending on the effect, is called the treatment sum of squares and is denoted SSA, SSB, SSC and SSD (see also the Appendix). It is a measure of the difference between the detection rate of a set of treatments and the average detection rate over all treatments. The within-effect variation
is called the error sum of squares and is denoted SSE. It is a measure of the difference between the individual outcome for a given treatment and the estimated detection rate for that treatment. The error sum of squares measures variability that is not explained by the SSA, SSB, SSC or SSD terms and therefore occurs within the set of treatments. Both of these variation measurements are evaluated using sum of squares expressions as detailed in the Appendix. The mean squares of the SSA, SSB, SSC, SSD and SSE terms are MSA, MSB, MSC, MSD and MSE respectively, and are computed by dividing each sum of squares by the degrees of freedom, df, associated with that term. This results in $F = MSA/MSE$, where $MSA = SSA/df_A$ ($MSB = SSB/df_B$, etc.) and $MSE = SSE/df$. Large values of $F$ tend to support the conclusion that not all the effects are equal ($H_a$), whereas values of $F$ near 1 support the null hypothesis ($H_0$). In the event that the alternate hypothesis is indicated via the F-test, the ANOVA also provides the probability of a Type I error. A Type I error occurs when it is concluded that differences between means exist when, in fact, they do not (i.e., $H_a$ is accepted when in fact $H_0$ is true). This information is given in the column labelled Pr(F) in the ANOVA output in Table 2.3.
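As a worked illustration of the F ratio, the following Python sketch recomputes the F value for the Number of Needles factor from the sum of squares and degrees of freedom printed in Table 2.3. The function names are hypothetical, and because the table entries are rounded the result agrees with the tabulated F value only approximately.

```python
def mean_square(sum_of_squares, df):
    """Mean square: a sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df

def f_statistic(ss_effect, df_effect, ss_error, df_error):
    """F ratio of an effect mean square to the error mean square."""
    return mean_square(ss_effect, df_effect) / mean_square(ss_error, df_error)

# Needles row (SSA = 0.15862, df = 2) and Residuals row (SSE = 0.00365, df = 28)
# as printed in Table 2.3.
f_needles = f_statistic(0.15862, 2, 0.00365, 28)
print(f"F for Number of Needles is about {f_needles:.1f}")
```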
2.4.5 Are the Main Effects all Equal?

Following the general process of establishing null and alternate hypotheses as described above, a pair of null and alternate hypotheses is stated for each factor in the biopsy model. The null hypothesis assumes that the main effects for a given factor at each of its levels are equivalent. The alternate hypothesis assumes that the main effects differ.

  $H_0$: $\alpha_1 = \alpha_2 = \alpha_3$     $H_a$: not all $\alpha_i$ are equal.
  $H_0$: $\beta_1 = \beta_2$                 $H_a$: not all $\beta_j$ are equal.
  $H_0$: $\gamma_1 = \gamma_2 = \gamma_3$     $H_a$: not all $\gamma_k$ are equal.
  $H_0$: $\delta_1 = \delta_2 = \delta_3$     $H_a$: not all $\delta_l$ are equal.

The F-test statistic is applied to determine which hypothesis to accept in each case. The factor sums of squares for the four factors (number of needles, spacing, first angle and second angle), denoted SSA, SSB, SSC and SSD respectively, are computed as shown in the Appendix. The mean square for each factor is computed by dividing its sum of squares by the associated degrees of freedom, so that $MSA = SSA/df_A$, $MSB = SSB/df_B$, etc., as detailed in the Appendix. The test statistic is formed for each hypothesis in the following manner. To test the effect of the first factor, Number of Needles, $F = MSA/MSE$; to test the effect of the spacing factor,
$F = MSB/MSE$; to test the effect of the first angle factor, $F = MSC/MSE$; and to test the effect of the second angle factor, $F = MSD/MSE$. Accepting the alternate hypothesis means that a specific setting of the given factor corresponds to a change in detection rate; thus that factor has an effect on the overall outcome of the biopsy.

                                 Df   Sum of Sq   Mean Sq    F Value       Pr(F)
Main effects
  Needles                         2    0.15862    0.07931    607.427   0.0000000
  Spacing                         1    0.00498    0.00498     38.209   0.0000011
  First angle                     2    0.29249    0.14624   1120.073   0.0000000
  Second angle                    2    0.28115    0.14057   1076.661   0.0000000
Two-way effects
  Needles:Spacing                 2    0.01641    0.00820     62.846   0.0000000
  Needles:First angle             4    0.01444    0.00361     27.653   0.0000000
  Spacing:First angle             2    0.00059    0.00029      2.283   0.1206068
  Needles:Second angle            4    0.00395    0.00098      7.569   0.0002892
  Spacing:Second angle            2    0.00046    0.00023      1.794   0.1848710
  First angle:Second angle        4    0.02867    0.00716     54.902   0.0000000
Residuals                        28    0.00365    0.00013

Table 2.3. The output from the ANOVA is displayed above. See Appendix for details of the calculations.

Referring to this ANOVA output, the column of numbers labelled Sum of Sq refers to the parameters SSA, SSB, SSC and SSD detailed in the Appendix. The column labelled Mean Sq lists the parameters MSA, MSB, MSC and MSD. The F Value column lists the F-test outcome for each row (for example, the Needles F Value = MSA/MSE). The larger values in this column tend to support the alternate hypothesis that the main effect for a given factor differs across
the possible levels for that factor. The final column, Pr(F), gives the probability of a Type I error. Again, a Type I error occurs if the alternate hypothesis is concluded when, in fact, the null hypothesis is true. The row labelled Residuals gives the degrees of freedom, the SSE and the MSE for the error term of this analysis. Based on the numbers in the table, each of the four main effects has a significant effect on the outcome, with the first angle factor having the greatest influence on the detection rate, followed by the second angle factor and Number of Needles. This is indicated by the high F value that corresponds to each of the four factors.

The rows labelled with two factor names (for example, Needles:Spacing) give the ANOVA output corresponding to pairwise interactions and include the sum of squares computed for each pair of factors. The sums of squares for all of the pairwise interaction terms (SSAB, SSAC, SSAD, SSBC, SSBD, SSCD) are computed as detailed in the Appendix. The total treatment sum of squares is

$$SSTR = SSA + SSB + SSC + SSD + SSAB + SSAC + SSAD + SSBC + SSBD + SSCD.$$

This sum does not include the sum of squares terms due to the three-way and four-way interactions because there are not enough degrees of freedom in the experiment to use the full model.
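The degrees-of-freedom constraint can be checked with a small bookkeeping sketch in Python. It assumes one observation per treatment (54 treatments in total) and the usual rule that an interaction's degrees of freedom are the product of (levels - 1) over the factors involved; the factor keys `angle_1` and `angle_2` are hypothetical names for the two angle factors.

```python
from itertools import combinations
from math import prod

# Number of levels of each factor in the 3 x 2 x 3 x 3 design.
levels = {"needles": 3, "spacing": 2, "angle_1": 3, "angle_2": 3}

def interaction_df(factors):
    """Degrees of freedom of a main effect or interaction: product of (levels - 1)."""
    return prod(levels[f] - 1 for f in factors)

total_df = prod(levels.values()) - 1  # 54 treatments, one observation each

# Degrees of freedom used by all effects of order 1 (main) through 4 (four-way).
df_by_order = {
    order: sum(interaction_df(c) for c in combinations(levels, order))
    for order in range(1, 5)
}

residual_df_pairwise_model = total_df - df_by_order[1] - df_by_order[2]
residual_df_full_model = total_df - sum(df_by_order.values())
```

Under these assumptions the model with main and pairwise effects leaves 28 residual degrees of freedom, matching the Residuals row of Table 2.3, while the full model leaves none for estimating the error.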
2.4.6 Recognizing Interaction between Factors

At this point, the F-test has determined that each of the main factor effects contributes to the overall detection rate. To evaluate the interaction effects, the F-test is applied again, in this case to determine interaction between two, three or four factors. A null and an alternate hypothesis are formulated for all possible combinations of factors, and sum of squares terms are computed for the factor groups and used in each F-test. The null and alternate hypotheses are constructed for each of the pairwise interactions.

  $H_0$: all $(\alpha\beta)_{ij} = 0$     $H_a$: not all $(\alpha\beta)_{ij} = 0$
  $H_0$: all $(\alpha\gamma)_{ik} = 0$    $H_a$: not all $(\alpha\gamma)_{ik} = 0$
  $H_0$: all $(\alpha\delta)_{il} = 0$    $H_a$: not all $(\alpha\delta)_{il} = 0$
  $H_0$: all $(\beta\gamma)_{jk} = 0$     $H_a$: not all $(\beta\gamma)_{jk} = 0$
  $H_0$: all $(\beta\delta)_{jl} = 0$     $H_a$: not all $(\beta\delta)_{jl} = 0$
  $H_0$: all $(\gamma\delta)_{kl} = 0$    $H_a$: not all $(\gamma\delta)_{kl} = 0$

All three-way combinations are formed, hypotheses are constructed and F-test results are evaluated.

  $H_0$: all $(\alpha\beta\gamma)_{ijk} = 0$     $H_a$: not all $(\alpha\beta\gamma)_{ijk} = 0$
  $H_0$: all $(\alpha\beta\delta)_{ijl} = 0$     $H_a$: not all $(\alpha\beta\delta)_{ijl} = 0$
  $H_0$: all $(\alpha\gamma\delta)_{ikl} = 0$    $H_a$: not all $(\alpha\gamma\delta)_{ikl} = 0$
  $H_0$: all $(\beta\gamma\delta)_{jkl} = 0$     $H_a$: not all $(\beta\gamma\delta)_{jkl} = 0$

The null/alternate set of hypotheses is constructed for the four-way interaction.
  $H_0$: all $(\alpha\beta\gamma\delta)_{ijkl} = 0$     $H_a$: not all $(\alpha\beta\gamma\delta)_{ijkl} = 0$

Based on the actual ANOVA results in the preceding table, four of the pairwise interactions appear strongly significant: Needles:Spacing, Needles:First angle, Needles:Second angle and First angle:Second angle. The other two pairwise interactions are included in the final model even though the strength of their significance is uncertain. The ANOVA was executed once to include all three-way interactions. Since these interactions proved insignificant, they are not included in the model. There are not enough degrees of freedom in the experiment to estimate the residuals and test for the four-way interaction. As stated previously, the $Y$ notation indicates the transformed detection rate ($p$). At this point the general model, of the form

$$
\begin{aligned}
Y_{ijklm} = {} & \mu_{....} + \alpha_i + \beta_j + \gamma_k + \delta_l && \text{(main effects)} \\
& + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\alpha\delta)_{il} + (\beta\gamma)_{jk} + (\beta\delta)_{jl} + (\gamma\delta)_{kl} && \text{(pairwise effects)} \\
& + (\alpha\beta\gamma)_{ijk} + (\alpha\beta\delta)_{ijl} + (\beta\gamma\delta)_{jkl} + (\alpha\gamma\delta)_{ikl} && \text{(three-way effects)} \\
& + (\alpha\beta\gamma\delta)_{ijkl} && \text{(four-way effect)} \\
& + \epsilon_{ijklm} && \text{(residual error)}
\end{aligned}
$$

is reduced to the final model for this analysis:

$$\hat{Y}_{ijkl} = \hat{\mu}_{....} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_k + \hat{\delta}_l + \widehat{(\alpha\beta)}_{ij} + \widehat{(\alpha\gamma)}_{ik} + \widehat{(\alpha\delta)}_{il} + \widehat{(\beta\gamma)}_{jk} + \widehat{(\beta\delta)}_{jl} + \widehat{(\gamma\delta)}_{kl}.$$

This model yields the transformed probability of detection at the given levels for $i$, $j$, $k$ and $l$.
Now that the factor effects have been identified, the analysis revolves around determining the factor levels that result in the highest detection rate. For this part of the analysis, the tables of means and tables of effects are evaluated.

  Grand mean      $\mu_{....}$     1.072

  Needles                          4         6         8
                  $\mu_{i...}$     0.999     1.09      1.128

  Spacing                          Relative  Absolute
                  $\mu_{.j..}$     1.082     1.063

  First angle                      30        45        60
                  $\mu_{..k.}$     0.9723    1.098     1.147

  Second angle                     30        45        60
                  $\mu_{...l}$     1.14      1.1104    0.9724

Table 2.4. The ANOVA tables of means list the transformed values.

  Needles x first angle          30       45       60
    4                          0.926    1.027    1.045
    6                          0.979    1.113    1.176
    8                          1.012    1.152    1.221

  Spacing x first angle          30       45       60
    Relative                   0.978    1.111    1.157
    Absolute                   0.967    1.084    1.137

  Needles x second angle         30       45       60
    4                          1.054    1.028    0.915
    6                          1.161    1.123    0.985
    8                          1.205    1.163    1.017

  Spacing x second angle         30       45       60
    Relative                   1.148    1.118    0.980
    Absolute                   1.132    1.091    0.965

  Needles x spacing           Relative  Absolute
    4                          0.987    1.011
    6                          1.099    1.080
    8                          1.159    1.097

  First angle (rows) x second angle (columns)
                                 30       45       60
    30                         1.026    0.978    0.913
    45                         1.159    1.139    0.994
    60                         1.235    1.196    1.010

Table 2.5. The transformed values of the pairwise means are shown.
Referring to the ANOVA tables of means, the highest numbers in each category reflect the best setting for a particular factor. On reading through the tables of means, the conclusion is that a technique of 8 needles, relative spacing, first angle = 60° and second angle = 30° yields the best detection rate. To corroborate this more fully, the interactions that are deemed significant are analysed to verify that the main effect is not contradicted by an interaction. For example, the table for Needles:First angle is reviewed, and it is found that the setting of 8 needles and first angle = 60° again yields the highest mean. The tables for all of the pairwise combinations are reviewed to determine whether the best settings yield the highest means in the interaction tables just as they did in the main effect tables. This proves to be the case, so none of the interactions contradicts the conclusion drawn from the main effect information.
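The selection of the best levels can be sketched in Python using the means printed in Tables 2.4 and 2.5. This is an illustrative check only; the dictionary keys `angle_1` and `angle_2` are hypothetical names for the two angle factors.

```python
# Main-effect means on the transformed scale (Table 2.4).
means = {
    "needles": {4: 0.999, 6: 1.09, 8: 1.128},
    "spacing": {"relative": 1.082, "absolute": 1.063},
    "angle_1": {30: 0.9723, 45: 1.098, 60: 1.147},
    "angle_2": {30: 1.14, 45: 1.1104, 60: 0.9724},
}

# Best level of each factor: the level with the highest mean.
best = {factor: max(table, key=table.get) for factor, table in means.items()}

# Pairwise means for needles x first angle (Table 2.5).
needles_by_angle_1 = {
    (4, 30): 0.926, (4, 45): 1.027, (4, 60): 1.045,
    (6, 30): 0.979, (6, 45): 1.113, (6, 60): 1.176,
    (8, 30): 1.012, (8, 45): 1.152, (8, 60): 1.221,
}

# The interaction table should not contradict the main-effect choice.
best_pair = max(needles_by_angle_1, key=needles_by_angle_1.get)
consistent = best_pair == (best["needles"], best["angle_1"])
```

The same comparison can be repeated for each of the other pairwise tables.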
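  Number of Needles (4, 6, or 8)       $\hat{\alpha}_1$    $\hat{\alpha}_2$    $\hat{\alpha}_3$
    Effect                             -0.07329     0.01723     0.05607

  Spacing (Relative or Absolute)       $\hat{\beta}_1$     $\hat{\beta}_2$
    Effect                              0.009612   -0.009612

  First angle (30°, 45° or 60°)        $\hat{\gamma}_1$    $\hat{\gamma}_2$    $\hat{\gamma}_3$
    Effect                             -0.1001      0.02519     0.07486

  Second angle (30°, 45° or 60°)       $\hat{\delta}_1$    $\hat{\delta}_2$    $\hat{\delta}_3$
    Effect                              0.0678      0.03215    -0.09995

Table 2.6. The main factor level effects from the ANOVA output are documented.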
  Needles x spacing            Relative    Absolute
    4                         -0.02127     0.02127
    6                         -0.00017     0.00017
    8                          0.02143    -0.02143

  Needles x first angle           30          45          60
    4                          0.02680     0.00244    -0.02925
    6                         -0.01031    -0.00127     0.01158
    8                         -0.01649    -0.00118     0.01767

  Spacing x first angle           30          45          60
    Relative                  -0.004354    0.003708    0.000646
    Absolute                   0.004354   -0.003708   -0.000646

  Needles x second angle          30          45          60
    4                         -0.01271    -0.00292     0.01563
    6                          0.00363     0.00087    -0.00450
    8                          0.00907     0.00206    -0.01113

  Spacing x second angle          30          45          60
    Relative                  -0.001740    0.004148   -0.002407
    Absolute                   0.001740   -0.004148    0.002407

  First angle (rows) x second angle (columns)
                                  30          45          60
    30                        -0.01404    -0.02664     0.04067
    45                        -0.00621     0.00978    -0.00357
    60                         0.02025     0.01686    -0.03711

Table 2.7. The ANOVA table of effects for pairwise interactions is displayed.
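The relationship between the tables of means and the tables of effects can be sketched as follows. The Python fragment assumes the standard definitions (a main effect is a level mean minus the grand mean; a pairwise effect is the cell mean minus both level means plus the grand mean) and uses the rounded means from Tables 2.4 and 2.5, so it reproduces the tabulated effects only to about three decimal places. The function names are hypothetical.

```python
grand_mean = 1.072
needle_means = {4: 0.999, 6: 1.09, 8: 1.128}
spacing_means = {"relative": 1.082, "absolute": 1.063}
needles_by_spacing = {
    (4, "relative"): 0.987, (4, "absolute"): 1.011,
    (6, "relative"): 1.099, (6, "absolute"): 1.080,
    (8, "relative"): 1.159, (8, "absolute"): 1.097,
}

def main_effect(level_mean, overall_mean):
    """Main effect of a factor level: level mean minus overall mean."""
    return level_mean - overall_mean

def pairwise_effect(cell_mean, row_mean, col_mean, overall_mean):
    """Two-way effect: combined effect of the pair minus both main effects."""
    return cell_mean - row_mean - col_mean + overall_mean

# Effect of 8 needles, and the 8 needles x relative spacing interaction.
alpha_3 = main_effect(needle_means[8], grand_mean)
alpha_beta_31 = pairwise_effect(
    needles_by_spacing[(8, "relative")],
    needle_means[8],
    spacing_means["relative"],
    grand_mean,
)
```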
By using the values from the tables of effects, a probability of detection is calculated for the optimal setting:

$$\hat{Y}_{3131} = \hat{\mu}_{....} + \hat{\alpha}_3 + \hat{\beta}_1 + \hat{\gamma}_3 + \hat{\delta}_1 + \widehat{(\alpha\beta)}_{31} + \widehat{(\alpha\gamma)}_{33} + \widehat{(\alpha\delta)}_{31} + \widehat{(\beta\gamma)}_{13} + \widehat{(\beta\delta)}_{11} + \widehat{(\gamma\delta)}_{31}$$

$$1.347918 = 1.072 + 0.05607 + 0.009612 + 0.07486 + 0.0678 + 0.02143 + 0.01767 + 0.00907 + 0.000646 - 0.00174 + 0.02025$$

This result of 1.347918 is then transformed back (arcsine equation) to yield a probability of 0.38948 for this setting:

$$1.347918 = 2 \arcsin(\sqrt{p}), \qquad p = \left(\sin(1.347918/2)\right)^2 = 0.38948.$$

Therefore, with the factors set to 8 needles, relative spacing, first angle = 60° and second angle = 30°, the biopsy procedure has a 38.9% probability of detecting the cancer, given the tumor distribution model used. This estimated probability is best used in comparisons with the other estimated probabilities rather than as an absolute measure of detection rate. The conclusion from this analysis is therefore a relative ranking of treatments in terms of their detection rate. Since the 1000 simulated specimens were the same for each treatment, the ANOVA model determined the relative differences between detection rates of various treatments, not necessarily providing enough data and results to draw
conclusions about absolute detection rates. Table 2.8 lists each experiment and the probability of detection predicted from the factor effects model.

                     Treatment Parameters
             Number of  Spacing    First  Second   Predicted
Experiment   Needles    Method     Angle  Angle    Probability
     1          4       Relative    45     45        0.247
     2          6       Relative    45     45        0.297
     3          8       Relative    45     45        0.327
     4          4       Absolute    45     45        0.251
     5          6       Absolute    45     45        0.281
     6          8       Absolute    45     45        0.291
     7          4       Relative    60     45        0.265
     8          6       Relative    60     45        0.337
     9          8       Relative    60     45        0.369
    10          4       Absolute    60     45        0.271
    11          6       Absolute    60     45        0.324
    12          8       Absolute    60     45        0.335
    13          4       Relative    30     45        0.195
    14          6       Relative    30     45        0.227
    15          8       Relative    30     45        0.251
    16          4       Absolute    30     45        0.205
    17          6       Absolute    30     45        0.219
    18          8       Absolute    30     45        0.224
    19          4       Relative    45     60        0.200
    20          6       Relative    45     60        0.236
    21          8       Relative    45     60        0.260
    22          4       Absolute    45     60        0.208
    23          6       Absolute    45     60        0.227
    24          8       Absolute    45     60        0.232
    25          4       Relative    60     60        0.192
    26          6       Relative    60     60        0.247
    27          8       Relative    60     60        0.273
    28          4       Absolute    60     60        0.203
    29          6       Absolute    60     60        0.241

Table 2.8. The probabilities of detection for one-tumor simulations are displayed.
                     Treatment Parameters
             Number of  Spacing    First  Second   Predicted
Experiment   Needles    Method     Angle  Angle    Probability
    30          8       Absolute    60     60        0.248
    31          4       Relative    30     60        0.175
    32          6       Relative    30     60        0.196
    33          8       Relative    30     60        0.215
    34          4       Absolute    30     60        0.189
    35          6       Absolute    30     60        0.194
    36          8       Absolute    30     60        0.195
    37          4       Relative    45     30        0.257
    38          6       Relative    45     30        0.314
    39          8       Relative    45     30        0.346
    40          4       Absolute    45     30        0.266
    41          6       Absolute    45     30        0.303
    42          8       Absolute    45     30        0.315
    43          4       Relative    60     30        0.276
    44          6       Relative    60     30        0.354
    45          8       Relative    60     30        0.389
    46          4       Absolute    60     30        0.287
    47          6       Absolute    60     30        0.346
    48          8       Absolute    60     30        0.360
    49          4       Relative    30     30        0.208
    50          6       Relative    30     30        0.246
    51          8       Relative    30     30        0.272
    52          4       Absolute    30     30        0.223
    53          6       Absolute    30     30        0.243
    54          8       Absolute    30     30        0.250

Table 2.8. (Cont.) The probabilities of detection for one-tumor simulations are displayed.

2.4.7 Clinical Distribution of Tumors

The biopsy simulations were conducted a second time on more realistic geometric glands. By using a clinically derived distribution of the number of tumors per gland, a better population was available for these biopsy simulations. A sample size of 1000 was again used, but in this experiment 1/4
of the glands had a single tumor, 1/2 had two tumors and the remaining 1/4 had 3 tumors. The total gland volume was again held to be less than 6.4 cc. This distribution is based on the analysis done by Daneshagari [2]. The ANOVA results are found in the Appendix and yield the same optimal biopsy procedure, with a slightly different probability resulting from the factor effects model. By using the values from this second table of effects, a probability of detection is calculated for the optimal setting:

$$\hat{Y}_{3131} = \hat{\mu}_{....} + \hat{\alpha}_3 + \hat{\beta}_1 + \hat{\gamma}_3 + \hat{\delta}_1 + \widehat{(\alpha\beta)}_{31} + \widehat{(\alpha\gamma)}_{33} + \widehat{(\alpha\delta)}_{31} + \widehat{(\beta\gamma)}_{13} + \widehat{(\beta\delta)}_{11} + \widehat{(\gamma\delta)}_{31}$$

$$1.7535 = 1.429 + 0.0733 + 0.01507 + 0.07456 + 0.07091 + 0.02321 + 0.02650 + 0.01412 - 0.005442 - 0.004094 + 0.03638$$

Transforming this value back (arcsine) yields a probability of detection for the optimal setting of 0.5908. This probability of 59.08% is higher than the 38.9% achieved by the simulation using geometric models of one tumor, as would be expected. The predicted probabilities for each of the 54 experiments given this distribution of tumors are shown in Table 2.9.
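The prediction for the optimal setting can be reproduced with a few lines of Python. The sketch sums the estimated terms quoted above for the one-to-three-tumor distribution and applies the inverse arcsine transformation; the dictionary keys are hypothetical labels for the terms of the final model.

```python
import math

# Estimated terms for 8 needles, relative spacing, first angle 60, second angle 30,
# as quoted in the text for the clinically derived tumor distribution.
terms = {
    "mu": 1.429,
    "alpha_3": 0.0733,
    "beta_1": 0.01507,
    "gamma_3": 0.07456,
    "delta_1": 0.07091,
    "alpha_beta_31": 0.02321,
    "alpha_gamma_33": 0.02650,
    "alpha_delta_31": 0.01412,
    "beta_gamma_13": -0.005442,
    "beta_delta_11": -0.004094,
    "gamma_delta_31": 0.03638,
}

y_hat = sum(terms.values())            # prediction on the transformed scale
p_hat = math.sin(y_hat / 2.0) ** 2     # back-transform to a probability

print(f"Y = {y_hat:.4f}, p = {p_hat:.4f}")
```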
                     Treatment Parameters
             Number of  Spacing    First  Second   Predicted
Experiment   Needles    Method     Angle  Angle    Probability
     1          4       Relative    45     45        0.417
     2          6       Relative    45     45        0.489
     3          8       Relative    45     45        0.526
     4          4       Absolute    45     45        0.417
     5          6       Absolute    45     45        0.470
     6          8       Absolute    45     45        0.482
     7          4       Relative    60     45        0.427
     8          6       Relative    60     45        0.524
     9          8       Relative    60     45        0.569
    10          4       Absolute    60     45        0.436
    11          6       Absolute    60     45        0.514
    12          8       Absolute    60     45        0.533
    13          4       Relative    30     45        0.353
    14          6       Relative    30     45        0.405
    15          8       Relative    30     45        0.431
    16          4       Absolute    30     45        0.354
    17          6       Absolute    30     45        0.387
    18          8       Absolute    30     45        0.388
    19          4       Relative    45     60        0.358
    20          6       Relative    45     60        0.408
    21          8       Relative    45     60        0.443
    22          4       Absolute    45     60        0.360
    23          6       Absolute    45     60        0.391
    24          8       Absolute    45     60        0.401
    25          4       Relative    60     60        0.322
    26          6       Relative    60     60        0.395
    27          8       Relative    60     60        0.437
    28          4       Absolute    60     60        0.332
    29          6       Absolute    60     60        0.386
    30          8       Absolute    60     60        0.403

Table 2.9. Given the distribution of one to three tumors, the probabilities of detection predicted by the ANOVA model are displayed.
                     Treatment Parameters
             Number of  Spacing    First  Second   Predicted
Experiment   Needles    Method     Angle  Angle    Probability
    31          4       Relative    30     60        0.326
    32          6       Relative    30     60        0.357
    33          8       Relative    30     60        0.381
    34          4       Absolute    30     60        0.329
    35          6       Absolute    30     60        0.341
    36          8       Absolute    30     60        0.340
    37          4       Relative    45     30        0.417
    38          6       Relative    45     30        0.498
    39          8       Relative    45     30        0.541
    40          4       Absolute    45     30        0.425
    41          6       Absolute    45     30        0.486
    42          8       Absolute    45     30        0.504
    43          4       Relative    60     30        0.436
    44          6       Relative    60     30        0.541
    45          8       Relative    60     30        0.590
    46          4       Absolute    60     30        0.451
    47          6       Absolute    60     30        0.537
    48          8       Absolute    60     30        0.562
    49          4       Relative    30     30        0.351
    50          6       Relative    30     30        0.412
    51          8       Relative    30     30        0.444
    52          4       Absolute    30     30        0.359
    53          6       Absolute    30     30        0.401
    54          8       Absolute    30     30        0.407

Table 2.9. (Cont.) Given the distribution of one to three tumors, the probabilities of detection predicted by the ANOVA model are displayed.

A selection of detection rates is graphed in Figure 2.7 to provide visualization of the relative ranking of various treatments. The plots cover 6 and 8 needles, relative spacing and all of the levels of the two angle factors.
[Figure 2.7 plot: hit rate (vertical axis, about 0.35 to 0.6) against one angle factor (30, 45, 60), with one curve for each combination of the other angle factor (30, 45, 60) and number of needles (6, 8).]

Figure 2.7. The detection rates for several experiments are graphed and the common treatment parameters are noted for each experiment. This gives a visual understanding of the ranking of these treatments in terms of their detection rate.
3. Digitized Specimen Data

3.1 Summary of Software Tool

An analysis program, written in C, was created to simulate needle biopsies on clinical data provided by the University of Colorado Health Sciences Center, Pathology Department. The clinical data were gathered from autopsies, pathologically investigated and digitized [2]. The data for each specimen are stored as a 3-dimensional array of information. The software uses an input file to determine the characteristics of a given experiment. These characteristics include the number of needles, the initial placement of the first needle, the angles and the spacing between needles, and the needle diameter and length. In this manner, the analysis software is flexible enough to handle a variety of simulations. The goal of this biopsy simulation tool is to provide the means to experiment realistically with various needle parameters on clinical data in order to determine any correspondence between biopsy methods and detection rates.

The initial needle position is offset by the distance requested (the z offset entered by the user), with half of the needles entering the right lobe
and the other half entering the left lobe, in symmetry with each other. The initial position is determined as an absolute offset (in cm) from the apex of the gland. The other parameters are used to position each needle on the specimen data set and to determine how much of the specimen data is returned in the needle biopsy. This specimen data is analyzed to determine whether, and how much, tumor data is present in the needle. This information is available to the user.

Having read the input file with parameter values, the code begins a loop over the specimen data files requested for simulation. In this loop, the three-dimensional specimen data file is opened, the data are read into a 3-d array with all of the background trimmed off, the apex of the gland is located, and the needle positions are translated into array coordinates. These coordinates are fed to the biopsy routine, which extracts the specimen data coinciding with the needle and analyzes the data for tumor information. The information for the entire experiment is stored in an output file that documents the needle parameters and the results for each image data set.
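The extraction step can be pictured with a small Python sketch. It is not the thesis's C routine: the voxel codes, the `volume[x][y][z]` indexing, the angle convention (measured in the y-z plane from the z axis) and the one-voxel-wide needle are all assumptions made for illustration.

```python
import math

BACKGROUND, GLAND, TUMOR, CAPSULE = 0, 1, 2, 3  # hypothetical voxel codes

def sample_needle(volume, start, angle_deg, length_voxels):
    """Return the voxel codes met along a straight needle path in the y-z plane."""
    x, y0, z0 = start
    dy = math.sin(math.radians(angle_deg))
    dz = math.cos(math.radians(angle_deg))
    core = []
    for step in range(length_voxels):
        y = int(round(y0 + step * dy))
        z = int(round(z0 + step * dz))
        inside = 0 <= y < len(volume[x]) and 0 <= z < len(volume[x][y])
        # Anything outside the array is treated as background.
        core.append(volume[x][y][z] if inside else BACKGROUND)
    return core

def tumor_fraction(core):
    """Fraction of the sampled core that is tumor."""
    return core.count(TUMOR) / len(core) if core else 0.0

# A toy 1 x 2 x 6 specimen: capsule, gland, gland, tumor, tumor, capsule along z.
toy_volume = [[
    [CAPSULE, GLAND, GLAND, TUMOR, TUMOR, CAPSULE],
    [BACKGROUND] * 6,
]]
core = sample_needle(toy_volume, start=(0, 0, 0), angle_deg=0.0, length_voxels=6)
hit = TUMOR in core
```

In this toy case the needle runs along the z axis, so the core reproduces the first row of the array and one third of it is tumor.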
3.2 Specific Algorithms

3.2.1 Locating the Apex

The apex is defined as the first contact with the prostate when approaching it through the rectum, as done clinically. This location is used as a landmark for positioning each biopsy needle. In the data set, the algorithm that searches for this landmark proceeds as follows. The planes are defined as shown in Figure 3.1. Each pixel in the three-dimensional specimen file contains a number indicating the type of data at that location. The possible types are gland, tumor, capsule or background. Capsule data indicate those pixels defining the boundary of the gland. The apex is indicated by the first pixel pointing to capsule data. Therefore one plane of specimen data is evaluated at a time, until a pixel that points to capsule data is found. This location is recorded as the apex location.
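The plane-by-plane search can be sketched in Python as follows. The voxel codes and the choice of scanning along the first array index are assumptions for illustration; the thesis defines the actual planes in Figure 3.1.

```python
BACKGROUND, GLAND, TUMOR, CAPSULE = 0, 1, 2, 3  # hypothetical voxel codes

def find_apex(volume):
    """Scan planes in index order and return the (plane, row, column) index
    of the first capsule voxel, or None if the volume holds no capsule data."""
    for i, plane in enumerate(volume):
        for j, row in enumerate(plane):
            for k, code in enumerate(row):
                if code == CAPSULE:
                    return (i, j, k)
    return None

# Toy specimen with three 2 x 3 planes; the first plane is all background.
toy_volume = [
    [[BACKGROUND, BACKGROUND, BACKGROUND], [BACKGROUND, BACKGROUND, BACKGROUND]],
    [[BACKGROUND, BACKGROUND, CAPSULE], [BACKGROUND, CAPSULE, GLAND]],
    [[CAPSULE, GLAND, TUMOR], [CAPSULE, GLAND, GLAND]],
]
apex = find_apex(toy_volume)
```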
[Figure 3.1 diagram: the gland with its apex marked and the X, Y and Z axes labelled.]

Figure 3.1. The x, y, z axes, as defined for the digital data, mimic those defined for the geometric models.
3.2.2 Establishing Needle Positions

The starting position, the location of the apex, serves as the landmark for each additional needle. From this starting point and the additional user-supplied parameters (z offset, distance between needles), all of the needle positions are calculated in terms of a vector. This vector, represented by $(x, y, z)$ coordinates, along with the angle, is a pointer to a specific pixel of image data. The z offset is assumed to be in centimeters and is added to the initial $(x, y, z)$ of the starting position to locate the first needle position. Each time any coordinate is changed, the new vector may be pointing to gland, tumor, background, urethra or capsule data. The pixel represented by the vector is read to ensure that the needle entry position remains located on capsule data. If it does not, the y coordinate is adjusted so that the entry position of the needle is on capsule data. At this point in the algorithm, the first needle position is determined.

There are two ways to space the remaining needles. The user may enter absolute distances in centimeters or a relative measure taken to be a percentage of the z dimension of the gland. In addition, a zero percentage indicates that
the spacing is based on the number of needles in the biopsy; the needles are equally spaced across the z axis of the gland. The remaining needle positions are calculated from the initial needle position: half of the needles are positioned in the right lobe, and the remainder are rotated into the left lobe, in symmetry with the first half. All of the needles have the x coordinate set to the midpoint of the gland in the x dimension. The user-entered distance, in centimeters, is converted to a specific number of pixels. This z distance is added to the first needle position to obtain the second needle position, added to the second to obtain the third, etc. Each time a needle position is calculated, the coordinates are evaluated to ensure that they point to capsule data. If the gland is too short in the z direction to handle all the needles requested, the experiment proceeds with the number of needles that do stay within the gland.

The experiments that depend on a relative distance between needles require additional analysis of the yz slice before determining the z offset. The z diameter of the particular yz slice is calculated. The z distance required for a needle of a specific length, inserted at a specific angle, is then subtracted from this z diameter. Rather than having the last needle pierce more background than gland data, this subtraction enables the full number of needles to be
inserted into the gland. This new z diameter is then divided into the number of segments required by the specified percentage. If the user indicates 0% for the distance spacing, the software calculates the distance based on the number of needles requested and the diameter of the yz plane.

3.3 Simulations

The 54 treatments used in the geometric model were used as biopsy procedures on a maximum of 53 digitized clinical specimens. Some of the biopsy techniques were simulated on only 52 of these clinical specimens. Table 3.1 shows the results from these simulations on the digitized clinical data. The table documents both the multiple-tumor geometric model hit rate and the number of hits resulting from the same biopsy on the digitized clinical data. The first five columns indicate the experiment number and the biopsy parameter settings for the four variables: number of needles, spacing method, first angle and second angle. The column labelled Detection Rate is the number of hits per 1000 simulations of the geometric model. The column labelled Number of Hits is the number of hits per number of digitized clinical samples. Most experiments were run on all 53 of the digitized specimens. However, some of the simulations resulted in an error on one or more of the specimens, and these specimens were then removed from the experiment. The final column,
labelled Clinical Detection Rate, is the rate for the experiments on the digitized specimens.

             Number of  Spacing    First  Second  Detection  Number    Clinical
Experiment   Needles    Method     Angle  Angle   Rate       of Hits   Detection Rate
     1          4       Relative    45     45      0.417      8/53       0.1509
     2          6       Relative    45     45      0.489     11/53       0.2075
     3          8       Relative    45     45      0.526      8/52       0.1538
     4          4       Absolute    45     45      0.417      9/53       0.1698
     5          6       Absolute    45     45      0.470     11/53       0.2075
     6          8       Absolute    45     45      0.482     10/52       0.1923
     7          4       Relative    60     45      0.427      9/53       0.1698
     8          6       Relative    60     45      0.524      9/52       0.1731
     9          8       Relative    60     45      0.569     13/53       0.2453
    10          4       Absolute    60     45      0.436     10/53       0.1887
    11          6       Absolute    60     45      0.514     12/53       0.2264
    12          8       Absolute    60     45      0.533     12/53       0.2264
    13          4       Relative    30     45      0.353      7/53       0.1321
    14          6       Relative    30     45      0.405     12/53       0.2264
    15          8       Relative    30     45      0.431      9/53       0.1698
    16          4       Absolute    30     45      0.354      7/53       0.1321
    17          6       Absolute    30     45      0.387      7/53       0.1321
    18          8       Absolute    30     45      0.388      9/53       0.1698
    19          4       Relative    45     60      0.358      6/53       0.1132
    20          6       Relative    45     60      0.408      9/53       0.1698
    21          8       Relative    45     60      0.443     11/52       0.2115
    22          4       Absolute    45     60      0.360      8/53       0.1509
    23          6       Absolute    45     60      0.391     10/53       0.1887
    24          8       Absolute    45     60      0.401     10/53       0.1887
    25          4       Relative    60     60      0.322      8/53       0.1509
    26          6       Relative    60     60      0.395      8/52       0.1538
    27          8       Relative    60     60      0.437      9/52       0.1731
    28          4       Absolute    60     60      0.332      6/52       0.1154
    29          6       Absolute    60     60      0.386      9/52       0.1731
    30          8       Absolute    60     60      0.403      9/52       0.1731

Table 3.1. The detection rates for the geometric and clinical simulations are displayed.
             Number of  Spacing    First  Second  Detection  Number    Clinical
Experiment   Needles    Method     Angle  Angle   Rate       of Hits   Detection Rate
    31          4       Relative    30     60      0.326      5/52       0.0962
    32          6       Relative    30     60      0.357      5/52       0.0962
    33          8       Relative    30     60      0.381      9/52       0.1731
    34          4       Absolute    30     60      0.329      4/52       0.0769
    35          6       Absolute    30     60      0.341      4/52       0.0769
    36          8       Absolute    30     60      0.340      4/52       0.0769
    37          4       Relative    45     30      0.417      6/52       0.1154
    38          6       Relative    45     30      0.498     10/52       0.1923
    39          8       Relative    45     30      0.541     12/52       0.2308
    40          4       Absolute    45     30      0.425      8/52       0.1538
    41          6       Absolute    45     30      0.486     10/52       0.1923
    42          8       Absolute    45     30      0.504     11/52       0.2115
    43          4       Relative    60     30      0.436      6/52       0.1154
    44          6       Relative    60     30      0.541     10/52       0.1923
    45          8       Relative    60     30      0.590     10/53       0.1887
    46          4       Absolute    60     30      0.451      8/52       0.1538
    47          6       Absolute    60     30      0.537     12/52       0.2308
    48          8       Absolute    60     30      0.562     12/52       0.2308
    49          4       Relative    30     30      0.351      3/30       0.1000
    50          6       Relative    30     30      0.412     11/52       0.2115
    51          8       Relative    30     30      0.444     10/52       0.1923
    52          4       Absolute    30     30      0.359      6/52       0.1154
    53          6       Absolute    30     30      0.401      8/52       0.1538
    54          8       Absolute    30     30      0.407     10/52       0.1923

Table 3.1. (Cont.) The detection rates for the geometric and clinical simulations are displayed.

3.4 Geometric Model vs Clinical Model

Comparison of the detection rates between the geometric model and the clinical model reveals that the geometric simulation produces much higher rates than its clinical counterpart. In attempting to explain this discrepancy, several characteristics of the experiment are noted.
The distribution of the tumors and the total tumor volume in a given specimen can impact the detection rate of a treatment. A comparison of the tumor volumes is graphically displayed in Figures 3.2 and 3.3. As shown by the histograms, the tumor volumes for the autopsy data tend strongly toward small (< 0.5 cc) volumes. In contrast, the geometric model produces tumors with volumes more evenly spread across the spectrum of possible volumes. In fact, 80% of the autopsy specimens have a total tumor volume less than 0.5 cc, whereas only 49% of the geometric gland models have a total tumor volume in this range. This difference in the size of the tumors can explain some of the difference in detection rate between the clinical and geometric models.

A second difference is that the relative ranking of detection rates for the digital data simulations differs from the ranking of detection rates for the geometric simulations. For example, experiment 9 (8 needles, relative spacing, angles of 60° and 45°) achieved a detection rate of 0.2453, or 13 hits out of 53 samples. This detection rate is better than that of experiment 45 (8 needles, relative spacing, angles of 60° and 30°), which is the optimal biopsy as indicated by the geometric simulation. This difference may be due to the fact that only 53 specimens were used in the digital simulation, in contrast to the 1000 models constructed for the geometric simulation.

3.5 Optimal Technique vs SRSCB

The optimal technique, determined by the geometric model, consists of 8 needles, relative spacing, and angles of 60° and 30°. The SRSCB procedure uses 6 needles, absolute spacing, and angles of 45° and 45°. Both techniques were simulated on the geometric model as well as the digitized clinical data. The optimal technique actually proved slightly worse at tumor detection than the SRSCB procedure when simulated on the clinical data: the optimal method detected tumor in 10 out of 53 specimens (0.1887), while the SRSCB method detected tumor in 11 out of 53 specimens (0.2075). By comparison, on the 1000 geometric models the SRSCB had a detection rate of 0.47 and the optimal technique had a detection rate of 0.59. This discrepancy is addressed by noting the sample sizes available in the two simulations and the distribution of tumor volumes, as noted earlier.
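The sample-size argument above can be made concrete with a rough calculation. The sketch below is an illustration added for this purpose and is not part of the original simulations: it assumes each specimen behaves as an independent Bernoulli trial and uses the usual binomial standard error, sqrt(p(1 - p)/n). The function name `detection_rate_with_se` is hypothetical; the hit counts and rates are those quoted in Section 3.5.

```python
import math

def detection_rate_with_se(hits, n):
    """Return (rate, standard error) under an assumed binomial model."""
    rate = hits / n
    return rate, math.sqrt(rate * (1.0 - rate) / n)

# Clinical simulation: counts quoted in Section 3.5 (53 specimens).
optimal_rate, optimal_se = detection_rate_with_se(10, 53)
srscb_rate, srscb_se = detection_rate_with_se(11, 53)
clinical_gap = abs(srscb_rate - optimal_rate)

# Geometric simulation: rates quoted in Section 3.5 (1000 models each).
geo_optimal_se = math.sqrt(0.59 * (1 - 0.59) / 1000)
geo_srscb_se = math.sqrt(0.47 * (1 - 0.47) / 1000)
geometric_gap = 0.59 - 0.47

print(f"clinical gap  {clinical_gap:.3f}, SE about {optimal_se:.3f} and {srscb_se:.3f}")
print(f"geometric gap {geometric_gap:.3f}, SE about {geo_optimal_se:.3f} and {geo_srscb_se:.3f}")
```

Under these assumptions the one-specimen gap in the clinical data is smaller than either standard error, while the gap in the geometric data is several times larger than its standard errors. Because both techniques were run on the same specimens, this is only an order-of-magnitude check, not a formal test.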
[Histogram: Autopsy Specimens. Horizontal axis: Sum of Tumor Volume. Vertical axis: Number of Tumors.]

Figure 3.2. The histogram of the clinical data shows the tumor distribution by volume.
[Histogram: Geometric Specimens. Horizontal axis: Sum of Tumor Volume. Vertical axis: Number of Tumors.]

Figure 3.3. The histogram of the geometric data shows the tumor distribution by volume.
4. Geometric Model Volume Estimates

4.1 Tumor Volume Estimates

The total volume of tumor in a gland is an important piece of information for clinicians, who use it to improve both the diagnosis and treatment plan for a patient. The ultrasound used during a biopsy accurately measures the prostate gland volume, so an approximate ratio of tumor to gland volume can be used to estimate the volume of tumor in a gland. These simulations offered an avenue to explore a means of approximating this volume ratio by using the volume of the needle that contains tumor information and the total volume of the needle.

Three methods are used to estimate the amount of tumor intersected by the needle. The needle can be modeled by a line, a strip, or a cylinder in one, two, and three dimensions, respectively. The length and diameter of the needle are constant and are set by clinical limits. This incremental approach began in one dimension in order to simplify aspects of the simulation during software verification. As the research progressed, the two- and three-dimensional needles were introduced in order to model the actual biopsy more closely.

The first method of estimating the volume ratio is
\[ R = \frac{1}{n} \sum_{i=1}^{n} \frac{v_i}{V_i}, \]
where $v_i$ represents the tumor volume within a single needle, $V_i$ represents the volume of that same needle, and $n$ is the number of needles. This ratio is referred to as the average of the ratios. A second estimator of volume ratio is
\[ r = \frac{\sum_{i=1}^{n} v_i}{\sum_{i=1}^{n} V_i}, \]
where $v_i$ is the tumor volume within a single needle and $V_i$ is the total volume of that needle. This ratio is considered the ratio of the average volumes, since $\frac{1}{n}\sum_{i=1}^{n} v_i$ is the average tumor volume and $\frac{1}{n}\sum_{i=1}^{n} V_i$ is the average needle volume. This yields
\[ r = \frac{\frac{1}{n}\sum_{i=1}^{n} v_i}{\frac{1}{n}\sum_{i=1}^{n} V_i} = \frac{\sum_{i=1}^{n} v_i}{\sum_{i=1}^{n} V_i}. \]
Both methods of estimating the ratio are documented below.

[Diagram labels: Y and Z axes, gland ellipsoid, tumor, needle, $l_T$, $t_1$, $t_2$.]

Figure 4.1. This illustration of the gland, tumor and one-dimensional needle depicts the variables used in determining the volume ratio estimator.
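The two estimators defined above can be compared on a small example. The following is a minimal sketch; the function names and the per-needle volumes are hypothetical values chosen only for illustration and are not data from the simulations.

```python
def average_of_ratios(tumor_volumes, needle_volumes):
    """R: mean of the per-needle ratios v_i / V_i."""
    ratios = [v / V for v, V in zip(tumor_volumes, needle_volumes)]
    return sum(ratios) / len(ratios)

def ratio_of_averages(tumor_volumes, needle_volumes):
    """r: total tumor volume over total needle volume."""
    return sum(tumor_volumes) / sum(needle_volumes)

# Hypothetical per-needle volumes in cc (illustrative values only).
v = [0.0, 0.002, 0.004]   # tumor volume found in each needle
V = [0.01, 0.01, 0.02]    # volume of each needle

R = average_of_ratios(v, V)
r = ratio_of_averages(v, V)
print(f"R = {R:.4f}, r = {r:.4f}")
```

As a property of the formulas, the two estimators agree whenever every $V_i$ is identical and can differ only when the needle volumes differ, as they do in this example.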
4.1.1 One-Dimensional Analysis: Line Model

In this first model, we represent the needle by a line segment as shown in Figure 4.1. The length of the needle that contains tumor pixels, $l_T$, is the difference between $t_1$ and $t_2$, the two roots of equation (2.1):
\[ l_T = |t_1 - t_2|. \]
A needle length, $L$, of 1.25 cm is used in the estimate of volume ratio. Thus the ratio $l_T / L$ is an approximation of the true volume ratio $TV/PGV$; that is,
\[ \frac{l_T}{L} \approx \frac{TV}{PGV}. \]

4.1.2 Two-Dimensional Strip Model

In the two-dimensional case we represent the needle by a strip. The needle entry point $(x_0, y_0, z_0)$ is used as a starting point in the two-dimensional analysis. Two lines are created, each offset from this starting coordinate by the needle radius. The intersection between these two lines and the tumor ellipse is determined, and the roots of the two resulting quadratics are used to compute both the occurrence of a detection and the amount of tumor within the needle. In this case, the estimate of the volume ratio is the area of the tumor over the area of the needle. Figure 4.2 defines the lengths used in determining the area. The area of the tumor is calculated by estimating the needle length which contains tumor data with the roots of intersection:
\[ l_{t1} = |t_{11} - t_{12}|, \qquad l_{t2} = |t_{21} - t_{22}|. \]
The area of tumor is then given by
\[ a_T = \frac{d}{2} (l_{t1} + l_{t2}), \]
where $d$ is the diameter of the needle. The area of the needle is calculated in the same way using the length of the needle: $a_N = \frac{d}{2}(L + L)$. Thus $a_T / a_N$ serves as an estimate of the true tumor to gland volume ratio, $TV/PGV$.

[Diagram labels: Y and Z axes, gland ellipsoid, tumor, needle, $l_{t1}$, $l_{t2}$, $t_{11}$, $t_{12}$, $t_{21}$, $t_{22}$.]

Figure 4.2. This illustration of the gland, tumor and two-dimensional needle depicts the variables used in determining the volume ratio estimator.

4.1.3 Three-Dimensional Cylinder Model

The three-dimensional analysis models the needle as a cylinder and is similar to the two-dimensional case in that the entry point of the needle is again used as a center coordinate for four needles. In this case, the four needles are constructed symmetrically about this point to generate a cylindrical needle. Then intersections and roots are computed. A more accurate representation of the volume ratio is obtained using the volume of the tumor within the needle over the volume of the needle. In this case, the length is estimated to be the maximum of the lengths determined from the four sets of intersection roots:
\[ l_t = \max(|t_{11} - t_{12}|,\ |t_{21} - t_{22}|,\ |t_{31} - t_{32}|,\ |t_{41} - t_{42}|). \]
The volume of the needle depends on the known diameter and length: $v_N = \pi (d/2)^2 L$. The estimated volume of the tumor depends on the needle lengths which contain tumor data, as shown in Figure 4.3. This leads to the tumor volume estimate $v_T = \pi (d/2)^2 l_t$. The ratio $v_T / v_N$ estimates the true volume ratio, $TV/PGV$.

[Diagram labels, two views: X, Y and Z axes, gland ellipsoid, tumor, needle, $l_t$, $t_{11}$, $t_{12}$, $t_{21}$, $t_{22}$, $t_{31}$, $t_{32}$, $t_{41}$, $t_{42}$.]

Figure 4.3. This illustration of the gland, tumor and three-dimensional needle depicts the variables used in determining the volume ratio estimator.

4.2 Experiment Setup

A second set of experiments utilizing the geometric model explored the question of accurately estimating the tumor volume to gland volume ratio. The experiment simulated a biopsy on a single specimen, increasing the number of needles each iteration and comparing the volume ratio obtained from the biopsy sample to the known volume ratio. The parameters for the biopsy include the two optimal needle angles determined from the ANOVA investigation. The optimal number of needles and distancing method determined from the ANOVA analysis do not apply to this experiment, since the number of needles increases from 6 to 20 and the needles are positioned so that the maximum number, 20, are equally spaced. The maximum number of needles was set at 20 due to clinical limitations. The spacing of the needles is dependent on the maximum number so that from one iteration to the next $n - 2$ needles are in the same exact location, yielding the same detection information. In this manner the comparison between a specimen biopsied by 6 needles and the same specimen biopsied by 10 needles is not dependent on needle position, but instead compares the gain made by the four additional needles.

The simulation is executed on 1000 specimens, varying the number of needles from 6 to 20 in increments of 2. The output from this experiment consists of a file for each specimen that contains the results of each set of needles, including the tumor to needle volume ratio achieved and the associated estimates ($R = \frac{1}{n} \sum (v_t / v_n)$ and $r = \sum v_t / \sum v_n$). In addition, the actual tumor to gland volume ratio is noted.

4.3 Results

The results of this experiment were not as anticipated: there appears to be no pattern of convergence to the actual tumor to gland volume ratio within the limit of 20 total needles. However, much was learned from this exercise that provided insight into the next series of investigations. First, in the great majority of cases, a single 8-needle biopsy tends to overestimate the true tumor to gland volume ratio. Second, a comparison between the two methods of calculating the error leads to the conclusion that the sum of the ratios is the more accurate method, at least in this limited set of trials.

4.4 Interactive Utility

Using the preceding idea as a starting point, an interactive software tool was created to investigate the volume ratio question in greater detail. This tool prompts the user for a random number, seeds the random number generator, creates a gland containing a single tumor and conducts the optimal 8-needle biopsy. This optimal biopsy has 8 needles, relative spacing between the needles, and angles of 60° and 30°. The results, which include each needle position, the amount of tumor volume contained in the needle and an estimate of the volume ratio of tumor to gland, are displayed for the user. At this point, the user is able to choose the location for the next needle. This new needle is then simulated and the tumor volume information it retrieves is incorporated into the volume ratio. The user can continue this process of requesting additional needles and evaluate the estimated volume ratio and its error from the true ratio. A maximum of 20 needles can be simulated on a single gland, beginning with the 8 original needles and accumulating the additional 12 based on user specifications.

This area of research is full of open-ended questions where tools such as this interactive utility can help shed light on answers. With involvement from clinicians and medical researchers, experiments can be designed to gather more information regarding the two issues of volume ratio and optimal biopsy technique. In addition, using the results of this body of research, more realistic tumor distributions and geometric models can be constructed to better understand the impact of treatment parameters on detection rate.
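The line, strip and cylinder estimators of Sections 4.1.1 to 4.1.3 can be sketched in a few lines once the intersection roots are known. This is an illustrative sketch only, not the thesis software: the function names are hypothetical, the roots are assumed to be already computed and expressed in centimeters along the needle, and the needle diameter of 0.1 cm is an assumed placeholder. Only the needle length of 1.25 cm comes from the text.

```python
import math

NEEDLE_LENGTH_CM = 1.25    # L, from Section 4.1.1
NEEDLE_DIAMETER_CM = 0.1   # d, assumed placeholder value

def chord(roots):
    """Length of needle inside the tumor, |t1 - t2|, for one sampling line."""
    t1, t2 = roots
    return abs(t1 - t2)

def line_ratio(roots, L=NEEDLE_LENGTH_CM):
    """One-dimensional estimate l_T / L."""
    return chord(roots) / L

def strip_ratio(roots_pair, d=NEEDLE_DIAMETER_CM, L=NEEDLE_LENGTH_CM):
    """Two-dimensional estimate a_T / a_N from the two offset lines."""
    l_t1, l_t2 = (chord(r) for r in roots_pair)
    a_t = (d / 2) * (l_t1 + l_t2)
    a_n = (d / 2) * (L + L)
    return a_t / a_n

def cylinder_ratio(roots_four, d=NEEDLE_DIAMETER_CM, L=NEEDLE_LENGTH_CM):
    """Three-dimensional estimate v_T / v_N using the longest of four chords."""
    l_t = max(chord(r) for r in roots_four)
    v_t = math.pi * (d / 2) ** 2 * l_t
    v_n = math.pi * (d / 2) ** 2 * L
    return v_t / v_n

# Hypothetical roots; a pair of equal roots stands for a line that misses the tumor.
print(line_ratio((0.2, 0.7)))
print(strip_ratio([(0.2, 0.7), (0.3, 0.6)]))
print(cylinder_ratio([(0.2, 0.7), (0.3, 0.6), (0.25, 0.65), (0.0, 0.0)]))
```

With the formulas as written, the needle diameter cancels in both the strip and the cylinder ratio, so each estimate reduces to a ratio of lengths.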
A. APPENDIX: ANOVA Definitions

A dot in the subscript indicates averaging over the variable represented by that index.

The number of levels for Number of Needles: a = 3.
The number of levels for Distancing Method: b = 2.
The number of levels for the first needle angle: c = 3.
The number of levels for the second needle angle: d = 3.
The number of specimens = 1000.
The number of experiments: abcd = 54.

In general, $Y$ is an observation, $\bar{Y}$ is the mean of observations, $\mu$ is the true mean and $\hat{\mu}$ is the least squares estimate of the true mean.

$Y_{ijkl}$ is the observed detection rate at the factor levels indicated by $i$, $j$, $k$ and $l$.

$\bar{Y}_{....}$ is the mean of all specimens over all treatment levels $i, j, k, l$. It indicates the overall detection rate for the entire experiment.
\[ \bar{Y}_{....} = \frac{1}{abcd} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} Y_{ijkl} \]
SSTO, or total sum of squares, is a measure of the total variability of the observations without consideration of factor level.
\[ SSTO = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} (Y_{ijkl} - \bar{Y}_{....})^2 \]
$df_{SSTO}$ is the total degrees of freedom. The SSTO has $abcd - 1 = 54 - 1 = 53$ degrees of freedom. One degree of freedom is lost due to the lack of independence between the deviations.

SSTR, or treatment sum of squares, measures the extent of differences between estimated factor level means and the mean over all treatments. The greater the difference between factor level means (treatment means), the greater the value of SSTR.
\[ SSTR = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} (\hat{Y}_{ijkl} - \bar{Y}_{....})^2 \]
$df_{SSTR}$ is the degrees of freedom. There are $r - 1$ degrees of freedom for the SSTR, where $r$ is the number of parameters in the model. In the full model, $r = abcd = 54$, the total combinations of factor levels. In the model used for this simulation,
\[ r = 1 + (a-1) + (b-1) + (c-1) + (d-1) + (a-1)(b-1) + (a-1)(c-1) + (a-1)(d-1) + (b-1)(c-1) + (b-1)(d-1) + (c-1)(d-1) = 26, \]
where the leading 1 counts the overall mean. One degree of freedom is lost due to the lack of independence between the deviations.
SSE, or error sum of squares, measures variability which is not explained by the differences between sample means. It is a measure of the variation within treatments. A smaller value of SSE indicates less variation within simulations at the same factor level.
\[ SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \sum_{l=1}^{d} (Y_{ijkl} - \hat{Y}_{ijkl})^2 \]
$df_{SSE}$ is the degrees of freedom. Since SSE is the sum of the errors across factor level, the degrees of freedom is the sum of the degrees of freedom for each factor level. It is the total number of simulations minus $r$: $abcd - r$.

MSE is the mean square for error, defined by
\[ MSE = SSE / df_{SSE}. \]

Note: The above definitions imply
\[ SSTO = SSTR + SSE. \]
Due to this relationship, this process is referred to as the partitioning of the total sum of squares.

In order to measure the variability within a factor level, the factor sum of squares terms are computed. These terms are integral in the test statistic applied to determine whether a factor main effect is significant. In addition, interaction sums of squares are computed to measure variability of the interactions.
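The partition and the degrees of freedom defined above can be checked numerically. The sketch below is an illustration under stated assumptions, not the analysis code used for the thesis: the detection rates are random placeholder values, all helper names are hypothetical, and the fitted values use the closed form that holds for a balanced design with main effects and two-way interactions (sum of the two-way marginal means, minus twice the sum of the one-way marginal means, plus three times the grand mean).

```python
from itertools import combinations, product
import random

LEVELS = (3, 2, 3, 3)  # a, b, c, d as defined in this appendix

def marginal_means(y, keep):
    """Average y over every index not listed in `keep` (the dotted subscripts)."""
    sums, counts = {}, {}
    for idx, value in y.items():
        key = tuple(idx[f] for f in keep)
        sums[key] = sums.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

def fitted_values(y):
    """Fit for main effects plus two-way interactions (balanced design only)."""
    grand = marginal_means(y, ())[()]
    one_way = {f: marginal_means(y, (f,)) for f in range(4)}
    two_way = {p: marginal_means(y, p) for p in combinations(range(4), 2)}
    fit = {}
    for idx in y:
        pair_sum = sum(two_way[p][(idx[p[0]], idx[p[1]])] for p in two_way)
        main_sum = sum(one_way[f][(idx[f],)] for f in one_way)
        fit[idx] = pair_sum - 2.0 * main_sum + 3.0 * grand
    return fit, grand

def partition(y):
    """Return (SSTO, SSTR, SSE) for the observations y."""
    fit, grand = fitted_values(y)
    ssto = sum((value - grand) ** 2 for value in y.values())
    sstr = sum((fit[idx] - grand) ** 2 for idx in y)
    sse = sum((y[idx] - fit[idx]) ** 2 for idx in y)
    return ssto, sstr, sse

def degrees_of_freedom(levels):
    """Degrees of freedom for the main-effects plus two-way-interaction model."""
    cells = 1
    for n in levels:
        cells *= n
    mains = sum(n - 1 for n in levels)
    pairs = sum((m - 1) * (n - 1) for m, n in combinations(levels, 2))
    r = 1 + mains + pairs  # the leading 1 counts the overall mean
    return {"r": r, "SSTO": cells - 1, "SSTR": r - 1, "SSE": cells - r}

# Placeholder detection rates, one per experiment (not the thesis results).
rng = random.Random(0)
y = {idx: rng.uniform(0.3, 0.6) for idx in product(*(range(n) for n in LEVELS))}

ssto, sstr, sse = partition(y)
df = degrees_of_freedom(LEVELS)
mse = sse / df["SSE"]
print(df, abs(ssto - (sstr + sse)) < 1e-9)
```

For the 3 x 2 x 3 x 3 design this gives r = 26, with 53, 25 and 28 degrees of freedom for SSTO, SSTR and SSE respectively, and the two sides of SSTO = SSTR + SSE agree up to rounding.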
The factor A sum of squares corresponds to the number of needles factor.
\[ SSA = bcd \sum_{i=1}^{a} (\bar{Y}_{i...} - \bar{Y}_{....})^2 \]
Similar factor sums of squares are computed for each of the factors:

Factor: Number of Needles
    Sum of squares: $SSA = bcd \sum_{i=1}^{a} (\bar{Y}_{i...} - \bar{Y}_{....})^2$
    Mean sum of squares: $MSA = SSA / (a - 1)$
Factor: Spacing Method
    Sum of squares: $SSB = acd \sum_{j=1}^{b} (\bar{Y}_{.j..} - \bar{Y}_{....})^2$
    Mean sum of squares: $MSB = SSB / (b - 1)$
Factor: first needle angle
    Sum of squares: $SSC = abd \sum_{k=1}^{c} (\bar{Y}_{..k.} - \bar{Y}_{....})^2$
    Mean sum of squares: $MSC = SSC / (c - 1)$
Factor: second needle angle
    Sum of squares: $SSD = abc \sum_{l=1}^{d} (\bar{Y}_{...l} - \bar{Y}_{....})^2$
    Mean sum of squares: $MSD = SSD / (d - 1)$

The interaction sums of squares are computed as well for use in the F-test on the interactions. The first three pairwise interaction sums of squares are shown below. The others are computed in the same manner.
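Interaction: Number of Needles with Spacing
    Sum of squares: $SSAB = cd \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{Y}_{ij..} - \bar{Y}_{i...} - \bar{Y}_{.j..} + \bar{Y}_{....})^2$
    Mean sum of squares: $MSAB = SSAB / ((a - 1)(b - 1))$
Interaction: Number of Needles with first needle angle
    Sum of squares: $SSAC = bd \sum_{i=1}^{a} \sum_{k=1}^{c} (\bar{Y}_{i.k.} - \bar{Y}_{i...} - \bar{Y}_{..k.} + \bar{Y}_{....})^2$
    Mean sum of squares: $MSAC = SSAC / ((a - 1)(c - 1))$
Interaction: Number of Needles with second needle angle
    Sum of squares: $SSAD = bc \sum_{i=1}^{a} \sum_{l=1}^{d} (\bar{Y}_{i..l} - \bar{Y}_{i...} - \bar{Y}_{...l} + \bar{Y}_{....})^2$
    Mean sum of squares: $MSAD = SSAD / ((a - 1)(d - 1))$

The treatment means, $\mu_{ijkl}$, indicate the mean for the treatment at the $ijkl$ levels of the respective factors. The overall mean, $\mu$, is the mean across all factors and all levels (across all $i, j, k, l$).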
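The factor and interaction sums of squares listed above translate directly into code. The following is a minimal sketch with hypothetical helper names and made-up additive data (a needle-count effect plus a spacing effect, with no interaction), so the expected behavior is easy to verify by hand. It is not the thesis analysis.

```python
from itertools import product

LEVELS = (3, 2, 3, 3)  # a, b, c, d as defined in this appendix

def marginal_means(y, keep):
    """Average y over every index not listed in `keep` (the dotted subscripts)."""
    sums, counts = {}, {}
    for idx, value in y.items():
        key = tuple(idx[f] for f in keep)
        sums[key] = sums.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

def factor_ss(y, f):
    """Main-effect sum of squares for factor f (0 = A, 1 = B, 2 = C, 3 = D)."""
    grand = marginal_means(y, ())[()]
    means = marginal_means(y, (f,))
    multiplier = len(y) // LEVELS[f]  # bcd when f is factor A
    return multiplier * sum((m - grand) ** 2 for m in means.values())

def interaction_ss(y, f, g):
    """Two-way interaction sum of squares for factors f and g."""
    grand = marginal_means(y, ())[()]
    mean_f = marginal_means(y, (f,))
    mean_g = marginal_means(y, (g,))
    mean_fg = marginal_means(y, (f, g))
    multiplier = len(y) // (LEVELS[f] * LEVELS[g])  # cd for the AB interaction
    return multiplier * sum(
        (mean_fg[(i, j)] - mean_f[(i,)] - mean_g[(j,)] + grand) ** 2
        for (i, j) in mean_fg
    )

# Made-up additive data: base rate + needle-count effect + spacing effect.
alpha = (0.0, 0.1, 0.2)
beta = (0.0, 0.05)
y = {
    idx: 0.4 + alpha[idx[0]] + beta[idx[1]]
    for idx in product(*(range(n) for n in LEVELS))
}

ssa = factor_ss(y, 0)
msa = ssa / (LEVELS[0] - 1)
ssab = interaction_ss(y, 0, 1)
msab = ssab / ((LEVELS[0] - 1) * (LEVELS[1] - 1))
print(f"SSA = {ssa:.4f}, MSA = {msa:.4f}, SSAB = {ssab:.2e}")
```

For this additive example SSA works out to about 0.36 (18 times the squared deviations of the three needle-count means) and SSAB is zero up to rounding, as expected when there is no interaction.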
References

(1) Hodge, K.K., McNeal, J.E., Terris, M.K., Stamey, T.A. "Random systematic versus directed ultrasound guided transrectal core biopsies of the prostate." Journal of Urology 142 (1989): 71-74.
(2) Daneshgari, Firouz, M.D., Taylor, Gerald D., PhD, Miller, Gary J., M.D., PhD, Crawford, E. David, M.D. "Computer Simulation of the Probability of Detecting Low Volume Carcinoma of the Prostate with Six Random Systematic Core Biopsies." Urology 45 (April 1989): 604-609.
(3) McNeal, John, M.D. "Normal Histology of the Prostate." The American Journal of Surgical Pathology (1988): 619-633.
(4) Neter, John, Wasserman, William. Applied Linear Statistical Models. Richard D. Irwin, Inc., 1974.

