
Citation 
 Permanent Link:
 http://digital.auraria.edu/AA00006590/00001
Material Information
 Title:
 Circulating tumor cells mathematical theory of detectability with simulations and experimental results in a model system
 Creator:
 Futia, Gregory Louis ( author )
 Language:
 English
 Physical Description:
 1 electronic file (171 pages)
Thesis/Dissertation Information
 Degree:
 Doctorate ( Doctor of philosophy)
 Degree Grantor:
 University of Colorado Denver
 Degree Divisions:
 Department of Bioengineering, CU Denver
 Degree Disciplines:
 Bioengineering
 Committee Chair:
 Benninger, Richard KP
 Committee Members:
 Gibson, Emily A.
 Behbakht, Kian
 Schlaepfer, Isabel R.
 Shandas, Robin
Subjects
 Subjects / Keywords:
 Tumor markers ( lcsh )
 Neoplastic cells, circulating ( lcsh )
 Metastasis ( lcsh )
 Genre:
 bibliography ( marcgt )
 theses ( marcgt )
 nonfiction ( marcgt )
Notes
 Review:
 Circulating tumor cells (CTCs) are nucleated objects that are shed from a primary tumor into the blood stream. Effective identification of CTCs holds promise for improving early detection and disease monitoring of cancer but is difficult due to the rarity of CTCs compared to background blood cells.
 Review:
 In this dissertation, I develop mathematics describing how the rarity of cell that an assay can detect is limited by the sensitivity and specificity of the assay's identifying biomarker to that cell. I refer to the rarity of cell that an assay can detect as detectable rarity. I show that, depending on the distribution of disease positive and disease negative populations on an identifying biomarker, there can be a maximum in detectable rarity as a function of the test positive/test negative cutoff position on that biomarker. Most CTC assays consist of two stages: an enrichment stage followed by an image cytometry stage. I present mathematics describing how the sensitivity, specificity, and detectable rarity of a multistage test relate to the sensitivity, specificity, and detectable rarity of the individual test stages. The enriched output fraction typically contains between 1,000 and 10,000 cells. Difficulties in processing this cell fraction for image cytometry lie in (1) preventing cell loss in the numerous handling steps involved in labeling and mounting the cells and (2) controlling the area of the resulting cell field so that it is neither too sparse nor too dense. I present technology I have engineered that addresses point (1) by confining cells during the labeling process using a filter and addresses point (2) by allowing the size of the cell field to be set using standard o-rings, with diameters made interchangeable by low-cost alignment plates.
 Review:
 I assess the identification performance of adding lipid imaging to the standard DAPI, Cytokeratin, CD45 panel used to identify CTCs. I also assess the identification performance of adding metrics of spatial second moment, spatial-frequency second moment, and the product of spatial second moment and spatial-frequency second moment to a simple total content metric. To perform this assessment, I use technology I engineered to prepare samples for image cytometry with fluorescent staining and antibody labeling for DAPI, Bodipy (lipids), Cytokeratin, and CD45. I perform this analysis in a model system of disease negative white blood cells and disease positive MCF7 cancer cells.
 Review:
 In this model system, I present my analysis of the four spatial features calculated on each of the four labels, providing a total of 16 biomarkers. The best performing of the 16 biomarkers produced an average separation of 3 standard deviations between disease positive (D+) and disease negative (D-) populations and an average detectable rarity of ~1 in 200. I performed multivariable regression and feature selection to combine multiple biomarkers for increased performance and showed an average separation of 7 standard deviations between the D+ and D- populations, giving an average detectable rarity of ~1 in 480. Histograms and receiver operating characteristics (ROC) for these biomarker features and regressions are presented. I show methods to optimize for the maximum detectable rarity as a function of test positive/test negative cutoff position and apply this method to all biomarkers measured.
 Bibliography:
 Includes bibliographical references.
 System Details:
 System requirements: Adobe Reader.
 Restriction:
 Embargo ended 12/11/2019
 Statement of Responsibility:
 by Gregory Louis Futia.
Record Information
 Source Institution:
 University of Colorado Denver
 Holding Location:
 Auraria Library
 Rights Management:
 All applicable rights reserved by the source institution and holding location.
 Resource Identifier:
 on10277 ( NOTIS )
1027791082 ( OCLC ) on1027791082
 Classification:
 LD1193.56 2017d F97 ( lcc )

Full Text 
CIRCULATING TUMOR CELLS: MATHEMATICAL THEORY OF
DETECTABILITY WITH SIMULATIONS AND EXPERIMENTAL RESULTS IN A
MODEL SYSTEM
By
GREGORY LOUIS FUTIA
B.S., Purdue University, 2007
M.S., Colorado State University, 2011
A dissertation submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
Bioengineering Program
2017
This dissertation for the Doctor of Philosophy by Gregory Louis Futia has been approved for the Bioengineering Program by
Richard KP Benninger, Chair
Emily A Gibson, Advisor
Kian Behbakht
Isabel R Schlaepfer
Robin Shandas
Date: December 16, 2017
Futia, Gregory Louis (Ph.D., Bioengineering)
Circulating Tumor Cells: Mathematical Theory of Detectability with Simulations and
Experimental Results in a Model System
Dissertation directed by Assistant Professor Emily A. Gibson
Abstract
Circulating tumor cells (CTCs) are nucleated objects that are shed from a primary tumor into the blood stream. Effective identification of CTCs holds promise for improving early detection and disease monitoring of cancer but is difficult due to the rarity of CTCs compared to background blood cells.
In this dissertation, I develop mathematics describing how the rarity of cell that an assay can detect is limited by the sensitivity and specificity of the assay's identifying biomarker to that cell. I refer to the rarity of cell that an assay can detect as detectable rarity.
I show that, depending on the distribution of disease positive and disease negative populations on an identifying biomarker, there can be a maximum in detectable rarity as a function of the test positive/test negative cutoff position on that biomarker. Most CTC assays consist of two stages: an enrichment stage followed by an image cytometry stage. I present mathematics describing how the sensitivity, specificity, and detectable rarity of a multistage test relate to the sensitivity, specificity, and detectable rarity of the individual test stages.
The enriched output fraction typically contains between 1,000 and 10,000 cells. Difficulties in processing this cell fraction for image cytometry lie in (1) preventing cell loss in the numerous handling steps involved in labeling and mounting the cells and (2) controlling the area of the resulting cell field so that it is neither too sparse nor too dense. I present technology I have engineered that addresses point (1) by confining cells during the labeling process using a filter and addresses point (2) by allowing the size of the cell field to be set using standard o-rings, with diameters made interchangeable by low-cost alignment plates.
I assess the identification performance of adding lipid imaging to the standard DAPI, Cytokeratin, CD45 panel used to identify CTCs. I also assess the identification performance of adding metrics of spatial second moment, spatial-frequency second moment, and the product of spatial second moment and spatial-frequency second moment to a simple total content metric. To perform this assessment, I use technology I engineered to prepare samples for image cytometry with fluorescent staining and antibody labeling for DAPI, Bodipy (lipids), Cytokeratin, and CD45. I perform this analysis in a model system of disease negative white blood cells and disease positive MCF7 cancer cells.
In this model system, I present my analysis of the four spatial features calculated on each of the four labels, providing a total of 16 biomarkers. The best performing of the 16 biomarkers produced an average separation of 3 standard deviations between disease positive (D+) and disease negative (D-) populations and an average detectable rarity of ~1 in 200. I performed multivariable regression and feature selection to combine multiple biomarkers for increased performance and showed an average separation of 7 standard deviations between the D+ and D- populations, giving an average detectable rarity of ~1 in 480. Histograms and receiver operating characteristics (ROC) for these biomarker features and regressions are presented. I show methods to optimize for the maximum detectable rarity as a function of test positive/test negative cutoff position and apply this method to all biomarkers measured.
The form and content of this abstract are approved. I recommend its publication.
Approved: Emily A. Gibson
I dedicate this dissertation to my wife, my parents and my daughter.
Acknowledgments
I would like to acknowledge Dr. Emily A. Gibson, my research advisor. She pushed for investigating lipids as a biomarker, with the motivation of looking at that biomarker label-free with coherent anti-Stokes Raman scattering microscopy. Her help in editing and composing the manuscripts that have been submitted related to this work has added clarity to the documents. She has also provided me with exposure and guidance on how to write grants and what goes into keeping a lab running. She has given me the flexibility to explore the aspects of this project that interested me as well as the latitude to embark on the technology development.
Second, I would like to acknowledge my clinical mentor, Dr. Kian Behbakht. He drew us into the CTC problem through his own clinical need for better methods for detection and monitoring of gynecological malignancies. He has provided me with exposure to the clinical aspects of cancer research through the tumor board meeting, allowing me to see a cancer debulking surgery, and inviting me to events where survivors are present. In addition, support through his lab and lab group has given me much of my exposure to the biological questions in cancer.
I would like to acknowledge members of both Dr. Gibson's and Dr. Behbakht's lab groups, particularly Baris Ozbay, Dr. Stephanie Meyer, and Dr. Mariana Potcova, with whom I worked closely, and also Dr. Allison Caster, Mr. Andrew Chandlor, and Mr. Robert Heffem. Also, I would like to acknowledge Dr. Heidi Wilson, Dr. Ben Bitler, Dr. Lubna Qamar, Dr. Georgina Cheng, and Mr. Doug Hicks.
I would like to acknowledge the efforts put in by my additional committee members, Dr. Isabel Schlaepfer and Dr. Robin Shandas, and my committee chair, Dr. Richard Benninger.
This work was supported by a seed grant from the University of Colorado Cancer Center funded through an American Cancer Society Institutional Research Grant, #5700153 (EAG, KB), by funding provided by the Defense Advanced Research Projects Agency, # N66001104035 (EAG), NIH/NCI K01CA168934 (IRS), and NIH/NCATS Colorado CTSI Grant Number TL1TR001081. Imaging experiments were performed in the University of Colorado Anschutz Medical Campus Advanced Light Microscopy Core, supported in part by NIH/NCATS Colorado CTSI Grant Number UL1TR001082. I gratefully acknowledge Dr. Michael Yeager and Ms. Kelly Colvin for guidance on cell handling, labeling, and red blood cell lysis steps. I also acknowledge Dr. Heide Forde for supplying us with the MCF7 cell line. I have no conflicts of interest to declare. The funders had no role in the study design, data collection, analysis, decision to publish, or preparation of this dissertation.
Contents
I. INTRODUCTION
1.1 Contributions
1.2 Background
1.2.1 The CTC Commercial Technology Landscape
1.2.2 FDA Regulations and Guidance for CTC Identification Systems
II. MATHEMATICAL THEORY AND SIMULATIONS OF DETECTABILITY FOR RARE EVENTS SUCH AS CIRCULATING TUMOR CELLS IN MULTISTAGE ASSAYS
2.1 Introduction
2.1.1 Background
2.2 Theory: Derivation of Detectable Rarity
2.3 Simulation: Depending on the Distribution of the D+ and D- on a Biomarker There Can Be an Optimal Test-Positive/Test-Negative Cutoff Position
2.4 Theory: Two-Stage Tests
2.4.1 Sensitivity and Specificity of a Two-Stage Binary Assay
2.4.2 Detectable Rarity in Multi-Stage Binary Assays
2.4.3 Treatment of Independent Versus Dependent Test Stages
2.4.4 ROC for Two-Stage Tests Using the Same Biomarker
2.4.5 ROC for Two-Stage Tests Using Different Biomarkers
2.4.6 Creating Overall Test ROC from Two-Stage Test Using Different Biomarkers
2.4.7 Simulation: ROC Pre and Post Enrichment for Dependent and Independent Biomarkers
2.4.8 Simulation: The Optimal Enrichment Cutoff Position
2.5 Theory: Cell Loss in Processing
2.6 Theory: Sampling Considerations and Volume of Blood to Screen
2.7 Discussion
III. ENGINEERING OF CELL LABELING TECHNOLOGY
3.1 Introduction
3.2 Device for Cell Labeling
3.3 Coverslip Mounter
3.4 Protocol for DAPI, Bodipy, Pancytokeratin, and CD45 Labeling of Cells on Track Etched Polycarbonate Filters
3.5 Example Resulting Cell Preparations
IV. EXPERIMENTAL EVALUATION OF IMAGE CYTOMETRY FOR DNA, CD45, CYTOKERATIN, AND LIPIDS IN A MODEL SYSTEM FOR CIRCULATING TUMOR CELL IDENTIFICATION
4.1 Introduction
4.2 Materials and Methods
4.2.1 Sample Inclusion and Abundancies
4.2.2 Sample Preparation and Labeling
4.2.3 Fluorescence Microscopy
4.2.4 Image Processing
4.2.5 Calculation of Cohen's d
4.2.6 Calculation of Image Metrics
4.2.7 Performance Analysis
4.3 Results
4.4 Discussion
V. CONCLUSIONS
5.1 Future Directions
REFERENCES
Appendix
A. Components of the Clinical Experience
B. Analysis of Experimental Variation
C. Comparative figures and performance between bootstrap and experimental analysis used in dissertation
D. Comparative figures and performance with and without manual debris removal
E. Comparative figures and performance of Reg1-11 generated once using all days data and applied to each of the training-testing subsets with Reg1-11 generated for each of the training-testing subsets
F. Comparative figures and performance of Reg1-11 generated once using all day 9 data and applied to each of the training-testing subsets with Reg1-11 generated for each of the training-testing subsets
G. Area Under the Curve (AUC) up to FPR 0.05
H. Regressed Equations Trained on All Data
I. Regressed Equations Trained on just Day 9 Data
J. Notes on Commercialized CTC Technologies
List of Figures
Figure 1: Panels a & b: Histograms of D+ and D- populations drawn from normal distributions on biomarker α (BM α) and drawn from Pearson's distributions with excess kurtosis of 3 on biomarker β (BM β). D+/D- are separated by 4 standard deviations on both BM α and β. The z-scored and likelihood ratio ROCs (panels c & d) show BM β is always beneath BM α, indicating there is no cutoff position where BM β outperforms BM α. Panel (d) shows BM β has a maximum test positive likelihood ratio as a function of cutoff position while BM α is monotonically increasing with cutoff position. The annotated maximum Ndet, shown as +, is also not at the minimum false positive rate in BM β as it is in BM α.
Figure 2: Panel a: Diagram of sample processing through a two-stage test. The test negative output of the first test is discarded. The test positive output from the first test is used in the second test. The test positive output of the second test is test positive for the combined test. Panel b: Geometrical picture of two-stage sensitivity showing that the two-stage sensitivity must be smaller than the sensitivity of the individual stages. Panel c: Geometrical picture of the two-stage specificity showing that the false positive rate [=1-specificity] must be smaller than that of the individual test stages.
Figure 3: Diagram of two-stage system in D+ and D- populations. Panel (a) has a relation where Equation 24 is not satisfied due to a large overlap between T1+ and the D- population. Panel (b) shows the typical case where both test stages seek to reduce the false positive rate of the assay.
Figure 4: Diagram of population separation using continuous biomarkers through a two-stage system. The effect of the first test is modeled as a selection function on a continuous biomarker, x, (panel b) in which cells in the red population are more likely to be selected as test positive, T1+. Resulting enriched distributions are shown in panel c) and discarded distributions are shown in panel d).
Figure 5: The ROC of the biomarker post enrichment (blue) follows the ROC of the biomarker without enrichment for low false positive rates and is seen to saturate at higher false positive rates. This is caused by cell loss in the D+ population due to the stage 1 enrichment. D+ and D- distributions are normally distributed and are shown in Figure 4.
Figure 6: Simulated D+ events (red) and D- events (blue) with distribution on biomarkers BM1, BM2, and BM3 (panels a and b). BM2 is independent (ID) of BM1 while BM3 is dependent (Dep.) on BM1. Distributions are put through an enrichment stage on BM1 with cutoff shown in (c and d), resulting in new distributions shown in panels (e and f). Panels (g and h) show the ROC of BM2 and BM3 before and after enrichment with maxima in Ndet annotated as +. Although the performance of BM3 is seen to be better than BM2 pre enrichment, its performance improves less than BM2 post enrichment due to its dependence on BM1.
Figure 7: The detectable rarity of stage I and the overall test as a function of the cutoff position on the stage I enrichment biomarker. Panel a) shows the detectable rarity of the stage I enrichment alone. For the case of D+ and D- drawn from a normal distribution (BM α), the detectable rarity increases as the cutoff position is moved ever further away from the D- population. For D+ and D- populations with excess kurtosis of 3 (BM1), a maximum is seen in detectable rarity as a function of cutoff position. In panel b), BM1 is used as the enrichment biomarker and BM2 from Figure 6 as the stage II biomarker. The performance of the overall test post enrichment (blue dashed line) follows that of the detectable rarity of BM1 alone combined with that of BM2 as predicted by equation 19. When the stage II biomarker is dependent on the stage I biomarker (BM3 from Figure 6), the overall test detectable rarity underperforms that of the independent case.
Figure 8: Assurance levels of encountering k cells after measuring n cells, assuming 1 in one million cell prevalence.
Figure 9: Side cross section of labeling device. Device consists of input head, alignment plate, and output head. The input and output heads sandwich o-rings set with the alignment plates to create a seal on the polycarbonate filter as shown in detail 1. Input head has threaded connector (a) for applying positive pressure. The volume of the staining reservoir (b) is controlled by the diameter (c) while its height is set to the length of a standard gel loading pipette tip. This choice of height prevents damaging the filter while enabling bubble-free loading. The diameter of the laydown area is controlled by the o-ring diameter, which can be set by changing the diameters in the alignment plate (e). This plate must be thin enough for a compression gap (f) to enable sealing of the device. Output head contains threaded connector (g) for pulling fluids from the device. All fluids flow in the direction of the dotted arrow (h).
Figure 10: Diagram of assembled system: The lanes in the staining chamber (a) are pulled with vacuum controlled through stop cock panel (b) fabricated with 3D printing. The vacuum splits out through the manifold (d) and is routed through 2 aspiration steps (d). To prevent stretching the filter, vacuum pressure is controlled by regulator (e), which is connected to the lab vacuum (f).
Figure 11: A 1:1 sample of WBCs and MCF7 cells labeled for DAPI (blue), Bodipy (green), anti-pan-Cytokeratin (yellow), and anti-CD45 (red). Panel a) Representative image of cell spot produced by labeling device. Diameter of spot is 2.1 mm. Image is an 8x8 mosaic; the full resolution of the square area is shown in panel b). Image at full resolution is available in public dataset [72].
Figure 12: Histograms of image cytometry features computed on all 4 channels. Blue shows WBC-only samples composed of 24,699 objects (D-), red shows MCF7-only samples (D+) composed of 41,091 objects used in the analysis. Black dashed line (MCF7 + WBC ~1:1 mixed samples containing 33,726 objects) is a qualitative control reproducing the modality of the pure samples. dBct = 10*log10(counts). Box plots of the distribution of cutoff positions maximizing Ndet across the 48 training subsets are shown on top of each of the histograms.
Figure 13: AUC and Cohen's d, performance characteristics of the features, are shown for the 48 training subsets. An operating point maximizing Ndet was found on each training subset (thresholds plotted in Figure 12). The remaining data from that day was used as a testing subset to compute the shown sensitivity, specificity, and the minimum detectable thresholds at the operating point. Occurrences of false positive rates of zero on the testing data were found and summed in the bottom panel.
Figure 14: ROC curves for the training subsets averaged over each day. The logarithmic (z-scored) sensitivity and specificity axes used show straight lines when D+ and D- distributions are Gaussian. Average values for sensitivity and specificity maximizing Ndet are shown as + symbols and are generally to the left of the inflection points seen.
Figure 15: ROC on test positive likelihood ratio (LRt+) and test negative likelihood ratio (LRt-) shows there are maxima in LRt+ for all features. If any of these biomarkers were used for enrichment, this cutoff would be the optimal position for the transition between T+ and T- for the enrichment.
Figure 16: Histograms of testing subsets of WBCs (blue trace) and MCF7 cells (red trace). The 1:1 mixed populations (black line) are a qualitative control showing that the regressed modality is real. Testing data were not used to train the regressions, which were computed on the 48 training subsets. For each regression, an operating point maximizing Ndet was found on that training subset, producing the threshold positions shown as box plots on top of the histograms. Regressions are z-scored and thus center near 0.
Figure 17: Performance of the regression combinations over the 48 testing subsets shown as box plots. Above the line are the performance statistics, AUC and Cohen's d, that characterize the separation produced by the biomarkers. Below the line are the performance statistics that depend on the operating point. Occurrences where a testing subset had a false positive rate of 0 were summed. For Reg1-4, below-line performance statistics are similar while above-line statistics are different.
Figure 18: Receiver operating characteristics of the regressions computed on the testing subsets averaged over each day. Average positions of operating points maximizing Ndet are shown as + symbols. As more features are included in the regressions, performance and stability are seen to improve. Regressed distributions are not Gaussian, as indicated by the shape on the z-scored sensitivity-specificity axes.
Figure 19: Regressions on test positive likelihood ratio (LRt+) and test negative likelihood ratio (LRt-). LRt+ is linearly related to detectable rarity Ndet. This figure shows that there are maxima for LRt+, and thus Ndet, for all regression combinations, and that the profiles of these ROCs are not those of D+ and D- populations drawn from normal distributions, as shown in Figure 18.
CHAPTER I
INTRODUCTION
Efforts to measure malignancy using peripheral blood samples have focused on detecting circulating tumor cells (CTCs), circulating tumor DNA (ctDNA), and serum proteins such as prostate specific antigen (PSA) or cancer antigen 125 (CA125) [1]-[4]. Detection and capture of CTCs have advantages over other measurements. For example, the presence of CTCs indicates that the cancer has taken a step in its development of metastatic potential and is capable of disseminating cells from its primary site. Some circulating tumor cells may be carriers of metastasis, depending on their ability to survive in changing microenvironmental conditions [5]. Circulating tumor cells contain a sample of the entire genome of the tumor, providing more information than other techniques. Another advantage is that measurements of these cells allow further characterization by phenotype for metabolic markers such as lipid composition [6]. Current technologies have detected the presence of CTCs in patients with breast [7], [8], colorectal [9], [10], prostate [11], [12], and ovarian [13] cancer.
The problem in accurately identifying circulating tumor cells is their rarity amongst similar cellular blood components. Previous studies indicate that just 2 CTCs in 7.5 mL of whole blood is predictive of prognosis in many cancers [1], [14]-[17]. Similar CTC counts have also been shown to be predictive of response to chemotherapy [15], [18]-[20], indicating their usefulness as a disease monitoring biomarker. In contrast, 1 milliliter of blood contains 24 million nucleated blood cells, often called white blood cells (WBCs), 1 billion red blood cells, and 100 million platelets. To address this rarity, biomarkers with high sensitivity and extremely high specificity are needed to distinguish between circulating tumor cells and background nucleated blood cells. Failure to have sufficient sensitivity and specificity produces an assay in which most test positive objects are false positives. As part of this dissertation, I will present a theoretical analysis of the CTC problem that begins by showing that sensitivity and specificity set the rarity of cell that a test can detect, which I will refer to as detectable rarity.
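The claim that insufficient specificity leaves most test positive objects as false positives follows from Bayes' rule. The following is a minimal sketch of that arithmetic, not the derivation developed in Chapter II; the function names and the 90% sensitivity / 99.9% specificity assay are hypothetical values chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of test-positive objects that are truly disease positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def positive_likelihood_ratio(sensitivity, specificity):
    """Test-positive likelihood ratio: sensitivity divided by false positive rate."""
    return sensitivity / (1.0 - specificity)


if __name__ == "__main__":
    # Hypothetical assay performance (illustrative assumption, not a measured result).
    se, sp = 0.90, 0.999
    for one_in in (100, 10_000, 1_000_000):
        ppv = positive_predictive_value(se, sp, 1.0 / one_in)
        print(f"1 target cell in {one_in:>9,}: PPV = {ppv:.4f}")
    print(f"LR+ = {positive_likelihood_ratio(se, sp):.0f}")
```

With the assay held fixed, the fraction of true positives among test positives collapses as the target cell becomes rarer, which is why specificity rather than sensitivity tends to dominate rare-cell detection.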
Eighty percent of cancers are carcinomas, which originate from epithelial tissue. The most common method to isolate CTCs is to target markers of epithelial cells not expected in hematopoietic blood cells. Thus, the typical target for CTCs is a nucleated cell expressing epithelial markers, such as epithelial cell adhesion molecule (EpCAM) or cytokeratins (CK), that is also negative for the hematopoietic marker CD45, which I annotate as DAPI+/CK+/CD45-. Although DAPI+/CK+/CD45- seems to be a logical target, a large scale study of the DAPI+/CK+/CD45- biomarker (CellSearch, Veridex, LLC) using 2183 blood samples from 965 patients with known metastatic carcinomas found only 36% to have greater than 2 CTCs in 7.5 mL of blood [1]. Indeed, the previously mentioned reports of CTC identification [7], [9]-[13], [21], [22] typically find 30-80% of metastatic patients to have detectable CTCs. This low sensitivity limits the utility of CTCs in early detection and disease monitoring and motivates the need to further improve CTC identification assays. The theory I develop here lays the pathway for further improvement to these assays as experimental investigation finds other possible biomarkers for CTCs with greater sensitivity and specificity. Future assays may involve biomarkers that are less dependent on epithelial markers, as it is postulated that CTCs lose some of their epithelial characteristics [23].
Many CTC detection assays consist of at least 2 stages: a stage I enrichment using antibody capture (EpCAM positive selection or CD45 depletion) to reduce the number of background WBCs and a stage II identification using image or flow cytometry with fluorescent labels. In this dissertation, I present theory showing how the sensitivity, specificity, and detectable rarity of the first and second test stages combine to set the sensitivity, specificity, and detectable rarity of the overall test. Additionally, through simulations exploring the developed theory, I show that the detectable rarity of an identifying biomarker can be limited as a function of test positive/test negative cutoff position on that biomarker. Thus, experimental investigations of not just biomarkers but biomarker combinations are critical to better identify CTCs.
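As a rough illustration of how two serial stages might combine, the sketch below assumes that the stage outcomes are independent within the D+ population and within the D- population. That independence is an assumption of this example only; the dissertation treats dependent stages separately in Chapter II. The function names and numeric values are hypothetical.

```python
def two_stage(se1, sp1, se2, sp2):
    """Combine two serial test stages under an independence assumption.

    Only the test-positive output of stage 1 is passed to stage 2, so a cell
    is called positive overall only if both stages call it positive.
    """
    sensitivity = se1 * se2                        # a D+ cell must pass both stages
    false_pos_rate = (1.0 - sp1) * (1.0 - sp2)     # a D- cell must slip through both
    return sensitivity, 1.0 - false_pos_rate


def lr_plus(sensitivity, specificity):
    """Test-positive likelihood ratio: sensitivity divided by false positive rate."""
    return sensitivity / (1.0 - specificity)


if __name__ == "__main__":
    # Hypothetical enrichment stage followed by a hypothetical imaging stage.
    se, sp = two_stage(0.80, 0.99, 0.90, 0.999)
    print(f"overall sensitivity = {se:.3f}")
    print(f"overall specificity = {sp:.6f}")
    print(f"overall LR+         = {lr_plus(se, sp):.0f}")
```

Under this assumption the overall sensitivity and the overall false positive rate are each smaller than those of either stage, and the overall test positive likelihood ratio is the product of the stage ratios.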
As part of this dissertation, I perform an experimental investigation of adding new biomarkers to the classical DNA/CD45/CK panel. The new biomarkers I looked to add were a lipid channel/label and spatial image cytometry metrics: spatial second moment, spatial-frequency second moment, and the product of these two moments. In addition to computing these three spatial metrics, I also computed total label content in each region of interest (ROI). I refer to an image metric computed on one channel as a feature. Thus, this four-channel dataset, on which 4 image cytometry metrics were computed, resulted in a 16-feature image cytometry data set. Each of these features, and each combination of these features, is a potential identifying biomarker. I present regression methods I developed to combine the individual features for improved performance. In particular, I was interested in a DNA, lipids, and CD45 panel because it does not use epithelial biomarkers and because all of the labels are compatible with live cell imaging. I performed this experimental work using a model system of disease positive (D+) MCF7 cells and disease negative (D-) white blood cells (WBCs) from human peripheral blood, a model system that has been used by other investigators in developing CTC assays [24]-[29].
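To make the feature bookkeeping concrete, the sketch below enumerates the 4 channels by 4 metrics and computes two of the metrics on toy ROIs. The metric definition used here (intensity-weighted mean squared distance from the intensity centroid) is an assumed textbook form, not necessarily the definition given in Section 4.2.6, and the spatial-frequency metric is left out because it would need a Fourier transform. All names are hypothetical.

```python
from itertools import product

CHANNELS = ("DAPI", "Bodipy", "Cytokeratin", "CD45")
METRICS = ("total_content", "spatial_second_moment",
           "spatial_frequency_second_moment", "moment_product")

# One feature per (channel, metric) pair: 4 x 4 = 16 features.
FEATURES = [f"{channel}:{metric}" for channel, metric in product(CHANNELS, METRICS)]


def total_content(roi):
    """Sum of all pixel values in a 2D ROI given as a list of rows."""
    return sum(sum(row) for row in roi)


def spatial_second_moment(roi):
    """Intensity-weighted mean squared distance from the intensity centroid.

    Assumed definition for illustration; units are pixels squared.
    """
    total = total_content(roi)
    cy = sum(y * v for y, row in enumerate(roi) for v in row) / total
    cx = sum(x * v for row in roi for x, v in enumerate(row)) / total
    return sum(v * ((y - cy) ** 2 + (x - cx) ** 2)
               for y, row in enumerate(roi) for x, v in enumerate(row)) / total


if __name__ == "__main__":
    compact = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]   # all signal in one pixel
    spread = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]    # same total, pushed to corners
    print(len(FEATURES), "features")
    print(total_content(compact), spatial_second_moment(compact))
    print(total_content(spread), spatial_second_moment(spread))
```

The two toy ROIs carry the same total content but differ in spatial second moment, which is the kind of information a spatial metric adds beyond a total content measurement.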
Higher lipid content in CTCs isolated with the CK+/CD45− marker has been previously quantified using coherent anti-Stokes Raman scattering (CARS) microscopy, showing a 7-fold higher lipid signal over other blood cells in metastatic prostate cancer patients [30]. In my work, I assessed how well lipids perform as an identifying biomarker for CTCs in this model system, both alone and in combination with the DNA/CD45/CK biomarkers. I test the following experimental hypotheses: (H.1) image cytometry for a composite panel of DNA/lipids/CD45 will detect MCF7 cells with better performance than a standard DNA/CD45/CK panel, (H.2) MCF7 cells have increased lipid content compared to white blood cells, and (H.3) spatial metrics of second moment, spatial-frequency second moment, and their product using image cytometry can increase sensitivity and specificity beyond the sensitivity and specificity of simple total content measurements. The identification performance measures I report for the studied biomarkers and combinations are the distributions of features for the D+ and D− populations, area under the curve (AUC), Cohen's d, sensitivity, specificity, and the rarity of cell that can be detected.
Finally, the utility of these biomarker combinations and image cytometry also lies in the reliability and repeatability of their execution. As part of this work, I have engineered technology that allows cells of interest to be imaged and analyzed for these biomarkers in a more reliable and repeatable fashion. I present a device, resulting from my engineering efforts, that improves the reliability and repeatability of labeling cells for image cytometry, and I describe how I used this device to evaluate the DNA/lipids/CD45/CK panel.
1.1 Contributions
My contributions to the field of cytometry are theoretical, technological, and experimental, with an application focus on the identification of circulating tumor cells. My theoretical contributions to this field are the development of mathematics describing the rarity of object an assay can detect, the development of mathematics describing how the detectable rarity of each stage combines in multistage diagnostic assays, showing how detectable rarity is related to the test positive likelihood ratio, and showing that in some cases detectable rarity and the test positive likelihood ratio have a maximum as a function of the test positive/test negative cutoff position and do not monotonically increase with decreasing false positive rate.
My technological contributions are the engineering of a device for labeling cells on track-etched polycarbonate filters and software to perform image cytometry. The filters prevent cell loss during the multiple fluid handling steps of cell labeling and the device allows for the size of the deposited cell field to be controlled. The software was written to perform image cytometry on these fields and computes spatial metrics of total signal, second moment, spatial-frequency second moment and their product. I have not previously seen these spatial metrics evaluated in the context of circulating tumor cells.
My experimental contributions are the evaluation of the DNA/Lipids/CD45/CK panel in a model system for circulating tumor cell detection. I have contributed an experimental protocol for the labeling of cells on these filters for DNA, Lipids, CD45 and Cytokeratin. I have contributed an open source dataset of image cytometry for this panel in a model system of circulating tumor cells to assist others developing their own software. I have not found other open source datasets related to CTCs.
Analyzing this experimental data, I present measured results using receiver operating characteristics on z-scored (logarithmic) axes and on test positive/test negative likelihood axes. To the best of my knowledge, I am the first to apply this mathematical theory to calculate detectable rarity in an experimental context. I show that detectable rarity and likelihood ratios can be used to find the optimal cutoff parameters in experimental cytometry data.
1.2 Background
1.2.1 The CTC Commercial Technology Landscape
Recent news articles highlight a number of companies that are commercializing products for detecting circulating tumor cells [31], [32]. Fairly detailed reviews of the different technologies being employed are given in [33] and [34]. I have accessed and reviewed the websites of the companies listed in the news articles to try to understand the technology they are using, the product they are marketing, and any claims they make about their CTC product. My notes on this assessment are in Appendix J. I classify the technologies being commercialized by these companies into three categories: immunoenrichment, size enrichment, and large-field image cytometry. The description large-field refers to the imaged cell field being hundreds to thousands of square millimeters in area, allowing for the measurement of tens of thousands to millions of cells, rather than to the specific microscopy employed. Several companies market a complete assay that includes enrichment combined with a second stage identification using cytometry.
In the immunoenrichment field, the Veridex (CellSearch) system is probably the most widely used in clinical and biomedical research [1], [35] and remains the only system with FDA approval. The Veridex assay employs immunoenrichment using a magnetic ferrofluid conjugated to EpCAM antibodies to capture EpCAM positive cells, followed by back-end image cytometry of the captured cells using labels for DAPI, CD45, and cytokeratin. Microfluidic devices are also being employed for immunoenrichment using antibodies conjugated to surfaces in the device [36]. A few companies such as Miltenyi Biotec and StemCell Technologies offer immunomagnetic enrichment products alone without a specified stage II test. These products are presumably targeted at researchers planning on doing PCR or sequencing for the second stage of an assay. One interesting technology for immunoenrichment is the GILUPI CellCollector, which is an antibody conjugated wire for use in in vivo capture [37].
Isolation of epithelial cells by size (ISET) [27] was one of the early alternatives to CellSearch. ISET employs size enrichment using track-etched polycarbonate filters with eight-micron pore sizes to capture larger cells. More companies are now offering products based on enriching for larger CTCs. The technology being marketed and developed by many of these companies employs microfluidic technology for size capture [38]-[43].
Using large-field image cytometry means performing microscopy on a large area, allowing one to image tens of thousands to millions of cells. This removes the need for an enrichment stage. Epic Technologies and Cytotrack both employ this technological approach [44]-[48]. In addition, there are publications on methods to image a large field first at lower resolution and then follow up in identified regions of interest with higher resolution microscopy [49], [50].
1.2.2 FDA Regulations and Guidance for CTC Identification Systems
The United States Food and Drug Administration considers CTC identification systems to be class II devices. Regulations for these devices are found under Title 21 of the Code of Federal Regulations, Part 866 (Immunology and Microbiology Devices), Subpart G (Tumor Associated Antigen Immunological Test Systems), Section 866.6020, Immunomagnetic circulating cancer cell selection and enumeration system [21CFR866.6020]. The FDA has also produced a guidance document related to these systems titled "Class II Special Controls Guidance Document: Immunomagnetic Circulating Cancer Cell Selection and Enumeration System," Docket No. 2004D-0163.
CHAPTER II
MATHEMATICAL THEORY AND SIMULATIONS OF DETECTABILITY FOR RARE EVENTS SUCH AS CIRCULATING TUMOR CELLS IN MULTISTAGE ASSAYS
2.1 Introduction
Improving the performance of CTC assays will be obtained through discovering identifying biomarkers that are more sensitive and specific, along with improvements in the detection methods that minimize cell loss and increase repeatability. In this chapter, I present analysis applicable to the engineering of multistage cytometry assays for the detection of rare cells such as CTCs. I present theory showing how sensitivity and specificity set the rarity of event an assay can detect, which I will call detectable rarity. Enrichment followed by image cytometry is commonly used to detect CTCs [1], [27], [35], [51]. I present a multistage theory by first describing how sensitivity, specificity, and detectable rarity can be applied for two binary test stages. Then, I present theory for two-stage assays consisting of enrichment followed by cytometry resulting in continuous biomarkers, and derive the expressions needed to build receiver operating characteristics (ROC). Through simulation, I explore how the distribution of the D+ and D− populations on the stage I biomarker, the enrichment cutoff position, and the dependence between stage I and stage II biomarkers affect overall assay sensitivity, specificity, and detectable rarity.
In my approach, I quantify the performance of a test by solving for how abundant a rare D+ cell must be among the D− population to detect it. This is similar to separating signal strength from noise in an electronic system. With this analogy, one finds connections between the theory presented and the electronic detection theory pioneered by Peterson, Birdsall, and Fox to understand the detectability of signals in the presence of noise [52]. The 'minimum cell abundance' that a test can detect is what I term the detectable rarity.
I will find that detectable rarity is linearly related to the test positive likelihood ratio (LT+), so maximizing it is the same as maximizing the test positive likelihood ratio. I will show that detectable rarity has combination properties similar to likelihood ratios in multistage assays. Of note, likelihood ratios were also developed in the theory of [52]. Likelihood ratios have previously been described as a decision criterion in diagnostics and in the field of psychology to describe decision making, with thresholds to be determined by event prevalence and the reward-cost tradeoff of correct and incorrect decisions [53].
I will show that, depending on the distribution of D+ and D− on the continuous biomarker parameter, the test positive likelihood ratio and detectable rarity have a maximum as a function of cutoff position on the ROC, and that they do not always monotonically increase with ever decreasing ROC cutoff thresholds as suggested by [53]. I will show this graphically by transforming the sensitivity-specificity axes of my ROC curves to the test positive (LT+) and test negative (LT−) likelihood ratios. The idea of transforming the coordinate axes was first proposed by Johnson, and (LT+, LT−) represents an alternate basis describing the ROC as fully as sensitivity and specificity [54]. Through simulation, I find that when the D− populations have excess kurtosis, there is a maximum value of LT+, and that when the D+ populations have excess kurtosis, there is a maximum on LT−.
For a two-stage test of enrichment followed by cytometry, if the stage I biomarker has a D− population with excess kurtosis, the optimal cutoff position of the enrichment stage biomarker is at the position maximizing LT+.
There are only a few prior works about probabilistic mathematics related to rare events such as circulating tumor cell detection. Rossenblatt et al. explored sampling statistics and showed that the binomial distribution can be used to determine the number of cells that need to be searched given a known rare prevalence and a test with perfect sensitivity and specificity [55]. Tibbe et al. noted that rare event sampling statistics could also be modeled with Poisson statistics and discussed other practical aspects related to the detection of CTCs, such as how they can be used to stratify patient outcomes [56]. In this chapter, I will also review the binomial statistics used to determine the number of cells to sample for CTC identification, first described by [55]. The binomial sampling statistics implicitly assume a test with perfect sensitivity and specificity and only consider how many cells would need to be sampled to have a given certainty of finding a cell at a given prevalence. The theory I present uses sensitivity and specificity to determine the rarity of cell the test can detect. I believe that in the rare cell context it is more often the assay's detectable rarity, rather than the expected cell prevalence (which may be unknown), that should be used to determine the number of cells to sample with binomial sampling statistics.
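The binomial sampling argument can be sketched in a few lines. This is a minimal illustration, not the method of [55]: it assumes a test with perfect sensitivity and specificity, independent sampling of cells, and a known prevalence. The function name `cells_to_sample` and the 95% confidence level are hypothetical choices made only for this example.

```python
import math


def cells_to_sample(prevalence: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one D+ cell among n sampled cells) >= confidence.

    Assumes independent draws and a perfect test (illustrative assumption).
    """
    if not (0.0 < prevalence < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("prevalence and confidence must lie strictly between 0 and 1")
    # Binomial: P(no D+ cell in n draws) = (1 - p)^n, so require (1 - p)^n <= 1 - confidence.
    n = math.log(1.0 - confidence) / math.log1p(-prevalence)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical prevalence of 1 D+ cell per 2000 cells.
    print(cells_to_sample(1.0 / 2000.0, 0.95))
```

Substituting the inverse of an assay's detectable rarity for `prevalence` gives the sample size suggested by the argument at the end of the paragraph above.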
2.1.1 Background
The mathematics presented here builds on the mathematics of detection theory and applies it to rare cell identification. The mathematics of detection theory is strongly connected to the mathematics of signal theory first developed to understand electronic systems and noise. Shannon's work [57] in that field is considered seminal for the idea of analyzing communication systems and noise through information content, and it led to what is now called information theory [58]. A few years later, Peterson et al. analyzed the limits of electronic detection outside of the communications context [52], and this work was cited by Green and Swets as pioneering in detection theory [53]. Radar is but one obvious application of [52]. Peterson et al. [52] were the first to show that receiver operating characteristics plotted on z-scored axes display normal distributions as straight lines. This point would later also be advanced into the diagnostic and psychological decision making context by Swets [59]. Although the theory I present is more classical and similar to [52], [53] and [59], Shannon's work is of note as he showed that a channel's information capacity grows logarithmically with its signal to noise ratio. I will relate signal to noise ratio to positive predictive value in the theory I describe below.
2.2 Theory: Derivation of Detectable Rarity
Consider a binary test performed on N cells coming from diseased and healthy groups. There are four possibilities for the groupings of these N objects: the cells can be test positive and disease positive, T+ ∩ D+; test positive and disease negative, T+ ∩ D−; test negative and disease positive, T− ∩ D+; or test negative and disease negative, T− ∩ D−.
The number of false positives expected is
$$N_{FP} = N \times P(T^+ \cap D^-), \tag{1}$$
where P(event) is the probability of that event occurring, which must always range between 0 and 1. NFP represents the background over which one needs to find the true positive cells.
The expected number of true positives is
$$N_{TP} = N \times P(T^+ \cap D^+). \tag{2}$$
In signal processing, the standard parameter used to measure whether a signal is detectable is the signal to noise ratio (SNR). In the application to CTC detection, the SNR can be defined as the ratio of the number of true positives to the number of false positives. Starting with this relationship,
$$N_{TP} = \mathrm{SNR} \times N_{FP}, \tag{3}$$
true positives must exceed false positives (noise) by a minimum SNR in order to be detectable. NFP can be expressed as N times the false positive rate, P(T+|D−), multiplied by the disease negative prevalence, P(D−), while NTP can be written as N times the sensitivity, P(T+|D+), multiplied by the disease positive prevalence, P(D+). One finds
$$N \times P(T^+ \mid D^+)\,P(D^+) = \mathrm{SNR} \times N \times P(T^+ \mid D^-) \times P(D^-). \tag{4}$$
By replacing the D− prevalence with its complement, P(D−) = 1 − P(D+), one can solve for the detectable prevalence for a given SNR,
$$P(D^+) = \frac{\mathrm{SNR} \times P(T^+ \mid D^-)}{P(T^+ \mid D^+) + \mathrm{SNR} \times P(T^+ \mid D^-)}. \tag{5}$$
For my simulations, I set the minimum SNR to be 2. For calculations, it is easier to use an optimization parameter of Ndet = P(D+)^(−1), which is the inverse of the detectable prevalence. I define Ndet as detectable rarity. Simplifying Equation 5, one has the following relation for Ndet,
$$N_{det} = \frac{P(T^+ \mid D^+)}{\mathrm{SNR} \times P(T^+ \mid D^-)} + 1. \tag{6}$$
For example, if Ndet is equal to 2000, then one can detect CTCs occurring at a concentration of 1 CTC to 2000 leukocytes in a sample. Any concentration of CTCs below this level will not be detectable. Maximizing Ndet maximizes for detection of the rarest possible events.
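Equations 5 and 6 translate directly into code. The sketch below is illustrative only: the function names and the example sensitivity and false positive rate are hypothetical, and the default SNR of 2 follows the value stated above for the simulations.

```python
def detectable_prevalence(sensitivity: float, false_positive_rate: float, snr: float = 2.0) -> float:
    """Equation 5: lowest D+ prevalence detectable at the required SNR."""
    noise = snr * false_positive_rate
    return noise / (sensitivity + noise)


def detectable_rarity(sensitivity: float, false_positive_rate: float, snr: float = 2.0) -> float:
    """Equation 6: N_det = sensitivity / (SNR * false positive rate) + 1."""
    if false_positive_rate <= 0.0:
        raise ValueError("false_positive_rate must be positive for a finite N_det")
    return sensitivity / (snr * false_positive_rate) + 1.0


if __name__ == "__main__":
    # Hypothetical assay: 80% sensitivity, false positive rate of 1 in 10,000.
    print(detectable_rarity(0.8, 1e-4))
    print(detectable_prevalence(0.8, 1e-4))
```

By construction the two functions are reciprocals of each other, which is a quick consistency check on any implementation.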
SNR is the ratio version of positive predictive value, P(D+|T+). They are related through the equation
$$\mathrm{SNR} = \frac{P(D^+ \mid T^+)}{1 - P(D^+ \mid T^+)}. \tag{7}$$
Ndet is linearly related to the test positive likelihood ratio,
$$LR_{T+} = \frac{P(T^+ \mid D^+)}{P(T^+ \mid D^-)},$$
through the following equation
$$N_{det} = \frac{LR_{T+}}{\mathrm{SNR}} + 1. \tag{8}$$
This means that maximizing Ndet is the same as maximizing the test positive likelihood ratio, regardless of the choice of SNR. Since positive predictive value (PV+) depends only on the cell prevalence and the test positive likelihood ratio, maximizing Ndet also maximizes PV+ at any given cell prevalence, whatever the actual value of PV+ turns out to be.
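A short worked step, using Bayes' theorem and writing p = P(D+) as shorthand (the symbol p is introduced here only for illustration), makes the dependence of PV+ on the likelihood ratio explicit:

$$PV^+ = P(D^+ \mid T^+) = \frac{P(T^+ \mid D^+)\,p}{P(T^+ \mid D^+)\,p + P(T^+ \mid D^-)\,(1 - p)} = \frac{LR_{T+}\,p}{LR_{T+}\,p + (1 - p)}, \qquad \frac{\partial PV^+}{\partial LR_{T+}} = \frac{p\,(1 - p)}{\left[LR_{T+}\,p + (1 - p)\right]^2} > 0.$$

For any fixed prevalence strictly between 0 and 1 the derivative is positive, so PV+ increases monotonically with the test positive likelihood ratio, and by Equation 8 with Ndet.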
2.3 Simulation: Depending on the Distribution of the D+ and D− Populations on a Biomarker There Can Be an Optimal Test-Positive/Test-Negative Cutoff Position
Many CTC assays employ image cytometry as the final identification stage, which provides test biomarkers as continuous variables. For continuous biomarkers, the sensitivity and specificity depend on the location of the cutoff position used to discriminate between the T+ and T− groups. The tradeoff in sensitivity and specificity as a function of cutoff position is often shown by a receiver operating characteristic (ROC) curve. Previous work [52], [53], [59] has shown that plotting sensitivity and specificity on z-scored axes shows a straight line characteristic when the D+ and D− populations are drawn from a normal distribution. In my experimental image cytometry work for DNA, lipid, CD45, and cytokeratin CTC labels, I found that none of these biomarkers showed normal distributions on an ROC (Chapter IV).
The test positive and test negative likelihood ratios (LT+ = P(T+|D+) / P(T+|D−) and LT− = P(T−|D+) / P(T−|D−)) are an alternate basis to sensitivity and specificity for biomarker ROC, as first described by [54]. This basis has an advantage, as one axis of the ROC is directly related to detectable rarity through Equation 8.
I created a simulation in MATLAB R2015a (MathWorks, Natick, MA) to model D+ and D− biomarkers as random numbers drawn from Pearson's system of distributions with configurable mean, standard deviation, skewness, and kurtosis. I then calculated and plotted the z-scored ROC and likelihood ratio ROC. I found that to create a z-scored ROC with a shape similar to my experimental results, I had to apply an excess kurtosis of 3 to my D+ and D− distributions. Figure 1 shows my results from simulations using normally distributed populations and populations with excess kurtosis of 3. I find that populations with excess kurtosis lead to a maximum in LT+ as a function of cutoff position, while for the normally distributed populations the likelihood ratio increases monotonically and does not show a maximum at any cutoff position.
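The qualitative behavior described above can be reproduced without random sampling. The following sketch is not the MATLAB simulation used for Figure 1. As a stand-in for a zero-skew Pearson distribution with excess kurtosis 3, it assumes a Student's t distribution with 6 degrees of freedom rescaled to unit variance, and it evaluates the test positive likelihood ratio analytically over a grid of cutoffs. The 4 standard deviation separation follows the Figure 1 caption; the function names and cutoff grid are hypothetical.

```python
import math


def normal_sf(x: float) -> float:
    """P(X > x) for a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def heavy_tail_sf(x: float) -> float:
    """P(X > x) for a Student's t variable with 6 degrees of freedom,
    rescaled to unit variance (excess kurtosis 6 / (nu - 4) = 3)."""
    nu = 6.0
    t = x / math.sqrt((nu - 2.0) / nu)  # undo the unit-variance scaling
    theta = math.atan(abs(t) / math.sqrt(nu))
    c2 = math.cos(theta) ** 2
    # Closed-form P(|T| <= |t|) for nu = 6.
    central = math.sin(theta) * (1.0 + c2 / 2.0 + 3.0 * c2 * c2 / 8.0)
    upper = 0.5 * (1.0 - central)
    return upper if t >= 0.0 else 1.0 - upper


def lr_positive_curve(sf, cutoffs, separation=4.0):
    """Sensitivity / false positive rate at each cutoff.
    D- is centred at 0 and D+ at `separation`, both with unit variance."""
    return [sf(c - separation) / sf(c) for c in cutoffs]


cutoffs = [0.1 * i for i in range(101)]  # cutoff positions 0.0 to 10.0
lr_normal = lr_positive_curve(normal_sf, cutoffs)
lr_heavy = lr_positive_curve(heavy_tail_sf, cutoffs)
best = max(range(len(cutoffs)), key=lambda i: lr_heavy[i])

if __name__ == "__main__":
    print("normal LR+ at first and last cutoff:", lr_normal[0], lr_normal[-1])
    print("heavy-tailed LR+ peaks at cutoff", cutoffs[best], "with LR+", lr_heavy[best])
```

Under these assumptions the normal curve rises over the whole grid, while the heavy-tailed curve peaks at an interior cutoff because both tails decay as the same power law and their ratio returns toward 1.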
Figure 1: Panels a & b: Histograms of D+ and D− populations drawn from normal distributions on biomarker α (BM α) and drawn from Pearson's distributions with excess kurtosis of 3 on biomarker β (BM β). D+/D− are separated by 4 standard deviations on both BM α and BM β. The z-scored and likelihood ratio ROCs (panels c & d) show that BM β is always beneath BM α, indicating there is no cutoff position where BM β outperforms BM α. Panel (d) shows that BM β has a maximum test positive likelihood ratio as a function of cutoff position, while BM α is monotonically increasing with cutoff position. The annotated maximum Ndet, shown as +, is also not at the minimum false positive rate in BM β as it is in BM α.
2.4 Theory: Two-Stage Tests
2.4.1 Sensitivity and Specificity of a Two-Stage Binary Assay
Many CTC isolation techniques employ two or more stages. A diagram of a two-stage system is shown in Figure 2 panel (a). I look at the theory for optimizing overall performance in multiple stage tests.
Figure 2: Panel a: Diagram of sample processing through a two-stage test. The test negative output of the first test is discarded. The test positive output from the first test is used in the second test. The test positive output of the second test is test positive for the combined test. Panel b: Geometrical picture of two-stage sensitivity showing that the two-stage sensitivity must be smaller than the sensitivity of the individual stages. Panel c: Geometrical picture of the two-stage specificity showing that the false positive rate [= 1 − specificity] must be smaller than that of the individual test stages.
In the two-stage assay, the test positive output is the intersection of a positive first test with a positive second test, T+ = T1+ ∩ T2+. The sensitivity of this test is
$$P(T^+ \mid D^+) = P(T_1^+ \cap T_2^+ \mid D^+). \tag{9}$$
Using the law of total probability, the combined sensitivity, P(T1+ ∩ T2+ | D+), can be written as the sensitivity of the first test minus the probability of objects that are positive in the first test but negative in the second test,
$$P(T_1^+ \cap T_2^+ \mid D^+) = P(T_1^+ \mid D^+) - P(T_1^+ \cap T_2^- \mid D^+). \tag{10}$$
Since P(T1+ ∩ T2− | D+) is a probability and ranges between zero and one, Equation 10 indicates that the sensitivity of the combined test cannot exceed the sensitivity of the first test alone. The relation
$$P(T_1^+ \cap T_2^+ \mid D^+) = P(T_2^+ \mid D^+) - P(T_1^- \cap T_2^+ \mid D^+) \tag{11}$$
is also true and can be confirmed with the geometrical picture. Equation 11 indicates that the sensitivity of the combined test cannot exceed the sensitivity of the second stage alone. Thus, the combined test sensitivity can be no higher than that of either individual stage.
The specificity of the overall test is
$$P(T^- \mid D^-) = 1 - P(T_1^+ \cap T_2^+ \mid D^-). \tag{12}$$
Using the law of total probability, the specificity of the test can be written as
$$P(T^- \mid D^-) = P(T_1^- \mid D^-) + P(T_1^+ \cap T_2^- \mid D^-), \tag{13}$$
in which the first term is the specificity of just the first test. The geometrical picture gives the similar relation
$$P(T^- \mid D^-) = P(T_2^- \mid D^-) + P(T_1^- \cap T_2^+ \mid D^-), \tag{14}$$
indicating that the combined test specificity must be at least as large as the specificity of the second test alone as well. Thus, the combined test specificity can be no lower than that of either individual test stage.
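The bounds implied by Equations 10 through 14 can be checked numerically on a toy population. The sketch below is a minimal illustration with hypothetical per-stage positive rates and independently drawn stage outcomes; the inequalities themselves hold for any joint behavior of the two stages, because a cell that is combined-test positive is necessarily positive in each stage.

```python
import random


def positive_rates(cells):
    """cells: list of (stage1_positive, stage2_positive) booleans for one population.
    Returns the fraction positive in stage 1, in stage 2, and in both (combined test)."""
    n = len(cells)
    p1 = sum(1 for t1, _ in cells if t1) / n
    p2 = sum(1 for _, t2 in cells if t2) / n
    p12 = sum(1 for t1, t2 in cells if t1 and t2) / n
    return p1, p2, p12


rng = random.Random(0)
# Hypothetical per-stage positive rates for the D+ and D- populations.
d_pos = [(rng.random() < 0.90, rng.random() < 0.80) for _ in range(10_000)]
d_neg = [(rng.random() < 0.05, rng.random() < 0.02) for _ in range(10_000)]

se1, se2, se_comb = positive_rates(d_pos)      # sensitivities
fp1, fp2, fp_comb = positive_rates(d_neg)      # false positive rates
sp1, sp2, sp_comb = 1.0 - fp1, 1.0 - fp2, 1.0 - fp_comb

if __name__ == "__main__":
    print("sensitivity:", se1, se2, se_comb)
    print("specificity:", sp1, sp2, sp_comb)
```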
2.4.2 Detectable Rarity in Multi-Stage Binary Assays
Suppose one has two independent test stages, so that the combined test sensitivity can be written as P(T1+ ∩ T2+ | D+) = P(T1+ | D+) P(T2+ | D+) and the combined false positive rate can be written as P(T1+ ∩ T2+ | D−) = P(T1+ | D−) P(T2+ | D−).
Ndet for the individual test stages is
$$N_{det1} = \frac{P(T_1^+ \mid D^+)}{\mathrm{SNR} \times P(T_1^+ \mid D^-)} + 1, \tag{15a}$$
$$N_{det2} = \frac{P(T_2^+ \mid D^+)}{\mathrm{SNR} \times P(T_2^+ \mid D^-)} + 1, \tag{15b}$$
and Ndet for the combined test is
$$N_{det\,comb.} = \frac{P(T_1^+ \mid D^+) \times P(T_2^+ \mid D^+)}{\mathrm{SNR} \times P(T_1^+ \mid D^-) \times P(T_2^+ \mid D^-)} + 1. \tag{16}$$
The sensitivity of the individual test stages can be written in terms of the Ndet value of the individual stages, their false positive rate, and SNR as
$$P(T_1^+ \mid D^+) = \mathrm{SNR} \times P(T_1^+ \mid D^-)\,(N_{det1} - 1), \tag{17a}$$
$$P(T_2^+ \mid D^+) = \mathrm{SNR} \times P(T_2^+ \mid D^-)\,(N_{det2} - 1). \tag{17b}$$
Substituting Equation 17 into Equation 16, the expression simplifies to
$$N_{det\,comb.} = \mathrm{SNR}\,(N_{det1} - 1)(N_{det2} - 1) + 1. \tag{18}$$
For M test stages, the expression for Ndet becomes
$$N_{det\,comb.} = 1 + \mathrm{SNR}^{M-1} \prod_{i=1}^{M} (N_{det\,i} - 1). \tag{19}$$
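Thus, for the case of independent test stages, Ndet − 1 of the combined test is proportional to the product of the Ndet − 1 values of the individual stages; when every Ndet is much larger than 1, Ndet of the combined test is approximately proportional to the Ndet values of the individual tests multiplied together. This conclusion can also be derived in terms of likelihood ratios from Equation 8.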
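As a numerical check of Equations 16 through 19, the sketch below combines two hypothetical independent stages and compares Equation 19 with the direct calculation from the combined sensitivity and false positive rate. The stage values and function names are illustrative assumptions only.

```python
import math


def n_det(sensitivity: float, false_positive_rate: float, snr: float = 2.0) -> float:
    """Equation 6 applied to a single test stage."""
    return sensitivity / (snr * false_positive_rate) + 1.0


def n_det_combined(stage_n_dets, snr: float = 2.0) -> float:
    """Equation 19 for M independent stages."""
    m = len(stage_n_dets)
    return 1.0 + snr ** (m - 1) * math.prod(n - 1.0 for n in stage_n_dets)


# Hypothetical (sensitivity, false positive rate) pairs for two independent stages.
stages = [(0.90, 0.05), (0.80, 0.02)]
via_eq19 = n_det_combined([n_det(se, fpr) for se, fpr in stages])
# Direct route: independent stages multiply sensitivities and false positive rates.
direct = n_det(0.90 * 0.80, 0.05 * 0.02)

if __name__ == "__main__":
    print(via_eq19, direct)
```

Both routes should agree up to floating point rounding, since Equation 19 is only an algebraic rearrangement of Equation 16.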
2.4.3 Treatment of Independent Versus Dependent Test Stages
Figure 3: Diagram of a two-stage system in the D+ and D− populations. Panel (a) shows a relation where Equation 24 is not satisfied due to a large overlap between T1+ and the D− population. Panel (b) shows the typical case where both test stages seek to reduce the false positive rate of the assay.
The independent-stage result, in which Ndet comb. is formed from the product of the individual test stages (Eq. 19), appears to be the upper limit on how test stages combine in a multistage test. For a multistage test, the overall test sensitivity is
$$P(T_1^+ \cap T_2^+ \mid D^+) = P(T_2^+ \mid T_1^+ \cap D^+) \times P(T_1^+ \mid D^+), \tag{20}$$
and the overall false positive rate is
$$P(T_1^+ \cap T_2^+ \mid D^-) = P(T_2^+ \mid T_1^+ \cap D^-) \times P(T_1^+ \mid D^-), \tag{21}$$
where P(T2+ | T1+ ∩ D+) is the stage 2 sensitivity, which is dependent on stage 1, and P(T2+ | T1+ ∩ D−) is the stage 2 false positive rate, which is dependent on stage 1. The combined test detectable rarity is
$$N_{det\,comb.} = \mathrm{SNR}\,(N_{det1} - 1)(N_{det2 \mid T_1^+} - 1) + 1, \tag{22}$$
where Ndet2|T1+ is the stage 2 detectable rarity dependent on stage 1,
$$N_{det2 \mid T_1^+} = \frac{P(T_2^+ \mid T_1^+ \cap D^+)}{\mathrm{SNR} \times P(T_2^+ \mid T_1^+ \cap D^-)} + 1. \tag{23}$$
For most cases, the dependent stage 2 detectable rarity will be less than or equal to the independent stage 2 detectable rarity, Ndet2|T1+ ≤ Ndet2. This inequality is equivalent to saying that the stage 2 dependent likelihood ratio is no greater than the independent likelihood ratio,
$$\frac{P(T_2^+ \mid T_1^+ \cap D^+)}{P(T_2^+ \mid T_1^+ \cap D^-)} \le \frac{P(T_2^+ \mid D^+)}{P(T_2^+ \mid D^-)}. \tag{24}$$
There are exceptions to Equation 24 being true, but I argue that it is true for most assays, as shown in Figure 3. Exploring this, Figure 3 panel a shows a case where Eq. 24 is not true and the stage 2 dependent test has a greater likelihood ratio after stage 1. Note the limited occupation T1+ has in the D+ population while still including a large amount of the D− population. T1 is not a test with high detectable rarity or one that seems very useful. Figure 3 panel b shows a more realistic case where T1+ and T2+ both exclude a large amount of the D− population and have low false positive rates. This is the more realistic design, and in this case Equation 24 holds.
2.4.4 ROC for Two-Stage Tests Using the Same Biomarker
Estimating the total test sensitivity and specificity of a system that consists of a binary selection step, such as immunoenrichment, immunodepletion, or a size screen, followed by measurement of a continuous biomarker is of interest since this process is employed by many CTC detection assays. Here I will evaluate how the first enrichment step alters the ROC when the same biomarker is used for both test stages. In the next section I will generalize this theory for the use of different biomarkers for stage 1 and stage 2.
Figure 4: Diagram of population separation using continuous biomarkers through a two-stage system. The effect of the first test is modeled as a selection function on a continuous biomarker, x (panel b), in which cells in the red population are more likely to be selected as test positive, T1+. The resulting enriched distributions are shown in panel c) and the discarded distributions are shown in panel d).
The D+ and D− populations each have a probability distribution across a given continuous biomarker; these are shown in Figure 4 (D+ [red] and D− [blue]) and are represented as normal distributions. Mathematically, we define fD+(x) and fD−(x) to represent the disease positive and disease negative probability density functions.
One can represent the effect of T1 selection as a modulation function, M(x), shown in Figure 4 panel b. M(x) is a function associated with a Bernoulli random variable: it varies between 0 and 1 and gives the selection probability, p, as a function of the value of the continuous biomarker, p(x). The M(x) function is reasonably modeled as an error function using the expression
(25)
where μ is the center position of the enrichment transition and σ is its spread.
The multiplication of M(x) and 1 − M(x) with the disease positive and disease negative population densities gives the probability of a cell being selected into the T1+ or T1− group. Since the cell must be in one of these groups,
$$1 = \int_{-\infty}^{\infty} M(x)\,f_{D^+}(x)\,dx + \int_{-\infty}^{\infty} \bigl(1 - M(x)\bigr)\,f_{D^+}(x)\,dx, \tag{26a}$$
$$1 = \int_{-\infty}^{\infty} M(x)\,f_{D^-}(x)\,dx + \int_{-\infty}^{\infty} \bigl(1 - M(x)\bigr)\,f_{D^-}(x)\,dx. \tag{26b}$$
The result is the same as the law of total probability,
$$1 = P(T_1^+ \mid D^+) + P(T_1^- \mid D^+), \tag{27a}$$
$$1 = P(T_1^+ \mid D^-) + P(T_1^- \mid D^-). \tag{27b}$$
One can now consider the sensitivity and specificity of the test as a function of cutoff position, x = C. The sensitivity is
$$P(T_2^+ \cap T_1^+ \mid D^+) = \int_{C}^{\infty} M(x)\,f_{D^+}(x)\,dx. \tag{28}$$
The specificity is given by
$$P(T^- \mid D^-) = 1 - \int_{C}^{\infty} M(x)\,f_{D^-}(x)\,dx. \tag{29}$$
The ROC curves for the enriched and non-enriched cases of Figure 4 are shown in Figure 5. The enriched ROC shows saturation because of the loss of cells in the enrichment stage.
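The saturation can be illustrated by evaluating the integrals of Equations 28 and 29 numerically. The sketch below rests on several assumptions that are not taken from the text: M(x) is modeled as a normal cumulative distribution function (one common error-function form, with center `mu` and spread `sigma`), the D− and D+ populations are unit-variance normal distributions centered at 0 and 4, and the enrichment parameters are arbitrary illustrative values.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(z: float) -> float:
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def modulation(x: float, mu: float, sigma: float) -> float:
    """Assumed selection probability M(x): a normal-CDF (error-function) step."""
    return normal_cdf((x - mu) / sigma)


def positive_rate(cutoff, mean, sd, mu=None, sigma=None, steps=4000):
    """Trapezoid estimate of the integral from `cutoff` to +infinity of M(x) f(x) dx.
    With mu=None no enrichment is applied, i.e. M(x) = 1."""
    upper = mean + 10.0 * sd  # the density is negligible beyond this point
    if cutoff >= upper:
        return 0.0
    h = (upper - cutoff) / steps
    total = 0.0
    for i in range(steps + 1):
        x = cutoff + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        m = 1.0 if mu is None else modulation(x, mu, sigma)
        total += weight * m * normal_pdf(x, mean, sd)
    return total * h


MU, SIGMA = 2.0, 0.5  # hypothetical enrichment center and spread

if __name__ == "__main__":
    for cutoff in (-6.0, 2.0, 4.0, 6.0):
        se_raw = positive_rate(cutoff, 4.0, 1.0)
        se_enriched = positive_rate(cutoff, 4.0, 1.0, MU, SIGMA)
        fpr_enriched = positive_rate(cutoff, 0.0, 1.0, MU, SIGMA)
        print(cutoff, se_raw, se_enriched, fpr_enriched)
```

Under these assumptions, as the cutoff moves far to the left the enriched sensitivity approaches the stage 1 sensitivity P(T1+|D+) rather than 1, which is the saturation seen in the enriched ROC.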
Figure 5: The ROC of the biomarker post enrichment (blue) follows the ROC of the biomarker without enrichment for low false positive rates and is seen to saturate at higher false positive rates. This is caused by cell loss in the D+ population due to the stage 1 enrichment. The D+ and D− distributions are normally distributed and are shown in Figure 4.
2.4.5 ROC for Two-Stage Tests Using Different Biomarkers
I generalize the previous results for the case that different biomarkers are used in the two test stages, as would be expected in a realistic test, for example immunoenrichment for EpCAM positive cells followed by image cytometry with a CD45 biomarker. I now look at the effect on sensitivity and specificity for this case.
The two biomarkers make the D+/D− populations describable through a two dimensional probability density function, fD±(x) → fD±(x, y), also known as a joint probability density function. The enrichment function, M(x), acts on only one of these dimensions. Figure 6 shows a simulation for two cases, one with statistically independent biomarkers, BM1 and BM2, and one with biomarkers that are correlated, BM1 and BM3. In general, one would expect biomarkers that select for CTCs to be correlated.
The law of total probability, Eq. 26, defining the sensitivity and specificity of the first stage becomes
$$P(T_1^+ \mid D^+) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} M(x)\,f_{D^+}(x, y)\,dx\,dy, \tag{30a}$$
$$P(T_1^- \mid D^+) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \bigl(1 - M(x)\bigr)\,f_{D^+}(x, y)\,dx\,dy, \tag{30b}$$
$$P(T_1^+ \mid D^-) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} M(x)\,f_{D^-}(x, y)\,dx\,dy, \tag{30c}$$
$$P(T_1^- \mid D^-) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \bigl(1 - M(x)\bigr)\,f_{D^-}(x, y)\,dx\,dy. \tag{30d}$$
As a function of cutoff position, y = C, on the stage 2 biomarker, the sensitivity of the combined first and second stage is
$$P(T_2^+ \cap T_1^+ \mid D^+) = \int_{C}^{\infty}\!\int_{-\infty}^{\infty} M(x)\,f_{D^+}(x, y)\,dx\,dy, \tag{31}$$
and as a function of cutoff position on the stage 2 biomarker, the false positive rate of the combined first and second stage is
$$P(T_2^+ \cap T_1^+ \mid D^-) = \int_{C}^{\infty}\!\int_{-\infty}^{\infty} M(x)\,f_{D^-}(x, y)\,dx\,dy. \tag{32}$$
One can relate the combined false positive rate to combined specificity through Equation 12.
One may see that additional dimensions can be added to further generalize this theory for 3 or more biomarkers.
2.4.6 Creating Overall Test ROC from Two-Stage Test Using Different Biomarkers
Suppose one is given tubes of D+ and D− cells that have been put through enrichment. With knowledge of the number of D+ and D− cells pre-enrichment, one can determine the overall test ROC. Measurement of a biomarker 'ROC' post enrichment will produce distributions that can be used to compute the overall test ROC. I put ROC in quotes because this is not a typical ROC but one with a dependent sensitivity and false positive rate. As a function of cutoff position, the 'sensitivity axis' of this ROC is not overall sensitivity but is instead P(T2+ | D+ ∩ T1+), and the measured 'false positive rate' is P(T2+ | D− ∩ T1+). The measured sensitivity can be converted to overall test sensitivity by
$$P(T_2^+ \cap T_1^+ \mid D^+) = P(T_2^+ \mid D^+ \cap T_1^+)\,P(T_1^+ \mid D^+), \tag{33}$$
and the measured false positive rate can be converted to overall test false positive rate with
$$P(T_2^+ \cap T_1^+ \mid D^-) = P(T_2^+ \mid D^- \cap T_1^+)\,P(T_1^+ \mid D^-). \tag{34}$$
With knowledge of the number of D+ and D− cells at the stage 1 input, N1D+ and N1D−, and the number of cells in those groups post enrichment used to compute the ROC, N2D+ and N2D−, the compensating sensitivity is
$$P(T_1^+ \mid D^+) = \frac{N_{2D^+}}{N_{1D^+}}, \tag{35}$$
and the compensating false positive rate is
$$P(T_1^+ \mid D^-) = \frac{N_{2D^-}}{N_{1D^-}}. \tag{36}$$
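The conversion described by Equations 33 through 36 amounts to rescaling each axis of the measured post-enrichment 'ROC' by the stage 1 recovery of the corresponding population. A minimal sketch follows, with a hypothetical function name and hypothetical cell counts and measured rates:

```python
def overall_roc(measured_sensitivity, measured_fpr, n1_pos, n2_pos, n1_neg, n2_neg):
    """Convert a post-enrichment 'ROC' into the overall two-stage ROC.

    measured_sensitivity, measured_fpr: per-cutoff rates measured on enriched cells.
    n1_pos, n1_neg: D+ and D- cell counts at the stage 1 input.
    n2_pos, n2_neg: D+ and D- cell counts remaining after enrichment.
    """
    stage1_sensitivity = n2_pos / n1_pos  # Equation 35
    stage1_fpr = n2_neg / n1_neg          # Equation 36
    overall_se = [s * stage1_sensitivity for s in measured_sensitivity]  # Equation 33
    overall_fpr = [f * stage1_fpr for f in measured_fpr]                 # Equation 34
    return overall_se, overall_fpr


if __name__ == "__main__":
    # Hypothetical counts: 1000 D+ cells in, 800 out; 1,000,000 D- cells in, 1000 out.
    se, fpr = overall_roc([0.9, 0.5], [0.1, 0.01], 1000, 800, 1_000_000, 1000)
    print(se, fpr)
```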
2.4.7 Simulation: ROC Pre and Post Enrichment for Dependent and Independent Biomarkers
The effect that the enrichment step has on overall ROC performance, and its relationship to biomarker dependency, is of interest in understanding commonly used multistage assays. To investigate this, I created and compared ROCs for two-stage tests that use biomarkers that are dependent and independent between stages. I simulated the populations for each biomarker to have excess kurtosis of 3, as shown in Figure 1. I created the D+ and D− populations through random number generation with a Pearson's distribution onto two biomarker spaces, BM1 and BM2. The D+ and D− distributions on BM1 and BM2 are independent since they were generated with different random number generation calls. To create a dependent biomarker, BM3, I created a biomarker that was a linear combination of BM1 and BM2 at a 45/55 ratio, BM3 = 0.45 BM1 + 0.55 BM2. Those ratios were chosen to make the lines clear for visualization in panels a) and b) of Figure 6.
I simulated positive selection using the modulation functions shown in panels c) & d) of Figure 6. I used those modulation functions, M(x), to determine the probability that an event would be selected given its BM1 value. To perform selection, for each simulated cell, I sampled a uniform random variable ranging between 0 and 1 and selected the cell if the sampled value was less than its selection probability, M(x). Thus, events with high selection probability are more likely to get selected than those with low selection probability.
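A minimal sketch of this selection step is given below. It is not the simulation code used for Figure 6. Two substitutions are assumptions made for illustration: a Laplace distribution (which has an excess kurtosis of 3) stands in for the Pearson's distribution, and a logistic curve with a hypothetical midpoint and steepness stands in for the modulation function M(x). The D− and D+ populations are placed four standard deviations apart on BM1.

```python
import math
import random


def laplace(rng, loc, sd):
    """Draw a Laplace sample (excess kurtosis 3) with the given mean and SD."""
    scale = sd / math.sqrt(2.0)  # Laplace variance equals 2 * scale**2
    # The difference of two unit exponentials is a standard Laplace variate.
    return loc + scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def modulation(x, midpoint=2.0, steepness=4.0):
    """Hypothetical logistic selection probability M(x) on BM1."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


def enrich(rng, bm1_values):
    """Keep each event when a uniform draw on [0, 1) falls below M(x)."""
    return [x for x in bm1_values if rng.random() < modulation(x)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
d_neg = [laplace(rng, 0.0, 1.0) for _ in range(n)]  # D- events on BM1
d_pos = [laplace(rng, 4.0, 1.0) for _ in range(n)]  # D+ events, 4 SD away
frac_neg = len(enrich(rng, d_neg)) / n  # fraction of D- passing enrichment
frac_pos = len(enrich(rng, d_pos)) / n  # fraction of D+ passing enrichment
```

Because each event is kept with probability M(x), the retained fractions estimate the stage 1 false positive rate and sensitivity for that modulation function.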
I calculated the ROC and detectable rarity of BM2 and BM3 before and after this enrichment step as shown in Figure 6 panels e) & f). From the results, one sees that even though BM3 performs better than BM2 before enrichment, the post enrichment performance of BM3 improves less than BM2 due to its dependence on BM1.
[Figure 6 image not reproduced. Axis labels: BM 1, BM 2, BM 3, 1 − spec., LR. Legend: BM 2 (ID BM1) pre enrichment; BM 2 (ID BM1) post enrichment; BM 3 (Dep. BM1) pre enrichment; BM 3 (Dep. BM1) post enrichment.]
Figure 6: Simulated D+ events (red) and D− events (blue) with distribution on biomarkers BM1, BM2, and BM3 (panels a and b). BM2 is independent (ID) of BM1 while BM3 is dependent (Dep.) on BM1. Distributions are put through an enrichment stage on BM1 with the cutoff shown in (c and d), resulting in the new distributions shown in panels (e and f). Panels (g and h) show the ROC of BM2 and BM3 before and after enrichment, with maxima in Ndet annotated as +. Although the performance of BM3 is seen to be better than BM2 pre enrichment, its performance improves less than BM2 post enrichment due to its dependence on BM1.
2.4.8 Simulation: The Optimal Enrichment Cutoff Position
I next asked the question of how maximum detectable rarity of the overall test changes as a function of enrichment cutoff position and how this is affected by the dependence between the enrichment stage I biomarker and stage II biomarker. To answer this question, I developed a simulation similar to the previous one, involving two independent biomarkers BM1 and BM2 and a third dependent biomarker BM3 = .30 BM1 + .70 BM2. These ratios were chosen to make the differences in curves easy to see graphically. I varied the cutoff position, shown in panels c) & d) of Figure 6 from left to right, computed the overall test ROC at each enrichment cutoff position, and recorded the value of Ndet for stage 1 and the maximum value of Ndet on BM2 and BM3 pre and post enrichment.
Log10 of the percentage by which the D− population is reduced by an enrichment is commonly reported as a performance metric for the enrichment, referred to as enrichment efficiency. Distancing the cutoff position of the enrichment ever further from the D− population continuously increases the enrichment efficiency. Thus, the cutoff position of the enrichment and the log of the enrichment efficiency are equivalent and are shown on the top and bottom axes of Figure 7.
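As an illustration of how cutoff position maps to enrichment efficiency, the sketch below computes the efficiency as the log10 ratio used on the bottom axis of Figure 7 (D− cells at the stage 1 input over D− cells selected by stage 1). The second function is an assumption made only for this example: it treats the D− population as normally distributed and the enrichment as a hard cutoff, which is simpler than the modulation functions used in the simulations. All function names are hypothetical.

```python
import math


def enrichment_efficiency(n_neg_input, n_neg_selected):
    """log10 of D- cells entering stage 1 over D- cells selected by stage 1."""
    return math.log10(n_neg_input / n_neg_selected)


def efficiency_normal_hard_cutoff(z):
    """Efficiency for a normally distributed D- population and a hard cutoff
    placed z standard deviations above the D- mean (illustrative assumption).
    """
    surviving_fraction = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail
    return -math.log10(surviving_fraction)


# One million D- cells in, one thousand carried through: 3 logs of depletion.
eff_counts = enrichment_efficiency(1_000_000, 1_000)

# Efficiency rises monotonically as the cutoff moves away from D-.
eff_by_cutoff = [efficiency_normal_hard_cutoff(z) for z in (0, 2, 4, 6)]
```

Under these assumptions the mapping from cutoff to efficiency is one-to-one, which is why the two axes of Figure 7 can be shown together.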
What I find is that if the D−/D+ populations have excess kurtosis on the stage 1 biomarker, then there is a cutoff position of the enrichment that produces a maximum detectable rarity, Ndet, while if the D−/D+ populations are normally distributed, then Ndet increases as the cutoff position is moved ever further from the D− population. I show an example of this in Figure 7 panel a. When the D+ and D− populations are not normally distributed, the cutoff position for the enrichment that maximizes detectable rarity is the same as the cutoff position where the test positive likelihood ratio shows a maximum on the likelihood ratio ROC.
In Figure 7 panel b, the combined test detectable rarity for a stage 2 biomarker that is independent of the enrichment biomarker (post enrichment, blue dashed trace) is the product of the detectable rarity for the enrichment biomarker and the detectable rarity of the stage 2 independent biomarker, as predicted by Equation 19. When the stage 2 biomarker is dependent on the stage 1 biomarker, the combined performance (red dashed trace) is less than the product of the individual stages, and the optimal stage 2 cutoff position maximizing overall test detectable rarity is seen to shift.
For D+/D− populations with excess kurtosis, the location of the cutoff giving the maximum Ndet also depends on the separation between the D+ and D− populations on the enrichment biomarker. For the simulations shown in Figures 6 and 7, I chose a value of four standard deviations of separation. An important concluding point is that there can be an optimal log of enrichment, since this is directly related to the optimal cutoff position of the enrichment, which I show depends on the distribution of the D+ and D− populations on the enrichment biomarker, with kurtosis being an important property describing their spread.
[Figure 7 image not reproduced. Top axis of panels a) and b): cutoff location from D− in standard deviations. Bottom axis: enrichment efficiency, log10(#D− stage 1 input / #D− stage 1 selected).]
Figure 7: The detectable rarity of stage I and of the overall test as a function of the cutoff position on the stage I enrichment biomarker. Panel a) shows the detectable rarity of the stage I enrichment alone. For the case of D+ and D− drawn from a normal distribution (BM a), the detectable rarity increases as the cutoff position is moved ever further away from the D− population. For D+ and D− populations with an excess kurtosis of 3 (BM 1), a maximum is seen in detectable rarity as a function of cutoff position. In panel b), BM1 is used as the enrichment biomarker and BM2 from Figure 6 as the stage II biomarker. The performance of the overall test post enrichment (blue dashed line) follows that of the detectable rarity of BM1 alone combined with that of BM2, as predicted by Equation 19. When the stage II biomarker is dependent on the stage I biomarker (BM3 from Figure 6), the overall test detectable rarity underperforms that of the independent case.
2.5 Theory: Cell Loss in Processing
Cell loss is almost always present in the numerous processing steps needed to bring a whole blood cell fraction through an assay. How does this loss affect the sensitivity and specificity of the assay? If the cell loss is homogeneous across the biomarker of interest, its effect is to proportionally lower the number of test positive objects produced by the test. This means that if a fraction $a$ of cells is lost, the sensitivity is reduced to $P_{\text{loss}}(T^+ \mid D^+) = (1-a)\,P_{\text{no loss}}(T^+ \mid D^+)$ and the false positive rate to $P_{\text{loss}}(T^+ \mid D^-) = (1-a)\,P_{\text{no loss}}(T^+ \mid D^-)$. This reduced false positive rate means more cells can be used to compensate. Note that this is different from increasing cell numbers to compensate for low test sensitivity, which does not change the specificity of the test. Analysis of positive predictive value and Ndet shows that, in the case of uniform cell loss, the effects on sensitivity and false positive rate cancel, leading to identical PV+ and detectable rarity. It is often not checked whether cell loss is homogeneous across the cell groups of interest. Also, compensating for cell loss requires using larger blood samples, which may be detrimental to patients.
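The cancellation for uniform cell loss can be checked numerically. The sketch below uses the standard Bayes expression for positive predictive value; the sensitivity, false positive rate, prevalence, and loss fraction are hypothetical values chosen only for illustration.

```python
def ppv(sens, fpr, prevalence):
    """Positive predictive value from sensitivity, false positive rate,
    and prevalence (Bayes' rule)."""
    true_pos = sens * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical assay: 90% sensitive, 1e-5 false positive rate,
# 1 in a million prevalence, 30% uniform cell loss.
sens, fpr, prev, loss = 0.9, 1e-5, 1e-6, 0.3

ppv_no_loss = ppv(sens, fpr, prev)
# Uniform loss scales both rates by the same factor (1 - loss).
ppv_with_loss = ppv((1 - loss) * sens, (1 - loss) * fpr, prev)
```

The common factor (1 − loss) appears in both the numerator and the denominator, so the two values agree to rounding error. The cancellation fails as soon as the loss differs between the D+ and D− groups, which is why checking homogeneity matters.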
2.6 Theory: Sampling Considerations and Volume of Blood to Screen
Considerations of the number of cells that need to be screened in order to have a good chance of accurately identifying a rare, 1 in 1 million, cell prevalence have been previously described [55], [56]. This previous analysis modeled the problem as a binomial distribution (a series of Bernoulli trials) and assumes perfect identification. I believe the optimal number of cells to screen is the number needed to adequately sample the detection limit of the assay, $p = N_{det}^{-1}$. Working through the analysis, assume the probability of a cell being a CTC is p; then the probability of finding k cells in n cell measurements is
$$P(X_n = k) = \binom{n}{k} p^k (1-p)^{n-k}. \tag{37}$$
The probability of finding greater than or equal to k cells in n measurements, $P(X_n \ge k)$, is of interest. $P(X_n \ge k)$ can be written as
$$P(X_n \ge k) = \sum_{j=k}^{n} \binom{n}{j} p^j (1-p)^{n-j}. \tag{38}$$
One should control this value such that it meets an assurance level, $P(X_n \ge k) = 1 - \alpha$. Here α is the probability that one does not find at least k cells of interest. Often this value is set to 5%. The assurance level, $1 - \alpha$, is increased by increasing the number of measured samples n. Equation 38 can be rewritten as its complement, $P(X_n \ge k) = 1 - P(X_n < k)$, giving
$$\alpha = \sum_{j=0}^{k-1} \binom{n}{j} p^j (1-p)^{n-j}. \tag{39}$$
Equation 39 is not easily rearranged to solve algebraically for n. Instead, a graphical or numerical approach can be taken. A plot of α as a function of n for p = 10^-6 is shown in Figure 8. Reference [55] also provides plots of n as a function of k for a 10^-6 cell rarity.
Figure 8: Assurance levels of encountering k cells after measuring n cells, assuming a 1 in one million cell prevalence.
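The numerical approach can be sketched as follows. This is one possible implementation rather than the code used to produce Figure 8; the function names are hypothetical. It evaluates the cumulative binomial probability of finding fewer than k cells and uses bisection to find the smallest n for which that probability drops to α or below.

```python
import math


def prob_fewer_than_k(n, k, p):
    """alpha = P(X_n < k) for X_n ~ Binomial(n, p), summed term by term."""
    term = math.exp(n * math.log1p(-p))  # j = 0 term, (1 - p)**n
    total = 0.0
    for j in range(k):
        total += term
        # Ratio of successive binomial terms: P(j + 1) / P(j).
        term *= (n - j) / (j + 1) * p / (1.0 - p)
    return total


def cells_to_screen(k, p, alpha=0.05):
    """Smallest n with P(X_n < k) <= alpha, found by doubling then bisection."""
    lo, hi = k, k
    while prob_fewer_than_k(hi, k, p) > alpha:
        hi *= 2  # grow the bracket until the assurance level is met
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if prob_fewer_than_k(mid, k, p) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


n_one = cells_to_screen(k=1, p=1e-6)    # at least one cell, 95% assurance
n_three = cells_to_screen(k=3, p=1e-6)  # at least three cells, 95% assurance
```

For a 1 in one million prevalence and α = 5%, this gives about 3 million cells for k = 1 and about 6.3 million cells for k = 3.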
The other question is what the value of k should be such that there is an assurance level of 1 − α that, upon repeated sampling, one finds at least k cells. Clearly k needs to be greater than 0. Could it be 1? This is the same question just discussed: it can be, and n can be found for a k of 1 with Equation 39. As one increases the value of k, one must search more cells to achieve the cutoff. The way k = 1 is assured at a given confidence is by sampling enough cells such that most likely at least 1 cell will be found.
Although k could be one by the above argument, perhaps a little more caution should be taken. One may note that for rare events (p much less than 1), the standard deviation of a binomial random variable is approximately the square root of its expectation value. Thus, the standard deviation for k = 1 would also be 1. It is therefore desirable to increase k so that k minus its standard deviation is still greater than 1, which requires a minimum of three cells.
2.7 Discussion
One interesting result is that the detectable rarity for a multistage test on independent test biomarkers is the product of the detectable rarities of each of the individual biomarkers. This result falls out of the theory of likelihood ratios. The result points to a path to detect ever rarer events in many applications, including CTCs. I used analytical arguments and simulation to show that the independent case is the upper limit on how test stages combine in a multistage assay.
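The likelihood ratio step behind this product rule can be written out briefly. The derivation below is a sketch that assumes the two stage results are conditionally independent given disease status; the symbol $\mathrm{LR}^+$ for the test positive likelihood ratio is notation introduced here for illustration.

```latex
\begin{align}
\mathrm{LR}^+_{1 \cap 2}
  &= \frac{P(T_1^+ \cap T_2^+ \mid D^+)}{P(T_1^+ \cap T_2^+ \mid D^-)} \\
  &= \frac{P(T_1^+ \mid D^+)\,P(T_2^+ \mid D^+)}
          {P(T_1^+ \mid D^-)\,P(T_2^+ \mid D^-)} \\
  &= \mathrm{LR}^+_1 \cdot \mathrm{LR}^+_2 .
\end{align}
```

When the stage 2 biomarker depends on the stage 1 biomarker, the joint probabilities no longer factor in the second line, and the combined likelihood ratio is no longer the simple product.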
Other transformations of the ROC axes have been proposed in addition to the likelihood ratio axis explored here. Recently, [60], [61] have shown transformations to convert ROC to metrics of Shannon entropy. Indeed, some of the more recent mathematical development of ROC is related to its application in imaging [62], [63]. The prominent place that likelihood ratios take is of note in this analysis. In addition to metrics and analysis focused on Shannon information, the diagnostic odds ratio is worth mentioning as it is independent of disease prevalence [64]. I point out that my approach here differs from prior work in focusing on separating the question of what prevalence of cell a test can detect from what that prevalence actually is. The prevalence of cell that a test can detect is a quantification purely of the test, while the actual cell prevalence is unknown until an accurate enough test is made to see it.
The connection between Shannon information and diagnostics is an intriguing one. Reference [65] has provided analysis of how sensitivity and specificity relate to a diagnostic's capacity as an information channel. Early connections between diagnostics and information entropy are attributed to Akaike [66]. Reference [67] presents more recent work on using information entropy in diagnostics, and [68] provides a recent review of information metrics in general. Interest in these methods is high because information theory provides a deeper theoretical understanding, similar to what Shannon's theory did in the field of communications. One application of Shannon's theory of information in diagnostics is quantifying the redundancy in testing [69]. Overall, I note that the main difference between diagnostic and communications systems is that in communications one controls encoding and decoding, while in diagnostics one only controls decoding.
Regressing two independent biomarkers against D+ and D− will produce a new biomarker that outperforms them but is also dependent on them. If one of the independent biomarkers is also used for the enrichment, then post enrichment this regressed biomarker will perform worse than the other independent biomarker it is built from. Thus, finding cutoffs maximizing Ndet for each biomarker and then using these cutoffs in each test step may help to detect rarer cells than regression techniques. I showed the connection between detectable rarity and likelihood ratios. Reference [52] (p. 182) came to a similar conclusion, stating, "The chief conclusion obtained from the general theory of signal detectability presented in Section 2 of this paper is that a receiver which calculates the likelihood ratio for each receiver input is the optimum receiver for detecting signals in noise." Reference [52] noted that the difficulty is in determining the ordering of the test gates. In my work, I have seen that PCA combined with regression does not improve regression performance [71]. However, methods such as PCA may be helpful to generate a basis for test gating, as they produce statistically uncorrelated basis vectors.
Further generalizing these results, a question that comes up is: what are the limitations of biomarkers such as prostate specific antigen (PSA) and cancer antigen 125 in the detection of disease, with particular interest in early detection? Assay sensitivity and specificity set the rarity of event that an assay can detect for these biomarkers as well. For example, the prevalence of many solid cancers, such as ovarian cancer, is 1 in 10 000 [3], [4], [72]. Therefore, by the presented theory, a useful early diagnostic with perfect sensitivity should have a false positive rate better than 1 in 20 000.
The best way to evaluate the sensitivity and specificity of the overall system is to run disease positive and disease negative cell populations through the complete assay and observe the frequency of test positives in the disease positive population, $P(T_1^+ \cap T_2^+ \mid D^+)$, which is the sensitivity, and in the disease negative population, $P(T_1^+ \cap T_2^+ \mid D^-)$, which is the false positive rate, the complement of the specificity. I show that the parameters needed in the calculation of the overall ROC are the numbers of D+ and D− cells before stage I and the measurement of the continuous biomarkers at the depletion output.
CHAPTER III
ENGINEERING OF CELL LABELING TECHNOLOGY
3.1 Introduction
The motivations for the design of the labeling device were to reduce cell loss during the labeling process and to allow the area of the resulting cell field to be configured. Current devices for laying down cell suspensions do not allow the size of the cell field to be configured, and the field is too large in area to be compatible with imaging of cell fractions of fewer than 10 000 cells. In this chapter, I also describe a coverslip mounter I employed for slowly applying coverslips to prepared filter samples, a protocol for labeling cells for DNA, lipids, CD45, and CK, and example images produced with the combined device and protocol.
3.2 Device for Cell Labeling
A side view schematic of the labeling device is shown in Figure 9. A nine-lane version of the device (for holding nine samples in parallel) was machined from polycarbonate, with the alignment plate 3D printed in acrylonitrile butadiene styrene (ABS). Manual ball valves and a manifold were used to connect the device to vacuum and control fluid flow in the individual lanes. I also 3D printed the plate in Figure 10 for securing the valves. Nylon tubing (1/8" diameter) was used to connect the valves to the device.
Figure 9: Side cross section of labeling device. The device consists of an input head, alignment plate, and output head. The input and output heads sandwich O-rings set with the alignment plates to create a seal on the polycarbonate filter, as shown in detail 1. The input head has a threaded connector (a) for applying positive pressure. The volume of the staining reservoir (b) is controlled by the diameter (c), while its height is set to the length of a standard gel loading pipette tip. This choice of height prevents damaging the filter while enabling bubble-free loading. The diameter of the laydown area is controlled by the O-ring diameter, which can be set by changing the diameters in the alignment plate (e). This plate must be thin enough for a compression gap (f) to enable sealing of the device. The output head contains a threaded connector (g) for pulling fluids from the device. All fluids flow in the direction of the dotted arrow (h).
The vacuum regulator (Figure 10 (e)) is used to pull fluid and prevents distortion of the filter. I set my vacuum level to 10 kPa. The filter is held in compression between the input and output heads using eight 4-40 socket head cap screws on the device, tightened to 4 in-oz of torque. For this work, I used 90Buna ANSI004 O-rings (Rocket Seals, Denver, CO) to create the seal on the filter. The high durometer Buna O-ring is needed to prevent filter deformation by the O-ring when clamping the device.
Figure 10: Diagram of assembled system: The lanes in the staining chamber (a) are pulled with vacuum controlled through stop cock panel (b) fabricated with 3D printing. The vacuum splits out through the manifold (d) and is routed through 2 aspiration steps (d). To prevent stretching the filter, vacuum pressure is controlled by regulator (e) which is connected to the lab vacuum (f).
I used a 4.5 mm punch (Micromark #83513) to cut 13 mm diameter, 800 nm pore size track-etched polycarbonate filters (ISOPORE ATTP01300) to 4.5 mm diameter. I am able to cut 2 to 4 filters from each 13 mm diameter filter. I punched the filter with the protective paper shipped with the filters above and below it.
3.3 Coverslip Mounter
A custom-built setup was constructed for mechanical application of the coverslip to the slide on which I set the filter. I have found that it is easy to smear out the cell bodies or introduce bubbles at this step when mounting by hand. My coverslip mounter uses a stage and vacuum to hold the coverslip to a plate. Coverslips are pressed onto the filter on the slide at 0.2 mm per second, and the vacuum holding the coverslip is then released. Movements to a preloading position and retraction of the plate after the vacuum was released were done at 1.2 mm per second (the top speed for this stage).
3.4 Protocol for DAPI, Bodipy, Pan-Cytokeratin, and CD45 Labeling of Cells on Track-Etched Polycarbonate Filters
The filter is lipophilic, and different fluorophores can also label the filter, producing a high background. To reduce background, filters are first treated with TrueBlack (Biotium #23007). Additionally, fixed cells will not naturally stick to the filter, which can cause issues with mounting on slides. I have found that treatment of the filter with poly-D-lysine before adding the cells, followed by treatment of the filter and cells with formaldehyde after adding the cells, causes the cells to remain bound. I presume that the formaldehyde crosslinks the cells with the poly-D-lysine bound to the filter.
To avoid air bubbles in the device, the filter must not dry out during the labeling process as fluids are passed through in series.
This protocol uses the saponin-based BD Perm/Wash buffer for permeabilization (Part #: 512091KZ). I also experimented with using various concentrations of ethanol and methanol to permeabilize the cells. In comparing these methods, I found the fluorescent dyes labeling cytokeratin and lipids in MCF7 cells were much brighter with the BD kit than with alcohols. My experiments suggest that using alcohols for permeabilization leaches out lipids, leading to lower Bodipy intensity.
The sequence of steps used in labeling and mounting cell samples follows.
1. Load device with filters. Perform loading submerged in water to prevent air bubbles. Tighten screws to 4 in-oz torque.
2. Pull residual water left from loading filters
3. Load and pull 90 µL/lane 70% ethanol
4. Make a solution of 1:4000 TrueBlack in 70% ethanol.
5. Load and pull the 1:4000 TrueBlack solution at 70 µL/lane
6. Incubate at half volume (i.e. half the liquid remains on top of the filter) at room temperature for 10 minutes
7. Load and pull 30 µL/lane PBS. Repeat once.
8. Load 30 µL/lane 0.01% v/v poly-D-lysine
9. Incubate at half volume for 10 minutes
10. Load and pull cell sample solution
11. Load and pull 50 µL/lane PBS
12. Load 30 µL/lane 4% formaldehyde (Sigma 47608) in water.
13. Incubate at half volume for 10 minutes on ice
14. Load and pull 30 µL/lane 1x BD Perm/Wash
15. Load 30 µL/lane 1x BD Perm/Wash
16. Incubate on ice at half volume for 10 minutes
17. Load 20 µL/lane IgG (Sigma 18640) at 20 µL per mL in 1x BD Perm/Wash.
18. Incubate on ice at half volume for 15 minutes. (Blocking)
19. Load and pull 30 µL/lane BD Perm/Wash
20. Prepare antibody solution: 3 µL anti-PanCK-Alexa555 (Cell Signaling Technologies 3478S) + 3 µL anti-CD45-Alexa633 (Biolegend 304020) into 100 µL 1x BD Perm/Wash
21. Load 10 µL/lane of solution. Incubate at half volume for 30 minutes on ice
22. Load and pull 30 µL/lane 1x BD Perm/Wash
23. Apply DAPI (Sigma D9542) and Bodipy 495/503 (Fisher Scientific D3922) labels. Total solution: DAPI and Bodipy each at 1 µg/mL in 1x BD Perm/Wash.
24. Load DAPI/Bodipy solution at 30 µL/lane
25. Incubate at half volume for 10 minutes at room temperature.
26. Load and pull 30 µL/lane NanoPure H2O
27. Pull lanes to empty
28. Mount by depositing 3 µL water on slide and 15 µL Prolong Diamond (Fisher Scientific P36970) on coverslip
29. Deposit filter with cells facing up in water drop on slide
30. Place coverslip over filter
31. Secure coverslip corners with nail polish
3.5 Example Resulting Cell Preparations
An example image of a full cell field and a sub region of interest prepared using the device and with the 4 labels applied using the protocol is shown in Figure 11.
Figure 11: A 1:1 sample of WBCs and MCF7 cells labeled for DAPI (blue), Bodipy (green), anti-pan-cytokeratin (yellow), and anti-CD45 (red). Panel a) representative image of cell spot produced by labeling device. Diameter of spot is 2.1 mm. Image is an 8x8 mosaic; full resolution of the square area is shown in panel b). Image at full resolution is available in public dataset [73].
CHAPTER IV
EXPERIMENTAL EVALUATION OF IMAGE CYTOMETRY FOR DNA, CD45, CYTOKERATIN, AND LIPIDS IN A MODEL SYSTEM FOR CIRCULATING TUMOR CELL IDENTIFICATION
4.1 Introduction
In this chapter, I present an analysis of how the identification performance of image cytometry for DNA (DAPI), lipids (Bodipy), and CD45 compares to image cytometry for the classical biomarker panel of DNA (DAPI), cytokeratin (CK), and CD45 in a model system of disease positive (D+) MCF7 cells and disease negative (D−) white blood cells (WBCs) from human peripheral blood. A DNA, lipids, and CD45 panel is interesting because it does not use epithelial biomarkers and because the lipid label (Bodipy) is compatible with live cell imaging.
The WBC-MCF7 model system has been previously used by other investigators in studying CTCs [24]-[29]. Additionally, the manufacturer of the cytokeratin antibody used MCF7 as the positive control. Fatty acid synthase (FAS) has been shown to be overexpressed in many cancers [74]-[81]. Higher lipid content in CTCs isolated with the CK+/CD45− marker has been measured using coherent anti-Stokes Raman scattering (CARS) microscopy, showing a 7-fold higher lipid signal over other blood cells [30]. It is not clear if the increased fatty acid content of these CTCs is due to increased de novo synthesis by the cells through FAS or to fatty acid uptake from the blood. One study has shown that treatment with a FAS inhibitor lowers the amount of neutral lipid staining (Bodipy) in a prostate cancer mouse model [82]. Also, there is a report of lipid droplets accumulating through cellular uptake under hypoxic conditions through the HIFA pathway [83]. Tumors are thought to be under hypoxic conditions, and perhaps this pathway may be a driver of lipid accumulation.
In my work, I assess how well lipids perform as an identifying biomarker for CTCs in a model system. I employ Bodipy to label lipids. Bodipy is a lipophilic dye that labels long chain fatty acids, including neutral lipids, and can provide lipid contrast similar to CARS while also being compatible with live cell imaging [84]. I test the following hypotheses: (H.1) image cytometry for a composite panel of DNA/lipids/CD45 will detect MCF7 cells with better performance than a standard DNA/CD45/CK panel, and (H.2) MCF7 cells have increased lipid content compared to white blood cells. The identification performance measures I report for the studied biomarkers are the distributions of features for the D+ and D− populations, area under the curve (AUC), Cohen's d, sensitivity, specificity, and the rarity of cell detectable.
I additionally test the hypothesis (H.3) that the spatial metrics of second moment, spatial-frequency second moment, and their product, computed using image cytometry, can increase sensitivity and specificity over simple total content measurements. I hypothesized that cancer cell lines have a bigger nucleus than WBCs, which could be measured by a second moment feature. The spatial-frequency metric was conceived as a way to quantify structural differences associated with lipid droplets seen on microscopy. The importance of conjugate variables in signal processing and physics, and knowledge of their invariant products, such as M^2 in laser beam profiling [85], [86], motivated looking at the product of the spatial second moment with the spatial-frequency second moment, which is a unitless metric related to information content.
To perform image cytometry for these metrics, I coded my own segmentation and analysis software that uses both ImageJ and Matlab. Before doing this, I looked at using the ImageJ ROI analyzer and CellProfiler [70] to compute these metrics, but second moment, spatial-frequency second moment, and their product are not currently available in these software packages.
I refer to an image feature computed on one channel/label as a biomarker. Thus this 4-label data, on which 4 metrics were computed, resulted in a 16-biomarker image cytometry dataset. For each of these potential identifying biomarkers, I calculated and present the univariate performance in this model system.
I present a regression analysis to linearly combine individual biomarkers to create a new biomarker with increased separation between the D+ and D− populations and thus increased performance. This regression technique is one of many machine learning techniques that can be used in this optimization problem. Recently, comparisons of Bayesian classifiers, k-nearest neighbors, support vector machines, and random forests have been performed [87]. Using my regression analysis, I obtained performance on par with other machine learning methods that were tested in a WBC-cell line model of CTCs [87], with further detail presented in Appendix G.
In my approach, I train and test on pure samples of known class without using human identification. Prior CTC machine learning studies are motivated by a desire to reduce operator time and error, and mostly focus on training and testing machine learning techniques against operator-identified CTCs [87]-[91]. These studies have found automated methods to perform similarly to manual identification [88], [91] and unsupervised methods to perform similarly to supervised ones [90].
In my pure sample analysis, I can rigorously test the performance because any object identified as test positive in the D− WBC fraction is a false positive. My regression analysis points to the fact that there are a small number of objects in my D− dataset that are classified as MCF7s, and visual inspection confirms these objects would be classified as D+ by a human operator. My theory points to these false positives being the limiting factor in rare cell identification.
Image cytometry work has been performed on the classification of WBC subtypes by image morphology [92]-[96]. There has been investigation of image cytometry specifically related to circulating tumor cells [88]. Many of the reports focus on the region of interest (ROI) segmentation problem [97]-[99]. There have also been studies of image cytometry using second spatial-frequency moment analysis [100]. However, this work is the first I have seen to characterize and quantify the metrics of total signal, second moment, spatial-frequency second moment, and their product on several biomarkers, with imaging performed on over 1000 cells.
4.2 Materials and Methods
4.2.1 Sample Inclusion and Abundancies
The accuracy with which one estimates sensitivity, $P(T^+ \mid D^+)$, is related to the number of samples in the D+ dataset, while the accuracy with which one estimates specificity, $P(T^- \mid D^-)$, is related to the number of samples in the D− dataset. To estimate sensitivity and specificity to similar accuracy, I have used similar abundances of the D+ and D− cells in this dataset.
I have chosen the sample size of each dataset to be 1000-3000 cells. I also prepared 1:1 mixed samples to qualitatively show that the position of the modes seen in pure samples is not due to varying labeling conditions or acquisition settings. The mixed samples were not used in the quantitative performance analysis. A threshold of 65 dBct [= 10 log10(counts)] on DAPI intensity (£) was applied to include ROIs in the analysis.
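For readers less familiar with the dBct scale, the conversion stated in the text, dBct = 10 log10(counts), can be inverted to express the inclusion threshold in raw counts. The helper names below are hypothetical and are used only for this illustration.

```python
import math


def counts_to_dbct(counts):
    """Express an integrated intensity in counts on the dBct scale."""
    return 10.0 * math.log10(counts)


def dbct_to_counts(dbct):
    """Invert the dBct scale back to raw counts."""
    return 10.0 ** (dbct / 10.0)


# The 65 dBct DAPI inclusion threshold corresponds to 10**6.5 counts.
threshold_counts = dbct_to_counts(65.0)
```

On this scale the 65 dBct threshold is roughly 3.2 million counts, and each additional 10 dBct is a tenfold increase in counts.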
White blood cells were isolated from peripheral blood taken from a different patient on each preparation day. MCF7 cells were prepared for a given day from cultured cells and therefore could have different properties due to confluence. In order to measure variations, the data was acquired on multiple days with samples in triplicate on each day. Statistical variations in performance within the same day (intraday) and between days (interday) are compared.
Training and testing data sets were acquired on experimental days 9, 10, 12, 13, 14, and 15. Additionally, 1 WBC filter from day 14 and 1 WBC filter from day 15 were excluded due to poor quality. Thus this dataset comprised 52 filters in total. The data collected on days 5 and 7 used an older CD45 antibody that did not show good labeling and were excluded from analysis. The cells on day 8 were only labeled for DAPI and Bodipy and were also excluded.
4.2.2 Sample Preparation and Labeling
Samples of human WBCs, MCF7 cells, and 1:1 mixtures were prepared as follows. Peripheral blood samples were collected from the Gynecological Tissue and Fluid Bank Repository (COMIRB 070935/COMIRB 051081) from consenting patients undergoing surgery at the University of Colorado Hospital. White blood cell (WBC) fractions were isolated from peripheral blood with an ammonium chloride lysis protocol. MCF7 cells were cultured and trypsinized. Following red blood cell lysis, some MCF7s were added to a fraction of isolated WBCs at a 1:1 ratio to prepare a mixed sample. Pure populations of WBCs, MCF7s, and the 1:1 mixed sample were fixed using 4 percent formaldehyde in water. Formaldehyde was washed out and samples were stored at 4 degrees Celsius. DNA from the MCF7 cell line was sequenced at the University of Colorado BioResources Core Facility and found to be a 100% match to MCF7.
The stored samples were labeled with DAPI (DNA), Bodipy (lipids) (Fisher Scientific D3922), anti-CD45-Alexa633 (Biolegend 304020), and anti-panCK-Alexa555 (Cell Signaling Technologies 3478S) per the protocol described in section 3.4. Labeling was performed in triplicate using a custom device that provides a seal against the track-etched polycarbonate filters and places the cells in a uniform single layer on a 3.4 mm² area of the filter. The filters containing the cell samples were then mounted onto a microscope slide for imaging.
The WBC, MCF7, and 1:1 samples were prepared on 8 separate days, producing 24 total samples. Each of these samples was labeled in triplicate on a separate day, and one day's worth of WBC, MCF7, and 1:1 samples was labeled twice, for 9 separate preparations of the filters. This resulted in 27 filters prepared with pure WBCs, 27 with pure MCF7s, and 27 with mixed samples, for a total of 81 filters.
4.2.3 Fluorescence Microscopy
Images of the full 3.4 mm² area of each sample preparation were acquired on a laser scanning confocal microscope (Carl Zeiss AG, LSM 780) using a 20X Zeiss Plan-Apochromat 0.80 NA objective. The laser lines used for excitation were 488 nm (Bodipy), 561 nm (Alexa555), and 633 nm (Alexa647). The DAPI label was excited by two-photon imaging using an ultrafast pulsed laser (Coherent Inc., Chameleon) tuned to 765 nm. I acquired data for all four channels by first imaging DAPI and Bodipy simultaneously, followed by simultaneous acquisition of the Alexa555 and Alexa647 channels. Cross talk between channels was assessed by imaging with just one of the excitation lasers on at a time and observing no bleed-through signal on the other channels. Additionally, imaging samples labeled only with DAPI and Bodipy showed that the Alexa555 and Alexa633 channels did not show any signal even with all lasers on. Three axial confocal sections were acquired over a 10 µm scan range to account for samples not being perfectly flat, and 8x8 tiled mosaic images were acquired for each channel and stitched together using software (Zen, Zeiss Inc.). Each tile of the image was 1024 x 1024 pixels with a pixel size of 0.42 x 0.42 µm. The full data set showing images taken from 81 slides has been made publicly available [73].
4.2.4 Image Processing
I performed image cytometry and calculated these features of the ROIs with custom segmentation and analysis software written in Matlab (MathWorks, Cambridge, MA) and ImageJ version 1.50 [101], linked by the interfacing tool MIJ [102]. My code performs the majority of image manipulations, including thresholding and segmentation, in ImageJ with a macro and then imports the channels and ROIs into Matlab for computation of the metrics for each ROI.
The DAPI (DNA) channel was first smoothed using a 1.5 µm Gaussian filter and a threshold of 2500 counts was applied. The thresholded areas for the channel were then dilated to increase the area and a watershed algorithm was applied to aid in ROI separation. The Analyze Particles function in ImageJ was run on the binary masks, segmenting them into ROIs containing individual cells. After segmentation, the 16 pure WBC samples contained 24,699 ROIs, the 18 pure MCF7 samples contained 41,091 ROIs, and the 18 mixed samples contained 33,726 ROIs.
The ImageJ application programming interface (API) was used to import the array of ROIs into Matlab and MIJ was used to import the image data. I then computed my metrics for each ROI. The image cytometry code performing these functions has been made publicly available [103].
4.2.5 Calculation of Cohen's d
Effect size measures the strength of the separation between two populations. Cohen's d quantifies separation by normalizing the difference between the means of the two populations, $\mu_1$ and $\mu_2$, by the pooled standard deviation between them, $s$. I used the formula

$$d = \frac{\mu_1 - \mu_2}{s}, \qquad s = \sqrt{\frac{(n_1 - 1)\sigma_1^2 + (n_2 - 1)\sigma_2^2}{n_1 + n_2 - 2}} \qquad (40)$$

to compute Cohen's d. In Eq. 40, $n_1$ and $n_2$ are the number of samples in each population, and $\sigma_1$ and $\sigma_2$ are the standard deviations of each population. Cohen considered weak, medium, and large effect sizes to be d = 0.2, d = 0.5, and d = 0.8 respectively [104]. The value of d can be positive or negative; the sign merely indicates whether the mean of population 1 is greater or less than that of population 2. The sign has been suppressed to prevent confusion, as it can be inferred from the means of the sample populations.
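As an illustration of this calculation, here is a minimal Python sketch (not the Matlab code used for the analysis). It assumes the standard pooled standard deviation with n - 1 weighting and returns the unsigned value; the function name and the sample values are hypothetical.

```python
import math
import statistics


def cohens_d(pop1, pop2):
    """Unsigned Cohen's d using the pooled standard deviation (assumed n - 1 weighting)."""
    n1, n2 = len(pop1), len(pop2)
    mu1, mu2 = statistics.fmean(pop1), statistics.fmean(pop2)
    var1, var2 = statistics.variance(pop1), statistics.variance(pop2)  # sample variances
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return abs(mu1 - mu2) / pooled_sd  # sign suppressed, as in the text


# Hypothetical feature values for two small populations.
wbc = [1.0, 2.0, 3.0, 4.0, 5.0]
mcf7 = [3.0, 4.0, 5.0, 6.0, 7.0]
print(cohens_d(wbc, mcf7))
```

Swapping the two arguments gives the same value because the sign is dropped.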
4.2.6 Calculation of Image Metrics
I selected spatial metrics that were invariant to rotation. The metrics are inspired by those used in laser beam profiling [85], [86]. Let $I(x, y)$ be the measured intensity of the image in a region of interest (ROI) as a function of spatial position $(x, y)$. To quantify the spatial size of a labeled ROI I used the second moment. Computation of the second moment involves use of the first moment, defined as

$$\bar{x} = \frac{\iint x\, I(x,y)\, dx\, dy}{\iint I(x,y)\, dx\, dy}, \qquad \bar{y} = \frac{\iint y\, I(x,y)\, dx\, dy}{\iint I(x,y)\, dx\, dy}, \qquad (41)$$

which is also the centroid position. The denominator of the fraction is the zeroth moment, $\Sigma$, or the sum of the intensity signal over the ROI.

The second moment is a weighted average of the signal away from the centroid position,

$$\langle r^2 \rangle = \frac{\iint \left[(x - \bar{x})^2 + (y - \bar{y})^2\right] I(x,y)\, dx\, dy}{\iint I(x,y)\, dx\, dy}. \qquad (42)$$
I was also interested in a metric that might distinguish cells with small particles, like lipid vesicles, from those without them. I attempt to quantify this as the image distribution in spatial frequency. I define the spatial-frequency distribution of the image as the Fourier transform of its intensity,

$$\tilde{I}(f_x, f_y) = \iint I(x,y)\, e^{j 2\pi x f_x + j 2\pi y f_y}\, dx\, dy, \qquad (43)$$

where $j = \sqrt{-1}$. In the spatial-frequency domain, spatial centroid position is represented by a linear phase. Since one does not want position offset to affect these metrics, I define the second moment in spatial frequency as

$$\langle r_f^2 \rangle = \frac{\iint \left(f_x^2 + f_y^2\right) \left|\tilde{I}(f_x, f_y)\right| df_x\, df_y}{\iint \left|\tilde{I}(f_x, f_y)\right| df_x\, df_y}, \qquad (44)$$

taking the absolute value of the spatial-frequency distribution to eliminate the phase.
Finally, I consider the product of these two metrics,

$$\langle M^2 \rangle = \langle r^2 \rangle \langle r_f^2 \rangle. \qquad (45)$$
This number broadly represents the information content of the ROI or in laser beam profiling represents the spatial mode content.
The units for the second moments are µm² and µm⁻² for the spatial and frequency domains respectively, while $\langle M^2 \rangle$ is unitless. To make interpretation easier, I report the metrics in linear units by taking the square root,

$$\langle r \rangle = \sqrt{\langle r^2 \rangle}, \qquad \langle r_f \rangle = \sqrt{\langle r_f^2 \rangle}, \qquad \langle M \rangle = \sqrt{\langle M^2 \rangle}. \qquad (46)$$
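The metrics of this section can be sketched for a single ROI in Python. This is only an illustration under stated assumptions, not the Matlab code used for the analysis: the integrals are replaced by sums over pixels, the Fourier transform by a naive discrete transform (practical for small ROIs only), the frequency-domain second moment is taken about zero frequency, and all three reported metrics are taken as plain square roots. The function name `roi_metrics` and the test images are hypothetical.

```python
import cmath
import math


def roi_metrics(img, pixel_um=1.0):
    """Return (sigma, r, r_f, M) for a 2D list of intensities covering one ROI."""
    ny, nx = len(img), len(img[0])
    pts = [(x, y, img[y][x]) for y in range(ny) for x in range(nx)]
    sigma = sum(v for _, _, v in pts)                 # zeroth moment (total signal)
    xbar = sum(x * v for x, _, v in pts) / sigma      # first moments (centroid)
    ybar = sum(y * v for _, y, v in pts) / sigma
    r2 = sum(((x - xbar) ** 2 + (y - ybar) ** 2) * v for x, y, v in pts) / sigma

    num = den = 0.0
    for ky in range(ny):
        for kx in range(nx):
            # Naive 2D discrete Fourier transform at index (kx, ky).
            F = sum(v * cmath.exp(2j * math.pi * (kx * x / nx + ky * y / ny))
                    for x, y, v in pts)
            # Map the DFT index to a signed frequency in cycles per pixel.
            fx = (kx if kx <= nx // 2 else kx - nx) / nx
            fy = (ky if ky <= ny // 2 else ky - ny) / ny
            num += (fx ** 2 + fy ** 2) * abs(F)       # magnitude removes the phase
            den += abs(F)
    rf2 = num / den

    r = math.sqrt(r2) * pixel_um        # micrometers
    r_f = math.sqrt(rf2) / pixel_um     # cycles per micrometer
    return sigma, r, r_f, r * r_f       # M = sqrt(r2 * rf2) is unitless


def blank(n=8):
    return [[0.0] * n for _ in range(n)]


# Hypothetical ROIs: a single bright pixel and a 3 x 3 uniform blob.
point = blank()
point[4][4] = 1.0
blob = blank()
for yy in (3, 4, 5):
    for xx in (3, 4, 5):
        blob[yy][xx] = 1.0

print(roi_metrics(point))
print(roi_metrics(blob))
```

The larger blob has the larger spatial size and the smaller spatial-frequency spread, which is the trade-off the product M is meant to summarize.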
4.2.7 Performance Analysis
4.2.7.1 Training and Testing Data Subsets
Assessing the performance of a biomarker on the same data used to define or train it overstates its performance. To avoid this, the performance of both individual features and regressions combining different features was calculated by separating the dataset into training and testing subsets. Paired training-testing subsets were formed with data taken from the same experimental day. This approach is consistent with adding samples under test to the system as 3 additional samples to be tested at the found operating points. To test the generalizability of this approach, I also looked at WBC and MCF7 data from all days pooled together and randomly segmented into 10 subsets (details in Appendix B).
For my training-testing approach, I used one sample of WBCs and one sample of MCF7s from a given day as a training subset to find a cutoff point maximizing Ndet for each feature, and then used the same training subset to perform the regressions. Sensitivity, specificity, and Ndet were then calculated on the remaining datasets from that day, not including the training subset (the testing subset). For example, if the training subset was WBC sample 1 and MCF7 sample 1, the testing subset would be the grouping of WBC samples 2 and 3 and MCF7 samples 2 and 3.
The training-testing process was repeated using subsets from all possible pairings of WBC and MCF7 samples within each day. Thus days containing 3 samples of WBCs and 3 samples of MCF7s produced 9 training-testing subsets. On two experimental days one of the WBC samples was excluded due to poor quality, so for those two days I used six pairings. In total, across all days, I used 48 training-testing pairs in my assessment of biomarker performance.
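The pairing scheme can be sketched in a few lines of Python. This is an illustration only, not the published code of [105]; the sample indices are hypothetical and zero-based, and the filter counts per day follow the exclusions stated earlier (one WBC filter dropped on each of days 14 and 15).

```python
from itertools import product

# (WBC filters, MCF7 filters) kept on each experimental day.
filters_per_day = {9: (3, 3), 10: (3, 3), 12: (3, 3), 13: (3, 3), 14: (2, 3), 15: (2, 3)}


def training_testing_pairs(n_wbc, n_mcf7):
    """One WBC and one MCF7 sample train; the remaining samples of that day test."""
    pairs = []
    for w, m in product(range(n_wbc), range(n_mcf7)):
        train = {"wbc": [w], "mcf7": [m]}
        test = {"wbc": [i for i in range(n_wbc) if i != w],
                "mcf7": [j for j in range(n_mcf7) if j != m]}
        pairs.append((train, test))
    return pairs


all_pairs = {day: training_testing_pairs(*n) for day, n in filters_per_day.items()}
total = sum(len(p) for p in all_pairs.values())
print(total)  # 4 days x 9 pairings + 2 days x 6 pairings = 48
```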
The code for building the training and testing subsets has been made available [105].
4.2.7.2 Multivariable Regressions with Feature Selection
Multivariable regressions with feature selection were performed to combine individual features in order to create a higher performance biomarker. Selecting different feature groups for this process allowed me to determine their contribution to regression performance. The first feature group considered was that of all 16 features (denoted Reg1_all). I expected that Reg1_all, which combined all features, would perform better than just using the 4 total content features from each channel (Reg2_Σ). The performance of regressions using the three channels DAPI, CD45, and PanCK (Reg3_DAPI+CD45+PanCK), which are the standard biomarkers used for CTC detection, was also tested. I compared this to the three channels DAPI, CD45, and Bodipy (Reg4_DAPI+Bodipy+CD45) to evaluate my hypothesis H.1. Since a CTC must be a nucleated cell, I tested each channel of features individually combined with DAPI features and evaluated the groups DAPI/Bodipy, DAPI/CD45, and DAPI/PanCK (Reg5, Reg6, Reg7). I then performed regressions of each of the 4 features on the individual channels (Reg8-11).
The feature selection and regression process was performed using each of the 48 training subsets. This resulted in 48 different versions of the regressions Reg1 through Reg11. The performance of each version of these regressions was then tested using the paired testing subset not involved in training them.
For each of the training subsets, the regression and feature selection process began with feature selection to determine which features amongst the input feature group would produce the best performing regression. The feature selection process started by determining all combinations of the features in the group. For example, if the feature group had 3 features, there are 3 single-feature combinations, 3 two-feature combinations, and one three-feature combination, for a total of 7 feature combinations.
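The enumeration of feature combinations can be illustrated with a short Python sketch. The feature labels below are hypothetical placeholders; a group of n features yields 2^n - 1 non-empty combinations.

```python
from itertools import combinations


def feature_combinations(features):
    """All non-empty subsets of a feature group, smallest subsets first."""
    combos = []
    for k in range(1, len(features) + 1):
        combos.extend(combinations(features, k))
    return combos


three = feature_combinations(["DAPI_sum", "CD45_sum", "PanCK_sum"])
print(len(three))  # 3 + 3 + 1 = 7
```

For the full group of 16 features the same enumeration gives 65,535 combinations, which is why the per-combination scoring has to be inexpensive.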
The input training subset was broken up into second-level training and testing subgroups for assessing each feature combination. The number of ROIs in the second-level training and testing subgroups was roughly the same, and the selection process was repeated 10 times, creating 10 different training-testing pairings, referred to as 10-fold cross-validation. For each feature combination, across the 10 folds of second-level training data, the features were z-scored and regressed as X parameters against Y = 1 for MCF7 cells and Y = 0 for WBCs. Cohen's d was used as the optimization parameter and was computed on the paired second-level testing subgroup. The computed values of Cohen's d were averaged across the 10 folds. The feature combination that produced the highest value of Cohen's d in the 10-fold average was selected.
Using all data in the training subset, regression was then performed with the z-scored feature combination found through feature selection as X parameters and Y = 1 for MCF7 cells and Y = 0 for WBCs, producing equations for Reg1-11.
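The selection procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Matlab implementation: the 10 second-level pairings are read as 10 random splits into roughly equal halves, the regression is ordinary least squares solved through the normal equations, and the data are synthetic. All function names are hypothetical.

```python
import math
import random
import statistics
from itertools import combinations


def cohens_d(a, b):
    """Unsigned Cohen's d with the pooled (n - 1 weighted) standard deviation."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return abs(statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled)


def fit_linear(X, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    M = [[sum(r[i] * r[j] for r in A) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(p):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][p] for i in range(p)]


def score_split(X, y, combo, train, test):
    """z-score with training statistics, regress Y on X, return Cohen's d on the test half."""
    cols = [[X[i][c] for i in train] for c in combo]
    mus = [statistics.fmean(col) for col in cols]
    sds = [statistics.pstdev(col) for col in cols]
    z = lambda i: [(X[i][c] - m) / s for c, m, s in zip(combo, mus, sds)]
    beta = fit_linear([z(i) for i in train], [y[i] for i in train])
    pred = lambda i: beta[0] + sum(b * v for b, v in zip(beta[1:], z(i)))
    pos = [pred(i) for i in test if y[i] == 1]
    neg = [pred(i) for i in test if y[i] == 0]
    return cohens_d(pos, neg)


def select_features(X, y, n_features, repeats=10, seed=0):
    """Return the feature combination with the highest mean test Cohen's d."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    splits = []
    for _ in range(repeats):  # assumed: random half/half training-testing pairings
        rng.shuffle(idx)
        half = len(idx) // 2
        splits.append((idx[:half], idx[half:]))
    best_d, best_combo = -1.0, None
    for k in range(1, n_features + 1):
        for combo in combinations(range(n_features), k):
            mean_d = statistics.fmean([score_split(X, y, combo, tr, te) for tr, te in splits])
            if mean_d > best_d:
                best_d, best_combo = mean_d, combo
    return best_combo, best_d


# Synthetic data: column 0 separates the classes, column 1 is pure noise.
rng = random.Random(1)
y = [0] * 60 + [1] * 60
X = [[rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]
combo, d = select_features(X, y, n_features=2)
print(combo, round(d, 2))
```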
To address the generalizability of this process, I compare the performance of a set of Reg1-11 trained on the whole data set in appendix 4 and on just one day in appendix 5.
4.3 Results
Figure 12 shows histograms for all 16 individual features (4 spatial metrics for lipids, DNA, CD45, and panCK) of pure samples and mixed samples. The mixed-sample histograms are used only as a qualitative control to confirm that the modality seen in the pure samples is real and not an artifact of varying labeling conditions or acquisition settings. The mixed samples generally follow the linear combination of the profiles of the pure samples, with bimodality in all the distributions. The total intensity of CD45 and panCK over the regions of interest, denoted by Σ, is consistent with what one would expect, with panCK intensity being greater for MCF7 cells and CD45 intensity greater for WBCs.
Using a t-test and a Mann-Whitney U test, I checked whether the mean positions of the WBC and MCF7 distributions were statistically different for each of the 16 features. Using both tests I found p < 0.01 for all of the features, indicating statistically significant differences between WBCs and MCF7 cells for all 16 features. Thus my stated hypothesis (H.2) that the MCF7 cells have more lipid content than WBCs is supported.
Summaries of the distributions of AUC and Cohen's d computed on the 48 training subsets are shown in Figure 13 and generally follow each other. ROC curves for these training subsets were created and are summarized in Figure 14 and Figure 15. The shape of the ROC on the z-scored axes of Figure 14 indicates that none of the biomarkers have an underlying normal distribution [59]. Since LRt+ and Ndet are linearly related, Figure 15 shows that there are cutoff positions on each biomarker maximizing Ndet. The distributions of the cutoff positions maximizing Ndet are shown as box plots on top of each feature histogram of Figure 12. Generally these cutoff points are far from the center of the WBC population to reduce false positives. For each testing subset, sensitivity, specificity, and Ndet were computed at the trained cutoff position. The distributions of these performance statistics for the 48 training-testing pairs are shown in Figure 13.
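A cutoff search of this kind can be sketched in Python. The definition of Ndet used here is an assumption made for illustration: the number of disease-negative cells per disease-positive cell at which the positive predictive value reaches a target of 2/3, with a false positive rate of zero rounded up to the inverse of the disease-negative sample size. The sketch also assumes that larger feature values indicate the D+ population; the function names and data are hypothetical.

```python
def ndet(sensitivity, fpr, n_negative, pv_target=2 / 3):
    """Assumed definition: D- cells per D+ cell at which PV+ reaches pv_target."""
    fpr = max(fpr, 1.0 / n_negative)  # round a zero false positive rate up to 1/N
    return (sensitivity / fpr) * (1 - pv_target) / pv_target


def best_cutoff(pos, neg):
    """Scan candidate cutoffs; values above the cutoff are called test positive."""
    best = None
    for cut in sorted(set(pos) | set(neg)):
        sens = sum(v > cut for v in pos) / len(pos)
        fpr = sum(v > cut for v in neg) / len(neg)
        score = ndet(sens, fpr, len(neg))
        if best is None or score > best["ndet"]:
            best = {"cutoff": cut, "sensitivity": sens,
                    "specificity": 1 - fpr, "ndet": score}
    return best


# Hypothetical training values for one feature (arbitrary units).
wbc = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
mcf7 = [7, 9, 11, 12, 13, 14, 15, 16, 17, 18]
print(best_cutoff(mcf7, wbc))
```

In this toy example the selected cutoff leaves exactly one false positive: because of the floor on the false positive rate, moving the cutoff further out only loses sensitivity.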
The mean tested values of Ndet for the 16 individual features were compared. Amongst these 16 mean values the one from Bodipy Σ, Ndet = 203, was the highest. The highest mean values of Ndet for the DAPI and PanCK channels were from the Σ metric and were found to be 34 and 75 respectively, while for CD45 the highest value was from the metric at 37. Looking at the maximum instead of mean values of Ndet across the 16 features, I found Bodipy Σ to produce the greatest value at 1026. DAPI and PanCK also had their maximum values of Ndet in their Σ metric, at 171 and 484 respectively, while the maximum value of Ndet for CD45 was found in the feature at 132.
Analysis of variance (ANOVA) was used to determine if the performance statistics of AUC, Cohen's d, sensitivity, specificity, and Ndet, shown in Figure 13, were dominated by between-day variances or within-day variances. For nearly all the performance statistics for all individual features, ANOVA results were found to be significant (p < .05) for between-day variances dominating within-day variances. The outliers were the sensitivity of the feature DAPI and specificity of CD45 Σ and PanCK Σ, which were not found to be statistically different between the days. I also tested variances in the regression results (Figure 17) using ANOVA. These were generally found to be significant (p < .05) for day-to-day variances. The exceptions were specificity and Ndet, which were not dominated by between-day differences for Reg1-4.
I looked at the stability of the absolute positions of the D+ and D- populations for each feature to see if the variance between days was greater than the variance within a day. I quantified the within-day variability as a measure of technical variability in the biomarkers. For the D- population I found between-day variance to exceed within-day variance for all features except CD45 , DAPI , DAPI , PanCK , and PanCK Σ. The first 4 features had low within-day variability, indicating these features are stable within and between days, while PanCK Σ had high within-day variability. More variation is generally seen in the measurements of the MCF7 population. For this population only Bodipy , CD45 , CD45 , DAPI , and PanCK . For CD45 and DAPI saw between day variation exceed within day variation in absolute position. Further description of methods and the results of these comparisons can be found in Appendix 1.
Absolute position could be controlled for with references, similar to what is employed in aneuploidy studies in flow cytometry [106], by applying a different offset and scale factor to each day's data. I applied an offset-plus-scaling transformation to each feature on a per-day basis to make the absolute positions of WBC and MCF7 the same. As expected, applying this control makes the within-day variation in absolute position greater than the between-day variation in absolute position. Importantly, such a coordinate system transformation will not change the shape of the ROC or the calculated performance statistics of AUC, sensitivity, specificity, and Ndet on each day's data, or the between-day variation in performance.
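The invariance claim can be checked with a small Python sketch. The particular alignment used here (mapping the D- mean to 0 and the D+ mean to 1) is an assumed example of an offset-plus-scaling transformation, and the data are hypothetical; AUC is estimated as the fraction of D+/D- pairs ranked correctly.

```python
import statistics


def auc(pos, neg):
    """Probability that a D+ value exceeds a D- value (ties count one half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def align_day(pos, neg):
    """Offset and scale one day's data so the D- mean maps to 0 and the D+ mean to 1.

    Assumes the D+ mean is above the D- mean, so the scale factor is positive
    and the ordering of values is preserved.
    """
    offset = statistics.fmean(neg)
    scale = 1.0 / (statistics.fmean(pos) - offset)
    return [scale * (v - offset) for v in pos], [scale * (v - offset) for v in neg]


# Hypothetical feature values for one day (arbitrary units).
wbc = [10.0, 12.0, 11.0, 15.0, 13.0]
mcf7 = [14.0, 18.0, 20.0, 17.0, 12.5]
mcf7_a, wbc_a = align_day(mcf7, wbc)
print(auc(mcf7, wbc), auc(mcf7_a, wbc_a))
```

Because the transformation is monotonic, every pairwise ordering is unchanged, so any statistic that depends only on the ranking of events at a given operating point is unchanged as well.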
I used feature selection and regression to produce combinations of features that would maximize Ndet. I performed these regressions using all 16 features (Reg1_all), with just the sum signal features (Reg2_Σ), and with different combinations of labels as denoted in their subscripts. DAPI (a specific nuclear stain) was included in all 2-feature regression combinations since it was used to determine the region of interest for finding nucleated cells. Histograms of the regressions for pure and mixed samples are shown in Figure 16, with performance depicted in Figures 17, 18, and 19.
I performed Kruskal-Wallis tests with all 6 days of data pooled to determine if AUC, Cohen's d, sensitivity, specificity, or Ndet for different regressions (Reg1-Reg4) gave statistically different results. For these regressions, AUC was found to have p = .06 and Cohen's d showed significant differences (p < .05), while for the performance parameters at the operating point (sensitivity, specificity, and Ndet) I did not find a significant difference (p > .05). Thus my hypothesis (H.1) that the addition of lipid content produces a biomarker (Reg4) with better identification performance than CK+/CD45- (Reg3) is not supported; these regressions perform similarly. Post hoc comparisons found Cohen's d of Reg1 to be significantly greater than that of Reg3 (p < .05). Also, the 3-channel panel of Bodipy, DNA, and CD45 (Reg4), which could be made compatible with live cell imaging by substituting Hoechst for DAPI, performs similarly (p > .05) to the CK+/CD45- panel for all 5 performance statistics.
Examining my hypothesis (H.3) that spatial metrics produce a regression (Reg1_all) that outperforms just total signal metrics (Reg2_Σ), I found AUC and Cohen's d to be significantly greater for Reg1_all compared to Reg2_Σ (p < .05) but no statistical difference in sensitivity, specificity, and Ndet (p > .05). Thus spatial metrics did produce a better overall biomarker, as supported by greater AUC and Cohen's d, but not a biomarker with a better performing operating point.
I went on to use the Kruskal-Wallis test to ask if regressions of 2-channel and 1-channel features performed similarly. For the 2-channel regressions Reg5-7, I found AUC, Cohen's d, sensitivity, and Ndet all to be significantly different between Reg5-7 (p < .05). Post hoc comparisons found that Reg5_DAPI+CD45 outperforms Reg6_DAPI+PanCK, with no statistical difference found between Reg5 or Reg6 and Reg7_DAPI+Bodipy. With the single-channel metrics (Reg8-11) I had a similar result, finding that AUC, Cohen's d, sensitivity, specificity, and Ndet were statistically different (p < .05) between these 4 regressions. Post hoc analysis found Ndet for Reg9_Bodipy was significantly greater than Ndet of Reg10_CD45 and Reg11_PanCK.
Since the performance spread of the measured biomarkers was generally dominated by between-day differences, I looked at my hypotheses tested on a single day's data rather than with all 6 days pooled to see if the results were consistent. Ndet and specificity were consistent and not found to be statistically different between Reg1-4 tested on all 6 days individually. Sensitivities were found to be consistent with the exception of day 12, where Reg1 was found to be significantly greater (p < .05) than Reg3, and day 15, where Reg1 was found to be significantly greater than Reg2. Cohen's d was found to be significantly different between Reg1-4 on days 10, 12, 14, and 15. AUC was significantly different between Reg1-4 only on days 12 and 15. Although I found performance to be dominated by between-day differences, the tested results of my hypotheses generally do not vary by day.
To determine if the similar performance of Reg1-4 was due to the finite number of ROIs in the testing subset, I computed a minimum possible false positive rate for each of the testing subsets as the inverse of the number of ROIs in the disease-negative test fraction of that subset. I used a Wilcoxon rank-sum test to compare the calculated minimum false positive rates to the false positive rates of Reg1 and found the false positive rates from Reg1 to be significantly (p < .05) greater. This indicates the tested false positive rates were not driven by the number of samples in the disease-negative test fraction.
This was not the case for the training subset. The cutoff position (operating point) between the T+ and T- fractions of the regressions was set using the training subsets. The optimal false positive rates found in training were not statistically different from the false positive rate calculated as the inverse of the disease-negative training sample size. For Reg1-4, the trained cutoff position (operating point) was most often set to the minimum false positive rate on the ROC that was greater than 0. Upon testing this cutoff on the larger testing subset, one or more false positives were often found. Even if the false positive rate is sample size limited, biomarkers with greater separation between the D+ and D- fractions should produce greater sensitivity and better Ndet. I believe the similar sensitivity, specificity, and Ndet for Reg1-4 are driven by the similar separations these regressions produced between the D+ and D- populations, even though these separations (AUC and Cohen's d) were statistically different.
In the previously mentioned results, I generated Reg1-11 for each of the training subsets and thus have 48 versions of them, which were applied to the paired testing subsets. To speak to the generalizability of the method, I looked at whether the results varied if I applied the same version of Reg1-11 to all the training-testing subsets. I generated Reg1-11 on day 9 data and applied these regressions to the 48 training-testing subsets. Results and tested hypotheses were similar, with Reg1-4 being statistically similar and performance similar to using the 48 different versions (Appendix E). I generated Reg1-11 using all days of data, applied these regressions to the 48 training-testing subsets, and also found similar results (Appendix F). This indicates that training Reg1-11 from one day's data, the whole dataset, or individual training-testing pairings produced similar results. This speaks to the generalizability of the regression method.
The prior analysis was done by creating training-testing pairings from different physical samples between days. I looked at how the results generalized by pooling all of my D+ and D- data and randomly segmenting it into 10 training-testing samples for 100 different possible pairings. This resampling increased the average number of cells in the D- training subsets to 4077, compared to a mean value of 1539 in the main analysis, and the average number of cells in testing to 22165, from 2870 in the main analysis. The bootstrap analysis on the individual features found slightly lower mean values of the performance statistics with about 10 times narrower standard deviations. For the regressions, Cohen's d and AUC also have slightly lower mean values with the bootstrap analysis and the same notably lower standard deviation. At the operating point, for Reg1-4, Ndet is found to be about twice the value in my main analysis with smaller standard deviation, while for Reg5-11, Ndet is slightly lower. Comparison of Figures 12-14 and Figures 16-18 generated with this bootstrap data can be found in Appendix C.
I looked at whether using manual debris exclusion improved the performance of the individual features. Applying the same Reg1-11 on the unexcluded fields actually found a mean value of 546 for Reg1 with exclusion, and lower values for Reg2 and some other regressions. This indicates the exclusion did not contribute significantly to performance. Comparative figures are in Appendix D.
Overall, the Kruskal-Wallis tests show that 3- and 4-channel regressions outperform 2-channel regressions, indicating that multiple channels assist in better isolating the rare cell types.
Variations were found in the expression of pancytokeratin in MCF7 cells which appears to depend on their confluency in cell culture. Dropping this data from analysis would likely further increase the separation produced between MCF7 and WBCs but doing this does not represent the true experimental variability measured.
[Figure 12 image: histogram panels for DNA (DAPI), lipids (Bodipy), PanCK, and CD45 features; axis units dBct and µm.]
Figure 12: Histograms of image cytometry features computed on all 4 channels. Blue shows WBC-only samples (D-) composed of 24,699 objects; red shows MCF7-only samples (D+) composed of 41,091 objects used in the analysis. The black dashed line (MCF7 + WBC ~1:1 mixed samples containing 33,726 objects) is a qualitative control reproducing the modality of the pure samples. dBct = 10*log10(counts). Box plots of the distribution of cutoff positions maximizing Ndet across the 48 training subsets are shown on top of each of the histograms.
Figure 13: AUC and Cohen's d, performance characteristics of the features, are shown for the 48 training subsets. An operating point maximizing Ndet was found on each training subset (thresholds plotted in Figure 12). The remaining data from that day was used as a testing subset to compute the shown sensitivity, specificity, and minimum detectable thresholds at the operating point. Occurrences of false positive rates of zero on the testing data were found and summed in the bottom panel.
[Figure 14 image: ROC panels on z-scored axes (Z) plotted against 1-Spec.; legend: day 9, day 10, day 12, day 13, day 14, day 15.]
Figure 14: ROC curves for the training subsets averaged over each day. The logarithmic (z-scored) sensitivity and specificity axes used show straight lines when the D+ and D- distributions are Gaussian. Average values of sensitivity and specificity maximizing Ndet are shown as + symbols and are generally to the left of the observed inflection points.
[Figure 15 image: logarithmic axes; recovered panel titles include Bodipy Σ, Bodipy r_f, Bodipy r, and Bodipy M; recovered legend entries include day 14 and day 15.]
Figure 15: ROC on test positive likelihood ratio LRt+ and test negative likelihood ratio LRt- shows there are maxima in LRt+ for all features. If any of these biomarkers were used for enrichment, this cutoff would be the optimal position for the transition between T+ and T- for the enrichment.
Figure 16: Histograms of testing subsets of WBCs (blue traces) and MCF7 (red traces). The 1:1 mixed populations (black line) are a qualitative control showing the regressed modality is real. Testing data was not used to train the regressions and was naive to the regressions, which were computed on the 48 training subsets. For each regression, an operating point maximizing Ndet was found on the training subset, producing the threshold positions shown as box plots on top of the histograms. Regressions are z-scored and thus centered near 0.
Figure 17: Performance of the regression combinations over the 48 testing subsets, shown as box plots. Above-line performance statistics, AUC and Cohen's d, characterize the separation produced between the biomarkers. Below-line performance statistics depend on the operating point. Occurrences where a testing subset had a false positive rate of 0 were summed. For Reg1-4, below-line performance statistics are similar while above-line statistics are different.
[Figure 18 image: ROC panels on z-scored axes (Z) plotted against 1-Spec.; legend: day 9, day 10, day 12, day 13, day 14, day 15.]
Figure 18: Receiver operating characteristics of the regressions computed on the testing subsets, averaged over each day. Average positions of operating points maximizing Ndet are shown as + symbols. As more features are included in the regressions, performance and stability are seen to improve. Regressed distributions are not Gaussian, as indicated by the shape on the z-scored sensitivity-specificity axes.
[Figure 19 image: likelihood-ratio (LR) panels; legend: day 9, day 10, day 12, day 13, day 14, day 15.]
Figure 19: Regression on test positive likelihood ratio (LRt+) and test negative likelihood ratio (LRt-). LRt+ is linearly related to detectable rarity Ndet. This figure shows that there are maxima in LRt+, and thus Ndet, for all regression combinations, and that the profiles of these ROC are not those of D+ and D- drawn from normal distributions, as shown in Figure 18.
4.4 Discussion
Recently, Lannin et al. [87] performed a comparison of Bayesian, k-nearest neighbors, support vector machine, and random forest classifiers that included data in a WBC-cell line model system similar to mine. The ROC presented in ref. [87] can be compared against the ROC I present (Figure 18). Using the same metric of AUC up to a false positive rate of .05, Reg1-4 all had a median value of 0.049, which is on par with the best value of 0.048 found for random forests in the model system of [87]. Visual quantification of AUC up to a false positive rate of .05 can be found in Appendix G. I conclude that my simple feature-selected regression is performing as well, and that combining different labels through methods like multivariate regression is therefore necessary if one wants to detect even rarer cells. My regression method also shows which features have the greatest ability to distinguish populations. Examples of the forms of the regressed equations are in Appendix H and I.
I have looked at using principal component analysis (PCA) and linear discriminant analysis (LDA) to create statistically uncorrelated basis functions prior to regression. I have found the performance to be identical to the direct regression technique used here, as predicted by the Frisch-Waugh-Lovell theorem [71]. Methods such as these are needed in cases where the number of samples is less than the number of variables to be regressed, which is not true in the case of this data set.
Ndet is a quantity that allows for optimizing the detection level of continuous biomarker panels. Under the scenario of perfect sensitivity, with a desired PV+ of 66%, the detectable rarity scales as half the inverse of the false positive rate. This seems reasonable, as the false positive rate sets a limit on the rarity of event that can be detected, similar to how noise in a circuit sets a limit on the power of a signal that can be detected. However, what should one do when the false positive rate is calculated to be 0, making Ndet undefined and the positive predictive value perfect? In this case, I choose to round the false positive rate up to the inverse of the disease-negative sample size. This occurred in 11 of the 48 tests of my best regressions. Thus I could improve the 1 in 480 mean value of Ndet by increasing the number of cells in the disease-negative dataset, and Ndet is connected to sample size. This point is also illustrated in my bootstrap analysis, which allowed me to use more cells in training, leading to ~1.5x better Ndet performance for regressions Reg1-4.
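The scaling between Ndet and the false positive rate can be made explicit with a short derivation. This is a sketch under an assumed definition, namely that N is the number of disease-negative cells per disease-positive cell; the symbols Se (sensitivity) and FPR (false positive rate) are introduced here for illustration.

```latex
\begin{align}
% Positive predictive value for N disease-negative cells per disease-positive cell:
\mathrm{PV}_+ &= \frac{Se}{Se + FPR \cdot N} \\
% Solving for N at the desired PV+ gives the detectable rarity:
N_{det} &= \frac{Se}{FPR}\,\frac{1 - \mathrm{PV}_+}{\mathrm{PV}_+}
         = LR_{T+}\,\frac{1 - \mathrm{PV}_+}{\mathrm{PV}_+} \\
% With PV+ = 2/3 and perfect sensitivity (Se = 1):
N_{det} &= \frac{LR_{T+}}{2} = \frac{1}{2\,FPR}
\end{align}
```

Under this assumed definition, a false positive rate of 1 in 1044 gives Ndet = 522 at perfect sensitivity, and an Ndet of 480 would correspond to a sensitivity of about 0.92.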
An Ndet value of 480 corresponds to a false positive rate of 1 in 1044. Enrichment may be sufficient to bring the needed prevalence into range. Interestingly, I have found a clinical trial of CTC identification methods reporting efficacy with false positive rates between 1 in 800 and 1 in 1600 [107].
I found between-day differences in DAPI E for WBCs, which I expect to be stable between patients based on previously published aneuploidy studies [108]. Thus, although I see between-day differences, particularly in WBCs, I believe these differences are largely technical and can be reduced, rather than being genuine biological variation between patient WBCs.
Use of downstream molecular analysis for final CTC identification with single cell polymerase chain reaction (PCR) is of particular interest [24]. I am interested in applying further molecular analysis to CTC samples, including in situ hybridization [109] and viral reporters.
I hypothesized that use of spatial features would aid in separating these populations. As the number of channels was increased, feature selection tended towards regressions that included just total signal measurements. The spatial features were seen more often in the two and one channel regressions. At this time, I cannot conclude that the use of spatial features adds a performance advantage in this 3 or 4 channel assay.
It also appears that the performance of the biomarker panel of DAPI, CD45, and Bodipy is equivalent to the standard CK+/CD45- panel. This result is interesting, as labeling for DNA, lipids, and CD45 could be performed on live cells by substituting Hoechst for the DAPI label.
Label-free specific optical contrast for lipids with coherent anti-Stokes Raman scattering (CARS) microscopy has previously been performed on CK+/CD45- objects, finding that the CK+/CD45- objects have a 7-fold higher average pixel CARS intensity than leukocytes (CK-/CD45+) [30]. This result is close to the 7.9-fold difference I measured in Bodipy E intensity in this work. Additionally, the ROCs shown in Figures 14 and 15 are invariant to the choice of units and linearity of the underlying variables, as are the performance parameters summarized in Figure 13. These figures indicate the performance of lipid content alone is similar to the other channels.
Although label-free, the viability of cells imaged for lipids with CARS compared to those imaged for lipids with Bodipy is unclear. Although the higher laser powers needed for CARS may be an issue for viability, the longer wavelengths used also enable deeper imaging through scattering media such as tissue. Interestingly, [110] have taken advantage of this to perform intravital CARS imaging of cancer cells through the vein of a mouse ear. This could be an innovative noninvasive method for enumeration of CTCs but appears difficult to apply clinically.
CHAPTER V
CONCLUSIONS
The methods presented in this dissertation define the performance of single and multistage tests for identification of rare events, such as circulating tumor cells, using statistical and probabilistic analysis. I present theory that shows how biomarker sensitivity and specificity set the rarity of cell the biomarker can identify. I present theory showing how sensitivity, specificity, and detectable rarity scale when multiple test stages are employed. I show that there can be a maximum for detectable rarity on a biomarker depending on the distribution of the disease positive and disease negative populations on that biomarker. I show that detectable rarity can be used to choose operating points on the ROC optimized to detect the rarest possible cells. At these operating points I experimentally quantify sensitivity, specificity, and computed rarity over measured experimental variability. I show that there are maximum values for the rarity of cell a biomarker can detect in all of my experimental data.
There are a few results that consider statistical requirements in circulating tumor cell detection [55], [56], [111], and more theoretical analysis is needed. Poisson statistics can be used to calculate the number of cells that need to be sampled. These calculations are done under the assumption that the identifying test is perfect; thus the calculation only involves adequate sampling of a presumed known prevalence. The theoretical analysis presented in this work describes how sensitivity and specificity set the prevalence (rarity) of cell a test can detect and how these parameters scale in multistage assays. For a given test, this is the prevalence that should be input into the previously mentioned theory [55], [56] to determine how many cells should be sampled.
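The Poisson sampling calculation referred to above can be sketched as follows. This is a minimal sketch under the stated assumption of a perfect identifying test and a known prevalence. The function names and the default 95% confidence level are hypothetical choices for illustration and are not taken from [55] or [56].

```python
import math


def prob_at_least_one(n_cells, prevalence):
    """Poisson probability of sampling at least one target cell."""
    return 1.0 - math.exp(-n_cells * prevalence)


def cells_to_sample(prevalence, confidence=0.95):
    """Smallest number of cells giving at least `confidence` probability
    of sampling one or more target cells at the given prevalence."""
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    # Solve 1 - exp(-n * prevalence) >= confidence for n.
    return math.ceil(-math.log(1.0 - confidence) / prevalence)


if __name__ == "__main__":
    # Prevalence set by a test's detectable rarity, e.g. 1 in 480.
    print(cells_to_sample(1.0 / 480.0))
    # A 1 in a million target needs roughly three million cells.
    print(cells_to_sample(1e-6))
```

The design point is that the prevalence argument should be the rarity the test can actually detect, not the rarity one hopes to find.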
In experimental results, I have found statistically significant differences between WBCs and MCF7s for all biomarkers measured, indicating that my hypothesis (H.2) that MCF7s have more lipids than WBCs is supported. My hypothesis (H.1) that a DNA/lipids/CD45 panel could outperform the classic DNA/CD45/CK panel is not supported. These panels performed similarly, which is interesting since DNA/lipids/CD45 could be made compatible with live cell imaging by use of a Hoechst DNA label. My hypothesis (H.3) that computing spatial features would lead to increased performance over using total content features was also not supported (P > .05). In 3 and 4 channel assays the total content features led to similar separations. The spatial features were more prominently included in regressions of 1 or 2 channels.
Under the assumption that the D+ and D- distributions are Gaussian, I have looked at the separation of the cell populations using Cohen’s d and found that ~7 standard deviations of separation should achieve an ROC curve with an ability to detect cells at a rarity of 1 in 1 million [112]. However, experimentally, the regression that produced greater than 7 standard deviations of separation (measured by Cohen’s d) only achieved a mean 1 in 480 detection performance. My experimental ROC curves show that none of the features or regressions are normally distributed [59]. Thus theory that models the separation between populations with just second order properties, such as Cohen’s d, and uses this to estimate the ROC and thus sensitivity and specificity is useful but inadequate [112]. Still, Cohen’s d does describe the separation between populations, similar to AUC, while being less computationally expensive. I used Cohen’s d as the optimization parameter in feature selection. I also experimented with using Ndet and AUC as the optimization parameter. I found Cohen’s d to be less computationally expensive than AUC and Ndet, since Cohen’s d does not require computing an ROC.
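To make the computational point concrete, the sketch below computes Cohen’s d directly from two samples, with no ROC required, and then shows the sensitivity and false positive rate that a Gaussian model would predict for a given separation and cutoff. It is a minimal sketch: the pooled standard deviation form of Cohen’s d and the unit-variance Gaussian model are assumptions, and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def cohens_d(pos, neg):
    """Cohen's d using the pooled sample standard deviation (assumed form)."""
    n_pos, n_neg = len(pos), len(neg)
    pooled_var = ((n_pos - 1) * variance(pos) + (n_neg - 1) * variance(neg)) / (
        n_pos + n_neg - 2
    )
    return (mean(pos) - mean(neg)) / math.sqrt(pooled_var)


def gaussian_operating_point(d, cutoff):
    """Sensitivity and false positive rate predicted by a Gaussian model.

    Assumes D- ~ N(0, 1) and D+ ~ N(d, 1), with a cell called positive
    when its value is at or above `cutoff`.
    """
    unit = NormalDist()
    sensitivity = 1.0 - unit.cdf(cutoff - d)
    fpr = 1.0 - unit.cdf(cutoff)
    return sensitivity, fpr


if __name__ == "__main__":
    print(cohens_d([2, 4, 6], [1, 3, 5]))      # toy samples (hypothetical)
    print(gaussian_operating_point(7.0, 3.5))  # cutoff midway between the means
```

Because the Gaussian prediction depends only on means and variances, it can be far more optimistic than a measured ROC when the real distributions have heavy tails, which matches the gap reported above between theory and experiment.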
With an enrichment step that produces an Ndet of ~1 in 1500, I would be able to detect a 1 in a million object with my experimental image cytometry biomarker panel, which has a mean Ndet of 1 in 480, provided the enrichment biomarker is statistically independent of that panel. However, enrichment biomarkers of CD45 and epithelial markers such as CK are already included in the image cytometry panel. Thus, by my theoretical findings, if these biomarkers are used in enrichment, the overall improvement in detectable rarity will be lower. If biomarkers such as CD45 or EpCAM are to be used in enrichment, detecting ever rarer cells requires using different biomarkers for the cytometry stage. However, they should not necessarily be excluded from the stage II image cytometry panel either. Imaging the enrichment biomarker in follow-up cytometry may still be important for quality control of the enrichment step. Still, discovery of ever better biomarkers for cancer detection in both enrichment and cytometry stages is critical.
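The effect of statistical independence on a two stage test can be sketched numerically. The sketch below is illustrative only and rests on two assumptions that are not restated in this chapter: a cell counts as test positive only if it passes both stages, and the stages are independent within each population, so that sensitivities and false positive rates multiply. The rarity helper uses the same assumed form, sensitivity × (1 − PV+) / (PV+ × false positive rate), with PV+ = 2/3. All names and stage values are hypothetical.

```python
def combine_independent_stages(stage1, stage2):
    """Overall (sensitivity, FPR) when a cell must pass both stages and the
    stages are statistically independent within each population."""
    sens1, fpr1 = stage1
    sens2, fpr2 = stage2
    return sens1 * sens2, fpr1 * fpr2


def detectable_rarity(sensitivity, fpr, ppv=2.0 / 3.0):
    """Assumed form of detectable rarity at a desired PV+ (see lead-in)."""
    return sensitivity * (1.0 - ppv) / (ppv * fpr)


if __name__ == "__main__":
    enrichment = (0.9, 0.001)  # hypothetical (sensitivity, FPR) of stage I
    cytometry = (0.8, 0.01)    # hypothetical (sensitivity, FPR) of stage II
    sens, fpr = combine_independent_stages(enrichment, cytometry)
    print(sens, fpr, detectable_rarity(sens, fpr))
```

When the two stages share a biomarker, the false positive rates no longer multiply in this way, since cells that pass stage I are more likely to pass stage II. This is why reusing CD45 or CK in both stages yields a smaller gain than the independent case computed here.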
The extent to which MCF7 cells model the distribution of true CTCs is unknown and presents an issue with calculating the true sensitivity of a biomarker. Fortunately, the choice and diversity of the D+ model has no effect on the specificity/false positive rate of the assay. In light of these facts, one can initially rely on measuring the false positive rate of an assay to determine if the assay is detecting anything. To show that detection has occurred in an unknown patient sample using a previously unestablished assay, one can show that the test positive rate in unknown patient samples is statistically greater than the false positive rate of the assay. Demonstrating that the test positive rate is greater than the false positive rate in a case-control study was done by investigators using CellSearch [51] before extensive prospective cohort trials [1]. Issues in study design are further discussed in [113]. Ultimately, connection of CTCs to the underlying disease is demonstrated through cohort trials where patients with greater than X CTCs are shown to have different outcomes than those without. X is often a parameter to be determined by the trial.
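One simple way to check whether a test positive rate exceeds a false positive rate is a one-sided exact binomial test, sketched below. This is a minimal sketch and not the analysis used in [51]: it treats the false positive rate as known exactly, whereas in practice it is itself estimated from control samples, and the function name and example counts are hypothetical.

```python
import math


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value for seeing
    k or more test positives among n cells if only false positives occur."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    below = 0.0
    for i in range(k):
        # Log of the binomial probability mass, computed with lgamma to
        # avoid overflow in the binomial coefficient for large n.
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        below += math.exp(log_pmf)
    # Subtraction loses precision for very small tails; clamp at zero.
    return max(0.0, 1.0 - below)


if __name__ == "__main__":
    # Hypothetical: 15 test positives among 5000 cells, FPR of 1 in 1044.
    print(binomial_upper_tail(15, 5000, 1.0 / 1044.0))
```

A small p-value indicates more test positives than the false positive rate alone would explain, which is the sense in which the assay can be said to be detecting something.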
5.1 Future Directions
Future interests include the application of spontaneous Raman microscopy to the analysis of lipid in CTCs. Previously, we have used this label-free technique to quantify changes in lipid content in breast and prostate cancer cell lines [6]. Characterization of CTCs with spectroscopy techniques such as Raman scattering or fluorescence lifetime imaging (FLIM) may further distinguish metabolic profiles of the solid tumor, which may be correlated with outcomes.
A low cost, open source platform that can prepare cells for immunofluorescence microscopy and image them would be ideal for evaluating the thousands of commercially available antibodies and labels for use in CTC identification. Such a system would be useful for assessing the sensitivity and specificity of labels, and would have clinical applicability in performing the final identification after enrichment. Biomarkers with the greatest sensitivity and specificity, evaluated in model systems, would most merit follow-up examination in clinical trials. As described in this paper, better performing panels will enable ever rarer CTCs to be detected in the peripheral blood of cancer patients. These CTCs may be the biomarkers needed for better early detection and monitoring of cancer. It would be very interesting to engineer such a system.
One next endeavor is to experimentally measure the enriched ROC for the biomarker panel of DAPI, PanCK, CD45 & lipids in samples of peripheral blood taken from a patient.
We are interested in using CD45 depletion to make the prevalence high enough to test out different biomarkers. Testing new biomarkers from a depleted population using image cytometry would expand on the experimental and theoretical analysis of this work to optimize CTC testing. Future work on engineering better methods to perform labeling, minimize cell loss, and increase imaging throughput will also be critical for moving these techniques forward.
REFERENCES
[1] W. J. Allard et al., “Tumor Cells Circulate in the Peripheral Blood of All Major Carcinomas but not in Healthy Subjects or Patients With Nonmalignant Diseases,” Clin Cancer Res, vol. 10, no. 20, pp. 6897-6904, Oct. 2004.
[2] C. A. Parkinson et al., “Exploratory Analysis of TP53 Mutations in Circulating Tumour DNA as Biomarkers of Treatment Response for Patients with Relapsed High-Grade Serous Ovarian Carcinoma: A Retrospective Study,” PLOS Medicine, vol. 13, no. 12, p. e1002198, Dec. 2016.
[3] T. Van Gorp et al., “HE4 and CA125 as a diagnostic test in ovarian cancer: prospective validation of the Risk of Ovarian Malignancy Algorithm,” Br J Cancer, vol. 104, no. 5, pp. 863-870, Mar. 2011.
[4] I. M. Thompson et al., “Operating Characteristics of Prostate-Specific Antigen in Men With an Initial PSA Level of 3.0 ng/mL or Lower,” JAMA, vol. 294, no. 1, pp. 66-70, Jul. 2005.
[5] D. Tarin, J. E. Price, M. G. W. Kettlewell, R. G. Souter, A. C. R. Vass, and B. Crossley, “Mechanisms of Human Tumor Metastasis Studied in Patients with Peritoneovenous Shunts,” Cancer Res, vol. 44, no. 8, pp. 3584-3592, Aug. 1984.
[6] M. C. Potcoava, G. L. Futia, J. Aughenbaugh, I. R. Schlaepfer, and E. A. Gibson, “Raman and coherent anti-Stokes Raman scattering microscopy studies of changes in lipid content and composition in hormone-treated breast and prostate cancer cells,” J. Biomed. Opt, vol. 19, no. 11, pp. 111605-111605, 2014.
[7] A. Casartelli et al., “A cell-based approach for the early assessment of the phospholipidogenic potential in pharmaceutical research and drug development,” Cell Biology and Toxicology, vol. 19, no. 3, pp. 161-176, 2003.
[8] S. Dawood and M. Cristofanilli, “Integrating Circulating Tumor Cell Assays into the Management of Breast Cancer,” Curr. Treat. Options in Oncol., vol. 8, no. 1, pp. 89-95, Feb. 2007.
[9] S. J. Cohen et al., “Isolation and Characterization of Circulating Tumor Cells in Patients with Metastatic Colorectal Cancer,” Clinical Colorectal Cancer, vol. 6, no. 2, pp. 125-132, Jul. 2006.
[10] S. J. Cohen et al., “Prognostic significance of circulating tumor cells in patients with metastatic colorectal cancer,” Ann Oncol, vol. 20, no. 7, pp. 1223-1229, Jul. 2009.
[11] D. C. Danila et al., “Circulating Tumor Cell Number and Prognosis in Progressive Castration-Resistant Prostate Cancer,” Clin Cancer Res, vol. 13, no. 23, pp. 7053-7058, Dec. 2007.
[12] O. B. Goodman et al., “Circulating Tumor Cells in Patients with Castration-Resistant Prostate Cancer Baseline Values and Correlation with Prognostic Factors,” Cancer Epidemiol Biomarkers Prev, vol. 18, no. 6, pp. 1904-1913, Jun. 2009.
[13] W. He et al., “Quantitation of circulating tumor cells in blood samples from ovarian and prostate cancer patients using tumor-specific fluorescent ligands,” Int. J. Cancer, vol. 123, no. 8, pp. 1968-1973, Oct. 2008.
[14] M. C. Miller, G. V. Doyle, and L. W. M. M. Terstappen, “Significance of Circulating Tumor Cells Detected by the CellSearch System in Patients with Metastatic Breast Colorectal and Prostate Cancer,” J Oncol, vol. 2010, 2010.
[15] M. G. Krebs et al., “Evaluation and Prognostic Significance of Circulating Tumor Cells in Patients With Non-Small-Cell Lung Cancer,” JCO, vol. 29, no. 12, pp. 1556-1563, Apr. 2011.
[16] J. Y. Xu et al., “Detection and Prognostic Significance of Circulating Tumor Cells in Patients With Metastatic Thyroid Cancer,” J Clin Endocrinol Metab, vol. 101, no. 11, pp. 4461-4467, Nov. 2016.
[17] Y. Zhou, B. Bian, X. Yuan, G. Xie, Y. Ma, and L. Shen, “Prognostic Value of Circulating Tumor Cells in Ovarian Cancer: A Meta-Analysis,” PLOS ONE, vol. 10, no. 6, p. e0130873, Jun. 2015.
[18] J.-Y. Pierga et al., “High independent prognostic and predictive value of circulating tumor cells compared with serum tumor markers in a large prospective trial in first-line chemotherapy for metastatic breast cancer patients,” Ann Oncol, vol. 23, no. 3, pp. 618-624, Mar. 2012.
[19] H. I. Scher et al., “Circulating tumour cells as prognostic markers in progressive, castration-resistant prostate cancer: a reanalysis of IMMC38 trial data,” The Lancet Oncology, vol. 10, no. 3, pp. 233-239, Mar. 2009.
[20] J. B. Smerage et al., “Circulating Tumor Cells and Response to Chemotherapy in Metastatic Breast Cancer: SWOG S0500,” JCO, vol. 32, no. 31, pp. 3483-3489, Nov. 2014.
[21] S. Dawood and M. Cristofanilli, “Integrating Circulating Tumor Cell Assays into the Management of Breast Cancer,” Curr. Treat. Options in Oncol., vol. 8, no. 1, pp. 89-95, Feb. 2007.
[22] D. R. Shaffer et al., “Circulating Tumor Cell Analysis in Patients with Progressive Castration-Resistant Prostate Cancer,” Clin Cancer Res, vol. 13, no. 7, pp. 2023-2029, Apr. 2007.
[23] J. P. Thiery, H. Acloque, R. Y. J. Huang, and M. A. Nieto, “Epithelial-Mesenchymal Transitions in Development and Disease,” Cell, vol. 139, no. 5, pp. 871-890, Nov. 2009.
[24] J. F. Leary, F. He, and L. M. Reece, “Detection and isolation of single tumor cells containing mutated DNA sequences,” presented at the BiOS ’99 International Biomedical Optics Symposium, 1999, pp. 93-101.
[25] J. F. Leary, S. R. McLaughlin, L. M. Reece, J. I. Rosenblatt, and J. A. Hokanson, “Real-time multivariate statistical classification of cells for flow cytometry and cell sorting: a data mining application for stem cell isolation and tumor purging,” 1999, vol. 3604, pp. 158-169.
[26] A. A. Powell et al., “Single Cell Profiling of Circulating Tumor Cells: Transcriptional Heterogeneity and Diversity from Breast Cancer Cell Lines,” PLOS ONE, vol. 7, no. 5, p. e33788, May 2012.
[27] G. Vona et al., “Isolation by Size of Epithelial Tumor Cells: A New Method for the Immunomorphological and Molecular Characterization of Circulating Tumor Cells,” The American Journal of Pathology, vol. 156, no. 1, pp. 57-63, Jan. 2000.
[28] H. K. Lin et al., “Portable Filter-Based Microdevice for Detection and Characterization of Circulating Tumor Cells,” Clin Cancer Res, vol. 16, no. 20, pp. 5011-5018, Oct. 2010.
[29] R. Konigsberg et al., “Detection of EpCAM positive and negative circulating tumor cells in metastatic breast cancer patients,” Acta Oncologica, vol. 50, no. 5, pp. 700-710, Jun. 2011.
[30] R. Mitra, O. Chao, Y. Urasaki, O. B. Goodman, and T. T. Le, “Detection of Lipid-Rich Prostate Circulating Tumour Cells with Coherent Anti-Stokes Raman Scattering Microscopy,” BMC Cancer, vol. 12, no. 1, p. 540, Nov. 2012.
[31] D. Nayar, “Global Circulating Tumor Cells (CTCs) and Cancer Stem Cells (CSCs) Market 2017 Qiagen, Advanced Cell Diagnostics, ApoCell and Janssen - Important Events 24.”
[32] Aldis Clarke, “2017 Global Circulating Tumor Cells Market By Top 5 Manufacturers in America, Europe, Asia-Pacific and Africa,” Publicist Report. The site covers topics in Business. The team’s goal is to provide the latest, up to the minute news., 26-Jul-2017.
[33] M. Kamal, W. Razaq, M. Leslie, and S. A. and T. Tanaka, “Circulating Tumor Cells in Breast Cancer: A Potential Liquid Biopsy,” 2017.
[34] K. F. Ho, N. E. Gouw, and Z. Gao, “Quantification techniques for circulating tumor cells,” TrAC Trends in Analytical Chemistry, vol. 64, pp. 173-182, Jan. 2015.
[35] T. M. Scholtens et al., “CellTracks TDI: An image cytometer for cell characterization,” Cytometry Part A, vol. 79A, no. 3, pp. 203-213, 2011.
[36] S. D. Mikolajczyk et al., “Detection of EpCAM-Negative and Cytokeratin-Negative Circulating Tumor Cells in Peripheral Blood,” Journal of Oncology, 2011. [Online]. Available: https://www.hindawi.com/journals/jo/2011/252361/abs/. [Accessed: 05-Jul-2017].
[37] N. Saucedo-Zeni et al., “A novel method for the in vivo isolation of circulating tumor cells from peripheral blood of cancer patients using a functionalized and structured medical wire,” International Journal of Oncology, vol. 41, no. 4, pp. 1241-1250, Oct. 2012.
[38] J. Chudziak et al., “Clinical evaluation of a novel microfluidic device for epitope-independent enrichment of circulating tumour cells in patients with small cell lung cancer,” Analyst, vol. 141, no. 2, pp. 669-678, Jan. 2016.
[39] G. Martin, S. Soper, M. Witek, and J. J. Yeh, “Vitro capture and analysis of circulating tumor cells,” Patent: US20140134646 A1, 15-May-2014.
[40] A. A. S. Bhagat and Y. Guan, “Microfluidics sorter for cell detection and isolation,” Patent: US20160303565 A1, 16-Oct-2014.
[41] H. W. Hou et al., “Isolation and retrieval of circulating tumor cells using centrifugal forces,” Scientific Reports, vol. 3, p. srep01259, Feb. 2013.
[42] P. Gogoi et al., “Development of an Automated and Sensitive Microfluidic Device for Capturing and Characterizing Circulating Tumor Cells (CTCs) from Clinical Blood Samples,” PLOS ONE, vol. 11, no. 1, p. e0147400, Jan. 2016.
[43] M. Balic et al., “Comparison of two methods for enumerating circulating tumor cells in carcinoma patients,” Cytometry, vol. 68B, no. 1, pp. 25-30, Nov. 2005.
[44] Dittamore, Ryan, “Circulating tumor cell diagnostics for prostate cancer biomarkers,” Patent: US20160033508A1, 26-Jan-2015.
[45] Dittamore, Ryan, “Methods for the detection and quantification of circulating tumor cell mimics,” Patent: US20160341732A1, 29-Jan-2015.
[46] T. Hillig et al., “In vitro detection of circulating tumor cells compared by the CytoTrack and CellSearch methods,” Tumor Biol., vol. 36, no. 6, pp. 4597-4601, Jun. 2015.
[47] S. Nagrath and H. J. Yoon, “System for detecting rare cells,” Patent: US9645149 B2, 09-May-2017.
[48] A. S. Frandsen et al., “Retracing Circulating Tumour Cells for Biomarker Characterization after Enumeration,” Journal of Circulating Biomarkers, vol. 4, p. 5, Nov. 2015.
[49] D. N. Curry et al., “High-speed detection of occult tumor cells in peripheral blood,” in 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2004. IEMBS ’04, 2004, vol. 1, pp. 1267-1270.

Full Text 
CIRCULATING TUMOR CELLS: MATHEMATICAL THEORY OF DETECTABILITY WITH SIMULATIONS AND EXPERIMENTAL RESULTS IN A MODEL SYSTEM
by
GREGORY LOUIS FUTIA
B.S., Purdue University, 2007
M.S., Colorado State University, 2011
A dissertation submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfilment of the requirements for the degree of Doctor of Philosophy
Bioengineering Program
2017
This dissertation for the Doctor of Philosophy by Gregory Louis Futia has been approved for the Bioengineering Program by
Richard KP Benninger, Chair
Emily A Gibson, Advisor
Kian Behbakht
Isabel R Schlaepfer
Robin Shandas
Date: December 16, 2017
Word Template by Friedman & Morgan 2014
Futia, Gregory Louis (Ph.D., Bioengineering)
Circulating Tumor Cells: Mathematical Theory of Detectability with Simulations and Experimental Results in a Model System
Dissertation directed by Assistant Professor Emily A. Gibson
Abstract
Circulating tumor cells (CTCs) are nucleated objects that are shed from a primary tumor into the blood stream. Effective identification of CTCs holds promise for improving early detection and disease monitoring of cancer but is difficult due to the rarity of CTCs compared to background blood cells. In this dissertation, I develop mathematics describing how the rarity of cell that an assay can detect is limited by the sensitivity and specificity of the assay's identifying biomarker to that cell. I refer to the rarity of cell that an assay can detect as detectable rarity. I show that, depending on the distribution of disease positive and disease negative populations on an identifying biomarker, there can be a maximum in detectable rarity as a function of the test positive/test negative cutoff position on that biomarker. Most CTC assays consist of 2 stages, which are an enrichment stage followed by an image cytometry stage. I present mathematics describing how the sensitivity, specificity, and detectable rarity of a multistage test relate to the sensitivity, specificity, and detectable rarity of the individual test stages.
The enriched output fraction typically contains between 1,000 and 10,000 cells. Difficulties in processing this cell fraction for image cytometry lie in (1) preventing cell loss in the numerous handling steps involved in labeling and mounting the cells and (2) controlling the area of the resulting cell field such that it is neither too sparse nor too dense. I present technology I have engineered that addresses point (1) by confining cells during the labeling process using a filter and addresses point (2) by allowing the size of the cell field to be set using standard o-rings, with diameters interchangeable using variable low cost alignment plates.
I assess the identification performance of adding lipid imaging to the standard DAPI, Cytokeratin, CD45 panel used to identify CTCs. I assess the identification performance of adding metrics of spatial second moment, spatial frequency second moment, and the product of spatial second moment and spatial frequency second moment to a simple total content metric. To perform this assessment, I use technology I engineered to prepare samples for image cytometry with fluorescent staining and antibody labeling for DAPI, Bodipy (lipids), Cytokeratin, and CD45. I perform this analysis in a model system of disease negative white blood cells and disease positive MCF7 cancer cells. In this model system, I present my analysis of the four spatial features calculated on each of the four labels, providing a total of 16 biomarkers. The best performing of the 16 biomarkers produced an average separation of 3 standard deviations between disease positive (D+) and disease negative (D-) populations and an average detectable rarity of ~1 in 200. I performed multivariable regression and feature selection to combine multiple biomarkers for increased performance and showed an average separation of 7 standard deviations between the D+ and D- populations, giving an average detectable rarity of ~1 in 480. Histograms and receiver operating characteristics (ROC) for these biomarker features and regressions are presented. I show methods to optimize for the maximum detectable rarity as a function of test positive/test negative cutoff position and apply this method to all biomarkers measured.
The form and content of this abstract are approved. I recommend its publication.
Approved: Emily A. Gibson
I dedicate this dissertation to my wife, my parents and my daughter.
Acknowledgments
I would like to acknowledge Dr. Emily A. Gibson, my research advisor. She pushed for investigating lipids as a biomarker, with the motivation of looking at that biomarker label-free with coherent anti-Stokes Raman scattering microscopy. Her help in editing and composing the manuscripts that have been submitted related to this work has added clarity to the documents. She has also provided me with exposure and guidance on how to write grants and what goes into keeping a lab running. She has given me the flexibility to explore the aspects of this project that interested me as well as the latitude to embark on the technology development.
Second, I would like to acknowledge my clinical mentor, Dr. Kian Behbakht. He drew us into the CTC problem by his own clinical need for better methods for detection and monitoring of gynecological malignancies. He has provided me with exposure to the clinical aspects of cancer research through the tumor board meeting, allowing me to see a cancer debulking surgery, and inviting me to events where survivors are present. In addition, support through his lab and lab group has given me much of my exposure to the biological questions in cancer.
I would also like to acknowledge the members of their lab groups, particularly Baris Ozbay, Dr. Stephanie Meyer, and Dr. Mariana Potcoava, who I worked closely with, and also Dr. Allison Caster, Mr. Andrew Chandlor, and Mr. Robert Heffern. Also, I would like to acknowledge Dr. Heidi Wilson, Dr. Ben Bitler, Dr. Lubna Qamar, Dr. Georgina Cheng, and Mr. Doug Hicks.
I would like to acknowledge the efforts put in by my additional committee members Dr. Isabel Schlaepfer, Dr. Robin Shandas, and my committee chair Dr. Richard Benninger.
This work was supported by a seed grant from the University of Colorado Cancer Center funded through an American Cancer Society Institutional Research Grant, #57 001 53 (EAG, KB), by funding provided by the Defense Advanced Research Projects Agency, # N66001 10 4035 (EAG), NIH/NCI K01 CA168934 (IRS) and NIH/NCATS Colorado CTSI Grant Number TL1 TR001081. Imaging experiments were performed in the University of Colorado Anschutz Medical Campus Advanced Light Microscopy Core supported in part by NIH/NCATS Colorado CTSI Grant Number UL1 TR001082.
I gratefully acknowledge Dr. Michael Yeager and Ms. Kelly Colvin for guidance on cell handling, labeling, and red blood cell lysis steps. I also acknowledge Dr. Heide Forde for supplying us with the MCF7 cell line.
I have no conflicts of interest to declare. The funders had no role in the study design, data collection, analysis, decision to publish, or preparation of this dissertation.
Contents
I. INTRODUCTION
  1.1 Contributions
  1.2 Background
    1.2.1 The CTC Commercial Technology Landscape
    1.2.2 FDA Regulations and Guidance for CTC Identification Systems
II. MATHEMATICAL THEORY AND SIMULATIONS OF DETECTABILITY FOR RARE EVENTS SUCH AS CIRCULATING TUMOR CELLS IN MULTISTAGE ASSAYS
  2.1 Introduction
    2.1.1 Background
  2.2 Theory: Derivation of Detectable Rarity
  2.3 Simulation: Depending on the Distribution of the D+ and D- on a Biomarker There Can Be an Optimal Test Positive/Test Negative Cutoff Position
  2.4 Theory: Two Stage Tests
    2.4.1 Sensitivity and Specificity of a Two Stage Binary Assay
    2.4.2 Detectable Rarity in Multi Stage Binary Assays
    2.4.3 Treatment of Independent Versus Dependent Test Stages
    2.4.4 ROC for Two Stage Tests Using the Same Biomarker
    2.4.5 ROC for Two Stage Tests Using Different Biomarkers
    2.4.6 Creating Overall Test ROC from Two Stage Test Using Different Biomarkers
    2.4.7 Simulation: ROC Pre and Post Enrichment for Dependent and Independent Biomarkers
    2.4.8 Simulation: The Optimal Enrichment Cutoff Position
  2.5 Theory: Cell Loss in Processing
  2.6 Theory: Sampling Considerations and Volume of Blood to Screen
  2.7 Discussion
III. ENGINEERING OF CELL LABELING TECHNOLOGY
  3.1 Introduction
  3.2 Device for Cell Labeling
  3.3 Coverslip Mounter
  3.4 Protocol for DAPI, Bodipy, Pan-cytokeratin, and CD45 Labeling of Cells on Track Etched Polycarbonate Filters
  3.5 Example Resulting Cell Preparations
IV. EXPERIMENTAL EVALUATION IMAGE CYTOMETRY FOR DNA, CD45, CYTOKERATIN, AND LIPIDS IN A MODEL SYSTEM FOR CIRCULATING TUMOR CELL IDENTIFICATION
  4.1 Introduction
  4.2 Materials and Methods
    4.2.1 Sample Inclusion and Abundancies
    4.2.2 Sample Preparation and Labeling
    4.2.3 Fluorescence Microscopy
    4.2.4 Image Processing
    4.2.5
    4.2.6 Calculation of Image Metrics
    4.2.7 Performance Analysis
  4.3 Results
  4.4 Discussion
V. CONCLUSIONS
  5.1 Future Directions
REFERENCES
Appendix
  A. Components of the Clinical Experience
  B. Analysis of Experimental Variation
  C. Comparative figures and performance between bootstrap and experimental analysis used in dissertation
  D. Comparative figures and performance with and without manual debris removal
  E. Comparative figures and performance of Reg1-11 generated once using all days data and applied to each of the training testing subsets with Reg1-11 generated for each of the training testing subsets
  F. Comparative figures and performance of Reg1-11 generated once using all day 9 data and applied to each of the training testing subsets with Reg1-11 generated for each of the training testing subsets
  G. Area Under the Curve (AUC) up to FPR .05
  H.
Regressed Equations Trained on All Data ................................ .............................. 145 I . Regressed Equations Trained on just Day 9 Data ................................ ..................... 147 J . Notes on Commercialized CTC Technologies ................................ ......................... 150
List of Figures
Figure 1: Panels a & b: Histograms of D+ and D− populations drawn from normal are separated by 4 standard scored and likelihood ratio ROCs (panels c & d) maximum test positive likelihood ratio position. Annotated maximum N_det shown as + is also not at the minimum false positive .... 16
Figure 2: Panel a: Diagram of sample processing through a two-stage test. The test negative output of the first test is discarded. The test positive output from the first test is used in the second test. The test positive output of the second test is test positive for the combined test. Panel b: Geometrical picture of two-stage sensitivity showing that the two-stage sensitivity must be smaller than the sensitivity of the individual stages. Panel c: Geometrical picture of the two-stage specificity showing that the false positive rate [= 1 − specificity] must be smaller than that of the individual test stages. .... 17
Figure 3: Diagram of two-stage system in D+ and D− populations. Panel (a) has a relation where Equation 24 is not satisfied due to a large overlap between T1+ and the D− population. Panel (b) shows the typical case where both test stages seek to reduce the false positive rate of the assay. .... 20
Figure 4: Diagram of population separation using continuous biomarkers through a two-stage system. The effect of the first test is modeled as a selection function on a continuous biomarker, x, (panel b) in which cells in the red population are more likely to be selected as test positive, T1+. Resulting enriched distributions are shown in panel c) and discarded distributions are shown in panel d). .... 22
Figure 5: The ROC of the biomarker post-enrichment (blue) follows the ROC of the biomarker without enrichment for low false positive rates and is seen to saturate at higher false positive rates. This is caused by cell loss in the D+ population due to the stage 1 enrichment. D+ and D− distributions are normally distributed and are shown in Figure 4. .... 24
Figure 6: Simulated D+ events (red) and D− events (blue) with distribution on biomarkers BM1, BM2, and BM3 (panels a and b). BM2 is independent (ID) of BM1 while BM3 is dependent (Dep.) on BM1. Distributions are put through an enrichment stage on BM1 with cutoff shown in (c and d), resulting in new distributions shown in panels (e and f). Panels (g and h) show the ROC of BM2 and BM3 before and after enrichment with maxima in N_det annotated as +. Although the performance of BM3 is seen to be better than BM2 pre-enrichment, its performance improves less than BM2 post-enrichment due to its dependence on BM1. .... 28
Figure 7: The detectable rarity of stage I and overall test as a function of the cutoff position on the stage I enrichment biomarker. Panel a) shows the detectable rarity of the stage I enrichment alone. For the case of D+ and D− the detectable rarity increases as the cutoff position is moved ever further away from the D− population. For D+ and D− populations with excess kurtosis of 3 (BM1) a maximum is seen in detectable rarity as a function of cutoff position. In panel b), BM1 is used as the enrichment biomarker and BM2 from Figure 6 as the stage II biomarker. The performance of the overall test post-enrichment (blue dashed line) follows that of the detectable rarity of BM1 alone combined with that of BM2 as predicted by equation 19. When the stage II biomarker is dependent on the stage I biomarker (BM3 from Figure 6) the overall test detectable rarity underperforms that of the independent case. .... 31
Figure 8: Assurance levels of encountering k cells after measuring n cells, assuming a prevalence of 1 in one million cells. .... 33
Figure 9: Side cross section of labeling device. Device consists of input head, alignment plate, and output head. The input and output heads sandwich o-rings set with the alignment plates to create a seal on the polycarbonate filter as shown in detail 1. Input head has threaded connector (a) for applying positive pressure. The volume of the staining reservoir (b) is controlled by the diameter (c) while its height is set to the length of a standard gel loading pipette tip. This choice of height prevents damaging the filter while enabling bubble-free loading. The diameter of the laydown area is controlled by the o-ring diameter, which can be set by changing the diameters in the alignment plate (e). This plate must be thin enough for a compression gap (f) to enable sealing of the device. Output head contains threaded connector (g) for pulling fluids from the device. All fluids flow in the direction of the dotted arrow (h). .... 38
Figure 10: Diagram of assembled system: The lanes in the staining chamber (a) are pulled with vacuum controlled through stopcock panel (b) fabricated with 3D printing. The vacuum splits out through the manifold (d) and is routed through 2 aspiration steps (d). To prevent stretching the filter, vacuum pressure is controlled by regulator (e), which is connected to the lab vacuum (f). .... 39
Figure 11: A 1:1 sample of WBCs and MCF7 cells labeled for DAPI (blue), Bodipy (green), anti-pan-cytokeratin (yellow), and anti-CD45 (red). Panel a) representative image of cell spot produced by labeling device. Diameter of spot is 2.1 mm. Image is an 8x8 mosaic; full resolution of the square area is shown in panel b). Image at full resolution is available in public dataset [72]. .... 42
Figure 12: Histograms of image cytometry features computed on all 4 channels. Blue shows WBC-only samples composed of 24,699 objects (D−), red shows MCF7-only samples (D+) composed of 41,091 objects used in the analysis. Black dashed line (MCF7 + WBC ~1:1 mixed samples containing 33,726 objects) is a qualitative control reproducing modality of pure samples. dBct = 10*log10(counts). Box plots of the distribution of cut-off positions maximizing N_det across the 48 training subsets are shown on top of each of the histograms. .... 64
Figure 13: AUC and Cohen's d for the 48 training subsets. An operating point maximizing N_det was found on each training subset (thresholds plotted in Figure 12). The remaining data from that day was used as a testing subset to compute the shown sensitivity, specificity, and minimum detectable thresholds at the operating point. Occurrences of false positive rates of zero on the testing data were found and summed in the bottom panel. .... 65
Figure 14: ROC curves for the training subsets averaged over each day. The logarithmic (z-scored) sensitivity and specificity axes used show straight lines when D+ and D− distributions are Gaussian. Average values of sensitivity and specificity maximizing N_det are shown as + symbols and are generally to the left of the observed inflection points. .... 66
Figure 15: ROC on test positive likelihood ratio LR_T+ and test negative likelihood ratio LR_T− axes shows there are maxima in LR_T+ for all features. If any of these biomarkers were used for enrichment, this cut-off would be the optimal position for the transition between T+ and T− for the enrichment. .... 67
Figure 16: Histograms of testing subsets of WBCs (blue traces) and MCF7 (red traces). The 1:1 mixed populations (black line) are a qualitative control showing that the regressed modality is real. Testing data was not used to train the regressions and was naive to the regressions, which were computed on the 48 training subsets. For each regression, an operating point maximizing N_det was found on that training subset, producing the threshold positions shown as box plots on top of the histograms. Regressions are z-scored and thus center near 0. .... 68
Figure 17: Performance of the regression combinations over the 48 testing subsets, shown as separation produced between the biomarkers. Below the line are performance statistics that depend on the operating point. Occurrences where a testing subset had a false positive rate of 0 were summed. For Reg1-4, below-line performance statistics are similar while above-line statistics are different. .... 69
Figure 18: Receiver operating characteristics of the regressions computed on the testing subsets averaged over each day. Average positions of operating points maximizing N_det are shown as + symbols. As more features are included in the regressions, performance and stability are seen to improve. Regressed distributions are not Gaussian, as indicated by the shape on the z-scored sensitivity-specificity axes. .... 70
Figure 19: Regression on test positive likelihood ratio (LR_T+) and test negative likelihood ratio (LR_T−). LR_T+ is linearly related to detectable rarity N_det. This figure shows that there are maxima for LR_T+ and thus N_det for all regression combinations, and the profile of these ROC is not that of D+ and D− drawn from normal distributions as shown in Figure 18. .... 71
CHAPTER I
INTRODUCTION
Efforts to measure malignancy using peripheral blood samples have focused on detecting circulating tumor cells (CTCs), circulating tumor DNA (CT DNA), and serum proteins such as prostate specific antigen (PSA) or cancer antigen 125 (CA125) [1]–[4]. Detection and capture of CTCs have advantages over other measurements. For example, the presence of CTCs indicates that the cancer has made a step in its development of metastatic potential and is capable of disseminating cells from its primary site. Some circulating tumor cells may be carriers of metastasis depending on their ability to survive in changing microenvironmental conditions [5]. Circulating tumor cells contain a sample of the entire genome of the tumor, providing more information than other techniques. Another advantage is that measurements of these cells can allow further characterization by phenotype for metabolic markers such as lipid composition [6]. Current technologies have detected the presence of CTCs in patients with breast [7], [8], colorectal [9], [10], prostate [11], [12], and ovarian [13] cancer.

The problem in accurately identifying circulating tumor cells is their rarity amongst similar cellular blood components. Previous studies indicate that just 2 CTCs in 7.5 mL of whole blood is predictive of prognosis in many cancers [1], [14]–[17]. Similar CTC counts have also been shown to be predictive of response to chemotherapy [15], [18]–[20], indicating their usefulness as a disease monitoring biomarker. In contrast, 1 milliliter of blood contains 2–4 million nucleated blood cells, often called white blood cells (WBCs), 1 billion red blood cells, and 100 million platelets. To address this rarity, biomarkers with high sensitivity and extremely high specificity are needed to distinguish between circulating tumor cells and
background nucleated blood cells. Failure to have sufficient sensitivity and specificity produces an assay in which most test positive objects are false positives. As part of this dissertation, I will present theoretical analysis of the CTC problem that begins by showing that sensitivity and specificity set the rarity of cell that a test can detect, which I will refer to as detectable rarity.

Eighty percent of cancers are carcinomas, which originate from epithelial tissue. The most common method to isolate CTCs is to target markers of epithelial cells not expected in hematopoietic blood cells. Thus, the typical CTC definition is a nucleated cell expressing epithelial markers, such as epithelial cell adhesion molecule (EpCAM) or cytokeratins (CK), that is also negative for the hematopoietic marker CD45, which I annotate as DAPI+/CK+/CD45−. Although DAPI+/CK+/CD45− seems to be a logical target, a large scale study of the DAPI+/CK+/CD45− biomarker (CellSearch, Veridex, LLC) using 2183 blood samples from 965 patients with known metastatic carcinomas found only 36% to have greater than 2 CTCs in 7.5 mL of blood [1]. Indeed, the previously mentioned reports of CTC identification [7], [9]–[13], [21], [22] detectable CTCs. This low sensitivity limits the utility of CTCs in early detection and disease monitoring and motivates the need to further improve CTC identification assays. The theory I develop here lays the pathway for further improvement to these assays as experimental investigation finds other possible biomarkers for CTCs with greater sensitivity and specificity. Future assays may involve biomarkers that are less dependent on epithelial markers, as it is postulated that CTCs lose some of their epithelial characteristics [23].

Many CTC detection assays consist of at least 2 stages: a stage I enrichment using antibody capture (EpCAM positive selection or CD45 depletion) to reduce the number
of background WBCs and a stage II identification using image or flow cytometry with fluorescent labels. In this dissertation, I present theory showing how the sensitivity, specificity, and detectable rarity of the first and second test stages combine to set the sensitivity, specificity, and detectable rarity of the overall test. Additionally, through simulations exploring the developed theory, I show that the detectable rarity of an identifying biomarker can be limited as a function of the test positive/test negative cutoff position on that biomarker. Thus, experimental investigations of not just biomarkers but biomarker combinations are critical to better identify CTCs.

As part of this dissertation, I perform an experimental investigation of adding new biomarkers to the classical DNA/CD45/CK panel. The new biomarkers I looked to add were the spatial image cytometry metrics of second moment, spatial frequency second moment, and the product of these two moments, together with a lipid channel/label. In addition to computing these three spatial metrics, I also computed total label content in each region of interest (ROI). I refer to an image metric computed on one channel as a feature. Thus, this four-channel dataset on which 4 image cytometry metrics were computed resulted in a 16-feature image cytometry dataset. Each of these features, and each combination of these features, is a potential identifying biomarker. I present regression methods I developed to combine the individual features for improved performance. In particular, I was interested in a DNA, lipids, and CD45 panel because it does not use epithelial biomarkers and because all of the labels are compatible with live cell imaging.
I performed this experimental work using a model system of disease positive (D+) MCF7 cells and disease negative (D−) white blood cells (WBCs) from human peripheral blood, which has been used by other investigators in developing CTC assays [24]–[29].
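
The feature bookkeeping described above (four channels times four metrics, giving 16 features per cell) can be sketched as follows. This is an illustrative sketch only: the channel labels, the function names, and in particular the definition of the spatial second moment (intensity-weighted mean squared distance from the intensity centroid) are assumptions made for the example and are not taken from the analysis software. The spatial frequency second moment and the moment product appear only as placeholder names.

```python
from itertools import product

# Assumed labels for the four imaging channels and the four metrics.
CHANNELS = ("DNA", "lipid", "CK", "CD45")
METRICS = ("total", "second_moment", "freq_second_moment", "moment_product")


def total_content(roi):
    """Sum of pixel intensities in a region of interest (a list of rows)."""
    return sum(sum(row) for row in roi)


def second_moment(roi):
    """Assumed definition: intensity-weighted mean squared distance
    of the pixels from the intensity centroid of the ROI."""
    total = total_content(roi)
    if total == 0:
        return 0.0
    pixels = [(y, x, v) for y, row in enumerate(roi) for x, v in enumerate(row)]
    cy = sum(y * v for y, _, v in pixels) / total
    cx = sum(x * v for _, x, v in pixels) / total
    return sum(v * ((y - cy) ** 2 + (x - cx) ** 2) for y, x, v in pixels) / total


# One feature per (metric, channel) pair: 4 metrics x 4 channels = 16 features.
feature_names = [f"{metric}_{channel}" for metric, channel in product(METRICS, CHANNELS)]

# Toy ROIs with identical total content but different spatial spread.
compact = [[0, 0, 0], [0, 8, 0], [0, 0, 0]]
diffuse = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]

print(len(feature_names))
print(total_content(compact), total_content(diffuse))
print(second_moment(compact), second_moment(diffuse))
```

The two toy ROIs carry the same total signal but differ in their second moment, which is the kind of information a spatial metric could add beyond a total content measurement.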
Higher lipid content in CTCs isolated with the CK+/CD45− marker has been previously quantified using coherent anti-Stokes Raman scattering (CARS) microscopy, showing a 7-fold higher lipid signal over other blood cells in metastatic prostate cancer patients [30]. In my work, I assessed how well lipids perform as an identifying biomarker for CTCs in this model system, both alone and in combination with the DNA/CD45/CK biomarkers. I test the following experimental hypotheses: (H.1) image cytometry for a composite panel of DNA/lipids/CD45 will detect MCF7 cells with better performance than a standard DNA/CD45/CK panel, (H.2) MCF7 cells have increased lipid content compared to white blood cells, and (H.3) spatial metrics of second moment, spatial frequency second moment, and their product using image cytometry can increase sensitivity and specificity beyond those of simple total content measurements. The identification performance I report for the studied biomarkers and combinations consists of the distributions of features for the D+ and D− populations, Cohen's d, sensitivity, specificity, and the rarity of cell that can be detected.

Finally, the utility of these biomarker combinations and image cytometry also lies in the reliability and repeatability of their execution. As part of this work, I have engineered technology that allows cells of interest to be imaged and analyzed for these biomarkers in an ever more reliable and repeatable fashion. I will present a device resulting from my engineering efforts to improve reliability and repeatability in labeling cells for image cytometry and describe how I used this device to evaluate the DNA/lipids/CD45/CK panel.

1.1 Contributions
My contributions are theoretical, technological, and experimental to the field of cytometry with an application focus on identification of circulating tumor cells. My
theoretical contributions to this field are the development of mathematics describing the rarity of object an assay can detect, the development of mathematics describing how the detectable rarity of each stage combines in multistage diagnostic assays, showing how detectable rarity is related to the test positive likelihood ratio, and showing that in some cases detectable rarity and the test positive likelihood ratio have a maximum as a function of the test positive/test negative cutoff position and do not monotonically increase with decreasing false positive rate.

My technological contributions are the engineering of a device for labeling cells on track-etched polycarbonate filters and software to perform image cytometry. The filters prevent cell loss during the multiple fluid handling steps of cell labeling, and the device allows the size of the deposited cell field to be controlled. The software was written to perform image cytometry on these fields and computes spatial metrics of total signal, second moment, spatial frequency second moment, and their product. I have not previously seen these spatial metrics evaluated in the context of circulating tumor cells.

My experimental contributions are the evaluation of the DNA/Lipids/CD45/CK panel in a model system for circulating tumor cell detection. I have contributed an experimental protocol for the labeling of cells on these filters for DNA, lipids, CD45, and cytokeratin. I have contributed an open source dataset of image cytometry for this panel in a model system of circulating tumor cells to assist others developing their own software. I have not found other open source datasets related to CTCs. Analyzing this experimental data, I present measured results using receiver operating characteristics on z-scored (logarithmic) axes and on test positive/test negative likelihood axes.
To the best of my knowledge, I am the first to apply this mathematical theory to
calculate detectable rarity in an experimental context. I show that detectable rarity and likelihood ratios can be used to find the optimal cutoff parameters in experimental cytometry data.

1.2 Background
1.2.1 The CTC Commercial Technology Landscape
Recent news articles highlight a number of companies that are commercializing products for detecting circulating tumor cells [31], [32]. Fairly detailed reviews of the different technologies being employed are given in [33] and [34]. I have accessed and reviewed the websites of the companies listed in the news articles to try to understand the technology they are using, the product they are marketing, and any claims they make about their CTC product. My notes on this assessment are in Appendix J. I classify the technologies being commercialized by these companies into three categories: immunoenrichment, size enrichment, and large field image cytometry. The description "large field" refers to the imaged cell field being hundreds to thousands of square millimeters in area, allowing for the measurement of tens of thousands to millions of cells, rather than to the specific microscopy employed. Several companies market a complete assay that includes enrichment combined with a second stage identification using cytometry.

In the immunoenrichment field, the Veridex (CellSearch) system is probably the most widely used in clinical and biomedical research [1], [35] and remains the only system with FDA approval. The Veridex assay employs immunoenrichment using a magnetic ferrofluid conjugated to EpCAM antibodies to capture EpCAM positive cells, followed by back-end image cytometry of the captured cells using labels for DAPI, CD45, and cytokeratin. Microfluidic devices are also being employed for immunoenrichment using antibodies
conjugated to surfaces in the device [36]. A few companies such as Miltenyi Biotec and StemCell Technologies offer immunomagnetic enrichment products alone without a specified stage II test. These products are presumably targeted at researchers planning on doing PCR or sequencing for the second stage of an assay. One interesting technology for immunoenrichment is the GILUPI CellCollector, which is an antibody-conjugated wire for in vivo capture [37].

Isolation by size of epithelial tumor cells (ISET) [27] was one of the early alternatives to CellSearch. ISET employs size enrichment using track-etched polycarbonate filters with eight micron pore sizes to capture larger cells. More companies are now offering products based on enriching for larger CTCs. The technology being marketed and developed by many of these companies employs microfluidic technology for size capture [38]–[43].

Using large field image cytometry means performing microscopy on a large area, allowing one to image tens of thousands to millions of cells. This removes the need for an enrichment stage. Epic Technologies and CytoTrack both employ this technological approach [44]–[48]. In addition, there are publications on methods to image a large field first at lower resolution and then follow up in identified regions of interest with higher resolution microscopy [49], [50].

1.2.2 FDA Regulations and Guidance for CTC Identification Systems
The United States Food and Drug Administration considers CTC identification systems to be class II devices. Regulations for these devices are found under Title 21 of the Code of Federal Regulations, Part 866 Immunology and Microbiology Devices, Subpart G Tumor Associated Antigen Immunological Test Systems, Section 866.6020 Immunomagnetic circulating cancer cell selection and enumeration system
[21CFR866.6020]. The FDA has also produced a guidance document, Controls Guidance Document: Immunomagnetic Circulating Cancer Cell Selection and Enumeration 0163, related to these systems.
CHAPTER II
MATHEMATICAL THEORY AND SIMULATIONS OF DETECTABILITY FOR RARE EVENTS SUCH AS CIRCULATING TUMOR CELLS IN MULTISTAGE ASSAYS
2.1 Introduction
Improvements in the performance of CTC assays will come from discovering identifying biomarkers that are more sensitive and specific, along with improvements in detection methods that minimize cell loss and increase repeatability. In this chapter, I present analysis applicable to the engineering of multistage cytometry assays for the detection of rare cells such as CTCs. I present theory showing how sensitivity and specificity set the rarity of event an assay can detect, which I will call detectable rarity. Enrichment followed by image cytometry is commonly used to detect CTCs [1], [27], [35], [51]. I present a multistage theory by first describing how sensitivity, specificity, and detectable rarity can be applied to two binary test stages. Then, I present theory for two-stage assays consisting of enrichment followed by cytometry resulting in continuous biomarkers and derive the expressions needed to build receiver operating characteristics (ROC). Through simulation, I explore how the distribution of the D+ and D− populations on the stage I biomarker, the enrichment cutoff position, and dependence between the stage I and stage II biomarkers affect overall assay sensitivity, specificity, and detectable rarity.

In my approach, I quantify the performance of a test by solving for how abundant a rare D+ cell must be among the D− population to detect it. This is similar to separating the signal strength from the noise in an electronic analogy. With this analogy, one finds connections between the theory presented and the electronic detection theory
pioneered by Peterson, Birdsall, and Fox to understand the detectability of signals in the presence of noise [52]. The rarity of cell that a test can detect is what I term the detectable rarity. I will find that detectable rarity is linearly related to the test positive likelihood ratio and that maximizing it is the same as maximizing the test positive likelihood ratio (L_T+). I will show that detectable rarity has combination properties similar to likelihood ratios in multistage assays. Of note, likelihood ratios were also developed in the theory of [52]. Likelihood ratios have previously been described as a decision criterion in diagnostics and in the field of psychology to describe decision making, with thresholds to be determined by event prevalence and the reward/cost tradeoff of correct and incorrect decisions [53]. I will show that, depending on the distribution of D+ and D− on the continuous biomarker parameter, the test positive likelihood ratio and detectable rarity have a maximum as a function of cut-off position on the ROC, and that they do not always monotonically increase with ever decreasing cut-off on ROC thresholds as suggested by [53]. I will show this graphically by transforming the sensitivity-specificity axes of my ROC curves to the test positive (L_T+) and test negative (L_T−) likelihood ratios. The idea of transforming the coordinate axis was first proposed by Johnson and L_T+, L_T− represents an alternate basis describing the ROC as fully as sensitivity and specificity [54]. Through simulation, I find that when the D− populations have excess kurtosis, there is a maximum value of L_T+, and that when D+ populations have excess kurtosis, there is a maximum on L_T−. For a two-stage test of enrichment followed by cytometry, if the stage I biomarker has a D− population with excess kurtosis, the optimal cut-off position of the enrichment stage biomarker is at the position maximizing L_T+.
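
The claim above, that the test positive likelihood ratio can peak at an interior cut-off when the D− population is heavy-tailed, can be sketched numerically. The following is a minimal illustration and not the simulation code used in this dissertation: the population means, the unit standard deviations, the choice of a Laplace distribution for the heavy-tailed D− population (excess kurtosis of 3), the cut-off grid, and all function names are assumptions made for the example.

```python
import math


def normal_sf(x, mu=0.0, sigma=1.0):
    """Survival function P(X > x) of a normal distribution."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def laplace_sf(x, mu=0.0, sigma=1.0):
    """Survival function P(X > x) of a Laplace distribution
    with standard deviation sigma (excess kurtosis of 3)."""
    b = sigma / math.sqrt(2.0)  # Laplace scale giving variance sigma**2
    z = (x - mu) / b
    return 0.5 * math.exp(-z) if z >= 0 else 1.0 - 0.5 * math.exp(z)


def lr_positive_curve(cutoffs, sf_pos, sf_neg):
    """LR_T+ = sensitivity / false positive rate at each cutoff,
    where events above the cutoff are called test positive."""
    return [sf_pos(c) / sf_neg(c) for c in cutoffs]


def d_pos_sf(c):
    # Assumed D+ population: normal, mean 4, unit standard deviation.
    return normal_sf(c, mu=4.0)


def d_neg_normal_sf(c):
    # Assumed light-tailed D- population: normal, mean 0.
    return normal_sf(c, mu=0.0)


def d_neg_laplace_sf(c):
    # Assumed heavy-tailed D- population: Laplace, mean 0.
    return laplace_sf(c, mu=0.0)


cutoffs = [i * 0.01 for i in range(-200, 801)]  # grid from -2.00 to 8.00

lr_normal = lr_positive_curve(cutoffs, d_pos_sf, d_neg_normal_sf)
lr_heavy = lr_positive_curve(cutoffs, d_pos_sf, d_neg_laplace_sf)

best_normal = cutoffs[lr_normal.index(max(lr_normal))]
best_heavy = cutoffs[lr_heavy.index(max(lr_heavy))]

print(f"normal D-:  largest LR_T+ at cutoff {best_normal:.2f}")
print(f"Laplace D-: largest LR_T+ at cutoff {best_heavy:.2f}")
```

In this toy setup the normal D− case keeps improving up to the edge of the cut-off grid, while the heavy-tailed D− case has its largest LR_T+ at an interior cut-off. Because detectable rarity is linearly related to LR_T+, that interior cut-off is also the one that maximizes detectable rarity under these assumptions.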
There are only a few prior works about probabilistic mathematics related to rare events such as circulating tumor cell detection. Rossenblatt et al. explored sampling statistics and showed that the binomial distribution can be used to determine the number of cells that need to be searched given a known rare prevalence and a test with perfect sensitivity and specificity [55]. Tibbe et al. noted that rare event sampling statistics could also be modeled with Poisson statistics and discussed other practical aspects related to the detection of CTCs, such as how they can be used to stratify patient outcomes [56]. In this chapter, I will also review the binomial statistics used to determine the number of cells to sample for CTC identification, first described by [55]. The binomial sampling statistics implicitly assume a test with perfect sensitivity and specificity and only consider how many cells would need to be sampled to have a certainty of finding a cell given prevalence. The theory I present uses sensitivity and specificity to determine the rarity of cell the test can detect. I believe it is the detectable rarity, rather than the expected cell prevalence, which may be unknown, that should be used to determine the number of cells to sample with binomial sampling statistics.

2.1.1 Background
The mathematics presented here build on the mathematics used in detection theory in the application of rare cell identification. The mathematics of detection theory are strongly connected to the mathematics of signal theory first developed to understand electronic , [57], in that field is considered seminal through the idea of analyzing communications systems and noise through information content and led to what is now called information theory [58]. Peterson et al. a few years later would analyze the limits of electronic detection outside of the communications context [52], and this work was
cited by Green and Swets as being pioneering in detection theory [53]. Radar is but one obvious application of [52]. Peterson et al. [52] were the first to show that receiver operating characteristics plotted on z scored axes display normal distributions as straight lines. This point was later also advanced into the diagnostic and psychological decision making context by Swets [59]. Although the theory I present is more classical and similar to [52], [53] and [59], the information capacity of a channel is logarithmically proportional to its signal to noise ratio. I will relate signal to noise ratio to positive predictive value in the theory I describe below.

2.2 Theory: Derivation of Detectable Rarity

Consider a binary test performed on N cells coming from diseased and healthy groups. There are four possibilities for the groupings of these N objects: the cells can be test positive and disease positive, T+ ∩ D+; test positive and disease negative, T+ ∩ D−; test negative and disease positive, T− ∩ D+; or test negative and disease negative, T− ∩ D−. The number of false positives expected is

$$N_{FP} = N \, P(T^{+} \cap D^{-}) \qquad (1)$$

where P(event) is the probability of that event occurring, which must always range between 0 and 1. N_FP represents the background over which one needs to find the true positive cells. The expected number of true positives is

$$N_{TP} = N \, P(T^{+} \cap D^{+}) \qquad (2)$$

In signal processing, the standard parameter used to measure whether a signal is detectable is the signal to noise ratio (SNR). In the application to CTC detection, the SNR can be defined as the ratio of the number of true positives to the number of false positives. Starting with this relationship,
$$SNR = \frac{N_{TP}}{N_{FP}}, \qquad (3)$$

one must have true positives exceed false positives (noise) by a minimum SNR in order for them to be detectable. Up to the common factor N, which cancels in the ratio, N_FP can be expressed as the false positive rate, P(T+|D−), multiplied by the disease negative prevalence, P(D−), while N_TP can be written as the sensitivity, P(T+|D+), multiplied by the disease positive prevalence, P(D+). One finds

$$SNR = \frac{P(T^{+} \mid D^{+}) \, P(D^{+})}{P(T^{+} \mid D^{-}) \, P(D^{-})} \qquad (4)$$

By replacing the D− prevalence with its complement, P(D−) = 1 − P(D+), one can solve for the detectable prevalence for a given SNR,

$$P(D^{+}) = \frac{SNR \, P(T^{+} \mid D^{-})}{P(T^{+} \mid D^{+}) + SNR \, P(T^{+} \mid D^{-})} \qquad (5)$$

For my simulations, I set the minimum SNR to be 2. For calculations, it is easier to use an optimization parameter of N_det = P(D+)^−1, which is the inverse of the detectable prevalence. I define N_det as detectable rarity. Simplifying Equation 5, one has the following relation for N_det,

$$N_{det} = 1 + \frac{P(T^{+} \mid D^{+})}{SNR \, P(T^{+} \mid D^{-})} \qquad (6)$$

For example, if N_det is equal to 2000, then one can detect CTCs occurring at a concentration of 1 CTC to 2000 leukocytes in a sample. Any concentration of CTCs below this level will not be detectable. Maximizing N_det maximizes for detection of the rarest possible events. SNR is the ratio version of positive predictive value, P(D+|T+). They are related through the equation,
$$P(D^{+} \mid T^{+}) = \frac{SNR}{1 + SNR} \qquad (7)$$

N_det is linearly related to the test positive likelihood ratio, L_T+ = P(T+|D+) / P(T+|D−), through the following equation

$$N_{det} = 1 + \frac{L_{T+}}{SNR} \qquad (8)$$

This means that maximizing N_det is the same as maximizing the test positive likelihood ratio, regardless of the choice of SNR. Since positive predictive value depends only on the cell prevalence and the test positive likelihood ratio, maximizing N_det also maximizes positive predictive value (PV+) at any given cell prevalence, regardless of what PV+ actually is.

2.3 Simulation: Depending on the Distribution of the D+ and D− on a Biomarker There Can Be an Optimal Test Positive/Test Negative Cutoff Position

Many CTC assays employ image cytometry as the final identification stage, which provides test biomarkers as continuous variables. For continuous biomarkers, the sensitivity and specificity depend on the location of the cutoff position used to discriminate between the T+ and T− groups. The trade off in sensitivity and specificity as a function of cutoff position is often shown by a receiver operating characteristic (ROC) curve. Previous work [52], [53], [59] has shown that plotting sensitivity and specificity on z scored axes gives a straight line when the D+ and D− populations are drawn from normal distributions. In my experimental image cytometry work for DNA, lipid, CD45, and cytokeratin CTC labels, I found that none of these biomarkers showed normal distributions on an ROC (Chapter IV).
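As an illustrative sketch of the relations in this section (not the code used in this work), detectable rarity can be computed directly from a test's sensitivity and false positive rate. The sketch assumes the form N_det = 1 + sensitivity / (SNR × false positive rate), which follows from solving the SNR relation for the prevalence; the function names are hypothetical, and the default SNR of 2 is the value used for the simulations in this chapter.

```python
def positive_likelihood_ratio(sensitivity, false_positive_rate):
    """Test positive likelihood ratio, L_T+ = P(T+|D+) / P(T+|D-)."""
    return sensitivity / false_positive_rate


def detectable_rarity(sensitivity, false_positive_rate, snr=2.0):
    """Assumed form: N_det = 1 + sensitivity / (SNR * false positive rate)."""
    if not 0.0 < false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be in (0, 1]")
    return 1.0 + positive_likelihood_ratio(sensitivity, false_positive_rate) / snr


def ppv_from_snr(snr):
    """Positive predictive value when true positives outnumber false positives by SNR."""
    return snr / (1.0 + snr)


if __name__ == "__main__":
    # Hypothetical test: 90% sensitivity, 1 false positive per 10,000 negatives.
    print("N_det:", detectable_rarity(0.9, 1e-4))
    print("PV+ at the detection limit:", ppv_from_snr(2.0))
```

Because N_det depends on the test only through L_T+, any cutoff that maximizes L_T+ also maximizes N_det for every choice of SNR.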
The test positive and test negative likelihood ratios (L_T+ and L_T−) are an alternate basis to sensitivity and specificity for a biomarker ROC, as first described by [54]. This basis has the advantage that one axis of the ROC is directly related to detectable rarity through Equation 8. I created a simulation in Matlab 2015a (Mathworks, Cambridge, MA) to model D+ and D− populations with configurable mean, standard deviation, skewness, and kurtosis. I then calculated and plotted the z scored ROC and the likelihood ratio ROC. I found that to create a z scored ROC with a shape similar to my experimental results, I had to apply an excess kurtosis of 3 to my D+ and D− distributions. Figure 1 shows my results from simulations using normally distributed populations and populations with excess kurtosis of 3. I find that populations with excess kurtosis lead to a maximum in L_T+ as a function of cutoff, while for the normally distributed populations the likelihood ratio does not show a maximum value at any cutoff position.
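The behavior described above can be reproduced with a small analytic sketch. This is not the Matlab simulation itself: as a stand-in for a population with excess kurtosis of 3, it assumes a Student's t distribution with 6 degrees of freedom rescaled to unit variance, places D− at 0 and D+ at 4 standard deviations, and calls a cell test positive when its biomarker value exceeds the cutoff. All names are hypothetical.

```python
import math

NU = 6                                  # t distribution, excess kurtosis 6 / (NU - 4) = 3
T_SCALE = math.sqrt((NU - 2) / NU)      # rescales t(6) to unit variance (assumption)
SEPARATION = 4.0                        # D+ mean minus D- mean, in standard deviations


def normal_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def t6_tail(z):
    """P(X > z) for a unit-variance Student's t variable with 6 degrees of freedom."""
    t = z / T_SCALE
    theta = math.atan(t / math.sqrt(NU))
    c2 = math.cos(theta) ** 2
    # Closed form for even degrees of freedom: signed P(|T| <= |t|).
    a = math.sin(theta) * (1.0 + 0.5 * c2 + 0.375 * c2 * c2)
    return 0.5 * (1.0 - a)


def likelihood_ratio_curve(tail, cutoffs):
    """L_T+ = sensitivity / false positive rate at each cutoff."""
    return [tail(c - SEPARATION) / tail(c) for c in cutoffs]


cutoffs = [i * 0.1 for i in range(-20, 101)]    # cutoffs from -2.0 to 10.0
lr_normal = likelihood_ratio_curve(normal_tail, cutoffs)
lr_heavy = likelihood_ratio_curve(t6_tail, cutoffs)

peak = max(range(len(cutoffs)), key=lambda i: lr_heavy[i])
print("heavy tails: max L_T+ = %.0f at cutoff %.1f" % (lr_heavy[peak], cutoffs[peak]))
print("normal: L_T+ at the last cutoff = %.3g" % lr_normal[-1])
```

With heavy tails the two survival functions decay at the same power-law rate, so their ratio eventually falls back toward 1 and a finite optimum cutoff appears; with normal tails the ratio keeps growing as the cutoff moves away from the D− population.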
Figure 1: Panels a & b: Histograms of D+ and D− populations drawn from normal distributions and from distributions with excess kurtosis of 3. The populations are separated by 4 standard deviations in both cases. Panel (d) shows N_det.
2.4 Theory: Two Stage Tests

2.4.1 Sensitivity and Specificity of a Two Stage Binary Assay

Many CTC isolation techniques employ two or more stages. A diagram of a two stage system is shown in Figure 2 panel (a). I look at the theory for optimizing overall performance in multiple stage tests.

Figure 2: Panel a: Diagram of sample processing through a two stage test. The test negative output of the first test is discarded. The test positive output from the first test is used in the second test. The test positive output of the second test is test positive for the combined test. Panel b: Geometrical picture of two stage sensitivity showing that the two stage sensitivity must be smaller than the sensitivity of the individual stages. Panel c: Geometrical picture of the two stage specificity showing that the false positive rate [= 1 − specificity] must be smaller than that of the individual test stages.

In the two stage assay, the test positive output is the intersection of a positive first test with a positive second test, T+ = T_{1∩2}+. The sensitivity of this test is
$$P(T^{+} \mid D^{+}) = P(T_{1 \cap 2}^{+} \mid D^{+}) = P(T_{1}^{+} \cap T_{2}^{+} \mid D^{+}) \qquad (9)$$

Using the law of total probability, the combined sensitivity, P(T_{1∩2}+|D+), can be written as the sensitivity of the first test minus the probability of the objects that are positive in the first test but negative in the second test,

$$P(T_{1 \cap 2}^{+} \mid D^{+}) = P(T_{1}^{+} \mid D^{+}) - P(T_{1}^{+} \cap T_{2}^{-} \mid D^{+}) \qquad (10)$$

Since P(T1+ ∩ T2−|D+) is a probability and ranges between zero and one, Equation 10 indicates that the sensitivity of the combined test cannot exceed the sensitivity of the first test alone.

$$P(T_{1 \cap 2}^{+} \mid D^{+}) = P(T_{2}^{+} \mid D^{+}) - P(T_{1}^{-} \cap T_{2}^{+} \mid D^{+}) \qquad (11)$$

is also true and can be confirmed with the geometrical picture. Equation 11 indicates that the sensitivity of the combined test cannot exceed the sensitivity of the second stage alone. Thus, the combined test sensitivity must be no higher than that of the individual stages. The specificity of the overall test is

$$P(T^{-} \mid D^{-}) = 1 - P(T_{1 \cap 2}^{+} \mid D^{-}) \qquad (12)$$

Using the law of total probability, the specificity of the test can be written as

$$P(T^{-} \mid D^{-}) = P(T_{1}^{-} \mid D^{-}) + P(T_{1}^{+} \cap T_{2}^{-} \mid D^{-}) \qquad (13)$$

in which the first term is the specificity of just the first test. The geometrical picture gives the similar relation

$$P(T^{-} \mid D^{-}) = P(T_{2}^{-} \mid D^{-}) + P(T_{1}^{-} \cap T_{2}^{+} \mid D^{-}) \qquad (14)$$

indicating that the combined test specificity must be at least as large as the specificity of the second test alone as well. Thus, the combined test specificity must be no lower than that of the individual test stages.
2.4.2 Detectable Rarity in Multi Stage Binary Assays

Suppose one has two independent test stages, so that the combined test sensitivity can be written as P(T_{1∩2}+|D+) = P(T1+|D+) P(T2+|D+) and the false positive rate can be written as P(T_{1∩2}+|D−) = P(T1+|D−) P(T2+|D−). N_det for the individual test stages (i = 1, 2) is

$$N_{det,i} = 1 + \frac{P(T_{i}^{+} \mid D^{+})}{SNR \, P(T_{i}^{+} \mid D^{-})} \qquad (15)$$

and N_det for the combined test is

$$N_{det,1 \cap 2} = 1 + \frac{P(T_{1}^{+} \mid D^{+}) \, P(T_{2}^{+} \mid D^{+})}{SNR \, P(T_{1}^{+} \mid D^{-}) \, P(T_{2}^{+} \mid D^{-})} \qquad (16)$$

The sensitivity of the individual test stages can be written in terms of the N_det value of the individual stages, their false positive rate, and SNR as

$$P(T_{i}^{+} \mid D^{+}) = SNR \, (N_{det,i} - 1) \, P(T_{i}^{+} \mid D^{-}) \qquad (17)$$

Substituting Equation 17 into Equation 16, the expression simplifies to

$$N_{det,1 \cap 2} = 1 + SNR \, (N_{det,1} - 1)(N_{det,2} - 1) \qquad (18)$$

For M test stages, the expression for N_det becomes

$$N_{det} = 1 + SNR^{M-1} \prod_{i=1}^{M} (N_{det,i} - 1) \qquad (19)$$

Thus, for the case of independent test stages and N_det much greater than 1, N_det of the combined test is proportional to N_det of the individual tests multiplied together. This conclusion can also be derived in terms of likelihood ratios from Equation 8.
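A short numerical check of this combination rule is sketched below. It assumes N_det = 1 + sensitivity / (SNR × false positive rate) for each stage and independent stages, so that sensitivities and false positive rates multiply; the stage values and function names are hypothetical.

```python
def n_det(sensitivity, false_positive_rate, snr=2.0):
    """Assumed single-stage form: N_det = 1 + sensitivity / (SNR * false positive rate)."""
    return 1.0 + sensitivity / (snr * false_positive_rate)


def combine_independent(stage_n_dets, snr=2.0):
    """Combined N_det = 1 + SNR**(M - 1) * product of (N_det_i - 1) over M stages."""
    product = 1.0
    for value in stage_n_dets:
        product *= value - 1.0
    return 1.0 + snr ** (len(stage_n_dets) - 1) * product


# Hypothetical stages given as (sensitivity, false positive rate).
stages = [(0.8, 0.01), (0.9, 0.001)]

# Direct route: multiply sensitivities and false positive rates, then apply the formula.
total_sens, total_fpr = 1.0, 1.0
for sens, fpr in stages:
    total_sens *= sens
    total_fpr *= fpr
direct = n_det(total_sens, total_fpr)

# Indirect route: combine the per-stage detectable rarities.
combined = combine_independent([n_det(s, f) for s, f in stages])

print(direct, combined)
```

Both routes agree, which is the likelihood ratio product rule restated in terms of detectable rarity.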
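2.4.3 Treatment of Independent Versus Dependent Test Stages

Figure 3: Diagram of a two stage system in D+ and D− populations. Panel (a) shows a relation where Equation 24 is not satisfied due to a large overlap between T1+ and the D− population. Panel (b) shows the typical case, where both test stages seek to reduce the false positive rate of the assay.

The independent test case, in which N_det of the combined test is the product of the individual test stages (Eq. 19), seems to be the upper limit on how test stages combine in a multistage test. For a multistage test, the overall test sensitivity is

$$P(T_{1 \cap 2}^{+} \mid D^{+}) = P(T_{1}^{+} \mid D^{+}) \, P(T_{2}^{+} \mid D^{+} \cap T_{1}^{+}) \qquad (20)$$

and the overall false positive rate is

$$P(T_{1 \cap 2}^{+} \mid D^{-}) = P(T_{1}^{+} \mid D^{-}) \, P(T_{2}^{+} \mid D^{-} \cap T_{1}^{+}) \qquad (21)$$

where P(T2+|D+ ∩ T1+) is the stage 2 sensitivity, which is dependent on stage 1, and P(T2+|D− ∩ T1+) is the stage 2 false positive rate, which is dependent on stage 1. The combined test detectable rarity is

$$N_{det,1 \cap 2} = 1 + SNR \, (N_{det,1} - 1)(N_{det,2|1} - 1) \qquad (22)$$

where N_det,2|1 is the stage 2 detectable rarity dependent on stage 1,

$$N_{det,2|1} = 1 + \frac{P(T_{2}^{+} \mid D^{+} \cap T_{1}^{+})}{SNR \, P(T_{2}^{+} \mid D^{-} \cap T_{1}^{+})} \qquad (23)$$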
For most cases, the dependent stage 2 detectable rarity will be less than or equal to the independent stage 2 detectable rarity, N_det,2|1 ≤ N_det,2. This inequality is equivalent to saying that the stage 2 dependent likelihood ratio is no greater than the independent likelihood ratio,

$$\frac{P(T_{2}^{+} \mid D^{+} \cap T_{1}^{+})}{P(T_{2}^{+} \mid D^{-} \cap T_{1}^{+})} \leq \frac{P(T_{2}^{+} \mid D^{+})}{P(T_{2}^{+} \mid D^{-})} \qquad (24)$$

There are exceptions to Equation 24 being true, but I argue that it is true for most assays, as shown in Figure 3. Exploring this, Figure 3 panel a shows a case where Eq. 24 is not true and the stage 2 dependent test has a greater likelihood ratio after stage 1. Note the limited occupation T1+ has in the D+ population while still including a large amount of the D− population. Here T1 is not a test with high detectable rarity, nor one that seems very useful. Figure 3 panel b shows a more realistic case where T1+ and T2+ both exclude a large amount of the D− population and have low false positive rates. This is the more realistic design, and in this case Equation 24 holds.

2.4.4 ROC for Two Stage Tests Using the Same Biomarker

Estimating the total test sensitivity and specificity of a system that consists of a binary selection step, such as immunoenrichment, immune depletion or size screening, followed by measurement of a continuous biomarker is of interest, since this is the process employed by many CTC detection assays. Here I will evaluate how the first enrichment step alters the ROC when the same biomarker is used for both test stages. In the next section I will generalize this theory to the use of different biomarkers for stage 1 and stage 2.
Figure 4: Diagram of population separation using continuous biomarkers through a two stage system. The effect of the first test is modeled as a selection function on a continuous biomarker, x (panel b), in which cells in the red population are more likely to be selected as test positive, T1+. The resulting enriched distributions are shown in panel c) and the discarded distributions are shown in panel d).

The D+ and D− populations each have a probability distribution across a given continuous biomarker. These are shown in Figure 4 (D+ [red] and D− [blue]) and are represented as normal distributions. Mathematically we define f_D+(x) and f_D−(x) to represent the disease positive and disease negative probability density functions. One can represent the effect of T1 selection as a modulation function, M(x), shown in Figure 4 panel b. M(x) takes values between 0 and 1 and is associated with a Bernoulli random variable whose selection probability, p, depends on the value of the continuous biomarker, that is, p = p(x). M(x) is reasonably modeled as an error function using the expression (25), where μ_s is the center position of the enrichment transition and σ_s is its spread.
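One common way to write an error-function transition of this kind is shown below. It is offered only as an assumed parameterization: the factor of √2, which makes σ_s behave like a standard deviation of the transition, is a convention chosen here and may differ from the normalization used in Equation 25.

```latex
M(x) = \frac{1}{2}\left[ 1 + \operatorname{erf}\!\left( \frac{x - \mu_s}{\sqrt{2}\,\sigma_s} \right) \right]
```

With this choice M(x) rises monotonically from 0 to 1 and equals 1/2 at x = μ_s, so cells with biomarker values well above μ_s are almost always selected and cells well below it are almost always discarded.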
Multiplying M(x) and 1 − M(x) with the disease positive and disease negative populations gives the probability of a cell being selected into the T1+ or T1− group. Since the cell must be in one of these groups,

$$f_{D\pm}(x) = M(x) \, f_{D\pm}(x) + [1 - M(x)] \, f_{D\pm}(x) \qquad (26)$$

The result is the same as the law of total probability (27). One can now consider the sensitivity and specificity of the test as a function of cutoff position, x = C. The sensitivity is given by (28) and the specificity is given by (29). The ROC curves for the enriched and non enriched cases of Figure 4 are shown in Figure 5. The enriched ROC shows saturation because of the loss of cells in the enrichment stage.
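The saturation of the enriched ROC can be illustrated numerically. The sketch below is not the author's code and rests on stated assumptions: normal D− and D+ populations centered at 0 and 4 with unit standard deviation, an erf-shaped M(x) with hypothetical center 3 and spread 1, and a cell counted as test positive when its biomarker value is at or above the cutoff C, so that the enriched sensitivity is the integral of M(x) f_D+(x) from C upward.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def modulation(x, mu_s, sigma_s):
    """Assumed erf-shaped selection probability M(x), rising from 0 to 1."""
    return 0.5 * (1.0 + math.erf((x - mu_s) / (math.sqrt(2.0) * sigma_s)))


def enriched_fraction(cutoff, mu, sigma, mu_s, sigma_s, upper=12.0, steps=4000):
    """Trapezoid estimate of the integral of M(x) * f(x) from cutoff to upper."""
    h = (upper - cutoff) / steps
    total = 0.0
    for i in range(steps + 1):
        x = cutoff + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * modulation(x, mu_s, sigma_s) * normal_pdf(x, mu, sigma)
    return total * h


MU_S, SIGMA_S = 3.0, 1.0            # hypothetical enrichment center and spread
cutoffs = [-8.0, 0.0, 2.0, 4.0, 6.0]
roc = [(enriched_fraction(c, 0.0, 1.0, MU_S, SIGMA_S),    # false positive rate
        enriched_fraction(c, 4.0, 1.0, MU_S, SIGMA_S))    # sensitivity
       for c in cutoffs]

max_sensitivity = roc[0][1]         # cutoff far below both populations
for c, (fpr, sens) in zip(cutoffs, roc):
    print("C = %5.1f  FPR = %.3e  sensitivity = %.4f" % (c, fpr, sens))
```

Even with the cutoff far below both populations, the sensitivity cannot exceed the fraction of D+ cells that survived enrichment, which is why the enriched ROC flattens out below 1.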
Figure 5: The ROC of the biomarker post enrichment (blue) follows the ROC of the biomarker without enrichment for low false positive rates and is seen to saturate at higher false positive rates. This is caused by cell loss in the D+ population due to the stage 1 enrichment. The D+ and D− distributions are normally distributed and are shown in Figure 4.

2.4.5 ROC for Two Stage Tests Using Different Biomarkers

I generalize the previous results to the case where different biomarkers are used in the two test stages, as would be expected in a realistic test, for example immunoenrichment for EpCAM positive cells followed by image cytometry with a CD45 biomarker. I now look at the effect on sensitivity and specificity for this case. The two biomarkers make the D+/D− populations describable through a two dimensional probability density function, f(x,y), also known as a joint probability density function. The enrichment function, M(x), acts on only one of these dimensions. Figure 6 shows a simulation for two cases, one with statistically independent biomarkers, BM1 and BM2, and one with biomarkers that are correlated, BM1 and BM3. In general, one would expect biomarkers that select for CTCs to be correlated. The law of total probability, Eq. 26, defining the sensitivity and specificity of the first stage becomes
(30)

As a function of cutoff position on the stage 2 biomarker, the sensitivity of the combined first and second stage is (31), and as a function of cutoff position on the stage 2 biomarker, the false positive rate of the combined first and second stage is (32). One can relate the combined false positive rate to the combined specificity through Equation 12. Additional dimensions can be added to further generalize this theory to 3 or more biomarkers.

2.4.6 Creating Overall Test ROC from Two Stage Test Using Different Biomarkers

Suppose one is given tubes of D+ and D− cells that have been put through enrichment. With knowledge of the number of D+ and D− cells pre enrichment, one can determine the overall test ROC. Measurement of a biomarker will produce distributions that can be used to compute the overall test "ROC". I put ROC in quotes because this is not a typical ROC but one with a dependent sensitivity and false positive rate. As a function of cutoff position, the sensitivity axis of this ROC is not the overall sensitivity but is instead P(T2+|D+ ∩ T1+), and the measured false positive rate is P(T2+|D− ∩ T1+). The measured sensitivity can be converted to overall test sensitivity by
$$P(T_{1 \cap 2}^{+} \mid D^{+}) = P(T_{1}^{+} \mid D^{+}) \, P(T_{2}^{+} \mid D^{+} \cap T_{1}^{+}) \qquad (33)$$

and the measured false positive rate can be converted to overall test false positive rate with

$$P(T_{1 \cap 2}^{+} \mid D^{-}) = P(T_{1}^{+} \mid D^{-}) \, P(T_{2}^{+} \mid D^{-} \cap T_{1}^{+}) \qquad (34)$$

With knowledge of the number of D+ and D− cells at the stage 1 input, N_D1+ and N_D1−, and the number of cells in those groups post enrichment used to compute the ROC, N_D2+ and N_D2−, the compensating sensitivity is

$$P(T_{1}^{+} \mid D^{+}) = \frac{N_{D2+}}{N_{D1+}} \qquad (35)$$

and the compensating false positive rate is

$$P(T_{1}^{+} \mid D^{-}) = \frac{N_{D2-}}{N_{D1-}} \qquad (36)$$

2.4.7 Simulation: ROC Pre and Post Enrichment for Dependent and Independent Biomarkers

The effect that the enrichment step has on overall ROC performance, and how it relates to the dependency between biomarkers, is of interest in understanding commonly used multistage assays. To investigate this, I created and compared ROCs for two stage tests that use biomarkers that are dependent and independent between the stages. I simulated the populations for each biomarker to have excess kurtosis of 3, as shown in Figure 1. I created the D+ and D− populations in two biomarker spaces, BM1 and BM2. The D+ and D− distributions on BM1 and BM2 are independent since they were generated with different random number generation calls. To create a dependent biomarker, BM3, I created a biomarker that was a linear combination of BM1 and BM2 at a 45/55 ratio, BM3 = 0.45 BM1 + 0.55 BM2. Those ratios were chosen to make the lines clear for visualization in panels a) and b) of Figure 6.
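The conversion from the measured post-enrichment rates to overall test rates described in Section 2.4.6 amounts to rescaling each axis by the fraction of cells that survived stage 1. A minimal sketch follows, assuming the stage 1 rates are estimated as post-enrichment cell counts divided by input cell counts; the function name and the example numbers are hypothetical.

```python
def overall_rates(measured_sens, measured_fpr,
                  n_pos_in, n_pos_out, n_neg_in, n_neg_out):
    """Convert stage 2 rates measured on enriched cells into overall test rates.

    measured_sens: P(T2+ | D+ and T1+), measured on the enriched D+ cells
    measured_fpr:  P(T2+ | D- and T1+), measured on the enriched D- cells
    n_*_in / n_*_out: cell counts before and after stage 1 enrichment
    """
    stage1_sens = n_pos_out / n_pos_in      # fraction of D+ cells kept by stage 1
    stage1_fpr = n_neg_out / n_neg_in       # fraction of D- cells kept by stage 1
    return stage1_sens * measured_sens, stage1_fpr * measured_fpr


# Hypothetical experiment: 1,000 D+ cells in, 800 recovered;
# 1,000,000 D- cells in, 1,000 carried through enrichment.
sens, fpr = overall_rates(0.9, 0.01, 1000, 800, 1_000_000, 1000)
print("overall sensitivity:", sens)
print("overall false positive rate:", fpr)
```

Applying this at every cutoff on the stage 2 biomarker turns the measured curve into the overall test curve.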
I simulated positive selection using the modulation functions shown in panels c) & d) of Figure 6. I used those modulation functions, M(x), to determine the probability that an event would be selected given its BM1 value. To perform selection, for each simulated cell I sampled a uniform random variable ranging between 0 and 1 and selected the cell if the sampled random number was less than the selection probability, M(x). Thus, events with high selection probability are more likely to get selected than those with low selection probability. I calculated the ROC and detectable rarity of BM2 and BM3 before and after this enrichment step, as shown in Figure 6 panels e) & f). From the results, one sees that even though BM3 performs better than BM2 before enrichment, the post enrichment performance of BM3 improves less than that of BM2 due to its dependence on BM1.
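The selection step described above can be sketched as follows. This is an illustration rather than the original Matlab simulation: for brevity it assumes normally distributed biomarkers instead of populations with excess kurtosis of 3, an erf-shaped M(x) with hypothetical center 2 and spread 0.5 on BM1, and D− and D+ means of 0 and 4.

```python
import math
import random


def modulation(x, mu_s=2.0, sigma_s=0.5):
    """Assumed erf-shaped selection probability on BM1."""
    return 0.5 * (1.0 + math.erf((x - mu_s) / (math.sqrt(2.0) * sigma_s)))


def simulate_cells(n, mean, rng):
    """Return (BM1, BM2, BM3) tuples; BM1 and BM2 are independent draws."""
    cells = []
    for _ in range(n):
        bm1 = rng.gauss(mean, 1.0)
        bm2 = rng.gauss(mean, 1.0)
        bm3 = 0.45 * bm1 + 0.55 * bm2       # dependent biomarker from the text
        cells.append((bm1, bm2, bm3))
    return cells


def enrich(cells, rng):
    """Keep a cell when a uniform draw falls below its selection probability M(BM1)."""
    return [cell for cell in cells if rng.random() < modulation(cell[0])]


rng = random.Random(0)
d_pos = simulate_cells(20000, 4.0, rng)
d_neg = simulate_cells(20000, 0.0, rng)
kept_pos = enrich(d_pos, rng)
kept_neg = enrich(d_neg, rng)

frac_pos = len(kept_pos) / len(d_pos)       # stage 1 sensitivity estimate
frac_neg = len(kept_neg) / len(d_neg)       # stage 1 false positive rate estimate
print("kept D+: %.3f  kept D-: %.3f" % (frac_pos, frac_neg))
```

The comparison direction matters: keeping a cell when the uniform draw is below M(x) makes the keep probability equal to M(x), whereas the reverse comparison would preferentially discard the cells the enrichment is meant to retain.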
Figure 6: Simulated D+ events (red) and D− events (blue) with distributions on biomarkers BM1, BM2, and BM3 (panels a and b). BM2 is independent (ID) of BM1 while BM3 is dependent (Dep.) on BM1. The distributions are put through an enrichment stage on BM1 with the cutoff shown in panels (c and d), resulting in the new distributions shown in panels (e and f). Panels (g and h) show the ROC of BM2 and BM3 before and after enrichment, with maxima in N_det annotated as +. Although the performance of BM3 is seen to be better than that of BM2 pre enrichment, its performance improves less than that of BM2 post enrichment due to its dependence on BM1.
2.4.8 Simulation: The Optimal Enrichment Cutoff Position

I next asked how the maximum detectable rarity of the overall test changes as a function of enrichment cutoff position, and how this is affected by the dependence between the enrichment stage I biomarker and the stage II biomarker. To answer this question, I developed a simulation similar to the previous one, involving two independent biomarkers, BM1 and BM2, and a third dependent biomarker, BM3 = 0.30 BM1 + 0.70 BM2. These ratios were chosen to make the differences in curves easy to see graphically. I varied the cutoff position, shown in panels c) & d) of Figure 6, from left to right, computed the overall test ROC at each enrichment cutoff position, and recorded the value of N_det for stage 1 and the maximum value of N_det on BM2 and BM3 pre and post enrichment.

The log10 of the percentage by which the D− population is reduced by an enrichment is commonly reported as a performance metric for the enrichment, referred to as enrichment efficiency. Moving the cutoff position of the enrichment ever further from the D− population continuously increases the enrichment efficiency. Thus, the cutoff position of the enrichment and the log of the enrichment efficiency are equivalent, and they are shown on the top and bottom axes of Figure 7. What I find is that if the D−/D+ populations have excess kurtosis on the stage 1 biomarker, then there is a cutoff position of the enrichment that produces a maximum detectable rarity, N_det, while if the D−/D+ populations are normally distributed, then N_det increases as the cutoff position is moved ever further from the D− population. I show an example of this in Figure 7 panel a. When the D+ and D− populations are not normally distributed, the cutoff position for the enrichment that maximizes detectable rarity is the same as
the cutoff position where the test positive likelihood ratio shows a maximum on the likelihood ratio ROC. In Figure 7 panel b, the combined test detectable rarity for a biomarker independent of the enrichment biomarker, post enrichment (blue dashed trace), is the product of the detectable rarity for the enrichment biomarker with the detectable rarity of the stage 2 independent biomarker, as predicted by Equation 19. When the stage 2 biomarker is dependent on the stage I biomarker, the combined performance (red dashed trace) is less than the product of the individual stages, and the optimal stage 2 cutoff position maximizing overall test detectable rarity is seen to shift. For D+/D− populations with excess kurtosis, the location of the cutoff giving the maximum N_det also depends on the separation between the D+ and D− populations on the enrichment biomarker. For the simulations shown in Figures 6 & 7, I chose a separation of four standard deviations. An important concluding point is that there can be an optimal log of enrichment, since this is directly related to the optimal cutoff position of the enrichment, which I show depends on the distribution of the D+ and D− populations on the enrichment biomarker, with kurtosis being an important property describing their spread.
Figure 7: The detectable rarity of stage I and of the overall test as a function of the cutoff position on the stage I enrichment biomarker. Panel a) shows the detectable rarity of the stage I enrichment alone. For the case of D+ and D− drawn from a normal distribution, the detectable rarity increases as the cutoff position is moved ever further away from the D− population. For D+ and D− populations with excess kurtosis of 3 (BM1), a maximum is seen in detectable rarity as a function of cutoff position. In panel b), BM1 is used as the enrichment biomarker and BM2 from Figure 6 as the stage II biomarker. The performance of the overall test post enrichment (blue dashed line) follows that of the detectable rarity of BM1 alone combined with that of BM2, as predicted by Equation 19. When the stage II biomarker is dependent on the stage I biomarker (BM3 from Figure 6), the overall test detectable rarity underperforms that of the independent case.

2.5 Theory: Cell Loss in Processing

Cell loss is almost always present in the numerous processing steps needed to bring a whole blood cell fraction through an assay. How does this loss affect the sensitivity and specificity of the assay? If the cell loss is homogeneous across the biomarker of interest, its
effect is to proportionally lower the number of test positive objects produced by the test. This scales the sensitivity with loss, P_loss(T+|D+), to a fixed fraction of the sensitivity without loss, P_noloss(T+|D+), and scales the false positive rate, P_loss(T+|D−), to the same fraction of P_noloss(T+|D−). This reduced false positive rate means more cells can be used to compensate. Note that this is different from increasing cell numbers to compensate for low test sensitivity, which does not change the specificity of the test. Analysis of positive predictive value and N_det shows that, in the case of uniform cell loss, the effects on sensitivity and false positive rate cancel, leading to identical PV+ and detectable rarity. It is often not checked whether cell loss is homogeneous across the cell groups of interest. Also, compensating for cell loss requires using larger blood samples, which may be detrimental to patients.

2.6 Theory: Sampling Considerations and Volume of Blood to Screen

Considerations of the number of cells that need to be screened in order to have a good chance of accurately identifying a rare 1 in 1 million cell prevalence have been previously described [55], [56]. This previous analysis modeled the problem as a series of Bernoulli trials following a binomial distribution and assumed perfect identification. I believe the optimal number of cells to screen is the number needed to adequately sample the detection limit of the assay, p = N_det^−1. Working through the analysis, assume the probability of a cell being a CTC is p; then the probability of finding k cells in n cell measurements is

$$P(X_n = k) = \binom{n}{k} p^{k} (1 - p)^{n - k} \qquad (37)$$

The probability of finding greater than or equal to k cells in n measurements, P(X_n ≥ k), is of interest. P(X_n ≥ k) can be written as
$$P(X_n \geq k) = \sum_{i=k}^{n} \binom{n}{i} p^{i} (1 - p)^{n - i} \qquad (38)$$

One should control this value such that it is greater than an assurance level, P(X_n ≥ k) ≥ 1 − α. The level α is the probability that one does not find a cell of interest. Often this value is set to 5%. The assurance level is increased by increasing the number of measured samples n. Equation 38 can be rewritten as its complement,

$$P(X_n \geq k) = 1 - \sum_{i=0}^{k-1} \binom{n}{i} p^{i} (1 - p)^{n - i}$$
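The sample size implied by this binomial argument can be computed directly. The sketch below assumes a target of P(X ≥ k) ≥ 1 − α with α = 5% and uses the complement sum over 0 to k − 1 cells; the function names are hypothetical, and the search strategy (doubling followed by bisection) is simply a convenient choice.

```python
import math


def prob_at_least_k(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), using the complement sum over 0..k-1."""
    log_q = math.log1p(-p)
    below = 0.0
    for i in range(k):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * log_q)
        below += math.exp(log_term)
    return 1.0 - below


def cells_to_screen(p, k=1, alpha=0.05):
    """Smallest n with P(X >= k) >= 1 - alpha."""
    lo, hi = k, k
    while prob_at_least_k(hi, k, p) < 1.0 - alpha:    # grow until the target is met
        lo, hi = hi, hi * 2
    while lo < hi:                                    # bisect to the smallest such n
        mid = (lo + hi) // 2
        if prob_at_least_k(mid, k, p) >= 1.0 - alpha:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    p = 1.0 / 1000.0     # hypothetical detection limit, N_det = 1000
    for k in (1, 3):
        print("k = %d: screen at least %d cells" % (k, cells_to_screen(p, k)))
```

For k = 1 the result reduces to the smallest integer n satisfying (1 − p)^n ≤ α, so roughly three times N_det cells are needed at a 5% level.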
The other question is what the value of k should be such that there is assurance that one finds k or more cells. Clearly k needs to be greater than 0. Could it be 1? This is the same question just discussed: it can be, and n can be found for a k of 1 with Eq. 39. As one increases the value of k, one must search more cells to achieve the cutoff. The way k = 1 achieves this confidence is by sampling enough cells that it is most likely that more than 1 cell will be found. Although k could be one by the above argument, perhaps a little more caution should be taken. One may note that, for rare events, the standard deviation of a binomial random variable is approximately the square root of its expectation value. Thus, the standard deviation of k = 1 would also be 1. It is therefore desirable to increase k so that k minus its standard deviation is greater than 1, requiring a minimum number of three cells.

2.7 Discussion

One interesting result is that the detectable rarity for a multistage test on independent test biomarkers is the product of the detectable rarities of each of the individual biomarkers. This result falls out of the theory of likelihood ratios. The result points to a path to detect ever rarer events in many applications, including CTCs. I used analytical arguments and simulation to show that the independent case is the upper limit on how test stages combine in a multistage assay. Other transformations of the ROC axes have been proposed in addition to the likelihood ratio axes explored here. Recently, [60], [61] have shown transformations to convert ROC to metrics of Shannon entropy. Indeed, some of the more recent mathematical development of ROC is related to their application in imaging [62], [63]. The prominent place that likelihood ratios take in this analysis is of note. In addition to metrics and analysis focused on Shannon
information, the diagnostic odds ratio is worth mentioning as it is independent of disease prevalence [64]. I point out that my approach here differs from prior work in separating the question of what prevalence of cell a test can detect from what that prevalence actually is. The prevalence of cell that a test can detect is a quantification purely of the test, while the actual cell prevalence is unknown until an accurate enough test is made to see it. The connection between Shannon information and diagnostics is an intriguing one. [65] has provided an analysis of how sensitivity and specificity relate to a diagnostic's capacity as an information channel. Early connections between diagnostics and information entropy are attributed to Akaike [66]. [67] presents more recent work on using information entropy in diagnostics, and [68] provides a recent review of information metrics in general. Interest in these methods is high because information theory provides a deeper theoretical understanding, similar to what Shannon did in the field of communications. One application of Shannon's theory of information in diagnostics is quantifying the redundancy in testing [69]. Overall, I note that the main difference between diagnostic and communications systems is that in communications one controls encoding and decoding, while in diagnostics one only controls decoding.

Regressing two independent biomarkers against D+ and D− will produce a new biomarker that outperforms them but is also dependent on them. If one of the independent biomarkers is also used for the enrichment, then post enrichment this regressed biomarker will perform worse than the other independent biomarker it is built from. Thus, finding cutoffs maximizing N_det for each biomarker and then using these cutoffs in each test step may help to detect rarer cells than regression techniques. I showed the connection between detectable rarity and likelihood ratios.
([52] on pg. 182) came to a similar conclusion, stating, "The chief conclusion obtained from the general theory of signal detectability presented in Section 2 of this paper is that a receiver which calculates the likelihood ratio for each receiver input is the optimum receiver ..." [52] noted that the difficulty is in determining the ordering of the test gates. In my work, I have seen that PCA combined with regression does not improve regression performance [71]. However, methods such as PCA may be helpful to generate a basis for test gating, as they produce statistically uncorrelated basis vectors.

Further generalizing these results, a question that comes up is: what are the limitations of biomarkers such as prostate specific antigen (PSA) and cancer antigen 125 in the detection of disease, with particular interest in early detection? Assay sensitivity and specificity set the rarity of event that an assay can detect for these biomarkers as well. For example, the prevalence of many solid cancers such as ovarian cancer is 1 in 10 000 [3], [4], [72]. Therefore, by the presented theory, a useful early diagnostic with perfect sensitivity should have a false positive rate better than 1 in 20 000.

The best way to evaluate the sensitivity and specificity of the overall system is to run disease positive and disease negative cell populations through the complete assay and observe the frequency of test positives in the disease positive population, P(T_{1∩2}+|D+), which is the sensitivity, and the frequency of test positives in the disease negative population, P(T_{1∩2}+|D−), which is the false positive rate, the complement of the specificity. I showed that the parameters needed in the calculation of the overall ROC are the number of D+ and D− cells before stage I and the measurement of the continuous biomarkers at the depletion output.
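The 1 in 20 000 figure quoted above for a disease prevalence of 1 in 10 000 can be checked with a short worked calculation. It assumes the relation N_det = 1 + sensitivity / (SNR × false positive rate), perfect sensitivity, and the minimum SNR of 2 used for the simulations in this chapter:

```latex
P(T^{+} \mid D^{-})
  = \frac{P(T^{+} \mid D^{+})}{SNR \, (N_{det} - 1)}
  = \frac{1}{2 \times (10\,000 - 1)}
  \approx 5.0 \times 10^{-5}
  \approx \frac{1}{20\,000}
```

A larger required SNR or an imperfect sensitivity would tighten this requirement further.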
CHAPTER III

ENGINEERING OF CELL LABELING TECHNOLOGY

3.1 Introduction

The motivation for the design of the labeling device was to reduce cell loss during the labeling process and to allow the area of the resulting cell field to be configured. Current devices for laying down cell suspensions do not allow the size of the cell field to be configured, and the field is too large in area to be compatible with imaging of cell fractions of fewer than 10 000 cells. In this chapter, I also describe a coverslip mounter I employed for slowly applying coverslips to prepared filter samples, a protocol for labeling cells for DNA, lipids, CD45, and CK, and example images produced with the combined device and protocol.

3.2 Device for Cell Labeling

A side view schematic of the labeling device is shown in Figure 9. A nine lane version of the device (for holding nine samples in parallel) was machined from polycarbonate, with the alignment plate 3D printed in acrylonitrile butadiene styrene (ABS). Manual ball valves and a manifold were used to connect the device to vacuum and control fluid flow in the individual lanes. I also 3D printed the plate in Figure 10 for securing the
Figure 9: Side cross section of the labeling device. The device consists of an input head, an alignment plate, and an output head. The input and output heads sandwich o-rings set with the alignment plates to create a seal on the polycarbonate filter, as shown in detail 1. The input head has a threaded connector (a) for applying positive pressure. The volume of the staining reservoir (b) is controlled by the diameter (c), while its height is set to the length of a standard gel loading pipette tip. This choice of height prevents damage to the filter while enabling bubble free loading. The diameter of the laydown area is controlled by the o-ring diameter, which can be set by changing the diameters in the alignment plate (e). This plate must be thin enough to leave a compression gap (f) that enables sealing of the device. The output head contains a threaded connector (g) for pulling fluids from the device. All fluids flow in the direction of the dotted arrow (h).

The vacuum regulator (Figure 10 (e)) is used to pull fluid and prevents distortion of the filter. I set my vacuum level to 10 kPa. The filter is held in compression between the input and output heads using eight 4-40 socket head cap screws tightened to 4 in-oz of torque. For this work, I used 90 durometer Buna ANSI 004 o-rings (Rocket Seals, Denver, CO) to create the seal on the filter. The high durometer Buna o-ring is needed to prevent the o-ring from deforming the filter when the device is clamped.
Figure 10: Diagram of the assembled system. The lanes in the staining chamber (a) are pulled with vacuum controlled through the stop cock panel (b), which was fabricated with 3D printing. The vacuum splits out through the manifold (d) and is routed through 2 aspiration steps (d). To prevent stretching of the filter, vacuum pressure is controlled by the regulator (e), which is connected to the lab vacuum (f).

I used a 4.5 mm punch (Micromark #83513) to cut 13 mm diameter, 800 nm pore size track etched polycarbonate filters (ISOPORE ATTP01300) to 4.5 mm diameter. I am able to cut 2 to 4 filters from each 13 mm diameter filter. I punched the filter with the protective paper shipped with the filters placed above and below it.

3.3 Coverslip Mounter

A custom built setup was constructed for mechanical application of the coverslip to the slide on which I set the filter. I have found that it is easy to smear the cell bodies or introduce bubbles at this step when mounting by hand. My coverslip mounter uses a stage and vacuum to hold the coverslip to a plate. Coverslips are pressed onto the filter on the slide at 0.2 mm per second, and the vacuum holding the coverslip is then released. Movements to a preloading position and retraction of the plate after the vacuum was released were done at 1.2 mm per second (the top speed for this stage).
3.4 Protocol for DAPI, Bodipy, Pan-cytokeratin, and CD45 Labeling of Cells on Track Etched Polycarbonate Filters

The filter is lipophilic, and different fluorophores can also label the filter, producing a high background. To reduce background, filters are first treated with TrueBlack (Biotium #23007). Additionally, fixed cells do not naturally stick to the filter, which can cause issues with mounting on slides. I have found that treating the filter with poly-D-lysine before adding the cells, followed by treating the filter and cells with formaldehyde after adding the cells, causes the cells to remain bound. I presume that the formaldehyde cross links the cells with the poly-D-lysine bound to the filter. To avoid air bubbles in the device, the filter must not dry out during the labeling process as fluids are passed through in series.

This protocol uses the saponin based BD Perm/Wash buffer for permeabilization (Part #: 51-2091KZ). I also experimented with various concentrations of ethanol and methanol to permeabilize the cells. Comparing these methods, I found that the fluorescent dyes labeling cytokeratin and lipids in MCF7 cells were much brighter with the BD kit than with alcohols. My experiments indicate that using alcohols for permeabilization leaches out lipids, leading to lower Bodipy intensity.

The sequence of steps used in labeling and mounting cell samples follows.

1. Load device with filters. Perform loading submerged in water to prevent air bubbles. Tighten screws to 4 in-oz torque.
2. Pull residual water left from loading filters.
3. Load and pull 90 µL/lane 70% ethanol.
4. Make a solution of 1:4000 TrueBlack in 70% ethanol.
5. Load and pull the 1:4000 TrueBlack solution at 70 µL/lane.
6. Incubate at half volume (i.e. half the liquid remains on top of the filter) at room temperature for 10 minutes.
7. Load and pull 30 µL/lane PBS. Repeat once.
8. Load 30 µL/lane 0.01% v/v poly-D-lysine.
9. Incubate at half volume for 10 minutes.
10. Load and pull cell sample solution.
11. Load and pull 50 µL/lane PBS.
12. Load 30 µL/lane 4% formaldehyde (Sigma 47608) in water.
13. Incubate at half volume for 10 minutes on ice.
14. Load and pull 30 µL/lane 1x BD Perm/Wash.
15. Load 30 µL/lane 1x BD Perm/Wash.
16. Incubate on ice at half volume for 10 minutes.
17. Load 20 µL/lane IgG (Sigma I8640) at 20 µL per mL in 1x BD Perm/Wash.
18. Incubate on ice at half volume for 15 minutes (blocking).
19. Load and pull 30 µL/lane BD Perm/Wash.
20. Prepare antibody solution: 3 µL anti-PanCK Alexa555 (Cell Signaling Technologies 3478S) + 3 µL anti-CD45 Alexa633 (Biolegend 304020) in 100 µL 1x BD Perm/Wash.
21. Load 10 µL/lane of solution. Incubate at half volume for 30 minutes on ice.
22. Load and pull 30 µL/lane 1x BD Perm/Wash.
23. Apply DAPI (Sigma D9542) and Bodipy 493/503 (Fisher Scientific D3922) labels. Total solution: DAPI and Bodipy each at 1 µg/mL in 1x BD Perm/Wash.
24. Load DAPI/Bodipy solution at 30 µL/lane.
25. Incubate at half volume for 10 minutes at room temperature.
26. Load and pull 30 µL/lane NanoPure H2O.
27. Pull lanes to empty.
28. Mount by depositing 3 µL water on the slide and 15 µL ProLong Diamond (Fisher Scientific P36970) on the coverslip.
29. Deposit filter with cells facing up in the water drop on the slide.
30. Place coverslip over filter.
31. Secure coverslip corners with nail polish.

3.5 Example Resulting Cell Preparations

An example image of a full cell field and a sub region of interest, prepared using the device and with the 4 labels applied using the protocol, is shown in Figure 11.

Figure 11: A 1:1 sample of WBCs and MCF7 cells labeled for DAPI (blue), Bodipy (green), anti-pan-cytokeratin (yellow), and anti-CD45 (red). Panel a) Representative image of the cell spot produced by the labeling device. The diameter of the spot is 2.1 mm. The image is an 8x8 mosaic; full
resolution of the square area is shown in panel b). The image at full resolution is available in a public dataset [73].
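To put the 2.1 mm spot diameter in context, the short sketch below computes the spot area and the average center-to-center spacing that cells would have if they were spread evenly over that area. The helper names and the even-spreading assumption are illustrative only; real cell fields are not perfectly uniform.

```python
import math


def spot_area_mm2(diameter_mm):
    """Area of a circular cell spot in square millimeters."""
    return math.pi * (diameter_mm / 2.0) ** 2


def mean_cell_pitch_um(n_cells, diameter_mm):
    """Average spacing (in micrometers) if n_cells were spread evenly over the spot."""
    area_um2 = spot_area_mm2(diameter_mm) * 1e6
    return math.sqrt(area_um2 / n_cells)


area = spot_area_mm2(2.1)
print(f"spot area: {area:.2f} mm^2")
for n_cells in (1_000, 10_000):
    print(n_cells, "cells ->", round(mean_cell_pitch_um(n_cells, 2.1), 1), "um pitch")
```

A 2.1 mm spot works out to about 3.46 mm^2, close to the 3.4 mm^2 laydown area quoted in Section 4.2.2.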
CHAPTER IV

EXPERIMENTAL EVALUATION OF IMAGE CYTOMETRY FOR DNA, CD45, CYTOKERATIN, AND LIPIDS IN A MODEL SYSTEM FOR CIRCULATING TUMOR CELL IDENTIFICATION

4.1 Introduction

In this chapter, I analyze how the identification performance of image cytometry for DNA (DAPI), lipids (Bodipy), and CD45 compares to image cytometry for the classical biomarker panel of DNA (DAPI), cytokeratin (CK), and CD45 in a model system of disease positive (D+) MCF7 cells and disease negative (D−) white blood cells (WBCs) from human peripheral blood. A DNA, lipids, and CD45 panel is interesting because it does not use epithelial biomarkers and because the lipid label (Bodipy) is compatible with live cell imaging. The WBC MCF7 model system has previously been used by other investigators studying CTCs [24]-[29]. Additionally, the manufacturer of the cytokeratin antibody used MCF7 as the positive control.

Fatty acid synthase (FAS) has been shown to be overexpressed in many cancers [74]-[81]. Higher lipid content in CTCs isolated with the CK+/CD45− marker has been measured using coherent anti-Stokes Raman scattering (CARS) microscopy, showing a 7 fold higher lipid signal than other blood cells [30]. It is not clear whether the increased fatty acid content of these CTCs is due to increased de novo synthesis by the cells through FAS or to fatty acid uptake from the blood. One study has shown that treatment with a FAS inhibitor lowers the amount of neutral lipid staining (Bodipy) in a prostate cancer mouse model [82]. Also, there is a report of lipid droplets accumulating through cellular uptake
under hypoxic conditions through the HIF A pathway [83]. Tumors are thought to be under hypoxic conditions, and this pathway may be a driver of lipid accumulation.

In my work, I assess how well lipids perform as an identifying biomarker for CTCs in a model system. I employ Bodipy to label lipids. Bodipy is a lipophilic dye that labels long chain fatty acids, including neutral lipids, and can provide lipid contrast similar to CARS while also being compatible with live cell imaging [84]. I test the following hypotheses: (H.1) image cytometry for a composite panel of DNA/lipids/CD45 will detect MCF7 cells with better performance than a standard DNA/CD45/CK panel, and (H.2) MCF7 cells have increased lipid content compared to white blood cells. The identification performance I report for the studied biomarkers are the distributions of features for the D+, and D− cell detectable .

I additionally test the hypothesis (H.3) that the spatial metrics of second moment, spatial frequency second moment, and their product, measured using image cytometry, can increase sensitivity and specificity over simple total content measurements. I hypothesized that cancer cell lines have a bigger nucleus than WBCs, a difference that could be measured by a second moment feature. Spatial frequency was considered as a metric to quantify structural differences associated with lipid droplets seen on microscopy. The importance of conjugate variables in signal processing and physics, and knowledge of their invariant products such as M² in laser beam profiling [85], [86], motivated looking at the product of the spatial second moment with the spatial frequency second moment, which is a unitless metric related to information content.
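The four per-channel metrics named above (total signal, second moment, spatial frequency second moment, and their product) are defined formally in Section 4.2.6. The sketch below shows one way they might be computed for a single-channel ROI with NumPy. It is not the Matlab/ImageJ code used in this work: the function name `spatial_metrics`, the discrete sums, and the weighting of the frequency moment by the FFT magnitude are assumptions made for illustration.

```python
import numpy as np


def spatial_metrics(roi, pixel_size=1.0):
    """Return (total signal, <r^2>, <f^2>, product) for a 2D intensity ROI."""
    roi = np.asarray(roi, dtype=float)
    total = roi.sum()  # zeroth moment: summed intensity over the ROI

    # First moment (centroid) and intensity-weighted second moment about it.
    yy, xx = np.indices(roi.shape).astype(float) * pixel_size
    cy = (yy * roi).sum() / total
    cx = (xx * roi).sum() / total
    r2 = (((yy - cy) ** 2 + (xx - cx) ** 2) * roi).sum() / total

    # Magnitude of the Fourier transform discards the linear phase,
    # so the result does not depend on where the object sits in the ROI.
    spectrum = np.abs(np.fft.fft2(roi))
    fy = np.fft.fftfreq(roi.shape[0], d=pixel_size)[:, None]
    fx = np.fft.fftfreq(roi.shape[1], d=pixel_size)[None, :]
    f2 = ((fy ** 2 + fx ** 2) * spectrum).sum() / spectrum.sum()

    return total, r2, f2, r2 * f2


# Synthetic example: a narrow and a wide Gaussian blob on a 64 x 64 grid.
gy, gx = np.mgrid[0:64, 0:64]
narrow = np.exp(-((gy - 32) ** 2 + (gx - 32) ** 2) / (2.0 * 2.0 ** 2))
wide = np.exp(-((gy - 32) ** 2 + (gx - 32) ** 2) / (2.0 * 4.0 ** 2))
for name, blob in (("narrow", narrow), ("wide", wide)):
    total, r2, f2, product = spatial_metrics(blob)
    print(name, round(r2, 2), round(f2, 5), round(product, 4))
```

In this synthetic case the wider blob has a larger spatial second moment and a smaller frequency second moment, while the product stays nearly the same for both, which mirrors the invariance that motivated the M² style metric.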
To perform image cytometry for these metrics, I coded my own segmentation and analysis software that uses both ImageJ and Matlab. Before doing this, I looked at using the ImageJ ROI analyzer and CellProfiler [70] to compute these metrics, but second moment, spatial frequency second moment, and their product are not currently available in these software packages. I refer to an image feature computed on one channel/label as a biomarker. Thus this 4 label data, on which 4 metrics were computed, resulted in a 16 biomarker image cytometry dataset. For each of these potential identifying biomarkers, I calculated and present univariate performance in this model system.

I present a regression analysis that linearly combines individual biomarkers to create a new biomarker with increased separation between the D+ and D− populations and thus increased performance. This regression technique is one of many machine learning techniques that can be used in this optimization problem. Recently, a comparison of Bayesian classifiers, k nearest neighbors, support vector machines, and random forests has been performed [87]. My regression analysis performed on par with the other machine learning methods tested in a WBC cell line model of CTCs [87], with further detail presented in Appendix G. In my approach, I train and test on pure samples of known class without using human identification. Prior CTC machine learning studies are motivated by a desire to reduce operator time and error and mostly focus on training and testing machine learning techniques against operator identified CTCs [87]-[91]. These studies have found automated methods to perform similarly to manual identification [88], [91] and unsupervised methods to perform similarly to supervised ones [90].
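The idea of linearly combining features into a single, better separated biomarker can be illustrated with a small least-squares sketch on synthetic data. This is not the analysis code used in this work: the helper names `cohens_d` and `fit_regression_biomarker`, the pooled standard deviation form of Cohen's d, and the Gaussian toy populations are all assumptions made for illustration.

```python
import numpy as np


def cohens_d(a, b):
    """Unsigned Cohen's d using the pooled standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (
        a.size + b.size - 2
    )
    return abs(a.mean() - b.mean()) / np.sqrt(pooled_var)


def fit_regression_biomarker(features, labels):
    """Least-squares fit of z-scored features to labels (1 = D+, 0 = D-).

    Returns a function that maps raw features to the combined biomarker.
    """
    mean, std = features.mean(axis=0), features.std(axis=0)
    design = np.column_stack([np.ones(len(features)), (features - mean) / std])
    coef, *_ = np.linalg.lstsq(design, labels, rcond=None)

    def score(x):
        return np.column_stack([np.ones(len(x)), (x - mean) / std]) @ coef

    return score


# Synthetic stand-ins for two features measured on pure populations.
rng = np.random.default_rng(0)
wbc = rng.normal(0.0, 1.0, size=(2000, 2))   # hypothetical D- cells
mcf7 = rng.normal(1.0, 1.0, size=(2000, 2))  # hypothetical D+ cells

X = np.vstack([wbc, mcf7])
y = np.concatenate([np.zeros(len(wbc)), np.ones(len(mcf7))])
score = fit_regression_biomarker(X, y)

d_single = [cohens_d(mcf7[:, k], wbc[:, k]) for k in range(X.shape[1])]
d_combined = cohens_d(score(mcf7), score(wbc))
print("single-feature d:", np.round(d_single, 2), "combined d:", round(d_combined, 2))
```

On the data used for fitting, the combined score separates the two populations at least as well as either feature alone. Whether that advantage holds on independent data is a separate question, which is why performance in this chapter is assessed on testing subsets.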
In my pure sample analysis, I can rigorously test the performance because any object identified as test positive in the D− WBC fraction is a false positive. My regression analysis shows that a small number of objects in my D− dataset are classified as MCF7s, and visual inspection confirms that these objects would be classified as D+ by a human operator. My theory points to these false positives being the limiting factor in rare cell identification.

Image cytometry work has been performed on the classification of WBC subtypes by image morphology [92]-[96]. There has been investigation of image cytometry specifically related to circulating tumor cells [88]. Many of the reports focus on the region of interest (ROI) segmentation problem [97]-[99]. There have also been studies of image cytometry using second spatial frequency moment analysis [100]. However, this work is the first I have seen that characterizes and quantifies the metrics of total signal, second moment, spatial frequency second moment, and their product on several biomarkers, with imaging performed on over 1000 cells.

4.2 Materials and Methods

4.2.1 Sample Inclusion and Abundances

The accuracy with which one estimates sensitivity, P(T+ | D+), is related to the number of samples in the D+ dataset, while the accuracy with which one estimates specificity, P(T− | D−), is related to the number of samples in the D− dataset. To estimate sensitivity and specificity to similar accuracy, I have used similar abundances of D+ and D− cells in this dataset. I have chosen the sample size of each dataset to be 1000 to 3000 cells. I also prepared 1:1 mixed samples to show qualitatively that the position of the modes seen in pure samples is not due to varying labeling conditions or acquisition settings. The mixed samples were not used in
the quantitative performance analysis. A threshold of 65 dBct [= 10 log10(counts)] on DAPI

White blood cells were isolated from peripheral blood taken from a different patient on each preparation day. MCF7 cells were prepared for a given day from cultured cells and therefore could have different properties due to confluence. In order to measure variations, the data was acquired on multiple days with samples in triplicate on each day. Statistical variations in performance within the same day (intra day) and between days (inter day) are compared. Training and testing data sets were acquired on experimental days 9, 10, 12, 13, 14, and 15. One WBC filter from day 14 and one WBC filter from day 15 were excluded due to poor quality. Thus this dataset comprised 52 filters in total. The data collected on days 5 and 7 used an older CD45 antibody that did not show good labeling and was excluded from analysis. The cells on day 8 were only labeled for DAPI and Bodipy and were also excluded.

4.2.2 Sample Preparation and Labeling

Samples of human WBCs, MCF7 cells, and 1:1 mixtures were prepared as follows. Peripheral blood samples were collected from the Gynecological Tissue and Fluid Bank Repository (COMIRB 07-0935/COMIRB 05-1081) from consenting patients undergoing surgery at the University of Colorado Hospital. White blood cell (WBC) fractions were isolated from peripheral blood with an ammonium chloride lysis protocol. MCF7 cells were cultured and trypsinized. Following red blood cell lysis, some MCF7s were added to a fraction of isolated WBCs at a 1:1 ratio to prepare a mixed sample. Pure populations of WBCs, MCF7s, and the 1:1 mixed sample were fixed using 4 percent formaldehyde in water. Formaldehyde was washed out and samples were stored at 4 degrees Celsius. DNA from the
MCF7 cell line was sequenced at the University of Colorado BioResources Core Facility and found to be a 100% match to MCF7.

The stored samples were labeled with DAPI (DNA), Bodipy (lipids) (Fisher Scientific D3922), anti-CD45 Alexa633 (Biolegend 304020), and anti-panCK Alexa555 (Cell Signaling Technologies 3478S) per the protocol described in Section 3.4. Labeling was performed in triplicate using a custom device that provides a seal against the track etched polycarbonate filters and places the cells in a uniform single layer on a 3.4 mm² area of the filter. The filters containing the cell samples were then mounted onto a microscope slide for imaging. The WBC, MCF7, and 1:1 samples were prepared on 8 separate days, producing 24 total samples. Each of these samples was labeled in triplicate on a separate day, and one day's worth of WBC, MCF7, and 1:1 samples was labeled twice, for 9 separate preparations of the filters. This resulted in 27 filters prepared with pure WBCs, 27 prepared with pure MCF7s, and 27 prepared with mixed samples, for a total of 81 filters.

4.2.3 Fluorescence Microscopy

Images of the full 3.4 mm² area for each sample preparation were acquired on a laser scanning confocal microscope (Carl Zeiss AG, LSM 780) using a 20X Zeiss Plan Apochromat 0.80 NA objective. The laser lines used for excitation were 488 nm (Bodipy), 561 nm (Alexa555), and 633 nm (Alexa647). The DAPI label was excited by two photon imaging using an ultrafast pulsed laser (Coherent Inc., Chameleon) tuned to 765 nm. I acquired data for all four channels by first imaging DAPI and Bodipy simultaneously, followed by acquisition of the Alexa555 and Alexa647 channels simultaneously. Cross talk between channels was assessed by performing imaging with just one of the excitation lasers
on at a time and observing no bleed through signal on the other channels. Additionally, imaging was performed with samples labeled only with DAPI and Bodipy, showing that the Alexa555 and Alexa633 channels did not show any signal even with all lasers on. Three axial confocal sections were acquired over a 10 µm scan range to accommodate samples that were not perfectly flat, and 8 x 8 tiled mosaic images were acquired for each channel and stitched together using software (Zen, Zeiss Inc.). Each tile of the image was 1024 x 1024 pixels with a pixel size of 0.42 x 0.42 µm. The full data set, showing images taken from 81 slides, has been made publicly available [73].

4.2.4 Image Processing

I performed image cytometry and calculated these features of the ROIs using custom segmentation and analysis software built on Matlab (MathWorks, Natick, MA), ImageJ version 1.50 [101], and the interfacing tool MIJ [102]. My code performs the majority of the image manipulations, including thresholding and segmentation, in ImageJ with a macro and then imports the channels and ROIs into Matlab for computation of the metrics for each ROI. The DAPI (DNA) channel was first smoothed using a 1.5 µm Gaussian filter, and a threshold of 2500 counts was applied. The thresholded areas for the channel were then dilated to increase the area, and a watershed algorithm was applied to aid in ROI separation. The analyze particles function in ImageJ was run on the binary masks, segmenting them into ROIs containing individual cells. After segmentation, the 16 pure WBC samples contained 24,699 ROIs, the 18 pure MCF7 samples contained 41,091 ROIs, and the 18 mixed samples contained 33,726 ROIs. The ImageJ application programming interface (API) was used to import the array of ROIs into Matlab, and MIJ was used to import the image data. I then computed my metrics
for each ROI. The image cytometry code performing these functions has been made publicly available [103].

4.2.5 Cohen's d

Cohen's d quantifies separation by normalizing the difference between the means of the two populations, $\mu_1$ and $\mu_2$, by the pooled standard deviation between them, $s$. I used the formula

$$ d = \frac{\mu_1 - \mu_2}{s}, \qquad s = \sqrt{\frac{(n_1 - 1)\sigma_1^2 + (n_2 - 1)\sigma_2^2}{n_1 + n_2 - 2}} \tag{40} $$

where $n_1$ and $n_2$ are the number of samples in each population, and $\sigma_1$ and $\sigma_2$ are the standard deviations of each population. Cohen considered weak, medium, and large effect sizes to be d = 0.2, d = 0.5, and d = 0.8 respectively [104]. The value of d can be positive or negative; the sign merely indicates whether the mean of population 1 is greater or less than that of population 2. The sign has been suppressed to prevent confusion, as it can be inferred from the means of the sample populations.

4.2.6 Calculation of Image Metrics

I selected spatial metrics that were invariant to rotation. The metrics are inspired by those used in laser beam profiling [85], [86]. Let $I(\mathbf{r})$ be the measured intensity of the image in a region of interest (ROI) as a function of spatial position $\mathbf{r}$. To quantify the spatial size of a labeled ROI, I used the second moment. Computation of the second moment involves use of the first moment, defined as

$$ \langle \mathbf{r} \rangle = \frac{\sum_{\mathbf{r}} \mathbf{r}\, I(\mathbf{r})}{\sum_{\mathbf{r}} I(\mathbf{r})} \tag{41} $$
which is also the centroid position. The denominator of the fraction is the zeroth moment, $\sum_{\mathbf{r}} I(\mathbf{r})$, or the sum of the intensity signal over the ROI. The second moment is a weighted average of the signal away from the centroid position,

$$ \langle r^2 \rangle = \frac{\sum_{\mathbf{r}} \left| \mathbf{r} - \langle \mathbf{r} \rangle \right|^2 I(\mathbf{r})}{\sum_{\mathbf{r}} I(\mathbf{r})} \tag{42} $$

I was also interested in a metric that might distinguish cells with small particles, like lipid vesicles, from those without them. I attempt to quantify this as the image distribution in spatial frequency. I define the spatial frequency distribution of the image as the Fourier transform of its intensity,

$$ \tilde{I}(\mathbf{f}) = \sum_{\mathbf{r}} I(\mathbf{r})\, e^{-j 2 \pi \mathbf{f} \cdot \mathbf{r}} \tag{43} $$

where $j = \sqrt{-1}$. In the spatial frequency domain, the spatial centroid position is represented by a linear phase. Since one does not want a position offset to affect these metrics, I define the second moment in spatial frequency as

$$ \langle f^2 \rangle = \frac{\sum_{\mathbf{f}} |\mathbf{f}|^2 \left| \tilde{I}(\mathbf{f}) \right|}{\sum_{\mathbf{f}} \left| \tilde{I}(\mathbf{f}) \right|} \tag{44} $$

taking the absolute value of the spatial frequency distribution to eliminate the phase. Finally, I consider the product of these two metrics
$$ M^2 = \langle r^2 \rangle \langle f^2 \rangle \tag{45} $$

This number broadly represents the information content of the ROI or, in laser beam profiling, the spatial mode content. The second moments are calculated in units of squared length and squared inverse length for the spatial and frequency domains respectively, while M² is unitless. To make interpretation easier, I report the metrics in linear units by taking the square root,

$$ \sqrt{\langle r^2 \rangle}, \qquad \sqrt{\langle f^2 \rangle} \tag{46} $$

4.2.7 Performance Analysis

4.2.7.1 Training and Testing Data Subsets

Assessing the performance of a biomarker using the same data that was used to define or train it overstates its performance. To avoid this, the performance of both individual features and regressions combining different features was calculated by separating the dataset into training and testing subsets. Paired training testing subsets were formed with data taken from the same experimental day. This approach is consistent with adding samples under test to the system as 3 additional samples to be tested at the found operating points. To test the generalizability of this approach, I also looked at WBC and MCF7 data from all days pooled together and randomly segmented into 10 subsets (details in Appendix B).
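The same-day pairing scheme introduced above, and detailed in the paragraphs that follow, can be sketched as an enumeration of every WBC and MCF7 sample pairing within a day. The function name, the sample numbering, and the choice of which WBC replicate is missing on days 14 and 15 are hypothetical; only the day list and the exclusion of one WBC filter on each of those two days come from the text.

```python
from itertools import product


def same_day_pairings(samples_per_day):
    """List (day, training pair, testing subset) for every WBC x MCF7 pairing in a day."""
    splits = []
    for day, (wbc_ids, mcf7_ids) in samples_per_day.items():
        for w, m in product(wbc_ids, mcf7_ids):
            training = (w, m)
            # The testing subset is every remaining sample from the same day.
            testing = (
                [s for s in wbc_ids if s != w],
                [s for s in mcf7_ids if s != m],
            )
            splits.append((day, training, testing))
    return splits


# day -> (WBC sample ids, MCF7 sample ids); ids are illustrative labels.
samples_per_day = {
    9: ([1, 2, 3], [1, 2, 3]),
    10: ([1, 2, 3], [1, 2, 3]),
    12: ([1, 2, 3], [1, 2, 3]),
    13: ([1, 2, 3], [1, 2, 3]),
    14: ([1, 2], [1, 2, 3]),  # one WBC filter excluded (which one is assumed)
    15: ([1, 2], [1, 2, 3]),  # one WBC filter excluded (which one is assumed)
}

splits = same_day_pairings(samples_per_day)
print(len(splits), "training testing pairs")
```

Four days with nine pairings each plus two days with six pairings each gives the 48 training testing pairs used in the performance assessment.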
For my training testing approach, I used one sample of WBCs and one sample of MCF7s from a given day as a training subset to find the cutoff point maximizing N_det for each feature, and then used the same training subset to perform the regressions. Sensitivity, specificity, and N_det were then calculated on the remaining datasets from that day, not including the training subset (the testing subset). For example, if the training subset was WBC sample 1 and MCF7 sample 1, the testing subset would be the grouping of WBC samples 2 and 3 and MCF7 samples 2 and 3. The training testing process was repeated using subsets from all possible pairings of WBC and MCF7 samples within each day. Thus days containing 3 samples of WBCs and 3 samples of MCF7 produced 9 training testing subsets. Note that on two experimental days one of the WBC samples was excluded due to poor quality, and therefore for those two days I used six pairings. In total, for all days, I used 48 training testing pairs in my assessment of biomarker performance. The code for building the training and testing subsets has been made available [105].

4.2.7.2 Multivariable Regressions with Feature Selection

Multivariable regressions with feature selection were performed to combine individual features in order to create a higher performance biomarker. Selecting different feature groups for this process allowed me to determine their contribution to regression performance. The first feature group considered was that of all 16 features (denoted Reg1_All). I expected that Reg1_All, which combines all features, would perform better than just using the 4 total content features from each channel (Reg2 ). The performance of regressions using the three channels (DAPI, CD45, and PanCK) that make up the standard biomarkers used for CTC detection (Reg3_DAPI+CD45+PanCK) was also tested. I compared this to the three
channels (DAPI, CD45, and Bodipy) (Reg4_DAPI+Bodipy+CD45) to evaluate my hypothesis H.1. Since a CTC must be a nucleated cell, I tested each channel of features individually combined with DAPI features and evaluated the groups (DAPI/Bodipy, DAPI/CD45, DAPI/PanCK) (Reg5, Reg6, Reg7). I then performed regressions of the 4 features on each individual channel (Reg8-11).

The feature selection and regression process was performed using each of the 48 training subsets. This resulted in 48 different versions of the regressions Reg1-Reg11. The performance of each version of these regressions was then tested using the paired testing subset not involved in training them.

For each of the training subsets, the regression and feature selection process began with feature selection to determine which features among the input feature group would produce the best performing regression. The feature selection process started by determining all combinations of the features in the group. For example, if the feature group had 3 features, there are 3 single feature combinations, 3 two feature combinations, and one three feature combination, for a total of 7 feature combinations. The input training subset was broken up into second level training and testing subgroups for assessing each feature combination. The number of ROIs in the second level training testing subgroups was roughly the same, and the selection process was repeated 10 times, creating 10 different training testing pairings, referred to as 10 fold cross validation. For each feature combination, across the 10 folds of second level training data, the features were z scored and regressed as X parameters against Y = 1 for MCF7 cells and Y = 0 for s d were averaged across the
10 fold average was selected. Using all data in the training subset, regression was then performed with the z scored feature combination found through feature selection as X parameters and Y = 1 for MCF7 cells and Y = 0 for WBCs, producing equations for Reg1-11. To address the generalizability of this process, I compare the performance of a set of Reg1-11 trained on the whole data set in appendix 4 and on just one day in appendix 5.

4.3 Results

Figure 12 shows histograms for all 16 individual features (4 spatial metrics for lipids, DNA, CD45, and panCK) of pure samples and mixed samples. The mixed sample histograms are used only as a qualitative control to confirm that the modality seen in the pure samples is real and not an artifact of varying labeling conditions or acquisition settings. The mixed samples generally follow the linear combination of the profiles of the pure samples, with bimodality in all the distributions. The total intensity of CD45 and panCK over the intensity being greater for MCF7 cells and CD45 intensity greater for WBCs.

Using a t test and a Mann-Whitney U test, I checked whether the mean positions of the WBC and MCF7 distributions were statistically different for each of the 16 features. Using both tests, I found p < 0.01 for all of the features, indicating statistically significant differences between WBCs and MCF7 cells for all 16 features. Thus my stated hypothesis (H.2) that the MCF7 cells have more lipid content than WBCs is supported.

48 training subsets are shown in Figure 13 and generally follow each other. ROC for these training
subsets were created, summarized in Figure 14 and Figure 15. The shape of the ROC on the z scored axis of Figure 14 indicates that none of the biomarkers have an underlying normal distribution [59]. Since LR_T+ and N_det are linearly related, Figure 15 shows that there are cutoff positions on each biomarker maximizing N_det. The distributions of the cutoff positions maximizing N_det are shown as box plots on top of each feature histogram of Figure 12. Generally these cutoff points are far from the center of the WBC population to reduce false positives.

For each testing subset, sensitivity, specificity, and N_det were computed at the trained cutoff position. The distributions of these performance statistics for the 48 training testing pairs are shown in Figure 13. The mean tested value of N_det for the 16 individual features was compared. Amongst det = 203 was the highest. The highest mean values of N_det be 34 and 75 respectively while for CD45 the highest values was from the metric at 37. Looking at the maximum instead of mean values of N_det across the 16 features, I found values of N_det det for CD45 was found in the feature at 132.

Analysis of variance (ANOVA) was used to determine if the performance statistics of det , shown in Figure 13, were dominated by between day variances or within day variances. For nearly all the performance statistics for all individual features, ANOVA results were found to be significant (p < .05) for between day variances dominating within day variances. The outliers were the sensitivity of the feature DAPI
different between the days. I also tested variances in the regression results (Figure 17) using ANOVA. These were generally found to be significant (p < .05) for day to day variances. The exception was that specificity and N_det were not dominated by between day differences for Reg1-4.

I looked at the stability of the absolute positions of the D+ and D− populations for each feature to see if the variance between days was greater than the variance within a day. I quantified the within day variability as a measure of technical variability in the biomarkers. For the D− population I found between day variance to exceed within day variance of all features had low within day variability indicating these features are stable within and between days while Pan measurements of the MCF7 population. For this population only Bodipy , CD45 , f >. For CD45 < day variation exceed within day variation in absolute position. Further description of methods and the results of these comparisons can be found in Appendix 1.

Absolute position could be controlled for with references similar to what is employed in aneuploidy studies in flow cytometry [106] by applying a different offset and scale factor I applied an offset plus scaling transformation to each feature on a per day basis to make the absolute position of WBC and MCF7 the same. As expected, applying this control makes the within day variation in absolute position greater than the between day variation in absolute position. Importantly, such a coordinate system transformation will not change the shape of the ROC or the calculated performance statistics of AUC, sensitivity, specificity, and N_det
I used feature selection and regression to produce combinations of features that would maximize N_det. I performed these regressions using all 16 features (Reg1_All), with just the sum signal features (Reg2 ), and with different combinations of labels as denoted in their subscripts. DAPI (a specific nuclear stain) was included in all 2 feature regression combinations since it was used to determine the region of interest for finding nucleated cells. Histograms of the regressions for pure and mixed samples are shown in Figure 16, with performance depicted in Figures 17, 18, and 19.

I performed Kruskal-Wallis tests with all 6 days of data pooled to determine if AUC, det for different regressions (Reg1-Reg4) gave statistically different results. For regressions (Reg1-Reg4), AUC was found to have p=.06 the operating point (sensitivity, specificity, and N_det) I did not find a significant difference (p > .05). Thus my hypothesis (H.1) that the addition of lipid content produces a biomarker (Reg4) with better identification performance than CK+/CD45 (Reg3) is not supported. These regressions perform similarly. Post hoc significantly greater than that of Reg3 (p < .05). Also, the 3 channel panel of Bodipy, DNA, and CD45 (Reg4), which could be made compatible with live cell imaging by substituting Hoechst for DAPI, performs similarly (p > .05) to the CK+/CD45 panel for all 5 performance statistics.

Examining my hypothesis (H.3) that spatial metrics produce a regression (Reg1_All) that outperforms just total signal metrics (Reg2 ), I found A significantly greater for Reg1_All compared to Reg2 (p < .05) but no statistical difference in sensitivity, specificity, and N_det (p > .05). Thus spatial metrics did produce a better overall
biomarker, supported by greater AUC and Co performing operating point. I went on to use the Kruskal-Wallis test to ask if regressions of 2-channel and 1-channel features performed similarly. For the 2-channel regressions Reg5-7, I found sensitivity, and Ndet all to be significantly different between Reg5-7 (p < .05). Post hoc comparisons found that Reg5 (DAPI+CD45) outperforms Reg6 (DAPI+PanCK), with no statistical difference found between Reg5 or Reg6 and Reg7 (DAPI+Bodipy). With the single-channel metrics (Reg8-11) I and Ndet were statistically different (p < .05) between these 4 regressions. Post hoc analysis found Ndet for Reg9 (Bodipy) was significantly greater than Ndet of Reg10 (CD45) and Reg11 (PanCK). Since the performance spread of the measured biomarkers was generally dominated by between-day differences, I looked at my than with all 6 days pooled to see if the results were consistent. Ndet and specificity were consistent and not found to be statistically different between Reg1-4 tested on all 6 days individually. Sensitivities were found to be consistent, with the exception of day 12, where Reg1 was found to be significantly greater (p < .05) than Reg3, and day 15 where Reg1 was between Reg1-4 on days 10, 12, 14, and 15. AUC was significantly different between Reg1-4 only on days 12 and 15. Although I found performance to be dominated by between-day differences, the tested results of my hypotheses generally do not vary by day. To determine if the similar performance of Reg1-4 was due to the finite number of ROI in the testing subset, I computed a minimum possible false positive rate for each of the testing subsets as the inverse of the number of ROI in the disease negative test fraction of
that subset. I used a Wilcoxon rank sum test to compare the calculated minimum false positive rates to the false positive rates of Reg1 and found the false positive rates from Reg1 to be significantly (p < .05) greater. This indicates the tested false positive rates were not driven by the number of samples in the disease negative test fraction. This was not the case for the training subset. The cutoff position (operating point) between the T+ and T- fractions of the regressions was set using the training subsets. The optimal false positive rates found in training were not statistically different from the false positive rate calculated as the inverse of the disease negative training sample size. For Reg1-4, the trained cutoff position (operating point) was most often set to the minimum false positive rate on the ROC that was greater than 0. Upon testing this cutoff on the larger testing subset, one or more false positives were often found. Even if the false positive rate is sample size limited, biomarkers with greater separation between the D+ and D- fractions should produce greater sensitivity and better Ndet. I believe my similar sensitivity, specificity, and Ndet for Reg1-4 is driven by the similar separations these regressions produced between the D+ and D- different. In the previously mentioned results, I generated Reg1-11 for each of the training subsets and thus have 48 versions of them, which were applied to the paired testing subsets. I looked at whether the results varied if I applied the same version of Reg1-11 to the training-testing subsets to speak to the generalizability of the method. I generated Reg1-11 on day 9 data and applied these regressions to the 48 training-testing subsets. Results and tested hypotheses were similar, with Reg1-4 being statistically similar and performance similar to using the 48 different versions (Appendix E). I generated Reg1-11 using all days of data and
applied these regressions to the 48 training-testing subsets and also found similar results (Appendix F). This indicates that training Reg1 individual training testing pairings produced similar results. This speaks to the generalizability of the regression method. The prior analysis was done by creating training-testing pairings from different physical samples between days. I looked at how the results generalized by pooling all of my D+ and D- data and randomly segmenting it into 10 training-testing samples for 100 different possible pairings. This resampling increased the average number of cells in the D- training subsets to 4077, compared to a mean value of 1539 in the main analysis, and the average number of cells in testing to 22165 from 2870 in the main analysis. The bootstrap analysis on the individual features found slightly lower mean values of the performance statistics with about 10 times narrower standard deviations. For th and AUC also have slightly lower mean values with the bootstrap analysis and the same notably lower standard deviation. At the operating point, for Reg1-4, Ndet is found to be about twice the value in my main analysis with smaller standard deviation, while for Reg5-11, Ndet is slightly lower. Comparison of Figures 12-14 and Figures 16-18 generated with this bootstrap data can be found in Appendix C. I looked at whether using manual debris exclusion improved the performance of the individual features. Applying the same Reg1-11 on the unexcluded fields actually found mean value of 546 for Reg1 with exclusion while lower values for Reg2 and some other regressions. This indicates the exclusion did not contribute significantly to performance. Comparative figures are in Appendix D.
Overall, the Kruskal-Wallis test shows that 3 and 4 channel regressions outperform 2 channel regressions, indicating that multiple channels assist in better isolating the rare cell types. Variations were found in the expression of pan-cytokeratin in MCF7 cells, which appear to depend on their confluency in cell culture. Dropping this data from the analysis would likely further increase the separation produced between MCF7 and WBCs, but doing so would not represent the true experimental variability measured.
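For readers unfamiliar with the Kruskal-Wallis test used throughout this section, the sketch below computes its H statistic from ranks. It is an illustration only: the function name `kruskal_wallis_h` and the four groups of Ndet values are hypothetical and are not data from this work. The sketch omits the tie correction and, rather than computing a p-value, compares H against the standard chi-square critical value for 3 degrees of freedom at the .05 level.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction), using midranks
    for tied values."""
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += midrank
        i = j
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

# Hypothetical Ndet values for four regressions (not data from this work).
reg_ndet = [
    [410, 520, 480, 390, 610],
    [430, 500, 470, 405, 590],
    [420, 515, 460, 400, 600],
    [415, 505, 490, 395, 605],
]
h = kruskal_wallis_h(*reg_ndet)
CHI2_CRIT_DF3 = 7.815  # chi-square critical value, 3 degrees of freedom, alpha = .05
significant = h > CHI2_CRIT_DF3
```

With groups this similar the statistic falls far below the critical value, which corresponds to the situation reported for Reg1-4 at the operating point. A library routine with tie correction and an exact p-value would be preferable for real data.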
Figure 12: Histograms of image cytometry features computed on all 4 channels. Blue shows WBC-only samples (D-) composed of 24,699 objects; red shows MCF7-only samples (D+) composed of 41,091 objects used in the analysis. The black dashed line (MCF7 + WBC ~1:1 mixed samples containing 33,726 objects) is a qualitative control reproducing the modality of the pure samples. dBct = 10*log10(counts). Box plots of the distribution of cutoff positions maximizing Ndet across the 48 training subsets are shown on top of each histogram.
Figure 13 48 training subsets. An operating point maximizing Ndet was found on each training subset (thresholds plotted in Figure 12). The remaining data from that day was used as a testing subset to compute the shown sensitivity, specificity, and minimum detectable thresholds at the operating point. Occurrences of false positive rates of zero on the testing data were found and summed in the bottom panel.
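The train-then-test procedure summarized in this caption can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of this work: the data are synthetic Gaussians, all function names are hypothetical, and the positive likelihood ratio LR T+ is used as the objective because it is stated to be linearly related to Ndet. A false positive rate of zero is rounded up to the inverse of the disease negative sample size.

```python
import random

def rates(neg, pos, cutoff):
    """Sensitivity and false positive rate when values >= cutoff are
    called test positive."""
    sens = sum(v >= cutoff for v in pos) / len(pos)
    fpr = sum(v >= cutoff for v in neg) / len(neg)
    return sens, fpr

def lr_positive(sens, fpr, n_neg):
    """Positive likelihood ratio, with a zero false positive rate
    rounded up to 1 / n_neg."""
    return sens / max(fpr, 1.0 / n_neg)

def train_cutoff(neg, pos):
    """Try every observed value as a cutoff and keep the one that
    maximizes LR+ on the training data."""
    best_cut, best_lr = None, -1.0
    for cut in sorted(set(neg) | set(pos)):
        sens, fpr = rates(neg, pos, cut)
        lr = lr_positive(sens, fpr, len(neg))
        if lr > best_lr:
            best_cut, best_lr = cut, lr
    return best_cut, best_lr

def draw(rng, mean, sd, n):
    """Synthetic feature values; real features need not be Gaussian."""
    return [rng.gauss(mean, sd) for _ in range(n)]

rng = random.Random(1)
train_neg, train_pos = draw(rng, 0.0, 1.0, 200), draw(rng, 4.0, 1.0, 200)
test_neg, test_pos = draw(rng, 0.0, 1.0, 2000), draw(rng, 4.0, 1.0, 2000)

cutoff, train_lr = train_cutoff(train_neg, train_pos)
test_sens, test_fpr = rates(test_neg, test_pos, cutoff)
test_lr = lr_positive(test_sens, test_fpr, len(test_neg))
```

Because the false positive rate is floored at 1 / n_neg, LR+ on a training subset can never exceed the number of disease negative objects in that subset, so a small training subset caps the trained value. Applying the trained cutoff to a larger testing subset can then reveal false positives that the training subset did not contain.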
Figure 14: ROC curves for the training subsets averaged over each day. The logarithmic (z-scored) sensitivity and specificity axes show straight lines when the D+ and D- distributions are Gaussian. Average values of sensitivity and specificity maximizing Ndet are shown as + symbols and are generally to the left of the observed inflection points.
Figure 15: ROC on test positive likelihood ratio (LR T+) and test negative likelihood ratio (LR T-) shows there are maxima on LR T+ for all features. If any of these biomarkers were used for enrichment, this cutoff would be the optimal position for the transition between T+ and T- for the enrichment.
Figure 16: Histograms of testing subsets of WBCs (blue traces) and MCF7 (red traces). The 1:1 mixed populations (black line) are a qualitative control showing that the regressed modality is real. Testing data was not used to train the regressions and was naive to the regressions, which were computed on the 48 training subsets. For each regression, an operating point maximizing Ndet was found on that training subset, producing the threshold positions shown as box plots on top of the histograms. Regressions are z-scored and thus centered near 0.
Figure 17: Performance of the regression combinations over the 48 testing subsets shown as the separation produced between the biomarkers. Below line: performance statistics that depend on the operating point. Occurrences where a testing subset had a false positive rate of 0 were summed. For Reg1-4, below-line performance statistics are similar while above-line statistics are different.
Figure 18: Receiver operating characteristics of the regressions computed on the testing subsets averaged over each day. Average positions of operating points maximizing Ndet are shown as + symbols. As more features are included in the regressions, performance and stability are seen to improve. Regressed distributions are not Gaussian, as indicated by the shape on the z-scored sensitivity-specificity axes.
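ROC curves such as these are compared in the Discussion by the area under the curve up to a false positive rate of .05. The sketch below shows one way such a partial AUC can be computed from ROC points by trapezoidal integration. The function name `partial_auc` and the two example curves are illustrative assumptions, not the routine used to produce the reported values.

```python
def partial_auc(roc, fpr_max=0.05):
    """Trapezoidal area under ROC points (fpr, tpr), integrated from
    fpr = 0 up to fpr_max."""
    pts = sorted(roc)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= fpr_max:
            break
        if x1 > fpr_max:
            # Clip the final segment at fpr_max by linear interpolation.
            y1 = y0 + (y1 - y0) * (fpr_max - x0) / (x1 - x0)
            x1 = fpr_max
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # ideal classifier
chance = [(0.0, 0.0), (1.0, 1.0)]               # diagonal ROC

pauc_perfect = partial_auc(perfect)
pauc_chance = partial_auc(chance)
```

On this scale the largest attainable value is fpr_max itself, so with a limit of .05 a partial AUC close to .05 indicates a curve near the ideal corner.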
Figure 19: Regression on test positive likelihood ratio (LR T+) and test negative likelihood ratio (LR T-). LR T+ is linearly related to detectable rarity Ndet. This figure shows that there are maxima for LR T+, and thus Ndet, for all regression combinations, and that the profiles of these ROC are not those of D+ and D- drawn from normal distributions, as shown in Figure 18.

4.4 Discussion

Recently, Lannin et al. [87] have performed a comparison of Bayesian, K-nearest neighbors, support vector machine, and random forest classifiers that included data in the
WBC cell line model system similar to mine. The ROC presented in ref. [87] can be compared against the ROC I present (Figure 18). Comparing with the same metric of AUC up to a false positive rate of .05, Reg1-4 all had a median value of 0.049, which is on par with the best value of 0.048 found for random forests in the model system of [87]. Visual quantification of AUC up to a false positive rate of .05 can be found in Appendix G. I conclude my simple feature-selected regression is performing as well, and that combinations of different labels through methods like multivariate regression are therefore necessary if one wants to detect even rarer cells. My regression method shows details of which features show the greatest ability to distinguish populations. Examples of the forms of the regressed equations are in Appendix H & I. I have looked at using principal component analysis (PCA) and linear discriminant analysis (LDA) to create statistically uncorrelated basis functions prior to regression. I have found the performance to be identical to the direct regression technique used here, as predicted by the Frisch-Waugh-Lovell theorem [71]. Methods such as these are needed in cases where the number of samples is less than the number of variables to be regressed, which is not true in the case of this data set.

Ndet is a quantity that allows for optimizing the detection level of continuous biomarker panels. Under the scenario of perfect sensitivity, with a desired PV+ of 66%, the detectable rarity Ndet scales as half the inverse of the false positive rate (equivalently, the detectable prevalence is about twice the false positive rate). This seems reasonable, as it makes sense that the false positive rate sets a limit on the rarity of event that can be detected, similar to how noise in a circuit sets a limit on the power of a signal that can be detected. However, what should one do when they calculate their false positive rate to be 0, making Ndet undefined and positive predictive value perfect?
In this case, I choose to round the false positive rate up to
the inverse of the disease negative sample size. This occurred on testing for 11 of the 48 testing subsets in my best regressions. Thus I could improve the 1 in 480 mean value of Ndet by increasing the number of cells in the disease negative dataset, and Ndet is connected to sample size. This point is also illustrated in my bootstrap analysis, which allowed me to use more cells in training, leading to ~1.5x better Ndet performance for regressions Reg1-4. An Ndet value of 480 corresponds to a false positive rate of 1 in 1044. Enrichment may be sufficient to bring the needed prevalence into range. Interestingly, I have found a clinical trial of CTC identification methods reporting efficacy with false positive rates between 1 in 800 and 1 in 1600 [107]. I I expect to be stable between patients from previously published aneuploidy studies [108]. Thus, although I see between-day differences, particularly in WBCs, I believe these differences are largely technical and can be reduced, rather than being genuine biological variation between patient WBCs. Use of downstream molecular analysis for final CTC identification with single cell polymerase chain reaction (PCR) is of particular interest [24]. I am interested in applying further molecular analysis to CTC samples, including in situ hybridization [109] and viral reporters.

I hypothesized that use of spatial features would aid in separating these populations. As the number of channels was increased, feature selection tended towards regressions that included just total signal measurements. The spatial features were seen more often in the two and one channel regressions. At this time, I cannot conclude that the use of spatial features adds a performance advantage in this 3 or 4 channel assay.
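The relationship between false positive rate, Ndet, and enrichment discussed above can be made concrete with a short sketch. It rests on an assumption stated here rather than taken from the derivations earlier in this dissertation: that for a target positive predictive value, Ndet is approximately LR T+ divided by the odds of that predictive value, which for PV+ = 2/3 gives LR T+ / 2. The function names and the enrichment likelihood ratio of 3000 are hypothetical.

```python
def ndet(sens, fpr, n_neg, ppv=2.0 / 3.0):
    """Detectable rarity N (as in '1 in N') for a target positive
    predictive value.

    Assumption: prevalence odds = PPV odds / LR+, so that
    N ~= LR+ * (1 - ppv) / ppv. A false positive rate of zero is
    rounded up to 1 / n_neg, the inverse of the disease negative
    sample size.
    """
    fpr = max(fpr, 1.0 / n_neg)
    lr_pos = sens / fpr
    return lr_pos * (1.0 - ppv) / ppv

def enriched_prevalence(prevalence, lr_pos):
    """Prevalence in the test positive fraction of a stage with positive
    likelihood ratio lr_pos (Bayes' rule in odds form)."""
    odds = prevalence / (1.0 - prevalence) * lr_pos
    return odds / (1.0 + odds)

# Perfect sensitivity and a false positive rate of 1 in 1044.
n_perfect = ndet(1.0, 1.0 / 1044.0, n_neg=10**6)

# Zero observed false positives among 2870 disease negative objects.
n_floored = ndet(0.92, 0.0, n_neg=2870)

# Hypothetical enrichment stage applied to a 1 in a million population.
after_enrichment = enriched_prevalence(1e-6, lr_pos=3000.0)
```

Under this assumption, perfect sensitivity at a false positive rate of 1 in 1044 gives an Ndet of about 522, and a sensitivity near 0.92 would bring this to roughly the 480 quoted above. The floored case shows how the disease negative sample size caps the computable Ndet, and the last line shows a 1 in a million population being raised to roughly 1 in 334 by the hypothetical enrichment stage, which is within reach of a 1 in 480 panel.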
It also appears that the performance of the biomarker panel of DAPI, CD45, and Bodipy is equivalent to the standard CK+/CD45- panel. This result is interesting, as labeling for DNA, lipids, and CD45 could be performed on live cells by substituting Hoechst for the DAPI label. Label-free specific optical contrast for lipids with coherent anti-Stokes Raman scattering (CARS) microscopy has previously been performed on CK+/CD45- objects, finding that the CK+/CD45- objects have a 7 fold higher average pixel CARS intensity than leukocytes (CK-/CD45+) [30]. This result is close to the 7.9 fold difference I measured in this work. Additionally, the ROC shown in Figures 14 and 15 are invariant to the choice of units and linearity of the underlying variables, as are the performance parameters summarized in Figure 13. These figures indicate the performance of lipid content alone is similar to the other channels. Although CARS is label free, the viability of cells imaged for lipids with CARS compared to those imaged for lipids with Bodipy is unclear. Although the higher laser powers needed for CARS may be an issue for viability, the longer wavelengths used also enable deeper imaging through scattering media such as tissue. Interestingly, the authors of [110] have taken advantage of this to perform intravital CARS imaging of cancer cells through the vein of a mouse ear. This could be an innovative noninvasive method for enumeration of CTCs but appears difficult to apply clinically.
CHAPTER V

CONCLUSIONS

The methods presented in this dissertation define the performance of single and multistage tests for identification of rare events, such as circulating tumor cells, using statistical and probabilistic analysis. I present theory that shows how biomarker sensitivity and specificity set the rarity of cell the biomarker can identify. I present theory showing how sensitivity, specificity, and detectable rarity scale when multiple test stages are employed. I show that there can be a maximum for detectable rarity on a biomarker depending on the distribution of the disease positive and disease negative populations on that biomarker. I show that detectable rarity can be used to choose operating points on the ROC optimized to detect the rarest possible cells. At these operating points I experimentally quantify sensitivity, specificity, and computed rarity over measured experimental variability. I show that there are maximum values for the rarity of cell a biomarker can detect in all of my experimental data.

There are a few results that consider statistical requirements in circulating tumor cell detection [55], [56], [111], and more theoretical analysis is needed. Poisson statistics can be used to calculate the number of cells that need to be sampled. These calculations are done under the assumption that the identifying test is perfect, so the calculation only involves adequate sampling of a presumed known prevalence. The theoretical analysis presented in this work describes how sensitivity and specificity set the prevalence (rarity) of cell a test can detect and how these parameters scale in multistage assays. For a given test, this is the prevalence that should be input into the previously mentioned theory [55], [56] to determine how many cells should be sampled.
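As a sketch of the kind of Poisson sampling calculation referred to above (not the specific formulation of [55] or [56]), the code below finds the smallest number of cells to sample so that at least k rare cells are expected to appear with a chosen confidence. The function names, the 95% confidence level, and the example prevalences are assumptions for illustration, and the test is taken to be perfect, as in the text.

```python
import math

def prob_at_least(k, expected):
    """P(X >= k) for X ~ Poisson(expected)."""
    below = sum(math.exp(-expected) * expected ** i / math.factorial(i)
                for i in range(k))
    return 1.0 - below

def cells_to_sample(prevalence, k=1, confidence=0.95):
    """Smallest n such that P(at least k rare cells) >= confidence when
    the rare cell count is modeled as Poisson(n * prevalence)."""
    lo, hi = 0.0, 1.0
    while prob_at_least(k, hi) < confidence:
        hi *= 2.0
    for _ in range(100):  # bisection on the required expected count
        mid = (lo + hi) / 2.0
        if prob_at_least(k, mid) < confidence:
            lo = mid
        else:
            hi = mid
    return math.ceil(hi / prevalence)

n_million = cells_to_sample(1e-6)        # 1 in a million, at least one event
n_480 = cells_to_sample(1.0 / 480.0)     # prevalence of 1 in 480
n_480_five = cells_to_sample(1.0 / 480.0, k=5)
```

For a single event at 95% confidence the required expected count is about 3, so roughly three million cells are needed at a prevalence of 1 in a million. Requiring more events, as in the last line, raises the number accordingly.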
In experimental results, I have found statistically significant differences between WBCs and MCF7s for all biomarkers measured, indicating that my hypothesis (H.2) that MCF7s have more lipids than WBCs is supported. My hypothesis (H.1) that a DNA/lipids/CD45 panel could outperform the classic DNA/CD45/CK panel is not supported. These panels performed similarly, which is interesting since DNA/lipids/CD45 could be made compatible with live cell imaging by use of a Hoechst DNA label. My hypothesis (H.3) that computing spatial features would lead to increased performance over using total content features was also not supported (p > .05). In 3 and 4 channel assays the total content features lead to similar separations. The spatial features were more prominently included in regressions of 1 or 2 channels. Under the assumption that D+ and D- distributions are Gaussian, I have looked at the separation should a achieve an ROC curve with an ability to detect cells at a rarity of 1 in 1 million [112]. However, experimentally, the regression that produced greater than 7 standard 480 detection performance. My experimental ROC curves show none of the features or regressions are normally distributed [59]. Thus theory modeling the separations between populations with just second-order properties, such as Cohen's d, and using this to estimate the ROC and thus sensitivity and specificity, while useful, is inadequate [112]. Still, Cohen's d does describe the separation between populations, which is similar to AUC while being less computationally expensive. I used Cohen's d as the optimization parameter in feature selection. I also experimented with using Ndet and AUC as the optimization parameter. I found Cohen's d to
be less computationally expensive to compute than AUC and Ndet, since Cohen's d does not require computing an ROC in order to calculate. With an enrichment step that produces an Ndet of ~1 in 1500, I would be able to detect a 1 in a million object with my experimental image cytometry biomarker panel with mean Ndet of 1 in 480, provided the enrichment biomarker is statistically independent of that panel. However, enrichment biomarkers of CD45 and epithelial markers such as CK are already included in the image cytometry panel. Thus, by my theoretical findings, if these biomarkers are used in enrichment the overall improvement in detectable rarity will be lower. If biomarkers such as CD45 or EpCAM are to be used in enrichment, detecting ever rarer cells requires using different biomarkers for the cytometry stage. However, they should not necessarily be excluded from the stage II image cytometry panel either. Imaging the enrichment biomarker in follow-up cytometry may still be important for quality control of the enrichment step. Still, discovery of ever better biomarkers for cancer detection in both enrichment and cytometry stages is critical. The extent to which MCF7 cells model the distribution of true CTCs is unknown and presents an issue with calculating the true sensitivity of a biomarker. Fortunately, the choice and diversity of the D+ model has no effect on the specificity/false positive rate of the assay. In light of these facts, one can initially rely on measuring the false positive rate of an assay to determine if the assay is detecting anything. To show detection has occurred in an unknown patient sample using a previously unestablished assay, one can show that the test positive rate in unknown patient samples is statistically greater than the false positive rate of the assay.
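One simple way to frame the comparison just described is a one-sided exact binomial test. This is a sketch under stated assumptions, not a procedure taken from this work or from [51]. It treats the false positive rate as exactly known, which ignores the uncertainty in estimating it from control samples, and the object counts are hypothetical.

```python
import math

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more
    test positives among n objects from false positives alone."""
    below = sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
                for i in range(k))
    return 1.0 - below

FPR = 1.0 / 1044.0   # false positive rate quoted in the Discussion
N_OBJECTS = 10000    # hypothetical number of objects scored in a patient sample

p_25 = binom_tail(25, N_OBJECTS, FPR)  # 25 test positives observed
p_12 = binom_tail(12, N_OBJECTS, FPR)  # 12 test positives observed
```

About 9.6 false positives are expected among 10,000 objects at this rate. Twelve test positives are therefore unremarkable, while 25 would be very unlikely to arise from false positives alone.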
Demonstrating that the test positive rate is greater than the false positive rate in a case-control study was done by investigators using CellSearch [51] before extensive prospective cohort
trials [1]. Issues in study design are further discussed in [113]. Ultimately, connection of CTCs to the underlying disease is demonstrated through cohort trials where patients with greater than X CTCs are shown to have different outcomes than those without. X is often a parameter to be determined by the trial.

5.1 Future Directions

Future interests include the application of spontaneous Raman microscopy to the analysis of lipids in CTCs. Previously, we have used this label-free technique to quantify changes in lipid content in breast and prostate cancer cell lines [6]. Characterization of CTCs with spectroscopy techniques such as Raman scattering or fluorescence lifetime imaging (FLIM) may further distinguish metabolic profiles of the solid tumor, which may be correlated with outcomes. A low cost, open source platform that can prepare cells for immunofluorescence microscopy and image them would be ideal for evaluating the thousands of commercially available antibodies and labels for use in CTC identification. Such a system would be useful both for assessing the sensitivity and specificity of labels and, clinically, for performing the final identification after enrichment. Biomarkers with the greatest sensitivity and specificity, evaluated in model systems, would most merit follow-up examination in clinical trials. As described in this work, better performing panels will enable ever rarer CTCs to be detected in the peripheral blood of cancer patients. These CTCs may be the biomarkers needed for better early detection and monitoring of cancer. It would be very interesting to engineer such a system. One next endeavor is to experimentally measure enriched ROC for the biomarker panel of DAPI, PanCK, CD45, and lipids in samples of peripheral blood taken from a patient.
We are interested in using CD45 depletion to make the prevalence high enough to test different biomarkers. Testing new biomarkers from a depleted population using image cytometry would expand on the experimental and theoretical analysis of this work to optimize CTC testing. Future work on engineering better methods to perform labeling, minimize cell loss, and improve imaging throughput will also be critical for moving these techniques forward.
REFERENCES

[1] W. J. Allard et al. Clin Cancer Res, vol. 10, no. 20, pp. 6897-6904, Oct. 2004.
[2] C. A. Parkinson et al. DNA as Biomarkers of Treatment Response for Patients with Relapsed High Grade PLOS Medicine, vol. 13, no. 12, p. e1002198, Dec. 2016.
[3] T. Van Gorp et al. Br J Cancer, vol. 104, no. 5, pp. 863-870, Mar. 2011.
[4] I. M. Thompson et al. Specific Antigen in Men JAMA, vol. 294, no. 1, pp. 66-70, Jul. 2005.
[5] D. Tarin, J. E. Price, M. G. W. Kettlewell, R. G. Souter, A. C. R. Vass, and B. Crossley, Cancer Res, vol. 44, no. 8, pp. 3584-3592, Aug. 1984.
[6] M. C. Potcoava, G. L. Futia, J. Aughenbaugh, I. R. Schlaepfer, and E. A. Gibson, anti Stokes Raman scattering microscopy studies of changes in lipid content and composition in hormone J. Biomed. Opt, vol. 19, no. 11, pp. 111605-111605, 2014.
[7] A. Casartelli et al. based approach for the early assessment of the Cell Biology and Toxicology, vol. 19, no. 3, pp. 161-176, 2003.
[8] the Curr. Treat. Options in Oncol., vol. 8, no. 1, pp. 89-95, Feb. 2007.
[9] S. J. Cohen et al. Clinical Colorectal Cancer, vol. 6, no. 2, pp. 125-132, Jul. 2006.
[10] S. J. Cohen et al. Ann Oncol, vol. 20, no. 7, pp. 1223-1229, Jul. 2009.
[11] D. C. Danila et al. ulating Tumor Cell Number and Prognosis in Progressive Castration Clin Cancer Res, vol. 13, no. 23, pp. 7053-7058, Dec. 2007.
[12] O. B. Goodman et al. Resistant Prostate Can Cancer Epidemiol Biomarkers Prev, vol. 18, no. 6, pp. 1904-1913, Jun. 2009.
[13] W. He et al. prostate cancer patients using tumor Int. J. Cancer, vol. 123, no. 8, pp. 1968-1973, Oct. 2008.
[14] Tumor Cells Detected by the CellSearch System in Patients with Metastatic Breast J Oncol, vol. 2010, 2010.
[15] M. G. Krebs et al. in Patients With Non Small JCO, vol. 29, no. 12, pp. 1556-1563, Apr. 2011.
[16] J. Y. Xu et al. J Clin Endocrinol Metab, vol. 101, no. 11, pp. 4461-4467, Nov. 2016.
[17] Circulating Tumor Cells in Ovarian Cancer: A Meta PLOS ONE, vol. 10, no. 6, p. e0130873, Jun. 2015.
[18] J. Y. Pierga et al. value of circulating tumor cells compared with serum tumor markers in a large prospective trial in first line Ann Oncol, vol. 23, no. 3, pp. 618-624, Mar. 2012.
[19] H. I. Scher et al. our cells as prognostic markers in progressive, castration The Lancet Oncology, vol. 10, no. 3, pp. 233-239, Mar. 2009.
[20] J. B. Smerage et al. therapy in JCO, vol. 32, no. 31, pp. 3483-3489, Nov. 2014.
[21] Curr. Treat. Options in Oncol., vol. 8, no. 1, pp. 89-95, Feb. 2007.
[22] D. R. Shaffer et al. Castration Clin Cancer Res, vol. 13, no. 7, pp. 2023-2029, Apr. 2007.
[23] J. P. Thiery, H. Aclo Mesenchymal Cell, vol. 139, no. 5, pp. 871-890, Nov. 2009.
[24] containing mutate Biomedical Optics Symposium, 1999, pp. 93-101.
[25] time multivariate statistical classification of cells for flow cytometry and cell sorting: a 158 169.
[26] A. A. Powell et al. Heterogeneity and Diversity fr PLOS ONE, vol. 7, no. 5, p. e33788, May 2012.
[27] G. Vona et al. The American Journal of Pathology, vol. 156, no. 1, pp. 57-63, Jan. 2000.
[28] H. K. Lin et al. Based Microdevice for Detection and Characterization Clin Cancer Res, vol. 16, no. 20, pp. 5011-5018, Oct. 2010.
[29] R. Königsberg et al. Acta Oncologica, vol. 50, no. 5, pp. 700-710, Jun. 2011.
[30] Lipid Rich Prostate Circulating Tumour Cells with Coherent Anti Stokes Raman Scattering BMC Cancer, vol. 12, no. 1, p. 540, Nov. 2012.
[31] Market 2017 Qiagen, Advanced Cell Diagnostics, ApoCell and Janssen Important
[32] America, Europe, Asia Publicist Report. The site covers topics in Business. T, 26 Jul 2017.
[33]
[34] K. F. Ho, N. E. Gouw, and Z. Gao, TrAC Trends in Analytical Chemistry, vol. 64, pp. 173-182, Jan. 2015.
[35] T. M. Scholtens et al. Cytometry Part A, vol. 79A, no. 3, pp. 203-213, 2011.
[36] S. D. Mikolajczyk et al. Negative and Cytokeratin Negative Journal of Oncology, 2011. [Online].
Available: https://www.hindawi.com/journals/jo/2011/252361/abs/. [Accessed: 05 Jul 2017].
[37] N. Saucedo Zeni et al. cells from peripheral blood of cancer patients using a functionalized and structured International Journal of Oncology, vol. 41, no. 4, pp. 1241-1250, Oct. 2012.
[38] J. Chudziak et al. independent enrichment of circulating tumour cells in patients with small cell lung Analyst, vol. 141, no. 2, pp. 669-678, Jan. 2016.
[39] May 2014.
[40] Patent: US20160303565 A1, 16 Oct 2014.
[41] H. W. Hou et al. Scientific Reports, vol. 3, p. srep01259, Feb. 2013.
[42] P. Gogoi et al. Capturing and Characterizing Circulating Tumor Cells (CTCs) from Clinical Blood PLOS ONE, vol. 11, no. 1, p. e0147400, Jan. 2016.
[43] M. Balic et al. s for enumerating circulating tumor cells in Cytometry, vol. 68B, no. 1, pp. 25-30, Nov. 2005.
[44] Patent: US20160033508A1, 26 Jan 2015.
[45] Jan 2015.
[46] T. Hillig et al. Tumor Biol., vol. 36, no. 6, pp. 4597-4601, Jun. 2015.
[47] 09 May 2017.
[48] A. S. Frandsen et al. Characterization aft Journal of Circulating Biomarkers, vol. 4, p. 5, Nov. 2015.
[49] D. N. Curry et al. 26th Annual International Conference of the IEEE Engineering in Medicine and Biology, 2004, vol. 1, pp. 1267-1270.
[50] R. T. Krivacic et al. PNAS, vol. 101, no. 29, pp. 10501-10504, Jul. 2004.
[51] E. Racila et al. PNAS, vol. 95, no. 8, pp. 4589-4594, Apr. 1998.
[52] Transactions of the IRE Professional Group on Information Theory, vol. 4, no. 4, pp. 171-212, Sep. 1954.
[53] D. M. Green and J. A. Swets, Signal detection theory and psychophysics. Wiley, 1966.
[54] curve into likelihood ratio co Statist. Med., vol. 23, no. 14, pp. 2257-2266, Jul. 2004.
[55] for sampling statistics useful for detecting and isolating rare cells using flow cytometry Cytometry, vol. 27, no. 3, pp. 233-238, Mar. 1997.
[56] Cytometry Part A, vol. 71A, no. 3, pp. 154-162, 2007.
[57] The Bell System technical Journal, vol. 29, no. 3, pp. 379-423, 1948.
[58] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 2012.
[59] J. A. Swets, Evaluation of Diagnostic Systems. New York, New York: Academic Press, 1982.
[60] J. Opt. Soc. Am. A, JOSAA, vol. 32, no. 7, pp. 1288-1301, Jul. 2015.
[61] characteristic ana J. Opt. Soc. Am. A, JOSAA, vol. 33, no. 5, pp. 930-937, May 2016.
[62] observer performance on signal Appl. Opt., AO, vol. 39, no. 11, pp. 1783-1793, Apr. 2000.
[63] III. ROC metrics, ideal observers, and likelihood J. Opt. Soc. Am. A, JOSAA, vol. 15, no. 6, pp. 1520-1535, Jun. 1998.
[64] Journal of Clinical Epidemiology, vol. 56, no. 11, pp. 1129-1135, Nov. 2003.
[65] annel capacity of a diagnostic test as a function of test sensitivity Statistical Methods in Medical Research, vol. 24, no. 6, pp. 1044-1052, Dec. 2015.
[66] P Breakthroughs in Statistics, S. Kotz and N. L. Johnson, Eds. Springer New York, 1992, pp. 610-624.
[67] measure of the effects of signal incompleteness on system d Reliability Engineering & System Safety, vol. 45, no. 3, pp. 235-248, Jan. 1994.
[68] International Statistical Review, vol. 78, no. 3, pp. 383-412, Dec. 2010.
[69] J BMC Medical Informatics and Decision Making, vol. 15, p. 59, 2015.
[70] A. E. Carpenter et al. software for identifying and Genome Biology, vol. 7, no. 10, p. R100, Oct. 2006.
[71] Econometrica, vol. 1, no. 4, pp. 387-401, Oct. 1933.
[72] N. Howlader et al. 2012, based on November National Cancer Institute. Bethesda, MD, Feb. 2016.
[73] G. L. Futia, L. Qamar, K. Behbakht, and E. A. Scanning Microscopy of White Blood Cells, Cancer Cell Line MCF7, and mixtures of Zenodo.org, p. 10.5281/zenodo.44884, Jan. 2016.
[74] detection of a fatty acid synthase (OA 519) as a predictor of progression of prostate Human Pathology, vol. 27, no. 9, pp. 917-921, Sep. 1996.
[75] P. L. Alo, P. Visca, A. fatty acid synthase (FAS) as a predictor of recurrence in stage I breast carcinoma Cancer, vol. 77, no. 3, pp. 474-482, 1996.
[76] T. S. Gansler, W. Hardman III, D. A. Hunt, S. Schaff expression of fatty acid synthase (OA 519) in ovarian neoplasms predicts shorter Human Pathology, vol. 28, no. 6, pp. 686-692, Jun. 1997.
[77] C. J. Piyathilake et al. ASE) is an early event Human Pathology, vol. 31, no. 9, pp. 1068-1073, Sep. 2000.
[78] P. Visca et al. in Lu Anticancer Res, vol. 24, no. 6, pp. 4169-4174, Nov. 2004.
[79] A. Rashid et al. Am J Pathol, vol. 150, no. 1, pp. 201-208, Jan. 1997.
[80] site ELISA Clinica Chimica Acta, vol. 304, no. 1-2, pp. 107-115, Feb. 2001.
[81] L. Z. Milgraum, L. A. Witters, G. R. Pas Clin Cancer Res, vol. 3, no. 11, pp. 2115-2120, Nov. 1997.
[82] present Novel Noninvasive Biomarkers of Prostate Cancer Chemoprevention by Phenethyl Cancer Prev Res, Mar. 2017.
[83] K. Bensaad et al. to Cell Growth and Survival after Hypoxia Cell Reports, vol. 9, no. 1, pp. 349-365, Oct. 2014.
[84] I. C. Elle, L. C. B. Olsen, D. Pultz, S. V. Rødkær, and N. J. Færgeman, worth dyeing for: Molecular tools for the dissection of lipid metabolism in FEBS Letters, vol. 584, no. 11, pp. 2183-2193, Jun. 2010.
[85] ngs of SPIE, Los Angles, CA pp. 2-12, Feb. 1993.
[86] IEEE Journal of Quantum Electronics, vol. 27, no. 5, pp. 1146-1148, 1991.
[87] T. B. Lannin, F. I. Thege, and B. Cytometry, vol. 89, no. 10, pp. 922-931, Oct. 2016.
[88] T. M. Scholtens, F. Schreuder, S. T. Ligthart, J. F. Swennenhuis, J. Greve, and L. W. M. Cytometry Part A, vol. 81A, no. 2, pp. 138-148, 2012.
PAGE 104
87 Word Template by Friedman & Morgan 2014 [89] Cytom etry , vol. 87, no. 7, pp. 594 602, Jul. 2015. [90] C. Cytometry , vol. 85, no. 6, pp. 501 511, Jun. 2014. [91] S. T. L igthart et al. PLOS ONE , vol. 8, no. 6, p. e67148, Jun. 2013. [92] N. Theera level training of neural networks for countin IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews , vol. 32, no. 1, pp. 48 53, Feb. 2002. [93] N. Theera Nucleus in Automatic Bone Mar IEEE Transactions on Information Technology in Biomedicine , vol. 11, no. 3, pp. 353 359, May 2007. [94] IEEE Transactions on Biomedical Engineering , vol. BME 19, no. 4, pp. 291 298, Jul. 1972. [95] Classification based on the Combination of Eigen Cell and Parametric Feature 2006 1ST IEEE Conference on Industrial Electronics and Applications , Singapore, 2006, pp. 1 4. [96] 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) , 2010, pp. 5593 5596. [97] 11th IAPR International Conference on Pattern Recognition, 1992. Vol.III. Conference C: Image, Speech and S ignal Analysis, Proceedings , The Hague, Netherlands, 1992, pp. 530 533. [98] K. Z. Mao, P. Zhao, and P. based cell image segmentation IEEE Transactions on Biomedical Engineering , vol. 53, no. 6, p p. 1153 1163, Jun. 2006. [99] TENCON 2004. 2004 IEEE Region 10 Conference , Ciang Mai, Thailand, 2004, vol. A, p. 191 194 Vol. 1.
PAGE 105
88 Word Template by Friedman & Morgan 2014 [100] International Journal of Nanomedicine , vol. 2, no. 2, p. 181, Jun. 2007. [101] al Institutes of Health, Bethesda, Maryland, [102] D. Sage, D. Prodanov, J. ImageJ User & Developer , Mondorf les Bains, Luxembourg, 2012. [103] imagej image GitHub . [Online]. Available: https://github.com/CTCHunter1/matlab imagej image cytometry. [Accessed: 26 Aug 2016]. [104] Psychological Bulletin , vol. 112, no. 1, pp. 155 159, 1992. [105] cytometry analysis GitHub . [Online]. Available: https://github.com/CTCHunter1/ctc cytometry analysis v1. [Accessed: 26 Aug 2016]. [106] L. L. VindelÃ¸v, I. J. Christens resolution flow cytometric DNA analysis by the simultaneous use of chicken and trout Cytometry , vol. 3, no. 5, pp. 328 331, 1983. [107] B. y. Kularatne, P. Lor igan, S. Browne, S. k. Suvarna, M. o. Smith, and J. Lawry, Cytometry , vol. 50, no. 3, pp. 160 167, Jun. 2002. [108] Histopatho logy , vol. 46, no. 2, pp. 121 129, 2005. [109] D. Marrinucci et al. Journal of Oncology , vol. 2010, pp. 1 7, 2010. [110] T. T. Le, T. B. Huff, and J. Sto kes Raman scattering BMC Cancer , vol. 9, no. 1, p. 42, Jan. 2009. [111] Journal o f Oncology , vol. 2010, Dec. 2009. [112] cytometry measurements of lipids, DNA, CD45 and cytokeratin for circulating tumor Proceedings of SP IE , San Francisco, Ca, USA, 2016, vol. 9711, p. 97111U 97111U 15.
PAGE 106
89 Word Template by Friedman & Morgan 2014 [113] D. Marrinucci et al. Phys. Biol. , vol. 9, no. 1, p. 016003, 2012.
PAGE 107
Appendix A Components of the Clinical Experience

The Colorado Clinical and Translational Science Institute partially supported this work through the award of a pre-doctoral fellowship advancing translational research (TL1 TR00181). Part of the purpose of the award was for me, the fellow, to become more exposed to the clinical aspects of the disease I am studying. In this project, I am focused on improving the rarity of CTC one can detect so that CTCs may become a more useful biomarker for diagnosing and monitoring ovarian cancer. In collaboration with Dr. Behbakht, we composed a clinical mentoring plan consisting of 3 components, on which I now report to partially complete the requirements of the fellowship.

1. Ovarian cancer case-directed learning: I attended an exploratory laparoscopy on an 85-year-old patient suspected of having ovarian cancer. The goal of the procedure was to perform a total hysterectomy and bilateral salpingo-oophorectomy (removal of the uterus, both ovaries, and both fallopian tubes). Attending this surgery gave me a lasting impression of both the general happenings in the operating room and the invasiveness of a debulking surgery. Specific to this surgery, upon opening the patient the surgeons found the disease had reached the omentum but fortunately had not disseminated to the walls of the perineum. The omentum was removed and the organs in the pelvis were dissected. Disease had obliterated her ovaries to the point where they were no longer recognizable. Dr. Behbakht, as the instructor, asked the learning surgeons how to proceed. Receiving blank faces, he taught them and me that in cases like this one must rely on anatomy: the surgeon should identify structures that should be near the ovary and remove the tissue that occupies the location where the ovary should be. I saw how the training surgeon teaches a trainee as they stand on opposite sides of the table, which leads to tasks being reversed in terms of handedness between trainer and trainee. This patient had greater than 5 L of ascites removed during the surgery, which left an impression on me due to its sheer volume. The whole operation lasted about 4 hours from opening to suturing closed. The patient would then go to recovery and hopefully be out of bed within about 3 days.

2. Attendance at the weekly gynecologic cancer tumor conference (Thursdays 7-8 am): I attended the gynecologic tumor board meeting from the start of the fellowship in August 2014 until August 2016. All malignant cases are discussed at the tumor board. Cases discussed included endometrial, cervical, ovarian, and vulvar cancers. The process typically consisted of a reading of the case, presentation of radiology, and presentation of post-surgical pathology, followed by discussion including a diagnosis and treatment plan. Attending this meeting, I noted that for some cases none of the diagnostic tools are clear cut. Radiological images require a trained eye to interpret, and the resolution of these images is limited. The most definitive results come from pathology, but given the heterogeneous disease even the best trained pathologist cannot always provide definitive classifications. The classifications themselves are continuously being refined and updated. Some tumors are morphologically unique and impossible to definitively classify. Related to the engineering aspects of the project, one wonders about the extent to which machine learning could be applied to both pathology and radiology to further improve precision and maybe even robustness.

3. Meeting ovarian cancer patients: I presented a poster at the annual Colorado Ovarian Cancer Alliance (COCA) meeting in 2014 and 2015. At the annual meeting, I met ovarian cancer patients, their relatives, and a few surviving family members. I presented the CTC project and found enthusiasm for it from the patients and family members I got to speak with. I remember one of them wanted me to take a blood draw from her right away; she was of the opinion that she had plenty of blood to give me to test my system. A few of these people had an engineering background and were interested in what engineering had to do with ovarian cancer and in the engineering aspects of my work. Overall, I was left with the impression that what we are working on is important and that we owe it to them to move quickly.

In addition to these 3 components, I also participated in Dr. Behbakht's lab meetings and worked in space in the Behbakht lab. During this time Dr. Behbakht was the head of the gynecologic oncology fellowship program, which trains the next generation of physicians who will be fighting these diseases. One of the 3 years of their training is a laboratory component in which the gynecology fellows develop and pursue a research project. Working in the same area, I got to interact with the fellows and discuss the diseases with them, which helped further my understanding of the diseases we are fighting.
Appendix B Analysis of Experimental Variation

I found significant between-day differences in performance for most of the 16 features. All performance metrics used are related to the relative separation produced between the D− and D+ populations, and none are related to the absolute positions of the individual populations on the biomarkers. Thus the significant difference in performance indicates that the position of the D− population relative to the D+ population differs between days. The cause of this variation could be underlying biological variation in the D+ or D− samples, or technical variation in how the cells are processed, such as differences in antibody concentration caused by different master mix preparation between days. Since the D− WBCs came from the blood of different patients on each day, the between-day variation in performance leads one to ask whether the distribution of the WBCs on these biomarkers differs in blood from different people. If these biomarkers differ between patients, then blood from a healthy person may not be a sufficient D− control when testing blood from an individual suspected of disease. To evaluate this, here I analyze the absolute positions of the D− and D+ cells on the features. For each feature, the goal of my analysis is to determine whether the center positions of the leukocytes and MCF7s differ between days. I consider this difference to be measurable if it exceeds the within-day variation of the leukocytes or MCF7s on each day. Finding measurable differences between days does not show that the cause of the variation is the source of the cells. Additional experiments would need to be conducted with samples from the same patient collected and processed on different days, repeated on multiple patients. These cells could still differ between samplings from the same patient due to a different metabolic, homeostatic, or immune state of that patient at the time of sampling.

I begin this analysis with a qualitative examination of the histograms. The distributions of the D+ and D− populations for each filter and each day are shown for an example feature in the accompanying figure. One can see that the distribution of the D− cells is fairly consistent for most days. The most spread is seen on day 12 and day 9. If the difference between the D+ populations on each day is measurable, then (1) the between-day spread of the data should be greater than the within-day spread of the data. Also, one can see that WBCs on filters from day 10 and day 14 occupy different absolute positions. On these same days the center position of the MCF7 population moves in the same direction. If I observe a measurable difference between days for the D− population, then (2) is this difference still measurable after using D+ as an absolute control?
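The second question can be made concrete with a small sketch. It assumes, purely for illustration, that using one population as a control means subtracting that population's per-day center (here the mean) from the other population's values on the same day before days are compared. The function name, the choice of the mean as the center, and the toy numbers are hypothetical and are not taken from the analysis code.

```python
from statistics import mean


def offset_by_control(target_by_day, control_by_day):
    """Shift each day's target values by that day's control-population mean.

    Assumption: the offset control is a per-day subtraction of the control
    center. Both arguments map a day label to a list of feature values.
    """
    corrected = {}
    for day, values in target_by_day.items():
        shift = mean(control_by_day[day])
        corrected[day] = [v - shift for v in values]
    return corrected


# Toy data: both populations drift upward by 2 units on day 14.
wbc = {"day10": [1.0, 1.2, 0.8], "day14": [3.0, 3.2, 2.8]}
mcf7 = {"day10": [5.0, 5.5, 4.5], "day14": [7.0, 7.5, 6.5]}

corrected = offset_by_control(wbc, mcf7)
day_means = {day: mean(vals) for day, vals in corrected.items()}
```

In this toy case the raw WBC means differ by 2 units between days, while the offset-corrected means coincide, which is the situation in which a between-day difference would no longer be measurable after the control is applied.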
Figure A.1: … per-day basis. Each day has 2-3 WBC filters shown in blue and 2-3 MCF7 filters shown in red. The P value in each histogram indicates significant differences in the populations within that day. Bimodality is seen across the days of data, and the absolute positions of the WBC distributions are seen to vary between days. P at the bottom quantifies whether the between-day differences are greater than the within-day differences for this population. P_off quantifies whether this is still the case after controlling for offset with the position of the other population. F is the F statistic driving the first P value (described in eq. A1.3). F_tech is the ratio of the between-group to within-group sum squared errors summed for each day.

I used analysis of variance (ANOVA) as a footing for statistically examining these points. Assuming one has samples coming from multiple groups denoted by subscript i, ANOVA seeks to model the position of each sample $x_{i,j}$ as the combined mean $\mu$, plus an offset $\alpha_i$ for the mean of sample group i, plus an error term $e_{i,j}$:

$x_{i,j} = \mu + \alpha_i + e_{i,j}$ (A1.1)

If the groups do not differ, the variance of the $\alpha_i$ should be the same as that of the $e_{i,j}$. To determine if this is true without performing a regression for this model, one performs an F test to determine whether the between-group sum squared error (BGSS) is greater than the within-group sum squared error (WGSS), each divided by its degrees of freedom $\nu$:

$F = \dfrac{\mathrm{BGSS}/\nu_{B}}{\mathrm{WGSS}/\nu_{W}}$ (A1.2)

Performing an ANOVA on each day of D− data, with cells grouped by the filter they came from, found significant differences between the filters for most days and most biomarkers. The p value for these ANOVAs is annotated in the top left of each histogram. This points to significant variation within a day; however, many of the underlying distributions appear qualitatively stable. Performing an ANOVA grouping by day instead of filter also found significant differences, which indicates that between-day differences are significant as well. The same was true when looking at the D+ population. Thus, I wished to further this analysis by comparing within-day to between-day variation to determine which one was actually greater. To determine whether the between-day variation was greater than the within-day variation, the WGSS of the ANOVA with cells labeled by day was divided by the summed WGSS of the ANOVAs performed on each day to create an F statistic:

$F = \dfrac{\mathrm{WGSS}_{\mathrm{day}}/\nu_{\mathrm{day}}}{\left(\sum_{d} \mathrm{WGSS}_{d}\right)/\nu_{\Sigma}}$ (A1.3)

where $\mathrm{WGSS}_{d}$ is the within-group sum squared error of the ANOVA performed on day d and $\nu$ denotes the degrees of freedom of the corresponding sum. Using this F statistic, an F test was then performed with the appropriate degrees of freedom.

For cases where the between-day spread of the data does not exceed the within-day spread, there are two possible explanations. One is that the within-day spreads are also narrow and the biomarker is very stable. The most pronounced example of this was DAPI, shown in Figure A.2. The other is that the biomarker has wide variation between filters within a day. This may be addressable by technical improvements to the assay; an example of a biomarker like this is … . To distinguish these cases, I created another modified F statistic, F_tech, in which an ANOVA is performed on each day's data and the results are pooled:

$F_{tech} = \dfrac{\left(\sum_{d} \mathrm{BGSS}_{d}\right)/\nu_{B}}{\left(\sum_{d} \mathrm{WGSS}_{d}\right)/\nu_{W}}$ (4)
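The sums of squares behind the F test of (A1.2) and the pooled per-day statistic of eq. (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the figures: the function names and toy data are hypothetical, and normalizing the pooled sums by the summed per-day degrees of freedom is an assumption about how the pooled statistic is scaled.

```python
from statistics import mean


def anova_sums(groups):
    """Return (BGSS, WGSS, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists; each inner list holds one group's values.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group: squared distance of each group mean from the grand mean,
    # weighted by group size.
    bgss = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group: squared distance of each value from its own group mean.
    wgss = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return bgss, wgss, len(groups) - 1, len(all_values) - len(groups)


def f_statistic(groups):
    """Classical one-way ANOVA F: ratio of between to within mean squares."""
    bgss, wgss, df_b, df_w = anova_sums(groups)
    return (bgss / df_b) / (wgss / df_w)


def f_tech(filters_by_day):
    """Pooled per-day statistic: summed BGSS over summed WGSS.

    `filters_by_day` maps a day label to a list of filter groups.
    Assumption: degrees of freedom are pooled by summing them over days.
    """
    parts = [anova_sums(filters) for filters in filters_by_day.values()]
    bgss = sum(p[0] for p in parts)
    wgss = sum(p[1] for p in parts)
    df_b = sum(p[2] for p in parts)
    df_w = sum(p[3] for p in parts)
    return (bgss / df_b) / (wgss / df_w)


# Toy data: three groups for the plain F test, two days of two filters each.
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
example_days = {"day9": [[1, 2, 3], [2, 3, 4]], "day10": [[5, 6, 7], [5, 6, 7]]}

f_value = f_statistic(example_groups)
f_tech_value = f_tech(example_days)
```

A large pooled value indicates that filters processed on the same day disagree with one another more than the cells within a filter do, which is the technical variation the text attributes to assay handling.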
Figure A.2: Histograms of the DAPI …, which was the most stable, compared with PanCK … .

The F statistic was found to be significant, with between-day variation exceeding technical variation, on the WBC datasets for most of the biomarkers. The exceptions were CD45 (F_tech = 23), two DAPI features (F_tech = 13 and F_tech = 19.3), PanCK (F_tech = 33.69), and … (F_tech = 262). The high value of F_tech for PanCK Sigma indicates that the reason the between-day means are statistically similar is the high variance in the distribution of this biomarker between filters. The low values of F_tech for the CD45 and DAPI features indicate that these biomarkers are stable both technically and between days.

… difference (P = 0.7%) to a non-significant difference (P = 59%). The P values with the offset control for all other features were found to stay the same or increase, as denoted by P_off at the bottom of the histograms. The MCF7 population generally had wider standard deviations on all biomarkers. Between-day variation was only found to exceed technical variation in the MCF7 population for Bodipy, CD45, and DAPI features, and it was not statistically significant after using the WBCs as an offset control. The histograms for the remaining features follow.
Appendix C Comparative figures and performance between bootstrap and experimental analysis used in dissertation

This appendix compares the results obtained when the dataset is pooled and randomly segmented into 10 training-testing datasets for WBCs and 10 training-testing datasets for MCF7s with the results obtained using training-testing pairings from different filters, as performed in the main manuscript. The left side shows the bootstrapped data; the right side shows the figures in the main manuscript. The pooled set of Reg1-11 was applied to the bootstrap instead of generating a regression for each training-testing pairing in the dataset. This was done to save computation time. In appendix 4, I show that the pooled set of Reg1-11 performs similarly to generating the regressions on each training subset, as done in the main manuscript. Hypothesis tests and performance follow the comparative graphics.
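The pooling and random segmentation described above can be sketched as follows. This is a hedged illustration only: the text does not state how each of the 10 datasets is divided into training and testing parts, so the even split, the function name, the fixed seed, and the stand-in data are all assumptions.

```python
import random


def random_train_test_sets(pooled, n_sets=10, train_fraction=0.5, seed=0):
    """Shuffle pooled records and deal them into n_sets (train, test) pairs.

    Assumptions: the sets are disjoint, and each set is cut into a training
    part and a testing part according to train_fraction.
    """
    rng = random.Random(seed)
    shuffled = list(pooled)
    rng.shuffle(shuffled)
    # Deal records round-robin so the sets have near-equal sizes.
    subsets = [shuffled[i::n_sets] for i in range(n_sets)]
    pairs = []
    for subset in subsets:
        cut = int(len(subset) * train_fraction)
        pairs.append((subset[:cut], subset[cut:]))
    return pairs


# Stand-in for pooled per-cell feature records from one population.
cells = list(range(200))
pairs = random_train_test_sets(cells)
```

The same call would be made once for the pooled WBC records and once for the pooled MCF7 records, so that each population contributes its own 10 training-testing datasets.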
Hypothesis Test Results:
Reject null hypothesis with p=0.00% confidence that Ndet of regression 1 is the same as NdetMax
Reject null hypothesis with p=0.00% confidence that fpTest of regression 1 is the same as fpMin
Reject null hypothesis with p=0.00% confidence that NdetTrain of regression 1 is the same as NdetTrainMax
Fail to reject null hypothesis with p=100.00% confidence fpTrain of regression 1 is the same as fpTrainMin
Reject null hypothesis with p=0.00% confidence that NdetTrain of regressions 1-4 is the same as NdetTrainMax
Reject null hypothesis with p=0.00% confidence that the minDetectTest of regressions 1 through 4 are the same
Reject null hypothesis with p=0.00% confidence that the minDetectTest of regressions with two channels are the same
Reject null hypothesis with p=0.00% confidence that the minDetectTest of regressions with two and one channels are the same
Reject null hypothesis with p=0.00% confidence that the minDetectTest of regressions with a single channel are the same
Reject null hypothesis with p=1.09% confidence that the minDetectTest a Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis with p=77.03% confidence that the minDetectTest Reg1_all and Reg3_DAPI+CD45+Panck are the same
Reject null hypothesis with p=0.07% confidence that the minDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis with p=0.30% confidence that variances of the minDetectTest for the Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis with p=4.78% confidence that the fpMinDetectTest of regressions 1 through 4 are the same
Reject null hypothesis with p=0.00% confidence that the fpMinDetectTest of regressions with two channels are the same
Reject null hypothesis with p=0.00% confidence that the fpMinDetectTest of regressions with two and one channels are the same
Reject null hypothesis with p=0.00% confidence that the fpMinDetectTest of regressions with a single channel are the same
Reject null hypothesis with p=0.90% confidence that the fpMinDetectTest a Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis with p=30.48% confidence that the fpMinDetectTest Reg1_all and Reg3_DAPI+CD45+Panck are the same
Fail to reject null hypothesis with p=33.09% confidence that the fpMinDetectTest Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis with p=6.24% confidence that variances of the fpMinDetectTest for a Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis with p=0.00% confidence that the sensMinDetectTest of regressions 1 through 4 are the same
Reject null hypothesis with p=0.00% confidence that the sensMinDetectTest of regressions with two channels are the same
Reject null hypothesis with p=0.00% confidence that the sensMinDetectTest of regressions with two and one channels are the same
Reject null hypothesis with p=0.00% confidence that the sensMinDetectTest of regressions with a single channel are the same
Reject null hypothesis with p=0.02% confidence that the sensMinDetectTest a Reg1_all and Reg2_sigma are the same
Reject null hypothesis with p=0.02% confidence that the sensMinDetectTest Reg1_all and Reg3_DAPI+CD45+Panck are the same
Reject null hypothesis with p=0.00% confidence that the sensMinDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis with p=6.01% confidence that variances of the sensMinDetectTest for a Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis with p=0.00% confidence that the d_cohen of regressions 1 through 4 are the same
Reject null hypothesis with p=0.00% confidence that the d_cohen of regressions with two channels are the same
Reject null hypothesis with p=0.00% confidence that the d_cohen of regressions with two and one channels are the same
Reject null hypothesis with p=0.00% confidence that the d_cohen of regressions with a single channel are the same
Reject null hypothesis with p=0.00% confidence that the d_cohen a Reg1_all and Reg2_sigma are the same
Reject null hypothesis with p=0.29% confidence that the d_cohen Reg1_all and Reg3_DAPI+CD45+Panck are the same
Reject null hypothesis with p=0.00% confidence that the d_cohen of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis with p=96.49% confidence that variances of the d_cohen for a Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis with p=0.00% confidence that the AUCTest of regressions 1 through 4 are the same
Reject null hypothesis with p=0.00% confidence that the AUCTest of regressions with two channels are the same
Reject null hypothesis with p=0.00% confidence that the AUCTest of regressions with two and one channels are the same
Reject null hypothesis with p=0.00% confidence that the AUCTest of regressions with a single channel are the same
Reject null hypothesis with p=0.00% confidence that the AUCTest a Reg1_all and Reg2_sigma are the same
Reject null hypothesis with p=0.00% confidence that the AUCTest Reg1_all and Reg3_DAPI+CD45+Panck are the same
Reject null hypothesis with p=0.00% confidence that the AUCTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis with p=0.00% confidence that variances of the AUCTest for the Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45

Ndet performance of individual features using bootstrapped dataset
Biomarker: DAPI \Sigma, Ndet = 18.22 [18.78] +/- 1.15 (mean[median] +/- std), [15.75, 19.74] [min,max]
Biomarker: DAPI <r>, Ndet = 15.54 [17.05] +/- 3.62 (mean[median] +/- std), [5.11, 17.57] [min,max]
Biomarker: DAPI <r_f>, Ndet = 5.16 [6.34] +/- 1.72 (mean[median] +/- std), [2.54, 6.49] [min,max]
Biomarker: DAPI <M>, Ndet = 23.62 [23.34] +/- 0.74 (mean[median] +/- std), [22.47, 25.33] [min,max]
Biomarker: Bodipy \Sigma, Ndet = 42.81 [42.48] +/- 2.70 (mean[median] +/- std), [38.27, 50.33] [min,max]
Biomarker: Bodipy <r>, Ndet = 15.37 [16.10] +/- 2.78 (mean[median] +/- std), [7.06, 17.16] [min,max]
Biomarker: Bodipy <r_f>, Ndet = 6.08 [6.31] +/- 0.87 (mean[median] +/- std), [4.30, 6.97] [min,max]
Biomarker: Bodipy <M>, Ndet = 12.03 [12.98] +/- 2.59 (mean[median] +/- std), [4.58, 13.83] [min,max]
Biomarker: CD45 \Sigma, Ndet = 6.20 [6.00] +/- 1.14 (mean[median] +/- std), [4.55, 12.84] [min,max]
Biomarker: CD45 <r>, Ndet = 16.41 [17.69] +/- 3.44 (mean[median] +/- std), [6.04, 18.30] [min,max]
Biomarker: CD45 <r_f>, Ndet = 6.27 [6.73] +/- 1.39 (mean[median] +/- std), [2.08, 6.90] [min,max]
Biomarker: CD45 <M>, Ndet = 23.32 [25.18] +/- 5.37 (mean[median] +/- std), [7.57, 26.58] [min,max]
Biomarker: PanCK \Sigma, Ndet = 38.94 [37.50] +/- 6.28 (mean[median] +/- std), [30.96, 57.60] [min,max]
Biomarker: PanCK <r>, Ndet = 17.99 [19.36] +/- 3.87 (mean[median] +/- std), [6.45, 19.98] [min,max]
Biomarker: PanCK <r_f>, Ndet = 1.60 [1.75] +/- 0.25 (mean[median] +/- std), [1.09, 1.76] [min,max]
Biomarker: PanCK <M>, Ndet = 18.57 [20.05] +/- 4.38 (mean[median] +/- std), [5.15, 21.35] [min,max]

Ndet performance of individual features using training testing system used in paper
Biomarker: DAPI \Sigma, Ndet = 34.33 [23.19] +/- 34.48 (mean[median] +/- std), [5.17, 171.53] [min,max]
Biomarker: DAPI <r>, Ndet = 20.45 [18.28] +/- 13.44 (mean[median] +/- std), [6.36, 56.78] [min,max]
Biomarker: DAPI <r_f>, Ndet = 6.86 [4.98] +/- 5.33 (mean[median] +/- std), [1.12, 20.36] [min,max]
Biomarker: DAPI <M>, Ndet = 25.59 [20.14] +/- 16.48 (mean[median] +/- std), [8.89, 62.23] [min,max]
Biomarker: Bodipy \Sigma, Ndet = 203.60 [138.44] +/- 229.44 (mean[median] +/- std), [4.52, 1026.95] [min,max]
Biomarker: Bodipy <r>, Ndet = 24.06 [22.70] +/- 13.55 (mean[median] +/- std), [6.33, 55.33] [min,max]
Biomarker: Bodipy <r_f>, Ndet = 60.06 [22.19] +/- 92.53 (mean[median] +/- std), [1.87, 369.78] [min,max]
Biomarker: Bodipy <M>, Ndet = 16.90 [14.94] +/- 9.27 (mean[median] +/- std), [6.03, 39.26] [min,max]
Biomarker: CD45 \Sigma, Ndet = 8.90 [6.86] +/- 8.17 (mean[median] +/- std), [1.68, 50.30] [min,max]
Biomarker: CD45 <r>, Ndet = 25.31 [24.55] +/- 14.48 (mean[median] +/- std), [6.97, 60.66] [min,max]
Biomarker: CD45 <r_f>, Ndet = 9.12 [6.27] +/- 8.52 (mean[median] +/- std), [1.34, 36.35] [min,max]
Biomarker: CD45 <M>, Ndet = 37.81 [36.42] +/- 29.01 (mean[median] +/- std), [6.33, 132.68] [min,max]
Biomarker: PanCK \Sigma, Ndet = 75.74 [30.57] +/- 101.06 (mean[median] +/- std), [6.25, 484.61] [min,max]
Biomarker: PanCK <r>, Ndet = 25.43 [23.58] +/- 13.76 (mean[median] +/- std), [7.70, 55.81] [min,max]
Biomarker: PanCK <r_f>, Ndet = 2.58 [1.94] +/- 2.69 (mean[median] +/- std), [1.08, 18.84] [min,max]
Biomarker: PanCK <M>, Ndet = 22.86 [21.51] +/- 12.41 (mean[median] +/- std), [6.83, 46.20] [min,max]

Ndet performance of regressions on bootstrapped data
Biomarker: Reg1_{All}, Ndet = 715.29 [607.92] +/- 443.77 (mean[median] +/- std), [118.39, 1653.44] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 1135.64 [667.87] +/- 1079.45 (mean[median] +/- std), [163.46, 3901.87] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 762.85 [636.80] +/- 624.56 (mean[median] +/- std), [184.21, 2469.13] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 695.80 [453.58] +/- 844.44 (mean[median] +/- std), [136.48, 3159.66] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 404.88 [299.35] +/- 464.14 (mean[median] +/- std), [120.17, 2285.91] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 51.84 [50.20] +/- 6.87 (mean[median] +/- std), [40.39, 64.91] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 119.16 [120.08] +/- 20.79 (mean[median] +/- std), [89.42, 147.37] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 48.19 [48.93] +/- 3.44 (mean[median] +/- std), [39.38, 51.70] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 63.30 [60.98] +/- 13.19 (mean[median] +/- std), [48.01, 95.30] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 72.64 [80.96] +/- 15.90 (mean[median] +/- std), [41.62, 86.36] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 37.56 [37.52] +/- 7.26 (mean[median] +/- std), [19.16, 46.53] [min,max]

Ndet performance of regressions with day9 15merge \ Reg Mats
Biomarker: Reg1_{All}, Ndet = 495.39 [337.98] +/- 440.96 (mean[median] +/- std), [43.58, 1555.06] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 567.43 [329.20] +/- 536.55 (mean[median] +/- std), [38.02, 2271.87] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 515.53 [345.78] +/- 519.01 (mean[median] +/- std), [25.21, 2257.14] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 373.54 [327.45] +/- 350.10 (mean[median] +/- std), [30.35, 1539.29] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 208.12 [112.40] +/- 292.81 (mean[median] +/- std), [21.14, 1365.74] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 113.62 [62.63] +/- 122.18 (mean[median] +/- std), [13.97, 533.91] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 211.07 [222.80] +/- 164.43 (mean[median] +/- std), [18.28, 628.34] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 79.94 [54.82] +/- 69.95 (mean[median] +/- std), [5.24, 297.33] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 173.59 [129.22] +/- 174.31 (mean[median] +/- std), [6.07, 796.98] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 82.13 [59.49] +/- 54.71 (mean[median] +/- std), [15.98, 196.42] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 85.87 [42.91] +/- 81.01 (mean[median] +/- std), [10.93, 305.47] [min,max]
Appendix D Comparative figures and performance with and without manual debris removal

This appendix compares performance with automated segmentation alone against automated segmentation plus manual exclusion of ROIs in debris. The regression analysis applied 1 set of Reg1-11, trained on the whole dataset, to the whole dataset for both left and right panels rather than performing regression training and testing on each training-testing dataset. This was done to save computation time. Performance with and without the debris removed was roughly the same. Ndet numbers are summarized at the end.

For the figures that follow: figures on the left can be generated by adding filesep, 'Mats without manual debris exclusion' ] to mainPath and to the beginning of regPathPooled.

Left side: ROIs from automated segmentation. Right side: automated segmentation + manual exclusion, as used in the main manuscript.
Just Automated Segmentation Tested Hypothesis:
Reject null hypothesis with p=0.00% confidence that Ndet of regression 1 is the same as NdetMax
Reject null hypothesis with p=0.00% confidence that fpTest of regression 1 is the same as fpMin
Fail to reject null hypothesis with p=25.43% confidence that NdetTrain of regression 1 is the same as NdetTrainMax
Fail to reject null hypothesis with p=100.00% confidence fpTrain of regression 1 is the same as fpTrainMin
Fail to reject null hypothesis with p=6.57% confidence that NdetTrain of regressions 1-4 is the same as NdetTrainMax
Fail to reject null hypothesis with p=52.35% confidence that the minDetectTest of regressions 1 through 4 are the same
Reject null hypothesis with p=0.23% confidence that the minDetectTest of regressions with two channels are the same
Reject null hypothesis with p=0.00% confidence that the minDetectTest of regressions with two and one channels are the same
Fail to reject null hypothesis with p=22.75% confidence that the minDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis with p=37.13% confidence that the minDetectTest a Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis with p=28.92% confidence that the minDetectTest Reg1_all and Reg3_DAPI+CD45+Panck are the same
Fail to reject null hypothesis with p=67.83% confidence that the minDetectTest Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.34%) that the variance of the minDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=86.54%) that the fpMinDetectTest of regressions 1 through 4 are the same
Fail to reject null hypothesis (p=8.24%) that the fpMinDetectTest of regressions with two channels are the same
Reject null hypothesis (p=0.69%) that the fpMinDetectTest of regressions with two and one channels are the same
Fail to reject null hypothesis (p=53.10%) that the fpMinDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=50.07%) that the fpMinDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=45.69%) that the fpMinDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=59.07%) that the fpMinDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=20.50%) that the variance of the fpMinDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=34.77%) that the sensMinDetectTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.44%) that the sensMinDetectTest of regressions with two channels are the same
Reject null hypothesis (p=1.19%) that the sensMinDetectTest of regressions with two and one channels are the same
Fail to reject null hypothesis (p=66.04%) that the sensMinDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=36.97%) that the sensMinDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=28.09%) that the sensMinDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=40.86%) that the sensMinDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.00%) that the variance of the sensMinDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=6.39%) that the d_cohen of regressions 1 through 4 are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with two and one channels are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with a single channel are the same
Reject null hypothesis (p=1.27%) that the d_cohen of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=46.65%) that the d_cohen of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=53.69%) that the d_cohen of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=13.48%) that the variance of the d_cohen for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.57%) that the AUCTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with two and one channels are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with a single channel are the same
Reject null hypothesis (p=0.40%) that the AUCTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=20.67%) that the AUCTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Reject null hypothesis (p=3.13%) that the AUCTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.00%) that the variance of the AUCTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
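
The reject / fail-to-reject lines above appear to follow a fixed significance threshold: every listed outcome with p below 5% is a rejection and every outcome above 5% is not. The sketch below shows that decision rule in Python. The function name `describe_test` and the default `alpha = 0.05` are assumptions for illustration; the text does not state the threshold or which statistical test produced each p-value.

```python
def describe_test(p_value: float, claim: str, alpha: float = 0.05) -> str:
    """Format one hypothesis-test outcome as a reject / fail-to-reject line.

    alpha = 0.05 is an assumption inferred from the listed outcomes;
    it is not a value stated in the text.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # Reject the null hypothesis only when the p-value is below the threshold.
    verdict = "Reject" if p_value < alpha else "Fail to reject"
    return f"{verdict} null hypothesis (p={100 * p_value:.2f}%) that {claim}"


# Two p-values taken from the listing above, one on each side of the threshold.
print(describe_test(0.0313, "the AUCTest of Reg3 is the same as Reg4"))
print(describe_test(0.0639, "the d_cohen of regressions 1 through 4 are the same"))
```

A "fail to reject" outcome only means the data did not show a difference at this threshold; it is not evidence that the compared quantities are equal.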
Ndet performance of individual features with just automated segmentation
Biomarker: DAPI \Sigma, Ndet = 26.56 [19.82] +/- 23.99 (mean[median] +/- std), [5.05, 123.00] [min,max]
Biomarker: DAPI <r>, Ndet = 17.13 [13.64] +/- 10.03 (mean[median] +/- std), [5.86, 43.71] [min,max]
Biomarker: DAPI <r_f>, Ndet = 6.75 [4.47] +/- 6.63 (mean[median] +/- std), [1.27, 26.35] [min,max]
Biomarker: DAPI <M>, Ndet = 21.12 [16.99] +/- 11.37 (mean[median] +/- std), [7.14, 53.30] [min,max]
Biomarker: Bodipy \Sigma, Ndet = 140.60 [56.91] +/- 207.06 (mean[median] +/- std), [5.47, 933.29] [min,max]
Biomarker: Bodipy <r>, Ndet = 18.52 [17.63] +/- 9.03 (mean[median] +/- std), [5.56, 40.79] [min,max]
Biomarker: Bodipy <r_f>, Ndet = 33.61 [16.42] +/- 39.61 (mean[median] +/- std), [1.94, 164.52] [min,max]
Biomarker: Bodipy <M>, Ndet = 12.44 [11.31] +/- 5.50 (mean[median] +/- std), [5.77, 28.51] [min,max]
Biomarker: CD45 \Sigma, Ndet = 6.65 [5.73] +/- 4.30 (mean[median] +/- std), [1.86, 22.28] [min,max]
Biomarker: CD45 <r>, Ndet = 20.29 [18.73] +/- 10.32 (mean[median] +/- std), [4.80, 49.11] [min,max]
Biomarker: CD45 <r_f>, Ndet = 9.49 [5.47] +/- 10.35 (mean[median] +/- std), [1.27, 42.76] [min,max]
Biomarker: CD45 <M>, Ndet = 26.19 [22.69] +/- 15.56 (mean[median] +/- std), [6.78, 76.61] [min,max]
Biomarker: PanCK \Sigma, Ndet = 57.28 [34.43] +/- 63.06 (mean[median] +/- std), [4.45, 266.54] [min,max]
Biomarker: PanCK <r>, Ndet = 21.08 [18.98] +/- 10.04 (mean[median] +/- std), [5.93, 50.07] [min,max]
Biomarker: PanCK <r_f>, Ndet = 3.16 [2.07] +/- 2.61 (mean[median] +/- std), [1.08, 12.45] [min,max]
Biomarker: PanCK <M>, Ndet = 18.26 [15.99] +/- 8.54 (mean[median] +/- std), [5.31, 34.90] [min,max]

Ndet performance of individual features with automated segmentation and manual exclusion
Biomarker: DAPI \Sigma, Ndet = 34.33 [23.19] +/- 34.48 (mean[median] +/- std), [5.17, 171.53] [min,max]
Biomarker: DAPI <r>, Ndet = 20.45 [18.28] +/- 13.44 (mean[median] +/- std), [6.36, 56.78] [min,max]
Biomarker: DAPI <r_f>, Ndet = 6.86 [4.98] +/- 5.33 (mean[median] +/- std), [1.12, 20.36] [min,max]
Biomarker: DAPI <M>, Ndet = 25.59 [20.14] +/- 16.48 (mean[median] +/- std), [8.89, 62.23] [min,max]
Biomarker: Bodipy \Sigma, Ndet = 203.60 [138.44] +/- 229.44 (mean[median] +/- std), [4.52, 1026.95] [min,max]
Biomarker: Bodipy <r>, Ndet = 24.06 [22.70] +/- 13.55 (mean[median] +/- std), [6.33, 55.33] [min,max]
Biomarker: Bodipy <r_f>, Ndet = 60.06 [22.19] +/- 92.53 (mean[median] +/- std), [1.87, 369.78] [min,max]
Biomarker: Bodipy <M>, Ndet = 16.90 [14.94] +/- 9.27 (mean[median] +/- std), [6.03, 39.26] [min,max]
Biomarker: CD45 \Sigma, Ndet = 8.90 [6.86] +/- 8.17 (mean[median] +/- std), [1.68, 50.30] [min,max]
Biomarker: CD45 <r>, Ndet = 25.31 [24.55] +/- 14.48 (mean[median] +/- std), [6.97, 60.66] [min,max]
Biomarker: CD45 <r_f>, Ndet = 9.12 [6.27] +/- 8.52 (mean[median] +/- std), [1.34, 36.35] [min,max]
Biomarker: CD45 <M>, Ndet = 37.81 [36.42] +/- 29.01 (mean[median] +/- std), [6.33, 132.68] [min,max]
Biomarker: PanCK \Sigma, Ndet = 75.74 [30.57] +/- 101.06 (mean[median] +/- std), [6.25, 484.61] [min,max]
Biomarker: PanCK <r>, Ndet = 25.43 [23.58] +/- 13.76 (mean[median] +/- std), [7.70, 55.81] [min,max]
Biomarker: PanCK <r_f>, Ndet = 2.58 [1.94] +/- 2.69 (mean[median] +/- std), [1.08, 18.84] [min,max]
Biomarker: PanCK <M>, Ndet = 22.86 [21.51] +/- 12.41 (mean[median] +/- std), [6.83, 46.20] [min,max]

Ndet performance of Reg1-11 with just automated segmentation
Biomarker: Reg1_{All}, Ndet = 546.63 [328.61] +/- 590.96 (mean[median] +/- std), [20.80, 2495.94] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 483.42 [313.69] +/- 540.00 (mean[median] +/- std), [24.00, 2291.22] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 455.73 [282.51] +/- 529.40 (mean[median] +/- std), [21.98, 2337.45] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 354.20 [300.22] +/- 351.52 (mean[median] +/- std), [22.62, 1557.79] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 196.24 [92.22] +/- 266.88 (mean[median] +/- std), [4.02, 1291.33] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 83.67 [48.96] +/- 85.58 (mean[median] +/- std), [8.43, 308.18] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 176.22 [134.26] +/- 195.63 (mean[median] +/- std), [17.09, 921.89] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 57.48 [39.43] +/- 54.04 (mean[median] +/- std), [4.92, 233.29] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 118.24 [68.99] +/- 142.83 (mean[median] +/- std), [3.83, 647.98] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 74.64 [65.91] +/- 59.44 (mean[median] +/- std), [10.39, 325.45] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 73.56 [33.21] +/- 71.74 (mean[median] +/- std), [4.42, 337.58] [min,max]

Ndet performance of Reg1-11 with automated segmentation and manual exclusion
Biomarker: Reg1_{All}, Ndet = 495.39 [337.98] +/- 440.96 (mean[median] +/- std), [43.58, 1555.06] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 567.43 [329.20] +/- 536.55 (mean[median] +/- std), [38.02, 2271.87] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 515.53 [345.78] +/- 519.01 (mean[median] +/- std), [25.21, 2257.14] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 373.54 [327.45] +/- 350.10 (mean[median] +/- std), [30.35, 1539.29] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 208.12 [112.40] +/- 292.81 (mean[median] +/- std), [21.14, 1365.74] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 113.62 [62.63] +/- 122.18 (mean[median] +/- std), [13.97, 533.91] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 211.07 [222.80] +/- 164.43 (mean[median] +/- std), [18.28, 628.34] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 79.94 [54.82] +/- 69.95 (mean[median] +/- std), [5.24, 297.33] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 173.59 [129.22] +/- 174.31 (mean[median] +/- std), [6.07, 796.98] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 82.13 [59.49] +/- 54.71 (mean[median] +/- std), [15.98, 196.42] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 85.87 [42.91] +/- 81.01 (mean[median] +/- std), [10.93, 305.47] [min,max]

Note: These numbers are lower than those discussed in the paper. Those results were obtained with a regression done on each training dataset.
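
Each Ndet line above condenses a set of per-subset values into a mean, median, standard deviation, minimum and maximum. The following Python sketch shows one way such a line could be produced. The function name `summarize_ndet`, the use of the sample standard deviation (n - 1 denominator), and the example values are all assumptions for illustration; the text does not say which convention or tool the original analysis used.

```python
import statistics


def summarize_ndet(values):
    """Return a summary in the 'mean [median] +/- std, [min, max]' style.

    Assumption: sample standard deviation (n - 1 denominator); the text does
    not say which convention the original analysis used.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to compute a sample std")
    mean = statistics.fmean(values)
    median = statistics.median(values)
    std = statistics.stdev(values)
    return (
        f"Ndet = {mean:.2f} [{median:.2f}] +/- {std:.2f} (mean[median] +/- std), "
        f"[{min(values):.2f}, {max(values):.2f}] [min,max]"
    )


# Made-up values for illustration only; these are not data from the dissertation.
example = [1.0, 2.0, 3.0, 4.0, 10.0]
print(summarize_ndet(example))
```

Reporting the median next to the mean is useful for listings like this one, where many entries have a mean well above the median and a standard deviation comparable to the mean, which is what a right-skewed distribution with a few large values would produce.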
Appendix E
Comparative figures and performance of Reg1-11 generated once using all days' data and applied to each of the training-testing subsets, with Reg1-11 generated for each of the training-testing subsets
Left side: all regressions (day9 15merge \ Reg Mats), compared to right-side figures from the main dissertation
Hypothesis Testing Results with day9 15merge \ Reg Mats
Reject null hypothesis (p=0.00%) that Ndet of regression 1 is the same as NdetMax
Reject null hypothesis (p=0.00%) that fpTest of regression 1 is the same as fpMin
Fail to reject null hypothesis (p=27.32%) that NdetTrain of regression 1 is the same as NdetTrainMax
Fail to reject null hypothesis (p=100.00%) that fpTrain of regression 1 is the same as fpTrainMin
Fail to reject null hypothesis (p=5.33%) that NdetTrain of regressions 1-4 is the same as NdetTrainMax
Fail to reject null hypothesis (p=53.99%) that the minDetectTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.49%) that the minDetectTest of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the minDetectTest of regressions with two and one channels are the same
Reject null hypothesis (p=0.80%) that the minDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=82.89%) that the minDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=95.03%) that the minDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=24.25%) that the minDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.80%) that the variance of the minDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=98.58%) that the fpMinDetectTest of regressions 1 through 4 are the same
Reject null hypothesis (p=4.43%) that the fpMinDetectTest of regressions with two channels are the same
Reject null hypothesis (p=0.05%) that the fpMinDetectTest of regressions with two and one channels are the same
Fail to reject null hypothesis (p=23.82%) that the fpMinDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=90.66%) that the fpMinDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=88.92%) that the fpMinDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=100.00%) that the fpMinDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=4.93%) that the variance of the fpMinDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=38.88%) that the sensMinDetectTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.68%) that the sensMinDetectTest of regressions with two channels are the same
Reject null hypothesis (p=0.68%) that the sensMinDetectTest of regressions with two and one channels are the same
Fail to reject null hypothesis (p=8.49%) that the sensMinDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=35.02%) that the sensMinDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=41.39%) that the sensMinDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=35.20%) that the sensMinDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.00%) that the variance of the sensMinDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.28%) that the d_cohen of regressions 1 through 4 are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with two and one channels are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with a single channel are the same
Reject null hypothesis (p=0.52%) that the d_cohen of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=34.64%) that the d_cohen of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=8.44%) that the d_cohen of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=14.44%) that the variance of the d_cohen for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.45%) that the AUCTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with two and one channels are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with a single channel are the same
Reject null hypothesis (p=0.94%) that the AUCTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=21.69%) that the AUCTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Reject null hypothesis (p=2.00%) that the AUCTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.00%) that the variance of the AUCTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45

Ndet performance of regressions with day9 15merge \ Reg Mats
Biomarker: Reg1_{All}, Ndet = 495.39 [337.98] +/- 440.96 (mean[median] +/- std), [43.58, 1555.06] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 567.43 [329.20] +/- 536.55 (mean[median] +/- std), [38.02, 2271.87] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 515.53 [345.78] +/- 519.01 (mean[median] +/- std), [25.21, 2257.14] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 373.54 [327.45] +/- 350.10 (mean[median] +/- std), [30.35, 1539.29] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 208.12 [112.40] +/- 292.81 (mean[median] +/- std), [21.14, 1365.74] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 113.62 [62.63] +/- 122.18 (mean[median] +/- std), [13.97, 533.91] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 211.07 [222.80] +/- 164.43 (mean[median] +/- std), [18.28, 628.34] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 79.94 [54.82] +/- 69.95 (mean[median] +/- std), [5.24, 297.33] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 173.59 [129.22] +/- 174.31 (mean[median] +/- std), [6.07, 796.98] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 82.13 [59.49] +/- 54.71 (mean[median] +/- std), [15.98, 196.42] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 85.87 [42.91] +/- 81.01 (mean[median] +/- std), [10.93, 305.47] [min,max]

Ndet performance of regressions generated on each training dataset
Biomarker: Reg1_{All}, Ndet = 480.78 [330.33] +/- 470.38 (mean[median] +/- std), [18.36, 1922.30] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 416.14 [338.79] +/- 370.64 (mean[median] +/- std), [24.83, 1676.27] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 404.31 [273.68] +/- 462.75 (mean[median] +/- std), [20.66, 2044.99] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 433.01 [325.41] +/- 488.02 (mean[median] +/- std), [22.88, 2365.96] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 266.69 [151.19] +/- 343.04 (mean[median] +/- std), [19.26, 1514.20] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 127.51 [74.29] +/- 166.99 (mean[median] +/- std), [6.89, 812.53] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 225.27 [121.11] +/- 279.94 (mean[median] +/- std), [2.91, 1272.05] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 75.60 [63.57] +/- 72.28 (mean[median] +/- std), [2.74, 329.86] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 190.32 [96.49] +/- 238.36 (mean[median] +/- std), [5.40, 1113.32] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 69.71 [41.07] +/- 68.25 (mean[median] +/- std), [10.61, 272.03] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 67.18 [39.05] +/- 65.50 (mean[median] +/- std), [4.36, 252.86] [min,max]
Appendix F
Comparative figures and performance of Reg1-11 generated once using all day 9 data and applied to each of the training-testing subsets, with Reg1-11 generated for each of the training-testing subsets
Left side: day 9 regressions (all). Right side: between-day regressions in the dissertation
Hypothesis test results:
Reject null hypothesis (p=0.00%) that Ndet of regression 1 is the same as NdetMax
Reject null hypothesis (p=0.00%) that fpTest of regression 1 is the same as fpMin
Reject null hypothesis (p=0.02%) that NdetTrain of regression 1 is the same as NdetTrainMax
Fail to reject null hypothesis (p=39.86%) that fpTrain of regression 1 is the same as fpTrainMin
Reject null hypothesis (p=0.00%) that NdetTrain of regressions 1-4 is the same as NdetTrainMax
Reject null hypothesis (p=2.59%) that the minDetectTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.74%) that the minDetectTest of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the minDetectTest of regressions with two and one channels are the same
Reject null hypothesis (p=0.14%) that the minDetectTest of regressions with a single channel are the same
Reject null hypothesis (p=2.08%) that the minDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=69.77%) that the minDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=94.45%) that the minDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=6.82%) that the variance of the minDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=22.43%) that the fpMinDetectTest of regressions 1 through 4 are the same
Fail to reject null hypothesis (p=76.63%) that the fpMinDetectTest of regressions with two channels are the same
Reject null hypothesis (p=0.02%) that the fpMinDetectTest of regressions with two and one channels are the same
Reject null hypothesis (p=0.01%) that the fpMinDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=11.08%) that the fpMinDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=58.98%) that the fpMinDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=28.12%) that the fpMinDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=7.46%) that the variance of the fpMinDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=13.76%) that the sensMinDetectTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.20%) that the sensMinDetectTest of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the sensMinDetectTest of regressions with two and one channels are the same
Reject null hypothesis (p=0.00%) that the sensMinDetectTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=43.09%) that the sensMinDetectTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=49.33%) that the sensMinDetectTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=6.97%) that the sensMinDetectTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=11.06%) that the variance of the sensMinDetectTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.00%) that the d_cohen of regressions 1 through 4 are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with two and one channels are the same
Reject null hypothesis (p=0.00%) that the d_cohen of regressions with a single channel are the same
Reject null hypothesis (p=0.00%) that the d_cohen of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=76.66%) that the d_cohen of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Fail to reject null hypothesis (p=74.44%) that the d_cohen of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Fail to reject null hypothesis (p=44.10%) that the variance of the d_cohen for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.03%) that the AUCTest of regressions 1 through 4 are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with two channels are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with two and one channels are the same
Reject null hypothesis (p=0.00%) that the AUCTest of regressions with a single channel are the same
Fail to reject null hypothesis (p=42.23%) that the AUCTest of Reg1_all and Reg2_sigma are the same
Fail to reject null hypothesis (p=88.64%) that the AUCTest of Reg1_all and Reg3_DAPI+CD45+PanCK are the same
Reject null hypothesis (p=0.05%) that the AUCTest of Reg3_DAPI+PanCK+CD45 is the same as Reg4_DAPI+Bodipy+CD45
Reject null hypothesis (p=0.10%) that the variance of the AUCTest for Reg3_DAPI+PanCK+CD45 is the same as for Reg4_DAPI+Bodipy+CD45

Performance of Reg1
Biomarker: Reg1_{All}, Ndet = 340.09 [234.13] +/- 422.05 (mean[median] +/- std), [52.24, 2067.51] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 510.60 [371.41] +/- 408.10 (mean[median] +/- std), [62.64, 1353.74] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 316.83 [175.47] +/- 412.61 (mean[median] +/- std), [54.74, 2007.34] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 314.56 [215.49] +/- 315.30 (mean[median] +/- std), [47.47, 1340.38] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 116.40 [84.78] +/- 87.15 (mean[median] +/- std), [24.56, 348.77] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 85.29 [45.31] +/- 100.60 (mean[median] +/- std), [6.57, 453.69] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 151.83 [107.49] +/- 162.80 (mean[median] +/- std), [4.79, 710.51] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 74.28 [55.83] +/- 63.73 (mean[median] +/- std), [8.56, 249.03] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 216.61 [143.71] +/- 317.73 (mean[median] +/- std), [4.50, 1438.78] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 60.12 [39.55] +/- 56.32 (mean[median] +/- std), [14.57, 227.56] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 76.29 [44.37] +/- 78.74 (mean[median] +/- std), [6.21, 296.59] [min,max]

Ndet performance of regressions generated on each training dataset in paper
Biomarker: Reg1_{All}, Ndet = 480.78 [330.33] +/- 470.38 (mean[median] +/- std), [18.36, 1922.30] [min,max]
Biomarker: Reg2_{Sigma}, Ndet = 416.14 [338.79] +/- 370.64 (mean[median] +/- std), [24.83, 1676.27] [min,max]
Biomarker: Reg3_{DAPI + CD45 + PanCK}, Ndet = 404.31 [273.68] +/- 462.75 (mean[median] +/- std), [20.66, 2044.99] [min,max]
Biomarker: Reg4_{DAPI + Bodipy + CD45}, Ndet = 433.01 [325.41] +/- 488.02 (mean[median] +/- std), [22.88, 2365.96] [min,max]
Biomarker: Reg5_{DAPI + CD45}, Ndet = 266.69 [151.19] +/- 343.04 (mean[median] +/- std), [19.26, 1514.20] [min,max]
Biomarker: Reg6_{DAPI + PanCK}, Ndet = 127.51 [74.29] +/- 166.99 (mean[median] +/- std), [6.89, 812.53] [min,max]
Biomarker: Reg7_{DAPI + Bodipy}, Ndet = 225.27 [121.11] +/- 279.94 (mean[median] +/- std), [2.91, 1272.05] [min,max]
Biomarker: Reg8_{DAPI}, Ndet = 75.60 [63.57] +/- 72.28 (mean[median] +/- std), [2.74, 329.86] [min,max]
Biomarker: Reg9_{Bodipy}, Ndet = 190.32 [96.49] +/- 238.36 (mean[median] +/- std), [5.40, 1113.32] [min,max]
Biomarker: Reg10_{CD45}, Ndet = 69.71 [41.07] +/- 68.25 (mean[median] +/- std), [10.61, 272.03] [min,max]
Biomarker: Reg11_{PanCK}, Ndet = 67.18 [39.05] +/- 65.50 (mean[median] +/- std), [4.36, 252.86] [min,max]
Appendix G
Area Under the Curve (AUC) up to FPR .05
Figure 13 with an additional line for the AUC up to an FPR of .05, as used in Lannin TB, Thege FI, Kirby BJ. Comparison and optimization of machine learning methods for automated classification of circulating tumor cells. Cytometry 2016;89:922-931.
Reg1-4 have a mean value of .049, compared to the best value of .048 in the above reference.
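
To make the metric concrete, here is a minimal Python sketch of an AUC restricted to false positive rates up to .05, computed from an empirical ROC curve with the trapezoidal rule. The function name `partial_auc`, the threshold sweep, and the linear interpolation at the cutoff are assumptions for illustration; the exact procedure used in the dissertation and in the cited reference may differ.

```python
def partial_auc(scores_pos, scores_neg, max_fpr=0.05):
    """Area under the empirical ROC curve for FPR in [0, max_fpr].

    scores_pos / scores_neg are classifier scores for true positives and
    true negatives; a higher score means "more likely positive".
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    # Sweep the decision threshold from the highest score to the lowest.
    thresholds = sorted(set(scores_pos) | set(scores_neg), reverse=True)
    points = [(0.0, 0.0)]  # (FPR, TPR) pairs, starting with "call nothing positive"
    for t in thresholds:
        tpr = sum(s >= t for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= t for s in scores_neg) / len(scores_neg)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr using linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid under this segment
    return area


# Illustrative scores only: perfectly separated classes reach the ceiling of 0.05.
print(partial_auc([0.9, 0.8], [0.1, 0.2]))
```

Because the true positive rate cannot exceed 1, the largest value this restricted AUC can take is the cutoff itself, .05, so the values of .049 and .048 quoted above both sit close to the ceiling.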
Appendix H
Regressed Equations Trained on All Data
In Chapter 4, regressions were done for each of the 48 training-testing subsets, so the analysis in the main manuscript was done with 48 different versions of these equations. Below is the result of training the regressions using all the data pooled instead. It is an example of the form of the resulting regressions. The performance of these equations is compared to those in the main manuscript in Appendix E. Appendix I has the form of the regressions trained on just day 9 data.
Appendix I
Regressed Equations Trained on just Day 9 Data
In Chapter 4, regressions were done for each of the 48 training-testing subsets, so the analysis in the main manuscript was done with 48 different versions of these equations. Below is the result of training the regressions using just day 9 data. It is an example of the form of the resulting regressions. The performance of these equations is compared to those in the main manuscript in Appendix F. Appendix H has the form of the regressions trained on all data.
Appendix J
Notes on Commercialized CTC Technologies
This appendix is a compilation of notes from my review of all the companies listed in the news articles of [33] and [34]. I characterize the technology employed by these companies into 3 groups: immunoenrichment, size enrichment, and large field image cytometry. These two articles are written for an investor audience rather than a scientific audience. Some companies are listed in the articles because they have technology that could be used for CTC applications. For these companies keting. There was also a CTC instrument service company and a company claiming a CTC focus but with no presentation of its technology on its website. I classified the technology being marketed by these companies into 3 categories: immunoenrichment, size enrichment, and wide field image cytometry. Some companies have more developed CTC pipelines, while others seem to focus on a single step such as magnetic bead enrichment.

Technology: Immunoenrichment

Company: Janssen (Veridex)
Product: CellSearch, CellTracks TDI
Technology: EpCAM+ bead capture followed by image cytometry. This is still the only FDA approved product for CTC identification.

Company: Fluxion Bio
Product:
Technology: Microfluidics with positive immunoselection
Company: Gilupi
Product: Gilupi CellCollector
Technology: Selection for EpCAM positive cells in vivo. A needle with positive immunoselection.

Company: Miltenyi Biotec
Products: Magnetic bead kits for EpCAM cell enrichment.
Technology: Magnetic bead immunoenrichment.

Technology: Size Enrichment

Company: Rarecells
Product: Isolation of Epithelial Cells by Size (ISET)
Technology: Size enrichment for CTCs.

Company: Biofluidica
Product: Biofluidica CTC System
Technology: Microfluidic CTC capture device. Device is based on size capture.

Company: Clearbridge
Products: The CTChip FR & ClearCell FX1 System.
Technology: Microfluidic CTC size capture device.
Company: Celsee
Product: SingleCell, Prep100, Prep400 & Celsee Analyzer
Technology: Microfluidic CTC size capture

Company: Creatv MicroTech
Product: CellSieve
Technology: Capture on 7 um pore size filters

Company: Parsortix
Technology: Microfluidic size capture from whole blood.

Product: OncoQuick
Technology: Centrifugal tube with engineered porous barrier for CTC enrichment.

Technology: Large field immuno image cytometry

Company: Epic Sciences
Technology: Wide field of view image cytometry. They market it as "no cell left behind", by which they mean no nucleated cell. They enrich to get rid of red blood cells and platelets.

Company: Cytotrack
Products: Cytotrack and CytoDisc
Technology: Large field image cytometry (100 million cells) on custom substrates to hold cells.
CTC Company with unknown technology

Company: IVDiagnostics
Products: Velox, Mango Praxis, Admonitrix
Description: CTC start-up company based out of Purdue Technology Center
Technology: There are no papers or posters linked from their website, and the description is quite vague.

Companies with Technology Applicable to CTCs but not focusing on it

Company: Advanced Cell Diagnostics
Product: RNAscope
Technology: Product for doing RNA imaging.

Company: BioView
Technology: Image cytometry systems and whole slide microscopy.

Company: Fluidigm
Technology: Microfluidic fluid processor.

Company: Ikonisys
Products: Ikoniscope Digital Microscopy System
Technology: Automated image cytometry.

CTC equipment service companies

Company: ApoCell
Product: Services the CellSearch (Veridex) System
Non cell based cancer diagnostic companies

Company: AdnaGen / Qiagen
Products: AdnaTests with many cancer type specific variations.

