Citation
Quantitative analysis of the lungs on computed tomography

Material Information

Title:
Quantitative analysis of the lungs on computed tomography
Creator:
Humphries, Stephen M. ( author )
Language:
English
Physical Description:
1 electronic file (145 pages)

Subjects

Subjects / Keywords:
Lungs -- Tomography ( lcsh )
Tomography ( lcsh )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Review:
Computed tomography (CT) reveals detailed anatomic structure and plays a key role in the diagnosis of many lung diseases. The current gold standard for evaluation of CT is visual assessment, which is limited by high inter-observer variation. The overall goal of proposed research is to develop and validate quantitative methods for evaluating anatomic structure and the appearance of diseases such as cystic fibrosis (CF) and idiopathic pulmonary fibrosis (IPF) on CT of the lungs. The general approach begins with systematic feature extraction, for example geometric measurements or quantitative description of image characteristics like texture. Accumulation of feature data into a training set makes it possible to apply statistical learning methods that can help identify correlation between specific CT features and other metrics describing normal processes or disease states. Project hypotheses are that quantitative image analysis can be used to: (1) distinguish normal lung from characteristic patterns associated with IPF based on texture, (2) facilitate automated, texture-based quantification of extent of fibrosis in IPF and (3) identify group differences in airway morphology using statistical shape modeling.
Thesis:
Thesis (Ph.D.)--University of Colorado Denver.
Bibliography:
Includes bibliographic references.
System Details:
System requirements: Adobe Reader.
General Note:
Department of Bioengineering
Statement of Responsibility:
by Stephen M. Humphries.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
911203703 ( OCLC )
ocn911203703

Full Text

PAGE 1

QUANTITATIVE ANALYSIS OF THE LUNGS ON COMPUTED TOMOGRAPHY
by
STEPHEN M. HUMPHRIES
B.A., Connecticut College, 1993
M.S., University of Colorado Health Sciences Center, 1996
M.S., University of Colorado Denver, 2012

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
Bioengineering
2015

PAGE 2

This thesis for the Doctor of Philosophy degree by
Stephen M. Humphries
has been approved for the
Department of Bioengineering
by
David Lynch, Advisor
Kendall Hunter, Chair
Richard Weir
Emily DeBoer
Joyce Schroeder

March 10, 2015

PAGE 3

Humphries, Stephen M. (Ph.D., Bioengineering)
Quantitative Analysis of the Lungs on Computed Tomography
Thesis directed by Professor David Lynch

ABSTRACT

Computed tomography (CT) reveals detailed anatomic structure and plays a key role in the diagnosis of many lung diseases. The current gold standard for evaluation of CT is visual assessment, which is limited by high inter-observer variation. The overall goal of proposed research is to develop and validate quantitative methods for evaluating anatomic structure and the appearance of diseases such as cystic fibrosis (CF) and idiopathic pulmonary fibrosis (IPF) on CT of the lungs. The general approach begins with systematic feature extraction, for example geometric measurements or quantitative description of image characteristics like texture. Accumulation of feature data into a training set makes it possible to apply statistical learning methods that can help identify correlation between specific CT features and other metrics describing normal processes or disease states. Project hypotheses are that quantitative image analysis can be used to: (1) distinguish normal lung from characteristic patterns associated with IPF based on texture, (2) facilitate automated, texture-based quantification of extent of fibrosis in IPF and (3) identify group differences in airway morphology using statistical shape modeling.

The form and content of this abstract are approved. I recommend its publication.
Approved: David Lynch

PAGE 4

ACKNOWLEDGMENT

I wish to express my sincere appreciation to my mentors and collaborators, especially Dr. David Lynch for his generous support and guidance throughout my Ph.D. study. His considerable expertise, understanding and guidance have been invaluable additions to my research experience. I would also like to thank Dr. Joyce Schroeder whose insights as both a radiologist and computer scientist provided clarifying perspective.

The team at National Jewish Health's Quantitative Imaging Lab have all been very generous with their time and talents. In particular, visiting scholars Dr. Kunihiro Yagihashi and Dr. Byung-Hak Rho, who invested a great deal of time and energy annotating CT images; Systems and Data Administrator Alex Kluiber, and Research Analyst Jordan Zach. I would also like to express my thanks to Dr. Sonia Leach and Dan Collins of the Center for Genetics and Environmental Health at National Jewish for their assistance with scientific computing resources.

I would like to express my gratitude to the faculty, staff and my classmates in the Department of Bioengineering at UC Denver. In particular, I am grateful to Dr. Kendall Hunter whose advice and encouragement throughout my Bioengineering studies have been indispensable. I am also appreciative of Dr. Robin Shandas for his advice and leadership, as well as Dr. Richard Weir for machine learning discussions and his willingness to be on my committee.

I would also like to thank Dr. Emily DeBoer, who shared her interest in quantitative lung CT with me shortly after I began my Bioengineering studies and has remained a valuable source of advice and collaboration.

Finally, thank you to all of my family and friends, especially my wife, Babs, for your patience and encouragement over the last few years.

This work was made possible in part by Colorado Biosciences Discovery Grant 10BGF-21, and COPDGene.

PAGE 5

TABLE OF CONTENTS

Tables ... vii
Figures ... viii

Chapter
1. Introduction ... 1
2. Background ... 10
   2.1 Computed Tomography ... 10
   2.2 Image Analysis ... 15
       2.2.1 Feature Extraction ... 16
       2.2.2 Machine Learning ... 21
   2.3 Respiratory System ... 29
       2.3.1 Idiopathic Pulmonary Fibrosis ... 34
       2.3.2 Cystic Fibrosis ... 40
3. Spin Image Features for Quantification of UIP on CT ... 43
   3.1 Introduction ... 44
   3.2 Methods ... 46
   3.3 Results ... 51
   3.4 Discussion ... 52
   3.5 Conclusions ... 55
4. Unsupervised Feature Learning: Technical Validation ... 56
   4.1 Introduction ... 56
   4.2 Methods ... 59
   4.3 Results ... 64
   4.4 Discussion ... 64
   4.5 Conclusion ... 68
5. Unsupervised Feature Learning for Quantification of UIP on CT ... 69
   5.1 Introduction ... 69

PAGE 6

   5.2 Methods ... 71
   5.3 Results ... 74
   5.4 Discussion ... 75
   5.5 Conclusions ... 84
6. Statistical Shape Modeling of the Airway Tree ... 85
   6.1 Introduction ... 85
   6.2 Methods ... 87
   6.3 Results ... 95
   6.4 Discussion ... 103
   6.5 Conclusions ... 110
7. Conclusions ... 111

Appendix
A. Spatial Information in ROIs ... 113
B. Comparison with Standard Filters ... 117
C. Semi-Supervised Learning to Separate Mixed Regions ... 119

References ... 125

PAGE 7

TABLES

Table
3.1 Spin image ROI cross-validation results ... 52
3.2 Correlation between algorithm, visual scores and physiologic measures ... 53
3.3 Correlation between lung pixel statistics and physiologic measures ... 53
4.1 Cross-validation f1 scores ... 66
5.1 Cross-validation results, binary classifier to detect fibrosis ... 76
5.2 Cross-validation results, multi-category classifier ... 76
5.3 Association between radiologist and algorithm ... 78
5.4 Associations at baseline in IPFNet subjects (n=280) ... 79
5.5 Associations at baseline in IPFNet subjects (n=50) ... 79
5.6 Associations at baseline in PANTHER subjects with follow-up CT (n=72) ... 80
5.7 Associations in changes for PANTHER subjects with follow-up CT (n=72) ... 80
6.1 Subject demographics ... 97
6.2 Simulation Results ... 99
6.3 LOOCV Results ... 100
6.4 Association between SSM parameters and clinical variables ... 101
6.5 Logistic regression parameters ... 104
A.1 Slice location confusion matrix ... 116
B.1 Comparison between UFL and RFS results ... 118

PAGE 8

FIGURES

Figure
2.1 Basic CT schematic ... 11
2.2 CT attenuation ... 12
2.3 Grey-level co-occurrence matrices ... 18
2.4 Local binary pattern ... 21
2.5 Supervised learning ... 23
2.6 Random Forest ... 27
2.7 Respiratory anatomy ... 31
2.8 Tracheobronchial tree hierarchy ... 32
2.9 IPF on coronal CT images ... 39
3.1 First order pixel statistics ... 45
3.2 UIP texture montages ... 46
3.3 ROI acquisition software interface ... 47
3.4 Labeled ROIs ... 48
3.5 Intensity domain spin image ... 49
3.6 Spin image classification with RF ... 51
3.7 Whole image classification by RF ... 52
4.1 Spatial pooling over an ROI ... 64
4.2 Example dictionary elements ... 65
4.3 Classification accuracy with different dictionaries ... 65
4.4 Confusion matrix for whole ROI testing ... 66
4.5 Confusion matrix for mid level ROI testing ... 67
5.1 Concordance with radiologist, example 1 ... 76
5.2 Concordance with radiologist, example 2 ... 77
5.3 Fraction of lung involved, radiologist versus algorithm ... 77
5.4 Accuracy of fibrosis type estimated by algorithm, example 1 ... 78

PAGE 9

5.5 Accuracy of fibrosis type estimated by algorithm, example 2 ... 78
6.1 Estimation lumen size with inscribed spheres ... 93
6.2 Airway SSM process ... 94
6.3 Synthetic tree shapes for simulation ... 96
6.4 Diagram of airway branches used ... 96
6.5 Variance explained by SSM PCA ... 100
6.6 Receiver operating characteristic curve ... 102
6.7 PCA scores of Control and CF subjects ... 102
6.8 Representative airway tree skeletons ... 103
A.1 Euclidean distance transform of lung mask ... 114
A.2 Lung segmentation at different locations ... 115
A.3 Slice location prediction ... 116
B.1 RFS Filters ... 118
C.1 Examples of mixed free-form ROIs ... 120
C.2 Manually labeled TB ROIs ... 121
C.3 TB ROIs separated from free-form ROIs with mixed patterns ... 121
C.4 RA ROIs separated from free-form ROIs with mixed patterns ... 122
C.5 HC ROIs separated from free-form ROIs with mixed patterns ... 123
C.6 BV ROIs separated from control CTs ... 124

PAGE 10

1. Introduction

Diseases affecting the lungs are a leading, and growing, cause of morbidity and mortality worldwide. It is estimated that a billion people suffer from chronic respiratory conditions, costing the global economy billions of dollars annually [1]. The continued prevalence of tobacco use and increasingly compromised air quality, including outdoor air pollution and occupational exposures, promote the risks of respiratory disease and are factors in its growing frequency.

Diagnostic tests are an important part of lessening the impact of lung diseases. Imaging modalities such as x-ray, ultrasound, magnetic resonance and computed tomography are particularly valuable because they enable sampling tissue non-invasively and in-vivo [23]. Information provided by medical images is central to diagnosis, evaluating extent or progression of disease and guiding treatment. Imaging also plays an increasing role in pre-clinical decision making, for example in the development of drugs or devices [107]. Computed Tomography (CT) is the currently preferred imaging modality for evaluating the lungs because it is the technique that most effectively captures pulmonary structure, thanks to high spatial resolution and the natural difference in density of aerated lung relative to soft tissue [51],[23].

For the most part, imaging procedures and systems were designed to produce data presentable as visualizations. However, visual assessment of CT images, the current standard, is time consuming, subjective and imprecise. Results are traditionally communicated in vague, non-specific terms, reducing intricate graphical representations of internal anatomy to a fairly limited and subjective vocabulary. Expressions like "appears slightly enlarged" or "moderate increase" are used frequently, but are not specific enough for rigorous quantitative analysis.

Visual assessment also suffers significant inter- and intra-observer variation. This is attributable to both technical challenges and variability in radiologist experience [135]. Mixed appearance of disease features, the complexity of three dimensional

PAGE 11

anatomic shapes, and the sheer volume of image data produced make visual evaluation of lung CT a difficult task. Today's radiologists review clinical studies that can be extremely large and multidimensional, consisting of hundreds or even thousands of image cross-sections. These issues, combined with the emergence of new and more individualized treatments, motivate interest in reliable quantitative methods for evaluating CT images with the goal of gleaning more precise and specific information from exams than can be achieved by visual evaluation alone. CT already plays a significant and growing role in diagnosis and treatment. Methods to further capitalize on existing technology and even existing data could represent considerable value.

Images of all types can contain substantial amounts of information, but it is for the most part unstructured and requires human analysis to fully appreciate. Formally, unstructured data is information that does not fit an established data model and therefore is not easily processed by typical computational methods. Traditional data processing and statistical techniques tend to require input data that are stored in tidy, labeled containers, such as the rows and columns of a spreadsheet, or much more complicated schema in relational databases. Objective, automatic analysis of unstructured data using modern quantitative methods is difficult in part because effective representation of the unstructured data, like raw image pixels, is a challenge.

Consider the relatively simple task of organizing personal photos. Metadata logged by digital cameras is structured and can easily be used to sort images by date, time or even location (using geographic information that can be captured by smartphones). However the actual content of photos is a much richer source of information: the proverbial one thousand words. This content is essentially completely unstructured. Quantification of image features that can be used to distinguish faces or recognize scenes is not easy; many local features are subtle and interpretation of complex patterns draws on higher level understanding than can be incorporated into computer programs. As a result, determining who is in a photo or finding background

PAGE 12

structures that can inform location have long been tasks that require human intervention.

Systematic measurement, analysis and reporting of image attributes is usually called quantitative imaging. The Radiological Society of North America's Quantitative Imaging Biomarkers Alliance defines quantitative image analysis as: "the extraction of quantifiable features from medical images for the assessment of normal or the severity, degree of change, or status of a disease, injury, or chronic condition relative to normal" [2]. Put simply, quantitative imaging is the direct calculation, using image data as input, of metrics that can indicate the clinical condition of a subject. In many respects this amounts to expressing content of medical images as structured data in order to facilitate subsequent analysis, especially assessment of correlations between image features and clinical variables.

Quantitative image analysis offers the possibility of more objective, consistent image analysis than visual assessment alone. Even fairly simple measures, like volume of a tumor or organ, or estimated tissue density as revealed by CT pixel intensities, are more precise and informative than strictly qualitative descriptions. To some extent the value of quantitative image analysis has been recognized for some time. Certain established applications make use of quantitative information available in imaging modalities. Examples include calibration of CT pixel intensities in order to measure bone mineral density, and dimensions of lung tumors used to assess treatment response or tumor progression [103].

Quantitative imaging relies on two separate, but complementary, overall tasks. The first is extraction of quantifiable features, which may be described as the technical task of identifying and describing, in a structured data sense, relevant and distinctive characteristics in digital images. This can be accomplished using methods of image processing and computer vision to detect and measure important qualities of raw image data. The essential quantifiable features in CT images are geometry, including

PAGE 13

dimensions such as lengths, volumes and angles, and pixel intensities, which are related to material density.

The second key task of quantitative imaging is to establish associations between these measured features and clinical status. The ultimate success of a quantitative imaging method relies equally on its technical viability, including how accurate, reproducible and efficient it is, and its clinical significance: how clearly measured changes are linked to presence and extent of disease, function, and/or success of therapy [22]. To be successful, quantitative imaging methods must be accurate, reproducible and closely linked with the presence or extent of a disease or therapeutic effect.

There are a few key ways that the technical methods of quantitative imaging may be translated to meaningful clinical applications. A primary goal of quantitative imaging research is development of imaging biomarkers. Biomarkers are measurable quantities that are associated with the presence or severity of deviations from so-called normal, biologic or pathogenic processes, indicating disease [120]. An imaging biomarker is an anatomic, physiologic, biochemical, or molecular parameter detectable with imaging tools that can serve as a biomarker. Imaging biomarkers can be useful in evaluating responses to therapy and treatment efficacy [114]. If properly validated, they may serve as surrogate endpoints for clinical outcomes in trials. Whereas clinical outcomes are overall results like mortality, pain relief or quality of life, surrogate endpoints are values that can be measured objectively and are sufficiently and demonstrably correlated with clinical outcomes. In some circumstances, surrogate endpoints can be used as measures of the efficacy of experimental treatments [9]. Imaging biomarkers that can serve as surrogate endpoints in pharmaceutical or device trials could be extremely valuable.

Visual evaluation of medical images requires specialized training, as well as significant experience. The process is both time consuming and subjective. Studies indicate that purely visual detection by human observers is inadequate to the task

PAGE 14

of comprehensive detection. Double-reading of images by independent radiologists has been shown to improve results of visual evaluation, but this is not practical or cost effective in most circumstances [42]. Computer aided detection (CAD) is the implementation of validated quantitative imaging methods to assist radiologists in identifying, localizing, and characterizing the extent and severity of abnormalities in medical image data. CAD systems offer the opportunity to improve detection rates without incurring the costs and logistical challenges of double reading. A number of CAD systems have been commercialized, such as those designed to evaluate mammograms (which has led to a quantifiable increase in breast cancer detections [49]). CAD systems can serve as an objective second reader, one that scans images methodically, draws on an established training database and does not fatigue. The use of these systems has led to marked improvement in sensitivity, reducing the number of misdiagnoses, false negatives and other non-optimal clinical outcomes caused by ambiguous or incorrect interpretations of, or a failure to identify, anomalies in medical images. Ultimately CAD may offer significant improvements to the medical imaging process and better outcomes resulting from early detection.

Following successes in genomics and proteomics, large scale, high throughput quantitative analysis has emerged as the "next big thing" across many scientific fields of study. The term radiomics was coined to describe the extraction of large amounts of image features from radiographic images into minable structured data sets that can be used to construct descriptive or predictive models [74],[72],[33]. Radiomics studies in oncology have identified correlations between features extracted from CT data sets depicting certain tumors and underlying gene-expression patterns [6].

While the compulsive tacking-on of the "omics" suffix has been criticised [60], the basic premise of these methods, discovery through data, is viable, with possibly profound implications for advances in medicine, so-called personalized medicine in particular. Using quantitative imaging methods to unlock information in the (unstructured)

PAGE 15

raw pixel data exposes it to established methods of analysis, like clustering and statistical learning, that may reveal meaningful patterns. In this regard, quantitative imaging could become an important tool for knowledge discovery and eventually personalized treatment plans. Molecular characterization using genomic or proteomic techniques tends to rely on invasive procedures and does not capture spatial or temporal heterogeneity [6]. Medical imaging studies on the other hand are for the most part non-invasive and can be acquired multiple times throughout a patient's treatment course, providing a potentially larger pool of information. The ability to identify and explore links between image features occurring at several scales, functional and genetic information may reveal new phenotypes and lead to more targeted therapies [33].

Medical imaging consumes a sizable portion of worldwide electronic data storage, by some estimates up to one third. Forecasts suggest that overall storage and archiving of medical imaging data will exceed 1 exabyte by 2016 [3]. By any measure this qualifies as "big data", a recently popular catch phrase used to describe any repository of information seemingly so large and complex that traditional analytic methods are of limited value at best. Like many popular terms, "big data" is difficult to define, but its tagline may have been presciently written by John Naisbitt in his 1982 book, Megatrends: "we are drowning in information, but starving for knowledge" [97]. The primary intent of "big data" methodologies is to manage these large data repositories and increase the amount of meaningful information distilled from them, to produce what has become known in modern business parlance as "actionable intelligence." Today nearly every field, from retail sales to publishing to defense and security, is intent on capitalizing on vast data stores that have become available in the Internet age. Coping with the huge collections of data that are rapidly accumulating is a thorny problem. This has driven interest in data management strategies, algorithms and reporting methods suitable for large, complex data repositories. New specialties

PAGE 16

and methods, that blend mathematics, statistics, computer science and information technology, are emerging to focus on these problems.

Not surprisingly, application of big data methodologies to problems in medical imaging is a very active area of research and in the last decade noteworthy progress has been made [134]. Two important tools associated with "big data" analysis that are clearly relevant to medical image analysis are computer vision and machine learning. Newer ideas in these fields have shown promise in identifying subtle but highly descriptive image characteristics that can be useful in the key tasks of medical image analysis.

Use of quantitative imaging has proven beneficial in certain applications, but many challenges remain. Quantitative information is not routinely included in radiological reports. The complex patterns and shapes of anatomy and pathology on CT are not well described by simple features and require cutting-edge methods of computer vision. Additional physiological modelling may be required to confirm quantitative correlation with clinical data and outcomes. No consensus approach to the standardization of quantitative and statistical methods for this analysis exists. Variations in CT image appearance caused by differences in scanners or acquisition parameters pose a significant challenge in quantitative imaging. Many image features are sensitive to levels of image noise and monotonic changes in pixel intensity level, possibly due to scanner hardware, image acquisition or reconstruction parameters.

To address these challenges, image-derived metrics must be carefully validated so that they are meaningful and worth measuring. Quantitative imaging therefore relies on close collaboration between engineers and clinicians, paying close attention to clinical and technical exigencies, to identify and verify imaging biomarkers and carefully validate algorithms. Many opportunities to make meaningful contributions in this field remain.

PAGE 17

The general hypothesis of this work is that extraction and analysis of quantitative image features will improve objectivity and specificity in the evaluation of lung disease on CT.

The body of this thesis focuses on two primary projects using quantitative analysis of lung CT. The first project relates to disease affecting the main tissue of the lung, the lung parenchyma. Also called interstitial lung disease, these diseases appear on CT as diffuse patterns that can be difficult to assess visually and to quantify. This project focuses on idiopathic pulmonary fibrosis (IPF), a difficult, and ultimately fatal, disease characterized by progressive scarring of the lungs.

CT scans play an important role in the diagnosis of patients with IPF. Amount of lung fibrosis, which is evident as distinctive patterns on CT images, correlates with important clinical parameters and can indicate disease severity. Visual evaluation of these images is limited, however. Interpretation of CT by eye can be highly subjective and results vary even among experienced radiologists. Consequently, visual assessment is not precise enough to measure changes between CTs taken at different times. A more precise, quantitative method to estimate the amount of lung fibrosis on CT could be very useful in trials and, eventually, clinical practice.

Advances in the fields of computer vision and machine learning enable computer programs to automatically identify complex patterns in images. I designed an algorithm that can accurately quantify fibrosis on lung CT. My hypothesis is that the amount of fibrosis measured automatically by the computer is predictive of mortality. I propose to use the algorithm on CTs of a clinical population of subjects with IPF to test correlation between amount of fibrosis found automatically and clinical parameters like results of pulmonary function tests.

The aim of the first part was to develop, test and validate a machine learning approach to distinguish, and quantify the extent of, the characteristic patterns of IPF on CT. This algorithm was then used to evaluate the relationship between visually

PAGE 18

assessed findings, physiologic variables and automated texture-based classification scores in subjects with IPF.

Cystic fibrosis (CF) is an inherited disease that affects mucus production. Patients suffer many pulmonary symptoms and experience progressive deterioration of lung function [47]. Thickened mucus causes chronic infection, inflammation, and airway remodeling. The shape and size of the airway tree, a complex tubular, branching structure, impacts flow and resistance through the lung's airways [137]. Airway morphology is important to lung function and may be a marker of disease severity in patients with diseases that affect airways like CF. There is also evidence to suggest abnormalities in the size and shape of the upper airway tree may be present in CF sufferers from birth [93].

CT is a valuable tool for assessment of morphology, but doing so effectively requires quantification of possibly subtle shape variations in three dimensions. The aim of the second project was to capitalize on geometric information available in chest CT to study three dimensional models of the airway tree in children. I applied a statistical shape modeling approach to identify differences in airway tree shape between a group of children without lung disease and a group with CF.

To summarize, the aims of this work were to show that quantitative analysis of CT can be used to:

- Distinguish diffuse patterns of fibrosis related to IPF by quantifying image texture
- Measure correlations between amount of fibrosis found and other clinical parameters
- Identify group differences in morphology using 3D segmentations and Statistical Shape Modeling (SSM)

PAGE 19

2. Background

Background knowledge in several key areas is relevant to this project. An understanding of computed tomography (CT) technology and the image reconstruction process is important to understanding the effects of machine parameters on image appearance and, subsequently, quantitative analysis. Medical image analysis is a complex and expansive topic that draws from image processing, computer vision and machine learning. A basic understanding of normal respiratory anatomy, the diseases idiopathic pulmonary fibrosis (IPF) and cystic fibrosis (CF), including how these appear on CT, is also important.

2.1 Computed Tomography

Computed tomography produces images that represent x-ray attenuation properties of thin cross-sections through a subject. The invention of CT over four decades ago represented a pivotal advance in medical imaging for a number of reasons. Whereas conventional analog radiographs are formed directly by x-ray transmission through a subject and onto a film cassette, CT relies on an abstract computational process to produce detailed image "slices". In conventional radiography contrast resolution is limited. Subtle differences in attenuation (< 5%) are not captured, largely because the projection of three dimensional anatomic structures onto a two dimensional receptor obscures fine details by overlying anatomy [23]. Tomographic, or cross-sectional images, reveal more subtle contrast variations, but must be reconstructed via a mathematical process. The widespread adoption of CT helped initiate the transition to digital technology in radiology and medicine overall.

The first CT scanner was developed in 1971 by Godfrey Hounsfield, an engineer at EMI, Ltd. As the story goes, the British company was flush with cash after the commercial success of The Beatles, whose albums had been released on its record label. EMI had an industrial research focus and Hounsfield was charged with finding applications of new computing technologies. Unbeknownst to him at the time, A.M.

PAGE 20

Figure 2.1: Simple CT schematic

Cormack, a South African medical physicist, had studied tomography ten years prior and published work on the mathematical theory of tomographic image reconstruction. Hounsfield and Cormack shared the Nobel Prize in Medicine in 1979 for the invention of CT [24].

The basic apparatus Hounsfield used in initial experiments consisted of an x-ray source and detector positioned so that thin x-ray beams were directed through a subject and their attenuated intensities captured by the detector positioned on the opposite side (Figure 2.1). Translating then rotating the source-detector pair around the subject gathered linear attenuation measurements from many parallel beams at different angles. Attenuation measurements from a large number of scanned narrow beams provide sufficient information for reconstruction of a cross-sectional image. CT images are essentially maps of relative x-ray attenuation properties. The value at each image pixel represents an estimate of the linear attenuation coefficient at that location in the subject (Figure 2.2). These estimates are reconstructed by a mathematical algorithm operating on the raw projection data. The total measured attenuation for any individual beam reflects the accumulation of different attenuation properties of the various tissue types encountered in that beam's path (Equation 2.1).
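The accumulation of attenuation along one beam can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming the exponential attenuation law of Equation 2.1; the function names and the coefficient and thickness values are hypothetical and chosen only for illustration, not taken from measurements.

```python
import math

def transmitted_fraction(mu, thickness):
    """Return I/I0 for one beam crossing segments with attenuation
    coefficients mu[i] (1/cm) and path lengths thickness[i] (cm)."""
    if len(mu) != len(thickness):
        raise ValueError("mu and thickness must have the same length")
    # Each segment contributes mu_i * x_i to the exponent of Equation 2.1.
    total = sum(m * x for m, x in zip(mu, thickness))
    return math.exp(-total)

def total_attenuation(i_over_i0):
    """Recover the summed attenuation (sum of mu_i * x_i) from a measured I/I0."""
    return -math.log(i_over_i0)

# Illustrative values only: soft tissue, aerated lung, soft tissue.
mu = [0.20, 0.05, 0.20]
thickness = [3.0, 20.0, 3.0]
fraction = transmitted_fraction(mu, thickness)
print(fraction, total_attenuation(fraction))
```

A single measurement yields only the summed exponent, not the individual coefficients, which is why many beams at different orientations are needed.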

PAGE 21

Figure 2.2: Simple CT attenuation

$$\frac{I}{I_0} = e^{-\sum_{i=1}^{n} \mu_i x_i} \qquad (2.1)$$

where $\mu_i$ and $x_i$ are attenuation coefficients and thicknesses respectively. Values for the separate attenuation coefficients cannot be estimated directly from a single thin beam measurement because there are too many unknowns. However, a large number of such measurements made in the same plane but at different orientations makes this estimation possible. Each beam measurement contributes information about the narrow line of tissue it intersects, and the value of any single image element, or pixel, is based on all the thin x-ray beams that passed through the corresponding point in the subject.

Austrian mathematician Johann Radon established the underlying theory for tomographic image reconstruction, although he did not envision this specific application. Radon showed that a function can be reconstructed from a series of its projections. His work in 1917 followed Wilhelm Roentgen's discovery of x-rays in 1895 but came well before the development of computer technology that makes CT possible [122].

Considering a single cross-sectional plane containing the subject as a discretized grid indexed as $(x,y)$, intensity loss in a single beam due to the cumulative effect of

PAGE 22

attenuations through multiple materials can be expressed as the line integral:

$$\ln\left(\frac{I}{I_0}\right) = -\int \mu(x,y)\,ds \qquad (2.2)$$

where $\mu(x,y)$ is the average linear attenuation coefficient at grid location $(x,y)$.

The Radon transform describes the relationship between the projection of a collection of beams passing through a cross-sectional plane and the attenuation coefficients $\mu(x,y)$ within the plane. Assume $r$ represents the path of an x-ray beam in the imaging plane, and $\theta$ is the angle formed between $r$ and the x-axis. So, $r = x\cos(\theta) + y\sin(\theta)$ and the Radon transform is given by:

$$p(r,\theta) = \iint \mu(x,y)\,\delta\big(x\cos(\theta) + y\sin(\theta) - r\big)\,dx\,dy \qquad (2.3)$$

where $\delta$ = Dirac delta function.

The principle of tomographic reconstruction is that an image can be formed from the map of attenuation coefficients in $\mu(x,y)$ via the inverse Radon transform. In theory $\mu(x,y)$ can be recovered perfectly using an infinite number of projections (beams) [23], however direct analytic solutions of the inverse Radon transform are difficult.

In practice CT reconstruction is typically performed using back projection. Briefly, back projection begins with an empty grid discretizing the image space. Each grid cell that an x-ray beam intersects is assumed to contribute to the total attenuation of that beam. Conceptually the total measured attenuation for each narrow beam is smeared back through the image space, taking into account fractional area of cells intersected. This process is called back projection. Back projection can result in significant artifacts, especially at interfaces where densities differ substantially. Filtered back projection alleviates this issue by filtering projections in the frequency domain prior to back projection.
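The projection and smearing steps can be sketched on a small grid. The Python below is a simplified illustration rather than the reconstruction used in any scanner: it assumes parallel beams, assigns each pixel to the nearest integer bin of r = x cos(θ) + y sin(θ) (a discrete stand-in for the delta function in Equation 2.3), and ignores the fractional-area weighting mentioned above. The function names are hypothetical.

```python
import math

def forward_project(image, angles):
    """Discrete Radon transform: for each angle, add mu(x, y) into the bin
    nearest to r = x*cos(theta) + y*sin(theta), measured from the grid centre."""
    n = len(image)
    c = (n - 1) / 2.0
    n_bins = 2 * n  # enough bins to cover the image diagonal
    sinogram = []
    for theta in angles:
        proj = [0.0] * n_bins
        for yi in range(n):
            for xi in range(n):
                r = (xi - c) * math.cos(theta) + (yi - c) * math.sin(theta)
                proj[int(round(r)) + n] += image[yi][xi]
        sinogram.append(proj)
    return sinogram

def back_project(sinogram, angles, n):
    """Unfiltered back projection: smear every projection value back over
    all pixels that fall into the same bin."""
    recon = [[0.0] * n for _ in range(n)]
    c = (n - 1) / 2.0
    for proj, theta in zip(sinogram, angles):
        for yi in range(n):
            for xi in range(n):
                r = (xi - c) * math.cos(theta) + (yi - c) * math.sin(theta)
                recon[yi][xi] += proj[int(round(r)) + n]
    return recon

# Point-like object on a 9 x 9 grid, projected at 8 angles over 180 degrees.
n = 9
phantom = [[0.0] * n for _ in range(n)]
phantom[2][6] = 1.0
angles = [k * math.pi / 8 for k in range(8)]
recon = back_project(forward_project(phantom, angles), angles, n)
```

In this sketch the reconstructed point is brightest at its true location, but the smeared rays leave streaks elsewhere in the grid, which is the kind of artifact that filtering is meant to reduce.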


In filtered back projection raw x-ray projection data are mathematically filtered before back projecting onto the image reconstruction matrix [24]. The mathematical filtering is a convolution operation, meaning that projection data is multiplied by a chosen kernel function in the frequency domain. In other words, the Fourier transform is applied to both the projection data and the kernel function before they are multiplied. The inverse Fourier transform is applied to return to the spatial domain prior to back projection. Filtered back projection has been the most commonly implemented reconstruction algorithm in commercial scanners.

A wide variety of kernel functions can be used; many are proprietary to CT scanner manufacturers. The primary effect of this convolution operation is modulation of edge effects (sharpness) in the reconstructed image. Choice of kernel function has a significant impact on the ultimate appearance of reconstructed images, affecting both image resolution and noise.

The same basic principles used by Hounsfield are employed in today's scanners, but hardware technology has evolved substantially. Current designs make use of slip ring connections for power supply and data transmission, enabling continuous scanning without stopping or reversing direction to manage cables. Prior generations of scanners suffered limitations coping with mechanical constraints related to high speed rotating gantries. Slip rings were an important advancement because they enabled helical scanning, where the patient is moved through the CT bore throughout acquisition, reducing scanning time and allowing contiguous image slices [52]. Another major advancement is the use of multiple solid state detectors arranged in rows, making it possible to acquire data for many image slices simultaneously. Multi-detector CT (MDCT) is considered the standard today. Current CT scanners acquire a large number of high resolution image slices simultaneously, enabling fast volumetric imaging. Speed, resolution, and contrast at tissue-air interfaces make CT particularly good at imaging pulmonary anatomy [23], [51].
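The frequency-domain filtering step of filtered back projection described above can be sketched as follows. This is a minimal illustration, not a scanner implementation: the kernel is assumed to be a plain ramp (weight proportional to the absolute frequency), since commercial kernels are often proprietary, and a naive discrete Fourier transform is written out so the example needs only the standard library. All function names are hypothetical.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def ramp_filter(projection):
    """Multiply one projection by a ramp kernel in the frequency domain,
    then return to the spatial domain (the step before back projection)."""
    n = len(projection)
    spectrum = dft(projection)
    # Absolute frequency of bin k in cycles per sample; zero at DC.
    ramp = [min(k, n - k) / n for k in range(n)]
    filtered = idft([s * w for s, w in zip(spectrum, ramp)])
    return [value.real for value in filtered]

# A projection containing a single sharp peak.
filtered = ramp_filter([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
```

The filtered peak stays positive while its neighbors become negative. When such projections are smeared back across the image, the negative lobes cancel much of the blur that plain back projection leaves around dense objects. Smoother kernels trade some of that sharpness for lower noise.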


X-rays are a form of ionizing radiation and exposure to them brings some amount of risk for development of cancer or other genetic mutations. Modern CT scanners use low dose protocols so the risk is difficult to quantify and must be balanced against the medical benefit that the CT exam provides. In general, CT exam protocols are designed so that clinically useful images are acquired with as little radiation dose to the patient as possible. This can be accomplished by careful selection of x-ray tube potential (measured in kilovolts or kV) and tube current (milliamp-seconds or mAs). Setting these values too low may produce images with poor quality, meaning low signal to noise ratio, which can severely limit the value of the exam. CT manufacturers are actively developing new hardware and image reconstruction algorithms that can produce very high quality images with as little x-ray dose as possible.

CT is effectively a map of tissue density, developed using x-ray attenuation properties. While the same type of information is used in every CT scan, appearance can vary substantially depending on acquisition and reconstruction parameters. Acquisition settings like slice thickness, collimation and spacing, helical settings like pitch and table feed, and x-ray technique (kV and mAs) all affect image quality and appearance. The image reconstruction algorithm and parameters like the convolution kernel for filtered back projection also have a strong impact on final images [78]. These variations can have a significant impact on subsequent image analysis procedures and great care must be exercised when designing experiments and quantitative image analysis procedures.

2.2 Image Analysis

Medical image analysis has evolved significantly in the last two decades, especially with the "big data" revolution, and now draws heavily from the fields of computer vision and machine learning. Computer vision combines image processing, feature extraction, and machine learning algorithms to perform detection, recognition, and classification tasks in images. Computer vision has clear application in fields as disparate as robotics, manufacturing, surveillance and the military to bring more objective, thorough and automatic methods to image analysis tasks. The key steps of computer vision, excluding image acquisition and pre-processing, are feature extraction and machine learning.

2.2.1 Feature Extraction

In many respects, feature extraction falls within the domain of traditional image processing. The term image processing typically refers to low-level computing operations wherein signal data takes the form of an image. Traditional image processing operations include denoising, edge and corner detection, and image compression. These operations are generally implemented as filters. Examples include smoothing to remove or reduce noise from images or sharpening to enhance edges.

Many well-known operations of image processing are methods for feature extraction, which is the process of representing certain characteristics of images in a structured, quantitative way. The simplest feature extraction methods take into account only individual pixel intensities. Sometimes called densitometric methods, these count the number of pixels that fall above and/or below given threshold values and can provide some basic information about images. Similarly, first-order pixel statistics based on moments of pixel intensity histograms (e.g. mean, variance and skewness) are intuitive, easy to calculate and can be informative. Since no information about neighboring pixels is considered, methods like this are best at characterizing brightness and contrast, but not more complex patterns with subtle variations.

Pixel gradients are a straightforward way to take into account local changes in images. They can be used to find the location and direction of strong edges or other low-level features like ridges or blobs in images. While useful in certain applications, gradients are still fairly simple features that do not necessarily capture nuanced patterns in images.

Image texture is a quality formed by patterns in both pixel intensities and their spatial arrangement. Characterization of texture capitalizes on more image information than individual intensities and has proven valuable in image analysis. Purely structural texture, like a tiled floor, is formed by repetition of simple building blocks. Artificial textures are often highly structured. Natural textures tend to be more stochastic; consider images of wood grain, clouds or grass. Texture in medical images is likely to include both structural and random elements at various scales.

While the general definition of texture relates to the surface feel of a substance, no precise definition of image texture exists. Still, the human visual system is very good at discriminating textures, even those that are only subtly different. Borrowing a phrase famously used by Justice Potter Stewart, "I know it when I see it" [119].

Several "classic" texture methods are used frequently in traditional image processing. They generally apply fairly simple rules to compute values based on pixel intensity differences at fixed distances and directions in image regions. These can be considered "hand engineered" features as they require careful selection of parameters to optimize performance for a particular task.

Grey level co-occurrence matrices (GLCM) are a typical approach to quantifying texture used in image processing. They were proposed by Haralick [54] and measure second order pixel statistics, or how frequently certain pairs of pixels appear together. In this approach, a co-occurrence matrix, which is a discrete estimate of the joint probability distribution of pixel pairs, is constructed for an image or region. Figure 2.3 shows two simple examples of GLCM construction. Rows and columns of GLCMs represent pixel intensities present in the image. Values in each cell are counts of the number of occurrences of specific pixel pairs. GLCMs can be constructed in different ways depending on which neighbors are considered. Formally, a GLCM is constructed as:

\[ G_{\Delta x, \Delta y}(i, j) = \sum_{p=1}^{n} \sum_{q=1}^{m} \begin{cases} 1, & \text{if } I(p, q) = i \text{ and } I(p + \Delta x,\, q + \Delta y) = j \\ 0, & \text{otherwise} \end{cases} \qquad (2.4) \]
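Equation 2.4 translates almost directly into code. The sketch below is a minimal, unoptimized illustration: the function name `glcm`, the `levels` argument and the 4x4 patch are hypothetical and are not taken from Figure 2.3. Following the index convention of Equation 2.4, the first offset moves along rows (down) and the second along columns (right). Pixel pairs whose neighbor falls outside the image are skipped, which is an assumption the equation does not spell out.

```python
def glcm(image, dx, dy, levels):
    """Count co-occurring intensity pairs for the offset (dx, dy).

    image  : list of rows of integer intensities in 0..levels-1
    dx, dy : row and column offsets of the neighbor, as in Equation 2.4
    """
    counts = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for p in range(n_rows):
        for q in range(n_cols):
            pn, qn = p + dx, q + dy
            # Skip pairs whose neighbor lies outside the image.
            if 0 <= pn < n_rows and 0 <= qn < n_cols:
                counts[image[p][q]][image[pn][qn]] += 1
    return counts

# Illustrative 4-level patch (hypothetical example data).
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]

right = glcm(patch, 0, 1, 4)  # neighbor immediately to the right
down = glcm(patch, 1, 0, 4)   # neighbor one pixel down
```

For this patch the two offsets give different matrices from the same pixels. For instance, `right[2][2]` is 3 while `down[2][2]` is 1, which shows directly how the choice of distance operator shapes the resulting texture description.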


Figure 2.3: Simple image patch (left) and two possible GLCMs: one considering each pixel and its neighbor immediately to the right ($\Delta x = 0$, $\Delta y = 1$) (middle), and one considering its neighbor one pixel down ($\Delta x = 1$, $\Delta y = 0$) (right).

where $G_{\Delta x, \Delta y}(i, j)$ is the GLCM for a given distance operator. $I(p, q)$ is the image to be tested, with pixel locations specified by (p, q). A pixel pair in $I$ is described by the pixel of interest at (p, q) and its neighbor at $(p + \Delta x,\, q + \Delta y)$. The distance operator is defined by the offsets $\Delta x$ and $\Delta y$. GLCMs are usually normalized so the sum of all cells is 1.0.

For example, in Figure 2.3 two different distance operators are used. The middle GLCM considers only neighbors one pixel to the right, so the cell in the upper left hand corner of the GLCM counts the number of times a pixel with the value 0 has a neighbor immediately to the right whose value is also 0. Similarly, for this GLCM the cell at (2, 3) counts the number of times a pixel with intensity value 2 has a neighbor immediately to its right whose value is 3. The GLCM on the far right of Figure 2.3 shows an alternative GLCM for this image region. In this case the distance operator considers the neighbor in the position immediately down from a pixel of interest. In this example there are two instances, shown in GLCM cell (2, 3), where a pixel whose value is 2 has a neighbor one index down whose value is 3.

Considering GLCMs as estimates of joint probability, texture metrics are calculated as moments of the matrix. Haralick suggested a set of 28 features that can be


computed from GLCMs. Three examples are:

Angular Second Moment:
\[ \sum_i \sum_j g(i, j)^2 \qquad (2.5) \]

Contrast:
\[ \sum_i \sum_j |i - j|^2 \, g(i, j) \qquad (2.6) \]

Homogeneity:
\[ \sum_i \sum_j \frac{g(i, j)}{1 + |i - j|} \qquad (2.7) \]

where $g(i, j)$ is the value in the (i, j) cell of the normalized GLCM.

GLCMs can be difficult to apply in practice for several reasons. The first is that the size of co-occurrence matrices can become prohibitively large when images contain a wide range of pixel intensity values. For example, using the full contrast resolution of an image with 1024 grey levels will result in a GLCM that is 1024 x 1024. To keep GLCMs to a more manageable size, it is common to compress contrast resolution. However, doing so results in some loss of information and finding the optimal parameters can require trial and error testing.

Another challenge of GLCMs is that they are not rotationally invariant. A GLCM is based on a single distance operator which defines the relative position of pixel pairs being considered. The lack of rotational invariance means that GLCM texture metrics will be different for similar textures that are oriented differently within an image. For example, for an application it may be desirable to identify patterns of stripes consisting of parallel lines. GLCM texture calculated with a single distance operator will give very different results depending on whether the stripes are oriented vertically or horizontally in an image. This difference makes it difficult to associate similar patterns that are oriented differently within an image frame. A work-around for this problem is calculating several GLCMs with different distance operators and averaging results, but this has


its drawbacks. Averaging can blur the differences between texture categories, calculating multiple GLCMs is computationally inefficient, and the approach requires more trial and error "hand engineering".

Run length encoding was originally suggested for data compression, but can be used as an image feature descriptor. In this approach the lengths of runs, meaning consecutive pixels in a given direction with the same value, are counted and organized into a matrix. Similar to grey-level co-occurrence methods, moments of this matrix can be used as texture features [48]. The element (i, j) in an RL matrix counts the number of times that an image region contains runs of length j in a given direction consisting of pixels with intensity i. Similarly, the frequency of gaps, meaning consecutive pixels between pixels of a particular value, can be computed and stored in matrices [80]. Features that can be computed from run or gap length matrices include:

Short run or gap emphasis:
\[ \frac{\sum_i \sum_j rg_{ij} / j^2}{\sum_i \sum_j rg_{ij}} \qquad (2.8) \]

Long run or gap emphasis:
\[ \frac{\sum_i \sum_j j^2 \, rg_{ij}}{\sum_i \sum_j rg_{ij}} \qquad (2.9) \]

where $rg_{ij}$ is the value in the (i, j) cell of the normalized run or gap length matrix.

Implementation of run and gap length methods faces challenges similar to those associated with GLCMs. Parameters like bin sizes for downsampling contrast resolution and choice of search directions often require manual fine tuning.

Local binary patterns (LBP) were proposed as a texture metric intended to be invariant to rotation and uniform intensity changes in images [105]. LBPs encode the difference between a central pixel of interest and its neighbors at fixed distances. Moving in a clockwise direction along a radial path, neighboring pixels are queried.


Figure 2.4: Local binary pattern.

If the queried pixel is brighter than the center pixel it is encoded as 1, otherwise 0 is stored. If 8 locations are tested this produces an 8 bit binary array. Converting the binary array into a decimal number produces a value for that ring. Typically the operation is repeated for several radii and the results are concatenated into an array as a feature vector (Figure 2.4).

Segmentation is the process of delineating the boundaries of objects in images. Medical image segmentation is used extensively to produce geometric representations of anatomy for procedural planning, analysis and modeling. Automatic CT segmentation is a substantial topic in its own right and beyond the scope of this work. Segmentation can be considered a feature extraction method in that it produces geometric models that inform other image analysis tasks or can be used for studies of morphology. In this work commercially available segmentation routines were used to compute three-dimensional (3D) models of the lungs and airways. Models of the lungs were used to guide textural analysis of lung parenchyma and airway segmentations were used to analyze morphology.

2.2.2 Machine Learning

Machine learning, or statistical learning, is a field of study whose main goal is modeling and developing understanding from large complex datasets [64]. Modern


machine learning encompasses a vast array of tools and techniques from computer science, information systems, statistics and mathematics. Machine learning is typically concerned with construction of algorithms that have the ability to adapt, in an inductive fashion, to example data in order to perform a specific task. In other words, algorithms in machine learning are generally based around rules or parameters that are adapted according to input data in order to give a better result in a particular task. Many examples of machine learning systems have become familiar in the last several years. Recommender systems in online shopping and social networks, email spam filtering and face recognition in digital images are all implementations of machine learning algorithms. Each of these applications uses data, like records of previous online purchases, known spam messages or exemplar face images, to make estimates or predictions about new data.

Machine learning provides important tools for quantitative analysis of medical images. The complexity and variability of anatomy and abnormality make it difficult to derive analytic or closed-form solutions relating the appearance of medical images to other relevant clinical data [134]. Many tasks in medical image analysis are based on pattern recognition and the best results are achieved by the inductive, learning-by-example processes on which machine learning algorithms are based [123].

Machine learning algorithms are broadly divided into two main categories, supervised and unsupervised learning [64]. When source data is labeled so that observed features are associated with a known or desired outcome, supervised learning strategies are usually applied. The typical approach is to develop a model based on an available set of data, called the training set, consisting of sets of input features and their associated outcomes. The main goal is to construct an algorithm capable of generalizing from the training set so that when previously unseen input features are presented to the algorithm the predicted outcome is acceptably accurate.


Figure 2.5: Supervised learning combines extracted features and assigned labels to create a model.

When applying supervised learning where the output is a continuous variable, regression algorithms are usually used. Examples include prediction of the selling price of a home based on location, year of construction and number of rooms, or estimation of alcohol content in wine from its color, acidity and pH. Linear regression is a mainstay of traditional statistics [55]. As machine learning tends to be applied to large data collections with many predictor variables, more complex regression strategies are often needed.

Whereas regression estimates the value of a continuous outcome variable given a set of inputs, classification systems take input data, arranged as vectors of numerical values that may be discrete or continuous, and output a single discrete, or categorical, value. This type of approach is used to automate discrimination of input data into previously defined categories. Example applications include spam filters, credit score analysis and character recognition. In each of these tasks some set of observed features provides information that can be used to decide what category a new observation belongs to. For example, the combination of words in an email, or bank balance, age, years of education and employment, can be used to predict if an email is spam or an individual is a good candidate for a new bank loan. The general concept of image classification based on extracted features is represented in Figure 2.5.


In supervised learning, the algorithm adapts based on input features in order to return the desired outcome. Both classification and regression usually rely on labeled data and employ a supervised approach. Some noteworthy supervised methods are Random Forest, Logistic Regression and Support Vector Machines, all of which were used in this work.

Logistic regression (LR) is a special case of a class of statistical models called generalized linear models. LR is meant to predict a categorical or discrete outcome, like group membership, from a set of predictors. Usually the response of a LR model is binary, although an adaptation, multinomial LR, can be used for multicategory responses [7]. LR is relatively simple compared to many machine learning algorithms, but can be very useful. LR models are based on well-defined statistics and offer interpretability, a trait that can be lacking in some machine learning algorithms that function as "black boxes".

The linear probability model for a binary outcome variable is given by:

\[ \pi(x) = \alpha + \beta x \qquad (2.10) \]

where $\pi(x)$ is the probability of a positive response (sometimes referred to as "success" when considering binary outcomes). Here the probability of success changes linearly in $x$ and the parameter $\beta$ represents the change in probability of success per unit change in $x$ [7]. Using simple linear regression to fit this model is flawed because probabilities fall between 0 and 1, but fitted values could fall outside this range. To avoid this problem it is necessary to use a method that gives outputs limited to the interval [0, 1] for all inputs $x$. LR uses the logistic function to do this:

\[ \pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}} \qquad (2.11) \]

where $\beta_0$ and $\beta_1$ play the roles of the intercept $\alpha$ and slope $\beta$ in Equation 2.10.
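The contrast between the linear probability model of Equation 2.10 and the logistic function of Equation 2.11 can be checked numerically. The following is a minimal sketch with hypothetical function names and arbitrary illustrative coefficients. The logistic function is evaluated in a numerically stable form that is algebraically identical to Equation 2.11.

```python
import math

def linear_probability(x, alpha, beta):
    """Equation 2.10: a straight line, which is not confined to [0, 1]."""
    return alpha + beta * x

def logistic(x, beta0, beta1):
    """Equation 2.11, written to avoid overflow for large |beta0 + beta1*x|."""
    z = beta0 + beta1 * x
    if z >= 0:
        # Divide numerator and denominator of Equation 2.11 by e^z.
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Arbitrary illustrative coefficients, not fitted to any data.
xs = [-10.0, -1.0, 0.0, 1.0, 10.0]
line = [linear_probability(x, 0.5, 0.2) for x in xs]
curve = [logistic(x, 0.0, 1.0) for x in xs]
```

With these coefficients the straight line yields values below 0 and above 1 at the extremes, which cannot be probabilities. The logistic curve rises monotonically through 0.5 at $x = 0$ and stays within the unit interval.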


The logistic function produces an "S" shaped curve on the interval [0, 1] which is convenient for modeling probability. After some manipulation 2.11 can be simplified to the logistic regression model form:

\[ \log\left(\frac{\pi(x)}{1 - \pi(x)}\right) = \alpha + \beta x \qquad (2.12) \]

which can be fit using maximum likelihood [64].

When data for many predictor variables is available it is likely that only a subset are relevant to a particular output of interest. Most statistical models and machine learning algorithms will perform best when input predictors are independent from each other and each strongly correlated with the output variable of interest. As well, the principle of Occam's Razor is very relevant in the construction of machine learning models and algorithms. Simpler, more parsimonious models are usually preferable, not necessarily because simpler models always give better results, but because simplicity is worthwhile in its own right (for managing data interpretability, and other practical reasons) [55]. Real world data is usually messy and includes irrelevant and redundant information, and it can be a challenge to find which aspects are the most informative. Dimensionality reduction and variable selection are two important parts of developing machine learning models. Dimensionality reduction describes methods to shrink the number of predictor variables to (hopefully) distill a higher concentration of meaningful information from a set of predictors. Principal Components Analysis is a well known dimensionality reduction technique. Variable selection is the process of finding which of a set of input variables make the best model, given a particular criterion.

Proposed by Leo Breiman in 2001 [21], Random Forest (RF) is an ensemble learning method that uses bootstrap aggregation, or bagging, in tandem with random feature selection. Ensemble learning methods aggregate the results of a committee


of classifiers. The main idea is that a collection of weak, independently constructed statistical learning entities, or learners, can be used together to create a single strong learner. This is done by combining results from many noisy but approximately unbiased models, thereby reducing variance. In classification the final output is arrived at by majority vote of the learners while in regression the final output is an average across the collection of learners [64].

In bagging a bootstrap sample is drawn, with replacement, from the original training set to build a single model, most often a decision tree. The process is repeated with different independent bootstrap samples until a collection of trees has been constructed. Decision trees are well suited to bagging because, while they are able to capture complex interactions within data and have relatively low bias when grown sufficiently deep, they also can be quite noisy [55].

RFs add an additional layer of randomness to this process. During the tree growing phase a random subset of predictors is selected at each node and only this subset is used to determine the best splitter. Individual trees are therefore de-correlated since splitting at each node is performed based on a different random subspace of the overall feature space. This has the effect of reducing bias, but also tends to increase variance, which is alleviated by use of an ensemble of trees.

RFs are based on Classification and Regression Trees (CART), also known as Recursive Partitioning. CARTs are a simple nonparametric approach whose main characteristic is recursively partitioning feature space, i.e. the space spanned by all predictor variables, into a set of rectangles, then fitting a simple model, often just a constant, to each one [55], [121].

RFs are based on fairly simple concepts, are easy to implement, can be computed in a highly parallel fashion and can provide essentially state-of-the-art performance [25]. As a result they have become quite popular and have been applied to a wide range of problems ranging from computer vision to pharmaceutical modeling [36], [79].

Figure 2.6: Random Forests consist of an ensemble of decision trees whose outputs are averaged to produce an overall estimate.

A key feature of RFs is the availability of so-called out of bag (OOB) data, the term coined for the data not chosen in a given bootstrap sample for construction of any one tree. Out of bag data, typically about one third of the total training data, can be used to provide ongoing dynamic assessment of model performance. At each bootstrap iteration prediction is performed by pushing OOB data down each tree to test prediction accuracy. These predictions are aggregated over all the trees and an error rate is calculated, called the OOB error estimate.

RFs incorporate methods that can be used for feature selection. RFs estimate variable importance by testing the increase in prediction error when values for a particular variable in OOB samples are randomly permuted [21]. The percent increase in misclassification error as compared to the OOB error rate with all variables intact gives some measure of variable importance. The more prediction error increases


when a particular variable is arbitrarily changed, the more important that variable is deemed to be.

Another approach for supervised learning is Support Vector Classifiers, which are non-probabilistic, linear, binary classifiers based on the concept that optimal separating hyperplanes can be found in a feature space and can serve as decision boundaries for classification. Support Vector Machines (SVMs) are an extension of this idea that map inputs to a higher dimensional space using the so-called "kernel trick" in order to better accommodate non-linear decision boundaries [64].

In unsupervised learning outcome labels either are not known or are not applicable. Unsupervised learning can be used to organize complicated data sets into logical groups or clusters, where ideally each member of a cluster is somehow similar to other members. Unsupervised learning and clustering are often used to develop some structure or organization for very large data sets. The structure may itself be useful, for example in data mining problems seeking to find connections in large data sets with seemingly disparate features. Certain aspects of search engines rely on clustering, as do some bioinformatics studies seeking to find associations between clinical manifestations and proteins or genes. Unsupervised methods can also be useful as a step in an overall machine learning process. Some unsupervised methods make it possible to reduce the dimensionality of very complicated feature vectors in an effort to determine the most relevant characteristics, or to decompose example data into basis functions that themselves are useful descriptors. Two noteworthy examples of unsupervised methods are Principal Components Analysis (PCA) and K-means clustering, both of which are used in this work.

Principal Components Analysis (PCA) is a mathematical method for simplifying the description of differences in variables. It is often used as an exploratory method or for dimensionality reduction in data analysis and modeling [55]. The main objective of PCA is to transform a set of possibly correlated variables into a set of uncorrelated variables, which are called the Principal Components (PCs) and are sorted by amount of variance. This is accomplished by a change of space, or rotation. PCs are orthogonal, uncorrelated linear combinations of the standardized original variables [53]. PCA essentially realigns the original data so the most informative aspects are aligned with the PCs. The direction of greatest variance in the data is aligned with the first PC and PCs are sorted by decreasing variance [64].

K-means is a simple, iterative approach to unsupervised learning that partitions a data set into K distinct, non-overlapping regions [64]. K-means is a staple method for clustering or vector quantization and is easy to implement. Briefly, to perform K-means clustering the first step is to specify the number of desired clusters K. Each observation in the training data set is randomly assigned to one cluster. At each iteration of the algorithm the centroid of each cluster is calculated. Next, cluster membership is recomputed so that each observation is assigned to the cluster whose centroid is closest, usually according to the Euclidean distance metric, although other distance measures may be used. Centroid calculation and cluster assignment are iterated until the algorithm converges. Despite being quite a simple algorithm, K-means does a good job of clustering feature space and relies only on an initial guess for the number of clusters to use [64]. This can be a valuable tool for simplifying complex data by quantizing observations into a finite number of bins or clusters.

2.3 Respiratory System

The primary function of the respiratory system is gas exchange, specifically O2 uptake into venous blood and CO2 removal. Gas is brought to one side of the exchange surface through airways and blood to the other side by vessels. This process requires a large surface area where only a very thin tissue barrier separates blood and air at the end points of the airway tree, and the ability to move large volumes of air [140], [136]. The lungs also metabolize some compounds, filter unwanted particles and act as a reservoir for blood [140]. These functional requirements are constrained by size


restrictions of the chest cavity and the need to both ventilate and perfuse the lung surfaces evenly and efficiently. Researchers have concluded that the design of the lung is the result of an optimization of functional requirements and physical constraints [59], [91].

The main structures of the respiratory system are the right and left lungs and the tracheobronchial tree, which serves to transport and distribute air from the mouth into the lungs and back. The trachea is the large tube that begins at the throat and leads into the lungs. The first major branch point of the tracheobronchial tree occurs roughly at the level of the fifth thoracic vertebra, giving rise to the left and right primary bronchi, also known as the left main branch (LMB) and right main branch (RMB), which are separated by a ridge of cartilage called the carina. These primary bronchi enter the medial side of their respective lungs (Figure 2.7). The right and left lungs sit in the pleural cavities, between which lies the mediastinum containing the heart, aorta, great vessels, esophagus and lymph nodes [57]. The lungs are partitioned into separate entities called lobes that are enveloped by visceral pleura. The main spongy substance of the lungs is called parenchyma.

The airways and pulmonary arteries are branching tubes that run together, at roughly the same size, through the parenchyma, dividing to become narrower and more numerous as they get deeper into the lung. Distal to the carina, the primary bronchi divide into the secondary, or lobar, branches, which lead to the lung lobes, of which there are three on the right side (upper, middle and lower) and two on the left (upper and lower). The airways continue to branch into finer and finer tubes for a total of roughly 23 generations [57] (Figure 2.8).

Roughly the first six generations of the airway tree exhibit a similar topology across subjects and fairly standard anatomical names exist for about 30 segments. Beyond the sixth generation the branching structure becomes variable and differs from subject to subject [127]. These smaller airways, after about generations six to nine, are difficult to visualize with imaging modalities currently in clinical use [125]. The overall branching structure of the airway tree is asymmetric, heterogeneous and self-similar and is generally thought to approximate a space-filling fractal structure [136], [92], [98].

Figure 2.7: Basic respiratory anatomy [4]. [Alveoli illustration courtesy of Travis Vermilye]

The first 16 branches of the airway tree make up the conducting zone, whose primary function is movement of air to and from gas exchange surfaces. No gas exchange occurs in the conducting zone. The smallest airways without alveoli are called terminal bronchioles. The gas exchange units, or alveoli, begin to appear at about the 17th generation, marking the start of the respiratory zone. Most of the total lung volume lies within the respiratory zone. Alveoli begin to appear sporadically in the respiratory bronchioles, which are followed by alveolar ducts that are completely lined with alveoli [57].


Figure 2.8: Tracheobronchial tree hierarchy [136]. Z = generation.

The lungs achieve a large surface area for gas exchange, despite the size constraint of the thoracic cavity, by wrapping capillaries in a dense network around each alveolus. Capillaries are in the range of seven to ten μm in diameter, which is just enough for a red blood cell. This brings venous blood in close proximity to the thin epithelial cells of the alveoli through which gas exchange occurs. Each red blood cell spends less than one second in the capillary network and likely passes through two or three alveoli in this time, which is thought to be sufficient to achieve near complete equilibrium of O2 and CO2 in alveolar gas and capillary blood [140].

There are an estimated 300 million alveoli in the lungs, each of which is about one third of a millimeter in diameter, resulting in a total surface area of about 75 m^2 in adults [57], [138]. An individual alveolus consists of an epithelial layer and contains some collagen and elastic fibers that provide elastic properties. There are two types of cells in the alveolar epithelium. Type I pneumocytes are the primary epithelial cells and Type II pneumocytes are responsible for production of surfactant [23], a


substance that prevents collapse of alveoli during expiration [140].

Motion of air into and out of the lungs is necessary to maintain alveolar ventilation. On inspiration the diaphragm, a dome shaped muscle below the lungs, contracts to increase the volume of the thoracic cavity and air is drawn into the lungs. Other muscles of the chest contribute to this action, but normally diaphragmatic motion is responsible for the majority of air intake. Expiration of air from the lungs is a more passive process where relaxation of muscles and natural elasticity of the lungs allow return to resting volumes [23].

Two physical principles are key to alveolar ventilation: 1) Boyle's law, which states that as the volume of a gas increases, its pressure decreases in proportion, and 2) the tendency of gas to flow from compartments with higher pressure to those with lower pressure [140]. Airway resistance is a major determinant of flow in the lungs and is largely dictated by geometry according to Poiseuille's Law, which describes resistance to flow in a tube:

\[ R = \frac{8 L n}{\pi r^4} \qquad (2.13) \]

where $R$ is resistance to flow, $L$ is tube length, $n$ is gas viscosity and $r$ is tube radius. The dominant effect in 2.13 is $r^4$ in the denominator, whereby halving the tube radius increases resistance 16 fold. This indicates that resistance in the smaller airways is high. Still, the most significant resistance to flow occurs in the upper airways. Despite the small radii of distal airways, the massively parallel nature of branching at these generations allows for efficient flow [136].

Air flow in the lungs is also strongly affected by elastic properties of the lungs. Compliance, the ratio of change in volume to change in pressure, is a measure of how easily lung tissue can be inflated. Resistance to inflation, called stiffness, is inversely related to compliance. Lung compliance can be affected by a number of factors including the mix and geometric arrangement of elastin and collagen fibers in


alveolar walls and surrounding bronchi [140], [23].

Lung function can be tested with a number of methods, often using spirometry, where patients breathe into a closed system that measures volumes of air moved during inspiration and expiration. In clinical pulmonary function tests patients are asked to breathe in fully for maximal inspiration, then exhale completely into the spirometer. The volume exhaled in the first one second is called forced expiratory volume or FEV1.0 and the total volume exhaled is forced vital capacity or FVC [140].

Diffusion capacity of the lung is the ability of gas to transfer from the alveoli to the red blood cells. It depends not only on the area and thickness of the blood-gas barrier but also on the volume of blood in the pulmonary capillaries. Diffusion capacity can be measured using a dilute mixture of carbon monoxide and helium. A patient is asked to inhale the gas, hold their breath, then exhale. The inspired and expired concentrations of carbon monoxide are measured. Helium is incorporated to enable measurement of lung volume. DLCO is the volume of carbon monoxide transferred in milliliters per minute per unit (mmHg) of alveolar partial pressure [140].

2.3.1 Idiopathic Pulmonary Fibrosis

Interstitial lung diseases (ILD) are characterized by infiltration of the lung parenchyma by elements like inflammatory or malignant cells or other substances like collagen [23]. There are more than 100 distinct disorders considered to be ILDs and they can vary widely. Idiopathic pulmonary fibrosis (IPF), the most common form of lung fibrosis, is a specific ILD with a uniformly poor prognosis [19], [83]. It is a chronic disorder characterized by progressive scarring of the lungs. In IPF collagen builds up in the delicate support system around the alveoli leading to scarring, or fibrosis, which makes the lungs stiff and decreases their volume. The accumulation of fibrosis eventually destroys tissue and reduces lung compliance, which makes breathing difficult and compromises the ability to transport oxygen into the bloodstream. Gas

exchange is impaired both by thickening of the gas exchange barrier and deformation of the airway system and blood vessels [136]. IPF is associated with substantial morbidity and mortality [19].

The disease is considered relatively rare, although recent studies suggest increasing frequency [62]. It tends to affect people over 50 years old and is more common in men. Data on incidence and prevalence of IPF are limited [110], partly because definitive diagnostic criteria have not been established and guidelines have evolved in the last decade [28]. Current results are based on patient registries that did not use, or were published prior to the acceptance of, current diagnostic criteria for IPF. The age- and sex-adjusted incidence rate of IPF among those aged 50 years or older is estimated to range from about 9 to 17 cases per 100,000 person-years [106].

Median survival in IPF is just 2-4 years after diagnosis, a death rate worse than that of many cancers [61]. A number of well-known people have died from IPF including actor Marlon Brando and daredevil Evel Knievel. By current estimate there will be between 13,000 and 17,000 deaths from IPF in the United States this year and between 28,000 and 65,000 deaths from IPF in Europe [62].

While other forms of pulmonary fibrosis have known causes, such as drug toxicity, systemic disease or inhalation of certain agents, the precise etiology of IPF is not known. It is thought to be associated with family history of ILD, smoking, certain occupational exposures, chronic viral infections and abnormal acid reflux [19]. The prevailing opinion for some time had been that generalized inflammation set in motion a chain of events leading to abnormal tissue repair and fibrosis in the lung parenchyma [14]. Accordingly, anti-inflammatory agents and immune modulators were typically prescribed. However, these have only minimal effect in altering the course of the disease. Current thinking is that IPF is an epithelial-fibroblastic disease in which unknown stimuli, either internal or environmental, disrupt homeostasis of alveolar epithelial cells [132]. The suspicion is that this results in diffuse epithelial-cell

activation and aberrant epithelial repair releasing potent fibrogenic molecules and cytokines, which in turn trigger events that promote production of extracellular matrix molecules [132].

Recent work has shown that genetic risk factors are associated with the development of pulmonary fibrosis [46], [108], [27]. These results suggest that a genetic predisposition combined with certain environmental factors may trigger the cellular processes that lead to progressive lung fibrosis. There is evidence that a number of receptor tyrosine kinases, including the receptors for platelet-derived growth factor, fibroblast growth factor and vascular endothelial growth factor, play roles in the pathogenesis of IPF [61], [26] but the molecular mechanisms of the early stage of the disease are not entirely clear [28].

Currently, treatment options for IPF are few. Lung transplant is difficult and historically there has been limited, and sometimes conflicting, evidence that any drug could alter the course of this disease [61]. However, recent reports from phase 3 trials indicate two newer drug therapies can reduce the rate of decline in lung function [112], [70]. The U.S. Food and Drug Administration (FDA) recently approved two drugs that seem to inhibit certain pathways that lead to lung scarring [5]. While neither drug is a cure, both have been shown to slow progression as measured by decline in FVC.

IPF is diagnosed primarily on the basis of clinical, physiologic and radiologic criteria, and diagnosis is subject to a number of challenges. It is misdiagnosed in as many as 50% of patients and proper identification requires a variety of tests by a multidisciplinary team [28]. Clinical features of IPF are non-specific and its appearance is similar to other forms of ILD [135]. Diagnosis is often by exclusion, taking into account patient history, functional tests and imaging. Distinguishing IPF from diseases with a similar appearance is important because other ILDs can have a known environmental etiology and established treatment paths which may be inappropriate, or even dangerous, for

patients with IPF [62], [61].

Clinical features of IPF include progressive shortness of breath, non-productive cough and evidence of restriction on pulmonary function tests [96]. Compromised lung compliance limits inspiratory flow and tends to reduce lung volumes. FVC, FEV1.0, DLCO and total lung capacity are generally reduced [140]. Chest x-ray may appear normal in patients with early disease, but in more advanced cases decreased lung volume or subpleural opacities may be apparent [96]. Progression of IPF is variable. While some patients may remain stable for periods of years, others suffer rapid decline and episodes of acute worsening [68].

CT of the lungs plays a key role in the diagnosis of IPF and in distinguishing it from other forms of ILD [102]. Usual interstitial pneumonia (UIP) is the histological and imaging correlate of IPF. Historically, identification of a UIP pattern on surgical lung biopsy was considered the gold standard for diagnosis [83]. Currently, identification of a UIP pattern on CT is considered sufficient to diagnose IPF without a surgical lung biopsy [110].

A UIP pattern on CT is marked by mixed areas of normal lung, active fibrosis and end-stage fibrosis [124]. The disease is marked by spatial and temporal heterogeneity. It is not unusual to find regions of normal lung adjacent to regions of severe fibrosis. The key features of UIP are reticular abnormality (RA) and honeycombing (HC) appearing predominantly in the subpleural and basal regions of the lungs [84], [83]. Traction bronchiectasis (TB), airways enlarged as a result of radial forces related to stiffening lungs, may or may not be present. RA is marked by superposition of small linear opacities, interlobular septal thickening, intralobular lines and the cyst walls of HC. HC is defined as clustered, thin-walled cystic airspaces with generally consistent diameters in the range of 3-10 mm but sometimes as large as 2.5 cm. HC is considered mature and irreversible fibrosis and is the strongest indicator of UIP on CT. A number of features are considered inconsistent with a UIP pattern, including upper or mid

lung predominance, extensive ground glass, profuse micronodules, discrete cysts away from areas of HC, diffuse mosaic attenuation and consolidation in the segments or lobes [110].

Numerous studies have shown that a confident diagnosis of UIP on CT is strongly predictive of IPF [27], [16], [96]. Semi-quantitative studies, where radiologists assigned integer scores to indicate their assessment of extent of certain features, have shown that the presence and extent of UIP on CT are associated with decline in lung function [139] and mortality [83]. In patients with IPF, extent of honeycombing increases on serial CT and reticular abnormality tends to progress to honeycombing [116]. UIP with honeycombing is associated with progression and worse prognosis [65]. Recently a relationship between CT diagnosis of UIP and a genotype associated with airway defense, MUC5B, has also been reported [27]. These results all support the notion that quantifying extent of these patterns on CT is useful and can provide prognostic information.

Unfortunately, visual assessment of CT is difficult and subjective. Results are generally expressed in imprecise, qualitative terms and both intra- and inter-observer variation can be significant. Reports indicate that there is often disagreement, even between experienced radiologists, in identifying UIP features like HC on CT [135], [83]. The varied appearance of UIP characteristics is seen as a key challenge. Further, mental integration to estimate extent of patterns in three dimensions is inherently difficult, particularly when evaluating only image cross-sections through a volume.

Image analysis methods to quantify the appearance of lung fibrosis on CT have shown some success. First order pixel statistics, specifically skewness and kurtosis, have been shown to correlate with physiologic abnormality [15], [16]. This makes intuitive sense: CT essentially measures density, and with increasing fibrosis lungs become smaller, more dense and contain less air. Figure 2.9 demonstrates this effect. Notice how the peak of pixel histograms becomes less sharp (kurtotic) and moves

toward zero (less skewed).

Figure 2.9: With increasing fibrosis, histograms of lung pixels become less skewed and kurtotic. Coronal CT sections and corresponding lung pixel histograms, from left to right: a. normal, b. moderate fibrosis, c. severe fibrosis.

Several researchers have demonstrated the ability to identify lung fibrosis on CT. Kim showed that an SVM trained with statistical texture features in a supervised fashion produced a quantitative score for extent of fibrosis in subjects with sarcoidosis that agreed well with radiologists [69]. They made use of an image filtering pre-processing step to attempt to normalize image variations resulting from varied scanners and acquisition parameters. They have also applied this system to estimate extent of IPF on CT and note correlation with other measures of disease severity on both baseline and longitudinal scans [68].

Uppaluri designed an algorithm dubbed the adaptive multifeature method (AMFM) [129]. The algorithm was initially applied to emphysema detection but

later was tested for discrimination of ILDs [128]. AMFM first separates the lung into ROIs using segmentation, then extracts features based on first and second order statistics, as well as fractal features. First order pixel statistics used include mean, variance, skewness, kurtosis and entropy. Second order statistics chosen include both GLCM and run length features. Geometric fractal dimension, a measure of the complexity of pixel patterns at different scales, was also used. Of these an "optimal" subset is chosen prior to training a Bayesian classifier.

Another group describes an automated system that uses intensity histograms of small volumes of interest as features and trains a statistical classifier using UIP exemplar regions labeled by radiologists. They demonstrated these histogram signatures were associated with radiologist visual classification and correlated with physiological parameters [12]. They also applied this system to CT studies acquired at two time points to quantify volume of interstitial abnormalities. In 55 selected subjects with IPF they demonstrated change in measured quantities was predictive of mortality [86].

2.3.2 Cystic Fibrosis

Cystic Fibrosis (CF) is an inherited disease that affects the glands which produce mucus and sweat. In CF transport of chloride and sodium across the lining of these glands is abnormal [93]. While CF may impact the pancreas, liver, intestines, sinuses, and sex organs, its most well-known targets are the lungs. In patients with CF, mucus becomes viscous and sticky, building up inside the lungs and blocking the airways, making it difficult to breathe. Thickened mucus blocks alveoli which leads to chronic infection, inflammation, and airway remodeling. CF also leads to an increase in salt content of sweat gland excretions and can be diagnosed by sweat chloride test [47].

As a genetic disorder, CF predominantly affects infants, children and young adults. The severity of CF symptoms may wax and wane over the lifetime of the

disease. Signs of lung function degradation may start to appear in early childhood, often leading to severe breathing problems, progressive decline in lung function and, ultimately, respiratory failure, which is the leading cause of death among sufferers of CF [95].

CT plays an important part in management of CF and is usually performed at regular intervals throughout treatment [95]. As therapeutic treatments have improved, a corresponding need for a more sensitive outcome measure has arisen in order to track efficacy. CT has shown promise in this role, because it offers high spatial and contrast resolution and can be performed serially (with close monitoring of the patient's radiation burden). CT enables assessment of severity and spatial distribution of disease effects, revealing characteristic pathologic changes like bronchial wall thickening [94], mucus plugging and bronchiectasis, which is marked by local irreversible dilation of airway branches [41]. This can be caused by obstruction or destruction of elastic tissue surrounding airways. Involved airways become enlarged and inflamed and are easily collapsible, which causes problems for air flow and clearance of mucus.

Recent animal model data suggest abnormalities in the size and shape of the upper airway tree are present in CF subjects from birth [93]. Imaging studies of young children with CF indicate early structural abnormalities, even in patients with normal pulmonary function test results. Research using CT has shown that infants and young children with clinically diagnosed CF have dilated airways compared to controls [95].

In general, airway shape affects pulmonary function and symptoms. Successful therapy could be impacted by variations in airway morphology. Morphology affects resistance and mucus clearance [90]. From an engineering perspective we would expect variation in flow with shape differences in a tree-shaped network of tubes. Reduced airway diameter is associated with decreased lung function [73], [88], [11] and mucus clearance [90]. Quantitative CT to measure airway counts has been

shown to correlate with bronchiectasis [41] as has total diameter of airways [141]. CT provides data for studies of morphology, but individual measures like specific cross-sectional areas tend to be noisy, do not capture trends in overall shape and usually need to be normalized to account for patient body size.
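
As a rough illustration of why airway caliber matters for flow, Poiseuille's law (Equation 2.13) can be applied to a single idealized airway. This is only a sketch under the law's assumptions of steady laminar flow in a rigid, straight tube. The 20% narrowing is a hypothetical value chosen for illustration, not a measurement from the studies cited.

```latex
% Ratio of resistances for two radii, with tube length and viscosity held fixed
\frac{R_2}{R_1} = \left(\frac{r_1}{r_2}\right)^{4}

% Halving the radius, r_2 = r_1 / 2
\frac{R_2}{R_1} = 2^{4} = 16

% A hypothetical 20\% reduction in radius, r_2 = 0.8\, r_1
\frac{R_2}{R_1} = \left(\frac{1}{0.8}\right)^{4} \approx 2.44
```

Even a modest reduction in radius more than doubles the resistance of the idealized tube, which is consistent with the expectation that differences in airway shape translate into differences in flow.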

3. Spin Image Features for Quantification of UIP on CT

Computed tomography (CT) of the chest plays a key role in the assessment of IPF, enabling diagnosis without a surgical biopsy given confident visual identification of a UIP pattern [110]. Visual estimation of extent of lung fibrosis on CT has been shown to correlate with prognosis including mortality and is a recognized marker of disease severity [83], [15].

However, visual evaluation is subjective, limited by variable inter- and intra-observer agreement and insufficiently precise to detect longitudinal change on serial studies. Effective computer methods for detection and quantification of UIP on CT would be valuable for research and clinical applications. Availability of an accurate quantitative CT method that correlates with indices of disease severity and with mortality could provide an important biomarker for disease progression that could be used in clinical trials and ultimately in clinical practice to determine response to treatment.

Volumetric CT scans provide a substantial amount of geometric information that can be used in a variety of ways. Examples include morphologic studies and construction of anatomic models for applications like surgical planning and design of patient-specific devices [23]. As such, segmentation and 3D modeling make up a large portion of medical image analysis efforts [144]. Often these applications focus on identification and delineation of the boundaries of larger structures like bones and abdominal organs. However, many important conditions appear as diffuse patterns lacking specific or consistent landmarks. This is particularly true of disorders affecting lung parenchyma. Identification and analysis of diffuse patterns require somewhat different strategies than 3D modeling of discrete structures.

This chapter describes application of an image texture metric to classification and quantification of extent of fibrosis on CT. Specifically, intensity domain spin image features were used in conjunction with a Random Forest (RF) classifier to distinguish

local regions of fibrosis on lung CT.

3.1 Introduction

Many established methods for quantitative lung CT rely on fairly simple measures such as densitometry, which measures percentage of pixels whose intensity values fall within a predefined range [145], or histogram statistics [115], [15]. While these methods have been shown to correlate to some degree with IPF they are based only on pooled measures of individual pixel intensities and do not consider local context or spatial information.

First order pixel statistics can distinguish fibrosis to some extent. Since fibrosis is much more dense than normal air-filled parenchyma, pixel intensities in fibrotic regions will generally be higher. UIP can include bright regions like reticulations and walls of honeycomb cysts as well as dark regions like cyst centers and bronchiectatic airways. These abnormalities affect local pixel intensity distributions in two key ways. A larger number of bright pixels moves the peak of the pixel intensity distribution closer to zero, since air on CT should correspond to pixel intensities of -1024. As the peak moves closer to zero the distribution of pixel intensities becomes less skewed. The presence of bright and dark pixels in close proximity broadens the peak of the distribution, making it less kurtotic (see Figure 2.9 for examples).

Global lung histogram analysis correlates with baseline measures of physiology and progression on longitudinal analysis [15], [16]. However, histograms are locally orderless, meaning that they do not inherently contain spatial information. Histograms computed by pooling pixel intensities over total lung volumes do not provide information on spatial distribution of fibrosis and they do not necessarily help distinguish fibrosis from other ILD patterns. Histograms are also a limited local feature, since local regions with very different appearance can have the same first order pixel statistics (Figure 3.1).
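
The orderless property of histograms can be checked numerically. The following is a minimal sketch, not code from this work: the pixel values are hypothetical stand-ins for an ROI in Hounsfield units, and `moments` is an illustrative helper. It rearranges the same pixels by sorting and by random permutation and confirms that the histogram and the first order statistics are unchanged.

```python
import math
import random
from collections import Counter


def moments(pixels):
    """Return mean, variance, skewness and kurtosis of a list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(var)
    skew = sum(((p - mean) / std) ** 3 for p in pixels) / n
    kurt = sum(((p - mean) / std) ** 4 for p in pixels) / n
    return mean, var, skew, kurt


# Hypothetical ROI: mostly air-like pixels plus some dense (bright) pixels.
rng = random.Random(0)
roi = [rng.choice([-950, -900, -850, -300, -100]) for _ in range(400)]

sorted_roi = sorted(roi)      # same pixels, rearranged by intensity
shuffled_roi = roi[:]         # same pixels, randomly permuted
rng.shuffle(shuffled_roi)

# The histograms (value -> count) are identical for all three arrangements.
same_hist = Counter(roi) == Counter(sorted_roi) == Counter(shuffled_roi)

# The first order statistics agree as well, up to floating point rounding.
stats_match = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(moments(roi), moments(shuffled_roi))
)
print(same_hist, stats_match)
```

Any descriptor built only from pooled intensities therefore cannot tell these three arrangements apart, which is the motivation for adding spatial information to the histogram.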

Figure 3.1: Pixel histograms do not consider spatial relationships. An ROI showing honeycombing and its histogram (left), the same ROI but with pixels rearranged to sort by intensity (middle) or randomly permuted (right). The pixel histograms are the same in all cases.

Texture is a quality of images formed by patterns in pixel intensities and their spatial arrangement. While no single, precise definition of image texture exists, a variety of methods to quantify different appearances of texture have been proposed. Quantification of texture variation in CT is difficult because natural textures are likely to include both structural and stochastic elements at various scales. Still, the characteristic patterns associated with IPF, reticular abnormality and honeycombing, have distinctive textural qualities that are visually distinct from normal (Figure 3.2), giving the impression that textural methods could provide better features for classification of UIP patterns.

Quantitative texture-based methods have proven useful for a number of tasks relating to medical image analysis, including segmentation [67] and disease scoring [17]. Texture measures seem particularly well suited to the problem of distinguishing diffuse patterns like fibrotic lung disease on CT [133], [117]. A wide variety of image

texture metrics have been applied to analysis of lung parenchyma, including statistical methods [69], grey level run length [50] and combinations of features [128].

Figure 3.2: The characteristic patterns of UIP have qualitatively distinctive textural appearances. From left to right: (a) RA, (b) HC and (c) normal lung.

This work applies intensity domain spin images, a feature extraction method that incorporates spatial information into local pixel intensity histograms by concatenating multiple histograms drawn from separate annular rings centered on a point of interest. We hypothesize that intensity domain spin images can capture certain distinct characteristics of UIP textures and, when paired with a versatile classifier algorithm such as Random Forest, discriminate regions of normal and fibrotic lung. Technical validation of the method was performed by cross-validation using labeled ROIs. The method was also applied to a collection of CT studies from subjects enrolled in a multicenter trial. Extent of lung fibrosis, as estimated by the current algorithm on baseline CT, was compared with pulmonary function tests and semi-quantitative scores assigned by two radiologists.

3.2 Methods

Custom software was developed (MATLAB, The MathWorks, Natick, MA) to facilitate review and labeling of image regions of interest (ROIs). The software allows handling and display of CT series in the industry standard Digital Imaging and

Communications in Medicine (DICOM) format (Figure 3.3). Functions for image slice navigation, zoom and contrast adjustment were built into the software. Users are able to delineate boundaries of ROIs in several ways, such as rectangular, ellipsoidal or free-form polygonal shapes. Semantic labels for each ROI are chosen from a set list.

Figure 3.3: Custom application developed for ROI labeling.

To develop a training set, 55 inspiratory high-resolution chest CT studies from the IPFnet ACE study [104] were reviewed by two experienced thoracic radiologists. Imaging studies were chosen in an effort to maintain consistency in acquisition and reconstruction parameters across the series. Radiologists were asked to delineate regions demonstrating the key features of a UIP pattern. The goal was to draw boundaries so that each ROI contained a homogeneous exemplar consistent with the label assigned. Category labels included normal lung, bronchovascular structures, reticular abnormality (RA), honeycombing (HC) and traction bronchiectasis (TB) (Figure 3.4).

Figure 3.4: Example ROIs. Top row RA, middle HC, bottom TB.

Figure 3.5: Intensity domain spin images consist of KDE histogram estimates, computed from within concentric annular rings centered around a pixel of interest, concatenated into a feature vector.

Spin images were initially suggested as keypoint signatures for 3D geometric models [66], but can be adapted as a local descriptor for 2D images [75]. Their design seeks to incorporate spatial information into local pixel intensity histograms. Spin images share some conceptual similarity with Local Binary Patterns (see Figure 2.4) in that radial regions at a fixed distance from a central pixel of interest are used for calculation. They combine local histograms with rotationally invariant spatial information. This is accomplished by assembling local pixel intensity histograms within annular rings centered on the pixel of interest. Pixel intensity histograms corresponding to each annular ring are concatenated to form a feature vector. Kernel density estimates (KDE), sometimes called soft histograms, are used in order to produce sufficient histogram counts with a relatively small number of samples. KDEs are a non-parametric form of density estimation. Intuitively they are similar to histograms but rather than summing discrete counts into bins, instances of kernel functions (often Gaussian) are superimposed to develop a continuous estimate of a probability density function [20].

Points of interest (POIs) were sampled evenly from within each labeled ROI. Spin image texture descriptors were computed at POIs to accumulate a collection of feature vectors associated with each category. Spin images with four annular rings

were used. KDE histograms with 64 bins evenly spaced over the pixel intensity range from -1024 to 0 HU were used. These features were used to train a Random Forest (RF) classifier in a supervised fashion. RF is a versatile machine learning algorithm based on ensembles of decision trees. RF makes use of bootstrap aggregation, or bagging, to construct decision trees so that each tree only considers a portion of available training data [21], [79].

In addition, RF uses only a randomly selected subset of variables to create decision rules at each node of a tree. This implements random feature selection within the algorithm. Individual decision trees tend to overfit training data and thus exhibit relatively low bias, meaning that they closely fit the data on which they were trained. However, they do show high variance, meaning that trees constructed with different sets of training data can be very different even if there is consistent underlying signal. The use of an ensemble of trees where the final result is determined by majority vote (in the classification task) reduces variance. RF tends to provide good classification results with a minimum of parameter tuning [55].

RF classification performance was tested using leave-one-out cross-validation, where ROIs from one subject were omitted from the training set and reserved for testing. The process is depicted in Figure 3.6. In addition, a small number of images were tested by dense random sampling of POIs across lung segmentation regions in order to visualize classification results. Classification results were accumulated using a Gaussian kernel function at each POI, similar to KDEs, and the final classification estimate was assigned by choosing the category with the largest summed value. This enabled pixel-by-pixel classification based on irregularly sampled POIs where neighboring POI results shared some influence. This testing was performed in cross-validation style, so that test images were taken from CT volumes not used for training.

Volumetric CT scans of 284 subjects enrolled in the ACE, PANTHER and STEP IPFnet studies were used for testing [104]. Studies were acquired with a standardized protocol

but a variety of scanners. Scans were archived and read at a central imaging core at National Jewish Health. Two radiologists visually scored extent of fibrosis, expressing their findings on a scale from 0 (no fibrosis) to 10. POIs were sampled within lungs and spin image features computed. POI feature vectors were classified using RF trained with labeled data from other cases. Algorithm results were expressed as % of lung classified as fibrotic and compared with mean visual scores plus physiologic measures. In addition global statistics including mean, variance, skewness and kurtosis of lung pixel intensities were compared.

Figure 3.6: Spin image algorithm process diagram.

3.3 Results

A total of 900 free-form ROIs were collected for training. From within these ROIs, 14,664 POIs were sampled. In cross-validation, sensitivity was 98.4% and specificity was 93.7% (Table 3.1).

Whole image classification results allow qualitative evaluation of algorithm results (Figure 3.7). Images with radiologist-drawn ROIs were used so that concordance

between manually labeled regions and algorithm estimates could be evaluated.

Table 3.1: Spin image ROI cross-validation results

                Not fibrotic    Fibrotic
Not fibrotic    3026            202
Fibrotic        179             11257

Figure 3.7: Whole image classification results. HC shown in yellow, RA cyan, normal dark blue, TB red and BV maroon.

Radiologist visual scores (extent of reticular abnormality + honeycombing) showed relatively poor inter-observer agreement (weighted κ = 0.3, p < 0.001). The algorithm's volumetric score (fraction of lung classified as fibrotic) showed moderate correlation with mean visual score and with physiologic measures (Table 3.2). Correlations between global statistics of lung pixel intensities and physiologic measures are shown in Table 3.3.

3.4 Discussion

The high accuracy seen in cross-validation is encouraging, and indicates that the combination of spin image features and RF classifier can quantify the appearance of fibrosis on CT. These results are optimistic, however, since they are based only

Table 3.2: Correlations between algorithm, visual scores and physiologic measures (p < 0.001)

                    Visual Score (95% Confidence)    Algorithm Score (95% Confidence)
Algorithm Score     0.38 (0.28, 0.48)
DLCO (% pred.)      -0.48 (-0.56, -0.38)             -0.55 (-0.63, -0.46)
FEV1.0 (% pred.)    -0.32 (-0.42, -0.22)             -0.50 (-0.58, -0.41)
FVC (% pred.)       -0.29 (-0.39, -0.18)             -0.61 (-0.67, -0.53)
TLC                 -0.21 (-0.31, -0.09)             -0.54 (-0.62, -0.46)

Table 3.3: Correlation between global lung pixel statistics and physiologic measures (p < 0.001)

                    Mean     St. Dev.   Skewness   Kurtosis
DLCO (% pred.)      -0.55    -0.41      0.66       0.61
FEV1.0 (% pred.)    -0.50    -0.35      0.58       0.54
FVC (% pred.)       -0.66    -0.55      0.69       0.65

on labeled ROIs. ROIs chosen manually, where the intent is to develop a set of exemplars for algorithm training, tend to incorporate selection bias. Given the task of finding representative examples of disease and normal patterns a person is likely to choose only very distinct regions. While these may be good examples for defining the essential nature of image patterns, they also usually represent the best case scenario in testing classification accuracy.

Whole image testing is a more realistic test since random sampling across the lungs produces ROIs that may be more ambiguous, including ROIs that span boundaries between different tissue components. Since the RF in this case was trained with a relatively limited number of exemplars, testing in this manner will force the algorithm to estimate a category for some ROIs that are not good matches for any of the established categories. Qualitative evaluation of whole images does show reasonable results, including correct classification of regions that had been delineated by the radiologists, but consistent misclassifications are evident. For example, areas where high intensity pixels are adjacent to low intensity pixels are often misclassified as bronchovascular structures, notably ROIs in the lung periphery that partially overlap with muscle of the chest wall. There also appear to be many spurious HC classifications.

Observer agreement on visual scoring of extent of fibrosis is relatively poor, underscoring the difficulty of strictly visual assessment. Algorithm estimates for extent of fibrosis are more strongly correlated with physiologic measures than are visual scores. This is likely an indication that automatic, objective computerized scoring performs the volume integration task more consistently than human observers.

It has been established that global first order pixel statistics correlate with physiologic measures. Indeed, in this work these values show somewhat higher correlation with physiologic tests than do spin image algorithm scores. Moments of pixel intensity distributions are themselves highly correlated and together describe overall brightness and contrast qualities. Considering the nature of inspiratory chest CT, it seems intuitive that global lung brightness and contrast on CT are strongly influenced by degree of lung aeration, which would have a strong impact on pulmonary function tests.

Spin image descriptors are essentially local pixel histograms where spatial information is imposed by concatenating histogram estimates from separate contiguous zones. The underlying signal captured by these descriptors is expected to be closely related to the information in lung pixel histograms whether global or local.

3.5 Conclusions

Automatic texture-based classification of a UIP pattern on CT can quantify extent of fibrosis in subjects enrolled in a multicenter trial, scanned with a variety of CT scanners. This work represents a novel application of spin image features to the problem of lung fibrosis classification.
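
For readers who want a concrete picture of the descriptor used in this chapter, the following is a minimal sketch of an intensity domain spin image, not the implementation used in this work. The four rings and the 64 bins over -1024 to 0 HU follow the Methods; the ring radii (`ring_edges`), the Gaussian kernel width (`bandwidth`) and the function name `spin_image` are assumptions chosen for illustration, and the test image is random data rather than CT.

```python
import numpy as np


def spin_image(image, center, ring_edges=(0, 3, 6, 9, 12), n_bins=64,
               hu_range=(-1024.0, 0.0), bandwidth=25.0):
    """Concatenate soft (Gaussian KDE) intensity histograms from annular rings.

    ring_edges and bandwidth are illustrative assumptions, not values from the text.
    """
    rows, cols = np.indices(image.shape)
    dist = np.hypot(rows - center[0], cols - center[1])
    bin_centers = np.linspace(hu_range[0], hu_range[1], n_bins)
    features = []
    for inner, outer in zip(ring_edges[:-1], ring_edges[1:]):
        values = image[(dist >= inner) & (dist < outer)].astype(float)
        # Each pixel contributes a Gaussian bump centered on its intensity.
        bumps = np.exp(-0.5 * ((bin_centers[None, :] - values[:, None]) / bandwidth) ** 2)
        hist = bumps.sum(axis=0)
        total = hist.sum()
        features.append(hist / total if total > 0 else hist)
    return np.concatenate(features)


# Hypothetical 33 x 33 patch with intensities in a lung-like HU range.
rng = np.random.default_rng(0)
image = rng.integers(-1000, -200, size=(33, 33))
features = spin_image(image, center=(16, 16))

# Rotating the patch about its center leaves the descriptor unchanged,
# because each ring contains the same set of pixels after rotation.
rotated = spin_image(np.rot90(image), center=(16, 16))
print(features.shape, np.allclose(features, rotated))
```

Each ring keeps only the distance of a pixel from the center and discards its angle, which is what makes the descriptor rotationally invariant while still encoding some spatial arrangement.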

4. Unsupervised Feature Learning: Technical Validation

4.1 Introduction

The ultimate goal of "big data" analysis is the distillation of knowledge or meaning from raw information. Much of the current research in machine learning focuses on task-specific systems that can identify objects or patterns in images with minimum human intervention. The general process of machine learning consists of two primary components: feature extraction, where input descriptors are computed from raw data, and construction of decision functions (in the form of classification or regression algorithms) that return an estimate based on a given set of input descriptors. These are separate, complementary problems. The goal of feature extraction is to determine the best, most informative data representation given a certain type of raw input information. The goal in developing decision functions is to determine the most accurate mapping of input features to the desired response.

A classifier algorithm's ability to provide accurate output is highly dependent on the distinctiveness of features used as input. The ideal scenario is to have many independent features that each correlate strongly with the desired outcome variable [55]. Extracting meaningful, distinctive features from raw pixel representations is one of the major challenges in computer vision [30]. Considering that images are highly dimensional and often contain redundant or correlated features, extracting ideal features, those that correlate strongly with target characteristics but not so much with each other, can amount to finding a needle in a haystack.

Digital images are made up of multidimensional arrays of integer values, where the lowest-level features are the pixel intensity values and distances between pixels. These low level features are too highly dimensional to be good descriptors of the patterns that can be appreciated visually. Simply sending these low level features through the mathematical machinery of statistical analysis for classification does not tend to work very well [101]. There is typically a large gap between information

provided by low level image features and higher level visual concepts. The goal of feature extraction is to home in on key characteristics within raw data so that learning algorithms are able to connect these with higher level concepts [32].

The characteristic appearance of IPF on CT is marked by patterns in both pixel intensity and spatial arrangement, a quality of images often called texture. While no single definition of image texture exists, the field of image processing provides a number of techniques to quantify texture in digital images. These are generally based on first and second order pixel statistics measured at predefined distances and directions (e.g. GLCM, Run Length and LBP). Methods like this can provide useful features when textural patterns are highly structured. Natural textures, however, tend to be stochastic and are more difficult to differentiate using such rules.

Any of the texture metrics described previously can be considered "hand engineered". Each was invented to capture a specific image characteristic or class of characteristics and relies on a variety of parameters that must be tuned for a specific application. Classification systems often concatenate several of these features because they each provide different information. For example, LBP and spin images can describe corners and edges in a rotationally invariant fashion, while run length methods are good at describing speckling patterns.

Applications of such "hand engineered" image features have made strides in medical image analysis. However, designing features in this way requires a trial and error approach to select feature combinations and optimize parameters. In other words, a great deal of prior knowledge and domain knowledge must be "baked in" in order to create algorithms that perform well. This introduces bias, results in algorithms that are highly tailored to a specific application and may not be tolerant of variations in image quality. A danger is that these engineered algorithms can overfit training data.

Automatic classification of lung tissue components on CT is difficult due to natural variation in normal anatomy, variation in appearance of abnormality and differences due to image acquisition parameters. Researchers have suggested some valuable improvements by cleverly engineering combinations of features, processing steps and classifiers [50], [128], [12]. This results in features that are based on a variety of assumptions for pixel intensity range, scale and other factors. The difficulty in designing such systems is that the range of variations likely to be encountered is very large. It is difficult to anticipate all variations or add additional algorithm layers as variations are encountered. As a result classification systems based on "hand engineered" features can be brittle and only effective for a narrow range of input data.

Recent work in computer vision has sought to learn higher level feature representations directly from data. The goal is to develop versatile systems that produce highly descriptive features that do not need to be explicitly designed by domain experts [30]. Mounting evidence shows that unsupervised feature learning, a data-driven approach to develop image feature descriptors, can be more effective over a wider range of inputs [101]. In this paradigm basis elements are learned via an unsupervised process from representative unlabeled data. These bases serve as a dictionary or codebook and novel image regions can be described as weighted combinations of dictionary elements. Weighting coefficients can serve as predictor variables or feature vectors for subsequent classification algorithms [29], [31].

A number of data-driven methods for feature learning have been proposed, including Principal Components Analysis [143] and Sparse Coding [85]. The general process is to apply an unsupervised learning algorithm to distill essential elements from a large collection of unlabeled data. These essential elements are represented mathematically as a set of basis functions, often termed a dictionary or codebook, combinations of which can be used to approximate previously unseen data. Having been derived from (ideally very large) collections of data, learned features tend to capture important details better than manually designed features.
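
The dictionary idea can be made concrete with a small sketch. It uses Principal Components Analysis, one of the methods named above, purely as an illustration and is not the method adopted in this work. The patch data are random stand-ins for CT patches, and the patch size, the number of dictionary elements and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unlabeled data: 500 flattened 8 x 8 patches (random stand-ins).
patches = rng.normal(size=(500, 64))

# Learn a dictionary without labels: the leading principal components.
mean = patches.mean(axis=0)
_, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
dictionary = vt[:16]                       # 16 basis elements, one per row

# Describe a previously unseen patch as weights on the dictionary elements.
new_patch = rng.normal(size=64)
weights = dictionary @ (new_patch - mean)          # feature vector for a classifier
approximation = mean + weights @ dictionary        # weighted combination of elements

print(weights.shape, np.linalg.norm(new_patch - approximation))
```

The 16 weights play the role of the predictor variables described above: they summarize a 64-pixel patch in terms of elements learned from unlabeled data and could be passed to any subsequent classifier.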


A trend in computer vision and machine learning is that using more data tends to provide better results than trying to devise a more clever feature or algorithm [43]. Some unsupervised methods can be computationally demanding and difficult to apply with large data sets. With this in mind it is worthwhile to employ simpler, more generalizable algorithms that are able to process more data in a reasonable amount of time [30], [31]. For this project, a relatively simple and scalable approach was adopted. Coates showed that a modification of K-means clustering, a familiar method for vector quantization, can learn low-level features that are useful for image classification. The method was originally described for identifying text in photographs of scenes, but was later presented as a general pipeline for learning features from images [29], [31].

The main hypothesis of this work is that image feature descriptors learned in an unsupervised process can be used to identify regional patterns associated with IPF on CT. This approach was applied to a large collection of chest CT images to generate a codebook of features specific to small patches of lung CT images. This codebook was then used to compute features for larger image regions. These features were used with a Support Vector Machine (SVM) classifier, trained in a supervised fashion using regions labeled by experienced radiologists, to distinguish regions of fibrosis on CT.

4.2 Methods

A large collection (500) of volumetric chest CT scans, known to contain both normal and disease patterns, was used for feature learning. Semi-automatic lung segmentation had been performed on these studies but they were not labeled otherwise. In other words, no explicit effort was made to assign semantic labels to different tissue or disease types within the lungs.

A simple two-step preprocessing procedure was applied to the extracted patches. The first step was to perform a brightness and contrast normalization of each patch:


$$x^{(i)} = \frac{\tilde{x}^{(i)} - \operatorname{mean}(\tilde{x}^{(i)})}{\sqrt{\operatorname{var}(\tilde{x}^{(i)}) + \epsilon_{norm}}} \tag{4.1}$$

where $\tilde{x}^{(i)}$ are the extracted image patches. This process amounts to standardization, transforming the input to zero mean and unit variance, but with the inclusion of a small constant, $\epsilon_{norm}$, to avoid division by zero.

The second pre-processing step is whitening, which is a process to decorrelate observations. The goal is to transform the data so that the correlation matrix becomes (close to) the identity matrix. A straightforward way of accomplishing this is ZCA whitening, which is very much like PCA but without rotation [31]. With the eigendecomposition of the covariance of normalized image patches calculated as $V D V^T = \operatorname{cov}(x^{(i)})$, the ZCA whitened image patches can be computed as:

$$V (D + \epsilon_{ZCA} I)^{-1/2} V^T x^{(i)} \tag{4.2}$$

where $\epsilon_{ZCA}$ is another small constant, again included to avoid division by zero.

Spherical K-means is a modification of the well-known unsupervised clustering algorithm, based on the inner product rather than Euclidean distance. The algorithm consists of the following steps [64]:

1. Initialization: specify the desired number of clusters $K$. For a set of input data, randomly assign cluster membership (from 1 to $K$) to each observation.
2. Iterate the following until cluster assignments stop changing:
   (a) For each of the $K$ clusters, compute the cluster centroid. The $k$th cluster centroid is the vector of the $p$ feature means for the observations that are currently assigned to that cluster.
   (b) Assign each observation to the cluster whose centroid is most similar to it (in this work the inner product is used as the similarity metric).
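The normalization of Equation 4.1, the ZCA whitening of Equation 4.2 and the spherical K-means steps listed above can be sketched as follows. This is an illustrative sketch only, not the implementation used in this work: it assumes NumPy, patches flattened to the rows of an array, and hypothetical function names. Re-seeding empty clusters and scaling centroids to unit length are assumptions added so that the sketch runs robustly and yields normalized dictionary elements.

```python
import numpy as np

def normalize_patches(patches, eps_norm=0.01):
    """Equation 4.1: per-patch brightness and contrast normalization.

    patches: array of shape (n_patches, p), one flattened patch per row.
    """
    mean = patches.mean(axis=1, keepdims=True)
    var = patches.var(axis=1, keepdims=True)
    return (patches - mean) / np.sqrt(var + eps_norm)

def fit_zca(x, eps_zca=25.0):
    """Equation 4.2: ZCA matrix V (D + eps I)^(-1/2) V^T from cov(x)."""
    cov = np.cov(x, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, 0.0)  # guard against round-off negatives
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps_zca)) @ eigvecs.T

def spherical_kmeans(x, k, n_iter=12, seed=0):
    """K-means with inner-product similarity; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=x.shape[0])  # step 1: random membership
    centroids = np.zeros((k, x.shape[1]))
    for _ in range(n_iter):
        for j in range(k):  # step 2(a): centroid = mean of assigned patches
            members = x[labels == j]
            if len(members) == 0:  # assumption: re-seed an empty cluster
                members = x[rng.integers(0, x.shape[0], size=1)]
            c = members.mean(axis=0)
            centroids[j] = c / (np.linalg.norm(c) + 1e-12)  # unit length
        # step 2(b): assign each patch to the most similar centroid
        new_labels = np.argmax(x @ centroids.T, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return centroids, labels
```

Because the ZCA matrix is symmetric, whitening all patches at once is `x @ fit_zca(x)`. The rows of the returned `centroids` array correspond to the dictionary elements.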


At each iteration each pre-processed image patch $x^{(i)}$ is assigned to the cluster whose center it is closest to, based on the inner product. Cluster centers are updated at each iteration by computing the mean of the feature vectors currently assigned to that cluster. In practice the algorithm tends to converge after 10-12 iterations.

To summarize, this unsupervised feature learning (UFL) process consists of the following steps [31]:

1. Collect a set of small image patches, $\tilde{x}^{(i)}$, from available training data. For 6x6 pixel patches $\tilde{x}^{(i)} \in \mathbb{R}^{36}$. Normalize brightness and contrast for each patch using Equation 4.1.
2. Apply a statistical pre-processing step to whiten (decorrelate) the data using Equation 4.2, yielding a new dataset $x^{(i)}$.
3. Run spherical K-means clustering on $x^{(i)}$ to build a mapping from input image patch to a feature vector, $z^{(i)} = f(x^{(i)})$.

Applying this algorithm to the collection of pre-processed patches $x^{(i)}$ assigns each observation to one and only one cluster center and yields a set of normalized vectors $D^{(j)}, j \in \{1, \ldots, d\}$ (i.e. the cluster centroids), which are the columns of the dictionary $D$.

To describe a new input patch $\tilde{x}$ in terms of an existing dictionary, the patch is first brightness and contrast normalized and whitened to yield a vector $x$. It is mapped to a new representation $z \in \mathbb{R}^d$ by taking the inner product with each dictionary element. The result is then transformed with the scalar nonlinear function:

$$z = \max\{0, |Dx| - \alpha\} \tag{4.3}$$

where $D$ is the dictionary or codebook computed by the pipeline described above and $\alpha$ is a threshold constant. In this work $\alpha = 0$ was used throughout so that $z$ does not include negative values. Whereas the dictionary was formed by grouping image patches into one and only one cluster, this "soft" activation function allows partial contributions from several dictionary elements. It is intuitively similar to determining a linear combination of elements in the dictionary that approximates the new image patch. The threshold activation incorporates useful non-linearity by dropping out negative values.

We would like to use the activations $z$ to describe image regions on a larger spatial scale than the low level patches in $\tilde{x}^{(i)}$. This is accomplished by pooling the activations for a collection of patches in a given region, typically by computing $\operatorname{mean}(z)$ over all low level patches in the region. This is called average pooling and produces a feature vector that can be used in subsequent supervised learning tasks.

A series of separate UFL dictionaries was computed to compare results using different parameters. In separate operations roughly 500,000 small square patches, of size 6x6, 8x8, 10x10 or 12x12 pixels, were extracted by random sampling from axial image slices. Prior to sampling, image pixel size was standardized to 0.5 mm/pixel by bilinear interpolation. These low level patches were used to compute separate dictionaries with $K$ = 64, 128, 256 and 512. The parameters $\epsilon_{norm}$ and $\epsilon_{ZCA}$ were also varied.

The collection of ROIs described previously (see Section 3.2) was used to test UFL for classification of IPF-related patterns. These were manually delineated ROIs from 55 subjects known to have IPF. ROIs labeled as normal, bronchovascular (BV), reticular abnormality (RA), honeycombing (HC) and traction bronchiectasis (TB) were used for testing. Each ROI was considered a separate observation and described by a single feature vector. This is an optimistic scenario because irregular, hand drawn ROIs can be considered homogeneous examples of a given pattern. Feature vectors were computed by average spatial pooling, meaning the average activation $z$ of all overlapping low level patches within an ROI. Since ROIs have irregular boundaries, low level patches where at least 50% of pixels fell within ROI boundaries were considered. The mean (pooled) activation was normalized (i.e. each feature vector was made to sum to 1.0) so that large and small ROIs could be compared.

Experiments were also performed to test accuracy when classifying fixed-size square patches (referred to as mid level patches) sampled from within hand drawn ROIs. In this case square low level patches fit more neatly within the square mid level patches, and only those low level patches that fit completely within a mid level patch were considered. The number of low level patches within a mid level patch is $(M - L + 1)^2$, where $M$ is the size (per side) of a mid level patch and $L$ is the size per side of low level patches. For example, $(30 - 8 + 1)^2 = 529$ low level patches sized 8x8 pixels fit within a 30x30 pixel mid level patch.

In addition to UFL features, pixel intensity statistics (minimum, maximum, mean, variance, skewness, kurtosis and entropy) and Local Binary Pattern (LBP) texture metrics were computed for each observation (whole ROI or mid level ROI). LBP were computed at each pixel within an ROI and the results compiled into a histogram to construct a feature vector. These additional features were tested independently, and together by concatenating feature vectors for each observation. Concatenated feature vectors were standardized to zero mean and unit variance to accommodate the varying units and scales of different variables.

Support Vector Machine classifiers were trained using these features, with average pooling over either whole free-form ROIs or fixed size square mid level patches sampled from within free-form ROIs. SVM parameters $C$ and $\gamma$ were selected by cross-validation. Five-fold cross-validation by subject was used. Classification accuracy was assessed by f1 score:

$$f_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{4.4}$$
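The encoding of Equation 4.3, average pooling with normalization, the low level patch count and the f1 score of Equation 4.4 can be sketched together as follows. This is a sketch under stated assumptions rather than the code used in this work: it assumes NumPy, the function names are hypothetical, and the dictionary is stored with one unit-length element per row (the text describes the elements as columns of $D$).

```python
import numpy as np

def encode_patches(x, dictionary, alpha=0.0):
    """Equation 4.3: z = max(0, |inner product| - alpha) per dictionary element.

    x: (n_patches, p) pre-processed patches.
    dictionary: (d, p), one element per row (assumed layout).
    """
    return np.maximum(0.0, np.abs(x @ dictionary.T) - alpha)

def pool_region(z):
    """Average pooling over all patches in a region, scaled to sum to 1.0."""
    pooled = z.mean(axis=0)
    total = pooled.sum()
    return pooled / total if total > 0 else pooled

def n_low_level_patches(m, l):
    """Number of LxL patches that fit completely inside an MxM patch."""
    return (m - l + 1) ** 2

def f1_score(precision, recall):
    """Equation 4.4."""
    return 2 * precision * recall / (precision + recall)
```

For instance, `n_low_level_patches(30, 8)` reproduces the worked example from the text (529 patches), and the pooled vector from `pool_region` has one entry per dictionary element regardless of ROI size.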


Figure 4.1: Spatial pooling over ROI. The green square indicates a low level patch sampled across the ROI.

4.3 Results

Figure 4.2 shows a subset of dictionary elements learned from lung CT patches using the process described above. Figure 4.3 shows results of leave-one-out cross-validation of UFL feature vectors computed for whole ROI observations using different dictionaries. This result was used to choose low level patch size and number of dictionary elements.

Table 4.1 compares f1 scores for different combinations of features. For UFL features the best parameters (low level patch size, number of dictionary elements, $\epsilon_{norm}$ and $\epsilon_{ZCA}$) were used. Best SVM parameters were determined for each combination of features.

Figures 4.4 and 4.5 show confusion matrices for the best performing combination of features over whole ROIs and mid level ROIs.

Figure 4.2: Example dictionary elements learned using spherical K-means.

Figure 4.3: Comparison of whole ROI classification accuracy with different UFL dictionaries.

Table 4.1: Cross-validation f1 scores

Features                  whole ROI   mid level ROI
UFL + Pixel statistics    0.93        0.80
UFL                       0.89        0.75
LBP + Pixel statistics    0.87        0.74
Pixel histogram           0.84        0.74
Pixel statistics          0.82        0.71
LBP                       0.78        0.52

Figure 4.4: Confusion matrix, whole ROI testing.

Figure 4.5: Confusion matrix, mid level ROI testing.

4.4 Discussion

Visualization of UFL dictionary elements (Figure 4.2) shows that the process learns oriented edges and blobs, similar to filters common in image processing but with significant asymmetry and curved elements. Qualitatively the dictionary elements seem consistent with small features apparent on lung CT, such as vessel and airway cross-sections. The first cross-validation performed sought to quantify the effects of low level patch size and number of dictionary elements. Figure 4.3 makes it clear that smaller low level patches (e.g. 6x6 pixel) and dictionaries with more elements produce features that are more effective for classification by SVM. The parameters $\epsilon_{norm}$ and $\epsilon_{ZCA}$ did not seem to have a strong effect on cross-validation results. Slightly better results were seen with $\epsilon_{norm} = 0.01$ and $\epsilon_{ZCA} = 25.0$, so these values were fixed for most testing.

The best performing features were a combination of UFL and local pixel statistics. The initial normalization step in Equation 4.1 transforms pixel intensities so they are no longer on the Hounsfield Unit scale. This pre-processing step decouples UFL features from overall brightness and contrast, so these features are primarily related to local texture. This being the case, a uniformly bright region will have a similar UFL representation to a uniformly dark region. We know that brightness and contrast information on lung CT helps distinguish normal lung from fibrosis, so it is not surprising that omitting this information degrades classification results somewhat. Still, classification based on UFL features alone is quite accurate, indicating that many regions can be distinguished by textural characteristics alone.


Classification based on features derived from whole ROIs is noticeably better than for square, fixed size ROIs extracted from within labeled regions. In developing the training set, radiologists were asked to draw ROI boundaries to conform to homogeneous areas depicting one category type. Texture can be a very complex quality, especially the stochastic patterns in UIP features, so larger, homogeneous areas capture distinctive characteristics better than ROIs of arbitrary size and shape. Consider the grain pattern on a hardwood floor. The underlying pattern is better appreciated by looking at a handful of large swaths than at a collection of small square patches. The small squares may capture components like knots or sharp edges that occur sporadically in the pattern. While these are relevant, if they appear as dominant characteristics in small square regions they may misrepresent the frequency with which they occur in the overall pattern. Large homogeneous swaths tend to demonstrate the relative frequency with which the components occur, making it easier to learn and identify a complex pattern. With this in mind, it is not surprising that the whole ROI results are better. Still, it is interesting that ROIs with different sizes and shapes can be used together to learn a particular pattern. This implies that, even though UIP patterns are very heterogeneous, their dominant characteristics are consistent and become apparent given good exemplars (i.e. large homogeneous swaths).

Methods to find contiguous but irregularly shaped image regions exist, for example superpixels and graph based methods, but a more straightforward approach to sampling new images is with a fixed size sliding window. The combination of UFL features and pixel statistics provides informative features for classification, and these experiments have identified parameter choices for best results.

4.5 Conclusion

Unsupervised feature learning using a relatively simple, scalable approach is feasible for classification of lung fibrosis on CT.


5. Unsupervised Feature Learning for Quantification of UIP on CT

5.1 Introduction

A significant challenge of supervised learning, especially in medical imaging applications, is that labeled data is very expensive. Confident, precise segmentation of regions showing disease patterns is an arduous, time consuming task and requires domain experts (i.e. experienced radiologists). Visual assessment is known to be subjective, so it is good practice for two or more radiologists to review ROIs. The result is that large, robust training data sets are difficult to achieve. Also, manual labeling tends to introduce bias, especially in that humans are more likely to select "obvious" exemplars. A training set consisting mostly of observations that are very easy to discriminate does not usually lead to a classification algorithm that performs well in discriminating more subtle observations. It is difficult to assemble a training data set that captures the range of variations likely to be present in real world data.

It is clear that the amount and quality of training data used to construct a classification algorithm are major factors in its success. Small training sets will often be overfit by a classifier, and selection bias propagated, resulting in a system that is only effective for a very limited range of input data. In this situation cross-validation, the standard approach for evaluating classifier performance under controlled conditions, is likely to give optimistic results, especially if training data was drawn from a uniform population and labeled by the same radiologists. It can be the case that an algorithm that performs very well in cross-validation, where only ROIs that were distinct enough to be chosen by a radiologist are used, does not provide useful results when presented with ROI data sampled from previously unseen images.

Intuitively, it would seem that the amount of training data required to adequately represent a specific pattern is related to the expected range of variation of that pattern. For example, a pattern that is fairly unique and identifiable by a relatively small number of key characteristics may be adequately described by a small number of training samples. On the other hand, data with a very large range of variation requires a corresponding increase in the amount of training data. This is the challenge when developing systems to distinguish specific disease patterns from surrounding "normal" anatomy. The goal of this work is to develop an algorithm that can detect lung fibrosis on CT. Framing this as a supervised two category classification problem requires image samples labeled as fibrotic and not fibrotic. Fibrosis, although a complicated, mixed pattern, is a much more specific pattern than not fibrotic, which amounts to anything else that could appear on CT. It is infeasible to manually develop a training set that includes a sufficient number of samples that are not fibrotic.

Semi-supervised learning is a method intended to alleviate problems related to limited training data. This approach makes use of unlabeled or weakly labeled data, in addition to a smaller amount of labeled data, to develop a more robust training set. In this case weakly labeled data may be whole images or collections of images where one high level label is assigned to all, rather than highly specific labels designated by boundaries that conform to a limited region. While fully labeled data can be prohibitively expensive when dealing with medical images, weakly labeled data is more plentiful. The overall diagnosis or impression of an imaging study can provide enough information for weak labeling. For example, a radiologist's report that a CT study is normal provides enough information to assume that any ROI extracted from the study (which may contain hundreds of images) does not show fibrosis.

Large collections of training data can improve results, but also present logistical challenges. The computational burden of training a classifier on a huge training set can be difficult to overcome. Iterative hard negative mining is meant to work around this problem [37]. The approach is similar in concept to bootstrap aggregation, in that subsets of training data are used separately. It is also similar to boosting, a method that focuses on training samples that are close to the decision boundary and therefore difficult to classify [64], [55].


When training a binary classifier on a limited number of positive samples and a huge number of (weakly labeled) negative samples, there are typically too many samples to train on all at once. The iterative hard negative mining approach makes it feasible to process training samples in batches and use the observations that contribute most to finding good decision boundaries. This chapter builds on the results of Chapter 4, applying the method to a collection of volumetric CTs of subjects with IPF. In order to manage the complexity of real clinical images, semi-supervised learning and iterative hard negative mining were used.

5.2 Methods

Using image features determined by Unsupervised Feature Learning (UFL), a binary classifier was developed to identify regions of lung fibrosis on CT. Acquisition of training ROIs was described previously (see Section 3.2). ROIs with category labels RA, HC or TB were considered fibrotic or positive examples. To augment the positive training set each ROI was rotated by ±30° and a small amount of noise was added to the rotated versions. This triples the number of fibrotic exemplars and adds greater variation in rotational orientation.

ROIs with other labels (normal, bronchovascular, muscle, fat and bone) were used as negative or non-fibrotic exemplars. In addition, ROIs extracted from 50 volumetric CT of normal, non-smoking volunteers [145] were used as weakly labeled negative examples. This resulted in a collection of roughly 500,000 negative exemplar ROIs.

UFL features were computed as described previously (see Section 4.2). Low level patch size was 6x6 pixels, with $\epsilon_{norm} = 0.01$ and $\epsilon_{ZCA} = 25.0$. The dictionary consisted of 512 elements. With a training set of positive and negative exemplars established, a binary SVM was trained by iterative hard negative mining using the following steps:

1. Partition all training samples into $n = 10$ non-overlapping subsets.
2. Train an initial classifier with one randomly selected subset of the training data.
3. For each remaining subset:
   (a) Apply the current classifier to this subset and test accuracy.
   (b) Identify false positives.
   (c) Train a new classifier using all true positives and the false positives (hard negatives) that have been identified.

This produces a binary classifier trained to detect lung fibrosis. A second classifier was also trained with the goal of distinguishing types of fibrosis. The method is the same as described in Section 4.2. This SVM was trained with 5 categories: normal, bronchovascular, RA, HC, and TB. Ten-fold cross-validation was used to test both of these classifiers.

To evaluate new images the lungs were first segmented. ROIs were densely sampled within lung masks and each ROI was processed to produce feature vectors consisting of learned features and first order pixel statistics. Each ROI was classified in a two stage process. First, ROIs were classified as fibrotic or not using the binary classifier trained with hard negative mining. Those ROIs that were predicted to be fibrotic were then classified again using the multi-category SVM to estimate type of fibrosis.

Testing of the binary fibrosis detector was performed on 50 CT image slices selected from the training data set at random. For each image a radiologist (who had not been involved in drawing training ROIs) was asked to delineate all areas of fibrosis. Patches were densely sampled within lungs and classified as fibrotic or not using the fibrosis detector SVM. Data was partitioned so that the classifier was not trained with any data from the current test subject. Results were evaluated qualitatively. In addition, the fraction of lung area identified by the radiologist as fibrotic was compared to the fraction of patches classified as fibrotic by the algorithm.
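The iterative hard negative mining steps listed above can be sketched as follows. This is a hedged illustration, not the implementation used in this work: it assumes NumPy, the function and class names are hypothetical, and a simple nearest-centroid classifier stands in for the SVM so that the sketch has no external dependencies. The mining step is interpreted as retaining all positive samples plus the negatives that the current classifier misclassifies as positive.

```python
import numpy as np

class CentroidClassifier:
    """Stand-in for the SVM (assumption): label 1 if closer to the positive mean."""

    def fit(self, x, y):
        self.pos = x[y == 1].mean(axis=0)
        self.neg = x[y == 0].mean(axis=0)
        return self

    def predict(self, x):
        d_pos = np.linalg.norm(x - self.pos, axis=1)
        d_neg = np.linalg.norm(x - self.neg, axis=1)
        return (d_pos < d_neg).astype(int)

def hard_negative_mining(x, y, make_classifier, n_subsets=10, seed=0):
    """Train on one subset, then grow the training set with hard negatives."""
    rng = np.random.default_rng(seed)
    # Step 1: partition sample indices into non-overlapping subsets.
    subsets = np.array_split(rng.permutation(len(y)), n_subsets)
    # Step 2: initial classifier from one randomly selected subset.
    train_idx = subsets[0]
    clf = make_classifier().fit(x[train_idx], y[train_idx])
    # Step 3: visit each remaining subset in turn.
    for subset in subsets[1:]:
        pred = clf.predict(x[subset])                          # (a) apply
        false_pos = subset[(y[subset] == 0) & (pred == 1)]     # (b) hard negatives
        positives = subset[y[subset] == 1]
        train_idx = np.concatenate([train_idx, positives, false_pos])
        clf = make_classifier().fit(x[train_idx], y[train_idx])  # (c) retrain
    return clf, train_idx

def fraction_fibrotic(predictions):
    """Fraction of sampled patches that were classified as fibrotic (label 1)."""
    return float(np.mean(predictions))
```

Only the easy negatives are dropped, so the final training set is typically much smaller than the full collection while still containing every positive exemplar.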


A handful of images containing RA or HC training ROIs were tested in order to qualitatively assess discrimination of fibrosis type using the two classifier approach. This was accomplished as above, in a cross-validation style so that training and test data were not mixed. Results were assessed visually.

Baseline volumetric CT of 330 subjects enrolled in the ACE, PANTHER and STEP IPFNet studies were used for testing [63], [104], [100], [99]. Studies were acquired with a standardized protocol but a variety of scanners. Scans were archived and read at a central imaging core at National Jewish Health in Denver, Colorado (NJH). Two radiologists visually scored extent of fibrosis, expressing their findings on a scale from 0 (no fibrosis) to 10. ROIs were sampled within lungs and UFL and pixel histogram statistics were computed. ROIs were classified using the binary SVM trained with iterative hard negative mining. Those ROIs predicted to be fibrotic were classified using the second, multi-category classifier in order to estimate fibrosis type. Algorithm results were expressed as % of lung identified as fibrotic, i.e.:

$$\text{Score} = \frac{\#\text{ of ROIs classified as fibrotic}}{\#\text{ of ROIs tested}} \tag{5.1}$$

and compared with mean visual scores plus physiologic measures including FVC (% predicted) and DLCO (% predicted). In addition, global statistics of lung pixel values including mean, skewness and kurtosis were compared by Spearman rank correlation.

Of the total 330 subjects, 50 were used for collection of training ROIs. These 50 subjects were analyzed separately so that training and test data were not mixed during analysis. Of the subjects from the PANTHER study [100], baseline and follow-up CT were available for 72 subjects. Visual analysis of these CT studies was performed by three experienced radiologists, different from the two that had assigned semi-quantitative scores for all baseline IPFNet CTs. Extent of fibrosis was scored on a scale of 0-10. Follow-up studies were scored on a 5 point scale to quantify change (0 = much better, 1 = slightly better, 2 = same, 3 = slightly worse, 4 = much worse). Change in algorithm fibrosis scores from baseline to follow-up was compared (by Spearman rank correlation) with change in physiologic measures and change in visual scores.

Of the 72 subjects with CTs at two time points, 33 subjects had CT series from one acquisition that were reconstructed using two different kernels. Having CT image series from a single acquisition but reconstructed in two different ways provided data to test how consistent the algorithm estimate of fibrosis can be with different reconstruction kernels.

5.3 Results

Cross-validation results (precision, recall and f1 score) for the binary fibrosis detector are shown in Table 5.1. Receiver Operating Characteristic (ROC) analysis for these binary classifier results showed mean area under the curve (AUC) of 0.996 (standard deviation 0.003) over the ten folds. Cross-validation results for the multi-category SVM are shown in Table 5.2.

Figures 5.1 and 5.2 show example images where areas of fibrosis were delineated by the radiologist (left) and classified as fibrotic by the algorithm (right). Figure 5.3 plots fraction of lung area estimated to be fibrotic by radiologist versus algorithm prediction. Table 5.3 shows correlations between algorithm and radiologist scores and first order pixel statistics.

Figures 5.4 and 5.5 show results for training images that were classified using the two stage system. ROIs sampled within the lungs were first classified as fibrotic or not, and those that were classified as fibrotic were then processed using the multi-category SVM to predict fibrosis type.

Table 5.4 shows associations between physiologic measures, visual scores and UFL algorithm results for baseline CTs in 280 subjects not used for training. Radiologist agreement in semi-quantitative fibrosis scores (two readers) at baseline was poor by


weighted Cohen's kappa ($\kappa = 0.3$, $p < 0.001$, N = 280).

Results for the 50 ACE cases used to collect training ROIs are shown in Table 5.5. Radiologist agreement in semi-quantitative visual scores was poor in this subset as well ($\kappa = 0.2$, $p < 0.001$, N = 50).

For the 72 PANTHER subjects with follow-up CT, associations between algorithm score, visual score and physiologic measures at baseline are shown in Table 5.6. To test inter-rater agreement between visual extent of fibrosis scored by 3 radiologists, Fleiss' kappa was computed. This test showed modest agreement ($\kappa = 0.35$, $p < 0.001$). Associations between changes (value at follow-up minus value at baseline) are shown in Table 5.7. To quantify change, the 3 radiologists scored change at follow-up on a 5 point scale (0 = much better, 1 = slightly better, 2 = same, 3 = slightly worse, 4 = much worse). Fleiss' kappa on these scores showed poor agreement ($\kappa = 0.19$, $p < 0.001$).

Thirty-three subjects had two CT series reconstructed from the same acquisition but with different convolution kernels. All of these were acquired using scanners made by GE Medical Systems, with one series reconstructed using the STANDARD kernel. Twenty (n = 20) had a second series reconstructed with the DETAIL kernel. Mean difference in fibrosis scores in this group was 0.005 (95% confidence interval [-0.421, 0.431], p = 0.98). The second series in the remaining 13 subjects was reconstructed using the BONE kernel. Mean difference (fibrosis score computed from BONE series minus fibrosis score computed from STANDARD series) in this group was 2.1 (95% confidence interval [0.329, 3.855], p = 0.02).

Table 5.1: Cross-validation results, binary classifier to detect fibrosis. Mean value over cross-validation partitions, standard deviation in parentheses.

                Precision       Recall          f1
Not fibrosis    0.984 (0.024)   0.978 (0.009)   0.981 (0.014)
Fibrosis        0.944 (0.040)   0.971 (0.026)   0.957 (0.026)

Table 5.2: Cross-validation results, multi-category classifier to detect fibrosis. Mean value over cross-validation partitions, standard deviation in parentheses.

                    Precision       Recall          f1
Normal              0.965 (0.066)   0.976 (0.018)   0.972 (0.042)
Bronchovascular     0.863 (0.119)   0.499 (0.107)   0.623 (0.087)
Reticular abnorm.   0.877 (0.088)   0.936 (0.028)   0.903 (0.048)
Honeycombing        0.802 (0.070)   0.845 (0.092)   0.820 (0.063)
Traction bronch.    0.665 (0.193)   0.439 (0.144)   0.512 (0.123)

Figure 5.1: Fibrosis identified by radiologist (left) and algorithm (right).

Figure 5.2: Fibrosis identified by radiologist (left) and algorithm (right).

Figure 5.3: Fraction of fibrotic lung, radiologist versus algorithm estimate.

Table 5.3: Pearson correlation, radiologist and algorithm ($p < 0.001$).

               Radiologist   Algorithm
Algorithm      0.92
Mean HU        0.66          0.81
HU St. Dev.    0.41          0.54
HU Skewness    -0.81         -0.91
HU Kurtosis    -0.71         -0.82

Figure 5.4: RA ROI drawn by radiologist (left) and algorithm prediction (right).

Figure 5.5: HC ROI drawn by radiologist (left) and algorithm prediction (right).

Table 5.4: Associations (Spearman rank correlation $\rho$) at baseline in IPFNet subjects not used for classifier training (n = 280). $p < 0.001$ except where indicated.

                    Visual Score      FVC (% pred.)   DLCO (% pred.)
Visual score                          -0.26           -0.49
Algorithm score     0.50              -0.60           -0.68
Mean lung atten.    0.31              -0.68           -0.56
Skewness            -0.48             0.70            0.67
Kurtosis            -0.49             0.68            0.65
CT TLC (L)          -0.06 (p=0.35)    0.57            0.31

Table 5.5: Associations (Spearman rank correlation $\rho$) at baseline in IPFNet subjects used for classifier training (n = 50). $p < 0.001$ except where indicated.

                    Visual Score      FVC (% pred.)   DLCO (% pred.)
Visual score                          -0.42           -0.58
Algorithm score     0.69              -0.60           -0.63
Mean lung atten.    0.52              -0.79           -0.51
Skewness            -0.66             0.70            0.52
Kurtosis            -0.68             0.70            0.50
CT TLC (L)          -0.28 (p=0.04)    0.67            0.40

Table 5.6: Associations (Spearman rank correlation $\rho$) at baseline in PANTHER subjects with follow-up CT (n = 72). $p < 0.001$ except where indicated.

                    Visual Score      FVC (% pred.)    DLCO (% pred.)
Visual score                          -0.29            -0.59
Algorithm score     0.75              -0.45            -0.72
Mean lung atten.    0.59              -0.49            -0.54
Skewness            -0.69             0.56             0.71
Kurtosis            -0.67             0.53             0.67
CT TLC (L)          -0.26 (p=0.03)    0.25 (p=0.04)    0.27 (p=0.02)

Table 5.7: Associations (Spearman rank correlation) in changes (values at follow-up minus values at baseline) for PANTHER subjects with follow-up CT (n = 72). $p < 0.001$ except where indicated.

                    Visual Score       FVC (% pred.)   DLCO (% pred.)
Visual score                           -0.41           -0.38
Algorithm score     0.64               -0.41           -0.39
Mean lung atten.    0.56               -0.36           -0.31 (p=0.009)
Skewness            -0.52              0.46            0.23 (p=0.054)
Kurtosis            -0.49              0.45            0.12 (p=0.131)
CT TLC (L)          -0.36 (p=0.002)    0.40            0.41 (p=0.380)

5.4 Discussion

Fibrosis detection on CT using UFL features and a binary SVM trained with iterative hard negative mining shows good classification results, as shown in Table 5.1. ROC AUC of nearly 1.0 gives good confidence in the algorithm's ability to distinguish ROIs that contain lung fibrosis from ROIs that do not. The multi-category SVM also shows very good results in ten-fold cross-validation, but we see that smaller and more subtle patterns (BV and TB) are more difficult to classify. The lower recall scores for these categories indicate that ROIs that belong to these categories are often misclassified into other categories. Both BV and TB are airway structures, and so are smaller objects that have fewer training exemplars, which may be a factor. Also, the free-form ROIs of other categories, especially RA and HC, were drawn as large swaths that probably contain embedded airway structures. Smaller square ROIs sampled from within these free-form ROIs could therefore actually be more like BV or TB ROIs. Such noisy labels make classification more challenging.

In whole image testing very good concordance is seen between algorithm predictions and radiologist identification of fibrosis. In this test the radiologist was asked to delineate all visible regions of fibrosis. Qualitatively the algorithm predictions coincide well with regions outlined by the radiologist, both in terms of sensitivity and specificity. Quantitative evaluation, by comparing the fraction of lung area identified as fibrotic, shows excellent agreement. ROI-based testing, as is done in cross-validation, can be optimistic because every ROI has an assigned label. New images, or images "in the wild", are more challenging because randomly sampled ROIs may bridge different characteristic patterns or may not necessarily contain a specifically identifiable pattern. These results are encouraging in that they demonstrate good concordance between the automatic system and a radiologist. Anecdotally, radiologists who viewed these results stated that the algorithm did identify some false positives but also tended to pick up small, subtle areas of fibrosis not outlined by the radiologist.

Testing the secondary algorithm, whose intent is to discriminate type of fibrosis, also shows good results qualitatively. These ROIs were originally drawn to collect training examples, so the focus was on outlining regions that demonstrated particular patterns and not necessarily on excluding other regions. In other words, radiologists


were asked to outline regions that were good examples of the patterns of interest and were not trying to categorize every region of fibrosis in an image. As such, this test addresses sensitivity and not specificity.

Volumetric tests of baseline CT studies (Table 5.4) showed moderate correlation between algorithm fibrosis score and physiologic parameters. The various CT-derived indices (visual, UFL algorithm, and lung pixel histogram statistics) all show that an increase in measured fibrosis is associated with diminished FVC (% pred.) and DLCO (% pred.). Of note is that the semi-quantitative fibrosis scores assigned to these studies by two radiologists showed poor inter-rater agreement, but the automatic algorithm showed moderate correlation with mean visual score. It is also apparent that algorithm fibrosis score correlates with physiologic measures slightly better than visual score does. Similar tests on the volumetric CTs of subjects from which training ROIs were drawn (Table 5.5) show comparable associations, with slightly stronger association between algorithm and visual score. This is likely because this subset of CT studies was acquired with more consistent CT parameters and these subjects were all from the ACE study. In machine learning this type of testing, where training and test data are mixed, may be considered "cheating", but the number of ROIs used for training is relatively small compared to the number of ROIs sampled from within volumetric CTs. It was considered useful to explore correlations under these optimistic conditions.

Associations at baseline between algorithm, visual score and physiologic parameters in the PANTHER subjects for whom follow-up CT was available (Table 5.6) show similar trends. Of note here is that the association between algorithm score and radiologist visual score is higher than in the larger group of IPFNet baseline studies. One possible explanation is that this group of 3 radiologists collectively has much more experience than the 2 who scored all IPFNet baseline cases. Inter-rater agreement (measured by Fleiss' kappa) is still fairly low but using the mean score reduces


the effect of any one observer score.

Looking at change in variables from baseline to follow-up (Table 5.7) shows that increased fibrosis, as determined by visual evaluation, the automatic algorithm or first order pixel statistics, is generally associated with diminished lung function. It has been established that skewness and kurtosis of lung pixel histograms are associated with baseline function in patients with IPF [15], [16]. Other studies suggest skewness and kurtosis are not correlated with reduced function on follow-up studies [68], but this particular study looked at changes over a relatively short time (7 months). My results indicate that a reduction in skewness (i.e., lung pixel histogram peak moving toward the right) is associated with reduction of FVC (% pred.) ($\rho = 0.46$, $p < 0.001$) but not with change in DLCO (% pred.) ($\rho = 0.23$, $p = 0.054$). However, change in algorithm fibrosis score is associated with visual score, FVC (% pred.) and DLCO (% pred.). Since lung CT histogram indices, like mean lung attenuation, skewness and kurtosis, are simple, global measures, they are unable to capture local variations. Instead, these measures tend to reflect overall brightness and contrast of lung pixels, which is a simple measure of the relative amounts of air and tissue in the lungs on inspiration. It is therefore not surprising that these measures correlate with global measures of lung function (like FVC), but it is also clear that these global measures are unlikely to capture the extent of more subtle, heterogeneous patterns like honeycombing.

Prior work [16] suggests that histogram-based quantitative CT measures can show disease progression, but visually determined extent of disease on CT is a stronger predictor of mortality. Other work [77] suggests that CT-based fibrosis score provides information comparable to DLCO in predictive models of mortality in IPF. This suggests that automatic extent of fibrosis scores that correlate well with visual scores assigned by experienced radiologists could provide useful prognostic information.

A limitation of this study population is that CTs were acquired with scanners from various manufacturers and with different reconstruction kernels. A small subset of

PANTHER studies (n=33) had CT series from the same acquisition but reconstructed using different kernels. All of these cases were from GE scanners. A comparison of the automatic fibrosis scores indicates no significant difference between series reconstructed with STANDARD and DETAIL kernels, but that fibrosis scores for BONE kernel were slightly higher than those for STANDARD series. This issue will require further exploration.

Another limitation of this study was that mortality data was not available. There has been some debate regarding appropriate endpoints for study of IPF progression [45], [109]. This work shows correlation between increase in fibrosis, measured on CT with an automatic algorithm, and diminished lung function. Future work will seek to include mortality data.

5.5 Conclusions

UFL features and iterative hard negative mining using weakly labeled negative samples can be used to classify regions of lung fibrosis on CT. Automatic detection and quantification of lung fibrosis on CT is a complex problem because of heterogeneity in image patterns and variation in CT images. This work demonstrates that features capable of discriminating key patterns of lung fibrosis can be learned from representative data. The system described provides estimates of fibrosis extent that concur with radiologist visual assessment and correlate with physiologic measures, both at baseline and on follow-up. Possible future work includes refinement of the classification algorithm, perhaps by incorporation of spatial information.
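
As an illustration of the first-order histogram indices discussed in this chapter (mean lung attenuation, skewness and kurtosis), the following minimal Python sketch computes them from an array of lung pixel values. The pixel values are synthetic and purely illustrative (they are not data from this study), and the function name `histogram_indices` is an assumption introduced here.

```python
import numpy as np

def histogram_indices(hu):
    """Return mean, skewness and (non-excess) kurtosis of lung pixel values in HU."""
    hu = np.asarray(hu, dtype=float)
    mean = hu.mean()
    centered = hu - mean
    m2 = np.mean(centered ** 2)   # variance (population form)
    m3 = np.mean(centered ** 3)   # third central moment
    m4 = np.mean(centered ** 4)   # fourth central moment
    return {"mean": mean, "skewness": m3 / m2 ** 1.5, "kurtosis": m4 / m2 ** 2}

rng = np.random.default_rng(0)
# Synthetic, illustrative pixel populations (NOT study data):
# mostly aerated lung near -850 HU plus a small share of denser tissue.
aerated = np.concatenate([rng.normal(-850, 40, 9000), rng.uniform(-700, -200, 1000)])
# A larger share of denser tissue, loosely mimicking more extensive fibrosis.
denser = np.concatenate([rng.normal(-800, 60, 6000), rng.uniform(-700, -200, 4000)])

for name, pixels in [("aerated", aerated), ("denser", denser)]:
    print(name, {k: round(float(v), 2) for k, v in histogram_indices(pixels).items()})
```

In this toy example the population with more dense tissue has a higher mean and lower skewness and kurtosis, which matches the direction of change described in the text. The indices remain global summaries and say nothing about where in the lung the dense tissue lies.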

6. Statistical Shape Modeling of the Airway Tree

Texture-based methods can be effective for quantifying parenchymal diseases, but measurement of airways requires a different approach. Morphologic parameters such as lumen areas and homothety ratio have been correlated with measures of lung function [18], [89], indicating that analysis of airways segmented from CT scans may also be useful for quantitative imaging.

6.1 Introduction

The shape and size of the airway tree, a complex tubular, branching structure, impact flow and resistance through the airways [137]. Studies of the airway using tissue specimens or quantitative imaging techniques tend to focus either on the dimensions of a relatively small number of individual branches or on summary statistics that average measurements over many branches [35], [58], [94], [141], [142]. Airway morphology is important to lung function and may be a marker of disease severity in patients with airway disease [11], [94], affecting respiratory symptoms, inhaled medication delivery, and mucus clearance [91], [90]. Despite the potential importance of three-dimensional airway morphology on respiratory symptoms and therapy, there are no standard methods to define overall pediatric airway size and shape and there is limited published data regarding this topic in children [40], [111], [113], [141].

Geometric morphometrics refers to the statistical analysis of shape, especially comparisons of biological and anatomical shapes [146]. It has been used in anthropology, for example, to quantify inter- and intraspecies variations in craniofacial structure over time [82]. Statistical shape modeling (SSM) is an extension of geometric morphometrics that has been applied to a number of problems in medical imaging [13], [56], including segmentation and quantitative analysis of anatomical shape changes due to remodeling [8], [76]. SSM computes a mean shape from a set of examples, often referred to as the training set, and provides a framework for efficient description of the primary modes of shape variation demonstrated within that set. This is

accomplished by rigid body registration of corresponding landmarks followed by dimensionality reduction by Principal Components Analysis (PCA) [53], which returns linearly uncorrelated modes of deformation sorted by amount of variance.

SSM may be useful for studying airway shape because it makes it possible to represent the airway's complex, treelike geometry using a relatively small number of parameters. These parameters can be compared with other clinical variables using established statistical methods in order to find associations between airway shape, subject size and disease state. A composite airway model could also be useful for further study including computational fluid dynamics simulation.

Cystic fibrosis (CF) is an inherited disease that leads to progressive deterioration of lung function and ultimately respiratory failure [47]. Thickened mucus causes chronic infection, inflammation, and airway remodeling. Airway clearance techniques and inhaled medicines are critical treatments. Successful therapy could be impacted by variations in airway morphology. Data from animal models suggest abnormalities in the size and shape of the trachea are present in CF subjects from birth [93], and imaging studies of young children with CF indicate early structural abnormalities, even in patients with normal pulmonary function test results [41], [81], [118]. However, these studies focus on individual measurements or simple lumped statistics such as airway size and airway wall thickening. These methods do not describe complex changes in airway tree shape and branching that may result from abnormal protein function in patients with cystic fibrosis.

We hypothesize that SSM methodology can be used to compare the airway morphology of two populations based on data from chest computed tomography (CT) scans. To explore this hypothesis, we synthesized an average airway skeleton model and compared overall airway shape variations with measures of subject growth and disease state. We performed experiments with synthetic data as a technical validation. Because previous studies have shown that airways are affected in children with

CF [141], we chose to apply the method to compare a cohort of children with CF and a cohort of disease control children without measurable respiratory disease.

6.2 Methods

This pilot study is a cross-sectional, retrospective analysis of volumetric chest CT acquired for clinical purposes. The study protocol was in compliance with the Health Insurance Portability and Accountability Act (HIPAA) and the Declaration of Helsinki and was approved by the Colorado Multiple Institutional Review Board. A waiver of consent was granted to collect de-identified CT data. Candidates for the study were identified using the electronic medical record of Children's Hospital Colorado. The control group was comprised of patients under the age of 18 years who underwent inspiratory high-resolution chest CT during clinical evaluation for bone marrow transplant. Subjects were included in the study if inspiratory chest CT was performed prior to the bone marrow transplant and met technical requirements including sub-millimeter, contiguous images through the lungs and absence of significant artifacts from motion or the presence of metal. Only studies for which clinical reports by a pediatric radiologist indicated no evidence of airway abnormality were used. Exclusion criteria included documented pulmonary symptoms, cough, dyspnea, sleep apnea, persistent asthma, history of pneumonia or severe pulmonary infection requiring hospitalization, history of chest surgery, or record of abnormal pulmonary function tests. CF subjects were relatively age and gender-matched to the disease control group and had an inspiratory chest CT during the same time period, according to hospital records. Chest CTs in the CF cohort were obtained based on provider discretion. Subjects were eligible for the CF cohort only if they had an abnormal sweat chloride test and/or genetic testing consistent with the diagnosis of CF. Height, weight, and forced expiratory volume in one second (FEV1) were recorded within two weeks of their CT scan.

Chest CT scans were acquired at Children's Hospital Colorado on a single CT scanner (Siemens Sensation 40, Siemens AG, Medical Solutions, Erlangen, Germany). Scans were obtained in cooperative children near total lung capacity using a clinical breathing protocol. Children less than three years of age were sedated using general anesthesia and imaged after a deep inspiration delivered by the anesthesiologist with either a noninvasive mask or a laryngeal mask airway. CT acquisition parameters were dictated by clinical protocols with a tube potential of 100-120 kVp and current of 75 or 150 mAs based on patient weight. All scans were reconstructed with a contiguous slice thickness less than or equal to 1.0 mm. The image matrix was 512 x 512 pixels. Field of view was determined based on child size, and pixel size ranged from 0.36 to 0.97 mm. Soft kernel reconstruction (b31f) was used for bone marrow transplant evaluation and sharp kernel reconstruction (b70f) for CF patients. CTs were read in a clinical workflow by pediatric radiologists.

CT studies of all disease control and CF subjects were considered the training set. CT image volumes were processed by a single investigator using a commercially available application for quantitative lung imaging (Apollo®, Vida Diagnostics, Coralville, IA) to produce airway segmentation models for each subject. The software performs automatic airway tree segmentation and branch labeling in a batch process [126]. It also provides a user interface for multi-planar visualization of images, and for manual and semi-automatic editing of segmentation and branch labels. Segmentation and anatomic labels were verified by a pediatric pulmonologist. Centerline extraction of the airway tree, also known as skeletonization, was performed automatically by the application after segmentation was finalized. A minimum of 36 branches were labeled in each case. Seventeen were ultimately used for shape modeling.

Custom software (MATLAB, The Mathworks, Natick, MA) was developed to process output from Apollo® and to synthesize a statistical shape model. Post-processing included identification of named branches by their endpoints, which are

bifurcations on the airway centerlines and can serve as key landmarks. Cubic spline functions were computed to fit vertices along each separate branch centerline. Branch correspondence across all subjects was determined using anatomic names assigned using Apollo®. The SSM was based on branches that had been identified and named in all subjects. Due to variations in topology a small number of branches were present in some subjects but not in others. To avoid gaps, branches that were absent in certain cases were modeled as splines of zero length, with both endpoints at the same vertex. Coordinates of twenty pseudo-landmarks evenly spaced along each branch were interpolated using spline functions so that landmark correspondence, an important prerequisite for SSM calculation, was maintained. Two CT studies reconstructed with both soft and sharp kernel were available. These were compared in an effort to assess the impact of kernel difference on the shape representation. Airway skeletons for the same subject derived from CT series reconstructed with different kernels were overlaid and assessed visually.

Each shape in the training set was represented by the 3D coordinates of its pseudo-landmarks, derived from spline interpolation of each branch. Generalized Procrustes Analysis (GPA) was performed to align the tree shapes into a common coordinate system and compute a mean shape. To initialize the process, the configuration of landmarks from one subject was arbitrarily chosen as the target and all other sets of landmarks were aligned to it using rigid body registration. Registration computed the rigid-body rotation and translation that minimize the sum of the squared distances between points after transformation.

$x' = Rx + T + \epsilon$ (6.1)

where $x'$ and $x$ are shapes represented as homologous landmarks, $R$ is a rotation matrix, $T$ is a translation vector and $\epsilon$ is an accommodation for noise. The goal is to minimize:

$\Sigma^2 = \sum_{i=1}^{n} \| x'_i - (R x_i + T) \|^2$ (6.2)

This was accomplished using a singular value decomposition approach [10].

Registration was performed with and without scaling. To perform rigid body registration with scaling, each landmark configuration is scaled by one over its centroid size so that its centroid size, the square root of the summed squared distances of each landmark to their centroid, equals one [146]. The centroid size of the jth landmark configuration is:

$C_j = \sqrt{\sum_{i=1}^{n} \| x_{ij} - \bar{x}_j \|^2}$ (6.3)

where $x_{ij}$ is the ith landmark of the jth member of the training set and $\bar{x}_j$ is the centroid of $x_j$. Omitting this scale normalization step allows uniform scaling variations to propagate through the subsequent steps.

Procrustes analysis was performed iteratively, where at each step the mean landmark configuration served as the registration target. The mean, or consensus, configuration was updated after each step and the number of iterations was the same as the number of subjects in the training set.

Given a set of $s$ landmark configurations that have been aligned by GPA, their mean can be calculated as:

$\bar{x} = \frac{1}{s} \sum_{j=1}^{s} x_j$ (6.4)

After GPA, variables describing each airway tree shape are arranged as column vectors:

$x_j = [x_1\ y_1\ z_1\ \ldots\ x_n\ y_n\ z_n]^T$ (6.5)

where $x_j$ is the shape vector for one subject and $n$ is the number of landmarks in the airway tree shape. In this case $n$ = 20 times the number of branches, because each branch spline was sampled to produce 20 pseudo-landmarks.

SSM was computed using PCA [34]. This approach reduces dimensionality of the coordinate data; the resulting Principal Components (PCs) describe linearly uncorrelated modes of variation in the training data and thus provide a model of deformation. They are ordered such that the first PC accounts for the largest direction of variation present in the training set. PCA can be considered eigenvalue decomposition of a covariance matrix. In the case of shape data analyzed by GPA, covariance is given by:

$V = \frac{1}{s-1} \sum_{j=1}^{s} (x_j - \bar{x})(x_j - \bar{x})^T$ (6.6)

where $x_j$ is the jth tree shape in the training set with $s$ members and $\bar{x}$ is the mean of the training set calculated using (6.4).

Performing an eigenvalue decomposition of $V$ and assembling the $t$ eigenvectors corresponding to the $t$ largest eigenvalues, $\Phi = (\phi_1 | \phi_2 | \ldots | \phi_t)$, gives the basis for a deformation model. The vectors in the matrix $\Phi$ indicate the primary directions of deformation exemplified in the training set. Any member of the training set, $x_j$, can be approximated by:

$x_j \approx \bar{x} + \Phi b$ (6.7)

where $x_j$ is any shape in the training set, $\bar{x}$ is the mean shape, $\Phi$ is the matrix of principal components and $b$ is a vector of parameters that adjusts the magnitude of displacement along the corresponding component. By varying the elements of $b$ the mean shape defined by the training set is deformed. Since the first few PCs generally account for most of the variation present in the training set, the model

provides the ability to describe each airway tree shape with a relatively small number of parameters. The variance of the ith parameter, $b_i$, across the training set is given by the corresponding eigenvalue $\lambda_i$. A given shape $x_n$ can be represented using an existing shape model by computing the parameters in $b$ as:

$b = \Phi^T (x_n - \bar{x})$ (6.8)

where $\Phi$ is the matrix of principal components and $\bar{x}$ is the mean shape. A shape that had not been included in the training set must first be registered to the mean shape before SSM parameters can be calculated.

To include estimated airway lumen size without explicitly including additional landmarks at the airway walls (which would increase complexity dramatically and pose problems for determination of correspondence), the radius of a sphere centered on each landmark was added as an additional variable. The radius was determined by finding the maximally inscribed sphere within the airway segmentation at that point. In this case registration in GPA considered only landmark location coordinates, but final calculation of mean shape included radii for lumen size estimation. Prior to calculation of PCA, variables for a single tree shape were arranged as:

$x_j = [x_1\ y_1\ z_1\ r_1\ \ldots\ x_n\ y_n\ z_n\ r_n]^T$ (6.9)

Figure 6.1 demonstrates inclusion of lumen size estimates with inscribed spheres. For clarity, only one sphere is shown at each branch's midpoint.

Statistical tests were performed using R (R Development Core Team, 2013). Student's t-test was used to compare group means of basic demographic data between the two populations. SSM parameters were compared to age, height, weight, body surface area (BSA), centroid size, and FEV1 using Pearson correlation. BSA was calculated using the DuBois formula [44] based on height and weight values in the patient's medical record. SSM parameters were compared to disease state (control versus CF)

using multivariate logistic regression with best subsets variable selection based on the Akaike Information Criterion (AIC) [7], [76]. Level of significance was set at p < 0.05. By adjusting SSM parameters according to logistic regression results, representative airway models were created for the disease control group and the CF group, respectively, for visualization and to facilitate measurement of specific dimensions. The angle at the carina was compared between the two representative models.

Figure 6.1: Incorporation of estimated lumen size with inscribed spheres.

Leave-one-out cross-validation (LOOCV) was performed over both the shape modeling and analysis processes. This involved holding out pseudo-landmark data for one subject to be used for testing, then calculating generalized Procrustes analysis, SSM, variable selection and regression using the remaining subjects. After aligning the test subject to the mean landmark configuration for the training set, SSM parameters were calculated using (6.8). Accuracy of CF status prediction was estimated by plugging these SSM parameters into regression models built using the training set. This was repeated over the whole population so that each subject was tested using a model constructed from the other cases. Performing cross-validation so that all steps of a model building process are tested is considered the strongest approach [55]. The process pipeline is depicted in Figure 6.2. Apollo® is used to segment, label

and skeletonize airway trees from chest CT scans. Airway lumen size is converted to spherical data. Generalized Procrustes Analysis is used to align the models and create a consensus model. Principal component analysis determines the parameters that define shape differences in the population. Statistical analysis by logistic regression with automatic variable selection compares parameters between patients with cystic fibrosis and controls. Leave-one-out cross-validation tests the method's ability to generalize to data not included in the training set.

Figure 6.2: Airway SSM and LOOCV pipeline.

Synthetic tree shapes were created in order to test algorithm behavior with known size and shape variations. The average tree shape from subject data was used as a basis and thin plate spline warping applied to create smooth variations. Bifurcations

and branch mid-points were used as control points. Control points on the left and right sides, excluding the carina and trachea, were displaced in order to induce shape variations. This was done by independently rotating control points about the carina. Randomly selected rotation angles with no bias produce shape variation across a whole population but no significant difference between groups. Applying bias to rotation angles creates significant differences between groups. Two populations of 40 tree shapes were created so that all trees were deformed and scaled by some amount. In one population of 40, half of the trees were transformed so that the left and right control points were more likely to be displaced medially. This was accomplished by biasing rotation of the left and right control points in opposite directions about the anterior-posterior axis (Figure 6.3). In a second population of 40 trees similar variations were induced but without significant differences between groups. Isotropic scaling was also applied to create overall size variations without bias in both populations. Both populations were analyzed using SSM followed by logistic regression with variable selection. LOOCV was used to test the algorithm's ability to discriminate known shape differences as well as its response to a population without significant shape differences. Experiments to test inclusion of

6.3 Results

Three-dimensional airway tree models were created from a total of 40 CT studies, 20 disease control subjects and 20 in the CF cohort. Subject demographics are summarized in Table 6.1. Group means were not statistically different by height, weight, BSA, or FEV1 percent predicted. A total of 17 airway branches were labeled in all cases (Figure 6.4).
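
The registration steps described in Section 6.2 (rigid-body alignment by singular value decomposition, optional normalization by centroid size, and iterative Procrustes alignment to an evolving mean) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MATLAB software used in this work: the function names, the (n, 3) array layout and the randomly generated test shapes are all introduced here for the example.

```python
import numpy as np

def centroid_size(x):
    """Square root of the summed squared landmark-to-centroid distances."""
    return np.sqrt(((x - x.mean(axis=0)) ** 2).sum())

def rigid_align(x, target, scale=False):
    """Align landmarks x (n, 3) to target (n, 3) by least-squares rotation and translation."""
    xc = x - x.mean(axis=0)              # remove translation
    tc = target - target.mean(axis=0)
    if scale:                            # optional normalization to unit centroid size
        xc = xc / centroid_size(xc)
        tc = tc / centroid_size(tc)
    # SVD of the cross-covariance matrix gives the least-squares rotation
    u, _, vt = np.linalg.svd(xc.T @ tc)
    d = np.sign(np.linalg.det(u @ vt))   # guard against reflections
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return xc @ rot

def generalized_procrustes(shapes, scale=False):
    """Iteratively align all shapes to the evolving mean; returns (aligned, mean)."""
    shapes = [np.asarray(s, dtype=float) for s in shapes]
    mean = shapes[0]                     # arbitrary initial target
    for _ in range(len(shapes)):         # one iteration per subject, as in the text
        aligned = np.stack([rigid_align(s, mean, scale) for s in shapes])
        mean = aligned.mean(axis=0)      # consensus configuration
    return aligned, mean

# Hypothetical check: copies of one shape in random poses should align exactly.
rng = np.random.default_rng(1)
base = rng.normal(size=(10, 3))

def random_pose(shape):
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1                    # make q a proper rotation
    return shape @ q + rng.normal(size=3)

aligned, mean = generalized_procrustes([random_pose(base) for _ in range(5)])
print(np.abs(aligned - mean).max())      # residual after alignment
```

Calling `generalized_procrustes(shapes, scale=True)` corresponds to registration with scaling; leaving `scale=False` lets overall size differences pass through to the later PCA step.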

Figure 6.3: Synthetic tree shapes for simulation experiments.

Figure 6.4: Seventeen branches were labeled in every model.
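
The shape model itself, described in Section 6.2 as a PCA of the aligned shape vectors, can be sketched in the same spirit. The code below is an illustrative Python sketch rather than the thesis implementation: the function names are assumptions, and the training data are random numbers standing in for aligned pseudo-landmark coordinates.

```python
import numpy as np

def build_shape_model(shape_vectors, n_modes):
    """PCA of aligned shape vectors (one row per subject).

    Returns the mean vector, a matrix whose columns are the leading
    eigenvectors of the covariance matrix, and their eigenvalues.
    """
    X = np.asarray(shape_vectors, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)            # uses the 1/(s-1) normalization
    evals, evecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1][:n_modes]
    return mean, evecs[:, order], evals[order]

def shape_to_params(x, mean, modes):
    """Project a registered shape vector onto the model modes."""
    return modes.T @ (x - mean)

def params_to_shape(b, mean, modes):
    """Deform the mean shape along the modes by the parameters b."""
    return mean + modes @ b

rng = np.random.default_rng(0)
# Hypothetical training set: 12 subjects, 5 landmarks x 3 coordinates each.
X = rng.normal(size=(12, 15))
mean, modes, evals = build_shape_model(X, n_modes=11)

b = shape_to_params(X[0], mean, modes)
print(np.abs(params_to_shape(b, mean, modes) - X[0]).max())  # reconstruction error
print(np.cumsum(evals) / evals.sum())                        # cumulative variance fraction
```

With 12 training shapes the centered data have rank 11, so 11 modes reconstruct every training shape to within rounding error. Keeping fewer modes trades reconstruction accuracy for a more compact description.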

Table 6.1: Subject demographics.

                 Control                      CF                           p-value
Subjects         n=20                         n=20
Male             11                           11
Age (years)      9.82 (1.42-17.92)            10.05 (2.25-20.33)           0.90
Height (cm)      133.3 (78-178)               131.8 (90-179)               0.87
Spirometry       n=17                         n=16
FEV1 %           101.7 (82-120)               91.3 (33-143)                0.14
Weight (kg)      34.83 (10.1-75.6)            32.21 (12.5-74)              0.66
BSA (m^2)        1.12 (0.45-1.91)             1.08 (0.55-1.92)             0.76
Centroid size    642.2 (507.5-776.9)          623.9 (517.0-730.85)         0.64
Lung vol. (cc)   2070.27 (304.83-6249.34)     2474 (694.50-6542.70)        0.43

Two populations of 40 synthetic tree shapes, each with 17 branches, were analyzed using the complete workflow including LOOCV. One population was comprised of two groups of 20 trees where a small but statistically significant shape difference between groups had been induced by preferentially displacing left and right control points medially in one group and laterally in the other. LOOCV discriminated the two groups with an accuracy of 0.87 (Table 6.2a). The second population was also divided into two groups of 20 where a range of variation but no significant difference existed between the groups. In this case the modeling process did not identify any SSM parameters as significant and logistic regression did not differentiate between the two groups. Simulation results are summarized in Table 6.2.

Segmentation bias related to the different CT reconstruction kernels used in the control and CF populations rendered lumen size estimates unreliable. SSM analysis in subject data used only pseudo-landmark coordinates from skeleton branches. In LOOCV when scaling was not performed classification accuracy was 0.70 (Table 6.3a). Sensitivity and specificity were also 0.70. In an SSM computed using the full complement of data (n = 40) and without scaling at the registration step, eigenvalues from PCA indicate that the first ten SSM parameters account for about 73% of the total variation present in the training set (Figure 6.5). The first SSM parameter is strongly correlated with age, lung volume, height, weight, BSA and model centroid size but parameters 2 through 9 are not (Table 6.4a). Considering the CF and control groups separately does not appreciably change correlations. Logistic regression modeling with variable selection based on AIC revealed that three of the first 10 SSM parameters (b2, b4 and b5) describe differences between CF and non-CF subjects with likelihood ratio test indicating significance (Table 6.5). The receiver operating characteristic curve shows good accuracy with area under the curve equaling 0.85 (Figure 6.6). The logistic regression model suggests that a positive one standard deviation change in the b2 SSM parameter, leaving other parameters fixed, increased the odds of belonging to

Table 6.2: Simulation Results

                               Group A (n=20)              Group B (n=20)
                               Left side    Right side     Left side    Right side
Left-Right rotation            mean = 0° (st. dev. = 4°)
Anterior-Posterior rotation    -4° (4°)     4° (4°)        4° (4°)      -4° (4°)
Superior-Inferior rotation     0° (4°)
Uniform scale factor           1.0 (0.2)
Lumen radii scale factor       0.9 (0.2)                   1.1 (0.2)
LOOCV Accuracy                 0.87
LOOCV Sensitivity              0.89
LOOCV Specificity              0.86

(a) Significant shape difference between groups

                               Group A (n=20)              Group B (n=20)
                               Left side    Right side     Left side    Right side
Left-Right rotation            mean = 0° (st. dev. = 6°)
Anterior-Posterior rotation    0° (6°)
Superior-Inferior rotation     0° (6°)
Uniform scale factor           1.0 (0.2)
Lumen radii scale factor       1.0 (0.2)
LOOCV Accuracy                 0.5
LOOCV Sensitivity              0.5
LOOCV Specificity              0.5

(b) No significant shape difference between groups
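
The biased rotations summarized in Table 6.2 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: control points are rotated directly about the carina, the thin plate spline warping, the other two rotation axes and the scale factors are omitted, and the axis convention (`ap_axis`), the function names and the toy coordinates are all hypothetical.

```python
import numpy as np

def rotation_matrix(axis, angle_deg):
    """Rotation by angle_deg about the given axis (Rodrigues' formula)."""
    ax = np.asarray(axis, dtype=float)
    ax = ax / np.linalg.norm(ax)
    a = np.deg2rad(angle_deg)
    k = np.array([[0.0, -ax[2], ax[1]],
                  [ax[2], 0.0, -ax[0]],
                  [-ax[1], ax[0], 0.0]])
    return np.eye(3) + np.sin(a) * k + (1.0 - np.cos(a)) * (k @ k)

def rotate_about(points, pivot, axis, angle_deg):
    """Rotate points (m, 3) about an axis passing through pivot."""
    return (points - pivot) @ rotation_matrix(axis, angle_deg).T + pivot

def perturb_control_points(left, right, carina, rng, bias_deg, sd_deg,
                           ap_axis=(0.0, 1.0, 0.0)):
    """Rotate left and right control points in opposite directions about the carina."""
    left_angle = rng.normal(-bias_deg, sd_deg)    # e.g. mean -4, st. dev. 4 degrees
    right_angle = rng.normal(+bias_deg, sd_deg)   # e.g. mean +4, st. dev. 4 degrees
    return (rotate_about(left, carina, ap_axis, left_angle),
            rotate_about(right, carina, ap_axis, right_angle))

# Toy coordinates (hypothetical): carina at the origin, one point per side.
rng = np.random.default_rng(0)
carina = np.zeros(3)
left = np.array([[30.0, 0.0, -40.0]])
right = np.array([[-30.0, 0.0, -40.0]])
group_a = [perturb_control_points(left, right, carina, rng, 4.0, 4.0) for _ in range(20)]
group_b = [perturb_control_points(left, right, carina, rng, -4.0, 4.0) for _ in range(20)]
print(np.mean([l[0, 0] for l, _ in group_a]), np.mean([l[0, 0] for l, _ in group_b]))
```

Setting `bias_deg` to zero for both groups gives the unbiased case of Table 6.2b, where shapes vary but the groups do not differ systematically.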

the CF group by about 2 times. Similarly, a negative one standard deviation change in the b4 SSM parameter increased the odds by 3 times and a positive one standard deviation change in b5 increased the odds by 2 times. Stated another way, a tree represented by one standard deviation shifts from the mean in the b2, b4 and b5 SSM parameters (positive for b2 and b5, negative for b4) would have 93% probability of belonging to the CF group according to this model. Figure 6.7 shows PCA scores for SSM parameters in this model.

Table 6.3: LOOCV Results

            Control   CF
Control     14        6
CF          6         14

(a) Without scaling

            Control   CF
Control     13        7
CF          8         12

(b) With scaling

Figure 6.5: Fraction of variance explained by PCA

When scaling was performed classification accuracy in LOOCV was 0.63 (Table 6.3b). In a model using all subjects and where scaling was performed, the first 10 SSM parameters account for about 62% of overall variation (Figure 6.5). In this case the second SSM parameter is moderately correlated with age, lung volume, height,

Table 6.4: Pearson correlations between SSM parameters and clinical variables

            b1      b2     b3     b4     b5     b6     b7     b8     b9     b10
Age         -0.87*  0.0    0.04   0.2    0      -0     -0.1   -0.1   0.1    0
Lung Vol.   -0.91*  0.1    -0.1   -0.1   0.1    -0.2   -0.1   0.1    0      -0.1
Height      -0.93*  0      0.1    0.1    0.1    0      -0.1   0      0.1    0.1
Weight      -0.94*  0      0      0      0.1    -0.2   0      0      0      0.1
BSA         -0.95*  0      0      0.1    0.1    -0.1   0      0      0      0.1
Centroid    -1.00*  0      0      0      0      0      0      0      0      0
FEV1%       0.02    -0.24  0      -0.2   0.2    -0.1   0      0.1    -0.2   0.2

(a) Without scaling, significant correlations (p < 0.05) marked with *

            b1      b2     b3     b4     b5     b6     b7     b8     b9     b10
Age         -0.1    0.46*  -0.1   0.1    0      -0.2   0.4    -0.2   0.1    0.1
Lung Vol.   -0.2    0.57*  0      0.2    0.2    -0.2   0.3    -0.3   0.2    0
Height      -0.1    0.51*  -0.1   0.1    0      -0.2   0.4    -0.2   0.1    0.1
Weight      -0.1    0.50*  -0.1   0.1    0.1    -0.3   0.2    -0.2   0.2    0.1
BSA         -0.1    0.51*  -0.1   0.1    0.1    -0.2   0.3    -0.2   0.1    0.1
Centroid    -0.1    0.55*  0      0.1    0      -0.2   0.3    -0.2   0.2    0
FEV1%       0.2     0      0.1    0.2    0.2    0.2    -0.3   0.2    0.1    -0.2

(b) With scaling, significant correlations (p < 0.05) marked with *
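
As a worked check of the LOOCV figures quoted in the text, the counts in Table 6.3 can be converted to accuracy, sensitivity and specificity. The Python sketch below assumes that rows are the actual group and columns the predicted group (each row sums to 20, matching the group sizes) and that CF is treated as the positive class; the function name is introduced here for illustration.

```python
def loocv_metrics(confusion):
    """confusion = [[TN, FP], [FN, TP]], rows = actual (control, CF), columns = predicted."""
    (tn, fp), (fn, tp) = confusion
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "sensitivity": tp / (tp + fn),   # CF subjects correctly classified
        "specificity": tn / (tn + fp),   # control subjects correctly classified
    }

without_scaling = [[14, 6], [6, 14]]     # counts from Table 6.3a
with_scaling = [[13, 7], [8, 12]]        # counts from Table 6.3b

print(loocv_metrics(without_scaling))
print(loocv_metrics(with_scaling))
```

Under these assumptions the first matrix gives 0.70 for all three measures, and the second gives an accuracy of 0.625 (reported as 0.63) with sensitivity 0.60 and specificity 0.65.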

Figure 6.6: ROC curve.

Figure 6.7: PCA scores of Control and CF subjects

weight, BSA and model centroid size (Table 6.4b). Two CT studies in our study population were reconstructed with both soft and sharp kernels, enabling a comparison of kernel effects. Qualitative evaluation of the alignment of airway centerlines derived from the same CT acquisition but reconstructed with different kernels does not show appreciable differences. In addition, LOOCV was performed exchanging the sharp and soft kernel models for these two cases and classification accuracy was the same in both trials. Representative airway skeleton models based on disease status were created for each group (Figure 6.8). The airway tree shape model that is one standard deviation in b2, b4 and b5 toward the CF group is marked by a roughly 12-degree smaller angle at the carina between the right and left main branches.

Figure 6.8: Representative airway tree skeletons

6.4 Discussion

SSM methodology enables correlation of 3D anatomic shapes with relevant clinical variables. We have developed a computer model of large airway tree geometry using data derived from chest CT in children. This novel use of SSM methodology successfully discriminates two populations based on geometric data, as validated by simulation experiments. In the simulation results presented relatively subtle shape

Table 6.5: Logistic regression models

Parameter   Estimate   exp(Est.)   p-value   St. dev. of param.
Intercept   -0.25
b2          0.02       1.02        0.07      37.7
b4          -0.06      0.94        0.02      17.4
b5          0.05       1.05        0.04      16.9
AIC = 48.61

(a) Without scaling

Parameter   Estimate   exp(Est.)   p-value   St. dev. of param.
Intercept   -0.04
b1          -0.02      0.98        0.07      38.6
b2          0.02       1.02        0.18      24.9
b4          0.06       1.06        0.04      17.0
b5          0.04       1.05        0.08      16.8
AIC = 51.88

(b) With scaling
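
The odds statements in Section 6.3 follow from the coefficients in Table 6.5a: a shift of one standard deviation in a parameter multiplies the odds by exp(estimate x standard deviation). The Python sketch below reproduces that arithmetic from the rounded table values, so its results only approximate the figures quoted in the text; the function names are assumptions made for this illustration.

```python
import math

# Rounded values from Table 6.5a: parameter -> (estimate, st. dev. of parameter)
intercept = -0.25
model = {"b2": (0.02, 37.7), "b4": (-0.06, 17.4), "b5": (0.05, 16.9)}

def odds_ratio_per_sd(estimate, sd):
    """Odds multiplier for a one-standard-deviation shift toward the CF group."""
    return math.exp(abs(estimate) * sd)

def cf_probability(shifts_in_sd):
    """Logistic probability of CF for parameter shifts given in standard deviations."""
    logit = intercept + sum(model[name][0] * model[name][1] * z
                            for name, z in shifts_in_sd.items())
    return 1.0 / (1.0 + math.exp(-logit))

for name, (est, sd) in model.items():
    print(name, round(odds_ratio_per_sd(est, sd), 2))

# One-standard-deviation shifts toward CF: positive for b2 and b5, negative for b4.
print(round(cf_probability({"b2": 1, "b4": -1, "b5": 1}), 2))
```

With these rounded inputs the odds multipliers come out near 2, 3 and 2, and the combined probability is a little above 0.9, close to the 93% quoted in the text. The small gap is what one would expect from coefficients rounded to two decimals.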

differences were induced. In tests where larger transformation biases were applied, for example rotation bias larger than standard deviation, classification accuracy using logistic regression with variable selection based on AIC generally gave perfect results. Incorporation of estimated lumen size into the SSM showed good results in simulation experiments. This approach facilitates inclusion of estimated lumen size without dramatic increase in complexity since only one additional variable is required at each landmark. However, the different CT reconstructions in our study data made estimation of lumen size by inscribed spheres unreliable. Airway segmentation results with sharp kernel CT reconstructions tend to be slightly larger, thus biasing the model. Approximating lumen caliber with a sphere is also a limitation in that lumen size is likely underestimated when true cross-sections are ellipsoidal, since an inscribed sphere tends to fit the minor diameter.

LOOCV results provide further confidence that the algorithm can generalize to data not included in the training set and identify significant shape differences between groups. When scaling was performed in rigid body registration, LOOCV classification accuracy was not quite as good (0.63 versus 0.70). In addition, correlations between SSM parameters and variables associated with subject size are less strong. Scaling essentially removes size variations from the training set but is based on centroid size, which changes primarily with isotropic scaling but can also be affected by deformations. Calculating SSM without normalizing scale, and thus including centroid size variation in the model, gives better classification accuracy in LOOCV. This suggests that centroid size and scale contribute information that is useful in discriminating CF.

In models where scaling was not performed with rigid body registration, the largest mode of variation in the model strongly correlates with size variation in our population of growing children. This is not surprising since size variations were allowed to propagate through the SSM process. It is noteworthy that shape differences between the disease control and CF cohorts are associated with other modes of variation. In this model variation in the second, fourth and fifth shape modes is statistically correlated with the disease state of the subjects as CF or disease control, while the first mode of variation is strongly correlated with size related variables. The nature of PCA is that principal components are sorted by amount of variance and are linearly uncorrelated. In our population it would appear that variation in centroid size is responsible for the largest variation but this variation is not significantly associated with CF status.

Adjusting these SSM parameters by one standard deviation from the mean airway shape reveals geometry differences between the CF airway model and disease control. Specifically, the CF airway model is more likely to have a smaller angle at the carina. Differences between other angles and bronchi are also evident. It appears that the airways are angled downward and the overall shape is elongated in the superior-inferior direction. We demonstrate that the airway SSM describes variations in airway geometry with subject size, and identifies statistically significant shape differences, independent of size, between the CF and disease control populations.

Airflow in tree-like structures is known to be highly dependent on branching geometry. Numerical simulations show that airflow distribution in trees changes with even small variations in diameter, length and angular arrangement of branches [91]. It follows that airway morphology, including size and shape of individual branches, as well as overall tree structure, affects airflow in the lung. Clinical observation shows that airway morphology affects the severity of respiratory symptoms in conditions like asthma and CF [11], [95]. Further, airway morphology is likely an important factor in inhaled particle deposition and mucus clearance. Future studies of airway resistance and medication deposition using computational fluid dynamics methods are planned to help understand the impact of these observed airway geometry differences.

The effect of airway morphology on airflow has motivated the study of airway shape for decades. Classic studies by Weibel and Horsfield provide a rich background

for understanding airway branching and present elegant summaries of the complex airway tree structure [58], [137]. However, these studies were limited to a relatively small number of samples in-situ, especially regarding pediatric cases, and accurate analysis of these samples required difficult manual measurements. These studies provide some useful reference values, but are not sufficient to model normal variation in overall tree structure. More recently, 3D computer models that rely on repetition of fairly simple rules describing branching geometry to approximate airway tree shapes have been presented [71]. While such models can become quite complicated and may be useful for simulation, they are unlikely to capture normal, possibly subtle, variations. CT has been used to study both normal and CF airway tree anatomy, but prior studies have focused on lumped individual measurements such as lumen diameter, and have not studied complex interactions between lumen takeoff angles and relative branch length [39], [94], [111].

Our hypothesis that airway tree morphology is different in patients with CF was influenced by studies suggesting that structural airway changes in CF begin very early in life [81], [93]. Meyerholz et al. reported that in the CF pig model, tracheal lumen shape is less circular than in disease control [93] and Long et al. concluded that airways in infants and young children with CF have thicker walls and are more dilated than normal [81]. Our approach used SSM methods to evaluate overall airway tree shape, in terms of relative branch length and branching angles, rather than luminal shape differences. SSM is increasing in popularity for technical tasks like model-based segmentation, as well as clinical studies exploring relationships between anatomical shape and other metrics [8], [76]. Whereas traditional morphologic measures focus on individual dimensions such as areas, angles and lengths, SSM utilizes coordinates of homologous landmarks to enable multivariate analysis of overall shape. This study provides encouraging results that an SSM approach can be applied to tree shapes. In addition, SSM can identify significant shape differences between populations.

A significant challenge of SSM is the need to establish sets of homologous landmarks across models of different individuals. It can be difficult to determine point correspondence in anatomic shapes, particularly in regions that are very uniform, symmetric or otherwise lack discrete, unique landmarks. Various methods have been proposed for automatic estimation of correspondence [38], [130] in 3D models, but airway tree shapes present particular challenges because variation in both topology and geometry may be present. In this study we made use of anatomic names assigned to each airway branch, and manually verified name consistency to establish correspondence. Bifurcation points in the skeletonized airway tree provide discrete landmarks and identify separate branches. Interpolation along spline functions fit to skeleton vertices between branch endpoints yields a fixed number of pseudo-landmarks associated with named branches in each subject. The result is a simplified representation of tree geometry but one that captures essential features and is suitable for the SSM process. Models that more completely describe airway lumen caliber and 3D shape would require a dramatic increase in number of vertices, and determination of landmark correspondence would be difficult. We have experimented with methods to incorporate lumen size efficiently, for example by including the radius of an inscribed sphere at each pseudo-landmark. Preliminary results are encouraging, but variations in segmentation results due to CT reconstruction kernel limit reliability. Further studies are warranted to evaluate the size differences in CF airway caliber. Published data is inconsistent regarding the lumen size of CF airways. Meyerholz et al. found that tracheal size was smaller in the CF pig than in control, but there was no difference between their human infant CT data [93]. Long et al. found that distal airways are more dilated in infants with CF than in controls [81].

Because CTs used in this study were obtained for clinical purposes, some limitations exist. CT must be used judiciously, especially in children, considering the risks associated with exposure to ionizing radiation. These concerns make recruitment of


true normal subjects for research CT impractical. As a disease control group, we chose a population of patients under evaluation for bone marrow transplant who had been imaged using a high-resolution volumetric CT protocol. Although the patients had a significant disease, workup for bone marrow transplant included robust pulmonary assessment, including specialized history, physical, and lung function testing that showed no measurable respiratory disease.

Limitations related to technical parameters of the CT studies also exist. All CT image series were acquired on the same CT scanner, but the clinical protocol for the CT of bone marrow transplant candidates used a soft reconstruction kernel, while the CF protocol specified reconstruction with a sharp kernel. Choice of CT reconstruction kernel is based primarily on clinical application, with sharp kernel reconstructions preferred in situations where sharper display of lung anatomy is useful, despite a concomitant increase in image noise. Soft kernel image reconstructions have less noise but appear slightly blurred compared to sharp kernel reconstructions. As expected, this difference in imaging characteristics affected airway segmentation somewhat but, since the effect is uniform, did not introduce distortions in skeleton shape. Apollo®-generated airway segmentations from sharp kernel CT images tend to be slightly larger than those from soft kernel studies, which is not surprising considering the difference in image edge characteristics between the kernels. This introduces bias in lumen estimates arrived at by finding an inscribed sphere within the airway segmentation. In order to avoid incorporating this bias into SSM analysis of subject data, only airway skeleton data was used. Moreover, the primary results indicated by our SSM formulation are differences in angles and lengths, which would not be expected to change with reconstruction kernel. In the cases where CT studies reconstructed with both kernels were available, overlay of airway skeletons showed no appreciable differences.

The clinical significance of the shape differences between the disease control and CF airways is not clear. Our CF cohort was quite heterogeneous, including patients


with severe lung disease and those without airflow limitation based on spirometry. Beyond parameter b1, which accounted for patient size, other airway variations were not statistically correlated with patient age, size, or lung function within the CF or disease control groups. We speculate that airway shape may change with disease severity in patients with CF, as their lungs become more gas trapped and airways more bronchiectatic. The small number of patients with successful spirometric measures and the heterogeneity of our group decreased our power to detect changes within the CF cohort alone. Further study with either a larger or a more homogeneous CF cohort is warranted to explore these airway shape variations.

6.5 Conclusions

The SSM approach provides an efficient framework for modeling and quantitative analysis of anatomic shape and can be applied to the pediatric airway tree. Statistical validation of model parameters identified modes of shape variation that are significantly associated with body size and, separately, with the diagnosis of CF. Simulation tests demonstrate the viability of the approach. Cross-validation experiments support the conclusion that the method is able to classify CF status with good accuracy. Future work includes larger, more controlled studies for validation and study of the relationship between airway morphology and symptom severity. Possible extensions of this work include use of SSM results as input geometry for computational simulations of flow and particle deposition.


7. Conclusions

Medical imaging provides powerful tools for in vivo sampling of internal anatomy to evaluate illness and injury. CT is the preferred modality for imaging the lungs, because it reveals detailed structural information that is invaluable in the diagnosis and treatment of patients with pulmonary symptoms.

For the most part, medical imaging devices, procedures and software have been designed for visual assessment. However, interpreting the information captured in medical images is a challenging task that requires highly trained experts. Imaging studies contain huge volumes of data and depict subtle details that may be clinically important. Results are generally communicated in qualitative and imprecise terms. Efficiency, repeatability and inter-observer variation are major challenges with strictly visual assessment.

There is significant interest in quantitative analysis of medical images. More precise and objective analysis of images would provide benefit in clinical and research applications. Methods that have become popular in the era of "big data", including computer vision and machine learning, are important tools to capitalize on the information available in CT images. This work has applied a few of these concepts to two problems related to the evaluation of the lungs on CT. The first was development of a method to distinguish diffuse patterns in the lung parenchyma, specifically a usual interstitial pneumonia pattern in patients with idiopathic pulmonary fibrosis. The method is based on a novel application of unsupervised feature learning to develop a classifier capable of detecting regions of fibrosis on lung CT. I demonstrated the technical efficacy of this algorithm and explored relationships between automatically measured extent of lung fibrosis and results of visual assessment and physiologic tests.

The second application focused on analysis of anatomic models derived from CT. I applied a statistical shape modeling technique to analyze the upper airway tree in subjects with and without cystic fibrosis. This analysis revealed a significant shape


difference between a group of twenty subjects with cystic fibrosis and a group of twenty subjects without measurable lung disease.


APPENDIX A. Spatial Information in ROIs

Much of this work has relied on extraction of square regions of interest (ROIs) from images. As has been described, features derived from these square regions can be useful to discriminate texture patterns. However, this approach does not consider relative spatial position of the ROIs, which may provide useful additional information. For example, American Thoracic Society diagnostic guidelines state that key features of IPF on CT have certain important characteristics in terms of spatial distribution. A usual interstitial pneumonia (UIP) pattern is stated to have peripheral or subpleural and basal predominance. With this in mind, it is likely that incorporation of additional information regarding ROI position within the lungs would be useful.

In practice this is a challenge. The size and shape of lungs vary significantly between subjects, so establishing a general lung coordinate system would be difficult. Still, even gross anatomic position (peripheral versus central, superior versus inferior, etc.) is probably meaningful.

Metadata stored with images in DICOM headers indicates patient position within the CT scanner, but this is of limited use. More important is the relative position of an ROI with respect to other key anatomy. Lung segmentation masks, which have been relied upon for much of this work, can help in this regard. For example, knowing that a UIP pattern tends to have subpleural predominance, we might expect that quantitative features that capture how close an ROI is to the lung periphery could be useful. With lung segmentation masks available, even if the input is just a single image slice, it is possible to quantify proximity to an edge. Applying a Euclidean distance transform to a binary segmentation mask finds the distance to the nearest background element for each pixel within the mask. This is a relatively simple, effective way to quantify whether a given pixel within the mask is more central or


peripheral. For example, Figure A.1 shows a binary segmentation mask and its Euclidean distance transform. The distance to the closest background pixel (i.e. a pixel outside the binary mask) is displayed in color. Note that more central regions are shown in red, while peripheral regions are more toward the blue end of the color map. This measurement provides a simple value to indicate how close an individual pixel or ROI is to the edge of the binary mask. Incorporation of this value in an ROI's feature vector may be useful in discriminating subpleural from central regions.

Figure A.1: Lung segmentation (left) and its Euclidean distance transform (right), which provides information to differentiate peripheral from central regions.

When volumetric CT and lung segmentations are available it may be possible to compute the 3D position of an ROI in a similar manner. However, one goal of this work has been to construct versatile algorithms that do not necessarily require volumetric scanning. As such, a method to estimate the cranio-caudal location of a single axial CT image may be useful. I hypothesize that a machine learning classifier can be trained using features from lung segmentation masks to estimate the location of a single CT image through the lungs. In other words, a classifier may be able to estimate where within the lung an axial image lies, based on its lung segmentation mask. Figure A.2 shows lung segmentation masks at different cranio-caudal locations in the lung. The distinctive shapes at each location give the impression that this information could be used to estimate the axial position of a new slice.
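The distance-to-edge feature described above can be sketched as follows. This is a brute-force, pure-Python illustration on a tiny hypothetical mask, assuming the mask contains at least one background pixel; the function names are invented for this example. For real images a library routine such as `scipy.ndimage.distance_transform_edt` computes the same quantity far more efficiently.

```python
import math


def distance_to_background(mask):
    """Euclidean distance from each foreground pixel (1) to the nearest
    background pixel (0). Background pixels get distance 0.0.
    Brute force: only suitable for small illustrative masks."""
    rows, cols = len(mask), len(mask[0])
    background = [(r, c) for r in range(rows) for c in range(cols)
                  if not mask[r][c]]
    dist = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                dist[r][c] = min(math.hypot(r - br, c - bc)
                                 for br, bc in background)
    return dist


def roi_mean_distance(dist, top, left, size):
    """Mean distance-to-edge over a square ROI; larger means more central."""
    values = [dist[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    return sum(values) / len(values)


# Hypothetical 7x7 mask: a 5x5 foreground block surrounded by background.
mask = [[1 if 0 < r < 6 and 0 < c < 6 else 0 for c in range(7)]
        for r in range(7)]
dist = distance_to_background(mask)
central = roi_mean_distance(dist, 2, 2, 3)      # ROI around the mask center
peripheral = roi_mean_distance(dist, 1, 1, 3)   # ROI touching the mask corner
```

The mean distance over an ROI is one possible scalar to append to that ROI's feature vector; other summaries (minimum, or the value at the ROI center) would serve the same purpose.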


Figure A.2: From left to right, lung segmentation masks at lower, middle and upper thirds of the lung.

To test this hypothesis, volumetric CT and associated lung masks for a collection of 100 subjects were selected. For each subject, every axial lung segmentation mask was processed to compute features of the binary image. Typical binary image features were used, including the Hu set of invariant moments, plus solidity, eccentricity and inertia tensors. A total of 14 features were calculated for each axial lung segmentation image.

The cranio-caudal location of each slice was expressed on a scale of 0.0 to 1.0, where the most inferior axial image containing lung segmentation was 0.0 and the most superior 1.0. A Random Forest (RF) regressor was trained using binary image moments as predictors and the location variable as output.

Image moment features and slice location, similarly encoded on a scale of 0.0-1.0, were computed for a separate set of 50 cases. The RF regressor was able to predict slice location with very good accuracy (mean squared error < 1.0%, R^2 = 0.91).

Discretizing regression output into the categories lower, middle and upper third of lung shows high classification accuracy. The confusion matrix in Table A.1 shows classification results for 1148 image slices drawn from 50 test subjects. Classification accuracy is nearly 90%.

These proof-of-concept results are encouraging and support the hypothesis that cranio-caudal location of an axial CT image can be estimated using image moments of binary lung segmentation masks. This may be a useful tool to incorporate spatial


information even for a single image.

Figure A.3: Predicted versus actual axial slice location, where cranio-caudal location is expressed on the scale 0.0-1.0 and 0.0 is most basal.

Table A.1: Slice location classification results

              Lower 3rd   Middle 3rd   Upper 3rd
Lower 3rd        354          20            0
Middle 3rd        37         326           24
Upper 3rd          5          57          325
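As a worked check of the accuracy quoted for Table A.1, the sketch below discretizes a normalized slice location into thirds and computes overall accuracy from a confusion matrix. The cut points at 1/3 and 2/3 are an assumption about how the thirds were defined, and the cell values reflect one reading of the digit grouping in the extracted table (chosen because the cells sum to the stated 1148 slices); neither is confirmed by the text.

```python
def location_to_third(location):
    """Map a normalized cranio-caudal location (0.0 = most inferior,
    1.0 = most superior) to a category. Cut points are assumed."""
    if location < 1 / 3:
        return "lower"
    if location < 2 / 3:
        return "middle"
    return "upper"


def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Cell values as read from Table A.1 (digit grouping is an assumption).
table_a1 = [
    [354, 20, 0],
    [37, 326, 24],
    [5, 57, 325],
]
n_slices = sum(sum(row) for row in table_a1)
accuracy = overall_accuracy(table_a1)   # 1005 correct of 1148 slices
```

Under this reading, 1005 of 1148 slices fall on the diagonal, an accuracy of about 87.5%, and the misclassifications are almost all between adjacent thirds.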


APPENDIX B. Comparison with Standard Filters

In this work I used Unsupervised Feature Learning to derive image texture features from representative images. These low-level features look similar to image filters produced by mathematical functions that are sometimes used in image processing. However, standard computed filters, for example Gabor filters, which are Gaussian functions modulated by a sinusoid, are more symmetric than the features learned from lung CT (see Figure 4.2). Others have demonstrated that "off the shelf" filters are effective for texture classification [131]. The unsupervised feature learning process seems to distill low-level features that are highly specific to lung CT. My hypothesis is that these learned features provide better classification results than manually designed features like Gabor-type filters.

To test this hypothesis I swapped the Root Filter Set (RFS) [131] in place of the dictionary learned by K-means as described in Section 4.2. To do this I simply exchanged a set of 38 RFS filters (see Figure B.1) for the learned dictionary and omitted the pixel normalization and whitening pre-processing. This made it possible to use the same code framework to test classification accuracy of the "off the shelf" RFS filters instead of the learned dictionary in 10-fold cross-validation. Table B.1 shows F1 scores by category from 10-fold cross-validation with an SVM trained to distinguish 5 ROI categories. The Unsupervised Feature Learning approach shows better results overall, especially for the categories that are more difficult to distinguish.
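To illustrate why analytically defined filters are so symmetric, here is a minimal sketch of a Gabor-type kernel: an isotropic Gaussian envelope multiplied by a cosine carrier. The parameterization (odd kernel size, a single `sigma`, a cosine phase) is an assumption chosen for brevity and is not the RFS filter bank used in the experiment.

```python
import math


def gabor_kernel(size, sigma, wavelength, theta):
    """Even-symmetric Gabor kernel as a list of rows.

    size: odd kernel width/height in pixels (assumed odd).
    sigma: standard deviation of the Gaussian envelope.
    wavelength: period of the cosine carrier in pixels.
    theta: carrier orientation in radians.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the carrier direction.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * x_rot / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=9, sigma=2.0, wavelength=4.0, theta=0.0)
```

Because both the envelope and the cosine carrier are even functions, the kernel is unchanged by a 180-degree rotation about its center, a regularity that dictionary elements learned from image patches need not have.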


Figure B.1: Set of 38 RFS filters.

Table B.1: F1 scores from 10-fold cross-validation comparing RFS filter results with UFL

        Normal   BV     RA     HC     TB     Avg.
UFL     0.97     0.62   0.91   0.82   0.52   0.77
RFS     0.96     0.26   0.90   0.78   0.40   0.66


APPENDIX C. Semi-Supervised Learning to Separate Mixed Regions

When building machine learning systems it is often true that using more data for training provides better results than engineering cleverer algorithms. However, labeling data for supervised training is expensive in that it requires tedious manual effort by domain experts (in this case radiologists). If patterns of interest are heterogeneous or otherwise difficult to delineate (as is the case with usual interstitial pneumonia on CT), segmentation of exemplar regions to develop a training set is all the more challenging. This may lead to noisy labels that can compromise classifier training. It would be useful to develop more automatic strategies that can boost the number and purity of labeled data available for training. Others have demonstrated that classifiers trained with a small number of (or even a single) positive exemplar can be used to identify similar patterns in unlabeled data [87].

Semi-supervised learning is an approach where unlabeled or weakly labeled data is used for training. The idea is that including additional relevant data, even if it is not strongly labeled, can improve the results of training. In practice, an initial classifier, trained to recognize a pattern of interest with a relatively small amount of labeled data, is used to select more examples of this pattern from a pool of weakly labeled or unlabeled data.

When gathering labeled free-form ROIs for training (see Section 3.2), the radiologist reported that there were some cases where it was difficult to delineate precise boundaries between regions with different types of UIP features. Specifically, he stated that it was not feasible to draw reliable boundaries separating regions of traction bronchiectasis (TB) from other types of fibrosis. So, we added two new categories to accommodate these mixed regions. These new category labels were TB with reticular abnormality (TBwRA) and TB with honeycombing (TBwHC). Free-form ROIs


with these labels contained mostly RA or HC, but with some TB (dilated and inflamed airways) mixed in. Mid-level ROIs sampled from these regions with mixed characteristics are difficult to use because we know some will contain just RA or TB and others will be a mix. Using these to build a classifier that can distinguish RA from TB would be difficult. In an effort to increase the quantity of labeled data and the accuracy of its labels, I experimented with a semi-supervised strategy to separate TB from RA and HC in these mixed ROIs.

Figure C.1: Mixed RA with TB (left) and mixed HC with TB (right).

Briefly, the first experiment began by training a binary SVM (using UFL features as described in Section 4.2) with TB ROIs defining the positive category and RA and HC ROIs together as the negative category. The ROIs used here were all specifically labeled and not sampled from free-form ROIs with mixed labels. Only a relatively small number of specifically labeled TB ROIs were available. This classifier was then applied to mid-level ROIs sampled from mixed (TBwRA and TBwHC) free-form ROIs to identify those that appeared to be TB.

Figure C.2 shows the general pattern in manually labeled TB ROIs. Figure C.3 shows examples of ROIs originally sampled from free-form ROIs with mixed labels (TBwRA and TBwHC, see Figure C.1). Qualitatively, the appearance of the ROIs separated from mixed categories matches well with those that had been manually labeled, which is encouraging.
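The selection step described above can be sketched roughly as follows. This assumes scikit-learn and NumPy are available and uses synthetic 8-dimensional vectors in place of the UFL features; the linear kernel, the `threshold` parameter and the function name are illustrative assumptions rather than the settings used in the experiment.

```python
import numpy as np
from sklearn.svm import SVC


def select_positive_candidates(classifier, pool_features, threshold=0.0):
    """Indices of pool ROIs whose SVM decision value is at least threshold.
    Raising the threshold keeps only ROIs further from the boundary."""
    scores = classifier.decision_function(pool_features)
    return np.flatnonzero(scores >= threshold)


rng = np.random.default_rng(0)

# Hypothetical stand-ins for feature vectors of specifically labeled ROIs:
# a small TB set (positive) and a larger RA + HC set (negative).
tb_features = rng.normal(loc=2.0, scale=0.5, size=(15, 8))
other_features = rng.normal(loc=-2.0, scale=0.5, size=(60, 8))
train_x = np.vstack([tb_features, other_features])
train_y = np.r_[np.ones(15), np.zeros(60)]

classifier = SVC(kernel="linear", C=1.0).fit(train_x, train_y)

# Pool of ROIs sampled from mixed regions: the first 20 resemble TB,
# the remaining 80 resemble the other patterns.
pool = np.vstack([
    rng.normal(loc=2.0, scale=0.5, size=(20, 8)),
    rng.normal(loc=-2.0, scale=0.5, size=(80, 8)),
])
selected = select_positive_candidates(classifier, pool)
```

The selected ROIs could then be reviewed and added to the training set. On real data the classes overlap far more than in this toy example, so a stricter threshold and visual confirmation of the selected ROIs would be prudent.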


Figure C.2: Manually labeled TB ROIs used for training.

Figure C.3: TB ROIs separated from free-form ROIs with mixed patterns.


Figure C.4: RA ROIs separated from free-form ROIs with mixed patterns.

Similar experiments were performed to separate HC and RA ROIs from mixed-pattern free-form ROIs such as those shown in Figure C.1. Expecting that these mixed ROIs will contain three dominant categories (RA, HC and TB), it was possible to train binary classifiers with one of the three patterns as the positive category and the other two as the negative.

Figure C.4 shows RA ROIs separated from the free-form ROIs labeled as TBwRA. Qualitatively these appear to match the characteristic pattern of RA well. Similarly, Figure C.5 shows HC ROIs separated from the free-form ROIs labeled as TBwHC.

The purpose of these preliminary experiments was to explore the feasibility of using a semi-supervised approach to identify ROIs that could be used to accrue larger training sets without requiring specific segmentation by radiologists. The particular data used here (regions identified as mixed fibrosis, i.e. TB and RA) provided a simple test pool for proof-of-concept testing. It would be useful to use a similar approach to


Figure C.5: HC ROIs separated from free-form ROIs with mixed patterns.

extract ROIs from image sets labeled only at the subject level. While the mixed free-form ROIs can be considered weakly labeled, a better example is volumetric CT scans labeled at the study level. In this case, manual segmentation would not be necessary; a global diagnosis or report like "normal, no apparent fibrosis" or "definite UIP" could be used instead. As an experiment, I tried to extract mid-level ROIs containing bronchovascular (BV) structures from control CT studies of subjects without lung fibrosis. The approach began by training a binary SVM classifier with manually labeled BV ROIs as positive examples and a large collection of ROIs with other labels (anything not BV) as negative examples. The result is a classifier that can serve as a BV detector. ROIs sampled from within lung segmentations of volumetric CT of control subjects were classified with this binary SVM to identify those that likely contain BV structures. This target pattern was chosen because BV structures represent a relatively small fraction of total lung volume. Figure C.6 shows example


results from this test. Qualitatively, these ROIs appear to contain BV structures.

Figure C.6: BV ROIs identified in volumetric CT of control subjects.

The intent of this preliminary work was to explore the feasibility of a semi-supervised approach to improve efficiency in gathering training sets of ROIs. These results are encouraging and suggest that training ROIs may be identified in a semi-automatic fashion, where initial specific segmentations of target categories identified manually could be used as seed data. Classifiers trained with these initial exemplars could then be used to identify matching regions in weakly labeled image sets. Future work will explore the utility of this approach.


REFERENCES

[1] Respiratory diseases in the world: Realities of today, opportunities for tomorrow, an advocacy statement of the Forum of International Respiratory Societies (FIRS). http://www.thoracic.org/newsroom/firs.php. Accessed: 2014-11-12.
[2] RSNA Quantitative Imaging Biomarkers Alliance. https://www.rsna.org/QIBA.aspx. Accessed: 2014-10-12.
[3] Frost & Sullivan: U.S. medical imaging informatics industry reconnects with growth in the enterprise image archiving market. http://www.frost.com/prod/servlet/press-release.pag?docid=268728701, 2012. Accessed: 2014-10-14.
[4] Lung anatomy. http://www.drcarolesilva.com/lung-size/, 2012. Accessed: 2014-12-01.
[5] Two FDA drug approvals for idiopathic pulmonary fibrosis. http://blogs.fda.gov/fdavoice/index.php/2014/10/two-fda-drug-approvals-for-idiopathic-pulmonary-fibrosis-ipf/, October 2014. Accessed: 2014-11-29.
[6] H. J. Aerts, E. R. Velazquez, R. T. Leijenaar, C. Parmar, P. Grossmann, S. Cavalho, J. Bussink, R. Monshouwer, B. Haibe-Kains, D. Rietveld, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun, 5, 2014.
[7] A. Agresti. Categorical data analysis. Wiley Series in Probability and Statistics. Wiley-Interscience, 2nd edition, 2002.
[8] S. Ardekani, R. G. Weiss, A. C. Lardo, R. T. George, J. A. Lima, K. C. Wu, M. I. Miller, R. L. Winslow, and L. Younes. Computational method for identifying and quantifying shape features of human left ventricular remodeling. Ann Biomed Eng, 37:1043-1054, 2009.
[9] J. K. Aronson. Biomarkers and surrogate endpoints. Br J Clin Pharmacol, 59(5):491-494, 2005.
[10] K. S. Arun, T. S. Huang, and S. D. Blostein. Least-squares fitting of two 3D point sets. IEEE Trans Pattern Anal Mach Intell, 9:698-700, 1987.


[11] R. S. Aysola, E. A. Hoffman, D. Gierada, S. Wenzel, J. Cook-Granroth, J. Tarsi, J. Zheng, K. B. Schechtman, T. P. Ramkumar, R. Cochran, E. Xueping, C. Christie, J. Newell, S. Fain, T. A. Altes, and M. Castro. Airway remodeling measured by multidetector CT is increased in severe asthma and correlates with pathology. Chest, 134:1183-1191, 2008.
[12] B. J. Bartholmai, S. Raghunath, R. A. Karwoski, T. Moua, S. Rajagopalan, F. Maldonado, P. A. Decker, and R. A. Robb. Quantitative computed tomography imaging of interstitial lung diseases. J Thorac Imaging, 28(5):298-307, 2013.
[13] B. G. Becker, F. A. Cosío, M. E. G. Huerta, and J. A. Benavides-Serralde. Automatic segmentation of the cerebellum of fetuses on 3D ultrasound images, using a 3D point distribution model. In Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE, pages 4731-4734. IEEE, 2010.
[14] J. Behr, M. Demedts, R. Buhl, U. Costabel, R. P. Dekhuijzen, H. M. Jansen, W. MacNee, M. Thomeer, B. Wallaert, F. Laurent, et al. Lung function in idiopathic pulmonary fibrosis - extended analyses of the IFIGENIA trial. Respir Res, 10(101.2009), 2009.
[15] A. C. Best, A. M. Lynch, C. M. Bozic, D. Miller, G. K. Grunwald, and D. A. Lynch. Quantitative CT indexes in idiopathic pulmonary fibrosis: Relationship with physiologic impairment. Radiology, 228(2):407-414, 2003.
[16] A. C. Best, J. Meng, A. M. Lynch, C. M. Bozic, D. Miller, G. K. Grunwald, and D. A. Lynch. Idiopathic pulmonary fibrosis: Physiologic tests, quantitative CT indexes, and CT visual scores as predictors of mortality. Radiology, 246(3):935-940, 2008.
[17] H. F. Boehm, C. Fink, U. Attenberger, C. Becker, J. Behr, and M. Reiser. Automated classification of normal and pathologic pulmonary tissue by topological texture features extracted from multi-detector CT in 3D. Eur Radiol, 18(12):2745-2755, 2008.
[18] P. Bokov, B. Mauroy, M. P. Revel, P. A. Brun, C. Peiffer, C. Daniel, M. M. Nay, B. Mahut, and C. Delclaux. Lumen areas and homothety factor influence airway resistance in COPD. Respir Physiol Neurobiol, 173(1):1-10, 2010.
[19] A. T. Borchers, C. Chang, C. L. Keen, and M. E. Gershwin. Idiopathic pulmonary fibrosis - an epidemiological and pathological review. Clin Rev Allergy Immunol, 40(2):117-134, 2011.
[20] Z. I. Botev, J. F. Grotowski, D. P. Kroese, et al. Kernel density estimation via diffusion. Ann Stat, 38(5):2916-2957, 2010.
[21] L. Breiman. Random forests. Machine Learning, 45(1):5-32, 2001.


[22] A. J. Buckler, L. Bresolin, N. R. Dunnick, and D. C. Sullivan. Quantitative imaging test approval and biomarker qualification: interrelated but distinct activities. Radiology, 259(3):875-884, 2011.
[23] A. A. Bui and R. K. Taira. Medical imaging informatics. Springer, 2009.
[24] J. T. Bushberg. The essential physics of medical imaging. Lippincott Williams & Wilkins, 2002.
[25] R. Caruana and A. Niculescu-Mizil. An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd international conference on machine learning, pages 161-168. ACM, 2006.
[26] N. I. Chaudhary, G. J. Roth, F. Hilberg, J. Muller-Quernheim, A. Prasse, G. Zissel, A. Schnapp, and J. E. Park. Inhibition of PDGF, VEGF and FGF signalling attenuates fibrosis. Eur Respir J, 29(5):976-985, 2007.
[27] J. H. Chung, A. Chawla, A. L. Peljto, C. Cool, S. D. Groshong, J. L. Talbert, D. McKean, K. K. Brown, T. E. Fingerlin, M. I. Schwarz, et al. CT findings of probable UIP have a high predictive value for histologic UIP. Chest, page to appear.
[28] G. Cicchitto and C. M. Sanguinetti. Idiopathic pulmonary fibrosis: the need for early diagnosis. Multidiscip Respir Med, 8(1):53, 2013.
[29] A. Coates, B. Carpenter, C. Case, S. Satheesh, B. Suresh, T. Wang, D. J. Wu, and A. Y. Ng. Text detection and character recognition in scene images with unsupervised feature learning. In Document Analysis and Recognition (ICDAR), 2011 International Conference on, pages 440-445. IEEE, 2011.
[30] A. Coates and A. Y. Ng. The importance of encoding versus training with sparse coding and vector quantization. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 921-928, 2011.
[31] A. Coates and A. Y. Ng. Learning feature representations with k-means. In Neural Networks: Tricks of the Trade, pages 561-580. Springer, 2012.
[32] A. Coates, A. Y. Ng, and H. Lee. An analysis of single-layer networks in unsupervised feature learning. In International Conference on Artificial Intelligence and Statistics, pages 215-223, 2011.
[33] G. J. Cook, M. Siddique, B. P. Taylor, C. Yip, S. Chicklore, and V. Goh. Radiomics in PET: principles and applications. Clin Transl Imaging, 2(3):269-276, 2014.
[34] T. F. Cootes, G. Edwards, and C. J. Taylor. Active appearance models. IEEE Trans Pattern Anal Mach Intell, 23:681-685, 2001.


[35] H. O. Coxson. Quantitative computed tomography assessment of airway wall dimensions: Current status and potential applications for phenotyping chronic obstructive pulmonary disease. Proc Am Thorac Soc, 5:940-945, 2008.
[36] A. Criminisi and J. Shotton. Decision forests for computer vision and medical image analysis. Springer, 2013.
[37] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, volume 1, pages 886-893. IEEE, 2005.
[38] R. H. Davies, C. J. Twining, T. F. Cootes, and C. J. Taylor. Building 3D statistical shape models by direct optimization. IEEE Trans Med Imaging, 29:961-981, 2010.
[39] P. A. de Jong, F. R. Long, J. C. Wong, P. J. Merkus, H. A. Tiddens, J. C. Hogg, and H. O. Coxson. Computed tomographic estimation of lung dimensions throughout the growth period. Eur Respir J, 27:261-267, 2006.
[40] P. A. de Jong, Y. Nakano, W. C. Hop, F. R. Long, H. O. Coxson, P. D. Pare, and H. A. Tiddens. Changes in airway dimensions on computed tomography scans of children with cystic fibrosis. Am J Respir Crit Care Med, 172:218-224, 2005.
[41] E. M. DeBoer, W. Swiercz, S. L. Heltshe, M. M. Anthony, P. Szefler, R. Klein, J. Strain, A. S. Brody, and S. D. Sagel. Automated CT scan scores of bronchiectasis and air trapping in cystic fibrosis. Chest, 145:593-603, 2014.
[42] K. Doi. Current status and future potential of computer-aided diagnosis in medical imaging. Br J Radiol, 78:3-19, 2005.
[43] P. Domingos. A few useful things to know about machine learning. Commun ACM, 55(10):78-87, 2012.
[44] D. DuBois and E. F. DuBois. Clinical calorimetry: tenth paper a formula to estimate the approximate surface area if height and weight be known. Arch Intern Med, 17(6 2):863-871, 1916.
[45] R. M. du Bois, S. D. Nathan, L. Richeldi, M. I. Schwarz, and P. W. Noble. Idiopathic pulmonary fibrosis: Lung function is a clinically meaningful endpoint for phase III trials. Am J Respir Crit Care Med, 186(8):712-715, 2012.
[46] T. E. Fingerlin, E. Murphy, W. Zhang, A. L. Peljto, K. K. Brown, M. P. Steele, J. E. Loyd, G. P. Cosgrove, D. Lynch, S. Groshong, et al. Genome-wide association study identifies multiple susceptibility loci for pulmonary fibrosis. Nat Genet, 45(6):613-620, 2013.
[47] S. C. FitzSimmons. The changing epidemiology of cystic fibrosis. J Pediatr, 122:1-9, 1993.


[48] M. M. Galloway. Texture analysis using gray level run lengths. Computer Graphics and Image Processing, 4(2):172-179, 1975.
[49] K. Ganesan, U. R. Acharya, C. K. Chua, L. C. Min, K. T. Abraham, and K. Ng. Computer-aided breast cancer detection using mammograms: a review. IEEE Rev Biomed Eng, 6:77-98, 2013.
[50] S. B. Ginsburg, D. A. Lynch, R. P. Bowler, and J. D. Schroeder. Automated texture-based quantification of centrilobular nodularity and centrilobular emphysema in chest CT images. Acad Radiol, 19(10):1241-1251, 2012.
[51] J. G. Goldin. Quantitative CT of the lung. Radiol Clin North Am, 40(1):145-162, 2002.
[52] L. W. Goldman. Principles of CT: Multislice CT. J Nucl Med Technol, 36(2):57-68, May 2008.
[53] B. J. Hafner, S. G. Zachariah, and J. E. Sanders. Characterisation of three-dimensional anatomic shapes using principal components: Application to the proximal tibia. Med Biol Eng Comput, 38:9-16, 2000.
[54] R. M. Haralick. Statistical and structural approaches to texture. Proc IEEE, 67(5):786-804, 1979.
[55] T. Hastie, R. Tibshirani, and J. H. Friedman. The elements of statistical learning: Data mining, inference, and prediction. New York, NY: Springer, 2009.
[56] T. Heimann and H. P. Meinzer. Statistical shape models for 3D medical image segmentation: A review. Med Image Anal, 13:543-563, 2009.
[57] J. W. Hole and K. A. Koos. Human anatomy. Wm. C. Brown Publishers, 1994.
[58] K. Horsfield. Diameters, generations, and orders of branches in the bronchial tree. J Appl Physiol, 68:457-461, 1985.
[59] K. Horsfield, F. G. Relea, and G. Gumming. Diameter, length and branching ratios in the bronchial tree. Respir Physiol, 26(3):351-356, 1976.
[60] Robert Lee Hotz. Here's an omical tale: Scientists discover spreading suffix. Wall Street Journal, Aug 13 2012.
[61] G. M. Hunninghake. A new hope for idiopathic pulmonary fibrosis. N Engl J Med, 2014.
[62] J. P. Hutchinson, T. M. McKeever, A. W. Fogarty, V. Navaratnam, and R. B. Hubbard. Increasing global mortality from idiopathic pulmonary fibrosis in the 21st century. Ann Am Thorac Soc, (ja), 2014.
[63] IPFnet Investigators et al. The IPFnet strategy. Am J Respir Crit Care Med, 181:527-533, 2010.


[64] G. James, D. Witten, T. Hastie, and R. Tibshirani. An Introduction to statistical learning: with applications in R. Springer Texts in Statistics. Springer New York, 2013.
[65] Y. J. Jeong, K. S. Lee, N. L. Muller, M. P. Chung, M. J. Chung, J. Han, T. V. Colby, and S. Kim. Usual interstitial pneumonia and non-specific interstitial pneumonia: serial thin-section CT findings correlated with pulmonary function. Korean J Radiol, 6(3):143-152, 2005.
[66] A. E. Johnson and M. Hebert. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans Pattern Anal Mach Intell, 21(5):433-449, 1999.
[67] A. Kassner and R. E. Thornhill. Texture analysis: A review of neurologic MR imaging applications. AJNR Am J Neuroradiol, 31(5):809-816, 2010.
[68] H. J. Kim, M. S. Brown, D. Chong, D. W. Gjertson, P. Lu, H. J. Kim, H. Coy, and J. G. Goldin. Comparison of the quantitative CT imaging biomarkers of idiopathic pulmonary fibrosis at baseline and early change with an interval of 7 months. Acad Radiol, 2014.
[69] H. J. Kim, D. P. Tashkin, P. Clements, G. Li, M. S. Brown, R. Elashoff, D. W. Gjertson, F. Abtin, D. A. Lynch, D. C. Strollo, et al. A computer-aided diagnosis system for quantitative scoring of extent of lung fibrosis in scleroderma patients. Clin Exp Rheumatol, 28(5 Suppl 62):S26, 2010.
[70] T. E. King Jr, W. Z. Bradford, S. Castro-Bernardini, E. A. Fagan, I. Glaspole, M. K. Glassberg, E. Gorina, P. M. Hopkins, D. Kardatzke, L. Lancaster, et al. A phase 3 trial of pirfenidone in patients with idiopathic pulmonary fibrosis. N Engl J Med, 2014.
[71] H. Kitaoka, R. Takaki, and B. Suki. A three-dimensional model of the human airway tree. J Appl Physiol, 87:2207-2217, 1985.
[72] V. Kumar, Y. Gu, S. Basu, A. Berglund, S. A. Eschrich, M. B. Schabath, K. Forster, H. J. Aerts, A. Dekker, D. Fenstermacher, et al. Radiomics: The process and the challenges. Magn Reson Imaging, 30(9):1234-1248, 2012.
[73] K. Kurashima, T. Hoshi, N. Takayanagi, Y. Takaku, N. Kagiyama, C. Ohta, M. Fujimura, and Y. Sugita. Airway dimensions and pulmonary function in chronic obstructive pulmonary disease and bronchial asthma. Respirology, 17(1):79-86, 2012.
[74] P. Lambin, E. Rios-Velazquez, R. Leijenaar, S. Carvalho, R. G. van Stiphout, P. Granton, C. M. Zegers, R. Gillies, R. Boellard, A. Dekker, et al. Radiomics: Extracting more information from medical images using advanced feature analysis. Eur J Cancer, 48(4):441-446, 2012.


[75] S. Lazebnik, C. Schmid, and J. Ponce. A sparse texture representation using local affine regions. IEEE Trans Pattern Anal Mach Intell, 27(8):1265-1278, 2005.
[76] B. Leonardi, A. M. Taylor, T. Mansi, I. Voigt, M. Sermesant, X. Pennec, N. Ayache, Y. Boudjemline, and G. Pongiglione. Computational modelling of the right ventricle in repaired tetralogy of Fallot: Can it provide insight into patient treatment? Eur Heart J Cardiovasc Imaging, 14(4):381-386, 2013.
[77] B. Ley, B. M. Elicker, T. E. Hartman, C. J. Ryerson, E. Vittinghoff, J. H. Ryu, J. S. Lee, K. D. Jones, L. Richeldi, T. E. King Jr, et al. Idiopathic pulmonary fibrosis: CT and risk of death. Radiology, 273(2):570-579, 2014.
[78] J. Ley-Zaporozhan, S. Ley, O. Weinheimer, S. Iliyushenko, S. Erdugan, R. Eberhardt, A. Fuxa, J. Mews, and H. U. Kauczor. Quantitative analysis of emphysema in 3D using MDCT: Influence of different reconstruction algorithms. Eur J Radiol, 65(2):228-234, 2008.
[79] A. Liaw and M. Wiener. Classification and regression by RandomForest. R News, 2(3):18-22, 2002.
[80] H. H. Loh, J. G. Leu, and R. C. Luo. The analysis of natural textures using run length features. IEEE Trans Ind Electron, 35(2):323-328, 1988.
[81] F. R. Long, R. S. Williams, and R. G. Castile. Structural airway abnormalities in infants and young children with cystic fibrosis. J Pediatr, 144:154-161, 2004.
[82] D. Lordkipanidze, M. S. Ponce de Leon, A. Margvelashvili, Y. Rak, G. P. Rightmire, A. Vekua, and C. P. Zollikofer. A complete skull from Dmanisi, Georgia, and the evolutionary biology of early Homo. Science, 342:326-331, 2013.
[83] D. A. Lynch, J. D. Godwin, S. Safrin, K. M. Starko, P. Hormel, K. K. Brown, G. Raghu, T. E. King Jr, W. Z. Bradford, D. A. Schwartz, et al. High-resolution computed tomography in idiopathic pulmonary fibrosis: Diagnosis and prognosis. Am J Respir Crit Care Med, 172(4):488-493, 2005.
[84] D. A. Lynch, W. D. Travis, N. L. Muller, J. R. Galvin, D. M. Hansell, P. A. Grenier, and J. r. King. Idiopathic interstitial pneumonias: CT features. Radiology, 236(1):10-21, 2005.
[85] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Discriminative learned dictionaries for local image analysis. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8. IEEE, 2008.
[86] F. Maldonado, T. Moua, S. Rajagopalan, R. A. Karwoski, S. Raghunath, P. A. Decker, T. E. Hartman, B. J. Bartholmai, R. A. Robb, and J. H. Ryu. Automated quantification of radiological patterns predicts survival in idiopathic pulmonary fibrosis. Eur Respir J, 43(1):204-212, 2014.


[87] T. Malisiewicz, A. Gupta, and A. A. Efros. Ensemble of exemplar-SVMs for object detection and beyond. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 89-96. IEEE, 2011.
[88] S. Matsuoka, Y. Kurihara, K. Yagihashi, M. Hoshino, and Y. Nakajima. Airway dimensions at inspiratory and expiratory multisection CT in COPD: Correlation with airflow limitation. Radiology, 248(3):1042-1049, 2008.
[89] B. Mauroy and P. Bokov. The influence of variability on the optimal shape of an airway tree branching asymmetrically. Phys Biol, 7(1):016007, 2010.
[90] B. Mauroy, C. Fausser, D. Pelca, J. Merckx, and P. Flaud. Toward the modeling of mucus draining from the human lung: Role of the geometry of the airway tree. Phys Biol, 8(5):056006, 2011.
[91] B. Mauroy, M. Filoche, J. S. Andrade Jr, and B. Sapoval. Interplay between geometry and flow distribution in an airway tree. Phys Rev Lett, 90(14):148101, 2003.
[92] B. Mauroy, M. Filoche, E. R. Weibel, and B. Sapoval. An optimal bronchial tree may be dangerous. Nature, 427(6975):633-636, 2004.
[93] D. K. Meyerholz, D. A. Stoltz, E. Namati, S. Ramachandran, A. A. Pezzulo, A. R. Smith, M. V. Rector, M. J. Suter, S. Kao, G. McLennan, G. J. Tearney, J. Zabner, P. B. McCray, and M. J. Welsh. Loss of cystic fibrosis transmembrane conductance regulator function produces abnormalities in tracheal development in neonatal pigs and young children. Am J Respir Crit Care Med, 182:1251-1261, 2010.
[94] M. Montaudon, P. Berger, A. Cangini-Sacher, G. de Dietrich, J. M. Tunon-de-Lara, R. Marthan, and F. Laurent. Bronchial measurement with three-dimensional quantitative thin-section CT in patients with cystic fibrosis. Radiology, 242:573-581, 2007.
[95] L. S. Mott, J. Park, C. P. Murray, C. L. Gangell, N. H. de Klerk, P. J. Robinson, C. F. Robertson, S. C. Ranganathan, P. D. Sly, and S. M. Stick. Progression of early structural lung disease in young children with cystic fibrosis assessed using CT. Thorax, 67:509-516, 2012.
[96] C. Mueller-Mang, C. Grosse, K. Schmid, L. Stiebellehner, and A. A. Bankier. What every radiologist should know about idiopathic interstitial pneumonias. Radiographics, 27(3):595-615, 2007.
[97] J. Naisbitt. Megatrends: Ten new directions transforming our lives. Warner Books, 1984.
[98] T. R. Nelson and D. K. Manchester. Modeling of lung morphogenesis using fractal geometries. IEEE Trans Med Imaging, 7(4):321-327, 1988.

[99] Idiopathic Pulmonary Fibrosis Clinical Research Network et al. A controlled trial of sildenafil in advanced idiopathic pulmonary fibrosis. N Engl J Med, 363(7):620, 2010.
[100] Idiopathic Pulmonary Fibrosis Clinical Research Network et al. Prednisone, azathioprine, and N-acetylcysteine for pulmonary fibrosis. N Engl J Med, 366(21):1968, 2012.
[101] A. Y. Ng. Graduate summer school: Deep learning, feature learning. https://www.youtube.com/watch?v=n1ViNeWhC24, May 2013. Accessed Dec. 9, 2014.
[102] M. Nishino, H. Itoh, and H. Hatabu. A practical approach to high-resolution CT of diffuse lung disease. Eur J Radiol, 83(1):6–19, 2014.
[103] M. Nishino, D. M. Jackman, H. Hatabu, B. Y. Yeap, L. A. Cioffredi, J. T. Yap, P. A. Jänne, B. E. Johnson, and A. D. Van den Abbeele. New response evaluation criteria in solid tumors (RECIST) guidelines for advanced non–small cell lung cancer: comparison with original RECIST and impact on assessment of tumor response to targeted therapy. AJR Am J Roentgenol, 195(3):W221, 2010.
[104] I. Noth, K. J. Anstrom, S. B. Calvert, J. de Andrade, K. R. Flaherty, C. Glazer, R. J. Kaner, and M. A. Olman. A placebo-controlled randomized trial of warfarin in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med, 186(1):88–95, 2012.
[105] T. Ojala, M. Pietikäinen, and T. Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell, 24(7):971–987, 2002.
[106] E. R. F. Perez, C. E. Daniels, D. R. Schroeder, J. S. Sauver, T. E. Hartman, B. J. Bartholmai, S. Y. Eunhee, and J. H. Ryu. Incidence, prevalence, and clinical course of idiopathic pulmonary fibrosis: A population-based study. Chest, 137(1):129–137, 2010.
[107] J. W. Prescott. Quantitative imaging biomarkers: The application of advanced image processing and analysis to clinical and preclinical decision making. J Digit Imaging, 26(1):97–108, 2013.
[108] R. K. Putman, I. O. Rosas, and G. M. Hunninghake. Genetics and early detection in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med, 189(7):770–778, 2014.
[109] G. Raghu, H. R. Collard, K. J. Anstrom, K. R. Flaherty, T. R. Fleming, T. E. King Jr, F. J. Martinez, and K. K. Brown. Idiopathic pulmonary fibrosis: Clinically meaningful primary endpoints in phase 3 clinical trials. Am J Respir Crit Care Med, 185(10):1044–1048, 2012.
[110] G. Raghu, H. R. Collard, J. J. Egan, F. J. Martinez, J. Behr, K. K. Brown, T. V. Colby, J. F. Cordier, K. R. Flaherty, J. A. Lasky, et al. An official ATS/ERS/JRS/ALAT statement: Idiopathic pulmonary fibrosis: Evidence-based guidelines for diagnosis and management. Am J Respir Crit Care Med, 183(6):788–824, 2011.
[111] L. Rao, C. Tiller, C. Coates, R. Kimmel, K. E. Applegate, J. Granroth-Cook, C. Denski, J. Nguyen, Z. Yu, E. Hoffman, and R. S. Tepper. Lung growth in infants and toddlers assessed by multi-slice computed tomography. Acad Radiol, 17:1128–1135, 2010.
[112] L. Richeldi, R. M. du Bois, G. Raghu, A. Azuma, K. K. Brown, U. Costabel, V. Cottin, K. R. Flaherty, D. M. Hansell, Y. Inoue, et al. Efficacy and safety of nintedanib in idiopathic pulmonary fibrosis. N Engl J Med, 2014.
[113] E. E. Sarria, R. Mattiello, L. Rao, C. J. Tiller, B. Poindexter, K. E. Applegate, J. Granroth-Cook, C. Denski, J. Nguyen, Z. Yu, E. Hoffman, and R. S. Tepper. Quantitative assessment of chronic lung disease of infancy using computed tomography. Eur Respir J, 39:992–999, 2011.
[114] D. P. Schuster. The opportunities and challenges of developing imaging biomarkers to study lung function and disease. Am J Respir Crit Care Med, 176(3):224–230, 2007.
[115] K. E. Shin, M. J. Chung, M. P. Jung, B. K. Choe, and K. S. Lee. Quantitative computed tomographic indexes in diffuse interstitial lung disease: Correlation with physiologic tests and computed tomography visual scores. J Comput Assist Tomogr, 35(2):266–271, 2011.
[116] C. I. S. Silva, N. L. Müller, D. M. Hansell, K. S. Lee, A. G. Nicholson, and A. U. Wells. Nonspecific interstitial pneumonia and idiopathic pulmonary fibrosis: Changes in pattern and distribution of disease over time. Radiology, 247(1):251–259, 2008.
[117] I. Sluimer, A. Schilham, M. Prokop, and B. van Ginneken. Computer analysis of computed tomography scans of the lung: A survey. IEEE Trans Med Imaging, 25(4):385–405, 2006.
[118] P. D. Sly, S. Brennan, C. Gangell, N. de Klerk, C. Murray, L. Mott, S. M. Stick, P. J. Robinson, C. F. Robertson, and S. C. Ranganathan. Lung disease at diagnosis in infants with cystic fibrosis detected by newborn screening. Am J Respir Crit Care Med, 180:146–152, 2009.
[119] J. Stewart. Jacobellis v. Ohio, 378 U.S. 184. U.S. Supreme Court, 1964.
[120] K. Strimbu and J. A. Tavel. What are biomarkers? Curr Opin HIV AIDS, 5(6):463, 2010.
[121] C. Strobl, J. Malley, and G. Tutz. An introduction to recursive partitioning: Rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods, 14(4):323, 2009.
[122] P. Suetens. Fundamentals of medical imaging. Cambridge medicine. Cambridge University Press, 2009.
[123] K. Suzuki. Pixel-based machine learning in medical imaging. J of Biomed Imag, 2012:1, 2012.
[124] N. Sverzellati. Highlights of HRCT imaging in IPF. Respiratory Research, 14(Suppl 1):S3, 2013.
[125] M. H. Tawhai and C. L. Lin. Image-based modeling of lung structure and function. J Magn Reson Imaging, 32(6):1421–1431, 2010.
[126] J. Tschirren, E. A. Hoffman, G. McLennan, and M. Sonka. Intrathoracic airway trees: Segmentation and airway morphology analysis from low-dose CT scans. IEEE Trans Med Imaging, 24:1529–1539, 2005.
[127] J. Tschirren, G. McLennan, K. Palágyi, E. A. Hoffman, and M. Sonka. Matching and anatomical labeling of human airway tree. IEEE Trans Med Imaging, 24(12):1540–1547, 2005.
[128] R. Uppaluri, E. A. Hoffman, M. Sonka, G. W. Hunninghake, and G. McLennan. Interstitial lung disease: A quantitative study using the adaptive multiple feature method. Am J Respir Crit Care Med, 159(2):519–525, 1999.
[129] R. Uppaluri, T. Mitsa, M. Sonka, E. A. Hoffman, and G. McLennan. Quantification of pulmonary emphysema from lung computed tomography images. Am J Respir Crit Care Med, 156(1):248–254, 1997.
[130] M. Vaillant and J. Glaunès. Surface matching via currents. Inf Process Med Imaging, 19:381–392, 2005.
[131] M. Varma and A. Zisserman. A statistical approach to texture classification from single images. Int J Comput Vis, 62(1-2):61–81, 2005.
[132] S. Verma and A. S. Slutsky. Idiopathic pulmonary fibrosis–new insights. N Engl J Med, 356(13):1370, 2007.
[133] J. Wang, F. Li, K. Doi, and Q. Li. Computerized detection of diffuse lung disease in MDCT: The usefulness of statistical texture features. Phys Med Biol, 54(22):6881, 2009.
[134] S. Wang and R. M. Summers. Machine learning and radiology. Med Image Anal, 16(5):933–951, 2012.
[135] T. Watadani, F. Sakai, T. Johkoh, S. Noma, M. Akira, K. Fujimoto, A. A. Bankier, K. S. Lee, N. L. Müller, J. W. Song, et al. Interobserver variability in the CT assessment of honeycombing in the lungs. Radiology, 266(3):936–944, 2013.
[136] E. R. Weibel. What makes a good lung? Swiss Med Wkly, 139(27-28):375–386, 2009.
[137] E. R. Weibel. It takes more than cells to make a good lung. Am J Respir Crit Care Med, 187:342–346, 2013.
[138] E. R. Weibel and D. M. Gomez. Architecture of the human lung: Use of quantitative methods establishes fundamental relations between size and number of lung structures. Science, 137(3530):577–585, 1962.
[139] A. U. Wells, S. R. Desai, M. B. Rubens, N. S. Goh, D. Cramer, A. G. Nicholson, T. V. Colby, R. M. du Bois, and D. M. Hansell. Idiopathic pulmonary fibrosis: A composite physiologic index derived from disease extent observed by computed tomography. Am J Respir Crit Care Med, 167(7):962–969, 2003.
[140] J. B. West. Respiratory physiology: The essentials. Lippincott Williams & Wilkins, 2012.
[141] M. O. Wielpütz, M. Eichinger, O. Weinheimer, S. Ley, M. A. Mall, M. Wiebel, A. Bischoff, H. U. Kauczor, C. P. Heussel, and M. Puderbach. Automatic airway analysis on multidetector computed tomography in cystic fibrosis: Correlation with pulmonary function testing. J Thorac Imaging, 28:104–113, 2013.
[142] J. P. Williamson, A. L. James, M. J. Phillips, D. D. Sampson, D. R. Hillman, and P. R. Eastwood. Quantifying tracheobronchial tree dimensions: Methods, limitations and emerging techniques. Eur Respir J, 34:42–55, 2009.
[143] X. Yang and Y. Tian. Texture representations using subspace embeddings. Pattern Recognit Lett, 34(10):1130–1137, 2013.
[144] T. S. Yoo. Insight into images: Principles and practice for segmentation, registration, and image analysis. AK Peters Series. Taylor & Francis, 2004.
[145] J. A. Zach, J. D. Newell Jr, J. Schroeder, J. R. Murphy, D. Curran-Everett, E. A. Hoffman, P. M. Westgate, M. K. Han, E. K. Silverman, J. D. Crapo, et al. Quantitative CT of the lungs and airways in healthy non-smoking adults. Invest Radiol, 47(10):596, 2012.
[146] M. Zelditch. Geometric morphometrics for biologists: A primer. 2004.