Interpolated and adaptive contour 3D mesh segmentation in image volumes

Material Information

Interpolated and adaptive contour 3D mesh segmentation in image volumes
McTavish, Thomas S
Publication Date:
Physical Description:
xiv, 124 leaves : ; 28 cm


Subjects / Keywords:
Diagnostic imaging -- Digital techniques ( lcsh )
Three-dimensional imaging ( lcsh )
Diagnostic imaging -- Digital techniques ( fast )
Three-dimensional imaging ( fast )
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )


Includes bibliographical references (leaves 120-124).
General Note:
Department of Electrical Engineering, Department of Computer Science and Engineering
Statement of Responsibility:
by Thomas S. McTavish.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
230480709 ( OCLC )
LD1193.E52 2007m M37 ( lcc )

Full Text
Thomas S. McTavish
B.S. Psychology, Pacific Union College, 1989
A thesis submitted to the
University of Colorado at Denver
and Health Sciences Center
in partial fulfillment
of the requirements for the degree of
Master of Science
Computer Science and Electrical Engineering

This thesis for the Master of Science
degree by
Thomas S. McTavish
has been approved
Boris Stilman

McTavish, Thomas S. (M.S., Computer Science)
Interpolated and Adaptive Contour 3D Mesh Segmentation in Image Volumes
Thesis directed by Associate Professor Min-Hyung Choi
Precise, rapid segmentation of 3D image volumes is a prominent problem in
biomedical imaging. I describe an interactive 3D segmentation method where
users intuitively construct an interpolating 3D cubic mesh model of an item's
surface while capitalizing on low-level image features. The method balances
controllability, ease of use, and efficiency while attaining human-defined
accuracy in 3D. This is accomplished through the following means: 1) Control
points can only be inserted on discrete planar slices. This enables the user to
understand the correspondence between control points on different slices while
adjusting only a few of them from their automatic insertion locations. 2)
Interpolated surfaces can also snap to low-level image features, allowing the
user to make control points sparser. 3) Because the user is constructing a 3D
mesh, the resulting segmentation is consistent in the orthogonal plane,
something that is often lost in the segmentation of 2D image stacks. Ultimately,
this method intimately ties the segmentation process to the desired output, a
3D model.

This abstract accurately represents the content of the candidate's thesis. I
recommend its publication.
Min-Hyung Choi

This work is dedicated to my wife, Janine, for her support of my pursuits in
higher education.

I would like to thank my wife, of course, for her support of my work. I would also
like to thank Vic Spitzer for the opportunity to work on such a fascinating project
as the Visible Human. Various students, volunteers, and employees at the Center
for Human Simulation at the University of Colorado Health Sciences Center were
vital for testing and providing feedback on my segmentation tools, which ultimately
led to the creation of this mesh tool. Particularly, Susan McNevin, Adam
Lawson, and Terra Doucette greatly exercised my segmentation application and
gave me great feedback. Karl Reinig and Dave Rubinstein were also instrumental
in discussing technical details and user interface issues. It was also a pleasure
sharing an office with Lara Reigler during much of this development. And finally,
I'd like to thank my advisor, Min Choi, for supporting me even though I did not
follow standard protocols. His insistence on me gathering the correct data and
assistance with analyzing and presenting it properly have made me a better

List of Figures
Tables
1. Introduction
1.1 The difficult problem of segmentation
1.2 Precision
1.3 Biomedical imaging
1.4 The Visible Human dataset
1.5 Difficulty of 3D segmentation
1.6 Project motivation
1.7 Document outline
2. Background
2.1 Seeded region growing
2.2 Snakes and active contour models
2.3 Snap-to-edge techniques
2.4 Geometric modeling and drawing
2.5 Background summary
3. Methods
3.1 Editing with the 3D geometric mesh data structure
3.1.1 Cubic mesh
3.1.2 Catmull-Rom interpolation
3.1.3 Mesh data structure
3.2 Mesh with snap-to-edge
3.2.1 Shortest path searching
3.2.2 Inside and outside
3.2.3 Blending shape and snap-to-edge
4. Results
5. Discussion
6. Conclusions and further research
A. Automated methods and features
A.1 Automated Methods
A.2 Features
A.2.1 Edge Detection
A.2.2 Texture
A.2.2.1 Textons
A.2.2.2 Filter banks
A.2.2.3 Gradients and Gradient Flows
A.3 Methods
A.3.1 Artificial Neural Networks
A.3.1.1 Other Learning Methods
A.3.2 Watershed Segmentation
B. Cubic splines
C. Segmentation Application User Manual


1.1 President Bush.
1.2 Number of peer-reviewed publications in the PUBMED database
from 1993 to 2006 using the search term "medical AND image AND
segmentation [dp]". Figure taken after Figure 1 in Zaidi (2006).
1.3 Problems in the Z-plane. A portion of coronal plane 503 of the
visible human male. Top: RGB image. Bottom: Segmented data as
performed via manual tracing. Many places show drift and inconsistent
segmentation between slices. Arrows highlight examples of
single voxels which may be inappropriately tagged. Outline in RGB
image shows cutting artifacts in the volume that caused the user to
perform an inconsistent segmentation in the Z-plane.
2.1 Seeded region growing using DDA color space matching as the
feature vector. The solid green stroke is the user's seed and the
semi-transparent green is the resulting region grown from the seed.
3.1 Portion of a cubic mesh. With Catmull-Rom interpolation, the
interpolated patch (gray) is bounded by 4 control points, $p_{row,col}$, using
the location of 16 control points surrounding the patch.
3.2 Example of local editing. Two segments on either side of the control
point that is moved are affected by the move.

3.3 After Figure 10-30 in Hearn and Baker (1997). Illustration of the
Catmull-Rom spline automatically determining the tangent vectors
at a control point and the segment being calculated (black). The
tangent vector for $p_k$ is determined by the chord $p_{k-1}p_{k+1}$ and the
tangent vector for $p_{k+1}$ is determined by the chord $p_k p_{k+2}$.
3.4 Automatic insertion of control points on other rows of the mesh.
3.5 Example of inserting control points on a slice.
3.6 Responses from filters.
3.7 Example of the graph structure used for finding an optimal path
between two pixels (gray). Control points are designated as a full pixel.
A path follows the edges between the start and end pixels, finding
the shortest path from one of the four nodes about the starting pixel
to one of the four nodes about the ending pixel.
3.8 The shape gradient is made by first capturing the horizontal and
vertical edges of the shape (horizontal edges shown here on the left)
and then blurring that image with a Gaussian filter. The user sets
the spread of the Gaussian filter.
3.9 Response of the interpolated contour to various levels of blending.
3.10 More responses of the interpolated contour to various levels of blending.
4.1 Statistics for 331 tissues segmented on the Visible Human dataset.
4.2 Histogram of the number of control points (Y-axis) and
distance in pixels (X-axis) for moved control points.

4.3 Resulting segmentation of three muscles of the forearm in the
Visible Human Male. Right brachioradialis, extensor carpi longus, and
extensor carpi brevis using manual tracing, mesh segmentation, and
mesh segmentation with snap-to-edge.
A.1 Mapping into higher-dimensional nonlinear space. For data to be
linearly separable in feature space, there must be a separating
hyperplane. This example applies a nonlinear transformation to create
a new feature space with a separating hyperplane. Image taken from
Greg Grudic's Machine Learning class notes.
A.2 A portion of slice 348 of the Visible Human Male. The top left
image is the color image data. The top right image is the manually
segmented image. The bottom left image applies a LoG filter with
a sigma value of 1.0. The bottom-right image shows the results of a
LoG filter with a sigma value of 3.0.
A.3 The Leung-Malik filter bank. The upper-left filters are first
derivatives of Gaussians at 6 orientations and 3 scales. The upper-right
filters are second derivatives of Gaussians also at 6 orientations
and 3 scales. In these bar filters, the elongation factor
is 3 so $\sigma_y = 3\sigma_x$. The bottom row contains 8 spot LoG
filters and 4 Gaussian filters at various scales. Image appears at vgg/research/texclass/filters.html
after Leung and Malik (2001).

A.4 1D Laplacian of Gaussian. When 2D, it is as if this filter is spun
horizontally about the center resembling a Mexican sombrero. For
this reason, a LoG filter is also dubbed "the Mexican hat."
A.5 A Gabor filter is a sinusoidal signal modulated with a Gaussian.
A.6 A filter bank of spot and bar filters applied to an image of a butterfly
with the corresponding filter responses in the bottom images.
Figure comes from
Slides/PyramidsandTexture.ppt which accompanies Forsyth and
Ponce (2003).
A.7 Example of the windowing problem. A pixel (small black square)
inside a texture (blue). Rectangle about the pixel is the window
centered at the pixel. Most pixels within the window are from outside
the texture. Therefore, the window may report that this pixel does
not belong to the texture, even though it does.
A.8 Simple, fully-connected ANN. For clarity, arrows have been omitted,
but edges are directed from left to right, from input to hidden nodes
then to output nodes.

4.1 Comparison of time to trace a single slice with the mesh tool or
manual tracing.

1. Introduction
1.1 The difficult problem of segmentation
Classifying individual pixels of a digital image into discrete, meaningful sets
is known as segmentation. Humans' capacity to recognize visual patterns is
both extraordinary and taken for granted. Always "on" during waking hours,
we receive a continuous stream of visual images that help us segregate and
classify what we see. These images are easily processed by built-in low- and
high-level capabilities of our brains. The arrangement of photoreceptor neurons
in the retina creates a receptive field that provides a means of performing edge
detection (Boynton, 2005) and even shape-from-shading (Lehky and Sejnowski,
1988). Furthermore, our ability to focus and follow visual cues allows us to
segregate objects by motion and stereo depth perception. In this way, we quickly
group low-level visual cues into a collective set. We then draw on our rich history
of acquired images for further classification.
Segmentation is the paramount problem of Computer Vision. After all,
if a robot can interpret the images from its camera(s), then it can navigate
landscapes, sort items on an assembly line, look out for terrorists, etc. The
holy grail is that it can perform such tasks automatically, alleviating human
effort. As an indicator of the difficulty in training a computer to automatically
classify images, consider Figure 1.1. The image on the left is a color-inverted
and vertically flipped version of the image on the right. If the image on the right
was flashed before us for half a second and we were asked what we saw, we would

Figure 1.1: President Bush. (a) Color-inverted and vertically flipped version
of the picture on the right. (b) Our great leader.
say "President Bush." If the image on the left was flashed before us for the same
duration, we would be hard pressed to say what it was, let alone a person with a
face, and certainly we would not recognize it as George Bush. Yet this is exactly
what we are asking computers to do when we expect them to perform quick,
automatic segmentation based on relatively few training examples compared to
our acquired image history.
Nevertheless, automatic segmentation has given acceptable results (i.e., an
acceptably low number of false positives and negatives) in a variety of industrial
applications such as assembly line robotics (Malamas et al.), face recognition
(Zhao et al., 2003), and even in finding naked people on the internet (Fleck
et al., 1996).

1.2 Precision
A distinction must be made between precise and imprecise segmentation.
If the duty of the segmentation system is to determine that a given element
exists in the image or even exists in a general region, then automatic techniques
may indeed be sufficient. If, however, the segmentation process is to determine
exactly which pixels belong to an object and which pixels do not, then automatic
techniques are rarely successful. This is because edge boundaries are frequently
hard to determine due to a variety of factors such as noise or homogeneity of
the object with the background.
A precise classification is therefore typically attained through semi-automatic
and manual methods. In the semi-automatic method, humans guide the
segmentation process, typically in an iterative fashion, until the human expert is
satisfied with the classification. These methods balance low-level image
features and information about the object of interest with the user's decisions.
Conversely, the manual method does not incorporate any image information, but
simply enables the user to classify which pixels belong to the object and which ones
do not. For example, painting or tracing an object with a mouse is a common
manual technique.
1.3 Biomedical imaging
An important area of image segmentation is medical imaging. Frequently,
these images are 3D, such as those obtained by computed tomography (CT),
magnetic resonance (MR), or even cryosectioned images from cadavers such as
the Visible Human (Ackerman et al., 1995). In 3D biomedical images,
segmentation is the process of classifying voxels (3D pixels) as belonging to particular

tissues, disease states, or other regions of interest. After such a classification,
further analyses can be performed, such as diagnosis and quantitative
comparison with other samples (Dyson et al., 2000; Schild et al., 2000), measuring
the presence and rates of disease progression (Fox et al., 2001; Bülow et al.,
2007), and mapping a response to a stimulus across time (Salcedo et al., 2005).
Segmented images also permit the construction of 3D anatomical models for
visualization, treatment plans (Lees, 2001), and training, including virtual
dissection (Dev and Senger, 2005), surgical simulation (Reinig et al., 1996, 2006),
and ultrasound simulation (Imani et al.).
As an example of the increasing role and interest in medical image
segmentation, consider the number of peer-reviewed journal articles in the PUBMED
database matching the search term "medical AND image AND segmentation" for
a particular year. Results are shown in Figure 1.2. While PUBMED as a whole shows
exponential growth of about 10% each year, the number of medical image
segmentation publications is rising much faster, at about 22% each year.
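To make the gap between these two growth rates concrete, the following Python sketch compares them under the assumption of a constant compound annual rate. The function names and the choice of the 1993 to 2006 span (taken from Figure 1.2) are illustrative, not part of the original analysis.

```python
import math


def doubling_time(annual_growth):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


def growth_factor(annual_growth, years):
    """Total multiplicative growth after the given number of years."""
    return (1.0 + annual_growth) ** years


# Rates quoted in the text; treating them as constant is an assumption.
pubmed_rate = 0.10
segmentation_rate = 0.22
span_years = 2006 - 1993  # period covered by Figure 1.2

for label, rate in (("PUBMED overall", pubmed_rate),
                    ("medical image segmentation", segmentation_rate)):
    print(f"{label}: doubles every {doubling_time(rate):.1f} years, "
          f"grows {growth_factor(rate, span_years):.1f}x over {span_years} years")
```

Under these assumptions, the segmentation literature doubles roughly twice as often as PUBMED as a whole.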
Many biomedical applications require precise classifications of each pixel.
After all, when imprecise, diagnoses and treatment plans may be incorrect,
scientific reports will be inaccurate, virtual dissections will be unrealistic, and
surgical training with virtual techniques will be inadequate. Unfortunately,
noise, distortion, resolution of images, and the relative homogeneity of many
tissues do not provide enough low-level contrast to enable effective automated
methods for precise segmentation, especially at object boundaries. Many
semi-automatic techniques also fall short because they often rely heavily on
low-level image features which are unable to discriminate object borders. In fact,

Figure 1.2: Number of peer-reviewed publications in the PUBMED database
from 1993 to 2006 using the search term "medical AND image AND
segmentation [dp]". Figure taken after Figure 1 in Zaidi (2006).
the effort to repair errors from an inaccurate segmentation can be worse than
simply employing manual methods from the start. However, manual methods
such as contour tracing of slices in 2D are extraordinarily tedious.
1.4 The Visible Human dataset
The Visible Human Project is an effort by the National Institutes of Health
(NIH) to digitize human anatomy (Ackerman et al., 1995; Spitzer et al., 1996).
Over a decade ago, a male cadaver was cryosectioned and cross-sectioned
colored images of his internal anatomy were obtained in millimeter increments.
The stacking of these images comprises a 3D image volume available to the
public. Since this
volume was obtained, efforts have been made to segment this data such that
realistic 3D anatomical models can be reconstructed for use in virtual
ultrasound (Imani et al.), dissection, and

surgical simulations (Reinig et al., 1996, 2006). Voxels for this dataset are 0.3 x
0.3 x 1 mm in size.
With newer technologies, other cryosectioned volumes are being obtained
with 0.10 mm cubed voxels, and therefore these newer volumes have roughly 100 times
the resolution of the original Visible Human Male. Segmentation of these
datasets, however, is far and away the bottleneck to other applications that
could be developed to capitalize on the data. Over 3000 discrete anatomical
structures have been labeled in the Visible Human Male by efforts at the Center
for Human Simulation at the University of Colorado Health Sciences Center.
Segmentation of this volume has been ongoing for over a decade, a laborious
effort by physicians, students, medical students, and others who have classified
this dataset largely by manual segmentation techniques. Some structures in
this dataset are still being classified and others are being edited. With newer
volumes at 100 times the size/resolution, the same effort would take 1000 years
to segment! (In reality, to date, these other volumes are not complete bodies.
Still, if the volumes were 1/10th of a body, they would each take 100 years to
completely segment, so faster methods are obviously necessary.)
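The arithmetic behind these estimates can be sketched in a few lines of Python. The sketch assumes that segmentation effort scales linearly with the number of voxels and takes "over a decade" as a 10-year baseline. Both are simplifying assumptions, and the helper names are hypothetical.

```python
def voxel_volume(dx_mm, dy_mm, dz_mm):
    """Volume of a single voxel in cubic millimeters."""
    return dx_mm * dy_mm * dz_mm


# Voxel sizes quoted in the text.
original_voxel = voxel_volume(0.3, 0.3, 1.0)  # Visible Human Male
newer_voxel = voxel_volume(0.1, 0.1, 0.1)     # newer cryosectioned volumes

# Voxels needed to cover the same tissue: about 90, rounded to 100 in the text.
voxels_ratio = original_voxel / newer_voxel


def years_to_segment(baseline_years, voxel_ratio, body_fraction=1.0):
    """Projected manual effort, assuming effort is proportional to voxel count."""
    return baseline_years * voxel_ratio * body_fraction


print(f"voxel count ratio: {voxels_ratio:.0f}x")
print(f"full body at 100x: {years_to_segment(10, 100):.0f} years")
print(f"one tenth of a body at 100x: {years_to_segment(10, 100, 0.1):.0f} years")
```

With the quoted voxel sizes the ratio comes to about 90, which the text rounds to 100 before scaling the decade of effort.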
1.5 Difficulty of 3D segmentation
3D image volumes are especially difficult to classify. While theoretically
more pixels of information are available for an automatic or semi-automatic
technique to operate on, the opportunity for error increases exponentially. The
reasons that cause 2D segmentation techniques to fail are often exacerbated
when they are implemented as 3D techniques. For example, say a region-growing
technique can properly classify 99.9% of a pixel's neighboring pixels. In 2D, a

pixel has 8 neighbors and therefore a misclassification might occur every 125
pixels. In 3D, a voxel has 26 neighbors and therefore the technique will make
an error, on average, every 38 voxels. While misclassification of this magnitude
may not sound that dire (a pixel off here or a pixel off there), one errant pixel
may induce several other pixels to be misclassified, a case known as "bleed-out"
when the region being grown spills over into unwanted regions.
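The figures of 125 and 38 follow from dividing one by the expected number of neighbor misclassifications per visited pixel. A minimal Python sketch of that calculation, assuming each neighbor is misclassified independently with the same probability (the function name is illustrative):

```python
def pixels_between_errors(neighbors, per_neighbor_accuracy):
    """Average number of visited pixels per misclassified neighbor.

    Assumes every neighbor is judged independently with the same accuracy.
    """
    error_rate = 1.0 - per_neighbor_accuracy
    expected_errors_per_pixel = neighbors * error_rate
    return 1.0 / expected_errors_per_pixel


# 8-connected neighborhood in 2D, 26-connected neighborhood in 3D.
spacing_2d = pixels_between_errors(8, 0.999)
spacing_3d = pixels_between_errors(26, 0.999)

print(f"2D: one error about every {spacing_2d:.0f} pixels")
print(f"3D: one error about every {spacing_3d:.0f} voxels")
```

Moving from 8 to 26 neighbors more than triples the error frequency at the same per-neighbor accuracy.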
A major problem with 3D data is how to visualize and navigate it. For
this reason, when using an automatic or semi-automatic technique, the
method's classification decision is often not readily apparent. Therefore,
humans employing these methods may waste a lot of time rotating and viewing
3D renderings of the classification or navigating through the volume to validate
each slice.
For this reason, many 3D techniques actually operate in pseudo-3D, sequen-
tially performing a classification on one slice and then another segmentation on
the adjacent slice. Even when incorporating information from previous slices or
when manually traced by human experts, this pseudo-3D approach is prone to
inaccuracies in the orthogonal plane as exemplified in Figure 1.3.
1.6 Project motivation
I consider segmentation accuracy as the classification agreed upon by (po-
tentially several) human experts. Additionally, I characterize efficiency as the
inverse of human interactivity. The motivation behind my work was to develop
computational methods to facilitate the accurate and rapid segmentation of 3D
medical image data. For reasons just mentioned and described in more detail
in the following chapter, many methods fall short in balancing accuracy and

Figure 1.3: Problems in the Z-plane. A portion of coronal plane 503 of the visible
human male. Top: RGB image. Bottom: Segmented data as performed via
manual tracing. Many places show drift and inconsistent segmentation between
slices. Arrows highlight examples of single voxels which may be inappropriately
tagged. Outline in RGB image shows cutting artifacts in the volume that caused
the user to perform an inconsistent segmentation in the Z-plane.

The parameters and goals of my computational tool were that:
1. Segmentations be accurate. If accuracy is mandatory and the seg-
mentation method is inaccurate, then a manual method has to be applied
to either undo the misclassified voxels or to classify the voxels that were
missed. This can often be more tedious than simply performing a manual
segmentation from the outset. The reason this can be so laborious is that
it is difficult to navigate slices in 2D and to build and rotate models in 3D
to determine where the errors are. For example, one stray voxel may be
classified outside the rest of a contiguous area that is properly classified.
Locating such stray voxels from thousands of voxels can be difficult. Ad-
ditionally, the most common place for errors in segmentation is at border
boundaries. If errors are always prone to exist at the boundary, then the
user must outline and fill part or all of the structure being segmented.
2. The tool work across most or all tissue types. The utility of an
automated tool that can accurately segment particular tissues would be
profound. For example, if an automated method could segment all the
veins, it would obviate many hours of users performing such a task by hand.
However, with over 3000 discrete tissue types, it is unreasonable to build
individual, customized approaches to each segmentation task. So while
some difficult-to-segment structures such as veins may warrant automated
or semi-automated approaches, there is also a need for a generic tool.
3. Classification be performed much quicker than existing manual

approaches. The previous manual approach involved users outlining a
structure on a slice, pixel-by-pixel, and then filling it. This meant the
users used a mouse to paint the outline, erasing or undoing sections if
they drifted here or there. In this context, image segmentation is largely
determining the contour and then filling the inside voxels. There was no
easy way of quickly tracing and editing a contour. There was also a lot of
going back and forth between adjacent slices to make sure the classification
did not drift too much in the Z-plane. Almost any approach that made
drawing outlines easier and maintained consistency in the Z-plane would
be at least a moderate improvement over the previous approach.
4. The user interface be intuitive and facilitate rapid segmentation.
Many segmentation approaches, especially automatic or semi-automatic
methods, are usually not intuitive for 3D segmentation. Users can dial
in parameters that are optimized for one slice, only to find that other
slices are worse. If viewing a live 3D model, they may have to rotate
and view the model from several angles with each parameter adjustment.
Finally, understanding and editing parameters is often a difficult chore.
This is because the users are not programmers and do not understand
what the algorithm is explicitly doing. Alternatively, if programmers run
the segmentation process, they are apt to misunderstand the tissue to be
segmented as they lack the knowledge of an anatomist.
5. Segmentations could be easily undone and modified. Different peo-
ple will have different interpretations of the locations of tissue boundaries.

For this reason, the segmentation should be easily altered by several hu-
man experts, especially a supervisor to undo or edit the work of someone
In summary, my goals were to build a generic, accurate 3D method that was
intuitive and quick for users to operate. With these goals in mind, I developed
a rapid graphical modeling technique of 3D mesh editing, coupling the desired
output, a 3D model, with the segmentation process. It also employs
semi-automated approaches that can snap to an edge. In short, users can lay control
points on a given slice in the XY-plane and an interpolated contour is placed
between control points. Users can skip several slices in the Z-plane before laying
control points on another slice. The intermediate slices are also interpolated in
the Z-plane. Briefly, here are the goals of the project and how my tool addresses
1. Segmentations are accurate. This tool allows the user to quickly out-
line the border of a tissue and fill it. Voxels outside are not filled. This also
maintains consistency in the Z-plane since those slices are interpolated.
To increase resolution, the user can simply add more control points. Ad-
ditionally, to make sure that the edges are not too rounded due to the
interpolated curve between control points, the user can adjust a snap-to-
edge parameter and allow for the contour to migrate to edges rather than
be rounded.
2. The tool works across most or all tissue types. Because this is a
generic 3D mesh editor, it can work on any tissue. In fact, it does not

need to function in a biomedical application. It can serve to outline nearly
any 3D object.
3. Classification is performed much quicker than existing manual
approaches. For a given slice on the XY-plane, an outline can be made
nearly 4 times faster than a manual paint method. Additionally, most
slices in the Z-plane are not edited (they are interpolated). In practice,
most meshes jump about every 7.5 slices. Therefore, this is a 7.5-fold
increase. Finally, there is no going back and forth to clean up slices in
the Z-plane, which probably amounts to a 25% increase. All in all, this
approach is about 32 times faster than the previous manual method.
4. The user interface is intuitive and facilitates rapid segmentation.
Because the user is in complete control of drawing an outline, it is apparent
which voxels will be marked for segmentation and which ones will not
be marked on any slice. Also, the user interface (UI) facilitates quick
navigation to slices with control points for editing (versus slices that are
only interpolated), cropping a sub-volume to the bounds of the mesh, and
the allowance for multiple meshes for the same tissue on a given slice.
5. Segmentations can be easily undone and modified. The mesh in-
formation for a tissue is stored in a file in XML format for the program
to read in easily (and for external programs to also take advantage of).
Because all users have access to the mesh when selecting a tissue, they
can alter the control points and edit the mesh. Additionally, there is an
undo/redo mechanism.
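The overall speed estimate in item 3 combines three factors multiplicatively. The Python sketch below reproduces that combination. Treating the "25% increase" as a factor of 1.25 and the factors as independent are assumptions made here for illustration, and the variable names are hypothetical.

```python
# Factors quoted in item 3 above.
per_slice_speedup = 4.0        # "nearly 4 times faster" per outlined slice
slices_per_edited_slice = 7.5  # only about one slice in 7.5 is edited
cleanup_factor = 1.25          # assumed reading of the "25% increase"

# Upper bound if the per-slice speedup were exactly 4.
combined_speedup = per_slice_speedup * slices_per_edited_slice * cleanup_factor

# Per-slice speedup implied by the quoted overall figure of about 32.
quoted_overall = 32.0
implied_per_slice = quoted_overall / (slices_per_edited_slice * cleanup_factor)

print(f"combined speedup with exactly 4x per slice: {combined_speedup:.1f}x")
print(f"per-slice speedup implied by 32x overall: {implied_per_slice:.2f}x")
```

With a per-slice factor of exactly 4 the product is 37.5, so the quoted figure of about 32 corresponds to a per-slice speedup somewhat below 4, consistent with the wording "nearly 4 times."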

1.7 Document outline
Chapter 1 is this chapter.
Chapter 2 describes a background of manual and semi-automatic segmenta-
tion techniques. I critique their application to the segmentation of 3D biomedical
images, but also explain their useful properties which I have borrowed to make
my mesh editor.
Chapter 3 details the 3D mesh design and its functionality to meet my
previously defined objectives.
Chapter 4 contains statistics to measure the complexity of the models built
and examples of anatomical models built with this tool.
Chapter 5 provides a discussion of future directions and how this method
fits into the current body of 3D segmentation techniques.
I have also included a few appendices. Appendix A discusses automatic
segmentation techniques and why they fail in the segmentation of the Visible
Human dataset. Also in this appendix is a discussion of feature definitions
that are applicable to automatic and semi-automatic techniques. Such feature
classifications could potentially be adapted to make my method's snap-to-edge
capabilities more robust.
Appendix B discusses splines and particularly the Catmull-Rom/Cardinal
spline I use in my mesh object in detail.
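As a preview of the interpolation named above, here is a minimal Python sketch of a closed 2D contour passing through a handful of control points using the standard uniform Catmull-Rom formulation. It is not the thesis implementation. The function names, the sampling density, and the example square are assumptions for illustration only.

```python
def catmull_rom_point(p0, p1, p2, p3, t):
    """Point at parameter t in [0, 1] on the segment between p1 and p2.

    Uses the standard uniform Catmull-Rom basis; p0 and p3 only shape the
    tangents at p1 and p2.
    """
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2 * b
               + (-a + c) * t
               + (2 * a - 5 * b + 4 * c - d) * t2
               + (-a + 3 * b - 3 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def closed_contour(control_points, samples_per_segment=8):
    """Sample a closed interpolating contour through all control points."""
    n = len(control_points)
    contour = []
    for i in range(n):
        # Indices wrap around so the contour closes on itself.
        p0 = control_points[(i - 1) % n]
        p1 = control_points[i]
        p2 = control_points[(i + 1) % n]
        p3 = control_points[(i + 2) % n]
        for s in range(samples_per_segment):
            contour.append(catmull_rom_point(p0, p1, p2, p3, s / samples_per_segment))
    return contour


# Four control points on one slice (hypothetical pixel coordinates).
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
outline = closed_contour(square)
print(len(outline), outline[0], outline[8])
```

Because the curve interpolates rather than approximates, every control point lies on the sampled outline, which is what lets a user place a point exactly on a tissue boundary.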
Appendix C is the user manual for the larger segmentation application that
I built which incorporates this tool and describes to the user how to use it.

2. Background
If automatic methods provided precise segmentations, one could encapsu-
late the segmentation method in a black box hidden from the user. However,
automatic methods for precise classifications are frequently unrealized, requiring
the user to interact with the segmentation method.
Broadly speaking, user interaction methods most often consist of
setting parameter values and/or providing some type of pictorial input onto the
image (Olabarriaga and Smeulders, 2001). Parameters a user may adjust might
include settings of an edge filter, threshold levels for pixel intensities, or other
values particular to the segmentation method. Examples of users interacting
with the segmentation process with pictorial input include region growing and
pruning from user-selected pixel seeds (Schiemann et al., 1997; Boykov and Jolly,
2001; Nguyen et al., 2003; Protiere and Sapiro, 2007), snakes and balloons (Kass
et al., 1988; Caselles et al., 1997; Nguyen et al., 2003; Liang et al., 2006),
snap-to-edge techniques (Mortensen and Barrett, 1995, 1998; Falcao and Udupa, 2000;
Schenk et al., 2000), and drawing methods such as splines (Hearn and Baker,
1997), 3D geometric modeling (de Bruin et al., 2005), and simple contour tracing
and painting.
2.1 Seeded region growing
Seeded region growing allows the user to provide a region that is part and
representative of the structure to be segmented and neighborhood pixels are
continually filled as long as they maintain semblance to the seed. Users can

Figure 2.1: Seeded region growing using DDA color space matching as the
feature vector. The solid green stroke is the user's seed and the semi-transparent
green is the resulting region grown from the seed.
sprinkle a few seeds in disparate areas of the structure to theoretically fill the
complete structure. This can be coupled with seeds which are not part of the
segmented structure. Binary seeding in this fashion is known as stereo pruning
which can be quite effective in 2D when the seeds can easily describe disparate
image features (Boykov and Jolly, 2001; Nguyen et al., 2003; Protiere and Sapiro,
2007). However, seeded region growing in medical imaging is often limited due
to noise and homogeneity of the tissues being segregated. In practice, a user
may spend a lot of time dealing with fills that go too far in one area and not
far enough in another. Figure 2.1 also shows boundary pixels are likely to be
problematic using region growing.
In 3D, seeded region growing becomes even more difficult because of the vi-
sualization. Additional seeds to correct one slice may negatively impact another
slice without the user being immediately aware. Schiemann et al. (1997) have
been able to use this segmentation method in 3D for several disparately-colored

structures in the Visible Human, but their method cannot easily discriminate
adjacent, similar structures such as muscle against muscle.
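The fill behavior described in this section can be illustrated with a minimal sketch. It assumes a 2D grayscale image stored as a list of rows, 4-connected neighbors, and a similarity test against the mean seed intensity; the function name `region_grow` and the `tolerance` parameter are hypothetical and are not taken from any of the cited methods.

```python
from collections import deque

def region_grow(image, seeds, tolerance):
    """Grow a region from seed pixels (x, y) on a 2D grayscale image.

    A 4-connected neighbor joins the region when its intensity is within
    `tolerance` of the mean seed intensity (an assumed similarity test).
    """
    height, width = len(image), len(image[0])
    seed_mean = sum(image[y][x] for x, y in seeds) / len(seeds)
    region = set(seeds)
    queue = deque(seeds)
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and (nx, ny) not in region:
                if abs(image[ny][nx] - seed_mean) <= tolerance:
                    region.add((nx, ny))
                    queue.append((nx, ny))
    return region

# Dark structure on the left, bright tissue on the right, one ambiguous pixel.
image = [
    [10, 11, 50, 52],
    [12, 10, 51, 49],
    [11, 30, 50, 50],
]
grown = region_grow(image, seeds=[(0, 0)], tolerance=5)
```

The ambiguous pixel with intensity 30 is left out at this tolerance and would leak the fill into the bright tissue at a much larger one, which is the "too far in one area and not far enough in another" trade-off noted above.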
2.2 Snakes and active contour models
Snakes (in 3D, balloons (Cohen, 1991)) and active contour models (Kass
et al., 1988; Caselles et al., 1997; Nguyen et al., 2003; Liang et al., 2006) begin
with some loose contour segmentation close to an object's border. When the
gradient of the underlying image is described as an energy landscape (discussed
below), and the bendability of the contour is also determined by energetic forces,
active contour models iteratively seek to minimize the total energy of the con-
tour. The result is that the initial contour will expand and contract to an image
edge while maintaining some degree of curvature set by the user.
The underlying energy landscape can perhaps be best imagined as a topo-
graphical 2D image where low energy regions are black and high energy regions
are white. The active contour seeks to navigate between low-energy regions as
much as possible, thereby minimizing total energy. Snakes also employ user-defined
costs, that is, measures such as deviation from a predefined path or shape as
well as the magnitude of curvature at points along the contour. In this way,
snakes are constrained and do not meander too far from the initial segmenta-
tion. This, however, can also make their output too smooth.
The energy landscape can be described as the composite of any number of
image and external features. One could employ texture comparisons through
the use of various texture representations such as textons (Julesz, 1981, 1986)
or responses to filterbanks (Malik and Perona, 1990; Leung and Malik, 2001;
Forsyth and Ponce, 2003), letting low-energy regions separate textures. One

could also employ edge detection such as the Canny (Canny, 1986) or Lapla-
cian of Gaussian (LoG) filter (Marr and Hildreth, 1980), and let the energy be
determined by the inverse magnitude of the edge. The energy landscape may
also include non-image features like the expected shape of the object. (These
techniques are described in more detail in Appendix A).
Regarding the use of snakes in 3D, an initial segmentation on one slice can
serve as the seeded contour to an adjacent slice, also incorporating information
from the previous slice to guide the snake's evolution on the new slice. This
potentially allows the user to make one rough segmentation in one slice and
then to let the process propagate through several other slices. If the object does
not change too much between slices, then this 2D serial processing can realize
a 3D segmentation. Alternatively, if the structure being segmented can map to
an existing graphical model, say from some gold standard, then the 3D contour
(balloon) can be provided as input along with the 3D energy landscape obtained
from image features. The balloon will navigate the 3D energy landscape while
not deviating too far from the existing graphical model that is serving as a
Unfortunately, it is difficult to predict the behavior of snakes. As they
evolve, they settle on a local minimum, and may, therefore, provide different
results given the same initial contour. Furthermore, once the snake settles on a
boundary, its contour is not easily or intuitively editable unless it is employed
with hard constraints such as B-splines (Menet et al., 1990). The addition of
hard constraints adds more complexity to the segmentation process, especially
in 3D, where it may be difficult to navigate and edit 3D control points or other

parameters that dictate the hard constraints.
2.3 Snap-to-edge techniques
Another interactive alternative is the snap-to-edge technique known as live-
wire, intelligent scissors, or magnetic lasso (Mortensen and Barrett, 1995, 1998;
Falcao and Udupa, 2000). An edge in this context is similar to following
energy gradients used in snakes, but live wire finds a lowest energy path between
discrete points, not a continuous contour as with snakes.
To make a live wire, the user selects a seed pixel (the first control point),
usually at an object's boundary, and as he/she moves the mouse, a minimal
energy path following the energy gradient is drawn to the current mouse location.
When the user clicks again, another control point is placed at that location
and the minimum path between the two control points is drawn and locked.
Additionally, the last control point becomes the seed for a new live wire segment.
Proceeding in this way, a user can quickly trace the outline of an object with
possibly few control points. Importantly, this method allows control points to
be intuitively inserted, deleted, or moved at any point during the segmentation.
The principal fault with intelligent scissors is that they are too reliant on the
edge. Additionally, there is zero-order continuity between segments on either
side of a control point meaning that adjacent sections of curve are not even
tangent at the intersecting control point. Therefore, robust energy gradients
may have to be acquired and possibly other heuristics in the method itself may
have to be employed to make the tool behave as desired.

United snakes (Liang et al., 2006) provide a nice balance between intelligent
scissors and snakes. Users employ a segmentation with intelligent scissors that is
used as the initial contour to a snake. This serves to smooth the response from
the intelligent scissors segmentation. Perhaps more importantly, the control
points inserted by the user remain as hard constraints to the snake, allowing
the user to add, delete, or move control points, effectively keeping the snake
intuitively editable.
United snakes are not immediately translated into 3D, however. It is unrea-
sonable for the user to insert control points on every slice. In theory, one could
provide a 3D mesh with control points on sparse slices for a united balloon and
maintain the mesh's control points as hard constraints, but this has not yet been
Schenk et al. (2000) demonstrate a 3D method where users trace an outline
with a live wire on one slice, followed by another outline n slices away. Then,
the outline of the slices in between is determined by following the shortest path
through the combined interpolated shape and edge gradients.
One problem with this approach is that if the interpolation between slices
is linear, then slices in the orthogonal plane will have zero-order connectivity.
If another interpolation technique is used, their method does not address how
correspondence between control points is established and maintained to take
into account objects that twist. Additionally, this method requires live-wire-
only segmentations on the top and bottom slices when live-wire may not be the
ideal tracing tool.

2.4 Geometric modeling and drawing
It may actually be desirable to have disparate border pixels as part of the
segmented object. For example, the user may want the fat that is on the surface
and internal to a muscle to be included with the structure. In this case, a snap-
to-edge or snake technique may be inappropriate because it will place divots on
the object's surface as it segregates the fat from the rest of the muscle. In other
cases, the edge may not be distinguishable when adjacent tissues are visually
homogeneous as in muscle against muscle, fat against fat, or between brain areas.
In these cases, 2D or 3D geometric models or pixel painting are used. The user
must rely on some visual cues, but mostly applies his/her anatomical knowledge
to the segmentation.
Indeed, a user can apply one of the previously mentioned segmentation
methods to perform a first-pass segmentation and use a paint tool to correct
misclassified pixels, but this interrupts the building and re-editing of the 3D
model. For example, the user may apply a snake in the first pass and then
hand-draw additional pixels that were missed. If it is then recognized that the
first pass was in error, then the correction of the hand-drawn pixels might also
be difficult to undo.
A simple 2D geometric model is a spline. Like live-wire, the user can in-
tuitively draw an outline by assigning a few control points on a slice and let
a curve interpolate between them. Control points are easily moved, inserted,
and deleted, and the tool has predictable behavior. The principal problem with

splines is that they may be too curved, requiring several control points to accu-
rately follow an edge.
In 3D, a spline is a mesh. Like 2D splines, meshes may have the problem of
being too smooth unless they have a large number of control points. Addition-
ally, navigating and editing the mesh in 3D and understanding correspondence
between control points can be difficult.
de Bruin et al. (2005) present a 3D mesh editing method that maintains
correspondence between control points by constraining control points to lie on
2 orthogonal planes. This method is not immediately intuitive, especially on
objects that twist, and may require extensive training to understand imposed
constraints and 3D navigation to be used effectively. Additionally, like other
geometric modeling techniques, its surface may need several control points to
provide adequate segmentation detail.
2.5 Background summary
The border pixels of an object will often be problematic to segment by most
automatic methods. Additionally, a given semi-automatic method works well for
one structure, but not another, or more typically, performs well in some region
on a structure, but fails in another region. Manual paint methods are extraor-
dinarily laborious and interpolating geometric methods are subject to being too
smooth. Editable techniques where a user can undo/redo and otherwise alter
the ongoing segmentation are seldom realized, especially when the desired change
is to be local, only affecting the region of interest and not other areas where the
segmentation decision has already been made. Additionally, in 3D, navigation
and understanding the current segmentation task can be difficult for the user.

If the pseudo-3D approach is taken to help negate this, the segmentation will
be zero-order continuous in the Z-plane.
I have devised a generic method that incorporates some of the advances
just outlined while addressing some of their shortcomings. This method is the
construction of a 3D cubic mesh, with the single constraint that a row of control
points lie on one slice in the plane. This constraint is intuitive, enables sim-
ple navigation, and provides automated insertion of control points that often
only need slight adjustments. Additionally, this provides true 3D segmentation,
maintaining consistency in the orthogonal plane and continuous, simple editing.
Additionally, I blend geometric models with snap-to-edge techniques allowing
the user to choose one or the other technique, or most likely, some combina-
tion, creating contours that are neither too smooth, nor too jagged, in a very
predictable tool.

3. Methods
The segmentation method can be divided into two independent parts: Edit-
ing with the 3D geometric mesh data structure, and optionally employing
blended snap-to-edge.
3.1 Editing with the 3D geometric mesh data structure
Tight coupling of a 3D object's data structure with the user interface is
requisite for a well-designed editable 3D surface. I therefore present, with each
component of the design, how the user interface capitalizes on it.
3.1.1 Cubic mesh
I have employed cubic meshes to take advantage of their simplicity during
user operation. Additionally, cubic meshes provide second-order continuity so
curves on either side of a control point will have the same tangent and rate of
change at that point.
3.1.2 Catmull-Rom interpolation
I employ Catmull-Rom interpolation with a default value of 0 for the tension
making its typical operation a cardinal mesh. (See Appendix B for more detail).
Therefore, any given patch of the mesh surface will be bounded by 4 control
points and use 16 control points for its interpolation (Figure 3.1).
Catmull-Rom interpolation offers several advantages over other interpola-
tion methods. For one, all control points of Catmull-Rom splines are placed

Figure 3.1: Portion of a cubic mesh. With Catmull-Rom interpolation, the
interpolated patch (gray) is bounded by 4 control points but is computed using the
locations of the 16 control points surrounding the patch.
directly on the surface. This affords an intuitive interface allowing users to sim-
ply demarcate a border. Some splines such as Bezier curves anchor some control
points on the curve and other control points off of the curve. This is not very
intuitive and may also be impractical in some situations as it may be necessary
to place a control point beyond the space that is visible or usable on the user's screen.
Secondly, Catmull-Rom interpolation offers piecewise continuity so that edit-
ing one control point of the spline only affects the curve locally. Specifically, the
curve affected will be within two control points in any direction of a control
point being moved as shown in Figure 3.2. This makes it so that segmentation
decisions made in one location are unaffected by new editing decisions. Addi-
tionally, this permits robust, real-time editing interactions as only a portion of
the mesh has to be calculated as a user moves and positions individual control points.

Figure 3.2: Example of local editing. Two segments on either side of the
control point that is moved are affected by the move.
Thirdly, an advantage of Catmull-Rom splines over, say, Hermite
splines, which also exhibit local editing, is that the tangents at the endpoints of a
Catmull-Rom segment do not have to be given explicitly. The value of the slope
at a control point is automatically determined by the coordinates of its two
neighboring control points (Figure 3.3). This saves the user time and needless
confusion which would otherwise occur with further editing of spline and control
point parameters.
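The three properties above (control points on the curve, local editing, and tangents taken from neighboring points) can be checked with a short sketch of a closed cardinal curve in 2D. The blending polynomials follow the cardinal-spline form given in Hearn and Baker (1997), with s = (1 - tension)/2; the function names and the eight-point test ring are assumptions made for illustration and this is not the thesis implementation.

```python
import math

def cardinal_point(p_prev, p_k, p_next, p_next2, u, tension=0.0):
    """Point at parameter u in [0, 1] on the segment from p_k to p_next."""
    s = (1.0 - tension) / 2.0
    b0 = -s * u**3 + 2 * s * u**2 - s * u
    b1 = (2 - s) * u**3 + (s - 3) * u**2 + 1
    b2 = (s - 2) * u**3 + (3 - 2 * s) * u**2 + s * u
    b3 = s * u**3 - s * u**2
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p_prev, p_k, p_next, p_next2))

def closed_curve(points, samples_per_segment=4, tension=0.0):
    """Sample a closed curve; segment k uses points k-1, k, k+1, k+2 (wrapping)."""
    n = len(points)
    curve = []
    for k in range(n):
        quad = [points[(k + offset) % n] for offset in (-1, 0, 1, 2)]
        for i in range(samples_per_segment):
            curve.append(cardinal_point(*quad, u=i / samples_per_segment,
                                        tension=tension))
    return curve

# Eight control points on a unit circle, then the same ring with point 0 moved.
ring = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
        for k in range(8)]
before = closed_curve(ring)
moved = [(1.5, 0.2)] + ring[1:]
after = closed_curve(moved)
```

With four samples per segment, samples 8 through 23 belong to segments 2 through 5, none of which uses control point 0, so they are identical in `before` and `after`. Only the two segments on either side of the moved point change.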
3.1.3 Mesh data structure
Because each interpolated surface patch is bounded by 4 control points,
each row, i, of the mesh will have the same number of control points. Therefore,
the mesh forms a closed cylinder such that for n control points on a row, the
nth control point will be adjacent to the first control point. Furthermore, for
all control points in the mesh on row i, I require them to reside on the same
orthogonal plane in the volume, i.e. a slice in the Z-plane. This permits a more

Figure 3.3: After Figure 10-30 in Hearn and Baker (1997). Illustration of the
Catmull-Rom spline automatically determining the tangent vectors at a control
point and the segment being calculated (black). The tangent vector for $p_k$ is
determined by the chord $p_{k-1}p_{k+1}$ and the tangent vector for $p_{k+1}$ is determined
by the chord $p_k p_{k+2}$.
intuitive interface and rapid segmentation because the user interacts with a slice,
seeing all control points for that row in the mesh. If control points of a mesh
row were not constrained to the same slice, then the user would see sporadic
control points as he/she navigates through slices of the volume which would
make it difficult to understand the model under construction and to maintain
correspondence of the control points. Indeed, when the jth column control point
is uniquely colored and placed on a landmark, maintenance of control point
correspondence across slices can easily account for object twisting, translation,
and shrinking/expanding.
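The two constraints just described (every row holds the same number of control points, and each row lives on a single slice and closes on itself) can be captured in a very small container. The sketch below is only illustrative: the class name `CylinderMesh`, its methods, and the choice of a dictionary keyed by slice index are all assumptions, not the data structure used in the application.

```python
class CylinderMesh:
    """Rows of 2D control points, one row per slice index z."""

    def __init__(self):
        self.rows = {}  # slice index z -> list of (x, y) control points

    @property
    def columns(self):
        """Number of control points per row (0 for an empty mesh)."""
        return len(next(iter(self.rows.values()))) if self.rows else 0

    def add_row(self, z, points):
        """Add a row on slice z; every row must have the same length."""
        if self.rows and len(points) != self.columns:
            raise ValueError("every row needs the same number of control points")
        self.rows[z] = list(points)

    def neighbors_on_row(self, z, j):
        """Previous and next control point of column j, wrapping around the row."""
        row = self.rows[z]
        n = len(row)
        return row[(j - 1) % n], row[(j + 1) % n]

mesh = CylinderMesh()
mesh.add_row(640, [(0, 0), (4, 0), (4, 4), (0, 4)])
mesh.add_row(650, [(1, 1), (5, 1), (5, 5), (1, 5)])
wrapped = mesh.neighbors_on_row(640, 0)  # last and second point of the row
```

Because column j means the same landmark on every row, correspondence across slices is a simple index lookup rather than a search.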
Constraining a mesh row so that its control points all reside on the same
slice also allows for intuitive insertion of control points. Since each row i must
have the same number of control points, when one control point is inserted on
one slice, the method can automatically insert and place control points on all
other rows of the mesh and slices in the volume with minimal alteration to the
existing mesh. I accomplish this by considering where along the curve between
two pre-existing control points, j and j + 1, a newly-inserted control point j' is

(a) A portion of slice 640 of the Visible Human. White circle is where a control point will be inserted.
(b) A portion of slice 650, the next row of the mesh that contains control points. White circle is where a control point will be automatically inserted when the user inserts a control point on slice 640.
(c) Response after the user has inserted a control point on slice 640.
(d) Automatic control point insertion and response on slice 650 after the user has inserted a control point on slice 640.
Figure 3.4: Automatic insertion of control points on other rows of the mesh.

placed. Then for each of the other slices, I find the existing contour between
j and j + 1 and insert a new control point, j', along the contour applying the
same ratio. This is shown in Figure 3.4.
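A minimal sketch of this ratio-based placement follows. It assumes the contour between control points j and j + 1 is available as a polyline of sampled points and that the ratio is measured by arc length; the text does not say whether arc length or the spline parameter is used, so that choice, like the function names, is an assumption.

```python
import math

def ratio_of_index(contour, index):
    """Fraction of the polyline's arc length that lies before contour[index]."""
    lengths = [math.dist(a, b) for a, b in zip(contour, contour[1:])]
    return sum(lengths[:index]) / sum(lengths)

def point_at_ratio(contour, ratio):
    """Point located at `ratio` (0..1) of the arc length along a polyline."""
    lengths = [math.dist(a, b) for a, b in zip(contour, contour[1:])]
    target = ratio * sum(lengths)
    for (a, b), seg in zip(zip(contour, contour[1:]), lengths):
        if seg > 0 and target <= seg:
            t = target / seg
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= seg
    return contour[-1]

# Contour between j and j + 1 on the slice being edited; the user inserts
# the new control point at the sample with index 1.
edited_slice = [(0, 0), (3, 0), (3, 1)]
ratio = ratio_of_index(edited_slice, 1)          # 3 of 4 length units

# Contour between j and j + 1 on another row of the mesh.
other_slice = [(0, 0), (4, 0), (4, 4)]
auto_point = point_at_ratio(other_slice, ratio)  # 6 of 8 length units
```

The automatically inserted point lands on the existing contour of the other slice, so the mesh there is altered as little as possible until the user chooses to adjust it.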
Similarly, it is also easy to insert new rows into the mesh. Users typically
begin segmenting with a mesh of four rows and few control points that loosely
fit the structure being segmented. They then traverse slices and where the mesh
deviates strongly, they insert a new row of control points. The position of the
newly inserted control points falls on their locations in the existing interpolated
mesh as shown in Figure 3.5. That is, even though control points are placed
along the full contour, the slice's contour with control points is the same as the
pre-existing interpolated contour. This permits the user to only modify those
control points in regions that need editing. Therefore, even if a mesh might
contain several control points, potentially few are actually adjusted on a given slice.
My method is, however, prone to gathering several control points on a slice.
If this impedes the segmentation, a user can make a new mesh and continue
segmenting the structure with a new mesh. In fact, this is how I also deal with
branching of an object. Therefore, the segmentation of an object may be the
concatenation of several meshes.
The automatic insertion and deletion of control points, however, does al-
ter the mesh, albeit locally, for all vertical slices. Since the tangent through a
control point as defined by Catmull-Rom interpolation is defined by adjacent
control points, the insertion or deletion of a control point changes neighbors and
therefore alters the tangent through affected control points. In practice, this has

(a) Before insertion of control points
(b) After insertion of control points.
Figure 3.5: Example of inserting control points on a slice.
not proved to be a problem because the user's alteration to the pre-existing
surface tends to be slight, and therefore changes in tangent values through control
points are also slight.
3.2 Mesh with snap-to-edge
One way to avoid too many control points in the construction of a mesh is to
employ a means of delivering a contour the user would normally choose. The sim-
plest method is a snap-to-edge technique. Current snap-to-edge techniques also
follow the same effective interface of inserting/deleting/moving control points
as I employ in the mesh construction. Therefore, enabling mesh construction
with snap-to-edge techniques is a logical extension.
While my technique can be applied to any combination of edge filters and
feature measures, I establish an edge gradient as the combination of two paired

3D Gaussian gradient magnitude filters ($\{f_{G1h}, f_{G1v}\}$ and $\{f_{G2h}, f_{G2v}\}$, where $h$
denotes the filter to obtain the horizontal gradient magnitude and $v$ denotes the
vertical filter response) and two zero-crossing 3D Laplacian of Gaussian (LoG)
filters ($f_{Z1}$ and $f_{Z2}$) (Marr and Hildreth, 1980). Each pair of filters only deviates
by their sigma values defined by the user. The user can also modify the relative
contributions of each of the LoG filters and the LoG filters to the total gradient
magnitude output.
The gradient magnitude, $f_{Gk}$, is equal to the maximum output of $f_{G1k}$ or
$f_{G2k}$ at a given pixel, $k \in \{h, v\}$. That is, for pixel $(x, y, z)$,
$$f_{Gk}(x,y,z) = \max(f_{G1k}(x,y,z), f_{G2k}(x,y,z)). \quad (3.1)$$
Figure 3.6b provides an example of output of the 3D filter, $f_{Gh}$, on a slice.
A symmetric, 3D Gaussian filtering function is
$$g(x, y, z) = \frac{1}{(2\pi\sigma)^{3/2}} \exp[-(x^2 + y^2 + z^2)/2\sigma^2]. \quad (3.2)$$
When used in 3D image filtering, the parameters $x$, $y$, and $z$ specify the distance
from the current pixel being evaluated.
I determine the magnitude of the image gradient on the XY plane through
the standard technique of convolving the image with the first derivative of Equa-
tion 3.2 with respect to x to acquire the vertical gradient magnitude and then
with respect to y to obtain the horizontal gradient magnitude. For example,
$$f_{Giv} = \frac{-x}{(2\pi)^{3/2}\sigma_i^{7/2}} \exp[-(x^2 + y^2 + z^2)/2\sigma_i^2] \quad (3.3)$$

and the first derivative with respect to y is similarly defined where i indicates
the gradient filter index (i.e, 1 or 2).
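To make the filter construction concrete, the following sketch samples the x-derivative of a 3D Gaussian on a small integer grid and applies the per-pixel maximum of Equation 3.1 to two responses. The normalization constant is taken to be 1/(2πσ)^{3/2}, which is an assumption about Equation 3.2; it only scales the response and does not move edges. All function names are hypothetical, and the convolution with the image is left out.

```python
import math

def gaussian(x, y, z, sigma):
    """Symmetric 3D Gaussian; normalization assumed to be 1/(2*pi*sigma)^(3/2)."""
    norm = 1.0 / (2.0 * math.pi * sigma) ** 1.5
    return norm * math.exp(-(x * x + y * y + z * z) / (2.0 * sigma * sigma))

def gaussian_dx(x, y, z, sigma):
    """First derivative of the Gaussian with respect to x."""
    return -x / (sigma * sigma) * gaussian(x, y, z, sigma)

def dx_kernel(sigma, radius):
    """Derivative-of-Gaussian kernel sampled at integer offsets from the pixel."""
    offsets = range(-radius, radius + 1)
    return {(x, y, z): gaussian_dx(x, y, z, sigma)
            for z in offsets for y in offsets for x in offsets}

def combined_response(response_1, response_2):
    """Per-pixel maximum of two filter responses, as in Equation 3.1."""
    return [max(a, b) for a, b in zip(response_1, response_2)]

kernel = dx_kernel(sigma=1.5, radius=2)
combined = combined_response([0.1, 0.9, 0.3], [0.4, 0.2, 0.3])
```

The kernel is antisymmetric in x and zero on the plane x = 0, so it responds to intensity changes along x while the Gaussian smooths along y and z, which is what gives each slice some awareness of its neighbors.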
With each LoG filter, I define an edge as the zero crossing. To better
discriminate strong and weak edges from the LoG filters output, I also assign
each zero-crossing edge a weight with the following function.
$$m_i = \sqrt{\max(\nabla^2 f_i \cdot G) - \min(\nabla^2 f_i \cdot G)} \quad (3.4)$$
where $m_i$ is the magnitude of the edge $i$ and $\nabla^2 f_i$ is the output of the LoG filter
about edge $i$. 3D neighboring pixels of the edge are multiplied by $G$, a 3D
Gaussian with $\sigma = 2\sigma_{filter}$. That is, the sigma used for weighing the dynamic
range with a Gaussian distribution is equal to two times the sigma used in the
LoG filter. Figures 3.6c and 3.6d show examples of zero-crossings with weighted edges.
3.2.1 Shortest path searching
As with other snap-to-edge techniques, I set up the problem of boundary
finding as a graph search. Since these algorithms seek to find a minimum cost
path, I invert filter responses (i.e., $\max(f_{output}) - f_{output}$, where $f_{output}$ is the
output of the gradient magnitude or LoG filter) to serve as arc weights. Unlike
Mortensen and Barrett (1995), who set up pixels as nodes in the graph and
assign a weighted arc between a pixel and its neighbors, I formulate my method
like Udupa et al. (1992) where nodes are the corners of the pixels and the
boundaries between pixels are the arcs. This makes it so that the optimal path on a
slice plotted between control points via Dijkstra's shortest paths algorithm will

(b) Horizontal gradient magnitude.
(c) Response from LoG filter with edge weights, $\sigma = 1.0$.
(d) Response from LoG filter with edge weights, $\sigma = 2.0$.
Figure 3.6: Responses from filters.

Figure 3.7: Example of the graph structure used for finding an optimal path
between two pixels (gray). Control points are designated as a full pixel. A path
follows the edges between the start and end pixels, finding the shortest path
from one of the four nodes about the starting pixel to one of the four nodes
about the ending pixel.
stair-step horizontal and vertical edges between pixels. (See Figure 3.7).
For the path between two control points, I therefore queue the four nodes
about the starting pixel and terminate upon reaching the first of the four nodes
about the next control point. Continuing in this fashion, I chain sections of the
complete path with control points as bridges.
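The search just described can be sketched on a toy slice. Nodes are pixel corners, arcs are the boundaries between pixels, and arc weights are inverted edge strengths. Several details here are assumptions made to keep the example small: edge strength is the absolute intensity difference of the two pixels an arc separates (a stand-in for the gradient and LoG filter outputs), a small `step_cost` is added per arc to break ties, and all names are hypothetical.

```python
import heapq

def corners(pixel):
    """The four corner nodes around pixel (x, y)."""
    x, y = pixel
    return [(x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)]

def neighbors(node, width, height):
    """Corner nodes one unit step away that are still on the slice."""
    x, y = node
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx <= width and 0 <= ny <= height:
            yield (nx, ny)

def arc_strength(image, a, b):
    """Assumed edge strength: difference of the two pixels the arc separates."""
    height, width = len(image), len(image[0])
    (ax, ay), (bx, by) = sorted((a, b))
    if ay == by:   # horizontal arc: pixel above and pixel below
        p, q = (ax, ay - 1), (ax, ay)
    else:          # vertical arc: pixel to the left and pixel to the right
        p, q = (ax - 1, ay), (ax, ay)
    for x, y in (p, q):
        if not (0 <= x < width and 0 <= y < height):
            return 0.0
    return abs(image[p[1]][p[0]] - image[q[1]][q[0]])

def live_wire(image, start_pixel, end_pixel, step_cost=0.01):
    """Dijkstra search from the corners of one control point to the next."""
    height, width = len(image), len(image[0])
    nodes = [(x, y) for y in range(height + 1) for x in range(width + 1)]
    peak = max(arc_strength(image, a, b)
               for a in nodes for b in neighbors(a, width, height))
    targets = set(corners(end_pixel))
    dist = {c: 0.0 for c in corners(start_pixel)}  # queue all four start corners
    prev = {}
    heap = [(0.0, c) for c in dist]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        if node in targets:  # stop at the first corner of the end pixel reached
            path = [node]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return path[::-1]
        for nxt in neighbors(node, width, height):
            nd = d + (peak - arc_strength(image, node, nxt)) + step_cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return []

# A vertical intensity edge between columns 1 and 2.
image = [[0, 0, 100, 100] for _ in range(4)]
path = live_wire(image, start_pixel=(1, 0), end_pixel=(1, 3))
```

On this image the path runs straight down the corner column x = 2, which is the boundary between the dark and bright pixels. Chaining such paths from control point to control point gives the closed contour for a slice.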
I also perform this boundary detection in 2D, for each slice in the XY-plane,
and I maintain consistency in the Z-plane because our gradient magnitude and
LoG filters are 3D. While it is theoretically possible to have abrupt changes in
the Z-plane as an adjacent slice might find a deviating optimal path, this has
not been experienced.
On slices without control points, I assign soft control points. These
are positions on interpolated slices where for a patch bounded by points
$p_{i,j}$, $p_{i,j+1}$, $p_{i+1,j}$, and $p_{i+1,j+1}$, and for a particular slice in Z, I take the pixel

corresponding to the left-most part of the patch and call it control point j. (See
Figure 3.1). After all soft control points have been assigned to the slice, I then
perform the shortest paths algorithm as before.
3.2.2 Inside and outside
Because this stair-stepping method of tracing between control points de-
termines the boundary between pixels, it is not immediately determined which
pixels are inside the boundary and part of the object being segmented and
which pixels are outside. This can be determined if the user always traced
clockwise or counterclockwise. I employ the following heuristic. I walk the con-
tour clockwise and counterclockwise and count the number of unique boundary
pixels that would be assigned in both cases. Assuming an outside border will
yield a larger circumference, I choose the direction that yields the smallest num-
ber of pixels. At elbows where I walk from a horizontal to a vertical edge or
vice versa, I also assign diagonal pixels to help ensure that the outside edge will
be assigned more pixels than the inside edge. If this heuristic is still wrong, the
user can explicitly set the mode as clockwise or counterclockwise.
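The counting heuristic can be sketched as follows, assuming the boundary is a closed list of pixel-corner nodes joined by unit steps, with y increasing downward as on screen. The extra diagonal pixels assigned at elbows are omitted here for brevity, and the function names are hypothetical.

```python
def side_pixels(boundary):
    """Collect the unique pixels touching each side of a closed corner path.

    Side A is the interior when the path is walked clockwise on screen;
    side B is the interior when it is walked counterclockwise.
    """
    side_a, side_b = set(), set()
    n = len(boundary)
    for i in range(n):
        (x, y), (nx, ny) = boundary[i], boundary[(i + 1) % n]
        step = (nx - x, ny - y)
        if step == (1, 0):      # moving right: A below, B above
            a, b = (x, y), (x, y - 1)
        elif step == (-1, 0):   # moving left: A above, B below
            a, b = (x - 1, y - 1), (x - 1, y)
        elif step == (0, 1):    # moving down: A to the left, B to the right
            a, b = (x - 1, y), (x, y)
        elif step == (0, -1):   # moving up: A to the right, B to the left
            a, b = (x, y - 1), (x - 1, y - 1)
        else:
            raise ValueError("boundary must use unit horizontal/vertical steps")
        side_a.add(a)
        side_b.add(b)
    return side_a, side_b

def inside_boundary_pixels(boundary):
    """Pick the side with fewer unique pixels as the inside of the contour."""
    side_a, side_b = side_pixels(boundary)
    return side_a if len(side_a) <= len(side_b) else side_b

# Boundary of a 3 x 2 pixel rectangle, walked clockwise on screen.
rect = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1),
        (3, 2), (2, 2), (1, 2), (0, 2), (0, 1)]
inner = inside_boundary_pixels(rect)
inner_reversed = inside_boundary_pixels(rect[::-1])
```

For the rectangle, one side touches 6 unique pixels and the other 10, so the same six interior pixels are chosen whichever direction the contour is walked.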
3.2.3 Blending shape and snap-to-edge
Blending the geometric shape as obtained with Catmull-Rom interpolation
with edges inherent in the image is performed by capturing edges of the geomet-
ric shape and making a gradient of them. Then, this shape gradient is coupled
with the image gradient as a single gradient that is fed to the shortest paths algorithm.
I make a shape gradient by first capturing edges of the mesh. My method
assumes that the pixels in the pixel-wide contour made by the mesh are inside.
I therefore walk this contour to capture horizontal and vertical edges
that segregate inside and outside pixels.
To illustrate the shape gradient, imagine for the moment that there is no
image gradient. Additionally, imagine the shape gradient is rather binary where
horizontal and vertical edges of the geometric shape give rise to arcs in the graph
with a near-zero weight of 0.1 and all other arcs in the graph are given a weight
of 1.0. If the shortest-paths algorithm is performed on this graph, it will give
rise to the exact same shape as the geometrically interpolated contour. Figure
3.8a shows the horizontal edges of such a binary gradient.
To make this binary response actually more of a gradient, I blur the hori-
zontal and vertical arcs by assigning all arcs of the graph the minimum value of
inverted Gaussians (tails equal 1.0, center equals 0.0) centered about all edges:
$$a_w = \min_{e \in S_k} G_e(a), \quad a \in A_k$$
where $k \in \{h, v\}$ denotes horizontal or vertical edges. $A_k$ is the set of all arcs in
the horizontal or vertical graph and $w$ is the weight associated with the arc $a$.
$S_k$ is the set of horizontal or vertical edges obtained from walking the geometric
shape. $G_e$ is the inverted Gaussian distribution centered at edge $e$.
Figure 3.8b provides an example of a horizontal blurred shape gradient with-
out any blending with the edge gradient. Additionally, Figure 3.10b illustrates
a blending of both the shape and edge gradients while Figure 3.10d shows the
composite edge gradient (response to the gradient magnitude and slight addition
of the LoG filters) without any shape gradient.
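A small sketch may help relate the shape gradient to the blending percentages shown in Figures 3.9 and 3.10. The shape weight of an arc is the minimum over all shape edges of an inverted Gaussian of the distance to that edge. The blend itself is written as a linear mix controlled by a snap-to-edge fraction; the text does not give the blending formula, so the linear form, the use of arc midpoints as positions, and all names are assumptions.

```python
import math

def shape_weight(arc_position, shape_edges, sigma):
    """Minimum of inverted Gaussians (0.0 at an edge, 1.0 in the tails)."""
    return min(
        1.0 - math.exp(-math.dist(arc_position, edge) ** 2 / (2.0 * sigma ** 2))
        for edge in shape_edges
    )

def blended_weight(image_weight, shape_w, snap_fraction):
    """Assumed linear blend: snap_fraction of image cost, the rest shape cost."""
    return snap_fraction * image_weight + (1.0 - snap_fraction) * shape_w

# Midpoints of two shape edges taken from the interpolated mesh contour.
shape_edges = [(2.5, 3.0), (3.5, 3.0)]

on_edge = shape_weight((2.5, 3.0), shape_edges, sigma=1.0)     # 0.0
far_away = shape_weight((20.0, 20.0), shape_edges, sigma=1.0)  # close to 1.0

# 25% snap to edge and 75% mesh, as in Figure 3.9c.
mixed = blended_weight(image_weight=0.2, shape_w=0.8, snap_fraction=0.25)
```

With `snap_fraction` at 0 the shortest path reproduces the interpolated contour, and at 1 it follows image edges alone; the user-set spread `sigma` controls how far the path may stray from the mesh before the shape cost saturates.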

(a) Horizontal edges of a binary gradient.
(b) Gaussian-blurred version of the image on the left.
Figure 3.8: The shape gradient is made by first capturing the horizontal and
vertical edges of the shape (horizontal edges shown here on the left) and then
blurring that image with a Gaussian filter. The user sets the spread of the
Gaussian filter.

(a) Brachioradialis segmentation with the mesh.
(b) Horizontal edge gradient of the interpolated mesh contour.
(c) Resulting segmentation with 25% snap
to edge and 75% mesh.
(d) Blended horizontal edge gradient
(both gradient magnitude filters and LoG
filters) at 25% with 75% interpolated mesh
contour gradient.
Figure 3.9: Response of the interpolated contour to various levels of blending.

(a) Resulting segmentation with 50% snap to edge and 50% mesh.
(b) Blended horizontal edge gradient (both gradient magnitude filters and LoG filters) at 50% with 50% interpolated mesh contour gradient.
(c) Resulting segmentation with 100% snap to edge and 0% mesh.
(d) Horizontal edge gradient at 100% and 0% interpolated mesh contour gradient.
Figure 3.10: More responses of the interpolated contour to various levels of blending.

4. Results
Out of 331 structures of the Visible Human segmented using this tool, the
median number of meshes per structure was 1, but two or more meshes for a
structure were also common (Figure 4.1a). The median number of control points
per mesh was 140 (Figure 4.1b) with a median of 9 control points on those slices
that contained control points (Figure 4.1c). The median number of slices of a
mesh that contain control points was 14.6 (Figure 4.1d) with a mean distance
of 7.4 slices between slices with control points (Figure 4.1e). The median range
for each mesh was 69 slices (Figure 4.1f).
For a subset of these meshes, we recorded editing time and the number of
edits performed for each mesh. The 4 meshes in this subset had a total of 1022 control
points and 526 edits. Therefore, about half of the control points were actually
moved from their default locations when users inserted a new row of control
points into the mesh. The median distance for a control point to be moved was
12 pixels, an easy mouse-click and drag for the user. The distribution of the
moved-distance is detailed in Figure 4.2.
To quantify the efficiency of the mesh tool compared to manual tracing, I
traced 10 disparate objects on a single slice. The results are described in Table
4.1. The mesh tool was always able to trace the object faster. The fastest
tracing was a 4x increase and the slowest was an improvement of 14%. The
mean ratio of mesh time:manual tracing time was 0.53, 1.87 times faster. The
median tracing ratio was similar at 0.57, 1.74 times faster. Combining these

(a) Count of the number of structures segmented (Y-axis) with the number of meshes used for the structure (X-axis).
(b) Count of the number of meshes (Y-axis) with the total number of control points used on the mesh (X-axis).
(c) Count of the number of meshes (Y-axis) with the number of control points per slice (on slices that contained control points) (X-axis).
(d) Count of the number of meshes (Y-axis) with the number of slices that contain control points (X-axis).
(e) Count of the number of meshes (Y-axis) with their mean distance between slices that contain control points (X-axis).
(f) Count of the number of meshes (Y-axis) with a particular slice range (X-axis).
Figure 4.1: Statistics for 331 tissues segmented on the Visible Human dataset.
Figure 4.2: Histogram of the number of control points (Y-axis) and
distance in pixels (X-axis) for moved control points.

ratios with the number of slices interpolated (median of 7.4) indicates that the
mesh tool is at least 7.4 × 1.74 ≈ 13 times faster than manual tracing. Of
course, this is the worst case. As mentioned before, users often spend a lot of
time going back and forth cleaning up slices in the Z-plane. This has not
been quantified, but from experience I can give the confident estimate that users
spend at least 25% more time cleaning up slices in the Z-plane. It should also be
noted that the tracings given in Table 4.1 were on a single slice starting without
any contour. In the context of a 3D mesh, half of the control points are actually
edited on interpolated slices. Therefore, the time to edit an interpolated slice
vs. a new slice should be half. Therefore, I quantify the speed increase of mesh
editing over manual tracing as 13 × 1.25 × 2 ≈ 32 times faster. Additionally,
these numbers are for the Visible Human dataset. For newer datasets that have
higher resolution, especially in the Z-plane, allowing users to jump many more
slices that will be interpolated, this method will serve to be even faster than
We also illustrate models built with and without this tool applied toward
the segmentation of three adjacent muscles of the forearm (Figure 3.10). These
structures were chosen because they have areas where their separation is not
visually discernable, along with parts that have weak and strong edges.
Segmentation of each of these three muscles using the mesh tool took less
than an hour. Each muscle was segmented with a single mesh. The brachioradi-
alis contains 384 control points with 16 control points per slice (Figure 3.9a). It
contains 24 slices with control points with a mean distance of 7.5 slices between
those slices with control points over a total range of 180 slices. 162 edits (inser-

Table 4.1: Comparison of time to trace a single slice with the mesh tool or manual tracing.
Object                                   Manual Trace Time   Mesh Trace Time   Mesh:Manual Ratio
L. Inferior Head of Lateral Pterygoid    3.5                 1                 0.29
L. Cerebellum                            10                  6                 0.6
R. Splenius Capitis                      4.5                 1.5               0.33
Sinus                                    2.5                 2                 0.8
Cavity of Connective Tissue              9.5                 7                 0.74
L. Semispinalis Capitis                  4                   1                 0.25
L. Vitreous                              2.5                 0.75              0.3
L. Hippocampus                           2                   1.75              0.875
L. Levator Scapulae                      2.75                1.5               0.55
Cavity of Laryngopharynx                 1.25                0.75              0.6
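
The ratios in Table 4.1, and the speed estimate built on them, can be checked with a few lines of Python. This is only a verification sketch: the variable names are illustrative, and the factors 7.4, 1.25, and 2 are the values stated in the text rather than anything measured here.

```python
from statistics import mean, median

# (object, manual trace time, mesh trace time), copied from Table 4.1.
TRACES = [
    ("L. Inferior Head of Lateral Pterygoid", 3.5, 1.0),
    ("L. Cerebellum", 10.0, 6.0),
    ("R. Splenius Capitis", 4.5, 1.5),
    ("Sinus", 2.5, 2.0),
    ("Cavity of Connective Tissue", 9.5, 7.0),
    ("L. Semispinalis Capitis", 4.0, 1.0),
    ("L. Vitreous", 2.5, 0.75),
    ("L. Hippocampus", 2.0, 1.75),
    ("L. Levator Scapulae", 2.75, 1.5),
    ("Cavity of Laryngopharynx", 1.25, 0.75),
]

# Mesh:manual ratio per object; values below 1 mean the mesh tool was faster.
ratios = [mesh / manual for _, manual, mesh in TRACES]
mean_ratio = mean(ratios)
median_ratio = median(ratios)

# Factors taken from the text (assumptions of the estimate, not measurements).
SLICES_INTERPOLATED = 7.4      # median spacing between slices with control points
CLEANUP_FACTOR = 1.25          # extra Z-plane cleanup time for manual tracing
INTERPOLATED_EDIT_FACTOR = 2   # only about half the control points need moving

single_slice_speedup = 1 / median_ratio
volume_speedup = SLICES_INTERPOLATED * single_slice_speedup
overall_speedup = volume_speedup * CLEANUP_FACTOR * INTERPOLATED_EDIT_FACTOR

print(f"mean ratio   {mean_ratio:.2f}")
print(f"median ratio {median_ratio:.2f}")
print(f"estimated overall speedup {overall_speedup:.1f}x")
```

Because the script uses unrounded ratios, its intermediate values differ slightly in the second decimal place from the rounded figures quoted in the text.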

Figure 4.3: Resulting segmentation of three muscles of the forearm in the Visible
Human Male: right brachioradialis, extensor carpi longus, and extensor carpi
brevis, using manual tracing, mesh segmentation, and mesh segmentation with
snap-to-edge. Panel columns: manual tracing, mesh-only, mesh with snap-to-edge.
Panel rows: brachioradialis, longus, brevis, collectively.

tions, deletions, or moves of control points) were performed on this mesh. At
58 minutes, this comes to about 20 seconds per slice. The extensor carpi longus
contains 288 control points with 18 control points per slice. It has 16 slices with
control points and a mean distance of 7.8 slices between those slices over a range of
125 slices. Performed in 53 minutes, this segmentation averaged 25 seconds per
slice. It used 124 edits. The extensor carpi brevis contains 240 control points
with 15 control points on each slice that has control points. 16 slices contain control
points over a range of 110 slices, giving it a mean of 6.9 slices between slices with control
points. This structure had 133 edits and took 56 minutes to perform, averaging
30 seconds per slice.
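
The per-muscle figures quoted above follow directly from four reported numbers per mesh. The sketch below recomputes them; the dictionary layout and function name are illustrative choices, and only the input numbers come from the text.

```python
# Figures reported in the text for the three forearm meshes.
MESHES = {
    "brachioradialis": {
        "points_per_slice": 16, "control_slices": 24,
        "slice_range": 180, "minutes": 58,
    },
    "extensor carpi longus": {
        "points_per_slice": 18, "control_slices": 16,
        "slice_range": 125, "minutes": 53,
    },
    "extensor carpi brevis": {
        "points_per_slice": 15, "control_slices": 16,
        "slice_range": 110, "minutes": 56,
    },
}


def summarize(mesh):
    """Derive control point count, slice spacing, and time per slice."""
    return {
        "control_points": mesh["points_per_slice"] * mesh["control_slices"],
        "mean_spacing": mesh["slice_range"] / mesh["control_slices"],
        "seconds_per_slice": mesh["minutes"] * 60 / mesh["slice_range"],
    }


summary = {name: summarize(mesh) for name, mesh in MESHES.items()}
for name, stats in summary.items():
    print(
        f"{name}: {stats['control_points']} control points, "
        f"{stats['mean_spacing']:.1f} slices apart, "
        f"{stats['seconds_per_slice']:.0f} s per slice"
    )
```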
As can be seen in Figure 4.3, especially with the segmentation of the longus,
different boundary decisions were made between the segmentors. Additionally,
comparing the hand segmentation with that of the mesh shows that the
hand segmentation is much coarser, especially between slices in the Z-plane.
Where boundary decisions are difficult, the user performing the hand segmen-
tation varied his or her decisions.
When blended snap-to-edge was used, only slight differences were observed.
This is largely because in areas where the edge was not strong, the contour
followed that of the interpolated mesh. Where the edge was strong, the user
had placed the mesh fairly accurately on the edge so it did not need to snap.

5. Discussion
Image segmentation is difficult, especially when precision at an object's
border is mandatory. No single border segmentation method is ideal for all cases,
but a tool built through a combination of methods can often outperform any of
its single methods operating alone. For example, Martin et al. (2002) showed
that brightness and texture segmentation can be combined to collectively
improve boundary detection over either method's single response.
It is important to remember that low-level boundary detection is not the
same as image segmentation. Frequently, a user still has to apply a method to
segregate areas where an edge or distinction in feature space was not detected or
is at odds with the human decision. Another interesting result from the report
by Martin et al. (2002) was that they showed that boundaries computed by the
segmentation method were never precise compared to human detail, nor could
they ever be, because different humans each had a slightly different classification
of the border pixels. Unsurprisingly, where humans and machines had the least
variability in segmentation decisions was between areas of both high textural
and brightness contrast. While this study was performed on natural images, its
results are pertinent to biomedical imaging. It highlights that tools too reliant
on image features, especially when image features are ill-defined due to noise and
homogeneity of tissues, will be inaccurate compared to the boundaries chosen by
human experts. Additionally, several human experts may be required to obtain
an accurate segmentation via consensus.

Interactive 3D segmentation is difficult primarily because of visualization
and difficulties interfacing with the user. Methods such as snakes and active
contour models (Kass et al., 1988; Caselles et al., 1997; Nguyen et al., 2003;
Liang et al., 2006) enable a pseudo-3D approach to segmentation by allowing
an edit on one slice to be used as input onto the adjacent slice. If the object
does not vary too much between slices, then the snake can frequently evolve
properly as it proceeds through slices, but no direct link is maintained between
slices, so the resulting stack of snakes may be unwieldy for further editing and
may not maintain consistency in the orthogonal plane. Balloons (Cohen, 1991)
offer a true 3D snake segmentation, but being able to edit them after they have
settled is impossible without assigning hard constraints, which may be difficult
to assign, visualize, and make intuitively editable to the user.
I have provided an intuitive, interactive, true 3D mesh segmentation method
that capitalizes on simple 3D geometric modeling techniques with the utiliza-
tion of image features. I have utilized this technique with segmentation tasks
of the Visible Human and other cryosectioned volumes at the Center for Hu-
man Simulation at the University of Colorado Health Sciences Center. I have
provided as illustration its ability to segment adjacent muscles of the forearm.
This tool is ideal for such structures because these structures are contiguous,
spherical/cylindrical blobs. Where my technique is perhaps inappropriate and
has difficulties, as do snakes, is with irregular and branching structures. Users
have, however, applied this technique to artery and vein segmentation where,
even though these structures branch, segments are rather cylindrical. In this
case, users segment such structures as a concatenation of several meshes.

I measure accuracy relative to a boundary chosen by human experts. This
remains a subjective assessment as the boundaries the Center for Human Simu-
lation is defining and refining become the gold standard for the Visible Human
data set. The accuracy of my approach, then, is directly tied to the efficiency of
the method: how easily users can attain consensus and edit each other's work.
Because the mesh is easily editable through the addition/deletion or translation
of control points, users can easily modify the existing segmentation. Further-
more, through the utility of incorporating image features, users can entrust the
underlying boundary detection to keep control points sparse.
When a control point is inserted on one slice, an additional control point is
inserted on all rows of the mesh. Therefore, my method is prone to gathering
control points. I assert, however, that many of these inserted control points are
unaltered, and when they are altered, they are only slightly adjusted. We have shown
that about 50% of control points are actually adjusted and that, when they are, they are
moved only about 12 pixels.
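
The way control points accumulate can be illustrated with a small sketch. It assumes, purely for illustration, that the mesh is stored as a list of rows, that each row is a closed contour of (x, y) points on one slice, and that the default position of an automatically inserted point is the midpoint of its two neighbors; the function name and the midpoint rule are hypothetical and are not taken from the tool itself.

```python
def insert_control_point(mesh, row, col, point):
    """Insert `point` at column `col` of `row`, plus a default point in every other row.

    Each row is a closed contour on one slice, stored as a list of (x, y) tuples.
    The default position (midpoint of the two neighbors) is an assumption.
    """
    for r, contour in enumerate(mesh):
        if r == row:
            new_point = point
        else:
            before = contour[col - 1]             # wraps around for col == 0
            after = contour[col % len(contour)]   # wraps around for col == len
            new_point = (
                (before[0] + after[0]) / 2,
                (before[1] + after[1]) / 2,
            )
        contour.insert(col, new_point)


mesh = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(0, 0), (20, 0), (20, 20), (0, 20)],
    [(0, 0), (30, 0), (30, 30), (0, 30)],
]
points_before = sum(len(contour) for contour in mesh)
insert_control_point(mesh, row=0, col=1, point=(5, -2))
points_after = sum(len(contour) for contour in mesh)

print(points_before, "->", points_after)
```

One user edit on one slice adds a point to every row, so column correspondence is preserved at the cost of extra points that the user may never touch.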
Another drawback with my approach is that the segmentation of an object
may involve the agglomeration of several stacked or adjacent meshes. For objects
that branch, there will be two or more meshes on a slice. In this case, I make it
known to the user which mesh is the active one, and they can cycle the single
active, editable mesh by hitting the TAB key. As statistics show, however, more
than half of the objects that have been segmented using this tool contain only one mesh.
It is important to contrast my method with united snakes (Liang et al.,
2006). Both methods begin with a traced contour from the user placing control

points. In the case of united snakes, the initial contour is made by intelligent
scissors (Mortensen and Barrett, 1995, 1998) and therefore subject to low-level
image features and then smoothed with the snake.
With my method, I begin with a smooth contour and let the image features
alter it. If the image features are distinct and the initial segmentation performed
by intelligent scissors is pretty close, then employing a united snake (balloon)
might be best because one could possibly trace an edge with few control points.
When low level image features are not robust, however, my approach is not
prone to unpredictable meandering as one would find with intelligent scissors.
Of course, my method and united snakes are not mutually exclusive. It would
be interesting to incorporate our mesh, which is easily editable, with a united
It is also interesting to contrast my method with the shape interpolation
method proposed by Schenk et al. (2000). Two primary difficulties arise with
their method: Firstly, control points may be automatically inserted or deleted.
While this simplifies the structure in one respect, correspondence between con-
trol points is lost. This will be problematic if an object twists or stretches along
a particular dimension. Secondly, their approach is completely reliant on the
intelligent scissors method to define their shape. In contrast, my method deliv-
ers many available-but-unedited control points, but correspondence between the
control points is maintained. It is also not completely dependent on low-level
image features for defining contours.
One problem with any of these approaches of setting a contour is that the
boundary set by the segmentation method or user directly may fail with respect

to adjacent tissues. It may be common for the outline of one tissue not to touch
the outline of an adjacent tissue, leaving some pixels in between. Conversely,
the two outlines might overlap, sharing some pixels. For Visible Human seg-
mentation, a pixel can only belong to one tissue type so in the case of overlap, a
pixel will be set to the second tissue type unless the pixel was locked, in which
case, it remains the first tissue type.
To ensure that adjacent tissues are really adjacent, a common use of the
mesh tool is for the contour of the first tissue to be made precise and then
for those pixels to be locked. Then, for a second, adjacent mesh, the user
purposefully makes the second mesh's contour bleed into the first's region. With
respect to segmentation speed, this is a real boon. The user does not have to
re-work the edge with any precision. However, in this case, the mesh has lost
its complete representation of the object being segmented and is subject to
other tissues first being segmented. To have adjacent meshes truly retain their
representation of the objects they segment and provide the user with the ability
to detail the edge once, I discuss the possibility of having adjacent meshes share
control points and edges in the next section.
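
The locking rule and the bleed-over workflow described above can be sketched in a few lines. The names `paint`, `labels`, and `locked` are hypothetical, and pixels are modeled as (row, column) tuples in a dictionary only to keep the example short.

```python
def paint(labels, locked, pixels, tissue):
    """Assign `tissue` to each pixel in `pixels` unless that pixel is locked."""
    for pixel in pixels:
        if pixel not in locked:
            labels[pixel] = tissue


first_outline = {(0, 0), (0, 1), (0, 2)}
second_outline = {(0, 2), (0, 3), (0, 4)}   # deliberately bleeds into (0, 2)

# Workflow from the text: make the first tissue precise, lock it, then let
# the second contour overlap it without re-working the shared edge.
labels = {}
locked = set()
paint(labels, locked, first_outline, "first")
locked |= first_outline
paint(labels, locked, second_outline, "second")

# Without locking, the later tissue overwrites the shared pixel.
unlocked_labels = {}
paint(unlocked_labels, set(), first_outline, "first")
paint(unlocked_labels, set(), second_outline, "second")

print(labels[(0, 2)], unlocked_labels[(0, 2)])
```

With locking, the shared pixel keeps the first tissue and no gap is left between the two tissues, but the second contour no longer describes its object exactly, which is the trade-off noted above.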

6. Conclusions and further research
I have implemented a cubic mesh editing tool that is simple and intuitive
to use to segment 3D images. It does this first by forcing all rows of control
points of the mesh to reside on the same orthogonal slice in the image. This
constraint has facilitated a simple interface permitting users to understand the
model they are constructing by maintaining correspondence between control
points. Additionally, I have added the capability for this mesh to take advantage
of low-level image features and snap-to-edges.
As effective as this tool is, more advanced tools can be built incorporating
some of its ideas. For example, instead of a cubic mesh, one could employ a
triangulating mesh, a surface bounded by three control points, and offer means
of maintaining correspondence between control points. In the case of the
triangulating mesh, the user needs to know which other two control points define
the triangle a given control point corresponds to, which may be quite difficult
to convey through the user interface. However, this could allow the interface
to drop the constraint that all control points on a row in the mesh reside on
the same plane in the image. Additionally, this would provide the ability for
a contiguous tissue to be segmented completely by one mesh while keeping the
number of control points minimal.
Along the lines of control point decimation and triangulation editing, one
could begin to integrate automatic and semi-automatic approaches with mesh
editing. A user could apply one of the inexact methods described in Chapter

2 or Appendix A to perform a rough, fast segmentation. This, in turn, could
be followed by a 3D model rendering of this crude segmentation such that the
surface rendered automatically creates a 3D mesh of triangulated control points
and also such that control points only occur on discrete, planar slices. That is,
the mesh would be made such that a slice would either contain control points
or not. Therefore, for any surface triangular patch, two of the control points
would exist on one slice and one control point would exist on another slice. The
reason for this restriction is again to allow for easy editing. Ideally, the user
would not have to add any more control points to maintain the correspondence
of other points in other planes and simply move control points as necessary to
perform a better fit.
Yet another way mesh editing can be used is to instill dynamic surface prop-
erties. The blending between control points may be visual as well as textural.
For example, for surgical simulation, one location or region of the model might
be rigid and hard while other regions may be squishy or soft. Similarly, an
ultrasound simulator may not want to treat a given tissue as having the same
resonance throughout. In these cases, control points could be encoded with more
detail such that the patches interpolate and encode this information.
Another way mesh editing can speed the segmentation process is if two ad-
jacent tissues can have conjoined control points and adjacent interpolated edges.
Therefore, when one structure's edge is modified, the adjacent structure can also
be simultaneously edited and remain adjacent without gaps being introduced.
Related to the dynamic surface properties just discussed, this also opens the op-
portunity to define other features of adjacency. For example, with two adjacent

surface patches, how one tissue can slide or move relative to the other patch in
a surgical simulation could also be encoded.
The mesh editor I have created and such types of enhancements begin to
broaden the role of segmentation beyond the mere classification of pixels as
belonging to one tissue or another to the relatively easy construction of 3D
models potentially encoded with surface details and complex relationships to
neighboring tissues.

Appendix A. Automated methods and features
The holy grail of computer vision is automatic segmentation whereby the
computer can apply processes and algorithms to classify sets of pixels without
user interaction. On the opposite side of the spectrum are manual techniques
whereby users explicitly label pixels. This is usually done by a human expert
who paints or outlines and fills the pixels which belong to a particular category.
In the middle are semi-automatic approaches which involve user interaction to
guide the computer's processing to help train or zero in on a final segmentation.
There is typically an inverse relationship between the quality of these approaches
and the time and human resources involved: automatic techniques often yield
poor results while obviously not employing any human effort (aside from the
effort to build the classifier), semi-automatic methods are often adequate when
precision can be inexact, and manual techniques are often precise, but extraor-
dinarily laborious. I now discuss leading efforts in these various segmentation
techniques and their applicability (or inapplicability) to cryosectioned human
tissue segmentation.
A.1 Automated Methods
Automated methods of image segmentation are in the domain of computer
vision, artificial intelligence, machine learning, and pattern recognition. Features
of the image are evaluated and pixels or voxels are classified via some process or
method known as a classifier or model. (For clarification, when two classifiers
share the same process but only differ by the parameters that are used by the

process, these are also referred to as separate models). The model may itself be
comprised of sub-classifiers which perform techniques such as regression analysis
to provide a measure of how likely a given pixel is to belong to each sub-class.
High-level models, then, may have several layers of processing and decision-
making to define, measure, and discriminate features. So what is a feature?
A.2 Features
Features are implicit and explicit attributes of the data. The only true ex-
plicit features of images are pixel location and color. Implicit attributes range
from low- to high-level with low-level features generally drawing on explicit
attributes and higher-level features drawing on combinations of lower-level fea-
tures along with external knowledge such as knowing the pose, general shape,
or approximate dimensions of an object.
Low-level image features include contiguous color regions and edges. Mid-
level attributes are texture, gradients and flows (described later), shapes, and
shading. High-level, more abstract features could be measured using other
knowledge, statistics, heuristics, rules and combinations of lower-level attributes.
Again, using face recognition as an example, "faceness" (the face detection phase
of face recognition) may itself be a feature. "Faceness" may be the combination
of other features like "hair-ness", "forehead-ness", "eye-ness", "eyebrow-ness",
"nose-ness", etc. with knowledge of how these elements agglomerate to form
a face. Alternatively, "faceness" may be deemed too complex when described
in this fashion and other means may be employed which measure "faceness"
without overt descriptors of every component of the face. This example illus-
trates the difficulty encountered in pattern recognition systems. It is tricky (and

often computationally prohibitive) to describe the combination of lower level at-
tributes which give rise to a pattern and it is difficult to describe a pattern
without describing the lower-level attributes.
Attempting to programmatically describe a high-level implicit feature is
prone to problems. The list of rules employed can quickly become unwieldy.
Using the example of face detection again, since all individuals are unique, rules
would have to account for various poses, facial expressions, baldness, beards and
moustaches, skin, hair color and style, the presence of eye glasses, sunglasses,
Alternatively, rather than trying to programmatically enumerate the rules
for segregating data points, the field of machine learning attempts to combine
low- and mid-level attributes in feature space and then, through statistical and
algorithmic means, make a map which describes regions in this space. For
example, the red, green, and blue values of a pixel may form a 3D feature space.
One could add further dimensions to this space: the proximity of the pixel to an
edge, the direction of the edge, whether or not the pixel resides in a particular
texture, the direction of gradient flow at the pixel, etc. Data points which
reside in a particular region get ascribed the value(s) of that region. In this way,
high-level features and models can be automatically built without large sets of
It is often the case that a linear feature space is inadequate to classify and
describe the data. This is largely because each pixel is considered a discrete data
point and feature descriptors for the pixel are rather local, simply including near
neighbors. For this reason, large-scale patterns are lost. The workaround to this

\Phi: \mathbb{R}^2 \to \mathbb{R}^3, \quad (x_1, x_2) \mapsto (z_1, z_2, z_3) := (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)
Figure A.1: Mapping into higher-dimensional nonlinear space. For data to be
linearly separable in feature space, there must be a separating hyperplane. This
example applies a nonlinear transformation to create a new feature space with a
separating hyperplane. Image taken from Greg Grudic's Machine Learning class
issue is to apply a nonlinear transformation to the feature space such that data
points can be separated by a hyperplane. Points on one side of the hyperplane
are classified differently from points on the other side of the hyperplane. Figure
A.1 illustrates the transformation of nonlinear data in feature space with a nonlinear
transform which makes the data linearly separable by a hyperplane in a new feature space.
The nonlinear transformation to be applied does not have to be a guessing
game. One can apply the kernel trick to transform the data (Aizerman et al.,
1964; Boser et al., 1992; Wikipedia). It is important to note, however, that
not all nonlinear data can be mapped into a higher dimension where it is linearly
separable. I will discuss techniques for segregating features in the discussion of
particular automated methods at the end of this section. I will now highlight
two important low-level feature descriptors.
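
The idea of mapping into a higher-dimensional space and the kernel trick can be illustrated with a short sketch. It assumes the standard degree-2 polynomial map (x1, x2) -> (x1^2, sqrt(2) x1 x2, x2^2) and uses made-up sample points on two concentric circles; the function names are illustrative.

```python
import math


def phi(x1, x2):
    """Map a 2D point into 3D: (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def poly_kernel(a, b):
    """Degree-2 polynomial kernel: equals dot(phi(a), phi(b)) without computing phi."""
    return dot(a, b) ** 2


# Two classes that no straight line separates in 2D: an inner and an outer ring.
ANGLES = (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
inner = [(0.5 * math.cos(t), 0.5 * math.sin(t)) for t in ANGLES]
outer = [(2.0 * math.cos(t), 2.0 * math.sin(t)) for t in ANGLES]


def side(point):
    """Signed position relative to the plane z1 + z3 = 1 in the mapped space."""
    z1, _, z3 = phi(*point)
    return z1 + z3 - 1.0


print([side(p) < 0 for p in inner])
print([side(p) > 0 for p in outer])
print(poly_kernel((1.0, 2.0), (3.0, -1.0)), dot(phi(1.0, 2.0), phi(3.0, -1.0)))
```

In the mapped space z1 + z3 equals the squared radius, so a single plane separates the rings. The kernel function returns the same inner product as the explicit map, which is what lets a classifier work in the higher-dimensional space without ever constructing it.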

A.2.1 Edge Detection
As simple as edge detection may appear to be, it is still a very difficult
problem in image processing. In grayscale images, most edge detectors are
based on finding maxima in the first derivative or a zero-crossing in the second
derivative. The difficulty in edge detection is that sometimes edges are missed
when they should be included (false negatives) and sometimes edges are reported where they
should not be (false positives). Collectively, the error rate is a measure of false
negatives and false positives. Most edge detectors have a relatively low error
rate, but any error can have overwhelming consequences on
image classifiers where large areas of pixels are misclassified due to one or a few
problematic edges. Errors occur much more frequently when the image contains
noise. When the image contains noise, false positive edges are reported unless the
detector is made to be less sensitive in which case it is likely to miss true edges.
Alternatively, removing noise from images often blurs them and therefore edge
boundaries will be more rounded and imprecise. Common edge detectors are
Sobel, Laplace (Marr and Hildreth, 1980), Laplacian of Gaussian (LoG) (Marr
and Hildreth, 1980; Huertas and Medioni, 1986), and Canny (Canny, 1986).
The Visible Human data is largely free of external noise. The images were
acquired under nearly constant conditions and only rarely are there cases of crys-
tals or frost on the image. Ironically, many tissues exhibit a lot of noise and/or
lack clear edge boundaries between tissues because many tissues are visually
homogeneous. Figure A.2 shows an image from the Visible Human Male dataset.
The top-left image is the color RGB data. The top-right image is the segmented
data where discrete colors are used for the various tissues. (Sometimes a color

is repeated for different tissues.) The bottom-left image is the output from a
Laplacian of Gaussian (LoG) filter with a sigma value of 1.0. The bottom-right
is the output from a LoG filter with a sigma value of 3.0. Too many edges are
found when the sigma value is 1.0 where denoising is minimal. Alternatively,
the edges are too rounded and too few when the sigma value is 3.0. In both
cases, some edges are missed and others added where they should not be when
compared to the manually segmented data.
The image in figure A.2 was first converted to grayscale and then a LoG
filter was applied. The green outline highlights the zero crossings in the second
derivative of the grayscale image. Color edge detectors typically barely outper-
form grayscale edge detectors. Additionally, the Visible Human data has a lot
of dynamic range in the red channel and little in the blue and green channels.
Therefore, color edge detection is likely to not show any improvement when used
in this dataset.
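
The effect of the sigma value on zero-crossing detection can be reproduced in one dimension. The sketch below is a simplified stand-in for the 2D LoG filter discussed above, not the code used to produce Figure A.2: the test signal, the kernel radius of four sigma, and the `min_slope` threshold are all assumptions chosen for illustration.

```python
import math


def log_kernel_1d(sigma):
    """Sampled second derivative of a Gaussian, shifted so its taps sum to about 0."""
    radius = int(math.ceil(4 * sigma))
    taps = [
        (x * x - sigma * sigma) / sigma ** 4 * math.exp(-x * x / (2 * sigma * sigma))
        for x in range(-radius, radius + 1)
    ]
    offset = sum(taps) / len(taps)
    return [t - offset for t in taps]


def filter_same(signal, kernel):
    """Correlate signal with kernel, replicating the end samples as padding."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += weight * signal[j]
        out.append(acc)
    return out


def zero_crossings(response, min_slope=2.0):
    """Indices i where the response changes sign between i and i + 1.

    min_slope rejects sign flips caused by rounding noise or very weak edges.
    """
    return [
        i for i in range(len(response) - 1)
        if response[i] * response[i + 1] < 0
        and abs(response[i] - response[i + 1]) > min_slope
    ]


# A strong step edge between samples 19 and 20, plus a weak two-sample bump.
signal = [10.0] * 20 + [50.0] * 20
signal[8] += 4.0
signal[9] += 4.0

edges_fine = zero_crossings(filter_same(signal, log_kernel_1d(1.0)))
edges_coarse = zero_crossings(filter_same(signal, log_kernel_1d(3.0)))
print("sigma 1.0:", edges_fine)
print("sigma 3.0:", edges_coarse)
```

With the smaller sigma the weak bump contributes its own crossings alongside the step edge, while the larger sigma smooths the bump away and keeps only the step. This mirrors the trade-off described above between reporting too many edges and missing or rounding true ones.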
A.2.2 Texture
Texture segmentation classifies contiguous areas where the texture is con-
stant. The advantage of texture segmentation over edge segmentation is that
textures implicitly contain several edges that are known to be part of one object.
Therefore, the important edge is the boundary between textures. The difficulty
in texture segmentation is that textures must somehow be described so that
pixels can be queried as to how likely they are to belong to a given texture.
A.2.2.1 Textons
As the fundamental unit of texture, a texton is analogous to a pixel in an
image (Julesz, 1981, 1986). A texture can be described as textons arranged in

Figure A.2: A portion of slice 348 of the Visible Human Male. The top left
image is the color image data. The top right image is the manually segmented
image. The bottom left image applies a LoG filter with a sigma value of 1.0.
The bottom-right image shows the results of a LoG filter with a sigma value of 3.0.

a particular fashion. Obviously, for textures of repeating patterns, the texton is
the repeating element. A texture, however, may be represented by hierarchies
of textons: low-level textons describing a series of pixels connected in a line,
say. Higher-level textons may combine these line textons to describe "L"s in
the texture as opposed to "T"s or "X"s. The texture may be a description of
"L" textons arranged in a particular format. Again, most textures cannot be
simply described by a single-layer repeating element. However, describing a
texture with hierarchical textons is difficult because there is no automatic way
to classify the various textons at each level in the hierarchy and to also classify
their arrangement (Forsyth and Ponce, 2003).
A.2.2.2 Filter banks
Instead of trying to decipher textons, a texture can be generally represented
by its response to a series of filters. Portions of the image that have a simi-
lar response to the same set of filters can then be ascribed the texture label.
Popular filters to use in a filter bank include Gaussian distributions in some
capacity, either directly, or by taking the first or second derivative of a Gaus-
sian distribution. (Again, the second derivative of a Gaussian is the Laplacian).
The filters may be symmetric spot filters or oriented bar filters. An example
of a filter bank is the Leung-Malik (LM) filter bank as illustrated in figure A.3
(Leung and Malik, 2001). It contains Gaussians, oriented first derivatives of
Gaussians, and oriented and spot second derivatives of Gaussians. As another
representation of LoG filters, the 1D Laplacian of Gaussian is shown in Figure A.4.


Figure A.3: The Leung-Malik filter bank. The upper-left filters are first
derivatives of Gaussians at 6 orientations and 3 scales. The upper-right filters
are second derivatives of Gaussians also at 6 orientations and 3 scales. In these
bar filters, the elongation factor is 3 so \sigma_y = 3\sigma_x. The bottom row contains
8 spot LoG filters and 4 Gaussian filters at various scales. Image appears at vgg/research/texclass/filters.html after Leung and
Malik (2001).
Figure A.4: 1D Laplacian of Gaussian. When 2D, it is as if this filter is spun
horizontally about the center resembling a Mexican sombrero. For this reason,
a LoG filter is also dubbed The Mexican Hat.

As a Nobel laureate for the creation of holography, the physicist, Dennis
Gabor, also devoted much effort to social inventions and communications theory
(Lundqvist, 2000). It was in this latter capacity that he formulated what has
come to be known as Gabor filters, which have found their way into the set
of commonly used filters in banks for texture analysis. These are sine functions
modulated by a Gaussian distribution. The equation for a 2D Gabor filter is:
G(x, y) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(2\pi \frac{x'}{\lambda}\right)    (A.1)
where x' = x \cos\theta + y \sin\theta, y' = -x \sin\theta + y \cos\theta, \sigma is the standard deviation of the Gaussian,
\lambda is the sinusoidal wavelength, \theta is the orientation of the Gabor function, and
\gamma is the spatial aspect ratio that defines the ellipticity of the result. For example, \gamma = 1
will yield a circular spot filter, and \gamma < 1 will cause the filter to elongate along
the orientation as defined by \theta.
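
A direct evaluation of the 2D Gabor function may make the roles of the parameters concrete. The sketch assumes the common real-valued form with zero phase offset, a cosine carrier along the rotated x axis, and a Gaussian envelope whose width is set by sigma; the function and parameter names are illustrative.

```python
import math


def gabor(x, y, sigma, wavelength, theta, gamma):
    """Real-valued 2D Gabor function with zero phase offset (assumed form)."""
    x_rot = x * math.cos(theta) + y * math.sin(theta)
    y_rot = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * x_rot / wavelength)
    return envelope * carrier


def gabor_kernel(size, sigma, wavelength, theta, gamma):
    """Sample the Gabor function on a size x size grid centered at the origin."""
    half = size // 2
    return [
        [gabor(c, r, sigma, wavelength, theta, gamma) for c in range(-half, half + 1)]
        for r in range(-half, half + 1)
    ]


# gamma = 1 gives a circular envelope; gamma < 1 stretches it along the bars.
round_value = gabor(0.0, 2.0, sigma=2.0, wavelength=4.0, theta=0.0, gamma=1.0)
long_value = gabor(0.0, 2.0, sigma=2.0, wavelength=4.0, theta=0.0, gamma=0.5)
kernel = gabor_kernel(5, sigma=2.0, wavelength=4.0, theta=math.pi / 4, gamma=0.5)

print(round_value, long_value)
```

A filter bank would sample this function at several orientations and scales and convolve each resulting kernel with the image.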
In 1D, it is easy to get a sense of this filter by seeing how it is created as
in Figure A.5. It should be apparent that if the sinusoid wavelength spans 2
standard deviations, then a Gabor filter becomes quite similar to the Laplacian
of Gaussian filter. (The wavelength of the sinusoid in Figure A.5 is just over 1
standard deviation of the Gaussian).
The intuition behind both Gabor filters and Gaussian derivative filters is that
similar, low-level responses are made in the visual cortex. Again, neurons in the
visual cortex are arranged in receptive fields and laterally inhibit their neighbors
to create a similar type of response (Marr and Hildreth, 1980; Kourtzi,
2006; Kandel et al., 2000; Malik and Perona, 1990).

Figure A.5: A Gabor filter is a sinusoidal signal modulated with a Gaussian.
Figure A.6 serves as an example of output response to a filter bank using
filters from Forsyth and Ponce (2003).
The output from a filter bank still does not describe the texture. Somehow,
the classifier still needs to know how to interpret the results. This is most
often done by some windowing technique to measure the distribution of the
filter response in a small area. The problem with this windowing approach and
with any 2D filters like those shown in Figures A.3 or A.6 (and even when they
are constructed as 3D filters) is that they will be problematic at the edges.
Consider Figure A.7 where a texture forms a narrow peak. Most of the pixels
in that area convolved with the 2D kernel will not be part of the texture and
only a sliver will be. Therefore, the response from the windowing filter is likely
to be minimal and post-processing will likely consider the pixels in the thin
band as not belonging to the texture. One workaround to this issue is to have
even more filters, but then the interpretations of the results also become that
much more complex. If the response from the filter happened to be strong at
a particular point, the window used to report the distribution may still render

Figure A.6: A filter bank of spot and bar filters applied to an image
of a butterfly with the corresponding filter responses in the bottom images.
Figure comes from PyramidsandTexture.ppt, which accompanies Forsyth and Ponce (2003).

Figure A.7: Example of the windowing problem. A pixel (small black square)
inside a texture (blue). Rectangle about the pixel is the window centered at the
pixel. Most pixels within the window are from outside the texture. Therefore,
the window may report that this pixel does not belong to the texture, even
though it does.
it inconsequential. Squaring or limiting the filter response are two workarounds
to account for windowing limitations, but these techniques often need another
round of smoothing applied to the image which will make the filter response less
precise (Forsyth and Ponce, 2003).
A.2.2.3 Gradients and Gradient Flows
Similar to texture, a measure can be made to determine the degree to which
a given pixel follows a gradient with its neighbors such that they might all belong
to the same set. Gradients may be explicitly derived from pixel luminosity as
is the case with shading. An implicit gradient is one derived from the output
from a set of filters. For example, the response to a spot and bar filterbank
may illuminate a gradient flow such that the direction of each pixel within an
object makes collective sense. Edges in this case are determined where two pixels
have divergent directions. Because gradients and gradient flows are the subject
of semi-automatic approaches, they are discussed in more detail in Chapter 2.

A.3 Methods
A.3.1 Artificial Neural Networks
At first glance, it seems reasonable to solve the image segmentation problem
by modeling biological visual systems. Artificial neural networks (ANNs) are
loosely structured after biological neuronal networks. The nodes in ANNs are
called neurons and edges between neurons are called synapses. Synapses have
a weight associated with them. The simplest form of a neural network is a
single hidden-layer network as shown in Figure A.8. Input nodes respond to
an input stimulus (e.g., an image) and propagate the signal to various hidden
nodes which then send the signal to the output neurons. The number of output
neurons corresponds to the number of items under classification. If the network
is only discriminating between two items, it will have only two output neurons.
For a given stimulus, the output neuron that receives the strongest signal is
designated the winner and the data is classified as belonging to that set. For
the network to filter the input signal such that the proper output neuron is
strongest requires adjustments to the synaptic weights of the network such that
the signal is directed appropriately. As a further discriminator, neurons may
only become active if the sum of their inputs is above a certain threshold. Often
in the case of ANNs which use images, each input neuron maps to a pixel and
the pixel's synaptic weights are multiplied by a normalized scaling factor of its
There are several difficulties with ANNs. For one, the network topology is
difficult to attain. The number of input neurons could correspond to the number

Figure A.8: Simple, fully-connected ANN. For clarity, arrows have been omit-
ted, but edges are directed from left to right, from input to hidden nodes then
to output nodes.
of pixels in the image. Most ANNs employed on images operate on postage-stamp
sized 2D images, not volumes of thousands of voxels! The number of hidden
nodes would therefore also have to be immense! Additionally, choosing the number of
hidden nodes required to attain an adequate response is more of an art than a
science. It may be best to have several layers of hidden nodes, but how many
layers? Additionally, for each layer, how many nodes should exist?
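The single hidden-layer network described in this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the weights are hand-picked rather than trained, and the thresholding rule is one simple reading of "neurons may only become active if the sum of their inputs is above a certain threshold."

```python
def layer_output(inputs, weights, threshold=0.0):
    """Each neuron sums its weighted inputs and fires only above the threshold."""
    sums = [sum(w * x for w, x in zip(neuron, inputs)) for neuron in weights]
    return [s if s > threshold else 0.0 for s in sums]


def classify(pixels, hidden_weights, output_weights, threshold=0.0):
    """Propagate pixels through one hidden layer; the strongest output wins."""
    hidden = layer_output(pixels, hidden_weights, threshold)
    outputs = [sum(w * h for w, h in zip(neuron, hidden)) for neuron in output_weights]
    return max(range(len(outputs)), key=outputs.__getitem__)


# Assumed toy network: two input pixels, two hidden nodes, two classes.
identity = [[1.0, 0.0], [0.0, 1.0]]
winner_a = classify([1.0, 0.0], identity, identity)
winner_b = classify([0.0, 1.0], identity, identity)
```

Even in this toy form, the sizes of `hidden_weights` and `output_weights` must be chosen up front, which is the topology problem raised above.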
Another difficulty with ANNs is attaining the correct synaptic weights. This
requires the network to be trained. Supervised learning is the process of giving
a method several training examples where the outputs are known for each input
example. ANNs can be trained via supervised learning, but this requirement alone
prevents the use of ANNs with the Visible Human Dataset because no training
examples exist! Unsupervised learning employs estimation methods such as
clustering and statistical techniques to segregate inputs into N outputs. However,
unsupervised techniques cannot be used in the Visible Human Dataset because
of the relative homogeneity of the 3000 tissue types. (Since muscles look largely
the same, the segregation into 3000 sets may lump all muscles into one group

and inappropriately segregate other tissues). In theory, supervised learning
could be employed by cropping and labeling various regions of a tissue to serve
as a training set, but it is still difficult to determine how
many examples are necessary for classification of the tissue.
Additionally, ANNs have difficulty at pointed edge boundaries and are prone
to rounding corners. Where ANNs prove useful is in determining if, generally
speaking, a given region belongs to a particular class. This has proved useful
in areas such as face detection where a rectangular area can be determined to
contain a face (Rowley et al., 1997), but the particular boundaries of the face
are unimportant. For crude classification problems they may work with great
effect, but as a precise segmentation tool, they fall short.
Another reason ANNs fall short where biological neural networks obviously
succeed is that there is still much that is unknown about biological
networks. There are over a thousand different neuron types with various
properties (Kandel et al., 2000). The number and types of synapses that form are not
fully known, and the dynamics of single neurons, circuits, and networks are only
now starting to be understood. What is known is that some neurons
have hundreds or thousands of synapses and that the human brain contains on the
order of $10^{11}$ neurons and $10^{14}$ synapses. Even if a relative handful comprised the visual and learning system,
the network would be huge and unrealizable by existing computing resources
as each biological neuron operates in parallel with the others. As stated in the
introduction, there is much that is assumed and simply taken for granted in our
visual system. The lifetime each of us has of interfacing with the world cannot
adequately compare to a much, much simpler system which we expect to be

trained in minutes.
A.3.1.1 Other Learning Methods
Bayesian classifiers and Support Vector Machines (SVMs) are two other
learning methods worth a quick mention. Bayes classifiers work well
in the semi-automated method of seeded region-growing (see ??), but can also
work automatically. In supervised training, the class-conditional probability (the
likelihood), that is, the probability of a feature given that it belongs to a certain
class, is determined by the training examples. A class-conditional distribution can
be attained for each feature given the class.
In this way, decision boundaries can be acquired by demarcating feature space
to optimize the probability of a set belonging to a particular class. SVMs work
by finding those training examples which are most similar, but which belong to
different classes to then segregate feature space by those similar examples. In
many cases, this requires a transformation into another dimension to make the
feature space separable. The problem again with SVMs and Bayesian classifiers
for the Visible Human dataset is that for one, they require numerous training
examples. Additionally, they are also imprecise, with errors especially at edge
A.3.2 Watershed Segmentation
Watershed segmentation (Vincent and Soille, 1991) is similar to the segmentation
of gradients described previously. It is most easily understood as follows.
Treat the image as a topographical map,
letting either the luminosity of a pixel or its filter response represent the
pixel's elevation, and let a raindrop fall on each pixel. The raindrop will roll to its
nearest local minimum, called a basin. Once all basins have been found, they

can be flooded such that if the water between any two basins would merge, a
dam can be inserted or elevated to prevent the merging of the waters. This
process continues until only dams, and no gradients, segregate the waters. The
resulting segmentation is an optimal demarcation of each local minimum. That is,
there will be as many segmented sets as there are local minima. For this reason,
watershed segmentation is prone to over-segmentation and often requires low-pass
filtering to get rid of noise (i.e., small local minima). As an automatic
technique, then, it is not very useful, but it can be coupled with semi-automatic
approaches for better applicability.
Watershed can also work in 3D, but is not as easily visualized. Additionally,
the segmentation will be different than a 2D segmentation because of basins
filling in the Z-plane.
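The raindrop step of this description can be sketched in Python as follows. This is only the steepest-descent labeling, not the flooding and dam-building algorithm of Vincent and Soille (1991); the 4-connected neighborhood, the function names, and the tie-breaking order are assumptions made for illustration.

```python
def find_basin(elev, r, c):
    """Follow the steepest descent from (r, c) until a local minimum is reached."""
    rows, cols = len(elev), len(elev[0])
    while True:
        best = (r, c)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                if elev[nr][nc] < elev[best[0]][best[1]]:
                    best = (nr, nc)
        if best == (r, c):
            return best  # no strictly lower neighbor: this pixel is a basin
        r, c = best


def label_basins(elev):
    """Label every pixel with the index of the basin its raindrop rolls into."""
    basin_ids = {}
    labels = []
    for r in range(len(elev)):
        row = []
        for c in range(len(elev[0])):
            basin = find_basin(elev, r, c)
            row.append(basin_ids.setdefault(basin, len(basin_ids)))
        labels.append(row)
    return labels, len(basin_ids)


smooth_labels, smooth_count = label_basins([[1, 2, 3, 2, 1]])
noisy_labels, noisy_count = label_basins([[1, 3, 2, 3, 1]])
```

The smooth profile yields two basins, while the small dip in the middle of the second profile creates a third, labeled `[[0, 0, 1, 2, 2]]`. That extra region is the over-segmentation that low-pass filtering is meant to suppress.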

Appendix B. Cubic splines
Shipmakers used a strip of wood known as a spline which bent between
static points to mark the curve of the hull (Wikipedia, 2006). Drafters adopted
the term and used it in a similar way, letting wood strips flex between fixed
points and then tracing those curves in their designs. Mathematicians have also
adopted the term to mean a continuous curve described by a piecewise cubic
polynomial function whose first and second derivatives are continuous across
sections (Hearn and Baker, 1997). Zero-order continuity simply means that any
two curves meet at a point; they need not have the same tangent at that point.
First-order continuity means that the two curves have the same tangent at the
point where they meet. Second-order continuity means that the two curves have
first-order continuity and additionally have the same rate of change of their
tangents at that point. Therefore, cubic splines are second-order continuous. The
cubic polynomial is expressed as
$$x(t) = a_x t^3 + b_x t^2 + c_x t + d_x, \quad 0 \le t \le 1 \tag{B.1}$$
Equation B.1 may represent the x coordinate along a section of a spline. In this
case, the four coefficients, $a_x$, $b_x$, $c_x$, and $d_x$, can be determined through static
blending functions (also known as basis functions) and geometric constraints.
Geometric constraints contain coordinate values known as control points which
impact the curve, and possibly other variables such as tension, skew, etc.
Equation B.1 can therefore be written in matrix form as
$$x(t) = T \cdot M \cdot G \tag{B.2}$$
where $T = \begin{bmatrix} t^3 & t^2 & t & 1 \end{bmatrix}$ and $M \cdot G = \begin{bmatrix} a_x & b_x & c_x & d_x \end{bmatrix}^T$,
and where $M$ is the spline
matrix containing the blending functions and $G$ is the matrix of geometric
constraints. For a spline with $n + 1$ control points and $m$ dimensions, the control
points can be expressed as a vector in the following form:
$$p_k = (x_{1k}, x_{2k}, \ldots, x_{mk}), \quad k = 0, 1, \ldots, n$$
Since I am using 3 dimensions, I will describe $p_k$ in X, Y, and Z dimensions as
$$p_k = (x_k, y_k, z_k), \quad k = 0, 1, \ldots, n$$
There are a number of general cubic spline forms which abide by a matrix
representation. I have chosen to implement Catmull-Rom splines (also known
as Overhauser splines), which are Cardinal splines with the tension at its
default value of 0. The design of the application, however, allows for easy insertion
of other basis functions if other splines are desired.
Figure 3.3 illustrates that 4 points are required for an interpolating line
using Catmull-Rom. For a 3D surface, this must be 16 points as illustrated in
Figure 3.1.
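To make the four-point requirement concrete, here is a minimal Python sketch that evaluates one section of a Catmull-Rom curve in the matrix form $T \cdot M \cdot G$. It assumes the cardinal-spline matrix in the form given by Hearn and Baker (1997), with $k = (1 - \text{tension})/2$; the function names are hypothetical and this is not the application's code.

```python
def cardinal_matrix(tension=0.0):
    """Cardinal-spline matrix M; tension 0 gives the Catmull-Rom case."""
    k = (1.0 - tension) / 2.0
    return [
        [-k, 2.0 - k, k - 2.0, k],
        [2.0 * k, k - 3.0, 3.0 - 2.0 * k, -k],
        [-k, 0.0, k, 0.0],
        [0.0, 1.0, 0.0, 0.0],
    ]


def blend_weights(t, tension=0.0):
    """Row vector T*M: the weight applied to each of the four control points."""
    m = cardinal_matrix(tension)
    powers = [t ** 3, t ** 2, t, 1.0]
    return [sum(powers[r] * m[r][c] for r in range(4)) for c in range(4)]


def spline_point(p0, p1, p2, p3, t, tension=0.0):
    """Point on the section between p1 and p2 for 0 <= t <= 1."""
    w = blend_weights(t, tension)
    points = (p0, p1, p2, p3)
    return tuple(
        sum(w[i] * points[i][d] for i in range(4)) for d in range(len(p0))
    )


pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
start = spline_point(*pts, t=0.0)
middle = spline_point(*pts, t=0.5)
end = spline_point(*pts, t=1.0)
```

The curve passes through the two middle points at t = 0 and t = 1, while the outer two points only shape the tangents. This is why an interpolating section needs four points.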
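The general form of the equation to fill a cubic patch in 3D is
$$Q(s,t) = \begin{bmatrix} x(s,t) & y(s,t) & z(s,t) \end{bmatrix} = S \cdot M \cdot G \cdot M^T \cdot T^T, \quad 0 \le s \le 1, \; 0 \le t \le 1$$
where $S = \begin{bmatrix} s^3 & s^2 & s & 1 \end{bmatrix}$ is a matrix of time values in the horizontal
direction, $T = \begin{bmatrix} t^3 & t^2 & t & 1 \end{bmatrix}$ is a matrix of time values in the vertical (Z)
direction,
$$M = \begin{bmatrix} -k & 2-k & k-2 & k \\ 2k & k-3 & 3-2k & -k \\ -k & 0 & k & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix},$$
where $k = (1 - \text{tension})/2$, and where
$$G = \begin{bmatrix} p_{i-1,j-1} & \cdots & p_{i-1,j+2} \\ \vdots & \ddots & \vdots \\ p_{i+2,j-1} & \cdots & p_{i+2,j+2} \end{bmatrix}$$
are the x, y, and z coordinates of the control points.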
For the points described in Figure 3.1, $p_{i,j}$ represents the point at which $s = 0$
and $t = 0$, and the point $p_{i+1,j+1}$ is the point at which $s = 1$ and $t = 1$.
The mesh forms a cylinder such that for n control points on a row, the nth
control point will be adjacent to the first control point. Furthermore, $M^T \cdot T^T$
can be computed once for a given slice, so adjustments to the spline, such as
clicking and dragging control points, can be calculated quickly enough to allow
for real-time user editing.
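The patch evaluation and the cylindrical wrap can be sketched in Python as follows. This is a sketch under stated assumptions: `contours[j][i]` is a hypothetical layout holding the i-th control point around the contour on slice j, every slice is assumed to have the same number of points, and the matrix is the cardinal-spline matrix with $k = (1 - \text{tension})/2$.

```python
import math


def weights(u, tension=0.0):
    """Blending weights [u^3 u^2 u 1]*M for four neighboring control points."""
    k = (1.0 - tension) / 2.0
    m = [
        [-k, 2.0 - k, k - 2.0, k],
        [2.0 * k, k - 3.0, 3.0 - 2.0 * k, -k],
        [-k, 0.0, k, 0.0],
        [0.0, 1.0, 0.0, 0.0],
    ]
    powers = [u ** 3, u ** 2, u, 1.0]
    return [sum(powers[r] * m[r][c] for r in range(4)) for c in range(4)]


def control_grid(contours, i, j):
    """Gather the 4x4 grid G around p[i][j], wrapping i so the mesh is a cylinder."""
    n = len(contours[j])
    return [
        [contours[j - 1 + b][(i - 1 + a) % n] for b in range(4)]
        for a in range(4)
    ]


def patch_point(grid, s_weights, t_weights):
    """Q(s,t) = (S*M) * G * (M^T*T^T), evaluated per x, y, z coordinate."""
    return tuple(
        sum(
            s_weights[a] * t_weights[b] * grid[a][b][d]
            for a in range(4)
            for b in range(4)
        )
        for d in range(3)
    )


# Assumed test mesh: 4 slices of 8 control points on a unit circle.
contours = [
    [(math.cos(2 * math.pi * q / 8), math.sin(2 * math.pi * q / 8), float(z))
     for q in range(8)]
    for z in range(4)
]
t_w = weights(0.0)  # depends only on t, so it can be reused for every patch
wrapped = patch_point(control_grid(contours, 7, 1), weights(1.0), t_w)
center = patch_point(control_grid(contours, 0, 1), weights(0.5), weights(0.5))
```

Here `wrapped` evaluates the last patch of the row at s = 1 and lands on the first control point of the same slice, which shows the wrap-around. Because `t_w` does not depend on the control points, dragging a point only requires re-gathering `G` and repeating the weighted sum.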

Appendix C. Segmentation Application User Manual

Segmentor User Manual
Log-On Window
Edit Window
Overview Window
Controls Window
Anatomy List Window
Volume Cropper Window
Fly-Thru/Registration Window
Polygonal Model Window
Filters Window
Keyboard Shortcuts
Mouse Shortcuts
Log-On Window
[Screenshot: the Log-On window, with radio buttons for "Work over the server" and "Work on local volumes" and file lists for the RGB volume files, the mask volume files, and the bounds file, each with "Remove" and "Browse..." buttons.]

The Log-On window requires the user to log on to the system and choose a volume
to segment before continuing. Select a user name and volume from the pull-down
menus. If you don't see your name listed, tell Tom McTavish to add you to the list.
The system remembers which volume and which slice number you were last editing
and will come up with this information when you select your name. From here you
can simply select "Launch" if you're working on the same volume.
If the volume is available over the server and you are working at the lab at CHS,
choose the radio button to work over the server if it is enabled. If you are working
remotely or on a volume not available from the server, choose the radio button to
work on local volumes.
If you are working on local volume files, you should see a default set of files appear
when you select the volume you want to use. Alternatively, you can specify which
RGB volume(s), which Alpha Mask volume(s), and which Bounds.dat file to use.
Clicking on the "Browse..." button will prompt you for individual files to add to the
list. Alternatively, you can drag files into this window instead of going through the
file dialog window. The "Remove" button will remove all files from the list.
Edge volumes are only necessary when using the Edge Classification tool. If you
are not reclassifying edges, this can be blank.
When you are ready to launch the application, click on the "Launch" button.
Edit Window
The Edit Window contains the image being edited. Controls to actually set the
image, tools for drawing, etc. can be found on the Controls Window. Use the scroll
bars to navigate to areas when the image is large or zoomed in. Instead of using
the scrollbars, you can hold the SPACEBAR. A "Hand" icon will appear and you can
click and drag the image to a new location.
Menu Commands:
File Menu
Saves edits to this slice. If no edit has been made, this will be grayed
out. This is also accomplished via the keyboard command,
CONTROL-S. There is rarely a need to save edits explicitly, as navigating
to a different slice will save edits automatically. Cropping or navigating
within a given slice will also save previous edits to the slice. Even after
edits are saved to disk, they can still be undone. See the Edit Menu
for more information.
Save Image as JPEG
This saves the image cropped in the main edit window as a JPEG
image file. The image is saved as it is seen, with any filters,
masks, or edits as they appear on the screen. If labels are on, they
are not saved as part of the image.
Quits the Segmentor application. Closing this window or the Controls
window also quits the application. The keyboard command,
CONTROL-Q also quits. Any outstanding edits are automatically
Edit Menu
Undoes the last edit to this slice. Multiple undos can be performed by
repeatedly selecting this menu item, undoing edits from several steps
back. Only drawing edits are undo-able. No other steps, such as
selecting a different image, changing colors, etc. are undoable. Again,
this operation is only for this slice. If you make an edit
to one slice, then navigate to another slice and select
UNDO, it will undo any edits previously performed on
that slice, and not the last edit you made on the
previous slice. Undos can be performed even after the edit is
The keyboard shortcut, CONTROL-Z, can be used to perform an undo
without selecting this menu item.
If the Undo menu item is disabled, then there are no edits in this slice
that are undoable.
Be aware that performing an Undo will also take you to the cropped
view that you had on that slice when performing the edit. If you crop an
area, perform an edit, and re-crop another area, then select UNDO,
you will be re-cropped to that first edit.
Also, be aware that it is the changed pixels that are undoable. What is
remembered in an edit is not the path of the paint brush, but what
pixels were changed in the process. When you hit undo, you will not
see your edit disappear, but the pixels will revert to their previous
If an undo has been performed, redo re-performs the edit that was
undone to this slice. Multiple redos can be performed if several undo
operations have been performed. When a new edit is made after
undoing some steps, previous undone edits cannot be redone. Again,
this operation is only for this slice. Redos can be performed
even after the edit is saved (if there have been undo commands
The keyboard shortcut, SHIFT-CONTROL-Z, can be used to perform a
redo without selecting this menu item.
If the Redo menu item is disabled (grayed out), then there are no edits
in this slice that are re-doable.
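The undo and redo behavior described above can be pictured with the following Python sketch. It is only an illustration of the idea that an edit remembers which pixels changed rather than the brush path; the class and method names are hypothetical and are not the Segmentor's actual code. One such history would be kept per slice.

```python
class SliceHistory:
    """Undo/redo for one slice, storing changed pixels rather than brush paths."""

    def __init__(self):
        self.undo_stack = []
        self.redo_stack = []

    def apply_edit(self, image, changes):
        """Apply {(row, col): new_value}, recording old values for undo."""
        record = []
        for (r, c), new in changes.items():
            old = image[r][c]
            if old != new:
                record.append(((r, c), old, new))
                image[r][c] = new
        if record:
            self.undo_stack.append(record)
            self.redo_stack.clear()  # a new edit discards undone edits

    def undo(self, image):
        """Revert the most recent edit; returns False if nothing is undoable."""
        if not self.undo_stack:
            return False
        record = self.undo_stack.pop()
        for (r, c), old, _new in record:
            image[r][c] = old
        self.redo_stack.append(record)
        return True

    def redo(self, image):
        """Re-apply the most recently undone edit; False if nothing is redoable."""
        if not self.redo_stack:
            return False
        record = self.redo_stack.pop()
        for (r, c), _old, new in record:
            image[r][c] = new
        self.undo_stack.append(record)
        return True
```

In this sketch, a disabled Undo or Redo menu item corresponds to the matching stack being empty.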
Apply Edits from Files...
[Screenshot: the Apply Edits window. It warns that applying edits will overwrite existing volume data, provides an area for typing or dragging in the ".seg" files to read from, and offers two buttons: "Apply the edits stored in the files above" and "Undo the edits stored in the files above."]
Selecting "Apply Edits from Files" brings up the window above. This
allows you to apply or remove edits en masse. This is useful for
remote users to apply their edits that they've written on their local
drives and to apply them to the master volume. It's also useful for
undoing or redoing the work of a particular user on a particular day on
particular slices.
Edits are written to disk in the form of ".seg" files. When working over
the network, these edits are written to the following file:
cerebrum/data/volumeedits////
As an example, if Tom McTavish were editing Coronal slice number
753 on January 12, 2002, in the VHM volume, there would be a file called:
If there was a need to undo all of Tom's edits to Coronal slice number
753 on that day, you could navigate to that particular folder and drag
that file into this window and select the "Undo the edits stored in the
files above" button. If you wanted to undo the edits stored in multiple
files, simply drag more files into the window.
For remote users, edit files are stored on the local disk from where the
application is launched instead of cerebrum/data. That is to say,
/volumeedits///
Simply email Tom McTavish these files so that he can apply them to
the master volume.
When applying the edits, the output window will contain messages
explaining the success or failure of the operation.
IMPORTANT NOTE: .seg files contain only which pixels
were changed on a given slice and what their previous values were. In
the case where corruption occurs, attempting to undo the
segmentation will not work. You will have to go to a previous version of
the volume and replace the cropped area (or entire slice) with the
utility, SliceReplacer, in the path:
Show Tooltips
When checked, helper boxes will appear over some tools when the
mouse lingers over the object. After learning the Segmentor
application, you may want to turn this off if the tooltips are annoying.
View in Correct Perspective
In cases where voxels in the volume are not cubic, as in the case of
the Visible Male where X and Y are 1/3 mm but Z is 1 mm, selecting this
option will stretch voxels in Coronal or Sagittal views
to give the correct perspective. This may make
drawing feel peculiar, as a one-pixel mark will also stretch.
Match First Characters Only in Anatomy List
With this option checked, structures are searched in the Anatomy List
Window, by their first characters. For example, if you typed "lun" as
the search string, you will receive "Lunate" for Left and Right sides.
This differs from the default behavior, which is to match the search
string anywhere in the name: "lun" will normally return "Lobes of the
Lung", "Semilunar Valve of the Aorta", as well as "Lunate".
Can write NULL
NULL is the value of unclassified pixels which usually only occur
outside the body (in the black region). The application only allows
NULL to be written when this option is checked. With this option
checked you can add NULL as a valid item to the structures list in the
Controls window by clicking in a NULL area with the dropper tool to set
it as the active item to draw. You should be deliberate and understand
what you are doing when you use this option and attempt to write null.
Read Full Axial Slices from Disk
When retrieving axial slices, if your crop covers nearly the entire span
of the volume (including the black borders), you might find this option
for retrieving slices is faster.
Write to Performance Log
This is for slice-retrieval debugging and optimization. Normal users
can ignore this.
A list of available windows is displayed. If a window is checked, it will be
visible. To hide it, uncheck the menu item. To show it, check the menu item.
Overview Window
The Overview window displays a shrunk view of the slice being edited. This window
drives the cropped image that is placed in the main Edit window. A green rectangle
corresponds to the Edit window's placement within the overall slice. A white
rectangle corresponds to the cropped image that is placed in the Edit window.
To use, click and drag an area to crop that area and place it into the main Edit
window. Smaller cropped images will result in speed improvements -- especially
when switching between slices a lot and using image processing filters. Once a
crop is made, clicking on different locations within the slice will center the cropped
view where the click occurred. You can, of course, re-crop to get other areas within
the slice.
If an item is active for editing or selected in the Anatomy List Window, you may
want to select the button at the bottom of this window to automatically crop to the
bounding box of the active structure. NOTE: This does not navigate to a slice with
the active structure, but stays on the current slice. If the active structure is not in
the slice, you may crop to an area that you didn't intend. Also, if the bounds of the
structure are outside the bounds of the volume (i.e. you're working on a
sub-volume) the crop region may be zero!
Controls Window
[Screenshot: the Controls window, showing the View, Depth, and Active ID Slice navigation controls, the paint tools, and the list of structures in the current slice (for example, "880: Connective Tissue").]
The Controls Window contains several tools for navigating to an image view, what
to see, how to see it, and how to edit.
Navigator Panel.
Paint Tools Panel.
Slice Structures Panel
Navigator Panel
[Screenshot: the Navigator Panel, with View (Axial, Coronal, Sagittal), Depth, and Active ID Slice (Ceph/Mid/Caud, Post/Mid/Ant, Rt/Mid/Left) controls.]
The Navigator Panel allows you to navigate Axial, Coronal, or
Sagittal views. Choosing the appropriate view (Axial, Coronal, or
Sagittal), then using the slider or typing a slice number in the field to
the right of the slider, takes you to a particular slice in the volume.
If an item is made active by selecting it from the Anatomy List Window
or with the Dropper Tool, you can easily navigate to the starting slice,
middle slice, or last slice that contains that element by pressing the
"Min", "Mid", or "Max" buttons. A cropped area will be made that
encompasses the structure from the min slice to the max slice. The
min and max buttons have the following abbreviations:
"Ceph" is short for "Cephalad" which is toward the head.
"Caud" is short for "Caudal" which is toward the feet.
"Post" is short for "Posterior" which is toward the back.
"Ant" is short for "Anterior" which is toward the front.
"Rt" is short for "Right".
"Left" will take you to the left-most side of the structure in the
Paint Tools Panel
Tools here are used to edit and view the image. For the most part,
holding the SHIFT-, ALT-, or OPTION-keys with a particular tool
selected will perform the opposite effect of the tool. RIGHT-Clicking
each tool (COMMAND-Clicking on the Mac) will bring up options if they
exist to modify parameters. The top row contains mostly editing tools
and the lower row contains mostly tools for viewing.
Paint Tool
The paint tool is what is used for painting over the image. When
this tool is selected, clicking on the image will make a mark.
Clicking and dragging continue to draw until the mouse is
released. The ink will be a bright aqua color. If the ink is not
bright, then that indicates that the area being painted is locked.
NOTE: Holding the SHIFT-, ALT-, or OPTION-key while drawing
temporarily enables the Eraser tool so you do not have to select
the eraser button in the tool palette for a quick edit.
Also, when in paint mode, RIGHT-clicking (COMMAND-clicking
on Macintosh) in the edit window performs a fill. Draw the border
with the paint tool, then RIGHT-click inside that region to fill it.
To change the width of the Paint brush from its default setting of
1 pixel, RIGHT-click the Paint Tool button (COMMAND-click with
Macintosh). This will bring up a slider where you can adjust the
diameter of the brush.
This tool can also be selected by pressing the F1-key or "1" on
the number pad.
Eraser Tool
The Eraser removes paint edits and sets the pixels to what they
were previously. You can only erase the Active structure. Upon
releasing the mouse, the alpha masks that were changed will
draw in their particular color.
NOTE: Holding the SHIFT-, ALT-, or OPTION-key while erasing
temporarily switches to the Paint tool, allowing you to draw if
you over-erase.
The cursor will adjust somewhat to the size you are going to
erase. There is a limit, however, to the size of the cursor, so if
you are zoomed in, you may erase a larger area than what the
cursor may make you believe.
To change the width of the Eraser brush from its default setting
of 1 pixel, RIGHT-click the Eraser Tool button (COMMAND-click
on Macintosh). This will bring up a slider where you can adjust
the diameter of the brush.
SHIFT-clicking the Eraser Tool Button erases all paint marks
from the edit window. Pressing the ESC-key from the application
will do the same thing.
This tool can also be selected by pressing the F2-key.
Fill Bucket Tool
The fill bucket performs a fill to areas bounded by a painted
border or enclosed by the active structure. By "painted" border,
this means the bright aqua color or locked color drawn by the
paintbrush. If you draw a circle, you can fill it. If the active
structure already has an outline painted in its masked color,
then the fill stops when it hits that border as well.
RIGHT-Click this button to bring up a popup to select other
options to the bounded fill described above.
Island Fill
An island fill takes the outer boundary and fills
everything inside. Consider a donut. The "island" is the
hole. Performing an island fill will fill the center.
You can temporarily perform an island fill without
selecting this checkbox by holding the SHIFT-key while
filling. This means with the paint tool selected, since
RIGHT-clicking normally performs a fill, holding the
SHIFT-key while RIGHT-clicking will perform an island fill.
Fill to Painted Border
This option fills until it finds the painted border -- bright
aqua or locked colored pixels drawn by the paintbrush.
This differs from the normal fill in that the fill will continue
even if pixels already have the same mask as what's
being filled. This is useful when reworking the outline of a
structure and islands would be made on a normal fill.
Here's an example of where you might fill to the painted
border. If you clicked on the pink segmented mask in
an attempt to grow it, no fill would be performed in normal fill
operation. Also, if you clicked inside one of the dark
areas still inside the drawn line, the normal fill would just
fill that one dark area because it stops when it encounters
pink or the painted blue. With "Fill to Painted Border"
checked, however, it will fill the entire bounded area that
was painted.
Fill clicked element
This option takes the mask value of the pixel that was
clicked and performs a fill of pixels that are attached with
the same alpha value. This is useful in situations where
you want to rename or reclassify a complete structure.
This type of fill can be performed without painting a
bounding region, but it also responds to normal bounding
if a border is painted.
Fill unlocked pixels
With this option, you must click on an unlocked pixel. The
fill will fill all joined, unlocked pixels. This is useful in
situations where a region of unlocked pixels to be filled is
bounded by locked pixels. This type of fill can be
performed without painting a bounding region, but it also
responds to normal bounding if a border is painted.
NOTE: Fills can also be performed by RIGHT-clicking
(COMMAND-clicking on Macintosh) the image when the Paint
tool is selected.
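The difference between the normal bounded fill and "Fill to Painted Border" can be pictured with this Python sketch. It is an illustration only, not the Segmentor's code: the `PAINT` marker, the 4-connected neighborhood, and the function name are assumptions, and the island, clicked-element, and unlocked-pixel variants are not covered.

```python
from collections import deque

PAINT = -1  # hypothetical marker for freshly painted border pixels


def bounded_fill(mask, start, active, fill_to_painted_border=False):
    """Return the set of (row, col) pixels a fill from `start` would cover."""
    rows, cols = len(mask), len(mask[0])

    def blocks(value):
        # Painted pixels always stop the fill. Pixels that already carry the
        # active structure's mask stop it only in the normal fill.
        if value == PAINT:
            return True
        return value == active and not fill_to_painted_border

    if blocks(mask[start[0]][start[1]]):
        return set()
    filled = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in filled:
                if not blocks(mask[nr][nc]):
                    filled.add((nr, nc))
                    queue.append((nr, nc))
    return filled


P, A = PAINT, 7  # A is an assumed mask value for the active structure
mask = [
    [P, P, P, P, P],
    [P, 0, A, 0, P],
    [P, 0, A, 0, P],
    [P, 0, A, 0, P],
    [P, P, P, P, P],
]
normal = bounded_fill(mask, (1, 1), active=A)
to_border = bounded_fill(mask, (1, 1), active=A, fill_to_painted_border=True)
on_mask = bounded_fill(mask, (1, 2), active=A)
```

In this example the normal fill covers only the three pixels to the left of the already masked column, and clicking directly on the mask fills nothing. With the painted-border option the whole 3 x 3 interior is filled.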
Spline Tool
A spline is a line that bends and curves about control points.
This tool is useful for making smooth lines. To use, click on the
image. A "control point" is laid down. Continue to click on the
image to lay down more control points. As you do, a curved line
will be laid down between the control points.
Double-clicking the last point will close the loop, making a line
between the last control point and the first point. To disconnect
a closed loop, add a point outside of the existing line.
Click and drag a point that is already laid down to move it to a
new location.
Click on the line to insert a control point between two others.
Hold the SHIFT-key while clicking on a control point to delete it.
Press the DELETE key to delete the last control point.
Holding the SHIFT key while pressing the DELETE key will
remove all points and remove the spline.