ANALYSIS OF THERMAL HYDRAULIC FLOW SYSTEMS
by
Abdulati, A.Ali Alshames
BA, 7th of April University 2000
A thesis submitted to the
University of Colorado Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Mechanical Engineering
2011
This thesis for the Master of Science
degree by
Abdulati, A.Ah Alshames
has been approved
by
Date
Alshames, Abdulati, A.Ali (M.S., Mechanical Engineering)
Analysis of Thermal Hydraulic Flow Systems
Thesis directed by Professor John Trapp
ABSTRACT
Thermal hydraulic flow systems are widely used in many applications in
different fields. Improving the performance of such systems requires a good
understanding of the their behavior. In order to understand the behavior of
thermal hydraulic systems, many studies have been done to analyze the steady
state conditions. Since all thermal hydraulic flow systems pass through tran
sient conditions, the analysis of the steady state conditions does not give a
complete prediction of the system behavior. This thesis addresses the behavior
of some thermal hydraulic flow system components during the transient condi
tions. Pipes and pumps are the basic components that are used in almost all
flow systems. The transient analysis of these two components is the subject of
this thesis. Moreover, reaction steam turbines are one of the most complicated
hydraulic flow system components that have a big influence on our lives. Thus,
the detailed analysis of a reaction steam turbine is included in this thesis.
This abstract accurately represents the content of the candidates thesis. I
recommend its publication.
DEDICATION
This thesis is dedicated to my parents who spend their lives supporting me,
to my brothers and sisters who encourage me, and to my best friends who always
help me.
ACKNOWLEDGMENT
I would like to express my sincere gratitude to my advisor Professor John Trapp
for his great guidance, support and encouragement. Throughout my thesis work
he encouraged me to develop independent thinking. This work would not be
done without his support and advice. I also wish to express my appreciation
to my committee members Dr. Sam Welch and Prof. Peter Jenkins for their
comments and their help.
CONTENTS
Figures ................................................................. xi
Tables................................................................. xiii
Chapter
1. Introduction........................................................... 1
1.1 Background........................................................... 1
1.2 Objective ........................................................... 2
2. System Models ......................................................... 3
2.1 Governing Equations.................................................. 3
2.2 Numerical Simulation................................................. 5
2.2.1 Stability of the Numerical Scheme................................ 6
2.2.1.1 Explicit Finite Difference Scheme................................ 7
2.2.1.2 Implicit Finite Difference Scheme............................... 10
2.2.1.3 SemiImplicit Upwind Finite Difference Scheme................... 11
2.3 The Wall Heat Transfer........................................... 14
2.3.1 The Wall temperature Constitutive Equation .................... 14
2.3.1.1 Explicit Scheme................................................. 15
2.3.1.2 Implicit Scheme................................................. 16
2.4 The General Final Scheme............................................ 16
3. Piping Systems........................................................ 19
3.1 Pipes............................................................... 19
vii
3.1.1 Pressure Losses and Energy Dissipation........................ 20
3.1.1.1 Major Head Losses Due to Friction (hL)........................ 20
3.1.1.2 Minor Head Losses Due to Changes in Size, Shape or Direction
of Flow (hLm)................................................. 23
3.1.2 Body Force (BF)................................................. 24
3.1.3 Wall Heat Transfer (Q)......................................... 24
3.2 Numerical Modeling Equations For Pipe............................ 30
3.3 Simple Test Problem of pressure Drops for a Flow in Horizontal Pipe 31
3.3.1 Problem Statement............................................... 31
3.3.2 Analytical Solution ............................................ 32
3.3.3 Numerical Solution ............................................. 33
3.4 Simple Test Problem of Heat Transfer for a Flow in Horizontal Pipe 39
3.4.1 Problem Statement............................................... 39
3.4.2 Analytical Solution ............................................ 40
3.4.3 Numerical Solution ............................................. 41
3.5 Pump............................................................... 41
3.5.1 Modeling Equations for a Pump................................. 44
3.5.2 Numerical Scheme for a Pump................................... 45
3.5.3 Piping system Test problem .................................... 45
3.5.3.1 problem Statement............................................. 45
3.5.3.2 Analytical Solution........................................... 46
3.5.3.3 Numerical Solution............................................ 48
4. Steam Turbine........................................................ 55
4.1 The Classification of The Steam Turbine............................ 55
viii
4.2 Reaction Axial Turbines............................................ 57
4.2.1 Turbine Stage Analysis........................................... 58
4.2.2 Turbine Stage Work............................................... 59
4.2.3 Turbine Stage Reaction .......................................... 60
4.2.4 Turbine Stage Efficiency ........................................ 61
4.2.5 Repeating Stage Turbines........................................... 61
4.3 Modeling Equations of MultiStage Turbines......................... 62
4.4 Selected Design for 50% Reaction MultiStage Turbine .............. 66
4.5 Simulation of Selected Turbine Design........................... 67
5. Conclusion............................................................ 74
Appendix
A. The Numerical Analysis of SemiImplicit Upwind Scheme............. 76
B. Linearized State Equation for Pressure and Entropy................ 81
B.l Linearized pressure state equation................................ 81
B. 2 Linearized entropy state equation............................... 82
C. State Properties Code ............................................ 84
C.l Loading Constants for State Properties ............................ 84
C.2 Calculation of Q2 and its Derivatives............................. 91
C.3 Calculation of the Derivatives k, f3, Cv, Cp...................... 95
C.4 Calculation of the Internal Energy, Enthalpy, and the Entropy ... 98
C.5 Calculation of the Pressure....................................... 100
C.6 Calculation of the Saturation Pressure and Saturation Specific Vol
ume for Liquid and Vapor.......................................... 101
C.7 The Main State Function........................................... 108
IX
D. Dynamic Viscosity and Friction Factor Code....................... 113
D.l Dynamic Viscosity Calculation ..................................... 113
D. 2 Friction Factor Calculation.................................... 116
E. Convective Heat Transfer Coefficient Code........................ 119
E.l Thermal Conductivity Calculation .................................. 119
E. 2 Convective Heat Transfer Coefficient Calculation .............. 123
F. Flow in a Pipe Code.............................................. 126
F.l General Code for Different Pipes Problems.......................... 126
F.2 User Input Part of Pipe Code for Pressure Drop Test Problem . 146
F. 3 User Input Part of Pipe Code for Heat Test Problem............... 149
G. Curve Fitting of Pump Data Code.................................. 153
H. Flow in PipePump System Code.................................... 155
I. Newton Method to Calculate Thermodynamic Properties as Function
of Entropy and Pressure............................................. 174
J. Five stages Reaction Turbine Code.................................. 178
References............................................................. 203
x
FIGURES
Figure
2.1 absolute value for A in the explicit upwind scheme.......................... 9
2.2 difference between control volumes for a cell and a junction(staggered
spatial mesh).............................................................. 12
2.3 general schematic diagram of a thermalhydraulic component in staggered
mesh....................................................................... 17
3.1 water How in a pipe........................................................ 32
3.2 pipe discretization........................................................ 33
3.3 steady state solution for How in a pipe.................................... 34
3.4 velocity change with space and time for How in a horizontal pipe........... 36
3.5 pressure change with space and time for How in a horizontal pipe. ... 37
3.6 density change with space and time for How in a horizontal pipe............ 37
3.7 temperature change with space and time for How in a horizontal pipe. . 38
3.8 entropy change with space and time for How in a horizontal pipe............ 38
3.9 steady state solution of heat test problem for How in a pipe............... 42
3.10 transient solution of heat test problem for How in a pipe.................. 43
3.11 schematic diagram of piping system......................................... 46
3.12 pump performance curve and system head curve........................... 48
3.13 discretization of the piping system........................................ 49
3.14 steady state solution of the How in piping system.......................... 51
3.15 velocity change with space and time for How in piping system............... 52
xi
3.16 pressure change with space and time for How in piping system........... 53
3.17 density change with space and time for Row in piping system................ 53
3.18 temperature change with space and time for Row in piping system. ... 54
3.19 entropy change with space and time for Row in piping system................ 54
4.1 reaction turbine stage velocity diagram.................................... 58
4.2 multistage turbine discretization..................................... 62
4.3 schematic diagram of Rve stages reaction turbine........................... 66
4.4 discretization of Rve stages reaction turbine.............................. 67
4.5 steady state Turbine simulation............................................ 69
4.6 velocity change with space and time for reaction turbine................... 71
4.7 pressure change with space and time for reaction turbine................... 72
4.8 density change with space and time for reaction turbine.................... 72
4.9 temperature change with space and time for reaction turbine................ 73
4.10 entropy change with space and time for reaction turbine.................... 73
xii
TABLES
Table
3.1 Coefficients a*. for equation(3.5)................................... 22
3.2 Coefficients bij for equation(3.5)................................... 22
3.3 Numerical Values of the Constants for equation (3.20 3.23). ... 27
3.4 Coefficients am for equation (3.21).................................. 28
3.5 Coefficients for equation (3.20)..................................... 28
3.6 Steady State Result for Flow in Pipe................................. 35
3.7 Steady State Result of heat test problem for flow in a pipe.......... 42
3.8 system head calculation.............................................. 47
3.9 Steady State Result for Flow in Piping system........................ 50
4.1 diameters and the height at each row of the turbine.................. 68
4.2 Steady State Result for turbine simulation........................... 70
xiii
1. Introduction
1.1 Background
Thermal hydraulic flow systems are significant in many different applica
tions and occupy a very prominent place in our lives. These systems are widely
used in many disciplines such as these related to heating and cooling of build
ing complexes, industrial processes, power generation, aerospace and automobile
engineering, and many others. A thermal hydraulic system consists of multiple
units or components that interact with each other. There are various types of
components that make up a thermal hydraulic flow system. The most com
monly use components are turbines, boilers, pipes, compressors, pumps, heat
exchangers, solar collectors, mixers, accumulators and valves.
Due to the increasing use of thermal hydraulic flow systems, more accurate
analysis of the behavior of the flow systems components is needed in order to
improve their performance. Although, many studies have been done on several
components that are used in thermal hydraulic systems, most of these studies
are concentrated on the behavior of steady flow states. However, the transient
conditions are often encountered in reality during the start up and shut down the
systems or during an accident. Hence, the behavior of the transient conditions
have to be considered in the analysis of these systems.
The analysis of the thermal hydraulic flow systems is often complicated by
the complexity of the nature of the fluid flow and the complexity of the mass and
energy transfer mechanisms especially when the change in space is coupled with
1
change in time. Consequently, some level of approximations and simplifications
must be reasonably applied in order to make acceptable analysis .
1.2 Objective
This thesis aims to study some of the most common thermal hydraulic flow
system components by simulating their characteristics during the transient con
ditions. The steady state also will be considered in order to compare it to some
analytical solutions. Assumptions of homogeneous onedimensional flows will
be applied in this study. The general forms of the governing equations that
apply all principle of physics with their appropriate numerical schemes will be
obtained and applied to specific components. Pipes, pumps and steam turbines
are the selected components that will be simulated as examples of widespread
thermalhydraulic flow system components.
2
2. System Models
Modeling is significant in order to understand and predict the behavior of
any system. The common concerns in this procedure are to model the system
in such a way that the system performance can be satisfied. This satisfaction
requires a deep understanding of the physics of the system, determines a model
of the system and applies the basic principles of physics to this model in order
to develop appropriate mathematical equations that handle all the physics.
2.1 Governing Equations
In general, thermal hydraulic systems can be modeled as a finite number
of control volumes. The basic principles of mass conservation, Newtons second
law, and the energy conservation can be applied to the control volume to obtain
mathematical governing equations. Our interest in this thesis is to model one
dimensional homogeneous flow of thermal hydraulic systems. Applying these
basic principles of physics to the control volume of onedimensional homoge
neous flow produces a general form of three differential governing equations for
any thermal hydraulic component, and it can be developed as needed for specific
components. As a result, the three governing equations that describe the be
havior of the thermalhydraulic systems are the continuity equation, momentum
equation and the energy equation. These equations can be written per unit of
volume in onedimensional flow as:
1 Continuity equation.
dp d(pu) _
dt dx
3
(2.1)
The first term on the left hand of the continuity equation( ^ )js the rate
of change of mass inside the control volume, whereas the second term ) is
the mass flux at the boundary of the control volume.
2 Momentum equation
d(pu) d(pu2) dp BF Pl
dt dx dx L L
(2.2)
The first term on the left hand of the momentum equation is the rate
of change of momentum inside the control volume, and the second term
d(pu2)
dx
IS
the momentum flux at the boundary of the control volume. On the right hand
of the equation, the three terms are the pressure gradient ^, the body force per
unit of volume BF, and the pressure losses due to wall friction and changes in
size, shape and flow direction Pl where L is the length of the volume.
3Energy equation
The total energy equation is transformed into the thermal energy equation
in term of the entropy by subtracting the mechanical energy obtained from the
momentum equation, from the total energy equation. The result is
(dps) t d(psu) Q i Df ,0 ^
dt + dx ~TW + TL [Z6)
The first term on the left hand of the energy equation is the time rate
of change of entropy inside the control volume, and the second term is
the entropy flux at the boundary of the control volume. On the right hand of
the equation, the two terms are the heat transfer between the fluid and the wall
due to the temperature difference Q, and the energy dissipation due to the wall
friction Df where V is the volume.
4
The three governing equations are written in terms of five primary dependent
variables. These dependent variables are density rho, temperature T, pressure
p, entropy s, and the velocity u. Therefore, more two equations are needed
to solve for these primary variables. The two equation can be obtained from
state equations that relate the four thermal variables to each other. So, any
variable of the four can be written in terms of two other variables. Here, the
two state equations will be selected for the pressure and entropy state equations
as a function of density and temperature as following:
4Pressure state equation
P = P{P,T)
5Entropy state equation
(2.4)
s = s(p,T) (2.5)
2.2 Numerical Simulation
Numerical simulation approximates the actual system and yields the quan
titative information on the behavior of the system under different operation con
ditions.A general approach to simulate a thermalhydraulic component requires
solving the partial differential equations of the models numerically. The numer
ical solution will be based on the finite difference method. There are several
choices or approaches among which the finite difference scheme can be devel
oped. One of the most important considerations in making a decision for which
approach is appropriate is the stability of the scheme.For this reason, different
5
numerical schemes have to be investigated in order to choose the appropriate
scheme for the governing equations.
2.2.1 Stability of the Numerical Scheme
The Von Neumann method is one of the powerful methods used to analyze
the stability of the finite difference equations. Therefore, this method will be
used here to analyze the stability of the different schemes of the modeling equa
tions. Since the stability analysis of a system of the three modeling equations
that involve time (continuity, momentum and energy ) for different schemes will
requires a lot of work, as a first step, the analysis will be for specific simple
case of one finite difference equation that has a form similar to the conservation
of mass equation or the conservation of energy equation. Then, the selected
scheme that works for this specific case will be fully analyzed for the system of
the three equations.
The equation (2.3) with assumptions of no source terms can be rewritten
as
(dps) + d(psu) = 0
(2.6)
dt dx
Both equations (2.1), (2.6) have same formula. So, they can be written as:
dR
dt
d(Ru)
dx
= 0
(2.7)
where R can be p or ps.
For the specific case of the constant velocity u, this equation becomes
dR dR n
~dt U~dx ~
(2.8)
This partial differential equation (equation (2.8)) will be numerically solved
6
using different finite difference schemes in order to analyze the stability of these
different schemes.
2.2.1.1 Explicit Finite Difference Scheme
In the explicit approach, the values of the variables at a point in time level
n + 1 (new time) depends only on the values of the variables at this point and
its neighbors in time level n (old time). So, the solution can be specified in term
of known variables, and it can be obtained directly. The partial differential
equation (equation (2.8)) in this approach can written as
dR dRn n
at+u^ = 0
(2.9)
Developing this partial differential equation explicitly can be done in several
different ways. Two of these ways will be discussed here, the centered difference
method and the upwind method.
1) Explicit Centered Difference Method.
In this method, the finite difference equation can be developed by approxi
mating the partial derivative with respect to time by the firstorder forwardtime
approximation at the grid point and approximating the partial derivative with
respect to space by the firstorder centeredspace approximation.
R]+1 iff
_ pn
J .j+1 nj~ 1
At
+ u
2Ax
= 0
(2.10)
This equation can be solved numerically using Fourier components which
can be expressed as
iff = i20eJ'e(wAt>n
Considering that A" = a,A0n5 where A is the amplitude function at time
7
n, the Fourier component can be written as
Rn = RoXne^ikAx)j
(2.11)
Substituting equation (2.11) into equation (2.10) and solving for A, we can
get
A = 1 isin(kAx) (2.12)
For the finite difference scheme to be stable and numerical solution must
remain bounded, we must have A < 1 for all k. this requirement must be
satisfied conditionally or unconditionally before the scheme is selected.
Analyzing equation (2.12) to determine the stability criteria, we get
A = \/12 + C^sin(kAx)f (2.13)
From equation (2.13), it is clear that the absolute value of A is always bigger
than 1, and the stability condition cannot be satisfied. In other words, this
scheme is unconditionally unstable, and the solution will blow up after few
steps.
2) Explicit Upwind Method.
Developing an upwind finite difference equation can be done by replacing
the partial derivative with respect to time by the firstorder forward difference
approximation at the grid point and replacing the partial derivative with respect
to space by the firstorder onesided approximation in the upwind direction.
R]+1
DTI DTI
Tlx SXa
3 +u 3
Dn
= 0
(2.14)
At Ax
Substituting Fourier component^ equation (2.11) ) into equation (2.14) ap
plies several straightforward manipulations, the solution for A is
8
, uAt. uAt , uAt . .
A = (1 ) + ^cos{k Ax) isinykAx)
(2.15)
Ax ' Ax ~""v ~ ' ~ Ax
Equation (2.15) is the equation of a circle in complex plane centered at
(1 on the real axis with radius ^ as it shown in figure(2.1).
Figure 2.1: absolute value for A in the explicit upwind scheme.
For the stability requirement, A < 1, it can be seen from the figure that
the circle radius must stay within the unit length. On other words, to keep this
scheme stable we must have < 1. Therefore, this scheme is conditionally
stable, and it can be used under this stability requirement. However, this condi
tion can reasonably be satisfied if we just consider the information propagating
9
with the velocity of moving particles. On the other side, satisfying this condition
for the information propagating with the velocity of speed of sound (acoustic
propagation) requires very tiny time steps At that can be unreasonable to get
to the steady state solution.
2.2.1.2 Implicit Finite Difference Scheme
In the implicit approach, the values of the variables at a point in time level
7i + l (new time) depends on the values of the variables at its neighbors in time
level 77 + 1 (new time) as well as the values of the variables at time level n
(old time). Furthermore, the solution is specified in term of unknown variables,
and it can be obtained by solving a system of equations simultaneously. This
approach can be written for the partial differential equation (equation (2.8)) as:
dR dRn+1 n
~at+u dx ~
(2.16)
Developing this partial differential equation implicitly into a finite difference
equation will be based on the upwind method.
Implicit Upwind Method.
As it explained above, the finite difference equation in the upwind method
can be developed implicitly from partial differential equation as:
R]
Tl+l
R Rn
3 + u^
in+l
nn+1
Rji
0
(2.17)
At Ax
The solution of equation (2.17) for A is accomplished using Fourier compo
nent(equation (2.11)), and the solution after a number of algebraic steps is in
the form of
A =
1
1 + ^(1 cos(kAx)) + i^sin(kAx)
(2.18)
10
By taking the absolute value of A we get
I'M = , 1 (219)
y (1 + aH1 cos(kAx))2 + (^sin(kAx))2
Because (1 cos(kAx)) always bigger than or equal to zero, it is obvious
that A has denominator always bigger or equal one, which makes A always
less than or equal one. Consequently, this scheme unconditionally satisfies the
stability criteria, and it can be used even for large time step. On the other
side, the disadvantage of this scheme is that using it for general case where the
velocity is changeable will produce a system of nonlinear equations that cant
be solved directly, and solving this system of equations iteratively will result in
inefficient code.
Based on the stability analysis of the previous different schemes, we can
suggest semiimplicit upwind scheme that is mixing up of the implicit upwind
scheme and explicit upwind scheme to be fully analyzed for the three modeling
equations. The semiimplicit upwind finite difference scheme will be developed
by evaluating only the terms involved in acoustic equation implicitly where other
terms will be evaluated explicitly. Using this strategy in evaluation will allows
us to overcome the disadvantage of the explicit and implicit upwind schemes
because evaluating acoustic terms implicitly will not require small time step for
stability and evaluating nonacoustic terms explicitly will not produce nonlinear
equations.
2.2.1.3 SemiImplicit Upwind Finite Difference Scheme
The semiimplicit upwind scheme is based on replacing the differential equa
tions with finite difference equations that are partially implicit. The implicit
11
terms are formulated to be linear in the dependent variables at new time. This
results in a linear timeadvancement matrix that can be solved by direct in
version. However, the thermal hydraulic systems can be represented as cells
and junctions. The scalar properties such as density, temperature, pressure,
and entropy are defined at cell centers, whereas velocities are defined at cells
boundaries (junctions). Therefore, the model of the system will be a control
volume for a cell, where the continuity, energy, and the state equation have to
be solved, and control volume for a junction, where the momentum equation
has to be solved. This approach results in numerical scheme having a staggered
spatial mesh.
The terms evaluated explicitly and implicitly for the different modeling equa
tions using staggered mesh of flow in a pipe illustrated in figure (2.2) as identified
below.
Control volume of cell (j)
H__________________________________________________________i_^____________________________________________^_____________________________________________________________________________________________________________i+1
i'/i j+/i
hH
Control volume of junction (j+Vi)
Figure 2.2: difference between control volumes for a cell and a junction (staggered
spatial mesh). 1
1) Continuity Equation.
The continuity equation can be developed for the semiimplicit upwind
12
scheme from equation (2.1) after linearizing it as
f>T'
At
+ u
nn
Pi1
u
Ax
+ P
nf 1
>+1
"+:
____J 2
Ax
= 0
(2.20)
2) Momentum Equation.
The momentum equation (equation (2.2)) with assumption of frictionless
and neglecting the effect of body force can be developed for the semiimplicit
upwind scheme as
n+l _n+l
lin+1 ?/n
V* Vi
At
un i un i v. xr
2 o+h , j+1 Pj
+ puA+ J J
Ax
Ax
= 0
(2.21)
3) Energy Equation.
The energy equation (equation (2.3)) after linearization and with assump
tions of well isolated system and frictionless can be developed for the semi
implicit upwind scheme as
cn+l cn n ______ n
13 + u 3 J = 0
(2.22)
At Ax
The system of three finite difference equations (equation (2.20), equation
(2.21), equation (2.22)) have been solved numerically using Fourier components
and linearized state equation (for more details see appendix A) in order to make
sure that the selected semiimplicit scheme for the system of governing equations
has same stability condition as it in single special case equation. After series of
manipulations, we get the three roots of A.
. uAt. uAt uAt . ,
Ai = (1 ) I cos(kAx) isin(kAx)
v Ax; Ax v Ax v
(2.23)
13
(1 ~ ^cosjkAx) i^sin(kAx)
(2.24)
l1 ~ + ^cos(kAx) i^sin(kAx)
(2.25)
Ax '3t,'v 2 >
where a in the equations(2.24) and (2.25) is the sound speed.
Ai equation is same as equation (2.15) for A in explicit upwind scheme. So, it
has same stability condition, which is ^ < 1. However, this stability condition
is for the information propagating with the velocity of moving particles only
since we evaluated acoustic terms implicitly.
A2 and A3 equations ((2.24), equation (2.25)) have a numerator equal to Ai
and a denominator with absolute value greater than or equal to 1. Therefore, it
has same required stability condition for A3.
2.3 The Wall Heat Transfer
In most cases, fluid flows in a hydraulic system are associated with some
energy transferred between the fluid and the system boundaries (the wall) due
to the temperatures difference. This onedimensional heat transfer is considered
in the energy equation (equation (2.3)) by the term Q. Since the heat transfer
term depends on the wall temperature that is a function of time, the stability
of the numerical scheme for the constitutive equation used to calculate the wall
temperature is necessary.
2.3.1 The Wall temperature Constitutive Equation
The time dependance wall temperature Tw due to the heat transfer between
a wall and fluid at temperature can be determined by overall energy balance
14
on the wall. With assumption of negligible wall temperature gradients, the
energy balance relates the rate of heat transfer convectively at the wall surface
to the rate of change in internal energy. This relation can be written as[9]
pVcp^ = ~hA(Tw T^) (2.26)
where p is the wall materials density, V is the volume, cv is the specific heat of
the wall material, h is the convictive heat transfer coefficient, and A is the wall
surface area.
Defining the temperature difference as
Q = Tm Too
for is constant,  = That leads to
dt
89
dt
hA
pVc?
9
(2.27)
This constitutive equation (equation (2.27)) will be analyzed explicitly and
implicitly in order to choose the right scheme in term of the stability.
2.3.1.1 Explicit Scheme
The equation (2.27) can be written in the explicit scheme as
At
hA nn
pVcp 3
(2.28)
The Fourier solution for the deferential constitutive equation can be written
as
0? = 0o\nei(kAx)i
(2.29)
15
Substituting equation (2.29) into equation (2.28) and solving for A resulting
in
A = 1
hAAt
pMcp
(2.30)
For the scheme to be stable it must has A < 1. Applying this stability
criteria to equation (??) shows that the explicit scheme is conditionally stable.
The stability condition is hAAt < 2pVcp. That means using tiny time step if
the convective heat transfer coefficient is very high.
2.3.1.2 Implicit Scheme
The equation (2.27) can be written in the implicit scheme as
hA
rl 
At
.Qn+1
p\/cp 3
(2.31)
substituting Fourier component into equation (??) and solving for A pro
duces
A =
1
1 +
hAAt
pVcp
(2.32)
Since the term
hAAt
pVcp
is always bigger than or equal zero, it is obvious that
the implicit scheme is unconditionally stable. Therefore, the wall temperature
constitutive equation will be evaluated implicitly whenever it used to obtain the
wall temperature profile.
2.4 The General Final Scheme
The basic schematic diagram of a thermal hydraulic component in a stag
gered mesh that has junction numbering different than the cells numbering is
illustrated by figure (2.3). The general governing equations for the component
illustrated in the diagram can be written in the semiimplicit upwind scheme as
16
Control volume of cell ( nv )
L_____________At_________M
1 J. 1 J. J .1. J.
1 1 i i nv i f 1 u nv+1
j (p.T.P.s) i i i
nj1
nj+1
Control volume of junction ( nj)
Figure 2.3: general schematic diagram of a thermalhydraulic component in stag
gered mesh.
 Continuity Equation for cell nv
r.n+1 nn rfi _ nn 7/n+l
Pnv Pnv __ Pnvunj________Pnvlunjl ___ q
At
Ax
(2.33)
 Momentum Equation for junction nj
u
n+l
U"
At
+ 0.5 p\
W)2 Ki)2
Ax
+
pith
vn+i
rnv
Ax
bf+Pl_
Ax Ax
(2.34)
 Energy Equation for cell nv
sn+l on
jn nv nv
At
Pit1 ~ Pnv
At
rfl <,n ,
Jnvnv
.n+l
lnj
Q Df
TÂ£vVnv TÂ£v Ax
(2.35)
These three equations are written in such way to consist with figure (2.3)
where the velocity is taken as positive to obtain the donored flux. Both velocity
directions are considered in the code. The terms on the right hand side of
equations (2.34), (2.35) (body force BF, pressure losses Pl, wall heat transfer Q,
energy dissipation due to friction Df) will be developed later for each component
as needed.
 Pressure State Equation for cell nv
17
Linearized pressure state equation as function of density and temperature
are obtained using Taylor series in the form
Pnntl = Pi + ^1 r(pltl pn) + (l  TZ) (2.36)
 Entropy State Equation for cell nv
Linearized entropy state equation as function of density and temperature
are obtained using Taylor series in the form
f) c f)c
C = + fykUC' Piv) + g^CC1 7Z) (2.37)
The terms t, Â§^r? an(^ Â§flp are calculated in term of (k, (3,cp,cv) for
single phase and two phase region using Maxwell relations (see appendix B).
The formulas for (/c, (3, cp, cv) and other thermodynamic properties as a function
of temperature and density are presented in reference [6]. These formulas are
coded by Pro. Welch and the code is used in this thesis and included in appendix
C.
18
3. Piping Systems
Piping systems are important components in many engineering systems.
They are widely used in several applications in Aerospace, Agricultural, Chem
ical, Civil, Environmental, Mechanical,Nuclear, and Petroleum engineering. In
general, piping systems include components such as pipes, pumps, reservoirs,
valves, and heat exchangers.
Since the unsteady (transient) state analysis of piping systems is character
ized by variables that are functions of time and space, it is complicated com
pared to the steady state analysis. Therefore, many studies have been focused
on the consideration of steady state conditions. The steady state analysis of
piping system, however, is less applicable since the unsteady state is more often
encountered. For that reason,a simple piping system consisting of two compo
nents (pipe, pump) will be studied in this chapter taking in account the transient
case.
3.1 Pipes
A pipe is a basic component encountered in any piping system. It is used
to carry the fluid between two points or to connect different components. A
pipe can be modeled as one cell bounded by junctions and two phony cells for
boundary conditions or it can be discretized by breaking it up into several cells.
The basic governing equations illustrated in the previous chapter (equations
(2.33)(2.37)) will be used to simulate a pipe with more description of the source
terms in momentum and energy equations.
19
3.1.1 Pressure Losses and Energy Dissipation
For any fluid flows in a pipe, there will be some losses of pressure (PL)
and energy (Df) ,as they appear in momentum equation (equation (2.34)) and
energy equation (equation (2.35)), due to the friction (major losses) and changes
in size, shape, or direction of flow (minor losses). These losses in the pressure
and energy can be calculated in term of the sum of major head losses hi and
minor head losses hLm.
3.1.1.1 Major Head Losses Due to Friction (hi)
These losses are affected by the roughness of the inside surface of the pipe,
the pipe diameter, and the physical properties of the fluid. These losses can be
illustrated by the equation
hL = f
L u2
~D~2
(3.1)
where hi is head lost (m2/s2), / is the friction factor (dimensionless), L is the
length (m), D is the pipe inside diameter (m), and u is the flow velocity (m/s).
In order to estimate the major friction lost in a pipe, the friction factor /
must be calculated. The friction factor depends on the pipe diameter, inner
roughness of the pipe, flow velocity and fluid viscosity. The flow condition,
whether turbulent or laminar, will determine the appropriate method to calcu
late the friction factor.
for laminar flow, the used formula for friction factor[5] is:
/ =
64
Re
for turbulent flow, the most common formula[5] is:
2.51
/'
h = 2.0(OJ(d? +
3.7 Ref
0.5'
(3.2)
(3.3)
20
where (e/D) is the relative roughness, and Re is Reynolds number.
The value for absolute roughness e is available in the literature for different
type of pipes, and Reynolds number Re can be calculated from the formula:
puD
Re =
(3.4)
where p is the fluid density (kg/m3) and p is the dynamic viscosity (Pa.s).
Dynamic viscosity is a physical property that can be calculated for each
type of fluid differently. For one phase region water and steam, the dynamic
viscosity can be calculated in term of density, temperature with the following
constants [6].
p(p, T) = p0(T)exp[4 (3.5)
P 1=0 j=0 p
where
Po{T) = (Â£)a5[EL=oafc(?)fc]_1 x 10 ~6kg/(sm)
p* = 317.763%/m3
T* = QA7.27K
The constants ak and bij are listed in tables (3.1) and (3.2)
For homogeneous two phase region, there are many different formulas in the
literature for the dynamic viscosity[ll, 14]. The most common used formula is
( + Â£.)* (3.6)
Pf Pg
where x is the quality of the mixture, fif is the dynamic viscosity of the liquid,
and pg is the dynamic viscosity of the vapor.
Equations (3.5) and (3.6) is coded in appendix D to calculate dynamic vis
cosity in term of density and temperature. Also, equations (3.2), (3.3), and
21
Table 3.1: Coefficients afor equation(3.5).
Coefficient Value
a0 0.0181583
a i 0.0177624
0.0105287
3 0.0036744
Table 3.2: Coefficients bij for equation(3.5).
i
j 0 1 2 3 4 5
0 0.501938 0.162888 0.130356 0.907919 0.551119 0.146543
1 0.235622 0.789393 0.673665 1.207552 0.0670665 0.0843370
2 0.274637 0.743539 0.959456 0.687343 0.497089 0.195286
3 0.145831 0.263129 0.347247 0.213486 0.100754 0.032932
4 0.0270448 0.0273093 0.0267758 0.0822904 0.0602253 0.0202595
22
(3.4) are coded in appendix D in order to define flow condition and determine
the friction factor for a prescribed flow.
3.1.1.2 Minor Head Losses Due to Changes in Size, Shape or
Direction of Flow (tiLm)
When fluid flows in a valve or fitting, changes in the flow pattern will cause
extra losses. These losses are known as minor losses and can be expressed as:
u2
hLm = K (3.7)
where h^m is the minor head losses (m2/s2), and K is a loss coefficient.
Loss coefficients depend only on the shape and size of the component. There
fore, They can be found in most fluid references for variety of components used
in piping systems.
In addition to the previous formula of minor head losses in equation (3.7),
fittings such as elbows and valves are often rated in terms of friction loss as
equivalent lengths of straight pipe. This can be stated as:
hLm = (3.8)
where Le is equivalent length of straight pipe, and it is available in the literature
for various components used in piping systems.
The total head losses in a pipe is the sum of major head losses {Hl) and
minor head losses (/?Lm)i and it is denoted as
Hl = (/(^ + Â§) + A')y (3.9)
The losses of pressure term in the momentum equation will be the total
head losses in equation (3.9) times the density with minus sign.
Pl =+ ^) + K)j (3.10)
23
Since the thermal energy equation (equation (2.3)) is derived by multiplying
the momentum equation with the velocity to obtain the mechanical energy then
subtracting the mechanical energy from the total energy, the energy dissipation
term in the energy equation (Df) is equal to the pressure losses term times the
velocity with opposite sign.
W = p(/(^ + Â§) + (3.11)
3.1.2 Body Force (BF)
The only body force that can be considered in piping systems are the force
due to gravity. For the case of piping system carrying water, which can be
assumed incompressible, the body force can be expressed as:
BF = pg(H2H1) (3.12)
where g is the gravity (9.81m/,s2) and (H2 Hi) is the levitation between the
inlet and outlet (m).
3.1.3 Wall Heat Transfer (Q)
In most piping systems, the fluid carried by the system is at a different
temperature than the environment or the surrounding temperature. Therefore,
these systems are associated with heat transfer between the fluid and its sur
rounding resulting usually in a temperature gradient along the wall. This heat
transfer term as it appears in the energy equation can be calculated for flow in
a pipe with wall temperature gradient as
Q = hinAin(Tin Tf) (3.13)
where
hin is the convection heat transfer coefficient on the inner surface of the pipe,
24
which will be explained with more details later.
Ain is the pipe inner surface area.
Tf is the fluid temperature.
Tin is the pipe inner surface temperature.
Since the inner surface temperature Tin is a function of the outer surface
temperature Tout, it can be calculated by taking the energy balance on the inner
surface and outer surface of the pipe.
1 Energy balance on the pipe inner surface
^ in PmCi
dT,r
2itlk
dt
{Tout Tin) hinAin{Tin Tf)
(3.14)
where is the inner half of the pipe volume, pm is the density of the pipe
material, Cp is the specific heat of the pipe material, k is the conductivity of
the pipe material, l is the length of the pipe, rin, rout are the inner and the out
radius of the pipe respectively.
2 Energy balance on the pipe outer surface
BT 2irlk
VoutPmCp= houtAout(Tso T^) ~ 7^77^^out ~ Tn) (3.15)
1 rin >
where Vout is the out half of the pipe volume, houÂ£ is the convective heat
transfer coefficient on the pipe out surface, Aout is the pipe out surface area, T^
is the environment temperature.
The convection heat transfer coefficient on the inner surface of the pipe hin
in equation (3.13) is calculated for internal flow[9] as
k
hin = Nu
Di,
(3,16)
25
where Din is the inner diameter of the pipe, Nu is the Nusselt Number, and k
is the conductivity of the fluid.
The Nusselt number Nu is a constant for laminar flow and function of
Reynolds number and Prandtl number for turbulent flow. It can be obtained[9]
as
Laminar flow with uniform surface heat flux
Nu = 4.36 (3.17)
Turbulent flow
Nu = 0.023Re8 Prn (3.18)
where Re is Reynolds number as it is calculated in equation (3.4), Pr is Prandtl
number, n is a constant equal to 0.3 if the fluid temperature is higher than the
inner surface temperature and equal to 0.4 otherwise.
Prandtl number Pr in equation (3.18) is calculated[6] as
Pr = ii% (3.19)
k
where // is the dynamic viscosity as it is calculated in equations (3.5) and (3.6),
cp is the specific heat capacity of the fluid, and k is the conductivity of the fluid.
The conductivity of the fluid k in equations (3.16), (3.19) is calculated for
one phase region as function of the fluid density and temperature with the
following constants[6].
k(p,T) = k(T)exp[
P i=0 j=0
T
m^iy] + Ak
(3.20)
26
where
m=0
(3.21)
C T p 2 d(p/p')
Pt p11a(r/r*)Jp
gx?(4)5exp[yl(
r*
T"
1)2S(^1)4
p
(3.22)
Pr#*)
lo/ / T
(3.23)
p* d(p/p*)
All the constants in equations (3.20)( 3.23) are listed in tables (3.3), (3.4)
and (3.5)
Table 3.3: Numerical Values of the Constants for equation (3.20 3.23).
Constant Value
rp* 647.27 K
P* 317.763 kg/m3
P* 22.115 MPa
c 3.7711 x KT8 WPas/(Km)
w 0.4678
A 18.66
B 1.00
The special derivatives in equations (3.22),(3.23) are obtained for one phase
region using Maxwell relations as
27
Table 3.4: Coefficients am for equation (3.21).
Coefficient Value (Km/W)
do 2.02223
ai 14.11166
5.25597
03 2.01870
Table 3.5: Coefficients bij for equation (3.20) .
i
j 0 1 2 3 4
0 1.3293046 1.7018363 5.2246158 8.7127675 1.8525999
1 0.40452437 2.2156845 10.124111 9.5000611 0.93404690
2 0.24409490 1.6511057 4.9874687 4.3786606 0.0
3 0.018660751 0.76736002 0.27297694 0.91783782 0.0
4 0.12961068 0.37283344 0.43083393 0.0 0.0
5 0.044809953 0.11203160 0.13333849 0.0 0.0
28
d(p/p*)\r
1
p*
In the homogeneous two phase flow model, there are several common ex
pressions for the thermal conductivity k[ 14]. The widely used expression is
where is the volume fraction of the vapor,kf is the thermal conductivity of
the liquid, and kg is the thermal conductivity of the vapor.
In the same manner the specific heat capacity in equation (3.19) for two
phase region is calculated [13] as
where x is the quality of the mixture, cpj is the specific heat capacity of the
liquid, and cpg is the specific heat capacity of the vapor.
All the equations and constants used to calculate the convective heat transfer
coefficient ( equations (3.16) (3.25)) are coded in appendix E
3.2 Numerical Modeling Equations For Pipe
(3.24)
Cp (1 x)cpf T xcpg
(3.25)
29
The continuity equation, pressure state equation, and entropy state equation
for a numerical scheme of a pipe will have the same form as they appear in the
previous chapter ( equations (2.33), (2.36), (2.37) respectively). The numerical
momentum equation for a pipe can be developed by evaluating the pressure
losses in equation (3.10) and the body force in equation (3.12) explicitly and
substituting them in equation (2.34). This produces the numerical momentum
equation in the form:
u!1 
nn n3_________VP. _l O
Pm> a ' U.Dp,
At
 o.5 />r
d 
nv
j)2  1)
Plvll
pn+1
rnv
Ax
Axj)2
Ax
n ( Hnj Hnj i)
PPnj
Ax
n3 3 Dnj Ax
 o5/>:
njrnj
LenjKjf K n jf
Dnj Ax U'5 njPnj Ax
(3.26)
where
n Pnv+l^~Pnv
Pnj 2
The numerical energy equation, also, can be obtained by evaluating wall
heat transfer in equation (3.13) implicitly and the energy dissipation in equation
(3.11) explicitly then substituting them into equation (2.35).
n+l _ /in+1 nn
n nv______nv n Pnv________Pnv .
Pnv j ' * '
n ?/n+l
Jnvsnvu'nj
n n .n+1
Pnvl^nv1unj1
At
(in)nv
V Tn
vnvnv
+ 0.5 Kn
pn
rnv
Kvf
Tn A t
1 nv
At
Ax
n r. tn Pnv Ax (\b) , n k f
0 SJnvT^r^r +0.5/,
Tn D
Ax
nn T
rnv ^env
Tn D
1nv ^nv
Ax
(3.27)
where
Ain 'ftD(in')nvAx
/,
n ____ Jnj
fn i fn
* nj 1 1 J nj
u
n ____ nj 
, +u.
30
Dn
^ nv
Dn 4. Dn
Unj\^Un_
ILL
LP
Lenj 1 ~t~ Lenj
TS Kn j 1 +K~n?
*^nv 2
Since the wall heat transfer term are evaluated implicitly, the two energy
balance equations at the inner and out pipe surface (equations(3.14),(3.15)) must
be added as constitutive equations to calculate the temperature on the inner and
outer surface at new time. These equations can be developed numerically as
 Energy balance on the pipe inner surface
2nkAx
^(in)nvPmC'j
rrn+1 rpn
(in)nv (in)nv
~ ~At
ln^(out)nv ^ ^'(out)nv T(in)nv) %n)m^(in)m; (in)nv 1nv
' (in)nv
A(in\ni,(T^nv TnJ
(3.28)
 Energy balance on the pipe out surface
rpn+1 _ rpn ~ , a
w (~i (out)nv (out)nv , * /rp rpn\\ \ Z7TrCL\X /rpn+l rrn\1
V(out)nvPm^p ^ ~ ^out^{out)nv\1 oo 1(out)nv) ^^7\out)nv_y1 {out)nv 1 (in)n
^(in)nv
(3.29)
The numerical modeling equations for a pipe ( equations (2.33), (2.36),
(2.37),(3.26),(3.27), (3.28), (3.29)) are coded in appendix F to simulate different
pipe problems.
3.3 Simple Test Problem of pressure Drops for a Flow in
Horizontal Pipe
3.3.1 Problem Statement
Water at temperature 297K and density 997.41 kgjnr? flows in a horizontal
pipe at volumetric flow rate of 18.852m?/min as shown in figure ((3.1) ).
The pipe has inner diameter 0.2m relative roughness (e/D = 0.002), and
16m length. Below the pressure developed in the pipe calculated analytically is
compared to the ones calculated numerically in the code.
31
18.852 kg/mini
16 m
Figure 3.1: water How in a pipe.
3.3.2 Analytical Solution
The dynamic viscosity of water (p) at 297(K) and 997Al(kg/m3) is equal
to 7.834 x 104(Pa.s)
the pipe cross section area is
4 = 0.03142(m2)
__ 7t0.2
the flow velocity is
u
_ _ 18.852
\ 60x0.03142
10(m/s)
D puD 997.4x10x0.2 9 Z.AP, v 106
Ue ~ n ~ 7.834X104 Z04D X iU
the friction factor associated with (Re = 2.546 x 106) and (e/D = 0.002) is
/ = 0.02349
the head losses in a straight pipe is
hL = = 0.02349(^)(^) = 93.96(m2/s2)
for incompressible flow
+ + 9a,)(f + f +gH2) = hL
since the pipe is horizontal with constant cross section area, (Hi = H2) and
32
(tti = u2). So, the pressure developed in the pipe is only due to the friction.
APanalytical = Pi P2 = phL = 997.41 x 93.96 = 93716.6436Ps
3.3.3 Numerical Solution
The numerical solution can be obtained by simulating the pipe as one cell
or several cells bounded by two phony cells for the boundary condition. In this
problem, the pipe is broken up into four equal length cells as is illustrated in
figure (3.2).
4m ^ ^ 4m ^ ^ 4m ^ ^ 4m
1 2 CO 5 6
1 2 3 4 5
Figure 3.2: pipe discretization .
The code in appendix F for a pipe is developed in a such way that two
boundary conditions at inlet (entropy and velocity or entropy and pressure)
and one boundary condition at out let (pressure ) are defined. Therefore, the
inlet boundary conditions in this problem are chosen to be the entropy at the
phony cell (1) associated with T = 297(K) and p = 997.4(kg/m3) and the flow
velocity (u = 10m/s) whereas the outlet boundary condition is the pressure at
cell 6 associated with T = 297(K) and p = 997A(kg/m3). In order to eliminate
the effect of the wall heat transfer on the developed pressure by the code, the
33
environment temperature must be set to be equal to the fluid temperature
or the convective heat transfer coefficient on the outside of the pipe hout must be
set to small value so the pipe is insulated. The steady state simulation results
are shown in figure (3.3 ) and table (3.6).
997.6
997.5
997.4
2
1.5
1
297.05
density at n=75
velocity at n=75
10.0005 10
9.9995 ,
0 2 4
x io5 pressure at n=75
0 2 4
Temperature at n=75
297
0
1
351
350.5
350 L
6 0
313.3585
313.3585
313.3585 L
6 1
2 3 4
entropy at n=75
2 4
mass flow rate at n =75
2 3 4 5
Temperature at inner surface n=75 Temperature at out surface n=75
297.05,,1 297.05,,,,
297
297 1
Figure 3.3: steady state solution for flow in a pipe.
The pressure drops in the pipe calculated numerically can be obtained as
^numerical = P2 P6 = 196202.89364 102460.94732 = 93741.94632(Pa).
By compering the pressure drop calculated numerically to the one calculated
analytically, we get
ATnumerical APanalytlcal = 93741.94632 93716.6436 = 25.30272(Pa).
34
Table 3.6: Steady State Result for Flow in Pipe.
cell or junction density (kg/m3) pressure (Pa) Temperature (K) Entropy (J/kg.K) Velocity (m/s)
1 997.45099 196202.89364 297.00724 350.39943 10.00000
2 997.45099 196202.89364 297.00724 350.47853 10.00000
3 997.43904 172768.75028 297.01245 350.55764 10.00001
4 997.42708 149333.13778 297.01766 350.63674 10.00002
5 997.41512 125897.25131 297.02287 350.71584 10.00003
6 997.41512 102460.94732 297.02287 350.71584
Therefore, the result obtained numerically is so close to the analytical result
with small difference due to the little change in the density inside the pipe as is
seen in figure (3.3 ). The figure (3.3 ) shows that the steady state results satisfy
the basic principle of physics of fluid flow in a pipe. It can be seen that the
small increase in the flow velocity due to the drop in the density is consisted
with concept of conservation of mass. The figure also illustrates how much the
pressure drops with the increment in the length due to the friction between the
fluid flow and the inner surface of the pipe. The effect of friction also can be
seen on the tiny increase of the entropy and temperature.
However, the pipe code also can simulate the unsteady (transient) solution
as it is illustrated in figures (3.4) (3.8 ).
The sudden increase in the flow velocity, ( figure (3.4)) from the static u = 0
to flow velocity of 10m/s in the first time step imposes a jump in the pressure in
35
Figure 3.4: velocity change with space and time for Row in a horizontal pipe.
36
x 10
5
Figure 3.5:
pressure change with space and time for How in a horizontal pipe.
997.46
997.45
I. 997.44
I
Â£ 997.43
i
997.42
997.41
30
Figure 3.6: density change with space and time for How in a horizontal pipe.
37
297.025
297.02
297.015
297.01
297.005
297
30
time step
0 1
cell number
Figure 3.7: temperature change with space and time for how in a horizontal pipe.
Figure 3.8: entropy change with space and time for how in a horizontal pipe.
38
the first pipe cell, as it can be seen in figure (3.5), to recover the pressure drops
due to the friction. This raise occurs in the first cell decreases gradually with
the increment in length to meet the fixed pressure at the outlet cell. All the
pressure changes occur in the first few steps due to the fact that the pressure
drops strongly depends on the velocity, which does not change much after first
five steps.
In the figure (3.6), the density raises in the first cell following the pressure,
after the first cell, the density decreases as the length increases. This small
compressibility occurs because of the temperature raise as it shown in figure
(3.7), which is the result of the friction effects. Also, the effects of the friction
can be seen on the entropy increase in the figure (3.8) as it takes the same trend
of the temperature figure.
3.4 Simple Test Problem of Heat Transfer for a Flow in Horizontal
Pipe
3.4.1 Problem Statement
Water at temperature 317A" and density 997Alkg / m3 flows in a horizontal
pipe at volumetric flow rate of 0.3m3/min The pipe is 1.5m length with
inside diameter of 10cm and thickness of 0.5mm made from material that has
conductivity of 12ITJm.K. The environment around the pipe is at temperature
297AT with convection heat transfer coefficient of 500W/m2K. The test is to
compare the inner and out pipe surface temperature calculated analytically to
these calculated numerically.
3.4.2 Analytical Solution
39
The properties of water at at 317(K) and 997.41 (kg/m3) is
 viscosity of water p = 532.89 x 106(Pa.s)
conductivity of water k = 0.64294(JT/ra.A;)
Prandtl number Pr = 3.4313
The pipe inner cross section area
A = ^ = = 0.007854(m2)
The fluid flow velocity
u = a = 0.007854 x 60 = 0.63 66(m/s)
Reynolds number
te=ef = = 011915 x 106
Nusselt number
Nu = 0.023Re08Pr03 = 0.023 x (0.11915 x 106)0 8 x (3.4313)03 = 383.03695
Convective heat transfer coefficient on the inner surface
hin = Nu% = 383.03695 x = 2462.6978(lT/m2.fc)
Energy balance on the inner surface
hinAin{Tf Tin) =  Tout)
rin
2462.6978 x (O.Itt x 1.5) x (317 Tm) = x (Tin Tout)
Tout = 1978Tin ~ 310.0262
Energy balance on the out surface
l^rLh^{Tjn Tout) = houtAout(Tout ~ T'oc)
rin
x (r Tout) = 500 x (0.117T x 1.5) x {Tout 297)
Tn = 12184T0ut ~ 64.8705
solving the two equations, we get
Tout = 310.8722(fc)
40
Tin = 313.8962(/c)
3.4.3 Numerical Solution
The numerical solution can be obtained by simulating the pipe as one cell
bounded by two phony cells for the boundary condition. So, the pipe is presented
by cell number 2 whereas the cell 1 and 3 are the inlet and outlet boundary
cells respectively. The boundary conditions at the inlet are the flow velocity
u = 0.6366m/s and the entropy associated with the temperature T = 317K and
density p = 997Alkg/m3, and the outlet boundary condition is the pressure
calculated at temperature T = 317K and density p = 997A\kg/m3 In order
to eliminate the effect of friction, we set the friction factor in the pipe code to
be equal to zero.
The steady state results for this problem is shown in figure (3.9 ) and listed
in table (3.7).
Comparing the steady state code result for inner and out surface tempera
ture to these calculated analytically shows that the code results are so close to
the analytical solution with tiny difference due to tiny drop in the steady state
fluid temperature.
The transient change of the properties in the pipe is expressed in figure
(3.10)
3.5 Pump
Pumps are one of the most widely used components in thermal hydraulic
systems. It is used in all piping systems when the working fluid is a liquid.
There are many different types of pumps available to use in the market. Even
though each type has its own design principles, they are all designed to add
41
1000
density at n=100
998
996
1.5645
1.5645
1.5645
318
316
314
1 1.5 2 2.5
x 107 pressure at n=100
0.6366,
0.6366
0.6366
velocity at n=100
1 1.5
entropy at n=100
1 1.5 2 2.5
Fluid Temperature at n=100
3 1
4.9873
4.9873
4.9873
1.5 2 2.5
flow rate at n=100
1.5
620 615
, 610 ,
1 1.5 2 2.5 3 1
Inner surface Temperature at n=100 Outer surface Temperature at n=100
316
314
312
312 310
, . 308 ,
1 1.5
2.5
1.5
2.5
Figure 3.9: steady state solution of heat test problem for flow in a pipe.
Table 3.7: Steady State Result of heat test problem for flow in a pipe.
cell or junction
properties 1 2 3
density (kg/m3) 997.48240 997.48240 997.48240
pressure (Pa) 15.64515 x 106 15.64515 x 106 15.64515 x 106
Entropy (J/kg.K) 616.09608 613.84067 613.84067
Velocity (m/s) 0.6366 0.6366 
fluid Temperature (K) 316.82735 316.82735 316.82735
inner Temperature (K) 313.75333 313.75333 313.75333
out Temperature (K) 310.75005 310.75005 310.75005
42
density at cell 2
entropy at cell 2
E 997.8 r
ch
Â§; 997.61
w
Â£ 997.4
*0
1.5655
Q_
'ar
b 1565
to
Â§. 1.56451
3 50 100 150
time(s)
x lO7 pressure at cell 2
200
*
Â£ 600
317
50 100 150
time(s)
Fluid Temperature at cell 2
200
0
200
316.5
&
I 316
0
50 100 150
time(s)
200
50 100 150
time(s)
^ Inner Surface Temperature at cell 2 ^ Out Surface Temperature at cell 2
* 320, 320
u
g 300 
a
Â§ 280 l
0 50 100 150
time(s)
200
2
D
g 300
a
i 280
0 50 100 150
time(s)
200
Figure 3.10: transient solution of heat test problem for flow in a pipe.
43
energy to fluid at liquid phase. Centrifugal pumps are the most commonly
used pumps. They can be found in many sizes and wide range of developed
head. The analysis of the fluid flow inside a pump is very complicated due to
the complexity of moving parts and changing the flow direction. Therefore, all
pump manufacturers provide performance curves for each type of pumps. These
curves are generated from experimental data.
3.5.1 Modeling Equations for a Pump
A pump is modeled as one cell that has special outlet junction. The continu
ity, energy and state equations are applied to the cell, whereas the momentum
equation is replaced with the performance curve equation. There are several
types of performance curves for pump that relate different characteristic param
eters to each other. The most used performance curve is the one that relates
the volumetric flow rate (Q) to the head added by the pump (H). These type
of curves can be fitted reasonably by quadratic equation in the form of:
where a, b, and c are constants that depend on the type and size of the pump.
These constants can be calculated using the method of quadratic least square
fit for any pump data. The code to generate the constants (a, b, c) from data
are attached in appendix G.
The equation (3.30) is used instead of the momentum equation to relate the
flow velocity to the pressure gained from the pump, it is can be rewritten as:
H = a + bQ + cQ2
(3.30)
(3.31)
pg
where g is the gravity (9.81m/s2) and A is the cross section area (m2).
44
3.5.2 Numerical Scheme for a Pump
The continuity equation, energy equation, pressure state equation, and en
tropy state equation for a numerical scheme have the same form of equations
(2.33),(2.35), (2.36), (2.37) respectively with neglecting the source terms in en
ergy equation. The performance equation (equation (3.31)) will be evaluated
implicitly.
pn+1
/ rnv+1
{Puv+i9
pn+1
nv ) = a + bA
'}nv9
un+1
TIJ u'nj
+ ^71)2
nj \ nj
the last term that has(u)h1)2 can be linearized as
j+1)2 = .i)2 +  Ki)
TIJ
nj J
nj
nj >
(3.32)
or
w1)2 =  Kj)2
So, equation (3.32) becomes:
pn+1
/ rnv+1
^ Pnv+l9
pn+1
1 nv
Pnv9
) (bA
nj
+ 2 cAnjUTn]J
K/1 = a cAljj)
(3.33)
The equation (3.32) is the final form of the numerical momentum equation of a
pump.
3.5.3 Piping system Test problem
3.5.3.1 problem Statement
A piping system provides water from reservoir to tank at 8m high is illus
trated in figure (3.11). Both the reservoir and the tank are at same pressure.
The pump is working at performance curve H = H0 KQ2, where H0 = 17.038m
and K = 2634.2055(s2/m5).
45
P=P(atm)
i
E
00
28 m
Figure 3.11: schematic diagram of piping system.
3.5.3.2 Analytical Solution
A general equation for a system is
P9
+ + tf1+tfs = Â£ + Â£ + //2 +
P9
Since P\ = P2, and u\ = u2, this equation reduces to
Ha = (H2 H1) +
This equation will be used to generate system head curve in order to obtain
the operating point from the intersection of the system head curve and the pump
performance curve,
hL = (Kin + f( + %) + Kout)^
where Kin = 0.5, Kout = 1, and ^ = 2 x 30.
The result of the system head curve equation is calculated in table (3.8) at
various flow velocities.
46
Table 3.8: system head calculation.
u (m/s) Re (10000) / (io2) Q (m3/mm) Hs (m)
0.0 0.0  0.0 0.0
0.2 5.087279 3.37035 0.376991 8.0212
0.4 10.174558 3.29346 0.753982 8.0834
0.6 15.261837 3.26676 1.130973 8.1863
0.8 20.349116 3.25319 1.507964 8.3301
1.0 25.436395 3.24498 1.884955 8.5147
1.2 30.523674 3.23947 2.261946 8.7401
1.4 35.610953 3.23552 2.638937 9.0063
1.6 40.698232 3.23255 3.015928 9.3134
1.8 45.785512 3.23024 3.392920 9.6613
2.0 50.872791 3.22838 3.769911 10.0499
2.2 55.960070 3.22686 4.146902 10.4794
47
The pump performance curve and the result in table (3.4) are used to gen
erate the figure (3.12).
Figure 3.12: pump performance curve and system head curve.
From the figure (3.12) it can be seen that the operating point is H =
9.49m, Q = 3.21 w?/min. This point is associated with flow velocity (1.71m/s).
3.5.3.3 Numerical Solution
Simulating the system in figure (3.11) can be done by discretizing the system
into a number of cells and junctions. The number of the cells can be chosen by
the user. In this problem the system is broken up into 12 cells and 13 junctions
48
12
i 13
12
 11
11
9 ' 10
10
T 14 l
_l______J
13
Figure 3.13: discretization of the piping system.
bounded by two phony cells, one at each end (figure(3.13)).
The pipe connected to the pump suction junction and the pump are dis
cretized as one cell each of (lm) length. The rest of the pipe is discretized to 10
cells of equal length (4m each cell). The pump discharge junction (junction 3 ) is
the special junction where the momentum equation is replaced by the pump per
formance equation as illustrated in the equation (3.32). The code that simulates
this system is developed by solving the continuity, energy, and two state equa
tions at each cell and the momentum equation of a pipe at each junction other
than junction 3 where the modified momentum equation is used. The minor loss
coefficient (K, jy) are added to the code as direct input whereas the friction fac
tor (/) is calculated as function of velocity and other thermal properties. This
code is attached in appendix (H). The boundary conditions at the inlet are cho
sen to be the entropy of water and the pressure at (T = 297K, /rho = 997.41).
The outlet condition is the pressure at (T = 297K, jrho = 997.41) which is the
same pressure at the inlet. The steady state simulation results are demonstrated
49
Table 3.9: Steady State Result for Flow in Piping system.
cell or junction density (kg/m3) pressure (Pa) Temperature (K) Entropy (J/kg.K) Velocity (m/s)
1 997.40996 102460.94732 297.00014 350.39943 1.71109
2 997.40996 102460.94733 297.00014 350.40146 1.71109
3 997.40985 102225.04849 297.00016 350.40186 1.71109
4 997.45150 194465.61278 297.00188 350.40345 1.71102
5 997.45102 193522.17946 297.00209 350.40663 1.71102
6 997.45054 192578.62236 297.00230 350.40982 1.71103
7 997.45006 191635.06483 297.00251 350.41300 1.71103
8 997.44958 190691.50685 297.00272 351.41619 1.71103
9 997.44909 189747.94845 297.00293 351.41937 1.71103
10 997.44857 188804.38964 297.00330 351.42495 1.71103
11 997.44741 186445.49422 297.00366 351.43052 1.71103
12 997.42920 146362.46077 297.00336 351.43610 1.71106
13 997.41031 104864.73645 297.00321 352.44413 1.71109
14 997.41031 102460.94732 297.00321 352.44413
50
997.46
997.44
997.42
997.4
density at n=80
0 5 5 1C
x 10 pressure at n=
1.5
0 5 10
Temperature at n=80
15
15 0
53.6166
53.6166
53.6166
velocity at n=80
n 1.7111 '
1.711
1.711
^ entropy at n
15
350.5
350.45 
350.4 
350.35 ,
5 10
mass flow rate at n =80
15
0
10
15
Figure 3.14: steady state solution of the Bow in piping system.
in the figure (3.14) and table (3.9).
This numerical solution shows that the system is operating at the point
where the flow velocity is equal to 1.711m/s. This flow velocity is almost the
same velocity calculated analytically for the operating pint.
Figure (3.14) shows more details about the effect of different system com
ponents on the flow velocity and other thermal properties. Up to cell 3, the
fluid flows in a short straight pipe with a tiny pressure drop, entropy increase
and almost no density, velocity or temperature change. At cell 4, the pressure
increases by the pump causing a slight rise in the density and fall in the velocity
51
at junction 4. From cells 4 to 10 the fluid flows in a relatively long straight
pipe resulting in gradual pressure drops due to the friction coupled with entropy
and temperature increase. This also leads to some compressibility and change
in velocity. In the cells 11 and 12 there is a large pressure drops because of
the elevation change. This large pressure drop causes some density and velocity
change whereas the entropy and temperature changes stay at same level as be
fore due to the fact that the elevation pressure drop will not effect the entropy
change. In the last cell ( cell 13), a slight pressure drops due to the friction
and minor losses, which resulting in more flat density, entropy and temperature
change.
The transient solution for the system is shown in figures ((3.15)(3.19).
time step junction number
Figure 3.15: velocity change with space and time for flow in piping system.
52
density (kg'm3)
Figure 3.16: pressure change with space and time for Bow in piping system.
997.46
997.45
997.44
997.43
997.42
997.41
997.4
20
Figure 3.17: density change with space and time for Bow in piping system.
53
297.004
297.003
Â£ 297.002
 297.001
I 297
296.999
296.998
20
15
15
10
10
time step
0 0
cell number
Figure 3.18: temperature change with space and time for flow in piping system.
Figure 3.19: entropy change with space and time for How in piping system.
54
4. Steam Turbine
A turbine is a device designed to convert thermal energy contained in high
pressure fluid to mechanical work delivered at the coupling output shift. The
operation principle of a turbine is based on Newtons Second Law of Motion,
which is the force is proportional to the rate of change of momentum (mass
velocity). According to this concept, a high pressure fluid entering the turbine
is expanded in fixed curved nozzles producing increase in kinetic energy of the
fluid. A high velocity fluid passes over curved moving blades applying a force to
the blades. The force will move the blades in the direction of the force providing
mechanical energy.
Steam turbine is one of the most important power producing devices. It
is widely used especially in the field of electricity generation. Since the early
steam turbine of the Swedish engineer Carl de Laval in 1883 many developments
have been made on the steam turbine. However, significant progress in the size
and efficiency has been made in the last fifty years with 1000 MW now being
achieved for single shaft plants [4].
4.1 The Classification of The Steam Turbine
In general, there are many categories that the steam turbines can be classi
fied in. One of the categories is based on principle of operation. According to
the principle of operation, steam turbines can be divided into impulse turbines
and reaction turbines. In the impulse turbines, the moving blades are designed
with constant cross section passage area in such a manner that the steam is only
55
expanded in the fixed nozzle, and the pressure stays constant at the inlet and
outlet of the moving blades. In other words, the fixed nozzle only transforms
the high pressure to high velocity whereas the moving blades only transfer the
high velocity to mechanical work. Contrastingly, the expansion in the reaction
turbine takes place in both the fixed nozzle and the moving blades. This can be
obtained by making the blade passage of decreasing cross section area. There
fore, the pressure drops through the moving blades accelerating the fluid causing
in additional generation of kinetic energy within the moving blades. Since there
is an acceleration of flow in moving blade passage, the chances of separation of
flow are less which results in higher stage efficiency[7].
Another category of steam turbines classification is based on the direction of
the flow. Under this category steam turbines are specified as axial flow turbines
or radial flow turbines. The axial flow turbine is appropriate for large turbo
generators, so it is used in all new steam power plants. In this type of turbines
the fluid flows along the axis of the shift. In the radial flow turbine, the fluid flow
in the radial direction. It incorporates two shifts, each connected to a different
generator. This type is appropriate for use at times of peak load because it can
be rapidly warmed and started.
Other categories have been used to classify steam turbine such as turbines
based on the means of the heat supply and the heat rejection. Based on the
means of heat supply, steam turbines are classified as single pressure turbines,
dual pressure turbines and reheated turbines. For heat rejection category, there
are five types of steam turbines. These types are pass out or extraction turbines,
regenerative turbines, condensing turbines, noncondensing turbines and back
56
pressure turbines. Furthermore, steam turbines may be sorted based on the
number of cylinder and number of shift. Based on the number of cylinder,
they sort as single cylinder and multi cylinder whereas they sort as tandem
compound and cross compound based on the number of the shift. Also, they
can be classified as constant speed turbine and variable speed turbine based on
the rotational speed[3].
4.2 Reaction Axial Turbines
Although the development in the axial flow turbine is carried out with the
gas turbine and the aircraft industry, it depended on the Progress that was
made in the steam turbine. Today most of the turbines used in power plants
are axial turbines especially the reaction turbines. The reaction axial turbines
demonstrate a high level of performance compared to the impulse turbine. This
preference in the reaction turbine performance is due to the comparatively low
fluid velocity [7]. In general, axial turbines are constructed from a number of
stages. Each stage has a row of fixed nozzles, usually called a stator or fixed
blades, and a row of moving blades, usually called a rotor.
Before going on further with turbine analysis, there are some parameters
that should be defined. The blade angles and the fluid angles are the most
important parameters in turbine analysis. The blade angles (/3i, fa ) are the
angles at the inlet and outlet of the blade with respect to reference direction. For
turbines, particulary steam turbines, the reference direction is the perpendicular
direction to the machine axial. The reaction blades are designed with large
entrance angle, often close to 90, and small discharge angle[8]. The fluid angles
(oil, 2 ) are the angles between the direction of the velocity relative to the blade
57
at the inlet and outlet and the reference direction. The difference between inlet
blade angle and the inlet fluid angle is known as the angle of incidence i whereas
the difference between the inlet and outlet fluid angles is the deflection angle.
4.2.1 Turbine Stage Analysis
In the analysis of the reaction axial turbine stage, the fluid with absolute
velocity c\ enters the fixed nozzles at the fluid angle an as shown in figure (4.1
)
Figure 4.1: reaction turbine stage velocity diagram.
The flow then expands in the fixed blades to absolute velocity c2 at exit
blade angle fa. The expanded fluid enters the moving blades with relative
velocity w2 at fluid angle a2 The relative velocity w2 and its direction can be
58
calculated from the velocity triangle by, vectorially, subtracting the linear blade
speed U from the absolute velocity of the expanded fluid c2 The relative flow
expands in the moving blades to relative velocity w3 at exit blade angle 02
The consequential absolute flow with velocity c3 at angle a3 is calculated form
the velocity triangle by, vectorially, adding the blade speed U to the relative
velocity w3.
4.2.2 Turbine Stage Work
The rate of work done on the blades by the fluid is equal to product of
angular velocity ui and the torque r.
P = cor
since
t = rh(riCUt2 ~ r2Cu,3)
where rq and r2 are the mean blade radius for fixed and moving rows, which
are usually equal, and Cu%2 and Cu^ are the rotational components of the outlet
blade velocities C2 and C% illustrated in figure (4.1).
and
and
ui = 2im
U = 27mr
P = m(Cu,2U CU,3U)
59
or
Work = (wu,2 wu$)U (4.1)
Applying energy balance on the reaction stage
Warkre = (hi ~ h3) + ('^
For reaction turbine stage C\ = C3, so
Workre = {hi h2) + (h2 h3) (4.2)
Reapplying energy balance for fixed blades and moving blades
h fl\ Flo 2 2 (4.3)
/~<2 r<2 h2k3 = W ( 22 3) (4.4)
= (wu,2 WUt3)U ( 2 2 3)
substituting equations (4.3), (4.4) into equation (4.2)
WorKe = (^) + (^) (4.5)
4.2.3 Turbine Stage Reaction
The stage reaction can be defined as the ratio of the enthalpy drops in the
moving blades to the enthalpy drops across the stage.
R =
h2 h3
hi h3
(4.6)
60
substituting equations (4.3), (4.4) into equation (??)
^ wo Wo
R = (Cf C{) + (w\ wl) <47)
4.2.4 Turbine Stage Efficiency
There are several definitions of the stage turbine efficiency in the literature.
One of the important definition defines the efficiency as the ratio of the actual
work output to the isotropic ideal work
Vs =
IV,
actual
Wi
isotropic
hi h3
hi h3s
(4.8)
where h3s is the enthalpy at the stage exit if the expansion occurs isotropically.
This enthalpy can be obtained from the entropy si and the pressure p3
h3s = h(si,p3)
Since the state code calculates the thermodynamic properties as a function of
temperature and specific volume, a code based on Newton method is devel
oped in appendix I to calculated the thermodynamic properties as a function of
entropy and pressure.
4.2.5 Repeating Stage Turbines
Turbines are designed from several stages in order to achieve a high power
output. Axial multistages turbines are usually designed to have identical ve
locity triangles for all stages. In order to achieve identical velocity triangles, the
axial velocity and the mean blade radius have to remain constant for all stages.
Therefore, the blade height is gradually increasing for each blade row due to
the density reduction as the the flow expands through the turbine. In addition
61
to that, the fluid flow angles at the stage outlet q3 must be equal to the fluid
flow angle at the inlet an in order to have same velocity diagram at each stage.
Moreover, for 50% reaction turbine, which have same pressure drop at each row,
the blades are designed to have same blade angles at each row (/32 = /?3).
4.3 Modeling Equations of MultiStage Turbines
Multistage turbines can be modeled in staggered mesh as several cells and
junctions. Each blades row is considered as one cell, and each two cells are
connected by one junction as illustratable in figure (4.2 ).
Moving blades
Figure 4.2: multistage turbine discretization.
Since the cross section area and the velocity leaving each row are not the
same as the cross section area and the entrance velocity at the next row, more
algebraic relationships are used at each junction. These relationships are derived
from the velocity diagram in figure(4.2 ). There are three different cross section
62
areas and velocities at each junction. The figure (4.2 ) shows these different
areas of a single path between two blades denoted by Ax, Ain, Aout, whereas the
total areas are the sum of these single areas. The total axial cross section area
Ax is the area perpendicular to the axial flow velocity u. For constant blade
thickness and small difference between inside and outside diameters, this area
can be calculated from
Ax (tt Djn ntjZ/
Where Dm is the mean row diameter, n is the number of blades in the row,
t is the thickness of the blade, and L is the height of the blade.
The other two cross section areas are written in term of the axial cross section
areas, and the other two velocities are written in term of the axial velocity. The
outer cross section area Aout is the area perpendicular to the flow velocity leaving
the blade. From the velocity triangle, the outer cross section area and leaving
velocity are
Aout Axsin(f3)
and
C = u/sin(/3)
where f3 is the outlet blade angle. It is often designed as a small angle to provide
a high velocity.
The inlet cross section area Ain is the area perpendicular to the entrance
flow to the next blade at the same junction. In the same manner, this area and
the velocity are
Ain = Axsin(a)
63
and
w = u/sin(a)
where a is the entrance fluid angle that depends on the axial junction velocity
u, the linear blade velocity U and the outlet blade angle /?
i / u \
a = taU ( U TT>
tan(0)
Based on the previous analysis, the numerical governing equations for a
multistage reaction axial turbine illustrated in figure (4.2 ) can be expressed as
1Continuity Equation
pn+1
rnv
pn
rnv
At
+
Ax(p:
nv nj
xnj
nn
rnv 1
U
71+1
nj
fl 4 A
Ax
= 0
(4.9)
where V is the cell volume.
2Momentum Equation
i________
n sin(0)
rnv
sin(0)
At
+ 0.5p:
sin{0) '
l2
Ax
+
r)n+1 rtn+l
irnv+1 rnv
Ax
Pl_
Ax
(4.10)
3Energy Equation
c71+l
V Dn 
ynvHnv
At
+ V;
Hn
nvanv
n+1
rnv
pn
rnv
At
(4.11)
Ax(PnvsnvKj'At*i  1 Df
Ax TÂ£v Ax
Pl in equation (4.10) represents the frictional pressure losses in the blades.
In addition to the major pressure losses due to wall friction, three other minor
losses are considered. These minor losses are the losses due to the incidence
angle, where the fluid enters the the blade with angle other than the the entrance
blade angle, the losses due to the deflection angle as the flow passes through the
64
curved blade path and the losses due to the expansion, where the flow leaves the
blades to bigger area. Since the pressure losses are calculated at the junction
where two different velocities are located, the losses term is divided into two
terms. The first term includes the wall friction and deflection angle losses on the
half cell to the left of the junction in addition to the expansion losses using the
leaving flow velocity. The second term includes the wall friction and deflection
angle losses on the half cell to the right of the junction in additional to the
incidence angle losses using the entrance flow velocity to the next blade. So, the
pressure losses term can be written as
Where pnj is the average density at the junction, and Dfm] is the hydraulic
diameter which calculated for noncircular geometries as.
where A is the cross section area and p is the circumference
In the same manner, the energy dissipation term Df can be obtained as
(4.12)
(4,13)
Where
65
knv 0.5( T k(0ut)nj
4.4 Selected Design for 50% Reaction MultiStage Turbine
Since the specific geometric details for these types of turbines such as rows
diameters, blades hight and angles, number of blades per row, thickness of
blades, and other geometric details are not available in the literature, we will
select our geometry and make our design in such way that meet with theoretical
working concept of this type of turbines. Our turbine design as it illustrated
in figure (4.3) consists of five stages, each has fixed row and moving row. The
turbine also has short straight pipes at the inlet and outlet connected to the
first stage and last stage by collectors.
Figure 4.3: schematic diagram of five stages reaction turbine .
Each row in this turbine has 60 blades of mean diameter of 0.8 m. The
rotational shaft speed is 3600 rpm. All the blades at different rows have same
design with inlet blade angle of 90, outlet blade angle of 29, thickness of
66
0.0005m and flow path of 0.3m length. The height of the blades at each row
increases gradually in the direction of the flow so each blades row has different
height at inlet and outlet as it illustrated in table((4.1)).
The inlet pipe has diameter of 0.4m and length of 0.25m, and the outlet
pipe has diameter of 0.781m and length of 0.25m. Both collectors have length
of 0.3m and diameters of the side they are connected to.
4.5 Simulation of Selected Turbine Design
In order to simulate the previous turbine design, it is discretized into 14
cells bounded by two phony cells for the boundary conditions. As it illustrated
in figure(4.4), each row of the ten rows, the two collectors and two pipes at the
inlet and outlet each is discretized as one cell.
Figure 4.4: discretization of five stages reaction turbine .
The turbine governing equations (4.9),(4.10) and (4.11) in addition to the
state equations(2.36), (2.37) are coded in appendix J to simulate the turbine.
The two boundary conditions at the turbine inlet are chosen to be the en
tropy and pressure for the superheated vapor as Si = 7004.76(J/kg.k),pi =
67
Table 4.1: diameters and the height at each row of the turbine .
row number location hub diameter (m) case diameter (m) blade height (m)
1 inlet 0.749077 0.850923 0.050923
outlet 0.747908 0.852092 0.052092
2 inlet 0.747908 0.852092 0.052092
outlet 0.741602 0.858398 0.058398
3 inlet 0.741602 0.858398 0.058398
outlet 0.734337 0.865663 0.065663
4 inlet 0.734337 0.865663 0.065663
outlet 0.725932 0.874068 0.074068
5 inlet 0.725932 0.874068 0.074068
outlet 0.716160 0.883840 0.083840
6 inlet 0.716160 0.883840 0.083840
outlet 0.704743 0.895257 0.095257
7 inlet 0.704743 0.895257 0.095257
outlet 0.691330 0.908670 0.108670
8 inlet 0.691330 0.908670 0.108670
outlet 0.675479 0.924521 0.124521
9 inlet 0.675479 0.924521 0.124521
outlet 0.656627 0.943373 0.143373
10 inlet 0.656627 0.943373 0.143373
outlet 0.634052 0.965948 0.165948
68
1.54(MPa), whereas the boundary condition at the outlet is the pressure of
Pie = 0.285 (MPa).
The steady state simulation results are demonstrated in the figure (4.5) and
table (4.2).
density at n=300 velocity at n=300
101,1., 109.9343,,,
0 5 10 15 20
x iq6 pressure at n=300
109.9343
0 5 10 15
entropy at n=300
7200
7100
5 10 15 20
7000
Temperature at n=300
800,,,, 79.3563
0 5 10 15 20
mass flow rate at n =300
600
400
0 5 10 15 20
79.3563)
79.35631
10 15
Figure 4.5: steady state Turbine simulation.
In addition to the study state results tabled in table (4.2), we get common in
let fluid angle at all rows (a = 66.62), common inlet row velocity (119.769ra/s),
common outlet row velocity (226.758m/s), mass flow rate (m = 79.356kg/s),
efficiency (r]s = 0.9087), stage reaction (R = 0.5) specific work per stage
(w = 37.0744KJ/kg) and total power output (P = 14.710MIT).
69
Table 4.2: Steady State Result for turbine simulation.
cell or junction density (kg/m3) pressure (Pa) Temperature (K) Entropy (J/kg.K) Axial Velocity (m/s)
1 5.74432 1540288.65206 600.33882 7004.76064 109.93430
2 5.74432 1540288.65206 600.33882 7006.00131 109.93430
3 5.70834 1532171.35305 600.77082 7010.25886 109.93430
4 5.58028 1495534.05669 599.53394 7017.98618 109.93430
5 4.97764 1296341.94389 582.07138 7025.57203 109.93430
6 4.42693 1119278.02135 564.59499 7033.31599 109.93430
7 3.92456 962317.97992 547.10210 7041.23380 109.93430
8 3.46716 823598.39732 529.59020 7049.34240 109.93430
9 3.05160 701402.97045 512.05672 7057.66007 109.93430
10 2.67494 594150.24169 494.49875 7066.20674 109.93430
11 2.33443 500382.64458 476.91261 7075.00420 109.93430
12 2.02748 418756.71643 459.29337 7084.07652 109.93430
13 1.75167 348034.34403 441.63408 7093.45037 109.93430
14 1.50799 287046.66223 423.06238 7098.86706 109.93430
15 1.49924 285161.55253 422.70470 7100.38475 109.93430
16 1.49924 285100.17157 422.70470 7100.38475
70
The transient solution for the system is shown in figures ((4.6)(4.10).
150
140
130
%
& 3 120
0 110
> 100
90
20
Figure 4.6:
velocity change with space and time for reaction turbine.
71
pressure (Pa)
x 10
6
15
20
10
15
10
time step
0 0
cell number
Figure 4.7: pressure change with space and time for reaction turbine.
6
5
4
3
2
1
20
15
10
20
15
10
time step
0 0
cell number
Figure 4.8: density change with space and time for reaction turbine.
72
entropy (J/kg.k)
650
Figure 4.9: temperature change with space and time for reaction turbine.
7300
7200
7100
7000
6900
20
20
Figure 4.10: entropy change with space and time for reaction turbine.
73
5. Conclusion
The wide usage of thermal hydraulic systems in our lives continues to provide
a reason to study the behavior of these system under different conditions in order
to improve their performance. For this purpose, this thesis focuses on analyzing
some common thermal hydraulic flow system components.
A set of five general mathematic modeling equations that are consistent
with the principles of the physics of fluid flow in any thermal hydraulic system
component are derived in chapter (2). These modeling equations are tested for
several different numerical schemes in order to choose the appropriate numerical
scheme. As a result, the semiimplicit upwind finite difference scheme is selected
to simulate the model equations of the different components.
The general modeling equations are first developed for a pipe regarding it
as the basic component in all thermal hydraulic systems. The pipe modeling
equations are coded in a matlab code to simulate the transient and the steady
state solution for a pipe flow test problem. The steady state simulation is
compared to analytical solution, and the result validates the basic numerical
scheme. The general modeling equations are also modified in order to be applied
to a pump. Then, a system consisting of pump and pipes driving water between
two points is simulated, and analytical solution is compared to the steady state
simulation.
Detailed analysis of the flow in a reaction axial steam turbine is made in
chapter (4). Algebraic equations are combined with the governing modeling
74
equations to apply to the complex flow in a turbine. The combined equations
are used to simulate the flow passes through a multistage reaction turbine
consisting of five stages.
All of these simulations demonstrate that fairly accurate numerical results
can be obtained with one dimensional flow model for thermal hydraulic system
components. It can be concluded that making appropriate modeling assump
tions for each component is very important to get accurate results.
75
APPENDIX A. The Numerical Analysis of SemiImplicit Upwind
Scheme
The basic government equations can be written in the semiimplicit upwind
finite difference scheme as
1) Continuity Equation.
P]+1
At
+ u
PjPj1
Ax
+ P
un+\ un+\
7+2 32
Ax
= 0
(A.l)
2) Momentum Equation.
,71+1
u:
J+i
u
J + 2
At
3) Energy Equation.
u i u i nn+i rin+1
2 o+k 3k Pj+1 Pj n
+ pu^J . = 0
Ax
Ax
n+l n n n
j j + u j j~l = 0
At
Ax
(A.2)
(A.3)
These three equations are written in term of four unknown (pn+1, un+1,pn+1, sn+1).
In order to solve these equation, one more equation is needed. Therefore, lin
earized state equation will be used to eliminate one of the three thermal prop
erties. The density (p+1) will be chosen to be defined in term of the other two
thermal properties as
p"+1 = ^Us?+1) + + constant (A.4)
substituting the equation (A.4) into the continuity equation (equation (A.l))
fI / n+1 _____ n\ dp
/ 7i+l n \ I . (sn sn \,?E\(vn_ n V
Jp's.? sj' Qp's^j PjP' ax p ' dp s PjiP
At ds
+ P
vn+1
un+\
Ax
= 0
76
dp pT1 p]
dslp[ At
u
r>n r>n
Pi Pili
Ax
(A.5)
rearranging this equation
dp, rs;+1  ,
k"+! u"+!
+ p '*=o
Ax
Fourier components of the numerical solution can be written as
s] = s0Xne^kAx)j
p] = p0Xnei{kAx)j
u\ x = u0Xnei{kAx)je^i{kAx)
substituting Fourier components into the modified continuity equation (
equation (A.5))
dp, rs0Xnei{kAx)j(X 1) s0Xnei{kAx)j{l ei{kAx))^ dp, rp0Xne^kAx^j(X 1)
dPpL At + U Ax J + dllp[ At
p0XnSkAx)j(l e*(fcAx)) u0Anei(feAx^(Ae<(4tÂ£) Xe~i(^)
+ U7 + P7 = 0
Ax J Ax
dividing the equation on \nel(kAx)i
dp, s0(Al) s0(l e~i{kAx)) dp, p0(A 1) p0(l e~i(kAx))
+ a71 + + a71
H,(Ae'lT) \e^)
a;1 = 0
rearranging the above equation
 1 + ^(1 +P.l.[A 1 + ^(1  (A.6)
+ K,JA^A(e!(st*>ei) = o
Ax
substituting Fourier components into the momentum equation (equation
(A.2))
A ^ H U A A if fcAx \ / * _ \ / .'/ fcAx \ _ _
2 ) g *A 2
P
u0Xnel^kAxdel^kÂ£2X\X 1) u0
At
+ pu
Ax
p0Xnei{kAx)j(Xe^kAx^ X) _
Ax
77
dividing the equation on \nel(kAx)i
u,
 1) u0(e
+ pu
 eH*?
>) PoX{ei{kAx) 1)
r At r Ax
multiplying the above equation by e1^
+
Ax
= 0
n0(A 1) tiG(l e *(fcAx)) p0A(e^ 21) e ^ 21))
prr1 pu11:= 0
At
Ax
Ax
rearranging the above equation
u0[A 1 + ^(1 e^kAx))] + Po\^(ei^ e"^) = 0
Ax pAx
(A.7)
substituting Fourier components into the energy equation (equation (A.3))
SoXnei(kAx)j(X ^ s0Anei(fcA*)j(1 ei(fcA*)j
A , + It A
At Ax
dividing the equation on \nel(kAx)i
s0{A 1) + ^s0(le^)) = Q
= 0
At
Ax
rearranging the above equation
So[A 1 + ^(1 ei("Al))] = 0 (A.8)
The equations ((A.6),(A.7),(A.8)) can be written as a system in a array
Â§f lp(A 1 + CB) g,(Al +CB) A 0
0 (A 1 + CB) Po = 0
(Al + C) 0 0 u0 0
where
B = (1 e l(kAx)) = 1 cos(kAx) + isin(kAx)
78
(e*(fc2X) e ^t1)) = 2isin(^Y^)
s~i uAt
U ~ Ax
For the system of equations to be satisfied, the determine of the array must
equal zero, by setting the determine to zero, three roots for A can be obtained .
(A 1 + CB)&,(A 1 + CBf (A^in(^))2] = 0
since
dp  1
dp s a2
where a is the sound speed, so
0/1 A/ Zp /\ np
(A 1 + CB)[{A 1 + CBf (A^^isin{^))2} = 0
im\JL Â£
rearranging the equation
[\l + CB][\l + CB + \?^isin{)][\l + CB\2^isin{,^)\ = 0
L JL Ax K 2 Ax v 2 n
The first root of A is
[A 1 + CB\ = 0
so
Ax = 1 CB
uAt, uAt
.uAt
= (1 ~~ ~) + cos(kAx) isin(kAx)
v Ax Ax y Ax
The second root of A is
[A 1 + CB + A^^ism(^)] = 0
L Ax 2
79
so
1CB
2 1+i2^sin^
(1  + ^cos(kAx) i^sin(kAx)
The third root of A is
[Al
+ CB A
2aAt
Ax
isin(
kAx
2
)] = 0
so
A3
1CB
l1 ~ *57) + ^cos(kAx) i^sin(kAx)
12
; 2a At
Ax
sm(
fcAx \
2 1
80
APPENDIX B. Linearized State Equation for Pressure and Entropy
B.l Linearized pressure state equation
The linearized pressure state equation as mentioned in chapter 2 equation
(2.36) is
n+l = rtn 4 ? (nn+l nn 1 4 ^1 (rrn+^ rTn t
"nv "nv Qp'r^'^>nv "nv) grpipy1 nv nv)
The terms ^T and ^\p can be calculated in term of (k, j3,cp,cv) using
Maxwell relations for single and two phase regions as
1 single phase region
dp. odp. v
T = V t = 
op OV K
1
pK
and
dp. v(3 /3
dT p VK K.
substituting the terms ^T and of single phase region into the linearized
pressure state equation
pn+1
r nv
Jnv ^nv
n+l
rnv
ft
nv rpn\l __ p
1 nv *
n
nv
K
on
t^nv pn
Kn 1nv
An at
2 two phase region
dp
dp
2 dP  n
v x T 0
dv
and
81
dp
gf\p
Â£><>
dP_
dT
substituting the terms ^t and ^ \p of two phase region into the linearized
pressure state equation
pn+1
rnv
BP
( \n rpn+1 p
V Qrj1 /nv1 nv ry
n
nv
\n pn
)nv1 nv
B.2 Linearized entropy state equation
The linearized entropy state equation as mentioned in chapter 2 equation
(2.37) is
r) o
nv nv dp Ty^nV ^HV ' nv xnv)
The terms j^\t, and Â§f \p can be calculated in term of (,0,cp,cv) using
Maxwell relations for single and two phase regions as
1 single phase region
and
ds. 2 ds. v2/3 (3
03 1 C3 1 II V w r = dv K p2K
ds _
dTlp==
substituting the terms ^r,
earized entropy state equation
^VKCv^
and Â§f\p
1 \ Cv
_ T
of single phase region into the lin
s
n+l
nv
(PnvV^nv
n+l
rnv
(,Cy^)nv rpn\1
rpn 1nv
1 nv
ft
n
nv
Pn t<~ n
nv^nv
(Cu)
n
nv
82
2 two phase region
(9s. 2 ds. 2dP 1 dP
= ~V = ~V dT = dT
and
ds cv
dT'p = T
substituting the terms ^t, and ^ \p of two phase region into the linearized
entropy state equation
s
n+1
nv
1 dP
YnvP
n+1
nv
'T'n
nv
) rjin+1 __
1
pn
rn
dP
dT
)
('v)
n
nv
83
APPENDIX C. State Properties Code
C.l Loading Constants for State Properties
0001 function [] = datah2o()
0002 % this function loads all the property data into the
0003 7. global state data variables. these global variables
0004 7. are used in all the state functions, the data is
0005 % from the text on page 154. the variable names are
0006 7. those used in the text.
0007
0008 % common global variables used in all state equations
0009 global Tc Pc rhoc vc TO uO sO ...
0010 R A E Ta rhoa F a Tp D G taua
0011
0012 7. basic reference variables
0013 Tc = 647.286;
0014 Pc = 2.2089e7;
0015 rhoc = 317.0;
0016 vc = 1.0/rhoc;
0017 TO = 273.16;
0018
0019 % PrhoT from equation P6 using Q from equation Q2
0020
84
0021
0022
0023
0024
0025
0026
0027
0028
0029
0030
0031
0032
0033
0034
0035
0036
0037
0038
0039
0040
0041
0042
0043
0044
R = 461.51;
A(1,1) = : 2.9492937e2;
A(2,1) = : 1.3213917e4;
A(3,1) = 2.7464632e7;
A(4,1) = : 3.6093828e10;
AC5.1) = : 3.4218431e13;
A(6,l) = : 2.4450042e16;
A(7,l) = : 1.5518535e19;
A(8,l) = 5.9728487e24;
A(9,1) = : 4.1030848el;
A(10,l) = 4.1605860e4;
A(l,2) = 5.1985860e3;
A(2,2) = 7.7779182e6;
A(3,2) = 3.3301902e8;
A(4,2) = 1.6254622ell;
A(5,2) = : 1.7731074e13;
A(6,2) = : 1.2748742e16;
A(7,2) = 1.3746153e19;
A(8,2) = 1.5597836e22;
A(9,2) = 3.3731180el;
A(10,2) = 2.0988866e4;
85
0045
0046
0047
0048
0049
0050
0051
0052
0053
0054
0055
0056
0057
0058
0059
0060
0061
0062
0063
0064
0065
0066
0067
0068
A(1,3) = 6.8335354e3;
A(2,3) = 2.6149751e5;
A(3,3) = 6.5326396e8;
A(4,3) = 2.6181978ell;
A(5,3) = 0.0;
A(6,3) = 0.0;
A(7,3) = 0.0;
A(8,3) = 0.0;
A(9,3) = 1.3746618el;
A(10,3) = 7.3396848e4;
A(l,4) = 1.5641040e4;
A(2,4) = 7.2546108e7;
A(3,4) = 9.2734289e9;
A(4,4) = 4.3125840e12;
A(5,4) = 0.0;
A(6,4) = 0.0;
A(7,4) = 0.0;
A(8,4) = 0.0;
A(9,4) = 6.7874983e3;
A(10,4) = 1.0401717e5;
A(l,5) = 6.3972405e3;
A(2,5) = 2.6409282e5;
86
0069
0070
0071
0072
0073
0074
0075
0076
0077
0078
0079
0080
0081
0082
0083
0084
0085
0086
0087
0088
0089
0090
0091
0092
A(3,5) = 4.7740374e8;
A(4,5) = 5.6323130ell;
A(5,5) = 0.0;
A(6,5) = 0.0;
A(7,5) = 0.0;
A(8,5) = 0.0;
A(9,5) = 1.3687317el;
A(10,5) = 6.4581880e4;
A(1,6) = 3.9661401e3;
A(2,6) = 1.5453061e5;
A(3,6) = 2.9142470e8;
A(4,6) = 2.9568796ell;
A(5,6) = 0.0;
A(6,6) = 0.0;
A(7,6) = 0.0;
A(8,6) = 0.0;
A(9,6) = 7.9847970e2;
A(10,6) = 3.9917570e4;
A(l,7) = 6.9048554e4;
A(2,7) = 2.7407416e6;
A(3,7) = 5.1028070e9;
A(4,7) = 3.9636085e12;
87

PAGE 1
KNOWLEDGE DISCOVERY IN FLOW CYTOMETRY DATA by Janet Siebert B.A. University of Montana, 1981 M.Ed., Vanderbilt University, 1984 A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science Computer Science 2004
PAGE 2
This thesis for the Master of Science degree by Janet Siebert has been approved by M. Karen Newell Date
PAGE 3
Siebert, Janet (M.S., Computer Science) Knowledge Discovery in Flow Cytometry Data Thesis directed by Professor Krzysztof Cios ABSTRACT Flow cytometry data is stored in a published but esoteric format. Existing analysis tools are proprietary, and limited in available functionality. Additionally, they are designed to process a small number of samples at a time. Immunological research requires analysis of multiple samples on multiple individuals. The selection of samples for a particular type of analysis ideally is based on experiment metadata, identifying the individual, characteristics of that individual, and characteristics of the sample. This metadata is not available to existing tools. This work attempts to overcome these shortcomings in the existing analytical environments. It offers four distinct contributions: 1. Open architecture: Data is stored in a relational database, and can be accessed and analyzed by many tools and techniques. 2. Addition of the experiment's metadata: Each event from each sample is associated with relevant information from the experiment, such as individual, individual's strain, individual's diet type, stain or sample type, and sample processing technique. This metadata is easily accessible within the database, thereby supporting analysis of samples by any logical grouping. 3. Ability to derive new metrics from core data: Much of the analysis in this work is based on ''Normalized Fluorescence," in which the measurement for fluorescence is divided by the measurement for relative cell size. 4. Support for characteristic based analysis: The environment created in this work allows analysis by characteristic, such as stain, diet, or sample processing technique. Data can be grouped by experiment or project, instead of simply by sample. While these last two items are the most significant and new approaches presented by this work at this point in time, the first two items may prove the most significant lll
PAGE 4
contributions over the long run The combination of the open environment and the inclusion of the experiment's metadata create a rich analytical environment in which biologists can perform types of analysis that they simply could not have performed before. This possibility may lead to powerful new analytical techniques, and to significant biological findings and discoveries. This abstract accurately represents the content of its publication. Signed iv
PAGE 5
My great thanks to my husband, Wes Munsil, for his moral support, wisdom, and technical guidance. I also thank my advisor, Dr. Krys Cios, for his guidance and enthusiasm. My gratitude also goes to Ashleigh Reding, Dr. Suzi Schweitzer, Jeff Rogers, and Dr. Karen Newell (all of the Institute of Bioenergetics, University of Colorado at Colorado Springs) for their patience in teaching me about flow cytometry, and for generating the data which inspired this work.
PAGE 6
CONTENTS Figures .............................. ......................................................................................... viii Tables .......................................................................................................................... ix Chapter L Introduction ............................................... ....................................................... 1 2. The Fundamentals of Flow ............................................................. 4 U Instrumentation and Measurement .... ............................................................. 4 2.2 A BriefHistory of Flow Cvtometry ................................................................. 9 2.3 Experimental Design ...................... .............................................................. 10 2.4 The Analytical Process ................................... ........ ..................................... 13 3. Building the Rich Analytical Environment.. ................................................... 16 il Data Flow ..................................................... ................................................ 17 3.2 FCS Parsing ................................................... ............................................... 19 3.3 Preparing the Flow Sample Key .................................................................... 23 3.4 Connecting the Flow Data to the Sample Key ............................................... 26 3.5 Evolution of the Data Model ......................................................................... 27 3.5.1 The EntityRelationship Model ..................................................................... 27 3.5.2 The Dimensional Model ................................................................................ 29 3.5.3 Data Volume .................................................................................................. 32 Vl
PAGE 7
4. Analvtical Techniques in the Rich Analytical Environment .......................... 34 ti Statistical Analyses ........................................................................................ 36 4.1.1 Summary Statistics, Entire Experiment.. ....................................................... 36 4.1.2 Suspect Samples ............................................................................................ 38 4.1.3 Study of a Particular Individual ..................................................................... 39 4.1.4 M5114 Fluoresence, Normal Process and CPCF Process ............................. 40 4.1.5 Fluorescence by Diet Type, Isotype and M5114 Stains ................................ 42 4.1.6 Fluorescence by Experiment and Diet Type. M5114 .................................... 43 4.1.7 Simple Histograms Using Text.. .. ............................ .... .................................. 45 4.1.8 Normalized Fluorescence by Fotward Scatter Range ................................... 46 4.2 Graphical Analyses ............. .......................................................................... 48 4.3 Summary ........................................................................................................ 52 Conclusions and Future Work ........................................................................ 54 References ...................................................................... ........ ..................................... 58 Vll
PAGE 8
FIGURES Figure 21 The Flow Column ........................................................................................... ...... 5 22 Hydrodynamic Focusing .................................................... .................................. 6 23 Forward and Side Scatter ...................................................................................... 7 24 Representative Scattergram ....................... ........ .................... ...... ....................... 14 25 Representative Histograill ............................ ....... .................. ......... ................. .... 14 31 First Generation Data Flow Diagraill ...................... ....................... .................... 18 32 Second Generation Data Flow Diagram .............. .... ..... .......... ...... .................... 19 33 Entity Relationship Model.. ............................... ... .............................................. 28 34 Dimensional Model ........................................................................... ........ ......... 30 35 The Event Table .................................................................................................. 31 41 Normalized FL2, Individual BL6 ....................................................................... 49 42 M5114 Saillples, Experiment 1 .......................................................................... 50 43 M5114 Saillples, Experiment 1. Strain C .............................. ............................ 51 44 Forward Scatter, M5114 Samples ...................................................................... 52 Vlll
PAGE 9
TABLES Table 21 Sample Event Data ............................................ ...... .... ............... ........ ............... 9 31 An Extract from a Param File . . .......................... ....... .... . .... ........... .............. 22 32 An Extract from an E v en t File First Generation ......................... . .... ....... ..... .... 23 33 An Extract from an Event File, Second Genera t ion ..... ..................................... 23 34 Flow Sample Key .......... .............. ........... ........ ..................................... ... ...... 24 35 Sample Key Data, Reformatted for Database .... ........................................... ..... 25 36 Values of SRC Parameter . ....... ..... . .......... ........ ........ ............... . ........... ........ 26 37 Histogramgenerating Query, Original Mode l ............................ .... ... ............... 32 38 Histogramgenerating Query, Dimensional Model ............................................ 32 3 9 Selected Data Set Statistics ................................................................................. 33 41 Result Set (Extract}Summary Stat i stics, En t ire Experiment .......................... 37 42 Result Set Suspect Samples ............................................................................. 39 43 Result SetStudy of Ind i vidual BH3 ............................................................... 40 44 Result Set (Extract}M5114 Samples, Normal and CPCF Processing ............ 41 45 Result SetFluorescence by Diet Type, Normal Processing ........................... 42 46 Result SetFluorescence by Diet Type, CPCF Processing .............................. 43 47 Result SetFluorescence by Experiment by Diet Type, Normal Processing .. 44 IX
PAGE 10
48 Result SetFluorescence by Experiment by Diet Type, CPCF Processing .... 44 49 Result Set (Extract}Textual Histogram ........................................................... 46 410 Result SetFluorescence by Forward Scatter Range ...................................... 47 411 SOL Statement for Graphical Result ............................... ................................ 49 X
PAGE 11
1. Introduction Flow cytometry is a common technique used by research biologists and immunologists. A flow cytometer measures several attributes of individual cells which are suspended in a solution. Flow cytometry processing generally collects data on thousands of cells per sample This technique is used to study cell behavior, and investigate treatments for diseases such as cancer, HIV, and sickle cell anemia. The data collected by the flow cytometer is written to a published but esoteric format. Generally, biologists use proprietary software to access the data and perform a fixed set of analyses. Unfortunately, these techniques are limited and limiting. This work provides mechanisms and methods to dramatically improve the efficiency and range of the data analysis techniques. Research biologists at the University of Colorado Institute for Bioenergetics, located at the University of Colorado at Colorado Springs, provided the experimental data and guidance which inspired this work. Flow cytometry is integral to their research into cellular metabolism and cellular communication. Chapter 2, The Fundamentals of Flow Cytometry, presents background information, and provides an overview of one project conducted by the researchers at the Institute for 1
PAGE 12
Bioenergetics. It then describes the current state of the art in flow cytometry analysis. Chapter 3, "Building the Rich Analytical Environment," explains the methodology of moving the data from a limited environment to one in which a myriad of analytical techniques are possible. We combine data created by the flow cytometer and data sources maintained by the biologists in a relational database. Additionally, we convert a sample key maintained by the biologists from a word processor document into a database table. This sample key correlates the flow cytometer run number to the particular individual and the particular preparation of the sample The chapter begins with a discussion of data flow for creating this environment, and continues with details on processing the data generated by the flow cytometer. The chapter also discusses the evolution of the data model for the rich analytical environment. Chapter 4, "Analytical Techniques in the Rich Analytical Environment," expands upon the types of analysis that can be performed with a large body of data readily accessible in an open environment. Existing analytical processes limit the analyst to considering one sample at a time. With all of the samples from an experiment loaded into a relational database, the analyst can consider such groupings as "all of the samples on this particular individual" or "all of the samples of this particular sample type." Analyses can be either statistical or graphical. 2
PAGE 13
Chapter 5, and Future Work," summarizes the discussions, and offers suggestions for future work in the rich analytical environment. The possibilities for future work include generalizing the parsing algorithms, automating the identification of cell groups, incorporating additional information into the environment, identifying additional analytical algorithms, and automating the execution ofthe most valuable algorithms. 3
PAGE 14
2. The Fundamentals of Flow Cytometry This chapter presents the fundamentals of flow cytometry, a brief history of the instrument, the experiment which generated the data considered in this study, and the standard methods of analyzing flow cytometry data. The first section provides an overview of the instrument and the data it collects. The second section highlights selected aspects of the development of the instrument. The third section discusses experimental design, methods, and materials. The final section considers the current stateoftheart techniques for analyzing the resulting data. 2.1 Instrumentation and Measurement Articles and media segments about genetics and proteomics research abound in the popular press (Kelleher, 2004; Harris, 2004). The casual reader thereby might assume that all bioscientists are immersed in such research, and that the cell is no longer an important research focus. However, the cell is the fundamental building block of every organism. "In the hierarchy of biological order, cells hold a special place, for they alone have the capacity to make themselves autonomously, and to multiply by division" (Harold, 2001, p. 17). Flow cytometry provides a mechanism for studying cell characteristics and behaviors. 4
PAGE 15
One of the strengths of the flow cytometer is its ability to record multiple independent and quantitative measurements on a large number of cells (Parks, 1996). A flow cytometer takes in a sample of cells or cell particles suspended in solution, sending them in a single file past a laser beam, as shown in Figure 21. The alignment of the cells is achieved by hydrodynamic focusing, as shown in Figure 22 (BD Biosciences, 1999). The focusing ofthe solution into a thin channel forces the cells to align in single file within the stream. laser 0 00 forward scatter detector 0 Figure 21. The Flow Column 5
PAGE 16
Figure 22. Hydrodynamic Focusing The particular instrument used in this study was a Coulter Epics XLMCL. The four data points measured by this instrument for these studies are : Forward Scatter (FS) Side Scatter (SS) Red Fluorescence (FL 1) Green Fluorescence (FL2) Figure 23 highlights the highlevel physics behind the measurement of forward scatter and side scatter Forward scatter represen t s cell size. Side scatter represents internal structure of the cell or granularity. Among cells with different structures, light scatter provides a rough indicator of cell size. Among cells with similar structures, forward scatter and side scatter increase monotonically with cell size. Dead cells and cellular debris tend to have higher side scatter than live cells. Taken together forward scatter and side scatter can help identify and thereby exclude dead cells and debris (Parks, 1996). 6
PAGE 17
forward scatter side scatter Figure 23. Forward and Side Scatter Fluorescence detectors measure the presence of cells or molecules that have been dyed during the preprocessing of the sample. Each particle on which data is recorded is called an event. Data collected by the flow cytometer during the processing of a sample is written to an output file. The flow cytometer analyzes a sample until a certain number of "acceptable" events have been processed. An acceptable event is one in which the data falls into a preconfigured range, based on typical size and granularity for a particular population. Biologists can label this range "live." Other ranges are sometimes labeled "dead" or ''junk." Cells that fall into the live range have FS and SS 7
PAGE 18
measurements such that they are probably intact cells. Those that fall into the dead range have FS and SS measurements such that they probably have been compromised during the processing and are no longer whole. Sample runs are often configured to process 5,000 live events. This processing generally takes 15 to 120 seconds, depending on how many cells are in the sample. Depending on the sample, anywhere from approximately 6,000 to 30,000 particles must be processed to find 5000 particles in the acceptable range. The data is written in a standard format, such as FCS 2.0 or FCS3.0, where FCS stands for Flow Cytometry Standard The formats are specified by the Data File Standards Committee of the International Society for Analytical Cytology (ISAC). The file format includes a header section, an ASCII text section specifying parameters of the data run, and a section recording the data from the events. The data section is often written in a binary encoding. Thus, the file must be processed by a utility program for the event data to be translated to a humanreadable form. Table 21 shows sample event data, after processing. 8
PAGE 19
Table 21. Sample Event Data FS ss FL1 LOG FL2LOG FSLOG SSLOG 190 274 0 0 836 877 266 206 0 0 874 846 245 265 0 0 865 873 34 43 172 0 645 672 84 206 0 0 746 846 85 72 0 0 747 729 86 124 113 0 748 789 247 252 0 0 865 868 229 206 73 0 857 846 2.2 A Brief History of Flow Cytometry Two pioneers in the fields of flow cytometry fluorescence activated cell sorting (F ACS), and immunology are Leonard and Leonore Herzenberg. Their recent autobiography, published in the Annual Review of Immunology (2004), provides insights into the development of the flow cytometer. The development effort started in the late 1960s: As I became more deeply involved in immunology, I became increasingly aware of the need to characterize and isolate the different kinds of lymphocytes tha t were beginning to be visualized with fluorescentlabeled antibodies under the microscope .. So I started asking around to see whether anyone had solved the problem. (Herzenberg and Herzenberg, 2004) A group of researchers at Los Alamos had developed a machine to count and sort cellsized particles based on particle volume. Their goal was to be able to count 9
PAGE 20
debris particles obtained from the lungs of mice and rats, set aloft in balloons after atomic tests. Consequently, they had no need to measure fluorescence. However, Herzenberg was able to convince the researchers to share their engineering drawings and schematics with him. He returned to Stanford and assembled a team to build the first device. The first cellsorting paper was published in Science in 1969 (Herzenberg and Herzenberg, 2004) Data was originally collected by photographing histograms displayed on oscilloscope screens. Impro v ements in subseq u ent generations of the machine included the addition of logarithmic amplifiers thereby all owing the full range of data to be displayed on a single data plot; the use of computers in data collection; and software for data computation and display. The original computer used in the architecture was the PDP8, followed by the PDP11. In 1983, the first dual laser instrument was put into routine service. This allowed simultaneous measurements on multiple fluorescences. In 1998, a three laser instrument was developed, allowing up to eleven distinct fluorescence emissions (Herzenberg and Herzenberg, 2004). 2.3 Experimental Design Researchers at the Institute for Bioenergetics are engaged in a project which considers the link between lipid availability, lysosomallendosomal acidity, and the expression of cell surface Major Histocompatibility (MHC) class II molecules in human macrophage cell lines. This work is significant because it provides evidence for a mechanistic link 10
PAGE 21
between exogenous lipid availability, inflammation and the immune response. Because lysosomes are established sites for fatty acid accumulation and MHC class II molecules must traffic through the acidic lysosomal/endosomal compartment, we reasoned that fatty acid availability might have a direct impact on the immune response through fatty aciddependent alteration in the expression ofMHC class II on the macrophage cell surface. (Schweitzer et al., 2004) Putting this work into context for the nonbiologist, intracellular vesicles, such as the lysosome or lipid raft, may support the transport ofMHC class II molecules to the surface of the cell. These molecules aid in resistance to certain diseases. The data considered in this paper attempts to verify the lipid raft hypothesis by showing that mice raised on a highfat diet have more surface expression ofMHC class II than those raised on a lowfa t diet. The presence of MHC class II can be detected through cytometric analysis. Expanding upon in vitro experiments in human cell lines, the researchers designed in vivo experiments on mice. The project under consideration in this paper started with 112 mice. Half of the mice were fed a high fat diet (5% coconut oil and 5% safflower oil), half a low fat diet ( 5% safflower oil). After approximately 16 weeks, the mice were killed. The spleens were removed, and a suspension of splenocytes was prepared. The suspension contained approximately 1,000,000 cells per milliliter of solution. 11
PAGE 22
Subsamples of the splenocyte suspension were then dyed with substances designed to fluoresce in the flow cytometer Lysosomal acidity was measured by the fluorescence ofLysoSensor stain. MHC class IT expression was measured by the fluorescence of a phycoerythrin conjugated rat antimouse 1NIE. In experimental data, samples stained with this substance were labeled "M5114." Additionally, some samples were stained with an isotype (lgNigE) which binds to all nonspecific matter, thereby acting as a control. Sam p les were processed through the flow cytometer, yielding the e vent data as discussed above. The experimental data includes up to 13 samples or data se t s on each individual! unstained, 1 isotype sta i ned, 3 M 5114 stained, and 3 LysoSensor stained; with an additional set of samples trea t ed with CytoP e rm/CytoFix processing. This proc e ss, also used by Kumar et al. (200 2 ), permeates t he cell m embrane, allowing intracellular staini ng. This treatment i s labeled "CPCF" in the experimetnal data. The CPCF treatment was performed on one nostain, one isotype, and three M5114 samples, per individual. The project was divided into four e x periments. The first experiment included only females, while the second experiment included only males. The third and fourth experiments included both males and females. Individuals in the third experiment were wounded at an age of 16 weeks. Individuals in the fourth experiment were wounded at an age of 12 weeks. The wounding provided some data for research on diabetics and wound healing. Additionally, the wounding could 12
PAGE 23
activate the immune system of the wounded individuals. This would be demonstrated by increased MHC class II expression The project included four strains ofmiceBalb/c, C57/Black 6, UPC2 Knockout, and P6129. The P6129 species is the parent line of the UPC2 Knockout mice. These species are represented in the data as B, C U, and P individuals, respectively. 2.4 The Analytical Process Cell Quest software available from BD Biosciences is the primary tool that the researchers at the Institute for Bioenergetics use to analyze the flow cytometry data Thi s software presents the data graphically, and provides summary statistics. The software also lets users manually define s ubsets or clusters of data. The graphical or statistical analysis can then be performed on those clusters The two main graphi c al presentations are the scatt e rgram and the histogram. The scattergram i s a dot plot showing event val ues for two different parameters, as shown in Figure 24. The dot plot is the most common and useful data representation, particularly durin g realtime data collection (Parks and Bigos, 1996) The histogram shows the value of one parameter on the horizontal axis, and the count of events with that value on the vertical axis, as shown in Figure 25 ''The simple onedimensional histogram, plotting cell frequency as a function of signal level, has been a venerable analysis tool since the beginnings of flow cytometry'' 13
PAGE 24
(Parks and Bigos, 1996, p. 50.6) Scattergrams and histograms can be found in Desbarates (1999), Huber (2001), and Lee (2004) (/) u.. 0 ....... ,._ ., .. .. \: ss Figure 24. Representative Scattergram FS Figure 25. Representative Histogram 14
PAGE 25
Gating is a mechanism for graphically or numerically identifying a subset of the data. The analyst can then view scattergrams or histograms on that subset of data only. Additionally, certain summary statistics are available on the gates or regions. These include number of events, percentage of gated events, percentage of total events, arithmetic mean, and geometric mean. 15
PAGE 26
3. Building the Rich Analytical Environment Recall that the event data is written to a file in an esoteric format. Biologists are accustomed to analyzing this data with proprietary software such as CellQuest. CellQuest represents the data graphically, and via summary statistics. Consequently, biologists are not necessarily aware of the possibilities for analyzing the data were it in a more accessible format. The current analytical environment of flow cytometry analysis is limited, closed, and samplecentric. To bring more powerful analytical tools, techniques, and mindset to the problem space, a different environment is required. First, a mechanism must exist to combine event data from multiple samples. Second, the analyst needs to be able to perform a variety of statistical and graphical analyses. These analytical approaches should be constrained only by the fundamental nature of the data, and by the analyst's imagination. The approaches should not be artificially constrained by a particular tool. Third, the analytical environment should be open, allowing the data to be accessed and processed in a wide variety of means. Fourth, the analytical environment should support rapid hypothesis testing. This rich analytical environment (RAE) is created by parsing the FCS data and loading the resulting data into a relational database. Additionally, the parsed 16
PAGE 27
data is associated with information about the individual, the sample type, and the experiment. Once the data is in a standard relational database, a variety of tools and techniques can be employed. Viable tools for analyzing the data include SQL; programming languages such as Java, Perl, and database stored procedures; and data analysis and graphing programs. Such programs can have a mathematical or scientific focus, such as MATLAB (www.mathworks.com) and Origin (www.originlab com). Alternatively, they can have a business intelligence focus, such as Business Objects (www.businessobjects.com) and Cognos (www.cognos.com). Many powerful tools for analyzing data in relational databases are available. The research biologist can leverage these tools, once the data is exposed. This chapter discusses the process of building the RAE. 3.1 Data Flow Figure 31 shows the data flow for the first generation of the rich analytical environment. The Java program written for this work, FCSParser processes the native flow cytometry into two files, events and parameters. The resulting files are then loaded into a relationa l database. The sample key maintained by the biologists to correlate samples with individual mice and sample type, is also loaded into the database. 17
PAGE 28
. 1 params.dat ___t.LMQ)__ events.dat sample key+< Figure 31. First Generation Data Flow Diagram Figure 32 shows the data flow of the second generation of the RAE. In this case, the event data is combined with information about the sample with which it is associated. This metadata includes individual, diet type, strain, and sample type. The resulting environment is easier to use, because fewer database joins are required. Additionally, the reduction in joins can improve query performance. The modified components of this solution are FCSParser', events.dat' and sample key'. The following section provides more details on preparing data for the database. 18
PAGE 29
FCS File CLMD) params .dat events .dat' sample key' L..+f Figure 32. Second Generation Data Flow Diagram 3.2 FCS Parsing The FCS files, collected and written by the flow cytometer, are written in a published format. The formats are specified by the Data File Standards Committee of the International Society for Analytical Cytology (IS A C). The file format includes a header section, a section recording parameters of the data run, and a section recording the events. There is also an optional analysis section. The header section is of a fixed format, noting the version of the data standard (FCS2.0 or FCS3.0), and providing byte offsets to the remaining data. Among the values in the header section are the byte offset to the beginning of the text section, the byte offset to the end of the text section, the byte offset to the beginning of the data section, and the byte offset to the end of data section. For example, the standard states that the offset to the first byte of the text section is found 19
PAGE 30
in positions 1017, and the offset to the last byte of the text section is found in positions 1825. A representative header from the files considered in this work is: FCS2 0 128 971 2048 90223 96512 96537 90240 96425 The file format is FCS2.0. The text section ranges from bytes 128971, the data section from 204890223, the analysis section from 9651296537, and a userdefined "other" section from 9024096425. These last two sections are not processed in this work. The text section consists of ASCTI namevalue pairs. The first character of the text section is the delimiter which separates the names and values. Names or keywords which are required in the specification are prefaced by the "$" character. Additional keywords can be included in the text section, but they are not prefaced by the"$" character. Among the required keywords are $DATA TYPE, $PAR, and $TOT. The values associated with these keywords represent the type of data in the data segment (ASCTI text, binary integer, binary floating point), the number of parameters recorded in an event (where the parameters are such values as FS, SS, FLlLOG and FL2LOG), and the tota] number of events in the data set, respectively. A subset of the text section from one of the files considered in this work follows: !$FIL!G0036390.LMD!$SYS!DOS 6.22!$INST!UCCS Flow Center ,Science Bldg. Rm.163 The file name is G0036390.LMD. It was created by a computer running DOS 6.22, at the institution named ''UCCS Flow Center ,Science Bldg. Rm. 163." 20
PAGE 31
The data section consists of the raw data recorded by the flow cytometer. Several of the keywords in the text section describe the data section. These keywords include $MODE, $DATATYPE, $BYTEORD, and $PnB. $MODE specifies whether the data is in list or histogram mode. The data considered in this work was list mode, which means that data was recorded for each cell or event. Additionally, in this work, the data was \Vritten as binary integers with a byte order of "1,2", as specified by the $DATA TYPE and $BYTEORD parameters. The byte order specifies the order in which the binary dat a bytes are written to compose a data word : "1" refers to the least significant digit, "2" to the most significant digit. Recording the byte order allows data which is written under one operating system to be analyzed under another operating system. $PnB specifies the number of bits in each parameter, where n is the number of the parameter (i.e. 1, 2, 3). In this work, a Java program was written to read an individual file and write two resulting files, params.dat and events.dat. The format of the data section was determined by manual inspection of the appropriate parameters of the text section. The parsing code was written with this knowledge. As such, it is not a fully general parsing solution The value of was determined by extracting the first part of the input file name. Sample input file names, as generated by the flow cytometer, are: G0036500.LMD G0036501.LMD 21
PAGE 32
G0036502.LMD G0036503.LMD G0036504.LMD Consequently, given an input file of G0036500.LMD, output files would be paramsG0036500 dat and eventsG0036500 dat. Shell scripts were used to run the parsing program on all of the files within a particular directory In the first generation of the parsing code, all rows ofboth output files were prepended with the tag and the experiment number. The tag identifies the particular sample. The experiment number allows identification and efficient processing of a subset of the data. Table 31 and Table 32 show sections of data from each of these two outpu t files Table 31. An Extract from a Param File G0036076 1 1IFILIG0036076.LMD G0036076IliSYSIDOS 6.22 G0036076IliiNSTIUCCS Flow Center ,Scienc e Bldg. Rm.163 G0036076IliCYTIXL AB27219 G0036076IliDATEIIOJul03 G0036076IliBTIM I 16:59:34 G0036076IliSRCiss new keep 1 ns G0036076IliSMNOIG0036076 G0036076IliOPISLV G0036076Il i COMISYSTEM ll Version 3 0 G0036076IIITESTNAMEILisa spleen FLI G0036076IIITESTFILEIG0000157.PRO G0036076IliBYTEORDI1 ,2 G0036076IliDATATYPEII 22
PAGE 33
Table 32. An Extract from an Event File, First Generation G0036076Ili75I164I387IOI734I821 G0036076Ill197l203l451IOI841l844 G0036076Ili173I1291432IOI8261794 G0036076Ill18li116I383IOI83ll782 G0036076Ill18ll16li344IOI83ll818 G0036076Ili20li212I399IOI842I848 G0036076Ill193l22li464 IOI838I853 As the data model evolved, the information added to each row of data in the event file grew. In the second generation of the environment, each row in the event table includes key information from the sample key. Representative data is shown in Table 33. Table 33. An Extract from an Event File, Second Generation G0036076IllliBL3INo StainiOIBILINORMALI75I164I387IOI734I821 G0036076IllliBL3INo StainiOIBILINORMALI197I203I451IOI84ll844 G0036076IllliBL3INo Stain!OIBILINORMALI173I129I432IOI8261 794 G0036076IllliBL3INo StainiOIBILINORMALI18li116I383IOI83ll782 G0036076 I llliBL31No StainiOIBILINORMALI18ll16li344IOI83ll818 G0036076IIIIIBL31No StainiOIBILINORMALI20II212I399IOI842I 848 3.3 Preparing the Flow Sample Key For the experiments under discussion, the Flow Sample Key data was stored in a Microsoft Word Document. An extract of that data is shown in Table 34. 23
PAGE 34
Table 34. Flow Sample Key Sample# Description Sample# Description 101. BL7 No Stain 126 BH12 no stain 102 BL7 isotype 127. BH12 isotype 103 BL7 M5114 #1 128. BH12 M5114 #1 104 BL7 M5114 #2 1 29 BH12 M5114 #2 105. BL7 M5114 #3 130. BH12M5114#3 106 BL8 No sta i n 131. CL8 no stain 107. BL8 isotype 132 CL8 isotype 108, BL8 M5114 #1 133. CL 8 M5114 #1 109. BL8 M5114 #2 134. C L8M5114#2 Turning this data into a databasefriendly format required several steps. The first step was a parsing step, whereby the descriptions were split apart into "Individual," "Sample Type," "Replicate," "Strain," "Diet," and ''Process" components. The Individual values represent the particular mice. The Individual values shown in this set are BL7, BL8, BH 12, and CL8. The Sample Type values indicate how the sample has been stained. The values shown in this set include No Stain, Isotype, and M5114. Replicate values are 0, 1, 2, and 3. The first letter ofthe Individual designation also represents the mouse strain. The second letter represents diet, either high fat (H) or low fat (L). Finally, some of the samples were processed with a 24
PAGE 35
CytoPerm/CytoFix (CPCF) treatment. The two process values are NORMAL and CPCF. Inspection of the sample key also shows that the input must be cleansed or standardized. For example, No stain and no stain were transformed to No Stain, and isotype to lsotype. Additionally, the Sample# value 108, was changed to 108. Finally, a value for "Experiment" was prepended to the data set. Resulting data, ready for loading into the database, is shown in Table 35. Table 35. Sample Key Data, Reformatted for Database Gi .0 8. c: E :::l >a> z 10 12 E a> :::l a> 1:: a. "0 a. c::: 8. E > E a. G) )( I'll '5 I'll a> w en ..E en a:: en 0 14 BL6 Lyso B L NORMAL 15 BL6 Lyso 2 B L NORMAL 16 BL6 Lyso 3 B L NORMAL 17 BH4 No Stain 0 B H NORMAL 18 BH4 Lyso 1 B H NORMAL 19 BH4 Lyso 2 B H NORMAL In the future, the biologists may want to record their sample keys in a more databasefriendly format. One possibility is an Excel spreadsheet with column headers corresponding to the database fields (Experiment, Sample_No, Individual, Sample_ Type). Additionally, they might want to include validation rules on the columns to keep the data clean. 25
PAGE 36
3.4 Connecting the Flow Data to the Sample Key The data collected by the flow cytometer needs to be associated with the data from the Flow Sample Key. The parameter SRC contains the necessary linlc Sample SRC values can be found in Table 36. The nonnumeric part of the value is filtered out. The remaining numeric portion is written to the param file with the name SAMPLE_NO. Note that under certain circumstances, there are multiple runs per sample. In the second generation of the RAE, this sample number was used to retrieve information from the sample key Table 36. Values of SRC Parameter ss 95 ss 96 ss 97 ns ss 97 ns re ss 9 7 ns re re Parks and Bigos (1996) note that the designers of most flow data systems have viewed documentation as a poor stepchild to the g l amour of graphical display. Often all they provide is a pedestrian editor for keyword values to be stored with the data file. .. These editors tend to be oriented toward single samples and not taking sample grouping and experiment structure into account. Once the annotations are entered, there may be no facility for browsing or searching the keywords to group or retrieve relevant files when doing later analysis. ( p. 50.1) The system used by the researchers at the Institute for Bioenergetics for correlating flow files to their sample key is tedious. The SRC values discussed 26
PAGE 37
above are entered into the flow cytometer at sample run time. The values are printed automatically on a hardcopy analysis summary which is generated at run time. These printouts are stored in a binder. When a researcher wants to find the flow file associated with the sample, BL6 Lyso #1, she first finds that entry in the sample key, and mentally notes the associated key number of that sample (e.g 14). She then turns pages in her binder, scanning the printouts until she finds the number 14 embedded in the SRC field (A. Reding, personal communication, March 30, 2004). 3.5 Evolution of the Data Model The data model for the Rich Analytical Environment underwent several iterations before reaching its current form. The initial version was normalized, with tables for parameters, events, and the sample key. This vers ion had several shortcomings. First, the vo l ume of data visavis machine resources created a need for performance tuning. Second, multiple joins were required to connect the event data with the sample key data. Both of these factors led to the consideration of a star schema or dimensional model. However since most ofthe dimensions are single attribute dimensions, the collapse of all requisite values into a single table made sense. Each of these iterations of the data model is discussed in this section. 3.5.1 The EntityRelationship Model The entityrelationship model, as shown in Figure 33, accurately represents the data coming from two data sources. The first data source is the FCS files. Recall that there are two types of data that are relevant to the analyst. These are the 27
PAGE 38
individual events that are measured in a particular sample, and the parameters that describe each sample. The event data translated from binary into ASCII, comprises each row of the event table. Additionally, each event is prepended with the experiment number and the tag or file name event tag experiment fs ss fl11og fl21og fslog sslog param tag experiment param_name param_value where param_ + sample_key experiment sample_no individual sample_type name='SAMPLE NO' am_ value=sample_no and par Figure 33. Entity Relationship Model There were two options for modeling the parameter table. One option was to create a table having one field for every parameter 68 fields in this work. The other option was to create a more general format in which each namevalue pair had one row. Since implementers of the FCS format can include their own namevalue pairs in the text section of the file, this second option is more flexible in the long run. Additionally, this option supports the creation of additional namevalue pairs that support the analysis. For example, the sample number that corresponds to the 28
PAGE 39
sample key is embedded in the SRC field. This field was parsed to extract the numeric part of the data. This number, along with the name "SAMPLE_NO," was added to the param table. Again, each row includes the experiment and the tag. The SAMPLE_KEY table is simply a database representation of the Flow Sample Key, maintained in a word processor. INDIVIDUAL, SAMPLE_TYPE, and REPLICATE are split apart into separate fields. DIET and STRAIN are derived from the appropriate character of the individual value. Each row also includes the experiment number. 3.5.2 The Dimensional Model This environment is a reporting environment as opposed to a transactional environment. Thus, techniques for performance and usability in reporting environments are worth considering. One prominent technique is the dimensional model, which is also known as the star schema. A dimensional model for the RAE is shown in Figure 34. Visually represented, a fact is at the center of the model, with dimensions or attributes surrounding that fact. 29
PAGE 40
tag experiment individual sample(#) strain diet replicate process Figure 34. Dimensional Model In this application, the event is the fact or the measurement at the center of the star. The dimensions include INDIVIDUAL, TAG, EXPERIMENT, SAMPLE, SAMPLE_TYPE, REPLICATE, PROCESS, DIET, and STRAIN. Most ofthese dimensions include only one attribute. As such, they are degenerate dimensions. Kimball, Reeves, Ross and Thomthwa it e (1998) recommend collapsing the degenerate dimensions into the fact table, without a join to anything. INDIVIDUAL is the only dimension that has multiple attributes. These attributes could include sex, weight, age, food consumption, and various other attributes recorded by the biologists. The resulting event table is shown in Figure 35. 30
PAGE 41
event tag experiment individual strain diet sample sample_type replicate process fs 55 f111og f121og fslog sslog Figure 35. The E vent Table Because most queries are performed on the ev en t t ab l e and the event table only, this structure increases the ease with which the biologists can query the data. Table 37 shows a his t ogramgenerating query from the original implementation. Table 38 shows the logically equivalent query from the second generation implementation. Index in g on key values like INDIVIDUAL, DIET_ TYPE, EXPERIMENT, SAMPLE_ TYPE, and REPLICATE will enhance performance. 31
PAGE 42
Table 37. Histogramgenerating Query, Original Model select e.tag, sample_type, round(fs/25), count(*) from event e, param p, sample_ key s where e.tag=p.tag and param name='SAMPLE _NO' and param value=to _char( sample_ no) and individual='BH6' and s.experiment=1 and p.experiment=1 group by e.tag, sample_type, round(fs/25) having round(fs/25)<20 Table 38. Histogramgenerating Query, Dimensional Model select tag, sample_type, round(fs/25), count(*) from event where individual='BH 6' group by tag llsample_type, round(fs/25) having round(fs/25)<20 3.5.3 Data Volume The volume of data created by cell biology research is significant. On average, each individual generated 154,000 rows of event data. Compare this to a business application, such as cable billing. Suppose that each customer has 3 products on his or her account. Twelve months worth of billing would create 36 billing line items. One would need to bill nearly 4300 customers for 1 year to create 32
PAGE 43
as much data as one specimen generates! Selected statistics on the data set are shown in Table 39. Table 39. Selected Data Set Statistics 1! .!!l oJB .!!I,__ c: c:ro c:o. c: Q) ii: Q) ;:o:l Q) E Q)a."O 0 > lll"O > ro > :;: w i5 Wen w 'i5 'It 0 'lt"O C),_ Cl > 'It <( Exp1 418 3 889 684 25 9 305 155 587 Exp2 360 2 ,791, 145 24 7 753 116 298 Exp3 279 2 799 232 19 10 033 147 328 Exp4 362 3 1 1 9 628 22 8 618 141, 484 TOTAL 1419 12 599 689 90 33
PAGE 44
4. Analytical Techniques in the Rich Analytical Environment Once data from a flow cytometry experiment, which consists of multiple samples on multiple individuals, has been parsed and loaded into the RAE, a variety of analytical techniques can be employed. These techniques can be statistical or graphical. Additionally, many of the techniques should be thought of as standard techniques, performed each time an experiment is conducted. These standard techniques are fundamentally analysis patterns, enabling the analyst to know much more about his or her data, and to reach this knowledge very quickly. Furthermore, the RAE supports analyses in which multiple samples are analyzed with the same technique at the same time. The RAE also supports the derivation of new measurements from the base measurements collected by the flow cytometer. This chapter discusses some of the possible analytical techniques. The RAE contains data on multiple samples. There are two broad groupings of these samples, individual and sample type. In the data analyzed in this work, there are four different sample types. The types are No Stain, Isotype, Lyso, and M5114. Additionally, for certain individuals, the No Stain, Isotype, and M5114 samples are also processed with the CytoPerm/CytoFix technique, which perforates the cell membrane, allowing the stain to adhere to the interior of the cell. Thus, one 34
PAGE 45
approach to segmentbased analysis is to perform standard analytical techniques on all of the samples for a particular individual. This approach demonstrates the similarities and the differences in the different sample types for that particular individual. The other approach is to analyze all of the san1ples of a particular type (e.g. M5114) for all individuals, or a particular subset of individuals. This approach demonstrates similarities and differences in that particular sample type across a population of individuals. Additionally, new measurements can be derived from the data collected by the flow cytometer. Recall that the flow cytometer measures at least four parameters: forward scatter, side scatter, and two types of fluorescence. Recall also that the fluorescence is created by staining the cell with compounds which will adhere to certain molecule types, "on" or "in" the cell. The fluorescence is generated by these stains. All other things being equal, the larger cells have more surface area for the stains to adhere to, and consequently, more fluorescence. Thus, the analyst might want to normalize the fluorescence as a function of cell size. Since forward scatter is an approximate measurement of cell size, one viable normalization function is FL2/FS. Many of the analysis techniques discussed in the following section refer to this normalized fluorescence. 35
PAGE 46
4.1 Statistical Analyses This section presents a variety of statistical analyses on the data collected by the project. The discussion of each type of analysis includes a brief overview of the purpose of the analysis, the SQL statement which retrieves data, all or part of the result set, and a brief comment about the findings. 4.1.1 Summary Statistics, Entire Experiment A good starting point for analysis is high level statistics on an entire experiment. These statistics provide an overall sense of results, and relative consistency (or lack thereof) across samples. An appropriate SQL statement is shown below. SQL Statement select tag, individual sample_ type, count(*), avg(fs), stddev(fs), avg(ss), stddev(ss), avg(fl21og) from event where experiment=2 and process='NORMAL' group by tag, individual, sample_ type 36
PAGE 47
Table 41. Result Set (Extract}Summary Statistics, Entire Experiment 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 G0036563 G0036564 G0036565 G0036566 G0036567 G0036568 G0036569 G0036572 G0036573 G0036574 G0036575 G0036576 G0036577 G0036629 G0036630 G0036631 G0036632 G0036687 G0036688 G0036689 G0036690 G0036691 G0036692 G0036693 G0036694 G0036695 G0036696 G0036747 G0036748 G0036749 G0036750 G0036751 The results show: BL 9 BL9 BL9 BL9 BL9 BL9 BL10 BL10 BL 10 BL10 BL10 BL10 BL10 CH8 CH8 CH8 CH8 BL9 BL 9 BL9 BL9 BL10 BL10 BL10 BL1 0 BL10 BL10 CH8 CH8 CH8 CH8 CH8 w Cl. 1=, w Cl. No Stain No Stain No Stain Lyso Lyso Lyso No Stain No Stain No Stain No Stain Lyso Lyso Lyso No Stain Lyso Lyso Lyso No Stain lsotype M5114 M5114 No Stain No Stain lsotype M5114 M5114 M5114 No Stain lsotype M5114 M5114 M5114 I= z :::l 0 () 37 7681 7558 7663 7484 7108 7220 7102 7880 7749 7838 7680 8153 7721 76511 62706 63176 62629 6078 6332 6472 6378 6291 6 761 6773 7102 6964 6920 26788 26650 26203 27246 31303 220 216 219 198 212 212 224 242 244 238 219 212 216 135 140 136 139 235 222 245 240 238 224 228 224 226 232 157 151 151 156 156 Vi u. 0 0 len 113 108 112 92 100 101 103 125 120 118 102 102 102 133 135 128 130 107 110 135 119 118 108 116 117 119 124 137 128 123 139 149 309 299 307 267 281 283 305 357 359 351 310 308 305 2 9 2 2 8 7 277 286 178 170 194 189 183 192 199 204 203 207 219 208 208 216 220 Vi (ij 0 0 len 211 204 212 185 188 193 216 240 239 236 220 217 212 238 234 226 231 148 146 167 157 156 155 169 173 174 176 209 201 196 210 219 3.5 3.6 3 4 1.5 0.6 0 6 0.1 2.5 2 9 2 8 0.4 0 7 0 6 0.1 1.2 1 3 0.9 148 7 135 7 584.9 566 7 567 2 105 8 130 5 491. 0 524.3 503 9 115 8 129.2 409 0 590 6 583. 9
PAGE 48
in general, a fairly consistent number of events in each sample; certain suspect samples, as indicated by a large number of events and/or reruns of the sample (e.g. CH8); relatively consistent measurements across replicates for a given individual and a given sample type (e.g. rows 13, 1112); and that no stain "families" have almost no FL2 (rows 117), while stain families show some FL2 (rows 1832), e v en on No Stain samples. 4.1.2 Suspect Samples Another technique highl i ghts suspect samples or individuals by identifying the samples that have a large number of events say more t han 20,000. Additionally, the query can identify how many suspect samples are assoc i ated with a particular individual. SQL Statement select experiment, individual, count(*) from ( select experiment, tag, individual, count(*) from event group by experiment, tag, individual having count(*) >20000 ) group by experiment, individual 38
PAGE 49
Table 42. Result SetSuspect Samples EXPERIMENT INDIVIDUAL COUNT(*) 1 1 1 1 1 2 3 3 3 3 BL3 BL4 BL5 PH4 UL4 CH8 CH2 PH3 PL1 UL1 3 6 2 6 1 9 7 5 10 2 Note that compared to Experiments 1 and 3, Experiment 2 has a small number of suspect individuals. Depending on the sample type of the suspect samples, and on the particular analysis performed, the suspect samples/suspect individuals could skew comparisons made between Experiment 1 and Experiment 2. 4.1.3 Study of a Particular Individual A variation of the analysis presented in Section 4.1.1 is summary statistics, constrained to a particular individual. SQL Statement select tag, sample_type, replicate, count(*), avg(fs) avg(fl21og), avg (fl2log/fs) from event where individual='BH3' and process='NORMAL' group by tag, sample_type, replicate 39
PAGE 50
Table 43. Result SetStudy of Individual BH3 w en u. s (!> w en 0 0 ...J ...J w (.) t=: u. N ...J z (3' ...J ::i ::::l u. u. (!) :::E 0 (3' (3' w <( <( 0:: (.) 1(/) G0037080 No Stain 0 7352 198 2 0 0 G0037081 No Stain 0 7409 195 3 0 0 G0037082 Lyso 1 7 96 8 189 3 0 0 G0037083 Lyso 2 8791 179 4 0.0 G0037084 L yso 3 8332 185 4 0 0 G0037184 No Stain 0 6691 216 45 0.3 G0037185 lsotype 0 7 127 2 1 4 59 0 4 G0037186 M5 1 14 1 7097 211 535 3.3 G0037187 M5114 2 7002 212 544 3.3 G0037188 M5114 3 7207 212 5 34 3.2 The results show: a relatively consistent number of events in each sample ; increased fluorescence on Isotype and M5114 samples; and relative consistenc y across replicates of the same sample type. 4.1.4 M5114 Fluoresence, Normal Process and CPCF Process Another possible analysis is to compare M5114 fluorescence in normally processed samples to that in CPCF processed samples. 40
PAGE 51
SQL Statement select tag, individual, sample_ type replicate, process avg( fl2log), avg( fl2loglfs) from event where experiment= 1 and sample_type='M5114' and fs>20 group by tag individual, sample_ type, replicate, process order by individual sample_ type Table 44. Result Set (Extract}M5114 Samples, Normal and CPCF Processing w Ui Ll.. a.. 8" (3 ...J w 0 0 <( en ...J ...J :::> w (.) en N 0 ...J w ...J 5 a.. ::i (.) Ll.. Ll.. (,!) 0 a.. 0 (3' (3' w <( <( a:: a:: 1en a.. G0036266 BH 5 M5114 NORMAL 597 3 4 G0036267 BH5 M5114 2 NORMAL 600 3.3 G0036268 BH5 M5114 3 NORMAL 594 3.2 G0036404 BH5 M5114 1 CPCF 442 3 5 G0036405 BH5 M5114 2 CPCF 424 3.5 G0036406 BH5 M5114 3 CPCF 562 4.3 G0036407 B H5 M5114 3 CPCF 570 4.4 G0036271 BH6 M5114 1 NORMAL 599 3 7 G0036272 BH6 M5 1 14 2 NORMAL 625 3.6 G0036273 BH6 M5114 3 NORMAL 595 3 5 G0036410 BH6 M511 4 1 CPCF 568 4.7 G0036411 BH6 M5114 2 CPCF 576 4 8 G0036412 BH6 M5114 3 CPCF 585 4 9 G0036241 BL3 M5114 NORMAL 617 3.8 G0036242 BL3 M5114 2 NORMAL 623 4 0 G0036243 BL3 M5114 3 NORMAL 639 3 9 G0036377 BL3 M5114 CPCF 608 5 1 G0036378 BL3 M5114 2 CPCF 611 5 0 G0036379 BL3 M5114 3 CPCF 616 5 1 41
PAGE 52
The expected finding of more FL2 on CPCF samples is not consistently shown. However, ifFL2 is normalized (FL2LOG/FS), a higher value is consistently shown on the CPCF samples. This result highlights the value of the normalization technique. 4.1.5 Fluorescence by Diet Type, Isotype and M5114 Stains The types of statistics computed on individual samples can also be computed on the entire project. This technique lumps the data into large buckets, not broken out by sample. SQL StatememNORMAL Process select diet, sample_ type, avg( fl2log/fs ) stddev( fl21og/fs ), count(*) from event where sample_type in ('M5114','1sotype') and process='NORMAL' group by diet, sample_ type Table 45. Result SetFluorescence by Diet Type, Normal Processing DIET H H L L SAMPLE TYPE l sotype M5114 lsotype M5114 AVG(FL2LOG/FS ) 7 3.4 6 3.3 STDDEV(FL2LOG/FS) 1.1 2 8 1 0 2.7 COUNT(*) 237376 717209 275953 780993 The results show slightly more normalized fluorescence for the high fat samples, for both isotype and M5114 stains, indicating support for the biologists' hypothesis regarding highlipid environments being conducive to the presentation ofMHC Class IT molecules. 42
PAGE 53
Table 46. Result SetFluorescence by Diet Type, CPCF Processing DIET H H L L SAMPLE TYPE lsotype M5114 lsotype M5114 AVG(FL2LOG/FS) 1 0 4 5 1 1 4 6 STDDEV(FL2LOG/FS) 1 2 3 5 1 3 3.5 COUNT(*) 203028 624292 268040 729606 The results show slightly more normalized fluorescence for the low fat samples, both isotype and M5114 stains. The CPCF processing alters the structure ofthe cell, allowing staining ofMHC class II molecules both inside of and on the surface of the cell. Thus this result is consistent with experimental design. 4.1.6 Fluorescence by Experiment and Diet Type, M5114 The analysis above is easily expanded by adding experiment as one of the grouping variables The select statement includes only M5114 samples, and computes average normalized FL2. Result sets for both NORMAL and CPCF processes are presented. SQL StatementNORMAL process select experiment, diet, avg(fl2log/fs) count(*) from event where sample_type ='M5114' and process = 'NORMAL' group by experiment, diet 43
PAGE 54
Table 47. Result SetFluorescence by Experiment by Diet Type, Normal Processing EXPERIMENT DIET AVG(FL2LOG/FS) COUNT(*) 1 H 3 4 215554 1 L 3 3 262408 2 H 3.6 279322 2 L 3.3 252318 3 H 3 1 222333 3 L 3.4 266267 The findings are as follows: Experiment 2, Males, shows greater difference in normalized FL2 between high fat and low fat diets than does Experiment 1, Females. Experiment 2, Males, shows higher normalized FL2. Experiment 3, Wounded Mice, shows greater normalized FL2 in low fat diets than high fat diets; and high fat diets show the lowest normalized FL2 of any of these result sets. Table 48. Result SetFluorescence by Experiment by Diet Type, CPCF Processing EXPERIMENT DIET AVG(FL2LOG/FS) COUNT(*) 1 H 4.3 225751 1 L 4 4 256878 2 H 4.4 189169 2 L 4 5 246198 3 H 4.9 209372 3 L 4.9 226530 The results show that: in no cases does the high fat diet show more normalized FL2 than the low fat diet; and 44
PAGE 55
Experiment 3, Wounded Mice, shows higher normalized FL2 than do the other two experiments. Interexperimental comparisons may be biased by differing machine calibration on different days. Consequently, these results warrant further investigation to ascertain validity. 4.1.7 Simple Histograms Using Text Constructing simple histograms using text is helpful when the user does not have easy access to a good graphical tool, and wants to do some quick graphical exploration. The technique can be used to show similarity of histogram curves across replicates, or for a quick identification of a local minimum. Additionally, this technique can provide graphical results of a large number of samples quickly. One of the challenges of graphical presentation on a two dimensional surface, such as a piece of paper, is how to distinguish between many samples. Only a certain number of samples can fit onto the graph before the presentation becomes too cluttered to extract useful information. Thus, this technique, while not showing multiple samples in the same data space, does provide a quick visualization of a large number of samples. SQL Statement select tag, individual trunc(fl2log/50) t, substr(lpad('o' round( count(*)/20,0), 'o'), 1 40) graph from event where fs> 100 and sample_type='M5114' and process='NORMAL' and replicate= 1 group by tag individual trunc(fl2log/50) 45
PAGE 56
Table 49. Result Set (Extract}Textual Histogram G0037249 IUH 2 .o 0000000000000000000000000 G0037249 IUH 2 1.0 000000 G0037249 IUH 2 2.0 000000 G0037249 IUH 2 3 0 00000000000 G0037249 IUH 2 4.0 00000000000000 G0037249 IUH 2 5.0 0000000000000000 G0037249 IUH 2 6 0 00000000000000000 G0037249 IUH 2 7 0 0000000000000 G0037249 IUH 2 8.0 0000000000 G0037249 IUH 2 9.0 00000000 G0037249 IUH 2 10.0 0000000 G0037249 IUH 2 11. 0 0000000000 G0037249 IUH 2 12. 0 0000000000000 G0037249 IUH 2 13. 0 000000000000000000 G0037249 IUH 2 14.0 00000000000000000000000 G0037249 IUH 2 15. 0 0000000000000000000000000000 G0037249 IUH 2 16.0 0000000000000000000000000000 G0037249 IUH 2 17.0jooooooooooooooooooooooo G0037249 IUH 2 18.0jooooooooooooo G0037249 IUH 2 19.0jooooooo G0037249 IUH 2 20.0joooo Note the two peaks of fluorescence, and the local minimum at trunc(fl2log/50)=10. This translates to a fluorescence reading of 450500. 4.1.8 Normalized Fluorescence by Forward Scatter Range The following technique was inspired by a "Quartile" based analysis presented by Covas (2004). Covas presents forward scatter quartiles, and reports fluorescence by each quartile. The technique discussed below divides forward scatter into four equal ranges (0 to 255, 256 to 511, 512 to 767, 768 to 1023) and computes the average normalized fluorescence in each of these quadrants. SQL Statement select tag, individual sample_ type, avg(decode(trunc(fs/256),0,fl21og/fs null)) Ql, avg( decode(trunc(fs/256), I ,fl2log/fs,null)) Q2, avg(decode(trunc(fs/256),2,fl21og/fs,null)) Q3, 46
PAGE 57
avg( decode( trunc( fs/256),3,fl2log/fs null)) Q4 from event where experiment=2 and process='NORMAL' and sample_type in ('M5114', 'lsotype ) group by tag, individual sample_ type Table 410. Result SetFluorescence by Forward Scatter Range w a.. ...J c( :;:) w 0 ...J 0 N "' ..,. 5 a.. 0 0 0 (!) iS :::E ?; c( (J) G0036688 BL9 l sotype 1 0 0.4 0 .5 0.5 G0036689 BL9 M5114 4.1 1.7 1.4 1 0 G0036690 BL9 M5114 3.9 1 7 1.3 1 0 G0036693 BL10 lsotype 0.9 0.4 0.4 0 5 G0036694 BL10 M5114 3 5 1 4 1 1 0.9 G0036695 BL 10 M5114 3 7 1.5 1.3 1 0 G0036696 B L 10 M5114 3.6 1 5 1.3 1 0 G0036708 BH7 lsotype 0 9 0 4 0 4 0.5 G0036709 BH7 M5114 3 8 1 .6 1. 2 0.9 G0036710 BH7 M5114 4 1 1.7 1.3 1 0 G0036711 BH7 M5114 4 1 1.7 1 3 1 0 G0036713 BH8 lsotype 0 8 0.4 0 4 0 5 G0036714 BH8 M5114 3.5 1 5 1.1 0 9 G0036715 BH8 M5114 3 8 1.6 1 2 0 9 G0036716 BH8 M51 14 2.1 0 9 0. 8 0. 6 These results show that the Isotype samples fluoresce primarily in the lower ranges ofFS (<256), while the M5114 samples also fluoresce in FS ranges from 256 to 1023 47
PAGE 58
4.2 Graphical Analyses Recall that the mainstream graphical treatments of the data collected from a single sample are twodimensional scattergrams and histograms as discussed in Chapter 2. The RAE allows the analyst to consider multiple samples simultaneously : The twodimensional scattergram might be able to support the simultaneous presentation of two or three samples, with different samples indicated by different colors. However, the more natural grouping of"all samples, this individual" usually contains 15 samples; the grouping all individuals, this sample type" usually contains 2025 samples. Consequently, a chart containing one histogram for each sample is abetter way to display more information in a compact space. In this work, these charts were created with a Java program incorporating JFreeChart (www.jfree.org ) libraries. One feature of this solution is that a SQL statement submitted to the database returned a result set which was then displayed graphically. No intermediate steps were required. Successive modifications of the SQL statement led to reasonably clean visualizations. Modifications included such steps as curve smoothing by creating wider bins for the histogran1 counts (e.g. round(fl2log*4/fs)/4, count(*)) and constraining the result sets to a range of relevant data activity. Table 411 shows a representative SQL statement. The expression round(fl2log*4/fs)/4 creates a smoother curve. The constraint, "having 48
PAGE 59
round(fl2log*4/fs)/4 between .5 and 10", limits the result set and thus the display to an appropriate range. Table 411. SQL Statement for Graphical Result select tagllsample typellreplicate, round( fl2log* 4 /fs )/ 4, count(*) from event where individual='BH 6' group by tagllsample_typellreplicate, round(fl2log*4/fs)/4 having round(fl2log*4/fs)/4 between .5 and 10 Figure 41 shows a family ofhistograms ofNormalized FL2 for all samples for a particular individual. Normalized FL2 Histograms: lndivkluel BH41 Nomelz..:l FL2 (JXI:35.410U51141 e OOD:38f12U51143 OIXJ38114NoS.i.O .. axnm72M51142 "GCIOJ8119Ho5tiii..O .o::ole117..,S.IJIO GDW&40etlotwQ I GOO.li2'71M51 1<41
PAGE 60
This technique shows the similarity or dissimilarity of the replicate samples. It also shows the similarities and the dissimilarities of different sample types. Note that the realtime display includes both color and interactivity. Data values are displayed when the mouse pointer hovers on a tick mark. Consequently, it is more informative than the screenshot of Figure 41. Figure 42 shows histograms ofNormalized FL2 for all M5114 samples for the individuals in Experiment 1. However, there are too many samples on this chart for it to be useful. Figure 43 shows the same information, only constrained to all individuals of the strain "C." 700 800 200 100 Normalized FL2 Histograms: Experiment 1 1.0 1..5 2.0 2.5 l.O 3.6 4.0 4.5 !i.5 6.0 6..5 7IJ 7.!5 1.0 1.5 9.0 8.5 10.0 Ncrmaltzed FL2 8L .... mo3824e e ULICDOJ8351 Cl50J038211 CHS(I)(J36301 .. CHHil03e298 "'Cl .... CJl038278 UH6al036381 f'H..8c:D038338 t BL 6B:I0062fi& 81+6
PAGE 61
lt+'M'iiMHM1111111111111'111 Normalized FL2 Histograms: Experiment 1 . I 700 I .,...., 83!10 300 """ 200 1 50 100 50 1.0 t.S 2 0 2.5 l.D 3.5 4.0 4 .5 5 0 5 .5 ti O 8..5 7.0 7.5 1.0 1.5 9.0 8.5 10. 0 NomwllzAMI FL2 I CH1Gl03625'& CL5Gl036281 C .... !5CXI03830 1 CL7())()36;zg1 .. CL8G)03821fi .. CL1())036 2 7 5 C H6(JJ()36J,(J81 Figure 43. M5114 Samples, Experiment 1, Strain C Figure 44 contains histograms of forward scatter for all M5114 samples of a particular individual. Again, it shows the similarity o f the replicate samples. It also shows that different processes have a fundamentally diff erent distribution of forward scatter. Upon first thought, one would not expect the sample type to influence the size of the cells. However, the CPCF process perforates the cell membrane, thereby significantly altering the forward scatter. 51
PAGE 62
1600 1400 1200 1000 c BOO 8 1500 400 200 0 Forward Scatter Histograms: Individual BL6 3 6 000315272NORMAL 9 10 11 1 2 13 14 15 16 17 11 111 FS 00038273NORMAI. CJI036ol1 1 C PCF 00036271 NORMAI.l Figure 44. Forward Scatter, M5114 Samples 4.3 Summary In summary, the Rich Analytical Environment supports both statistical and graphical anal y sis techniques. These techniques provide more information more quickly than does the traditional flow cytometry analytical process. This speed of response brings to mind another type of flow. In his book Flow: The Psychology of Optimal Experience, Czikszentmihalyi (1990) suggests that as human beings, our optimal experiences come when we are challenged. He calls these experiences "flow" and argues that flow is just as real as being hungry, just as concrete as walking into a wall. Two of the conditions required for experiencing flow are 52
PAGE 63
becoming immersed in an activity with a sense of immediacy and timely feedback (p.65) The data analysis task is fundamentally more satisfying when hypotheses can be posed, and relevant data can be retrieved, within seconds or minutes, as opposed to hours or days Additionally, the speed of retrieval supports rapid hypothesis refinement and extension. 53
PAGE 64
5. Conclusions and Future Work The work presented in this thesis demonstrates that a rich analytical environment can be created from flow cytometry data. This analytical environment allows the analyst to perform types of analysis that were not previously possible, and to gather more knowledge more quickly. The essential features of the environment are: an open architecture in which multiple analytical tools can be leveraged; the combination of flow cytometry data with experimental metadata, such as strain, diet, and sample type; the ability to derive new metrics from core data, such as the calculation of normalized fluorescence; the ability to perform analyses across experiments as opposed to on one or two samples at a time; and the ability to group multiple samples together to perform characteristicbased analysis, where such characteristics include diet, strain, and process. Furthermore, because the environment is so rich from an analytical perspective, the possibilities for future work are significant. 54
PAGE 65
Future work falls into five main categories: generalization of parsing algorithms, automated identification of cell groups or phenotypes, incorporating additional information into the RAE, additional analytical algorithms, and process automation. The code used to parse the FCS files in this work was written specifically for the data collected by this particular instrument. It is not a general implementation of a parsing algorithm for the FCS standard. In particular, the number and names of the data values collected are assumed known. Additionally, the byte order of the data section is also assumed known. A more general implementation would extract that data from the text section, and adapt the parsing of the data section to those values. Future work might also find the optional analysis section of the file to be valuable. Another aspect of generalization is extending the solution to support multiple or different relational databases. While the parsing algorithm writes the data to flat files, this solution used Oracle's bulk loading utility, SQLLOAD. Support for additional relational databases would require support for the bulk loading utilities associated with those databases. Additionally, the table and index creation syntax might need slight modifications for other databases. One of the challenges that face the biologist analyzing flow cytometry data is identifying the cell clusters, gates, or phenotypes. In traditional techniques, that identification is done manually. Automated clustering algorithms might prove to be of great value. Assuming that a viable algorithm could be found, the eventlevel data 55
PAGE 66
could be updated to include cluster name O.t:' number. Then, analysis could be performed on a particular cluster. That analysis could be either statistical or graphical, as discussed above. Additionally, that analysis could run across the standard crosssections (all samples, this individual; or all individuals, this sample type) while constraining on a particular cluster, such as "T cells" or "mature B cells." Another extension of the environment is to include additional information about the individuals. This information includes, but is not limited to, sex, weight, age, and weight of selected organs. Then, future analysis could incorporate these values. The analyst could research effect of age or weight on the flow cytometry results. The types of analysis discussed in Chapter 4, "Analytical Techniques in the Rich Analytical Environment," are but a starting point for the types of analysis that can be performed once the event data is stored in an open and accessible environment. Recall the various histograms presented above. Recall that in certain cases, one of the replicate samples would not follow the expected curve. Visual inspection suggests that the sample is an outlier, but a statistical goodnessoffit measure might be more appropriate. Numerous other statistically intensive analytical techniques can be easily supported in this environment. A final area for further work is process automation. As particular types of analyses show themselves to be valuable, the processes to perform these analyses can 56
PAGE 67
be automated. For example, the families of histograms could be automatically generated. In conclusion, the RAE empowers both the biologist and the analyst. Many types of analysis are possible when all of the data from an experiment is made available in an open environment. As biologists become more familiar with what they can accomplish with this environment, certain processes will become standard. Other processes will emerge as innovative and exciting. 57
PAGE 68
REFERENCES BD Biosciences. (1999). Introduction to Flow Cytometry: A Learning Guide. [Manual Part Number: 111103200]. Covas, D.T., De Lucena Angulo, 1., Vianna Bonini Palma, P., & Zago, M.A. (2004). Effects of hydroxyurea on the membrane of erythrocytes and platelets in sickle cell anemia. Haematologica, 89 ( 3), 237280 Retrieved March 25, 2004 from http : / / www .haematologica.org/journal/2004/3/pdf/890273. pdf Czikszentmihalyi, M (1990). Flow: The Psychology of Optimal Experience. New Y.ork: Harper & Row. Desbarats, J., Wade, T Wade, W. F & Newell ; M K. (1999). Dichotomy between na1ve and memory CD4+ T cell responses to Fas e ngagement. Proceedings of the National Academy of Sciences, 96(14), 81044109 R e trieved March 25,2004 from http: / /www. pubmedcentral.nih.gov/articlerender fcgi ?tool=pubmed&pubmed id=l0393955 Harold, F. ( 2001). The Way of the Cell : Molecules, Organisms and the Order of Life Oxford: Oxford University Press. Harris, R. (2004 March 31 ). Human diseases mirror e d in rat genome. All Things Considered [Radio Broadcast]. Washington, D C : National Public Radio Retrieved on April3, 2004 from http://www.npr.org/features/feature.php?wfld=1804676 Herzenberg, L. A & Herzenberg, L. A. (2004). Genetics, F ACS, immunology, and redox: a tale oftwo lives intertwined. Annual Review of Immunology, (22) 131. Retrieved on March 31, 2004 from http:/ /herzenberg.stanford.edu/Publications/Reprints/LAHSOO. pdf 58
PAGE 69
Huber S.A., Sakkinen P., David C., Newell M.K., & Tracy R.P. (2001). T helpercell phenotype regulates atherosclerosis in mice under conditions of mild hypercholesterolemia. Circulation, 103, 26102616. Retrieved March 25, 2004 from http://circ.ahajournals.org/cgi/reprint/1 03/211261 O.pdf International Society for Analytical Cytology (ISAC). Data File Standard for Flow Cytometry, Version FCS3.0. Retrieved March 25,2004 from http://www.isacnet.org/ Kelleher, K. (2004, April). The drug pipeline flows again. Business 2.0, 5(3), 5051. Kimball, R., Reeves, R., Ross, M., & Thomthwaite, W. (1998). The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses. New York: John Wiley & Sons, Inc. Kumar, L., Pivniouk, V., de la Fuente, M., Laouini, D., & Geha, R. (2002). Differential role ofSLP76 domains in T cell development and function. Proceedings of the National Academy of Sciences, 99(2), 884889. Retrieved April17, 2004 from http://www.pnas org/cgi/reprint/99/2/884.pdf Lee, J., Shin, J., Kim, E., Kang, H., Yim, I., Kim, J., et al. (2004). Immunomodulatory and antitumor effects in vivo by the cytoplasmic fraction of Lactobacillus casei and Bifidobacterium longum. Journal of Veterinary Science, 5(1), 4148. Retrieved March 25, 2004 from http://www.vetsci.org/2004/pdf/41.pdf Murphy, R. (1996). Flow Cytometry and Sorting. Retrieved on October 4, 2003 from http:/ /flowcyt.cyto.purdue.edu/flowcyt/educate/theory/fluioptilsldOO 1.htm Parks, D. R.(1996). Flow cytometry instrumentation and measurement. In L. Herzenberg, L. Herzenberg, C. Blackwell, & D. Weir (Eds.), The Handbook of Experimental Immunology (pp. 4 7.14 7 .12). Boston: Blackwell Science. Retrieved on March 31, 2004 from http:/ /herzenberg.stanford.edu/Publications/Reprints/LAH413. pdf 59
PAGE 70
Parks, D. R. & Bigos, M. (1996). Collection, display and analysis of flow cytometry data. In L. Herzenberg, L. Herzenberg, C Blackwell, & D. Weir (Eds.), The Handbook of Experimental Immunology (pp. 50.150.11 ). Boston: Blackwell Science. Retrieved on March 31, 2004 from http:/ /herzenberg.stanford.edu/Publications/Reprints/LAH414. pdf Schweitzer, S.C., Reding, A. M., Ford, C. A., VillalobsMenuey, E., Huber, S. A., & Newell, M. K. (2004). Exogeneous Fatty Acids Affect the Expression of MHC Class II in Macrophage Cell Lines. Unpublished manuscript, University of Colorado at Colorado Springs. 60
