Citation
Efficient plumbing methods with statistical applications

Material Information

Title:
Efficient plumbing methods with statistical applications
Creator:
Harburg, Aaron
Publication Date:
Language:
English
Physical Description:
73 leaves : illustrations ; 28 cm

Subjects

Subjects / Keywords:
Plumbing industry ( lcsh )
Plumbers -- Time management ( lcsh )
Industrial efficiency ( lcsh )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Bibliography:
Includes bibliographical references (leaf 73).
General Note:
Department of Mathematical and Statistical Sciences
Statement of Responsibility:
by Aaron Harburg.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
47815101 ( OCLC )
ocm47815101
Classification:
LD1190.L622 2001m .H37 ( lcc )

Full Text
EFFICIENT PLUMBING METHODS
WITH STATISTICAL APPLICATIONS
by
Aaron Harburg
B. A., University of Colorado at Boulder, 1986
B. A., Metropolitan State College of Denver, 1995
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Applied Mathematics
2001


This thesis for the Master of Science
degree by
Aaron Troy Harburg
has been approved
by
Karen Kafadar
Burton Simon
Date


Harburg, Aaron Troy (M. S. Applied Mathematics)
Efficient Plumbing Methods With Statistical Applications
Thesis directed by Dr. Ramalingam Shanmugam
ABSTRACT
In order to find out what factors improve the speed of running
pipe, this study uses a balanced ANOVA model to simultaneously look at
five different agents that might be important.
This abstract accurately represents the content of the candidate's thesis.
I recommend its publication.
Signed


ACKNOWLEDGMENT
Special thanks to:
Ramalingam Shanmugam, Burton Simon, Karen Kafadar, Diane
Bernhardt-Barton, Mark Harburg, Rudy Harburg, DeAnn Major, Gabriel
Major, Miriam Nations, Myint Shwe, Yusef Ulmanis, Scott Ulrich and
Steve Williamson.


CONTENTS
Figures.................................................vii
Tables.................................................. ix
CHAPTER
1. INTRODUCTION...................................... 1
Model Assumptions............................ 3
Factors...................................... 4
Plumbing Experience ......................... 5
Organization of Fittings..................... 6
Rigid vs. Flexible Pipe..................... 7
Large Holes vs. Small ....................... 7
Straight vs. Crooked Path ................... 8
2. PILOT STUDY ......................................10
Significance of Factors......................15
Normal Residuals ............................15
Equality of Variance.........................16
Time Series ............................... 26
Looking Forward to Phase 2: are 64 Data
Enough?................................32
Estimates of Improvement Percentages.........35


3. 5 FACTOR FACTORIAL ANALYSIS .............38
Equality of Variance ...............38
Normal Residuals....................52
Time Series.........................52
Significant Effects ................55
Treatment Mean......................59
4. FULL SCALE MODEL..............................63
5. CONCLUSION ...................................67
APPENDIX
A Phase 1 Data ................................... 69
B Phase 2 Data ....................................71
BIBLIOGRAPHY ............................................73


FIGURES
Figure
1.1 Existing Model................................................. 2
1.2 Better Model .................................................. 5
1.3 Tolerance...................................................... 9
2.1 Q-Q Time ..................................................... 17
2.2 Q-Q Log ...................................................... 18
2.3 Time Variance ................................................ 20
2.4 Log Variance ................................................ 20
2.5 Time Subject ................................................. 22
2.6 Log Subject .................................................. 23
2.7 Time Organization ............................................ 24
2.8 Log Organization ............................................. 25
2.9 Time Series .................................................. 26
2.10 Rigid Time Series............................................ 29
2.11 Flexible Time Series ....................................... 30
3.1 Organization Variance......................................... 43
3.2 Log-Log Organization.......................................... 45
3.3 Log-Log Experience ........................................... 46
3.4 Log-Log Flexibility ......................................... 47
3.5 Log-Log Path................................................. 48
3.6 Log Holes ..................................................... 49
3.7 Log-Log Holes ................................................. 50
3.8 Phase 2 Q-Q.................................................... 51
3.9 Rigid Time Series.............................................. 53
3.10 Flexible Time Series ......................................... 56
3.11 Flex-Path Interaction......................................... 57


TABLES
Table
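2.1 Subject 0 by Variable .................................... 11
2.2 Subject 0 in Order ....................................... 12
2.3 Subject 1 by Variable .................................... 12
2.4 Subject 1 in Sequence .................................... 13
2.5 Subject 2 by Variable .................................... 13
2.6 Subject 2 in Sequence .................................... 14
2.7 Phase 1 ANOVA ............................................ 14
2.8 Effects .................................................. 16
2.9 Phase 1 Log Anova ........................................ 31
2.10 Means ................................................... 37
3.1 Subjects 0, 1 by Variable ................................ 39
3.2 Subjects 0, 1 in Sequence ................................ 39
3.3 Subjects 2, 3 by Variable ................................ 40
3.4 Subjects 2, 3 in Sequence ................................ 40
3.5 Subjects 4, 5 by Variable ................................ 41
3.6 Subjects 4, 5 in Sequence ................................ 41
3.7 Subjects 6, 7 by Variable ................................ 42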


3.8 Subjects 6, 7 in Sequence................................... 42
3.9 Phase 2 Time Series ........................................ 52
3.10 Phase 2 ANOVA............................................... 56
3.11 Phase 2 Means .............................................. 57
3.12 Confidence Intervals........................................ 61
3.13 Simple Confidence Intervals................................. 62
3.14 Complex Confidence Intervals.............................. 62
4.1 Phase 3 Data ............................................... 63
4.2 Phase 3 ANOVA............................................... 66


CHAPTER 1
INTRODUCTION
Time is money. To a working plumber this is very clear, because
some projects are very profitable while others lose money, and the
obvious difference is time. Experienced plumbers are expensive to
employ. If a plumber gets bogged down and works slowly on a job, it
doesn't take long to eat up all the profits. On the other hand,
plumbers also charge a lot of money, so a time-efficient plumbing
crew can be a huge cash cow.
Among experienced plumbers I've known, much disagreement
exists about what methods are the most time-efficient, and which
factors are the most important. The purpose of this study is to use the
tools of statistics to sort among many possible predictors of relative
efficiency in order to determine which are significant, which are the
most important, and how they interact with each other.
To penetrate this mystery I built model construction projects
similar to a plumber's work. In my first 2 experiments, I built
wooden mazes and had the test subjects run pipe from one point in the
maze to the other. In the third, I built a wall on the correct scale for a
bathroom and ran the water pipe.

Figure 1.1. Existing Model
The first model is cheaper to run and more portable. I used it
to cast a broad net and get a hint about many potential variables.
However, this model is on a much smaller scale than any real plumbing
work. In the third phase, I collected fewer data, but I had a much
better idea what I was looking for. I wanted to confirm on a large
scale what the small scale results had suggested.
Phase 1 was a pilot study where I looked at the effects of
organizing one's fittings, the difference between flexible tubing and
rigid pipe, and the differences among 3 different test subjects.
In Phase 2, I collected more data and looked at 5 different
factors.
In Phase 3, I only looked at the flexibility effect, but it was a
much better model.
The factors under study are as follows:
Plumbing experience
Organization of materials
Rigid vs. flexible pipe
Large vs. small holes
Straight vs crooked path
Model Assumptions
Before collecting any data, I suspected that my response variable
should not be the time it takes to complete the maze, but the logarithm
of that time. This is because I expected the predictors to affect the
completion time multiplicatively, not additively. The fundamental
assumption of the standard ANOVA model is that a given datum is
equal to a grand mean plus means from various effects plus a normally
distributed residual.
Imagine for a moment that one could run flexible tubing twice
as fast as rigid pipe, and that this was true irrespective of all other
factors. Assume for the sake of argument that with rigid pipe,
maze 1 took on average 100 seconds to complete and
maze 2 required 200 seconds. Under our assumptions mazes 1
and 2 should take 50 and 100 seconds respectively with flexible tubing.
So with a large sample size, sample means should be similar to those in
figure 1.1. There appears to be interaction between the two variables, and
there is if you think of the factors as additive. But if you believe in
multiplicative effects, and you use the log-of-time as your response, your
graph now looks like figure 1.2. The effects have become additive
because of the identity ln(x) - ln(y) = ln(x/y).
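
The argument above can be checked numerically. The following is a minimal sketch in Python, not part of the original analysis; it uses only the four illustrative cell means from the thought experiment (100, 200, 50 and 100 seconds), and the function name interaction_contrast is a hypothetical helper. A 2 x 2 layout has no interaction when the "difference of differences" is zero.

```python
import math

# Illustrative cell means (seconds) from the thought experiment above.
means = {
    ("maze1", "rigid"): 100.0,
    ("maze2", "rigid"): 200.0,
    ("maze1", "flexible"): 50.0,
    ("maze2", "flexible"): 100.0,
}

def interaction_contrast(cell, transform=lambda t: t):
    """Difference of differences for a 2x2 table; zero means no interaction."""
    rigid_gap = transform(cell[("maze2", "rigid")]) - transform(cell[("maze1", "rigid")])
    flex_gap = transform(cell[("maze2", "flexible")]) - transform(cell[("maze1", "flexible")])
    return rigid_gap - flex_gap

raw_contrast = interaction_contrast(means)            # (200 - 100) - (100 - 50)
log_contrast = interaction_contrast(means, math.log)  # ln(2) - ln(2)
print(raw_contrast, round(log_contrast, 12))
```

On the raw scale the contrast is 50 seconds, which looks like interaction; on the log scale it vanishes up to floating-point error.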
Factors
For each of these factors, my intuition told me that it would
affect speed in a particular way. In this section I give some of my
reasons why. In each case, level 0 is the level I expected to go slowly
and level 1 is the level I thought would be faster.

Figure 1.2. Better Model (lines labeled Rigid and Flexible)
Plumbing experience
How much faster does plumbing experience make somebody?
Experienced plumbers cost more money, and in the current labor
market they're harder to find.
In phase 1, all three subjects were inexperienced. However,
I looked at the subject effect as a random factor to see if it was a great
source of variability. In phase 2, I replaced the subject effect with a
controlled experience factor with 2 levels. Those with plumbing
experience were at level 1 and those without were at level 0.
Organization of Fittings
I've long suspected that having your fittings sorted out makes
you faster. I would expect this to be true for 2 reasons. Firstly, you
can find things without rummaging. Every time you need a particular
elbow or length of pipe you can grab it from its allotted place if things
are sorted, or you can dig for it if they're not. This aspect is what's
being tested here.
However, there's a second reason for keeping your fittings
organized which is harder to test in a controlled study like this. If you
sort each kind of fitting into its own bin, you know what you're
running out of, so you know what to get more of. It can't possibly be
efficient to roll up your tools in the middle of the day and go buy 1 or
2 things that you ran out of but desperately need.
I'm not sure that having separate bins for every part you'll ever
need is always efficient. After all, fittings have to be sorted into these
bins. The labor cost of this has to be measured in order to completely
answer the question of whether organizing pays dividends. But I did
get some useful information.
For both phase 1 and phase 2, this is a controlled factor with 2
treatments. Level 1 is where the things the subject needs are presented
in a well-sorted tray. Level 0 is where the subject is given a large
bucket with all the fittings needed. However, this bucket also contains
a lot of useless hardware and it's all mixed up.
Rigid vs. Flexible Pipe
In half the trials of the pilot study and phase 2, subjects ran 3/8-inch
black iron pipe through a maze. In the rest, flexible nylon tubing was
used. The flex had an outside diameter of 5/8 inch, just like 3/8-inch
black iron. (3/8 refers to the inside diameter of the pipe.) The rigid
pipe is level 0. Flexible tubing is level 1. I expected flexible tubing
to be faster because it requires fewer fittings and it doesn't require
the precise measurements that rigid pipe does.
Large Holes vs. Small
I think that large holes give you more latitude to be imprecise.
The more perfect your decisions have to be, the slower you go. Neibel
claims that production costs decrease asymptotically with larger
tolerances as shown in figure 1.3. (Neibel p. 65) The diameter of a
hole which pipe must run through is a kind of tolerance, because it
limits the sizes of pipe that can be used to construct the line that
eventually goes through it. It's also more awkward to run pipe through
small holes. One may have more trouble getting the pipe into position.
Therefore, I assigned level 0 to small holes and 1 to large ones.
Straight vs. Crooked Path
Obviously making a path crooked is going to require rigid pipe
runners to use more fittings and thus more time. So the crooked path
is level 0 while a straight path is level 1. But how much slower? How
much does it slow down flexible tubing? A plumber doesn't always
have control over how straight his path can be. The building's design
and other utilities can get in the way. But if he knows his pipe is going
to have to run like spaghetti he can adjust his bid accordingly. I was
also interested in whether or not the other factors reacted the same way
to a straight path that they did to a crooked one.


Figure 1.3. Approximate relationship between cost and machining
tolerance (horizontal axis: tolerance in inches, plus or minus, from 0
to 0.020)


CHAPTER 2
PILOT STUDY
In this phase of the research 24 data from 3 subjects measure 3
effects.
A. volunteers (random effect)
B. organization of fittings
C. hard pipe vs. flexible hose.
The foremost goal of the pilot study was to work the kinks out
of the data gathering process. I also wanted to approximate the mean
square for error of the data, in order to guess how many observations I
would need to take in phase 2. Furthermore, I wanted to look for a
time series learning curve. Is the 8th observation from a given subject
likely to be faster than the 1st? There could also be a fatigue effect
making the 8th slower than the 7th. I wanted to be sure that any
time series effects would not distort my data, thus casting doubt on my
inferences. Toward this end, I randomized the order in which I
collected the data. The idea was to make data sheets like those in table
2.1.
The numbers in the column ord are generated by a random
number generator which samples without replacement. When actually
doing the experiment it would be more useful to organize the chart like
table 2.2. For each subject a different random order was chosen.
Originally the data sheets had 1s and 0s for treatment levels like
table 2.1. After timing the first subject (subject 0), I realized that it
would be easier to keep the treatments straight while administering the
trials if I wrote out the factor levels as is done in table 2.2.
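
The randomization described above can be sketched in a few lines of Python. This is an illustration only: the thesis does not say which random number generator was used, and the function name random_run_order and the seed value are assumptions made for the example.

```python
import random

def random_run_order(n_trials, seed=None):
    """Return the integers 0..n_trials-1 in random order (no repeats)."""
    rng = random.Random(seed)
    return rng.sample(range(n_trials), n_trials)

# ord_by_ref[ref] is the position in which treatment `ref` is run,
# like the "ord" column of a by-variable data sheet.
ord_by_ref = random_run_order(8, seed=2001)  # the seed is an arbitrary choice

# Re-sort the sheet into running order, like the in-sequence chart.
refs_in_sequence = sorted(range(8), key=lambda ref: ord_by_ref[ref])
for position, ref in enumerate(refs_in_sequence):
    print(position, ref)
```

Calling random_run_order once per subject, with a different seed or no seed, gives each subject a different order.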
PILOT

Subject 0
by variable

ord  ref  organ  flex  time
  6    0      0     0   214
  1    1      0     0   439
  2    2      0     1    56
  5    3      0     1    48
  4    4      1     0   186
  3    5      1     0   238
  7    6      1     1    43
  0    7      1     1    55
Table 2.1.


in sequence

ord  ref  organ  flex    time  min  sec
  0    7  org-1  flex-1    55    0   55
  1    1  0-dis  rig-0    439    7   19
  2    2  0-dis  flex-1    45    0   45
  3    5  org-1  rig-0    238    3   58
  4    4  org-1  rig-0    186    3    6
  5    3  0-dis  flex-1    48    0   48
  6    0  0-dis  rig-0    214    3   34
  7    6  org-1  flex-1    43    0   43
Table 2.2.

Subject 1
by variable

ord  ref  organ  flex  time
  0    0      0     0   612
  7    1      0     0   395
  6    2      0     1    53
  5    3      0     1    79
  4    4      1     0   311
  2    5      1     0   211
  3    6      1     1    51
  1    7      1     1    65
Table 2.3.


Subject 2
in sequence

ord  ref  organ  flex    time  min  sec
  0    7  org-1  flex-1    47    0   47
  1    0  dis-0  rig-0    568    9   28
  2    2  dis-0  flex-1    60    1    0
  3    1  dis-0  rig-0    380    6   20
  4    5  org-1  rig-0    352    5   52
  5    6  org-1  flex-1    65    1    5
  6    3  dis-0  flex-1   145    2   25
  7    4  org-1  rig-0    313    5   13
Table 2.6.
Table 2.7


Tables 2.1-6 also list the times in seconds that each trial took.
Significance of Factors
Table 2.7 is an ANOVA table computed in MINITAB. The
table is completely accurate up to the column F. Because the factor
subject is a random factor, the test statistics should be as in table 2.8.
(Neter pp. 1377-81)
In table 2.8, conventional notation is used where subject = A, org
= B and flex = C.
We can now see that all 3 main effects are significant and that no
interactions are. As I'll discuss below, the significance of organization
may be inflated because of time series effects.
Normal Residuals
Figure 2.1 is a normal probability plot. It plots residuals against
the expected standard normal values for a 24 data sample. For Figure
2.1, the analysis was done on time values. In Figure 2.2, we see the
same graph using log-of-time for a response variable. Clearly the log-
of-time analysis produces more normal residuals.


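Effect            Symbol  E(MS)                                                                                          F statistic  F value  p value
subject           A       bcn\sigma^2_{\alpha} + \sigma^2                                                                MSA/MSE      4.00     0.047
org               B       acn\sum\beta_j^2/(b-1) + cn\sigma^2_{\alpha\beta} + \sigma^2                                   MSB/MSAB     16.03    0.057
flex              C       abn\sum\gamma_k^2/(c-1) + bn\sigma^2_{\alpha\gamma} + \sigma^2                                 MSC/MSAC     4623.68  2.16 x 10^{-4}
subject*org       A*B     cn\sigma^2_{\alpha\beta} + \sigma^2                                                            MSAB/MSE     0.423    0.66
subject*flex      A*C     bn\sigma^2_{\alpha\gamma} + \sigma^2                                                           MSAC/MSE     0.0403   0.96
org*flex          B*C     an\sum\sum(\beta\gamma)_{jk}^2/((b-1)(c-1)) + n\sigma^2_{\alpha\beta\gamma} + \sigma^2         MSBC/MSABC   1.31     0.37
subject*org*flex  A*B*C   n\sigma^2_{\alpha\beta\gamma} + \sigma^2                                                       MSABC/MSE    0.800    0.47
error                     \sigma^2
Table 2.8.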
Equality of Variance
Fundamental to the standard ANOVA model is the assumption
that the residuals are not only normally distributed, but that they also
have the same variance for every treatment.
Figure 2.1. Q-Q Time (normal probability plot of the residuals, time as
response)


Figure 2.2. Q-Q Log (normal probability plot of the residuals,
log-of-time as response)


Figures 2.3 and 2.4 plot the residuals against flexibility. Figure 2.3
is done with time, Figure 2.4 with log-of-time. What these graphs show
us is that for this factor, time would seem to give unequal variances,
while log-of-time would not.
The same plots with respect to test subjects are shown in Figures
2.5 and 2.6. Here there doesn't seem to be highly unequal variance
either way.
When the graphs are made with respect to organization, though,
one gets an apparent difference in variance regardless of response
variable. This can be seen in Figures 2.7 (time) and 2.8 (log of time).
This could be coincidence or it could be that the variances are
significantly different.
If it is significant, it makes the data harder to analyze. But if
variances are truly different, that information is not without value. It
suggests that organizing one's parts not only makes one faster, but
makes completion time more predictable. The smaller variance is with
factor level 1: organized fittings. In running a plumbing business,
steady income is preferable to feast and famine. After all, several
famine months in a row can result in bankruptcy.
Figure 2.3. Time Variance (residuals against flexibility, time as
response)
Figure 2.4. Log Variance (residuals against flexibility, log-of-time as
response)
Figure 2.5. Time Subject (residuals against subject, time as response)
Figure 2.6. Log Subject (residuals against subject, log-of-time as
response)
Figure 2.7. Time Organization (residuals against organization, time as
response)
Figure 2.8. Log Organization (residuals against organization,
log-of-time as response)

For the sake of analyzing the pilot data, I did a Hartley test. A
balanced ANOVA is somewhat robust against unequal variance. The
idea of a Hartley test is to look at the ratio of the largest sample
variance to the smallest among the factor levels. If the ratio is not
significantly large, one falls back on this robustness and does a
standard ANOVA. (Neter p. 764-6, 1189)
    H = \frac{0.0823}{0.0204} = 4.02
    r = 2
    d.f. = 11
    H_a: \sigma_0^2 \neq \sigma_1^2
    reject H_0 if H > H(0.99; 2, 11) = 5.38
But 4.02 < 5.38, so conclude H_0.
If I had used \alpha = 0.05, the critical value would have been 3.5 and
I would be forced to conclude H_a. So while the variances probably are
different, they aren't so different as to warrant concern for the
analysis of this data set.
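
The decision rule above is simple enough to sketch in code. The following Python fragment is an illustration, not the computation done for the thesis: the two variances and both critical values are copied from the text, and the function name hartley_statistic is a hypothetical helper.

```python
def hartley_statistic(variances):
    """Ratio of the largest to the smallest sample variance."""
    return max(variances) / min(variances)

# Residual variances for the two organization levels, as quoted above.
h = hartley_statistic([0.0823, 0.0204])

critical_99 = 5.38  # H(0.99; 2, 11), quoted in the text
critical_95 = 3.5   # critical value for alpha = 0.05, quoted in the text

print(round(h, 2))
print("reject equal variances at 0.01:", h > critical_99)
print("reject equal variances at 0.05:", h > critical_95)
```

With the rounded variances the ratio comes out near 4.03 rather than the 4.02 reported, presumably because the thesis worked from unrounded values; the conclusion at either level is unchanged.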
At this point I was faced with the possibility that with more data
in phase 2, this same test might become significant. However, since I
seemed to have normal residuals, I knew I could always try another
transformation or use weighted least squares if necessary. (Neter
p. 768)
Time Series
As I indicated earlier, one pitfall of this study was the risk that
the speed with which test subjects worked could change over the course
of their session. Figure 2.9 plots residuals against their order of
execution. There appears to be a general downward trend. This trend
is even more obvious in Figure 2.10. Figure 2.10 is the same graph,
but only the trials with rigid pipe are represented. By contrast, Figure
2.11 shows only trials with flexible tubing. This suggests that not only is
flexible tubing faster to run, it's easier to learn. Indeed, one seems to
reach optimum speed immediately.

Figure 2.9. Time Series (residuals against order of execution)
One potential problem from this would be to inflate the MSE. If
the goal is to study the long term productivity of full time plumbers
then the early trials are abnormally large, and the range of response
values is wider than necessary. So for phase 2, I had each test subject
do 2 practice runs with rigid pipe.
Notice from the data that for each test subject, the first trial with
rigid pipe is that subject's slowest time. By coincidence,
in each of these data, the fittings are disorganized. These assignments
were made randomly. However, with only 3 test subjects, there is a
0.25 probability that they would all be the same factor level, and a
0.125 probability that they would all be disorganized, which is the level
I had predicted to be slower. This casts doubt on the organization
effect that seems to be present in the data.

Figure 2.10. Rigid Time Series (residuals against order of execution,
rigid pipe trials only)
Figure 2.11. Flexible Time Series (residuals against order of execution,
flexible tubing trials only)

MTB > Regress 'subrsp' 4 'subj0' 'subj1' 'suborg' 'subflex'

The regression equation is
subrsp = 5.97 - 0.448 subj0 - 0.161 subj1 - 0.197 suborg - 1.58 subflex

Predictor   Coef      Stdev    t-ratio   p
Constant     5.9660   0.1315    45.37    0.000
subj0       -0.4479   0.1293    -3.46    0.003
subj1       -0.1613   0.1293    -1.25    0.230
suborg      -0.1970   0.1082    -1.82    0.087
subflex     -1.5821   0.1082   -14.62    0.000

s = 0.2420   R-sq = 93.4%   R-sq(adj) = 91.8%

Analysis of Variance
SOURCE       DF   SS        MS
Regression    4   13.2587   3.3147
Error        16    0.9369   0.0586
Total        20   14.1956

SOURCE    DF   SEQ SS
subj0      1    0.6294
subj1      1    0.0911
suborg     1    0.0229
subflex    1   12.5154
Table 2.9

Table 2.9 shows the results
of a regression analysis done on MINITAB. The 3 dubious data are
discarded; org and flex become suborg and subflex. With just 2 factor
levels, they can be treated as regression predictors. Subj0 is 1 if the
datum is from test subject 0 and 0 otherwise. Subj1 is 1 for subject 1
and 0 otherwise. Interactions aren't considered. With this analysis, the
p-value for organization goes to 0.087, up from 0.023. It's a different
kind of analysis with fewer data, but I still believe that the
organization effect is weaker than it seems. Below we'll see that
removing the 3 data lowers the improvement percentage for organization.
After making data charts for phase 2, I looked at the first official
rigid pipe datum for each of the 8 test subjects. Of these, 5 are
organized, 5 use large holes, 4 use a straight path and of course, 4 are
experienced plumbers.
I don't expect any great bias from time series effects in phase 2.
There should be less confounding with the initial rigid trials more
balanced. Doing 2 practice runs should also even out the data some.
Looking forward to phase 2: are 64 data enough?
As I said earlier, one reason for doing a pilot study was to be
able to decide how much data to collect in phase 2. Now phase 2
looked at 5 factors with 2 levels each. I wanted to be able to make
inferences about all my interactions: 2, 3, 4 and 5 way, so I needed at
least 2 repetitions of each treatment. I needed to collect at least
2^6 = 64 data. If necessary I could have collected more.
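
The count of 64 follows directly from the design, as this short Python sketch shows. The factor names are the column labels used in the phase 2 data sheets; the variable names are otherwise arbitrary.

```python
from itertools import product

factors = ["experience", "organ", "flex", "holes", "path"]

# Every combination of 2 levels (0 or 1) across the 5 factors.
treatments = list(product((0, 1), repeat=len(factors)))

replicates = 2  # at least 2 data per treatment to estimate all interactions
n_observations = len(treatments) * replicates

print(len(treatments), n_observations)
```

There are 2^5 = 32 treatments, and 2 replicates of each give 2^6 = 64 data.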
The statistics I'm most interested in are the percentages of time
saved by using the faster methods. Specifically, I want to be sure that
when changing factor levels improves the true mean by a modest
amount, it will manifest itself as a significant factor.
Because I'm doing my analysis on the logs of the times rather
than the times themselves, I want to start out looking at the differences
in treatment means rather than their ratios.
So suppose that in phase 2 I'm evaluating a factor which has
mean \mu_0 for the slow method and \mu_1 for the faster method. Again, I'm
in the logarithmic domain here. I knew that after evaluating my data
for phase 2, I would have statistics \bar{x}_0 and \bar{x}_1 which are
unbiased point estimates of \mu_0 and \mu_1. At this point I want to
consider U = \mu_0 - \mu_1 with \hat{U} = \bar{x}_0 - \bar{x}_1.

    \frac{\hat{U} - U}{\sqrt{MSE \cdot \frac{2}{n/2}}}

has a t distribution on the same number of degrees of
freedom as the MSE, 32 in this case. (Neter p. 718-9) I didn't yet
know MSE for phase 2, so I used my MSE from phase 1. I felt that
with more explanatory variables in phase 2, MSE would tend to be
smaller. I also expected less general variance because, as I said in the
discussion of time series effects, by doing 2 practice runs the times
should be more consistent.
I want to do a lower tail test because I'm not worried about \hat{U}
being larger than U. Only if \hat{U} is too small can we have an important
improvement not show up as significant.

    t(0.9; 32) = 1.694
    \sqrt{MSE \cdot \frac{2}{n/2}} = \sqrt{0.0942 \cdot \frac{2}{64/2}} = 0.0767

So with 90% confidence, U - \hat{U} \le 0.0767 \cdot 1.694 = 0.1299. This
means that if \hat{U} > 0.1299 it will be found to be significant. Further, if
U > 2 \cdot 0.1299 = 0.260, we can be 90% confident that the parameter will
be found to be significant.
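
The arithmetic behind the 0.1299 and 0.260 thresholds can be reproduced with a short Python sketch. All inputs are taken from the text as given (the phase 1 MSE, n = 64 and the tabled t value of 1.694); the variable names are illustrative only.

```python
import math

mse = 0.0942    # mean square for error from phase 1
n = 64          # planned number of data in phase 2
t_crit = 1.694  # t value quoted in the text for 32 degrees of freedom

# Each factor level mean is based on n/2 data, so the variance of the
# difference of the two level means is MSE * (1/(n/2) + 1/(n/2)).
std_err = math.sqrt(mse * 2 / (n / 2))

margin = std_err * t_crit  # smallest estimate found significant
detectable = 2 * margin    # true difference reliably detected

print(round(std_err, 4), round(margin, 4), round(detectable, 3))
```

The standard error rounds to 0.0767 and the two thresholds to about 0.130 and 0.260, matching the values above.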
This is just the difference of the transformed values. So far all
our analysis has been done in log-of-time mode. Let T = time of
completion, so that log(T) = \mu. If factor level 1 shaves percentage P
off of level 0, then T_1 = T_0(1-P).

    \log(T_1) = \log(T_0(1-P))
    \mu_1 = \log(T_0(1-P))
    \mu_1 = \mu_0 + \log(1-P)
    \mu_0 - \mu_1 = -\log(1-P)
    U = -\log(1-P)
    1-P = e^{-U}
    P = 1 - e^{-U}

If U = 0.260, P = 0.23. That is, whenever changing treatment
levels improves speed by 23% or better, we can reasonably expect a
significant statistic.

Estimates of improvement percentages
Table 2.10 shows means from the MINITAB ANOVA analysis.
For flexibility, \hat{U} = 5.7940 - 4.0823 = 1.7117. Therefore, a 90%
confidence interval for U is

    1.7117 \pm \sqrt{0.0942 \cdot \frac{2}{12}} \cdot 1.782
    = 1.7117 \pm 0.2233 = \{1.488, 1.935\}

Since P = 1 - e^{-U}, our confidence interval for P is from 77% to 85%
with 81% being the point estimate.
If we proceed in the same manner for organization we get

    0.3265 \pm 0.2233 = \{0.103, 0.5498\}

This would imply that P is between 10% and 42% with a point
estimate of 27%. But as we said earlier, the organization effect might
be inflated by the time series effects. Table 2.9, the regression analysis
I did when I threw out the 3 points I wasn't sure about, gives the slope
for the organization effect as -0.197. Since the interval from 0 to 1 is 1
unit wide, changing from 0 to 1 decreases the response variable by
0.197. Therefore \hat{U} = 0.197. Using the same standard error as before,
P now falls in a confidence interval of -3% to 34% with 18% most
likely. That is, we're no longer sure that there even is an organization
effect.
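
The conversion from a log-scale difference U to a percentage saved, and the interval arithmetic above, can be sketched in Python. This is an illustration under the text's own inputs (MSE = 0.0942, 12 data per factor level, t value 1.782); the function names percent_saved and interval are hypothetical helpers.

```python
import math

def percent_saved(u):
    """Fraction of time saved, P = 1 - exp(-U), for a log-scale difference U."""
    return 1.0 - math.exp(-u)

def interval(u_hat, mse, n_per_level, t_crit):
    """Interval u_hat +/- t_crit * sqrt(MSE * 2 / n_per_level)."""
    half_width = t_crit * math.sqrt(mse * 2 / n_per_level)
    return u_hat - half_width, u_hat + half_width

mse, n_per_level, t_crit = 0.0942, 12, 1.782  # values quoted in the text

for name, u_hat in [("flexibility", 1.7117),
                    ("organization", 0.3265),
                    ("organization, 3 data removed", 0.197)]:
    low, high = interval(u_hat, mse, n_per_level, t_crit)
    print(name, round(low, 3), round(high, 3),
          round(percent_saved(low), 3), round(percent_saved(high), 3))
```

A negative lower limit for U maps to a negative P, which is how the last interval comes to include zero improvement.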


MEANS

subject   N   rsp
0         8   4.6994
1         8   4.9917
2         8   5.1235

org       N   rsp
0        12   5.1014
1        12   4.7749

flex      N   rsp
0        12   5.7940
1        12   4.0823
Table 2.10


CHAPTER 3
5 FACTOR FACTORIAL ANALYSIS
In this part of the study, I collected 64 data, so that I could
study 5 factors at 2 levels each with 2 data per treatment. Each of 8
subjects was timed on 8 treatments. 4 subjects were experienced
plumbers, and 4 were not.
As in phase 1, the order in which the treatments were
administered was randomized. However, this time there were 16
treatments that a given subject could be given. The first factor was
plumbing experience, and so depended only on who the subject was.
To get the right assignments I paired subjects up and assigned the 16
treatments randomly by sampling the integers from 0 through 15
without replacement. Each of these numbers corresponds to a treatment.
The first 8 treatments were carried out by the first subject, the last 8
by the second. See tables 3.1-8.
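
The pairing scheme can be sketched as follows. This Python fragment is an illustration, not the procedure actually run for the thesis: the seed and function names are assumptions, and the bit layout in decode_treatment is inferred from the ref column of the data sheets (for example, ref 5 is disorganized, flexible, small holes, straight path).

```python
import random

FACTORS = ("organ", "flex", "holes", "path")

def decode_treatment(ref):
    """Map ref 0..15 to factor levels, reading ref as a 4-bit number."""
    bits = [(ref >> shift) & 1 for shift in (3, 2, 1, 0)]
    return dict(zip(FACTORS, bits))

def assign_pair(seed=None):
    """Shuffle the 16 treatments and split them between two paired subjects."""
    rng = random.Random(seed)
    order = rng.sample(range(16), 16)  # sampling without replacement
    return order[:8], order[8:]

first_subject, second_subject = assign_pair(seed=2001)  # arbitrary seed
for ref in first_subject:
    print(ref, decode_treatment(ref))
```

Because the two subjects in a pair share an experience level, each pair covers all 16 combinations of the remaining four factors exactly once.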
Equality of Variance
Figure 3.1 shows residuals plotted against organization. To


Subjects 0, 1
By Variable

ord  ref  experience  organ     flex     holes    path        time
 14    0  0-inexp     0-disorg  0-rigid  0-small  0-crooked    711
 13    1  0-inexp     0-disorg  0-rigid  0-small  1-straight   263
 15    2  0-inexp     0-disorg  0-rigid  1-large  0-crooked    719
  5    3  0-inexp     0-disorg  0-rigid  1-large  1-straight   105
  7    4  0-inexp     0-disorg  1-flex   0-small  0-crooked     86
  4    5  0-inexp     0-disorg  1-flex   0-small  1-straight    55
  8    6  0-inexp     0-disorg  1-flex   1-large  0-crooked    138
  3    7  0-inexp     0-disorg  1-flex   1-large  1-straight    43
  6    8  0-inexp     1-org     0-rigid  0-small  0-crooked    474
 11    9  0-inexp     1-org     0-rigid  0-small  1-straight    94
  2   10  0-inexp     1-org     0-rigid  1-large  0-crooked    454
 10   11  0-inexp     1-org     0-rigid  1-large  1-straight    80
  1   12  0-inexp     1-org     1-flex   0-small  0-crooked     58
  0   13  0-inexp     1-org     1-flex   0-small  1-straight    52
 12   14  0-inexp     1-org     1-flex   1-large  0-crooked     63
  9   15  0-inexp     1-org     1-flex   1-large  1-straight    67
Table 3.1.

Subjects 0, 1
In Sequence

ord  ref  experience  organ     flex     holes    path        time
  0   13  0-inexp     1-org     1-flex   0-small  1-straight    52
  1   12  0-inexp     1-org     1-flex   0-small  0-crooked     58
  2   10  0-inexp     1-org     0-rigid  1-large  0-crooked    454
  3    7  0-inexp     0-disorg  1-flex   1-large  1-straight    43
  4    5  0-inexp     0-disorg  1-flex   0-small  1-straight    55
  5    3  0-inexp     0-disorg  0-rigid  1-large  1-straight   105
  6    8  0-inexp     1-org     0-rigid  0-small  0-crooked    474
  7    4  0-inexp     0-disorg  1-flex   0-small  0-crooked     86
  8    6  0-inexp     0-disorg  1-flex   1-large  0-crooked    138
  9   15  0-inexp     1-org     1-flex   1-large  1-straight    67
 10   11  0-inexp     1-org     0-rigid  1-large  1-straight    80
 11    9  0-inexp     1-org     0-rigid  0-small  1-straight    94
 12   14  0-inexp     1-org     1-flex   1-large  0-crooked     63
 13    1  0-inexp     0-disorg  0-rigid  0-small  1-straight   263
 14    0  0-inexp     0-disorg  0-rigid  0-small  0-crooked    711
 15    2  0-inexp     0-disorg  0-rigid  1-large  0-crooked    719
Table 3.2.


Subjects 2, 3
By Variable

ord  ref  experience  organ     flex     holes    path        time
 15    0  0-inexp     0-disorg  0-rigid  0-small  0-crooked    353
  3    1  0-inexp     0-disorg  0-rigid  0-small  1-straight   172
  1    2  0-inexp     0-disorg  0-rigid  1-large  0-crooked    342
  2    3  0-inexp     0-disorg  0-rigid  1-large  1-straight   121
 12    4  0-inexp     0-disorg  1-flex   0-small  0-crooked     89
  0    5  0-inexp     0-disorg  1-flex   0-small  1-straight    59
  4    6  0-inexp     0-disorg  1-flex   1-large  0-crooked     44
 14    7  0-inexp     0-disorg  1-flex   1-large  1-straight    42
 10    8  0-inexp     1-org     0-rigid  0-small  0-crooked    376
  9    9  0-inexp     1-org     0-rigid  0-small  1-straight    60
 11   10  0-inexp     1-org     0-rigid  1-large  0-crooked    423
  5   11  0-inexp     1-org     0-rigid  1-large  1-straight    64
  8   12  0-inexp     1-org     1-flex   0-small  0-crooked     55
  6   13  0-inexp     1-org     1-flex   0-small  1-straight    40
  7   14  0-inexp     1-org     1-flex   1-large  0-crooked     43
 13   15  0-inexp     1-org     1-flex   1-large  1-straight    51
Table 3.3.

Subjects 2, 3
In Sequence

ord  ref  experience  organ     flex     holes    path        time
  0    5  0-inexp     0-disorg  1-flex   0-small  1-straight    59
  1    2  0-inexp     0-disorg  0-rigid  1-large  0-crooked    342
  2    3  0-inexp     0-disorg  0-rigid  1-large  1-straight   121
  3    1  0-inexp     0-disorg  0-rigid  0-small  1-straight   172
  4    6  0-inexp     0-disorg  1-flex   1-large  0-crooked     44
  5   11  0-inexp     1-org     0-rigid  1-large  1-straight    64
  6   13  0-inexp     1-org     1-flex   0-small  1-straight    40
  7   14  0-inexp     1-org     1-flex   1-large  0-crooked     43
  8   12  0-inexp     1-org     1-flex   0-small  0-crooked     55
  9    9  0-inexp     1-org     0-rigid  0-small  1-straight    60
 10    8  0-inexp     1-org     0-rigid  0-small  0-crooked    376
 11   10  0-inexp     1-org     0-rigid  1-large  0-crooked    423
 12    4  0-inexp     0-disorg  1-flex   0-small  0-crooked     89
 13   15  0-inexp     1-org     1-flex   1-large  1-straight    51
 14    7  0-inexp     0-disorg  1-flex   1-large  1-straight    42
 15    0  0-inexp     0-disorg  0-rigid  0-small  0-crooked    353
Table 3.4.


By Variable
Subjects 4, 5
ord ref experience organ flex holes path time
15 0 1-exper 0-disorg 0-rigid 0-small 0-crooked 504
7 1 1-exper 0-disorg 0-rigid 0-small 1-straight 58
13 2 1-exper 0-disorg 0-rigid 1-large 0-crooked 385
8 3 1-exper 0-disorg 0-rigid 1-large 1-straight 257
6 4 1-exper 0-disorg 1-flex 0-small 0-crooked 46
10 5 1-exper 0-disorg 1-flex 0-small 1-straight 71
14 6 1-exper 0-disorg 1-flex 1-large 0-crooked 37
0 7 1-exper 0-disorg 1-flex 1-large 1-straight 38
1 8 1-exper 1-org 0-rigid 0-small 0-crooked 292
11 9 1-exper 1-org 0-rigid 0-small 1-straight 102
9 10 1-exper 1-org 0-rigid 1-large 0-crooked 328
2 11 1-exper 1-org 0-rigid 1-large 1-straight 39
5 12 1-exper 1-org 1-flex 0-small 0-crooked 32
4 13 1-exper 1-org 1-flex 0-small 1-straight 32
3 14 1-exper 1-org 1-flex 1-large 0-crooked 30
12 15 1-exper 1-org 1-flex 1-large 1-straight 26
In Sequence
ord ref experience organ flex holes path time
0 7 1-exper 0-disorg 1-flex 1-large 1-straight 38
1 8 1-exper 1-org 0-rigid 0-small 0-crooked 292
2 11 1-exper 1-org 0-rigid 1-large 1-straight 39
3 14 1-exper 1-org 1-flex 1-large 0-crooked 30
4 13 1-exper 1-org 1-flex 0-small 1-straight 32
5 12 1-exper 1-org 1-flex 0-small 0-crooked 32
6 4 1-exper 0-disorg 1-flex 0-small 0-crooked 46
7 1 1-exper 0-disorg 0-rigid 0-small 1-straight 58
8 3 1-exper 0-disorg 0-rigid 1-large 1-straight 257
9 10 1-exper 1-org 0-rigid 1-large 0-crooked 328
10 5 1-exper 0-disorg 1-flex 0-small 1-straight 71
11 9 1-exper 1-org 0-rigid 0-small 1-straight 102
12 15 1-exper 1-org 1-flex 1-large 1-straight 26
13 2 1-exper 0-disorg 0-rigid 1-large 0-crooked 385
14 6 1-exper 0-disorg 1-flex 1-large 0-crooked 37
15 0 1-exper 0-disorg 0-rigid 0-small 0-crooked 504


By Variable
Subjects 6, 7
ord ref experience organ flex holes path time
9 0 1-exper 0-disorg 0-rigid 0-small 0-crooked 442
1 1 1-exper 0-disorg 0-rigid 0-small 1-straight 98
3 2 1-exper 0-disorg 0-rigid 1-large 0-crooked 324
8 3 1-exper 0-disorg 0-rigid 1-large 1-straight 112
12 4 1-exper 0-disorg 1-flex 0-small 0-crooked 56
13 5 1-exper 0-disorg 1-flex 0-small 1-straight 47
14 6 1-exper 0-disorg 1-flex 1-large 0-crooked 52
4 7 1-exper 0-disorg 1-flex 1-large 1-straight 40
11 8 1-exper 1-org 0-rigid 0-small 0-crooked 467
10 9 1-exper 1-org 0-rigid 0-small 1-straight 59
5 10 1-exper 1-org 0-rigid 1-large 0-crooked 213
7 11 1-exper 1-org 0-rigid 1-large 1-straight 56
6 12 1-exper 1-org 1-flex 0-small 0-crooked 44
0 13 1-exper 1-org 1-flex 0-small 1-straight 40
15 14 1-exper 1-org 1-flex 1-large 0-crooked 42
2 15 1-exper 1-org 1-flex 1-large 1-straight 29
In Sequence
0 13 1-exper 1-org 1-flex 0-small 1-straight 40
1 1 1-exper 0-disorg 0-rigid 0-small 1-straight 98
2 15 1-exper 1-org 1-flex 1-large 1-straight 29
3 2 1-exper 0-disorg 0-rigid 1-large 0-crooked 324
4 7 1-exper 0-disorg 1-flex 1-large 1-straight 40
5 10 1-exper 1-org 0-rigid 1-large 0-crooked 213
6 12 1-exper 1-org 1-flex 0-small 0-crooked 44
7 11 1-exper 1-org 0-rigid 1-large 1-straight 56
8 3 1-exper 0-disorg 0-rigid 1-large 1-straight 112
9 0 1-exper 0-disorg 0-rigid 0-small 0-crooked 442
10 9 1-exper 1-org 0-rigid 0-small 1-straight 59
11 8 1-exper 1-org 0-rigid 0-small 0-crooked 467
12 4 1-exper 0-disorg 1-flex 0-small 0-crooked 56
13 5 1-exper 0-disorg 1-flex 0-small 1-straight 47
14 6 1-exper 0-disorg 1-flex 1-large 0-crooked 52
15 14 1-exper 1-org 1-flex 1-large 0-crooked 42
Figure 3.1


produce the residuals, the logs of completion times are used for a
response variable just as in phase 1. The residuals appear to have less
variance for organized fittings than for disorganized fittings. Remember
that the same issue arose in the pilot study. As in the pilot study, we
want to do a Hartley test to make sure we can use a standard ANOVA model.
Our sample variances this time are 0.0277 for organized fittings and
0.0629 for disorganized. The test statistic is H = 2.27. We have 63 d.f.
For α = .05, H* = 1.67. If α = .01, H* = 1.96. That is, this time we
really do have a variance problem. On the one hand, this completely
confirms my earlier hypothesis that organizing reduces variance. However,
we can't proceed with the analysis until we fix the problem.
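The Hartley statistic quoted above is simply the larger sample variance divided by the smaller one. A minimal sketch in Python, assuming the two variances and the critical values H* are taken from the text as given (the function name `hartley_h` and the dictionary `H_CRIT` are illustrative, not part of the original analysis):

```python
def hartley_h(variances):
    """Hartley's statistic: largest sample variance over the smallest."""
    return max(variances) / min(variances)

# Sample variances of the log-time residuals quoted in the text.
var_org = 0.0277      # organized fittings
var_disorg = 0.0629   # disorganized fittings

# Critical values H* as quoted in the text (not recomputed here).
H_CRIT = {0.05: 1.67, 0.01: 1.96}

h = hartley_h([var_org, var_disorg])
unequal = {alpha: h > crit for alpha, crit in H_CRIT.items()}

print(f"H = {h:.2f}")
for alpha, flag in unequal.items():
    print(f"alpha = {alpha}: reject equal variances? {flag}")
```

The same function applies to any pair of residual groups; only the variances and the tabulated critical values change.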
So next I used the transformation Y' = log(log(Y)). Now the
sample variances are 0.0461 and 0.0385. H = 1.066 < 1.67. In the pilot
study, I used Y' = log(Y) for theoretical reasons; we were studying
multiplicative agents. Here I'm only using Y' = log(log(Y)) because it
makes the data analyzable.
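As a small illustration of the double-log transformation and its inverse, here is a sketch in Python. The function names are hypothetical, and the example time of 353 seconds is just one of the recorded completion times; the transformation is only defined for times greater than one second, since log(time) must be positive.

```python
import math

def to_newrsp(time_seconds):
    """Double-log response: log(log(time)). Requires time > 1 second."""
    return math.log(math.log(time_seconds))

def from_newrsp(value):
    """Inverse of the double-log response: exp(exp(value))."""
    return math.exp(math.exp(value))

t = 353.0                 # one recorded completion time, in seconds
y = to_newrsp(t)          # transformed response
t_back = from_newrsp(y)   # recovers the original time

print(f"time = {t}, newrsp = {y:.4f}, back-transformed = {t_back:.1f}")
```

Because both logarithms are increasing, the transformation preserves the ordering of the times while compressing the long trials far more than the short ones.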
Figure 3.2 shows the new residuals plotted against organization.
Figures 3.3-5 are similar plots for experience, flexibility and path
respectively. Figures 3.6-7 show that the new transformation drives the
residual variances farther apart in the case of hole size. However, the
same Hartley test gives sample variances of 0.00261 and 0.00158.
H = 1.643 < 1.67. So for this data set we don't have evidence of
unequal variance even at α = 0.05.
Figure 3.2
Figure 3.3 (axes: newresid, exper)
Figure 3.4
Figure 3.5 (axes: newresid, path)
Figure 3.6 (axis: holes)
Figure 3.7 (axes: newresid, holes)
Figure 3.8 (axis: nscores)
Normal Residuals
Figure 3.8 is a normal probability plot for the new residuals. As
we can see, there's a nice linear relationship.
Time Series
Figures 3.9 and 3.10 plot the residuals against the order in which
they were done. In Figure 3.9 the data are all rigid pipe trials. In
Figure 3.10 they're all flexible tubing trials. No time series effects
are apparent in either graph. Again, each test subject did 2 practice
runs with rigid pipe this time. This makes me glad that I did a pilot
study.
I did time the first practice run for the sake of comparison. In
each case I set it up as the same treatment as the last rigid trial.
Subject   Time 0   Time 1   P
0         618      474      23.30%
1         1066     719      32.55%
2         63       64       -1.59%
3         541      353      34.75%
4         83       58       30.12%
5         568      257      54.75%
6         75       56       25.33%
7         544      467      14.15%
Avg.      444.75   306      26.67%
Table 3.9
Figure 3.9
Figure 3.10
Therefore I can compute the improvement percentage associated with
an hour's practice.
The most unusual datum here was subject 2, who went slower
(-1.59%). At first I thought it might be that because he was involved
with the pilot study he had less to learn. However, subjects 0 and 1
improved by 23% and 33% respectively.
Table 3.9 shows the results. The average improvement
percentage is 26.67%, with s = 13.65. If we do a confidence interval
using a t-distribution on 7 d.f. with α = 0.1 we get a confidence
interval of (19.84%, 33.50%).
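The per-subject improvement percentages in Table 3.9 follow from P = (Time 0 - Time 1) / Time 0. A minimal sketch in Python, with the times copied from the table and illustrative variable names:

```python
# (Time 0, Time 1) in seconds for subjects 0-7, copied from Table 3.9.
times = [(618, 474), (1066, 719), (63, 64), (541, 353),
         (83, 58), (568, 257), (75, 56), (544, 467)]

def improvement(t0, t1):
    """Fractional speed-up from the first practice run to the last rigid trial."""
    return (t0 - t1) / t0

p = [improvement(t0, t1) for t0, t1 in times]
mean_p = sum(p) / len(p)

for subject, value in enumerate(p):
    print(f"subject {subject}: {100 * value:6.2f}%")
print(f"average: {100 * mean_p:.2f}%")
```

A negative value, as for subject 2, means the later trial was slower than the first practice run.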
Significant Effects
In deciding which effects I believe to be significant, I propose to
treat the main effects differently than the interactions. Early on, I gave
theoretical reasons for why I believed the main factors would affect a
plumber's speed. I also stated which factor level should be faster and
assigned it factor level 1.
Table 3.10 shows an ANOVA table using log of log for a response
variable. 4 of 5 main effects are significant beyond the computer's
ability to measure infinitesimal probability. Hole size has a p-value of


MTB > anova 'newrsp'=c5|c6|c7|c8|c9
Factor   Type    Levels   Values
exper    fixed   2        0  1
org      fixed   2        0  1
flex     fixed   2        0  1
holes    fixed   2        0  1
path     fixed   2        0  1
Analysis of Variance for newrsp
Source                      DF   SS         MS         F        P
exper                       1    0.086097   0.086097   21.16    0.000
org                         1    0.100613   0.100613   24.72    0.000
flex                        1    1.424667   1.424667   350.07   0.000
holes                       1    0.015018   0.015018   3.69     0.064
path                        1    0.434121   0.434121   106.67   0.000
exper*org                   1    0.002646   0.002646   0.65     0.426
exper*flex                  1    0.007958   0.007958   1.96     0.172
exper*holes                 1    0.000509   0.000509   0.13     0.726
exper*path                  1    0.000897   0.000897   0.22     0.642
org*flex                    1    0.002582   0.002582   0.63     0.432
org*holes                   1    0.000022   0.000022   0.01     0.942
org*path                    1    0.006866   0.006866   1.69     0.203
flex*holes                  1    0.000756   0.000756   0.19     0.669
flex*path                   1    0.232573   0.232573   57.15    0.000
holes*path                  1    0.000263   0.000263   0.06     0.801
exper*org*flex              1    0.002129   0.002129   0.52     0.475
exper*org*holes             1    0.017625   0.017625   4.33     0.046
exper*org*path              1    0.002989   0.002989   0.73     0.398
exper*flex*holes            1    0.003447   0.003447   0.85     0.364
exper*flex*path             1    0.003723   0.003723   0.91     0.346
exper*holes*path            1    0.001006   0.001006   0.25     0.622
org*flex*holes              1    0.008963   0.008963   2.20     0.148
org*flex*path               1    0.021781   0.021781   5.35     0.027
org*holes*path              1    0.000159   0.000159   0.04     0.844
flex*holes*path             1    0.000537   0.000537   0.13     0.719
exper*org*flex*holes        1    0.006497   0.006497   1.60     0.216
exper*org*flex*path         1    0.009456   0.009456   2.32     0.137
exper*org*holes*path        1    0.016529   0.016529   4.06     0.052
exper*flex*holes*path       1    0.013311   0.013311   3.27     0.080
org*flex*holes*path         1    0.005726   0.005726   1.41     0.244
exper*org*flex*holes*path   1    0.004452   0.004452   1.09     0.303
Error                       32   0.130228   0.004070
Total                       63   2.564145
MTB > nopaper
Table 3.10


MEANS
exper N newrsp
0 32 1.5367
1 32 1.4633
org N newrsp
0 32 1.5397
1 32 1.4604
flex N newrsp
0 32 1.6492
1 32 1.3508
holes N newrsp
0 32 1.5153
1 32 1.4847
path N newrsp
0 32 1.5824
1 32 1.4176
flex path N newrsp
0 0 16 1.7918
0 1 16 1.5066
1 0 16 1.3729
1 1 16 1.3287
Table 3.11


0.064. For all these effects, factor level mean 1 is smaller than factor
level mean 0. That is, they turned out like I predicted they would. So
I feel reasonably safe concluding that all my predictor variables affect
the speed with which pipe is run.
Figure 3.11
Interactions are a different story. I made no predictions about
them because I had none. Here I'm just fishing. And with 26
interactions, some of them could easily have small p-values by
coincidence. So I feel obliged to use a family test of significance.
Using the Kimball inequality, if I do 26 tests of significance and I want
a significance α for the whole family, then any given test should come in
at α' according to the equations:

$$
\begin{aligned}
\alpha &\le 1-(1-\alpha')^{26} && \text{(Neter p. 831)}\\
(1-\alpha')^{26} &\le 1-\alpha\\
26\log(1-\alpha') &\le \log(1-\alpha)\\
\log(1-\alpha') &\le \frac{\log(1-\alpha)}{26}\\
1-\alpha' &\le e^{\log(1-\alpha)/26}\\
\alpha' &\ge 1-e^{\log(1-\alpha)/26}
\end{aligned}
$$

Taking equality, if α = 0.1 then α' = 0.004. This doesn't mean that
exper*org*holes*path isn't a meaningful effect. Indeed, the relatively
small p-value (0.052) suggests that it might be. It's
just that I'm not ready to jump to conclusions when it could just be
coincidence.
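The per-test significance level can be checked numerically. The sketch below assumes the relationship α = 1 - (1 - α')^26 holds with equality and solves it for α'; the function name `per_test_alpha` is illustrative.

```python
import math

def per_test_alpha(family_alpha, n_tests):
    """Solve family_alpha = 1 - (1 - a)**n_tests for the per-test level a."""
    return 1.0 - math.exp(math.log(1.0 - family_alpha) / n_tests)

alpha_prime = per_test_alpha(0.1, 26)

# Plugging the result back in should recover the family level.
family_check = 1.0 - (1.0 - alpha_prime) ** 26

print(f"per-test alpha = {alpha_prime:.5f}")
print(f"family alpha recovered = {family_check:.3f}")
```

For comparison, the simpler Bonferroni division 0.1 / 26 gives about 0.0038, which is slightly stricter than the value obtained here.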
By that standard, flexibility interacts with the path taken
(p = 0.000). As Figure 3.11 shows, a crooked path slows down rigid pipe
more than it slows down flexible tubing. This makes sense because an
obstruction adds 4 elbows to the job of running rigid pipe. Flexible
tubing can just be bent around the obstruction.
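The family-wise screening can be sketched as a simple filter over the 26 interaction p-values from the ANOVA table. The threshold 0.004 is the per-test level quoted in the text; the dictionary and variable names are illustrative.

```python
# Interaction p-values copied from the ANOVA table for newrsp.
p_values = {
    "exper*org": 0.426, "exper*flex": 0.172, "exper*holes": 0.726,
    "exper*path": 0.642, "org*flex": 0.432, "org*holes": 0.942,
    "org*path": 0.203, "flex*holes": 0.669, "flex*path": 0.000,
    "holes*path": 0.801, "exper*org*flex": 0.475,
    "exper*org*holes": 0.046, "exper*org*path": 0.398,
    "exper*flex*holes": 0.364, "exper*flex*path": 0.346,
    "exper*holes*path": 0.622, "org*flex*holes": 0.148,
    "org*flex*path": 0.027, "org*holes*path": 0.844,
    "flex*holes*path": 0.719, "exper*org*flex*holes": 0.216,
    "exper*org*flex*path": 0.137, "exper*org*holes*path": 0.052,
    "exper*flex*holes*path": 0.080, "org*flex*holes*path": 0.244,
    "exper*org*flex*holes*path": 0.303,
}

PER_TEST_ALPHA = 0.004  # per-test level for a family level of 0.1

family_significant = [k for k, p in p_values.items() if p < PER_TEST_ALPHA]
naive_significant = [k for k, p in p_values.items() if p < 0.05]

print("significant at the family level:", family_significant)
print("would pass an unadjusted 0.05 test:", naive_significant)
```

The contrast between the two lists shows why the adjustment matters: three interactions pass an unadjusted 0.05 test, but only flex*path survives the family-wise threshold.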
Treatment Means
Because flex and path interact, it doesn't make sense to build
confidence intervals either for the flex effect or the path effect. That
would raise the question of whether one meant the flex effect when
running a straight path or the flex effect when running a crooked path.
Likewise, the path effect is different for rigid pipe than it is for flex.
So I propose to make 7 confidence intervals: exper, org, holes,
flex | crooked, flex | straight, path | rigid and path | flex. In each
case what I really want is a confidence interval for the improvement
percentage. Unlike the pilot study, there's no nice formula to
transform back from what I called U to P, the improvement
percentage. So instead I propose to make confidence intervals for all
relevant treatment means, and then construct intervals for P from
them.
Let's first consider an interval for P for the experience effect.
Starting with our transformed response variable we need confidence
intervals for μ0 and μ1. Their point estimates are in Table 3.11.
Because we're going to combine these confidence intervals into a single
interval, we need to be simultaneously confident in both intervals.
Therefore, I will use the Bonferroni procedure. That is, for 90%
confidence, I will use t(0.975, 32) = 2.037. Standard error is then

$$\sqrt{\frac{MSE}{32}} \times 2.037 = 0.02297 = \text{standard error}$$

This will be the same standard error for all main effect factor levels.
Ȳ0 = 1.5397. Ȳ1 = 1.4633. So with 90% confidence, μ0 ∈
(1.51673, 1.56267) and μ1 ∈ (1.44033, 1.48627).
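The half-width used for each factor level mean can be reproduced from the error mean square. This sketch assumes MSE = 0.004070 from the ANOVA table, 32 observations per factor level, and the critical value t(0.975, 32) = 2.037 exactly as quoted (it is hard-coded rather than computed); the names are illustrative.

```python
import math

MSE = 0.004070   # error mean square from the ANOVA table
T_CRIT = 2.037   # t(0.975, 32) as quoted in the text

def half_width(mse, n, t_crit):
    """Half-width of the interval for a mean of n observations."""
    return t_crit * math.sqrt(mse / n)

hw = half_width(MSE, 32, T_CRIT)

# Interval for the experienced-plumber mean, using the quoted point estimate.
y1_hat = 1.4633
y1_lo, y1_hi = y1_hat - hw, y1_hat + hw

print(f"half-width = {hw:.5f}")
print(f"interval for mean 1 = ({y1_lo:.5f}, {y1_hi:.5f})")
```

Using 0.975 rather than 0.95 in the t quantile is the Bonferroni step: each of the two intervals is built at 95% so that both hold together with at least 90% confidence.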
These are all transformed variables. To get time values, all these
numbers have to be transformed back. Let T0 = exp(exp(Ȳ0)) and
T1 = exp(exp(Ȳ1)). Table 3.12 shows all the relevant T values.

$$\mu_0 (1-P) = \mu_1$$
$$1-P = \frac{\mu_1}{\mu_0}$$
$$P = 1-\frac{\mu_1}{\mu_0}$$

Variable   Point Est.   Min        Max
Y0         1.5367       1.51673    1.56267
T0         104.5037     95.3256    118.1015
Y1         1.4633       1.44033    1.48627
T1         75.20312     68.17575   83.14416
Table 3.12
In order to find lower and upper bounds for a 90% confidence
interval for P we need to first minimize then maximize 1 − μ1/μ0. To
minimize P we must maximize μ1/μ0. So we use the lower limit for T0
and the upper limit for T1.

$$P_{min} = 1 - 83.14/95.33 = 12.78\%$$

And the same method gives Pmax = 42.27%. Using point estimates we
get P = 28.04%. Table 3.13 gives confidence intervals for all 3 main
effects using this method. The table shows us that this means of
finding confidence intervals is sufficiently inefficient that 0 is in the
90% confidence interval for the holes effect, even though the ANOVA
table shows a p-value of 0.064 for the same effect.
Notice also that our point estimate isn't exactly midway between
the max and min. This is because exp(exp(x)) isn't a linear function.
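The back-transformation and the resulting bounds on P can be sketched as follows. The transformed-scale point estimates and limits are copied from Table 3.12; the function and variable names are illustrative.

```python
import math

def back_transform(y):
    """Undo the log(log(time)) transformation."""
    return math.exp(math.exp(y))

# Transformed-scale values copied from Table 3.12.
y0_hat, y0_lo, y0_hi = 1.5367, 1.51673, 1.56267   # inexperienced
y1_hat, y1_lo, y1_hi = 1.4633, 1.44033, 1.48627   # experienced

t0_lo, t0_hi = back_transform(y0_lo), back_transform(y0_hi)
t1_lo, t1_hi = back_transform(y1_lo), back_transform(y1_hi)

# P = 1 - T1/T0. The ratio is largest with T1 high and T0 low.
p_hat = 1 - back_transform(y1_hat) / back_transform(y0_hat)
p_min = 1 - t1_hi / t0_lo
p_max = 1 - t1_lo / t0_hi

print(f"P = {100 * p_hat:.2f}%  ({100 * p_min:.2f}%, {100 * p_max:.2f}%)")
```

Pairing the worst case of one interval with the best case of the other is what makes the resulting interval for P wider than a direct interval on the effect would be.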
The method is exactly the same for the interactions except that
the standard error for the confidence intervals in step 1 is larger. Here
we still use 2.037 for a t value, but s is now

$$s = \sqrt{\frac{MSE}{16}} = 0.0159$$
$$\text{s.e.} = 0.0159 \times 2.037 = 0.0325$$

Table 3.14 gives point estimates and confidence intervals for the
interactions. Note that while the flex effect is greater for a crooked
path than a straight path, it's still huge for a straight path. There
appears to be a large path effect when running rigid pipe. However,
there may be no path effect for flexible tubing.
Effect   Point Est.   Min      Max
Exper    28.04%       12.78%   42.27%
Org      29.92%       13.89%   42.98%
Holes    12.82%       -7.12%   29.04%
Table 3.13
Effect            Point Est.   Min      Max
flex | crooked    87.17%       82.29%   90.73%
flex | straight   52.06%       37.26%   63.40%
path | rigid      77.44%       68.27%   83.98%
path | flex       15.69%       -8.36%   34.41%
Table 3.14
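The conditional point estimates for the flex and path effects can be recomputed from the four flex-by-path cell means of newrsp (1.7918, 1.5066, 1.3729 and 1.3287). A minimal sketch, with illustrative names and the cell means keyed by (flex, path):

```python
import math

def back_transform(y):
    """Undo the log(log(time)) transformation."""
    return math.exp(math.exp(y))

# Cell means of newrsp keyed by (flex, path); 0 = rigid/crooked, 1 = flex/straight.
cell_means = {(0, 0): 1.7918, (0, 1): 1.5066, (1, 0): 1.3729, (1, 1): 1.3287}
t = {cell: back_transform(y) for cell, y in cell_means.items()}

def improvement(slow, fast):
    """Improvement percentage P = 1 - T_fast / T_slow, as a fraction."""
    return 1 - t[fast] / t[slow]

effects = {
    "flex | crooked": improvement((0, 0), (1, 0)),
    "flex | straight": improvement((0, 1), (1, 1)),
    "path | rigid": improvement((0, 0), (0, 1)),
    "path | flex": improvement((1, 0), (1, 1)),
}

for name, p in effects.items():
    print(f"{name:16s} {100 * p:6.2f}%")
```

The interval limits would follow the same pattern as for the main effects, using the larger half-width of 0.0325 on each cell mean before transforming back.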


CHAPTER 4
PHASE 3: FULL SCALE MODEL
The one uncertainty about all the work I've done up to this
point is that running pipe through these mazes I built is not exactly like
any real plumbing experience. There are many different kinds of pipe:
copper, cast iron, galvanized steel, black iron, ABS, PVC, CPVC,
cross-linked polyethylene, gas-tite, slip-joint, lead and many more.
Pipes are installed in walls and ceilings, crawl spaces, attics, and
trenches dug in the ground. My hope is that the inferences I've made are
broadly applicable to all these materials in all these situations.
However, I have to consider the possibility that not everything
I've learned about running pipe through mazes while sitting at a kitchen
table holds up when plumbing a building. So for my last experiment, I
built a 2x4 wall in my back yard and plumbed it 10 times: 5 times with
flexible hose and 5 times with rigid pipe.
Material   Time (s)   Log(time)
rigid      2100       7.65
rigid      1880       7.539
rigid      1809       7.5
rigid      1885       7.542
rigid      1703       7.44
flex       723        6.583
flex       628        6.443
flex       643        6.466
flex       635        6.454
flex       604        6.404
Table 4.1


In this case the rigid material is copper pipe that I sweat-fit with
tin-antimony solder. There could be some bias from the order in which
the trials are carried out. So I ran them in the following order:
2 flex, 3 rigid, 3 flex, 2 rigid.
The results are in Table 4.1. The different factor levels have
been grouped together, but within the group they're listed in the order
in which they were run. There looks to be some learning curve.
Table 4.2 is an ANOVA done in MINITAB. The variable rsp is
the log of the time in seconds. The pilot study showed that this is a
good way to even out the variance for this factor. Indeed Table 4.2
shows that the individual standard deviations of the transformed values
are 0.0765 for rigid and 0.0677 for flex. That is, they're about 12%
different.
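The one-way ANOVA on the log times can be reproduced with the standard library alone. This sketch takes the five largest times in Table 4.1 (2100, 1880, 1809, 1885, 1703) as the rigid trials, an assumption that reproduces the group means 7.5342 and 6.4699 reported by MINITAB; the function name is illustrative.

```python
import math
from statistics import mean, stdev

# Completion times in seconds from Table 4.1.
rigid_times = [2100, 1880, 1809, 1885, 1703]
flex_times = [723, 628, 643, 635, 604]

groups = [[math.log(t) for t in rigid_times],
          [math.log(t) for t in flex_times]]

def one_way_anova(groups):
    """Return (SS between, SS error, MSE, F) for a one-way layout."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    mse = ss_error / (n_total - len(groups))
    f_stat = (ss_between / (len(groups) - 1)) / mse
    return ss_between, ss_error, mse, f_stat

ss_between, ss_error, mse, f_stat = one_way_anova(groups)
sd_rigid, sd_flex = stdev(groups[0]), stdev(groups[1])

print(f"SS(ind) = {ss_between:.5f}, SS(error) = {ss_error:.5f}")
print(f"MSE = {mse:.5f}, F = {f_stat:.1f}")
print(f"stdev rigid = {sd_rigid:.4f}, stdev flex = {sd_flex:.4f}")
```

With only two groups the F statistic is the square of the pooled two-sample t statistic, so the same comparison could be run as a t-test.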
As before, flexible tubing is significantly faster than rigid pipe.

$$\hat{U} = 1.0643$$
$$s^2(\hat{U}) = 2\,MSE/5 = 2 \times 0.00521/5 = 0.002084$$
$$s(\hat{U}) = 0.0457$$
$$t(0.95, 8) = 1.860$$
$$\text{standard error} = 0.0849$$

So (0.979, 1.149) is a 90% confidence interval for U. Using the
formula from phase 1, P = 1 − e^{−U}, we can translate to P values. P has
a point estimate of 65.50%, with (62.43%, 68.30%) for a 90% confidence
interval.
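The interval for U and its translation into improvement percentages can be sketched as below. The conversion P = 1 - exp(-U) is an assumption here: it is the form that reproduces the quoted percentages from the quoted U values. The group means, MSE and t value are taken from the text, and the names are illustrative.

```python
import math

mean_rigid, mean_flex = 7.5342, 6.4699   # group means of log(time)
MSE, n = 0.00521, 5                      # error mean square, trials per group
T_CRIT = 1.860                           # t(0.95, 8) as quoted

u_hat = mean_rigid - mean_flex           # estimated difference in log time
s_u = math.sqrt(2 * MSE / n)             # standard deviation of the difference
half = T_CRIT * s_u
u_lo, u_hi = u_hat - half, u_hat + half

def u_to_p(u):
    """Assumed conversion from a log-time difference to a fraction saved."""
    return 1 - math.exp(-u)

p_hat, p_lo, p_hi = u_to_p(u_hat), u_to_p(u_lo), u_to_p(u_hi)

print(f"U = {u_hat:.4f}  ({u_lo:.3f}, {u_hi:.3f})")
print(f"P = {100 * p_hat:.2f}%  ({100 * p_lo:.2f}%, {100 * p_hi:.2f}%)")
```

Because the conversion is monotone increasing, the endpoints of the interval for U map directly onto the endpoints of the interval for P.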
In the pilot study this interval was (77%, 85%). In phase 2 it was
(82.29%, 90.73%) for a crooked path and (37.26%, 63.40%) for a
straight path. So while there may well be differences between the
dynamics of real plumbing and my models, they're not without
application.


MTB > oneway 'rsp' 'ind'
ANALYSIS OF VARIANCE ON rsp
SOURCE   DF   SS        MS        F        P
ind      1    2.83214   2.83214   543.22   0.000
ERROR    8    0.04171   0.00521
TOTAL    9    2.87385
LEVEL    N    MEAN      STDEV
0        5    7.5342    0.0765
1        5    6.4699    0.0677
POOLED STDEV = 0.0722
MTB > nopaper
Table 4.2


CHAPTER 5
CONCLUSIONS
I should start out by saying that this study doesn't address the
whole process of plumbing a house. There's setting fixtures,
paperwork, meetings with the general contractor, et cetera. But running
rough pipe is a large part of it, so that's why I primarily looked for
strategies for running rough pipe faster.
The most obvious conclusion one can draw from this research is
that it's not efficient to run rigid pipe when one can get by with
flexible tubing.
When forced to run rigid pipe, a wise plumber chooses jobs
where it's possible to run pipe in a straight line. With flex, path may
not even matter.
Plumbing experience is also complex. We've shown that
experienced plumbers are worth more money. But are they worth as
much more as they cost? If we took our point estimate as gospel, and
we assumed that you could get rookies for $10.00 per hour (what I
started at 2 years ago) then a typical experienced plumber should get
10/(1-P) = 10/0.7196 = $13.90.
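The break-even wage is a one-line calculation: if an experienced plumber needs only a fraction 1 - P of the rookie's time, the same labor cost per job is reached at the rookie wage divided by 1 - P. A minimal sketch, using the point estimate P = 28.04% and a hypothetical function name:

```python
def break_even_wage(rookie_wage, p):
    """Hourly wage at which the faster worker costs the same per job."""
    return rookie_wage / (1 - p)

wage = break_even_wage(10.00, 0.2804)
print(f"break-even wage = ${wage:.2f} per hour")
```

The same function could be evaluated at the ends of the confidence interval for P to see how sensitive the figure is to the estimate.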
But I'm oversimplifying in a couple of ways. First of all there
are many costs associated with employing somebody above and beyond
their hourly wage. Some of them are directly proportional to their
wage and some are the same for every worker. I'm also treating
experience as something one has or doesn't have. A deeper study
might regress productivity against years of experience, having or not
having licenses, upper body strength and perhaps other variables. To
muddy the waters a little more, apprentice plumbers aren't allowed to
work alone, but must be supervised by a licensed plumber at all times.
People cheat on this, but a legitimate plumbing business has to hire a
certain number of licensed plumbers whether it's economical or not.
Furthermore, an inexperienced plumber may make more mistakes,
causing leaks and rework. In spite of the lingering confusion, I think
I've shown that all else being equal, it's better to have experience.
Finally, the hole size effect may not speed one up as much as I
might have predicted. But according to Table 3.10 it seems to exist. So
all else being equal, one should drill one's holes as large as possible.


APPENDIX A
PHASE 1 DATA
Glossary of Variables
Order: Each subject does 8 trials numbered in the order they're
    executed from 0 to 7.
Subject: There are three subjects. This variable identifies which one
    did which trial.
Org: 0 if fittings were disorganized. 1 if organized.
Flex: 0 if using rigid pipe. 1 if using flexible tubing.
Time: Time in seconds required to finish running pipe through maze.
Rsp: Response variable equal to log(time).
Resid: Residual from 3 way ANOVA model using rsp as the response
    variable.
Nscr: Normal score, expected value from a standard normal
    distribution for the order of the resid value. Used to make
    normal probability plots.
Timeres: Residual when using time as the response variable.
Orgpop: Residuals with org = 1.
Dispop: Residuals with org = 0. Orgpop and dispop are used to do a
    Hartley test for unequal variance.
Rigpop: Residuals with flex = 0.
Rigord: Ord values associated with rigpop. Rigpop and rigord are used
    to make Figure 2.10.
Flexpop: Residuals with flex = 1.
Flexord: Ord values associated with flexpop. Flexpop and flexord are
    used to make Figure 2.11.
Subsub: Data 2, 9 and 17 potentially confounded organization effects
    with time series effects. Subsub is the variable sub without
    those data.
Suborg: Org without data 2, 9 or 17.
Subflex: Flex without data 2, 9 or 17.
Subrsp: Rsp without data 2, 9 or 17.
Subj0: 1 if subsub = 0. 0 otherwise.
Subj1: 1 if subsub = 1. 0 otherwise. Subj0 and subj1 allow the
    information in subsub to be used in a regression analysis.


APPENDIX B
PHASE 2 DATA
Glossary of Variables
Subj: Identifies among the 8 test subjects numbered from 0 to 7.
Order: Identifies which of the 8 sequential trials the given test
    subject executed for that data point.
Exper: 0 for test subjects without plumbing experience. 1 for
    experienced plumbers.
Org: 0 if fittings were disorganized. 1 if organized.
Flex: 0 if using rigid pipe. 1 if using flexible tubing.
Holes: 0 for small holes. 1 for large holes.
Path: 0 for a crooked path. 1 for a straight path.
Time: Time in seconds required to run the pipe through the maze.
Rsp: Natural log of time values. This was the first attempt at a
    response variable, but was abandoned for newrsp.
Resid: Residuals when using rsp as a response variable.
Nscores: Normal scores. Expected value from a standard normal
    distribution for the order of the resid value. Used to make
    normal probability plots.
Suborg: Rsp residuals for all data where org = 1.
Subdis: Rsp residuals for all data where org = 0.
Newrsp: Natural log of rsp values, or log(log(time)). Used as the
    response variable for all inferences.
Newresid: Residuals when newrsp is used.
Subsmall: Newrsp residuals when holes = 0.
Sublarge: Newrsp residuals when holes = 1.
SubdisII: Newrsp residuals when org = 0.
SuborgII: Newrsp residuals when org = 1. Subsmall, sublarge, subdisII
    and suborgII are used for Hartley tests.
Learn: Improvement percentages between the first trial run and last
    rigid pipe trial for a given subject.
Rigpop: Newrsp residuals where flex = 0.
Rigord: Ord values associated with rigpop values. Rigpop and rigord
    were used to make Figure 3.9.
Flexpop: Newrsp residuals for flex = 1.
Flexord: Ord values associated with flexpop values. Flexpop and flexord
    were used to make Figure 3.10.


BIBLIOGRAPHY
Neter, John, et al. Applied Linear Statistical Models. Irwin, 1996.
Niebel, Benjamin W. Motion and Time Study. 1972.