Citation |

- Permanent Link: http://digital.auraria.edu/AA00003242/00001
## Material Information
- Title: Efficient plumbing methods with statistical applications
- Creator: Harburg, Aaron
- Publication Date: 2001
- Language: English
- Physical Description: 73 leaves : illustrations ; 28 cm
## Subjects
- Subjects / Keywords: Plumbing industry ( lcsh ); Plumbers -- Time management ( lcsh ); Industrial efficiency ( lcsh )
- Genre: bibliography ( marcgt ); theses ( marcgt ); non-fiction ( marcgt )
## Notes
- Bibliography: Includes bibliographical references (leaf 73).
- General Note: Department of Mathematical and Statistical Sciences
- Statement of Responsibility: by Aaron Harburg.
## Record Information
- Source Institution: University of Colorado Denver
- Holding Location: Auraria Library
- Rights Management: All applicable rights reserved by the source institution and holding location.
- Resource Identifier: 47815101 ( OCLC ); ocm47815101
- Classification: LD1190.L622 2001m .H37 ( lcc )

Full Text |

EFFICIENT PLUMBING METHODS
WITH STATISTICAL APPLICATIONS

by Aaron Harburg

B. A., University of Colorado at Boulder, 1986
B. A., Metropolitan State College of Denver, 1995

A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science, Applied Mathematics, 2001

This thesis for the Master of Science degree by Aaron Troy Harburg has been approved by
Karen Kafadar
Burton Simon

Harburg, Aaron Troy (M. S. Applied Mathematics)
Efficient Plumbing Methods With Statistical Applications
Thesis directed by Dr. Ramalingam Shanmugam

ABSTRACT

In order to find out what factors improve the speed of running pipe, this study uses a balanced ANOVA model to look simultaneously at five different factors that might be important.

This abstract accurately represents the content of the candidate's thesis. I recommend its publication.

ACKNOWLEDGMENT

Special thanks to: Ramalingam Shanmugam, Burton Simon, Karen Kafadar, Diane Bernhardt-Barton, Mark Harburg, Rudy Harburg, DeAnn Major, Gabriel Major, Miriam Nations, Myint Shwe, Yusef Ulmanis, Scott Ulrich and Steve Williamson.

CONTENTS

- Figures
- Tables

CHAPTER

- 1. INTRODUCTION
  - Model Assumptions
  - Factors
  - Plumbing Experience
  - Organization of Fittings
  - Rigid vs. Flexible Pipe
  - Large Holes vs. Small
  - Straight vs. Crooked Path
- 2. PILOT STUDY
  - Significance of Factors
  - Normal Residuals
  - Equality of Variance
  - Time Series
  - Looking Forward to Phase 2: Are 64 Data Enough?
  - Estimates of Improvement Percentages
- 3. 5 FACTOR FACTORIAL ANALYSIS
  - Equality of Variance
  - Normal Residuals
  - Time Series
  - Significant Effects
  - Treatment Mean
- 4. FULL SCALE MODEL
- 5. CONCLUSION

APPENDIX

- A. Phase 1 Data
- B. Phase 2 Data

BIBLIOGRAPHY

FIGURES

- 1.1 Existing Model
- 1.2 Better Model
- 1.3 Tolerance
- 2.1 Q-Q Time
- 2.2 Q-Q Log
- 2.3 Time Variance
- 2.4 Log Variance
- 2.5 Time Subject
- 2.6 Log Subject
- 2.7 Time Organization
- 2.8 Log Organization
- 2.9 Time Series
- 2.10 Rigid Time Series
- 2.11 Flexible Time Series
- 3.1 Organization Variance
- 3.2 Log-Log Organization
- 3.3 Log-Log Experience
- 3.4 Log-Log Flexibility
- 3.5 Log-Log Path
- 3.6 Log Holes
- 3.7 Log-Log Holes
- 3.8 Phase 2 Q-Q
- 3.9 Rigid Time Series
- 3.10 Flexible Time Series
- 3.11 Flex-Path Interaction

TABLES

- 2.1 Subject 0 by Variable
- 2.2 Subject 0 in Order
- 2.3 Subject 1 by Variable
- 2.4 Subject 1 in Sequence
- 2.5 Subject 2 by Variable
- 2.6 Subject 2 in Sequence
- 2.7 Phase 1 ANOVA
- 2.8 Effects
- 2.9 Phase 1 Log Anova
- 2.10 Means
- 3.1 Subjects 0, 1 by Variable
- 3.2 Subjects 0, 1 in Sequence
- 3.3 Subjects 2, 3 by Variable
- 3.4 Subjects 2, 3 in Sequence
- 3.5 Subjects 4, 5 by Variable
- 3.6 Subjects 4, 5 in Sequence
- 3.7 Subjects 6, 7 by Variable
- 3.8 Subjects 6, 7 in Sequence
- 3.9 Phase 2 Time Series
- 3.10 Phase 2 ANOVA
- 3.11 Phase 2 Means
- 3.12 Confidence Intervals
- 3.13 Simple Confidence Intervals
- 3.14 Complex Confidence Intervals
- 4.1 Phase 3 Data
- 4.2 Phase 3 ANOVA

CHAPTER 1 INTRODUCTION

Time is money. To a working plumber this is very clear, because some projects are very profitable while others lose money, and the obvious difference is time. Experienced plumbers are expensive. If a plumber gets bogged down and goes slowly on a job, it doesn't take long to eat up all the profits.
On the other hand, plumbers are expensive, i.e., they charge a lot of money. So a time-efficient plumbing crew can be a huge cash cow.

Among experienced plumbers I've known, much disagreement exists about which methods are the most time-efficient and which factors are the most important. The purpose of this study is to use the tools of statistics to sort among many possible predictors of relative efficiency in order to determine which are significant, which are the most important, and how they interact with each other.

To penetrate this mystery I built model construction projects that resemble a plumber's work. In my first 2 experiments, I built wooden mazes and had the test subjects run pipe from one point in the maze to the other. In the third, I built a wall on the correct scale for a bathroom and ran the water pipe. The first model is cheaper to run and more portable. I used it to cast a broad net and get a hint about many potential variables. However, this model is on a much smaller scale than any real plumbing work. In the third phase, I collected fewer data, but I had a much better idea what I was looking for. I wanted to confirm on a large scale what the small scale results had suggested.

[Figure 1.1. Existing Model]

Phase 1 was a pilot study where I looked at the effects of organizing one's fittings, the difference between flexible tubing and rigid pipe, and the difference among 3 different test subjects. In Phase 2, I collected more data and looked at 5 different factors. In Phase 3, I only looked at the flexibility effect, but it was a much better model.

The factors under study are as follows:

- Plumbing experience
- Organization of materials
- Rigid vs. flexible pipe
- Large vs. small holes
- Straight vs. crooked path

Model Assumptions

Before collecting any data, I suspected that my response variable should not be the time it takes to complete the maze, but the logarithm of that time.
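A small numeric sketch of why the logarithm is a natural response when effects multiply. The mean times below are made-up illustrative values, and `interaction_contrast` is a hypothetical helper introduced only for this illustration; neither is study data.

```python
import math

# Hypothetical mean completion times in seconds (illustrative only):
# flexible tubing is assumed to take half as long as rigid pipe in every maze.
rigid = {"maze 1": 100.0, "maze 2": 200.0}
flexible = {maze: t / 2 for maze, t in rigid.items()}


def interaction_contrast(transform):
    """Difference between the rigid-vs-flexible gaps of the two mazes.

    A value of zero means the two factors combine additively on the
    scale defined by `transform`.
    """
    gap_1 = transform(rigid["maze 1"]) - transform(flexible["maze 1"])
    gap_2 = transform(rigid["maze 2"]) - transform(flexible["maze 2"])
    return gap_2 - gap_1


raw_scale = interaction_contrast(lambda t: t)  # gaps of 50 s and 100 s differ
log_scale = interaction_contrast(math.log)     # both gaps equal ln(2)

print(f"interaction contrast on raw times: {raw_scale:.3f}")
print(f"interaction contrast on log times: {log_scale:.3f}")
```

On the raw scale the two gaps differ, which an additive model would read as interaction; on the log scale the contrast vanishes up to floating-point error.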
This is because I expected the predictors to affect the completion time multiplicatively, not additively. The fundamental assumption of the standard ANOVA model is that a given datum is equal to a grand mean plus means from various effects plus a normally distributed residual. Imagine for a moment that one could run flexible tubing twice as fast as rigid pipe, and that this was true irrespective of all other factors. Assume for the sake of argument that with rigid pipe maze 1 took on average 100 seconds to complete and maze 2 required 200 seconds. Under our assumptions, mazes 1 and 2 should take 50 and 100 seconds respectively with flexible tubing. So with a large sample size, sample means should be similar to those in figure 1.1. There appears to be interaction between the two variables, and there is if you think of the factors as additive. But if you believe in multiplicative effects, and you use the log-of-time as your response, your graph now looks like figure 1.2. The effects have become additive because of the theorem $\ln(x) - \ln(y) = \ln(x/y)$.

Factors

For each of these factors, my intuition told me that it would affect speed in a particular way. In this section I give some of my reasons why. In each case, level 0 is the level I expected to go slowly and level 1 is the level I thought would be faster.

[Figure 1.2. Better Model (rigid vs. flexible)]

Plumbing Experience

How much faster does plumbing experience make somebody? Experienced plumbers cost more money, and in the current labor market they're harder to find. In phase 1, all three subjects were inexperienced. However, I looked at the subject effect as a random factor to see if it was a great source of variability. In phase 2, I replaced the subject effect with a controlled experience factor with 2 levels. Those with plumbing experience were at level 1 and those without were at level 0.

Organization of Fittings

I've long suspected that having your fittings sorted out makes one faster.
I would expect this to be true for 2 reasons. Firstly, you can find things without rummaging. Every time you need a particular elbow or length of pipe, you can grab it from its allotted place if things are sorted, or you can dig for it if they're not. This aspect is what's being tested here.

However, there's a second reason for keeping your fittings organized which is harder to test in a controlled study like this. If you sort each kind of fitting into its own bin, you know what you're running out of, so you know what to get more of. It can't possibly be efficient to roll up your tools in the middle of the day and go buy 1 or 2 things that you ran out of but desperately need.

I'm not sure that having separate bins for every part you'll ever need is always efficient. After all, fittings have to be sorted into these bins. The labor cost of this has to be measured in order to completely answer the question of whether organizing pays dividends. But I did get some useful information.

For both phase 1 and phase 2, this is a controlled factor with 2 treatments. Level 1 is where the things the subject needs are presented in a well sorted tray. Level 0 is where the subject is given a large bucket with all the fittings needed. However, this bucket also contains a lot of useless hardware and it's all mixed up.

Rigid vs. Flexible Pipe

In half the trials of the pilot study and phase 2, subjects ran 3/8-inch black iron pipe through a maze. In the rest, flexible nylon tubing was used. The flex had an outside diameter of 5/8 inch, just like 3/8-inch black iron. (3/8 refers to the inside diameter of the pipe.) The rigid pipe is level 0. Flexible tubing is level 1. I expected flexible tubing to be faster because it requires fewer fittings and it doesn't require the precise measurements that rigid pipe does.

Large Holes vs. Small

I think that large holes give you more latitude to be imprecise. The more perfect your decisions have to be, the slower you go.
Neibel claims that production costs decrease asymptotically with larger tolerances, as shown in figure 1.3. (Neibel p. 65) The diameter of a hole which pipe must run through is a kind of tolerance, because it limits the sizes of pipe that can be used to construct the line that eventually goes through it. It's also more awkward to run pipe through small holes. One may have more trouble getting the pipe into position. Therefore, I assigned level 0 to small holes and 1 to large ones.

Straight vs. Crooked Path

Obviously, making a path crooked is going to require rigid pipe runners to use more fittings and thus more time. So the crooked path is level 0 while a straight path is level 1. But how much slower? How much does it slow down flexible tubing? A plumber doesn't always have control over how straight his path can be. The building's design and other utilities can get in the way. But if he knows his pipe is going to have to run like spaghetti, he can adjust his bid accordingly. I was also interested in whether or not the other factors reacted the same way to a straight path that they did to a crooked one.

[Figure 1.3. Approximate relationship between cost and machining tolerance. Horizontal axis: tolerance in inches (plus or minus), 0 to 0.020.]

CHAPTER 2 PILOT STUDY

In this phase of the research, 24 data from 3 subjects measure 3 effects:

- A. volunteers (random effect)
- B. organization of fittings
- C. hard pipe vs. flexible hose

The foremost goal of the pilot study was to work the kinks out of the data gathering process. I also wanted to approximate the mean square for error of the data, in order to guess how many observations I would need to take in phase 2. Furthermore, I wanted to look for a time series learning curve. Is the 8th observation from a given subject likely to be faster than the 1st? There could also be a fatigue effect making the 8th slower than the 7th. I wanted to be sure that any time series effects would not distort my data, thus casting doubt on my inferences.
Toward this end, I randomized the order in which I collected the data. The idea was to make data sheets like those in table 2.1. The numbers in the column ord are generated by a random number generator which samples without replacement. When actually doing the experiment, it is more useful to organize the chart like table 2.2. For each subject a different random order was chosen. Originally the data sheets had 1s and 0s for treatment levels, like table 2.1. After timing the first subject (subject 0), I realized that it would be easier to keep the treatments straight while administering the trials if I wrote out the factor levels, as is done in table 2.2.

Table 2.1. Pilot, subject 0 by variable

| ord | ref | organ | flex | time |
|---|---|---|---|---|
| 6 | 0 | 0 | 0 | 214 |
| 1 | 1 | 0 | 0 | 439 |
| 2 | 2 | 0 | 1 | 56 |
| 5 | 3 | 0 | 1 | 48 |
| 4 | 4 | 1 | 0 | 186 |
| 3 | 5 | 1 | 0 | 238 |
| 7 | 6 | 1 | 1 | 43 |
| 0 | 7 | 1 | 1 | 55 |

Table 2.2. Pilot, subject 0 in sequence

| ord | ref | organ | flex | time | min | sec |
|---|---|---|---|---|---|---|
| 0 | 7 | org-1 | flex-1 | 55 | 0 | 55 |
| 1 | 1 | 0-dis | rig-0 | 439 | 7 | 19 |
| 2 | 2 | 0-dis | flex-1 | 45 | 0 | 45 |
| 3 | 5 | org-1 | rig-0 | 238 | 3 | 58 |
| 4 | 4 | org-1 | rig-0 | 186 | 3 | 6 |
| 5 | 3 | 0-dis | flex-1 | 48 | 0 | 48 |
| 6 | 0 | 0-dis | rig-0 | 214 | 3 | 34 |
| 7 | 6 | org-1 | flex-1 | 43 | 0 | 43 |

Table 2.3. Pilot, subject 1 by variable

| ord | ref | organ | flex | time |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 612 |
| 7 | 1 | 0 | 0 | 395 |
| 6 | 2 | 0 | 1 | 53 |
| 5 | 3 | 0 | 1 | 79 |
| 4 | 4 | 1 | 0 | 311 |
| 2 | 5 | 1 | 0 | 211 |
| 3 | 6 | 1 | 1 | 51 |
| 1 | 7 | 1 | 1 | 65 |

Table 2.6. Pilot, in sequence

| ord | ref | organ | flex | time | min | sec |
|---|---|---|---|---|---|---|
| 0 | 7 | org-1 | flex-1 | 47 | 0 | 47 |
| 1 | 0 | dis-0 | rig-0 | 568 | 9 | 28 |
| 2 | 2 | dis-0 | flex-1 | 60 | 1 | 0 |
| 3 | 1 | dis-0 | rig-0 | 380 | 6 | 20 |
| 4 | 5 | org-1 | rig-0 | 352 | 5 | 52 |
| 5 | 6 | org-1 | flex-1 | 65 | 1 | 5 |
| 6 | 3 | dis-0 | flex-1 | 145 | 2 | 25 |
| 7 | 4 | org-1 | rig-0 | 313 | 5 | 13 |

[Table 2.7. Phase 1 ANOVA]

Tables 2.1-6 also list the times in seconds that each trial took.

Significance of Factors

Table 2.7 is an ANOVA table computed in MINITAB. The table is completely accurate up to the column F. Because the factor subject is a random factor, the test statistics should be as in table 2.8. (Neter pp. 1377-81) In table 2.8, conventional notation is used where subject = A, org = B and flex = C. We can now see that all 3 main effects are significant and that no interactions are.
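The p-values that go with the F statistics of Table 2.8 can be checked by hand, because every test in this design has either 2 numerator or 2 denominator degrees of freedom, and the F distribution has a closed-form tail in both cases. The sketch below is an illustration, not the MINITAB computation: the helper `f_upper_tail` is a hypothetical name, and the degrees of freedom are derived on the assumption of a = 3 subjects, b = 2, c = 2 and n = 2 replicates.

```python
def f_upper_tail(f, d1, d2):
    """P(F > f) for an F(d1, d2) variable, closed form for d1 == 2 or d2 == 2."""
    if d1 == 2:
        return (1 + 2 * f / d2) ** (-d2 / 2)
    if d2 == 2:
        x = d1 * f / (d1 * f + 2)
        return 1 - x ** (d1 / 2)
    raise ValueError("closed form only implemented for d1 == 2 or d2 == 2")


a, b, c, n = 3, 2, 2, 2            # assumed layout of the 24 pilot observations
df_error = a * b * c * (n - 1)     # 12
df_subject_terms = a - 1           # 2, also the d.f. of A*B, A*C and A*B*C

# (effect, F value from Table 2.8, numerator d.f., denominator d.f.)
tests = [
    ("subject",          4.00,    df_subject_terms, df_error),
    ("org",              16.03,   1, df_subject_terms),
    ("flex",             4623.68, 1, df_subject_terms),
    ("subject*org",      0.423,   df_subject_terms, df_error),
    ("subject*flex",     0.0403,  df_subject_terms, df_error),
    ("org*flex",         1.31,    1, df_subject_terms),
    ("subject*org*flex", 0.800,   df_subject_terms, df_error),
]

p_values = {name: f_upper_tail(f, d1, d2) for name, f, d1, d2 in tests}
for name, p in p_values.items():
    print(f"{name:18s} p = {p:.3g}")
```

The tail formulas follow from the fact that the F distribution reduces to a Beta(a, 1) or Beta(1, b) form when one of its degrees of freedom equals 2.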
As I'll discuss below, the significance of organization may be inflated because of time series effects.

Table 2.8.

| Effect | | E(MS) | General F Statistic | Specific F Value | p Value |
|---|---|---|---|---|---|
| subject | A | $bcn\sigma_\alpha^2 + \sigma^2$ | MSA/MSE | 4.00 | 0.047 |
| org | B | $acn\frac{\sum\beta^2}{b-1} + cn\sigma_{\alpha\beta}^2 + \sigma^2$ | MSB/MSAB | 16.03 | 0.057 |
| flex | C | $abn\frac{\sum\gamma^2}{c-1} + bn\sigma_{\alpha\gamma}^2 + \sigma^2$ | MSC/MSAC | 4623.68 | $2.16 \times 10^{-4}$ |
| subject*org | A*B | $cn\sigma_{\alpha\beta}^2 + \sigma^2$ | MSAB/MSE | 0.423 | 0.66 |
| subject*flex | A*C | $bn\sigma_{\alpha\gamma}^2 + \sigma^2$ | MSAC/MSE | 0.0403 | 0.96 |
| org*flex | B*C | $an\frac{\sum\sum(\beta\gamma)^2}{(b-1)(c-1)} + n\sigma_{\alpha\beta\gamma}^2 + \sigma^2$ | MSBC/MSABC | 1.31 | 0.37 |
| subject*org*flex | A*B*C | $n\sigma_{\alpha\beta\gamma}^2 + \sigma^2$ | MSABC/MSE | 0.800 | 0.47 |
| error | | $\sigma^2$ | | | |

Normal Residuals

Figure 2.1 is a normal probability plot. It plots residuals against the expected standard normal values for a 24 data sample. For Figure 2.1, the analysis was done on time values. In Figure 2.2, we see the same graph using log-of-time for a response variable. Clearly the log-of-time analysis produces more normal residuals.

[Figure 2.1. Q-Q Time]

[Figure 2.2. Q-Q Log]

Equality of Variance

Fundamental to the standard ANOVA model is the assumption that the residuals are not only normally distributed, but that they also have the same variance for every treatment. Figures 2.3 and 2.4 show the residuals plotted against flexibility. Figure 2.3 is done with time, Figure 2.4 with log-of-time. What these graphs show us is that for this factor, time would seem to give unequal variances, while log-of-time would not. The same plots with respect to test subjects are shown in Figures 2.5 and 2.6. Here there doesn't seem to be highly unequal variance either way.

When the graphs are made with respect to organization, though, one gets an apparent difference in variance regardless of response variable. This can be seen in Figures 2.7 (time) and 2.8 (log of time). This could be coincidence or it could be that the variances are significantly different. If it is significant, it makes the data harder to analyze. But if variances are truly different, that information is not without value.
It suggests that organizing one's parts not only makes one faster, but makes completion time more predictable. The smaller variance is with factor level 1: organized fittings. In running a plumbing business, steady income is preferable to feast and famine. After all, several famine months in a row can result in bankruptcy.

[Figure 2.3. Time Variance]

[Figure 2.4. Log Variance]

[Figure 2.5. Time Subject (horizontal axis: subject)]

[Figure 2.6. Log Subject]

[Figure 2.7. Time Organization]

[Figure 2.8. Log Organization]

For the sake of analyzing the pilot data, I did a Hartley test. A balanced ANOVA is somewhat robust against unequal variance. The idea of a Hartley test is to look at the ratio of the largest sample variance to the smallest across the factor levels. If the ratio is not significantly large, one falls back on this robustness and does a standard ANOVA. (Neter p. 764-6, 1189)

$$H = \frac{0.0823}{0.0204} = 4.02, \qquad r = 2, \qquad \text{d.f.} = 11$$

$$H_0: \sigma_0^2 = \sigma_1^2 \qquad H_a: \sigma_0^2 \neq \sigma_1^2$$

Reject $H_0$ if $H > H(0.99; 2, 11) = 5.38$. But $4.02 < 5.38$, so conclude $H_0$.

If I had used $\alpha = 0.05$, I would have $H^* = 3.5$ and be forced to conclude $H_a$. So while the variances probably are different, they aren't so different as to warrant concern for the analysis of this data set. At this point I was faced with the possibility that with more data in phase 2, this same test might become significant. However, since I seemed to have normal residuals, I knew I could always try another transformation or use weighted least squares if necessary. (Neter p. 768)

Time Series

As I indicated earlier, one pitfall of this study was the risk that the speed with which test subjects worked could change over the course of their session. Figure 2.9 plots residuals against their order of execution. There appears to be a general downward trend. This trend is even more obvious in Figure 2.10. Figure 2.10 is the same graph, but only the trials with rigid pipe are represented. By contrast, Figure 2.11 shows only trials with flexible tubing. This suggests that not only is flexible tubing faster to run, it's easier to learn. Indeed, one seems to reach optimum speed immediately.

[Figure 2.9. Time Series]

One potential problem from this would be to inflate the MSE. If the goal is to study the long term productivity of full time plumbers, then the early trials are abnormally slow, and the range of response values is wider than necessary. So for phase 2, I had each test subject do 2 practice runs with rigid pipe.

Notice from the data that the first time a given subject runs rigid pipe is that subject's slowest time. By coincidence, in each of these data, the fittings are disorganized. These assignments were made randomly. However, with only 3 test subjects, there is a 0.25 probability that they would all be the same factor level, and a 0.125 probability that they would all be disorganized, which is the level I had predicted to be slower.
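The two coincidence probabilities quoted here (0.25 and 0.125) can be checked by brute-force enumeration. This is a minimal sketch under the assumption that each subject's first rigid-pipe trial is equally likely to be organized or disorganized, independently of the other subjects:

```python
from itertools import product

# 0 = disorganized fittings, 1 = organized fittings, one entry per pilot subject.
outcomes = list(product((0, 1), repeat=3))

# All three first rigid trials share one organization level (either level).
p_all_same = sum(len(set(o)) == 1 for o in outcomes) / len(outcomes)

# All three first rigid trials are specifically disorganized.
p_all_disorganized = sum(o == (0, 0, 0) for o in outcomes) / len(outcomes)

print(p_all_same, p_all_disorganized)
```

Of the 8 equally likely outcomes, 2 have all three subjects on the same level and 1 has all three disorganized.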
This casts doubt on the organization effect that seems to be present in the data.

[Figure 2.10. Rigid Time Series]

[Figure 2.11. Flexible Time Series]

Table 2.9 shows the results of a regression analysis done on MINITAB. The 3 dubious data are discarded, and org and flex become suborg and subflex. With just 2 factor levels, they can be treated as regression predictors. Subj0 is 1 if the datum is from test subject 0 and 0 otherwise. Subj1 is 1 for subject 1 and 0 otherwise. Interactions aren't considered. With this analysis, the p-value for organization goes to 0.087, up from 0.023. It's a different kind of analysis with fewer data, but I still believe that the organization effect is weaker than it seems. Below we'll see that removing the 3 data lowers the improvement percentage for organization.

Table 2.9.

`MTB > Regress 'subrsp' 4 'subj0' 'subj1' 'suborg' 'subflex'`

The regression equation is

subrsp = 5.97 - 0.448 subj0 - 0.161 subj1 - 0.197 suborg - 1.58 subflex

| Predictor | Coef | Stdev | t-ratio | P |
|---|---|---|---|---|
| Constant | 5.9660 | 0.1315 | 45.37 | 0.000 |
| subj0 | -0.4479 | 0.1293 | -3.46 | 0.003 |
| subj1 | -0.1613 | 0.1293 | -1.25 | 0.230 |
| suborg | -0.1970 | 0.1082 | -1.82 | 0.087 |
| subflex | -1.5821 | 0.1082 | -14.62 | 0.000 |

S = 0.2420, R-sq = 93.4%, R-sq(adj) = 91.8%

Analysis of Variance

| SOURCE | DF | SS | MS |
|---|---|---|---|
| Regression | 4 | 13.2587 | 3.3147 |
| Error | 16 | 0.9369 | 0.0586 |
| Total | 20 | 14.1956 | |

| SOURCE | DF | SEQ SS |
|---|---|---|
| subj0 | 1 | 0.6294 |
| subj1 | 1 | 0.0911 |
| suborg | 1 | 0.0229 |
| subflex | 1 | 12.5154 |

After making data charts for phase 2, I looked at the first official rigid pipe datum for each of the 8 test subjects. Of these, 5 are organized, 5 use large holes, 4 use a straight path and, of course, 4 are from experienced plumbers. I don't expect any great bias from time series effects in phase 2. There should be less confounding with the initial rigid trials more balanced. Doing 2 practice runs should also even out the data some.
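The printed regression output in Table 2.9 is internally consistent, which can be verified with a few lines of arithmetic. In this sketch the coefficient signs are an assumption: they are taken as negative to match the negative t-ratios in the printout. The remaining numbers are transcribed from the table.

```python
import math

# Values transcribed from Table 2.9; coefficient signs assumed from the t-ratios.
coef = {"subj0": -0.4479, "subj1": -0.1613, "suborg": -0.1970, "subflex": -1.5821}
stdev = {"subj0": 0.1293, "subj1": 0.1293, "suborg": 0.1082, "subflex": 0.1082}
ss_regression, ss_error, ss_total = 13.2587, 0.9369, 14.1956
df_error, df_total = 16, 20

# t-ratio = coefficient / standard deviation of the coefficient
t_ratio = {name: coef[name] / stdev[name] for name in coef}

# S is the square root of the error mean square
s = math.sqrt(ss_error / df_error)

# R-squared and adjusted R-squared from the sums of squares
r_sq = ss_regression / ss_total
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)

for name, t in t_ratio.items():
    print(f"{name:8s} t = {t:.2f}")
print(f"S = {s:.4f}, R-sq = {r_sq:.1%}, R-sq(adj) = {r_sq_adj:.1%}")
```

The recomputed t-ratios, S and both R-squared values agree with the printout to the displayed precision, which supports reading all four slopes as negative.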
Looking Forward to Phase 2: Are 64 Data Enough?

As I said earlier, one reason for doing a pilot study was to be able to decide how much data to collect in phase 2. Phase 2 looked at 5 factors with 2 levels each. I wanted to be able to make inferences about all my interactions (2, 3, 4 and 5 way), so I needed at least 2 repetitions of each treatment. I needed to collect at least $2^6$ or 64 data. If necessary I could have collected more.

The statistics I'm most interested in are the percentages of time saved by using the faster methods. Specifically, I want to be sure that when changing factor levels improves the true mean by a modest amount, it will manifest itself as a significant factor. Because I'm doing my analysis on the logs of the times rather than the times themselves, I want to start out looking at the differences in treatment means rather than their ratios.

So suppose that in phase 2 I'm evaluating a factor which has mean $\mu_0$ for the slow method and $\mu_1$ for the faster method. Again, I'm in the logarithmic domain here. I knew that after evaluating my data for phase 2, I would have statistics $\bar{x}_0$ and $\bar{x}_1$ which are unbiased point estimates of $\mu_0$ and $\mu_1$. At this point I want to consider $U = \mu_0 - \mu_1$ with $\hat{U} = \bar{x}_0 - \bar{x}_1$. The statistic

$$\frac{\hat{U} - U}{\sqrt{\mathrm{MSE}\left[\frac{2}{n/2}\right]}}$$

has a t distribution on the same number of degrees of freedom as the MSE, 32 in this case. (Neter p. 718-9) I didn't yet know MSE for phase 2, so I used my MSE from phase 1. I felt that with more explanatory variables in phase 2, MSE would tend to be smaller. I also expected less general variance because, as I said in the discussion of time series effects, with 2 practice runs the times should be more consistent.

I want to do a lower tail test because I'm not worried about $\hat{U}$ being larger than $U$. Only if $\hat{U}$ is too small can we have an important improvement not show up as significant.

$$t(0.9; 32) = 1.694$$

$$\sqrt{\mathrm{MSE}\left[\frac{2}{n/2}\right]} = \sqrt{0.0942\left(\frac{2}{64/2}\right)} = 0.0767$$

So with 90% confidence, $U - \hat{U} \le 0.0767 \times 1.694 = 0.1299$.
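The arithmetic behind this bound can be sketched as follows. The inputs are the values quoted in the text (the phase 1 MSE, the planned 64 observations, and the critical value 1.694); the variable names are illustrative only.

```python
import math

mse = 0.0942     # mean square error carried over from the pilot study
n = 64           # planned number of observations in phase 2
t_crit = 1.694   # critical t value quoted in the text for 32 d.f.

per_level = n / 2                          # observations at each level of a 2-level factor
std_err = math.sqrt(mse * 2 / per_level)   # standard error of a difference of two level means
threshold = t_crit * std_err               # bound on U minus its estimate

print(f"standard error = {std_err:.4f}")
print(f"threshold      = {threshold:.4f}")
```

Carrying the standard error at full precision gives a threshold a hair above the 0.1299 obtained from the rounded 0.0767; the difference is immaterial for the argument.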
This means that if $\hat{U} > 0.1299$ it will be found to be significant. Further, if $U > 2 \times 0.1299 = 0.260$, we can be 90% confident that the parameter will be found to be significant.

This is just the difference of the transformed values. So far all our analysis has been done in log-of-time mode. Let $T$ = time of completion, so that $\log(T) = \mu$. If factor level 1 shaves percentage $P$ off of level 0, then $T_1 = T_0(1-P)$.

$$\log(T_1) = \log(T_0(1-P))$$
$$\mu_1 = \log(T_0(1-P))$$
$$\mu_1 = \mu_0 + \log(1-P)$$
$$\mu_0 - \mu_1 = -\log(1-P)$$
$$U = -\log(1-P)$$
$$1 - P = e^{-U}$$
$$P = 1 - e^{-U}$$

If $U = 0.260$, $P = 0.23$. That is, whenever changing treatment levels improves speed by 23% or better, we can reasonably expect a significant statistic.

Estimates of Improvement Percentages

Table 2.10 shows means from the MINITAB ANOVA analysis. For flexibility, $\hat{U} = 5.794 - 4.0823 = 1.7117$. Therefore, a 90% confidence interval for $U$ is

$$1.7117 \pm \sqrt{0.0942\left(\frac{2}{12}\right)} \times 1.782 = 1.7117 \pm 0.2233 = \{1.477, 1.935\}$$

Since $P = 1 - e^{-U}$, our confidence interval for $P$ is from 77% to 85%, with 81% being the point estimate. If we proceed in the same manner for organization we get

$$0.3265 \pm 0.2233 = \{0.103, 0.5498\}$$

This would imply that $P$ is between 10% and 42% with a point estimate of 27%. But as we said earlier, the organization effect might be inflated by the time series effects. Table 2.9, the regression analysis I did when I threw out the 3 points I wasn't sure about, gives the slope for the organization effect as $-0.197$. Since the interval from 0 to 1 is 1 unit wide, changing from 0 to 1 decreases the response variable by 0.197. Therefore $\hat{U} = 0.197$. Using the same standard error as before, $P$ now falls in a confidence interval of -3% to 34%, with 18% most likely. That is, we're no longer sure that there even is an organization effect.
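The conversion from a log-scale difference to a percentage improvement, $P = 1 - e^{-U}$, can be sketched numerically as follows. The function name `improvement` and the dictionary layout are illustrative assumptions; the input numbers are those quoted in the text and in Tables 2.9 and 2.10.

```python
import math


def improvement(u):
    """Fraction of time saved, P = 1 - exp(-U), for a log-scale difference U."""
    return 1 - math.exp(-u)


# Half-width of the 90% interval: t value and MSE as quoted in the text,
# with 12 observations at each factor level in the pilot study.
half_width = 1.782 * math.sqrt(0.0942 * 2 / 12)

estimates = {
    "flexibility": 5.7940 - 4.0823,        # level means from Table 2.10
    "organization": 5.1014 - 4.7749,       # level means from Table 2.10
    "organization (regression)": 0.197,    # slope magnitude from Table 2.9
}

for name, u_hat in estimates.items():
    low, high = u_hat - half_width, u_hat + half_width
    print(f"{name}: P = {improvement(u_hat):.1%} "
          f"(interval {improvement(low):.1%} to {improvement(high):.1%})")
```

Two small remarks, offered tentatively: the quoted percentages appear to be mostly truncated rather than rounded, and computing 1.7117 - 0.2233 directly gives a lower endpoint near 1.488 rather than the printed 1.477, which leaves the 77% bound unchanged.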
36 MEANS subject N rsp 0 8 4.6994 1 8 4.9917 2 8 5.1235 org N rsp 0 12 5.1014 1 12 4.7749 flex N rsp 0 12 5.7940 1 12 4.0823 MTB > nopaper Table 2.10 37 CHAPTER 3 5 FACTOR FACTORIAL ANALYSIS In this part of the study, I collected 64 data, so that I could study 5 factors at 2 levels each with 2 data per treatment. Each of 8 subjects were timed on 8 treatments. 4 subjects were experienced plumbers, and 4 were not. As in phase 1, the order in which the treatments were administered was randomized. However, this time there were 16 treatments that a given subject could be given. The first factor was plumbing experience, and so depended only on who the subject was. To get the right assignments I paired subjects up and assigned the 16 treatments randomly by sampling without replacement, the integers from 0 through 15. Each of these numbers corresponds to a treatment, the first 8 treatments were carried out by the first subject. The last 8 by the second See tables 3.1-8. Equality of Variance Figure 3.1 shows residuals plotted against organization. 
To 38 By Variable Subjects 0,1 In Sequence ord__________ref__________experience organ flex_________________holes path________________time 14 0 0-inexp 0-disora 0-riaid 0-small 0-crooked 711 13 1 0-inexp 0-disorg 0-riaid 0-small 1-straight 263 15 2 0-inexp 0-disora 0-riaid 1-large 0-crooked 719 5 3 0-inexp 0-disora 0-riaid 1-larae 1-straight 105 7 4 0-inexp 0-dlsorg 1-flex 0-small 0-crooked 86 4 5 0-inexp 0-disora 1-flex 0-small 1-straight 55 8 6 0-inexp 0-disora 1-flex 1-large 0-crooked 138 3 7 0-inexp 0-disora 1-flex 1-large 1-straight 43 6 8 0-inexp 1-ora 0-riaid 0-small 0-crooked 474 11 9 0-inexp 1-ora 0-riaid 0-small 1-straight 94 2 10 0-inexp 1-ora 0-riaid 1-large 0-crooked 454 10 11 0-inexp 1-org 0-rigid 1-large 1-straight 80 1 12 0-inexp 1-ora 1-flex 0-small 0-crooked 58 0 13 0-inexp 1-ora 1-flex 0-small 1-straight 52 12 14 0-inexp 1-org 1-flex 1-large 0-crooked 63 9 15 0-inexp 1-flex 1-large 1-straight 67 ord_______ref________experiencr organ flex holes path time 0 13 0-inexp 1-ora 1-flex 0-small 1-straight 52 1 12 0-inexp 1-org 1-flex 0-small 0-crooked 58 2 10 0-inexp 1-org 0-rigid 1-large 0-crooked 454 3 7 0-inexp 0-disorg 1-flex 1-large 1-straight 43 4 5 0-inexp 0-disorg 1-flex 0-small 1-straight 55 5 3 0-inexp 0-dlsorg 0-rigid 1-large 1-straight 105 6 8 0-inexp 1-org 0-rigid 0-small 0-crooked 474 7 4 0-inexp O-disorg 1-flex 0-small 0-crooked 86 8 ' TJ 0-inexp 0-dlsorg 1-flex 1-large 0-crooked 138 9 15 0-inexp 1-ora 1-flex 1-large 1-straight 67 10 11 0-inexp 1-ora 0-rigid 1-large 1-straight 80 11 9 0-inexp 1-org 0-rigid 0-small 1-straight 94 12 14 0-inexp 1-ora 1-flex 1-large O-crooked 63 13 1 O-inexp 0-disorg 0-riaid 0-small 1-straight 263 14 0 0-inexp 0-disora 0-rigid 0-small 0-crooked 711 15 2 0-inexp 0-disora 0-riaid 1-large 0-crooked 719 ON By Variable Subjects 2, 3 ord_______ml_________experience organ flex holes path 15 0 0-lnexp 0-disorg 0-rigid 0-small 0-crooked 353 3 1 0-inexp 0-disorg 0-rigid 0-small 1-straight 172 1 2 0-inexp 0-disorg 0-rigid 
1-large 0-crooked 342
  2    3  0-inexp     0-disorg  0-rigid  1-large  1-straight   121
 12    4  0-inexp     0-disorg  1-flex   0-small  0-crooked     89
  0    5  0-inexp     0-disorg  1-flex   0-small  1-straight    59
  4    6  0-inexp     0-disorg  1-flex   1-large  0-crooked     44
 14    7  0-inexp     0-disorg  1-flex   1-large  1-straight    42
 10    8  0-inexp     1-org     0-rigid  0-small  0-crooked    376
  9    9  0-inexp     1-org     0-rigid  0-small  1-straight    60
 11   10  0-inexp     1-org     0-rigid  1-large  0-crooked    423
  5   11  0-inexp     1-org     0-rigid  1-large  1-straight    64
  8   12  0-inexp     1-org     1-flex   0-small  0-crooked     55
  6   13  0-inexp     1-org     1-flex   0-small  1-straight    40
  7   14  0-inexp     1-org     1-flex   1-large  0-crooked     43
 13   15  0-inexp     1-org     1-flex   1-large  1-straight    51

In Sequence
ord  ref  experience  organ     flex     holes    path        time
  0    5  0-inexp     0-disorg  1-flex   0-small  1-straight    59
  1    2  0-inexp     0-disorg  0-rigid  1-large  0-crooked    342
  2    3  0-inexp     0-disorg  0-rigid  1-large  1-straight   121
  3    1  0-inexp     0-disorg  0-rigid  0-small  1-straight   172
  4    6  0-inexp     0-disorg  1-flex   1-large  0-crooked     44
  5   11  0-inexp     1-org     0-rigid  1-large  1-straight    64
  6   13  0-inexp     1-org     1-flex   0-small  1-straight    40
  7   14  0-inexp     1-org     1-flex   1-large  0-crooked     43
  8   12  0-inexp     1-org     1-flex   0-small  0-crooked     55
  9    9  0-inexp     1-org     0-rigid  0-small  1-straight    60
 10    8  0-inexp     1-org     0-rigid  0-small  0-crooked    376
 11   10  0-inexp     1-org     0-rigid  1-large  0-crooked    423
 12    4  0-inexp     0-disorg  1-flex   0-small  0-crooked     89
 13   15  0-inexp     1-org     1-flex   1-large  1-straight    51
 14    7  0-inexp     0-disorg  1-flex   1-large  1-straight    42
 15    0  0-inexp     0-disorg  0-rigid  0-small  0-crooked    353

By Variable    Subjects 4, 5
ord  ref  experience  organ     flex     holes    path        time
 15    0  1-exper     0-disorg  0-rigid  0-small  0-crooked    504
  7    1  1-exper     0-disorg  0-rigid  0-small  1-straight    58
 13    2  1-exper     0-disorg  0-rigid  1-large  0-crooked    385
  8    3  1-exper     0-disorg  0-rigid  1-large  1-straight   257
  6    4  1-exper     0-disorg  1-flex   0-small  0-crooked     46
 10    5  1-exper     0-disorg  1-flex   0-small  1-straight    71
 14    6  1-exper     0-disorg  1-flex   1-large  0-crooked     37
  0    7  1-exper     0-disorg  1-flex   1-large  1-straight    38
  1    8  1-exper     1-org     0-rigid  0-small  0-crooked    292
 11    9  1-exper     1-org     0-rigid  0-small  1-straight   102
  9   10  1-exper     1-org     0-rigid  1-large  0-crooked    328
  2   11  1-exper     1-org     0-rigid  1-large  1-straight    39
  5   12  1-exper     1-org     1-flex   0-small  0-crooked     32
  4   13  1-exper     1-org     1-flex   0-small  1-straight    32
  3   14  1-exper     1-org     1-flex   1-large  0-crooked     30
 12   15  1-exper     1-org     1-flex   1-large  1-straight    26

In Sequence
ord  ref  experience  organ     flex     holes    path        time
  0    7  1-exper     0-disorg  1-flex   1-large  1-straight    38
  1    8  1-exper     1-org     0-rigid  0-small  0-crooked    292
  2   11  1-exper     1-org     0-rigid  1-large  1-straight    39
  3   14  1-exper     1-org     1-flex   1-large  0-crooked     30
  4   13  1-exper     1-org     1-flex   0-small  1-straight    32
  5   12  1-exper     1-org     1-flex   0-small  0-crooked     32
  6    4  1-exper     0-disorg  1-flex   0-small  0-crooked     46
  7    1  1-exper     0-disorg  0-rigid  0-small  1-straight    58
  8    3  1-exper     0-disorg  0-rigid  1-large  1-straight   257
  9   10  1-exper     1-org     0-rigid  1-large  0-crooked    328
 10    5  1-exper     0-disorg  1-flex   0-small  1-straight    71
 11    9  1-exper     1-org     0-rigid  0-small  1-straight   102
 12   15  1-exper     1-org     1-flex   1-large  1-straight    26
 13    2  1-exper     0-disorg  0-rigid  1-large  0-crooked    385
 14    6  1-exper     0-disorg  1-flex   1-large  0-crooked     37
 15    0  1-exper     0-disorg  0-rigid  0-small  0-crooked    504

By Variable    Subjects 6, 7
ord  ref  experience  organ     flex     holes    path        time
  9    0  1-exper     0-disorg  0-rigid  0-small  0-crooked    442
  1    1  1-exper     0-disorg  0-rigid  0-small  1-straight    98
  3    2  1-exper     0-disorg  0-rigid  1-large  0-crooked    324
  8    3  1-exper     0-disorg  0-rigid  1-large  1-straight   112
 12    4  1-exper     0-disorg  1-flex   0-small  0-crooked     56
 13    5  1-exper     0-disorg  1-flex   0-small  1-straight    47
 14    6  1-exper     0-disorg  1-flex   1-large  0-crooked     52
  4    7  1-exper     0-disorg  1-flex   1-large  1-straight    40
 11    8  1-exper     1-org     0-rigid  0-small  0-crooked    467
 10    9  1-exper     1-org     0-rigid  0-small  1-straight    59
  5   10  1-exper     1-org     0-rigid  1-large  0-crooked    213
  7   11  1-exper     1-org     0-rigid  1-large  1-straight    56
  6   12  1-exper     1-org     1-flex   0-small  0-crooked     44
  0   13  1-exper     1-org     1-flex   0-small  1-straight    40
 15   14  1-exper     1-org     1-flex   1-large  0-crooked     42
  2   15  1-exper     1-org     1-flex   1-large  1-straight    29

In Sequence
ord  ref  experience  organ     flex     holes    path        time
  0   13  1-exper     1-org     1-flex   0-small  1-straight    40
  1    1  1-exper     0-disorg  0-rigid  0-small  1-straight    98
  2   15  1-exper     1-org     1-flex   1-large  1-straight    29
  3    2  1-exper     0-disorg  0-rigid  1-large  0-crooked    324
  4    7  1-exper     0-disorg  1-flex   1-large  1-straight    40
  5   10  1-exper     1-org     0-rigid  1-large  0-crooked    213
  6   12  1-exper     1-org     1-flex   0-small  0-crooked     44
  7   11  1-exper     1-org     0-rigid  1-large  1-straight    56
  8    3  1-exper     0-disorg  0-rigid  1-large  1-straight   112
  9    0  1-exper     0-disorg  0-rigid  0-small  0-crooked    442
 10    9  1-exper     1-org     0-rigid  0-small  1-straight    59
 11    8  1-exper     1-org     0-rigid  0-small  0-crooked    467
 12    4  1-exper     0-disorg  1-flex   0-small  0-crooked     56
 13    5  1-exper     0-disorg  1-flex   0-small  1-straight    47
 14    6  1-exper     0-disorg  1-flex   1-large  0-crooked     52
 15   14  1-exper     1-org     1-flex   1-large  0-crooked     42
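In the listings above, the ref number appears to encode the four within-subject factor levels as a 4-bit number: organization counts 8, flexibility 4, hole size 2 and path 1. The sketch below decodes a ref value under that assumption. The helper name `decode_ref` and the bit layout are inferred from the listed rows and are not stated in the thesis.

```python
# Assumed layout (inferred from the listings, not stated in the thesis):
# ref = 8*organ + 4*flex + 2*holes + 1*path, each factor coded 0 or 1.
FACTORS = [
    ("organ", ("0-disorg", "1-org")),
    ("flex", ("0-rigid", "1-flex")),
    ("holes", ("0-small", "1-large")),
    ("path", ("0-crooked", "1-straight")),
]


def decode_ref(ref):
    """Return the factor levels implied by a ref number from 0 to 15."""
    if not 0 <= ref <= 15:
        raise ValueError("ref must be between 0 and 15")
    levels = {}
    for position, (name, labels) in enumerate(FACTORS):
        # organ is the most significant of the four bits, path the least.
        bit = (ref >> (3 - position)) & 1
        levels[name] = labels[bit]
    return levels


if __name__ == "__main__":
    for ref in (0, 5, 10, 15):
        print(ref, decode_ref(ref))
```

Comparing the decoded levels with the printed labels for each row is a quick way to catch transcription slips in either the ref column or the factor columns.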
[Figure 3.1: residuals plotted against organization]

To produce the residuals, the logs of completion times are used as the response variable, just as in phase 1. The residuals appear to have less variance for organized fittings than for disorganized fittings. Remember that the same issue arose in the pilot study.

As in the pilot study, we want to do a Hartley test to make sure we can use a standard ANOVA model. Our sample variances this time are 0.0277 for organized fittings and 0.0629 for disorganized fittings. The test statistic is H = 0.0629/0.0277 = 2.27. We have 63 d.f. For α = .05, H* = 1.67. For α = .01, H* = 1.96. Since H exceeds both critical values, this time we really do have a variance problem. On the one hand, this completely confirms my earlier hypothesis that organizing reduces variance. However, we can't proceed with the analysis until we fix the problem.

So next I used the transformation Y' = log(log(Y)). Now the sample variances are 0.0461 and 0.0385. H = 1.066 < 1.67. In the pilot study, I used Y' = log(Y) for theoretical reasons; we were studying multiplicative agents. Here I'm only using Y' = log(log(Y)) because it makes the data analyzable.

Figure 3.2 shows the new residuals plotted against organization. Figures 3.3-5 are similar plots for experience, flexibility and path respectively. Figures 3.6-7 show that the new transformation drives the residual variances farther apart in the case of hole size. However, the

[Figure 3.2: new residuals plotted against organization]

[Figure 3.3: new residuals plotted against experience]
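The Hartley check described earlier in this section comes down to a ratio of the largest to the smallest sample variance, compared with a tabulated critical value. A minimal sketch follows, using the variances of the log completion times quoted in the text. The function name `hartley_h` is hypothetical, and the critical values are copied from the text rather than computed.

```python
def hartley_h(variances):
    """Hartley's statistic: largest sample variance over the smallest."""
    smallest = min(variances)
    if smallest <= 0:
        raise ValueError("sample variances must be positive")
    return max(variances) / smallest


# Sample variances quoted in the text (organized, disorganized fittings).
h = hartley_h([0.0277, 0.0629])

# Critical values H* as quoted in the text for 63 d.f. (not computed here).
H_STAR_05 = 1.67
H_STAR_01 = 1.96

if __name__ == "__main__":
    print(f"H = {h:.2f}")
    print("exceeds H* at alpha = .05:", h > H_STAR_05)
    print("exceeds H* at alpha = .01:", h > H_STAR_01)
```

Taking the maximum and minimum inside the function keeps the ratio at or above 1 whichever group is listed first.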
|