DIFFERENCE AND DIFFERENTIAL EQUATIONS IN POPULATION GENETICS

by Aesoo Chung
B.S., University of Colorado at Denver, 1996

A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science, Applied Mathematics, 1998
This thesis for the Master of Science degree by Aesoo Chung has been approved by

William L. Briggs

Fisher

Date
Chung, Aesoo (M.S., Applied Mathematics)
Difference and Differential Equations in Population Genetics
Thesis directed by Professor William L. Briggs

ABSTRACT

Genetics is the modern science of heredity, which originated in the 19th century with the discoveries of Darwin and Mendel. The hereditary information of all living cells is transmitted by genes, which contain DNA. Alternative forms of a single gene are called alleles, and the physical location of a gene is called a locus. Using basic Mendelian genetics, we can derive basic models for large randomly mating diploid populations which describe the gene pool proportion of different types of alleles. With these models, we will study three cases using both ordinary difference and differential equations: two alleles with one locus, two alleles with two loci, and two alleles with three loci.

This abstract accurately represents the content of the candidate's thesis. I recommend its publication.

Signed
William L. Briggs
ACKNOWLEDGMENT

Professor Briggs has willingly lent his expertise and guidance in a caring and supportive manner. For his work in mentoring, I give my complete thanks. Professors Fisher and Bennethum have also given me their time and talent, earning my appreciation.
CONTENTS

Chapter
1. Introduction and Background
2. Case of Two Alleles, One Locus
   2.1 Ordinary Difference Equations
   2.2 Ordinary Differential Equations
3. Case of Two Alleles, Two Loci
   3.1 Ordinary Difference Equations
   3.2 Ordinary Differential Equations
4. Case of Two Alleles, Three Loci
5. Conclusion
References
FIGURES

Figure
2.1 (a) $s_1 > s_2 > s_3 > 0$, allele A dominates; (b) $s_3 > s_2 > s_1 > 0$, allele a dominates
2.2 (a) $s_2 > s_1 > 0$, $s_2 > s_3 > 0$, type Aa dominates (Heterosis); (b) $0 < s_2 < s_1$, $0 < s_2 < s_3$, type Aa is least favored
2.3 (a) $\sigma_1 > \sigma_2 > \sigma_3$, allele A dominates; (b) $\sigma_3 > \sigma_2 > \sigma_1$, allele a dominates
2.4 (a) $\sigma_2 > \sigma_1$, $\sigma_2 > \sigma_3$, Aa dominates (Heterosis); (b) and (c) $\sigma_2 < \sigma_1$, $\sigma_2 < \sigma_3$, Aa is least favored
3.1 Fitness Curve I
3.2 Phase Portraits: (a) $b > a > 1 > c = 0$; (b) $b > 1 > a > c = 0$
3.3 Fitness Curve II
3.4 (a) $b > a > 0 > c = -1$; (b) $b > 0 > a > c = -1$
3.5 (a) $b > a > 0 > c = -1$; (b) $b > 0 > a > c = -1$
3.6 Phase Portraits, $b > a > c = 0$: (a) $(p_0, r_0) =$ ...
... (a) $b > a > c = 0 > d$; (b) $b > a > 0 > c > d$
4.3 Growth rates: (a) $b > a > c = 0 > d$; (b) $b > a > 0 > c > d$
4.4 Growth rates: (a) $b > a > c = 0 > d$; (b) $b > a > 0 > c > d$
4.5 Growth rates: (a) $b > a > c = 0 > d$; (b) $b > a > 0 > c > d$
TABLES

Table
1.1 Mendel's laws
3.1 Probability of genotypes being selected for reproduction
3.2 Fitness parameters of genotypes
3.3 Fitness parameters of genotypes
3.4 Fitness parameters of genotypes
4.1 Fitness parameters of genotypes
1. Introduction and Background

Genetics is the modern science of heredity, which originated in the 19th century with observations by Darwin and Mendel. Mendel discovered that observable hereditary characteristics are determined by genes transmitted without change and in a predictable fashion from one generation to the next. The term genetics was introduced by British biologist William Bateson in 1907.

A cell is the smallest unit of life, consisting at a minimum of an outer membrane enclosing a watery medium containing organic molecules, including the genetic material deoxyribonucleic acid (DNA). DNA is a molecule composed of deoxyribose nucleotides that encodes the genetic information of all living cells. Chromosomes, thin chainlike structures composed of DNA and proteins found in the nucleus of a cell, are large molecules in living cells that carry the information for all of its chemical needs. Some cells have single chromosomes, and are called haploid, and others h
cells reproduce according to the simple asexual process: enlarge, divide in two, enlarge, divide in two, and so forth. This process is called cell division. Cell division produces two daughter cells that are roughly identical copies of the original cell before it starts to grow. Each round of growth and cell division is called a cell cycle [1].

In multicellular organisms, the cell cycle is only part of the life cycle. Multicellular organisms begin life as a fertilized egg. Repeated cell divisions produce the hundreds of trillions of cells that make up the adult organism. Meanwhile, a specialized set of cells in the reproductive organs undergoes a different kind of cell division, called meiosis, which is sexual division [1]. In meiosis, the parent cells create new cells, called gametes; one gamete from one parent fertilizes a gamete from the other parent to form a fertilized cell, called a zygote. In humans, the gametes are the spermatozoa (male) and ova (female). After a sperm and an egg fuse, the zygote undergoes division to form the embryo [2].

When a cell divides, it transmits two essential requirements for life to its offspring. The first requirement is the hereditary information to direct life processes, and the second is the materials that offspring need to survive and to use their hereditary information.

The hereditary information of all living cells is DNA. A molecule of DNA consists of a long chain of smaller subunits. A gene is a segment of DNA
on a chromosome that encodes the instructions for producing a specific trait. The location of a specific gene on a chromosome is called its locus. Differences in DNA composition at the same gene locus form different alleles of the gene [1]. The set of alleles of each gene carried by an organism is called its genotype, and the physical appearance of an organism is called its phenotype.

For any cell to survive, it must have a complete set of genetic instructions. Therefore, when a cell divides, it cannot simply split its set of genes in half and give each offspring cell one half of the set. The cell first duplicates its DNA, like making a photocopy of an instruction manual. Then each offspring cell receives a complete set of DNA containing all the genes [1]. Through cellular reproduction (division), hereditary information is passed on from one generation to the next.

Gregor Mendel was a monk in the monastery of St. Thomas in Brunn (now Brno), Czechoslovakia. For two years, Mendel attended the University of Vienna to earn a teaching certification. At the university, he studied botany and mathematics, among other subjects. While carrying out his monastic duties, he performed experiments on inheritance with edible peas [1].

At first, Mendel worked on one trait at a time and chose traits which had unmistakably different forms of expression, for example, a white flower versus a purple flower. Mendel raised a variety of pea plants that were true-breeding
for different forms of a single trait and cross-fertilized them. Such an experiment, involving organisms that differ in only one trait, is called a monohybrid cross. Mendel saved the resulting hybrid seeds and grew them the next year [1].

In one of his experiments, Mendel cross-fertilized a white-flowered pea with a purple-flowered pea, and all the first-generation offspring had purple flowers, as purple as the parent's. Mendel began to wonder what had happened to the white color, which seemed to disappear. To answer this question, Mendel fertilized first-generation offspring with other first-generation offspring. In the second generation of offspring, both white and purple colors showed up. Overall, about three fourths of the pea plants had purple flowers and one fourth had white flowers. Then he fertilized second-generation offspring with other second-generation offspring and found that when a white-flowered plant was fertilized with another white-flowered plant, the third-generation offspring were all white, and this was true for any generation.

In conclusion, Mendel found that all white-flowered parents gave birth to white-flowered offspring, that about one third of the purple-flowered second-generation offspring were true-breeding and the remaining two thirds were hybrid, and that hybrids gave birth to both purple-flowered and white-flowered offspring. Hence, the second-generation offspring had one fourth true-breeding
plants, one half hybrid purple-flowered plants, and the last one fourth true-breeding white-flowered plants. By observing these results, Mendel formed five laws:

(1) Each trait is determined by pairs of discrete physical units, now called genes. Each individual organism has two genes that together control the expression of a given trait (e.g., genes for purple or white flowers).
(2) Pairs of genes separate from each other during gamete formation.
(3) Which member of a pair of genes becomes included in a gamete is determined by chance.
(4) There may be two or more alternative forms of a gene. White-flowered peas have a different form of the gene controlling flower color than purple-flowered peas do. In many instances, one allele, called the dominant allele (the allele for purple flowers), can completely overshadow the expression (physical appearance) of the other, recessive allele (the allele for white flowers), but the recessive allele can be passed down to the next generation unchanged.
(5) True-breeding organisms have two of the same alleles for the trait, whereas hybrids have two different alleles [1].

Suppose we let the letter A be the allele for purple flowers and a the allele for white flowers. Then a true-breeding purple flower will have genotype AA, whereas a hybrid purple flower will
have genotype Aa. If we simplify Mendel's laws for two alleles, A and a, with one locus, we get the results in Table 1.1.

Parents      Frequency of progeny
             AA      Aa      aa
AA x AA      1       0       0
AA x Aa      1/2     1/2     0
AA x aa      0       1       0
Aa x Aa      1/4     1/2     1/4
Aa x aa      0       1/2     1/2
aa x aa      0       0       1

Table 1.1. Mendel's laws

Suppose we let A be brown hair color and a be blonde, and A is dominant over a. Then, according to Table 1.1, when both parents have genotype AA (both are true-breeding brunette), all of their offspring will have genotype AA as well (all the children will be true-breeding brunette). If one of the parents has AA and the other has Aa (say the father is true-breeding brunette, but the mother is not), then all of their children will be brunette, but half of them will be true-breeding brunette and the other half will be hybrid. A child of hybrid brunette parents (both have genotype Aa) is going to be either brunette or blonde: the child has a 1/4 chance of being true-breeding brunette, a 1/4 chance of being true-breeding blonde, and a 1/2 chance of being hybrid brunette. When both parents have genotype aa (true-breeding blonde), all of their children will be true-breeding blonde. The sum of the frequencies of all the genotypes for each mating instance is equal to 1.
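The entries of Table 1.1 can be reproduced by enumerating the four equally likely allele pairings of a cross. The following is a minimal sketch under that assumption; the function name `progeny_frequencies` is illustrative and does not come from the thesis.

```python
from fractions import Fraction
from itertools import product


def progeny_frequencies(parent1, parent2):
    """Genotype frequencies among offspring of a one-locus cross.

    Each parent is a two-letter genotype such as "AA", "Aa" or "aa".
    """
    freqs = {"AA": Fraction(0), "Aa": Fraction(0), "aa": Fraction(0)}
    # Each parent passes on one of its two alleles with probability 1/2,
    # so each of the four allele pairings has probability 1/4.
    for allele1, allele2 in product(parent1, parent2):
        # Sorting puts "A" before "a", so "aA" and "Aa" count as one genotype.
        genotype = "".join(sorted(allele1 + allele2))
        freqs[genotype] += Fraction(1, 4)
    return freqs


if __name__ == "__main__":
    crosses = [("AA", "AA"), ("AA", "Aa"), ("AA", "aa"),
               ("Aa", "Aa"), ("Aa", "aa"), ("aa", "aa")]
    for p1, p2 in crosses:
        f = progeny_frequencies(p1, p2)
        print(f"{p1} x {p2}: AA={f['AA']}, Aa={f['Aa']}, aa={f['aa']}")
```

Exact fractions are used so that the frequencies of each cross sum to exactly 1, matching the closing remark about Table 1.1.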
Using genetics, we have been able to describe how populations of organisms reproduce and pass various properties on to their descendants. Through the study of heredity, it is possible to understand diseases and find cures.

This paper, using the basic biology of Mendel's laws, will model changes in allele frequencies at one locus with two alleles in a randomly mating population, without genetic mutations, with systems of ordinary differential or difference equations. We then model and analyze the case of two loci with two alleles, and then three loci with two alleles.
2. Case of Two Alleles, One Locus

Using the basics of Mendelian genetics, summarized in Table 1.1, we will derive the basic model for a large randomly mating diploid population describing changes in the frequency of alleles. Consider a diploid population and denote the two possible alleles at a certain locus by A and a. An individual can have genotype AA, Aa or aa (Aa and aA are indistinguishable; we will assume the phenotypes of these two genotypes are the same). A population consists of a pool of different alleles, and by studying this pool, we can make certain predictions about these alleles in future populations.

2.1 Ordinary Difference Equations

A new population is created by reproduction, whereby genes are selected from the parents' gene pool at random according to Mendel's laws (Table 1.1). First, assume there is no overlap between generations. We will let $p_n$ denote the proportion of allele A in the gene pool at generation n, and let $q_n$ denote the proportion of allele a. The sequence $\{p_n\}$ describes the gene pool from one generation to the next [3].

A population of N individuals carries 2N genes in its pool (since we are considering a diploid population). With two possible alleles, A and a, at a locus, the probability of two A genes being selected for reproduction will be
$p_n^2$, and the probability of one A and one a gene being selected will be $2p_nq_n$. Since we take the genotype Aa to be the same as the genotype aA, we include the factor of 2. The probability of two a genes being selected will be $q_n^2$. The proportion of the gene pool consisting of allele A at the current generation, n, will be $p_n^2 + p_nq_n$ (we take half of $2p_nq_n$ because the corresponding genotype contains only one A allele). Similarly, the proportion of the gene pool of type a will be $p_nq_n + q_n^2$. Since the sum of the frequencies of all the genotypes equals 1, we have $p_n^2 + 2p_nq_n + q_n^2 = 1$. We know that $(p_n + q_n)^2 = p_n^2 + 2p_nq_n + q_n^2$, so that $(p_n + q_n)^2 = 1$, which implies $p_n + q_n = 1$. Hence $q_n = 1 - p_n$ is the proportion of allele a.

We now introduce variable fitness of genotypes and natural selection. Let the fraction of individuals surviving to the next reproduction be given by the fitness parameters $s_1$ for AA, $s_2$ for Aa, and $s_3$ for aa types of offspring. The total size of the gene pool at the nth generation is

$$w_n = s_1p_n^2 + 2s_2p_n(1-p_n) + s_3(1-p_n)^2. \qquad (2.1)$$

Note that when $s_i = 1$ for i = 1, 2, 3, we have $w_n = 1$; when $s_i > 1$ for i = 1, 2, 3, we have $w_n > 1$; and when $s_i < 1$ for i = 1, 2, 3, we have $w_n < 1$. The gene pool proportion of allele A at the next generation is determined by

$$p_{n+1} = \frac{s_1p_n^2 + s_2p_n(1-p_n)}{s_1p_n^2 + 2s_2p_n(1-p_n) + s_3(1-p_n)^2}. \qquad (2.2)$$

Also note that if $s_i = 1$ for i = 1, 2, 3, we have $p_{n+1} = p_n$, and the gene pool is at equilibrium with Hardy-Weinberg proportions, in which the proportions of AA, Aa, aa are $p_n^2$, $2p_n(1-p_n)$, $(1-p_n)^2$. Equation (2.2) was derived by Fisher, Wright,
and Haldane in the 1930's and has been widely used to study genetic traits [3]. The proportion of type a in generation n + 1 is simply $q_{n+1} = 1 - p_{n+1}$.

The changes in the gene pool through the generations can be determined by iterating (2.2) with specific fitness values. We will consider the following four cases:

(1) genotype AA is most fit, then Aa, and then aa: $s_1 > s_2 > s_3 > 0$,
(2) genotype aa is most fit, then Aa, and then AA: $s_3 > s_2 > s_1 > 0$,
(3) Aa dominates both AA and aa: $s_2 > s_1 > 0$ and $s_2 > s_3 > 0$,
(4) Aa is dominated by both AA and aa: $0 < s_2 < s_1$ and $0 < s_2 < s_3$.

Letting $p_{n+1} = F(p_n)$, we can find the fixed points of F by solving $p^* = F(p^*)$. Then

$$p^* = \frac{s_1p^{*2} + s_2p^*(1-p^*)}{s_1p^{*2} + 2s_2p^*(1-p^*) + s_3(1-p^*)^2}.$$

Multiplying both sides by the denominator of the right-hand side gives

$$p^*\left[s_1p^{*2} + 2s_2p^*(1-p^*) + s_3(1-p^*)^2\right] = s_1p^{*2} + s_2p^*(1-p^*).$$

Subtracting the right-hand side from both sides of the equation and factoring gives

$$p^*\left[(s_1 - 2s_2 + s_3)p^{*2} + (3s_2 - s_1 - 2s_3)p^* + (s_3 - s_2)\right] = 0.$$

Solving for $p^*$, we get $p^* = 0$ and

$$p^* = \frac{2s_3 + s_1 - 3s_2 \pm \sqrt{(3s_2 - s_1 - 2s_3)^2 + 4(s_1 - 2s_2 + s_3)(s_2 - s_3)}}{2s_1 - 4s_2 + 2s_3}$$
$$= \frac{2s_3 + s_1 - 3s_2 \pm \sqrt{s_2^2 - 2s_1s_2 + s_1^2}}{2(s_1 - 2s_2 + s_3)} = \frac{2s_3 + s_1 - 3s_2 \pm \sqrt{(s_2 - s_1)(s_2 - s_1)}}{2(s_1 - 2s_2 + s_3)} = \frac{2s_3 + s_1 - 3s_2 \pm (s_2 - s_1)}{2(s_1 - 2s_2 + s_3)}.$$

Then

$$p^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3} \quad\text{or}\quad p^* = 1.$$

Hence, we have three fixed points:

$$p_1^* = 0, \qquad p_2^* = 1, \qquad p_3^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}.$$

When $|F'(p^*)| < 1$, $p^*$ is a stable fixed point, and if $|F'(p^*)| > 1$, then $p^*$ is an unstable fixed point. If $|F'(p^*)| = 1$, then it is neutral. Since

$$F'(p_n) = \frac{p_n^2(s_1s_2 - 2s_1s_3 + s_2s_3) + 2s_3p_n(s_1 - s_2) + s_2s_3}{\left(p_n^2(s_1 - 2s_2 + s_3) + 2p_n(s_2 - s_3) + s_3\right)^2},$$

evaluating the derivative $F'(p_n)$ at the fixed points, we get

$$F'(p_1^*) = \frac{s_2}{s_3}, \qquad F'(p_2^*) = \frac{s_2}{s_1}, \qquad F'(p_3^*) = \frac{s_1s_2 - 2s_1s_3 + s_2s_3}{s_2^2 - s_1s_3}.$$

Now we evaluate these conditions with the four cases under consideration.
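The fixed points and derivative values above can be checked numerically by iterating equation (2.2). The sketch below is illustrative only: the function names and the fitness values are assumptions chosen for the example, not values used in the thesis.

```python
def next_p(p, s1, s2, s3):
    """One generation of equation (2.2): frequency of allele A after selection."""
    w = s1 * p**2 + 2 * s2 * p * (1 - p) + s3 * (1 - p)**2  # equation (2.1)
    return (s1 * p**2 + s2 * p * (1 - p)) / w


def interior_fixed_point(s1, s2, s3):
    """Third fixed point (s3 - s2)/(s1 - 2*s2 + s3); needs a nonzero denominator."""
    return (s3 - s2) / (s1 - 2 * s2 + s3)


def dF(p, s1, s2, s3):
    """Derivative F'(p) of the map (2.2), using the closed form stated above."""
    num = (s1 * s2 - 2 * s1 * s3 + s2 * s3) * p**2 + 2 * s3 * (s1 - s2) * p + s2 * s3
    den = ((s1 - 2 * s2 + s3) * p**2 + 2 * (s2 - s3) * p + s3) ** 2
    return num / den


def iterate(p0, s, generations=500):
    """Apply the map (2.2) repeatedly, starting from p0."""
    p = p0
    for _ in range(generations):
        p = next_p(p, *s)
    return p


if __name__ == "__main__":
    # Hypothetical fitness triples (s1, s2, s3), one per case in the text.
    cases = {
        "(1) s1 > s2 > s3": (1.3, 1.0, 0.8),
        "(2) s3 > s2 > s1": (0.8, 1.0, 1.3),
        "(3) s2 largest": (0.8, 1.0, 0.6),
        "(4) s2 smallest": (1.0, 0.6, 0.9),
    }
    for name, s in cases.items():
        p3 = interior_fixed_point(*s)
        print(name, "p3* =", round(p3, 4),
              "F'(0) =", round(dF(0.0, *s), 4),
              "F'(1) =", round(dF(1.0, *s), 4),
              "limit from p0 = 0.1:", round(iterate(0.1, s), 6))
```

Comparing `dF(0, ...)` with `s2/s3` and `dF(1, ...)` with `s2/s1` gives a quick consistency check on the derivative formula before it is used in the case analysis.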
(1) $s_1 > s_2 > s_3 > 0$

(a) $|F'(p_1^*)| = \frac{s_2}{s_3} > 1 \Rightarrow p_1^*$ is unstable.
(b) $|F'(p_2^*)| = \frac{s_2}{s_1} < 1 \Rightarrow p_2^*$ is stable.

For this case, we have two valid fixed points, $p_1^* = 0$ and $p_2^* = 1$; $p_1^* = 0$ is unstable and $p_2^* = 1$ is stable (see Figure 2.1a). The third fixed point $p_3^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}$ lies outside the unit interval [0, 1] and is not a valid fixed point.

Figure 2.1. (a) $s_1 > s_2 > s_3 > 0$, allele A dominates; (b) $s_3 > s_2 > s_1 > 0$, allele a dominates

In Figure 2.1a, the slope of the graph of F at the fixed point $p_2^* = 1$ is less than one and the slope at $p_1^* = 0$ is greater than one. Hence $p_2^* = 1$ is stable; allele A is favored and will eventually dominate the gene pool, since the proportion of A goes to 1.
(2) $s_3 > s_2 > s_1 > 0$

(a) $|F'(p_1^*)| = \frac{s_2}{s_3} < 1 \Rightarrow p_1^*$ is stable.
(b) $|F'(p_2^*)| = \frac{s_2}{s_1} > 1 \Rightarrow p_2^*$ is unstable.

Thus, $p_1^* = 0$ is stable and $p_2^* = 1$ is unstable (see Figure 2.1b). The third fixed point $p_3^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}$ lies outside the unit interval [0, 1] under this condition, and hence it is not a valid fixed point.

In Figure 2.1b, the slope of the graph of F at the fixed point $p_2^* = 1$ is greater than one, and the slope at $p_1^* = 0$ is less than one. Hence $p_1^* = 0$ is stable, and allele a will eventually dominate the gene pool, since the frequency of a goes to 1.

(3) $s_2 > s_1 > 0$ and $s_2 > s_3 > 0$

(a) $|F'(p_1^*)| = \frac{s_2}{s_3} > 1 \Rightarrow p_1^*$ is unstable.
(b) $|F'(p_2^*)| = \frac{s_2}{s_1} > 1 \Rightarrow p_2^*$ is unstable.
(c) $|F'(p_3^*)| = \left|\frac{s_1s_2 - 2s_1s_3 + s_2s_3}{s_2^2 - s_1s_3}\right| < 1 \Rightarrow p_3^*$ is stable.

The explanation for the implication in (c) is as follows. Since
$s_2 > s_1 > 0$ and $s_2 > s_3 > 0$, we have $s_2^2 - s_1s_3 > 0$ and $s_2(s_1 + s_3) - 2s_1s_3 = s_1(s_2 - s_3) + s_3(s_2 - s_1) > 0$. Hence

$$\frac{s_2(s_1 + s_3) - 2s_1s_3}{s_2^2 - s_1s_3} > 0,$$

and we can drop the absolute value sign. We have $s_2 > s_1$ and $s_2 > s_3$. Then we have

$$0 < (s_2 - s_1)(s_2 - s_3) = (s_2^2 - s_1s_3) - (s_1s_2 - 2s_1s_3 + s_2s_3),$$

so that $s_1s_2 - 2s_1s_3 + s_2s_3 < s_2^2 - s_1s_3$. Thus we have shown that $|F'(p_3^*)| < 1$.

Thus, $p_1^* = 0$ and $p_2^* = 1$ are the unstable fixed points, and $p_3^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}$ is the stable fixed point (see Figure 2.2a).

Figure 2.2. (a) $s_2 > s_1 > 0$, $s_2 > s_3 > 0$, type Aa dominates (Heterosis); (b) $0 < s_2 < s_1$, $0 < s_2 < s_3$, type Aa is least favored.

In Figure 2.2a, the slope of the graph of F at the fixed points $p_2^* = 1$ and $p_1^* = 0$ is greater than one, and the slope at $p_3^*$ is less than one. Hence $p_3^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}$ is stable, and the genotype Aa will eventually dominate the gene pool. Since both allele types A and a will remain, the genotypes AA and aa will also remain in the population. This state
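is called heterosis. Hence the population is in the polymorphic state, where all the possible alleles and genotypes remain in the population.

(4) $0 < s_2 < s_1$ and $0 < s_2 < s_3$

(a) $|F'(p_1^*)| = \frac{s_2}{s_3} < 1 \Rightarrow p_1^*$ is stable.
(b) $|F'(p_2^*)| = \frac{s_2}{s_1} < 1 \Rightarrow p_2^*$ is stable.
(c) $|F'(p_3^*)| = \left|\frac{s_1s_2 - 2s_1s_3 + s_2s_3}{s_2^2 - s_1s_3}\right| > 1 \Rightarrow p_3^*$ is unstable.

The explanation for the third inequality (c) is as follows. Since $s_2 < s_1$ and $s_2 < s_3$, we have $s_1s_3 - s_2^2 > 0$ and $(s_1 - s_2)(s_3 - s_2) > 0$. Writing

$$F'(p_3^*) = \frac{2s_1s_3 - s_1s_2 - s_2s_3}{s_1s_3 - s_2^2} = 1 + \frac{(s_1 - s_2)(s_3 - s_2)}{s_1s_3 - s_2^2},$$

we see that $F'(p_3^*) > 1$. Thus we have shown that $|F'(p_3^*)| > 1$.

Thus, $p_1^* = 0$ and $p_2^* = 1$ are stable fixed points, and $p_3^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}$ is an unstable fixed point (see Figure 2.2b).

In Figure 2.2b, the slope of the graph of F at the fixed points $p_2^*$ and $p_1^*$ is less than one, and the slope at $p_3^*$ is greater than one. If the initial condition $p_0$ lies above $p_3^*$ (closer to 1), then allele A will eventually dominate the gene pool, and if $p_0$ lies below $p_3^*$ (so that $1 - p_0$ is closer to 1), then allele a will eventually dominate the gene pool.

With the difference equation (2.2), we examined a model where dominance of the gene pool by the genotypes was based on the fitness level of each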
genotype, given two alleles A and a at one locus. Among the four cases we considered, there was only one case where both alleles persisted in the gene pool, and hence all three genotypes AA, Aa and aa remained. In the other cases, either allele A or allele a died out, so that only the genotype AA or the genotype aa remained in the population. Now we will take the ordinary difference equation and convert it into an ordinary differential equation to see if the same results occur.

2.2 Ordinary Differential Equations

We now need to model a gene pool in which the allele frequencies change continuously in time, rather than in discrete generations. To do this, we must introduce a continuous time variable. Recall from Section 2.1 that the fitness parameters $s_i$ for i = 1, 2, 3 give the fraction of individuals with genotype i (i = 1 for AA, i = 2 for Aa and i = 3 for aa) that survive to the next reproduction, with $s_i > 1$ for growth, $s_i < 1$ for decay and $s_i = 1$ as the neutral case. If the growth equations were linear, we would have $s_i = 1 + r_i$, where $r_i$ is the growth fraction over a full generation ($r_i > 0$ gives growth and $r_i < 0$ gives decay).

We now let $\sigma_i$ be the growth rate for the ith genotype. Then the growth fraction over a small time interval, h, is $h\sigma_i$. Thus we let the fitness parameters $s_i$ be given by $s_i = 1 + h\sigma_i$ for i = 1, 2, 3. If $\sigma_i > 0$, we have growth,
if $\sigma_i < 0$, we have decline, and when $\sigma_i = 0$, we have $s_i = 1$ and it is neutral.

Now, we find a smooth function p(t) such that $p_n = p(nh)$ when h is small. Setting t = nh [3], we have

$$p(t+h) = \frac{s_1p(t)^2 + s_2p(t)(1-p(t))}{s_1p(t)^2 + 2s_2p(t)(1-p(t)) + s_3(1-p(t))^2}.$$

Then subtracting p(t) from both sides of this equation and dividing by h, we have

$$\frac{p(t+h) - p(t)}{h} = \frac{1}{h}\left[\frac{s_1p(t)^2 + s_2p(t)(1-p(t))}{s_1p(t)^2 + 2s_2p(t)(1-p(t)) + s_3(1-p(t))^2} - p(t)\right].$$

Rearranging and factoring p(t)(1 - p(t)), we have

$$\frac{p(t+h) - p(t)}{h} = \frac{p(t)(1-p(t))\left(s_1p(t) + s_2(1-2p(t)) - s_3(1-p(t))\right)}{h\left(s_1p(t)^2 + 2s_2p(t)(1-p(t)) + s_3(1-p(t))^2\right)}.$$

Using $s_i = 1 + h\sigma_i$ for i = 1, 2, 3, the terms of order one in the numerator cancel, since $p + (1-2p) - (1-p) = 0$, leaving a factor of h that cancels the h in the denominator. Taking the limit as h goes to 0, we get

$$\lim_{h \to 0}\frac{p(t+h) - p(t)}{h} = \frac{p(t)(1-p(t))\left(\sigma_1p(t) + \sigma_2(1-2p(t)) - \sigma_3(1-p(t))\right)}{p(t)^2 + 2p(t)(1-p(t)) + (1-p(t))^2}.$$

Since $p(t)^2 + 2p(t)(1-p(t)) + (1-p(t))^2 = 1$, we have the following differential equation:

$$p'(t) = p(t)(1-p(t))\left[\sigma_1p(t) + \sigma_2(1-2p(t)) - \sigma_3(1-p(t))\right] = G(p). \qquad (2.3)$$

The solution of equation (2.3), p(t), describes the frequency of allele A in the gene pool, and 1 - p(t) is the frequency of allele a. By setting $G(p^*) = 0$, we can find the fixed points of equation (2.3) and compare the results with the results from the difference equation (2.2). When $G(p^*) = 0$, we have

$$p^* = 0, \qquad p^* = 1 \qquad\text{and}\qquad \sigma_1p^* + \sigma_2(1-2p^*) - \sigma_3(1-p^*) = 0.$$
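Then for the differential equation (2.3), we also get three fixed points,

$$p_1^* = 0, \qquad p_2^* = 1, \qquad p_3^* = \frac{\sigma_3 - \sigma_2}{\sigma_1 - 2\sigma_2 + \sigma_3}.$$

These three fixed points are equivalent to the following fixed points for the difference equation (2.2):

$$p_1^* = 0, \qquad p_2^* = 1, \qquad p_3^* = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}.$$

Now we will examine these fixed points for stability and compare the results to the stability analysis for the difference equation. Each fixed point of the differential equation (2.3) must satisfy the inequality $G'(p^*) < 0$ in order to be stable. When $G'(p^*) > 0$, $p^*$ is unstable, and when $G'(p^*) = 0$, we have the neutral state. Since $G(p) = p(t)(1-p(t))\left[\sigma_1p(t) + \sigma_2(1-2p(t)) - \sigma_3(1-p(t))\right]$, we have

$$G'(p^*) = (1-2p^*)\left[\sigma_1p^* + \sigma_2(1-2p^*) - \sigma_3(1-p^*)\right] + p^*(1-p^*)(\sigma_1 - 2\sigma_2 + \sigma_3).$$

Evaluating $G'$ at the fixed points gives

$$G'(p_1^*) = \sigma_2 - \sigma_3, \qquad G'(p_2^*) = \sigma_2 - \sigma_1, \qquad G'(p_3^*) = \frac{(\sigma_1 - \sigma_2)(\sigma_3 - \sigma_2)}{\sigma_1 - 2\sigma_2 + \sigma_3}.$$

Now we consider the four cases used in evaluating the fixed points of the difference equation (2.2).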
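Two steps of this derivation lend themselves to a numerical sanity check: the difference quotient of (2.2) with $s_i = 1 + h\sigma_i$ should approach G(p) as h shrinks, and the derivative of G at the fixed points should match the closed forms. The sketch below assumes illustrative growth rates (0.3, 0.5, 0.1); the function names are hypothetical.

```python
def F(p, s1, s2, s3):
    """The map of equation (2.2)."""
    w = s1 * p**2 + 2 * s2 * p * (1 - p) + s3 * (1 - p)**2
    return (s1 * p**2 + s2 * p * (1 - p)) / w


def G(p, sig1, sig2, sig3):
    """Right-hand side of the differential equation (2.3)."""
    return p * (1 - p) * (sig1 * p + sig2 * (1 - 2 * p) - sig3 * (1 - p))


def dG(p, sig1, sig2, sig3):
    """Derivative of G with respect to p (product rule)."""
    bracket = sig1 * p + sig2 * (1 - 2 * p) - sig3 * (1 - p)
    return (1 - 2 * p) * bracket + p * (1 - p) * (sig1 - 2 * sig2 + sig3)


def difference_quotient(p, h, sigma):
    """(F(p) - p) / h with fitness parameters s_i = 1 + h * sigma_i."""
    s = [1 + h * sig for sig in sigma]
    return (F(p, *s) - p) / h


if __name__ == "__main__":
    sigma = (0.3, 0.5, 0.1)  # assumed growth rates, for illustration only
    p = 0.3
    for h in (1e-1, 1e-2, 1e-3, 1e-4):
        print(f"h = {h:g}: quotient = {difference_quotient(p, h, sigma):.8f}")
    print(f"G(p)           = {G(p, *sigma):.8f}")
    p3 = (sigma[2] - sigma[1]) / (sigma[0] - 2 * sigma[1] + sigma[2])
    print("G'(0), G'(1), G'(p3*):", dG(0.0, *sigma), dG(1.0, *sigma), dG(p3, *sigma))
```

The quotient differs from G(p) by a factor of $1/w$, where w is the denominator of (2.2), so the gap shrinks in proportion to h.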
(1) $\sigma_1 > \sigma_2 > \sigma_3$

(a) $G'(p_1^*) = \sigma_2 - \sigma_3 > 0 \Rightarrow p_1^*$ is unstable.
(b) $G'(p_2^*) = \sigma_2 - \sigma_1 < 0 \Rightarrow p_2^*$ is stable.

For this case, there are only two valid fixed points, $p_1^* = 0$ and $p_2^* = 1$; $p_1^*$ is an unstable fixed point and $p_2^*$ is a stable fixed point. The fixed point $p_3^* = \frac{\sigma_3 - \sigma_2}{\sigma_1 - 2\sigma_2 + \sigma_3}$ lies outside the unit interval [0, 1] and is not a valid fixed point. In Figure 2.3a, p(t) always converges to 1 and 1 - p(t) converges to 0 as t increases. This means the genotype AA will eventually dominate the gene pool. Since $s_i = 1 + h\sigma_i$ with h > 0, the condition $\sigma_1 > \sigma_2 > \sigma_3$ is equivalent to the condition $s_1 > s_2 > s_3$ for the difference equation (2.2). The result with $\sigma_1 > \sigma_2 > \sigma_3$ for equation (2.3) is the same as the result with $s_1 > s_2 > s_3$ for equation (2.2).
Figure 2.3. (a) $\sigma_1 > \sigma_2 > \sigma_3$, allele A dominates; (b) $\sigma_3 > \sigma_2 > \sigma_1$, allele a dominates.

(2) $\sigma_3 > \sigma_2 > \sigma_1$

(a) $G'(p_1^*) = \sigma_2 - \sigma_3 < 0 \Rightarrow p_1^*$ is stable.
(b) $G'(p_2^*) = \sigma_2 - \sigma_1 > 0 \Rightarrow p_2^*$ is unstable.

There are also two fixed points for case two, $p_1^*$ and $p_2^*$. The fixed point $p_3^* = \frac{\sigma_3 - \sigma_2}{\sigma_1 - 2\sigma_2 + \sigma_3}$ lies outside the unit interval [0, 1] and is not a valid fixed point. This time, $p_1^* = 0$ is stable, and $p_2^* = 1$ is unstable. Thus, p(t) converges to 0 and 1 - p(t) converges to 1 (see Figure 2.3b). Hence, the genotype aa will eventually dominate the gene pool. The condition $\sigma_1 < \sigma_2 < \sigma_3$ is equivalent to the condition $s_1 < s_2 < s_3$ for the difference equation (2.2). The result obtained with $\sigma_1 < \sigma_2 < \sigma_3$ is the same as the result with $s_1 < s_2 < s_3$ for equation (2.2).
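(3) $\sigma_2 > \sigma_1$ and $\sigma_2 > \sigma_3$

(a) $G'(p_1^*) = \sigma_2 - \sigma_3 > 0 \Rightarrow p_1^*$ is unstable.
(b) $G'(p_2^*) = \sigma_2 - \sigma_1 > 0 \Rightarrow p_2^*$ is unstable.
(c) $G'(p_3^*) = \frac{(\sigma_1 - \sigma_2)(\sigma_3 - \sigma_2)}{\sigma_1 - 2\sigma_2 + \sigma_3} < 0 \Rightarrow p_3^*$ is stable.

The explanation for inequality (c) is as follows. Suppose $\sigma_1$, $\sigma_2$ and $\sigma_3$ are all less than 0. Then $\sigma_1 - \sigma_2 < 0$ and $\sigma_3 - \sigma_2 < 0$. Also, $\sigma_1 - 2\sigma_2 + \sigma_3 < 0$. Thus, $\frac{(\sigma_1 - \sigma_2)(\sigma_3 - \sigma_2)}{\sigma_1 - 2\sigma_2 + \sigma_3} < 0$. Now suppose $\sigma_1$, $\sigma_2$ and $\sigma_3$ are all greater than 0. Then, $\sigma_1 - \sigma_2 < 0$ and $\sigma_3 - \sigma_2 < 0$. Also, $\sigma_1 - 2\sigma_2 + \sigma_3 < 0$. Hence $\frac{(\sigma_1 - \sigma_2)(\sigma_3 - \sigma_2)}{\sigma_1 - 2\sigma_2 + \sigma_3} < 0$. In either case the numerator is a product of two negative numbers, and the denominator, $(\sigma_1 - \sigma_2) + (\sigma_3 - \sigma_2)$, is negative.

In this case, there are three fixed points: $p_1^* = 0$ and $p_2^* = 1$ are unstable, and $p_3^*$ is stable. The frequency of allele A, p(t), converges to $p_3^* = \frac{\sigma_3 - \sigma_2}{\sigma_1 - 2\sigma_2 + \sigma_3}$, and the frequency of allele a, 1 - p(t), converges to $1 - p_3^*$. This means the genotype Aa will eventually dominate the gene pool, and all the genotypes AA, Aa and aa will remain in the population, a polymorphic state (see Figure 2.4a). The condition $\sigma_2 > \sigma_1$, $\sigma_2 > \sigma_3$ is equivalent to the condition $s_2 > s_1$, $s_2 > s_3$ for the difference equation (2.2). Under these conditions, we get the same stability analysis of the three fixed points for both the difference and the differential equations.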
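Figure 2.4. (a) $\sigma_2 > \sigma_1$, $\sigma_2 > \sigma_3$, Aa dominates (Heterosis); (b) and (c) $\sigma_2 < \sigma_1$, $\sigma_2 < \sigma_3$, Aa is least favored.

(4) $\sigma_2 < \sigma_1$ and $\sigma_2 < \sigma_3$

(a) $G'(p_1^*) = \sigma_2 - \sigma_3 < 0 \Rightarrow p_1^*$ is stable.
(b) $G'(p_2^*) = \sigma_2 - \sigma_1 < 0 \Rightarrow p_2^*$ is stable.
(c) $G'(p_3^*) = \frac{(\sigma_1 - \sigma_2)(\sigma_3 - \sigma_2)}{\sigma_1 - 2\sigma_2 + \sigma_3} > 0 \Rightarrow p_3^*$ is unstable.

The explanation for the third inequality (c) is as follows. Suppose $\sigma_1$, $\sigma_2$ and $\sigma_3$ are all less than 0. Then $\sigma_1 - \sigma_2 > 0$ and $\sigma_3 - \sigma_2 > 0$. Also, $\sigma_1 - 2\sigma_2 + \sigma_3 > 0$. Thus, $\frac{(\sigma_1 - \sigma_2)(\sigma_3 - \sigma_2)}{\sigma_1 - 2\sigma_2 + \sigma_3} > 0$. Now suppose $\sigma_1$, $\sigma_2$ and $\sigma_3$ are all greater than 0. Then, $\sigma_1 - \sigma_2 > 0$ and $\sigma_3 - \sigma_2 > 0$. Also, $\sigma_1 - 2\sigma_2 + \sigma_3 > 0$. Hence $\frac{(\sigma_1 - \sigma_2)(\sigma_3 - \sigma_2)}{\sigma_1 - 2\sigma_2 + \sigma_3} > 0$.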
In this case, we have two stable points, $p_1^* = 0$ and $p_2^* = 1$, and one unstable point, $p_3^*$. The frequency p(t) approaches 1 if the initial frequency of allele A, p(0), is greater than the unstable fixed point $p_3^*$, and it approaches 0 if p(0) is less than $p_3^*$. As Figure 2.4b shows, if the initial condition p(0) is closer to 1, then the genotype AA will eventually dominate the gene pool, and if the initial condition 1 - p(0) is closer to 1, then the genotype aa will dominate the gene pool. The conditions $\sigma_2 < \sigma_1$ and $\sigma_2 < \sigma_3$ are equivalent to the conditions $s_2 < s_1$ and $s_2 < s_3$ for the difference equation (2.2). Once again, the results under these conditions from both the difference and the differential equations are the same.

With the differential equation (2.3), through the stability analysis of the fixed points, we conclude that there is only one instance where all the genotypes persist in the gene pool, as we did with the difference equation (2.2). Now we will increase the number of loci to two and follow a similar investigation of the frequencies of alleles.
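The four cases for equation (2.3) can also be explored by integrating the equation numerically. The sketch below uses a classical fourth-order Runge-Kutta step as a convenient choice (the thesis does not say how its figures were computed), and the growth rates are hypothetical values picked to satisfy each case.

```python
def G(p, sigma):
    """Right-hand side of equation (2.3) for growth rates sigma = (s1, s2, s3)."""
    sig1, sig2, sig3 = sigma
    return p * (1 - p) * (sig1 * p + sig2 * (1 - 2 * p) - sig3 * (1 - p))


def integrate(p0, sigma, t_end=400.0, dt=0.05):
    """Integrate p' = G(p) from p(0) = p0 up to t_end with fixed-step RK4."""
    steps = int(round(t_end / dt))
    p = p0
    for _ in range(steps):
        k1 = G(p, sigma)
        k2 = G(p + 0.5 * dt * k1, sigma)
        k3 = G(p + 0.5 * dt * k2, sigma)
        k4 = G(p + dt * k3, sigma)
        p += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p


if __name__ == "__main__":
    # Assumed growth rates (sigma1, sigma2, sigma3) and initial frequencies.
    runs = [
        ("(1) sigma1 > sigma2 > sigma3", (0.4, 0.2, 0.1), 0.5),
        ("(2) sigma3 > sigma2 > sigma1", (0.1, 0.2, 0.4), 0.5),
        ("(3) sigma2 largest", (0.3, 0.5, 0.1), 0.1),
        ("(4) sigma2 smallest, p(0) > p3*", (0.5, 0.1, 0.3), 0.5),
        ("(4) sigma2 smallest, p(0) < p3*", (0.5, 0.1, 0.3), 0.2),
    ]
    for name, sigma, p0 in runs:
        print(f"{name}: p(0) = {p0}, p(400) = {integrate(p0, sigma):.6f}")
```

In the last two runs the unstable fixed point is $p_3^* = 1/3$, so starting at 0.5 and at 0.2 places the initial frequency on opposite sides of it.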
3. Case of Two Alleles, Two Loci

In this section, we will consider a diploid population with two alleles at two loci. When we have two alleles at two loci, we have the following four alleles: A, a, B and b. An individual can be distinguished by the nine genotypes: AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb or aabb.

3.1 Ordinary Difference Equations

We will let $p_n$ denote the frequency of allele type A in the gene pool at generation n, so that $1 - p_n$ represents the frequency of allele type a; let $r_n$ be the frequency of allele B, and $1 - r_n$ the frequency of allele type b. As in Section 2.1, the sequences $\{p_n\}$ and $\{r_n\}$ describe the gene pool from one generation to the next. The probability of each genotype being selected for reproduction at generation n is listed in Table 3.1.

Genotype   Probability
AABB       $p_n^2r_n^2$  ($p_n^2$ for AA, $r_n^2$ for BB)
AABb       $2p_n^2r_n(1-r_n)$
AAbb       $p_n^2(1-r_n)^2$
AaBB       $2p_n(1-p_n)r_n^2$
AaBb       $4p_n(1-p_n)r_n(1-r_n)$
Aabb       $2p_n(1-p_n)(1-r_n)^2$
aaBB       $(1-p_n)^2r_n^2$
aaBb       $2(1-p_n)^2r_n(1-r_n)$
aabb       $(1-p_n)^2(1-r_n)^2$

Table 3.1. Probability of genotypes being selected for reproduction

The fitness parameters for
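each genotype are listed in Table 3.2.

Genotype   Fitness parameter
AABB       $s_1$
AABb       $s_2$
AAbb       $s_3$
AaBB       $s_4$
AaBb       $s_5$
Aabb       $s_6$
aaBB       $s_7$
aaBb       $s_8$
aabb       $s_9$

Table 3.2. Fitness parameters of genotypes

Then the total size of the gene pool at the nth generation is

$$w_n = s_1p_n^2r_n^2 + 2s_2p_n^2r_n(1-r_n) + s_3p_n^2(1-r_n)^2 + 2s_4p_n(1-p_n)r_n^2 + 4s_5p_n(1-p_n)r_n(1-r_n)$$
$$\qquad + 2s_6p_n(1-p_n)(1-r_n)^2 + s_7(1-p_n)^2r_n^2 + 2s_8(1-p_n)^2r_n(1-r_n) + s_9(1-p_n)^2(1-r_n)^2.$$

Then the gene pool proportion of allele A at the next generation is determined by

$$p_{n+1} = \frac{s_1p_n^2r_n^2 + 2s_2p_n^2r_n(1-r_n) + s_3p_n^2(1-r_n)^2 + s_4p_n(1-p_n)r_n^2 + 2s_5p_n(1-p_n)r_n(1-r_n) + s_6p_n(1-p_n)(1-r_n)^2}{w_n}. \qquad (3.1)$$

We take half of $2p_n(1-p_n)r_n^2$, $4p_n(1-p_n)r_n(1-r_n)$ and $2p_n(1-p_n)(1-r_n)^2$ in the numerator of equation (3.1) because the corresponding genotypes contain only one A gene. Then the proportion of allele type a in generation n + 1 is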
given by $1 - p_{n+1}$. The proportion of allele type B is given by

$$r_{n+1} = \frac{s_1p_n^2r_n^2 + s_2p_n^2r_n(1-r_n) + 2s_4p_n(1-p_n)r_n^2 + 2s_5p_n(1-p_n)r_n(1-r_n) + s_7(1-p_n)^2r_n^2 + s_8(1-p_n)^2r_n(1-r_n)}{w_n}. \qquad (3.2)$$

Similarly, we take half of $2p_n^2r_n(1-r_n)$, $4p_n(1-p_n)r_n(1-r_n)$ and $2(1-p_n)^2r_n(1-r_n)$ in the numerator of equation (3.2) because the corresponding genotypes contain only one B gene. The proportion of allele b is given by $1 - r_{n+1}$.

We will find the fixed points of the difference equations (3.1) and (3.2). Let $p_{n+1} = F(p_n, r_n)$ and $r_{n+1} = G(p_n, r_n)$. Then the fixed points satisfy $p^* = F(p^*, r^*)$ and $r^* = G(p^*, r^*)$. To simplify the equations, we will now write p, w and r for $p^*$, $w^*$ and $r^*$, respectively. This leads to the conditions

$$pw = s_1p^2r^2 + 2s_2p^2r(1-r) + s_3p^2(1-r)^2 + s_4p(1-p)r^2 + 2s_5p(1-p)r(1-r) + s_6p(1-p)(1-r)^2,$$
$$rw = s_1p^2r^2 + s_2p^2r(1-r) + 2s_4p(1-p)r^2 + 2s_5p(1-p)r(1-r) + s_7(1-p)^2r^2 + s_8(1-p)^2r(1-r),$$

where

$$w = s_1p^2r^2 + 2s_2p^2r(1-r) + s_3p^2(1-r)^2 + 2s_4p(1-p)r^2 + 4s_5p(1-p)r(1-r) + 2s_6p(1-p)(1-r)^2$$
$$\qquad + s_7(1-p)^2r^2 + 2s_8(1-p)^2r(1-r) + s_9(1-p)^2(1-r)^2.$$
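Equations (3.1) and (3.2) translate directly into a single update function. The following is a minimal sketch in which the function name `two_locus_step` and the fitness values are assumptions for illustration; the nine fitness parameters are passed in the order of Table 3.2.

```python
def two_locus_step(p, r, s):
    """One generation of (3.1) and (3.2).

    p, r: current frequencies of alleles A and B.
    s: nine fitness parameters (s1, ..., s9) in the order of Table 3.2.
    Returns the pair (p_next, r_next).
    """
    s1, s2, s3, s4, s5, s6, s7, s8, s9 = s
    q, u = 1 - p, 1 - r  # frequencies of alleles a and b
    w = (s1 * p * p * r * r + 2 * s2 * p * p * r * u + s3 * p * p * u * u
         + 2 * s4 * p * q * r * r + 4 * s5 * p * q * r * u + 2 * s6 * p * q * u * u
         + s7 * q * q * r * r + 2 * s8 * q * q * r * u + s9 * q * q * u * u)
    # Numerator of (3.1): genotypes with one A gene contribute half their weight.
    num_p = (s1 * p * p * r * r + 2 * s2 * p * p * r * u + s3 * p * p * u * u
             + s4 * p * q * r * r + 2 * s5 * p * q * r * u + s6 * p * q * u * u)
    # Numerator of (3.2): genotypes with one B gene contribute half their weight.
    num_r = (s1 * p * p * r * r + s2 * p * p * r * u
             + 2 * s4 * p * q * r * r + 2 * s5 * p * q * r * u
             + s7 * q * q * r * r + s8 * q * q * r * u)
    return num_p / w, num_r / w


if __name__ == "__main__":
    # Hypothetical fitness values, chosen only to exercise the map.
    s = (0.9, 1.1, 0.7, 1.2, 1.0, 1.3, 0.8, 1.4, 0.6)
    p, r = 0.2, 0.8
    for n in range(5):
        print(f"n = {n}: p = {p:.6f}, r = {r:.6f}")
        p, r = two_locus_step(p, r, s)
```

With all nine fitness parameters equal to 1 the update leaves (p, r) unchanged, the two-locus analogue of the equilibrium remark made for equation (2.2).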
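Subtracting the right-hand side from both sides of the equations and factoring, we get

$$0 = p(1-p)\left[s_1pr^2 + 2s_2pr(1-r) + s_3p(1-r)^2 + s_4(1-2p)r^2 + 2s_5(1-2p)r(1-r) + s_6(1-2p)(1-r)^2\right.$$
$$\qquad\left. - s_7(1-p)r^2 - 2s_8(1-p)r(1-r) - s_9(1-p)(1-r)^2\right]$$

and

$$0 = r(1-r)\left[s_1p^2r + s_2p^2 + 2s_4p(1-p)r + 2s_5p(1-p) + s_7(1-p)^2r + s_8(1-p)^2 - 2s_2p^2r - s_3p^2(1-r)\right.$$
$$\qquad\left. - 4s_5p(1-p)r - 2s_6p(1-p)(1-r) - 2s_8(1-p)^2r - s_9(1-p)^2(1-r)\right].$$

Now, we solve for the fixed points (p, r). Note that the left-hand sides of both equations are 0. Setting the right-hand sides of both equations equal to each other, we get (p, r) = (0, 0) and (p, r) = (1, 1). Writing $\Phi(p, r)$ and $\Psi(p, r)$ for the bracketed factors in the first and second equations, respectively, we also get

$$0 = p(1-p)\,\Phi(p, r) - r(1-r)\,\Psi(p, r). \qquad (3.3)$$

If we let p = 0, then equation (3.3) becomes

$$r(1-r)\left[s_7r + s_8(1-2r) - s_9(1-r)\right] = 0,$$

or

$$r\left[(s_7 - 2s_8 + s_9)r^2 + (3s_8 - s_7 - 2s_9)r + (s_9 - s_8)\right] = 0.$$

Solving for r, we first get r = 0; since we already have (p, r) = (0, 0) as a fixed point, we can omit this point. We also have

$$r = \frac{s_7 - 3s_8 + 2s_9 \pm \sqrt{(-s_7 + 3s_8 - 2s_9)^2 + 4(s_7 - 2s_8 + s_9)(s_8 - s_9)}}{2(s_7 - 2s_8 + s_9)} = \frac{s_7 - 3s_8 + 2s_9 \pm \sqrt{s_7^2 - 2s_7s_8 + s_8^2}}{2(s_7 - 2s_8 + s_9)} = \frac{s_7 - 3s_8 + 2s_9 \pm (s_7 - s_8)}{2(s_7 - 2s_8 + s_9)}.$$

Thus

$$r = \frac{-s_8 + s_9}{s_7 - 2s_8 + s_9} \quad\text{or}\quad r = 1.$$

Thus we have found two more fixed points,

$$(p, r) = (0, 1) \quad\text{and}\quad (p, r) = \left(0, \frac{-s_8 + s_9}{s_7 - 2s_8 + s_9}\right).$$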
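Letting r = 0, equation (3.3) becomes

$$p(1-p)\left[s_3p + s_6(1-2p) - s_9(1-p)\right] = 0.$$

This implies that

$$p\left[(-s_3 + 2s_6 - s_9)p^2 + (s_3 - 3s_6 + 2s_9)p + (s_6 - s_9)\right] = 0.$$

Solving for p, we first get p = 0; since (p, r) = (0, 0) is already a fixed point, we omit this point. We also get

$$p = \frac{-s_3 + 3s_6 - 2s_9 \pm (s_3 - s_6)}{2(-s_3 + 2s_6 - s_9)}.$$

Thus

$$p = \frac{-s_6 + s_9}{s_3 - 2s_6 + s_9} \quad\text{or}\quad p = 1.$$

Letting p = 1 in equation (3.3) gives

$$r = 0, \qquad r = 1 \qquad\text{and}\qquad r = \frac{s_3 - s_2}{s_1 - 2s_2 + s_3},$$

and letting r = 1 in equation (3.3) gives

$$p = 0, \qquad p = 1 \qquad\text{and}\qquad p = \frac{s_7 - s_4}{s_1 - 2s_4 + s_7}.$$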
We have a total of eight fixed points for the difference equations (3.1) and (3.2):

$$(p^*, r^*) = (0, 0), \quad \left(0, \frac{s_9 - s_8}{s_7 - 2s_8 + s_9}\right), \quad (0, 1), \quad \left(\frac{s_9 - s_6}{s_3 - 2s_6 + s_9}, 0\right), \quad (1, 0),$$
$$\left(1, \frac{s_3 - s_2}{s_1 - 2s_2 + s_3}\right), \quad \left(\frac{s_7 - s_4}{s_1 - 2s_4 + s_7}, 1\right) \quad\text{and}\quad (1, 1).$$

There may be additional fixed points that are not revealed by this analysis.

We will now consider the isoclines of the system and show that solutions remain in [0, 1] x [0, 1].

Set $p_n = 0$. Then $p_{n+1} = 0$ and $r_{n+1} = \frac{s_7r_n^2 + s_8r_n(1-r_n)}{s_7r_n^2 + 2s_8r_n(1-r_n) + s_9(1-r_n)^2}$, which implies that when $p_n = 0$, $p_{n+1}$ does not change, but $r_{n+1}$ does. Thus, when $p_n = 0$, the solutions in the phase plane move only in the vertical direction, along the $r_n$ axis. Similarly, when $r_n = 0$, $r_{n+1}$ does not change, but $p_{n+1}$ does. Hence, the solutions in the phase plane move only in the horizontal direction, along the $p_n$ axis.

Set $p_n = 1$. Then $p_{n+1} = 1$ and $r_{n+1} = \frac{s_1r_n^2 + s_2r_n(1-r_n)}{s_1r_n^2 + 2s_2r_n(1-r_n) + s_3(1-r_n)^2}$. So, when $p_n = 1$, there is no horizontal movement of the solution, and the solutions move only in the vertical direction.

Set $r_n = 1$. Then we get $p_{n+1} = \frac{s_1p_n^2 + s_4p_n(1-p_n)}{s_1p_n^2 + 2s_4p_n(1-p_n) + s_7(1-p_n)^2}$ and $r_{n+1} = 1$. When $r_n = 1$, solutions move in the horizontal direction only. Thus, we can conclude that the values of $p_{n+1}$ and $r_{n+1}$ stay in the unit square [0, 1] x [0, 1].
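The eight fixed points and the claim that solutions stay in the unit square can be spot-checked numerically. The sketch below restates the update of (3.1) and (3.2) so that it runs on its own; the helper names, the fitness values and the random sampling are assumptions made for illustration, and a sampled check of this kind is evidence rather than a proof.

```python
import random


def two_locus_step(p, r, s):
    """One generation of (3.1) and (3.2); s = (s1, ..., s9) as in Table 3.2."""
    s1, s2, s3, s4, s5, s6, s7, s8, s9 = s
    q, u = 1 - p, 1 - r
    w = (s1 * p * p * r * r + 2 * s2 * p * p * r * u + s3 * p * p * u * u
         + 2 * s4 * p * q * r * r + 4 * s5 * p * q * r * u + 2 * s6 * p * q * u * u
         + s7 * q * q * r * r + 2 * s8 * q * q * r * u + s9 * q * q * u * u)
    num_p = (s1 * p * p * r * r + 2 * s2 * p * p * r * u + s3 * p * p * u * u
             + s4 * p * q * r * r + 2 * s5 * p * q * r * u + s6 * p * q * u * u)
    num_r = (s1 * p * p * r * r + s2 * p * p * r * u
             + 2 * s4 * p * q * r * r + 2 * s5 * p * q * r * u
             + s7 * q * q * r * r + s8 * q * q * r * u)
    return num_p / w, num_r / w


def boundary_fixed_points(s):
    """The eight fixed points listed in the text: four corners, four edge points."""
    s1, s2, s3, s4, s5, s6, s7, s8, s9 = s
    return [
        (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
        (0.0, (s9 - s8) / (s7 - 2 * s8 + s9)),
        ((s9 - s6) / (s3 - 2 * s6 + s9), 0.0),
        (1.0, (s3 - s2) / (s1 - 2 * s2 + s3)),
        ((s7 - s4) / (s1 - 2 * s4 + s7), 1.0),
    ]


def stays_in_unit_square(s, samples=1000, seed=0, tol=1e-12):
    """Sample points of the unit square and check their images stay inside it."""
    rng = random.Random(seed)
    for _ in range(samples):
        p, r = two_locus_step(rng.random(), rng.random(), s)
        if not (-tol <= p <= 1 + tol and -tol <= r <= 1 + tol):
            return False
    return True


if __name__ == "__main__":
    s = (0.9, 1.1, 0.7, 1.2, 1.0, 1.3, 0.8, 1.4, 0.6)  # hypothetical values
    for point in boundary_fixed_points(s):
        image = two_locus_step(point[0], point[1], s)
        print(point, "->", image)
    print("images of sampled points stay in the unit square:", stays_in_unit_square(s))
```

Each edge point reduces to the one-locus fixed point of Section 2.1 with the three fitness parameters that remain active on that edge, which is why the same formula pattern appears four times.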
PAGE 38
Now we turn to the stability analysis of the system of difference equations (3.1) and (3.2). Since we have a system of equations, we form the Jacobian matrix of equations (3.1) and (3.2), evaluate it at the fixed points, and find its eigenvalues. If the eigenvalues of the matrix are less than 1 in absolute value, then the fixed point is stable, and if an eigenvalue is greater than 1 in absolute value, then the fixed point is unstable. For simplicity, we let $p_n = p$ and $r_n = r$, and write $F(p, r)$ and $G(p, r)$ for the right-hand sides of (3.1) and (3.2). Writing $F = N_F / w$, where $N_F$ is the numerator of (3.1) and $w$ is the common denominator of (3.1) and (3.2), the partial derivatives are
\[ \frac{\partial F}{\partial p} = \frac{1}{w^2} \left( w \frac{\partial N_F}{\partial p} - N_F \frac{\partial w}{\partial p} \right), \]
PAGE 39
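\[ \frac{\partial F}{\partial r} = \frac{1}{w^2} \left( w \frac{\partial N_F}{\partial r} - N_F \frac{\partial w}{\partial r} \right), \qquad \frac{\partial G}{\partial p} = \frac{1}{w^2} \left( w \frac{\partial N_G}{\partial p} - N_G \frac{\partial w}{\partial p} \right), \]
where $N_F$ and $N_G$ are the numerators of (3.1) and (3.2) and $w$ is their common denominator, and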
PAGE 40
\[ \frac{\partial G}{\partial r} = \frac{1}{w^2} \left( w \frac{\partial N_G}{\partial r} - N_G \frac{\partial w}{\partial r} \right), \]
where $N_G$ is the numerator of (3.2) and $w$ is the common denominator of (3.1) and (3.2).

We will make some assumptions about the fitness values in order to reduce the number of parameters. Assign a weight to each allele: $w(A) = 1$, $w(B) = 1$, $w(a) = 0$ and $w(b) = 0$. The weight of each genotype is the sum of the weights of its alleles. So, if we have genotype AABB, its weight is $w(AABB) = w(A) + w(A) + w(B) + w(B) = 4$. The weights of all the genotypes are then as follows: $w(AABB) = 4$, $w(AABb) = 3$, $w(AAbb) = 2$, $w(AaBB) = 3$, $w(AaBb) = 2$, $w(Aabb) = 1$, $w(aaBB) = 2$, $w(aaBb) = 1$, $w(aabb) = 0$. The numbers 0, 1, 2, 3 and 4 represent the numeric phenotypes of the genotypes. Consider the fitness curve in Figure 3.1, which gives the fitness value
PAGE 41
for each numeric phenotype.

Figure 3.1. Fitness Curve I (fitness value versus numeric phenotype, 0 to 4)

Assign new fitness values $f$ to each phenotype. Let $f(1) = f(3) = a$, $f(2) = b$, $f(0) = f(4) = c$. Then we have the fitness parameters in Table 3.3.

s1 = f(AABB) = c    s2 = f(AABb) = a    s3 = f(AAbb) = b
s4 = f(AaBB) = a    s5 = f(AaBb) = b    s6 = f(Aabb) = a
s7 = f(aaBB) = b    s8 = f(aaBb) = a    s9 = f(aabb) = c

Table 3.3. Fitness parameters of genotypes

We now consider the following two cases:
\[ b > a > 1 > c = 0 \quad \text{and} \quad b > 1 > a > c = 0. \qquad (3.4) \]
We have four fixed points that are not at the corners of the unit square:
\[ \left( 0, \frac{-s_8 + s_9}{s_7 - 2 s_8 + s_9} \right), \quad \left( \frac{-s_6 + s_9}{s_3 - 2 s_6 + s_9}, 0 \right), \quad \left( 1, \frac{s_3 - s_2}{s_1 - 2 s_2 + s_3} \right) \quad \text{and} \quad \left( \frac{s_7 - s_4}{s_1 - 2 s_4 + s_7}, 1 \right). \]
For these points to be viable fixed points, they need to lie within the unit square $[0, 1] \times [0, 1]$. First we will write these points in terms of $a$, $b$ and $c$.
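Before doing so, note that the assignment in Table 3.3 follows mechanically from the allele weights. The sketch below rebuilds the table from the rule f(1) = f(3) = a, f(2) = b, f(0) = f(4) = c. The function name `fitness_table` and the numeric values of a, b and c are assumptions made only for illustration.

```python
def fitness_table(a, b, c):
    """Return (genotype, weight, fitness) triples in the order s1, ..., s9."""
    f = {0: c, 1: a, 2: b, 3: a, 4: c}   # fitness by numeric phenotype
    table = []
    for first, weight_a in (("AA", 2), ("Aa", 1), ("aa", 0)):
        for second, weight_b in (("BB", 2), ("Bb", 1), ("bb", 0)):
            weight = weight_a + weight_b   # number of capital alleles
            table.append((first + second, weight, f[weight]))
    return table

# Hypothetical values satisfying b > a > 1 > c = 0.
table = fitness_table(a=1.2, b=1.5, c=0.0)
for i, (genotype, weight, fitness) in enumerate(table, start=1):
    print(f"s{i} = f({genotype}) = {fitness}   (weight {weight})")
```

Generating the table this way makes it easy to try other fitness curves without retyping the nine parameters.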
PAGE 42
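Writing $\left( 0, \frac{-s_8 + s_9}{s_7 - 2 s_8 + s_9} \right)$ in terms of $a$, $b$ and $c$, we have
\[ \left( 0, \frac{-s_8 + s_9}{s_7 - 2 s_8 + s_9} \right) = \left( 0, \frac{-a + c}{b - 2a + c} \right) = \left( 0, \frac{-a}{b - 2a} \right) = \left( 0, \frac{a}{2a - b} \right). \]
To be a fixed point under the condition (3.4), we must have $2a - b > 0$ and $a < 2a - b$, so that $\left( 0, \frac{a}{2a - b} \right)$ is inside the unit square $[0, 1] \times [0, 1]$. However, these conditions imply $-a < -b$, that is, $a > b$, which contradicts the condition $b > a$. Hence $\left( 0, \frac{a}{2a - b} \right) = \left( 0, \frac{-s_8 + s_9}{s_7 - 2 s_8 + s_9} \right)$ cannot be a fixed point with the assumption (3.4).

The point $\left( \frac{-s_6 + s_9}{s_3 - 2 s_6 + s_9}, 0 \right)$, in terms of $a$, $b$ and $c$, gives
\[ \left( \frac{-s_6 + s_9}{s_3 - 2 s_6 + s_9}, 0 \right) = \left( \frac{-a + c}{b - 2a + c}, 0 \right) = \left( \frac{-a}{b - 2a}, 0 \right) = \left( \frac{a}{2a - b}, 0 \right). \]
This is no longer a fixed point by the previous argument.

Writing $\left( 1, \frac{s_3 - s_2}{s_1 - 2 s_2 + s_3} \right)$ in terms of $a$, $b$ and $c$, we have
\[ \left( 1, \frac{s_3 - s_2}{s_1 - 2 s_2 + s_3} \right) = \left( 1, \frac{b - a}{c - 2a + b} \right) = \left( 1, \frac{b - a}{b - 2a} \right). \]
To be a fixed point, we must have $b - 2a > b - a$, so that $\left( 1, \frac{b - a}{b - 2a} \right)$ is inside the unit square $[0, 1] \times [0, 1]$. We know $a > 0$, which means $2a > a$. The inequality $b - 2a > b - a$ implies $-2a > -a$, and $-2a > -a$ implies $2a < a$, a contradiction. Hence $\left( 1, \frac{b - a}{b - 2a} \right) = \left( 1, \frac{s_3 - s_2}{s_1 - 2 s_2 + s_3} \right)$ cannot be a fixed point with the assumption (3.4).

The point $\left( \frac{s_7 - s_4}{s_1 - 2 s_4 + s_7}, 1 \right)$, in terms of $a$, $b$ and $c$, gives
\[ \left( \frac{s_7 - s_4}{s_1 - 2 s_4 + s_7}, 1 \right) = \left( \frac{b - a}{c - 2a + b}, 1 \right) = \left( \frac{b - a}{b - 2a}, 1 \right). \]
This is no longer a fixed point by the previous argument. Hence we have only four valid fixed points with the assumption (3.4).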
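The exclusion of the four edge points can be illustrated numerically. This is a small sketch under stated assumptions: the helper names `edge_fixed_points` and `in_unit_square` are hypothetical, and the two (a, b) pairs are sample values chosen to satisfy the two cases of (3.4) with c = 0.

```python
def edge_fixed_points(s):
    """Non-corner fixed points on the four edges, using the formulas above."""
    s1, s2, s3, s4, s5, s6, s7, s8, s9 = s
    return [
        (0.0, (s9 - s8) / (s7 - 2*s8 + s9)),
        ((s9 - s6) / (s3 - 2*s6 + s9), 0.0),
        (1.0, (s3 - s2) / (s1 - 2*s2 + s3)),
        ((s7 - s4) / (s1 - 2*s4 + s7), 1.0),
    ]

def in_unit_square(point):
    return 0.0 <= point[0] <= 1.0 and 0.0 <= point[1] <= 1.0

results = {}
# Sample values: b > a > 1 > c = 0 and b > 1 > a > c = 0.
for a, b in ((1.2, 1.5), (0.5, 1.5)):
    c = 0.0
    s = (c, a, b, a, b, a, b, a, c)       # Table 3.3 ordering
    results[(a, b)] = [in_unit_square(pt) for pt in edge_fixed_points(s)]
    print(a, b, edge_fixed_points(s), results[(a, b)])
```

For both sample parameter sets every edge point has a coordinate outside [0, 1], in agreement with the argument above. The formulas divide by b - 2a, so parameter choices with b = 2a are not covered by this sketch.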
PAGE 43
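Evaluation of the Jacobian matrix of equations (3.1) and (3.2) at the four fixed points results in
\[ J(0, 0) = \begin{bmatrix} s_6 / s_9 & 0 \\ 0 & s_8 / s_9 \end{bmatrix}, \quad J(0, 1) = \begin{bmatrix} s_4 / s_7 & 0 \\ 0 & s_8 / s_7 \end{bmatrix}, \quad J(1, 0) = \begin{bmatrix} s_6 / s_3 & 0 \\ 0 & s_2 / s_3 \end{bmatrix}, \quad J(1, 1) = \begin{bmatrix} s_4 / s_1 & 0 \\ 0 & s_2 / s_1 \end{bmatrix}. \]
If we solve $\det(J - \lambda I) = 0$ for $\lambda$, where $I$ is the identity matrix, we get the eigenvalues $\lambda$ of $J$. Based on the eigenvalues found, we will be able to determine the stability of the fixed points for equations (3.1) and (3.2). The following observations can be made for the four fixed points.

(1) At (0, 0), the characteristic polynomial is
\[ \left( \frac{s_6}{s_9} - \lambda \right) \left( \frac{s_8}{s_9} - \lambda \right) = 0, \]
which leads to the eigenvalues
\[ \lambda_1 = \frac{s_6}{s_9} = \frac{a}{c}, \quad \lambda_2 = \frac{s_8}{s_9} = \frac{a}{c}. \]
We assumed that the fitness value $c$ is 0. Since we cannot have division by 0, assume $c \to 0$; then $a/c$ approaches $\infty$. This implies that $\lambda_1 > 1$ and $\lambda_2 > 1$, and hence the fixed point (0, 0) is unstable.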
PAGE 44
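(2) At (0, 1), the characteristic polynomial is
\[ \left( \frac{s_4}{s_7} - \lambda \right) \left( \frac{s_8}{s_7} - \lambda \right) = 0, \]
which gives the eigenvalues
\[ \lambda_1 = \frac{s_4}{s_7} = \frac{a}{b}, \quad \lambda_2 = \frac{s_8}{s_7} = \frac{a}{b}. \]
Since $b > a$, $a/b < 1$ and $\lambda_1 < 1$, $\lambda_2 < 1$. Thus, the fixed point (0, 1) is stable.

(3) At (1, 0), the characteristic polynomial is
\[ \left( \frac{s_6}{s_3} - \lambda \right) \left( \frac{s_2}{s_3} - \lambda \right) = 0, \]
which gives the eigenvalues
\[ \lambda_1 = \frac{s_6}{s_3} = \frac{a}{b}, \quad \lambda_2 = \frac{s_2}{s_3} = \frac{a}{b}. \]
Since $b > a$, $a/b < 1$ and $\lambda_1 < 1$, $\lambda_2 < 1$. Thus, the fixed point (1, 0) is stable.

(4) At (1, 1), the characteristic polynomial is
\[ \left( \frac{s_4}{s_1} - \lambda \right) \left( \frac{s_2}{s_1} - \lambda \right) = 0, \]
which leads to the eigenvalues
\[ \lambda_1 = \frac{s_4}{s_1} = \frac{a}{c}, \quad \lambda_2 = \frac{s_2}{s_1} = \frac{a}{c}. \]
We assumed that the fitness value $c$ is 0. Since we cannot have division by 0, assume $c \to 0$. Then $a/c$ approaches $\infty$. This implies that $\lambda_1 > 1$ and $\lambda_2 > 1$, and hence the fixed point (1, 1) is unstable.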
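The eigenvalues at the two stable corners can be cross-checked with a finite-difference Jacobian. The sketch below assumes the form of (3.1) and (3.2) written out in section 3.2. The names `step` and `jacobian` and the values a = 1.2, b = 1.5 are illustrative assumptions. The corners (0, 0) and (1, 1) are skipped because the denominator w equals c = 0 there.

```python
def step(p, r, s):
    """One generation of the two-locus map (assumed form of (3.1)-(3.2))."""
    s1, s2, s3, s4, s5, s6, s7, s8, s9 = s
    q, u = 1.0 - p, 1.0 - r
    num_p = (s1*p*p*r*r + 2*s2*p*p*r*u + s3*p*p*u*u
             + s4*p*q*r*r + 2*s5*p*q*r*u + s6*p*q*u*u)
    num_r = (s1*p*p*r*r + s2*p*p*r*u + 2*s4*p*q*r*r
             + 2*s5*p*q*r*u + s7*q*q*r*r + s8*q*q*r*u)
    w = (s1*p*p*r*r + 2*s2*p*p*r*u + s3*p*p*u*u
         + 2*s4*p*q*r*r + 4*s5*p*q*r*u + 2*s6*p*q*u*u
         + s7*q*q*r*r + 2*s8*q*q*r*u + s9*q*q*u*u)
    return num_p / w, num_r / w

def jacobian(p, r, s, h=1e-6):
    """Central-difference approximation of the Jacobian of the map at (p, r)."""
    f_p1, g_p1 = step(p + h, r, s)
    f_p0, g_p0 = step(p - h, r, s)
    f_r1, g_r1 = step(p, r + h, s)
    f_r0, g_r0 = step(p, r - h, s)
    return [[(f_p1 - f_p0) / (2*h), (f_r1 - f_r0) / (2*h)],
            [(g_p1 - g_p0) / (2*h), (g_r1 - g_r0) / (2*h)]]

a, b, c = 1.2, 1.5, 0.0               # sample values with b > a > 1 > c = 0
s = (c, a, b, a, b, a, b, a, c)       # Table 3.3 ordering
J01 = jacobian(0.0, 1.0, s)
J10 = jacobian(1.0, 0.0, s)
print(J01)
print(J10)
print("expected diagonal entries a/b =", a / b)
```

Both matrices come out numerically diagonal, so the diagonal entries are the eigenvalues and can be compared directly with a/b.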
PAGE 45
Hence, we have four valid fixed points: $(p^*, r^*) = (0, 0)$, unstable; $(p^*, r^*) = (0, 1)$, stable; $(p^*, r^*) = (1, 0)$, stable; and $(p^*, r^*) = (1, 1)$, unstable, due to the assumption (3.4) on the weights of genotypes.

Figure 3.2. Phase Portraits: (a) b > a > 1 > c = 0; (b) b > 1 > a > c = 0 (horizontal axis: $p_n$, frequency of allele A)

With the assumption (3.4), either the alleles a and B will eventually dominate the gene pool, since $p_n$ converges to 0 and $r_n$ converges to 1, or the alleles A and b will dominate the gene pool, because $p_n$ converges to 1 and $r_n$ converges to 0. Computational results showed that convergence of $(p_n, r_n)$ to either (0, 1) or (1, 0) was based on $(p_0, r_0)$, the initial value of $(p_n, r_n)$; if $(p_0, r_0)$ was closer to the point (0, 1), then $(p_n, r_n)$ approached (0, 1) (see Figure 3.2a), and if $(p_0, r_0)$ was closer to (1, 0), then $(p_n, r_n)$ converged to the fixed point (1, 0) (see Figure 3.2b).

Figure 3.2 shows the phase portraits of the equations (3.1) and (3.2). In Figure 3.2a, we have $b > a > 1 > c = 0$, and in Figure 3.2b, we have $b > 1 > a > c = 0$. Based on the initial condition, either the genotype aaBB
PAGE 46
or AAbb remains in the gene pool.

Now, just as in section 2.1, we will convert the difference equations (3.1) and (3.2) into differential equations. Using a similar analysis of the fixed points, we will predict what happens with the alleles in the gene pool.

3.2 Ordinary Differential Equations

Using the argument of section 2.1 to convert a discrete equation to a differential equation, we let $s_i = 1 + h \sigma_i$ for $i = 1, \ldots, 9$, where the $s_i$ are the fitness values as in section 3.1, $h$ is a short time interval, and the $\sigma_i$ are growth rates. If $\sigma_i > 0$, we have growth; if $\sigma_i < 0$, we have decline; and when $\sigma_i = 0$, $s_i = 1$, which is the neutral case. Now we find smooth functions $p(t)$ and $r(t)$ such that $p_n = p(nh)$ and $r_n = r(nh)$ when $h$ is small. Setting $t = nh$, we have
\[ p(t + h) = \frac{\begin{aligned} & s_1 p(t)^2 r(t)^2 + 2 s_2 p(t)^2 r(t)(1 - r(t)) + s_3 p(t)^2 (1 - r(t))^2 \\ & + s_4 p(t)(1 - p(t)) r(t)^2 + 2 s_5 p(t)(1 - p(t)) r(t)(1 - r(t)) + s_6 p(t)(1 - p(t))(1 - r(t))^2 \end{aligned}}{w}, \]
\[ r(t + h) = \frac{\begin{aligned} & s_1 p(t)^2 r(t)^2 + s_2 p(t)^2 r(t)(1 - r(t)) + 2 s_4 p(t)(1 - p(t)) r(t)^2 \\ & + 2 s_5 p(t)(1 - p(t)) r(t)(1 - r(t)) + s_7 (1 - p(t))^2 r(t)^2 + s_8 (1 - p(t))^2 r(t)(1 - r(t)) \end{aligned}}{w}, \]
where
PAGE 47
\[ \begin{aligned} w = {} & s_1 p(t)^2 r(t)^2 + 2 s_2 p(t)^2 r(t)(1 - r(t)) + s_3 p(t)^2 (1 - r(t))^2 \\ & + 2 s_4 p(t)(1 - p(t)) r(t)^2 + 4 s_5 p(t)(1 - p(t)) r(t)(1 - r(t)) + 2 s_6 p(t)(1 - p(t))(1 - r(t))^2 \\ & + s_7 (1 - p(t))^2 r(t)^2 + 2 s_8 (1 - p(t))^2 r(t)(1 - r(t)) + s_9 (1 - p(t))^2 (1 - r(t))^2. \end{aligned} \]
Then subtracting $p(t)$ and $r(t)$ from both sides of the top two equations, respectively, and dividing by $h$, we have
\[ \frac{p(t + h) - p(t)}{h} = \frac{\begin{aligned} & s_1 p(t)^2 r(t)^2 + 2 s_2 p(t)^2 r(t)(1 - r(t)) + s_3 p(t)^2 (1 - r(t))^2 + s_4 p(t)(1 - p(t)) r(t)^2 \\ & + 2 s_5 p(t)(1 - p(t)) r(t)(1 - r(t)) + s_6 p(t)(1 - p(t))(1 - r(t))^2 - p(t) w \end{aligned}}{h w} \]
and
\[ \frac{r(t + h) - r(t)}{h} = \frac{\begin{aligned} & s_1 p(t)^2 r(t)^2 + s_2 p(t)^2 r(t)(1 - r(t)) + 2 s_4 p(t)(1 - p(t)) r(t)^2 \\ & + 2 s_5 p(t)(1 - p(t)) r(t)(1 - r(t)) + s_7 (1 - p(t))^2 r(t)^2 + s_8 (1 - p(t))^2 r(t)(1 - r(t)) - r(t) w \end{aligned}}{h w}. \]
Rearranging and factoring $p(t)(1 - p(t))$ and $r(t)(1 - r(t))$, respectively, where $p(t)$ and $r(t)$ are replaced with $p$ and $r$ on the right-hand side of the equations, we have
PAGE 48
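\[ \frac{p(t + h) - p(t)}{h} = \frac{\begin{aligned} & p(1 - p)\left[ p \left( s_1 r^2 + 2 s_2 r(1 - r) + s_3 (1 - r)^2 \right) \right] \\ & + p(1 - p)\left[ s_4 r^2 + 2 s_5 r(1 - r) + s_6 (1 - r)^2 - p \left( 2 s_4 r^2 + 4 s_5 r(1 - r) + 2 s_6 (1 - r)^2 \right) \right] \\ & - p(1 - p)^2 \left[ s_7 r^2 + 2 s_8 r(1 - r) + s_9 (1 - r)^2 \right] \end{aligned}}{h w} \]
and
\[ \frac{r(t + h) - r(t)}{h} = \frac{\begin{aligned} & r(1 - r)\left[ r \left( s_1 p^2 + 2 s_4 p(1 - p) + s_7 (1 - p)^2 \right) \right] \\ & + r(1 - r)\left[ s_2 p^2 + 2 s_5 p(1 - p) + s_8 (1 - p)^2 - r \left( 2 s_2 p^2 + 4 s_5 p(1 - p) + 2 s_8 (1 - p)^2 \right) \right] \\ & - r(1 - r)^2 \left[ s_3 p^2 + 2 s_6 p(1 - p) + s_9 (1 - p)^2 \right] \end{aligned}}{h w}. \]
Using $s_i = 1 + h \sigma_i$ for $i = 1, \ldots, 9$ and taking the limit as $h$ goes to 0 on these two equations, we get
\[ \lim_{h \to 0} \frac{p(t + h) - p(t)}{h} = \lim_{h \to 0} \frac{\begin{aligned} & h p(1 - p)\left[ p \left( \sigma_1 r^2 + 2 \sigma_2 r(1 - r) + \sigma_3 (1 - r)^2 \right) \right] \\ & + h p(1 - p)\left[ \sigma_4 r^2 + 2 \sigma_5 r(1 - r) + \sigma_6 (1 - r)^2 - p \left( 2 \sigma_4 r^2 + 4 \sigma_5 r(1 - r) + 2 \sigma_6 (1 - r)^2 \right) \right] \\ & - h p(1 - p)^2 \left[ \sigma_7 r^2 + 2 \sigma_8 r(1 - r) + \sigma_9 (1 - r)^2 \right] \end{aligned}}{h w} \]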
PAGE 49
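and
\[ \lim_{h \to 0} \frac{r(t + h) - r(t)}{h} = \lim_{h \to 0} \frac{\begin{aligned} & h r(1 - r)\left[ r \left( \sigma_1 p^2 + 2 \sigma_4 p(1 - p) + \sigma_7 (1 - p)^2 \right) \right] \\ & + h r(1 - r)\left[ \sigma_2 p^2 + 2 \sigma_5 p(1 - p) + \sigma_8 (1 - p)^2 - r \left( 2 \sigma_2 p^2 + 4 \sigma_5 p(1 - p) + 2 \sigma_8 (1 - p)^2 \right) \right] \\ & - h r(1 - r)^2 \left[ \sigma_3 p^2 + 2 \sigma_6 p(1 - p) + \sigma_9 (1 - p)^2 \right] \end{aligned}}{h w}, \]
where
\[ \begin{aligned} w = 1 + h \big[ & \sigma_1 p^2 r^2 + 2 \sigma_2 p^2 r(1 - r) + \sigma_3 p^2 (1 - r)^2 \\ & + 2 \sigma_4 p(1 - p) r^2 + 4 \sigma_5 p(1 - p) r(1 - r) + 2 \sigma_6 p(1 - p)(1 - r)^2 \\ & + \sigma_7 (1 - p)^2 r^2 + 2 \sigma_8 (1 - p)^2 r(1 - r) + \sigma_9 (1 - p)^2 (1 - r)^2 \big]. \end{aligned} \]
Then we have the following differential equations:
\[ \begin{aligned} p'(t) = {} & p(1 - p)\left[ p \left( \sigma_1 r^2 + 2 \sigma_2 r(1 - r) + \sigma_3 (1 - r)^2 \right) \right] \\ & + p(1 - p)\left[ \sigma_4 r^2 + 2 \sigma_5 r(1 - r) + \sigma_6 (1 - r)^2 - p \left( 2 \sigma_4 r^2 + 4 \sigma_5 r(1 - r) + 2 \sigma_6 (1 - r)^2 \right) \right] \\ & - p(1 - p)^2 \left[ \sigma_7 r^2 + 2 \sigma_8 r(1 - r) + \sigma_9 (1 - r)^2 \right] \end{aligned} \qquad (3.5) \]
and
\[ \begin{aligned} r'(t) = {} & r(1 - r)\left[ r \left( \sigma_1 p^2 + 2 \sigma_4 p(1 - p) + \sigma_7 (1 - p)^2 \right) \right] \\ & + r(1 - r)\left[ \sigma_2 p^2 + 2 \sigma_5 p(1 - p) + \sigma_8 (1 - p)^2 - r \left( 2 \sigma_2 p^2 + 4 \sigma_5 p(1 - p) + 2 \sigma_8 (1 - p)^2 \right) \right] \\ & - r(1 - r)^2 \left[ \sigma_3 p^2 + 2 \sigma_6 p(1 - p) + \sigma_9 (1 - p)^2 \right]. \end{aligned} \qquad (3.6) \]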
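The limiting step can be sanity-checked numerically: for small h, the difference quotient of the map with $s_i = 1 + h\sigma_i$ should be close to the right-hand sides of (3.5) and (3.6). The sketch below is illustrative only. The names `step` and `rhs`, the growth rates, and the test point (0.3, 0.6) are assumptions, and the map is the form of (3.1) and (3.2) written out at the start of this section.

```python
def step(p, r, s):
    """One generation of the two-locus map (assumed form of (3.1)-(3.2))."""
    s1, s2, s3, s4, s5, s6, s7, s8, s9 = s
    q, u = 1.0 - p, 1.0 - r
    num_p = (s1*p*p*r*r + 2*s2*p*p*r*u + s3*p*p*u*u
             + s4*p*q*r*r + 2*s5*p*q*r*u + s6*p*q*u*u)
    num_r = (s1*p*p*r*r + s2*p*p*r*u + 2*s4*p*q*r*r
             + 2*s5*p*q*r*u + s7*q*q*r*r + s8*q*q*r*u)
    w = (s1*p*p*r*r + 2*s2*p*p*r*u + s3*p*p*u*u
         + 2*s4*p*q*r*r + 4*s5*p*q*r*u + 2*s6*p*q*u*u
         + s7*q*q*r*r + 2*s8*q*q*r*u + s9*q*q*u*u)
    return num_p / w, num_r / w

def rhs(p, r, sigma):
    """Right-hand sides F and G of (3.5) and (3.6)."""
    g1, g2, g3, g4, g5, g6, g7, g8, g9 = sigma
    q, u = 1.0 - p, 1.0 - r
    F = (p*q * p * (g1*r*r + 2*g2*r*u + g3*u*u)
         + p*q * (1 - 2*p) * (g4*r*r + 2*g5*r*u + g6*u*u)
         - p*q*q * (g7*r*r + 2*g8*r*u + g9*u*u))
    G = (r*u * r * (g1*p*p + 2*g4*p*q + g7*q*q)
         + r*u * (1 - 2*r) * (g2*p*p + 2*g5*p*q + g8*q*q)
         - r*u*u * (g3*p*p + 2*g6*p*q + g9*q*q))
    return F, G

# Hypothetical growth rates and test point.
sigma = (-1.0, 0.5, 1.0, 0.5, 1.0, 0.5, 1.0, 0.5, -1.0)
p, r, h = 0.3, 0.6, 1e-6

s = tuple(1.0 + h * g for g in sigma)
p_next, r_next = step(p, r, s)
quotient = ((p_next - p) / h, (r_next - r) / h)   # discrete difference quotient
F, G = rhs(p, r, sigma)
print(quotient, (F, G))
```

The two printed pairs agree up to an error of order h, which is what the limit argument predicts.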
PAGE 50
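Solutions of equations (3.5) and (3.6), $p(t)$ and $r(t)$, determine the frequencies of the alleles A and B in the gene pool, and $1 - p(t)$ and $1 - r(t)$ are the frequencies of the alleles a and b, respectively. Now we let $p'(t) = F(p(t), r(t))$ and $r'(t) = G(p(t), r(t))$. Then by setting $F(p^*, r^*) = 0$ and $G(p^*, r^*) = 0$, we can find the fixed points of equations (3.5) and (3.6) and compare them with the results from section 3.1. For simplicity, we will replace $p^*$ and $r^*$ with $p$ and $r$, respectively. Setting $F(p^*, r^*) = F(p, r) = 0$ and $G(p^*, r^*) = G(p, r) = 0$, we have
\[ \begin{aligned} 0 = {} & \sigma_1 p^2 r^2 + 2 \sigma_2 p^2 r(1 - r) + \sigma_3 p^2 (1 - r)^2 - \sigma_1 p^3 r^2 - 2 \sigma_2 p^3 r(1 - r) - \sigma_3 p^3 (1 - r)^2 \\ & + p(1 - p)\big[ \sigma_4 r^2 + 2 \sigma_5 r(1 - r) + \sigma_6 (1 - r)^2 - 2 \sigma_4 p r^2 - 4 \sigma_5 p r(1 - r) - 2 \sigma_6 p (1 - r)^2 \\ & \qquad - \sigma_7 (1 - p) r^2 - 2 \sigma_8 (1 - p) r(1 - r) - \sigma_9 (1 - p)(1 - r)^2 \big] \end{aligned} \]
and
\[ \begin{aligned} 0 = {} & \sigma_1 p^2 r^2 + 2 \sigma_4 p(1 - p) r^2 + \sigma_7 (1 - p)^2 r^2 - \sigma_1 p^2 r^3 - 2 \sigma_4 p(1 - p) r^3 - \sigma_7 (1 - p)^2 r^3 \\ & + r(1 - r)\big[ \sigma_2 p^2 + 2 \sigma_5 p(1 - p) + \sigma_8 (1 - p)^2 - 2 \sigma_2 p^2 r - \sigma_3 p^2 (1 - r) - 4 \sigma_5 p(1 - p) r \\ & \qquad - 2 \sigma_6 p(1 - p)(1 - r) - 2 \sigma_8 (1 - p)^2 r - \sigma_9 (1 - p)^2 (1 - r) \big]. \end{aligned} \]
Since the right-hand sides of both equations are equal to 0, we set them equal to each other. Solving these two equations for $(p, r)$ simultaneously, we first get
\[ (p, r) = (0, 0), \quad (p, r) = (1, 1). \]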
PAGE 51
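Then we get
\[ \begin{aligned} & \sigma_1 p^2 r^2 + 2 \sigma_2 p^2 r(1 - r) + \sigma_3 p^2 (1 - r)^2 - \sigma_1 p^3 r^2 - 2 \sigma_2 p^3 r(1 - r) - \sigma_3 p^3 (1 - r)^2 \\ & + p(1 - p)\big[ \sigma_4 r^2 + 2 \sigma_5 r(1 - r) + \sigma_6 (1 - r)^2 - 2 \sigma_4 p r^2 - 4 \sigma_5 p r(1 - r) - 2 \sigma_6 p (1 - r)^2 \\ & \qquad - \sigma_7 (1 - p) r^2 - 2 \sigma_8 (1 - p) r(1 - r) - \sigma_9 (1 - p)(1 - r)^2 \big] \\ = {} & \sigma_1 p^2 r^2 + 2 \sigma_4 p(1 - p) r^2 + \sigma_7 (1 - p)^2 r^2 - \sigma_1 p^2 r^3 - 2 \sigma_4 p(1 - p) r^3 - \sigma_7 (1 - p)^2 r^3 \\ & + r(1 - r)\big[ \sigma_2 p^2 + 2 \sigma_5 p(1 - p) + \sigma_8 (1 - p)^2 - 2 \sigma_2 p^2 r - \sigma_3 p^2 (1 - r) - 4 \sigma_5 p(1 - p) r \\ & \qquad - 2 \sigma_6 p(1 - p)(1 - r) - 2 \sigma_8 (1 - p)^2 r - \sigma_9 (1 - p)^2 (1 - r) \big]. \end{aligned} \]
When we subtract the right-hand side of this equation from both sides, we get an equation of the form $F(p, r) - G(p, r) = 0$. Letting $p = 0$ in this equation, we get
\[ -\sigma_7 r^2 - \sigma_8 (r - r^2) + \sigma_7 r^3 + 2 \sigma_8 (r^2 - r^3) + \sigma_9 r (1 - r)^2 = 0. \]
Rearranging terms and factoring out $r$ gives
\[ r \left[ (\sigma_7 - 2 \sigma_8 + \sigma_9) r^2 + (-\sigma_7 + 3 \sigma_8 - 2 \sigma_9) r + (-\sigma_8 + \sigma_9) \right] = 0. \]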
PAGE 52
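Solving for $r$, we first get $r = 0$. We already have (0, 0) as a fixed point. Solving the rest of the equation for $r$, we get
\[ r = \frac{\sigma_7 - 3 \sigma_8 + 2 \sigma_9 \pm \sqrt{(\sigma_7 - 3 \sigma_8 + 2 \sigma_9)^2 - 4 (\sigma_7 - 2 \sigma_8 + \sigma_9)(-\sigma_8 + \sigma_9)}}{2 (\sigma_7 - 2 \sigma_8 + \sigma_9)}. \]
Then
\[ r = \frac{-\sigma_8 + \sigma_9}{\sigma_7 - 2 \sigma_8 + \sigma_9} \quad \text{or} \quad r = 1. \]
This gives us two more fixed points,
\[ (p, r) = (0, 1) \quad \text{and} \quad (p, r) = \left( 0, \frac{-\sigma_8 + \sigma_9}{\sigma_7 - 2 \sigma_8 + \sigma_9} \right). \]
Now we let $r = 0$; then
\[ \sigma_3 p^2 + \sigma_6 (p - p^2) - \sigma_3 p^3 - 2 \sigma_6 (p^2 - p^3) - \sigma_9 p (1 - p)^2 = 0. \]
When we solve for $p$, we get $p = 0$ and
\[ p = \frac{-\sigma_3 + 3 \sigma_6 - 2 \sigma_9 \pm (\sigma_3 - \sigma_6)}{2 (-\sigma_3 + 2 \sigma_6 - \sigma_9)}, \]
which reduces to
\[ p = \frac{-\sigma_6 + \sigma_9}{\sigma_3 - 2 \sigma_6 + \sigma_9} \quad \text{or} \quad p = 1. \]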
PAGE 53
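Letting $p = 1$ gives
\[ r = 0, \quad r = 1 \quad \text{and} \quad r = \frac{\sigma_3 - \sigma_2}{\sigma_1 - 2 \sigma_2 + \sigma_3}, \]
and letting $r = 1$ gives
\[ p = 0, \quad p = 1 \quad \text{and} \quad p = \frac{\sigma_7 - \sigma_4}{\sigma_1 - 2 \sigma_4 + \sigma_7}. \]
Hence, we have the following eight fixed points for equations (3.5) and (3.6):
\[ (p^*, r^*) = (0, 0), \quad (p^*, r^*) = (0, 1), \quad (p^*, r^*) = \left( 0, \frac{-\sigma_8 + \sigma_9}{\sigma_7 - 2 \sigma_8 + \sigma_9} \right), \]
\[ (p^*, r^*) = (1, 0), \quad (p^*, r^*) = \left( 1, \frac{\sigma_3 - \sigma_2}{\sigma_1 - 2 \sigma_2 + \sigma_3} \right), \quad (p^*, r^*) = \left( \frac{-\sigma_6 + \sigma_9}{\sigma_3 - 2 \sigma_6 + \sigma_9}, 0 \right), \]
\[ (p^*, r^*) = \left( \frac{\sigma_7 - \sigma_4}{\sigma_1 - 2 \sigma_4 + \sigma_7}, 1 \right) \quad \text{and} \quad (p^*, r^*) = (1, 1). \]
The fitness parameters $s_i$ of the difference equations (3.1) and (3.2) are analogous to the growth rates $\sigma_i$, for $i = 1, \ldots, 9$. Hence, the eight fixed points we have found for the differential equations (3.5) and (3.6) are equivalent to the eight fixed points of the difference equations (3.1) and (3.2).

Now we will examine these fixed points for stability using the same method as in section 3.1 with the difference equations. The eigenvalues of the Jacobian matrix of equations (3.5) and (3.6) must be less than 0 for a fixed point to be stable; if an eigenvalue is greater than 0, the fixed point is unstable. The partial derivatives of $F(p, r)$ and $G(p, r)$ are as follows: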
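The four non-corner formulas can be verified by substituting them back into (3.5) and (3.6). The following sketch does this for one hypothetical set of growth rates, chosen only so that all four points fall inside the unit square. The names `rhs` and `edge_points` are illustrative assumptions.

```python
def rhs(p, r, sigma):
    """Right-hand sides F and G of (3.5) and (3.6)."""
    g1, g2, g3, g4, g5, g6, g7, g8, g9 = sigma
    q, u = 1.0 - p, 1.0 - r
    F = (p*q * p * (g1*r*r + 2*g2*r*u + g3*u*u)
         + p*q * (1 - 2*p) * (g4*r*r + 2*g5*r*u + g6*u*u)
         - p*q*q * (g7*r*r + 2*g8*r*u + g9*u*u))
    G = (r*u * r * (g1*p*p + 2*g4*p*q + g7*q*q)
         + r*u * (1 - 2*r) * (g2*p*p + 2*g5*p*q + g8*q*q)
         - r*u*u * (g3*p*p + 2*g6*p*q + g9*q*q))
    return F, G

def edge_points(sigma):
    """The four non-corner fixed points given by the formulas above."""
    g1, g2, g3, g4, g5, g6, g7, g8, g9 = sigma
    return [
        (0.0, (g9 - g8) / (g7 - 2*g8 + g9)),
        ((g9 - g6) / (g3 - 2*g6 + g9), 0.0),
        (1.0, (g3 - g2) / (g1 - 2*g2 + g3)),
        ((g7 - g4) / (g1 - 2*g4 + g7), 1.0),
    ]

# Hypothetical growth rates (not taken from the thesis).
sigma = (0.3, 0.5, 0.1, 0.6, 0.2, 0.5, 0.1, 0.5, 0.2)
points = edge_points(sigma)
residuals = [rhs(p, r, sigma) for p, r in points]
for point, residual in zip(points, residuals):
    print(point, residual)
```

Each residual pair is zero up to rounding, so the formulas are consistent with the differential equations for this parameter set.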
PAGE 54
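Write $\Sigma_1(r) = \sigma_1 r^2 + 2 \sigma_2 r(1 - r) + \sigma_3 (1 - r)^2$, $\Sigma_2(r) = \sigma_4 r^2 + 2 \sigma_5 r(1 - r) + \sigma_6 (1 - r)^2$ and $\Sigma_3(r) = \sigma_7 r^2 + 2 \sigma_8 r(1 - r) + \sigma_9 (1 - r)^2$, and similarly $T_1(p) = \sigma_1 p^2 + 2 \sigma_4 p(1 - p) + \sigma_7 (1 - p)^2$, $T_2(p) = \sigma_2 p^2 + 2 \sigma_5 p(1 - p) + \sigma_8 (1 - p)^2$ and $T_3(p) = \sigma_3 p^2 + 2 \sigma_6 p(1 - p) + \sigma_9 (1 - p)^2$, so that $F = p(1 - p)\left[ p \Sigma_1 + (1 - 2p) \Sigma_2 - (1 - p) \Sigma_3 \right]$ and $G = r(1 - r)\left[ r T_1 + (1 - 2r) T_2 - (1 - r) T_3 \right]$. Then
\[ \frac{\partial F}{\partial p} = (1 - 2p)\left[ p \Sigma_1 + (1 - 2p) \Sigma_2 - (1 - p) \Sigma_3 \right] + p(1 - p)\left[ \Sigma_1 - 2 \Sigma_2 + \Sigma_3 \right], \]
\[ \frac{\partial F}{\partial r} = p(1 - p)\left[ p \Sigma_1' + (1 - 2p) \Sigma_2' - (1 - p) \Sigma_3' \right], \]
\[ \frac{\partial G}{\partial p} = r(1 - r)\left[ r T_1' + (1 - 2r) T_2' - (1 - r) T_3' \right] \]
and
\[ \frac{\partial G}{\partial r} = (1 - 2r)\left[ r T_1 + (1 - 2r) T_2 - (1 - r) T_3 \right] + r(1 - r)\left[ T_1 - 2 T_2 + T_3 \right], \]
where primes denote derivatives, for example $\Sigma_1' = 2 \sigma_1 r + 2 \sigma_2 (1 - 2r) - 2 \sigma_3 (1 - r)$ and $T_1' = 2 \sigma_1 p + 2 \sigma_4 (1 - 2p) - 2 \sigma_7 (1 - p)$.

Before we find and evaluate the eigenvalues of the Jacobian matrix, we will also consider the fitness curve in Figure 3.3.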
PAGE 55
Figure 3.3. Fitness Curve II (fitness value versus numeric phenotype)

We will make assumptions about the weights of alleles and genotypes, and the fitness values of the phenotypes, similar to those in section 3.1. Assign weights to each allele: $w(A) = 1$, $w(B) = 1$, $w(a) = 0$ and $w(b) = 0$. The weight of each genotype is the sum of the weights of its alleles. So, if we have genotype AABB, its weight is $w(AABB) = w(A) + w(A) + w(B) + w(B) = 4$. The weights of the genotypes are then as follows: $w(AABB) = 4$, $w(AABb) = 3$, $w(AAbb) = 2$, $w(AaBB) = 3$, $w(AaBb) = 2$, $w(Aabb) = 1$, $w(aaBB) = 2$, $w(aaBb) = 1$, $w(aabb) = 0$. The numbers 0, 1, 2, 3 and 4 represent the numeric phenotypes of the genotypes. Assign new fitness values $f$ to each phenotype. Let $f(1) = f(3) = a$, $f(2) = b$, $f(0) = f(4) = c$. Then we have the fitness parameters in Table 3.4. We now consider the following two cases:
\[ b > a > 0 > c = -1 \quad \text{and} \quad b > 0 > a > c = -1. \qquad (3.7) \]
PAGE 56
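σ1 = f(AABB) = c    σ2 = f(AABb) = a    σ3 = f(AAbb) = b
σ4 = f(AaBB) = a    σ5 = f(AaBb) = b    σ6 = f(Aabb) = a
σ7 = f(aaBB) = b    σ8 = f(aaBb) = a    σ9 = f(aabb) = c

Table 3.4. Fitness parameters of genotypes

Writing the first fixed point in question in terms of $a$, $b$ and $c = -1$, we have
\[ \left( 0, \frac{-\sigma_8 + \sigma_9}{\sigma_7 - 2 \sigma_8 + \sigma_9} \right) = \left( 0, \frac{-a - 1}{b - 2a - 1} \right) = \left( 0, \frac{a + 1}{1 + 2a - b} \right). \]
For this point to lie in the unit square, we need $1 + 2a - b > 0$, $a + 1 > 0$, and $a + 1 < 1 + 2a - b$. Since $b > a > 0 > c = -1$ or $b > 0 > a > c = -1$, we have $a + 1 > 0$. If $1 + 2a - b > 0$, then $1 + 2a > b$, and if $a + 1 < 1 + 2a - b$, we have $a < 2a - b$. The condition $a < 2a - b$ implies that $-a < -b$, which means $a > b$, and this contradicts (3.7). Hence $\left( 0, \frac{a + 1}{1 + 2a - b} \right) = \left( 0, \frac{-\sigma_8 + \sigma_9}{\sigma_7 - 2 \sigma_8 + \sigma_9} \right)$ cannot be a fixed point.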
PAGE 57
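For the second fixed point in question, rewriting gives
\[ (p^*, r^*) = \left( \frac{-\sigma_6 + \sigma_9}{\sigma_3 - 2 \sigma_6 + \sigma_9}, 0 \right) = \left( \frac{-a - 1}{b - 2a - 1}, 0 \right) = \left( \frac{a + 1}{1 + 2a - b}, 0 \right). \]
Similarly, this point cannot be a fixed point of the equations (3.5) and (3.6).

For the third point,
\[ (p^*, r^*) = \left( 1, \frac{\sigma_3 - \sigma_2}{\sigma_1 - 2 \sigma_2 + \sigma_3} \right) = \left( 1, \frac{b - a}{-1 - 2a + b} \right) = \left( 1, \frac{a - b}{1 + 2a - b} \right). \]
We need $a - b > 0$, $1 + 2a - b > 0$ and $a - b < 1 + 2a - b$ for this point to be a fixed point. If $a - b > 0$, then $a > b$, which contradicts the assumption (3.7).

We also have $(p^*, r^*) = \left( \frac{\sigma_7 - \sigma_4}{\sigma_1 - 2 \sigma_4 + \sigma_7}, 1 \right)$. This point in terms of $a$, $b$ and $c$ is
\[ \left( \frac{b - a}{-1 - 2a + b}, 1 \right) = \left( \frac{a - b}{1 + 2a - b}, 1 \right). \]
Similarly, this point cannot be a fixed point of the equations (3.5) and (3.6).

Now, using the four partial derivatives, we can form the Jacobian matrix at the fixed points (0, 0), (0, 1), (1, 0) and (1, 1):
\[ J(0, 0) = \begin{bmatrix} \sigma_6 - \sigma_9 & 0 \\ 0 & \sigma_8 - \sigma_9 \end{bmatrix}, \quad J(0, 1) = \begin{bmatrix} \sigma_4 - \sigma_7 & 0 \\ 0 & \sigma_8 - \sigma_7 \end{bmatrix}, \quad J(1, 0) = \begin{bmatrix} \sigma_6 - \sigma_3 & 0 \\ 0 & \sigma_2 - \sigma_3 \end{bmatrix}, \]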
PAGE 58
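\[ J(1, 1) = \begin{bmatrix} \sigma_4 - \sigma_1 & 0 \\ 0 & \sigma_2 - \sigma_1 \end{bmatrix}. \]
Now we find the eigenvalues of the four Jacobian matrices evaluated at the fixed points.

(1) At (0, 0), the characteristic polynomial is
\[ (\sigma_6 - \sigma_9 - \lambda)(\sigma_8 - \sigma_9 - \lambda) = 0. \]
Therefore,
\[ \lambda_1 = \sigma_6 - \sigma_9 = a - c = a + 1 \quad \text{and} \quad \lambda_2 = \sigma_8 - \sigma_9 = a - c = a + 1. \]
If $b > a > 0 > c$ or $b > 0 > a > c$, then $a + 1 > 0$ and $\lambda_1 > 0$. With a similar argument, we also have $\lambda_2 > 0$. Therefore, for case (3.7), we have both $\lambda_1 > 0$ and $\lambda_2 > 0$, which means (0, 0) is unstable.

(2) At (0, 1), the characteristic polynomial is
\[ (\sigma_4 - \sigma_7 - \lambda)(\sigma_8 - \sigma_7 - \lambda) = 0. \]
Therefore,
\[ \lambda_1 = \sigma_4 - \sigma_7 = a - b \quad \text{and} \quad \lambda_2 = \sigma_8 - \sigma_7 = a - b. \]
If $b > a > 0 > c$ or $b > 0 > a > c$, then $a - b < 0$, which means $\lambda_1 < 0$. Similarly, we have $\lambda_2 < 0$. Therefore, for case (3.7), we have both $\lambda_1 < 0$ and $\lambda_2 < 0$, which means (0, 1) is stable.

(3) At (1, 0), the characteristic polynomial is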
PAGE 59
\[ (\sigma_6 - \sigma_3 - \lambda)(\sigma_2 - \sigma_3 - \lambda) = 0. \]
Hence,
\[ \lambda_1 = \sigma_6 - \sigma_3 = a - b \quad \text{and} \quad \lambda_2 = \sigma_2 - \sigma_3 = a - b. \]
If $b > a > 0 > c$ or $b > 0 > a > c$, then $a - b < 0$ and $\lambda_1 < 0$. Similarly, we have $\lambda_2 < 0$. Therefore, for case (3.7), we have both $\lambda_1 < 0$ and $\lambda_2 < 0$, which means (1, 0) is stable.

(4) At (1, 1), the characteristic polynomial is
\[ (\sigma_4 - \sigma_1 - \lambda)(\sigma_2 - \sigma_1 - \lambda) = 0. \]
Therefore,
\[ \lambda_1 = \sigma_4 - \sigma_1 = a - c = a + 1 \quad \text{and} \quad \lambda_2 = \sigma_2 - \sigma_1 = a - c = a + 1. \]
If $b > a > 0 > c$ or $b > 0 > a > c$, then $a + 1 > 0$ and $\lambda_1 > 0$. With a similar argument, we also have $\lambda_2 > 0$. Therefore, for case (3.7), we have both $\lambda_1 > 0$ and $\lambda_2 > 0$, which means (1, 1) is unstable.

Figure 3.4. (a) b > a > 0 > c = -1; (b) b > 0 > a > c = -1 (curves: p(t) = Fq(A), r(t) = Fq(B); horizontal axis: t, number of generations)
PAGE 60
Hence, we have $(p^*, r^*) = (0, 0)$ and $(p^*, r^*) = (1, 1)$ as unstable fixed points and $(p^*, r^*) = (0, 1)$ and $(p^*, r^*) = (1, 0)$ as stable fixed points. Either the alleles a and B will eventually dominate the gene pool, since the frequency $p(t)$ converges to 0 and the frequency $r(t)$ converges to 1, or the alleles A and b will dominate the gene pool, because the frequency $p(t)$ converges to 1 and the frequency $r(t)$ converges to 0, which is the same result as in section 3.1 with difference equations.

In Figure 3.4, we show the allele frequencies $p(t)$ and $r(t)$ for $t \geq 0$. In Figure 3.4a, we have $b > a > 0 > c = -1$, and in Figure 3.4b, we have $b > 0 > a > c = -1$. Computational results showed that if the initial conditions, $p(0)$ and $r(0)$, were closer to 0, then the frequencies, $p(t)$ and $r(t)$, converged to 0. If $p(0)$ and $r(0)$ were closer to 1, then $p(t)$ and $r(t)$ converged to 1 (see Figure 3.4). Hence, based on the initial condition, either genotype aaBB or genotype AAbb remains in the gene pool.

Figure 3.5. (a) b > a > 0 > c = -1; (b) b > 0 > a > c = -1 (horizontal axis: t, number of generations)
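The convergence to (0, 1) or (1, 0) can be reproduced with a simple forward Euler integration of (3.5) and (3.6). This is only a sketch: the thesis does not state which numerical method produced its figures, and the names `rhs` and `integrate`, the step size, the growth rates a = 0.5, b = 1, c = -1, and the initial conditions are all assumptions made for illustration.

```python
def rhs(p, r, sigma):
    """Right-hand sides F and G of (3.5) and (3.6)."""
    g1, g2, g3, g4, g5, g6, g7, g8, g9 = sigma
    q, u = 1.0 - p, 1.0 - r
    F = (p*q * p * (g1*r*r + 2*g2*r*u + g3*u*u)
         + p*q * (1 - 2*p) * (g4*r*r + 2*g5*r*u + g6*u*u)
         - p*q*q * (g7*r*r + 2*g8*r*u + g9*u*u))
    G = (r*u * r * (g1*p*p + 2*g4*p*q + g7*q*q)
         + r*u * (1 - 2*r) * (g2*p*p + 2*g5*p*q + g8*q*q)
         - r*u*u * (g3*p*p + 2*g6*p*q + g9*q*q))
    return F, G

def integrate(p, r, sigma, dt=0.01, steps=10000):
    """Forward Euler integration of (3.5)-(3.6) up to time dt * steps."""
    for _ in range(steps):
        F, G = rhs(p, r, sigma)
        p, r = p + dt * F, r + dt * G
    return p, r

a, b, c = 0.5, 1.0, -1.0                  # sample values with b > a > 0 > c = -1
sigma = (c, a, b, a, b, a, b, a, c)       # Table 3.4 ordering

end_near_01 = integrate(0.1, 0.9, sigma)  # start near (0, 1)
end_near_10 = integrate(0.9, 0.1, sigma)  # start near (1, 0)
print(end_near_01, end_near_10)
```

With these sample values the first trajectory ends at (0, 1) and the second at (1, 0) to within rounding, matching the stability analysis above.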
PAGE 61
While testing the stability of the fixed points computationally, we discovered a new fixed point that was not found analytically. The growth rates $\sigma_1 = \sigma_9 = -.03$; $\sigma_2 = \sigma_4 = \sigma_6 = \sigma_8 = .61$; $\sigma_3 = \sigma_5 = \sigma_7 = .76$ satisfy the ordering $b > a > 0 > c$ of condition (3.7), and there exists a saddle point at $(\frac{1}{2}, \frac{1}{2})$. The Jacobian matrix $J$ evaluated at $(\frac{1}{2}, \frac{1}{2})$ with these $\sigma_i$ has one positive eigenvalue, $\lambda_1 > 0$, and one negative eigenvalue, $\lambda_2 < 0$. When we have the initial condition $(p(0), r(0)) = (\frac{1}{2}, \frac{1}{2})$, $(p(t), r(t))$ remains at $(\frac{1}{2}, \frac{1}{2})$ (see Figure 3.5a). When $0 < p(0) = r(0) < 1$, $(p(t), r(t))$ is attracted to $(\frac{1}{2}, \frac{1}{2})$; then the solutions converge to either (0, 1) or (1, 0), the stable fixed points (see Figure 3.5b).

The point $(\frac{1}{2}, \frac{1}{2})$ is also a saddle point for the difference equations (3.1) and (3.2). As shown in the phase portrait in Figure 3.6a, $(p_n, r_n)$ remains at $(\frac{1}{2}, \frac{1}{2})$ when $(p_0, r_0) = (\frac{1}{2}, \frac{1}{2})$. When we have 0
PAGE 62
Figure 3.6. Phase Portraits, b > a > c = 0: (a) $(p_0, r_0) = (\frac{1}{2}, \frac{1}{2})$; (b) $0 < p_0 = r_0 < 1$ (horizontal axis: $p_n$, frequency of allele A)

Analysis of the difference equations (3.1) and (3.2) and the differential equations (3.5) and (3.6) gave a total of eight fixed points. We found these fixed points by fixing either the frequency of A or the frequency of B at 0 or 1, which was the simplest case. With the assumptions we made about the fitness values of the phenotypes, (3.4) and (3.7), we had only four fixed points, two of which were stable. Also, we were able to find a saddle point for these equations computationally, which indicates that there are more fixed points for the difference equations (3.1) and (3.2) and the differential equations (3.5) and (3.6) that we have not found analytically. In all of these cases, only two of the four alleles dominated the gene pool and hence only one genotype dominated. By letting the values of $s_3$, $s_5$, $s_7$ and $\sigma_3$, $\sigma_5$, $\sigma_7$ be greater than the other fitness values, we favored the heterozygous genotypes over the homozygous genotypes, because the $s_i$ and $\sigma_i$ for $i = 3, 5, 7$ represent the genotypes AAbb, AaBb,
PAGE 63
aaBB. We observed that only the genotypes AAbb and aaBB came out as winners and remained in the population.
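The saddle character of the interior point (1/2, 1/2) can be examined with a finite-difference Jacobian of (3.5) and (3.6). The sketch below is an illustration under stated assumptions: the sign of $\sigma_1 = \sigma_9 = -0.03$ is assumed (it must be negative for $0 > c$ to hold), and the helper names `rhs`, `jacobian` and `eigenvalues_2x2` are hypothetical.

```python
import math

def rhs(p, r, sigma):
    """Right-hand sides F and G of (3.5) and (3.6)."""
    g1, g2, g3, g4, g5, g6, g7, g8, g9 = sigma
    q, u = 1.0 - p, 1.0 - r
    F = (p*q * p * (g1*r*r + 2*g2*r*u + g3*u*u)
         + p*q * (1 - 2*p) * (g4*r*r + 2*g5*r*u + g6*u*u)
         - p*q*q * (g7*r*r + 2*g8*r*u + g9*u*u))
    G = (r*u * r * (g1*p*p + 2*g4*p*q + g7*q*q)
         + r*u * (1 - 2*r) * (g2*p*p + 2*g5*p*q + g8*q*q)
         - r*u*u * (g3*p*p + 2*g6*p*q + g9*q*q))
    return F, G

def jacobian(p, r, sigma, h=1e-6):
    """Central-difference approximation of the Jacobian of (F, G) at (p, r)."""
    f_p1, g_p1 = rhs(p + h, r, sigma)
    f_p0, g_p0 = rhs(p - h, r, sigma)
    f_r1, g_r1 = rhs(p, r + h, sigma)
    f_r0, g_r0 = rhs(p, r - h, sigma)
    return [[(f_p1 - f_p0) / (2*h), (f_r1 - f_r0) / (2*h)],
            [(g_p1 - g_p0) / (2*h), (g_r1 - g_r0) / (2*h)]]

def eigenvalues_2x2(J):
    """Real eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    root = math.sqrt(trace * trace - 4.0 * det)   # assumes a real spectrum
    return (trace + root) / 2.0, (trace - root) / 2.0

# Growth rates from the text; the negative sign of c is an assumption.
a, b, c = 0.61, 0.76, -0.03
sigma = (c, a, b, a, b, a, b, a, c)

residual = rhs(0.5, 0.5, sigma)          # should vanish at a fixed point
lam_plus, lam_minus = eigenvalues_2x2(jacobian(0.5, 0.5, sigma))
print(residual, lam_plus, lam_minus)
```

Under these assumptions the residual is zero up to rounding and the two eigenvalues have opposite signs, which is the signature of a saddle point.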
PAGE 64
4. Case of Two Alleles, Three Loci

With two alleles at three loci, we have allele types A, a, B, b, C and c with the following 27 genotypes: AABBCC, AABBCc, AABBcc, AABbCC, AABbCc, AABbcc, AAbbCC, AAbbCc, AAbbcc, AaBBCC, AaBBCc, AaBBcc, AaBbCC, AaBbCc, AaBbcc, AabbCC, AabbCc, Aabbcc, aaBBCC, aaBBCc, aaBBcc, aaBbCC, aaBbCc, aaBbcc, aabbCC, aabbCc, aabbcc. We now need fitness values $\sigma_1, \ldots, \sigma_{27}$ for each genotype. We let $p_1(t)$ be the frequency of A, $q_1(t)$ be the frequency of B, and $r_1(t)$ be the frequency of C in the population. Then $p_2(t) = 1 - p_1(t)$ represents the frequency of allele a, $q_2(t) = 1 - q_1(t)$ represents the frequency of allele b, and $r_2(t) = 1 - r_1(t)$ represents the frequency of allele c. Using the method of the previous sections, we get differential equations for $p_1(t)$, $q_1(t)$ and $r_1(t)$, respectively, where
\[ \beta = \sigma_1 p_1^2 q_1^2 r_1^2 + 2 \sigma_2 p_1^2 q_1^2 r_1 r_2 + \sigma_3 p_1^2 q_1^2 r_2^2 + 2 \sigma_4 p_1^2 q_1 q_2 r_1^2 + 4 \sigma_5 p_1^2 q_1 q_2 r_1 r_2 + 2 \sigma_6 p_1^2 q_1 q_2 r_2^2 + \cdots \]
PAGE 65
As before, we will assign a fitness to each genotype as follows. Assign a weight to each allele: $w(A) = 1$, $w(B) = 1$, $w(C) = 1$, $w(a) = 0$, $w(b) = 0$ and $w(c) = 0$. For example, for the genotype AABBCC,
PAGE 66
$w(AABBCC) = w(A) + w(A) + w(B) + w(B) + w(C) + w(C) = 6$. The numeric phenotypes of the genotypes are as follows: $w(AABBCC) = 6$, $w(AABBCc) = 5$, $w(AABBcc) = 4$, $w(AABbCC) = 5$, $w(AABbCc) = 4$, $w(AABbcc) = 3$, $w(AAbbCC) = 4$, $w(AAbbCc) = 3$, $w(AAbbcc) = 2$, $w(AaBBCC) = 5$, $w(AaBBCc) = 4$, $w(AaBBcc) = 3$, $w(AaBbCC) = 4$, $w(AaBbCc) = 3$, $w(AaBbcc) = 2$, $w(AabbCC) = 3$, $w(AabbCc) = 2$, $w(Aabbcc) = 1$, $w(aaBBCC) = 4$, $w(aaBBCc) = 3$, $w(aaBBcc) = 2$, $w(aaBbCC) = 3$, $w(aaBbCc) = 2$, $w(aaBbcc) = 1$, $w(aabbCC) = 2$, $w(aabbCc) = 1$, $w(aabbcc) = 0$.

We will use the fitness curves in Figure 4.1 to obtain the fitness parameters (growth rates) $\sigma_i$. Specifically, $f(0) = f(6) = d$, $f(1) = f(5) = c$, $f(2) = f(4) = a$ and $f(3) = b$.

Figure 4.1. (a) Fitness Curve III; (b) Fitness Curve IV (fitness value versus numeric phenotype)

Thus, the growth rates for the genotypes are given in Table 4.1.
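The weights and phenotype classes for the 27 genotypes can be generated rather than listed by hand. The sketch below is illustrative: the names `LOCI`, `PHENOTYPE_CLASS` and `genotype_weights` are assumptions, and the letters d, c, a, b are kept symbolic because they stand for the fitness values read off Figure 4.1.

```python
from itertools import product

# Single-locus genotypes, ordered so that the product matches sigma_1..sigma_27.
LOCI = (("AA", "Aa", "aa"), ("BB", "Bb", "bb"), ("CC", "Cc", "cc"))

# Fitness class of each numeric phenotype:
# f(0) = f(6) = d, f(1) = f(5) = c, f(2) = f(4) = a, f(3) = b.
PHENOTYPE_CLASS = {0: "d", 1: "c", 2: "a", 3: "b", 4: "a", 5: "c", 6: "d"}

def genotype_weights():
    """Return (index, genotype, weight) for all 27 three-locus genotypes."""
    rows = []
    for index, combo in enumerate(product(*LOCI), start=1):
        genotype = "".join(combo)
        # Each capital allele (A, B or C) has weight 1, each lowercase allele 0.
        weight = sum(1 for allele in genotype if allele.isupper())
        rows.append((index, genotype, weight))
    return rows

rows = genotype_weights()
for index, genotype, weight in rows:
    print(f"sigma_{index} = f({genotype}) = {PHENOTYPE_CLASS[weight]}   (weight {weight})")
```

The same loop extends to more loci by adding entries to `LOCI` and to the phenotype table.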
PAGE 67
σ1 = f(AABBCC) = d     σ14 = f(AaBbCc) = b
σ2 = f(AABBCc) = c     σ15 = f(AaBbcc) = a
σ3 = f(AABBcc) = a     σ16 = f(AabbCC) = b
σ4 = f(AABbCC) = c     σ17 = f(AabbCc) = a
σ5 = f(AABbCc) = a     σ18 = f(Aabbcc) = c
σ6 = f(AABbcc) = b     σ19 = f(aaBBCC) = a
σ7 = f(AAbbCC) = a     σ20 = f(aaBBCc) = b
σ8 = f(AAbbCc) = b     σ21 = f(aaBBcc) = a
σ9 = f(AAbbcc) = a     σ22 = f(aaBbCC) = b
σ10 = f(AaBBCC) = c    σ23 = f(aaBbCc) = a
σ11 = f(AaBBCc) = a    σ24 = f(aaBbcc) = c
σ12 = f(AaBBcc) = b    σ25 = f(aabbCC) = a
σ13 = f(AaBbCC) = a    σ26 = f(aabbCc) = c
                       σ27 = f(aabbcc) = d

Table 4.1. Fitness parameters of genotypes

We will consider the following two cases:
\[ b > a > c = 0 > d \quad \text{and} \quad b > a > 0 > c > d. \]
Figures 4.2, 4.3, 4.4 and 4.5 are some of the computational results obtained from the differential equations under these conditions.

Figure 4.2. Growth rates: (a) b > a > c = 0 > d; (b) b > a > 0 > c > d (curves: p1(t) = Fq(A), q1(t) = Fq(B), r1(t) = Fq(C); horizontal axis: t, number of generations)
PAGE 68
Figures 4.2a, 4.3a, 4.4a and 4.5a correspond to Figure 4.1a and have

$\sigma_1 = \sigma_{27} = d = -.12$,
$\sigma_2 = \sigma_4 = \sigma_{10} = \sigma_{18} = \sigma_{24} = \sigma_{26} = c = 0$,
$\sigma_6 = \sigma_8 = \sigma_{12} = \sigma_{14} = \sigma_{16} = \sigma_{20} = \sigma_{22} = b = .56$.

Figures 4.2b, 4.3b, 4.4b and 4.5b correspond to Figure 4.1b and have growth rates satisfying $b > a > 0 > c > d$, and the remaining $\sigma_i$ are the same as above.

Figure 4.3. Growth rates: (a) b > a > c = 0 > d; (b) b > a > 0 > c > d (curves: p1(t) = Fq(A), q1(t) = Fq(B), r1(t) = Fq(C); horizontal axis: t, number of generations)

From Figures 4.2, 4.3, 4.4 and 4.5, we can hypothesize that $(0, \frac{1}{2}, 1)$ is a stable fixed point, since each frequency converges to 0, $\frac{1}{2}$ or 1. Under different fitness values, there could exist more fixed points with different stability. If
PAGE 69
$b > a > 0 > c > d = -.12$ (Figures 4.2b, 4.3b, 4.4b, 4.5a and 4.5b), we have .8 and .2 as saddle points, since two of the three frequencies are attracted to either .8 or .2 and then converge to $(0, \frac{1}{2}, 1)$.

Figure 4.4. Growth rates: (a) b > a > c = 0 > d; (b) b > a > 0 > c > d (curves: p1(t) = Fq(A), q1(t) = Fq(B), r1(t) = Fq(C); horizontal axis: t, number of generations)

Because there is always one of the frequencies approaching 0 and one approaching 1, two of the allele types A, B and C, and two of a, b and c, will eventually dominate the gene pool. Figure 4.2a has $p_1(t)$ approaching 0 and $r_1(t)$ converging to 1. This means the alleles a, B, b and C will dominate the gene pool. Hence only the genotypes aaBBCC, aaBbCC and aabbCC, which are one ninth of the total genotypes, dominate the population. Figure 4.2b has $r_1(t)$ approaching 0 and $q_1(t)$ approaching 1.
PAGE 70
Figure 4.5. Growth rates: (a) b > a > c = 0 > d; (b) b > a > 0 > c > d (curves: p1(t) = Fq(A), q1(t) = Fq(B), r1(t) = Fq(C); horizontal axis: t, number of generations)

Thus, alleles A, a, c and B will eventually dominate the gene pool. Therefore, AABBcc, AaBBcc and aaBBcc are the only genotypes dominating the gene pool. Once again, only one ninth of the genotypes dominate in the population.

Note that in Figure 4.4a, $p_1(t)$ and $q_1(t)$ converge to values other than 0 or 1, and $r_1(t)$ converges to 0. Hence, only the genotypes containing the alleles A, a, B, b and c will dominate the gene pool. Thus the genotypes AABBcc, AABbcc, AAbbcc, AaBBcc, AaBbcc, Aabbcc, aaBBcc, aaBbcc and aabbcc will dominate the population, which is one third of all genotypes. However, this was true only when the initial condition was $(p_1(0), q_1(0), r_1(0)) = (.45, .54, .36)$.
PAGE 71
5. Conclusion

The difference and differential equations from chapters 2, 3 and 4 had 3, 9, and 27 fitness parameters, respectively. While evaluating the difference and differential equations, we reduced the number of parameters by favoring heterozygous over homozygous genotypes, in hopes of finding a polymorphism, where all the genotypes coexist (are visible) in the gene pool.

In section 2.1, we examined a difference equation model where domination of the gene pool by genotypes was based on the fitness level of each genotype, given two alleles A and a at one locus. Among the four cases we looked at, there was only one case where both alleles dominated the population, so that all three genotypes AA, Aa and aa were visible in the gene pool. In section 2.2, with the differential equation (2.3), we found a similar result. Through the stability analysis of the fixed points, we were able to conclude that there was one instance where all the genotypes dominated the gene pool. Therefore, given two alleles at one locus, there exists a polymorphism when the fitness value of the heterozygous genotype is greater than that of the homozygous genotypes.

In sections 3.1 and 3.2, we found that not all the alleles dominated the gene pool; either alleles a and B dominated or alleles A and b dominated. Convergence of $(p_n, r_n)$ to either (0, 1) or (1, 0) is based on $(p_0, r_0)$, the initial value of $(p_n, r_n)$; if $(p_0, r_0)$ is closer to the point (0, 1), then $(p_n, r_n)$ approaches (0, 1), and if $(p_0, r_0)$ is closer to (1, 0), then $(p_n, r_n)$ converges to the fixed point
PAGE 72
(1, 0). Analysis of the difference equations (3.1) and (3.2) and the differential equations (3.5) and (3.6) gave a total of eight fixed points. With the assumptions we made about the fitness values of the phenotypes, (3.4) and (3.7), we had four fixed points, only two of which were stable. Also, we were able to find an additional saddle point for these equations computationally. Thus, only two of the four alleles dominated the gene pool and hence only one genotype dominated. By letting the values of $s_3$, $s_5$, $s_7$ and $\sigma_3$, $\sigma_5$, $\sigma_7$ be greater than the other fitness values, we favored the heterozygous genotypes over the homozygous genotypes, since the $s_i$ and $\sigma_i$ for $i = 3, 5, 7$ correspond to the genotypes AAbb, AaBb, aaBB. However, only the genotypes AAbb and aaBB came out as winners, that is, dominated the population. Hence there was no polymorphism. When we have two loci, we get different types of heterozygous genotypes that have the same weight 2. We can consider the phenotypes of AAbb and aaBB as one class, and AaBb as the other. If we assigned different numeric phenotypes and fitness values to these two classes of heterozygous genotypes, instead of favoring AaBb, there may exist a polymorphic state.
PAGE 73
REFERENCES

[1] Audesirk, Gerald, Audesirk, Teresa, Biology: Life on Earth, Macmillan Publishing Co., New York, 1993.

[2] Hoppensteadt, Frank C., Mathematical Theories of Population: Demographics, Genetics and Epidemic, Society for Industrial and Applied Mathematics, 1974.

[3] Hoppensteadt, Frank C., and Peskin, Charles S., Mathematics in Medicine and the Life Science, Springer-Verlag, New York, 1992.
