Citation |

- Permanent Link:
- http://digital.auraria.edu/AA00002591/00001
## Material Information- Title:
- Discrete logarithms in finite fields
- Creator:
- Reiff, Dean Phillip
- Publication Date:
- 1996
- Language:
- English
- Physical Description:
- v, 29 leaves : ; 29 cm
## Thesis/Dissertation Information- Degree:
- Master's ( Master of science)
- Degree Grantor:
- University of Colorado Denver
- Degree Divisions:
- Department of Mathematical and Statistical Sciences, CU Denver
- Degree Disciplines:
- Applied mathematics
## Subjects- Subjects / Keywords:
- Coding theory ( lcsh )
Combinatorial geometry ( lcsh ) Cryptography ( lcsh ) Logarithms ( lcsh ) Coding theory ( fast ) Combinatorial geometry ( fast ) Cryptography ( fast ) Logarithms ( fast ) - Genre:
- bibliography ( marcgt )
theses ( marcgt ) non-fiction ( marcgt )
## Notes- Bibliography:
- Includes bibliographical references (leaves 26-29).
- General Note:
- Submitted in partial fulfillment of the requirements for the degree, Master of Science, Applied Mathematics.
- General Note:
- Department of Mathematical and Statistical Sciences
- Statement of Responsibility:
- by Dean Phillip Reiff.
## Record Information- Source Institution:
- |University of Colorado Denver
- Holding Location:
- Auraria Library
- Rights Management:
- All applicable rights reserved by the source institution and holding location.
- Resource Identifier:
- 37366653 ( OCLC )
ocm37366653 - Classification:
- LD1190.L622 1996m .R45 ( lcc )
## Auraria Membership |

Full Text |

DISCRETE LOGARITHMS
IN FINITE FIELDS by Dean Phillip Reiff B.A., University of Colorado at Boulder, 1980 B.A., University of Colorado at Denver, 1987 A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science Applied Mathematics 1996 This thesis for the Master of Science degree by Dean Phillip Reiff has been approved by William Cherowitzo Sylvia Lu Stan Payne Date Reiff, Dean Phillip (M.S., Applied Mathematics) Discrete Logarithms in Finite Fields Thesis directed by Associate Professor William Cherowitzo Abstract Given a finite held Fq of order q, and g a primitive element of Fq, the discrete logarithm base g of an arbitrary, non-zero y Â£ Fq is that integer x} 0 < x < q such that gx = y in Fq. The security of many real-world cryptographic schemes depends on the difficulty of computing discrete logarithms in large finite fields. This thesis is a survey of the discrete logarithm problem in finite fields, including: some cryptographic applications (password authentication, the Diffie-Hellman key exchange, and the ElGamal public-key cryptosystem and digital signature scheme); Niederreiters proof of explicit formulas for the discrete logarithm; and algorithms for computing discrete logarithms (especially Shanks algorithm, Pollards p-method, the Pohlig-Hellman algorithm, Coppersmiths algorithm in fields of order 2n, and the Gaussian integers method for fields of prime order). This abstract accurately represents the content of the candidates thesis. I recommend its publication. Signed William Cherowitzo m ACKNOWLEDGEMENT I would like to thank my adviser, Bill Cherowitzo, for all the help and guidance that he has graciously given me. I would also like to thank Sylvia and Stan Payne for their service on my thesis committee. Contents 1 Introduction ............................................ 1 2 Cryptographic Applications .............................. 3 3 Explicit Formulas ....................................... 6 4 Square Root Algorithms .................................. 9 4.1 Baby-Step Giant-Step Method ........................ 9 4.2 Pollards /j-Method ............................... 10 4.3 Pohlig-Hellman Algorithm .......................... 12 5 Index Calculus Algorithms .............................. 14 5.1 Coppersmiths Algorithm ........................... 17 5.2 Gaussian Integers Method ...........................20 5.3 Other Developments ................................ 23 6 Conclusion.............................................. 25 References ............................................... 26 V 1 Introduction Let Fq be a finite field of order q, where q = pn (p prime), and let F* = Fq {0}. Given g, a primitive element of Fq, and an arbitrary y Â£ F*, the discrete logarithm, of y base g is defined as logfl y = x 77 gx = y in Fq and 0 < x < q 2. (Notice that g primitive =7* existence of x, and 0 < x < q 2 =7 uniqueness of x.) A fundamental difference between discrete logarithms and real logarithms is that the magnitude of y gives us no information about the magnitude of x. Gauss [15] referred to the discrete logarithm as the index of a number (a nomenclature that is still sometimes used), and he noted (for the case q a prime) some of its basic properties: Calculations with discrete logarithms are performed modulo q 1. log3(j/iJ/2) = log3 yi + logg y2 (mod q 1) loSs(^) = lo&g IF ~ log3 y2 (mod q 1) Change of base: suppose T is another primitive element of Fq and we know log3r = 7. r and g both primitive =7 (7, q 1) = 1 =7^7 such that 77 = 1 (mod q 1) =7 g = T7 in Fq. Therefore logfl y = x<=?y = gx = in Fq 77 logr y = 72: (mod q 1). Multiplying the last congruence by 7 gives logfl y = logfl Tdogr y (mod q1). The most obvious method of finding the discrete logarithm of y is to simply keep raising g to different powers until we find the specific exponent x such that gx = y in Fq. However, if q is very large, this method is computationally infeasible, and while faster algorithms have been developed, the discrete logarithm problem remains intractable (for almost all y Â£ F*) in very large fields Fq. This intractability makes the discrete logarithm problem useful in cryptographic applications, and many such applications have been developed, and implemented, in the last two decades. These real-world cryptographic implementations have raised the stakes on the discrete log problem, making the research of faster algorithms to solve the problem a matter of financial, and even national, security - its now imperative to know in how large a field the problem is solvable. f There is also interest in the discrete logarithm problem in other environments - e.g. in the group of points on an elliptic curve over a finite held, or in the multiplicative (non-cyclic) group of units in Zn, where n is a composite integer. Even in F*} some cryptographic applications allow g to be non-primitive, so that the order of the group < g >, and the mere existence of logfl y for arbitrary y G F*, are unknown. However, the case where g is primitive in a finite held environment has received the most attention, and I have limited the scope of this thesis to that case. I provide a survey of the discrete logarithm problem, including algorithms for computing discrete logarithms, explicit formulas for the discrete logarithm, and some of the most important cryptographic schemes whose security depends on the intractability of the discrete logarithm problem. Throughout this thesis, log2 and In will denote real logarithms to the bases 2 and e, respectively. Additionally, q will continue to denote the order of a finite held, and g will always be assumed to be a primitive element. I refer to fields of prime order as prime fields, and to all other finite fields as fields of prime power order. 2 2 Cryptographic Applications Discrete logarithms in large finite fields Fq come into play in cryptography because they appear to have the attributes of a one-way function. That is, while the discrete logarithm problem is intractable, its inverse function (viz. discrete exponentiation) is relatively easy to compute. Using the binary representation of an exponent x, 0 < x < q 2, the square-and-multiply method can compute gx G Fq with at most 2 log2q multiplications and modular reductions. For example, f/19 = f/100112 = = (((g2)2)2 gf 5, and computational blowup can be avoided by taking a modular reduction after each multiplication. Below, I present just a few of the most widely used cryptographic schemes whose security depends on the difficulty of computing discrete logarithms. A safe held size in which to implement these schemes is q ss 1015 ~ lO300, depending on the importance of the secret, how long it needs to remain secret (e.g. an embarassing diplomatic correspondence may need to remain secret for decades), and the computing resources available to those who might want to compromise the secret. (For comparison, there are approximately 1051 atoms in our planet, and the age of the universe is approximately 1017 seconds.) As will be seen in later sections, there are other aspects of the held Fq that can effect the ease with which discrete logarithms can be found in it. Most importantly, it is necessary that q 1 have at least one large prime factor. For those who havent seen much cryptography before, it may not be obvious how we can talk about messages (i.e. strings of characters of a language) as being integers, or even more generally, as elements of a hnite held. An easy way to make the conversion to integers is simply to instruct our computer to take the sequence of 0-1 bits it had been interpreting as a character string, and to instead interpret that sequence of bits to be a large integer (stored in binary). In order to hx a maximum for the size of integers we want to deal with, it may be necessary to break the message into a number of sub-messages of a hxed size. Thus, for example, if we break a message into blocks of 64 8-bit characters, we can limit the size of our integers to be less than 2512 10154. Similarly, the bits of the message can be interpreted to be the coefficients of an element in a held FA, while for helds Fpn} p odd, we can convert a binary message to p-ary coefficients - e.g. convert the message 0111 to 2x + 1 Â£ F3n. 3 The simplest cryptographic application that relies on the difficulty of computing discrete logarithms concerns user authentication for example, verifying passwords on a multi-user computer. It would be very risky to have a hie stored in the computer that contained every users password, yet the computer needs some way to verify the legitimacy of a password. Instead of storing the password ay for each user z, we can choose a finite held Fq and a primitive element g, and store the values gXl = zq Â£ Fq for each user i. (g, and the modulus for Fq also need to be stored, but they can be the same for all users.) To authenticate a users password, the computer hrst calculates gx% then compares the result for a match on the stored hie. However, even if someone gains access to the stored hie, in order to impersonate user z, they would hrst have to calculate the discrete logarithm of zq in Fq. The problem with classical, private-key cryptography has always been that, if two users wanted to communicate privately over a public, insecure channel, they hrst needed, in some private and secure way, to agree on a shared secret key, which they would both use to encipher and decipher their messages to each other. Furthermore, this problem is compounded for a large system of users, since a different secret key is needed for each pair of users in the system. This problem was eliminated in 1976, when Diffie and Heilman [11] invented the concept of a public-key cryptosystem. In their scheme, a hnite held Fq and a primitive element g are publicly agreed upon. Each user z chooses an integer private key ay, 2 < ay < q 2, which they keep secret, but makes publicly known their public key zq = gXl Â£ Fq. To encipher and decipher messages to each other, two users A and B use the key gXAXB} which they can each calculate as (hb)Xa or (Ha)Xb } respectively. An ability to hnd discrete logarithms in Fq would clearly break the system, and for some special cases of q, it has been shown [6, 24] that any method of breaking the system is computationally equivalent to the discrete logarithm problem. In 1985, ElGamal [12] proposed a public-key cryptosystem and digital signature scheme in which the public and private keys are the same as in the Diffie-Hellman system. User B can send a message to Â£ Fq to user A by choosing an integer k, 2 < k < q 2 (and its important to choose a different k for each message), then sending the pair (gk,myA) to user A. User A knows xa (mod q 1), and can calculate yA as (gk)~XA to recover the message m = (mykA)yJk. To implement the ElGamal digital signature scheme, we must restrict ourselves to a held of prime order p (actually this can be extended to fields of prime power order if we use an integer representation of held elements where necessary see e.g. the description in [32]). To sign a message to, 1 < to < p 1, 4 a user A will provide a pair of integers (r, s), 1 < r, s < p 1, that satisfy the following properties: a knowledge of xa is necessary to produce the signature; anyone who knows to (including a judge in a court of law) can use ija to verify the signature; and any alteration of the message to after the signature was produced will nullify the signature. To calculate r and s, user A chooses an integer k} 1 < k < p 2, such that (k,p 1) = 1 (again, we must use a different k for each message), and calculates r = gk mod p, and s = &-1(to xat) mod p 1 ((k,p 1) = 1 => 3 k~x (mod p 1)). Thus, these r, s satisfy 9m = gXAr+ks = (gXA)r{gky = (yn)Vs (mod P), so to verify the signature, anyone can calculate gm and (PaYrs (mod p), and check that they are equal. In 1994, the National Institute of Standards and Technology [28] adopted a slightly modified version of this scheme as the Digital Signature Standard to be used by U.S. government agencies. 5 3 Explicit Formulas While worthless for actually calculating discrete logarithms (it would be faster to generate the whole held), it is fascinating that there exists explicit formulas for discrete logarithms in finite fields. Wells [42] gave an explicit polynomial form for discrete logarithms in prime fields, and Mullen and White [26] found an explicit form for discrete logarithms in fields of prime power order. Niederreiter [29] has provided much simpler proofs of these formulas, which I present below, that also generalizes the formula of Wells to fields of prime power order. Let q = pn}n > 1, be the order of the held. Some of the simplicity of Niederreiters proof is achieved by noting that it suffices to determine logfl y modulo the characteristic p. To see this, notice that we can represent loggy = EITo1 xiP\ where ay Â£ {0,1,... ,p 1} Vz. Assume we can hnd x0 = logfl y mod p. Then logfl y = x0 + cip, for some integer c\ > 0. Let y1 = (yg~x)p Then y1 = (gx+ciP-x')p = gcii = gcy So c\ = \oggy\. Continuing this process, we can, by assumption, hnd x1 = logfl y1 mod p, Then logg y = x0 + cxp = x0 + (x1 + c2p)p = x0 + x1 p1 + c2p2. Let y2 = (yig~Xl)p = (yg-x-x^)^ = (gC2p2)^ = g2. So c2 = loggy2. x2 = logfl y2 mod p xn_t = logg yn_i mod p. Thus, we can obtain logfl y = x0 + x1 p + x2p2 + + (s:m_i + cnp)pn~1, where 0 < logfl y < q 2 cn = 0. Niederreiters proof uses the convention that 0 = 1. Lemma 3.1 For integers j > 0 we have CeFg 0? if j = 0, or j ^ 0 mod (q 1) -1, otherwise. Proof If j = 0, we count to q modulo p (since 0 = 1). If j = 0 mod (q 1) and j ^ 0, we count to q 1 modulo p (since (P = 0). If j ^ 0 mod (q 1), Yd = Y d ceFq ceF* y (siy i = 0 Ywy 8 = 0 g3{q~L 1 g3 -1 1 -1 g3 -1 0. 6 Lemma 3.2 If q > 3 and k is any integer with 0 < k < q 1, then G Fp, and the sum is congruent to k (mod p) ceFq L c+1 Proof Let Sk be the sum on the left. If k = 0, then c 1 s> = = Er- Y c Ti Y c ceFq ceFq C^l C^l 1 = E = E d (just permuting the terms in the sum) ceF* ^ deF* = d = 0 (by lemma 1, with j = 1). deFq Ifl 1, then Sk = sk o = sk s0 k k 11 k = E(^'-Vi) = EE ;! j = -EEcJ_1 j = 1 j = 1 ceFq C j = 1 ceFq c^l c^l = -E(Ec'_1 + 1-1) = -E(E c'_1 -1) j = 1 ceFq j = 1 ceFq C+1 k = EE(0 1) (by lemma 1, since j 1 < k 1 < q 1) j=i = k, and k an integer => k Â£ Fp. From lemma 3.2, the following definition is immediate: ck Tk := EE ---------- = k (mod p) for q > 3 and 1 < k < q 1. ^Fq 1 c (3.0.1) 0,1 V r\k (i.e. Tk = Sk = Sk, since k > 0.) As will be seen, equation (3.0.1) is the beautifully simple key to Niederreiters proofs. Theorem 3.1 proves the result of Mullen and White [26]. 7 Theorem 3.1 For any y Â£ F* ,9 > 3, we have 9-2 V3 loggy = ~l + J2~1-------7 (mod p) g~'J 1 j=i y where on the right-hand side of the congruence there is an element of Fp Proof Put k = logfl y + 1 in equation (3.0.1), and note that cloÂ§gV = (^l0S9C)l0S92/ = (glo^y fOSgC = ylo%gc \Jc Â£ p1*. Then we get log3J/ = -1 + X) clgg y +1 cEFq cÂ£ 0,1 r< L- 1-c E4+^?-l ceFq cÂ£ 0,1 c!gg y ,,lg0 c r/loS3 C cÂ£Fq cÂ£0,l ^ ^-ioggC 1 :Â£F* y 1 (mod p) (mod p) cÂ£l 9-2 V9 -i + T,F (mod p) (where c / 1 <Â£ j ^ 0). g~'J 1 g=l n If we dehne the discrete logarithm slightly differently, we get another form, which is a generalization of Wells [42] result to fields of prime power order. Let Loggy be the unique integer x such that y = gx and 1 < x < q 1. (Note that Loggy = logfl y, except that Log3l = q 1, while logfl 1 = 0.) Theorem 3.2 For any y Â£ Ff} q > 3, we have 9-2 v3 Log ay = Y (mod p), j=i 1 93 where on the right-hand side of the congruence there is an element of Fv. Proof Put k = Loggy in equation (3.0.1), and note that cLog9y = j/LoÂ§9C \JC ^ p*_ Then we get Log 9y = Y cLog gy yLogge cEFq cÂ£ 0,1 1 - - Â£ ' 1 nLo Â§ EFq 1 y cÂ£ 0,1 lgc (mod p) 9-2 y3 = ------r (mod p) (where 1). 1 93 3 = 1 8 4 Square Root Algorithms In this section I present three algorithms for computing discrete logarithms, each of which can achieve a running time of order roughly p 2, where p is the largest prime factor of q 1. Thats not fast enough to be useful in large, arbitrary finite fields, but if q 1 has no large prime factors, these algorithms can be very practical. Several examples in this section use the held T37 generated by g = 5, so I list i+*7 here: X 5X X 5X X 5X 0 1 12 10 24 26 1 5 13 13 25 19 2 25 14 28 26 21 3 14 15 29 27 31 4 33 16 34 28 7 5 17 17 22 29 35 6 11 18 36 30 27 7 18 19 32 31 24 8 16 20 12 32 9 9 6 21 23 33 8 10 30 22 4 34 3 11 2 23 20 35 15 4.1 Baby-Step Giant-Step Method Let m = [(5 1)2]. This algorithm, which is attributed to Shanks [38], makes use of the fact that we can express logfl y as x = i + jm, with 0 < i,j < m. To find i and j, we first pre-compute a list of pairs (LA) fr 0 < z < to, and sort the list according to the second component. Then for 0 < j < to, we calculate yg~rm until we find a j such that yg~rm = gl for some z in the list. Thus, for this z and j, y = gl+]m. For example, if q = 37 (to = 6), and g = 5, we pre-compute and sort the list: (z,A) ^ (0,1)(1,5)(3,14)(5,17)(2,25)(4,33). Now, to find the log5 of y = 2, we let j = 0 and compute 2 5-0'6 = 2, and let j = 1 and compute 2 5-1'6 = 2 530 = 17. Since 17 appears as a second component in the list, corresponding to i = 5, we have log5 2 = 5 + 1- 6 = 11. 9 To achieve a running time of order roughly p2, we can use the algorithm in several stages, where in each stage we compute a logarithm in a subgroup of F* of order not greater than the approximate size of p. I believe Pollard [35] was the first to propose this multi-stage idea, and (while its obviously not worth the trouble for a held this small) to show how this can be done, I outline the procedure for finding log5 2 in U37 in 3 stages. Note that 37 1 = 32 22. STAGE 1 in a subgroup of order 3: We use the algorithm to find that the logarithm of y12 to the base g12 is 2, so y12 = (g12)2. Taking 12th roots, we have V = g2{g3)kli o < k\ < 11 (since g3 is a 12th root of unity), so x = 2 + 37+. STAGE 2 in a subgroup of order 3: To find Â£+, let y2 = yg~2 = (g3)kl, and use the algorithm to find the logarithm of y\ to the base (g3)4- Since the logarithm is 0, we have (g3)4'0 = y2l so taking 4th roots gives us y2 = (g3)(g9)k2 = (.g3f{g3)3k\ 0 < k2 < 3, so h = 0 + 3k2. STAGE 3 in a subgroup of order 4: Use the algorithm to find that the logarithm of y2 (= y2 (g3)~) to the base g9 is 1. Thus, k2 = l=^ki = 3=^x = 11. 4.2 Pollards p-method One drawback of Shanks algorithm is the need to sort and store a list of size p? Pollard [35] introduced a probabilistic algorithm which eliminates the need for such storage. To find loggy} we first divide F* into three sets Si, S2, and S3, that are approximately the same size. Define a sequence r4- from F* by r0 = 1, and for i > 0, 'yrt if rt G St O+i < r\ if o G S2 .gri if O G S3. Thus, every element in the sequence has the form r4- = ya'gh\ where a0 = b0 = 0, and for i > 0, di+1 cii + 1 (mod q 1) < 2tq (mod q 1) if O G Si if D G S2 if G G S3, and bi+1 'hi if O G Si 2bi (mod q 1) if O G S2 bi -\- 1 (mod q 1) m ^-0 CO 10 The sequence rq behaves like a random walk in F*} so we expect to find an i of size approximately q?, such that rq = r2; (see [35] for a more detailed discussion on the expected value of this i). To find log j/, we run through a sequence of 6-tuples (ry, a4-, r2i, a2i, calculating each one from the previous one (which is quite easy, since ry, r2j_i, and r2i are calculated by simple held multiplications), until we reach a case such that rq = r2i (i.e. ya'gb' = ya2lgh21). Let m = cq a2i (mod q 1) and n = b2i bi (mod q 1), then ym = gnmFq. (4.2.1) Now, we use the Extended Euclidean Algorithm to find a A and y such that d = gcd(m,q 1) = Am + y(q 1). Thus Am = d (mod q 1), so raising equation (4.2.1) to the A power gives us yd = gXn, and An must be of the form dk (since the left hand side is a dth power). Taking dth roots gives us y = gk(gSL^y, 0 random, we expect large values of d to be rare). As an example, we find log5 17 in F37. Let Si = {1,2,... 12}, S2 = {13,14,... 24}, and S3 = {25, 26,... 36}. We calculate (but dont store): (n = 17, a4 = 1, b4 = 0, r2 = 30, a2 = 2, b2 = 0) (r2 = 30, a2 = 2, b2 = 0, r4 = 34, a4 = 3, b4 = 1) (r3 = 2, a3 = 2, b3 = 1, r6 = 3, a6 = 6, b6 = 4) (r4 = 34, a4 = 3, b4 = 1, r8 = 11, a8 = 14, b8 = 8) (r5 = 22, a5 = 3, b5 = 2, r10 = 34, a10 = 16, b10 = 8) (r6 = 3, a6 = 6, b6 = 4, r12 = 3, a12 = 32, b12 = 18) Then we have 3 = 54176 = 5181732, so 1726 = 5-14 = 522. d = gcd(26, 36) = 2 = 7 26 5 36, so raising to the 7th power gives us 172 = 522'7 = 510. Taking square roots gives 17 = 55 (518)J, j = 0 or 1. Trying j = 0 gives us the equality 17 = 17, so log5 17 = 5. By a procedure similar to that outlined in the previous subsection, Pollards p-method can be used in several stages, in subgroups of F*} to yield a running time of order f2 Its possible, especially in a subgroup of very small size, that the algorithm will give us a useless equation (e.g. yagb = ya+q~1gb). If this happens, we can simply change the initial conditions a0 and b0} to try the algorithm again with a different r0. 11 Pollard [35] also proposed a A-method, which can find a discrete logarithm that is known to lie in an interval aq < x < x2 in time 0((x2 aq)2). However, to my knowledge, no one has figured out a way to pre-determine such an interval. 4.3 Pohlig-Hellman Algorithm Although this method is known as the Pohlig-Hellman [34] algorithm (who first published it), they credit Roland Silver as having independently discovered it earlier. We begin by finding the prime factorization q-l = Y[p-'. i = l We will reduce the problem of finding x = logfl y into k subproblems, where each subproblem will be solved by finding rii discrete logarithms in a subgroup of size Pl- For each z, 1 < z < Â£y we will find ay = x (mod pP')} then use the Chinese Remainder Theorem to find x. For each z, let nt-1 xi = Xh]PU 0 < XhJ < Pi ~ 1, 1=0 and note that Vz, x = ay + op]1', for some G Z. Then, since gq 1 = 1, we have the following equalities in F*: i (y)Pt (A 3=1 (Y* Pi g V^J z-1 . =o * iJPi+CiPi l \ <7-1 Pl {g 3=1 Pt i 'Pi 0 0 < ayj0 < pt 1 (4.3.1) (y-s 3=1 Xi.O \ P2 & (e; = g 8-1 = 1 ' i,]Pi+ciPi PA1 (9 3=1 Pl i^z,l 0 < xhl < pt 1 (4.3.2) (y a 0 %i,l Pi P' = 9 i 1 ? , \ q 1 12 ^,A+C8?y !)T3- (5 q-1 Pz 0 < ayj2 < pt - 1 (4.3.3) (y g E n, 2 1=0 ^'A))V = g ' i' = (g Pi )^w-y 0 < a:w_i < Pi 1 (4.3.4) 12 These equalities basically describe the algorithm. That is, to find ay, we run through equations (4.3.1) through (4.3.4), first calculating the left hand side of an equation (using previously found results), then finding ay-j as its discrete logarithm to the base g For small py we can find ay-j by trial and error, while for large jy, we can use one of the previously mentioned algorithms of this section to find these logarithms. (Notice that if q 1 = 2% t odd, then equations (4.3.1) through (4.3.4), with pi = 2, give us a very easy way to extract the s least significant bits in the binary representation of x. For more on the security of individual bits in discrete logarithms, see [5, 33, 22].) As an example , we find log 517 in Fs7: We have q 1 = For pi = 2: -1 = 1 7^- 17 2 = (5f J^1,0 _ _\xl,0 xlfi 1 1 = = (17 5_1) 36 4 = (5f J^i,i q^i ,1 => xiti = 0 So x1 - 1,0 + 1,1 2 = = 1 + 0- 2 = 1 x = 1 (mod 4). For p2 = 3: 26 = 1 7 17 3 = (5) ^2,0 = lO^2-0 =t 2,0 2 10 = = (17 5~2y 36 9 : = (5) a?2,i = lO^2-1 => X2)l = 1 So x2 : = 2,0 + 2,1 3 = = 2 + 1- 3 = 5 13 5 Index Calculus Algorithms The fastest method for computing discrete logarithms is known as the index calculus method. The basic ideas in the algorithm appear as early as 1922 in the work of Kraitchik [19, pp. 119-123]. Adleman [1] analyzed the running time of the algorithm for the case q a prime, and Heilman and Reyneri [18] extended the algorithm to the case q = pn, for fixed p and n > oo. The running time of the algorithm has the form L[q,a,c] = exp((c + o(l))(ln g)"(lnIn g)1-"), 0 < a < 1, and c a constant (o(l) > 0, as q > oo), which is known as subexponential (if a were 0, the time would be polynomial in In q; if a were 1, it would be fully exponential in In q). Since the algorithm uses operations in a ring embedded in the finite held, its description is different for different types of fields. To begin, I describe the basic version of the algorithm, which has run time generally 0(L[q, |,c]), with c a little too large to be practical in large fields. In later subsections, we will look at faster versions of the method. For the case Fp, p prime, whose elements are represented by the set {0,1,. . p 1}, with arithmetic performed modulo p: Let S (the factor base), be the set of all prime integers less than or equal some bound b. An element of Ff is said to be smooth with respect to b if, in the ring Z, all of its factors are contained in S. The algorithm proceeds in three stages. In the first stage, we take a random integer z in [l,p 2], calculate gz (mod p), and see if gz (mod p) is smooth. If it is smooth, say IH gz mod p = n PT, Pi G S, i=1 then we get an equation in the discrete logarithms to the base g: \s\ z = ^2 oii logfl pi (mod p 1), (5.0.1) i=1 where z and all of the cq are known. We continue this process until we have generated more than |1 equations. In stage two of the algorithm, we solve the system of equations (5.0.1) to find a unique solution for the log3p8-. Then in stage three, we are able to find the discrete logarithm of any y Â£ Ff. To do this, 14 we take z1 s at random again, until we find a z such that ygz (mod p) is smooth. When we find such a z, we get an equation \s\ logg y = -z + J2 ai loS3 Pi (mod p !)> i = l where everything on the right hand side is known. For example, in F37 with g = 5, let S = {2, 3}. Stage 1: Try z = 7: 57 = 18 (mod 37) = 2 32 => log5 2 + 2 log5 3 = 7 (mod 36). Try z = 6: 56 = 11 (mod 37). Not smooth. Try z = 14: 514 = 28 (mod 37). Not smooth. Try z = 31: 531 = 24 (mod 37) = 23 3 => 3 log5 2 + log5 3 = 31 (mod 36). Stage 2: Subtracting 3 times the first equation from the second, and using 5_1 = 29 (mod 36), we get the solution log5 2 = 11 and log5 3 = 34. Stage 3: Suppose we want to find log5 17. Try z = 24: 17 524 = 35 (mod 37). Not smooth. Try 2 = 15: 17 515 = 12 (mod 37) = 22 3 => log517 = -15 + 2 11 + 1 34 = 5 (mod 36). The algorithm works basically the same for the case Fpn? n p, whose elements are the set of all polynomials over Fp of degree less than n, with multiplication performed modulo a fixed polynomial f(x) of degree n that is irreducible over Fp. We take S to be the set of all irreducibles in the ring Fp[x\ whose degree does not exceed some bound 6, and call an element of F* smooth if (in Fp[x]) all of its factors are in S. To see why this simple extension of the algorithm doesnt yield a subexponential running time for arbitrary p and n, consider the case Fp2. We need the products of elements in the factor base to give us a significant percentage of the held elements, but an ax + b Â£ Fp2 can have at most one linear polynomial factor. (In section 5.2 we will see a way to get around this problem.) Notice that, in general, there is a tradeoff inherent in our choice of how large to make the factor base S. By increasing the size of S, we increase the probability that a random held element is smooth, so the task of generating equations in stage one can be accomplished faster. However, that will also increase the number of unknowns (i.e. the discrete logarithms of all elements in S) in stage two, thus making it more difficult to solve the system of equations. Therefore, we seek to bound INI to the size that will balance these considerations. On a related topic, notice that the generation of equations in stage one can be done by 15 a large network of processors working in parallel, with very little communication between them, while the linear algebra in stage two is difficult to parallelize. For large fields, the algorithm requires the solution of very large, but extremely sparse (and not too prone to exhibit linear dependence), systems of linear equations. Furthermore, the columns corresponding to the smaller irreducibles in the factor base will tend to be the most dense, while the columns corresponding to the larger irreducibles will be the most sparse. To solve the system, we can use a very effective method known as structured Gaussian elimination to try to eliminate the sparsest columns and the most dense rows first. This can reduce the system by as much as 95% (especially if we have generated many more equations than unknowns). The reduced system is still sparse enough to be solved by using other sparse matrix techniques, so the entire system can be solved in time roughly |A|2 (see [20, 30]). (To get a better sense of the sizes involved, a system arising from \FP\ ss 1058 had |A| ~ 100,000 and an average 15.5 non-zeros per equation. Structured Gaussian elimination reduced it to around 6,000 unknowns, and an average 80.5 non-zeros per equation.) The fact that we need to solve the system of equations mod ^ 1 is not too troubling. We can try to perform Gaussian elimination as though we were in a held, and possibly succeed. We will have a problem though, if we have to perform it on a column in which every entry is non-invertible mod q 1. At worst, we can solve the equations modulo the prime divisors of q 1, use Hensels lemma to lift the solutions to the appropriate powers of the divisors, then use the Chinese Remainder Theorem to fold the solution mod q 1. However, McCurley [25] pointed out a simpler possibility: Assume we are in column j; have a non-zero entry in position jj (if not, switch rows); and want to introduce a zero in position ij. We use the Extended Euclidean Algorithm to fold integers d, r, and s, such that d = raij + sa]r Now replace row j by r (row i) + s (row j), and replace row i by (yy) (row j) (y-r) (row i). This will leave a zero in position ij, and d in position jj. If after doing this (possibly many times in the same column), d is invertible mod q 1, then weve overcome the problem. In the least, we will have simplified the problem. Analyzing index-calculus algorithms depends on knowing the probabilities that random held elements are smooth. These probabilities are now pretty well understood for the relative sizes of |A| that are of interest. In prime folds Fp with smoothness bound b = L\p, |,/3], the work of Canfold, Erdos, and Pomerance [7] shows that we will probably have to test around L\p, |, ^] random integers between 1 and pa to fold one integer that is smooth. In folds ifo, Odlyzko [30] has shown that we will have to test approximately exp((l + o(l))| ln(|)) degree k polynomials to find one that has all of its irreducible factors of degree b or 16 less. (For other fields Fpn} see the survey article [31].) In general, the larger the magnitude of the field element we are testing, the less likely it will be smooth. Therefore, an effective way to speed up the algorithm is to reduce the magnitude of the field elements that we test for smoothness (but this must be done in such a way that we can still get an equation among the discrete logarithms of the irreducibles in S). 5.1 Coppersmiths Algorithm Coppersmith [8] described and analyzed his algorithm for fields FA, though it can be generalized to other fields Fpn of prime power order, with p growing slowly as pn > oo (it cannot be applied in prime fields Fp). Whereas in the basic version of the index-calculus algorithm we would try to factor elements of degree < n into irreducibles of degree < b ^ n2 In2 n, in Coppersmiths algorithm we 1 2 2 1 will take b ss n3 In3 n and test elements of degree < n3 In3 n for smoothness. Thus, the algorithm achieves a run time of the form L[2n, |, c], where c oscillates to as high as 1.5874. Coppersmiths algorithm was inspired by an idea that Blake, Fuji-Hara, Mullin, and Vanstone [4] referred to as systematic equations, in which we try to generate some of the equations for stage one by raising the irreducibles in 8 to a power of 2. For example, suppose we are in the field FA defined by f(x) = x7 + x + 1, and S contains all irreducible polynomials of degree at least as large as 2. Then x8 = x2 + x (mod f(x)), and if we raise x + 1 Â£ S to the 8th power, we get an equation relating the discrete logarithms of irreducibles in S: (x + l)2 = x2 + 1 (since we are in a field of characteristic 2) = x2 + x + 1 (mod f(x)) G S => 8logfl(:r + 1) = \ogg(x2 + x + 1) mod 27 1. Unfortunately, we can get at most | of the equations we need by this method (see [30, sec. 5.1]). (Blake, et.al. also describe a method wherein we use the Extended Euclidean Algorithm to reduce the problem of finding a polynomial of degree ss n that is smooth, into the more probable task of finding two polynomials of degree ss | that are both smooth.) Before describing Coppersmiths algorithm, we need to define some more notation. Let the irreducible polynomial that defines FA be f(x) = xn + fi(x). 2 We need deg(fi(xj) < n3, and heuristically we expect to be able to find an fi(x) of degree ss log2 n. If we are given the task of finding discrete logarithms in a representation of the field that doesnt meet this requirement, Zierler [43] has shown that it is relatively easy to find the isomorphism between the representation we are given, and any representation that we choose. So we can 17 use our preferred f(x), then transfer our results back to the given representation. Let k (E Z such that 2k ss ns ln_3 n, let h = \^~\ (w nf Ins re), and let 1 2 B ss ns Ins n. To generate an equation in stage one, we choose a Ui(x) and u2(x) of degrees < B} with (ui(x)} u2(x)) = 1, and set W\(x) = Ui(x)xh + u2(x), and w2(x) = Wi(x)2 (mod f(x)). If Wi(x) and w2(x) are both smooth, then since logfl w2(x) = 2k logfl w1(x) (mod 2n 1), we get an equation relating the discrete logarithms of elements in S. Furthermore, w2(x) = Ui(x2k)xh2k + u2(x2k) (mod f(x)) = Ui(x2k )xh2k~n fi(x) + u2(x2k), and h2k n equals some c, 0 < c < 2fc, so both Wi(x) and w2(x) have degrees 2 < 0(ns). Also, given the probability that both Wi(x) and w2(x) will be smooth, B is large enough that we should have enough relatively prime pairs (ni(x), u2(x)) 1 2 to generate more than l^l equations, where the bound on S is b ss ns lm n. For an example, in F27 dehned by f(x) = x7 + x + 1, let 2k = 2, h = 4, and B = 3. To generate one equation, let Ui(x) = 1 and u2(x) = x. Then Wi(x) = (l)x4 + x} and w2(x) = w\{x)2 = (x4 + x)2 = x8 + x2 = (x2 + x) + x2 = x (mod f(x)). Since x4 + x = x(x + l)(x2 + x + 1), we get the equation: logfl x = 2(logfl x + log3(x + 1) + \ogg(x2 + x + 1)) mod 27 1. Fortunately, we can test polynomials for smoothness before we try to factor them. Coppersmith noted that to test whether a w(x) is smooth with respect to the bound 6, we can check the necessary (though not sufficient) condition: b vo'(x) ])([ (x2 x) = 0 (mod w(x)), *=r|i where w'(x) is the formal derivative of w(x). That is, x2' x is the product of all irreducible polynomials over F2 whose degree divides z, so if a(x) Â£ S and a(x)k\\w(x), then a(x ) I nj=r|-| - x), and a (x)k 1 | w'(x). The test is not a sufficient test of smoothness because polynomials of degree > b which appear to an even degree in w(x) will also appear to that degree in w\x) (since the 18 field has characteristic 2). However, this situation will occur only rarely, and since polynomials that pass the test still have to be factored, the only harm is a little wasted work. (See also [30] for a different smoothness test, and [17] for a procedure that sieves through pairs, taking those that will make Wi(x) smooth.) In general, the third stage of index-calculus algorithms is so much faster than the other two stages, that not much attention is paid to it. Once you know the discrete logarithms of the irreducibles in the factor base, you have essentially solved the problem in that held, since finding logfl y for any y Â£ F* is as simple as finding one smooth ygz Â£ F*, where z is known. However, with the factor base as small as it is in Coppersmiths algorithm, it would take a long time (0(exp(n3 In3 njj) to find a random ygz that is smooth. Therefore, we start by trying to find a ygz whose irreducible factors Vj(x) all have degree < ns lm n (where deg(ygz) < n =? there are less than n factors ry(:r)). We then use a method similar to that used in the first stage of the algorithm, applied recursively (with decreasing bounds), to compute the discrete logarithms of the Vi(x), thus allowing us to calculate logfl y = z + lg3 vi(x) (mod 2n 1). In each recursive step, we get an equation relating the discrete logarithm of an irreducible v(x) in terms of the discrete logarithms of (< n) smaller degree irreducibles. After a finite number of recursions, these smaller degree irreducibles will be elements of the factor base, whose discrete logarithms we already know. Each recursive step involves Ending a Wi(x) and w2{x) = Wi(x)2 (mod f(x)), for some k, similar to the procedure used in the first stage of the algorithm, but with the additional condition v(x) |rci(x). (The coefficients of acceptable Ui(x) and u2(x) that comprise Wi(x) can be determined by a set of deg(u(x)) linear equations.) If and w2(x) both factor into irreducibles of degree not exceeding some bound b < deg(u(x)), say Wi(x) = v(x) Si(x)a and w2(x) = ]^tj(x)^J, i i then taking discrete logarithms base g gives us the equation Yj Pj logfl ti(x) = 2fc(lg3 v(x) + Y log3 s*(:c)) (mod T !) 3 i so we can find logfl v(x) in terms of the logfl Si(x) and logfl tj(x). The ideas of Blake, et.al. were successfully implemented [4, 27] in the held F2i27, and Coppersmith and Davenport [8, 9] implemented Coppersmiths algorithm in F2127 (as alluded to earlier, an implementation is deemed successful if it can determine the discrete logarithms of all irreducibles in the factor base 19 S, without necessarily bothering to find an arbitrary log y). These were quite signihcant at the time, since both Mitre Corporation and Hewlett-Packard Corporation had working models of the Difhe-Hellman key exchange in this held. Using a number of enhancements to Coppersmiths algorithm that Odlyzko [30] had proposed, Gordon and McCurley [17] successfully implemented Coppersmiths algorithm in F2227, F2313, and F2401 (including some success in parallelizing the linear algebra of stage two). They were also able to complete the stage one equation generation in the held F2503, but because they didnt have sole access to the computers they were using, they were unable to complete stage two of the algorithm. 5.2 Gaussian Integers Method Coppersmith, Odlyzko, and Schroeppel [10] proposed three algorithms for hnding discrete logarithms in prime helds Fp, each of which has a run time T[p, |, c = 1] for the hrst and second stages, and a run time of T[p, |,c = |] for stage three. One of these algorithms, which Coppersmith, et.al. called the Gaussian integers method, has proved to be the most practical, and the most important in the way it has spurred the development of other discrete logarithm algorithms. LaMacchia and Odlyzko [21] successfully implemented the algorithm in a prime held of order ss 1058 (and went halfway through stage two with a prime ss 1067). Although their primary motivation was to obtain empirical results on sparse matrix techniques [20], the implementation with a 58 digit prime also broke an authentication scheme that Sun Microcomputers, Inc. had implemented in that held as part of their Network File System (and which had been incorporated into release 4.0 of UNIX System V). The Gaussian integers method was inspired by a subexponential algorithm that ElGamal [13] invented for hnding discrete logarithms in helds Fp2, p > 00. The idea is to perform the index-calculus algorithm in an isomorphic copy of F,p2. If ra is a quadratic nonresidue mod p, then (p) is a prime ideal in the ring of quadratic integers I(yFn)} and Fp2 = /(y/m)/(p), (e.g. one representation is {ayFn -\- b \ 0 < a}b < p 1; a,6 Â£ Z}. We can hnd the isomorphism ^ : Fp2 > I(y/m)/(p) by adding multiples of p to the coefficients of the irreducible polynomial f(x) that dehnes Fp2 (this doesnt change the representation of Fp2) until the square-free part of the discriminant of f(x) is equal to to. (To use ElGamals own example: to hnd the isomorphism from F172 dehned by f(x) = x2 + x + 6 onto /(a/-3)/(IT), we hnd that x = 8 + 3^3 is a root of x2 + (1 17)x + (6 + 85) = 0, so we may let Since the isomorphism preserves products, it also preserves logarithms (to the 20 base generates the entire group of units in I(y/fn))} and one quadratic prime with prime norm from each associate class in I(y/fn) whose norm is less (in absolute value) than some bound b. (A similar procedure [14] can be used for other fields Fpn with p > oo and fixed n ^ 1,2.) In the Gaussian integers method, we will use a very simple mapping of Fp to a subset of Z X Z. Let r be a small negative integer that is also a quadratic residue modulo p preferably r Â£ { 1, 2, 3, 7, 11, 19, 43, 67, 163}, so we can work in a unique factorization domain, though the algorithm can be modified to work in a non-UFD. (If p = 1 mod 4, we can take r = 1.) Let W be an integer such that W2 = r mod p, and let w represent the complex number yfr. Find two integers T, V < yfp such that T2 = rV2 mod p, and let p' = T + Vw. To simplify matters, I limit my discussion to the case where T2 rV2 = p. Then the norm N(p!) = p =$ p' is prime => (p') is a maximal ideal of Z[w\ and Z[w\/(p!) is isomorphic to Fp. In fact, / : Z[w]/{pr) > Fp defined by /(e + fw) = e + fW mod p is an isomorphism. Choose a complex prime G = ai + a2w that generates the group of units (fZ[w\ / (pr))* this G will be the new base for logarithms. Mapping complex numbers e + fw to real numbers e + /IF, and the base G = oq + a2w to g = a\ + a2W} preserves logarithms. Let the smoothness bound b = T[p, |, |], and we let the factor base contain the integer V, all real primes < 6, and all complex primes x + yw Â£ Z[w] whose norm is <6 (including real primes that factor into two complex primes we can remove redundancy from the factor base before we solve for unknowns in stage two). Rather than look for random smooth numbers to generate equations in stage one, we will sieve through pairs of small (positive or negative) integers (ci,c2) looking for pairs that make ciV c2T smooth with respect to the real primes in the factor base. (A pair (&ci, kc2), for constant k, will give us the same equation as (ci,c2), so we should avoid using such multiples, e.g. by using only relatively prime pairs.) For each (ci,c2) that makes ciV c2T smooth, we check to see if Ci + c2w is smooth (i.e. factors completely) with respect to the complex primes in the factor base. (Since the norm of a product of quadratic integers is the product of their norms, we can test c\ + c2w for smoothness by testing its norm for smoothness with respect to the norms of complex elements in the factor base.) If Ci + c2w is also smooth, then we can get an equation among the discrete logarithms base G of elements in the factor base, because c\V c2T = V(ci + c2w) c2(T + Vw) = V(ci + c2w) mod p . 21 To sieve through (ci,c2) pairs, we fix a c\ value, and allocate (and initialize to zeros) an array C indexed by all the possible c2 values we want to try. For each real prime p4- in the factor base, pi /T, compute d\ = c-^VT-1 mod pp and add In to all array elements C\c2] such that c2 = d\ mod pp (These are those c2 such that p4- | (c^V c2T).) Do the same for d2 = CiVT-1 mod p2, adding In pi to all array elements C\c2] such that c2 = d2 mod p2, and the same mod p?, mod pf, ... until we reach the highest power j such that pj < b. Additionally, for those c2 that are congruent to dj mod p), we test whether c{V c2T has higher powers of p4- as factors, and add appropriate multiples of lnp4- to C\c2] if it does (i.e. one lnp4- for each additional power). After doing this for each real prime in the factor base, we compare the value in a C\c2] to the value of ln(ciV c2T). If ln(ciV c2T) ss C*[c2], then C\V c2T is probably smooth, whereas ln(ciV c2T) > C\c2] indicates that c{V c2T probably has factors not in the factor base. Since c\ and c2 are small, and T, V < ^/p, the probabilities of smoothness are high, and we should get enough equations by sieving through positive integers ci,c2 < b. To find an individual logfl y in stage three, we take random z, as in the basic index-calculus algorithm, and try to find a ygz that is smooth with respect to (real) primes < T[p, |,2] (say ygz = YliUi). To find the discrete logarithm of a up we can again use sieving to look for ci,c2 such that Cl1/~C2T is smooth with respect to the real primes in the factor base (say ClV~C2T = EL; PjJ)? and fr which Ci + c2w is smooth with respect to the complex primes in the factor base (say ci + c2w = Y\k qflk). Then Ui = ciV c2T V(ci + c2w) V]Jk(fk Pk n jP3 n jP3 n jP3 mod pE and taking logarithms gives us logG U{ (= logfl v^) in terms of the discrete logarithms of elements in the factor base. For an example of stage one in F3r} we can let r = 1, so w = i and W = 6. Let T = 1 and V = 6, so p1 = 1 + 61 and ^ : Z[z]/(p') > F37 is an isomorphism defined as
logarithms in Z[z]/(p/). Although the bound on the factor base is less than 3, Im |