Citation
On Newton's methods to maximize the likelihood function of pairwise comparisons

Material Information

Title:
On Newton's methods to maximize the likelihood function of pairwise comparisons
Creator:
Gangolli, Vinata
Publication Date:
Language:
English
Physical Description:
vii, 42 leaves : illustrations ; 29 cm

Subjects

Subjects / Keywords:
Newton-Raphson method ( lcsh )
Directed graphs ( lcsh )
Linear systems ( lcsh )
Matrices ( lcsh )
Directed graphs ( fast )
Linear systems ( fast )
Matrices ( fast )
Newton-Raphson method ( fast )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Bibliography:
Includes bibliographical references (leaves 41-42).
General Note:
Submitted in partial fulfillment of the requirements for the degree, Master of Science, Applied Mathematics.
General Note:
Department of Mathematical and Statistical Science
Statement of Responsibility:
by Vinata Gangolli.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
32480689 ( OCLC )
ocm32480689
Classification:
LD1190.L622 1994m .G36 ( lcc )

Downloads

This item has the following downloads:


Full Text
ON NEWTONS METHOD TO
MAXIMIZE THE LIKELIHOOD FUNCTION
FOR PAIRWISE COMPARISONS
by
Vinata Gangolli
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Applied Mathematics
1994


This thesis for the Master of Science
degree by
Vinata Gangolli
has been approved for the
Graduate School
by
j David C. Fisher
Burt Simon
12 li-qy
Date
11


Gangolli, Vinata (M.S. Applied Mathematics)
On Newton's Method to Maximize the Likelihood
Function for Pairwise Comparisons
Thesis directed by Associate Professor David C. Fisher
We consider the problem of ranking a set of objects based on
a number of pairwise comparisons made among them. This
becomes a more complex problem if each object is compared
only with some of the remaining ones in that set, and if this
experiment is repeated unequal number of times for different
pairs. A directed graph can be used to represent such a
scenario. Given such a directed graph representing pairwise
comparisons of objects in a set of data, a likelihood function
based on the pairwise comparisons of these objects is
constructed with an underlying assumption, that the
performance of the object is an independently distributed
random normal variable with a common variance. Newton's
method is then used to get the maximum solution to this
function. It is shown that there exists a unique maximum
solution to this problem for the case of a strongly connected
directed graph.
This abstract accurately represents the content of the
candidate's thesis. I recommend its p,,ui:
Abstract
signed
David C. Fisher
ill



To Chinmoy
for his love and support
IV


Chapter
1. Introduction.............................................1
2. Ranking of Performances..................................4
3. Mathematical Model.......................................7
4. Solution to the Model....................................1 5
5. Computation of the Solution..............................21
6. Applications.............................................2 6
Appendix
A. Rankings based on NFL Pre-Playoff Games (1993-94
Season).....................................................2 8
B. Computer program.........................................3 0
References....................................................41
v



Figure
1. Digraph representing games between teams......................5
2. Plot of Normal CDF............................................8
3. Plot of r(X)..................................................1 7
4. Plot of r'(X).................................................17
5. Plot of r"(X).................................................17

VI


ACKNOWLEDGMENTS
' would like to thank David Fisher, James Koehler and Burt Simon for
giving me the opportunity to learn by sharing their time and
jtpertise. I especially thank Dr. Fisher for his support, invaluable
guidance, patience and wonderful enthusiasm.

Vll


1. Introduction
We are often confronted by the problem of ranking a set of
pairwise alternatives, based on the data available on the merit of
each alternative. We consider a population of n objects with
various degrees of merit in the attributes of our interest. It is
desired that these n objects be ranked in a fair manner
according to their merits. A number of pairwise comparisons
may be made to decide the better object among a pair to be
(
compared. So we would have possible pairs to compare.
\2J
Suppose we compare only some of these pairs, and repeat this
experiment of comparison unequal number of times for each
pair, then we need to formulate a way to rank these objects
based on the experiments.
The method of paired comparisons is used primarily in cases
when the objects to be compared can be judged only
subjectively; that is, it is impossible to make an individual
measurement for each object to judge its merit. Paired
comparisons are widely employed by psychometricians. The
method was introduced by Fechner ([3],[4]) and, popularized by
Thurstone after several extensions ([13], [14], [15]).
This problem has been widely studied ([2], [7], [9], [10], [11],
[12]). The process of ranking is complicated for the case of the
absence of a clear-cut winner. Each alternative may have some
characteristic or merit in which it scores over others. Various




schemes have been suggested and implemented to decide the
best alternative.
Saaty [10] has introduced a procedure for prioritizing decision
alternatives based on an nxn positive matrix, = of pairwise
comparisons of n > 2 items expressed on a ratio scale. His method
treats r.tj >0 as an estimate of the perceived intensity, w/., of a
wj'
judge's preference in favor of alternative i versus j, where
w* =(w*,w2,...,vv*)r is an underlying column vector of nonnegative
weights that sum up to one. The judgment matrix i? = (r..), is
thus taken to be reciprocal, namely, r}i = V 1 assumption that the judges preference in favor of item i or j is
elicited with only one of the pairs (i,j) or (j,i) for i*j. Saaty
proposed to use /i= >0, a linear transform of the
Perron eigenvalue of R, as a measure of the cardinal
consistency in the judge's responses (i.e., rij-rjk = rik, l and posed the problem of determining how it might vary as a
function of the rjj's. Aupetit and Genest [6] answered this using
the Perron-Frobenius theorem. Noether [7] has suggested a
method of ranking based on two response curves. A linear model
of the form Xij=ai-aj+eij, (1 of preference of item i over item j. Weights are assigned for
each item i (1 < i < n), and they are estimated using the method of
maximum likelihood estimation.
We introduce here a model to rank a set of objects in a
probabilistic manner, based on the performance of each object
relative to the rest of the population. A directed graph or
2


digraph is used to represent results of the comparisons between
different pairs of objects. The probability of an alternative, say ,
i being superior to another alternative, say j, is calculated,
based on the results of the comparisons made so far. A variable
or weight associated with each object, gives a measure of its
merit. Hence difference between weights of two objects gives a
measure of superiority of one object over the other. A random
normal variable is associated with each object to represent the
random error associated with the outcome of each comparison.
The probabilities of each alternative i scoring over the
alternative j are calculated using this normal variable and then
the maximum likelihood function is formulated. This is solved to
get a maximum using Newton's method. It is also shown that a
unique maximum solution exists for a strongly connected
digraph. If normal density function is used for calculating the
probabilities, then the model becomes similar to the one
underlying the Law of Comparative Judgment or the Thurstone-
Mosteller model [2].
3


2. Ranking of Performances
We start with an example to explain the problem under
consideration and to develop and analyze a mathematical model
for it.
Consider a league of four teams 1, 2, 3, 4, which play several
games against each other Teams 1 and 2 play three games
against each other, two of which are won by team 2 and one by
team 1. Team 2 loses in its only game against team 3 and team 1
wins in the game it plays against team 4. Teams 3 and 4 play
two games against each other, both of which are won by team 4.
The digraph in figure 1 gives a graph representation of this
scenario together with the results of each game. Teams 1 and 3
do not play against each other. Similarly, teams 2 and 4 do not
play against each other.
The vertices in the digraph indicate the teams participating in
the event. Any two vertices are adjacent to each other, that is
they have an arc between them, if there is a game played
between them. The head of each directed arc represents the
loser in that game and the vertex at the tail of that arc
represents the winner in the game. The digraph is said to be
strongly connected, if there is a path from each of the nodes to
every other node.
4



Figure 1. Digraph representing the games
played between teams 1, 2, 3 and 4
The graph shown above represents the whole scenario being
studied here. It is a directed graph or a digraph of the games
played between these teams, viz., 1, 2, 3, 4. There is a cycle 1>
4 > 3 > 2 > 1 in the graph shown above. Thus team 1 beats
team 4 and team 4 beats team 3, which beats team 2, which in
turn beats team 1. We can see that there exists a path from
every node to every other node in the graph above. So it is a
strongly connected graph.
The whole scenario can be represented using a score matrix.
An entry s(>. (1 < i,j < n), is positive, if object i scores over object j
those many times. Thus will represent the number of times
that object j is found to be superior to object /. Any of the
situations such as no comparison made, or no win for object i
over object j counts as a zero in (ij)th place in the score matrix.
Ties have not been considered for the sake of simplicity. All
diagonal entries are zero since an object cannot be compared
5


with itself. A score matrix corresponding to the graph shown
above, is given below.
S =
1
2
3
4
12 3 4
o i o r
2 0 0 0
0 10 0
0 0 2 0
We now proceed to build a generalized mathematical model, which
v/ould reflect the characteristics of the problem in mathematical
tjrms, thereby enabling us to compute a sequence of real numbers,
which in turn, would give the ranking of these n objects in the data
set.
6


3. Mathematical Model
We wish to rank every object based on its performance relative
to the performance of the others at the same time. Let us start
with some assumptions before proceeding to develop the model.
We assume that the outcome of each pairwise comparison is
independent of other comparisons. We associate a variable w.
known as weight, with object i (1 Let X; be the random variable representing the random
error in the performance associated with object i. So the value of
vv. + x,. will determine the position of the ith object in the data
"V
set. The vector w =
gives the weight associated with each of

the n objects.
Object i would beat object if w; + X.I> + Xj.
Object j would beat object i, if w, + X, < w. + X..
Hence Prob[Object i scores over Object j ] = Prob[w,. + Xi > w. + Xj] =
Prob [ X. Xj > Wj wt]
We assume that Xn for all l identically distributed as random normal variables. It follows
that Xt-Xj is also a normal random variable. Without loss of
generality, we assume that X^Xj for all pairs in the set, is
independently and identically distributed with mean 0 and
standard deviation 1. Hence (z,. -Xj) ~ N(0,1), for all 1 The Prob[Z(.-Xy-> wy-w,.] can be calculated using the normal
distribution function as follows:
7


Prob[X( Xj > Wj w,.] = J
tv- -
We denote this quantity as
Wj Wj
dt = |
">Adt.
44k .1 44k
f(w(. -w^. The error function
associated with variable X is represented as F(x), where
1
,44k
X J
F(x) = J j=e'4-dt. Figure 2 gives a plot of F(;t).
X
Figure 2. Plot of Normal C.D.F
The value of the F(x) can be evaluated using Simpson's rule. The
computer program to do this computation is given in Appendix
B.
8


We use this to calculate the probability of the outcome for the
problem referred to in Figure 1. First we define the variables Wn
for team i to give a measure of the merit of each team. The
reader may recall that we referred to them as weights .
Thus,
Team 1 has weight w,,
Team 2 has weight w2,
Team 3 has weight w3,
Team 4 has weight w4,
Prob[Team 1 beats Team 2] = Prob[w, + X, >w2 + X2]
= Prob[ X, X2 > w2 w,]
Team 1 beats 4 if w, -w4+x, -X4>0,
That is, if X, X4 > w4 w,,
Team 1 loses to 4 if w, w4+X,-X4 <0,
That is, if X, X4 < w4 w,
Similarly,
Team 2 beats Team 3 if X2-X3>w1-w2,
Team 2 loses to Team 3 if X2-X, Therefore, Prob[team 1 beats team 2 & team 2 beats team 1
twice & team 3 beats team 2 & team 1 beats team 4 & team 4
beats team 3 twice ] .
Team 1 and team 2 play three games against each other. Since
the order of the win-lose sequence is not important here, there
are ways in which team 2 can win any two out of those three
w
games.
9


r3^
v2y
Prob [team 1 beats team 2] Prob [team 2 beats team 1 twice]
Prob [team 3 beats 2] Prob [team 1 beats team 4] Prob [team 4 beats team 3 twice]
= Prob [team 1 beats team 2] Prob [team 2 beats team l]2 Prob [team 1 beats team 4]
Prob [team 3 beats team 2] Prob [team 4 beats team 3]2.
Having devised a way to compute probabilities of the outcome of
comparisons, we proceed further to construct a function of the
combined outcome. Since object /' is superior to object j on s-y
occasions, and since the outcome of each comparison is
independent of each other, the probability is given by F(wi w} )*".
Thus the joint probability that the outcomes are as given by the
result in each comparison, is given by the likelihood function,
ti n sij
(/> (w) = nn Tj f[\V[ Wj'j Here 77 represents the product of the
<=1 j=1
i*j Sjj >0
combinatorial terms corresponding to the outcomes of games
between each pair of teams. This likelihood function represents
the likelihood of the scenario given by the digraph and its
accompanying score matrix Without loss of generality, we put a
n
constraint = 0 on the weights It can be seen that
i=1
/ \ S'J

<=i y=i
i*j S;j>0
n
Therefore, the constraint Y w.=0 enables the unique estimation
i = 1 1
of the weights.
We take natural logarithms of the likelihood function, since
taking logarithm simplifies the likelihood function considerably.
10


Thus the problem can be defined mathematically as follows,
maximize
n n sij n
0 (^) = nil7l F{wi ~wj) subject to the constraint ^w, =0.
i=i j=i ;=i
s,j>0
Without loss of generality, the positive constant 77 can be
substituted by 1, since maximizing 77 f(w,.-yv;) is equivalent to
maximizing f(w,. Since taking the natural logarithm
simplifies the function, we redefine the model as
n it n
maximize In 0 (w) = In f(w; subject to the constraint w,- = 0.
;=i 7=1 i=i
j,;>0
We denote r(x) = lnF(x), and ERF: R ->(0,1), 0: Rn ->(0,1) and (-o,0). Now the above
model can be restated as,
n n n
maximize cp (vv) = 12'M Wj Wj} subject to the constraint = 0.
1=1 j=1 i=i
*j *,;/ >0
We now use this model to construct a likelihood function for
the digraph in Figure 1.
0 (w) = F(wl w2) F(w, w4) (F(w2 w,))2 F(w3 w2) (f(w4 w3))2
4
subject to the constraint ]£w,. = 0.
1=1
Therefore, the problem can be redefined after taking the
logarithms as follows:
maximize
ln(0(vv)) = r(w, w2) + r(w, w4) + 2r(w2 -w,) + r(w3 w2) + 2r(w4 w3)
4
subject to the constraint = 0.
1=1
That is equivalent to,
maximize
(p(w) = r(vo, w2) + r(w, w4) + 2r(w2 w,) + r(w3 w2) + 2r(w4 w3)
1 1


subject to the constraint ^w( = 0. Here w =
1=1
w,
w,
W,
w.
refers to the
vector with its /th entry equal to weight assigned to the y'th team.
We wish to evaluate the nx 1 vector w, which would maximize the
likelihood function defined above. This means the linear system
d _
of n equations - owi
method can be used to calculate the vector w iteratively.
Suppose we wish to solve V(w) = 0 for the function V:.
Newton's method for the n-dimensional case gives,
WM = ~
A-1
w,
V(wt) where wk is the nxl vector
k y
obtained in the kth iteration. Since we took the logarithm of to make the function linear, we take the first and second order
derivatives of (p(wk) to restate the equation as
f -)2 V1 d
^{vi^k)) (w*)). Let us call these two
w*+i =wk~
dwk
matrices of first and second derivatives as matrix Anxn and vector
bmp respectively, for the sake of simplicity. We have to compute
Anxn and bllxl and then solve for the system of n equations to get
the vector representing the change in the weights in the (fc + l)th
iteration. This vector is then subtracted from wk to get vPi+1.
By our previous definition, r(x) = ln(F(;t)).
Differentiating with respect to x, we have,
./ x F(x)
F{x)
Differentiating with respect to once more, we have,
12


. F"(x)F(x)-{F'(x)f
(F)2
Hence the first derivative of cp(w) with respect to w; gives
for 1 iJ n-
>0
The ith entry in the vector b^ represents the first derivative of
q>(w) with respect to w;. Taking the second derivatives, we get,
f(*). = = £ V"(w/ -%)+ 'Lsjir"(wj wi\ fr 1 - -n-
j=1
s,,>0
j=i
and
-w,)-Sj/fa w ), for 1 Applying this to the digraph in Figure 1, we get,
(p(w) = ln(0(w)) = r(w, w,) + r(w, m>4) + 2r(w2 w,) + r(w3 w2) + 2r(w4 w3)
= r'(w, -w2) + r'(w, w4) 2r'(w2 w,)
Differentiating with respect to w]f
d 2 d mv
= r"(wl w2) + r"{\V\ w4) + 2r"(w>2 w,)
Differentiating with respect to
3 2
= -r"(w, w2) 2r"{w2 W',)
d w2d w,
Differentiating with respect to
Differentiating with respect to
w
2
w3,
d 2(p
d w3d w,
d 2 d w4d w,
= 0
- = -r"(wl w4)
Similarly, Differentiating - = -r'(w, -w2) + lr'{w2 w,)- r'(w3 w2)
dw2


and differentiating yields,
-^- = r'(w3-w2)-2r'(w4-w^) and -^- = -r'(w,-w4) + 2r'(w4-wi)
owy dw4
We take second order derivatives of (p(w) to get the matrix A.
d2
A4x4=?((p(w)) is given by the matrix below.
ow
r"(wl w2) + r"(w{ w4) +
2r"(w2 w,)
-r"(wi-w1)-2r"(w2-w,)
-r"(w, w4)
-r"(w, - w,)-2r"(w, w,) 0 ~r"{w\
i r"(Wy - w2) + r"(w2 -w4) + ~wi) -r"(w3 -w2) 0
-r"(w3 w2) r"{w} - w,) + -2 r"(w4

2r"{W4 -w3)
-2r"(w4 - w3) r"(w{ -
0
2r"(w4 -
K\ = -jrr d w
r'(vvj w2) + r^w, w4) 2r'(w2 w,)
-r'(wj w2) + 2r'(w2 w,) r'(w3 w2)
r'(w3 w2) 2r'(w4 w3)
-r'(w,-w4) + 2r'(w4-w3)
It may be noted, that the sum of all terms in the vector btlxl is
zero.
We will prove in Theorem 1 that matrix is singular and of
rank n-1. The solution to Ax = b is obtained by Gauss-Seidel
method of iterations. This vector, say wnew, is then subtracted
from wk to get wk+l.
14


4. Solution to the Model
Let us examine how a solution to the model as defined earlier,
can be obtained. We prove two results in order to define the
special characteristics displayed by the matrix A=(p^{x)
corresponding to a digraph D.
As seen earlier, all the entries in the matrix A are a function of
the second derivative of r(x) = ln(F(;t)). Thus the value of r"(x)
plays an important role in the structure of A.
d F'(x) F"(x)F(x)-(F'(x)f
dx F(x) (F(x))
It can be seen that F(x) gives a real value between 0 and 1.
Hence (F(;e))2 is always a real positive value in (0,1) for any in
(-oo,+oo). Thus the numerator of r"(x) decides the sign of r"(x).
The next result shows that the value of F"(x)F(x) (F'(x)f and
hence r"(x) is always negative.
Lemma 1: F"(*)F(;t) (F'(x))2 < 0,
Proof:
x j
As defined earlier, we have F(x) = J -j=e'^dt
Taking the first and second derivative of F(x) with respect to. *,
we have,
F'(x) = -r=e~l2/' andF"(x) = ~^~2xe~A = -xF'(x)............(4.1)
' V2tt v V2tt 2 y
Using the equation (4.1), we have,
F"(x)F{x) (F'(x)f = F'(x){-xF(x) F'(x)} ...... (4.2)
F'(x) is always positive for any in (-,+).
F(x) 0 as x oo ......(4.3)
15


(4.4)
Lim F'(x) = Lim J e ^1 = 0
*-~ V2?r
We now examine the behavior of the expression (4.2). We have
already seen that F'(x) > 0, for all x.
Consider h(x) = -xF(x) F'(x). The derivative with respect to.x
gives h'(x) = -xF'{x) F(x) F"(x).
Since we have shown earlier that F"(x) = -xF'(x), the two terms
cancel out giving h'(x) = -F(x). This value would always be
negative. We can conclude from this that h(x) is a decreasing
function. Now,
Fix)
Lim f(x) = Lim-xF(x) Lim F'{x) = Lim 0, because Lim F'(x) = 0.
X00 X-^oo Xoo x(~yA x>*
Applying LHospital's rule, we get,
Lim^^= Lim^-kr-= Lim-------------. Applying L'Hospital's rule once
*+-{-%) x~2 y 5 F
V27T
more, we get,
2
Lim = Lim---------^ = Lim 2(V2/r = Lim 4JtF'ix) = 0. Thus
X*1 */ X1 i*/ X>oo V / xoo
f=e /2 x;=e A
V2 K
we have proved that the limit of h(x) goes to zero as x->. f{x)
is also a decreasing function. Thus the value of h{x) must always
be negative.
Now, F"(x)F(x)-(F'(x)f = F'(x)h(x), and F'(x) > 0, for all x. Hence the
product of h(x), a negative value, with F'(x) will give a negative
value.
We can conclude from this that F"(x)F(x) (F'(x))2 would give a
negative value for all x,
Figures 3, 4 and 5 give plots of the functions r(*), r(x), r M,
respectively.
16


r(X) vs X
rH(X) vs. X
symmetric negative semidefinite matrix with rank n-1. Any


vector in the null space of matrix A is of the form a 1, where a
is a real number.
Proof:
d2
Let A = j(pD(w) be associated with digraph d. Matrix A by its
aw
definition, is a symmetric matrix of order n For a strongly
connected digraph, each node must have a path to each of the
remaining nodes. That means it must have an outgoing edge to at
least one of the remaining nodes of the graph. So none of the
rows or columns of A can have all elements equaling zero at the
same time. The main diagonal terms of matrix A are the
negative of the sum of the off-diagonal terms. By Lemma 1, we
can see that r"(x) is negative. Hence the diagonal terms are
negative, while the off-diagonal terms are nonnegative.
As can be seen from the definition of matrix A, we have,
n
= ~au fr 1 ^i^n, and lai(.l> 0 for all 1 M
i*i
diagonally dominant.
V
Consider a vector x =
in the null space of matrix A. Let xi b e
LXnJ
the maximum of all for Hence
Multiplying the £th row of A by x, we get
ak\Xl "* ak2X2Jr"'JrakkXkJr''lraknXn ~
~akkXk ~ ak\X\ aklX2'^---'^akk-\Xk-\ akk+iXk+\~^aknXn
Since the digraph is strongly connected, and not all the XjS on the
right hand side are zero.
Using triangular inequality, we have,


Since xk is the maximum of all x,., where 1 lajl*t.l< laHllxt.l+lfl,2llxill+....+lflM_1IUJ+lau.+1IUx.l+...+lfl,llxtl
= 1**1 (fltl+....+att_, + %+,+ +<%,)
Now consider a set Me{1,2,................... n} such that if /eMthen *, =**.
If M = {1,2,......,n} then we get x{=xk,x2 = xk, ....,xa=xk. If not, there
is at least one element, say j, in {1,2,....,}-M. Since the digraph
D is strongly connected, there exists a path from every i e M to
j. Therefore, we can find at least one element i e M such that i
beats j. Similarly, there must exist a path from j to every ieM.
This implies that ai}> 0.
We have, -a,fxf = a,,x,+..........+aii_lxi_i + aiMxM +.....+ainxn.
Since x, = xt, -aHx, < anx,+..........+aii_xxi + a,7+1x,+..+ainxi
n
But au = -£a/y. So -affx, < anx(+..........+aii_xxi+aiMxi+........+ainxx = -anxr
j=i
j*
Therefore, x; = xk gives
ai\X 1 + ailX2Jr----~>raik-\Xk-\ aik+\Xk + \^~---~^amXn
= anxk + anxk +... +aik_xxk + aik+lxk +... +ainxk.
But this would imply that a,yxy = aijxi = atjxk, since a,y > 0.
Xj = xk => j e M. This is contradictory to our earlier assumption
that j Therefore M = {1,2, ,n} must be true.
But x(. = xk V ieM. That proves that x, = xk, x2 = xk, ....,x = xt,
where,
V
x =
is in the null space of matrix A.
19


We can see that the product of any row of matrix A with the
vector I is zero. Thus I is in the null space of matrix A. This
means the dimension of the null space of matrix A is at least 1.
Since we just proved that all entries in any null vector of matrix
A are equal, A'ic=0, implies x = cci, for some a e R.
Therefore, dim(nullspace(A)) < 1. So dim(nu//.vpace(A)) = 1.
Thus we can conclude that the rank of matrix A must be n-1.
We wish to prove next that matrix A is negative semidefinite.
Matrix A would be negative semidefinite if and only if all its
eigenvalues are nonpositive. Since the rank of matrix A is n-1,
one of its n eigenvalues is zero. Ay = Ay, where X is the
eigenvalue of matrix A associated with eigenvector y Therefore,
(A-A/)y = 0. Now if X > 0, then the subtraction of positive terms
from the main diagonal terms of matrix A would make the main
diagonal terms of (A-A/J smaller than the sum of the
corresponding off-diagonal terms. But (A-A/,,) is diagonally
dominant. Therefore, X > 0 cannot be true. That means the
eigenvalues of matrix A cannot be positive. Ay = Ay => A < 0. This
would be true only if A is negative semidefinite. Hence matrix A
is negative semidefinite with rank n-l for a strongly connected
graph.
20


5. Computation of the Solution
The linear system of n equations can be solved iteratively by
Gauss-Seidel method to get a quickly converging solution to the
problem. Using Gauss-Seidel method of iterations we get,
-%aw?)-%aAk~l)+bi
w(*)=-d--------LiH----------, where w{k) is the value of the ith
a..
component of the vector w obtained in the kth iteration. wis
accepted as a reasonable approximation to the solution if
ll#(*> _#(*-') II
----- ------ We are able to get a convergence to a solution using the Gauss-
Seidel method due to the negative semidefiniteness of matrix A.
The program 'GAUSS does this iterative computation. The
reader is referred to the program listing in Appendix A.
The values of w for the digraph in Figure 1. were as follows:
'0.04128163 1
__ -0.04128163
W~ -0.2806859
0.2806859
Thus teams 4 and 1 were ranked in the first and second
position respectively. Team 3 finished last.
Next we prove that it is possible to get a maximum likelihood
estimate of these paired comparisons for the case of a strongly
connected digraph.
Theorem 2: There exists a unique maximum to the likelihood
' n
function (pD(x), given =0, for the case of a strongly connected
i=i
digraph d.
21


Proof: We first prove that there exists a maximum solution to
this problem for the case of a strongly connected digraph.
n
Consider the nx 1 vector x. We have, ]£*,. =0, because of the
<=>
constraint defined for the mathematical model defined in
chapter 3.
Let II^IL = ^. Hence I*,I =k, for some i, where l that X; > 0. The sum of the remaining n-1 variables would add to
-k, and therefore their average would be ~y^n_iy So there exists
a minimum, say x}, among these n-1 variables, such that
Consider /-(y) = ln(F(y)).
Since F(y) is in (0,1) for all y, o < y < o 5 the value of r(y) is
always negative. Applying Taylors theorem, we have,
r(y) = r(0) + r'(0)y + for some £ in (0,y).
Since Lemma 1 gives r"(£)<0 r(y) < r(0) + r7(0)y, for ally.
Since r(0) =-ln(2), and r'( 0) = ^/, we have, r(y) <-ln(2) + ^%y ,
for all yeR. If the underlying graph is strongly connected, there
exists a path from node i to node j and vice versa. Since the
total number of nodes are n, the length of this path would be at
most n-1. We now consider a sequence of t teams including
team j, that are on the path from j to i. If team j beats team i,
then r=l. Otherwise, there would be some team, whom team j
beats, which in turn beats some other team, which would finally
beat team i. For the sake of simplicity, we represent team j as
j0 and team i as j,. Hence the sequence of the teams on the path
22


from jQ to j, can be represented as j0 - -> j2 j,_{ -> j,. This
is a path with length t and is a subset of the whole graph. Hence
t n n
m=l /=!
*SJojAXJo -XJ>) + SJaAXh ~xh)+.+sjAXJ ~XA
since > 0 for all 0 < q < t l,and r(y) < 0 for all o < y < oo.
- Axh rixi, xh)+.......+rixi,., xi. )
since j,, >1 for all 0 JqJq+l
By using the upper bound on r(x) that we derived earlier, we
find that, ^o(x)^(-ta'(2) + ^(^0-^j)+....^-ln(2) + ^(^.i-xj).
+xh -xh++x],-> ~xu)
< -n ln(2) + -]f%(xJo ~ Xjl +xh xj+....+xj The terms on the right hand side cancel out to give,
Since x} < ~^n ^ and llxIL = k,we have, (pD{x) < -nln(2)- -1)
Since ^D(^)<-nln(2)--^^**^* This gives the upper
bound on (pD(x). We can prove this result in a similar fashion for
the case of <0. (pD(x) is a linear combination of the natural
logarithm of error functions. Error function F(y) is a continuous
function with its range in (0,1) for all yeR.. The function ln(v) is a
real valued function for any veR, and it is continuous in the
interval (0,1). Therefore, r(y) which equals ln(F(y)) is a
continuous function on (0,1). Hence (pD(x) is a continuous function
inRnD. Let Z be a set where (pD{z) > 23


nonempty set. The upper bound that was just proved to exist on
(pD(x) is also an upper bound for Also (pD(x')<(pD{o), forallJeZ. We just proved the existence of an
upper bound on x for a strongly connected digraph D.
Therefore, closed set. Finally, Z is a compact set because of the continuity of
maximum on a closed and compact set [8]. Using this result we
can conclude that (pD attains a maximum at some point x e Z.
Thus we have proved the existence of the maximum solution to
the likelihood function corresponding to a strongly connected
digraph.
We now prove that there is a unique maximum solution.
We know that the function (pD(x) is twice differentiable. Let
there be a maximum solution to the problem at two vector
points aandb. Therefore, (p'D(a) = (p'D(b} = d.
Using Taylor's theorem, we have,
(p'D{b) = (p'D{a) + - flj, for some c such that c = oca + (1 a)b.(5.1)
for some a e [0,1].
9d(^) ~ Therefore, (b-d) is in the null space of entries in a and in b is zero as a result of the constraint defined
earlier, a andb are orthogonal to I. Hence b-a is orthogonal to 1.
Therefore, lr(5-a) = 0. Using the result in theorem 1, we can
denote b-a in terms of I as b-a =/3l,wherej8isascalar.
r(/?i)=o.
/j(lrl) = j5-n = 0=>P = 0.
24


b a = 0 => b = a.
his proves the uniqueness of the maximum solution to this problem.
25


6. Applications
We have seen that this method can be used for ranking teams in
an incomplete or a complete tournament. Although the case of
ties has not been considered here, it could be shown that the
same approach would give us a solution for a strongly connected
graph, if the tied teams are awarded y2 a point each.
In general, this method of ranking can be used for ranking a
number of objects in a sample. If we are considering different
varieties of chickens for their quality, we can make pairwise
comparisons depending upon their relative merits in different
attributes of interest. Then, with the probabilistic approach used
above, we can rank them accordingly. Other potential uses would
be for pairwise comparisons in areas such as sensory testing,
especially taste testing, to consumer tests, personnel rating, and
in general to the study of preference and choice behavior.
The methodology presented in this paper was applied to find
the ranking of the teams participating in the preplayoff games in
the 1993-94 NFL football season. The rankings of the 28
participating teams are given in Table 1 in Appendix A. Table 2
in Appendix A gives the results of the playoff games and also of
the Super Bowl. It can be seen from Table 2 playoff results that
Kansas Chiefs were indeed better than Houston Oilers as
suggested by the rankings, since they beat the Houston Oilers in
the playoff games. Table 3 in Appendix A gives the schedule of
the pre-playoff games for both of these teams. We can see that
26


although Kansas Chiefs had 11 wins to their credit as against the
12 wins of Houston Oilers, they played against better teams.
They also won against some of the better teams and thus their
being ranked ahead of Houston Oilers has some justification. Nine
out of the eleven playoff games gave results which agree with
the ranking of teams given in Table 1.
A similar approach has been used by Jech [6]. That approach
gives a probabilistic estimate of how much better a team is than
some other team. Hence decision making is done by using the
transitive relationship, that is, team A better than team B and
team B better than team C implies that team A is better than
team C.
27


Appendix A
Table T. Results of NFL Pre-Playoff Games (1993-94 Season)
Name of the Team Conference Wins Losses Weight Rank
Buffalo Bills AFC 1 2 4 0.7750839 1
Dallas Cowboys NFC 1 2 4 0.6917094 2
Kansas City Chiefs AFC 1 1 5 0.6418724 3
Houston Oilers AFC 1 2 4 0.5756177 4
New York Giants NFC 1 1 5 0.4976263 5
Los Anqeles Raiders AFC 1 0 6 0.3867019 6
Green Bay Packers NFC 9 7 0.2879164 7
Minnesota Vikinqs NFC 9 7 0.2781058 8
Denver Broncos ARC 9 7 0.2324088 9
Detroit Lions NFC 1 0 6 0.2276092 10
San Francisco 49ers NFC 1 0 6 0.2203107 1 1
Miami Dolphins AFC 9 7 0.1929059 1 2
San Dieqo Charqers AFC 8 8 0.1433809 1 3
Philadelphia Eaqles NFC 8 8 0.08746736 1 4
Pittsburqh Steelers AFC 9 7 0.06787151 1 5
New York Jets AFC 8 8 0.00514422 1 6
Chicaqo Bears NFC 7 9 -0.0464035 1 7
New Orleans Saints NFC 8 8 -0.0555461 1 8
Arizona Cardinals NFC 7 9 -0.1798368 1 9
Seattle Seahawks AFC 6 1 0 -0.2970637 20
Cleveland Browns AFC 7 9 -0.318633 21
Tampa Bay Bucaneers NFC 5 1 1 -0.356672 22
Atlanta Falcons NFC 6 1 0 -0.4439348 23
Los Anqeles Rams NFC 5 1 1 -0.5727481 24
New Enqland Patriots AFC 5 1 1 -0.6100564 25
Washinqton Redskins NFC 4 1 2 -0.654241 26
Indianapolis Colts AFC 4 1 2 -0.7633866 27
Cincinnati Benqais AFC 3 1 3 -1.00721 28
Source: 1993-94 NFL Season
28


Table 2. Results of NFL Playoff Games (1993-94 Season)
Lower Ranked Team Higher Ranked Team
Denver Broncos (AFC) L.A Raiders (AFC)*
Pittsburgh Steelers (AFC) Kansas City Chiefs(AFC)*
Houston Oilers (AFC) Kansas City Chiefs(AFC)*
L.A Raiders (AFC) Buffalo Bills (AFC)*
Kansas City Chiefs (AFC) Buffalo Bills (AFC)*
Minnesota Vikings (NFC) New York Giants (NFC)*
Detroit Lions Green Bay Packers*
San Francisco 49ers (NFC)* New York Giants (NFC)
Green Bay Packers (NFC) Dallas Cowbovs(NFC)*
San Francisco 49ers (NFC) Dallas Cowboys(NFC)*
Dallas Cowboys(NFC)* Buffalo Bills (AFC)
Note: indicates the winning team.
Table 3. Pre-Playoff NFL Schedule (1993-94 Season)
Team :Kansas City Chiefs Team: Houston Oilers
Record: 11 Wins 5 Losses Record: 12 Wins 4 Losses
Rank: 3 Rank: 4
Played Against Rank Result
New Or. Saints 1 8 L
Kan.City Chiefs 3 W
San Diego Chrgrs 1 3 L
L.A. Rams 24 L
Buffalo Bills 1 L
New Eng. Pats 25 W
Cincinnati Bengals 28 W
Seattle Seahawks 20 w
Cincinnati Bengals 28 w
Cleveland Browns 21 w
Pitts. Steelers 1 5 w
Atlanta Falcons 23 w
Cleveland Browns 21 w
Pitts. Steelers 1 5 w
San Fran. 49ers 1 1 w
N.Y. Jets 1 6 w
Played Against Rank Result
Tampa Bay Buc. 22 W
Houston Oilers 4 L
Denver Broncos 9 W
L.A. Raiders 6 W
Cincinnati Bengals 28 W
San Diego Chrgrs 1 3 W
Miami Dolphins 1 2 L
Green Bay Packrs 7 W
L.A. Raiders 6 W
Chicago Bears 1 7 L
Buffalo Bills 1 W
Seattle Seahawks 20 W
Denver Broncos 9 L
San Diego Chrgrs 1 3 W
Minnesota Vikings 8 L
Seattle Seahawks 20 W
Source: 1993-94 NFL Season
29


Appendix B
j Hsis5i!*****5(i**!f:*H:!|**s|!******!|***sl!**4:*H:*!f!*******H***!fs***sfs!i!*****5f:**5|:sic*H5*=t:*
he main program calls the function input() to read the data,
he solution is obtained by Newton's method. The Matrices used in
IjJewton's method are initialized using the function initialize().
he function normal_area() is called to calculate normal area
function, r(X), r(X) and r"(X).
he computed matrices Anxn and bnxl are checked by the function
c]heck().
he function gauss_seidel() is called to do Gauss-Seidel iterations,
he convergence of the solution is checked by computing the L2
orm, for which the function 12_norm() is called.
N 4; 4* 4* 4* 4* 414 4# 4* 4* 4* 4* 4 4* 4# 4* 4 4 4* 4/ 4 4# 4 4 4# 4* 4# 4/ 4 4# .1. *1. .1. .1. 4 4 4* 4^ 4 4 4 4# 4* 4 4 4* 4 4* 4* 4# 4* 4> 414* 4* 4* 4 4^ I
1*1 V "J* f V rT* *7* ^ ip ^ ip ip ip ip ^ ip ip ip ip ipipip ip ip ip ip ip ip ip ip ip ip ^ ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip #p ip *p #
# include
^include
E ILE *in,*out;
[\loid input();
oid initialize();
oid normal_area();
oid 12_norm();
oid check();
oid check_sum();
oid gauss_seidel();
:ouble val, x, q, PI ;
c ouble r,old_total_r,new_total_r,r 1 ,r2,a[ 100][ 100],b[ 100],
old_x[100],new_x[100],norm, gaussold_x[100],
gaussnew_x[100];
[ it
io_var,no_games,iteration,incidence! 100] [100],row_sum,column_sum,
nain_counter=0;
oid
main()
int k,i,j;
30


PI = 3.14159265358979;
while(main_counter == 0)
{
printf("\nIteration %d\n",iteration);
if(iteration == 0)
{
in = fopen("in.dat7r");
out = fopen("out.dat","w");
q = sqrt(2.0*PI);
inputQ;
for(i=0;i {
for(j=0;j {
printf("%d ",incidence[i][j]);
}
printf("\n");
}
fclose(in);
fclose(out);
}
for(k=0;k {
old_x[k]=new_x[k];
}
printf("\n");
initialize();
for (i=0;i {
for(j=0;j {
if(incidence[i] [j]>0)
{
x = fabs(old_x[i]-old_x[j]);
val = old_x[i]-old_x[j];
normal_area();
new_total_r += r*incidence[i][j];
a[i][j] += -incidence[i][j]*r2;
a[j][i] += -incidence[i][j]*r2;
a[i][i] += incidence[i][j]*r2;
a[j][j] += incidence[i][j]*r2;


b[i] += incidence[i][j]*rl;
b[j] += -incidence[i][j]*rl;
}
}
}
if(fabs(new_total_r-old_total_r)<1.0e-3)
{
main_counter=l;
printf("\n The Solution has converged to its
maximum\n");
printf("\n %d The value of the likelihood
function=%lf",
iteration,new_total_r);
for(i=0;i {
printf("new_x[%d]=%lf ",i,new_x[i]);
}
check_sum();
exit(7);
}
printf("\n The value of the likelihood function is %f",
new_total_r);
check();
gauss_seidel();
for(i=0;i {
new_x[i] = old_x[i] gaussnew_x[i];
}
iteration += 1;
if (iteration == 14)
{
main_counter = 1;
for(i=0;i {
printf("new_x[%d]=%lf\n",i,new_x[i]);
}
check_sum();
}


void input()
i 1* ^ lli *i# 1 1# tt* ! tltf fcl* 1* -1 fcF* -1 L kt# ] *1* 1 I. h] ! fcfj .1 I > -I- .1. f* .1. J I .1. I* l kl< ** lc !/ tl# kl* ki* ki* kl# kl* kik kli kU kl* kii *J* kjk kl# kl*
/ j* ^ ^ kp |k ^ ip ^k ip ip ip ip ip ip ip ip ip ip ^ ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ip ^ ip ^ ip ip ip ip ip *Jk ip ip ip p
program to read the input file .
he program also validates the data.
kl kl* kl* kli k^ kl* kl* k^ kl* k^ kl* 1 k^ |. . I. -1 t .1. .1. [. kl* kl* klk k^ .1. T. f. [. [. I. .1. .1. .1. I .1. I .1, .1. I. -1. kli kl* kl* kl* k|k kl* kl* k^ Ui k^ kl* tJi k^ k^ kl# k^ kl* k^ kl* kl* kl* kl* kl* kl* kl* i
*| *Jk ^k ip ip p ip #p ip ip ip ip ip kp kp kp fp ip ip kp ip ip ip ip ip ip ip *^k *p kp #p kp ip ip *Jk ip kp #p kp ^k ip ^k ip ip ip ^k ip ip p ^k ^k #p ip kp kp kpkp ip ip ip #p *jk ip /
int k,i,j,sum;
fscanf(in,"%d",&no_var);
for(k=0;k {
fscanf(in,"%lf",&old_x[k]);
}
printf("\n");
fscanf(in,"%d",&no_games);
sum=0;
for (i=0; i {
for (j=0; j {
fscanf(in,"%d",&incidence[i][j]);
if(incidence[i][j]<0 II incidence[i][i] != 0)
{
printf("\nError in the input\n");
exit(0);
}
sum += incidence [i][j];
}
}
printf("\n Number of Games= %d Sum of the Incidence Matrix
%d\n",
no_games,sum);
if(sum != no_games)
{
)rintf("\n Error: Incidence Matrix sum does not match the number of
james
played\n");
exit(l);
}
for(i=0;i {
row_sum=column_sum=0;
33


for(j=0;j row_sum += incidence[i][j];
column_sum += incidence[j][i];
}
printf("\nWins for team %d=%d\n Losses for team
%d=%d",
i,row_sum,i,column_sum);
printf("\nGames played by team
3 bd=%d" ,i,row_sum+column_sum);
}
return;
}

rogram to initialize matrix AnXn and vector bnxi-
L* L tb kb kb kb b %b kb \b kb kb kb vj* \b kb kb* kb kb kb kb kb kb kb kl* kb kb kb kb kb kb klk kb kb b kb *1* kb kb kb kb kb f- kb ki# kb kb kb kb b kb *b kb kb kb kb kb kb kb kb kb* kb kb kb kb kb kb kb kb t
T* * *1* ^T* 'f* T 'f**T*'lT* T*'T T 'f' kfk kf kfk kfk rp rf J >f]!{* J Jp 5j! JJ55J! Jp *J*P |!Jp pT*P T P *r VV ** /
k'oid initialize()
[
int i,j;
for(i=0;i {
for(j=0;j {
a[i][j] = 0.0;
}
old_total_r=new_total_r;
new_total_r = 0.0;
b[i] =0.0;
}
return;

program to calculate the normal function integral given the value
X.
Simpson's l/3rd rule has been used to calculate the area under the
n )rmal curve.
The program also calculates r(x)=ln(ERF(X)), r'(X) and r"(X).
L :******************************************************************/
)id normal_area()
34


int i,no_inter,j;
double a, u,h, integral, derive_l ,derive_2;
static int count;
integral = derive_l = derive_2 =0.0;
no_inter=0;
if (x > 1.0e-6)
{
h = sqrt(sqrt((6.0e-10*q)));
h /= sqrt(sqrt(x));
if(h > x)
{
no_inter = 2;
}
else
{
no_inter= (int)(x/h);
if(no_inter % 2 != 0)
{
no_inter+= 1;
}
else
{
no_inter += 2;
}
h = x/(double)(no_inter);
}
integral = 1.0/q;
integral += (exp(-(x*x/2.0)))/q;
for(i=l; i {
a = (float)(i) h;
u = exp(-(a*a/2.0))/q;
if(i % 2 == 0)
{
integral += 2.0*u;
}
else
{
integral += 4.0*u;
}
}
integral *= h/3.0;
if(val == x)
35


{
integral += 0.5;
}
else
{
integral = 0.5 integral;
}
}
else
{
integral = 0.5;
}
printf ("\n The area under the normal curve for x= %e is %lf\n",
val, integral);
r = log(integral);
derive_l = (1.0/q)*exp(-1.0*val*val/2.0);
derive_2 = -1.0*derive_l*val;
rl = derive_l/integral;
r2 = (derive_2*integral -
derive_l*derive_l)/(integral*integral);
return;
Program to compute L2 norm for an iteration.
void
I
12_norm()
int k;
double sum =0.0;
norm = 0.0;
for(k=0;k {
sum += (old_x[k]-new_x[k])*(old_x[k]-new_x[k]);
}
norm = sqrt(sum);
return;
36


1 ia ^ ala ala ala ala aia ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala
/ J* fja ip a^ a^ fp fp ip ip ^a ip a^ ip ip ip ip ip a^ ip ip Sfr a^ ip ip ip ip ip ip aja ip a^ ip ip ^a a^ a^ a^ ^a ip rja fp #p *p *p ap *p aja rp aja
his program checks if the sum of all entries inside bnxl is zero.
d ala ala a^ ^a ^a ^a ^a ala ^ ala all %1* ^a a^ a^ ala ala ala ala 1* ala ala * ala .1* ala ala a^ ala ala ala ala ala t* ala ala ala ala ala ala ala ala ala ala ala a^ ala a^ ^ ala ala ala ala ala ala ala ala ala I
i| ip ip ip ip ^ ^a Ip ip a^ ip ^ ip ^ aja ip ip ip ip ip ip ip ^a ip ^a a^ a^ ip a^ ip ip ip ip ip a^ ip ip ip ip ^a ^a ^a ^a ip ip ^a a^ a^ a^* ip ip Ip ip ip Ip ip *p a^a ip f
\oid check()
{
int k,i,j;
double sum=0.0;
for(k=0; k {
sum += b[k];
}
printf("\n Sum of all terms in the first der.matrix= %lf\n",sum);
for(i=0;i {
printf("%lf ",a[i][j]);
}
printf("\n");
return;
/ ala ala ala da air di ala ^ al* di d# ala ala ala ala ala ala ala f r ala .1. ala ala f- aF# ala ala ala ala ala ala ala ala ala f.. f ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala ala
a i ip p ip ip a^a #p ip Ip r^a aya ip ip ip ip aja gp ap> ip ip ip #p rp ip rp rp a^a rp rp rp ip ip a^a rp rp a^a a^a rp rp rp rp a^a rp rp rp rp rp rp rp rp a^a Ip rp rp aya Ip rp ip rp ip rp
: unction to solve the given system of linear equations.
Using Gauss_Seidel Iterations.
|a £a a^a a^a a|r a^a a^a a^a a^a tja a^a r|r a^a rjr r|r |a a^a a^a t|a a^a a£a a^a a^a a^a a|a a^a aja a^a i^r a|r a^a a£a a^aa^a a^a a^aa^a a^aa^a a^a a^a a|a a|aa^a ijiiji a^a rja a^a a^a a^a a^a a^a a^a a^a a^a a^a r|r a^a a|a a|a^
void gauss_seidel()
double do_check, inf_norm, old_inf_norm, new_inf_norm,
total_sum=0.0, avge=0.0;
int i,j,counter,step=0;
counter=0;
for(i=0 ;i {
gaussold_x[i] = gaussnew_x[i]=0.0;
}
while(counter == 0)
{
do_check = inf_norm =old_inf_norm = new_inf_norm=0.0;
step +=1;
for(i=0;i {
37


f or(j=0; j {
if(j != i)
{
gaussnew_x[i] += -gaussold_x{j]*a[i][j];
}
}
gaussnew_x[i] += b[i];
gaussnew_x[i] /= a[i] [i];
new_inf_norm = fabs(gaussold_x[i]-gaussnew_x[i]);
if(fabs(gaussnew_x[i]) > inf_norm)
{
inf_norm = fabs(gaussnew_x[i]);
}
if(old_inf_norm < new_inf_norm)
{
old_inf_norm=new_inf_norm;
}
gaussold_x[i] = gaussnew_x[i];
}
do_check=old_inf_norm/inf_norm;
if(do_check<1.0e-4)
{
counter = 1;
for(i=0; i {
total_sum += gaussnew_x[i];
}
avge = total_sum/no_var;
for(i=0; i {
gaussnew_x[i] += -avge;
}
}
else
{
inf_norm=0.0;
oid_inf_norm=0.0;
}
}
return;
38


4 4* ^ 4# 4^ 4* 4* 4* 4 4* 4* 4* 4* 4* 4 4 4 4 4 4# 4/ 4 4 4 4 4 4 4# *1* 4 4 4 4 4# 4 4* 4* 4 4* 4* 4* 4 4* 4* 4* 4# 4* 4* 4* 4J *4* *1* 4* 4f 4
/ X j*I *T',J *T* *T* r* *T* T'Pr V *T* v*t* rj ^ >]<^t ^kji *i*p ip fpip ip ip ip ip ip ip ip ip ip t ip *p *p p 'T*T ! p'r *T* *p
jprogram to identify two highest ranked teams in each of the two
Sootball conferences
11 414* 4d 4* 4? 4* 4* L 4> 4* 4* 4* 4* 4* 4* 4* 4* 4* L 4 4* 4* 4* *1# 4* 4 4* 4* 4# 4# 4# 4 4# 4j 4 4 4i 4 4 4# 4 4* 4* 4* 4 4* 4* 4* 4* 4* 4* 4* 4* 4J 4^ 4* 4? 4* 4* 4* 4# 4* 4* 4# /
11 **1* *T**T* J*T* 'I* *P *T 'T* *T* *P *1* 'rT*P'l*T' T VT'P'P'r *T* *P >pip>p ^ y )id check_sum()
int i,imax=0,second_imax=0,league=l;
double max=0.0,second_max=0.0,var_sum=0.0;
for(i=0; i {
var_sum += new_x[i];
}
for(i=0;i {
if (new_x[i]>max)
{
second_max = max;
max = new_x[i];
second_imax=imax;
imax = i;
}
else if(new_x[i] > second_max)
{
second_max = new_x[i];
second_imax = i;
}
}
printf("\n The first ranked team for league %d is %d",
league,imax);
printf("\n The weight associated with team %d is %lf\n",
imax,max);
printf("\n The second ranked team for league %d is %d",
league, second_imax);
printf("\n The weight associated with team %d is %lf\n",
second_imax, second_max);
league += 1;
imax=second_imax=0;
max=second_max=0.0;
for(i=no_var/2;i {
if(new_x[i]>max)
39


{
second_max=max;
max=new_x[i];
imax=i;
}
else if(new_x[i] > second_max)
{
second_max = new_x[i];
second_imax =i;
}
printf("\n The first ranked team for league %d is %d",
league, imax);
printf("\n The weight associated with team %d is
?>lf\n", imax, max);
printf("\n The second ranked team in league %d is %d",
league, second_imax);
printf("\n The weight associated with team %d is %lAn",
second_imax,second_max);
printf("\n The sum of all variables = %lf",var_sum);

return;
f
40


References
1. Burden, R. A., Faires, J. D. (1989). Numerical Analysis. PWS-
KENT Publishing Company.
2. David, H. A. (1988). The method of paired comparisons. Oxford
University Press, New York.
3. Fechner, G. T. (1860). Elemente der Psychophysik. Leipzig:
Breitkopf und Hartel.
4. Fechner, G. T. (1965). Elements of Psychophysics, Vol. 1, transl.
H. E. Adler. New York: Holt, Rinehart & Winston.
5. Genest, C., LaPointe, F., Drury, S.W. (1993). On a proposal of
Jensen for the analysis of ordinal pairwise preferences using
Saatys eigenvector scaling method. Journal of Mathematical
Psychology 37, 002-038(1993).
6. Jech, T., (1983). The Ranking of Incomplete Tournaments: A
Mathematician's Guide to Popular Sports. American
Mathematical Monthly Vol. 90, No. 4, 246-264, April 1993.
7. Noether, G.E., (1960). Remarks about a paired Comparison
Model. Psychometrika, Vol 25, No.4.
8. Rosenlicht, M., (1968) Introduction to Analysis. Dover
Publications, Inc., New York.
9. Saaty, T.L., (1977). A scaling method for priorities in
hierarchical structures. Journal of Mathematical Psychology, 15,
234-281.
10. Saaty, T.L., (1980). The analytic hierarchy process. New York,
NY : McGraw-Hill.
4 1


11. Saaty, T.L., (1986). Axiomatic foundation of the analytic
hierarchy process. Management Science, 32, 841-855.
12. Saaty, T.L., (1990). Eigenvector and logarithmic least squares.
European Journal of Operational Research, 48, 156-160.
13. Thurstone, L. L. (1927a). A Law of comparative judgment.
Psychol. Rev. 34, 273-86.
14. Thurstone, L. L. (1927b). Psychophysical Analysis. Amer. J.
Psychol. 38, 368-89.
15. Thurstone, L. L. (1927c). The method of paired comparisons
for social values. J. Abnorm. Soc. Psychol. 21, 384-400.
4 2


Full Text

PAGE 1

ON NEWTON'S METHOD TO MAXIMIZE THE LIKELIHOOD FUNCTION FOR PAIRWISE COMPARISONS by Vinata Gangolli A thesis submitted to the Faculty of' the Graduate School of the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science Applied Mathematics 1994

PAGE 2

This thesis for the Master of Science degree by Vinata Gangolli has been approved for the Graduate School by David C. Fisher r James Koehler Burt Simon ii Date

PAGE 3

Gangolli, Vinata (M.S. Applied Mathematics) On Newton's Method to Maximize the Likelihood Function for Pairwise Comparisons Thesis directed by Associate Professor David C. Fisher Abstract We consider the problem of ranking a set of objects based on a number of pairwise comparisons made among them. This becomes a more complex problem if each object is compared only with some of the remaining ones in that set, and if this experiment is repeated unequal number of times for different pairs. A directed graph can be used to represent such a scenario. Given such a directed graph representing pairwise comparisons of objects in a set of data, a likelihood function based on the pairwise comparisons of these objects is constructed with an underlying assumption, that the performance of the object is an independently distributed random normal variable with a common variance. Newton's method is then used to get the maximum solution to this function. It is shown that there exists a unique maximum solution to this problem for the case of a strongly connected directed graph. This abstract accurately represents the content of the candidate's thesis. I recommend its publication. David C. Fisher Ill

PAGE 4

To Chinmoy for his love and support IV

PAGE 5

Chapter 1. Introduction ............................................................................................... 1 2. Ranking of Performances .................................................................... .4 3. Mathematical Model. .............................................................................. ? 4. Solution to the ModeL.. .......................................................................... 1 5 5. Computation of the Solution ............................................................... 2 1 6. Applications ............................................................................................... 2 6 Appendix A. Rankings based on NFL Pre-Playoff Games (1993-94 Season) .............................................................................................................. 2 8 B. Computer program .................................................................................. 3 0 References .......................................................................................................... 4 1 v

PAGE 6

Figure 1. Digraph representing games between teams .............................. 5 2. Plot of Normal CD F ................................................................................ S 3. Plot of r(X) .................................................................................................. l 7 4. Plot of r'(X) ................................................................................................. l 7 5. Plot of r"(X) ............................................................................................... .! 7 vi

PAGE 7

ACKNOWLEDGMENTS I would like to thank David Fisher, James Koehler and Burt Simon for giving me the opportunity to learn by sharing their time and expertise. I especially thank Dr. Fisher for his support, invaluable guidance, patience and wonderful enthusiasm. vii

PAGE 8

1. Introduction We are often confronted by the problem of ranking a set of pairwise alternatives, based on the data available on the merit of each alternative. We consider a population of n objects with vanous degrees of merit in the attributes of our interest. It is desired that these n objects be ranked in a fair manner according to their merits. A number of pairwise comparisons may be made to decide the better object among a pair to be compared. So we would have (;) possible pairs to compare. Suppose we compare only some of these pairs, and repeat this experiment of comparison unequal number of times for each pair, then we need to formulate a way to rank these objects based on the experiments. The method of paired comparisons IS used primarily m cases when the objects to be compared can be judged only subjectively; that 1s, it is impossible to make an individual measurement for each object to judge its merit. Paired compansons are widely employed by psychometricians. The method was introduced by Fechner ([3],[4]) and, popularized by Thurstone after several extensions ([13], [14], [15]). This problem has been widely studied ([2], [7], [9], [10], [11], [12]). The process of ranking is complicated for the case of the absence of a clear-cut winner. Each alternative may have, some characteristic or merit in which it scores over others. Various

PAGE 9

schemes have been suggested and implemented to decide the best alternative. Saaty [ l 0] has introduced a procedure for prioritizing decision alternatives based on an n x n positive matrix, R = (ru ), of pairwise compansons of n <: 2 items expressed on a ratio scale. His method treats r,j > 0 as an estimate of the perceived intensity, w; /,, of a jwj judge's preference m favor of alternative i versus 1, where ( ' ')T d l 1 f w = w"w2 ,w, IS an un er ymg co umn vector o nonnegative weights that sum up to one. The judgment matrix R =h), is thus taken to be reciprocal, namely, rj, = fru, 1 :> i,j :> n, with the assumption that the judge's preference in favor of item i or j IS elicited with only one of the pairs (i,j) or (j,i) for i j. Saaty (A. -n1)/ proposed to use J1 = mox /(n -1) > 0, a linear transform of the Perron eigenvalue A.mox of R, as a measure of the cardinal consistency in the judge's responses (i.e., r,j rj, = rik, 1 :> i,j,:> n ), and posed the problem of determining how it might vary as a function of the rij's. Aupetit and Genest [6] answered this using the Perron-Frobenius theorem. Noether [7] has suggested a method of ranking based on two response curves. A linear model of the form Xu =a,aj + e,j, (1 :> i,j :> n), is defined for the amount of preference of item i over item j. Weights are assigned for each item i (1 :> i :> n), and they are estimated using the method of maximum likelihood estimation. We introduce here a model to rank a set of objects in a probabilistic manner, based on the performance of each object relative to the rest of the population. A directed graph or 2

PAGE 10

digraph is used to represent results of the compansons between different pairs of objects. The probability of an alternative, say i being superior to another alternative, say j, is calculated, based on the results of the comparisons made so far. A variable or weight associated with each object, gives a measure of its merit. Hence difference between weights of two objects gives a measure of superiority of one object over the other. A random normal variable is associated with each object to represent the random error associated with the outcome of each comparison. The probabilities of each alternative i sconng over the alternative j are calculated using this normal variable and then the max1mum likelihood function is formulated. This is solved to get a maximum using Newton's method. It is also shown that a unique maximum solution exists for a strongly connected digraph. If normal density function is used for calculating the probabilities, then the model becomes similar to the one underlying the Law of Comparative Judgment or the ThurstoneMosteller model [2]. 3

PAGE 11

2. Ranking of Performances We start with an example to explain the problem under consideration and to develop and analyze a mathematical model for it. Consider a league of four teams 1, 2, 3, 4, which play several games against each other Teams 1 and 2 play three games against each other, two of which are won by team 2 and one by team 1. Team 2 loses in its only game against team 3 and team 1 wms m the game it plays against team 4. Teams 3 and 4 play two games against each other, both of which are won by team 4. The digraph in figure 1 gives a graph representation of this scenario together with the results of each game. Teams l and 3 do not play against each other. Similarly, teams 2 and 4 do not play against each other. The vertices in the digraph indicate the teams participating in the event. Any two vertices are adjacent to each other, that is they have an arc between them, if there is a game played between them. The head of each directed arc represents the loser in that game and the vertex at the tail of that arc represents the winner in the game. The digraph is said to be strongly connected, if there is a path from each of the nodes to every other node. 4

PAGE 12

1 3 Figure 1. Digraph representing the games played between teams I. 2. 3 and 4 The graph shown above represents the whole scenario being studied here. It is a directed graph or a digraph of the games played between these teams, viz., 1, 2, 3, 4. There is a cycle 1--> 4 --> 3 --> 2 --> 1 in the graph shown above. Thus team 1 beats team 4 and team 4 beats team 3, which beats team 2, which in turn beats team 1. We can see that there exists a path from every node to every other node in the graph above. So it is a strongly connected graph. The whole scenario can be represented using a score matrix. An entry su (1:::; i,j:::; n), is positive, if object i scores over object J those many times. Thus sj, will represent the number of times that object j is found to be superior to object i. Any of the situations such as no comparison made, or no win for object over object j counts as a zero in (i,j)th place in the score matrix. Ties have not been considered for the sake of simplicity. All diagonal entries are zero since an object cannot be compared 5

PAGE 13

with itself. A score matrix corresponding to the graph shown above, is given below. 1 2 3 4 1 0 1 0 1 S=2 2 0 0 0 3 0 1 0 0 4 0 0 2 0 We now proceed to build a generalized mathematical model, which would reflect the characteristics of the problem in mathematical terms, thereby enabling us to compute a sequence of real numbers, which in turn, would give the ranking of these n objects in the data set. 6

PAGE 14

3. Mathematical Model We wish to rank every object based on its performance relative to the performance of the others at the same time. Let us start with some assumptions before proceeding to develop the model. We assume that the outcome of each pairwise comparison is independent of other comparisons. We associate a variable w, known as weight, with object i ( 1 :> i :> 11) for judging its merit. Let X, be the random variable representing the random error in the performance associated with object i. So the value of w, +X, will determine the position of the ith object m the data w, set. The vector w = gives the weight associated with each of the 11 objects. Object i would beat object j, if w, +X,> w1 +Xi. Object j would beat object i, if w, +X,< w1 +Xi. Hence Prob[Object Prob[X, -Xi >wi -w,] scores over Object j ] = Prob[ w, +X,> wi +Xi] = We assume that X1 for all!:> i :> 11, are independently and identically distributed as random normal variables. It follows that X1 -Xi is also a normal random variable. Without loss of generality, we assume that X,-Xi for all pairs in the set, is independently and identically distributed with mean 0 and standard deviation 1. Hence ( x,-Xi)N(O,l), for all 1 :> i,j 9t. The Prob[ X,-Xi> wi-w;] can be calculated using the normal distribution function as follows: 7

PAGE 15

Prob[X,xj > W;w,] We denote this quantity as F( w,-wJ The error function associated with variable X is represented as F(x), where X I F(x)== L -.J'ii/';{dt. Figure 2 gives a plot of F(x). F(X) -4 -2 0 X Figure 2. Plot of Normal C.D.F 2 4 The value of the F(x) can be evaluated using Simpson's rule. The computer program to do this computation is given in Appendix B. 8

PAGE 16

We use this to calculate the probability of the outcome for the problem referred to in Figure 1. First we define the variables W,, for team i to give a measure of the merit of each team. The reader may recall that we referred to them as weights Thus, Team l has weight w,, Team 2 has weight w,, Team 3 has weight wJ, Team 4 has weight w,, Prob[Team l beats Team 2] = Prob[ w, +X, > w2 +X,] = Prob[ X, -X2>w2 -w,] Team 1 beats 4 if w, w, +X, -X,> o, That is, if X,-X, >w4 -w,, Team l loses to 4 if w, w4 +X, -X, < o, That is, if X, -X, < w, -w, Similarly, Team 2 beats Team 3 if X,-X,> w,w2 Team 2 loses to Team 3 if X,-X,< w,-w,, and so on. Therefore, Prob[team l beats team 2 & team 2 beats team 1 twice & team 3 beats team 2 & team 1 beats team 4 & team 4 beats team 3 twice ] Team 1 and team 2 play three games against each other. Since the are order of the win-lose sequence is not important here, there ( 3 ) ways in which team 2 can win any two out of those three games. 9

PAGE 17

= G )Prob [team I beats team 2] Prob [team 2 beats team I twice] Prob [team 3 beats 2] Prob [team 1 beats team 4] Prob [team 4 beats team 3 twice] = Prob [team I beats team 2] Prob [team 2 beats team 1]' Prob [team 1 beats team 4] Prob [team 3 beats team 2] Prob [team 4 beats team 3]2 Having devised a way to compute probabilities of the outcome of comparisons, we proceed further to construct a function of the combined outcome. Since object i is superior to object j on sij occasions, and since the outcome of each comparison is independent of each other, the probability is given by F(w,wj f'. Thus the joint probability that the outcomes are as given by the result in each comparison, is given by the likelihood function, n n so (w)= IliJ 1) F(w, -wj) Here 1J represents the product of the i=l j=l i;t.j s,1>D combinatorial terms corresponding to the outcomes of games between each pair of teams. This likelihood function represents the likelihood of the scenario given by the digraph and its accompanying score matrix Without loss of generality, we put a constraint .2:, w, = 0 on the weights It can be seen that i=l n n (w+ aT)= Illl11 F(w, + a+-(wj +a)) = (w), for some constant a E R. i=l j=l i;t) S;j>O ll Therefore, the constraint I w. = 0 enables the umque estimation ; = 1 l of the weights. We take natural logarithms of the likelihood function, since taking logarithm simplifies the likelihood function considerably. 1 0

PAGE 18

Thus the problem can be defined mathematically as follows. maximize n n s,l n (w)=I1I1TJF(w;-wj) subject to the constraint L,w;=O. i=:d j=! i=! s,1>0 Without loss of generality, the positive constant T) can be substituted by 1, since maximizing T) F( w;-wj) is equivalent to maximizing F( w;-wj ). Since taking the natural logarithm simplifies the function, we redefine the model as " maximize In ( w) = L, L.su In F( w; -w;) subject to the constraint L, W; = 0. i=l j=l i=l s,1>0 We denote r(x)=lnF(x), and !Jl(w)=ln(w). It can be seen that ERF: R (0.1), : R, (0,1) and !p: R, ( -oo,O). Now the above model can be restated as, 11 n n maximize !p ( w) = L, L, sur( w; -wj) subject to the constraint L, W; = 0. i=I i=l i=l i*j :vii >0 We now use this model to construct a likelihood function for the digraph in Figure I. (."ii) = F(w,-w2 ) F( w1 -w4 ) (F(w2 -w,))2 -F( w3 w,) (F(w4 -w3))2 4 subject to the constraint L, w; = 0. i=l Therefore, the problem can be redefined after taking the logarithms as follows: max1m1ze ln((v)) = r(w, -w2)+r(w, -w4)+2r(w2 -w,)+r(w3 -w,)+2r(w4 -w3 ) 4 subject to the constraint L, w; = 0. 1=1 That is equivalent to, maximize 1 1

PAGE 19

4 subject to the constraint 2:, w; = 0. Here w = refers to the 1=1 vector with its ith entry equal to weight assigned to the jth team. We wish to evaluate the 11xl vector w, which would maximize the likelihood function defined above. This means the linear system of 11 equations :W. cp(v) = 6, (1 $ i $ 11), needs to be solved. Newton's method can be used to calculate the vector w iteratively. Suppose we wish to solve V(w) = 6 for the function V:R"-+ R". Newton's method for the n-dimensional case gives, wk+I = w, -C/w, v ( w,) r v ( w,)' where w, is the nxl vector obtained in the kth iteration. Since we took the logarithm of ( w) to make the function linear, we take the first and second order derivatives of cp( w,) to restate the equation as (cp (w,)). Let us call these two matrices of first and second derivatives as matrix A""" and vector respectively, for the sake of simplicity. We have to compute A""" and b,"'' and then solve for the system of 11 equations to get the vector representing the change in the weights in the (k + l)th iteration. This vector is then subtracted from w, to get w,+,. By our previous definition, r(x) = ln(F(x)). Differentiating with respect to x, we have, r'(x)= F'(x) F(x) Differentiating with respect to x once more, we have, 1 2

PAGE 20

4 subject to the constraint L w; = 0. Here w = refers to the l=:d vector with its ith entry equal to weight assigned to the jth team. We wish to evaluate the nxl vector w, which would maximize the likelihood function defined above. This means the linear system of n equations = 6, (1:;; i:;; n), needs to be solved. Newton's aw; method can be used to calculate the vector w iteratively. Suppose we wish to solve V(w) = 6 for the function V:R" -'t R". Newton's method for the n-dimensional case gives, w,+, = w, (a aw, V ( w,) r V ( w,), where ,"V, is the nxl vector obtained in the kth iteration. Since we took the logarithm of 4>(w) to make the function linear, we take the first and second order derivatives of cp(w,) to restate the equation as w,+, = w, (


PAGE 21

:. r"(x)= F"(x)F(x)-\F'(x))2 (F(x )) Hence the first derivative of q:>(w) with respect to w; gtves '( ) '( )f 1< .. < cp w i a wi wi-wj J!;;:sjir wj-wi or -l,jn .. su>O sp>O The ith entry in the vector b,u, represents the first derivative of q:>( w) with respect "(-) _a 'q:> (w) to w,. Taking the second derivatives, we get, q:> w iia w2 Is/'( w,wj) + Is/"( wjw,),Jor IS i,j S n. j=l j=l s;1>0 sp>O "( -) "( -) "( ) "( ) fi I< .. < . qJ w iJq> w 1,.--sur wiw1 s11r w1 w, or _l,J n, l =F J Applying this to the digraph in Figure 1, we get, q:>(w) = ln(rfi(w)) = r(w, -w2)+r(w1 -w4)+2r(w2 -w1)+r(w3 -w2)+2r(w4 -w3 ) a a q:> = r'(w,-w,) + r'(w,-w.)-2r'(w,-w,) w, Differentiating with respect to w" a 'q:> "C l "C l 2 "C l -a 2 = r w, w2 + r w1 -w4 + r w2 w, w, Differentiating with respect to w2 a'q:> "( ) 2"( ) a a = -r w1 -w2 -r w2 -w1 w2 wl Differentiating with respect to a'q:> w,, aw3aw, Differentiating with respect to w., a'q:> a w.a w, 0 = -r"(w,w4 ) Similarly, Differentiating q:>( w) with respect to w2 we get; aq:> =-r'(w1-w2)+2r'(w2 -w,)-r'(w3 -w2 ) aw, 1 3

PAGE 22

and differentiating
PAGE 23

4. Solution to the Model Let us examine how a solution to the model as defined earlier, can be obtained. We prove two results in order to define the special characteristics displayed by the matrix A= cp;(x) corresponding to a digraph D. As seen earlier, all the entries m the matrix A are a function of the second derivative of r(x) = ln(F(x)). Thus the value of r"(x) plays an important role in the structure of A. r"(x)=.!!_F'(x) = F"(x)F(x)-(F'(x))2 dx F(x) (F(x))2 It can be seen that F(x) gives a real value between 0 and 1. Hence (F(x))2 is always a real positive value in (0,1) for any x in (-oo,+oo). Thus the numerator of r"(x) decides the sign of r"(x). The next result shows that the value of F"(x)F(x)-(F'(x))2 and hence r"(x) is always negative. Lemma 1: F"(x)F(x)-(F'(x))2 <0, Proof: As defined earlier, we have F(x)= J / e-'Adt -y 2JT Taking the first and second derivative of F(x) with respect to. x, we have, F '( ) 1 _,,; d F"( ) 1 -2x _,,; x =--e 1 an x =----e 1 = -J2ii -J2ii 2 xF'(x). ..... ( 4.1) Using the equation (4.1), we have, F"(x)F(x)-(F'(x))2 = F'(x){-xF(x)-F'(x)} F'(x) is always positive for any x in (-oo,+oo). F(x) --7 0 as x --7 -oo 1 5 ......... (4.2) (4.3)

PAGE 24

L F'( ) L' l zm x = zm =e 2 =0 x-d<><> x.....;<><> \/ 21C """"' (4.4) We now examine the behavior of the expression (4.2). We have already seen that F'(x) > 0, for all x. Consider h(x) = -xF(x)F'(x). The derivative with respect to. x gives h'(x)=-xF'(x)-F(x)-F"(x). Since we have shown earlier that F"(x) = -xF'(x), the two terms cancel out giving h'(x)=-F(x). This value would always be negative. We can conclude from this that h(x) is a decreasing function. Now, Lim f(x) = Lim-xF(x)-Lim F'(x) = Lim F(x) -0, because Lim F'(x) = 0. X-f-00 X-4-<><> x.....;-<><> ( -Yx) X--t-<><> Applying L'Hospital's rule, we get, Lim F(x) =Lim F'(x) =Lim x' Applying L'Hospital's rule once x.....;-"" (-Yx) x->-"" x-2 x---Jo-oo -!21i more, we get, xLf.'!!, / .;;, = ;I, = 2( -f27i)e .;;, = 4n: F'(x) = 0. Thus {21ie x {2n e we have proved that the limit of h(x) goes to zero as x --7 -=. f(x) is also a decreasing function. Thus the value of h(x) must always be negative. Now, F"(x)F(x)( F'(x))2 = F'(x)h(x ), and F'(x) > O,for all x. Hence the product of h(x), a negative value, with F'(x) will give a negative value. We can conclude from this that F"(x)F(x)-(F'(x))2 would give a negative value for all x. 0 Figures 3, 4 and 5 give plots of the functions r(x), r'(x), r"(x), respectively. 1 6

PAGE 25

r(X) vs X ,_ -----+--3 -2 0 2 3 -4 -8 Figure 3. Plot of r(X) ---------------------r'(X) vs X Y=r'(X) 4 T 2 -!-3 -2 -1 -4 ,_ Y=-X ----------Figure 4. Plot of r'(X) 3 ,---------------------------------r"(X) vs. X ------+---" -3 -2 1 0 3 Y=r"(X) Y=-1 ---------Figure 5. Plot of r"(X) Theorem 1: Matrix A for a strongly connected digraph is a symmetric negative semidefinite matrix with rank n -1. Any 1 7

PAGE 26

vector m the null space of matrix A 1s of the form a 1, where a is a real number. Proof: J' Let A= JW2 ij =-a,, for 1 :s; i :s; n, and Ia)> 0 for all! :s; i :s; n. Thus matrix A is jo:d j:t:.i diagonally dominant. Consider a vector x = x, m the null space of matrix A. Let x, b e x, the max1mum of all x, for 1 :'> j :'> n. Hence Multiplying the kth row of A by .X, we get Since the digraph is strongly connected, and not all the xjs on the right hand side are zero. Using triangular inequality, we have, 1 8

PAGE 27

:.I a"ll x,I:S la"llx11+1a,llx21+ .... +la,,_,llx,_11+1a,,+111 x,+11+ ... +Ia,, llx, I Since x, is the maximum of all x,, where I::; i::; n, we have, Ia,, II x, IS laklllx,l+l a,211x, II+ .... +la,,_,llx, l+la,,+111x, 1+ ... +Ia,, llx, I = lx,l(akl+ .... +a,,_, + akk+1+ .... +a,,). Now consider a set M s {1,2, ..... n} such that if i EM then x, = x,. If M = {1,2, ..... ,n} then we get x, = x,, x2 = x,, .... ,x, = x,. If not, there is at least one element, say j, in {1,2, .... ,n}-M. Since the digraph D is strongly connected, there exists a path from every i EM to J. Therefore, we can find at least one element i EM such that beats j. Similarly, there must exist a path from j to every i EM. This implies that au > 0. But aii =L,aij. So -aux1 :s; ailxi+ ....... +ai1 1xi + aii+Jx;+ ..... +a1nxi = -ai1x1 j=l j,oi Therefore, x1 = x, gives ailxl + a12X2 + .... +aik-Jxk-1 + aik+Jxk+J + ... +ainx11 = ailxk + ai2xk + ... +aik-lxk + aik+Jxk + ... +ainxk. But this would imply that aiixj = aux, = aux,, since au> 0. :. xj = x, =; j EM. This is contradictory to our earlier assumption that jE{!,2, .... ,n}-M. Therefore M={l,2, ..... ,n} must be true. But x, = x, ViE M. That proves that x1 = x,, x2 = x,. .... ,x, = x,, where, x, x = IS in the null space of matrix A. 1 9

PAGE 28

We can see that the product of any row of matrix A with the vector 1 is zero. Thus 1 is in the null space of matrix A. This means the dimension of the null space of matrix A is at least 1. Since we just proved that all entries in any null vector of matrix A are equal, A' i=O, implies i = a1, for some a E R. Therefore, dim(nullspace(A))::; 1. So dim(nullspace(A)) = 1. Thus we can conclude that the rank of matrix A must be n -l. We wish to prove next that matrix A is negative semidefinite. Matrix A would be negative semidefinite if and only if all its eigenvalues are nonpositive. Since the rank of matrix A is n -l, one of its n eigenvalues is zero. Ay = ;l.,y, where A is the eigenvalue of matrix A associated with eigenvector y Therefore, (A-M,)y =0. Now if A > 0, then the subtraction of positive terms from the main diagonal terms of matrix A would make the main diagonal terms of (A-M,) smaller than the sum of the corresponding off-diagonal terms. But (A-M,) is diagonally dominant. Therefore, A > 0 cannot be true. That means the eigenvalues of matrix A cannot be positive. Ay = ;l.,y =;A ::; 0. This would be true only if A is negative semidefinite. Hence matrix A is negative semidefinite with rank n-1 for a strongly connected graph. 20

PAGE 29

5. Computation of the Solution The linear system of n equations can be solved iteratively by Gauss-Seidel method to get a quickly converging solution to the problem. Using Gauss-Seidel method of iterations we get, i-1 'I:' I k J 'I:' I k-1 J b L...Jaijwj L..JaUwj + 1 wl'l = i=' i=i+l where wl'l is the value of the ith I I aii component of the vector r"v obtained in the kth iteration. w1 1 is accepted as a reasonable approximation to the solution if llw1 1 -w1 ,_, 111 1 1 S: TOL, where TOL represents the desired accuracy. llw We are able to get a convergence to a solution using the Gauss Seidel method due to the negative semidefiniteness of matrix A. The program 'GAUSS does this iterative computation. The reader is referred to the program listing in Appendix A. The values of w for the digraph in Figure 1. were as follows: 0.04128163 w= -0.04128163 -0.2806859 0.2806859 Thus teams 4 and 1 were ranked in the first and second position respectively. Team 3 finished last. Next we prove that it is possible to get a maximum likelihood estimate of these paired comparisons for the case of a strongly connected digraph. Theorem 2: There exists a umque maximum to the likelihood function
PAGE 30

Proof: We first prove that there exists a maximum solution to this problem for the case of a strongly connected digraph. Consider the n x 1 vector x. We have, I, x, = 0, because of the i=l constraint defined for the mathematical model defined m chapter 3. Let llxll_=k. Hence lx,l=k, for some i, where 1::>i::>n. We assume that x,;:: 0. The sum of the remaining n -1 variables would add to -k, and therefore their average would be -!7(11 _1 ). So there exists a minimum, say xi, among these n -1 variables, such that <-kl f 1 < "< . xi-/(n-1 ), or -J-n,J*l. :. x,-xi ;:: k-( -Jcn -1)) = n!Y(n -1). Consider r(y) =In( F(y)). Since F(y) is in (0,1) for all y, -oo::>y::>=, the value of r(y) is always negative. Applying Taylor's theorem, we have, r(y) = r(O) + r'(O)y + r"( for some (O,y ). Since Lemma 1 gives r(y)::>r(O)+r'(O)y, for ally. Since r(O) =-In(2), and r'(O)=J%, we have, r(y)<-In(2)+J%y, for all y E R. If the underlying graph is strongly connected, there exists a path from node i to node j and vice versa. Since the total number of nodes are n, the length of this path would be at most n-I. We now consider a sequence of t teams including team j, that are on the path from j to i. If team j beats team z, then t=l. Otherwise, there would be some team, whom team j beats, which in turn beats some other team, which would 'finally beat team i. For the sake of simplicity, we represent team j as } 0 and team i as },. Hence the sequence of the teams on the path 22

PAGE 31

from }0 to }, can be represented as }0 }, ---7 }2 ---7 ... ---7 },_1 ---7 }, This is a path with length t and is a subset of the whole graph. Hence t $ n -l.We now look at the function cp0(:t). n n I, I,sm,r(xm -x,) m=! 1=1 S111f>Ol-;em $s1 1 r(x1 -x1 )+s1 1 r(x. -x. )+ ...... +s. r(x -x. ), 0 1 0 l 2 h h 1:-! ft-1 it since s1 1 0 for all 0 $ q $ t-I, and r(y) < 0 for all -oo $ y $ oo. q q+l $ r(x1 -x1 ) + r(x1 -x. )+ ...... +r(x. x ), 0 I I f2 lt-1 .fr since s1 1 :2: I for all 0 $ q $ t -I. q q+J By using the upper bound on r(x) that we derived earlier, we find that, cp0(i) $ (-In(2)+J%(xj0 -xj,))+ .... +(-lu(2)+J%h., -xj,)). :. cp0(x)$-tlu(2)+ rv:.(x1 -x1 +x. -x + .... +x. -x1.) IJ7;c Q I }j .f2 it-1 I $-nln(2)+ rv:.(x1 -x,. +x1 -x + .... +x. -x1 ). v 7 Jr 0 I I h h-1 I The terms on the right hand side cancel out to give, cp0(x)$-lu(2)+-JY,;'(xj, -xj,) Since xj $ -,.Ycn -I) and I/ k, we have, cp0(i) $ -nln(2)J%k/(n -I)' X-l).This gives the upper bound on cp0(i). We can prove this result in a similar fashion for the case of x, $0. cp0(i) is a linear combination of the natural logarithm of error functions. Error function F(y) is a continuous function with its range in (0,1) for all yE R .. The function ln(v) is a real valued function for any v E R, and it is continuous in the interval (0,1). Therefore, r(y) which equals ln(F(y)) is a continuous function on (0,1). Hence cp0(i) is a continuous function in R;. Let Z be a set where cp0(z) :2: cp0(6), 'dz E Z. Since 6 E Z, Z is a 23

PAGE 32

nonempty set. The upper bound that. was just proved to exist on cp0(x) is also an upper bound for cp0(6). Also cp0(x'):,; cp0(6). for all x';;; Z. We just proved the existence of an upper bound on x for a strongly connected digraph D. Therefore, cp(6)s;;cp(z)s;;-n!n(2)-.JY,;k/(n-l)l:fzEZ. Hence Z is a closed set. Finally, Z is a compact set because of the continuity of cp0 in (0,1). A continuous function attains a minimum and a maximum on a closed and compact set [8]. Using this result we can conclude that cp0 attains a maximum at some point x E Z. Thus we have proved the existence of the maximum solution to the likelihood function corresponding to a strongly connected digraph. We now prove that there IS a umque maximum solution. We know that the function cp0(x) is twice differentiable. Let there be a maximum solution to the problem at two vector points a and b. Therefore, cp;(a) = cp;(E) = 6. Using Taylor's theorem, we have, cp;(E)=cp;(a)+cp;(i')(E-a).forsomecsuchthat c=aa+(!-a)b ....... (5.!) for some a E (0, !). cp;(E)-cp;(a) = 6-6 = 6 = cp;(c)(E-a) ................. (5.2) Therefore, (E -a) is in the null space of cp;(c). Since the sum of all entries in a and in b is zero as a result of the constraint defined earlier, a and b are orthogonal to i. Hence b-a is orthogonal to 1. Therefore, F(E-a)=O. Using the result in theorem 1, we can denote b-a in terms of 1 as b-a = {31, where f3 is a scalar. :. F(f31) = 0. :. f3(F1) = f3 n = 0 f3 = 0. 24

PAGE 33

:. [,-a= 6 =>[,=a. This proves the uniqueness of the max1mum solution to this problem. 25

PAGE 34

6. Applications We have seen that this method can be used for ranking teams in an incomplete or a complete tournament. Although the case of ties has not been considered here, it could be shown that the same approach would give us a solution for a strongly connected graph, if the tied teams are awarded Yz a point each. In general, this method of ranking can be used for ranking a number of objects in a sample. If we are considering different varieties of chickens for their quality, we can make pairwise comparisons depending upon their relative merits in different attributes of interest. Then, with the probabilistic approach used above, we can rank them accordingly. Other potential uses would be for pairwise comparisons in areas such as sensory testing, especially taste testing, to consumer tests, personnel rating, and in general to the study of preference and choice behavior. The methodology presented in this paper was applied to find the ranking of the teams participating in the preplayoff games m the 1993-94 NFL football season. The rankings of the 28 participating teams are given in Table 1 in Appendix A. Table 2 in Appendix A gives the results of the playoff games and also of the Super Bowl. It can be seen from Table 2 playoff results that Kansas Chiefs were indeed better than Houston Oilers as suggested by the rankings, since they beat the Houston m the playoff games. Table 3 in Appendix A gives the schedule of the pre-playoff games for both of these teams. We can see that 26

PAGE 35

although Kansas Chiefs had 11 wins to their credit as against the 12 wins of Houston Oilers, they played against better teams. They also won against some of the better teams and thus their being ranked ahead of Houston Oilers has some justification. Nine out of the eleven playoff games gave results which agree with the ranking of teams given in Table l. A similar approach has been used by Jech [6]. That approach gives a probabilistic estimate of how much better a team is than some other team. Hence decision making is done by using the transitive relationship, that is, team A better than team B and team B better than team C implies that team A is better than team C. 27

PAGE 36

Appendix A Table 1: Results of NFL Pre-Playoff Games (1993-94 Season) Name of the Team Wins Losses Weight Rank Buffalo Bills AFC 1 2 4 0. 7750839 1 Dallas Cowboys NFC 1 2 4 0.6917094 2 Kansas City Chiefs AFC 1 1 5 0.6418724 3 Houston Oilers AFC 1 2 4 0.5756177 4 New York Giants NFC 1 1 5 0.4976263 5 Los Angeles Raiders AFC 1 0 6 0.3867019 6 Green Bay Packers NFC 9 7 0.2879164 7 Minnesota Vikings NFC 9 7 0.2781058 8 Denver Broncos AFC 9 7 0.2324088 9 Detroit Lions NFC 1 0 6 0.2276092 1 0 San Francisco 49ers NFC 1 0 6 0.2203107 1 1 Miami DolPhins AFC 9 7 0.1929059 1 2 San Diego Chargers AFC 8 8 0.1433809 1 3 Philadelphia Eagles NFC 8 8 0.08746736 1 4 Pittsburgh Steelers AFC 9 7 0.06787151 1 5 New York Jets AFC 8 8 0.00514422 1 6 Chicago Bears NFC 7 9 -0.0464035 1 7 New Orleans Saints NFC 8 8 -0.0555461 1 8 Arizona Cardinals NFC 7 9 -0.1798368 1 9 Seattle Seahawks AFC 6 1 0 -0.2970637 20 Cleveland Browns AFC 7 9 -0.318633 2 1 Tampa Bay Bucaneers NFC 5 1 1 -0.356672 22 Atlanta Falcons NFC 6 1 0 -0.4439348 23 Los Angeles Rams NFC 5 1 1 -0.5727481 24 New England Patriots AFC 5 1 1 -0.6100564 25 Washington Redskins NFC 4 1 2 -0.654241 26 Indianapolis Colts AFC 4 1 2 -0.7633866 27 Cincinnati Bengals AFC 3 1 3 -1.00721 28 Source: 1993-94 NFL Season 28

PAGE 37

Table 2. Results of NFL Playoff Games ( 1993-94 Season) Lower Ranked Team Higher Ranked Team Denver Broncos (AFC) LA Raiders (AFC)* Pittsburoh Steelers (AFC) Kansas City Chiefs(AFCl* Houston Oilers (AFC) Kansas City Chiefs(AFC)* LA Raiders (AFC) Buffalo Bills (AFC)* Kansas City Chiefs (AFC) Buffalo Bills (AFC)* Minnesota Vikings (NFC) New York Giants (NFC)* Detroit Lions Green Bay Packers* San Francisco 49ers (NFC)* New York Giants (NFC) Green Bay Packers (NFC) Dallas Cowboys NFC)* San Francisco 49ers (NFC) Dallas Cowboys NFC)* Dallas Cowboys(NFC)* Buffalo Bills (AFC) Note: indicates the winning team. Table 3. Pre-Playoff NFL Schedule (1993-94 Season) Team :Kansas City Chiefs Team: Houston Oilers Record: 11 Wins Rank: 3 Played Against Tampa Bav Sue. Houston Oilers Denver Broncos L.A. Raiders Cincinnati Bengals San Dieoo Chrgrs Miami Dolphins Green Bav Packrs L.A. Raiders Chicaoo Bears Buffalo Bills Seattle Seahawks Denver Broncos San Diego Chrgrs Minnesota Vikings Seattle Seahawks 5 Losses Rank Result 22 w 4 L 9 w 6 w 28 w 1 3 w 1 2 L 7 w 6 w 1 7 L 1 w 20 w 9 L 1 3 w 8 L 20 w Source: 1993-94 NFL Season 29 Record: 12 Wins Rank:4 Played Against New Or. Saints Kan.City Chiefs San Diego Chrgrs L.A. Rams Buffalo Bills New Eng. Pats Cincinnati Bengals Seattle Seahawks Cincinnati Bengals Cleveland Browns Pitts. Steelers Atlanta Falcons Cleveland Browns Pitts. Steelers San Fran. 49ers N.Y. Jets 4 Losses Rank Result 1 8 L 3 w 1 3 L 24 L 1 L 25 w 28 w 20 w 28 w 2 1 w 1 5 w 23 w 21 w 1 5 w 1 1 w 1 6 W'

PAGE 38

Appendix B !******************************************************************** The main program calls the function input() to read the data. The solution is obtained by Newton's method. The Matrices used in Newton's method are initialized using the function initialize(). The function normal_area() is called to calculate normal area function, r(X), r'(X) and r"(X). The computed matrices Anxn and bnxl are checked by the function check(). The function gauss_seidel() is called to do Gauss-Seidel iterations. The convergence of the solution is checked by computing the L2 norm, for which the function 12_norm() is called. *******************************************************************/ #include #include FILE *in, *out; void input(); void initialize(); void normal _area(); void 12_norm(); void check(); void check_sum(); void gauss_seidel(); double val, x, q, PI ; double r ,old_total_r ,new _total_r ,r 1 ,r2,a[ 100 ][1 OO],b [ 1 00], old_x[1 OO],new _x[l OO],norm, gaussold_x[l 00], gaussnew _x[ 1 00]; int no_ var,no_games,iteration,incidence[ 1 00] [ 1 OO],row _sum,column_sum, main_counter=O; void main() { int k,i,j; 30

PAGE 39

PI = 3.14159265358979; while(main_counter == 0) { printf("\nlteration %d\n" ,iteration); if(iteration == 0) { ) in = fopen("in.dat", "r"); out = fopen("out.dat" ,"w"); q = sqrt(2.0*PI); input(); for(i=O;iO) { x = fabs(old_x[i)-old_xU)); val = old_x[i]-old_xUJ; normal_area(); new _total_r += r*incidence[iJU]; a[i]UJ += -incidence[iJU]*r2; aUJ[i] += -incidence[iJU]*r2; a[i][i) += incidence[i]UJ*r2; aUJUJ += incidence[i]UJ*r2; 3 1

PAGE 40

} b[i] += incidence[iJU]*rl; bUJ += -incidence[iJU]*rl; if(fabs(new _total_r-old_total_r )< l.Oe-3) { } main_counter= 1; printf("\n The Solution has converged to its maximum\n"); printf("\n o/od The value of the likelihood function=o/olf", i teration,new _total_r); for(i=O;i
PAGE 41

void input() !**************************************************************** Program to read the input file The program also validates the data. ******************************************************************/ { int k,i,j,sum; fscanf(in, "%d" ,&no_ var); for(k=O;k
PAGE 42

for(j=O;j
PAGE 43

int i,no_inter,j; double a,u ,h,integral,derive_l ,deri ve_2; static int count; integral = derive_! = derive_2 =0.0; no_inter=O; if (x > l.Oe-6) { h = sqrt(sqrt((6.0e-10*q))); h I= sqrt(sqrt(x)); if(h > x) { } else { } no_inter = 2; no_inter= (int)(x/h); if(no_inter % 2 != 0) ( no_inter+= 1; else { no_inter += 2; } h = x/(double)(no_inter); integral = 1.0/q; integral += (exp(-(x*x/2.0)))/q; for(i=l; i
PAGE 44

{ integral += 0.5; } else { integral = 0.5 integral; } else { integral = 0.5; } printf ("\n The area under the normal curve for x= o/oe IS o/olf\n", val, integral); r = log(integral); derive_! = ( 1.0/q)*exp( -l.O*val *val/2.0); derive_2 = -l.O*derive_l *val; r1 = derive_l!integral; r2 = (derive_2*integral derive_l *derive_ I )/(integral*integral); return; !******************************************************************** Program to compute Lz norm for an iteration. *********************************************************************! void l2_norm() { } int k; double sum =0.0; norm = 0.0; for(k=O;k
PAGE 45

/************************************************************ This program checks if the sum of all entries inside box 1 is zero. *************************************************************/ void check() { } int k,i,j; double sum=O.O; for(k=O; k
PAGE 46

} } for(j=O;j inf_norm) { inf_norm = fabs(gaussnew _x[i]); } if( old_inf_norm < new _inf_norm) { old_inf_norm=new _inf_norm; } gaussold_x[i] = gaussnew_x[i]; do_ c heck=old_i nf _n orm/inf _norm; if(do_check
PAGE 47

!****************************************************************** Program to identify two highest ranked teams in each of the two football conferences ********************************************************************! void check_sum() int i,imax=O,second_imax=O,league=l; double max=O.O,second_max=O.O, var_sum=O.O; for(i=O; imax) { } second_max = max; max = new _x[i]; second_imax=imax; imax = i; else if(new _x(i] > second_max) { } second_max = new _x[i]; second_imax = 1; printf("\n The first ranked team for league %d is %d", league,imax); printf("\n The weight associated with team %d is %lf\n", imax,max); printf("\n The second ranked team for league %d is %d", league,second_imax); printf("\n The weight associated with team %d is %1f\n", second_imax, second_max); league += 1; imax=second_imax=O; max=second_max=O .0; for(i=no_ var/2;imax) 39

PAGE 48

} { } second_max=max; max=new_x[i]; imax=i; else if( new _x[i] > second_max) { } second_max = new _x[i]; second imax =t; printf("\n The first ranked team for league o/od is o/od", league,imax); printf("\n The weight associated with team o/od is o/olf\n" ,imax,max); } printf("\n The second ranked team in league o/od is o/od", league, second_imax); printf("\n The weight associated with team o/od is o/olf\n", second_imax, second_max); printf("\n The sum of all variables = %If", var_sum); return; 40

PAGE 49

References 1. Burden, R. A., Faires, J. D. (1989). Numerical Analysis. PWS KENT Publishing Company. 2. David, H. A. (1988). The method of paired comparisons. Oxford University Press, New York. 3. Fechner, G. T. (1860). Elemente der Psychophysik. Leipzig: Breitkopf und Hartel. 4. Fechner, G. T. (1965). Elements of Psychophysics, Vol. 1, trans!. H. E. Adler. New York: Holt, Rinehart & Winston. 5. Genest, C., LaPointe, F., Drury, S.W. (1993). On a proposal of Jensen for the analysis of ordinal pairwise preferences using Saaty's eigenvector scaling method. Journal of Mathematical Psychology 37, 002-038(1993). 6. Jech, T., (1983). The Ranking of Incomplete Tournaments: A Mathematician's Guide to Popular Sports. American Mathematical Monthly Vol. 90, No. 4, 246-264, April 1993. 7. Noether, G.E., (1960). Remarks about a paired Comparison Model. Psychometrika, Vol 25, No.4. 8. Rosenlicht, M., (1968) Introduction to Analysis. Dover Publications, Inc., New York. 9. Saaty, T.L., (1977). A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology, 15, 234-281. 10. Saaty, T.L., (1980). The analytic hierarchy process. New York, NY : McGraw-Hill. 4 1

PAGE 50

11. Saaty, T.L., (1986). Axiomatic foundation of the analytic hierarchy process. Management Science, 32, 841-855. 12. Saaty, T.L., (1990). Eigenvector and logarithmic least squares. European Journal of Operational Research, 48, 156-160. 13. Thurstone, L. L. (1927a). A Law of comparative judgment. Psycho!. Rev. 34, 273-86. 14. Thurstone, L. L. (1927b). Psychophysical Analysis. Amer. J. Psycho!. 38, 368-89. 15. Thurstone, L. L. (1927c). The method of paired comparisons for social values. J. Abnorm. Soc. Psycho!. 21, 384-400. 42