Citation
Laser statistics through atmospheric turbulence

Material Information

Title:
Laser statistics through atmospheric turbulence: a simulation approach
Creator:
Yan, Tze Kuan
Place of Publication:
Denver, CO
Publisher:
University of Colorado Denver
Publication Date:
Language:
English
Physical Description:
vi, 66 leaves : illustrations ; 29 cm

Subjects

Subjects / Keywords:
Laser beams -- Atmospheric effects -- Simulation methods (lcsh)
Genre:
bibliography (marcgt)
theses (marcgt)
non-fiction (marcgt)

Notes

Bibliography:
Includes bibliographical references.
Thesis:
Submitted in partial fulfillment of the requirements for the degree, Master of Science, Department of Electrical Engineering
Statement of Responsibility:
by Tze Kuan Yan.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
24678883 ( OCLC )
ocm24678883
Classification:
LD1190.E54 1991m .Y36 ( lcc )

Full Text
LASER STATISTICS THROUGH ATMOSPHERIC TURBULENCE:
A SIMULATION APPROACH
by
Tze Kuan Yan
B.S., University of Colorado, 1989
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado in partial fulfillment
of the requirements for the degree of
Master of Science
Department of Electrical Engineering
1991


This thesis for the Master of Science
degree by
Tze Kuan Yan
has been approved for the
Department of
Electrical Engineering
by
Arun K. Majumdar
Joe E. Thomas
Marvin F. Anderson

Date


ABSTRACT
Yan, Tze Kuan (M.S. Electrical Engineering)
Laser Statistics Through Atmospheric Turbulence: A
Simulation Approach
Thesis directed by Professor Arun K. Majumdar
This thesis is based on the maximum entropy method of
constructing an unknown probability density function (PDF)
from its high-order moments. The maximum entropy method is
very useful in data reduction and in solving systems of
equations. This paper looks at the classical solution of
the moment problem and compares it with the new algorithm
developed here, which is used to calculate the Lagrange
multipliers. This method allows us to see how the Lagrange
multipliers are actually related to the moments of the
distribution. The new algorithm applies to the infinite
domain, the semi-infinite domain, and the finite domain.
The form and content of this abstract are approved. I
recommend its publication.
Signed
Arun K. Majumdar


DEDICATION
I wish to acknowledge the following people who made
this study possible. My mentor, Dr. Arun K. Majumdar, of the
Faculty of the Graduate School, Department of Electrical
Engineering, University of Colorado at Denver, who was the
source of my inspiration for this study. Dr. James Baker-Jarvis
of NIST, who helped me to discover the new algorithm. Last,
but not least, I am very grateful to my wife and friends who
assisted in typing this paper.


CONTENTS
CHAPTER
1. INTRODUCTION .......................................... 1
2. LASER STATISTICS IN ATMOSPHERIC TURBULENCE ............ 4
   Definition of Atmospheric Turbulence .................. 4
   Variance of Intensity Fluctuations .................... 6
   Relationship between PDF and Higher Order Moments ..... 7
3. THE INFORMATION ENTROPY .............................. 10
   Definition of Entropy ................................ 10
   Example: Information Content of a Morse Code ......... 10
   The Linear Problem with One Constraint ............... 12
   The Linear Problem with Many Constraints ............. 14
4. THE MOMENT PROBLEM ................................... 16
   History of the Moment Problem ........................ 16
   Bayesian Theory ...................................... 18
   Derivation of Maximum Entropy Methods ................ 21
   Moments and Cumulants ................................ 27
   Analysis of the MAXENT Algorithm ..................... 30
   Application in Infinite Domain ....................... 32
   Application in Semi-Infinite Domain .................. 35
   Application in Finite Domain ......................... 40
   Constructing PDF from Experimental Data .............. 49
5. DISCUSSION AND CONCLUSION ............................ 54
APPENDIX ................................................. 56
BIBLIOGRAPHY ............................................. 65


CHAPTER 1
INTRODUCTION
Laser irradiance fluctuations in real atmospheric
turbulence, as well as in laboratory-simulated turbulence, have
been studied for a long time. Nevertheless, it still remains a
puzzle for us to understand how the statistics of irradiance
fluctuations behave in a turbulent medium. In order to
understand the nature of irradiance fluctuations, we have to
know the form of the probability density function (PDF), because
the probability density function is used to describe the statistics
of a fluctuating phenomenon like the scintillation caused by
atmospheric turbulence [10]. There are many ways we can
calculate the probability density function. One of these
methods is called polynomial expansion. It is a very popular
method because it has been examined and studied by many
mathematicians and scientists for more than a century. In
recent years, a method called maximum entropy has been under
intensive study. For problems outside the traditional domain of
thermodynamics, the conceptual foundations of maximum
entropy have also been debated for some time. This method
becomes more meaningful in practical applications because


computers have become far more powerful than before.
From a practical point of view, the maximum entropy method
is more useful because it uses moments to describe the
probability distribution, and it is very easy for us to obtain the
moments of the distribution from experimental data.
This thesis uses the technique of maximum entropy
to reconstruct the probability density function from a finite
set of moments. It is assumed that we know the physical signal
data, from which we can calculate the lower-order and
higher-order moments. The method of maximum entropy is a
result of Bayes' theorem, with P(Ai/A) being the probability
calculated by maximum entropy (MAXENT); we will examine
P(Ai/A) carefully later. The use of maximum entropy allows
us to obtain and update the probability distributions as more
information becomes available.
In 1948, Shannon [1] introduced the idea of maximum
entropy to describe the information content of a signal. This
idea was used by Jaynes [2] to analyze statistical physics in
1957. Since then, there have been many outstanding applications
of the method of maximum entropy. Mead and
Papanicolaou [3] used MAXENT to solve integral equations.
James Baker-Jarvis [4] used it to solve differential equations
with boundary-condition constraints. Arun K.
Majumdar has studied the general inverse scattering problem
in the context of maximum entropy [5].
The uncertainty of a probability distribution can be
measured by the information entropy. By maximizing the
entropy subject to the available information, we can obtain the
probability distribution function that is least biased while
remaining consistent with the knowledge of the data. So, a
probability distribution for the current state of a system
consistent with prior information can be obtained by MAXENT [6].
There are some restrictions that must be satisfied in
Shannon's derivation of the theory of maximum entropy [1]:
1. The uncertainty of a probability distribution must be
positive.
2. For equally likely outcomes (Pi = 1/N), it must be an
increasing function of the number of degrees of freedom N.
3. Independent sources of uncertainty must be additive.
There are many areas of research that require the
estimation of a probability density function (PDF) from the
moments of a distribution, such as mine detection problems,
moment problems, and optical propagation problems. It is crucial
in estimating the probability of successful detection in
mine detection problems, where the probability of detection is
quantified by a PDF for the noise level. Some of the research in
the past has assumed that a sinusoidal signal is immersed in
noise with a Gaussian distribution function. This is not always
true, because Gaussian noise is not the only kind of noise that
exists. This is why we need a method to calculate the
probability density function from the moments that are
obtained from the available data. The maximum entropy
techniques for the moment problem that we are developing in
this paper are very useful in these kinds of problems.


CHAPTER 2
LASER STATISTICS IN ATMOSPHERIC TURBULENCE
Definition of Atmospheric Turbulence
Most of the energy of an optical wave is redistributed
in atmospheric turbulence through small-scale intensity
scintillations. The log-intensity variance, σ_lnI^2, is used to
describe the fluctuations of the logarithm of the intensity of
the wave. The variance usually increases with the strength of
the turbulence. The turbulence saturates when the log-intensity
variance is equal to 2.4. We can measure the intensity
scintillations in the saturation region through an aperture as
small as √(λL) in diameter, where λ is the optical wavelength
and L is the propagation path length. The optical wavelength
varies inversely as the variance of the log-intensity
scintillations.
Because the atmosphere is non-stationary, random
fluctuations in density and refractive index are usually caused
by atmospheric turbulence, and these random fluctuations
change from place to place and with time. Since the variation
of pressure is relatively small, we define the turbulence mainly
by the fluctuations of temperature. In fact, the fluctuation of
temperature is the prime cause of the fluctuation of the
refractive index. The refractive-index fluctuation is defined as

δn = −79 P δT / [(θ − 1) T^2]     (2.1)

where T is the temperature, P is the pressure, and θ is the
ratio of specific heats, equal to 1.4 for air.
Velocity fluctuation is another factor; it comes from
large-scale phenomena that affect the equilibrium of the
fluctuations. We define the large-scale size L0 as the outer
scale of turbulence, where L0 = 2π/K0 and K0 is a wave number.
Likewise, l0 is the inner scale of turbulence, where
l0 = 2π/Km and Km is also a wave number.
The form of the spectrum is the same for velocity,
temperature, and refractive index. It is defined by Tatarskii
as [10]

Φn(K) = 0.033 Cn^2 K^(−11/3) exp(−K^2/Km^2)     (2.2)

where K is the spatial wave number and Km = 5.92/l0.
For small scales the equation is a good estimate if K >= Km;
for large scales the equation is a poor estimate if K <= K0.


L0, l0, and Cn^2 are the three main parameters that
describe atmospheric turbulence. Cn^2 is defined as the
refractive-index structure parameter; the strength of the
refractive-index fluctuations is measured by Cn^2, where
10^−17 <= Cn^2 <= 10^−13 in units of m^(−2/3). The outer scale
(large size) of atmospheric turbulence is L0, above which energy
is introduced into the turbulence; it varies with the strength
of the turbulence and is on the order of 100 meters above the
ground. The inner scale (small size) of atmospheric turbulence
is l0, below which viscous dissipation dominates the spectrum;
turbulence energy is usually dissipated at scales around 1 mm
above the ground.
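To make Equation (2.2) concrete, the following short Mathematica sketch (the language used in the Appendix programs) evaluates and plots the spectrum. The values chosen here for Cn^2 and l0 are illustrative assumptions, not measured data.

(* Sketch: evaluate the Tatarskii spectrum, Equation (2.2). *)
cn2 = 10^-14;       (* assumed structure parameter, m^(-2/3) *)
lo = 0.005;         (* assumed inner scale, 5 mm *)
km = 5.92/lo;       (* Km = 5.92/l0 *)
phi[kk_] := 0.033 cn2 kk^(-11/3) Exp[-kk^2/km^2]
Plot[phi[kk], {kk, 10, 2000}, AxesLabel -> {"K", "Phi[K]"}]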
Variance of Intensity Fluctuations
The variance of the log-intensity, σ_lnI^2, is usually
measured in the experiment, from which we can calculate the
log-amplitude variance

σ_χ^2 = σ_lnI^2 / 4     (2.3)

σ_lnI^2 has to be small for the spherical-wave and beam-wave
approximations to be valid.
The value of σ_lnI^2 can be larger than 2.5 if a high value
of Cn^2 is obtained on a 500-meter horizontal propagation
path close to the ground. It is very difficult for
σ_lnI^2 to be larger than 2.5 on vertical propagation paths because
Cn^2 is constantly changing along the propagation path. From the


experimental results, we can show that the spherical wave
has a smaller variance than the other kinds of waves.
Let W0 represent the radius of the optical beam at the
transmitter and K the optical wave number, 2π/λ. If
L << K W0^2/2, then σ_lnI^2 starts out equal to the variance of a
plane wave. On the other hand, if L >> K W0^2/2, the variance
will be approximately equal to the value for a spherical wave
[10].
Relations between the PDF and the High Order Moments
To study phenomena like laser propagation through the
turbulent atmosphere, it is necessary to know the probability
distribution function. However, it is difficult to measure the PDF
directly because of many technical problems such as random spikes
and saturation of the detector. As a result, many models have
been proposed for the probability density function of
laser scintillation, such as the K distribution, I-K distribution,
log-normal, exponential Bessel, and Furutsu distributions [9]. No
matter which model is used to calculate the PDF, we need to
know the higher-order moments of the random variable. The
characteristic function of the random variable contains a
series of higher-order moments, which can be used to calculate
the probability distribution. The higher-order moments contain
the information needed to reconstruct the tail of the probability
density function. The information concerning the width and the lack
of symmetry of the distribution can be obtained from the even-


and odd-order moments, if we use skewness to measure the
asymmetric nature of the distribution and flatness to measure
the weight in the tails. It is very easy to obtain this information
for a Gaussian-type distribution, but that is not the case when
dealing with non-Gaussian distributions. In order to
understand the physical nature, we need to know the moments
beyond 4th order.
skewness:       r3 = <x^3>     (2.4)
excess:         r4 = <x^4> − 3<x^2>^2     (2.5)
superskewness:  r5 = <x^5> − 10<x^3><x^2>     (2.6)
super excess:   r6 = <x^6> − 15<x^2>^3     (2.7)
hyperskewness:  r7 = <x^7> − 105<x^3><x^2>^2     (2.8)
hyperexcess:    r8 = <x^8> − 105<x^2>^4     (2.9)

For a Gaussian probability distribution, the odd-order
central moments vanish, so r3, r5, and r7 are zero, and we
obtain r4 = r6 = r8 = 0 according to the above definitions.
Remember, a Gaussian distribution is described by the first two
moments only [5].


In this paper, it is assumed that we are able to obtain all
of the high-order moments that we need. Therefore, we can use
this information to reconstruct the probability density
function.
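As a concrete illustration of definitions (2.4)-(2.9), the following Mathematica sketch estimates r3 through r8 from a sample of zero-mean data. The Gaussian test sample is an assumption chosen so that all six statistics should come out near zero; measured irradiance data would be used in its place.

(* Sketch: estimate r3..r8 of (2.4)-(2.9) from sample data. *)
data = RandomVariate[NormalDistribution[0, 1], 100000];
m[n_] := Mean[data^n]
r3 = m[3]
r4 = m[4] - 3 m[2]^2
r5 = m[5] - 10 m[3] m[2]
r6 = m[6] - 15 m[2]^3
r7 = m[7] - 105 m[3] m[2]^2
r8 = m[8] - 105 m[2]^4
(* for Gaussian data all six values should be close to zero *)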


CHAPTER 3
THE INFORMATION ENTROPY
Definition of Entropy
Let us use a very simple example to induce the result,
which will help us to understand the formalism of entropy.
Consider the problem of tossing two coins, having R1 and R2
possible outcomes respectively; the total number of possible
outcomes is R = R1 R2, or in this particular case, 2*2 = 4.
Nevertheless, the information contained in the two coins should
be additive:

I(R1, R2) = I(R1) + I(R2) = Ln R     (3.1)

So we know the information must satisfy a logarithmic type
of function.
Example: Information Content of a Morse Code
Consider a message composed of Morse-code symbols, that is, a string of dots and dashes.


Assume
N1 = number of dashes
N2 = number of dots
therefore
N = N1 + N2
(total number of dashes and dots). The number of distinguishable
messages is

R = N! / (N1! N2!)     (3.2)

I = Ln R = Ln(N!) − Ln(N1!) − Ln(N2!)     (3.3)

Stirling's approximation:

N! = √(2πN) N^N e^(−N)     (3.4)

Ln N! = Ln √(2π) + 0.5 Ln N + N Ln N − N     (3.5)

and since

Ln √(2π) + 0.5 Ln N << N Ln N − N     (3.6)

Ln N! ≈ N(Ln N − 1)     (3.7)

therefore

I ≈ N(Ln N − 1) − N1(Ln N1 − 1) − N2(Ln N2 − 1)     (3.8)

Remember, the normalized information is

i = I/N     (3.9)
  = Ln N − 1 − (N1/N) Ln N1 + N1/N − (N2/N) Ln N2 + N2/N
  = Ln N − (N1/N) Ln N1 − (N2/N) Ln N2
  = −(N1/N) Ln N1 − (N2/N) Ln N2 + (N1/N) Ln N + (N2/N) Ln N     (3.10)

As a result,

i = −(N1/N) Ln(N1/N) − (N2/N) Ln(N2/N)     (3.11)

Letting Pi = Ni/N, we can generalize the information entropy
formula:

S = −Σ_{i=1}^{k} Pi Ln Pi     (3.12)

An estimate of Pi can be obtained if we maximize the entropy [7].
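A minimal numerical check of (3.11)-(3.12) in Mathematica; the counts of dashes and dots are made-up example values.

(* Sketch: normalized information of a dash/dot message. *)
n1 = 30; n2 = 70; n = n1 + n2;
p = {n1/n, n2/n};
N[-Total[p Log[p]]]          (* entropy formula (3.12), about 0.611 *)
N[Log[n!/(n1! n2!)]/n]       (* exact count via (3.3) and (3.9), about 0.597 *)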
The Linear Problem with One Constraint
Given

S = −Σ_{i=1}^{k} Pi Ln Pi     (3.13)

where

Σ_{i=1}^{k} Pi = 1     (3.14)

is the normalization constraint for k points, we form

S' = −Σ_{i=1}^{k} Pi Ln Pi − λ(Σ_{i=1}^{k} Pi − 1)     (3.15)

Taking the variation of S' with respect to Pi,

δS'/δPi = −[Ln Pi + 1 + λ]     (3.16)

and setting Equation (3.16) equal to zero,

Ln Pi + 1 + λ = 0     (3.17)

therefore

Pi = exp(−λ − 1)     (3.18)

To find λ we need to use the constraint condition:

Σ_i Pi = k exp(−λ − 1) = 1     (3.19)

where

exp(−λ − 1) = 1/k = Pi     (3.20)

or

Pi = 1/k     (3.21)
The Linear Problem with Many Constraints

The constraints from an experiment are usually given in
terms of expectation values:

<f_k> = Σ_i Pi f_ki     (3.22)

where

Σ_{i=1}^{k} Pi = 1     (3.23)

therefore

S' = −Σ_i Pi Ln Pi − λ0(Σ_i Pi − 1) − Σ_k λ_k (Σ_i f_ki Pi − <f_k>)     (3.24)

δS'/δPi = −[Ln Pi + 1 + λ0 + Σ_k λ_k f_ki]     (3.25)

Setting Equation (3.25) equal to zero and absorbing the constant
1 into λ0, we obtain

Pi = exp(−λ0 − Σ_k λ_k f_ki)     (3.26)

From the normalization constraint condition, we obtain

exp(λ0) = Σ_i exp(−Σ_k λ_k f_ki)     (3.27)

then the probability is

Pi = exp(−Σ_k λ_k f_ki) / Σ_i exp(−Σ_k λ_k f_ki)     (3.28)
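The following Mathematica sketch carries out (3.26)-(3.28) for a small discrete example with a single constraint. The six points f_i = i and the target mean <f> = 2 are assumed values; FindRoot supplies the Lagrange multiplier numerically.

(* Sketch: maximum entropy distribution (3.28) on six points with one mean constraint. *)
f = Range[6];
z[lam_] := Total[Exp[-lam f]]                  (* normalization sum, as in (3.27) *)
mean[lam_] := Total[f Exp[-lam f]]/z[lam]      (* constraint of the form (3.22) *)
sol = FindRoot[mean[lam] == 2, {lam, 0.5}]
p = Exp[-lam f]/z[lam] /. sol                  (* probabilities (3.28) *)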


CHAPTER 4
THE MOMENT PROBLEM
History of the Moment Problem
From the classical point of view, a function associated
with the moment problem can be estimated from the
moments of the function. Maximum entropy is one of the
techniques used for this type of problem. In order to obtain
the least biased estimate of a probability density function,
we need to maximize the entropy of the probability distribution
while also honoring the constraints, that is, the prior information.
This can be done with the maximum entropy method by the
use of Lagrange multipliers. It is not an easy job to calculate
the Lagrange multipliers because they always involve a
system of simultaneous non-linear equations. Since solutions of
non-linear equations are not unique, the calculation always gives
multiple solutions to the same system of equations. Therefore, this
kind of process requires us to understand the physical behavior
of the system and to generate a method of guessing
the initial conditions. Such a step is very important because
Lagrange multipliers are just a set of numbers from non-
linear mathematical equations, and they have no meaning by


themselves. As a result, we need to understand the
relationship between the Lagrange multipliers and the
physical nature of the problem. In this paper, the main objective
is to find an algorithm that can be used to calculate the Lagrange
multipliers explicitly.
Let us define a positive, square-integrable function v(x), so
that the moments of the function v(x) are given by

<x^n> = ∫ v(x) x^n dx     (4.1)

where n = 0, 1, 2, 3, 4, ...
Ideally we should have an infinite number of moments, but in
real life only finitely many moments can be considered. As mentioned
above, such a system of equations may not be unique; it can give
multiple solutions to the same system of equations. Therefore,
we have to introduce another constraint: the function
must decay at least as fast as exp(−|x|) for large x.
Maximum entropy is a very powerful alternative method
for solving the moment problem, and it has been used successfully
in optics problems, data reduction applications, and physics
problems.
In this paper, I am going to take a closer look at the
classical method of treating moment problems by
solving a system of simultaneous non-linear equations, a job
that is not always easy, and at the new iterative technique
developed in this paper, which is easier to handle. I also

compare the moment expansion technique with the cumulant
expansion technique.
Bayesian Theorem
Suppose that an experiment is repeated a large number N
of times, resulting in: both A and B, A ∩ B, n11 times; A and
not B, A ∩ B̄, n21 times; B and not A, Ā ∩ B, n12 times; and
neither A nor B, n22 times. These results are presented in
Table 4.1.

Table 4.1 Table for Events A and B

        A      Ā
B      n11    n12
B̄      n21    n22

Note that N = n11 + n12 + n21 + n22.
Then it follows that:

P(A) = (n11 + n21) / (n11 + n12 + n21 + n22)     (4.2)

P(B) = (n11 + n12) / (n11 + n12 + n21 + n22)     (4.3)

P(A/B) = n11 / (n11 + n12)     (4.4)


P(B/A) = n11 / (n11 + n21)     (4.5)

P(B ∩ A) = n11 / (n11 + n12 + n21 + n22)     (4.6)

P(A/B) = P(A ∩ B) / P(B) = n11 / (n11 + n12)     (4.7)

or

P(AB/C) = P(A/BC) P(B/C)     (4.8)

also

P(A/B) + P(Ā/B) = 1     (4.9)

We can prove that

P(A/BC) P(B/C) = P(B/AC) P(A/C)     (4.10)

and as a result we have Bayes' theorem:

P(A/BC) = P(B/AC) P(A/C) / P(B/C)     (4.11)

Let us consider C as our prior information about A before
we receive the information B. We treat P(A/C) as our prior

probability of A given that we know C. If we then learn the new
information B, we can use it to update the posterior probability
P(A/BC). We can see now how powerful Bayes' theorem is: it
allows us to study and forecast future events from our limited
information, and it also allows us to add more information, as
it becomes available, in the learning process.
Sometimes we might encounter a multivariate sample
space; the Bayesian rule can then be rewritten in another form.
Assume the sample space S is a union of mutually exclusive
subsets, S = B1 ∪ B2 ∪ ... ∪ Bk, where Bi ∩ Bj = ∅ for i ≠ j.
Then any subset A of S can be written as

A = A ∩ S
  = A ∩ (B1 ∪ B2 ∪ ... ∪ Bk)
  = (A ∩ B1) ∪ (A ∩ B2) ∪ ... ∪ (A ∩ Bk)     (4.12)

so

P(A) = P(A ∩ B1) + P(A ∩ B2) + ... + P(A ∩ Bk)
     = P(B1) P(A/B1) + P(B2) P(A/B2) + ... + P(Bk) P(A/Bk)
     = Σ_{i=1}^{k} P(Bi) P(A/Bi)     (4.13)

A conditional probability of the form P(Bi/A) can be evaluated
as

P(Bi/A) = P(Bi ∩ A) / P(A) = P(Bi) P(A/Bi) / Σ_{j=1}^{k} P(Bj) P(A/Bj)     (4.14)
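A small Mathematica sketch of Table 4.1 and Bayes' theorem; the four counts are invented for illustration only.

(* Sketch: the probabilities (4.2)-(4.5) and Bayes' theorem (4.11) from assumed counts. *)
n11 = 40; n12 = 10; n21 = 20; n22 = 30;
total = n11 + n12 + n21 + n22;
pA = (n11 + n21)/total;          (* Equation (4.2) *)
pB = (n11 + n12)/total;          (* Equation (4.3) *)
pAgivenB = n11/(n11 + n12);      (* Equation (4.4) *)
pBgivenA = n11/(n11 + n21);      (* Equation (4.5) *)
pAgivenB == pBgivenA pA/pB       (* Bayes' theorem: evaluates to True *)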
Derivation of Maximum Entropy Methods
If we define the information entropy as

S = −∫ v(x) Ln v(x) dx     (4.15)

where S is a functional of the continuously varying random
variable x [1], then, as in Chapter 3, we use the variational
method to maximize S. In order to obtain the most probable
distribution function, the constraints can be carried into the
entropy formalism by Lagrange multipliers:

S' = −∫ v(x) Ln v(x) dx − (λ0 − 1)[∫ v(x) dx − <x^0>] − Σ_{n=1}^{k} λn [∫ v(x) x^n dx − <x^n>]     (4.16)

where k is an even integer and the λn are Lagrange multipliers;
we have assumed λk > 0 so that the distribution is integrable.
If we take δS' with respect to δv and set it equal to zero,
then we have an expression for the probability density
function:

V(x) = exp[−Σ_{n=0}^{k} λn x^n]     (4.17)

Let

φ(k) = ∫ exp(ikx) V(x) dx = ∫ exp(ikx) exp[−Σ_{n=0}^{k} λn x^n] dx     (4.18)

If we integrate φ(k) by parts and make use of the boundary
conditions at infinity, then we have a differential
equation for the Fourier transform of the probability density
function:

ik φ(k) = Σ_{n=1}^{k} n λn (−i)^(n−1) ∂^(n−1)φ/∂k^(n−1)     (4.19)
assume

φ(k) = Σ_{n=0}^{∞} (ik)^n <x^n> / n!     (4.20)

and if we substitute φ(k) back into Equation (4.19) and arrange
all the coefficients according to their powers of k, we obtain a
system of linear equations:

Σ_{n=1}^{k} n λn <x^(n+j−2)> = (j − 1) <x^(j−2)>     (4.21)

for j = 1, 2, 3, 4, ..., m (m must be an even integer; the reason
why will be explained later). Put into matrix form,

| <x^0>  <x^1>  <x^2>  <x^3>  ... |   | λ1  |   | 0      |
| <x^1>  <x^2>  <x^3>  <x^4>  ... |   | 2λ2 |   | <x^0>  |
| <x^2>  <x^3>  <x^4>  <x^5>  ... | . | 3λ3 | = | 2<x^1> |
| <x^3>  <x^4>  <x^5>  <x^6>  ... |   | 4λ4 |   | 3<x^2> |
|  ...                            |   | ... |   |  ...   |
     (4.22)

Let us define the above equation as

X p = d     (4.23)


24
therefore
...
...
...
...
P = (X,i 2X2 ,3^3,4^.4 ,kXk)1
(t denotes transpose).
and
d = (0 2 3 ,... (k-l))1
(4.24)
(4.25)
(4.26)
Ideally, we can solve the infinite numbers of linear equation of
(4.22) to obtain Lagrange multipliers, ^n. If we know all ^n,
then we can calculate ^0 by using the following formula.
V(0) = exp(-Ao) =
(4.27)
where


25
A
k
exp[ X x1 ] dx
i=l
so we have finally completed the solution of V(x)
(4.28)
k
V(x) = exp[ X x* ] dx
i=o (4.29)
here, we have assumed that = 1.
Now let us look at another alternative. If we define φ(k)
in terms of cumulants instead of moments and use it to expand the
Fourier transform of the probability distribution, we will have
a different result. Let

φ(k) = exp[Σ_{n=1}^{k} (ik)^n cn / n!]     (4.30)

If we substitute this into Equation (4.19), expand the equation,
and identify the coefficients of the powers of k, we will
have another form of linear system of equations:

| 1   c1   c1^2 + c2            c1^3 + 3c1c2 + c3             ... |   | λ1  |   | 0 |
| 0   c2   2c1c2 + c3           3c1^2c2 + 3c1c3 + 3c2^2 + c4  ... |   | 2λ2 |   | 1 |
| 0   c3   2c1c3 + 2c2^2 + c4   3c1^2c3 + .... + 9c2c3 + c5   ... | . | 3λ3 | = | 0 |
| 0   c4   2c1c4 + 6c2c3 + c5   3c1^2c4 + ... + 9c3^2 + c6    ... |   | 4λ4 |   | 0 |
     (4.31)

If we define this as

C β = a     (4.32)

where C is the matrix of cumulants in Equation (4.31),

β = (λ1, 2λ2, 3λ3, 4λ4, ..., kλk)^t     (4.33)

and

a = (0, 1, 0, 0, ...)^t     (4.34)


Moments and Cumulants

Since we can use either the moment method or the cumulant
method to expand the Fourier transform, we should know the
relationship between them in order to understand how they
affect each other.
From Equations (4.20) and (4.30),

Σ_{n=0}^{∞} (ik)^n <x^n> / n! = exp[Σ_{n=1}^{∞} (ik)^n cn / n!]     (4.35)

Writing t = ik, the left-hand side is

lhs = 1 + <x>t + <x^2>t^2/2! + ... + <x^r>t^r/r! + ...     (4.36)

and the right-hand side is

rhs = exp(c1t + c2t^2/2! + ... + cr t^r/r! + ...)
    = exp(c1t) exp(c2t^2/2!) exp(c3t^3/3!) ...
    = {1 + c1t/1! + (1/2!)(c1t)^2 + ...}{1 + c2t^2/2! + (1/2!)(c2t^2/2!)^2 + ...} ...     (4.37)
It is worth noting that the very tedious process of
writing down the explicit relations for particular values of n
may be shortened considerably. In fact, differentiating
Equation (4.35) with respect to cj, we have

∂/∂cj (1 + <x>t + ... + <x^r>t^r/r! + ...) = (t^j/j!)(1 + <x>t + ... + <x^r>t^r/r! + ...)     (4.38)

hence, identifying the powers of t,

∂<x^r>/∂cj = [r!/(j!(r−j)!)] <x^(r−j)>     (4.39)

In particular, for j = 1,

∂<x^r>/∂c1 = r <x^(r−1)>     (4.40)

Therefore, given any <x^r> in terms of the c's, we can write
down successively those of lower orders by differentiation:
<x>   = c1     (4.41)
<x^2> = c2 + c1^2     (4.42)
<x^3> = c3 + 3c2c1 + c1^3     (4.43)
<x^4> = c4 + 4c3c1 + 3c2^2 + 6c2c1^2 + c1^4     (4.44)
<x^5> = c5 + 5c4c1 + 10c3c2 + 10c3c1^2 + 15c2^2c1 + 10c2c1^3 + c1^5     (4.45)
<x^6> = c6 + 6c5c1 + 15c4c2 + 15c4c1^2 + 10c3^2 + 60c3c2c1 + 20c3c1^3 + 15c2^3 + 45c2^2c1^2 + 15c2c1^4 + c1^6     (4.46)

The above moments are the ordinary (raw) moments.
Conversely, from

c1t/1! + ... + cr t^r/r! + ... = Ln(1 + <x>t/1! + ... + <x^r>t^r/r! + ...)     (4.47)

we will have the first six terms as the following:

c1 = <x>     (4.48)
c2 = <x^2> − <x>^2     (4.49)
c3 = <x^3> − 3<x^2><x> + 2<x>^3     (4.50)
c4 = <x^4> − 4<x^3><x> − 3<x^2>^2 + 12<x^2><x>^2 − 6<x>^4     (4.51)
c5 = <x^5> − 5<x^4><x> − 10<x^3><x^2> + 20<x^3><x>^2 + 30<x^2>^2<x> − 60<x^2><x>^3 + 24<x>^5     (4.52)
c6 = <x^6> − 6<x^5><x> − 15<x^4><x^2> + 30<x^4><x>^2 − 10<x^3>^2 + 120<x^3><x^2><x> − 120<x^3><x>^3 + 30<x^2>^3 − 270<x^2>^2<x>^2 + 360<x^2><x>^4 − 120<x>^6     (4.53)
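The moment-to-cumulant conversion of (4.48)-(4.53) can also be automated from the generating-function identity (4.35). The Mathematica sketch below does this for the moments <x^n> = n! of the exponential density used later in the semi-infinite example; the choice of n! is an assumed test case.

(* Sketch: cumulants from moments via the logarithm of the moment series, Equation (4.35). *)
mom = Table[n!, {n, 1, 6}];                        (* <x^n> = n! as a test case *)
mgf = 1 + Sum[mom[[n]] t^n/n!, {n, 1, 6}] + O[t]^7;
cum = Table[n! SeriesCoefficient[Log[mgf], n], {n, 1, 6}]
(* returns {1, 1, 2, 6, 24, 120}: c_n = (n-1)! for the exponential density *)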
Analysis of the MAXENT Algorithm
From Equation (4.22) we can see that the system of linear
equations is infinite, and it is impossible to solve such a system
exactly. There are many ways to approximate the solution to this
kind of system. One of them is to truncate the system to an
even K-by-K system (K must be an even number). The truncated
system will still involve higher-order moments, which is not the
case for the Gaussian distribution, since it requires only two
moments to describe its characteristics. Therefore, maximum
entropy becomes a problem of a non-linear nature: except for the
Gaussian distribution, we need to guess the higher moments
intelligently in order to solve the system for the Lagrange
multipliers. We can obtain the Lagrange multipliers for a
Gaussian distribution by using two linear equations; higher-order
moments are not required there because they do not influence the
solution.
The algorithm I am developing here improves the
approximation to maximum entropy by partitioning the
equations into two matrix systems:

A p = d     (4.54)

B β = δ     (4.55)

The second system has H equations (H is an even number).
Assume <x^k> is the highest moment that is going to be
used, and all the moments above <x^(k+2)> are set to zero.
The equations are rearranged so that we solve for moments
instead of Lagrange multipliers: B is an H by H matrix built from
the Lagrange multipliers, and β is an H by 1 matrix of moments.
From the first system, we approximate the values of the
moments in A and use these values to solve for the Lagrange
multipliers. Then I insert the calculated Lagrange multipliers
into the second system of equations to calculate new
moments. After that, I substitute the new moment values
from Equation (4.55) to update the matrix A in Equation
(4.54). The moments and the Lagrange multipliers
finally converge in a few iterations. (The program MAXENT.BAS
in the Appendix implements this iteration.) The above method
works only under two assumptions: the first is an educated
initial guess; the second is that the Lagrange multipliers be
consistent with the integrability of the distribution.
There is no difference between Equation (4.22) and
the system of linear equations obtained from a non-linear
least-squares approximation. Nevertheless, in the least-squares
approach, the development of the moment matrix is based on the
assumption that the underlying probability distribution is
constant, and we also need a series expression of V(x) in terms
of expansion functions. In the MAXENT approach, no such
assumptions are made. In addition, the maximum entropy
method is more flexible, and we can feed additional constraint
information into the system of equations without any trouble.
When I used the maximum entropy algorithm, I found
some interesting points. If I used ten or more moments in
calculating the Lagrange multipliers, the numerical solutions
for the Lagrange multipliers were not as good as I wanted them
to be; they were very noisy. This might be due to an
ill-conditioned matrix, because the values of the elements in the
first column and the last column differ too much: the values of
the first column are too small, while the values of the last
column are too big. The highest moment I ever used in my
algorithm is the sixth moment, and it seemed to give good
results. Finally, the numerical values of the Lagrange multipliers
calculated from the moment method and the cumulant method were
very close to each other.
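For the infinite-domain case with no boundary terms, the truncated system (4.22) can be set up and solved directly. The Mathematica helper below is our own sketch (the name maxentMultipliers is not from the thesis); it takes the list {<x^0>, <x^1>, ..., <x^(2K-2)>} and returns λ1, ..., λK.

(* Sketch: build the moment matrix X and right side d of (4.22) and solve X p = d. *)
maxentMultipliers[mom_List, kk_Integer] := Module[{x, d, p},
  x = Table[mom[[i + j - 1]], {i, kk}, {j, kk}];           (* X_ij = <x^(i+j-2)> *)
  d = Prepend[Table[(j - 1) mom[[j - 1]], {j, 2, kk}], 0]; (* d_j = (j-1)<x^(j-2)> *)
  p = LinearSolve[x, d];
  Table[p[[n]]/n, {n, kk}]]                                (* recover lambda_n from n*lambda_n *)

maxentMultipliers[{1, 0, 1/2, 0, 3/4, 0, 15/8}, 4]
(* the Gaussian moments give {0, 1, 0, 0}, i.e. lambda_2 = 1, as in Example 1 below *)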
Application in Infinite Domain
Let us consider the classical problem. As we mentioned
earlier, the Gaussian distribution is one of the special functions
that needs only two moments to describe it, so the Lagrange
multipliers can be calculated from two linear equations. In
matrix form,

| <x^0>  <x^1> |   | λ1  |   | 0     |
| <x^1>  <x^2> | . | 2λ2 | = | <x^0> |     (4.56)

Using Cramer's rule, we obtain

λ1 = −<x^0><x^1> / (<x^0><x^2> − <x^1>^2)     (4.57)

λ2 = <x^0>^2 / [2(<x^0><x^2> − <x^1>^2)]     (4.58)

exp(λ0) = ∫ exp(−Σ_{n=1}^{2} λn x^n) dx     (4.59)
Example 1: a distribution of Gaussian type.
Let

f(x) = exp(−x^2)     (4.60)

With the density normalized so that <x^0> = 1, the moments are

<x^n> = ∫ exp(−x^2) x^n dx / ∫ exp(−x^2) dx = Γ((n+1)/2) / Γ(1/2)     (4.61)

for n = 0, 2, 4, ... (even numbers); the moments are equal to zero
for n = 1, 3, 5, ... (odd numbers):

<x^0> = 1
<x^1> = <x^3> = <x^5> = 0
<x^2> = 0.5
<x^4> = 0.75
<x^6> = 1.875

Therefore, the Lagrange multipliers are calculated from the
following equation:

| 1     0     0.5   0     |   | λ1  |   | 0   |
| 0     0.5   0     0.75  | . | 2λ2 | = | 1   |
| 0.5   0     0.75  0     |   | 3λ3 |   | 0   |
| 0     0.75  0     1.875 |   | 4λ4 |   | 1.5 |
     (4.62)

We get the solutions λ2 = 1 and λ1 = λ3 = λ4 = 0; in this case
the solution is exact. In fact, if we set the higher moments
equal to zero, we can still get the same result, because a
Gaussian distribution is completely determined by its first and
second moments.


therefore

V(x) = exp(−x^2)     (4.63)

This expression is plotted in Figure 4.1.
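The solution of (4.62) can be checked numerically; a minimal Mathematica sketch:

(* Sketch: solve the 4x4 system (4.62); the unknowns are (lambda_1, 2 lambda_2, 3 lambda_3, 4 lambda_4). *)
m = {{1, 0, 0.5, 0}, {0, 0.5, 0, 0.75}, {0.5, 0, 0.75, 0}, {0, 0.75, 0, 1.875}};
d = {0, 1, 0, 1.5};
LinearSolve[m, d]
(* returns {0., 2., 0., 0.}, so lambda_2 = 1 and V(x) = exp(-x^2) is exact *)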
Application in Semi-Infinite Domain
We define the moments

<x^n> = ∫_0^∞ v(x) x^n dx     (4.64)

and the Fourier transform of the probability density function

φ(k) = ∫_0^∞ v(x) exp(ikx) dx = Σ_{n=0}^{∞} (ik)^n <x^n> / n!     (4.65)

We can calculate the Lagrange multipliers from the following
equation:


Figure 4.1. Comparison of f[x] and v[x].

| <x^0>  <x^1>  <x^2>  <x^3>  ... |   | λ1  |   | V(0)   |
| <x^1>  <x^2>  <x^3>  <x^4>  ... |   | 2λ2 |   | <x^0>  |
| <x^2>  <x^3>  <x^4>  <x^5>  ... | . | 3λ3 | = | 2<x^1> |
| <x^3>  <x^4>  <x^5>  <x^6>  ... |   | 4λ4 |   | 3<x^2> |
     (4.66)

Example 1: a distribution function with initial conditions.
Let

V(x) = exp(−x)     (4.67)

The moments are given by

<x^n> = ∫_0^∞ exp(−x) x^n dx = n!     (4.68)

where

<x^0> = 1
<x^1> = 1
<x^2> = 2
<x^3> = 6
<x^4> = 24

Substituting these values into Equation (4.66),

| 1  1  2  |   | λ1  |   | 1 |
| 1  2  6  | . | 2λ2 | = | 1 |
| 2  6  24 |   | 3λ3 |   | 2 |
     (4.69)


we get λ1 = 1 and λ2 = λ3 = 0; again the maximum entropy
solution is exact:

V(x) = exp(−x)     (4.70)

This function is plotted in Figure 4.2.
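Again the result can be verified numerically; the sketch below solves (4.69), whose right-hand side carries the boundary value V(0) = 1.

(* Sketch: solve the semi-infinite system (4.69) with moments <x^n> = n!. *)
mom[n_] := n!
m = Table[mom[i + j - 2], {i, 3}, {j, 3}];   (* {{1,1,2},{1,2,6},{2,6,24}} *)
d = {1, 1, 2};                               (* {V(0), <x^0>, 2<x^1>} *)
LinearSolve[m, d]
(* returns {1, 0, 0}: lambda_1 = 1, lambda_2 = lambda_3 = 0, so V(x) = exp(-x) *)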
Now let us look at another application of Equation (4.66),
which yields an inexact solution.
Example 2:
Let

V(x) = x exp(−x)     (4.71)

The moments are given by

<x^n> = ∫_0^∞ x exp(−x) x^n dx = (n+1)!     (4.72)

where

<x^0> = 1
<x^1> = 2
<x^2> = 6
<x^3> = 24
<x^4> = 120
<x^5> = 720
<x^6> = 5040


Figure 4.2. Comparison of f[x] and v[x].


Substituting these values into Equation (4.66),

| 1    2    6    24   |   | λ1  |   | 0  |
| 2    6    24   120  | . | 2λ2 | = | 1  |
| 6    24   120  720  |   | 3λ3 |   | 4  |
| 24   120  720  5040 |   | 4λ4 |   | 18 |
     (4.73)

The solutions are

λ1 = −3
λ2 = 1.5
λ3 = −0.22222
λ4 = 0.0104167

therefore the function can be approximated as

V(x) = exp(−2.75 + 3x − 1.5x^2 + 0.22222x^3 − 0.0104167x^4)     (4.74)

This expression is plotted in Figure 4.3.

Figure 4.3. Comparison of f[x] and v[x].

Application in Finite Domain

The domain of the probability function is now a finite region
a <= x <= b. In this case, we define the moments


<x^n> = ∫_a^b v(x) x^n dx     (4.75)

and the Fourier transform of the probability function

φ(k) = ∫_a^b v(x) exp(ikx) dx = Σ_{n=0}^{∞} (ik)^n <x^n> / n!     (4.76)
If we require that the probability distribution not be zero at
the initial and end points, then we must use the boundary
conditions to calculate the Lagrange multipliers. Substituting
these conditions into Equation (4.22), we have

| <x^0>  <x^1>  <x^2>  ... |   | λ1  |   | V(a) − V(b)               |
| <x^1>  <x^2>  <x^3>  ... |   | 2λ2 |   | <x^0> + aV(a) − bV(b)     |
| <x^2>  <x^3>  <x^4>  ... | . | 3λ3 | = | 2<x^1> + a^2V(a) − b^2V(b) |
| <x^3>  <x^4>  <x^5>  ... |   | 4λ4 |   | 3<x^2> + a^3V(a) − b^3V(b) |
     (4.77)

Example 1:
Let

V(x) = x exp(−x^2)     (4.78)


the moments are defined as

<x^n> = ∫_0^∞ x exp(−x^2) x^n dx = Γ(n/2 + 1) / 2     (4.79)

where

<x^0> = 0.5
<x^1> = 0.44311134
<x^2> = 0.5
<x^3> = 0.66467
<x^4> = 1
<x^5> = 1.6616755
<x^6> = 3

Using Equation (4.77), we can approximate the Lagrange
multipliers:

λ1 = −10.2
λ2 = 12.415
λ3 = −5.653
λ4 = 0.969

therefore, the function can be approximated as

V(x) = exp(−3.5 + 10.2x − 12.415x^2 + 5.653x^3 − 0.969x^4)     (4.80)

This function is plotted in Figure 4.4.


Figure 4.4. Comparison of f[x] and v[x].


Example 2: Gamma distribution.
The gamma distribution is defined as

p(y) = y^(α−1) exp(−y/β) / [β^α Γ(α)]     (4.81)

The moment generating function m(t) for a continuous random
variable y is defined as E[exp(ty)], therefore

m(t) = E[exp(ty)] = ∫_0^∞ exp(ty) y^(α−1) exp(−y/β) / [β^α Γ(α)] dy = (1 − βt)^(−α)     (4.82)

Using a Taylor series expansion, we can expand Equation (4.82):

m(t) = 1 + αβt + [α(α+1)/2!] β^2 t^2 + [α(α+1)(α+2)/3!] β^3 t^3 + ...     (4.83)

therefore, in general,


<y^k> = α(α+1)(α+2)...(α+k−1) β^k     (4.84)

If we choose α = 2.5, β = 2, then

<y^0> = 1
<y^1> = 5
<y^2> = 35
<y^3> = 315
<y^4> = 3465
<y^5> = 45045
<y^6> = 675675

Using Equation (4.77), we can approximate the Lagrange
multipliers:

λ1 = −1.50001
λ2 = 0.300003
λ3 = −0.0190478
λ4 = 0.000396831

therefore, the function can be approximated as

V(x) = exp(−4.1 + 1.5x − 0.3x^2 + 0.019x^3 − 0.000397x^4)     (4.85)

This function is plotted in Figure 4.5.
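The gamma moments used above follow directly from (4.82) and (4.84); a short Mathematica check:

(* Sketch: gamma moments for alpha = 2.5, beta = 2, from the product formula (4.84)
   and from derivatives of the moment generating function (4.82). *)
alpha = 5/2; beta = 2;
momFormula[k_] := Product[alpha + j, {j, 0, k - 1}] beta^k
momMgf[k_] := D[(1 - beta t)^(-alpha), {t, k}] /. t -> 0
Table[{momFormula[k], momMgf[k]}, {k, 1, 6}]
(* both columns give 5, 35, 315, 3465, 45045, 675675 *)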


Figure 4.5. Comparison of f[x] and v[x].


Constructing PDF from the Experimental Data

Experiment 1. This experiment is plotted in Figure 4.6.
Experiment 2. This experiment is plotted in Figure 4.7.
Experiment 3. This experiment is plotted in Figure 4.8.
Experiment 4. This experiment is plotted in Figure 4.9.
Experiment 5. This experiment is plotted in Figure 4.10.
All of these experimental data were obtained in the laboratory
at the University of Colorado at Denver; the underlying
probability density functions are unknown.


Figure 4.6. v[x] is an unknown probability density function.


Figure 4.7. v[x] is an unknown probability density function.


Figure 4.8. v[x] is an unknown probability density function.


Figure 4.9. v[x] is an unknown probability density function.


Figure 4.10. v[x] is an unknown probability density function.


CHAPTER 5
DISCUSSION AND CONCLUSION

In this paper, we have carefully examined the classical
moment problems using maximum entropy. An explicit
procedure for approximating the Lagrange multipliers has been
developed for infinite and finite domains for the classical
problems. The method is based on developing a differential
equation for the Fourier transform of the probability
distribution. This new method allows a physical
interpretation of the various Lagrange multipliers in terms of
the cumulants and moments of the distribution. The algorithm
is also well suited to solving certain types of non-linear
differential equations. The difference between this algorithm
and the old classical algorithm is that the new method
eliminates the tedious solution of systems of non-linear
equations by solving two linear systems of equations instead.
Also, the new algorithm does not make the assumption, as in the
maximum likelihood problem, that the underlying probability
distribution is Gaussian, which is not always true in the real
world.


The MAXENT algorithm allows us to truncate the infinite
sum in the expansion of the Fourier transform of the probability
density function. In fact, the algorithm uses only the first few
moments to approximate the Lagrange multipliers, but it gives
good results.


APPENDIX
(* This program is used to plot Figure 4.1. Programming language: Mathematica. *)
a = 0
b = 1
c = 0
d = 0
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
e = 0
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[ v[x], {x, -3, 3}, AxesLabel -> {"x", "v[x]"}]
f[x_] = Exp[-x^2]
Plot[ { v[x], f[x] }, {x, -3, 3}, AxesLabel -> {"x", "f[x] and v[x]"} ]


(* This program is used to plot Figure 4.2. Programming language: Mathematica. *)
a = 1
b = 0
c = 2.9*10^-8
d = 3.7*10^-9
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[ g[x], {x, 0, 10} ]
e = Log[ init ]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[ v[x], {x, 0, 3}, AxesLabel -> {"x", "v[x]"}]
f[x_] = Exp[ -x ]
Plot[ { v[x], f[x] }, {x, 0, 3}, AxesLabel -> {"x", "f[x] and v[x]"} ]


(* This program is used to plot Figure 4.3. Programming language: Mathematica. *)
a = -3
b = 1.5
c = -0.22222      (* lambda_3 from the solution of (4.73) *)
d = 0.0104167
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[ g[x], {x, 0, 10} ]
e = Log[ init ]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[ v[x], {x, 0, 5}, AxesLabel -> {"x", "v[x]"}]
f[x_] = x*Exp[ -x ]
Plot[ { v[x], f[x] }, {x, 0, 3}, AxesLabel -> {"x", "f[x] and v[x]"} ]


(* This program is used to plot Figure 4.4. Programming language: Mathematica. *)
a = -10.2
b = 12.415
c = -5.653        (* lambda_3 from Equation (4.80) *)
d = 0.969
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[ g[x], {x, 0, 10} ]
e = Log[ init ]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[ v[x], {x, 0, 5}, AxesLabel -> {"x", "v[x]"}]
f[x_] = x*Exp[ -x ]
Plot[ { v[x], f[x] }, {x, 0, 3}, AxesLabel -> {"x", "f[x] and v[x]"} ]


(* This program is used to plot Figure 4.5. Programming language: Mathematica. *)
a = -1.5
b = 0.3
c = -2/105        (* lambda_3 = -0.0190478 from Equation (4.85) *)
d = 1/2520
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[ g[x], {x, 0, 10} ]
e = Log[ init ]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[ v[x], {x, 0, 5}, AxesLabel -> {"x", "v[x]"}]
f[x_] = x^1.5*Exp[-x/2]/(2^2.5*Gamma[2.5])   (* gamma density (4.81) with alpha = 2.5, beta = 2 *)
Plot[ { v[x], f[x] }, {x, 0, 5}, AxesLabel -> {"x", "f[x] and v[x]"} ]


! Program: MAXENT.BAS
! Programming language: VAX BASIC
! Implements the two-system iteration of Equations (4.54)-(4.55):
! solve for the Lagrange multipliers from the moments, then update
! the moments from the multipliers, until both converge.

print chr$(27); chr$(91); chr$(50); chr$(74);   ! clear the screen (ESC [ 2 J)

main:
y$ = "y"
until ( y$ <> "y" and y$ <> "Y" )
    input " input (size) n1 "; n1%
    input " input (size) n "; n%
    input " iteration count 'd' "; d
    length = 2 * n1% - 1
    dim m(n1%,n1%), inv_m(n1%,n1%), b(n1%,1), lamda(n1%,1), g(length,1)
    dim ma(n%,n%), inv_ma(n%,n%), c(n%,1), x(n%,1), h(length,1)
    print " input matrix elements g[i] (size is n1)"
    mat input g
    when error in
        for d1 = 1 to d
            ! build the right side b and the moment matrix m of (4.54)
            mat b = zer
            for l = 2 to n1%
                b(l,1) = (l-1) * g(l-1,1)
            next l
            k = 0
            for i = 1 to n1%
                for j = 1 to n1%
                    m(i,j) = j * g(j+k,1)
                next j
                k = k + 1
            next i
            call print_matrix ("matrix m", m(,))
            call print_matrix ("matrix b", b(,))
            mat inv_m = inv(m)
            call print_matrix ("inverse matrix inv_m", inv_m(,))
            call calculate_solution_x (lamda(,), inv_m(,), b(,))
            call print_matrix ("lamda", lamda(,))
            ! build the second system (4.55) from the multipliers
            mat c = zer
            c(1,1) = -g(5,1) * lamda(1,1) + 4 * g(4,1)
            c(2,1) = 5 * g(5,1)
            mat h = zer
            h(1,1) = -6
            for l = 2 to 5
                h(l,1) = lamda(l-1,1)
            next l
            mat ma = zer
            k = 3
            for i = 1 to n%
                b0 = 0
                for j = 1 to n%
                    if (b0+k) >= 1 then
                        if (b0+k-1) = 0 then
                            ma(i,j) = h(1,1)
                            h(1,1) = h(1,1) - 1
                        else
                            ma(i,j) = (b0+k-1) * h(b0+k,1)
                        end if
                    end if
                    b0 = b0 + 1
                next j
                k = k - 1
            next i
            call print_matrix ("matrix ma", ma(,))
            call print_matrix ("matrix c", c(,))
            call print_matrix ("matrix h", h(,))
            mat inv_ma = inv(ma)
            call print_matrix ("inverse matrix ma", inv_ma(,))
            call calculate_solution_x (x(,), inv_ma(,), c(,))
            call print_matrix ("x", x(,))
            ! feed the updated moments back into g
            for l = length-1 to length
                g(l,1) = x(l-5,1)
            next l
            call print_matrix ("solution matrix g", g(,))
        next d1
        call print_matrix ("solution matrix x", x(,))
    use
        print " unresolvable matrix, try again"
    end when
    input " continue "; y$
next
end

sub initial_lamda ( hfloat m(,), b(,) )
    mat b = zer
    for l = 2 to n1%
        b(l,1) = (l-1) * g(l-1,1)
    next l
    k = 0
    for i = 1 to n1%
        for j = 1 to n1%
            m(i,j) = j * g(j+k,1)
        next j
        k = k + 1
    next i
end sub

sub initial_ma ( ma(,), c(,), h(,), lamda(,) )
    mat c = zer
    c(1,1) = -g(5,1) * lamda(1,1) + 4 * g(4,1)
    c(2,1) = 5 * g(5,1)
    mat h = zer
    h(1,1) = -6
    for l = 2 to 5
        h(l,1) = lamda(l-1,1)
    next l
    mat ma = zer
    k = 3
    for i = 1 to n%
        b0 = 0
        for j = 1 to n%
            if (b0+k) >= 1 then
                if (b0+k-1) = 0 then
                    h(1,1) = h(1,1) - 1
                    ma(i,j) = h(1,1)
                else
                    ma(i,j) = (b0+k-1) * h(b0+k,1)
                end if
            end if
            b0 = b0 + 1
        next j
        k = k - 1
    next i
end sub

sub calculate_solution_x ( hfloat x(,), inv_a(,), b(,) )
    mat x = inv_a * b
end sub

sub print_matrix ( a$, hfloat out_matrix(,) )
    print a$
    mat print out_matrix;
end sub


BIBLIOGRAPHY
[1] C. Shannon, "A mathematical theory of communication," Bell
System Technical Journal, 1948.
[2] E. Jaynes, "Information theory and statistical mechanics,"
Phys. Rev., Vol. 106, p. 620, 1957.
[3] L. Mead and N. Papanicolaou, "Maximum entropy in the
problem of moments," Journal of Mathematical Physics,
Vol. 25, p. 2404, 1984.
[4] J. Baker-Jarvis, J. Alameddine, and M. Racine, "Solving
differential equations by a maximum entropy-minimum
norm method with an application to Fokker-Planck
equations," J. Math. Phys., Vol. 30, pp. 1459-1463, 1989.
[5] A. K. Majumdar, "Uniqueness of statistics derived from
moments of irradiance fluctuations in atmospheric
optical propagation," Optics Communications, Vol. 50,
pp. 1-7, 1984.
[6] J. Baker-Jarvis and R. Geyer, "Application of the
method of maximum entropy in stochastic processes and
inverse problems." (To be published.)
[7] M. G. Kendall and A. Stuart, The Advanced Theory of
Statistics, Vol. 1, New York: Hafner Publishing Company,
pp. 55-104, 1969.
[8] W. Mendenhall, R. L. Scheaffer, and D. D. Wackerly,
Mathematical Statistics with Applications, Boston:
Duxbury Press, 1978.
[9] R. G. Frehlich and J. H. Churnside, "Probability density
function for estimates of the moments of laser
scintillation," SPIE Proc., Infrared and Millimeter Wave
Propagation Engineering, 1988.
[10] S. A. Brown-Van Hoozer, "Statistical laser irradiance
fluctuations through atmospheric turbulence based on
higher-order moments," Master's Thesis, University of
Colorado at Denver, 1990.
[11] A. K. Majumdar, "Higher-order statistics of laser-
irradiance fluctuations due to turbulence," Journal of the
Optical Society of America A, Vol. 1, p. 1067, Nov. 1984.


Full Text

PAGE 1

LASER STATISTICS THROUGH ATMOSPHERIC TURBULENCE: A SIMULATION APPROACH by Tze Kuan Van B.S., University of Colorado, 1989 A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Master of Science Department of Electrical Engineering 1991

PAGE 2

This thesis for the Master of Science degree by Tze Kuan Van has been approved for the Department of Electrical Engineering by Arun K. Majumdar Joe E. Thomas Marvin F. Anderson Date

PAGE 3

ABSTRACT Van, Tze Kuan (M.S. Electrical Engineering) Laser Statistics Through Atmospheric Turbulence: A Simulation Approach Thesis directed by Professor Arun K. Majumdar This thesis is based on maximum entropy method of constructing an unknown probability density function (PDF) from the high-order moments. Maximum entropy method is very useful in data reduction, and solving system of equations problems. This paper will look at the classical solution of solving the moment problems, and compare it with the new algorithm developed here, which is used to calculate the Lagrange multipliers. This method allows us to know how actually the Lagrange multipliers are related to moments of distribution. This new algorithm also applies to the infinite domain, semi-infinite domain, and finite domain. The form and content of this abstract are approved. recommend its publication. Sign Arun K. Majumdar

PAGE 4

DEDICATION I wish to acknowledge the following people who made this study possible. My mentor, Dr. Arun K. Majumdar, of the Faculty of the Graduate School, Department of Electrical Engineering, University of Colorado at Denver, who was the source of my inspiration for this study. Dr. James Saker-Jarvis of NIST, who helped me to discover the new algorithm. Last, but not least, I am very grateful to my wife, and friends who assisted in typing this paper.

PAGE 5

CONTENTS CHAPTER 1. INTRODUCTION ................................... 1 2. LASER STATISTICS IN ATMOSPHERIC TURBULENCE. ... 4 Definition of Atmospheric Turbulence. . . . . . .4 Variance of Intensity Fluctuations ............... 6 Relationship between PDF and Higher Order Moments .................................. 7 3. THE INFORMATION ENTROPY ....................... 10 Definition of Entropy. . . . . . . . . . . . . 1 0 Example: Information Content of a Morse Code ... 1 0 The Linear Problem with One Constraint. . . . .12 The Linear Problem with Many Constraints ...... 14 4. THE MOMENT PROBLEM ............................ 16 History of the Moment Problem. . . . . . . . 16 6ayesian Theory. . . . . . . . . . . . . . . 18 Derivation of Maximum Entropy Methods ......... 21 Moments and Cumulants. . . . . . . . . . . 27 Analysis of the MAXENT Algorithm ............. 30 Application in Infinite Domain. . . . . . . . 32 in Semi-Infinite Domain. . . . . . 35 Application in Finite Domain. . . . . . . . . 40 Constructing PDF from Experimental Data. . .. .. 49

PAGE 6

vi 5. DISCUSSION AND CONCLUSION. . . . . . . . .. 54 APPENDIX. . . . . . . . . . . . . . . . . . . . . . . 56 BIBLIOGRAPHY .......................................... 65

PAGE 7

CHAPTER I INTRODUCTION Laser irradiance fluctuations in real atmospheric turbulance as well as laboratory simulated turbulance have been studied for a long time. Nevertheless, it still remains as a puzzle for us to understand how the statistics of irradiance fluctuations behave in turbulant medium. In order to understand the nature of irridiance fluctuations, we have to know the form of probability density function (PDF), because probability density function is used to described the statistics of a fluctuating phenomenon like scintillation caused by atmospheric turbulance [1 0 ]. There are many ways we can calculate the probability density function. One of these methods is called polynomial expansion. It is a very popular method because it has been examined and studied by a lot of mathmaticians and scientists for more than a century. In recent years, a method called maximum entropy has been under intensive study. For problems outside the traditional domain of thermodynamics, the conceptual foundations of maximum entropy also have been debated for some time. This method becomes more meaningful in practical applications because

PAGE 8

2 computers have became more powerful today than before. From a practical point of view, the maximum entropy method is more useful because it use moments to describe the probability distribution. It is very easy for us to obtain the moments of the distribution from the experimental data. This thesis is to use the technique of maximum entropy to reconstruct the probability density function from the finite moments. It is assumed that we know the physical data of signal information where we can calculate the lower-order and higher-order moments. The method of maximum entropy is a result of Bayesian theorem. With P(Ai/A) being the probability calculated by maximum entropy (MAXENT). We wiU exam P(Ai/A) very carefully later. Use of maximum entropy allows us to and update the. probability distributions as more information is available. In 1948, Shannon [ 1 ] introduced the idea of maximum entropy to describe the information content of a signal. This idea was used by Jaynes [ 2 ] to analyze statistical physics in 1957. Since then, there are many outstanding application which used the method of maximum entropy. Mead and Papanicolaon [ 3 ] used MAXENT to solve integral equations. James Saker-Jarvis [ 4 ] use it to solve differential equations with constraint of boundary condition problems. Arun K. Majumdar has studied the general inverse scattering problem in the context of .maximum entropy [ 5 ]. The uncertainty of a probability distribution can be measured by the information entropy. By maximizing the

PAGE 9

3 entropy subject to available information, we can obtain the probability distribution function that is least consistent with the knowledge of the data. So, a probability distribution for the current state of a system consistent with prior information can be obtained by MAXENT [ 6 ]. There are some restrictions that must be satisfied in Shannon's derived theory of maximum entropy [ 1 ]. 1. The uncertainty of a probability distribution must be positive. 2. It must be an increasing function of the number of the degree of freedom (1/N). 3. Independent sources of uncertainty must be additive. There are many areas of research that require the estimation of a probability density function (PDF) from the moment of a distribution, such as mine detection problems, moment problems, optical propagation problems. It is crucial in estimating the probability of successful detection in the mine detection problems. The probability of detection is quantified by a PDF for the noise level. Some of the research in the past has assumed a Gaussian distribution function is immersed into a sinusoidal signal. This is not always true, because the Gaussian distribution function is not the only noise exiting. This is why we need some kind of methods to calculate the probability density function from the moments which are obtained from the available data. The maximum entropy techniques for the moment problem that we are developing in this paper are very useful in this kind of problems.

PAGE 10

CHAPTER2 LASER STATISTICS IN ATMOSPHERIC TURBULENCE Defjnjtjon of Atmospheric Turbulence Most energy from the optical waves will redistribute in atmospheric turbulence through small scale intensity scintillations. The log-intensity variance, is used to describe the logarithm of the intensity of the wave. The variance usually increase along with the turbulence's strength. The turbulence saturates when the log-intensitY variance is equal to 2.4. We can calculate the intensity scintillations in the saturation region when measured through aperture as small as VA. L in diameter, where A. is the optic wavelength, L is the path length of propagation. The optic wave length varies inversely as the variance of the log-intensity scintillations. Because atmosphere is non-stationary, random fluctuations in density and refractive index are usually caused by atmosphere turbulence. Also the random fluctuations change according to different place and time. Since the variation of pressure is relatiyely small, we define the turbulence mainly as the fluctuations of temperature. In fact, the fluctuations of

PAGE 11

5 temperature is the prime reason to cause the fluctuation of the refractive index. The refractive index is defined as, On = -79 P br I (8 -1) T2 (2.1) Where T is the temperature,. P is the pressure and 8 is the ratio of specific heat which is equal to 1.47 for air. Velocity fluctuation is also another factor. Which is from large scale phenomena to affect the equilibrium of fluctuations. We define Lo the large scale, size, as the 2..K outer scale of turbulence. Where Lo = Ko ,'Ko is a wave number. Also Lo is the inner scale of turbulence, 21t where lo =Km Km is also a wave number. The formula of spectrum is the same for velocity, temperature, and refractive index. It is defined by Talarslri as below [ 10 ] = Km. For large scales the equation is a poor estimation if K <= Ko. (2.2)

PAGE 12

6 La, Ia, and Ci are the three main parameters to describe the atmospheric turbulence, Ci is defined as the structure parameter of refractive index. The intensity of the refractive index can be measured by Ci. Where 10"-17 <= Cn<= 1 0"-13, the unit is meters"(-2/3). The outer scale (large size) of atmospheric turbulence is La, above which the energy is to be introduced into the turbulence. It varies with the strength of the turbulence. That is 1 00 meters above the ground. The inner scale (small size) of atmosphere turbulence is lo, below which viscous dissipates and penetrates the spectrum. Turbulence energy is usually dissipated around 1 mm above the ground. Variance of Intensity Fluctuations The variance of the log-intensity, a1n1 is usually measured from the experiment. So we can calculate the log amplitude = aiDI/4 (2.3) arn1 has to be small for the spherical wave and the beam wave to be valid. The value of arn1 can be larger than 2.5 if the high value of is obtained in the 500 meters horizontal propagation path which is close to the ground. It is very different for arm to be larger 2.5 in vertical propagation paths because is constantly changing along the propagation path. From the

PAGE 13

7 experimental results, we can show that the spherical wave has a smaller variance than other kinds of waves. Let Wo represents the radius of an optical beam at the transmitter and K is the optical wave number, 2rc I A.. If L << K Wo"2/2 then starts out equal to the variance of a plane wave. On the other hand, if L >> KWo"212 the variance will be approximately equal to the value of a spherical wave [ 10 ]. Relations between the POE and the High Order Moments To study phenomena like laser propagation through turbulence atmosphere, it is necessary to know the probability. distribution function. However, it is difficult to measure PDF because of .many technical problems such as random spikes and saturation of detector. As a result, there are many models have been proposed to obtain the probability density function of laser scintillation. Such as K distribution, 1-K distribution, log-normal, exponential Bessel, Furutsn distribution [9]. No matter which model is used .to calculate the PDF, we need to know the higher order moments of the random variable. The characte ristic function of the random variable contains a series of higher order moments, which can be used to calculate the probability distribution. The higher order moments contain information to the tail of probability density function. The information concerning the width and the lack of symmetry of the distribution can be obtained from the even-

PAGE 14

8 and odd order moments, if we use skewness to represent the measure and asymmetric nature, or flatness to measure the weight in the tail. It is very easy to obtain this information from Gaussian type distribution, but it is not the case when dealing with non-Gaussian type distribution. In order to understand the physical nature, we need to know the moments beyond 4th skewness: r3 = (2.4) excess: r4 = -3 '1. (2.5) superskewness: r5 = -10 (2.6) super excess: r6 = < x6> -15 3 (2.7) hyperskewness: r1 = -105 2 (2.8) hyperexcess: rs = -105 4 (2.9) For a Gaussian probability distribution, the odd-order central moments disappear; so r3, r5 and r1 does not exist. We can obtain r4 = r6 = rs = 0 according to the above definition. Remember,Gauss.ian distribution is described by the first two moments only [ 5 ].

PAGE 15

9 In this paper, it is assumed that we are able to obtain all of the high order moments that we need. Therefore, we can use this information to reconstruct the probability density function.

PAGE 16

CHAPTER3 THE INFORMATION ENTROPY Definition of Entropy Let us use a very simple example to induce the result, which will help us to understand the formalism of entropy. Consider the problem of tossing two coins. Each having possible outcomes Rt, R2; therefore, the total number of possible outcomes are R = Rt R2, or in this particular case, 2*2 = 4. Nevertheless, the information co ntained in the two coins should be additive: I( Rt, R2) = I(Rt) + I( R2) = Ln R (3.1) So, we know the information must satisfy a logarithmic type of function. Example: Information Content of a Morse Code Consider an exart:Jple below: . ----

PAGE 17

Assume Nt = n.umber of dashes N2 = number of dots therefore N = Nt + N2 (total number of dashes and dots) N! R =------= Ln R = [ Ln(N!} Ln(Nt } Ln(N2! } ] Stirling's approximation: N = i2"1tn nn en Ln N! = Ln(YZx) + O.S*LnN + n*LnN -n also Ln(YZx) + O.S*LnN << n*LnN -n LnN! = N(LnN-1) therefore = N(LnN -1} Nt(Ln Nt -1) -N2(LnN2 -1) remember the normalized information is i= N = LnN -1 (Nt/N)*LnNt + Nt/N -(N2/N)*LnN2 +N2/N = LnN (Nt/N)*LnNt -(N2/N)*LnN2 1 1 (3.2} (3.3} (3.4} (3.5} (3.6} (3.7} (3.8} (3.9}

PAGE 18

  = -(N1/N)*Ln N1 - (N2/N)*Ln N2 + (N1/N)*Ln N + (N2/N)*Ln N   (3.10)

As a result,

i = -(N1/N)*Ln(N1/N) - (N2/N)*Ln(N2/N)   (3.11)

Letting Pi = Ni/N, we obtain the information entropy formula

S = -Σ(i) Pi Ln Pi   (3.12)

An estimate of the Pi can be obtained if we maximize the entropy [7].

The Linear Problem with One Constraint

Given

S = -Σ(i=1..K) Pi Ln Pi   (3.13)

where

Σ(i=1..K) Pi = 1   (3.14)

is the normalization constraint for K points, we form

S = -Σ(i=1..K) Pi Ln Pi - λ(Σ(i=1..K) Pi - 1)   (3.15)

Taking the variation of S with respect to Pi,

δS/δPi = -(Ln Pi + 1) - λ   (3.16)

and setting Equation (3.16) equal to zero,

Ln Pi + 1 + λ = 0   (3.17)

therefore

Pi = exp(-λ - 1)   (3.18)

To find λ we need to use the constraint condition,

Σ(i) Pi = K exp(-λ - 1) = 1   (3.19)

so that

exp(-λ - 1) = 1/K   (3.20)

or

Pi = 1/K   (3.21)

The Linear Problem with Many Constraints

The constraints from an experiment are usually given in terms of expectation values,

Σ(i) fki Pi = <fk>   (3.22)

together with the normalization

Σ(i) Pi = 1   (3.23)

Therefore

S = -Σ(i) Pi Ln Pi - λ0(Σ(i) Pi - 1) - Σ(k) λk(Σ(i) fki Pi - <fk>)   (3.24)

Setting the variation of Equation (3.24) with respect to Pi equal to zero,

δS/δPi = 0   (3.25)

we obtain as a result

Pi = exp(-λ0 - Σ(k) λk fki)   (3.26)

From the normalization constraint condition we obtain

exp(-λ0) = 1 / Σ(i) exp(-Σ(k) λk fki)   (3.27)

and then the probability is

Pi = exp(-Σ(k) λk fki) / Σ(i) exp(-Σ(k) λk fki)   (3.28)
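As a concrete illustration of Equations (3.26)-(3.28), the sketch below (Python with numpy; the die example and the bisection routine are illustrative choices, not taken from the thesis) finds the maximum entropy distribution for a single expectation constraint. With one constraint the multiplier λ can be found by bisection, since <f> is a monotone function of λ.

import numpy as np

def maxent_discrete(f, fbar, lam_lo=-50.0, lam_hi=50.0, tol=1e-12):
    # Maximum-entropy p_i = exp(-lam*f_i)/Z subject to sum_i f_i p_i = fbar,
    # Eq. (3.28) for a single constraint; fbar must lie inside (min f, max f).
    f = np.asarray(f, dtype=float)

    def mean(lam):                          # <f> as a function of lambda
        w = np.exp(-lam * (f - f.mean()))   # shift the exponent for stability
        return np.sum(f * w) / np.sum(w)

    lo, hi = lam_lo, lam_hi                 # mean(lam) decreases as lam grows
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < fbar:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)
    w = np.exp(-lam * (f - f.mean()))
    return w / np.sum(w)

# A die constrained to an average face value of 4.5 instead of 3.5:
p = maxent_discrete([1, 2, 3, 4, 5, 6], 4.5)
print(p, np.dot(p, [1, 2, 3, 4, 5, 6]))    # probabilities tilt toward high faces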
CHAPTER 4
THE MOMENT PROBLEM

History of the Moment Problem

From the classical point of view, the estimate of a function associated with a moment problem can be calculated from the moments of the function. Maximum entropy is one of the techniques used for this type of problem. In order to obtain the least biased estimate of a probability density function, we need to maximize the entropy of the probability distribution subject to the constraints that carry the prior information. This can be done within the maximum entropy method by the use of Lagrange multipliers. It is not an easy job to calculate the Lagrange multipliers because they always involve a system of simultaneous non-linear equations. Since the solutions of non-linear equations are not unique, the calculation can give multiple solutions to the same system of equations. Therefore, this kind of process requires us to understand the physical behavior of the system and to generate a method to guess the initial conditions. Such a step is very important because the Lagrange multipliers are just a set of numbers produced by non-linear mathematical equations, and they have no meaning by themselves. As a result, we need to understand the relationship between the Lagrange multipliers and the physical nature of the problem.

In this paper, the main objective is to find an algorithm that calculates the Lagrange multipliers explicitly. Let us define a positive, square-integrable function v(x), so that the moments of the function v(x) are given by

<x^n> = ∫ v(x) x^n dx   (4.1)

where n = 0, 1, 2, 3, 4, .... Ideally we should have an infinite number of moments, but in real life only finitely many moments can be considered. As mentioned above, such a system of equations may not be unique; it can give multiple solutions to the same system of equations. Therefore, we have to introduce another constraint: the function must decay at least as fast as exp(-|x|) for large x.

Maximum entropy is a very powerful alternative method for solving the moment problem, and it has been used successfully in optics problems, data reduction applications, and physics problems. In this paper I take a closer look at the classical method of treating moment problems, which solves a system of simultaneous non-linear equations, a job that is not easy, and compare it with the new iterative technique developed here, which is easier to handle. I also compare the moment expansion technique with the cumulant expansion technique.
Bayesian Theorem

Suppose that an experiment is repeated a large number of times, N, resulting in: both A and B, n11 times; A and not B, n12 times; B and not A, n21 times; and neither A nor B, n22 times. These results are presented in Table 4.1.

Table 4.1  Table for events A and B

            B       not B
  A         n11     n12
  not A     n21     n22

Note that N = n11 + n12 + n21 + n22. Then it follows that

P(A) = (n11 + n12) / N   (4.2)

P(B) = (n11 + n21) / N   (4.3)

P(A|B) = n11 / (n11 + n21)   (4.4)

P(B|A) = n11 / (n11 + n12)   (4.5)

P(B ∩ A) = n11 / (n11 + n12 + n21 + n22)   (4.6)

P(A|B) = P(A ∩ B) / P(B) = n11 / (n11 + n21)   (4.7)

or, conditioning everything on C,

P(AB|C) = P(A|BC) P(B|C)   (4.8)

and also

P(A|B) + P(Ā|B) = 1   (4.9)

We can prove that

P(A|BC) P(B|C) = P(B|AC) P(A|C)   (4.10)

and as a result we have the Bayesian theorem:

P(A|BC) = P(B|AC) P(A|C) / P(B|C)   (4.11)

Let us consider C as our prior information about A before we receive the information B. We treat P(A|C) as our prior probability of A as long as we know C. If we also acquire the new information B, we can use it to update the posterior probability P(A|BC). We can see now how powerful the Bayesian theorem is: it allows us to study and forecast future events from our limited information, and it also allows us to add more information, as it becomes available, in a learning process.

Sometimes we might encounter a multivariate sample space, and the Bayesian rule can be rewritten in another form. Assume the sample space S is a union of mutually exclusive subsets, say S = B1 ∪ B2 ∪ ... ∪ Bk, where

Bi ∩ Bj = ∅ for i ≠ j   (4.12)

Then, for any event A,

P(A) = Σ(i=1..k) P(Bi) P(A|Bi)   (4.13)

and a conditional probability of the form P(Bi|A) can be evaluated as

P(Bi|A) = P(Bi) P(A|Bi) / P(A) = P(Bi) P(A|Bi) / Σ(j=1..k) P(Bj) P(A|Bj)   (4.14)
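A small numerical check of Equations (4.2)-(4.11) may be helpful; the counts below are hypothetical, chosen only for illustration (Python).

# Hypothetical counts for Table 4.1: n11 = A and B, n12 = A and not B,
# n21 = B and not A, n22 = neither.
n11, n12, n21, n22 = 40, 10, 20, 30
N = n11 + n12 + n21 + n22

P_A = (n11 + n12) / N             # Eq. (4.2)
P_B = (n11 + n21) / N             # Eq. (4.3)
P_A_given_B = n11 / (n11 + n21)   # Eq. (4.4)
P_B_given_A = n11 / (n11 + n12)   # Eq. (4.5)

# Bayes' theorem in the form P(A|B) = P(B|A) P(A) / P(B), cf. Eq. (4.11):
assert abs(P_A_given_B - P_B_given_A * P_A / P_B) < 1e-12
print(P_A, P_B, P_A_given_B)      # 0.5, 0.6, 0.666...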
Derivation of Maximum Entropy Methods

We define the information entropy

S = -∫ v(x) Ln v(x) dx   (4.15)

where S is a functional of the continuously varying random variable x [1]. As in Chapter 3, we use the variational method to maximize S. In order to obtain the most probable distribution function, the constraints are carried into the entropy formalism by Lagrange multipliers:

S = -∫ v(x) Ln v(x) dx - λ0(∫ v(x) dx - 1) - Σ(n=1..k) λn(∫ v(x) x^n dx - <x^n>)   (4.16)

where the second term is the normalization constraint, k is an even integer, and the λn are the Lagrange multipliers. If we take the variation of S with respect to v and set it equal to zero, we have an expression for the probability density function,

V(x) = exp[-Σ(n=0..k) λn x^n]   (4.17)

Let

φ(k) = ∫ exp(ikx) V(x) dx = ∫ exp(ikx) exp[-Σ(n=0..k) λn x^n] dx   (4.18)

where we have assumed that the leading multiplier λk > 0. If we integrate φ(k) by parts and use the boundary conditions at infinity, we obtain a differential equation for the Fourier transform of the probability density function:

(ik - λ1) φ(k) - Σ(n=2..k) n λn i^-(n-1) ∂^(n-1)φ(k)/∂k^(n-1) = 0   (4.19)

If we expand the Fourier transform in moments,

φ(k) = Σ(n=0..∞) (ik)^n <x^n> / n!   (4.20)

substitute it back into Equation (4.19), and arrange all coefficients according to their powers of k, we obtain a system of linear equations,

Σ(n=1..k) n λn <x^(n+j-2)> = (j - 1) <x^(j-2)>   (4.21)

for j = 1, 2, 3, 4, ..., m (m must be an even integer; the reason why will be explained later). Put into matrix form,

| <x^0>  <x^1>  <x^2>  <x^3>  ... |   | λ1  |     |   0    |
| <x^1>  <x^2>  <x^3>  <x^4>  ... |   | 2λ2 |     | <x^0>  |
| <x^2>  <x^3>  <x^4>  <x^5>  ... | * | 3λ3 |  =  | 2<x^1> |
| <x^3>  <x^4>  <x^5>  <x^6>  ... |   | 4λ4 |     | 3<x^2> |
| ...                             |   | ... |     | ...    |
   (4.22)

Let us define the above equation as

A Λ = d   (4.23)

where A is the (infinite) matrix of moments,

A(j, n) = <x^(n+j-2)>   (4.24)

Λ = (λ1, 2λ2, 3λ3, 4λ4, ...)^t   (4.25)

(t denotes transpose), and

d = (0, <x^0>, 2<x^1>, 3<x^2>, ...)^t   (4.26)
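The truncated version of this system is easy to assemble mechanically. The sketch below (Python with numpy; the function names are mine, not the thesis's) builds the K-by-K form of Equation (4.22) from a list of moments, using the index pattern of Equation (4.21).

import numpy as np

def moment_system(mu, K):
    # Truncated system (4.22): A u = d with A[j, n] = <x^(n+j)> and
    # u[n] = (n+1)*lambda_{n+1}, for j, n = 0..K-1 (a 0-indexed form of
    # Eq. (4.21)); mu[m] holds <x^m> and must extend to m = 2K-2.
    A = np.array([[mu[n + j] for n in range(K)] for j in range(K)])
    d = np.array([j * mu[j - 1] if j > 0 else 0.0 for j in range(K)])
    return A, d

def solve_lagrange(mu, K):
    A, d = moment_system(mu, K)
    return np.linalg.solve(A, d) / np.arange(1, K + 1)   # lambda_1..lambda_K

The worked examples later in this chapter are small enough to write out by hand, but they all follow this pattern; on a semi-infinite or finite domain the right-hand side d also picks up the boundary terms derived below.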
Ideally, we could solve the infinite system of linear equations (4.22) to obtain the Lagrange multipliers λn. If we know all the λn, then we can calculate λ0 by using the following formula:

exp(-λ0) = <x^0> / Z   (4.27)

where

Z = ∫ exp[-Σ(n=1..k) λn x^n] dx   (4.28)

so we have finally completed the solution of V(x):

V(x) = exp[-Σ(i=0..k) λi x^i]   (4.29)

Here we have assumed that the moment expansion of φ(k) converges. If we expand φ(k) in terms of cumulants instead of moments and use this expansion of the Fourier transform of the probability distribution, we obtain a different result. Let

φ(k) = exp[Σ(n=0..k) cn (ik)^n / n!]   (4.30)

If we substitute this into Equation (4.19), expand the equation, and identify the coefficients of the powers of k, we obtain another form of the linear system:

| 1   c1   c1^2 + c2             c1^3 + 3c1c2 + c3               ... |   | λ1  |     | 0   |
| 0   c2   2c1c2 + c3            3c1^2c2 + 3c1c3 + 3c2^2 + c4    ... |   | 2λ2 |     | 1   |
| 0   c3   2c1c3 + 2c2^2 + c4    3c1^2c3 + ... + 9c2c3 + c5      ... | * | 3λ3 |  =  | 0   |
| 0   c4   2c1c4 + 6c2c3 + c5    3c1^2c4 + ... + 9c3^2 + c6      ... |   | 4λ4 |     | 0   |
| ...                                                                |   | ... |     | ... |
   (4.31)

If we define this as

C β = a   (4.32)

where C equals the matrix of cumulant combinations above and β = (λ1, 2λ2, 3λ3, ...)^t   (4.33)

and

a = (0, 1, 0, 0, ...)^t   (4.34)
Moments and Cumulants

Since we can use either the moment method or the cumulant method to expand the Fourier transform, we should know the relationship between them in order to understand how they affect each other. From Equations (4.20) and (4.30),

Σ(n=0..∞) <x^n> (ik)^n / n! = exp[Σ(n=1..k) cn (ik)^n / n!]   (4.35)

Writing t = ik, the left-hand side is

lhs = 1 + <x> t + <x^2> t^2/2! + ... + <x^r> t^r/r! + ...   (4.36)

and the right-hand side is

rhs = exp{ c1 t + c2 t^2/2! + ... + cr t^r/r! + ... }
    = {1 + c1 t/1! + (c1 t)^2/2! + ...} {1 + c2 t^2/2! + (1/2!)(c2 t^2/2!)^2 + ...} ... {1 + cr t^r/r! + (1/2!)(cr t^r/r!)^2 + ...} ...   (4.37)

It is worth noting that the very tedious process of writing down the explicit relation for particular values of n may be shortened considerably. In fact, differentiating Equation (4.35) with respect to cj we have

Σ(n) (∂<x^n>/∂cj) t^n/n! = (t^j/j!) Σ(n) <x^n> t^n/n!   (4.38)

hence, identifying the powers of t,

∂<x^n>/∂cj = [n! / (j!(n-j)!)] <x^(n-j)>   (4.39)

In particular, for j = 1,

∂<x^n>/∂c1 = n <x^(n-1)>   (4.40)

Therefore, given any <x^n> in terms of the c's, we can write down successively those of lower orders by differentiation:

<x>   = c1   (4.41)

<x^2> = c2 + c1^2   (4.42)

<x^3> = c3 + 3c2c1 + c1^3   (4.43)

<x^4> = c4 + 4c3c1 + 3c2^2 + 6c2c1^2 + c1^4   (4.44)

<x^5> = c5 + 5c4c1 + 10c3c2 + 10c3c1^2 + 15c2^2c1 + 10c2c1^3 + c1^5   (4.45)

<x^6> = c6 + 6c5c1 + 15c4c2 + 15c4c1^2 + 10c3^2 + 60c3c2c1 + 20c3c1^3 + 15c2^3 + 45c2^2c1^2 + 15c2c1^4 + c1^6   (4.46)

The above moments are the regular (raw) moments.
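These relations are easy to spot-check numerically. For the exponential density exp(-x) on x >= 0 the cumulants are cn = (n-1)! and the raw moments are <x^n> = n!, so Equations (4.41)-(4.44) can be verified directly (Python; the test case is mine, not the thesis's).

# Cumulants of exp(-x) on x >= 0: c_n = (n-1)!.
c1, c2, c3, c4 = 1.0, 1.0, 2.0, 6.0

m1 = c1                                            # Eq. (4.41)
m2 = c2 + c1**2                                    # Eq. (4.42)
m3 = c3 + 3*c2*c1 + c1**3                          # Eq. (4.43)
m4 = c4 + 4*c3*c1 + 3*c2**2 + 6*c2*c1**2 + c1**4   # Eq. (4.44)

print(m1, m2, m3, m4)   # 1, 2, 6, 24: the factorial moments n!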
Conversely, the cumulants follow from

Σ(r) cr t^r/r! = Ln[ 1 + Σ(r) <x^r> t^r/r! ]   (4.47)

and the first six terms are the following:

c1 = <x>   (4.48)

c2 = <x^2> - <x>^2   (4.49)

c3 = <x^3> - 3<x^2><x> + 2<x>^3   (4.50)

c4 = <x^4> - 4<x^3><x> - 3<x^2>^2 + 12<x^2><x>^2 - 6<x>^4   (4.51)

c5 = <x^5> - 5<x^4><x> - 10<x^3><x^2> + 20<x^3><x>^2 + 30<x^2>^2<x> - 60<x^2><x>^3 + 24<x>^5   (4.52)

c6 = <x^6> - 6<x^5><x> - 15<x^4><x^2> + 30<x^4><x>^2 - 10<x^3>^2 + 120<x^3><x^2><x> - 120<x^3><x>^3 + 30<x^2>^3 - 270<x^2>^2<x>^2 + 360<x^2><x>^4 - 120<x>^6   (4.53)

Analysis of the MAXENT Algorithm

From Equation (4.22) we can see that the system of linear equations is infinite, and it is impossible to solve such a system exactly. There are many ways to approximate the solution to this kind of system; one of them is to truncate the system to an even order K (K must be an even number). The truncated system still involves higher order moments, which is not the case for the Gaussian distribution: a Gaussian requires only two moments to describe its characteristics. Therefore, maximum entropy in general becomes a problem of a non-linear nature, and, except for the Gaussian distribution, we need to make an intelligent guess for the higher moments in order to solve the system for the Lagrange multipliers. The Lagrange multipliers for a Gaussian distribution can be obtained from two linear equations; higher order moments are not required there because they do not influence the solution.

The algorithm developed here improves the approximation to maximum entropy by partitioning the equations into two matrix systems,

A Λ = d   (4.54)

B β = d'   (4.55)

The second system has H equations (H is an even number). Assume that <x^(K+2)> is the highest moment to be used and that all the moments above <x^(K+2)> are set to zero, and rearrange the second system so that it is solved for moments instead of Lagrange multipliers: B is an H by H matrix built from the Lagrange multipliers, and β is an H by 1 vector of moments. From the first system we approximate the values of the moments in A and use these values to solve for the Lagrange multipliers. Then the calculated Lagrange multipliers are inserted into the second system of equations to calculate the new moments. After that, the new moment values from Equation (4.55) are substituted back to update the matrix A in Equation (4.54). The moments and the Lagrange multipliers finally converge in a few iterations. The method works under two assumptions: first, an educated initial guess is available; second, the Lagrange multipliers are consistent with the integrability of the distribution.
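A schematic rendering of this iteration is sketched below (Python with numpy). Two caveats: the support, grid, iteration count, and zero initial guess for the unknown higher moments are all assumptions of mine; and the update step recomputes the unknown moments by direct quadrature of V(x) rather than by solving the second linear system (4.55) itself, a stand-in that plays the same role of feeding updated moments back into the matrix A.

import numpy as np

def maxent_iterate(mu_known, K, iters=20, xgrid=None):
    # mu_known holds the measured moments <x^0>..<x^J>; the higher
    # moments needed to fill the K-by-K matrix are first guessed as
    # zero and then updated from the current estimate of V(x).
    if xgrid is None:
        xgrid = np.linspace(0.0, 10.0, 4001)       # assumed support
    n_mu = 2 * K - 1                               # need <x^0>..<x^(2K-2)>
    mu = np.zeros(n_mu)
    mu[:len(mu_known)] = mu_known
    lam = np.zeros(K)
    for _ in range(iters):
        A = np.array([[mu[n + j] for n in range(K)] for j in range(K)])
        d = np.array([j * mu[j - 1] if j > 0 else 0.0 for j in range(K)])
        lam = np.linalg.lstsq(A, d, rcond=None)[0] / np.arange(1, K + 1)
        expo = np.polyval(np.append(lam[::-1], 0.0), xgrid)
        V = np.exp(-expo)
        V *= mu[0] / np.trapz(V, xgrid)            # hold <x^0> fixed
        for m in range(len(mu_known), n_mu):       # update unknown moments
            mu[m] = np.trapz(xgrid**m * V, xgrid)
    return lam, mu

As stated above, convergence is not guaranteed in general; it depends on the quality of the initial guess and on the integrability of the resulting distribution.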
There is no difference between Equation (4.22) and the system of linear equations obtained from a non-linear least-squares approximation. Nevertheless, in the least-squares approach the development of the moment matrix is based on an assumed form of the underlying probability distribution, and a series expression of V(x) in terms of expansion functions is needed. In the MAXENT approach no such assumptions are made. In addition, the maximum entropy method is more flexible: additional constraint information can be fed into the system of equations without any trouble.

When I used the maximum entropy algorithm, I found some interesting points. If I used ten or more moments in calculating the Lagrange multipliers, the numerical solutions for the Lagrange multipliers were not as good as I wanted them to be; they were very noisy. This might be due to an ill-conditioned matrix, because the values of the elements in the first column and in the last column are very different: the values in the first column are too small, while the values in the last column are too big. The highest moment I used in the algorithm was the sixth moment, which seemed to give good results. Finally, the numerical values of the Lagrange multipliers calculated from the moment method and from the cumulant method were very close to each other.

Application in Infinite Domain

Let us consider the classical problem. As mentioned earlier, the Gaussian is one of the special functions that only requires two moments to describe it, so the Lagrange multipliers can be calculated from two linear equations. Put into matrix form,

| <x^0>  <x^1> |   | λ1  |     |   0   |
| <x^1>  <x^2> | * | 2λ2 |  =  | <x^0> |
   (4.56)

Using Cramer's rule, we obtain

λ1 = -<x^0><x^1> / (<x^0><x^2> - <x^1>^2)   (4.57)

2λ2 = <x^0>^2 / (<x^0><x^2> - <x^1>^2)   (4.58)

and the normalization is fixed by

exp(-λ0) = <x^0> / ∫ exp(-λ1 x - λ2 x^2) dx   (4.59)

Example 1: a distribution of Gaussian type. Let

f(x) = exp(-x^2)   (4.60)

The (normalized) moments are

<x^n> = ∫ exp(-x^2) x^n dx / ∫ exp(-x^2) dx = Γ((n+1)/2) / Γ(1/2)   (4.61)
for n = 0, 2, 4, ... (even numbers); the moments are equal to zero for n = 1, 3, 5, ... (odd numbers). Thus

<x^0> = 1
<x^1> = <x^3> = <x^5> = 0
<x^2> = 0.5
<x^4> = 0.75
<x^6> = 1.875

Therefore the Lagrange multipliers are calculated from the following system:

| 1    0     0.5   0     |   | λ1  |     | 0   |
| 0    0.5   0     0.75  |   | 2λ2 |     | 1   |
| 0.5  0     0.75  0     | * | 3λ3 |  =  | 0   |
| 0    0.75  0     1.875 |   | 4λ4 |     | 1.5 |
   (4.62)

We get the solution λ2 = 1 and λ1 = λ3 = λ4 = 0; in this case the solution is exact. In fact, if we set the higher moments equal to zero, we can still get the same result, because a Gaussian distribution is completely determined by its first and second moments. Therefore

V(x) = exp(-x^2)   (4.63)

This expression is plotted in Figure 4.1.
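The system (4.62) can also be checked numerically in a few lines (Python with numpy); the solve recovers λ2 = 1 and zeros elsewhere, as stated above.

import numpy as np

# Moment matrix and right-hand side of Eq. (4.62) for f(x) = exp(-x^2).
A = np.array([[1.0,  0.0,  0.5,  0.0  ],
              [0.0,  0.5,  0.0,  0.75 ],
              [0.5,  0.0,  0.75, 0.0  ],
              [0.0,  0.75, 0.0,  1.875]])
d = np.array([0.0, 1.0, 0.0, 1.5])

u = np.linalg.solve(A, d)            # u = (lam1, 2*lam2, 3*lam3, 4*lam4)
print(u / np.array([1, 2, 3, 4]))    # -> [0. 1. 0. 0.], i.e. V(x) = exp(-x^2)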
Application in Semi-Infinite Domain

We define the moments

<x^n> = ∫(0..∞) v(x) x^n dx   (4.64)

and the Fourier transform of the probability density function,

φ(k) = ∫(0..∞) v(x) exp(ikx) dx = Σ(n=0..∞) (ik)^n <x^n> / n!   (4.65)

On the semi-infinite domain the integration by parts picks up a boundary term at x = 0, and we can calculate the Lagrange multipliers from the following system:
[Figure 4.1: Comparison of f[x] and v[x]]
| <x^0>  <x^1>  <x^2>  ... |   | λ1  |     | V(0)   |
| <x^1>  <x^2>  <x^3>  ... |   | 2λ2 |     | <x^0>  |
| <x^2>  <x^3>  <x^4>  ... | * | 3λ3 |  =  | 2<x^1> |
| ...                      |   | ... |     | ...    |
   (4.66)

Example 1: a distribution function with a known initial condition. Let

V(x) = exp(-x)   (4.67)

The moments are given as

<x^n> = ∫(0..∞) exp(-x) x^n dx = n!   (4.68)

so that <x^0> = 1, <x^1> = 1, <x^2> = 2, <x^3> = 6, <x^4> = 24. Substituting these values into Equation (4.66),

| 1  1  2  |   | λ1  |     | 1 |
| 1  2  6  | * | 2λ2 |  =  | 1 |
| 2  6  24 |   | 3λ3 |     | 2 |
   (4.69)
We get λ1 = 1 and λ2 = λ3 = 0; again the maximum entropy solution is exact:

V(x) = exp(-x)   (4.70)

This function is plotted in Figure 4.2. Now let us look at another application of Equation (4.66), one which yields an inexact solution.

Example 2: Let

V(x) = x exp(-x)   (4.71)

The moments are given as

<x^n> = ∫(0..∞) x exp(-x) x^n dx = (n+1)!   (4.72)

so that <x^0> = 1, <x^1> = 2, <x^2> = 6, <x^3> = 24, <x^4> = 120, <x^5> = 720, <x^6> = 5040.
[Figure 4.2: Comparison of f[x] and v[x]]
Substituting these values into Equation (4.66), and noting that V(0) = 0 here,

| 1   2    6    24   |   | λ1  |     | 0  |
| 2   6    24   120  |   | 2λ2 |     | 1  |
| 6   24   120  720  | * | 3λ3 |  =  | 4  |
| 24  120  720  5040 |   | 4λ4 |     | 18 |
   (4.73)

The solutions are

λ1 = -3, λ2 = 1.5, λ3 = -0.22222, λ4 = 0.0104167

therefore the function can be approximated as

V(x) = exp(-2.75 + 3x - 1.5x^2 + 0.22222x^3 - 0.0104167x^4)   (4.74)

This expression is plotted in Figure 4.3.
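The same computation in code (Python with numpy; a numerical check of the numbers above rather than anything new):

import numpy as np
from math import factorial

# Example 2: f(x) = x exp(-x) on [0, inf), <x^n> = (n+1)!, V(0) = 0.
mu = [factorial(n + 1) for n in range(7)]            # <x^0>..<x^6>
A = np.array([[mu[n + j] for n in range(4)] for j in range(4)], float)
d = np.array([0.0, mu[0], 2*mu[1], 3*mu[2]])         # rhs of Eq. (4.73)

lam = np.linalg.solve(A, d) / np.array([1, 2, 3, 4])
print(lam)   # -> [-3.0, 1.5, -0.2222..., 0.0104166...], matching Eq. (4.74)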
[Figure 4.3: Comparison of f[x] and v[x]]
Application in Finite Domain

When the domain of the probability function is a finite region a <= x <= b, we define the moments

<x^n> = ∫(a..b) v(x) x^n dx   (4.75)

and the Fourier transform of the probability function

φ(k) = ∫(a..b) v(x) exp(ikx) dx = Σ(n=0..∞) (ik)^n <x^n> / n!   (4.76)

If the probability distribution is not zero at the initial and end points, then we must use the boundary conditions to calculate the Lagrange multipliers. Substituting these conditions into Equation (4.22), we have

| <x^0>  <x^1>  <x^2>  ... |   | λ1  |     | V(a) - V(b)                  |
| <x^1>  <x^2>  <x^3>  ... |   | 2λ2 |     | <x^0> + aV(a) - bV(b)        |
| <x^2>  <x^3>  <x^4>  ... | * | 3λ3 |  =  | 2<x^1> + a^2 V(a) - b^2 V(b) |
| <x^3>  <x^4>  <x^5>  ... |   | 4λ4 |     | 3<x^2> + a^3 V(a) - b^3 V(b) |
| ...                      |   | ... |     | ...                          |
   (4.77)

Example 1: Let

V(x) = x exp(-x^2)   (4.78)
The moments are defined as

<x^n> = ∫(0..∞) x exp(-x^2) x^n dx = Γ((n+2)/2) / 2   (4.79)

where

<x^0> = 0.5
<x^1> = 0.4431113
<x^2> = 0.5
<x^3> = 0.66467
<x^4> = 1
<x^5> = 1.6616755
<x^6> = 3

Using Equation (4.77), we can approximate the Lagrange multipliers:

λ1 = -10.2, λ2 = 12.415, λ3 = -5.653, λ4 = 0.969

therefore the function can be approximated as

V(x) = exp(-3.5 + 10.2x - 12.415x^2 + 5.653x^3 - 0.969x^4)   (4.80)

This function is plotted in Figure 4.4.
[Figure 4.4: Comparison of f[x] and v[x]]
Example 2: Gamma distribution. The gamma distribution is defined as

P(y) = y^(α-1) exp(-y/β) / (β^α Γ(α))   (4.81)

The moment generating function m(t) of a continuous random variable y is defined as E[exp(ty)]; therefore

m(t) = E[e^(ty)] = ∫(0..∞) exp(ty) y^(α-1) exp(-y/β) / (β^α Γ(α)) dy = (1 - βt)^(-α)   (4.82)

Using a Taylor series expansion, we can expand Equation (4.82):

(1 - βt)^(-α) = 1 + αβt + α(α+1)β^2 t^2/2! + α(α+1)(α+2)β^3 t^3/3! + ...   (4.83)

Therefore, in general,
<y^k> = β^k α(α+1)(α+2)...(α+k-1)   (4.84)

If we choose α = 2.5 and β = 2, then

<y^0> = 1, <y^1> = 5, <y^2> = 35, <y^3> = 315, <y^4> = 3465, <y^5> = 45045, <y^6> = 675675

Using Equation (4.77) (with V(0) = 0, since α > 1), we can approximate the Lagrange multipliers:

λ1 = -1.50001, λ2 = 0.300003, λ3 = -0.0190478, λ4 = 0.000396831

therefore the function can be approximated as

V(x) = exp(-4.1 + 1.5x - 0.3x^2 + 0.019x^3 - 0.000397x^4)   (4.85)

This function is plotted in Figure 4.5.
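The moments follow from the recurrence <y^(k+1)> = β(α+k)<y^k> implied by Equation (4.84), and the multipliers above can be reproduced with the same linear solve as before (Python with numpy; the zero first entry of the right-hand side encodes the vanishing boundary term P(0) = 0).

import numpy as np

# Gamma moments via Eq. (4.84) with a = 2.5, beta = 2.
a, beta = 2.5, 2.0
mu = [1.0]
for k in range(6):
    mu.append(mu[-1] * beta * (a + k))        # 1, 5, 35, 315, 3465, ...

A = np.array([[mu[n + j] for n in range(4)] for j in range(4)])
d = np.array([0.0, mu[0], 2*mu[1], 3*mu[2]])  # boundary term vanishes: P(0) = 0
lam = np.linalg.solve(A, d) / np.array([1, 2, 3, 4])
print(lam)   # -> approx [-1.5, 0.3, -0.019048, 0.00039683]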
[Figure 4.5: Comparison of f[x] and v[x]]
Constructing PDF from the Experimental Data

All of the following experimental data were obtained from the lab at the University of Colorado at Denver; in each case the underlying probability density function is unknown.

Experiment 1 is plotted in Figure 4.6.
Experiment 2 is plotted in Figure 4.7.
Experiment 3 is plotted in Figure 4.8.
Experiment 4 is plotted in Figure 4.9.
Experiment 5 is plotted in Figure 4.10.
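For data of this kind the full pipeline is: estimate the sample moments, assemble the truncated system (4.22), solve for the multipliers, and normalize. The sketch below (Python with numpy) runs the pipeline on simulated stand-in samples, since the laboratory records themselves are not reproduced here; with noisy high-order moments the recovered multipliers are only approximate.

import numpy as np

rng = np.random.default_rng(1)
x = rng.gamma(2.5, 2.0, 200_000)          # stand-in for detector samples

K = 4
mu = np.array([np.mean(x**m) for m in range(2 * K - 1)])
A = np.array([[mu[n + j] for n in range(K)] for j in range(K)])
# d[0] = 0 assumes the density vanishes at the origin, as for this stand-in.
d = np.array([j * mu[j - 1] if j > 0 else 0.0 for j in range(K)])
lam = np.linalg.solve(A, d) / np.arange(1, K + 1)

g = np.linspace(0.0, x.max(), 2001)       # normalization grid
lam0 = np.log(np.trapz(np.exp(-np.polyval(np.append(lam[::-1], 0.0), g)), g))
print(lam0, lam)   # V(x) = exp(-lam0 - lam1*x - ... - lam4*x^4)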
[Figure 4.6: v[x] is an unknown probability density function]

[Figure 4.7: v[x] is an unknown probability density function]

[Figure 4.8: v[x] is an unknown probability density function]

[Figure 4.9: v[x] is an unknown probability density function]

[Figure 4.10: v[x] is an unknown probability density function]
CHAPTER 5
DISCUSSION AND CONCLUSION

In this paper we have carefully examined the classical moment problem using maximum entropy. An explicit procedure for approximating the Lagrange multipliers has been developed for infinite and finite domains for the classical problems. The method is based on developing a differential equation for the Fourier transform of the probability distribution, and it allows a physical interpretation of the various Lagrange multipliers in terms of the cumulants and moments of the distribution. The algorithm is also applicable to certain types of non-linear differential equations. The difference between this algorithm and the old classical algorithm is that the new method eliminates the tedious solution of systems of non-linear equations by solving two linear systems of equations instead. Also, the new algorithm does not make the assumption, as in the maximum likelihood problem, that the underlying probability distribution is Gaussian, which is not always true in the real world.
The MAXENT algorithm allows us to truncate the infinite sum in the expansion of the Fourier transform of the probability density function. In fact, the algorithm uses only the first few moments to approximate the Lagrange multipliers, yet it gives good results.
APPENDIX

This program is used to plot Figure 4.1. Programming language: Mathematica.

a = 0
b = 1
c = 0
d = 0
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
e = 0
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[v[x], {x, -3, 3}, AxesLabel -> {"x", "v[x]"}]
f[x_] = Exp[-x^2]
Plot[{v[x], f[x]}, {x, -3, 3}, AxesLabel -> {"x", "f[x] and v[x]"}]
This program is used to plot Figure 4.2. Programming language: Mathematica.

a = 1
b = 0
c = 2.9*10^-8
d = 3.7*10^-9
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[g[x], {x, 0, 10}]
e = Log[init]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[v[x], {x, 0, 3}, AxesLabel -> {"x", "v[x]"}]
f[x_] = Exp[-x]
Plot[{v[x], f[x]}, {x, 0, 3}, AxesLabel -> {"x", "f[x] and v[x]"}]
This program is used to plot Figure 4.3. Programming language: Mathematica.

a = -3
b = 1.5
c = -0.22222
d = 0.0104167
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[g[x], {x, 0, 10}]
e = Log[init]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[v[x], {x, 0, 5}, AxesLabel -> {"x", "v[x]"}]
f[x_] = x Exp[-x]
Plot[{v[x], f[x]}, {x, 0, 3}, AxesLabel -> {"x", "f[x] and v[x]"}]
This program is used to plot Figure 4.4. Programming language: Mathematica.

a = -10.2
b = 12.415
c = -5.653
d = 0.969
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[g[x], {x, 0, 10}]
e = Log[init]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[v[x], {x, 0, 5}, AxesLabel -> {"x", "v[x]"}]
f[x_] = x*Exp[-x^2]   (* the comparison function of Eq. (4.78) *)
Plot[{v[x], f[x]}, {x, 0, 3}, AxesLabel -> {"x", "f[x] and v[x]"}]
This program is used to plot Figure 4.5. Programming language: Mathematica.

a = -1.5
b = 0.3
c = -2/105
d = 1/2520
g[x_] = Exp[-a*x - b*x^2 - c*x^3 - d*x^4]
init = NIntegrate[g[x], {x, 0, 40}]   (* integrate far enough to normalize *)
e = Log[init]
v[x_] = Exp[-e - a*x - b*x^2 - c*x^3 - d*x^4]
Plot[v[x], {x, 0, 15}, AxesLabel -> {"x", "v[x]"}]
f[x_] = x^1.5*Exp[-x/2]/(2^2.5*Gamma[2.5])   (* gamma density, Eq. (4.81) *)
Plot[{v[x], f[x]}, {x, 0, 15}, AxesLabel -> {"x", "f[x] and v[x]"}]
program: MAXENT.BAS
programming language: VAX Basic

print chr$(27); "[2J";                  ! clear the screen

main:
y$ = "Y"
until ( y$ <> "y" and y$ <> "Y" )
    input "input (size) n1 "; n1%
    input "input (size) n "; n%
    input "iteration times 'd' "; d
    length = 2 * n1% - 1
    lengh = 2 * n% - 1
    dim m(n1%, n1%), inv_m(n1%, n1%), b(n1%, 1), lamda(n1%, 1), g(length, 1)
    dim ma(n%, n%), inv_ma(n1%, n1%), c(n1%, 1), x(n1%, 1), h(length, 1)
    print "input matrix elements g(i); size is n1"
    mat input g
    when error in
        for d1 = 1 to d
            ! first system, Eq. (4.54): solve for the Lagrange multipliers
            call initial_lamda(m(,), b(,))
            call print_matrix("matrix m", m(,))
            call print_matrix("matrix b", b(,))
            mat inv_m = inv(m)
            call print_matrix("inversed matrix inv_m", inv_m(,))
            call calculate_solution_x(lamda(,), inv_m(,), b(,))
            call print_matrix("lamda", lamda(,))
            ! second system, Eq. (4.55): solve for the new moments
            call initial_ma(ma(,), c(,), h(,), lamda(,))
            call print_matrix("matrix ma", ma(,))
            call print_matrix("matrix c", c(,))
            call print_matrix("matrix h", h(,))
            mat inv_ma = inv(ma)
            call print_matrix("inversed matrix ma", inv_ma(,))
            call calculate_solution_x(x(,), inv_ma(,), c(,))
            call print_matrix("x", x(,))
            ! feed the updated moments back into g
            for l = lengh - 1 to lengh
                g(l, 1) = x(l - 5, 1)
            next l
            call print_matrix("solution matrix g", g(,))
        next d1
        call print_matrix("solution matrix x", x(,))
    use
        print "unresolvable matrix, try again"
    end when
    input "continue "; y$
next
end

! note: g, n1%, n%, k and l are assumed shared with the main program
sub initial_lamda ( hfloat m(,), b(,) )
    mat b = zer
    for l = 2 to n1%
        b(l, 1) = (l - 1) * g(l - 1, 1)
    next l
    k = 0
    for i = 1 to n1%
        for j = 1 to n1%
            m(i, j) = j * g(j + k, 1)
        next j
        k = k + 1
    next i
end sub

sub initial_ma ( hfloat ma(,), c(,), h(,), lamda(,) )
    mat c = zer
    c(1, 1) = -g(5, 1) * lamda(1, 1) + 4 * g(4, 1)
    c(2, 1) = 5 * g(5, 1)
    mat h = zer
    h(1, 1) = -6
    for l = 2 to 5
        h(l, 1) = lamda(l - 1, 1)
    next l
    mat ma = zer
    k = 3
    for i = 1 to n%
        b0 = 0
        for j = 1 to n%
            if ( (b0 + k) >= 1 ) then
                if ( (b0 + k - 1) = 0 ) then
                    h(l, 1) = h(l, 1) - 1
                    ma(i, j) = h(l, 1)
                else
                    ma(i, j) = (b0 + k - 1) * h(b0 + k, 1)
                end if
            end if
            b0 = b0 + 1
        next j
        k = k - 1
    next i
end sub

sub calculate_solution_x ( hfloat x(,), inv_a(,), b(,) )
    mat x = inv_a * b
end sub

sub print_matrix ( a$, hfloat out_matrix(,) )
    print a$
    mat print out_matrix;
end sub
BIBLIOGRAPHY

[1] C. Shannon, "A mathematical theory of communication." Bell System Technical Journal, 1948.

[2] E. Jaynes, "Information theory and statistical mechanics." Physical Review, Vol. 106, p. 620, 1957.

[3] L. Mead and N. Papanicolaou, "Maximum entropy in the problem of moments." Journal of Mathematical Physics, Vol. 25, p. 2404, 1984.

[4] J. Baker-Jarvis, J. Alameddine, and M. Racine, "Solving differential equations by a maximum entropy-minimum norm method with an application to Fokker-Planck equations." Journal of Mathematical Physics, Vol. 30, pp. 1459-1463, 1989.

[5] A. K. Majumdar, "Uniqueness of statistics derived from moments of irradiance fluctuations in atmospheric optical propagation." Optics Communications, Vol. 50, pp. 1-7, 1984.

[6] J. Baker-Jarvis and R. Geyer, "Application of the method of maximum entropy in stochastic processes and inverse problems." (To be published.)

[7] M. G. Kendall and A. Stuart, "The advanced theory of statistics." New York: Hafner Publishing Company, Vol. 1, pp. 55-104, 1969.

[8] W. Mendenhall, R. L. Scheaffer, and D. D. Wackerly, "Mathematical statistics with applications." Boston: Duxbury Press, 1978.

[9] R. G. Frehlich and J. H. Churnside, "Probability density function for estimates of the moments of laser scintillation." SPIE Proceedings, Infrared and Millimeter Wave Propagation Engineering, 1988.

[10] S. A. Brown-Van Hoozer, "Statistical laser irradiance fluctuations through atmospheric turbulence based on higher-order moments." Master's Thesis, University of Colorado at Denver, 1990.

[11] A. K. Majumdar, "Higher-order statistics of laser irradiance fluctuations due to turbulence." Journal of the Optical Society of America A, Vol. 1, p. 1067, Nov. 1984.