THE ANALYSIS AND APPLICATION OF THROUGHPUT IN
COMPUTER COMMUNICATION NETWORKS USING CSMA/CD
PROTOCOL
by Giao Van Pham
B.S., University of Colorado, 1989
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado in partial fulfillment
of the requirements for the degree of
Master of Science
Department of Electrical Engineering
1991
V
5
This thesis for the Master of Science degree by
Giao Van Pham
has been approved for the
Department of
Electrical Engineering
by
Joe Thomas
Date (%Â£ 6jl, mi
Pham, Giao Van (M.S., Electrical Engineering)
The Analysis and Application of Throughput in Computer
Communication Networks Using CSMA/CD Protocol.,
Thesis directed by Professor Alfred Fermelia
The stateoftheart in the deployment of computer
networks is in its infancy. As such, the theories being developed
which support deployment activities are unstructured and
empirically based. This thesis defines the three segments that
must be addressed in a consistent manner in order to ensure
proper deployment of a computer network.
Once these segments have been defined, the network can
be simulated using closed loop methodology (CLM). Using the
CLM approach to simulation provides a stateoftheart tool
which yields structure to this unstructured problem. Chapter 4
entitled "Closed Loop Methodology Applied to the Simulation of
an Ethernet" details this approach.
Moreover, in order to formulate and execute a
methodology that will produce quantitative requirements for
computer network deployment, the CLM approach to modeling
will be applied to the simulation of network segments
introduced in the introduction. Specific application will be made
to analyze the Carrier Sense Multiple Access with Collision
Detection (CSMA/CD) protocol. State variables necessary for CLM
application will be defined in terms of CSMA/CD parameters.
IV
This approach employs an adaptive processor which is
used to minimize the difference between actual measured
values obtained via a sensor such as a Lanalyzer predicted from
a model of the computer network. Applying the approach to an
ethernet will quantify the effect of workstations, TCP/IP,
throughput, jamming time, propagation time, and offered load
on network utilization.
Signed
Alfred Fermelia
V
DEDICATION
This thesis is respectfully dedicated to my grandparents,
Ky Pham and Loan Thi Phan, and my parents, Khuyen Pham and
Sinh Thi Truong who are still in Vietnam, for coming into this
beautiful world with their endless love.
This thesis is also lovingly dedicated to my wife, MyLinh
Thi Le, and my daughter, MyTrang Thi Pham, who have
patiently endured with me through my education and
supported me while I spent so much time away from home
preparing this thesis. However, I think that I can not thank my
wife enough for all her love, support, encouragement and
understanding without her, none of this would be possible.
VI
ACKNOWLEDGEMENTS
I would like to express my gratitude, my sincerest
appreciation, to Dr. Alfred Fermelia for his great support,
encouragement, and input during the preparation of this thesis,
which proved to be extremely helpful in the technical
presentation of the study. Also, the enthusiasm of Dr. Joe
Thomas, the chairman of Electrical Engineering and Computer
Science, and Dr. Jan Bialasiewicz, who have served with Dr.
Alfred Fermelia as the student's thesis committee, are gratefully
acknowledged. Without their support, I would not have been
able to further my education.
Special thanks are also given to my cousin, Son Nam
Nguyen, for his moral support and inspiration.
CONTENTS
CHAPTER
1. INTRODUCTION....................................... 1
2. ANALYSIS OF CARRIER SENSE MULTIPLE ACCESS
WITH COLLISION DETECTION IN RANDOM ACCESS
NETWORK........................................... 11
2.1 Introduction.............................. 11
2.2 The properties of CSMA/CD protocol.... 13
2.2.1 CSMA/CD performance................. 13
2.2.2 The operation of CSMA/CD............ 13
2.3 Contention of collision in CSMA/CD........ 14
3. INTRODUCTION OF CLM TO THE SIMULATION OF
AN ETHERNET....................................... 16
3.1 Introduction.............................. 16
3.2 The overview of CLM....................... 16
3.3 The overview of classification set........ 19
3.4 Open loop methodology.................... 23
3.5 Closed loop methodology................... 24
3.6 Basic concepts of the Kalman filter....... 25
3.7 Closed loop methodology design............ 30
4. THROUGHPUT ANALYSIS............................... 32
4.1 Introduction
32
Vlll
4.2 Poisson law................................. 33
4.3 The analysis of throughput using
CSMA/CD..................................... 36
5. DESIGN AND DEPLOY A COMPUTER NETWORK
WHICH OPTIMIZES RESOURCES........................... 46
5.1 Introduction................................ 46
5.2 CLM mechanization for the synthesis/real
data........................................ 46
5.3 OLM applied to CSMA/CD ethemet design
problem..................................... 50
5.4 OLM applied to CSMA/CD ethemet
measurement problem......................... 55
5.5 CLM applied to CSMA/CD ethemet
design vs. measurement problem ............. 58
6. APPLICATION RESULTS.................................. 60
6.1 Introduction................................ 60
6.2 Methods of stochastic approximation ........ 61
6.3 Characteristics of an estimation/prediction
problem..................................... 62
6.4 Explanation of scenarios.................... 63
7. CONCLUSION
85
IX
APPENDIX
A. KALMANHLTER............................ 87
B. STATE SPACE FORMULATION................ 98
C STATE VARIABLE DESCRIPTIONS OF
THROUGHPUT............................ 102
D. LAN PATROL............................ Ill
E INNOVATION PROCESS.................... 114
F. SIMULATION DATA....................... 116
G. SIMULATION PROGRAMS................... 121
BIBLIOGRAPHY ................................ 134
FIGURES
Figure
1.1 Network segment relationships........................... 2
1.2 Communication system components......................... 4
1.3 System/component levels in a communication
network................................................. 5
1.4 Component level of a communication network ............. 6
2.1 Flow diagram for CSMA................................ 12
2.2 Flow diagram for CSMA/CD............................... 14
2.3 Timing diagram for CSMA/CD............................. 15
3.1 High level description of the process.................. 21
3.2 Open loop methodology.................................. 23
3.3 Closed loop methodology................................ 25
3.4 An exploration of CLM applied to CSMA/CD............... 26
3.5 The Kalman filter block diagram........................ 27
3.6 Processing steps in the Kalman filter.................. 29
5.1 CLM mechanization for synthesis data................... 48
5.2 CLM mechanization for real data........................ 49
5.3 OLM applied to CSMA/CD ethemet design problem
(Predeployment produces predicted
measurement)........................................... 52
XI
5.4 OLM applied to CSMA/CD ethemet design problem
(Empirically derived throughput produces
predicted measurement)................................. 53
5.5 OLM applied to CSMA/CD ethemet design problem
(Requirement matching produces predicted
measurement).................................... 54
5.6 OLM applied to CSMA/CD ethemet
design/measurement problem (Deployment of
network produces actual measurement) ................ 56
5.7 OLM applied to CSMA/CD ethemet
design/measurement problem (Deployment of
network predicted vs. actual measurement) ....... 57
5.8 CLM applied to CSMA/CD ethemet
design/measurement problem (Deployment of
network predicted vs. actual measurement) ........... 59
6.1a Offered load (actual ~ predicted) vs. number of
events per workstation for N = 50.................. 67
6.1b Offered load (actual ~ predicted) vs. number of
events per workstation for N = 100................. 68
6.1c Offered load (actual ~ predicted) vs. number of
events per workstation for N = 500 ................ 69
6.Id Offered load (actual ~ predicted) vs. number of
events per workstation for N = 1000 ............... 70
6.2a Throughput (actual ~ predicted) vs. number of
events per workstation for N = 50.................. 71
6.2b Throughput (actual ~ predicted) vs. number of
events per workstation for N = 100 ................ 72
6.2c Throughput (actual ~ predicted) vs. number of
events per workstation for N = 500 ................ 73
xii
6.2d Throughput (actual ~ predicted) vs. number of
events per workstation for N = 1000 ............... 74
6.3a Output measurement (actual ~ predicted) vs.
number of events per workstation for N = 50 ....... 75
6.3b Output measurement (actual ~ predicted) vs.
number of events per workstation for N = 100 ...... 76
6.3c Output measurement (actual ~ predicted) vs.
number of events per workstation for N = 500 ...... 77
6.3d Output measurement (actual predicted) vs.
number of events per workstation for N = 1000 ..... 78
6.4a Actual vs. predicted histogram for N = 50............ 79
6.4a' Actual vs. predicted histogram for N = 50
(plotted at higher covariance)......................... 80
6.4b Actual vs. predicted histogram for N = 100 .......... 81
6.4b' Actual vs. predicted histogram for N = 100
(plotted at higher covariance)......................... 82
6.4c Actual vs. predicted histogram for N = 500 .......... 83
6.4d Actual vs. predicted histogram for N = 1000 ......... 84
D.l The LAN patrol and packet length information of
type/length in multiple operation systems
network............................................... 113
XI11
TABLES
Table
1.1 Segments of Network Deployment......................... 1
1.2 CSMA/CD Design Problem................................. 8
1.3 CSMA/CD Measurement Problem............................ 9
1.4 CSMA/CD Integration Problem........................... 10
3.1 Process Characterization.............................. 17
3.2 CLM Design Steps...................................... 30
5.1 Component Level vs. System Level...................... 47
A.l Alternative Form of Kalman Filter..................... 89
A.2 Least Square Form of Kalman Filter.................... 90
F.l Data of Scenario #1 (N = 50)......................... 117
F.2 Data of Scenario #2 (N = 100)........................ 118
F.3 Data of Scenario #3 (N = 500)....................... 119
F.4 Data of Scenario #4 (N = 1000)....................... 120
CHAPTER 1
INTRODUCTION
In order to understand and design a computer network
system, it is imperative that three distinct segments associated
with its deployment be investigated. These three segments are:
 the network design
 its measurement
 the integration of design and measurement
Each of these above segments can be further partitioned
into two additional levels, i.e., system and component. These
notions are illustrated in table 1.1 below.
1) DESIGN
2) MEASUREMENT defined at both
a) System level
b) Component level
3) INTEGRATION
Table 1.1: Segments of Network Deployment.
It should be noted that design a) is applied globally, b)
precedes deployment, and c) uses global statistics (exponential
2
and Poisson distributions which are detailed in section 4.2). In
contrast to design, measurement a) is applied locally, b) makes
its appearance only after deployment, and c) uses local statistics
(monitors packet size distribution and bandwidth utilization). In
order to address issues regarding design and measurement in a
systematic, beforethefact manner, the task of integrating
design and measurement is required to issue an optimal cost
effective network.
Figure 1.1 illustrates the conceptual relationship between
design, measurement, and integration. As shown in this figure,
design is based on global statistical information, empirical fits,
process uncertainties, and physical laws. Information from the
design process manifests itself as output in terms defined as the
measurable set. From this figure, the throughput as a function of
offered load can be considered as an example of this measurable
set.
DESIGN SEGMENT MEASUREMENT SEGMENT
BASIC OF A PRIORI MODEL MEASURABLE SET, M BASIC OF A PRIORI MEASUREMENT
GLOBAL STATISTICS * MEASUREMENT DEVICES
EMPIRICAL FITS MEASUREMENT UNCERTAINTIES
* PROCESS UNCERTAINTIES PHYSICAL LAWS LOCAL STATISTICS
SIGNATURE
SET.Z
INTEGRATION SEGMENT
Figure 1.1: Network segment relationships.
3
Measurement is based on devices used to measure the
performance of the deployed network (i.e., LAN patrol given in
appendix D), measurement uncertainties, and local statistics
such as packet size distribution. Output from the measurement
segment is designated as a member of an observable set. An
example of a member of this set is system utilization.
As indicated in table 1.1, segments of network deployment
can be partitioned into system and component levels of
activities. Before further discussing these concepts, it will be
useful to examine figure 1.2. This figure illustrates the
components of a communication system that must be simulated
in order to obtain a measure of realism. As shown in this figure,
software and hardware concerns are considered. Having
pictorialized the computer network components, the abstract
notions of system and component levels can now be defined.
If the gateway component of figure 1.2 is assumed to be a
WAN, the communication system may be partitioned into the
respective levels as shown in figure 1.3. However, assuming that
the communication system to be analyzed were only a LAN, the
host I/O channel, the communication controller and the
LAN/WAN gateway correspond to placement of a "1" (a unit
transfer function) in the gateway component. Under this
assumption, the network may be considered to be at the
component level. This is illustrated by figure 1.4.
Two system level activities relating to the design segment
are defined broadly as:
SYSTEM  HOST SYSTEM
4
'^1 \^'
(SOFTWARE) (PHYSICAL)
Figure 1.2: Communication system components
HOST
5
s
co
>*
CO
i
n
S
e
CO
CO
O
P
<
U

I
o
u
,r
SYSTEM LEVEL
'
_________
t
COMPONENT LEVEL
Figure 1.3: System/component levels in a communication network
6
2
P
Vi
>*
Vi
H
on
O
y
A
y
Figure 1.4: Component level of a communication network
7
a) The simulation of network architectures.
b) Closed loop methodology (CLM) design.
Under the first activity issues and trades to be considered
are routing, IEEE 802.X protocols, packet layer protocol and
FDDI. Discussion of CLM design as applied at the component and
system levels will be addressed in a later chapter.
Component level design activities can only be described
specifically after a given protocol has been selected. Assuming
that the CSMA/CD (Carrier Sense Multiple Access with Collision
Detection) protocol is to be evaluated, component level design
activities that require simulation are throughput, channel
utilization, packet sizing, propagation delay, statistics, jamming
time, and frame transmission time. Notions regarding system
and component level activities as they pertain to CSMA/CD
design are highlighted in table 1.2.
From the integration segment perspective, component level
activities associated with the measurement segment must
mirror those of the design. Therefore, component level activities
of measurement are identical to those of design. The same might
be said of the system level. That being the case, there appears
to be no distinction between design and measurement. However,
careful examination of table 1.3 indicates that this is not the
situation.
When comparing table 1.2 and table 1.3, design and
measurement are distinguished by their respective predictive
and measured features. The predictive capability relies on
8
global statistics which are invoked whereas measurement is
produced using statistics generated in a local sense. Therefore,
the assumptions for the basic understanding of design and
measurement are different.
SYSTEM LEVEL
(WAN Predictions)
*) Simulate Network Architecture
Routing
IEEE 802.X Protocols
Packet Layer Protocols (i.e.,X.25)
FDDI
*) CLM Design
COMPONENT LEVEL
(LAN Predictions)
*) Throughput Analysis
*) Channel Utilization
*) Packet Sizing
*) Propagation Delay
*) Statistics
*) Jamming Time
*) Frame Transmission Time
*) CLM Design
Table 1.2: CSMA/CD Design Problem.
SYSTEM LEVEL
(Measurements Obtained from a WAN)
*) Throughput
*) Channel Utilization
*) Packet Distribution
*) Propagation Delay
*) Statistics unique to each WAN
*) Jamming Time
*) Collisions
*) Frame Transmission Time
*) CLM Design
COMPONENT LEVEL
(Measurements Obtained from a LAN)
*) Throughput
*) Channel Utilization
*) Packet Distribution
*) Propagation Delay
*) Statistics unique to each WAN
*) Jamming Time
*) Collisions
*) Frame Transmission Time
*) CLM Design
Table 1.3: CSMA/CD Measurement Problem.
10
Similar to design and measurement, the integration
segment can be broken into component and system levels. The
objectives of the integration problem may be simply stated for
each level as shown in figure 1.4 below:
SYSTEM LEVEL
Compare predictions obtained using global statistics
to measurements that exhibit global statistics.
COMPONENT LEVEL
Compare predictions obtained using global statistics
to measurements that exhibit global statistics.
Table 1.4: CSMA/CD Integration Problem.
CHAPTER 2
ANALYSIS OF CARRIER SENSE MULTIPLE ACCESS WITH
COLLISION DETECTION IN RANDOM ACCESS NETWORK
2.1 Introduction
According to [6], the local networks which have shorter
propagation delay are relatively compact terrestrial radios with
stations close together. This kind of network requires stations to
"listen" to the channel to determine the acting situation of the
channel before the station attempts to transmit a packet. If the
channel is not available (it is busy or not ready to receive
signals), the packet sending from a random channel would be
overlapped. This should cause the transmission time to be
longer and the propagation of signals becomes more
sophisticated. The determination of propagation time over
transmission time will give more exact calculation for the
"listen" requirement. For example, a radio network provides
direct radio communication to connect the host processor and
collect the microprocessor controlled digital radio signal. The
channel operates at lOOKB/s (Kilobytes per second). Consider a
500B message is transmitted over a distance of 240 km. It is
easy to see that the transmission time is 5 milliseconds
12
. 500B .......... . ... .
ToOKB/s = ^ms). le message 1S propagated with a speed of
0.8c (eighttenths the speed of light) and if any of the messages
overlap, the message is destroyed and rescheduled for the next
transmission. The propagation time is therefore = 1ms.
In this example, the propagation time can be compared with the
transmission time by obtaining the ratio as 7mS = 5. This means
l ms
that about five messages have already been transmitted before
the availability of a channel is determined. Thus, if the channel
is idle, the station begins to transmit the packet. If the channel
is busy, the station can defer its transmission until the channel
is sensed to be idle. Figure 2.1 illustrates this action. This
approach is called the Carrier Sense Multiple Access method,
abbreviated as CSMA.
{TRANSMISSION}
Figure 2.1: Flow diagram for CSMA
CSMA/CD stands for Carrier Sense Multiple Access with
Collision Detection. The process of CSMA/CD is equivalent to
13
CSMA's. However, the CSMA/CD protocol is the CSMA protocol
with an additional feature of collision detection.
Unfortunately, the delay analysis for both approaches is
very difficult and sophisticated. Thus, the throughput or the
aspect of performance can not be obtained in the direct way.
However, its behavior may be investigated by using the Poisson
Law.
2.2 The properties of the CSMA/CD protocol
2.2.1 CSMA/CD performance: To implement the feature of
collision detection in CSMA, a hardware implementation
technique must be included in the transceiver. The purpose of
this transceiver is to detect the short time of the collision
interval and to minimize the channel time occupied by
successful transmission. The functional requirement of
hardware is to monitor the channel before transmitting and
also to "monitor the channel while transmitting." Moreover, the
CSMA/CD has a more dependent extended character in terms of
normalized propagation time a. The performance of CSMA/CD
approaches that of CSMAs if a decreases. The similarity
between protocols is illustrated by letting a approach unity.
2.2.2 The operation of CSMA/CD: The operation of CSMA/CD
can be explained by using the flow diagram in figure 2.2. In this
figure, after a packet is transmitted, CSMA/CD employs
14
hardware to listen while transmitting over the cable length. Two
things happened in collision detection:
i) If collision is detected, the packet is miscarried and the
collided packet must be backed off. At this time, the station
transmits a jamming signal to the channel.
ii) If collision is not detected, the packet is transmitted
successfully.
Figure 2.2: Flow diagram for CSMA/CD
2.3 Contention of Collision in CSMA/CD
Consider the timing diagram given in figure 2.3. The two
stations A and B are separated by a distance of d.
Assume that at time t0 a signal is transmitted from A to B
and will be received at station B after x seconds. Before
receiving the signal from A (or at time tQ+ Y, where Y< x ),
station B starts to transmit its own signal to A. This signal will
collide with a signal being sent from A.
15
Therefore, station B needs r seconds to detect the collision
right after it receives signal sending from A. At tg + x + T]; station
B also sends out a jamming signal for tj seconds. This jamming
signal arrives at A in x seconds.
Station A initiates its own jamming signal to send to B at
tQ + x + 2t in the same process. The total length of transmission
time is then 2t + ti + tj seconds.
time to transmit a signal from A to B.
v[: time to detect collision (same for all stations),
t;: jamming time.
Y: time (random variable) selected by B to begin
transmitting its signal to A.
Figure 2.3: Timing diagram for CSMA/CD
CHAPTER 3
INTRODUCTION OF CLM TO THE SIMULATION OF AN ETHERNET
3.1 Introduction
Since the magnitude of the computer network deployment
problem encompasses all the features necessary to fully
understand the propagation of packets from their origin to their
actual destination, it is natural to analytically view the problem
from a signal processing perspective. Reference [2] illustrates
how the CLM technique can be used to address this problem. A
brief description of the philosophy behind the approach will be
repeated here for ease of reference. Following the overview, a
detailed description illustrating the state variable formulation
for each component of the candidate system will be given.
3.2 The overview of CLM
The analysis of any physical system must begin by
characterizing the specific process to be modeled. This
characterization must consider system type, governing
equations of the process, and system questions to be answered.
Table 3.1 illustrates these notions.
17
a) TYPES OF SYSTEMS:
 Deterministic, i.e., systems free of uncertainty.
 Stochastic, i.e., systems with uncertainty.
b) SYSTEM DESCRIPTION:
 Ordinary differential equations (linear/nonlinear)
 Partial differential equations (linear/nonlinear).
 Algebraic equations.
c) SYSTEMS QUESTIONS TO BE ANSWERED:
 Solution to governing equations.
 Estimation.
 Identification.
 Control.
Table 3.1: Process Characterization.
Systems may be typed as being deterministic or stochastic.
A deterministic system is defined as one that incorporates no
uncertainty and the stochastic as one that does. True systems
may be represented by physical laws expressed by partial
differential equations (PDEs), ordinary differential equations
(ODEs), and algebraic equations (AEs). Hence, the governing
equations of the process are given by PDEs, ODEs, AEs, or a
combination of these three. In the case of the deterministic type
system, the solution is easily obtained. The stochastic system
solution presents additional concerns.
18
The solution of stochastic system may be obtained by
defining and solving the estimation, identification, and control
problems (EIC) associated with the given process. This study
illustrates the need for solving the EIC problems as they pertain
to signal propagation, i.e., packet propagation.
The control problem may be stated as the problem of
"nulling" the error that exists when predicted measurements are
compared with the actual measurements. This "nulling"
procedure is accomplished using either open loop methodology
(OLM) or closed loop methodology (CLM). These methodologies
will be described in more detail in the next sections. The CLMs
null out the error via an adaptive processor (AP). The functions
of the adaptive processor are to solve the estimation and
identification problems in a systematic manner. More details of
the description of the adaptive processor will be given in
chapter 5.
Having stated that OLM and CLM are essentially techniques
for controlling the comparison error, theoretical aspects
required to obtain the system results will now be discussed.
This is introduced by presentation of the thesis:
Control of the error which results from the comparison of
data obtained from the true system to that produced by an a
priori model will result in:
a) obtaining a classification set, which implies that the
signal processing problem can be solved. This in turn implies
that the system equations can be solved, or
19
b) denial of the classification set, which implies that
signature propagation is not well understood.
Acceptance of this thesis implies that the solution of the
control problem is imperative in order to model the computer
network.
In presenting the thesis, the term "classification set" must
be introduced in the following section.
3.3 The overview of classification set
Classification set is abbreviated as CS. The notion of set is
generally defined in mathematical terms. In particular, this set
may be given as the set that contains the following elements:
independent variables, dependent variables, coefficients of the
governing equations, and sensor output. This set mathematically
can be written as:
_( independent dependent coefficients sensor output 1
\variables variables of equation (measurement)!
(3D
Since this definition of the classification set holds for
propagation of a signal through any system, the elements of this
set will be obtained from the equations that govern the
propagation of the signal. Prior to doing this, the relationships of
estimation, identification and control to the classification set will
be given. Equation (32) shows the relationships. For example, a
solution to the estimation problem will yield estimates of the
20
independent variables and dependent variables. Similarly, the
solution to control provides both estimation and identification in
addition to providing an estimate of sensor output.
j independent dependent coefficients sensor output 1
=\variables variables of equation (measurement)]
v i \____________f
s/
ESTIMATION
v
IDENTIFICATION

CONTROL
(32)
Before further classifying of the classification set is
feasible, the basic elements of the physical process must be
considered. As shown in figure 3.1, two basic elements may be
defined as being:
 the subsystem under evaluation, and
 the measurement device (reference [3] defines these two
elements as the design and the measurement segments
associated with network deployment).
The subsystem under investigation produces a measurable
set, M, and the measurement device acting on this set produces
the output, Z. In addition to illustrating basic elements, figure
3.1 shows the basic for their physical makeup.
However, this description of the process does not allow total
definition of the classification set. For example, the only elements of
the classification set that are illustrated are the measurable set, M,
and the signature set, Z.
21
SUBSYSTEMS UNDER EVALUATION MEASURABLE MEASUREMENT DEVICE
BASIC OF A PRIORI DATA BASE * STATISTICAL INFORMATION EMPIRICAL FITS * PROCESS UNCERTAINTIES * PHYSICAL LAWS SET, M BASIC OF A PRIORI DATA BASE * TRANSFER FUNCTIONS MEASUREMENT UNCERTAINTIES
SIGNATURE
SET.Z
TEST CELL/A PRIORI DATA BASE
Figure 3.1: High level description of the process.
In addition, the measurable set contains only some of the
dependent variables, none of the independent variables, and
none of the coefficient set. The level of detail necessary to
quantify the elements of the classification set can be achieved
once the system under evaluation can be expressed
mathematically by a set of n first order discrete equations given
by
x(k+l) = $(k+l,k) x(k) + w(k) (33)
where x = nxl, state vector
$ = nxn, dynamics matrix
and the uncertainty w., is nxl zero mean white noise sequence
having a covariance given by
T f Qt if k=j
Eta^) =QAj = {0kifk,j
(34)
22
In similar fashion, the output of the measurement device is
related to the state of the system by the set of m equations
Â£(k+l) = H(k+1) x(k+l) + jr(k+l) (35)
where z = mxl, measurement vector
H = mxn, measurement matrix
x = nxl, state vector
and the measurement noise y_, is a zero mean white noise
sequence with its covariance given by
Rk+ld
Rk+iifk=i
0 if k*j
(36)
By representing the system via equations (33) and (35),
further classifying of the classification set is possible, i.e., the
classification set may be expressed as
STATE STATE ESTIMATE
VARIABLE UNCERTAINTY
I I
CS =x P
ESTIMATION
UNCERTAINTIES DUE SENSOR
PROCESS AND SENSOR 1 1 OUTPUT 1
i 1 <3> Q H R 1 *1
IDENTIFICATION
/
v
CONTROL
(37)
23
3.4 Open loop methodology
Figure 3.2 pictorializes the open loop problem as it applies
to network integration. As shown in this figure, open loop
control is characterized by its lack of feedback. The open loop
methodology uses a priori design predictions to compare with
the utilization obtained in the measurement segment. Results of
this comparison can range from complete agreement to no
agreement at all.
AGREEMENT
Figure 3.2: Open loop methodology
In addition, the confidence level of results of a given test is
typically low until a large statistical data base indicates the
attributes of the samples. In an effort to achieve agreement
between predicted and measured utilization, data base
parameters (packet size, etc.) and estimated stimuli (number of
workstations, application, etc.) are adjusted to reduce this error.
24
This "knob tweaking" is usually conducted by methods that are
far from being optimal.
Because CLMs have considerable shortcomings, the need
for a more systematic approach has become evident in recent
years. Modem system theory provides such an approach based
on CLMs that form the bases for most applications where
feedback is used to control the error.
3.5 Closed loop methodology
The closed loop methodology is shown in figure 3.3. As
shown in this figure, it uses adaptive procedures that provide a
design with quantified confidence levels. Modem system theory
recognizes the potential differences in predictions and
measurements that can be attributed to forcing function
uncertainty and incorrect estimates of the constituents of the
empirical equations used in the design. The adaptive processor
design attempts to accommodate these uncertainties and to
systematically keep the difference between prediction and
actual measurement output to a minimum.
In order to illustrate additional detail of how CLM is
applied to satisfy design and measurement concerns, figure 3.4
provides the next level detailing the concept. This figure
illustrates how the carrier sense multiple access with collision
detection (CSMA/CD) design's predicted utilization is compared
to that obtained in the performance segment. The adaptive
processor takes this error and tries to minimize it by
25
automatically changing packet sizes, applications, and number of
work stations.
Figure 3.3: Closed loop methodology.
3.6 Basic concepts of the Kalman filter
Notions discussed in the above sections can be further
classified by examination of figure 3.5. This figure shows how
the Kalman filter is used to produce an estimate, z, of the actual
data given by z. Comparison of these two values creates an
error, E, which is acted on by the Kalman gain, G. Output of the
Kalman gain box is then summed with a priori estimates of a
state vector to produce an estimate (a posteriori) of the state
vector. However, since this estimate of the state is based on
incorrect coefficients, trends in the error are more likely to
Z: utilization
S: throughput
a: normalized propagation time
b: normalized jamming lime
G: offered load
N: number of workstations
A: applications
P: packet size
B: bit rale
?: propagation time
Iji jamming time
where A indicates predicted value
Figure 3.4: An exploration of CLM applied to CSMA/CD
to
ON
Figure 3.5: The Kalman filter block diagram
to
28
occur. Removal of these trends is brought about by optimizing
the weighted residual error with respect to the parameters
which reside in the coefficient set. It should be noted that this
optimization takes place subject to the fact that the Kalman
filter equations are satisfied.
Referring to appendix A, Gk+1 is a notation that is
commonly used for the gain matrix in the Kalman filter,
quantity P^+i/k+i *s considered as a predicted covariance
matrix, and is considered as an a posteriori covariance
matrix (error covariance matrix). The Kalman filter description
can be partitioned into two types: type A (alternative
formulation) and type B (least square formulation). These two
types can be shown identically to each other. However, the
application of each type depends on the dimensions of matrices.
For example, type A is used when the measurement dimension
is less than the state's; type B is applied when the measurement
dimension is more than the state dimension. Each type can be
used if the dimensions of the state and of the measurement are
equal.
One of the most important features of the Kalman filter is
recursion process. Its recursion is extremely useful not only in
processing measurements to obtain the optimal estimate, but
also in utilizing a digital computer. The information process in
the Kalman filter can be divided into three main steps as
follows:
29
Step 1: The estimate is propagated forward by
premultiplying by system matrix A to give a prediction, xk+1^..
Step 2: xk+1yk is premultiplied by measurement matrix H,
the result is then subtracted from the actual measurement to
give error Ek+1/k.
Step 3: Ek+1^c is premultiplied by the Kalman gain G, and
the result is added to a prediction xk+1^k+1 to give a current
estimation. This value of estimation x^j^j is then stored from
time k to time k+1. When the next measurement is made, the
process is cyclically repeated from step 1.
These steps of the Kalman filter's process can be
summarized in figure 3.6.
Figure 3.6: Processing steps in the Kalman filter
30
3.7 Closed loop methodology design
Having completed the CLM overview, discussion will now
address how the CLM is actually designed. By following a very
specific set of steps, a realization of the approach may be
achieved. These design steps are detailed in table 3.2.
1) DEVELOP THE A PRIORI MODEL
2) DEFINE THE CLASSIFICATION SET FROM THE MODEL
3) PERFORM AN OLM SENSITIVITY STUDY
4) DESIGN THE ADAPTIVE PROCESSOR
i.e., DESIGN THE ESTIMATION
AND IDENTIFICATION ALGORITHMS
5) VALIDATE THE DESIGN VIA MEASUREMENT DATA
Table 3.2: CLM Design Steps.
Since the CLM approach is a system level analysis tool, the
development of the a priori model should be as modular and as
general (in the sense that all subsystems have the same
mathematical format) as possible. By using a state variable
representation of the system, the modularity and mathematical
generality requirements can be met. Having defined the
components via state variables, definition of the classification
set may the be easily obtained step two of the design process.
Details regarding the adaptive processor design can be obtained
from a number of references. Specifically, references [7] and [8]
offer descriptions regarding design of the adaptive processor.
31
The final step in the design process is verification of the results
by using data obtained from a measurement source.
CHAPTER 4
THROUGHPUT ANALYSIS
4.1 Introduction
Data flows in computer communication networks are
naturally stochastic processes. There are many terms considered
as random variables such as delay time, contention time, etc.
This chapter describes a stochastic process which characterizes a
communication network, i.e., the binomial process. However, in
order to simplify the mathematics, the binomial law will be
approximated by the Poisson law. It is this distribution that will
be described in this chapter.
The throughput is usually defined as the average number
of bits per unit time passing a given point in the network; and
the unit of throughput is found to be "bits/second." In this
thesis, the "normalized throughput" will be investigated. The
"normalized throughput," assigned by notation "S," is defined as
that throughput normalized by dividing it through the rate of
channel transmission (bits/second). In the other words, S is
calculated as the average number of successful transmissions
per packet transmission using the approach of Poisson law. S is
dimensionless and its range should be from 0 to 1. Section 3.2
33
introduces the asymptotic behavior of the binomial law which
will give the reader an idea of using Poisson approximation in
random variable analysis of throughput. Section 3.3 is the
explosion of throughput analysis of a random local
communication network. The linearization method of
throughput using the partial derivative approaches will be
given in appendix C.
4.2 Poison law
The Poison law can be used as an approximation to the
binomial law. Take, for example, the event Er = {r successful
transmission in n independent trials}. Let:
ej denote the strings with r successes,
p = probability of success,
q = 1p probability of failure.
Assume R is the number of ordered arrangements
consisting r successes. Since {ej} are disjoint and P[{ej}] = prqnr,
by the property of individual Bernoulli Trial with probability p
of r successes, we have
R
R
P[Er] =p[u{ej) ] = U [P{ej} ]
j=l j=l
(41)
where b(r; n,p) is the symbol of binomial law.
34
Suppose that the probability of success p is too small and n
is big (i.eM pl, nl), and letting the product of n and p to be a
c
constant, called c = np, then substituting p = ~ into (41), yields
n!
*r nP> =r!(nr)! P'(1 P>
,nr
=S1 (Z)'(l
r!(n r)! 'n' n'
C\T/ c\nr
=cr(l )
r!(n r)!nr v n'
= n(ndX^2tin^ncr(1.^r (4.2)
Since nl, it can be shown that n(nl)(n2)...(nr+l) nr,
therefore
C\ nr
Kr, n,p) =[ cr(l
nr
(43)
Assume nr, p0 and n, taking limits of both sides of
(43) yields
lim
n
b(r; n,p)
lim
n>oo
n
r
(. c^nr
r! C n>~ Vi" nJ
35
lim
n
b(r; n,p)
J_ r^m
rJ c n><
y(n)
(44)
where
y(n)
=(iÂ£)nr
v ny
(45)
Taking the natural logarithm of (45), yields
ln(l )
In y(n) = (nr) ln(l ^) = j1~
nr
By taking the limit of both sides of the above equation,
y(n>]
lim
n>>
v n7
1
L nr J
lim
n
dnL v n/J
ri
dnLnrJ
lim
n
n
nc
cn
2
(46)
which is
lim
n>
[y(n)l = e'c
and (44) becomes:
36
(47)
Equation (47) is considered an approximation to the
Poisson distribution. Furthermore, let X be the average number
of arrivals per unit time and t the length of the time interval,
then the probability of r events in t seconds is defined as
where use has been made of (47) and c = A,t.
This will be carefully analyzed later in the next chapter to
derive the formula for throughput.
4.3 The analysis of throughput using CSMA/CD
In the discussion in section 2.3, it is assumed that station A
is a reference station which transmits a reference packet at
tQ = 0. The time to detect a collision, T], is zero, and the jamming
time tj is constant.
Recall the contention period C, is given by
VA,U it
p(r arrivals in t seconds) = , e
(48)
C = Y + tj + 2t
(49)
where t is a propagation time between every pair of stations.
37
This equation states that tj and x are constants for all
contention periods, but Y is a random variable. Hence, to
evaluate the average contention, we compute the average of Y,
or
C = Y + tj + 2r (410)
Because Y is a random variable, the estimation of Y will be
a little more complicated. Before performing this estimation,
assume that station B transmits a packet to station A
immediately after station B senses the idle channel. This means
that the delay time Y is the arrival time of the first packet
which collides with the reference packet sent from station A in
the possible interval [0,x]. The distribution function of Y can be
calculated using the probability approach. It is of interest to
investigate the probability of arrivals in interval [0,x], The
probability distribution function (PDF) of arrivals in this
interval is
P[Yy] (411)
where P[Y>y] is the probability of no arrivals for a given one
arrival in [0,y].
Let P[Yq] be the probability of at least one arrival in [0,x],
and P[C]=P[Y
where C is the product of two events {ej: Y(ej)^y (no arrivals in
38
[0,y])} and {e2: e2e YQ (at least one arrival in [y,x])}.Therefore, by
the theory of conditional probability, we have
P[q (P[Y
P[Y>y] P[Yq] P[Yq]
for 0
(412)
because the above two events in C are independent events, (4
12) can be written as
P[Y>y]
(P[e1:Y(e1)^y].P[e2:e2gY0l)
P[Y0]
(413)
with r=0 (no arrivals in [0,x] and t=y), from equation (48), it is
easy to obtain the following facts:
P[0 arrivals in y seconds] = ^ ^ e"^y = e^y
(414)
P[0
arrivals in xy seconds] = e "^T y) = e"^T y) (415)
(A.x) .
P[0 arrivals in x seconds] = ~gj = z
(416)
From probability argument, the above equations can be
easily seen in different expression as
P[at least one arrival in y seconds] = 1 e"^y
(417)
P[at least one arrival in xy seconds] = 1 e^Ty)
P[at least one arrival in x seconds] = 1 e*T
Hence, equation (413) yields:
_ eXy[l e'X(T~y)] e'Xy e~XT
P[Y>y] j eXx i eXT
By substituting P[Y>y] into (411), yields
39
(418)
(419)
(420)
Xy Xx
FY(y) = P[Yy] = 1 j
1 e'Xy
FtW =! e4, (421)
Equation (421) is called the Probability Distribution
Function (PDF). The PDF Fy(y) may be expressed in terms of an
expectation via Fy(y) = E[Y] and evaluated as follows:
oo
E[Y] = Jy/Y(y) dy (422)
OO
where /y(y) is the probability density function (pdf) of
Probability Distribution Function (PDF), Fy(y). The relationship
between pdf and PDF is:
40
/yW =^fyW
a n e'xy'
dyLl e'Xx_
1
xc^y
1 e'^x
(423)
Therefore the expected value of the event Y is given by
f Xety
Enn = Jy y e~x t Jy
or recognizing only finite limits exists in reality,
E[Y]
le^y
rr7^dy
Manipulating this expression produces
E[Y]
X
1 e'Xx
X
Jyeky dy
0
41
or
(424)
By letting Xz = g and recognizing that:
X = G/tf is average number of arrivals per unit time,
tf is packet time,
g is offered load channel traffic per propagation period,
G is offered load channel traffic per packet time,
z is propagation period,
then equation (424) becomes
Before completing the derivation, it is convenient to refer
to the busy and idle periods to calculate the throughput of the
network. The throughput of the network can be defined as
(425)
Substituting E[Y] into (410) yields
1
= t + z 2 + 
J L g
1 e" g.
(426)
s
U
I + B
42
(427)
or expressed as the expected value of s (given by S) to have
E[U1
" E[I] + E[B]
(428)
where E[U] is the average time in a cycle when transmission is
completed successfully, E[I] is the average time of idle, and E[B]
is the average busy time.
Initially the average time of U in a successful transmission
will be examined. The successful transmission occurs when one
station sends a packet without any other packets colliding with
it. (The probability of this is denoted by Ps.) Otherwise, it is a
bad or an unsuccessful transmission. Either good or bad, each
will take a length of time in tf seconds to transmit a complete
packet on the assigned cable line. Therefore, the average time in
a successful transmission for a packet is:
E[U] = tfPs = tje_Xt = tjeS (429)
The busy period is either a successful period or a
contention period during the transmission (refer to figure 2.3
and [6]). Its average is given by:
E[B] = Ps(tf + t) + (1 PS)E[C]
(430)
43
where:
tf + t = the time that it takes to transmit the first byte to
the last byte in a packet from station A to station B.
1 Ps = probability of unsuccessful transmission.
Using E[C] in equation (426):
E[B] = e'g (tf + t) + (1 eE) j tj
1 e"g
+ x 2 +  _
g 1 e" g.
= (1 e'g)
tj + 2, +jj
g
(431)
Since the idle period is the time interval for each arrival, it
is possible to write:
Em
!_!
Xg
(432)
After substituting equations (429), (431) and (432) into
(428), and by taking a few algebraic steps to simplify, (428)
becomes
gtfe~B
____________________T____________________
gtfeg ft.g 1
+ (1 eg)^" + 2gJ + (2 + e"g)
S
(433)
44
where g = ~~ = aG
(434)
or
(435)
T
G may be expressed from [9] as
(436)
where:
N: the number of workstations or number of nodes on the
network.
tp the mean time interval between stations completing the
transmission until initiating transmission of the next
frame.
tf: the mean transmission time of frame minus contention
time, (packet time).
Hence, the equation of throughput can be rewritten as:
the normalized propagation time is: a = 7"
(437)
the normalized jamming time is
(438)
45
_______________Ge"aG_______________
GeaG + (1 e'aG)(baG+2aG) + (2+e'aG)
(439)
Note that tf can also be expressed as a function of the packet
size:
tf = (preamble + header + cycle redundancy check)/(bit
rate) + (8*packet length)/(bit rate). (440)
CHAPTER 5
DESIGN AND DEPLOY A COMPUTER NETWORK WHICH OPTIMIZES
RESOURCES
5.1 Introduction
Having the background of modeling methodologies (open
loop methodology and closed loop methodology) stated in the
last chapter, the next step is to go into application where the
methodologies will solve two types of problems:
a) Type 1 a system level problem defined as design and
deployment of a computer network which optimizes resources.
b) Type 2 a component level problem defined via a
proper mix of workstations, applications, and collisions to
provide maximum utilization of the network.
5.2 CLM mechanization for svnthesis/real data
Figure 5.1 illustrates the CLM mechanization for synthesis
data, which can be observed in two "worlds": the real world and
the Kalman filter world. The relationship between these two
worlds exists as a "mirror" to create unbiased estimates by
looking at the system level and the component level, specified
in table 5.1 below:
47
COMPONENT LEVEL
a) E{x x} = 0.
b) Equivalent to design problem.
c) Given noisy measurement z = predict the estimate of
the state variable.
SYSTEM LEVEL
a) E{z z} = 0.
b) Equivalent to signal processing problem.
c) Given noisy measurement z => predict the best estimate
of the next measurement.
Table 5.1: Component Level vs. System Level.
If the conditions of the both system and component levels
in the above table are satisfied, then the mathematical state
variable structure of the real measurement equation must
match the predicted measurement equation, and the zero mean
noise sequence will be fulfilled. This leads to the solution of the
control problem obtained by characterization of the "innovation"
process. The characterization will require statistical measures to
be levied on the "innovation" given in appendix E. When using
this process, it is necessary to create pseudo data prior to
designing a Kalman filter algorithm, so the "innovation" will give
a desired small error after it reaches the steady state (figure
5.2).
Figure 5.1: CLM mechanization for synthesis data
00
E
Figure 5.2: CLM mechanization for real data
VO
50
5.3 OLM applied to CSMA/CD ethemet design problem
The network can be deployed in three segments as below:
 Design
 Measurement
 Integration
All of these three segments require definitions at a system
level as well as a component level where either system level or
component level must further require a mechanization process
(block diagram). Before further analysis of the design problem,
it is important to recognize that a system level yields the best
estimation of the next measurement; hence, the system becomes
synonymous with "signal processing." In contrast, the
component level yields the best estimation of the states for a
given noisy measurement, so the component becomes
synonymous with "design" (table 5.1). The design problem can
be divided into three main steps as follows:
1) Step 1 component definition: The understanding
requirement of the component definition is necessary to
produce a predicted measurement. The component definition is
based on the use of software and hardware. Software
components include TCP/IP, applications, and LAN patrol.
Usable hardware components are workstations, cable, and
dedicated P/C. The application gives packets, and the cable
length produces the bit rate. LAN patrol used in "calculated
utilization," and associated with "dedicated P/C" in hardware,
will give a predicted utilization. Finally, the comparison of the
51
desired and predicted utilization yields an error, E, as shown in
figure 5.3.
2) Step 2 apply signal processing analysis: Figure 5.4
gives information necessary to produce an utilization from
throughput, where throughput is a function of normalized
propagation time, normalized jamming time, offered load, and
uncertainty. This uncertainty is a system zero mean white noise
sequence. This figure shows the basic versus the empirically
derived throughput in which the calculated utilization can be
computed as a ratio of the calculated throughput to the
maximum throughput.
3) Step 3 matching requirement: Having all components
defined in step 1 and the signal processing analysis applied in
step 2, a question is asked: "How can these components be
applied in order to utilize a network?" In order to answer this
question, it is necessary to utilize the network step by step. The
bit rate is obviously derived from cable length. The packet size
(P) will give packet time (tf), and transmission control
protocol/internet protocol (TCP/IP) will produce the change in
packet size (P). Packets are the production of applications,
computer aided design (CAD), or computer aided manufacturing
(CAM). Inputs a, b, G, w feed into one block to produce output S.
Inputs x, tf feed into another to give output "a," etc. This
relationship gives a final OLMs mechanization for the network,
illustrated in figure 5.5.
B bit rate
Z measured utilization
 Applications
 LAN patrol
 Cable
 Dedicated P/C
UTILIZATION DESIRED
(REQUIREMENT)
Figure 5.3: OLM applied to CSMA/CD ethernet design problem
(Predeployment produces predicted measurement).
to
a
THROUGHPUT
S f (a, b, G, W)
Â§
UTILIZATION (CALCULATED)
Throughput (calculated)
S V
Throughput (measurement)
A
z
Iw
UNCERTAINTY
WHERE
Z predicted utilization
Z required utilization
3 predicted throughput
a normalized propagation time
b normalized jamming time
G predicted offered load
MEASUREMENT UNCERTAINTY. V
6
UTILIZATION DESIRED
(REQUIREMENT)
Figure 5.4: OLM applied to CSMA/CD ethernet design problem
(Empirically derived throughput produces predicted measurement).
WHERE
N = number of workstations
A = applications
P = packet size
x = propagation time
tj = jamming time
B = bit rate
2
Figure 5.5: OLM applied to CSMA/CD ethernet design problem
(Requirement matching produces predicted measurement).
55
Since the design problem has been established, resources
are now referred to a measurement problem, specified in the
following sections.
5.4 OLM applied to CSMA/CD ethemet measurement problem
As the design problem stated in the above sections. The
measurement problem can be specified in the same manner. It
may be divided into two steps that apply for OLM (deployment
of network and integration of design/measurement), and one
final step applied for CLM. This final step is to compromise the
applications between design and measurement, and this step
will be addressed in the next section.
In the deployment of a network, actual measurement can
be observed if input information is given. The block diagram
requirement for matching is constructed in the same ways as in
a design problem. However, a measurement problem gives
actual data, while a design problem produces predicted data
(figure 5.6). The second step is to integrate design and
measurement by creating the derivation of the number of
workstations, applications, packet size, transmission time,
jamming time, etc., and comparing the actual value to predicted
measurement output. However, the error produced between
actual and predicted outputs does not lead to quantified results
(figure 5.7). Therefore, it is necessary to design an adaptive
processor to control error and to minimize it.
z
Figure 5.6: OLM applied to CSMA/CD ethernet design/measurement problem
(Deployment of network produces actual measurement).
Ul
Z: utilization
S: throughput
a: normalized propagation time
b: normalized jamming time
0: offered load
N: number of workstations
A: applications
P: packet size
B: bit rate
'i': propagation lime
1j: jamming time
tvhere A indicates predicted value
Figure 57: OLM applied to CSMA/CD ethernet design/measurement problem
(Deployment of network predicted vs. actual measurement).
LA
58
5.5 CLM applied to CSMA/CD ethemet design vs. measurement
problem
Recall that the CLM gets its name from the fact that it uses
an adaptive processor as a controller of the error resulting
from the comparison of measured test information and
predictions of the same. In order to drive this error under an
acceptable threshold, the adaptive processor provides this
control by solving the estimation and identification problems in
a sequential fashion. This is accomplished by constructing the
adaptive processor to consist of a Kalman filter serially linked
with a performance measure, J. This performance measure is
then optimized to solve for the coefficient members of the
classification set, stated in chapter 3.
The final CLM architecture, shown in figure 5.8, illustrates
the use of an adaptive processor. The functional algorithm of an
adaptive processor is to adjust the error E (minimize the
difference between the actual measurement obtained from a
sensor such as LANALYZER and the predicted value from model
of the computer network) by automatically changing 1) the
number of workstations, 2) packet size, 3) applications, and 4)
jamming time.
Z: utilization
S: throughput
a: normalized propagation lime
b: normalized jamming time
G: offered load
N: number of workstations
A: applications
P: packet size
B: bit rate
propagation time
t.: jamming time
where A indicates predicted value
c
ADAIMIVE
UTILIZATION (CALCULATED)
Huouglqwl (cdcuUlcd) Â§ +9 Z
Throughput (measutoncnl)
i i
 Meastacmen^Uncertalnty) r i i j
i Ae
11
\L.
APPUCATlOt ' t ' r THROUGHPUT
ih S
CAD CAM ac  y>KP,A) Sr(a,b.O,W)
rxiA X) N Nl 0 =.
0) U + lf
V STATIONS J
r
w
O
tmUZATlON (CALCULATED)
Huaughpil (cikulnal)
S afV
Thmiglfwl (mcniacmait)
(Meanranent Uncertainty)
_____L_____I
Figure 5.8: CLM applied to CSMA/CD ethernet design/measurement problem
(Deployment of network predicted vs. actual measurement).
\o
CHAPTER 6
APPLICATION RESULTS
6.1 Introduction
The major subject of the statistics field is often defined as
the scientific study technique for analyzing, collecting and
drawing conclusions from the collected data. The statistical
procedures are applicable to the study in which observations
are made. The use of statistics in many areas in the real world is
to describe and summarize the experimental data upon each
result collection. The measurements or characteristics of a result
description or summation of the collected data are sometimes
not constant (random variables) but manifest variability upon
repeating the experiment under the same condition. This
statistical sense is fully applicable to analyzing the result of any
stochastic problem.
Another reason for using statistical approximation in
throughput analysis is that there exist a priori and a posteriori
estimations in the control system, where the procedures of a
priori and a posteriori can be characterized as the estimations
were known a priori and they could be predicted or calculated
directly. However, if the opposite situation exists, we need to
61
determine the estimations that a particular set of conditions
existed from the results already obtained. This is consideration
of posteriori or after the fact estimations behavior.
Due to the above behavior, and the consideration of
jamming time, transmission time, packet length, etc., in discrete
time domain, the expression of throughput, offered load, and
output measurement must behave discretely. The discrete
distributions are the description of numerical data which in
each case has a number of finite values. In addition, if the
random variables associated with the sample spaces are
considered in each trial, the measurable step changes in it must
be involved. This is the process in theorical distribution which
relates to experimental distribution in some senses of grouping
collected data. Before going into further analysis of the results
using statistical properties, let me introduce the features of LAN
patrol which was used to analyze communication network.
Before going further into the explanation of the scenarios,
it is important to introduce methods of stochastic
approximation.
6.2 Methods of stochastics approximation
Since the states are estimated under the presence of noise,
the stochastic process must be used so that the states can be
characterized by statistics such as mean and covariance, as
stated in chapter 4. The techniques that contain the optimal
process to estimate the random variables to minimize the
62
functional estimation errors (minimize the least square) from
noisecorruptedmeasurement data is called the "stochastic
approximation method." This technique yields the recursive
estimates with properties of convergence as well.
Hence, the algorithms of stochastic approximation is
usually an iterating process, and the optimal estimation criteria
often depend on the statistical characteristic assumption of
random vectors associated with noisecorruptedmeasurement
data. The practical results for this kind of problem may not stay
the same all the time (i.e., each frequency distribution is
different), but the average of the processes shows a point of
convergence.
In addition to these stochastic approximation properties, it
is important to state the characteristics of the
estimation/prediction and control problems in the next section.
6.3 Characteristics of an estimation/prediction problem
 it provides the means to use passive network monitoring
devices for examining the network performance,
 it gives a mechanism to derive a control law that
optimizes the system performance,
 it provides insight into the design,
 it provides insight into the critical system parameters,
 it validates the methodologies (CLM, OLM),
 it demonstrates the ability to predict what can be
measured, and finally
63
 it solves the design/measurement integration problem.
The application of the CLM to the CSMA/CD deployment
problem was examined via four different event scenarios.
Results of this application to each of the scenarios will now be
described.
6.4 Explanation of scenarios
The main problem that needs to be solved in this thesis is
to simulate applications (or to determine packet size
distributions) necessary to produce a desired throughput
profile. The primary step to solving this problem is to solve the
estimation versus prediction problem. This means that the
histograms of the current applications being exercised on the
network must be constructed by collecting data (given in
appendix E). Then, the results of the histograms will be
determined.
The four scenarios that have been examined are as follows:
1) Scenario 1:
Number of events = 500
Number of workstations = 50
Number of events per workstation =10
64
2) Scenario 2:
Number of events = 1,000
Number of workstations =100
Number of events per workstation =10
3) Scenario 3:
Number of events = 5,000
Number of workstations = 500
Number of events per workstation =10
4) Scenario 4:
Number of events = 10,000
Number of workstations = 1,000
Number of events per workstation =10
These four scenarios have been examined under one stable
condition: the same number of events per workstation (10
events/workstation). Hence, if the number of events is
increased, the number of workstations must be increased to
keep the ratio of events to workstations constant.
In this process, the mean time interval between stations
completing transmission and stations initiating transmission of
the next frhme, tj, is unchanged, as expected. The offered load,
however, keeps increasing proportionally to the number of
workstations. Moreover, due to the properties of the stochastic
problems, the normalized propagation and jamming time
65
provide an internal change (caused by the changes in t, tj, tf) to
yield distributions of packets all over the network. Keeping in
mind that data in the first iteration affects the next one in the
stochastics process, data collected at the first iteration would be
different from the next iteration. The reinitializing of the
system through the computer is, therefore, needed.
The actual and predicted values of throughput of the four
scenarios are close to each other (figures 6.1a, 6.2a; 6.1b, 6.2b;
6.1c, 6.2c, and 6.Id, 6.2d). But, there is a small error between
the actual and the predicted as long as the number of
workstations increases. This is caused by complicated activity in
the large network. From these given graphs, it is easy to create
the pseudo data on the scale of throughput versus offered load
for each scenario (N = 50, 100, 500, 1000). The way to construct
this relationship is to select the value of the predicted
throughput and the corresponding predicted offered load at
each event per workstation.
The output measurement of each scenario, shown in
figures 6.3a, 6.3b, 6.3c and 6.3d, indicates that the measurement
is predicted one step behind the actual value. This implies that
for a given value of a measurement at a given time tk, we can
predict a next measurement at the time tk+1.
The collected data, represented as histograms, is given in
figures 6.;4a, 6.4b, 6.4c, and 6.4d. In these histograms, the
application is set on the horizontal axis and frequency on the
vertical axis. These histograms display the current applications
66
that are being exercised on the system's network, where each
division on the application axis represents the range of packet
length distribution of applications, called "class length." This
"class length" of applications depends on the difference
between mean transmission time and contention time. The
more tf increases, the more frequent is the distribution of
packet length. Figures 6.4a and 6.4b show the different values
of tf. These histograms show the status of abilities and
disabilities of LAN patrols sixtypefilters and the percentage
of packet length that has been generated by the system (given
in appendix D). In other words, they provide accurate
information on the size of packet length that is applicable to
the network. For example, if a network contains SO
workstations (figure 6.4a), then the size of packet length that
needs to be used is at least 257 bytes. Hence, the examination
of application histograms is very important to determine the
range of packet length class being generated by each system.
67
Number of events/workstation
Figure 6.1a: Offered load (actual ~ predicted) vs. number
of events per workstation for N = 50
Actual
Predicted
68
Number of events/workstation
Actual
Predicted
Figure 6.1b: Offered load (actual ~ predicted) vs. number of
events per workstation for N = 100
69
Number of events/workstation
Actual
Predicted
Figure 6.1c: Offered load (actual ~ predicted) vs. number of
events per workstation for N = 500
70
Figure 6.Id: Offered load (actual ~ predicted) vs. number of
events per workstation for N = 1000
71
Figure 6.2a: Throughput (actual ~ predicted) vs. number of
events per workstation for N = 50
72
Number of events/workstation
* Actual
Predicted
Figure 6.2b: Throughput (actual ~ predicted) vs. number of
events per workstation for N = 100
73
Number of events/workstation
Actual
Predicted
Figure 6.2c: Throughput (actual ~ predicted) vs. number of
events per workstation for N = 500
Throughput(S)
74
Figure 6.2d: Throughput (actual ~ predicted) vs. number of
events per workstation for N = 1000
75
Number of events/workstation
Actual
Predicted
Figure 6.3a: Output measurement (actual ~ predicted) vs.
number of events per workstation for N = 50
76
Number of events/workstation
Figure 6.3b: Output measurement (actual ~ predicted)
number of events per workstation for N
 Actual
Predicted
VS.
= 100
77
Figure 6.3c: Output measurement (actual ~ predicted) vs.
number of events per workstation for N = 500
Output measurement (Z)
78
Figure 6.3d: Output measurement (actual ~ predicted) vs.
number of events per workstation for N = 1000
7 9
0 1 2 3 4 5 6 7
Application
Figure 6.4a: Actual vs. predicted histogram for N = 50
80
0 1 2 3 4 5 6 7
Application
Figure 6.4a: Actual vs. predicted histogram for N = 50
(Plotted at higher covariance).
81
Application
Figure 6.4b: Actual vs. predicted histogram for N = 100
82
Application
Figure 6.4b: Actual vs. predicted histogram for N = 100
(Plotted at higher covariance).
83
3 4
Application
Figure 6.4c: Actual vs. predicted histogram for N = 500
84
0 1 2 3 4 5 6 7
Application
Figure 6.4d: Actual vs. predicted histogram for N = 1000
CHAPTER 7
CONCLUSION
This thesis has revealed the need to provide structure to
unstructured network deployment problems. Structure was
given by defining the three segments necessary to optimally
deploy a computer network. The three segments of network
deployment are design, measurement and integration. These
segments were exemplified by defining system and component
issues concerned with each segment.
Having defined the segment concepts, modeling notions
were considered to assist in the analysis of the problem. The
classical open loop methodology (OLM) was set against the
modern closed loop methodology (CLM). A conceptual design of
the CLM approach was applied to the deployment of an
ethernet. Further classification of the CLM application is
contained in "Sensitivity of the CSMA/CD Parameters using CLM"
in [4].
Throughout this thesis, it is very important to realize that
the open loop method of solving the estimation and
identification problems does not lead to a consistent set of
results. On the other hand, the closed loop method solves the
86
control problem by solving the estimation and identification
problems in a systematic fashion which yields consistent results.
Moreover, an important goal of this thesis was to present
how the CLM can be applied to simulate a computer network. In
order to apply the approach, four design steps must be satisfied.
Appendix B contains the solution to the first step of this process
when the CLM is to be applied to simulate a CSMA/CD protocol.
Use of this appendix in conjunction with the formal definition of
the classification set given in (37) satisfies the second step.
Currently, the OLM sensitivity study and design of the adaptive
processor are being conducted.
In addition, the network deployment presented in this
thesis has been viewed as a facility which makes possible
communication in local area networks. Due to the economic
advantages, the application of this thesis is more effective in
local area communication networks than in longhaul
communication networks because of the following comparable
reasons:
a) Local networks extend over limited geographical areas
and are installed with lownoise links and high speed channels.
b) Longhaul networks, on the other hand, extend over
large geographical areas and carry large a bit rate which
requires expensive links that are noisy.
These advantages (low noise links and high speed
channels) make it feasible for the use of the Kalman filter and
for the operation of CSMA/CD.
APPENDIX A
KALMAN FILTER
