Citation
A neural network-based model-reference adaptive control system

Material Information

Title:
A neural network-based model-reference adaptive control system
Alternate title:
Adaptive control system
Creator:
Ince, David Leland
Publication Date:
Language:
English
Physical Description:
xii, 94 leaves : illustrations ; 29 cm

Thesis/Dissertation Information

Degree:
Master's ( Master of Science)
Degree Grantor:
University of Colorado Denver
Degree Divisions:
Department of Electrical Engineering, CU Denver
Degree Disciplines:
Electrical engineering

Subjects

Subjects / Keywords:
Neural circuitry -- Adaptation ( lcsh )
Adaptive control systems ( lcsh )
Adaptive control systems ( fast )
Neural circuitry -- Adaptation ( fast )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Bibliography:
Includes bibliographical references (leaves 93-94).
General Note:
Spine title: Adaptive control system.
General Note:
Submitted in partial fulfillment of the requirements for the degree, Master of Science, Department of Electrical Engineering and Computer Science.
Statement of Responsibility:
by David Leland Ince.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
23525020 ( OCLC )
ocm23525020
Classification:
LD1190.E54 1990m .I52 ( lcc )



Full Text
A NEURAL NETWORK-BASED MODEL-REFERENCE
ADAPTIVE CONTROL SYSTEM
by
David Leland Ince
B.S., University of Colorado, 1988
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado in partial fulfillment
of the requirements for the degree of
Master of Science
Department of Electrical Engineering
and Computer Science
1990


This thesis for the Master of Science
degree by
David Leland Ince
has been approved for the
Department of
Electrical Engineering and Computer Science
by
Date


Ince, David Leland (M.S., Electrical Engineering)
A Neural Network-Based Model-Reference Adaptive
Control System
Thesis directed by Professor Edward T. Wall
This thesis examines the implementation of
recurrent backpropagation neural networks in a model-
reference adaptive control system. The neural network
is used as a self-adapting controller in a single-
input/single-output system. Two approaches to model-
reference adaptive control systems are defined and
examined by computer simulation.
The simulations show that these are viable
implementations when the stability requirements of the
system to be controlled are met and the architecture of
the neural network used is sufficiently large. Once the
neural network has been trained for a close
approximation of the actual plant, training can continue
on-line. In this way the errors in the identification
of the system and variations in the plant parameters can
be taken into account.
The approaches presented in this thesis show that
this is a promising new method of controls engineering,
and the indications are that it is applicable to
problems of higher complexity. The form and abstract
of this thesis are approved. I recommend its
publication.
Signed
Edward T. Wall
iv


ACKNOWLEDGEMENTS
This thesis is dedicated to the following people
whose support and guidance have made it possible.
My thanks to Dr. Julio C. Proano who developed the
recurrent neural network used in this thesis.
Dr. Edward T. Wall for his advice and understanding
of control systems.
Dr. Jan T. Bialasiewicz for his mathematical
direction.
Dr. William J. Wolfe for his encouragement and
advice.
I would also like to thank my mother for her
understanding.


CONTENTS
FIGURES.........................................viii
1. INTRODUCTION...................................1
2. BACKGROUND NEURAL NETWORK....................... 5
2.1 Introduction..................................5
2.2 Memory........................................7
2.3 Calculation of output.........................8
2.4 Training by Backpropagation..................10
3. PLANT SIMULATION..............................13
3.1 Introduction.................................13
3.2 Procedure....................................13
3.3 Conclusions................................ 17
4. SIMULATION RESULTS...............................19
4.1 Introduction.................................19
4.2 Open Loop Control System.....................19
4.2.1 Case Studies.......................21
4.2.2 20% Friction Coefficient
Increase..........................21
4.2.3 20% Moment of Inertia
Coefficient Decrease ............ 27
4.2.4 Conclusion.........................27
4.3 Closed Loop Control System...................35
4.3.1 Case Studies.......................40


4.3.2 20% Friction Coefficient Decrease 40
4.3.3 20% Spring Coefficient Increase 44
4.3.4 20% Spring Coefficient Decrease 52
4.3.5 20% Moment of Inertia Coefficient Increase 57
4.3.6 5% Friction Coefficient Increase 57
4.3.7 Conclusions 62
5. CONCLUSIONS .
APPENDIX A . .
APPENDIX B . .
BIBLIOGRAPHY .
vii


FIGURES
Fig. 1 Open loop control system......................3
Fig. 2 Closed loop control system....................3
Fig. 3 Generalized architecture of
recurrent backpropagation neural network [1]........... 6
Fig. 4 Sigmoidal Function K = 1,
α = 0.5 (---), 1 (--) & 2 (---)......................9
Fig. 5 Response of Model (----) and
uncompensated plant (---) to a unit
step input............................................22
Fig. 6 Output of model (***) and
compensated plant (---) to unit step input............23
Fig. 7 Difference between the model
and the plant outputs (model - plant) as
a function of time....................................24
Fig. 8 Initial learning curve, sum of the
square error versus iteration.........................25
Fig. 9 Response of the trained system to a
multi-level step input................................26
Fig. 10 Retraining curve for a 20% increase
in friction coefficient...............................28
Fig. 11 Response of model (***) and output
of the retrained plant (---) after adaptation
to 20% increase in friction coefficient...............29
Fig. 12 Difference between the model and the
retrained plant (model - plant) after retraining
to a 20% increase in the coefficient of friction
as a function of time.................................30
Fig. 13 Retraining curve for a 20% decrease
in the moment of inertia coefficient..................31


Fig. 14 Response of the model (***) and the
output of the retrained plant (---) after
adaptation to 20% decrease in the moment of
inertia coefficient...................................32
Fig. 15 Difference between the model and plant
outputs (model - plant) after retraining to a
20% decrease in the coefficient of moment of
inertia as a function of time.........................33
Fig. 16 Response of the model (***) and the
uncompensated plant (---) to a unit step input........37
Fig. 17 Initial learning curve, sum of the
square error versus iteration.........................38
Fig. 18 Output of model (***) and compensated
plant (---) to a unit step input......................39
Fig. 19 Difference between the model and the
plant outputs (model - plant) as a function of
time..................................................41
Fig. 20 Response of the trained system to a
multi-level step input................................42
Fig. 21 Comparison between the outputs of the
model (***) and the plant (---) after a 20%
decrease in the coefficient of friction before
adaptation............................................43
Fig. 22 Response of the model (***) and the
retrained plant (---) after adaptation to 20%
decrease in the coefficient of friction...............45
Fig. 23 Difference between the model and the
retrained plant (model - plant) after retraining to
a 20% decrease in the coefficient of friction as a
function of time......................................46
Fig. 24 Retraining curve for a 20% decrease in
the coefficient of friction...........................47
Fig. 25 Comparison between the outputs of the
model (***) and the plant (---) after a 20%
increase in the spring coefficient before
adaptation............................................48
ix


Fig. 26 Retraining curve for a 20% increase in
the spring coefficient................................49
Fig. 27 Comparison between the outputs of the
model (***) and the plant (---) after a 20%
increase in the spring coefficient after
adaptation............................................50
Fig. 28 Difference between the model and the
retrained plant (model - plant) after retraining to
a 20% increase in the spring coefficient as a
function of time......................................51
Fig. 29 Comparison between the outputs of the
model (***) and the plant (---) after a 20%
decrease in the spring coefficient before
adaptation............................................53
Fig. 30 Retraining curve for a 20% decrease in
the spring coefficient................................54
Fig. 31 Comparison between the outputs of the
model (***) and the plant (---) after a 20%
decrease in the spring coefficient after
adaptation............................................55
Fig. 32 Difference between the model and the
retrained plant (model - plant) after retraining to
a 20% decrease in the spring coefficient as
a function of time....................................56
Fig. 33 Comparison between the outputs of the
model (***) and the plant (---) after a 20%
increase in the moment of inertia coefficient
before adaptation.....................................58
Fig. 34 Retraining curve for a 20% increase in
the coefficient of moment of inertia..................59
Fig. 35 Comparison between the output of the
model (***) and the plant (---) after a 20%
increase in the coefficient of moment of inertia
after adaptation......................................60
Fig. 36 Difference between the model and the
retrained plant (model - plant) after retraining to
a 20% increase in the coefficient of moment of
inertia as a function of time.........................61
x


Fig. 37 Comparison between the outputs of the
model (***) and the plant (---) after a 5% increase
in the coefficient of friction before adaptation......63
Fig. 38 Retraining curve for a 5% increase in the
coefficient of friction...............................64
Fig. 39 Comparison between the output of the
model (***) and the plant (---) after a 5% increase
in the coefficient of friction after adaptation.......65
Fig. 40 Difference between the model and the
retrained plant (model - plant) after retraining to
a 5% increase in the coefficient of friction as a
function of time......................................66
xi


CHAPTER 1
INTRODUCTION
The goal of this research is to study the use of a
recurrent neural network (NN) as a controller in a model
reference adaptive control (MRAC) system. The more
common approaches of control system compensation, both
classical and modern, work well only where the plant is
in a linear mode of operation. This can be overly
restrictive since most physical plants are nonlinear.
One method of overcoming this restriction is by using a
model that for a given input will produce the desired
output. This model may be either hardware or software.
In model reference control systems the output of the
model and the plant are compared and this difference is
used to modify the control signals forcing the plant to
track the model. This method has been proven quite
useful in overcoming nonlinearities. Since plants also
exhibit variations due to environmental changes, the model
reference control system needs to be adaptable to these
changes. Adaptive controllers are at most second order
dynamical systems and may not deliver the proper


response for higher order systems. The NN creates the
possibility of a higher order controller.
Reference [1] demonstrates the ability of the NN to
implement a higher order dynamical compensator with
adjustable parameters. Such a system will be called a
NNMRAC system.
Two different methods of application will be
investigated. The first method of using a NN in a MRAC
system is shown in Fig. 1. As illustrated in this
system the output of the plant is not compared to the
reference signal, i.e., there is no feedback from the
output of the plant to be subtracted from the reference
input, since this is an open loop control system. The second
method is shown in Fig. 2. This is a true feedback
control system because the output of the plant is
compared to the reference input. Both of these systems
have been investigated. The approach taken was as
follows:
1) A plant which exhibits overshoot to a step input
was selected. A reference model with no overshoot
was chosen.
2) The plant with a NN compensator was trained so that
the plant would track the model until the sum of
2


Figure 1 Open loop control system.
Figure 2 Closed loop control system.
3


the square error between the model and the plant
was at a low level.
3) The parameters of the plant were perturbed to
simulate different effects such as changes in the
moment of inertia or in friction.
4) The NN was retrained in order to see if the new
error between the model and the plant could be
reduced down to the level that had been arrived at
in step 2.
Results of the initial training, the effects of the
perturbations, and the results of the retraining have
been included. These experiments show that the NNMRAC
system is indeed a viable approach for control systems.
4


CHAPTER 2
BACKGROUND NEURAL NETWORK
2.1 Introduction
The generalized architecture of a recurrent
backpropagation NN is shown in Fig. 3. It consists of
three segments known as the input layer, the hidden
layers and the output layer. The input layer has a
single input and the number of outputs is N0. There are
usually two hidden layers of neurons with N1 and N2
neurons in each layer respectively. The output layer is
a single neuron where N3 is used to identify it. Every
output of the input layer is connected to every neuron
in the first hidden layer through a network of
multipliers known as weights. This network is called
the W01 weight matrix and its size is N0 by N1. Every
output of the first hidden layer is an input to every
neuron in the second hidden layer through a second
weight matrix, W12, and its size is N1 by N2. Each
neuron in the second hidden layer is an input to the
output neuron through a third weight matrix W23. Its
size is N2 by N3. Since N3 is one, W23 is in reality a
column vector of weights. The recurrent feature of the


Fig. 3 - Generalized architecture of recurrent
backpropagation neural network [1].
6


NN is that each neuron in the hidden layers is an input
to itself, after it has been delayed for a time interval
of duration T.
2.2 Memory
The outputs of the hidden layers are nonlinear in
that the output value of each of these neurons is
governed by a sigmoidal function which acts as the
memory feature of the NN. The equation of the output of
the m-th neuron in the n-th hidden layer at the k-th
interval of time is
Onm(k) = f(NETnm(k) + K⁻¹Onm(k-1))               (2.1)
where NETnm(k) is the input to the m-th neuron from the
preceding layer.
f(s) = K tanh(αs/2)                              (2.2)
is known as the sigmoidal function, where K is the
scaling factor of the sigmoidal function and α is
related to the slope of the sigmoidal function.
Equation (2.2) can also be expressed by
f(s) = K(1 - exp(-αs))/(1 + exp(-αs))            (2.3)
where exp is the exponential function. Taking the
derivative of Equation (2.3) with respect to s yields
7


f'(s) = 2Kα exp(-αs)/(1 + exp(-αs))²             (2.4)
Evaluating Equation (2.4) at s=0 gives the slope of the
sigmoidal function in the linear region.
f'(0) = 2Kα/4 = Kα/2                             (2.5)
This is shown in Fig. 4 with K = 1.
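As a concrete illustration of Eqs. (2.2), (2.3) and (2.5), the activation and its small-signal slope can be written out directly. This is a minimal sketch, not code from the thesis, and the function names are mine:

```python
import math

def sigmoid(s, K=1.0, alpha=0.5):
    """Sigmoidal function of Eq. (2.2): f(s) = K tanh(alpha*s/2)."""
    return K * math.tanh(alpha * s / 2.0)

def sigmoid_exp(s, K=1.0, alpha=0.5):
    """Equivalent exponential form of Eq. (2.3)."""
    return K * (1.0 - math.exp(-alpha * s)) / (1.0 + math.exp(-alpha * s))

def slope_at_zero(K=1.0, alpha=0.5):
    """Slope in the linear region, Eq. (2.5): f'(0) = K*alpha/2."""
    return K * alpha / 2.0
```

The two forms agree identically, and a numerical derivative at the origin reproduces the slope of Eq. (2.5).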
2.3 Calculation of Output
The first step in using a NN is to choose the
desired response to an input signal, which is usually a
step function. The response to this input must span a
sufficient time for the output to reach steady-state.
The number of training points will determine the
accuracy of the output of the NN. The higher the number
of points, the higher the precision. However, this
requires more training time per learning iteration and
more memory space to hold all of the weight matrices.
The input signal is passed through a series of N0-1 time
delays of T duration. The input signal inside of the NN
will then last for a time interval of N0·T. This maps
the input dynamics into the NN. Each T interval of the
response of the NN is referred to as a point. For each
point in the training sequence the column vector of the
input to the first layer is determined by the equation
8


Fig. 4 Sigmoidal Function K = 1,
α = 0.5 (----), 1 (--), & 2 (-----).
9


NET1(k) = W01(k)^T c(k)                          (2.6)
where c is the column vector of outputs from the delay
layer and T is the transpose operator. The value of
NET1(k) is operated on by equation (2.1) and its output
value is O1(k). This is used by the equation
NET2(k) = W12(k)^T O1(k)                         (2.7)
to calculate the input to the second hidden layer.
NET2(k) is operated on by equation (2.1) to determine
the output of the second hidden layer, O2(k). The input
to the output neuron is calculated by the equation
NET3(k) = W23(k)^T O2(k)                         (2.8)
The output of the NN is finally determined by the
equation
O3(k) = α NET3(k)                                (2.9)
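The calculation of Eqs. (2.6) through (2.9) for one time step can be sketched as below. This is an illustrative reconstruction rather than the thesis program: the recurrent term of Eq. (2.1) is assumed to add the previous hidden output, scaled by 1/K, to the net input, and all names are mine. The vector c holds the N0 delayed copies of the input.

```python
import numpy as np

def forward_step(c, W01, W12, W23, O1_prev, O2_prev, K=1.0, alpha=0.5):
    """One time step of the recurrent forward pass, Eqs. (2.6)-(2.9).
    W01 is N0 x N1, W12 is N1 x N2, and W23 is a length-N2 weight vector."""
    f = lambda s: K * np.tanh(alpha * s / 2.0)   # sigmoidal function, Eq. (2.2)
    NET1 = W01.T @ c                             # Eq. (2.6)
    O1 = f(NET1 + O1_prev / K)                   # Eq. (2.1), first hidden layer
    NET2 = W12.T @ O1                            # Eq. (2.7)
    O2 = f(NET2 + O2_prev / K)                   # Eq. (2.1), second hidden layer
    NET3 = W23 @ O2                              # Eq. (2.8)
    return O1, O2, float(alpha * NET3)           # linear output neuron, Eq. (2.9)
```

Calling the function once per point, while carrying O1 and O2 forward, reproduces the delayed self-feedback of the recurrent architecture.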
2.4 Training by Backpropagation
The technique of training a NN by backpropagation
in Reference [1,2] requires that the derivative and an
intermediate variable, delta, be calculated at each
point for each neuron. The equation
DERn(k) = (Kn² - On(k)²)/(2Kn)                   (2.10)
10


determines the derivative vector for the neurons in the
n-th hidden layer. The derivative of the output neuron
is assumed to be a vector composed of ones. The deltas
for each layer are calculated using the deltas of the
previous layer by the equations
DEL3(k) = [Om(k) - O3(k)]DER3(k)                 (2.11)
DEL2(k) = W23(k)[DEL3(k)α]DER2(k)                (2.12)
DEL1(k) = W12(k)[DEL2(k)α]DER1(k)                (2.13)
At the k-th point the square error, nu, must be
calculated by the equation
nu(k) = [Om(k) - O3(k)]² / Om(k)²                (2.14)
The values of nu and DEL at each k-th interval are used
to calculate an incremental change in the weight
matrices. The equations,
W23i(k+1) = W23i(k) + nu(k)O2(k)[DEL3(k)α]       (2.15)
W12i(k+1) = W12i(k) + nu(k)O1(k)[DEL2(k)α]       (2.16)
W01i(k+1) = W01i(k) + nu(k)c(k)[DEL1(k)α]        (2.17)
are evaluated at each point. At the end of each training
iteration the weight matrices are updated by the equations
W23(t+1) = W23(t) + W23i(t)/np + m[W23(t)-W23(t-1)]
(2.18)
11


W12(t+1) = W12(t) + W12i(t)/np + m[W12(t)-W12(t-1)]
(2.19)
W01(t+1) = W01(t) + W01i(t)/np + m[W01(t)-W01(t-1)]
(2.20)
where t represents the t-th iteration, np is the number
of points for each sequence, and m is the momentum term
in [2]. This term is used to speed the convergence of
the backpropagation algorithm.
After training the NN for several (typically 20 to
300) iterations, the value of the square error has been
reduced to a low value. This means that the changes to
the weights per iteration are small. One procedure to
increase the learning rate is to raise the value of the
square error to a power less than one. Since the square
error is less than or equal to one, this will increase
the value of the square error. When this is applied
properly the number of iterations needed to train the NN
can be reduced.
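The iteration-level update of Eqs. (2.18)-(2.20), together with the error-boosting procedure just described, might be sketched as follows (a hypothetical fragment, not the program of Appendix A; the names are mine):

```python
import numpy as np

def momentum_update(W, W_prev, W_inc, np_points, m=0.9):
    """Weight update of Eqs. (2.18)-(2.20): the accumulated increments are
    averaged over the np training points and a momentum term m is added."""
    return W + W_inc / np_points + m * (W - W_prev)

def boosted_error(nu, power=0.5):
    """Raise the square error (which is <= 1) to a power below one,
    increasing its value and hence the effective learning rate."""
    return nu ** power
```

For example, a square error of 0.04 raised to the power 0.5 becomes 0.2, a five-fold increase in the effective step size late in training.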
12


CHAPTER 3
PLANT SIMULATION
3.1 Introduction
The training of the NN requires that the output of
the plant be compared to the reference model output on a
point-by-point basis. Each point is separated in time
as in a discrete-time control system. In the second
method of NNMRAC systems the output of the plant is used
to modify the input signal via feedback. The algorithm
used to define the plant should be capable of handling
the feedback accurately with fairly large values of
sampling intervals. For this reason a discrete-time
filter to simulate the plant was not considered.
3.2 Procedure
A suitable method makes use of the Runge-Kutta
Fourth Order algorithm to find the first three output
values. The remaining values of the plant output are
calculated using an Adams-Bashforth Four-Step Predictor
and an Adams-Moulton Three-Step Corrector. Reference
[3] presents the algorithms when the input is constant.


Therefore, some changes to the algorithm were made to
take into account the changing inputs. The unmodified
algorithm is evaluated using the four previous time and
state values. The modification includes using the four
previous inputs to the plant. While this did exhibit
errors, the resultant plant gave a reasonable
approximation to the correct output and was
significantly more accurate than a step invariant
discrete-time representation for the same transfer
function with identical time intervals. While the
simulation time for the Runge-Kutta plant is much longer
than that of a discrete-time plant, the accuracy of the
Runge-Kutta method permits a significant reduction in
the number of neurons in the time delay layer of the NN.
The Runge-Kutta method permits the description of the
plant transfer function in continuous-time control
canonical form, which allowed variations in the plant
parameters when testing the trained system for plant
perturbations. This was the motivation for the
selection of the Runge-Kutta method. The algorithm was
divided into three parts. The first part provides the
variables needed and initializes their values. The
second part provides a function call that evaluates the
expressions, defined in [3], of equation (3.1) below
14


X'(t) = AX(t) + Bu(t)                            (3.1)
Y(t) = CX(t) + Du(t)                             (3.2)
where X(t) is the (n by 1) vector of the states of an n-
th order system, A is the (n by n) matrix called the
system matrix, and B is the (n by r) matrix called the
input matrix. Also u(t) is the (r by 1) vector composed
of the system input functions, and Y(t) is the (p by 1)
vector composed of the defined outputs. In addition C
is the (p by n) matrix called the output matrix and D is
the (p by r) matrix that represents the direct coupling
between the input and output. The third part of the
method determines which variables are sent to the
function call and how the returned values are processed
into an output for a particular time instant after
equation (3.2). For a time invariant plant the Runge-
Kutta Fourth-Order starter calculates the values of the
temporary variables K1, K2, K3 and K4 three times, using
the following equations,
K1 = Tf(w(i-1))
K2 = Tf(w(i-1) + K1/2)                           (3.3)
K3 = Tf(w(i-1) + K2/2)
K4 = Tf(w(i-1) + K3)
where T is the time interval between output values, f is
the function to be evaluated and w(i-1) are the states
at the previous evaluation. This evaluates only
15


X' = AX.                                         (3.4)
Changes to the starter so that it evaluates the entire
equation (3.1) are
K1 = Tf(w(i-1),On(i-1))
K2 = Tf(w(i-1) + K1/2,On(i-1))                   (3.5)
K3 = Tf(w(i-1) + K2/2,On(i-1))
K4 = Tf(w(i-1) + K3,On(i-1))
where On(i-1) is the output of the NN delayed by one
time interval. The Adams-Bashforth Four-Step predictor
is
w(i) = w(i-1) + T(55f(w(i-1)) - 59f(w(i-2))
+ 37f(w(i-3)) - 9f(w(i-4)))/24.                  (3.6)
The changes used for the predictor are
w(i) = w(i-1) + T(55f(w(i-1),On(i)) -
59f(w(i-2),On(i-1)) +                            (3.7)
37f(w(i-3),On(i-2)) -
9f(w(i-4),On(i-3)))/24,
where On(i) is the output of the NN at the present time
and the other On values are delayed by one time
interval. The Adams-Moulton corrector is
w(i) = w(i-1) + T(9f(w(i)) +
19f(w(i-1)) -                                    (3.8)
5f(w(i-2)) +
f(w(i-3)))/24
16


where w(i) on the right side of the corrector equation
is the value of w(i) from the predictor equation. The
changes to the corrector are then
w(i) = w(i-1) + T(9f(w(i),On(i)) +
19f(w(i-1),On(i)) -                              (3.9)
5f(w(i-2),On(i-1)) +
f(w(i-3),On(i-2)))/24
3.3 Conclusions
These changes were tested by replacing the NN
compensator by a proportional gain with a value of one.
The values of the output of the NN were then replaced by
using the output of this interim plant subtracted from
the reference input to form the next input to the plant.
When compared to a step-invariant discrete time filter
with the same transfer function the Runge-Kutta
determined plant gave better results for identical time
intervals. However it should be noted that the Runge-
Kutta program easily becomes unstable when the
coefficients in the transfer function are large, for
large time increments. Other methods presented in
reference [4] also exhibit less accuracy when employed
in a feedback loop with the notable exception of the
bilinear transformation.
This method is quite accurate; however, it does not
permit easy adjustments for the change in parameters
17


that is necessary for the purpose of this study. The
introduction of a plant whose input is the output of the
NN does involve some minor changes to the
backpropagation algorithm. In the algorithm the
updating of the weights is accomplished by a comparison
of the output of the desired reference model with the
output of the NN. The changes are made in order to
compare the output of the reference
model with the output of the plant. The modified
program is in Appendix A.
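Putting the pieces of this chapter together, the plant simulation might be sketched as below. This is not the Appendix A program: the function and variable names are mine, the standard fourth-order Runge-Kutta combination (K1 + 2K2 + 2K3 + K4)/6 is assumed for the starter, and the controller outputs enter the predictor and corrector as in the modified equations of this chapter.

```python
import numpy as np

def simulate_plant(A, B, C, u, T, x0):
    """Simulate x' = Ax + Bu, y = Cx with an RK4 starter for the first
    three steps, then the Adams-Bashforth four-step predictor and the
    Adams-Moulton corrector.  u is the sequence of controller outputs."""
    f = lambda x, uk: A @ x + B * uk
    w = [np.asarray(x0, dtype=float)]
    # Runge-Kutta Fourth-Order starter with delayed controller output
    for i in range(1, min(4, len(u))):
        K1 = T * f(w[i-1], u[i-1])
        K2 = T * f(w[i-1] + K1/2, u[i-1])
        K3 = T * f(w[i-1] + K2/2, u[i-1])
        K4 = T * f(w[i-1] + K3, u[i-1])
        w.append(w[i-1] + (K1 + 2*K2 + 2*K3 + K4) / 6)
    # Adams-Bashforth predictor and Adams-Moulton corrector
    for i in range(4, len(u)):
        wp = w[i-1] + T * (55*f(w[i-1], u[i]) - 59*f(w[i-2], u[i-1])
                           + 37*f(w[i-3], u[i-2]) - 9*f(w[i-4], u[i-3])) / 24
        wc = w[i-1] + T * (9*f(wp, u[i]) + 19*f(w[i-1], u[i])
                           - 5*f(w[i-2], u[i-1]) + f(w[i-3], u[i-2])) / 24
        w.append(wc)
    return np.array([C @ wi for wi in w])
```

With zero input and A = [-1], the routine reproduces the decay exp(-t) to within the accuracy expected of a fourth-order method.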
18


CHAPTER 4
SIMULATION RESULTS
4.1 Introduction
This chapter consists of two parts. The first part
will examine the open loop control system where the NN
had a single hidden layer. The configuration is that of
Fig. 1. The second part investigates the closed loop
control system where the NN had two hidden layers. The
configuration used is shown in Fig. 2.
4.2 Open Loop Control System
Since this type of system is open loop the plant to
be controlled must be type 0. If the plant to be
controlled is of type 1 then the plant can be
transformed into a type 0 by using unity feedback around
the plant. The plant used in this simulation is a
servomotor with a transfer function of the form
           F
G(s) = ---------                                 (4.1)
        Ms(s+B)
where M represents the moment of inertia (MOI) of the
motor and load, B represents the friction and F is the
applied force. Since this function is a type 1, unity


feedback is added to make the plant type 0. The
resultant transfer function is
              F
G(s) = --------------                            (4.2)
        Ms² + Bs + F
This system is now controllable in the open loop
configuration. With parameters inserted into equation
(4.2) the transfer function becomes
              25
G(s) = --------------                            (4.3)
        s² + 6s + 25
This plant will exhibit overshoot when subjected to a
step input. A more desirable transfer function is
                120
Gm(s) = -----------------
         (s+4)(s+5)(s+6)

                   120
      = ----------------------                   (4.4)
         s³ + 15s² + 74s + 120
which displays no overshoot since none of the poles of
the transfer function are complex. It should be
observed that the transfer function of the model is of
higher degree than that of the plant. An unexpected
result is that the NN is more easily trained when the
model is of higher order than the plant.
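The overshoot claims can be checked numerically. The sketch below is illustrative only (the helper names are mine): it builds each transfer function in the control canonical form mentioned in Chapter 3 and integrates the unit-step response with fourth-order Runge-Kutta. The plant of Eq. (4.3), with damping ratio 0.6, peaks near 1.09, while the all-real-pole model of Eq. (4.4) stays below its final value of one.

```python
import numpy as np

def companion(den, gain):
    """Control canonical state-space for gain/(s^n + den[0]s^(n-1) + ... + den[-1])."""
    n = len(den)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)               # chain of integrators
    A[-1, :] = -np.array(den[::-1], float)   # last row: -a0, -a1, ...
    B = np.zeros(n); B[-1] = 1.0
    C = np.zeros(n); C[0] = gain
    return A, B, C

def step_peak(A, B, C, T=0.001, t_end=4.0):
    """Peak of the unit-step response via RK4 integration."""
    f = lambda x: A @ x + B                  # unit step input, u = 1
    x = np.zeros(A.shape[0]); peak = 0.0
    for _ in range(int(t_end / T)):
        k1 = T * f(x); k2 = T * f(x + k1/2)
        k3 = T * f(x + k2/2); k4 = T * f(x + k3)
        x = x + (k1 + 2*k2 + 2*k3 + k4) / 6
        peak = max(peak, float(C @ x))
    return peak
```

Here step_peak(*companion([6.0, 25.0], 25.0)) gives roughly 1.09 for the plant, while step_peak(*companion([15.0, 74.0, 120.0], 120.0)) never exceeds one for the model.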
20


Fig. 5 shows the unit step response of the type 0 plant
and the model. Table 4.1 details the neural
configuration and the training duration (Appendix B).
The initial training was suspended after 300 iterations
at which point the sum of the square error was 0.141.
Fig. 6 compares the model and the compensated output.
There is a close matching between the two. Fig. 7 shows
the difference between the model and the plant output as
a function of time. Fig. 8 gives the learning curve and
Fig. 9 the response of the system to a multi-level
input. This result indicates that the NN has learned
the system dynamics.
4.2.1 Case Studies
As a test for adaptability the coefficients of MOI
and friction were independently varied by plus and minus
20%.
4.2.2 20% Friction Coefficient Increase
The first case to be examined is a friction
increase by 20%. The new transfer function of the plant
becomes
                25
Gf(s) = ----------------                         (4.5)
         s² + 7.2s + 25
21


Fig. 5 Response of Model (-----) and uncompensated
plant (---) to a unit step input.
22


Fig. 6 Output of model (***) and compensated
plant (---) to unit step input.
23


Fig. 7 Difference between the model and the plant
outputs (model - plant) as a function of time.
24


Fig. 8 Initial learning curve, sum of the square
error versus iteration.
25


Fig. 9 Response of the trained system to a multi-
level step input.
26


Table 4.2 gives the specifications of the retraining
(Appendix B). Briefly, the sum of the square error had
returned from 0.506 to 0.141 in 124 iterations. Fig. 10
shows the learning curve of the retraining, Fig. 11
shows the response after retraining, and Fig. 12 gives
the difference between the model and the retrained
system as a function of time.
4.2.3 20% Moment of Inertia Coefficient Decrease
The second case is a decrease in the MOI by 20%.
The new transfer function for this case is
                 31.25
Gmoi(s) = -------------------                    (4.6)
           s² + 7.5s + 31.25
Table 4.3 gives the retraining specifications (Appendix
B). The sum of the square error returns to 0.141 from
0.161 in 64 iterations. Fig. 13 shows the retraining
curve, Fig. 14 compares the output of the retrained
system with the model and Fig. 15 shows the difference
between the model and the retrained system as a function
of time.
4.2.4 Conclusions
The results of retraining the plant with an increase in
MOI or a decrease in friction are not included. The
27


Fig. 10 Retraining curve for a 20% increase in
friction coefficient.
28


Fig. 11 Response of model (***) and output of the
retrained plant (-----) after adaptation to 20%
increase in friction coefficient.
29


Fig. 12 Difference between the model and the
retrained plant (model - plant) after retraining to
a 20% increase in the coefficient of friction as a
function of time.
30


Fig. 13 Retraining curve for a 20% decrease in
the moment of inertia coefficient.
31


Fig. 14 Response of the model (***) and the
output of the retrained plant (---) after
adaptation to 20% decrease in the moment of inertia
coefficient.
32


Fig. 15 Difference between the model and plant
outputs (model - plant) after retraining to a 20%
decrease in the coefficient of moment of inertia as
a function of time.
33


reason for this exclusion is that for these two cases
the NN would not retrain. This will require further
investigation. In the first two cases where retraining
did occur, the amount of damping increased with the
variation of parameters, while in the second two cases
the damping factor decreased, resulting in more
overshoot. These results can only be deemed as partial
successes. The outcome of these experiments can however
be useful. Initial training should be performed with
the maximum expected MOI and the minimum expected
friction before adaptation. As the plant is
operated, the friction will increase as the lubricants
become contaminated. The weights of the NN should be
retained so that after periodic maintenance these values
can be reloaded into the NN. In this way this type of
control system could be used successfully. A caution
here is that these conclusions may not apply to higher
order plants because the variations may cause different
effects on the poles of the transfer functions. Present
knowledge indicates that simulations for each individual
case would be necessary in order to study the effects of
these variations.
34


4.3 Closed Loop Control System
A closed loop system should be capable of controlling a
type 0 or a type 1 system. Research on a type 1 system
did not give an encouraging response. However when the
NN was applied to a type 0 plant the results were
satisfactory. If the desired plant to be controlled is
of type 1 then it requires a unity feedback branch in
order to transform it into a type 0 system. The plant
used in this study is a robotic arm with a transfer
function
                F
G(s) = -----------------                         (4.7)
        s(Ms² + Bs + K)
where M is the moment of inertia, B is the friction, K
is the spring constant and F is the gain. Since this is
a type 1 system it is transformed into a type 0 system
by providing a unity feedback branch. The plant
transfer function now becomes
                  F
G(s) = ----------------------          (4.8)
       Ms^3 + Bs^2 + Ks + F
For the purposes of simulation, the parameter values
were chosen such that the transfer function of the
plant is


                 120
G(s) = -------------------------       (4.9)
       s^3 + 16s^2 + 60s + 120

                 120
     = -------------------------       (4.10)
       s(s+6)(s+10) + 120
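The closed-loop denominator of Eq. (4.10) can be checked numerically; a quick sketch (in Python with NumPy rather than the thesis's AT-MATLAB):

```python
import numpy as np

# Open-loop denominator s(s+6)(s+10) of G(s) = 120 / (s(s+6)(s+10))
open_den = np.polymul(np.polymul([1, 0], [1, 6]), [1, 10])

# Unity feedback G/(1 + G) adds the numerator, 120, to the denominator.
closed_den = np.polyadd(open_den, [120])

print(closed_den)  # -> [  1  16  60 120], i.e. s^3 + 16s^2 + 60s + 120
```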
Once again this plant will exhibit overshoot to a unit
step input. The model was chosen to be
                        180
Gm(s) = -----------------------------------       (4.11)
        s^4 + 18s^3 + 119s^2 + 342s + 360

                        180
      = -----------------------------------       (4.12)
        (s+3)(s+4)(s+5)(s+6)
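The expansion in Eq. (4.11) can be recovered from the pole locations of Eq. (4.12); as a sketch (Python/NumPy, outside the AT-MATLAB listings of Appendix A), note the first-order coefficient works out to 342:

```python
import numpy as np

# Expand the model denominator from its four real poles.
den = np.poly([-3, -4, -5, -6])
print(den)  # s^4 + 18s^3 + 119s^2 + 342s + 360

# DC gain of Gm(s) = 180/den(s): 180/360 = 0.5, the steady-state
# value the step responses are compared against.
print(180 / den[-1])  # -> 0.5
```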
Since all of the roots are real, the model does not
exhibit overshoot. Fig. 16 shows the comparison between
this desired model output and the plant, in the
configuration of Fig. 4, if the NN is replaced with a
proportional compensator with a coefficient of one.
Details of the NN compensator and the training data
appear in Table 4.4 (Appendix B). Initial training was
suspended after 650 iterations, at which point the sum
of the square error was 0.505. This is illustrated in
Fig. 17. Fig. 18 shows the output of the trained system
and the model. This figure shows that the plant still
exhibits overshoot, but it is considerably reduced from
the output in Fig. 16.


Fig. 16 - Response of the model (***) and the
uncompensated plant (---) to a unit step input.


Fig. 17 - Initial learning curve, sum of the square
error versus iteration.


Fig. 18 - Output of the model (***) and the compensated
plant (---) to a unit step input.


Also, the final output of the plant is not the desired
value of 0.5 as hoped for; it is now 0.48, giving the
output an error of 4%.
Fig. 19 shows the difference between the model and the
plant as a function of time and Fig. 20 shows the
response of the trained system to a multi-level input
indicating that the NN has learned the dynamics of the
system.
4.3.1 Case Studies
The test for adaptability consisted of raising and
lowering the coefficients of the MOI, friction, and
spring constants by 20% wherever the plant algorithm
remained stable.
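Each perturbation acts on one physical coefficient of Ms^3 + Bs^2 + Ks + F; after renormalizing so the s^3 coefficient is one, the perturbed transfer functions of the following subsections follow mechanically. A sketch (Python; the helper name is illustrative, not from the thesis):

```python
# Nominal closed-loop plant: s^3 + 16s^2 + 60s + 120 (M=1, B=16, K=60, F=120).
def perturbed_plant(M=1.0, B=16.0, K=60.0, F=120.0):
    """Return (numerator, denominator) of F/(Ms^3+Bs^2+Ks+F), monic in s^3."""
    num = round(F / M, 2)
    den = [1.0, round(B / M, 2), round(K / M, 2), round(F / M, 2)]
    return num, den

print(perturbed_plant(B=16 * 0.8))  # 20% friction decrease -> Eq. (4.13)
print(perturbed_plant(K=60 * 1.2))  # 20% spring increase   -> Eq. (4.14)
print(perturbed_plant(M=1.2))       # 20% MOI increase      -> Eq. (4.16)
```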
4.3.2 20% Friction Coefficient Decrease
The transfer function for this case becomes
                  120
Gf(s) = --------------------------     (4.13)
        s^3 + 12.8s^2 + 60s + 120
Fig. 21 compares the output of the model and the output
of the plant after this perturbation but before
adaptation. After perturbation the sum of the square
error was 0.567. Table 4.5 shows the retraining data.
The NN took 119 iterations to return to a sum of the


Fig. 19 - Difference between the model and the
plant outputs (model − plant) as a function of
time.


Fig. 20 - Response of the trained system to a
multi-level step input.


Fig. 21 - Comparison between the outputs of the
model (***) and the plant (---) after a 20%
decrease in the coefficient of friction before
adaptation.


square error of 0.505 (Appendix B). Fig. 22 shows the
response of the plant after 150 iterations. Fig. 23
shows the difference between the model and the retrained
system as a function of time. Fig. 24 shows the
retraining curve. The maximum overshoot was 0.513, and
the undershoot was 0.484. The steady-state value of the
output was 0.485.
4.3.3 20% Spring Coefficient Increase
The modified transfer function of the plant is now
                 120
Gk(s) = ------------------------       (4.14)
        s^3 + 16s^2 + 72s + 120
The response after perturbation but before adaptation is
shown in Fig. 25. The sum of the square error before
adaptation was 0.743. Table 4.6 gives the retraining
data (Appendix B). The NN returned to a sum of the
square error of 0.505 in 97 iterations. Fig. 26 shows
the retraining curve for this perturbation. Fig. 27
compares the response of the model and the retrained
system and Fig. 28 shows the difference between the
model and the retrained system as a function of time
after 100 iterations. The maximum overshoot was 0.512
and the undershoot was 0.487, which is the steady-state
value. For these two cases the NN displayed a greater


Fig. 22 - Response of the model (***) and the
retrained plant (---) after adaptation to a 20%
decrease in the coefficient of friction.


Fig. 23 - Difference between the model and the
retrained plant (model − plant) after retraining to
a 20% decrease in the coefficient of friction as a
function of time.


Fig. 24 - Retraining curve for a 20% decrease in
the coefficient of friction.


Fig. 25 - Comparison between the outputs of the
model (***) and the plant (---) after a 20%
increase in the spring coefficient before
adaptation.


Fig. 26 - Retraining curve for a 20% increase in
the spring coefficient.


Fig. 27 - Comparison between the outputs of the
model (***) and the plant (---) after a 20%
increase in the spring coefficient after
adaptation.


Fig. 28 - Difference between the model and the
retrained plant (model − plant) after retraining to
a 20% increase in the spring coefficient as a
function of time.


willingness to readapt after these perturbations, and
these cases can be considered completely successful.
4.3.4 20% Spring Coefficient Decrease
The new transfer function is
                 120
Gk(s) = ------------------------       (4.15)
        s^3 + 16s^2 + 48s + 120
Fig. 29 shows the system response after the perturbation
and before adaptation. The sum of the square error was
1.01. After 600 iterations the sum of the square error
could only be reduced to a level of 0.578. The
retraining data is in Table 4.7 (Appendix B). Fig. 30
shows the retraining curve and Fig. 31 compares the
output of the partially retrained system to that of the
model. Fig. 32 shows the difference between the model
and the retrained system as a function of time. This
case exhibited the largest overshoot of all of the
cases, at 0.524. This is still below a 5% error. The
maximum undershoot and the steady-state value were both
0.479.


Fig. 29 - Comparison between the outputs of the
model (***) and the plant (---) after a 20%
decrease in the spring coefficient before
adaptation.


Fig. 30 - Retraining curve for a 20% decrease in
the spring coefficient.


Fig. 31 - Comparison between the outputs of the
model (***) and the plant (---) after a 20%
decrease in the spring coefficient after
adaptation.


Fig. 32 - Difference between the model and the
retrained plant (model − plant) after retraining to
a 20% decrease in the spring coefficient as a
function of time.


4.3.5 20% Moment of Inertia Coefficient Increase
The perturbed transfer function becomes
                  100
Gm(s) = --------------------------     (4.16)
        s^3 + 13.33s^2 + 50s + 100
Fig. 33 shows the response of the perturbed system
before adaptation. The sum of the square error was
0.543. The retraining data is in Table 4.8 (Appendix
B). Fig. 34 shows the retraining curve. The NN would
only retrain to a sum of the square error of 0.542 after
250 iterations, showing a great reluctance to retrain.
Fig. 35 shows the response of this barely retrained
system as compared to the model. Fig. 36 shows the
difference between the model and the retrained plant as
a function of time. The maximum overshoot was 0.512 and
the maximum undershoot was 0.482, while the steady-state
value was 0.483.
4.3.6 5% Friction Coefficient Increase
The algorithm used to simulate the plant becomes
unstable for larger perturbations of this parameter.
The transfer function for this variation is

                  120
Gf(s) = --------------------------     (4.17)
        s^3 + 16.8s^2 + 60s + 120


Fig. 33 - Comparison between the outputs of the
model (***) and the plant (---) after a 20%
increase in the moment of inertia coefficient
before adaptation.


Fig. 34 - Retraining curve for a 20% increase in
the coefficient of moment of inertia.


Fig. 35 - Comparison between the output of the
model (***) and the plant (---) after a 20%
increase in the coefficient of moment of inertia
after adaptation.


Fig. 36 - Difference between the model and the
retrained plant (model − plant) after retraining to
a 20% increase in the coefficient of moment of
inertia as a function of time.


Fig. 37 gives the response of the perturbed system
versus the model before adaptation. The initial sum of
the square error was 0.515. The NN retrained to a level
of 0.513 in 500 iterations. Fig. 38 shows the
retraining curve and the retraining data is in Table 4.9
(Appendix B). Fig. 39 compares the output of the
retrained system to that of the model. Fig. 40 shows
the difference between the model and the incompletely
retrained system as a function of time. The maximum
overshoot was 0.511; the undershoot and the steady-state
value were both 0.480. For the case of a reduction of
MOI, no data was obtained because the plant algorithm
became unstable for any decrease in this parameter.
4.3.7 Conclusions
The results of these experiments show that the
NNMRAC system in this configuration will attempt to
adapt to changing parameters: in all cases for which
the plant algorithm remained stable, the sum of the
square error could be reduced, although in some of the
cases this reduction was not significant. While this
configuration cannot handle a type 1 system directly,
it provides a more robust control system than the first
configuration.


Fig. 37 - Comparison between the outputs of the
model (***) and the plant (---) after a 5% increase
in the coefficient of friction before adaptation.


Fig. 38 - Retraining curve for a 5% increase in the
coefficient of friction.


Fig. 39 - Comparison between the output of the
model (***) and the plant (---) after a 5% increase
in the coefficient of friction after adaptation.


Fig. 40 - Difference between the model and the
retrained plant (model − plant) after retraining to
a 5% increase in the coefficient of friction as a
function of time.


CHAPTER 5
CONCLUSIONS
The tests performed and reported in Chapter 4
show that the NN can function at least partially well as
a controller in a MRAC system. The second method of
application that was examined does seem to hold the most
promise and is the more commonly seen form of
compensation used in control systems. While the results
were not perfect, i.e., training did not result in
perfect model following and retraining did not always
reduce the error to the desired level, the NN did train
to an acceptable level of operation. The errors in
maximum overshoot, undershoot and steady-state value
were all under 5%. This is quite acceptable for most
engineering applications.
The results do indicate that further research in
training algorithms to increase the rate of training
would be well worth the effort and funding. This
approach to MRAC systems provides for an automatic form
of adaptation that would be quite useful in slowly
varying systems. NNs may also hold the possibility of
being extended to nonlinear systems. There is a


definite need to study the effects of changing the
sigmoidal function parameters, K and a, and how they
might best be chosen to avoid lapses in training.
Finally, in retrospect, the Runge-Kutta method
would not be used again for the plant simulation.
Instead, a discrete-time algorithm such as the bilinear
transformation (with frequency prewarping if possible),
as described in reference [4], would be used.
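The recommended discrete-time approach can be sketched as follows (Python/NumPy rather than AT-MATLAB; the sample period T is an assumed value and prewarping is omitted). The bilinear transform maps x' = Ax + Bu to a difference equation whose step-response steady state matches the continuous plant exactly:

```python
import numpy as np

# Plant of Eq. (4.9) in phase-variable state-space form.
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [-120., -60., -16.]])
B = np.array([[0.], [0.], [120.]])
C = np.array([[1., 0., 0.]])

T = 0.01                               # sample period (assumed)
I = np.eye(3)
# Bilinear (Tustin) discretization of x' = Ax + Bu:
M = np.linalg.inv(I - (T / 2) * A)
Ad = M @ (I + (T / 2) * A)
Bd = M @ B * T

x = np.zeros((3, 1))
u = 0.5                                # step amplitude used in the text
for _ in range(1000):                  # 10 s of simulated time
    x = Ad @ x + Bd * u
y = (C @ x).item()
print(y)                               # settles near the 0.5 steady state
```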


APPENDIX A
COMPUTER PROGRAMS
The programs included in this Appendix were used to
investigate the use of neural networks in MRAC systems.
The programs were written in AT-MATLAB.
Programs t2adnn.m and t2adnnlm.m train a NN with
one hidden layer. Program t2adnnx.m tests the training.
Programs t3adnn4.m and t3adn4lm.m are used to train
a NN with two hidden layers. Program t3adnn4x.m tests
the training.
All other programs support the training and
testing programs.


% t2adnn.m    trains a two layer neural network for
%             adaptive control
%
% testh03.m   is a program that tests the dynamical
%             neural network with HYSTERESIS AND
%             LINEAR OUTPUT NEURON
%             CUMULATIVE LEARNING
format compact;
invectr
N0=input('number of neurons in the DELAY layer = ');
np=N0+1;
N1=input('number of neurons in the FIRST hidden layer = ');
ddinhl
iyninh2
kct2
alfact2
nu=1;
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: nu factor (usually 1 or larger) = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
iyninh2 % reset c to zeros
On=zeros(1,5); % Output of N.N. for R-K,A-B,A-M
plantst
ddaccin
ddoicc
for ii=1:np,
adcoleh
ddfwlhn
adpercen
nu=nu^(1/factor);
ddwcl
errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
end
ddwc2
kk=rem(jj,4)+221;
if kk==221,
subplot
end
subplot(kk),plot(errloc),title('ERROR');
end
save neuOlou W01 W12 alfa1 alfa2 h1 h2 K1 K2


% invectr.m invector train
% used in training sequence
invect=input('ENTER: amplitude of training step input = ');
tf=input('ENTER: time at end = ');
% ddinhl.m   used for training; sets up neural
%            network
%
% iyninhl.m  is a program that initializes all the
%            possible variables and structure of the
%            neural network
%            It uses the values
%            N0 = # of inputs (delayed) GIVEN
%                 BY THE INPUT
%            N1 = # of neurons in the first
%                 hidden layer
%            N2 = # of neurons in the output
%                 layer GIVEN BY THE OUTPUT
%            it uses the information of N0 (included
%            in It), number of input signals.
N2=1;
rand('uniform');
net1=zeros(N1,1);net2=zeros(N2,1);
net1h=zeros(N1,1);net2h=zeros(N2,1);
o1=zeros(N1,1);o2=zeros(N2,1);
o1h=zeros(N1,1);o2h=zeros(N2,1);
del1=zeros(N1,1);del2=zeros(N2,1);
der1=zeros(N1,1);der2=zeros(N2,1);
alfa1=ones(N1,1);alfa2=ones(N2,1);
h1=zeros(N1,1);h2=zeros(N2,1);
K1=ones(N1,1);K2=ones(N2,1);
W01=-ones(N0,N1)+2*rand(N0,N1);
W12=-ones(N1,N2)+2*rand(N1,N2);
W01lipp=zeros(N0,N1);
W12lipp=zeros(N1,N2);


% kct2.m is a program that initializes all the values
%        of K1,K2 with the same value.
%        OBS: the value of K (a constant) is asked here
K=input('saturation level of neurons = ');
%
K1=K1*K;
K2=K2*K;
% here we need to renormalize the weights to avoid
% saturation of neurons at the beginning of the
% learning process
W01=W01/(K*N0);W12=W12/(alfa2*N1);
% alfact2.m is a program that initializes all the
%           values of alfa1,alfa2 with the same
%           value.
%           OBS: the value of alfa (a constant) is
%           asked here
alfa=input('value of the ct. slope = ');
%
alfa1=ones(N1,1)*alfa;
alfa2=ones(N2,1)*alfa;
% ddaccin.m is a program that initializes all the
%           possible variables that will work as
%           accumulators for cumulative learning
%
alfa1i=zeros(N1,1);alfa2i=zeros(N2,1);
h1i=zeros(N1,1);h2i=zeros(N2,1);
K1i=zeros(N1,1);K2i=zeros(N2,1);
W01i=zeros(N0,N1);W12i=zeros(N1,N2);
ai=zeros(N0,1);gi=zeros(N0,1);
% ddoicc.m program for initializing the initial
%          conditions for the outputs of every neuron
%          ( = zeros) for every run of the complete
%          set of points
o1=zeros(N1,1);o2=zeros(N2,1);


% plantst.m sets up R-K for a4plantj.m
%           Adams Fourth Order Predictor/Corrector
%Aplt=[0 1;-8 -6];Bplt=[0;8];w0=[0;0];
%Aplt=[0 1;-16 -10];Bplt=[0;16];w0=[0;0];
%Aplt=[0 1 0;0 0 1;42.72 -58.82 -14.8];
%Bplt=[0;0;42.72];w0=[0;0;0];
%Aplt=[0 1;0 -4];Bplt=[0;4];w0=[0;0]; % 4/(s(s+4))
%Aplt=[0 1;0 -3];Bplt=[0;5];w0=[0;0]; % 5/(s(s+3))
%Aplt=[0];Bplt=[4];w0=[0]; % 4/s enlx series
%Aplt=[0 1;-29 -4];Bplt=[0;29];w0=[0;0]; % fnlx series
%Aplt=[0 1;-29 -4.8];Bplt=[0;29];w0=[0;0]; % fnlx
% w/ 20% friction increase
%Aplt=[0 1;-8 -4];Bplt=[0;8];w0=[0;0]; % fn2x series
%Aplt=[0];Bplt=[2];w0=[0]; % 2/s hnlx series
%Aplt=[0 1 0;0 0 1;0 -40 -14];Bplt=[0;0;75];w0=[0;0;0];
%knlx series
%Aplt=[0 1;-25 -6];Bplt=[0;25];w0=[0;0]; % double fb
% lx series
%Aplt=[0 1 0;0 0 1;-120 -60 -16];Bplt=[0;0;120];
% w0=[0;0;0]; % df series
%Aplt=[0 1 0;0 0 1;-120 -60 -12.8];Bplt=[0;0;120];
% w0=[0;0;0]; %df -20% fric
%Aplt=[0 1 0;0 0 1;-100 -50 -13.33];Bplt=[0;0;100];
% w0=[0;0;0]; %df +20% moi
Aplt=[0 1 0;0 0 1;-120 -48 -16];Bplt=[0;0;120];
w0=[0;0;0]; %df -20% k
t0 = 0;
h = T;
n = length(w0);
t = t0;
w = zeros(n,1);wp0 = w0;wp1=w;wp2=w;wp3=w;
points = zeros(n,np);
Op=zeros(1,np);
k1=w;k2=w;k3=w;k4=w;
w=w0;
points(:,1)=w;
% adcoleh.m for total system test
%
% inycoexh.m is a program to produce the feedforward
%c=Ie(:,ii);
c(2:N0,1)=c(1:N0-1,1);
if (ii==1),
c(1,1)=0;
else
c(1,1)=invect(1,1);
end


% ddfwlhn.m  used for feedback looping
%
% iynfwllh.m is a program that produces all the
%            forward computation in the feedforward
%            path for purpose of learning
%            ASSUMES FEEDFORWARD COMPENSATION (output
%            of comp. = c)
%            This subroutine calculates the status of all
%            variables at time ii
%            EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%            (except for the neuron in the output layer
%            which is LINEAR) introduces the variables
%            net*h=net*-o*./K*
%            THE OUTPUT LAYER ONLY HAS ONE NEURON
%
% FORWARD COMPUTATION
%
% net1=W01'*It(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;
o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
der1=(-o1.*o1+K1.*K1)./(2*K1);
net2=W12'*o1;
% LINEAR FUNCTION IN THE OUTPUT NEURON
o2=alfa2.*(net2+h2);
der2=ones(N2,1);
%
% BACK COMPUTATION
shfton2
a4plantj
del2=(Ot(:,ii)-Op(:,ii)).*der2;
del1=(W12*(del2.*alfa2)).*der1;
del0=W01*(del1.*alfa1);
%
% This program provides all the parameters for
% applying the Dynamical Extended Delta Rule
% adpercen.m This calculates the error between the
%            plant output Op and the desired model's
%            output Om
nu=min((sum(abs(Ot(:,ii)-Op(:,ii)).^2))/..
(sum(abs(Ot(:,ii)).^2)+eps),1);
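As an aside on the activation used in these listings: the hysteresis-free neuron output o = K(1-e^(-alfa*x))/(1+e^(-alfa*x)) is a scaled tanh, and der = (K^2 - o^2)/(2K) is its derivative with the alfa factor deferred to the delta-rule step. A quick numerical check (Python, outside the AT-MATLAB listings; K, alfa, and x are arbitrary test values):

```python
import math

K, alfa = 1.0, 2.0          # assumed saturation level and slope
x = 0.3                     # net input (plus threshold h)

# o = K(1 - e^{-alfa x})/(1 + e^{-alfa x})  ==  K tanh(alfa x / 2)
o = K * (1 - math.exp(-alfa * x)) / (1 + math.exp(-alfa * x))
assert abs(o - K * math.tanh(alfa * x / 2)) < 1e-12

# (K^2 - o^2)/(2K) is do/dx with the factor alfa left out, matching
# the separate .*alfa1 factor applied in the delta-rule code.
eps = 1e-6
num = (K * math.tanh(alfa * (x + eps) / 2)
       - K * math.tanh(alfa * (x - eps) / 2)) / (2 * eps)
print(abs(num - alfa * (K * K - o * o) / (2 * K)) < 1e-6)   # -> True
```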


% ddwcl.m is the program that calculates
%         increments for W and stores them in
%         accumulators
W12i=W12i+nu*o1*(del2.*alfa2)';
W01i=W01i+nu*c*(del1.*alfa1)';
% NOTE: that the input has been changed to (c) to
%       include the feedback
% ddwc2.m is the program that calculates
%         increments for W in cumulative fashion
%         from the accumulators
W12=W12+W12i/np+lipp*(W12-W12lipp);
W01=W01+W01i/np+lipp*(W01-W01lipp);
W12lipp=W12;
W01lipp=W01;
% t2adnnlm.m trains a two layer neural network for
%            adaptive control - Learn More
%
% testh03.m  is a program that tests the dynamical
%            neural network with HYSTERESIS AND
%            LINEAR OUTPUT NEURON
%            CUMULATIVE LEARNING
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: value of nu reciprocal exponent = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
iyninh2
On=zeros(1,5); % Output of N.N. to plant for R-K
plantst
ddaccin
ddoicc
for ii=1:np,
adcoleh
ddfwlhn
adpercen
nu=nu^(1/factor);
ddwcl
errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
end
ddwc2
kk=rem(jj,4)+221;
if kk==221,
subplot
end
subplot(kk),plot(errloc),title('ERROR');
end
save neuOlou W01 W12 alfa1 alfa2 h1 h2 K1 K2
% t2adnnx.m This runs the plant with the N.N. (one
%           hidden layer) compensator.
%           Used for testing, especially for varying
%           input data such as when the step has
%           negative and positive values
ddinxhl
invector
iyninxh2
plantst
sizeI=N0;
fbv=zeros(N0,1);
On=zeros(1,5);
stepcnt=1;
incnt=1;
for ii=1:np,
adcoexh
ddfwxlh
shfton2
a4plantj
stepcnt=stepcnt+1;
if(stepcnt>npoints),
stepcnt=1;
incnt=incnt+1;
end
end


% ddinxhl.m initializes variables for execution of
%           trained N.N. that has one hidden layer
N1=length(alfa1);
N2=length(alfa2);
nsteps=input('ENTER: # of steps = ');
npoints=input('ENTER: # of points per step = ');
np=nsteps*npoints;
Oe=zeros(np,1);
net1=zeros(N1,1);
net1h=zeros(N1,1);
o1=zeros(N1,1);
net2=zeros(N2,1);
net2h=zeros(N2,1);
o2=zeros(N2,1);
% invector.m creates stream of input vectors for use
%            in total test of trained neurally
%            compensated system
invect=zeros(nsteps,1);
for dli=1:nsteps,
invect(dli,1)=input('ENTER: amplitude = ');
end
% iyninxh2.m is a program to initialize the
%            compensators for normal operation
c=zeros(N0,1);
% adcoexh.m for total system test
%
% inycoexh.m is a program to produce the feedforward
%c=Ie(:,ii);
c(2:N0,1)=c(1:N0-1,1);
c(1,1)=invect(incnt,1);
% shfton2.m SHiFT On
%           this shifts the output of the neural
%           network in array On so that it can be
%           used for the R-K,A-B,A-M
On(1,2:5)=On(1,1:4);
On(1,1)=o2;


% shfton.m SHiFT On
%          this shifts the output of the neural
%          network in array On so that it can be
%          used for the R-K,A-B,A-M
On(1,2:5)=On(1,1:4);
On(1,1)=o3;
% ddfwxlh.m  used with one hidden layer N.N.
%
% iynfwxlh.m is a program that executes the (ii-th)
%            normal operation of the neural network
%            section
%            INPUT = Ie N0 x 1
%            ASSUMING FEEDFORWARD COMPENSATION
%            OUTPUT = Oe np x 1
%            FORWARD COMPUTATION
%            EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%            (except for the neuron in the output layer
%            which is LINEAR) introduces the variables
%            net*h=net*-o*./K*
%            THE OUTPUT LAYER ONLY HAS ONE NEURON
% net1=W01'*Ie(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;
o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
net2=W12'*o1;
% LINEAR FUNCTION OF THE OUTPUT NEURON
o2=alfa2.*(net2+h2);
Oe(ii)=o2(1);
% t3adnn4.m trains a three layer neural network for
%           adaptive control
%
% testh03.m is a program that tests the dynamical
%           neural network with HYSTERESIS AND
%           LINEAR OUTPUT NEURON
%           CUMULATIVE LEARNING
format compact;
invectr
T=input('enter: T = ');


N0=input('number of neurons in the DELAY layer = ');
np=(tf+T)/T; % N0+1;
N1=input('number of neurons in the FIRST hidden layer = ');
N2=input('number of neurons in the SECOND hidden layer = ');
adinhl
iyninh2
kct
alfact
nu=1;
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: nu factor (usually 1 or larger) = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
fbv=0;
On=zeros(1,5); % Output of N.N. for R-K,A-B,A-M
plantst
iynaccin
neuroicc
iyninh2
for ii=1:np,
adcoleh
makeinpx
adfwlhn
adpercen
nu=nu^(1/factor);
iynwcl
errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
shftfbv4
end
iynwc2
subplot
subplot(211),plot(errloc),title('ERROR t3adnn4.m')
subplot(212),plot(Op),title('PLANT OUTPUT')
end
save neuOlou W01 W12 W23 alfa1 alfa2 alfa3 h1 h2 h3 K1 K2 K3


% adinhl.m  used for training; sets up neural
%           network
%
% iyninhl.m is a program that initializes all the
%           possible variables and structure of the
%           neural network
%           It uses the values
%           N0 = # of inputs (delayed) GIVEN BY
%                THE INPUT
%           N1 = # of neurons in the first hidden
%                layer
%           N2 = # of neurons in the second hidden
%                layer
%           N3 = # of neurons in the output layer
%                GIVEN BY THE OUTPUT
%           it uses the information of N0 (included
%           in It), number of input signals.
N3=1;
rand('uniform');
net1=zeros(N1,1);net2=zeros(N2,1);net3=zeros(N3,1);
net1h=zeros(N1,1);net2h=zeros(N2,1);net3h=zeros(N3,1);
o1=zeros(N1,1);o2=zeros(N2,1);o3=zeros(N3,1);
o1h=zeros(N1,1);o2h=zeros(N2,1);o3h=zeros(N3,1);
del1=zeros(N1,1);del2=zeros(N2,1);del3=zeros(N3,1);
der1=zeros(N1,1);der2=zeros(N2,1);der3=zeros(N3,1);
alfa1=ones(N1,1);alfa2=ones(N2,1);alfa3=ones(N3,1);
h1=zeros(N1,1);h2=zeros(N2,1);h3=zeros(N3,1);
K1=ones(N1,1);K2=ones(N2,1);K3=ones(N3,1);
W01=-ones(N0,N1)+2*rand(N0,N1);
W12=-ones(N1,N2)+2*rand(N1,N2);
W23=-ones(N2,N3)+2*rand(N2,N3);
W01lipp=zeros(N0,N1);
W12lipp=zeros(N1,N2);
W23lipp=zeros(N2,N3);
% kct.m is a program that initializes all the values
%       of K1,K2,K3 with the same value.
%       OBS: the value of K (a constant) is asked here
K=input('saturation level of neurons = ');
%
K1=K1*K;
K2=K2*K;
K3=K3*K;
% here we need to renormalize the weights to avoid
% saturation of neurons
% at the beginning of the learning process
W01=W01/(K*N0);W12=W12/(K*N1);W23=W23/(alfa3*N2);


% alfact.m is a program that initializes all the
%          values of alfa1,alfa2,alfa3 with the
%          same value.
%          OBS: the value of alfa (a constant) is
%          asked here
alfa=input('value of the ct. slope = ');
%
alfa1=ones(N1,1)*alfa;
alfa2=ones(N2,1)*alfa;
alfa3=ones(N3,1)*alfa;
% iynaccin.m is a program that initializes all the
%            possible variables that will work as
%            accumulators for cumulative learning
%
alfa1i=zeros(N1,1);alfa2i=zeros(N2,1);alfa3i=zeros(N3,1);
h1i=zeros(N1,1);h2i=zeros(N2,1);h3i=zeros(N3,1);
K1i=zeros(N1,1);K2i=zeros(N2,1);K3i=zeros(N3,1);
W01i=zeros(N0,N1);W12i=zeros(N1,N2);W23i=zeros(N2,N3);
ai=zeros(N0,1);gi=zeros(N0,1);
% neuroicc.m program for initializing the initial
%            conditions for the outputs of every
%            neuron ( = zeros) for every run of the
%            complete set of points
o1=zeros(N1,1);o2=zeros(N2,1);o3=zeros(N3,1);
% makeinpx.m this makes the input to the Neural
%            Network from the reference input and the
%            plant's output, using negative feedback
c(1,1)=c(1,1)-fbv(1,1);
% iynwcl.m is the program that calculates
%          increments for W and stores them in
%          accumulators
W23i=W23i+nu*o2*(del3.*alfa3)';
W12i=W12i+nu*o1*(del2.*alfa2)';
W01i=W01i+nu*c*(del1.*alfa1)';
% NOTE: that the input has been changed to (c) to
%       include the feedback


% iynfwlhn.m used for feedback looping
%
% iynfwllh.m is a program that produces all the
%            forward computation in the feedforward
%            path for purpose of learning
%            ASSUMES FEEDFORWARD COMPENSATION (output
%            of comp. = c)
%            This subroutine calculates the status of
%            all variables at time ii
%            EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%            (except for the neuron in the output layer
%            which is LINEAR) introduces the variables
%            net*h=net*-o*./K*
%            THE OUTPUT LAYER ONLY HAS ONE NEURON
%
% FORWARD COMPUTATION
%
% net1=W01'*It(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;
o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
der1=(-o1.*o1+K1.*K1)./(2*K1);
net2=W12'*o1;
net2h=net2-o2./K2;
o2=(-exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2./(exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2.*K2;
der2=(-o2.*o2+K2.*K2)./(2*K2);
net3=W23'*o2;
% LINEAR FUNCTION IN THE OUTPUT NEURON
o3=alfa3.*(net3+h3);
der3=ones(N3,1);
%
% BACK COMPUTATION
shfton
a4plantj
del3=(Ot(:,ii)-Op(:,ii)).*der3;
%del3=(Ot(:,ii)-o3).*der3;
del2=(W23*(del3.*alfa3)).*der2;
del1=(W12*(del2.*alfa2)).*der1;
del0=W01*(del1.*alfa1);
%
% This program provides all the parameters for
% applying the Dynamical Extended Delta Rule


% shftfbv.m This performs the update of the Feed-
%           Back Vector
fbv(2:sizeI(1),1)=fbv(1:sizeI(1)-1,1);
fbv(1,1)=Op(1,ii);
% iynwc2.m is the program that calculates increments
%          for W in cumulative fashion from the
%          accumulators
W23=W23+W23i/np+lipp*(W23-W23lipp);
W12=W12+W12i/np+lipp*(W12-W12lipp);
W01=W01+W01i/np+lipp*(W01-W01lipp);
W23lipp=W23;
W12lipp=W12;
W01lipp=W01;
% t3adnn4x.m For use when the neural network is in
%            the feedback loop. This runs the plant
%            with the N.N. compensator. Used for
%            testing, especially for varying input
%            data such as when the step has negative
%            and positive values.
adinxhl
invector
iyninxh2
plantst
sizeI=N0;
fbv=zeros(N0,1);
On=zeros(1,5);
stepcnt=1;
incnt=1;
for ii=1:np,
adcoexh
makeinpx
adfwxlh
shfton
a4plantj
shftfbv
stepcnt=stepcnt+1;
if(stepcnt>npoints),
stepcnt=1;
incnt=incnt+1;
end
end


% t3adn4lm.m trains a three layer neural network for
%            adaptive control - Learn More
%
% testh03.m  is a program that tests the dynamical
%            neural network with HYSTERESIS AND
%            LINEAR OUTPUT NEURON
%            CUMULATIVE LEARNING
%
format compact;
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: value of nu reciprocal exponent = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
fbv=0;
On=zeros(1,5); % Output of N.N. to plant for R-K
plantst
iynaccin
neuroicc
iyninh2
for ii=1:np,
adcoleh
makeinpx
adfwlhn
adpercen
nu=nu^(1/factor);
iynwcl
errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
shftfbv4
end
iynwc2
subplot
subplot(211),plot(errloc),title('ERROR t3adn4lm.m')
subplot(212),plot(Op),hold on
plot(Ot,':'),title('MODEL & PLANT OUTPUT'),hold off
end
save neuOlou W01 W12 W23 alfa1 alfa2 alfa3 h1 h2 h3 K1 K2 K3


% adinxhl.m  for total test of system
%
% iyninxhl.m is a program that initializes all the
%            variables for the neural network to work
%            properly in the feedforward mode.
%load neuOlou;
N1=length(alfa1);
N2=length(alfa2);
N3=length(alfa3);
nsteps=input('ENTER: # of steps = ');
npoints=input('ENTER: # of points = ');
np=nsteps*npoints;
Oe=zeros(np,1);
net1=zeros(N1,1);
net1h=zeros(N1,1);
o1=zeros(N1,1);
net2=zeros(N2,1);
net2h=zeros(N2,1);
o2=zeros(N2,1);
net3=zeros(N3,1);
net3h=zeros(N3,1);
o3=zeros(N3,1);
% iyninxh2.m is a program to initialize the
%            compensators for normal operation
c=zeros(N0,1);
% adfwxlh.m is a program that executes the (ii-th)
%           normal operation of the neural network
%           section
%           INPUT = Ie N0 x 1
%           ASSUMING FEEDFORWARD COMPENSATION
%           OUTPUT = Oe np x 1
%           FORWARD COMPUTATION
%           EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%           (except for the neuron in the output layer
%           which is LINEAR) introduces the variables
%           net*h=net*-o*./K*
%
%           THE OUTPUT LAYER ONLY HAS ONE NEURON
%
% net1=W01'*Ie(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;
o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
net2=W12'*o1;
net2h=net2-o2./K2;
o2=(-exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2./(exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2.*K2;
net3=W23'*o2;
% LINEAR FUNCTION OF THE OUTPUT NEURON
o3=alfa3.*(net3+h3);
% a4plantj.m Adams Fourth Order Predictor/Corrector
%            used with neural networks; is a program
%            to solve continuous time equations using
%            Runge-Kutta initial value estimator,
%            Adams-Bashforth four step predictor,
%            & Adams-Moulton three step corrector.
%            See plantst.m
%            diffeqpi.m forms the product of A*w + B*u,
%            w is the present value of the system and
%            u is the unit step function,
%            w0 = initial conditions of the system,
%            t0 = starting time of run.
%            tf = finishing time of run.
%            h=(tf-to+T)/T
%            The entire output of the run is in matrix
%            points, if this is desired.
if (ii>=1)&(ii<=3),
u=o3; % Ot(ii);
k1=h*diffeqpi(w,Aplt,Bplt,On(1,2));
k2=h*diffeqpi(w+k1/2,Aplt,Bplt,On(1,2));
k3=h*diffeqpi(w+k2/2,Aplt,Bplt,On(1,2));
k4=h*diffeqpi(w+k3,Aplt,Bplt,On(1,2));
w=points(:,ii)+(k1+2*k2+2*k3+k4)/6;
points(:,ii+1)=w;
wp0=points(:,1);
wp1=points(:,2);
wp2=points(:,3);
wp3=points(:,4);
end
if (ii>=4),
% Adams-Bashforth 4 step predictor
w=wp3+h*(55*diffeqpi(wp3,Aplt,Bplt,On(1,1))..
-59*diffeqpi(wp2,Aplt,Bplt,On(1,2))..
+37*diffeqpi(wp1,Aplt,Bplt,On(1,3))..
-9*diffeqpi(wp0,Aplt,Bplt,On(1,4)))/24;


% Adams-Moulton 3 step corrector
w=wp3+h*(9*diffeqpi(w,Aplt,Bplt,On(1,1))..
+19*diffeqpi(wp3,Aplt,Bplt,On(1,1))..
-5*diffeqpi(wp2,Aplt,Bplt,On(1,2))..
+diffeqpi(wp1,Aplt,Bplt,On(1,3)))/24;
points(:,ii+1)=w;
wp0 = wp1;
wp1 = wp2;
wp2 = wp3;
wp3 = w;
end
Op(1,ii)=points(1,ii+1);
% diffeqpi.m This is function diffeqpi.m which is
%            used in a4plantj.m.
%            Input: t=time of evaluation. If the values
%            do not change with time, the value of t is
%            unimportant.
%            w=vector value under evaluation,
%            matrices A & B
%            u is shown here as a unit step input; this
%            could be changed.
function f = diffeqpi(w,A,B,u)
f=A*w + B*u;


APPENDIX B
The tables listing the training and retraining data
are included in this Appendix.
Abbreviations used in this appendix are on the
following page.


ABBREVIATIONS USED FOR TABLES
4.1, 4.2, & 4.3
A = amplitude of the training input
tf = time at the end of the training iteration
NO = number of outputs in the delay layer
N1 = number of neurons in the first hidden layer
K = saturation level of the sigmoidal curve
a = slope of the sigmoidal curve at the origin
I = number of iterations
L = value of the momentum constant
f = value of the nu reciprocal re-scaling exponent
sses = value of the sum square error at the beginning of
       the iterations
ssee = value of the sum square error at the end of the
       iterations
ABBREVIATIONS USED FOR TABLES
4.4, 4.5, 4.6, 4.7, 4.8, & 4.9
A = amplitude of the training input
tf = time at the end of the training iteration
T = time between successive points during the
iteration
NO = number of outputs in the delay layer
N1 = number of neurons in the first hidden layer
N2 = number of neurons in the second hidden layer
k = saturation level of the sigmoidal curve
a = slope of the sigmoidal curve at the origin
I = number of iterations
L = value of the momentum constant
f = value of the nu reciprocal re-scaling exponent
sses = value of the sum square error at the beginning of the iterations
ssee = value of the sum square error at the end of the iterations


Ince, David Leland (M.S., Electrical Engineering)
A Neural Network-Based Model-Reference Adaptive Control System
Thesis directed by Professor Edward T. Wall

This thesis examines the implementation of recurrent backpropagation neural networks in a model-reference adaptive control system. The neural network is used as a self-adapting controller in a single-input/single-output system. Two approaches to model-reference adaptive control systems are defined and examined by computer simulation. The simulations show that these are viable implementations when the stability requirements of the system to be controlled are met and the architecture of the neural network used is sufficiently large. Once the neural network has been trained for a close approximation of the actual plant, training can continue on-line. In this way the errors in the identification of the system and variations in the plant parameters can be taken into account. The approaches in this thesis show that this is a promising new method of controls engineering

and the indications are that it is applicable to problems of higher complexity.

The form and abstract of this thesis are approved. I recommend its publication.

Signed
Edward T. Wall

ACKNOWLEDGEMENTS

This thesis is dedicated to the following people, whose support and guidance have made it possible. My thanks to Dr. Julio C. Proano, who developed the recurrent neural network used in this thesis; Dr. Edward T. Wall, for his advice and understanding of control systems; Dr. Jan T. Bialasiewicz, for his mathematical direction; and Dr. William J. Wolfe, for his encouragement and advice. I would also like to thank my mother for her understanding.

CONTENTS

FIGURES ......... viii
1. INTRODUCTION ......... 1
2. BACKGROUND NEURAL NETWORK ......... 5
2.1 Introduction ......... 5
2.2 Memory ......... 7
2.3 Calculation of Output ......... 8
2.4 Training by Backpropagation ......... 10
3. PLANT SIMULATION ......... 13
3.1 Introduction ......... 13
3.2 Procedure ......... 13
3.3 Conclusions ......... 17
4. SIMULATION RESULTS ......... 19
4.1 Introduction ......... 19
4.2 Open Loop Control System ......... 19
4.2.1 Case Studies ......... 21
4.2.2 20% Friction Coefficient Increase ......... 21
4.2.3 20% Moment of Inertia Coefficient Decrease ......... 27
4.2.4 Conclusions ......... 27
4.3 Closed Loop Control System ......... 35
4.3.1 Case Studies ......... 40
4.3.2 20% Friction Coefficient Decrease ......... 40
4.3.3 20% Spring Coefficient Increase ......... 44
4.3.4 20% Spring Coefficient Decrease ......... 52
4.3.5 20% Moment of Inertia Coefficient Increase ......... 57
4.3.6 5% Friction Coefficient Increase ......... 57
4.3.7 Conclusions ......... 62
5. CONCLUSIONS ......... 67
APPENDIX A ......... 69
APPENDIX B ......... 88
BIBLIOGRAPHY ......... 93

FIGURES

Fig. 1 - Open loop control system ......... 3
Fig. 2 - Closed loop control system ......... 3
Fig. 3 - Generalized architecture of recurrent backpropagation neural network [1] ......... 6
Fig. 4 - Sigmoidal function, K = 1; a = 0.5 (---), 1 (...), & 2 (-) ......... 9
Fig. 5 - Response of model (---) and uncompensated plant (---) to a unit step input ......... 22
Fig. 6 - Output of model (***) and compensated plant (---) to unit step input ......... 23
Fig. 7 - Difference between the model and the plant outputs (model - plant) as a function of time ......... 24
Fig. 8 - Initial learning curve, sum of the square error versus iteration ......... 25
Fig. 9 - Response of the trained system to a multi-level step input ......... 26
Fig. 10 - Retraining curve for a 20% increase in friction coefficient ......... 28
Fig. 11 - Response of model (***) and output of the retrained plant (---) after adaptation to 20% increase in friction coefficient ......... 29
Fig. 12 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% increase in the coefficient of friction as a function of time ......... 30
Fig. 13 - Retraining curve for a 20% decrease in the moment of inertia coefficient ......... 31
Fig. 14 - Response of the model (***) and the output of the retrained plant (---) after adaptation to 20% decrease in the inertia coefficient ......... 32
Fig. 15 - Difference between the model and plant outputs (model - plant) after retraining to a 20% decrease in the coefficient of moment of inertia as a function of time ......... 33
Fig. 16 - Response of the model (***) and the uncompensated plant (---) to a unit step input ......... 37
Fig. 17 - Initial learning curve, sum of the square error versus iteration ......... 38
Fig. 18 - Output of model (***) and compensated plant (---) to a unit step input ......... 39
Fig. 19 - Difference between the model and the plant outputs (model - plant) as a function of time ......... 41
Fig. 20 - Response of the trained system to a multi-level step input ......... 42
Fig. 21 - Comparison between the outputs of the model (***) and the plant (---) after a 20% decrease in the coefficient of friction before adaptation ......... 43
Fig. 22 - Response of the model (***) and the retrained plant (---) after adaptation to 20% decrease in the coefficient of friction ......... 45
Fig. 23 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% decrease in the coefficient of friction as a function of time ......... 46
Fig. 24 - Retraining curve for a 20% decrease in the coefficient of friction ......... 47
Fig. 25 - Comparison between the outputs of the model (***) and the plant (---) after a 20% increase in the spring coefficient before adaptation ......... 48
Fig. 26 - Retraining curve for a 20% increase in the spring coefficient ......... 49
Fig. 27 - Comparison between the outputs of the model (***) and the plant after a 20% increase in the spring coefficient after adaptation ......... 50
Fig. 28 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% increase in the spring coefficient as a function of time ......... 51
Fig. 29 - Comparison between the outputs of the model (***) and the plant (---) after a 20% decrease in the spring coefficient before adaptation ......... 53
Fig. 30 - Retraining curve for a 20% decrease in the spring coefficient ......... 54
Fig. 31 - Comparison between the outputs of the model (***) and the plant (---) after a 20% decrease in the spring coefficient after adaptation ......... 55
Fig. 32 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% decrease in the spring coefficient as a function of time ......... 56
Fig. 33 - Comparison between the outputs of the model (***) and the plant (---) after a 20% increase in the moment of inertia coefficient before adaptation ......... 58
Fig. 34 - Retraining curve for a 20% increase in the coefficient of moment of inertia ......... 59
Fig. 35 - Comparison between the output of the model (***) and the plant (---) after a 20% increase in the coefficient of moment of inertia after adaptation ......... 60
Fig. 36 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% increase in the coefficient of moment of inertia as a function of time ......... 61
Fig. 37 - Comparison between the outputs of the model (***) and the plant (---) after a 5% increase in the coefficient of friction before adaptation ......... 63
Fig. 38 - Retraining curve for a 5% increase in the coefficient of friction ......... 64
Fig. 39 - Comparison between the output of the model (***) and the plant (-) after a 5% increase in the coefficient of friction after adaptation ......... 65
Fig. 40 - Difference between the model and the retrained plant (model - plant) after retraining to a 5% increase in the coefficient of friction as a function of time ......... 66

CHAPTER 1
INTRODUCTION

The goal of this research is to study the use of a recurrent neural network (NN) as a controller in a model-reference adaptive control (MRAC) system. The more common approaches to control system compensation, both classical and modern, work well only where the plant is in a linear mode of operation. This can be overly restrictive, since most physical plants are nonlinear. One method of overcoming this restriction is to use a model that, for a given input, will produce the desired output. This model may be either hardware or software. In model-reference control systems the outputs of the model and the plant are compared, and this difference is used to modify the control signals, forcing the plant to track the model. This method has been proven quite useful in overcoming nonlinearities. Since plants also exhibit variations due to environmental changes, the model-reference control system needs to be adaptable to these changes. Conventional controllers are at most second-order dynamical systems and may not deliver the proper

response for higher order systems. The NN creates the possibility of a higher order controller. Reference [1] demonstrates the ability of the NN to implement a higher order dynamical compensator with adjustable parameters. Such a system will be called a NNMRAC system. Two different methods of application will be investigated.

The first method of using a NN in a MRAC system is shown in Fig. 1. As illustrated, in this system the output of the plant is not compared to the reference signal; i.e., there is no feedback from the output of the plant to be subtracted from the reference input, since this is an open loop control system. The second method is shown in Fig. 2. This is a true feedback control system because the output of the plant is compared to the reference input. Both of these systems have been investigated. The approach taken was as follows:

1) A plant which exhibits overshoot to a step input was selected. A reference model with no overshoot was chosen.
2) The plant with a NN compensator was trained so that the plant would track the model until the sum of

Fig. 1 - Open loop control system.

Fig. 2 - Closed loop control system.

the square error between the model and the plant was at a low level.
3) The parameters of the plant were perturbed to simulate different effects, such as changes in the moment of inertia or in friction.
4) The NN was retrained in order to see if the new error between the model and the plant could be reduced to the level that had been arrived at in step 2.

Results of the initial training, the effects of the perturbations, and the results of the retraining have been included. These experiments show that the NNMRAC system is indeed a viable approach for control systems.

CHAPTER 2
BACKGROUND NEURAL NETWORK

2.1 Introduction

The generalized architecture of a recurrent backpropagation NN is shown in Fig. 3. It consists of three segments, known as the input layer, the hidden layers, and the output layer. The input layer has a single input, and the number of outputs is N0. There are usually two hidden layers of neurons, with N1 and N2 neurons in each layer respectively. The output layer is a single neuron, where N3 is used to identify it. Every output of the input layer is connected to every neuron in the first hidden layer through a network of multipliers known as weights. This network is called the W01 weight matrix, and its size is N0 by N1. Every output of the first hidden layer is an input to every neuron in the second hidden layer through a second weight matrix, W12, whose size is N1 by N2. Each neuron in the second hidden layer is an input to the output neuron through a third weight matrix, W23. Its size is N2 by N3. Since N3 is one, W23 is in reality a column vector of weights. The recurrent feature of the

Fig. 3 - Generalized architecture of recurrent backpropagation neural network [1]. (Diagram: input, delay network, 1st hidden layer, 2nd hidden layer.)

NN is that each neuron in the hidden layers is an input to itself, after it has been delayed for a time interval of duration T.

2.2 Memory

The outputs of the hidden layers are nonlinear in that the output value of each of these neurons is governed by a sigmoidal function, which acts as the memory feature of the NN. The equation of the output of the m-th neuron in the n-th hidden layer at the k-th interval of time is

Onm(k) = f(NETnm(k) - K^-1 Onm(k-1))     (2.1)

where NETnm(k) is the input to the m-th neuron from the preceding layer, and

f(s) = K tanh(a*s/2)     (2.2)

is known as the sigmoidal function, where K is the scaling factor of the sigmoidal function and a is related to its slope. Equation (2.2) can also be expressed as

f(s) = K(1 - exp(-a*s))/(1 + exp(-a*s))     (2.3)

where exp is the exponential function. Taking the derivative of Equation (2.3) with respect to s yields

f'(s) = 2K*a*exp(-a*s) / (1 + exp(-a*s))^2.     (2.4)

Evaluating Equation (2.4) at s = 0 gives the slope of the sigmoidal function in the linear region:

f'(0) = 2K*a/4 = K*a/2.     (2.5)

This is shown in Fig. 4 with K = 1.

2.3 Calculation of Output

The first step in using a NN is to choose the desired response to an input signal, which is usually a step function. The response to this input must span a sufficient time for the output to reach steady state. The number of training points will determine the accuracy of the output of the NN: the higher the number of points, the higher the precision. However, this requires more training time per learning iteration and more memory space to hold all of the weight matrices. The input signal is passed through a series of N0-1 time delays of duration T. The input signal inside of the NN will then last for a time interval of N0*T. This maps the input dynamics into the NN. Each T interval of the response of the NN is referred to as a point. For each point in the training sequence, the column vector of the input to the first layer is determined by the equation

Fig. 4 - Sigmoidal function, K = 1; a = 0.5 (---), 1 (...), & 2 (-).

NET1(k) = W01(k)^T c(k)     (2.6)

where c is the column vector of outputs from the delay layer and ^T is the transpose operator. The value of NET1(k) is operated on by Equation (2.1), and its output value is O1(k). This is used by the equation

NET2(k) = W12(k)^T O1(k)     (2.7)

to calculate the input to the second hidden layer. NET2(k) is operated on by Equation (2.1) to determine the output of the second hidden layer, O2(k). The input to the output layer is calculated by the equation

NET3(k) = W23(k)^T O2(k).     (2.8)

The output of the NN is finally determined by the equation

O3(k) = a NET3(k).     (2.9)

2.4 Training by Backpropagation

The technique of training a NN by backpropagation in References [1,2] requires that the derivative and an intermediate variable, delta, be calculated at each point for each neuron. The equation

DERn(k) = (-On(k)On(k) + KnKn)/(2Kn)     (2.10)

determines the derivative vector for the neurons in the n-th hidden layer. The derivative of the output neuron is assumed to be a vector composed of ones. The deltas for each layer are calculated using the deltas of the previous layer by the equations

DEL3(k) = [Om(k) - O3(k)]DER3(k)     (2.11)
DEL2(k) = W23(k)[DEL3(k)*a]DER2(k)     (2.12)
DEL1(k) = W12(k)[DEL2(k)*a]DER1(k)     (2.13)

At the k-th point the square error, nu, must be calculated by the equation

nu(k) = [Om(k) - O3(k)]^2 / [Om(k)]^2.     (2.14)

The values of nu and DEL at each k-th interval are used to calculate an incremental change in the weight matrices. The equations

W23i(k+1) = W23i(k) + nu(k)O2(k)[DEL3(k)*a]     (2.15)
W12i(k+1) = W12i(k) + nu(k)O1(k)[DEL2(k)*a]     (2.16)
W01i(k+1) = W01i(k) + nu(k)c(k)[DEL1(k)*a]     (2.17)

are used at each point. At the end of each training iteration the weight matrices are updated by the equations

W23(t+1) = W23(t) + W23i(t)/np + m[W23(t) - W23(t-1)]     (2.18)

W12(t+1) = W12(t) + W12i(t)/np + m[W12(t) - W12(t-1)]     (2.19)

W01(t+1) = W01(t) + W01i(t)/np + m[W01(t) - W01(t-1)]     (2.20)

where t represents the t-th iteration, np is the number of points for each sequence, and m is the momentum term of [2]. This term is used to speed the convergence of the backpropagation algorithm.

After training the NN for several (typically 20 to 300) iterations, the value of the square error has been reduced to a low value. This means that the change to the weights per iteration is small. One procedure to increase the learning rate is to raise the value of the square error to a power less than one. Since the square error is less than or equal to one, this will increase its value. When this is applied properly, the number of iterations needed to train the NN can be reduced.
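The forward-pass equations of this chapter can be condensed into a short sketch. The following Python (the thesis's own programs are MATLAB, listed in Appendix A) implements the sigmoid of Eq. (2.2), the recurrence of Eq. (2.1), and one pass through Eqs. (2.6)-(2.9); the layer sizes and random weights are illustrative only, not the trained configurations of Appendix B.

```python
# Sketch of one forward pass through the recurrent network of Chapter 2
# (Eqs. 2.1-2.2 and 2.6-2.9).  Pure Python; sizes and weights are made up.
import math
import random

K, a = 1.0, 1.0          # sigmoid saturation level and slope parameter
N0, N1, N2 = 4, 3, 2     # delay-layer outputs and hidden-layer sizes

def f(s):
    """Sigmoidal function of Eq. (2.2): f(s) = K*tanh(a*s/2)."""
    return K * math.tanh(a * s / 2)

def matvec_T(W, v):
    """Compute W^T v for W stored as a list of rows (len(v) rows)."""
    return [sum(W[i][j] * v[i] for i in range(len(v)))
            for j in range(len(W[0]))]

random.seed(0)
W01 = [[random.uniform(-0.5, 0.5) for _ in range(N1)] for _ in range(N0)]
W12 = [[random.uniform(-0.5, 0.5) for _ in range(N2)] for _ in range(N1)]
W23 = [[random.uniform(-0.5, 0.5)] for _ in range(N2)]

def forward(c, o1_prev, o2_prev):
    """One time step: Eqs. (2.6)-(2.9) with the recurrence of Eq. (2.1)."""
    o1 = [f(s - op / K) for s, op in zip(matvec_T(W01, c), o1_prev)]
    o2 = [f(s - op / K) for s, op in zip(matvec_T(W12, o1), o2_prev)]
    o3 = a * matvec_T(W23, o2)[0]        # linear output neuron, Eq. (2.9)
    return o1, o2, o3

c = [1.0, 0.0, 0.0, 0.0]                 # step input entering the delay line
o1, o2, o3 = forward(c, [0.0] * N1, [0.0] * N2)
print(o3)
```

Note how the hidden outputs stay inside (-K, K) because of the tanh saturation; the training equations (2.10)-(2.20) would then adjust W01, W12, and W23 from the error Om(k) - O3(k).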

CHAPTER 3
PLANT SIMULATION

3.1 Introduction

The training of the NN requires that the output of the plant be compared to the reference model output on a point-by-point basis. Each point is separated in time as in a discrete-time control system. In the second method of NNMRAC systems the output of the plant is used to modify the input signal via feedback. The algorithm used to define the plant should be capable of handling the feedback accurately with fairly large values of sampling intervals. For this reason a discrete-time filter to simulate the plant was not considered.

3.2 Procedure

A suitable method makes use of the Runge-Kutta fourth-order algorithm to find the first three output values. The remaining values of the plant output are calculated using an Adams-Bashforth four-step predictor and an Adams-Moulton three-step corrector. Reference [3] presents the algorithms when the input is constant.

Therefore, some changes to the algorithm were made to take into account the changing inputs. The unmodified algorithm is evaluated using the four previous time and state values; the modification also uses the four previous inputs to the plant. While this did exhibit errors, the resultant plant gave a reasonable approximation to the correct output and was significantly more accurate than a step-invariant discrete-time representation of the same transfer function with identical time intervals. While the simulation time for the Runge-Kutta plant is much longer than that of a discrete-time plant, the accuracy of the Runge-Kutta method permits a reduction in the number of neurons in the time delay layer of the NN. The Runge-Kutta method also permits the description of the plant transfer function in continuous-time control canonical form, which allowed variations in the plant parameters when testing the trained system for plant perturbations. This was the motivation for the selection of the Runge-Kutta method.

The algorithm was divided into three parts. The first part provides the variables needed and initializes their values. The second part provides a function call that evaluates the expressions defined in [3] of equation (3.1) below:

X'(t) = AX(t) + BU(t)     (3.1)
Y(t) = CX(t) + DU(t)     (3.2)

where X(t) is the (n by 1) vector of the states of an n-th order system, A is the (n by n) matrix called the system matrix, and B is the (n by r) matrix called the input matrix. Also, U(t) is the (r by 1) vector composed of the system input functions, and Y(t) is the (p by 1) vector composed of the defined outputs. In addition, C is the (p by n) matrix called the output matrix, and D is the (p by r) matrix that represents the direct coupling between the input and the output. The third part of the method determines which variables are sent to the function call and how the returned values are processed into an output for a particular time instant according to equation (3.2). For a time invariant plant the Runge-Kutta fourth-order starter calculates the values of the temporary variables K1, K2, K3 and K4 three times using the following equations:

K1 = Tf(w(i-1))
K2 = Tf(w(i-1) + K1/2)
K3 = Tf(w(i-1) + K2/2)
K4 = Tf(w(i-1) + K3)     (3.3)

where T is the time interval between output values, f is the function to be evaluated, and w(i-1) are the states at the previous evaluation. This evaluates only

X' = AX.     (3.4)

Changes to the starter so that it evaluates the entire equation (3.1) are

K1 = Tf(w(i-1),On(i-1))
K2 = Tf(w(i-1) + K1/2,On(i-1))
K3 = Tf(w(i-1) + K2/2,On(i-1))
K4 = Tf(w(i-1) + K3,On(i-1))     (3.5)

where On(i-1) is the output of the NN delayed by one time interval. The Adams-Bashforth four-step predictor is

w(i) = w(i-1) + T(55f(w(i-1)) - 59f(w(i-2)) + 37f(w(i-3)) - 9f(w(i-4)))/24.     (3.6)

The changes used for the predictor are

w(i) = w(i-1) + T(55f(w(i-1),On(i)) - 59f(w(i-2),On(i-1)) + 37f(w(i-3),On(i-2)) - 9f(w(i-4),On(i-3)))/24,     (3.7)

where On(i) is the output of the NN at the present time and the other On values are successively delayed by one time interval. The Adams-Moulton corrector is

w(i) = w(i-1) + T(9f(w(i)) + 19f(w(i-1)) - 5f(w(i-2)) + f(w(i-3)))/24     (3.8)

where w(i) on the right side of the corrector equation is the value of w(i) from the predictor equation. The changes to the corrector are then

w(i) = w(i-1) + T(9f(w(i),On(i)) + 19f(w(i-1),On(i)) - 5f(w(i-2),On(i-1)) + f(w(i-3),On(i-2)))/24.     (3.9)

3.3 Conclusions

These changes were tested by replacing the NN compensator with a proportional gain of value one. The values of the output of the NN were then replaced by the output of this interim plant subtracted from the reference input to form the next input to the plant. When compared to a step-invariant discrete-time filter with the same transfer function, the Runge-Kutta determined plant gave better results for identical time intervals. However, it should be noted that the Runge-Kutta program easily becomes unstable when the coefficients in the transfer function are large, for large time increments. Other methods presented in reference [4] also exhibit less accuracy when employed in a feedback loop, with the notable exception of the bilinear transformation. This method is quite accurate; however, it does not permit easy adjustments for the change in parameters

that is necessary for the purpose of this study.

The introduction of a plant whose input is the output of the NN does involve some minor changes to the backpropagation algorithm. In the original algorithm the updating of the weights is accomplished by a comparison of the output of the desired reference model with the output of the NN; the changes are made in order to compare the output of the reference model with the output of the plant instead. The modified program is in Appendix A.
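The modified RK4 starter of Eq. (3.5) applied to the state equation (3.1) can be sketched briefly. This is a minimal Python illustration, not the thesis's MATLAB program; the servomotor plant of Eq. (4.3) in control canonical form is used as an assumed test system, with the NN output u held constant over each step as Eq. (3.5) requires.

```python
# One Runge-Kutta fourth-order step for x' = A*x + B*u, after Eq. (3.5),
# with the input u held over the step.  The example system is the type-0
# servomotor of Eq. (4.3), x'' + 6x' + 25x = 25u, in canonical form.

def deriv(x, A, B, u):
    """x' = A*x + B*u with A, B stored as nested lists / a list."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) + B[i] * u
            for i in range(len(x))]

def rk4_step(x, A, B, u, T):
    """Advance the state one interval T using the RK4 starter."""
    k1 = [T * d for d in deriv(x, A, B, u)]
    k2 = [T * d for d in deriv([xi + ki / 2 for xi, ki in zip(x, k1)], A, B, u)]
    k3 = [T * d for d in deriv([xi + ki / 2 for xi, ki in zip(x, k2)], A, B, u)]
    k4 = [T * d for d in deriv([xi + ki for xi, ki in zip(x, k3)], A, B, u)]
    return [xi + (p + 2 * q + 2 * r + s) / 6
            for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]

A = [[0.0, 1.0], [-25.0, -6.0]]   # system matrix of Eq. (4.3)
B = [0.0, 25.0]                   # input matrix
x, T = [0.0, 0.0], 0.01
for _ in range(500):              # 5 seconds of a unit step response
    x = rk4_step(x, A, B, 1.0, T)
print(x[0])                       # settles near the d.c. gain of 1
```

In the thesis program the constant u would instead be the delayed NN output On(i-1), which is exactly the modification Eq. (3.5) describes.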

CHAPTER 4
SIMULATION RESULTS

4.1 Introduction

This chapter consists of two parts. The first part will examine the open loop control system, where the NN had a single hidden layer. The configuration is that of Fig. 1. The second part investigates the closed loop control system, where the NN had two hidden layers. The configuration used is shown in Fig. 2.

4.2 Open Loop Control System

Since this type of system is open loop, the plant to be controlled must be type 0. If the plant to be controlled is of type 1, then it can be transformed into a type 0 plant by using unity feedback around the plant. The plant used in this simulation is a servomotor with a transfer function of the form

G(s) = F / (Ms(s + B))     (4.1)

where M represents the moment of inertia (MOI) of the motor and load, B represents the friction, and F is the applied force. Since this function is a type 1, unity

feedback is added to make the plant type 0. The resultant transfer function is

G(s) = F / (Ms^2 + Bs + F).     (4.2)

This system is now controllable in the open loop configuration. With parameters inserted into equation (4.2) the transfer function becomes

G(s) = 25 / (s^2 + 6s + 25).     (4.3)

This plant will exhibit overshoot when subjected to a step input. A more desirable transfer function is

Gm(s) = 120 / ((s+4)(s+5)(s+6)) = 120 / (s^3 + 15s^2 + 74s + 120),     (4.4)

which displays no overshoot, since none of the poles of the transfer function are complex. It should be observed that the transfer function of the model is of higher degree than that of the plant. An unexpected result is that the NN is more easily trained when the model is of higher order than the plant.
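The overshoot claim for Eq. (4.3) and the no-overshoot claim for Eq. (4.4) follow from elementary second-order theory and from the model's real poles; a small Python check (not part of the thesis code):

```python
# Why the plant of Eq. (4.3) overshoots and the model of Eq. (4.4) does not.
# 25/(s^2 + 6s + 25) has wn = 5 and zeta = 6/(2*5) = 0.6, so the classical
# second-order step-overshoot formula applies.  The model's denominator
# factors into real poles at -4, -5, -6, so its step response is monotonic.
import math

wn = math.sqrt(25.0)              # natural frequency of Eq. (4.3)
zeta = 6.0 / (2.0 * wn)           # damping ratio
overshoot = math.exp(-math.pi * zeta / math.sqrt(1 - zeta ** 2))
print(round(100 * overshoot, 1))  # percent overshoot of the plant, ~9.5

def expand(p1, p2, p3):
    """Coefficients of (s+p1)(s+p2)(s+p3), highest power first."""
    return [1.0, p1 + p2 + p3, p1 * p2 + p1 * p3 + p2 * p3, p1 * p2 * p3]

print(expand(4.0, 5.0, 6.0))      # matches s^3 + 15s^2 + 74s + 120
```

The roughly 9.5% overshoot of the plant is what the NN compensator is trained to remove by tracking the all-real-pole model.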

Fig. 5 shows the unit step response of the type 0 plant and the model. Table 4.1 details the neural configuration and the training duration (Appendix B). The initial training was suspended after 300 iterations, at which point the sum of the square error was 0.141. Fig. 6 compares the model and the compensated output; there is a close match between the two. Fig. 7 shows the difference between the model and the plant output as a function of time. Fig. 8 gives the learning curve and Fig. 9 the response of the system to a multi-level input. This result indicates that the NN has learned the system dynamics.

4.2.1 Case Studies

As a test for adaptability the coefficients of MOI and friction were independently varied by plus and minus 20%.

4.2.2 20% Friction Coefficient Increase

The first case to be examined is a friction increase of 20%. The new transfer function of the plant becomes

Gf(s) = 25 / (s^2 + 7.2s + 25).     (4.5)

Fig. 5 - Response of model (---) and uncompensated plant (---) to a unit step input.

Fig. 6 - Output of model (***) and compensated plant (---) to unit step input.

Fig. 7 - Difference between the model and the plant outputs (model - plant) as a function of time.

Fig. 8 - Initial learning curve, sum of the square error versus iteration.

Fig. 9 - Response of the trained system to a multi-level step input.

Table 4.2 gives the specifications of the retraining (Appendix B). Briefly, the sum of the square error had returned from 0.506 to 0.141 in 124 iterations. Fig. 10 shows the learning curve of the retraining, Fig. 11 shows the response after retraining, and Fig. 12 gives the difference between the model and the retrained system as a function of time.

4.2.3 20% Moment of Inertia Coefficient Decrease

The second case is a decrease in the MOI by 20%. The new transfer function for this case is

Gmoi(s) = 31.25 / (s^2 + 7.5s + 31.25).     (4.6)

Table 4.3 gives the retraining specifications (Appendix B). The sum of the square error returns to 0.141 from 0.161 in 64 iterations. Fig. 13 shows the retraining curve, Fig. 14 compares the output of the retrained system with the model, and Fig. 15 shows the difference between the model and the retrained system as a function of time.

4.2.4 Conclusions

The results of retraining the plant with an increase in MOI or a decrease in friction are not included. The

Fig. 10 - Retraining curve for a 20% increase in friction coefficient.

Fig. 11 - Response of model (***) and output of the retrained plant (---) after adaptation to 20% increase in friction coefficient.

Fig. 12 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% increase in the coefficient of friction as a function of time.

Fig. 13 - Retraining curve for a 20% decrease in the moment of inertia coefficient.

Fig. 14 - Response of the model (***) and the output of the retrained plant (---) after adaptation to 20% decrease in the moment of inertia coefficient.

Fig. 15 - Difference between the model and plant outputs (model - plant) after retraining to a 20% decrease in the coefficient of moment of inertia as a function of time.

reason for this exclusion is that for these two cases the NN would not retrain. This will require further investigation. In the first two cases, where retraining did occur, the amount of damping increased with the variation of parameters, while in the second two cases the damping factor decreased, resulting in more overshoot. These results can only be deemed partial successes.

The outcome of these experiments can, however, be useful. Initial training should be performed with the maximum expected MOI and the minimum expected friction before adaptation. As the plant is operated, the friction will increase as the lubricants become contaminated. The weights of the NN should be retained so that after periodic maintenance these values can be reloaded into the NN. In this way this type of control system could be used successfully. A caution here is that these conclusions may not apply to higher order plants, because the variations may cause different effects on the poles of the transfer functions. Present knowledge indicates that simulations for each individual case would be necessary in order to study the effects of these variations.

4.3 Closed Loop Control System

A closed loop system should be capable of controlling a type 0 or a type 1 system. Research on a type 1 system did not give an encouraging response; however, when the NN was applied to a type 0 plant the results were satisfactory. If the desired plant to be controlled is of type 1, then it requires a unity feedback branch in order to transform it into a type 0 system. The plant used in this study is a robotic arm with a transfer function

G(s) = F / (s(Ms^2 + Bs + K))     (4.7)

where M is the moment of inertia, B is the friction, K is the spring constant, and F is the gain. Since this is a type 1 system, it is transformed into a type 0 system by providing a unity feedback branch. The plant transfer function now becomes

G(s) = F / (Ms^3 + Bs^2 + Ks + F).     (4.8)

For the purposes of simulation the values of the parameters were chosen such that the transfer function of the plant is

G(s) = 120 / (s^3 + 16s^2 + 60s + 120)     (4.9)
     = 120 / (s(s+6)(s+10) + 120).     (4.10)

Once again this plant will exhibit overshoot to a unit step input. The model was chosen to be

Gm(s) = 180 / (s^4 + 18s^3 + 119s^2 + 342s + 360)     (4.11)
      = 180 / ((s+3)(s+4)(s+5)(s+6)).     (4.12)

Since all of the roots are real, the model does not experience overshoot. Fig. 16 shows the comparison between this desired model output and the plant, in the configuration of Fig. 2, if the NN is replaced with a proportional compensator with a coefficient of one. Details of the NN compensator and the training data appear in Table 4.4 (Appendix B). Initial training was suspended after 650 iterations, at which point the sum of the square error was 0.505. This is illustrated in Fig. 17. Fig. 18 shows the output of the trained system and the model. This figure shows that the plant still exhibits overshoot, but it is considerably reduced from the output in Fig. 16. Also the final output of the


[Plot: OUTPUT vs. TIME in seconds]
Fig. 16 - Response of the model (***) and the uncompensated plant (---) to a unit step input.


[Plot: ERROR vs. ITERATION]
Fig. 17 - Initial learning curve, sum of the square error versus iteration.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 18 - Output of model (***) and compensated plant (---) to a unit step input.


plant is not the desired value of 0.5 as hoped for, but is now 0.48. This gives the output an error of 0.02. Fig. 19 shows the difference between the model and the plant as a function of time, and Fig. 20 shows the response of the trained system to a multi-level input, indicating that the NN has learned the dynamics of the system.

4.3.1 Case Studies

The test for adaptability consisted of raising and lowering the coefficients of the MOI, friction, and spring constants by 20% whenever the plant algorithm would remain stable.

4.3.2 20% Friction Coefficient Decrease

The transfer function for this case becomes

    Gf(s) = 120 / (s^3 + 12.8s^2 + 60s + 120).          (4.13)

Fig. 21 compares the output of the model and the output of the plant after this perturbation but before adaptation. After perturbation the sum of the square error was 0.567. Table 4.5 shows the retraining data. The NN took 119 iterations to return to a sum of the


[Plot: ERROR vs. TIME in seconds]
Fig. 19 - Difference between the model and the plant outputs (model - plant) as a function of time.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 20 - Response of the trained system to a multi-level step input.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 21 - Comparison between the outputs of the model (***) and the plant (---) after a 20% decrease in the coefficient of friction before adaptation.


square error of 0.505 (Appendix B). Fig. 22 shows the response of the plant after 150 iterations. Fig. 23 shows the difference between the model and the retrained system as a function of time. Fig. 24 shows the retraining curve. The maximum overshoot was 0.513, and the undershoot was 0.484. The steady-state value of the output was 0.485.

4.3.3 20% Spring Coefficient Increase

The modified transfer function of the plant is now

    Gk(s) = 120 / (s^3 + 16s^2 + 72s + 120).            (4.14)

The response after perturbation but before adaptation is shown in Fig. 25. The sum of the square error before adaptation was 0.743. Table 4.6 gives the retraining data (Appendix B). The NN returned to a sum of the square error of 0.505 in 97 iterations. Fig. 26 shows the retraining curve for this perturbation. Fig. 27 compares the response of the model and the retrained system, and Fig. 28 shows the difference between the model and the retrained system as a function of time after 100 iterations. The maximum overshoot was 0.512 and the undershoot was 0.487, which is also the steady-state value. For these two cases the NN displayed a greater


[Plot: OUTPUT vs. TIME in seconds]
Fig. 22 - Response of the model (***) and the retrained plant (---) after adaptation to a 20% decrease in the coefficient of friction.


[Plot: ERROR vs. TIME in seconds]
Fig. 23 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% decrease in the coefficient of friction, as a function of time.


[Plot: ERROR vs. ITERATIONS]
Fig. 24 - Retraining curve for a 20% decrease in the coefficient of friction.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 25 - Comparison between the outputs of the model (***) and the plant (---) after a 20% increase in the spring coefficient before adaptation.


[Plot: ERROR vs. ITERATIONS]
Fig. 26 - Retraining curve for a 20% increase in the spring coefficient.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 27 - Comparison between the outputs of the model (***) and the plant (---) after a 20% increase in the spring coefficient, after adaptation.


[Plot: ERROR vs. TIME in seconds]
Fig. 28 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% increase in the spring coefficient as a function of time.


willingness to readapt for these perturbations and can be considered totally successful.

4.3.4 20% Spring Coefficient Decrease

The new transfer function is

    Gk(s) = 120 / (s^3 + 16s^2 + 48s + 120).            (4.15)

Fig. 29 shows the system response after the perturbation and before adaptation. The sum of the square error was 1.01. After 600 iterations the sum of the square error could only be reduced to a level of 0.578. The retraining data is in Table 4.7 (Appendix B). Fig. 30 shows the retraining curve, and Fig. 31 compares the output of the partially retrained system to that of the model. Fig. 32 shows the difference between the model and the retrained system as a function of time. This case exhibited the largest overshoot of all of the cases, 0.524. This is still below a 5% error. The maximum undershoot and the steady-state value were both 0.479.
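Figures of this kind (overshoot peak and steady-state value of a perturbed plant) can be reproduced, to a first approximation, by stepping the closed-loop transfer function numerically. The sketch below is illustrative only: it is written in Python rather than the AT-MATLAB of Appendix A, uses a plain classical Runge-Kutta integrator in place of the Adams predictor/corrector of a4plantj.m, and takes the companion-form state-space realization of Eq. (4.15) with a 0.5-amplitude step, mirroring the df-series entries in plantst.m.

```python
import numpy as np

# Companion-form realization of G(s) = 120 / (s^3 + 16 s^2 + 48 s + 120),
# the plant of Eq. (4.15) (20% spring coefficient decrease).
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-120.0, -48.0, -16.0]])
B = np.array([0.0, 0.0, 120.0])

def step_response(A, B, u, dt=0.01, t_end=10.0):
    """Integrate w' = A w + B u with classical RK4; return the output y = w[0]."""
    w = np.zeros(3)
    f = lambda w: A @ w + B * u
    ys = []
    for _ in range(int(t_end / dt)):
        k1 = dt * f(w)
        k2 = dt * f(w + k1 / 2)
        k3 = dt * f(w + k2 / 2)
        k4 = dt * f(w + k3)
        w = w + (k1 + 2 * k2 + 2 * k3 + k4) / 6
        ys.append(w[0])
    return np.array(ys)

y = step_response(A, B, u=0.5)          # 0.5-amplitude step, as in the text
print("overshoot peak:", y.max())      # exceeds 0.5, as in Fig. 29
print("steady state  :", y[-1])        # DC gain is 1, so this settles near 0.5
```

The dominant complex pole pair of this denominator has a damping ratio near 0.5, which is why the perturbed plant overshoots before adaptation.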


[Plot: OUTPUT vs. TIME in seconds]
Fig. 29 - Comparison between the outputs of the model (***) and the plant (---) after a 20% decrease in the spring coefficient before adaptation.


[Plot: ERROR vs. ITERATIONS]
Fig. 30 - Retraining curve for a 20% decrease in the spring coefficient.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 31 - Comparison between the outputs of the model (***) and the plant (---) after a 20% decrease in the spring coefficient, after adaptation.


[Plot: ERROR vs. TIME in seconds]
Fig. 32 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% decrease in the spring coefficient as a function of time.


4.3.5 20% Moment of Inertia Coefficient Increase

The perturbed transfer function becomes

    Gm(s) = 100 / (s^3 + 13.33s^2 + 50s + 100).         (4.16)

Fig. 33 shows the response of the perturbed system before adaptation. The sum of the square error was 0.543. The retraining data is in Table 4.8 (Appendix B). Fig. 34 shows the retraining curve. The NN would only retrain to a sum of the square error of 0.542 after 250 iterations, showing a great reluctance to retrain. Fig. 35 shows the response of the barely retrained system as compared to the model. Fig. 36 shows the difference between the model and the retrained plant as a function of time. The maximum overshoot was 0.512 and the maximum undershoot was 0.482, while the steady-state value was 0.483.

4.3.6 5% Friction Coefficient Increase

The algorithm used to simulate the plant becomes unstable for higher perturbation rates. The transfer function for this variation is

    Gf(s) = 120 / (s^3 + 16.8s^2 + 60s + 120).          (4.17)
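Each perturbed transfer function above corresponds to one of the Aplt/Bplt entries in plantst.m (Appendix A). The helper below, a hypothetical Python fragment not taken from the thesis code, shows the mapping: divide the denominator M s^3 + B s^2 + K s + F through by M and place the normalized coefficients in the last row of a companion matrix.

```python
import numpy as np

def companion(M, B, K, F):
    """Companion-form (Aplt, Bplt) for the unity-feedback plant
    G(s) = F / (M s^3 + B s^2 + K s + F), normalized so the
    leading coefficient is 1 (divide through by M)."""
    a = np.array([F, K, B]) / M          # constant, s, and s^2 coefficients
    Aplt = np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0],
                     [-a[0], -a[1], -a[2]]])
    Bplt = np.array([0.0, 0.0, F / M])
    return Aplt, Bplt

# Nominal plant of Eq. (4.9): M = 1, B = 16, K = 60, F = 120.
A20moi, _ = companion(1.2, 16.0, 60.0, 120.0)  # 20% MOI increase -> Eq. (4.16)
A5fric, _ = companion(1.0, 16.8, 60.0, 120.0)  # 5% friction increase -> Eq. (4.17)
print(A20moi[2])   # last row gives -100, -50, -13.33, cf. the "df +20% moi" entry
print(A5fric[2])   # last row gives -120, -60, -16.8
```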


[Plot: OUTPUT vs. TIME in seconds]
Fig. 33 - Comparison between the outputs of the model (***) and the plant (---) after a 20% increase in the moment of inertia coefficient before adaptation.


[Plot: ERROR vs. ITERATIONS]
Fig. 34 - Retraining curve for a 20% increase in the coefficient of moment of inertia.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 35 - Comparison between the output of the model (***) and the plant (---) after a 20% increase in the coefficient of moment of inertia, after adaptation.


[Plot: ERROR vs. TIME in seconds]
Fig. 36 - Difference between the model and the retrained plant (model - plant) after retraining to a 20% increase in the coefficient of moment of inertia as a function of time.


Fig. 37 gives the response of the perturbed system versus the model before adaptation. The initial sum of the square error was 0.515. The NN retrained to a level of 0.513 in 500 iterations. Fig. 38 shows the retraining curve, and the retraining data is in Table 4.9 (Appendix B). Fig. 39 compares the output of the retrained system to that of the model. Fig. 40 shows the difference between the model and the incompletely retrained system as a function of time. The maximum overshoot was 0.511; the undershoot and the steady-state value were both 0.480. For the case of a reduction of MOI, no data was obtained because the plant algorithm became unstable for any decrease in this parameter.

4.3.7 Conclusions

The results of these experiments show that the NNMRAC system in this configuration will attempt to adapt to changing parameters, since in all cases for which the plant algorithm remained stable the sum of the square error could be reduced, although in some of the cases this reduction did not amount to a significant value. While this configuration cannot handle a type 1 system, it provides a more robust control system than the first configuration.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 37 - Comparison between the outputs of the model (***) and the plant (---) after a 5% increase in the coefficient of friction before adaptation.


[Plot: ERROR vs. ITERATIONS]
Fig. 38 - Retraining curve for a 5% increase in the coefficient of friction.


[Plot: OUTPUT vs. TIME in seconds]
Fig. 39 - Comparison between the output of the model (***) and the plant (---) after a 5% increase in the coefficient of friction, after adaptation.


[Plot: ERROR vs. TIME in seconds]
Fig. 40 - Difference between the model and the retrained plant (model - plant) after retraining to a 5% increase in the coefficient of friction as a function of time.


CHAPTER 5
CONCLUSIONS

The tests performed and reported on in Chapter 4 show that the NN can function at least partially well as a controller in a MRAC system. The second method of application that was examined does seem to hold the most promise and is the more commonly seen form of compensation in control systems. While the results were not perfect, i.e., training did not result in perfect model following and retraining did not always reduce the error to the desired level, the NN did train to an acceptable level of operation. The errors in maximum overshoot, undershoot, and steady-state value were all under 5%. This is quite acceptable for most engineering applications. The results do indicate that further research in training algorithms to increase the rate of training would be well worth the effort and funding. This approach to MRAC systems provides an automatic form of adaptation that would be quite useful in slowly varying systems. NNs may also hold the possibility of being extended to nonlinear systems. There is a


definite need to study the effects of changing the sigmoidal function parameters, K and a, and how they might best be chosen to avoid lapses in training. Finally, in retrospect, the Runge-Kutta method for the plant simulation would not be used. Instead, a discrete-time algorithm such as the bilinear transformation (if possible with frequency prewarping), as described in reference [4], would be used.
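The suggested replacement can be sketched for the simplest case. The Python fragment below (illustrative only, not part of the thesis software) applies the Tustin substitution s = c(z - 1)/(z + 1), with c = 2/T or the prewarped value w0/tan(w0 T/2), to the first-order lag G(s) = a/(s + a):

```python
import math

def tustin_first_order(a, T, prewarp_at=None):
    """Discretize G(s) = a / (s + a) with the bilinear transform.
    Returns (b0, b1, a1) for the difference equation
        y[n] = b0*u[n] + b1*u[n-1] - a1*y[n-1].
    If prewarp_at (rad/s) is given, the substitution constant is chosen
    so the discrete response is exact at that frequency."""
    if prewarp_at is None:
        c = 2.0 / T
    else:
        c = prewarp_at / math.tan(prewarp_at * T / 2.0)
    den = c + a
    b0 = a / den
    b1 = a / den
    a1 = (a - c) / den
    return b0, b1, a1

# DC gain check: at z = 1 the filter gain is (b0 + b1) / (1 + a1) = 1,
# matching G(0) = 1 for the continuous-time lag.
b0, b1, a1 = tustin_first_order(a=4.0, T=0.1)
print((b0 + b1) / (1 + a1))
```

Unlike the multistep integrator of a4plantj.m, such a recursion has no start-up transient and cannot drift between training passes, which is the attraction noted above.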


APPENDIX A
COMPUTER PROGRAMS

The programs included in this Appendix were used to investigate the use of neural networks in MRAC systems. The programs were written in AT-MATLAB. Programs t2adnn.m and t2adnnlm.m train a NN with one hidden layer. Program t2adnnx.m tests the training. Programs t3adnn4.m and t3adn4lm.m are used to train a NN with two hidden layers. Program t3adnn4x.m tests the training. All other programs support the training and testing programs.


% t2adnn.m
%   trains a two layer neural network for adaptive control
%
% testh03.m is a program that tests the dynamical neural
%   network with HYSTERESIS AND LINEAR OUTPUT NEURON
%
%   CUMULATIVE LEARNING
format compact;
invectr
N0=input('number of neurons in the DELAY layer = ');
np=N0+1;
N1=input('number of neurons in the FIRST hidden layer = ');
ddinhl
iyninh2
kct2
alfact2
nu=1;
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: nu factor (usually 1 or larger) = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
  iyninh2           % reset c to zeros
  On=zeros(1,5);    % output of N.N. for R-K, A-B, A-M
  ddaccin
  ddoicc
  for ii=1:np,
    adcoleh
    ddfwlhn
    adpercen
    nu=nu^(1/factor);
    ddwcl
    errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
  end
  ddwc2
  kk=rem(jj,4)+221;
  if kk==221, subplot, end
  subplot(kk),plot(errloc),title('ERROR');
end
save neu01ou W01 W12 alfa1 alfa2 h1 h2 K1 K2


% invectr.m   invector train
%   used in training sequence
invect=input('ENTER: amplitude of training step input = ');
tf=input('ENTER: time at end = ');

% ddinhl.m   used for training; sets up neural network
%
% iyninhl.m is a program that initializes all the possible
%   variables and structure of the neural network.
%   It uses the values
%     N0 = # of inputs (delayed) -- GIVEN BY THE INPUT
%     N1 = # of neurons in the first hidden layer
%     N2 = # of neurons in the output layer -- GIVEN BY THE OUTPUT
%   It uses the information of N0 (included in It), the
%   number of input signals.
N2=1;
rand('uniform');
net1=zeros(N1,1);net2=zeros(N2,1);
net1h=zeros(N1,1);net2h=zeros(N2,1);
o1=zeros(N1,1);o2=zeros(N2,1);
o1h=zeros(N1,1);o2h=zeros(N2,1);
del1=zeros(N1,1);del2=zeros(N2,1);
der1=zeros(N1,1);der2=zeros(N2,1);
alfa1=ones(N1,1);alfa2=ones(N2,1);
h1=zeros(N1,1);h2=zeros(N2,1);
K1=ones(N1,1);K2=ones(N2,1);
W01=-ones(N0,N1)+2*rand(N0,N1);
W12=-ones(N1,N2)+2*rand(N1,N2);
W01lipp=zeros(N0,N1);
W12lipp=zeros(N1,N2);


% kct2.m is a program that initializes all the values
%   of K1,K2 with the same value.
%   OBS: the value of K (a constant) is asked here
K=input('saturation level of neurons = ');
K1=K1*K;
K2=K2*K;
% here we need to renormalize the weights to avoid
% saturation of neurons at the beginning of the
% learning process
W01=W01/(K*N0);W12=W12/(alfa2*N1);

% alfact2.m is a program that initializes all the
%   values of alfa1,alfa2 with the same value.
%   OBS: the value of alfa (a constant) is asked here
alfa=input('value of the ct. slope = ');
alfa1=ones(N1,1)*alfa;
alfa2=ones(N2,1)*alfa;

% ddaccin.m is a program that initializes all the
%   possible variables that will work as accumulators
%   for cumulative learning
alfa1i=zeros(N1,1);alfa2i=zeros(N2,1);
h1i=zeros(N1,1);h2i=zeros(N2,1);
K1i=zeros(N1,1);K2i=zeros(N2,1);
W01i=zeros(N0,N1);W12i=zeros(N1,N2);
ai=zeros(N0,1);gi=zeros(N0,1);

% ddoicc.m program for initializing the initial
%   conditions for the outputs of every neuron
%   ( = zeros) for every run of the complete set of points
o1=zeros(N1,1);o2=zeros(N2,1);


% plantst.m sets up R-K for a4plantj.m
%   (Adams fourth-order predictor/corrector)
%Aplt=[0 1;-8 -6];Bplt=[0;8];w0=[0;0];
%Aplt=[0 1;-16 -10];Bplt=[0;16];w0=[0;0];
%Aplt=[0 1 0;0 0 1;-42.72 -58.82 -14.8];
%Bplt=[0;0;42.72];w0=[0;0;0];
%Aplt=[0 1;0 -4];Bplt=[0;4];w0=[0;0];       % 4/(s(s+4))
%Aplt=[0 1;0 -3];Bplt=[0;5];w0=[0;0];       % 5/(s(s+3))
%Aplt=[0];Bplt=[4];w0=[0];                  % 4/s  en1x series
%Aplt=[0 1;-29 -4];Bplt=[0;29];w0=[0;0];    % fn1x series
%Aplt=[0 1;-29 -4.8];Bplt=[0;29];w0=[0;0];  % fn1x w/ 20% friction increase
%Aplt=[0 1;-8 -4];Bplt=[0;8];w0=[0;0];      % fn2x series
%Aplt=[0];Bplt=[2];w0=[0];                  % 2/s  hn1x series
%Aplt=[0 1 0;0 0 1;0 -40 -14];Bplt=[0;0;75];w0=[0;0;0];  % kn1x series
%Aplt=[0 1;-25 -6];Bplt=[0;25];w0=[0;0];    % double fb 1x series
%Aplt=[0 1 0;0 0 1;-120 -60 -16];Bplt=[0;0;120];
%  w0=[0;0;0];                              % df series
%Aplt=[0 1 0;0 0 1;-120 -60 -12.8];Bplt=[0;0;120];
%  w0=[0;0;0];                              % df -20% fric
%Aplt=[0 1 0;0 0 1;-100 -50 -13.33];Bplt=[0;0;100];
%  w0=[0;0;0];                              % df +20% moi
Aplt=[0 1 0;0 0 1;-120 -48 -16];Bplt=[0;0;120];
w0=[0;0;0];                                 % df -20% k
t0 = 0; h = T; n = length(w0); t = t0;
w = zeros(n,1);wp0 = w0;wp1=w;wp2=w;wp3=w;
points = zeros(n,np);
Op=zeros(1,np);
k1=w;k2=w;k3=w;k4=w;
w=w0;
points(:,1)=w;

% adcoleh.m for total system test
%
% inycoexh.m is a program to produce the feedforward
%c=Ie(:,ii);
c(2:N0,1)=c(1:N0-1,1);
if (ii==1),
  c(1,1)=0;
else
  c(1,1)=invect(1,1);
end


% ddfwlhn.m used for feedback looping
%
% iynfwllh.m is a program that produces all the forward
%   computation in the feedforward path for purposes of
%   learning
%   ASSUMES FEEDFORWARD COMPENSATION (output of comp. = c)
%   This subroutine calculates the status of all
%   variables at time ii
%   EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%   (except for the neuron in the output layer, which is
%   LINEAR)
%   introduces the variables net*h = net* - o*./K*
%   THE OUTPUT LAYER ONLY HAS ONE NEURON
%
% FORWARD COMPUTATION
%net1=W01'*It(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;
o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
der1=(-o1.*o1+K1.*K1)./(2*K1);
net2=W12'*o1;
% LINEAR FUNCTION IN THE OUTPUT NEURON
o2=alfa2.*(net2+h2);
der2=ones(N2,1);
%
% BACK COMPUTATION
shfton2
a4plantj
del2=(Ot(:,ii)-Op(:,ii)).*der2;
del1=(W12*(del2.*alfa2)).*der1;
del0=W01*(del1.*alfa1);
% This program provides all the parameters for applying
% the Dynamical Extended Delta Rule

% adpercen.m This calculates the error between the
%   plant output Op and the desired model's output Om
nu=min((sum(abs(Ot(:,ii)-Op(:,ii)).^2))/ ...
       (sum(abs(Ot(:,ii)).^2)+eps),1);
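The forward pass of ddfwlhn.m can be summarized compactly. The sketch below is a Python restatement (hypothetical function packaging, not part of the thesis code) of one hidden-layer evaluation: the previous output is fed back through net*h = net* - o*./K*, the activation is the K-saturating sigmoid, and the derivative factor stored for the delta rule is (K^2 - o^2)/(2K), with the slope alfa applied separately in the back computation.

```python
import numpy as np

def hysteresis_layer(net, o_prev, K, alfa, h):
    """One forward evaluation of the hysteresis sigmoid of ddfwlhn.m:
    the previous output o_prev is fed back into the activation argument,
    and the derivative factor kept for the delta rule is (K^2 - o^2)/(2K)."""
    neth = net - o_prev / K                     # net*h = net* - o*./K*
    e = np.exp(-alfa * (neth + h))
    o = K * (1.0 - e) / (1.0 + e)               # tanh-like curve saturating at +/-K
    der = (K * K - o * o) / (2.0 * K)           # slope alfa applied separately
    return o, der

# Values K = 1.8, alfa = 0.8 follow Table 4.4 (closed-loop initial training).
K = np.full(3, 1.8); alfa = np.full(3, 0.8); h = np.zeros(3)
o, der = hysteresis_layer(np.zeros(3), np.zeros(3), K, alfa, h)
print(o)     # zero net input with no stored output gives zero output
print(der)   # the derivative factor at the origin is K/2
```

The hysteresis term is what makes the network "dynamical": the same net input produces different outputs depending on the stored previous output.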


% ddwcl.m is the program that calculates increments
%   for W and stores them in accumulators
W12i=W12i+nu*o1*(del2.*alfa2)';
W01i=W01i+nu*c*(del1.*alfa1)';
% NOTE: the input has been changed to (c) to
%   include the feedback

% ddwc2.m is the program that calculates increments
%   for W in cumulative fashion from the accumulators
W12=W12+W12i/np+lipp*(W12-W12lipp);
W01=W01+W01i/np+lipp*(W01-W01lipp);
W12lipp=W12;
W01lipp=W01;

% t2adnnlm.m trains a two layer neural network for
%   adaptive control -- Learn More
% testh03.m is a program that tests the dynamical neural
%   network with HYSTERESIS AND LINEAR OUTPUT NEURON
%   CUMULATIVE LEARNING
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: value of nu reciprocal exponent = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
  iyninh2
  On=zeros(1,5);    % output of N.N. to plant for R-K
  plantst
  ddaccin
  ddoicc
  for ii=1:np,
    adcoleh
    ddfwlhn
    adpercen
    nu=nu^(1/factor);
    ddwcl
    errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
  end


  ddwc2
  kk=rem(jj,4)+221;
  if kk==221, subplot, end
  subplot(kk),plot(errloc),title('ERROR');
end
save neu01ou W01 W12 alfa1 alfa2 h1 h2 K1 K2

% t2adnnx.m
%   This runs the plant with the N.N. (one hidden layer)
%   compensator.
%   Used for testing, especially for varying input data,
%   such as when the step has negative and positive values
ddinxhl
invector
iyninxh2
plantst
fbv=zeros(N0,1);
On=zeros(1,5);
stepcnt=1;
incnt=1;
for ii=1:np,
  adcoexh
  ddfwxlh
  shfton2
  a4plantj
  stepcnt=stepcnt+1;
  if (stepcnt>npoints),
    stepcnt=1;
    incnt=incnt+1;
  end
end
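The cumulative update performed once per pass by ddwc2.m (and by iynwc2.m for the three-layer case) combines the accumulated increments with a momentum term weighted by the "lippman constant". A minimal Python restatement, assuming the same variable roles as the listings:

```python
import numpy as np

def cumulative_update(W, Wi, W_prev, np_points, lipp):
    """Weight update of ddwc2.m: the increments Wi accumulated over one
    pass are applied once, scaled by the number of training points, plus
    a momentum term lipp*(W - W_prev).  Returns the new weights and the
    value to carry forward as W_prev (the *lipp variable)."""
    W_new = W + Wi / np_points + lipp * (W - W_prev)
    return W_new, W_new.copy()

W = np.ones((2, 2)); Wi = np.full((2, 2), 0.4); W_prev = np.zeros((2, 2))
W, W_prev = cumulative_update(W, Wi, W_prev, np_points=4, lipp=0.1)
print(W)   # each entry is 1 + 0.4/4 + 0.1*(1 - 0) = 1.2
```

Applying the accumulated gradient once per epoch, rather than per point, is what the comments call "cumulative learning"; the momentum term smooths the update between epochs.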


% ddinxhl.m initializes variables for execution of the
%   trained N.N. that has one hidden layer
N1=length(alfa1);
N2=length(alfa2);
nsteps=input('ENTER: # of steps = ');
npoints=input('ENTER: # of points per step = ');
np=nsteps*npoints;
Oe=zeros(np,1);
net1=zeros(N1,1);
net1h=zeros(N1,1);
o1=zeros(N1,1);
net2=zeros(N2,1);
net2h=zeros(N2,1);
o2=zeros(N2,1);

% invector.m creates a stream of input vectors for use
%   in a total test of the trained, neurally
%   compensated system
invect=zeros(nsteps,1);
for dli=1:nsteps,
  invect(dli,1)=input('ENTER: amplitude = ');
end

% iyninxh2.m is a program to initialize the
%   compensators for normal operation
c=zeros(N0,1);

% adcoexh.m for total system test
% inycoexh.m is a program to produce the feedforward
%c=Ie(:,ii);
c(2:N0,1)=c(1:N0-1,1);
c(1,1)=invect(incnt,1);

% shfton2.m  SHiFT On
%   this shifts the output of the neural network in
%   array On so that it can be used for the R-K, A-B, A-M
On(1,2:5)=On(1,1:4);
On(1,1)=o2;


% shfton.m  SHiFT On
%   this shifts the output of the neural network in
%   array On so that it can be used for the R-K, A-B, A-M
On(1,2:5)=On(1,1:4);
On(1,1)=o3;

% ddfwxlh.m used with one hidden layer N.N.
%
% iynfwxlh.m is a program that executes the (ii-th)
%   normal operation of the neural network section
%   INPUT  = Ie -- N0 x 1   ASSUMING FEEDFORWARD COMPENSATION
%   OUTPUT = Oe -- np x 1
%   FORWARD COMPUTATION
%   EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%   (except for the neuron in the output layer, which is LINEAR)
%   introduces the variables net*h = net* - o*./K*
%   THE OUTPUT LAYER ONLY HAS ONE NEURON
%net1=W01'*Ie(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;
o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
net2=W12'*o1;
% LINEAR FUNCTION OF THE OUTPUT NEURON
o2=alfa2.*(net2+h2);
Oe(ii)=o2(1);

% t3adnn4.m trains a three layer neural network for
%   adaptive control
% testh03.m is a program that tests the dynamical neural
%   network with HYSTERESIS AND LINEAR OUTPUT NEURON
%   CUMULATIVE LEARNING
format compact;
invectr
T=input('enter: T = ');


N0=input('number of neurons in the DELAY layer = ');
np=(tf+T)/T;   % N0+1;
N1=input('number of neurons in the FIRST hidden layer = ');
N2=input('number of neurons in the SECOND hidden layer = ');
adinhl
iyninh2
kct
alfact
nu=1;
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: nu factor (usually 1 or larger) = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
  fbv=0;
  On=zeros(1,5);    % output of N.N. for R-K, A-B, A-M
  plantst
  iynaccin
  neuroicc
  iyninh2
  for ii=1:np,
    adcoleh
    makeinpx
    adfwlhn
    adpercen
    nu=nu^(1/factor);
    iynwcl
    errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
    shftfbv4
  end
  iynwc2
  subplot
  subplot(211),plot(errloc),title('ERROR t3adnn4.m')
  subplot(212),plot(Op),title('PLANT OUTPUT')
end
save neu01ou W01 W12 W23 alfa1 alfa2 alfa3 h1 h2 h3 ...
     K1 K2 K3


% adinhl.m used for training; sets up neural network
%
% iyninhl.m is a program that initializes all the possible
%   variables and structure of the neural network.
%   It uses the values
%     N0 = # of inputs (delayed) -- GIVEN BY THE INPUT
%     N1 = # of neurons in the first hidden layer
%     N2 = # of neurons in the second hidden layer
%     N3 = # of neurons in the output layer -- GIVEN BY THE OUTPUT
%   It uses the information of N0 (included in It), the
%   number of input signals.
N3=1;
rand('uniform');
net1=zeros(N1,1);net2=zeros(N2,1);net3=zeros(N3,1);
net1h=zeros(N1,1);net2h=zeros(N2,1);net3h=zeros(N3,1);
o1=zeros(N1,1);o2=zeros(N2,1);o3=zeros(N3,1);
o1h=zeros(N1,1);o2h=zeros(N2,1);o3h=zeros(N3,1);
del1=zeros(N1,1);del2=zeros(N2,1);del3=zeros(N3,1);
der1=zeros(N1,1);der2=zeros(N2,1);der3=zeros(N3,1);
alfa1=ones(N1,1);alfa2=ones(N2,1);alfa3=ones(N3,1);
h1=zeros(N1,1);h2=zeros(N2,1);h3=zeros(N3,1);
K1=ones(N1,1);K2=ones(N2,1);K3=ones(N3,1);
W01=-ones(N0,N1)+2*rand(N0,N1);
W12=-ones(N1,N2)+2*rand(N1,N2);
W23=-ones(N2,N3)+2*rand(N2,N3);
W01lipp=zeros(N0,N1);
W12lipp=zeros(N1,N2);
W23lipp=zeros(N2,N3);

% kct.m is a program that initializes all the values
%   of K1,K2,K3 with the same value.
%   OBS: the value of K (a constant) is asked here
K=input('saturation level of neurons = ');
K1=K1*K;
K2=K2*K;
K3=K3*K;
% here we need to renormalize the weights to avoid
% saturation of neurons at the beginning of the
% learning process
W01=W01/(K*N0);W12=W12/(K*N1);W23=W23/(alfa3*N2);


% alfact.m is a program that initializes all the
%   values of alfa1,alfa2,alfa3 with the same value.
%   OBS: the value of alfa (a constant) is asked here
alfa=input('value of the ct. slope = ');
alfa1=ones(N1,1)*alfa;
alfa2=ones(N2,1)*alfa;
alfa3=ones(N3,1)*alfa;

% iynaccin.m is a program that initializes all the
%   possible variables that will work as accumulators
%   for cumulative learning
alfa1i=zeros(N1,1);alfa2i=zeros(N2,1);alfa3i=zeros(N3,1);
h1i=zeros(N1,1);h2i=zeros(N2,1);h3i=zeros(N3,1);
K1i=zeros(N1,1);K2i=zeros(N2,1);K3i=zeros(N3,1);
W01i=zeros(N0,N1);W12i=zeros(N1,N2);W23i=zeros(N2,N3);
ai=zeros(N0,1);gi=zeros(N0,1);

% neuroicc.m program for initializing the initial
%   conditions for the outputs of every neuron
%   ( = zeros) for every run of the complete set of points
o1=zeros(N1,1);o2=zeros(N2,1);o3=zeros(N3,1);

% makeinpx.m this makes the input to the Neural Network
%   from the reference input and the plant's output,
%   using negative feedback
c(1,1)=c(1,1)-fbv(1,1);

% iynwcl.m is the program that calculates increments
%   for W and stores them in accumulators
W23i=W23i+nu*o2*(del3.*alfa3)';
W12i=W12i+nu*o1*(del2.*alfa2)';
W01i=W01i+nu*c*(del1.*alfa1)';
% NOTE: the input has been changed to (c) to
%   include the feedback


% iynfwlhn.m used for feedback looping
%
% iynfwllh.m is a program that produces all the forward
%   computation in the feedforward path for purposes of
%   learning
%   ASSUMES FEEDFORWARD COMPENSATION (output of comp. = c)
%   This subroutine calculates the status of all
%   variables at time ii
%   EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%   (except for the neuron in the output layer, which is LINEAR)
%   introduces the variables net*h = net* - o*./K*
%   THE OUTPUT LAYER ONLY HAS ONE NEURON
%
% FORWARD COMPUTATION
%net1=W01'*It(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;
o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
der1=(-o1.*o1+K1.*K1)./(2*K1);
net2=W12'*o1;
net2h=net2-o2./K2;
o2=(-exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2./(exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2.*K2;
der2=(-o2.*o2+K2.*K2)./(2*K2);
net3=W23'*o2;
% LINEAR FUNCTION IN THE OUTPUT NEURON
o3=alfa3.*(net3+h3);
der3=ones(N3,1);
%
% BACK COMPUTATION
shfton
a4plantj
del3=(Ot(:,ii)-Op(:,ii)).*der3;
%del3=(Ot(:,ii)-o3).*der3;
del2=(W23*(del3.*alfa3)).*der2;
del1=(W12*(del2.*alfa2)).*der1;
del0=W01*(del1.*alfa1);
% This program provides all the parameters for applying
% the Dynamical Extended Delta Rule


% shftfbv.m This performs the update of the Feed-Back
%   Vector
fbv(2:sizei(1),1)=fbv(1:sizei(1)-1,1);
fbv(1,1)=Op(1,ii);

% iynwc2.m is the program that calculates increments
%   for W in cumulative fashion from the accumulators
W23=W23+W23i/np+lipp*(W23-W23lipp);
W12=W12+W12i/np+lipp*(W12-W12lipp);
W01=W01+W01i/np+lipp*(W01-W01lipp);
W23lipp=W23;
W12lipp=W12;
W01lipp=W01;

% t3dnn4x.m For use when the neural network is in the
%   feedback loop.  This runs the plant with the N.N.
%   compensator.  Used for testing, especially for
%   varying input data, such as when the step has
%   negative and positive values.
adinxhl
invector
iyninxh2
plantst
sizei=N0;
fbv=zeros(N0,1);
On=zeros(1,5);
stepcnt=1;
incnt=1;
for ii=1:np,
  adcoexh
  makeinpx
  adfwxlh
  shfton
  a4plantj
  shftfbv
  stepcnt=stepcnt+1;
  if (stepcnt>npoints),
    stepcnt=1;
    incnt=incnt+1;
  end
end


% t3adn4lm.m trains a three layer neural network for
%   adaptive control -- Learn More
% testh03.m is a program that tests the dynamical neural
%   network with HYSTERESIS AND LINEAR OUTPUT NEURON
%   CUMULATIVE LEARNING
format compact;
numero=input('number of iterations = ');
lipp=input('ENTER: value of lippman constant = ');
factor=input('ENTER: value of nu reciprocal exponent = ');
errloc=zeros(numero,1);
subplot
for jj=1:numero,
  fbv=0;
  On=zeros(1,5);    % output of N.N. to plant for R-K
  plantst
  iynaccin
  neuroicc
  iyninh2
  for ii=1:np,
    adcoleh
    makeinpx
    adfwlhn
    adpercen
    nu=nu^(1/factor);
    iynwcl
    errloc(jj)=abs(Ot(1,ii)-Op(1,ii))+errloc(jj);
    shftfbv4
  end
  iynwc2
  subplot
  subplot(211),plot(errloc),title('ERROR t3adn4lm.m')
  subplot(212),plot(Op),hold on
  plot(Ot,':'),title('MODEL & PLANT OUTPUT'),hold off
end
save neu01ou W01 W12 W23 alfa1 alfa2 alfa3 h1 h2 h3 ...
     K1 K2 K3


% adinxh1.m for total test of system
% iyninxh1.m is a program that initializes all the
%   variables for the neural network to work properly
%   in the feedforward mode.
%load neu01ou;
N1=length(alfa1);
N2=length(alfa2);
N3=length(alfa3);
nsteps=input('ENTER: # of steps = ');
npoints=input('ENTER: # of points = ');
np=nsteps*npoints;
Oe=zeros(np,1);
net1=zeros(N1,1);
net1h=zeros(N1,1);
o1=zeros(N1,1);
net2=zeros(N2,1);
net2h=zeros(N2,1);
o2=zeros(N2,1);
net3=zeros(N3,1);
net3h=zeros(N3,1);
o3=zeros(N3,1);

% iyninxh2.m is a program to initialize the
%   compensators for normal operation
c=zeros(N0,1);

% adfwxlh.m is a program that executes the (ii-th)
%   normal operation of the neural network section
%   INPUT  = Ie -- N0 x 1   ASSUMING FEEDFORWARD COMPENSATION
%   OUTPUT = Oe -- np x 1
%   FORWARD COMPUTATION
%   EVERY NEURON IS IMPLEMENTED WITH HYSTERESIS
%   (except for the neuron in the output layer, which is LINEAR)
%   introduces the variables net*h = net* - o*./K*
%   THE OUTPUT LAYER ONLY HAS ONE NEURON
%net1=W01'*Ie(:,ii);
net1=W01'*c;
net1h=net1-o1./K1;


o1=(-exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1./(exp(-alfa1.*(net1h+h1))+ones(length(alfa1),1));
o1=o1.*K1;
net2=W12'*o1;
net2h=net2-o2./K2;
o2=(-exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2./(exp(-alfa2.*(net2h+h2))+ones(length(alfa2),1));
o2=o2.*K2;
net3=W23'*o2;
% LINEAR FUNCTION OF THE OUTPUT NEURON
o3=alfa3.*(net3+h3);

% a4plantj.m Adams fourth-order predictor/corrector,
%   used with neural networks.  A program to solve
%   continuous-time equations using a Runge-Kutta initial
%   value estimator, an Adams-Bashforth four-step
%   predictor, and an Adams-Moulton three-step corrector.
%   See plantst.m
%
% diffeqpi.m forms the product A*w + B*u, where w is the
%   present value of the system and u is the unit step
%   function.  w0 = initial conditions of the system.
%   t0 = starting time of run.  tf = finishing time of
%   run.  h = (tf-t0+T)/T
% The entire output of the run is in matrix points, if
% this is desired.
if (ii>=1)&(ii<=3),
  U=o3;   % Ot(ii);
  k1=h*diffeqpi(w,Aplt,Bplt,On(1,2));
  k2=h*diffeqpi(w+k1/2,Aplt,Bplt,On(1,2));
  k3=h*diffeqpi(w+k2/2,Aplt,Bplt,On(1,2));
  k4=h*diffeqpi(w+k3,Aplt,Bplt,On(1,2));
  w=points(:,ii)+(k1+2*k2+2*k3+k4)/6;
  points(:,ii+1)=w;
  wp0=points(:,1);
  wp1=points(:,2);
  wp2=points(:,3);
  wp3=points(:,4);
end
if (ii>=4),
  % Adams-Bashforth 4 step predictor
  w=wp3+h*(55*diffeqpi(wp3,Aplt,Bplt,On(1,1)) ...
           -59*diffeqpi(wp2,Aplt,Bplt,On(1,2)) ...
           +37*diffeqpi(wp1,Aplt,Bplt,On(1,3)) ...
           -9*diffeqpi(wp0,Aplt,Bplt,On(1,4)))/24;


  % Adams-Moulton 3 step corrector
  w=wp3+h*(9*diffeqpi(w,Aplt,Bplt,On(1,1)) ...
           +19*diffeqpi(wp3,Aplt,Bplt,On(1,1)) ...
           -5*diffeqpi(wp2,Aplt,Bplt,On(1,2)) ...
           +diffeqpi(wp1,Aplt,Bplt,On(1,3)))/24;
  points(:,ii+1)=w;
  wp0 = wp1; wp1 = wp2; wp2 = wp3; wp3 = w;
end
Op(1,ii)=points(1,ii+1);

% diffeqpi.m This is function diffeqpi.m, which is
%   used in a4plantj.m.
%   Input: t = time of evaluation.  If the values do not
%     change with time, the value of t is unimportant.
%   w = vector value under evaluation.
%   matrices A & B
%   u is shown here as a unit step input; this could be
%   changed.
function f = diffeqpi(w,A,B,u)
f = A*w + B*u;
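The integration scheme of a4plantj.m, in its textbook form (the listing above interleaves the stored NN outputs On in place of a fixed input), is: three classical Runge-Kutta steps to generate starting values, then the Adams-Bashforth four-step predictor followed by one Adams-Moulton correction. A self-contained Python sketch, checked against dw/dt = -w:

```python
import math

def abm4(f, w0, h, nsteps):
    """Adams-Bashforth 4-step predictor / Adams-Moulton corrector in the
    style of a4plantj.m: classical RK4 supplies the first three steps,
    after which the multistep predictor-corrector pair takes over."""
    ws = [w0]
    w = w0
    for _ in range(3):                      # RK4 start-up values
        k1 = h * f(w)
        k2 = h * f(w + k1 / 2)
        k3 = h * f(w + k2 / 2)
        k4 = h * f(w + k3)
        w = w + (k1 + 2 * k2 + 2 * k3 + k4) / 6
        ws.append(w)
    for _ in range(3, nsteps):
        f3, f2, f1, f0 = f(ws[-1]), f(ws[-2]), f(ws[-3]), f(ws[-4])
        wp = ws[-1] + h * (55 * f3 - 59 * f2 + 37 * f1 - 9 * f0) / 24  # predictor
        w = ws[-1] + h * (9 * f(wp) + 19 * f3 - 5 * f2 + f1) / 24      # corrector
        ws.append(w)
    return ws

# dw/dt = -w, w(0) = 1 has the exact solution e^{-t}.
ws = abm4(lambda w: -w, 1.0, h=0.01, nsteps=100)
print(ws[-1], math.exp(-1.0))   # the two agree to several decimal places
```

Because the multistep formulas reuse four past derivative evaluations, a discontinuous input (such as the NN output changing between training points) degrades their accuracy, which is one motivation for the discrete-time alternative suggested in Chapter 5.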


APPENDIX B

The tables listing the training and retraining data are included in this Appendix. Abbreviations used in this appendix are defined on the following page.


ABBREVIATIONS USED FOR TABLES 4.1, 4.2, 4.3

A    = amplitude of the training input
tf   = time at the end of the training iteration
NO   = number of outputs in the delay layer
N1   = number of neurons in the first hidden layer
K    = saturation level of the sigmoidal curve
a    = slope of the sigmoidal curve at the origin
I    = number of iterations
L    = value of the momentum constant
f    = value of the nu reciprocal re-scaling exponent
sses = value of the sum square error at the beginning of the iterations
ssee = value of the sum square error at the end of the iterations

ABBREVIATIONS USED FOR TABLES 4.4, 4.5, 4.6, 4.7, 4.8, 4.9

A    = amplitude of the training input
tf   = time at the end of the training iteration
T    = time between successive points during the iteration
NO   = number of outputs in the delay layer
N1   = number of neurons in the first hidden layer
N2   = number of neurons in the second hidden layer
k    = saturation level of the sigmoidal curve
a    = slope of the sigmoidal curve at the origin
I    = number of iterations
L    = value of the momentum constant
f    = value of the nu reciprocal re-scaling exponent
sses = value of the sum square error at the beginning of the iterations
ssee = value of the sum square error at the end of the iterations


TABLES FOR THE OPEN LOOP CONTROL SYSTEM

TABLE 4.1  Initial Training

A = 1, tf = 4.9, N0 = 49, N1 = 49, K = 4.5, a = 0.4

  I     L     f    sses    ssee
 100   0.9    1   42.3    4.81
  50   0.9    5    4.79   0.904
  50   0.9   20    0.891  0.264
  50   0.9   40    0.261  0.141

TABLE 4.2  Retraining for 20% Friction Coefficient Increase

  I     L     f    sses    ssee
  20   0.9    1    0.507  0.516
  20   0.9    5    0.517  0.421
  40   0.9   10    0.412  0.227
  40   0.9   20    0.225  0.146
  10   0.9   40    0.145  0.132

Note: sse = 0.141 at iteration 124.

TABLE 4.3  Retraining for 20% Moment of Inertia Coefficient Decrease

  I     L     f    sses    ssee
  20   0.9    1    0.161  0.162
  20   0.9    5    0.162  0.169
  20   0.9   10    0.168  0.145
  20   0.9   20    0.144  0.123

Note: sse = 0.141 at iteration 64.


TABLES FOR THE CLOSED LOOP CONTROL SYSTEM

TABLE 4.4  Initial Training

A = 1, tf = 6.9, T = 0.1, N0 = 49, N1 = 49, N2 = 49, k = 1.8, a = 0.8

  I     L     f    sses    ssee
  50   0.1    1   29.5    4.71
 200   0.1    1    4.55   1.85
  50   0.1    3    1.85   1.15
  50   0.1    5    1.14   0.762
  50   0.1    5    0.758  0.643
  25   0.1    7    0.642  0.611
  75   0.1    7    0.610  0.578
  50   0.1    9    0.577  0.554
  50   0.1   13    0.554  0.526
  50   0.1   21    0.526  0.505

TABLE 4.5  Retraining for 20% Friction Coefficient Increase

  I     L     f    sses    ssee
  50   0.1    1    0.567  0.561
  50   0.1    5    0.561  0.527
  50   0.1   10    0.524  0.478

Note: sse = 0.505 at iteration 119.

TABLE 4.6  Retraining for 20% Spring Coefficient Increase

  I     L     f    sses    ssee
  50   0.1    1    0.743  0.740
  50   0.1    5    0.739  0.493

Note: sse = 0.505 at iteration 97.


TABLE 4.7  Retraining for 20% Spring Coefficient Decrease

  I     L     f    sses    ssee
  50   0.1    1    1.01   0.864
  50   0.1    5    0.981  0.822
  50   0.1   10    0.863  0.809
  50   0.1   20    0.821  0.811
  50   0.1   20    0.809  0.812
  50   0.1   40    0.811  0.804
  50   0.1   40    0.812  0.782
  50   0.1   40    0.803  0.981
  20   0.1   80    0.781  0.765
  30   0.1   80    0.764  0.734
  50   0.1  160    0.737  0.674
  50   0.1  160    0.673  0.604
  50   0.1  160    0.602  0.578

TABLE 4.8  Retraining for 20% Moment of Inertia Coefficient Increase

  I     L     f    sses    ssee
  50   0.1    1    0.543  0.542
  50   0.1    1    0.542  0.542
  50   0.1    1    0.542  0.542
  50   0.1    1    0.542  0.542
  50   0.1    1    0.542  0.542

TABLE 4.9  Retraining for 5% Friction Coefficient Increase

  I     L     f    sses    ssee
  50   0.1    1    0.515  0.515
 100   0.1    1    0.515  0.514
 100   0.1    1    0.514  0.514
 200   0.1    1    0.514  0.513
  50   0.1    1    0.513  0.513


BIBLIOGRAPHY

1. Proano, J.C., "Neurodynamic Adaptive Control Systems", Ph.D. Thesis, University of Colorado, 1989.

2. Lippmann, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, pp. 4-22, April 1987.

3. Burden, R.L. and J.D. Faires, "Numerical Analysis", Fourth Edition, PWS-Kent Publishing Company, Boston, MA, pp. 262-266, 1989.

4. Ogata, K., "Discrete Time Control Systems", Prentice Hall, Englewood Cliffs, NJ, pp. 308-330, 1987.

5. Phillips, C.L. and R.D. Harbor, "Feedback Control Systems", Prentice Hall, Englewood Cliffs, NJ, pp. 40-63, 1988.

6. Vemuri, V., "Artificial Neural Networks: Theoretical Concepts", Computer Society Press of the IEEE, Washington, D.C., 1988.

7. Proano, J.C., E.T. Wall, and J.T. Bialasiewicz, "Neurodynamic Adaptive Control Systems", submitted to Kybernetika.

8. Bialasiewicz, J.T. and J.C. Proano, "Model Reference Intelligent Control System", Kybernetika, Vol. 25, No. 2, pp. 95-103, 1989.

9. Melsa, P.J.W., "Neural Networks: A Conceptual Overview", Tellabs Research Center, Mishawaka, IN, 1989.

10. Ince, D.L., J.T. Bialasiewicz, and E.T. Wall, "Neural Network-Based Model-Reference Adaptive Control System", to be published in the Proceedings of the Workshop on Neural Networks: Academic/Industrial/NASA/Defense, Auburn University, Auburn, AL, 1990.


11. Kung, S.Y. and J.N. Hwang, "Neural Network Architectures for Robotic Applications", IEEE Transactions on Robotics and Automation, pp. 641-657, 1989.