ROBUST DETECTION OF FREQUENCY SHIFT KEYING MODULATED SIGNALS
by
Adam Dennis Hall
B.S. Physics, Brigham Young University, 2009

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Master of Science, Electrical Engineering
2013
This thesis for the Master of Science degree by Adam Dennis Hall has been approved for the Electrical Engineering Program by
Titsa Papantoni, Chair
Miloje (Mike) Radenkovic
Jan Bialasiewicz
March 14, 2013
Hall, Adam Dennis (M.S., Electrical Engineering)
Robust Detection of Frequency Shift Keying Modulated Signals
Thesis directed by Professor Titsa Papantoni

ABSTRACT

We derive and examine the Neyman-Pearson detection of an FSK modulated waveform in white Gaussian noise. We specifically examine three cases: 1) detection when the waveform is known exactly; 2) detection when the waveform is known down to an unknown initial phase; and 3) detection of the waveform when the energy level is known but the signal is not. We then derive robust detection schemes where the noise comes from a broader class of stochastic processes and includes the possible occurrence of outlier data. We compare these robust detection schemes to their non-robust equivalents under various noise models.

The form and content of this abstract are approved. I recommend its publication.
Approved: Titsa Papantoni
ACKNOWLEDGMENTS

I wish to give my sincere thanks to my advisor, Dr. Titsa Papantoni. She taught me detection and estimation theory, which forms the foundation for much of this thesis. In addition, she provided guidance at all stages of writing this thesis, from first shaping this work to publishing. And most importantly, she brought out and expanded my passion for learning. I would like to thank Dr. Radenkovic as well as Dr. Bialasiewicz for their time and effort as part of the committee charged with the approval of this thesis. Finally, I would like to thank each and every one of my professors for sharing their domain knowledge.
DEDICATION

This thesis is dedicated to my wife Rachel, for her support, patience, and sacrifice as I pursued this Master's degree, and to my son Max, whose smile and laughter brought me joy and strength as I worked on this thesis. Finally, I want to thank my employer, Northrop Grumman Information Systems, for allowing me to pursue this degree and for their financial support. After the time and effort dedicated to the completion of this degree and this thesis, I am truly a better engineer and, more importantly, I have a deeper understanding of the world around me.
TABLE OF CONTENTS

Chapter
1 Introduction ... 1
  1.1 Overview ... 1
  1.2 Notation and Symbols ... 2
2 Parametric Detection ... 4
  2.1 Introduction ... 4
  2.2 Detection when the waveform and initial phase are known ... 4
  2.3 Detection when the waveform is unknown (energy detection) ... 6
  2.4 Detection when the waveform is known down to an unknown initial phase ... 7
  2.5 Comparison of the Detection Performance ... 15
    2.5.1 Detection Performance of Neyman-Pearson-like Detectors ... 16
    2.5.2 Detection Performance of the Neyman-Pearson Detectors ... 19
3 Robust Detection ... 24
  3.1 Detection when the waveform and the initial phase are known ... 24
    3.1.1 Case where contamination occurs over data pairs ... 24
    3.1.2 Case where contamination occurs per datum ... 30
    3.1.3 Comparison of the robust cases with the optimal at the nominal model test ... 39
  3.2 Detection when the waveform is unknown (energy detection) ... 50
  3.3 Detection when the waveform is known down to an unknown initial phase ... 69
    3.3.1 Robust Estimation of the unknown initial phase ... 70
    3.3.2 Robust Detection ... 78
    3.3.3 Comparing robust detection with non-robust detection ... 83
4 Thesis Conclusions ... 88
  4.1 Summary of Findings ... 88
  4.2 Future Research ... 90
Bibliography ... 92
Appendix A: Extension of Huber Robust Detection ... 93
1 Introduction

1.1 Overview

Signal detection is a key step in many communication, RADAR, and signal geolocation systems. Optimal signal detector design depends on a priori knowledge of the signal being detected as well as on the correct model for the transmission or noise channel. For any real-world application it is extremely difficult to precisely model the noise channel. Furthermore, an optimal signal detector designed for one noise model may perform poorly when that model is inaccurate. In contrast, robust detectors provide good performance at the assumed noise channel and maintain good performance over a wide class of possible noise channels. This thesis examines non-robust and robust detection schemes for three cases of a priori signal knowledge and provides detailed analysis comparing these detection schemes.

In Chapter 2 we derive and compare optimal signal detection of a Frequency Shift Keying (FSK) modulated signal in an additive white Gaussian noise channel for three typical cases of a priori signal knowledge. The relative performance gain between the cases is found to be significantly different from approximate performance-gain figures derived using first- and second-order statistics only.

In Chapter 3 we derive robust signal detection schemes for the three cases examined in Chapter 2. For two of the three cases, the robust detection schemes derived in Chapter 3 can be shown to be optimal; they are Neyman-Pearson detection schemes designed at the least favorable density pair. For the remaining case, an ad hoc but intuitively pleasing detection scheme is designed and shown to provide good performance in the presence of low-frequency outliers. All cases are compared to the non-robust detection schemes found in Chapter 2 under various noise models.
1.2 Notation and Symbols

Throughout this thesis, all symbols used to represent variables, random variables, constants, or functions are defined when first introduced. The table below contains a brief description of terms that are common among multiple chapters and sections of this thesis. The table is not an exhaustive list, as many symbols used in a given section are unique to that section and are therefore best introduced when they are first used. Furthermore, the table does not contain a precise definition of all terms listed, as such a precise definition can only be provided in the context of the equations in which the terms are used. The table is placed here to introduce the reader to common symbols and enable the reader to understand a given chapter or section without reading all preceding text.

Table 1: Table of mathematical symbols.
Energy of the signal per sample
Probability density function assuming the hypothesis is active
Cumulative density function assuming the hypothesis is active
The class of all possible density functions active if the hypothesis is active
The hypothesis
The zeroth-order Bessel function evaluated at
The number of samples
The noise spectral density
Probability of false alarm
The power or probability of detection
The Marcum Q function of order evaluated at
The real part of a complex sample from the observation vector
The imaginary part of a complex sample from the observation vector
False alarm constraint
Threshold value used in detection tests
The decision rule for a given detection test
The frequency of outliers assuming the hypothesis is active
The initial phase offset of the signal
The mean of a random variable
The signal-to-noise ratio
The standard deviation of a random variable, generally Gaussian
The standard deviation of a random variable
(x) The probability density function of a zero-mean, unit-variance Gaussian variable evaluated at x
(x) The cumulative density function of a zero-mean, unit-variance Gaussian variable evaluated at x
2 Parametric Detection

2.1 Introduction

After the signal is brought to baseband, it is low-pass filtered and sampled. We then obtain the following two data sample vectors (Eq. 2.1 1) (Eq. 2.1 2) and are independent white Gaussian noise processes with . The expressions for depend on the low-pass filter used, as shown in [1]. To simplify our presentation, we assume the deployment of an ideal low-pass filter without distortion, such that (Eq. 2.1 3) (Eq. 2.1 4) where is a uniform random variable on the interval for scenarios where it is unknown, and a fixed constant for scenarios where it is known. is in general a random variable for the case where the waveform is unknown, and a deterministic function for the case where the waveform is known. Finally, is the energy of the signal.

2.2 Detection when the waveform and initial phase are known

In this case, is no longer random; equivalently, for some deterministic . The optimal detector is given by deciding if
(Eq. 2.2 1) Let (Eq. 2.2 2) The detection statistics are then (Eq. 2.2 3) (Eq. 2.2 4) (Eq. 2.2 5) (Eq. 2.2 6) (Eq. 2.2 7) (Eq. 2.2 8) Define and re-express the probability of correct signal detection as follows: (Eq. 2.2 9)
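Concretely, the coherent test of this section correlates the observed samples with the known signal and compares the real part of the correlation to a Gaussian threshold. The sketch below is illustrative only: the function name, the unit-amplitude tone, and the noise normalization are assumptions for the example, not notation from the text.

```python
import numpy as np
from statistics import NormalDist

def coherent_detect(r, s, noise_var, pfa):
    """Matched-filter (coherent) detection of a known complex signal s in
    complex white Gaussian noise: decide H1 when Re{sum conj(s_k) r_k}
    exceeds a threshold chosen for false-alarm probability pfa."""
    r, s = np.asarray(r, complex), np.asarray(s, complex)
    stat = np.real(np.vdot(s, r))  # correlation (matched-filter) statistic
    # Under H0 the statistic is N(0, noise_var * ||s||^2 / 2) when the
    # noise has variance noise_var per complex sample.
    sigma0 = np.sqrt(noise_var * np.sum(np.abs(s) ** 2) / 2.0)
    tau = sigma0 * NormalDist().inv_cdf(1.0 - pfa)
    return stat > tau

rng = np.random.default_rng(0)
n, noise_var = 64, 1.0
s = np.exp(1j * 2 * np.pi * 0.1 * np.arange(n))  # unit-amplitude FSK-like tone
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) * np.sqrt(noise_var / 2)
detected = coherent_detect(s + noise, s, noise_var, pfa=1e-3)
```

With n = 64 unit-amplitude samples the signal contribution to the statistic is 64 while the noise contribution has standard deviation near 5.7, so a present signal is detected with very high probability at this false-alarm rate.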
2.3 Detection when the waveform is unknown (energy detection)

is now a random variable, which in general depends on the FSK index. If we do not know the FSK index, we will work under the model that each is a random variable uniformly distributed on . Our optimal detector is now a simpler energy detector, given by deciding if (Eq. 2.3 1) Our detection statistics are (Eq. 2.3 2) (Eq. 2.3 3) (Eq. 2.3 4) Let (Eq. 2.3 5) Define (Eq. 2.3 6) is the cumulative distribution function of the gamma random variable with parameters and
(Eq. 2.3 7) (Eq. 2.3 8) (Eq. 2.3 9) Let Then, (Eq. 2.3 10)

2.4 Detection when the waveform is known down to an unknown initial phase

In this case is no longer random. Equivalently, there exists a deterministic such that . We will design the Neyman-Pearson detector for such a model. The Neyman-Pearson detector decides if (Eq. 2.4 1) while is decided otherwise, where are the observed data from the processes, where is the probability density function when is active, and where is a threshold determined by some false alarm constraint . We will show that does exist for both . To find we note, by the theorem of total probability: (Eq. 2.4 2)
and (Eq. 2.4 3) where is the probability density function for the zero-mean unit Gaussian random variable. (Eq. 2.4 4) (Eq. 2.4 5) (Eq. 2.4 6) where we have defined . In contrast to the detection case where both the waveform and the phase offset are known, in this case, where the waveform is known but not the phase offset, is not memoryless, since
(Eq. 2.4 7) To ultimately simplify the detection scheme, we now seek to simplify the expression for . We will start with the integrand (Eq. 2.4 8) Let The integrand above can be rewritten as (Eq. 2.4 9) (Eq. 2.4 10) (Eq. 2.4 11) (Eq. 2.4 12) (Eq. 2.4 13) where denotes the real part. Let
(Eq. 2.4 14) (Eq. 2.4 15) (Eq. 2.4 16) (Eq. 2.4 17) where denotes the conjugate of . Finally, define such that (Eq. 2.4 18) With these substitutions we have (Eq. 2.4 19) The overall integral we wish to evaluate is (Eq. 2.4 20) Finally, with the change of variables , we write
(Eq. 2.4 21) (Eq. 2.4 22) (Eq. 2.4 23) Using the identity (Eq. 2.4 24) we rewrite (Eq. 2.4 23) as (Eq. 2.4 25) where The likelihood ratio is thus (Eq. 2.4 26) Taking the log ratio and rearranging terms, our test decides if (Eq. 2.4 27) So far, our test takes the observed sequence and the known transmitted signal (with unit amplitude) and computes
(Eq. 2.4 28) Then, our test computes and compares this test statistic to some threshold selected to attain probability of false alarm equal to . However, since the function is strictly a function of and is monotonically increasing with V for , while is by definition always nonnegative, the detection rule (Eq. 2.4 29) reduces to the equivalent inequality (Eq. 2.4 30) In other words, is the sufficient statistic of the detection scheme. Finally, we can multiply both sides of the detection inequality by (we do this to more easily derive the analytic expression for the probability of detection). In conclusion, our detection rule decides if (Eq. 2.4 31) where (Eq. 2.4 32) and
(Eq. 2.4 33) (Eq. 2.4 34) A similar derivation for the Bayesian detection of several orthogonal FSK signals is found in [2]. To evaluate the performance of our detection scheme, we now seek an analytic expression for the probability density function of under both hypotheses. Define to be the random variable we seek to find under both and . We first examine the statistics of (Eq. 2.4 35) which is a linear combination of complex, independent Gaussian variables. Since any linear combination of Gaussian variables is Gaussian, (Eq. 2.4 36) is a complex Gaussian variable and can be completely described by its mean and variance: (Eq. 2.4 37) (Eq. 2.4 38) (Eq. 2.4 39)
Therefore, under the distribution of is , and under the distribution of is generalized . Specifically, (Eq. 2.4 40) (Eq. 2.4 41) (Eq. 2.4 42) (Eq. 2.4 43) (Eq. 2.4 44) (Eq. 2.4 45) Let (Eq. 2.4 46) where is the Marcum Q function with . Define to be the signal-power-to-noise-power ratio per complex sample, where . The probability of detection can be written as a function of the number of samples, the signal-to-noise ratio, and the probability of false alarm as:
(Eq. 2.4 47) Figure 2 1 shows the probability of detection versus the product for various false alarm rates.

Figure 2 1: Power curves as a function of for various false alarm rates

2.5 Comparison of the Detection Performance

In this section, we will compare the detection performance for all three cases examined in sections 2.2, 2.3, and 2.4 by examining the induced probabilities of detection as functions of and . However, we will first compare the detection performance of Neyman-Pearson-like detectors which use just the mean and variance of our test statistic under and for all three cases. This analysis, being based on first- and second-order statistics, will lead to simpler expressions regarding comparisons between the three cases; however, the latter expressions are simply wrong when used for evaluation of the optimal Neyman-Pearson detectors. The
reason we present the performance of detectors that use just the mean and the variance is to demonstrate the danger involved in limiting performance evaluations to just first- and second-order statistics.

2.5.1 Detection Performance of Neyman-Pearson-like Detectors

Let and be the mean and standard deviation of our test statistic under the acting hypothesis , where can be either 0 or 1. Let us design our detection threshold to be , where is a constant. In other words, we set our threshold to be away from the mean under . Clearly, the greater the selected , the lower the resulting probability of false alarm. Furthermore, to simplify our detection performance analysis, we will examine the values of and needed for the case in each of the three cases. We are thus using the first- and second-order statistics of our test statistic under and just the mean under to compare the three cases. From section 2.2, in the case where the waveform as well as the initial phase are known, we have (Eq. 2.5 1) (Eq. 2.5 2) (Eq. 2.5 3) Our detection comparison occurs when , or (Eq. 2.5 4)
or equivalently when (Eq. 2.5 5) From section 2.3 (detection when the waveform is unknown, i.e., energy detection) we have (Eq. 2.5 6) (Eq. 2.5 7) (Eq. 2.5 8) Our detection comparison occurs when , or (Eq. 2.5 9) or equivalently when (Eq. 2.5 10) From section 2.4 (detection when the waveform is known down to an unknown initial phase) we have (Eq. 2.5 11) (Eq. 2.5 12)
(Eq. 2.5 13) Our detection comparison occurs when , or (Eq. 2.5 14) To compare the three cases, let and equal the expressions on the left-hand side of (Eq. 2.5 5), (Eq. 2.5 10), and (Eq. 2.5 14), respectively. We then obtain the following ratios as comparison quantities between the three tests: (Eq. 2.5 15) (Eq. 2.5 16) (Eq. 2.5 17) Based on first- and second-order statistics, our analysis results in simple and easy-to-remember detection comparisons. From (Eq. 2.5 16) we conclude that the coherent detector performs approximately better than the incoherent detector (in either or ). From (Eq. 2.5 17) we conclude that the energy detector suffers a performance loss equal to when compared to the incoherent detector. As shown in the next subsection, these conclusions are not even good approximations of performance comparisons between the optimal Neyman-Pearson detection schemes.
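The threshold design used in this subsection (place the threshold d standard deviations above the H0 mean, then ask when the H1 mean reaches it) is easy to state numerically under a Gaussian approximation for the test statistic. The helper below is an illustrative sketch; its names and the example numbers are assumptions, not the thesis notation.

```python
from statistics import NormalDist

def np_like_performance(mu0, sigma0, mu1, sigma1, d):
    """Gaussian approximation for a 'Neyman-Pearson-like' test whose
    threshold is placed d standard deviations above the H0 mean."""
    nd = NormalDist()
    tau = mu0 + d * sigma0
    pfa = 1.0 - nd.cdf(d)                    # P(statistic > tau | H0)
    pd = 1.0 - nd.cdf((tau - mu1) / sigma1)  # P(statistic > tau | H1)
    return tau, pfa, pd

# When the H1 mean sits exactly at the threshold, the detection
# probability is 0.5 -- the comparison point used in this subsection.
tau, pfa, pd = np_like_performance(mu0=0.0, sigma0=1.0, mu1=3.0, sigma1=1.0, d=3.0)
```

For d = 3 this gives a false-alarm probability of about 0.00135 and a detection probability of exactly one half, since the H1 mean was placed on the threshold.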
2.5.2 Detection Performance of the Neyman-Pearson Detectors

We can compare the performance of the Neyman-Pearson detectors via the use of numerical methods which evaluate the probability of detection as a function of the variables and . We first note that the incoherent and coherent detectors' performance can be expressed as a function of and the product . Let , and let us then define and as the smallest which achieves the probability of detection while maintaining the , respectively, for the coherent and incoherent Neyman-Pearson detectors. The ratio is similar to (Eq. 2.5 16) and represents the performance gain from knowing the initial phase offset as compared to the initial phase offset being completely unknown. The energy detector's performance depends on and . Define as the smallest which achieves probability of detection while maintaining the , assuming that the number of samples of the Neyman-Pearson energy detector is . Define and similarly. The ratios and are analogous to (Eq. 2.5 17) and represent the performance gain between the incoherent and energy detection schemes. Figure 2 2 shows . From this figure it is clear that incoherent detection is within 1.5 dB of the performance of coherent detection for a , as compared at any reasonable probability of detection. This is in contrast to the 3 dB performance difference derived using just first- and second-order statistics. Figure 2 3, Figure 2 4, and Figure 2 5 show for , , and 1000, respectively. From these figures it is clear that energy detection performs significantly
better than the performance loss (in terms of signal-to-noise ratio) derived from just first- and second-order statistics at all the and values covered in these figures. Finally, Figure 2 6, Figure 2 7, and Figure 2 8 show for , , and , respectively. From these figures it is clear that energy detection performs significantly better than the performance loss (in terms of samples needed) derived from just first- and second-order statistics at all the values covered in these figures.

Figure 2 2: as function of for various
Figure 2 3: as function of for various with

Figure 2 4: as function of for various with
Figure 2 5: as function of for various with

Figure 2 6: as a function of with and
Figure 2 7: as a function of with and

Figure 2 8: as a function of with and
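The numerical comparisons of section 2.5.2 can be reproduced in spirit by Monte Carlo: draw trials under each hypothesis, set each detector's threshold empirically from its H0 trials, and estimate the power. The sketch below compares the three test statistics of sections 2.2 through 2.4 (coherent correlation, envelope of the correlation, and energy); the tone, sample count, and per-sample SNR are illustrative choices, not the exact grid behind the figures.

```python
import numpy as np

def trial_stats(n, snr, trials, rng):
    """Return (coherent, envelope, energy) statistics under H0 and H1 for a
    tone of per-sample SNR `snr` in unit-power complex white Gaussian noise."""
    s = np.sqrt(snr) * np.exp(1j * 2 * np.pi * 0.1 * np.arange(n))
    def noise():
        return (rng.standard_normal((trials, n))
                + 1j * rng.standard_normal((trials, n))) / np.sqrt(2)
    def stats(r):
        corr = r @ np.conj(s)
        return (np.real(corr),                     # waveform and phase known
                np.abs(corr),                      # phase unknown: envelope
                np.sum(np.abs(r) ** 2, axis=1))    # waveform unknown: energy
    return stats(noise()), stats(s + noise())

def empirical_power(h0_stat, h1_stat, pfa):
    tau = np.quantile(h0_stat, 1.0 - pfa)          # threshold from H0 trials
    return float(np.mean(h1_stat > tau))

rng = np.random.default_rng(1)
(h0c, h0v, h0e), (h1c, h1v, h1e) = trial_stats(n=64, snr=0.1, trials=20000, rng=rng)
pd_coherent = empirical_power(h0c, h1c, 1e-2)
pd_envelope = empirical_power(h0v, h1v, 1e-2)
pd_energy = empirical_power(h0e, h1e, 1e-2)
```

At these settings the coherent statistic beats the envelope (incoherent) statistic by a modest margin, while the energy detector trails both, consistent with the ordering discussed in this section.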
3 Robust Detection

3.1 Detection when the waveform and the initial phase are known

3.1.1 Case where contamination occurs over data pairs

Let and be two disjoint classes of n-dimensional density functions, and let and represent, respectively, the hypotheses and . Assume that is generated by a single density function in . Furthermore, consider that the two classes are Huber classes defined as (Eq. 3.1 1) (Eq. 3.1 2) where (Eq. 3.1 3) (Eq. 3.1 4) and where are two given real constants whose values lie in . The classes above model the cases where a sequence of data is generated by a corresponding nominal process, and contaminations may appear independently per data pair with probabilities for class and for class . We later consider the case where the contamination may occur independently per datum or
Based on the results derived by Huber, as presented in Detection and Estimation (pp. 169-74) [3] and extended in Appendix A, we have (Eq. 3.1 5) (Eq. 3.1 6) where (Eq. 3.1 7) Let us define (Eq. 3.1 8) Then (Eq. 3.1 9) The statistics of under are the same for all , and therefore the statistics of under are the same for all . Thus, from this point on, we can drop the index from . Let us then define
(Eq. 3.1 10) (Eq. 3.1 11) (Eq. 3.1 12) Then (Eq. 3.1 13) The analysis has been simplified down to the case where a deterministic scalar constant signal may be transmitted in additive white Gaussian noise, and follows Example 6.3.1 presented in Detection and Estimation. The key to this simplification lies in the assumption that contamination occurs over data pairs. We include some of the key results and equations of the analysis found in Detection and Estimation, as future sections will reference them. The reader should refer to Detection and Estimation (pp. 175-85) for more details [3]. We will focus our analysis on the case where . The robust decision rule is (Eq. 3.1 14)
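The robust rule in (Eq. 3.1 14) sums censored versions of the per-sample log-likelihood-ratio terms, clipping each term between two fixed levels before summing, so that a single gross outlier has bounded influence on the decision. A minimal sketch, with illustrative clip levels standing in for the quantities determined through (Eq. 3.1 15):

```python
import numpy as np

def robust_statistic(llr_terms, c_lo, c_hi):
    """Huber-censored test statistic: each per-sample log-likelihood-ratio
    term is clipped to [c_lo, c_hi] before summing, so one extreme outlier
    can shift the sum by at most max(|c_lo|, |c_hi|)."""
    return float(np.sum(np.clip(llr_terms, c_lo, c_hi)))

# A single gross outlier barely moves the clipped statistic:
terms = np.array([0.2, -0.1, 0.3, 1e6])   # last term is an extreme outlier
t_robust = robust_statistic(terms, -1.0, 1.0)
t_plain = float(np.sum(terms))            # the unclipped sum is dominated by it
```

Here the clipped statistic is 0.2 - 0.1 + 0.3 + 1.0 = 1.4, while the plain sum is swamped by the outlier.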
where is as in (Eq. 3.1 13) and where and are such that (Eq. 3.1 15) The least favorable density functions and are (Eq. 3.1 16) (Eq. 3.1 17) where (Eq. 3.1 18) and is the unique solution of (Eq. 3.1 19) where there is a solution to (Eq. 3.1 19) which results in a valid if (Eq. 3.1 20)
Since the variables are independent and identically distributed when conditioned on any in , the random variables are too. The random variable is then asymptotically Gaussian. Let us denote (Eq. 3.1 21) (Eq. 3.1 22) Denote and as the mean and variance of the random variable as gets large, conditioned on . We find (Eq. 3.1 23) (Eq. 3.1 24) For asymptotically large values of , the constant in (Eq. 3.1 15) can be taken equal to zero, and we obtain (Eq. 3.1 25) Denote and as the mean and variance of the random variable as gets large, conditioned on . We find
(Eq. 3.1 26) (Eq. 3.1 27) Denote by and , respectively, the mean and variance of the random variable as gets large, conditioned on (K), where is Gaussian with variance and mean . Define (Eq. 3.1 28) We find (Eq. 3.1 29) (Eq. 3.1 30) For
(Eq. 3.1 31) we obtain (Eq. 3.1 32) (Eq. 3.1 33) Let us denote the power induced by the robust decision rule in (Eq. 3.1 15) at the density function in (Eq. 3.1 31). Due to (Eq. 3.1 25), we conclude: for (Eq. 3.1 34)

3.1.2 Case where contamination occurs per datum

For this case where the waveform is completely known, treat each complex pair as an individual sample. Define (Eq. 3.1 35) Also define (Eq. 3.1 36) In other words, is the known signal. With this new notation,
(Eq. 3.1 37) Our problem can be stated as follows. Let and be two disjoint classes of n-dimensional density functions, and let and represent, respectively, hypotheses and . Assume that is generated by a single density function in . Furthermore, consider that the two classes are Huber classes defined as (Eq. 3.1 38) (Eq. 3.1 39) where (Eq. 3.1 40) (Eq. 3.1 41) and where are two given real constants whose values lie in . The classes above model the cases where a sequence of data is generated by a corresponding nominal process, and contaminations may appear independently per datum with probabilities for class and for class . Based on the results derived by Huber, as presented in Detection and Estimation (pp. 169-74) [3] and extended in Appendix A, we have (Eq. 3.1 42)
(Eq. 3.1 43) (Eq. 3.1 44) where (Eq. 3.1 45) Let us define (Eq. 3.1 46) (Eq. 3.1 47) (Eq. 3.1 48) Then, if (Eq. 3.1 49) or, if
(Eq. 3.1 50) We will focus our analysis on the case where . The robust decision rule takes the form (Eq. 3.1 51) where are as in (Eq. 3.1 50) and where and are such that (Eq. 3.1 52) The least favorable density functions and , if , are (Eq. 3.1 53) (Eq. 3.1 54) and, if , are
(Eq. 3.1 55) (Eq. 3.1 56) where (Eq. 3.1 57) and are the solutions of (Eq. 3.1 58) (Eq. 3.1 59) From (Eq. 3.1 51) we observe that the robust receiver performs a truncation per datum in a similar fashion to that of the robust receiver found in section 3.1.1, except that the function which performs the truncation changes here for each sample according to . Also, (Eq. 3.1 49) and (Eq. 3.1 50) require that . Due to this requirement, we could have the case where some data samples have small enough that no valid solution to (Eq. 3.1 58) or (Eq. 3.1 59) exists. Those data samples where no solution exists represent a case where and overlap. For our overall detection, such samples do not help our decision rule, and in the analysis that follows we will assign to all samples where no valid solution exists.
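In this per-datum case, then, the truncation interval varies from sample to sample, and a sample admitting no valid solution to (Eq. 3.1 58) or (Eq. 3.1 59) contributes nothing. A sketch with illustrative per-sample clip levels; marking an uninformative sample by a degenerate interval [0, 0] implements the zero assignment:

```python
import numpy as np

def robust_statistic_per_datum(terms, c_lo, c_hi):
    """Censor each per-sample term with its own interval [c_lo[k], c_hi[k]];
    samples flagged as uninformative (c_lo = c_hi = 0) are zeroed out."""
    return float(np.sum(np.clip(np.asarray(terms, float), c_lo, c_hi)))

terms = np.array([0.5, -2.0, 3.0, 10.0])
c_lo = np.array([-1.0, -0.5, 0.0, -2.0])   # third sample: no valid solution,
c_hi = np.array([1.0, 0.5, 0.0, 2.0])      # so it is clipped to zero
t = robust_statistic_per_datum(terms, c_lo, c_hi)
```

Here the four terms contribute 0.5, -0.5 (clipped), 0 (uninformative sample), and 2.0 (clipped), for a statistic of 2.0.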
The variables are independent when conditioned on any in . Let us denote a subset of as , where is arbitrarily small. The members in subset are independent and identically distributed when conditioned on any in , and so are the random variables . The random variable , where is the number of elements inside the subset, is then asymptotically Gaussian. Finally, is a linear combination of asymptotically Gaussian random variables; it is therefore asymptotically Gaussian. (Eq. 3.1 60) where is the relative frequency of such that . For a specified , the integral in (Eq. 3.1 60) will reduce to a sum. Let us assume that, as gets asymptotically large, , where for samples and for samples. Our model will be asymptotically correct if the signal is such that, as gets large, the relative occurrence of a sample falling into any phase band of width is equally probable. Within the latter model, (Eq. 3.1 60) can be written as (Eq. 3.1 61) Let us denote (Eq. 3.1 62)
(Eq. 3.1 63) With this notation, and based on (Eq. 3.1 23), we find (Eq. 3.1 64) Define . Define . Due to symmetry, (Eq. 3.1 61) can then be simplified to the following form: (Eq. 3.1 65) Since we must find numerically, we must also evaluate (Eq. 3.1 65) numerically. In a similar fashion, and based on (Eq. 3.1 24), we find (Eq. 3.1 66) (Eq. 3.1 67)
For asymptotically large values of , the constant in (Eq. 3.1 52) can be set at zero, resulting in (Eq. 3.1 68) Asymptotically, is Gaussian when conditioned on , where is Gaussian with mean and variance . Define . In (Eq. 3.1 26), (Eq. 3.1 27), (Eq. 3.1 29), and (Eq. 3.1 30) we can replace with , with , with , and with , and integrate in a similar fashion as in (Eq. 3.1 65) and (Eq. 3.1 66) to obtain the mean and variance (as ) of , when conditioned on : (Eq. 3.1 69) (Eq. 3.1 70) (Eq. 3.1 71) (Eq. 3.1 72) (Eq. 3.1 73)
(Eq. 3.1 74) (Eq. 3.1 75) (Eq. 3.1 76) For (Eq. 3.1 77)
we have (Eq. 3.1 78) (Eq. 3.1 79) Let us denote by the power induced by the robust decision rule in (Eq. 3.1 52) at the density function in (Eq. 3.1 77). Due to (Eq. 3.1 68), we conclude: for (Eq. 3.1 80)

3.1.3 Comparison of the robust cases with the optimal at the nominal model test

In section 2.2 we examined the Neyman-Pearson detection rule at the nominal model. The latter rule does not provide protection against data contamination represented by extreme outliers. In section 3.1.1 we examined the robust detection scheme in the case where contamination occurs per data pair. In section 3.1.2 we examined the robust detection scheme in the case where contamination occurs independently per datum. In this section we will compare the asymptotic performance of all three detection schemes under all three contamination models (no contamination, contamination per data pair, and contamination per datum). We will use the notation to denote the power of the detection scheme under the contamination model , where and can be , to respectively represent the cases of no contamination, contamination occurring per data pair, or contamination occurring per single datum. For example, represents the power of the robust detection rule derived assuming contamination occurring per data pair ( ) under the model where contamination actually occurs per single datum ( ). More specifically, for
we will consider the power of the detection rule under as defined in (Eq. 3.1 31); and for we will consider the power of the detection rule under as defined in (Eq. 3.1 77).

Comparison under the model of no contamination

Denote . From (Eq. 2.2 8) we have (Eq. 3.1 81) Based on (Eq. 3.1 34) we have (Eq. 3.1 82) where and are defined in (Eq. 3.1 23), (Eq. 3.1 24), (Eq. 3.1 26), and (Eq. 3.1 27). Based on (Eq. 3.1 80) we have (Eq. 3.1 83) where and are defined in (Eq. 3.1 65), (Eq. 3.1 66), (Eq. 3.1 69), and (Eq. 3.1 70). Figure 3 1 shows the power curves for parameters: Figure 3 2 shows the power curves for parameters: It can easily be seen in both Figure 3 1 and Figure 3 2 that for all values plotted. In other words, Figure 3 1 and Figure 3 2 show that the optimal detection scheme outperforms the robust detection scheme if there are no outliers.
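The conclusion that the nominal-model (optimal) detector beats the robust detector when no outliers are present can be checked empirically. The sketch below compares an unclipped sum against a clipped sum on clean Gaussian data; the signal level, sample size, clip level, and false-alarm rate are illustrative choices, not the values solved from (Eq. 3.1 19) or used in the figures.

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials, theta = 32, 20000, 0.3                  # theta: constant signal level
h0 = rng.standard_normal((trials, n))              # nominal Gaussian noise
h1 = rng.standard_normal((trials, n)) + theta      # signal present, no outliers

def empirical_power(stat, pfa=0.05):
    tau = np.quantile(stat(h0), 1.0 - pfa)         # empirical H0 threshold
    return float(np.mean(stat(h1) > tau))

pd_optimal = empirical_power(lambda x: x.sum(axis=1))                       # plain sum
pd_robust = empirical_power(lambda x: np.clip(x, -0.5, 0.5).sum(axis=1))    # clipped sum
```

On clean data the clipping discards useful information about large in-model samples, so the clipped statistic has slightly lower power; under contaminated data the ordering reverses, as the following comparisons show.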
Figure 3 1: Power curves for

Figure 3 2: Power curves for
Comparison under the model of contamination per data pair

From Detection and Estimation (p. 184) [3] we obtain (Eq. 3.1 84) From (Eq. 3.1 34) we have (Eq. 3.1 85) where and are defined in (Eq. 3.1 23), (Eq. 3.1 24), (Eq. 3.1 32), and (Eq. 3.1 33). To derive , we consider density functions defined by (Eq. 3.1 86) (Eq. 3.1 87) where (Eq. 3.1 88) These result in (Eq. 3.1 89) Furthermore, define to be Gaussian with mean and variance , and to be Gaussian with mean and variance . We obtain
(Eq. 3.1 90) where and are the same densities as in Section 3.1.1. In other words, the densities defined by (Eq. 3.1 86) and (Eq. 3.1 87) result in the same density model assumed under contamination model . We therefore can find as by examining the statistics of conditioned on (Eq. 3.1 86) and (Eq. 3.1 87), where is defined in (Eq. 3.1 49) and (Eq. 3.1 50). The density defined by (Eq. 3.1 86) and (Eq. 3.1 87) is similar to the density model assumed under contamination model C, except that the outliers always occur on consecutive values of . The statistics of as will be the same whether outliers always occur consecutively or not. Therefore, we conclude that . From (Eq. 3.1 80) we have (Eq. 3.1 91) where and are defined in (Eq. 3.1 65), (Eq. 3.1 66), (Eq. 3.1 78), and (Eq. 3.1 79), respectively. Figure 3 3 shows the power curves for parameters: Figure 3 4 shows the power curves for parameters: It can easily be seen in both Figure 3 3 and Figure 3 4 that and for all values plotted. In other words, Figure 3 3 and Figure 3 4 show that the robust detection scheme based on the appropriate outlier model outperforms the two other detection schemes examined.
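The two contamination mechanisms compared in this section differ only in whether an outlier event replaces a whole data pair or each real datum independently. A sketch of generating data under either model; the wide-Gaussian contaminating density and the parameter values are illustrative choices (the Huber classes leave the contaminating density arbitrary):

```python
import numpy as np

def contaminate(nominal, eps, outlier_scale, per_pair, rng):
    """Epsilon-contamination of an (n, 2) array of real data pairs: with
    probability eps an outlier (drawn here from a wide Gaussian, an
    illustrative choice) replaces either the whole pair (per_pair=True)
    or each real datum independently (per_pair=False)."""
    out = rng.standard_normal(nominal.shape) * outlier_scale
    if per_pair:
        mask = rng.random(nominal.shape[0]) < eps       # one event per pair
        mask = np.repeat(mask[:, None], 2, axis=1)
    else:
        mask = rng.random(nominal.shape) < eps          # independent per datum
    return np.where(mask, out, nominal)

rng = np.random.default_rng(3)
clean = rng.standard_normal((1000, 2))
pairs = contaminate(clean, eps=0.1, outlier_scale=50.0, per_pair=True, rng=rng)
data = contaminate(clean, eps=0.1, outlier_scale=50.0, per_pair=False, rng=rng)

pair_changes = (pairs != clean).sum(axis=1)   # 0 or 2 per row under per-pair model
datum_frac = float(np.mean(data != clean))    # roughly eps under per-datum model
```

Under the per-pair model a contaminated row has both entries replaced, which is exactly the structural difference that section 3.1.2 accounts for.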
Figure 3-3: Power curves for
Figure 3-4: Power curves for
Comparison under the model of contamination per datum. To find we need to examine the statistics of where under (Eq. 3.1-92) Define (Eq. 3.1-93) (Eq. 3.1-94) (Eq. 3.1-95) Define as the mean and standard deviation of as conditioned on density where can be denoting respectively the densities in (Eq. 3.1-93), (Eq. 3.1-94) and (Eq. 3.1-95). We note that conditioned on or is a linear combination of Gaussian variables and is thus Gaussian. From Section 3.1.2 we note that is Gaussian with mean ; for data values and with mean ; for data values, while both and data values have common variance. Denote where is defined in Section 3.1.2. We then find (Eq. 3.1-96)
In a similar fashion we find (Eq. 3.1-97) (Eq. 3.1-98) (Eq. 3.1-99) Define and as the mean and variance of ; as conditioned on the density in (Eq. 3.1-92). We then find (Eq. 3.1-100) (Eq. 3.1-101) (Eq. 3.1-102) Denote by as the mean and standard deviation of the random variable ; as gets large, conditioned on density ; where can be referring respectively to the densities in (Eq. 3.1-93), (Eq. 3.1-94) and (Eq. 3.1-95). We note that is Gaussian when conditioned on or with variance. We will define to be the mean of divided by when conditioned on density ; where can be or. We find
(Eq. 3.1-103) (Eq. 3.1-104) (Eq. 3.1-105) (Eq. 3.1-106) where (Eq. 3.1-107) (Eq. 3.1-108) (Eq. 3.1-109)
and where (Eq. 3.1-105) and (Eq. 3.1-106) are obtained from (Eq. 3.1-29) and (Eq. 3.1-30). For and the dependence of disappears, and and are identical to and which are found in (Eq. 3.1-29) and (Eq. 3.1-30). Define and as the mean and variance of ; as conditioned on the density in (Eq. 3.1-92). We then find (Eq. 3.1-110) (Eq. 3.1-111) and (Eq. 3.1-112) where and are defined in (Eq. 3.1-23) and (Eq. 3.1-24). Finally, from (Eq. 3.1-80) we have (Eq. 3.1-113) where and are defined in (Eq. 3.1-65), (Eq. 3.1-66), (Eq. 3.1-78) and (Eq. 3.1-79), respectively. Figure 3-5 shows the power curves for parameters:
Figure 3-6 shows the power curves for parameters: It can easily be seen in both Figure 3-5 and Figure 3-6 that and for all values plotted. In other words, Figure 3-5 and Figure 3-6 show that the robust detection scheme based on the appropriate outlier model outperforms the two other detection schemes examined.
Figure 3-5: Power curves for
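The per-datum mean and variance expressions above combine component moments through the standard mixture identities E[X] = (1-eps)*mu0 + eps*mu1 and Var[X] = E[X^2] - (E[X])^2. A quick Monte Carlo sanity check of these identities, with hypothetical stand-in parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)
eps, mu0, s0, mu1, s1 = 0.1, 0.5, 1.0, 0.0, 5.0   # hypothetical stand-in values

# Closed-form moments of the mixture (1 - eps) * N(mu0, s0^2) + eps * N(mu1, s1^2)
mean = (1 - eps) * mu0 + eps * mu1
second_moment = (1 - eps) * (s0**2 + mu0**2) + eps * (s1**2 + mu1**2)
var = second_moment - mean**2

# Monte Carlo confirmation of the closed-form moments
m = 200_000
is_outlier = rng.random(m) < eps
x = np.where(is_outlier, rng.normal(mu1, s1, m), rng.normal(mu0, s0, m))
print(mean, var, x.mean(), x.var())
```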
Figure 3-6: Power curves for
3.2 Detection when the waveform is unknown (energy detection)
Let and be two disjoint classes of n-dimensional density functions, and let and represent respectively hypotheses and. Assume that are generated by a single density function in. Furthermore, consider that the two classes are Huber classes defined as (Eq. 3.2-1) (Eq. 3.2-2) where (Eq. 3.2-3)
(Eq. 3.2-4) and where are two given real constants whose values lie in. The classes above model the cases where a sequence of data is generated by a corresponding nominal process, and contaminations may appear independently per data pair with probabilities ; for class and ; for class. (Eq. 3.2-5) define Then (Eq. 3.2-6) (Eq. 3.2-7) where (Eq. 3.2-8) Let us define
(Eq. 3.2-9) (Eq. 3.2-10) (Eq. 3.2-11) Then (Eq. 3.2-12) (Eq. 3.2-13) Finally, define (Eq. 3.2-14) Then (Eq. 3.2-15) (Eq. 3.2-16) The robust detection rule can be written as
(Eq. 3.2-17) where and are such that (Eq. 3.2-18) The least favorable density functions and are (Eq. 3.2-19) (Eq. 3.2-20) and the constants and are such that (Eq. 3.2-21)
(Eq. 3.2-22) Define such that ; if and only if. Define new constants as below, (Eq. 3.2-23) (Eq. 3.2-24) With this notation, (Eq. 3.2-15) can be written (Eq. 3.2-25) Then (Eq. 3.2-21) and (Eq. 3.2-22) can be written (Eq. 3.2-26) (Eq. 3.2-27) or equivalently as; if :
(Eq. 3.2-28) (Eq. 3.2-29) We note that if then (Eq. 3.2-27) becomes ; therefore this will only occur if. Also, we note that we must have as dictated by (Eq. 3.2-25). We will focus our analysis on the case where. We will denote the solutions of in (Eq. 3.2-28), (Eq. 3.2-29) as and ; and are continuous since all functions in (Eq. 3.2-28) and (Eq. 3.2-29) are continuous over all variables. We will define the breakdown point as the closest to zero such that. We can also denote by the value at which the breakdown point occurs; in other words, the value when occurs. From (Eq. 3.2-28) and (Eq. 3.2-29) with and we conclude that must satisfy (Eq. 3.2-30) and
(Eq. 3.2-31) We note that since (Eq. 3.2-32) then (Eq. 3.2-33) There is no known analytic expression for or ; therefore we will use numerical methods to analyze these equations.
Figure 3-7: Threshold Values when
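Since the threshold equations admit no closed-form solution, the curves plotted below are found numerically. The thesis' exact equations (Eq. 3.2-28)-(Eq. 3.2-31) are not reproduced here; as an illustrative stand-in for the same root-finding approach, the sketch below solves a coupled pair of Gaussian threshold equations (the shortest interval covering 90% of the standard normal mass):

```python
from scipy.optimize import fsolve
from scipy.stats import norm

def equations(v):
    a, b = v
    return [norm.cdf(b) - norm.cdf(a) - 0.90,   # coverage constraint
            norm.pdf(a) - norm.pdf(b)]          # equal-density endpoint condition

a, b = fsolve(equations, x0=[-1.0, 1.0])
print(a, b)   # the symmetric pair for the standard Gaussian
```

The same pattern (a vector-valued residual handed to a root finder, swept over the contamination parameter) produces threshold-versus-parameter curves such as those in Figures 3-7 through 3-11.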
Figure 3-8: Threshold Values when
Figure 3-9: Threshold Values when
Figure 3-10: Threshold Value at the breakdown point
Figure 3-11: Breakdown Point
Since the variables are independent and identically distributed, when conditioned on any in the random variables are too, and the random variable is then asymptotically Gaussian. From (Eq. 3.2-21), (Eq. 3.2-22) and (Eq. 3.2-25) we find
(Eq. 3.2-34) (Eq. 3.2-35) (Eq. 3.2-36) The integral in (Eq. 3.2-36) cannot be written in terms of a well-known function. In the spirit of the Marcum Q-function, we define the following functions (Eq. 3.2-37) (Eq. 3.2-38)
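For reference, the Marcum Q-function itself can be evaluated without special-purpose code through its noncentral chi-square identity Q_M(a, b) = P(X > b^2), where X is noncentral chi-square with 2M degrees of freedom and noncentrality a^2. A sketch, cross-checked against the defining integral:

```python
import numpy as np
from scipy.stats import ncx2
from scipy.integrate import quad
from scipy.special import i0e

def marcum_q(M, a, b):
    """Generalized Marcum Q: Q_M(a, b) = P(X > b^2) for
    X ~ noncentral chi-square with 2M dof and noncentrality a^2."""
    return ncx2.sf(b**2, df=2 * M, nc=a**2)

# Cross-check Q_1 against its defining integral,
#   Q_1(a, b) = int_b^inf x * exp(-(x^2 + a^2)/2) * I0(a*x) dx,
# rewritten with i0e (exponentially scaled I0) so the integrand never overflows:
# exp(-(x^2 + a^2)/2) * I0(a*x) = exp(-(x - a)^2 / 2) * i0e(a*x).
a, b = 1.0, 1.5
val, _ = quad(lambda x: x * np.exp(-((x - a) ** 2) / 2) * i0e(a * x), b, np.inf)
print(marcum_q(1, a, b), val)
```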
Then (Eq. 3.2-36) can be written as (Eq. 3.2-39) and (Eq. 3.2-40) (Eq. 3.2-41) For asymptotically large values of, the constant in (Eq. 3.2-18) can be set at zero, resulting in
(Eq. 3.2-42) We will examine the asymptotic performance of our detector under ; where (Eq. 3.2-43) and (Eq. 3.2-44) and where is a parameter of which may be viewed as the energy of the outliers. We note that is the same density as that given by (Eq. 3.2-4), written here in terms of instead of. Also, results from assuming a Gaussian outlier model under both and ; with means and such that and variance summing to. Let and denote the mean and variance of conditioned on or, respectively. We define the following useful functions (Eq. 3.2-45) (Eq. 3.2-46) We obtain
(Eq. 3.2-47) (Eq. 3.2-48) (Eq. 3.2-49)
(Eq. 3.2-50) For (Eq. 3.2-51) we have (Eq. 3.2-52) (Eq. 3.2-53) Let us denote by the power induced by the robust decision rule in (Eq. 3.2-18) at the density function in (Eq. 3.2-51). Due to (Eq. 3.2-42) we conclude; for (Eq. 3.2-54) Given a false alarm rate, let us consider the optimal Neyman-Pearson test at the density functions and. By (Eq. 2.3-10) we have
(Eq. 3.2-55) where (Eq. 3.2-56) and is the inverse function of ; is the cumulative distribution function of the gamma random variable with parameters and. Denote by and the mean and variance of under the density function. Then, (Eq. 3.2-57) (Eq. 3.2-58) Let us denote by the power induced by the optimal Neyman-Pearson rule at for. Directly from (Eq. 2.3-8), (Eq. 3.2-57) and (Eq. 3.2-58) we obtain (Eq. 3.2-59)
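The threshold and power computations for a Gaussian energy detector follow the usual central/noncentral chi-square pattern (the gamma distribution above being the scaled chi-square). A generic sketch; the sample size, noise variance, and signal energy are hypothetical stand-ins rather than the thesis' values:

```python
from scipy.stats import chi2, ncx2

n, sigma2, alpha = 100, 1.0, 0.05   # hypothetical sample size and noise variance
E = 30.0                            # hypothetical total signal energy under H1

# Under H0, T = sum(x_i^2)/sigma2 is chi-square with n degrees of freedom,
# so the false-alarm constraint fixes the threshold directly.
thr = sigma2 * chi2.isf(alpha, df=n)

# Under H1 with total energy E, T/sigma2 is noncentral chi-square
# with noncentrality E/sigma2, giving the detection power directly.
power = ncx2.sf(thr / sigma2, df=n, nc=E / sigma2)
print(f"threshold = {thr:.2f}, power = {power:.3f}")
```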
Figure 3-12: Probability of detection with and
Figure 3-13: Probability of detection with and
In Figure 3-12 we observe that ; for all examined. However, in Figure 3-13 we observe that ; for all examined. The only difference between Figure 3-12 and Figure 3-13 is that in Figure 3-12 while in Figure 3-13. The test in set a threshold to guarantee that the probability of false alarm was less than or equal to at, and subsequently guaranteed that the probability of false alarm at any was less than or equal to. In contrast, set a threshold to guarantee that the probability of false alarm was less than or equal to at only. Let denote the smallest false alarm satisfied by the test used to produce at. The robust detection guarantees that. If, then comparing and may be misleading, since each satisfies different false alarm upper bounds at. Let us examine the power of the robust decision rule when the threshold is selected to satisfy the false alarm constraint only at. We then obtain (Eq. 3.2-60) where
(Eq. 3.2-61) (Eq. 3.2-62) Figure 3-14 shows that for all examined, using the same parameters as in Figure 3-13. At, the optimal test at the nominal model is expected to outperform the robust detection rule. The power loss due to the use of the robust detection rule is controlled by, and so is the density used to set the threshold of the test, in satisfaction of the false alarm constraint. Figure 3-15 shows and ; with. It is noted that the power loss in is much smaller than that in ; Figure 3-15. This is evidence of the penalty paid when a false alarm of less than or equal to is guaranteed at any, instead of only at
Figure 3-14: Probability of detection with and
Figure 3-15: Probability of detection with and set to 0.05.
3.3 Detection when the waveform is known down to an unknown initial phase
In this case is not induced by a memoryless process, and the theorems due to Huber used in Sections 3.1 and 3.2 cannot be used in the search for the optimal robust decision rule. We can, however, use the results from Section 3.1 to construct robust decision rules which should provide good protection against occasional outliers. In this section we will develop such ad hoc, but intuitively pleasing, decision rules and analyze their performance. For more work on robust detection of processes with memory, the reader is referred to [4]. We will begin by deriving a Neyman-Pearson-like rule for the non-robust case. The reason for doing this will become obvious later on. In Section 2.4 we derived our decision rule based on the metric where (Eq. 3.3-1) In this section, we will examine the Neyman-Pearson-like rule based on the metric where (Eq. 3.3-2) and is the which maximizes the conditional probability density function. Using the notation in Section 2.4 we find (Eq. 3.3-3)
where is defined in (Eq. 2.4-16), is defined in (Eq. 2.4-17), and is defined by (Eq. 2.4-18). From (Eq. 3.3-3) it is easy to see and (Eq. 3.3-4) Our Neyman-Pearson-like rule decides in favor of if (Eq. 3.3-5) where is selected to satisfy the false alarm constraint. After simplification, our Neyman-Pearson-like rule reduces to exactly the same rule found in Section 2.4. In other words, the optimal Neyman-Pearson test is equivalent to a test which finds the maximum likelihood estimate of and then performs the Neyman-Pearson detection. Based on this observation, we will design our robust detection rule as follows: first use robust estimation to estimate, and then use an appropriate robust decision rule from Section 3.1.
3.3.1 Robust Estimation of the unknown initial phase
The maximum likelihood estimate of can be found by forming complex numbers from the data pairs, forming complex numbers, computing the complex metric, where is the angle of. This is not an M-estimate; however, the real and imaginary components of are M-estimates. We therefore examine the components used in the metric. Let (Eq. 3.3-6) (Eq. 3.3-7)
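The maximum-likelihood phase estimate described above, the angle of the complex correlation metric, can be sketched as follows; the waveform `s` below is a hypothetical stand-in for the thesis' FSK data pairs:

```python
import numpy as np

rng = np.random.default_rng(3)
n, theta, noise_std = 500, 0.7, 0.3

# Hypothetical known complex baseband waveform, standing in for the FSK tones
s = np.exp(1j * 2 * np.pi * 0.05 * np.arange(n))
noise = noise_std * (rng.normal(size=n) + 1j * rng.normal(size=n))
z = s * np.exp(1j * theta) + noise          # received data pairs as complex samples

# ML phase estimate: the angle of the complex correlation metric
theta_hat = np.angle(np.sum(z * np.conj(s)))
print(theta, theta_hat)
```

Because the noise contributions average out in the complex sum, the angle of the correlation concentrates around the true initial phase as n grows.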
where all random variables are conditioned on. Then clearly (Eq. 3.3-8) Since and are linear combinations of Gaussian variables, they also are Gaussian. In fact, are independent identically distributed Gaussian variables with mean and variance, and are independent identically distributed Gaussian with mean and variance. We can now develop a robust estimate of, denoted by, independently using robust estimates of the means of and. The formal mathematical approach to finding the robust estimate of a location parameter (the mean) of a generating process within a specific class of functions is due to Huber and can be found in detail in Detection and Estimation (pp. 320-30) [3]. This work will summarize the results and assumptions as they pertain to robust estimation of the means of Gaussian variables. Let us consider the possible occurrence of extreme independent outliers on independent identically distributed Gaussian data. Let denote the density function of a zero-mean Gaussian variable with variance. Let outliers appear with frequency and let their distribution be represented by some unknown density function whose entropy and Fisher information measure exist, and which is such that, where is the entropy of and. Let be the class of all valid density functions which can be written (Eq. 3.3-9)
where can be any density function which satisfies the constraints given above. Given in, let denote the Fisher information measure. Define, which, if it exists, satisfies (Eq. 3.3-10) For, is as follows (Eq. 3.3-11) where the constants are such that (Eq. 3.3-12) Let For (Eq. 3.3-13) and Given some data sequence generated from some, where is within and is a constant, the robust estimate of, denoted, is
(Eq. 3.3-14) This robust estimate is consistent at every. Let be the rate of convergence of some M-estimate at. Let be the class of all such that. The rate of convergence of our robust M-estimate is such that (Eq. 3.3-15) We define our robust estimate as follows: use the estimation rule in (Eq. 3.3-13) to obtain estimates and, and then define. At a given, as, the random variables and are asymptotically Gaussian. We will consider (Eq. 3.3-16) where, and where and are defined via a discrete random variable as follows (Eq. 3.3-17) (Eq. 3.3-18)
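The estimation rule referenced above is a Huber location M-estimate: the estimating equation sums a clipped residual function and sets it to zero. A minimal fixed-point implementation of that idea; the clipping constant k = 1.345 here is a conventional stand-in, not the thesis' outlier-frequency-dependent constant:

```python
import numpy as np

rng = np.random.default_rng(4)

def huber_mean(x, k=1.345, iters=200, tol=1e-10):
    """Huber location M-estimate: solves mean(clip(x - mu, -k, k)) = 0
    by fixed-point iteration, starting from the median."""
    mu = np.median(x)
    for _ in range(iters):
        step = np.clip(x - mu, -k, k).mean()
        mu += step
        if abs(step) < tol:
            break
    return mu

clean = rng.normal(2.0, 1.0, 1000)
x = np.concatenate([clean, np.full(50, 50.0)])   # 5% gross outliers at 50
print(np.mean(x), huber_mean(x))                 # sample mean is dragged; M-estimate is not
```

On clean Gaussian data the M-estimate essentially reproduces the sample mean; under gross contamination it stays near the nominal location, which is the consistency property the thesis relies on.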
(Eq. 3.3-19) Define as the density function of the variable, where is as defined in (Eq. 3.3-6) and are realizations of random variables defined by (Eq. 3.3-16). Also define similarly, where is as defined in (Eq. 3.3-7); we obtain (Eq. 3.3-20) (Eq. 3.3-21) Clearly, and are symmetric density functions around zero; the outlier model is such that, where is defined in (Eq. 3.3-9). The robust estimates and at and are asymptotically Gaussian. Denote and as the mean and standard deviation of at as. Define and similarly. By applying the theorem in Detection and Estimation (p. 312) [3] we find (Eq. 3.3-22) (Eq. 3.3-23)
(Eq. 3.3-24) (Eq. 3.3-25) To simplify and we will define a new function (Eq. 3.3-26) and we will also use the following equalities (Eq. 3.3-27) (Eq. 3.3-28) We obtain
(Eq. 3.3-29) To simplify and we will define a new function (Eq. 3.3-30) We obtain (Eq. 3.3-31) We will now find the probability density function of our estimate as. To simplify future expressions, we will find the probability density function of, where. A realization of is found as the angle of. The complex random variable
is Gaussian, with both real and imaginary parts having variance, and whose real mean is and whose imaginary mean is zero. Define (Eq. 3.3-32) (Eq. 3.3-33) where and denote taking the imaginary and real part of, respectively. Define (Eq. 3.3-34) We find (Eq. 3.3-35) Let and denote the probability density function of conditioned on and, respectively. Let denote the probability density function of the random variable. We obtain (Eq. 3.3-36)
(Eq. 3.3-37) (Eq. 3.3-38) Using (Eq. 3.3-37) and (Eq. 3.3-38) we can write (Eq. 3.3-36) as (Eq. 3.3-39) The probability density function of as is (Eq. 3.3-40)
3.3.2 Robust Detection
As noted earlier, when the initial phase is unknown (Section 2.4), the optimal detection at the nominal model is equivalent to finding the maximum likelihood estimate for. We denote this estimate, assuming and examining the same metric used in optimal detection at the nominal model where the waveform is completely known (Section 2.2). In similar fashion, based on our outlier model, we can use the appropriate robust detection metric found in Sections 3.1.1 or 3.1.2 under the assumption that, where is obtained from the
method described in 3.3.1. We will assume an outlier model where outliers occur per data pair and, therefore, we will adopt the detection rule in Section 3.1.1. We will focus our analysis on the case where the occurrence of outliers is the same under both and and is less than. Define (Eq. 3.3-41) (Eq. 3.3-42) where is the solution to (Eq. 3.1-19). The robust decision rule is (Eq. 3.3-43) where is as in (Eq. 3.3-42) and where and are such that (Eq. 3.3-44) The least favorable density functions and are given by (Eq. 3.1-16) and (Eq. 3.1-17). The density function of conditioned on and some is the same regardless of the value of. Therefore the density function of conditioned on is the same as the density function of conditioned on and. From Section 3.1.1, this density function is asymptotically Gaussian with mean and variance given by (Eq. 3.1-23)
and (Eq. 3.1-24), respectively. Denote by and, respectively, the mean and variance of conditioned on. For asymptotically large values of, the constant in (Eq. 3.3-43) can be taken equal to zero; we thus obtain (Eq. 3.3-45) Denote by and, respectively, the mean and variance of the random variable conditioned on and, for large values. Since, conditioned on, and are independent identically distributed Gaussian variables with mean and variance, and are respectively equal to and in (Eq. 3.1-29) and (Eq. 3.1-30) with. Let (Eq. 3.3-46) where (Eq. 3.3-47) Equivalently, is the density resulting from the density functions and in (Eq. 3.3-18) and (Eq. 3.3-19). Define (Eq. 3.3-48)
(Eq. 3.3-49) We rewrite (Eq. 3.3-46) as (Eq. 3.3-50) Denote by and, respectively, the mean and variance of the random variable conditioned on and, for large values of. Define and similarly. Since is a Gaussian probability density function with variance, and are respectively equal to and in (Eq. 3.1-29) and (Eq. 3.1-30) with. Similarly, since is a Gaussian probability density function with variance, and are respectively equal to and in (Eq. 3.1-29) and (Eq. 3.1-30) with. Denote by and, respectively, the mean and variance of the random variable conditioned on in (Eq. 3.3-50), for large values of. We obtain (Eq. 3.3-51)
(Eq. 3.3-52) Let us denote by the power induced by the robust decision rule in (Eq. 3.3-43) at the density function conditioned on. Due to (Eq. 3.3-45) we conclude: for (Eq. 3.3-53) Let us denote by the power induced by the robust decision rule in (Eq. 3.3-43) at the density function in (Eq. 3.3-50). From our analysis it is difficult to find, since is directly computed from our observed data. If were completely independent of the data sequence and still had the distribution given by (Eq. 3.3-40), the theorem of total probability would give: (Eq. 3.3-54) where is defined in (Eq. 3.3-40). As, any single data pair's contribution to is negligible; therefore the distribution of conditioned on a selected using the data pair
is close to the distribution of conditioned on a selected independently of. This logic cannot be extended to.
3.3.3 Comparing robust detection with non-robust detection
Without a straightforward way to compute, even asymptotically, we will compare the performance of the robust and the optimal detectors both under the assumption that the angle estimates and were selected independently of the data sequence. One way to achieve this independence is to assume that the analyst receives two separate independent sequences of data pairs from the same generating process, each of length n. The analyst uses the first to estimate or, and then, based on this estimate, uses the second to decide or. We conjecture that this comparison will be at least qualitatively similar to the comparison for the case where the angle estimates are made directly from the same data sequence used to decide or. The and estimates can be expressed as (Eq. 3.3-55) (Eq. 3.3-56) Denote by the distribution defined in (Eq. 3.3-16). We note that is the nominal distribution. Let us denote by the probability distribution of. Define similarly. Under and as, the variables and are Gaussian; therefore and are of the mathematical form found in (Eq. 3.3-40)
We find that and both have the same value of but generally different. Let us denote by and the parameter of and, respectively. Figure 3-16 shows and for various values of, where and. When, shows that the optimal estimate at is superior. However, as increases, increases without bound, whereas does not; does not change significantly as a function of. Figure 3-17 shows and as a function of at, where and; as increases, increases, showing the tradeoff between protecting against a greater frequency of outliers and optimal performance at. Figure 3-18 shows and, where and; the closeness of the two distributions indicates that the performance of the robust estimate is close to the performance of the optimal estimate at. Figure 3-19 shows and, where and; the difference in the two distributions shows the danger of using the optimal estimate at with outliers present.
Figure 3-16: Sigma comparison where and
Figure 3-17: Sigma comparison at where and
Figure 3-18: Probability density functions at where and
Figure 3-19: Probability density functions at with where and
Denote by the power of the robust decision rule obtained by independently estimating and then using the robust decision rule given by (Eq. 3.3-43) at the density pair; is found using the integral in (Eq. 3.3-54). Denote by the power of the robust decision rule obtained by independently estimating and then using the robust decision rule given by (Eq. 2.2-1) at the density pair; is found using an integral of the same form as (Eq. 3.3-54), where the robust conditional power and the robust probability density replace their non-robust equivalents. Figure 3-20 shows and, where and, as a function of. The difference between the two curves shows the performance loss induced by the robust decision rule designed at the highest permissible frequency of outliers, as compared to the optimal decision rule, both in the absence of outliers. Figure 3-21 shows and as functions of, where and. The difference between the two curves shows the performance loss from using the optimal decision rule in the presence of outliers, as compared to the robust decision rule.
Figure 3-20: Power curves at where and
Figure 3-21: Power curves at where and
4 Thesis Conclusions
4.1 Summary of Findings
In Section 2 we developed the Neyman-Pearson detection rules for an FSK modulated signal that is transmitted via a channel modeled by additive white Gaussian noise and is filtered by an ideal brick-wall filter, for three cases: 1) when the waveform is known completely (coherent detection), 2) when the waveform is known except for a random initial phase (incoherent detection), and 3) when only the energy of the waveform is known (energy detection). To start, we used only first and second order statistics to compare the three detectors. Then we compared the three detectors using full statistical descriptions of the involved waveforms. While the first and second order statistical analysis led to simple and often quoted performance comparison figures, its results were expectedly highly inferior to those induced by the full statistical analysis. The full statistical analysis showed that the performance of the three detectors is generally closer than the indications induced by the first and second order statistics analysis. In Section 3, we developed the robust detectors for the three cases cited above, where the classes for both the null and emphatic hypothesis are the Huber classes. For the coherent detection case, we considered two outlier models: 1) when outliers occur over complex data pairs, and 2) when outliers occur independently per datum. We derived analytic expressions for their asymptotic performance when the outliers are drawn from a Gaussian distribution. Finally, for various models, we compared their performance to that of the detector that is optimal at the nominal distribution pair (the non-robust detector), as the latter was developed in the previous section. For the energy detection case, we similarly derived the robust detector and compared it to the non-robust detector found in the previous section.
In general, in the absence of outliers, our analysis showed slightly reduced performance for the robust detectors as compared to that
of the non-robust detectors. In contrast, in the presence of outliers, our analysis showed significantly reduced performance for the non-robust detectors, as compared to that of the robust detectors. Both the coherent and the energy detection generating processes considered were memoryless; as a result, the robust detection schemes were developed using existing theorems and are solutions to saddle-point optimization problems. However, the incoherent detection generating process is not memoryless, and no known theorems could be used to derive the optimal detection scheme. Despite this, we were able to develop an intuitively pleasing robust detector for the incoherent case. Our robust detector first used the data to derive a robust initial phase estimate; using the latter phase estimate, our detector subsequently used the robust detection methods established for the coherent case. Due to the interdependence between the phase estimate and the subsequent detection, we did not find analytic expressions for the asymptotic performance of our detector. We derived analytic expressions for the asymptotic performance of a slightly modified incoherent robust detector, however, which removed the interdependence between the phase estimate and the detection steps. We compared the latter analytic expressions to non-robust versions in both the presence and the absence of outliers. In the absence of outliers, these comparisons showed a slight performance loss when the robust detector is used, as compared to the case when the non-robust detector is used instead. In the presence of outliers, the robust detector significantly outperformed the non-robust detector. While our analysis focused specifically on the detection of FSK signals, Section 2 highlighted the importance of using full statistical models when different detectors are compared, and Section 3 highlighted the danger of relying on non-robust detectors in the presence of outliers.
Non-robust detectors rely on complete knowledge of the underlying data-generating distributions. In general, for all real-world problems, full knowledge of the data-generating processes is not feasible. Our analysis may derive the generating distributions from approximate physics-based models. Otherwise, our analysis may attempt to estimate these distributions using data observations. In either case, the generating distributions are only approximately known. For the latter case, in particular, the tails of the distributions are difficult to estimate correctly. Robust detection performs well in the presence of outliers, which occur at extreme values represented by the tails of our nominal distribution. In contrast, non-robust detection performance breaks down in the presence of even infrequent outliers. We therefore conclude that robust detectors should always be used over non-robust detectors for all real-world problems.
4.2 Future Research
There are several areas to expand and further the research presented in this thesis. Future work could examine the non-robust and robust detection schemes of non-FSK signals. While many of the results in this thesis extend to non-FSK band-limited signals with constant energy, many of the results need to be modified to address signals with time-varying energy, such as Amplitude Modulated signals. In addition, this thesis examined detection under the assumption of an ideal brick-wall filter; future work could examine the detection schemes of signals using other filter models. Of particular interest are filters which are found in receiver hardware, such as Finite Impulse Response filters. The current thesis focused on Neyman-Pearson detection schemes where the number of samples for a given detection scheme is fixed. Future work could examine the performance of Sequential Detection Testing for both the non-robust and robust cases. The robust detection schemes performed data truncation based on the frequency of outliers. For most real-world applications, the frequency of outliers will not be known a priori, and may not be constant.
Future work could examine methods to optimally estimate this frequency, and update this estimate as needed. Future work could also examine performance of the various detection schemes on simulated and real data. This performance could be compared to the
analytic performance presented in this thesis. Some of the performance curves found in this thesis were based on the assumption of a large number of samples. Future work could do performance analysis assuming a small number of samples. While this thesis examined detection of FSK signals, future work could examine optimal schemes to demodulate FSK signals, with particular interest in robust demodulation algorithms. Finally, future research could examine ways to make FSK signals more robust to extreme outliers, thereby increasing the performance of robust detection and demodulation algorithms.
BIBLIOGRAPHY
[1] P. Papantoni-Kazakos and I. M. Paz, "The Performance of a Digital FM System with Discriminator: Intersymbol Interference Effects," IEEE Transactions on Communications, vol. 23, no. 9, pp. 867-878, September 1975.
[2] J. G. Proakis and M. Salehi, Digital Communications, New York: McGraw-Hill, 2008, pp. 212-214.
[3] D. Kazakos and P. Papantoni-Kazakos, Detection and Estimation, New York: Computer Science Press, 1990.
[4] T. Papantoni, P. Papantoni-Kazakos and A. Burrell, "Robust Sequential Algorithms For the Detection of Changes In Data Generating Processes," University of Colorado Denver, Denver, 2009.
Appendix A: Extension of Huber Robust Detection
Let and be two disjoint classes of n-dimensional density functions, and let and represent respectively hypothesis and. Denote by a set of density functions within the class. Assume that is generated by a single set of density functions in either or, denoted. Finally, assume that and contain only memoryless processes. Then, given an observation vector and some, we have (Eq. A-1) Given some pair, our non-robust decision rule is (Eq. A-2) where are selected to satisfy the false alarm constraint. If some pair exists with decision rule such that (Eq. A-3) and (Eq. A-4) then is the robust decision rule.
Lemma A.1: Let there exist a least favorable pair for every under a single-sample hypothesis test, let and contain only memoryless processes, and let each least favorable pair satisfy (Eq. A-5) Then the pair is the least favorable density pair over and. In other words, if a process is memoryless, the least favorable density pair can be found by taking the least favorable density pair over each data sample, conditioned on each least favorable density pair satisfying (Eq. A-5).
Proof: The condition in (Eq. A-5) implies that, given and a random variable, non-decreasing functions and exist such that and the functions coincide respectively with the probabilities found in (Eq. A-5). The proof then follows the proof found in Detection and Estimation (Lemma 6.3.1, pp. 172-3) [3], replacing instances of with. If and (Eq. A-6)
(Eq. A-7) where is the class of all one-dimensional density functions within, then the least favorable density pair is (Eq. A-8) (Eq. A-9) where are such that (Eq. A-10) (Eq. A-11) If we consider a robust detection rule over and, where and are classes defined by (Eq. A-6) and (Eq. A-7), and where and contain only memoryless processes, then, due to Lemma A.1 and (Eq. A-8) and (Eq. A-9), our robust detection rule is (Eq. A-12)
where are selected to satisfy the false alarm constraint, and where (Eq. A-13) (Eq. A-14)
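The rule in (Eq. A-12)-(Eq. A-14) censors each per-sample likelihood ratio between two constants before combining across the memoryless samples. In log form this reduces to summing clipped log-likelihood ratios; a minimal sketch for a Gaussian mean-shift pair, where the clipping limits and the shift are hypothetical stand-ins for the constants fixed by the false alarm constraint:

```python
import numpy as np

rng = np.random.default_rng(5)

def robust_statistic(x, llr, lo, hi):
    """Memoryless robust test statistic: per-sample log-likelihood
    ratios censored at [lo, hi], then summed across the record."""
    return np.clip(llr(x), lo, hi).sum()

# Gaussian mean-shift pair with unit variance: llr(x) = theta*x - theta^2/2
theta = 1.0
llr = lambda x: theta * x - theta**2 / 2

x0 = rng.normal(0.0, 1.0, 500)     # record drawn under H0
x1 = rng.normal(theta, 1.0, 500)   # record drawn under H1
print(robust_statistic(x0, llr, -2.0, 2.0), robust_statistic(x1, llr, -2.0, 2.0))
```

Because each sample's contribution is bounded, no single outlier can move the statistic across the decision threshold, which is the saddle-point property the appendix extends to product classes.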
