ADAPTIVE WAVELET THRESHOLDING
IN DENOISING OF 2D SIGNALS
by
Erwin F. Siegel
B.S. Physics, University of Pittsburgh, 1980
B.S.E.E., University of Colorado at Denver, 1991
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Electrical Engineering
1999
This thesis for the Master of Science
degree by
Erwin F. Siegel
has been approved
by
Tamal Bose
Mike Radenkovic
Jan Bialasiewicz
Date
Siegel, Erwin F. (M.S., Electrical Engineering)
Adaptive Wavelet Thresholding in Denoising of 2D Signals
Thesis directed by Associate Professor Tamal Bose
ABSTRACT
In fields ranging from medical imaging to geophysics, digital communications to
astronomy, one must attempt to extract signals from noisy and incomplete data. In
all cases, the goal is to remove as much of the noise as possible without degrading
the finer-scale aspects of the data. Wavelet-based noise removal techniques have
recently emerged as attractive and powerful tools for signal extraction from
corrupted data. Denoising attempts to recover the true signal from the noisy
version that is observed. Signals are recovered by introducing a procedure that
suppresses noise by thresholding the empirical wavelet coefficients. Essential to
wavelet shrinkage signal denoising techniques is the selection of the shrinkage
threshold. The focus of this thesis is to examine and compare data-dependent
threshold selection techniques that are widely used today in wavelet shrinkage
signal denoising.
This abstract accurately represents the content of the candidate's thesis. I
recommend its publication.
Signed
Tamal Bose
ACKNOWLEDGEMENTS
My thanks to my advisor, Dr. Tamal Bose, for his patience with me during these
last two years and for his time and guidance throughout my graduate career at
UCD. Additionally, I am indebted to Dr. Todd Ogden, who provided informative
and valuable insights into his work on Data Adaptive Wavelet Thresholding and
without whose help, this thesis would not have been completed. I am also deeply
indebted to Gordon Stuck, for sharing with me his programming expertise in
Matlab, and for his willingness to always take time to discuss wavelets with me.
Finally, my deepest thanks go to my family: my wife Michaela, and my daughter
Elbe, for their support and patience throughout this task.
CONTENTS
Chapter
1. Introduction...........................................................1
2. Wavelet Theory.........................................................5
2.1 Multiresolution......................................................29
2.2 Image Denoising Techniques..........................................38
3. Wavelet Thresholding..................................................58
3.1 Selective Wavelet Reconstruction.....................................61
3.2 Global Thresholding..................................................62
3.2.1 Minimax Thresholding...............................................63
3.2.2 Universal Thresholding (VisuShrink)................................65
3.3 Level-dependent Thresholding.........................................69
3.3.1 SureShrink.........................................................70
3.3.2 Data Analytic Threshold Selection (Brownian Bridge)................82
3.3.3 Data Analytic Threshold Selection (Brownian Sheet).................91
4. Computer Simulations and Comparative Analysis.........................97
5. Conclusions..........................................................107
Appendix
A. Bibliography.........................................................109
B. Matlab Code/Simulations
FIGURES
Figure
2.1 Wavelet analysis time-frequency plot.................................6
2.2 Analysis section of a filter bank....................................8
2.3 Scaling and dilation example with Haar wavelet......................10
2.4 Two-channel filter bank.............................................16
2.5 Alias cancellation conditions.......................................21
2.6 Alternating flip and Smith-Barnwell conditions......................26
2.7 Daubechies db3 wavelet: scaling and wavelet functions...............28
2.8 Daubechies db3 wavelet: filter designs..............................28
2.9 Original Lena image (512 x 512).....................................29
2.10 Single level decomposition of Lena image with Daubechies db3 wavelet....30
2.11 Two-dimensional single level decomposition format...................30
2.12 Wavelet coefficients from single level decomposition (vector form)..31
2.13 Two level decomposition of Lena image with Daubechies db3 wavelet...32
2.14 Two-dimensional two level decomposition format......................32
2.15 Wavelet coefficients from two level decomposition (vector form).....33
2.16 Three level decomposition of Lena image with Daubechies db3 wavelet.34
2.17 Two-dimensional three level decomposition format....................34
2.18 Wavelet coefficients from three level decomposition (vector form)......35
2.19 Two dimensional DWT decomposition step...............................36
2.20 Two dimensional IDWT reconstruction step.............................37
2.21 Categories of image restoration......................................39
2.22 Original and noisy versions of the Lena image........................39
2.23 Lowpass filtering of noisy Lena image................................41
2.24 Median filtering of Lena image corrupted by salt and pepper noise....42
2.25 Median filtering of Lena image corrupted by white noise..............43
2.26 Median filtering of Lena image corrupted by the combination of salt and
pepper noise and white noise..............................................44
2.27 Power spectral density estimation for Wiener filter..................46
2.28 Noise reduction by LSIV and LSV Wiener filtering.....................48
2.29 Wiener filter block diagram..........................................50
2.30 Image blurring block diagram.........................................52
2.31 Inverse filtering block diagram......................................53
2.32 Multiplicative noise reduction block diagram.........................55
3.1 Hard and soft thresholding functions..................................58
3.2 Blocky function: Global Universal and Minimax methods.................65
3.3 SURE thresholding algorithm...........................................74
3.4 SURE criterion for Blocky function (Level 6)..........................76
3.5 SURE criterion for Blocky function (Level 3)..........................77
3.6 Blocky function: SURE and Hybrid SURE methods.........................80
3.7 Heavy Sine function: SURE and Hybrid SURE methods.....................80
3.8 Brownian bridge process behavior......................................85
3.9 Blocky function (Level 4): denoising results of Brownian Bridge.......87
3.10 Change-point analysis in Brownian bridge process.....................88
3.11 Blocky function: denoising results of Brownian Bridge................89
3.12 Blocky function: comparisons of various denoising techniques:
VisuShrink, SureShrink, and Brownian Bridge...............................90
3.13 Data input into formation of Brownian Sheet..........................94
3.14 Brownian Sheet: no significant coefficients removed..................95
3.15 Brownian Sheet: all significant coefficients removed.................95
3.16 Noisy horizontal wavelet coefficients from Level 1 decomposition.....96
3.17 Denoised horizontal wavelet coefficients from Level 1 decomposition..96
4.1 Original and noisy versions of Lena image (with wavelet coefficients)..97
4.2 Level 3 decomposition of original and noisy Lena image................98
4.3 Scaling and wavelet functions of db3 and Haar wavelets..................99
4.4 Original and noisy versions of Box function (with wavelet coefficients).100
4.5 Denoising results for Lena image corrupted by additive white noise:
VisuShrink, SureShrink, and Brownian Sheet (alpha = 0.999) methods (db3)...101
4.6 Plot of wavelet coefficients of Figure 4.5............................102
4.7 Denoising results for Lena image corrupted by additive white noise:
VisuShrink, SureShrink, and Brownian Sheet (alpha = 0.05) methods (db3)....103
4.8 Denoising results for Box function corrupted by additive white noise:
VisuShrink, SureShrink, and Brownian Sheet (alpha = 0.05) methods (Haar)...105
4.9 Denoising results for Box function corrupted by additive white noise:
Brownian Sheet (alpha = 0.05, 0.5, 0.9, and 0.999) method (db3)............106
TABLES
Table
3.1 Minimax thresholds for various sample sizes..............................64
3.2 Critical points for finite-sample Brownian Sheet process.................93
1. Introduction
One of the fundamental problems we face every day in the fields of data
analysis and signal processing is how best to extract a signal from noisy and
corrupted data. This problem has recently been addressed with the use of
wavelets. The power of wavelet analysis lies in the fact that wavelets have certain
mathematical properties that enable them to parse data into different frequency
components, and then analyze each component with a resolution matched to its
scale. Temporal analysis is performed with contracted or higher frequency
versions of the mother wavelet, while frequency analysis is performed with dilated
or lower frequency versions of the same mother wavelet. Through a process of
scaling, wavelets are compressed or stretched to more accurately analyze a
particular feature of a signal. Wavelets can also be translated or shifted in time.
The transformation of a time series into its temporal and frequency components is
achieved by the use of the wavelet transform. An interesting characteristic of the
wavelet transform is that resulting individual wavelet functions are localized in
space. Fourier sine and cosine functions are not. This localization feature, along
with the wavelet's localization in frequency, results in sparse representations when
functions are transformed into the wavelet domain. This sparseness feature of wavelet
coefficients results in a number of useful applications such as data compression
and denoising.
Wavelets can be used to decompose a signal into different levels of
resolution. This is where wavelet shrinkage techniques have been found to be very
effective in extracting signals from noisy data. Wavelet shrinkage methods select
which wavelet coefficients are to be used in the reconstruction of a signal based
on thresholding. Wavelet shrinkage refers to a signal reconstruction obtained by
first performing a multilevel decomposition, followed by shrinking the wavelet
coefficients towards zero, and then finally reconstructing the signal by the use of
inverse wavelet transformation. Of concern to wavelet shrinkage signal denoising
is the choice of the threshold value that will be used to shrink the wavelet
coefficients.
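The decompose-shrink-reconstruct procedure just described can be sketched in a few lines. The following is an illustrative Python sketch (the thesis simulations themselves use MATLAB), built on a single-level Haar transform and the soft-threshold rule discussed later in Chapter 3; the function names are this sketch's own, and an even-length signal is assumed.

```python
import math

def haar_dwt(x):
    # One-level Haar analysis: lowpass averages and highpass differences,
    # each downsampled by two (orthonormal 1/sqrt(2) normalization).
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    # One-level Haar synthesis: exact inverse of the analysis step.
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([s * (a + d), s * (a - d)])
    return x

def soft_threshold(coeffs, t):
    # Shrink each coefficient towards zero by the threshold t.
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(x, t):
    # Wavelet shrinkage: decompose, shrink the detail coefficients, reconstruct.
    approx, detail = haar_dwt(x)
    return haar_idwt(approx, soft_threshold(detail, t))
```

With t = 0 the signal is reconstructed exactly; with a small positive threshold, small (noise-like) detail coefficients are zeroed and the reconstruction is locally smoothed.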
The goal of this thesis is to examine a relatively new data adaptive method
for selecting thresholds for wavelet shrinkage denoising of signals. This technique
is distinct from other existing data-dependent threshold selection techniques
because it considers both the magnitude and location of empirical wavelet
coefficients when determining the threshold λ.
Chapter 2 examines the wavelet transform and its ability to represent a
signal at different locations and scales. Multiresolution analysis is also introduced
in this chapter. Multiresolution has the capability of zooming in or zooming
out, depending upon the level of examination or focus that is desired. Traditional
methods of image denoising are introduced and compared in this chapter as well.
The objective of denoising is to recover the signal or to remove noise without
reference to an unknown original signal.
Chapter 3 begins with a discussion on soft and hard thresholding, two
methods that are used to apply a threshold to a set of empirical wavelet
coefficients. Thresholding of empirical wavelet coefficients is designed to
distinguish between those coefficients that belong in a reconstruction (i.e.,
coefficients that contain significant energy) and those that do not. Ideally,
thresholding should only affect the noise without disturbing the signal. More
important than the actual method of applying a threshold is how the threshold
itself is derived. Global versus level-dependent thresholding techniques are
examined in detail and the results of various simulations are presented.
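The two thresholding rules mentioned above can be stated as scalar functions. A minimal Python sketch (illustrative; the thesis code is in MATLAB) of the hard and soft rules applied to a single coefficient c with threshold t:

```python
def hard_threshold(c, t):
    # Keep-or-kill rule: a coefficient at or below the threshold is zeroed,
    # anything above it passes through unchanged.
    return c if abs(c) > t else 0.0

def soft_threshold(c, t):
    # Shrink-or-kill rule: surviving coefficients are also pulled towards
    # zero by t, removing the jump the hard rule has at |c| = t.
    if abs(c) <= t:
        return 0.0
    return c - t if c > 0 else c + t
```

Soft thresholding trades a small bias on large coefficients for a continuous shrinkage function, which is why it tends to produce visually smoother reconstructions.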
In Chapter 4, the results of MATLAB computer simulations are displayed.
Threshold selection methods discussed in Chapter 3 were implemented and tested
on both one-dimensional signals and two-dimensional images corrupted by additive
Gaussian white noise. Two-dimensional images include the famous Lena image
and a 64x64 box function. The Blocky function, first introduced by Donoho and
Johnstone [1], was used for the one-dimensional simulations due to its portrayal
as a signal that contains significant spatial inhomogeneity and multiple jumps.
Daubechies' six-coefficient orthogonal wavelet (db3) and the Haar wavelet were
used in all simulations. The results are presented and compared both from a visual
standpoint as well as from SNR calculations. With the exception of the one-
dimensional Brownian Bridge algorithm, all remaining one-dimensional denoising
simulations were performed with functions from the MATLAB Wavelet Toolbox.
All two-dimensional denoising simulations were written by the author to run in
MATLAB.
Chapter 5 concludes this thesis with a discussion on the experimental
results and the versatility of the recursive Brownian Sheet technique that allows
the user to control the smoothness of the reconstructed signal.
2. Wavelet Theory
Mathematical transformations are applied to signals in order to obtain
additional information about the signal that is not readily available in its original
form. For example, the Fourier Transform of a time domain signal results in the
frequencyamplitude representation of that signal. The output tells us how much
of each frequency is present in the original signal. However, it does not tell us
when in time these frequency components exist. When the time localization of the
spectral components is needed, a transform that gives the time-frequency
representation of the signal is required. The Wavelet Transform makes it possible
to have it both ways; it allows you to have both the time and frequency
information at the same time. The transformation of a time domain signal into its
temporal and frequency components is provided by the Wavelet Transform.
The fundamental difference between the Fourier Transform and the
Wavelet Transform is resolution. The Fourier Transform has constant resolution
at all times and frequencies. The Wavelet Transform has good time and poor
frequency resolution at high frequencies, and good frequency but poor time
resolution at low frequencies. This is certainly acceptable, though, as the added
advantage is the capability of multiresolutional analysis provided by the Wavelet
Transform. The illustration in Figure 2.1 is commonly used to explain how the
time and frequency resolutions are interpreted.

Figure 2.1: Wavelet analysis time-frequency plot (time-scale)

Note that at low frequencies the
heights of the boxes are shorter, which corresponds to better frequency resolution,
but their widths are longer, which corresponds to poorer time resolution. At higher
frequencies the widths of the boxes decrease, which corresponds to better time
resolution, and the heights of the boxes increase as the frequency resolution gets
poorer.
Central to and intrinsically connected to multiresolution analysis are
wavelets. Wavelets have the unique ability to analyze according to the desired
scale, thus allowing signals to be examined using widely varying levels of focus.
In this section, the implementation of the Wavelet Transform is accomplished by
examining a perfect reconstruction filter bank. The scaling and wavelet functions
are given and the Haar wavelet will be introduced to provide insight into the
dilation and translation operations. Wavelets are created from a mother wavelet
through these two operations and can be expressed as
where j is the dilation index and k is the translation index. The connection
between wavelets and filter banks was made by Mallat [12].
(2.1)
7
As seen in Figure 2.2, a filter bank is comprised of a set of filters and sampling
operations. Only the Analysis section of the filter bank is shown here, where the
input signal is decomposed or divided into different scales of resolution.

Figure 2.2: The Analysis section of the filter bank

The
Synthesis section follows the Analysis section and is simply the inverse of the
Analysis bank. The signal x(t) enters the Analysis section and is operated upon by
two filters, H0 and H1. These filters separate the signal into different frequency
bands. H0, the lowpass filter, takes the averages of the signal, while H1, the
highpass filter, takes the differences. The first level of decomposition has been
completed after the outputs of the filters have been decimated, or downsampled by
two. The highpass filtered outputs, called wavelet transform coefficients, are not
passed on to the next level of filtering and will be used later. The lowpass outputs
are passed into the next level of decomposition. This process continues until it is
determined that further decomposition of the signal is not beneficial to the
analysis. Typical decomposition depths are five levels for a one-dimensional signal
and three for a two-dimensional signal.
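The filter-and-downsample cascade described above can be sketched directly. This is an illustrative Python version (the thesis code is in MATLAB); circular convolution is assumed at the signal boundaries, and the filters are passed in as coefficient lists.

```python
import math

def filter_downsample(x, h):
    # Circular convolution with filter h, then decimation by two
    # (keep the even-indexed samples).
    n = len(x)
    y = [sum(h[k] * x[(i - k) % n] for k in range(len(h))) for i in range(n)]
    return y[::2]

def analyze(x, h0, h1, levels):
    # Analysis bank: at each level, store the highpass (wavelet) coefficients
    # and pass the lowpass (approximation) output on to the next level.
    details = []
    approx = list(x)
    for _ in range(levels):
        details.append(filter_downsample(approx, h1))
        approx = filter_downsample(approx, h0)
    return approx, details
```

With the Haar filters h0 = [s, s] and h1 = [s, -s], where s = 1/sqrt(2), a constant signal produces zero detail coefficients at every level, since the highpass filter takes differences.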
As stated earlier, there is a direct relationship between filter banks and
wavelets. The lowpass filter H0, which provides the approximation coefficients,
leads to the scaling function φ(t). The highpass filter H1, which is associated
with the detail coefficients, produces the wavelet function w(t). Typically, the
scaling function is obtained before the wavelet function w(t), and only after many
iterations of the lowpass filter and downsampling. The wavelet function also
comes from the last decomposition level, but after one application of the highpass
filter. The dilation equation that produces φ(t) can be written as

φ(t) = 2 Σ_{k=0}^{N} h0(k) φ(2t - k).   (2.2)
The wavelet equation follows immediately from the scaling function and can be
written as

w(t) = 2 Σ_k h1(k) φ(2t - k).   (2.3)
The importance of the translation and dilation operations will be demonstrated
through the introduction of the Haar wavelet. The example in Figure 2.3 will
graphically illustrate the relationship between scale and dilation and translation.
Figure 2.3: Scaling and dilation example using the Haar mother wavelet.
Wavelets are derived from the mother wavelet as w_jk(t) = w(2^j t - k), where
the factor 2^j compresses (scales) and the index k shifts (translates).
By definition, φ(t) = 1 over the interval [0,1] and 0 elsewhere. It then follows that

φ(t) = 2[h0(0)φ(2t) + h0(1)φ(2t - 1)].   (2.4)

Assume for the Haar case that the lowpass filter is defined as

h0 = [1/2  1/2].   (2.5)

The scaling function φ(t) can then be written as

φ(t) = φ(2t) + φ(2t - 1).   (2.6)

Assume for the Haar case that the highpass filter is defined as

h1 = [1/2  -1/2].   (2.7)

It follows that w(t) can be written as

w(t) = 2[h1(0)φ(2t) + h1(1)φ(2t - 1)]   (2.8)

and

w(t) = φ(2t) - φ(2t - 1).   (2.9)
The results agree with the graphical example in Figure 2.3. In order to solve the
dilation equation

φ(t) = 2 Σ_k h(k) φ(2t - k),   (2.10)

apply the continuous Fourier Transform to both sides to obtain

Φ(ω) = ∫ φ(t) e^{-iωt} dt   (2.11)

and

Φ(ω) = 2 Σ_k h(k) ∫ φ(2t - k) e^{-iωt} dt.   (2.12)

Using substitution by change of variables, let

y = 2t - k,
2t = y + k,
t = (1/2)(y + k),
dt = (1/2) dy,

to obtain

Φ(ω) = Σ_k h(k) ∫ φ(y) e^{-iω(y+k)/2} dy   (2.13)

and

Φ(ω) = Σ_k h(k) e^{-iωk/2} ∫ φ(y) e^{-iωy/2} dy.   (2.14)

Now define

Φ(ω/2) = ∫ φ(y) e^{-i(ω/2)y} dy   (2.15)

and

H(ω/2) = Σ_k h(k) e^{-i(ω/2)k}.   (2.16)

Continuing, we finally get

Φ(ω) = H(ω/2) Φ(ω/2).   (2.17)

This is a recursive process and by repeating the iteration, we get

Φ(ω) = H(ω/2) H(ω/4) Φ(ω/4),   (2.18)

Φ(ω) = H(ω/2) H(ω/4) H(ω/8) Φ(ω/8),   (2.19)

Φ(ω) = [Π_{j=1}^{M} H(ω/2^j)] Φ(ω/2^M),   (2.20)

and, in the limit,

Φ(ω) = Π_{j=1}^{∞} H(ω/2^j) Φ(0).   (2.21)

If h0(k) is known, (2.21) becomes

Φ(ω) = Π_{j=1}^{∞} H0(ω/2^j) Φ(0).   (2.22)
The scaling function Φ(ω) is obtained after an infinite sequence of filtering and
downsampling. Fundamental to the development of the scaling function is the
design of H0(z). How are its coefficients determined? In addition, referring to
Figure 2.2, what is its relationship to H1(z) and the remaining design
requirements of the filter bank? These questions will be answered in the following
section.
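The truncated infinite product of (2.21) is easy to evaluate numerically. The Python sketch below (illustrative; not from the thesis code) uses the Haar lowpass filter h = [1/2, 1/2], for which the limit is known in closed form: |Φ(ω)| = |sin(ω/2) / (ω/2)|, the spectrum of the box-shaped Haar scaling function.

```python
import cmath
import math

def H(w):
    # Frequency response of the Haar lowpass filter h = [1/2, 1/2]:
    # H(w) = sum_k h(k) e^{-iwk} = (1 + e^{-iw}) / 2.
    return (1 + cmath.exp(-1j * w)) / 2

def Phi(w, terms=40):
    # Truncate the infinite product Phi(w) = prod_{j>=1} H(w / 2^j) * Phi(0),
    # taking Phi(0) = 1. The arguments w/2^j shrink geometrically, so the
    # truncated product converges very quickly.
    p = 1.0 + 0.0j
    for j in range(1, terms + 1):
        p *= H(w / 2 ** j)
    return p
```

At ω = π the closed form gives |Φ(π)| = sin(π/2)/(π/2) = 2/π, which the truncated product reproduces to high accuracy.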
Figure 2.4: Two-channel filter bank
A two-channel filter bank, with both the Analysis and Synthesis sections shown,
is depicted in Figure 2.4. The role of a filter bank is to separate an input signal
into frequency bands by filtering and downsampling, and then reconstruct it by
upsampling and filtering. The signal x(n) enters the filter bank from the left and
passes into the Analysis bank. The signal is acted upon by a lowpass filter H0(z)
and a highpass filter H1(z). Both filtered versions are then downsampled by two.
This results in the removal of odd-numbered samples and two half-length outputs.
The gap in the center of Figure 2.4 indicates where the downsampled signals may
be compressed, enhanced, or coded for storage or transmission. For the
development of this topic, we will assume that this gap is closed. The signals are
then passed into the Synthesis bank, where they are upsampled by two. This
results in the insertion of zeros in the odd-numbered locations and two double-length
outputs. For the filter bank to exhibit perfect reconstruction with l time delays,
x̂(n) = x(n - l). This is equivalent to stating in the z domain that
X̂(z) = z^{-l} X(z). This occurs when there is no aliasing and no distortion of the
input signal. The following derivation will shed some light on these two
requirements. From the filter bank in Figure 2.4, it can be written that
Y0(z) = H0(z) X(z)   (2.23)

and, after downsampling,

V0(z) = (1/2)[Y0(z^{1/2}) + Y0(-z^{1/2})],   (2.24)

and substituting (2.23) into (2.24) to obtain

V0(z) = (1/2)[H0(z^{1/2}) X(z^{1/2}) + H0(-z^{1/2}) X(-z^{1/2})],   (2.25)

U0(z) = V0(z^2) = (1/2)[H0(z) X(z) + H0(-z) X(-z)],   (2.26)

and

W0(z) = (1/2)[F0(z) H0(z) X(z) + F0(z) H0(-z) X(-z)].   (2.27)

The same derivation techniques can be used for the lower branch of the Analysis
bank to solve for

W1(z) = (1/2)[F1(z) H1(z) X(z) + F1(z) H1(-z) X(-z)].   (2.28)

Adding the expressions for W0(z) (2.27) and W1(z) (2.28),

X̂(z) = W0(z) + W1(z).   (2.29)

Now, collect like terms and simplify to obtain

X̂(z) = (1/2)[F0(z) H0(z) + F1(z) H1(z)] X(z)
      + (1/2)[F0(z) H0(-z) + F1(z) H1(-z)] X(-z).   (2.30)

As stated earlier, for perfect reconstruction with l time delays, X̂(z) = z^{-l} X(z).
The distortion term in (2.30) must be set equal to z^{-l}, so

(1/2)[F0(z) H0(z) + F1(z) H1(z)] = z^{-l},   (2.31)

[F0(z) H0(z) + F1(z) H1(z)] = 2 z^{-l},   (2.32)

while the alias term must be set equal to zero:

[F0(z) H0(-z) + F1(z) H1(-z)] = 0.   (2.33)
Perfect reconstruction is guaranteed for a two-channel filter bank when two
conditions are met. First, the no-distortion requirement is given by

F0(z) H0(z) + F1(z) H1(z) = 2 z^{-l}.   (2.34)

The second condition, alias cancellation, can be written as

F0(z) H0(-z) + F1(z) H1(-z) = 0.   (2.35)

The filters F0, H0, H1, and F1 need to be designed so that these conditions are
satisfied. From (2.35) above, it can be shown that alias cancellation occurs and
(2.33) is satisfied when

F0(z) = H1(-z)   (2.36)
19
and
f,(z) = W0(z). (2.37)
Substituting (2.36) and (2.37) into (2.33) will verify that these filter relationships
remove aliasing in the filter bank. The conditions put forth in (2.36) and (2.37)
permit examination of the relationship of the filters coefficients. Assume
H(z) b0 + b{z~] + b2z~2 + 63z~3 +... => [60 b] b2 b3 ...], (2.38)
7/(z) = b0 6,z"' + b2z~2 b3z~3 +...=> \b0 bx b2 b3 ...], (2.39)
and
H(z) = b0 +6,z~' b2z~2 +b3z~3 [60 bx b2b3...]. (2.40)
The output vector associated with (2.39) indicates that oddnumbered coefficients
change sign for the relationship F0(z) = //, (z). Similarly, (2.40) shows that
F] (z) = H0(z) results in the evennumbered coefficients changing sign. This is
shown graphically in Figure 2.5.
Figure 2.5: Alternating signs pattern of a two-channel filter bank
To complete the analysis and to satisfy (2.34), define the product filters P0(z) and
P1(z) as

P0(z) = F0(z) H0(z)   (2.41)

and

P1(z) = F1(z) H1(z).   (2.42)

Substituting directly allows us to find that

P1(z) = -P0(-z).   (2.43)

The no-distortion requirement of (2.32) is manipulated to obtain

F0(z) H0(z) + F1(z) H1(z) = 2 z^{-l},   (2.44)

F0(z) H0(z) - F0(-z) H0(-z) = P0(z) - P0(-z),   (2.45)

and

P0(z) - P0(-z) = 2 z^{-l}.   (2.46)

Further examination of (2.46) leads to the fact that l must be odd and all
of the odd powers in P0(z) must be equal to zero. The exception is z^{-l}, which has
a coefficient of one. Simplifying (2.46), we let

P(z) = z^{l} P0(z)   (2.47)

and multiply (2.46) by z^{l} to obtain

z^{l} P0(z) - z^{l} P0(-z) = 2,   (2.48)

P(z) + (-z)^{l} P0(-z) = 2,   (2.49)

and

P(z) + P(-z) = 2.   (2.50)
To summarize, performing this analysis for a two-channel perfect reconstruction
filter bank yields a design that can be reduced to a few steps:

Step 1: Design a halfband lowpass filter P(z) such that P(z) + P(-z) = 2.

Step 2: Factor P0(z) into F0(z)H0(z), then use (2.36) and (2.37) to find
F1(z) and H1(z).

Note that the length of P0(z), by definition, determines the sum of the lengths of
F0(z) and H0(z). Factoring P0(z) into F0(z)H0(z) can be done several ways.
One such method will result in linear phase filters, another in orthogonal filters.
So, for the perfect reconstruction condition to be satisfied, P(z) must be a
halfband filter. All even powers in P(z) are zero, except for the constant term,
which is equal to one. The odd terms will dictate the filter design.
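Both the no-distortion condition (2.32) and the alias-cancellation condition (2.33) can be checked numerically for a concrete design. The Python sketch below (illustrative; not from the thesis code) verifies the Haar filter bank by polynomial arithmetic in z^{-1}; the synthesis filters are scaled by two here so that the distortion term comes out exactly 2z^{-1}.

```python
def polymul(a, b):
    # Multiply two polynomials in z^{-1}; [c0, c1, ...] means c0 + c1 z^{-1} + ...
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def polyadd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
            for i in range(n)]

def alternate(h):
    # H(-z): negate the odd-numbered coefficients.
    return [(-1) ** k * c for k, c in enumerate(h)]

# Haar analysis filters and the alias-cancelling synthesis filters,
# F0(z) = H1(-z) and F1(z) = -H0(-z), scaled by two to satisfy (2.32).
H0 = [0.5, 0.5]
H1 = [0.5, -0.5]
F0 = [2 * c for c in alternate(H1)]
F1 = [-2 * c for c in alternate(H0)]

# Distortion term should be 2 z^{-1}; alias term should vanish identically.
distortion = polyadd(polymul(F0, H0), polymul(F1, H1))
alias = polyadd(polymul(F0, alternate(H0)), polymul(F1, alternate(H1)))
```

The distortion polynomial evaluates to [0, 2, 0], i.e. 2z^{-1} with the odd delay l = 1, and the alias polynomial is identically zero.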
We still need to focus on the design of the filters H0(z), H1(z), F0(z),
and F1(z). Typically H0(z) is designed first, then H1(z) follows. Early work on
this topic was done by Croisier, Esteban, and Galand in 1977. Their choice dealt
with alternating signs and is described by

H1(z) = H0(-z).   (2.51)

The resulting filters were called QMF (Quadrature Mirror Filters). This design
satisfies the alias cancellation criteria, but does not result in an orthogonal filter
bank. Further work by Smith and Barnwell in 1984 proved to be more rewarding
as they chose

H1(z) = -z^{-N} H0(-z^{-1}).   (2.52)

This is known as the alternating flip method (see Figure 2.6) and will lead to the
design of orthogonal filters when H0(z) is correctly chosen. If the design of
H0(z) leads to perfect reconstruction in the alternating flip filter bank, it will also
result in orthogonality. Rewriting (2.52),
H1(z) = -z^{-N} H0(-z^{-1}),   (2.53)

we can now expand it in the same format as (2.38) through (2.40) to get

H0(z) = b0 + b1 z^{-1} + b2 z^{-2} + b3 z^{-3} + ... => [b0 b1 b2 b3 ...],   (2.54)

H0(z^{-1}) = b0 + b1 z + b2 z^2 + b3 z^3 + ... => [b0 b1 b2 b3 ...],   (2.55)

H0(-z^{-1}) = b0 - b1 z + b2 z^2 - b3 z^3 + ... => [b0 -b1 b2 -b3 ...],   (2.56)

z^{-3} H0(-z^{-1}) = -b3 + b2 z^{-1} - b1 z^{-2} + b0 z^{-3} + ... => [-b3 b2 -b1 b0 ...],   (2.57)

and finally

-z^{-3} H0(-z^{-1}) = b3 - b2 z^{-1} + b1 z^{-2} - b0 z^{-3} + ... => [b3 -b2 b1 -b0 ...].   (2.58)
Figure 2.6: Alternating flip for orthogonality and Smith-Barnwell conditions for
alias cancellation
Substituting (2.36) into (2.41) yields

P0(z) = H1(-z) H0(z).   (2.59)

We need

H1(-z) = -(-z)^{-N} H0(z^{-1}),   (2.60)

and for N odd, (2.60) becomes

H1(-z) = z^{-N} H0(z^{-1}),   (2.61)

and the product filter can then be written as

P0(z) = z^{-N} H0(z^{-1}) H0(z),   (2.62)

which will result in an orthogonal filter and symmetric factorization.
To summarize the design of halfband filters and orthogonal filter banks:

- Design a halfband filter P(z) such that P(z) + P(-z) = 2. This will guarantee
perfect reconstruction. Define P0(z) = z^{-l} P(z).

- Factorize P0(z) = F0(z)H0(z) to find F0(z) and H0(z). Several combinations
are possible. Alias cancellation requires that F0(z) = H1(-z) and
F1(z) = -H0(-z); use these Smith-Barnwell conditions to find F1(z) and H1(z).

- If H1(z) is the alternating flip of H0(z), then P0(z) = z^{-N} H0(z^{-1}) H0(z),
and this leads to an orthogonal filter bank and a symmetric factorization.
Figure 2.7: Daubechies db3 scaling function (phi) and wavelet function (psi)

Figure 2.8: Daubechies db3 wavelet filter designs. Impulse and magnitude
responses of the decomposition filters (Lo_D, Hi_D) and reconstruction filters
(Lo_R, Hi_R):
H0(z) = [0.0352  -0.0854  -0.1350  0.4599  0.8069  0.3327]
H1(z) = [-0.3327  0.8069  -0.4599  -0.1350  0.0854  0.0352]
F0(z) = [0.3327  0.8069  0.4599  -0.1350  -0.0854  0.0352]
F1(z) = [0.0352  0.0854  -0.1350  -0.4599  0.8069  -0.3327]
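The alternating flip is equally simple in code: reverse the lowpass coefficients and negate every other one (the overall sign is a matter of convention). The Python sketch below (illustrative; not from the thesis code) applies it to the db3 lowpass filter, rounded to four decimals, and checks that the resulting pair of filters is orthogonal.

```python
def alternating_flip(h0):
    # H1(z) = -z^{-N} H0(-z^{-1}), up to an overall sign convention:
    # reverse the coefficients of H0 and negate every other one.
    return [(-1) ** (k + 1) * c for k, c in enumerate(reversed(h0))]

# db3 decomposition lowpass filter (coefficients rounded to four decimals).
lo_d = [0.0352, -0.0854, -0.1350, 0.4599, 0.8069, 0.3327]
hi_d = alternating_flip(lo_d)

# The flip makes the highpass filter orthogonal to the lowpass filter:
# corresponding products cancel in pairs, so the inner product is zero.
dot = sum(l * h for l, h in zip(lo_d, hi_d))
```

For an even-length filter the inner product cancels pairwise regardless of the actual coefficient values, which is why the alternating flip guarantees this orthogonality by construction.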
2.1 Multiresolution
Figure 2.9: Original Lena image (512 x 512)
Figure 2.10: Single level decomposition of Lena with Daubechies db3 wavelet. Detail coefficients
(horizontal, vertical, and diagonal) and approximation coefficients displayed.

Figure 2.11: Single level decomposition display. The four subbands are arranged
as LL (approximation coefficients) and LH (horizontal coefficients) on top, with
HL (vertical coefficients) and HH (diagonal coefficients) below.
Figure 2.13: Two level decomposition of Lena with Daubechies db3 wavelet.

Figure 2.14: Two level decomposition display. The LL subband of the first level
is itself split into LL, LH, HL, and HH subbands, alongside the first-level LH,
HL, and HH detail subbands.
Figure 2.15: Wavelet coefficients from two level decomposition (vector form)
Figure 2.16: Three level decomposition of Lena with Daubechies db3 wavelet.
Figure 2.17: Three level decomposition display. The LL subband is recursively
split three times, each level producing LH, HL, and HH detail subbands around
the remaining LL approximation.
Figure 2.18: Wavelet coefficients from Figure 2.16 in vector form
Figure 2.19: Two dimensional DWT decomposition step. The rows of the image
are convolved with the decomposition filters H0(z) and H1(z) and downsampled
(even-indexed columns kept); the columns of each result are then convolved with
H0(z) and H1(z) and downsampled, producing the LL (approximation), LH
(horizontal), HL (vertical), and HH (diagonal) coefficient subbands.
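The row/column cascade of Figure 2.19 can be written compactly for the Haar case. The following is an illustrative Python sketch (the thesis code is in MATLAB); for Haar, filtering and downsampling amounts to combining non-overlapping sample pairs, and subband labeling conventions vary between texts.

```python
import math

S = 1.0 / math.sqrt(2.0)

def rows_step(m, h):
    # Apply the length-2 filter h to non-overlapping sample pairs in each
    # row, which for Haar equals filtering followed by downsampling by two.
    return [[h[0] * row[i] + h[1] * row[i + 1] for i in range(0, len(row), 2)]
            for row in m]

def cols_step(m, h):
    # The same filter-and-downsample applied along the columns, implemented
    # by transposing, filtering the rows, and transposing back.
    t = [list(c) for c in zip(*m)]
    return [list(c) for c in zip(*rows_step(t, h))]

def dwt2(m):
    # One level of the separable 2-D Haar DWT: filter the rows, then the
    # columns, producing the four coefficient subbands.
    lo, hi = [S, S], [S, -S]
    low, high = rows_step(m, lo), rows_step(m, hi)
    return (cols_step(low, lo),  cols_step(low, hi),   # LL, LH
            cols_step(high, lo), cols_step(high, hi))  # HL, HH
```

On a constant image, all of the energy collects in the LL (approximation) subband and the three detail subbands are exactly zero.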
Figure 2.20: Two dimensional IDWT reconstruction step. The LL (approximation),
LH (horizontal), HL (vertical), and HH (diagonal) subbands are upsampled along
the columns (zeros inserted at odd-indexed positions) and convolved with the
reconstruction filters F0(z) and F1(z); the rows are then upsampled (zeros
inserted at odd-indexed rows) and filtered in the same way to rebuild the image.
2.2 Image Denoising Techniques
The field of digital image processing can be broadly separated into three
categories: image enhancement, restoration, and compression. Image enhancement
and restoration techniques are closely related to one another and their methods
and algorithms may overlap. In general, though, image restoration algorithms are
more mathematically complex and attempt to exploit certain characteristics of the
signal and noise. In addition, some type of error criterion is used to provide a
measure of the effectiveness of the algorithm.
The goal of image enhancement is to process an image so that the result is
more suitable than the original image for a particular application. In image
restoration, the image has been degraded through some process and the goal is to
reduce or eliminate the effect of the degradation so that the resulting image
resembles the original as closely as possible. The objective of image compression
is to represent an image with as few bits as possible while still maintaining an
acceptable level of quality. As with image enhancement techniques, the
development and selection of a particular restoration method depends on several
factors. The method selected for denoising an image depends on the image
characteristics and the type of noise or degradation, i.e. the method selected is
image and noise specific. Image restoration can be broken down further into four
categories as seen in Figure 2.21: reduction of additive random noise, deblurring,
reduction of signal dependent noise, and temporal filtering. This section will
review various image restoration algorithms and present some examples of
various techniques. Specifically, the focus will be on algorithms used to reduce
additive random noise from a degraded image. A brief discussion on the
remaining categories will be given, but the reader is referred to Lim [7] for a more
complete description and extensive algorithm development.
Image restoration
- Reduction of additive random noise: lowpass filtering, median filtering, Wiener filtering (space invariant and space variant)
- Deblurring: inverse filtering, blind deconvolution
- Reduction of signal dependent noise
- Temporal filtering: frame averaging, motion compensation

Figure 2.21: Categories of Image restoration
These traditional image restoration techniques produce an approximation of the
original image. The noise can never be fully removed from the original signal. To
provide a quantitative measure of performance, signal to noise ratio (SNR) and
signal to noise ratio improvement (SNRI) are introduced. SNR is measured in dB
and is defined as
Figure 2.22: Original and Noisy versions of the Lena image
\mathrm{SNR} = 10\log_{10}\frac{\sum_i \sum_j x^2(i,j)}{\sum_i \sum_j n^2(i,j)}   (2.63)
where x(i,j) is the original image and n(i,j) is the noise that is corrupting the
image. The definition for SNR can be extended to provide a gauge for how well a
given restoration technique improved the corrupted image. SNRI is defined as
\mathrm{SNRI} = 10\log_{10}\frac{\sum_i \sum_j n^2(i,j)}{\sum_i \sum_j \left(x(i,j) - y(i,j)\right)^2}   (2.64)
where y(i,j) is the filtered version of the corrupted image.
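These two figures of merit can be computed directly; the following Python sketch is an illustration only (the original work used Matlab), and it reads the SNRI of (2.64) as the input-noise energy over the residual error energy of the filtered image, which is an assumption about the intended definition:

```python
import numpy as np

def snr_db(x, n):
    """SNR of (2.63): signal energy over noise energy, in dB."""
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(n ** 2))

def snri_db(x, noisy, filtered):
    """SNRI, read here as input-noise energy over the residual
    error energy of the filtered image, in dB (an assumption)."""
    before = np.sum((noisy - x) ** 2)    # energy of the corrupting noise
    after = np.sum((filtered - x) ** 2)  # energy of the remaining error
    return 10.0 * np.log10(before / after)
```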
The simplest type of image restoration algorithm involves the application
of a lowpass filter. Lowpass filtering is effective in reducing the additive random
noise in a degraded image and it preserves much of the low frequency components
of the image. An example of lowpass filtering can be seen in Figure 2.23. This
example clearly demonstrates that lowpass filtering is effective in reducing the
noise, but also exhibits a limitation of the type of technique. Lowpass filtering can
cause blurring in an image. This is the tradeoff that must be made when lowpass
Figure 2.23: Results of lowpass filtering a noisy Lena image (filtered with h(n1,n2); frequency response of filter h(n1,n2))
filtering an image; there is a substantial reduction in the amount of noise, but at
the expense of a small loss of signal.
Another restoration technique, median filtering, is an improvement over
lowpass filtering. It is similar to lowpass filtering in that it smoothes the image
and reduces high frequency noise. Unlike lowpass filtering, median filtering can
preserve discontinuities or edges in an image while reducing the noise. Median
Figure 2.24: Example of salt and pepper noise reduction by median filtering (impulse noise added to 20% of pixels; 3x3, 5x5, and 7x7 median filters)
filtering is a nonlinear process useful for removing impulsive or salt and pepper
type noise that occurs as random bit errors during discretization or transmission.
The filter uses sliding neighborhoods to process an image. It determines the
value of each output pixel by examining an mxn neighborhood around the
corresponding input pixel, sorts the pixel values in ascending order, and then
chooses the median value as the output result. An important parameter that affects
the performance of the median filter is the size of the neighborhood or window
Figure 2.25: Example of white noise reduction by median filtering (10dB white noise added to image; 3x3, 5x5, and 7x7 median filters)
that slides over the image. In general, the best choice for the size of the window is
image dependent and the resultant improvement or degradation will vary from
image to image. Examples presented in Figures 2.24, 2.25, and 2.26 demonstrate
how effective median filtering is in removing different types of noise with various
window sizes. In Figure 2.24, the SNR of the Lena image that was corrupted with
impulse noise only was found to be 5.9dB. The best SNRI was obtained using the
5x5 window, approximately 14.8dB. In Figure 2.25, the SNR of the Lena image
that was corrupted with white noise only was found to be 10dB. The best SNRI
was obtained using the 3x3 window, approximately 12.1dB. In Figure 2.26, the
SNR of the Lena image that was corrupted with white noise and salt and pepper
noise was found to be 5.6dB. The best SNRI was obtained using the 5x5 window,
Figure 2.26: Example of white noise and salt and pepper noise reduction by median filtering (3x3, 5x5, and 7x7 median filters)
approximately 12.1dB.
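The sliding-neighborhood procedure described above can be sketched as follows. This is a minimal Python illustration, not the original Matlab implementation; border handling by edge replication is an assumption:

```python
import numpy as np

def median_filter(image, size=3):
    """Median-filter an image with a size x size sliding window.

    Each output pixel is the median of the neighborhood around the
    corresponding input pixel; the border is handled by replicating
    the edge pixels (an assumption about the border convention).
    """
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.empty_like(image, dtype=float)
    rows, cols = image.shape
    for i in range(rows):
        for j in range(cols):
            window = padded[i:i + size, j:j + size]
            out[i, j] = np.median(window)  # middle of the sorted values
    return out
```

A single impulse ("salt and pepper") pixel surrounded by clean pixels is removed entirely, which is exactly the behavior exploited in Figure 2.24.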
Wiener filtering was first introduced in the early 1960s as a tool for use in
image restoration applications. Two types of Wiener filters will be discussed
here: linear space invariant and linear space variant. The model of an image
degraded by additive random noise is given by

x(n_1,n_2) = s(n_1,n_2) + w(n_1,n_2)   (2.65)

where x(n1,n2) is the corrupted signal, s(n1,n2) is the original signal, and
w(n1,n2) is the additive random noise. If it is assumed that s(n1,n2) and
w(n1,n2) are zero mean stationary processes, are linearly independent of each
other, and their power spectral densities P_s(ω1,ω2) and P_w(ω1,ω2) are known,
then the optimal linear minimum mean square error estimate of s(n1,n2) is found
by filtering x(n1,n2) with a Wiener filter whose frequency response H(ω1,ω2) is
given by

H(\omega_1,\omega_2) = \frac{P_s(\omega_1,\omega_2)}{P_s(\omega_1,\omega_2) + P_w(\omega_1,\omega_2)}.   (2.66)
As stated above, the Wiener filter design is based on the assumption that the
power spectral densities P_s(ω1,ω2) and P_w(ω1,ω2) are known or can be
estimated from a set of images of the same class. In the first Wiener filter
example presented here, the image power spectral density was estimated from six
different images. In this case, the power spectral density was calculated by
averaging |F_i(ω1,ω2)|² from six images. These images are similar to Lena and
are shown in Figure 2.27.
Figure 2.27: Images used for estimation of power spectral density in first Wiener filter example.
The estimated image power spectral density can be written as

\hat{P}_s(\omega_1,\omega_2) = \frac{1}{6}\sum_{i=1}^{6} |F_i(\omega_1,\omega_2)|^2   (2.67)

Continuing, we can now define H(ω1,ω2) as

H(\omega_1,\omega_2) = \frac{\hat{P}_s(\omega_1,\omega_2)}{\hat{P}_s(\omega_1,\omega_2) + P_w(\omega_1,\omega_2)}   (2.68)

and the final step in the implementation of the LSIV Wiener filter is given by
\hat{s}(n_1,n_2) = \mathcal{F}^{-1}\left\{ H(\omega_1,\omega_2)\,X(\omega_1,\omega_2) \right\}.   (2.69)

Figure 2.28: Examples of white noise reduction by LSIV and LSV Wiener filtering.

The results of applying a LSIV Wiener filter to a corrupted Lena image are
displayed in Figure 2.28. Lena was corrupted with white noise and the resulting
SNR was 10dB. After application of the Wiener filter, the SNRI was 9.5dB.
Although this is a substantial improvement, it came at the expense of smoothing
the entire image. Edges and other high frequency parts of the Lena image were not
preserved. The Wiener filter designed using this approach is essentially a lowpass
filter with no adaptive qualities whatsoever, hence the name Linear Space
Invariant filter. The LSIV Wiener filter assumes that the characteristics of the
signal and noise do not change over different regions of the image. Typically, this
is not the case, as one expects characteristics to differ considerably from one
region of an image to another. It would make sense to let the filter adapt to the
changing characteristics of the image. This concept is central to the theory behind
the Linear Space Variant filter; a filter which is a function of the local image
details.
Developing the model, assume that the additive white noise is zero mean
with a variance of σ_w². The power spectral density is flat and can be written as

P_w(\omega_1,\omega_2) = \sigma_w^2.   (2.70)
Within small, local areas of the image, the signal f(n1,n2) is assumed stationary
(statistics in the small area are not changing) and can be modeled as

f(n_1,n_2) = m_f + \sigma_f\, w(n_1,n_2)   (2.71)

where m_f and σ_f are the local mean and standard deviation of f(n1,n2) and
w(n1,n2) is zero mean white noise with a variance of one. Within the local area,
then, the Wiener filter is given by
H(\omega_1,\omega_2) = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_w^2}.   (2.72)
The local Wiener filter impulse response is found to be

h(n_1,n_2) = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_w^2}\,\delta(n_1,n_2).   (2.73)
Figure 2.29: Wiener filter block diagram
From Figure 2.29 and (2.73), the processed image p(n1,n2) within the local area
can be expressed as

p(n_1,n_2) = m_f + \left(g(n_1,n_2) - m_f\right) * \frac{\sigma_f^2}{\sigma_f^2 + \sigma_w^2}\,\delta(n_1,n_2)   (2.74)
(2.74)
Explicitly rewriting (2.74) for each pixel gives the main equation for the
implementation of the adaptive LSV Wiener filter as
p(n_1,n_2) = m_f(n_1,n_2) + \left(g(n_1,n_2) - m_f(n_1,n_2)\right)\frac{\sigma_f^2(n_1,n_2)}{\sigma_f^2(n_1,n_2) + \sigma_w^2}.   (2.75)
To complete the model, we need to define m_f(n1,n2) as

m_f(n_1,n_2) = \frac{1}{(2M+1)^2}\sum_{k_1=n_1-M}^{n_1+M}\;\sum_{k_2=n_2-M}^{n_2+M} g(k_1,k_2)   (2.76)
and the local variance σ_f² can be estimated as

\sigma_f^2 = \begin{cases} \sigma_g^2 - \sigma_w^2, & \text{if } \sigma_g^2 > \sigma_w^2 \\ 0, & \text{otherwise} \end{cases}   (2.77)

where

\sigma_g^2(n_1,n_2) = \frac{1}{(2M+1)^2}\sum_{k_1=n_1-M}^{n_1+M}\;\sum_{k_2=n_2-M}^{n_2+M} \left(g(k_1,k_2) - m_f(n_1,n_2)\right)^2.   (2.78)
The adaptive Wiener filter was implemented in Matlab using (2.75) through
(2.78) and the results are shown in Figure 2.28. The program estimates the local
mean and variance around each pixel, in an mxn neighborhood, and then creates a
pixelwise Wiener filter. The program was run using a 3x3 window and resulted
in an SNRI of 7.2dB. The adaptive Wiener filter produced better results than
the LSIV Wiener filter because it tailors or adapts its design to
the local image variance. Where the local image variance is large, the filter
performs little smoothing. Where the local image variance is small, the Wiener
filter performs more smoothing. This makes the filter more selective and better
able to preserve the higher frequency components of the image.
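The adaptive LSV Wiener filter of (2.75) through (2.78) can be sketched as follows. This is a Python illustration, not the original Matlab program; it assumes a known noise variance and uses edge-replication padding at the borders:

```python
import numpy as np

def adaptive_wiener(g, noise_var, M=1):
    """Pixelwise (LSV) Wiener filter following (2.75)-(2.78).

    g         : noisy image
    noise_var : noise variance sigma_w^2, assumed known and > 0
    M         : half-width of the (2M+1)x(2M+1) local window
    """
    size = 2 * M + 1
    padded = np.pad(g, M, mode="edge")  # border convention is an assumption
    rows, cols = g.shape
    out = np.empty_like(g, dtype=float)
    for i in range(rows):
        for j in range(cols):
            win = padded[i:i + size, j:j + size]
            m_f = win.mean()                      # local mean, (2.76)
            var_g = ((win - m_f) ** 2).mean()     # local variance, (2.78)
            var_f = max(var_g - noise_var, 0.0)   # signal variance, (2.77)
            gain = var_f / (var_f + noise_var)    # local Wiener gain
            out[i, j] = m_f + (g[i, j] - m_f) * gain  # (2.75)
    return out
```

In flat regions var_f is near zero and the output collapses to the local mean (strong smoothing); near edges var_f is large and the pixel passes through nearly unchanged, which is the adaptive behavior described above.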
Image deblurring can be modeled as

g(n_1,n_2) = f(n_1,n_2) * b(n_1,n_2)   (2.79)

Figure 2.30: Block diagram for image blurring

where g(n1,n2) is the blurred image, f(n1,n2) is the original image, and b(n1,n2)
is the blurring function impulse response. The objective is to estimate f(n1,n2)
from g(n1,n2). Examples of what would cause blurring may range from lens
misfocus to atmospheric turbulence. Depending on whether b(n1,n2) is known
will determine which method is selected for reducing image blur. Inverse filtering
can be used if the blurring function is known. This allows us to write

G(\omega_1,\omega_2) = F(\omega_1,\omega_2)\,B(\omega_1,\omega_2)   (2.80)

and

F(\omega_1,\omega_2) = \frac{G(\omega_1,\omega_2)}{B(\omega_1,\omega_2)}.   (2.81)
Figure 2.31: Inverse filtering block diagram

The system shown in Figure 2.31 can recover f(n1,n2) from g(n1,n2) by applying
the inverse filter 1/B(ω1,ω2).
If the blurring function b(n1,n2) is not known, it must be estimated prior
to inverse filtering. This method is called blind deconvolution because the attempt
is made to deconvolve g(n1,n2) without accurate or detailed knowledge of
b(n1,n2). See Lim [7] for details.
The reduction of signal dependent noise is accomplished by transforming
the degraded signal into a domain where the noise becomes additive signal
independent noise, and then using the methods discussed earlier to reduce the
signal independent noise. To illustrate this approach, let's develop a model for
an image g(n1,n2) degraded by multiplicative noise. Let
g(n_1,n_2) = f(n_1,n_2)\,v(n_1,n_2)   (2.82)

where v(n1,n2) is random noise that is not a function of f(n1,n2). Apply the
logarithmic operation to (2.82) to obtain

\log g(n_1,n_2) = \log f(n_1,n_2) + \log v(n_1,n_2)   (2.83)

and rewrite (2.83) to get

g'(n_1,n_2) = f'(n_1,n_2) + v'(n_1,n_2).   (2.84)
This demonstrates how the multiplicative noise v(n1,n2) was transformed to the
additive noise v'(n1,n2). Now, any number of previously discussed restoration
algorithms can be used for reducing the additive signal independent noise.
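The log-transform procedure of (2.82) through (2.84) can be sketched as follows. This is a Python illustration; the denoiser passed in stands for any of the additive-noise methods discussed earlier, and the image is assumed strictly positive so the logarithm is defined:

```python
import numpy as np

def multiplicative_denoise(g, denoise):
    """Reduce multiplicative noise per (2.82)-(2.84): take logs so the
    noise becomes additive, apply any additive-noise denoiser, then
    exponentiate to return to the original domain.

    g       : degraded image, assumed strictly positive
    denoise : any routine that reduces additive noise
    """
    g_log = np.log(g)        # multiplicative noise becomes additive (2.83)
    f_log = denoise(g_log)   # reduce the now-additive noise
    return np.exp(f_log)     # back to the image domain
```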
Figure 2.32: Example of system used for reduction of multiplicative noise
Temporal filtering for image restoration involves the filtering of images
that are correlated in time, such as motion pictures. A simple method of temporal
filtering is frame averaging. Frame averaging is effective in processing a sequence
of images in which the image doesn't change much from frame to frame, but the
character of the noise does. One example of frame averaging can be expressed as
estimating an image f(n1,n2) from a sequence of N degraded image frames
g_i(n1,n2) by writing that
\hat{f}(n_1,n_2) = \frac{1}{N}\sum_{i=1}^{N} g_i(n_1,n_2).   (2.85)
Motion compensated image restoration is an extension of frame averaging. For
frame averaging to be effective, the image must remain constant and not change
from frame to frame. This requirement is not typically met in applications such as
television and motion pictures, so motion compensation routines estimate the
movement of an image from frame to frame.
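Frame averaging per (2.85) amounts to a single mean over the frame axis; a minimal Python sketch:

```python
import numpy as np

def frame_average(frames):
    """Estimate a static image from N noisy frames per (2.85):
    f_hat = (1/N) * sum_i g_i. Assumes the scene does not move
    between frames."""
    frames = np.asarray(frames, dtype=float)
    return frames.mean(axis=0)  # average pixelwise across frames
```

With zero-mean noise the averaged estimate converges to the true image as N grows, which is why the method only helps when the scene itself is static.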
In wavelet shrinkage denoising algorithms, the basic premise involves the
thresholding of wavelet coefficients. A transformation from the spatial domain to
the wavelet domain is required and is accomplished by the application of the
wavelet transform. Models have been developed that provide a statistical basis for
the selection of a threshold and result in the optimization of mean square error in
the estimated signal. The model can be stated as such: assume that we wish to
recover an unknown function f from noisy data. Let y_i be the set of noisy
observations, defined as

y_i = f(t_i) + \sigma z_i, \quad i = 0, \ldots, n-1   (2.86)

where t_i = i/n, σ is the standard deviation of the noise, and the z_i are
independent and identically distributed random variables distributed according
to N(0,1). The
goal is to estimate f with small L_2 risk, i.e. to minimize the mean-square
error. This can be written as

R(\hat{f}, f) = E\,\frac{1}{n}\sum_{i=0}^{n-1}\left(\hat{f}(t_i) - f(t_i)\right)^2.   (2.87)
Donoho and Johnstone [1] proposed that (2.87) be subject to the condition that
the optimization of the mean square error result in a signal estimate f̂ that is
at least as smooth as f.
A number of thresholding methods are discussed in Chapter 3. These
differ in complexity, determination, and performance. However, there is one
common theme that all of these techniques exhibit. Shrinkage of empirical
wavelet coefficients works best when the underlying set of the true coefficients
of f is sparse. In the first levels of decomposition there are many coefficients,
but not many that correspond to the actual underlying function. Most of these
coefficients are noise. As the decomposition continues, the remaining few large
coefficients represent most of the underlying form of f. Therefore, any effective
wavelet shrinkage technique must throw away most of the noisier or smaller
coefficients, and keep the larger ones.
3. Wavelet Thresholding
Once a threshold has been selected, how is it applied to the wavelet
coefficients? There are two methods available for implementation of the threshold
value λ to the data: hard and soft thresholding. These two methods are
graphically displayed in Figure 3.1.
Figure 3.1: The hard and soft thresholding functions
Hard thresholding is described by

\delta_\lambda^H(x(t)) = \begin{cases} x(t), & \text{if } |x(t)| > \lambda \\ 0, & \text{if } |x(t)| \le \lambda \end{cases}   (3.1)
and implemented by including all the empirical wavelet coefficients whose
absolute value is above the threshold λ in the inverse transform during
reconstruction. The remaining coefficients, whose values are below the threshold
λ, are set equal to zero. Hard thresholding, with its keep or kill approach, can
potentially create discontinuities at wavelet coefficients equal to λ.
Donoho and Johnstone [3] recognized that empirical wavelet coefficients
consist of both signal and noise and attempted to isolate the signal portion by
removing only the noise part. This led to the second and more popular type of
thresholding and is called soft thresholding. Soft thresholding is described by

\delta_\lambda^S(x(t)) = \begin{cases} x(t) - \lambda, & \text{if } x(t) > \lambda \\ 0, & \text{if } |x(t)| \le \lambda \\ x(t) + \lambda, & \text{if } x(t) < -\lambda. \end{cases}   (3.2)

When the soft thresholding function is applied to the empirical wavelet
coefficients, not only are the coefficients whose absolute value is greater than
the threshold λ included in the inverse transform for reconstruction, but their
values are shrunk towards zero by an amount equal to the threshold λ.
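The two thresholding rules (3.1) and (3.2) can be sketched as follows, operating elementwise on a vector of coefficients (a Python illustration):

```python
import numpy as np

def hard_threshold(x, lam):
    """Hard thresholding (3.1): keep coefficients with |x| > lam,
    set the rest to zero ("keep or kill")."""
    return np.where(np.abs(x) > lam, x, 0.0)

def soft_threshold(x, lam):
    """Soft thresholding (3.2): zero the small coefficients and shrink
    the survivors toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
```

Note that the surviving coefficients pass through hard thresholding unchanged, while soft thresholding moves every survivor toward zero by λ, which is what makes the soft rule continuous in the coefficient value.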
The selection of the appropriate threshold is one of the fundamental
problems in wavelet shrinkage techniques. Choosing a very large threshold will
shrink almost all of the coefficients to zero, resulting in oversmoothing of the
underlying function. On the other hand, choosing a very small threshold would
result in a wiggly or noisy estimator. The methods for choosing the threshold λ
can be grouped into two categories, global thresholding and level-dependent
thresholding, which are discussed in detail in the following sections.
3.1 Selective Wavelet Reconstruction
Wavelet shrinkage and thresholding methods were presented by Weaver in
[10] to remove random noise from magnetic resonance images. Further work in
selective wavelet reconstruction was performed by Donoho and Johnstone [1].
The basic idea behind selective wavelet reconstruction is to choose a small
number of wavelet coefficients with which to represent the underlying function.
This idea is based on the premise that a function can be well represented in terms
of only a relatively small number of wavelet coefficients at various resolution
levels. Since the larger "true" coefficients are typically the ones that are included
in the selective reconstruction when estimating an unknown function, it makes
sense to only include the coefficients that are larger than some specified threshold
value. Therefore, in selective wavelet reconstruction, large coefficients are kept
and smaller coefficients, relative to the threshold λ, are set to zero. This
thresholding operation acts on the noisy empirical wavelet coefficients, resulting
in the estimated wavelet coefficients that are then used in the inverse transform
algorithm during reconstruction.
3.2 Global Thresholding
Global thresholding refers to using a single value of λ for the entire set of
empirical wavelet coefficients. In global thresholding, it is typical to threshold
only the coefficients at the highest resolution, thus leaving intact all lowlevel
wavelet coefficients. These lower level coefficients correspond to macro
features of the data. Donoho and Johnstone [1] were the first to propose two
methods of global thresholding that provide for the selection of the threshold λ to
be used in wavelet shrinkage applications. Both of these methods use a single
threshold for all the detail wavelet coefficients. The first one discussed is
Minimax thresholding, where the threshold is set based on the sample size n. The
Universal threshold method, or VisuShrink as it is commonly called, sets the
threshold to √(2 log n) and ensures that all of the detail coefficients are set to zero
if no signal is present.
3.2.1 Minimax Thresholding
Minimax thresholding uses a fixed threshold that depends on the sample
size n and is derived to minimize the constant term in the upper bound of the risk
involved in estimating the function given in (3.3). The Minimax threshold λ* is
defined as the threshold that minimizes the quantity

\Lambda^* = \inf_\lambda \sup_\theta \left\{ \frac{R_\lambda(\theta)}{n^{-1} + \min(\theta^2, 1)} \right\}   (3.3)

where Λ* is called the minimax risk bound. R_λ(θ) = E(δ_λ(X) − θ)² can be
rewritten as

R_\lambda(\theta) = V_\lambda(\theta) + M_\lambda^2(\theta)   (3.4)

where V_λ(θ) = var(δ_λ(X)) and M_λ(θ) = E δ_λ(X) − θ. The Minimax threshold
applies the optimal threshold in terms of L2 or mean square error risk. The
Minimax threshold values in Table 3.1 are taken from the table presented in
Donoho and Johnstone [1].
An example of Minimax thresholding can be seen in Figure 3.2. As
mentioned above, this type of thresholding results in a procedure that performs
n      λ          n       λ
64     1.474      2048    2.414
128    1.669      4096    2.594
256    1.860      8192    2.773
512    2.047      16384   2.952
1024   2.232      32767   3.131

Table 3.1: Minimax thresholds for various sample sizes, from Donoho and Johnstone [1]
well in terms of mean-squared error, but at the expense of underlying smoothness
of the estimated function. Regardless of the sample size n, the risks achieved by
the Minimax thresholds are close to the ideal risk and are smaller than the
Universal or VisuShrink method. It should be noted that both the Universal and
Minimax thresholds tend to be larger than necessary, and therefore tend to
oversmooth the underlying function. This is the main weakness of global
thresholding methods and why data driven threshold procedures are more
desirable in denoising applications. The data driven thresholds tend to preserve
more features of the underlying function. This can be observed from the denoising
results displayed in Figure 3.2.
3.2.2 Universal Thresholding VisuShrink
The second and more flexible technique proposed by Donoho and
Johnstone [1] for choosing a threshold is called the Universal threshold or
VisuShrink method. The universal threshold is λ = σ√(2 log n), where σ is the
standard deviation of the noise and n is the sample size. This technique is similar
to the Minimax method in that both will result in estimators that perform well in
Figure 3.2: Blocky Function n=2048, SNRatio=5, estimated by both the Global Universal and the Global Minimax methods (panels: true Blocky function; noisy Blocky function, SNR 5dB; Global Universal thresholding; Global Minimax thresholding)
terms of mean-squared error, but is simpler to implement and does not require the
use of a lookup table. The Universal threshold value is larger than the Minimax
threshold for any particular value of n, thus resulting in an estimate that is
smoother than the Minimax estimator. This method is called VisuShrink for this
reason; a smoother estimator is considered more visually appealing. The results of
the two methods are displayed in Figure 3.2. The differences are subtle, but
detectable. The Minimax method does a better job at picking up jumps and
discontinuities, but at the expense of overall smoothness. The VisuShrink method
results in smooth estimates, but does not pick up the jumps and other abrupt
features as well.
But, how and why does VisuShrink result in a smoother estimate? Assume
that we have a noiseless signal. Taking the wavelet transform of this signal will
result in a sparse vector, with most of the coefficients essentially being equal
to zero. If the same procedure is repeated after some noise has been added, these
coefficients are now nonzero. If a coefficient that ought to be zero in the
noiseless case is now nonzero in the noisy case, the reconstruction will preserve
this and exhibit a noisy appearance, with small blips against a clear background. The
VisuShrink thresholding method safeguards against allowing spurious noise into
the reconstruction. This is due to the fact that if z,,..., zn represents an
66
independent and identically distributed white noise sequence with noise ./V(0,1),
then the probability
z( < Jl log n, for all i 
(3.5)
as n goes to infinity. This essentially says that the probability of all noise being
shrunk to zero is very high for large samples. Since the Universal threshold
procedure is based on this asymptotic result, it does not always perform well in
small sample situations.
Referring back to the equation for the universal threshold, λ = σ√(2 log n),
it is important to note that this equation requires knowledge of σ, the standard
deviation of the data values. While estimating noise levels in a parametric setting
is straightforward, such is not the case in nonparametric regression.
Nonparametric regression is a field where, for example, a curve may be fitted to
the data without assuming any structure to or knowledge of the underlying
function. In most cases, the true value for the standard deviation of the noise is not
known and taking the usual standard deviation of the coefficients will result in an
overestimation of σ. Donoho and Johnstone [1] proposed replacing σ with the
estimate σ̂ = MAD/0.6745, where MAD is the median absolute deviation of the finest
scale wavelet coefficients. The reason for considering only the empirical wavelet
coefficients at the highest level of resolution is that these tend to consist of mainly
noise. This is not a perfect estimate, as this value may suffer an upward bias due
to the presence of some signal at that level. But, by using the median absolute
deviation to estimate the standard deviation of the noise, this bias is effectively
controlled.
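The universal threshold with the MAD-based noise estimate can be sketched as follows. This is a Python illustration; computing the MAD about the median of the finest-scale coefficients is an assumption about the exact convention, since in practice the median of those coefficients is close to zero anyway:

```python
import numpy as np

def universal_threshold(finest_detail, n):
    """VisuShrink threshold lambda = sigma_hat * sqrt(2 log n), with
    sigma estimated from the finest-scale wavelet coefficients as
    MAD / 0.6745 (MAD taken about the median; convention assumed)."""
    mad = np.median(np.abs(finest_detail - np.median(finest_detail)))
    sigma = mad / 0.6745           # MAD-to-standard-deviation for N(0, 1)
    return sigma * np.sqrt(2.0 * np.log(n))
```

Using the MAD rather than the sample standard deviation keeps a few large signal-bearing coefficients at the finest level from inflating the noise estimate, which is the bias-control point made above.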
3.3 Leveldependent Thresholding
In Section 3.2, global thresholding was described as having all of the
wavelet coefficients below a certain resolution level shrunk according to a single
value of λ. The lower level coefficients were included in the reconstruction as
is. However, what if it is desired to have thresholds that change depending on the
level of resolution and the characteristics of the empirical wavelet coefficients? In
level-dependent thresholding, a different threshold value, λ_j, is chosen for each
wavelet level j. It is only natural to group empirical wavelet coefficients by level
and then treat each level separately. After all, each level differs in overall
frequency content and contains varying amounts of signal and noise information.
Popular datadependent threshold selection procedures by Donoho and Johnstone
[2], Ogden and Parzen [4],[5], and Hilton and Ogden [6] will be examined in the
following sections.
3.3.1 SureShrink
Donoho and Johnstone [2] developed a scheme that uses the wavelet coefficients
at each level j to determine a threshold λ_j with which to shrink the wavelet
coefficients at that level. This thresholding method is based on the unbiased risk
estimation by Stein [11], which minimizes estimated mean squared error.
Recall that we developed a model that described a collection of noisy
observations y_i as

y_i = f(t_i) + \sigma z_i, \quad i = 0, \ldots, n-1   (3.6)

where t_i = i/n, σ is the standard deviation of the noise, and the z_i are
independent and identically distributed random variables distributed according to
N(0,1). Our goal is to estimate f with small mean-squared error, i.e. to find an
estimate f̂ that depends on y, with small risk such that

R(\hat{f}, f) = E\,\frac{1}{n}\|\hat{f} - f\|_2^2 = \frac{1}{n}\sum_{i=0}^{n-1} E\left(\hat{f}(t_i) - f(t_i)\right)^2.   (3.7)
But, in many instances, we perform statistical operations on coefficients in the
wavelet domain. What happens to this risk in the wavelet domain? In the wavelet
domain, we can rewrite (3.6) as

w_{j,k} = \theta_{j,k} + \sigma \tilde{z}_{j,k}   (3.8)

where the w_{j,k} are the empirical wavelet coefficients, the θ_{j,k} are the
wavelet coefficients of f, and the z̃_{j,k} are again independent N(0,1)
variables. The orthogonality of the Discrete Wavelet Transform results in the fact
that the wavelet transform of white Gaussian noise is also white Gaussian noise of
the same amplitude. The wavelet coefficients of a noisy sample set are themselves
just noisy versions of the noiseless wavelet coefficients. This is very significant
because we can transform the original data into its wavelet coefficients, and then
attempt to minimize the risk in the wavelet domain. This will automatically
minimize risk in the original domain. Finally, we can write that

\|\hat{f} - f\|_2^2 = \|\hat{\theta} - \theta\|_2^2   (3.9)

and

R(\hat{f}, f) = \frac{1}{n}\, E\,\|\hat{\theta} - \theta\|_2^2.   (3.10)
Continuing with the SureShrink discussion, let μ = (μ_i : i = 1, ..., d) be a
d-dimensional vector and x_i ~ N(μ_i, 1) be multivariate normal observations with
that mean vector. Also, let μ̂ = μ̂(x) be a fixed estimator of μ. Stein [11]
introduced a method for estimating ‖μ̂ − μ‖² in an unbiased fashion. He stated that
the loss could be estimated unbiasedly for an estimator of μ that can be written as
μ̂(x) = x + g(x), where g maps R^d to R^d. When g(x) is weakly differentiable,

E\,\|\hat{\mu}(x) - \mu\|^2 = d + E\left\{\|g(x)\|^2 + 2\,\nabla \cdot g(x)\right\}   (3.11)

where ∇·g ≡ Σ_i ∂g_i/∂x_i. Using the soft threshold estimator,

g_i(x) = \hat{\mu}_i^{(\lambda)}(x) - x_i = \begin{cases} -\lambda, & x_i > \lambda \\ -x_i, & -\lambda \le x_i \le \lambda \\ \lambda, & x_i < -\lambda \end{cases}   (3.12)
and ‖g(x)‖² = Σ_k min²(|x_k|, λ) and ∇·g(x) = −#{k : |x_k| ≤ λ}. Substitution
into (3.11) gives Stein's unbiased estimate of risk for any set of observed data
x_1, ..., x_d as

\mathrm{SURE}(\lambda; x) = d - 2\,\#\{k : |x_k| \le \lambda\} + \sum_{k=1}^{d} \min^2(|x_k|, \lambda)   (3.13)

and

E\,\|\hat{\mu}^{(\lambda)}(x) - \mu\|^2 = E\,\mathrm{SURE}(\lambda; x).   (3.14)

The threshold level is set so as to minimize the estimate of risk for any set of
observed data x_1, ..., x_d and λ can be written as

\lambda = \mathrm{argmin}_{t \ge 0}\ \mathrm{SURE}(t; x).   (3.15)
The SURE method can be expected to do well in terms of minimizing risk, since
for large sample sizes the Law of Large numbers will guarantee that the SURE
criterion is close to the true risk and the threshold will be optimal for a given
dataset.
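The minimization of (3.13) over the candidate thresholds t = |x_k| can be sketched as follows. This is a Python illustration rather than the thesis's Matlab code, and it assumes the coefficients have already been normalized to unit noise variance:

```python
import numpy as np

def sure_threshold(coeffs):
    """Select a threshold by minimizing Stein's unbiased risk estimate,
    SURE(t; x) = d - 2*#{k : |x_k| <= t} + sum_k min(|x_k|, t)^2,
    evaluated at the candidate thresholds t = sorted |x_k|."""
    x = np.asarray(coeffs, dtype=float)
    d = x.size
    s = np.sort(np.abs(x)) ** 2  # sorted squared magnitudes
    # At t equal to the k-th sorted |x| (1-indexed), k coefficients are at
    # or below t (each contributing its own square) and the remaining
    # d - k coefficients each contribute t^2.
    risks = (d - 2.0 * np.arange(1, d + 1)
             + np.cumsum(s) + np.arange(d - 1, -1, -1) * s)
    best = int(np.argmin(risks))
    return float(np.sqrt(s[best]))
```

Because the candidate risks are evaluated on the sorted squared coefficients, the whole search costs one sort plus a cumulative sum, which is the same vectorized construction used by the risks expression discussed in the example that follows.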
The implementation of the above equation for SURE(A,x) is not
immediately obvious. At this point, a computational example demonstrating the
minimization of the SURE criterion would be valuable and insightful. Let the
empirical wavelet coefficients be denoted as the set of data x = (x_1, ..., x_d)
where d is the length of the data vector. For this example, the wavelet coefficients
were taken from a level 6 decomposition of the Blocky function. The Blocky
Figure 3.3: Demonstration of SURE thresholding algorithm (Level 6 coefficients and their absolute values)
function was created with n=2048 and a signaltonoise ratio of 5. The SURE
thresholding procedure will be demonstrated in Figures 3.3 and 3.4. The noisy
wavelet coefficients for level 6 from the noisy Blocky function are shown in
Figure 3.3. To prepare the empirical wavelet coefficients for input into the SURE
criterion, the following operations are performed:
1) Take the absolute value of the wavelet coefficients.
2) Sort the wavelet coefficients in ascending order.
3) Square the wavelet coefficients. Denote this vector as wc2.
The resultant vector of coefficients, wc2, is then input into the risks calculation.
The risks function is defined, in Matlab notation, as

risks = (1024 - 2*(1:1024) + (cumsum(wc2) + (1023:-1:0).*wc2)) / 1024   (3.16)
Since the coefficients are reordered in order of increasing absolute value |x_i|,
the output of the risks function is strictly increasing between adjacent samples.
It is also strictly increasing between zero and the smallest |x_i|, as well as for
λ ≥ max_i |x_i|, so the minimum must occur at zero or at one of the |x_i|. The
minimum value and index of that value are then determined from the risks vector.
The threshold is calculated by taking the square root of the value at that index in
the vector of the squared coefficients, wc2. This technique is illustrated in Figure
3.4. The dashed line on the bottom graph in Figure 3.4 indicates the value of the
threshold selected by the SURE technique. In this case, all coefficients below this
Figure 3.4: SURE criterion for Blocky function Level 6 (risks vector and wc2 vector)
line will be set equal to zero, while those above it will be shrunk toward zero by
the threshold amount. Remember that the SURE technique is a data adaptive scheme
that is typically applied to wavelet coefficients at all levels. As stated previously,
this allows the coefficients themselves to determine if and how much shrinkage is
required. It is not uncommon for the threshold to be set to zero for lower level
coefficients. However, there is one problem with the SURE method. Refer to
Figure 3.5 for the demonstration. Notice that level 3 seems to have more small
coefficients than large ones. It may appear that the threshold selected is set too
low, and that too many small coefficients will be included in the reconstruction.
Figure 3.5: SURE criterion for Blocky function Level 3
The consequence will be that the resulting estimator will be noisier. Additionally,
it can be seen that the risks function in Figure 3.5 is relatively flat near the
location where it achieves its minimum. This indicates that there is a wide range
of possible thresholds. The choice of the threshold may have a low quantitative
difference in terms of estimated risk, but may have a significant qualitative
difference in terms of the overall smoothness of the resultant underlying function.
Donoho and Johnstone [2] recognized this problem and noted that the
SURE threshold selection method does not perform well in cases where the
wavelet representation at any level is sparse, i.e., when most of the wavelet
coefficients are close to zero. This is because the noise from the coefficients that
are close to zero overwhelms what little signal may be present in the nonzero
coefficients. Donoho and Johnstone proposed a hybrid SURE method to get
around this issue. The hybrid method essentially tests for sparseness at each level.
It uses the sample variance at each level to determine if the representation at that
level is sparse. The sample variance at any level can be calculated by

s_d² = (1/d) Σ_{i=1}^{d} (X_i² − 1),   (3.17)

and the wavelet coefficients at that level are judged to be sparse if

s_d² ≤ (log₂ d)^{3/2} / √d.   (3.18)
If the set of coefficients is judged sparse, then the hybrid scheme defaults to the
universal threshold, λ = σ̂√(2 log n). Otherwise, the SURE criterion is used to
determine the threshold value.
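The sparseness test of (3.17) and (3.18), together with the fallback to the universal threshold, can be sketched in a few lines of Python. This is an illustrative translation, not the thesis's MATLAB code, and it assumes the coefficients at a level have already been normalized by the noise estimate σ̂:

```python
import math

def is_sparse(coeffs):
    """Hybrid SURE sparseness test: compare the sample variance s_d^2 of the
    (noise-normalized) coefficients against (log2 d)^(3/2) / sqrt(d)."""
    d = len(coeffs)
    s2 = sum(x * x - 1.0 for x in coeffs) / d       # sample variance, (3.17)
    eta = math.log2(d) ** 1.5 / math.sqrt(d)        # sparseness cutoff, (3.18)
    return s2 <= eta

def select_threshold(coeffs, sure_threshold):
    """Hybrid scheme: universal threshold when sparse, else the SURE rule.
    `sure_threshold` is a caller-supplied function (hypothetical here)."""
    d = len(coeffs)
    if is_sparse(coeffs):
        return math.sqrt(2.0 * math.log(d))         # universal threshold
    return sure_threshold(coeffs)                   # SURE criterion otherwise
```

A level of many near-zero coefficients is judged sparse and receives the universal threshold, while a level with substantial energy is handed to the SURE rule.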
Examples of both the SURE-based thresholding and Hybrid SURE
thresholding methods are shown in Figures 3.6 and 3.7. The Blocky function in
Figure 3.6 was created with n = 2048 and a signal-to-noise ratio of 5, while the
Heavy Sine function in Figure 3.7 was created with n = 512 and a signal-to-noise
ratio of 2. The differences between the SURE estimator and the Hybrid SURE
estimator are clearly demonstrated and easily distinguishable. Note that in the
locations where the function is smooth, the empirical coefficients at higher
resolution levels are almost entirely white noise. The standard SURE method does
not recognize this and produces a noisy estimator, whereas the Hybrid SURE
technique determines that the coefficients at those levels meet the sparseness
criterion and shrinks most of them to zero.
Figure 3.6: Blocky function, n = 2048 and SNR = 5, estimated by SURE and Hybrid SURE
methods

Figure 3.7: Heavy Sine function, n = 512 and SNR = 2, estimated by SURE and Hybrid SURE
methods
The SureShrink algorithm consists of the following steps:
1) Perform a DWT of the noisy data to obtain the empirical wavelet
coefficients. Normalize the coefficients by σ̂. Sort the absolute values of
the coefficients and then square the results.
2) Calculate Stein's Unbiased Estimate of Risk, given by

SURE(t; X) = d − 2·#{i : X_i² ≤ t²} + Σ_{i=1}^{d} min(X_i², t²).   (3.19)

Determine the minimum of the SURE vector and record its index. Take the
square root of the value at that index in the vector of squared coefficients;
this value is the threshold λ.
3) Perform soft thresholding of the noisy wavelet coefficients.
4) Reconstruct the signal from the thresholded empirical wavelet
coefficients with the IDWT.
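Steps 1 and 2 of the algorithm can be sketched as follows. This is an illustrative Python version rather than the thesis's MATLAB code; it assumes the coefficients have already been normalized by σ̂, and it uses the standard form of Stein's unbiased risk estimate, with each sorted squared coefficient serving as a candidate threshold:

```python
def sure_threshold(coeffs):
    """Select the SureShrink threshold by minimizing Stein's Unbiased
    Risk Estimate over candidate thresholds taken from the data itself.
    Returns (minimum risk, threshold lambda)."""
    d = len(coeffs)
    sq = sorted(x * x for x in coeffs)        # sorted squared coefficients
    risks = []
    csum = 0.0                                # running sum of sq[0..k]
    for k, t2 in enumerate(sq):               # candidate threshold t^2 = sq[k]
        csum += t2
        # sq[0..k] lie at or below t^2; the remaining d-k-1 are clipped at t^2
        risks.append(d - 2.0 * (k + 1) + csum + (d - k - 1) * t2)
    k_min = min(range(d), key=lambda k: risks[k])
    return risks[k_min], sq[k_min] ** 0.5
```

Maintaining the running sum `csum` lets the whole risk vector be evaluated in a single pass over the sorted squared coefficients, rather than recomputing the sum of minima for every candidate threshold.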
3.3.2 Data Analytic Threshold Selection - Brownian Bridge
It has been shown that wavelets are powerful tools for estimating an
underlying function that may contain multiple jumps, discontinuities, or sharp
spikes. Typically, these methods only use the magnitudes of the wavelet
coefficients in choosing a threshold for each level. In addition to the relative
magnitude of the coefficients, it may also be desirable to include the information
contained in the position of the large coefficients. Since a jump or discontinuity
in the underlying function results in several nonzero coefficients adjacent to each
other, a change-point approach can take advantage of the information contained
in the relative position as well as the magnitude of the coefficients. As discussed
in the previous section, Donoho and Johnstone [2] described their SureShrink
procedure and a related hybrid scheme for when the wavelet coefficient
representation is sparse. SureShrink uses the coefficients at each level to select a
threshold that minimizes Stein's Unbiased Risk Estimate. Ogden and Parzen [4]
propose a different approach to this problem.
The general concept of the recursive scheme developed by Ogden and
Parzen in [4] and [5] operates on a level-by-level basis, similar to the SURE
method. The empirical coefficients are examined at each level and a standard
statistical test is performed to determine if the set of coefficients at that level
behaves like white noise or if there is significant signal present. If it is determined
that significant signal is present, the largest coefficient in absolute value is
removed and the remaining set of coefficients is tested again. The recursive
process is continued until the original set of empirical wavelet coefficients has
been separated into a set of coefficients that has been judged to contain significant
signal, and a set of coefficients that behaves like white noise. At this point, all of
the noise coefficients are shrunk to zero, and the signal coefficients are
shrunk by the amount required to shrink the noisy coefficients to zero.
At the heart of this technique are statistical tests of hypotheses. In this
case, the hypothesis tested results in the formation of a set of large coefficients
that has passed the test of significance, i.e., signal. Hilton and Ogden [6] state this
hypothesis as
H₀: μ₁ = μ₂ = ... = μ_d = 0
Hₐ: there exists at least one i in {1, ..., d} such that μ_i ≠ 0   (3.20)

where μ₁, ..., μ_d denote the true wavelet coefficients at any single level. These
are estimated by the empirical coefficients X₁, X₂, ..., X_d. The threshold selection
at each level is based on testing (3.20) recursively, with the null hypothesis H₀
specifying no significant coefficients, and the alternative hypothesis Hₐ allowing
one or more coefficients to be nonzero. The test statistic for the hypotheses listed
above is based on a Brownian bridge stochastic process. It is defined as

B(u) = (1 / (σ²√(2d))) Σ_{i=1}^{⌊du⌋} (X_i² − X̄²),   0 ≤ u ≤ 1,   (3.21)

where σ is the standard deviation of the noise and X̄² is the mean of all of the
squared empirical wavelet coefficients. In most cases, the value for the noise
standard deviation is not known and must be replaced by the estimate

σ̂ = MAD / 0.6745,

where MAD is the median absolute value of the finest-scale wavelet coefficients [1].
Under the null hypothesis, {B(u), 0 ≤ u ≤ 1} in (3.21) converges in distribution to
a Brownian bridge stochastic process: a zero-mean continuous Gaussian process
with covariance kernel min(u, v) − uv that is zero at 0 and 1.
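The MAD-based noise estimate and the bridge process can be sketched in Python (an illustrative translation of the ideas above, not the thesis's MATLAB code; the scaling of the partial sums by σ²√(2d) is an assumption of this sketch, chosen so the path of pure-noise coefficients behaves like a Brownian bridge):

```python
import math
import statistics

def sigma_mad(finest_coeffs):
    """Noise estimate sigma_hat = MAD / 0.6745, where MAD is the median
    absolute value of the finest-scale wavelet coefficients."""
    return statistics.median([abs(x) for x in finest_coeffs]) / 0.6745

def brownian_bridge(coeffs, sigma):
    """Partial sums of the centered squared coefficients, scaled so that
    under the null hypothesis the path resembles a Brownian bridge."""
    d = len(coeffs)
    sq = [x * x for x in coeffs]
    mean_sq = sum(sq) / d                     # X-bar^2 in (3.21)
    scale = sigma * sigma * math.sqrt(2.0 * d)
    path, csum = [0.0], 0.0
    for s in sq:
        csum += s - mean_sq
        path.append(csum / scale)
    return path                               # path[0] = path[d] = 0
```

By construction the centered squares sum to zero, so the path is pinned to zero at both endpoints, matching the bridge property noted above.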
The fundamental idea behind this scheme is the typical behavior of a
sample Brownian bridge process. As stated previously, the empirical wavelet
coefficients are examined level-by-level. If none of the coefficients contributes
significant signal, the function in (3.21) should resemble the typical behavior of a
Brownian bridge process. It will exhibit oscillations about zero, without too great
an excursion in either direction. Under the hypothesis Ha, however, the Brownian
bridge process will not oscillate about zero and will exhibit large deviations from
zero in either direction. Figure 3.8 shows the Brownian bridge process for the
empirical wavelet coefficients at level 4 of the noisy Blocky function with
n = 2048 and a signal-to-noise ratio of 5. The top graph in Figure 3.8 is the full data
set before any coefficients have been removed.
Figure 3.8: Brownian bridge process behavior, Level 4 (top: full data set; below: reduced data
sets, including α = 0.9)
The second and third graphs more closely resemble the typical qualities of a
sample Brownian bridge process; the largest coefficients have been removed
according to the choice of α. The level α acts as a smoothing parameter and
allows the user to maintain control over the smoothness of the resulting estimated
function. It can be set to a smaller value, resulting in an underfitting of the data
similar to SureShrink, or to a larger value, overfitting the data and producing a
result similar to the VisuShrink method.
The computational requirements for this recursive algorithm are described
below:

Step 1: Form the Brownian bridge process as given in (3.21).

Step 2: Test the process from Step 1 for white noise using the Kolmogorov-
Smirnov test statistic. Note: once the Brownian bridge process has been
formed, any of several test statistics can be used to test it.

Step 3: If the test in Step 2 is significant, remove the wavelet coefficient
with the greatest absolute value from the dataset, set d = d − 1, and return
to Step 1.

Step 4: If the test in Step 2 is not significant, set the threshold equal to the
absolute value of the largest wavelet coefficient (in absolute magnitude) in
the current dataset. The "noise" coefficients are shrunk to zero, while the
"signal" coefficients are shrunk by the same amount.
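The four steps above can be sketched as follows (illustrative Python, not the thesis's MATLAB code). The Kolmogorov-Smirnov-type statistic is taken as max|B(u)| over the bridge path, the coefficients are kept in their original positional order so that the change-point information survives each removal, and `c_alpha` is a hypothetical critical value that in practice would come from tables for the test at level α:

```python
import math

def bb_threshold(coeffs, sigma, c_alpha):
    """Recursive Brownian-bridge threshold selection (sketch)."""
    data = list(coeffs)                       # preserve positional order
    while data:
        d = len(data)
        sq = [x * x for x in data]
        mean_sq = sum(sq) / d
        scale = sigma * sigma * math.sqrt(2.0 * d)
        csum, ks = 0.0, 0.0
        for s in sq:                          # Step 1: bridge process of (3.21)
            csum += s - mean_sq
            ks = max(ks, abs(csum) / scale)   # Step 2: KS-type statistic
        if ks <= c_alpha:                     # Step 4: not significant
            return max(abs(x) for x in data)  # threshold = largest |coeff| left
        big = max(range(d), key=lambda i: abs(data[i]))
        del data[big]                         # Step 3: drop the largest, retest
    return 0.0                                # every coefficient judged signal
```

A small `c_alpha` makes the test hard to pass and strips out many "signal" coefficients before stopping, while a large `c_alpha` stops early, illustrating the smoothing role of α described above.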
Figure 3.9: Blocky function, Level 4: denoising results of the Brownian Bridge technique (true
wavelet coefficients; noisy coefficients; "large" coefficients for α = 0.05; "large" coefficients for
α = 0.9)
In this context, the goal of a standard change-point procedure is to determine from
the data whether a set of empirical wavelet coefficients represents noise or signal.
Ogden and Parzen [4], [5] use the mean-corrected cumulative sum process. In
Figure 3.10, plots of the cumulative sum of the wavelet coefficients, the
cumulative sum of the absolute value of the coefficients, and the cumulative sum
of the squared wavelet coefficients are displayed. By looking at the plot of the
cumulative sum of the wavelet coefficients, it is difficult to detect where the
changes are taking place. This is because the coefficients do not have the same
sign and will tend to cancel each other out when added together.

Figure 3.10: Change-point analysis in the Brownian bridge process

The plot of the
cumulative sum of the absolute values of the coefficients gives a slightly better
idea of where the jumps are located. From the plot of the cumulative sum of the
squared coefficients, it is even easier to see where the underlying function is
relatively flat and where it increases rapidly; the rapid increases correspond to
jumps or discontinuities in the underlying Blocky function.
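The three cumulative-sum diagnostics compared in Figure 3.10 are simple to compute; a minimal Python sketch (not the thesis's MATLAB code) is:

```python
from itertools import accumulate

def cumsum_diagnostics(coeffs):
    """Three running sums used to locate change points at a level:
    raw coefficients (sign cancellation hides jumps), absolute values
    (slightly better), and squares (jumps stand out most clearly)."""
    raw = list(accumulate(coeffs))
    absolute = list(accumulate(abs(x) for x in coeffs))
    squared = list(accumulate(x * x for x in coeffs))
    return raw, absolute, squared
```

The raw cumulative sum can hover near zero even across a jump because positive and negative coefficients cancel, which is exactly the behavior described above; squaring removes the signs and makes the steep segments visible.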
Figure 3.11: Denoising results of the Brownian Bridge technique (true and noisy Blocky function,
their wavelet coefficients, and the data-dependent estimate with α = 0.05)
The denoising results of the Brownian Bridge process are displayed in Figure
3.11. Figure 3.12 compares the results of the Brownian Bridge process to both
VisuShrink and SureShrink. Note how the magnitude of α can produce an
estimator that closely resembles the denoising results obtained with either the
VisuShrink or SureShrink algorithms.

PAGE 1
ADAPTNE WAVELET THRESHOLDING IN DENOISING OF 2D SIGNALS b y Erwin F. Siegel B.S. Physics University ofPittsburgh 1980 B.S E.E. University of Colorado at Denver 1991 A thesis submitted to the University of Colorado at Denver in pattial fulfillment of the requirements for the degree of Master of Science E lectri cal Engineering 1999
PAGE 2
This thesis for the Master of Science degree b y Erwin F. Siegel has been approved by Tarnal Bose Mike Radenkovic Jan Bialasiewicz Date
PAGE 3
Siegel Erwin F. (M.S. Electrical Engineering) Adaptive Wavelet Thresholding in DeNoising of2D Signals Thesis directed by Associate Professor Tarnal Bose ABSTRACT In fields ranging from medical imaging to geophysics, digital communications to astronomy one must attempt to extract signals from noisy and incomplete data In all cases the goal is to remove as much of the noise as possible without degrading the fmerscale aspects of the data. Wavelet based noise removal techniques have recently emerged as attractive and powerful tools for signal extraction from corrupted data. Denoising attempts to recover the true" signal from the noisy version that is observed. Signals are recovered by introducing a procedure that suppresses noise by thresholding the empirical wave let coefficients. Essential to wavelet shrinkage signal denoising techniques is the selection of the shrinkage threshold. The focus of this thesis is to examine and compare datadependent threshold selection techniques that are widely used today in wavelet shrinkage signal denoising. This abstract accurately represents the content of the candidate's thesis. I recommend its publication. Tarnal Bose 111
PAGE 4
ACKNOWLEDGEMENTS My thanks to my advisor Dr. Tarnal Bose for his patience with me during these last two years and for his time and guidance throughout my graduate career at UCD. Additionally I am indebted to Dr. Todd Ogden who provided informative and valuable insights into his work on Data Adaptive Wavelet Thresholding and without whose help this thesis would not have been completed I am also deeply indebted to Gordon Stuck for sharing with me his programming expertise in Matlab and for his willingness to always take time to discuss wavelets with me. Finally, my deepest thanks go to my family: my wife Michaela and my daughter Ellie, for their support and patience throughout this task. IV
PAGE 5
CONTENTS Chapter 1. Introduction ........ .............. .. .. .. ......... ................................... .................................. 1 2. Wavelet Theory .......................................... .... ..................................................... 5 2.1 Multiresolution ..................... ............................................................................ 29 2.2 Image Denoising Techniques ..... ........................ ............. ........ ................ ........ 38 3. Wavelet Thresholding ......... .................. .................................... ............... ......... 58 3 1 Selective Wavelet Reconstruction .................... ............................................... 61 3 2 Global Thresholding ............ ................ .............. .. ....................... ..... ............... 62 3 2 1 Minimax Thresholding ......................... ........ ................................................ 63 3.2.2 U niversal ThresholdingVisuShrink .............. ............................................. 65 3.3 Leveldependent Thresholding ............... .. .. ..... .. ....... ............... ........... .. ........... 69 3.3.1 SureShrink .................................................. ................................. .................. 70 3 3.2 Data Analytic Threshold SelectionBrownian Bridge ................ .... ............ 82 3 3.3 Data Analytic Threshold SelectionBrownian Sheet .................. ........ ........ 91 4. Computer Simulations and Comparative Analysis .................................. .......... 97 5. Conclusions ........................................................................................ ............. 1 07 Appendix A. Bibliograph y ...... ......... .............................. ..... ......... ....... .............. .................. 09 v
PAGE 6
B. Matlab Code / Simul a tion s ......................................... ..... .......... ....... ....... ......... 111 V I
PAGE 7
FIGURES Figure 2.1 Wavelet analysis timefrequency plot ................................................................ 6 2.2 Analysis section of a filter banlc ......... ...... ......................................................... 8 2.3 Scaling and dilation example with Haar wavelet.. .......... ................ .... ........ .... I 0 2.4 Twochannel filter bank ................................... .......................... ...... ....... ........ 16 2.5 Alias cancellation conditions ........................................................................... 21 2.6 Alternating flip and SmithBarnwell conditions .... .............................. ............ 26 2.7 Daubechies db3 wavelet Scaling and wavelet functions ............................... 28 2.8 Daubechies db3 wavelet filter designs ...... ....................... ....... ......... ........... 28 2 9 Original Lena image (512 x 512) ..................................................................... 29 2.10 Single level decomposition ofLena image with Daubechies db3 wavelet.. .. 30 2.11 Twodimensional single level decomposition format.. .... .............................. 30 2.12 Wavelet coefficients from single level decomposition vector form ........... 31 2.13 Two level decomposition ofLena image with Daubechies db3 wavelet.. ..... 32 2.14 Twodimensional two level decomposition format ...... ............ ..................... 32 2.15 Wavelet coefficients from two level decomposition vector form ............... 3 3 2 .16 Three level decomposition of Lena image with Daubechies db3 wavelet.. ... 34 2.17 Twodimensional three level decomposition format ........ ............................. 34 vii
PAGE 8
Figure 2.18 Wavelet coefficients from three l evel decomposition vector form .... ....... . 3 5 2 .19 Two dimensional DWTDecomposition step ............ ......... .... ......... .......... . 36 2 20 Two dimensional IDWTReconstruction step ............................................. 37 2.21 Categories oflrnage Restoration .................................................................... 39 2.22 Original and Noisy versions ofthe Lena image ............................................. 39 2.23 Lowpass filtering of noisy Lena image ............ ..... . ............................ ......... .41 2.24 Median filtering of Lena image cmrupted by salt and pepper noise ...... ... .42 2.25 Median filtering of Lena image corrupted by white noise . ...... . .... .... . . .43 2.26 Median filtering of Lena image corrupted by the combination of" salt and pepper" noise and white noise .... .......... .............. ....... ....... .... .... ...... ... . .......... . .44 2.27 Power spectral density estimation for Wiener filter ... .... .... . .... . . .... . ........ .46 2.28 Noise reduction by LSIV and LSV Wiener filtering ...... .... ........... .......... ..... .48 2.29 Wiener filter block diagram .... . .... .......................... ..................... ............ .... 50 2 30 Image blurring block diagram .. .. .. ..... .... .... .................... ............................. . 52 2.31 Inverse filtering block diagram ... ...... .............. ................................. .... ....... 53 2.32 Multiplicative noise reduction block diagram ....... ...... ..... ......... ....... ............ 55 3.1 Hard and soft thresholding functions ...... . .......... ............................................ 58 3.2 Blocky function Global Universal and Minimax methods .................... . .... 65 Vlll
PAGE 9
Figure 3.3 SURE thresholding algorithm .... ........... ........... ..... .... .... ......... ............ .. .... .. ... 7 4 3.4 SURE criterion for Blocky function Level 6 ............ .. ............ .. ....... ............ ... 76 3.5 SURE criterion for Blocky function Level 3 ...... .................................... ..... .... ?? 3 6 Block y function SURE and H y brid SURE method s ...... .... ..... ....... ...... .... . 80 3.7 Hea vy Sine function SURE and H y brid SURE method s .............................. 80 3.8 Brownian bridge process beha v ior ............................ ....... ..... ........................... 85 3.9 Blocky function Level 4 Denoi s ing results of Brownian Bridge .................. 87 3 .10 Changepoint anal y sis in Brownian brid g e process ....... ....................... ........ 88 3.11 Blocky functionDenoisin g re s ults ofBrownian Bridge .......... ....... ........... 89 3.12 Blocky functioncomparisons of v ariou s Denoising techniques : VisuShrink SureShrink and Brownian Bridge ........ ...... .... ........... ... ............. .... 90 3 .13 Data input into formation of Brownian Sheet .. .......... ........... .... .. .... ......... ... 94 3.14 Brownian Sheet no s ignific a nt coeffi c ient s remo v ed ................. ....... .......... 9 5 3.15 Brownian Sheet all s ignificant coefficient s remo v ed .... ..... ..... ............. ....... 95 3.16 Nois y Horizontal wavelet coefficients from Level 1 Decomposition ....... ..... 96 3 .1 7 Denoised Horizontal wavelet coefficients from Le v el 1 Decomposition ..... 96 4.1 Ori g inal and noi sy v er s ion s ofLena ima g e ( with wavelet c o e fficient s) .......... 9 7 4 2 Level 3 Decompo s ition of original and noisy Lena image ... .... ........... .. .... ...... 98 i x
PAGE 10
Figure 4.3 Scaling and wavelet fimctions of db3 and Haar wave l ets . ........................... . 99 4.4 Original and noisy versions of Box fimction (with wavelet coefficients) ..... .lOO 4.5 Denoising results for Lena image corrupted b y additive white" noise: VisuShrink SureShrink and Brownian Sheet (alpha = 0 999 ) methods ( db3 ) .. .101 4.6 Plot ofwavelet coefficients of Figure 4.5 ........ ................. ... ........................ 102 4.7 Denoising results for Lena image corrupted by additive "white noise VisuShrink, SureShrink and Brownian Sheet (alpha = 0.05) methods (db3) ..... 103 4.8 Denoising results for Box fimction coiTupted by additive "white noise VisuShrink, SureShrink and Brownian Sheet (alpha = 0.05) methods (Haar) . .105 4.9 Denoising resu lt s for Bo x fimction conupted by additive "white noise Brownian Sheet (alpha = 0.05 0.5 0.9 and 0.999 ) method (db3) .............. . .... .106 X
PAGE 11
TABLES Table 3.1 Minimax thresholds for various sample sizes ............... .... .... ..... ..... .... ......... 64 3.2 Critical points for finitesample Bro wnian Sheet process ... ............... .... ........ 93 xi
PAGE 12
1. Introduction One of the fundamental problems we face everyday in the fields of data analysis and signal processing is how best to extract a signal from noisy and corrupted data. This problem has recently been addressed with the use of wavelets. The power of wavelet analysis lies in fact that wavelets have certain mathematical properties that enable them to parse data into different frequency components and then analyze each component with a resolution matched to its scale. Temporal analysis is performed with contracted or higher frequency versions of the mother wavelet, while frequency analysis is performed with dilated or lower frequency versions of the same mother wavelet. Through a process of scaling, wavelets are compressed or stretched to more accurately analyze a particular feature of a signal. Wavelets can also be translated or shifted in time. The transformation of a time series into its temporal and frequency components is achieved by the use of the wavelet transform. An interesting characteristic of the wavelet transform is that resulting individual wavelet functions are localized in space Fourier sine and cosine functions are not. This localization feature, along with the wavelet s localization of frequency, results in "sparse" functions when transformed into the wavelet domain. This sparseness feature of wavelet
PAGE 13
coefficients results in a nwnber of useful applications such as data compression and denoising. Wavelets can be used to decompose a signal into different levels of resolution. This is where wavelet shrinkage techniques have been found to be very effective in extracting signals from noisy data Wavelet shrinkage methods select which wavelet coefficients are to be used in the reconstruction of a signal based on thresholding. Wavelet shrinkage refers to a signa l reconstruction obtained by first performing a multilevel decomposition followed b y shrinking the wave l et coefficients towards zero and then fmally reconstructing the signal by the use of inverse wavelet transformation. Of concern to wavelet shrinkage signal denoising is the choice of the tlueshold va lue that will be used to shrink the wavelet coefficients The goa l of this thesis is to examine a relatively new data adaptive method for se lectin g thresholds for wavelet shrinkage denoising of s ignals. This technique is unique from other existing datadependent threshold selection techniques because it considers both the magnitude and location of empirical wavelet coefficients when determining the threshold A. Chapter 2 examines the wavelet transform and its ability to represent a signa l at different location s and scales. Multiresolution analysis is also introduced in this chapter. Multiresolution has the capability of zooming in or zooming 2
PAGE 14
out ", depending upon the level of examination or focus that is desired. Traditional methods of image denoising are introduced and compared in this chapter as well. The objective of denoising is to recover the signal or to remove noise without reference to an unknown original signal. Chapter 3 begins with a discussion on soft and hard thresholding two methods that are used to apply a threshold to a set of empirical wavelet coefficients Thresholding of empirical wavelet coefficients is designed to distinguish between those coefficients that belong in a reconstruction; i.e. coefficients that contain "sign ificant energy, and those that do not. Ideally thresholding should only affect the noise without disturbing the s ignal. More important than the actual method of applying a threshold to data is how the threshold is derived. Global versus level dependent thresholding techniques are examined in detail and the results of various simulations are presented In Chapter 4 the results of MA TLAB computer simulations are displayed Threshold selection methods discussed in Chapter 3 were implemented and tested on both onedimensional and twodimensional images corrupted by additive Gaussian white noise Twodimensional images include the famous Lena image and a 64x64 box function. The Blocky function, first introduced by Donoho and Johnstone [1), was used for the onedimensiona l simulations due to its portra ya l as a signal that contains significant spatial inhomogeneity and multiple jumps. 3
PAGE 15
Daubechies six coefficient otthogonal wavelet db3 and the Haar wavelets w ere used in all simulations. The results are presented and compared both from a v isual standpoint as well as from SNR calculations. With the exception of the one dimensional Brownian Bridge algorithm all remaining onedimensional denoi s ing simulations were performed with functions from the MA TLAB Wavelets Toolkit. All twodimensional denoising sinmlations were written by the author to run in MATLAB. Chapter 5 concludes this thesis with a discussion on the experimental results and the versatility of the recursive Brownian Sheet technique that allows the user to control the smoothness ofthe reconstructed signal. 4
PAGE 16
2. Wavelet Theory Mathematical transformations are applied to signals in order to obtain additional information about the signal that is not readily available in its original form. For example, the Fourier Transform of a time domain signal results in the frequencyamplitude representation of that signal. The output tells us how much of each frequency is present in the original signal. However it does not tell us when in time these frequency components exist. When the time localization of the spectral components are needed, a transform that gives the timefrequency representation of the signal is required. The Wavelet Transform makes it pos s ible to have it both ways; it allows you to have both the time and frequency information at the same time The transformation of a time domain signa l into its temporal and frequency components is provided by the Wavelet Transform. The fundamental difference between the Four ier Transform and the Wavelet Transform is resolution. The Fourier Transform has constant resolution at all times and frequencies. The Wavelet Transform has good time and poor freq uenc y resolution at high frequencies and good frequency but poor time resolution at low frequencies. This is certainly acceptable, though, as the added advantage is the capability of multiresolutional analysis provided b y the Wavelet Transform. The illustration in Figure 2.1 is commonly used to explain how the 5
PAGE 17
OJ Frequency }=3 }=2 j=l ... .... Time t Figure 2.1: Wavelet analysis timefrequenc y plot (time sca le) time and frequency resolutions are interpreted. Note that at low frequencies the height of the boxes are shorter which conesponds to better frequency reso l ution, but their widths are longer, which correspond to poorer time resolution. At higher 6
PAGE 18
frequencies the widths of the boxes decrease which corresponds to better time resolution and the heights ofthe boxes increase as the frequency resolution gets poorer. Central to and intrinsically com1ected to multiresolution analysis are wavelets Wavelets have the unique ability to analyze according to the desired scale, thus allowing signals to be examined using widely varying levels of focus. In this section the implementation ofthe Wavelet Transform is accomplished by examining a perfect reconstruction filter banl<. The scaling and wavelet functions are given and the Haar wavelet will be introduced to provide insight into the dilation and translation operations. Wavelets are created from a mother wavelet through these two operations and can be expressed as where j is the dilation index and k is the translation index. The connection between wavelets and filter banks was made by Mallat [12]. 7 (2. 1 )
PAGE 19
As seen in Figure 2.2 a filter bank is comprised of a set of filters and sampling operations. Only the Analysis section of the filter bank is shown here where the input signal is decomposed or divided into different scales of resolution. The Filte r and downsample indefinitely will result in the Scaling function Scaling function (t) b x(n) ;l, k ....___ Wavelet coefficients Wavelet w(t) Difference Figure 2.2: The Analysis section of the filter bank Synthesis section follows the Analysis section and is simply the inverse of the Analysis bank. The signal x( n) enters the Analysis section and is operated upon by two filters H0 and H1 These filters separate the signal into different frequency bands H0 the lowpass filter, takes the averages of the signal, while H1 the highpass filter, takes the differences. The first level of decomposition has been completed after the outputs of the filters have been decimated or downsampled by two. The highpass filtered outputs called wavelet transform coefficients, are not 8
PAGE 20
passed onto the next level of filtering and will be used later. The lowpass outputs are passed into the next level of decomposition. This process continues until it is determined that further decomposition of the signal is not beneficial to the analysis. Typical decomposition values for a onedimensional signal is five and three for twodimensional signal. As stated earlier there is a direct relationship between filter banks and wavelets. The lowpass filter H0 which provides the approximation coefficients leads to the scaling function (t). The highpass filter H1 which is associated with the detail coefficients produces the wavelet function w(t). Typically the scaling function is obtained before the wavelet function w(t), and only after many iterations ofthe lowpass filter and downsampling. The wavelet function also comes from the last decomposition level but after one application of the highpass filter. The dilation equation that produces (t) and the wavelet equation w(t) can be written as N (t) = 2Lh0(k)(21 t k ). (2. 2) k=O 9
PAGE 21
The wavelet equation follows immediately from the scaling function and can be written as w(t) = 2 L h1 ( k )( 21 tk ). (2. 3) k The importance of the translation and dilation operations will be demonstrated through the introduction of the Haar wave l et. The example in Figure 2.3 will graphically illustrate the relationship between scale and dilation and translation. ( 2 t ) .4!1' (t) .. IV ( I ) ... Haar Mother wavelet (2tI ) IV ( 2 1 ) ,1V(2tI) .. ... 'Wavelets are derived from the Mother wavelet compressiOn shift F igwe 2.3: Scaling and dilation example using the Haar mother wavelet 10
PAGE 22
By definition, φ(t) = 1 over the interval [0,1] and 0 elsewhere. It then follows that

φ(t) = √2 [h0(0)φ(2t) + h0(1)φ(2t − 1)]. (2.4)

Assume for the Haar case that the lowpass filter is defined as

h0(k) = [1/√2, 1/√2]. (2.5)

The scaling function φ(t) can then be written as

φ(t) = φ(2t) + φ(2t − 1). (2.6)

Assume for the Haar case that the highpass filter is defined as

h1(k) = [1/√2, −1/√2]. (2.7)

It follows that w(t) can be written as

w(t) = √2 [h1(0)φ(2t) + h1(1)φ(2t − 1)] (2.8)
and

w(t) = φ(2t) − φ(2t − 1). (2.9)

The results agree with the graphical example in Figure 2.3. In order to solve the dilation equation

φ(t) = √2 Σ_k h(k) φ(2t − k), (2.10)

apply the continuous Fourier transform to both sides to obtain
Φ(ω) = √2 Σ_k h(k) ∫ φ(2t − k) e^(−jωt) dt.

Substituting r = 2t − k, so that 2t = r + k, t = ½(r + k), and dt = ½ dr, and carrying out the integration gives
Φ(ω) = (1/√2) H0(ω/2) Φ(ω/2). (2.18)

This is a recursive process, and by repeating the iteration we get

Φ(ω) = (1/√2) H0(ω/2) · (1/√2) H0(ω/4) Φ(ω/4), (2.19)

and, continuing through (2.20), after m iterations

Φ(ω) = [Π_{j=1}^{m} (1/√2) H0(ω/2^j)] Φ(ω/2^m). (2.21)

If h0(k) is known, letting m → ∞ in (2.21) gives

Φ(ω) = Π_{j=1}^{∞} H0(ω/2^j)/√2. (2.22)

The scaling function's transform Φ(ω) is obtained after an infinite sequence of filtering and downsampling. Fundamental to the development of the scaling function is the design of H0(z). How are its coefficients determined? In addition, referring to
Figure 2.2, what is its relationship to H1(z), and what are the remaining design requirements of the filter bank? These questions will be answered in the following section.
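The iteration behind (2.22) can be carried out numerically with the cascade algorithm: start from a unit impulse and repeatedly upsample by two and filter with √2·h0. A minimal pure-Python sketch (the function names are illustrative, not from the thesis):

```python
import math

def upsample2(x):
    """Insert a zero between consecutive samples."""
    out = []
    for v in x[:-1]:
        out.extend([v, 0.0])
    out.append(x[-1])
    return out

def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def cascade(h0, iterations=4):
    """Approximate samples of phi(t) by iterating the dilation equation."""
    g = [math.sqrt(2) * c for c in h0]   # the sqrt(2)*h0 factor from (2.2)
    phi = [1.0]                          # start from a unit impulse
    for _ in range(iterations):
        phi = convolve(upsample2(phi), g)
    return phi

haar = [1 / math.sqrt(2)] * 2            # Haar lowpass filter (2.5)
print(cascade(haar, 3))                  # eight samples, all ~1.0
```

For the Haar filter the iterates converge immediately to the box function φ(t) = 1 on [0, 1), matching (2.6); for longer filters such as db3 more iterations are needed before the samples resemble the limit function.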
Figure 2.4: Two-channel filter bank (analysis and synthesis sections).

A two-channel filter bank with both the analysis and synthesis sections is depicted in Figure 2.4. The role of a filter bank is to separate an input signal into frequency bands by filtering and downsampling, and then reconstruct it by upsampling and filtering. The signal x(n) enters the filter bank from the left and passes into the analysis bank. The signal is acted upon by a lowpass filter H0(z) and a highpass filter H1(z). Both filtered versions are then downsampled by two. This results in the removal of odd-numbered samples and two half-length outputs. The gap in the center of Figure 2.4 indicates where the downsampled signals may be compressed, enhanced, or coded for storage or transmission. For the development of this topic we will assume that this gap is closed. The signals are then passed into the synthesis bank, where they are upsampled by two. This
results in the insertion of zeros in the odd-numbered locations and two double-length outputs. For the filter bank to exhibit perfect reconstruction with l time delays, x̂(n) = x(n − l). This is equivalent to stating in the z-domain that X̂(z) = z^−l X(z). This occurs when there is no aliasing and no distortion of the input signal. The following derivation will shed some light on these two requirements. From the filter bank in Figure 2.4 it can be written that

V0(z) = H0(z)X(z) (2.23)

and

U0(z) = ½[V0(z^(1/2)) + V0(−z^(1/2))], (2.24)

and substituting (2.23) into (2.24) to obtain

U0(z²) = ½[V0(z) + V0(−z)] = ½[H0(z)X(z) + H0(−z)X(−z)], (2.26)

and
W0(z) = F0(z)U0(z²) = ½F0(z)[H0(z)X(z) + H0(−z)X(−z)]. (2.27)

The same derivation techniques can be used for the lower branch of the analysis bank to solve for

W1(z) = F1(z)U1(z²) = ½F1(z)[H1(z)X(z) + H1(−z)X(−z)]. (2.28)

Adding the expressions for W0(z) (2.27) and W1(z) (2.28),

X̂(z) = W0(z) + W1(z). (2.29)

Now, collect like terms and simplify to obtain

X̂(z) = ½[F0(z)H0(z) + F1(z)H1(z)]X(z) + ½[F0(z)H0(−z) + F1(z)H1(−z)]X(−z). (2.30)

As stated earlier, for perfect reconstruction with l time delays, X̂(z) = z^−l X(z). The distortion term in (2.30) must be set equal to z^−l, so

½[F0(z)H0(z) + F1(z)H1(z)] = z^−l, (2.31)
while the alias term must be set equal to zero:

½[F0(z)H0(−z) + F1(z)H1(−z)] = 0. (2.32)

Perfect reconstruction is guaranteed for a two-channel filter bank when two conditions are met. First, the no-distortion requirement is given by

F0(z)H0(z) + F1(z)H1(z) = 2z^−l. (2.33)

The second condition, alias cancellation, can be written as

F0(z)H0(−z) + F1(z)H1(−z) = 0. (2.34)

The filters F0, H0, H1, and F1 need to be designed so that these conditions are satisfied. From (2.34) above it can be shown that alias cancellation occurs when

F0(z) = H1(−z) (2.36)
and

F1(z) = −H0(−z). (2.37)

Substituting (2.36) and (2.37) into the alias term of (2.30) will verify that these filter relationships remove aliasing in the filter bank. The conditions put forth in (2.36) and (2.37) permit examination of the relationship between the filters' coefficients. Assume generic coefficient sequences for the filters, written out term by term as in (2.38) and (2.39). The output vector associated with (2.39) indicates that the odd-numbered coefficients change sign for the relationship F0(z) = H1(−z). Similarly, (2.40) shows that F1(z) = −H0(−z) results in the even-numbered coefficients changing sign. This is shown graphically in Figure 2.5.
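The alternating-sign behavior just described can be checked directly: replacing z with −z negates the odd-indexed coefficients, and an extra overall minus sign then negates the even-indexed ones instead. A small sketch with arbitrary placeholder coefficients (hypothetical values, not the thesis' (2.38)–(2.40)):

```python
def negate_z(h):
    """Coefficients of H(-z) given the coefficients of H(z)."""
    return [(-1) ** k * c for k, c in enumerate(h)]

h1 = [3.0, -1.0, 4.0, 2.0]           # placeholder H1(z) coefficients
h0 = [1.0, 2.0, -2.0, 5.0]           # placeholder H0(z) coefficients

f0 = negate_z(h1)                    # F0(z) = H1(-z): odd-indexed signs flip
f1 = [-c for c in negate_z(h0)]      # F1(z) = -H0(-z): even-indexed signs flip

print(f0)   # [3.0, 1.0, 4.0, -2.0]
print(f1)   # [-1.0, 2.0, 2.0, 5.0]
```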
Figure 2.5: Alternating-signs pattern of a two-channel filter bank.

To complete the analysis and to satisfy (2.33), define the product filters P0(z) and P1(z) as

P0(z) = F0(z)H0(z) (2.41)

and

P1(z) = F1(z)H1(z). (2.42)
Substituting directly allows us to find that

P1(z) = −P0(−z). (2.43)

The no-distortion requirement of (2.33) is manipulated to obtain

P0(z) + P1(z) = 2z^−l

and, using (2.43),

P0(z) − P0(−z) = 2z^−l. (2.46)

Further examination of (2.46) leads to the fact that l must be odd and all of the odd powers in P0(z) must be equal to zero. The exception is z^−l, which has a coefficient of one. Simplifying (2.46), we let

P(z) = z^l P0(z) (2.47)
and multiply (2.46) by z^l to obtain

z^l P0(z) − z^l P0(−z) = 2, (2.48)

P(z) − (−1)^l P(−z) = 2, (2.49)

and, since l is odd,

P(z) + P(−z) = 2. (2.50)

To summarize, performing this analysis for a two-channel perfect reconstruction filter bank yields a design that can be reduced to a few steps:

Step 1: Design a halfband lowpass filter P(z) such that P(z) + P(−z) = 2.

Step 2: Define P0(z) = z^−l P(z) and factor P0(z) into F0(z)H0(z); then use (2.36) and (2.37) to find F1(z) and H1(z).

Note that the length of P0(z), by definition, determines the sum of the lengths of F0(z) and H0(z). Factoring P0(z) into F0(z)H0(z) can be done several ways. One such method will result in linear-phase filters, another in orthogonal filters.
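Step 1's halfband requirement can be verified for the Haar design of the previous section, where l = 1 (a minimal sketch, not thesis code):

```python
# Form P0(z) = F0(z)H0(z) for the Haar bank and confirm P(z) + P(-z) = 2,
# where P(z) = z^1 P0(z), as in (2.50).
import math

def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

s = 1 / math.sqrt(2)
h0 = [s, s]                  # Haar lowpass
f0 = [s, s]                  # F0(z) = H1(-z) for the Haar bank

p0 = convolve(f0, h0)        # coefficients of z^0, z^-1, z^-2
print(p0)                    # [0.5, 1.0, 0.5] up to rounding

# P(z) + P(-z) doubles the even powers of P(z) = z^1 P0(z); with the center
# tap of p0 at z^-1, that tap becomes the constant term of P(z):
p_plus_pminus = [2 * c if (k - 1) % 2 == 0 else 0.0 for k, c in enumerate(p0)]
print(p_plus_pminus)         # [0.0, 2.0, 0.0] -> P(z) + P(-z) = 2
```

The single surviving term is the constant 2, confirming that the Haar product filter is halfband with delay l = 1.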
So, for the perfect reconstruction condition to be satisfied, P(z) must be a halfband filter. All even powers in P(z) are zero except for the constant term, which is equal to one. The odd terms will dictate the filter design. We still need to focus on the design of the filters H0(z), H1(z), F0(z), and F1(z). Typically H0(z) is designed first, then H1(z) follows. Early work on this topic was done by Croisier, Esteban, and Galand in 1977. Their choice dealt with alternating signs and is described by

H1(z) = H0(−z). (2.51)

The resulting filters were called QMFs (quadrature mirror filters). This design satisfies the alias cancellation criteria but does not result in an orthogonal filter bank. Further work by Smith and Barnwell in 1984 proved to be more rewarding, as they chose

H1(z) = −z^−N H0(−z^−1). (2.52)

This is known as the alternating flip method (see Figure 2.6) and will lead to the design of orthogonal filters when H0(z) is correctly chosen. If the design of
H0(z) leads to perfect reconstruction in the alternating-flip filter bank, it will also result in orthogonality. Rewriting (2.52), we can now expand it in the same format as (2.38) through (2.40) to get the corresponding coefficient patterns and, finally, (2.53).
Figure 2.6: Alternating flip for orthogonality and Smith–Barnwell conditions for alias cancellation.

Substituting (2.36) into (2.41) yields

P0(z) = H1(−z)H0(z). (2.59)

We need this product to satisfy the halfband condition (2.60), and for N odd (2.60) becomes (2.61),
and the product filter can then be written as

P0(z) = z^−N H0(z^−1)H0(z), (2.62)

which will result in an orthogonal filter bank and a symmetric factorization. To summarize the design of halfband filters and orthogonal filter banks:

- If H1(z) is the alternating flip of H0(z), then P0(z) = z^−N H0(z^−1)H0(z), and this will lead to an orthogonal filter bank.
- Design P(z) such that P(z) + P(−z) = 2. This will guarantee perfect reconstruction.
- Alias cancellation requires that F0(z) = H1(−z) and F1(z) = −H0(−z).
- Alternating flip leads to orthogonal filters and symmetric factorization.
- Design a halfband filter P(z) such that P(z) + P(−z) = 2. Define P0(z) = z^−l P(z). Factorize P0(z) = F0(z)H0(z) to find F0(z) and H0(z); several combinations are possible. Use the Smith–Barnwell conditions to find the remaining filters H1(z) and F1(z).
Daubechies' db3 wavelet:

Figure 2.7: Daubechies' db3 scaling and wavelet functions.

Figure 2.8: Daubechies' db3 wavelet filter designs.

H0(z) = [0.0352 0.0854 0.1350 0.4599 0.8069 0.3327]
H1(z) = [0.3327 0.8069 0.4599 0.1350 0.0854 0.0352]
F0(z) = [0.3327 0.8069 0.4599 0.1350 0.0854 0.0352]
F1(z) = [0.0352 0.0854 0.1350 0.4599 0.8069 0.3327]
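As a sketch (not from the thesis), the db3 filter bank can be checked operationally: analyze with H0/H1, zero the odd-indexed samples (downsample then upsample by two), synthesize with F0/F1, and compare with a delayed input. The signs below are those of the standard db3 filters; the listing above reproduces magnitudes only, so the sign pattern is an assumption here.

```python
h0 = [0.0352, -0.0854, -0.1350, 0.4599, 0.8069, 0.3327]
h1 = [-0.3327, 0.8069, -0.4599, -0.1350, 0.0854, 0.0352]
f0 = [0.3327, 0.8069, 0.4599, -0.1350, -0.0854, 0.0352]
f1 = [0.0352, 0.0854, -0.1350, -0.4599, 0.8069, -0.3327]

def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def zero_odd(y):
    """Downsample by two then upsample by two: odd-indexed samples -> 0."""
    return [v if n % 2 == 0 else 0.0 for n, v in enumerate(y)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
branch0 = convolve(f0, zero_odd(convolve(h0, x)))
branch1 = convolve(f1, zero_odd(convolve(h1, x)))
xhat = [a + b for a, b in zip(branch0, branch1)]

err = max(abs(xhat[n + 5] - x[n]) for n in range(len(x)))
print(err)   # small (~1e-3): x(n) recovered with a five-sample delay
```

The residual error comes only from the four-decimal rounding of the printed coefficients; with full-precision db3 filters the reconstruction is exact up to machine precision.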
2.1 MultiResolution

Figure 2.9: Original Lena image (512×512).
Figure 2.10: Single-level decomposition of Lena with Daubechies' db3 wavelet. Detail coefficients (horizontal, vertical, and diagonal) and approximation coefficients displayed.

The single-level decomposition is displayed in quadrants:

LL (approximation coefficients)   LH (horizontal coefficients)
HL (vertical coefficients)        HH (diagonal coefficients)

Figure 2.11: Single-level decomposition display.
Figure 2.12: Wavelet coefficients from Figure 2.10 in vector form.
Figure 2.13: Two-level decomposition of Lena with Daubechies' db3 wavelet.

At the second level, the LL quadrant is itself decomposed into LL, LH, HL, and HH subbands, while the first-level LH, HL, and HH quadrants remain.

Figure 2.14: Two-level decomposition display.
Figure 2.15: Wavelet coefficients from Figure 2.13 in vector form.
Figure 2.16: Three-level decomposition of Lena with Daubechies' db3 wavelet.

At the third level, the remaining LL quadrant is again decomposed into LL, LH, HL, and HH subbands.

Figure 2.17: Three-level decomposition display.
Figure 2.18: Wavelet coefficients from Figure 2.16 in vector form.
Two-dimensional DWT (Figure 2.19): the rows of the image are convolved with the decomposition filters H0(z) and H1(z), and the filtered rows are downsampled by two (even-indexed columns kept). The columns of each of the two results are then convolved with H0(z) and H1(z) and downsampled by two (even-indexed rows kept). The four outputs are the LL approximation coefficients, the LH horizontal coefficients, the HL vertical coefficients, and the HH diagonal coefficients.

Figure 2.19: Two-dimensional DWT decomposition steps.
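One level of the row/column procedure in Figure 2.19 can be sketched with the Haar filters standing in for db3 (illustrative helper names, not thesis code):

```python
import math

def haar_step(v):
    """One Haar analysis step on a 1-D sequence of even length:
    pairwise averages (lowpass) and differences (highpass)."""
    a = [(v[i] + v[i + 1]) / math.sqrt(2) for i in range(0, len(v), 2)]
    d = [(v[i] - v[i + 1]) / math.sqrt(2) for i in range(0, len(v), 2)]
    return a, d

def dwt2_haar(img):
    """One level of the separable 2-D DWT: rows first, then columns."""
    L, H = [], []
    for row in img:                    # filter + downsample the rows
        a, d = haar_step(row)
        L.append(a)
        H.append(d)

    def cols(mat):                     # transpose helper
        return [list(c) for c in zip(*mat)]

    def split_cols(mat):               # filter + downsample the columns
        A, D = [], []
        for col in cols(mat):
            a, d = haar_step(col)
            A.append(a)
            D.append(d)
        return cols(A), cols(D)        # transpose back to row-major

    LL, LH = split_cols(L)             # approximation, horizontal detail
    HL, HH = split_cols(H)             # vertical detail, diagonal detail
    return LL, LH, HL, HH

img = [[1.0, 2.0, 3.0, 4.0],
       [5.0, 6.0, 7.0, 8.0],
       [9.0, 10.0, 11.0, 12.0],
       [13.0, 14.0, 15.0, 16.0]]
LL, LH, HL, HH = dwt2_haar(img)
print(LL)   # ~[[7.0, 11.0], [23.0, 27.0]]: 2x2 approximation subband
```

Each subband is a quarter of the input size, matching the quadrant layout of Figure 2.11; because this test image is a smooth ramp, the HH (diagonal) subband comes out essentially zero.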
Two-dimensional IDWT (Figure 2.20): the LL, LH, HL, and HH subbands are upsampled by two along the columns (zeros inserted at the odd-indexed rows), the columns are convolved with the reconstruction filters F0(z) and F1(z), and the results in each pair are summed. These are then upsampled by two along the rows (zeros inserted at the odd-indexed columns), the rows are convolved with F0(z) and F1(z), and the branches are summed to reconstruct x(n).

Figure 2.20: Two-dimensional IDWT reconstruction steps.
2.2 Image Denoising Techniques

The field of digital image processing can be broadly separated into three categories: image enhancement, restoration, and compression. Image enhancement and restoration techniques are closely related to one another, and their methods and algorithms may overlap. In general, though, image restoration algorithms are more mathematically complex and attempt to exploit certain characteristics of the signal and noise. In addition, some type of error criterion is used to provide a measure of the effectiveness of the algorithm. The goal of image enhancement is to process an image so that the result is more suitable than the original image for a particular application. In image restoration, the image has been degraded through some process, and the goal is to reduce or eliminate the effect of the degradation so that the resulting image resembles the original as closely as possible. The objective of image compression is to represent an image with as few bits as possible while still maintaining an acceptable level of quality. As with image enhancement techniques, the development and selection of a particular restoration method depends on several factors. The method selected for denoising an image depends on the image characteristics and the type of noise or degradation; i.e., the method selected is image and noise specific. Image restoration can be broken down further into four
categories, as seen in Figure 2.21: reduction of additive random noise, deblurring, reduction of signal-dependent noise, and temporal filtering. This section will review various image restoration algorithms and present some examples of various techniques. Specifically, the focus will be on algorithms used to reduce additive random noise from a degraded image. A brief discussion of the remaining categories will be given, but the reader is referred to Lim [7] for a more complete description and extensive algorithm development.

Figure 2.21: Categories of image restoration. Reduction of random noise: lowpass filtering, median filtering, Wiener filtering (LSIV and space variant). Deblurring: inverse filtering, blind deconvolution. Reduction of signal-dependent noise. Temporal filtering: frame averaging, motion compensation.

These traditional image restoration techniques produce an approximation of the original image. The noise can never be fully removed from the original signal. To provide a quantitative measure of performance, signal-to-noise ratio (SNR) and signal-to-noise ratio improvement (SNRI) are introduced. SNR is measured in dB and is defined as
SNR = 10 log10 [ Σ_i Σ_j x²(i,j) / Σ_i Σ_j n²(i,j) ], (2.63)

where x(i,j) is the original image and n(i,j) is the noise that is corrupting the image.

Figure 2.22: Original and noisy versions of the Lena image.

The definition of SNR can be extended to provide a gauge of how well a given restoration technique improved the corrupted image. SNRI is defined as

SNRI = 10 log10 [ Σ_i Σ_j n²(i,j) / Σ_i Σ_j (y(i,j) − x(i,j))² ], (2.64)
where y(i,j) is the filtered version of the corrupted image. The simplest type of image restoration algorithm involves the application of a lowpass filter. Lowpass filtering is effective in reducing the additive random noise in a degraded image, and it preserves much of the low-frequency components of the image. An example of lowpass filtering can be seen in Figure 2.23.

Figure 2.23: Results of lowpass filtering a noisy Lena image (10 dB of Gaussian noise added; frequency response of the filter h(n1,n2) shown).

This example clearly demonstrates that lowpass filtering is effective in reducing the noise, but it also exhibits a limitation of this type of technique: lowpass filtering can cause blurring in an image. This is the tradeoff that must be made when lowpass
filtering an image: there is a substantial reduction in the amount of noise, but at the expense of a small loss of signal. Another restoration technique, median filtering, is an improvement over lowpass filtering. It is similar to lowpass filtering in that it smoothes the image and reduces high-frequency noise. Unlike lowpass filtering, median filtering can preserve discontinuities or edges in an image while reducing the noise.

Figure 2.24: Example of salt-and-pepper noise reduction by median filtering (impulse noise added to 20% of pixels; 3×3, 5×5, and 7×7 median filters).

Median filtering is a nonlinear process useful for removing impulsive or "salt and pepper" type noise that occurs as random bit errors during discretization or transmission.
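The median filter just described can be sketched as follows (a truncated window is used at the image borders; this is an illustrative choice, not the thesis' implementation):

```python
def median_filter(img, k=3):
    """k x k sliding-window median; the window is truncated at the borders."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = sorted(img[i + di][j + dj]
                            for di in range(-r, r + 1)
                            for dj in range(-r, r + 1)
                            if 0 <= i + di < h and 0 <= j + dj < w)
            out[i][j] = window[len(window) // 2]
    return out

# A flat 5x5 patch with a single impulse ("salt") in the middle:
img = [[10.0] * 5 for _ in range(5)]
img[2][2] = 255.0
clean = median_filter(img)
print(clean[2][2])   # 10.0: the impulse is removed, the flat region untouched
```

Because the impulse is never the middle value of any sorted window, it vanishes completely, whereas a lowpass filter would smear it over its neighbors.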
The filter uses sliding neighborhoods to process an image. It determines the value of each output pixel by examining an m×n neighborhood around the corresponding input pixel, sorting the pixel values in ascending order, and then choosing the median value as the output result. An important parameter that affects the performance of the median filter is the size of the neighborhood, or window, that slides over the image. In general, the best choice for the size of the window is image dependent, and the resultant improvement or degradation will vary from image to image. The examples presented in Figures 2.24, 2.25, and 2.26 demonstrate how effective median filtering is in removing different types of noise with various window sizes.

Figure 2.25: Example of white noise reduction by median filtering (10 dB white noise added; 3×3, 5×5, and 7×7 median filters).

Figure 2.26: Example of white noise and salt-and-pepper noise reduction by median filtering (10 dB white noise plus impulse noise; 3×3, 5×5, and 7×7 median filters).

In Figure 2.24, the SNR of the Lena image corrupted with impulse noise only was found to be 5.9 dB; the best SNRI, approximately 14.8 dB, was obtained using the 5×5 window. In Figure 2.25, the SNR of the Lena image corrupted with white noise only was found to be 10 dB; the best SNRI, approximately 12.1 dB, was obtained using the 3×3 window. In Figure 2.26, the SNR of the Lena image corrupted with both white noise and salt-and-pepper noise was found to be 5.6 dB. The best SNRI was obtained using the 5×5 window,
approximately 12.1 dB.

Wiener filtering was first introduced in the early 1960s as a tool for use in image restoration applications. Two types of Wiener filters will be discussed here: linear space invariant (LSIV) and linear space variant (LSV). The model of an image degraded by additive random noise is given by

x(n1,n2) = s(n1,n2) + w(n1,n2), (2.65)

where x(n1,n2) is the corrupted signal, s(n1,n2) is the original signal, and w(n1,n2) is the additive random noise. If it is assumed that s(n1,n2) and w(n1,n2) are zero-mean stationary processes, are linearly independent of each other, and their power spectral densities Ps(ω1,ω2) and Pw(ω1,ω2) are known, then the optimal linear minimum mean square error estimate of s(n1,n2) is found by filtering x(n1,n2) with a Wiener filter whose frequency response H(ω1,ω2) is given by

H(ω1,ω2) = Ps(ω1,ω2) / [Ps(ω1,ω2) + Pw(ω1,ω2)]. (2.66)
As stated above, the Wiener filter design is based on the assumption that the power spectral densities Ps(ω1,ω2) and Pw(ω1,ω2) are known or can be estimated from a set of images of the same class. In the first Wiener filter example presented here, the image power spectral density was estimated from six different images. In this case the power spectral density was calculated by averaging |F(ω1,ω2)|² from six images. These images are similar to Lena and are shown in Figure 2.27.

Figure 2.27: Images (Man #1 through Man #6) used for estimation of the power spectral density in the first Wiener filter example.
The estimated image power spectral density can be written as

P̂s(ω1,ω2) = (1/6) Σ_{i=1}^{6} |F_i(ω1,ω2)|². (2.67)

Continuing, we can now define H(ω1,ω2) as

H(ω1,ω2) = P̂s(ω1,ω2) / [P̂s(ω1,ω2) + Pw(ω1,ω2)], (2.68)

and the final step in the implementation of the LSIV Wiener filter is given by

Ŝ(ω1,ω2) = H(ω1,ω2)X(ω1,ω2). (2.69)

The results of applying an LSIV Wiener filter to a corrupted Lena image are displayed in Figure 2.28. Lena was corrupted with white noise, and the resulting SNR was 10 dB. After application of the Wiener filter, the SNRI was 9.5 dB. Although this is a substantial improvement, it came at the expense of smoothing the entire image. Edges and other high-frequency parts of the Lena image were not preserved. The Wiener filter designed using this approach is essentially a lowpass
filter with no adaptive qualities whatsoever, hence the name linear space invariant filter. The LSIV Wiener filter assumes that the characteristics of the signal and noise do not change over different regions of the image. Typically this is not the case, as one expects characteristics to differ considerably from one region of an image to another. It would make sense to let the filter adapt to the changing characteristics of the image. This concept is central to the theory behind the linear space variant filter: a filter which is a function of the local image details.

Figure 2.28: Examples of white noise reduction by LSIV and LSV Wiener filtering (original Lena image; 10 dB of white noise added).
Developing the model, assume that the additive white noise is zero mean with a variance of σ²w. The power spectral density is flat and can be written as

Pw(ω1,ω2) = σ²w. (2.70)

Within small local areas of the image, the signal f(n1,n2) is assumed stationary (the statistics in the small area are not changing) and can be modeled as

f(n1,n2) = m_f + σ_f w(n1,n2), (2.71)

where m_f and σ_f are the local mean and standard deviation of f(n1,n2), and w(n1,n2) is zero-mean white noise with a variance of one. Within the local area, then, the Wiener filter is given by

H(ω1,ω2) = σ²_f / (σ²_f + σ²_w). (2.72)
The local Wiener filter impulse response is found to be

h(n1,n2) = [σ²_f / (σ²_f + σ²_w)] δ(n1,n2). (2.73)

Figure 2.29: Block diagram for the Wiener filter.

From Figure 2.29 and (2.73), the processed image p(n1,n2) within the local area can be expressed as

p(n1,n2) = m_f + [σ²_f / (σ²_f + σ²_w)] (x(n1,n2) − m_f). (2.74)

Explicitly rewriting (2.74) for each pixel gives the main equation for the implementation of the adaptive LSV Wiener filter as
p(n1,n2) = m_f(n1,n2) + [σ²_f(n1,n2) / (σ²_f(n1,n2) + σ²_w)] (x(n1,n2) − m_f(n1,n2)). (2.75)

To complete the model, we need to define the local mean m_f(n1,n2) as

m_f(n1,n2) = [1/((2M+1)(2N+1))] Σ_{k1=n1−M}^{n1+M} Σ_{k2=n2−N}^{n2+N} x(k1,k2), (2.76)

and the local variance σ²_f can be estimated as

σ̂²_f = σ²_g − σ²_w if σ²_g > σ²_w, and 0 otherwise, (2.77)

where σ²_g is the local variance of the degraded image,

σ²_g(n1,n2) = [1/((2M+1)(2N+1))] Σ_{k1} Σ_{k2} [x(k1,k2) − m_f(n1,n2)]². (2.78)

The adaptive Wiener filter was implemented in Matlab using (2.75) through (2.78), and the results are shown in Figure 2.28. The program estimates the local mean and variance around each pixel in an m×n neighborhood and then creates a pixelwise Wiener filter. The program was run using a 3×3 window and resulted in an SNRI of 7.2 dB. The results obtained with the adaptive Wiener filter produced
better results than the LSIV Wiener filter because it tailors, or adapts, its design to the local image variance. Where the local image variance is large, the filter performs little smoothing. Where the local image variance is small, the Wiener filter performs more smoothing. This makes the filter more selective and better able to preserve the higher-frequency components of the image. Image deblurring can be modeled as

g(n1,n2) = f(n1,n2) * b(n1,n2), (2.79)

Figure 2.30: Block diagram for image blurring.

where g(n1,n2) is the blurred image, f(n1,n2) is the original image, and b(n1,n2) is the blurring function impulse response. The objective is to estimate f(n1,n2) from g(n1,n2). Examples of what could cause blurring range from lens misfocus to atmospheric turbulence. Whether b(n1,n2) is known
will determine which method is selected for reducing image blur. Inverse filtering can be used if the blurring function is known. This allows us to write

G(ω1,ω2) = F(ω1,ω2)B(ω1,ω2) (2.80)

and

F(ω1,ω2) = G(ω1,ω2) / B(ω1,ω2). (2.81)

Figure 2.31: Inverse filtering block diagram.

The system shown in Figure 2.31 can recover f(n1,n2) from g(n1,n2) by dividing the spectrum of the blurred image by the frequency response of the blurring function. If the blurring function b(n1,n2) is not known, it must be estimated prior to inverse filtering. This method is called blind deconvolution, because the attempt
is made to deconvolve g(n1,n2) without accurate or detailed knowledge of b(n1,n2). See Lim [7] for details. The reduction of signal-dependent noise is accomplished by transforming the degraded signal into a domain where the noise becomes additive, signal-independent noise, and then using the methods discussed earlier to reduce that noise. To illustrate this approach, let's develop a model for an image g(n1,n2) degraded by multiplicative noise. Let

g(n1,n2) = f(n1,n2)v(n1,n2), (2.82)

where v(n1,n2) is random noise that is not a function of f(n1,n2). Apply the logarithmic operation to (2.82) to obtain

log g(n1,n2) = log f(n1,n2) + log v(n1,n2), (2.83)

and rewrite (2.83) to get

g'(n1,n2) = f'(n1,n2) + v'(n1,n2). (2.84)
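The log-domain conversion of (2.82)–(2.84) can be demonstrated on toy data (the signal values and noise level below are illustrative, not from the thesis):

```python
import math
import random

random.seed(0)
f = [100.0, 120.0, 80.0, 150.0]                     # toy signal values
v = [math.exp(random.gauss(0, 0.1)) for _ in f]     # multiplicative noise
g = [fi * vi for fi, vi in zip(f, v)]               # degraded signal (2.82)

log_g = [math.log(x) for x in g]                    # (2.83)
# In the log domain the noise term log v is additive and signal-independent:
residual = [lg - math.log(fi) for lg, fi in zip(log_g, f)]
expected = [math.log(vi) for vi in v]
print(max(abs(a - b) for a, b in zip(residual, expected)))  # ~0
```

After denoising in the log domain, exponentiation (the `exp` block of Figure 2.32) returns the estimate to the original intensity scale.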
This demonstrates how the multiplicative noise v(n1,n2) was transformed into the additive noise v'(n1,n2). Now, any number of previously discussed restoration algorithms can be used to reduce the additive, signal-independent noise.

Figure 2.32: Example of a system used for reduction of multiplicative noise (log transform, additive-noise reduction, then exponentiation).

Temporal filtering for image restoration involves the filtering of images that are correlated in time, such as motion pictures. A simple method of temporal filtering is frame averaging. Frame averaging is effective in processing a sequence of images in which the image doesn't change much from frame to frame but the character of the noise does. One example of frame averaging can be expressed as estimating an image f̂(n1,n2) from a sequence of N degraded image frames g_i(n1,n2) by writing

f̂(n1,n2) = (1/N) Σ_{i=1}^{N} g_i(n1,n2). (2.85)
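Equation (2.85) can be sketched directly (toy 2×2 frames with a hypothetical noise level):

```python
import random

random.seed(1)
N = 50
truth = [[10.0, 20.0], [30.0, 40.0]]          # noiseless 2x2 "image"
# N degraded frames: the same image with fresh zero-mean noise each frame
frames = [[[p + random.gauss(0, 2.0) for p in row] for row in truth]
          for _ in range(N)]

# (2.85): pixelwise average over the N frames
est = [[sum(frames[i][r][c] for i in range(N)) / N for c in range(2)]
       for r in range(2)]
print(est)   # close to truth: averaging cuts the noise variance by 1/N
```

The noise standard deviation drops from 2.0 to roughly 2.0/√50 ≈ 0.28, which is why the method requires the underlying image to stay constant across frames.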
Motion-compensated image restoration is an extension of frame averaging. For frame averaging to be effective, the image must remain constant and not change from frame to frame. This requirement is not typically met in applications such as television and motion pictures, so motion compensation routines estimate the movement of an image from frame to frame. In wavelet shrinkage denoising algorithms, the basic premise involves the thresholding of wavelet coefficients. A transformation from the spatial domain to the wavelet domain is required and is accomplished by the application of the wavelet transform. Models have been developed that provide a statistical basis for the selection of a threshold and result in the optimization of mean square error in the estimated signal. The model can be stated as follows: assume that we wish to recover an unknown function f from noisy data. Let y_i be the set of noisy observations, defined as

y_i = f(t_i) + σ z_i,  i = 0, ..., n − 1, (2.86)

where t_i = i/n, σ is the standard deviation of the noise, and the z_i are independent and identically distributed random variables distributed according to N(0,1). The
goal is to estimate f with small L2 risk, i.e., to minimize the mean square error. This can be written as

R(f̂, f) = (1/n) E‖f̂ − f‖² = (1/n) Σ_{j=0}^{n−1} E[f̂(t_j) − f(t_j)]². (2.87)

Donoho and Johnstone [1] proposed that (2.87) be subject to the condition that the optimization of the mean square error result in a signal estimate f̂ that is at least as smooth as f. A number of thresholding methods are discussed in Chapter 3. These differ in complexity, determination, and performance. However, there is one common theme that all of these techniques exhibit: shrinkage of empirical wavelet coefficients works best when the underlying set of the true coefficients of f is sparse. In the first levels of decomposition there are many coefficients, but not many that correspond to the actual underlying function; most of these coefficients are noise. As the decomposition continues, the remaining few large coefficients represent most of the underlying form of f. Therefore, any effective wavelet shrinkage technique must throw away most of the noisier or smaller coefficients and keep the larger ones.
3. Wavelet Thresholding

Once a threshold has been selected, how is it applied to the wavelet coefficients? There are two methods available for applying the threshold value λ to the data: hard and soft thresholding. These two methods are graphically displayed in Figure 3.1.

Figure 3.1: The hard and soft thresholding functions.
Hard thresholding is described by

x̂(t) = x(t) if |x(t)| > λ,
x̂(t) = 0    if |x(t)| ≤ λ  (hard thresholding function) (3.1)

and is implemented by including, in the inverse transform during reconstruction, all the empirical wavelet coefficients whose absolute value is above the threshold λ. The remaining coefficients, whose values are below the threshold λ, are set equal to zero. Hard thresholding, with its "keep or kill" approach, can potentially create discontinuities at wavelet coefficients equal to λ. Donoho and Johnstone [3] recognized that empirical wavelet coefficients consist of both signal and noise and attempted to isolate the signal portion by removing only the noise part. This led to the second and more popular type of thresholding, called soft thresholding. Soft thresholding is described by

x̂(t) = x(t) − λ if x(t) > λ,
x̂(t) = 0        if |x(t)| ≤ λ,
x̂(t) = x(t) + λ if x(t) < −λ  (soft thresholding function). (3.2)

When the soft thresholding function is applied to the empirical wavelet coefficients, not only are the coefficients whose absolute value is greater than the
threshold λ included in the inverse transform for reconstruction, but their values are shrunk towards zero by an amount equal to the threshold λ. The selection of the appropriate threshold is one of the fundamental problems in wavelet shrinkage techniques. Choosing a very large threshold will shrink almost all of the coefficients to zero, resulting in oversmoothing of the underlying function. On the other hand, choosing a very small threshold will result in a "wiggly" or noisy estimator. The methods for choosing the threshold λ can be grouped into two categories, global thresholding and level-dependent thresholding, which are discussed in detail in the following sections.
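The two rules (3.1) and (3.2) can be sketched directly:

```python
def hard_threshold(x, lam):
    """Hard thresholding (3.1): keep or kill."""
    return x if abs(x) > lam else 0.0

def soft_threshold(x, lam):
    """Soft thresholding (3.2): kill, and shrink survivors toward zero."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

coeffs = [-3.0, -0.5, 0.25, 1.5, 4.0]
print([hard_threshold(c, 1.0) for c in coeffs])  # [-3.0, 0.0, 0.0, 1.5, 4.0]
print([soft_threshold(c, 1.0) for c in coeffs])  # [-2.0, 0.0, 0.0, 0.5, 3.0]
```

Note how soft thresholding is continuous at ±λ, while hard thresholding jumps there; that jump is the source of the discontinuities mentioned above.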
3.1 Selective Wavelet Reconstruction

Wavelet shrinkage and thresholding methods were presented by Weaver in [10] to remove random noise from magnetic resonance images. Further work in selective wavelet reconstruction was performed by Donoho and Johnstone [1]. The basic idea behind selective wavelet reconstruction is to choose a small number of wavelet coefficients with which to represent the underlying function. This idea is based on the premise that a function can be well represented in terms of only a relatively small number of wavelet coefficients at various resolution levels. Since the larger true coefficients are typically the ones that are included in the selective reconstruction, when estimating an unknown function it makes sense to include only the coefficients that are larger than some specified threshold value. Therefore, in selective wavelet reconstruction, large coefficients are kept, and coefficients that are small relative to the threshold λ are set to zero. This thresholding operation acts on the noisy empirical wavelet coefficients, producing the estimated wavelet coefficients that are then used in the inverse transform algorithm during reconstruction.
3.2 Global Thresholding

Global thresholding refers to using a single value of λ for the entire set of empirical wavelet coefficients. In global thresholding it is typical to threshold only the coefficients at the highest resolution, thus leaving intact all low-level wavelet coefficients. These lower-level coefficients correspond to macro features of the data. Donoho and Johnstone [1] were the first to propose two methods of global thresholding that provide for the selection of the threshold λ to be used in wavelet shrinkage applications. Both of these methods use a single threshold for all the detail wavelet coefficients. The first one discussed is Minimax thresholding, where the threshold is set based on the sample size n. The Universal threshold method, or VisuShrink as it is commonly called, sets the threshold to λ = σ√(2 log n) and ensures that all of the detail coefficients are set to zero if no signal is present.
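The no-signal guarantee can be illustrated by applying the universal threshold, with soft thresholding, to pure-noise coefficients (σ is assumed known here; in practice it is usually estimated):

```python
import math
import random

random.seed(2)
n = 1024
sigma = 1.0                      # noise standard deviation (assumed known)
coeffs = [random.gauss(0, sigma) for _ in range(n)]   # pure-noise details

lam = sigma * math.sqrt(2 * math.log(n))              # universal threshold

def soft(x, lam):
    """Soft thresholding (3.2)."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

shrunk = [soft(c, lam) for c in coeffs]
survivors = sum(1 for c in shrunk if c != 0.0)
print(lam, survivors)   # lam ~ 3.72 for n = 1024; nearly all coefficients zeroed
```

The threshold grows like √(2 log n), which is (asymptotically) the expected maximum of n independent N(0,1) noise values; this is why an all-noise input is shrunk to an essentially clean zero signal.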
3.2.1 Minimax Thresholding

Minimax thresholding uses a fixed threshold that depends on the sample size n and is derived to minimize the constant term in the upper bound of the risk involved in estimating the function. The Minimax threshold λ* is defined as the threshold that minimizes the quantity

Λ*_n = inf_λ sup_θ { R_λ(θ) / [n^−1 + min(θ², 1)] }, (3.3)

where Λ*_n is called the minimax risk bound. R_λ(θ) can be rewritten as

R_λ(θ) = V_λ(θ) + M_λ(θ), (3.4)

where V_λ(θ) = var(δ_λ(X)) and M_λ(θ) = [E(δ_λ(X)) − θ]². The Minimax threshold applies the optimal threshold in terms of L2, or mean square error, risk. The Minimax threshold values in Table 3.1 are taken from the table presented in Donoho and Johnstone [1]. An example of Minimax thresholding can be seen in Figure 3.2. As mentioned above, this type of thresholding results in a procedure that performs
    n       λ          n        λ
    64      1.474      2048     2.414
    128     1.669      4096     2.594
    256     1.860      8192     2.773
    512     2.047      16384    2.952
    1024    2.232      32768    3.131

Table 3.1: Minimax thresholds for various sample sizes, from Donoho and Johnstone [1]

well in terms of mean-squared error, but at the expense of the underlying smoothness of the estimated function. Regardless of the sample size $n$, the risks achieved by the Minimax thresholds are close to the ideal risk and are smaller than those of the Universal or VisuShrink method. It should be noted that both the Universal and Minimax thresholds tend to be larger than necessary and therefore tend to oversmooth the underlying function. This is the main weakness of global thresholding methods, and it is why data-driven threshold procedures are more desirable in denoising applications. Data-driven thresholds tend to preserve more features of the underlying function, as can be observed from the denoising results displayed in Figure 3.2.
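Because the Minimax threshold is tabulated rather than given in closed form, a practical implementation amounts to a table lookup. A minimal Python sketch of such a lookup (the function name and the scaling by the noise level $\sigma$ are illustrative assumptions, not part of [1]):

```python
# Minimax thresholds from Table 3.1 (Donoho and Johnstone [1]),
# indexed by sample size n; the tabulated values assume unit noise.
MINIMAX_THRESHOLDS = {
    64: 1.474, 128: 1.669, 256: 1.860, 512: 2.047, 1024: 2.232,
    2048: 2.414, 4096: 2.594, 8192: 2.773, 16384: 2.952, 32768: 3.131,
}

def minimax_threshold(n, sigma=1.0):
    """Look up the tabulated Minimax threshold for sample size n and
    scale it by the noise standard deviation sigma."""
    return sigma * MINIMAX_THRESHOLDS[n]
```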
3.2.2 Universal Thresholding (VisuShrink)

The second and more flexible technique proposed by Donoho and Johnstone [1] for choosing a threshold is called the Universal threshold or VisuShrink method. The universal threshold is $\lambda = \sigma\sqrt{2\log n}$, where $\sigma$ is the standard deviation of the noise and $n$ is the sample size. This technique is similar to the Minimax method in that both will result in estimators that perform well in

Figure 3.2: Blocky Function, n = 2048, SNR = 5 dB, estimated by both the Global Universal and the Global Minimax methods. (Panels: true Blocky function; noisy Blocky function, SNR = 5 dB; Global Universal thresholding; Global Minimax thresholding.)
terms of mean-squared error, but is simpler to implement and does not require the use of a lookup table. The Universal threshold value is larger than the Minimax threshold for any particular value of $n$, thus resulting in an estimate that is smoother than the Minimax estimator. This method is called VisuShrink for this reason; a smoother estimator is considered more visually appealing. The results of the two methods are displayed in Figure 3.2. The differences are subtle but detectable. The Minimax method does a better job at picking up jumps and discontinuities, but at the expense of overall smoothness. The VisuShrink method results in smooth estimates but does not pick up the jumps and other abrupt features as well.

But how and why does VisuShrink result in a smoother estimate? Assume that we have a noiseless signal. Taking the wavelet transform of this signal will result in a sparse vector, with most of the coefficients essentially equal to zero. If the same procedure is repeated after some noise has been added, these coefficients are now nonzero. If a sample that ought to be zero in the noiseless case is nonzero in the noisy case, the reconstruction will preserve this and exhibit a noisy appearance, with small blips against a clear background. The VisuShrink thresholding method safeguards against allowing spurious noise into the reconstruction. This is due to the fact that if $z_1, \ldots, z_n$ represents an
independent and identically distributed white noise sequence with distribution $N(0,1)$, then the probability

$$P\Big[\,|z_i| \le \sqrt{2\log n}\ \ \text{for all}\ i = 1, \ldots, n\,\Big] \to 1 \qquad (3.5)$$

as $n$ goes to infinity. This essentially says that the probability of all noise being shrunk to zero is very high for large samples. Since the Universal threshold procedure is based on this asymptotic result, it does not always perform well in small-sample situations.

Referring back to the equation for the universal threshold, $\lambda = \sigma\sqrt{2\log n}$, it is important to note that this equation requires knowledge of $\sigma$, the standard deviation of the noise. While estimating noise levels in a parametric setting is straightforward, such is not the case in nonparametric regression. Nonparametric regression is a field where, for example, a curve may be fitted to the data without assuming any structure to, or knowledge of, the underlying function. In most cases the true value of the standard deviation of the noise is not known, and taking the usual standard deviation of the coefficients will result in an overestimation of $\sigma$. Donoho and Johnstone [1] proposed replacing $\sigma$ with the estimate $\hat\sigma = \mathrm{MAD}/0.6745$, where MAD is the median absolute deviation of the finest-
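The estimate $\hat\sigma = \mathrm{MAD}/0.6745$ and the resulting universal threshold are straightforward to compute. A Python/NumPy sketch (following the common convention of taking the median of the absolute coefficients, since wavelet detail coefficients are centered near zero; the function names are illustrative):

```python
import numpy as np

def mad_sigma(finest_detail):
    """Noise-level estimate sigma_hat = MAD / 0.6745, computed from the
    finest-scale wavelet detail coefficients."""
    d = np.abs(np.asarray(finest_detail, dtype=float))
    return np.median(d) / 0.6745

def universal_threshold(finest_detail, n):
    """VisuShrink threshold lambda = sigma_hat * sqrt(2 log n)."""
    return mad_sigma(finest_detail) * np.sqrt(2.0 * np.log(n))
```

The constant 0.6745 is the median absolute deviation of a standard normal variable, so for pure $N(0,1)$ noise the estimate is close to one.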
scale wavelet coefficients. The reason for considering only the empirical wavelet coefficients at the highest level of resolution is that these tend to consist mainly of noise. This is not a perfect estimate, as the value may suffer an upward bias due to the presence of some signal at that level. But by using the median absolute deviation for the standard deviation of the noise, this bias is effectively controlled.
3.3 Level-dependent Thresholding

In Section 3.2, global thresholding was described as having all of the wavelet coefficients below a certain resolution level shrunk according to a single value of $\lambda$; the lower-level coefficients were included in the reconstruction "as is." However, what if it is desired to have thresholds that change depending on the level of resolution and the characteristics of the empirical wavelet coefficients? In level-dependent thresholding, a different threshold value $\lambda_j$ is chosen for each wavelet level $j$. It is only natural to group empirical wavelet coefficients by level and then treat each level separately. After all, each level differs in overall frequency content and contains varying amounts of signal and noise information. Popular data-dependent threshold selection procedures by Donoho and Johnstone [2], Ogden and Parzen [4], [5], and Hilton and Ogden [6] will be examined in the following sections.
3.3.1 SureShrink

Donoho and Johnstone [2] developed a scheme that uses the wavelet coefficients at each level $j$ to determine a threshold $\lambda_j$ with which to shrink the wavelet coefficients at that level. This thresholding method is based on the unbiased risk estimation of Stein [11], which minimizes estimated mean-squared error. Recall that we developed a model that describes a collection of noisy observations $y_i$ as

$$y_i = f(t_i) + \sigma z_i, \qquad i = 0, \ldots, n-1 \qquad (3.6)$$

where $t_i = i/n$, $\sigma$ is the standard deviation of the noise, and the $z_i$ are independent and identically distributed random variables distributed according to $N(0,1)$. Our goal is to estimate $f$ with small mean-squared error, i.e., to find an estimate $\hat f$ that depends on $y_0, \ldots, y_{n-1}$ with small risk, such that

$$R(\hat f, f) = \frac{1}{n}\,E\,\|\hat f - f\|_2^2 = \frac{1}{n}\sum_{i=0}^{n-1} E\big(\hat f(t_i) - f(t_i)\big)^2 \qquad (3.7)$$

is small.
But in many instances we perform statistical operations on coefficients in the wavelet domain. What happens to this risk in the wavelet domain? Writing the orthogonal Discrete Wavelet Transform as the matrix $W$, we can rewrite (3.6) in the wavelet domain as

$$w = Wy = Wf + \sigma Wz \qquad (3.8)$$

and, componentwise,

$$w_i = \theta_i + \sigma \tilde z_i, \qquad \theta = Wf, \quad \tilde z = Wz. \qquad (3.9)$$

The orthogonality of the Discrete Wavelet Transform results in the fact that the wavelet transform of white Gaussian noise is also white Gaussian noise of the same amplitude. The wavelet coefficients of a noisy sample set are themselves just noisy versions of the noiseless wavelet coefficients. This is very significant, because we can transform the original data into its wavelet coefficients and then attempt to minimize the risk in the wavelet domain; this will automatically minimize the risk in the original domain. Finally, we can write the estimate as

$$\hat f(t_i) = \Big(W^{-1}\,\delta_\lambda\big(Wy;\ \sigma\big)\Big)_i \qquad (3.10)$$

where $\delta_\lambda$ denotes the thresholding rule applied to the empirical wavelet coefficients.
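The claim that an orthogonal transform maps white Gaussian noise to white Gaussian noise of the same amplitude is easy to verify numerically. The sketch below uses a recursively built orthonormal Haar matrix as a stand-in for the DWT; the construction is illustrative and is not the transform used elsewhere in this thesis:

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar transform matrix for n a power of two."""
    if n == 1:
        return np.eye(1)
    h = haar_matrix(n // 2)
    averages = np.kron(h, [1.0, 1.0])                # coarse averages
    details = np.kron(np.eye(n // 2), [1.0, -1.0])   # fine details
    return np.vstack([averages, details]) / np.sqrt(2.0)

W = haar_matrix(256)
rng = np.random.default_rng(1)
z = rng.standard_normal((256, 2000))  # 2000 realizations of unit white noise
wz = W @ z                            # transformed noise
# Since W is orthogonal, wz is again white noise with unit variance.
```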
Continuing with the SureShrink discussion, let $\mu = (\mu_i : i = 1, \ldots, d)$ be a $d$-dimensional vector, and let $x_i \sim N(\mu_i, 1)$ be multivariate normal observations with that mean vector. Also let $\hat\mu = \hat\mu(x)$ be a fixed estimator of $\mu$. Stein [11] introduced a method for estimating the loss $\|\hat\mu - \mu\|^2$ in an unbiased fashion. He showed that the loss can be estimated unbiasedly for any estimator of $\mu$ that can be written as $\hat\mu(x) = x + g(x)$, where $g = (g_i)_{i=1}^{d}$ is weakly differentiable:

$$E\,\|\hat\mu(x) - \mu\|^2 = d + E\big[\|g(x)\|^2 + 2\,\nabla\!\cdot g(x)\big] \qquad (3.11)$$

where $\nabla\!\cdot g = \sum_i \partial g_i/\partial x_i$. Using the soft threshold estimator $\hat\mu_k = \delta_\lambda(x_k)$, the function $g$ is

$$g_k(x) = \delta_\lambda(x_k) - x_k = \begin{cases} -\lambda, & x_k > \lambda \\ -x_k, & -\lambda \le x_k \le \lambda \\ \lambda, & x_k < -\lambda \end{cases} \qquad (3.12)$$
which gives Stein's unbiased estimate of risk for any set of observed data $x_1, \ldots, x_d$ as

$$\mathrm{SURE}(\lambda; x) = d - 2\,\#\{k : |x_k| \le \lambda\} + \sum_{k=1}^{d}\min(|x_k|, \lambda)^2 \qquad (3.13)$$

and

$$E\,\|\hat\mu^{(\lambda)}(x) - \mu\|^2 = E\,\mathrm{SURE}(\lambda; x). \qquad (3.14)$$

The threshold level is set so as to minimize the estimate of risk for the observed data $x_1, \ldots, x_d$, and $\lambda$ can be written as

$$\lambda = \arg\min_{t \ge 0}\ \mathrm{SURE}(t; x). \qquad (3.15)$$

The SURE method can be expected to do well in terms of minimizing risk, since for large sample sizes the Law of Large Numbers guarantees that the SURE criterion is close to the true risk, and the threshold will be nearly optimal for a given dataset.
The implementation of the above equation for $\mathrm{SURE}(\lambda; x)$ is not immediately obvious. At this point, a computational example demonstrating the minimization of the SURE criterion is valuable and insightful. Let the empirical wavelet coefficients be denoted as the set of data $x = (x_1, \ldots, x_d)$, where $d$ is the length of the data vector. For this example, the wavelet coefficients were taken from a level 6 decomposition of the Blocky function.

Figure 3.3: Demonstration of the SURE thresholding algorithm. (Panels: Blocky Level 6 coefficients; absolute value; sorted in ascending order; squared.)

The Blocky
function was created with $n = 2048$ and a signal-to-noise ratio of 5. The SURE thresholding procedure is demonstrated in Figures 3.3 and 3.4. The noisy wavelet coefficients for level 6 of the noisy Blocky function are shown in Figure 3.3. To prepare the empirical wavelet coefficients for input into the SURE criterion, the following operations are performed:

1) Take the absolute value of the wavelet coefficients.
2) Sort the wavelet coefficients in ascending order.
3) Square the wavelet coefficients. Denote this vector as wc2.

The resultant vector of coefficients, wc2, is then input into the risks calculation. The risks function (here for $d = 1024$ coefficients, in Matlab notation) is defined as

    risks = [1024 - (2*(1:1024)) + (cumsum(wc2) + (1023:-1:0).*wc2)].  (3.16)

Since the coefficients are reordered in order of increasing absolute value $|x_i|$, the SURE risk is strictly increasing in $\lambda$ between adjacent sample values. It is also strictly increasing between zero and the smallest $|x_i|$, and it no longer changes beyond $\lambda = \max_i |x_i|$, so the minimum must occur at zero or at one of the $|x_i|$. The minimum value of the risks vector and its index are then determined. The threshold is calculated by taking the square root of the value at that index in
the vector of the squared coefficients, wc2. This technique is illustrated in Figure 3.4. The dashed line on the bottom graph in Figure 3.4 indicates the value of the threshold selected by the SURE technique. In this case, all coefficients below this

Figure 3.4: SURE criterion for the Blocky function, Level 6. (Panels: risks vector (SURE), Level 6; wc2 vector.)

line will be set equal to zero, while those above it will be shrunk toward zero by the same amount. Remember that the SURE technique is a data-adaptive scheme that is typically applied to wavelet coefficients at all levels. As stated previously, this allows the coefficients themselves to determine if and how much shrinkage is
required. It is not uncommon for the threshold to be set to zero for lower-level coefficients.

However, there is one problem with the SURE method. Refer to Figure 3.5 for a demonstration. Notice that level 3 seems to have more "small" coefficients than large ones. It may appear that the threshold selected is set too low, and that too many "small" coefficients will be included in the reconstruction.

Figure 3.5: SURE criterion for the Blocky function, Level 3. (Panels: risks vector (SURE), Level 3; wc2 vector.)
The consequence will be that the resulting estimator is noisier. Additionally, it can be seen that the risks function in Figure 3.5 is relatively flat near the location where it achieves its minimum. This indicates that there is a wide range of possible thresholds. The choice of threshold may make a small quantitative difference in terms of estimated risk, but a significant qualitative difference in terms of the overall smoothness of the resultant underlying function. Donoho and Johnstone [2] recognized this problem and noted that the SURE threshold selection method does not perform well in cases where the wavelet representation at any level is sparse, i.e., when most of the wavelet coefficients are close to zero. This is due to the noise from the coefficients that are close to zero overwhelming what little signal may be present from the nonzero coefficients. Donoho and Johnstone proposed a hybrid SURE method to get around this issue. The hybrid method essentially tests for sparseness at each level. It uses the sample variance at each level to determine if the representation at that level is sparse. The sample variance at any level can be calculated as

$$s_d^2 = \frac{1}{d}\sum_{k=1}^{d} x_k^2 \qquad (3.17)$$
and the wavelet coefficients at the level are judged to be sparse if

$$s_d^2 \le 1 + \frac{(\log_2 d)^{3/2}}{\sqrt{d}}. \qquad (3.18)$$

If the set of coefficients is judged sparse, then the hybrid scheme defaults to the universal threshold $\lambda = \sigma\sqrt{2\log n}$. Otherwise, the SURE criterion is used to determine the threshold value. Examples of both the SURE-based and hybrid SURE thresholding methods are shown in Figures 3.6 and 3.7. The Blocky function in Figure 3.6 was created with $n = 2048$ and a signal-to-noise ratio of 5, while the Heavy Sine function in Figure 3.7 was created with $n = 512$ and a signal-to-noise ratio of 2. The differences between the SURE estimator and the hybrid SURE estimator are clearly demonstrated and easily distinguishable. Note that in the locations where the function is smooth, the empirical coefficients at higher resolution levels are almost entirely white noise. The standard SURE method does not recognize this and results in a noisy estimator, whereas the hybrid SURE technique determines that the coefficients at that level meet the sparseness criterion and shrinks most of these coefficients to zero.
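The hybrid decision of (3.17)-(3.18) amounts to comparing the mean squared coefficient against the sparseness bound. A Python sketch, with the SURE search inlined so the fragment is self-contained (the function name is illustrative, and the coefficients are assumed normalized to unit noise so the universal threshold uses $\sigma = 1$ with $n = d$):

```python
import numpy as np

def hybrid_sure_threshold(coeffs):
    """Hybrid SURE threshold for one level of noise-normalized
    coefficients: universal threshold when the level is sparse per
    (3.17)-(3.18), SURE-minimizing threshold otherwise."""
    x = np.sort(np.abs(np.asarray(coeffs, dtype=float)))
    d = x.size
    s2 = np.sum(x ** 2) / d                            # (3.17)
    if s2 <= 1.0 + np.log2(d) ** 1.5 / np.sqrt(d):     # (3.18): sparse level
        return np.sqrt(2.0 * np.log(d))                # universal fallback
    k = np.arange(1, d + 1)                            # SURE search, as in (3.16)
    risks = d - 2.0 * k + np.cumsum(x ** 2) + (d - k) * x ** 2
    return x[np.argmin(risks)]
```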
Figure 3.6: Blocky function, n = 2048 and SNR = 5, estimated by the SURE and hybrid SURE methods. (Panels: original signal; noisy signal, SNR = 5; denoised signal, SURE; denoised signal, hybrid SURE.)

Figure 3.7: Heavy Sine function, n = 512 and SNR = 2, estimated by the SURE and hybrid SURE methods. (Panels: original signal; noisy signal, SNR = 2; denoised signal, SURE; denoised signal, hybrid SURE.)
The SureShrink algorithm consists of the following steps:

1) Perform a DWT of the noisy data and obtain the empirical wavelet coefficients. Normalize the coefficients by $\sigma$. Sort the absolute values of the coefficients and then square the results.

2) Calculate Stein's Unbiased Estimate of Risk,

$$\mathrm{SURE}(\lambda; x) = d - 2\,\#\{k : |x_k| \le \lambda\} + \sum_{k=1}^{d}\min(|x_k|, \lambda)^2. \qquad (3.19)$$

Determine the minimum of the SURE vector and record its index. Take the square root of the value at that index in the vector of the squared coefficients. This value is the threshold $\lambda$.

3) Perform soft thresholding of the noisy wavelet coefficients.

4) Reconstruct the signal from the thresholded wavelet coefficients with the IDWT.
3.3.2 Data-Analytic Threshold Selection: Brownian Bridge

It has been shown that wavelets are powerful tools for estimating an underlying function that may contain multiple jumps, discontinuities, or sharp spikes. Typically, these methods use only the magnitudes of the wavelet coefficients in choosing a threshold for each level. In addition to the relative magnitude of the coefficients, it may also be desirable to include the information contained in the position of the "large" coefficients. Since a jump or discontinuity in the underlying function results in several nonzero coefficients adjacent to each other, a change-point approach can take advantage of the information contained in the relative position as well as the magnitude of the coefficients. As discussed in the previous section, Donoho and Johnstone [2] described their SureShrink procedure and a related hybrid scheme for when the wavelet coefficient representation is sparse. SureShrink uses the coefficients at each level to select a threshold that minimizes Stein's Unbiased Risk Estimate. Ogden and Parzen [4] propose a different approach to this problem.

The general recursive scheme developed by Ogden and Parzen in [4] and [5] operates on a level-by-level basis, similar to the SURE method. The empirical coefficients are examined at each level, and a standard statistical test is performed to determine whether the set of coefficients at that level behaves like white noise or whether significant signal is present. If it is determined
that significant signal is present, the largest coefficient in absolute value is removed and the remaining set of coefficients is tested again. The recursive process continues until the original set of empirical wavelet coefficients has been separated into a set of coefficients judged to contain significant signal and a set of coefficients that behaves like white noise. At this point, all of the noise coefficients are shrunk to zero, and the signal coefficients are shrunk by the amount required to shrink the noisy coefficients to zero.

At the heart of this technique is the use of statistical tests of hypotheses. In this case, the hypothesis tested results in the formation of a set of large coefficients that has passed the test of significance, i.e., signal. Hilton and Ogden [6] state this hypothesis as

$$H_0: \mu_1 = \cdots = \mu_d = 0$$
$$H_a: \text{there exists at least one } i \in \{1, \ldots, d\} \text{ such that } \mu_i \ne 0 \qquad (3.20)$$

where $\mu_1, \ldots, \mu_d$ denote the true wavelet coefficients at any single level. These are estimated by the empirical coefficients $X_1, X_2, \ldots, X_d$. The threshold selection at each level is based on testing (3.20) recursively, with the null hypothesis $H_0$ specifying no significant coefficients and the alternative hypothesis $H_a$ allowing
one or more coefficients to be nonzero. The test statistic for the hypotheses listed above is based on a Brownian bridge stochastic process. It is defined as

$$B\!\left(\frac{j}{d}\right) = \frac{1}{\hat\sigma^2\sqrt{2d}}\sum_{i=1}^{j}\left(X_i^2 - \overline{X^2}\right), \qquad j = 1, \ldots, d \qquad (3.21)$$

where $\hat\sigma$ is the standard deviation of the noise and $\overline{X^2}$ is the mean of all of the squared empirical wavelet coefficients. In most cases the value of the noise standard deviation is not known and must be replaced by $\hat\sigma = \mathrm{MAD}/0.6745$, where MAD is the median absolute deviation of the finest-scale wavelet coefficients [1]. Under the null hypothesis, $\{B(u): 0 \le u \le 1\}$ in (3.21) converges in distribution to a Brownian bridge stochastic process: a zero-mean continuous Gaussian process with covariance kernel $\min(u, v) - uv$ that is zero at 0 and 1.

The fundamental idea behind this scheme rests on the typical behavior of a sample Brownian bridge process. As stated previously, the empirical wavelet coefficients are examined level by level. If none of the coefficients contributes significant signal, the function in (3.21) should resemble the typical behavior of a Brownian bridge process: it will oscillate about zero without too great an excursion in either direction. Under the hypothesis $H_a$, however, the Brownian
bridge process will not oscillate about zero and will exhibit large deviations from zero in one direction or the other.

Figure 3.8 shows the Brownian bridge process for the empirical wavelet coefficients at level 4 of the noisy Blocky function with $n = 2048$ and a signal-to-noise ratio of 5. The top graph in Figure 3.8 is the full data set, before any coefficients have been removed.

Figure 3.8: Brownian bridge process behavior. (Panels: full data set, Level 4; reduced data set (alpha = 0.05), Level 4; reduced data set (alpha = 0.9), Level 4.)

The second and third graphs more closely resemble the typical qualities of a sample Brownian bridge process. The largest coefficients have been removed according to the choice of $\alpha$. The level $\alpha$ acts as a smoothing parameter and allows the user to maintain control over the smoothness of the resulting estimated function. It
can be set to a smaller value, thus underfitting the data and producing a smooth result similar to the VisuShrink method, or it can be set to a larger value, overfitting the data and producing a result similar to SureShrink.

The computational steps for this recursive algorithm are described below:

Step 1: Form the Brownian bridge process as given in (3.21).

Step 2: Test the process in Step 1 for white noise, using the Kolmogorov-Smirnov test statistic. Note: once the Brownian bridge process has been formed, any of several test statistics can be used to test it.

Step 3: If the test in Step 2 is significant, remove the wavelet coefficient with the greatest absolute value from the dataset, set $d = d - 1$, and return to Step 1.

Step 4: If the test in Step 2 is not significant, set the threshold equal to the largest wavelet coefficient (in absolute magnitude) in the current dataset. The noise coefficients are shrunk to zero, while the signal coefficients are shrunk by the same amount.
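Steps 1 through 4 can be sketched in Python as follows. The normalization of the bridge and the use of the standard alpha = 0.05 Kolmogorov-Smirnov critical value 1.36 are assumptions of this sketch; Ogden and Parzen [4], [5] should be consulted for the definitive form of the recursion.

```python
import numpy as np

def brownian_bridge(coeffs, sigma):
    """Step 1: mean-corrected, normalized cumulative sum of squared
    coefficients, per (3.21); approximately a Brownian bridge under
    the white-noise null hypothesis."""
    x2 = (np.asarray(coeffs, dtype=float) / sigma) ** 2
    return np.cumsum(x2 - x2.mean()) / np.sqrt(2.0 * x2.size)

def bridge_threshold(coeffs, sigma, crit=1.36):
    """Steps 2-4: while the KS-type statistic max|B| is significant,
    remove the largest coefficient in absolute value; the threshold is
    the largest coefficient left."""
    x = np.asarray(coeffs, dtype=float)
    mask = np.ones(x.size, dtype=bool)
    while mask.sum() > 1:
        if np.max(np.abs(brownian_bridge(x[mask], sigma))) <= crit:
            break  # Step 4: remaining set behaves like white noise
        live = np.flatnonzero(mask)  # Step 3: drop the largest coefficient
        mask[live[np.argmax(np.abs(x[live]))]] = False
    return np.max(np.abs(x[mask]))
```

Keeping a boolean mask rather than physically deleting entries preserves the positions of the surviving coefficients, which is the point of the change-point approach.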
Figure 3.9: Blocky function, Level 4, denoising results of the Brownian bridge technique. (Panels: wavelet coefficients, Level 4; noisy coefficients, Level 4; "large" coefficients (alpha = 0.05), Level 4; "large" coefficients (alpha = 0.9), Level 4.)

In this context, the goal of a standard change-point procedure is to determine from the data whether a set of empirical wavelet coefficients represents noise or signal. Ogden and Parzen [4], [5] use the mean-corrected cumulative sum process. In Figure 3.10, plots of the cumulative sum of the wavelet coefficients, the cumulative sum of the absolute values of the coefficients, and the cumulative sum of the squared wavelet coefficients are displayed. By looking at the plot of the cumulative sum of the wavelet coefficients, it is difficult to detect where the changes are taking place. This is because the coefficients do not have the same
Figure 3.10: Change-point analysis in the Brownian bridge process. (Panels: noisy coefficients, Level 4; CumSum, Level 4; CumSum(Abs), Level 4; CumSum(Squared), Level 4.)

sign and tend to cancel each other out when added together. The plot of the cumulative sum of the absolute values of the coefficients gives a slightly better idea of where the jumps are located. More revealing still is the plot of the cumulative sum of the squared coefficients: it is easy to see where the underlying function is relatively flat and where it increases rapidly. The rapid increases correspond to jumps or discontinuities in the underlying Blocky function.
Figure 3.11: Denoising results of the Brownian bridge process. (Panels: true Blocky function; noisy Blocky function; data-dependent estimate, alpha = 0.05; true wavelet coefficients; noisy wavelet coefficients; data-dependent coefficients, alpha = 0.05.)

The denoising results of the Brownian bridge process are displayed in Figure 3.11. On the following page, Figure 3.12 compares the results of the Brownian bridge process to both VisuShrink and SureShrink. Note how the magnitude of $\alpha$ can produce an estimator that closely resembles the denoising results obtained with either the VisuShrink or SureShrink algorithms.
Figure 3.12: Comparison of various denoising techniques: VisuShrink, SureShrink, and the Brownian bridge process. (Panels: true Blocky function; noisy Blocky function, SNR = 5 dB; Global Universal thresholding; SURE thresholding; data-dependent, alpha = 0.05; data-dependent, alpha = 0.9.)
3.3.3 Data-Analytic Threshold Selection: Brownian Sheet

The techniques discussed in the previous section can be readily adapted to a two-dimensional problem. Let $\{\mu_{i,j} : 1 \le i, j \le d\}$ denote the true wavelet coefficients at any single level. These are estimated by the empirical coefficients $X_{i,j}$. The cumulative sum process is formed using

$$B_s\!\left(\frac{i}{d}, \frac{j}{d}\right) = \frac{1}{\hat\sigma^2\sqrt{2}\,d}\sum_{k=1}^{i}\sum_{l=1}^{j}\left(X_{k,l}^2 - \overline{X^2}\right). \qquad (3.22)$$

The cumulative sum process can now be viewed as a continuous function on an equally spaced two-dimensional grid. It converges in distribution to a Brownian sheet stochastic process: a zero-mean continuous Gaussian process with covariance kernel $\min(u_1, u_2)\min(v_1, v_2) - u_1 v_1 u_2 v_2$. This process is pinned down on two sides and at the opposite corner, since $B_s(u, 0) = B_s(0, v) = B_s(1, 1) = 0$.

The computational steps for the recursive algorithm are described below, with slight modifications relative to the Brownian bridge process:

Step 1: Form the Brownian sheet process (Equation 3.22). The empirical wavelet coefficients are normalized by $\hat\sigma$ and squared, and are then used to form the Brownian sheet.
Step 2: Test the process in Step 1 for white noise. Calculate the test statistic and compare it with the appropriate critical value.

Step 3: If a coefficient is judged significant, replace it with a placeholder. In the Brownian bridge process, removing a coefficient from a one-dimensional vector and setting $d = d - 1$ was no problem; the same approach is not possible with a grid. Instead, replace the removed coefficient with a placeholder (either 0 or 1 can be used). Return to Step 1.

Step 4: If the test in Step 2 is not significant, set the threshold equal to the largest wavelet coefficient in absolute magnitude in the current dataset. The noise coefficients are shrunk to zero, while the signal coefficients are shrunk by the same amount.

The test statistic in the two-dimensional case is

$$K_d = \max_{1 \le i,j \le d}\left|B_s\!\left(\frac{i}{d}, \frac{j}{d}\right)\right| \qquad (3.23)$$

and is compared with the appropriate critical value to determine whether the coefficients in the current dataset behave like white noise. Simulated critical points for this squared process, taken from Hilton and Ogden [6], are tabulated in Table 3.2.
    α      N=8     N=16    N=32    N=64    N=128   N=256   N=512
    .999   0.4248  0.6160  0.7463  0.8292  0.8777  0.9138  0.9394
    .990   0.5432  0.7412  0.8656  0.9411  0.9937  1.0312  1.0512
    .950   0.6832  0.8767  0.9966  1.0714  1.1244  1.1586  1.1816
    .900   0.7745  0.9604  1.0798  1.1524  1.2048  1.2391  1.2619
    .800   0.8987  1.0775  1.1922  1.2655  1.3168  1.3522  1.3738
    .700   1.0009  1.1751  1.2862  1.3568  1.4089  1.4438  1.4671
    .600   1.0969  1.2674  1.3742  1.4413  1.4960  1.5309  1.5533
    .500   1.1962  1.3582  1.4622  1.5291  1.5850  1.6192  1.6401
    .400   1.3044  1.4565  1.5580  1.6245  1.6780  1.7139  1.7348
    .300   1.4286  1.5691  1.6671  1.7317  1.7848  1.8208  1.8440
    .200   1.5873  1.7102  1.8068  1.8703  1.9211  1.9563  1.9773
    .100   1.8333  1.9274  2.0120  2.0743  2.1228  2.1633  2.1763
    .050   2.0524  2.1250  2.1915  2.2539  2.3038  2.3415  2.3570
    .025   2.2611  2.3003  2.3616  2.4166  2.4668  2.5032  2.5162
    .010   2.5214  2.5167  2.5620  2.6092  2.6752  2.6986  2.7060
    .005   2.7098  2.6698  2.7048  2.7278  2.8098  2.8363  2.8427
    .001   3.1201  3.0210  3.0192  3.0158  3.0158  3.1318  3.1581

Table 3.2: Simulated critical points for the finite-sample Brownian sheet process based on squared normal random variables, from Hilton and Ogden [6]
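Forming the sheet and its statistic $K_d$ from (3.22)-(3.23) is a pair of cumulative sums along the two grid axes. A NumPy sketch (the normalization mirrors the one-dimensional bridge and is an assumption of this reconstruction; the function name is illustrative):

```python
import numpy as np

def brownian_sheet_stat(coeffs2d, sigma):
    """K_d = max |Bs(i/d, j/d)|: sup-norm of the mean-corrected double
    cumulative sum of squared, noise-normalized coefficients,
    per equations (3.22)-(3.23)."""
    x2 = (np.asarray(coeffs2d, dtype=float) / sigma) ** 2
    d = x2.shape[0]
    sheet = np.cumsum(np.cumsum(x2 - x2.mean(), axis=0), axis=1)
    return np.max(np.abs(sheet)) / (np.sqrt(2.0) * d)

# Critical value for a 32x32 grid at alpha = 0.05, from Table 3.2:
CRIT_32_05 = 2.1915
```

A level whose statistic exceeds the tabulated critical value is judged to contain significant signal, and the recursion of Steps 1-4 proceeds.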
Figure 3.13: Input into the formation of the Brownian sheet. The coefficients are the horizontal subband from the first-level decomposition of the Box function (32x32) and have been normalized and squared.
Figure 3.14: Brownian sheet formed from the input in Figure 3.13 (shown alongside the 64x64 Box function); no statistical tests have been performed yet.

Figure 3.15: The Brownian sheet after all significant coefficients have been removed.
Figure 3.16: Noisy horizontal empirical wavelet coefficients from the Level 1 decomposition (32x32).

Figure 3.17: Denoised horizontal empirical wavelet coefficients from the Level 1 decomposition (32x32).
4. Computer Simulations and Comparative Analysis

This chapter presents the results of the Matlab simulations and compares the two-dimensional denoising algorithms of Donoho and Johnstone [1], [2] and Hilton and Ogden [6]. Two images were chosen for analysis and comparison: the Lena image and a 64x64 Box function. Lena was chosen because it is one of the most popular and widely used standard test images for compression and denoising algorithms. Figure 4.1 shows the original Lena image along with a corrupted

Figure 4.1: Original and noisy versions of the Lena image (512x512), with wavelet coefficients from a Level 3 decomposition. The Lena image was corrupted by N(0, 20) additive white Gaussian noise.
version. To corrupt Lena, white noise with a mean of zero and a standard deviation of 20 was added to the image. Also shown in Figure 4.1 are the wavelet

Figure 4.2: Level 3 decomposition of the original and noisy Lena images.
coefficients of the original and noisy images. Finally, the results of a three-level decomposition using the Daubechies db3 mother wavelet can be seen in Figure 4.2 for both the original and noisy images.

The second two-dimensional signal used in the simulations is a 64x64 Box function. This image is uniform except for the sharp, hard edges associated with the Box function itself. The original and noisy Box functions, along with their wavelet coefficients, can be seen in Figure 4.4. White noise with a mean of zero and a standard deviation of 0.2 was added to corrupt the Box function. As with the Lena image, a three-level decomposition and reconstruction was performed, but both the Haar and Daubechies db3 wavelets were used in the simulations. The Haar and Daubechies db3 scaling and wavelet functions are shown in Figure 4.3.

Figure 4.3: Scaling and wavelet functions for the db3 and Haar wavelets.
Figure 4.4: Original and noisy versions of the 2-D Box function (64x64), with their wavelet coefficients.
Wavelet Transform using the denoised wavelet coefficients as input. The standard deviation of the noise was estimated from the first decomposition level using $\hat\sigma = \mathrm{MAD}/0.6745$. The vertical, diagonal, and horizontal coefficients were all used to determine $\hat\sigma$. The results of the three different thresholding schemes are

Figure 4.5: Denoising results for the Lena image corrupted by N(0, 20) additive white Gaussian noise. The Daubechies db3 wavelet was used in the simulation. (Panels: noisy original; Brownian sheet, alpha = 0.999; VisuShrink; SureShrink.)
displayed in Figures 4.5 and 4.6. Hilton and Ogden's Brownian sheet method is compared to Donoho and Johnstone's VisuShrink and SureShrink methods in Figure 4.5. The denoising results indicate that the Brownian sheet method with alpha = 0.999 and the SureShrink method exhibit similar visual qualities. Examination of the denoised wavelet coefficients for the Brownian sheet and SureShrink techniques in Figure 4.6 also shows close agreement. Both techniques

Figure 4.6: Denoised wavelet coefficients. (Panels include: noisy wavelet coefficients; Brownian sheet, alpha = 0.999.)
result in more coefficients being used in the reconstruction, thereby giving the image more detail and finer-scale information. In contrast, the VisuShrink thresholding method leaves fewer coefficients after shrinkage, producing a smoother and less detailed version of the estimated image. The results of the Brownian sheet method with alpha = 0.05 and the VisuShrink method are more similar to each other, and are shown below in Figure 4.7.

Figure 4.7: Denoising results for the Lena image corrupted by N(0, 20) additive white Gaussian noise. The Daubechies db3 wavelet was used in the simulation. (Panels: noisy original; Brownian sheet, alpha = 0.05; VisuShrink; SureShrink.)
The results of the simulations indicate that the Brownian Sheet thresholding procedure gives the user more flexibility in the application and end result of the denoising procedure. The level a of the test hypotheses can be thought of as being similar to a smoothing parameter. Smaller values of a make it more difficult for coefficients to be considered significant thus resulting in worse meansquare error performance but a smoother estimate This is similar to the how the VisuShrink and Brownian Sheet (alpha = 0 05) algorithms behave. On the other hand larger v alues of a make it easier for coefficients to be included in the reconstruction This is more similar to how the SureShrink and Brownian Sheet (alpha = 0.999) algorithms perform While these techniques offer better mean square error performance the resulting estimate will visually appear noisier. This is more clearly demonstrated in Figure 4 8 shown on the next page The Box Function wa s used in this simulation with the Haar wavelet. Notice the increased amount of noise in the SureShrink result. Also notice how the Brownian Sheet technique with an alpha = 0.05 closely resembles the results of the VisuShrink method 10 4
(Figure panels: Noisy Box Function (64x64); Brownian Sheet alpha = 0.05; VisuShrink; SureShrink.)
Figure 4.8: Denoising results for a 64x64 two-dimensional Box function corrupted by N(0,0.2) additive white Gaussian noise. The Haar wavelet was used in the simulations.
To further demonstrate the effect that increasing the value of alpha has on the results of the Brownian Sheet method, the reader is referred to Figure 4.9. As alpha is increased, it becomes obvious that more noise is being allowed into the reconstruction of the estimate. Lower values of alpha make it more difficult for noise to remain, thus resulting in a smoother estimate.
(Figure panels: Brownian Sheet alpha = 0.05; alpha = 0.5; alpha = 0.9; alpha = 0.999.)
Figure 4.9: Denoising results for a 64x64 two-dimensional Box function corrupted by N(0,0.2) additive white Gaussian noise with the Brownian Sheet technique for various levels of alpha. The Daubechies db3 wavelet was used in the simulation.
5. Conclusions
In Chapter 4 the Brownian Sheet, SureShrink, and VisuShrink algorithms were implemented and tested on two images that had been corrupted by additive white Gaussian noise. The goal was to show that a new data-analytic technique, one that considers both the magnitude and location of the empirical wavelet coefficients, works as well as the available traditional methods in determining the thresholds used for the denoising of signals. The results demonstrate that the Brownian Sheet technique performs well on a variety of signals and can mimic either the SureShrink or VisuShrink techniques.
SureShrink is a data-analytic technique that determines a threshold for each level by minimizing Stein's Unbiased Estimate of Risk over candidate thresholds. Application of this method to a corrupted image allows some of the noise to remain in the reconstruction in an effort to minimize mean-square error. This can be observed in the numerous examples, both one- and two-dimensional, in Chapters 3 and 4. The various examples given in this thesis have compared thresholding methods in terms of visual quality of the resulting estimate of the underlying function. It has been documented that SureShrink performs well in terms of mean-square error but lacks visual quality due to the higher amount of spurious noise that is allowed in the reconstruction process. The tendency of
SureShrink to allow more noise into the reconstruction could be annoying to a user who is attempting to remove as much noise as possible from a signal. Donoho and Johnstone [2] realized that enhanced visual quality could be obtained by ensuring that every sample of the wavelet transform in which the underlying signal is exactly zero would be estimated as zero. They proposed the application of the universal threshold λ = σ√(2 log n), without any adaptive selection of the threshold. This technique, termed VisuShrink because it optimizes the visual quality of the image, ensured that essentially all pure-noise coefficients would be set to zero by the thresholding. This resulted in an almost noise-free character of the signal and yielded smooth, visually appealing results, but suffered from poor mean-square error performance.
In comparison, the Brownian Sheet simulations in Chapter 4 demonstrate that this thresholding method is much more versatile and flexible than either the SureShrink or VisuShrink algorithms. The user can adjust the level alpha and thus control the smoothness of the reconstructed signal. When alpha is large, the results behave like SureShrink; if alpha is small, the Brownian Sheet technique behaves like VisuShrink. The balance between minimizing the mean-square error and maximizing the visual quality of a signal must be determined by the application and the desired results.
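The effect of the universal threshold can be illustrated concretely. The sketch below is in Python/NumPy rather than the MATLAB of Appendix B, and the pure-noise coefficients are synthetic; it shows that soft thresholding at λ = σ√(2 log n) shrinks essentially every noise-only coefficient to zero:

```python
import numpy as np

def soft_threshold(w, thr):
    """Soft thresholding: shrink coefficients toward zero by thr,
    setting everything smaller than thr in magnitude to exactly zero."""
    return np.sign(w) * np.maximum(np.abs(w) - thr, 0.0)

rng = np.random.default_rng(0)
n = 4096
sigma = 1.0
noise = sigma * rng.standard_normal(n)   # pure-noise "wavelet coefficients"

# Universal (VisuShrink) threshold: lambda = sigma * sqrt(2 log n)
lam = sigma * np.sqrt(2.0 * np.log(n))

denoised = soft_threshold(noise, lam)
# With high probability (essentially) all pure-noise coefficients are zeroed.
print(np.count_nonzero(denoised), "of", n, "coefficients survive")
```

Because the maximum of n standard normal samples concentrates near √(2 log n), the universal threshold sits just above the expected extreme of the noise, which is exactly why the reconstruction looks noise-free but over-smoothed in the mean-square sense.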
Appendix A. Bibliography
[1] D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation via wavelet shrinkage," Biometrika, vol. 81, pp. 425-455, 1994.
[2] D. L. Donoho and I. M. Johnstone, "Adapting to unknown smoothness via wavelet shrinkage," Journal of the American Statistical Association, vol. 90, pp. 1200-1224, 1995.
[3] D. L. Donoho, "De-noising by soft-thresholding," IEEE Transactions on Information Theory.
[4] R. T. Ogden and E. Parzen, "Data dependent wavelet thresholding in nonparametric regression with change-point applications," Computational Statistics and Data Analysis, vol. 22, pp. 53-70, 1996.
[5] R. T. Ogden and E. Parzen, "Change-point approach to data analytic wavelet thresholding," Statistics and Computing, vol. 6, pp. 93-99, 1996.
[6] M. L. Hilton and R. T. Ogden, "Data analytic wavelet threshold selection in 2D signal denoising."
[7] J. Lim, Two-Dimensional Signal and Image Processing, Prentice Hall, Englewood Cliffs, NJ, ISBN 0-13-935322-4.
[8] T. Bose, Class Notes for Wavelets and Digital Filter Banks, University of Colorado at Denver, Electrical Engineering Department, Spring Semester 1997.
[9] G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, MA, ISBN 0-9614088-2-0.
[10] J. Weaver, X. Yansun, D. Healy Jr., and L. Cromwell, "Filtering noise from images with wavelet transforms," Magnetic Resonance in Medicine, vol. 24, pp. 288-295, 1991.
[11] C. Stein, "Estimation of the mean of a multivariate normal distribution," Annals of Statistics, vol. 10, pp. 1135-1151.
[12] S. G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, pp. 674-693, 1989.
[13] R. T. Ogden, Essential Wavelets for Statistical Applications and Data Analysis, Birkhäuser Boston, Cambridge, MA, ISBN 0-8176-3864-4.
Appendix B. Matlab Code/Simulations

Brownian Sheet (2D)

clear;

% Global Constants
PlaceHolder = 0;
WaveletType = 'db3';   % Mother wavelet, db1 or db3 for simulations
DecompLevel = 3;       % Decomposition level

% Construct the 2D Box function
% Hd = zeros(64,64);
% Hd(20:40,20:40) = ones(21,21);
% figure;mesh(Hd);axis([0 63 0 63 0 max(max(Hd))]);title('Original 64 x 64 Box Function');pause;

% Add Noise
% NoisyBox = Hd + sqrt(.04)*randn(size(Hd)) + 0;
% load NoisyBox;
% figure;mesh(NoisyBox);title('Noisy 64 x 64 Box Function - Gaussian White Noise added');
% Jd = NoisyBox;

[X,map] = gifread('lena512.gif');   % Image of Lena, 512x512
% figure;colormap(map);image(X);pause;
Jd = X + sqrt(400)*randn(size(X)) + 0;   % Add noise to Lena image
% figure;colormap(map);image(Jd);axis image;pause;

% Wavelet Decomposition
[ca1,chd1,cvd1,cdd1] = dwtper2(Jd,WaveletType);
[ca2,chd2,cvd2,cdd2] = dwtper2(ca1,WaveletType);
[ca3,chd3,cvd3,cdd3] = dwtper2(ca2,WaveletType);
c = [ca3(:)' chd3(:)' cvd3(:)' cdd3(:)' chd2(:)' cvd2(:)' cdd2(:)' chd1(:)' cvd1(:)' cdd1(:)'];
s = [size(ca3);size(ca3);size(chd2);size(chd1);size(Jd)];

% figure;subplot(2,2,1);mesh(ca1);title('Approximation coefficients Level 1');
% subplot(2,2,2);mesh(chd1);title('Horizontal coefficients Level 1');
% subplot(2,2,3);mesh(cvd1);title('Vertical coefficients Level 1');
% subplot(2,2,4);mesh(cdd1);title('Diagonal coefficients Level 1');pause;
% figure;subplot(2,2,1);mesh(ca2);title('Approximation coefficients Level 2');
% subplot(2,2,2);mesh(chd2);title('Horizontal coefficients Level 2');
% subplot(2,2,3);mesh(cvd2);title('Vertical coefficients Level 2');
% subplot(2,2,4);mesh(cdd2);title('Diagonal coefficients Level 2');pause;
% figure;subplot(2,2,1);mesh(ca3);title('Approximation coefficients Level 3');
% subplot(2,2,2);mesh(chd3);title('Horizontal coefficients Level 3');
% subplot(2,2,3);mesh(cvd3);title('Vertical coefficients Level 3');
% subplot(2,2,4);mesh(cdd3);title('Diagonal coefficients Level 3');pause;

sX = size(Jd);   % X contains the loaded image and map contains the loaded colormap
row = sX(1); col = sX(2);

% Image coding
nbcol = size(map,1);
cod_X = wcodemat(X,nbcol);
cod_ca1 = wcodemat(ca1,nbcol);
cod_chd1 = wcodemat(chd1,nbcol);
cod_cvd1 = wcodemat(cvd1,nbcol);
cod_cdd1 = wcodemat(cdd1,nbcol);
dec2d = [cod_ca1,cod_chd1; ...
         cod_cvd1,cod_cdd1];
figure;colormap(map);image(dec2d);axis image;axis off;
title('Noisy Lena Image - Level 1 Decomposition');
print -dmeta dec2d;
pause;
cod_ca2 = wcodemat(ca2,nbcol);
cod_chd2 = wcodemat(chd2,nbcol);
cod_cvd2 = wcodemat(cvd2,nbcol);
cod_cdd2 = wcodemat(cdd2,nbcol);
dec3d = [([cod_ca2,cod_chd2;cod_cvd2,cod_cdd2]),cod_chd1; cod_cvd1,cod_cdd1];
figure;colormap(map);image(dec3d);axis image;axis off;
title('Noisy Lena Image - Level 2 Decomposition');
print -dmeta dec3d;
pause;
cod_ca3 = wcodemat(ca3,nbcol);
cod_chd3 = wcodemat(chd3,nbcol);
cod_cvd3 = wcodemat(cvd3,nbcol);
cod_cdd3 = wcodemat(cdd3,nbcol);
dec4d = [([([cod_ca3,cod_chd3;cod_cvd3,cod_cdd3]),cod_chd2;cod_cvd2,cod_cdd2]),cod_chd1; cod_cvd1,cod_cdd1];
figure;colormap(map);image(dec4d);axis image;axis off;
title('Noisy Lena Image - Level 3 Decomposition');
pause;

% Pull out the finest scale empirical wavelet Quad matrices (horizontal,
% vertical, and diagonal)
mQfh = detcoef2('h',c,s,1);
mQfv = detcoef2('v',c,s,1);
mQfd = detcoef2('d',c,s,1);

% Reshape them into column-wise vectors
vQfh = reshape(mQfh,1,prod(size(mQfh)));
vQfv = reshape(mQfv,1,prod(size(mQfv)));
vQfd = reshape(mQfd,1,prod(size(mQfd)));

% Noise estimate: MAD / 0.6745
sigmaest = median(abs([vQfh vQfv vQfd])) / 0.6745;

% Start the creation of vGoodStuff by first placing the smallest approximation
% matrix into vGoodStuff. We can do this now because this denoising algorithm
% never works on the approx. coefs. Eventually vGoodStuff, which is
% a column-wise storage vector, will be reconstructed.

% Store approximation coefficients from Level 3
A = appcoef2(c,s,WaveletType,DecompLevel);
vQuadGoodStuff = [A(:)'];

for Dlev = DecompLevel:-1:1
    if Dlev == 3
        CritVal = input('Please input the critical value for Decomposition Level 3 (8 x 8): ');
    end
    if Dlev == 2
        CritVal = input('Please input the critical value for Decomposition Level 2 (16 x 16): ');
    end
    if Dlev == 1
        CritVal = input('Please input the critical value for Decomposition Level 1 (32 x 32): ');
    end
    for Q = 1:3
        if Q == 1
            Quad = detcoef2('h',c,s,Dlev);
        end
        if Q == 2
            Quad = detcoef2('v',c,s,Dlev);
        end
        if Q == 3
            Quad = detcoef2('d',c,s,Dlev);
        end
        QuadOrig = Quad;

        % Create a matrix that is the same size as "Quad". It will be initialized
        % with all zeros and will eventually contain the denoised coefficients for
        % this particular decomposition level; it will be inserted into the vector
        % vGoodStuff.
        QuadGoodStuff = zeros(size(Quad));
        % Find the size (rows and columns) of the current "Quad" that we are
        % operating on and the total number of elements in it.
        SizeOfQuad = s(size(s,1)-Dlev,:);
        ElemInQuad = prod(SizeOfQuad);
        clear ks2; clear bs;

        % Main loop
        for n = 1:ElemInQuad
            % Find Brownian Sheet
            Quad = QuadOrig;
            QuadS = Quad./sigmaest;
            Quad1 = QuadS.^2;      % Square the standardized wavelet coefficients
            d = size(Quad1,1);     % Find the number of rows in Quad
            xbar = 0;
            % Sum up
            for i = 1:d
                for j = 1:d
                    xbar = xbar + Quad1(i,j);
                end
            end
            % Find mean
            xbar = xbar / prod(size(Quad1));

            % Subtract mean from the square of each element
            for i = 1:size(Quad1,1)
                for j = 1:size(Quad1,2)
                    Quad2(i,j) = Quad1(i,j) - xbar;   % Subtract mean out of Quad matrix
                end
            end
            % Find Brownian sheet using Todd Ogden's method
            bs(1,1) = Quad2(1,1);
            % First column
            for i = 2:size(Quad2,1)
                bs(i,1) = bs(i-1,1) + Quad2(i,1);
            end
            % First row
            for j = 2:size(Quad2,2)
                bs(1,j) = bs(1,j-1) + Quad2(1,j);
            end
            % Main algorithm
            for i = 2:size(Quad2,1)
                for j = 2:size(Quad2,2)
                    bs(i,j) = bs(i-1,j) + bs(i,j-1) - bs(i-1,j-1) + Quad2(i,j);
                end
            end
            % Divide output Brownian Sheet by d
            bs = bs/d;

            % Find the max of the absolute value of the Brownian sheet and set it
            % equal to the test statistic Kd
            Kd(n) = maxabs(bs);

            % Compare Kd, the test statistic, to the critical value
            if Kd(n) < CritVal
                break;
            end
            % Find the maximum absolute value of the wavelet coefficients, pull it
            % out, and put it into QuadGoodStuff. Replace its value with the
            % PlaceHolder value, which in this case is equal to 0.
            [val,row,col] = maxabs(QuadOrig);
            QuadGoodStuff(row,col) = QuadOrig(row,col);
            QuadOrig(row,col) = PlaceHolder;
        end   % The end of the n Loop - Brownian Sheet Process

        % Soft threshold QuadGoodStuff by shrinking coefficients by the maxabs of Quad
        thr(n) = maxabs(QuadOrig);
        mTmp = abs(QuadGoodStuff) - maxabs(QuadOrig);
        mTmp = (mTmp + abs(mTmp))/2;
        QuadGoodStuffShrunk = sign(QuadGoodStuff).*mTmp;

        % Update the vector vQuadGoodStuff with QuadGoodStuff, the current
        % denoised matrix, in a column-wise format.
        vQuadGoodStuff = [vQuadGoodStuff QuadGoodStuffShrunk(:)'];
    end   % The end of the Q Loop
end   % The end of the Dlev Loop

mchd3 = detcoef2('h',vQuadGoodStuff,s,3);
mcvd3 = detcoef2('v',vQuadGoodStuff,s,3);
mcdd3 = detcoef2('d',vQuadGoodStuff,s,3);
mca2 = idwtper2(ca3,mchd3,mcvd3,mcdd3,WaveletType);
mchd2 = detcoef2('h',vQuadGoodStuff,s,2);
mcvd2 = detcoef2('v',vQuadGoodStuff,s,2);
mcdd2 = detcoef2('d',vQuadGoodStuff,s,2);
mca1 = idwtper2(mca2,mchd2,mcvd2,mcdd2,WaveletType);
mchd1 = detcoef2('h',vQuadGoodStuff,s,1);
mcvd1 = detcoef2('v',vQuadGoodStuff,s,1);
mcdd1 = detcoef2('d',vQuadGoodStuff,s,1);
BS4 = idwtper2(mca1,mchd1,mcvd1,mcdd1,WaveletType);

% Plot the denoised image for the Brownian Sheet Process
% figure;mesh(BS9993);axis([0 63 0 63 0 2]);pause;
% figure;mesh(BS05);axis([0 63 0 63 0 2]);pause;
figure;colormap(map);image(BS4);title('Brownian Sheet Process alpha = 0.999');
% figure;plot(c);hold on;plot(vQuadGoodStuff,'r');
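The nested loops in the script above build the Brownian sheet one partial sum at a time; the same test statistic can be written compactly with two cumulative sums. The following is an illustrative Python/NumPy sketch, not part of the thesis code, and the function name is mine:

```python
import numpy as np

def brownian_sheet_stat(quad, sigma):
    """Test statistic K_d from the Brownian Sheet method: standardize and
    square the coefficients, subtract their mean, take the 2D cumulative sum
    (a discrete Brownian sheet under the null hypothesis), scale by the row
    count d, and return the maximum absolute value."""
    d = quad.shape[0]
    sq = (quad / sigma) ** 2              # squared standardized coefficients
    centered = sq - sq.mean()             # subtract the mean
    bs = np.cumsum(np.cumsum(centered, axis=0), axis=1) / d   # 2D cumsum
    return np.max(np.abs(bs))
```

The double cumsum reproduces the recursion bs(i,j) = bs(i-1,j) + bs(i,j-1) - bs(i-1,j-1) + Quad2(i,j), since both compute the sum of the centered values over the rectangle of indices (1..i, 1..j).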
VisuShrink (2D)

clear;

% Global Constants
WaveletType = 'db3';
DecompLevel = 3;

% Construct the 64x64 box function
% Hd = zeros(64,64); Hd(20:40,20:40) = ones(21,21);
% figure;mesh(Hd);axis([0 63 0 63 0 max(max(Hd))]);title('64 x 64 Box Function');pause;

% Add Noise to Box function
% Jd = Hd + sqrt(.04)*randn(size(Hd)) + 0;
% load NoisyBox;
% Jd = NoisyBox;
% figure;mesh(Jd);title('64x64 2D Box Function corrupted by N(0,0.2) additive white Gaussian noise');pause;

% Load image
[X,map] = gifread('lena512.gif');
% figure;colormap(map);image(X);axis image;pause;
Jd = X + sqrt(400)*randn(size(X)) + 0;
% figure;colormap(map);image(Jd);pause;

% Wavelet Decomposition
[ca1,chd1,cvd1,cdd1] = dwtper2(Jd,WaveletType);
[ca2,chd2,cvd2,cdd2] = dwtper2(ca1,WaveletType);
[ca3,chd3,cvd3,cdd3] = dwtper2(ca2,WaveletType);
% figure;subplot(2,2,1);mesh(ca1);title('Approximation coefficients Level 1');
% subplot(2,2,2);mesh(chd1);title('Horizontal coefficients Level 1');
% subplot(2,2,3);mesh(cvd1);title('Vertical coefficients Level 1');
% subplot(2,2,4);mesh(cdd1);title('Diagonal coefficients Level 1');pause;
% figure;subplot(2,2,1);mesh(ca2);title('Approximation coefficients Level 2');
% subplot(2,2,2);mesh(chd2);title('Horizontal coefficients Level 2');
% subplot(2,2,3);mesh(cvd2);title('Vertical coefficients Level 2');
% subplot(2,2,4);mesh(cdd2);title('Diagonal coefficients Level 2');pause;
% figure;subplot(2,2,1);mesh(ca3);title('Approximation coefficients Level 3');
% subplot(2,2,2);mesh(chd3);title('Horizontal coefficients Level 3');
% subplot(2,2,3);mesh(cvd3);title('Vertical coefficients Level 3');
% subplot(2,2,4);mesh(cdd3);title('Diagonal coefficients Level 3');pause;

c = [ca3(:)' chd3(:)' cvd3(:)' cdd3(:)' chd2(:)' cvd2(:)' cdd2(:)' chd1(:)' cvd1(:)' cdd1(:)'];
s = [size(ca3);size(ca3);size(chd2);size(chd1);size(Jd)];

% Start the creation of vGoodStuff by first placing the smallest approximation
% matrix into vGoodStuff. We can do this now because this denoising project
% never works on the approx. coefs. Eventually vGoodStuff, which is a
% column-wise storage vector, will be reconstructed.
A = appcoef2(c,s,WaveletType,DecompLevel);
vQuadGoodStuff = [A(:)'];

% Pull out the finest scale empirical wavelet Quad matrices (horizontal,
% vertical, and diagonal)
mQfh = detcoef2('h',c,s,1);
mQfv = detcoef2('v',c,s,1);
mQfd = detcoef2('d',c,s,1);

% Reshape them into column-wise vectors
vQfh = reshape(mQfh,1,prod(size(mQfh)));
vQfv = reshape(mQfv,1,prod(size(mQfv)));
vQfd = reshape(mQfd,1,prod(size(mQfd)));

% Noise estimate: MAD / 0.6745
sigmaest = median(abs([vQfh vQfv vQfd])) / 0.6745;
for Dlev = DecompLevel:-1:1
    for Q = 1:3
        if Q == 1
            Quad = detcoef2('h',c,s,Dlev);
        end
        if Q == 2
            Quad = detcoef2('v',c,s,Dlev);
        end
        if Q == 3
            Quad = detcoef2('d',c,s,Dlev);
        end

        % VisuShrink Process
        % Soft threshold Quad by shrinking coefficients by thrvisu
        thrvisu = sigmaest*sqrt(2*log(prod(size(Jd))));
        mTmp = abs(Quad) - thrvisu;
        mTmp = (mTmp + abs(mTmp))/2;
        QuadGoodStuffShrunk = sign(Quad).*mTmp;

        % Append the vQuadGoodStuff vector with QuadGoodStuffShrunk (current
        % denoised matrix) in a column-wise format.
        vQuadGoodStuff = [vQuadGoodStuff QuadGoodStuffShrunk(:)'];
        % figure;axis([0 4096 0 1]);plot(c,'r');hold on;plot(vQuadGoodStuff);pause;
        QuadVisu = vQuadGoodStuff;
    end   % (END OF Q LOOP)
end   % (END OF Dlev LOOP)

% Reconstruct and plot the denoised image
mchd3 = detcoef2('h',vQuadGoodStuff,s,3);
mcvd3 = detcoef2('v',vQuadGoodStuff,s,3);
mcdd3 = detcoef2('d',vQuadGoodStuff,s,3);
mca2 = idwtper2(ca3,mchd3,mcvd3,mcdd3,WaveletType);
% figure;subplot(2,2,1);mesh(ca3);title('Denoised Approximation coefficients Level 3');
% subplot(2,2,2);mesh(mchd3);title('Denoised Horizontal coefficients Level 3');
% subplot(2,2,3);mesh(mcvd3);title('Denoised Vertical coefficients Level 3');
% subplot(2,2,4);mesh(mcdd3);title('Denoised Diagonal coefficients Level 3');pause;
mchd2 = detcoef2('h',vQuadGoodStuff,s,2);
mcvd2 = detcoef2('v',vQuadGoodStuff,s,2);
mcdd2 = detcoef2('d',vQuadGoodStuff,s,2);
mca1 = idwtper2(mca2,mchd2,mcvd2,mcdd2,WaveletType);
% figure;subplot(2,2,1);mesh(mca2);title('Denoised Approximation coefficients Level 2');
% subplot(2,2,2);mesh(mchd2);title('Denoised Horizontal coefficients Level 2');
% subplot(2,2,3);mesh(mcvd2);title('Denoised Vertical coefficients Level 2');
% subplot(2,2,4);mesh(mcdd2);title('Denoised Diagonal coefficients Level 2');pause;
mchd1 = detcoef2('h',vQuadGoodStuff,s,1);
mcvd1 = detcoef2('v',vQuadGoodStuff,s,1);
mcdd1 = detcoef2('d',vQuadGoodStuff,s,1);
% figure;subplot(2,2,1);mesh(mca1);title('Denoised Approximation coefficients Level 1');
% subplot(2,2,2);mesh(mchd1);title('Denoised Horizontal coefficients Level 1');
% subplot(2,2,3);mesh(mcvd1);title('Denoised Vertical coefficients Level 1');
% subplot(2,2,4);mesh(mcdd1);title('Denoised Diagonal coefficients Level 1');pause;
VSLena = idwtper2(mca1,mchd1,mcvd1,mcdd1,WaveletType);
% VSBox1 = idwtper2(mca1,mchd1,mcvd1,mcdd1,WaveletType);
% figure;mesh(VSBox1);title('VisuShrink Algorithm');axis([0 63 0 63 0 2]);
figure;colormap(map);image(VSLena);axis image;title('VisuShrink Algorithm');
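All three appendix scripts estimate the noise level the same way: the finest-scale detail coefficients, which are dominated by noise, have their median absolute value divided by 0.6745 (the median absolute deviation of a standard normal). A Python/NumPy equivalent of that estimator, offered only as an illustrative sketch (the function name is mine, not the thesis's):

```python
import numpy as np

def estimate_sigma(finest_details):
    """MAD-based noise estimate: median(|w|) / 0.6745 over the pooled
    finest-scale detail coefficients. The median is robust to the sparse
    signal content mixed in with the noise at the finest scale."""
    w = np.concatenate([np.ravel(q) for q in finest_details])
    return np.median(np.abs(w)) / 0.6745
```

On pure Gaussian noise of standard deviation sigma, the estimator converges to sigma, which is why the scripts can compute thrvisu and thrsure without knowing the true noise level.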
SureShrink (2D)

clear;

% Global Constants
WaveletType = 'db3';
DecompLevel = 3;

% Construct the box function (image)
% Hd = zeros(64,64);
% Hd(20:40,20:40) = ones(21,21);
% figure;mesh(Hd);axis([0 63 0 63 0 max(max(Hd))]);title('64 x 64 Box Function');pause;

% Add Noise
% load NoisyBox;
% Jd = NoisyBox;
% Jd = Hd + sqrt(.02)*randn(size(Hd)) + 0;
% figure;mesh(Jd);title('64 x 64 Box Function with Gaussian White Noise added');pause;

[X,map] = gifread('lena512.gif');
% figure;colormap(map);image(X);pause;
Jd = X + sqrt(400)*randn(size(X)) + 0;
% figure;colormap(map);image(Jd);pause;

% Wavelet Decomposition
[ca1,chd1,cvd1,cdd1] = dwtper2(Jd,WaveletType);
[ca2,chd2,cvd2,cdd2] = dwtper2(ca1,WaveletType);
[ca3,chd3,cvd3,cdd3] = dwtper2(ca2,WaveletType);

c = [ca3(:)' chd3(:)' cvd3(:)' cdd3(:)' chd2(:)' cvd2(:)' cdd2(:)' chd1(:)' cvd1(:)' cdd1(:)'];
s = [size(ca3);size(ca3);size(chd2);size(chd1);size(Jd)];
% figure;subplot(2,2,1);mesh(ca1);title('Approximation coefficients Level 1');
% subplot(2,2,2);mesh(chd1);title('Horizontal coefficients Level 1');
% subplot(2,2,3);mesh(cvd1);title('Vertical coefficients Level 1');
% subplot(2,2,4);mesh(cdd1);title('Diagonal coefficients Level 1');pause;
% figure;subplot(2,2,1);mesh(ca2);title('Approximation coefficients Level 2');
% subplot(2,2,2);mesh(chd2);title('Horizontal coefficients Level 2');
% subplot(2,2,3);mesh(cvd2);title('Vertical coefficients Level 2');
% subplot(2,2,4);mesh(cdd2);title('Diagonal coefficients Level 2');pause;
% figure;subplot(2,2,1);mesh(ca3);title('Approximation coefficients Level 3');
% subplot(2,2,2);mesh(chd3);title('Horizontal coefficients Level 3');
% subplot(2,2,3);mesh(cvd3);title('Vertical coefficients Level 3');
% subplot(2,2,4);mesh(cdd3);title('Diagonal coefficients Level 3');pause;

% Start the creation of vGoodStuff by first placing the smallest approximation
% matrix into vGoodStuff. We can do this now because this denoising project
% never works on the approx. coefs. Eventually vGoodStuff, which is a
% column-wise storage vector, will be reconstructed.
A = appcoef2(c,s,WaveletType,DecompLevel);
vQuadGoodStuff = [A(:)'];

% Pull out the finest scale empirical wavelet Quad matrices (horizontal,
% vertical, and diagonal)
mQfh = detcoef2('h',c,s,1);
mQfv = detcoef2('v',c,s,1);
mQfd = detcoef2('d',c,s,1);

% Reshape them into column-wise vectors
vQfh = reshape(mQfh,1,prod(size(mQfh)));
vQfv = reshape(mQfv,1,prod(size(mQfv)));
vQfd = reshape(mQfd,1,prod(size(mQfd)));

% Noise estimate: MAD / 0.6745
sigmaest = median(abs([vQfh vQfv vQfd])) / 0.6745;
for Dlev = DecompLevel:-1:1
    QuadH = detcoef2('h',c,s,Dlev);
    QuadV = detcoef2('v',c,s,Dlev);
    QuadD = detcoef2('d',c,s,Dlev);
    QuadVector = [QuadH(:)' QuadV(:)' QuadD(:)'];
    n = prod(size(QuadVector));
    x = QuadVector;
    xx = x/sigmaest;
    a = sort(abs(xx)).^2;
    b = cumsum(a);
    n = length(x);
    cc = linspace(n-1,0,n);
    ss = b + cc.*a;
    risk = (n - (2.*(1:n)) + ss)/n;
    [guess,ibest] = min(risk);
    thrsure = sqrt(a(ibest))*sigmaest;

    % SureShrink Process
    % Soft threshold Quad by shrinking coefficients by thrsure
    mTmp = abs(QuadVector) - thrsure;
    mTmp = (mTmp + abs(mTmp))/2;
    QuadGoodStuffShrunk = sign(QuadVector).*mTmp;

    % Append the vQuadGoodStuff vector with QuadGoodStuffShrunk (current
    % denoised matrix) in a column-wise format.
    vQuadGoodStuff = [vQuadGoodStuff QuadGoodStuffShrunk(:)'];
    QuadSure = vQuadGoodStuff;
end   % (END OF Dlev LOOP)

% Reconstruct and plot the denoised image
mchd3 = detcoef2('h',vQuadGoodStuff,s,3);
mcvd3 = detcoef2('v',vQuadGoodStuff,s,3);
mcdd3 = detcoef2('d',vQuadGoodStuff,s,3);
mca2 = idwtper2(ca3,mchd3,mcvd3,mcdd3,WaveletType);
mchd2 = detcoef2('h',vQuadGoodStuff,s,2);
mcvd2 = detcoef2('v',vQuadGoodStuff,s,2);
mcdd2 = detcoef2('d',vQuadGoodStuff,s,2);
mca1 = idwtper2(mca2,mchd2,mcvd2,mcdd2,WaveletType);
mchd1 = detcoef2('h',vQuadGoodStuff,s,1);
mcvd1 = detcoef2('v',vQuadGoodStuff,s,1);
mcdd1 = detcoef2('d',vQuadGoodStuff,s,1);
SSLena = idwtper2(mca1,mchd1,mcvd1,mcdd1,WaveletType);
% SSBox1 = idwtper2(mca1,mchd1,mcvd1,mcdd1,WaveletType);
figure;colormap(map);image(SSLena);axis image;title('SureShrink Algorithm');
% figure;mesh(SSBox1);axis([0 63 0 63 0 2]);title('SureShrink');
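The risk vector computed inside the Dlev loop of the SureShrink script is Stein's Unbiased Risk Estimate evaluated at each sorted |coefficient| as a candidate threshold. A Python/NumPy transliteration of that computation, offered as a sketch for exposition rather than as the thesis code:

```python
import numpy as np

def sure_threshold(x, sigma):
    """Pick the soft threshold minimizing Stein's Unbiased Risk Estimate,
    following the SureShrink loop: candidate thresholds are the sorted
    magnitudes of the standardized coefficients x/sigma."""
    xx = np.sort(np.abs(np.asarray(x) / sigma)) ** 2   # sorted squared coeffs (a)
    n = len(xx)
    b = np.cumsum(xx)                       # partial sums of the smallest squares
    cc = np.arange(n - 1, -1, -1)           # count of coefficients above each candidate
    risk = (n - 2.0 * np.arange(1, n + 1) + b + cc * xx) / n
    ibest = np.argmin(risk)                 # index of the minimum-risk candidate
    return np.sqrt(xx[ibest]) * sigma
```

For the k-th candidate t = |x|_(k), the estimate is (n − 2k + Σ_{i≤k} x²_(i) + (n − k)·x²_(k))/n, matching the MATLAB construction ss = b + cc.*a and risk = (n − 2*(1:n) + ss)/n line by line.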
