Citation
An efficient and reliable ddos attack detection algorithm using a fast entropy computation method

Material Information

Title:
An efficient and reliable ddos attack detection algorithm using a fast entropy computation method
Creator:
No, Giseop
Publication Date:
Language:
English
Physical Description:
xvi, 112 leaves : ; 28 cm

Subjects

Subjects / Keywords:
Denial of service attacks ( lcsh )
Computer security ( lcsh )
Computer algorithms ( lcsh )
Computer algorithms ( fast )
Computer security ( fast )
Denial of service attacks ( fast )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

General Note:
Department of Computer Science and Engineering
Statement of Responsibility:
by Giseop No.

Record Information

Source Institution:
|University of Colorado Denver
Holding Location:
|Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
463453089 ( OCLC )
ocn463453089
Classification:
LD1193.E52 2009m N6 ( lcc )

Full Text
AN EFFICIENT AND RELIABLE DDOS ATTACK DETECTION
ALGORITHM USING A FAST ENTROPY COMPUTATION METHOD
by
Giseop No
B.S., Republic of Korea Air Force Academy, 1998
A thesis submitted to the
University of Colorado Denver
in partial fulfillment
of the requirements for the degree of
Master of Science
Computer Science
2009


by Giseop No
All rights reserved.


T >
V
This thesis for Master of Science
degree by
Giseop No
has been approved
by
Tom Altman, PhD
Richard Osborne, PhD
7 ft/^f
Date


Y
No, Giseop (M.S., Computer Science)
Design & Analysis of an Efficient Reliable DDoS Attack Detection Algorithm Using
a Fast Entropy Computing Method
Thesis directed by Associate Professor Ilkyeun Ra
ABSTRACT
The threat of Distributed Denial of Service (DDoS) has become a major issue in
network security. It is difficult to detect DDoS since all traffic has normal packet
characteristics. An entropy based intrusion detection approach was introduced, a
powerful and simple way to identify abnormal conditions from the network channel.
However, the burden of computing information entropy values from heavy flow still
exists.
In this thesis, we introduce a compression entropy scheme and fast entropy
computation algorithm which uses the information from both packet numbers and
packet types. Both of them are similar to the conventional entropy scheme, but no
probability is needed to calculate entropy values. Compression entropy computing is
very fast. However, it is too sensitive to verify real network attacks and produce many
false negatives. We designed a fast entropy scheme. The fast entropy computing
algorithm could reduce time by more than 90 percent of conventional entropy
computing time. Also, our fast entropy indicated better detection accuracy than the
conventional and compression entropy approach.
Also, we designed an adaptive detector. We verified a meaningful range of


threshold from simulation between 2o and 6o, where a is the standard deviation of
the number of packet types during every monitoring interval. We used the general
pattern changes to design an adaptive detector; that means that if the difference
between the average of the previous monitoring interval and the new entropy value is
decreased, the detector cant detect the attacks with a high threshold value because of
the steady channel condition and stealthy attack pattern. In this case, we need to
decrease the threshold value. Meanwhile, if the channel has burst but the detector has
a relatively small threshold value, the detector works very sensitively in this situation.
As a result, the detector yields many false negativesa bad characteristic of detector.
In that case, the threshold should be increased accordingly. Our adaptive detector
shows the results which approximate optimal detection performance. We found that
our fast entropy estimator with adaptive detector works better than the conventional
entropy estimator, and significantly reduces computation time.
Finally, we add three dynamic monitoring-window schemes in the adaptive
detector to enhance the detection accuracy. Even though the dynamic monitoring
windows increase computational time a little more than fixed monitoring window
scheme, it can reduce the computational time approximately 90% compared to
Conventional Entropy scheme. Also, the Multiple Dynamic Moving Average
Window detector has a good performance in terms of reducing false positives and
negatives as well as the best fit with our Fast Entropy scheme.
This abstract accurately represents the content of the candidates thesis. I recommend
its publication.
Signed
Ilkyeun


DEDICATION
I dedicate this thesis to God, who brought me to this new world and permitted me to
finish this study. I dedicate this thesis to my lovely wife (Sunwha Lee) and children
(Yeonwook and Yeonwoo) for always believing in and supporting me with all their
heart. I also want to show my deep appreciation to my personal mentors, Mr. and Mrs.
Choe. Additionally, I dedicate this thesis to my sincere friend, Karen Henderson, who
encourages us to stay strong in God and who takes care of my family within her
devotion during all my overseas study in the US.


ACKNOWLEDGEMENT
I would like to thank my advisor, Ilkyeun Ra, for his patience in advising me
throughout all my overseas study. I also wish to express appreciation to my co-advisor,
Titsa Papantoni, for cheering up me from the beginning of my study and giving me
her insights into my thesis.


TABLE OF CONTENTS
Figures.....................................................................xiv
Tables......................................................................xvi
Chapter
1. Introduction..............................................................1
1.1 Network Security..........................................................1
1.2 DDoS Overview.............................................................2
1.2.1 Threat of DDoS Attack..................................................3
1.2.2 Difficulties of Defending DDoS.......................................5
1.3 Motivation................................................................6
1.4 Problems in DDoS Attack Detection.........................................7
1.4.1 Network Robustness.....................................................7
1.4.2 Problem on DDoS Attack Detection.....................................8
1.4.3 Problem on DDoS Detection Schemes.....................................8
1.5 Research Focus............................................................9
2. Related Work.............................................................11
2.1 Network Intrusion........................................................11
2.1.1 General Network Attack Patterns.................................11
viii


2.1.1.1 Basic Idea of General Network Attack
2.1.1.2 Automated Attack......................................................12
2.1.1.3 Sophistication Increasing.............................................13
2.1.1.4 Increasing Permeability...............................................13
2.1.1.5 Asymmetric Attack.....................................................13
2.1.1.6 Infrastructure Attack.................................................14
2.1.1.6.1 Worms...............................................................14
2.1.1.6.2 Attacks on Domain Name Service (DNS)................................14
2.1.1.6.3 Attacks on Routers..................................................15
2.1.1.7 Impact Analysis of Infra Structure Attack.............................16
2.2 DDoS Attack..............................................................17
2.2.1 Overview..............................................................17
2.2.1.1 General Characteristics of DoS/DDoS Attack............................18
2.2.2 Attack Methods........................................................19
2.2.2.1 Protocol Based Attack.................................................19
2.2.2.1.1 TCP SYN Flood.......................................................19
2.2.2.1.2 ICMP Flood..........................................................20
2.2.2.1.3UDP Flood............................................................21
2.2.2.2 Application Based Attack............................................21
2.2.2.3 Distributed Reflector Attack..........................................21
2.3 Information Entropy...................................................22
IX


2.4 DDoS Detection Approaches.............................................24
2.4.1 Signature Based Approach..........................................24
2.4.1.1 Intrusion Detection System Placement..............................25
2.4.1.2 SBA Procedure.....................................................25
2.4.1.3 Making Signature..................................................26
2.4.1.4 Designing an Engine...............................................26
2.4.2 Anomaly Based Approach............................................27
2.4.2.1 Distribution Based Detection......................................27
2.4.2.1.1 Base Distribution...............................................28
2.4.2.1.2 Inverse Distribution............................................30
2.4.2.2 Statistical Anomaly Detection...................................31
2.4.2.2.1 Conventional Statistical Algorithm..............................32
2.4.2.2.2 Chi-Square Algorithm............................................32
2.4.3 Entropy Based Approach............................................34
2.4.3.1 Background and Strength of Entropy Based Approach.................34
2.4.3.2 Conventional Entropy Computing....................................35
3. New Entropy Computation Design........................................39
3.1 Compression Entropy Design............................................39
3.1.1 Binary String Compression.........................................40
3.1.2 Combination Compression Entropy...................................42
3.1.3 Runtime Analysis of Compression Entropy...........................45
x



3.2 Fast Entropy...........................................................45
3.2.1 Approach for Fast Entropy Computation...............................45
3.2.2 Fast Entropy Calibration............................................47
3.2.3 Fast Entropy Estimator..............................................48
3.2.4 Runtime Analysis of Fast Entropy....................................49
4. Detector Design.......................................................50
4.1 Detector Design Approach..............................................50
4.2 Monitoring Concept (Moving Average)...................................51
4.3 Adaptive Detector.....................................................52
4.4 Dynamic Window Design................................................53
4.4.1 Definition..........................................................54
4.4.2 Simple Dynamic Moving Average Window (SDMAW)........................54
4.4.3 Multiple Dynamic Moving Average Window (MDMAW) with TS........55
4.4.4 MDMAW with AO.......................................................55
5. Simulation & Analysis..................................................56
5.1 Input Data.............................................................56
5.1.1 Normal Data Flow....................................................56
5.1.2 DoS and Port Sweep Attack (1999 DARPA Data Set).....................56
Typical DDoS (Non-Stealthy 2000 DARPA Data Set)....................57
5.1.4 Stealthy DDoS (Stealthy 2000 DARPA Data Set)........................58
5.2 Simulation Procedure.................................................59
XI


5.3 Detection Assumption.....................................................59
5.4 Accuracy Analysis........................................................60
5.4.1 DoS and Port Sweep Attack Result (99 DARPA) .......................60
5.4.1.1 Packet Distribution...................................................60
5.4.1.2 Entropy Distribution..................................................60
5.4.1.3 Detection Accuracy (7 Attacks)........................................62
5.4.2 Typical DDoS Result (00 DARPA)...............................62
5.4.2.1 Packet Distribution...................................................62
5.4.2.2 Entropy Distribution..................................................63
5.4.2.3 Detection Accuracy (1 DDoS Attack, 4 Attack Points).................64
5.4.3 Stealthy DDoS Result (00 Stealthy DARPA)..........................65
5.4.3.1 Packet Distribution...................................................65
5.4.3.2 Entropy Distribution..................................................65
5.4.3.3 Detection Accuracy (1 Stealthy DDoS Attack, 4 Attack Points)..........67
5.5 Adaptive Detector Simulation.............................................67
5.6 DMAW Simulation..........................................................69
5.7 Runtime Analysis.........................................................70
5.8 Simulation Analysis......................................................73
6. Conclusion................................................................75
7. Future Work...............................................................77
xii


Appendix
A. DDoS Detection Source Code.......................................78
B. Conventional Entropy Calculator Header...........................83
C. Compression Entropy Calculator Header...........................86
D. Detector Header..................................................87
E. Miscellaneous Headers............................................99
REFERENCES.........................................................108
xiii


LIST OF FIGURES
Figure
1.1 The Number of Internet Security Incidents [27].............................4
1.2 Dollar Amount Losses by Type in CSI/FBI Survey.............................5
2.1 Structure of a Typical DDoS Attack........................................17
2.2 ICMP Attack Example.......................................................20
2.3 DRDoS Attack Scenario.....................................................22
2.4 Entropy in case of two possibilities with probability p and (1-p).........23
2.5 An Example of Detecting Engine with SBA...................................26
2.6 Base frequency distribution (top),
and its Inverse frequency distribution (bottom) [16]......................30
2.7 Conventional Entropy Computation Flow.....................................36
4.1 Monitoring Concept with Moving Average....................................51
4.2 Adaptive Detector Algorithm Flow..........................................53
5.1 Packet Flows at Detection Points (99 DARPA)...............................60
5.2 Conventional Entropy Distribution with 99 DARPA...........................60
5.3 Compression Entropy Distribution with 99 DARPA............................61
5.4 Fast Entropy Distribution with 99 DARPA...................................61
XIV


5.5 Packet Flows at Detection Points (00 DARPA).............................63
5.6 Conventional Entropy Distribution with 00 DARPA.........................63
5.7 Compression Entropy Distribution with 00 DARPA.........................63
5.8 Fast Entropy Distribution with 00 DARPA.................................64
5.9 Packet Flows at Detection Points (00 Stealthy DARPA).....................65
5.10 Conventional Entropy Distribution with Stealthy DARPA.................65
5.11 Compression Entropy Distribution with 00 Stealthy DARPA..............66
5.12 Fast Entropy Distribution with 00 Stealthy DARPA......................66
5.13 Runtime Distribution....................................................72
xv


LIST OF TABLES
Table
2.1 An Exampe of Signature Table..........................................26
5.1 Comparison Normal and Stealthy DDoS Datasets..........................58
5.2 99 DARPA Dataset Detection Accuracy...................................62
5.3 00 DARPA Typical DDoS Dataset Detection Accuracy......................64
5.4 00 DARPA Stealthy DDoS Dataset Detection Accuracy.....................67
5.5 Performance Comparison:
Adaptive Detector vs Manual Threshold Setting........................68
5.6 Performance Comparison: Adaptive Detector vs. DMAW....................69
5.7 Runtime Result without Dynamic Window.................................71
5.8 Runtime Result with MDMAW with AO.....................................71
5.9 Computation Overhead..................................................65
xvi


1. Introduction
The main features of an Internet (or internet) network are its open-environment
and scalability. In one hand, these characteristics have led the growth of the Internet.
On the other hand, vulnerabilities in the network have occurred simultaneously. For
example, a weakness of IP itself is that one can spoof wrong addresses or fake the
server from detecting the sender of original messages. As the IT techniques grow
faster, the number of malicious users keeps increasing by easily obtained hack or
attack tools from the Internet or commercial markets.
The threat of Distributed Denial of Service (DDoS) attack now has become a
major issue in network security not only in commercial sectors but also government
infra-structures. We present the problems in network security, DDoS overview, our
motivation, and problems in DDoS attack detection, then we explain our research
focus in this thesis.
1.1 Network Security
To verify and get the ease of approach to the network security, IEEE suggests
three security services (Authentication, Confidentiality, and Integrity) [25]. Also,
National Information Assurance Glossary (NIAS) of the U.S. government provides
four main features of network security services (Availability, Integrity,
Confidentiality, and Non-Repudiation) [26]. The network security services are briefly
presented based on IEEE and NIAS definition as follows:
Authentication: allow only authorized accesses in our network.
Confidentiality: allow only authorized accesses to view our data.
1


Integrity: allow only un-modified data in transit between networks.
Availability: Serve information correctly to all normal accesses. The opposite
of Availability is denial of service/distributed denial of service (DoS/DDOS)
Non-Repudiation: one party of transaction cannot deny receiving a
transaction (For example, electronic transaction using digital signature)
On the basis of five network security services, network security can be defined in
various ways based on different methods and tools, in this thesis, we can use the
network security as preventing attackers from achieving objectives through
unauthorized access or unauthorized use of computers and networks [1], This
definition sets a desired boundary for the computer security field. Theft of computer
equipments from a computer lab does not fall under this definition. A weakness in
software, however, is a real security concern. The software may be without flaw, but
malicious users can exploit its vulnerabilities, which creates a security issue.
Therefore, as used in our thesis, network security means preventing all activity
including Unauthorized access or use (virus, Trojan horse, telnet, etc.), as well as
the ends of attacks (corruption, disclosure, or denial-of-service leading to theft,
espionage, fraud, etc.), are included because they require unauthorized access or
unauthorized use' [2].
1.2 DDoS Overview
Denial of Service Attacks uses multiple systems to attack one or more victim
systems with the intent of denying service to legitimate users of the victim systems.
The degree of automation in attack tools enables a single attacker to install their tools
and control tens of thousands of compromised systems for use in attacks. Intruders
often search address blocks known to contain high concentrations of vulnerable
systems with high-speed connections. DoS attacks are effective because the Internet
2


is comprised of limited and consumable resources, and Internet security is highly
interdependent. Once Denial of Service occurred in more than two places, it is called
as DDoS.
1.2.1 Threat of DDoS Attack
In 2007, the authors of [27] argued that Internet architecture has fundamental
weaknesses and vulnerabilities which can be classified with five main cause-root
problems (same as potential threats of DDoS attacks):
Resource Sharing: By occupying most of the shared resource, bandwidth
attacks can prevent servicing normal users.
Simple Core and Complex Edge'. Every intermediate router doesnt need to
understand upper or lower layers. The lack or authentication at the network
layer leads to a serious problem known as IP spoofing.
Multipath Routing: A packet can go via an arbitrary path among all
possible paths under congestion control. This can lead robustness or
Internet since routers doesnt know whether the addresses of incoming
packets are spoofed or not.
Fast Core Networks and Slow Edge Network: A router or local networks
attached to Internet has high computational power. Meanwhile, edge
network has lower power. High Core can overwhelm the edges capacity
when many sources have same destination.
Decentralized Internet Management'. There is no authority to control
overall Internet accesses because Internet is huge inter-network. This
causes easy accesses for DDoS attackers.
Before we go on to the details of network attack, it is meaningful to understand
3


how much attack grows up and what serious results of the attacks are. Under the
potential threats of DDoS attacks in Internet, computer security incidents increased
exponentially with same pattern of enhancement of the network performance.
Figure 1.1 The Number of Internet Security Incidents (27)
According to Computer Emergency Response Team (CERT) of the United States,
report [27], the number of incidents had jumped from 6 in 1988 to 137,529 in 2003.
This number should be increased in same pattern by malicious users or criminal
attackers in private and commercial sector (refer Figure 1.1).
In 2005, the Computer Security Institute (CSI) and the Federal Bureau of
Investigation (FBI) released a survey on the prevalence and character of computer
crimes based on the responses from 700 Security Analysts and Chief Security
Officers (CSO) from middle to large companies in the U.S. [28].
4


Figure 1.2 Dollar Amount Losses by Type in CS1/FBI Survey |28)
Denial of Service (DoS) makes the fourth portion ($7,310,725 loss) from network
crimes (see Figure 1.2). However, if we consider that almost all of DDoS attacks start
with virus tools, the total amount of money loss is significantly larger than DoS itself,
since the survey [29] on DDoS shows the feature of DDoS attacks which have trends
automation (same as virus attack trend). With thinking of money losses from
DoS/DDoS and growing the number of network incidents exponentially, the threat of
DoS/DDoS is overwhelming on the network security.
1.2.2 Difficulties of Defending DDoS
There are two main problems on defending DDoS attacks. First of all, it is very
difficult to identify DDoS attacks from legitimate packets since the DDoS attacker
increases the number of packets to consume all victims resources (computational
power or memory usage). Because of the feature of DDoS, the attackers send packets
with useful or useless packet contents in which defenders are forced to decide the
packets as the normal and legitimate packets. Second, finding the location of DDoS
attackers is also hard task. Since the attackers spoof the packet information (source IP
address) to prevent the back-tracking from defenders, there are few ways of detecting
hidden attackers.
5


The efficient way of defend DDoS attacks is that defender detect DDoS attack in
early phase of the attack. Once one detects the symptoms of DDoS attack, he/she can
block the suspicious incoming packets so that he/she can prevent preserve his/her
system to be shutdown or continue service for legitimate accesses.
1.3 Motivation
We are interested in the computation time in DDoS attack detection, since none of
the researches covered a problem in reducing the computation resource. If we reduce
the computation time with an algorithm under the problems in 1.4, the DDoS attack
detection capability grows up. Sometimes, reducing computation time produces less
detection accuracy such as a sampling algorithm. Therefore, one needs to use an
efficient attack detection algorithm (detection approach) to maintain detection
accuracy with one of network detection approaches.
To enhance DDoS attack detection, here are two ideas: 1) Choosing a proper
detection algorithm (detection approach) 2) Designing a fast detection estimator.
With combining two ideas, one may have a better DDoS attack detection scheme.
For the proper detection approach, the information entropy approach is introduced
in Chapter 2. Since the information entropy approach can represent a network
condition relatively well, using information entropy is a good network monitoring
approach in a detection algorithm. To design a fast detection estimator, one may use
the other ways from traditional estimator computing, which produce the value of the
detection estimator (information entropy) in a fast computation fashion. For example,
a data compression or modification of the estimator can be used for a good DDoS
attack detection estimator.
6


1.4 Problems in DDoS Attack Detection
1.4.1 Network Robustness
Authors in [21] argued that with an increasing number of network users, services
and the current generation of high-speed network links, the amount of transferred data
has increased significantly. Recent studies demonstrate that using the entropy of
traffic distributions has tremendous value for network monitoring applications.
Realizing the potential benefit requires efficient algorithms for computing the entropy
[19]. In general, computing traffic entropy on high-speed links is a hard task, because
it is infeasible for traditional methods to keep up with the line-rates, due to constraints
on available processing capacity [19].
For an experimental study [22], authors gathered a trace, called PEERING,
derived from 10,000,000 IP packets seen at a peering link during a period of 37
minutes. Trace CAMPUS was derived from 10,065,600 IP packets seen at LAN near
the border of a campus network during a period of 300 minutes. Trace ABILENE was
collected from an OC48c link in the Abilene network. This study used 532,567,007
UDP and TCP packets present during a 2 hour period. They got a trace, called COS,
collected at Colorado State University. This study used approximately 37 million
packets collected during January 25 and 26, 2003. Raw network flow statistics
collected in an aggregation network during 1 day in September 2002. There were
229,448,460 records, representing 6,009,481,415 packets and 3,107,927,460,309
bytes.
This was to obtain an example of campus network data flow. In the core (Internet
Back Born) network, the traffic is significantly larger than this. As a result of these
heavy traffic volumes on the network, constraints of computation power and memory
resource were imposed to make it almost impossible to compute the statistics per flow,
since attack detector should monitor all kinds of packets from an unknown attacker.
Since a DDoS attack aims to increase the amount of packet flow with normal packets,
7


this network robustness is a big burden to DDoS attack Defenders.
1.4.2 Problem on DDoS Attack Detection
As Internet technologies grow, the threat of network attack becomes more severe.
A DDoS attack can become easy for attackers (ease of acquiring attack tools from
Internet or markets) while the defenders get more difficult to be able to detect
malicious network flowsince the DDoS attacker now has all normal packets flow
but spoofed packet information. A burden for the defenders is to process all packet
information within a limited time because a DDoS attacker sends a lot of normal
packets to a victim(s), while a DDoS attacker increases packet flow until it uses
overall capacity of the network channel. Although there is a good monitoring scheme
against DDoS attacks, it still needs relatively high computation time to verify an
attack from a normal packet flow.
1.4.3 Problem on Detection Schemes
To detect DDoS attack, several notable detection schemes were researched
(Signature-based, Distribution-based, and Entropy-based DDoS detection approach).
Due to the existence of heavy traffic, none of the approaches presented above is
sufficient to efficiently detect network intrusion attacks with limited resource
(computation power).
Signature-based approach is less efficient than both the Distribution-based and
Statistical approach due to false negatives, false positives, and the need to continuing
maintaining signature tables with system experts continuously. Also, the Distribution-
based and statistical approaches are not sufficient, since it needs relatively many
computations to yield statistics or distribution of network flow. Even though using
Entropy has several advantages, it still needs an efficient use of algorithm to reduce
computation time and memory usage in a high speed network.
8


To mitigate computation and memory usage, the sampling based methods have
been introduced to reduce processing and memory requirements, and to be suitable
for capturing some traffic statistics. However, one must trade off accuracy for
efficiency [20] the Sampling approach increased efficiency of the detector, but it
could have significant error to tolerate.
1.5 Research Focus
A network attack detection facility can be broken down into two main areas: 1) an
efficient algorithm to compute the value of a detection estimator (mathematical
quantity) and 2) a detection algorithm (called intrusion detector). Typically, an
efficient detection estimator depends on the time of computing an estimator value.
Meanwhile, an intrusion detector depends on the accuracy that isolates attacks in time
of attack occurrences. Indeed, there are many approaches for intrusion detection
which are discussed in Chapter 2.
This thesis focuses on designing an efficient DDoS attack detection facility which
can significantly reduce computation time while it maintains detection accuracy.
To come up with less computation time, we first decide what a network condition
monitoring estimator (attack detection estimator) is needed. Since an Entropy-base
detection approach provides insight how to estimate the network condition, we use
the information entropy concept as an attack detection estimator. To reduce the
computation time reduced by the conventional entropy concept, we introduce two fast
entropy computation schemes: Compression Entropy and modified entropy estimators
(called Fast Entropy). For the detection accuracy, we propose an Adaptive Moving
Average Detector (AMAD). We evaluate the detection accuracy in AMAD with three
different DDoS detection schemes (Conventional Entropy, Compression Entropy and
Fast Entropy Scheme).
The following areas are discussed in this thesis, and considered to be in scope.
9


Chapter 1 addresses the problems of DDoS attack detection, motivation, and the
research focus of this thesis. Related work is presented in Chapter 2. Designing new
entropy computations are described in Chapter 3. Detector Design is explained in
Chapter 4. Chapter 5 discusses the simulation analysis results. Chapter 6 concludes
the contribution of thesis. Finally, the future work will be discussed in Chapter 7.
10


2. Related Work
This section was written to provide the condensed background knowledge of
related researches about network intrusion, Distributed Denial of Service (DDoS) and
DDoS detection approaches. There are various network intrusion detection
approaches: Signature-based approaches, Anomaly-based approaches, and Entropy-
based approaches. We first review the meaning of network security and intrusion,
then present DDoS detection approaches. As a result of review on researches, the
entropy-based network attack detection algorithms become a dominant detection
approach due to the convenience of the entropy which sensitively represents network
condition. Also, it has additional diagnostic information about the nature of anomaly,
and offers useful information to data groups (information clusters).
This section is organized as follows: Network security and intrusion are
introduced at first. Then, DDoS attacks and detection approaches are reviewed. The
concept of information entropy is then briefly presented. Finally, the conventional
Entropy-based approach against DDoS attack is introduced.
2.1 Network Intrusion
2.1.1 General Network Attack Patterns
2.1.1.1 Basic Idea of General Network Attack
The whole network system is linked: every single component (such as router,
local user, server, and so on) is connected either tightly or loosely. Before or after a
network attack, it is important for attackers or defenders to know the whole network
chain in order to attack a target system(s) or defeat an intruder(s) efficiently and cost-
effectively as opposed to just fixing a ruined system. One needs to know the
11


vulnerability of the whole network system. A network must ensure linking processes
between components to do any jobs between hosts, networks, or servers. Obviously,
every component has a precondition (or pre-status) which leads to the current state [2].
The first precondition in the network should be individual users (called end users) of
Internet. From this initial precondition, an attacker attempts to use other network
components via remote login procedure with the intention to threaten a network
security. Once the remote login succeeds, it means /wo connected system are in
trust each other condition and one computer trusts another computer without
password authentication. [2]. With this trust between an end user and remote
systems connected physically or logically, an end user verifies the vulnerabilities of
systems. After that, if an end user exploits system resources or forces the system to
behave abnormally, an end user becomes a network attacker and network security is
challenged.
The main keys in network system security are verifying the system vulnerabilities,
analyzing attack patterns, and planning or building the defense system under
knowledge of vulnerabilities and attack pattern before or after attack(s). The
Computer Emergency Response Team (CERT) recommands a guideline to verify five
attack patterns [3] as follows: (2.1.2.2 ~ 2.1.2.6).
2.1.1.2 Automation Attack
In this attack, an attacker increases the attack level via an automation tool.
Through such a tool, an attacker achieves an efficient way of attack. An automated
attack usually follows four major steps:
Scan for Potential Victims
Compromise Vulnerable Systems
Propagate the Attack
Coordinate Management of Attack Tools
12


2.1.1.3 Sophistication Increasing
An attacker can also achieve a network attack by developing a more sophisticated
tool to prevent the defender discovering or detecting attacks through analysis because
of the tools functions. There are three major sophisticated approaches in network
attacks:
Anti-forensics: attackers use techniques that obscure the presence of attack
tools. This makes it more difficult and time consuming for security experts to
analyze new attack tools and to understand new and rapidly developing
threats.
Dynamic Behavior: Attack tools can vary their patterns and behaviors based
on random selection, predefined decision paths, or through direct intruder
management.
Modularity of Attack Tools: attack tools are more commonly being developed
to execute on multiple operating system platforms.
2.1.1.4 Increasing Permeability
Firewalls blocking unauthorized access are often relied upon to provide primary
protection from intruders. However, technologies are being designed to bypass typical
firewall configurations. Some protocols marketed as being firewall friendly are
designed to bypass typical firewall configurations.
2.1.1.5 Asymmetric Attack
Security on the Internet is based on interdependent system. Each Internet systems
exposure to attack depends on the state of security of the rest of the systems attached
to the global Internet. Because of the advances in attack technology, a single attacker
13


can relatively easily employ a large number of distributed systems to launch
devastating attacks against a single victim. As both the automation of deployment and
the sophistication of attack tool management increase, the asymmetric nature of the
threat will continue to grow.
2.1.1.6 Infrastructure Attack
Infrastructure attack is broadly attempts to affect key components of the Internet
infrastructure. They are of increasing concern because of the number of organizations
and users on the Internet. Their dependency on Internet keeps growing. Four types of
infrastructure attacks are briefly described below.
2.1.1.6.1 Worms
A worm is a self-propagating malicious code (or a program). Unlike a virus,
which requires a user to do something to continue the propagation, a worm can
propagate by itself. The highly-automated nature of the worms which coupled with
the relatively widespread nature of the vulnerabilities allows a large number of
systems to be compromised within a matter of hours. Some worms include built-in
denial-of-service attack payloads or web site defacement payloads. But the biggest
impact of these worms is that their propagation effectively creates a DoS (or DDoS)
in many parts of the Internet because of the huge amounts of scan traffic generated;
they cause much parallel damage.
2.1.1.6.2 Attacks on Domain Name Service (DNS)
DNS is the distributed, hierarchical global directory that translates names to
numeric IP addresses. The threats on DNS can show up in various ways:
Cache Poisoning: If DNS is made to cache bogus information, the attacker
14


can redirect traffic intended for a legitimate site to a site under the attackers
control (malicious destination redirection to a victim).
Compromised Data: Attackers compromise vulnerable DNS servers, giving
them the ability to modify the data served to users. Many of the Top Level
Domain (TLD) servers run a software program called BIND (Berkeley
Internet Name Domain), in which vulnerabilities are discovered regularly
(malicious payloads distribution).
Denial of service: A large denial-of-service attack on some of the name
servers for a TLD (for example, .com or .org) could cause widespread
Internet slowdowns or effect outages.
Domain hijacking: By leveraging insecure mechanisms used by customers to
update their domain registration information, attackers can adapt the domain
registration processes to take control of legitimate domains.
2.1.1.6.3 Attacks on Routers
Routers are direct traffics from a source to the intended recipient(s) on the
Internet (similar to mail routing facilities in the postal service). Threats fall into the
following categories:
Routers as attack platforms: Intruders use poorly secured routers as platforms
for generating attack traffic at other sites, for scanning, or reconnaissance.
Denial of service: Although routers are designed to pass large amounts of
traffics through them, they often are not capable of handling the same amount
of traffics directed to them. (Think of it as the difference between sorting
mail and reading it.) Intruders take advantage of this characteristic by
attacking the routers that lead into a network rather than directly attacking the
systems on the network.
15


Exploitation of trust relationship between routers: For routers to do their jobs,
they have to know where to send the traffic they receive. They do this by
sharing routing information between them, which requires the routers to trust
the information they receive from their peers. As a result, it would be
relatively easy for an attacker to modify, delete, or inject routes into the
global Internet routing tables to redirect traffic destined for one network to
another; this effectively causes a denial of service to both (one because no
traffic is being routed to them, and the other because theyre getting more
traffic than they should).
2.1.1.7 Impact Analysis of Infrastructure Attack
DDoS has become a popular approach for attackers because the attackers can do
serious damages with less effort to the victim network(s). Also, (D)DoS can impact
on every type of infrastructure attacks. On the other hand, it is a big pain for
defenders. (D)DoS can congest the Internet because the attack of DoS re-codes (self-
reproduces, replicates, or duplicates) the hundreds or more Mbps. That could lead
results total resource loss from a local attack. In case of a virus attack (since most of
viruses have ability to clone or reproduce themselves), the virus can copy personal
information and distribute it to another network. It yields confidential accident.
Last but not least, the lack of time or resources is the most significant problem
under infrastructure attacks. With the growth of on-line commercial companies, stores,
and banks, the loss of resources in the network means that they immediately lose a
great loss of money. According to CERT report on attack trends, computer economics
estimated that the total economic impact of the infrastructure attack was $2.6 billion.
CERT also reported the cost of restoring communication systems ($1.3 billion after 9-
1-1 destined of the Twin Tower in New York, the cost to restore communication
capability was about $15.8 billion) [4],
16


2.2 DDoS Attack
2.2.1 Overview
CERT gives us a clear concept about a DDoS attack [5]. All the general scenarios
are as follows:
First, attackers build a network of computers that they will use to produce the
volume of traffic needed to deny services to other computer users. To make this attack
network, they keep trying to find computers in the network that are poorly secured
(not properly patched, no anti-virus feature, out of date version of a program or
operating system). Second, if they find vulnerable computers (called Zombie), they
install an attack program so that they can control and carry out attack that they want
to propagate into the network system.
Figure 2.1 Structure of a Typical DDoS Attack
Third, once computers (zombies) are taken by the attacker, the installed program
forces the compromised computers to be joined as a member of the attacking alliance.
As they capture more and more computers, they get more zombies. The attacker sends
an attack command to the zombies through secured channel to launch a bandwidth
attack against the targeted victim(s). The DDoS attack power increases exponentially.
The process of building a network for DDoS is a problem since it creates another
significant traffic load in the network.
17


Finally, once the attack network is completed, the attacker is now ready to attack a
victim or victims. Usually, attackers select a router or server with a high
concentration of traffic and center of data transmission.
2.2.1.1 General Characteristics of DoS/DDoS Attack
A crucial element in the DoS/DDoS attack is that an attacker focuses on the
volume rather than the content of the attack. This suggests two general characteristics
of DoS/DDoS, and gives clues how to defend DoS or DDoS attack and develop a
complex defense method [7]
Attackers can send a variety of packets either spoofed or not. The attack
traffic can be made arbitrarily similar to legitimate traffic, which greatly
complicates a defense.
The volume of traffic must be large enough to consume the targets resources.
The attacker usually has to control more than one computer to generate the
attack traffic.
Also, the difficulty of defending a DDoS attack was presented in [7] on the basis
of five main Internet principles of design;
Many DDoS defense approaches need to be deployed at numerous locations to
be effective. However, due to the lack of central control, it is extremely
difficult to enforce global deployment in the Internet, which makes highly
distributed solutions unattractive.
The distributed nature of DDoS attacks renders a single point solution
ineffective. Moreover, due to privacy and other commercial concerns, network
service providers generally are reluctant to provide detailed information about
the traffic patterns within their networks and often will not cooperate in
tracing attack sources.
More importantly, there is no automated support for tracing attack sources.
18


Once a source tracing request is authenticated and authorized, it has to be
enforced interaction by human.
2.2.2 Attack Methods
2.2.2.1 Protocol Based Attack
This type of attack is based on the weakness of network protocol such as TCP or
ICMP. There are two major attacks of bandwidth, which are consumption of the
hosts resources and consumption of network bandwidth [7].
The Consumption of Servers Resource: An attacker sends so many packets that
a victim-sever cannot process all of the incoming packets. As a result, any
normal users are forced to slow down their sending rate, while the attacker
maintains or increases his/her rates. The victims resources will be saturated
beyond their capacity.
The Consumption of Hosts Resource: If malicious flows are able to dominate
the communication links that lead to the victim, then the legitimate flows will be
blocked. Therefore, not only the intended victim of the attack is disabled, but
also any system that relies on the communication links of the attack path.
2.2.2.1.1 TCP SYN Flood
To establish a TCP connection, the client and server must participate in a three-
way handshake. To start TCP, a client sends a SYN message. After the server
acknowledges the SYN message, it sends a SYN-ACK message to the client. If they
finish connection, the client and server can exchange their data via the connection. In
a SYN Flood attack, the attacker sends SYN packets with forged IP source addresses
such that the incoming traffic appears to originate from multiple clients. Assuming
that the packets match the firewall's rule-base, the server will create state table entries
19


to track the pending connections. Because the client addresses are forged, the SYN-
ACK messages sent to the clients by the server will be discarded, leaving the servers
state table filled with bogus entries and eventually creating a denial-of-service
condition [8],
2.2.2.1.2 ICMP Flood
Attackers are using ICMP echo request packets directed to IP broadcast addresses
from remote locations to generate denial-of-service attacks. There are three parties in
these attacks: the attacker, the intermediary, and the victim (note that the intermediary
can also be a victim). [8]
Figure 2.2 ICMP Attack Example
The intermediary receives an ICMP echo request packet directed to the IP
broadcast address of their network. If the intermediary does not filter ICMP traffic
directed to IP broadcast addresses, many of the machines on the network will receive
this ICMP echo request packet and send an ICMP echo reply packet back. When
(potentially) all the machines on a network respond to this ICMP echo request, the
result can be severe network congestion or outages [9].
20


2.2.2.1.3 UDP Flood
In a UDP Flood attack, the attacker sends large numbers of small UDP packets
with forged IP source addresses. However, because the UDP protocol is
connectionless, there are no session status indicators (SYN, SYN-ACK, ACK, FIN, or
RST) to assist the firewall in detecting the abnormal protocol states that indicate an
attack. As a result, state-based servers must rely upon source and destination
addresses to create state table entries and set session timeout values [8].
2.22.2 Application Based Attack
A different type of attack is possible on applications by way of amplifying attack
resource level that is to force the target to execute expensive operations. For example,
many Web sites provide search engines to allow users to find a particular Web page.
An attacker can exploit this application by sending a large number of queries to a
Web sites search engine. In this way, the Web site is forced to perform CPU and
memory-intensive database operations which leaves few resources to serve legitimate
users. The main targets are World Wide Web (WWW) and Voice over IP (VoIP) [7].
2.22.3 Distributed Reflector Attack
A Distributed Reflector Attack (DRA) is the extended concept of DDoS attack
and more serious than DDoS. DRA is called Distributed Reflector Denial of Service
(DRDoSr [7].
21


Although DRA and DDoS goal is the same, unlike DDoS, once the attackers of
DRDoS gain control of reflectors (innocent third party such as routers or web servers),
attackers send attack order to the reflectors that send malicious packets with victims
IP address as the source IP address instead of sending attack packet directly to the
victim [7].
2.3 Information Entropy
Entropy is a concept identified by Shannon in 1948. Entropy is a quantity, a
measure of the uncertainty of a random variable. Let X be a discrete random variable
with alphabet % and probability mass function p(x) = Pr {X=x}, xey. The entropy H(X)
of a discrete random variable X is defined by
H(X) = xp(x)log p(x),
where OlogO = 0, and H(X) > 0 since 0 We can use the base of logarithm as 2, since the entropy is expressed in bits (binary)
in computer system. Entropy is a function of the distribution of X. it does not depend
on the actual values taken by the random variable X, but only on the probabilities. For
22


instance,
Let X =
r a with probability 1/2
b with probability 1/4
c with probability 1/8
^ d with probability 1/8
The entropy of X is
H(X) = (-l/2)log(l/2) + (-l/4)log(l/4) + (-l/8)log(l/8) + (-1 /8)log(l/8)
= 7/4 bits.
If probabilities of random variables are equally likely, the entropy has
maximum values. Assume we have two variables of X such that 1 (with
probability p), 0 (with probability of 1 -p).
Figure 2.4 Entropy in case of two possibilities with probability of p and (1-p)
The function of the basic properties of entropy is: It is a concave function of the
distribution and equals 0 when p = 0 or 1 (there is no uncertainty, which means
everything can be clearly known). Similarly, the uncertainty is maximum when p =
1/2, which also corresponds to the maximum value of entropy. This property can be
23


easily used in network traffic monitoring. If network traffic changes from normal to
abnormal status such as when the DDoS attacker sends a bunch of packets with the
same port number to saturate a certain port, the entropy of this port number will be
decreased. By contrast, under normal conditions, the entropy of the port number will
be increased. This phenomenon can be applied various network information such as
source IP address, destination IP address, source port, destination port, the total
number of packets and even in the data clustering schemes.
2.4 DDoS Detection Approaches
We reviewed Network security and intrusion issues, DDoS attack, and
Information entropy in a previous section. Our main concern is to detect DDoS
intrusion in the network. Various researches were done and several detection
approaches were introduced. We review three major approaches for detecting DDoS
attack. After a review of related research, we will present the problem of these
approaches in section 2.4.
2.4.1 Signature Based Approach
Most intrusion detection systems (IDS) are known as signature-based. This means
that they operate in much the same way as a virus scanner, by searching for a known
identity or signature for each specific intrusion event. While signature-based IDS are
very efficient at sniffing out known attacks, they do, like anti-virus software, depend
on receiving regular signature updates, to keep in touch with variations in hacker
technique. The goals of SB A are: 1) finding an attack pattern or signature 2)
narrowing a precise focus to reduce false negatives 3) being flexible to cover as many
variants as possible to reduce false negatives. In [11], SB A has several advantages
and disadvantages.
SBA has 4 main advantages: 1) It looks for specific and explicit indications of
24


attacks, since attacks are identified by raw byte sequences (strings), protocol type,
port numbers, etc. 2) Since only known attacks can be identified by SBA, the SBA
has very low false positives. 3) It is easy to implement SBA. 4) Since signature
information can be shared via online repositories, it is easy to share.
However, it also has disadvantages: 1) all systems having SBA must be trained 2)
SBA has potential false negatives because it may not detect even simple variations of
attack. 3) SBA has also false positives characteristic. When attack was failed, or
system was poorly configured, it may detect false attacks. 4) If the signature is stolen,
detection system is not working properly any more, since the attackers are able to
change attack method by using opened information from stolen signature data.
2.4.1.1 Intrusion Detection System Placement
To monitor the anomaly with SBA, one should decide where the Intrusion
Detection System (IDS) is installed. IDS must be placed one of two ways (early
warning mode or internal deployment). Early warning mode installs the device and
monitors attacks outside of network but has no capability of watching internal attacks.
Internal deployment needs little sophisticated configuration. However, it increases
security and detects internal attacks. Usually, the internal monitoring mode is
preferred.
2.4.1.2 SBA Procedure
A procedure is required to build an SBA due to the need to gather pre-knowledge
with the help of experts or observed attack data base. There are two simple
procedures to make SBA which are making signature and designing a detection
engine with signature.
25


2.4.1.3 Making Signature
With a knowledge base or database, an SBA designer (expert) makes a signature
set to install in IDS. Also, whenever he or she detects a new attack(s), he/she must
update signature either manually or in an automated way to reduce false negatives. An
example of a signature is presented in Table 2.1.
Table 2.1 An Example of Signature Table
Contents Description
Signature ID xx.xxxx
Message DDoS message detected
Rule Alert to system administrator via console, then, block down the flow
2.4.1.4 Designing an Engine
A detection engine does actual detecting by using signature approach. It defines
logging option (converting data into human readable format), and rules (how to act
against detected attack such as pass, just log, or alert).
Figure 2.5 An Example of Detecting Engine with SBA
26


2.4.2 Anomaly Based Approach
Due to the faults of SBA such as human errors, false positives, and false negatives,
one needs a safer detection approach, called an Anomaly Based Approach (ABA). It
contains distribution analysis approaches, data mining, statistical approaches and so
on. Usually, ABA is considered as more complex architecture. In fact, SBA needs to
consult every pattern of incoming traffic, which means more work load to maintain a
relatively high security level.
In network traffic terms, SBA captures all the headers of the IP packets running
towards the network. From this, it filters out all known and legal traffic [13]. Because
of this reason, we need anomaly sensors in high bandwidth connections or in near
backbone network to monitor the network efficiently. There are other equally obvious
advantages to using anomaly-based IDS. For example, because it detects any traffic
that is new or unusual, the anomaly method is particularly good at identifying sweeps
and probes toward network hardware.
Therefore, it can give early warnings of potential intrusions, because probes and
scan are the predecessors of all attacks. And this applies equally to any new service
installed on any item of hardware [13]. Also, once ABA is properly installed, there is
no need for update. Furthermore, it does work even in different platforms. For ABA
we will review distribution-based approaches and statistical anomaly detection.
However, it can yield false negatives since it can detect burst accesses of normal users
as precondition of network attacks, and has no pre-knowledge of attack signature
unlike SBA.
Three ABA approaches (Distribution Based Detection, Statistical Anomaly
Detection, and Entropy Based Approach) are presented in following sections.
2.4.2.1 Distribution Based Detection
Anomaly Based Approach Network Intrusion Detection System (ABA NIDS) is
27


not yet commercially deployed due to a false positive characteristic. However, AB
NIDS remains attractive and powerful method for NIDS and has a strong advantage
in that it can do something proactively before attacks happen.
To acknowledge an attack proactively, a NIDS designer should get some
information from the network flow such as volume of traffic, distribution of traffic
and so on. One approach to NIDS is to use distribution of network flows. In this paper,
several approaches using network distribution will be explained and compared. With
creating a distribution of traffic information mining to watch anomaly, one can use
traffic distribution. We will review three different ABNIDSs (Base Distribution,
Inversed Distribution approach)
2.4.2.1.1 Base Distribution
A DDoS attack is initiated with scanning by using malicious software such as a
worm virus. NIDS has been studied to detect worn from the network packet flow. In
the 1980s, most intruders were experts, with high levels of expertise and individually
developed methods for breaking into systems. They rarely used automated tools and
exploit scripts.
However, anyone, nowadays, can attack Internet sites using readily available
intrusion tools and exploit scripts that capitalize on widely known vulnerabilities [14].
This makes attackers use worm tools easily against targets. Thus, by characterizing
the worm attack pattern with distribution, the victim can detect DDoS attacks.
Authors in [15] suggest a prototype system, called Early Bird system, for
detecting worms based on two observations: 1) Some portions of the contents in
existing worms is invariant-typically the code exploiting latent host vulnerability. 2)
It is rare to observe the same string recurring within packets sent from many sources
to many destinations. They suggested that by sifting through network traffic for
content strings that are both frequently repeated and widely dispersed, one can
28


automatically identify new worms and their precise signatures.
With their assumption 1), Worms are designed foremost to spread, the invariant
portion of a worm's content will appear frequently on the network as it spreads or
attempts to spread. Conversely, content which is not prevalent will not represent the
invariant portion of a worm and therefore is not a useful candidate for constructing
signatures.
They first suggested that if an algorithm extracts all payloads from every packet
and save the content of each payload, the algorithm can find the symptom of worm
which indicates a pre-condition of DDoS. If they use big hash table and categorize
packet according to port, the computation can be reduced. With the idea of 2), they
suggested an idea that the number of distinct hosts infected by a worm will grow over
time.
Consequently, packets containing a live worm will tend to reflect a variety of
different source and destination addresses. During a major outbreak, the number of
such addresses can grow extremely quickly. Finally, it is reasonable to expect that the
distribution of these addresses will be far more uniform than typical network traffic
which can have significant clustering.
To do so, they suggested a multiple-bitmap algorithm which can be filled in any
IP address of all packets by extracting all packet information with a hash table to
reduce computation. With this bitmap information, one can analyze whether the
traffic is in normal or not (worm attack). After gathering those two kinds of
information, one can judge this network traffic as normal or abnormal regarding
DDoS.
However, this approach is still burdened significantly in computation because
algorithm has to count and extract every packet. Also, data mapping is needed for
analysis of data flows, even though it can detect anomaly in the network link.
29


2.4.2.1.2 Inverse Distribution
Research by the authors of [16] showed that the EarlyBird system [15] looks for
frequently occurring substrings in packet contents as indicators of potentially
malicious content. A similar idea can also be found in the Autograph and Polygraph
systems; these systems reason about the frequency with which patterns of substrings
appear to generate compact and discriminating signatures for a collection of packets
classified by an external entity as being malicious [16], By the study of [16], Inversed
Distribution Scheme was proposed. They suggested inversed distribution function,
1(f), which tracks for a given frequency / (the number of substrings that appear with
that frequency. They argued that the EarlyBird system (same as Autograph and
Polygraph) needs to verify the distinction between normal and abnormal distribution
by using frequencies at the point of a certain threshold. Enhancing performance in the
EarlyBird system leads it into high false positives.
Figure 2.6 Base frequency distribution (top), Inverse frequency
distribution (bottom) |16]
To correct false positive problem of the EarlyBird system, it needs threshold lists
of attack patterns (they did observation on worm attacks) or to wait for the instant
30


where the frequency of the worm fingerprints exceeds that of previously known
sources of content similarity. These approaches rely upon a probabilistic model for
the shape of the distribution, and emphasize early detection of an attack, with a very
small number of worm packet instances and with as few false positives as possible.
They used the term, bumps (corresponding to the inverse distribution 1(f) ), which
tracks for a given frequency f the number of fingerprints that appear with that
frequency. They showed a plot line (inverse distribution curve) and certain region on
the graph (contributions from fingerprints that have incurred displacements from the
previous time epoch which are greater than a fixed threshold).
They noted that the plot shows us the emergence of the new worm packets, and
the regions show us the portion of the frequency spectrum occupied by the fast
moving fingerprint of packets. The inversed distribution approach is efficient in that it
can identify the worm attacks clearly and unambiguously. However, just as the Base
distribution approach, it still needs computation to gather information distribution. To
get the inverse distribution of incoming packets, they should use a kind of conversion
calculation (counting background (overall packet group) and single packet group).
Therefore, the Inversed approach is efficient not in computational complexity but in
the verifying anomaly from normal condition.
2A.2.2 Statistical Anomaly Detection
Statistical Anomaly Detection (SAD) uses the statistics for DDoS attack detection.
SAD can be implemented for detection at a single point or various points
(collaborative working). Three main ideas were presented as SAD approaches: 1)
Conventional Algorithm for attack detection 2) Using packet inter-arrival time and
Entropy 3) Chi-square algorithm. We will review two main ideas and identify how
these ideas work in DDoS detection.
31


2.4.2.2.1 Conventional Statistical Algorithm
Emmanuel et al suggest the use of GAIA sensor (Detection Architecture) for early
detection of DDOS attack in [17]. GAIA has four main components, which are
Modeling, Mediation, Detection, and Alert Generation component. The sensor
monitors IP, transport protocol, and port number. The GAIA engine has four
components work as follows:
Mediation Layer: Observe traffic, compute values, feed values into other
component
Modeling Engine: Model a normal traffic pattern, then save them in
database storage.
Detecting Engine: Compare incoming pattern with normal profile
Alert Engine'. Generate alert if Detecting Engine produces an anomaly
A static algorithm uses traffic clusters and traffic diagrams into consecutive
periods of stable behavior, and computes thresholds for each one of these periods.
Once one acquires stable behavior, it can be used as a normal traffic pattern in the
database.
2.4.2.2.2 Chi-Square Algorithm
The Chi-square statistic is a mathematical quantity that a nonparametric statistical
technique used to determine if a distribution of observed frequencies differs from the
theoretical expected frequencies. Chi-square statistics use nominal (categorical) or
ordinal level data. Thus instead of using means and variances, this test uses
frequencies.
r s\_17^2 -
The Chi-square statistic is given by X2 = £------, where X is the Chi-square
statistic, O is the observed frequency, and E is the expected frequency. Let a bin be
32


combining a set or range of possible values and treating them as one. For example,
the packet length can be classified several bins having ranges such as 0-64 bytes,
65-128 bytes, and 129-255 bytes. Also, we let Nt be the number of packets in a
sampled traffic, B be the number of available bins, and be the expected number of
packets in the /-th bin under typical (normal) traffic. Then, the Chi square statistic is
represented as follows [18];
*= )
nl
With Pearsons Chi-square test, we can compare the distribution in case the
measurement contains all discrete value. The following steps are proposed by [18]
with the Chi-square test.
Mapping packet attribute values to frequencies in a current-traffic profile
Current-traffic profile is compared with a baseline profile using the chi-
square statistic periodically.
The baseline profile can be maintained as decaying averages of the current-
traffic bin frequencies. Each time the current-traffic bin frequencies are
computed, the average is updated
However, the authors in [18] argue the disadvantages of the Chi-square approach
due to two instabilities of Chi-square. First, the test works best when the number of
possible values is small. In particular, a rule of thumb is that the expected number of
packets in a sample having each possible value be at least five values. Secondly,
when the baseline frequency value for a given bin is very low, the chi-square statistic
may be excessively influenced by that bin's value. Ideally, the bins will be defined
such that this is unlikely, but as a fallback, low-value bins can be automatically
merged with adjoining bins prior to computing the Chi-square statistic.
33


2.4.3 Entropy Based Approach
2.4.3.1 Background and Strength of Entropy Based Approach
Various detection schemes were presented with Entropy quantity in section 1.3.
The main idea is that if an attacker starts his/her DDoS attack, the randomness of
network flow should be changed according to the attack phase. For instance, if the
attacker starts a DDoS attack with spoofed IP addresses, the types of IP address
should be increased. At this time, the randomness of IP address should be increased
significantly. Meanwhile, the types of source IP addresses will be decreased since an
attacker tries to send a bunch of normal packets to the one victim. In the same fashion,
we can observe the entropy changes of source/destination port number.
However, due to the limited computational power and memory space, more
efficient way of expressing network status has been needed. The static of information
entropy was introduced to express network randomness. As we reviewed in previous
chapters, SBA needs to keep maintaining and updating new attack information, if not,
it will increase false positives. Statistical Approaches need lots of computation to
verify statistics such as mean, standard deviation, distribution skew, and so on.
In Distribution-based approaches could capture network status more succinctly,
but it requires appropriate metrics (characteristic isolation) to encapsulate and capture
features of the underlying traffic distribution. However, entropy based anomaly
detection has three significant benefits [19]: First, the use of entropy can increase the
sensitivity of detection to uncover anomalous incidents that may not list as volume
anomalies. Second, using such traffic features provides additional diagnostic
information into the nature of the anomalous incidents (e.g., making distinction
between worms, DDoS attacks, and scans) that is not available from just volume
based anomaly detection. Third, entropy of traffic feature distributions offers useful
information to measure distance among traffic groups (clusters).
34


Also, the authors in [20] argued that entropy can be used by estimating the
entropy of IP packet size. They argued that attacks usually produce packets
independent of the response from the victim. Moreover, flooding-based attack traffic
often consists of packets with identical sizes. For example, a SYN flooding attack
traffic consists of SYN packets with 40 bytes and an ICMP flooding attack traffic
consists of ICMP packets with 1500 bytes. Hence, they believed that the distribution
of packet size is changed under attacks and that analysis of the packet size
distribution can identify attacks to some degree, especially when some special IP
packet size distribution appears. The key question was how to effectively describe the
packet size distribution in a manner that provides necessary information for attack
detection.
After conducting observations, they found that entropy, which describes the
degree of dispersion or concentration (randomness) of a distribution, is an effective
metric for extracting the properties of the packet size distribution in a manner that is
appropriate for attack detection [20].
2.4.3.2 Conventional Entropy Computing
As shown in 2.3, the conventional entropy uses the classical equation;
H = -EjLi P;fo^Cpi), where A(alphabet) = (ai, a2, ..., aj,
S: the input string with n number of symbols from A
To calculate the entropy of a packet stream, the algorithm must use the probability
(pi) that there will be a frequency of a, out of all of arrived packets. We need to store
every packet in the packet repository with its counter to compute the probability.
Therefore, every packet needs to be searched whether its packet is already stored or
not in repository.
35


Packet Repository
Packet Arrivals
Ri Counter = 1
Rz Counter = 1

Rm Counter = 1

Figure 2.7 Conventional Entropy Computation Flow
If every incoming packet is the same, there is no search time for storing new
incoming packet since it searches just first entry in the repository table. However, in
the worst case, a new incoming packet should search all over the repository table if
every packet is identical to the one when an attacker sends spoofed source address
during a DDoS attack.
If we dont have a compression scheme, the conventional algorithm works as
follows with probability computing:
Function: Conventional Entropy {Pkt, t, R)
Input: t Network monitoring interval
Pkt Network Packets within t
R Repository to store incoming packets
and their count value
Output: H Entropy of Symbol Stream
Setup monitoring time:=0, Entropy := 0; Pkt_types := 0; Totaljpacket = 0; Data_exist:= false;
36


Begin
1. While (monitoring time <= t) //save packet during t
2. Total_packet++;
3. Search R
4. If (R.item == Pkt)
5. R.item.counter ++;
6. Data exist := true;
7. End if
8. End Search
9. If (!Data_exist)
10. R.item := Pkt; // Save Pkt at the end of R
// As a new entry
11. R.item.counter++;
12. Pkt_types++;
13. End if
14. End while
15. For (i=0; i < Pkt_types ; i++)
16. Read P.item.counter
17. Compute Probability of typej
18. Entropy += Compute Entropy
19. End for
20. Return Entropy;
END
The runtime of the conventional entropy computation algorithm has two iterations.
In the first while loop, we assume that we have n packets and have m types of packets
during monitoring interval t. For the first two types packets saved in R (Monitoring
Repository).
37


In the worst case (i.e. every packets are identical), each i-th packet (pkti) search
the repository R from Ri to Rj: 1 + 2 + .............. + n = n^2 ^ operations, and 3
arithmetic operation and 1 substitution exist in the while loop. Therefore, the
runtime of first loop is n^n ^ + 4n. Second loop has n* 1 operations. Therefore, the
total runtime is
n('Y1' + 4n + n + 1 (return) = ^n2- + 5n + 1 0(n2)
38


3. New Entropy Computation Design
3.1 Compression Entropy Design
If we use lossless data compression, we can get source information entropy with a
sufficiently large input source. Various data compression schemes were introduced in
both loss and lossless compression way. In our thesis, we focus on the lossless data
compression, since if we lose data from the compression scheme. It means that we
cant get more accurate entropy from compression. Typical lossless compression
schemes are Huffman coding with probability and Universal source coding without
probability (Lempel-Ziv 77, 78, and its variants). Due to our objective which is to
reduce computation time, those two are not proper to compute entropy values from
the data source since it needs more computation time during the compression phases.
One research study discussed how the compression scheme is used to get faster
entropy computation [23] by using linear regression to analyze the relation between
original source entropy and compressed entropy for worm detection [23], The author
in [23] suggested an experimental result of reducing computation for entropy
calculation with three data compression algorithms which were bzip2, gzip, and
lzolx-I. They first compressed the sampled packets with a compression algorithm,
then calculated entropy with the compressed information. They find the Lempel-Ziv
compression algorithm (lzolx-1) has the best performance in computation time and
memory usage.
They proved that entropy estimated by compression is strongly related to the
estimated entropy by using linear regression analysis. However, in statistics, linear
regression is a form of regression analysis in which the relationship between one or
more independent variables and another variable, called dependent variable, is
modeled by a least squares function, called a linear regression equation. Therefore,
39


the relation from linear regression does not guarantee the concrete relation which
means that it has potential error factor. Therefore, with this Relation approach to
entropy computing could sometimes be unclear. Also, the author of [23] found that if
there is significant fluctuation of coefficient on destination IP addresses, Relation
approach is inappropriate to our goal in which we aim to destination IP addresses for
detecting DDoS attack maintaining detection accuracy (maintaining accurate entropy).
Ratko Tomic introduced a Fast, Optimal Entropy Coder by using a combination
scheme in data compression in [24], We will introduce and design the fast
compression entropy, exploiting the compression concept of [24],
3.1.1 Binary String Compression
Packet information (source/destination IP address, source/destination port number)
can be represented with binary string (0 and 1). Also, the information is nothing but
distinct bits (Os and Is). If we use indexing lattice path from [24] of the given source
string, we can compute entropy with only the number of Is counted.
I(b1b2...bn) = £y=1 ("/) where 0 bj: i-th binary element
n : total number of bi where k is the number of 1 s in S
{tij} : a subsequence of the sequence {/-I},
retaining only those values of (/-l) for which bi= 1, or in words,
tij : a zero-based bit index which picks out only the ls from the input stringS.
The size (in bits) of the path index / with k ones is log(N(rt-k,k)) = log(C(tt,k)),
where the binomial coefficient C(n,k) is the path count for M-bit strings with k ones.
Applying the Stirling-Approximation for the factorials C(rt,k), the path index size
log(C(n,A)) which is compression entropy becomes:
40


Log(C(n ,k)) = n[plog(l/p) + qlog(l/q)J -l/2log(2n npq) -..........(1-1)
, Where the probabilities through: p(l) = p= k/n andp(0) = q = (n k)/n
Until now, we have a runtime of fast optimal entropy coder taking 0(1), if we know
In the network, the system gets a continuous stream of packets which means that
it has a sequential symbol stream. We have known the computation time is 0(1) if we
have a binary string with knowledge of the number of 1 s count. The problem is how
to convert sequential symbol stream into a binary symbol stream.
We can use the idea of Multi Alphabet Source' scheme p.31 in [24]. As shown
(1), the entropy of the binary source can be computed with the count of Is. Let a
sequence of n symbols be Sn taking values from an alphabet Aq = { a\, ai,..., aq}, and
be a list of q counts k\, &2, fe,... kq (each count kt counting the corresponding symbol
a,) adding up to n.
The number of different arrangements of these n symbols with the given symbol
counts k, satisfying (2) is the multinomial coefficient:
the number of Is.
(1-2).
ki + k2 + kj +... + kq n
(2)
N(n, kl, k2,
> kq) = (
fcl+/c2HI-kq
kl,k2,....kq
)
(kl+fc2+---+fcq)!
kl\k2\...kq\
k lA k2 )
kl\fkl+k2\
kl+k2+--
kq
+liq) by rearrangement
41


= n?=1 by restatement with
multiplication notation. (3)
If we take a log of both sides, we can get the entropy of sequence of alphabet.
Entrap,(AJ = log(()) + log ((;)) +..........+ log((*1+*£-**))
= log((*1+*-+..............+ log (C"1 2 3)) + logO------(4)
3.1.2 Combination Compression Entropy
In [24], the author suggested binary splitting until it has an empty set of Is. Also,
the author argued that Huffman coding is suitable for optimal prefix codes for
producing compressed data with little processing and more memory usage. However,
in the network monitoring scheme, the entropy is more important than the optimal
code itself, since the goal of network traffic monitoring scheme is not getting optimal
compressed data but detecting the abnormal change in the traffic with entropy values.
Ironically, the compressed data (the output of compression algorithm) is just a bi-
product for the monitoring facility or not needed by the detection system. Therefore,
we will focus on how we can produce the entropy of a symbol string with less
computation time. We start the interpretation of first term of (4), log((/a+fe^?"'+kc?))
until the last term.
1st Term: C (k\ + k2 +... + kq, kq) = C (n, kq) by (2)
2nd Term: C (k/ + + + kq.i, kq.j) = C (n-kq, kq.j)
3nd Term: C (ki + k2 + ... + kq.2, kq.i) = C (n-kq -kq./, kq.i)
42


(q-l)-th Term C (fq + k2, hi) = C (n- kq kq.} -.- kq.2, kj)
q-th Term C (kt, kj) = C {n- kq kqq -.- kq.2 kq.i, kt) = 1
Since in every i-th Term computation, the number of objects from which we can
choose (let this number be ,) is subtracted by kq.j./ element, the size of n, keeps being
decreased. Again, we dont have interests in codes. Network traffic data (IP addresses
and Port number) has unique and distinct values. On one hand, an IP address has a 32
bit positive value, which is same size of unsigned integer type. On the other hand,
Port number has 16 bit positive value, which belongs to unsigned integer variances.
We can do a calculation with a heap structure since all packet information (addresses
and port numbers) can be arranged by its value. The number of same packet
information (in terms of address or port number) means k, in (3). For instance, if we
insert source IP address into heap, we can pull out the element until pulled data meets
different address value. If we count those number of packets and regard this number
as kj, we can calculate entropy values in each steps (herein, i-th term). Also, the k,
means the number of Is in the stream Ej=1 kj in the i-th stage. If we add up every
entropy in every steps, we can get the total entropy value.
The entropy computation algorithm (via combination compression algorithm)
works as follows:
Function: Heap-Combination Entropy (Pkt, t)
Input: t Network monitoring interval
Pkt Network Packets within t
Output: H Entropy of Symbol Stream
Setup Create and Init Heap, monitoring time:=0,
Current _item := Next_item := 0;
Entropy :=0;
43


Begin Pkt_count :=0; Total_count := 0;
i. While (monitoring time <= t)
2. {
3. Heap := pkt;
4. Total_count++;
5. }
6. Current_item = Delete item from Heap;
7. for (i=0; i 8. {
9. Next item := Delete item from Heap;
10. If(Current_item == Next_item)
11. Pkt_count++;
12. Else
13. {
14. Entropy += Compute Entropy with Total_count
15. & Pkt_count; (5)
16. Total_count = Pktcount;
17. Pkt_count := 0;
18. Currentjtem := Next_item;
19. }
20. }
21. 22. END Return Entropy;
44


3.1.3 Runtime Analysis of Compression Entropy
(5) in above algorithm can be computed in constant time because of (1-1) and (1-
2), where p(l) = Tota^cmmt'' anc*= ^ _PO)- This algorithm is very similar to
heap sort algorithm, since if we have n packet, it inserts n packets and deletes n
packets again. Known as heap sort, it has runtime of 0(n log n). In the heap structure,
insertion and deletionjnin operations take 0(log(n)) time in the worst case.
The runtime of the first while loop is:
Tl=1{log(n)+ 1) =Sf=i(%(n)) + S?=1(l)
= n?=i log (n) +n = log(n!) + n
= (lg(n) -l)n + n = nlog(n)
The runtime of the second for loop is (the worst case is when every packet is
different so that else runs all the time):
I Uilogin) + 4) = I Uilogin)) + 2?=1(4)
= nr=i l9 (n) + 4n = log(n!) + 4n
= (log(n) -l)n +4n = nlog(n) + 3n
(.Return needs 1 operation)
Therefore, total runtime of Heap-Combination Entropy is
n log n + n log n + 3n + 1 = 2n log n +3n + 1 O(n log n)
3.2 Fast Entropy
3.2.1 Approach for Fast Entropy Computation
A network attack detection based on the information entropy is frequently used,
45


since entropy can represent network condition with a simple number. The classical
way of computing entropy value is using the equation,
H = -£"=i Pi log ph
where p, the probability i-th symbol occurrences from all observed symbols.
The value of H means the average number of bits per a symbol given information
string. If we multiply H by the number of total symbols given symbol string (n), n*H
means the total number of bits to represent input string (i.e., total amount of
information). If the types of symbols are increased, the H is increased. Otherwise the
H is decreased even though the total number of symbols is increased. Due to this
characteristic, we can monitor and analyze the network traffic condition.
Intuitively, H is a quantity of disorder. If the disorder (the number of types of
symbol) is increased, H must be increased. Otherwise, H must be decreased.
Therefore, the simple expression of entropy can be represented as H = log (the
number of possible states within input information). It is very similar to physical
entropy, Ii = log (the number of possible system states). Each of which has the value
to represent information or system disorder. The reason we use the probability is that
we are able to know the information amount (size of bits) per symbol. However, the
specific value for a symbol is not important, since the goal for monitoring is
recognition of network pattern such as the number of different type of symbols (the
same as how many different packets in a network) and the total number of symbols
(the same as how many packets arrived in a network). We define the symbol as a
packet in the network monitoring scheme, hereinafter.
One thought to reduce computational consumption of entropy is to use only the
number of different types of packets. With the help of this thought, we can use
entropy as H = log (the number of distinct packets). One problem may occur since it
doesnt reflect the total number of packets. Usually, an attacker increases significantly
46


the number of packets to saturate victims system capacity during DDoS attack. As a
result, we need one more criteria that is total number of packets. Now, we define
entropy as:
H = log where m is the number distinct packets,
n is total number of packets in an input.
In a network packet flow, if we have m = 2 distinct packet addresses in total n = 8
packets, the entropy is H = log (1/4) 2 which represents the network entropy in
certain time. If the distinct packets are increased (ex. m = 4), the entropy is decreased
(H = 0). The number of distinct packets can be more detailed into packet source
address, destination address, source port number, and destination port number.
Generally, the maximum entropy increases with the number of symbols in the
classical entropy. However, the entropy H is decreased at the similar circumstance
(the increased distinct types of packet). We can monitor with H the number of
distinct types and total number of packets which can represent the packet stream
condition in the network.
3.2.2 Fast Entropy Calibration
We can represent the randomness of packet information with the ratio between the
number of distinct packet types and the total number of incoming packets that is H =
log ^ = log However, it may increase the false positives which are very critical
in monitoring systems. For instance, if an attacker increases the number of packets,
he/she increases the number of packet types simultaneously. In that case, the ratio ^
does not change, keeping the same entropy value. Now we introduce the entropy
calibration factor to increase sensitivity on packet number increasing.
Let n, be the total number of packets in monitoring interval t,. We now monitor 2
47


variables which are nh/ and n,. These two values are adjacent values in the monitoring
time series. Define calibration entropy as follows;
r I I- if > /-/
ni
H"= <
ig^~
if ^ < n,.,
If , > rij.i, 0 < < 1, then oo < /og^-1 < 0. In the same fashion, , <
ni n i
will have same range.
We will use the value ratio value between 0 and 1, since this region among
various log base values is very significantly changing. By taking absolute value, we
can monitor the entropy of the changing packet number. Also, we will use log base 2
since significant change of entropy is between 5 and 10 if the network changes
abruptly because the packet number ratio of (0.5, 1] has a similar pattern between 5
and 10. With the value of lower than 0.5 the logarithmic value drops down
significantly, thus reflects remarkable changes in the traffic flow of the network. If
there is no traffic change in terms of packet number, the ratio is 1, then, entropy
change is zero. (logl = 0, no impact into detection facility).
3.2.3 Fast Entropy Estimator
Now, we have
H= -log andH"
^ n
log |, ifn,>n,./
71/
<
log |, if rij < n,.
ni-1
Let x be H.
48


We can write
H = log ~ + x where m is the number distinct packets,
n is total number of packets in an input, and
x is packet number calibration factor (same as H).
3.2.4 Runtime Analysis of Fast Entropy
We can monitor the packets with detailed information (the number of types in
source address, destination address, source port, and destination port). To check the
number of distinct packets regarding to detailed information, we will use a heap
structure which is a binary tree where we need only insert operation into the heap tree.
Whenever packet is inserted, we check the heap whether it already exists in heap or
not. If data exists in the heap, the algorithm does nothing. If there doesnt exist the
data, we insert it into the heap, and increase the counter. Since binary tree has the
depth of log n, the insert operation needs log n time. Also, we monitor the packet
every t time over T time period. If we monitor m packets in t, and the total packets of
over T is n. The number of packets is n = m*.
The runtime of computing the number of distinct types is
IiL[ l9 (mi) = l9 (fnf) +...... + log (mT/t) = T/t login,
where m is average packet number in t.
m can be represented as m = yj- = n *
Let t/j be a, then,
T/t login = Vt ^d{.n x Vt) = alog(n* a1) = a log n-a log a
Therefore, the runtime is O (a log n- a log a).
49


4. Detector Design
To reduce computational time for entropy, we introduce a compression entropy
scheme and a fast entropy computation scheme which uses the number of packet
types and the number of packets. Both of them are similar to the conventional entropy
scheme, but it doesnt need any probability to calculate entropy values in contrast to
the conventional entropy scheme. We reviewed compression entropy and fast entropy
scheme is very fast in theory. We designed the fast entropy scheme. The Fast entropy
computing algorithm takes less than conventional entropy computing. Now, we need
DDoS detector which can acknowledge the anomaly of network condition. We will
use the moving average concept to design basic detector. We designed an adaptive
adapter on top of moving average detector. We verified meaningful range of
threshold from simulation which is between 2a and 6a, where a is standard deviation
of packet information (addresses and port number) during every monitoring interval.
We used the general pattern to design an adaptive detector.
If the difference between the average of previous monitoring interval and new
entropy value is decreased, the detector cant detect the attacks with high threshold
value because of the steady channel condition and stealthy attack pattern. In this case
we need to decrease the threshold value. Meanwhile, if the channel is burst but the
detector has relatively small threshold value, the detector works very sensitively in
this situation. As a result, the detector yields many false negatives, a bad
characteristic of a detector. In that case, threshold should be increased accordingly.
Our adaptive detector shows the result approximating optimal detection performance.
4.1 Detector Design Approach
We monitor entropy values sequentially with a simple moving average of window
50


size k. Each value comes from an entropy calculator with fixed size of window. There
are variants from the simple moving average method. We will use the simple moving
average, since we assume the traffic packet arriving is identically and independently
distributed (i.i.d), memoryless, and a stationary process.
4.2 Monitoring Concept (Moving Average)
Assume we monitor the entropy values for m intervals (i.e., window size of m). If
we have monitoring interval t seconds, we monitor the entropy value for m*t seconds.
In every monitoring interval t, an entropy value is computed.
Figure 4.1 Monitoring Concept with Moving Average
Lets define as follows:
y,: i-th average of Moving Average Window
a: Standard Deviation of Hn.m ~ Hn-i with y,
D{. absolute value of difference between y, and Hn (i.e., D, = \ y, Hn\)
/?: threshold multiplication factor, positive integer value (default y = 3)
(o: threshold (co = fi a)
Once y, is computed, it will be compared with H. To detect a traffic pattern
change, if Di > co, we decide we have an attack in current monitoring interval n.
Otherwise, traffic condition is still normal condition (out of attack). Once a
51


comparison is done, the Moving Average Window will be moving forward along with
time evolution (ju, will start attn.m+i).
4.3 Adaptive Detector
According to our simulation with field data, we recognized that the multiplication
factor, p, needs to be varied according to the packet traffic condition accordingly. On
one hand, if an attacker sends malicious traffic with small change in traffic at the time
the channel is stable, the detector cant detect the attack with high value of ft. Because
of the steady channel condition and stealthy attack pattern, the detection facility
doesnt work properly with highly set /?. On the other hand, if the channel is burst but
the detector has relatively small /?, the detector works very sensitively in this situation.
As a result, the detector yields many false negatives which are not severe but a bad
characteristic of the detector. Additionally, one doesnt need to monitor or keep track
of false negatives due to enhanced detector performance if we once set adaptive
threshold.
The basic idea of an adaptive threshold is that if the new entropy is a relatively
high value compared to an average value, /Uj, with previous n entropies, it has high
probability to be a burst channel. After we see this, we increase threshold
multiplication factor, /?. We define ft be [3, 6] since we observed the fact that if [i is
over 6, the detector couldnt detect almost any anomalies. On the other hand, the
detector is too sensitive to detect an attack precisely producing many false negatives
if P is (0, 2]. The P will be changed under the following rules:
If Hn > 1.5* jUj, then increase P by 1
If 0.5 pi, £ Hn < 1.5 then maintain current P
If If <0.5 Hi, then decrease P by 1
52


The adaptive detection algorithm will work as follows;
Figure 4.2 Adaptive Detector Algorithm Flow
4.4 Dynamic Window Design
We designed an adaptive detector. The adaptive detector works according to the
channel burst condition. If a channel is burst, the multiplication factor of the detection
threshold is increased by 1. If a channel is steady, the multiplication factor of the
detection threshold is decreased by 1. Otherwise, multiplication factor stays with
53


same value. We want to enhance the detection accuracy and reduce detection errors.
An idea of detector to reduce detection errors is to make the size of monitoring
window be dynamic (variable window size). We designed dynamic window and
simulated DoS/DDoS attack detection with our adaptive detector.
4.4.1 Definition
We define Dynamic Moving Average Window (DMAW) as a variable Moving
Average Window (MAW) to monitor effectively network conditions in DoS/DDoS
detector. The size of DMAW varies according to the channel condition. There may
exist several ways to update DMAW size. DMAW has its functions on top of adaptive
detector that updates threshold values according to network condition described in 4.3.
4.4.2 Simple Dynamic Moving Average Window (SDMAW)
Initially, the concept of Moving Average Window designed to move forward
whenever it compares a new entropy value without changing its size. Now we adopt a
Simple Dynamic Moving Average Window (SDMAW). Once the attack is detected,
the channel is clearly burst. In that case, we need to make the detector insensitive. If
the channel stays normal condition several detection periods, the channel is steady
that means we need to reduce the SDMAW to make it more sensitive for next
detection.
SDMAW works as follows;
The detector has one Moving Average Window (MAW).
Four detection criteria (source IP address, destination IP address, source port,
and destination port entropy) share this MAW size.
Whenever an attack occur, the size of SDMAW increased by 1
If a channel shows normal condition 5 times in a row, the size of SDMAW is
decreased by 1.
54


4.4.3 Multiple Dynamic Moving Average Window (MDMAW) with TS
The Multiple Dynamic Moving Average Window (MDMAW) with Threshold
Shift (TS) has four MAWs (source IP address, destination IP address, source port, and
destination port MAW, respectively). The detector manages the size of each
MDMAW size. The size of individual MDMAW changes its value whenever a
threshold value changes. An evidence of channel condition should be a threshold
value shift since the multiplication factor, /?, is updated the packet traffic condition
accordingly shown as in 4.3. A positive shift of /? shows high probability of burst
condition, then increase the size of MDMAW to monitor channel efficiently. If there
is a negative shift of /?, decrease the size of MDMAW.
MDMAW works as follows;
The detector has four MAWs to monitor the channel condition of source IP
address, destination IP address, source port, and destination port entropy.
If /? has a positive shift, the size of MDMAW is increased by 1.
If /? has a negative shift, the size of MDMAW is decreased by 1.
If /? has no shift, the size of MDMAW is maintained with same size.
4.4.4 MDMAW with AO
MDMAW with Attack Occurrence (AO) has four MAWs (same as MDMAW with
TS). MDMAW with AO works as a same fashion of SDMAW. However, each
MDMAW is updated separately.
55


5. Simulation & Analysis
We did a simulation with 4 datasets, which are normal dataset (University of
Colorado Denver BSS Computer Lab Traffic), and 3 different DoS/DDoS datasets
(1999 DARPA and 2000 DARPA dataset). We mixed or interleaved DoS/DDoS
datasets with the normal dataset. We will explain input data sets in detail in the
following chapter. We will show the result and runtime analysis and accuracy
performance with our simulation.
5.1 Input Data
To acquire normal data flow, we gathered traffic flow from the University
computer lab which was the Behavioral and Social Science computer lab (BSS lab)
filled with normal computer users (students). We extracted DoS and DDoS data set
from DARPA sets in MIT Lincoln Labs web site.
5.1.1 Normal Data Flow
We assume the BSS computer lab dataset we gathered represents normal traffic
flow. We sniffed packets on March 2, 2009 during the busiest hour between 11:00
AM and 12:00 noon.
5.1.2 DoS and Port Sweep Attack (1999 DARPA Dataset)
We mixed two network flows together. One is for normal network traffic gathered
from University of Colorado Denver Behavioral and Social Science computer lab
(BSS lab) on March 2, 2009. The other is for DoS attack traffic extracted from
DARPA data set on April 5 1999 (99 DARPA). We have precisely separated DoS and
Probe attack packets from 99 DARPA, which are 5 DoS attacks and 2 stealthy
56


Probing attacks. We interleaved 99 DARPA attacks in BSS lab traffic. Once normal
traffic (BSS lab) gets started, DoSl attack among 99 DARPA attacks is launched in 5
minutes. After that, attacks are launched in every 4 minutes in order of Stealthy
Probingl, Stealthy Probing2, DoS2, DoS3, DoS4, and DoS5 attack. Note that DoS
differs from DDoS in terms of traffic number scale and attack phases. We did
simulation with those data sets to verify that our new fast entropy works in general
DoS traffic or not.
5.1.3 Typical DDoS (Non-Stealthy 2000 DARPA Dataset)
2000 DARPA Data Set (00 DARPA) is a typical dataset of DDoS attack traffic
since it shows significant packet increases and entropy changes at the starting point of
the DDoS attack. We can categorize the five attack phases in 00 DARPA dataset:
Phase 1: IP sweep to the DMZ hosts from a remote site.
Phase 2: Probe of live IPs to look for the sadmind daemon
Phase 3: Breaks-in via the sadmind vulnerability,
both successful and unsuccessful on those hosts.
Phase 4: Installation of the Trojan mstream DDoS software
Phase 5: Launching the DDoS.
We assume phase 1, 3, and 5 are DDoS attacks. Phase 2 is not considered a
significant attack phase since it sends a small amount of packets in phase 2. Therefore,
detecting phase 1 is sufficient to diagnose the change of network condition. Phase 3
and 4 are regarded as the same attack phase in the detection facility since their traffic
pattern is very similar to one another. We will break phase 5 into two phases, phase5
and peak attack. Phase 5 means the beginning of the original phase 5 attack. Peak
attack has the most significant packet number changes during phase 5 attack period.
57


The actual DDoS attack is phase 5. It is sufficient for the entropy detector to
diagnose abnormal conditions in both phase 5 and peak attack. Any detector must
make sure to detect phase 5 attacks among all 5 phases. For example, if a detector
monitors any kinds of abnormal conditions from phase 1~4, but it doesnt detect
phase 5, the detector doesnt work for network monitoring at all. We will focus on the
detection ability on phase 5 and peak attack. We will regard detection to be a success
if it verifies phase 5 or peak attack.
5.1.4 Stealthy DDoS (Stealthy 2000 DARPA Data Set)
2000 DARPA Stealthy Data Set (00 stealthy DARPA) is stealthier than the 5.1.2
dataset but has 5 attack phases, the same as the 5.1.3 dataset. Attack scenario goes so
that attacker probes over the network (phase 1 and 2), penetrates into a host by
exploiting its system vulnerability called sadmind in Solaris (phase3). installs DDoS
software (phase4), and launches DDoS attack on compromised hosts.
Table 5.1 Comparison Normal and Stealthy DDoS Datasets
Unit: The b umber of Packets / min:sec
Phase 00 DARPA 00 stealthy DARPA
1 785 / 00:25sec 4 / 00:00sec
2 148 / 10:58sec 6 / 00:02sec
530/01:51 sec 70 / 00:00sec
4 526 / 00:52sec 76 / 00:47sec
5 34553 / 15:04sec 707/08:08sec
We set up only BSS lab traffic at the beginning of 30 minutes, after that we laid
mixed data set, which was interleaved with BSS lab data and 00 stealthy DARPA
dataset for 30 minutes. This formation gives us clear separation between a normal and
abnormal boundary for entropy detector evaluation. The significance is that 00
58


stealthy DARPA has the small number of packets during attack period even though it
has same 5 phases.
Since the amount of packets is not significant, it is very difficult for the detector
to detect every phase precisely. Our objective with 00 stealthy DARPA set is to detect
the main DDoS attack phase (phase 5). Also, we consider that detecting phase 5 is a
better detector rather than detecting any other phases. For example, if a detector
senses phase 1 through 4 anomalies but doesn't detect phase 5. it considered to have
very low performance since it missed the main DDoS attack point.
5.2 Simulation Procedure
We firstly tested datasets without adaptive function on a detector to estimate the
efficient and meaningful range of fi so that we set /? range for adaptive detector later.
Once we get the meaningful range of /?, we tested all dataset again with adaptive
detector. We presented the number of packets at all detection points, detection
accuracy tables (conventional, combination, and our fast entropy detector), and the
adaptive detector and dynamic window performance table with 99 DARPA, 00
DARPA, and 00 stealthy DARPA datasets. Finally, we simulate the detector with the
dynamic moving average window (DMAW)
5.3 Detection Assumption
We assume that if one of the monitoring facilities (among source address,
destination address, source port, and destination port) detects attack, it will be
regarded as a DDoS detection success. Also, we think that the more attack detections
in attack phases mean the better detector. Note that if the attack occurs in t we
consider the detection occurrences on t,./ and f+/ are the detection successes as
adjacent hits.
59


5.4 Accuracy Analysis
We presented the result of three different attack patterns. The accuracy analysis
was done in every attack pattern.
5.4.1 DoS and Port Sweep Attack Result (99 DARPA)
5.4.1.1 Packet Distribution
600
Figure 5.1 Packet Flows at Detection Points (99 DARPA)
The DoS attacks are clearly represented by increasing packet number. However,
the two stealthy Probe Sweep attacks do not show the changes of packet distribution.
5.4.1.2 Entropy Distribution
* r co o> lo t r- co s to r co cn co r co > lo r co cx> to r co
i * oj co co -o- eo to co r r oo o-> cr> cz> o cnj cm co co in m co
Figure 5.2 Conventional Entropy Distribution with 99 DARPA
60


Figure 5.3 Compression Entropy Distribution with 99 DARPA
Figure 5.4 Fast Entropy Distribution with 99 DARPA
The entropy distribution shows network condition changes visually in the above
graphs. However, the compression entropy scheme is too sensitive to monitor small
changes such as DoS 1, 2, and 3 attacks. This may increase false negatives if we set
the detector to be sensitive. On the other hand, if we make the detector insensitive to
reduce false negatives, it may increase false positives, which are very critical to
detection facility.
61


5.4.1.3 Detection Accuracy (7 Attacks)
Table 5.2 99 DARPA Dataset Detection Accuracy
Entropy Error Type Threshold
2a 3a 4a 5a 6a
Conventional False Positive 3 5 6 7 7
False Negative 19 7 1 0 0
Compression False Positive 1 3 3 3 4
False Negative 7 3 3 3 1
Fast False Positive 1 2 4 5 6
False Negative 8 0 0 0 0
In either aspect (either false positive or false negative), our fast entropy scheme
has the best performance under range of 2a ~ 4a, if we consider two stealthy probing
attacks, which are very difficult to detect from normal traffic with entropy approaches.
The conventional entropy scheme has the worst performance; it never works properly
since it is suffering from both false negatives and false positives all over the range of
P (3a ~6(j).
5.4.2 Typical DDoS Result (00 DARPA)
5.4.2.1 Packet Distribution
The packet flow shows relatively clear traffic pattern changes. With the packet
number, phase 1 and peak attack of phase 5 is obvious. However, the phase 2 doesnt
appear in the packet flow graph because it uses just a few packets rather than a
normal traffic number. Phase 3, 4, and 5 have similar packet loads. A detector will be
confused only by the number of packets. If there exists information shift
(source/destination address, source/destination port), the entropy detector can detect
any anomalies or intrusions.
62


Number of Packets
rN(OOW)rNCOO)rSrtO)l/)rSMOMOrS(*:flJinrNrtO)inrSrt01inrNMOllflrMOfl)
rrMfOfO^^lfllD^SSCOaiOOOrOJWCOfOt^lfllOlONOOXIfflOOrrMCNMttlfilO
T-T-T-1-*-T-r-T-T-T-1-T-l-T-T-T-T-(\JCSjOJC\jr'IC\l Cumulative Detection Point
Figure 5.5 Packet Flows at Detection Points (00 DARPA)
5.4.2.2 Entropy Distribution
20
15
10
T-t£T-{0T-<£)T- *-.-CVCvl(OrO^VV>in(0(DNNCOCOa>C7>00
Phasel
Peak attack
Phase3,4
Phase5
L ..
dst_port
srcjDort
*dst_addr
srcjddr
s-ojMcotOTr^iniDcotONscoajCTiOoos-r-cNCNfOfO^srinincD
t-s-T-T-*-s-s-s-r*s-T-s-T-s-T-t-T-C'|(MOJMM(NCM(MCM04OJNC4
Figure 5.6 Conventional Entropy Distribution with 00 DARPA
400
300
200
100
0
Figure 5.7. Compression Entropy Distribution with 00 DARPA
63


50
dstjjort
- src_port
dst_addr
src_addr
r-T-(MCNnn^^Lnin(DNNC00010)00^-NN(n<')^^lflUMDtDNNacCO)0>OOr-T-(N(NMWV^lf)lftlO
Figure 5.8. Fast Entropy Distribution with 00 DARPA
Entropy distribution is quite clear. We can determine the phase 1 and peak attack.
But it is difficult to identify phase 3, 4, and 5 attack. The entropy distribution graph
shows similar pattern of packet distribution.
5.4.2.3 Detection Accuracy (1 DDoS Attack, 4 Attack Points)
Our fast entropy approach yields powerful result in 3a~5a, since it has both low
false positives and relatively low false negatives. Only in 2cr, the detector yielded
many false negatives. This is because the calibration factor causes the detector to
fluctuate with small changes of packet numbers.

Table 5.3 00 DARPA Typical DDoS Dataset Detection Accuracy
Entropy Error Type Threshold
2 a 3a 4a 5a 6a
Conventional False Positive 3 2 4 4 4
False Negative 2 2 1 1 1
Compression False Positive 3 3 3 3 3
False Negative 9 7 6 6 5
Fast False Positive 1 1 2 2 3
False Negative 18 6 3 2 1
64


5.4.3 Stealthy DDoS Result (00 Stealthy DARPA)
5.4.3.1 Packet Distribution
150
Figure 5.9 Packet Flows at Detection Point (00 Stealthy DARPA)
A significant number of packet changes does not exist in the graph. Phase 4 and 5
has a little packet number increase. However, it is not still a significant number. Again,
there is no significant packet pattern change during the stealthy attack except phase 4
and 5.
5.4.3.2 Entropy Distribution
15
Phasel Phase2 Phase3 Phase4,
10
r(OU)NOMCnOMr03inN(DnONnra)lf)Nll)nOS^f(OinNffllOnOSVr(DinN()HOnO
rC\|CNn^inin0)Oi-N0-i-C'JCOC')^lOlOOOi-CN<0
Figure 5.10 Conventional Entropy Distribution with 00 Stealthy DARPA
65


200
(N M (SI N CN {
) w eg co n o
IffiOOr Nf)
I N CO CO CO (0 CO
dst_port
-srcjxxt
dstjddr
src_addr
Figure 5.11 Compression Entropy Distribution with Stealthy DARPA
40
30
20
10
0
Phase4,5
s srcjiort
dst^addr
src_addr
dstjxtrt
<-WifiN!JMDnONf#|ONO)(DP)ONtri.cOiflNflUinON5rffl^MO)(0(0(3Nt'-(DinNO)nO
i-CS|fsrO^U,>if>COh'l,,0>Or-MMCOT^iO rrrrrrrrrrrrrrCJNMlNNNNNtSOinOINNOnncOn
Figure 5.12 Compression Entropy Distribution with 00 Stealthy DARPA
As we discussed, entropy distribution doesnt have significant entropy changes in
the Conventional Entropy Scheme. The Compression Entropy Scheme represents the
network flow changes since it is very sensitive to the changes of the network channel,
which was the main fault in typical DoS/DDoS detector. However, our Fast Entropy
Scheme shows us the main change in phase5 (DDoS launching stage) which makes a
detector capable of detecting an anomaly. Meanwhile, the Conventional Entropy
Scheme does not have any pattern change, which implies that it has poor performance
under stealthy DDoS attack.
66


5.4.3.3 Detection Accuracy (1 Stealthy DDoS, 4 Attack Points)
Phase 4 and 5 are considered as one attack point since those two occurred at
almost same time. Note that phase 1~3 is a stealthy attack, which is very difficult to
identify with only address and port information. Again, our objective is to detect
phase 5 attack (main DDoS launching phase).
Table 5.4 00 DARPA Stealthy DDoS Dataset Detection Accuracy
Entropy Error Type Threshold
2ct 3o 4a 5a 6a
Conventional False Positive 3 4 4 4 4
False Negative 45 8 0 0 0
Compression False Positive 2 2 2 2 2
False Negative 27 17 12 11 5
Fast False Positive 1 1 2 2 2
False Negative 26 6 1 0 0
The Conventional Entropy Scheme doesnt work over the entire range. It couldnt
detect any attack phases. It yielded four false positives; that means that the detector
detected nothing out of four attack phases. Even though we grant the false negatives
since it is hard to detect stealthy attack phases, the detector should detect phase 4&5
attacks because it is relatively clear in packet distribution. Our detector produces
some false positives. However, the number of false positives is less than the
Conventional or Compression Entropy Scheme. Also, it has fewer false negatives
than the Compression Entropy Scheme.
5.5 Adaptive Detector Simulation
By doing manual experiments, we have learned that our Fast Entropy Scheme has
better accuracy in DoS, DDoS or Stealthy DDoS attack detection. Also, we noticed
67


that the range of [3 has a meaning between 2 and 6, which is 2a ~ 6a. We dont know
what the precise value of P is yet, but to test all possible P values is not realistic. We
introduced an adaptive scheme in section 6.3 Adaptive Detector Design. The
following table is the simulation result of 6.3.
Table 5.5 Performance Comparison: Adaptive Detector vs Manual Threshold Setting
False Positives / False Negatives
Entropy Detector Type Input Dataset
99 DARPA 00 DARPA 00 Stealthy DARPA
Conventional *Best case 3/19 1/2 3/45
** Adaptive 4/5 2/8 5/4
Compression Best case 1/7 1/5 2/12
Adaptive 1/5 3/13 2/16
Fast Best case 1/8 1/6 1/6
Adaptive 1/2 2/7 1/9
*Best Case: the best result among all simulation results with 2a ~ 6a. The first
criterion is low false positives but allowing 1 miss. If there isnt 1 miss, we will
choose a minimum of false positives as the best case. If false positives are the
same, having low false negatives is the best case.
** Adaptive: the algorithm changes the threshold value with observing network
flow data accordingly. It starts with /? = 3. Any /? among source address,
destination address, source port, and destination port will be changed after
evaluation at every monitoring interval (see 3.3 for details).
The Adaptive Entropy Detector approximates the best case, which is good
detection performance (characteristic) without any pattern training or history records.
68


The Adaptive Entropy Detector shows good adaptability with Fast Entropy Scheme.
On the other hand, it makes detection performance decline with the Conventional
Entropy Scheme. This adaptive scheme is suitable for general DoS attack and
Stealthy attack. However, it still has good performance in general DDoS attack. If we
adapt adaptive scheme with fast entropy scheme, we can get more stable.
5.6 DMAW Simulation
Adaptive Detector without DMAW showed good detection performance without
pattern training or history record compared to the Best Case of manual simulation of
each threshold multiplication factor, /? (see 5.5). Although the Adaptive Detector
approximates the Best Case in terms of false positives, it yields a few more false
negatives than the Best Case.
Table 5.6 Performance Comparison: Adaptive Detector vs DMAW
False Positives / False Negatives
Entropy Detector Type nput Dataset
99 DARPA 00 DARPA 00 Stealthy DARPA
Conventional Best Case 3/19 1/2 3/45
Adaptive withoiit DMAW * 2/& 5/4i.-
SDMAW 6/2 2/5 5/0
MDMAW with TS 6/1 2/6 5/1
MDMAW with AO 5/1 2/5 5/2
Compression Best Case 1/7 1/5 2/12
Adaptive^tbdutllJVlAW .: . - 3/ii^
SDMAW 3/5 3/17 2/15
MDMAW with TS 2/5 3/18 2/14
MDMAW with AO 3/3 3/12 2/14
Fast Best Case 1/8 1/6 1/6
AdaptivefmffihutDMAW -71/2'" '1/9/i
SDMAW 3/1 2/6 2/6
MDMAW with TS 3/1 3/5 1/5
MDMAW with AO 2/0 2/6 1/6
69


When we adopted DMAW scheme on top of Adaptive detector, it shows an
enhancement on reducing false negatives all over the entropy schemes compared to
Adaptive Detector without DMAW. Since Compression Entropy Scheme is very
sensitive, there is little contribution to reduce false negatives with DMAW in
Compression Entropy Scheme. DMAWs can reduce false negatives with 99 DARPA
dataset (five DoS attacks and two stealthy probing attacks), but produces a few more
false positives with 99 DARPA dataset. However, if we consider 99 DARPA dataset
contains two stealthy attacks, the detection performance is still tolerable. Note that
detectors on DMAW schemes remain in bad detection accuracy with Conventional
Entropy Scheme which is the same as Adaptive detector even thought we use
dynamic window scheme. MDMAW shows not only the best detection performance
among three DMAW approaches but also the best fit with our Fast Entropy in false
positives and negatives. We can enhance detection accuracy against the false positives.
With the DMAW approach, we can improve the detection performance of Adaptive
Detector by reducing false negatives.
5.7 Runtime Analysis
We did a simulation on a personal laptop (CPU: Intel Celeron 2.20 GHz, Memory
1.0 GB). Experiments were performed 10 times with datasets. Experimental Data was
from March 17, 2009. We averaged the runtime and summarized the result in Table
5.7. Compression entropy and Fast entropy can reduce the computation time more
than 90% by using a heap structure and indexing table. Also, the Fast Entropy
Scheme can reduce the computation speed about 12% compared to Compression
Entropy Scheme. However, as we discussed in previous paragraphs, the Compression
Entropy Scheme has high false negatives over the all thresholds, which means it
doesnt work well to the network monitoring algorithm with information entropy,
even though it has fast data compression ability.
70


Table 5.7 Runtime Results without Dynamic Window
Unit: Millisecond
Attack Patterns Conventional Entropy Compression Entropy Fast Entropy
aR/time R/time compare to bconv R/time compare to conv compare to ccomp
99 DARPA 8,984 781 -91% 673 -93% -14%
00 DARPA 20,008 1,883 -91% 1,665 -92% -12%
00 DARPA (Stealthy DARPA) 16,213 1,550 -90% 1,398 -91% -10%
Average Speed Reduce -91% -92% -12%
aR/time: Run Time over a detection process
bconv: the runtime of the Conventional Entropy
ccomp: the runtime of the Conventional Entropy
We estimate the runtime of detectors with dynamic window with the same input
datasets with three different DMAWs. We present the simulation results in Table 5.8
because the MDMAW with AO has the best detection performance (see 5.6)
Table 5.8 Runtime Results with MDMAW with AO
Unit: Millisecond
Attack Patterns Conventional Entropy Compression Entropy Fast Entropy
aR/time R/time compare to bconv R/time compare to conv compare to ccomp
99 DARPA 10,218 1,032 -90% 838 -92% -19%
00 DARPA 21,758 2,158 -90% 1,917 -91% -11%
00 DARPA (Stealthy DARPA) 17,764 2,006 -89% 1,745 -90% -13%
Average Speed Reduce -90% -91% -14%
71


Table 5.8 shows that Compression and Fast Entropy Scheme have approximately
the same performance in terms of computation time (91% reduce compared to
Conventional Entropy scheme, 14% reduce compared to Compression Entropy
Scheme). We depict the runtime distribution of Adaptive MDMAW detector in
Figure 5.13.
25,000 , - -...........-............ ......
Fast
Compression
Conventional
Figure 5.13 Runtime Distribution
As shown in Figure 5.13, MDMAW has a little overhead compared to Adaptive
detector. If we compare the computational time of Fast Entropy between Adaptive
detector and MDMAW with AO, MDMAW with AO needs approximately 22%
additional computational time (see Table 5.9). The growth of computational time
comes from a burden of managing and updating four dynamic monitoring windows.
20,000
15.000
10.000
5,000

vO

-Jo'
o


o 0 O'
ov vO^ .Ov

72


Table 5.9 Computation Overhead
Millisecond / Runtime Overhead
Attack Patterns Adaptive Detector MDMAW with AO
Conventional Entropy (Runtime in Millisecond) Fast Entropy Fast Entropy
Compared to Adaptive with Conventional Compared to Adaptive with Conventional Compared to Adaptive with Fast
99 DARPA 8,984 / - 673 / -93% 838 /-91% +25%
00 DARPA 20,008 / - 1,665 /-92% 1,917/-90% +15%
00 Stealthy DARPA 16,213/- 1,398/-91% 1,745 /-89% +25%
Average Speed Overhead -92% -90% +22%
However, we can also see that the result of speed overhead is almost similar when
we consider the runtime reduce rate between adaptive detector and MDMAW with
AO in case of using Fast Entropy. Even though the MDMAW scheme increases the
computational time, it still can reduce the total computational time significantly about
90% with Fast Entropy Scheme compared to Conventional Entropy Scheme.
5.8 Simulation Analysis
Our initial goal was to reduce entropy computation time and maintain detection
accuracy. We tried the Compression Entropy Concept at the first time, even though it
yields relatively high false negatives. The Compression Entropy Scheme can be an
alternative in a fast IP network. We introduced a new detection entropy concept not
using probabilities to compute entropy but using entropy of different types and
number of packets. Our new fast entropy concept works well, better than the
Conventional Entropy Scheme, and it was the fastest one: reducing computation time
more than 90%, compared to the Conventional Entropy Approach. Moreover, our fast
entropy has better attack detection accuracy.
73


We simulated various threshold values to determine where we can get the best
threshold. We found the best threshold varied between 2a ~ 6a. We developed an
adaptive entropy detector, which changes thresholds (source/destination address,
source /destination port threshold) at the end of every monitoring interval based on
the idea of 3.3. Our adaptive entropy detector works well with compression and Fast
Entropy Scheme which approximates to optimal case (the best detection setting). We
find the MDMAW with AO can reduce false negatives over three entropy schemes.
DMAW increase a little computational time because it needs to manage and update
four dynamic windows. However, it still very efficiently computes entropies, which
allows us to reduce the computational time approximately 90% if we use Fast
Entropy Scheme.
74


6. Conclusion
DDoS attack is becoming a more powerful tool to network attackers due to the
easy of acquisition of the tools from online and hideaway characteristic from
defenders with spoofed packet header information. Meanwhile, it is a main burden for
network administrators to detect DDoS attacks from normal user accesses since
attackers send a bunch of normal packets to a victim. Once a DDoS attack has started,
it can initiate not only economic damage to a commercial company but also public
calamity when it saturates the public severs capacity of a government system.
Due to this importance of a DDoS attack, various network anomaly detection
schemes were researched such as Signature-Based approach, Statistical approach, and
Entropy-Based approach. Researches show that the Entropy-Based approach is
regarded as one of the most efficient approaches because it can represent network
conditions more than other approaches. Although the entropy-based network
detection scheme is the efficient way of detecting network intrusion or anomaly, there
is still computational burden in a high speed link.
With the objective of reducing computational power, we introduced Compression
Entropy which doesnt need probability computing unlike Conventional Entropy.
Compression Entropy gave us the possibility to great reduce computational power.
However, it is too sensitive to detect intrusion effectively. As a result of Compression
Entropy, it produces many false negatives during monitoring time on the network. We
designed Fast Entropy which uses the number of packet types and the number of
packets based on the idea that DDoS attacks rely mainly not on packet types in
Conventional Entropy but both the packet types and traffic volume (the number of
packets). We found that our Fast Entropy Scheme has better performance in terms of
speed and accuracy than Conventional Entropy based network detection.
75


Also, we designed an adaptive detector which doesnt need to set up threshold
value. Based on manual simulation with various threshold values, we extracted a
meaningful threshold range. Once we set up the threshold range in our detector, it can
detect network intrusion or anomaly approximating the best performance among
manual simulation with various thresholds. Additionally, we showed in this thesis
that our Fast Entropy Scheme worked well with our adaptive detector. With the
change of thought about conventional entropy concept, our fast entropy could reduce
almost 90% of computation time comparing conventional entropy scheme with
adaptive detector. Also, it has showed better performance in detection performance
with adaptive detector.
Lastly, we designed DMAW schemes to make our detector run with higher
accuracy. We found that DMAW can reasonably reduce the false negatives. The
MDMAW with AO scheme outperforms in terms of reducing false negatives while
maintaining low false positives. MDMAW with AO showed the best fit with our Fast
Entropy scheme.
76


7. Future Work
In this thesis, we proposed a new concept of information entropy suitable for
network intrusion detection. The simulation result showed the better performance in
computational power and detecting performance against network intrusion with
limited simulation input. Simulation input was acquired from DARPA data sets which
are DoS, Probing, Normal DDoS, and Stealthy DDoS attack. To set up the normal
network traffic, we gathered a network packet from the University computer lab (BSS
lab). A network traffic set from a University lab may not represent typical normal
traffic. Also, if we use other attack data sets differing from DARPA datasets, the
simulation result may differ from those documents is. By having broad and intensive
input datasets, we can modify or adjust the fast entropy estimator for more accuracy.
We simulated entropy-based detector with moving average detector with adaptive
fashion. In the detection algorithm, it is important for network intrusion system to
estimate the quantity estimator to evaluate network condition as well as to design an
efficient detector. We focused on speed-up and high detecting-accuracy via a good
entropy estimator. However, designing a good detector is another key design field in
detection system.
77


Appendix A
DDoS Detection Source Code
The source code for DDoS detection follows. Note that the Fast Entropy
Computing is included in main function because of its simplicity. Except main
function, all other entropy computing functions are written in header files. In
appendix A, we will show the main function, then, other main functions are
represented in followed appendixes via Appendix B ~ E. Our codes are tested in
following environment.
Development Tool: Microsoft Visual Studio 2008
Development Language: C
Operating System: Microsoft Window XP Professional
Hardware
- CPU: Intel Celeron CPU 2.20GHz
Memory: 1 GB RAM
/

* Program (main function) Start Here
// ddos cpp : Defines the entry point for the console application.
// Ali Required Headers
#include "stdafx.h"
#include
include
#include
#include
#include "datajype.h"
#include "get_pktJnfo.h"
78


#include "sleep h"
#lnclude "pkt_gather.h
#include "log_base.h"
#include "conv_entropy_calculator.h"
#include "heap.h"
#lnclude "compression_gather. h"
#include "combi_entropy_calculator.h"
#include "hash h"
#include "detector.h"
#pragma warning(disable: 4996) //We will use old file open function. Ignore this warning
int _tmain(int argc, _TCHAR* argv[])
{
pkt_info packet;
char ch a';
int index;
int repo_pkt_count=0,
pktjnfo *repository=NULL;
double src_addr_entropy, dst_addr_entropy, src_port_entropy, dst_port_entropy;
unsigned int previous_pkt_time=0;
unsigned int delay;
HeapType h_src_addr, h_dst_addr, h_src_port, h_dst_port;
//Heap for Compression and Fast entropy Computing
hashjnfo info;
time_t start, finish, execution_time;
int previous_pkt_number =0;
int total_count = 0;
// Conventional Entropy Variable
repository = (pktjnfo*) calloc(MAX_PKT_MONITORING, sizeof(pktjnfo));
// Init Compression Entropy Variable
init(&h_src_addr); init(&h_dst_addr);
init(&h_src_port); init(&h_dst_port);
/* File Open 7
//FILE* bss = fopen ("bss+darpa.txt", "rt");
FILE* bss = fopen ("090302_bss_30min+99_darpa_DoSs_Only.txt", "rt");
//FILE* bss = fopen ("bss, then ddos.txt", "rt");
//FILE* bss = fopen ("random_normal, then ddos.txt", "rt");
FILE* result = fopen ("result.txt", "wt");
// Print Menu on Console
printf("============= Entropy Estimator =============\n");
printf( 1. Conventional Entropy Estimator \n");
printff 2. Combination Entropy Estimator \n");
printf( 3. Fast Entropy Estimator \n");
printf(" Your Choice ? ");
scanf("%d", & index);
// Init packet variable
packet.arrjime = 0;
packet. dst_addr = 0;
packet.dst_port = 0;
packet.proto = 0;
packet.src_addr = 0;
packet. src_port =1;
// Start Entropy Processing
start = clock ();
while (ch != EOF)
79


packet = get_pkt_info ( bss, packet); // Get a packet info from file
delay = packet.arrjime previous_pkt_time;
//sleep(delay): // Wait for packet delay
previous_pkt_time = packet.arrjime;
switch (index)
{
This Section is for Conventions Entropy Computating and Detection *
case CONV_ENTROPY:
if (interval_sample (packet, repository, delay) == true)
// Gather pkt during SAMPLINGJNTERVAL
{
// Find the number of packets
repo_pkt_count = 0; // Init Counter
for(int i=0 ; repository[i].dst_addr > 0 ; i++)
{ repo_pkt_count++; }
src_addr_entropy
= conv_entropy_calculator(repository, repo_pkt_count, SRC_ADDR);
dst_addr_entropy
= conv_entropy_calculator(repository, repo_pkt_count, DST_ADDR);
src_port_entropy
= conv_entropy_calculator (repository, repo_pkt_count, SRC_PORT);
dst_port_entropy
= conv_entropy_calculator(repository, repo_pkt_count, DST_PORT);
// Print Results in Console and file
printf("src_addr: %3.3f, dst_addr: %3.3f, src_port: %3.3f, dst_port: %3.3f, pkts: %d\n", src_addr_entropy,
dst_addr_entropy, src_port_entropy, dst_port_entropy, repo_pkt_count);
fprintf(result, "%1 3f\t%1 3f\t%1.3f\t%1 3f\t%dVt", src_addr_entropy, dst_addr_entropy, src_port_entropy,
dst_port_entropy, repo_pkt_count);
// Check Attack Existance
detector(total_count, src_addr_entropy, dst_addr_entropy, src_port_entropy,
dst_port entropy, result);
}
break;

* This Section is for Combination Compression Entropy Computating and Detection *
case COMBL.ENTROPY:
if (compression_gather (packet, &h src_addr, &h_dst_addr, &h_src_port, &h_dst_port, delay) == true) //
Gather pkt during SAMPLING JNTERVAL
{
repo_pkt_count = h_src_addr.heap_size;
src_addr_entropy = combi_entropy_calculator(&h_src_addr);
dst_addr_entropy = combi_entropy_calculator(&h_dst_addr);
src_port_entropy = combi_entropy_calculator (&h_src_port);
dst_port_entropy = combi_entropy_calculator(&h_dst_port);
// Print Results in Console and file
printf("src_addr: %3.3f, dst_addr. %3.3f, src_port: %3.3f, dst_port: %3.3f, pkts: %d\n", src_addr_entropy,
dst_addr_entropy, src_port_entropy, dst_port_entropy, repo_pkt_count);
fprintf(result, "%1,3f\t%1.3f\t%1.3At%1.3f\t%d\t", src_addr_entropy, dst_addr_entropy, src_port_entropy,
dst_port_entropy, repo_pkt_count);
80


// Check Attack Existence
detector(total_count, src addr entropy, dst_addr_entropy, src_port_entropy, dst_port_entropy, result);
}
break;
' * * * * k 'A* ** ** * A ** * ******* * ** ** ****** ** ** * til* * *** ** ****** ****** *** *
This Section is for Fast Entropy Computating and Detection
***.******************** *************************************************,
case FAST_ENTROPY:
if (hash_entropy (packet, &info, delay) == true)
// Gather pkt during SAMPLINGJNTERVAL
{
repo_pkt_count = info.pkt_counter;
if (previous_pkt_number == 0) // if this minitoring interval is first
{
previous_pkt_number = repo_pkt_count;
src_addr_entropy = log_base( ((double)
info.pkt_counter*info.pkt_counter/(double)info.src_addr_types), 2);
dst_addr_entropy = log_base( ((double)
info.pkt_counter*info.pkt_counter/(double)info.dst_addr_types), 2);
src_port_entropy = log_base( ((double)
info.pkt_counter*info.pkt_counter/(double)info.src_port_types), 2);
dst_port_entropy = log_base( ((double)
info.pkt counter*info.pkt_counter/(double)info.dst_port types ), 2);
}
else
{
if (previous pkt number > repo_pkt_count)
{
double tempi = abs(log_base( ((double) repo_pkt_count /(double)
previous_pkt_number), PKT_NUM_ENTROPY));
src_addr_entropy = log_base( ((double) info.pkt_counter/
(double)info.src_addr_types ), 2) + abs(log_base( ((double) repo_pkt_count
/(double) previous_pkt_number), PKT_NUM_ENTROPY));
dst_addr_entropy = log_base( ((double) info.pkt_counter/
(double)info.dst_addr_types ), 2) + abs(log_base( ((double) repo_pkt_count
/(double) previous_pkt_number), PKT_NUM_ENTROPY)),
src_port_entropy = log_base( ((double) info.pkt_counter/
(double)info.src_port_types ), 2) + abs(log_base( ((double) repo_pkt_count
/(double) previous_pkt_number), PKT_NUM_ENTROPY));
dst_port_entropy = log_base( ((double) info.pkt_counter/
(double)info.dst_port_types ), 2) + abs(log_base( ((double) repo_pkt_count
/(double) previous_pkt number), PKT NUM_ENTROPY));
}
else
{
double temp2 = abs( log_base( ((double) previous_pkt_number / (double)
repo_pkt_count), PKT_NUM_ENTROPY));
src_addr_entropy = log_base( ((double) info.pkt_counter /
(double)info.src_addr_types ), 2) + abs( log_base( ((double) previous pkt number
/ (double) repo_pkt_count), PKT_NUM_ENTROPY));
dst_addr_entropy = log_base( ((double) info.pkt_counter /
(double)info.dst_addr_types ), 2) + abs( log_base( ((double) previous_pkt_number
81


/ (double) repo_pkt_count), PKT_NUM_ENTROPY));
src_port_entropy = log_base( ((double) info.pkt_counter/
(double)info.src_port_types ), 2) + abs( log_base( ((double) previous_pkt number
/ (double) repo_pkt_count), PKT_NUM_ENTROPY));
dst_port_entropy = log_base( ((double) info.pkt_counter /
(double)info.dst_port_types), 2) + abs( log_base( ((double) previous_pkt number
/ (double) repo_pkt_count), PKT_NUM_ENTROPY));
}
previous_pkt_number = repo_pkt_count;
}
// Print Results in Console and file
printf("src_addr: %3.3f, dst_addr: %3.3f, src_port: %3.3f, dst_port: %3.3f, pkts: %d\n", src_addr_entropy,
dst_addr_entropy, src_port_entropy, dst_port_entropy, info.pkt_counter);
fprintf(result, "%3.3f\t%3.3fVt%3.3f\t%3,3f\t%d\t", src_addr_entropy, dst_addr_entropy, src_port_entropy,
dst_port_entropy, info.pkt_counter);
// Check Attack Existance
detector (total count, src_addr_entropy, dst_addr_entropy, src_port_entropy, dst_port_entropy, result);
}
break;
default:
puts("Estimator Selection Error");
break;
}
total_count++; // Count Total number of packet
ch = fgetc(bss);
}
// Program Execution Time
finish = clock();
execution_time = finish start;
printff'Execution Time %d\n, execution_time);
// File Close Error Check
int state = fclose(bss);
instate !=NULL)
puts(file close error");
if(state!=NULL)
puts("file close error");
// Memory Release
free(repository);
putsfEnd of Computation, Bye !!!");
getchar();
return 0;
}
82


Appendix B
Conventional Entropy Calculator Header
typedef struct tab!e_element
{
unsigned int packet;
int counter;
} table_element;
int get probability_info(pkt_info 'repository, table_element* pkt table, int final_data_location, int index)
{
int prob_data_index = 0;
bool data_exist;
unsigned int information;
for(int i=0 ; i {
switch (index) // Extract Required Information from repository
{
case SRC_ADDR:
information = repository[i].src_addr;
break ;
case DST_ADDR :
information = repository[i].dst_addr;
break;
case SRC_PORT :
information = repository(i],src_port;
break;
case DST_PORT:
information = repository[i].dst_port;
break;
}
if (prob_data_index ==0) //Save the first info without conditon
{
pkt_table[prob_data_index]. packet = information ;
pkt_table[prob_data_indexj. counter = 1;
prob_data_index++;
}
else
<
data_exist = false;
for(int j=0; j //search same data in prob_data[]
{
if(pkt tableO].packet == information)
{
(pkt_table[j].counter)++; //increase pkt_counter
data_exist = true;
break;
}
}
if(data_exist == false)
{
pkt_table[prob_data_index].packet = information;
pkt_table[prob_data_index++],counter = 1;
83


}
}
}
return prob datajndex;
}
double compute_entropy (table element *prob_data, int num_of types, int index)
{
double entropy=0;
double probability;
double total_pkt_number=0;
for(int i=0 ; i total_pkt_number += prob_data[i], counter;
for(int j=0 ; j {
probability = (double)(prob_dataO].counter) / total_pkt_number;
entropy += (probability log_base(probability, 2));
}
return entropy;
}
double conv_entropy_calculator(pkt_info 'repository, int final_data_location,int required_entropy_index)
int num_of_types;
double computed_entropy;
int dictionary_index=0; //Array location index
table_element *pkt_table_src_addr = NULL;
table_element *pkt_table_dst_addr = NULL;
table_element *pkt_table_src_port = NULL;
table_element *pkt_table_dst_port = NULL;
pkt_table_src_addr = (table_element*) calloc(MAX_PKT_MONITORING, sizeof(table_element)),
pkt_table_dst_addr = (table_element*) calloc(MAX_PKT_MONITORING, sizeof(table_element));
pkt_table_src_port = (table_element*) calloc(MAX_PKT_MONITORING, sizeof(table_element));
pkt_table_dst_port = (table_element) calloc(MAX_PKT_MONITORING, sizeof(table_element));
/****'"*"* Extract Packet Information from Repository **********/
switch (required_entropy_index)
{
case SRC_ADDR:
num_of_types = get_probability_info(repository, pkt_table_src_addr,
final_data_location,required_entropyJndex);
computed_entropy = compute_entropy(pkt_table_src_addr, num_of_types, SRC_ADDR);
break;
case DST_ADDR:
num_of_types = get_probability_info(repository, pkt_table_dst_addr,
final_data_location,required_entropy_index);
computed_entropy = compute_entropy(pkt_table_dst_addr, num_of_types, DST_ADDR);
break;
case SRC_PORT:
num_of_types = get_probability_info(repository, pkt_table_src_port,
final_dataJocation,required_entropy_index);
computed_entropy = compute_entropy(pkt_table_src_port, num_of_types, SRC_PORT);
break;
case DST_PORT:
num_of_types = get_probability_info(repository, pkt_table_dst_port,
final_data_location,required_entropy_index);
computed_entropy = compute_entropy(pkt_table_dst_port, num_of_types, DST_PORT);
break;
}
84