Citation
Fixed and mobile sensor based generalized additive models for freeway incident detection

Material Information

Title:
Fixed and mobile sensor based generalized additive models for freeway incident detection
Creator:
Thanasupsin, Kittichai
University of Colorado at Denver
Place of Publication:
Denver, CO
Publisher:
University of Colorado Denver
Publication Date:
Language:
English
Physical Description:
186 p. : ;

Subjects

Subjects / Keywords:
Engineering, Civil
Transportation
PQDDCHE
Genre:
Electronic dissertations.
non-fiction ( marcgt )
Electronic dissertations ( lcsh )

Notes

Abstract:
Generalized additive models (GAM) to detect lane-blocking and shoulder incidents are developed based on traffic measures estimated from fixed and mobile sensors. The generalized additive model, a nonparametric model, is a generalization of the generalized linear model, allowing appropriate functional forms of independent variables to be proposed. Generalized additive models allow flexible functions to be fitted and therefore their functional forms are revealed in the parametric estimate of generalized additive models. This capability of GAM serves as a powerful interpretive tool to examine the effect of each traffic measure on the probability of an incident. Fixed sensor based incident detection models are developed for lane-blocking and shoulder incidents on the Interstate 25 freeway in Colorado and the Interstate 880 freeway in California. Separate lane-blocking and shoulder incidents models are also developed for the Interstate 880 freeway to examine the characteristic differences between lane-blocking and shoulder incidents, as they relate to incident detection. Characteristics of incidents, model development, including significant variables selection, and model interpretation are also examined. Based on performance measures including detection rate, false alarm rate and mean time to detect, the nonparametric GAM and the parametric estimate of GAM, with only five variables for lane-blocking incidents and six variables for all incidents, outperform several neural network based models using 16 to 24 variables. In this research, the effect of type and length of freeway segments on model performance is also examined. Mobile sensor, and fixed and mobile sensor based incident detection models are developed for lane-blocking and shoulder incidents on the Interstate 25 freeway. The performance of mobile sensor based model shows the potential use of mobile sensor as an alternative data source.
Using mobile sensor as an additional data source to fixed sensor data helps reduce the false alarm rate of the incident detection model. The performance of the incident detection models developed is unbiasedly validated using bootstrap method. The bootstrap performance examined includes mean detection rate, incident state detection rate, false alarm rate, mean time to detect, and their 95 percent confidence interval. The bootstrap performance may provide a good estimate of model performance in the field.
Thesis:
Thesis (Ph.D.)--University of Colorado at Denver, 2002. Civil engineering
General Note:
Source: Dissertation Abstracts International, Volume: 63-12, Section: B, page: 5990.
General Note:
Director: Sarosh I. Khan.
General Note:
Department of Civil Engineering

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
793534999 ( OCLC )
0493940561 ( ISBN )
ocn793534999

Full Text
FIXED AND MOBILE SENSOR BASED GENERALIZED ADDITIVE MODELS
FOR FREEWAY INCIDENT DETECTION
by
Kittichai Thanasupsin
B.Eng., Khon Kaen University, Thailand, 1992
M.Eng., Asian Institute of Technology, Thailand, 1995
M.S., University of Colorado, 1998
A thesis submitted to the
University of Colorado at Denver
in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
Civil Engineering
2002


© 2002 by Kittichai Thanasupsin
All rights reserved.


This thesis for the Doctor of Philosophy
Degree by
Kittichai Thanasupsin
has been approved
by

Sarosh I. Khan
Bruce N. Janson
James E. Diekmann
Trever Wang

Anne M. Doug!


Thanasupsin, Kittichai (Ph.D., Civil Engineering)
Fixed and Mobile Sensor Based Generalized Additive Models for Freeway Incident
Detection
Thesis directed by Assistant Professor Dr. Sarosh I. Khan
ABSTRACT
Generalized additive models (GAM) to detect lane-blocking and shoulder incidents are
developed based on traffic measures estimated from fixed and mobile sensors. The
generalized additive model, a nonparametric model, is a generalization of the
generalized linear model, allowing appropriate functional forms of independent
variables to be proposed. Generalized additive models allow flexible functions to be
fitted and therefore their functional forms are revealed in the parametric estimate of
generalized additive models. This capability of GAM serves as a powerful interpretive
tool to examine the effect of each traffic measure on the probability of an incident.
Fixed sensor based incident detection models are developed for lane-blocking and
shoulder incidents on the Interstate 25 freeway in Colorado and the Interstate 880
freeway in California. Separate lane-blocking and shoulder incidents models are also
developed for the Interstate 880 freeway to examine the characteristic differences
between lane-blocking and shoulder incidents, as they relate to incident detection.
Characteristics of incidents, model development, including significant variables
selection, and model interpretation are also examined. Based on performance measures
including detection rate, false alarm rate and mean time to detect, the nonparametric
GAM and the parametric estimate of GAM, with only five variables for lane-blocking
incidents and six variables for all incidents, outperform several neural network based
models using 16 to 24 variables. In this research, the effect of type and length of
freeway segments on model performance is also examined.
Mobile sensor, and fixed and mobile sensor based incident detection models are
developed for lane-blocking and shoulder incidents on the Interstate 25 freeway. The
performance of mobile sensor based model shows the potential use of mobile sensor as
an alternative data source. Using mobile sensor as an additional data source to fixed
sensor data helps reduce the false alarm rate of the incident detection model.
The performance of the incident detection models developed is unbiasedly validated
using bootstrap method. The bootstrap performance examined includes mean detection
rate, incident state detection rate, false alarm rate, mean time to detect, and their 95
percent confidence interval. The bootstrap performance may provide a good estimate
of model performance in the field.
This abstract accurately represents the content of the candidate's thesis. I recommend
its publication.
Signed
Sarosh I. Khan


CONTENTS
Figures
Tables
Chapter
1. Introduction
1.1 Background
1.2 Objectives of Study
1.3 Organization of Dissertation
1.4 Significant Contributions of Study
2. Literature Review
2.1 Fixed Sensor-Based Freeway Incident Detection Algorithms
2.2 Mobile Sensor-Based Freeway Incident Detection Algorithms
2.3 Fixed and Mobile Sensor-Based Freeway Incident Detection Algorithms
2.4 Mobile Sensor-Based Surface Street Incident Detection Algorithms
2.5 Synthesis of the Literature
2.5.1 Performance of the Freeway Incident Detection Algorithms
2.5.2 Source of Data for Incident Detection Algorithms
2.5.3 Incident Data Sets
2.5.4 Characteristics of the Freeway Incident Detection Algorithms
3. Data Collection
3.1 Data from Colorado
3.1.1 Traffic Measures from Fixed Sensors
3.1.2 Traffic Measures from Mobile Sensors
3.1.3 Incident Data
3.2 Data from California
3.2.1 Traffic Measures from Fixed Sensors
3.2.2 Incident Data
4. Characteristics of Lane-blocking and Shoulder Incidents
4.1 Characteristics of Incidents
4.2 Characteristics of Incidents on the I-880 Freeway
4.3 Characteristics of Incidents on the I-25 Freeway
4.3.1 Incident Rate
4.3.2 Incident Duration
4.3.3 Average Delay
5. Methodology
5.1 Generalized Additive Model (GAM)
5.1.1 Spline Smoothing Function
5.1.2 Fitting Generalized Additive Models
5.2 .632 Bootstrap Method
5.3 Model Evaluation
6. Development of Generalized Additive Model for Freeway Incident Detection
6.1 Selection of Independent Variables
6.1.1 Significant Independent Variables for Fixed Sensor Based Incident Detection Models
6.1.2 Significant Independent Variables for Mobile Sensor Based Incident Detection Model
6.1.3 Significant Independent Variables for Fixed and Mobile Sensor Based Incident Detection Model
6.2 Parametric Estimate of Fixed Sensor Based Generalized Additive Model for Incident Detection
6.2.1 Partial Prediction
6.2.2 Generalized Additive Model (GAM) Parametric Estimate for Fixed Sensor Based Incident Detection Model
6.3 Model Interpretation
6.3.1 Model Interpretation - Lane-blocking Incident Detection Model
6.3.2 Comparison of Model Structures and their Implications
7. Performance of Fixed Sensor, Mobile Sensor, and Fixed and Mobile Sensor Based Incident Detection Models
7.1 Fixed Sensor Based Incident Detection Model
7.1.1 Performance of Fixed Sensor Based Incident Detection Model for the I-25 Freeway
7.1.2 Performance of Fixed Sensor Based Incident Detection Model for the I-880 Freeway
7.1.3 Performance of Fixed Sensor Based Incident Detection Model for Lane-blocking Incidents
7.1.4 Performance of Fixed Sensor Based Incident Detection Model for Shoulder Incidents
7.2 Performance of Fixed Sensor Based Incident Detection Model by Segment Length and Segment Type
7.2.1 The I-880 Freeway
7.3 Mobile Sensor Based Incident Detection Model
7.3.1 Performance of Mobile Sensor Based Incident Detection Model
7.4 Fixed and Mobile Sensor Based Incident Detection Model
7.4.1 Performance of Fixed and Mobile Sensor Based Incident Detection Model
8. An Unbiased Validation of Incident Detection Algorithm Performance Using Bootstrap Method
8.1 Bootstrap Performance of Generalized Additive Model for Freeway Incident Detection
8.2 Bootstrap Performance of Parametric Estimate of Generalized Additive Model
8.3 Summary
9. Summary, Conclusions, and Recommendations
9.1 Characteristics of Incidents
9.2 Significant Independent Variables for Incident Detection Models
9.3 Performance of Generalized Additive Model for Incident Detection
9.3.1 Fixed Sensor Based Incident Detection Model
9.3.2 Performance of Fixed Sensor Based Incident Detection Model by Segment Length and Segment Type
9.3.3 Mobile Sensor Based Incident Detection Model
9.3.4 Fixed and Mobile Sensor Based Incident Detection Model
9.4 Unbiased Validation of Model Performance
9.5 Conclusions
9.6 Recommendations
Appendix
A. Sample SAS Scripts
A.1 SAS Script to Fit Generalized Additive Model
A.2 Script to Evaluate Model Performance (DR, ISDR, FAR, and TTD)
References


FIGURES
Figure
2.1. In-lane or lane blocking freeway incident
3.1. Schematic of the test network and detector locations
3.2. Typical detector configuration for ramp metering
3.3. Denver's AVL system
3.4. GIS software used to display and extract bus AVL data on the I-25 freeway northbound
3.5. The PATH project study section
4.1. Types of incidents
4.2. Incident characteristics in Colorado
4.3. Delay due to an incident
6.1. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-25 freeway
6.2. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-880 freeway for all incidents
6.3. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-880 freeway for lane-blocking incidents
6.4. Partial Prediction for UDEVOCC, UOCC, and USPD for all incidents on the I-25 freeway
6.5. Partial Prediction for UDEVOCC, UOCC, DOCC, USPD, DSPD, and OCCDF for all incidents on the I-880 freeway
6.6. Partial prediction for UDEVOCC, UOCC, DOCC, USPD, and DSPD for the lane-blocking incident model for the I-880 freeway
6.7. Partial prediction for UDEVOCC, UOCC, DOCC, USPD, DSPD, and OCCDF for the shoulder incident detection model for the I-880 freeway
6.8. Odds ratio, ψ(xj), for a unit increase of independent variables
6.9. Odds ratio, ψ(xj), for upstream speed
6.10. Relationship between upstream speed (USPD) and upstream occupancy deviation from historical upstream occupancy (UDEVOCC) under (a) non-incident conditions and (b) incident conditions
6.11. Odds ratio ψ(xj) for upstream speed for the lane-blocking incident model and the shoulder incident model
6.12. Speed at different location during incident conditions
6.13. Odds ratio ψ(xj) for downstream speed for the lane-blocking incident model and the shoulder incident model
6.14. Odds ratio ψ(xj) for upstream speed for the I-25 and the I-880 freeways all incidents
7.1. Performance of the generalized additive model and the multilayer feedforward neural network for the I-25 freeway on the test set
7.2. Performance (DR) of ID algorithms for all incidents on the I-880 freeway
7.3. Performance (ISDR) of ID algorithms for all incidents on the I-880 freeway
7.4. Detection rate for lane-blocking incidents on the I-880 freeway (Test set)
7.5. Incident state detection rate for lane-blocking incidents on the I-880 freeway (Test set)
7.6. Detection rate at different interval persistence test for the I-880 shoulder test set
7.7. Incident state detection rate at different interval persistence test for the I-880 shoulder test set
7.8. Detection rate by segment length
7.9. Mean time to detect by segment length
7.10. Detection rate by segment type for short segments
7.11. Mean time to detect by segment type for short segments
7.12. Detection rate by segment type for long segments
7.13. Mean time to detect by segment type for long segments
7.14. Performance of mobile sensor based generalized additive model
7.15. Cumulative average delay of incidents when the first probe vehicle is available
7.16. Performance of mobile sensor based incident detection model by average delay when the first probe vehicle available
7.17. Performance of fixed and mobile sensor based incident detection model
8.1. Bootstrap performance of GAM for incident detection
8.2. Frequency plot of bootstrap performance
8.3. Frequency plot of degree of freedom for UDEVOCC
8.4. Degree of freedom of UDEVOCC with DR and ISDR
8.5. Bootstrap performance by mean time to detect
8.6. Performance of parametric estimate of GAM at zero interval persistence test
8.7. Scatter plot of DR and FAR with histograms at zero interval persistence test
8.8. Scatter plot of ISDR and FAR with histograms at zero interval persistence test


TABLES
Table
3.1. Distance between detector locations
4.1. Incident rate for the I-25 and the I-880 freeway
4.2. Chi-square goodness of fit test for one-way contingency table
4.3. Duration of incident for the I-25 and the I-880 freeway
4.4. Pooled t-test and Kolmogorov-Smirnov test for duration of incidents
4.5. Average delay due to incidents on the I-25 and the I-880 freeway
4.6. Pooled t-test and Kolmogorov-Smirnov test for average delay
6.1. Traffic measures for incident detection
6.2. Univariable model ranked based on Deviance
6.3. Analysis of Deviance for preliminary models for all incidents on the I-880 freeway
6.4. Analysis of Deviance for the generalized additive models
6.5. Analysis of Deviance for preliminary model for mobile sensor based incident detection model
6.6. Analysis of Deviance for mobile sensor based incident detection model
6.7. Analysis of Deviance for fixed and mobile sensor based incident detection model
6.8. Model analysis for parametric estimate of generalized additive model for the all incident for the I-25 freeway
6.9. Model analysis for parametric estimate of generalized additive model for the all incidents model for the I-880 freeway
6.10. Model analysis for parametric estimate of generalized additive model for the lane-blocking incident model for the I-880 freeway
6.11. Model analysis for parametric estimate of generalized additive model for the shoulder incident model for the I-880 freeway
7.1. Performance of the generalized additive model on the test set including lane-blocking and shoulder incidents
7.2. Performance of incident detection models
8.1. Minimum and maximum values of bootstrap performance of GAM
8.2. Bootstrap performance of parametric estimate of GAM with 95% confidence interval


1. Introduction
1.1 Background
Congestion on the freeway is an economic, social and environmental concern. It
causes excessive delay, queue backups, increased fuel consumption, and increased air
pollution. There are two types of freeway congestion: recurring and non-recurring
congestion. Recurring congestion, or predictable congestion, is caused by excessive
demand during the morning and evening peak periods or by a reduction in freeway capacity
due to changes in roadway geometry. Reduction in freeway capacity can be caused by lane
drops, weaving sections, horizontal curvature, and vertical alignment. Non-recurring
congestion, or unpredictable congestion, is caused by an incident. Examples of freeway
incidents include traffic accidents, stalled vehicles, hazardous spills, debris, or any other
unexpected event that disrupts the flow of traffic on the freeway.
Under the national program on Intelligent Transportation Systems (ITS), the principal
thrusts of research are in the area of Advanced Transportation Management Systems
(ATMS) and Advanced Traveler Information Systems (ATIS). Both ATMS and ATIS
focus on monitoring traffic congestion in real-time. A major concern is providing
decision support to effectively detect, verify, and develop response strategies for
incidents that disrupt the flow of traffic. A key element in providing such support is
automating the process of detecting incidents on large area roadway networks. Incident
detection is a major component not only of ATMS but also of ATIS, which relies on it to
provide dynamic route guidance to travelers. The impacts of non-recurring congestion may
also be reduced considerably by reducing the time required to detect and clear an
incident.
The unexpected disruption to traffic flow by an incident causes a reduction of roadway
capacity at an affected location. The reduced capacity may be less than upstream
demand and cause congestion upstream of an incident. An incident may be either a lane-
blocking incident or a shoulder incident (Lindley 1986). Examples of lane-blocking
incidents are accidents and debris on the freeway; examples of shoulder incidents are
stalled vehicles and vehicles with flat tires. Shoulder incidents can also cause a significant
reduction of capacity due to rubbernecking. When an incident detection model is
implemented by a traffic management center (TMC) to detect freeway incidents, the
TMC is interested in detecting both lane-blocking
incidents and shoulder incidents. If shoulder incidents are detected rapidly, a TMC
may respond by sending a service unit out to clear the incidents and assist motorists to
reduce the impact of the incidents. As will be presented in Chapter 4 of this
dissertation, shoulder incidents also cause considerable delay to motorists. Therefore,
an incident detection model that detects both lane-blocking and shoulder incidents that
cause disruption to traffic is of utmost importance. Some incident detection models
reported in the literature have been developed based on lane-blocking incident data
(Abdulhai and Ritchie 1999; Cheu and Ritchie 1995; Jin et al. 2002). An incident
detection model developed based on only lane-blocking incidents may not perform
well in detecting shoulder incidents.
The data source for an incident detection model can be infrastructure based and/or
non-infrastructure based sensors. The infrastructure-based sensors include loop
detectors embedded in pavements or video based detection systems that provide
estimates of traffic flow measures such as flow rate, speed and occupancy (percent
time detectors are occupied by traffic) at regular intervals. Examples of non-
infrastructure based sensors include location tracking systems installed in selected
vehicles in the traffic stream that serve as probe vehicles. Global positioning system based
tracking systems provide vehicle location and speed at regular intervals. Loop
detectors have several disadvantages. They are difficult to install and maintain because
the freeway must be closed, the pavement cut and wired, and control cabinets installed. In
addition, to collect accurate measurements of occupancy, loop detectors
need to be properly tuned, a process that takes a considerable amount of time and effort
(Skabardonis 1995). Finally, the spacing between fixed loop detectors affects the
performance of an incident detection model. For example, a short duration incident
between widely spaced loop detectors either may not be detected
or may take a long time to detect.
Use of probe vehicles as real-time data source has received considerable attention in
the last few years. Buses equipped with automatic vehicle location (AVL) systems are
a potential resource and may be used as mobile sensors or probe vehicles. Cellular
phones reporting location data to a Traffic Management Center (TMC) may also serve
as probes. The probe vehicle data may be used for freeway segments where loop
detectors are not available or used in combination with loop detector data where
available, to improve the performance of an incident detection model.
The performance of an incident detection (ID) model may be evaluated based on its
detection rate (DR), false alarm rate (FAR), and mean time to detect (TTD) incidents.
The performance of an incident detection model should be validated without bias.
The most stringent test is an external validation, in which the model is tested on a
new population. Unbiased internal validation is an alternative test for model validation.
The techniques for obtaining nearly unbiased internal assessments of accuracy or
performance include data-splitting, cross-validation, and bootstrapping. Data-splitting
is the technique most commonly used for incident detection model validation (Abdulhai
1996; Cheu 1994; Dia et al. 1997). For incident detection model development, the field
data set is usually relatively small because the data are difficult to collect and coordinate.
The disadvantage of data-splitting is that it reduces the sample size available for model
development and testing. Another disadvantage is that the model performance relies on
a single data split. Cross-validation is repeated data-splitting. The advantages of cross-
validation over data-splitting are that the training set can be much larger and that
variability is reduced by not relying on a single sample split. Bootstrapping
involves sampling the original sample with replacement. The size of a bootstrap
sample is the same as that of the original sample. Efron and Tibshirani (1993) show that
cross-validation is roughly unbiased but can show large variability. The simple
bootstrap method shows lower variability but can be severely biased downward. The
.632 bootstrap performs the best among these methods.
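The resampling idea described above can be illustrated with a minimal sketch. It is not
the procedure used in this dissertation (the appendix lists SAS scripts); the function
name bootstrap_632 and the fit and error callables are hypothetical. The weights 0.368
and 0.632 follow the standard published form of the .632 estimator, and the out-of-bag
error is averaged per bootstrap replicate, which is a simplification of the
per-observation average.

```python
import random


def bootstrap_632(data, fit, error, n_boot=200, seed=0):
    """Sketch of a .632 bootstrap estimate of prediction error.

    data  : list of records (hypothetical layout, chosen by the caller)
    fit   : callable, fit(records) -> model
    error : callable, error(model, records) -> mean error in [0, 1]
    """
    rng = random.Random(seed)
    n = len(data)

    # Apparent error: the model is fitted and tested on the same full sample,
    # so it is optimistic.
    apparent = error(fit(data), data)

    oob_errors = []
    for _ in range(n_boot):
        # Bootstrap sample: n draws from the original sample with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        # Records never drawn form the out-of-bag test set for this replicate.
        held_out = [data[i] for i in range(n) if i not in chosen]
        if not held_out:
            continue
        model = fit([data[i] for i in idx])
        oob_errors.append(error(model, held_out))

    if not oob_errors:
        raise ValueError("no replicate produced an out-of-bag test set")
    oob = sum(oob_errors) / len(oob_errors)

    # Weighted combination of the optimistic and the pessimistic estimate.
    return 0.368 * apparent + 0.632 * oob
```

Passing the fitting and scoring steps as callables keeps the sketch independent of any
particular model; a detection model and its error measure would be supplied by the caller.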
An incident detection model that has a high detection rate, a low false alarm rate, and a low
mean time to detect is essential for the operations of a TMC. Studies have shown that
alarms generated in a TMC by an ID model with a high false
alarm rate are often ignored. The impacts of non-recurring congestion can be
minimized by detecting severe incidents rapidly. Moreover, a model that is easy to
implement and calibrate or train, and performs well is desired.
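As a rough illustration of the three performance measures named above, the sketch below
computes them from a list of incidents and a set of alarm times. The definitions used
here are common ones and are assumptions, not the definitions adopted in this
dissertation: DR is the share of incidents with at least one alarm during the incident,
FAR is the share of non-incident intervals that carry an alarm, and TTD is the mean
number of intervals from incident start to first alarm. The function name and the data
layout are hypothetical.

```python
def detection_metrics(incidents, alarms, n_intervals):
    """Sketch of DR, FAR, and mean TTD under assumed definitions.

    incidents   : list of (start, end) interval indices, inclusive
    alarms      : set of interval indices (0 .. n_intervals - 1) with an alarm
    n_intervals : total number of intervals in which the model was applied
    """
    incident_intervals = set()
    delays = []
    for start, end in incidents:
        span = range(start, end + 1)
        incident_intervals.update(span)
        # Alarms raised while this incident is active, in time order.
        hits = [t for t in span if t in alarms]
        if hits:
            delays.append(hits[0] - start)

    # Alarms raised outside every incident count as false alarms.
    false_alarms = sum(1 for t in alarms if t not in incident_intervals)
    non_incident = n_intervals - len(incident_intervals)

    dr = len(delays) / len(incidents) if incidents else 0.0
    far = false_alarms / non_incident if non_incident else 0.0
    ttd = sum(delays) / len(delays) if delays else None
    return dr, far, ttd
```

For example, with incidents at intervals 10 to 14 and 30 to 33, alarms at intervals 12,
13, and 50, and 100 intervals in total, one of two incidents is detected (DR 0.5), one of
91 non-incident intervals carries an alarm, and the detected incident is found two
intervals after it starts.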
1.2 Objectives of Study
An incident detection (ID) model with high detection rate, low false alarm rate, and
low mean time to detect is an integral part of decision support system of an advanced
traffic management system (ATMS) operated by a traffic management center (TMC).
The impacts of non-recurring congestion may be reduced considerably by reducing the
time required to detect and clear an incident. Alarms generated by an incident
detection model with a high false alarm rate are usually ignored by the TMC. Therefore, not
only a high detection rate but also a low false alarm rate is an operational requirement
of an incident detection model. As an incident detection model is developed and tested,
an accurate estimate of the performance of the incident detection model developed is
also desired.
The objectives of this research are to:
1. Study the characteristics of lane-blocking and shoulder incidents.
2. Identify significant independent variables for incident detection models.
3. Develop generalized additive models (GAM) to detect lane-blocking and
shoulder incidents based on data from fixed and mobile sensors.
4. Compare the performance of the incident detection models developed with
recent models reported in the literature.
5. Examine the effect of segment type and segment length on incident detection.
6. Obtain an accurate estimate of the performance of the model developed by
applying a bootstrap method.
1.3 Organization of Dissertation
The dissertation consists of nine chapters and is organized as follows:
Chapter 1 presents a background on incident detection, the objectives, and scope of
this dissertation.


Chapter 2 presents a comprehensive review of the literature on incident detection. A
synthesis of the literature on the techniques applied to freeway incident detection,
performance of the models, incident data sets used, and the strengths and weaknesses
of the models are presented.
Chapter 3 describes the fixed sensor, mobile sensor, and incident data collected to develop
and test the proposed incident detection models and to compare their performance with a few
recently developed models reported in the literature.
Chapter 4 examines the characteristics of incidents including incident type, incident
rate, incident duration, and average delay.
Chapter 5 presents the methodology proposed for a new incident detection model.
Chapter 6 describes the development of generalized additive models for incident
detection, including selection of independent variables and parametric estimate of
generalized additive models. The difference in traffic patterns that results due to lane-
blocking and shoulder incidents is also examined.
Chapter 7 presents the performance of the fixed sensor, mobile sensor, and fixed and
mobile sensor based generalized additive models and their parametric estimate for all
incidents on the Interstate 25 freeway in Colorado and the Interstate 880 freeway in
California, lane-blocking incidents, and shoulder incidents on the Interstate 880
freeway. The effect of the length and type of the freeway segments on the performance
of the incident detection model is also presented in this chapter.
Chapter 8 presents the bootstrap estimate of the performance of the models developed
for the Interstate 880 freeway.
Chapter 9 summarizes the findings of the research and presents the conclusions and
recommendations.
1.4 Significant Contributions of Study
The significant contributions of this study are as follows:
1. A new and improved incident detection model is developed to detect incidents.
2. Most incident detection models reported in the literature detect only lane-
blocking incidents. This research proposes incident detection models to detect
lane-blocking and shoulder incidents. The characteristics of lane-blocking and
shoulder incidents are also examined and compared.
3. This study examines the effect of type and length of segments on the
performance of a freeway incident detection model.
4. The performance of incident detection models is validated without bias using
a bootstrap method. The performance reported includes the mean DR, ISDR,
FAR, and TTD, and their 95 percent confidence intervals.
2. Literature Review
Incident detection is an important function of a Traffic Management Center with an
Advanced Traffic Management System (ATMS) in place. From the early 1970's,
research has focused on developing incident detection algorithms for freeways. Since
the 1980's the scope of incident detection work has expanded to include signalized
urban arterials. Data for incident detection algorithms may come from infrastructure
based fixed sensors and/or non-infrastructure based mobile sensors. The
infrastructure based sensors include loop detectors embedded in pavements or video
based detection systems that provide estimates of traffic flow measures such as flow
rate, speed, and occupancy (percent time detectors are occupied by traffic). Incident
detection algorithms typically rely on data received from fixed, infrastructure based
loop detectors. Loop detectors, typically installed at regular spacing, provide a good
source of temporal variation of traffic measures, while the spatial variation of traffic
measures is available only at discrete intervals, e.g. every half mile. Non-infrastructure, mobile
sensors provide spatial variation of traffic measures at regular time intervals. A limited
study based on simulated data with high penetration rate of probe vehicles has shown
improvement in the performance of an incident detection algorithm when both fixed
loop detectors and probe vehicles are used as data sources. Very few studies have used
field data, especially for integrated fixed and mobile based incident detection
algorithms.
Incident detection algorithms rely on the traffic patterns that emerge as an incident
occurs. Typically, it is assumed that as a freeway incident occurs, the flow rate
decreases, the occupancy increases, and the speed decreases upstream of the incident,
and the flow rate decreases, the occupancy decreases, and the speed increases
downstream of the incident (Figure 2.1). However, the increase in speed
downstream of the incident may vary with the distance between the incident location and
the downstream detector station, or with the spatial distribution of loop detectors, which
determines whether vehicles have room to accelerate to their desired speed after passing an incident.
Several techniques have been applied to incident detection. Examples include
decision-tree based pattern recognition techniques, time-series based statistical
approach, catastrophe theory, and artificial intelligence based neural network
approach.
There are six types of incident detection algorithms based on the availability of data: (i)
fixed sensor-based freeway incident detection algorithms, (ii) mobile sensor-based
freeway incident detection algorithms, (iii) fixed and mobile sensor-based freeway
incident detection algorithms, (iv) fixed sensor-based surface street incident detection
algorithms, (v) mobile sensor-based surface street incident detection algorithms, and
(vi) fixed and mobile sensor-based surface street incident detection algorithms. This
chapter of the dissertation presents a review of all freeway incident detection
algorithms and only probe based surface street incident detection algorithms.
[Figure 2.1. In-lane or lane blocking freeway incident; the figure labels the regions upstream and downstream of the incident.]
2.1 Fixed Sensor-Based Freeway Incident Detection Algorithms
Fixed sensors are typically installed in major urban freeways. Therefore, most incident
detection algorithms developed since the 1970's are based on data collected by fixed
sensors such as loop detectors.
A series of 11 pattern matching algorithms based on decision trees were developed
for freeway incident detection by Payne (Payne and Tignor 1978). These threshold
based algorithms, better known as the California Algorithms, detect a freeway incident
based on classifying traffic patterns as incident or non-incident condition using a
decision tree. The algorithm detects discontinuity of upstream and downstream
occupancy. An incident is declared if measures calculated from 1-minute average
occupancy for upstream and downstream stations exceed predetermined thresholds.
Other measures used in the algorithm, calculated from upstream occupancy and
downstream occupancy, are spatial difference in occupancies (OCCDF), relative
spatial difference in occupancies (OCCRDF), and relative temporal difference in
downstream occupancy (DOCCTD). The California Algorithm #1 compares these
three measures (OCCDF, OCCRDF, and DOCCTD) to three corresponding
thresholds. The algorithm indicates either incident or incident-free states. The
California Algorithm #2 or Modified California Algorithm is a refinement of the
California Algorithm #1 with an incident continuing state as another state for an
output. The Algorithm #4 is similar to Algorithm #2 but uses downstream occupancy
instead of relative temporal difference in downstream occupancy (DOCCTD). This
helps reduce false alarms due to compression waves in heavy traffic. A compression
wave is the growth of congested traffic which causes a high occupancy that moves
through the traffic stream in a direction counter to the flow. The Algorithm #7 uses the
same traffic measures as Algorithm #4 with a persistence requirement. The Algorithm
#8 has the capability to account for compression wave and persistence requirement to
reduce the false alarm rate. A persistence test checks for incident states for consecutive
intervals before an incident is declared. The California Algorithm was developed and
evaluated based on field data obtained from Los Angeles and Minneapolis freeway
surveillance systems including approximately 150 incidents. The California Algorithm
#8 outperformed the other California Algorithms. The detection rate reported is up to
61 percent with false alarm rate of 0.177 percent. The algorithm is often used as a
benchmark when comparing and evaluating new algorithms because of its widespread
use in traffic management centers (TMC).
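The three-threshold test of California Algorithm #1 described above can be illustrated with a minimal Python sketch. The formulas shown for OCCDF, OCCRDF, and DOCCTD, the use of an earlier downstream reading for the temporal difference, the function name, and the threshold values are assumptions made for illustration; the text above names the measures but does not give their equations or calibrated thresholds.

```python
def california_algorithm_1(occ_up, occ_down, occ_down_prev, t1, t2, t3):
    """Return True when all three threshold tests indicate an incident.

    occ_up        -- current upstream occupancy (percent)
    occ_down      -- current downstream occupancy (percent)
    occ_down_prev -- downstream occupancy a few intervals earlier (assumed lag)
    t1, t2, t3    -- hypothetical thresholds for OCCDF, OCCRDF, DOCCTD
    """
    # Guard against division by zero when a detector reports no occupancy.
    if occ_up <= 0 or occ_down_prev <= 0:
        return False
    occdf = occ_up - occ_down                            # spatial difference
    occrdf = occdf / occ_up                              # relative spatial difference
    docctd = (occ_down_prev - occ_down) / occ_down_prev  # relative temporal difference
    return occdf >= t1 and occrdf >= t2 and docctd >= t3


# Illustrative values only: upstream occupancy rises while downstream drops.
incident = california_algorithm_1(40.0, 10.0, 20.0, t1=8.0, t2=0.5, t3=0.3)
```

The later variants described above (#2, #4, #7, #8) add further states and tests to this basic comparison and are not reproduced in the sketch.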
An exponential smoothing algorithm for freeway incident detection was
developed by Cook (Cook and Cleveland 1974). The variables are occupancy,
speed, volume, and energy computed from speed and volume. The algorithm was
developed and tested based on 1-minute interval field data from the John C. Lodge
Freeway in Detroit covering 50 incidents: 18 accidents, 28 stalls and
breakdowns, 2 instances of debris, and 2 short maintenance operations. The highest
detection rate reported for the Exponential Station Discontinuity model is 100 percent
at a false alarm rate of 6.5 percent, and the lowest false alarm rate is 5.73 percent at a
detection rate of 96 percent.
The Standard Normal Deviate (SND) model is based on the assumption that an
incident results in a high rate of change in lane occupancy and energy computed from
volume and speed measurements (Dudek et al. 1974). The algorithm was tested and
evaluated based on field data for 35 real incidents in Houston, Texas. The SND
algorithm using a 5-minute interval occupancy as the control variable resulted in the
best performance. The detection rate reported is 92 percent with 1.3 percent false
alarm rate.
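The idea behind a standard normal deviate test can be sketched in Python as follows. This is a minimal illustration, assuming that the deviate is computed as the current value minus the mean of recent values, divided by their standard deviation; the window contents, the function names, and the threshold of 3.0 are hypothetical and are not taken from Dudek et al. (1974).

```python
from statistics import mean, stdev


def standard_normal_deviate(current, history):
    """Standardize the current value against recent history (needs >= 2 values)."""
    spread = stdev(history)
    if spread == 0:
        return 0.0  # no variation in the history, so no meaningful deviate
    return (current - mean(history)) / spread


def snd_incident(current, history, threshold=3.0):
    """Flag an incident when the deviate exceeds a hypothetical threshold."""
    return standard_normal_deviate(current, history) > threshold


# Illustrative 5-minute occupancy values (percent).
recent_occupancy = [10.0, 12.0, 11.0, 9.0, 13.0]
flag = snd_incident(20.0, recent_occupancy)
```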
A Bayesian approach (Levin and Krause 1978) assumes that the normal traffic flow
follows its historic trend and any deviation from this trend that exceeds a certain
threshold indicates an incident. The variable used is the ratio of the difference between
upstream and downstream 1-minute occupancies and upstream occupancy. The
algorithm requires the frequency distribution of the variable during incident and
incident-free conditions from historical data, and probability of an incident occurring
at a given detector. This algorithm was evaluated based on field data for 17 incidents
on outbound Kennedy Expressway, Chicago. The Bayesian Algorithm compares
favorably with the California Algorithm in terms of detection rate and false alarm rate.
The detection rate of the Bayesian Algorithm reported is 100 percent with false alarm
rate of 0 percent. However, the mean time to detect is higher than that of the California
Algorithm. The structure of the Bayesian Algorithm limits the mean time to detect to at
least 4 time intervals.
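The ingredients listed above (the relative occupancy difference, its frequency distribution under incident and incident-free conditions, and the prior probability of an incident) combine through Bayes' rule. The Python sketch below shows a single-interval calculation only; the function names and the numerical values are hypothetical, and the way the published algorithm accumulates evidence over consecutive intervals is not reproduced here.

```python
def relative_occupancy_difference(occ_up, occ_down):
    """(upstream - downstream) / upstream, the variable described in the text."""
    return (occ_up - occ_down) / occ_up


def posterior_incident(p_z_given_incident, p_z_given_normal, prior_incident):
    """Posterior probability of an incident given the observed variable.

    p_z_given_incident -- relative frequency of the observed value during incidents
    p_z_given_normal   -- relative frequency of the observed value without incidents
    prior_incident     -- historical probability of an incident at the detector
    """
    numerator = p_z_given_incident * prior_incident
    denominator = numerator + p_z_given_normal * (1.0 - prior_incident)
    return numerator / denominator if denominator > 0 else 0.0


# Illustrative numbers only, not values from Levin and Krause (1978).
z = relative_occupancy_difference(40.0, 10.0)
p = posterior_incident(0.6, 0.01, 0.001)
```

With a small prior, even a value that is much more common during incidents yields a modest posterior for one interval, which is consistent with the need to observe several intervals before declaring an incident.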
Two pattern recognition methods, the HIOCC (High OCCupancy) and PATREG
algorithms were developed by Collins (Collins et al. 1979). The HIOCC identifies the
presence of stationary or slow moving vehicles over detectors based on occupancy
data. The one-second occupancy from detector is obtained by scanning the detector
every 1/10 second to determine whether the detector is occupied. The one-second
occupancy is smoothed with single-stage exponential smoothing at the end of every
second. Several consecutive seconds of very high detector occupancy initiates an
incident alarm. The PATREG algorithm monitors the traffic speed in each lane
between pairs of detector stations. Speed is compared against pre-determined
thresholds. Significant change in speed is an indication of an incident. Both algorithms
were developed and tested with two data sets, one from the M4 Motorway near London
with no incidents and another from the Boulevard Peripherique in Paris with 12
incidents. The data from the M4 Motorway were used to test the algorithms for their
ability to detect queues due to downstream congestion. The HIOCC detected all 12
incidents and the queues due to congestion on the Boulevard Peripherique. The PATREG
did not detect the incidents in the heavy traffic on the Boulevard Peripherique. It
performs satisfactorily in free-flow conditions up to about 1500 veh/h per lane on M4
motorway data in terms of false alarm rate.
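The HIOCC logic described above (one-second occupancy from ten scans, single-stage exponential smoothing, and an alarm after several consecutive high values) can be sketched in Python. The smoothing constant, the occupancy threshold, the number of consecutive seconds, and the choice to apply the threshold to the smoothed value are all assumptions for illustration; they are not parameters reported by Collins et al. (1979).

```python
def one_second_occupancy(scans):
    """Percent of 1/10-second scans in which the detector was occupied."""
    return 100.0 * sum(scans) / len(scans)


def hiocc_alarm(occupancies, alpha=0.5, threshold=90.0, n_consecutive=3):
    """Return the index of the second at which an alarm is raised, or None.

    occupancies   -- sequence of one-second occupancies (percent)
    alpha         -- hypothetical exponential smoothing constant
    threshold     -- hypothetical "very high occupancy" level (percent)
    n_consecutive -- hypothetical number of consecutive high seconds required
    """
    smoothed = None
    run = 0
    for t, occ in enumerate(occupancies):
        # Single-stage exponential smoothing, updated at the end of every second.
        smoothed = occ if smoothed is None else alpha * occ + (1 - alpha) * smoothed
        run = run + 1 if smoothed >= threshold else 0
        if run >= n_consecutive:
            return t
    return None


# A vehicle stops over the detector from the third second onward.
alarm_second = hiocc_alarm([20.0, 30.0, 100.0, 100.0, 100.0, 100.0, 100.0])
```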
Tsai (Tsai and Case 1979) applied the maximum-likelihood decision principle to
develop an optimum incident persistent test to improve the performance of the incident
detection algorithm. The technique is applied to the modified California algorithm
(Payne and Tignor 1978) to reduce the false alarm rate. The algorithm is tested with
field data of 28 real incidents from the Queen Elizabeth Way freeway in Canada. The
false alarm rate decreases from 0.09 percent to 0.06 percent but the detection rate also
decreases from 85 percent to 74 percent.
Fambro (Fambro and Ritch 1979) developed an algorithm for freeway incident
detection under low-volume conditions. The algorithm assumes that under low volume
conditions, vehicle speeds remain constant over short segments of freeway. The
control variables include the time that a vehicle enters the segment and predicted exit
time computed from speed estimated from detectors. An incident is declared if fewer
than an expected number of vehicles exit the section in an interval. The algorithm was
developed and tested with field data from the I-610 freeway in Houston, Texas. The
algorithm detects 100 percent of incidents when the flow rate is less than 400 veh/h. It
detects 61 percent of incidents when the flow rate is between 800 and 1200 veh/h.
An autoregressive integrated moving average (ARIMA) algorithm was developed
for freeway incident detection by Ahmed (Ahmed and Cook 1982). It assumes that
traffic flow can be modeled from historical, time-varying traffic data, and compares
observed traffic measures such as occupancy against short-term predicted traffic
measures. Significant deviations between observed and predicted values of traffic
measures lead to an incident alarm. The algorithm was developed and evaluated based
on field data for 50 real incidents on the Lodge freeway in Detroit. The detection rate
reported is 100 percent with a false alarm rate of 2.6 percent by using constant-
parameter values estimated from incident-free data. The false alarm rate decreases to
1.4 percent when parameter estimates are updated occasionally. The mean time to
detect decreases from 0.58 minute to 0.39 minute. The performance of the time series
algorithm depends on the robustness of the optimum confidence interval as it
determines the threshold deviations.
The McMaster algorithm (Persaud and Hall 1989) for freeway incident detection is
based on Catastrophe theory. The flow-occupancy curve is used as the decision
criterion for detecting incidents by separating the areas corresponding to different
states of traffic conditions. The flow-occupancy criterion is derived from the
catastrophe theory model of the three dimensional relationship between flow,
occupancy, and speed. Incidents are detected by observing specific changes in traffic
measures in a short time period. Aultman-Hall (Aultman-Hall et al. 1991) developed
and evaluated the algorithm using flow rate, occupancy, and speed at a single station.
The flow-occupancy curve varies for each station based on geometric characteristics.
The flow-occupancy curve has four possible states; normal uncongested operation,
operation downstream of incident, operation within a queue of slow-and-go traffic, and
capacity operation downstream of a recurrent bottleneck. The algorithm was
developed and evaluated based on 30-second field data collected from the Burlington
Skyway in Ontario, Canada. The algorithm detects 6 incidents out of 10 incidents (60
percent). The 4 incidents that are missed are reported to have no significant effect on
traffic. The false alarm rate is one every 10 hours with a 2 time interval persistence check and
one every 39 hours with a 3 time interval persistence check. The algorithm was also
tested with field data for 31 incidents from Queen Elizabeth Way in Ontario, Canada.
The algorithm detects 14 incidents. The other 17 missed incidents are reported to
involve bad or no detector data, or to have had no visible effect on the data. The false alarm
rate reported is 15 alarms in 39 days.
A combination of fuzzy logic and the learning capabilities of neural networks was
applied to freeway incident detection by Hsiao (Hsiao et al. 1993). The input
variables used are 5-minute interval volume, occupancy, and rate of change of
occupancy. Data collected from the Dan Ryan expressway in Chicago at three
different locations, including 6, 9, and 4 incidents, are used to develop and test the
algorithm. The first data set with 6 incidents is used to train the algorithm. The
detection rates for the second and third data sets are 66.7 percent and 75 percent,
respectively. The false alarm rates reported are 0 for both data sets.
A low-pass filtering technique was applied to freeway incident detection
by Stephanedes (Stephanedes and Chassiakos 1993). The low-pass filtering or
Minnesota Algorithm filters the raw traffic data before an incident detection algorithm
is applied. This helps to reduce the false alarm rate due to short-term traffic fluctuations.
The 30-second average upstream occupancy and downstream occupancy are used to
compute the spatial occupancy difference. A linear filter, a moving-average smoother, is used
to filter out short-term fluctuations of the occupancy difference. The filtered variables are compared
against thresholds. The algorithm was developed and tested with field data for 27 real
incidents on I-35W in Minneapolis. The Minnesota algorithm outperformed the
California Algorithm, California Algorithm #7 (Payne and Tignor 1978), Standard
Normal Deviate (SND) (Dudek et al. 1974), and the Double-Exponential Algorithms
(Cook and Cleveland 1974). The detection rate is 50 percent at false alarm rate of 0.1
percent. A detection rate of 80 percent is achieved with a false alarm rate of 0.3
percent.
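The effect of the moving-average smoother on the spatial occupancy difference can be illustrated with a short Python sketch. The trailing window, its length, and the function names are assumptions for illustration and do not represent the filter design or thresholds calibrated by Stephanedes and Chassiakos (1993).

```python
def moving_average(values, window):
    """Trailing moving average; early points use the samples available so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def filtered_occupancy_difference(occ_up, occ_down, window=6):
    """Smooth the upstream-minus-downstream occupancy difference."""
    difference = [u - d for u, d in zip(occ_up, occ_down)]
    return moving_average(difference, window)


# A single 30-second spike of 30 percent in the difference is damped to 10
# percent by a three-interval window, so it is less likely to cross a threshold.
damped = moving_average([0.0, 0.0, 30.0, 0.0, 0.0], window=3)
```

A sustained difference, as expected during an incident, survives the filter, while an isolated fluctuation is attenuated.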
An artificial intelligence based, non-linear pattern recognition technique, the neural
network approach, has also been applied to freeway incident detection. Cheu (Cheu
and Ritchie 1995) first applied the multi-layer feedforward (MLF) neural network to
develop an incident detection algorithm based on 30-second average occupancy and
average volume data for up to 4 previous time intervals for the upstream station and up
to 2 previous time intervals for the downstream station from loop detectors simulated for a
freeway in California. Four hundred simulated incidents are used for training and
another set of 400 simulated incidents is used to evaluate the progress of training. Nine
hundred and eighty simulated incidents and 9 real incidents are used to evaluate the
performance of the neural network algorithm. The study reports that the neural network
algorithm outperformed the California Algorithm #8, the McMaster algorithm, and the
Minnesota algorithm.
A multi-layer feedforward neural network was applied to an extensive incident data
set by Dia (Dia et al. 1997). The data set included 100 real incidents from the Tullamarine
Freeway in Melbourne, Australia. Out of 100 incidents, 60 incidents are used for
training and 40 incidents are used for validation. The variables used are 20-second
average occupancy, average speed, and average volume at upstream and downstream
detector stations at the current time interval. The algorithm is compared to the
ARRB/VicRoads model (Luk 1992). The logic of the ARRB/VicRoads model is to
compare the traffic data between adjacent stations and adjacent lanes. An incident is
declared if the differences exceed pre-determined thresholds. The MLF outperformed
the ARRB/VicRoads model. The detection rate reported for the MLF is 82.5 percent with
a false alarm rate of 0.065 percent and a mean time to detect of 203 seconds.
A framework for incorporating a neural network based continuous learning
capability, least squares and error back propagation, to the California algorithm and
McMaster algorithm was developed by Peeta (Peeta and Das 1998). The least squares
and error back propagation is implemented in the California algorithm to continuously
update the thresholds over time. The least squares technique and the error back
propagation are applied to McMaster algorithm to update the parameters that classify
region of four states for flow-occupancy curve. Simulation data of the Borman
Expressway in Indiana is used to develop and test the framework. The algorithm
outperformed the California Algorithm and McMaster Algorithm. The performance of
the California and McMaster algorithms with continuous learning capability also
improved with time in service. The error back propagation shows better learning
capability, shorter time in service to reach a 100 percent detection rate, than the least
squares technique.
A CUSUM algorithm was developed for freeway incident detection by Teng (Teng et
al. 1998). It assumes that a change in the traffic process can be detected from the
difference between the current cumulative sum of the log-likelihood ratio and its
minimum value up to the current time period. The log-likelihood ratio is the logarithm
of the ratio between the probability density functions associated with the normal and the
changed conditions. The input variables are upstream and downstream occupancies. Field data
from I-880 collected by the California PATH project were used to develop and test the
algorithm, using the 63 incidents that obviously affect the traffic. The study reports that the algorithm
performed better than the California Algorithm and the low-pass filtering algorithm.
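The CUSUM decision rule described above, the cumulative log-likelihood ratio minus its running minimum compared with a threshold, can be sketched in Python. The sketch assumes a scalar input (for example, an occupancy-based statistic) with Gaussian densities of equal variance under normal and changed conditions, and it takes the ratio as changed over normal; the means, standard deviation, threshold h, and function name are hypothetical and are not taken from Teng et al. (1998).

```python
def cusum_alarm(xs, mu_normal, mu_changed, sigma, h):
    """Return the index at which the CUSUM statistic first reaches h, or None."""
    cumulative = 0.0  # running sum of log-likelihood ratios
    minimum = 0.0     # smallest cumulative sum so far (the empty sum counts as 0)
    for t, x in enumerate(xs):
        # Log-likelihood ratio of N(mu_changed, sigma^2) against N(mu_normal, sigma^2).
        llr = (mu_changed - mu_normal) / sigma ** 2 * (x - (mu_normal + mu_changed) / 2.0)
        cumulative += llr
        minimum = min(minimum, cumulative)
        if cumulative - minimum >= h:
            return t
    return None


# The observations shift from the normal mean (0) to the changed mean (2).
alarm_index = cusum_alarm([0.0, 0.0, 0.0, 2.0, 2.0, 2.0],
                          mu_normal=0.0, mu_changed=2.0, sigma=1.0, h=5.0)
```

Subtracting the running minimum means that evidence accumulated during a long incident-free period does not delay detection once the change occurs.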
Two neural network models, multi-layer feedforward (MLF) and fuzzy adaptive
resonance theory (ART), were applied to freeway incident detection by
Ishak (Ishak and Al-Deek 1999). The algorithms were developed and tested with field
data for 130 lane-blocking incidents on the I-4 freeway in Orlando, Florida. The MLF
algorithm reports a detection rate of 63 percent with a false alarm rate of 0.05 percent
when using 30-second occupancy and speed as algorithm inputs. The detection rate
reported is 61 percent with a false alarm rate of 0 percent when using occupancy, speed,
and volume as input variables. The fuzzy ART algorithm performed best when using a
persistence factor (PF) of 2 and a vigilance parameter of 0.95 with occupancy and
speed as input variables. The vigilance parameter controls the dynamics of the Fuzzy
ART network and determines the degree of clustering achieved by the algorithm. For
false alarm rates up to 0.07 percent, the MLF algorithm outperformed the fuzzy ART
algorithm, the California Algorithm #7 and #8. For false alarm rates greater than 0.07
percent, the fuzzy ART algorithm outperformed the MLF network, the California
Algorithm #7 and #8. The fuzzy ART has advantages over the MLF network due to its
fast stable learning in response to analog or binary input patterns.
The Bayesian-based probabilistic neural network (PNN), which utilizes the concept of
statistical distance, instead of Euclidean distance, as a measure of nearness of the weighted
vector to the different patterns, was applied to freeway incident detection by Abdulhai
(Abdulhai and Ritchie 1999). The same simulated and real incident data used by a
previous study (Cheu and Ritchie 1995) and additional real data from I-880 in
California and I-35W in Minnesota including 45 and 159 incidents respectively were
used to evaluate the PNN algorithm. The same 16 input variables as a previous study
(Cheu and Ritchie 1995) were used in this study. The algorithm compares favorably to
the multi-layer feedforward neural network in terms of detection rate, false alarm rate,
and mean time to detect. The study reports that the Bayesian-based PNN trains faster
than the MLF neural network and is potentially transferable to new sites without the
need for explicit off-line retraining. However, the characteristics of the incidents on the
I-880 freeway and the I-35W freeway are not compared.
A more recent model, the constructive probabilistic neural network (CPNN) was
developed by Jin (Jin et al. 2002). The model was structured based on mixture
Gaussian model and trained by a dynamic decay adjustment algorithm. A mixture
Gaussian model allows the PNN to include different smoothing parameters for each
unit pattern that can be obtained by the dynamic decay adjustment algorithm. The
CPNN models were developed and tested on simulated data for 300 incidents in
Singapore and field data from the I-880 freeway in California, also used in a previous
study (Abdulhai and Ritchie 1999). The 24 input variables used include occupancy,
speed, and volume for up to 4 previous time intervals for the upstream station and up to 2
previous time intervals for the downstream station. The model developed for the simulated
data in Singapore reports a detection rate of 97.33 percent with a false alarm rate of 7.12
percent and a mean time to detect of 1.7 minutes without any persistence test. The model
developed for the I-880 freeway data reports a detection rate of 95.65 percent with a
false alarm rate of 0.33 percent and a mean time to detect of 3.84 minutes. The model
developed on the simulated incident data in Singapore was also tested on the I-880
data with the proposed adaptation method. The model developed specifically for the
location has been shown to perform better than the model with adaptation. However,
the CPNN model has been shown to have a high false alarm rate.
2.2 Mobile Sensor-Based Freeway Incident Detection Algorithms
All algorithms mentioned in the previous section use traffic flow parameters estimated
from fixed loop detectors. In the last few years, the use of probe vehicles as an
alternative or additional source of traffic flow data has been of great interest. This
section presents a review of algorithms that use only data from mobile sensors.
Petty (Petty et al. 1997) developed an incident detection (ID) algorithm based on GPS based probe data
collected every second. The algorithm determines when a probe vehicle has passed an
incident by comparing observed acceleration and speed against thresholds. The
observed data is first filtered using a standard moving average filter of width 20
seconds. The vehicle is classified as passing an incident when acceleration is above a
threshold and is accelerating to speed above a speed threshold. The speed threshold is
used to reject the large acceleration due to stop-and-go conditions that may cause false
alarms. The algorithm was developed and evaluated based on data collected from the
I-880 freeway in Hayward, California. Incidents were divided into three categories:
accident, vehicle breakdown, and police ticketing. The incident data included 25
accidents (13 in-lane accidents, 12 shoulder accidents), 61 police ticketing events, and 226
vehicle breakdowns (16 in-lane breakdowns, 210 right shoulder breakdowns). The
algorithm detected approximately 70 percent of the 25 accidents on the freeway and
about 50 percent of the police ticketing events and vehicle breakdowns.
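The probe-based test described above, filtering the one-second speed trace and then checking acceleration and speed against thresholds, can be sketched in Python. The trailing form of the moving average, the threshold values, the units, and the function names are assumptions for illustration and are not the parameters used by Petty et al. (1997).

```python
def trailing_moving_average(values, width):
    """Moving average over the last `width` samples (fewer at the start)."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def incident_passing_times(speeds, width=20, accel_threshold=1.0, speed_threshold=20.0):
    """Return the sample indices at which a probe appears to pass an incident.

    speeds          -- probe speeds sampled once per second
    width           -- filter width in samples (20 seconds in the text)
    accel_threshold -- hypothetical minimum acceleration (speed units per second)
    speed_threshold -- hypothetical minimum speed reached while accelerating
    """
    filtered = trailing_moving_average(speeds, width)
    flagged = []
    for t in range(1, len(filtered)):
        acceleration = filtered[t] - filtered[t - 1]  # samples are 1 second apart
        # The speed test rejects large accelerations within stop-and-go traffic.
        if acceleration > accel_threshold and filtered[t] > speed_threshold:
            flagged.append(t)
    return flagged


# Short illustrative trace with a narrow filter so the effect is easy to follow.
times = incident_passing_times([5.0, 5.0, 5.0, 5.0, 15.0, 25.0, 25.0],
                               width=2, accel_threshold=8.0, speed_threshold=15.0)
```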
A Standard Normal Deviate (SND) algorithm for freeway incident detection was
applied by Balke (Balke et al. 1996). The algorithm compares probe travel time with
historical travel time. An incident is declared if the travel time from a probe vehicle
exceeds the confidence interval around the typical travel time. Data was collected from
I-45, the Hardy Toll Road, and US-59 in Houston, Texas for over 11 months including
approximately 625 incidents. Travel time was estimated from the times when probe
vehicle drivers called a communication center as they were passing consecutive
reference locations. Probe vehicle drivers were also asked to call in to provide incident
information. The algorithm is less effective than the California Algorithm and the
McMaster Algorithm. However, this algorithm used only probe vehicle data whereas
the other algorithms used detector data. The detection rate reported is 58 percent, with a
higher false alarm rate than the other algorithms.
2.3 Fixed and Mobile Sensor-Based Freeway Incident Detection Algorithms
Fixed sensor data may lack capability of representing spatial variation of traffic
conditions especially when detector spacing is large. Use of mobile sensor data, with
its capability of representing spatial variation of traffic conditions, is also of interest in
incident detection. This section reviews freeway incident detection algorithm that uses
both fixed and mobile sensor data.
To date, only one freeway incident detection algorithm has been developed based on
data from both fixed and mobile sensors. Three models, Discriminant Analysis (DA)
based model, Generalized Linear Model (GLZ), and Neural Network (NN) model,
were developed based on simulated incident and non-incident conditions for an
interstate freeway, I-25, in Colorado (Hoeschen 1999). The mobile sensors are the
automatic vehicle location system (AVL) installed in buses by the local transit agency
for fleet management. The models use 30-second detector data with probe data
collected every 10, 20, or 30 seconds. The input variables include average bus speed,
number of bus reportings, upstream volume at time t-1 and t-2, upstream speed,
downstream volume at time t, t-1, t-2, downstream speed, upstream and downstream
speed difference, upstream occupancy difference by lane, downstream occupancy
difference by lane. The study also reports the performance of the models (DA, GLZ, &
NN) using fixed sensor data only, mobile sensor data only and both fixed and mobile
sensor data. The study reports that the performance of the mobile sensor data based
models depends on the average bus headway and the probe reporting interval. For the
bus headways observed on a section of the I-25 freeway in Colorado during the
morning peak periods, the fixed sensor based model outperformed the mobile sensor
based model. However, the mobile sensor based model detected up to 70 percent of
incidents. When both detector and probe data were used, the overall performance of
the ID model improved compared to the fixed sensor based model, mainly through a reduced
false alarm rate. For the combined fixed and mobile sensor based model, the NN model
outperformed other models for 10-second bus reportings. The GLZ model
outperformed other models for 30-second bus reportings. This study suggests that the
bus AVL data can be used by TMC's to improve the overall performance of an incident
detection algorithm by lowering the false alarm rate.
2.4 Mobile Sensor-Based Surface Street Incident Detection Algorithms
Although only two freeway incident detection algorithms have been developed based
on mobile sensor data and one study based on fixed and mobile sensor, the literature
review has also shown two studies in Chicago that have developed an incident
detection model for signalized surface street network based on probe data. The scope
of this dissertation proposal is limited to freeway incident detection, therefore a
comprehensive review of all surface street incident detection algorithms is not
provided here. Only two studies are reviewed here to provide an overview of the probe
based methods. It may be mentioned that there are significant differences in the traffic
patterns of surface street and freeway incidents due to the presence of multiple access
points, geometric constraints, control measures and the location of surveillance
infrastructure for surface streets.
A discriminant analysis based model was developed for incident detection for urban
arterials in the ADVANCE project (Sethi et al. 1995). This study used fixed sensor and
mobile sensor data independently to develop a fixed sensor based algorithm and a
mobile sensor based algorithm, respectively. For the fixed sensor based algorithm, the
best model was obtained when upstream occupancy deviation from historical
occupancy and volume to occupancy ratio deviation from historical volume to
occupancy ratio for upstream station was used to develop Fisher linear discriminant
functions. The 7-minute average data from a traffic simulation with 123 downstream
incidents, 177 midblock incidents, and 116 upstream incidents were used to develop
and test the algorithm. For the fixed sensor based model, the detection rate reported is 65.9
percent with a false alarm rate of 0 percent for downstream incidents (incidents that occur
downstream of detectors). The mean time to detect reported is 1.56 time periods (11
minutes). The algorithm detected only 6.9 percent of upstream incidents and 1.7
percent of midblock incidents. For mobile sensor based algorithm, the best model
was obtained when the ratio of travel time to historical travel time and the ratio of speed to historical
speed were used. For mobile sensor based algorithm, the detection rate reported is 61.0
percent with a false alarm rate of 0.1 percent for downstream incidents. The mean time
to detect reported is 1.67 time periods (12 minutes). However, this performance was
based on 30 probe reports on each link during each 7-minute period (approximately
4.2 probe reports per minute) or a 20 percent probe penetration rate. The detection rate
reduced to 17.9 percent if there was only one probe vehicle on the link within 7-minute
interval. The algorithm detected 9.5 percent of upstream incident and 2.8 percent of
midblock incidents. The study did not compare the performance of the discriminant
model to other algorithms.
Discriminant analysis and a multi-layer feedforward (MLF) neural network were applied
to fixed sensor and mobile sensor based surface street incident detection by
Ivan (Ivan and Chen 1997). The variables used are the deviation of the volume to occupancy ratio
from the historical volume to occupancy ratio and the deviation of occupancy from
historical occupancy for fixed detectors, and the ratio of travel time to historical travel time and
the ratio of speed to historical speed from probe vehicles. This work was also part of the ADVANCE study,
and the simulation data from the study described above (Sethi et al. 1995) were used
to test these models. The penetration rate of probe vehicles was 20 percent. Four data
sets were used: loop detector data only, probe vehicle data only, data fusion, and
both loop detector and probe vehicle data. The discriminant analysis algorithm
performed best when using both fixed sensor and mobile sensor data. The reported detection
rate is 76.4 percent with a false alarm rate of 0 percent. The MLF neural network also
performed best when using both fixed sensor and mobile sensor data. The reported detection
rate is 87.0 percent with a false alarm rate of 0.1 percent. When a single data
source is used, the loop detector algorithm outperforms the probe vehicle algorithm. The
neural network outperformed discriminant analysis in terms of detection rate.
2.5 Synthesis of the Literature
This section of the dissertation presents a synthesis of the literature reviewed in terms
of the performance of the models, data source, data set used to develop and test the
models, and the characteristics of the models.
2.5.1 Performance of the Freeway Incident Detection Algorithms
For the last three decades research has focused on developing incident detection
algorithms based on several techniques including decision-tree based pattern
recognition technique, time-series based statistical approach, catastrophe theory, and
artificial intelligence based neural network approach. None of the algorithms report
perfect performance for incident detection, i.e. 100 percent detection rate and 0 percent
false alarm rate. Some studies may have reported perfect performance on a particular
data set but not in general. For all algorithms, there is always a trade-off between
detection rate and false alarm rate, that is, a higher detection rate can be achieved at the
expense of a higher false alarm rate. Responses from a recent survey (Abdulhai 1996)
of seven Traffic Management Centers show that a reasonable set of limits on DR and
FAR would be 88 percent and 1.8 percent, respectively. A more stringent set of limits
would be obtained using the extreme value of 100 percent and 0.25 percent,
respectively. For example, a Traffic Management Center monitoring a 50
mile freeway with fixed sensors every half mile, using an incident detection
algorithm with a false alarm rate of 0.25 percent per 30-second interval, would generate 30
false alarms per hour. Very few incident detection algorithms reviewed report false
alarm rates lower than 0.25 percent. The MLF neural network, the PNN, and CPNN
models have been shown to perform best. Furthermore, all incident detection
algorithms reviewed report performance based on only a single data split: the
algorithms are developed on a portion of the incident data set and tested on the
remainder. The true variability of model performance is therefore not known.
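The false alarm arithmetic quoted above can be reproduced with a short calculation. The sketch below is illustrative only; it assumes one alarm decision per detector station per reporting interval, and the function and parameter names are hypothetical rather than taken from any of the studies reviewed.

```python
def false_alarms_per_hour(freeway_miles, spacing_miles, far_percent, interval_s):
    """Expected false alarms per hour for a corridor of fixed sensor stations.

    Assumption (illustrative): every station makes one alarm decision
    per reporting interval, and far_percent applies to each decision.
    """
    stations = freeway_miles / spacing_miles        # 50 / 0.5 = 100 stations
    intervals_per_hour = 3600 / interval_s          # 3600 / 30 = 120 intervals
    decisions_per_hour = stations * intervals_per_hour
    return decisions_per_hour * far_percent / 100.0


if __name__ == "__main__":
    # 100 stations x 120 intervals x 0.0025 = 30 false alarms per hour
    print(false_alarms_per_hour(50, 0.5, 0.25, 30))
```

Under these assumptions, even a false alarm rate that looks small per decision produces a false alarm every two minutes at the corridor level.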
2.5.2 Source of Data for Incident Detection Algorithms
Data sources for incident detection algorithms may be infrastructure based (fixed
sensors), non-infrastructure based (mobile sensors), or both. A review of the literature shows
that fixed sensor based incident detection algorithms consistently provide significantly
better performance than mobile sensor based incident detection algorithms. One
mobile sensor based incident detection algorithm (Petty et al. 1997) developed
on field data shows relatively low performance. Another mobile sensor based
algorithm (Balke et al. 1996), also based on field data, is reported to be less effective than a
fixed sensor based incident detection algorithm. The performance of a mobile sensor
based incident detection model depends on the penetration rate and report interval of
probe vehicles. A freeway incident detection algorithm (Hoeschen 1999) shows that the
combination of fixed sensor and mobile sensor data may help reduce false
alarms. However, the model was developed based on simulated data with a relatively
short report interval. Two other surface street incident detection models (Ivan and
Chen 1997; Sethi et al. 1995) were developed based on simulated data with a 20 percent
penetration rate of probe vehicles. The algorithms relying on both fixed and mobile
sensors provide better performance than algorithms relying on a single source.
2.5.3 Incident Data Sets
The performance of incident detection models is typically tested using data collected
from a traffic simulation model, from the field, or both, representing both incident and
non-incident conditions. The type, duration, and severity of incidents vary across the
different data sets that are reported in the literature. Data sets including simulated
incidents provide a wider range of testing conditions. Although it is best to test
algorithms on field data that include real incidents, such data are often difficult to collect.
There are a handful of data sets available for real incidents on freeways in the US. The
two data sets from California include detector data from the SR-91 freeway for nine
incidents (Cheu and Ritchie 1995) and probe and detector data from the I-880 freeway,
collected by researchers at the University of California, Berkeley, as part of the
Partners for Advanced Transit and Highways (PATH) project, for 656 incidents. From
these data, 45 lane-blocking incidents were used by Abdulhai (Abdulhai and Ritchie
1999) and 63 lane-blocking incidents were used by Teng (Teng et al. 1998) to develop
and test their incident detection models. The I-35W data collected in Minnesota include detector data
for 159 lane-blocking incidents and were used by Abdulhai (Abdulhai and Ritchie
1999). The data collected from the I-4 freeway in Orlando, Florida include detector
data for 130 lane-blocking incidents (Ishak and Al-Deek 1999), and the data from the
I-45 freeway and the US-59 freeway in Houston, Texas include 625 incidents (Balke et al.
1996). Of all these data sets, only the data set from the California PATH project
includes both detector and probe data. The probes in the
PATH study, however, were limited to four or five vehicles at an average headway of seven
minutes.
2.5.4 Characteristics of the Freeway Incident Detection Algorithms
The algorithms discussed each have their own strengths and weaknesses. It is difficult
to compare the performance of the algorithms based on the results presented in the
papers reviewed, mainly because the type of incidents included in each data set
differs, the operating conditions under which the algorithms
were tested vary widely, and, in some cases, the size of the data set differs. Some studies do, however,
report the performance of one model against several others on the same data set. The
characteristics and weaknesses of the algorithms reviewed are summarized here.
For freeway incident detection, the decision-tree based California Algorithm was
developed by Payne (Payne and Tignor 1978). The algorithm detects discontinuity of
upstream and downstream occupancy. An incident is declared if measures calculated
from one-minute average occupancy for upstream and downstream stations exceed
predetermined thresholds. The measures used in the California Algorithm calculated
from upstream occupancy and downstream occupancy are spatial difference in
occupancy, relative spatial difference in occupancies, and relative temporal difference
in downstream occupancy. The California Algorithm relies on location specific
thresholds. An exponential smoothing algorithm for incident detection was developed
by Cook (Cook and Cleveland 1974) based on occupancy, speed, volume, and energy
computed from speed and volume. The algorithm provides a high detection rate but at
the expense of a high false alarm rate. The Standard Normal Deviate (SND) model
developed by Dudek (Dudek et al. 1974) is based on the assumption that an incident
results in a high rate of change in lane occupancy and energy computed from volume
and speed measurement. The thresholds for the Standard Normal Deviate Algorithm
are calibrated based on historical data. This reduces the algorithm's ability to adapt to
traffic variations, unless the thresholds are re-calibrated regularly. The Bayesian
algorithm (Levin and Krause 1978) assumes that the normal traffic flow follows its
historic trend and any deviation from this trend that exceeds a certain threshold
indicates an incident. The variable used is the ratio of the difference between
one-minute upstream and downstream occupancies to the upstream occupancy. The
algorithm takes several time intervals to obtain posterior probability for incident
detection. This increases the mean time to detect. The HIOCC Algorithm (Collins et
al. 1979) identifies the presence of stationary or slow moving vehicles over detectors
based on detector data. The one-second occupancy of a detector is obtained by
scanning the detector every 1/10 second to determine whether it is occupied.
The algorithm relies on detecting sharp changes in occupancy and therefore is unable
to distinguish between queue build-up due to an incident and recurring congestion. The
PATREG Algorithm (Collins et al. 1979) monitors the traffic speed in each lane
between pairs of detector stations. The algorithm relies on significant changes in speed
and performs well only when the flow rate is very low. As the flow rate increases, lane changing
degrades its performance, and it fails to detect incidents at high flow rates.
The time series algorithm (Ahmed and Cook 1982)
assumes that traffic flow can be modeled from historical, time-varying traffic data, and it
compares observed traffic measures such as occupancy against short-term predicted
traffic measures. A significant deviation between observed and predicted values of traffic
measures leads to an incident alarm. The day-to-day variation of traffic conditions
affects the performance of this algorithm. The low-pass filtering algorithm
(Stephanedes and Chassiakos 1993) filters the raw traffic data before an incident
detection algorithm is applied. This helps to reduce false alarms due to short-term
traffic fluctuations. The variables used are 30-second average upstream occupancy,
downstream occupancy, and spatial occupancy difference. The low-pass filtering
algorithm, like the California Algorithm, depends on location specific thresholds. The
CUSUM algorithm (Teng et al. 1998) assumes that the change in traffic processes can
be distinguished by the difference between the current cumulative sum of the log-
likelihood ratio and its minimum value up to the current time period. The variables are
upstream and downstream occupancies. The algorithm achieves a high detection rate
only at a high false alarm rate. Artificial intelligence based, non-linear
pattern recognition techniques have also been applied to freeway incident detection.
Cheu (Cheu and Ritchie 1995) first applied the multilayer feedforward (MLF) neural
network to develop an incident detection algorithm. The 16 input variables include
30-second average occupancy and average volume data for up to four previous time
intervals for the upstream station and up to two previous time intervals for the downstream
station. The MLF neural network has been shown to perform well. Re-training of the
algorithm for a new location takes considerable time. The Bayesian-based probabilistic
neural network (PNN), which uses
statistical distance instead of Euclidean distance as a measure of the nearness of the weighted
vector to the different patterns, was applied to freeway incident detection by Abdulhai
(Abdulhai and Ritchie 1999). The variables used for the PNN model were the same as
those of the MLF model (Cheu and Ritchie 1995). The PNN algorithm performs as well as the
MLF neural network. However, it re-trains faster than the MLF neural network model
and has also been shown to perform well across locations following incremental
learning. A more recent model, the constructive probabilistic neural network (CPNN)
model (Jin et al. 2002), is structured on a Gaussian mixture model and trained by a
dynamic decay adjustment algorithm. The 24 input variables include 30-second
average speed, volume, and occupancy for up to four previous time intervals for the upstream
station and up to two previous time intervals for the downstream station. The model was
developed and tested on simulated data from Singapore and on the I-880 freeway data from
California, the same data set used in a previous study (Abdulhai and Ritchie 1999). The
CPNN model has been shown to have a high false alarm rate.
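To make the California Algorithm measures described earlier in this section concrete, the following is a minimal sketch of the three occupancy measures and a simplified threshold test. It is not the published decision tree. The two-argument temporal comparison, the rule that all three measures must exceed their thresholds, and the threshold values themselves are assumptions made only for illustration.

```python
def california_measures(up_occ, down_occ, down_occ_prev):
    """Return the three occupancy measures (occupancies in percent).

    up_occ        : current one-minute upstream occupancy
    down_occ      : current one-minute downstream occupancy
    down_occ_prev : downstream occupancy from an earlier interval
                    (how far back is an assumption of this sketch)
    """
    spatial_diff = up_occ - down_occ
    rel_spatial_diff = spatial_diff / up_occ if up_occ > 0 else 0.0
    rel_temporal_diff = (
        (down_occ_prev - down_occ) / down_occ_prev if down_occ_prev > 0 else 0.0
    )
    return spatial_diff, rel_spatial_diff, rel_temporal_diff


def incident_declared(measures, thresholds):
    """Simplified rule: declare an incident only if every measure exceeds its threshold."""
    return all(m > t for m, t in zip(measures, thresholds))


# Hypothetical, location specific thresholds (not calibrated values).
HYPOTHETICAL_THRESHOLDS = (8.0, 0.5, 0.15)

if __name__ == "__main__":
    measures = california_measures(up_occ=30.0, down_occ=10.0, down_occ_prev=20.0)
    print(measures, incident_declared(measures, HYPOTHETICAL_THRESHOLDS))
```

Because the thresholds are fixed constants, the sketch also illustrates why the algorithm must be recalibrated for each location.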
The comprehensive review of the literature presented demonstrates that very few
incident detection algorithms report a low false alarm rate with a high detection rate.
Unless the performance of fixed sensor based algorithms is improved significantly, they
may not be implemented or accepted for use in a Traffic Management Center (TMC). In addition,
algorithms may not achieve acceptable performance at a new site without recalibration,
incremental training (Abdulhai 1996), or adaptive training (Jin et al. 2002).
3. Data Collection
As mentioned in Chapter 2, both simulated and field data for traffic measures
under incident and non-incident conditions have been used to develop and evaluate
incident detection models. Very few data sets are available on real incident
and non-incident conditions because data collection is difficult to coordinate. As
part of a research project funded by the Colorado Department of Transportation
(CDOT), incident data were collected for a network in Colorado. Another data set,
collected by the California PATH program, has also been used to develop and test several
incident detection algorithms. Both the Colorado data and the California
PATH data are used for this research.
3.1 Data from Colorado
To collect data for this study, a 9.7 mile section of the northbound Interstate 25 (I-25) freeway
between County Line Road and Colorado Boulevard (Figure 3.1) was
selected. It consists of four lanes and one auxiliary lane between E. County Line
Road and E. Dry Creek Road, and three lanes and one auxiliary lane between
E. Dry Creek Road and S. Colorado Boulevard. It includes 12 on-ramps at: (1) E.
County Line Road, (2) E. Dry Creek Road, (3) WB Arapahoe Road, (4) EB Arapahoe
Road, (5) E. Orchard Road, (6) E. Belleview Avenue, (7) I-225 Interchange, (8) E.
Hampden Avenue, (9) E. Yale Avenue, (10) E. Evans Avenue, (11) SB Colorado
Boulevard, (12) NB Colorado Boulevard, and 9 off-ramps at: (1) E. Dry Creek Road,
(2) E. Arapahoe Road, (3) E. Orchard Road, (4) E. Belleview Avenue, (5) I-225
Interchange, (6) E. Hampden Avenue, (7) E. Yale Avenue, (8) E. Evans Avenue, (9) S.
Colorado Boulevard.
A data collection effort was coordinated with the Colorado Department of
Transportation (CDOT), the Regional Transportation District (RTD) and news media outlets
that collect incident data. The sources provided data for traffic measures from fixed
sensors and mobile sensors. The next few sections present details on both types of
sensors and the data collection process.
Figure 3.1. Schematic of the test network and detector locations
3.1.1 Traffic Measures from Fixed Sensors
In 1984, the Colorado Department of Transportation (CDOT) installed loop detectors
in the pavement to implement traffic responsive ramp metering within a section of the
northbound I-25 freeway. As part of this system, detectors were installed on the
mainline and on-ramps of the freeway. Figure 3.2 shows a typical configuration of the
detectors for ramp metering. The mainline detectors provide volume, occupancy, and
speed trap data by scanning 60 times per second. The data are aggregated every minute to generate
volume (vph), percent occupancy (%), and speed by lane (JHK and Associates 1990).
Figure 3.2. Typical detector configuration for ramp metering
For the test network, 11 of the 12 on-ramps have detectors. The distance
between detector locations ranges from 0.13 miles to 2.20 miles. Table 3.1 shows the
distance between detector locations for all the on-ramps in the network. One-minute
detector data were collected for this study. The data include percent occupancy,
volume, and speed for each lane. These data were obtained from CDOT from
6:00 AM to 6:00 PM for five weeks from April 24, 2001 to May 25, 2001 and for another
nine weeks from September 24, 2001 to November 23, 2001.
Table 3.1. Distance between detector locations
On-ramp Location                 Distance (mile)
From            To
County Line     Dry Creek        1.00
Dry Creek       Arapahoe SE      1.11
Arapahoe SE     Arapahoe NE      0.13
Arapahoe NE     Orchard          1.56
Orchard         Belleview        1.11
Belleview       Hampden          2.20
Hampden         Yale             1.06
Yale            Evans            0.91
Evans           Colorado NE      0.47
Colorado NE     Colorado NW      0.18
3.1.2 Traffic Measures from Mobile Sensors
The Regional Transportation District, Colorado's transit agency, installed an
Automatic Vehicle Location (AVL) system in 1993 to develop more efficient transit
schedules, to improve the agency's on-street operations, and to increase safety through
better management (Castle Rock Consultants 1998).
Denver's AVL system components are shown in Figure 3.3. Each vehicle in the RTD
fleet is equipped with an Intelligent Vehicle Login Unit (IVLU) and a global
positioning system (GPS) receiver capable of real-time differential correction. Because a
GPS receiver's signal may degrade due to obstructions in urban environments, the
Denver AVL system integrates the GPS with inertial, or dead-reckoning (DR),
sensors. The location accuracy of this type of GPS receiver is 1-2 meters. However,
specific information on the accuracy of Denver's integrated GPS-DR system is not
available. The fleet consists of 1,335 vehicles, including 935 fixed route
buses. Bus location data are available every two minutes through this AVL system and
were collected for this study.
Normally, 18 bus routes travel on the northbound I-25 freeway in the test network.
During the morning peak period, the average bus flow rate is 14 buses per hour
between County Line Road and the I-225 Interchange and 23 buses per hour
between the I-225 Interchange and Colorado Boulevard. During the afternoon peak
period, the average bus flow rate is 4 buses per hour between County Line Road
and the I-225 Interchange and 11 buses per hour between the I-225 Interchange and
Colorado Boulevard. More buses operate on the northbound I-25 section of the test
network during the morning peak period than during the afternoon peak period. The
AVL data, available at 2-minute report intervals, include a unique bus identification
(ID), bus route, time stamp, and coordinates of the bus location in the NAD 27 State Plane
coordinate system. The data were obtained from the Regional Transportation District
(RTD) from 6:00 AM to 6:00 PM for 12 weeks from April 24, 2001 to July 15, 2001
and for 11 weeks from September 24, 2001 to December 7, 2001.
[Figure 3.3 diagram labels: GPS correction data; differentially corrected location, vehicle, route, and message data; voice communications; dispatch center; field supervisor and maintenance]
Figure 3.3. Denver's AVL system
3.1.2.1 Post Processing Data from Mobile Sensors
Each AVL data file provided by the RTD contains bus location data for every day of the
week for the entire Denver metro area. A Fortran routine was written to extract the time
stamped bus location data by date. ArcView, a Geographic Information System (GIS) software package,
and Avenue, ArcView's scripting language, were used to extract the data for
all northbound buses within a buffer zone around the test network. Network Analyst,
another ArcView tool, was used to estimate the distance traveled along the freeway
between two consecutive AVL reports. Figure 3.4 shows the ArcView
display of the bus AVL data.
Figure 3.4. GIS software used to display and extract bus AVL data on
the northbound I-25 freeway
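The post-processing described above yields, for each bus, a sequence of time stamped positions along the freeway. The sketch below shows one way such records could be turned into segment speeds. It is not the Fortran or Avenue code used in this study. The record layout is hypothetical, and the along-freeway position in miles is assumed to be supplied by an earlier step standing in for the Network Analyst distance estimate.

```python
from collections import defaultdict


def segment_speeds(reports):
    """Estimate speed (mph) between consecutive AVL reports of each bus.

    reports: iterable of (bus_id, time_s, position_miles), where
    position_miles is a hypothetical along-freeway position computed
    elsewhere. Returns a list of (bus_id, end_time_s, speed_mph).
    """
    by_bus = defaultdict(list)
    for bus_id, time_s, position_miles in reports:
        by_bus[bus_id].append((time_s, position_miles))

    speeds = []
    for bus_id, track in by_bus.items():
        track.sort()  # order each bus's reports by time stamp
        for (t0, p0), (t1, p1) in zip(track, track[1:]):
            dt_hours = (t1 - t0) / 3600.0
            if dt_hours <= 0:
                continue  # skip duplicate time stamps
            speeds.append((bus_id, t1, (p1 - p0) / dt_hours))
    return speeds


if __name__ == "__main__":
    # Two-minute report interval, as in the AVL data described above.
    demo = [("B1", 0, 0.0), ("B1", 120, 2.0), ("B1", 240, 3.0)]
    for row in segment_speeds(demo):
        print(row)
```

With a two-minute report interval, each estimate is an average over the distance covered between reports, so short disturbances within the interval are smoothed out.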
3.1.3 Incident Data
Several news channels in the Denver metropolitan area report road conditions for major
roadways. A local company, Premiere Traffic Network (PTN), collects information on
roadway conditions by listening to police and emergency vehicle service radio
dispatch, calling police stations, receiving reports from airborne reporters, and monitoring live
video feeds in its operations center from cameras at several locations. A local radio
station, 850 KOA, posts this information on the World Wide Web. However,
this information is not saved or archived in a database.
The information posted includes the location of each incident, the approximate time the incident
started, and a brief description of the incident. These data were collected in two periods,
from April 24, 2001 to May 25, 2001 (five weeks) and from September 24, 2001 to
November 23, 2001 (nine weeks), during the morning peak period (6:00-9:00 AM) and
the afternoon peak period (3:00-6:00 PM), to coincide with the detector data collection
described above. The incident data were collected from the web site. A computer
program and a screen capture program were used to capture the web site's display
every 30 seconds to record the incident information posted. The captured web display was
saved in an AVI file format and later replayed to collect the incident data. The
actual start and end times of each incident were examined by plotting the occupancy
upstream of the incident location. The actual start time is taken as
the time at which the deviation of upstream occupancy from historical occupancy
rises above 5 percent. Similarly, when the deviation of upstream occupancy from historical
occupancy falls below 5 percent, the incident is determined to have ended. For the I-25
network, the database includes 58 incidents. The detectors malfunctioned during 20 of these
incidents. Half of the remaining 38 incidents were randomly selected as the training set
and the other half as the test set.
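The start and end time rule and the random split described above can be sketched as follows. This is an illustrative reading, not the procedure actually coded for this study. It assumes the deviation is measured in percentage points of occupancy (observed minus historical), and the function names and the fixed random seed are hypothetical.

```python
import random


def incident_window(times, upstream_occ, historical_occ, threshold=5.0):
    """Return (start, end) times based on upstream occupancy deviation.

    Start: first time the deviation (observed minus historical, in
    percentage points; an assumption of this sketch) exceeds threshold.
    End: first later time the deviation falls below threshold.
    Either value is None if the condition is never met.
    """
    start = end = None
    for t, occ, hist in zip(times, upstream_occ, historical_occ):
        deviation = occ - hist
        if start is None:
            if deviation > threshold:
                start = t
        elif deviation < threshold:
            end = t
            break
    return start, end


def split_half(incident_ids, seed=0):
    """Randomly split incidents into equal training and test halves."""
    ids = list(incident_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


if __name__ == "__main__":
    minutes = [0, 1, 2, 3, 4, 5]
    observed = [10, 11, 18, 20, 17, 12]
    historical = [10, 10, 10, 10, 10, 10]
    print(incident_window(minutes, observed, historical))  # (2, 5)
    train, test = split_half(range(38))
    print(len(train), len(test))  # 19 19
```

Fixing the seed makes a single split reproducible; repeating the split with different seeds would give some indication of how much performance varies with the choice of training incidents.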
3.2 Data from California
The I-880 data were collected as part of a research project in California (Petty et al.
1995). The study section of the I-880 freeway in Hayward, California is 9.2 miles long
and varies from 3 to 5 lanes (Figure 3.5). An HOV lane covers approximately 3.5
miles of the section. The data were collected in two periods: before the Freeway Service
Patrol (FSP) was in operation, from February 16 through March 19, 1993, and while the
FSP was in operation, from September 27 through October 29, 1993. Data collected for this
project include loop detector data, probe vehicle data, and incident data. However,
only loop detector data and incident data are used in this research.
Figure 3.5. The PATH project study section
3.2.1 Traffic Measures from Fixed Sensors
The northbound section of the I-880 freeway includes 18 detector stations and the
southbound section includes 17 detector stations. The spacing between detector
stations ranges from 0.19 miles to 0.55 miles with an average of 0.33 miles. The loop
detector data collected were processed by a program developed at the University of
California, Berkeley to report average flow rate, average speed, and average
occupancy per period. Some detectors malfunctioned periodically and/or counted
vehicles incorrectly. Two fixes (Petty et al. 1995) were applied to the loop detector
data, one for missing data and one for inconsistent data. Missing data were recreated from the
adjacent upstream loop detector and/or the average of the two adjacent loop detectors. A
consistency fix corrected systematic errors in the loop data such as over or under
counting. If the average vehicle accumulation per minute over a long period exceeds a
threshold, then flow is estimated using correction factors. The correction factors are
computed as a fraction of the nearest mainline flow.
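The missing data fix described above can be illustrated with a small sketch. It represents one possible reading of the rule and is not the Berkeley program itself. The order of preference (average of the two adjacent detectors when both are available, otherwise the upstream detector alone) is an assumption, as is the representation of a missing value as `None`.

```python
def fill_missing(values):
    """Fill missing detector values along a freeway section.

    values: measurements ordered from upstream to downstream for one
    time period, with None marking a missing value. Only original
    (not previously filled) neighbors are used. Assumed preference:
    average of both neighbors, else the upstream neighbor; otherwise
    the value stays missing.
    """
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        up = values[i - 1] if i > 0 else None
        down = values[i + 1] if i + 1 < len(values) else None
        if up is not None and down is not None:
            filled[i] = (up + down) / 2.0
        elif up is not None:
            filled[i] = up
    return filled


if __name__ == "__main__":
    print(fill_missing([10.0, None, 14.0, None]))  # [10.0, 12.0, 14.0, 14.0]
```

A value is left missing when no upstream neighbor is available, which keeps the sketch from inventing data at the upstream end of the section.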
3.2.2 Incident Data
Incident data were collected by drivers of probe vehicles. When drivers passed an
incident on the freeway, they informed their command center and
recorded their positions on an on-board portable computer.
Accuracy issues of the incident database are reported in a paper (Petty et al. 1995). The
location of the incidents was corrected by correlating the location
recorded in the incident database at the command center with the location recorded in
the on-board portable computer. Since the start and end times recorded for the incidents
were based on probe vehicle drivers witnessing an incident, the times recorded were
only approximations. For the I-880 incidents, the start and end times were
determined as described earlier for the I-25 freeway. A data set of 45 lane-blocking
incidents and 660 shoulder incidents was developed from the PATH project data.
4. Characteristics of Lane-blocking and Shoulder Incidents
An incident management system is a component of an advanced traffic management
system (ATMS). The purpose of an incident management system is to detect, verify,
and assess the magnitude of an incident, to identify the appropriate response to restore
a facility to normal operation, and to implement the appropriate response in the form
of traffic control, information, and aid (Carvell 1997).
Incident detection is one of the main functions of an incident management system. An
incident detection algorithm is implemented by a traffic management center to detect
any unexpected event that disrupts traffic flow and causes significant delay to
motorists. Therefore, an incident detection algorithm that detects both lane-blocking
and shoulder incidents is critical to the operation of an ATMS. Several researchers
(Ahmed and Cook 1982; Balke et al. 1996; Cook and Cleveland 1974; Dia et al. 1997;
Hsiao et al. 1993; Payne and Tignor 1978; Persaud and Hall 1989; Petty et al. 1997;
Stephanedes and Chassiakos 1993; Tsai and Case 1979) have developed and tested
incident detection algorithms based on both lane-blocking and shoulder incidents
while others (Abdulhai 1996; Cheu 1994; Ishak and Al-Deek 1999; Teng et al. 1998)
focused on only lane-blocking incidents. If the characteristics of lane-blocking
incidents and shoulder incidents are significantly different, an incident detection model
developed based on only lane-blocking incidents may not perform well in detecting
shoulder incidents.
Incidents may be classified in a number of ways. An incident could be vehicle related
or non-vehicle related. An accident or breakdown is a vehicle related incident.
Accidents may involve a single car or multiple cars. A vehicle breakdown may involve
a fuel leak, a flat tire, a tire change, or a mechanical problem. Non-vehicle related
incidents may include debris on the roadway. Incidents may also be classified based on
where they occur, as lane-blocking or shoulder incidents. In practice, if lane-blocking
incidents are rapidly moved to the shoulder before they are observed, they may be
classified as shoulder incidents. A lane-blocking incident may involve a single lane or
multiple lanes (Petty et al. 1995). Shoulder incidents usually cause rubbernecking,
in which drivers slow down to observe an incident scene as they pass it.
Incidents may also be classified by their severity, that is, by whether the delay they cause is
lower or higher than that of normal recurring congestion.
This chapter presents the characteristics of incidents on two freeway sections, the
I-25 freeway in Colorado and the I-880 freeway in California. The characteristics
investigated include incident rate, average delay, and duration of incidents
by type of incident. Incidents are classified as lane-blocking incidents or
shoulder incidents. Incident rate is estimated as the number of incidents per million
vehicle miles traveled. Incident severity is examined in terms of delay and duration.
An examination of the characteristics of incidents is expected to aid in the development
of incident detection algorithms. If the characteristics of lane-blocking incidents and
shoulder incidents are significantly different, an incident detection algorithm developed
based on only lane-blocking incidents may not perform well in detecting shoulder
incidents. Furthermore, an incident detection algorithm may be transferred or applied
to another location if the characteristics of incidents across locations are not
significantly different. Therefore, in this study the characteristics of incidents are
examined at each location and compared across locations.
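The incident rate defined above, incidents per million vehicle miles traveled (VMT), can be computed as in the sketch below. The helper that derives VMT from volume counts and section length is an assumption made for illustration, and the numbers in the example are hypothetical rather than values from either data set.

```python
def vehicle_miles_traveled(volumes_veh, section_length_miles):
    """VMT for a section: total vehicles counted times section length.

    Assumes (illustratively) that every counted vehicle travels the
    full length of the section.
    """
    return sum(volumes_veh) * section_length_miles


def incident_rate_per_million_vmt(n_incidents, vmt):
    """Number of incidents per million vehicle miles traveled."""
    if vmt <= 0:
        raise ValueError("vehicle miles traveled must be positive")
    return n_incidents / (vmt / 1_000_000)


if __name__ == "__main__":
    # Hypothetical example: 50 incidents over 2,000,000 vehicle miles.
    print(incident_rate_per_million_vmt(50, 2_000_000))  # 25.0
```

Normalizing by VMT allows sections of different length and traffic volume to be compared on the same basis.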
4.1 Characteristics of Incidents
Few studies have examined the characteristics of incidents. A review of the literature is
summarized here.
Lindley (Lindley 1986) investigates the characteristics of incidents based on the data
from the Highway Performance Monitoring System (HPMS) database maintained by
the Federal Highway Administration (FHWA). The HPMS database includes detailed
geometric, traffic and other data for selected roadway sections throughout the US. The
database investigated includes 4,646 sections representing 9,349 miles of urban
interstate and 3,390 sections representing 5,986 miles of urban other freeway and
expressway. Incidents are categorized as lane-blocking or shoulder incidents. As
shown in Figure 4.1, a lane-blocking incident is classified as an accident (one lane,
multi-lane) or a breakdown (one lane, two lanes). Shoulder incidents are classified as
accidents or breakdowns. An analysis of the HPMS data shows that 4 percent of the
incidents are lane-blocking and 96 percent are shoulder incidents (presented in Figure
4.1). Incident rate reported is 200 incidents per million vehicle miles traveled (VMT).
Another study in California collected incident data from the I-10 freeway in Los
Angeles, CA (Skabardonis et al. 1999), including information from probe vehicles,
loop detectors, and incident logs along a 7.8 mile freeway section. The data collected
over 30 days include 1,560 incidents. The incidents are classified into lane-blocking
and shoulder incidents. Lane-blocking incidents are divided into accidents (one lane, multi
lane) and breakdowns (one lane, multi lane). Shoulder incidents are divided into
accidents and breakdowns. About 9.6 percent are lane-blocking incidents and about
90.4 percent are shoulder incidents. The incident rate is 92.8 incidents per million
vehicle miles traveled. The mean response time for incidents is 11.4 minutes and the
mean incident duration is 20.7 minutes.
Incident type            Lindley 1986   I-880 (CA)   I-10 (CA)
Lane-Blocking            4%             3.7%         9.6%
  Accidents              21.3%          59.3%        21.6%
    One Lane             84.6%          74.3%        87.5%
    Multi Lane           15.4%          25.7%        12.5%
  Breakdowns             76.7%          40.7%        78.4%
    One Lane             99.2%          95.8%        97.5%
    Multi Lane           0.8%           4.2%         2.5%
Shoulder                 96%            96.3%        90.4%
  Accidents              4.2%           8.6%         5.4%
  Breakdowns             95.8%          91.4%        94.6%
Percentages reported are from the national study (Lindley 1986), the I-880 freeway in CA
(Petty et al. 1996), and the I-10 freeway in CA (Skabardonis et al. 1999).
Figure 4.1. Types of incidents.
4.2 Characteristics of Incidents on the I-880 Freeway
The data collected from probe vehicles and loop detectors along a 9.2-mile section of
the freeway in Hayward, California include 1,616 incidents. Four or five probe vehicles,
at an average time headway of about 7 minutes, collected data for 10 weeks, from
February to March 1993 and from September to October 1993. In September and
October 1993, the freeway service patrol (FSP) was in operation (Petty et al. 1995).
The I-880 incidents are classified into lane-blocking and shoulder incidents and
further classified as accidents (one lane, multi-lane) and breakdowns (one lane,
multi-lane) (Petty et al. 1996). For the I-880 freeway, 3.7 percent of incidents are
lane-blocking and 96.3 percent are shoulder incidents. Most of the incidents are shoulder
breakdowns (85%). The incident rate is 104 incidents per million vehicle miles
traveled. The mean response time for incidents is 28.9 minutes before the FSP is in
operation and 13.8 minutes when the FSP is in operation. The mean duration of
incidents is 24.7 minutes (Skabardonis et al. 1997).
4.3 Characteristics of Incidents on the I-25 Freeway
Characteristics of incidents in terms of frequency, incident rate, incident duration, and
average delay are investigated for each location, the I-25 freeway network and the
I-880 freeway network (described in detail in Chapter 3), and compared across
locations. Incidents are classified into two types, lane-blocking and shoulder, based on
the location at which the incidents were observed. To compare the characteristics of
incidents in Colorado, data from a 9.2-mile section of the I-25 freeway, 9.8 miles of
the I-225 freeway, and 5.2 miles of 6th Avenue are examined.
The Colorado data show that about 60 to 95.2 percent (average 76.3 percent) of the
incidents are lane-blocking (Figure 4.2). This percentage is substantially higher than
reported in three previous studies (Lindley 1986; Petty et al. 1996; Skabardonis et al.
1999), as shown in Figure 4.1. The I-880 database shows that about 96.3 percent of
total incidents are shoulder incidents, whereas the I-25 NB data show that only 37.5
percent of total incidents are shoulder incidents. Most of the I-880 freeway incidents
are shoulder incidents, and most of these are vehicle breakdowns. Most of the incidents
on the I-25 freeway are lane-blocking incidents, and most of these are accidents.
                  I-25 NB  I-25 SB  I-225 NB  I-225 SB  6th Ave. EB  6th Ave. WB  Average
Lane-Blocking     62.5%    69.5%    60.0%     76.5%     95.2%        93.8%        76.3%
Shoulder          37.5%    30.5%    40.0%     23.5%      4.8%         6.2%        23.8%
Second panel
  Accident        77.5%    56.1%    75.0%     83.3%     95.0%        60.0%        74.5%
  Breakdown       22.5%    43.9%    25.0%     16.7%      5.0%        40.0%        25.5%
Third panel
  Accident        83.3%    27.8%    62.5%     60.0%    100.0%       100.0%        72.3%
  Breakdown       16.7%    72.2%    37.5%     40.0%      0.0%         0.0%        27.7%
Figure 4.2. Incident characteristics in Colorado
There may be a number of reasons for these differences in the percentages of incident
types observed across freeway locations. First, for the I-880 freeway, about half of
the shoulder incidents are reported from call boxes. The I-25 freeway test network
does not have call boxes. Therefore, vehicle breakdowns or minor incidents may not
be reported in Colorado. Second, the I-25 freeway has only right shoulders available,
whereas the I-880 freeway has both right shoulders and a center divide (median). This
may allow lane-blocking incidents on the I-880 freeway to be moved easily to either
the right shoulder or the median. The characteristics of incidents in terms of incident
rate are presented next.
4.3.1 Incident Rate
Incident rate is the number of incidents occurring per vehicle mile traveled (VMT). The
incident rate for the I-880 freeway excludes California Highway Patrol (CHP)
ticketing-related events, since most of these events are citations for violations of
HOV lane usage. Approximately 26 percent of total incidents are CHP ticketing-related
events. These events are excluded and the rest of the incidents are used for comparison.
For the I-25 freeway, the incident rate for lane-blocking incidents (30.6 incidents per
10^7 VMT) is higher than for shoulder incidents (18.4 incidents per 10^7 VMT) (Table
4.1). For the I-880 freeway, the incident rate for shoulder incidents (780.0 incidents
per 10^7 VMT) is much higher than for lane-blocking incidents (37.4 incidents per 10^7
VMT).
Table 4.1. Incident rate for the I-25 and the I-880 freeway
Parameter                             I-25                      I-880
                                      Lane-blocking  Shoulder   Lane-blocking  Shoulder
Incident Rate (incidents/10^7 VMT)    30.6           18.4       37.4           780.0
The chi-square goodness of fit test for a one-way contingency table is performed under
the null hypothesis that the probabilities of lane-blocking and shoulder incidents per
vehicle mile traveled are each equal to 0.5. The likelihood ratio chi-square statistic and
p-value are calculated (Table 4.2). For the I-25 freeway, the probabilities of
lane-blocking and shoulder incidents per vehicle mile traveled are not significantly
different from 0.5 at the 5 percent level of significance (p-value = 0.08). For the I-880
freeway, the probabilities of lane-blocking and shoulder incidents per vehicle mile
traveled are significantly different from 0.5 at the 5 percent level of significance
(p-value < 0.001). The shoulder incident rate is much higher than the lane-blocking
incident rate for the I-880 freeway.
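The likelihood ratio goodness of fit test described above can be sketched in a few lines. This is a minimal illustration, not the procedure used to produce Table 4.2: the function name and the counts are hypothetical, and the p-value shortcut is valid only for a two-cell table (one degree of freedom).

```python
import math

def likelihood_ratio_gof(observed, expected_probs):
    """Likelihood ratio chi-square (G) statistic and p-value for a
    two-cell one-way table (1 degree of freedom)."""
    total = sum(observed)
    g = 0.0
    for count, prob in zip(observed, expected_probs):
        if count > 0:  # an empty cell contributes nothing to the sum
            g += 2.0 * count * math.log(count / (total * prob))
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # For 1 degree of freedom, P(chi-square > g) = erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Hypothetical counts of lane-blocking and shoulder incidents (not thesis data).
g_stat, p_val = likelihood_ratio_gof([30, 18], [0.5, 0.5])
print(round(g_stat, 3), round(p_val, 3))
```

A p-value above 0.05 here would mean the equal-probability null hypothesis is not rejected at the 5 percent level.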
One of the purposes of an incident management system is to identify the appropriate
response to restore a facility to normal operation, and the response may differ for
lane-blocking and shoulder incidents. For the I-25 freeway, the incident rates for
lane-blocking and shoulder incidents are not significantly different. However, they are
significantly different for the I-880 freeway. Therefore, the design and operation of
an incident management system may differ between the two locations.
Table 4.2. Chi-square goodness of fit test for one-way contingency table
Location                                              p-value
I-25 between lane-blocking and shoulder incidents     0.08
I-880 between lane-blocking and shoulder incidents    < 0.001
Across locations
  Lane-blocking incidents                             0.41
  Shoulder incidents                                  < 0.001
  All incidents                                       < 0.001
Across locations, the probabilities of a lane-blocking incident per vehicle mile traveled
for the I-25 freeway and the I-880 freeway are not significantly different from 0.5 at
the 5 percent level of significance (p-value = 0.41), whereas for shoulder incidents they
are significantly different from 0.5 at the 5 percent level of significance (p-value <
0.001). Therefore, an incident detection algorithm that relies on the prior probability
of an incident and is transferred to a new location without recalibration may not
perform well in detecting incidents, especially shoulder incidents.
For all incidents, the incident rate is 49 incidents per 10^7 VMT for the I-25 freeway
and 817.4 incidents per 10^7 VMT for the I-880 freeway. The incident rates for both
freeways are much lower than the incident rate (2,000 incidents per 10^7 VMT) reported
by Lindley (1986). The large difference in incident rate may be due to the manner in
which incident data were collected at each location.
4.3.2 Incident Duration
For a given type of incident, longer incidents have a greater impact on traffic
flow. The duration of incidents is also important for a mobile sensor based incident
detection algorithm. The traffic measures estimated from mobile sensors represent the
spatial variation of traffic conditions for a specific time interval. The duration of
incidents and the penetration rate of probe vehicles affect the availability of probe
reports during an incident.
For the I-25 freeway, the mean duration of lane-blocking incidents (43.1 minutes) is
higher than that of shoulder incidents (29.9 minutes) (Table 4.3). A pooled t-test shows
that the mean incident durations of lane-blocking and shoulder incidents are not
significantly different at the 5 percent level of significance (p-value = 0.11) (Table
4.4). A Kolmogorov-Smirnov (K-S) test also shows that the distributions of incident
duration for lane-blocking and shoulder incidents are not significantly different at the
5 percent level of significance (p-value = 0.10).
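The two test statistics used in this comparison can be sketched as follows. The sketch computes only the statistics (the pooled t statistic with its degrees of freedom, and the two-sample K-S statistic); p-values would come from the corresponding reference distributions. The function names and the duration values are hypothetical and are not the thesis data.

```python
import math

def pooled_t_statistic(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    dof = na + nb - 2
    pooled_var = (ss_a + ss_b) / dof
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, dof

def ks_statistic(a, b):
    """Largest vertical gap between the two empirical distribution functions."""
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(1 for v in a if v <= x) / len(a)
        cdf_b = sum(1 for v in b if v <= x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Hypothetical incident durations in minutes.
lane_blocking = [12.0, 25.0, 40.0, 55.0, 83.0]
shoulder = [8.0, 15.0, 22.0, 30.0, 41.0, 64.0]
t_stat, dof = pooled_t_statistic(lane_blocking, shoulder)
d_stat = ks_statistic(lane_blocking, shoulder)
print(round(t_stat, 3), dof, round(d_stat, 3))
```

The t-test compares means only, while the K-S statistic responds to any difference in the shape of the two distributions, which is why the two tests can disagree.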
Table 4.3. Duration of incidents for the I-25 and the I-880 freeway
Parameter                                 I-25                      I-880
                                          Lane-blocking  Shoulder   Lane-blocking  Shoulder
Mean Duration (minutes)                   43.1           29.9       53.9           62.9
Mean Duration (minutes) (all incidents)   38.3                      62.3
Table 4.4. Pooled t-test and Kolmogorov-Smirnov test for duration of incidents
Location                                              Pooled t-test p-value   K-S test p-value
I-25 between lane-blocking and shoulder incidents     0.11                    0.10
I-880 between lane-blocking and shoulder incidents    0.09                    0.01
Across locations
  Lane-blocking incidents                             0.12                    0.10
  Shoulder incidents                                  0.003                   < 0.01
  All incidents                                       0.005                   < 0.005
For the I-880 freeway, the mean duration is 53.9 minutes for lane-blocking incidents
and 62.9 minutes for shoulder incidents. A pooled t-test shows that the mean incident
durations for lane-blocking and shoulder incidents on the I-880 freeway are not
significantly different at the 5 percent level of significance (p-value = 0.09), but a K-S
test shows that the distributions of incident duration for lane-blocking and shoulder
incidents on the I-880 freeway are significantly different at the 5 percent level of
significance (p-value = 0.01).
Across locations, the mean duration of all incidents on the I-880 freeway is 62.3
minutes, about 1.6 times the mean duration of incidents on the I-25 freeway. This may
be due to the longer response time for incidents on the I-880 freeway. The mean and
the distribution of incident duration are not significantly different across locations
for lane-blocking incidents but are significantly different for shoulder incidents at the
5 percent level of significance.
4.3.3 Average Delay
Lane-blocking incidents usually cause a reduction in the capacity of a roadway. Shoulder
incidents also cause a reduction in capacity due to rubbernecking, the tendency of
drivers to slow down and observe an incident scene as they pass it. The impact of
incidents may be expressed as the average delay per vehicle affected by an incident.
The average delay while a queue is present due to an incident can be calculated as follows:
Figure 4.3. Delay due to an incident
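$$\bar{d}_q = \frac{\text{Total Delay}}{\text{Total Number of Arrivals While Queue Present}} \qquad (4.1)$$
$$\bar{d}_q = \frac{\text{Area of Triangle } abe}{\lambda t_q} \qquad (4.2)$$
$$\bar{d}_q = \frac{\text{Area of } [(\text{Triangle } ace) - (\text{Triangle } acd) - (\text{Triangle } bde)]}{\lambda t_q} \qquad (4.3)$$
$$\bar{d}_q = \frac{\tfrac{1}{2}\, t_r (\lambda - \mu_r)\, t_q}{\lambda t_q} \qquad (4.4)$$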
Therefore, the average delay while a queue is present is:
$$\bar{d}_q = \frac{1}{2}\,\frac{t_r(\lambda - \mu_r)}{\lambda} \qquad (4.5)$$
where,
$\bar{d}_q$ = Average delay while queue present
$t_q$ = Time duration in queue
$t_r$ = Incident duration
$\lambda$ = Mean arrival rate
$\mu_r$ = Mean service rate during incident
$\mu$ = Mean service rate
The historical flow rate under non-incident conditions, or the flow rate from several
upstream and ramp detectors, may be used to estimate the mean arrival rate
($\lambda$). The flow rate from upstream detectors during an incident may be used as the
service flow rate during incident conditions ($\mu_r$). The flow rate from upstream
detectors after an incident has ended may be used as the service flow rate ($\mu$).
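As a worked illustration of Eq. (4.5), the sketch below evaluates the average delay for one incident. The function name and the flow values are hypothetical, and returning zero when the service rate during the incident is at least the arrival rate (no queue forms) is an assumption of this sketch rather than a rule stated in the text.

```python
def average_delay_while_queue_present(t_r, arrival_rate, service_rate_incident):
    """Average delay per vehicle while a queue is present, per Eq. (4.5).

    t_r                   -- incident duration (minutes)
    arrival_rate          -- mean arrival rate (vehicles per minute)
    service_rate_incident -- mean service rate during the incident
    """
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    if service_rate_incident >= arrival_rate:
        return 0.0  # assumption: no queue forms, so no queueing delay
    return 0.5 * t_r * (arrival_rate - service_rate_incident) / arrival_rate

# Hypothetical 30-minute incident: 100 veh/min arriving, 60 veh/min served.
delay = average_delay_while_queue_present(30.0, 100.0, 60.0)
print(delay)  # minutes per vehicle
```

Note that the service rate after the incident does not appear in Eq. (4.5); it affects how long the queue lasts but cancels out of the average delay.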
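The weighted average delay at each location is calculated as:
$$\bar{d}_{qw} = \frac{\sum_{i=1}^{n} \bar{d}_{q,i}\, N_i}{\sum_{i=1}^{n} N_i} \qquad (4.6)$$
where,
$\bar{d}_{qw}$ = Weighted average delay
$\bar{d}_{q,i}$ = Average delay while queue present for incident i
$N_i$ = Total number of arrivals while queue present for incident i
n = Total number of incidents at each location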
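The weighting in Eq. (4.6) can be illustrated with a short sketch. The function name and the per-incident values are hypothetical; each incident's average delay is weighted by the number of vehicles that arrived while its queue was present.

```python
def weighted_average_delay(delays, arrivals):
    """Arrival-weighted mean of per-incident average delays, per Eq. (4.6)."""
    total_arrivals = sum(arrivals)
    if total_arrivals == 0:
        return 0.0  # assumption: no affected vehicles means no delay to report
    weighted_sum = sum(d * n for d, n in zip(delays, arrivals))
    return weighted_sum / total_arrivals

# Hypothetical incidents: (average delay in min/veh, arrivals while queue present).
delays = [6.0, 2.0, 0.0]
arrivals = [300, 100, 0]
d_qw = weighted_average_delay(delays, arrivals)
print(d_qw)
```

Incidents that affect more vehicles dominate the result, and an incident with no queue contributes nothing to either sum.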
For the I-25 freeway, the weighted average delay is 7.10 minutes per vehicle for
lane-blocking incidents and 3.65 minutes per vehicle for shoulder incidents (Table 4.5).
Lane-blocking incidents cause higher delay to each affected vehicle than shoulder
incidents. The maximum and minimum average delay per vehicle per incident are 17.32
and 0 minutes for lane-blocking incidents and 9.48 and 0 minutes for shoulder
incidents. This shows that some shoulder incidents may cause higher average delay than
some lane-blocking incidents. Therefore, shoulder incidents may also cause significant
delays and are also important to detect. For all incidents, the weighted average delay
is 6.07 minutes per vehicle. A pooled t-test shows that the average delays due to
lane-blocking and shoulder incidents are not significantly different at the 5 percent
level of significance (p-value = 0.10) (Table 4.6). The Kolmogorov-Smirnov (K-S) test
also shows that the distributions of average delay due to lane-blocking and shoulder
incidents are not significantly different at the 5 percent level of significance
(p-value = 0.10).
Table 4.5. Average delay due to incidents on the I-25 and the I-880 freeway
Parameter                                          I-25                      I-880
                                                   Lane-blocking  Shoulder   Lane-blocking  Shoulder
Weighted Average Delay (min/veh)                   7.10           3.65       2.72           1.84
Max. Average Delay (min/veh)                       17.32          9.48       12.27          25.68
Min. Average Delay (min/veh)                       0              0          0              0
Weighted Average Delay (min/veh) (all incidents)   6.07                      1.89
Table 4.6. Pooled t-test and Kolmogorov-Smirnov test for average delay
Location                                     Pooled t-test p-value   K-S test p-value
I-25 between lane-blocking and shoulder      0.10                    0.10
I-880 between lane-blocking and shoulder     0.016                   0.10
Across locations
  Lane-blocking                              0.008                   < 0.025
  Shoulder                                   0.027                   < 0.01
  All incidents                              < 0.001                 < 0.001
For the I-880 freeway, the weighted average delay is 2.72 and 1.84 minutes per vehicle
for lane-blocking and shoulder incidents, respectively. As on the I-25 freeway,
lane-blocking incidents cause higher delay to each affected vehicle than shoulder
incidents. The maximum average delay for shoulder incidents (25.68 minutes per
vehicle) is higher than for lane-blocking incidents (12.27 minutes per vehicle). As on
the I-25 freeway, some shoulder incidents may cause higher average delay than
lane-blocking incidents. Therefore, an incident detection algorithm that performs
well in detecting both lane-blocking and shoulder incidents is desired. A pooled t-test
shows that the average delays for lane-blocking and shoulder incidents are significantly
different at the 5 percent level of significance (p-value = 0.016). A K-S test shows that
the distributions of average delay for lane-blocking and shoulder incidents are not
significantly different at the 5 percent level of significance (p-value = 0.10).
The weighted average delay for all incidents is 1.89 minutes per vehicle on the I-880
freeway and 6.07 minutes per vehicle on the I-25 freeway. Drivers on the I-25 freeway
experience more than 3 times the delay due to an incident experienced by drivers on
the I-880 freeway. One reason may be that the I-880 freeway has higher capacity (5
lanes) than the I-25 freeway (3 lanes). For a 3-lane freeway, an incident blocking one
lane causes a 47 percent capacity reduction and an incident blocking two lanes causes
a 78 percent reduction. For a 5-lane freeway, an incident blocking one lane causes a 25
percent capacity reduction and an incident blocking two lanes causes a 50 percent
reduction (Blumentritt 1981).
Across locations, the average delay and the distribution of average delay per vehicle
per incident for the I-25 freeway and the I-880 freeway are significantly different for
both lane-blocking and shoulder incidents. This shows that the characteristics of
incidents in terms of their impact are significantly different between the two
locations. Therefore, an incident detection algorithm developed for a specific location
may not perform well if it is transferred to another location without recalibration,
since the impact of incidents is different.
For the I-25 freeway, the incident rate, duration of incidents, distribution of duration
of incidents, average delay, and distribution of average delay for lane-blocking and
shoulder incidents are not significantly different at the 5 percent level of
significance. For the I-880 freeway, the incident rate, distribution of duration of
incidents, and average delay are significantly different at the 5 percent level of
significance for lane-blocking and shoulder incidents. Across locations, the incident
rate, duration of incidents, distribution of duration of incidents, average delay, and
distribution of average delay are significantly different at the 5 percent level of
significance for shoulder incidents and for all incidents. The difference in duration of
incidents may be due to the difference in response time between the two locations. The
difference in average delay across locations may be due to differences in the severity
of the incidents themselves and in freeway capacity.
5. Methodology
As outlined in the review of the literature in Chapter 2, several techniques, including
the decision-tree based pattern recognition technique, the time-series based statistical
approach, catastrophe theory, and the artificial intelligence based neural network
approach, have been applied to incident detection. To date, neural network based
incident detection algorithms have been shown to perform best in terms of detection
rate and false alarm rate. The probabilistic neural network (PNN) has been shown to
train faster than the multilayer feedforward (MLF) neural network. A modified form of
PNN, with a principal component transformation of the inputs, also performs well
(Abdulhai and Ritchie 1999). In a limited study using data collected from a traffic
simulation, a generalized linear model (GLZ) was applied to freeway incident
detection (Hoeschen 1999) and performed as well as the MLF models.
A generalized linear model differs from a general linear model in that the distribution
of the dependent or response variable may be explicitly non-normal and may be
categorical. The dependent variable is predicted from a linear combination of predictor
variables through a link function. For a GLZ, the functions of the predictor variables
are estimated parametrically.
The generalized linear model may be expressed as:
$$g(\mu) = \eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p \qquad (5.1)$$
where,
$\mu$ = mean of the response variable
$g(\mu) = \eta$ = a link function that links the random component, $\mu$, to the systematic
component ($\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$). For a Bernoulli
distribution, the link function may be the logit. For the logit link function,
$g(\mu) = \log\left(\frac{\mu}{1-\mu}\right)$.
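The role of the logit link in Eq. (5.1) can be illustrated with a minimal sketch. The function names, coefficients, and measurements below are hypothetical and are not estimates from this research; the sketch only shows how a linear predictor is mapped to a probability and back.

```python
import math

def logit(mu):
    """Logit link: maps a probability in (0, 1) to the linear predictor scale."""
    return math.log(mu / (1.0 - mu))

def inverse_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def incident_probability(beta0, betas, xs):
    """Probability implied by the linear predictor beta0 + sum(beta_k * x_k)."""
    eta = beta0 + sum(b * x for b, x in zip(betas, xs))
    return inverse_logit(eta)

# Hypothetical coefficients for occupancy (percent) and speed (mph).
prob = incident_probability(-4.0, [0.10, -0.02], [35.0, 50.0])
print(round(prob, 4))
```

Whatever value the linear predictor takes, the inverse link keeps the predicted probability strictly between 0 and 1, which is why the logit suits a binary incident or non-incident response.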
The generalized additive model (GAM) is a further generalization of the generalized
linear model (GLZ). The generalized additive model is a nonparametric regression and
smoothing technique that relaxes the assumption of linearity and uncovers the
structure in the relationship between the independent variables and the dependent
variable. Generalized additive models combine an additive assumption, which
enables relatively many nonparametric relationships to be explored simultaneously, with
the distributional flexibility of generalized linear models. Other nonparametric
regression methods do not perform well when the number of independent variables
is large, because the variances of the estimates become unacceptably large, a problem
often referred to as the curse of dimensionality. The generalized additive model
overcomes this problem since each of the individual terms is estimated using a
univariate smoother, and the estimates of the terms explain how the response changes
with the corresponding independent variables.
A generalized additive model is appropriate where the dependent variable is not
normally distributed or is categorical, and where the relationship between the variables
is expected to be of a complex form that may not easily be modeled as a linear or
non-linear model. The appropriate functional form may be suggested, or the structure
of the relationship between the independent variables and the dependent variable may
be explored more thoroughly, using generalized additive models.
5.1 Generalized Additive Model (GAM)
To develop an incident detection model, the generalized additive modeling technique is
applied. A generalized additive model (Hastie and Tibshirani 1990) consists of a
random component, an additive component, and a link function relating the two
components. The response y, the random component, is assumed to have a density in
the exponential family,
$$f_Y(y; \theta, \phi) = \exp\left\{\frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi)\right\} \qquad (5.2)$$
where $\theta$ is the natural parameter and $\phi$ is the scale parameter. The mean of the
response variable, $\mu$, is related to the set of independent variables
$x_1, x_2, x_3, \ldots, x_p$ by a link function g:
$$\eta = s_0 + \sum_{j=1}^{p} s_j(x_j) \qquad (5.3)$$
where $s_1(\cdot), \ldots, s_p(\cdot)$ are smooth functions defining the additive component, and the
relationship between $\mu$ and $\eta$ is defined by $g(\mu) = \eta$. For a Bernoulli distribution,
$E(Y) = P(Y = 1 \mid X)$ is the mean and the link may be the logit (the canonical link) or
the probit. The generalized additive logistic model can therefore be expressed as follows:
$$\log\left(\frac{P(Y = 1 \mid X)}{1 - P(Y = 1 \mid X)}\right) = s_0 + \sum_{j=1}^{p} s_j(x_j) \qquad (5.4)$$
where $P(Y = 1 \mid X) / (1 - P(Y = 1 \mid X))$ is the odds of y, given $x_j$.
5.1.1 Spline Smoothing Function
GAM estimates the nonparametric functions of the predictors via scatterplot smoothers.
The trend of a response measurement or dependent variable as a function of one or
more independent variables can be estimated by smoothing functions. A smoothing
function is nonparametric in nature: it does not assume a rigid form for the dependence
of the dependent variable on the independent variables. Examples of smoothing
functions include the cubic smoothing spline and the locally weighted running line.
This section gives a brief overview of the cubic smoothing spline.
Each smoother, $s_j(\cdot)$, is estimated by minimizing the following penalized
residual sum of squares function:
$$\sum_{i=1}^{n} \{y_i - s(x_i)\}^2 + \lambda \int_a^b \{s''(t)\}^2 \, dt \qquad (5.5)$$
where $\lambda$ is the smoothing parameter and $a \le x_1 \le \cdots \le x_n \le b$. The first term
measures closeness to the data. The second term penalizes curvature in the function.
Large values of $\lambda$ produce smoother curves while smaller values produce more
wiggly curves.
For the fitted generalized additive model with cubic spline smoothers, the following
criterion is minimized over the smoothers:
$$\sum_{i=1}^{n} \Big\{y_i - s_0 - \sum_{j=1}^{p} s_j(x_{ij})\Big\}^2 + \sum_{j=1}^{p} \lambda_j \int \{s_j''(t)\}^2 \, dt \qquad (5.6)$$
5.1.2 Fitting Generalized Additive Models
The generalized additive model is estimated using two iterative algorithms: the Local
Scoring Algorithm (LSA) and the Backfitting Algorithm (BFA). An outer loop of the
Local Scoring Algorithm and an inner loop of the Backfitting Algorithm are run until
convergence. In each iteration of the Backfitting Algorithm, the splines are estimated.
In each iteration of the Local Scoring Algorithm, an adjusted dependent variable
and a set of weights are estimated in order to apply iteratively reweighted least
squares. The algorithms are as follows:
5.1.2.1 Local Scoring Algorithm
The iterative procedure for the Local Scoring Algorithm is as follows:
1. Initialization: $s_0 = g(E(y))$, $s_1(\cdot) = s_2(\cdot) = \cdots = s_p(\cdot) = 0$, m = 0.
2. Iterations: m = m + 1.
From the dependent variable, predictor, and mean based on the previous
iteration, a new adjusted dependent variable may be estimated as:
$$z_i = \eta_i^{m-1} + (y_i - \mu_i^{m-1}) \left(\frac{\partial \eta_i}{\partial \mu_i}\right)_{m-1} \qquad (5.7)$$
For a binomial distribution, the equation for the adjusted dependent variable
reduces to,
$$z_i = \eta_i^{m-1} + \frac{y_i - \mu_i^{m-1}}{\mu_i^{m-1}(1 - \mu_i^{m-1})} \qquad (5.8)$$
The weights $w_i$ are defined by,
$$w_i^{-1} = \left(\frac{\partial \eta_i}{\partial \mu_i}\right)^2_{m-1} V_i^{m-1} \qquad (5.9)$$
where $V_i$ is the variance of Y at $\mu_i$. For a binomial distribution, it reduces to,
$$w_i = \mu_i^{m-1}(1 - \mu_i^{m-1}) \qquad (5.10)$$
Using the Backfitting Algorithm, an additive model is fitted to z with weights
$w_i$ to obtain the estimated functions $s_j(\cdot)$.
3. Convergence: The iterations are continued until the deviance for the model fails
to decrease or satisfies the convergence criterion.
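The quantities updated in each local scoring iteration can be sketched for the Bernoulli case with a logit link. The function name and the inputs are hypothetical; the sketch only forms the adjusted dependent variable and the weights from the current additive predictor, and leaves the weighted additive fit to the backfitting step.

```python
import math

def local_scoring_step(y, eta):
    """Adjusted dependent variable z and weights w for a logit-link Bernoulli model.

    y   -- observed responses (0 or 1)
    eta -- current additive predictor for each observation
    """
    z, w = [], []
    for y_i, eta_i in zip(y, eta):
        mu_i = 1.0 / (1.0 + math.exp(-eta_i))  # current fitted probability
        var_i = mu_i * (1.0 - mu_i)            # Bernoulli variance at mu_i
        z.append(eta_i + (y_i - mu_i) / var_i) # adjusted dependent variable
        w.append(var_i)                        # weight for the weighted fit
    return z, w

# Hypothetical starting point: predictor of zero for one incident, one non-incident.
z, w = local_scoring_step([1, 0], [0.0, 0.0])
print(z, w)
```

Observations whose fitted probability is near 0 or 1 receive small weights, so they influence the next weighted additive fit less than observations near 0.5.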
5.1.2.2 Backfitting Algorithm
The Backfitting Algorithm is applied to estimate the smoothing functions
$s_1(\cdot), \ldots, s_p(\cdot)$ in the additive model in Eq. (5.3). The j-th set of partial
residuals may be estimated as,
$$R_j = Y - s_0 - \sum_{k \ne j} s_k(X_k) \qquad (5.11)$$
and $E(R_j \mid X_j) = s_j(X_j)$. Based on the estimates $\{s_i(\cdot),\ i \ne j\}$, the smoothing
function $s_j(\cdot)$ may be estimated. The iterative procedure is as follows:
1. Initialization: $s_0 = E(Y)$; $s_1(\cdot) = \cdots = s_p(\cdot) = 0$, m = 0.
2. Iterations: m = m + 1, for j = 1 to p:
$$R_j = Z - s_0 - \sum_{k=1}^{j-1} s_k^{m}(x_k) - \sum_{k=j+1}^{p} s_k^{m-1}(x_k) \qquad (5.12)$$
$$s_j^{m} = E(R_j \mid x_j) \qquad (5.13)$$
An iteratively reweighted least squares fit is obtained by smoothing the weighted
$R_j$ on $X_j$:
$$s_j(X_j) = \frac{E[w R_j \mid X_j]}{E(w \mid X_j)} \qquad (5.14)$$
where the weights are estimated based on Eq. (5.9).
3. Convergence: The iterations are continued until the residual sum of squares,
$$\mathrm{Avg}\Big(y - s_0 - \sum_{j=1}^{p} s_j^{m}(x_j)\Big)^2,$$
either fails to decrease or satisfies the convergence criterion. $z_i$ is then
re-estimated based on the Local Scoring Algorithm, which forms an adjusted
dependent variable for iteratively reweighted least squares. The algorithm
regresses $z_i$ on x with weights $w_i$ to obtain revised estimates. The new
$\mu_i$, $\eta_i$, and $z_i$ are estimated and the process is repeated until the change in deviance,
$$D(y; \hat{\mu}) = 2\{l(\mu_{max}; y) - l(\hat{\mu}; y)\} \qquad (5.15)$$
is sufficiently small, where $\mu_{max}$ is the parameter value that maximizes the
likelihood $l(\mu; y)$ over all $\mu$ (the saturated model). Here, $l(\mu; y)$ is the log
likelihood,
$$l(\mu; y) = \sum_{i=1}^{n} \{y_i \log p_i + (1 - y_i) \log(1 - p_i)\} \qquad (5.16)$$
where $p_i$ = probability of Y given $x_i$.
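The backfitting loop described above can be sketched for a two-variable additive model. This is a simplified illustration under stated assumptions: a running-mean smoother is swapped in for the cubic smoothing spline, all weights are equal, and the function names and data are hypothetical.

```python
import math

def running_mean_smoother(x, r, span=5):
    """Smooth r against x by averaging the `span` nearest points in x-order."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    half = span // 2
    fitted = [0.0] * len(x)
    for pos, i in enumerate(order):
        lo = max(0, pos - half)
        hi = min(len(order), pos + half + 1)
        window = [r[order[k]] for k in range(lo, hi)]
        fitted[i] = sum(window) / len(window)
    return fitted

def backfit(columns, y, span=5, max_iter=50, tol=1e-8):
    """Unweighted backfitting: cycle over predictors, smoothing partial residuals."""
    n, p = len(y), len(columns)
    s0 = sum(y) / n                      # initialization: s0 = mean of y
    s = [[0.0] * n for _ in range(p)]    # all smooth terms start at zero
    prev_rss = float("inf")
    rss = prev_rss
    for _ in range(max_iter):
        for j in range(p):
            partial = [y[i] - s0 - sum(s[k][i] for k in range(p) if k != j)
                       for i in range(n)]
            fit = running_mean_smoother(columns[j], partial, span)
            mean_fit = sum(fit) / n
            s[j] = [f - mean_fit for f in fit]  # centre each term around zero
        rss = sum((y[i] - s0 - sum(sj[i] for sj in s)) ** 2 for i in range(n)) / n
        if prev_rss - rss < tol:         # stop when the fit no longer improves
            break
        prev_rss = rss
    return s0, s, rss

# Hypothetical data: an additive signal in two non-collinear predictors.
n = 40
x1 = [i / (n - 1) for i in range(n)]
x2 = [((7 * i) % n) / (n - 1) for i in range(n)]
y = [a ** 2 + math.sin(3.0 * b) for a, b in zip(x1, x2)]
s0, smooths, rss = backfit([x1, x2], y)
print(round(s0, 3), round(rss, 5))
```

Each pass smooths what the other terms leave unexplained, so the individual smooth terms can be inspected one variable at a time once the loop stops.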
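The degrees of freedom for a generalized additive model can be expressed as,
$$df = \mathrm{tr}(S(\lambda_j)) - 1 \qquad (5.17)$$
where $S(\lambda_j)$ is a smoother operator. Since the trace of an operator equals the sum of
its eigenvalues, df is obtained from the sum of the eigenvalues of $S(\lambda_j)$.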
Dependency between independent variables is not as direct a problem in a generalized
additive model as in ordinary regression, where the variance of the estimators is a
function of the design matrix.
In regression, multicollinearity can cause highly inflated variances, coefficients that
are impossible to interpret individually, and numerically unstable estimates of the
regression coefficients or a non-unique solution. In a generalized additive model, strict
collinearity through the origin (i.e., $X_2 = cX_1$) between independent variables can
cause the backfitting algorithm to converge to one of several non-unique solutions,
determined by the starting functions (Hastie and Tibshirani 1990). If the independent
variables are not strictly collinear, then the backfitting algorithm converges to the
unique solution, independent of the starting functions.
The generalized additive logistic model can be used to develop an incident detection
algorithm. The logistic link function is appropriate since the response or dependent
variable is either incident or non-incident. GAM allows many independent variables to
be included without the curse of dimensionality that usually occurs with
nonparametric models. Independent variables may include traffic measures such as
speed, occupancy, volume, bus speed, etc. for each lane, for several previous time
intervals. A nonparametric model uses more flexible functions than a parametric
model, which allows the relationship between the dependent and independent
variables to be explored.
5.2 .632 Bootstrap Method
Misclassification rate measures how well a model predicts or classifies the class of a
future observation. It measures model performance and can be used for model
selection. Misclassification rate is defined as the probability of an incorrect
classification. The probability refers to repeated sampling from the true population. If a
model is fitted and tested on the same sample, the misclassification rate, the so-called
"apparent error", is unrealistically low because it uses the same data both for fitting
and assessment. In order to get a more realistic estimate of misclassification rate, a test
sample that is separate from the training sample may be used. There are several
approaches to improving the estimate of misclassification rate. Cross validation fits
the model leaving the i-th observation out, and then computes the predicted value
for the i-th observation. If this is done for each observation, the average
misclassification rate may be calculated. The simplest bootstrap approach generates B
bootstrap samples, estimates the model on each, and then applies each fitted model to
the original sample to give B estimates of misclassification rate. The overall estimate
of misclassification rate is the average of these B estimates (Efron and Tibshirani
1993). However, the simple bootstrap method does not work very well. The .632
bootstrap performs best among several methods, including cross validation (Efron and
Tibshirani 1993). A bootstrap sample is generated by sampling n times with
replacement from the original sample of size n. The model is then fitted to a bootstrap
sample and used to predict the response in the original sample. The .632 bootstrap
differs from the standard bootstrap in that it predicts only for those observations that
do not appear in the bootstrap sample. The idea behind the .632 bootstrap is to use the
misclassification rate from just these cases to adjust for the optimism in the apparent
error rate.
Let $\hat{\epsilon}_0$ be the average error in predicting the response, or average
misclassification rate, for observations that are in the original sample but not in the
bootstrap sample. $\mathrm{err}(x, \hat{F})$ is the apparent error computed by fitting and
assessing on the original sample. The .632 bootstrap estimate of optimism is defined
as (Efron and Tibshirani 1993):
$$\hat{\omega}^{.632} = .632\,[\hat{\epsilon}_0 - \mathrm{err}(x, \hat{F})] \qquad (5.18)$$
The estimate of optimism in equation (5.18) is then added to the apparent error to
obtain the .632 estimate of prediction error or misclassification rate.
$$\widehat{\mathrm{err}}^{.632} = \mathrm{err}(x, \hat{F}) + .632\,[\hat{\epsilon}_0 - \mathrm{err}(x, \hat{F})] = .368\,\mathrm{err}(x, \hat{F}) + .632\,\hat{\epsilon}_0 \qquad (5.19)$$
The value .632 is approximately the probability that a given observation appears in a
bootstrap sample of size n drawn with replacement from an original sample of size n.
$\hat{\epsilon}_0$ can be estimated as follows:
Given a set of B bootstrap samples, let $\hat{\epsilon}_0^{(j)}$ be the prediction error computed
for the j-th bootstrap sample by
1. Estimating a prediction rule using the j-th bootstrap sample.
2. Averaging the prediction error in predicting the response for observations not
in this bootstrap sample but in the original sample.
This is repeated B times, and then the average is
$$\hat{\epsilon}_0 = \frac{\sum_{j=1}^{B} \hat{\epsilon}_0^{(j)}}{B} \qquad (5.20)$$
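The .632 estimate in Eqs. (5.18) to (5.20) can be sketched as follows. The classifier is a hypothetical one-variable threshold rule standing in for the incident detection model, and the data, function names, number of bootstrap samples, and random seed are illustrative assumptions.

```python
import random

def fit_threshold_rule(sample):
    """Hypothetical classifier: pick the cut-off with the fewest training errors."""
    best_cut, best_err = None, None
    for cut, _ in sample:
        err = sum(1 for x, y in sample if (x >= cut) != (y == 1))
        if best_err is None or err < best_err:
            best_cut, best_err = cut, err
    return best_cut

def error_rate(cut, data):
    """Share of observations misclassified by the rule 'incident if x >= cut'."""
    return sum(1 for x, y in data if (x >= cut) != (y == 1)) / len(data)

def inclusion_probability(n):
    """Probability that a given observation appears in a bootstrap sample of size n."""
    return 1.0 - (1.0 - 1.0 / n) ** n

def bootstrap_632(data, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(data)
    apparent = error_rate(fit_threshold_rule(data), data)
    held_out_errors = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        held_out = [data[i] for i in range(n) if i not in chosen]
        if not held_out:
            continue  # every observation was drawn, nothing left to test on
        cut = fit_threshold_rule([data[i] for i in idx])
        held_out_errors.append(error_rate(cut, held_out))
    eps0 = sum(held_out_errors) / len(held_out_errors)
    return apparent, eps0, 0.368 * apparent + 0.632 * eps0

# Hypothetical (occupancy, incident indicator) pairs.
data = [(10, 0), (12, 0), (15, 0), (18, 1), (20, 0), (22, 0),
        (25, 1), (28, 0), (30, 1), (33, 1), (35, 1), (40, 1)]
apparent, eps0, err632 = bootstrap_632(data)
print(round(inclusion_probability(1000), 3))
```

The final estimate always lies between the apparent error and the held-out error, weighted toward the held-out error.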
For an incident detection (ID) algorithm, the response of each observation is either 0
(non-incident) or 1 (incident). Three different average misclassification rates may be
estimated. Let $err_a$ be the average misclassification rate using the bootstrap method
when a vector reflects non-incident but the ID algorithm indicates incident. Let $err_b$
be the average misclassification rate using the bootstrap method when a vector reflects
incident but the ID algorithm indicates non-incident. Let $err_c$ be the average
misclassification rate using the bootstrap method when, for an incident, the ID
algorithm indicates non-incident instead of incident. $err_a$, $1 - err_b$, and $1 - err_c$
are the bootstrap average false alarm rate (FAR), bootstrap average incident-state
detection rate (ISDR), and bootstrap average detection rate (DR), respectively. The
.632 bootstrap method should provide a good estimate of the performance of the ID
algorithm.
5.3 Model Evaluation
To evaluate the performance of an incident detection model, measures such as detection
rate (DR), false alarm rate (FAR), and mean time to detect (TTD) are estimated. A new
performance measure, ISDR, has been included to evaluate the performance of an
incident detection model in classifying each incident vector. These measures are
estimated as follows:
Detection Rate (DR): Percentage of incidents correctly classified by the model.
$$\text{Detection Rate (DR)} = \frac{N_d}{N_t} \times 100 \qquad (5.21)$$
where,
$N_d$ = Number of incidents detected
$N_t$ = Total number of incidents
Incident-State Detection Rate (ISDR): Percentage of the incident vectors classified
correctly by an incident detection model. This is also referred to as the classification
rate for incident state vectors.
$$\text{Incident-State Detection Rate (ISDR)} = \frac{V_i^{+}}{V_i} \times 100 \qquad (5.22)$$
where,
$V_i^{+}$ = Number of incident vectors correctly classified
$V_i$ = Total number of incident vectors
False Alarm Rate (FAR): Percentage of non-incident vectors classified incorrectly.
$$\text{False Alarm Rate (FAR)} = \frac{V_{ni}^{-}}{V_{ni}} \times 100 \qquad (5.23)$$
where,
$V_{ni}^{-}$ = Number of non-incident vectors incorrectly classified
$V_{ni}$ = Total number of non-incident vectors
Mean Time to Detect (TTD): Average time elapsed between the beginning of an
incident and the time the incident detection model first detects the incident.
$$\text{Mean Time to Detect (TTD)} = \frac{1}{N_t} \sum_{i=1}^{N_t} (t_i^{d} - t_i) \qquad (5.24)$$
where,
$t_i$ = Time incident i starts
$t_i^{d}$ = Time incident i is first detected
$N_t$ = Total number of incidents
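The four performance measures can be computed from labeled vectors and incident times as sketched below. The function names and the example data are hypothetical; labels and predictions use 1 for incident and 0 for non-incident, and the mean time to detect is averaged over the incidents supplied to the function.

```python
def detection_rate(n_detected, n_total):
    """Percentage of incidents detected."""
    return 100.0 * n_detected / n_total

def incident_state_detection_rate(labels, predictions):
    """Percentage of incident vectors classified as incident."""
    incident = [p for l, p in zip(labels, predictions) if l == 1]
    return 100.0 * sum(1 for p in incident if p == 1) / len(incident)

def false_alarm_rate(labels, predictions):
    """Percentage of non-incident vectors classified as incident."""
    non_incident = [p for l, p in zip(labels, predictions) if l == 0]
    return 100.0 * sum(1 for p in non_incident if p == 1) / len(non_incident)

def mean_time_to_detect(start_times, detection_times):
    """Average of (first detection time - start time) over the incidents given."""
    gaps = [d - s for s, d in zip(start_times, detection_times)]
    return sum(gaps) / len(gaps)

# Hypothetical vector-level labels and model output.
labels      = [0, 0, 0, 0, 1, 1, 1, 0]
predictions = [0, 1, 0, 0, 0, 1, 1, 0]
dr = detection_rate(9, 10)
isdr = incident_state_detection_rate(labels, predictions)
far = false_alarm_rate(labels, predictions)
ttd = mean_time_to_detect([0.0, 10.0], [3.0, 15.0])
print(dr, round(isdr, 1), far, ttd)
```

DR counts whole incidents while ISDR counts individual incident vectors, so an incident detected late contributes fully to DR but only partly to ISDR.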
6. Development of Generalized Additive Model for Freeway Incident
Detection
Characteristics of traffic and incidents may vary considerably by location depending
on roadway geometry, driver behavior, and weather conditions. Therefore, variables
significant for an incident detection algorithm may vary by location as well. The impact
of incidents on traffic may vary based on location due to differences in the
characteristics of incidents, traffic, roadway geometry, and driver behavior. These
factors and the spatial distribution of fixed sensors also affect the structure of an
incident detection model and its performance.
Other studies have developed incident detection models for one location and
examined their performance at other locations (Abdulhai 1996; Cheu and Ritchie 1995).
This examination has been referred to as a transferability test. In this research, a different
approach is taken. Significant independent variables for incident detection are
identified and generalized additive models for incident detection are developed for two
freeway sections, Interstate 25 in Colorado and Interstate 880 in California, and the
differences between the models and their performance are examined.
An incident detection model is implemented by a traffic management center to detect
freeway incidents or any event that disrupts the flow of traffic. The disruption may be
caused by lane-blocking or shoulder incidents. Examples of lane-blocking incidents
are accidents and debris on the freeway; examples of shoulder incidents are stalled
vehicles and vehicles with flat tires. Shoulder incidents can also cause a significant
reduction of capacity at the affected location due to rubbernecking, as presented in
Chapter 4. A traffic management center (TMC) is interested in detecting not only
lane-blocking incidents but also shoulder incidents. If the shoulder incidents are
detected, then the TMC can send a service unit to clear the incident and assist
motorists. An incident detection model developed and tested on both lane-blocking
and shoulder incidents should provide better estimates of its performance at a traffic
management center.
This chapter presents the model development procedure for fixed sensor based
nonparametric generalized additive models for all incidents (lane-blocking and
shoulder incidents) on the I-25 freeway, all incidents on the I-880 freeway,
lane-blocking incidents on the I-880 freeway, and shoulder incidents on the I-880 freeway.
The development procedures for mobile sensor based, and fixed and mobile sensor based,
nonparametric generalized additive models for all incidents on the I-25 freeway are
also presented. The functional forms and parameter estimates for the
fixed sensor based generalized additive models, as suggested by the nonparametric
generalized additive model, are presented as well. Finally, the model structures of
the incident detection models developed are compared and their implications are
examined.
6.1 Selection of Independent Variables
To date, several incident detection algorithms have been developed based on traffic
flow measures including occupancy, speed, and volume. Table 6.1 shows a sample of
fixed-infrastructure based traffic measures used in previous studies and the additional
variables considered for this study. The traffic measures used for the California
Algorithm are spatial occupancy difference (OCCDF), temporal difference of
downstream occupancy (DOCCTD), and relative difference in spatial occupancy
(OCCRDF).

Table 6.1. Traffic measures for incident detection
Columns: Variable; Description; Payne and Tignor, 1978; Cheu and Ritchie, 1995; Dia, Rose et al., 1997; Abdulhai and Ritchie, 1999; Variables considered for this study
UOCC Upstream occupancy t,t-1,..,t-4 t t,t-1,..,t-4 t,t-1,..,t-4
USPD Upstream speed t t,t-1,..,t-4
UVOL Upstream volume t,t-1,..,t-4 t t,t-1,..,t-4 t,t-1,..,t-4
DOCC Downstream occupancy t t,t-1,t-2 t t,t-1,t-2 t,t-1,..,t-4
DSPD Downstream speed t t,t-1,..,t-4
DVOL Downstream volume t,t-1,t-2 t t,t-1,t-2 t,t-1,..,t-4
OCCDF Spatial occupancy difference, UOCC_t - DOCC_t t t
OCCRDF Relative difference in spatial occupancy, (UOCC_t - DOCC_t)/UOCC_t t t
DOCCTD Temporal difference of downstream occupancy, (DOCC_{t-2} - DOCC_t)/DOCC_{t-2} t t
UDEVOCC UOCC_t - Historical upstream occupancy at time interval t t
DDEVOCC DOCC_t - Historical downstream occupancy at time interval t t
UDSPDDF Spatial speed difference, USPD_t - DSPD_t
UDDEVOCC Difference in deviation of upstream and downstream occupancy from historical occupancy, UDEVOCC_t - DDEVOCC_{t-2} t t
Note: (t-n) = Data lagged n time intervals.

The 16 variables used in neural network models (Abdulhai 1996; Cheu and Ritchie
1995) include upstream occupancy and volume, and downstream occupancy and volume,
for up to four previous time intervals. The additional variables considered in this study include
deviation of upstream occupancy from historical upstream occupancy (UDEVOCC),
deviation of downstream occupancy from historical downstream occupancy
(DDEVOCC), spatial speed difference (UDSPDDF), and difference in deviation of
upstream occupancy and downstream occupancy from historical occupancy
(UDDEVOCC).
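As an illustration of how the derived measures in Table 6.1 follow from the raw detector series, a minimal Python sketch is given below. The function name, argument names, and sample values are hypothetical. The two-interval lag in UDDEVOCC follows the subscripts as read from Table 6.1 and should be treated as an assumption.

```python
from typing import Dict, Optional, Sequence


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator != 0 else None


def derived_measures(t: int,
                     uocc: Sequence[float], docc: Sequence[float],
                     uspd: Sequence[float], dspd: Sequence[float],
                     hist_uocc: Sequence[float],
                     hist_docc: Sequence[float]) -> Dict[str, Optional[float]]:
    """Derived traffic measures at time interval t (requires t >= 2)."""
    if t < 2:
        raise ValueError("t must be at least 2 because DOCCTD uses interval t-2")
    udevocc = uocc[t] - hist_uocc[t]            # deviation from historical, upstream
    ddevocc = docc[t] - hist_docc[t]            # deviation from historical, downstream
    ddevocc_lag2 = docc[t - 2] - hist_docc[t - 2]
    return {
        "OCCDF": uocc[t] - docc[t],
        "OCCRDF": ratio(uocc[t] - docc[t], uocc[t]),
        "DOCCTD": ratio(docc[t - 2] - docc[t], docc[t - 2]),
        "UDEVOCC": udevocc,
        "DDEVOCC": ddevocc,
        "UDSPDDF": uspd[t] - dspd[t],
        # Assumption: downstream deviation lagged two intervals, as read from Table 6.1.
        "UDDEVOCC": udevocc - ddevocc_lag2,
    }


# Invented series: occupancy in percent, speed in mph, three time intervals.
measures = derived_measures(
    t=2,
    uocc=[10, 12, 20], docc=[10, 10, 8],
    uspd=[60, 58, 40], dspd=[60, 61, 62],
    hist_uocc=[10, 11, 12], hist_docc=[10, 10, 10],
)
print(measures)
```

In the invented series, upstream occupancy rises and upstream speed falls while downstream occupancy drops, which is the pattern expected when an incident occurs between the two detector stations.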
To develop generalized additive models, the selection of significant independent
variables is based on the following procedure.
1. Box-Whisker plots of the variables, showing median, quartiles, and extreme
values, are examined. The box represents the interquartile range and contains 50 percent
of the values. The whiskers, lines extending from the box, show the minimum
and maximum values excluding outliers. Variables whose values are
significantly different under incident conditions show a strong location shift
between incident and non-incident conditions. The variables with strong
location shifts are identified.
2. Univariable Generalized Additive Models (GAM) with the significant variables
identified in step 1 are developed based on the training data set. The deviances
of the fitted models on the training set are examined. Potential variables are
ranked by deviance, from lowest to highest. The models with the lowest
deviance are used to identify the main effects. The deviance or likelihood-ratio
statistic for a fitted model $\hat{\mu}$ is defined by
$$D(y;\hat{\mu}) = 2\{l(\mu_{\max};y) - l(\hat{\mu};y)\} \qquad (6.1)$$
where $\mu_{\max}$ is the parameter value that maximizes $l(\mu;y)$ over all
$\mu$ (the saturated model), and $l(\hat{\mu};y)$ is the log likelihood of the fitted model.
Deviance is used as the goodness-of-fit measure to compare models (Hastie
and Tibshirani 1990).
3. Following the univariable analysis, multivariable analysis is conducted based
on a forward stepwise variable selection, starting with the independent variable
with the lowest deviance in step 2. A variable is added to the model if the
analysis shows a significant change in deviance. The Chi-Square test at a 5%
level of significance is applied to test the significance of the
improvement in deviance (Δdeviance). Δdf_err is the expected change in
deviance. The degrees of freedom, df_j, are used as an approximate estimate of
Δdf_err for models with and without the jth term.
4. In addition to step 3, the GAM is developed by adding one variable at a time,
starting with the explanatory variables identified in step 2. Model performance
at this step is examined based on the DR and FAR on the test set. In this step,
the last model includes all variables from step 2. Other subset models are also
examined.
5. The performance of the selected model from step 3 and all models in step 4 are
examined. The final model is selected based on its performance (DR and FAR)
on the testing set.
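The deviance used in steps 2 and 3 can be illustrated for a binary incident/non-incident response. The sketch below assumes a Bernoulli likelihood, for which the saturated model reproduces each observation exactly and its log likelihood is zero. The function names and the data are hypothetical.

```python
import math
from typing import Sequence


def bernoulli_log_likelihood(y: Sequence[int], mu: Sequence[float],
                             eps: float = 1e-12) -> float:
    """Log likelihood of fitted incident probabilities mu for 0/1 responses y."""
    total = 0.0
    for yi, mi in zip(y, mu):
        mi = min(max(mi, eps), 1.0 - eps)   # guard against log(0)
        total += math.log(mi) if yi == 1 else math.log(1.0 - mi)
    return total


def deviance(y: Sequence[int], mu: Sequence[float]) -> float:
    """D(y; mu) = 2 * (l(saturated) - l(fitted))."""
    saturated = 0.0   # assumption: binary response, so the saturated log likelihood is 0
    return 2.0 * (saturated - bernoulli_log_likelihood(y, mu))


# Invented responses and fitted probabilities from two candidate models.
y = [1, 0, 1, 0]
null_fit = [0.5, 0.5, 0.5, 0.5]
better_fit = [0.9, 0.1, 0.8, 0.2]
print(deviance(y, null_fit), deviance(y, better_fit))
```

A lower deviance indicates a fit closer to the saturated model, which is why candidate variables are ranked from lowest to highest deviance in step 2.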
The independent variables considered in all the generalized additive models developed
for incident detection are not strictly collinear. Therefore, the backfitting algorithm
converges to a unique solution.

Four different incident detection models are developed: a model for all
incidents on the I-25 freeway, a model for all incidents on the I-880 freeway, a
lane-blocking incident model for the I-880 freeway, and a shoulder incident model for the
I-880 freeway. The same variable selection procedure is applied to all models. The
variable selection procedures for a few specific models are also presented.
Fixed Sensor Based Incident Detection Models
As part of step 1, Figure 6.1, Figure 6.2, and Figure 6.3 show a strong shift of location
of the upstream occupancy, upstream speed, downstream speed, and the deviation of
upstream occupancy from historical upstream occupancy under incident conditions for
all incidents on the I-25 freeway, all incidents on the I-880 freeway, and lane-blocking
incidents on the I-880 freeway, respectively. Under non-incident conditions, upstream
occupancy ranges from 0 to 28 percent, 5 to 15 percent, and 5 to 13 percent, and under
incident conditions, upstream occupancy ranges from 10 to 69 percent, 8 to 42
percent, and 10 to 38 percent for all incidents on the I-25 freeway, all incidents on the
I-880 freeway, and lane-blocking incidents on the I-880 freeway, respectively. In step
1, for both the I-25 and I-880 data, the Box-Whisker plots show a strong shift of
location for upstream occupancy between incident and non-incident conditions. Other
variables that show a strong shift of location include USPD, UVOL, UDEVOCC,
DOCC, DSPD, DVOL, DDEVOCC, OCCDF, OCCRDF, UDDEVOCC, and
UDSPDDF. The Box-Whisker plots also show that variables at the current time interval
and at previous time intervals show similar location shifts. In order to develop a
parsimonious model, only variables at the current time interval are used as independent
variables for the GAM.
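The location-shift screening of step 1 can be sketched numerically. The example below computes the box and whisker values for a variable under non-incident and incident conditions and checks whether the two boxes are separated. The 1.5 interquartile range rule for identifying outliers and the box-separation criterion are assumptions made for illustration, and the occupancy values are invented.

```python
import statistics
from typing import Dict, Sequence


def box_stats(values: Sequence[float], whisker: float = 1.5) -> Dict[str, float]:
    """Quartiles and whisker ends; points beyond whisker * IQR count as outliers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {"q1": q1, "median": median, "q3": q3,
            "whisker_low": inside[0], "whisker_high": inside[-1]}


def boxes_separated(non_incident: Sequence[float], incident: Sequence[float]) -> bool:
    """True when the interquartile boxes of the two conditions do not overlap."""
    a, b = box_stats(non_incident), box_stats(incident)
    return a["q3"] < b["q1"] or b["q3"] < a["q1"]


# Invented upstream occupancy samples (percent).
uocc_non_incident = [5, 6, 7, 8, 9]
uocc_incident = [10, 20, 30, 40, 50]
print(box_stats(uocc_non_incident))
print(box_stats(uocc_incident))
print(boxes_separated(uocc_non_incident, uocc_incident))
```

A variable whose boxes are separated in this sense would be carried forward to the univariable models of step 2.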

Figure 6.1. Box-Whisker Plot for incident (1) and non-incident (0)
conditions on the I-25 freeway (panels: UOCC, USPD, DSPD, UDEVOCC; outliers and extremes marked)
Figure 6.2. Box-Whisker Plot for incident (1) and non-incident (0)
conditions on the I-880 freeway for all incidents (panels: UOCC, USPD, DSPD, UDEVOCC; outliers and extremes marked)
Figure 6.3. Box-Whisker Plot for incident (1) and non-incident (0)
conditions on the I-880 freeway for lane-blocking incidents (panels: UOCC, USPD, DSPD, UDEVOCC; outliers and extremes marked)
In step 2, the univariable model including the deviation of upstream occupancy from
historical upstream occupancy provides the lowest deviance for the I-25 incidents,
I-880 shoulder incidents, and all I-880 incidents (lane-blocking and shoulder). For the
I-880 lane-blocking incident model, the univariable model with the upstream speed
provides the lowest deviance (Table 6.2). For both the I-25 and I-880 freeways, the
three univariable models with the lowest deviance include the upstream variables. All
the models for the I-880 freeway also include downstream variables. The multivariable
models are developed based on these findings.

Table 6.2. Univariable models ranked by deviance
No.  I-25      I-880 Lane-blocking  I-880 Shoulder  I-880 All Incidents (Lane-blocking and Shoulder)
1    UDEVOCC   USPD                 UDEVOCC         UDEVOCC
2    USPD      UOCC                 USPD            USPD
3    UOCC      UDEVOCC              UOCC            UOCC
4    UDDEVOCC  DSPD                 DDEVOCC         DDEVOCC
5    OCCDF     DOCC                 DSPD            DSPD
6    UDSPDDF   DDEVOCC              DOCC            DOCC
7    OCCRDF    UDDEVOCC             UDDEVOCC        OCCDF
8    DDEVOCC   OCCDF                OCCDF           UDDEVOCC
9    UVOL      DVOL                 OCCRDF          OCCRDF
10   DOCC      UVOL                 UDSPDDF         UDSPDDF
11   DSPD      OCCRDF               UVOL            DVOL
12   DVOL      UDSPDDF              DVOL            UVOL
Table 6.3 summarizes the results from fitting a number of multivariable generalized
additive models for all incidents (lane-blocking and shoulder incidents) for the I-880
freeway (steps 3 and 4). For each model, the table shows the deviance of the fitted
model, the change of deviance (Δ Deviance), and the expected change of deviance
(Δ df). The first model in Table 6.3 includes the UDEVOCC variable. For the second
model, the second variable in Table 6.2, USPD, is added to the first model. The
deviance decreases by 1764.31. The expected change of deviance (Δ df) is 6.63. A
Chi-Square test shows a significant decrease of deviance (p-value = 0) as the USPD
variable is added, and therefore USPD is kept in the model. Similarly,
other variables are added in a forward stepwise method.
Table 6.3. Analysis of Deviance for preliminary models for all incidents on the I-880 freeway
Candidate variables: UDEVOCC, USPD, UOCC, DDEVOCC, DSPD, DOCC, OCCDF, UDDEVOCC, OCCRDF, UDSPDDF, DVOL, UVOL
Model No.  Compare with model no.  Deviance     Δ Deviance  Δ df   p-value
1          -                       72712.86     -           -      -
2          1                       70948.55     1764.31     6.63   0.00E+00
3          2                       55591.02     15357.53    22.50  0.00E+00
4          3                       66897.51     -11306.49   22.34  -
5          3                       66833.93     -11242.91   2.06   -
6          3                       55471.44     119.58      26.94  5.86E-14
7          6                       55253.88     217.56      35.09  3.15E-28
8          7                       65755.29     -10501.41   34.92  -
9          7                       66107.53     -10853.65   16.07  -
10         7                       65590.83     -10336.95   49.84  -
11         7                       65574.77     -10320.89   22.76  -
12         7                       55084.22     169.66      46.99  4.62E-16
13         -                       64228.59     -           -      -
14         -                       55246.09     -           -      -
15         -                       65898.00     -           -      -
16         -                       62747.37     -           -      -
17         -                       82621.84     -           -      -
18         -                       46068.28     -           -      -
19         -                       47389.48     -           -      -
20         -                       1310758.40   -           -      -
21         -                       65930.19     -           -      -
22         -                       55281.5      -           -      -
The model with the lowest
deviance from the forward stepwise procedure (model #1 to model #12) in step 3 is
model #12. As part of step 4, by adding one variable at a time, model #13 to model
#22 are obtained. Based on the detection rate and the false alarm rate on the test set,
the performance of model #12 is compared to models #13 to #22. The final selected
model is model #21.
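The acceptance test applied at each forward step can be sketched as follows, using two of the deviance comparisons reported in Table 6.3. Because the expected change of deviance (Δ df) is not an integer, the sketch substitutes the Wilson-Hilferty normal approximation for the exact Chi-Square tail probability; this substitution and the function names are assumptions, so the p-values are approximate and will not match the tabulated values exactly.

```python
import math
from typing import Tuple


def chi2_upper_tail(x: float, df: float) -> float:
    """Approximate P(chi-square with df degrees of freedom > x), Wilson-Hilferty."""
    if x <= 0.0:
        return 1.0
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def keep_added_term(dev_without: float, dev_with: float, delta_df: float,
                    alpha: float = 0.05) -> Tuple[bool, float, float]:
    """Return (keep?, change in deviance, approximate p-value) for an added variable."""
    delta_dev = dev_without - dev_with
    p_value = chi2_upper_tail(delta_dev, delta_df)
    return (delta_dev > 0.0 and p_value < alpha), delta_dev, p_value


# Deviances and delta df as reported in Table 6.3.
add_uspd = keep_added_term(72712.86, 70948.55, 6.63)     # model 2 compared with model 1
model_4 = keep_added_term(55591.02, 66897.51, 22.34)     # model 4 compared with model 3
print(add_uspd)
print(model_4)
```

The first comparison reproduces the decrease of 1764.31 in deviance and keeps USPD, while the second shows an increase in deviance, so the added variable is not retained.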
In steps 2, 3, and 4, the convergence criterion selected for the Local Scoring Algorithm
is 0.5 and for the Backfitting Algorithm is 0.01.
The significant independent variables and analysis of deviance of the final selected
models for the I-25 freeway and the I-880 freeway are summarized in Table 6.4.
The degrees of freedom (df) of each variable represent the flexibility of its function. The
Chi-Square test is used to compare the deviance between the full model and the model
without the variable. The p-value of this test is also reported in Table 6.4.
Table 6.4. Analysis of Deviance for the generalized additive models
Selected significant variables: UOCC, USPD, UDEVOCC, DOCC, DSPD, OCCDF
I-25 Freeway, all incidents: df = 4.45, 4.49, 11.58; p-value < 0.0001 for each
I-880 Freeway, lane-blocking (LB) incidents: df = 4.00, 8.27, 13.29, 4.00, 8.33; p-value < 0.0001 for each
I-880 Freeway, shoulder (SHLD) incidents: df = 5.34, 1.00, 26.02, 5.35, 0.99, 5.38; p-value < 0.0001 for each
I-880 Freeway, all incidents: df = 5.35, 1.00, 1.84, 5.34, 1.00, 5.35; p-value < 0.0001 for each
df = Degrees of freedom

6.1.1 Significant Independent Variables for Fixed Sensor Based Incident
Detection Models
Typically, it is assumed that as an incident occurs, the occupancy increases and the
speed decreases upstream of an incident, and occupancy decreases and speed increases
downstream of an incident. Comparing the incident detection models for the two
locations, the upstream occupancy, speed, and the occupancy deviation from historical
occupancy are significant for the I-25 freeway. For the I-880 freeway, both the
upstream and downstream variables are significant. This may be due to the difference
in the characteristics of the incidents and the freeway capacity. A study (Blumentritt
1981) has shown that the percentage reduction of capacity due to an incident varies
with the number of freeway lanes available at a specific location. The study
shows that for a 3-lane freeway, one-lane-blocking incidents cause a 47 percent
reduction in capacity and two-lane-blocking incidents cause a 78 percent reduction in
capacity. For a 5-lane freeway, one-lane-blocking and two-lane-blocking incidents cause 25
percent and 50 percent reductions in capacity, respectively. Since an incident causes a
higher reduction in capacity on the 3-lane I-25 freeway, traffic variables measuring
upstream conditions are significant. For the I-880 freeway, lane-blocking incidents
have a higher impact on traffic flow than shoulder incidents; therefore, fewer variables
are needed. An examination of the differences in the characteristics of the incidents on
the I-880 and the I-25 freeways in Chapter 4 shows that the composition of
incidents by type, duration, and average delay is significantly different across the two
locations. The average delay due to incidents is higher (6.07 minutes per vehicle) for
the I-25 freeway than for the I-880 freeway (1.89 minutes per vehicle). Therefore, the
variables significant for an incident detection model differ by type of incident
and by location.

6.1.2 Significant Independent Variables for Mobile Sensor Based Incident
Detection Model
For the mobile sensor based incident detection model, the independent variables
examined are average bus speed (AVGSPD), historical bus speed (HISTSPD), and
deviation of bus speed from historical bus speed (DEVSPD). The same variable
selection procedure as described in section 6.1 is followed. Table 6.5 summarizes the
results of fitting a number of mobile sensor based incident detection models. For each
model, the table shows the deviance of the fitted model, the change of deviance (Δ
Deviance), and the expected change of deviance (Δ df). The first model in Table 6.5 is
fitted with the AVGSPD variable. For the second model, HISTSPD is added to the first
model. The deviance decreases by 174.65. The expected change of deviance is 100. A
Chi-Square test shows a significant decrease of deviance (p-value < 0.001) as the
HISTSPD variable is added, and therefore HISTSPD is kept in the
model. A similar procedure is applied to the other models. The model with the lowest
deviance from the forward stepwise procedure (model #1 to model #3) is model #3.
Another combination of independent variables (model #4) is also examined. The
model with the best performance based on detection rate and false alarm rate on the
test set is model #2, with average bus speed and historical bus speed as independent
variables. Table 6.6 shows the analysis of deviance and the significant independent
variables of the final model. The degrees of freedom (df) of each variable represent the
flexibility of its function. The Chi-Square test is used to compare the deviance
between the full model and the model without the variable. Both variables are
significant at the 5 percent level of significance.

Table 6.5. Analysis of Deviance for preliminary model for mobile
sensor based incident detection model
Model Variable Compare with model no. Deviance Δ Deviance Δ df p-value
AVGSPD HISTSPD DEVSPD
1 198.49 - - -
2 1 23.84 174.65 100 <0.001
3 2 0.06 23.78 100 <0.001
4 1 15307.65 - - -
Table 6.6. Analysis of Deviance for mobile sensor based incident
detection model
Selected significant variable Degree of freedom p-value
Average bus speed (AVGSPD) 100 <0.0001
Historical bus speed (HISTSPD) 40 < 0.0001
6.1.3 Significant Independent Variables for Fixed and Mobile Sensor Based
Incident Detection Model
The fixed sensor based incident detection model is developed based on three
significant independent variables: upstream occupancy, upstream speed, and deviation
of upstream occupancy from historical occupancy, as described in section 6.1.1. The
fixed and mobile sensor based incident detection model is developed based on these
three significant independent variables from fixed sensors and the average probe speed
from mobile sensors. The incident detection model with the additional probe
vehicle data provides a higher detection rate at a 0 percent false alarm rate.

Table 6.7 shows the analysis of deviance and the significant independent variables of the
model. The degrees of freedom (df) of each variable represent the flexibility of its
function. The Chi-Square test is used to compare the deviance between the full model
and the model without that variable. All variables are significant at the 5 percent level
of significance.
Table 6.7. Analysis of Deviance for fixed and mobile sensor based
incident detection model
Variable Degree of freedom p-value
UOCC 7.90 < 0.0001
USPD 1.00 < 0.0001
UDEVOCC 11.66 < 0.0001
AVGSPD (probe data) 150.0 <0.0001
6.2 Parametric Estimate of Fixed Sensor Based Generalized Additive Model
for Incident Detection
The models presented in the previous sections are nonparametric generalized additive
models for incident detection. Each independent variable is fitted by a smoother such
as a cubic spline. A parametric form of the model is proposed by examining the partial
prediction, or effect, of each independent variable on the response. A parametric model
provides simple functional forms for easy implementation. One of the advantages of
the parametric model over the nonparametric model is that it requires less time to
develop estimates of the parameters, as the functional forms are specified in the
parametric model. The training time for a parametric model may be less than one time
interval (e.g., 30 seconds). Therefore, real-time re-calibration may be performed to
adjust the parameter estimates. This section presents the parametric model estimation
of the generalized additive models for incident detection for lane-blocking and
shoulder incidents for freeway sections in Colorado and California, and the differences in
the model structures and parameter estimates.
6.2.1 Partial Prediction
One of the advantages of generalized additive models over other algorithms is that
they allow an examination of the fitted function of each independent variable to explore
its functional form. A partial prediction plot may be used to examine the effect of each
independent variable on the response and to suggest a parametric function for each
variable in the model. The functions of the model are parametrized in terms of one- or
two-parameter families of functions.
Figure 6.4, Figure 6.5, Figure 6.6, and Figure 6.7 show partial prediction plots for the
incident detection models for all incidents on the I-25 freeway, all incidents on the
I-880 freeway, lane-blocking incidents on the I-880 freeway, and shoulder incidents on
the I-880 freeway, respectively. The examination of the partial prediction plots shows
that the deviation of upstream occupancy from historical upstream occupancy
(UDEVOCC) has the largest effect on the response in all models. The UDEVOCC
functions for all models may be considered to be piecewise linear functions.
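To illustrate how a piecewise linear partial function could enter a parametric model, the following Python sketch evaluates a two-segment linear function of UDEVOCC inside a logistic predictor. The logit link, the knot location, and all coefficient values are hypothetical and chosen for illustration only; they are not the parameter estimates of this study.

```python
import math


def piecewise_linear(x: float, knot: float,
                     slope_below: float, slope_above: float) -> float:
    """Continuous two-segment linear function that equals zero at the knot."""
    if x <= knot:
        return slope_below * (x - knot)
    return slope_above * (x - knot)


def incident_probability(udevocc: float,
                         intercept: float = -4.0,    # hypothetical value
                         knot: float = 5.0,          # hypothetical break point (percent)
                         slope_below: float = 0.05,  # hypothetical value
                         slope_above: float = 0.40   # hypothetical value
                         ) -> float:
    """Probability of an incident from UDEVOCC alone, assuming a logit link."""
    eta = intercept + piecewise_linear(udevocc, knot, slope_below, slope_above)
    return 1.0 / (1.0 + math.exp(-eta))


# The probability rises slowly below the knot and sharply above it.
for value in (0.0, 5.0, 10.0, 20.0):
    print(value, round(incident_probability(value), 4))
```

In a full parametric model, the functions of the remaining independent variables would be added to the same linear predictor before the link is applied.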
The partial prediction plots also show that the upstream occupancy (UOCC) functions
for all models may be represented by a polynomial. The upstream speed (USPD)
functions may be represented by polynomials for the incident detection models for all
84


Full Text

FIXED AND MOBILE SENSOR BASED GENERALIZED ADDITIVE MODELS FOR FREEWAY INCIDENT DETECTION
by
Kittichai Thanasupsin
B.Eng., Khon Kaen University, Thailand, 1992
M.Eng., Asian Institute of Technology, Thailand, 1995
M.S., University of Colorado, 1998
A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Civil Engineering
2002

2002 by Kittichai Thanasupsin All rights reserved.

This thesis for the Doctor of Philosophy Degree by Kittichai Thanasupsin has been approved by Bruce N. Janson Date

Thanasupsin, Kittichai (Ph.D., Civil Engineering)
Fixed and Mobile Sensor Based Generalized Additive Models for Freeway Incident Detection
Thesis directed by Assistant Professor Dr. Sarosh I. Khan
ABSTRACT
Generalized additive models (GAM) to detect lane-blocking and shoulder incidents are developed based on traffic measures estimated from fixed and mobile sensors. The generalized additive model, a nonparametric model, is a generalization of the generalized linear model, allowing appropriate functional forms of independent variables to be proposed. Generalized additive models allow flexible functions to be fitted and therefore their functional forms are revealed in the parametric estimate of generalized additive models. This capability of GAM serves as a powerful interpretive tool to examine the effect of each traffic measure on the probability of an incident. Fixed sensor based incident detection models are developed for lane-blocking and shoulder incidents on the Interstate 25 freeway in Colorado and the Interstate 880 freeway in California. Separate lane-blocking and shoulder incidents models are also developed for the Interstate 880 freeway to examine the characteristic differences between lane-blocking and shoulder incidents, as they relate to incident detection. Characteristics of incidents, model development, including significant variables selection, and model interpretation are also examined. Based on performance measures including detection rate, false alarm rate and mean time to detect, the nonparametric GAM and the parametric estimate of GAM, with only five variables for lane-blocking incidents and six variables for all incidents, outperform several neural network based models using 16 to 24 variables. In this research, the effect of type and length of freeway segments on model performance is also examined. Mobile sensor, and fixed and mobile sensor based incident detection models are developed for lane-blocking and shoulder incidents on the Interstate 25 freeway. The performance of the mobile sensor based model shows the potential use of mobile sensors as an alternative data source. Using mobile sensors as an additional data source to fixed sensor data helps reduce the false alarm rate of the incident detection model. The performance of the incident detection models developed is unbiasedly validated using the bootstrap method. The bootstrap performance examined includes mean detection rate, incident state detection rate, false alarm rate, mean time to detect, and their 95 percent confidence intervals. The bootstrap performance may provide a good estimate of model performance in the field.
This abstract accurately represents the content of the candidate's thesis. I recommend its publication.
Sarosh I. Khan

CONTENTS
Figures
Tables
Chapter
1. Introduction
1.1 Background
1.2 Objectives of Study
1.3 Organization of Dissertation
1.4 Significant Contributions of Study
2. Literature Review
2.1 Fixed Sensor-Based Freeway Incident-Detection Algorithms
2.2 Mobile Sensor-Based Freeway Incident Detection Algorithms
2.3 Fixed and Mobile Sensor-Based Freeway Incident Detection Algorithms
2.4 Mobile Sensor-Based Surface Street Incident Detection Algorithms
2.5 Synthesis of the Literature
2.5.1 Performance of the Freeway Incident Detection Algorithms
2.5.2 Source of Data for Incident Detection Algorithms
2.5.3 Incident Data Sets
2.5.4 Characteristics of the Freeway Incident Detection Algorithms
3. Data Collection
3.1 Data from Colorado
3.1.1 Traffic Measures from Fixed Sensors
3.1.2 Traffic Measures from Mobile Sensors
3.1.3 Incident Data
3.2 Data from California
3.2.1 Traffic Measures from Fixed Sensors
3.2.2 Incident Data
4. Characteristics of Lane-blocking and Shoulder Incidents
4.1 Characteristics of Incidents
4.2 Characteristics of Incidents on the I-880 Freeway
4.3 Characteristics of Incidents on the I-25 Freeway
4.3.1 Incident Rate
4.3.2 Incident Duration
4.3.3 Average Delay
5. Methodology
5.1 Generalized Additive Model (GAM)
5.1.1 Spline Smoothing Function
5.1.2 Fitting Generalized Additive Models
5.2 ".632 Bootstrap" Method
5.3 Model Evaluation
6. Development of Generalized Additive Model for Freeway Incident Detection
6.1 Selection of Independent Variables
6.1.1 Significant Independent Variables for Fixed Sensor Based Incident Detection Models
6.1.2 Significant Independent Variables for Mobile Sensor Based Incident Detection Model
6.1.3 Significant Independent Variables for Fixed and Mobile Sensor Based Incident Detection Model
6.2 Parametric Estimate of Fixed Sensor Based Generalized Additive Model for Incident Detection
6.2.1 Partial Prediction
6.2.2 Generalized Additive Model (GAM) - Parametric Estimate for Fixed Sensor Based Incident Detection Model
6.3 Model Interpretation
6.3.1 Model Interpretation - Lane-blocking Incident Detection Model
6.3.2 Comparison of Model Structures and their Implications
7. Performance of Fixed Sensor, Mobile Sensor, and Fixed and Mobile Sensor Based Incident Detection Models
7.1 Fixed Sensor Based Incident Detection Model
7.1.1 Performance of Fixed Sensor Based Incident Detection Model for the I-25 Freeway
7.1.2 Performance of Fixed Sensor Based Incident Detection Model for the I-880 Freeway
7.1.3 Performance of Fixed Sensor Based Incident Detection Model for Lane-blocking Incidents
7.1.4 Performance of Fixed Sensor Based Incident Detection Model for Shoulder Incidents
7.2 Performance of Fixed Sensor Based Incident Detection Model by Segment Length and Segment Type
7.2.1 The I-880 Freeway
7.3 Mobile Sensor Based Incident Detection Model
7.3.1 Performance of Mobile Sensor Based Incident Detection Model
7.4 Fixed and Mobile Sensor Based Incident Detection Model
7.4.1 Performance of Fixed and Mobile Sensor Based Incident Detection Model
8. An Unbiased Validation of Incident Detection Algorithm Performance Using Bootstrap Method
8.1 Bootstrap Performance of Generalized Additive Model for Freeway Incident Detection
8.2 Bootstrap Performance of Parametric Estimate of Generalized Additive Model
8.3 Summary
9. Summary, Conclusions, and Recommendations
9.1 Characteristics of Incidents
9.2 Significant Independent Variables for Incident Detection Models
9.3 Performance of Generalized Additive Model for Incident Detection
148 9.3 .1 Fixed Sensor Based Incident Detection Model ................................................ 148 9.3.2 Performance of Fixed Sensor Based Incident Detection Model by Segment Length and Segment Type ................................................................. 150 9.3.3 Mobile Sensor based Incident Detection Model .............................................. 150 9.3.4 Fixed and Mobile Sensor Based Incident Detection Model ............................ 151 9.4 Unbiased Validation of Model Performance ...................................................... 152 ix

PAGE 10

9.5 Conclusions .......................................................................................................... 152 9.6 Recommendations ............................................................................................... 154 Appendix ....................................................................................................................... 156 A. Sample SAS Scripts ............................................................................................... 156 A.1 SAS Script to Fit Generalized Additive Model .................................................... 156 A.2 Script to Evaluate Model Performance (DR, ISDR, FAR, and TID) ................. 158 References ...................................................................................................................... 183 . !' i' L. .' ',,''. X

FIGURES

Figure
2.1. In-lane or lane-blocking freeway incident ........ 9
3.1. Schematic of the test network and detector locations ........ 31
3.2. Typical detector configuration for ramp metering ........ 32
3.3. Denver's AVL system ........ 35
3.4. GIS software used to display and extract bus AVL data on the I-25 freeway northbound ........ 36
3.5. The PATH project study section ........ 38
4.1. Types of incidents ........ 43
4.2. Incident characteristics in Colorado ........ 45
4.3. Delay due to an incident ........ 51
6.1. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-25 freeway ........ 75
6.2. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-880 freeway for all incidents ........ 75
6.3. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-880 freeway for lane-blocking incidents ........ 76
6.4. Partial Prediction for UDEVOCC, UOCC, and USPD for all incidents on the I-25 freeway ........ 85
6.5. Partial Prediction for UDEVOCC, UOCC, DOCC, USPD, DSPD, and OCCDF for all incidents on the I-880 freeway ........ 86
6.6. Partial prediction for UDEVOCC, UOCC, DOCC, USPD, and DSPD for the lane-blocking incident model for the I-880 freeway ........ 86
6.7. Partial prediction for UDEVOCC, UOCC, DOCC, USPD, DSPD, and OCCDF for the shoulder incident detection model for the I-880 freeway ........ 87
6.8. Odds ratio, ψ(x), for a unit increase of independent variables ........ 95
6.9. Odds ratio, ψ(x), for upstream ........ 96
6.10. Relationship between upstream speed (USPD) and upstream occupancy deviation from historical upstream occupancy (UDEVOCC) under (a) non-incident conditions and (b) incident conditions ........ 97
6.11. Odds ratio ψ(x) for upstream speed for the lane-blocking incident model and the shoulder incident model ........ 99
6.12. Speed at different location during incident conditions ........ 100
6.13. Odds ratio ψ(x) for speed for the lane-blocking incident model and the shoulder incident model ........ 101
6.14. Odds ratio ψ(x) for upstream speed for the I-25 and the I-880 freeways for all incidents ........ 103
7.1. Performance of the generalized additive model and the multilayer feedforward neural network for the I-25 freeway on the test set ........ 107
7.2. Performance (DR) of ID algorithms for all incidents on the I-880 freeway ........ 110
7.3. Performance (ISDR) of ID algorithms for all incidents on the I-880 freeway ........ 110
7.4. Detection rate for lane-blocking incidents on the I-880 freeway (Test set) ........ 113
7.5. Incident state detection rate for lane-blocking incidents on the I-880 freeway (Test set) ........ 114
7.6. Detection rate at different interval persistence test for the I-880 shoulder test set ........ 117
7.7. Incident state detection rate at different interval persistence test for the I-880 shoulder test set ........ 117
7.8. Detection rate by segment length ........ 119
7.9. Mean time to detect by segment length ........ 119
7.10. Detection rate by segment type for short segments ........ 121
7.11. Mean time to detect by segment type for short segments ........ 121
7.12. Detection rate by segment type for long segments ........ 122
7.13. Mean time to detect by segment type for long segments ........ 123
7.14. Performance of mobile sensor based generalized additive model ........ 125
7.15. Cumulative average delay of incidents when the first probe vehicle is available ........ 126
7.16. Performance of mobile sensor based incident detection by average delay when the first probe vehicle is available ........ 127
7.17. Performance of fixed and mobile sensor based incident detection model ........ 129
8.1. Bootstrap performance of GAM for incident detection ........ 133
8.2. Frequency plot of bootstrap performance ........ 134
8.3. Frequency plot of degree of freedom for UDEVOCC ........ 135
8.4. Degree of freedom of UDEVOCC with DR and ISDR ........ 136
8.5. Bootstrap performance by mean time to detect ........ 137
8.6. Performance of parametric estimate of GAM at zero interval persistence test ........ 139
8.7. Scatter plot of DR and FAR with histograms at zero interval persistence test ........ 140
8.8. Scatter plot of ISDR and FAR with histograms at zero interval persistence test ........ 140

TABLES

Table
3.1. Distance between detector locations ........ 33
4.1. Incident rate for the I-25 and the I-880 freeway ........ 46
4.2. Chi-square goodness of fit test for one-way contingency table ........ 47
4.3. Duration of incident for the I-25 and the I-880 freeway ........ 49
4.4. Pooled t-test and Kolmogorov-Smirnov test for duration of incidents ........ 49
4.5. Average delay due to incidents on the I-25 and the I-880 freeway ........ 54
4.6. Pooled t-test and test for average delay ........ 54
6.1. Traffic measures for incident detection ........ 71
6.2. Univariable model ranked based on Deviance ........ 77
6.3. Analysis of Deviance for preliminary models for all incidents on the I-880 freeway ........ 78
6.4. Analysis of Deviance for the generalized additive models ........ 79
6.5. Analysis of Deviance for mobile sensor based incident detection model ........ 82
6.6. Analysis of Deviance for mobile based incident detection model ........ 82
6.7. Analysis of Deviance for fixed and mobile sensor based incident detection model ........ 83
6.8. Model analysis for parametric estimate of generalized additive model for the all incidents model for the I-25 freeway ........ 89
6.9. Model analysis for parametric estimate of generalized additive model for the all incidents model for the I-880 freeway ........ 90
6.10. Model analysis for parametric estimate of generalized additive model for the lane-blocking incident model for the I-880 freeway ........ 91
6.11. Model analysis for parametric estimate of generalized additive model for the shoulder incident model for the I-880 freeway ........ 92
7.1. Performance of the generalized additive model on the test set including lane-blocking and shoulder incidents ........ 109
7.2. Performance of incident detection models ........ 112
8.1. Minimum and maximum values of bootstrap performance of GAM ........ 134
8.2. Bootstrap performance of parametric estimate of GAM with 95% confidence interval ........ 141

1. Introduction

1.1 Background

Congestion on the freeway is an economic, social, and environmental concern. It causes excessive delay, queue backups, increased fuel consumption, and increased air pollution. There are two types of freeway congestion: recurring and non-recurring. Recurring congestion, or predictable congestion, is caused by excessive demand during the morning and evening peak periods or by a reduction in freeway capacity due to a change in roadway geometry. Reduction in freeway capacity can be caused by lane drops, weaving sections, horizontal curvature, and vertical alignment. Non-recurring congestion, or unpredictable congestion, is caused by an incident. Examples of freeway incidents include traffic accidents, stalled vehicles, hazardous spills, debris, or any other unexpected event that disrupts the flow of traffic on the freeway.

Under the national program on Intelligent Transportation Systems (ITS), the principal thrusts of research are in the areas of Advanced Transportation Management Systems (ATMS) and Advanced Traveler Information Systems (ATIS). Both ATMS and ATIS focus on monitoring traffic congestion in real time. A major concern is providing decision support to effectively detect, verify, and develop response strategies for incidents that disrupt the flow of traffic. A key element in providing such support is automating the process of detecting incidents on large-area roadway networks. Incident detection is a major component not only of ATMS but also of ATIS, which provides dynamic route guidance to travelers. The impacts of non-recurring congestion may also be reduced considerably by reducing the time required to detect and clear an incident.

The disruption to traffic flow by an incident causes a reduction of roadway capacity at the affected location. The reduced capacity may be less than upstream demand and cause congestion upstream of the incident. An incident may be either a lane-blocking incident or a shoulder incident (Lindley 1986). Examples of lane-blocking incidents are accidents and debris on the freeway; examples of shoulder incidents are stalled vehicles and vehicles with flat tires. Shoulder incidents can also cause a significant reduction of capacity due to rubbernecking. Because an incident detection model is implemented by a traffic management center (TMC) to detect freeway incidents, the TMC is interested in detecting both lane-blocking incidents and shoulder incidents. If shoulder incidents are detected rapidly, a TMC may respond by sending a service unit to clear the incidents and assist motorists, reducing the impact of the incidents. As will be presented in Chapter 4 of this dissertation, shoulder incidents also cause considerable delay to motorists. Therefore, an incident detection model that detects both lane-blocking and shoulder incidents that disrupt traffic is of utmost importance. Some incident detection models reported in the literature have been developed based on lane-blocking incident data (Abdulhai and Ritchie 1999; Cheu and Ritchie 1995; Jin et al. 2002). An incident detection model developed based on only lane-blocking incidents may not perform well in detecting shoulder incidents.

The data source for an incident detection model can be infrastructure-based and/or non-infrastructure-based sensors.
The infrastructure-based sensors include loop detectors embedded in pavements or video-based detection systems that provide estimates of traffic flow measures such as flow rate, speed, and occupancy (percent of time detectors are occupied by traffic) at regular intervals. Examples of non-infrastructure-based sensors include location tracking systems installed in selected vehicles in the traffic stream that serve as probe vehicles. Global positioning system based tracking systems provide vehicle location and speed at regular intervals.

Loop detectors have several disadvantages. They are difficult to install and maintain, since the freeway needs to be closed, the pavement cut and wired, and control cabinets installed. In addition, in order to collect accurate measurements of occupancy, the loop detectors need to be properly tuned. This process takes a considerable amount of time and effort (Skabardonis 1995). The spacing between fixed loop detectors also affects the performance of an incident detection model. For example, for a short-duration incident with large spacing between fixed loop detectors, the incident either may not be detected or the time to detect may be high.

Use of probe vehicles as a real-time data source has received considerable attention in the last few years. Buses equipped with automatic vehicle location (AVL) systems are a potential resource and may be used as mobile sensors or probe vehicles. Cellular phones reporting location data to a Traffic Management Center (TMC) may also serve as probes. The probe vehicle data may be used for freeway segments where loop detectors are not available, or used in combination with loop detector data where available, to improve the performance of an incident detection model.

The performance of an incident detection (ID) model may be evaluated based on its detection rate (DR), false alarm rate (FAR), and mean time to detect (TTD) incidents. The performance of the incident detection model developed should be validated without bias. The most stringent test is an external validation, in which the model is tested on a new population. Unbiased internal validation is an alternative test for model validation.
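The three performance measures named above can be illustrated with a small sketch. It assumes commonly used definitions, which may differ in detail from those adopted later in this dissertation: the detection rate is the share of incidents detected, the false alarm rate is the share of algorithm decisions that are false alarms, and the mean time to detect is the average time between the start of an incident and its first alarm. The `Incident` record, the function names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Incident:
    """One incident record (hypothetical structure for illustration)."""
    start_min: float                 # time the incident began, in minutes
    detected_min: Optional[float]    # time of the first alarm, or None if missed


def detection_rate(incidents: Sequence[Incident]) -> float:
    """Fraction of incidents that triggered an alarm."""
    detected = sum(1 for inc in incidents if inc.detected_min is not None)
    return detected / len(incidents)


def false_alarm_rate(false_alarms: int, decisions: int) -> float:
    """False alarm decisions divided by all decisions made (assumed definition)."""
    return false_alarms / decisions


def mean_time_to_detect(incidents: Sequence[Incident]) -> float:
    """Mean delay between incident start and first alarm, detected incidents only."""
    delays = [inc.detected_min - inc.start_min
              for inc in incidents if inc.detected_min is not None]
    return sum(delays) / len(delays)


# Made-up example: four incidents, one of which is never detected.
incidents = [Incident(0.0, 2.0), Incident(10.0, 15.0),
             Incident(20.0, None), Incident(30.0, 31.0)]
print(f"Detection rate:      {detection_rate(incidents):.1%}")
print(f"False alarm rate:    {false_alarm_rate(3, 1000):.2%}")
print(f"Mean time to detect: {mean_time_to_detect(incidents):.2f} min")
```

The three measures pull against each other: a more sensitive alarm rule typically raises the detection rate and shortens the time to detect at the cost of more false alarms, which is why they are reported together.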
The techniques for obtaining nearly unbiased internal assessments of accuracy or performance include data-splitting, cross-validation, and bootstrapping. Data-splitting is the technique most commonly used for incident detection model validation (Abdulhai 1996; Cheu 1994; Dia et al. 1997). For incident detection model development, the field data set is usually relatively small due to the difficulty of collection and coordination. One disadvantage of data-splitting is that it reduces the sample size of the data set for model development and testing. Another disadvantage is that the model performance relies on a single data split. Cross-validation is repeated data-splitting. The advantages of cross-validation over data-splitting are that the training set can be much larger and that variability is reduced by not relying on a single sample split. Bootstrapping involves sampling the original sample with replacement. The size of a bootstrap sample is as large as the original sample. Efron and Tibshirani (1993) show that cross-validation is roughly unbiased but can show large variability. The simple bootstrap method shows lower variability but can be severely biased downward. The .632 bootstrap performs the best among all methods.

An incident detection model that has a high detection rate, a low false alarm rate, and a low mean time to detect is essential for the operations of a TMC. Studies have shown that the alarms generated in a Traffic Management Center by an ID model with a high false alarm rate are often ignored. The impacts of non-recurring congestion can be minimized by detecting severe incidents rapidly. Moreover, a model that is easy to implement and calibrate or train, and that performs well, is desired.

1.2 Objectives of Study

An incident detection (ID) model with a high detection rate, a low false alarm rate, and a low mean time to detect is an integral part of the decision support system of an advanced traffic management system (ATMS) operated by a traffic management center (TMC). The impacts of non-recurring congestion may be reduced considerably by reducing the time required to detect and clear an incident. Alarms generated by an incident detection model with a high false alarm rate are usually ignored by the TMC. Therefore, not only a high detection rate but also a low false alarm rate is an operational requirement of an incident detection model. As an incident detection model is developed and tested, an accurate estimate of the performance of the model is also desired. The objectives of this research are to:

1. Study the characteristics of lane-blocking and shoulder incidents.
2. Identify significant independent variables for incident detection models.
3. Develop generalized additive models (GAM) to detect lane-blocking and shoulder incidents based on data from fixed and mobile sensors.
4. Compare the performance of the incident detection models developed with recent models reported in the literature.
5. Examine the effect of segment type and segment length on incident detection.
6. Obtain an accurate estimate of the performance of the model developed by applying a bootstrap method.

1.3 Organization of Dissertation

The dissertation consists of nine chapters and is organized as follows:

Chapter 1 presents a background on incident detection and the objectives and scope of this dissertation.

Chapter 2 presents a comprehensive review of the literature on incident detection. A synthesis of the literature on the techniques applied to freeway incident detection, performance of the models, incident data sets used, and the strengths and weaknesses of the models is presented.

Chapter 3 describes the fixed sensor, mobile sensor, and incident data collected to develop and test the proposed incident detection models and to compare their performance to a few recently developed models reported in the literature.

Chapter 4 examines the characteristics of incidents, including incident type, incident rate, incident duration, and average delay.

Chapter 5 presents the methodology proposed for a new incident detection model.

Chapter 6 describes the development of generalized additive models for incident detection, including selection of independent variables and the parametric estimate of generalized additive models. The difference in traffic patterns that results from lane-blocking and shoulder incidents is also examined.

Chapter 7 presents the performance of the fixed sensor, mobile sensor, and fixed and mobile sensor based generalized additive models and their parametric estimates for all incidents on the Interstate 25 freeway in Colorado and the Interstate 880 freeway in California, and for lane-blocking incidents and shoulder incidents on the Interstate 880 freeway. The effect of the length and type of the freeway segments on the performance of the incident detection model is also presented in this chapter.

Chapter 8 presents the bootstrap estimate of the performance of the models developed for the Interstate 880 freeway.

Chapter 9 summarizes the findings of the research and presents the conclusions and recommendations.

1.4 Significant Contributions of Study

The significant contributions of this study are as follows:

1. A new and improved incident detection model is developed to detect incidents.
2. Most incident detection models reported in the literature detect only lane-blocking incidents. This research proposes incident detection models to detect lane-blocking and shoulder incidents. The characteristics of lane-blocking and shoulder incidents are also examined and compared.
3. This study examines the effect of type and length of segments on incident detection.
4. The performance of incident detection models is unbiasedly validated based on a bootstrap method. The performance reported includes the mean DR, ISDR, FAR, and TTD, and their 95 percent confidence intervals.
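The bootstrap validation named in the fourth contribution can be sketched as follows. This is a simplified illustration, not the procedure used in this dissertation: a toy one-variable threshold rule stands in for the fitted model, the out-of-bag error is averaged over bootstrap replicates rather than per observation, and the weights 0.368 and 0.632 follow the standard form of the .632 estimator. All function names and the sample data are hypothetical.

```python
import random
from typing import List, Optional, Tuple

# (upstream occupancy in percent, label) where 1 = incident and 0 = non-incident
Sample = Tuple[float, int]


def fit_threshold(train: List[Sample]) -> Optional[float]:
    """Toy stand-in for model fitting: midpoint between the two class means."""
    inc = [x for x, y in train if y == 1]
    non = [x for x, y in train if y == 0]
    if not inc or not non:
        return None  # a resample may miss one class entirely
    return (sum(inc) / len(inc) + sum(non) / len(non)) / 2


def error_rate(threshold: float, data: List[Sample]) -> float:
    """Misclassification rate when occupancy above the threshold means incident."""
    wrong = sum(1 for x, y in data if (x > threshold) != (y == 1))
    return wrong / len(data)


def bootstrap_632(data: List[Sample], n_boot: int = 200, seed: int = 0) -> float:
    """Simplified .632 estimate: 0.368 * apparent error + 0.632 * out-of-bag error."""
    rng = random.Random(seed)
    n = len(data)
    full_fit = fit_threshold(data)
    if full_fit is None:
        raise ValueError("data must contain both incident and non-incident samples")
    apparent = error_rate(full_fit, data)

    oob_errors = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        drawn = set(idx)
        held_out = [data[i] for i in range(n) if i not in drawn]
        threshold = fit_threshold([data[i] for i in idx])
        if threshold is None or not held_out:
            continue                                  # skip degenerate resamples
        oob_errors.append(error_rate(threshold, held_out))

    if not oob_errors:
        raise ValueError("no usable bootstrap replicates")
    oob = sum(oob_errors) / len(oob_errors)
    return 0.368 * apparent + 0.632 * oob


# Made-up data: occupancy tends to be higher during incidents, with some overlap.
data = [(8, 0), (12, 0), (15, 0), (18, 0), (22, 0), (20, 0),
        (14, 1), (25, 1), (27, 1), (30, 1), (34, 1), (38, 1)]
print(f".632 bootstrap error estimate: {bootstrap_632(data):.3f}")
```

The weighting reflects the trade-off described in the background: the apparent error is optimistic because the model is scored on the data it was fitted to, while the out-of-bag error is pessimistic because each replicate is fitted on only part of the distinct observations.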

2. Literature Review

Incident detection is an important function of a Traffic Management Center with an Advanced Traffic Management System (ATMS) in place. From the early 1970s, research has focused on developing incident detection algorithms for freeways. Since the 1980s, the scope of incident detection work has expanded to include signalized urban arterials.

The source of data for an incident detection algorithm may be infrastructure-based fixed sensors and/or non-infrastructure-based mobile sensors. The infrastructure-based sensors include loop detectors embedded in pavements or video-based detection systems that provide estimates of traffic flow measures such as flow rate, speed, and occupancy (percent of time detectors are occupied by traffic). Incident detection algorithms typically rely on data received from fixed, infrastructure-based loop detectors. Loop detectors, typically installed at regular spacing, provide a good source of temporal variation of traffic measures, while the spatial variation of traffic measures is available only at discrete intervals, e.g., half a mile. Non-infrastructure mobile sensors provide spatial variation of traffic measures at regular time intervals. A limited study based on simulated data with a high penetration rate of probe vehicles has shown improvement in the performance of an incident detection algorithm when both fixed loop detectors and probe vehicles are used as data sources. Very few studies have used field data, especially for integrated fixed and mobile sensor based incident detection algorithms.

Incident detection algorithms rely on the traffic patterns that emerge as an incident occurs. Typically, it is assumed that as a freeway incident occurs, the flow rate decreases, the occupancy increases, and the speed decreases upstream of the incident, while the flow rate decreases, the occupancy decreases, and the speed increases downstream of the incident (Figure 2.1). However, the increase in speed downstream of the incident may vary with the distance between the incident location and the downstream detector station, or with the spatial distribution of loop detectors, since vehicles need distance to accelerate to the desired speed after passing an incident.

Figure 2.1. In-lane or lane-blocking freeway incident, showing the regions upstream and downstream of the incident.

Several techniques have been applied to incident detection. Examples include decision-tree based pattern recognition techniques, time-series based statistical approaches, catastrophe theory, and artificial intelligence based neural networks.

There are six types of incident detection algorithms based on availability of data: (i) fixed sensor-based freeway incident detection algorithms, (ii) mobile sensor-based freeway incident detection algorithms, (iii) fixed and mobile sensor-based freeway incident detection algorithms, (iv) fixed sensor-based surface street incident detection algorithms, (v) mobile sensor-based surface street incident detection algorithms, and (vi) fixed and mobile sensor-based surface street incident detection algorithms. This chapter of the dissertation presents a review of all freeway incident detection algorithms and only probe-based surface street incident detection algorithms.

2.1 Fixed Sensor-Based Freeway Incident Detection Algorithms

Fixed sensors are typically installed on major urban freeways. Therefore, most incident detection algorithms developed since the 1970s are based on data collected by fixed sensors such as loop detectors.

A series of 11 pattern matching algorithms based on decision trees was developed for freeway incident detection by Payne (Payne and Tignor 1978). These threshold-based algorithms, better known as the California Algorithms, detect a freeway incident by classifying traffic patterns as incident or non-incident conditions using a decision tree. The algorithms detect a discontinuity between upstream and downstream occupancy. An incident is declared if measures calculated from 1-minute average occupancy for upstream and downstream stations exceed predetermined thresholds. Other measures used in the algorithm, calculated from upstream occupancy and downstream occupancy, are the spatial difference in occupancies (OCCDF), the relative spatial difference in occupancies (OCCRDF), and the relative temporal difference in downstream occupancy (DOCCTD). California Algorithm #1 compares these three measures (OCCDF, OCCRDF, and DOCCTD) to three corresponding thresholds. The algorithm indicates either an incident or an incident-free state. California Algorithm #2, or the Modified California Algorithm, is a refinement of California Algorithm #1 with an incident-continuing state as an additional output state. Algorithm #4 is similar to Algorithm #2 but uses downstream occupancy instead of the relative temporal difference in downstream occupancy (DOCCTD). This helps reduce false alarms due to compression waves in heavy traffic. A compression wave is the growth of congested traffic, which causes a high occupancy that moves through the traffic stream in a direction counter to the flow. Algorithm #7 uses the same traffic measures as Algorithm #4 with a persistence requirement.
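The threshold logic of California Algorithm #1 can be sketched as below. The formulas for OCCDF, OCCRDF, and DOCCTD follow the definitions commonly cited for the California Algorithms and are an assumption here, since this section names the measures without giving their equations. The two-interval lag in DOCCTD, the threshold values, and the function names are likewise illustrative, not calibrated values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical threshold values; real ones are calibrated for each site."""
    t1: float = 8.0    # OCCDF, percentage points of occupancy
    t2: float = 0.5    # OCCRDF, dimensionless
    t3: float = 0.15   # DOCCTD, dimensionless


def occdf(up_occ: float, down_occ: float) -> float:
    """Spatial difference in occupancies between upstream and downstream stations."""
    return up_occ - down_occ


def occrdf(up_occ: float, down_occ: float) -> float:
    """Spatial difference in occupancies relative to the upstream occupancy."""
    return (up_occ - down_occ) / up_occ if up_occ > 0 else 0.0


def docctd(down_occ_prev: float, down_occ_now: float) -> float:
    """Relative drop in downstream occupancy since an earlier interval (assumed t-2)."""
    if down_occ_prev <= 0:
        return 0.0
    return (down_occ_prev - down_occ_now) / down_occ_prev


def california_1(up_occ: float, down_occ: float, down_occ_prev: float,
                 th: Thresholds = Thresholds()) -> bool:
    """Return True (incident) only if all three measures reach their thresholds."""
    return (occdf(up_occ, down_occ) >= th.t1
            and occrdf(up_occ, down_occ) >= th.t2
            and docctd(down_occ_prev, down_occ) >= th.t3)


# Upstream occupancy has risen while downstream occupancy has dropped.
print(california_1(up_occ=35.0, down_occ=10.0, down_occ_prev=18.0))
# Similar occupancies at both stations: no discontinuity between them.
print(california_1(up_occ=20.0, down_occ=18.0, down_occ_prev=18.0))
```

Requiring all three tests to pass is what makes the rule a decision tree: a large occupancy gap alone is not enough unless it is also large relative to upstream occupancy and accompanied by a recent drop downstream.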
The Algorithm #8 has the capability to account for compression wave and persistence requirement to 10

PAGE 27

reduce the false alarm rate:A persistence test checks for incident states for consecutive intervals before an incident is declared. The California Algorithm was developed and evaluated basedon field data obtained from Los Angeles and Minneapolis freeway surveillance systems including approximately 150 incidents. The California Algorithm #8 outperformed the other California Algorithms. The. detection rate reported is up to 61 percent with false alarm rate of 0.177 percent. The algorithm is often used as a benchmark when comparing and evaluating new algorithms because of its widespread use in traffic management centers (TMC). An exponential smoothing algorithm for freeway incident detection algorithm was developed by Cook et al. (Cook and Cleveland 1974). The variables are occupancy, speed, volume, and energy computed from speed and volume. The algorithm was developed and tested based on 1-minute interval field data from John C. Lodge Freeway in Detroit including 50 incidents included18 accidents, 28 stalls and breakdowns, 2 instances of debris, and 2 short maintenance operations. The highest detection rate reported from the Exponential Station Discontinuity model islOO percent at a false alarm rate of 6.5 percent and the lowest false alarm-rate, 5.73 percent at a detection rate of 96 percent.' : The Standard Normal Deviate (SND) model is based on the assumption that an . I . . incident results in a high rate of change in lane occupancy and energy computed from volume and speed measurements{Dudek et al. 1974). The algorithm was tested and evaluated based on field data for 35 real incidents in Houston, Texas. The SND algorithm using a 5-minute interval occupancy as the control variable resulted in the best performance. the detection rate reported is 92 percent with 1.3 percent false alarm rate. A Bayesian approach.(Levin and Krause 1978) assumes that the normal traffic flow follows its historic trend and any deviation from this trend that exceeds a certain 11

PAGE 28

threshold indicates an incident. The variable used is the ratio of the difference between upstream and downstream 1-minute occupancies to the upstream occupancy. The algorithm requires the frequency distribution of the variable during incident and incident-free conditions from historical data, and the probability of an incident occurring at a given detector. This algorithm was evaluated based on field data for 17 incidents on the outbound Kennedy Expressway, Chicago. The Bayesian Algorithm compares favorably with the California Algorithm in terms of detection rate and false alarm rate. The detection rate reported for the Bayesian Algorithm is 100 percent with a false alarm rate of 0 percent. However, the mean time to detect is higher than that of the California Algorithm. The structure of the Bayesian Algorithm limits the mean time to detect to at least 4 time intervals.

Two pattern recognition methods, the HIOCC (High OCCupancy) and PATREG algorithms, were developed by Collins (Collins et al. 1979). The HIOCC identifies the presence of stationary or slow moving vehicles over detectors based on occupancy data. The one-second occupancy from a detector is obtained by scanning the detector every 1/10 second to determine whether the detector is occupied. The one-second occupancy is smoothed with single-stage exponential smoothing at the end of every second. Several consecutive seconds of very high detector occupancy initiate an incident alarm. The PATREG algorithm monitors the traffic speed in each lane between pairs of detector stations. Speed is compared against pre-determined thresholds. A significant change in speed is an indication of an incident. Both algorithms were developed and tested with two data sets, one from the M4 Motorway near London with no incidents and another from the Boulevard Peripherique in Paris with 12 incidents. The data from the M4 Motorway is used to test the algorithms for their ability to detect queues due to downstream congestion.
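The HIOCC logic described above can be illustrated with a minimal sketch. The function names, the smoothing constant `alpha`, the occupancy threshold `high`, and the run length `consecutive` are all hypothetical, and applying the threshold to the smoothed series is an assumption made for illustration; the calibrated values and exact test of Collins et al. (1979) are not given in this review.

```python
from typing import Iterable, List


def one_second_occupancy(scans: Iterable[bool]) -> float:
    """Fraction of the 1/10-second scans in one second that found the detector occupied."""
    scans = list(scans)
    return sum(scans) / len(scans)


def hiocc_alarms(occupancy: Iterable[float],
                 alpha: float = 0.5,      # hypothetical smoothing constant
                 high: float = 0.9,       # hypothetical "very high occupancy" threshold
                 consecutive: int = 3     # hypothetical number of consecutive seconds
                 ) -> List[bool]:
    """Return, for every second, whether an alarm would be raised.

    Single-stage exponential smoothing is applied to the one-second occupancy,
    and an alarm is raised once the smoothed value has stayed at or above
    `high` for `consecutive` seconds in a row (an assumed reading of the text).
    """
    smoothed = None
    run = 0
    alarms = []
    for occ in occupancy:
        smoothed = occ if smoothed is None else alpha * occ + (1 - alpha) * smoothed
        run = run + 1 if smoothed >= high else 0
        alarms.append(run >= consecutive)
    return alarms
```

For example, `hiocc_alarms([0.1, 0.1, 1, 1, 1, 1, 1, 0.1], consecutive=2)` stays silent until the smoothed occupancy has exceeded 0.9 for two seconds, which illustrates why such a scheme responds to any standing queue and not only to incidents.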
The HIOCC detected all 12 incidents and the queues due to congestion on the Boulevard Peripherique. The PATREG did not detect the incidents in the heavy traffic on the Boulevard Peripherique. It

PAGE 29

performs satisfactorily in free-flow conditions up to about 1500 veh/h per lane on the M4 Motorway data in terms of false alarm rate.

Tsai (Tsai and Case 1979) applied the maximum-likelihood decision principle to develop an optimum incident persistence test to improve the performance of the incident detection algorithm. The technique is applied to the modified California Algorithm (Payne and Tignor 1978) to reduce the false alarm rate. The algorithm is tested with field data of 28 real incidents from the Queen Elizabeth Way freeway in Canada. The false alarm rate decreases from 0.09 percent to 0.06 percent, but the detection rate also decreases from 85 percent to 74 percent.

Fambro (Fambro and Ritch 1979) developed an algorithm for freeway incident detection under low-volume conditions. The algorithm assumes that under low-volume conditions, vehicle speeds remain constant over short segments of freeway. The control variables include the time that a vehicle enters the segment and the predicted exit time computed from the speed estimated from detectors. An incident is declared if fewer than the expected number of vehicles exit the section in an interval. The algorithm was developed and tested with field data from the I-610 freeway in Houston, Texas. The algorithm detects 100 percent of incidents when the flow rate is less than 400 veh/h. It detects 61 percent of incidents when the flow rate is between 800 and 1200 veh/h.

An autoregressive integrated moving average (ARIMA) algorithm was developed for freeway incident detection by Ahmed (Ahmed and Cook 1982). It assumes that traffic flow can be modeled from historical, time-varying traffic data, and it compares observed traffic measures such as occupancy against predicted traffic measures. Significant deviations between observed and estimated values of traffic measures lead to an incident alarm. The algorithm was developed and evaluated based on field data for 50 real incidents on the Lodge Freeway in Detroit.
The detection rate reported is 100 percent with a false alarm rate of 2.6 percent by using constant-parameter values estimated from incident-free data.

PAGE 30

The false alarm rate decreases to 1.4 percent when parameter estimates are updated occasionally. The mean time to detect decreases from 0.58 minute to 0.39 minute. The performance of the time series algorithm depends on the robustness of the optimum confidence interval, as it determines the threshold deviations.

The McMaster algorithm (Persaud and Hall 1989) for freeway incident detection is based on catastrophe theory. The flow-occupancy curve is used as the decision criterion for detecting incidents by separating the areas corresponding to different states of traffic conditions. The flow-occupancy criterion is derived from the catastrophe theory model of the three-dimensional relationship between flow, occupancy, and speed. Incidents are detected by observing specific changes in traffic measures in a short time period. Aultman-Hall (Aultman-Hall et al. 1991) developed and evaluated the algorithm using flow rate, occupancy, and speed at a single station. The flow-occupancy curve varies for each station depending on geometric characteristics. The flow-occupancy curve has four possible states: normal uncongested operation, operation downstream of an incident, operation within a queue of slow-and-go traffic, and capacity operation downstream of a recurrent bottleneck. The algorithm was developed and evaluated based on 30-second field data collected from the Burlington Skyway in Ontario, Canada. The algorithm detects 6 out of 10 incidents (60 percent). The 4 incidents that are missed are reported to have no significant effect on traffic. The false alarm rate is one every 10 hours for a 2-interval persistence check and one every 39 hours for a 3-interval persistence check. The algorithm was also tested with field data for 31 incidents from the Queen Elizabeth Way in Ontario, Canada. The algorithm detects 14 incidents. The other 17 missed incidents are reported to have bad or no detector data, or to have had no visible effect on the data.
The false alarm rate reported is 15 alarms in 39 days.
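The persistence checks referred to above (an incident is declared only after the incident state has been observed for a number of consecutive intervals) can be sketched as follows. This is a generic illustration under assumed names (`persistence_test`, `states`, `n`), not the specific logic of any of the algorithms reviewed.

```python
from typing import Iterable, List


def persistence_test(states: Iterable[bool], n: int) -> List[bool]:
    """Declare an incident only when the raw incident state has persisted
    for at least `n` consecutive intervals (n is a hypothetical parameter)."""
    run = 0
    declared = []
    for is_incident_state in states:
        run = run + 1 if is_incident_state else 0
        declared.append(run >= n)
    return declared
```

With the raw states `[True, False, True, True, True, False]`, a 2-interval check declares an incident in the fourth and fifth intervals, while a 3-interval check declares it only in the fifth. The isolated first alarm is suppressed in both cases, at the cost of a longer time to detect.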

PAGE 31

A combination of fuzzy logic and the learning capabilities of neural networks was applied to freeway incident detection by Hsiao (Hsiao et al. 1993). The input variables used are 5-minute interval volume, occupancy, and rate of change of occupancy. Data collected from the Dan Ryan Expressway in Chicago at three different locations, including 6, 9, and 4 incidents, are used to develop and test the algorithm. The first data set with 6 incidents is used to train the algorithm. The detection rates for the second and third data sets are 66.7 percent and 75 percent, respectively. The false alarm rates reported are 0 for both data sets.

A low-pass filtering technique was applied to freeway incident detection by Stephanedes (Stephanedes and Chassiakos 1993). The low-pass filtering algorithm, or Minnesota Algorithm, filters the raw traffic data before an incident detection algorithm is applied. This helps to reduce the false alarm rate due to short-term traffic fluctuations. The 30-second average upstream occupancy and downstream occupancy are used to compute the spatial occupancy difference. A linear filter, a moving-average smoother, is used to filter out short-term fluctuations of the occupancy difference. The variables are compared against thresholds. The algorithm was developed and tested with field data of 27 real incidents on I-35W in Minneapolis. The Minnesota Algorithm outperformed the California Algorithm, California Algorithm #7 (Payne and Tignor 1978), the Standard Normal Deviate (SND) algorithm (Dudek et al. 1974), and the Double-Exponential Algorithm (Cook and Cleveland 1974). The detection rate is 50 percent at a false alarm rate of 0.1 percent. A detection rate of 80 percent is achieved with a false alarm rate of 0.3 percent.

An artificial intelligence based, non-linear pattern recognition technique, the neural network, has been applied to freeway incident detection. Cheu (Cheu and Ritchie 1995) first applied the multi-layer feedforward (MLF) neural network to
develop an incident detection algorithm based on 30-second average occupancy and

PAGE 32

average volume data for up to four previous time intervals for the upstream station and up to two previous time intervals for the downstream station, from loop detectors simulated for a freeway in California. Four hundred simulated incidents are used for training and another set of 400 simulated incidents is used to evaluate the progress of training. Nine hundred and eighty simulated incidents and 9 real incidents are used to evaluate the performance of the neural network algorithm. The study reports that the neural network algorithm outperformed the California Algorithm #8, the McMaster algorithm, and the Minnesota algorithm.

A multi-layer feedforward neural network was applied to an extensive incident data set by Dia (Dia et al. 1997). The data set included 100 real incidents from the Tullamarine Freeway in Melbourne, Australia. Out of the 100 incidents, 60 are used for training and 40 are used for validation. The variables used are 20-second average occupancy, average speed, and average volume at upstream and downstream detector stations at the current time interval. The algorithm is compared to the ARRB/VicRoads model (Luk 1992). The logic of the ARRB/VicRoads model is to compare the traffic data between adjacent stations and lanes. An incident is declared if the differences exceed pre-determined thresholds. The MLF outperformed the ARRB/VicRoads model. The detection rate reported for the MLF is 82.5 percent with a false alarm rate of 0.065 percent and a mean time to detect of 203 seconds.

A framework for incorporating a neural network based continuous learning capability, using least squares and error back propagation, into the California Algorithm and the McMaster algorithm was developed by Peeta (Peeta and Das 1998). The least squares and error back propagation techniques are implemented in the California Algorithm to continuously update the thresholds over time.
The least squares technique and the error back propagation are applied to the McMaster algorithm to update the parameters that classify the regions of the four states of the flow-occupancy curve. Simulation data of the Borman

PAGE 33

Expressway in Indiana is used to develop and test the framework. The algorithm outperformed the California Algorithm and the McMaster Algorithm. The performance of the California and McMaster algorithms with continuous learning capability also improved with time in service. The error back propagation shows better learning capability, that is, a shorter time in service to reach a 100 percent detection rate, than the least squares technique.

A CUSUM algorithm was developed for freeway incident detection by Teng (Teng et al. 1998). It assumes that the change in traffic processes can be distinguished by the difference between the current cumulative sum of the log-likelihood ratio and its minimum value up to the current time period. The log-likelihood ratio is the logarithm of the ratio of the probability density functions associated with the normal and the changed conditions. The variables are upstream and downstream occupancies. The field data from I-880 of the California PATH project was used to develop and test the algorithm. The 63 incidents that obviously affect the traffic were used. The study reports that the algorithm performed better than the California Algorithm and the low-pass filtering algorithm.

Two neural network models, multi-layer feedforward (MLF) and fuzzy adaptive resonance theory (ART), were applied to freeway incident detection by Ishak (Ishak and Al-Deek 1999). The algorithm was developed and tested with field data of 130 lane-blocking incidents on the I-4 freeway in Orlando, Florida. The MLF algorithm reports a detection rate of 63 percent with a false alarm rate of 0.05 percent when using 30-second occupancy and speed as algorithm inputs. The detection rate reported is 61 percent with a false alarm rate of 0 percent when occupancy, speed, and volume are used as input variables.
The fuzzy ART algorithm performed best when using a persistence factor (PF) of 2 and a vigilance parameter of 0.95 with occupancy and speed as input variables. The vigilance parameter controls the dynamics of the fuzzy

PAGE 34

ART network and determines the degree of clustering achieved by the algorithm. For false alarm rates up to 0.07 percent, the MLF algorithm outperformed the fuzzy ART algorithm and the California Algorithms #7 and #8. For false alarm rates greater than 0.07 percent, the fuzzy ART algorithm outperformed the MLF network and the California Algorithms #7 and #8. The fuzzy ART has advantages over the MLF network due to its fast, stable learning in response to analog or binary input patterns.

The Bayesian-based probabilistic neural network (PNN), which utilizes the concept of statistical distance, instead of Euclidean distance, as a measure of the nearness of a weighted vector to the different patterns, was applied to freeway incident detection by Abdulhai (Abdulhai and Ritchie 1999). The same simulated and real incident data used by a previous study (Cheu and Ritchie 1995) and additional real data from I-880 in California and I-35W in Minnesota, including 45 and 159 incidents respectively, were used to evaluate the PNN algorithm. The same 16 input variables as in the previous study (Cheu and Ritchie 1995) were used in this study. The algorithm compares favorably to the multi-layer feedforward neural network in terms of detection rate, false alarm rate, and mean time to detect. The study reports that the Bayesian-based PNN trains faster than the MLF neural network and is potentially transferable to new sites without the need for explicit off-line retraining. However, the characteristics of the incidents on the I-880 freeway and the I-35W freeway are not compared.

A more recent model, the constructive probabilistic neural network (CPNN), was developed by Jin (Jin et al. 2002). The model was structured based on a mixture Gaussian model and a dynamic decay adjustment algorithm. A mixture Gaussian model allows the PNN to include different parameters for each unit pattern, which can be obtained by the dynamic decay adjustment algorithm.
The CPNN models were developed and tested on 300 simulated incidents in Singapore and on field data from the I-880 freeway in California, also used in a previous

PAGE 35

study (Abdulhai and Ritchie 1999). The 24 input variables used include occupancy, speed, and volume for up to four previous time intervals for the upstream station and up to two previous time intervals for the downstream station. The model developed for the simulated data in Singapore reports a detection rate of 97.33 percent with a false alarm rate of 7.12 percent and a mean time to detect of 1.7 minutes without any persistence test. The model developed for the I-880 freeway data reports a detection rate of 95.65 percent with a false alarm rate of 0.33 percent and a mean time to detect of 3.84 minutes. The model developed on the simulated incident data in Singapore was also tested on the I-880 data with the proposed adaptation method. The model developed specifically for the location has been shown to perform better than the model with adaptation. However, the CPNN model has been shown to have a high false alarm rate.

2.2 Mobile Sensor-Based Freeway Incident Detection Algorithms

All algorithms mentioned in the previous section use traffic flow parameters estimated from fixed loop detectors. In the last few years, the use of probe vehicles as an alternative or additional source of traffic flow data has been of great interest. This section presents a review of algorithms that use only data from mobile sensors.

Petty (Petty et al. 1997) has developed an ID algorithm based on GPS-based probe data collected every second. The algorithm determines when a probe vehicle has passed an incident by comparing observed acceleration and speed against thresholds. The observed data is first filtered using a standard moving average filter of width 20 seconds. The vehicle is classified as 'passing an incident' when its acceleration is above a threshold and it is accelerating to a speed above a speed threshold. The speed threshold is used to reject the large accelerations due to stop-and-go conditions that may cause false alarms. The algorithm was developed and evaluated based on data collected from the I-880 freeway in Hayward, California.
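The probe-based test described above can be sketched as below. The sketch assumes speeds sampled once per second, a trailing (causal) moving average, and acceleration taken as the one-second difference of the smoothed speed; the function names and both thresholds are hypothetical and are not the calibrated values of Petty et al. (1997).

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], width: int = 20) -> List[float]:
    """Trailing moving average; early samples use as many points as are available."""
    smoothed = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= width:
            total -= values[i - width]   # drop the sample that left the window
        smoothed.append(total / min(i + 1, width))
    return smoothed


def passed_incident(speeds: Sequence[float],
                    accel_threshold: float,   # hypothetical, speed units per second
                    speed_threshold: float,   # hypothetical, same units as speeds
                    width: int = 20) -> List[bool]:
    """Flag each second in which the filtered acceleration exceeds
    `accel_threshold` while the filtered speed exceeds `speed_threshold`."""
    if not speeds:
        return []
    smooth = moving_average(speeds, width)
    flags = [False]                      # no acceleration is defined for the first sample
    for previous, current in zip(smooth, smooth[1:]):
        acceleration = current - previous   # per second, given 1-second sampling
        flags.append(acceleration > accel_threshold and current > speed_threshold)
    return flags
```

The speed condition is what separates a vehicle pulling away from an incident from one that accelerates briefly inside stop-and-go traffic: in the second case the filtered speed stays below the threshold and no flag is raised.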
Incidents were divided into three categories:

PAGE 36

accident, vehicle breakdown, and police ticketing. The incident data included 25 accidents (13 in-lane accidents, 12 shoulder accidents), 61 police ticketings, and 226 vehicle breakdowns (16 in-lane breakdowns, 210 right shoulder breakdowns). The algorithm detected approximately 70 percent of the 25 accidents on the freeway and about 50 percent of the police ticketings and vehicle breakdowns.

A Standard Normal Deviate (SND) algorithm for freeway incident detection was applied by Balke (Balke et al. 1996). The algorithm compares probe travel time with historical travel time. An incident is declared if the travel time from a probe vehicle exceeds the confidence interval around the typical travel time. Data was collected from I-45, the Hardy Toll Road, and US-59 in Houston, Texas for over 11 months, including approximately 625 incidents. Travel time was estimated from the times when probe vehicle drivers called a communication center as they were passing consecutive reference locations. Probe vehicle drivers were also asked to call in to provide incident information. The algorithm is less effective than the California Algorithm and the McMaster Algorithm. However, this algorithm used only probe vehicle data whereas the other algorithms used detector data. The detection rate reported is 58 percent, with a higher false alarm rate than the other algorithms.

2.3 Fixed and Mobile Sensor-Based Freeway Incident Detection Algorithms

Fixed sensor data may lack the capability of representing the spatial variation of traffic conditions, especially when detector spacing is large. The use of mobile sensor data, with its capability of representing the spatial variation of traffic conditions, is also of interest in incident detection. This section reviews freeway incident detection algorithms that use both fixed and mobile sensor data.

PAGE 37

To date, only one freeway incident detection algorithm has been developed based on data from both fixed and mobile sensors. Three models, a Discriminant Analysis (DA) based model, a Generalized Linear Model (GLZ), and a Neural Network (NN) model, were developed based on simulated incident and non-incident conditions for an interstate freeway, I-25, in Colorado (Hoeschen 1999). The mobile sensors are the automatic vehicle location (AVL) systems installed in buses by the local transit agency for fleet management. The models use 30-second detector data with probe data collected every 10, 20, or 30 seconds. The input variables include average bus speed, number of bus reportings, upstream volume at time t-1 and t-2, upstream speed, downstream volume at time t, t-1, and t-2, downstream speed, upstream and downstream speed difference, upstream occupancy difference by lane, and downstream occupancy difference by lane. The study also reports the performance of the models (DA, GLZ, and NN) using fixed sensor data only, mobile sensor data only, and both fixed and mobile sensor data.

The study reports that the performance of the mobile sensor data based models depends on the average bus headway and the probe reporting interval. For the bus headways observed on a section of the I-25 freeway in Colorado during the morning peak periods, the fixed sensor based model outperformed the mobile sensor based model. However, the mobile sensor based model detected up to 70 percent of incidents. When both detector and probe data were used, the overall performance of the ID model improved compared to the fixed sensor based model, mainly because the false alarm rate was reduced. For the combined fixed and mobile sensor based model, the NN model outperformed the other models for 10-second bus reportings. The GLZ model outperformed the other models for 30-second bus reportings. This study suggests that bus AVL data can be used by TMCs to improve the overall performance of an incident detection algorithm by lowering the false alarm rate.

PAGE 38

2.4 Mobile Sensor-Based Surface Street Incident Detection Algorithms

Although only two freeway incident detection algorithms have been developed based on mobile sensor data, and one based on fixed and mobile sensor data, the literature review has also shown two studies in Chicago that developed incident detection models for a signalized surface street network based on probe data. The scope of this dissertation proposal is limited to freeway incident detection; therefore a comprehensive review of all surface street incident detection algorithms is not provided here. Only two studies are reviewed here to provide an overview of the probe based methods. It may be mentioned that there are significant differences in the traffic patterns of surface street and freeway incidents due to the presence of multiple access points, geometric constraints, control measures, and the location of surveillance infrastructure for surface streets.

A discriminant analysis based model was developed for incident detection on urban arterials in the ADVANCE project (Sethi et al. 1995). This study used fixed sensor and mobile sensor data independently to develop a fixed sensor based algorithm and a mobile sensor based algorithm, respectively. For the fixed sensor based algorithm, the best model was obtained when the upstream occupancy deviation from historical occupancy and the volume to occupancy ratio deviation from the historical volume to occupancy ratio for the upstream station were used to develop Fisher linear discriminant functions. The 7-minute average data from a traffic simulation with 123 downstream incidents, 177 midblock incidents, and 116 upstream incidents were used to develop and test the algorithm. For the fixed sensor based model, the detection rate reported is 65.9 percent with a false alarm rate of 0 percent for downstream incidents (incidents that occur downstream of detectors). The mean time to detect reported is 1.56 time periods (11 minutes).
The algorithm detected only 6.9 percent of upstream incidents and 1.7 percent of midblock incidents. For the mobile sensor based algorithm, the best model was

PAGE 39

obtained when the ratio of travel time to historical travel time and the ratio of speed to historical speed were used. For the mobile sensor based algorithm, the detection rate reported is 61.0 percent with a false alarm rate of 0.1 percent for downstream incidents. The mean time to detect reported is 1.67 time periods (12 minutes). However, this performance was based on 30 probe reports on each link during each 7-minute period (approximately 4.2 probe reports per minute), or a 20 percent probe penetration rate. The detection rate reduced to 17.9 percent if there was only one probe vehicle on the link within a 7-minute interval. The algorithm detected 9.5 percent of upstream incidents and 2.8 percent of midblock incidents. The study did not compare the performance of the discriminant model to other algorithms.

Discriminant analysis and a multi-layer feedforward neural network were applied to fixed sensor and mobile sensor based surface street incident detection by Ivan (Ivan and Chen 1997). The variables used are the volume to occupancy ratio deviation from the historical volume to occupancy ratio and the occupancy deviation from historical occupancy for fixed detectors, and the travel time to historical travel time ratio and the speed to historical speed ratio from probe vehicles. Also part of the ADVANCE study, the simulation data used in the study mentioned previously (Sethi et al. 1995) was used to test these models. The penetration rate of probe vehicles was 20 percent. Four data sets were used: loop detector data only, probe vehicle data only, data fusion, and using both loop detector and probe vehicle data. The discriminant analysis algorithm performed best when using both fixed sensor and mobile sensor data. The detection rate reported is 76.4 percent with a false alarm rate of 0 percent. The MLF also performed best when using both fixed sensor and mobile sensor data. The detection rate reported is 87.0 percent with a false alarm rate of 0.1 percent.
When a single data source is used, the loop detector algorithm outperformed the probe vehicle algorithm. The neural network outperformed discriminant analysis in terms of detection rate.

PAGE 40

2.5 Synthesis of the Literature

This section of the dissertation presents a synthesis of the literature reviewed in terms of the performance of the models, data source, data set used to develop and test the models, and the characteristics of the models.

2.5.1 Performance of the Freeway Incident Detection Algorithms

For the last three decades research has focused on developing incident detection algorithms based on several techniques including decision-tree based pattern recognition, time-series based statistical approaches, catastrophe theory, and artificial intelligence based neural network approaches. None of the algorithms report perfect performance for incident detection, i.e. a 100 percent detection rate and a 0 percent false alarm rate. Some studies may have reported perfect performance on a particular data set but not in general. For all algorithms, there is always a trade-off between detection rate and false alarm rate; that is, a higher detection rate can be achieved only at the expense of a higher false alarm rate. Responses from a recent survey (Abdulhai 1996) of seven Traffic Management Centers show that a reasonable set of limits on detection rate (DR) and false alarm rate (FAR) would be 88 percent and 1.8 percent, respectively. A more stringent set of limits, obtained using the extreme values, would be 100 percent and 0.25 percent, respectively. It may be mentioned that a Traffic Management Center monitoring a 50-mile freeway with fixed sensors every half mile, using an incident detection algorithm with a false alarm rate of 0.25 percent on 30-second intervals, would generate 30 false alarms per hour (100 detector stations × 120 intervals per hour × 0.0025 = 30). Very few of the incident detection algorithms reviewed report false alarm rates lower than 0.25 percent. The MLF neural network, the PNN, and the CPNN models have been shown to perform best. Furthermore, all incident detection algorithms reviewed report performance relying on only a single data split. The

PAGE 41

algorithms are developed based on a portion of the incident data set and tested on the remainder of the data set. The true variability of model performance is not known.

2.5.2 Source of Data for Incident Detection Algorithms

The data source for incident detection algorithms may be infrastructure based (fixed sensors) and/or non-infrastructure based (mobile sensors). A review of the literature shows that fixed sensor based incident detection algorithms always provide significantly better performance than mobile sensor based incident detection algorithms. One mobile sensor based incident detection algorithm (Petty et al. 1997) developed based on field data shows relatively low performance. Another mobile sensor based algorithm (Balke et al. 1996) based on field data is reported to be less effective than a fixed sensor based incident detection algorithm. The performance of a mobile sensor based incident detection model depends on the penetration rate and report interval of probe vehicles. A freeway incident detection algorithm (Hoeschen 1999) shows that the combination of fixed sensor and mobile sensor based data may help reduce false alarms. However, the model was developed based on simulated data with a relatively short report interval. The other two surface street incident detection models (Ivan and Chen 1997; Sethi et al. 1995) were developed based on simulated data with a 20 percent penetration rate of probe vehicles. The algorithms relying on both fixed and mobile sensors provide better performance than an algorithm relying on a single source.

2.5.3 Incident Data Sets

The performance of incident detection models is typically tested using data collected from a traffic simulation model and/or field data representing both incident and non-incident conditions. The type, duration, and severity of incidents vary across the

PAGE 42

different data sets that are reported in the literature. Data sets including simulated incidents provide a wider range of testing conditions. Although it is best to test algorithms based on field data including real incidents, such data are often difficult to collect. There are a handful of data sets available for real incidents on freeways in the US. The two data sets from California include detector data from the SR-91 freeway for nine incidents (Cheu and Ritchie 1995) and probe and detector data from the I-880 freeway collected by researchers at the University of California, Berkeley, as part of the Partners for Advanced Transit and Highways (PATH) project for 656 incidents. From this data, 45 lane-blocking incidents were used by Abdulhai (Abdulhai and Ritchie 1999) and 63 lane-blocking incidents were used by Teng (Teng et al. 1998) to develop and test their ID models. The I-35W data collected in Minnesota includes detector data for 159 lane-blocking incidents and was used by Abdulhai (Abdulhai and Ritchie 1999). The data collected from the I-4 freeway in Orlando, Florida includes detector data for 130 lane-blocking incidents (Ishak and Al-Deek 1999), and the data from the I-45 freeway and US-59 in Houston, Texas includes 625 incidents (Balke et al. 1996). Out of all these data sets, only the data set from the California PATH project includes both detector and probe data. It may be mentioned that the probes in the PATH study were limited to four or five vehicles at an average headway of seven minutes.

2.5.4 Characteristics of the Freeway Incident Detection Algorithms

The algorithms discussed each have their own strengths and weaknesses. It is difficult to compare the performance of the algorithms based on the results presented in the papers reviewed, mainly because of the different types of incidents included in each data set, the wide variation in the operating conditions under which the algorithms were tested and, in some cases, the size of the data set.
Some studies do, however,

PAGE 43

report the performance of one model against several others on the same data set. The characteristics and weaknesses of the algorithms reviewed are summarized here.

For freeway incident detection, the decision-tree based California Algorithm was developed by Payne (Payne and Tignor 1978). The algorithm detects discontinuity between upstream and downstream occupancy. An incident is declared if measures calculated from one-minute average occupancy for the upstream and downstream stations exceed predetermined thresholds. The measures used in the California Algorithm, calculated from upstream occupancy and downstream occupancy, are the spatial difference in occupancy, the relative spatial difference in occupancy, and the relative temporal difference in downstream occupancy. The California Algorithm relies on location specific thresholds.

An exponential smoothing algorithm for incident detection was developed by Cook (Cook and Cleveland 1974) based on occupancy, speed, volume, and energy computed from speed and volume. The algorithm provides a high detection rate but at the expense of a high false alarm rate.

The Standard Normal Deviate (SND) model developed by Dudek (Dudek et al. 1974) is based on the assumption that an incident results in a high rate of change in lane occupancy and energy computed from volume and speed measurements. The thresholds for the Standard Normal Deviate Algorithm are calibrated on historical data. This reduces the algorithm's ability to adapt to traffic variations, unless the thresholds are re-calibrated regularly.

The Bayesian algorithm (Levin and Krause 1978) assumes that the normal traffic flow follows its historic trend and any deviation from this trend that exceeds a certain threshold indicates an incident. The variable used is the ratio of the difference between one-minute upstream and downstream occupancies to the upstream occupancy. The algorithm takes several time intervals to obtain the probability for incident detection. This increases the mean time to detect.
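The Bayesian decision described above can be illustrated with Bayes' rule applied to the occupancy ratio. This is a minimal sketch: the function names are hypothetical, the likelihood values in the example are made up, and carrying the posterior forward as the next interval's prior is an assumption used here only to show why several intervals may be needed; it is not claimed to be the exact procedure of Levin and Krause (1978).

```python
def occupancy_ratio(upstream_occ: float, downstream_occ: float) -> float:
    """Relative spatial occupancy difference: (upstream - downstream) / upstream."""
    return (upstream_occ - downstream_occ) / upstream_occ


def incident_posterior(p_obs_given_incident: float,
                       p_obs_given_normal: float,
                       prior_incident: float) -> float:
    """Bayes' rule: probability of an incident given the observed ratio.

    The two likelihoods would come from the historical frequency distributions
    of the ratio under incident and incident-free conditions; the prior is the
    probability of an incident occurring at the detector.
    """
    numerator = p_obs_given_incident * prior_incident
    denominator = numerator + p_obs_given_normal * (1.0 - prior_incident)
    return numerator / denominator


# Illustrative values only: the same observation in two successive intervals.
probability = 0.2                      # hypothetical prior
for _ in range(2):
    probability = incident_posterior(0.8, 0.1, probability)
```

In this made-up example the incident probability rises from 0.2 to 2/3 after one interval and to 16/17 after two, which shows how a decision threshold close to 1 can only be crossed after several intervals, lengthening the mean time to detect.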
The HIOCC algorithm (Collins et al. 1979) identifies the presence of stationary or slow-moving vehicles over detectors based on detector data. The one-second occupancy from a detector is obtained by

PAGE 44

scanning the detector every 1/10 second to determine whether the detector is occupied. The algorithm relies on detecting sharp changes in occupancy and is therefore unable to distinguish between queue build-up due to an incident and recurring congestion.

The PATREG algorithm (Collins et al. 1979) monitors the traffic speed in each lane between pairs of detector stations. The algorithm relies on significant changes in speed and fails to detect incidents at high flow rates. The PATREG algorithm performs well when the flow rate is very low. As the flow rate increases, the effect of lane changing degrades the performance of the algorithm.

The time series algorithm (Ahmed and Cook 1982) assumes that traffic flow can be modeled from historical, time-varying traffic data by comparing observed traffic measures such as occupancy against short-term predicted traffic measures. A significant deviation between the observed and estimated values of the traffic measures leads to an incident alarm. The day-to-day variation of traffic conditions affects the performance of this algorithm.

The low-pass filtering algorithm (Stephanedes and Chassiakos 1993) filters the raw traffic data before an incident detection algorithm is applied. This helps to reduce false alarms due to short-term traffic fluctuations. The variables used are 30-second average upstream occupancy, downstream occupancy, and spatial occupancy difference. The low-pass filtering algorithm, like the California Algorithm, depends on location-specific thresholds.

The CUSUM algorithm (Teng 1998) assumes that a change in the traffic process can be distinguished by the difference between the current cumulative sum of the log likelihood ratio and its minimum value up to the current time period. The variables are upstream and downstream [illegible]. The algorithm shows that a high detection rate comes at the cost of a high false alarm rate.

Artificial intelligence based, non-linear pattern recognition has been applied to freeway incident detection.
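The CUSUM decision rule summarized above (current cumulative sum of the log likelihood ratio minus its minimum so far) can be sketched as follows. This is an illustration under stated assumptions, not the formulation of the cited study: the traffic measure is assumed to be Gaussian with a known normal-condition mean `mu0`, a known incident-condition mean `mu1`, and a common standard deviation `sigma`, and the alarm threshold `h` is hypothetical.

```python
def cusum_alarm_index(values, mu0, mu1, sigma, h):
    """Return the index of the first interval that raises an alarm, or None.

    values -- sequence of a traffic measure, one value per time interval
    mu0    -- assumed mean under normal conditions
    mu1    -- assumed mean under incident conditions
    sigma  -- assumed common standard deviation
    h      -- alarm threshold on the decision function (hypothetical)
    """
    cumulative = 0.0   # cumulative sum of log likelihood ratios
    minimum = 0.0      # minimum of the cumulative sum so far (empty sum counts as 0)
    for t, x in enumerate(values):
        # Log likelihood ratio of one Gaussian observation, incident vs. normal.
        llr = (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2.0)
        cumulative += llr
        minimum = min(minimum, cumulative)
        if cumulative - minimum > h:
            return t
    return None


if __name__ == "__main__":
    occupancy = [10, 10, 10, 10, 20, 20, 20]
    print(cusum_alarm_index(occupancy, mu0=10.0, mu1=20.0, sigma=5.0, h=3.0))
```

A smaller `h` raises the alarm sooner but also reacts to ordinary fluctuations, which is consistent with the trade-off between detection rate and false alarm rate noted above.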
Cheu (Cheu and Ritchie 1995) applied the multilayer feedforward (MLF) neural network to develop an incident detection algorithm. The 16 input variables include 30-second average occupancy and average volume data for up to four previous time intervals for the upstream station and up to two previous time intervals for the downstream

PAGE 45

stations. The MLF neural network has been shown to perform well. However, re-training the algorithm for a new location takes considerable time.

The Bayesian-based probabilistic neural network (PNN), which uses the concept of statistical distance, instead of Euclidean distance, as a measure of the nearness of a weighted vector to the different patterns, was applied to freeway incident detection by Abdulhai (Abdulhai and Ritchie 1999). The variables used for the PNN model were the same as for the MLF model (Cheu and Ritchie 1995). The PNN algorithm performs as well as the MLF neural network. However, it re-trains faster than the MLF neural network model and has also been shown to perform well across locations following incremental learning.

A more recent model, the constructive probabilistic neural network (CPNN) model (Jin et al. 2002), is structured on a mixture Gaussian model and trained by a dynamic decay adjustment algorithm. The 24 input variables include 30-second average speed, volume, and occupancy for up to 4 previous time intervals for the upstream station and up to 2 previous time intervals for the downstream station. The model is developed and tested on simulated data from Singapore and on the I-880 freeway data from California, the same data set used in the study by Abdulhai (Abdulhai and Ritchie 1999). The CPNN model has been shown to have a high false alarm rate.

The review of the literature presented here demonstrates that very few incident detection algorithms report a low false alarm rate together with a high detection rate. Unless the performance of fixed sensor based algorithms is improved significantly, they may not be implemented or accepted for use in a TMC. To achieve acceptable performance, algorithms may not be transferred to a new site without recalibration, incremental training (Abdulhai 1996), or adaptive training (Jin et al. 2002).
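The input vectors described above for the MLF/PNN models (16 variables) and the CPNN model (24 variables) can be assembled as in the sketch below. The counts only work out if the current interval is included along with the previous ones (2 measures x (5 + 3) intervals = 16, and 3 measures x (5 + 3) = 24); that reading, the data layout, and all names are assumptions for illustration rather than the cited authors' code.

```python
def build_inputs(upstream, downstream, t, measures, up_lags=4, down_lags=2):
    """Assemble a lagged input vector for interval index t.

    upstream, downstream -- dicts mapping a measure name to a list of
                            30-second averages, one entry per interval
    measures             -- measure names to include, e.g. ("occupancy", "volume")
    up_lags, down_lags   -- number of previous intervals used at each station
    """
    if t < max(up_lags, down_lags):
        raise ValueError("not enough history before interval t")
    features = []
    for name in measures:
        # Current interval plus up_lags previous intervals at the upstream station.
        features.extend(upstream[name][t - k] for k in range(up_lags + 1))
    for name in measures:
        # Current interval plus down_lags previous intervals at the downstream station.
        features.extend(downstream[name][t - k] for k in range(down_lags + 1))
    return features


if __name__ == "__main__":
    # Placeholder series, not field data.
    series = {m: list(range(10)) for m in ("occupancy", "volume", "speed")}
    print(len(build_inputs(series, series, 5, ("occupancy", "volume"))))           # MLF/PNN-style
    print(len(build_inputs(series, series, 5, ("speed", "volume", "occupancy"))))  # CPNN-style
```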

PAGE 46

3. Data Collection

As mentioned in Chapter 2, both simulated and field data for traffic measures under incident and non-incident conditions have been used to develop and evaluate incident detection models. Very few data sets on real incident and non-incident conditions are available, due to the difficulty in coordinating data collection. As part of a research project funded by the Colorado Department of Transportation (CDOT), incident data was collected for a network in Colorado. Another data set, collected by the California PATH program, has also been used to develop and test several incident detection algorithms. Both the Colorado data and the data collected by the California PATH program are used for this research.

3.1 Data from Colorado

To collect data for this study, a 9.7 mile section of the northbound Interstate freeway, I-25, between County Line Road and Colorado Boulevard (Figure 3.1) was selected. It consists of four lanes and one auxiliary lane between E. County Line Road and E. Dry Creek Road, and three lanes and one auxiliary lane between E. Dry Creek Road and S. Colorado Boulevard. It includes 12 on-ramps at: (1) E. County Line Road, (2) E. Dry Creek Road, (3) WB Arapahoe Road, (4) EB Arapahoe Road, (5) E. Orchard Road, (6) E. Belleview Avenue, (7) I-225 Interchange, (8) E. Hampden Avenue, (9) E. Yale Avenue, (10) E. Evans Avenue, (11) SB Colorado Boulevard, (12) NB Colorado Boulevard, and 9 off-ramps at: (1) E. Dry Creek Road, (2) E. Arapahoe Road, (3) E. Orchard Road, (4) E. Belleview Avenue, (5) I-225

PAGE 47

Interchange, (6) E. Hampden Avenue, (7) E. Yale Avenue, (8) E. Evans Avenue, (9) S. Colorado Boulevard. A data collection effort was coordinated with the Colorado Department of Transportation (CDOT), the Regional Transit District (RTD), and news media outlets that collect incident data. These sources provided data for traffic measures from fixed sensors and mobile sensors. The next few sections present details on both types of sensors and the data collection process.

Figure 3.1. Schematic of the test network and detector locations

PAGE 48

3.1.1 Traffic Measures from Fixed Sensors

In 1984, the Colorado Department of Transportation (CDOT) installed loop detectors in the pavement to implement traffic responsive ramp metering within a section of the northbound I-25 freeway. As part of this system, detectors were installed on the mainline and on-ramps of the freeway. Figure 3.2 shows a typical configuration of the detectors for ramp metering. The mainline detectors provide volume, occupancy, and speed trap data by scanning 60 times per second. The data is collected to generate volume (vph), percent occupancy (%), and speed by lane (JHK and Associates 1990) every minute.

Figure 3.2. Typical detector configuration for ramp metering

For the test network, eleven of the 12 on-ramps have detectors. The distance between detector locations ranges from 0.13 miles to 2.20 miles. Table 3.1 shows the

PAGE 49

distance between detector locations for all the on-ramps in the network. The one-minute detector data was collected for this study. The data includes percent occupancy, volume, and speed for each lane. This data was obtained from the CDOT from 6:00 AM to 6:00 PM for five weeks from April 24, 2001 to May 25, 2001 and for another nine weeks from September 24, 2001 to November 23, 2001.

Table 3.1. Distance between detector locations

On-ramp location (From)   On-ramp location (To)   Distance (mile)
County Line               Dry Creek               1.00
Dry Creek                 Arapahoe SE             [illegible]
Arapahoe SE               Arapahoe NE             0.1[illegible]
Arapahoe NE               Orchard                 1.56
Orchard                   Belleview               1.11
Belleview                 Hampden                 2.20
Hampden                   Yale                    1.06
Yale                      Evans                   0.91
Evans                     Colorado NE             0.47
Colorado NE               Colorado NW             0.18

3.1.2 Traffic Measures from Mobile Sensors

The Regional Transportation District, Colorado's transit agency, installed an Automatic Vehicle Location (AVL) system in 1993 to develop more efficient transit

PAGE 50

schedules, to improve the agency's on-street operations, and to increase safety through better management (Castle Rock Consultants 1998). Denver's AVL system components are shown in Figure 3.3. Each vehicle in the RTD fleet is equipped with an Intelligent Vehicle Login Unit (IVLU) and a global positioning system (GPS) receiver capable of real-time correction. As a GPS receiver's signal may degrade due to obstructions in urban environments, the Denver AVL system integrates the GPS with inertial sensors or dead-reckoning (DR) sensors. The location accuracy of this type of GPS receiver is 1-2 meters. However, specific information on the accuracy of Denver's integrated GPS-DR system is not available. The fleet consists of 1,335 vehicles, including 935 fixed route buses. Bus location data is available every two minutes through this AVL system and was collected for this study.

Normally, 18 bus routes travel on the northbound I-25 freeway, the test network. During the morning peak period, the average bus flow rate is 14 buses per hour between County Line Road and the I-225 Interchange and 23 buses per hour between the I-225 Interchange and Colorado Boulevard. During the afternoon peak period, the average bus flow rate is 4 buses per hour between County Line Road and the I-225 Interchange and 11 buses per hour between the I-225 Interchange and Colorado Boulevard. More buses operate on the northbound I-25 section of the test network during the morning peak period than during the afternoon peak period.

The AVL data, available at 2-minute report intervals, includes a unique bus identification (ID), the bus route, a time stamp, and the coordinates of the bus location in the NAD 27 State Plane coordinate system. The data was obtained from the Regional Transportation District (RTD) from 6:00 AM to 6:00 PM for 12 weeks from April 24, 2001 to July 15, 2001 and for 11 weeks from September 24, 2001 to December 7, 2001.

PAGE 51

Figure 3.3. Denver's AVL system (diagram labels: GPS correction data; differentially corrected location, vehicle, route and message data; voice communications; dispatch center; field supervisor and maintenance)

3.1.2.1 Post Processing Data from Mobile Sensors

Each AVL data file provided by the RTD contains bus location data for every day of the week for the entire Denver metro area. A Fortran routine was written to extract the time-stamped bus location data by date. A Geographic Information System (GIS) software package, ArcView, and Avenue, ArcView's scripting language, were used to extract the data for all northbound buses within a buffer zone around the test network. Network Analyst, another ArcView tool, was used to estimate the distance traveled along the freeway

PAGE 52

between two consecutive AVL reports. Figure 3.4 shows the ArcView display of the bus AVL data.

Figure 3.4. GIS software used to display and extract bus AVL data on the I-25 freeway northbound

3.1.3 Incident Data

Several news channels in the Denver metropolitan area report road conditions for major roadways. A local company, Premiere Traffic Network (PTN), collects information on roadway conditions by listening to police and emergency vehicle service radio

PAGE 53

dispatch, calling police stations, receiving reports from airborne reporters, and monitoring live video feeds in their operation center from cameras at several locations. A local radio station, 850 KOA, posts this information on the World Wide Web. However, this information is not saved or archived in a database. The information posted includes the location of incidents, the approximate time each incident started, and a brief description of the incidents. This data was collected in two periods, from April 24, 2001 to May 25, 2001 for five weeks and from September 24, 2001 to November 23, 2001 for nine weeks, during the morning peak period (6:00-9:00 AM) and the afternoon peak period (3:00-6:00 PM), to coincide with the detector data collection described above.

The incident data were collected from the website. A computer program and a screen capture program were used to capture the website's display every 30 seconds to record the incident information posted. The captured web display was saved in AVI file format. It was later replayed to collect the incident data. The actual start time and end time of each incident were examined by plotting occupancy upstream of the incident location. The actual start time is the time at which the deviation of upstream occupancy from historical occupancy becomes higher than 5 percent. Similarly, an incident is determined to have ended when the deviation of upstream occupancy from historical occupancy falls below 5 percent. For the I-25 network, the database includes 58 incidents. The detectors malfunctioned when 20 of these incidents occurred. Half of the remaining 38 incidents were randomly selected as the training set and the rest as the test set.

3.2 Data from California

The I-880 data was collected as part of a research project in California (Petty et al. 1995). The study section of the I-880 freeway in Hayward, California is 9.2 miles long and varies from 3 to 5 lanes (Figure 3.5). An HOV lane covers approximately 3.5

PAGE 54

miles of the section. The data was collected in two periods: before the Freeway Service Patrol (FSP) was in operation, from February 16 through March 19, 1993, and with the FSP in operation, from September 27 through October 29, 1993. Data collected for this project include loop detector data, probe vehicle data, and incident data. However, only loop detector data and incident data are used in this research.

Figure 3.5.
The PATH project study section

3.2.1 Traffic Measures from Fixed Sensors

The northbound section of the I-880 freeway includes 18 detector stations and the southbound section includes 17 detector stations. The spacing between detector

PAGE 55

stations ranges from 0.19 miles to 0.55 miles, with an average of 0.33 miles. The loop detector data collected was processed by a program developed at the University of California, Berkeley to report average flow rate, average speed, and average occupancy per period. Some detectors malfunctioned periodically and/or counted vehicles incorrectly. Two fixes (Petty et al. 1995) were applied to the loop detector data, one for missing data and one for inconsistent data. Missing data was recreated from the adjacent upstream loop detector and/or from the average of the two adjacent loop detectors. A consistency fix corrected systematic errors in the loop data, such as over- or under-counting. If the average vehicle accumulation per minute over a long period exceeds a threshold, flow is estimated using correction factors. The correction factors are computed as a fraction of the flow of the nearest mainline flow.

3.2.2 Incident Data

Incident data was collected by the drivers of probe vehicles. When they passed an incident while driving on the freeway, they informed their command center and recorded their positions on an on-board portable computer. Accuracy issues of the incident database are reported in a paper (Petty et al. 1995). The location of each incident was corrected by correlating the location recorded in the incident database at the command center with the location recorded in the on-board portable computer. Since the start and end times recorded for the incidents were based on a probe vehicle driver witnessing an incident, the times recorded were only approximations. For the I-880 incidents, the start and end times were determined as described earlier for the I-25 freeway. A data set of 45 lane-blocking incidents and 660 shoulder incidents was developed from the PATH project data.
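The rule used for both freeways to fix incident start and end times (upstream occupancy deviating from historical occupancy by more than 5 percent, then falling back below 5 percent) can be sketched as follows. The text does not say whether "5 percent" means percentage points of occupancy or a relative change; the sketch assumes percentage points, and the function and argument names are illustrative only.

```python
def incident_window(observed, historical, threshold=5.0):
    """Return (start_index, end_index) of an incident, or None where not found.

    observed   -- upstream occupancy per interval during the day of the incident (%)
    historical -- historical upstream occupancy for the same intervals (%)
    threshold  -- deviation in percentage points of occupancy (assumed meaning)
    """
    start = None
    for i, (obs, hist) in enumerate(zip(observed, historical)):
        deviation = obs - hist
        if start is None:
            if deviation > threshold:
                start = i          # first interval exceeding the threshold
        elif deviation < threshold:
            return start, i        # first later interval back below the threshold
    return start, None


if __name__ == "__main__":
    observed = [10, 11, 18, 25, 22, 14, 11]   # illustrative values, not field data
    historical = [10] * 7
    print(incident_window(observed, historical))
```

In practice the search would presumably be restricted to intervals near the reported time of the incident, so that recurring congestion elsewhere in the day is not mistaken for the incident.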

PAGE 56

4. Characteristics of Lane-blocking and Shoulder Incidents

An incident management system is a component of an advanced traffic management system (ATMS). The purpose of an incident management system is to detect, verify, and assess the magnitude of an incident, to identify the appropriate response to restore a facility to normal operation, and to implement the appropriate response in the form of traffic control, information, and aid (Carvell 1997). Incident detection is one of the main functions of an incident management system. An incident detection algorithm is implemented by a traffic management center to detect any unexpected event that disrupts traffic flow and causes significant delay to motorists. Therefore, an incident detection algorithm that detects both lane-blocking and shoulder incidents is critical to the operation of an ATMS.

Several researchers (Ahmed and Cook 1982; Balke et al. 1996; Cook and Cleveland 1974; Dia et al. 1997; Hsiao et al. 1993; Payne and Tignor 1978; Persaud and Hall 1989; Petty et al. 1997; Stephanedes and Chassiakos 1993; Tsai and Case 1979) have developed and tested incident detection algorithms based on both lane-blocking and shoulder incidents, while others (Abdulhai 1996; Cheu 1994; Ishak and Al-Deek 1999; Teng et al. 1998) focused only on lane-blocking incidents. If the characteristics of lane-blocking incidents and shoulder incidents are significantly different, an incident detection model developed based only on lane-blocking incidents may not perform well in detecting shoulder incidents.

Incidents may be classified in a number of ways. An incident could be vehicle related or non-vehicle related. An accident or breakdown is a vehicle related incident. Accidents may involve a single car or multiple cars. A vehicle breakdown may involve

PAGE 57

fuel leaking, a flat tire, tire changing, or a mechanical problem. Non-vehicle related incidents may include debris on the roadway. Incidents may also be classified based on where they occur, as lane-blocking or shoulder incidents. In practice, if lane-blocking incidents are rapidly moved to the shoulder before they are observed, they may be classified as shoulder incidents. A lane-blocking incident may involve a single lane or multiple lanes (Petty et al. 1995). Shoulder incidents usually cause rubbernecking, in which drivers slow down to observe an incident scene as they pass it. Incidents may also be classified by their severity or the delay they cause: delay lower than normal recurring congestion or delay higher than normal recurring congestion.

This chapter presents the characteristics of incidents on two freeway sections, the I-25 freeway in Colorado and the I-880 freeway in California. The characteristics investigated include incident rate, average delay, and duration of incidents, based on the type of incident. Incidents are classified as lane-blocking or shoulder incidents. Incident rate is estimated as the number of incidents per million vehicle miles traveled. Incident severity is examined in terms of delay and duration.

An examination of the characteristics of incidents is expected to aid in the development of incident detection algorithms. If the characteristics of lane-blocking incidents and shoulder incidents are significantly different, an incident detection algorithm developed based only on lane-blocking incidents may not perform well in detecting shoulder incidents. Furthermore, an incident detection algorithm may be transferred or applied to another location if the characteristics of incidents across locations are not significantly different. Therefore, in this study the characteristics of incidents are examined at each location and compared across locations.

PAGE 58

4.1 Characteristics of Incidents

Few studies have examined the characteristics of incidents. A review of the literature is summarized here. Lindley (Lindley 1986) investigates the characteristics of incidents based on data from the Highway Performance Monitoring System (HPMS) database maintained by the Federal Highway Administration (FHWA). The HPMS database includes detailed geometric, traffic, and other data for selected roadway sections throughout the US. The database investigated includes 4,646 sections representing 9,349 miles of urban interstate and 3,390 sections representing 5,986 miles of other urban freeways and expressways. Incidents are categorized as lane-blocking or shoulder incidents. As shown in Figure 4.1, a lane-blocking incident is classified as an accident (one lane, multi-lane) or a breakdown (one lane, two lanes). Shoulder incidents are classified as accidents or breakdowns. An analysis of the HPMS data shows that 4 percent of the incidents are lane-blocking and 96 percent are shoulder incidents (presented in Figure 4.1). The incident rate reported is 200 incidents per million vehicle miles traveled (VMT).

Another study in California collected incident data from the I-10 freeway in Los Angeles, CA (Skabardonis et al. 1999) and includes information from probe vehicles, loop detectors, and incident logs along a 7.8 mile freeway section. The data, collected for 30 days, includes 1,560 incidents. The incidents are classified into lane-blocking and shoulder incidents. In-lane incidents are divided into accidents (one lane, multi-lane) and breakdowns (one lane, multi-lane). Shoulder incidents are divided into accidents and breakdowns. About 9.6 percent are lane-blocking incidents and about 90.4 percent are shoulder incidents. The incident rate is 92.8 incidents per million vehicle miles traveled. The mean response time for incidents is 11.4 minutes and the mean incident duration is 20.7 minutes.

PAGE 59

Incidents
    Lane-Blocking 4%, 3.7%, 9.6%
        Accidents 21.3%, 59.3%, 21.6%
            One Lane 84.6%, 74.3%, 87.5%
            Multi Lane 15.4%, 25.7%, 12.5%
        Breakdowns 78.7%, 40.7%, 78.4%
            One Lane 99.2%, 95.8%, 97.5%
            Multi Lane 0.8%, 4.2%, 2.5%
    Shoulder 96%, 96.3%, 90.4%
        Accidents 4.2%, 8.6%, 5.4%
        Breakdowns 95.8%, 91.4%, 94.6%

Percentages reported are from the national study (Lindley 1986), the I-880 freeway in CA (Petty et al. 1996), and the I-10 freeway in CA (Skabardonis et al. 1999).

Figure 4.1. Types of incidents.

4.2 Characteristics of Incidents on the I-880 Freeway

The data collected for 10 weeks from probe vehicles and loop detectors along a 9.2 mile section of a freeway in Hayward, California includes 1,616 incidents. Four or five probe vehicles, at an average time headway of about 7 minutes, collected data for 10 weeks, from February to March 1993 and from September to October 1993. In September and October 1993, the freeway service patrol (FSP) was in operation (Petty et al. 1995). The I-880 incidents are classified into lane-blocking and shoulder incidents and further classified as accidents (one lane, multi-lane) and breakdowns (one lane, multi-lane) (Petty et al. 1996). For the I-880 freeway, 3.7 percent of incidents are lane-blocking and 96.3 percent are shoulder incidents. Most of the incidents are shoulder breakdowns (85%). The incident rate is 104 incidents per million vehicle miles

PAGE 60

traveled. The mean response time for incidents is 28.9 minutes before the FSP is in operation and 13.8 minutes when the FSP is in operation. The mean duration of incidents is 24.7 minutes (Skabardonis et al. 1997).

4.3 Characteristics of Incidents on the I-25 Freeway

Characteristics of incidents in terms of frequency, incident rate, incident duration, and average delay are investigated for each location, the I-25 freeway network and the I-880 freeway network (described in detail in Chapter 3), and compared across locations. Incidents are classified into two types, lane-blocking and shoulder, based on the location at which the incidents were observed.

To compare the characteristics of incidents in Colorado, data from a 9.2 mile section of the I-25 freeway, 9.8 miles of the I-225 freeway, and 5.2 miles of 6th Avenue are examined. The Colorado data shows that about 60 to 95.2 percent (average 76.3 percent) of the incidents are lane-blocking (Figure 4.2). The percentage of lane-blocking incidents (76.3%) is significantly higher than that reported in three previous studies (Lindley 1986; Petty et al. 1996; Skabardonis et al. 1999), as shown in Figure 4.1. The I-880 database shows that about 96.3 percent of total incidents are shoulder incidents. The I-25 NB data shows that only 37.5 percent of total incidents are shoulder incidents. Most of the I-880 freeway incidents are shoulder incidents, and most of these are vehicle breakdowns. Most of the incidents on the I-25 freeway are lane-blocking incidents, and most of these are accidents.

PAGE 61

Total incidents    I-25 NB   I-25 SB   I-225 NB   I-225 SB   6th Ave. EB   6th Ave. WB   Average
Lane-Blocking      62.5%     69.5%     60.0%      76.5%      95.2%         93.8%         76.3%
    Accident       77.5%     56.1%     75.0%      83.3%      95.0%         60.0%         74.5%
    Breakdown      22.5%     43.9%     25.0%      [missing]  5.0%          40.0%         25.5%
Shoulder           37.5%     30.5%     40.0%      23.5%      4.8%          6.2%          23.8%
    Accident       83.3%     27.8%     62.5%      60.0%      100.0%        100.0%        72.3%
    Breakdown      16.7%     72.2%     37.5%      40.0%      0.0%          0.0%          [missing]

Figure 4.2. Incident characteristics in Colorado

There may be a number of reasons for these differences in the percentage of types of incidents observed across freeway locations. First, for the I-880 freeway, about half of the shoulder incidents are reported from call boxes. The I-25 freeway test network does not have call boxes. Therefore, vehicle breakdowns or minor incidents may not be reported in Colorado. Secondly, the I-25 freeway has only right shoulders available, whereas the I-880 freeway has both right shoulders and a center divide (median) available. This may allow lane-blocking incidents on the I-880 freeway to be moved easily to either the right shoulder or the median. The characteristics of incidents in terms of incident rate are presented next.

4.3.1 Incident Rate

Incident rate is the number of incidents occurring per vehicle miles traveled (VMT). It may be mentioned that the incident rate for the I-880 freeway excludes California Highway Patrol (CHP) ticketing-related events, since most of these events are citations for violations of HOV lane usage. Approximately 26 percent of total incidents are

PAGE 62

CHP ticketing-related events. These events are excluded and the rest of the incidents are used for comparison. For the I-25 freeway, the incident rate for lane-blocking incidents (30.6 incidents/10^7 VMT) is higher than for shoulder incidents (18.4 incidents/10^7 VMT) (Table 4.1). For the I-880 freeway, the incident rate for shoulder incidents (780.0 incidents/10^7 VMT) is much higher than for lane-blocking incidents (37.4 incidents/10^7 VMT).

Table 4.1. Incident rate for the I-25 and the I-880 freeway

                                      I-25                        I-880
Parameter                             Lane-blocking   Shoulder    Lane-blocking   Shoulder
Incident rate (incidents/10^7 VMT)    30.6            18.4        37.4            780.0

The chi-square goodness of fit test for a one-way contingency table is performed based on the null hypothesis that the probabilities of lane-blocking and shoulder incidents per vehicle miles traveled are each equal to 0.5. The likelihood ratio chi-square statistic and p-value are calculated (Table 4.2). For the I-25 freeway, the probabilities of lane-blocking incidents and shoulder incidents per vehicle miles traveled are not significantly different from 0.5 at the 5 percent level of significance (p-value = 0.08). For the I-880 freeway, the probabilities of lane-blocking and shoulder incidents per vehicle miles traveled are significantly different from 0.5 at the 5 percent level of significance (p-value < 0.001). The shoulder incident rate is much higher than the lane-blocking incident rate for the I-880 freeway.
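The likelihood ratio chi-square goodness of fit test used above can be sketched for a two-cell table as follows. The incident counts in the example are hypothetical, since the underlying counts are not listed in this section, so the output is not expected to reproduce the p-values in Table 4.2. The p-value uses the closed form for one degree of freedom, which holds when two categories are compared.

```python
import math


def likelihood_ratio_chi_square(observed, expected_probs):
    """Likelihood ratio statistic G = 2 * sum(O * ln(O / E)) for a one-way table."""
    total = sum(observed)
    g = 0.0
    for count, prob in zip(observed, expected_probs):
        if count > 0:                      # a zero cell contributes nothing
            g += 2.0 * count * math.log(count / (total * prob))
    return g


def chi_square_p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    counts = [30, 10]                      # hypothetical lane-blocking, shoulder counts
    g = likelihood_ratio_chi_square(counts, [0.5, 0.5])
    print(round(g, 3), chi_square_p_value_df1(g))
```

When the two groups do not share the same exposure, as in the comparison across freeways, the expected probabilities would presumably be set in proportion to the vehicle miles traveled at each location rather than to 0.5.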

PAGE 63

One of the purposes of an incident management system is to identify the appropriate response to restore a facility to normal operation, and that response may be different for lane-blocking and shoulder incidents. For the I-25 freeway, the incident rates for lane-blocking and shoulder incidents are not significantly different. However, they are significantly different for the I-880 freeway. Therefore, the design and operation of an incident management system may be different for the two locations.

Table 4.2. Chi-square goodness of fit test for one-way contingency table

Location                                                    p-value
I-25, between lane-blocking and shoulder incidents          0.08
I-880, between lane-blocking and shoulder incidents         < 0.001
Across locations
    Lane-blocking incidents                                 0.41
    Shoulder incidents                                      < 0.001
    All incidents                                           < 0.001

Across locations, the probabilities of lane-blocking incidents per vehicle miles traveled for the I-25 freeway and the I-880 freeway are not significantly different from 0.5 at the 5 percent level of significance (p-value = 0.41), whereas for shoulder incidents they are significantly different from 0.5 at the 5 percent level of significance (p-value < 0.001). Therefore, an incident detection algorithm relying on the prior probability of an

PAGE 64

incident, when transferred to a new location without recalibration, may not perform well in detecting incidents, especially shoulder incidents. It may be mentioned that for all incidents, the incident rate is 49 incidents per 10^7 VMT for the I-25 freeway and 817.4 incidents per 10^7 VMT for the I-880 freeway. The incident rates for both freeways are much lower than the incident rate (2,000 incidents per 10^7 VMT) reported in a study by Lindley (Lindley 1986). The significant difference in incident rate may be due to the manner in which incident data was collected at each location.

4.3.2 Incident Duration

For a given type of incident, longer incidents have a higher impact on traffic flow. The duration of incidents is also important for a mobile sensor based incident detection algorithm. The traffic measures estimated from mobile sensors represent the spatial variation of traffic conditions for a specific time interval. The duration of incidents and the penetration rate of probe vehicles affect the availability of probe reports during an incident.

For the I-25 freeway, the mean duration of lane-blocking incidents (43.1 minutes) is higher than that of shoulder incidents (29.9 minutes) (Table 4.3). A pooled t-test shows that the mean incident durations of lane-blocking and shoulder incidents are not significantly different at the 5 percent level of significance (p-value = 0.11) (Table 4.4). A Kolmogorov-Smirnov (K-S) test also shows that the distributions of incident duration for lane-blocking and shoulder incidents are not significantly different at the 5 percent level of significance (p-value = 0.10).
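The two tests applied to incident duration can be sketched as below: the pooled (equal variance) t statistic compares the means, and the two-sample Kolmogorov-Smirnov statistic compares the empirical distributions. Only the test statistics are computed; turning them into p-values requires the t and K-S reference distributions, which are left to a statistics package. The duration samples are made up for illustration and are not the I-25 or I-880 data.

```python
import math
from bisect import bisect_right
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for a two-sample pooled-variance t-test."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2


def ks_statistic(a, b):
    """Largest gap between the two empirical cumulative distribution functions."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    largest = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


if __name__ == "__main__":
    lane_blocking = [35, 50, 42, 61, 28]   # hypothetical durations in minutes
    shoulder = [20, 33, 25, 41, 30]
    print(pooled_t_statistic(lane_blocking, shoulder))
    print(ks_statistic(lane_blocking, shoulder))
```

The two tests can disagree, as reported for the I-880 freeway in Table 4.4, because two samples may have similar means while their distributions differ in shape.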

PAGE 65

Table 4.3. Duration of incidents for the I-25 and the I-880 freeway

Location   Incident type    Mean Duration (minute)   Mean Duration (minute) (all incidents)
I-25       Lane-blocking    43.1                     38.3
           Shoulder         29.9
I-880      Lane-blocking    53.9                     62.3
           Shoulder         62.9

Table 4.4. Pooled t-test and Kolmogorov-Smirnov test for duration of incidents

Location                                                Pooled t-test p-value   K-S test p-value
I-25, between lane-blocking and shoulder incidents      0.11                    0.10
I-880, between lane-blocking and shoulder incidents     0.09                    0.01
Across locations
  Lane-blocking incidents                               0.12                    0.10
  Shoulder incidents                                    0.003                   < 0.01
  All incidents                                         0.005                   < 0.005

For the I-880 freeway, the mean duration is 53.9 minutes for lane-blocking incidents and 62.9 minutes for shoulder incidents. A pooled t-test shows that the mean incident durations for lane-blocking and shoulder incidents on the I-880 freeway are not

PAGE 66

significantly different at the 5 percent level of significance (p-value = 0.09), but a K-S test shows that the distributions of incident duration for lane-blocking and shoulder incidents on the I-880 freeway are significantly different at the 5 percent level of significance (p-value = 0.01).

Across locations, the mean duration of all incidents on the I-880 freeway is 62.3 minutes, about 1.6 times the mean duration on the I-25 freeway. This may be due to longer response times for incidents on the I-880 freeway. The mean duration and the distribution of incident duration are not significantly different across locations for lane-blocking incidents but are significantly different for shoulder incidents at the 5 percent level of significance.

4.3.3 Average Delay

Lane-blocking incidents usually cause a reduction in the capacity of a roadway. Shoulder incidents also cause a reduction in capacity due to rubbernecking, the tendency of drivers to slow down and observe an incident scene as they pass it. The impact of incidents can be expressed as the average delay per vehicle affected by an incident. The average delay while a queue is present due to an incident can be calculated as follows:

PAGE 67

Figure 4.3. Delay due to an incident (horizontal axis: Time)

$$ d_q = \frac{\text{Total Delay}}{\text{Total Number of Arrivals While Queue Present}} = \frac{\text{Area of Triangle } abe}{\lambda\, t_q} = \frac{\text{Area of (Triangle } ace) - (\text{Triangle } acd) - (\text{Triangle } bde)}{\lambda\, t_q} \tag{4.1-4.4} $$

PAGE 68

Therefore, the delay while a queue is present is given by Eq. (4.5), where,

$d_q$ = Average delay while queue present
$t_q$ = Time duration in queue
$t_r$ = Incident duration
$\lambda$ = Mean arrival rate
$\mu_r$ = Mean service rate during incident
$\mu$ = Mean service rate

The historical flow rate under non-incident conditions, or the flow rate from several upstream and ramp detectors, may be used to estimate the mean arrival rate ($\lambda$). The flow rate from upstream detectors during an incident may be used as the service flow rate during incident conditions ($\mu_r$). The flow rate from upstream detectors after an incident has ended may be used as the service flow rate ($\mu$). The weighted average delay at each location is calculated as:

$$ d_{qw} = \frac{\sum_{i=1}^{n} (d_q\, \lambda\, t_q)_i}{\sum_{i=1}^{n} (\lambda\, t_q)_i} \tag{4.6} $$

where,

$d_{qw}$ = Weighted average delay

PAGE 69

$(\lambda\, t_q)_i$ = Total number of arrivals while queue present for incident $i$
$n$ = Total number of incidents at each location

For the I-25 freeway, the weighted average delay is 7.10 minutes per vehicle for lane-blocking incidents and 3.65 minutes per vehicle for shoulder incidents (Table 4.5). Lane-blocking incidents cause higher delay to individual affected vehicles than shoulder incidents. The maximum and minimum average delay per vehicle per incident are 17.32 and 0 minutes for lane-blocking incidents, and 9.48 and 0 minutes for shoulder incidents. This shows that some shoulder incidents may cause higher average delay than some lane-blocking incidents. Therefore, shoulder incidents may also cause significant delays and are also important to detect. For all incidents, the weighted average delay is 6.07 minutes per vehicle. A pooled t-test shows that the average delays due to lane-blocking and shoulder incidents are not significantly different at the 5 percent level of significance (p-value = 0.10) (Table 4.6). The Kolmogorov-Smirnov (K-S) test also shows that the distributions of average delay due to lane-blocking and shoulder incidents are not significantly different at the 5 percent level of significance (p-value = 0.10).
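The weighting in Eq. (4.6) can be illustrated with a short sketch. The weighted average follows the definition above: each incident's average delay is weighted by its number of arrivals while the queue is present. The per-incident delay and queue duration, however, are computed here from the textbook deterministic queueing result with constant rates, $d_q = (\lambda - \mu_r)\, t_r / (2\lambda)$ and $t_q = t_r (\mu - \mu_r) / (\mu - \lambda)$, which is an assumption and may differ in form from Eq. (4.5). All function names and input values are hypothetical.

```python
def queue_duration(lam, mu_r, mu, t_r):
    """Time a queue is present (assumed deterministic queue, mu > lam > mu_r)."""
    return t_r * (mu - mu_r) / (mu - lam)

def average_delay(lam, mu_r, t_r):
    """Average delay per arrival while the queue is present (assumed form)."""
    return (lam - mu_r) * t_r / (2.0 * lam)

def weighted_average_delay(incidents):
    """Weight each incident's average delay by its arrivals, lam * t_q."""
    total_delay = 0.0
    total_arrivals = 0.0
    for lam, mu_r, mu, t_r in incidents:
        arrivals = lam * queue_duration(lam, mu_r, mu, t_r)
        total_delay += average_delay(lam, mu_r, t_r) * arrivals
        total_arrivals += arrivals
    return total_delay / total_arrivals

# Hypothetical incidents: (arrival rate, incident service rate,
# normal service rate) in veh/min and incident duration in minutes.
incidents = [
    (60.0, 30.0, 90.0, 20.0),   # d_q = 5 min, 2400 arrivals in queue
    (50.0, 40.0, 100.0, 30.0),  # d_q = 3 min, 1800 arrivals in queue
]
d_qw = weighted_average_delay(incidents)
print(round(d_qw, 3))
```

Weighting by arrivals means that an incident affecting many vehicles contributes more to the location-wide figure than a short incident under light traffic.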

PAGE 70

Table 4.5. Average delay due to incidents on the I-25 and the I-880 freeway

                                                   I-25                        I-880
Parameter                                          Lane-blocking   Shoulder    Lane-blocking   Shoulder
Weighted Average Delay (min/veh)                   7.10            3.65        2.72            1.84
Max. Average Delay (min/veh)                       17.32           9.48        12.27           25.68
Min. Average Delay (min/veh)                       0               0           0               0
Weighted Average Delay (min/veh) (all incidents)   6.07                        1.89

Table 4.6. Pooled t-test and Kolmogorov-Smirnov test for average delay

Location                                      Pooled t-test p-value   K-S test p-value
I-25, between lane-blocking and shoulder      0.10                    0.10
I-880, between lane-blocking and shoulder     0.016                   0.10
Across locations
  Lane-blocking                               0.008                   < 0.025
  Shoulder                                    0.027                   < 0.01
  All incidents                               < 0.001                 < 0.001

PAGE 71

For the I-880 freeway, the weighted average delay is 2.72 and 1.84 minutes per vehicle for lane-blocking and shoulder incidents, respectively. As on the I-25 freeway, lane-blocking incidents cause higher delay to individual affected vehicles than shoulder incidents. The maximum average delay for shoulder incidents (25.68 minutes per vehicle) is higher than for lane-blocking incidents (12.27 minutes per vehicle). As on the I-25 freeway, some shoulder incidents may cause higher average delay than lane-blocking incidents. Therefore, an incident detection algorithm that performs well in detecting both lane-blocking and shoulder incidents is desired. A pooled t-test shows that the average delays for lane-blocking and shoulder incidents are significantly different at the 5 percent level of significance (p-value = 0.016). A K-S test shows that the distributions of average delay for lane-blocking and shoulder incidents are not significantly different at the 5 percent level of significance (p-value = 0.10).

The weighted average delay for all incidents is 1.89 minutes per vehicle on the I-880 freeway and 6.07 minutes per vehicle on the I-25 freeway. Drivers on the I-25 freeway experience more than 3 times the delay due to an incident that drivers on the I-880 freeway experience. One of the reasons may be that the I-880 freeway has higher capacity (5 lanes) than the I-25 freeway (3 lanes). For a 3-lane freeway, an incident blocking one lane causes a 47 percent capacity reduction, and an incident blocking two lanes causes a 78 percent capacity reduction. For a 5-lane freeway, an incident blocking one lane causes a 25 percent capacity reduction, and an incident blocking two lanes causes a 50 percent capacity reduction (Blumentritt 1981). Across locations, the average delay and the distribution of average delay per vehicle per incident for the I-25 freeway and the I-880 freeway are significantly different for lane-blocking and shoulder incidents.
This shows that the characteristics of incidents in terms of their impact are significantly different between the two locations. Therefore, an incident detection algorithm developed for a specific location may not perform well

PAGE 72

if it is transferred to another location without recalibration, since the impact of incidents is different.

The incident rate, duration of incidents, distribution of duration of incidents, average delay, and distribution of average delay due to incidents for lane-blocking and shoulder incidents are not significantly different at the 5 percent level of significance for the I-25 freeway. For the I-880 freeway, the incident rate, distribution of duration of incidents, and average delay are significantly different at the 5 percent level of significance for lane-blocking and shoulder incidents. Across locations, the incident rate, duration of incidents, distribution of duration of incidents, average delay, and distribution of average delay are significantly different at the 5 percent level of significance for shoulder incidents and all incidents. The difference in duration of incidents may be due to the difference in response time for the two locations. The difference in average delay across locations is due to the differences in the severity of the incidents themselves and in freeway capacity.

PAGE 73

5. Methodology

As outlined in the review of the literature in Chapter 2, several techniques, including a decision-tree based pattern recognition technique, a time-series based statistical approach, catastrophe theory, and an artificial intelligence based neural network approach, have been applied to incident detection. To date, neural network based incident detection algorithms have been shown to perform best, with the highest detection rate and the lowest false alarm rate. The probabilistic neural network (PNN) has been shown to train faster than the multilayer feedforward (MLF) neural network. A modified form of PNN, with a principal component transformation of the inputs, also performs well (Abdulhai and Ritchie 1999). In a limited study using data collected from a traffic simulation, a generalized linear model (GLZ) was applied to freeway incident detection (Hoeschen 1999) and performed as well as the MLF models.

A generalized linear model differs from a general linear model in that the distribution of the dependent or response variable may be explicitly non-normal and may be categorical. The dependent variable is predicted from a linear combination of predictor variables through a link function. For a GLZ, the functions of the predictor variables are estimated parametrically. The generalized linear model may be expressed as:

$$ g(\mu) = \eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p \tag{5.1} $$

where,

$\mu$ = mean of the response variable

PAGE 74

$g(\mu) = \eta$ = a link function that links the random component, $\mu$, to the systematic component ($\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$).

For a Bernoulli distribution, the link function may be the logit. For the logit link function,

$$ g(\mu) = \log\left(\frac{\mu}{1-\mu}\right) $$

The generalized additive model (GAM) is a further generalization of the generalized linear model (GLZ). The generalized additive model is a nonparametric regression and smoothing technique that relaxes the assumption of linearity and uncovers the structure in the relationship between the independent variables and the dependent variable. Generalized additive models combine an additive assumption, which enables relatively many nonparametric relationships to be explored simultaneously, with the distributional flexibility of generalized linear models. Other nonparametric regression techniques do not perform well when the number of independent variables is large, because the variances of the estimates become unacceptably large, a problem often referred to as the 'curse of dimensionality'. The generalized additive model overcomes this problem since each of the individual terms is estimated using a univariate smoother, and the estimates of the terms explain how the response changes with the corresponding independent variables. A generalized additive model is appropriate where the dependent variable is not normally distributed and is categorical, and where the relationship between the variables is expected to be of a complex form that may not easily be modeled as a linear or non-linear model. The functional form may be suggested, or the structure of the relationship between the independent variables and the dependent variable may be explored more thoroughly, using generalized additive models.

PAGE 75

5.1 Generalized Additive Model (GAM)

To develop an incident detection model, the generalized additive modeling technique is applied. A generalized additive model (Hastie and Tibshirani 1990) consists of a random component, an additive component, and a link function relating the two components. The response $y$, the random component, is assumed to have a density in the exponential family,

$$ f_Y(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\} \tag{5.2} $$

where $\theta$ is the natural parameter and $\phi$ is the scale parameter. The mean of the response variable, $\mu$, is related to the set of independent variables $x_1, x_2, x_3, \dots, x_p$ by a link function $g$:

$$ g(\mu) = \eta = s_0 + \sum_{i=1}^{p} s_i(x_i) \tag{5.3} $$

where $s_1(\cdot), \dots, s_p(\cdot)$ are smooth functions defining the additive component, and the relationship between $\mu$ and $\eta$ is defined by $g(\mu) = \eta$. For a Bernoulli distribution, $E(Y) = P(Y = 1 \mid X)$ is the mean, and the link may be the logistic link (the canonical link) or the probit link. The generalized additive logistic model can therefore be expressed as follows:

$$ \log\left\{ \frac{P(y \mid x_i)}{1 - P(y \mid x_i)} \right\} = s_0 + \sum_{i=1}^{p} s_i(x_i) \tag{5.4} $$

where $\frac{P(y \mid x_i)}{1 - P(y \mid x_i)}$ is the odds of $y$ given $x_i$.
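As a small illustration of Eq. (5.4), the sketch below evaluates the probability of an incident from an additive predictor. The smooth terms here are hypothetical hand-written functions of occupancy and speed, chosen only to show the mechanics; in a fitted model they would be estimated from data.

```python
import math

def additive_logit_probability(x, intercept, smooth_terms):
    """P(y = 1 | x) for an additive logistic model: logit(P) = s0 + sum_i s_i(x_i)."""
    eta = intercept + sum(s(xi) for s, xi in zip(smooth_terms, x))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical smooth terms (assumptions for illustration only):
# higher upstream occupancy raises the log-odds, higher upstream speed lowers them.
smooth_terms = [
    lambda occupancy: 0.1 * (occupancy - 20.0),
    lambda speed: -0.05 * (speed - 50.0),
]

p_typical = additive_logit_probability((20.0, 50.0), 0.0, smooth_terms)
p_congested = additive_logit_probability((40.0, 30.0), 0.0, smooth_terms)
print(p_typical, p_congested)
```

Because the terms add on the logit scale, the contribution of each traffic measure to the log-odds can be read off and plotted separately, which is the interpretive property the model relies on.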

PAGE 76

5.1.1 Spline Smoothing Function

GAM estimates the nonparametric functions of the predictors via scatterplot smoothers. The trend of a response measurement or dependent variable as a function of one or more independent variables can be estimated by smoothing functions. A smoothing function is nonparametric in nature: it does not assume a rigid form for the dependence of the dependent variable on the independent variables. Examples of smoothing functions include the cubic smoothing spline and the locally weighted running-line smoother. This section gives a brief overview of the cubic smoothing spline. Each smoother is estimated by minimizing the following penalized residual sum of squares:

$$ \sum_{i=1}^{n} \{ y_i - s(x_i) \}^2 + \lambda \int_a^b \{ s''(t) \}^2 \, dt \tag{5.5} $$

where $\lambda$ is the smoothing parameter and $a \le x_1 \le \dots \le x_n \le b$. The first term measures closeness to the data. The second term penalizes curvature in the function. Large values of $\lambda$ produce smoother curves, while smaller values produce more wiggly curves.

For the fitted generalized additive model with cubic spline smoothers, the following expression is minimized for $p$ smoothers:

$$ \sum_{i=1}^{n} \Big\{ g(y_i) - \sum_{j=1}^{p} s_j(x_{ij}) \Big\}^2 + \sum_{j=1}^{p} \lambda_j \int \{ s_j''(t) \}^2 \, dt \tag{5.6} $$
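The trade-off expressed by Eq. (5.5) can be illustrated numerically. The sketch below does not fit a spline; it only evaluates a discrete analogue of the criterion for two candidate fits on an equally spaced grid, approximating the integral of the squared second derivative by squared second differences. The function name, the data, and the two candidate fits are hypothetical.

```python
def penalized_rss(y, fitted, lam, spacing=1.0):
    """Discrete analogue of Eq. (5.5): RSS + lam * (approximate integral of s''^2)."""
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    second_diffs = [
        (fitted[i - 1] - 2.0 * fitted[i] + fitted[i + 1]) / spacing ** 2
        for i in range(1, len(fitted) - 1)
    ]
    roughness = sum(d * d for d in second_diffs) * spacing
    return rss + lam * roughness

# Hypothetical observations on an equally spaced grid.
y = [0.0, 2.0, 1.0, 3.0, 2.0, 4.0]
wiggly = list(y)                         # interpolates the data, high curvature
straight = [0.8 * i for i in range(6)]   # a straight line, (near) zero curvature

for lam in (0.01, 1.0):
    print(lam, penalized_rss(y, wiggly, lam), penalized_rss(y, straight, lam))
```

With a small smoothing parameter the interpolating fit has the lower criterion, and with a large one the straight line does, which mirrors the statement that large values of the smoothing parameter produce smoother curves.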

PAGE 77

5.1.2 Fitting Generalized Additive Models

The generalized additive model is estimated using two iterative algorithms: the Local Scoring Algorithm (LSA) and the Backfitting Algorithm (BFA). An outer loop of the Local Scoring Algorithm and an inner loop of the Backfitting Algorithm are run until convergence. In each iteration of the Backfitting Algorithm, the splines are estimated. In each iteration of the Local Scoring Algorithm, an adjusted dependent variable and a set of weights are estimated so that iteratively reweighted least squares can be applied. The algorithms are as follows:

5.1.2.1 Local Scoring Algorithm

The iterative procedure for the Local Scoring Algorithm is as follows:

1. Initialization: $s_0 = g(E(y))$, $s_1^0(\cdot) = s_2^0(\cdot) = \dots = s_p^0(\cdot) = 0$, $m = 0$.

2. Iterations: $m = m + 1$. From the dependent variable, the predictor, and the mean based on the previous iteration, a new adjusted dependent variable may be estimated as:

$$ z_i = \eta_i^0 + (y_i - \mu_i^0) \left( \frac{\partial \eta_i}{\partial \mu_i} \right)_0 \tag{5.7} $$

For a binomial distribution, the equation for the adjusted dependent variable reduces to,

$$ z_i = \eta_i^0 + \frac{y_i - \mu_i^0}{\mu_i^0 (1 - \mu_i^0)} \tag{5.8} $$

PAGE 78

The weights $w_i$ are defined by,

$$ w_i = \left( \frac{\partial \mu_i}{\partial \eta_i} \right)_0^2 \left( V_i^0 \right)^{-1} \tag{5.9} $$

where $V_i^0$ is the variance of $Y$ at $\mu_i^0$. For a binomial distribution, it reduces to,

$$ w_i = \mu_i^0 (1 - \mu_i^0) \tag{5.10} $$

Using the Backfitting Algorithm, an additive model is fitted to $z$ with weights $w_i$ to obtain the estimated functions $s_j(\cdot)$.

3. Convergence: The iterations are continued until the deviance for the model fails to decrease or satisfies the convergence criterion.

5.1.2.2 Backfitting Algorithm

The Backfitting Algorithm is applied to estimate the smoothing functions $s_1(\cdot), \dots, s_p(\cdot)$ in the additive model in Eq. (5.3). The $j$th set of partial residuals may be estimated as,

$$ R_j = Y - s_0 - \sum_{k \ne j} s_k(X_k) \tag{5.11} $$

and $E(R_j \mid x_j) = s_j(x_j)$. Based on the estimates $\{ s_i(\cdot),\ i \ne j \}$, the smoothing function $s_j(\cdot)$ may be estimated. The iterative procedure is as follows:

1. Initialization:

$$ s_0 = E(Y); \quad s_1^0(\cdot) = \dots = s_p^0(\cdot) = 0, \quad m = 0 \tag{5.12} $$

2. Iterations: $m = m + 1$; for $j = 1$ to $p$,

$$ R_j = Z - s_0 - \sum_{k=1}^{j-1} s_k^m(x_k) - \sum_{k=j+1}^{p} s_k^{m-1}(x_k) \tag{5.13} $$

PAGE 79

The iteratively reweighted least-squares estimate is obtained by smoothing the weighted $R_j$ on $x_j$ [Eq. (5.14)], where the weights are estimated based on Eq. (5.9).

3. Convergence: The iterations are continued until the residual sum of squares, $\mathrm{Avg}\big\{ \big( y - s_0 - \sum_{j=1}^{p} s_j(x_j) \big)^2 \big\}$, either fails to decrease or satisfies the convergence criterion.

$z_i$ is then re-estimated based on the Local Scoring Algorithm, as an adjusted dependent variable for iteratively reweighted least squares. The algorithm regresses $z_i$ on $x$ with weights $w_i$ to obtain revised estimates. The new $\mu$, $\eta$, and $z_i$ are estimated, and the process is repeated until the change in deviance,

$$ D(y; \hat{\mu}) = 2 \{ l(\mu_{\max}; y) - l(\hat{\mu}; y) \} \tag{5.15} $$

is sufficiently small, where $\mu_{\max}$ is the parameter value that maximizes the likelihood $l(\mu; y)$ over all $\mu$ (the saturated model). Here, $l(\mu; y)$ is the log likelihood. For a binary response,

$$ l(\mu; y) = \sum_{i=1}^{n} \{ y_i \log P_i + (1 - y_i) \log (1 - P_i) \} \tag{5.16} $$

where $P$ = probability of $Y \mid x$.

The degrees of freedom for a generalized additive model can be expressed as,

$$ df = \mathrm{trace}\big( S(\lambda_j) \big) - 1 \tag{5.17} $$

where $S(\lambda_j)$ is a smoother operator. Since the trace of $S(\lambda_j)$ equals the sum of its eigenvalues, $df$ is that sum less one.
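The nesting of the two algorithms can be sketched in code. The following is an illustrative simplification, not the fitting software used in this research. It swaps the cubic smoothing spline for a weighted Gaussian-kernel smoother, runs fixed numbers of outer and inner iterations instead of testing the deviance and residual sum of squares for convergence, and re-estimates the constant as the weighted mean of the adjusted dependent variable. The function names, bandwidth, and data are all hypothetical.

```python
import math

def sigmoid(eta):
    """Inverse logit, with eta clipped to avoid overflow in exp."""
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))

def weighted_mean(values, w):
    return sum(wi * vi for wi, vi in zip(w, values)) / sum(w)

def kernel_smooth(x, r, w, bandwidth):
    """Weighted Gaussian-kernel smooth of r on x (stand-in for a smoothing spline)."""
    fitted = []
    for x0 in x:
        k = [wi * math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2)
             for xi, wi in zip(x, w)]
        fitted.append(sum(ki * ri for ki, ri in zip(k, r)) / sum(k))
    return fitted

def fit_logistic_gam(columns, y, bandwidth=3.0, outer_iters=8, inner_iters=5):
    n, p = len(y), len(columns)
    y_bar = sum(y) / n
    s0 = math.log(y_bar / (1.0 - y_bar))        # s0 = g(E(y))
    s = [[0.0] * n for _ in range(p)]            # s_j(.) = 0
    for _ in range(outer_iters):                 # Local Scoring (outer loop)
        eta = [s0 + sum(s[j][i] for j in range(p)) for i in range(n)]
        mu = [sigmoid(e) for e in eta]
        w = [m * (1.0 - m) for m in mu]          # weights, Eq. (5.10)
        z = [e + (yi - m) / wi                   # adjusted variable, Eq. (5.8)
             for e, yi, m, wi in zip(eta, y, mu, w)]
        s0 = weighted_mean(z, w)
        for _ in range(inner_iters):             # Backfitting (inner loop)
            for j in range(p):
                # Partial residuals: remove the constant and all other terms.
                r = [z[i] - s0 - sum(s[k][i] for k in range(p) if k != j)
                     for i in range(n)]
                fit = kernel_smooth(columns[j], r, w, bandwidth)
                centre = weighted_mean(fit, w)   # keep each term centred
                s[j] = [f - centre for f in fit]
    eta = [s0 + sum(s[j][i] for j in range(p)) for i in range(n)]
    return s0, s, [sigmoid(e) for e in eta]

# Hypothetical data: 20 observations, two predictors, binary response.
x1 = [float(i) for i in range(20)]
x2 = [float((7 * i) % 20) for i in range(20)]
y = [0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1]

s0, smooths, probs = fit_logistic_gam([x1, x2], y)
mean_p_incident = sum(p for p, yi in zip(probs, y) if yi == 1) / sum(y)
mean_p_normal = sum(p for p, yi in zip(probs, y) if yi == 0) / (len(y) - sum(y))
print(round(mean_p_incident, 3), round(mean_p_normal, 3))
```

Each pass of the inner loop smooths one set of partial residuals at a time, so only univariate smoothers are ever needed, regardless of the number of predictors.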

PAGE 80

It may be mentioned that dependency between independent variables is not as direct a problem in a generalized additive model as it is in ordinary regression, where the variance of the estimators is a function of the design matrix. In regression, multicollinearity can cause highly inflated variances, coefficients that are impossible to interpret individually, and numerically unstable estimates of the regression coefficients or a non-unique solution. In a generalized additive model, strict collinearity through the origin (i.e., $X_2 = cX_1$) between independent variables can cause the backfitting algorithm to converge to one of several non-unique solutions, determined by the starting functions (Hastie and Tibshirani 1990). If the independent variables are not strictly collinear, then the backfitting algorithm converges to the unique solution, independent of the starting functions.

The generalized additive logistic model can be used to develop an incident detection algorithm. The logistic link function is appropriate since the response or dependent variable is either incident or non-incident. GAM allows many independent variables to be included without the 'curse of dimensionality' that usually occurs with other nonparametric regression models. Independent variables may include traffic measures such as speed, occupancy, volume, bus speed, etc., for each lane and for several previous time intervals. A nonparametric model uses more flexible functions than a parametric model, which allows the relationship between the dependent and independent variables to be explored.

5.2 ".632 Bootstrap" Method

The misclassification rate measures how well a model predicts or classifies the class of a future observation. It measures model performance and can be used for model selection. The misclassification rate is defined as the probability of an incorrect

PAGE 81

classification. The probability refers to repeated sampling from the true population. If a model is fitted and tested on the same sample, the misclassification rate, the so-called "apparent error", is unrealistically low because it uses the same data both for fitting and for assessment. In order to get a more realistic estimate of the misclassification rate, a test sample that is separate from the training sample may be used.

There are several approaches to improve the estimate of the misclassification rate. Cross validation fits the model with the $i$th observation left out and then computes the predicted value for the $i$th observation. If this is done for each observation, the average misclassification rate may be calculated. The simplest bootstrap approach generates $B$ bootstrap samples, estimates the model on each, and then applies each fitted model to the original sample to give $B$ estimates of the misclassification rate. The overall estimate of the misclassification rate is the average of these $B$ estimates (Efron and Tibshirani 1993). However, the simple bootstrap method does not work very well. The .632 bootstrap performs best among several methods, including cross validation (Efron and Tibshirani 1993).

A bootstrap sample is generated by sampling $n$ times with replacement from the original sample of size $n$. The model is then fitted to the bootstrap sample and used to predict the response in the original sample. The .632 bootstrap differs from the standard bootstrap in that it predicts only for those observations that do not appear in the bootstrap sample. The idea behind the .632 bootstrap is to use the misclassification rate from just these cases to adjust for the optimism in the apparent error rate. Let $\hat{\varepsilon}_0$ be the average error in predicting the response, or the average misclassification rate, for observations that are in the original sample but not in the bootstrap sample. $err(x, F)$ is the apparent error computed by fitting and assessing on the original sample.
The .632 bootstrap estimate of optimism is defined as (Efron and Tibshirani 1993):

$$ \hat{\omega}^{.632} = .632 \left[ \hat{\varepsilon}_0 - err(x, F) \right] \tag{5.18} $$
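A minimal sketch of the .632 bootstrap computation is given below. It adds the optimism of Eq. (5.18) to the apparent error, which is the same as weighting the apparent error by .368 and the left-out error by .632. The classifier is a hypothetical single-variable threshold rule that stands in for the incident detection model, and the data, function names, number of bootstrap samples, and random seed are all assumptions made for illustration.

```python
import random

def fit_threshold_rule(sample):
    """Hypothetical classifier: cut halfway between the two class means."""
    ones = [x for x, label in sample if label == 1]
    zeros = [x for x, label in sample if label == 0]
    if not ones or not zeros:
        constant = 1 if ones else 0   # only one class present in the sample
        return lambda x: constant
    cut = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2.0
    return lambda x: 1 if x > cut else 0

def error_rate(rule, sample):
    return sum(rule(x) != label for x, label in sample) / len(sample)

def bootstrap_632(data, fit, n_boot=200, seed=0):
    rng = random.Random(seed)
    n = len(data)
    apparent = error_rate(fit(data), data)        # fitted and assessed on same data
    left_out_errors = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample n times with replacement
        chosen = set(idx)
        left_out = [data[i] for i in range(n) if i not in chosen]
        if not left_out:
            continue                              # every observation was drawn
        rule = fit([data[i] for i in idx])
        left_out_errors.append(error_rate(rule, left_out))
    eps0 = sum(left_out_errors) / len(left_out_errors)
    optimism = 0.632 * (eps0 - apparent)          # Eq. (5.18)
    return apparent + optimism, apparent, eps0

# Hypothetical (occupancy, incident indicator) observations.
data = [(5, 0), (8, 0), (10, 0), (12, 0), (14, 0), (16, 0),
        (9, 1), (18, 1), (20, 1), (22, 1), (25, 1), (30, 1)]
err_632, apparent, eps0 = bootstrap_632(data, fit_threshold_rule)
print(err_632, apparent, eps0)
```

The same loop can be run separately on incident and non-incident vectors to obtain bootstrap averages of the individual error types rather than a single overall rate.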

PAGE 82

The estimate of optimism in Eq. (5.18) is then added to the apparent error to obtain the .632 estimate of prediction error or misclassification rate:

$$ err^{.632} = err(x, F) + .632 \left[ \hat{\varepsilon}_0 - err(x, F) \right] = .368\, err(x, F) + .632\, \hat{\varepsilon}_0 \tag{5.19} $$

The value .632 is approximately the probability that a given observation appears in a bootstrap sample of size $n$ drawn with replacement from an original sample of size $n$. $\hat{\varepsilon}_0$ can be estimated as follows. Given a set of $B$ bootstrap samples, let $\hat{\varepsilon}_0^{\,j}$ be the prediction error computed for the $j$th bootstrap sample by

1. Estimating a prediction rule using this $j$th bootstrap sample.
2. Averaging the prediction error in predicting the response for observations not in this bootstrap sample but in the original sample.

This is repeated $B$ times, and the average is then

$$ \hat{\varepsilon}_0 = \frac{\sum_{j=1}^{B} \hat{\varepsilon}_0^{\,j}}{B} \tag{5.20} $$

For an incident detection (ID) algorithm, the response of each observation is either 0 (non-incident) or 1 (incident). Three different average misclassification rates may be estimated. Let $err_a$ be the average misclassification rate using the bootstrap method when a vector reflects non-incident but the ID algorithm indicates incident. Let $err_b$ be the average misclassification rate using the bootstrap method when a vector reflects incident but the ID algorithm indicates non-incident. Let $err_c$ be the average misclassification rate using the bootstrap method when, for an incident, the ID algorithm indicates non-incident instead of incident.

PAGE 83

$err_a$, $1 - err_b$, and $1 - err_c$ are the bootstrap average false alarm rate (FAR), the bootstrap average incident-state detection rate (ISDR), and the bootstrap average detection rate (DR), respectively. The ".632 bootstrap" method would provide a good estimate of the performance of the ID algorithm.

5.3 Model Evaluation

To evaluate the performance of an incident detection model, measures such as detection rate (DR), false alarm rate (FAR), and mean time to detect (TTD) are estimated. A new performance measure, ISDR, has been included to evaluate the performance of an incident detection model in classifying each incident vector. These measures are estimated as follows:

Detection Rate (DR): Percentage of incidents correctly classified by the model.

$$ \text{Detection Rate (DR)} = \frac{N_d}{N_t} \times 100 \tag{5.21} $$

where,

$N_d$ = Number of incidents detected
$N_t$ = Total number of incidents

Incident-State Detection Rate (ISDR): Percentage of the incident vectors classified correctly by an incident detection model. This is also referred to as the classification rate for incident state vectors.

$$ \text{Incident-State Detection Rate (ISDR)} = \frac{V_i^{+}}{V_i} \times 100 \tag{5.22} $$

where,

PAGE 84

$V_i^{+}$ = Number of incident vectors correctly classified
$V_i$ = Total number of incident vectors

False Alarm Rate (FAR): Percentage of the non-incident vectors classified incorrectly.

$$ \text{False Alarm Rate (FAR)} = \frac{V_{ni} - V_{ni}^{+}}{V_{ni}} \times 100 \tag{5.23} $$

where,

$V_{ni}^{+}$ = Number of non-incident vectors correctly classified
$V_{ni}$ = Total number of non-incident vectors

Mean Time to Detect (TTD): Average time elapsed between the beginning of an incident and the time when the incident detection model first detects the incident.

$$ \text{Mean Time to Detect (TTD)} = \frac{1}{N_t} \sum_{i=1}^{N_t} (t_i' - t_i) \tag{5.24} $$

where,

$t_i$ = Time incident $i$ starts
$t_i'$ = Time incident $i$ is first detected
$N_t$ = Total number of incidents
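The four measures in Eqs. (5.21) to (5.24) translate directly into code. The sketch below assumes that each time interval is represented by a true label and a predicted label (1 for incident, 0 for non-incident); the function names and the sample values are hypothetical.

```python
def detection_rate(n_detected, n_total):
    """DR, Eq. (5.21): percentage of incidents detected."""
    return 100.0 * n_detected / n_total

def incident_state_detection_rate(labels, predictions):
    """ISDR, Eq. (5.22): percentage of incident vectors classified correctly."""
    incident = [(l, p) for l, p in zip(labels, predictions) if l == 1]
    return 100.0 * sum(p == 1 for _, p in incident) / len(incident)

def false_alarm_rate(labels, predictions):
    """FAR, Eq. (5.23): percentage of non-incident vectors classified incorrectly."""
    normal = [(l, p) for l, p in zip(labels, predictions) if l == 0]
    correct = sum(p == 0 for _, p in normal)
    return 100.0 * (len(normal) - correct) / len(normal)

def mean_time_to_detect(start_times, detect_times):
    """TTD, Eq. (5.24): mean of (first detection time - start time)."""
    gaps = [d - s for s, d in zip(start_times, detect_times)]
    return sum(gaps) / len(gaps)

# Hypothetical vectors for ten time intervals containing two incidents.
labels      = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
predictions = [0, 1, 0, 1, 1, 0, 0, 0, 1, 0]

dr = detection_rate(n_detected=2, n_total=2)
isdr = incident_state_detection_rate(labels, predictions)
far = false_alarm_rate(labels, predictions)
ttd = mean_time_to_detect(start_times=[10.0, 40.0], detect_times=[12.0, 45.0])
print(dr, isdr, far, ttd)
```

In this example both incidents are eventually detected, so DR is 100 percent, even though only three of the five incident vectors are classified correctly. This is the distinction that ISDR is intended to capture.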

PAGE 85

6. Development of Generalized Additive Model for Freeway Incident Detection

Characteristics of traffic and incidents may vary considerably by location, depending on roadway geometry, driver behavior, and weather conditions. Therefore, the variables that are significant for an incident detection algorithm may vary by location as well. The impact of incidents on traffic may vary by location due to differences in the characteristics of incidents, traffic, roadway geometry, and driver behavior. These factors and the spatial distribution of fixed sensors also affect the structure of an incident detection model and its performance. Other studies have developed incident detection models for one location and examined their performance at other locations (Abdulhai 1996; Cheu and Ritchie 1995). This examination has been referred to as a transferability test. In this research, a different approach is taken. Significant independent variables for incident detection are identified, generalized additive models for incident detection are developed for two freeway sections, Interstate 25 in Colorado and Interstate 880 in California, and the differences between the models and their performance are examined.

An incident detection model is implemented by a traffic management center to detect freeway incidents or any event that disrupts the flow of traffic. The disruption may be caused by lane-blocking or shoulder incidents. Examples of lane-blocking incidents are accidents and debris on the freeway, and examples of shoulder incidents are stalled vehicles, vehicles with a flat tire, etc. Shoulder incidents can also cause a significant reduction of capacity at the affected location due to rubbernecking, as presented in Chapter 4. A traffic management center (TMC) is interested not only in detecting lane-blocking

PAGE 86

incidents but also in detecting shoulder incidents. If shoulder incidents are detected, then the TMC can send a service unit to clear the incident and assist motorists. An incident detection model developed and tested on both lane-blocking and shoulder incidents should provide better estimates of its performance at a traffic management center.

This chapter presents the model development procedure for fixed sensor based nonparametric generalized additive models for all incidents (lane-blocking and shoulder incidents) on the I-25 freeway, all incidents on the I-880 freeway, lane-blocking incidents on the I-880 freeway, and shoulder incidents on the I-880 freeway. The development procedures for mobile sensor based, and fixed and mobile sensor based, nonparametric generalized additive models for all incidents on the I-25 freeway are also presented in the chapter. The functional forms and their parameter estimates for the fixed sensor based generalized additive models, as suggested by the generalized additive model, are presented in this chapter. The model structures of the incident detection models developed are compared, and their implications are also examined in this chapter.

6.1 Selection of Independent Variables

To date, several incident detection algorithms have been developed based on traffic flow measures including occupancy, speed, and volume. Table 6.1 shows a sample of fixed-infrastructure based traffic measures used in previous studies and the additional variables considered for this study. The traffic measures used for the California Algorithm are the spatial occupancy difference (OCCDF), the temporal difference of downstream occupancy (DOCCTD), and the relative difference in spatial occupancy (OCCRDF).

PAGE 87

Table 6.1. Traffic measures for incident detection

Variable    Variable Description
UOCC        Upstream occupancy
USPD        Upstream speed
UVOL        Upstream volume
DOCC        Downstream occupancy
DSPD        Downstream speed
DVOL        Downstream volume
OCCDF       Spatial occupancy difference, UOCC - DOCC
OCCRDF      Relative difference in spatial occupancy, (UOCC_t - DOCC_t)/UOCC_t
DOCCTD      Temporal difference of downstream occupancy, (DOCC_t-2 - DOCC_t)/DOCC_t-2
UDEVOCC     UOCC_t - historical upstream occupancy at time interval t
DDEVOCC     DOCC_t - historical downstream occupancy at time interval t
UDSPDDF     Spatial speed difference, USPD_t - DSPD_t
UDDEVOCC    Difference in deviation of upstream and downstream occupancy from historical occupancy, UDEVOCC_t - DDEVOCC_t-2

Time intervals used in each study ("-" = not used):

Variable    Payne and       Cheu and          Dia, Rose,      Abdulhai and      Variables considered
            Tignor, 1978    Ritchie, 1995     et al., 1997    Ritchie, 1999     for this study
UOCC        -               t, t-1, .., t-4   t               t, t-1, .., t-4   t, t-1, .., t-4
USPD        -               -                 t               -                 t, t-1, .., t-4
UVOL        -               t, t-1, .., t-4   t               t, t-1, .., t-4   t, t-1, .., t-4
DOCC        t               t, t-1, t-2       t               t, t-1, t-2       t, t-1, .., t-4
DSPD        -               -                 t               -                 t, t-1, .., t-4
DVOL        -               t, t-1, t-2       t               t, t-1, t-2       t, t-1, .., t-4
OCCDF       t               -                 -               -                 t
OCCRDF      t               -                 -               -                 t
DOCCTD      t               -                 -               -                 t
UDEVOCC     -               -                 -               -                 t
DDEVOCC     -               -                 -               -                 t
UDSPDDF     -               -                 -               -                 t
UDDEVOCC    -               -                 -               -                 t

Note: (t-n) = Data lagged n time intervals.

PAGE 88

The 16 variables used in neural network models (Abdulhai 1996; Cheu and Ritchie 1995) include upstream occupancy and volume, and downstream occupancy and volume, for up to four time intervals. The additional variables considered in this study include the deviation of upstream occupancy from historical upstream occupancy (UDEVOCC), the deviation of downstream occupancy from historical downstream occupancy (DDEVOCC), the spatial speed difference (UDSPDDF), and the difference in deviation of upstream occupancy and downstream occupancy from historical occupancy (UDDEVOCC).

To develop generalized additive models, the selection of significant independent variables is based on the following procedure.

1. Box-Whisker plots of the variables showing the median, quartiles, and extreme values are examined. The box represents the interquartile range and contains 50 percent of the values. The whiskers, lines extended from the box, show the minimum and maximum values excluding outliers. Variables whose values are significantly different under incident conditions have a strong location shift between incident and non-incident conditions. The variables with strong shifts are identified.

2. Univariable Generalized Additive Models (GAM) with the significant variables identified in step 1 are developed based on the training data set. The deviances of the fitted models on the training set are examined. Potential variables are ranked by deviance, from lowest to highest. The models with the lowest deviance are used to identify the main effects. The deviance or likelihood-ratio statistic for a fitted model $\hat{\mu}$ is defined by

$$ D(y; \hat{\mu}) = 2 \{ l(\mu_{\max}; y) - l(\hat{\mu}; y) \} \tag{6.1} $$

PAGE 89

where,

$\mu_{\max}$ is the parameter value for the model that maximizes $l(\mu; y)$ over all $\mu$ (the saturated model).
$l(\hat{\mu}; y)$ is the log likelihood of a fitted model.

Deviance is used as the goodness-of-fit measure to compare models (Hastie and Tibshirani 1990).

3. Following the univariable analysis, a multivariable analysis is conducted based on forward stepwise variable selection, starting with the independent variable with the lowest deviance from the univariable analysis. A variable is added to the model if the analysis shows a significant change in deviance. The Chi-Square test at a 5% level of significance ($\chi^2_{\Delta df_{err},\, 0.95}$) is applied to test the significance of the improvement in deviance ($\Delta$deviance), where $\Delta df_{err}$ is the expected change in deviance. The degrees of freedom, $df_j$, is an approximate estimate of $\Delta df_{err}$ for models with and without the $j$th term.

4. In addition to step 3, the GAM is developed by adding one variable at a time, starting with the explanatory variables identified in step 2. Model performance at this step is examined based on the DR and FAR on the test set. In this step, the last model includes all variables from step 2. Other subset models are also examined.

5. The performance of the selected model from step 3 and of all models in step 4 is examined. The final model is selected based on its performance (DR and FAR) on the testing set.

The independent variables considered in all the generalized additive models developed for incident detection are not strictly collinear. Therefore, the backfitting algorithm converges to a unique solution.
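The deviance comparison used in steps 2 and 3 can be sketched as follows. For a binary response the saturated model has a log likelihood of zero, so the deviance reduces to minus twice the log likelihood of the fitted probabilities. The sketch assumes that the added term uses one degree of freedom, for which the 0.95 chi-square quantile is 3.841; a smooth term would generally use a non-integer number of degrees of freedom and a different critical value. The function names and the fitted probabilities are hypothetical.

```python
import math

def binomial_deviance(y, p):
    """Deviance for a binary response: -2 * log likelihood (saturated log likelihood is 0)."""
    return -2.0 * sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

CHI2_95_DF1 = 3.841  # 0.95 quantile of chi-square with 1 degree of freedom

def improves_fit(dev_without, dev_with, critical_value=CHI2_95_DF1):
    """True if the drop in deviance exceeds the chi-square critical value."""
    return (dev_without - dev_with) > critical_value

# Hypothetical fitted probabilities from two nested models.
y = [0, 0, 1, 1]
p_without = [0.5, 0.5, 0.5, 0.5]   # model without the candidate variable
p_with = [0.1, 0.2, 0.8, 0.9]      # model with the candidate variable

dev_without = binomial_deviance(y, p_without)
dev_with = binomial_deviance(y, p_with)
keep_variable = improves_fit(dev_without, dev_with)
print(round(dev_without, 3), round(dev_with, 3), keep_variable)
```

In a forward stepwise search this comparison would be repeated for each remaining candidate variable, adding the one with the largest significant drop in deviance at each stage.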

PAGE 90

Four different incident detection models are being developed: a model for all incidents on the I-25 freeway, a model for all incidents on the I-880 freeway, a lane-blocking incident model for the I-880 freeway, and a shoulder incident model for the I-880 freeway. The same variable selection procedure is applied to all models. The variable selection procedures for a few specific models are also presented.

Fixed Sensor Based Incident Detection Models

As part of step 1, Figure 6.1, Figure 6.2, and Figure 6.3 show a strong shift of location of the upstream occupancy, upstream speed, downstream speed, and the deviation of upstream occupancy from historical upstream occupancy under incident conditions for all incidents on the I-25 freeway, all incidents on the I-880 freeway, and lane-blocking incidents on the I-880 freeway, respectively. Under non-incident conditions, upstream occupancy ranges from 0 to 28 percent, 5 to 15 percent, and 5 to 13 percent, and under incident conditions, the upstream occupancy ranges from 10 to 69 percent, 8 to 42 percent, and 10 to 38 percent for all incidents on the I-25 freeway, all incidents on the I-880 freeway, and lane-blocking incidents on the I-880 freeway, respectively.

In step 1, for both the I-25 and I-880 data, the Box-Whisker plots show a strong shift of location for upstream occupancy between incident and non-incident conditions. Other variables that show a strong shift of location include USPD, UVOL, UDEVOCC, DOCC, DSPD, DVOL, DDEVOCC, OCCDF, OCCRDF, UDDEVOCC, and UDSPDDF. The Box-Whisker plots also show that variables at the current time interval and at previous time intervals show a similar location shift. In order to develop a parsimonious model, only variables at the current time interval are used as independent variables for the GAM.
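The location shift examined in step 1 can also be summarized numerically. The sketch below computes the quartiles that define the box of a Box-Whisker plot for incident and non-incident conditions. The rule used to flag a strong shift (the boxes do not overlap) is an illustrative assumption rather than the criterion used in this study, and the occupancy values are hypothetical.

```python
import statistics

def box_summary(values):
    """Quartiles and extremes of a sample (the box of a Box-Whisker plot)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

def boxes_do_not_overlap(incident, non_incident):
    """Assumed rule of thumb: flag a strong shift when the two boxes are disjoint."""
    inc, non = box_summary(incident), box_summary(non_incident)
    return inc["q1"] > non["q3"] or inc["q3"] < non["q1"]

# Hypothetical upstream occupancy values (percent).
non_incident_occ = [5, 7, 8, 9, 10, 11, 12, 13, 15]
incident_occ = [10, 18, 25, 30, 38, 42, 50, 60, 69]

non_box = box_summary(non_incident_occ)
inc_box = box_summary(incident_occ)
strong_shift = boxes_do_not_overlap(incident_occ, non_incident_occ)
print(non_box, inc_box, strong_shift)
```

A summary of this kind can be used to screen a long list of candidate variables and lags before the plots themselves are inspected.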


Figure 6.1. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-25 freeway (variables plotted: UOCC, USPD, DSPD, UDEVOCC)

Figure 6.2. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-880 freeway for all incidents (variables plotted: UOCC, USPD, DSPD, UDEVOCC)


Figure 6.3. Box-Whisker Plot for incident (1) and non-incident (0) conditions on the I-880 freeway for lane-blocking incidents (variables plotted: UOCC, USPD, DSPD, UDEVOCC)

In step 2, the univariable model including the deviation of upstream occupancy from historical upstream occupancy provides the lowest deviance for the I-25 incidents, I-880 shoulder incidents, and I-880 lane-blocking and shoulder incidents. For the I-880 lane-blocking incident model, the univariable model with the upstream speed provides the lowest deviance (Table 6.2). For both the I-25 and I-880 freeways, the univariable models with the three lowest deviances include the upstream variables. All the models for the I-880 freeway also include downstream variables. The multivariable models are developed based on these findings.


Table 6.2. Univariable model based on Deviance

No.   I-25        I-880            I-880       I-880 All Incidents
                  Lane-blocking    Shoulder    (Lane-blocking and Shoulder)
1     UDEVOCC     USPD             UDEVOCC     UDEVOCC
2     USPD        UOCC             USPD        USPD
3     UOCC        UDEVOCC          UOCC        UOCC
4     UDDEVOCC    DSPD             DDEVOCC     DDEVOCC
5     OCCDF       DOCC             DSPD        DSPD
6     UDSPDDF     DDEVOCC          DOCC        DOCC
7     OCCRDF      UDDEVOCC         UDDEVOCC    OCCDF
8     DDEVOCC     OCCDF            OCCDF       UDDEVOCC
9     UVOL        DVOL             OCCRDF      OCCRDF
10    DOCC        UVOL             UDSPDDF     UDSPDDF
11    DSPD        OCCRDF           UVOL        DVOL
12    DVOL        UDSPDDF          DVOL        UVOL

Table 6.3 summarizes the results from fitting a number of multivariable generalized additive models for all incidents (lane-blocking and shoulder incidents) for the I-880 freeway (steps 3 and 4). For each model, the table shows the deviance of the fitted model, the change of deviance, and the expected change of deviance ($\Delta df$). The first model in Table 6.3 includes the UDEVOCC variable. For the second model, the second variable in Table 6.2, USPD, is added to the first model. The deviance decreases by 1764.31. The expected change ($\Delta df$) is 6.63. A Chi-Square test shows a significant decrease of deviance (p-value = 0) as the USPD variable is added, and therefore USPD is added to the model. Similarly, other variables are added in a forward stepwise method. The model with the lowest


Table 6.3. Analysis of Deviance for preliminary models for all incidents on the I-880 freeway


deviance from the forward stepwise procedure (model #1 to model #12) in step 3 is model #12. As part of step 4, by adding one variable at a time, model #13 to model #22 are obtained. Based on the detection rate and the false alarm rate on the test set, the performance of model #12 is compared to models #13 to #22. The final selected model is model #21. In steps 2, 3, and 4, the convergence criterion selected for the Local Scoring Algorithm is 0.5 and for the Backfitting Algorithm is 0.01.

The significant independent variables and analysis of deviance of the final selected models for the I-25 freeway and the I-880 freeway are summarized in Table 6.4. The degrees of freedom (df) of a variable represent the flexibility of its function. The Chi-Square test is used to compare the deviance between the full model and the model without this variable. The p-value of this test is also reported in Table 6.4.

Table 6.4. Analysis of Deviance for the generalized additive models

Selected       I-25 Freeway       I-880 Freeway        I-880 Freeway       I-880 Freeway
significant    All incidents      Lane Blocking (LB)   Shoulder (SHLD)     All incidents
variable                          Incidents            Incidents
               df      p-value    df      p-value      df      p-value     df      p-value
UOCC           4.45    <0.0001    4.00    <0.0001      5.34    <0.0001     5.35    <0.0001
USPD           4.49    <0.0001    8.27    <0.0001      1.00    <0.0001     1.00    <0.0001
UDEVOCC        11.58   <0.0001    13.29   <0.0001      26.02   <0.0001     1.84    <0.0001
DOCC           -       -          4.00    <0.0001      5.35    <0.0001     5.34    <0.0001
DSPD           -       -          8.33    <0.0001      0.99    <0.0001     1.00    <0.0001
OCCDF          -       -          -       -            5.38    <0.0001     5.35    <0.0001

DF = Degree of freedom. The column order of the two OCCDF entries (5.38 and 5.35) is uncertain in the source.


6.1.1 Significant Independent Variables for Fixed Sensor Based Incident Detection Models

Typically, it is assumed that as an incident occurs, the occupancy increases and the speed decreases upstream of an incident, and occupancy decreases and speed increases downstream of an incident. Comparing the incident detection models for the two locations, the upstream occupancy, speed, and the occupancy deviation from historical occupancy are significant for the I-25 freeway. For the I-880 freeway, both the upstream and downstream variables are significant. This may be due to the difference in the characteristics of the incidents and the freeway capacity. A study (Blumentritt 1981) has shown that the percentage reduction of capacity due to an incident varies based on the number of lanes of freeway available at a specific location. The study has shown that for a 3-lane freeway, one-lane blocking incidents cause a 47 percent reduction in capacity and two-lane blocking incidents cause a 78 percent reduction in capacity. For a 5-lane freeway, one-lane and two-lane blocking incidents cause 25 percent and 50 percent reductions in capacity, respectively. Since an incident causes a higher reduction in capacity on the 3-lane I-25 freeway, traffic variables measuring upstream conditions are significant. For the I-880 freeway, lane-blocking incidents cause a higher impact on traffic flow than shoulder incidents; therefore, fewer variables are needed.

An examination of the differences in the characteristics of the incidents on the I-880 and the I-25 in the previous chapter shows that the incidents differ significantly by type, duration, and average delay across the two locations. The average delay due to incidents is higher for the I-25 freeway (6.07 minutes per vehicle) than for the I-880 freeway (1.89 minutes per vehicle). Therefore, the variables significant for an incident detection model differ by type of incident and by location.


6.1.2 Significant Independent Variables for Mobile Sensor Based Incident Detection Model

For the mobile sensor based incident detection model, the independent variables examined are average bus speed (AVGSPD), historical bus speed (HISTSPD), and deviation of bus speed from historical bus speed (DEVSPD). The same variable selection procedure as described in section 6.1 is followed. Table 6.5 summarizes the results of fitting a number of mobile sensor based incident detection models. For each model, the table shows the deviance of the fitted model, the change of deviance ($\Delta$ Deviance), and the expected change of deviance ($\Delta df$). The first model in Table 6.5 is fitted with the AVGSPD variable. For the second model, HISTSPD is added to the first model. The deviance decreases by 174.65. The expected change of deviance is 100. A Chi-Square test shows a significant decrease of deviance (p-value < 0.001) as the HISTSPD variable is added, and therefore HISTSPD is added to the model. A similar procedure is applied to the other models. The model with the lowest deviance from the forward stepwise procedure (model #1 to model #3) is model #3. Another combination of independent variables (model #4) is also examined. The model with the best performance based on detection rate and false alarm rate on the test set is model #2, with average bus speed and historical bus speed as independent variables.

Table 6.6 shows the analysis of deviance and the significant independent variables of the final model. The degrees of freedom (df) of a variable represent the flexibility of its function. The Chi-Square test is used to compare the deviance between the full model and the model without this variable. Both variables are significant at the 5 percent level of significance.


Table 6.5. Analysis of Deviance for preliminary models for mobile sensor based incident detection models

Model   Variables included            Compare with   Deviance    Δ Deviance   Δ df   p-value
                                      model no.
1       AVGSPD                        -              198.49      -            -      -
2       AVGSPD, HISTSPD               1              23.84       174.65       100    <0.001
3       AVGSPD, HISTSPD, DEVSPD       2              0.06        23.78        100    <0.001
4       (not legible in source)       -              15307.65    -            -      -

Table 6.6. Analysis of Deviance for mobile sensor based incident detection model

Selected significant variable        Degree of freedom   p-value
Average bus speed (AVGSPD)           100                 <0.0001
Historical bus speed (HISTSPD)       40                  <0.0001

6.1.3 Significant Independent Variables for Fixed and Mobile Sensor Based Incident Detection Model

The fixed sensor based incident detection model is developed based on three significant independent variables: upstream occupancy, upstream speed, and deviation of upstream occupancy from historical occupancy, as described in section 6.1.1. The fixed and mobile sensor based incident detection model is developed based on the three significant independent variables from fixed sensors and average probe speed from mobile sensors. As a result, the incident detection model with additional probe vehicle data provides a higher detection rate at 0 percent false alarm rate.


Table 6.7 shows the analysis of deviance and the significant independent variables of the model. The degrees of freedom (df) of a variable represent the flexibility of its function. The Chi-Square test is used to compare the deviance between the full model and the model without that variable. All variables are significant at the 5 percent level of significance.

Table 6.7. Analysis of Deviance for fixed and mobile sensor based incident detection model

Variable               Degree of freedom   p-value
UOCC                   7.90                <0.0001
USPD                   1.00                <0.0001
UDEVOCC                11.66               <0.0001
AVGSPD (probe data)    150.0               <0.0001

6.2 Parametric Estimate of Fixed Sensor Based Generalized Additive Model for Incident Detection

The models presented in the previous sections are non-parametric generalized additive models for incident detection. Each independent variable is fitted by a smoother such as a cubic spline. A parametric form of the model is proposed by examining the partial prediction, or effect, of each independent variable on the response. A parametric model provides simple functional forms for easy implementation. One of the advantages of the parametric model over the nonparametric model is that it requires less time to


develop estimates of the parameters, as the functional forms are specified in the parametric model. The training time for a parametric model may be less than one time interval (e.g. 30 seconds). Therefore, real-time re-calibration may be performed to adjust the parameter estimates. This section presents the parametric model estimation of the generalized additive models for incident detection for lane-blocking and shoulder incidents for a freeway section in Colorado and one in California, and the differences in the model structure and parameter estimates.

6.2.1 Partial Prediction

One of the advantages of the generalized additive models over other algorithms is that they allow an examination of the fitted function of each independent variable to explore its functional form. A partial prediction plot may be used to examine the effect of each independent variable on the response and to suggest parametric functions for each variable in the model. The functions of the model are parametrized in terms of one- or two-parameter families of functions.

Figure 6.4, Figure 6.5, Figure 6.6, and Figure 6.7 show partial prediction plots for the incident detection models for all incidents on the I-25 freeway, all incidents on the I-880 freeway, lane-blocking incidents on the I-880 freeway, and shoulder incidents on the I-880 freeway, respectively. The examination of the partial prediction plots shows that the deviation of upstream occupancy from historical upstream occupancy (UDEVOCC) has the largest effect on the response in all models. The UDEVOCC functions for all models may be considered piecewise linear. The partial prediction plots also show that the upstream occupancy (UOCC) functions for all models may be represented by polynomials. The upstream speed (USPD) functions may be represented by polynomials for the incident detection models for all


incidents on the I-25 freeway and the lane-blocking incidents on the I-880 freeway. The upstream speed (USPD) functions may be represented by linear functions for the incident detection model for all incidents and for the shoulder incident model for the I-880 freeway.

Figure 6.4. Partial Prediction for UDEVOCC, UOCC, and USPD for all incidents on the I-25 freeway


Figure 6.5. Partial Prediction for UDEVOCC, UOCC, DOCC, USPD, DSPD, and OCCDF for all incidents on the I-880 freeway

Figure 6.6. Partial prediction for UDEVOCC, UOCC, DOCC, USPD, and DSPD for the lane-blocking incident model for the I-880 freeway


Figure 6.7. Partial prediction for UDEVOCC, UOCC, DOCC, USPD, DSPD, and OCCDF for the shoulder incident detection model for the I-880 freeway

UDEVOCC, UOCC, and USPD are common independent variables for all models for both freeways. For all models for the I-880 freeway, the downstream occupancy (DOCC) functions may be represented by polynomials. The downstream speed (DSPD) functions may be represented by a polynomial for the lane-blocking incident model and by a linear function for the shoulder incident model and the incident detection model for all incidents. The partial prediction plots for lane-blocking and shoulder incidents for the I-880 freeway in Figure 6.6 and Figure 6.7 also show significantly different structures of the models. Based on these partial prediction plots, parametric functions are proposed.


6.2.2 Generalized Additive Model (GAM) Parametric Estimate for Fixed Sensor Based Incident Detection Model

The parametric models are developed by fitting functional forms of independent variables parametrically. The functional forms are suggested by examining the partial predictions of each independent variable. A t-test is used to evaluate the significance of parameter estimates.

For all incidents (lane-blocking and shoulder incidents) on the I-25 freeway, a generalized additive model (GAM) for freeway incident detection is developed based on a piecewise linear function of UDEVOCC and quadratic functions of UOCC and USPD. The parametric estimate of the GAM for the I-25 data is as follows:

\[
\eta = \log\left(\frac{p(x)}{1-p(x)}\right) = -0.8437 +
\begin{cases}
-0.0059\,(udevocc) & \text{if } udevocc \le 0\,\% \\
0.2394\,(udevocc) & \text{if } udevocc > 0\,\%
\end{cases}
+ 0.0319\,(uocc) - 0.0022\,(uocc)^2 - 0.0739\,(uspd) + 0.0006\,(uspd)^2 \tag{6.2}
\]

where $p(x) = \text{probability}(Y = 1 \mid x)$.

The parameter estimates for the model presented in Table 6.8 show that all estimates are significantly different from zero, as determined by the size of the estimate relative to its estimated asymptotic standard error and by the p-value for each estimate, an upper bound of a Type I error assessed by the t-statistic. The exception is the UDEVOCC estimate for the piecewise segment where UDEVOCC is less than zero percent, for which the Type I error is higher than five percent. The variable was included based on performance of the model on the test set.
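As a rough illustration of how the parametric estimate in Eq. (6.2) turns traffic measures into an incident probability, the sketch below evaluates the linear predictor with the coefficients reported for the I-25 model and applies the inverse logit. The function names and the input values in the example are hypothetical and chosen only for demonstration; they are not observations from the study.

```python
import math


def eta_i25(udevocc: float, uocc: float, uspd: float) -> float:
    """Linear predictor of Eq. (6.2); occupancies in percent, speed in mph."""
    # Piecewise linear UDEVOCC term: separate slopes below and above zero.
    if udevocc <= 0:
        udev_term = -0.0059 * udevocc
    else:
        udev_term = 0.2394 * udevocc
    return (-0.8437 + udev_term
            + 0.0319 * uocc - 0.0022 * uocc ** 2
            - 0.0739 * uspd + 0.0006 * uspd ** 2)


def incident_probability(udevocc: float, uocc: float, uspd: float) -> float:
    """Inverse logit of the linear predictor, p(x) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta_i25(udevocc, uocc, uspd)))


if __name__ == "__main__":
    # Hypothetical inputs: a congested-looking interval and a free-flow one.
    print(incident_probability(udevocc=20.0, uocc=30.0, uspd=25.0))
    print(incident_probability(udevocc=0.0, uocc=10.0, uspd=60.0))
```

In an operational setting the probability would be compared against a decision threshold; the choice of threshold is not specified in this section and is left out of the sketch.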


Table 6.8. Model analysis for parametric estimate of generalized additive model for all incidents for the I-25 freeway

No.   Variables         Parameter Estimate   Standard Error   t-statistic   p-value
1     Intercept         -0.8437              0.3023           -2.79         0.0053
2     UDEVOCC ≤ 0%      -0.0059              0.0041           -1.46         0.1455
3     UDEVOCC > 0%      0.2394               0.0097           24.63         <.0001
4     UOCC              0.0319               0.0113           2.82          0.0049
5     UOCC²             -0.0022              0.0003           -8.46         <.0001
6     USPD              -0.0739              0.0118           -6.27         <.0001
7     USPD²             0.0006               0.0001           4.25          <.0001

The I-880 freeway model for all incidents includes a piecewise linear function of UDEVOCC, polynomial functions of UOCC, DOCC, and OCCDF, and linear functions of USPD and DSPD. The parametric estimate of the GAM for all incidents on the I-880 freeway is:

\[
\begin{aligned}
\eta = \log\left(\frac{p(x)}{1-p(x)}\right) = {} & -6.1565 +
\begin{cases}
60.6206\,(udevocc) & \text{if } udevocc \le 0\,\% \\
0.5150\,(udevocc) & \text{if } udevocc > 0\,\%
\end{cases} \\
& + 1.1085\,(uocc) - 0.0578\,(uocc)^2 + 0.0009\,(uocc)^3 - 0.0248\,(uspd) \\
& + 0.3123\,(docc) - 0.0185\,(docc)^2 + 0.0003\,(docc)^3 - 0.0303\,(dspd) \\
& + 0.0019\,(occdf)^2
\end{aligned} \tag{6.3}
\]

The parameter estimates for the model presented in Table 6.9 show that all estimates are significantly different from zero, as determined by the size of the estimate relative to its estimated asymptotic standard error and by the p-value for each estimate, an upper bound of a Type I error assessed by the t-statistic.

The I-880 freeway model for lane-blocking incidents includes three linear functions of UDEVOCC and polynomial functions of UOCC, USPD, DOCC, and DSPD. The


parametric estimate of the GAM for lane-blocking incidents for the I-880 data is defined as:

\[
\begin{aligned}
\eta = \log\left(\frac{p(x)}{1-p(x)}\right) = {} & -3.1099 +
\begin{cases}
-0.0039\,(udevocc) & \text{if } udevocc \le 0\,\% \\
0.7551\,(udevocc) & \text{if } 0\,\% < udevocc \le 5\,\% \\
0.3314\,(udevocc) & \text{if } udevocc > 5\,\%
\end{cases} \\
& + 0.0619\,(uocc) - 0.0019\,(uocc)^2 \\
& + \begin{cases}
0.3587\,(uspd) - 0.0061\,(uspd)^2 & \text{if } uspd \le 55 \text{ mph} \\
0.0262\,(uspd) & \text{if } uspd > 55 \text{ mph}
\end{cases} \\
& - 0.0027\,(docc)^2 \\
& + \begin{cases}
-0.0883\,(dspd) + 0.0011\,(dspd)^2 & \text{if } dspd \le 45 \text{ mph} \\
-0.0298\,(dspd) & \text{if } dspd > 45 \text{ mph}
\end{cases}
\end{aligned} \tag{6.4}
\]

Table 6.9. Model analysis for parametric estimate of generalized additive model for the all incidents model for the I-880 freeway

No.   Variables         Parameter Estimate   Standard Error   t-statistic   p-value
1     Intercept         -6.1565              0.3770           -16.33        <.0001
2     UDEVOCC ≤ 0%      60.6206              0.3556           170.47        <.0001
3     UDEVOCC > 0%      0.5151               0.0071           72.86         <.0001
4     UOCC              1.1085               0.0479           23.12         <.0001
5     UOCC²             -0.0578              0.0027           -21.52        <.0001
6     UOCC³             0.0009               0.0001           18.42         <.0001
7     USPD              -0.0248              0.0022           -11.47        <.0001
8     DOCC              0.3123               0.0344           9.07          <.0001
9     DOCC²             -0.0185              0.0018           -10.20        <.0001
10    DOCC³             0.0003               0.0001           8.98          <.0001
11    DSPD              -0.0303              0.0029           -10.18        <.0001
12    OCCDF²            0.0019               0.0003           7.56          <.0001

The parameter estimates for the model presented in Table 6.10 show that all estimates are significantly different from zero, as determined by the size of the estimate relative to its estimated asymptotic standard error and by the p-value for each estimate, an


upper bound of a Type I error assessed by the t-statistic. The exception is the UDEVOCC estimate for the piecewise segment where UDEVOCC is less than zero percent, for which the Type I error is higher than five percent. The variable was included based on performance of the model on the test set.

Table 6.10. Model analysis for parametric estimate of generalized additive model for the lane-blocking incident model for the I-880 freeway

No.   Variables              Parameter Estimate   Standard Error   t-statistic   p-value
1     Intercept              -3.1099              0.5027           -6.19         <0.0001
2     UDEVOCC ≤ 0%           -0.0039              0.0069           -0.56         0.5761
3     0% < UDEVOCC ≤ 5%      0.7551               0.0301           25.06         <0.0001
4     UDEVOCC > 5%           0.3314               0.0157           21.09         <0.0001
5     UOCC                   0.0619               0.0252           2.46          0.0141
6     UOCC²                  -0.0019              0.0008           -2.37         0.0176
      If USPD ≤ 55 mph
7     USPD                   0.3587               0.0189           18.97         <0.0001
8     USPD²                  -0.0061              0.0003           -23.53        <0.0001
      If USPD > 55 mph
9     USPD                   0.0262               0.0065           4.02          <0.0001
10    DOCC²                  -0.0027              0.0003           -9.21         <0.0001
      If DSPD ≤ 45 mph
11    DSPD                   -0.0883              0.0199           -4.44         <0.0001
12    DSPD²                  0.0011               0.0004           2.83          0.0047
      If DSPD > 45 mph
13    DSPD                   -0.0298              0.0054           -5.51         <0.0001

The model for shoulder incidents on the I-880 freeway includes a piecewise linear function of UDEVOCC, polynomial functions of UOCC, DOCC, and OCCDF, and linear functions of USPD and DSPD. The parametric estimate of the GAM for shoulder incidents for the I-880 data is defined as:


\[
\begin{aligned}
\eta = \log\left(\frac{p(x)}{1-p(x)}\right) = {} & -6.3622 +
\begin{cases}
65.4034\,(udevocc) & \text{if } udevocc \le 0\,\% \\
0.5080\,(udevocc) & \text{if } udevocc > 0\,\%
\end{cases} \\
& + 1.1241\,(uocc) - 0.0579\,(uocc)^2 + 0.0009\,(uocc)^3 - 0.0179\,(uspd) \\
& + 0.2605\,(docc) - 0.0159\,(docc)^2 + 0.0002\,(docc)^3 - 0.0302\,(dspd) \\
& + 0.0019\,(occdf)^2
\end{aligned} \tag{6.5}
\]

The parameter estimates for the model presented in Table 6.11 show that all estimates are significantly different from zero, as determined by the size of the estimate relative to its estimated asymptotic standard error and by the p-value for each estimate, an upper bound of a Type I error assessed by the t-statistic.

Table 6.11. Model analysis for parametric estimate of generalized additive model for the shoulder incident model for the I-880 freeway

No.   Variables         Parameter Estimate   Standard Error   t-statistic   p-value
1     Intercept         -6.3622              0.3810           -16.69        <.0001
2     UDEVOCC ≤ 0%      65.4034              0.2785           234.83        <.0001
3     UDEVOCC > 0%      0.5080               0.0072           70.92         <.0001
4     UOCC              1.1241               0.0484           23.24         <.0001
5     UOCC²             -0.0579              0.0027           -21.31        <.0001
6     UOCC³             0.0009               0.0001           18.18         <.0001
7     USPD              -0.0179              0.0023           -7.78         <.0001
8     DOCC              0.2605               0.0347           7.50          <.0001
9     DOCC²             -0.0159              0.0018           -8.72         <.0001
10    DOCC³             0.0002               0.0001           7.74          <.0001
11    DSPD              -0.0302              0.0030           -10.00        <.0001
12    OCCDF²            0.0019               0.0003           7.38          <.0001


6.3 Model Interpretation

Typically, a rather simplistic view of incident detection is taken, which suggests that as an incident occurs, the occupancy increases and the speed and flow rate decrease upstream of an incident. Simultaneously, occupancy and flow rate decrease and speed increases downstream of an incident. Generalized additive models allow flexible functions to be fitted, and therefore their functional forms are revealed in the parametric estimation of generalized additive models. This capability of GAM serves as a powerful interpretive tool to examine the effect of each variable on the probability of an incident. This section attempts to examine these interpretations to understand the traffic pattern that emerges due to an incident and its subsequent impact on incident detection for lane-blocking incidents. The structures of the models for lane-blocking incidents on the I-880 freeway, shoulder incidents on the I-880 freeway, and all incidents on the I-25 and the I-880 freeways are also compared.

6.3.1 Model Interpretation - Lane-blocking Incident Detection Model

The incident detection model discussed is presented in Eq. (6.4). The model's functional form is suggested by the partial prediction plots presented in Figure 6.6, as discussed earlier. The functional form of the UDEVOCC variable in Eq. (6.4) suggests that the likelihood of an incident increases as the deviation of upstream occupancy from historical occupancy increases; the rate of increase is higher for increases of up to five percent than for increases greater than five percent. If the upstream occupancy is less than historical occupancy, it may increase the likelihood of a non-incident condition. For all other variables (USPD, UOCC, DSPD, and DOCC), the effect on the response variable is mostly non-linear.


This can be further illustrated by comparing the odds ratio of an independent variable for a lane-blocking incident and a shoulder incident. The odds ratio $\psi(x_j)$ is defined as,

(6.6)

where $\Delta x_j$ represents a unit change in the variable $x_j$, all other variables remaining the same. The odds ratio for a parametric estimate of a generalized additive model (Eq. (5.4)) may be estimated as,

(6.7)

The odds ratio may be used to express how the odds in favor of an incident may increase or decrease due to a unit increase in a traffic measure.

Figure 6.8 shows that for a unit increase in UDEVOCC greater than 0 percent (or upstream occupancy > historical upstream occupancy), USPD less than 30 mph, and UOCC less than 16 percent, the odds ratio is greater than one, indicating increased odds in favor of an incident compared to a non-incident condition. The increase in odds in favor of an incident is highest for a unit increase in UDEVOCC, or upstream occupancy deviation from historical occupancy, relative to all other variables, as also shown by the partial prediction plot in Figure 6.6. Figure 6.8 also shows that for upstream occupancy deviations up to 5 percent higher than historical, the increase in odds in favor of an incident relative to non-incident is highest, and the odds continue to increase at a lower rate in favor of an incident for higher deviations.
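The relationship between the odds ratio and the fitted functions can be sketched as follows. This is a standard derivation for an additive logit model, not a transcription of Eq. (6.6) or Eq. (6.7); the symbols $\alpha$, $f_j$, $\beta_1$, and $\beta_2$ are notation assumed here for illustration. With $\eta = \alpha + \sum_j f_j(x_j)$ and all other variables held fixed, the terms not involving $x_j$ cancel:

```latex
\psi(x_j)
  = \frac{p(x_j + \Delta x_j)\,/\,\bigl(1 - p(x_j + \Delta x_j)\bigr)}
         {p(x_j)\,/\,\bigl(1 - p(x_j)\bigr)}
  = \exp\bigl( f_j(x_j + \Delta x_j) - f_j(x_j) \bigr)

% Linear term f_j(x) = \beta_1 x:
\psi(x_j) = \exp(\beta_1 \,\Delta x_j)

% Quadratic term f_j(x) = \beta_1 x + \beta_2 x^2:
\psi(x_j) = \exp\bigl( \beta_1 \,\Delta x_j
            + \beta_2\,(2 x_j \,\Delta x_j + \Delta x_j^{2}) \bigr)
```

Under this sketch, a linear term gives an odds ratio that is constant over the range of the variable, while a quadratic term gives an odds ratio that depends on the current value $x_j$, which is consistent with the contrast drawn in the text between variables with linear and polynomial functional forms.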


Figure 6.8. Odds ratio, $\psi(x_j)$, for a unit increase of independent variables (series plotted: UDEVOCC, USPD, DSPD, UOCC, DOCC)

The quadratic form of the USPD variable in the incident detection model suggests that its influence on detecting incidents is not the same over the range of its observations. More specifically, as shown in Figure 6.9, for USPD greater than 30 mph (uncongested conditions), a unit decrease in USPD increases the odds in favor of an incident (odds ratio > 1), as expected. However, for USPD less than 30 mph (congested conditions), a unit decrease in USPD decreases the odds in favor of an incident (odds ratio < 1), since the decrease of USPD may be due to recurring congestion.
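The 30 mph turning point described above can be checked numerically from the USPD coefficients of the lane-blocking model (the branch of Eq. (6.4) for USPD of 55 mph or less). The sketch below is illustrative only: the function names are hypothetical, and it assumes the odds ratio for a unit decrease is the exponential of the change in the USPD term with all other variables held fixed.

```python
import math

# USPD and USPD^2 estimates for the USPD <= 55 mph branch of Eq. (6.4).
B1 = 0.3587
B2 = -0.0061


def uspd_term(uspd: float) -> float:
    """Contribution of upstream speed to the linear predictor."""
    return B1 * uspd + B2 * uspd ** 2


def odds_ratio_unit_decrease(uspd: float) -> float:
    """Odds ratio when USPD drops by 1 mph (valid while uspd <= 55)."""
    return math.exp(uspd_term(uspd - 1.0) - uspd_term(uspd))


def crossover_speed() -> float:
    """Speed at which the odds ratio for a unit decrease equals one.

    Solves uspd_term(u - 1) == uspd_term(u), i.e. -B1 + B2 * (1 - 2u) = 0.
    """
    return (B2 - B1) / (2.0 * B2)


if __name__ == "__main__":
    print(crossover_speed())
    for speed in (15.0, 30.0, 45.0):
        print(speed, odds_ratio_unit_decrease(speed))
```

With these coefficients the crossover lies just under 30 mph, so the odds ratio for a unit decrease is above one at higher speeds and below one at lower speeds, in line with the interpretation given in the text.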


[Figure 6.9; the caption and the beginning of the following sentence are not legible in the source.]

... > 0) under both uncongested and congested conditions. UDEVOCC also helps reduce false alarms (as indicated by Figure 6.10 and Figure 6.6 showing partial prediction < 0) when upstream occupancy is less than historical occupancy, especially under uncongested conditions. Overall, as expected, the probability of an incident increases as UDEVOCC increases.


Figure 6.10. Relationship between upstream speed (USPD) and upstream occupancy deviation from historical upstream occupancy (UDEVOCC) under (a) non-incident conditions and (b) incident conditions (horizontal axis: upstream speed (USPD), mph)


6.3.2 Comparison of Model Structures and their Implications

As discussed earlier, for the lane-blocking incident model and the shoulder incident model for the I-880 freeway, the significant independent variables are the same, except that the spatial occupancy difference (OCCDF) is also significant for the shoulder incident model. The parameter estimates and the functional forms of the parametric estimate of the GAM for the lane-blocking incident model and the shoulder incident model are different, as presented in Eq. (6.4) and Eq. (6.5), respectively. The functional forms are suggested by the partial prediction plots presented in Figure 6.6 and Figure 6.7. The parametric form is piecewise linear for UDEVOCC, and quadratic for UOCC and DOCC, for the lane-blocking incident model and the shoulder incident model. For USPD and DSPD, the parametric forms are quadratic for the lane-blocking incident model and linear for the shoulder incident model. This shows that traffic characteristics, as they relate to incident detection, are different for lane-blocking and shoulder incidents, especially for the upstream speed and downstream speed.

Figure 6.11 shows the odds ratio for a unit decrease of upstream speed (USPD) for the lane-blocking incident model and the shoulder incident model. For lane-blocking incidents, the quadratic form of the USPD variable in the incident detection model suggests that its influence on detecting incidents is not the same over the range of its observations. More specifically, as shown in Figure 6.11, for USPD greater than 30 mph (uncongested conditions), a unit decrease in USPD increases the odds in favor of an incident (odds ratio > 1), as expected. However, for USPD less than 30 mph (congested conditions), a unit decrease in USPD decreases the odds in favor of an incident (odds ratio < 1), since the decrease of USPD may be due to recurring congestion. For shoulder incidents, the odds in favor of an incident increase (odds ratio > 1) as the USPD decreases.
However, the influence of upstream speed on detecting shoulder incidents remains the same over the range of speeds observed. This


shows that the effects of lane-blocking incidents and shoulder incidents on upstream speed are significantly different.

Figure 6.11. Odds ratio $\psi(x)$ for upstream speed for the lane-blocking incident model and the shoulder incident model (horizontal axis: upstream speed (USPD), mph)

Typically, it is assumed that during an incident, the speed is higher downstream compared to upstream of the incident. The probability of an incident is expected to increase as the downstream speed increases. However, Figure 6.3 shows that the average downstream speed (DSPD) is lower under incident conditions for the available spacing of detectors on the I-880 freeway. This may be because vehicles require a certain acceleration distance to achieve the desired speed after passing an incident. Figure 6.12 shows speed at detector stations #16, #3, and #1, for an incident between detector stations #16 and #3, in the southbound direction of the I-880 freeway. During the


incident, the speed does not increase at the first downstream station (detector station #3) but increases at a station further downstream (detector station #1). This illustrates that the functional forms and the parameter estimates of an incident detection model may be determined by the spacing of detectors or the length of freeway segments. For the I-880 freeway, for an average detector spacing of one-third mile, the probability of an incident decreases as the downstream speed increases.

Figure 6.12. Speed at different locations during incident conditions (speed against time at detector stations #16, #3, and #1)


Figure 6.13 shows that for a unit increase of downstream speed, the odds in favor of an incident decrease (odds ratio < 1) for the lane-blocking incident model and the shoulder incident model. However, its influence on the odds in favor of an incident decreases as the DSPD increases for the lane-blocking incident model. For shoulder incidents, the effect of downstream speed on the odds in favor of an incident remains the same over the range of its observations.

[Figure 6.13; the caption is not legible in the source.]

lane-blocking and shoulder incidents are expected to yield different incident detection models.

For different locations, the significant independent variables for an incident detection model may depend on the characteristics of the incidents, the geometry of the freeway, and driver behavior. For the I-25 freeway, upstream variables are significant; for the I-880 freeway, both upstream and downstream variables are significant (Table 6.4). For the same significant independent variables, their influence on the response (incident detection) is different across locations, as illustrated by the models presented in Eq. (6.2) and Eq. (6.3). This is also clearly illustrated in Figure 6.14, which shows the odds ratio for upstream speed (USPD) for the I-25 and I-880 freeways.

On the I-880 freeway, approximately 93 percent of the incidents are shoulder incidents. Therefore, the odds ratio of upstream speed in Figure 6.14 shows characteristics similar to the shoulder incident model shown in Figure 6.11. The odds in favor of an incident increase (odds ratio > 1) as the USPD decreases. The influence of upstream speed (USPD) on the odds in favor of an incident remains the same over the range of its observations.

On the I-25 freeway, approximately 62.5 percent of the incidents are lane-blocking. The odds in favor of an incident increase as the USPD decreases (odds ratio > 1). The influence of USPD on the odds in favor of an incident decreases as the USPD increases. This shows that, given the difference in the proportion of each type of incident on the two freeways, the effect of the USPD on the response (or incident detection) is also different.
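The contrast between the two locations can be reproduced from the USPD terms of Eq. (6.2) and Eq. (6.3). The sketch below is an illustration under the same assumption as before, namely that the odds ratio for a unit decrease in USPD is the exponential of the change in the USPD term with other variables held fixed; the function names are hypothetical.

```python
import math
from typing import Callable


def odds_ratio_unit_decrease(term: Callable[[float], float], x: float) -> float:
    """Odds ratio when the variable drops by one unit, others held fixed."""
    return math.exp(term(x - 1.0) - term(x))


def uspd_term_i25(uspd: float) -> float:
    """USPD contribution in Eq. (6.2): quadratic in upstream speed."""
    return -0.0739 * uspd + 0.0006 * uspd ** 2


def uspd_term_i880(uspd: float) -> float:
    """USPD contribution in Eq. (6.3): linear in upstream speed."""
    return -0.0248 * uspd


if __name__ == "__main__":
    for speed in (10.0, 30.0, 60.0):
        print(speed,
              odds_ratio_unit_decrease(uspd_term_i25, speed),
              odds_ratio_unit_decrease(uspd_term_i880, speed))
```

For the I-880 model the odds ratio is the same at every speed because the term is linear, while for the I-25 model it stays above one and shrinks toward one as speed increases, which matches the description of Figure 6.14.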

Figure 6.14. Odds ratio ψ(x) for upstream speed for the I-25 and the I-880 freeways, all incidents. [Plot: x-axis upstream speed (USPD); series: I-25 all incidents, I-880 all incidents.]

The significant independent variables and the structures of the incident detection models differ by type of incident and by location. This may be due to the differences in the characteristics of incidents presented in Chapter 4.

7. Performance of Fixed Sensor, Mobile Sensor, and Fixed and Mobile Sensor Based Incident Detection Models

An incident detection model is implemented by a traffic management center to detect freeway incidents or any non-recurring congestion that disrupts the flow of traffic. The disruption may be caused by a lane-blocking incident (e.g., an accident or breakdown in the mainline) or a shoulder incident (e.g., a vehicle with a flat tire or a breakdown on the shoulder). An examination of the impact of incidents shows that, for some incidents, the average delay due to shoulder incidents may be higher than that due to lane-blocking incidents, as presented in Chapter 4. Therefore, a traffic management center may need to detect both lane-blocking and shoulder incidents so that it can respond and clear the incidents as quickly as possible. The impact of an incident could be reduced considerably by reducing the incident detection and response time. Therefore, the performance of an incident detection algorithm on both lane-blocking and shoulder incidents is most relevant to a traffic management center in meeting its goal of minimizing travel delay to motorists. Incident detection algorithms developed and tested on both lane-blocking and shoulder incidents should provide better estimates of their field performance in detecting incidents that cause significant delay.

The generalized additive model is developed based on incident data collected for two freeway sections in Colorado and California, and its performance is reported. Separate incident models are developed to examine the characteristic differences between lane-blocking and shoulder incidents as they relate to incident detection. In addition, the performance of the generalized additive models is compared to the multilayer feedforward (MLF) neural network and other neural network based models previously reported in the literature.

This chapter presents the performance of the fixed sensor based incident detection models developed to detect both lane-blocking and shoulder incidents on two freeway sections: Interstate 25 in Colorado and Interstate 880 in California. The performance of the fixed sensor based incident detection model is also examined by segment length and segment type. Mobile sensors may be used as an alternative source of traffic measures where fixed sensors are not available and as an additional source of traffic measures to improve the performance of an incident detection model. Next, the mobile sensor based and the combined fixed and mobile sensor based incident detection models are presented. Mobile sensors are a good source of the spatial variation of traffic measures.

7.1 Fixed Sensor Based Incident Detection Model

Generalized additive models for incident detection are developed for two freeway sections, Interstate 25 in Colorado and Interstate 880 in California. The models presented in Chapter 6 clearly illustrate that the variables identified as significant to detect incidents are considerably different for the two freeways. For the I-25 freeway in Colorado, only upstream traffic variables are significant, but for the I-880 freeway, both upstream and downstream variables are significant. The parametric functions estimated for the models for the two locations are also significantly different. The models are developed to detect both types of incidents based on a training data set, and their performance is evaluated on a test data set. This section presents the performance of the generalized additive models based on measures such as detection rate, false alarm rate, and mean time to detect. To compare the performance of the GAM to other incident detection models, a neural network based model was also developed. The comparisons are presented in the following sections.

7.1.1 Performance of Fixed Sensor Based Incident Detection Model for the I-25 Freeway

A generalized additive model including three significant independent variables, upstream occupancy (UOCC), upstream speed (USPD), and deviation of upstream occupancy from historical occupancy (UDEVOCC), is developed to detect all incidents (lane-blocking and shoulder incidents) on the I-25 freeway. Performance of the model is evaluated on the test set. For this model, the detection rate (DR) is 100 percent, the false alarm rate (FAR) is 0 percent, and the mean time to detect is 3.47 minutes at a one-interval persistence test (Figure 7.1). At a FAR of 0 percent, the DR ranges from 100 percent to 52.63 percent and the mean time to detect ranges from 2.28 minutes to 24.5 minutes for one to four-interval persistence tests. Without any persistence test, the DR is 100 percent with a FAR of 0.035 percent.

For the parametric estimate of the generalized additive model (GAM-Parametric), the DR is 100 percent and the FAR is 0.005 percent without any persistence test. With a one-interval persistence test, the parametric model's DR is lowered to 94.74 percent; however, the FAR is 0 percent, as in the nonparametric model.

The generalized additive model converges in approximately 1.5 minutes for the training data set with 1000 hours of 30-second data, or 120,000 vectors, on a computer with a 1.8 GHz Intel Pentium 4 processor. A parametric approximation of GAM is easier to implement and requires less time to recalibrate. The parametric functions are suggested by examining the partial prediction of each predictor or independent variable.
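The trade-off between false alarm rate and time to detect under a persistence test can be sketched as follows. The definitions are not restated in this section, so the sketch assumes common ones: an n-interval persistence test raises an alarm only after n + 1 consecutive incident signals; DR is the percentage of incidents with at least one alarm; FAR is the percentage of non-incident intervals with an alarm; and time to detect runs from the start of the incident to the end of the first alarmed interval. The function names and the example signal are hypothetical.

```python
def apply_persistence(flags, n):
    """Raise an alarm only after n + 1 consecutive incident signals (assumed rule)."""
    alarms, run = [], 0
    for flag in flags:
        run = run + 1 if flag else 0
        alarms.append(run > n)
    return alarms


def evaluate(alarms, incidents, interval_s=30):
    """Return (DR %, FAR %, mean TTD in minutes).

    incidents: list of (start, end) interval indices, end exclusive.
    """
    in_incident = [False] * len(alarms)
    delays = []
    for start, end in incidents:
        for i in range(start, end):
            in_incident[i] = True
        first = next((i for i in range(start, end) if alarms[i]), None)
        if first is not None:
            # Incident start to the end of the first alarmed interval.
            delays.append((first - start + 1) * interval_s / 60.0)
    false_alarms = sum(a and not inc for a, inc in zip(alarms, in_incident))
    dr = 100.0 * len(delays) / len(incidents)
    far = 100.0 * false_alarms / in_incident.count(False)
    ttd = sum(delays) / len(delays) if delays else None
    return dr, far, ttd


# Hypothetical model output: one spurious signal, then a four-interval incident.
flags = [False, False, True, False, False, False,
         True, True, True, True, False, False]
incidents = [(6, 10)]
for n in (0, 1):
    print(n, evaluate(apply_persistence(flags, n), incidents))
```

In this toy signal, the one-interval test removes the isolated false alarm at the cost of one extra 30-second interval before detection, which mirrors the pattern of lower FAR and longer time to detect reported above.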

Figure 7.1. Performance of the generalized additive model and the multilayer feedforward neural network for the I-25 freeway on the test set. [Plot: x-axis FAR (%); series: GAM, GAM-PARAM EST., MLF.]

To compare the performance of the generalized additive model to another well-known incident detection model, a multilayer feedforward (MLF) neural network model is also developed on the same data set using the 16 variables suggested by Cheu (Cheu and Ritchie 1995). The DR ranges from 84.21 to 47.37 percent and the FAR ranges from 0.10 to 0.02 percent for zero to four-interval persistence tests (Figure 7.1). The mean time to detect ranges from 6.50 to 10.65 minutes for zero to four-interval persistence tests.

For the I-25 freeway, the GAM performs significantly better than the MLF neural network model based on the detection rate, false alarm rate, and mean time to detect.

The performance of the parametric estimate of the generalized additive model compares well with the nonparametric GAM.

7.1.2 Performance of Fixed Sensor Based Incident Detection Model for the I-880 Freeway

A generalized additive model including six significant variables, namely upstream and downstream occupancy and speed, deviation of upstream occupancy from historical occupancy, and difference between upstream and downstream occupancy, is developed to detect all incidents (lane-blocking and shoulder incidents) on the I-880 freeway section. The performance is evaluated on the test set. As shown in Table 7.1, without any persistence test, the DR is 95.59 percent, the FAR is zero percent, and the mean TTD is 3.40 minutes for the generalized additive model. The detection rate (DR) ranges from 95.59 to 94.71 percent, and the mean time to detect ranges from 3.40 to 5.51 minutes, from no persistence test to a four-interval persistence test. The false alarm rate is zero percent for all tests. The model detects all lane-blocking incidents and 95.28 to 94.34 percent of the shoulder incidents for zero to four-interval persistence tests. The ISDR ranges from 81.37 to 77.05 percent for zero to four-interval persistence tests.

For the parametric estimate of the generalized additive model (GAM-Parametric), the DR ranges from 97.35 to 96.47 percent and the FAR from 0.01 to 0 percent for zero to four-interval persistence tests (Figure 7.2). The ISDR ranges from 86.57 to 82.19 percent (Figure 7.3). The TTD of the parametric model is lower than that of the nonparametric GAM; it ranges from 2.28 to 4.53 minutes. The parametric approximation of GAM provides higher DR, higher ISDR, and lower TTD, with a slightly higher FAR, than the nonparametric GAM.

Table 7.1. Performance of the generalized additive model on the test set including lane-blocking and shoulder incidents

Persistence  DR (%)                         FAR (%)      ISDR (%)                         TTD (minutes)
test         LB and SHLD  (LB)   (SHLD)     LB and SHLD  LB and SHLD  (LB)     (SHLD)     LB and SHLD  (LB)    (SHLD)
0            95.59        (100)  (95.28)    0            81.37        (84.52)  (81.21)    3.40         (3.70)  (3.38)
1            95.29        (100)  (94.97)    0            80.21        (83.22)  (80.06)    3.89         (4.18)  (3.87)
2            95.29        (100)  (94.97)    0            79.12        (81.97)  (78.97)    4.42         (4.66)  (4.41)
3            94.71        (100)  (94.34)    0            78.07        (80.71)  (77.93)    4.87         (5.14)  (4.85)
4            94.71        (100)  (94.34)    0            77.05        (79.46)  (76.92)    5.51         (5.61)  (5.50)

Note: LB = Lane-blocking incident, SHLD = Shoulder incident

Figure 7.2. Performance (DR) of ID algorithms for all incidents on the I-880 freeway. [Plot: x-axis FAR (%); series: GAM, GAM-PARAM EST., MLF.]

Figure 7.3. Performance (ISDR) of ID algorithms for all incidents on the I-880 freeway. [Plot: x-axis FAR (%); series: GAM, GAM-PARAM EST., MLF.]

To compare the generalized additive model to another incident detection model, a multilayer feedforward (MLF) neural network model with the 16 independent variables suggested by Cheu (Cheu and Ritchie 1995) is developed for this data set. The DR ranges from 87.10 to 86.78 percent and the FAR from 0.15 to 0.09 percent for zero to four-interval persistence tests (Figure 7.2). The ISDR ranges from 79.64 to 77.43 percent and the TTD from 2.30 to 3.07 minutes for zero to four-interval persistence tests. The performance of the generalized additive model is significantly better than that of the MLF neural network model in terms of detection rate, incident state detection rate, and false alarm rate.

For both the I-25 and the I-880 freeways, the nonparametric and parametric GAM perform comparably to each other and significantly better than the MLF neural network model.

7.1.3 Performance of Fixed Sensor Based Incident Detection Model for Lane-blocking Incidents

The performance of the generalized additive model and its parametric estimate for lane-blocking incidents is compared to several neural network based models developed on the same data set. The performance measures reported, including detection rate (DR), incident state detection rate (ISDR), false alarm rate (FAR), and mean time to detect (TTD), are based on the test set.

The detection rate (DR) of the nonparametric lane-blocking incident detection model is 100 percent and the false alarm rate (FAR) is 0 percent for zero to four-interval persistence tests. The incident state detection rate (ISDR) ranges from 75.2 to 68.4 percent and the mean time to detect (TTD) ranges from 5.02 to 7.20 minutes for zero to four-interval persistence tests (Table 7.2).

Table 7.2. Performance of incident detection models

Model           Persistence  DR (%)  ISDR (%)  FAR (%)  TTD (minutes)
                test
GAM             0            100     75.22     0        5.02
                1            100     73.38     0        5.66
                2            100     71.70     0        6.14
                3            100     70.01     0        6.73
                4            100     68.42     0        7.20
GAM-Parametric  0            100     85.68     0.013    3.18
                1            100     84.47     0.007    3.59
                2            100     83.22     0.003    4.14
                3            100     82.11     0.000    4.57
                4            100     81.00     0.000    5.00
MLFNN           0            95.45   69.24     0.287    5.17
                1            90.91   67.94     0.253    5.08
                2            90.91   66.54     0.223    5.48
                3            90.91   65.14     0.199    5.88
                4            90.91   63.84     0.179    6.28
PNN             0            100     -         0.5      0.25
                1            98      -         0        1.32
                2            98      -         0        1.87
                3            98      -         0        2.37
CPNN            0            95.65   -         0.33     3.84
                1            95.65   -         0.30     4.24
                2            95.65   -         0.27     4.75
                3            95.65   -         0.23     6.21

Note: - = not reported

The DR for the parametric estimate of GAM is 100 percent for zero to four-interval persistence tests, with a higher FAR (0.013 to 0.003 percent for zero to two-interval persistence tests) than the nonparametric GAM (Figure 7.4). The ISDR is also higher, with a higher FAR, than for the nonparametric GAM (Figure 7.5). The mean TTD ranges from 3.18 to 5.00 minutes, lower than for the nonparametric GAM (Table 7.2).

It may be noted that, for the same mean time to detect, the performance of the nonparametric and parametric generalized additive models is comparable. More specifically, at a mean time to detect of about five minutes, a 100 percent detection rate and 0 percent FAR may be achieved for both the nonparametric and parametric GAM. The parametric model may provide an easier alternative for on-line implementation.

Figure 7.4. Detection rate for lane-blocking incidents on the I-880 freeway (Test set). [Plot: x-axis FAR (%); legend: Nonparametric GAM, Parametric GAM, Feedforward Neural Network, (Abdulhai, 1999), CPNN (Jin, 2002).]
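The comparison at a common operating point can be checked directly against the GAM rows of Table 7.2. The sketch below is a minimal illustration: it copies those rows and, for each model, picks the row with the lowest mean time to detect among those with 100 percent DR and zero FAR. The function name is hypothetical, and a FAR printed as 0.000 in the table is treated as exactly zero, which is an assumption.

```python
# Rows copied from Table 7.2: (persistence test, DR %, FAR %, mean TTD in minutes).
TABLE_7_2 = {
    "GAM": [(0, 100, 0.0, 5.02), (1, 100, 0.0, 5.66), (2, 100, 0.0, 6.14),
            (3, 100, 0.0, 6.73), (4, 100, 0.0, 7.20)],
    "GAM-Parametric": [(0, 100, 0.013, 3.18), (1, 100, 0.007, 3.59),
                       (2, 100, 0.003, 4.14), (3, 100, 0.000, 4.57),
                       (4, 100, 0.000, 5.00)],
}


def fastest_clean_point(rows):
    """Lowest-TTD row with 100 percent DR and zero FAR, or None if there is none."""
    clean = [row for row in rows if row[1] == 100 and row[2] == 0.0]
    return min(clean, key=lambda row: row[3]) if clean else None


for model, rows in TABLE_7_2.items():
    print(model, fastest_clean_point(rows))
```

With these rows, the nonparametric GAM reaches that point at 5.02 minutes without a persistence test, and the parametric estimate at 4.57 minutes with a three-interval persistence test, which is consistent with the roughly five-minute figure quoted above.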

Figure 7.5. Incident state detection rate for incidents on the I-880 freeway (Test set). [Plot: x-axis FAR (%); legend: Nonparametric GAM, Parametric GAM, Multilayer Feedforward Neural Network.]

To compare the performance of the generalized additive models to existing models, three models were selected: a multilayer feedforward (MLF) neural network model, a probabilistic neural network (PNN) model, and a constructive probabilistic neural network (CPNN) model. The MLF neural network model with 16 variables, as suggested by Cheu (Cheu and Ritchie 1995), is developed on the training set and evaluated on the test set. Overall, the generalized additive model performs significantly better than the MLF neural network model based on all performance measures (Figure 7.4 and Figure 7.5) and takes less time to train. For the MLF neural network, the DR ranges from 95.45 to 90.91 percent and the false alarm rate from 0.287 to 0.179 percent, from no persistence test to a four-interval persistence test. The ISDR ranges from 69.24 to 63.84 percent and the mean TTD ranges from 5.16 to 6.28 minutes for zero to four-interval persistence tests.

The performance of a probabilistic neural network (PNN) model with principal component transformation (Abdulhai and Ritchie 1999) of the same input variables as the Cheu (Cheu and Ritchie 1995) study, on the same data set, is also reported in Table 7.2. The detection rate of the PNN model is 100 percent with a false alarm rate of 0.5 percent without any persistence test. The DR remains at 98 percent with 0 percent FAR for one to three-interval persistence tests (Figure 7.4). A more recent model, a constructive probabilistic neural network (CPNN) model (Jin et al. 2002) with 24 variables, developed for the same lane-blocking incidents on the I-880 freeway, reports a DR of 95.65 percent and a false alarm rate ranging from 0.33 to 0.23 percent for zero to three-interval persistence tests (Figure 7.4).

Based on all measures shown in Figure 7.4 and Figure 7.5, the nonparametric GAM and the parametric estimate of GAM, with only five independent variables, outperform the multilayer feedforward (MLF) neural network, the probabilistic neural network (PNN), and the constructive probabilistic neural network (CPNN) models at a significantly lower false alarm rate. In addition, the incident state detection rate for both generalized additive models is higher than that of the MLF neural network model, indicating that the duration of the incident is also better predicted by the GAM. Other models, such as the probabilistic neural network (Abdulhai 1996) and the constructive probabilistic neural network (Jin et al. 2002), do not report the incident state detection rate.
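The link between incident state detection rate and incident duration can be illustrated with a short sketch. It assumes that ISDR is the percentage of intervals inside recorded incidents that the model flags as incident state; this definition is not restated in this section, and the function name and example data are hypothetical.

```python
def incident_state_detection_rate(alarms, incidents):
    """Percentage of incident-state intervals that carry an alarm.

    alarms: one boolean per interval.
    incidents: list of (start, end) interval indices, end exclusive.
    """
    total = flagged = 0
    for start, end in incidents:
        total += end - start
        flagged += sum(1 for i in range(start, end) if alarms[i])
    return 100.0 * flagged / total


# Hypothetical case: a four-interval incident whose first interval is missed.
alarms = [False, False, True, True, True, False]
print(incident_state_detection_rate(alarms, [(1, 5)]))
```

Under this assumed definition, every interval missed at the start of an incident lowers the ISDR, which would explain why the ISDR in the tables falls as the persistence test lengthens even when the DR is unchanged.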

7.1.4 Performance of Fixed Sensor Based Incident Detection Model for Shoulder Incidents

For the I-880 freeway, more than 90 percent of the incidents are shoulder incidents. A nonparametric GAM and a parametric estimate of GAM are developed based on only the shoulder incidents to compare their performance with the lane-blocking incident model and the all-incidents model (lane-blocking and shoulder incidents). Both the GAM and the parametric estimate of GAM provide a FAR of zero percent for zero to four-interval persistence tests. As shown in Figure 7.6, for the nonparametric GAM the DR remains the same for up to three-interval persistence tests (98.74 percent) and is slightly lower at a four-interval persistence test (98.43 percent). The incident state detection rate ranges from 92.18 to 87.78 percent (Figure 7.7) and the mean time to detect ranges from 1.18 to 3.34 minutes for zero to four-interval persistence tests. The parametric model provides lower DR and ISDR with 0 percent FAR for zero to four-interval persistence tests: the DR ranges from 97.48 to 96.23 percent, the ISDR from 86.60 to 82.11 percent, and the TTD from 2.29 to 4.45 minutes.

The generalized additive models for shoulder incidents have a lower detection rate and a higher incident state detection rate than the lane-blocking incident detection model. Given the characteristic differences between lane-blocking incidents and shoulder incidents, a model developed explicitly for either incident type performs, as expected, better than a model developed for both incident types. However, because an ID algorithm is implemented by a traffic management center to detect freeway incidents, and both shoulder and lane-blocking incidents cause significant impact, it should be developed to detect both types of incidents.

Figure 7.6. Detection rate at different interval persistence test for the I-880 shoulder test set. [Plot: x-axis persistence test (0 to 4); series: GAM, GAM-PARAM EST.]

Figure 7.7. Incident state detection rate at different interval persistence test for the I-880 shoulder test set. [Plot: x-axis persistence test (0 to 4); series: GAM, GAM-PARAM EST.]

7.2 Performance of Fixed Sensor Based Incident Detection Model by Segment Length and Segment Type

The most reliable source of traffic measures for freeway incident detection is the loop detector. Detectors are installed at fixed locations and report at regular intervals. For incident detection, detectors are normally installed at a spacing of one half or one third of a mile. In this study, a segment is defined by two consecutive detector locations. A segment shorter than 1/3 mile is defined as a short segment and a segment longer than 1/3 mile is defined as a long segment. A short segment captures the spatial variation of traffic measures better than a long segment. The effect of segment length on incident detection is presented next.

7.2.1 The I-880 Freeway

The I-880 freeway study section consists of 33 segments. Segment lengths range from 0.189 mile to 0.625 mile. Twenty-four segments are shorter than 1/3 mile and are defined as short segments. Nine segments are longer than 1/3 mile and are defined as long segments.

The DR for short segments is slightly higher than for long segments for all persistence tests (Figure 7.8). Without any persistence test, the DR is 95.73 percent for short segments and 95.24 percent for long segments. For one and two-interval persistence tests, the DR for short and long segments is almost the same. For three and four-interval persistence tests, the DR for short segments is about 1.61 percent higher than for long segments. The false alarm rate is zero percent for all persistence tests. The mean time to detect ranges from 7.05 to 11.42 minutes and from 6.57 to 10.56 minutes for zero to four-interval persistence tests for short and long segments, respectively (Figure 7.9).
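The grouping by segment length can be sketched as below. The 1/3 mile threshold and the two extreme segment lengths come from the text; the function names, the treatment of a segment of exactly 1/3 mile, and the incident records are assumptions made only for illustration.

```python
SHORT_LIMIT_MILES = 1.0 / 3.0  # threshold stated in Section 7.2


def classify_segment(length_miles):
    """Label a segment 'short' or 'long'. A length of exactly 1/3 mile is not
    covered by the text and is treated as 'long' here (assumption)."""
    return "short" if length_miles < SHORT_LIMIT_MILES else "long"


def detection_rate_by_class(records):
    """records: iterable of (segment_length_miles, detected) pairs."""
    counts = {}
    for length, detected in records:
        label = classify_segment(length)
        hits, total = counts.get(label, (0, 0))
        counts[label] = (hits + int(detected), total + 1)
    return {label: 100.0 * hits / total for label, (hits, total) in counts.items()}


# Hypothetical incident records; only 0.189 and 0.625 mile appear in the text.
records = [(0.189, True), (0.25, True), (0.30, False), (0.40, True), (0.625, True)]
print(detection_rate_by_class(records))
```

The same grouping function could be reused with a ramp-type label in place of the length class.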

An incident on a short segment is slightly easier for an incident detection model to detect, since traffic measures are collected by detectors located close to the incident and the spatial variation of traffic flow is captured better than on a long segment. For a long segment, the spacing between detectors is large; therefore, the spatial variation of traffic flow may not be represented as well as on a short segment. An incident that causes only a minor impact on traffic may therefore not be detected by an incident detection model. It may be mentioned that the range of segment lengths on the I-880 freeway is quite small.

Segments on a freeway may also be grouped based on the presence of ramps: segments without a ramp, segments with an on-ramp, and segments with an off-ramp. For short segments, there are 12 segments without a ramp, seven segments with on-ramps, and five segments with off-ramps. For long segments, there are three segments with on-ramps and six segments with off-ramps; there are no long segments without ramps. An on-ramp causes merging of traffic, and vehicles on the main freeway also try to change lanes away from the on-ramp. An off-ramp causes diverging of traffic as vehicles change lanes toward the off-ramp.

For short segments, segments without any ramp perform best since there is no effect of merging or diverging vehicles (Figure 7.10). Segments with an on-ramp perform worst for all persistence tests. The detection rate for short segments without any ramp is about one percent better than for short segments with an off-ramp and about 6.5 percent better than for short segments with an on-ramp for one to four-interval persistence tests. The mean time to detect ranges from 5.04 to 9.23, 14.54 to 19.63, and 4.04 to 8.08 minutes for zero to four-interval persistence tests for short segments without a ramp, with an on-ramp, and with an off-ramp, respectively (Figure 7.11). A pooled t-test is performed to compare the mean time to detect at zero persistence test. The mean time to detect for short segments with an on-ramp is significantly higher than for short segments without a ramp (p-value = 0.0009) and with an off-ramp (p-value = 0.0022) at the 5 percent level of significance. The mean time to detect for short segments without a ramp is not significantly higher than for short segments with an off-ramp (p-value = 0.226) at the 5 percent level of significance.

Figure 7.10. Detection rate by segment type for short segments. [Plot: x-axis persistence test (0 to 4); series: short segment without ramp, short segment with on-ramp, short segment with off-ramp.]

Figure 7.11. Mean time to detect by segment type for short segments. [Plot: x-axis persistence test (0 to 4); series: short segment without ramp, short segment with on-ramp, short segment with off-ramp.]
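The pooled t-test used for these comparisons can be sketched as follows. This is a minimal standard-library illustration of the equal-variance two-sample statistic, not the software used in the study, and the detection times in the example are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with a pooled variance estimate.

    Returns (t, degrees_of_freedom). Assumes both groups share one variance.
    """
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical times to detect (minutes) for two segment types.
with_on_ramp = [12.0, 15.0, 18.0, 13.0]
without_ramp = [4.0, 6.0, 5.0, 5.0]
t_stat, df = pooled_t(with_on_ramp, without_ramp)
print(round(t_stat, 3), df)
```

The p-value then follows from the t distribution with the returned degrees of freedom; a statistics library such as SciPy provides the same test through `scipy.stats.ttest_ind` with `equal_var=True`.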

For long segments, segments with an on-ramp provide a higher DR than segments with an off-ramp (Figure 7.12). The detection rate for segments with an on-ramp is about three percent better than for segments with an off-ramp for zero to two-interval persistence tests, and about 5.2 percent better for three and four-interval persistence tests. The mean time to detect ranges from 3.94 to 7.75 minutes and from 7.78 to 11.94 minutes for zero to four-interval persistence tests for long segments with on-ramps and with off-ramps, respectively (Figure 7.13). A pooled t-test is performed to compare the mean time to detect at zero persistence test. The mean time to detect for segments with an off-ramp is not significantly higher than for segments with an on-ramp (p-value = 0.054) at the 5 percent level of significance.

Figure 7.12. Detection rate by segment type for long segments. [Plot: x-axis persistence test (0 to 4); series: long segment with on-ramp, long segment with off-ramp.]

Figure 7.13. Mean time to detect by segment type for long segments. [Plot: x-axis persistence test (0 to 4); series: long segment with on-ramp, long segment with off-ramp.]

7.3 Mobile Sensor Based Incident Detection Model

Incident detection algorithms developed since 1970 rely on fixed sensor data. In the last few decades, state transportation agencies monitoring freeways have generally deployed infrastructure based fixed sensors every third or half of a mile. Fixed sensors such as loop detectors record the temporal and spatial variation of traffic measures at discrete time and space intervals. In recent years, mobile sensors have also been deployed as an alternative or additional source of information for a variety of applications, including transit fleet management. Data from mobile sensors may also be used as an alternative and/or additional source of traffic flow measures for incident detection. Traffic measures from mobile sensors represent the spatial variation of traffic conditions at a specific time interval.

This research examines the use of mobile sensors, namely buses equipped with automatic vehicle location (AVL) deployed for fleet management. Two incident detection algorithms are developed and examined, relying on only mobile sensor data or on both fixed and mobile sensor data.

Mobile sensor or probe data was collected for the I-25 freeway and is described in detail in Chapter 3. The average time headway between buses in the northbound direction of the I-25 freeway is 13.5 minutes during the morning peak hours and 30 minutes during the afternoon peak hours. This corresponds to a penetration rate of about 0.25 percent during the morning peak hours and about 0.13 percent during the afternoon peak hours. The AVL system reports bus locations every two minutes. The average segment length on the I-25 freeway is one mile. On average, for each segment, ten probe reports per hour during the morning peak hours and five probe reports per hour during the afternoon peak hours are available.

A total of 38 incidents on the I-25 freeway were recorded during the study period. However, probe vehicle data are available for only 28 of these incidents. Half of the 28 incidents are randomly selected and used for training, and the remaining half are used for testing. The performance measures reported for the mobile sensor based incident detection model are detection rate and false alarm rate. Mean time to detect is not reported since it depends on when the probe data is available.

7.3.1 Performance of Mobile Sensor Based Incident Detection Model

The mobile sensor based incident detection model with two significant independent variables, as described earlier, is developed. Without any persistence test, when a probe vehicle is available, the model detects 10 of the 14 incidents (71.43 percent) with a false alarm rate of 0.19 percent (Figure 7.14). At zero percent false alarm rate, the detection rate ranges from 35.71 to 0 percent for one to four-interval persistence tests. Due to the low probe penetration rate and the long report interval of probe vehicles, none of the incidents had five probe reports available during the incident.
The results show that 10 incidents out of a total of 19 incidents, or 52.63 percent, are detected under the given probe penetration rate and probe report interval on the I-25 freeway.

Figure 7.14. Performance of mobile sensor based generalized additive model. [Plot: x-axis FAR (%).]

Incident detection models detect an incident or non-incident state for each interval. The incident state does not express the severity of an incident. One measure of the impact of an incident on traffic is average delay. An incident detection model may not be able to detect an incident that causes low impact to traffic. The average delay may be calculated for every time interval. The examination of the model performance by severity of an incident in terms of average delay is presented next.

The average or cumulative average delay of each time interval for the entire duration of each incident may be calculated as in Chapter 4. In this section, the cumulative average delay of each time interval up to when the first probe vehicle is available is estimated. Figure 7.15 shows the cumulative average delay when the first probe vehicle was available for the 14 incidents in the test set. The median average delay is about one minute per vehicle. Therefore, seven incidents impose an average delay greater than one minute per vehicle and the remaining seven impose an average delay of less than one minute per vehicle.

Figure 7.15. Cumulative average delay of incidents when the first probe vehicle is available. [Plot: x-axis incident (1 to 14).]

For incidents with an average delay greater than one minute per vehicle when the first probe vehicle is available, the model detects all seven incidents without any persistence test (100 percent DR); for incidents with an average delay of less than one minute per vehicle, it detects three of the seven incidents (42.86 percent DR), at a false alarm rate of 0.19 percent (Figure 7.16). A zero false alarm rate is achieved at a one-interval persistence test, with detection rates of 71.43 percent and 14.29 percent for incidents imposing an average delay higher and lower than one minute per vehicle when the first probe vehicle is available, respectively. The results show that the severity of an incident at the time the probe data is available affects the performance of an incident detection model. The mobile sensor based incident detection model detects all incidents that impose an average delay higher than one minute per vehicle when the first probe vehicle is available. Therefore, for the given probe penetration rate and report interval for the I-25 freeway, if the average delay is greater than one minute per vehicle, the mobile sensor based incident detection model would detect 100 percent of incidents with a false alarm rate of 0.19 percent, regardless of when the probe vehicle is available during the incident.

Figure 7.16. Performance of mobile sensor based incident detection model by average delay when the first probe vehicle is available. [Plot: x-axis FAR (%); series: average delay greater than 1 minute, average delay less than 1 minute.]
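The split of the test incidents at the median delay, and the detection rates quoted for each half without a persistence test, can be reproduced with a short sketch. The counts (7 of 7, 3 of 7, 10 of 14, and 10 of 19) come from the text; the individual delay values and the function names are hypothetical and chosen only so that the median is one minute per vehicle.

```python
from statistics import median


def detection_rate(detected_flags):
    """Percentage of incidents detected."""
    return 100.0 * sum(detected_flags) / len(detected_flags)


def split_by_median_delay(records):
    """records: list of (average_delay_min_per_vehicle, detected) pairs.

    Returns (median delay, DR above the median, DR at or below the median).
    """
    cut = median(delay for delay, _ in records)
    high = [det for delay, det in records if delay > cut]
    low = [det for delay, det in records if delay <= cut]
    return cut, detection_rate(high), detection_rate(low)


# Hypothetical delays; detection outcomes follow the counts reported in the text.
records = (
    [(d, True) for d in (1.2, 1.5, 2.0, 3.0, 4.5, 6.0, 9.0)]
    + [(d, True) for d in (0.5, 0.6, 0.8)]
    + [(d, False) for d in (0.1, 0.2, 0.3, 0.4)]
)
cut, dr_high, dr_low = split_by_median_delay(records)
print(cut, dr_high, round(dr_low, 2))
print(round(detection_rate([det for _, det in records]), 2))  # all 14 test incidents
print(round(100.0 * 10 / 19, 2))                              # 10 of 19 incidents
```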

PAGE 144

7.4 Fixed and Mobile Sensor Based Incident Detection Model

Mobile sensors may be used as an additional source of traffic measures to improve the performance of the incident detection model if the number of probe reports per time interval is high enough. The number of probe reports depends on the probe penetration rate and the probe report interval. The fixed and mobile sensor based incident detection model is developed for the I-25 freeway. The performance of the model is compared with that of the fixed sensor based incident detection model to examine the benefit of using mobile sensor data as an additional data source. Note that both the fixed sensor based model and the fixed and mobile sensor based model are developed only on incidents for which probe vehicle data are also available. This makes it possible to examine the performance of a fixed and mobile sensor based incident detection algorithm given a high penetration rate and/or a short report interval of probe vehicles. There are 28 incidents for which probe vehicle data are available. Half of the incidents are randomly selected and used for training, and the remaining half are used for testing.

7.4.1 Performance of Fixed and Mobile Sensor Based Incident Detection Model

The use of average probe speed in a combined fixed and mobile sensor based model for incident detection reduces the false alarm rate from 0.05 percent to zero percent, while the detection rate decreases from 92.86 percent to 85.71 percent (Figure 7.17). An incident detection model with a zero percent false alarm rate and a slightly lower detection rate may be preferred by a traffic management center (TMC) over a model with a higher false alarm rate and a slightly higher detection rate, since the total number of false alarms per unit time is a function of the application interval of an incident detection model

PAGE 145

and the number of segments that a TMC monitors. Alarms generated by an incident detection model with a high false alarm rate are usually ignored by a TMC.

Figure 7.17. Performance of fixed and mobile sensor based incident detection model (horizontal axis: FAR, %; series: fixed sensors, fixed sensor and mobile sensors)

For the combined fixed and mobile sensor model, the false alarm rate is 0 percent and the detection rate ranges from 85.71 percent to 14.29 percent for zero to four-interval persistence tests. For the fixed sensor based incident detection model, the detection rate and false alarm rate range from 92.86 percent to 7.14 percent and from 0.05 percent to 0 percent for zero to four-interval persistence tests, respectively. For this model, the zero percent false alarm rate can be achieved at a two-interval persistence test. At 0 percent false alarm rate, the detection rate is 85.71 percent for the fixed and mobile sensor based incident detection model and 21.43 percent for the fixed sensor based incident detection model. The result shows that the data from mobile sensors may be used as an additional source of traffic measures to reduce the false alarm rate of the

PAGE 146

incident detection model. As a result, the incident detection model with additional probe vehicle data provides a higher detection rate at 0 percent false alarm rate.

The fixed sensor based incident detection model outperforms the multilayer feedforward neural network, the probabilistic neural network, and the constructive probabilistic neural network models. The DR and FAR may not be significantly different for short and long segments. However, for short segments with on-ramps, the mean time to detect is significantly higher than for short segments without ramps or with off-ramps. For long segments with off-ramps, the mean time to detect is significantly higher than for long segments with on-ramps. The mobile sensor based incident detection model detects 100 percent of incidents with average delay greater than one minute per vehicle. This shows the potential use of mobile sensors as an alternative data source. The use of data from mobile sensors as an additional data source to fixed sensors has been shown to reduce the false alarm rate.
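The results above are reported for zero to four-interval persistence tests. The mechanics of such a test can be sketched as follows, under the assumption (not stated in this section) that an n-interval persistence test raises an alarm only after the model has flagged an incident in the current interval and in each of the n preceding intervals. The function name and the sample flags are hypothetical:

```python
from typing import List


def apply_persistence(flags: List[bool], n: int) -> List[bool]:
    """Turn per-interval incident flags into alarms under an n-interval test.

    Assumption: an alarm is raised at an interval only if the model flagged
    an incident in that interval and in the n intervals immediately before it.
    With n = 0 every flag becomes an alarm.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    alarms = []
    run = 0  # length of the current run of consecutive flagged intervals
    for flag in flags:
        run = run + 1 if flag else 0
        alarms.append(run >= n + 1)
    return alarms


# Hypothetical model output for seven consecutive intervals.
flags = [False, True, False, True, True, True, False]
print(apply_persistence(flags, 0))
print(apply_persistence(flags, 1))
print(apply_persistence(flags, 2))
```

Under this assumption an isolated flag (the second interval here) is suppressed as soon as n is at least 1, which is how a persistence test lowers the false alarm rate, while the sustained run is still detected but later, which is consistent with the longer mean time to detect reported at higher persistence levels.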

PAGE 147

8. An Unbiased Validation of Incident Detection Algorithm Performance Using Bootstrap Method

Most incident detection models are developed on a training set and their performance is tested on a test set. Therefore, the reported performance of the model relies on only a single data split. Performance measures of the model include detection rate (DR), incident state detection rate (ISDR), false alarm rate (FAR), and mean time to detect (TTD). In this study, the ".632 bootstrap" method is used to obtain better estimates of model performance. Among methods such as data splitting, cross validation, and the standard bootstrap, the ".632 bootstrap" method performs best (Efron and Tibshirani 1993). For bootstrap estimates of confidence intervals, Efron (Efron and Tibshirani 1993) suggests that 1000 bootstrap samples are quite adequate and that 250 bootstrap samples provide reasonable results. In this study, the ".632 bootstrap" method with 500 bootstrap samples is performed. A generalized additive model with six significant independent variables, namely upstream occupancy (UOCC), upstream speed (USPD), upstream occupancy deviation from historical occupancy (UDEVOCC), downstream occupancy (DOCC), downstream speed (DSPD), and spatial occupancy difference (OCCDF), is developed for each bootstrap sample based on all incidents (lane-blocking and shoulder incidents) in the I-880 data. All six significant variables are selected according to the independent variable selection procedure described in Chapter 6. The bootstrap performance is evaluated to obtain an unbiased estimate of model performance.
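For a prediction error, the ".632 bootstrap" estimate is usually written as 0.368 times the apparent error (model fitted and evaluated on the full data) plus 0.632 times the out-of-bag error (each observation scored only by models whose bootstrap sample did not contain it). The sketch below illustrates that weighting on a toy problem. The threshold classifier, the data, and all function names are assumptions made for illustration; the study fits generalized additive models and reports DR, ISDR, FAR, and TTD, and how those measures enter the estimate is not reproduced here.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, 0/1 label): a toy data layout


def fit_threshold(train: Sequence[Sample]) -> float:
    """Toy stand-in for model fitting: midpoint between the two class means."""
    ones = [x for x, y in train if y == 1]
    zeros = [x for x, y in train if y == 0]
    if not ones or not zeros:  # degenerate resample containing a single class
        return sum(x for x, _ in train) / len(train)
    return (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2.0


def misclassified(threshold: float, obs: Sample) -> float:
    """0/1 loss of the rule 'predict 1 when x exceeds the threshold'."""
    x, y = obs
    return float((1 if x > threshold else 0) != y)


def bootstrap_632(data: Sequence[Sample],
                  fit: Callable[[Sequence[Sample]], float],
                  loss: Callable[[float, Sample], float],
                  n_boot: int = 500,
                  seed: int = 0) -> float:
    """Return 0.368 * apparent error + 0.632 * out-of-bag error."""
    rng = random.Random(seed)
    n = len(data)
    full_model = fit(data)
    apparent = sum(loss(full_model, obs) for obs in data) / n

    oob_sum: List[float] = [0.0] * n
    oob_cnt: List[int] = [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        in_bag = set(idx)
        model = fit([data[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # score only observations left out of the sample
                oob_sum[i] += loss(model, data[i])
                oob_cnt[i] += 1

    per_obs = [s / c for s, c in zip(oob_sum, oob_cnt) if c > 0]
    if not per_obs:
        raise ValueError("no out-of-bag observations; increase n_boot")
    oob = sum(per_obs) / len(per_obs)
    return 0.368 * apparent + 0.632 * oob


# Hypothetical overlapping two-class data.
toy = [(float(x), 0) for x in range(10)] + [(float(x), 1) for x in range(6, 16)]
estimate = bootstrap_632(toy, fit_threshold, misclassified)
print(round(estimate, 3))
```

The weights reflect that a bootstrap sample of size n contains, on average, about 63.2 percent of the distinct observations, so the out-of-bag error alone tends to be pessimistic and the apparent error alone optimistic.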

PAGE 148

8.1 Bootstrap Performance of Generalized Additive Model for Freeway Incident Detection

A generalized additive model is a nonparametric model that allows high flexibility of the fitted functions. Each independent variable is fitted by a spline, a smoothing function. The flexibility of the fitted function of each independent variable may vary from model to model. With the bootstrap method, three performance groups of generalized additive models are observed. Figure 8.1 shows the bootstrap performance of the models without any persistence test. Group 1 has low detection rate (DR), low incident state detection rate (ISDR), and low false alarm rate (FAR). Group 2 has high DR, high ISDR, and low FAR, and Group 3 has high DR, high ISDR, and high FAR. For these data, DR less than 98 percent is defined as low DR and DR higher than 98 percent is defined as high DR. ISDR less than or greater than 86 percent is defined as low or high ISDR. FAR less than 0.05 percent is defined as low FAR, and FAR greater than 0.05 percent as high FAR. In Figure 8.1, the asterisk represents the performance of the ID algorithm developed with the data-splitting technique, trained on half of the data set and tested on the remainder. The performance falls in Group 1 for this particular model. This illustrates that the bootstrap performance accounts for the performance of the model in all three groups and therefore provides a better estimate of the model's performance in the field. Table 8.1 shows the minimum and maximum values of the 500 bootstrap performances of GAM. Without the bootstrap method, using the data-splitting technique alone, the reported DR could be as low as 93.90 percent or as high as 99.69 percent at zero-interval persistence test. Similarly, the reported ISDR could be as low as 80.46 percent or as high as 92.06 percent. The reported FAR could be as low as 0.0375 percent or as high as 0.4137 percent, a significant difference.
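The three performance groups are defined by the thresholds given above (98 percent for DR, 86 percent for ISDR, and 0.05 percent for FAR). A minimal sketch of that classification rule follows; the function name is illustrative, the example inputs are combinations of the reported minimum and maximum values rather than actual bootstrap results, and values exactly at a threshold are treated as "low" here, which the text leaves unspecified:

```python
from typing import Optional


def performance_group(dr: float, isdr: float, far: float) -> Optional[int]:
    """Assign a bootstrap performance (all values in percent) to Group 1, 2, or 3.

    Returns None for combinations outside the three groups described in the text.
    Boundary handling (values exactly equal to a threshold) is an assumption.
    """
    high_dr = dr > 98.0
    high_isdr = isdr > 86.0
    high_far = far > 0.05
    if not high_dr and not high_isdr and not high_far:
        return 1  # low DR, low ISDR, low FAR
    if high_dr and high_isdr and not high_far:
        return 2  # high DR, high ISDR, low FAR
    if high_dr and high_isdr and high_far:
        return 3  # high DR, high ISDR, high FAR
    return None


# Illustrative inputs built from the reported minima and maxima.
print(performance_group(93.90, 80.46, 0.0375))
print(performance_group(99.69, 92.06, 0.0375))
print(performance_group(99.69, 92.06, 0.4137))
```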
In field implementation, the performance of GAM for incident detection would fall in one of these three groups. Group 1 contains about 60 percent of the 500 bootstrap performances, and about 30 percent and 10 percent fall in Group 2 and

PAGE 149

3, respectively (Figure 8.2). The models within Group 2 provide the best performance (high DR, high ISDR, and low FAR). The generalized additive model yields three groups of performance because of differences in the flexibility of the fitted functions and in the bootstrap samples in each group. The flexibility of the fitted functions, in terms of degrees of freedom, is presented next.
PAGE 150

Table 8.1. Minimum and maximum values of bootstrap performance of GAM

Persistence   DR (%)                  ISDR (%)          FAR (%)
Test          Min            Max      Min      Max      Min       Max
0             93.90          99.69    80.46    92.06    0.03754   0.41373
1             [illegible].16 99.63    79.33    90.84    0.02979   0.35540
2             93.05          99.53    78.29    89.77    0.02562   0.31014
3             92.48          99.48    77.30    88.77    0.02145   0.27751
4             92.07          99.32    76.34    87.80    0.01907   0.24668

Figure 8.2. Frequency plot of bootstrap performance

PAGE 151

A generalized additive model is fitted to each of the 500 bootstrap samples and provides three groups of bootstrap performances as presented earlier. The degree of freedom for the fitted models is about one for USPD and DSPD, and about [value illegible in source] for UOCC, DOCC, and OCCDF. For UDEVOCC, the degree of freedom ranges from 0.1 to 44 (Figure 8.3). This high variation in the degree of freedom is caused by the characteristics of traffic flow that are sampled and included in the bootstrap samples.

Figure 8.3. Frequency plot of degree of freedom for UDEVOCC (horizontal axis: degree of freedom of UDEVOCC)

The degree of freedom of UDEVOCC can be roughly classified into two groups. The group with less flexible fitted functions, or low degree of freedom, ranges from 0.1 to 10. The group with more flexible fitted functions, or high degree of freedom, ranges from about 10 to 38. There are a few models whose degree of freedom of UDEVOCC lies outside this range. Figure 8.4 shows that the models with a low degree of freedom of UDEVOCC provide low DR and low FAR, corresponding to Group 1 in Figure 8.1. The models with a high degree of freedom of UDEVOCC provide higher DR and either low or high FAR.

PAGE 152

These performances correspond to Groups 2 and 3 in Figure 8.1. Models with higher flexibility of the fitted function for UDEVOCC provide higher DR and ISDR; however, they may provide high FAR as well. The effect of the flexibility of the functions on mean time to detect (TTD) is presented next.

Figure 8.4. Degree of freedom of UDEVOCC with DR and ISDR

Bootstrap mean time to detect (TTD) at zero-interval persistence test ranges from 1.468 to 3.783 minutes. The range of TTD is divided into two equal intervals: less than or equal to 2.6254 minutes, and higher than 2.6254 minutes. Comparing Figure 8.1 to Figure 8.5, most of the models in Group 1 have higher TTD (> 2.6254 minutes), while models in Groups 2 and 3 have lower TTD (<= 2.6254 minutes). It is evident that lower flexibility of the functions provides lower DR, lower ISDR, and lower FAR as well. In summary, the three groups of generalized additive model performance are:

PAGE 153

1. Group 1: low DR, low ISDR, low FAR, high TTD, and low degree of freedom of UDEVOCC.
2. Group 2: high DR, high ISDR, low FAR, low TTD, and high degree of freedom of UDEVOCC.
3. Group 3: high DR, high ISDR, high FAR, low TTD, and high degree of freedom of UDEVOCC.

Figure 8.5. Bootstrap performance by mean time to detect (panels: TTD <= 2.6254 minutes; TTD > 2.6254 minutes)

With the bootstrap method, all possible performances of the generalized additive model are observed. The flexibility of the fitted functions of the nonparametric model, GAM, leads to three groups of performance due to certain characteristics of traffic flow. If a generalized additive model is developed and used for incident detection, its performance could fall in any one of these three groups. The bootstrap method also makes it possible to select a model whose fitted functions have the flexibility of a desired group, and to develop a parametric estimate from it. The parametric estimate of

PAGE 154

GAM, which has a fixed degree of freedom for each independent variable, is presented next.

8.2 Bootstrap Performance of Parametric Estimate of Generalized Additive Model

The model presented in the previous sections is a generalized additive model for incident detection. Each variable is fitted by a cubic spline smoother. To simplify the model, a parametric form of the model is proposed by examining the partial prediction, or effect, of each independent variable. The parametric model provides simple functional forms and is easy to implement. One advantage of a parametric model over a nonparametric model is that it requires less time to train. The training time for the parametric model may be less than one time interval (e.g. 30 seconds). Once an incident detection algorithm has been developed and implemented for a specific location, real-time retraining of the parametric model can be performed to adjust the parameter estimates to traffic variation. This section presents the bootstrap performance of the parametric estimate of generalized additive models. The generalized additive model for Group 1 (Figure 8.1) is used as a guide to develop the parametric functional forms. The partial prediction plots and the functional form of the parametric model are presented in Chapter 6. Models from Group 1 have a low degree of freedom for UDEVOCC and provide consistently low FAR. Models with a high degree of freedom for UDEVOCC provide performance in either Group 2 or Group 3, which may mean a higher FAR. In freeway incident detection, a traffic management center monitors hundreds of freeway sections and an incident detection algorithm is applied at short time intervals (e.g. 30 seconds). Unless the FAR of the algorithm is extremely low, incident alarms are usually ignored. Therefore, a model with performance in Group 1 is desired because of its low FAR.
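The reason an apparently small FAR still matters can be seen from a short calculation. The sketch below assumes that FAR is the percentage of model applications (one per segment per interval) that produce a false alarm, and uses a hypothetical center with 100 monitored segments and a 30-second application interval; neither the function name nor the segment count comes from the study:

```python
def false_alarms_per_day(far_percent: float,
                         interval_seconds: float,
                         n_segments: int) -> float:
    """Expected false alarms per day across all monitored segments.

    Assumption: FAR is the share (in percent) of model applications that
    raise a false alarm, with one application per segment per interval.
    """
    if interval_seconds <= 0 or n_segments < 0 or far_percent < 0:
        raise ValueError("inputs must be non-negative, interval positive")
    applications_per_day = (86400.0 / interval_seconds) * n_segments
    return far_percent / 100.0 * applications_per_day


# Hypothetical center: 100 segments, model applied every 30 seconds.
for far in (0.05, 0.4137):
    print(far, round(false_alarms_per_day(far, 30, 100), 1))
```

Under these assumed numbers, a FAR of 0.05 percent already corresponds to roughly 144 false alarms per day, and the highest bootstrap FAR of 0.4137 percent to well over a thousand, which illustrates why a Group 1 model is preferred.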

PAGE 155

A parametric model with polynomial functions for UOCC, DOCC, and OCCDF, a piecewise linear function for UDEVOCC, and linear functions for USPD and DSPD is developed for each bootstrap sample. The bootstrap performance of the parametric estimate of GAM is evaluated with 500 bootstrap samples. The performance of these models may be grouped in one cluster (Figure 8.6), whereas the performance of the previous bootstrap models formed three groups due to the flexibility of their functions. Figure 8.7 shows the DR at zero-interval persistence test for the 500 bootstrap samples of the parametric models. DR is normally distributed with a mean of 96.06 percent. Figure 8.8 shows the ISDR at zero-interval persistence test for the 500 bootstrap samples. ISDR is normally distributed with a mean of 82.57 percent. For most of the 500 bootstrap samples (91.4 percent), the FAR is 0 percent.

Figure 8.6. Performance of parametric estimate of GAM at zero interval persistence test

PAGE 156

Figure 8.7. Scatter plot of DR and FAR with histograms at zero interval persistence test (horizontal axis: DR, %)

Figure 8.8. Scatter plot of ISDR and FAR with histograms at zero interval persistence test (horizontal axis: ISDR, %)

PAGE 157

The confidence interval is defined based on bootstrap percentiles. Let $\theta$ indicate a bootstrap performance, and let $100(1-2\alpha)$ percent be the confidence level. The lower limit of the confidence interval, $\theta_{lo}$, is the $100\alpha$th percentile of the distribution of $\theta$. The upper limit of the confidence interval, $\theta_{up}$, is the $100(1-\alpha)$th percentile of the distribution of $\theta$. Table 8.2 shows the performance of the bootstrap parametric estimate of the generalized additive models at different persistence tests, with 95 percent confidence intervals. Mean DR, mean ISDR, and mean FAR range from 96.06 to 94.96 percent, 82.57 to 78.05 percent, and 0.00036 to 0 percent for zero to four-interval persistence tests, respectively. The mean TTD ranges from 3.12 to 5.28 minutes for zero to four-interval persistence tests. A FAR of 0 percent can be achieved at the four-interval persistence test, with a mean DR of 94.96 percent and 95 percent confidence interval of (93.27, 96.47), a mean ISDR of 78.05 percent and 95 percent confidence interval of (75.77, 80.10), and a mean TTD of 5.28 minutes and 95 percent confidence interval of (4.84, 5.76).

Table 8.2. Bootstrap performance of parametric estimate of GAM with 95% confidence interval (PT: persistence test)

PT   DR (%)                   ISDR (%)                 FAR (%)    TTD (Minute)
     Mean    95% CI           Mean    95% CI           Mean       Mean   95% CI
0    96.06   (94.81, 97.27)   82.57   (80.51, 84.48)   0.00036    3.12   (2.67, 3.60)
1    95.70   (94.26, 96.99)   81.32   (79.17, 83.24)   0.00011    3.66   (3.20, 4.16)
2    95.44   (93.94, 96.85)   80.18   (77.96, 82.14)   0.00006    4.19   (3.73, 4.67)
3    95.30   (93.78, 96.67)   79.09   (76.81, 81.10)   0.00002    4.74   (4.27, 5.22)
4    94.96   (93.27, 96.47)   78.05   (75.77, 80.10)   0          5.29   (4.84, 5.76)
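The percentile definition of the confidence interval given above can be sketched in a few lines. This is an illustration only: it uses the nearest-rank percentile convention, which is an assumption (the study does not say which interpolation rule was used), and the input values are synthetic rather than the bootstrap results behind Table 8.2.

```python
import math
from typing import Sequence, Tuple


def percentile_ci(values: Sequence[float], alpha: float = 0.025) -> Tuple[float, float]:
    """Bootstrap percentile interval at confidence level 100 * (1 - 2 * alpha) percent.

    The lower limit is the 100*alpha-th percentile and the upper limit is the
    100*(1 - alpha)-th percentile of the bootstrap values (nearest-rank rule).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    ordered = sorted(values)
    n = len(ordered)
    # The small offset guards against floating-point noise in alpha * n.
    lo_rank = max(1, math.ceil(alpha * n - 1e-9))
    hi_rank = min(n, math.ceil((1.0 - alpha) * n - 1e-9))
    return ordered[lo_rank - 1], ordered[hi_rank - 1]


# Synthetic stand-in for 1000 bootstrap performance values.
synthetic = list(range(1, 1001))
print(percentile_ci(synthetic, alpha=0.025))
```

With alpha = 0.025 the interval covers the central 95 percent of the bootstrap distribution, which matches the 95 percent intervals reported in Table 8.2.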

PAGE 158

8.3 Summary

The data-splitting technique is the one most commonly used for ID algorithm validation. With the data-splitting technique, the performance of the ID algorithm is based on only a single split of the data into a training set and a testing set. In this study, the performance of the ID algorithm is validated without bias by using the bootstrap method. With the bootstrap method, three groups of GAM bootstrap performance are observed. The difference between minimum and maximum values is about six percent for DR and 11 percent for ISDR. The most significant difference in performance among the bootstrap samples is for FAR, which ranges from 0.03 to 0.41 percent. The DR is as high as 99.69 percent and as low as 93.90 percent for the zero-interval persistence test. The nonparametric model may use very flexible functions. Therefore, the fitted models have different degrees of freedom, or flexibility, depending on the characteristics of traffic flow in the samples. This clearly shows that if the ID algorithms were developed and validated with a single data split, the performance of the algorithm could fall in any one of the three performance groups and the flexibility of the models would also vary significantly. If the performance of the algorithm developed based on data splitting falls in Group 2 (high DR, high ISDR, and low FAR), then the reported performance would be higher than its true performance. About 60 percent of all possible performances are in Group 1 (low DR, low ISDR, and low FAR). The bootstrap method also makes it possible to select a model structure in a desired performance group and to develop a parametric estimate of a generalized additive model from it. The parametric estimate of a generalized additive model is a parametric model in which each independent variable has a fixed degree of freedom. A parametric model provides the simplest functional form for implementation. Another advantage of the

PAGE 159

parametric model over a nonparametric model is that it takes less time to train. Real-time retraining may be performed to adjust the parameters to changes in traffic characteristics or conditions. A better estimate of the performance of an incident detection algorithm may be provided by using a bootstrap method.

PAGE 160

9. Summary, Conclusions, and Recommendations

In this research, generalized additive models are developed to detect freeway incidents. The generalized additive model is a further generalization of the generalized linear model, allowing appropriate functional forms of independent variables to be proposed. The generalized additive model combines an additive assumption that enables relatively many parametric relationships to be explored simultaneously with the distributional flexibility of the generalized linear model. For incident detection, the response variable follows a binomial distribution. The variables in the nonparametric additive models may be examined based on their partial prediction plots, and a parametric form for each independent variable may be proposed. This yields a parametric estimate of the generalized additive model. A parametric estimate of a generalized additive model provides functional forms that are easy to implement. The training time for a parametric incident detection model is less than one time interval (e.g. 30 seconds). Therefore, a real-time recalibration of the model may be performed to adjust the parameter estimates to any change in traffic characteristics. Incident detection models are developed for two major freeways, the I-25 freeway in Colorado and the I-880 freeway in California. The I-25 freeway data collected in Colorado include incident data, loop detector data, and probe vehicle data. The data from the I-880 freeway in California that are used to develop incident detection models include loop detector data and incident data. A comprehensive review of the literature shows that all incident detection models lack transferability. Acceptable performance of a model transferred to a new location is achieved only if the model is retrained or recalibrated. The study presented in Chapter 4 also shows a significant difference in the characteristics of incidents between the two locations. In this research,

PAGE 161

generalized additive models are developed specifically for each location. The significant independent variables for each location are identified, and the differences between the models and their performance are examined. An incident detection model is implemented by a traffic management center (TMC) to detect freeway incidents or any unexpected event that causes disruption to the flow of traffic. An incident may be either a lane-blocking incident or a shoulder incident. A traffic management center is interested in detecting both lane-blocking and shoulder incidents. If shoulder incidents are detected rapidly, a TMC may respond by sending a service unit out to clear the incidents and assist motorists, reducing the impact of the incidents. An incident detection model developed and tested on both lane-blocking and shoulder incidents should provide better estimates of its performance at a traffic management center. Due to characteristic differences between lane-blocking and shoulder incidents, a model developed solely on lane-blocking incidents may not perform well in detecting shoulder incidents. Therefore, to illustrate the characteristic differences between lane-blocking incidents and shoulder incidents as they relate to incident detection, separate incident detection models are also developed. The differences in characteristics between lane-blocking and shoulder incidents, based on the significant independent variables selected and their parametric estimates, are also examined. A comprehensive review of the literature shows that most incident detection models have been developed based on traffic measures estimated from fixed sensors. Fixed sensors are a good source of data on the temporal variation of traffic, while spatial traffic variation is available only at a specific spacing (e.g. half a mile). This research develops fixed sensor based incident detection models.
In addition, mobile sensor based, and fixed and mobile sensor based incident detection models are also developed. Data from mobile sensors represent spatial variation of traffic conditions at a specific time

PAGE 162

interval. Traffic measures from mobile sensors may be used as an alternative data source where fixed sensors are not available, or as an additional data source to improve the performance of incident detection models. A summary of the characteristics of incidents, the significant independent variables for generalized additive models, model performance, the bootstrap method used to validate the performance of the models without bias, conclusions, and recommendations of this study is presented next.

9.1 Characteristics of Incidents

The study of the characteristics of incidents for the two freeways shows that the characteristics at the two locations are significantly different. The incident rate of all incidents is 49 incidents per $10^7$ VMT for the I-25 freeway and 817.4 incidents per $10^7$ VMT for the I-880 freeway. For the I-25 freeway, the probabilities of lane-blocking incidents and shoulder incidents per vehicle mile traveled are not significantly different from 0.5 at the 5 percent level of significance (p-value = 0.08). For the I-880 freeway, the probabilities of lane-blocking and shoulder incidents per vehicle mile traveled are significantly different from 0.5 at the 5 percent level of significance (p-value < 0.001). The probability of lane-blocking incidents on the I-25 freeway is not significantly higher than that of shoulder incidents. However, for the I-880 freeway, the probability of shoulder incidents is significantly higher than that of lane-blocking incidents. The mean incident duration and the distribution of incident duration for lane-blocking and shoulder incidents are not significantly different for the I-25 freeway, but the distribution of incident duration for lane-blocking and shoulder incidents is significantly different for the I-880 freeway. The duration of incidents on the I-880

PAGE 163

freeway is 1.6 times higher than on the I-25 freeway. However, the delay imposed per vehicle is lower for motorists on the I-880 freeway. The weighted average delay of lane-blocking incidents for the I-25 freeway is more than 2 times higher than for the I-880 freeway. One of the reasons may be that the I-880 freeway has more lanes (5-lane) than the I-25 freeway (3-lane). Once an incident occurs, it causes a higher capacity reduction and a larger drop in traffic speed on the I-25 freeway. For a 3-lane freeway, an incident blocking one lane causes a 47 percent capacity reduction and an incident blocking two lanes causes a 78 percent capacity reduction. For a 5-lane freeway, an incident blocking one lane causes a 25 percent capacity reduction and an incident blocking two lanes causes a 50 percent capacity reduction (Blumentritt 1981). The average delay for lane-blocking incidents is about 1.9 times higher than for shoulder incidents on the I-25 freeway and about 1.5 times higher on the I-880 freeway. The results show that the characteristics of incidents in terms of incident rate, incident duration, and average delay are significantly different at the 5 percent level of significance for the two freeways and for different types of incidents. Therefore, low transferability of the models is expected, and a model developed solely on lane-blocking incidents may not perform well in detecting shoulder incidents.

9.2 Significant Independent Variables for Incident Detection Models

The models developed to detect incidents in Colorado and California illustrate that the distribution and characteristics of lane-blocking and shoulder incidents affect the structure of an incident detection model. The incident detection model developed for the I-25 freeway includes upstream traffic measures such as speed, occupancy, and the deviation of occupancy from historical occupancy. The model developed for lane-blocking incidents on the I-880 freeway includes both upstream and downstream traffic

PAGE 164

measures such as upstream and downstream speed, occupancy, and deviation of occupancy from historical occupancy. For the models developed for all incidents and for shoulder incidents on the I-880 freeway, the spatial difference in occupancy is also significant, in addition to the significant variables for lane-blocking incidents on the I-880 freeway. The functional forms and parameters of the models are also significantly different. Generalized additive models developed separately for lane-blocking incidents and for shoulder incidents on a freeway section in California highlight some important differences in the impact of incident type on incident detection. For example, the spatial difference in occupancy is significant for shoulder incidents but not for lane-blocking incidents. In addition, the functions of the independent variables and their parameter estimates in the generalized additive models are significantly different for the two types of incidents. Therefore, the model specification for an incident detection model may vary by location, mainly due to the characteristics of the incidents it is developed to detect.

9.3 Performance of Generalized Additive Model for Incident Detection

This section summarizes the performance of fixed sensor, mobile sensor, and fixed and mobile sensor based incident detection models.

9.3.1 Fixed Sensor Based Incident Detection Model

The generalized additive model developed to detect lane-blocking incidents performs significantly better than the multilayer feedforward (MLF) neural network, the probabilistic neural network (PNN), and the constructive probabilistic neural network (CPNN), as demonstrated by the performance of the models for the I-880 freeway

PAGE 165

section in California. For the generalized additive model tested on sections of the I-880 freeway, the detection rate is 100 percent, the false alarm rate is 0 percent, and the mean time to detect ranges from 5.02 to 7.2 minutes for zero to four-interval persistence tests. The performance of the parametric estimate of GAM is comparable with that of GAM. An incident detection model is implemented by a traffic management center to detect any unexpected event that causes disruption to traffic flow. The study of the characteristics of incidents also shows that shoulder incidents may cause a higher impact on traffic than lane-blocking incidents. Therefore, an incident detection model developed based on all incidents (lane-blocking and shoulder incidents) is more relevant. The generalized additive model developed for both lane-blocking and shoulder incidents for the I-880 freeway in California detects 95.59 percent of all incidents at zero percent false alarm rate. The model detects 100 percent of the lane-blocking incidents and [value illegible in source] percent of the shoulder incidents. The model developed for all incidents on the I-25 freeway in Colorado detects 100 percent of the incidents at zero false alarm rate, at a one-interval persistence test. The generalized additive models perform significantly better than the neural network based models. The generalized additive model developed for shoulder incidents has a lower detection rate and a higher incident state detection rate than the lane-blocking incident detection model. An incident detection model developed explicitly for either incident type performs better than a model developed for both incident types. However, as an incident detection model is implemented by a traffic management center to detect freeway incidents, and both shoulder and lane-blocking incidents cause significant impact on traffic, it should be developed to detect both types of incidents.

PAGE 166

It may be mentioned that the performance of the parametric estimate of GAM is comparable to that of GAM for all models developed.

9.3.2 Performance of Fixed Sensor Based Incident Detection Model by Segment Length and Segment Type

For all incidents on the I-880 freeway, the detection rate of the incident detection model for short segments (shorter than 1/3 mile) is higher than for long segments (longer than 1/3 mile) for all persistence tests. Segments on the freeway may also be grouped by segment type: segments without ramps, segments with on-ramps, and segments with off-ramps. An on-ramp causes merging of traffic, and vehicles on the main freeway also try to change lanes away from the on-ramp. An off-ramp causes diverging of traffic as vehicles change lanes toward the off-ramp. Short segments without ramps provide the highest detection rate compared to short segments with on-ramps and with off-ramps. Long segments with on-ramps provide a higher detection rate than long segments with off-ramps. For short segments with on-ramps, the mean time to detect is significantly higher than for short segments without ramps or with off-ramps.

9.3.3 Mobile Sensor Based Incident Detection Model

Mobile sensors may be used as an alternative source of traffic measures where fixed sensors are not available. For the I-25 freeway, without any persistence test, the model detects 10 incidents out of 14 incidents (71.43 percent) with a false alarm rate of 0.19 percent when probe vehicle data are available for all 14 incidents. For a given probe

PAGE 167

penetration rate and probe report interval on the I-25 freeway, the probe based incident detection model detects 10 incidents out of a total of 19 incidents (52.63 percent). Incident detection models rely on the traffic patterns that emerge as an incident occurs. An incident that has a low impact on traffic may not be detected. The impact of an incident on traffic may be measured as average delay per vehicle. For the I-25 freeway data, without any persistence test and when the first probe vehicle is available, the mobile sensor based incident detection model detects all seven incidents with an average delay greater than one minute per vehicle (100 percent DR) and three of the seven incidents with an average delay lower than one minute per vehicle (42.86 percent DR), with a false alarm rate of 0.19 percent.

9.3.4 Fixed and Mobile Sensor Based Incident Detection Model

Data from mobile sensors may also be used as an additional data source to improve the performance of the model. For the I-25 freeway data, when the combined fixed and mobile sensor based model, which uses average probe speed as an additional variable, is compared with the fixed sensor based model without any persistence test, the false alarm rate is reduced from 0.05 percent to zero percent while the detection rate decreases from 92.86 percent to 85.71 percent. At a 0 percent false alarm rate, the detection rate is 85.71 percent and 21.43 percent for the combined model and the fixed sensor based incident detection model, respectively. For the combined fixed and mobile sensor model, the false alarm rate is 0 percent and the detection rate ranges from 85.71 percent to 14.29 percent for zero- to four-interval persistence tests. For the fixed sensor based incident detection model, the detection rate and false alarm rate range from 92.86 percent to 7.14 percent and from 0.05 percent to 0 percent for zero- to four-interval persistence tests, respectively. The zero percent false alarm rate can be achieved at the two-interval persistence test.
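The persistence test referred to above can be sketched as follows. In the evaluation script of Appendix A.2, the k-interval alarm at an interval is raised only when the (k-1)-interval alarm is raised at both that interval and the preceding one, which is equivalent to requiring k + 1 consecutive positive intervals. The Python below is an illustrative restatement under that reading, not the code used in this research; the function name and the sample series are hypothetical.

```python
from typing import List


def apply_persistence(raw_alarms: List[int], k: int) -> List[int]:
    """Raise an alarm only after k + 1 consecutive positive intervals.

    k = 0 returns the raw alarms unchanged (no persistence test).
    """
    out, run_length = [], 0
    for alarm in raw_alarms:
        # Track the length of the current run of positive intervals.
        run_length = run_length + 1 if alarm == 1 else 0
        out.append(1 if run_length >= k + 1 else 0)
    return out


# Hypothetical raw alarms: one isolated positive, then runs of three and two.
raw = [0, 1, 0, 1, 1, 1, 0, 1, 1]
for k in range(3):
    print(k, apply_persistence(raw, k))
```

The isolated positive interval disappears at k = 1, which is how a persistence test lowers the false alarm rate, while the alarm for the longer run arrives one interval later for each added level of persistence, which is the cost in time to detect.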
The results show that the data from mobile sensors may be used as an

PAGE 168

additional traffic measure source to reduce the false alarm rate of the incident detection model.

9.4 Unbiased Validation of Model Performance

The bootstrap method provides a better estimate of model performance than the data-splitting method and the cross-validation method. Bootstrap validation with 500 bootstrap samples is performed for the incident detection model developed to detect all incidents on the I-880 freeway. Three groups of performance are observed for the GAM due to the flexibility of its functions. For the parametric estimate of GAM, mean DR, mean ISDR, and mean FAR range from 96.06 to 94.96 percent, 82.57 to 78.05 percent, and 0.00036 to 0 percent for zero- to four-interval persistence tests, respectively. Mean TTD ranges from 3.12 to 5.28 minutes for zero- to four-interval persistence tests. The FAR of 0 percent can be achieved at the four-interval persistence test, with a mean DR of 94.96 percent and 95 percent confidence interval of [...], a mean ISDR of 78.05 percent and 95 percent confidence interval of (75.77, 80.10), and a mean TTD of 5.28 minutes and 95 percent confidence interval of (4.84, 5.76).

9.5 Conclusions

The generalized additive model (GAM) performs significantly better than neural network based models for incident detection. The parametric estimate of GAM performs comparably with the generalized additive model. The parametric estimate of GAM may be retrained in real time to adjust its parameters to any change in traffic characteristics. It also provides functional forms that are easy to implement.
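The bootstrap means and 95 percent confidence intervals reported in Section 9.4 can be illustrated with a minimal sketch. The Python below assumes a simple percentile bootstrap over per-incident outcomes (for example, 0/1 detection flags or times to detect). This is an assumption made for illustration: the resampling unit and interval construction used in this research may differ, for instance by refitting the model on each bootstrap sample. The function name and the sample data are hypothetical.

```python
import random
from typing import List, Tuple


def bootstrap_mean_ci(values: List[float], n_boot: int = 500,
                      alpha: float = 0.05, seed: int = 0) -> Tuple[float, float, float]:
    """Return (bootstrap mean, lower bound, upper bound) of the sample mean.

    Uses resampling with replacement and a percentile interval.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    stats = []
    for _ in range(n_boot):
        sample = [values[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return sum(stats) / n_boot, lower, upper


# Hypothetical detection flags: 19 of 20 incidents detected.
detected = [1] * 19 + [0]
mean_dr, lo, hi = bootstrap_mean_ci(detected)
print(f"mean DR = {100 * mean_dr:.2f}%, 95% CI = ({100 * lo:.2f}, {100 * hi:.2f})")
```

The default of 500 resamples matches the number of bootstrap samples stated in Section 9.4; the percentile interval is only one of several ways to form a bootstrap confidence interval.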

PAGE 169

A study of the characteristics of incidents shows that they are significantly different by location and type. The distribution and characteristics of lane-blocking and shoulder incidents also affect the structure of an incident detection model. An incident detection model developed for one location may not perform well at another location due to differences in the characteristics of incidents. Therefore, an incident detection model that requires a short time to calibrate, such as GAM and the parametric estimate of GAM, developed specifically for each location is preferred.

A fixed sensor is a good source of temporal variation of traffic measures, while spatial variation is available only at discrete intervals. The spatial distribution of fixed sensors, or segment length, affects the performance of an incident detection model. For a fixed sensor based incident detection model, the DR and FAR for segments shorter than 1/3 mile and for longer segments may not be statistically different. However, for short segments with on-ramps, the mean time to detect may be significantly higher than for short segments without any ramp or with off-ramps. The mean time to detect for long segments with off-ramps may be significantly higher than for long segments with on-ramps.

It is also found that using only data from mobile sensors, an incident detection model may detect all incidents that cause an average delay greater than one minute per vehicle. Therefore, a mobile sensor based incident detection model may be implemented if the penetration rate of probe vehicles is high and/or the report interval is short. Use of mobile sensor data in addition to data from fixed sensors helps reduce the false alarm rate of an incident detection model.

For all incident detection algorithms, there always exists a trade-off among low false alarm rate, high detection rate, and low time to detect. A low false alarm rate can be achieved at the expense of detection rate and time to detect. A high detection rate and

PAGE 170

low time to detect can be achieved at the expense of false alarm rate. Therefore, the acceptable trade-off depends on the policy of a traffic management center. The marginal rate of substitution obtained from the performance envelope may help a traffic management center decide on the desired performance. The marginal rate of substitution, or the slope of the performance envelope, is the trade-off at a given detection rate.

Typically, the performance of an incident detection model is reported based on the data-splitting technique. The bootstrap method provides better estimates of model performance than the data-splitting technique and cross-validation. In addition, it also helps identify the appropriate degrees of freedom to achieve a specific level of performance or trade-off between DR and FAR. The bootstrap performance reported, including means and 95 percent confidence intervals, provides an estimate of model performance in the field.

9.6 Recommendations

The generalized additive model for incident detection provides significantly better performance than neural network based models. Although the off-line bootstrap tested in this research provides a good estimate of model performance in the field, on-line testing in a traffic management center is recommended. The time required to calibrate a parametric estimate of GAM is less than one time interval (e.g., 30 seconds). Off-line and on-line recalibration of the model to adjust its parameters to variation or [...] of traffic characteristics, in order to improve the performance of the model, could be further studied.
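The marginal rate of substitution mentioned in Section 9.5 is the slope of the performance envelope. A minimal sketch of how that slope could be approximated from a few operating points is given below. It assumes the envelope is available as (FAR, DR) pairs in percent and uses a finite difference between neighboring points. The function name and the operating points are hypothetical and are not results from this research.

```python
from typing import List, Tuple


def marginal_rates(points: List[Tuple[float, float]]):
    """Approximate the envelope slope dDR/dFAR between neighboring points.

    points: (FAR percent, DR percent) pairs on the performance envelope.
    Returns a list of ((FAR_low, FAR_high), slope) entries.
    """
    ordered = sorted(points)
    rates = []
    for (far0, dr0), (far1, dr1) in zip(ordered, ordered[1:]):
        if far1 == far0:
            continue  # slope undefined for a vertical step
        rates.append(((far0, far1), (dr1 - dr0) / (far1 - far0)))
    return rates


# Hypothetical envelope: DR gains shrink as more false alarms are tolerated.
envelope = [(0.0, 80.0), (0.25, 90.0), (0.5, 92.0)]
for (far_low, far_high), slope in marginal_rates(envelope):
    print(f"FAR {far_low}-{far_high}%: {slope} DR points per FAR point")
```

A steep slope means a small increase in false alarm rate buys a large gain in detection rate; a traffic management center could stop relaxing the threshold once the slope falls below the gain it considers worthwhile.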

PAGE 171

In this research, due to the low penetration rate of probe vehicles and the long probe report interval for the I-25 freeway, only incident data for which a probe vehicle is available are used to develop and test the model. Probe vehicle data, as an additional data source, improve the performance of the incident detection model by reducing the false alarm rate. In the field, if more non-infrastructure based sensors such as probe vehicles and video image processing become available, the fixed and mobile sensor based incident detection model may be further tested on larger data sets.

PAGE 172

Appendix A. Sample SAS Scripts

A.1 SAS Script to Fit Generalized Additive Model

Proc sort data=testlbshldnoninc Out=testlbshldnoninc;
  By Setno vectorid;
Run;
Proc sort data=trainlbshldnoninc Out=trainlbshldnoninc;
  By Setno vectorid;
Run;
Data testlbshldnoninc;
  Set testlbshldnoninc;
  no+1;
  id+1;
Run;
Data trainlbshldnoninc;
  Set trainlbshldnoninc;
  no+1;
  id+1;
Run;
/*++++++++++++++++++++ GAM ++++++++++++++++++++*/
/*+need to be sorted By Setno and vectorid first+*/
Proc GAM data=work.trainlbshldnoninc;
  Model incvector = Spline(udevocc,df=4) Spline(uspd,df=4) Spline(uocc,df=4)
                    Spline(dspd,df=4) Spline(docc,df=4) Spline(oocdf,df=4)
                    / dist=binomial epsScore=0.5 epsilon=0.01;
  Output Out=trainlbshldOut;
  Score data=testlbshldnoninc Out=testlbshldOut;
Run;

PAGE 173

/*++use below script to fit parametric Model++*/
/*++++++++++++++++++++ gee with log odd ratio+++*/
Data trainlbshldnonincparam;
  Set trainlbshldnoninc;
  If udevocc < 0 Then udevocc1 = udevocc; Else udevocc1 = 0;
  If udevocc >= 0 Then udevocc2 = udevocc; Else udevocc2 = 0;
Run;
Data testlbshldnonincparam;
  Set testlbshldnoninc;
  If udevocc < 0 Then udevocc1 = udevocc; Else udevocc1 = 0;
  If udevocc >= 0 Then udevocc2 = udevocc; Else udevocc2 = 0;
Run;
Proc GAM data=trainlbshldnonincparam;
  Model incvector = param(udevocc1 udevocc2 uspd uocc uocc*uocc uocc*uocc*uocc
                          dspd docc docc*docc docc*docc*docc oocdf*oocdf)
                    / dist=binomial;
  Output Out=i880trainOutparam;
  Score data=testlbshldnonincparam Out=i880testOutparam;
Run;
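The parametric script above splits udevocc at zero into two terms and adds squared and cubed occupancy terms, so that a linear model can mimic the shapes revealed by the fitted splines. The Python sketch below restates that term construction and the logistic transform that the evaluation script in A.2 applies to the predicted response, including its clipping at plus or minus 20. It is an illustration rather than the code used in this research; the function names and input values are hypothetical, and any coefficients passed to it would have to come from a fitted model.

```python
import math
from typing import List, Sequence, Tuple


def split_at_zero(x: float) -> Tuple[float, float]:
    """Return (negative part, non-negative part), as in udevocc1 and udevocc2."""
    return (x, 0.0) if x < 0 else (0.0, x)


def parametric_terms(udevocc: float, uspd: float, uocc: float,
                     dspd: float, docc: float, oocdf: float) -> List[float]:
    """Build the eleven terms listed in the param() statement, in order."""
    neg, pos = split_at_zero(udevocc)
    return [neg, pos, uspd,
            uocc, uocc ** 2, uocc ** 3,
            dspd,
            docc, docc ** 2, docc ** 3,
            oocdf ** 2]


def incident_probability(intercept: float, coefs: Sequence[float],
                         terms: Sequence[float]) -> float:
    """Logistic transform of the linear predictor, clipped to [-20, 20]."""
    eta = intercept + sum(b * t for b, t in zip(coefs, terms))
    eta = max(-20.0, min(20.0, eta))
    return math.exp(eta) / (1.0 + math.exp(eta))


# Hypothetical traffic measures for one interval.
terms = parametric_terms(-2.0, 60.0, 3.0, 55.0, 2.0, 4.0)
print(terms)
# With all coefficients set to zero the predicted probability is 0.5.
print(incident_probability(0.0, [0.0] * len(terms), terms))
```

Splitting a variable at zero lets the model assign different slopes to negative and positive deviations, which is a simple way to encode an asymmetric effect without a spline.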

PAGE 174

A.2 Script to Evaluate Model Performance (DR, ISDR, FAR, and TTD)

/*Input file is OS_temp including variables; setno, no, incvector (dependent variable), p_incvector (predicted response) */
/*combine 'original test file' with 'output of test file from gam*/
/**+++++++ train set ++++++++++++*/
data os_temptrain;
  set trainlbshldnoninc (keep= setno incvector lborshld);
  set i880trainoutparam (keep= p_incvector);
  varname = "gam";
run;
data os_temptest;
  set testlbshldnoninc (keep= setno incvector lborshld);
  set i880testoutparam (keep= p_incvector);
  varname = "gam";
run;
/**********************************************************************/
/*****++++++calculate performance for++++train set++++++*************/
/*++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*/
data os_isdrfartemp;
  set os_temptrain;
  retain holdprob0; retain holdprob1; retain holdprob2; retain holdprob3;
  /****specify transformation and threshold here, default threshold is 0.5 ********/
  if p_incvector > 20 then p_incvector = 20;
  if p_incvector < -20 then p_incvector = -20;
  prob_apt0 = (exp(p_incvector)/(1+exp(p_incvector)));
  if prob_apt0 > 0.50 then prob_apt0 = 1; else prob_apt0 = 0;
  /*========================================================*/
  /*create response for different persistence (from 0 upto 4 persistence)*/
  if prob_apt0 = 1 and holdprob0 = 1 then prob_apt1 = 1; else prob_apt1 = 0;
  if prob_apt1 = 1 and holdprob1 = 1 then prob_apt2 = 1; else prob_apt2 = 0;
  if prob_apt2 = 1 and holdprob2 = 1 then prob_apt3 = 1; else prob_apt3 = 0;
  if prob_apt3 = 1 and holdprob3 = 1 then prob_apt4 = 1; else prob_apt4 = 0;
  output;
  holdprob0 = prob_apt0;
  holdprob1 = prob_apt1;

PAGE 175

  holdprob2 = prob_apt2;
  holdprob3 = prob_apt3;
  drop holdprob0 holdprob1 holdprob2 holdprob3;
run;
/****************************************************************/
/*part i start calculate incident-state detection rate (isdr) for original sample*/
/*part i.i for both lb and shoulder*******************************/
data os_isdr;
  set os_temptrain;
  set os_isdrfartemp;
  if incvector = 1 and prob_apt0 = 1 then isd0 = 1; else isd0 = 0;
  if incvector = 1 and prob_apt1 = 1 then isd1 = 1; else isd1 = 0;
  if incvector = 1 and prob_apt2 = 1 then isd2 = 1; else isd2 = 0;
  if incvector = 1 and prob_apt3 = 1 then isd3 = 1; else isd3 = 0;
  if incvector = 1 and prob_apt4 = 1 then isd4 = 1; else isd4 = 0;
  tisd0+isd0; tisd1+isd1; tisd2+isd2; tisd3+isd3; tisd4+isd4; tis+incvector;
  if tis ^= 0 then os_isdrpt0 = (tisd0/tis)*100; else os_isdrpt0 = 0;
  if tis ^= 0 then os_isdrpt1 = (tisd1/tis)*100; else os_isdrpt1 = 0;
  if tis ^= 0 then os_isdrpt2 = (tisd2/tis)*100; else os_isdrpt2 = 0;
  if tis ^= 0 then os_isdrpt3 = (tisd3/tis)*100; else os_isdrpt3 = 0;
  if tis ^= 0 then os_isdrpt4 = (tisd4/tis)*100; else os_isdrpt4 = 0;
run;
proc sort data= os_isdrfartemp out= os_isdrfartemp2;
  by incvector setno;
run;
/*part i.ii for lane-blocking only*******************************/
/******************************************************************/
data os_lbincdataset;
  set os_isdrfartemp2;
  by setno notsorted;
  if lborshld = 1;
  if last.setno = 1;
  lbincset = setno;
  keep setno lbincset;
  output;
data os_lbincset;
  merge os_isdrfartemp os_lbincdataset;
  by setno;
  if lbincset = "" then delete;
  /*drop lbincset;*/
run;

PAGE 176

data os_lbisdr;
  set os_lbincset;
  if incvector = 1 and prob_apt0 = 1 then lbisd0 = 1; else lbisd0 = 0;
  if incvector = 1 and prob_apt1 = 1 then lbisd1 = 1; else lbisd1 = 0;
  if incvector = 1 and prob_apt2 = 1 then lbisd2 = 1; else lbisd2 = 0;
  if incvector = 1 and prob_apt3 = 1 then lbisd3 = 1; else lbisd3 = 0;
  if incvector = 1 and prob_apt4 = 1 then lbisd4 = 1; else lbisd4 = 0;
  tlbisd0+lbisd0; tlbisd1+lbisd1; tlbisd2+lbisd2; tlbisd3+lbisd3; tlbisd4+lbisd4; tlbis+incvector;
  if tlbis ^= 0 then os_lbisdrpt0 = (tlbisd0/tlbis)*100; else os_lbisdrpt0 = 0;
  if tlbis ^= 0 then os_lbisdrpt1 = (tlbisd1/tlbis)*100; else os_lbisdrpt1 = 0;
  if tlbis ^= 0 then os_lbisdrpt2 = (tlbisd2/tlbis)*100; else os_lbisdrpt2 = 0;
  if tlbis ^= 0 then os_lbisdrpt3 = (tlbisd3/tlbis)*100; else os_lbisdrpt3 = 0;
  if tlbis ^= 0 then os_lbisdrpt4 = (tlbisd4/tlbis)*100; else os_lbisdrpt4 = 0;
run;
/*part i.iii : for shoulder incident only***************************/
/******************************************************************/
data os_shldincdataset;
  set os_isdrfartemp2;
  by setno notsorted;
  if lborshld = 2;
  if last.setno = 1;
  shldincset = setno;
  keep setno shldincset;
  output;
data os_shldincset;
  merge os_isdrfartemp os_shldincdataset;
  by setno;
  if shldincset = "" then delete;
  drop shldincset;
run;
data os_shldisdr;
  set os_shldincset;
  if incvector = 1 and prob_apt0 = 1 then shldisd0 = 1; else shldisd0 = 0;
  if incvector = 1 and prob_apt1 = 1 then shldisd1 = 1; else shldisd1 = 0;
  if incvector = 1 and prob_apt2 = 1 then shldisd2 = 1; else shldisd2 = 0;
  if incvector = 1 and prob_apt3 = 1 then shldisd3 = 1; else shldisd3 = 0;
  if incvector = 1 and prob_apt4 = 1 then shldisd4 = 1; else shldisd4 = 0;
  tshldisd0+shldisd0; tshldisd1+shldisd1; tshldisd2+shldisd2; tshldisd3+shldisd3; tshldisd4+shldisd4; tshldis+incvector;
  if tshldis ^= 0 then os_shldisdrpt0 = (tshldisd0/tshldis)*100; else os_shldisdrpt0 = 0;

PAGE 177

  if tshldis ^= 0 then os_shldisdrpt1 = (tshldisd1/tshldis)*100; else os_shldisdrpt1 = 0;
  if tshldis ^= 0 then os_shldisdrpt2 = (tshldisd2/tshldis)*100; else os_shldisdrpt2 = 0;
  if tshldis ^= 0 then os_shldisdrpt3 = (tshldisd3/tshldis)*100; else os_shldisdrpt3 = 0;
  if tshldis ^= 0 then os_shldisdrpt4 = (tshldisd4/tshldis)*100; else os_shldisdrpt4 = 0;
run;
/********end of part i**************************************/
/*part ii calculate time to detect(ttd) **********************/
/*part ii.i calculate ttd for lane-blocking only************/
data os_lbisdrttd;
  set os_lbisdr;
  by setno;
  retain holdtotala holdtprob_apt0 holdtprob_apt1 holdtprob_apt2 holdtprob_apt3 holdtprob_apt4 holdlborshld;
  totala+incvector;
  if incvector = 0 then totala = 0;
  if holdtotala >= 1 and totala = 0 then itisinc = 1; else itisinc = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt0+prob_apt0; else tprob_apt0 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt1+prob_apt1; else tprob_apt1 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt2+prob_apt2; else tprob_apt2 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt3+prob_apt3; else tprob_apt3 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt4+prob_apt4; else tprob_apt4 = 0;
  /*below to check if the model detects the incident*/
  if itisinc = 1 and tprob_apt0 >= 1 then itisdetect0 = 1; else itisdetect0 = 0;
  if itisinc = 1 and tprob_apt1 >= 1 then itisdetect1 = 1; else itisdetect1 = 0;
  if itisinc = 1 and tprob_apt2 >= 1 then itisdetect2 = 1; else itisdetect2 = 0;
  if itisinc = 1 and tprob_apt3 >= 1 then itisdetect3 = 1; else itisdetect3 = 0;
  if itisinc = 1 and tprob_apt4 >= 1 then itisdetect4 = 1; else itisdetect4 = 0;
  titisdetect0+itisdetect0; titisdetect1+itisdetect1; titisdetect2+itisdetect2; titisdetect3+itisdetect3; titisdetect4+itisdetect4;
  titisinc+itisinc;
  /* begin calculation of mean time to detect*/
  if totala = 1 then incstart = _n_; else incstart = 0;
  if last.setno = 1 or incstart ^= 0 then tincstart = 0;
  tincstart+incstart;
  if tprob_apt0 = 1 and holdtprob_apt0 = 0 then incdetectat0 = _n_; else incdetectat0 = 0;
  if tprob_apt1 = 1 and holdtprob_apt1 = 0 then incdetectat1 = _n_; else incdetectat1 = 0;
  if tprob_apt2 = 1 and holdtprob_apt2 = 0 then incdetectat2 = _n_; else incdetectat2 = 0;
  if tprob_apt3 = 1 and holdtprob_apt3 = 0 then incdetectat3 = _n_; else incdetectat3 = 0;
  if tprob_apt4 = 1 and holdtprob_apt4 = 0 then incdetectat4 = _n_; else incdetectat4 = 0;

PAGE 178

  if first.setno = 1 or totala = 1 then tincdetectat0 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat1 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat2 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat3 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat4 = 0;
  if tincdetectat0 = 0 or tincdetectat0 = "" then tincdetectat0+incdetectat0;
  if tincdetectat1 = 0 or tincdetectat1 = "" then tincdetectat1+incdetectat1;
  if tincdetectat2 = 0 or tincdetectat2 = "" then tincdetectat2+incdetectat2;
  if tincdetectat3 = 0 or tincdetectat3 = "" then tincdetectat3+incdetectat3;
  if tincdetectat4 = 0 or tincdetectat4 = "" then tincdetectat4+incdetectat4;
  if tincstart ^= 0 and tincdetectat0 ^= 0 and incvector = 1 then et0 = tincdetectat0-tincstart; else et0 = 0;
  if tincstart ^= 0 and tincdetectat1 ^= 0 and incvector = 1 then et1 = tincdetectat1-tincstart; else et1 = 0;
  if tincstart ^= 0 and tincdetectat2 ^= 0 then et2 = tincdetectat2-tincstart; else et2 = 0;
  if tincstart ^= 0 and tincdetectat3 ^= 0 then et3 = tincdetectat3-tincstart; else et3 = 0;
  if tincstart ^= 0 and tincdetectat4 ^= 0 then et4 = tincdetectat4-tincstart; else et4 = 0;
  retain holdet0; retain holdet1; retain holdet2; retain holdet3; retain holdet4;
  if et0 ^= 0 and holdet0 = 0 then ttd0 = et0; else ttd0 = 0;
  if et1 ^= 0 and holdet1 = 0 then ttd1 = et1; else ttd1 = 0;
  if et2 ^= 0 and holdet2 = 0 then ttd2 = et2; else ttd2 = 0;
  if et3 ^= 0 and holdet3 = 0 then ttd3 = et3; else ttd3 = 0;
  if et4 ^= 0 and holdet4 = 0 then ttd4 = et4; else ttd4 = 0;
  totalttd0+ttd0; totalttd1+ttd1; totalttd2+ttd2; totalttd3+ttd3; totalttd4+ttd4;
  if titisdetect0 ^= 0 then os_lbmttd0 = totalttd0/titisdetect0; else os_lbmttd0 = 0;
  if titisdetect1 ^= 0 then os_lbmttd1 = totalttd1/titisdetect1; else os_lbmttd1 = 0;
  if titisdetect2 ^= 0 then os_lbmttd2 = totalttd2/titisdetect2; else os_lbmttd2 = 0;
  if titisdetect3 ^= 0 then os_lbmttd3 = totalttd3/titisdetect3; else os_lbmttd3 = 0;
  if titisdetect4 ^= 0 then os_lbmttd4 = totalttd4/titisdetect4; else os_lbmttd4 = 0;
  output;
  holdet0 = et0; holdet1 = et1; holdet2 = et2; holdet3 = et3; holdet4 = et4;
  holdtotala = totala;
  holdtprob_apt0 = tprob_apt0;
  holdtprob_apt1 = tprob_apt1;
  holdtprob_apt2 = tprob_apt2;
  holdtprob_apt3 = tprob_apt3;
  holdtprob_apt4 = tprob_apt4;
  holdlborshld = lborshld;

PAGE 179

  keep os_lbmttd0 os_lbmttd1 os_lbmttd2 os_lbmttd3 os_lbmttd4;
  keep os_lbisdrpt0 os_lbisdrpt1 os_lbisdrpt2 os_lbisdrpt3 os_lbisdrpt4;
run;
/*part ii.ii calculate ttd for shoulder incident only***********/
data os_shldisdrttd;
  set os_shldisdr;
  by setno;
  retain holdtotala holdtprob_apt0 holdtprob_apt1 holdtprob_apt2 holdtprob_apt3 holdtprob_apt4 holdlborshld;
  totala+incvector;
  if incvector = 0 then totala = 0;
  if holdtotala >= 1 and totala = 0 then itisinc = 1; else itisinc = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt0+prob_apt0; else tprob_apt0 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt1+prob_apt1; else tprob_apt1 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt2+prob_apt2; else tprob_apt2 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt3+prob_apt3; else tprob_apt3 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt4+prob_apt4; else tprob_apt4 = 0;
  /*below to check if the model detects the incident*/
  if itisinc = 1 and tprob_apt0 >= 1 then itisdetect0 = 1; else itisdetect0 = 0;
  if itisinc = 1 and tprob_apt1 >= 1 then itisdetect1 = 1; else itisdetect1 = 0;
  if itisinc = 1 and tprob_apt2 >= 1 then itisdetect2 = 1; else itisdetect2 = 0;
  if itisinc = 1 and tprob_apt3 >= 1 then itisdetect3 = 1; else itisdetect3 = 0;
  if itisinc = 1 and tprob_apt4 >= 1 then itisdetect4 = 1; else itisdetect4 = 0;
  titisdetect0+itisdetect0; titisdetect1+itisdetect1; titisdetect2+itisdetect2; titisdetect3+itisdetect3; titisdetect4+itisdetect4;
  titisinc+itisinc;
  /*end calculation of dr*/
  /* begin calculation of mean time to detect*/
  if totala = 1 then incstart = _n_; else incstart = 0;
  if last.setno = 1 or incstart ^= 0 then tincstart = 0;
  tincstart+incstart;
  if tprob_apt0 = 1 and holdtprob_apt0 = 0 then incdetectat0 = _n_; else incdetectat0 = 0;
  if tprob_apt1 = 1 and holdtprob_apt1 = 0 then incdetectat1 = _n_; else incdetectat1 = 0;
  if tprob_apt2 = 1 and holdtprob_apt2 = 0 then incdetectat2 = _n_; else incdetectat2 = 0;
  if tprob_apt3 = 1 and holdtprob_apt3 = 0 then incdetectat3 = _n_; else incdetectat3 = 0;
  if tprob_apt4 = 1 and holdtprob_apt4 = 0 then incdetectat4 = _n_; else incdetectat4 = 0;

PAGE 180

  if first.setno = 1 or totala = 1 then tincdetectat0 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat1 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat2 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat3 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat4 = 0;
  if tincdetectat0 = 0 or tincdetectat0 = "" then tincdetectat0+incdetectat0;
  if tincdetectat1 = 0 or tincdetectat1 = "" then tincdetectat1+incdetectat1;
  if tincdetectat2 = 0 or tincdetectat2 = "" then tincdetectat2+incdetectat2;
  if tincdetectat3 = 0 or tincdetectat3 = "" then tincdetectat3+incdetectat3;
  if tincdetectat4 = 0 or tincdetectat4 = "" then tincdetectat4+incdetectat4;
  if tincstart ^= 0 and tincdetectat0 ^= 0 and incvector = 1 then et0 = tincdetectat0-tincstart; else et0 = 0;
  if tincstart ^= 0 and tincdetectat1 ^= 0 and incvector = 1 then et1 = tincdetectat1-tincstart; else et1 = 0;
  if tincstart ^= 0 and tincdetectat2 ^= 0 then et2 = tincdetectat2-tincstart; else et2 = 0;
  if tincstart ^= 0 and tincdetectat3 ^= 0 then et3 = tincdetectat3-tincstart; else et3 = 0;
  if tincstart ^= 0 and tincdetectat4 ^= 0 then et4 = tincdetectat4-tincstart; else et4 = 0;
  retain holdet0; retain holdet1; retain holdet2; retain holdet3; retain holdet4;
  if et0 ^= 0 and holdet0 = 0 then ttd0 = et0; else ttd0 = 0;
  if et1 ^= 0 and holdet1 = 0 then ttd1 = et1; else ttd1 = 0;
  if et2 ^= 0 and holdet2 = 0 then ttd2 = et2; else ttd2 = 0;
  if et3 ^= 0 and holdet3 = 0 then ttd3 = et3; else ttd3 = 0;
  if et4 ^= 0 and holdet4 = 0 then ttd4 = et4; else ttd4 = 0;
  totalttd0+ttd0; totalttd1+ttd1; totalttd2+ttd2; totalttd3+ttd3; totalttd4+ttd4;
  if titisdetect0 ^= 0 then os_shldmttd0 = totalttd0/titisdetect0; else os_shldmttd0 = 0;
  if titisdetect1 ^= 0 then os_shldmttd1 = totalttd1/titisdetect1; else os_shldmttd1 = 0;
  if titisdetect2 ^= 0 then os_shldmttd2 = totalttd2/titisdetect2; else os_shldmttd2 = 0;
  if titisdetect3 ^= 0 then os_shldmttd3 = totalttd3/titisdetect3; else os_shldmttd3 = 0;
  if titisdetect4 ^= 0 then os_shldmttd4 = totalttd4/titisdetect4; else os_shldmttd4 = 0;
  output;
  holdet0 = et0; holdet1 = et1; holdet2 = et2; holdet3 = et3; holdet4 = et4;
  holdtotala = totala;
  holdtprob_apt0 = tprob_apt0;
  holdtprob_apt1 = tprob_apt1;
  holdtprob_apt2 = tprob_apt2;
  holdtprob_apt3 = tprob_apt3;
  holdtprob_apt4 = tprob_apt4;
  holdlborshld = lborshld;

PAGE 181

  keep os_shldmttd0 os_shldmttd1 os_shldmttd2 os_shldmttd3 os_shldmttd4;
  keep os_shldisdrpt0 os_shldisdrpt1 os_shldisdrpt2 os_shldisdrpt3 os_shldisdrpt4;
run;
/*******ttd for both lane-blocking and shoulder will be calculated next section*************************/
/*part iii calculate false alarm rate(far)********separated non-incidents set is created******************/
data os_incdataset;
  set os_isdrfartemp2;
  by setno notsorted;
  if incvector = 1;
  if last.setno = 1;
  incset = setno;
  keep setno incset;
  output;
run;
/* os_nonincset is non-incident data file and used to calculate far*/
data os_nonincset;
  merge os_isdrfartemp os_incdataset;
  by setno;
  if incset ^= "" then delete;
  drop incset;
run;
/* os_incset is incident data file*/
data os_incset;
  merge os_isdrfartemp os_incdataset;
  by setno;
  if incset = "" then delete;
  drop incset;
run;
/* os_isdrfartemp contains only non-incident sets and then used to compute far*/
data os_far;
  set os_nonincset;
  if incvector = 0 and prob_apt0 = 1 then fa0 = 1; else fa0 = 0;
  if incvector = 0 and prob_apt1 = 1 then fa1 = 1; else fa1 = 0;
  if incvector = 0 and prob_apt2 = 1 then fa2 = 1; else fa2 = 0;
  if incvector = 0 and prob_apt3 = 1 then fa3 = 1; else fa3 = 0;
  if incvector = 0 and prob_apt4 = 1 then fa4 = 1; else fa4 = 0;
  tfa0+fa0; tfa1+fa1; tfa2+fa2; tfa3+fa3; tfa4+fa4;
  ninc = 1-incvector;
  tninc+ninc;

PAGE 182

  if tninc ^= 0 then os_farpt0 = (tfa0/tninc)*100; else os_farpt0 = 0;
  if tninc ^= 0 then os_farpt1 = (tfa1/tninc)*100; else os_farpt1 = 0;
  if tninc ^= 0 then os_farpt2 = (tfa2/tninc)*100; else os_farpt2 = 0;
  if tninc ^= 0 then os_farpt3 = (tfa3/tninc)*100; else os_farpt3 = 0;
  if tninc ^= 0 then os_farpt4 = (tfa4/tninc)*100; else os_farpt4 = 0;
  /*keep setno incvector vectorid p_a prob_apt0 prob_apt1 prob_apt2 prob_apt3 prob_apt4 os_isdrpt0 os_isdrpt1 os_isdrpt2 os_isdrpt3 os_isdrpt4 os_farpt0 os_farpt1 os_farpt2 os_farpt3 os_farpt4;*/
run;
/********end of false alarm rate calculation**************/
/* calculate dr and ttd for original sample both laneblocking and shoulder incidents***/
data os_drfar;
  set os_isdr end=lastobs;
  by setno;
  retain holdtotala holdtprob_apt0 holdtprob_apt1 holdtprob_apt2 holdtprob_apt3 holdtprob_apt4 holdlborshld;
  totala+incvector;
  if incvector = 0 then totala = 0;
  if holdtotala >= 1 and totala = 0 then itisinc = 1; else itisinc = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt0+prob_apt0; else tprob_apt0 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt1+prob_apt1; else tprob_apt1 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt2+prob_apt2; else tprob_apt2 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt3+prob_apt3; else tprob_apt3 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt4+prob_apt4; else tprob_apt4 = 0;
  /*below to check if the model detects the incident*/
  if itisinc = 1 and tprob_apt0 >= 1 then itisdetect0 = 1; else itisdetect0 = 0;
  if itisinc = 1 and tprob_apt1 >= 1 then itisdetect1 = 1; else itisdetect1 = 0;
  if itisinc = 1 and tprob_apt2 >= 1 then itisdetect2 = 1; else itisdetect2 = 0;
  if itisinc = 1 and tprob_apt3 >= 1 then itisdetect3 = 1; else itisdetect3 = 0;
  if itisinc = 1 and tprob_apt4 >= 1 then itisdetect4 = 1; else itisdetect4 = 0;
  titisdetect0+itisdetect0; titisdetect1+itisdetect1; titisdetect2+itisdetect2; titisdetect3+itisdetect3; titisdetect4+itisdetect4;
  titisinc+itisinc;
  /*below is to check how many on each types (laneblocking and shoulder) incidents are detected in input data, if "lborshld" equals to 1 is laneblocking and if 2 is shoulder incident*/
  if itisdetect0 = 1 and holdlborshld = 1 then lbdetected0 = 1; else lbdetected0 = 0;
  if itisdetect0 = 1 and holdlborshld = 2 then shlddetected0 = 1; else shlddetected0 = 0;

PAGE 183

  if itisdetect1 = 1 and holdlborshld = 1 then lbdetected1 = 1; else lbdetected1 = 0;
  if itisdetect1 = 1 and holdlborshld = 2 then shlddetected1 = 1; else shlddetected1 = 0;
  if itisdetect2 = 1 and holdlborshld = 1 then lbdetected2 = 1; else lbdetected2 = 0;
  if itisdetect2 = 1 and holdlborshld = 2 then shlddetected2 = 1; else shlddetected2 = 0;
  if itisdetect3 = 1 and holdlborshld = 1 then lbdetected3 = 1; else lbdetected3 = 0;
  if itisdetect3 = 1 and holdlborshld = 2 then shlddetected3 = 1; else shlddetected3 = 0;
  if itisdetect4 = 1 and holdlborshld = 1 then lbdetected4 = 1; else lbdetected4 = 0;
  if itisdetect4 = 1 and holdlborshld = 2 then shlddetected4 = 1; else shlddetected4 = 0;
  tlbdetected0+lbdetected0; tlbdetected1+lbdetected1; tlbdetected2+lbdetected2; tlbdetected3+lbdetected3; tlbdetected4+lbdetected4;
  tshlddetected0+shlddetected0; tshlddetected1+shlddetected1; tshlddetected2+shlddetected2; tshlddetected3+shlddetected3; tshlddetected4+shlddetected4;
  if itisinc = 1 and holdlborshld = 1 then itislbinc = 1; else itislbinc = 0;
  if itisinc = 1 and holdlborshld = 2 then itisshldinc = 1; else itisshldinc = 0;
  titislbinc+itislbinc; titisshldinc+itisshldinc;
  /*end calculation of dr*/
  /*begin calculation of mean time to detect*/
  if totala = 1 then incstart = _n_; else incstart = 0;
  if last.setno = 1 or incstart ^= 0 then tincstart = 0;
  tincstart+incstart;
  if tprob_apt0 = 1 and holdtprob_apt0 = 0 then incdetectat0 = _n_; else incdetectat0 = 0;
  if tprob_apt1 = 1 and holdtprob_apt1 = 0 then incdetectat1 = _n_; else incdetectat1 = 0;
  if tprob_apt2 = 1 and holdtprob_apt2 = 0 then incdetectat2 = _n_; else incdetectat2 = 0;
  if tprob_apt3 = 1 and holdtprob_apt3 = 0 then incdetectat3 = _n_; else incdetectat3 = 0;
  if tprob_apt4 = 1 and holdtprob_apt4 = 0 then incdetectat4 = _n_; else incdetectat4 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat0 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat1 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat2 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat3 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat4 = 0;
  if tincdetectat0 = 0 or tincdetectat0 = "" then tincdetectat0+incdetectat0;
  if tincdetectat1 = 0 or tincdetectat1 = "" then tincdetectat1+incdetectat1;
  if tincdetectat2 = 0 or tincdetectat2 = "" then tincdetectat2+incdetectat2;
  if tincdetectat3 = 0 or tincdetectat3 = "" then tincdetectat3+incdetectat3;
  if tincdetectat4 = 0 or tincdetectat4 = "" then tincdetectat4+incdetectat4;

PAGE 184

  if tincstart ^= 0 and tincdetectat0 ^= 0 and incvector = 1 then et0 = tincdetectat0-tincstart; else et0 = 0;
  if tincstart ^= 0 and tincdetectat1 ^= 0 and incvector = 1 then et1 = tincdetectat1-tincstart; else et1 = 0;
  if tincstart ^= 0 and tincdetectat2 ^= 0 then et2 = tincdetectat2-tincstart; else et2 = 0;
  if tincstart ^= 0 and tincdetectat3 ^= 0 then et3 = tincdetectat3-tincstart; else et3 = 0;
  if tincstart ^= 0 and tincdetectat4 ^= 0 then et4 = tincdetectat4-tincstart; else et4 = 0;
  retain holdet0; retain holdet1; retain holdet2; retain holdet3; retain holdet4;
  if et0 ^= 0 and holdet0 = 0 then ttd0 = et0; else ttd0 = 0;
  if et1 ^= 0 and holdet1 = 0 then ttd1 = et1; else ttd1 = 0;
  if et2 ^= 0 and holdet2 = 0 then ttd2 = et2; else ttd2 = 0;
  if et3 ^= 0 and holdet3 = 0 then ttd3 = et3; else ttd3 = 0;
  if et4 ^= 0 and holdet4 = 0 then ttd4 = et4; else ttd4 = 0;
  totalttd0+ttd0; totalttd1+ttd1; totalttd2+ttd2; totalttd3+ttd3; totalttd4+ttd4;
  output;
  holdet0 = et0; holdet1 = et1; holdet2 = et2; holdet3 = et3; holdet4 = et4;
  holdtotala = totala;
  holdtprob_apt0 = tprob_apt0;
  holdtprob_apt1 = tprob_apt1;
  holdtprob_apt2 = tprob_apt2;
  holdtprob_apt3 = tprob_apt3;
  holdtprob_apt4 = tprob_apt4;
  holdlborshld = lborshld;
run;
/* end calculation of dr, far, isdr and mttd*/
/*=======================================================*/
data os_drplusttdbeforesum;
  set os_drfar end=lastobs;
  if titisinc ^= 0 then os_dr0 = (titisdetect0/titisinc)*100; else os_dr0 = 0;
  if titisinc ^= 0 then os_dr1 = (titisdetect1/titisinc)*100; else os_dr1 = 0;
  if titisinc ^= 0 then os_dr2 = (titisdetect2/titisinc)*100; else os_dr2 = 0;
  if titisinc ^= 0 then os_dr3 = (titisdetect3/titisinc)*100; else os_dr3 = 0;
  if titisinc ^= 0 then os_dr4 = (titisdetect4/titisinc)*100; else os_dr4 = 0;
  numincdetected0 = titisdetect0;
  numincdetected1 = titisdetect1;
  numincdetected2 = titisdetect2;
  numincdetected3 = titisdetect3;
  numincdetected4 = titisdetect4;

PAGE 185

  numtotalinc = titisinc;
  if titisdetect0 ^= 0 then os_mttd0 = totalttd0/titisdetect0; else os_mttd0 = 0;
  if titisdetect1 ^= 0 then os_mttd1 = totalttd1/titisdetect1; else os_mttd1 = 0;
  if titisdetect2 ^= 0 then os_mttd2 = totalttd2/titisdetect2; else os_mttd2 = 0;
  if titisdetect3 ^= 0 then os_mttd3 = totalttd3/titisdetect3; else os_mttd3 = 0;
  if titisdetect4 ^= 0 then os_mttd4 = totalttd4/titisdetect4; else os_mttd4 = 0;
  keep os_dr0 os_dr1 os_dr2 os_dr3 os_dr4;
  keep os_mttd0 os_mttd1 os_mttd2 os_mttd3 os_mttd4;
  keep numincdetected0 numincdetected1 numincdetected2 numincdetected3 numincdetected4 numtotalinc;
  keep os_isdrpt0 os_isdrpt1 os_isdrpt2 os_isdrpt3 os_isdrpt4;
  keep titislbinc titisshldinc;
  keep tlbdetected0 tlbdetected1 tlbdetected2 tlbdetected3 tlbdetected4;
  keep tshlddetected0 tshlddetected1 tshlddetected2 tshlddetected3 tshlddetected4 varname;
  if lastobs;
run;
data os_farbeforesum;
  set os_far end=lastobs;
  keep os_farpt0 os_farpt1 os_farpt2 os_farpt3 os_farpt4 varname;
  if lastobs;
run;
data os_lbisdrttdsum;
  set os_lbisdrttd end=lastobs;
  keep os_lbisdrpt0 os_lbisdrpt1 os_lbisdrpt2 os_lbisdrpt3 os_lbisdrpt4;
  keep os_lbmttd0 os_lbmttd1 os_lbmttd2 os_lbmttd3 os_lbmttd4;
  if lastobs;
run;
data os_shldisdrttdsum;
  set os_shldisdrttd end=lastobs;
  keep os_shldisdrpt0 os_shldisdrpt1 os_shldisdrpt2 os_shldisdrpt3 os_shldisdrpt4;
  keep os_shldmttd0 os_shldmttd1 os_shldmttd2 os_shldmttd3 os_shldmttd4;
  if lastobs;
run;
data os_summarytrain;
  set os_drplusttdbeforesum end=lastobs;
  set os_farbeforesum end=lastobs;
  set os_lbisdrttdsum end=lastobs;
  set os_shldisdrttdsum end=lastobs;


  keep os_shldisdrpt0 os_shldisdrpt1 os_shldisdrpt2 os_shldisdrpt3 os_shldisdrpt4 os_shldmttd0 os_shldmttd1 os_shldmttd2 os_shldmttd3 os_shldmttd4;
  keep os_lbisdrpt0 os_lbisdrpt1 os_lbisdrpt2 os_lbisdrpt3 os_lbisdrpt4 os_lbmttd0 os_lbmttd1 os_lbmttd2 os_lbmttd3 os_lbmttd4;
  keep os_isdrpt0 os_isdrpt1 os_isdrpt2 os_isdrpt3 os_isdrpt4;
  keep os_farpt0 os_farpt1 os_farpt2 os_farpt3 os_farpt4 os_dr0 os_dr1 os_dr2 os_dr3 os_dr4;
  keep os_mttd0 os_mttd1 os_mttd2 os_mttd3 os_mttd4;
  keep numincdetected0 numincdetected1 numincdetected2 numincdetected3 numincdetected4 numtotalinc;
  keep titislbinc titisshldinc;
  keep tlbdetected0 tlbdetected1 tlbdetected2 tlbdetected3 tlbdetected4;
  keep tshlddetected0 tshlddetected1 tshlddetected2 tshlddetected3 tshlddetected4 varname;
  if lastobs;
run;
proc datasets library = work;
  delete os_drplusttdbeforesum; delete os_farbeforesum; delete os_drfar; delete os_isdr;
  delete os_temp; delete os_isdrfartemp; delete os_isdrfartemp2; delete os_far;
  delete os_incset; delete os_incdataset; delete os_lbincset; delete os_lbincdataset;
  delete os_shldincset; delete os_shldincdataset; delete os_nonincset;
  delete os_lbisdrttd; delete os_shldisdrttd; delete os_lbisdr; delete os_shldisdr;
  delete os_temptrain; delete os_shldisdrttdsum; delete os_lbisdrttdsum;
run;
quit;
proc export data= work.os_summarytrain outfile= "c:\i-25 id\os_summarytrainlbandshld.xls" dbms=excel2000 replace;
run;
/***************************************************************/
/*****++++calculate performance for++++test set++++****/
/***************************************************************/
data os_isdrfartemp;
  set os_temptest;
  retain holdprob0;
  retain holdprob1;
  retain holdprob2;
  retain holdprob3;


  /*==== specify transformation and threshold here, default threshold is 0.5 ======*/
  if p_incvector > 20 then p_incvector = 20;
  if p_incvector < -20 then p_incvector = -20;
  prob_apt0 = (exp(p_incvector)/(1+exp(p_incvector)));
  if prob_apt0 > 0.5 then prob_apt0 = 1; else prob_apt0 = 0;
  /*========================================================*/
  /******* create response for different persistence (from 0 up to 4 persistence) */
  if prob_apt0 = 1 and holdprob0 = 1 then prob_apt1 = 1; else prob_apt1 = 0;
  if prob_apt1 = 1 and holdprob1 = 1 then prob_apt2 = 1; else prob_apt2 = 0;
  if prob_apt2 = 1 and holdprob2 = 1 then prob_apt3 = 1; else prob_apt3 = 0;
  if prob_apt3 = 1 and holdprob3 = 1 then prob_apt4 = 1; else prob_apt4 = 0;
  output;
  holdprob0 = prob_apt0;
  holdprob1 = prob_apt1;
  holdprob2 = prob_apt2;
  holdprob3 = prob_apt3;
  drop holdprob0 holdprob1 holdprob2 holdprob3;
run;
/*part i start calculate incident-state detection rate (isdr) for original sample*/
/*part i.i for both lb and shld*/
data os_isdr;
  set os_temptest;
  set os_isdrfartemp;
  if incvector = 1 and prob_apt0 = 1 then isd0 = 1; else isd0 = 0;
  if incvector = 1 and prob_apt1 = 1 then isd1 = 1; else isd1 = 0;
  if incvector = 1 and prob_apt2 = 1 then isd2 = 1; else isd2 = 0;
  if incvector = 1 and prob_apt3 = 1 then isd3 = 1; else isd3 = 0;
  if incvector = 1 and prob_apt4 = 1 then isd4 = 1; else isd4 = 0;
  tisd0+isd0; tisd1+isd1; tisd2+isd2; tisd3+isd3; tisd4+isd4; tis+incvector;
  if tis ^= 0 then os_isdrpt0 = (tisd0/tis)*100; else os_isdrpt0 = 0;
  if tis ^= 0 then os_isdrpt1 = (tisd1/tis)*100; else os_isdrpt1 = 0;
  if tis ^= 0 then os_isdrpt2 = (tisd2/tis)*100; else os_isdrpt2 = 0;
  if tis ^= 0 then os_isdrpt3 = (tisd3/tis)*100; else os_isdrpt3 = 0;
  if tis ^= 0 then os_isdrpt4 = (tisd4/tis)*100; else os_isdrpt4 = 0;
run;
proc sort data= os_isdrfartemp out= os_isdrfartemp2;
  by incvector setno;


run;
/*part i.ii for lane-blocking only*********************************/
/******************************************************************/
data os_lbincdataset;
  set os_isdrfartemp2;
  by setno notsorted;
  if lborshld = 1;
  if last.setno = 1;
  lbincset = setno;
  keep setno lbincset;
  output;
data os_lbincset;
  merge os_isdrfartemp os_lbincdataset;
  by setno;
  if lbincset = "" then delete;
  /*drop lbincset;*/
run;
data os_lbisdr;
  set os_lbincset;
  if incvector = 1 and prob_apt0 = 1 then lbisd0 = 1; else lbisd0 = 0;
  if incvector = 1 and prob_apt1 = 1 then lbisd1 = 1; else lbisd1 = 0;
  if incvector = 1 and prob_apt2 = 1 then lbisd2 = 1; else lbisd2 = 0;
  if incvector = 1 and prob_apt3 = 1 then lbisd3 = 1; else lbisd3 = 0;
  if incvector = 1 and prob_apt4 = 1 then lbisd4 = 1; else lbisd4 = 0;
  tlbisd0+lbisd0; tlbisd1+lbisd1; tlbisd2+lbisd2; tlbisd3+lbisd3; tlbisd4+lbisd4; tlbis+incvector;
  if tlbis ^= 0 then os_lbisdrpt0 = (tlbisd0/tlbis)*100; else os_lbisdrpt0 = 0;
  if tlbis ^= 0 then os_lbisdrpt1 = (tlbisd1/tlbis)*100; else os_lbisdrpt1 = 0;
  if tlbis ^= 0 then os_lbisdrpt2 = (tlbisd2/tlbis)*100; else os_lbisdrpt2 = 0;
  if tlbis ^= 0 then os_lbisdrpt3 = (tlbisd3/tlbis)*100; else os_lbisdrpt3 = 0;
  if tlbis ^= 0 then os_lbisdrpt4 = (tlbisd4/tlbis)*100; else os_lbisdrpt4 = 0;
run;
/*part i.iii for shoulder incident only****************************/
/******************************************************************/
data os_shldincdataset;
  set os_isdrfartemp2;
  by setno notsorted;
  if lborshld = 2;
  if last.setno = 1;


  shldincset = setno;
  keep setno shldincset;
  output;
data os_shldincset;
  merge os_isdrfartemp os_shldincdataset;
  by setno;
  if shldincset = "" then delete;
  drop shldincset;
run;
data os_shldisdr;
  set os_shldincset;
  if incvector = 1 and prob_apt0 = 1 then shldisd0 = 1; else shldisd0 = 0;
  if incvector = 1 and prob_apt1 = 1 then shldisd1 = 1; else shldisd1 = 0;
  if incvector = 1 and prob_apt2 = 1 then shldisd2 = 1; else shldisd2 = 0;
  if incvector = 1 and prob_apt3 = 1 then shldisd3 = 1; else shldisd3 = 0;
  if incvector = 1 and prob_apt4 = 1 then shldisd4 = 1; else shldisd4 = 0;
  tshldisd0+shldisd0; tshldisd1+shldisd1; tshldisd2+shldisd2; tshldisd3+shldisd3; tshldisd4+shldisd4; tshldis+incvector;
  if tshldis ^= 0 then os_shldisdrpt0 = (tshldisd0/tshldis)*100; else os_shldisdrpt0 = 0;
  if tshldis ^= 0 then os_shldisdrpt1 = (tshldisd1/tshldis)*100; else os_shldisdrpt1 = 0;
  if tshldis ^= 0 then os_shldisdrpt2 = (tshldisd2/tshldis)*100; else os_shldisdrpt2 = 0;
  if tshldis ^= 0 then os_shldisdrpt3 = (tshldisd3/tshldis)*100; else os_shldisdrpt3 = 0;
  if tshldis ^= 0 then os_shldisdrpt4 = (tshldisd4/tshldis)*100; else os_shldisdrpt4 = 0;
run;
/****** end of part i **********************************************/
/*part ii calculate time to detect (ttd) ********************/
/*part ii.i calculate ttd for lane-blocking only************/
data os_lbisdrttd;
  set os_lbisdr;
  by setno;
  retain holdtotala holdtprob_apt0 holdtprob_apt1 holdtprob_apt2 holdtprob_apt3 holdtprob_apt4 holdlborshld;
  totala+incvector;
  if incvector = 0 then totala = 0;
  if holdtotala >= 1 and totala = 0 then itisinc = 1; else itisinc = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt0+prob_apt0; else tprob_apt0 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt1+prob_apt1; else tprob_apt1 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt2+prob_apt2; else tprob_apt2 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt3+prob_apt3; else tprob_apt3 = 0;


  if holdtotala ^= 0 or totala ^= 0 then tprob_apt4+prob_apt4; else tprob_apt4 = 0;
  /*below to check if the model detects the incident*/
  if itisinc = 1 and tprob_apt0 >= 1 then itisdetect0 = 1; else itisdetect0 = 0;
  if itisinc = 1 and tprob_apt1 >= 1 then itisdetect1 = 1; else itisdetect1 = 0;
  if itisinc = 1 and tprob_apt2 >= 1 then itisdetect2 = 1; else itisdetect2 = 0;
  if itisinc = 1 and tprob_apt3 >= 1 then itisdetect3 = 1; else itisdetect3 = 0;
  if itisinc = 1 and tprob_apt4 >= 1 then itisdetect4 = 1; else itisdetect4 = 0;
  titisdetect0+itisdetect0; titisdetect1+itisdetect1; titisdetect2+itisdetect2; titisdetect3+itisdetect3; titisdetect4+itisdetect4;
  titisinc+itisinc;
  /*begin calculation of mean time to detect*/
  if totala = 1 then incstart = _n_; else incstart = 0;
  if last.setno = 1 or incstart ^= 0 then tincstart = 0;
  tincstart+incstart;
  if tprob_apt0 = 1 and holdtprob_apt0 = 0 then incdetectat0 = _n_; else incdetectat0 = 0;
  if tprob_apt1 = 1 and holdtprob_apt1 = 0 then incdetectat1 = _n_; else incdetectat1 = 0;
  if tprob_apt2 = 1 and holdtprob_apt2 = 0 then incdetectat2 = _n_; else incdetectat2 = 0;
  if tprob_apt3 = 1 and holdtprob_apt3 = 0 then incdetectat3 = _n_; else incdetectat3 = 0;
  if tprob_apt4 = 1 and holdtprob_apt4 = 0 then incdetectat4 = _n_; else incdetectat4 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat0 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat1 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat2 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat3 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat4 = 0;
  if tincdetectat0 = 0 or tincdetectat0 = "" then tincdetectat0+incdetectat0;
  if tincdetectat1 = 0 or tincdetectat1 = "" then tincdetectat1+incdetectat1;
  if tincdetectat2 = 0 or tincdetectat2 = "" then tincdetectat2+incdetectat2;
  if tincdetectat3 = 0 or tincdetectat3 = "" then tincdetectat3+incdetectat3;
  if tincdetectat4 = 0 or tincdetectat4 = "" then tincdetectat4+incdetectat4;
  if tincstart ^= 0 and tincdetectat0 ^= 0 and incvector = 1 then et0 = tincdetectat0 - tincstart; else et0 = 0;
  if tincstart ^= 0 and tincdetectat1 ^= 0 and incvector = 1 then et1 = tincdetectat1 - tincstart; else et1 = 0;
  if tincstart ^= 0 and tincdetectat2 ^= 0 then et2 = tincdetectat2 - tincstart; else et2 = 0;
  if tincstart ^= 0 and tincdetectat3 ^= 0 then et3 = tincdetectat3 - tincstart; else et3 = 0;
  if tincstart ^= 0 and tincdetectat4 ^= 0 then et4 = tincdetectat4 - tincstart; else et4 = 0;
  retain holdet0; retain holdet1; retain holdet2; retain holdet3; retain holdet4;


  if et0 ^= 0 and holdet0 = 0 then ttd0 = et0; else ttd0 = 0;
  if et1 ^= 0 and holdet1 = 0 then ttd1 = et1; else ttd1 = 0;
  if et2 ^= 0 and holdet2 = 0 then ttd2 = et2; else ttd2 = 0;
  if et3 ^= 0 and holdet3 = 0 then ttd3 = et3; else ttd3 = 0;
  if et4 ^= 0 and holdet4 = 0 then ttd4 = et4; else ttd4 = 0;
  totalttd0+ttd0; totalttd1+ttd1; totalttd2+ttd2; totalttd3+ttd3; totalttd4+ttd4;
  if titisdetect0 ^= 0 then os_lbmttd0 = totalttd0/titisdetect0; else os_lbmttd0 = 0;
  if titisdetect1 ^= 0 then os_lbmttd1 = totalttd1/titisdetect1; else os_lbmttd1 = 0;
  if titisdetect2 ^= 0 then os_lbmttd2 = totalttd2/titisdetect2; else os_lbmttd2 = 0;
  if titisdetect3 ^= 0 then os_lbmttd3 = totalttd3/titisdetect3; else os_lbmttd3 = 0;
  if titisdetect4 ^= 0 then os_lbmttd4 = totalttd4/titisdetect4; else os_lbmttd4 = 0;
  output;
  holdet0 = et0; holdet1 = et1; holdet2 = et2; holdet3 = et3; holdet4 = et4;
  holdtotala = totala;
  holdtprob_apt0 = tprob_apt0;
  holdtprob_apt1 = tprob_apt1;
  holdtprob_apt2 = tprob_apt2;
  holdtprob_apt3 = tprob_apt3;
  holdtprob_apt4 = tprob_apt4;
  holdlborshld = lborshld;
  keep os_lbmttd0 os_lbmttd1 os_lbmttd2 os_lbmttd3 os_lbmttd4;
  keep os_lbisdrpt0 os_lbisdrpt1 os_lbisdrpt2 os_lbisdrpt3 os_lbisdrpt4;
run;
/*part ii.ii calculate ttd for shoulder incident only********/
data os_shldisdrttd;
  set os_shldisdr;
  by setno;
  retain holdtotala holdtprob_apt0 holdtprob_apt1 holdtprob_apt2 holdtprob_apt3 holdtprob_apt4 holdlborshld;
  totala+incvector;
  if incvector = 0 then totala = 0;
  if holdtotala >= 1 and totala = 0 then itisinc = 1; else itisinc = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt0+prob_apt0; else tprob_apt0 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt1+prob_apt1; else tprob_apt1 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt2+prob_apt2; else tprob_apt2 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt3+prob_apt3; else tprob_apt3 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt4+prob_apt4; else tprob_apt4 = 0;
  /*below to check if the model detects the incident*/
  if itisinc = 1 and tprob_apt0 >= 1 then itisdetect0 = 1; else itisdetect0 = 0;


  if itisinc = 1 and tprob_apt1 >= 1 then itisdetect1 = 1; else itisdetect1 = 0;
  if itisinc = 1 and tprob_apt2 >= 1 then itisdetect2 = 1; else itisdetect2 = 0;
  if itisinc = 1 and tprob_apt3 >= 1 then itisdetect3 = 1; else itisdetect3 = 0;
  if itisinc = 1 and tprob_apt4 >= 1 then itisdetect4 = 1; else itisdetect4 = 0;
  titisdetect0+itisdetect0; titisdetect1+itisdetect1; titisdetect2+itisdetect2; titisdetect3+itisdetect3; titisdetect4+itisdetect4;
  titisinc+itisinc;
  /* end calculation of dr */
  /* begin calculation of mean time to detect */
  if totala = 1 then incstart = _n_; else incstart = 0;
  if last.setno = 1 or incstart ^= 0 then tincstart = 0;
  tincstart+incstart;
  if tprob_apt0 = 1 and holdtprob_apt0 = 0 then incdetectat0 = _n_; else incdetectat0 = 0;
  if tprob_apt1 = 1 and holdtprob_apt1 = 0 then incdetectat1 = _n_; else incdetectat1 = 0;
  if tprob_apt2 = 1 and holdtprob_apt2 = 0 then incdetectat2 = _n_; else incdetectat2 = 0;
  if tprob_apt3 = 1 and holdtprob_apt3 = 0 then incdetectat3 = _n_; else incdetectat3 = 0;
  if tprob_apt4 = 1 and holdtprob_apt4 = 0 then incdetectat4 = _n_; else incdetectat4 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat0 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat1 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat2 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat3 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat4 = 0;
  if tincdetectat0 = 0 or tincdetectat0 = "" then tincdetectat0+incdetectat0;
  if tincdetectat1 = 0 or tincdetectat1 = "" then tincdetectat1+incdetectat1;
  if tincdetectat2 = 0 or tincdetectat2 = "" then tincdetectat2+incdetectat2;
  if tincdetectat3 = 0 or tincdetectat3 = "" then tincdetectat3+incdetectat3;
  if tincdetectat4 = 0 or tincdetectat4 = "" then tincdetectat4+incdetectat4;
  if tincstart ^= 0 and tincdetectat0 ^= 0 and incvector = 1 then et0 = tincdetectat0 - tincstart; else et0 = 0;
  if tincstart ^= 0 and tincdetectat1 ^= 0 and incvector = 1 then et1 = tincdetectat1 - tincstart; else et1 = 0;
  if tincstart ^= 0 and tincdetectat2 ^= 0 then et2 = tincdetectat2 - tincstart; else et2 = 0;
  if tincstart ^= 0 and tincdetectat3 ^= 0 then et3 = tincdetectat3 - tincstart; else et3 = 0;
  if tincstart ^= 0 and tincdetectat4 ^= 0 then et4 = tincdetectat4 - tincstart; else et4 = 0;
  retain holdet0; retain holdet1; retain holdet2; retain holdet3; retain holdet4;
  if et0 ^= 0 and holdet0 = 0 then ttd0 = et0; else ttd0 = 0;


  if et1 ^= 0 and holdet1 = 0 then ttd1 = et1; else ttd1 = 0;
  if et2 ^= 0 and holdet2 = 0 then ttd2 = et2; else ttd2 = 0;
  if et3 ^= 0 and holdet3 = 0 then ttd3 = et3; else ttd3 = 0;
  if et4 ^= 0 and holdet4 = 0 then ttd4 = et4; else ttd4 = 0;
  totalttd0+ttd0; totalttd1+ttd1; totalttd2+ttd2; totalttd3+ttd3; totalttd4+ttd4;
  if titisdetect0 ^= 0 then os_shldmttd0 = totalttd0/titisdetect0; else os_shldmttd0 = 0;
  if titisdetect1 ^= 0 then os_shldmttd1 = totalttd1/titisdetect1; else os_shldmttd1 = 0;
  if titisdetect2 ^= 0 then os_shldmttd2 = totalttd2/titisdetect2; else os_shldmttd2 = 0;
  if titisdetect3 ^= 0 then os_shldmttd3 = totalttd3/titisdetect3; else os_shldmttd3 = 0;
  if titisdetect4 ^= 0 then os_shldmttd4 = totalttd4/titisdetect4; else os_shldmttd4 = 0;
  output;
  holdet0 = et0; holdet1 = et1; holdet2 = et2; holdet3 = et3; holdet4 = et4;
  holdtotala = totala;
  holdtprob_apt0 = tprob_apt0;
  holdtprob_apt1 = tprob_apt1;
  holdtprob_apt2 = tprob_apt2;
  holdtprob_apt3 = tprob_apt3;
  holdtprob_apt4 = tprob_apt4;
  holdlborshld = lborshld;
  keep os_shldmttd0 os_shldmttd1 os_shldmttd2 os_shldmttd3 os_shldmttd4;
  keep os_shldisdrpt0 os_shldisdrpt1 os_shldisdrpt2 os_shldisdrpt3 os_shldisdrpt4;
run;
/*ttd for both lane-blocking and shoulder will be calculated next section*/
/*part iii calculate false alarm rate (far)*******/
/*separated non-incidents set*/
data os_incdataset;
  set os_isdrfartemp2;
  by setno notsorted;
  if incvector = 1;
  if last.setno = 1;
  incset = setno;
  keep setno incset;
  output;
run;
/* os_nonincset is non-incident data file and used to calculate far*/
data os_nonincset;
  merge os_isdrfartemp os_incdataset;
  by setno;
  if incset ^= "" then delete;
  drop incset;


run;
/* os_incset is incident data file*/
data os_incset;
  merge os_isdrfartemp os_incdataset;
  by setno;
  if incset = "" then delete;
  drop incset;
run;
/* os_nonincset contains only non-incident sets and is then used to compute far*/
data os_far;
  set os_nonincset;
  if incvector = 0 and prob_apt0 = 1 then fa0 = 1; else fa0 = 0;
  if incvector = 0 and prob_apt1 = 1 then fa1 = 1; else fa1 = 0;
  if incvector = 0 and prob_apt2 = 1 then fa2 = 1; else fa2 = 0;
  if incvector = 0 and prob_apt3 = 1 then fa3 = 1; else fa3 = 0;
  if incvector = 0 and prob_apt4 = 1 then fa4 = 1; else fa4 = 0;
  tfa0+fa0; tfa1+fa1; tfa2+fa2; tfa3+fa3; tfa4+fa4;
  ninc = 1-incvector;
  tninc+ninc;
  if tninc ^= 0 then os_farpt0 = (tfa0/tninc)*100; else os_farpt0 = 0;
  if tninc ^= 0 then os_farpt1 = (tfa1/tninc)*100; else os_farpt1 = 0;
  if tninc ^= 0 then os_farpt2 = (tfa2/tninc)*100; else os_farpt2 = 0;
  if tninc ^= 0 then os_farpt3 = (tfa3/tninc)*100; else os_farpt3 = 0;
  if tninc ^= 0 then os_farpt4 = (tfa4/tninc)*100; else os_farpt4 = 0;
run;
/******** end of false alarm rate calculation ******/
/* calculate dr and ttd for original sample both lane-blocking and shoulder incidents*/
data os_drfar;
  set os_isdr end=lastobs;
  by setno;
  retain holdtotala holdtprob_apt0 holdtprob_apt1 holdtprob_apt2 holdtprob_apt3 holdtprob_apt4 holdlborshld;
  totala+incvector;
  if incvector = 0 then totala = 0;
  if holdtotala >= 1 and totala = 0 then itisinc = 1; else itisinc = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt0+prob_apt0; else tprob_apt0 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt1+prob_apt1; else tprob_apt1 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt2+prob_apt2; else tprob_apt2 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt3+prob_apt3; else tprob_apt3 = 0;
  if holdtotala ^= 0 or totala ^= 0 then tprob_apt4+prob_apt4; else tprob_apt4 = 0;


  /*below to check if the model detects the incident*/
  if itisinc = 1 and tprob_apt0 >= 1 then itisdetect0 = 1; else itisdetect0 = 0;
  if itisinc = 1 and tprob_apt1 >= 1 then itisdetect1 = 1; else itisdetect1 = 0;
  if itisinc = 1 and tprob_apt2 >= 1 then itisdetect2 = 1; else itisdetect2 = 0;
  if itisinc = 1 and tprob_apt3 >= 1 then itisdetect3 = 1; else itisdetect3 = 0;
  if itisinc = 1 and tprob_apt4 >= 1 then itisdetect4 = 1; else itisdetect4 = 0;
  titisdetect0+itisdetect0; titisdetect1+itisdetect1; titisdetect2+itisdetect2; titisdetect3+itisdetect3; titisdetect4+itisdetect4;
  titisinc+itisinc;
  /*below is to check how many incidents of each type (lane-blocking and shoulder) are detected in the input data; "lborshld" equal to 1 is a lane-blocking incident and 2 is a shoulder incident*/
  if itisdetect0 = 1 and holdlborshld = 1 then lbdetected0 = 1; else lbdetected0 = 0;
  if itisdetect0 = 1 and holdlborshld = 2 then shlddetected0 = 1; else shlddetected0 = 0;
  if itisdetect1 = 1 and holdlborshld = 1 then lbdetected1 = 1; else lbdetected1 = 0;
  if itisdetect1 = 1 and holdlborshld = 2 then shlddetected1 = 1; else shlddetected1 = 0;
  if itisdetect2 = 1 and holdlborshld = 1 then lbdetected2 = 1; else lbdetected2 = 0;
  if itisdetect2 = 1 and holdlborshld = 2 then shlddetected2 = 1; else shlddetected2 = 0;
  if itisdetect3 = 1 and holdlborshld = 1 then lbdetected3 = 1; else lbdetected3 = 0;
  if itisdetect3 = 1 and holdlborshld = 2 then shlddetected3 = 1; else shlddetected3 = 0;
  if itisdetect4 = 1 and holdlborshld = 1 then lbdetected4 = 1; else lbdetected4 = 0;
  if itisdetect4 = 1 and holdlborshld = 2 then shlddetected4 = 1; else shlddetected4 = 0;
  tlbdetected0+lbdetected0; tlbdetected1+lbdetected1; tlbdetected2+lbdetected2; tlbdetected3+lbdetected3; tlbdetected4+lbdetected4;
  tshlddetected0+shlddetected0; tshlddetected1+shlddetected1; tshlddetected2+shlddetected2; tshlddetected3+shlddetected3; tshlddetected4+shlddetected4;
  if itisinc = 1 and holdlborshld = 1 then itislbinc = 1; else itislbinc = 0;
  if itisinc = 1 and holdlborshld = 2 then itisshldinc = 1; else itisshldinc = 0;
  titislbinc+itislbinc; titisshldinc+itisshldinc;
  /* end calculation of dr */
  /* begin calculation of mean time to detect */
  if totala = 1 then incstart = _n_; else incstart = 0;
  if last.setno = 1 or incstart ^= 0 then tincstart = 0;
  tincstart+incstart;
  if tprob_apt0 = 1 and holdtprob_apt0 = 0 then incdetectat0 = _n_; else incdetectat0 = 0;
  if tprob_apt1 = 1 and holdtprob_apt1 = 0 then incdetectat1 = _n_; else incdetectat1 = 0;
  if tprob_apt2 = 1 and holdtprob_apt2 = 0 then incdetectat2 = _n_; else incdetectat2 = 0;
  if tprob_apt3 = 1 and holdtprob_apt3 = 0 then incdetectat3 = _n_; else incdetectat3 = 0;


  if tprob_apt4 = 1 and holdtprob_apt4 = 0 then incdetectat4 = _n_; else incdetectat4 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat0 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat1 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat2 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat3 = 0;
  if first.setno = 1 or totala = 1 then tincdetectat4 = 0;
  if tincdetectat0 = 0 or tincdetectat0 = "" then tincdetectat0+incdetectat0;
  if tincdetectat1 = 0 or tincdetectat1 = "" then tincdetectat1+incdetectat1;
  if tincdetectat2 = 0 or tincdetectat2 = "" then tincdetectat2+incdetectat2;
  if tincdetectat3 = 0 or tincdetectat3 = "" then tincdetectat3+incdetectat3;
  if tincdetectat4 = 0 or tincdetectat4 = "" then tincdetectat4+incdetectat4;
  if tincstart ^= 0 and tincdetectat0 ^= 0 and incvector = 1 then et0 = tincdetectat0 - tincstart; else et0 = 0;
  if tincstart ^= 0 and tincdetectat1 ^= 0 and incvector = 1 then et1 = tincdetectat1 - tincstart; else et1 = 0;
  if tincstart ^= 0 and tincdetectat2 ^= 0 then et2 = tincdetectat2 - tincstart; else et2 = 0;
  if tincstart ^= 0 and tincdetectat3 ^= 0 then et3 = tincdetectat3 - tincstart; else et3 = 0;
  if tincstart ^= 0 and tincdetectat4 ^= 0 then et4 = tincdetectat4 - tincstart; else et4 = 0;
  retain holdet0; retain holdet1; retain holdet2; retain holdet3; retain holdet4;
  if et0 ^= 0 and holdet0 = 0 then ttd0 = et0; else ttd0 = 0;
  if et1 ^= 0 and holdet1 = 0 then ttd1 = et1; else ttd1 = 0;
  if et2 ^= 0 and holdet2 = 0 then ttd2 = et2; else ttd2 = 0;
  if et3 ^= 0 and holdet3 = 0 then ttd3 = et3; else ttd3 = 0;
  if et4 ^= 0 and holdet4 = 0 then ttd4 = et4; else ttd4 = 0;
  totalttd0+ttd0; totalttd1+ttd1; totalttd2+ttd2; totalttd3+ttd3; totalttd4+ttd4;
  output;
  holdet0 = et0; holdet1 = et1; holdet2 = et2; holdet3 = et3; holdet4 = et4;
  holdtotala = totala;
  holdtprob_apt0 = tprob_apt0;
  holdtprob_apt1 = tprob_apt1;
  holdtprob_apt2 = tprob_apt2;
  holdtprob_apt3 = tprob_apt3;
  holdtprob_apt4 = tprob_apt4;
  holdlborshld = lborshld;
run;


/* end calculation of dr, far, isdr and mttd */
/*=======================================================*/
data os_drplusttdbeforesum;
  set os_drfar end=lastobs;
  if titisinc ^= 0 then os_dr0 = (titisdetect0/titisinc)*100; else os_dr0 = 0;
  if titisinc ^= 0 then os_dr1 = (titisdetect1/titisinc)*100; else os_dr1 = 0;
  if titisinc ^= 0 then os_dr2 = (titisdetect2/titisinc)*100; else os_dr2 = 0;
  if titisinc ^= 0 then os_dr3 = (titisdetect3/titisinc)*100; else os_dr3 = 0;
  if titisinc ^= 0 then os_dr4 = (titisdetect4/titisinc)*100; else os_dr4 = 0;
  numincdetected0 = titisdetect0;
  numincdetected1 = titisdetect1;
  numincdetected2 = titisdetect2;
  numincdetected3 = titisdetect3;
  numincdetected4 = titisdetect4;
  numtotalinc = titisinc;
  if titisdetect0 ^= 0 then os_mttd0 = totalttd0/titisdetect0; else os_mttd0 = 0;
  if titisdetect1 ^= 0 then os_mttd1 = totalttd1/titisdetect1; else os_mttd1 = 0;
  if titisdetect2 ^= 0 then os_mttd2 = totalttd2/titisdetect2; else os_mttd2 = 0;
  if titisdetect3 ^= 0 then os_mttd3 = totalttd3/titisdetect3; else os_mttd3 = 0;
  if titisdetect4 ^= 0 then os_mttd4 = totalttd4/titisdetect4; else os_mttd4 = 0;
  keep os_dr0 os_dr1 os_dr2 os_dr3 os_dr4;
  keep os_mttd0 os_mttd1 os_mttd2 os_mttd3 os_mttd4;
  keep numincdetected0 numincdetected1 numincdetected2 numincdetected3 numincdetected4 numtotalinc;
  keep os_isdrpt0 os_isdrpt1 os_isdrpt2 os_isdrpt3 os_isdrpt4;
  keep titislbinc titisshldinc;
  keep tlbdetected0 tlbdetected1 tlbdetected2 tlbdetected3 tlbdetected4;
  keep tshlddetected0 tshlddetected1 tshlddetected2 tshlddetected3 tshlddetected4 varname;
  if lastobs;
run;
data os_farbeforesum;
  set os_far end=lastobs;
  keep os_farpt0 os_farpt1 os_farpt2 os_farpt3 os_farpt4 varname;
  if lastobs;
run;
data os_lbisdrttdsum;
  set os_lbisdrttd end=lastobs;
  keep os_lbisdrpt0 os_lbisdrpt1 os_lbisdrpt2 os_lbisdrpt3 os_lbisdrpt4;
  keep os_lbmttd0 os_lbmttd1 os_lbmttd2 os_lbmttd3 os_lbmttd4;
  if lastobs;


run;
data os_shldisdrttdsum;
  set os_shldisdrttd end=lastobs;
  keep os_shldisdrpt0 os_shldisdrpt1 os_shldisdrpt2 os_shldisdrpt3 os_shldisdrpt4;
  keep os_shldmttd0 os_shldmttd1 os_shldmttd2 os_shldmttd3 os_shldmttd4;
  if lastobs;
run;
data os_summarytest;
  set os_drplusttdbeforesum end=lastobs;
  set os_farbeforesum end=lastobs;
  set os_lbisdrttdsum end=lastobs;
  set os_shldisdrttdsum end=lastobs;
  keep os_shldisdrpt0 os_shldisdrpt1 os_shldisdrpt2 os_shldisdrpt3 os_shldisdrpt4 os_shldmttd0 os_shldmttd1 os_shldmttd2 os_shldmttd3 os_shldmttd4;
  keep os_lbisdrpt0 os_lbisdrpt1 os_lbisdrpt2 os_lbisdrpt3 os_lbisdrpt4 os_lbmttd0 os_lbmttd1 os_lbmttd2 os_lbmttd3 os_lbmttd4;
  keep os_isdrpt0 os_isdrpt1 os_isdrpt2 os_isdrpt3 os_isdrpt4;
  keep os_farpt0 os_farpt1 os_farpt2 os_farpt3 os_farpt4 os_dr0 os_dr1 os_dr2 os_dr3 os_dr4;
  keep os_mttd0 os_mttd1 os_mttd2 os_mttd3 os_mttd4;
  keep numincdetected0 numincdetected1 numincdetected2 numincdetected3 numincdetected4 numtotalinc;
  keep titislbinc titisshldinc;
  keep tlbdetected0 tlbdetected1 tlbdetected2 tlbdetected3 tlbdetected4;
  keep tshlddetected0 tshlddetected1 tshlddetected2 tshlddetected3 tshlddetected4 varname;
  if lastobs;
run;
proc datasets library = work;
  delete os_drplusttdbeforesum; delete os_farbeforesum; delete os_drfar; delete os_isdr;
  delete os_temp; delete os_isdrfartemp; delete os_isdrfartemp2; delete os_far;
  delete os_temp99; delete os_incset; delete os_incdataset; delete os_lbincset; delete os_lbincdataset;
  delete os_shldincset; delete os_shldincdataset; delete os_nonincset;
  delete os_lbisdrttd; delete os_shldisdrttd; delete os_lbisdr; delete os_shldisdr;
  delete os_temptest; delete os_shldisdrttdsum; delete os_lbisdrttdsum;
run;
quit;
proc export data= work.os_summarytest outfile= "c:\i-25 id\os_summarytestlbandshld.xls" dbms=excel2000 replace;
run;
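
The listing above turns each interval's linear predictor into a 0/1 alarm, applies a persistence test, and then accumulates the detection rate (DR), false alarm rate (FAR), incident-state detection rate (ISDR) and mean time to detect (MTTD). As a reading aid, the following Python sketch restates that logic for a single contiguous series of intervals. It is not the author's code: the function names `alarms` and `performance` and the sample data are hypothetical, FAR is taken here over all non-incident intervals rather than over separate non-incident sets, and an incident is counted as detected only if an alarm falls inside it.

```python
import math


def alarms(linear_predictor, threshold=0.5, persistence=0):
    """Return a 0/1 alarm per interval.

    Persistence k requires k + 1 consecutive intervals above the threshold,
    which matches the chained prob_apt0..prob_apt4 tests in the listing.
    """
    out = []
    run = 0  # consecutive intervals whose probability exceeds the threshold
    for eta in linear_predictor:
        eta = max(-20.0, min(20.0, eta))  # same clamp as in the listing
        prob = math.exp(eta) / (1.0 + math.exp(eta))
        run = run + 1 if prob > threshold else 0
        out.append(1 if run >= persistence + 1 else 0)
    return out


def performance(incident, alarm):
    """Compute ISDR, FAR, DR (percent) and MTTD (intervals).

    `incident` and `alarm` are equal-length 0/1 lists. Each maximal run of
    1s in `incident` is treated as one incident (a simplifying assumption).
    """
    n_inc = sum(incident)
    n_non = len(incident) - n_inc
    hit_inc = sum(a for i, a in zip(incident, alarm) if i == 1)
    hit_non = sum(a for i, a in zip(incident, alarm) if i == 0)

    n_incidents = detected = total_ttd = 0
    t = 0
    while t < len(incident):
        if incident[t] == 0:
            t += 1
            continue
        start = t
        while t < len(incident) and incident[t] == 1:
            t += 1
        n_incidents += 1
        hits = [u for u in range(start, t) if alarm[u] == 1]
        if hits:
            detected += 1
            total_ttd += hits[0] - start  # intervals from onset to first alarm

    def pct(num, den):
        return 100.0 * num / den if den else 0.0

    return {
        "isdr": pct(hit_inc, n_inc),
        "far": pct(hit_non, n_non),
        "dr": pct(detected, n_incidents),
        "mttd": total_ttd / detected if detected else 0.0,
    }


# Hypothetical sample: one incident spanning intervals 3..6.
eta = [-3, 1, -2, -1, 2, 3, 1, -4, -3]
incident = [0, 0, 0, 1, 1, 1, 1, 0, 0]
for k in (0, 1):
    print(k, performance(incident, alarms(eta, persistence=k)))
```

With this sample, raising the persistence from 0 to 1 removes the single false alarm but delays detection by one interval, which is the trade-off the five persistence levels in the listing are meant to expose.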


References Abdulhai, B. (1996). "A neuro-genetic-based universally transferable freeway incident detection framework," Ph.D. Dissertation, University of California, Irvine, California. Abdulhai, B., and Ritchie, S. G. (1999). "Enhancing the universality and transferability of freeway incident detection using a Bayesian-based neural network." Transportation Research, 7C(5), 261-280. Ahmed, S. A., and Cook, A. R. "Application of time series analysis techniques to freeway incident detection." Proceedings of the 1982 Annual Meeting of the Transportation Research Board, Washington D.C., 1-9. Aultman-Hall, L., Hall, F. L., Shi, Y., and Lyall, B. "A catastrophe theory approach to freeway incident detection." Proceedings of the Second International Applications of Advanced Technologies in Transportation Engineering Conference, Minnesota, 37377. Balke, K. N., Dudek, C. L., and Mountain, C. E. (1996). "Using probe-measured travel times to detect major freeway incidents in Houston, Texas." Transportation Research Record(1554), 213-220. Blumentritt, C. W. (1981). Guidelines for selection of ramp control systems, Transportation Research Board National Research Council, Washington, D.C. Carvell, J.D. (1997). Freeway management handbook, Federal Highway Administration, Washington, D.C. Castle Rock Consultants(1998). "Denver regional transportation district automatic vehicle location system, in Evaluation Final Report Prepared for the Federal Transit Administration." Cheu, R. L. (1994). "Neural network models for automated detection oflane-blocking incidents on freeways," Ph.D. Dissertation, University of California, Irvine, California. 183

PAGE 200

Cheu, R. L., and Ritchie, S. G. (1995). "Automated detection oflane-blocking freeway incidents using artificial neural networks." Transportation Research, 3C(6), 371-388. Collins, J. F., Hopkins, C. M., and Martin, j. A. (1979). Automatic incident detection: TRRL algorithms HIOCC and PATREG, Transport and Road Research Laboratory, Crowthome, Berkshire. Cook, A. R., and Cleveland, D. E. (1974). "Detection of freeway capacity-reducing incidents by traffic-stream measurements." Transportation Research Record( 495), 1-11. Dia, H., Rose, G., and Monash University. Institute of Transport Studies. (1997). "Development and evaluation of neural network freeway incident detection models using field data." Transportation Research, 5C(5), 313-331. Dudek, C. L., Messer, C. J., and Nuckles, N. B. (1974). "Incident detection on urban freeways." Transportation Research Record(495), 12-24. Efron, B., and Tibshirani, R. (1993). An introduction to the bootstrap, Chapman & Hall, New York. Fambro, D. B., and Ritch, G. P. (1979). "Automatic detection -of freeway incidents during low volume conditions." FHW AITX 79123t21 0-1, Texas Transportation Institute, State Department of Highways and Public Transportation, College Station, Texas. Hastie, T., and Tibshirani, R. (1990). Generalized additive models, Chapman and Hall, London ; New York. Hoeschen, B. (1999). "Freeway incident detection models using automatic vehicle location data from buses," Master Thesis, University of Colorado, Denver. Hsiao, C.-H., Lin, C.T., and Cassidy, M. J. "An application of fuzzy set theory to incident detection." Proceedings of the Seventy-second Annual Meeting of the Transportation Research Board, Washington, D.C., 1-32. Ishak, S., and Al-Deek, H. (1999). "Performance of automatic ANN-based incident detection on freeways." Journal ofTransportation Engineering, 125(4), 281-90. 184

PAGE 201

Ivan, J. N., and Chen, S.-R. (1997). "Incident detection using vehicle-based and fixed location surveillance." Journal a/Transportation Engineering, 123(3), 209-215. lliK and Associates. (1990). "Ramp Metering Computer Control System: Denver and Arapahoe Counties." Project No. IR 25-2 (159) and IR 225-4(31), Division of Highways, State of Colorado, Norcross, Georgia. Jin, X., Cheu, R L., and Srinivasan, D. (2002). "Development and adaptation of construction probabilistic neural network in freeway incident detection." Transportation Research, lOC, 121-147. Levin, M., and Krause, G. M. (1978). "Incident detection: A Bayesian approach." Transportation Research Record(682), 52-58. Lindley, J. A. (1986). "Quantification of urban freeway congestion and analysis of remedial measures." Report RD-87-052, FHW A, U.S. Department of Transportation, McLean, Va. Luk, J. Y. K. a. S., F. Y. C. (1992). "The calibration of freeway incident detection algorithms." Working Document No. WD TE 92/001, Australian Road Research Board ARRB, Vermont South, Victoria, Australia. Payne, H. J., and Tignor, S. C. (1978). "Freeway incident;-_detection algorithms based on decision trees with states." Transportation Research Record(682), 30-37. Peeta, S., and Das, D. "A continuous learning framework for freeway incident detection." Proceedings of the Seventy-Seventh Annual Meeting of the Transportation Research Board, Washington, D.C., 1-46. Persaud, B. N.,.and Hall, F. L. (1989). "Catastrophe theory and patterns in 30-second freeway traffic data : implications for incident detection." Transportation Research, 23A(2), 103-113. .. . PettY, K., California. Dept. ofTransportation, Partners for Advanced Transit and Highways (Calif.), and University of California Berkeley. Institute of Transportation Studies. (1995). 
Freeway service patrol (FSP) 1.1 : the analysis software for the FSP Project, California PATH Program Institute of Transportation Studies University of California Berkeley, Berkeley, California. 185

PAGE 202

Petty, K. F., Noeimi, H., and Sanwal, K. (1996). "The freeway service patrol evaluation project database support programs and accessibility." Transportation Research, 4C(2), 71-85. Petty, K. F., Skabardonis, A., and Varaiya, P. P. "Incident detection with probe vehicles: performance, infrastructure requirements, and feasibility." Proceedings of the Eighth Annual IF ACIIFIPIIFORS Symposium, Chania, Greece, 125-130. Sethi, V., Bhandari, N., Koppelman, F. S., and Schofer, J. L. (1995). "Arterial incident detection using fixed detector and probe vehicle data." Transportation Research, 3C(2), 99-112. Skabardonis, A., Petty, K. F., Bertini, R L., Varaiya, P. P., Noeimi, H., and Rydzewski, D. (1997). "The 1-880 field experiment: analysis of incident data." Transportation Research Record(1603 ), 72-79. Skabardonis, A., Petty, K. F., and Varaiya, P. P. (1999). "Los Angeles: 1-10 field experiment: incident patterns." Transportation Research Record(1683), 22-30. Skabordonis, A., H. Noeimi, K. Petty, D. Rydzewski, P. P. Varaiya and H. Al-Deek. (1995). "Freeway service patrols evaluation." Institute of Transportation Studies, University of California, Berkeley. Stephanedes, Y. J., and Chassiakos, A. 1>: (1993). "Application o_ffiltering techniques for incident detection." Journal of transportation engineering, 119(1), 13-26. , ; I , Teng, H., Martinelli, D. R, and Jiang, P. "Enhanced freeway incident detection utilizing traffic measurements from two contiguous detectors." Proceedings of the Seventy-Seventh Annual Meeting of the Transportation Research Board, Washington; D.C., 1-23. Tsai, J., and Case, E. R (1979). "Deyelopment of freeway incident-:-detection algorithms by using pattern-recognition techniques." Transportation Research Record(722), 113-116. 186