PAGE 1
A NOVEL APPROACH TO MODELBASED OBJECT DETECTION by John Harris Dukesherer B.A., Michigan State University, 1990 A thesis submitted to the University of Colorado at Denver in partial fulfillment of the requirements for the degree of Master of Science Computer Science 2000
PAGE 2
This thesis for the Master of Science degree by John Harris Dukesherer has been approved by Date
PAGE 3
Dukesherer, John Harris (M.S., Computer Science) A Novel Approach to ModelBased Object Detection Thesis directed by Assistant Professor Christopher E. Smith ABSTRACT The popularity of bicycles as an alternate mode of transportation has prompted transportation agencies to inquire about how bicycles may be efficiently counted on bike paths and roadways. Computer vision has progressed such that the development of a computer vision based, automated bicycle counter is being investigated. This thesis investigates the use of modelbased vision techniques to aid in the development of such a system by creating a hybrid method based on the Hough transform and the directed Hausdorff distance for accurate recognition and detection of bicycles in imagery. Imagery is searched, using a variant of the Hough transform, optimized for the detection of circles, to find potential locations of the wheels of a bicycle. Upon location of a potential wheel in an image, the directed Hausdorff distance is used to verify the existence or non existence for each potential wheel site, through the comparison with an abstract model representation of a bicycle. Research shows the hybrid method out performs either method alone, with better than 95% accuracy. This abstract accurately represents the content of the candidate's thesis. I recommend its publication. Signed Christopher E. Smith lll
PAGE 4
DEDICATION I dedicate this thesis to my family, for their unfaltering understanding and support while I was writing this. Especially to my wife, Margo, without whose love and understanding this would not have been possible. Special thanks to Gram, you are a truly good soul.
PAGE 5
ACKNOWLEDGMENT Many thanks to my advisor, Christopher Smith, without whose knowledge, expertise and leadership, this thesis would never have come to fruition. I would also like to thank Jim, Mike, my classmates and friends, for their support and assistance during the writing of this thesis and many other projects. I would be remiss if I did not extend my gratitude to all my coworkers at Medtronic SNT for their support while I finished this thesis. To all my fellow cyclists, I propose that we now ride our bicycles instead of study them.
PAGE 6
CONTENTS Figures ..................................................................................................................... viii Tables ................................................... : ..................................................................... x Chapter 1. Introduction ........................................................................................................ 1 1.1 Motivation ...................................................................................................... 2 1.2 Problem Definition and Approaches .............................................................. 3 1.3 Thesis Organization ........................................................................................ 6 2. Prior Research .................................................................................................... 8 2.1 Image Decomposition and the Hough Transform .......................................... 8 2.2 Templatebased Matching and the Hausdorff Distance ................................. 9 3. Issues Inherent to Successful Bicycle Detection .............................................. 11 3.1 The Physical Properties of Bicycles ............................................................. 11 3.2 Considering Wheel Specific Bicycle Issues ................................................. 13 3.3 Obtaining Images .......................................................................................... l3 3.4 Templates ..................................................................................................... 16 3.5 Lines Intersecting Bicycle Wheels ............................................................... 17 4. An Approach Using the Directed HausdorffDistance ..................................... 19 4.1 Investigation ofthe Hausdorff Metric .......................................................... 19 4.2 Explanation of the Hausdorff Method .......................................................... 20 vi
PAGE 7
4.3 An Algorithm for the Directed Hausdorff Distance ..................................... 23 4.4 Complexity ................................................................................................... 23 4.5 Results .......................................................................................................... 24 5. Hybridization ofthe Hough and Hausdorff ...................................................... 29 5.1 The HoughHausdorffHybrid ...................................................................... 30 5.2 Algorithm Description .................................................................................. 31 5.3 Complexity ................................................................................................... 32 5.4 Results .......................................................................................................... 32 6. Conclusions ...................................................................................................... 38 6.1 The Directed HausdorffDistance ................................................................. 38 6.2 The Hybrid Approach .................... .............................................................. 38 6.3 Future Work .................................................................................................. 39 References ................................................................................................................ 41 Vll
PAGE 8
FIGURES Figure 2.1 Trek Road ........................................................................................................... 12 2.4 Santa Cruz Mountain Bike ................................................................................. 12 2.2 Softride Road Bike ............................................................................................. 12 2.3 Bianchi Mountain Bike ....................................................................................... l2 3.1 Natural Scene ...................................................................................................... l4 3.2 Threshold of Grayscale Image ............................................................................ 14 3.3 Edge Detected Image .......................................................................................... 15 3.4 Input Image ......................................................................................................... l5 3.3 Current Template ................................................................................................ 17 4.1 Felix Hausdorff (18491942) : ............................................................................ 19 4.1 Single Bicycle Devoid ofBackground ............................................................... 25 4.2 Optimized Template ........................................................................................... 25 4.3 Successful HausdorffBicycle Detection ............................................................ 26 4.4 Background Induced False Positive ................................................................... 28 4.5 Denser Image False Positives ............................................................................. 28 5.1 Cyclist on Auraria Campus ................................................................................ 33 5.2 Hough target for Hybrid Approach .................................................................... 33 Vlll
PAGE 9
5.3 Location of Bicycle ............................................................................................ 34 5.4 Criterium Scene .................................................................................................. 36 5.5 Hough Circle Detection ...................................................................................... 37 5.6 Successfully Detected Bicycle ........................................................................... 37 ix
PAGE 10
TABLES Table 6.1 Comparative Results ........................................................................................... 39 X
PAGE 11
1. Introduction Transportation systems in the United States and worldwide are increasingly complex, require higher expenditures, and are more congested as modem society moves toward urbanization, expansion, and construction of hightechnology highways and public transportation systems. Transportation agencies, such as the Minnesota Department of Transportation, must address issues of cost and complexity by openly advocating the use of alternative transportation as a commuting and transportation alternative. One popular choice among alternative transportation modalities is the increased use of bicycles. In addition to being more environmentally friendly than automobiles, bicycles are often more economically efficient and, in our opinion, a healthier alternative to other forms of travel. Even though many large cities (most notably Los Angeles, Portland, San Francisco and Seattle) have spent a great deal of time and money to construct and maintain bicycle paths and bicycle specific lanes on roadways, there are few studies on how many bicycles actually use these paths and roadways. 1bis lack of comprehensive study is, in our opinion, due to two basic reasons. The first is that bicycle paths are generally a lower priority item on transportation department agendas, being relatively low cost and providing recreational facilities, leading to greater acceptance by the public. The second reason is simply that an effective method for studying utilization and traffic flow for bicycle traffic is not available. 1
PAGE 12
1.1 Motivation A fully automated and reliable bicycle counter is not currently commercially available. For instance, one group developing an automated coWiter reported that under common circmnstances, it had at best a 70% confidence rating. The approach taken by this group was to use motion sensors to activate a video camera, then use a version of frame differencing to detect implied motion. Frame differencing is essentially a comparison of two images (acquired in adjacent time slots), A and B, resulting in an output image, C, devoid of the overlapping information contained in images A and B. Another alternative is the use of a mechanically triggered mechanism, much the same as those used to count automobiles. Mechanically triggered mechanisms (typically, pressure hoses) perform inconsistently when counting automobiles due to what appears to be variable gross vehicle weights and new, lighter weight automobile construction. By nature, bicycles are light weight and cyclist body weight can cause a high degree of variability in overall vehicle weight. This implies that miscounting by existing, commercially available equipment could be accentuated. Another option would be to utilize manual counting by having transportation personnel stand at the side of a bicycle path and count each bicycle as it passes. This is a costly alternative and the amount of error introduced by human counters can be difficult to quantify. What is required is an alternative to cost ineffective manual counting and to current errorprone automatic counters. In 1997, the Minnesota Department of Transportation contracted with several entities to investigate alternative forms of automatic bicycle counting based upon computer vision techniques. This program 2
PAGE 13
was an outgrowth of an earlier MnDoT program that investigated the use of computer vision to identify and track pedestrians at crosswalks. The approaches presented in this thesis are the result of a portion of this investigation into alternative, computer visionbased bicycle counters. The research presented here is intended to aid in the production of a computer visionbased bicycle counting system. The goal is a portable timelapse video system that is used to gather image data and will be analyzed by an imaging system at some later time. The output of the imaging system will be the number of bicycles detected. It is our goal to provide a means to detect bicycles through the use of computer vision techniques that have been adapted to fit this particular application. We believe that computer vision has achieved a level of sophistication that will allow the successful completion of the task at hand. Through the use of computer vision techniques, we expect to provide a successful, fully automated bicycle counter. 1.2 Problem De:fmition and Approaches Automated detection and counting of bicycles using computer vision techniques generally involves imagery taken from a stationary camera with either fullmotion video or timelapse video. In the case oftimelapse video, common techniques from computer vision such as frame differencing or background subtraction may be insufficient to correctly identify objects (e.g. bicycles) in motion. This is mainly due to changes in scene illumination, shadowing, and weather. In timelapse video, the changes in scene illumination and shadowing due to the apparent motion of the sun will create large areas of difference in each image. This method of image acquisition affects the performance of frame differencing and background subtraction techniques, where background subtraction can be thought of as using a 3
PAGE 14
single image of a scene repeatedly to subtract background information from other images acquired at earlier or later times. Computer vision techniques also have difficulties due to the variety of situations and backgrounds that this imagery might include. Typical bicycle paths fall into one of three categories: a) paths for the exclusive use of bicycles; b) paths for a mixed use of bicycles and pedestrians; or c) paths demarcated at the side of streets and highways. These three categories result in a wide variety of objects in the imagery and in the background, including foliage, people, pets, automobiles, trucks, etc. For a system to correctly identify and count bicycle traffic, distinguishing features of a bicycle and/or the cyclist must be found that do not readily match features extracted from other possible objects in the imagery. Two basic approaches would appear to be the most logical for identifying bicycles. The first approach is a geometric decomposition based upon (possibly) unique features of the bicycle such as the wheels, the frame, or the drive train. We use the term geometric decomposition to describe breaking down an image into the basic geometric shapes that are described within that image, or searching an image for the presence of basic geometric shapes. This approach would potentially be a viable and simple method of solving this particular problem. The second approach is to use a templatematching method to match a (possibly abstract) model of a bicycle or cyclist to objects in the gathered imagery. Templatematching, in our case, implies using a base image (template) to compare with a second image to obtain a measure of likeness between the base image and the second image. We will use the terms templatematching and modelmatching interchangeably throughout this thesis. Both approaches have a long history of use in various computer vision applications. Of the available techniques for geometric decomposition, a circular or 4
PAGE 15
elliptical Hough transform for identifying wheels would appear to be the best possible alternative. Techniques that would attempt to identify other bicycle geometry, such as the two triangles that typically compose the frame of a bicycle, would fail given the increasing popularity of composite frames and given the diversity oftraditional steel and aluminum frames. The other approach to this problem would be to use a templateor model matching technique that would attempt to match a bicycle model to image data. Several possibilities exist for model matching. Typically, modelmatching techniques attempt to identify specific objects rather than classes of objects, making these techniques too brittle for application to the bicycle counting problem. What is required is a more generic modelmatching technique that can identify objects that belong to the class of bicycles rather than specific, individual bicycles. Considering the goals of this thesis, a templatematching technique based upon the Hausdorff distance measure would be the best alternative. This method is claimed to be immune to small perturbations, such as weather in imagery and reports indicate that the Hausdorff distance will produce results that are robust against partially occluded objects and outliers that may arise at the contours due to noise or insufficient feature extraction [16]. The method compares two images by measuring the distances from edgedetected pixels ( edgels) in one image to the nearest edgels in the second image. The comparison of images containing a certain object yields a small distance value, whereas the comparison of two images containing different objects yields a larger distance measure. Typically, the distance that determines whether or not a prospective object is a match will be experimentally determined with actual test Imagery. 5
PAGE 16
Neither the Hough Transform nor the Hausdorff distance would appear to be robust enough to use independently to count bicycles. Both techniques have potential pitfalls that could lead to the misidentification of common objects as bicycles (false positives) and the nonidentification of actual bicycles (false negatives). Furthermore, the Hausdorff distance is extremely inefficient, having a runtime complexity in the general case of O(n6). We propose a third possible approach that utilizes both the Hough Transform and the Hausdorff distance in a hybrid algorithm. The Hough Transform is used to find possible bicycle wheels in the imagery. The locations near these potential wheels are then analyzed using the Hausdorff distance to determine if a strong match exists with the bicycle model. Since the Hough is more efficient than the Hausdorff, this improves performance by reducing the number of candidate template positions for which the Hausdorff metric is calculated. We have found this method to be more accurate than the other methods alone. The hybrid method produces the desired results with accuracy above 95%. 1.3 Thesis Organization In the following chapters, we will present the details of the problem of identifying and counting bicycles in imagery, our proposed solutions, and the experimental results that demonstrate the validity of the chosen approach. To begin, we will present an overview of work related to geometric object recognition, described as the use of geometric decomposition to identify members of a class of objects, such as circles or lines. This will be followed by work related to templatebased detection, specifically the use of the Hausdorff distance for template matching. The issues addressed and the conclusions reached will be evaluated and used as basis, where applicable, for the initial design of the solution to the problem of using modelbased detection for finding bicycles in images. 6
PAGE 17
Our discussion of the solution to the problem of detecting bicycles will begin with the issues faced when detecting bicycles, including geometric specifics apparent in imagery of all bicycles, and the assumptions that must be made to facilitate a successful project. Additionally, we will address issues that will be fully explored later in the Approach sections, to clarify our initial discussions related to the solution of this problem. Next the tasks necessary to implement the solution will be described and outlined. This will form the basis for the optimized modelbased detection and hybrid algorithm critical to the creation of a fully automated and reliable bicycle counter. Finally a thesis summary and detailing of the contributions of the thesis to the area of modelbased object detection and computer vision will be presented, followed by details of the next logical steps for research beyond the content ofthis thesis. 1bis includes possible modifications to algoritluns and possible alternative approaches for future study and the conceivable alterations of the basic algoritluns herein. 7
PAGE 18
2. Prior Research Prior research in the field of computer vision for object detection has mainly centered on the decomposition of image elements into known geometric shapes that can be readily identified. The Hough Transform is a wellknown algorithm for ascertaining the presence of geometric shapes in edgedetected images. Others have used templatebased methods to gather information from edgedetected images regarding the presence of more complex shapes produced by objects that are an amalgam of several geometric shapes or objects that do not decompose well into subshapes. The Hausdorff distance is one such solution where a template is compared to an image to find matches for that template in the image. 2.1 Image Decomposition and the Hough Transform The Hough Transform is generally used to find edge elements that conform to known geometric shapes such as circles, ellipses, lines and simple curves, the detection of which is performed through use of the knowledge of the geometric formulas that describe these shapes. Because bicycles contain derivations of simple geometric shapes such as somewhat predictable line configurations that make up the frames and most notably circles i.e., the tires and wheels when viewed from the side, this approach has merit for the problem of detecting bicycles. The transform originates in a proposal by Hough in 1962 [12], and the original proposal has since undergone various extensions and analyses [4, 6]. The key idea is as follows, suppose we are given a set of points and also suppose that we know the parametric form of the curve to which these points belong. We may then determine the parameters of the curve underlying the points in two steps. First, 8
PAGE 19
each edgel in the edgedetected image is used to solve the parametric equation(s) of the curve of interest. Since only one point on the curve is used in the solution, the system of equations is under constrained and many solutions will exist. For instance, if the Hough is used to detect circles, a single point could belong to an infinite number of circles in the image. Each of these possible circles will have a center and a radius from the parametric equations. In the second step, for each possible solution (at a given level of granularity), a vote is registered for that particular curve interpretation. In the circle example, a three dimensional voting matrix is used to collect the votes for each possible circle in the image. The dimensions of the voting matrix are the center of the circle in X andY and the radius of the circle. Cells of the voting matrix that receive a high number of votes demonstrate evidence from the image that supports the existence of that particular circle. Moreover, relative locations of feature points are not a consideration during curve fitting in this process [5]. 2.2 Templatebased Matching and the Hausdorff Distance Central to the problem of pattern recognition in computer vision is the ability to determine the extent to which one shape is like another. The geometric comparison of shapes is a fundamental tool for modelbased object recognition, where many methods used refer to a similarity measure between the model features and features of the image. The Hausdorff distance measures the divergence of a set of features with respect to a reference set of features [3, 16]. Our initial investigation has determined that the use of the Hausdorff distance to measure the extent to which data from an image resembles the data contained in a template image would provide solid results for the problem of detecting bicycles. 9
PAGE 20
Much of the published material available outlining use of the Hausdorff distance concentrates heavily on the optimization of transformations of model sets used in object detection, focusing on instances where the models are allowed to translate and scale with respect to the images [13, 14, 18, 24, 26, 28]. Solid work exists in this area and future research will incorporate some scaling and translation methods to improve template choices. Our approach differs from the presented research in two important ways. First, due to the predictable nature and size of bicycles in imagery and due to the defmed placement of devices for image acquisition, we will perform only fixed point calculations using the Hausdorff distance. Second, our approach makes use of the forward directed Hausdorff distance alone. Calculation of the reverse Hausdorff distance will not provide reliable indications of matching, a topic discussed in more detail in section 3.5. 10
PAGE 21
3. Issues Inherent to Successful Bicycle Detection This chapter explores the issues inherent to detection of bicycles and the tradeoffs that must be considered when designing a working algorithm to accurately detect bicycles. Several issues have already been introduced in Chapter 2 during the discussion of previous work. This section explores the issues and tradeoffs that must be considered when designing an algorithm to detect bicycles in information dense images. Several relevant issues have already been introduced in the earlier discussion of previous work in like areas. Issues that need clarification will be addressed, tradeoffs related to these issues will be offered and a detail of the decisions involved in the intended design is offered. 3.1 The Physical Properties of Bicycles Adopting a computer vision perspective, we can show that bicycles are made up of three distinctly definable objects that could be used separately or together to identify an object in an image as a bicycle: A. The cyclist. The human providing power to the bicycle. Much work has been done to identify the human factors within images. The direction of this thesis will not be focused on the detection of human factors in imagery as variations in the human form and position are largely variable, especially in cycling specific situations; B. The frame. A rigid structure, generally made from steel or aluminum tubing, often arranged in a pattern that consists largely of two distinct 11
PAGE 22
triangles, the front and rear. Because of the nearly endless number of distinct frame configurations available to overthecounter consumers, we have decided to focus our efforts elsewhere, specifically to an area that is more consistent from bicycle to bicycle. Included are several photographs of mass produced, commercially available bicycles to illustrate this point. See Figures 2.1 through 2.4; Figure 2.1 Trek Road Figure 2.4 Santa Cruz Mountain Bike Figure 2.2 Softride Road Bike Figure 2.3 Bianchi Mountain Bike C. The wheel set. Two rims of equal size held fast to hubs at the center of the wheel through the use of spokes. The tires encircle the rims at the outside edge providing the only intended ground contact, directly below the hubs. By definition a bicycle must have two wheels and these two wheels will form the basis for the features that will be used to decide if in fact a bicycle 12
PAGE 23
is present in the image data. Note that the wheel sets of each bicycle in Figures 2.1 through 2.4 are the most consistent portions of hardware from bicycle to bicycle. 3.2 Considering Wheel Specific Bicycle Issues Given the information above, it is necessary to consider possible derivations of the wheel sets that will be used in positively determining whether or not bicycles can be counted on the visual merits of the wheel set alone. Though many different kinds of wheel sets and tires are readily available to cyclists, visual information provided by these wheel sets yields much the same results from bicycle to bicycle. It is assumed from this point forward that countable bicycles will have one of two possible standardized wheel sizes, with mountain bicycles being built around a predominately standard 26 inch diameter rim size and road bicycles being built around a 700c rim size, equating to approximately 29 inches in diameter. Future work will make concessions for smaller wheel sizes and radically different wheel bases such as those found on recumbent bicycles. 3.3 Obtaining Images Initially image information was obtained from stock bicycle photographs to begin prototyping the algorithm. These images were a source of clean digital photographs containing bicycles with no background detail. Later, image data was gathered through the use of a digital camera, mounted to a post to simulate the proposed scenario of a portable timelapse video system as outlined in the introduction, see Figure 3.1. The final set of images were processed in the following manner: 1. Threshold grayscale image, retain data from 140 to 25 5, see Figure 3 .2. 2. Edge detection by Sobel operator, see Figure 3.3 3. Threshold image, retain data from 150 to 255. See Figure 3.4. 13
PAGE 24
Figure 3.1 Natural Scene Figure 3.2 Threshold of Grayscale Image 14
PAGE 25
Figure 3.3 Edge Detected Image Figure 3.4 Input Image 15
PAGE 26
3.4 Templates We conducted extensive research to select a template that would suit the needs of this project by providing a robust model to compare to image data. Our goal is to provide a template that describes the shape of the objects we are searching for, taking into account the expected variation due to differing wheel sizes. We have also attempted to provide a template that would be sparse enough to maintain reasonable performance by limiting the computations necessary to determine the existence or nonexistence of bicycles. We have noted that several objects that might have been included in the template had a poor chance of being detected, such as bicycle frame specific data, causing irregularities in detection. What follows is a description of the current template design and the methods used to obtain the template in use today. The template in use today started as a line detected image of a stock bicycle photo devoid of background detail. Extraneous image information is eliminated, retaining the information derived from the wheels. Finally extraneous edgel information is removed from the data created by the wheels, yielding two arcs describing approximately 3/Sths of a circle each, see Figure 3.3. We add that the template in use today is more zealously cropped at the top, bottom and sides and is presented here in this form to alleviate confusion regarding the image data contained within. 16
PAGE 27
Figure 3.3 Current Template Portions of the circle were removed to account for the fact that many serious cyclists may use panniers, which are bags that hang on racks mounted over the top of one or both wheels. These panniers have a tendency to occlude portions of a wheel set when present in images. Another reason for removing portions of the circle stems from the fact that when pedaling, the cyclists foot has a tendency to occlude portions of the rear wheel of the bicycle. We have constructed our template to include no directional data, therefore we need only one template for our comparisons. The final reason for removing portions of the full circles was based on the early assertion that the template should provide no more information than necessary for the algorithm to make a positive identification of a bicycle. We assert that the template contains enough information to eliminate, as much as possible, false positives. 3.5 Lines Intersecting Bicycle Wheels Bicycle wheels, with few exceptions, appear to be nearly clear of visual information in the areas between the wheel hub and the rim while the wheels are rotating. Our research utilizing images with controlled backgrounds in early testing presented no indication of problems. Image data gathered in uncontrolled circumstances did 17
PAGE 28
present difficulties as extraneous image information was introduced near and especially behind bicycles in images. If the template matches a circle from the image that happens to be crossed by a line induced from some extraneous image artifact, calculation of the reverse Hausdorff distance will return increased value. This increased value could cause a false negative scenario, though the match might well have been good[l8]. 18
PAGE 29
4. An Approach Using the Directed Hausdorff Distance Figure 4.1 Felix Hausdorff (18491942) Our goal has been to develop an algorithm to verify the existence of bicycles in imagery through the use of the directed Hausdorff distance. What follows is the reasoning behind choosing the Hausdorff metric as an image matching tool, a brief explanation of the Hausdorff distance methods, the developed algorithm and our findings. 4.1 Investigation of the Hausdorff Metric Our initial investigation determined that the use of the Hausdorff distance to measure the extent to which data from an image resembles the data contained in a 19
PAGE 30
template image will provide solid results for the problem of detecting bicycles. The Hausdorff distance measures the extent to which each point of a template set lies near some point of an image set, and these sets mostly describe object contours in our application. We will use this distance to determine the degree of resemblance between two objects that are superimposed on one another. 4.2 Explanation of the Hausdorff Method When talking about distance in reference to image matching, we usually mean the shortest distance. If two polygons A and Bare at some distance from each other, we commonly understand that distance to be the shortest distance between any point of A and any point of B. Formally, this is called a minmin function, and the distance D between A and B is D(A,B) = minaEA minbEB (d(a,b)) (4.1) This equation reads, for every point a of A, find its smallest distance to any point b of B, keeping the smallest distance found among all points a. The minmin distance described here carries with it a low amount of informative content, especially with respect to orientation. The Hausdorff distance, in spite of its apparent complexity, captures the subtleties ignored by the shortest distance method described above. Named for Felix Hausdorff, see figure 4.1, the Hausdorff distance can be described as the maximum distance of a set of points to the nearest point in the other set. More formally, the Hausdorff distance from set A to set B is a maxmin function best described as follows. 20
PAGE 31
Given two finite point sets A= {a1 ... ap} and B = {b1 ... ,bq}, the Hausdorff distance is defined as H(A, B)= max (h (A,B), h (B,A)) (4.2) where h(A,B) = maxaEA minhEB II ab II, (4.3) and II is some underlying norm on the points of A and B (e.g., the L2 or Euclidean norm). The function h(A,B) is the directed Hausdorff distance from A to B. This distance identifies the point a E A that is farthest from any point of B, and measures the distance from a to its nearest neighbor in B, using the norm 11"11In effect h(A,B) ranks each point of A based on its distance to the nearest point of B, and then uses the largest ranked point (the most mismatched point of A) as the distance. This noted, if h(A,B) = d, then each point of A must be within distance d of some point in B, and there is also some point of A that is exactly distance d from the nearest point in B, again the most mismatched point. The Hausdorff distance, H(A,B), is the maximum of h(A,B) and h(B,A). Therefore it measures the degree of mismatch between two sets, by measuring the distance of the point of A that is farthest away from any point of B and the distance of the point of B that is farthest away from any point of A. Thusly the Hausdorff distance resembles the directed Hausdorff distance in that each member of A is near some member of B and vice versa. Different from other approaches to shape comparison, there is no explicit pairing of points of A with points of B. Many points of A may be close to the same point of B. 21
PAGE 32
The Hausdorff distance measures the mismatch between two sets that are at fixed positions with respect to one another. In this thesis we are only concerned with measuring the mismatch between these fixed position sets, as the variations in bicycle wheel sizes are not great enough to warrant translation of the template to all relative positions. The Hausdorff distance by nature is asymmetrical. Stated simply the forward distance (h(A,B)) and the reverse distance (h(B,A)) are rarely equal, stemming from the asymmetry of maxmin functions. The property of asymmetry as it relates to the Hausdorff distance, while acceptable and even beneficial to other applications of the Hausdorff metric, is in fact a detriment to this particular project. It is our supposition that the use of the overall Hausdorff distance (i.e. comparing the template to an image and that image to the template) poses repercussions to the problem of identifying bicycles in imagery. We have concentrated our efforts toward the use of the forward Hausdorff distance, h(A,B), where the point set A is the template and B is the image. Large image data sets contain information such as pedestrians, pets, automobiles and semitractor trailers that when operated on by the reverse Hausdorff distance, h(A,B), can produce faulty results for our application. Any item in the imagery that has the potential to create a set of edgels that forms a bisector of a circle derived from a bicycle wheel creates a situation that increases the distance returned by the reverse Hausdorff distance such that a potentially good match could be overlooked. We have ascertained through trial and error that the use of the reverse Hausdorff distance, for this project, is ineffectual and therefore not a necessary component to our approach. 22
PAGE 33
4.3 An Algorithm for the Directed Hausdorff Distance The basic directional Hausdorff distance algorithm used in our approach is defined by the following pseudo code: 1. h(A,B) = 0; 2. for each possible placement of the template (A) on the image (B) 3.1 dmin = oo; 3.2 for each a; of A 4. for each b1 in B overlapped by A 5.1 calculate deucuJ..a;,bJ); 5.2 dmin =min ( deucul...a;,bj), dm;n); 3.3 h(A,B) =max ( h(A,B), dm;n); 4.4 Complexity If images A and B have i by j and m by n pixels, with p and q edgels respectively (p and q have an upper bound of rnn), then: Step 2 O(mn); Step 3.2 O(p); Step 4 O(q); which leads to a runtime complexity of O(rnnpq). If m and n are similar in size, and p and q approach their upper bounds, the complexity would be O(n6). All steps not specifically referenced above are 0(1). 23
PAGE 34
4.5 Results We have found the use of the directed Hausdorff distance, in controlled experimentation, provides admirable results for our investigation to the problem of identifying bicycles. The algorithm proposed above consistently returns small, predictable distances when supplied an image derived from a bicycle. Our research and testing at this point (using controlled imagery) shows a 100 percent success rate for detection of bicycles with 0 instances of false positive returns. The largest negative at the early stage of development is runtime for this algorithm. Extensive loop shorting and template optimization has the effect of lessening the run times to some extent, and optimization efforts are ongoing through our development. Accuracy being paramount, we have made certain concessions, accepting higher runtimes. The binary images, see Figure 4.1, are 640 x 480 pixels, and the template is roughly 385 x 135 pixel image derived in the manner described in chapter 3, see Figure 4.2. The computation of the directed Hausdorff distance is done using an implementation, written in the C programming language, of the algorithm described above. The timings are based on usermode CPU times for a 120 MHz Pentium workstation. Our earliest tests runtimes made clear that speed would be an issue we would battle for the rest of the project. The first successful test, involving no optimization efforts came in at a less than ideal 46 minutes. Efforts to optimize the algorithm showed marked improvement and runtimes were trimmed to an acceptable 15 to 20 minutes. Modifications allowing for best guess template 24
PAGE 35
placement based on perceived travel expectations of a bicycle through imagery and radical pruning of the amount of vertical search space gleaned runtimes as fast as 250 seconds, see Figure 4.3, using an acceptable distance ceiling of 30. The acceptable distance ceiling refers to the largest allowable directed Hausdorff distance measure that will be considered a good match between the template and the image, implying the existence of a bicycle in the image data. Figure 4.1 Single Bicycle Devoid of Background Figure 4.2 Optimized Template 25
PAGE 36
Figure 4.3 Successful Hausdorff Bicycle Detection The results from our early testing, runtime issues not withstanding, provide indication that our algorithm (making use of the directed Hausdorff distance) will provide the desired results for image data acquired from less controlled circwnstances. Using the same manner of computation described above, two templates, differing in only size, are compared to 640 x 480 grayscale, binary images obtained through the use of a pole mounted camera to simulate a portable time lapse recording system, as described in section 3.3. The two template sizes referred to above are the original template, measuring 385 x 135 pixels, and a smaller template, measuring 225 x 100 pixels. The results of the testing at this stage highlight two major drawbacks of our approach. First, nmtimes have more than doubled due to the less predictable areas of the image where bicycles will occur. Even using a version of the fastest 26
PAGE 37
algorithm to this point, the amount of extraneous image information induced from background information has slowed the end result processing times. Repeated tests show processing times from 850 seconds in the best cases to 1150 seconds in more extreme cases. Second, and more detrimental, pixel dense areas created by buildings, trees and shrubbery, and any sizeable group of people has the effect of perturbing our implementation ofthe algorithm above, producing false positive results when we search larger portions ofimagery. To illustrate this point we have include Figure 4.4, where the forward Hausdorff distance is used to search the entire image for instances of bicycles. A smaller template is also compared, in the same fashion, to an image denser with pixel information, used in portions of the testing to find the forward Hausdorff distance algorithm reaction to increased amounts of extraneous clutter, see figure 4.5. Note how a box or boxes demarcate the buildings in the upper right comer of Figures 4.4 and 4.5. Each rectangular box indicates a possible match produced by the Hausdorff distance calculation that is within the acceptable distance ceiling, which was set to 1 00 for use with the smaller template and 200 for the larger template. Clearly, we must find a method to prune large search spaces, allowing better initial placement of the template relative to potential bicycles in the image for future comparisons. Chapter 5 describes the development of a hybrid approach, utilizing a version of the Hough transform in conjunction with the Hausdorff distance to alleviate the issues exposed by our work in this chapter. 27
PAGE 38
Figure 4.4 Background Induced False Positive Figure 4.5 Denser Image False Positives 28
PAGE 39
5. Hybridization of the Hough and Hausdorff In this chapter, we present a revised algorithm utilizing the Hough transform and the directed Hausdorff distance together to detect bicycles in imagery. Our research shows that runtime and false positive recognition problems prevent the use of the directed Hausdorff distance as a lone source, upon which we would have otherwise based our algorithm for use in bicycle detection. A version of the Hough transform, optimized for detection of circles or ellipses is discussed earlier in this thesis as a stand alone method. Others have made use of this method with varying success in the following manner. Assuming images are obtained at an essentially orthogonal viewing angle from the intended target (a bicycle), the Hough transform can be optimized to detect circles, ostensibly locating one or more wheels of a bicycle. Searching an image space for a predictable range of circle sizes, with origins at predictable distances from each other along a horizontal or near horizontal vector h(lS potential, but this method has drawbacks. The limitations of a singular Hough method arise in reference to occlusion of portions of any wheel. As our research into cycling specific issues has revealed, the pedaling motion of a cyclist implies occlusion of a portion of the rear wheel of the bicycle he/she rides. It is our goal for the remainder of this thesis to present a novel method for detecting bicycles making use of the strengths of both the Hausdorff and Hough methods, while attempting to minimize the weaknesses of each approach. 29
PAGE 40
5.1 The HoughHausdorff Hybrid The methods described in the following text will make use of an algorithm designed around the Hough transform as for use in circle detection and the forward Hausdorff distance for template matching. The approach we have chosen is to first limit the search area.in an image to the probable locations of bicycle traffic, without over minimizing the search space. Eliminating searches in the extreme upper and lower bands of the image we have shortened effective runtimes by design. Within the reduced search area we apply the modified form of the Hough transform, optimized to locate circles in binary images. This implementation has the advantage of acting as a first pass fllter, such that if no circle of the appropriate size is present, then the Hausdorff is not called. In the event a circle of the correct proportions is located, the directed Hausdorff distance is used to verify the existence of a possible bicycle, based on the position of the origin of the detected circle. Only at this point is the directed Hausdorff distance calculation performed, and it is done for a small number of possible placements of the template taking into account that the circle located earlier by the Hough transform could possibly be either the front or rear wheel of a bicycle. This hybrid approach is more time dependent on the runtime of the Hough transform than that of the forward Hausdorff distance. 30
PAGE 41
5.2 Algorithm Description The basic algorithm used in our approach is defined by the following pseudo code: 1. for each hparameter. kparameter, and radius that satisfies the parametric equation for the curve of interest (in this case a circle with a radius that falls between radiusmin and radiusmax), register a vote in the cell of the voting matrix corresponding to [hparameter. kparameter. radius] 1.1 for each possible hparameter in the voting matrix 2. for each possible kparameter in the voting matrix 3. for each possible radius in the voting matrix 4. calculate circumference based on radius in the voting matrix; 4.1 if votes > z circumference then { } 5. if h(A,B) < acceptable_distance, then outline the template placement in the image; 5.1 tag the circle to indicate located by Hough; where a; are the edge pixels ( edgels) in A, the centers of the prospective circles are given by the ordered pairs (hparameter, kparameter), the radii of the prospective circles are given by radius, the number of pixels expected in a circle of a specific radius is given by circumference, z is the percentage of pixels needed in circumference to indicate a valid circle, the range of radii for prospective circles is from radius min to radiusmax. the number of votes in a given cell of the voting matrix is votes, and 31
PAGE 42
acceptable_ distance gives the threshold for identifying a match for template A to a section of image B using the directed Hausdorff distance, h(A,B). 5.3 Complexity If images A and B have i by j and m by n pixels, with p and q edgels respectively (p and q have an upper bound ofmn), then: Step 1. O(mn) Step 2. O(m) Step 3. O(n) which leads to leads to a run time complexity O(m2n2), or a complexity of0(n4 ) not inclusive for calculation of the Hausdorff distance. All steps not specifically referenced above are 0(1), except step number 5 which is O(n6 ) for a limited nwnber of calculations, specifically 242 iterations of the directed Hausdorff distance each time it is called. 5.4 Results Results generated through the latest testing show positive results. The use of the Hough transform to prune the search space allows near optimal placement of the template and minimizing the nwnber calculations of the more complex forward Hausdorff distance. Our method for this hybrid approach of the Hough transform and the forward Hausdorff distance operates as follows. Figure 5.1 represents an input image containing a bicycle. The Hough transform as optimized for circles is used 32
PAGE 43
to locate possible sites for wheels in that image and each of the possible sites is recorded, illustrated by the placement of a set of cross hairs, see Figure 5.2. Figure 5.1 Cyclist on Auraria Campus Figure 5.2 Hough target for Hybrid Approach 33
PAGE 44
For each of the possible wheel sites, as illustrated above, the directed Hausdorff distance is calculated for two placements of the template having a cost of 121 calculations per placement. The first placement assumes that the circle detected by the Hough transform relates to the front wheel of a potential bicycle, and the second placement assumes that the detected circle relates to the back wheel of a bicycle. Placement of the template is done through the use of prior knowledge of the origin of each arc representing a wheel in the template. Matching the origin of the circle detected by the Hough transform to the implied origins of each wheel representation in the template, we search from the center out of a 121 pixel grid. If an acceptable distance is returned by the forward Hausdorff distance, it is assumed that a bicycle exists in the current image. Figure 5.3 illustrates the position of the template on the image where the best/smallest distance was reported. The acceptable distance ceiling is set at 50 for this particular example, a distance of 24 is returned within a runtime of 538 seconds Gust over 8 Yz minutes). Figure 5.3 Location of Bicycle 34
PAGE 45
Near optimal placement of the template has drastically reduced the actual runtime of the portion of the C code based on the directed Hausdorff distance. Overall runtimes do still suffer due to the complexity of the Hough transform. Further reductions in run times can be expected by more severely limiting the sizes of the circles the Hough transform searches for and further constraining the search area. Further improvements can be expected through more advantageous choices of template sizes and possibly near perfect placement of the template when comparing it to the image. CPU times for our hybrid approach are calculated given the following inputs: 640 x 480 pixel, gray scale images refined for input using the same methods as described in section 3.3. The template used in the current model varies in size, depending on a best guess for size match at runtime. Selection of templates is critical at this time, as better choices for template size relative to the actual wheel size represented in the images allows a smaller acceptable distance ceiling. Runtimes for the current algorithm have been constantly within the range of 600 to 11 00 seconds, with a confidence level of 1 00 % when the Hough transform identifies a circle. Overall confidence is near 96 % as the Hough has overlooked one potential circle location at this point. The runtime corresponding to the images represented by Figures 5.15.3 is 624 seconds. Again, all timings are based on usermode CPU times for a 120 MHz Pentium workstation. Figures 5.4, 5.5 and 5.6 illustrate the input image, Hough detection phase and the successful detection of a bicycle from a street scene. The runtime for this particular input is 988 seconds (under 16 The acceptable distance ceiling is set to 100 and a distance of 31 is returned. This image, denser with edgel information, increases the runtime of the Hough transform. It is also important to note that the bicycle represented in figure 5.4 is at an angle with the ground other 35
PAGE 46
than 90 degrees adding difficulty to the task of locating possible wheel sites with the Hough transform. The Hough transform does in fact locate two probable wheel locations, see Figure 5.5. Once the Hausdorff distance is calculated, see Figure 5.6, we receive confmnation that a bicycle is present in the image. Of the three methods discussed in this thesis, the Hough transform, the forward Hausdorff distance, and our hybrid approach, the later approach shows the most promise and has given us the best results. Future research should show reduced runtimes as we further optimize our algorithm and the C code it represents. Figure 5.4 Criterium Scene 36
PAGE 47
Figure 5.5 Hough Circle Detection Figure 5.6 Successfully Detected Bicycle 37
PAGE 48
6. Conclusions In this thesis, we have presented two approaches to the problem of using model based methods to locate bicycles in imagery. The first method, the use of the forward Hausdorff distance as a lone tool for comparison of an image to a template shows flaws in practical application. The second method, using the Hough transform to prune the search space, allowing near perfect placement of a template for the directed Hausdorff distance calculations, shows promise for application in creating a fully automated and reliable bicycle counter. It is our opinion that the research contained herein is a valid beginning to the solution of that problem. 6.1 The Directed Hausdorff Distance While use of the directed Hausdorff distance alone for the detection of bicycles is inadequate, we support the use of this method in conjunction with other approaches that can assist in providing informed placement of a template in relation to an object in an image. The new hybrid approach highlights the accuracy of the forward Hausdorff distance for the task of finding and identifying bicycles in imagery, while minimizing possibly negative side effects such as unacceptably large runtimes and false positives due to extraneous image information. 6.2 The Hybrid Approach Our research indicates that the use of the directed Hausdorff distance in conjunction with the Hough transform performs well for the task of bicycle detection in imagery. More robust than either method alone, results indicate that this method has value to add to the end goal of the creation of a computer vision based, 38
PAGE 49
automatic bicycle collllter, suitable for use in many situations. We believe our approach using both the Hough transform and the directed Hausdorff distance is a leap in the right direction to solve the outlined problem. Our work here makes a contribution to the field of computer vision modelbased detection, through investigation of a particularly difficult problem, the accurate detection of bicycles. It is our assertion that hybrid approaches are Wlcommon in current research and more such approaches should be investigated. Method Hausdorff Hough Hybrid Images 23 23 23 Actual Bicycles 17 17 17 False Positive 3 6 0 False Negative 4 12 1 Correct Detection 16 5 22 %Correct 70% 22% 96% Table 6.1 Comparative Results Table 6.1 above provides the results of comparative study between the Hausdorff, Hough and the hybrid methods. Each method was compared to the same 23 images. Seventeen of the images contain bicycles, the remainder are images of automobiles, buses, trucks and pedestrians. The Hausdorff and Hough approaches both suffer from accuracy issues and to this point the HoughHausdorff Hybrid method out performs either independent method. All percentages have been rounded to the nearest whole number. 6.3 Future Work The research, contained in this thesis, is thorough for the scope of the thesis, yet areas for additional study remain. It is our hope that we or others will be able to assist in continuing the work presented herein. The suggestions we make here are 39
PAGE 50
based on our knowledge of the task at hand and it is our wish that those wishing to make use of this research would do so. Our plans for ongoing work related to this thesis are as follows. Currently we plan to pursue testing regarding accuracy in a broader range of conditions, while optimizing of the techniques outlined in Chapter 5 including, but not limited to, optimization of the Hough transform, and template placement for calculation of the forward Hausdorff distance. Plans for the near future include the addition of a template scaling routine and a study for further template optimization. A simple one time scaling operation could be implemented to optimize the template size for a particular view, eliminating guessing and furthering the goal of computer vision based bicycle detection unit. We hope to investigate different techniques to guide placement of the template used in the directed Hausdorff distance. Other model based approaches to image decomposition and feature extraction could provide a lower cost alternative to the Hough transform. Finally, we wish to concentrate our efforts on applying this method to other domains where objectclass members share simple geometric characteristics. 40
PAGE 51
References I. H., Alt, "Matching shapes with a reference point," Technical Report B 9418, Institut fiir lnformatik, Freie UniversiUit Berlin, DE., 1994. 2. H. Alt, B. Behrends, and J. BlOmer, "Approximate matching polygonal shapes," Technical Report B 9310, Institut fiir Informatic, Freie Universitat Berlin, DE., 1992. 3. Atallah, "A linear time algorithm for the Hausdorff distance between convex polygons," Information Processing Letters, 17:207209, 1983. 4. Ballard, "Generalizing the Hough transform to detect arbitrary shapes," Pattern Recognition, 13(2):111122, 1981. 5. Ballard, and, C. Brown, Computer Vision, PrenticeHall Inc., Englewood Cliffs, New Jersey, 1982. 6. Brown, C. M. "Inherent bias and noise in the Hough transform," IEEE Transactions on Pattern Analysis and Machine Intelligence, 5(5):493505, 1983. 7. M. Beauchemin, K. Thomson, and G. Edwards, "On the Hausdorff distance used for the evaluation of segmentation results," Canadian Journal of Remote Sensing, 24(1):38, 1998. 8. Belogay, C. Cabrelli, U. Molter, and R. Shonkwiler, "Calculating the Hausdorff distance between curves," Information Processing Letters, 64:1722. 1997. 9. L. Chew, and K. Kedem, "Getting around a lower bound for the minimum Hausdorff distance," Computational Geometry, Theory and Applications, 10(3):197202, 1998. 41
PAGE 52
10. L. Chew, K. Kedem, and S, Schirra, "On characteristic points and approximate decision algorithms for the minimum Hausdorff distance," Technical Report MPI194150, MaxPlanckInstitut fiir lnformatik, Saarbriicken, DE., 1994. 11. M. Dubuission, and A. Jain, "Modified Hausdorff distance for object matching," Proceedings of !APR International Conference on Pattern Recognition, 566568, 1994. 12. R. Duda, and P. Hart, Pattern Classification and Scene Analysis, John Wiley & Sons, New York, 1973. 13. W. Grimson and D. Huttenlocher, "Analyzing the probability of a false alann for the Hausdorff distance under translation," Proceedings of the ARPA Image Understanding Workshop, II:12571262, 1994. 14. D. Huttenlocher and K. Kedem, "Computing the minimum Hausdorff distance for point sets under translation," Proceedings of the 61h annual ACM Symposium on Computational Geometry, 340349, 1990. 15. D. Huttenlocher, K. Kedem, and J. Kleinberg, "On dynamic Voronoi diagrams and the minimum Hausdorff distance for point sets under Euclidian motion in the plane," Proceedings of the B'h Annual ACM Symposium on Computational Geometry, 110119, 1992. 16. D. Huttenlocher, G. K.landerman, and W. Rucklidge, "Comparing images using the Hausdorff distance," IEEE Transactions on Pattern Analysis and Machine intelligence, 15(9):850863, 1993. 17. D. Huttenlocher, R. Lilien, and C. Olson, "Approximate Hausdorff matching using Eigenspaces," Proceedings of the ARPA Image Understanding Workshop, 11811186, 1996. 18. D. Huttenlocher and W. Rucklidge, "A multiresolution technique for comparing images using the Hausdorff distance," Technical Report TR 921321, Department of Computer Science, Cornell University, NY, 1992. 42
PAGE 53
19. K. Kedem andY. Yarmovski, "Curve based stereo matching using the minimum Hausdorff distance," Proceedings of the 1 ih Annual ACM Symposium on Computational Geometry, 415418, 1996. 20. C. Olson, "Probabilistic formulation for Hausdorffmatching," Proceedings of the IEEE Conference on Vision and Pattern Recognition, 150156, 1998. 21. J. Paumard, and E. Aubourg, "Adjusting astronomical images using a censored Hausdorff distance," Proceedings of the 4'h IEEE International Conference on Image Processing, 111:232, 1997. 22. Preparata and M. Shamos, Computational geometry, an introduction, Springer Verlag, NY, 1985. 23. Rote, "Computing the minimum Hausdorff distance between two point sets on alline Wlder translation," Information Processing Letters, 38:123127, 1991. 24. W. Rucklidge, "Lower bounds "ror the complexity of the Hausdorff distance," Technical Report TR 941441, Department of Computer Science, Cornell University, NY, 1995. 25. W. Rucklidge, "Efficient computation of the minimum Hausdorff distance for visual recognition," Ph.D. thesis, Department of Computer Science, Cornell, University, NY, 1995. 26. W. Rucklidge, "Locating objects using the Hausdorff distance," Proceedings of the 51h Annual /CCV Conference on Computer Vision, 457464, 1995. 27. W. Rucklidge, "Efficient visual recognition using the Hausdorff distance," Lecture Notes in Computer Science, 1173, SpringerVerlag, NY, 1995. 28. W. Rucklidge, "Efficiently locating objects using the Hausdorff distance," International Journal ofComputer Vision, 24(3):251270, 1997. 43
PAGE 54
29. R. Shonkwiler, "An image algorithm for computing the Hausdorff distance efficiently in linear time," Image Processing Letters, 30:8789, 1989. 30. R. Shonkwiler, "Computing the Hausdorff set distance in linear time for any Lp point distance," Information Processing Letters, 3 8:201207, 1991. 31. J. Yi, B. Bhanu, and M. Li, "Targetindexing in SAR images using scattering centers and the Hausdorff distance," Pattern Recognition Letters, 17(11): 11911198, 1996. 32. X. Yi and 0. Camps, "Line featurebased recognition using the Hausdorff distance," Proceedings of the IEEE Symposium on Computer Vision, 7984, 1995. 33. X. Yi and 0. Camps, "Robust occluding contour detection using the Hausdorff distance," Proceedings of the IEEE Conference on Vision and Pattern Recognition, 962967, 1997. 34. X. Yi and 0. Camps, "Linebased recognition using a multidimensional Hausdorff distance," IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(9):901916, 1999. 44
