Data to Policy Project
Program for the Fall 2019 Symposium
Schedule

9:00 am   1st Presentation Set
10:00 am  2nd Presentation Set
11:00 am  Casual Poster Viewing / Social
12:00 pm  Discussion and Awards

Throughout the symposium, help yourself to snacks and visit our feedback board!

The Data to Policy Project Executive Committee thanks everyone for taking part in this important event! Special thanks to this semester's faculty with participating students: Joshua French, Audrey Hendricks & Serena Kim, as well as to our independent study students and the Machine Learning Club.

With gratitude,
Shea Swauger
Mike Ferrara
Diane Fritz
Matt Mariner

Auraria Library | Data to Policy Project
Titles and Abstracts

The projects are listed by their poster location number. There are two presentation times: Odd/Blue: 9:00 am hour | Even/Grey: 10:00 am hour

Characterizing Disparities in Police Lethal Force Based on Mental Illness Symptom Presentation
Kate Fitch

In November 2018, the American Public Health Association named police brutality and violence a public health problem that disproportionately impacts vulnerable populations, including people with mental health conditions. Previous research has demonstrated that, when looking at differences in weapon, location, and whether or not a person is classified as "attacking," people with symptoms of mental illness tend to be less threatening when killed by police compared to people without symptoms. This project collected data on 18 variables that may influence officer perception of threat for 1113 fatal force events in 2018 to create a more accurate measure of threat. The mean and distribution of threat scores were compared between the MI (mental illness) and no-MI groups using the independent-samples t-test and the Mann-Whitney-Wilcoxon test. Additionally, simple logistic regressions were performed to determine odds ratios for specific threatening scenarios.

People with signs of mental illness had significantly lower mean threat scores than people without signs of mental illness (t = 11.61, p-value < .001), and this finding is much more significant than when considering fewer variables. The distribution of threat scores is also significantly different between groups (W = 164804, p-value < .001). In simple logistic regressions, people with mental illness were shown to have lower odds of having harmed police (p-value < .001) or fled the scene (p-value < .001), higher odds of being killed in daytime hours (p-value = .008) and in low-crime (less than 25th percentile) counties (p-value = .028), and no increased odds of having harmed others (p-value = 0.271).
These results continue to support the existence of disparities in the circumstances under which people are killed by police based on mental illness symptom presentation. Interestingly, in these observations, police were aware of a mental health crisis situation in 54% of fatal force incidents against people with mental illness symptoms. Despite this, serious disparities exist, indicating a need for universal de-escalation training coverage and/or police-clinician co-response teams for crisis situations. This could help to reduce fatal outcomes for this population.

Clustering to Improve the Allocation of Drug Rehabilitation Resources
Ian Arriaga-MacKenzie, Christina Ebben, Selvakumar Jayaraman, Nicholas Koprowicz, Javier Pastorino, Swayanshu Pragnya & Evan Stene

This research project focuses on how to allocate drug rehabilitation resources so that we can better aid those impacted by addiction. The Machine Learning Club has student members from a variety of academic fields, and we used our diverse skill sets to propose a way to implement this allocation strategy. Using Denver County crime data, we executed clustering algorithms using the geospatial location of drug arrests, concentrating on the most addictive drugs and specific factors that reflect a need for rehabilitation. Our results map where in Denver County the allocation of rehabilitation resources would be most effective, and we include insight on what type of resources would be most beneficial in different parts of Denver. We conclude the project with suggestions of how clustering algorithms can be applied to other areas of the public sector.

Tracing Violent Crimes: How Socioeconomic Factors Encourage Violent Crimes
Thomas Faut, Kevin Klitchman, Olivia Miller & Alexandria Ronco

Violent crimes such as assault, destruction of property, and rape are unfortunate and oftentimes overlooked tragedies occurring in every county in Colorado.
Can we understand why violent crimes are so abundant and, if so, could we use that understanding to reduce or prevent future tragedies? While each instance is its own story with specific people and circumstances, there are too many occurrences to look at case by case. Instead, it may be more effective to look into which socioeconomic factors support an environment in which violent crimes occur. Taking into account all 64 counties in Colorado, we explore how violent crime rates may be connected with other socioeconomic factors. For each county in 2015, we observed data on economic factors such as employment rates and standard of living, as well as trends in the general population such as percentages of smokers, single-parent households, obesity, etc. We will perform a statistical study comparing all reported crimes in each county in Colorado with each of the aforementioned factors and determine which attributes most significantly influence violent crime frequencies. By determining these significant factors, we hope to provide a new approach, in the form of new policies and awareness, that may ultimately diminish future violent crimes.

Exploring Factors that Affect the CCDF's Average Expenditure on Children
Anqi Hou, Xiaotian Jia, Savannah Murphy & Tianshi Wang

The Child Care and Development Fund (CCDF) is a federal program that helps low-income families access childcare so that parents can work or continue their education, improving children's quality of life and family income. Our study aimed to determine which factors influenced the amount spent per child in CCDF care. We took data from the CCDF and census websites to create a regression model. We used variables that may affect the average CCDF expenditure on children in different states in 2017, such as GDP, unemployment rate, average rent, and average wage. The study found a lever that state governments could use to increase the welfare of children in the CCDF program: a higher number of schools per capita appears to be associated with higher CCDF expenditure per child. This information could be used as a case for increasing the number of schools available to families.

Investigating Neighborhood Characteristics Correlated with Larceny and White Collar Crimes
Dongdong Lu

Crimes differ by type. Intuitively, different types of crime should be distributed differently among the neighborhoods of Denver. In this research, I examined the relationship between neighborhoods' characteristics and their per-person incidence of larceny and of white-collar crime. The method used is to build linear models for the two rates in Denver neighborhoods. The analysis shows that certain factors affect neighborhoods' larceny rates the most, while none of the neighborhood characteristics affect white-collar crime rates. Finally, policy suggestions are provided.

The Factors of Life Expectancy: Examining Demographic and Health Data that Correlate with a Longer Life in Colorado and California
Vickie Handler, Matin Khajeh-Sharafabad, Nhat Pham & Xiao Wang

For this project, we look for the factors that determine life expectancy in California and Colorado. After analyzing the dataset, our group concluded that the best model includes State, Poverty, Lung cancer, Car accident, Stroke, and White race as predictors. Lung cancer, Car accident, and Stroke are directly associated with lower average life expectancy since they relate to the death rate. The State factor plays a significant role in determining average life expectancy. Different states have different policies that could change life expectancy; for example, not all states have the death penalty. Growing up in poverty can lower average life expectancy since many people who live in poverty cannot live a healthy lifestyle. Lastly, our group found that average life expectancy is higher where there are more White residents.

Denver Neighborhood Factors that Contribute to Citizens Spending Over an Hour Commuting Each Day and How Denver Can Lessen this Burden
Megan Duff

I will be studying the effects that Denver neighborhood factors have on citizens' commute times. I will use neighborhood factors such as education levels, rent prices, home value, poverty, race, age, income, and gender to see whether there is any connection between these factors and citizens who commute more than one hour a day. I will use the American Community Survey Nbrhd (2011-2015) data to conduct this analysis.

Age of First Use
Valentinas Sungaila

My goal for this project is to use the 2018 NSDUH data to predict the age at which someone first used marijuana/hashish. Because I have so many variables, I will use a penalized regression.
More specifically, I will use ridge regression (L2 penalty). A useful property of ridge regression is that even if my variables are collinear, I can lower the variance at the cost of some bias and so reduce the MSE of my model. I will also use cross-validation to select the shrinkage parameter with the lowest MSE.

Comparing Models: To see how well my model predicts, I will randomly split my 2018 data into training and test sets. Then I will compare the MSE on my training data with the MSE on my test data and see if they are close to each other. I will also use the predict function in R, which will allow me to create prediction intervals for my model's predictions. If I have time, I might split my data between males and females and see whether my model predicts better for one group than the other.

Predicting Property Tax Delinquency from Housing Characteristics
Quinsen Joel

Residential property tax delinquency is a pervasive and largely unpublicized issue facing communities throughout the U.S. Tax delinquency can signal larger financial and personal issues for families and, in turn, the broader challenges that a county faces in supporting its current residents' needs. One of the major consequences for delinquent owners, foreclosure through tax lien sale, is time-dependent and potentially remediable through advance action. County governments that have policy ambitions to intervene would value the ability to predict the likelihood of tax delinquency early on. This study therefore uses publicly available housing characteristics data to build a predictive model that generates the probability that a given property is or will become tax delinquent. Proper use or further extension of this model should help policy agents direct their efforts towards the most likely tax delinquents. A specific intervention policy is suggested, along with a description of the model's accuracy, usefulness, and future extensions.
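
As a rough illustration of a model that outputs a delinquency probability per property, the sketch below fits a small logistic regression by gradient descent. The abstract does not name the model used, so logistic regression is an assumption here, and the two features (years since last sale in decades, assessed value in $100k), the labels, and all numbers are hypothetical toy values rather than the study's data.

```python
from math import exp


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def predict_proba(features, weights, bias):
    """Estimated probability that a property is delinquent."""
    score = bias + sum(w * v for w, v in zip(weights, features))
    return sigmoid(score)


def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(x, weights, bias) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical toy data: [years_since_sale_decades, assessed_value_100k].
rows = [
    [0.5, 4.0], [1.0, 3.0], [1.5, 3.5], [2.0, 1.0], [2.5, 2.0],
    [3.0, 1.5], [3.5, 2.5], [4.0, 1.0], [1.0, 1.5], [3.0, 3.5],
]
# 1 = delinquent, 0 = not delinquent (made-up labels).
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]

weights, bias = fit_logistic(rows, labels)
p_risky = predict_proba([4.0, 1.0], weights, bias)
p_safe = predict_proba([0.5, 4.0], weights, bias)
print(f"lower-value, long-held property: {p_risky:.2f}")
print(f"higher-value, recently sold property: {p_safe:.2f}")
```

In practice the probabilities would be checked on held-out properties before being used to prioritize outreach.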

Predicting Congressional Election Results Using Campaign Finance Data
Max McGrath

[Abstract unavailable for program]

Numbers by Color: Hate Crime Rates in Red and Blue States
River Bond

This study investigates the rates of hate crimes before and after the 2016 election. In the years after the pivotal 2016 election, hate crimes have risen. Data on these rates and certain demographic regressors are investigated in this study.

The Effect of Food Access and Other Factors on Obesity Levels in Colorado
Nick Koprowicz

Obesity levels are rising across the country at an alarming rate, and research has shown levels to be higher in Black, Hispanic, and low-income populations. It has been proposed that lack of access to healthy food is the cause of this disparity. In this project, we build a linear regression model to examine whether lack of food access is associated with higher rates of obesity in Colorado neighborhoods, and to see what other factors are associated with higher obesity levels. We find that higher obesity levels are associated with a greater Hispanic population, low income, and low access to food. We use our findings to recommend several policies to address this in Colorado.

The Effect of Stress on Premature Birth
Heath Lancaster

This paper analyzes premature births in the Denver area. Premature babies, especially those born very early, often have complicated medical problems in both the long term and the short term. A premature birth is defined as a birth that takes place more than three weeks before the baby's estimated due date. While the causes of premature births are not clear, stress is a known risk factor. The goal of this study is to explore the link between stressful life events and premature births in the Denver area. The study will examine stress by looking at low-income areas and areas with higher crime rates.
Using data from the Auraria Library's Data to Policy Project, this paper analyzes the spatial clustering of premature births in the Denver area and compares it to the spatial clustering of certain crimes and incomes. The data are regional count data for the Denver neighborhoods, and the analysis will use statistical methods suited to areal data.

Gun Incidents Frequency Analysis
Andrei Matveev

With the aim of minimizing potential harm to children from accidental injuries caused by shootings near schools, this project checks whether there is any relationship between the frequency of gun-related incidents and the characteristics of schools and their neighborhoods. Three datasets were merged using the geographic coordinates of gun incidents and schools: a comprehensive record of U.S. gun violence incidents from 2013-2018 (containing more than 239k incidents), the Civil Rights Data Collection for the 2015-16 school year, and ZIP Code Data for Tax Year 2016 (from the Internal Revenue Service). Schools in the chosen states were then grouped into clusters. Initial clustering was done on 20 variables, and gun incident frequencies were calculated for each cluster. After that, a linear model was built with each cluster's virtual coordinates as predictors (in R^20, matching the number of variables) and the gun incident frequency as the response variable. Analysis of the relationship between the predictors and the response will, we hope, help to find the most dangerous areas for schools and to develop proactive measures to minimize the frequency of gun incidents in areas with especially high concentrations of children.

Is Poverty the Only Cause? Investigating Violent Crime Rates in Denver Neighborhoods
Goeun Nam

Research has shown that crime rates are related to poverty level, unemployment rate, and low education.
This project aims to address the causes of violent crime rates in Denver neighborhoods, using the predictors listed above and more, such as the number of marijuana dispensaries per neighborhood. The project uses linear regression, hypothesis testing, and methods of variable selection to predict the rate of violent crime given the characteristics of a Denver neighborhood.

High School Graduation Rates by Colorado County: Investigating the Demographic Factors Correlated with the Graduation Rates of Disadvantaged Groups
Sandra Robles Munoz

The state of Colorado currently boasts a high school graduation rate near 80%, defined as the share of students who graduate within four years of entering ninth grade. However, students belonging to specific instructional programs, such as students with disabilities, limited English proficiency, or economic disadvantages, among others, tend to graduate at lower rates. We look at county-level graduation data for these disadvantaged groups slotted for graduation during the 2011-2012 school year, and analyze the relationship between graduation rates and different factors, such as median income; median home value; demographic breakdowns by race, gender, and age; as well as average CSAP scores during the students' sophomore year of high school. These factors are a small step towards helping us understand what may contribute to graduation rates amongst disadvantaged students, in order to make education policy recommendations.

Weapons Crime in Denver
Malik Odeh

This project focuses on weapons crime in Denver, using the publicly available crime dataset provided by the Denver Police Department. Using point-pattern analysis of case and control data (treating weapons crimes as cases and other crimes as controls), we attempt to determine whether there is clustering of cases versus controls in the data, as well as to locate potential clusters of weapons crime.

Food Deserts and the Impact on Low Income Housing
Ian Arriaga-MacKenzie

The ability to purchase cheap, healthy food is a convenience overlooked by many people. For those who live in designated low-income housing, the choice between fast food and travelling to a grocery store can be difficult. The time cost associated with the distance between your home and where you buy your food can affect where you end up eating, and this may have other implications, such as for long-term health. We examine these so-called "food deserts" in Denver, where it is easier to get fast food than it is to get healthy groceries, and their association with low-income housing projects within the city limits. We also examine possible predictors of food deserts, and identify regions where potential incentives could be explored to entice food distributors to open stores.

Oil and Gas Development and Respiratory Incidents in Colorado
Alex Hegg

It is important to examine possible causes behind various public health concerns. By attempting to determine whether there exists a relationship between respiratory issues in Colorado and oil and gas development, we can work to inform sound public policy designed to address health concerns. In this study we used information from the 5-year 2013-2017 American Community Survey to obtain the population and number of reported asthma cases for all Colorado census tracts.
Additionally, we obtained oil and gas well information from the Colorado Oil and Gas Conservation Commission, including the locations of all wells and when drilling began at each well. Using tools from spatial statistics, we examine whether the locations of wells and the locations of asthma cases appear to be related.

Burglary and Robbery in Denver
Arlin Tawzer

Though similar in concept, burglary and robbery have subtle differences in their definition and in the response from law enforcement. Burglary, while more common, is less threatening than robbery, particularly armed robbery, and one dimension along which the two crimes could differ is their distribution across Denver's neighborhoods. The objective of my analysis is to look at the distributions of the two crimes across Denver to determine whether their patterns differ and, if so, in which locations. I will account for covariates available in the neighborhood characteristics dataset and the 2010 census data (for population numbers). By understanding the difference in spatial patterns between the two crimes, I hope to inform policy surrounding both crime types.

Burglary Patterns in Denver County
Kirk Van Arkel

This analysis examines the Denver crime dataset made available by denvergov.org. The analyses only consider burglaries and thefts from residences in 2014, a total of 3,423 incident locations. Each incident contains 19 variables, such as incident type, date of occurrence, and latitude and longitude, among several others. Thefts and burglaries are aggregated by neighborhood, excluding DIA, leaving us with 77 total Denver neighborhoods. Clusters of theft/burglary are determined by considering the total number of households in each neighborhood as the population at risk.
Methods such as CEPP testing, Besag-Newell testing, and spatial scanning are used to determine the most evident clusters. Finally, neighborhood demographic data such as poverty percentage, vacant household percentage, median age, gender percentage, and more are used to construct a generalized linear model in an attempt to determine the factors that contribute to these clusters of theft and burglary.

Investigating Clusters of Drug and Alcohol Crime in Contrast with Other Types of Crime
Negar Janani

In this analysis we looked solely at crimes and where they happened. Since it is possible that certain categories of crime are more prevalent in particular districts or precincts, we look for clustering of alcohol and drug crimes in contrast with other types of crime in the Denver area.

Identifying High Suicide Rates and Potential Causes in Colorado Counties
Emma Collins

American suicide rates are at an all-time high, and suicide is the tenth leading cause of death in the U.S. According to the American Foundation for Suicide Prevention, in 2017 Colorado had the 11th-highest suicide rate among all 50 states at 20.35 per 100,000 individuals, well above the national average of 14 per 100,000 individuals. In this project we aim to identify high rates at the Colorado county level along with potential causes, and to suggest improvements to decrease these rates. We use Turnbull et al.'s cluster evaluation permutation procedure (CEPP), the Besag-Newell test, and Kulldorff's spatial scan statistic to identify unusually high suicide rates at the county level, and a Poisson generalized linear model to find demographics that contribute to more suicides in certain counties than in others. The project uses publicly available data from the U.S. Census Bureau and the Colorado Department of Public Health and Environment.

Spatial Relationships of Overdose Deaths in Colorado, 2014-2018
Danielle Totten

Deaths from drug overdoses have been increasing nationally, driven in large part by the rise of synthetic opioids. Increased availability of overdose medication and community resources could help to prevent deaths caused by drug overdose. This project will explore the spatial relationships between drug overdose deaths and opioid-related overdose deaths among the counties of Colorado, as well as model the expected number of deaths in a given county based on demographic and socioeconomic factors. This information could help counties and communities better understand factors related to drug overdoses in order to provide resources that help prevent drug overdose deaths.

Factors Which Influence Personal Income
Kaichao Chang, Hanjun Li, Pinnan Liu & Tianyi Wang

Personal income is an essential part of people's well-being.
In this project, we aim to find the factors that have a strong influence on personal income. We analyze 2017 U.S. census microdata from IPUMS-USA using linear regression in R. The results show that the number of people in the household, sex, age, family income, wage income, and employment status are important factors in personal income.

Influential Factors of Nations in Determining Life Expectancy
Emily Novak

The World Health Organization (WHO) is a specialized agency of the United Nations that is concerned with international public health. Over the years 2000-2015, the Global Health Observatory (GHO), a subsidiary of WHO, collected life expectancy data from 137 countries with regard to demographic variables, income composition, mortality rates, the effect of immunization, and human development index. The dataset analyzed in this project covers the year 2015, and a thorough statistical analysis is completed to determine influential factors that prolong or shorten life expectancy. The project then focuses on sensible measures that can be taken to increase life expectancy. For example, measles vaccinations are associated with a higher life expectancy, so a cost-effective action for nations with a low measles vaccination rate is to hold free or low-cost vaccination clinics.

Spatial Analysis of Hate Crimes in Denver, CO
Alexandra Rotondo

According to the Denver Post, Denver, CO, has seen a 16% increase in hate crimes relative to the previous year(s) [1]. Denver Open Data has made hate crime data available through the bias-motivated crimes CSV, which, among other parameters, details the offense description, the group against which the perpetrator is biased, the block location of the incident, and the latitude and longitude coordinates of the location. Using different spatial analysis methodologies, this analysis aims to discern whether or not there exists a cluster of hate crimes in Denver, CO, neighborhoods. Equivalently, it aims to discern whether certain areas within Denver, CO, have a statistically significantly higher incidence of hate crimes, relative to the overall population, than others. The coordinates and block locations were used to aggregate the hate crime data to a neighborhood (regional) level, to which different areal data analysis methodologies can be applied. The methodologies incorporated in this analysis are CEPP, spatial scan statistics, and comparing the observed test statistic to 499 test statistics obtained from data generated under the assumption of constant risk of hate crimes.

Relating this back to public policy, if a cluster of hate crimes is identified, the goal then becomes to answer the following questions: Are there certain groups of people targeted more heavily than others? How can we foster greater inclusivity towards those group(s) of individuals? Are there certain factors of those areas (e.g., a large night-life presence) that may be contributing to the higher incidence of hate crimes? If so, how can public policy be modified to help effectively address this situation?

Childhood Obesity and Park Acreage: Does Increasing Public Park Acreage Decrease Childhood Obesity Rates in the Neighborhoods of Denver, CO?
Jack Wold-McGimsey

This research focuses on whether there is a relationship between park acreage in the neighborhoods of the city and county of Denver, CO, and neighborhood child Body Mass Index (BMI) rates. We hypothesized that there would be a negative correlation between park acreage and childhood BMI, with influences from demographic factors. To test whether a relationship existed, this study employed correlational analysis and linear regression models using childhood BMI data from the Colorado BMI Monitoring System and the "American Community Neighborhood Survey" census data, along with a GIS analysis of Denver neighborhood park acreage. The results appear to indicate that there is not a statistically significant relationship between park acreage and childhood BMI rates in the neighborhoods of Denver. There is a slight, albeit unclear, negative correlation between park acreage and childhood BMI rates, but the linear regression results were largely not statistically significant and did not indicate the existence of a clear relationship between park acreage and child BMI rates. The key conclusions of this research are that more research must be done on Denver neighborhoods to determine with greater accuracy the factors that affect childhood obesity, and that a clear relationship between park acreage and childhood BMI rates is not reliably discernible. The policy implications of these findings are that policy makers and public health initiatives will not be able to rely on a clear and direct relationship between park acreage and childhood BMI rates in Denver neighborhoods, and policy should not be designed with this assumption without further research.
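
For readers unfamiliar with the two techniques named in the abstract above, the following minimal sketch computes a Pearson correlation and a one-predictor least-squares fit. The function names, variable names, and all numbers are hypothetical and invented for illustration; they are not the study's data, and the study's actual models may have included additional demographic predictors.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ols_fit(x, y):
    """Least-squares slope and intercept for the model y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up values for six hypothetical neighborhoods.
park_acres = [5.0, 12.0, 20.0, 33.0, 47.0, 60.0]
child_obesity_pct = [17.0, 15.5, 16.8, 14.9, 16.1, 15.2]

r = pearson_r(park_acres, child_obesity_pct)
slope, intercept = ols_fit(park_acres, child_obesity_pct)
print(f"correlation r = {r:.3f}, r squared = {r * r:.3f}")
print(f"fitted line: obesity_pct = {intercept:.2f} + {slope:.4f} * acres")
```

A negative but weak correlation in a small sample like this one is the kind of result that would need a significance test, and more data, before it could support a policy claim.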