Predictive Modeling of Serious Crime Using County Demographic Data: Stats Final Exam Part 2

Final exam solving predictive modeling problems for crime analysis using demographic data.

Andrew Taylor
Contributor


Stats Final Exam Part 2

Given the data set called County Demographic Information, construct a predictive model for the variable “Total Serious Crime” using some or all of the other variables in the set of data. The model should be mathematically valid, accurate and reliable.

Total Serious Crime is Variable #8.

Other Variables:
#2 Land Area
#3 Total Population
#4 Percent of Population aged 18-34
#5 Percent of Population 65 or over
#6 Number of Active Physicians
#7 Number of Hospital Beds
#9 Percent of High School Graduates
#10 Percent of Population with College Degrees
#11 Percent of Population below poverty level
#12 Unemployment Percent
#13 Per Capita Income
#14 Total Personal Income
#15 Geographic Region

Note: I am omitting the data set to simplify this problem; the following analyses use the dataset described above, and you can assume the math is calculated correctly. I am testing to see if you can identify what analytical techniques may be validly employed and how effective they are in building a model. Variables 2 to 14 are numeric variables and variable 15 is categorical.


Analysis #1

In the given data set, we were asked to determine if an accurate predictive model for Variable #8, Serious Crime, could be found using the attached data. Since Variable 15 was determined to be categorical, regression was not appropriate to use; so I used Analysis of Variance (ANOVA) to examine if there was a significant relationship between Variable 8 and 15. The results (using Systat 13.0) are printed below.

Variables             Levels
VAR(15) (4 levels)    1.000   2.000   3.000   4.000

Dependent Variable    VAR(8)
N                     440
Multiple R            0.110
Squared Multiple R    0.012

Estimates of Effects B = (X'X)^-1 X'Y

Factor      Level    VAR(8)
CONSTANT             28,017.368
VAR(15)     1        -4,931.339
VAR(15)     2        -6,236.627
VAR(15)     3        -1,026.394

Analysis of Variance

Source     Type III SS    df     Mean Squares    F-Ratio    p-Value
VAR(15)    1.795E+010     3      5.985E+009      1.774      0.151
Error      1.471E+012     436    3.374E+009

ANOVA results suggest that Variable 15 is significantly related to Variable 8, but Variable 15 can only explain approximately 15.1% of the variation in Variable 8. Therefore, I conclude that variable 15 is significantly related to variable 8 although variable 15 is only a minor factor in predicting variable 8.
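The entries of the ANOVA table can be cross-checked against one another, which helps when weighing the written conclusion against the printed output. The following is a minimal sketch in Python using only the standard library; it takes the sums of squares and degrees of freedom from the table as given. The variable names and the helper `f_upper_tail` are illustrative assumptions, and the p-value is approximated by numerical integration, not by whatever routine Systat uses internally.

```python
import math

# Sums of squares and degrees of freedom, copied from the printed ANOVA table.
ss_between, df_between = 1.795e10, 3    # VAR(15) row
ss_error, df_error = 1.471e12, 436      # Error row

# Mean squares, F-ratio, and squared multiple R follow from these four numbers.
ms_between = ss_between / df_between
ms_error = ss_error / df_error
f_ratio = ms_between / ms_error
r_squared = ss_between / (ss_between + ss_error)


def f_upper_tail(f, d1, d2, steps=20000):
    """Approximate P(F > f) for an F(d1, d2) distribution (hypothetical helper).

    Uses the identity P(F > f) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1*f),
    and integrates the Beta(d2/2, d1/2) density over [0, x] with Simpson's rule.
    Assumes f > 0 and d2 > 2 so the density is finite at the lower endpoint.
    """
    a, b = d2 / 2.0, d1 / 2.0
    x = d2 / (d2 + d1 * f)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0
        log_value = (a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta
        return math.exp(log_value)

    h = x / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


p_value = f_upper_tail(f_ratio, df_between, df_error)

print(f"Mean square (VAR(15)): {ms_between:.3e}")
print(f"Mean square (Error):   {ms_error:.3e}")
print(f"F-ratio:               {f_ratio:.3f}")
print(f"Squared multiple R:    {r_squared:.3f}")
print(f"Multiple R:            {math.sqrt(r_squared):.3f}")
print(f"Approximate p-value:   {p_value:.3f}")
```

The recomputed values agree with the printed output up to rounding of the sums of squares. The share of variation in Variable 8 associated with Variable 15 is the Squared Multiple R, about 0.012 (roughly 1.2%), while 0.151 is the p-value of the F-test. These are the two figures to compare against the percentage and the significance claim in the written conclusion.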