Statistical analysis of the relation between Crime Rate, Education and Poverty: USA, 2009

Sonarika Mahajan
100076

Research Question

This research paper analyses whether the level of education and the level of poverty influence the total crime rate in the United States of America. Descriptive statistics such as mean, standard deviation, variance, histograms and scatter diagrams are combined with a simple linear regression analysis performed on each independent variable separately. The aim is to determine to what extent the two independent variables, i.e. education and poverty, cause fluctuations in the dependent variable, in which direction (direct or inverse), and which of the two independent variables is the better predictor of the crime rate in the USA.

Data description

[Figure: map of the USA; the states selected for this study are highlighted in yellow]

The data used to define the dependent variable include both violent crime (murder and non-negligent manslaughter, forcible rape, robbery, and aggravated assault) and property crime (burglary, larceny-theft, motor vehicle theft, and arson).

The crime statistics used in this study are published by the FBI (Federal Bureau of Investigation), an agency of the United States Department of Justice.

The independent variable that describes education levels in the United States of America is the total number of public high school graduates per state. This data includes students of all ethnicities for the school year 2008-2009. The education universe in this study is the total population of the state.

This data has been collected by the National Center for Education Statistics (NCES), the primary federal entity that collects and analyses education-related data in the U.S. and other countries.

The poverty status of an individual is measured by comparing his/her income to a preset dollar amount known as the poverty threshold. The poverty universe excludes unrelated individuals below the age of 15 and people living in military barracks, institutional group quarters and college dormitories. This data is collected by the U.S. Census Bureau, a leading source of data about America’s people and economy.

All the data collected is cross-sectional, since it was recorded for the same time period (year 2009) across different states. The scale of measurement for all three variables is the ratio scale, since the ratio between two values is meaningful and the scale has a true zero.

Analysis

Mean: The mean, i.e. the average, represents the central value of a data set. The mean value of the crime variable suggests that in the year 2009, reported crimes amounted on average to 3.26% of the population of a state in the sample.

The mean value of the education variable suggests that public high school graduates amounted on average to 1% of the population of a state for the same time period. Similarly, the mean value of the poverty variable suggests that on average 13.54% of the individuals in a state were living below the poverty line.

Standard deviation & Variance: The higher the value of the standard deviation, the greater the dispersion of the data set. Of the three variables, poverty has the highest standard deviation, 2.98.
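The poverty figures quoted in this section can be cross-checked against the percentages listed in Table 1.3. The sketch below is only an illustration: the function name `describe` is hypothetical, the standard deviation uses the n - 1 sample formula, and the skewness uses the adjusted Fisher-Pearson formula, which is an assumption since the paper does not state which formula its software applied.

```python
import math

# Percent below poverty for the 30 sampled states, copied from Table 1.3.
poverty_pct = [
    17.54, 9.03, 16.52, 14.17, 12.90, 9.40, 14.94, 10.36, 11.80, 13.36,
    16.20, 10.97, 21.92, 14.59, 15.11, 12.35, 12.35, 9.37, 17.97, 14.16,
    16.25, 11.65, 14.26, 12.47, 14.22, 17.17, 10.53, 12.31, 12.43, 9.84,
]


def describe(values):
    """Return mean, sample variance, sample standard deviation and skewness."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: sum of squared deviations divided by n - 1.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    std_dev = math.sqrt(variance)
    # Adjusted Fisher-Pearson skewness (assumed formula, not stated in the paper).
    cubed = sum(((v - mean) / std_dev) ** 3 for v in values)
    skewness = n / ((n - 1) * (n - 2)) * cubed
    return mean, variance, std_dev, skewness


mean, variance, std_dev, skewness = describe(poverty_pct)
print(f"mean={mean:.2f} variance={variance:.2f} "
      f"sd={std_dev:.2f} skewness={skewness:.3f}")
```

Under these assumptions the output should be close to the reported mean of 13.54 and standard deviation of 2.98. The same function can be applied to the crime and education percentages from Tables 1.1 and 1.2.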

Because poverty has the largest standard deviation, the percentage of individuals below the poverty level is, in absolute terms, more widely dispersed across the states than the other two variables. Variance is the sum of the squared deviations from the mean divided by n - 1 for a sample. The standard deviation is the square root of the variance, and it is the more convenient measure of dispersion because it is expressed in the same units as the data.

Skewness: This statistic measures the symmetry of a variable's distribution. Crime rate has a skewness of 0.083, which is close to zero, so its distribution is approximately symmetrical. The education variable has a negative skewness of -0.367, which indicates a left-skewed histogram with a longer tail of low values. Poverty, in contrast, has a positive skewness of 0.670, which indicates a right-skewed histogram with a longer tail of high values.

Simple linear regression model:

a. Crime and Education

Y = Dependent variable, Crime
X = Independent variable, Education

The regression model is the equation that describes how y is related to x. This regression equation is:

Crime = β0 + β1 (Education)

From Table 2.4 in the appendix, the estimated regression equation is:

Crime = 6.17 - 2.9 (Education)

This regression equation can be graphed as a straight line with β0 as the intercept and β1 as the slope:

[Figure: graph of the regression line]

Here the slope β1 is negative.

Interpretation of the slope: For every 1 percentage point increase in the share of the population graduating from public high school, the crime rate is estimated to decrease by 2.9 percentage points.

Interpretation of the intercept: If the percentage of high school graduates were zero, the estimated crime rate would be 6.17%. No state in the sample is close to zero, so this value is an extrapolation.

The coefficient of determination or r2: It measures the proportion of the variation in the dependent variable that is explained by the independent variable. From Table 2.2, r2 = 0.181. This states that 18.1% of the variation in crime rate is explained by the regression of crime on education.
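As a quick check, assuming the usual definition of the coefficient of determination as the regression sum of squares divided by the total sum of squares, the values in Table 2.3 of the appendix reproduce the reported figure:

```latex
r^2 = \frac{SS_{\text{regression}}}{SS_{\text{total}}} = \frac{2.784}{15.379} \approx 0.181
```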

Since the r2 value of 0.181 is not close to 1, education does not seem to be a strong predictor of the crime rate in the USA.

Hypothesis testing:

H0: β1 = 0 (education is not a useful predictor of crime)
Ha: β1 ≠ 0 (education is a useful predictor of crime)

Significance level: α = 0.05

According to the rejection rule, the null hypothesis is rejected if p-value ≤ α. From Table 2.4, p-value = 0.019. Since 0.019 ≤ 0.05, we reject the null hypothesis. At the 95% confidence level, there is enough evidence to conclude that education is a useful predictor of crime in the USA, since the slope of the regression line is significantly different from zero.

b. Crime and Poverty:

Y = Dependent variable, Crime
X = Independent variable, Poverty

The regression equation is as follows:

Crime = β0 + β1 (Poverty)

Plugging in the values from Table 3.4, we get:

Crime = 1.819 + 0.107 (Poverty)

This regression equation can be graphed as a straight line with β0 as the intercept and β1 as the slope:

[Figure: graph of the regression line]

Here the slope β1 is positive.

Interpretation of the slope: For every 1 percentage point increase in the share of individuals below the poverty line, the crime rate is estimated to increase by 0.11 percentage points.

Interpretation of the intercept: If the percentage of individuals below the poverty line were zero, the estimated crime rate would be 1.82%.

The coefficient of determination or r2:

From Table 3.2, r2 = 0.191. This states that 19.1% of the variation in crime rate is explained by the regression of crime on poverty.

Hypothesis testing:

H0: β1 = 0 (poverty is not a useful predictor of crime)
Ha: β1 ≠ 0 (poverty is a useful predictor of crime)

Significance level: α = 0.05

According to the rejection rule, the null hypothesis is rejected if p-value ≤ α. From Table 3.4, p-value = 0.016. Since 0.016 ≤ 0.05, we reject the null hypothesis. At the 95% confidence level, there is enough evidence to conclude that poverty is a useful predictor of crime in the USA, since the slope of the regression line is significantly different from zero.
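The two slope tests above can also be run with the critical-value form of the t test, which leads to the same decision as the p-value rule. The sketch below is illustrative only: it uses the coefficients and standard errors reported in Tables 2.4 and 3.4, the helper `slope_t_test` is a hypothetical name, and the critical value 2.048 (two-tailed, α = 0.05, 28 degrees of freedom) is taken from a standard t table rather than from the paper.

```python
# Slopes and standard errors as reported in Tables 2.4 and 3.4.
reported = {
    "Education": {"slope": -2.904, "std_error": 1.167},
    "Poverty": {"slope": 0.107, "std_error": 0.042},
}

N_STATES = 30
DEGREES_OF_FREEDOM = N_STATES - 2  # simple regression: n - 2 = 28

# Two-tailed critical value for alpha = 0.05 and 28 degrees of freedom,
# read from a standard t table (an assumption, not stated in the paper).
T_CRITICAL = 2.048


def slope_t_test(slope, std_error, t_critical=T_CRITICAL):
    """Return the t statistic and whether H0: beta1 = 0 is rejected."""
    t_stat = slope / std_error
    return t_stat, abs(t_stat) > t_critical


results = {name: slope_t_test(**values) for name, values in reported.items()}
for name, (t_stat, reject) in results.items():
    print(f"{name}: t = {t_stat:.3f}, reject H0 = {reject}")
```

Both t statistics exceed the critical value in absolute terms, so both null hypotheses are rejected, in line with the p-values. The poverty t statistic computed this way differs slightly from the 2.570 shown in Table 3.4 because the reported coefficient and standard error are rounded to three decimals.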

Conclusion and recommendations

The results of this study suggest that, across the 30 states sampled for 2009, the crime rate in the USA is directly related to the percentage of people below the poverty line and inversely related to the percentage of public high school graduates. When simple linear regression was performed on each independent variable separately, the coefficient of determination (r2) and the p-value helped to select the variable that was the better predictor of the crime rate in America. Poverty, with an r2 of 19.1%, is a slightly better predictor in this case than education, with an r2 of 18.1%. This is further supported by the lower p-value for poverty (0.016 against 0.019 for education). Even though it can be concluded that poverty is the better predictor of the crime rate in the USA, it still explains only 19.1% of the variation. Much stronger predictors could be used in a study of this kind. GDP, income level, provision of federal aid or the employment rate are a few options to choose from.

Appendix

Table 1.1 Statistics for crimes reported in 30 states of USA.

State | Population | Violent Crime | Property Crime | Total Crime | Percentage of Total Crime
Alabama | 47,08,708 | 21,179 | 1,77,629 | 1,98,808 | 4.22
Alaska | 6,98,473 | 4,421 | 20,577 | 24,998 | 3.58
Arizona | 65,95,778 | 26,929 | 2,34,582 | 2,61,511 | 3.96
California | 3,69,61,664 | 1,74,459 | 10,09,614 | 11,84,073 | 3.20
Colorado | 50,24,748 | 16,976 | 1,33,968 | 1,50,944 | 3.00
Connecticut | 35,18,288 | 10,508 | 82,181 | 92,689 | 2.63
Florida | 1,85,37,969 | 1,13,541 | 7,12,010 | 8,25,551 | 4.45
Hawaii | 12,95,178 | 3,559 | 47,419 | 50,978 | 3.94
Iowa | 30,07,856 | 8,397 | 69,441 | 77,838 | 2.59
Kansas | 28,18,747 | 11,278 | 90,420 | 1,01,698 | 3.61
Michigan | 99,69,727 | 49,547 | 2,82,918 | 3,32,465 | 3.33
Minnesota | 52,66,214 | 12,842 | 1,39,083 | 1,51,925 | 2.88
Mississippi | 29,51,996 | 8,304 | 87,181 | 95,485 | 3.23
Missouri | 59,87,580 | 29,444 | 2,02,698 | 2,32,142 | 3.88
Montana | 9,74,989 | 2,473 | 24,024 | 26,497 | 2.72
Nebraska | 17,96,619 | 5,059 | 49,614 | 54,673 | 3.04
Nevada | 26,43,085 | 18,559 | 80,763 | 99,322 | 3.76
New Jersey | 87,07,739 | 27,121 | 1,81,097 | 2,08,218 | 2.39
New Mexico | 20,09,671 | 12,440 | 75,078 | 87,518 | 4.35
New York | 1,95,41,453 | 75,176 | 3,78,315 | 4,53,491 | 2.32
North Carolina | 93,80,884 | 37,929 | 3,44,098 | 3,82,027 | 4.07
North Dakota | 6,46,844 | 1,298 | 12,502 | 13,800 | 2.13
Oregon | 38,25,657 | 9,744 | 1,13,511 | 1,23,255 | 3.22
Pennsylvania | 1,26,04,767 | 47,965 | 2,77,512 | 3,25,477 | 2.58
South Dakota | 8,12,383 | 1,508 | 13,968 | 15,476 | 1.91
Texas | 2,47,82,302 | 1,21,668 | 9,95,145 | 11,16,813 | 4.51
Virginia | 78,82,590 | 17,879 | 1,91,453 | 2,09,332 | 2.66
Washington | 66,64,195 | 22,056 | 2,44,368 | 2,66,424 | 4.00
Wisconsin | 56,54,774 | 14,533 | 1,47,486 | 1,62,019 | 2.87
Wyoming | 5,44,270 | 1,242 | 14,354 | 15,596 | 2.87

Source: http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2011/crime-in-the-u.s.-2011/tables/table-5

Table 1.2 Statistics for public high school graduates in 30 states of USA.

State | Population | Total Public High School Graduates | Percentage of High School Graduates
Alabama | 47,08,708 | 42,082 | 0.89
Alaska | 6,98,473 | 8,008 | 1.15
Arizona | 65,95,778 | 62,374 | 0.95
California | 3,69,61,664 | 3,72,310 | 1.01
Colorado | 50,24,748 | 47,459 | 0.94
Connecticut | 35,18,288 | 34,968 | 0.99
Florida | 1,85,37,969 | 1,53,461 | 0.83
Hawaii | 12,95,178 | 11,508 | 0.89
Iowa | 30,07,856 | 33,926 | 1.13
Kansas | 28,18,747 | 30,368 | 1.08
Michigan | 99,69,727 | 1,12,742 | 1.13
Minnesota | 52,66,214 | 59,729 | 1.13
Mississippi | 29,51,996 | 24,505 | 0.83
Missouri | 59,87,580 | 62,969 | 1.05
Montana | 9,74,989 | 10,077 | 1.03
Nebraska | 17,96,619 | 19,501 | 1.09
Nevada | 26,43,085 | 19,904 | 0.75
New Jersey | 87,07,739 | 95,085 | 1.09
New Mexico | 20,09,671 | 17,931 | 0.89
New York | 1,95,41,453 | 1,80,917 | 0.93
North Carolina | 93,80,884 | 86,712 | 0.92
North Dakota | 6,46,844 | 7,232 | 1.12
Oregon | 38,25,657 | 35,138 | 0.92
Pennsylvania | 1,26,04,767 | 1,30,658 | 1.04
South Dakota | 8,12,383 | 8,123 | 1.00
Texas | 2,47,82,302 | 2,64,275 | 1.07
Virginia | 78,82,590 | 79,651 | 1.01
Washington | 66,64,195 | 62,764 | 0.94
Wisconsin | 56,54,774 | 65,410 | 1.16
Wyoming | 5,44,270 | 5,493 | 1.01

Source: http://nces.ed.gov/CCD/tables/ESSIN_Task5_f2.asp

Table 1.3 Statistics for individuals below Poverty line in 30 states of USA.

State | Population for whom poverty status is determined | Individuals in poverty | Percent below poverty
Alabama | 45,88,899 | 8,04,683 | 17.54
Alaska | 6,82,412 | 61,653 | 9.03
Arizona | 64,75,485 | 10,69,897 | 16.52
California | 3,62,02,780 | 51,28,708 | 14.17
Colorado | 49,17,061 | 6,34,387 | 12.90
Connecticut | 34,09,901 | 3,20,554 | 9.40
Florida | 1,81,24,789 | 27,07,925 | 14.94
Hawaii | 12,64,202 | 1,31,007 | 10.36
Iowa | 29,05,436 | 3,42,934 | 11.80
Kansas | 27,32,685 | 3,65,033 | 13.36
Michigan | 97,35,741 | 15,76,704 | 16.20
Minnesota | 51,33,038 | 5,63,006 | 10.97
Mississippi | 28,48,335 | 6,24,360 | 21.92
Missouri | 58,18,541 | 8,49,009 | 14.59
Montana | 9,46,333 | 1,43,028 | 15.11
Nebraska | 17,39,311 | 2,14,765 | 12.35
Nevada | 26,06,479 | 3,21,940 | 12.35
New Jersey | 85,31,160 | 7,99,099 | 9.37
New Mexico | 19,68,078 | 3,53,594 | 17.97
New York | 1,90,14,215 | 26,91,757 | 14.16
North Carolina | 90,95,948 | 14,78,214 | 16.25
North Dakota | 6,20,821 | 72,342 | 11.65
Oregon | 37,48,545 | 5,34,594 | 14.26
Pennsylvania | 1,21,65,877 | 15,16,705 | 12.47
South Dakota | 7,82,725 | 1,11,305 | 14.22
Texas | 2,41,76,222 | 41,50,242 | 17.17
Virginia | 76,23,736 | 8,02,578 | 10.53
Washington | 65,30,664 | 8,04,237 | 12.31
Wisconsin | 54,95,845 | 6,83,408 | 12.43
Wyoming | 5,29,982 | 52,144 | 9.84

Source: http://www.census.gov/compendia/statab/cats/income_expenditures_poverty_wealth/income_and_poverty--state_and_local_data.html

Regression (Independent variable: Education)

Table 2.1 Variables Entered/Removed (b)

Model | Variables Entered | Variables Removed | Method
1 | Education (a) | . | Enter

a. All requested variables entered.
b. Dependent Variable: Crime

Table 2.2 Model Summary

Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .425 (a) | .181 | .152 | .67068

a. Predictors: (Constant), Education

Table 2.3 ANOVA (b)

Model | | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 2.784 | 1 | 2.784 | 6.189 | .019 (a)
 | Residual | 12.595 | 28 | .450 | |
 | Total | 15.379 | 29 | | |

a. Predictors: (Constant), Education
b. Dependent Variable: Crime

Table 2.4 Coefficients (a)

Model | | Unstandardized Coefficients: B | Unstandardized Coefficients: Std. Error | Standardized Coefficients: Beta | t | Sig.
1 | (Constant) | 6.165 | 1.173 | | 5.257 | .000
 | Education | -2.904 | 1.167 | -.425 | -2.488 | .019

Regression (Independent variable: Poverty)

Table 3.1 Variables Entered/Removed (b)

Model | Variables Entered | Variables Removed | Method
1 | Poverty (a) | . | Enter

a. All requested variables entered.
b. Dependent Variable: Crime

Table 3.2 Model Summary

Model | R | R Square | Adjusted R Square | Std. Error of the Estimate
1 | .437 (a) | .191 | .162 | .66665

a. Predictors: (Constant), Poverty

Table 3.3 ANOVA (b)

Model | | Sum of Squares | df | Mean Square | F | Sig.
1 | Regression | 2.935 | 1 | 2.935 | 6.604 | .016 (a)
 | Residual | 12.444 | 28 | .444 | |
 | Total | 15.379 | 29 | | |

a. Predictors: (Constant), Poverty
b. Dependent Variable: Crime

Table 3.4 Coefficients (a)

Model | | Unstandardized Coefficients: B | Unstandardized Coefficients: Std. Error | Standardized Coefficients: Beta | t | Sig.
1 | (Constant) | 1.819 | .575 | | 3.162 | .004
 | Poverty | .107 | .042 | .437 | 2.570 | .016
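The Model Summary figures in Tables 2.2 and 3.2 follow from the sums of squares in the ANOVA tables. The sketch below recomputes them as a consistency check, assuming the standard formulas for a regression with one predictor and 30 observations. The function name `model_summary` and the dictionary layout are hypothetical, and small differences in the last decimal are expected because the inputs are rounded.

```python
import math

N_STATES = 30  # number of states in the sample

# Sums of squares as reported in Tables 2.3 and 3.3.
anova = {
    "Education": {"ss_regression": 2.784, "ss_residual": 12.595},
    "Poverty": {"ss_regression": 2.935, "ss_residual": 12.444},
}


def model_summary(ss_regression, ss_residual, n=N_STATES, predictors=1):
    """Recompute Model Summary statistics from ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    df_residual = n - predictors - 1
    ms_regression = ss_regression / predictors
    ms_residual = ss_residual / df_residual
    return {
        "r_square": ss_regression / ss_total,
        # Adjusted R square compares residual and total mean squares.
        "adjusted_r_square": 1 - ms_residual / (ss_total / (n - 1)),
        "f": ms_regression / ms_residual,
        "std_error_of_estimate": math.sqrt(ms_residual),
    }


summaries = {name: model_summary(**ss) for name, ss in anova.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

For both models the recomputed R Square, Adjusted R Square, F and Std. Error of the Estimate agree with the tables to within rounding, which suggests the regression output was transcribed consistently.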