Correlation
A quantitative relationship between two interval or ratio level variables
Examples of variable pairs (x, y):
x = Hours of Training, y = Number of Accidents
x = Shoe Size, y = Height
x = Cigarettes smoked per day, y = Lung Capacity
x = Score on SAT, y = Grade Point Average
x = Height, y = IQ
What type of relationship exists between the two variables, and is the correlation significant?
Correlation
Correlation measures and describes the strength and direction of the relationship.
Bivariate techniques require two variable scores from the same individuals (dependent and independent variables).
Multivariate techniques apply when there are two or more independent variables (e.g., the effect of advertising and prices on sales).
Variables must be on a ratio or interval scale.
[Scatter plots: Accidents vs. Hours of Training; GPA vs. Math SAT; IQ vs. Height]
No linear correlation, but non-linear!
Correlation Coefficient r
A measure of the strength and direction of a linear relationship between two variables
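The value of r ranges from -1 to 1.
If r is close to 1 there is a strong positive linear correlation; if r is close to -1 there is a strong negative linear correlation.
If r is close to 0 there is no linear correlation.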
Outliers
Outliers are dangerous: a single extreme point can substantially change the value of r.
Application
Absences (x) and final grade (y) for seven students:

x (Absences)       8   2   5  12  15   9   6
y (Final Grade)   78  92  90  58  43  74  81

[Scatter plot: Final Grade vs. Absences]
Computation of r
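       x     y     xy     x²      y²
1      8    78    624     64    6084
2      2    92    184      4    8464
3      5    90    450     25    8100
4     12    58    696    144    3364
5     15    43    645    225    1849
6      9    74    666     81    5476
7      6    81    486     36    6561
Σ     57   516   3751    579   39898

$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} = \frac{7(3751) - (57)(516)}{\sqrt{7(579) - 57^2}\,\sqrt{7(39898) - 516^2}} \approx -0.975$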
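The column sums and the value of r can be checked with a short script. This is a minimal sketch in plain Python, assuming the usual computational formula for the Pearson correlation coefficient (restated in the comments); the variable names are illustrative and not taken from the slides.

```python
import math

# Absences (x) and final grades (y) from the Application slide.
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]
n = len(x)

# Column sums used by the computational formula.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# r = (n*Sxy - Sx*Sy) / (sqrt(n*Sx2 - Sx^2) * sqrt(n*Sy2 - Sy^2))
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 3))
```

The sign of r is negative: final grades fall as absences rise.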
Test of Significance
The correlation between the number of times absent and the final grade is r = -0.975. There were seven pairs of data. Test the significance of this correlation. Use α = 0.01.
Rejection Regions
Critical values: ±t0 = ±4.032 (two-tailed test, α = 0.01, d.f. = n - 2 = 5)
df\p   0.40      0.25      0.10      0.05      0.025     0.01      0.005     0.0005
1      0.324920  1.000000  3.077684  6.313752  12.70620  31.82052  63.65674  636.6192
2      0.288675  0.816497  1.885618  2.919986  4.30265   6.96456   9.92484   31.5991
3      0.276671  0.764892  1.637744  2.353363  3.18245   4.54070   5.84091   12.9240
4      0.270722  0.740697  1.533206  2.131847  2.77645   3.74695   4.60409   8.6103
5      0.267181  0.726687  1.475884  2.015048  2.57058   3.36493   4.03214   6.8688
[Figure: t-distribution with rejection regions beyond -4.032 and +4.032]
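The decision step can be sketched in a few lines of Python. This assumes the standard t-test for a correlation coefficient, with null hypothesis ρ = 0 and test statistic t = r / sqrt((1 - r²) / (n - 2)); the slides do not show the formula, so treat it as an assumption. The critical value is hard-coded from the t-table (d.f. = 5, one-tail column 0.005), and the variable names are illustrative.

```python
import math

r = -0.975          # sample correlation (negative: grades fall as absences rise)
n = 7               # number of data pairs
t_critical = 4.032  # two-tailed critical value for alpha = 0.01, d.f. = 5

df = n - 2
# Standardized test statistic for H0: rho = 0.
t = r / math.sqrt((1 - r ** 2) / df)

# Reject H0 when t falls in either rejection region.
reject_null = abs(t) > t_critical
print(df, round(t, 2), reject_null)
```

Under these assumptions t is about -9.81, which lies well inside the left rejection region, so the correlation is significant at α = 0.01.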
The equation of a line may be written as y = mx + b, where m is the slope of the line and b is the y-intercept.
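The line of regression is: $\hat{y} = mx + b$
The slope m is: $m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$
The y-intercept is: $b = \bar{y} - m\bar{x} = \frac{\sum y}{n} - m\,\frac{\sum x}{n}$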
(x_i, y_i) = a data point
[Scatter plot with regression line: revenue vs. Ad $]
Write the equation of the line of regression with x = number of absences and y = final grade.

x    8   2   5  12  15   9   6
y   78  92  90  58  43  74  81

Σx = 57, Σy = 516, Σxy = 3751, Σx² = 579, Σy² = 39898
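Calculate m and b.
m = (7(3751) - (57)(516)) / (7(579) - (57)²) = -3155 / 804 ≈ -3.924
b = 516/7 - (-3.924)(57/7) ≈ 105.667
ŷ = -3.924x + 105.667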
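The slope and intercept can be reproduced from the raw data. The following is a minimal sketch in plain Python, assuming the least-squares formulas m = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and b = ȳ - m·x̄; the variable names are illustrative.

```python
# Absences (x) and final grades (y).
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)

# Least-squares slope, then the intercept from the two means.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = sum_y / n - m * sum_x / n

print(round(m, 3), round(b, 2))
```

With the unrounded slope the intercept comes out near 105.668; the value 105.667 follows from using the rounded slope -3.924 in the intercept formula.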
[Scatter plot with regression line: Final Grade vs. Absences]
Predicting y Values
The regression line can be used to predict values of y for values of x falling within the range of the data.
The regression equation for number of times absent and final grade is:
ŷ = -3.924x + 105.667
Use this equation to predict the expected grade for a student with (a) 3 absences (b) 12 absences
(a) ŷ = -3.924(3) + 105.667 = 93.895
(b) ŷ = -3.924(12) + 105.667 = 58.579
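The restriction to x values within the range of the data can be made explicit in code. This is a small sketch, not part of the original slides: `predict_grade` is a hypothetical helper, and the bounds 2 and 15 are simply the smallest and largest number of absences in the sample.

```python
M = -3.924             # slope of the regression line
B = 105.667            # y-intercept of the regression line
X_MIN, X_MAX = 2, 15   # range of absences observed in the data


def predict_grade(absences):
    """Return the predicted final grade, refusing to extrapolate."""
    if not X_MIN <= absences <= X_MAX:
        raise ValueError("absences outside the range of the data")
    return M * absences + B


grade_a = predict_grade(3)    # (a) 3 absences
grade_b = predict_grade(12)   # (b) 12 absences
print(round(grade_a, 3), round(grade_b, 3))
```

Raising an error for out-of-range inputs is one design choice among several; returning the prediction with a warning would also be reasonable.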
The correlation coefficient of number of times absent and final grade is r = -0.975. The coefficient of determination is r² = (-0.975)² = 0.9506.
Interpretation: About 95% of the variation in final grades can be explained by the number of times a student is absent. The other 5% is unexplained and can be due to sampling error or other variables such as intelligence, amount of time studied, etc.
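The "explained variation" reading can be checked numerically. The sketch below, in plain Python with illustrative variable names, fits the least-squares line and computes 1 - (unexplained variation / total variation). For a simple linear regression with an intercept this ratio equals r², which is the assumption being illustrated here.

```python
# Absences (x) and final grades (y).
x = [8, 2, 5, 12, 15, 9, 6]
y = [78, 92, 90, 58, 43, 74, 81]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
m = sxy / sxx
b = mean_y - m * mean_x

# Total variation in y and the part left unexplained by the line.
total_variation = sum((v - mean_y) ** 2 for v in y)
unexplained = sum((v - (m * u + b)) ** 2 for u, v in zip(x, y))

r_squared = 1 - unexplained / total_variation
print(round(r_squared, 2))
```

The result is about 0.95. It differs from 0.9506 in the fourth decimal place only because the slide squares the rounded value of r.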