Feb. 2, 2008
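# Read the Burt twin data; the first row holds the column names
burt <- read.table("burt.txt", header = TRUE)
# Make the columns (FostIQ, OwnIQ) available by name
attach(burt)
# Regress FostIQ (outcome) on OwnIQ (predictor)
model <- lm(FostIQ ~ OwnIQ)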
Regression Lingo Alert
We regress the outcome variable (Y) on the predictor variable (X).
coef(model)
(Intercept)       OwnIQ
   9.719491    0.907920
anova(model)
Analysis of Variance Table
Response: FostIQ
          Df Sum Sq Mean Sq F value    Pr(>F)
OwnIQ      1 9250.7  9250.7  169.42 < 2.2e-16 ***
Residuals 51 2784.7    54.6
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Fitted Model
-Drops the error term from the regression equation.
Yvariable-hat-sub-i = (Intercept) + Slope*Xvariable-sub-i
FostIQ-hat-sub-i = 9.72 + 0.91*OwnIQ-sub-i
(The regression equation includes the error term, so its Y is the observed score, not an estimate. In the fitted model the Y is an estimate, so it needs the hat.)
library(lattice)
xyplot(FostIQ~OwnIQ,type=c("p","r"))
R^2
R^2 = SSmodel/SStotal
R^2 = 9251/12035 = .769
*76.9% of the variation in FostIQ is accounted for by variation in OwnIQ, and 23.1% is not.
We cannot tell how the unexplained variation divides among other systematic components, measurement error, and individual variation.
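The R^2 arithmetic above can be checked from the sums of squares in the ANOVA table. This is only a sketch: the names ss_model, ss_resid, ss_total and r_squared are made up for illustration, and the values are typed in from the anova(model) output rather than extracted from the fitted object.

```r
# Sums of squares copied by hand from the anova(model) table
ss_model <- 9250.7   # OwnIQ row
ss_resid <- 2784.7   # Residuals row

ss_total  <- ss_model + ss_resid   # total variation in FostIQ
r_squared <- ss_model / ss_total   # share accounted for by OwnIQ

round(r_squared, 3)       # proportion explained
round(1 - r_squared, 3)   # proportion left unexplained
```

With the model object in the workspace, summary(model)$r.squared should report the same quantity.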
Estimated Residual Variance
How much do the scores vary around the line at each point on it? (Remember that the line represents the means of the potential distributions of Y at each value of X.)
First attempt: sigma-hat^2-sub-Y|X = SSresiduals/n
Corrected for the parameters estimated: sigma-hat^2-sub-Y|X = SSresiduals/(n - number of parameters in the equation)
sigma-hat^2-sub-Y|X = 2785/(53 - 2) = 54.6
SD = sqrt(sigma-hat^2-sub-Y|X) = sqrt(2785/(53 - 2))
sqrt(54.6)
[1] 7.389181
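The same residual SD can be rebuilt from the Residuals row of the ANOVA table, which makes the division by n - 2 explicit. A minimal sketch, assuming the table values are copied by hand; the names df_resid, resid_var and resid_sd are illustrative only.

```r
# Residuals row of the anova(model) table
ss_resid <- 2784.7   # Sum Sq
df_resid <- 51       # Df, i.e. n - 2 with n = 53 pairs

resid_var <- ss_resid / df_resid   # estimated residual variance
resid_sd  <- sqrt(resid_var)       # estimated residual SD

round(resid_var, 1)
round(resid_sd, 2)
```

With the model object available, summary(model)$sigma reports this residual SD directly (R labels it the residual standard error).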
So roughly 95% of FostIQ scores (within 2 SDs) should fall within about 15 points either side of the fitted value at any particular OwnIQ.
9.72+.91*75
[1] 77.97
77.97-14.8
[1] 63.17
77.97+14.8
[1] 92.77
So for an OwnIQ score of 75 we would expect the FostIQ score to fall somewhere between about 63.2 and 92.8 roughly 95% of the time.
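The plus-or-minus 2 SD calculation above can be wrapped in a small function so it can be repeated for other OwnIQ scores. This is a rough sketch, not part of the original analysis: rough_band is a hypothetical helper, and its defaults are the rounded intercept, slope and residual SD from these notes.

```r
# Hypothetical helper: fitted value plus or minus 2 residual SDs.
# Defaults are the rounded estimates used in the notes above.
rough_band <- function(x, intercept = 9.72, slope = 0.91, resid_sd = 7.39) {
  fit <- intercept + slope * x
  c(lower = fit - 2 * resid_sd, fit = fit, upper = fit + 2 * resid_sd)
}

band <- rough_band(75)
round(band, 2)

# With the fitted model in the workspace, the exact interval comes from:
# predict(model, newdata = data.frame(OwnIQ = 75), interval = "prediction")
```

The predict() interval should come out slightly wider than this rough band, because it also allows for uncertainty in the estimated intercept and slope.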