
Jan 26, 2009


Mathematical and Statistical Models


Mathematical Models


-Deterministic models: a mathematical model has no error term, so the inputs fully determine the output

Statistical Model


-Statistical models allow for error and use probability to describe it
-Allow for other systematic components that in many cases are not included or were not measured.
-Allow for measurement error (especially in the social sciences)
-Allow for individual variation within the unit of analysis that we are analyzing.
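
The contrast between the two kinds of model can be sketched in a few lines of R. This is only an illustration on simulated data: the intercept 2, the slope 3, and the error standard deviation 2 are made-up values, not anything from the course data.

```r
set.seed(1)                      # make the simulated noise reproducible
x <- 1:10

# Deterministic (mathematical) model: y is fully determined by x
y_det <- 2 + 3 * x

# Statistical model: the same systematic part plus a random error term
y_stat <- 2 + 3 * x + rnorm(length(x), mean = 0, sd = 2)

# Deviations from the line 2 + 3x: exactly zero in the deterministic case,
# scattered around zero in the statistical case
dev_det  <- y_det  - (2 + 3 * x)
dev_stat <- y_stat - (2 + 3 * x)
max(abs(dev_det))
round(dev_stat, 2)
```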

Goals of Creating Statistical Models


1.Identify systematic components
2.Assess the model fit (looking at residuals [good model=smaller residuals])
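
As a rough sketch of "good model = smaller residuals", the snippet below compares the residual sum of squares of a model with no predictor against one that includes the systematic component. The data are simulated and all parameter values are assumptions chosen for illustration.

```r
set.seed(2)
x <- rnorm(50, mean = 100, sd = 15)        # simulated predictor
y <- 20 + 0.8 * x + rnorm(50, sd = 8)      # systematic part + error

fit0 <- lm(y ~ 1)   # intercept only: ignores the systematic component
fit1 <- lm(y ~ x)   # includes the systematic component

# Residual sum of squares for each model; smaller means a closer fit
rss0 <- sum(resid(fit0)^2)
rss1 <- sum(resid(fit1)^2)
c(intercept_only = rss0, with_predictor = rss1)
```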

How Do We Use Statistical Models?


Articulate Research Questions -> Outcome Variables, Focal (important) Predictors, Covariates (account or control for)
Postulate the statistical model (What is it going to look like?)
...fitting model to sample data
Determine if relationship is due to chance -> Does the model really work in the population or is it by chance?
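
A minimal sketch of these steps in R, using only the six rows that head(burt) prints in these notes. The full file has 53 observations, so the numbers this produces are illustrative and are not the course results. Treating FostIQ as the outcome is an assumption that follows the xyplot call used later in the notes.

```r
# First six rows only, copied from the head(burt) output
burt6 <- data.frame(
  OwnIQ  = c(68, 71, 73, 75, 78, 79),
  FostIQ = c(63, 76, 77, 72, 71, 75)
)

# Postulate the model: FostIQ = b0 + b1 * OwnIQ + error,
# then fit it to the sample data
fit <- lm(FostIQ ~ OwnIQ, data = burt6)
coef(fit)

# Could the relationship be due to chance? Inspect the slope's
# estimate, standard error, and p-value
coefs <- summary(fit)$coefficients
coefs["OwnIQ", c("Estimate", "Std. Error", "Pr(>|t|)")]
```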

Regression is all about relationships and associations


-Causality can only be determined through the design of the study, not through analysis.
-Analysis discovers associations, correlations or covariation.
> burt <- read.table("burt.txt", header = T)
> head(burt)
  ID OwnIQ FostIQ
1  1    68     63
2  2    71     76
3  3    73     77
4  4    75     72
5  5    78     71
6  6    79     75
> attach(burt)
It is always a good idea to begin an analysis with descriptive statistics and plots.
-The "psych" package has to be installed once before library(psych) will load it
> library(psych)
> describe(OwnIQ)
  var  n  mean    sd median trimmed   mad min max range skew kurtosis   se
1   1 53 97.36 14.69     96      97 14.83  68 131    63 0.24    -0.47 2.02
> describe(FostIQ)
  var  n  mean    sd median trimmed   mad min max range  skew kurtosis   se
1   1 53 98.11 15.21     97   98.21 16.31  63 132    69 -0.02     -0.5 2.09
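Two columns of the describe() output can be checked by hand: se is the standard deviation divided by the square root of n, and range is max minus min. The short check below uses only the rounded values printed above, so it reproduces them to two decimals.

```r
n <- 53                          # sample size from the describe() output

# Standard error of the mean = sd / sqrt(n)
se_own  <- round(14.69 / sqrt(n), 2)   # OwnIQ:  2.02
se_fost <- round(15.21 / sqrt(n), 2)   # FostIQ: 2.09

# Range = max - min
range_own  <- 131 - 68                 # OwnIQ:  63
range_fost <- 132 - 63                 # FostIQ: 69

c(se_own = se_own, se_fost = se_fost,
  range_own = range_own, range_fost = range_fost)
```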
Remember from 8261: a kernel density plot is better than a histogram.
-There are actually some better ways to plot now than using the base plot command: plot()
> library(lattice)
-the "lattice" library has a LOT of plot styles
A density plot in lattice is -> densityplot(variable,kernel="e")
-the argument is lowercase "kernel" (R is case-sensitive); "e" is short for the Epanechnikov kernel
> densityplot(OwnIQ,kernel="e")
-all of the "lattice" graphics functions allow you to enter a formula
> densityplot(~OwnIQ,data=burt,kernel="e")
> densityplot(FostIQ,kernel="e")
> densityplot(~FostIQ,data=burt,kernel="e")
-in "lattice" library histogram ->histogram()
-in "lattice" library boxplot -> bwplot()
> bwplot(OwnIQ)
> histogram(OwnIQ)
> densityplot(OwnIQ,kernel="e")
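To see what the kernel density estimate is doing, here is a small sketch that computes one with base R's density() and draws the lattice version. The values in iq are simulated stand-ins (the mean of 97 and sd of 15 are assumptions loosely based on the describe() output), not the burt data. The name is spelled out as "epanechnikov" to make the abbreviation explicit.

```r
library(lattice)

set.seed(3)
iq <- rnorm(53, mean = 97, sd = 15)   # simulated stand-in for an IQ variable

# Base R: estimate the density on a grid with the Epanechnikov kernel
d <- density(iq, kernel = "epanechnikov")

# A density should integrate to about 1; approximate the area on the grid
area <- sum(d$y) * diff(d$x[1:2])
area

# lattice: the same kernel through the formula interface, with a rug of points
p <- densityplot(~ iq, kernel = "epanechnikov", plot.points = "rug")
print(p)
```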
-in "lattice" scatter plot -> xyplot(formula, type="p")
-formula -> Y~X where Y will be plotted on the Y axis and X on the X axis
> xyplot(FostIQ~OwnIQ,type="p")
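A hedged variation on the call above: passing data= avoids the need for attach(), and adding "r" to type overlays a least-squares line on the points. The data frame burt6 is a hypothetical name holding only the six rows shown by head(burt), so the plot is illustrative rather than the full 53-observation picture.

```r
library(lattice)

# First six rows only, copied from the head(burt) output
burt6 <- data.frame(
  OwnIQ  = c(68, 71, 73, 75, 78, 79),
  FostIQ = c(63, 76, 77, 72, 71, 75)
)

# FostIQ goes on the Y axis (left of ~), OwnIQ on the X axis (right of ~);
# type = c("p", "r") draws the points plus a fitted regression line
p <- xyplot(FostIQ ~ OwnIQ, data = burt6, type = c("p", "r"),
            xlab = "OwnIQ", ylab = "FostIQ")
print(p)
```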

Five things to look for in a scatter plot


1. What is the direction of the relationship?
2. What is the type of relationship? (Is it linear?)
3. What is the strength of the relationship? (Are the points close or far from the line?)
4. What is the magnitude of the relationship? (Line slope?)
5. Are there any unusual observations? (not necessarily outlier)
-In a deterministic relationship all data points are on the line, but there are different types of error in social science, so our plots start to look like clouds.
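
Several of the five questions have rough numerical counterparts, sketched below on simulated data (all parameter values are assumptions). These numbers complement the plot and do not replace it. The cutoff of 2 for standardized residuals is a common rule of thumb, not something from the course.

```r
set.seed(4)
x <- rnorm(53, mean = 97, sd = 15)         # simulated predictor
y <- 10 + 0.9 * x + rnorm(53, sd = 7)      # linear signal plus error "cloud"

fit <- lm(y ~ x)
r   <- cor(x, y)

direction <- sign(r)            # 1. direction: +1 positive, -1 negative
strength  <- abs(r)             # 3. strength: how tightly points hug the line
magnitude <- coef(fit)[["x"]]   # 4. magnitude: slope of the fitted line

# 5. unusual observations: points with large standardized residuals
unusual <- which(abs(rstandard(fit)) > 2)

list(direction = direction, strength = round(strength, 2),
     magnitude = round(magnitude, 2), unusual = unusual)

# 2. type (is it linear?): plot residuals against fitted values and look
# for curvature; a patternless band is consistent with a linear relationship
plot(fitted(fit), resid(fit), xlab = "Fitted", ylab = "Residual")
```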