April 2008 Archives

Mapping with R: Third grade proficiency gaps in 2007


Here are some new maps displaying math and reading achievement gaps in early elementary school, suggesting disparities at kindergarten entry. I will write more on these in a later post.

Click on the thumbnails below to see full maps:





Spring break canoe trip


Amy and I canoed the Buffalo National River in Arkansas over spring break--three nights, three days, 45 miles. I highly recommend it. The Buffalo offers beautiful limestone-tinted water, huge cliffs, manageable rapids, and wildlife galore. It's also 12 hours south of Minnesota, and warmer. We timed our trip perfectly: two days after we got off the river, it hit flood stage. Check out this video of the White River, downstream of the Buffalo:

Here are some pics (click to enlarge):

Making chili

Amy in the bow

Blue heron rookery

I decided to make a more advanced map, one with some relevance to my graduate studies in evaluation and educational psychology. This map shows geographic variation in third grade reading proficiency as measured by the 2005 Minnesota Comprehensive Assessment (MCA). Click on the thumbnail to enlarge:


Merging test scores into the attribute data frame was tricky. The attribute data frame evidently must remain sorted in the same order in which it was converted to an sp class object. I found this out after merging district test scores into the attribute data frame and discovering that Cannon Falls had migrated up to Minneapolis. I created a row number variable to re-sort the data frame back to its original order before adding it back to the sp object. I'm betting there is a better way to achieve the same type of join. In the future, I will probably join additional variables to the DBF portion of the SHP file before reading it into R.
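For instance, one alternative is a match()-based lookup, which returns scores in the order of the attribute table and so never disturbs the polygon order. A minimal sketch with made-up district numbers and names (not the actual MCA or MetroGIS data):

```r
#hypothetical attribute table in polygon order, and scores in arbitrary order
attrs  <- data.frame(SD_NUMBER = c(11, 625, 1, 2),
                     SD_NAME   = c("Anoka", "St. Paul", "Minneapolis", "Hilltop"))
scores <- data.frame(SD_NUMBER  = c(1, 2, 11, 625),
                     proficient = c(62, 55, 71, 58))

#match() keeps the rows of attrs in place, so no row-number variable is needed
attrs$proficient <- scores$proficient[match(attrs$SD_NUMBER, scores$SD_NUMBER)]
attrs$proficient #-> 71 58 62 55, aligned with districts 11, 625, 1, 2
```

Assigning the result straight into the attribute data frame this way would leave the sp object's polygon-to-attribute correspondence intact.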

Here is the script:


#load packages
library(maptools)     #readShapePoly
library(rgdal)        #spTransform
library(classInt)     #classIntervals, findColours
library(RColorBrewer) #brewer.pal

#get data
mca05r3m <- read.table(file.choose(), header=TRUE) #MCA scores from MDE; header=TRUE so column names like percentLevel3 come through
districts <- readShapePoly(file.choose(), proj4string=CRS("+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83")) #school districts from MetroGIS

#prepare data
districts <- spTransform(districts, CRS=CRS("+proj=longlat"))
districts@data$row <- as.numeric(row.names(districts@data)) #necessary for linking attributes back to polygons after merge
mca05r3m$proficient <- with(mca05r3m, percentLevel3 + percentLevel4 + percentLevel5) #MCA levels 3-5 represent proficiency
temp <- merge(districts@data, mca05r3m, by="SD_NUMBER", all.x=T, sort=F) #merge scores to attribute data frame
districts@data <- temp[order(temp$row),] #order is key

#set up the choropleth with helpful code from GeogR
plotvar <- districts@data$proficient
nclr <- 6
plotclr <- brewer.pal(nclr,"Oranges")
plotclr <- plotclr[nclr:1] #reordering colors looks better
brks <- classIntervals(plotvar, nclr, style="quantile") #renamed to avoid masking base class()
colcode <- findColours(brks, plotclr, digits=4)

#map it
plot(districts, density=16, col="grey", axes=T, cex.axis=.75) #hatches missing data
plot(districts, col=colcode, add=T)
text(coordinates(districts), labels=districts$SD_NAME.x, cex=.5)
title(main="Percentage of Third Graders Reading Proficiently in 2005", sub="Created with R",font.sub=2)
legend(-94.1, 45.4, legend=names(attr(colcode, "table")), fill=attr(colcode, "palette"), cex=0.75, bty="n")

Sources of data and code include:

About me



Current positions
-Evaluation and testing specialist
-Doctoral candidate

Doctoral program
Quantitative Methods in Education (QME), Evaluation concentration

Research interests
-Preventing educational and health disparities
-Program and policy evaluation
-Spatial statistical methods and the geography of education
-Causal theory and inference
-Latent variable models (item response theory, factor analysis, structural equation models)
-Mixed models for longitudinal and multilevel data
-Complex survey sampling and variance estimation

Education
-Hubert H. Humphrey School of Public Affairs
-Maryville College

Work experience
-Minneapolis Public Schools
-Minnesota Department of Education
-Office of Research Consultation and Services (ORCS)
-Human Capital Research Collaborative
-Saint Paul Public Schools' Department of Research, Evaluation and Assessment
-Wilder Research
-Child Care Resource and Referral (CCR&R)

Minneapolis, Minnesota, USA

Curriculum vitae

Mapping with R


I have been learning R software for statistical computing. I have used it for all of my homework assignments in EPSY 8261 (Probability and Inference) and 8262 (Regression and the General Linear Model). My prof has an R blog here: http://rguide.blogspot.com/.

I envision bridging my studies in educational psychology (quantitative methods, evaluation) with my interest in spatial analysis and mapping. I read yesterday that APA's science directorate is trying to encourage greater use of GIS in psychological research. You can read more here: http://www.apa.org/science/psa/feb08gis.html.

I decided to explore R's mapping capabilities today. Here are a couple of findings:
-The shapefiles package is not really necessary.
-The maptools package can read and display shapefiles, but the author encourages converting shapefiles into sp class or polylist objects.
-Once shapefiles are converted, one can use the maptools plot.polylist polygon helper function.

Pat Bartlein at the University of Oregon recommends the following packages: sp, maptools, rgdal, maps, mapproj, mapdata, classInt, scatterplot3d, and RColorBrewer. He created a great website for folks like me who are starting out using R for geography: http://geography.uoregon.edu/GeogR/index.html. Thanks, Pat!

Applied researchers in the Twin Cities unfamiliar with GIS should know that the Metropolitan Council and its MetroGIS initiative have done a great job of collaborating with local entities to make a ton of data available at no charge: http://www.metrogis.org/.

I prefer ArcGIS for making maps and performing spatial analyses. I use MapInfo at work because our IT/geographer person prefers it. Mapping with R is similar to statistical computing in R--it's a blank canvas waiting for color and texture, whereas SPSS and commercial GIS software are like the Etch A Sketch.

Click on the thumbnail for a large version of the Twin Cities map I created with R:

Twin Cities Metro Area 041208.png

Here is the script:


#load packages
library(maptools)     #readShapePoly
library(rgdal)        #spTransform
library(RColorBrewer) #brewer.pal

counties <- readShapePoly(file.choose(), proj4string=CRS("+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83")) #choose MetroGIS' county.shp
counties <- spTransform(counties, CRS=CRS("+proj=longlat"))


#capitalize first letter of county names
capwords <- function(s) {
  cap <- function(s) paste(toupper(substring(s, 1, 1)),
                           tolower(substring(s, 2)), sep = "", collapse = " ")
  sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
} #closing brace was missing

display.brewer.all() #check out the color options
cols <- brewer.pal(7, "Pastel2")

water <- readShapePoly(file.choose(), proj4string=CRS("+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83")) #choose MetroGIS' hydro_luse2000.shp
water <- spTransform(water, CRS=CRS("+proj=longlat"))

coordinates(counties) #centroids

plot(counties, col=cols, axes=T, cex.axis=.75) #plot function in maptools package
names <- capwords(as.character(counties$CO_NAME)) #assumes the MetroGIS county-name field is CO_NAME
text(coordinates(counties), labels=names, cex=.75)
title(main="Twin Cities Metro Area",sub="Created with R",font.sub=2)
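As a standalone sanity check, here is the capwords helper with its missing closing braces restored, applied to a couple of hypothetical county-style names (not the actual MetroGIS attributes):

```r
#capwords: title-cases each space-separated word of each input string
capwords <- function(s) {
  cap <- function(s) paste(toupper(substring(s, 1, 1)),
                           tolower(substring(s, 2)), sep = "", collapse = " ")
  sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
}

capwords(c("HENNEPIN", "ST. PAUL")) #-> "Hennepin" "St. Paul"
```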