November 18, 2008

Estimating policy effects using spatial regression discontinuity

I created a poster to present at GIS Day, an event organized by the University of Minnesota Geographic Information Science Student Organization. I hadn't heard of GIS Day until last week, so I had to spend all weekend on the analysis, write-up, and formatting. Thankfully I already had the data and the research topic ready to go. I'm grateful to David Card and Alan Krueger, authors of Myth and Measurement: The New Economics of the Minimum Wage, for sharing data from their study. Here are some excerpts from my poster:

Estimating policy effects using spatial regression discontinuity: The case of New Jersey's minimum wage increase

Background
Estimating causal effects from policy-level interventions is an important aim in the field of program evaluation, but policies are typically implemented in geographically defined jurisdictions, such as school districts or states, and not by randomly assigning participants to a treatment or control group. Consistent with the Education Sciences Reform Act of 2002, the U.S. Department of Education gives preferential treatment to causal research based on "random assignment experiments or designs ... [that] eliminate plausible competing explanations." Geographic information systems (GIS) are not widely used in education research but may help isolate competing explanations when estimating the effect of a policy on educational outcomes.

Purpose
Can GIS help evaluators and policy analysts comply with federal priorities for causal research in education? The purpose of this study is to apply GIS and cross-disciplinary inquiry (i.e., across education, geography, economics, and statistics) to the case of treatment assignment based on geographic borders. Geographic information from a well-known study of minimum wage effects by Card and Krueger (1994) was harnessed with GIS software. Data were then re-analyzed using regression discontinuity (RD). The strengths and limitations of GIS and spatial RD are discussed in the context of statistical results.

Results
The final model explains about 3 percent of total variation in the dependent variable (i.e., post-pre change in FTE employees). Of the models examined, the final model is the most parsimonious one with statistically significant higher-order terms. The 95 percent confidence interval for the mean difference in Y at the PA-NJ border (-6.30, 3.64) includes zero, which suggests that raising the minimum wage by 19 percent had no statistically significant effect on employment in the food service industry. This finding corresponds with Card and Krueger's conclusion.

             Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  2.309174   2.595784      0.89     0.374
X           -1.330037   2.526658     -0.53     0.599
Z            0.224291   0.081157      2.76     0.006
XZ          -0.340753   0.132150     -2.58     0.010
Z2           0.001632   0.000653      2.50     0.013
Note: Heteroscedasticity-consistent standard errors
Residual standard error: 8.82 on 374 degrees of freedom
(25 observations deleted due to missingness)
Multiple R-squared: 0.0411, Adjusted R-squared: 0.0308
F-statistic: 4.84 on 4 and 374 DF, p-value: 0.00082
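
The confidence interval quoted in the results can be checked against the X row of the coefficient table. This is only a minimal sketch: it assumes (the excerpt does not say so explicitly) that X is the New Jersey treatment indicator and that the interval is based on a t distribution with the 374 residual degrees of freedom.

```r
# Values copied from the X row of the coefficient table above
est <- -1.330037   # estimated discontinuity at the PA-NJ border
se  <- 2.526658    # heteroscedasticity-consistent standard error
df  <- 374         # residual degrees of freedom

# Two-sided 95 percent interval: estimate +/- t critical value * standard error
t_crit <- qt(0.975, df)
ci <- est + c(-1, 1) * t_crit * se
print(round(ci, 2))
```

The result rounds to -6.30 and 3.64, which matches the interval reported in the text.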

Conclusions
- GIS can help educational researchers harness geographic information to evaluate programs and policies.
- Spatial RD holds promise as a quasi-experimental evaluation tool because educational and other policies are frequently implemented along geographic boundaries rather than by randomly assigning students or citizens. Without random assignment, the assignment process must be modeled stringently to minimize competing explanations.
- More methodological inquiry is needed to judge how well and under what conditions spatial RD yields unbiased estimates.

PA_NJ_Map.png

Fitted_line.png

Click on the image to access the full, readable poster:
Poster_Handout_111908.png

October 27, 2008

Go Broncos!

My brother-in-law, Brad Wick, recently became head cross country running coach at Boise State University. He moved to Boise after three years as Assistant Coach of the Gopher men's cross country team. Heading into the Western Athletic Conference championship, the Bronco men are predicted to finish third and the women fourth. Go Broncos!

An interview with Brad

October 25, 2008

Exploratory factor analysis of school district characteristics

What factors influence a school district's ability to improve student achievement? With that as my guiding research question, I decided to explore whether demographic and financial variables represent latent factors. If not, then each may influence achievement individually. Using school district information published by the Minnesota Department of Education, I explored the latent factor structure of 12 theoretically important predictors of school district success. Before conducting the EFA, I transformed variables with skewed distributions and removed multivariate outliers exhibiting high Mahalanobis distance.
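
The screening steps described above might look something like the following sketch. The data are simulated stand-ins, not the Minnesota Department of Education files, and the variable names, the log transform, and the chi-square cutoff (p < .001) are my assumptions about one common way to do it rather than a record of the exact choices made for this analysis.

```r
set.seed(1)

# Simulated stand-in for district-level predictors (hypothetical names)
n <- 200
districts <- data.frame(
  fed_rev  = rlnorm(n, meanlog = 6, sdlog = 0.8),  # right-skewed revenue
  pct_frl  = rnorm(n, mean = 38, sd = 10),
  reg_inst = rnorm(n, mean = 44, sd = 4)
)
districts[1, ] <- c(60000, 95, 70)  # planted multivariate outlier

# Step 1: reduce skew with a log transform
districts$fed_rev_t <- log(districts$fed_rev)
vars <- districts[, c("fed_rev_t", "pct_frl", "reg_inst")]

# Step 2: squared Mahalanobis distance of each row from the centroid
d2 <- mahalanobis(vars, center = colMeans(vars), cov = cov(vars))

# Step 3: drop rows beyond a chi-square cutoff with df = number of variables
cutoff <- qchisq(0.999, df = ncol(vars))
screened <- vars[d2 <= cutoff, ]
```

The planted first row is dropped because its squared distance exceeds the cutoff; a few ordinary rows may also be dropped by chance.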

Results from the factor analysis suggest that the predictors load on three latent factors that explain about 51 percent of total variation in the predictors.
1. Social stratification manifests in higher economic and racial/ethnic segregation and in higher state and federal funding, including appropriations for economically disadvantaged students.
2. Educational enhancement is reflected by higher spending on special education, instructional support services, and pupil support services; lower spending on student activities and athletics; and higher property tax revenue.
3. Regular instruction is apparent in higher spending for regular instruction and in local revenue from sources other than property taxes, although the latter is weakly related. Unclassified local revenue could be removed, reducing it and regular instruction to single, observed variables.

Next steps will include conducting a confirmatory factor analysis with a different year of data and developing a spatio-temporal model of student achievement at the district level.

Data sources
Minnesota Department of Education. (2007a). 2007 Minnesota Comprehensive Assessment results: Public schools. Saint Paul, MN: Author. Retrieved from: http://education.state.mn.us/MDE/Data/Data_Downloads/Accountability_Data/Assessment_MCA_II/index.html

Minnesota Department of Education. (2007b). School district financial profiles. Saint Paul, MN: Author. Retrieved from: http://education.state.mn.us/MDE/Accountability_Programs/Program_Finance/Financial_Management/School_District_Financial_Profiles/index.html

Minnesota school district demographics, revenues, and expenditures in 2007

                                                                          Districts  Mean     SD       Minimum to maximum
Demographics
  Percent third graders eligible for free or reduced lunch                333        37.8     16.2     5.0 to 100.0
  Percent third graders of color and/or Hispanic or Latino ethnicity      289        14       16.4     0.0 to 100.0
  Average daily membership (ADM) served per licensed instructional staff  334        14.8     2.5      4.9 to 22.1
Revenue per ADM
  Local property taxes                                                    334        $1,408   $711     -$114 to $4,053
  Other local source                                                      334        $1,214   $660     $357 to $7,154
  State sources                                                           334        $7,753   $994     $5,496 to $12,662
  Federal sources                                                         334        $663     $1,123   $108 to $17,358
Percent of revenue expended
  Regular instruction                                                     334        43.8     3.8      31.7 to 66.5
  Special education                                                       334        14.9     4        3.1 to 29.2
  Student activities and athletics                                        334        3.7      1.4      0.0 to 9.8
  Instructional support services                                          334        3.7      1.5      0.9 to 9.6
  Pupil support services                                                  334        2.1      1.7      0.0 to 24.6

Scatterplot_Matrix_Untransformed.png

Scatterplot_Matrix_Transformed.png

EFA_Scree_Plot.png

EFA_Bi_Plot.png

Loadings (promax rotated):
                 Factor1 Factor2 Factor3
PCT_P.t            0.744  -0.216  -0.043
PCT_MIN.t          0.583   0.464   0.165
ADMSPLIS.t        -0.475   0.555   0.052
PROPTREV.t        -0.142   0.513  -0.064
LOCREVO.t         -0.127   0.129  -0.243
STATEREV.t         0.729  -0.141  -0.157
FEDREV.t           0.883   0.136   0.053
RI                -0.046   0.071   0.994
SE.t               0.211   0.623  -0.310
SAA.t             -0.233  -0.792  -0.140
ISS.t             -0.221   0.398  -0.046
PSS               -0.140   0.447   0.039

                 Factor1 Factor2 Factor3
SS loadings        2.636   2.267   1.230
Proportion Var     0.220   0.189   0.102
Cumulative Var     0.220   0.409   0.511

Factor correlations:
                 Factor1 Factor2 Factor3
Factor1            1.000  -0.178   0.122
Factor2           -0.178   1.000   0.192
Factor3            0.122   0.192   1.000
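
The "about 51 percent" figure can be traced back to the loadings themselves. The sketch below retypes the printed loadings and recomputes the summary rows as column sums of squared loadings divided by the number of variables. One caveat: with an oblique rotation such as promax the factors are correlated, so these sums should be read as a rough summary rather than a clean partition of variance.

```r
# Loadings copied from the promax-rotated output above (rows = variables)
L <- matrix(c(
   0.744, -0.216, -0.043,
   0.583,  0.464,  0.165,
  -0.475,  0.555,  0.052,
  -0.142,  0.513, -0.064,
  -0.127,  0.129, -0.243,
   0.729, -0.141, -0.157,
   0.883,  0.136,  0.053,
  -0.046,  0.071,  0.994,
   0.211,  0.623, -0.310,
  -0.233, -0.792, -0.140,
  -0.221,  0.398, -0.046,
  -0.140,  0.447,  0.039
), ncol = 3, byrow = TRUE,
dimnames = list(
  c("PCT_P.t", "PCT_MIN.t", "ADMSPLIS.t", "PROPTREV.t", "LOCREVO.t",
    "STATEREV.t", "FEDREV.t", "RI", "SE.t", "SAA.t", "ISS.t", "PSS"),
  c("Factor1", "Factor2", "Factor3")
))

ss_loadings <- colSums(L^2)            # sum of squared loadings per factor
prop_var    <- ss_loadings / nrow(L)   # divide by the 12 observed variables
cum_var     <- cumsum(prop_var)        # running total across factors

print(round(rbind(ss_loadings, prop_var, cum_var), 3))
```

The recomputed values agree with the printed summary to rounding error, because the inputs are themselves rounded to three decimals.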

October 06, 2008

Kelli's birthday party

Here are some pictures from Kelli's 30th birthday party. Welcome to the club, Kelli! Our friends sure do have some cute kids.

Families
IMG_9247.jpg IMG_9225.jpg IMG_9232.jpg IMG_9242.jpg

Dads
IMG_9229.jpg IMG_9243.jpg IMG_9240.jpg IMG_9245.jpg

Moonwalk
IMG_9248.jpg IMG_9226.jpg

September 26, 2008

Pollster, my addiction

Election day is only six weeks away. I grew dependent on Pollster.com during the primaries. Now I have a full-blown addiction.

I like Pollster because it's run by professional survey researchers who pay attention to sampling error, nonresponse bias, and other threats to making accurate inferences about the electorate's preferences. They have sought to offset some of those threats by mashing up survey results from many reputable sources and running a smoothed trend line through them. Additionally, they critically review results and methods, pointing out strengths and shortcomings that poll consumers can use to evaluate results.
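
For readers curious about what "running a smoothed trend line" through many polls involves, here is a minimal sketch. It assumes a local regression (loess) smoother; I have not confirmed which smoother or settings Pollster actually uses, and the poll numbers below are simulated, not real results.

```r
set.seed(2008)

# Simulated poll results (hypothetical; not Pollster.com data)
n_polls <- 40
day     <- sort(sample(1:60, n_polls, replace = TRUE))  # day each poll ended
support <- 47 + 0.05 * day + rnorm(n_polls, sd = 2)     # percent for a candidate

# Local regression: each fitted value comes from a weighted fit to nearby polls,
# so no single poll's sampling error dominates the trend
fit   <- loess(support ~ day, span = 0.75)
trend <- predict(fit)  # smoothed estimate at each poll's date
```

Plotting support against day and adding lines(day, trend) would show the individual polls scattered around the smoothed line. A smaller span follows the polls more closely; a larger span gives a steadier trend.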

September 18, 2008

Spatial analysis of math proficiency in third grade

Is student proficiency in a school district related to proficiency levels in neighboring districts? I'm continuing my foray into spatial methods by analyzing spatial dependency among standardized test scores at the school district level in Minnesota. If the influence of neighboring school districts is significant, then district performance could be evaluated in a manner that controls for the influence of neighbors (e.g., with a spatial lag model expressed as y = ρWy + Xβ + ε, where W is the spatial weights matrix, ρ is the spatial autoregressive coefficient, and Xβ represents district-level covariates). Additionally, we could single out school districts that exceed expectations, where expected performance is predicted by neighbors.

The following maps show third grade math proficiency in 2007 and each district's four nearest neighbors. As shown in the Moran scatterplot, the percentage of students in Grygla Public Schools exhibiting proficiency is greater than the percentage in neighboring districts. The same is true for other districts in or near the bottom right quadrant. Districts in or near the top left quadrant, such as Brooklyn Center and Red Lake, are not faring as well as their neighbors. Note that Grygla and Red Lake are neighbors, their proficiency levels differ greatly, and they border districts with missing test score data (i.e., their spatial lags are averaged over three neighbors instead of four), all of which help explain why they are positioned as outliers in the scatterplot.

Overall, there's a lack of evidence that this particular district outcome (percent of third graders proficient in math) is influenced by neighboring districts or nearby districts (out to six lags). The spatial correlation coefficient, Moran's I, is not significantly different from zero. Future analyses will investigate spatial dependency of other indicators of student and district achievement, such as reading proficiency, proficiency gaps (economic and racial/ethnic), and proficiency by grade level.
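
To make the neighbor weights, the spatial lag, and Moran's I concrete, here is a minimal base R sketch on simulated data. The coordinates and proficiency values are made up, and the hand-rolled weights matrix is my simplification for illustration; it is not the code behind the maps and test output in this post.

```r
set.seed(2007)

n      <- 60
coords <- cbind(runif(n), runif(n))      # hypothetical district centroids
y      <- rnorm(n, mean = 70, sd = 10)   # hypothetical percent proficient

# Row-standardized weights: each district's four nearest neighbors get 1/4
k <- 4
d <- as.matrix(dist(coords))
diag(d) <- Inf                           # a district is not its own neighbor
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  W[i, order(d[i, ])[1:k]] <- 1 / k
}

# Spatial lag: mean outcome among each district's neighbors (the Wy term)
lag_y <- as.vector(W %*% y)

# Moran's I: weighted cross-product of deviations, scaled by total variation
z          <- y - mean(y)
moran_I    <- (n / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
expected_I <- -1 / (n - 1)               # expectation under no spatial dependence
```

Plotting y against lag_y gives a Moran scatterplot. Points in the bottom right quadrant are districts that score above average while their neighbors score below average.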

Third graders proficient in math in 2007 choropleth.jpg

Third graders proficient in math in 2007 choropleth Twin Cities.jpg

Minnesota school district neighbors.jpg

Twin Cities school district neighbors.jpg

Third graders proficient in math in 2007 Moran plot.jpg

Grygla and Red Lake juxtaposed
Grygla_Red_Lake_zoom.jpg

Third graders proficient in math in 2007 Moran correlogram.jpg

Moran's I test

Moran I statistic standard deviate = 0.6023, p-value = 0.547
alternative hypothesis: two.sided
sample estimates:
Moran I statistic 0.019742685
Expectation -0.003058104
Variance 0.001433003
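
The standard deviate and p-value in this output follow directly from the three sample estimates. The sketch below reproduces them, assuming the usual normal approximation for the test statistic. It also backs out the number of districts from the expectation, using the fact that the expected value of Moran's I under no spatial dependence is -1/(n - 1).

```r
# Sample estimates copied from the test output above
I_obs <- 0.019742685    # observed Moran's I
E_I   <- -0.003058104   # expectation under no spatial dependence
V_I   <- 0.001433003    # variance

# Standard deviate and two-sided p-value from the normal distribution
z <- (I_obs - E_I) / sqrt(V_I)
p <- 2 * pnorm(-abs(z))

# E(I) = -1 / (n - 1), so n = 1 - 1 / E(I)
n_districts <- round(1 - 1 / E_I)

print(round(c(z = z, p = p), 4))
print(n_districts)
```

The deviate and p-value match the printed 0.6023 and 0.547, and the expectation implies that 328 districts entered the test.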

September 16, 2008

My Old Kentucky Home

Here are some pictures from my visit to Kentucky in August.

Shaker Village
Picture 019.jpg

Sorghum
Picture 043.jpg

Family reunion
IMG_9093.jpg

Rest in peace, Granny
IMG_9068.jpg

August 22, 2008

The life of an applied researcher

Check out the first deleted scene from The Simpsons Movie. It's really funny and deftly portrays what it can be like to conduct applied research in the policy arena. The head of the EPA fails to persuade President Schwarzenegger to take action on pollution in Springfield and resorts to giving a lesson on the statistical concepts of central tendency, variance, and outliers. In the end, a "stupidly high level" of pollution proves less persuasive than one mutant specimen. The takeaway: data collection and statistical analysis can strongly suggest a course of action, but applied researchers must also consider what types of information will resonate with their audience if they hope to inspire appropriate action. Presenting tabular information in maps, for example, can give statistics more meaning. Another takeaway: technocracy is funny.


August 17, 2008

Workshop on quasi-experimental design and analysis in education

I recently attended one of the Workshops on Quasi-Experimental Design and Analysis in Education. The workshop was led by two of my heroes, Thomas Cook and William Shadish. It was an honor to be selected for the workshop and to share the company of Tom, Will, and my fellow attendees for a week in Evanston. I had some experience with quasi-experimental methods before the workshop, but I learned a great deal more, such as checking propensity score balance criteria and mixing design elements to strengthen causal inferences. The workshop definitely improved my ability to conduct causal research in education.

Tom, me, and Will
Picture 118.jpg

View from Northwestern University
Picture 113.jpg

Camping at Illinois Beach State Park after the workshop

Picture 003.jpg

July 16, 2008

Poem for Wilder Research librarians

At Wilder Research, we promote evidence-based practices that support the foundation's mission:

To promote the social welfare of persons resident or located in the greater Saint Paul metropolitan area by all appropriate means... without regard to, or discrimination on account of, nationality, sex, color, religious scruples or prejudices.

Our library of relevant journals, books, and other resources is a testament to our commitment to best practices. We have three awesome librarians, Heather, Amanda, and Kerry, who maintain the library and provide services, such as literature searches for clients. I used the Dewey Decimal Classification System to write this poem for them on staff appreciation day:

If I was a librarian,
I wouldn't know whether
to classify you as 780 or 811.
When we need literature,
you deliver with 118.
You are our 234,
and without you,
our 003 would break down.
I'm not even being 817!

To which the librarians replied:

Dear Chris,
Thanks for your kind comments. We are happy to hear that you appreciate the 001 and 706 of the library. It has always been our 901 that we are here to help the 128 of Wilder Research in whatever way we can. However, we do note that the 119 of your visits to the library has been low lately. So come on in for some quality 302 when you have a chance.
Love,
The Librarians

July 06, 2008

Namekagon and St. Croix trip

We canoed the Namekagon and St. Croix rivers over Memorial Day weekend. Our friends Stephanie and John and Lori and Carl accompanied Amy and me. We covered about 45 miles over three days, a pace similar to our trip on the Buffalo River.

My favorite part of the trip, besides spending time with our friends, was the spectacular, dynamic views offered by the Namekagon. The river travels circuitously between wetlands, sandy, pine-covered hills, and deciduous forest, enabling us to sneak up on wildlife and experience wonderment at each turn. We spotted black bears running through the forest (away from our canoes, thankfully), members of the weasel family (otter, pine marten, fisher), and water birds (green heron, blue heron, egret, ducks, mallard).

Amy, Lori, and Stephanie (the Cobber Posse)
100_1571.JPG

John, Carl, and me (honorary members of the Posse)
100_1572.JPG

View from our (tick-infested) campsite on the Namekagon
Namekagon Campsite Panoramic.JPG

Our route, approximately
Namekagon_Trego_Riverside.png

July 02, 2008

Retrospective pretest posttest

"What is a retrospective pretest posttest?" You are not alone in wondering. I belong to the American Evaluation Association and subscribe to the EVALTALK listserv where members routinely enquire about this method. A retrospective pretest posttest is a type of survey instrument designed to reduce response shift bias.

If training participants, for example, are asked to rate their knowledge, skills, and abilities (KSAs) before and after receiving instruction, then they will often report lower KSAs after the training program. This is known as response shift bias and occurs because training participants tend to overestimate their KSAs beforehand. They become more knowledgeable during the training program and better equipped to gauge their own KSAs. Asking participants about their KSAs retrospectively, after the program, presumably reduces bias.

A retrospective pretest posttest is a single survey instrument administered after a program. It contains questions that ask respondents to rate their KSAs in hindsight as well as at the conclusion of the program. Survey questions are nearly identical and evenly split between "Thinking back before the program..." and "Now, after the program." I should note that in addition to reducing bias, a retrospective pretest posttest, administered once to a cohort, saves resources that would otherwise be spent on administering two or more surveys.

Here's a good explanation posted to EVALTALK by Susan McNamara:

"[T]here are some cases where an after-the-fact measure really works out best. For example, in parent training (especially when court-ordered), the parents come to the first meeting thinking they know everything and that they're perfect parents. If you give them a pretest now, they're going to score themselves very highly. After three months or more of training, they've come to realize that no one is perfect, that there might be better ways of doing things, and perhaps the ways (e.g., of discipline) they were using before weren't the best. Now, they might score themselves lower. If you give them a post-test, it looks like your program isn't working, because their scores are going DOWN. However, if you use a retrospective pre-test ("fill this out now, but think about the attitudes and skills you had before you entered the program") and a regular post-test, you are more likely to get a realistic change."

I would like to research this method further because the scholarly literature has not fully addressed some questions. To what degree are retrospective self-ratings biased? Is there an optimal time to administer a "pretest" in a lagged fashion so as to reduce bias (i.e., when does a self-rating become unbiased)? Is there an item format that promotes unbiased responses when asked retrospectively (e.g., matrix vs. sequential vs. sectioned)? Should evaluators measure gain scores by supplementing or substituting self-ratings with performance test items? One could shed light on these questions by simultaneously administering a test measuring KSAs and a survey asking respondents to rate their KSAs throughout a training program or course.

I created the heuristic model below to illustrate how tests and self-ratings might differ over time. The model uses the prediction equation from Dr. Andrew Zieffler's dissertation as a starting point. He measured continuous longitudinal growth of bivariate reasoning ability among students in an educational psychology statistics class. The quadratic growth curve represents actual ability, while the other lines (gray and dotted) represent hypothetical self-rating scenarios that could be measured in a future study of response shift bias.

If one replicated Dr. Zieffler's study with both performance-based and self-rating measures, then the results could help evaluators and statistical educators choose an optimal data collection strategy that minimizes bias and cost. It would provide information for choosing knowledgeably between self-ratings and performance-based items, as well as between temporal options (traditional pre/post, lagged pre/post, retrospective pre/post). Topically, self-ratings of statistical reasoning may help with instructional planning and creating reliable and valid formative assessments with fewer items.

Click on thumbnail to see larger version:
Bivariate_reasoning_heuristic.png

R script

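# Actual ability: quadratic growth curve (prediction equation from Zieffler, 2006)
curve(0.89856 + 0.32046 * x - 0.0064827 * x^2, from = 0, to = 30,
      xlim = c(0, 30), ylim = c(0, 5.5), lwd = 2,
      xlab = "Teaching session", ylab = "Bivariate reasoning")

# Hypothetical continuous self-rating: gray spline through hand-picked points
x <- c(0, 3, 13, 18, 30)
y <- c(5, 3, 3.5, 4.2, 4.65)
lines(spline(x, y, n = 50, method = "natural"), lwd = 2, col = 8)

# Hypothetical traditional pre/post self-rating (gray diamonds, dashed line)
prepost <- cbind(c(0, 30), c(5, 4.65))
points(prepost, pch = 5, lwd = 2, col = 8)
lines(prepost, lty = 2, lwd = 2, col = 8)

# Hypothetical retrospective pre/post self-rating (black dots, dotted line)
retro <- cbind(c(0, 30), c(1, 4.65))
points(retro, pch = 16, lwd = 2)
lines(retro, lwd = 2, lty = 3)

# Shrink the text for the legend and title, then restore the old settings
oldpar <- par(no.readonly = TRUE)
par(ps = 9, cex.main = 1)
legend("bottomright",
       c("Predicted mean score (Zieffler, 2006)",
         "Continuous self-rating (hypothetical)",
         "Before and after self-rating (hypothetical)",
         "Retrospective-before and after self-rating (hypothetical)"),
       pch = c(NA, NA, 5, 16), col = c(1, 8, 8, 1),
       lty = c(1, 1, 2, 3), lwd = c(2, 2, 2, 2), merge = TRUE)
title(main = "Heuristic model: Bivariate reasoning measured at different times by different methods")
par(oldpar)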
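
The hypothetical values in the script above can also be read as gain scores, which makes the response shift problem easy to see. This is only a sketch: the self-rating endpoints (5, 1, and 4.65) are the made-up values from the heuristic model, not measurements, and only the quadratic curve comes from the prediction equation.

```r
# Actual ability: the quadratic growth curve used in the plot
ability <- function(session) {
  0.89856 + 0.32046 * session - 0.0064827 * session^2
}

# Gain in actual ability from session 0 to session 30
actual_gain <- ability(30) - ability(0)

# Hypothetical self-rated gains, using the endpoints from the heuristic model
prepost_gain <- 4.65 - 5   # traditional pretest: inflated "before" rating
retro_gain   <- 4.65 - 1   # retrospective pretest: "before" rated in hindsight

print(round(c(actual = actual_gain, prepost = prepost_gain, retro = retro_gain), 2))
```

Under these made-up numbers the traditional design shows a small loss even though actual ability grows by nearly four points, while the retrospective design recovers a gain of roughly the right size.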

April 24, 2008

Mapping with R: Third grade proficiency gaps in 2007

Here are some new maps displaying math and reading achievement gaps in early elementary school, suggesting disparities at kindergarten entry. I will write more on these in a later post.

Click on the thumbnails below to see full maps:

Proficiency_gap_math_3rd_grade_2007_poverty.png

Proficiency_gap_reading_3rd_grade_2007_poverty.png

Proficiency_gap_math_3rd_grade_2007_minority.png

Proficiency_gap_reading_3rd_grade_2007_minority.png

April 19, 2008

Spring break canoe trip

Amy and I canoed the Buffalo National River in Arkansas over spring break--three nights, three days, 45 miles. I highly recommend it. The Buffalo offers beautiful limestone-tinted water, huge cliffs, manageable rapids, and wildlife galore. It's also 12 hours south of (and warmer than) Minnesota. We timed our trip perfectly. Two days after we got off the river, it hit flood stage. Check out this video of the White River, downstream of the Buffalo:

Here are some pics (click to enlarge):

Making chili
IMG_3691.jpg

Amy in the bow
IMG_3684_2.jpg

Blue heron rookery
IMG_3752.jpg

The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by the University of Minnesota.