Recently in Social Sciences Category

The Consolidated Federal Funds Report by the Census Bureau provided data on which funds were distributed from the federal government to states, counties and places by program and agency.

I'm using the past tense because in 2012 the US Federal Budget eliminated funding for the branch of the Census Bureau which produced the CFFR. The status of the CFFR going forward is unclear.

The Census Bureau used to have a web-based query system for the CFFR which, while not perfect, was useful for a moderate range of users, including some of those who wished to create datasets for analysis. The query system has been removed and users now have two options: *.pdf reports or FTP filesets.

For very casual use, say looking up a single figure for a single year, the *.pdf versions of printed CFFR reports will probably be acceptable. The *.pdf reports mimic the print reports produced before the query system and are just as inconvenient for data analysis.

All other users will be working with the FTP site, aka a somewhat confusing list of compressed filesets for download.

There is one small piece of good news: you can now easily locate data from 1983-2010, which is 10 more years than before.

The price is in usability. There are three compressed filesets:

You might think that, based on names, the last fileset is the only one with data at a state level and that the other two are national summaries. This is incorrect. As far as I can see, given that both 1993-2010 filesets have the exact same documentation, the difference is that the 3rd fileset is useful for users who only want to work with data already formatted for immediate use in spreadsheets (*.csv). Given that it's fairly simple to import and format the *.txt and *.DAT files into spreadsheets though, this isn't much of an improvement.

The real effort will be in rearranging the fileset you decide to use. Likely tasks will include:

  • converting from fixed-width format to delimited format

  • aggregating from places to counties or states

  • filtering by type of aid, program or agency

  • linking program names to codes

Google Refine will be helpful for at least some of these tasks.
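If you would rather script the first and last tasks, here is a minimal Python sketch. The column positions, field names, program codes and program names below are all made up for illustration; the real layout has to come from the documentation packaged with each fileset.

```python
import csv
import io

# Hypothetical layout: (field name, start, end) as 0-based slice positions.
# Replace these with the positions given in the fileset documentation.
LAYOUT = [
    ("state_fips", 0, 2),
    ("county_fips", 2, 5),
    ("program_code", 5, 11),
    ("amount", 11, 23),
]

# Hypothetical lookup table linking program codes to program names.
PROGRAM_NAMES = {
    "00.001": "Example Program A",
    "00.002": "Example Program B",
}


def parse_fixed_width(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of stripped strings."""
    return {name: line[start:end].strip() for name, start, end in layout}


def fixed_width_to_csv(lines, out, layout=LAYOUT, names=PROGRAM_NAMES):
    """Write fixed-width records to `out` as CSV, adding a program_name column."""
    fields = [name for name, _, _ in layout] + ["program_name"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        row = parse_fixed_width(line.rstrip("\n"), layout)
        # Unknown codes get an empty name rather than raising an error.
        row["program_name"] = names.get(row["program_code"], "")
        writer.writerow(row)


# Two made-up records built to match the hypothetical layout above.
sample = [
    "27" + "053" + "00.001" + "123456".rjust(12) + "\n",
    "27" + "123" + "00.002" + "7890".rjust(12) + "\n",
]
buffer = io.StringIO()
fixed_width_to_csv(sample, buffer)
print(buffer.getvalue())
```

With a real file you would pass an open file handle instead of the sample list, and write to a file opened with newline="" rather than to an in-memory buffer.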

For example, suppose you want to see trends in funding over time. All of the file sets are sorted by year. You can consider opening each, adding a year column and then creating one large file for all years. You could then use Google Refine to filter to rows for only one program for all years. What's nice about Google Refine is that if you have selections made, when you export, you only export those rows.
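The same stack-then-filter idea can be sketched in Python for files too large to handle comfortably by hand. This assumes each yearly file has already been converted to *.csv with a header row; the column names and values are invented for the example.

```python
import csv
import io


def stack_years(files_by_year):
    """Yield every row from every yearly file, tagged with its year."""
    for year, handle in sorted(files_by_year.items()):
        for row in csv.DictReader(handle):
            row["year"] = year
            yield row


def filter_program(rows, program_code):
    """Keep only the rows for one program."""
    return [row for row in rows if row["program_code"] == program_code]


# Made-up data standing in for two yearly files.
files = {
    2009: io.StringIO("state,program_code,amount\nMN,00.001,100\nMN,00.002,50\n"),
    2010: io.StringIO("state,program_code,amount\nMN,00.001,120\nWI,00.001,80\n"),
}

trend = filter_program(stack_years(files), "00.001")
for row in trend:
    print(row["year"], row["state"], row["amount"])
```

For real files, replace each io.StringIO(...) with open(path, newline="") for that year's file.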

Refine may also help you to aggregate the data to state level, but I haven't tried that myself yet.
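Aggregation to state level can also be scripted. A minimal Python sketch, assuming the rows have already been read into dictionaries with hypothetical state, place, program_code and amount fields (none of these names come from the CFFR documentation):

```python
from collections import defaultdict


def aggregate_to_state(rows):
    """Sum amounts over all places within each (state, program_code) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["state"], row["program_code"])] += int(row["amount"])
    return dict(totals)


# Made-up place-level rows for illustration only.
place_rows = [
    {"state": "MN", "place": "Place A", "program_code": "00.001", "amount": "100"},
    {"state": "MN", "place": "Place B", "program_code": "00.001", "amount": "50"},
    {"state": "MN", "place": "Place A", "program_code": "00.002", "amount": "30"},
    {"state": "WI", "place": "Place C", "program_code": "00.001", "amount": "80"},
]

state_totals = aggregate_to_state(place_rows)
for (state, program), total in sorted(state_totals.items()):
    print(state, program, total)
```

Whether summing place records gives a correct state total depends on how the fileset handles overlapping geographies, so check the documentation before relying on the result.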

You can do all of these things in MS Excel, of course, but it can be more complicated, and the bigger the files get, the slower Excel runs.

Census Data & Zip Codes

The Census Bureau does not publish data by Zip Code. Zip Codes are mail delivery codes created by, and for the convenience of, the U.S. Postal Service. Zip Codes are not consistent geographic units, and the Census Bureau needs a consistent geographic unit in order to create meaningful statistics.

However, the Census Bureau did create "Zip Code Tabulation Areas" (ZCTAs) for Census 2000, which approximate real Zip Codes. Data from Census 2010 will eventually be available at the ZCTA level too, but data from the American Community Survey will not. As the American Community Survey is now the only source of nationwide socio-economic statistics for small geographic areas, this means users who need data organized by Zip Code will need to look elsewhere.

With the development of the American Community Survey over the last decade, the 2010 Census was much, much shorter than in 2000 or indeed any previous Census except for 1790.

In 2010, there were only 10 questions covering:

  • how many people live here
  • how old they are
  • what sex they are
  • how they are related
  • how they define their race
  • how they define their ethnicity
  • whether their home is owned, rented, or no payment required
  • whether the people in this house sometimes live elsewhere (college, nursing homes, prison, military, etc)

For recent data on any other socio-economic topics, users must now use the American Community Survey.

ICPSR Research Paper Competitions

ICPSR is sponsoring three competitions to highlight the best student research papers (undergraduate and master's) using quantitative data. The objective is to encourage students to explore the social sciences by means of critical analysis of a topic supported by quantitative analysis of a dataset(s) held within ICPSR and presented in written form.

  • Deadline for submission is January 31, 2010.
  • Two competitions cover any dataset(s) held within the ICPSR archive and are open to undergraduate and master's students. The third competition solicits undergraduate papers addressing issues relevant to minorities in the United States, including immigrants, that utilize data from the Resource Center for Minority Data.
  • Up to three cash prizes will be awarded for each competition. The winner will receive a monetary award of $1,000 (second place receives $750 and third place $500).
For details on the competition, see

The Internet Archive brings together the digitized government publications from lots of different projects so that you can search in one location for historical publications containing statistics. Of course, the search isn't exhaustive since not everything has been digitized, but the depth increases every day and there are materials from the Census back to at least the first decade of the 20th Century. Check it out at

It's All About How You Count: Follow Up

The Department of Treasury has released its report on its activities resulting from the Emergency Economic Stabilization Act (P.L. 110-343) and, indeed, they are reporting what they spent as cash rather than as the "net present value" reported by the Congressional Budget Office. See Tranche Report to Congress and Tranche Report Appendices.

To sum up:

Congressional Budget Office reports $17 billion actually spent.
Dept. of the Treasury reports $115 billion actually spent.

The Congressional Budget Office (CBO) has released its October Monthly Budget Review. If one were to skim just the tables, one would get the impression that the TARP expenditures in October were $17 billion. In the paragraph below, however, the CBO notes that

"In CBO's view, the stock investment and associated warrants should not be recorded on a cash basis but on a net present value basis, accounting for market risk, as specified in the Emergency Economic Stabilization Act. CBO's preliminary estimate of $17 billion for the present value cost is included in its estimate of $134 billion for the October deficit. However, CBO anticipates that the Treasury will report the stock purchases on a cash basis. As a result, CBO estimates that the Treasury will report the October deficit at $232 billion."

In short, we could see radically different claims of funds spent because of different methods of reporting the dollars committed. As always, you have to pay attention whenever someone starts quoting statistics...
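The two figures can be reconciled with a little arithmetic. This is only a sketch, and it assumes that the entire gap between the two October deficit estimates comes from how the stock purchases are recorded; the variable names are mine, the numbers are from the CBO paragraph.

```python
# Figures in billions of dollars, taken from the CBO paragraph quoted above.
cbo_deficit = 134        # CBO's October deficit, stock purchases at present value
tarp_present_value = 17  # CBO's present-value cost of the stock purchases
treasury_deficit = 232   # deficit CBO expects Treasury to report, on a cash basis

# Assumption: the whole gap between the two deficits is the accounting method.
accounting_gap = treasury_deficit - cbo_deficit
tarp_cash = tarp_present_value + accounting_gap

print(accounting_gap)  # 98
print(tarp_cash)       # 115
```

Under that assumption the implied cash figure is $115 billion, the same amount the Treasury reported as spent.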

Why You Should Always Ask For Proof

Ars Technica has a great little article on an oft-cited figure about job losses due to intellectual property theft. What they discover is that the number has (so far) no basis in reality. They were able to find a reference to the job losses in a 1986 interview with the then Secretary of the Department of Commerce. Where he got his figures remains a mystery. Yet this number of supposed job losses is being used to justify new laws strengthening intellectual property rights. There might be good reasons for the proposed law, but the "job loss" isn't one of them.

The various discs from the 1997 Economic Census, which can be checked out from the University Libraries, do not work properly in Windows XP. The Census Bureau has posted patches and instructions. No word on how they handle Vista...

Real Estate Database

A comprehensive collection of historical real estate data. Includes Winan’s Real Estate Index of housing prices back to 1830, regional real estate price indices, national and regional new home sales and listings, and more. Choose “Real Estate Market Data” under Data Series Type in the Filter Search.

U.S. Stocks Database

Comprehensive current and historical coverage of stocks traded on exchanges in the United States. The database provides daily data for current US stocks back to 1970 and monthly data back to 1815. Enables you to combine the financial indicators from the United States and 200 other countries with the historical price data on over 20,000 individual current and delisted stocks.

  • “Business definitions have always been tricky for researchers,” said Brian Headd, economist for the Office of Advocacy and co-author of the working paper. “It’s hard to determine what counts as a business. For instance, is there a minimum employee or revenue requirement, a length of time in business, or a contribution to owners income?” He added, “Definitions are often based on convenience, such as what data is available to the researcher.”

The Bureau of Labor Statistics has redesigned its website and has added many useful features, of which some are highlighted here. For all new features, see

1. RSS feed for latest economic indicators:

2. Calendar of all releases with the option to subscribe via *.ical/*.ics:

3. Guide to Data by Geography:

4. Tutorials:

The MN Campaign Finance and Disclosure Board makes finding campaign finance data a little more complex than one might expect. The data isn't as recent as one might expect either (through 12/31/2007 in most cases), and most of the information is provided in *.pdf files. Exceptions are noted.

Independent Expenditures -

Contributions Received by a Candidate -

Contributions Made by Individuals -

Contributions Received by Party Units or Political Committees -

Contributions Made by Party Units or Political Committees -

Political Party, Political Committee and Political Fund Reports -
On the Board home page, "Political Party Units" are listed as a separate link, but they're really a subset of the overall Political Party database.

Candidate Report of Receipts and Expenditures -

Lobbyist Disbursement Reports -
Data through 2008

Public Official Lists -
Reports in *.html; content undated.

State Campaign Finance Databases

For information on state-level campaign finance, see Campaign Finance Databases from the State Agency Databases site.

STAT-USA has new financial and travel statistics. STAT-USA is a subscription database from the Department of Commerce that contains statistics either hard to find elsewhere or not available elsewhere.

All Americas Barometer Survey

Newly licensed: All Americas Barometer Survey

An effort by the Latin American Public Opinion Project (LAPOP) to measure democratic values and behaviors in the Americas using national probability samples of voting-age adults.

The Minneapolis Area Association of REALTORS® has released "Foreclosures and Short Sales in the Twin Cities Housing Market," a special new research report that attempts to answer some of the more pressing questions surrounding lender-mediated properties. Includes an analysis of current inventory, new listings, closed sales, sales prices, and the impact that the growth of lender-mediated properties is having on each trend.

The data was gathered and analyzed by MAAR staff in collaboration with Aaron Dickinson, REALTOR® member with Edina Realty, and utilizes a new data approach based upon information from the NorthstarMLS system.

To share comments or questions on this new report, please contact Jeff Allen, MAAR Research Manager, or Aaron Dickinson.

Thanks to our Journalism, Film and Mass Communications Librarian, Johan Oberg (oberg091 at umn dot edu), you can find sources of current and historical newspaper circulation statistics compiled at

The Minneapolis Area Association of Realtors has produced a report on home sales activity in the metropolitan area over the last year. It includes summaries for 2001-present of sales and values by county, city and neighborhood, along with some maps. See Residential Real Estate Activity Report.

Estimating Excess Mortality in Post-Invasion Iraq

From the article:

"There is no set formula for accurately tallying deaths from humanitarian crises. When a population becomes destabilized, estimation of mortality is likely to be severely challenged. In the case of a sudden traumatic event, such as a natural disaster affecting an otherwise stable population, health and human service agencies, though compromised, may well be able to facilitate an accurate assessment of deaths through the use of prospective registries of vital events."

USDA State Fact Sheets

State Fact Sheets

State fact sheets provide information on population, employment, income, farm characteristics, farm financial indicators, and top commodities, exports, and counties for each state in the United States.

Social Explorer

Created by Queens College, City University of New York, and partnering with the University of Minnesota's own National Historical Geographic Information System, this subscription edition of Social Explorer offers the ability to create customized maps and reports of demographic, housing, and employment patterns throughout the United States using data from the U.S. Census Bureau. Data are available by decade between 1940 and 2000 for a variety of geographical entities that are as small as the census tract for certain areas.

Obesity rates for females by country for 2005, 2015

Roper Center membership now active

The Roper Center for Public Opinion Research is one of the world's leading archives of social science data, specializing in data from surveys of public opinion. The data held by the Roper Center range from the 1930s, when survey research was in its infancy, to the present. Most of the data are from the United States, but over 50 nations are represented.

Our membership with the Roper Center is now active and ready for use. With the membership, current University of Minnesota students, faculty and staff have access on and off campus to the full suite of Roper content, including iPOLL and the Dataset Collection.

For surveys with national adult or major subpopulation samples (such as registered voters, women, or African Americans), iPOLL can be used as the primary finding aid to locate datasets on particular topics or for specific time periods. For state, foreign, and other special samples, the Dataset Collections can be used to locate datasets by keyword (the abstract is searched), date, country, survey organization, survey sponsor, or type of sample. Our membership includes downloads of all files in the Roper Express collection.

Roper also offers the option to do online analysis of selected datasets through the IDEAS: Interactive Data Exploration & Analysis System.

The Topography of Poverty in the United States: A Spatial Analysis Using County-Level Data From the Community Health Status Indicators Project

From the abstract:

"The spatial analytic techniques are broadly applicable to socioeconomic and health-related data and can provide important information about the spatial structure of datasets, which is important for choosing appropriate analysis methods."

Global Patterns of City Size Distributions and Their Fundamental Drivers

Interesting article that uses a combination of census data and geospatial data in support of the argument that "macroscopic patterns of human settlements may be far more constrained by fundamental ecological principles than more fine-scale socioeconomic factors".

There are data-related resources all over the Twin Cities campus. This guide highlights major sources of assistance with software, hardware, analysis and locating data.

Social Sciences Data - Resources across the University of Minnesota

This National Institute of Corrections manual provides guidance on how information affects policy decision making. Topics include good management; data collection; how to locate and capture information; analyzing, interpreting, and sharing information; and getting the most from your information system.

Interagency Information Cooperative

The Interagency Information Cooperative (IIC) was created from the Sustainable Forest Resources Act of 1995 (M.S. Chapter 89A.09). The overall mission of the IIC is to enhance the access and use of forest resources data in Minnesota. The following public organizations have representatives on the IIC: Minnesota Association of County Land Commissioners, United States Forest Service, Land Management Information Center, University of Minnesota, and Department of Natural Resources. The IIC Memorandum of Understanding was created in 1997, and goes into greater depth on the purposes, membership, and duties of the IIC.

Focus here is more on harmonization of data through metadata rather than computationally intensive work. However, it is boundary-spanning in that it is attempting to harmonize biological, geospatial, social science and other kinds of data.

The Minnesota Population Center (MPC) is a University-wide interdisciplinary cooperative for demographic research. The MPC serves sixty faculty members and research associates from ten colleges and nineteen departments at the University of Minnesota, and employs nearly a hundred research support staff, including computer programmers and technicians, administrative staff, research assistants, and data-entry staff. As a leading developer and disseminator of demographic data, we also serve a broader audience of some 6,000 demographic researchers worldwide.

Current Projects include:

The IPUMS consists of high-precision samples of the American population drawn from 15 censuses and the American Community Survey, spanning 1850 to 2005. The data and documentation are harmonized, making it easy to use multiple census years simultaneously.

IPUMS-International is an integrated series of census microdata samples from 1960 to the present. At this time, the series includes 63 samples drawn from 20 countries, with more scheduled for release in the future. MPC is collaborating with statistical agencies, data archives, and demographic experts from the participating countries to create this resource.

IPUMS-CPS provides integrated data and documentation from the Annual Demographic Supplement of the Current Population Survey (CPS) from 1962 to 2006. The harmonized CPS data is also compatible with the data from IPUMS-USA.

The North Atlantic Population Project (NAPP)
NAPP is a harmonized database of the complete censuses of Canada (1881), Great Britain (1881), Norway (1865, 1900), and the United States (1880). We will be adding six complete censuses of Iceland, the complete 1890 census of Sweden, and twenty samples from the late nineteenth and early twentieth century to the databases.

National Historical Geographic Information System (NHGIS)
NHGIS will provide U.S. aggregate census data and electronic boundary files for tracts and counties between 1790 and 2000.

Integrated Health Interview Series (IHIS)
The Integrated Health Interview Series (IHIS) will provide integrated data and documentation from the National Health Interview Surveys (NHIS) from 1963 to 2003. A preliminary dataset with 300 variables is now available.


ICPSR is pleased to announce the launch of MyClass - a mass registration process where faculty and librarians are able to reserve temporary user accounts for a class or training group who will be working with ICPSR data and resources.

Users of MyClass accounts can download and manipulate data without going through the process of establishing permanent MyData accounts. Data access is available immediately after creation of MyClass accounts. After a period of time that instructors specify, the temporary accounts will cease to exist.

To create MyClass accounts, you must have a valid MyData account and your institution must be a member of ICPSR. MyClass account creation can be found by following this link

About this Archive

This page is an archive of recent entries in the Social Sciences category.
