View Zhirong Zhao's profile on LinkedIn

Blogroll

My pages

Visitors

Powered by

UThinkrunning MT v.4.25

Header image of Hong Kong financial center courtesy of hleung on flickr.

Merging panel data in R

It is easy to merge two data frames with a common column in R (see R: Merge Two Data Frames), but I am not sure whether the same command “merge” can be used easily to combine two panel data together through the intersection of two common columns, for instance, “NAME” and “YEAR.” So I define the following function “pmerge” to do that for me:


1. Codes for the function
-------------------------------------------------------
pmerge <- function(X,Y){
        n <- max(X$YEAR)-min(X$YEAR)+1
        n0 <- min(X$YEAR)
        Z <- list(NULL)
        for (i in 1:n) {
        X1 <- subset(X,YEAR==n0+i-1)
        Y1 <- subset(Y,YEAR==n0+i-1)
        Z[i] <- list(merge(X1,Y1,by.x="NAME",
by.y="NAME",all=TRUE))
                }
        ZZ <- data.frame(Z[1]);dim(ZZ)
        YEAR <- rep(min(X$YEAR),dim(data.frame(Z[1]))[1])
        for (i in 2:n) {
        ZZ <- rbind(ZZ,data.frame(Z[i]))
        YEAR <- c(YEAR, rep(min(X$YEAR)+i-1,
dim(data.frame(Z[i]))[1]))
         }
        ZZ$YEAR <- YEAR; ZZ
        }
-------------------------------------------------------

2. An example
-------------------------------------------------------
#. Assuming that you will merge two data frames, “A” and “B,” through two common columns of “NAME” and “YEAR.

> dim(A); names(A)
[1] 12820 4
[1] "NAME" "YEAR" "SERVICE" "POP"
> dim(B); names(B)
[1] 1071 4
[1] "NAME" "YEAR" "MSA" "HGWAY"
> D <- pmerge(A,B)
> dim(D); names(D)
[1] 12910 8
[1] "NAME" "YEAR.x" "SERVICE" "POP" "YEAR.y" "MSA" "HGWAY"
[8] "YEAR"
> D[sample(dim(D)[1],8),]
......
-------------------------------------------------------


Note that that columns “YEAR.x” and “YEAR.y” may have missing values, indicating missing observations in either data A or data B, respectively.

Post a comment

Hubert H. Humphrey Institute of Public Affairs