View Zhirong Zhao's profile on LinkedIn

A function for descriptive statistics with R

I wrote this function to prepare the table of summary statistics for the variables in a regression model. In such a table I typically include the mean, standard deviation, minimum, and maximum; the function also reports the median and the number of observations. It can be applied directly to the output of general regression functions such as lm() or glm(). I can then save the result in a csv file and run Excel2LaTeX to get the LaTeX code.


#. The function is:
------------------------------------------------------------
msummary <- function(M){
        # M$model is the model frame stored in the fitted lm() or glm() object
        n <- ncol(M$model)
        s <- matrix(NA,n,5)
        for (i in seq_len(n)) {
                # coerce to numeric (a factor is replaced by its integer codes)
                x <- as.numeric(M$model[,i])
                s[i,] <- c(mean(x), sd(x), min(x), median(x), max(x))
                }
        S <- data.frame(s)
        colnames(S) <- c("Mean","St.D.","Min","Median","Max")
        # every variable in the model frame has the same number of rows
        S$N <- rep(nrow(M$model),n)
        rownames(S) <- names(M$model)
        S
        }
------------------------------------------------------------
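
The loop above can also be written without an explicit index. The following is only a sketch of that idea, not the version used in this post: the name msummary_compact is made up for illustration, and the example model is fitted to mtcars, a data set that ships with R, rather than to the data used below.

```r
# Hypothetical variant; msummary_compact is an illustrative name only.
msummary_compact <- function(M) {
  mf <- M$model                      # model frame stored by lm() / glm()
  stats <- function(x) {
    x <- as.numeric(x)               # same coercion as the loop version
    c(Mean = mean(x), St.D. = sd(x), Min = min(x),
      Median = median(x), Max = max(x))
  }
  # sapply() returns one column per variable; transpose to one row each
  S <- data.frame(t(sapply(mf, stats)))
  S$N <- nrow(mf)                    # recycled to every row
  S
}

# Usage on a built-in data set
fit <- lm(mpg ~ wt + hp, data = mtcars)
out <- msummary_compact(fit)
print(round(out, 3))
```

Both versions rely on the fitted object keeping its model frame, which lm() and glm() do by default.
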
#. An example:
------------------------------------------------------------
> M1.c$call
glm(formula = OUT ~ -1 + SERVICE + MSA + POPLG + POPLG2 + EDULEVEL + 
    POVERTY + CHIEF + PTCLG + STCLG + PTC2 + STC2 + UNPAVED + 
    PARTY + TURNOUT, family = binomial(link = "logit"), data = data)
> msummary(M1.c)
            Mean  St.D.    Min  Median    Max     N
OUT        0.269  0.444  0.000   0.000   1.00 10661
SERVICE    8.979  5.141  1.000   9.000  19.00 10661
MSA        0.296  0.456  0.000   0.000   1.00 10661
POPLG     10.265  1.132  7.510  10.071  13.73 10661
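
The last step of the workflow, saving the table to a csv file, might look like the sketch below. It is an illustration under stated assumptions: the small table is a stand-in built from the built-in mtcars data so the code runs on its own, and the file name summary_stats.csv and the use of a temporary directory are placeholders, not what was used for the model above.

```r
# Stand-in for the table returned by msummary(); two columns suffice here.
fit <- lm(mpg ~ wt + hp, data = mtcars)   # mtcars ships with R
tab <- data.frame(Mean  = sapply(fit$model, mean),
                  St.D. = sapply(fit$model, sd))

# Hypothetical file location; a temporary directory keeps the example tidy.
csv_path <- file.path(tempdir(), "summary_stats.csv")

# Round for presentation; the row names carry the variable names.
write.csv(round(tab, 3), file = csv_path, row.names = TRUE)

# Read the file back to check what will later be opened in Excel.
back <- read.csv(csv_path, row.names = 1)
print(back)
```

Rounding before writing keeps the csv close to what the final LaTeX table should show, so little editing is needed after the Excel2LaTeX step.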


Hubert H. Humphrey Institute of Public Affairs