April 27, 2010

More HW hints, Ch4 Exercise 16

Many students have asked about part b).

The idea is this: how well are each of these three models able to accommodate the outliers present in the data? One way to assess this is by considering the posterior distribution of the width of an interval in the *likelihood.* Some models will shift the mean up to reach the outlier, while others will have sufficiently heavy tails to accommodate it in the variance.

So for each model's population-level sigma parameter (i.e., 1/sqrt(tau0[k] for k=1,2,3), we consider the posterior distribution of the width of a 95% interval in the likelihood that model induces for the data.

To get this posterior, we use the CODA samples of sigma[k]. For each Gibbs draw, compute the .025 and .975 quantiles of the associated likelihood (normal, t, or DE), take their difference, and call this a single posterior sample from the likelihood interval width. Repeat this procedure for all the sigma[k] Gibbs draws and you will have a histogram for the posterior of this quantile difference in the k^th model. Then you can compare across models.

April 21, 2010

HW Hint, Ch4 Ex 16

Several students have asked about how to do the comparison across models for the stack loss data, versus the data in Example 4.7, where each model has a single parameter, theta.

In the stack loss models, you have some regression parameters (betas) and also some fitted medians (mu_j) for each year. I would say, take a look at the betas, say beta_1, in each model. Also look at a couple of the mu's, say mu_21 because we know that is any outlying year, and one other mu_j in a non-outlying year for comparison.

April 8, 2010

Bug fixes

Hey guys: a couple of bug fixes. First of all, there was a mistake in my "outline" of the metropolis code. I had the inequality for the acceptance ratio logic going the wrong direction. The corrected code follows:

## Sample Metropolis-Hastings algorithm
# function to compute log(h(theta))
log.h <- function(theta,data){
met <- function(data,initials,NREP){
  # Constants
  n <- dim(data)[1]
  p <- length(initials)
  # Empty matrix to hold samples
  chain <- matrix(NA,NREP+1,p)
  # Initialize the chain
  chain[1,] <- initials
  for (i in 2:(NREP+1)){
    # Parameters of the proposal
    meen <-
    var <-
    # Draw a proposal value
    thetastar <- rnorm(1,meen, var)
    # Compute the rejection ratio
    logr <- log.h(thetastar) + dnorm(chain[i-1,],meen,sqrt(var),log=T)
       - log.h(chain[i-1,]) - dnorm(thetastar,meen,sqrt(var),log=T)
    # Decide whether to accept or reject
    if (logr > 0){
      chain[i,] <- thetastar
    } elseif (logr > log(runif(1))) {
      chain[i,] <- thetastar
    } else {
      chain[i,] <- chain[i-1,]

Second, it appears that in the GeoBugs extra credit opportunity, the maps aren't going to load in because it doesn't like negative x values. One solution is to simply add a large constant to all of the longitude values in the polygon file produced by R so that they are positive.

February 22, 2010

Posterior Density Plots from BRugs

If you're using BRugs to fit your models (which I think has many advantages over using WinBUGS directly), here's how to make a density plot:

If you set the coda option to TRUE, the output object from BRugsFit will be of type mcmc.list. This is a list in which each element is one chain of MCMC samples. Within each chain, you have a matrix with one node in each column. To plot the histogram for both chains for a single node, use code like this:

myfit <- BRugsFit(...,coda=T)
Z13.samples <- unlist(myfit[,'Z[13]',]) # this gets the 'Z[13]' samples from all chains

February 21, 2010

Posterior Predictive Probabilities Trick

Here's a little hint for HW3:

Brad has shown you how to put in a covariate value, X=100, and leave the outcome value unobserved, Z[13]=NA. This will get WinBUGS to sample from the posterior predictive distribution for Z[13]|X=100.

To compute probabilities from this distribution, you can use an indicator function called step(x). The function returns 1 if x >= 0 and 0 otherwise. Thus, if you want an indicator function for Z[13] > 0, use step(Z[13]).

Notice that this is a transformation of your predicted value into a Bernoulli random variable with mean = Pr(Z[13]>0|X=100). Also, since the outcome is continuous, the strict inequality doesn't matter. Finally, by the definition of Z, it represents the reduction, so positive values of Z are reductions in temperature, and negative values of Z are increases in temperature.

If you give this thing a name and monitor it in WinBUGS, then its posterior mean will be the posterior probability that Z[13]>0|X=100.

Of course, you can also do it outside of WinBUGS by saving off the MCMC samples for Z[13] (the so-called coda chains) and processing them in R or whatever. It's a simple matter of counting up the proportion of posterior samples for Z[13] that are >0.