October 7, 2004

enough with the averages, how about the variance?

Most competitive sports -- baseball and cricket being the archetypes -- generate a lot of statistics. Most of them are averages; very rarely are they measures of variability. But it's in the variability that games turn and hard decisions have to be made. That's as true for the coach on the sidelines as for fantasy whatever-ball.

Sticking with baseball and cricket for the moment, some of these numbers are statistics only in the broadest sense, and have little meaning. If you've ever watched cricket, you'll be familiar with phrases from the commentators like "this is a record partnership for the tenth wicket in the second innings between these two countries played in country X."

Translated into baseball commentary: "this is the first time any player from team X has hit a triple in the bottom of the ninth with two out when playing team Y on the road."

Pretty impressive, huh ... Since lifetime bests, team bests, and all-time bests are really quite rare, both baseball and cricket commentators regularly resort to inflating the value of average play by making it the best of all time in some highly specific context.

I've always been curious who keeps these statistics, and how it's done. Do they have a database at their fingertips, so they can calculate these rarities at will? Or do you actually have to remember stuff like that to get on TV? The phrase idiot savant comes to mind.

But these aren't really statistics, they're records.

What you hear a lot of in both sports are references to averages: batting averages, and the average cost in runs for a pitcher/bowler to get someone out.

And if you're only going to know one number about a player, an average [of some form] is probably the most useful.

What's curious is that we don't hear so much about the variation in players' performances. Stick with me through the next cricket example, because this really does apply to other sports.

In cricket, a batting average of over 30 in international matches is pretty good, over 40 is exceptional, and anything over 50 makes you a legend (get up towards 100 and you will have statues and knighthoods).

But who would you rather have on your team, a guy who gets scores of 9, 0, 105, 1, 4, 97 ... or a guy who racks up 35, 29, 46, 27, 51, 18 ... Over those six innings both average in the mid-30s (36 and about 34), but you'd probably take the second guy.
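To put numbers on that comparison, here is a rough sketch in Python using only the six scores listed for each player. It assumes every innings ended in a dismissal, so the batting average is just the plain mean, and it uses the sample standard deviation as one possible measure of consistency. The names `erratic`, `steady` and `summarize` are made up for illustration.

```python
from statistics import mean, stdev

# The six scores listed for each hypothetical batsman in the post.
# Assumption: each innings ended in a dismissal, so average == plain mean.
erratic = [9, 0, 105, 1, 4, 97]
steady = [35, 29, 46, 27, 51, 18]

def summarize(scores):
    """Return (average, sample standard deviation) for a list of scores."""
    return mean(scores), stdev(scores)

for name, scores in [("erratic", erratic), ("steady", steady)]:
    avg, sd = summarize(scores)
    print(f"{name}: average {avg:.1f}, standard deviation {sd:.1f}")

# Prints:
# erratic: average 36.0, standard deviation 50.5
# steady: average 34.3, standard deviation 12.4
```

Near-identical averages, but the first player's spread is roughly four times the second's. That second number is the one that never makes the stats page.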

The analogy to earned run averages in baseball and points per game in basketball should be pretty straightforward.

In fact, the way variation comes out in sports is when people talk about a player's "consistency." There are few sports that don't value consistency, but it's only when a player is truly, maddeningly inconsistent that commentators and journalists bother to make the comparisons.

Cross country team meets, interestingly or trivially, often report the 'packing analysis' or the 'spread' for a team. But this is the only sport I can think of where some measure of variability is reported. Perhaps there are others.

Look at the individual stats pages for baseball, basketball, or football. All averages. No variation there.

But in the final minutes of the game, who are you going to put out there? The guy who has the lower average, but rarely gets nothing; or the guy who can give you a huge win, but might also fail completely ... Depends how far behind you are, really.
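The "depends how far behind you are" point can be sketched with the two sets of cricket scores from earlier. This is only a crude illustration: it assumes past scores are a fair guide to the next one, and simply counts how often each player reached a given target. The function name `chance_of_at_least` and the targets of 20 and 60 are invented for the example.

```python
# The six scores listed earlier for the two hypothetical players.
erratic = [9, 0, 105, 1, 4, 97]
steady = [35, 29, 46, 27, 51, 18]

def chance_of_at_least(scores, needed):
    """Fraction of past scores that reached `needed` (a crude empirical estimate)."""
    return sum(score >= needed for score in scores) / len(scores)

for needed in (20, 60):
    e = chance_of_at_least(erratic, needed)
    s = chance_of_at_least(steady, needed)
    print(f"need {needed}: erratic {e:.2f}, steady {s:.2f}")

# Prints:
# need 20: erratic 0.33, steady 0.83
# need 60: erratic 0.33, steady 0.00
```

Need a modest score and the steady player is the obvious pick; need a big one and only the erratic player has ever delivered it. Six scores is far too small a sample to bet on, but it shows why the average alone can't answer the question.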

Clearly, newspapers are restricted in the amount of stuff they can publish. But the web ... the web has no such restrictions on space. And with sports betting a growing market, it's stranger still we don't hear more specific discussion of consistency and variability in performance.

For now, though, I have a day job and a dissertation that preclude making these calculations myself, and all I can ask is that anyone who uses this thought to win a fantasy football league or sports betting ring send me some of their winnings.

Posted by robe0419 at October 7, 2004 2:11 PM