
Wednesday, December 23, 2009

Statistician productivity

John D. Cook ponders the issue of why computer programmers aren't paid in line with productivity.

Marginal Revolution picks this up, and the comments at MR predictably note that there are many programmers with negative productivity. Some favorite comments:
In my experience as a software engineer, the worst programmers produce negative work and the top programmers are worth 10X as much as a mediocre programmer. It's a wider range than any other job I've done.

There are substantial numbers of working programmers with negative productivity, destroying value for their companies every day. They create designs that must be expensively scrapped or rewritten, engender defects that suck up customer support dollars, or in the worst case create products that are essentially unusable and unsaleable.
It reminds me of all those critical jobs in firms where the person's occupation is to say "no" to lots of people who want to muck up a clean operation. ... the standard in the gaming industry to avoid "feature creep;" games are best when they do one thing very, very well.
This makes me think about statistician productivity, which seems subject to the same issues.

I think the most common problem I see in applied statistical work is error bounds that are too tight -- much too tight. This can happen in so many ways:
using sampling error, but labeling it total error,
having a complex design but using the simple random sample error formula,
using a confidence interval for a mean (e.g. forecast value of a mean) instead of the wider prediction interval for an individual value,
assuming independence when it's unwarranted.
If the stated error is too small, there is short-term happiness, but longer-term problems with expectations that can't be met.
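
Two of the items in the list above are easy to put rough numbers on. The following is a minimal Python sketch, not anything taken from a real analysis: the sample is simulated, the 1.96 multiplier is a normal approximation, and the cluster size and intraclass correlation passed to the design-effect formula (Kish's approximation for equal-sized clusters) are made-up values chosen only for illustration.

```python
import math
import random
import statistics

Z95 = 1.96  # normal-approximation multiplier for a 95% interval


def mean_ci_halfwidth(sample, z=Z95):
    """Half-width of an approximate 95% confidence interval for the mean."""
    s = statistics.stdev(sample)
    return z * s / math.sqrt(len(sample))


def prediction_halfwidth(sample, z=Z95):
    """Half-width of an approximate 95% interval for one new observation."""
    s = statistics.stdev(sample)
    return z * s * math.sqrt(1 + 1 / len(sample))


def design_effect(cluster_size, icc):
    """Kish's approximate variance inflation for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


# Simulated data: 400 draws with mean 100 and standard deviation 15 (assumed).
rng = random.Random(0)
sample = [rng.gauss(100, 15) for _ in range(400)]

ci_half = mean_ci_halfwidth(sample)
pred_half = prediction_halfwidth(sample)

# Hypothetical complex design: clusters of 20 with intraclass correlation 0.05.
deff = design_effect(cluster_size=20, icc=0.05)
se_inflation = math.sqrt(deff)

print(f"95% CI half-width for the mean:        {ci_half:.2f}")
print(f"95% interval half-width for one value: {pred_half:.2f}")
print(f"design effect: {deff:.2f}, standard error inflation: {se_inflation:.2f}")
```

With 400 observations the interval for a single new value is about 20 times wider than the interval for the mean (the ratio is the square root of n + 1). Under the assumed clustering, the true standard error is about 1.4 times what the simple random sample formula reports.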

But in at least one way this ineptitude isn't negatively productive. The short-term happiness may lead to a contract, so there's strong pressure to go with the smallest error figure available, which tends to come from the most inept person given the assignment.

Applied statistics definitely has this in common with what Cook writes about programmers:
The romantic image of an über-programmer is someone who fires up Emacs, types like a machine gun, and delivers a flawless final product from scratch. A more accurate image would be someone who stares quietly into space for a few minutes and then says “Hmm. I think I’ve seen something like this before.”

1 comment:

  1. I think you're right about underestimating uncertainty being the most common error. That's a common Bayesian criticism of frequentist statistics, that the frequentists assume things are known that are not known and thus underestimate uncertainty.

    Have you seen Sam Savage's book The Flaw of Averages? The main theme of the book is underestimating uncertainty, describing what can go wrong when a random quantity is replaced with its average.
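
The point about replacing a random quantity with its average can be illustrated with a small simulation. This is a hypothetical sketch and is not taken from the book: the capacity, the demand distribution, and the function name `units_sold` are all assumptions made for the example. Sales are capped at capacity, so plugging in average demand overstates average sales.

```python
import random
import statistics

CAPACITY = 100  # hypothetical maximum number of units that can be sold


def units_sold(demand, capacity=CAPACITY):
    """Sales are limited by capacity and cannot be negative."""
    return min(max(demand, 0), capacity)


# Assumed demand: normally distributed, mean 100, standard deviation 30.
rng = random.Random(1)
demands = [rng.gauss(100, 30) for _ in range(100_000)]

# Plan built on the average: evaluate sales at the average demand.
sold_at_average_demand = units_sold(statistics.fmean(demands))

# What actually happens on average: evaluate sales for each demand, then average.
average_units_sold = statistics.fmean([units_sold(d) for d in demands])

print(f"sales at average demand: {sold_at_average_demand:.1f}")
print(f"average sales:           {average_units_sold:.1f}")
```

Because high-demand periods are cut off at capacity while low-demand periods are not, average sales come out noticeably below the figure obtained from average demand, even though the demand forecast itself is unbiased.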