Monday, January 21, 2013

Benford’s Law becomes clearer

You remember Benford’s Law?  Wolfram sums it up:

Benford's law states that in listings, tables of statistics, etc., the digit 1 tends to occur with probability ~30%, much greater than the expected 11.1% (i.e., one digit out of 9).

I think Wolfram meant to add that the law applies to the leading digits of the figures you’re using. In any case, the law applies when you have a large sample of “real” data spanning several orders of magnitude, as opposed to computer-generated uniform random numbers, and so it’s useful to forensic accountants trying to detect fraud (or so I hear).
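If you want to see the effect for yourself, here is a minimal Python sketch. It assumes the usual statement of the law, P(d) = log10(1 + 1/d) for a leading digit d from 1 to 9, and uses the first 1000 powers of 2 as a stand-in data set. That data set and the helper names `benford_expected` and `leading_digit_frequencies` are my own illustrative choices, not anything from Wolfram or from real accounting data.

```python
import math


def benford_expected():
    """Expected leading-digit probabilities, assuming P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_frequencies(values):
    """Observed share of each leading digit 1-9 among positive integers."""
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[int(str(v)[0])] += 1  # first character of the decimal string
    total = len(values)
    return {d: counts[d] / total for d in counts}


# Illustrative data set (an assumption for this sketch): 2^1 through 2^1000.
powers = [2 ** n for n in range(1, 1001)]
expected = benford_expected()
observed = leading_digit_frequencies(powers)

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

The observed shares should track the expected ones closely, with 1 leading about 30% of the time. A flat distribution would put every digit near 11.1% instead.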


It may seem counterintuitive (shouldn’t all digits occur with equal probability?) – but the last few minutes (start at 7:53) of this Numberphile video helped me get a grip on what’s going on:


  1. How does that compare to the first digit of the number you get when picking a point at random on a slide rule scale? -- sure, the formula was P(d) = log10(1 + 1/d) = log10((d+1)/d) = log10(d+1) - log10(d), and that difference is just the length of the stretch of the scale between d and d+1. I find that pretty intuitive. Maybe it's because I spent a lot of time with slide rules when I was in high school and college.

  2. And what about poor old zero, a fine hard-working digit? Of course it never gets to be the leading digit -- but that ought to be a tip-off that the distribution isn't going to be flat. And it's not on the slide rule scale at all.
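The slide rule picture from the first comment can be sketched in a few lines of Python. This is a rough simulation under stated assumptions: the scale is modeled as running from 1 to 10 with a point at fraction u of its length sitting at the value 10^u, and the function name `slide_rule_leading_digit`, the seed, and the sample size are all illustrative choices of mine.

```python
import math
import random


def slide_rule_leading_digit(u):
    """Leading digit of the value at fraction u (0 <= u < 1) of a 1-to-10 log scale."""
    # min() guards against floating-point rounding pushing 10 ** u up to 10.0.
    return min(int(10 ** u), 9)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
counts = {d: 0 for d in range(1, 10)}
for _ in range(n):
    counts[slide_rule_leading_digit(rng.random())] += 1

for d in range(1, 10):
    # Length of the scale between d and d + 1, as a fraction of the whole scale.
    width = math.log10(d + 1) - math.log10(d)
    print(f"{d}: scale width {width:.3f}, simulated share {counts[d] / n:.3f}")
```

Each digit's simulated share should land close to the width of its stretch of the scale, and zero never shows up because the scale starts at 1.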