Statistics vs truthiness

I thoroughly enjoyed reading Howard Wainer’s Truth or Truthiness: Distinguishing Fact From Fiction By Learning to Think Like a Data Scientist. I even laughed out loud occasionally, as there’s a lot of wit on display here, and one gets a strong sense of Wainer’s personality. This is unusual in a book about statistics (although, having said that, Angrist and Pischke also do quite well on the clarity and fun front, especially for econometricians).

Truth or Truthiness is in effect a collection of essays, published as a response to this brave new world of truthiness (i.e. lies that people believe because they want to) in politics and public debate. Wainer writes very clearly about statistics in general, and about his main theme here, causal inference. This is of course dear to the heart of economists, and gratifyingly Wainer recognises that the profession is more scrupulous than most disciplines about causation. The book starts by underlining the importance of having a clear counterfactual in mind and thinking – thinking! – about how it might be possible to estimate the size of any causal effect. As Wainer puts it, “The real world is hopelessly multivariate,” so untangling the causality is never going to happen without careful thought.

I also discovered that one aspect of something that’s bugged me since my thesis days – when I started disaggregating macro data – namely the pitfalls of aggregation, has a name elsewhere in the scholarly forest: “The ecological fallacy, in which apparent structure exists in grouped (eg average) data that disappears or even reverses on the individual level.” It seems it’s a commonplace in statistics – here’s one clear explanation I found. Actually, I think the aggregation issues are more extensive in economics; for example, I once heard Dave Giles give a brilliant lecture on how time aggregation can lead to spurious autocorrelation results.
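To make the ecological fallacy concrete, here is a minimal Python sketch. The numbers are entirely made up for illustration and are not taken from Wainer’s book: five hypothetical groups whose averages rise together, while within each group the relationship between the two variables runs the other way. The helper names (pearson, make_groups) are my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def make_groups(n_groups=5, offsets=(-10, -5, 0, 5, 10)):
    """Build illustrative (x, y) data, one list of points per group.

    Group g is centred on (g, g), so group averages rise together.
    Within a group, x moves by +d while y moves by -d, so individuals
    show a negative relationship. All values are invented.
    """
    return [[(g + d, g - d) for d in offsets] for g in range(n_groups)]


groups = make_groups()

# Correlation computed on group averages (the "ecological" view).
group_mean_x = [sum(x for x, _ in grp) / len(grp) for grp in groups]
group_mean_y = [sum(y for _, y in grp) / len(grp) for grp in groups]
group_r = pearson(group_mean_x, group_mean_y)

# Correlation computed on every individual point, pooled across groups.
all_x = [x for grp in groups for x, _ in grp]
all_y = [y for grp in groups for _, y in grp]
individual_r = pearson(all_x, all_y)

print(f"correlation of group averages: {group_r:.3f}")
print(f"correlation of individuals:    {individual_r:.3f}")
```

In this toy set-up the group averages are perfectly positively correlated, while the pooled individual data are strongly negatively correlated, because the within-group variation swamps the between-group variation. Real data will rarely be this stark, but the sign flip is the point.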

Having said how much I enjoyed reading Truth or Truthiness, I’m not sure who it’s aimed at, other than readers who are already really interested in statistics. For newcomers to Wainer, I’d recommend his wonderful earlier books, Picturing the Uncertain World and Graphic Discovery. They’re up there with Edward Tufte’s books on intelligent visualisation (rather than the decorative visualisation that’s become unfortunately common).