As someone who values the knowledge-building work of research, I find it important to recognize some of the ways research gets misused or misunderstood. Ideally, statistics take a mass of data and present it more clearly, helping people understand more of what the data tells us. However, statistics are necessarily reductive, and sometimes important information gets left out. For example, Justin Matejka and George Fitzmaurice have put together a lovely visualization of how very different datasets can end up with the same statistical output. They argue for visualization (e.g. scatter plots) so that you can see the data, not just the summary statistics.
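The classic illustration of this phenomenon is Anscombe's quartet: four small datasets that share nearly identical means and correlations but look wildly different when plotted. A minimal check of the summary statistics, using only the standard library:

```python
# Anscombe's quartet: four datasets with near-identical summary
# statistics but very different shapes when plotted.
from statistics import mean

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

for xs, ys in [(x123, y1), (x123, y2), (x123, y3), (x4, y4)]:
    print(f"mean(y)={mean(ys):.2f}  r={pearson(xs, ys):.2f}")
# All four print mean(y)=7.50 and r=0.82, yet scatter plots reveal
# a line, a curve, an outlier-driven slope, and a vertical cluster.
```

Matejka and Fitzmaurice's "Datasaurus" work extends this same idea with far more dramatic shapes.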

However, visualization is not necessarily sufficient. Tyler Vigen has put together a set of graphs demonstrating obviously unrelated variables that nonetheless correlate neatly. A tidy graph does not change the fact that correlation does not equal causation!
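One common mechanism behind these coincidences is a shared trend: any two quantities that both drift upward over time will correlate strongly, whether or not they have anything to do with each other. A small sketch with made-up numbers (not Vigen's actual data):

```python
# Two series that merely share an upward trend correlate strongly
# even though neither causes the other. The values are invented
# for illustration.
from statistics import mean

years = list(range(10))
series_a = [float(t) for t in years]      # grows linearly over time
series_b = [float(t * t) for t in years]  # grows quadratically over time

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

r = pearson(series_a, series_b)
print(f"r = {r:.2f}")  # ~0.96: a shared trend, not a causal link
```

The high correlation here reflects nothing but time passing, which is exactly how so many of Vigen's pairs end up looking related.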

Even peer-reviewed, published research can be flawed or overstate the confidence of its findings. One tool for evaluating academic research that uses statistics is Ulrich Schimmack’s Replicability Index, which looks at the findings, sample size, and power to estimate the likelihood that the results would replicate if the study were repeated. As disappointing as it is, one empirical study with statistically significant findings is not sufficient evidence to act on; with publication bias stripping away the context of all the non-significant findings, we don’t usually have all the evidence anyway. When we make decisions that affect people’s lives in profound ways, it is wise to stay humble about what we actually know. Gerd Gigerenzer has some excellent writing on becoming more careful decision makers. With such a wealth of information available, we need to become ever-wiser consumers of it, able to recognize when things are not as certain as they appear.
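Schimmack's actual tool is more careful than anything I could fit here, but the core intuition can be sketched: estimate each study's observed power from its test statistic, compare the set's success rate to that power, and treat the gap as likely inflation. This is a rough sketch under simplifying assumptions (z-tests, invented z-scores), not the Replicability Index itself:

```python
# Rough sketch of the R-Index intuition, assuming simple z-tests.
# The z-scores are invented for illustration, not from real studies.
from statistics import NormalDist, median

CRIT = 1.96  # two-sided z criterion for p < .05

def observed_power(z):
    # Probability of a significant result if the observed effect
    # were the true effect (post-hoc power for a z-test).
    return 1 - NormalDist().cdf(CRIT - z)

z_scores = [2.5, 2.0, 2.3]  # hypothetical reported test statistics
success_rate = sum(z > CRIT for z in z_scores) / len(z_scores)
med_power = median(observed_power(z) for z in z_scores)

# If every study "succeeds" but the studies were underpowered, the
# success rate exceeds what the power could plausibly deliver.
inflation = success_rate - med_power
r_index = med_power - inflation
print(f"success={success_rate:.2f}  power={med_power:.2f}  R-Index={r_index:.2f}")
```

Here every study is significant (success rate 1.0) but median observed power is well below that, so the sketch discounts the results accordingly.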