You know what really grinds my gears? Using correlation to imply causation. It’s 2015, people; you should know better by now. When you Google “correlation vs causation,” about 5.3 million results come back in a quarter of a second. The first one? “Correlation does not imply causation,” courtesy of Wikipedia.
The only positive thing to come out of this epidemic is the rise of spurious correlations. In an attempt to clearly demonstrate that correlation does not imply causation, statisticians have found things that are correlated but absurdly unrelated. In fact, there is a website actually called “Spurious Correlations.” If correlation did imply causation, we had all better hope that the winning word at the national spelling bee is a short one, and people in Wisconsin shouldn’t use bedsheets.
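To see how easily this happens, here is a minimal sketch in Python. The two series are made-up numbers, not real data; the only thing they share is that both drift upward over time. The `pearson` helper is written out by hand for illustration rather than taken from any particular library.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Two invented yearly series that have nothing to do with each other.
# Each is just an upward trend plus a small, unrelated wobble.
years = range(10)
series_a = [5 + 0.5 * t + (0.3 if t % 2 else -0.3) for t in years]
series_b = [200 + 12 * t + (4 if t % 3 else -4) for t in years]

r = pearson(series_a, series_b)
print(f"correlation: {r:.2f}")
```

The correlation comes out well above 0.9, even though neither series was built from the other. A shared trend is all it takes.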
Don’t get me wrong, I’m not trying to say correlations are worthless. Problems arise when they are used as the final analysis. Correlations are simply a starting point to highlight potential relationships that could be investigated further, nothing more.
If correlation isn’t enough to show causation, what is? You have to run some sort of controlled experiment. Let’s say we want to show the impact of a branding campaign on overall revenue. Instead of just running the campaign and then looking for a correlation between branding spend and overall revenue, let’s set up an experiment. Find two markets that behave similarly, track their overall revenue for a few weeks, and then implement the branding strategy in just one of them (market A), leaving the other (market B) as the control. This allows us to see how the introduction of the branding campaign changed market A and compare that to the change, or lack of change, in control market B. Because the two markets behaved alike beforehand, whatever extra change shows up in market A is the part we can reasonably attribute to the campaign.
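The comparison described above is often called a difference-in-differences estimate. Here is a rough Python sketch of the arithmetic, assuming you have weekly revenue for both markets before and after the launch. The function name `diff_in_diff` and all of the revenue figures are invented for illustration; a real analysis would also want some measure of uncertainty around the result.

```python
from statistics import mean

def diff_in_diff(test_pre, test_post, control_pre, control_post):
    """Change in the test market minus the change in the control market."""
    test_change = mean(test_post) - mean(test_pre)
    control_change = mean(control_post) - mean(control_pre)
    return test_change - control_change

# Made-up weekly revenue (in thousands), for illustration only.
market_a_pre = [100, 102, 98, 100]     # test market, before the campaign
market_a_post = [110, 112, 108, 110]   # test market, after the campaign
market_b_pre = [90, 92, 88, 90]        # control market, same weeks as above
market_b_post = [92, 94, 90, 92]       # control market, no campaign

lift = diff_in_diff(market_a_pre, market_a_post, market_b_pre, market_b_post)
print(f"Estimated lift from the campaign: {lift:.1f}k per week")
```

With these numbers, market A went up by 10 and market B by 2, so the estimated lift is 8. Subtracting the control market’s change is what strips out whatever would have happened anyway, such as seasonality or a good month for everybody.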
Visualizing the market trends before and after the start of the branding campaign is a great way to start your analysis. You can finish by analyzing the differences between the test and control markets. Google has actually created an R package called CausalImpact to help with both of these. For more information on this package, consult your neighborhood statistician.
So the next time you see a little bit of a trend in the data, stop and think, “correlation does NOT imply causation,” before you start taking action. Begging isn’t beneath me. Please, please, PLEASE make it stop. Set up a controlled experiment first.
And that, people, is what grinds my gears.