Google’s Causal Impact: Part 2 – Caution
Here it is. The highly anticipated second and final installment of my two-part series on Google’s Causal Impact R package. Please try to control your excitement.
In part 1, we took a general look at how the package helps marketers measure incrementality and ran through a toy example. Here’s a quick refresher on that example.
Default Causal Impact example. Shows actual test group against forecast, pointwise difference, and cumulative difference.
Causal Impact does a great job of showing the cumulative lift in our test group due to our dummy intervention. But what if the test group were actually the control group and vice versa? We would certainly expect Causal Impact to show a drop that mirrors the original lift. Something as trivial as labelling one set of data as the “test group” and another as the “control group” shouldn’t determine which group looks better in our analysis. Let’s take a look.
Flipping the test and control groups shows a decrease, as we expected.
Causal Impact correctly shows that our new test group, which was our original control, performed significantly worse after our dummy intervention. That’s great, but this is a nice, clean, textbook example. Let’s see if this holds in the real world.
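As a rough sketch of the comparison above, assume the toy data resembles the simulated example in the package documentation: a covariate `x1`, a response `y`, and a lift of 10 added to `y` after time point 70. These names and values are assumptions for illustration. Since `CausalImpact` treats the first column as the response and the remaining columns as controls, flipping the labels just means swapping the column order:

```r
library(CausalImpact)

set.seed(1)

# Simulated stand-in for the toy data (an assumption, not the data from part 1).
x1 <- 100 + as.numeric(arima.sim(model = list(ar = 0.999), n = 100))
y <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10   # dummy intervention: lift the test series

pre.period <- c(1, 70)
post.period <- c(71, 100)

# Original labelling: y is the test group (first column), x1 the control.
data <- cbind(y, x1)
original <- CausalImpact(data, pre.period, post.period)

# Flipped labelling: x1 becomes the "test group" and y the control.
swapped <- CausalImpact(data[, c("x1", "y")], pre.period, post.period)

# Compare the cumulative absolute effect under each labelling.
original$summary["Cumulative", "AbsEffect"]
swapped$summary["Cumulative", "AbsEffect"]
```

Calling `plot(original)` and `plot(swapped)` draws the three-panel charts for each labelling.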
Our internal team wanted to measure the lift from a new bidding strategy. I pulled some AdWords data from our data warehouse into R and created the data frame “data”. (Really clever name.) From there, I pulled out the data for my test and control groups, groups 2 and 1 respectively, and then used “join” and “zoo” to get the data into an acceptable format for Causal Impact. Code and output below:
Real world example showing Group 2 underperforming.
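A minimal sketch of that kind of preparation step, under stated assumptions: the real AdWords data is not shown, so the data frame below is simulated, and the column names (`date`, `group`, `clicks`), the dates, and the pre/post split are all hypothetical. It also assumes “join” refers to `plyr::join`.

```r
library(plyr)          # join()
library(zoo)           # zoo()
library(CausalImpact)

set.seed(42)

# Simulated placeholder for the warehouse pull (NOT the real AdWords data).
dates <- seq(as.Date("2016-01-01"), by = "day", length.out = 90)
baseline <- 200 + cumsum(rnorm(90))
data <- data.frame(
  date = rep(dates, times = 2),
  group = rep(c(1, 2), each = 90),
  clicks = c(baseline + rnorm(90), 0.9 * baseline + rnorm(90))
)

# Split into control (group 1) and test (group 2), one metric column each.
control <- data[data$group == 1, c("date", "clicks")]
test <- data[data$group == 2, c("date", "clicks")]
names(control)[2] <- "control"
names(test)[2] <- "test"

# Line the two groups up by date; the response (test) must be the first column.
joined <- join(test, control, by = "date", type = "inner")
series <- zoo(as.matrix(joined[, c("test", "control")]), order.by = joined$date)

# Hypothetical pre- and post-intervention windows.
pre.period <- as.Date(c("2016-01-01", "2016-02-29"))
post.period <- as.Date(c("2016-03-01", "2016-03-30"))

impact <- CausalImpact(series, pre.period, post.period)
summary(impact)
plot(impact)
```

Swapping the two columns of `series` before the `CausalImpact` call gives the flipped analysis.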
We see that group 2 performed significantly worse over the long haul, so we would conclude that our control group, group 1, outperformed our new bidding strategy and we should just continue operating as we had been.
Before we move on, let’s just flip our test and control labels and make sure group 1 still looks like the winner.
Real world example showing Group 1 underperforming.
Well that kinda sucks. Just by flipping the test and control labels, we now see that group 1 underperformed, contradicting our original conclusion. At this point, I wouldn’t feel comfortable using either result, and we have to go back to the drawing board to come up with a new way to analyze the data.
Why does this happen? I can only guess it has to do with the inexact science of automatically fitting forecasting models. Forecasting, in my opinion, can be a little sketchy to begin with. We’re taught from day 1 of statistics that extrapolation is a big no-no, but that’s exactly what forecasting is. On top of that, we are automatically fitting a model to the data. While functions that automatically fit models for us can save some time, they certainly aren’t perfect. Together, these two factors could lead to odd results like the ones we saw above.
All that said, Google’s Causal Impact package really is an innovative tool that can help marketers measure incrementality. The methodology is quite clever and the visuals are highly intuitive. We will continue to look for opportunities to leverage it. If you do find yourself in a perfect scenario to use it, just make sure you try flipping the test and control groups to ensure the results are consistent before taking action.
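One way to make that check routine is to wrap it in a small helper. The function below is a hypothetical sketch, not part of the package: `flip_check` and its output columns are names invented here. It assumes a two-column series with the test group first, runs the analysis in both directions, and flags whether the two cumulative effects point in opposite directions.

```r
library(CausalImpact)

# Hypothetical helper (not part of CausalImpact): run the analysis with the
# columns in their given order and again with them swapped.
flip_check <- function(series, pre.period, post.period) {
  stopifnot(ncol(series) == 2)
  forward <- CausalImpact(series, pre.period, post.period)
  reversed <- CausalImpact(series[, 2:1], pre.period, post.period)

  cumulative <- function(impact) impact$summary["Cumulative", "AbsEffect"]
  p_value <- function(impact) impact$summary["Cumulative", "p"]

  result <- data.frame(
    response = c("column 1", "column 2"),
    cumulative.effect = c(cumulative(forward), cumulative(reversed)),
    p = c(p_value(forward), p_value(reversed))
  )
  # Consistent labelling means one direction shows a lift and the other a drop.
  result$consistent <- prod(sign(result$cumulative.effect)) < 0
  result
}

# Usage on simulated data (an assumption for illustration only).
set.seed(1)
x1 <- 100 + as.numeric(arima.sim(model = list(ar = 0.999), n = 100))
y <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10
result <- flip_check(cbind(y, x1), c(1, 70), c(71, 100))
print(result)
```

A sign check is deliberately crude. It only catches the contradiction described above, where both labellings make the "test" group look worse, so it is worth reading the intervals and plots as well.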