
One of the trickier calculations you’ll ever need to get your head round as a marketer is statistical significance, especially if you’re not a natural statistician. Understanding how you’d use it is half the battle.
Statistical significance could also be one of the most important calculations you’ll need to understand. As a marketer, your mantra is of course “always be testing”. But there’s the problem: how do you know whether a slight conversion rate uplift warrants a roll-out, a retest or ditching?
For starters, keep it simple. If you want to achieve significance you’ll want to avoid the complexities of a multivariate test, and if you’re running an A/B test make sure you change only one variable, otherwise your results will be invalid. I’ve seen this mistake too many times and been guilty of it myself a couple of times: you make a few tweaks to an email and it works fantastically; then you need to identify exactly what it was that worked… bugger.
What you’re looking to achieve is to reject the null hypothesis: the assumption that there is no real difference between your control and your variant. The null hypothesis is assumed to be true until evidence suggests otherwise.
In most cases, there will only be two actions a user can take: click or not click; convert or not convert. Because of this you can use a Chi-Square test, as the data is categorical. In this case, categorical just means “yes or no”: there aren’t levels of quality of clicks.
You can test at varying degrees of confidence, most often 95%. The other side of that coin is the significance level, often called the alpha, which is calculated as 1 minus the confidence. So for 95% confidence, you’re looking at =1-0.95, which gives an alpha of 0.05. The higher the requirement for significance, the lower your alpha will be. Because you’re only ever likely to be comparing one result against another, your degree of freedom will be 1.
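If it helps to see that arithmetic written down, here’s a rough sketch in Python. The variable names are my own, and the degrees of freedom line uses the standard rule for a results table, (rows - 1) * (columns - 1), which for a yes/no outcome across a control and one variant comes out at 1.

```python
# Illustrative names only; nothing here comes from a specific library.
confidence = 0.95                   # the confidence level you want
alpha = round(1 - confidence, 4)    # rounded to dodge floating-point noise

# An A/B test with a yes/no outcome gives a 2x2 table:
# (converted, not converted) x (control, variant).
rows, columns = 2, 2
degrees_of_freedom = (rows - 1) * (columns - 1)

print(alpha, degrees_of_freedom)    # prints: 0.05 1
```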
For me, this was the most confusing bit to get my head around. Take some time to look at this Chi-Square distribution table; it makes sense eventually. As your degree of freedom is always going to be 1 in an A/B test, the significance figures that you’re looking to beat are going to be limited, but we’ll come back to this once you have your results.
Now, run your test and collect your data in a table, like this:
| Converted | Control | Variant | Total |
| Yes | 131 | 111 | 242 |
| No | 3527 | 3656 | 7183 |
| Total Clicks | 3658 | 3767 | 7425 |
If there were no relationship between your test and the results, the null hypothesis would be true and the CTR (or ConvRate, or OR, etc.) would be the same for both your control and variant. To test that, you need to calculate what your expected values for your control and variant would be if their rates really were the same. For each cell that’s the row total multiplied by the column total, divided by the grand total, so for converted control you use the equation =(total converted * total control)/total
So in this example =(242*3658)/7425
Then calculate this for the other results in your table to give you your expected volume at an equal rate. Like this:
| Expected Convs | Control | Variant | Total |
| Yes | 119 | 123 | 242 |
| No | 3539 | 3644 | 7183 |
| Total Clicks | 3658 | 3767 | 7425 |
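If you’d rather not fill in the expected table by hand, the same calculation can be sketched in a few lines of Python. This is only an illustration: the `observed` layout and the `expected_counts` helper are names I’ve made up, and the figures are the ones from the results table above.

```python
# Observed counts from the results table above.
observed = {
    "yes": {"control": 131, "variant": 111},
    "no": {"control": 3527, "variant": 3656},
}

def expected_counts(observed):
    """Expected count per cell if control and variant converted at the same rate."""
    groups = list(next(iter(observed.values())))
    row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
    col_totals = {g: sum(observed[row][g] for row in observed) for g in groups}
    grand_total = sum(row_totals.values())
    # Each cell: (row total * column total) / grand total
    return {
        row: {g: row_totals[row] * col_totals[g] / grand_total for g in groups}
        for row in observed
    }

expected = expected_counts(observed)
for row, cells in expected.items():
    print(row, {g: round(value) for g, value in cells.items()})
```

Rounded to whole numbers, this gives the same 119, 123, 3539 and 3644 as the table.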
Now to calculate Chi-Square, compare your actual volume to your expected volume. This comparison is done by subtracting the observed from the expected, squaring the result and dividing that by the expected volume. So you understand what’s happening here: squaring the difference between the volumes makes every difference positive and amplifies its effect, so it’s more noticeable, and then dividing that result by the expected result normalises the figure.
The equation looks like this =((expected control - actual control)^2)/expected control
And in this example =((119-131)^2)/119
Then calculate this for the other results in your table. Once that’s been worked out, sum the totals across rows and columns and they should add up to the same figure, in this case, 2.37.
| Chi-Square | Control | Variant | Total |
| Yes | 1.16 | 1.13 | 2.29 |
| No | 0.04 | 0.04 | 0.08 |
| Total Clicks | 1.20 | 1.17 | 2.37 |
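Here’s a minimal sketch of the whole Chi-Square calculation in Python, in case you want to check your spreadsheet. The `chi_square` function is my own illustrative helper, not a library call, and it works from the unrounded expected values.

```python
# Rows: converted yes / no. Columns: control / variant.
observed = [[131, 111], [3527, 3656]]

def chi_square(observed):
    """Sum of (expected - actual)^2 / expected over every cell in the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, actual in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            total += (expected - actual) ** 2 / expected
    return total

print(round(chi_square(observed), 2))  # prints: 2.37
```

If you reach for SciPy instead, be aware that `scipy.stats.chi2_contingency` applies Yates’ continuity correction to 2x2 tables by default, so you’d need to pass `correction=False` to reproduce this figure.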
To see whether your results are significant you want to take your Chi-Square number and look it up against a Chi-Square distribution table based upon your chosen alpha and your degree of freedom, which as mentioned before is always 1 for an A/B test. Still confused? Say you want 95% confidence that the results of your last test have won: you take the alpha of 0.05 and look it up against your degree of freedom, 1. This will give you a Chi-Square value which your Chi-Square calculation must be equal to, or exceed, for your results to be statistically significant.
| | χ².100 | χ².050 | χ².025 | χ².010 | χ².005 |
| Chi-Square | 2.71 | 3.84 | 5.02 | 6.63 | 7.88 |
| Significance | 90% | 95% | 97.5% | 99% | 99.5% |
Because my Chi-Square result of 2.37 is less than 3.84, my test results are not statistically significant despite being a winning test, and I couldn’t confidently say that my CTR is related to the difference in CTA. If I wanted to be certain that this wasn’t due to simple variance, I’d run the test again – the exact same test, mind you – and aggregate the results.
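For the curious, that final check can also be sketched in Python without a lookup table. With 1 degree of freedom the Chi-Square distribution is just a squared standard normal, so the p-value (the chance of seeing a Chi-Square at least this big if the null hypothesis were true) comes straight out of `math.erfc`. The function name is my own, and the shortcut only holds for 1 degree of freedom.

```python
import math

def p_value_one_df(chi_square):
    """P-value for a Chi-Square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))

chi_square = 2.37        # the result calculated in this example
alpha = 0.05             # 95% confidence
critical_value = 3.84    # table value for alpha 0.05, 1 degree of freedom

significant = chi_square >= critical_value
p_value = p_value_one_df(chi_square)

# The two checks should always agree: significant exactly when p_value <= alpha.
print(significant, p_value <= alpha)  # prints: False False
```

Here the p-value comes out at roughly 0.12, comfortably above the 0.05 alpha, which tells the same story as the table lookup.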
