Friday, 28 August 2015

The value of the toss

Mike Selvey has written an interesting piece in the Guardian arguing that the toss makes very little difference to the outcome of Test matches. This idea interests me, since pre-match analysis and the chat amongst fans often seem to attach enormous tactical significance to winning the toss. Could it be that it's all just pageantry?

Selvey backs up his point with statistics from England and Australia's recent history, but the subject intrigues me and I feel there's room to chip in with my own statistical two penn'orth.

The table below shows the number of wins and losses for teams winning the toss in various formats. Taking my lead from the suggestion that uncovered pitches made a big difference to the value of the toss, I've divided the Test data into pre- and post-1970. (Honestly, I'm a bit young to remember uncovered pitches, but this article suggests they began to be phased out in the 60s, which is why I picked 1970.) I've split the data for the limited-overs formats into day, day-night and night games, since there's good reason to think that might be important.

I've also made an attempt at evaluating whether the difference between wins and losses is "statistically significant" in each case. More on that below.

[Table: wins and losses for the toss-winning team in each format and era, with a p-value for the difference in each case]
The data is consistent with the uncovered pitches idea: the win/loss ratio for teams winning the toss is much better pre-1970 than post-1970. The post-1970 data, meanwhile, is pretty consistent with Selvey's assertion that the toss doesn't matter too much. The ODI data seems consistent with the idea that the toss doesn't matter much in Day games, but plays a role in Day-Night games. Counter-intuitively, the T20 data seems to reverse that trend.

So, what of this statistical significance?

One way of estimating whether a statistical finding is significant for a given sample size is to calculate the probability that you could have got a result at least as extreme from that sample size, assuming that the "null hypothesis" is true. In our case the null hypothesis would be "the toss makes no difference". If this were true, then in any given game (excluding draws) there would be a 50/50 chance that the toss would just happen to fall for the team that was going to win the game anyway. From that assumption we can work out how likely it is that we would get a result as extreme as the one we did, from the sample size we have, if the toss really made no difference. This number is the "p-value" in the second-last column of the table.
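For anyone who wants to check the arithmetic, here's a minimal sketch of that calculation in Python, using SciPy's exact binomial test (the binomtest function, available in SciPy 1.7 and later). The win/loss counts in the example are made up for illustration, not taken from the table.

```python
# A sketch of the p-value calculation described above: an exact binomial
# test against the null hypothesis that the toss makes no difference,
# i.e. the toss winner wins each decided game with probability 0.5.
from scipy.stats import binomtest

def toss_p_value(toss_winner_wins, toss_winner_losses):
    """Two-sided p-value for the observed win/loss split, with draws excluded."""
    n = toss_winner_wins + toss_winner_losses
    result = binomtest(toss_winner_wins, n=n, p=0.5, alternative="two-sided")
    return result.pvalue

# Example with made-up counts: 300 wins vs 250 losses for the toss winner.
print(toss_p_value(300, 250))  # compare against the 0.05 threshold
```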

Conventionally, if this number is less than 0.05 we call our result "statistically significant". There are plenty of issues with this approach, not least that the 0.05 threshold is completely arbitrary, but it at least gives us a rough starting point for deciding how much importance we should attach to a finding.

In this case, the findings that the toss makes a (positive) difference in Day-Night ODIs and Daytime T20Is, and that it used to make a difference in pre-1970 Tests, all pass our significance threshold of 0.05. The other cases fail to pass the threshold. That certainly isn't the same as saying we've proved the toss makes no difference in those cases, but it does mean we can't be sure it does with the data available.

As far as modern Test matches go, it seems Mike Selvey is right: for the most part we probably do over-analyse the toss. Even if the difference it makes is real, it seems to be tiny. So, next time I think Alastair Cook has made the wrong call after the coin flip, I'll try to remember to give the guy a break; it may well not make any difference.

1 comment:

  1. What you're looking at here is relative risk: the relative chance of winning given you have won the toss, compared to when you have lost it. You can then get away from the unnecessary p-values and just focus on effect size. For example, for your day T20s the result is 1.4 (1.15 to 1.71), so the chances of winning a T20 if you win the toss were on average 40% higher, but the true value may lie anywhere between 15% and 71% higher.
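For what it's worth, here's a rough Python sketch of the relative-risk calculation the commenter describes, with an approximate 95% confidence interval from the usual normal approximation on the log scale. The counts in the example are placeholders, not the actual day T20 figures.

```python
import math

def relative_risk(wins_toss_won, games_toss_won, wins_toss_lost, games_toss_lost):
    """Relative risk of winning the match after winning vs losing the toss,
    with an approximate 95% CI computed on the log scale."""
    p1 = wins_toss_won / games_toss_won    # win rate after winning the toss
    p2 = wins_toss_lost / games_toss_lost  # win rate after losing the toss
    rr = p1 / p2
    # Standard error of log(RR)
    se = math.sqrt((1 - p1) / wins_toss_won + (1 - p2) / wins_toss_lost)
    lower = math.exp(math.log(rr) - 1.96 * se)
    upper = math.exp(math.log(rr) + 1.96 * se)
    return rr, lower, upper

# Example with made-up counts: 140/250 wins after winning the toss,
# 100/250 wins after losing it.
print(relative_risk(140, 250, 100, 250))
```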
