We need to look at the data: August 2015

Friday, 28 August 2015

The value of the toss

Mike Selvey has written an interesting piece in the Guardian arguing that the toss makes very little difference to the outcome of test matches. This idea interests me, since pre-match analysis and the chat amongst fans often seem to attach enormous tactical significance to winning the toss. Could it be that it's all just pageantry?

Selvey backs up his point with statistics from England and Australia's recent history, but the subject intrigues me and I feel like there's room to chip in with my own two statistical penneth.

The table below shows the number of wins and losses for teams winning the toss in various formats. Taking my lead from the suggestion that uncovered pitches made a big difference to the value of the toss I've divided the Test data into pre- and post- 1970. (Honestly, I'm a bit young to remember uncovered pitches but this article suggests they began to be phased out in the 60s, which is why I picked 1970). I've split the data for limited overs formats into day, day-night and night since there's good reason to think that might be important.

I've also made an attempt at evaluating whether the difference between wins and losses is "statistically significant" in each case. More of that below.

The data is consistent with the uncovered pitches idea: the win/loss ratio for teams winning the toss is much better pre-1970 than post-1970. The post-1970 data, meanwhile, is pretty consistent with Selvey's assertion that the toss doesn't matter too much. The ODI data seems consistent with the idea that the toss doesn't matter much in Day games, but plays a role in Day-Night games. Counter-intuitively, the T20 data seems to reverse that trend.

So, what of this statistical significance?

One way of estimating whether a statistical finding is significant for a given sample size is to calculate the probability that you could have got a result at least equally extreme from that sample size, assuming that the "null hypothesis" is true. In our case "the null hypothesis" would be "the toss makes no difference". If this were true then in any given game (excluding draws) there would be a 50/50 chance that the toss would just happen to fall for the team that was going to win the game anyway, and from that assumption we can work out how likely it is that we would get a result as extreme as the one we did, from the sample size we have, if the toss really made no difference. This number is the "p-value" in the second-last column of the table.

Conventionally, if this number is less than 0.05 we call our result "statistically significant". There are a lot of issues with this approach to things, not least that the 0.05 threshold is completely arbitrary, but it at least gives us a rough starting point for deciding how much importance we should attach to a finding.
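For anyone who wants to try the p-value calculation described above, here is a minimal sketch in Python. It assumes an exact two-sided binomial test, which is one reasonable reading of "at least equally extreme" but not necessarily the exact method behind the table. The function name and the win/loss counts in the example are made up for illustration and are not figures from the table.

```python
from math import comb


def two_sided_binomial_p(wins: int, losses: int) -> float:
    """Exact two-sided p-value under the null hypothesis that each
    decided game is a 50/50 coin flip for the toss winner."""
    n = wins + losses
    # With p = 0.5 the distribution is symmetric, so "at least as extreme"
    # means at least as far from n/2 as the observed count, in either direction.
    k = max(wins, losses)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Doubling the tail can exceed 1 when the split is perfectly even.
    return min(1.0, 2 * upper_tail)


# Hypothetical counts, NOT taken from the table in this post.
example_wins, example_losses = 60, 40
p = two_sided_binomial_p(example_wins, example_losses)
print(f"p-value for {example_wins} wins vs {example_losses} losses: {p:.4f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

Draws are left out before counting, as in the text, so only wins and losses go into the function.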

In this case, our results that the toss makes a (positive) difference in Day-Night ODIs, Daytime T20Is and that it used to make a difference (pre-1970) in Tests all pass our significance threshold of 0.05. The other cases fail to pass the threshold, which certainly isn't the same as saying we've proved the toss makes no difference in those cases, but does mean that we can't be sure it does with the data available.

As far as modern test matches go, it seems Mike Selvey is right- for the most part we probably do over-analyse the toss. Even if the difference it makes is real, it seems to be tiny. So, next time I think Alastair Cook's made the wrong post coin-flip decision, I'll try to remember to give the guy a break- it may well not make any difference.

Tuesday, 25 August 2015

Ian Bell's 'easy' runs

Today's post features something totally new for this blog: a reader's request. Specifically, Simon Mills asks:

"Can you test the hypothesis that Ian Bell only scored easy runs?"

Right now seems like a good time to reflect a bit on Ian Bell and his reputation as many are suggesting his international career could or should be about to draw to a close, after an Ashes series where he struggled to cut sufficient mustard. So, here goes.

Opposition

Clearly, how we answer the question of whether Ian Bell deserves to be labelled a scorer of 'easy runs' is going to depend on how we define 'easy'. One, very simple, way we could try to define it is to say

"easy runs are those scored against test cricket's weakest attacks"

The graph below plots Ian Bell's batting average against each test side against the overall batting average of middle order batsmen (positions 3-6 in the order) against each opponent over the period of Bell's career (Aug 2004-present). Points which fall above the red line represent nations against whom he has outperformed his peers and those below the red line represent the nations against whom he has fared worse than the 'average' middle order batsman of his era.

His batting average against each nation for the most part follows that of the average middle order batsman. Against Australia, South Africa, Pakistan, India and West Indies his average is within a few runs either way of the average performance for middle order batsmen against those opponents. There are a couple of large exceptions in Sri Lanka (against whom he has over-performed compared to his peers) and New Zealand (against whom his record is poor). Then there's a giant honking outlier in Bangladesh, against whom he has really cashed in, averaging 158 compared to the 'average average' of 63.

Bangladesh aside, there isn't a strong trend for Bell to over-perform against weak attacks or under-perform against strong ones. He does score more runs against weaker bowling sides, but only to a degree comparable to the average middle order batsman of his time.

Match situation

Another way of looking at Simon's question is to define 'easy' and 'difficult' not by the opposition but by the match situation, saying something like:

'easy runs are scored when you come in with your team already in a good position'

To look at whether Ian Bell primarily scores runs when he comes in with England in a good position, I've compared his performance when coming in with fewer than 20 runs per wicket on the board (i.e. at scores worse than 20-1, 40-2 etc) against his performance coming in with more than 80 runs per wicket scored (scores better than 80-1, 160-2 etc). The results are in the table below. I've also broken it up into time periods pre- and post- his being dropped after the West Indies tour of 2009, an event many people consider a watershed in his career.

The result is that yes, he does score more runs coming in with a good platform laid, than coming in to a dicey situation. This is even more true of post-2009 Bell than pre-2009 Bell. Post-2009 Bell seems to be quite the master of putting the boot in from a well-laid platform. On the other hand, one shouldn't take it that he's useless when coming in in a tough situation- his average when coming in with fewer than 20 runs/wicket on the board is fairly close to his overall, pretty respectable, average.
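The runs-per-wicket split used above can be sketched roughly like this. It's only an illustration: the `Innings` record, the labels "tough", "good" and "middling" and the sample innings are all invented, and the batting average is taken in the usual way as runs divided by dismissals.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Innings:
    team_runs_on_entry: int  # team score when the batsman walked in
    wickets_on_entry: int    # wickets already down (at least 1 for a non-opener)
    runs: int                # runs the batsman went on to score
    dismissed: bool          # False for a not-out innings


def situation(inn: Innings, low: float = 20, high: float = 80) -> str:
    """Label the match situation by runs per wicket on entry."""
    runs_per_wicket = inn.team_runs_on_entry / inn.wickets_on_entry
    if runs_per_wicket < low:
        return "tough"     # worse than 20-1, 40-2 etc
    if runs_per_wicket > high:
        return "good"      # better than 80-1, 160-2 etc
    return "middling"


def batting_average(innings: List[Innings]) -> Optional[float]:
    """Runs scored divided by times dismissed (None if never dismissed)."""
    outs = sum(1 for inn in innings if inn.dismissed)
    total = sum(inn.runs for inn in innings)
    return total / outs if outs else None


# Invented innings, purely to show the mechanics.
sample = [
    Innings(15, 1, 50, True),
    Innings(30, 2, 30, False),
    Innings(170, 2, 120, True),
    Innings(100, 2, 10, True),
]
for label in ("tough", "middling", "good"):
    group = [inn for inn in sample if situation(inn) == label]
    print(label, batting_average(group))
```

Innings that fall between the two thresholds are ignored in the comparison in the post, which is why they get their own label here.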

Clearly, it's unfair to accuse Bell of being only a scorer of easy runs. The 2013 Ashes stand out amongst Bell's achievements but he's starred on many other occasions too. However, it is true to say he is more prolific at scoring against weaker attacks and in more comfortable match situations. That shouldn't surprise us too much- easy runs are, after all, easier.

Tuesday, 18 August 2015

Winning away tests isn't getting harder, but not losing them is

As the Ashes locomotive rumbles into its final stop at The Oval, the destination of the series is decided- another home win, the sixth in the last seven Ashes. Meanwhile, in Sri Lanka, the home team have taken a 1-0 series lead against India, after a borderline miraculous comeback.

Winning tests away from home is hard. It was ever thus, but some are worried that it's harder than ever, that with packed international schedules players simply don't have the will or the time to acclimatise to foreign conditions and that away wins are becoming an endangered species.

With this in mind, I wanted to see how much more unusual winning away test matches has become. The graph below shows the percentage of home wins, away wins and draws in Tests (excluding Tests at neutral venues) in 5 year chunks since 1946.

The first thing we can notice is that the proportion of home wins has indeed increased steadily but markedly since the late 80s. What's interesting to me here is that there has been very little change in the rate of away wins over the same period. Instead, the 'extra' home wins are coming out of the proportion of drawn games- the rise in home wins almost exactly tracks a decline in the number of draws.
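The five-year percentages behind a graph like this can be worked out along these lines. This is a sketch under assumptions: the input format (a list of year and result pairs, with neutral-venue Tests already removed) and the function name are my own invention, and the sample matches are made up.

```python
from collections import Counter, defaultdict


def result_shares(matches, start=1946, width=5):
    """Percentage of home wins, away wins and draws per `width`-year chunk.

    `matches` is an iterable of (year, result) pairs, where result is
    "home", "away" or "draw". Matches before `start` are skipped.
    """
    counts = defaultdict(Counter)
    for year, result in matches:
        if year < start:
            continue
        # First year of the chunk this match falls into, e.g. 1946, 1951, ...
        chunk = start + ((year - start) // width) * width
        counts[chunk][result] += 1

    shares = {}
    for chunk in sorted(counts):
        total = sum(counts[chunk].values())
        shares[chunk] = {
            result: 100 * counts[chunk][result] / total
            for result in ("home", "away", "draw")
        }
    return shares


# Made-up results, only to show the shape of the output.
made_up = [(1946, "home"), (1950, "draw"), (1951, "away"), (1952, "away")]
for chunk, pct in result_shares(made_up).items():
    print(chunk, pct)
```

Because the three percentages in each chunk always add up to 100, a rise in one has to come out of the others, which is what makes the home win and draw lines mirror each other.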

I think many cricket fans are aware both that home wins are becoming more frequent and that draws are becoming less so but offer separate explanations for these observations.

A discussion about the rise of home wins will tend to centre around modern scheduling and inadequate preparation by away sides. Meanwhile, if you asked a random cricket fan to explain the decline in the number of draws I suspect they would talk about the influence of limited overs forms of cricket on test matches- making the game go faster and producing results. They might also make a mention of improved drainage systems. These factors may all be contributing but take no account of the fact that the rate of away wins has remained fairly stable, nor do they explain why home teams are benefitting disproportionately from the decline of the Test match draw. I can't claim to properly explain these things either, but I hope to come back to them. Perhaps it's that the increasing availability of wins for away sides (because fewer games are drawn) is roughly cancelling out what would be a decrease in away wins from increasing home advantage.

It's not the winning that away teams are getting worse at, it's the not losing.

Saturday, 8 August 2015

Umpires really are calling fewer no-balls under DRS

Amidst the confusion and excitement of the last two days of crazy test cricket at Trent Bridge, there's been a bit of chat about no-balls. I know- of all the wonderful, brilliant things that have happened, I want to show you a graph about no-balls. Sad, right?

Yesterday, on two occasions, England thought they'd taken a wicket only for the batsman to be called back because the umpires checked for an overstepping front foot on the video replay and belatedly called a no-ball. Instances like this seem to happen quite a lot in modern test cricket. Many commentators are accusing the umpires of not even looking for no-balls in normal play because they know they can always check if a wicket falls. Therefore, the argument goes, the bowler isn't getting any warning that he's overstepping until it costs him a wicket.

To hear some commentators hold forth on the subject you'd think that no-balls are now only ever called when there's a dismissal at stake (which, just to be clear, isn't true) and that this is a serious moral disservice to the bowler (which I find pretty doubtful, but I guess is a matter of opinion).

So, are umpires really calling fewer no-balls in the DRS era?

Yes. A lot fewer.

The graph shows the number of no balls called in test cricket per legal delivery by year since 2000. There's a fairly steep drop following the introduction of DRS in 2009 which now seems to have levelled off. No balls are now being called at less than half the rate they were in 2009. Looking at the trend since 2005, it looks like the rate of no-ball calling was already decreasing before DRS. I think that's kind of interesting and I don't know why it would be true. It is possible that it's a false impression created by the way the noise in the data fell. The decrease since 2009 is undeniable though.

Personally, I think there are two ways cricket can go from here:

1) Technology ends up being used to check no-balls for every delivery, not just wicket taking ones- although whether this can be done without either confusion or incredibly dull disruption remains to be seen.

2) Bowlers, commentators and fans learn to accept "the new normal". Front foot no-balls just aren't going to be called very often unless there's a wicket. If bowlers are really relying on the umpire to warn when they're overstepping then they'll have to get over that. They'll either have to get better at measuring their run up or have one of their fielders watch out for when they're pushing the line and warn them. Or they can just accept that overstepping will cost them a wicket every now and then.

Sure, no-balls aren't as exciting as England being on the brink of Ashes victory. But they still make for a nice graph.

**ADDITION (9/8/15)**
Someone left a comment on reddit asking whether the drop in no-ball calling could be down to bowlers simply bowling fewer no-balls and have nothing particularly to do with DRS. This is a fair question, and of course, as a general rule, we must be careful not to conflate correlation with causation. I'd be quite surprised if this was the true explanation, as I'm not sure why such marked, global improvement in the avoidance of overstepping should occur, but I suppose it's possible.

I can't definitively rule it out but I did have a quick look at the rate at which no-balls have been called on (would-be) wicket taking deliveries in tests in the 2015 season. I found 11 examples of wickets being chalked off for no-balling and the 2015 season has seen 572 wickets from legal deliveries. 11/572=0.019 which is much closer to the overall rate of no-ball calling 10 years ago than the current rate. Indeed, it's higher than even the rate ten years ago (please beware the small sample though), which isn't too surprising since they probably were also missing some no balls in the pre-DRS era.

This:

1) Suggests that umpires are indeed missing significant numbers of no-balls in 'normal' play, which they're catching on wicket taking deliveries.
2) Is consistent with the notion that the rate at which bowlers are actually overstepping hasn't changed too much. I don't think there's any way to make this definitive since there's no way of knowing how many no-balls umpires were missing pre-DRS.
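To put a rough number on the "small sample" warning, here is a sketch that repeats the 11/572 sum and then attaches a 95% Wilson score interval. The interval is my own addition rather than part of the original analysis, and it rests on an assumption: it treats the 11 chalked-off wickets as 11 no-balls out of 583 would-be wicket deliveries (572 + 11), which is very slightly different from the per-legal-delivery ratio used above.

```python
from math import sqrt


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half_width = z * sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return centre - half_width, centre + half_width


chalked_off = 11     # wickets overturned for a no-ball (from the post)
legal_wickets = 572  # wickets from legal deliveries (from the post)

# The ratio quoted in the post: no-balls per legal (wicket-taking) delivery.
ratio = chalked_off / legal_wickets
print(f"no-balls per legal wicket delivery: {ratio:.3f}")

# Assumption: treat all 583 would-be wicket deliveries as the sample.
low, high = wilson_interval(chalked_off, chalked_off + legal_wickets)
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The width of the interval is the thing to look at: with only 11 events, the true rate on wicket taking deliveries could plausibly sit some way either side of the headline figure.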

I do think the more likely explanation is that umpires are catching fewer no-balls rather than that bowlers are simply overstepping less, but the only way I can think of to prove that would be if some poor intern at Sky went through 10 years of archive footage looking for missed no-balls. On balance, it's probably good that that isn't going to happen.

Tuesday, 4 August 2015

Thrashings are the norm in test match cricket

As the cricket watching community picked through the smouldering debris left behind by Steven Finn at Edgbaston, I noticed one theme come up several times: isn't it odd that these two teams keep thrashing one another?

A 2-1 series scoreline suggests two teams who are fairly evenly matched, and yet none of the individual matches has been even a tiny bit close. The margins of victory to date are: 169 runs (England), 405 runs (Australia) and 8 wickets (England). Shouldn't evenly matched sides produce evenly matched games?

And yet, test cricket confounds that expectation rather often. Australia's tour of South Africa last year or the 2009 Ashes spring to mind as examples of test series which somehow contrived to be close overall, without producing a single close game.

So to place this year's sequence of shoeings in some context I've had a look at the proportion of margins of victory in non-drawn test matches since 2005, to see just how prevalent hammerings are in modern test cricket and conversely just how much of a rare and priceless jewel a close test match is.

Our first chart shows the proportion of various margins of victory in all test matches since 2005. As you can see, nearly 50% of the pie is taken up by really big wins: innings victories and those by more than 200 runs or 9 or 10 wickets. By contrast, truly down-to-the-wire games (I believe the technical term is "arse nippers") form only a tiny proportion of test matches. Wins by fewer than 50 runs or by 3 or fewer wickets make up only 7.6% of the total.
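The bucketing of margins can be sketched as below. The thresholds for the biggest and closest wins are the ones given above; lumping everything else into a single "in between" bucket is a simplification of mine (the actual pie chart has more slices), and the function names are made up. The three margins in the example are the ones from this Ashes series so far.

```python
from collections import Counter


def classify_margin(kind: str, size: int = 0) -> str:
    """Bucket a victory margin.

    `kind` is "innings", "runs" or "wickets"; `size` is the margin in runs
    or wickets (ignored for innings victories).
    """
    if kind == "innings":
        return "thrashing"
    if kind == "runs":
        if size > 200:
            return "thrashing"
        if size < 50:
            return "close"
        return "in between"
    if kind == "wickets":
        if size >= 9:
            return "thrashing"
        if size <= 3:
            return "close"
        return "in between"
    raise ValueError(f"unknown margin kind: {kind!r}")


def margin_shares(margins):
    """Percentage of matches falling into each bucket."""
    counts = Counter(classify_margin(kind, size) for kind, size in margins)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Margins from the 2015 Ashes to date, as quoted earlier in this post.
ashes_2015 = [("runs", 169), ("runs", 405), ("wickets", 8)]
print(margin_shares(ashes_2015))
```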

You may at this point be thinking that the stats are skewed by the inclusion of matches involving the relatively weak teams like Bangladesh and Zimbabwe. You might also be thinking of the shoddy away performances by the likes of England's last Ashes tour party or the Indian side that toured England in 2011. It's no surprise that those teams got thrashed, you might say.

In the chart below, to isolate the proportions in games between relatively evenly matched teams, I've restricted the sample to include only matches in series where both teams won at least one game- thus proving themselves at least capable of beating their opposition in those conditions.

As you'd expect, this restriction does alter the proportions. But not by very much.

The region of pie chart given to the biggest victories (innings, >200 runs, 9-10 wickets) shrinks a little bit, but still garners a slice of more than 40%. Nevertheless, games with margins smaller than 100 runs or 7 wickets still occur less than a quarter of the time and very close games (<50 runs, 1-3 wickets) only 8% of the time- even between relatively evenly matched sides.

In this light, it doesn't seem anomalous at all that we haven't seen any close games in the Ashes yet. With only two games to go, it wouldn't really be surprising if we don't see any at all.

To be honest, I think anyone who's followed test cricket for a few years knows that games with really narrow margins of victory are quite rare. What looking at these numbers did for me is throw into sharp relief how much we as cricket fans over-react to both heavy defeats and large victories. Even if your team has been absolutely walloped in one game, it doesn't necessarily mean they won't be a match for the opposition in the next. On the other hand, those of us with triumphalist tendencies, when rejoicing in a big win should remember that even that hapless Aussie team of 2010-11 absolutely smashed England at Perth.

I don't think I know of any other sport in which it is so common for games between fairly evenly matched teams to end in very large margins of victory.

"It's not unusual to get thrashed at any time,

Even when you're evenly matched and in your prime"

as Tom Jones sang in his hymn to the strange fluctuations of cricketing fortunes, before his manager persuaded him to make the song about love or some nonsense like that. I think it's pretty hard to pick a winner for the Trent Bridge test. But I doubt it will be close.

Saturday, 1 August 2015

Home and Away with James Anderson (and 15 other fast bowlers)

The mood of the English cricketing public has lurched once more from despondent to cheerful, as England continued their extended experiment in demonstrating why 'momentum' is nonsense with victory in the third test. It was a test with lots of great moments and sub-plots- Ian Bell's success after being moved up to three, Adam Voges' magic jumper and Steve Finn's heartening resurgence to name but a few. England's victory was set up, however, by James Anderson's first day efforts- taking 6-47 as Australia staggered their way to 136 all out. Worryingly for England, we're quite likely not to see Anderson for the rest of the Ashes as he went off in the second innings with a side injury.

Anderson's critics have always maintained that he is a home track bully, that he only produces his best in favourable home conditions- and that therefore his reputation as one of his generation's leading bowlers is undeserved. Anderson's 6-47 on a seaming pitch at Edgbaston can hardly dispel that impression.

Although it's certainly unfair to suggest that he never produces away from home it is probably fair enough to say his home record is much more impressive than his away one. As far as his average goes, he takes his wickets at 26.80 at home and 34.04 away.

What I want to do in this post is put that difference in some context. Yes, Jimmy Anderson takes his wickets more cheaply at home- but is he extraordinary in that regard or is he just an example of a very common phenomenon? Is his reputation more dependent on home performances than that of his peers in the fast bowling fraternity?

The graph above shows the difference between the home and away bowling averages for the world's 16 leading fast bowlers, as defined by the current ICC rankings. I should note that in the case of Pakistan's Junaid Khan I have counted his games in the UAE as being at home, since that's where Pakistan have played their 'home' games during his career.

Almost all of these 16 pace merchants have better averages at home than away- and of course it's no surprise that bowlers prefer conditions they're familiar with, where they learned their trade and gained their reputation. The three buckers of this trend are Josh Hazlewood, Tim Southee and Junaid Khan. As discussed above, Junaid Khan has never played a 'true' home test, and for that reason I was in two minds about including him. Josh Hazlewood has played only 8 tests (3 at home, 5 away) and so his startling stats should probably be taken with a pinch of salt, a squeeze of lime and a shot of tequila.

James Anderson, meanwhile, sits in the midst of a cluster of bowlers who average between 5 and 8 runs higher away than at home. The gap between his home and away performances is thus a little on the high side but by no means extraordinary amongst his fast bowling peers.
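For completeness, the quantity being plotted is just the away average minus the home average. A quick sketch, using Anderson's figures quoted above (the function name is made up):

```python
def home_away_gap(home_average: float, away_average: float) -> float:
    """Runs per wicket by which the away average exceeds the home average.

    Positive means the bowler is more expensive away from home.
    """
    return away_average - home_average


# James Anderson's averages as quoted in this post.
anderson_gap = home_away_gap(26.80, 34.04)
print(f"Anderson averages {anderson_gap:.2f} runs more per wicket away from home")
```

A bowler who bucks the trend, as Hazlewood, Southee and Junaid Khan do, would come out with a negative number here.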

You can say Jimmy is a home track bully if you want, but in that case test cricket's home track playground is full of them. Bowlers who consistently excel to the same high level in all countries and conditions are rare beasts indeed.