Friday, 25 August 2017

Over too quickly

The English test summer has been a summer of thrashings. Not only was the day-night affair at Edgbaston last week one-sided; the South Africa series also finished without producing a single game which you would really call close.

This is not something terribly unusual. Large margins of victory are the norm in test match cricket, even if the two teams competing are fairly evenly matched, as has been noted previously on this blog.

There has been another common feature of these games- related to, but separate from, the comfortable victory margins: the outcome was highly predictable once both teams had batted once. The first innings leads in England's home games of 2017 have been: 97, 130, 178, 136 and a whopping 346.

With these leads duly established, the side ahead has remained more-or-less in control for the remainder of the game.

This observation prompts today's blog topic: how frequently are modern test matches virtually decided by the time both teams have batted once?

The pie chart below shows the distribution of first innings leads in tests since 2013. This amounts to a sample of 194 tests (I discounted a few which didn't get as far as two complete innings).

As you can see, more than a third of games feature first innings leads greater than 200 runs, and around half of all games feature leads greater than 150 runs. As you would guess, a lead of 150 runs is pretty determinative. The chart below shows the outcome of those games since 2013 with a first innings difference above 150: the first innings leader won 86% of the time, with all the rest being draws apart from one- Sri Lanka's miraculous victory against India at Galle in 2015.
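For anyone minded to reproduce the tally, here's a minimal sketch in Python. The input format- a list of (first innings lead, result for the leading side) tuples- is invented for illustration; it's not how the data comes out of statsguru.

```python
from collections import Counter

def outcomes_for_big_leads(matches, threshold=150):
    """Percentage breakdown of results for the side leading on first innings,
    restricted to matches where the lead exceeded the threshold.
    `matches` is a list of (first_innings_lead, result_for_leader) tuples,
    with result_for_leader one of "won", "drew" or "lost"."""
    outcomes = Counter(result for lead, result in matches if lead > threshold)
    total = sum(outcomes.values())
    return {result: round(100 * n / total, 1) for result, n in outcomes.items()}

# Toy input: outcomes_for_big_leads([(178, "won"), (346, "won"), (151, "drew")])
# -> {'won': 66.7, 'drew': 33.3}
```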


So around half of modern test matches really are basically over once each team has batted once- a point perhaps reached in the early or middle stages of day 3- with little real doubt about the destination of the game thereafter. You may wonder whether the qualifier "modern" was really necessary in that last sentence. Perhaps it was ever thus. People often like to tell you that the past was better, but people are often wrong.

I haven't attempted an analysis of first innings leads over the entire history of test cricket, which is what I would have liked to do. Unfortunately, I wasn't clever enough to find a time-efficient way of gathering the data on first innings leads (statsguru doesn't have a button for that). But to provide a bit of historical context for the data above, I homed in on the data for a 5 year period in the late 90s, by way of comparison with the modern day. I chose this period for no better reason than that it was the time I first got into cricket and I feel nostalgic about it.

The data from 1995-1999 support the theory that tests were not quite so frequently decided early in those days, as you can see below.



Only around a third of games saw first innings leads above 150 runs, and nearly a quarter were within 50 runs on first innings (as an aside, games with sub 50 run leads are basically toss-ups, both in the 90s and now- dividing roughly evenly between the leading side, the trailing side and the draw). 

In games which did feature a big difference on first innings, the distribution of outcomes was basically the same as it is now- so teams weren't necessarily better at responding to a large deficit, but large deficits weren't quite so frequent.

Test cricket is a wonderful sport, but I think it must be admitted that one of its weaknesses is that it very easily produces one-sided games which are over long before they are over. This has probably always been the case to some extent, but it would appear to have been exacerbated of late. Proposed explanations and solutions for this may vary. When a truly nail-biting test match comes along, uncertain to the end, treasure it for the precious jewel it is.




Sunday, 4 June 2017

Approaching the milestone

During England's eventually successful chase of 306 to beat Bangladesh in the opening game of the Champions Trophy, Alex Hales was on his way to a hundred. He'd just started to take a fancy to the bowling of Sabbir Rahman and biffed a couple of boundaries to move to 94. Swinging for the fence once more, he was caught at deep midwicket.

Of course, this was a cause of exasperation for some English observers, but George Dobell- one of my favourite cricket writers- took a different view, tweeting:

Dobell is suggesting- and for what it's worth I broadly agree- that whatever you think of the selection and execution of the shot, Hales' attitude was admirable. Rather than play steadily through the 90s to try and guarantee himself the milestone and associated plaudits, he judged that it was better for the team if he carried on accelerating, and was willing to risk the personal achievement of notching another ODI hundred for the good of the team.

The tweet also seems to allude to a converse attitude among Hales' peers- that many of them do slow down as they approach 100, for the sake of trying to make sure of getting there. In today's blogpost I want to examine that idea amongst modern ODI batsmen.

Is it really common for ODI batsmen to noticeably slow down as they approach 100?

If so, how much do they slow down? How much innings momentum is lost to milestone hunting?

In an attempt to answer these questions I have had a look at the ball by ball data for all of the ODI centuries scored between the beginning of 2016 and the England v Bangladesh match the other day (so the data doesn't include Kane Williamson and Hashim Amla's centuries in the last two days).

This adds up to 108 centuries. I divided them each up into windows of 10 runs (0-9; 10-19 etc) and asked how many balls each batsman spent with their score in each window. If batsmen are tending to slow down as they approach 100, we should see that they spend more balls in the nineties than in the 70s or 80s.
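For the curious, here's a minimal sketch of that windowing step in Python, assuming an invented input format: a list of the batsman's cumulative score after each ball faced (not the actual format of the ball by ball data I used).

```python
from collections import Counter

def balls_per_window(cumulative_scores, window=10):
    """Count how many balls the batsman faced while his score sat in each
    10-run window (0-9, 10-19, ...). The score *before* each delivery
    decides which window that ball is credited to."""
    counts = Counter()
    score_before = 0
    for score_after in cumulative_scores:
        counts[score_before // window] += 1
        score_before = score_after
    return counts

# Toy example: a batsman scoring a single off every ball spends exactly
# 10 balls in each window, including the nineties (window index 9).
print(balls_per_window(list(range(1, 101)))[9])  # -> 10
```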

The graph below shows the average result for each run window, averaged over the 108 centuries in the sample. The red line is the mean, the blue line is the median.

One feature of the graph which I like- which is basically irrelevant to today's question but I'll mention it all the same- is that it gives a nice visualisation of the batsmen playing themselves in. The first 10 runs really are noticeably slow compared to the rest of the innings, taking an average of around 14 balls. Thereafter, the average ODI centurion stays close to the run-a-ball mark, with gentle acceleration over the course of the innings. The average number of balls taken to get through a 10 run window goes from 10.09 in the twenties down to 8.27 in the 80s.

The brakes do seem to go on just a little bit in the 90s, however, with the average balls taken for those 10 runs ticking back up to 9.23. (The median goes up to 9 from 8.)

So, the data is consistent with a weak slowing down as batsmen get near the milestone. But it's a very tiny effect. Indeed, the size of the effect is comparable to the degree of statistical noise in the data, so I'm not even 100% sure it's real. But, if we take it at face value, batsmen are on average spending about 1 ball longer in the 90s than they are in the 80s, possibly influenced by the impending glory of an ODI hundred.

To look at the data a different way: 56% of the centurions were slower through the 90s than they were through the 80s. By way of comparison, only 38% were slower through the 80s than the 70s. Again this is consistent with the expected steady acceleration through the innings, which is very slightly waylaid by a nervous nineties slowdown. The distributions of the number of balls taken to get through the 80s and the 90s are plotted below as histograms. Comparing them, you can see a slight rightward shift of the distribution as you go from the 80s to the 90s.




An extra one ball to score 10 runs is very small potatoes, so you probably don't need to be too worried that this is going to cost your team a match. Of course, for some individuals the effect may be stronger. And still, it's interesting to reflect that even the top pros may be affected by the arbitrary milestones made for them, even if just by a tiny bit.

Sunday, 5 February 2017

Double the score at 30 overs

At some point during the display of pure swashbuckling, boundary-smashing batsmanship that was the recent India vs England ODI series, something caught my eye. Well, several things did, but only one directly inspired this blog post. It was someone on Twitter estimating what the batting team's final total would be, using the rule-of-thumb that it will be roughly double the score at 30 overs.

In my head, my reaction went something like this: "Really? People are still using the 'double the score at 30 overs' rule? Surely that's way out of date now, if it was ever true."

I then continued thinking: "I bet modern batsmen, with their skills honed to play aggressively, playing on flat pitches with short boundaries probably consistently beat that mark these days".

Well, the data shows my snide internal monologue was wrong, as you'll see below. To be honest, a moment's further thinking would have revealed my prediction to be hopelessly naive (not to mention cliche-ridden)- if teams are now scoring faster in the later overs, they're also scoring faster in the early overs. So, how it ends up working out for the "double the score at 30 overs" rule isn't immediately obvious.

In fact, in recent ODIs, if you estimate a team's total by doubling their 30 over score, you will be quite consistently too generous.

The graph below plots the ratio of the final total to the score at 30 overs achieved by teams batting first in (non-weather affected) ODIs since the beginning of 2016, against the number of wickets fallen at 30 overs. Points above the green dashed line represent innings which beat the benchmark of twice the 30 over score, points below the green dashed line fell short of it. The red line is the median ratio as a function of the number of wickets down at 30 overs, just to give a sense of how much it depends on how many batsmen are back in the pavilion.
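For the record, here's a minimal sketch of the grouping behind that red line, assuming an invented input format of (score at 30 overs, final total, wickets down at 30) tuples for each first innings:

```python
from collections import defaultdict
from statistics import median

def median_ratio_by_wickets(innings):
    """Median of final_total / score_at_30, grouped by wickets down at 30 overs."""
    ratios = defaultdict(list)
    for score_at_30, final_total, wickets_at_30 in innings:
        ratios[wickets_at_30].append(final_total / score_at_30)
    return {w: round(median(r), 2) for w, r in sorted(ratios.items())}

# Toy input: median_ratio_by_wickets([(150, 280, 2), (140, 260, 2), (160, 250, 5)])
# -> {2: 1.86, 5: 1.56}
```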


As you can see, there are many more points below the green dashed line than above it. To be precise,  sides batting first fall short of that mark 74% of the time. Even sides who are only one or two wickets down at 30 overs fall short of doubling their score more often than not.

The "double the score" heuristic is still not too bad as a ballpark figure, which I guess is all its meant for- the average ratio between the 30 over score and the final score was 1.81. Nevertheless, it is a fairly consistent overestimate. If you want a rule-of-thumb which is still somewhat simple, but a bit less over-generous "double the score and subtract 10%", might be better.

One might then wonder how this plays out for teams batting second. Is a chasing side that's only halfway to the target at 30 overs likely to win?

The answer is: no, they are not. Chasing teams that are around halfway to their target at 30 overs usually lose.

The graph below plots, for each ODI chase since the beginning of 2016  (again, in non-weather-affected matches),  the fraction of the target achieved at 30 overs against the number of wickets fallen at that point. Red points represent winning chases, blue points represent losing chases.



The dashed green line is the threshold of being halfway to the target at 30 overs. You'll notice that teams on the edge of this threshold rarely win, and that teams below it never do. (I say never- of course I just mean within the sample I studied- I'm sure there would be plenty of examples if I had gone back further).

The bottom line is that a chasing team, who at 30 overs is only halfway there, is very much livin' on a prayer.





Wednesday, 21 December 2016

Jennings, Hameed, Duckett .... and Bayes?

As the umpires of time take off the bails of 2016, and as the England team pick themselves off the mat after an absolute hammering from India, I bring you this year's festive look at the data. It's...a bit different.

England's test tour of Bangladesh and India featured the introduction of three new batsmen into test cricket. Keaton Jennings and Haseeb Hameed have been singled out as definite bright spots of the tour, while Ben Duckett may have to wait a little while for his next opportunity. But how can we sensibly assess each one's tour? And how likely is it that each one will be a medium-to-long term success at test level?

Things like this are difficult. Whenever a new player is picked, nobody- not the fans, not the selectors, not the player themselves- knows with certainty how it's going to go. Some demonstrably very talented players don't succeed in the long run, and some apparently more limited ones do, and it isn't always obvious why. As a player starts out in test cricket (or at any new level or form of cricket), we must acknowledge that there is a range of possible "outcomes" for their career- they may later be remembered as a legend, an unfulfilled talent, an "every dog has his day"... we just don't know.

But over time, we find out. With each fresh innings we see more of them, and gradually our uncertainty morphs into knowledge. A couple of years ago, I don't think it was enormously obvious which of Joe Root and Gary Ballance would be the better test player. Now, however, I think I have a certain degree of confidence about the answer to that.

This is, in essence, a problem of forecasting. We have data about the past and we want to make a (hopefully, educated) guess about the future. Which of two players will be better? With each new data point, we update our forecast and hopefully arrive closer to the truth*.

There's a famous theorem of mathematics- called Bayes theorem, after the Rev. Thomas Bayes- which we can use to do exactly this. As an equation it looks like this:
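P(A|B) = P(B|A) × P(A) / P(B)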



For our purposes: 'B' is something I have observed (e.g.  "Keaton Jennings scoring a century"). 'A' is the thing I want to find out the probability of being true (e.g. "Keaton Jennings will be a long term success at test level").

P(A|B) is the thing I want to know. It is the probability that A is true, now that I know that B is true.
P(A) is the "prior"- the probability I would have given to A being true before I knew that B was true.
P(B) is the probability of B happening without regard to whether A is true.
P(B|A) is the probability of B happening assuming A is true.

Let's try and apply this to England's new recruits, albeit in a slightly crude fashion.

To begin, let's suppose four possible long to medium term outcomes for a player's test career:

1) Very good
2) Good
3) Okay
4) Not so good

Let's now consider all the batsmen who made their debut for England batting in the top six between 2000 and 2014. (I cut it off at 2014 so as to have some reasonable chance of saying how their careers went after their debuts.)

We'll exclude Ryan Sidebottom from the sample because he was there as a nightwatchman, and Chris Woakes because I think most people would say he's mainly a bowler, leaving us with 22 players.

Ranking them by batting average and dividing the list according to the categories above, I would say they break down something like this (*controversy alert!*):
Group 1 ("very good")- Cook, Pietersen, Root
Group 2 ("good") Barstow, Bell, Strauss, Trescothick, Trott
Group 3 ("okay") Ali, Ballance, Collingwood, Compton, Stokes
Group 4 ("not so good") Bopara, Carberry, Clarke, Key, Morgan, Robson, Shah, Smith, Taylor

This gives us our initial values for P(A) in the Bayes theorem equation. For a generic England debutant batsman in the modern era:
P(very good)=3/22=13.6%
P(good)=5/22=22.7%
P(okay)=5/22=22.7%
P(not so good)=9/22=40.9%

We then want to know the probability of a player belonging to each of these groups having a given outcome from one innings. We'll categorise the outcomes of an innings pretty crudely: 0-9, 10-49, 50-99, 100-149, 150 or more.

Based on the records of the players listed above we can estimate the probability that (e.g.) a player belonging to Group 2 ('good') will score between 100 and 150 in any given innings. The table looks something like this:



Obviously, the best players are more likely to get high scores and less likely to get low scores. But crucially, every innings outcome has a non-zero probability for players in every group. This actually gives us all the information we need to take one innings outcome for a player and use Bayes theorem to generate a new forecast of the probability that they belong to each of our four categories.
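To make the mechanics concrete, here's a minimal sketch of the update in Python. The prior is the one worked out from the 22 debutants above; the likelihood numbers for a score of 100-149 are purely illustrative placeholders, not the values from my actual table.

```python
# Prior for a generic England top-six debutant, from the 22 players above
prior = {"very good": 3/22, "good": 5/22, "okay": 5/22, "not so good": 9/22}

# P(scores 100-149 in an innings | group) -- illustrative numbers only
p_hundred_given_group = {
    "very good": 0.10, "good": 0.09, "okay": 0.08, "not so good": 0.06,
}

def bayes_update(prior, likelihood):
    """Return P(group | observed innings outcome) via Bayes theorem."""
    unnormalised = {g: prior[g] * likelihood[g] for g in prior}
    p_b = sum(unnormalised.values())            # this is P(B), the normalising term
    return {g: v / p_b for g, v in unnormalised.items()}

posterior = bayes_update(prior, p_hundred_given_group)

# For a whole series of innings, apply the update iteratively: each
# posterior becomes the prior for the next innings outcome.
```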

So, let's take the example of Keaton Jennings. Before he batted, I thought of him as just a generic debutant, and my forecast about his ability at test level looked like this:

P(Jennings is very good)=13.6%
P(Jennings is good)=22.7%
P(Jennings is okay)=22.7%
P(Jennings is not so good)=40.9%

After he scored a hundred, applying Bayes theorem gives:
P(Jennings is very good)=16.8%
P(Jennings is good)=26.8%
P(Jennings is okay)=23.5%
P(Jennings is not so good)=32.9%

So the odds I would give to him turning out to be a very good player after the fashion of Root, Cook or Pietersen went up after his hundred, but only modestly. It's still only one data-point after all, and the fact remains that most batsmen don't turn out to be a Root, Cook or Pietersen.

He then got two low scores and a fifty. Applying the process iteratively we end up at:

P(Jennings is very good)=14.2% 
P(Jennings is good)=28.0%
P(Jennings is okay)=24.3%
P(Jennings is not so good)=33.4%

So there's still a high degree of uncertainty. Relative to before his debut the probability that things won't work out is down and the probability that he'll turn out to be great is up. But only modestly. We don't know much.

For Hameed and Duckett we can do the same thing with their results on tour.

Hameed is in a similar boat to Jennings. The probability that he'll be a long term success is up, but only modestly. We'll have to wait to be sure.

P(Hameed is very good)=18.0% 
P(Hameed is good)=28.7%
P(Hameed is okay)=23.2%
P(Hameed is not so good)=30.1%

For Ben Duckett, the outlook is a bit poorer. Our calculation now gives over a 50% chance that he'll be in the "not so good" category and a less than 10% chance that he'll be in the "very good" category. Specifically:

P(Duckett is very good)=7.3% 
P(Duckett is good)=17.2%
P(Duckett is okay)=23.1%
P(Duckett is not so good)=52.4%

Still, though, the calculation calls us to be circumspect. We have some indications about Ben Duckett's test prowess, but not the full picture. A nearly 25% chance that he'll turn out either good or very good is far from nothing.

There are two things I like about this way of thinking. Firstly, it allows us to acknowledge the world's inherent uncertainty without throwing up our hands and giving up. We can't have absolute certainty, but we can have some idea. Secondly, it gives us a mechanism to build new information into our thinking, to update our view of the world as we get new information.

The calculation I've outlined above is clearly much too crude, and leaves too much out to be used for selection purposes. But I genuinely think this way of thinking- i.e. probabilistic and updating forecasts based on new information- is well suited to this kind of thing. "Keaton Jennings will open England's batting for years to come" is too certain a statement for a complicated and uncertain world. Maybe, "there's a 42% chance that Keaton Jennings will turn out to be at least as good as Marcus Trescothick" is closer to the truth.


* There's a really nice book about statistics and forecasting called "The Signal and the Noise" by Nate Silver. It doesn't mention cricket, more's the pity, but it covers many fields- from baseball to finance to earthquakes-  where some kind of forecasting is desirable and looks at why some of these areas have seen great successes using statistical methods and others have seen catastrophic failures. It's very readable and I very much recommend it if you're into that sort of thing.

Saturday, 20 August 2016

One remarkably consistent aspect of test cricket

It seems that James Vince and I have at least one thing in common: despite starting the season full of high hopes, neither of us has had a very prolific summer. I haven't blogged much of late, and indeed I haven't paid as much attention to England's test summer as I normally would, due to various other things that have occupied my time and brain space. This is a shame for me, because the series with Pakistan seems to have been a great one, judging by the bits of coverage I did catch.

In today's return to the statistical fray, I was interested to have a look into how the relative importance of different parts of the batting order has changed over time in test cricket. For instance, it is a well worn claim that tail enders are better batsmen than they used to be- does this mean teams now rely on them more for runs compared to other parts of the team? England have relied heavily on their lower-middle order of late- is this part of a trend or just how things are in one team right now?

To get a sense of this I divided the batting order into 4 parts: openers (1-2), upper middle order (3-5), lower middle order (positions 6-8) and tail (9-11) and looked at the percentage of runs off the bat each part of the order contributed in tests in each year since 1946.
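For the record, here's a minimal sketch of that aggregation, assuming an invented input format of flat scorecard rows of the form (year, batting position, runs):

```python
from collections import defaultdict

GROUPS = {"openers": (1, 2), "upper middle": (3, 5),
          "lower middle": (6, 8), "tail": (9, 11)}

def run_share_by_year(rows):
    """Percentage of runs off the bat contributed by each part of the order, per year."""
    totals = defaultdict(lambda: defaultdict(int))
    for year, position, runs in rows:
        for name, (lo, hi) in GROUPS.items():
            if lo <= position <= hi:
                totals[year][name] += runs
    return {year: {name: round(100 * r / sum(parts.values()), 1)
                   for name, r in parts.items()}
            for year, parts in totals.items()}

# Toy input (only groups that actually scored appear in the output):
# run_share_by_year([(2016, 1, 50), (2016, 4, 100), (2016, 7, 50)])
# -> {2016: {'openers': 25.0, 'upper middle': 50.0, 'lower middle': 25.0}}
```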

I don't want to undermine my own blog too much, but the result was the most strikingly featureless dataset I have ever written about- as you can see in the graph below. The points show year by year data and the lines show 10 year averages.

Consistently, openers get about 26% of the runs, positions 3-5 get about 41%, numbers 6-8 get 25% and the tail about 8%. This has barely changed at all in the last 70 years.

The one small trend you can pick up is that the gap between openers and the lower middle order closes over time, from a position where openers were contributing 3-4% more than numbers 6-8 up until the present day, when the two contributions are basically equal (openers 25.3% vs lower middle 25.7% over the last 10 years). This change is consistent with the increased batting role of wicket keepers which we discussed in the last post. There is a big uptick in the lower middle order data just this year, which stands out as rather an outlier- this part of the batting order has made 32.7% of the runs in 2016, several percentage points above the long term average. This is in large part driven by England's reliance on that part of the line up- fully 42.6% of England's runs off the bat have come from numbers 6-8 this year. I expect the global figure (and probably England's too) will regress to the mean a bit before the year is out.

Positions 3-5 consistently provide the biggest slice of the run scoring pie. The difference between their contribution and the openers' is a couple of percentage points larger than can be explained by the fact that there's simply one fewer player in the openers category. This is consistent with the notion that teams tend to put their best batsmen somewhere between 3 and 5.

Batsmen 9-11, meanwhile, for all the talk of improving tail enders, have chipped in with about 8% of the team's runs extremely consistently all this while, and show no signs of changing.

Plus ça change, plus c'est la même chose.


Thursday, 26 May 2016

Charting the evolving role of the wicketkeeper

Last week's test between England and Sri Lanka belonged to Jonny Bairstow. A century on his home ground, and a match-winning one at that- rescuing England from 83-5 and dragging them to a total out of the reach of Sri Lanka's callow batting line up. Behind the stumps in his role as wicketkeeper he took 9 catches, making it an all-round good 3 days at the office.

Bairstow is an example of what would seem to have become a pretty established pattern for the modern test match side: picking your wicketkeeper with a heavy emphasis on their willow-wielding ability, and a lesser focus on their glovemanship than might have been seen in previous generations. I don't think I'm going too far out on a limb to suggest that Bairstow is not the best pure wicketkeeper available to England, but out of the plausible  keeping options he's the best of the batsmen, at least for the longer format.

This has made me wonder: how much has the wicketkeeper's role evolved over time? How much more are teams relying on their keepers to score runs? And has an increased emphasis on the batting prowess of keepers had a measurable cost in their performance behind the stumps?

The simplest thing to think would be that picking keepers based on their batting would come at a price in catches and stumpings. But can this be seen in the data?

I particularly enjoyed researching this post, not least because answering those questions will take not one, not two, not three but four graphs.

First of all, the run scoring. The graph below shows the run scoring output of designated  wicketkeepers, as a percentage of total runs scored by batsmen in tests from 1946-2015. The red points are the year by year data and the blue line is the decade by decade average. The decade by decade averages give you a better sense of the long term trends.
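The metric itself is nothing fancy- here's a minimal sketch, assuming invented yearly totals for keeper runs and for all runs off the bat:

```python
def keeper_run_share(keeper_runs_by_year, all_runs_by_year):
    """Designated keepers' runs as a percentage of all runs off the bat, per year."""
    return {year: round(100 * keeper_runs_by_year[year] / all_runs_by_year[year], 1)
            for year in keeper_runs_by_year}

# Toy totals (invented, but in the same ballpark as the trend described below):
# keeper_run_share({1950: 600, 2015: 1500}, {1950: 10000, 2015: 15600})
# -> {1950: 6.0, 2015: 9.6}
```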




This data shows a clear evolution towards a greater dependence on wicket keepers to provide runs. Wicket keepers provided only 6% of runs in the immediate post-war period, but they now provide nearly 10%. This is, of course, very much in line with conventional wisdom. One thing that struck me, however, is how steady this increase has been. I had expected to see a rather more dramatic increase in the 90s and early 2000s, after Adam Gilchrist made the swashbuckling batsman-keeper cool, but the importance of the wicketkeeper's runs had been rising steadily for a while before that (with a bit of a dip in the 1980s).

But what of their behind-the-stumps performance? If teams' enthusiasm for batsman-keepers is leading to a lower standard of keeping, one might expect that to be reflected in how wickets are taken. If keepers are worse than they used to be, then perhaps modes of dismissal which depend on them- catches behind and stumpings- will decrease relative to other, non-keeper-dependent, modes of dismissal.

The next graph shows the percentage of total wickets that were catches by the keeper in tests from 1946-2015. (Again, red points=year by year, blue line=decade by decade)



Far from decreasing, the reliance on wicketkeeper catches to provide wickets increases steadily post 1946- over the same period that keeper run scoring was on the rise- before hitting a plateau around the 1990s. Modern wicketkeepers provide about 19% of the total wickets through catches, and that figure has not shown any noticeable downward shift since keepers have been expected to provide more runs. It may well be that what this graph is telling us has most to do with the evolution of wicket keeping and bowling styles rather than keeping quality, but in any case it's true that modern teams rely on wicket keepers both for more runs and for more catches than teams 70 years ago. As the batting responsibility of keepers has increased, their responsibility as glovemen has not diminished at all.

Wicket keepers can also contribute to dismissals via stumpings. This is a much rarer mode of dismissal than caught behind but, some may argue, it's a truer test of wicket keeping skill. The graph below shows the percentage of wickets that were stumpings over the same period as the graphs above.



The contribution of stumpings to the total wickets decreases in the post war years- over the same period that the contribution of catches increases (perhaps reflective of a decrease in standing up to the stumps? I'm not sure). But it's held steady between 1.3% and 1.9% for the last 50 years. So, wicket keepers continue to hold up their end in whipping off the bails.

If we can't see any strong changes in wicket keeping contributions to wickets, what about other ways of measuring wicket keeping quality? Byes, for instance. The graph below shows the number of byes conceded per 1000 deliveries in test cricket from 1946-2015.

The rate of conceding byes has hardly changed in 70 years. Looking at the decade by decade trends, you could argue that it was on a steady decrease up to the 90s before taking an uptick, but these changes are minuscule- corresponding to maybe 1 extra bye conceded per 1000 deliveries.

So, while it's clear that more runs are indeed required of the modern keeper, the expectations behind the stumps have not shifted that much. Keepers contribute a consistent ~19% of wickets through catches, with an additional ~1.5% through stumpings. They concede about 7 byes per 1000 balls and have barely budged from that for 70 years. Considering that the expectations on their batting have increased while they have remained steady in other aspects of the game, keepers arguably have more on their plate than ever before.



Monday, 16 May 2016

Reverse Swept Radio

This week I had the pleasure of being interviewed by Andy Ryan on the excellent Reverse Swept Radio podcast. If you would like to hear me talk about cricket, stats and this blog, the link is here:

http://reversesweptradio.podbean.com/e/rsr-81-a-cricket-podcast/