Forget shot numbers, let’s use expected goals instead

“Evaluating the quality, rather than the absolute number of chances created, seems like a worthwhile effort. And with more detailed Eredivisie data on goal scoring attempts available on, hopefully, short notice, this kind of tool might prove a valuable addition to this season’s match reports on 11tegen11.”

It’s been two years since I wrote these words in an article named ‘A chance is a chance is a chance?’. Unfortunately, breaking down chances into expected goals, rather than simply counting shots, has not made it to the Eredivisie, or any other league, yet. But times are about to change…


Strike Zones and Game States

Using our recent explorations on strike zones and game states, we can stratify shots according to location and match situation and come up with expected goals per shot. This is much more valuable than simply adding shot numbers, as it removes the basic – and incorrect – assumption that all shots are of equal value.

Shot location may be the most influential factor when it comes to shot quality, as we’ve learned from the days when StatDNA still posted quality analysis pieces on their blog, but location isn’t the only factor involved. The most difficult – and therefore often unmentioned – factor is defensive positioning, or defensive pressure. Measuring this in detail would require GPS tracking of all players on the pitch, which I’m sure is done behind closed doors at present, but it generates huge amounts of data, which complicates the analysis a lot. And more importantly, data at such a level of detail is not widely available yet.

We’ll have to make do with what we’ve got, and game states serve as a nice proxy for defensive pressure: we’ve seen that teams trailing by a single goal give up significantly better chances than teams defending a single goal lead, a gap of around 25%.

Expected Goals

The challenge for this post is now to convert all our recent explorations of strike zones and game states into something handy and simple. We need a single number to indicate the quality of shots that teams create and concede, or at player level, a number that indicates the quality of shots that a player took. Simply said, we should know how many goals the average Eredivisie player would have scored from the attempts that a team, or a player, has had. We shall term this ‘Expected Goals Scored’.

Actually, it is a very simple concept. Let’s take Ajax and examine their shots in detail. In total, Ajax created 544 shots; the 2 penalties among them are excluded. Here’s a table of Ajax’ 542 remaining shots created per zone and game state.

          GS -2   GS -1   GS 0    GS +1   GS +2
Zone 1    0       0       2       0       1
Zone 2    3       22      82      41      44
Zone 3    3       19      64      29      46
Zone 4    2       19      97      29      39

Our previous explorations have shown how many goals are scored per shot for each combination of strike zone and game state. We can now easily compute the expected number of goals for Ajax’ 542 shots by multiplying Ajax’ shot counts with these league-wide conversion rates, cell by cell.

          GS -2   GS -1   GS 0    GS +1   GS +2
Zone 1    0.800   0.857   0.815   0.833   0.667
Zone 2    0.192   0.179   0.190   0.269   0.274
Zone 3    0.059   0.089   0.063   0.103   0.087
Zone 4    0.028   0.033   0.035   0.022   0.059

For example, from Strike Zone 2 at GS 0, Ajax took 82 shots. The league average conversion rate for shots from Strike Zone 2 at GS 0 is 0.190. Therefore, the total Expected Goals Scored for Ajax from Strike Zone 2 at GS 0 is 82 * 0.190 ≈ 15.6.

We can repeat this exercise for all combinations of Strike Zones and Game States and add all the subtotals. This will show that Ajax had 65.35 Expected Goals Scored with their 542 shots. In other words, the average Eredivisie team would have scored 65.35 goals from Ajax’ shots, if we correct for Strike Zone and Game State. Only one small step to go: divide the Expected Goals Scored by the number of shots, and now we know the quality of shots that Ajax created: 65.35 / 542 = 0.121 Expected Goals Scored per shot.
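
For readers who want to redo the sums, here is a minimal sketch of the calculation in Python. The numbers are copied from the two tables above; the variable and function names (`ajax_shots`, `league_conversion`, `expected_goals`) are my own choices for illustration, not part of any existing tool.

```python
# Rows are Strike Zones 1-4, columns are Game States -2, -1, 0, +1, +2.
# Ajax' shots per zone and game state (penalties excluded), from the first table.
ajax_shots = [
    [0, 0, 2, 0, 1],
    [3, 22, 82, 41, 44],
    [3, 19, 64, 29, 46],
    [2, 19, 97, 29, 39],
]

# League average goals per shot for each cell, from the second table
# (rounded to three decimals, as printed in the post).
league_conversion = [
    [0.800, 0.857, 0.815, 0.833, 0.667],
    [0.192, 0.179, 0.190, 0.269, 0.274],
    [0.059, 0.089, 0.063, 0.103, 0.087],
    [0.028, 0.033, 0.035, 0.022, 0.059],
]


def expected_goals(shots, rates):
    """Multiply shots by conversion rate in every cell and add the subtotals."""
    return sum(
        n * p
        for shot_row, rate_row in zip(shots, rates)
        for n, p in zip(shot_row, rate_row)
    )


total_shots = sum(sum(row) for row in ajax_shots)
xg = expected_goals(ajax_shots, league_conversion)
xg_per_shot = xg / total_shots

print(f"shots: {total_shots}")
print(f"Expected Goals Scored: {xg:.2f}")
print(f"Expected Goals Scored per shot: {xg_per_shot:.3f}")
```

Because the conversion rates in the table are rounded to three decimals, the total from this sketch lands within a few hundredths of the 65.35 quoted above rather than exactly on it; the per-shot figure still rounds to 0.121.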


Quality of Shots Created

We can repeat the trick for each team to come up with the following graph. The bars represent the quality of the shots that teams created.

There is a considerable spread in quality of shots created. PSV and Feyenoord may expect 0.129 and 0.128 goals per shot, while Willem II creates chances that result in only 0.101 goals per shot. In other words, the shots that PSV and Feyenoord create are generally worth about 27% more than those by Willem II. PSV and Feyenoord are followed by the teams that also complete the top-6 in the final league standings, and Roda. In general, the quality of shots created nicely correlates with the league table, with Roda being the big exception. Roda finished 16th in the table, but comes up 4th in terms of the quality of shots created.


Quality of Shots Conceded

We can do the same thing for shots conceded, and measure the quality of shots conceded. This time, of course, lower bars indicate fewer expected goals per shot conceded, an indicator of quality defending.

Again, there is a considerable spread when we compare the best team, Groningen, with the worst team, Heerenveen. Groningen earns their top spot in this chart by doing an excellent job in forcing their opponents to shoot from low quality positions (Zone 4), as we’ve seen previously. In contrast to the Offensive Shot Quality, there is no clear correlation between Defensive Shot Quality and the final league positions. It seems that quality of shots created is a better way than quality of shots conceded to tell good and bad teams apart.


In the end

It’s always a good thing if analysis and observation start to overlap, and with more detailed information to work with, we’re slowly getting there. The mini-series of posts this past week has now led to a simple parameter called Expected Goals, which we can either express per shot, over a match, or over a series of matches. It has an offensive and a defensive side and the former can be applied to teams and players, while the latter is limited to team level, since shots conceded can’t be linked to single defenders.

Next up will be a series where we will compare the outcome in terms of goals scored and conceded with the Expected Goals scored and conceded. The Expected Goals parameter estimates the quality of the chances, while the difference with the actual outcome is an indicator of Finishing Quality, or its defensive equivalent.
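
As a preview of that comparison, here is a rough sketch of the difference in Python. The function name `finishing_difference` is hypothetical, and taking a plain difference (rather than, say, a ratio) is an assumption on my part. The goal tally in the example is a made-up placeholder, not Ajax’ real number.

```python
def finishing_difference(goals, expected_goals):
    """Actual goals minus Expected Goals.

    For shots created, a positive value means more goals were scored than the
    average Eredivisie player would have scored from the same attempts.
    For shots conceded, a positive value means more goals were let in than
    the quality of the chances suggested.
    """
    return goals - expected_goals


# 65.35 is Ajax' Expected Goals Scored from this post.
# The 70 goals are a placeholder for illustration only, NOT the real tally.
placeholder_goals = 70
diff = finishing_difference(placeholder_goals, 65.35)
print(f"goals minus Expected Goals Scored: {diff:+.2f}")
```

Since penalties were left out of the Expected Goals total, goals from penalties would presumably have to be left out of the actual tally too, to keep the comparison fair.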

9 thoughts on “Forget shot numbers, let’s use expected goals instead”

    1. 11tegen11 Post author

      Graham, the correlation between (xGS-xGC) and GD is 0.79.
      For (shots created – shots conceded) and GD this is 0.75.

      The improvement hides in the correlation between xGS and Goals Scored, which goes from 0.73 to 0.81 when using xGS over Shots Created.

  3. Max

    I have followed your matchplots throughout the world cup 2014. I must say that I find them a work of genius. At a single glance I can see how the match went, who scored when and how good the chances were.
    The accusations against your plots, when they fail to explain e.g. Germany’s 7-1 win, I find ridiculous. In this case they rather point out that the outcome was extraordinary in comparison to what should have been expected from an ‘average’ point of view.
    Did you base the ExpG on your available Eredivisie data or did you use something different? And could you please explain what counts as a shot? And the only additional information to create an ExpG from a shot is zone and game state? It seems surprisingly accurate for so little information.

    1. 11tegen11 Post author

      Hi Max,

      Thanks for those generous words!
      I just keep telling myself that the negative voices are always louder, and most happy people enjoy stuff in silence… 😉

      Perhaps I should write a new blog post about my current ExpG model, as the information is quite fragmented on my blog right now.

      The present ExpG model uses a lot more parameters than it initially did. It’s true that shot location, shot type and Game State already convey a lot of information, but the present model goes well beyond those three.
      It isn’t based on Eredivisie data alone, but on a database of over 150,000 shots. This allows accurate modeling of rare situations, as you can see.
      Anyway, I’ll go into more detail in a separate blog post before the new season gets underway!
