Can Expected PDO teach us about luck in football?

PDO is a curious stat. It’s a meaningless acronym, it’s both simple and complicated at the same time, and it’s constantly mistaken as an equivalent for good or bad luck.

 

Some background

PDO was born in the brain of Brian King, an active member of the ice hockey analytics community. He posted his work under his forum name PDO, hence the metric got this slightly confusing name.

The definition of PDO is quite simple and takes three steps. First, take all shots that a team has taken and compute the scoring percentage. Second, take all shots that the team has conceded and compute the save percentage. Third, add the two together and multiply by 1000 to lose the silly decimals.

In one formula we then get this.

PDO = ( (Goals For / Shots For) + ( 1 – (Goals Against / Shots Against) ) ) * 1000
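As a minimal sketch, the formula translates directly into a few lines of Python. The function name `pdo` and the example numbers are made up for illustration and do not describe any real team.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = (shooting percentage + save percentage) * 1000."""
    if shots_for <= 0 or shots_against <= 0:
        raise ValueError("both shot counts must be positive")
    shooting_pct = goals_for / shots_for
    save_pct = 1 - goals_against / shots_against
    return (shooting_pct + save_pct) * 1000


# Hypothetical team: 12 goals from 100 shots, 8 conceded from 100 shots.
example = pdo(12, 100, 8, 100)
print(round(example))  # prints 1040
```

A team scoring and conceding at the same rate per shot lands on exactly 1000 with this function, whatever its shot volume.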

Since matches are played between two teams only, and each shot for one team is a shot against another team, a league is a closed system. This means that PDO computed over all shots in a league is always exactly 1000, and the average of the team PDOs sits very close to that.
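The closed-system argument can be checked with a toy league. The three teams and all their numbers below are invented; the only requirement is that total goals for equals total goals against, and the same for shots. This is a sketch of the arithmetic, not real data.

```python
# Hypothetical three-team mini league; every number is invented.
# Each entry: (goals for, shots for, goals against, shots against).
# Totals are consistent: 25 goals and 245 shots on both sides.
league = {
    "Team A": (10, 80, 6, 70),
    "Team B": (7, 75, 9, 85),
    "Team C": (8, 90, 10, 90),
}


def pdo(goals_for, shots_for, goals_against, shots_against):
    return (goals_for / shots_for + 1 - goals_against / shots_against) * 1000


# Pool every shot in the league, then compute PDO on the totals.
totals = [sum(column) for column in zip(*league.values())]
league_pdo = pdo(*totals)

# Plain (unweighted) average of the individual team PDOs.
team_pdos = {name: pdo(*row) for name, row in league.items()}
mean_team_pdo = sum(team_pdos.values()) / len(team_pdos)

print(round(league_pdo, 1))     # 1000.0
print(round(mean_team_pdo, 1))  # 1001.5
```

The pooled figure is 1000 by construction. The simple average of team PDOs only lands near 1000, because teams take and concede different numbers of shots.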

PDO was introduced in football by James Grayson (and quickly thereafter adopted here on 11tegen11). He showed that this metric is an excellent tool to illustrate how teams have historically performed. This makes intuitive sense, because when a team scored all the chances they got and saved all the chances they conceded, yes, it’s safe to say they had been good.

However, we cannot infer that historical PDO translates into future PDO. Put more simply, finishing and saving do not carry over to future performances. I won’t go over all the evidence to support the case that finishing and saving aren’t repeatable, but excellent 2011 pieces from James are to be found here, here and here.

 

Some theory

When PDO was introduced in football, two important assumptions were made. First, that finishing and saving were essentially random. Second, that in a reasonable sample shot quality was more or less equal from team to team.

The first assumption still more or less stands. It might be true that Barcelona finishes chances a little bit better than the next team, but we need large samples to establish which teams are true Barcelonas and which teams are simply getting a series of good bounces. The vast majority of dominance in football is established by taking more shots and conceding fewer.

The second assumption does not stand, and that is a very important point. If we adopt the assumption that holds in ice hockey, that all shots are equal, and finishing is random, then all teams should have PDO values around 1000 in the long run. However, we’ve shown that PDO values don’t approach 1000, but rather stay in the 980 to 1040 zone. Some teams are able to keep their PDO above 1000.

To explain this, we look at ‘Expected Goals’, or xG. This method has been discussed a lot on 11tegen11 over the years. In short, a regression model assigns each attempt a value between 0 and 1, which represents the quality of that goal scoring opportunity. Many factors go into the model and it has proven to be the most predictive single-value metric available.

 

How do teams beat PDO?

In theory there could be two ways to beat PDO, i.e. to consistently achieve PDO values over 1000. The first is to be more efficient with your attempts: convert shots at a higher rate than other teams and save shots at a higher rate than other teams.

The second is to create better goal scoring opportunities than you concede. Then, just convert and save at league average rates (corrected for the quality of attempts) and your shooting percentage and save percentage will be above league standard, resulting in a PDO over 1000.
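To make the two routes concrete, here is a small sketch with invented rates. It assumes a league where the average shot is worth 0.10 xG; that figure, the 20% efficiency edge and the helper name `pdo_from_rates` are all hypothetical.

```python
def pdo_from_rates(shooting_pct, save_pct):
    """PDO from a shooting percentage and a save percentage (both 0..1)."""
    return (shooting_pct + save_pct) * 1000


LEAGUE_XG_PER_SHOT = 0.10  # assumed league-average chance quality

# Route 1: average chances both ways, but better finishing and saving.
# The team scores 20% more, and concedes 20% less, than the chances are worth.
route_1 = pdo_from_rates(
    LEAGUE_XG_PER_SHOT * 1.2,
    1 - LEAGUE_XG_PER_SHOT * 0.8,
)

# Route 2: better chances created (0.12 xG per shot) than conceded (0.08),
# all converted exactly at the rate their quality implies.
route_2 = pdo_from_rates(0.12, 1 - 0.08)

print(round(route_1), round(route_2))  # prints 1040 1040
```

Both hypothetical teams end up on the same PDO, which is exactly why raw PDO alone cannot tell us which route a team took.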

Teams do not consistently beat xG, i.e. teams do not consistently score more or concede fewer goals than their xG values. Sometimes over a full season they will, but hardly any teams beat xG multiple seasons in a row (hello Gladbach!). And in a real world where we study so many teams over a handful of seasons, random effects could provide a reasonable explanation for the fact that some teams do outscore xG a few seasons in a row.

The second explanation, heterogeneity in shot quality, is convincingly proven. Better teams create more shots than they concede, but they also create better shots than they concede. Part of this is due to quality (creating better shots through better play), but another part of this is due to Game State effects. It’s easier to score when you’re already leading, because the trailing team needs to take more risks to try and salvage something from the game.

 

Expected PDO

This is where we finally get to the point I’m trying to make in this post, Expected PDO.

Where normal PDO is computed using actual shooting and saving percentages, expected PDO can be computed using xG per shot created and xG per shot conceded.

xPDO = ( (xG For / Shots For) + ( 1 – (xG Against / Shots Against) ) ) * 1000
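A minimal sketch of the same calculation in Python, taking one xG value per shot. The function name `xpdo` and the four-shot samples are invented for illustration; any xG model that outputs a value between 0 and 1 per attempt could feed it.

```python
def xpdo(xg_for, xg_against):
    """Expected PDO from per-shot xG values (each between 0 and 1).

    xg_for:     xG of every shot the team took
    xg_against: xG of every shot the team conceded
    """
    if not xg_for or not xg_against:
        raise ValueError("need at least one shot for and one shot against")
    xg_per_shot_for = sum(xg_for) / len(xg_for)
    xg_per_shot_against = sum(xg_against) / len(xg_against)
    return (xg_per_shot_for + 1 - xg_per_shot_against) * 1000


# Hypothetical sample: four shots taken, four shots conceded.
shots_taken = [0.30, 0.10, 0.05, 0.15]     # 0.60 xG, 0.15 per shot
shots_conceded = [0.05, 0.10, 0.08, 0.17]  # 0.40 xG, 0.10 per shot
print(round(xpdo(shots_taken, shots_conceded)))  # prints 1050
```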

Normal PDO measures how efficiently a team has converted shots into goals, and prevented the opponent from doing so.

xPDO measures how efficiently a team would be expected to convert shots into goals, and to prevent the opponent from doing so, given the shot quality for and against, and assuming league average conversion of xG into goals.

In the long run we can expect teams to be equally skilled at converting xG into goals, so in the long run we can expect a team’s PDO to approach the xPDO. We should not expect each team’s PDO to revert back to 1000 because the assumption about equal shot quality does not hold.
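A toy simulation can illustrate this pull towards xPDO. It rests on strong simplifications that are mine, not part of any xG model: every shot for has the same xG (0.12), every shot against has the same xG (0.09), shots are independent, and the team takes as many shots as it concedes.

```python
import random


def simulate_pdo(n_shots, xg_for, xg_against, rng):
    """Simulate n_shots for and against; each shot scores with probability xG."""
    goals_for = sum(rng.random() < xg_for for _ in range(n_shots))
    goals_against = sum(rng.random() < xg_against for _ in range(n_shots))
    return (goals_for / n_shots + 1 - goals_against / n_shots) * 1000


XG_FOR, XG_AGAINST = 0.12, 0.09            # hypothetical shot quality
expected = (XG_FOR + 1 - XG_AGAINST) * 1000  # xPDO of about 1030

rng = random.Random(11)  # fixed seed so the run is repeatable
for n in (100, 1_000, 10_000, 100_000):
    simulated = simulate_pdo(n, XG_FOR, XG_AGAINST, rng)
    print(f"{n:>7} shots: PDO {simulated:7.1f}  (xPDO {expected:.0f})")
```

With only 100 shots the simulated PDO can sit far from xPDO in either direction; as the shot count grows, it settles near 1030 rather than near 1000.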

 

(Here comes the part where I need to watch my words, since the term ‘luck’ is probably the most problematic term in analytics.)

Luck

A team that has a stunning season opening, but with a PDO much bigger than their xPDO, has achieved their success on the basis of an unsustainable efficiency. Some people would be happy to call this luck, others may prefer to keep to ‘unsustainable efficiency’. For me that is a semantic debate, where I’m willing to respect personal preference. If I toss a crumpled piece of paper perfectly into my bin 10 times in a row, I’ll consider myself lucky. But on the other hand, I did perform at an elite level of paper tossing for a short (and unsustainable) period of time, which may fill me with pride at my effort.

 

Examples

The important thing for me is that xPDO can be used to learn the reasonable direction for PDO to move. Let’s check some leagues and see what we can observe. In all graphs, the red line represents an identical PDO and xPDO, so we can assume that line to be the magnet that teams are pulled towards. The colors of the dots reflect the gap between PDO and xPDO, which is also the distance from the red line. Orange teams have a lower PDO than could be expected and can look forward to some positive regression. Blue and purple teams have a higher PDO than could be expected on the basis of shot quality and should fear some negative regression. The further from the red line a team is, the larger this discrepancy.

 

Figure: PDO - xPDO Matrix, England Premier League 2015/2016

West Ham shines like a lone star above the rest, only in this case it’s not a good thing for them. Their near 1150 PDO is not backed up by the quality of shots created and conceded. Both of those are quite in balance, so a PDO around 1000 seems a more likely estimate for the long run.

Southampton would go down as unlucky based on a PDO of around 925, but in fact their shot quality created is so much lower than their shot quality conceded that their xPDO is a mere 960. If Koeman doesn’t fix the shot quality issue, joining the big boys is not going to happen for the Saints.

Liverpool have drawn mainly short straws when it comes to finishing. An xPDO of around 1000 is not impressive, but their PDO below 950 will definitely have helped camouflage the underlying performance by the Reds.

Figure: PDO - xPDO Matrix, Netherlands Eredivisie 2015/2016

This plot identifies De Graafschap, currently bottom of the table without a single point, as the most unlucky team so far. Their xPDO of 1000 indicates they have created about equal quality shots as they have conceded, but their PDO of 830 indicates their scoring and save percentages have not lived up to that expectation.

This is in part also true for Utrecht, who may have a PDO near 1000, but who create the best quality shots of the Eredivisie, so they should expect a higher PDO, even over 1050.

On the other hand, Ajax have a quite extreme PDO of over 1150, but given their shot quality they should be around 1040. A big red regression warning seems warranted.

Figure: PDO - xPDO Matrix, Germany Bundesliga 2015/2016

Stuttgart are the De Graafschap of Germany. Their extremely low sub-800 PDO should revert a lot, but they concede better quality chances than they create, so around 970 seems likely in the longer run.

Bayern, Dortmund and Mainz would be identified as over performing under the old adage of ‘PDO goes to around 1000’, but the xPDO value indicates that these PDO values of over 1050 do actually represent their shot quality very well.
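The gap between PDO and xPDO for the teams discussed above can be tabulated with a short sketch. The numbers are rough roundings of the values quoted in the text (‘near 1150’ becomes 1150, ‘sub 800’ becomes 800, and so on), not exact readings from the plots. The 10-point tolerance and the function names are my own assumptions.

```python
# Approximate (PDO, xPDO) pairs as quoted in the text; rounded, not exact.
readings = {
    "West Ham": (1150, 1000),
    "Southampton": (925, 960),
    "Liverpool": (950, 1000),
    "De Graafschap": (830, 1000),
    "Utrecht": (1000, 1050),
    "Ajax": (1150, 1040),
    "Stuttgart": (800, 970),
}


def delta_pdo(pdo, xpdo):
    """Gap between actual and expected PDO; positive means running hot."""
    return pdo - xpdo


def expected_direction(pdo, xpdo, tolerance=10):
    """Direction PDO is likely to move; the tolerance is an arbitrary choice."""
    gap = delta_pdo(pdo, xpdo)
    if gap > tolerance:
        return "down"
    if gap < -tolerance:
        return "up"
    return "steady"


# Print teams from most under-performing to most over-performing.
for team, (pdo, xpdo) in sorted(readings.items(), key=lambda kv: delta_pdo(*kv[1])):
    gap = delta_pdo(pdo, xpdo)
    print(f"{team:<14} PDO {pdo:>5}  xPDO {xpdo:>5}  delta {gap:+5}  {expected_direction(pdo, xpdo)}")
```

Note that the direction is measured against each team’s own xPDO, so Southampton is still expected to move up, just not all the way to 1000.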

 

22 thoughts on “Can Expected PDO teach us about luck in football?”

  1. Matthias Kullowatz (@MattyAnselmo)

    “But on other hand, I did perform at an elite level of paper tossing for a short (and unsustainable) period of time, which may fulfill me with pride at my effort.”

    Like beer pong! Oh college…

    Early on at ASA, the model I was using involved just two predictors: shot differential and finishing differential (a proxy for PDO, I would argue). By the end of the MLS season, both variables were highly correlated with future results, which I think supports your conclusion that not all PDOs regress to 1000.

  2. JW

    I usually like the two-dimensional figures where you rate all teams of a league by PDO and expected goal difference (lucky and good, although maybe I should not use those words).

    With this metric in mind, maybe you should use PDO above expected instead of just PDO? This seems a better indicator of unsustainable efficiency, which will give better good/lucky figures.

    Keep up the good work 🙂

    1. 11tegen11 Post author

      Indeed, I was thinking to use delta PDO in my Good-Lucky matrices.
      Hope the message still gets through to the audience then, which is a difficult balancing act with these infographics.
      It feels tempting to call delta PDO ‘luck’ at times, just to get the message across to most people, but still I want to inform the part of the audience that wants to know the details. Hence a post like this, which I could refer to when using the Good-Lucky matrices.

      1. JW

        That is indeed a hard balance.

        Btw, why don’t you just use goals above expected as your luck-axis? This is easier to understand than PDO, since people will have to know what xG is to understand your graphs anyway.

        Another thing: football is not about scoring goals, but about winning matches. If Ajax were less “lucky” in your graph, this would probably mean they would have the same amount of points, but with a lesser goal difference (so winning 1-0 instead of 3-0). Maybe you should incorporate game state in your graphs in some way?

        1. 11tegen11 Post author

          Goals above expected is closely related to delta PDO indeed, but it’s not the same since PDO isn’t influenced by shot volume.
          The intent of this piece is to illustrate why the old adage that PDO will approach 1000 for every team does not hold true.
          The fun of this process is that it opens up opportunities for further use, like delta PDO. We’ll just see where this will take us.

      1. Max

        Sample size is sufficiently large to allow the unlikely event to happen, right? All leagues have one “freaking outlier” (the unlikely event) but at a second glance the patterns of the teams grouping around the red line look pretty different.

        1. Jörg Seidel

          @Max: yes, I like the chart. It makes total sense to me. As you rightly point out, we should also expect some freak outliers. The funny thing is just that there is one per league, rather than, say, one league without any and another league with two.

    1. 11tegen11 Post author

      Teams will generally approach the red line with more matches played.
      However, due to random variation and potentially also by things not captured by the xG model, not all teams will always do this. To any rule, there are exceptions.
      As a rule of thumb, yes, teams will move towards the red line.

      Furthermore, a season is not enough to truly find the best team with any certainty. We are drilled to think that the team that wins a league must have been the best team in that league. However, and I’m planning to write on this soon, it’s very common for a team that is not the best in its league, to win it. Randomness plays a big role in sports.

      1. Pinkybum

        “We are drilled to think that the team that wins a league must have been the best team in that league.”

        This is a very important point. If individual games are considered to be single statistical events, then 38 games in a season is hardly a significant (in the statistical sense) sample.

        This is why, I believe, a lot of the hand-wringing about the England national team is misplaced: they missed out on advancement last year by only one or two goals!

  3. TR

    I was thinking about a metric that compares xPDO with a metric taking (Actual Goals for) / ExpG(For) and (Actual Goals against)/ExpG(Against) into account. This would indicate over/under-performance. Have you done some work on this?

    1. 11tegen11 Post author

      Please go ahead and share your findings.

      I haven’t done that, because in its essence, xPDO is already actual goals compared with xG, both on offense and on defense. Only corrected for shot volume, because it uses goals per shot and xG per shot.
      My guess would be that your proposed formula finds much the same as xPDO, only multiplied by shot volume, which I have deliberately taken out of the equation to prevent high volume shooting teams seeming more or less lucky just because of their volume.

  6. Max

    I love your work. It’s truly awesome. It appeals to me as a footballer and engineer 🙂
    And I admire how you still find the time to do it despite fathering a child. I hope all is well in this regard? I myself barely find the time to read when our children keep my wife and me busy 🙂
