The ever-present challenge in football analytics is separating luck from skill. In a low-scoring sport such as football, there will always be teams that fly high simply due to a spell of good fortune, and teams that find themselves in a hole they did not really deserve to be in on the basis of their displayed skill level. This is particularly true in Cup competitions, which are notorious for their surprise results, but even the double round-robin format used in most leagues around the world is too short to consistently identify the truly best team.
The challenge is to tell teams apart according to their true skill level, and not to get sucked into the group of people jumping on the bandwagon of teams that simply owe their ‘great performance’ to a spell of good fortune.
A very helpful tool to separate luck from skill is the concept of PDO. This is a very simple metric that originates from the low-scoring sport of ice hockey and is now slowly finding its way into football.
For a detailed description of the concept and the logic behind PDO, please read the introductory post on PDO.
In short, PDO is computed from a team’s saves percentage (the share of shots conceded that do not end up as goals) and the same team’s shot percentage (the share of its own shots that do). Start with the total number of goals conceded and scored, collect the total number of shots conceded and created, and you’re all set. Simply add the team’s saves percentage and shot percentage together and, to get rid of the decimals, multiply by 1000. Since one team’s goals are always another team’s failed saves, the league-wide overall PDO will always be 1000.
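As a rough illustration, the calculation described above could be sketched in Python as follows. The function name `pdo`, its parameter names and the two-team numbers are made up for this example and are not taken from any data set; the only requirement is that shots are counted the same way for and against.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """Return PDO on the usual 1000-based scale (hypothetical helper)."""
    if shots_for <= 0 or shots_against <= 0:
        raise ValueError("shot totals must be positive")
    shot_pct = goals_for / shots_for              # share of own shots that were scored
    save_pct = 1 - goals_against / shots_against  # share of conceded shots kept out
    return 1000 * (shot_pct + save_pct)


# Made-up two-team league: A scores 2 from 10 shots, B scores 1 from 8 shots.
team_a = pdo(goals_for=2, shots_for=10, goals_against=1, shots_against=8)
team_b = pdo(goals_for=1, shots_for=8, goals_against=2, shots_against=10)
print(round(team_a), round(team_b))  # -> 1075 925
```

The two values mirror each other around 1000, because every goal that raises one team's shot percentage lowers the other team's saves percentage by the same event.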
One of the most fascinating things about PDO is that it seems to revert back towards the mean of 1000 for all teams. Although intuition will tell you that better players will finish a higher rate of the chances presented, the raw numbers seem to tell us that better teams don’t separate themselves from weaker teams by consistently obtaining higher shots and saves percentages. Much of this boils down to this influential post by the fantastic James Grayson, who analyzed ten seasons’ worth of EPL data to show that PDO indeed moves ever closer to 1000 as the season progresses.
But the question for this post is: what is a normal level for PDO? Does it continue to regress towards 1000 as matches keep being added, or is there a certain bandwidth of normal PDO levels?
In this graph, Infostrada Sports data has been used to assess PDO levels since the start of the 2010/11 Eredivisie season. So, the horizontal axis contains two and a half seasons of Eredivisie matches with individual match rounds simply numbered from 1 to 86. The vertical axis presents the PDO levels and the lines represent the four different quartiles, with Q1 being the top-25% and Q4 the bottom-25% teams in terms of PDO levels. In order to smooth the curves, the single highest and lowest PDO teams have been left out, which also helps to create four nice groups of four teams out of all 18 Eredivisie teams.
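The grouping behind such a graph could look roughly like the sketch below, applied once per match round. It rests on assumptions the text does not confirm: that each line is the mean PDO of its quartile, and that PDO is accumulated over all matches played up to that round. The name `quartile_means` is hypothetical.

```python
from statistics import mean


def quartile_means(pdo_by_team):
    """Mean PDO of Q1..Q4 after dropping the single highest and lowest team."""
    ranked = sorted(pdo_by_team.values(), reverse=True)
    trimmed = ranked[1:-1]  # leave out the two extreme teams
    if len(trimmed) < 4 or len(trimmed) % 4 != 0:
        raise ValueError("need a team count that splits into four equal groups")
    size = len(trimmed) // 4  # 18 teams -> 16 after trimming -> 4 per group
    return [mean(trimmed[i * size:(i + 1) * size]) for i in range(4)]


# Made-up PDO values for 18 teams after some round: 910, 920, ..., 1080.
example = {f"team_{i}": 1000 + 10 * i for i in range(-9, 9)}
print(quartile_means(example))  # four values, from Q1 (top) to Q4 (bottom)
```

Repeating this for every round and plotting the four values against the round number would give four lines of the kind described.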
Essentially, this is a repeat of James’ work on the concept that PDO regresses towards the mean of 1000, but the graph is extended beyond the single-season level of 34 matches. And as you can easily see, the regression towards the mean seems to stop around the 40-match mark.
From there on, a zone of PDO from 980 to 1020 nicely captures the average PDO and it no longer narrows down further. This probably means that superior teams separate themselves from inferior teams in terms of PDO, but only within this range. Teams that find themselves in a nice league position with PDO levels above 1020 will revert back over time, but top teams with PDO levels of 1020 may not necessarily follow that path as this PDO of 1020 may well be that team’s baseline. Also, inferior teams with PDO levels around 980 may not safely assume that fate will revert their lower league standing over time, as 980 may well be their baseline PDO level.
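The reading of the 980 to 1020 zone given above can be written down as a small rule of thumb. This is only a sketch: the function name `pdo_outlook` and the wording of the labels are invented here, and the zone limits are simply the ones quoted in this post for Eredivisie data.

```python
def pdo_outlook(pdo_value, low=980.0, high=1020.0):
    """Label a team's PDO relative to the assumed 980-1020 zone."""
    if pdo_value > high:
        return "above zone: expect regression downward"
    if pdo_value < low:
        return "below zone: expect regression upward"
    # Inside the zone (limits included) the value may simply be the baseline.
    return "inside zone: may be the team's baseline"


for value in (1045, 1020, 975):
    print(value, pdo_outlook(value))
```

A value exactly on a limit is treated as inside the zone, in line with the remark that a PDO of 1020 may well be a top team's baseline.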
We can produce the same graph for both components of PDO: saves percentage and shots percentage.
These graphs show us that the zone of saves percentage ranges from 87% to 89% and that the zone of shots percentage ranges from 10.5% to 14%. Teams with performances outside these zones have historically been unable to sustain that kind of level and can be expected to regress back to these zones within one season’s time.
In the end
PDO is a powerful concept to separate lucky teams from unlucky teams. Due to the low-scoring nature of the sport, luck is an essential component of achieving a good league table position, but it is generally neglected when evaluating the performances of football teams. A high PDO points to teams that stand on a high they did not earn, while low PDO teams find themselves in a hole they did not deserve.
The original thinking involved the notion that PDO will revert back towards 1000 over time, and while this is certainly true for values outside the 980 to 1020 zone, it seems better to rephrase and say that PDO will revert back to the 980 to 1020 zone, rather than towards 1000 for all teams.