With the international break almost over – thank goodness, fans of Dutch national football – this is part one of a planned miniseries looking at how teams in the top leagues of the world have started the season. I plan to look at several leagues this week using the same format each time, starting with the English Premier League.
Most leagues have played some seven or eight rounds of matches, so we may expect key performance indicators like Expected Goals (ExpG) to have settled a fair bit. Generally, the R-squared between ExpG after seven matches and ExpG after a full season tends to be around 0.72, so the relation is quite strong. This means it makes sense to look at current ExpG numbers and try to spot patterns, as well as make some ExpG-based predictions.
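The ExpG-ratio used throughout this piece is conventionally a team's share of the total expected goals in its matches. A minimal sketch, with made-up numbers (the function name and figures are mine, not from the original model):

```python
def expg_ratio(expg_for: float, expg_against: float) -> float:
    """A team's share of the total expected goals in its matches."""
    return expg_for / (expg_for + expg_against)

# e.g. a hypothetical team creating 14.5 ExpG while conceding 9.2:
print(round(expg_ratio(14.5, 9.2), 3))  # 0.612
```

A ratio above 0.5 means the team creates more expected goals than it concedes; Chelsea's 0.721 below is an unusually dominant figure.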
Using the recently explained Good-Lucky matrix, in a format adopted from Benjamin Pugsley, we can easily scan the league for the best-performing teams (horizontal axis) and the most efficient teams (vertical axis). Anyone into football analysis will know that being highly efficient lasts only so long, and PDO levels tend to revert to normal before you know it. Depending on team quality, normal is a PDO of 980-ish for poor teams and 1020-ish for good teams.
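PDO, the efficiency measure on the vertical axis, is conventionally 1000 times the sum of a team's shot conversion rate and its save rate. A minimal sketch with invented figures (not data from the chart):

```python
def pdo(goals_for: int, shots_for: int,
        goals_against: int, shots_against: int) -> float:
    """PDO = 1000 * (shooting percentage + save percentage)."""
    shooting_pct = goals_for / shots_for          # share of own shots scored
    save_pct = 1 - goals_against / shots_against  # share of faced shots saved
    return 1000 * (shooting_pct + save_pct)

# A hypothetical team: 12 goals from 100 shots, 8 conceded from 90 faced.
print(round(pdo(12, 100, 8, 90)))  # 1031
```

League-average finishing and saving give a PDO of 1000, which is why sustained values far above it are treated as luck due to regress.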
The eye-catcher in this chart is Chelsea’s pink dot, which illustrates their supremacy on both axes. Unfortunately for Chelsea, their dominance in efficiency won’t hold, but their ExpG-ratio of 0.721 will still separate them from the rest.
That rest is led by Arsenal, with an excellent performance, but low general efficiency. We could expect the latter to regress a fair bit, but whether they can keep on performing with injuries to key players hurting them remains to be seen.
The next group could be termed the top-4 chasers in terms of performances. Interestingly, efficiency is spread very widely in this bunch, with Southampton riding the PDO wave a bit and Newcastle hard done by. Surprise teams among this bunch are WBA and new-style West Ham, who might be candidates for a bit of regression in ExpG, based on historical performances.
United (4th in the league table) and Everton (17th) wouldn’t have expected to find themselves in the middle of the pack, and despite their widely different league positions, they occupy similar spots in terms of both performance and efficiency. More on that in the graph below.
QPR should catch up with the lower mid-table bunch as their PDO regresses, as should Burnley, although a short performance dip may easily throw them into the orange zone. A dense cluster of three teams should be alarmed by the fact that their season is fuelled by efficiency rather than performances. Don’t be surprised once Swansea (5th), Leicester (12th) and Hull (11th) start their drift down the table.
The final words are for Villa, whose return of no goals and no points from the last three matches initiated an unavoidable drop after their unsustainable season start.
Points per Game
The next graph is a slight variation on the previous one. The horizontal axis still presents the teams’ performance (ExpG-ratio), while the vertical axis now presents the outcome (points per game). Since this graph partly presents the same information, I’m going through it a bit quicker.
Chelsea leads by a wider gap in points than in performances, confirming our conclusion above. In the rest of the table, performances and points are not so much in line, yet. The blue colours that signify good performance are spread from top to bottom. The orange zone, for troublesome performances, holds a mid-table position for now.
Were this my first assessment, I’d have cast severe doubts on ExpG-ratio as a metric. But it has proven its status before, and I’d pick ExpG over PPG any time.
Here’s the ‘sticking my neck out’ part of this mini-series. Using ExpG as a basis, a pretty straightforward model can simulate the remaining part of the season and come to predictions for the final league table. I figured it would be more fun sharing these from time to time, for various leagues, and see what we can learn along the way towards the end of the season.
For this model I’ve limited ExpG to 11v11 or 10v10 situations, filtered out blocked shots (since shot blocking is a skill), filtered out penalties (since they are distributed pretty randomly and skew the numbers a fair bit) and filtered out rebounds. Furthermore, I’ve regressed the ExpG numbers towards last season’s, based on the R-squared between ExpG on each particular match day and ExpG at the end of the season.
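The two quantitative steps here can be sketched as follows: shrink current-season ExpG toward last season's level using the matchday R-squared as the weight, then simulate remaining fixtures with Poisson-style goal draws. This is my own minimal illustration under those assumptions, with invented team names and numbers, not the author's actual model:

```python
import math
import random

def regress_expg(current: float, last_season: float, r2: float) -> float:
    # Shrinkage toward the prior: weight the current-season number by the
    # matchday R-squared, the remainder by last season's level.
    return r2 * current + (1 - r2) * last_season

def _poisson(lam: float, rng: random.Random) -> int:
    # Python's stdlib has no Poisson sampler, so use Knuth's method.
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_match(home_expg: float, away_expg: float, rng: random.Random):
    # Draw goals for each side and award league points (3/1/0).
    h, a = _poisson(home_expg, rng), _poisson(away_expg, rng)
    return (3, 0) if h > a else (0, 3) if a > h else (1, 1)

def simulate_season(fixtures, per_match_expg, runs=10000, seed=42):
    # fixtures: list of (home, away); per_match_expg: team -> ExpG per match.
    # Returns each team's mean points total over the simulated runs.
    rng = random.Random(seed)
    totals = {t: 0 for t in per_match_expg}
    for _ in range(runs):
        for home, away in fixtures:
            ph, pa = simulate_match(per_match_expg[home],
                                    per_match_expg[away], rng)
            totals[home] += ph
            totals[away] += pa
    return {t: pts / runs for t, pts in totals.items()}
```

Repeating the full remaining fixture list thousands of times is also what yields the spread shown in the box plot below, not just a single point estimate per team.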
Without further ado, here’s the graph of predicted points, along with a box plot showing the spread for each particular team. Enjoy!