Putting Expected Goals to the test

After yesterday’s post where Expected Goals was explained in detail, today’s post will put the metric to the test. How good is Expected Goals? And is it better than Total Shots Rate?

We’ll compare ExpG and TSR at several levels as we go along. The dataset used for the first part of this analysis consists of all 98 teams in the top-5 leagues, for the 2013/14 season so far. As usual, data comes from Squawka, my go-to site for Opta-driven football data. All comparisons in this piece are made at team level. We’ll leave the individual player analysis of ExpG for another day.

ExpG is calculated as explained in yesterday’s post, and for comparison with TSR, ExpG ratios (ExpGR) are used. No data from the 2013/14 season went into the behind-the-scenes inputs of the ExpG formula. All regression analysis needed to weight the different factors that influence ExpG was carried out on earlier data. The risk of overfitting is therefore minimized.

ExpGR = ExpG for / (ExpG for + ExpG against)

TSR = Shots for / (Shots for + Shots against)
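
Both ratios have the same shape, so a single helper covers them. What follows is a minimal sketch in Python; the function name `share` and the numbers are made up for illustration and are not taken from the dataset.

```python
def share(for_value: float, against_value: float) -> float:
    """Return for / (for + against), the form shared by ExpGR and TSR."""
    total = for_value + against_value
    if total == 0:
        raise ValueError("no shots or ExpG recorded, ratio is undefined")
    return for_value / total


# Made-up numbers for illustration only, not taken from the dataset.
tsr = share(150, 100)       # shots for, shots against
expgr = share(18.0, 12.0)   # ExpG for, ExpG against
print(f"TSR = {tsr:.3f}, ExpGR = {expgr:.3f}")
```

A value of 0.500 means a team creates exactly as much as it concedes; anything above that means it comes out ahead.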


TSR and outcome

First up, the relation between TSR and the outcome in terms of points per game (PPG) and goal difference (GD).

Figure: TSR and outcome

TSR is a very good metric. It correlates nicely with the two most relevant performance indicators, PPG and GD. The R-squared values of 0.55 and 0.58 correspond to R values of roughly 0.74 and 0.76 (the square roots), which is why knowing a team’s TSR provides around 75% of the knowledge needed for perfect knowledge of either PPG or GD. For more, and better, explanations of R-squared and R, check Phil Birnbaum. The man really knows his stuff.
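
For readers who want to reproduce this kind of number themselves, here is a rough sketch of how R and R-squared can be computed for a simple one-variable fit, where R-squared equals the square of the Pearson correlation. The function name `pearson_r` and the toy values are my own invention for illustration; they are not the 98-team dataset, and this is not necessarily how the plots in this post were produced.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented toy values, purely illustrative; not the real data.
toy_tsr = [0.40, 0.45, 0.50, 0.55, 0.60]
toy_ppg = [0.9, 1.3, 1.2, 1.6, 2.0]

r = pearson_r(toy_tsr, toy_ppg)
r_squared = r ** 2  # for a single-predictor linear fit, R-squared is r squared
print(f"R = {r:.2f}, R-squared = {r_squared:.2f}")
```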

In general, R-squared values are higher when leagues have a clear separation into two groups. EPL typically has values over 0.6, while Ligue 1, where the dots are one bunch, generally scores below 0.4.


ExpG and outcome

These two plots show the relation between ExpGR and outcome.

Figure: ExpGR and outcome

At face value alone, you can tell that ExpGR correlates better with outcome than TSR does. The dots are closer to the red regression line, so the R-squared value is a lot higher. For PPG, the R-squared is 0.73, while for GD it is somewhat higher at 0.79.

This is a magnificent correlation between a metric and outcome, but don’t get carried away yet. We would expect ExpGR to do better here, as it carries more detailed information to rate goal scoring chances. The formula behind it is designed to improve the relation with outcome in terms of PPG and GD. It would be a true shock if ExpG did not do a lot better than TSR here. What’s more important is the second half of this piece, looking at repeatability of the metrics.


TSR and repeatability

From here on, a different data set is used, as we’ll now compare the same metric over two consecutive seasons. Data consists of seasons 2012/13 and 2013/14 so far for the top-5 leagues. Sides relegated after the first season have no second season for comparison, and sides promoted for the second season have no first season to compare with. This left 84 teams with consecutive seasons.
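
The filtering step described above comes down to keeping only the teams that appear in both seasons. A minimal sketch, assuming each season is stored as a plain dictionary from team name to metric value; the team names, the values, and the helper `pair_seasons` are hypothetical.

```python
def pair_seasons(first, second):
    """Keep only teams present in both seasons and pair up their values."""
    common = sorted(first.keys() & second.keys())
    return [(team, first[team], second[team]) for team in common]


# Hypothetical team names and values, for illustration only.
season_1 = {"Team A": 0.61, "Team B": 0.48, "Relegated FC": 0.39}
season_2 = {"Team A": 0.58, "Team B": 0.52, "Promoted FC": 0.44}

pairs = pair_seasons(season_1, season_2)
for team, before, after in pairs:
    print(f"{team}: {before:.3f} -> {after:.3f} (change {after - before:+.3f})")
```

The paired values are what go on the two axes of a repeatability plot: first season on one axis, second season on the other.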

Figure: TSR repeatability

TSR is pretty repeatable, producing an R-squared of 0.51. This indicates that TSR in the first season is a moderately good predictor of TSR in the second season. Most teams are roughly in the same ballpark, but deviations of 0.100 are far from rare.


ExpGR and repeatability

The next plot shows ExpGR in the first and second season.

Figure: ExpGR repeatability

ExpGR has even better repeatability than TSR. With an R-squared of 0.67, this metric carries a good signal over multiple seasons. Stripping a few outliers, teams generally don’t deviate more than 0.050.


In the end

This scatter-plot-heavy piece shows that ExpGR beats TSR on both correlation with outcome and repeatability. To borrow from Nate Silver, ExpGR carries more signal and less noise than TSR.

The first part of this post, relating ExpGR and outcome, shows that in measuring team performance, ExpGR should prevail over TSR. This conclusion was probably known intuitively, but is now illustrated and quantified.

The second part of this post is more revolutionary, as it establishes ExpGR as a more reliable parameter to use for predictions. This applies not just to fancy number-heavy predictive models, but also to any casually made claims regarding upcoming matches or final league positions.

TSR still holds the quite relevant advantage that counting shots is a lot easier than building an ExpG model. However, with more and more variations of ExpG models around, these numbers will gradually become easier to obtain.



I feel like I could have put a dozen links to James Grayson’s amazing site in this TSR-heavy post, but I’d rather urge you to just go to his site and check it out thoroughly. It is good.