Can Expected PDO teach us about luck in football?

PDO is a curious stat. It’s a meaningless acronym, it’s both simple and complicated at the same time, and it’s constantly mistaken for a direct measure of good or bad luck.


Some background

PDO was born in the brain of Brian King, an active member of the ice hockey analytics community. He posted his work under his forum name PDO, hence the metric got this slightly confusing name.

The definition of PDO is quite simple and consists of two parts. First, take all shots that a team takes and compute the scoring percentage. Second, take all shots that a team concedes and compute the save percentage. Finally, add the two together and multiply by 1000 to lose the silly decimals.

In one formula we then get this.

PDO = ( (Goals For / Shots For) + ( 1 – (Goals Against / Shots Against) ) ) * 1000

Since matches are played between two teams only, and each shot for one team is a shot against the other, a league is a closed system. This means that the league average PDO is always going to be 1000.
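As a quick illustration, here is the formula in code for a made-up two-team mini-league (the numbers are invented, not real match data), where every shot for one team is a shot against the other:

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = (scoring percentage + save percentage) * 1000."""
    scoring_pct = goals_for / shots_for
    save_pct = 1 - goals_against / shots_against
    return (scoring_pct + save_pct) * 1000

# Toy two-team league: Team A's shots for are Team B's shots against.
pdo_a = pdo(goals_for=12, shots_for=100, goals_against=8, shots_against=80)
pdo_b = pdo(goals_for=8, shots_for=80, goals_against=12, shots_against=100)

print(round(pdo_a, 1))  # 1020.0
print(round(pdo_b, 1))  # 980.0
print(round((pdo_a + pdo_b) / 2, 1))  # 1000.0 -- the closed-league average
```

Whatever numbers you plug in, the two teams' PDOs mirror each other around 1000, which is the closed-system property described above.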

PDO was introduced in football by James Grayson (and quickly thereafter adopted here on 11tegen11). He showed that this metric is an excellent tool to illustrate how teams have historically performed. This makes intuitive sense, because when a team scored all the chances it got and saved all the chances it conceded, it’s safe to say it has been good.

However, we cannot infer that historical PDO translates into future PDO. Put more simply: finishing and saving do not carry over to future performances. I won’t go over all the evidence to support the case that finishing and saving aren’t repeatable, but excellent 2011 pieces from James are to be found here, here and here.


Some theory

When PDO was introduced in football, two important assumptions were made. First, that finishing and saving was essentially random. Second, that in a reasonable sample shot quality was more or less equal from team to team.

The first assumption still more or less stands. It might be true that Barcelona finish chances a little bit better than the next team, but we need large samples to establish which teams are true Barcelonas and which teams are simply getting a series of good bounces. The vast majority of dominance in football is established by taking more shots and conceding fewer.

The second assumption does not stand, and that is a very important point. If we adopt the assumption that holds in ice hockey, that all shots are equal, and finishing is random, then all teams should have PDO values around 1000 in the long run. However, we’ve shown that PDO values don’t approach 1000, but rather stay in the 980 to 1040 zone. Some teams are able to keep their PDO above 1000.

To explain this, we look at ‘Expected Goals’, or xG. This method has been discussed a lot on 11tegen11 over the years. Shortly said, a regression model assigns each attempt a value between 0 and 1, which represents the quality of that goal scoring opportunity. Many factors go into the model and it has proven to be the most predictive single value metric available.


How do teams beat PDO?

In theory there could be two ways to beat PDO, i.e. to consistently achieve PDO values over 1000. Either be more efficient with your attempts: convert shots at a higher rate than other teams and save shots at a higher rate than other teams.

Or create better goal scoring opportunities than you concede. Then, just convert and save at league average rates (corrected for the quality of attempts) and your shooting percentage and saving percentage will be above league standard, resulting in a PDO over 1000.

Teams do not consistently beat xG, i.e. teams do not consistently score more or concede fewer goals than their xG values suggest. Sometimes over a full season they will, but hardly any teams beat xG multiple seasons in a row (hello Gladbach!). However, in a real world where we study so many teams over a handful of seasons, random effects could provide a reasonable explanation for the fact that some teams do outscore xG a few seasons in a row.

The second explanation, heterogeneity in shot quality, is convincingly proven. Better teams create more shots than they concede, but they also create better shots than they concede. Part of this is due to quality – creating better shots through better play – but another part is due to Game State effects. It’s easier to score when you’re already leading, because the defending team needs to take more risks to try and salvage something from the game.


Expected PDO

This is where we finally get to the point I’m trying to make in this post, Expected PDO.

Where normal PDO is computed using actual shooting and saving percentages, expected PDO can be computed using xG per shot created and xG per shot conceded.

xPDO = ( (xG For / Shots For) + ( 1 – (xG Against / Shots Against) ) ) * 1000

Normal PDO measures how efficiently a team has converted shots into goals, and prevented the opponent from doing so.

xPDO measures how efficiently a team converts shots into goals, and prevents the opponent from doing so, given the shot quality for and against, and assuming league average conversion of xG into goals.

In the long run we can expect teams to be equally skilled at converting xG into goals, so in the long run we can expect a team’s PDO to approach the xPDO. We should not expect each team’s PDO to revert back to 1000 because the assumption about equal shot quality does not hold.
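In the same sketch style (again with invented numbers), xPDO simply swaps actual goals for xG in the PDO formula, and the gap between PDO and xPDO hints at the direction of regression:

```python
def xpdo(xg_for, shots_for, xg_against, shots_against):
    """xPDO: PDO with expected goals substituted for actual goals."""
    return (xg_for / shots_for + 1 - xg_against / shots_against) * 1000

# A team creating better chances than it concedes:
# 0.15 xG per shot for, 0.08 xG per shot against (made-up values).
team_xpdo = xpdo(xg_for=15.0, shots_for=100, xg_against=8.0, shots_against=100)
print(round(team_xpdo))  # 1070

# A PDO well above this xPDO suggests unsustainable efficiency;
# a PDO well below it suggests positive regression ahead.
```

So a team like this should be expected to sustain a PDO around 1070, not regress to 1000.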


(Here comes the part where I need to watch my words, since the term ‘luck’ is probably the most problematic term in analytics.)


A team that has a stunning season opening, but with a PDO much higher than their xPDO, has achieved their success on the basis of an unsustainable efficiency. Some people would be happy to call this luck; others may prefer to stick to ‘unsustainable efficiency’. For me that is a semantic debate, where I’m willing to respect personal preference. If I tossed a crumpled piece of paper perfectly into my bin 10 times in a row, I’d consider myself lucky. But on the other hand, I did perform at an elite level of paper tossing for a short (and unsustainable) period of time, which may fill me with pride at my effort.



The important thing for me is that xPDO can be used to learn the reasonable direction for PDO to move. Let’s check some leagues and see what we can observe. In all graphs, the red line represents an identical PDO and xPDO, so we can assume that line to be the magnet towards which teams are pulled. The colors of the dots reflect the gap between PDO and xPDO, which is also the distance from the red line. Orange teams have a lower PDO than could be expected and should expect some positive regression. Blue and purple teams have a higher PDO than could be expected on the basis of shot quality and should fear some negative regression. The further from the red line a team is, the stronger this discrepancy.


PDO - xPDO Matrix: England Premier League 2015-2016

West Ham shines like a lone star above the rest, only in this case it’s not a good thing for them. Their near 1150 PDO is not backed up by the quality of shots created and conceded. Both of those are quite in balance, so a PDO around 1000 seems a more likely estimate for the long run.

Southampton would go down as unlucky based on around 925 PDO, but in fact their shot quality created is so much lower than their shot quality conceded that their xPDO is just a mere 960. If Koeman doesn’t fix the shot quality issue, joining the big boys is not going to happen for the Saints.

Liverpool have drawn mainly short straws when it comes to finishing. An xPDO of around 1000 is not impressive, but their PDO below 950 will definitely have helped camouflage the underlying performance by the Reds.

PDO - xPDO Matrix: Netherlands Eredivisie 2015-2016

This plot identifies De Graafschap, currently pointless at the bottom of the table, as the most unlucky team so far. Their xPDO of 1000 indicates they have created about equal quality shots as they have conceded, but their PDO of 830 indicates their scoring and save percentages have not lived up to that expectation.

This is in part also true for Utrecht, who may have a PDO near 1000, but who create the best quality shots of the Eredivisie, so they should expect a higher PDO, even over 1050.

On the other hand, Ajax have a quite extreme PDO of over 1150, but given their shot quality they should be around 1040. A big red regression warning seems warranted.

PDO - xPDO Matrix: Germany Bundesliga 2015-2016

Stuttgart are the De Graafschap of Germany. Their extremely low sub 800 PDO should revert a lot, but they concede better quality chances than they create, so around 970 seems likely in the longer run.

Bayern, Dortmund and Mainz would be identified as over performing in the old adage of ‘PDO goes to around 1000’, but the xPDO value indicates that these over 1050 values do actually represent their shot quality values very well.


Why Jasper Cillessen is most likely just about an average penalty stopper

It’s July 5th, 2014. One hundred and twenty-one minutes of football in the World Cup quarter final between favorites Holland and outsiders Costa Rica have not resulted in a goal, and a penalty series is imminent. Dutch manager Louis van Gaal makes the infamous move to remove goal keeper Jasper Cillessen in favour of Tim Krul, and Holland wins the penalty series to reach the semi-final of the World Cup.

Jasper Cillessen cools his anger after being subbed off against Costa Rica in the 2014 World Cup.


I vividly remember seeing this very unusual move and immediately thought: “how on earth could they have found any evidence to support this decision?”. Competitive penalties in football are such rare events that goal keeping skill when it comes to stopping penalties should be near impossible to evaluate. In fact, Van Gaal may well have used this move to trick the Costa Ricans into thinking Krul was quite special at stopping penalties, thereby influencing the odds in his and Holland’s favour.1


Low hanging fruit

It turns out that I was in the minority in ignoring any historical penalty stopping data. Football and data were becoming hot in those days, and the low hanging fruit was of course Cillessen’s horrible track record of 18 competitive penalties faced, 18 goals conceded.

Strong opinions sell a lot better than fine nuance, particularly at the highpoint of emotion that is a football World Cup. Ever since the summer of 2014, Cillessen is associated with the words ‘penalty trauma’ and ‘syndrome’. And I’m sticking to professional media outlets here; on Twitter Cillessen was ridiculed in every possible way, bringing the tweeters their desired high number of retweets. Feed the people sweet low hanging fruit, and they will happily swallow.



How can a goal keeper who doesn’t stop any of 18 consecutive penalties still be average? Even a simple statistical test called Chi-square would tell you there is just around a 1% chance that Cillessen is average at stopping penalties.2

There are several reasons to assume that Cillessen, despite his poor track record, is an average penalty stopper.

  1. Cillessen may have around a 1% chance of being average, but he is far from the only goal keeper being studied in this respect. There are over a hundred goal keepers who faced at least 18 penalties, and this influences how we should view the fact that one of them, Cillessen, produced this odd series. If you try it often enough, you’ll sometimes just toss a long series of heads with a perfectly normal coin. As the wiki page on the statistical phenomenon called ‘Bonferroni correction’ puts it: “as we increase the number of hypotheses being tested, we also increase the likelihood of a rare event.”

Simply said, study enough goal keepers and one will indeed not stop 18 penalties in a row, despite being a completely average goal keeper.

  2. To study binary outcomes (yes or no, goal or no goal), 18 observations is a very small set. The idea of statistics is to use numbers to make statements as reliable as possible. Imagine three of the next five penalties not being scored. Would this make you completely turn around on the claim that Cillessen can’t stop penalties? If so, you shouldn’t have made the claim in the first place.

This problem, calling an effect when in fact there is no effect, happens much more often than we think. Studies finding ‘significant’ effects sell better than negative studies. If this is true in the scientific world, imagine how it works in the world of journalism, where people fight each other for clicks and reads.

  3. It’s easy to assume that a goal keeper with a 0/18 record is a very poor penalty stopper, while in fact the only thing that we test with our “1% chance” is whether Cillessen is different from the average goal keeper at stopping penalties. This test does not tell you anything about the effect size. It doesn’t mean that stopping 0/18 is Cillessen’s actual level, but rather that, based on this small set of 18 observations, Cillessen may well be below average. Whether this means that Cillessen concedes 75.9% instead of 76%, or 50% instead of 76%, we can’t say. But any reasonable thinking will lead you to assume that a goal keeper who is generally above the level of rival professional goal keepers in many other aspects can’t be tens of percentage points behind on stopping penalties. So, if any effect exists at all, it is probably a few percentage points at most.
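The first two points above can be made concrete in a few lines of code. Assuming a league-average stop rate of roughly 24% (the flip side of the 76% conversion rate used elsewhere in this post; the exact figure is an assumption for illustration), the chance that one given average keeper concedes 18 in a row is small, but the chance that at least one of a hundred average keepers does so is better than a coin flip:

```python
# Probability that an average keeper (conceding ~76% of penalties,
# an assumed league-average figure) concedes 18 consecutive penalties.
p_concede = 0.76
p_single = p_concede ** 18
print(round(p_single, 4))  # 0.0072, i.e. under 1%

# Probability that at least one of 100 such average keepers
# produces a 0/18 run purely by chance.
n_keepers = 100
p_any = 1 - (1 - p_single) ** n_keepers
print(round(p_any, 2))  # 0.51
```

So a 0/18 run somewhere in a pool of a hundred average keepers is roughly a coin flip, which is exactly the multiple-testing point.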


Now what?

So, should we just ignore the whole 0/18 thing? Probably not.

At best, this point in time, July 5th, 2014 serves as the moment we recognize that Cillessen may have a penalty issue, though odds are he probably hasn’t. Can you imagine a newspaper or twitter feed scoring any points with such a balanced point of view?

The best thing to do would probably be to start tracking his numbers prospectively and see if a new set of observations confirms what we found in the pilot study. Only then can we reliably say whether there’s a true issue with Cillessen and penalties, or just a coincidence.

From penalty 19 onwards, Cillessen saw the first three penalties being converted (Sep 30 2014 APOEL – Oct 13 2014 Iceland – Nov 9 2014 Cambuur), a fourth one hitting the post (Aug 27 2015 Jablonec) and the fifth one being converted (Sep 3 2015 Iceland). This combines for a 1/5 record.
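One way to see how little 23 penalties can move a reasonable estimate is a beta-binomial sketch. The prior below (a stop rate of 24% given the weight of 100 penalties of evidence) is my own arbitrary choice for illustration, not anything from the post, and the 1/5 record is taken at face value:

```python
# Prior: stop rate ~24%, worth 100 penalties of evidence (assumed).
prior_stops, prior_goals = 24, 76

# Observed record so far: 1 non-goal from 23 penalties (0/18 plus 1/5).
obs_stops, obs_goals = 1, 22

# Beta-binomial update: just add observed counts to the prior counts.
post_stops = prior_stops + obs_stops
post_goals = prior_goals + obs_goals
posterior_mean = post_stops / (post_stops + post_goals)
print(round(posterior_mean, 3))  # 0.203
```

Under these assumptions the estimated stop rate drops from 24% to about 20%: a few percentage points below average, nowhere near the zero that the raw 0/18 record suggests.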

Obviously, it is debatable to ascribe a penalty that hits the woodwork to Cillessen’s penalty stopping skills, but changing the definitions of our study halfway through because the outcome doesn’t really suit our desired statement is fraud with numbers, however tempting it is. Think of the headline ‘Cillessen still hasn’t stopped a penalty in his career’.


In the end

We will follow Cillessen’s new 1/5 set with close attention until the set has another 18 observations and see where we stand. Chances are that we could publish our revolutionary findings and claim that new research shows that Cillessen doesn’t have a penalty syndrome. Chances also are that no one will be interested in such a headline, and mass media will have found new situations where numbers can be abused in order to get clicks and sell papers.



1 For a nuanced view on events surrounding the Cillessen-Krul substitution, read this post.

2 Based on Cillessen stopping 0 from 18 penalties versus other keepers in my dataset stopping 1020 from 3229 penalties.

A close look at my new Expected Goals Model

The summer may be the best part of the year for football analysis. Ironically, a break from the frantic rhythm of football stimulates exciting developments in football analysis. Behind the scenes, like players in training camps, many analysts use this time of year to lay the foundations for next season.

In my case most time has been invested in a major re-structuring of data behind the scenes, with most algorithms being rebuilt from scratch. I won’t bore you with that, and will skip to the most interesting development of this summer: the next generation of my Expected Goals model.

The idea to create an Expected Goals model was there before the data was. In 2013, enough data was available for a very basic first model, which was extended around one and a half years ago. This summer means the second major overhaul. The initial reluctance to share the workings of the model has slowly dissolved, as the model became more and more sophisticated. After all, more and more ExpG models appear and if you don’t know how a particular model works, then why would you trust any of its output?


So, what does the ExpG model do?

For each goal scoring attempt, a number between 0 and 1 is assigned to indicate the chance of the attempt resulting in a goal.

The easiest attempt to explain is a penalty.

Typically, penalties are awarded around 0.76 ExpG, based on historic conversion rate. A penalty is the easiest attempt to classify, since it’s a situation isolated from play, with a standard spot for taking it. The number of penalties taken is way too low to factor in player or keeper performance, so we do best by just estimating 0.76 ExpG.



My ExpG model is divided into 10 different types of attempts, and each of these types has its own formula. In more technical terms, separate regression models are created for each of these situations. After all, to evaluate an open play shot we need to look at other factors than we do for a shot from a corner. These are the 10 situations for which separate models are in place.

  • Open play shots
  • Open play headers
  • Penalties
  • Direct Freekicks
  • Indirect Freekicks
  • Corners
  • Throw-ins
  • Rebounds from a GK save
  • Rebounds from woodwork
  • Fast Breaks

The total database to work with comes from publicly available data sources where Opta data is presented. The database used to construct the ExpG algorithm contains nearly 400,000 attempts. Every match is added to the reference database to make the model even better, so the number of attempts that serves as reference is increasing rapidly.

For each situation, different factors influence the odds of scoring. For example, for open play shots a through ball assist is quite a huge bonus, but for direct free kicks through ball assists do not occur. This makes me prefer different models for different situations. The choice to go with these 10 situations is arbitrary, but this set-up connects nicely with Opta data and in my experience it works well. Basically, open play is the only situation with separate models for shots and headers, while in all other situations I’ve preferred to keep all attempts in one model and use shot type as a factor in the model.


Which factors does the model evaluate?

For each situation, regression analysis is performed with goal or no goal as the outcome variable. The standard set of variables is tested and significant predictors are kept in the model. Let’s go over all of the factors that are in the model, and explain their influence on ExpG.

  • Shot location
    • By far the most important predictor. Most models probably use shot zones, but I prefer a different method, which doesn’t have the granularity of zones, but rather uses location as a continuous parameter. In my model, location translates into two parameters: angle of view of the goal and distance from the goal.
    • Angle of view means two lines are drawn from the shot location to each post, and the angle between these lines signifies the view the player has. For very close shots, these angles go to a theoretical 180 degrees and for long range shots, or shots from acute angles, this approaches 0 degrees. Obviously, wide angles are better, since they signify closer shots from better angles.
    • Since the angle of view parameter is also in the model, the influence of distance on ExpG is a bit more complicated. Once angle is corrected for, distance has a positive impact. Think of a shot with an angle of view of just 5 degrees. This is either a close shot from a very acute angle, or a shot from way outside the box. The chance of scoring is higher for shots from outside the box than for shots from very acute angles, so more distance will raise ExpG in this particular model, where the angle of view is already corrected for.
  • Shot type
    • Foot shots are better than headers, after all other factors have been corrected for. However, a first attempt at implementing strong or weak foot did not improve the model. Perhaps we’ll get back to this one day.
  • Big Chance
    • Opta’s coders assign this code where they judge attempts to be big chances. This factor has quite a big impact on the ExpG, which supports the fact that on ball data alone is not enough to perfectly assess ExpG. Think of a weird long range shot when a keeper is out of place. For an ExpG model this will always be a hard attempt to qualify, since keeper position is not directly available. To me, this is a perfect example where data is helped by human judgement, since off ball event data would make analysis infinitely more complicated.
  • Start of possession
    • Attempts that result from possessions won high up the pitch have a higher chance of resulting in a goal than attempts from possessions that started further down the pitch. A fine (but not the only) example where defensive pressure is in the ExpG model, though indirectly rather than directly. This factor is a recent addition, and based on some explorations there seems to be a sharp cut-off around 4/5th of the pitch. The difference is so sharp that for now I’ve put it in the model as a binary: either an attempt comes from a high turnover, or it doesn’t.
  • Assist
    • All attempts are either assisted or they are not. Assisted shots are assisted either intentionally or not. The unintentional assist stands for a casual pass that was never intended to provide a scoring chance, but was turned into a shot anyway. Opta makes this distinction, and I think it is very handy.
    • Intentional assists are a big plus for ExpG. This makes intuitive sense, since the assisting player makes a deliberate choice to allow a team mate to shoot (or head) the ball at goal, which illustrates a quality attempt.
    • Unintentional assists have a negative impact on ExpG compared to unassisted shots. Most of these attempts will be rather forced, and not of the highest quality. Unless, of course, a brilliant dribble precedes the attempt, but that kind of factors will come later.
  • Through ball
    • Nearly the best assist type possible. A through ball eliminates one or more defenders, forcing the remaining defenders into unwanted choices, and increasing the odds of scoring. Hence, a big bonus for ExpG.
  • One pass after a through ball
    • This is the best assist possible, as far as my variables go. It’s even better than a shot coming directly from a through ball. Mostly this pass will be sideways to eliminate, or at least wrong-foot, the goal keeper.
  • Cross
    • Crosses are bad. This could be a title for a future post, but it is certainly true that crosses have an independent negative impact on ExpG. This is not to say teams should never cross a ball, or crossing as an offensive strategy is always bad, but it does say that after all other factors have been corrected for, crosses have quite a negative impact on ExpG. Crosses may be an efficient way to create goal scoring chances, but they won’t be the best way to create quality attempts. There is a balance in crosses somewhere. Too many signifies too low quality attempts and too few signifies a team that may create too few attempts.
  • Dribbles
    • Dribbles increase the odds of scoring. Much like a through ball, at least one defender is eliminated, but unlike after a through ball, said defenders may come back into position later in the same attack. So, the effect is smaller than that of a through ball, but it does help. Oh, and more dribbles preceding the same attempt increase the effect, which makes intuitive sense.
  • Dribbles around the keeper
    • This is probably the biggest plus for ExpG. Shooting a football into an empty net is easier than scoring with the keeper in place, who’d have thought?
  • Vertical speed
    • Attacking at speed is beneficial in open play situations. This is measured quite roughly, since data is stamped by second, but it still has an independent effect on ExpG. Leaving defenders less time to settle is a good thing, and it can be measured.
  • Number of Touches
    • Creating attempts after lots of touches in a possession spell is good. It’s probably to be seen as a sign of dislocating the defense. This isn’t to say that the passing game is superior, but when it does result in an attempt, it seems to be a relatively good one.
  • Game State
    • Even after correcting for all the factors above, Game State still has an independent effect on the odds of scoring. GS -1 is the hardest state to score in. However, for direct free kicks this factor is not in place, which makes sense, as teams probably don’t defend direct free kicks differently according to the score line. This sounds better the other way around: teams do defend differently according to the score line in open play, and to a lesser extent also for indirect free kicks and corners. In regular play, the effect is much more pronounced for shots than for headers. Another case that makes sense, since those GS +1 counter attacks will be aimed at creating shots rather than headers. Small note: since better teams lead more and poorer teams trail more, the debate about Game State is full of nuances and cannot fully be put to bed based on just this data.
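The ‘angle of view’ location parameter described earlier can be sketched in a few lines. The coordinate system here (goal line at y = 0, goal centred on x = 0, units in metres) is my own assumption for illustration; the model’s actual implementation may differ:

```python
import math

GOAL_HALF_WIDTH = 7.32 / 2  # posts at x = -3.66 and x = +3.66, on the line y = 0

def angle_of_view(x, y):
    """Angle in degrees between the lines from (x, y) to each goal post."""
    left = math.atan2(-GOAL_HALF_WIDTH - x, y)
    right = math.atan2(GOAL_HALF_WIDTH - x, y)
    return math.degrees(right - left)

def distance_to_goal(x, y):
    """Distance in metres from (x, y) to the centre of the goal."""
    return math.hypot(x, y)

# Penalty spot, 11 m out, dead centre: a wide view of almost 37 degrees.
print(round(angle_of_view(0, 11), 1))  # 36.8
# Long range, 34 m out, dead centre: the angle shrinks towards 0.
print(round(angle_of_view(0, 34), 1))  # 12.3
```

Note how the long-range example lands in the same ballpark as the El Ghazi free kick below (10.4 degrees at distance 34.1), which is consistent with a slightly off-centre location.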


So, how good is the model?

All this complexity is worth nothing if an ExpG model doesn’t beat a simple shot count. However, proving the quality of an ExpG model is a nuanced business, and it isn’t something that I’m going to add to this, already quite extensive, post. Over time, probably during one of those dull international breaks, I’ll post another piece where this new model is tested with respect to its predictive powers, just like I did for the previous model.


Can we get some examples?

Yes, of course. Nothing speaks to the mind as images do. Here are some shots from the past weekend, with their respective ExpG’s. If you like these examples, we may turn this into a recurring thing: YouTube clips with the ExpG explained in full.

El Ghazi in AZ- Ajax

I like this one, because (A) it’s a stunner of a goal, and (B) ExpG obviously assigns a low number to it. The location is unfavourable, and there isn’t a single factor that helps raise ExpG. In fact the ExpG is so low that an estimated 75 shots from this situation are needed to score one goal.

ExpG: 0.013

Situation: Indirect Free Kick

Shot location: Angle of view 10.4 degrees and distance 34.1.

Shot type: foot shot

Big Chance: no

Start of possession: no high turnover

Assist: unintentional

Through ball: no

One pass after a through ball: no

Cross: no

Dribbles: 0

Dribbles around the keeper: 0

Vertical speed: 2.05 meters per second

Number of Touches: 2

Game State: 0



Lucas Moura in Lille – PSG

Another amazing goal, but a challenging one for ExpG models. The location in itself isn’t all that good, but the context more than makes up for it. In the data this attempt shows up as a Big Chance, after a dribble around the keeper, after a through ball and after a decent number of touches. All of these factors help raise ExpG to a much higher level than any other shot from that position would.

ExpG: 0.447

Situation: Regular play shot

Shot location: Angle of view X degrees and distance X.

Shot type: foot shot

Big Chance: yes

Start of possession: no high turnover

Assist: intentional

Through ball: yes

One pass after a through ball: no

Cross: no

Dribbles: 1

Dribbles around the keeper: 1

Vertical speed: 1.6 meters per second

Number of Touches: 16

Game State: 0


Georginio Wijnaldum in Newcastle – Southampton

Our third and final example is another beauty. It’s also quite different from the goals before, as we can see in the data. Attempts from fast breaks are good, and they are processed through the fast break model, which doesn’t have the same factors on board, since not all factors relevant in usual open play situations are also relevant in fast break attempts.

ExpG: 0.202

Situation: Fast break

Shot location: Angle of view 41.0 degrees and distance 8.5.

Big Chance: no

Start of possession: no high turnover

Through ball: no

One pass after a through ball: no

Cross: yes

Dribbles around the keeper: 0

Game State: 0

Introducing the European Power Rankings

We are at an interesting phase of the season. Enough matches have been played to make reliable assumptions about the strength of the teams involved, yet most relevant outcomes in league football are still to be determined. But the last two weeks have brought a new challenge to light too, as the European Cup Competitions awoke from their annual winter sleep.

When Ajax plays Groningen, usual models can provide an estimate of the teams’ playing strength, be it by traditional methods like shot rates, more advanced stats like Expected Goals, or even by the new Composite Team Rating.



But Ajax played Legia, a team they had never faced before. Legia have played just three double-legged confrontations with Dutch teams ever, with one of those going back to the early seventies, and the most recent match-up with a Dutch side was their clash with PSV in 2011.

So, comparing historical outcomes between Ajax and Legia doesn’t work. Comparing outcomes between Legia and teams that Ajax play regularly, or Polish teams that play Ajax isn’t going to work either. Now what?



Well, now math!

Without overcomplicating things (hopefully), I’ll explain the basics of a solution to tackle this problem. All of this leans heavily on an idea of Michael Caley (@MC_of_A), who came up with a tweet, and an ESPN article, on European Power Rankings, a few weeks ago.

The basic idea is that although Ajax and Legia didn’t meet before, and not one team played both Ajax and Legia this season, if we look at all matches played in European leagues and European club competition, there are still enough indirect links between the teams to get an estimate of their strength.

We’ll use a data set of all Europa League matches and Champions League matches, including qualifiers. To this set we add all matches played in leagues that have teams involved in European football this season. For the main EL and CL tournaments and the top-5 leagues, plus the Eredivisie, Russia and Turkey, we use ExpG numbers. For all other matches we use goals.



Next, a linear regression is constructed to assess the influence of both teams on the match outcome: ExpG results if available, goals results if not. The regression assigns a coefficient to each team to reflect the team strength. Good teams are associated with a positive result in terms of ExpG / goals, bad teams with a negative result.

With the present abundance of European football, there are enough links between the teams to obtain decent rankings. Ajax may have played Team X that has played Team Y from league Z. Other teams from league Z have played Dutch teams, and other teams that have played Legia, and so on. Without us having to dizzy ourselves by discovering all those links, the regression just shows us how to estimate the strength of each team.
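A minimal version of such a regression can be sketched with an ordinary least-squares fit on a made-up three-team mini-league (the teams and results are hypothetical; the real rankings use ExpG difference where available and goal difference elsewhere):

```python
import numpy as np

# Teams: index 0 = A, 1 = B, 2 = C (hypothetical teams).
# Each row of X marks the two teams in a match: +1 for the first
# team, -1 for the second. y is the margin from the first team's view.
X = np.array([
    [1, -1, 0],   # A vs B: A wins by 2
    [0, 1, -1],   # B vs C: B wins by 1
    [1, 0, -1],   # A vs C: A wins by 3
], dtype=float)
y = np.array([2.0, 1.0, 3.0])

# Minimum-norm least squares: the coefficients are the team strengths,
# identified only up to an additive constant (lstsq centres them).
strengths, *_ = np.linalg.lstsq(X, y, rcond=None)
ranking = np.argsort(-strengths)
print(ranking)  # strongest first: A, then B, then C
```

Even though the strength values themselves are only meaningful relative to each other, their differences reproduce the observed margins, which is all a ranking needs.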

The next graph shows you the top-20, with the number representing the coefficient that a particular team is assigned. For now, don’t pay too much attention to the exact numbers, as I still need to work this into something more satisfying than an uninterpretable regression coefficient. Just use it to get an indication of order and separation between teams.

Top 20 - bar chart - 3 March 2015

Looking at the top-20, we can see that this method passes the eyeball test. Bayern and Barcelona are shown as the strongest teams in Europe, while the chart is filled with the usual suspects of European club football. The top-20 is dominated by Germany, Spain and Portugal, while France and Italy are limited to a single top-20 side.

What about the world’s richest league? The EPL teams, apart from league leaders Chelsea, play an outside role in Europe’s elite, according to this model.


Ajax, PSV and Feyenoord

To find Ajax and Legia we have to scroll quite a bit further down, far away from the top-20, to 191st and 172nd place. At the same stage, PSV (92nd) unsurprisingly lost heavily at Zenit (12th), who, as the second highest ranked team in the competition, remain one of the favourites to win the Europa League according to the European Power Rankings. Feyenoord (77th) were eliminated by Roma (32nd), who are rated significantly higher than the Rotterdam side.


Next week

Let’s finish with the rankings for next week’s ties.

(89th) Everton – Dinamo Kiev (51st)

(112th) Dnipro – Ajax (191st)

(12th) Zenit – Torino (44th)

(11th) Wolfsburg – Inter (36th)

(13th) Villarreal – Sevilla (16th)

(23rd) Napoli – Dinamo Moscow (147th)

(49th) Club Brugge – Besiktas (80th)

(47th) Fiorentina – Roma (32nd)





Introducing the Composite Team Rating

Football is a game that seems simple at first, and proves more and more complicated the longer it is studied. It is also relatively easy to explain when the match is over, but so very hard to predict beforehand. What’s a valuable metric to assess historical performance isn’t always the best tool to make predictions. This article will introduce a composite team rating, and show that it is a better predictor than Expected Goals alone.


Early days

Before the birth of analytics, a simple look at the table was all we had. An educated guess as to which team would win the next match was the best we could do. Much like it’s done on TV nowadays, really.

In the early days of analytics we thought we could do better than educated guesses. We wrote pages full of shot based metrics, with Total Shots Ratio the darling we loved most. Later on, with more data came newer, more exciting and more attractive loves and TSR was traded for Expected Goals. The gap between the mainstream and the analytics blogosphere widened.

Intuitively simple, counting shots and weighing them by quality of scoring, Expected Goals is now the mainstay of assessing performances of football teams. It’s probably coming to a TV near you, in some universe, at some point in time.


History or future?

I still believe Expected Goals is the best metric to represent historical performances, i.e. answering questions like which team has performed best over a certain period of time. In predictive modelling, however, things may just be a bit different. It’s not so much about historical performance, it’s about how much of that historical performance proves repeatable.

Until recently, I took a metric that I thought best represented historical performance, like ExpG-ratio. From there on, I simulated matches based on these ratings, expecting teams to just cruise along at the speed indicated by their historical performances in terms of ExpG-ratio. With ExpG-ratio being the single input of a predictive model, each team at a certain ExpG-ratio was expected to perform at the same level in future matches. Of course, fixture planning would dictate how many points they would win, but the underlying predicted performance would be the same.


A fictional example

Imagine Chelsea having a reasonable, though not overwhelming season start. After ten matches they have recorded an ExpG-ratio of 0.600. Good for most teams, but usually not enough to seriously challenge for the title. In the same fictional league, West Ham United have had a magnificent start to their campaign, some balls just happened to bounce well, a few calls went their way, and the Hammers have also recorded an ExpG-ratio of 0.600 after ten matches.

Common sense would dictate that Chelsea are expected to operate at a higher level than West Ham over the remaining matches of the season. Most likely, Chelsea will return to elite 0.650 levels, while West Ham would regress towards mid-table 0.500 obscurity. We all watch football, we know that, right?

The model, however, doesn’t have eyes and is just fed two teams with identical ExpG-ratios. Hence it expects both teams to continue at that 0.600 pace we all just said we knew both teams wouldn’t hold. So, what do we know that the model doesn’t?


Historical regression

Our knowledge of Chelsea and West Ham is partly based on historical information, like Chelsea’s excellent performances over the past decade, and West Ham’s lower mid-table years with some time spent in the Championship recently. Theoretically, we could plug that into the model and have teams regress to historical performances. But there’s a danger here, and a big one too. Things can change quite fast in football, even at club level. Think of the departure of Sir Alex at United, or the sudden influx of money at clubs like City or PSG.

Some historical regression is good, but be careful, too much historical regression in the model holds a big risk of missing sudden changes.
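One simple way to implement ‘some, but not too much’ historical regression is to shrink the current-season ratio toward last season’s number with a weight that fades as matches accumulate. The prior weight of 10 matches below is an arbitrary illustration, not a fitted parameter of any actual model.

```python
# Shrink this season's ExpG-ratio toward a historical prior; the
# prior's influence fades as matches_played grows.
def blended_rating(current_ratio, matches_played, last_season_ratio, k=10):
    """Weighted average of the current-season ratio and a historical prior.

    k is the prior's weight expressed in 'matches worth' of evidence.
    """
    w = matches_played / (matches_played + k)
    return w * current_ratio + (1 - w) * last_season_ratio

# Two teams both at 0.600 after ten matches, but with different
# histories (elite 0.650 vs. mid-table 0.480):
print(blended_rating(0.600, 10, 0.650))  # drifts above 0.600
print(blended_rating(0.600, 10, 0.480))  # drifts below 0.600
```

With k = 10, a team that has played 10 matches sits exactly halfway between its current and historical numbers; raise k and the model trusts history more, which is precisely the risk of missing sudden changes described above.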


Feed the model

The other solution, and that’s where I’m going, is to feed the model more information about the season at hand. ExpG numbers are still an important driver of the team rating that the model produces, but raw shot numbers, shot on target numbers, goals scored or conceded, and even pass ratios hold information that may make ExpG numbers steadier.

Chances are that even with equal ExpG-ratios of 0.600 over their fictional first ten matches, Chelsea and West Ham will differ in terms of shot numbers and pass patterns. These shot numbers and pass patterns form something like a fingerprint of a team’s underlying activities leading to a certain ExpG-ratio. Some lucky deflections, some welcome yet debatable offside calls, a sending off here or there; it can all help steer a club’s ExpG-ratio in a direction that won’t hold for the future. Use more metrics, and your assessment will gain stability, particularly in the early stages of the season.

Adding this info to the ExpG-ratios will, in all likelihood, lead to a lower Team Rating for West Ham and a higher Team Rating for Chelsea. Subsequently, more points will be predicted for Chelsea than for West Ham, despite an equal assessment of present day performance in terms of their 0.600 ExpG ratio. At the end of this article, a more mathematical detailed explanation of the present Team Rating model will be given. For now, remember the new Team Rating as a broad assessment of shot numbers, Expected Goals and passing patterns.



The best way to show the performance of the Expected Goals model and the new Team Rating is the graph below. It shows the gap between predicted points per game for the remaining matches of the season and actual points per game for those matches, at different stages of the season.

ST DEV for Future PPG - All Full Metrics

Predicting is hard when you’ve got little information about the teams, so initially the gap between predicted points per game (PPG) and actual PPG is quite wide. As the season progresses, predictions become more accurate, to the point where the number of remaining matches becomes so small that predictions, again, are harder to make.

From this graph it is clear that using very raw information like points or goals is not a good way to predict future points. In other words, the league table is not your best source of information if you want to find out about team strength.

Total Shots Ratio and Shots on Target Ratio are a big step forward in comparison with points or goals. Predictions based on these relatively simple metrics are better in all stages of the season.

The red line is the Expected Goals model, which delivers the best predictions of any non-composite rating from match day 12 onwards, but needs those 12 matches to pick up enough information.

The orange line is the new 11tegen11 Team Rating. As said before, it holds information about total shot numbers, shot on target numbers, actual goals scored and passing patterns. See below for more details.

It is quite clear that the new Team Rating outperforms the old Expected Goals model significantly. Even after computing Expected Goals ratio, there still is valuable information left in simple shot, pass and goal numbers!

For any predictions made I’m using the Composite Team Rating now, unless explicitly stated otherwise.

For interested readers, I will explain the model in more detail below the next graphs. If you’re just here for the football, and you’re not into the details of the machinery behind the predictions, feel free to check out here!

Team Rating - bar chart - English Premier League 2014-15 16 februari 2015


The model I now use for predictions concerning the 2014/15 season is a linear regression based on two seasons of data: 2012/13 for the top-5 leagues and 2013/14 for the top-5 plus the Eredivisie.

The dependent variable in the regression is future PPG. Independent variables taken from the present season are goals, shots, shots on target, ExpG, passes, passes completed, passes in the box and passes completed in the box. All of these are recorded both as scored and as conceded. Furthermore, points per game, goals for and goals against from the past season are used, as well as league (as a categorical parameter). So, each team records as many lines as there are match days, minus one, since there’s nothing left to predict after all matches have been played.

The regression formula produces a Team Rating for each team, which is then scaled to resemble the 0-1 rating scales we know from TSR and ExpG-ratio. So elite teams will have Team Ratings around 0.700 while very poor teams will score Team Ratings around 0.350. Since this Team Rating comes from a regression formula based on multiple historical leagues, it is no longer true that the average over a league should always be 0.500.
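A stripped-down sketch of this setup, with made-up numbers and far fewer predictors than the real model. The 0.3-0.7 rescaling at the end is my own illustrative choice, not the model’s actual scaling:

```python
# Per team, a feature row of season-to-date totals is regressed on
# future points per game; the fitted value becomes a raw rating,
# which is then rescaled to a familiar 0-1 style range.
import numpy as np

# columns: goals_for, goals_against, shots_for, shots_against,
#          ExpG_for, ExpG_against (all invented)
X = np.array([
    [22.0, 8.0, 160.0, 90.0, 18.5, 9.0],   # a strong team, mid-season
    [12.0, 14.0, 110.0, 120.0, 11.0, 13.5],
    [7.0, 20.0, 80.0, 150.0, 8.0, 19.0],
])
future_ppg = np.array([2.3, 1.3, 0.7])

# Fit a linear model with an intercept column
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, future_ppg, rcond=None)

# Raw rating = predicted future PPG; rescale into roughly 0.3-0.7
raw = A @ coef
rating = (raw - raw.min()) / (raw.max() - raw.min()) * 0.4 + 0.3
print(rating)
```

Note that because the rescaling is done over multiple leagues in the real model, nothing forces a single league’s ratings to average 0.500, exactly as described above.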

Oh, and for the graph in this article, a regression was run on 2012/13 top-5 leagues data only, to leave the 2013/14 data for assessing the model’s performance. Always avoid overfitting, you know.

Obviously, the disadvantage of a Composite Team Rating is that it’s less intuitive than Total Shots Ratio or Expected Goals Ratio. This makes it harder for a reader to assess what it really means, and it forces the reader to trust the model, since the ‘under-the-hood’ part becomes quite complicated. On the flipside, the model performs better and produces more stable predictions. For me, the improvement in prediction accuracy beats the reduction in interpretability.

If the simpler model performed in the range of the more complex model, we would always prefer the simple one. For the audience it’s easier to understand what goes on within the model, and therefore to decide to trust it. For the creator of the model it’s easier to spot potential errors, either in the data that goes into the model or in the model itself. But I guess the easy days of Total Shots Ratio are over. Football just ain’t so simple.


Feel free to ask questions in the comments below if you want to know more!

The best predictor for future performance is Expected Goals

One of the oldest challenges for football fans is to estimate the strength of teams. For years and years, this was quite a simple matter actually. You had the league table, showing points won and goals scored or conceded, and that was it. All the rest was left to debating softer observations as to which team didn’t yet get the rewards for their good performances, or which team was flying higher than their wings would carry them.

Fast forward to the days of football data and all kinds of detailed metrics are just a mouse click away, thanks to sites like WhoScored and Squawka delivering OPTA data for free. No longer are we limited to ranking teams objectively on the basis of points and goals only. Shots, shots on target, or even expected goals from those shots can be thrown into the debate. What’s more, some people might even argue that only a subset of those shots should be counted, at close or at tied game states perhaps?

In this post, we will study the performance of 5 different metrics and see if we can establish which one holds the best predictive power at which stage of the season. The data set consists of the 2012/13 and 2013/14 seasons for the top-5 leagues in Europe, and the 2013/14 Eredivisie. I’ve selected the following metrics, with clear definitions found at the bottom of this post.

  • Points per Game
  • Goal Ratio
  • Total Shots Ratio
  • Shots on Target Ratio
  • Expected Goals Ratio

All of these metrics are tested for their correlation to future performance in terms of future points per game and future goal ratio. This is done for each match round of the season.

For example, after 8 match rounds played, all five metrics are computed over match rounds 1 to 8 and compared to points per game and goal ratio from match round 9 to the end of the season. This is done by fitting a linear model and noting the correlation in terms of R-squared. This process is repeated for each metric at every match round, to obtain R-squared values for each metric at each point in the season.
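The evaluation loop can be sketched as follows. The data here are random placeholders rather than real team-seasons, so the R-squared values themselves are meaningless; the point is only the shape of the procedure.

```python
# For each cut-off round n, correlate a metric computed over rounds
# 1..n with points per game over rounds n+1..end, via a linear fit.
import numpy as np

rng = np.random.default_rng(0)
n_teams, n_rounds = 20, 38
# toy per-team, per-round points tallies in {0, 1, 3}
points = rng.integers(0, 4, size=(n_teams, n_rounds)).astype(float)
points[points == 2] = 1.0

def r_squared(x, y):
    """R^2 of a one-variable linear fit of y on x."""
    r = np.corrcoef(x, y)[0, 1]
    return r * r

for n in (8, 19, 30):
    past_ppg = points[:, :n].mean(axis=1)     # the 'metric' here: past PPG
    future_ppg = points[:, n:].mean(axis=1)   # the outcome to predict
    print(n, round(r_squared(past_ppg, future_ppg), 3))
```

In the real study, `past_ppg` would be swapped for each of the five metrics in turn, and each row would be one team-season from the 2012/13 and 2013/14 data.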


Points per Game and Goals Ratio

The first graphs show the output of the two historically available parameters: Points per Game and Goal Ratio in predicting future Goal Ratio and future Points per Game.

Metrics for Future Goal Ratio - PPG and GR
Metrics for Future PPG - PPG and GR

This is basically the equivalent of looking at the league table and expecting trends to continue as they do. Not a bad habit, and it does certainly hold valuable information, but it has several disadvantages too. Most notably, the correlation takes a while to pick up, settling down around week 10. Also, beyond that moment, hardly any improvement is made with respect to predicting future performance. A final interesting remark is that Goals Ratio is quickest to pick up information, but Points per Game might just be a touch better in the final stages of the season. This ties in with the statistical intuition that goals are the more frequent occurrence, and therefore pick up signal earlier, but also collect more noise along the way.

Note that the graphs drop off after the halfway point of the season. This does not indicate that the model becomes worse, but rather that there is more variety in the outcome parameter. It’s simply easier to predict points tallies and goal numbers with more matches to play than it is to predict single match outcomes, as is the case near the end of the season.

The slight kick-up at match day 34 reflects the fact that Bundesliga and Eredivisie seasons are 34 matches long and the rest of the leagues in the dataset play 38 match seasons.


Total Shots Ratio

A little under four years ago, a concept called Total Shots Ratio made its way into the (then quite small) world of football analytics. Pioneer James Grayson explored it on his blog, a site that is still a great read to get yourself acquainted with the development of football analytics.

Total Shots Ratio, or TSR, proved a very interesting way to rank teams, without having to resort to direct output like goals scored or points won. Shots attempted do reflect the balance of play, and the metric does recognize under- or over-performing teams.

Metrics for Future Goal Ratio - PPG GR and TSR
Metrics for Future PPG - PPG GR and TSR

Look at that massive boost of knowledge early in the season. It now proved possible to identify the strength of teams as early as after seven or eight match rounds, with an accuracy comparable to what traditional methods could only achieve at their height in mid-season.

TSR, like Goals Ratio, forms an improvement early in the season by picking up signal a lot earlier. After all, shots are roughly 10 to 11 times more frequent than goals. In the end, it turns out that this method collects noise at a faster rate too. Not all shots are equal, and some teams have tactical setups that allow them to consistently perform better or worse than TSR suggests, as shown by the sharp drop in TSR’s performance after match round 25. After match day 28 you’re generally even better off just looking at the league table!


Shots on Target Ratio

In the introductory post linked above, James Grayson declared his preference for using Shots on Target Ratio (SoTR) over TSR, but later on this line of thought got some nuance. Theoretically, SoTR could be a nice method to lose the noise that weakens TSR in later stages of the season, hopefully without losing too much of the early signal that makes the method so powerful.

Metrics for Future Goal Ratio - PPG GR TSR and SoTR
Metrics for Future PPG - PPG GR TSR and SoTR

I’ve always gone with TSR over SoTR because I feared the lack of signal in a smaller sample of shots, and it was increased signal that made TSR such a powerful tool early in the season. I was wrong, it seems. Despite holding roughly one third of the sample of TSR – around 1 in 3 shots is on target – the SoTR metric picks up its signal equally fast and holds it longer. Just like it theoretically should!

At its peak of predictivity, mid-season, SoTR performs notably better than TSR, which should make it the preferred method to treat raw shot counts. As said before, not all shots are equal, and the capacity to get shots on target seems to hold predictive power for future performance. Partly this may be the effect of better teams simply firing more accurately, but it may also contain information about playing in favourable game states. After all, it’s now generally known that teams trailing a match by a single goal see a drop in shooting accuracy, while teams leading by a single goal raise their shot accuracy.


Expected Goals Ratio

Next up in football analytics land was the appearance in 2013 of Expected Goals models. Simply said, each shot is assigned a number between 0 and 1 to reflect the odds of such a shot resulting in a goal. This process is not done subjectively by hand, but objectively, by using large databases of earlier shots and determining the correct odds by regression methods. Expected Goals models do differ slightly from one another, but the mainstay of the input is shot location and shot type. If traits hidden in the Expected Goals methodology reflect a team’s playing style and/or playing quality, this method should form an improvement on raw shot metrics.

Metrics for Future Goal Ratio - All Full Metrics
Metrics for Future PPG - All Full Metrics

The conclusion from these graphs is quite simple actually. Expected Goals Ratio forms an impressive improvement on raw shot metrics at each and every point in the season. It picks up information much like the raw shot metrics do in the very early stages, then predicts future performance significantly better at early to mid-season, and also holds predictive capacities for longer. It makes sense to use Expected Goals Ratio from as early as four matches played. Even that early, it is as good a predictor for future performance as Points per Game and Goals Ratio will ever be.






The metrics were defined as below.

  • Points per Game: points won / matches played
  • Goal Ratio: goals for / sum (goals for + goals against)
  • Total Shots Ratio: shots for / sum (shots for + shots against)
  • Shots on Target Ratio: shots on target for / sum (shots on target for + shots on target against)
  • Expected Goals Ratio: expected goals for / sum (expected goals for + expected goals against)
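These definitions are simple enough to write down directly. A minimal sketch; the season totals in the example are made up:

```python
# The five metric definitions above as straightforward functions.
def ratio(f, a):
    """Generic 'for / (for + against)' ratio used by the Goal, Total
    Shots, Shots on Target and Expected Goals ratios."""
    return f / (f + a)

def points_per_game(points, matches):
    return points / matches

# Example with invented season totals:
print(ratio(55, 35))            # Goal Ratio
print(ratio(600, 400))          # Total Shots Ratio -> 0.6
print(ratio(210, 150))          # Shots on Target Ratio
print(ratio(58.2, 41.1))        # Expected Goals Ratio
print(points_per_game(70, 38))  # Points per Game
```

All four ratio metrics share the same 0-1 scale, with 0.500 meaning parity between what a team produces and what it concedes, which is what makes them directly comparable in the graphs above.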

An updated look at performances in the EPL

During the last international break, we used the lull in club football to look around the top-5 leagues for some early views on performances. It’s the international break again, so why not check back with four more match rounds played, see how our estimations have held up so far, and make some new predictions to go along with it. Here’s the EPL to start with.



I’ll plot both Good-Lucky matrices to allow for a quick comparison: on top, the chart from the previous piece after seven match rounds; below, the updated version with eleven match rounds now played. Generally, over the course of a season we tend to see teams move closer to the red line, indicating average PDOs in relation to performances as measured by Expected Goals (ExpG) ratio. Last time out, after seven matches, five teams had PDOs below 950. Four matches later, only Burnley remains that low. Neat phenomenon, that regression towards the mean, isn’t it?

At the top there is some PDO regression too (Leicester, Hull, Spurs), but some PDOs are moving in the opposite direction. Would that make Southampton fans happy? Well, it’s a mixed bag really. On one hand, this PDO rise will undoubtedly come down over the rest of the season. On the other hand, the points won over the past four matches (four wins, goal tally 12-0) are in the bag and will help propel the team to very firm top-4 candidacy, and even an outside shot at the title. Just don’t count on PDO driven points next season, and enjoy the wave while it lasts!

Good - Lucky Matrix English Premier League 2014-15 14 oktober 2014
Good - Lucky Matrix English Premier League 2014-15 17 november 2014


Please note the top-2 teams in both charts.

Chelsea were early season runners, pairing exceptional performance (ExpG ratio) with very efficient finishing (PDO). Over the past four matches, their ExpG ratio has come down from 0.721 to 0.657. Still good enough to win the league, but no longer that elite to win it without depending on PDO favours a bit.

Meanwhile, Arsenal have moved in the opposite direction from very good to elite (0.730). However, their unkind PDO ride has already cost them a significant amount of points, and even with a return to PDO normality, they probably won’t catch Chelsea with the present twelve points gap. Still, a top-4 spot seems a lock, which may be enough to produce some ‘Arsène managed to turn it around’ or ‘The return of [insert injured player] helped Arsenal back on track’ story somewhere.

Interestingly, Chelsea have not regressed, maintaining their PDO at 1081, and Arsenal have not seen a PDO rise either, staying at 949. Over the course of the season, however, Arsenal will start chipping away points from Chelsea’s 12 points lead. Looking at the end-of-season predictions (see below), we wouldn’t immediately expect the Gunners to overtake the Blues, but with some kind PDO swings, Arsenal may just pull it off and allow magic headlines to appear.


Title race

Chelsea still remain hot favourites for the title, with City only just leading the chasing trio. Where the model previously gave Chelsea an 81% chance to win, their stock has now fallen to 58%. Note that Southampton (enjoy the PDO wave while it lasts!) may just have enough to seriously challenge for the title. A 0.632 ExpG ratio team does need a friendly PDO wave to pull that off, but hey, it’s not impossible at all.

Boxplot projected League positions - winners English Premier League 2014-15 18 november 2014

Relegation land

In relegation land, we had firmly marked Burnley, Leicester and Hull as relegation candidates last time we looked, with all three of those teams well over 60% odds to go down. Four matches later, things are still very poor at Burnley and Leicester, but slightly less frightening for Hull.

Boxplot projected League positions - relegation English Premier League 2014-15 18 november 2014

It’s curious to see how poor the relation is between a low ExpG ratio and relegation odds at the moment. Hull are definitely the worst side in terms of performance as measured by their 0.318 ExpG ratio, but have had a kind PDO wave (1044) that allowed them 11 points already.

The same can be said of Swansea, who’ve enjoyed friendly PDO spells to start the season (1077). That PDO has covered up what has otherwise been a very poor performance (0.386 ExpG ratio). It even looks quite possible that Palace, who are ranked 17th with 9 points despite a 0.452 ExpG ratio, may overtake Swansea (5th at 18 points) with any decent swing of PDO. I’m not saying it will happen, but just realize how easily swings around mid-table are made in football.

I’ll leave you with an updated version of the points prediction. Looks like the most interesting battles will be for the least interesting positions like for 2nd place and for Europa League qualification…

Boxplot projected league table English Premier League 2014-15 18 november 2014


If all these terms are new to you, click here for explanations of Expected Goals and PDO.

Quantifying ‘Gegenpressing’

For a moment, I thought I’d call this post ‘Just how weird a team do we think Bayern is?’. A catchy title helps to draw in a crowd, to which you can then try to sell a smart point about footy analytics. Indeed, we will touch upon Bayern and see just how weird they are, but this post probably still won’t be one for the masses. We’ll dive into two advanced defensive metrics that have been set out in the analytics community recently and combine them to identify different defensive team styles.



Gegenpressing, or counter pressing, refers to what teams do when they don’t have the ball.

One extreme would be to run back to their penalty box and park the bus there, wait for the opponent to arrive and try to limit space, and thereby the amount of damage done. This may result in quite a few shots conceded, but the idea would be to limit the quality of those attempts and thereby limit the odds of conceding.

The other extreme would be to put aggressive pressure on the opposing player in possession of the ball, and try to nick it from him before any decent offensive move can be started. It will result in a lower number of shots conceded, but once the opponent gets into scoring position, most defenders will probably be out of place, and high quality attempts could arise.

To quantify this, we will use two advanced defensive metrics, put out originally by Gerry Gelade and Colin Trainor. Beware of some acronyms now.


Average Defensive Distance – ADD

This metric computes the average distance up the pitch where a team performs its defensive actions. For all tackles (failed and completed), interceptions and fouls we use the distance between event and the goal line of the defending team. The average of all of those is the Average Defensive Distance. Teams that perform their defensive actions high up the pitch (i.e. far away from their own goal), have a high ADD, those that defend primarily close to their own goal have a low ADD.

In his original definition, Gerry only included own-half events, but I’ve not applied this selection. I believe dropping this restriction won’t change the outcome all that much, and it’s probably easier to reproduce when all defensive actions are included.
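Computed from raw event data, ADD is little more than a filter and an average. A sketch under the definition above; the event list and coordinate scale are hypothetical:

```python
# Average Defensive Distance: mean x-distance from the own goal line
# of all tackles, interceptions and fouls.
def average_defensive_distance(events):
    """events: list of (event_type, x), with x measured from the own
    goal line on a 0-100 pitch-length scale."""
    defensive = {"tackle", "interception", "foul"}
    xs = [x for kind, x in events if kind in defensive]
    return sum(xs) / len(xs)

# Hypothetical event stream for one team:
events = [
    ("tackle", 35.0), ("interception", 48.0), ("foul", 52.0),
    ("pass", 60.0),          # ignored: not a defensive action
    ("tackle", 41.0),
]
print(average_defensive_distance(events))  # (35+48+52+41)/4 = 44.0
```

On this scale, a side like Bayern would average around 46.7 and a Pulis-style side around 36.6, matching the extremes quoted below.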

The highest ADD in the dataset for the 2013-14 season (Brazil, Bundesliga, EPL, Eredivisie, La Liga, Ligue 1, Mexico, MLS, Russia, Championship and A-League) is Bayern’s 46.7. This means that Bayern makes its defensive actions just over 17% further away from goal than the average team does. The average ADD is 39.9.

The lowest ADD in the dataset is Crystal Palace’s, who under Tony Pulis had an ADD of just 36.6. On an average pitch of say 100 meters, this means Palace defend 3.3 meters deeper than the average team, and 10.1 meters deeper than Bayern.

Obviously, not all of this metric represents a conscious tactical choice. Poor teams will generally be playing more in their own half, as their superior opponents impose their will on them. Therefore, this metric needs to be interpreted with care, and in the light of team strength. The extremes like Bayern and Palace are easy, but the less extreme ADDs are more difficult to interpret. I tend to think of it more as representing a certain style, and not so much as a performance metric.


Passes allowed Per Defensive Action – PPDA

The second metric we’ll use is Colin Trainor’s Passes allowed per Defensive Action. Again, getting used to the acronym probably takes more time than understanding the metric, as it’s quite straightforward actually.

To compute PPDA we take the number of passes that a team allows (i.e. passes that the opponent attempts) and divide it by the number of defensive actions made. For convenience, we compute this metric over the passes and defensive actions made at least 40 meters from the goal line (OPTA’s 40 coordinate on the x-axis). Colin has explained the reasoning behind this choice very well, so I’ll just refer to his original work here.

Teams that sit back and allow their opponent possession of the ball in their own half and around the halfway line, will note a high PPDA. Lots of opponent passes will be divided by a low number of defensive actions that far from the own goal line. In reverse, teams that aggressively pressure their opponent will note a low PPDA. A low amount of opponent passes will be divided by a high number of defensive actions high up the pitch.
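PPDA itself is then a single division. The pass and action counts below are invented so that the two calls land on the extremes discussed in this piece (6.9 and 16.6):

```python
# Passes allowed Per Defensive Action: opponent passes divided by own
# defensive actions, both counted only in the zone at least 40 units
# from the own goal line.
def ppda(opponent_passes_beyond_40, defensive_actions_beyond_40):
    return opponent_passes_beyond_40 / defensive_actions_beyond_40

# An intense-press side vs. a deep-sitting one (invented counts):
print(ppda(276, 40))   # 6.9  -- Bayern-like intense press
print(ppda(332, 20))   # 16.6 -- low press
```

Note the inversion: unlike ADD, a *low* PPDA means *more* pressure, since many defensive actions shrink the denominator.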

The lowest PPDA (i.e. the highest amount of pressure) in the dataset is again noted by Bayern at 6.9. So, for every 6.9 passes that their opponents make in that zone further than 40 meters from the Bayern goal, Bayern make one defensive action.

The highest PPDA (i.e. the lowest amount of pressure) in the dataset is noted by Mexico’s Atlante at 16.6. Within the top-5 leagues, the highest PPDA was for… Crystal Palace.

I hear you thinking. Doesn’t this mean that ADD and PPDA are essentially the same thing?


Combining ADD and PPDA

Both metrics share common ground, but I’d make the case they are different enough to be valuable. What’s more, they can bring even more insight when combined. ADD tells you where teams performed their defense, PPDA tells you how much defense away from goal they performed. ADD is how high their defensive line was, PPDA is how intense the press was.

In part, both go hand in hand. Generally, teams that play high defensive lines also use an intense press (Bayern), and teams that play deep defensive lines use a low press (Palace). The regression line runs in an inverse direction.

The R-squared is just 0.46 though, so there is significant variation: teams that play high defensive lines with only moderate pressure (Twente), teams that play low defensive lines with high pressure (Cruzeiro), teams that play very deep with moderately low pressure (Queretaro, Montreal), teams that play average defensive lines without defensive pressure (Morelia, Lorient). Oh, and don’t forget to note Bayern stretching the plot in the upper left corner with their absurdly high line and intense press.

There is a lot of work to be done with these metrics. We’ll need to check repeatability (but from face value this should be okay), study teams that get a new manager (to separate player effects from tactical choices), assess potential league effect (cultural differences in defending style), etc.

For now, I’ll leave you with the big plot, where teams further than 1.5 standard deviation from the regression line have been tagged. Click for the full version.

Advanced defensive metrics by team - new layout

Heracles and the failed art of shot blocking

One team in particular has been playing really out of sorts in the Eredivisie so far. Poor little Heracles have noted a 1-0-8 record to open the season with, and therefore occupy the bottom spot of the table with a poor three points from nine played. Still, in analytics terms, something makes them very interesting, and I believe now is the time to share this observation, so that we can follow it over the coming months.

Heracles

If you’re not a die-hard Eredivisie fan, you may not know all that much about Heracles, so let me tell you something about them. They are a genuinely small team with strong local support, but as a well-run business they’ve been a stable Eredivisie side for ten years now. With your classical education background, you’d probably already have noted their cool name, referring to a divine hero in ancient Greek mythology. Our hero was noted for his extreme strength and courage, something that can hardly be more out of place in reference to the performance of Heracles the football team this season.

As a first exploration of that disastrous 1-0-8 season opening, I’d probably look at some shot numbers, expressed per match.

Shots for        11.6
Shots against    11.9
TSR              0.494

Well, that’s weird. Apparently Heracles have a near balance in shots created and conceded, yet noted that 1-0-8 record. Bad luck, or something to do with shot quality?
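As a sanity check, that TSR figure is easy to reproduce; here’s a minimal sketch using the per-match shot numbers quoted above:

```python
# Per-match shot numbers for Heracles, as quoted in the text above.
shots_for = 11.6
shots_against = 11.9

def tsr(shots_for, shots_against):
    """Total Shots Ratio: the share of all shots in a team's matches
    that the team itself takes. 0.500 means perfect shot parity."""
    return shots_for / (shots_for + shots_against)

print(round(tsr(shots_for, shots_against), 3))  # → 0.494
```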

ExpG per shot for        0.100
ExpG per shot against    0.125

Mmm, that’s an ugly picture. Heracles create shots that fly in at a 1 in 10 rate, and concede shots that convert at a 1 in 8 rate. For years they have been fooling TSR with this behaviour, leading to overestimation of their strength in a metric that values each shot equally. This means that simply combining shot numbers and shot quality should provide part of the answer already.

ExpG created     1.16
ExpG conceded    1.48
ExpG-ratio       0.439

So, this metric should do it. But wait: 0.439 isn’t good at all, yet it is still far from in line with that 1-0-8 record. Usually, 0.439 teams record around 1.2 points per game, so something like a 3-2-4 record would be a more fitting reward for their play in terms of ExpG-ratio.
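For reference, the ExpG-ratio follows the same construction as TSR, only with expected goals instead of raw shot counts; a minimal sketch with the numbers above:

```python
expg_created = 1.16    # per match, from the text
expg_conceded = 1.48   # per match, from the text

def expg_ratio(created, conceded):
    """ExpG-ratio: the expected-goals analogue of TSR."""
    return created / (created + conceded)

print(round(expg_ratio(expg_created, expg_conceded), 3))  # → 0.439
```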

The answer, unsurprisingly for readers who remembered the title of this piece, lies in shot blocking.

Shots blocked, offence               27.6%
Shots blocked, defence               10.2%
ExpG of unblocked shots created      0.91
ExpG of unblocked shots conceded     1.36
ExpG-ratio, unblocked                0.401

An average 2014-15 Eredivisie team blocks around 19.2% of the shots it faces. Heracles’ offence sees around 50% more of its shots being blocked by opposing defenders, while Heracles’ defence blocks shots at roughly half the rate their rival teams do. Their unblocked ExpG-ratio is a poor 0.401, which, combined with some tough luck, goes a long way towards explaining the horrendous season so far.
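Those relative block rates and the unblocked ExpG-ratio can be verified the same way (the 19.2% league average and the team numbers are taken from the text; the “roughly 50%” figures are approximations):

```python
league_block_rate = 0.192  # average 2014-15 Eredivisie block rate (from the text)
off_block = 0.276          # share of Heracles' own shots that get blocked
def_block = 0.102          # share of opponents' shots that Heracles block

print(round(off_block / league_block_rate, 2))  # → 1.44: ~44% above league average
print(round(def_block / league_block_rate, 2))  # → 0.53: roughly half the league rate

# ExpG-ratio on unblocked shots only (per-match numbers from the text)
unblocked_ratio = 0.91 / (0.91 + 1.36)
print(round(unblocked_ratio, 3))  # → 0.401
```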

Make of it what you will. Heracles might be very poorly organised from a tactical standpoint, producing below-average quality shots that are 50% more likely to get blocked, while conceding above-average quality shots and blocking them at roughly half the normal rate. Those are some painful numbers that illustrate aspects TSR won’t grasp.

But it’s a bit more nuanced than that. Using TSR to explain what has happened is always going to lose to measures like ExpG-R that take in more variables to correct for relevant details of performance, like shot quality, and in the case of unblocked ExpG-R also block rates. But do those aspects carry over from historic data to future performance, or are they mere variations that tend to even themselves out over time?

For comparison of the repeatability of TSR and ExpG-R, I’d refer you to earlier work on this site.

For the art of shot blocking, I haven’t shown data before. Here’s a simple scatter plot of the percentage of blocked shots that teams have noted in two consecutive seasons. The dataset covers the EPL, La Liga, Bundesliga, Serie A, Ligue 1, MLS and Brazil, for 2012-13 and 2013-14.

Block rate - defensive - all shots / Block rate - offensive - all shots

A few things are of note there.

  1. The relation between the rate of shots blocked in consecutive seasons is not particularly strong, so most of it is probably variance.
  2. Teams don’t note a shot blocking rate below 16%. This means Heracles are either going to set a world record for poor shot blocking, or they’ll be picked up by regression soon enough.
  3. The defensive aspect of shot blocking is a tiny bit more repeatable than the offensive side, i.e. avoiding your shot getting blocked.
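The repeatability tests behind those scatter plots boil down to correlating a team metric across consecutive seasons; a minimal sketch (the team values here are made up for illustration):

```python
import numpy as np

def repeatability(season1, season2):
    """R-squared between a team metric in two consecutive seasons.
    Values near 1 suggest a persistent skill; values near 0 suggest
    the metric is mostly variance."""
    r = np.corrcoef(season1, season2)[0, 1]
    return r ** 2

# Made-up block rates for six teams in two consecutive seasons:
s1 = np.array([0.18, 0.21, 0.19, 0.24, 0.17, 0.20])
s2 = np.array([0.19, 0.20, 0.21, 0.22, 0.18, 0.21])
print(round(repeatability(s1, s2), 2))
```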

The problem with this raw analysis of blocks is that not all shots are blocked at the same rate. Here are the block rates for different types of play in the Eredivisie (2013-14 and 2014-15 data).

Direct FK             24.7%
Open play shots       23.5%
First time attempts   22.6%
Rebounds              22.3%
Indirect FK           18.9%
Corners               17.6%
Open play headers      6.5%

The distribution of shots from various situations may be different from team to team, thereby producing bias in the block rates. Some teams may concede more headers than others, which would make them look like poor shot blockers, since headers are rarely blocked.

To cut a long story short, of all different types of play mentioned above, only open play shots show any degree of repeatability in terms of the block rate. For all other types of play the correlation from season to season is virtually non-existent.

Block rate - defensive - open play shots only / Block rate - offensive - open play shots only

Interestingly, the R2 values decrease sharply when running the repeatability test per type of play, leaving only open play shots to show any degree of repeatability, however small.

The fact that the overall analysis shows a stronger correlation than the subgroup analyses suggests that a large part of the apparent repeatability of overall block rates is in fact bias, introduced by teams showing different distributions of shot types, rather than genuine differences in block rate.
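That bias is easy to demonstrate in a toy simulation: give teams a persistent mix of headers versus open play shots and zero actual blocking skill, and the overall block rate still “repeats” from season to season. The two block probabilities are the Eredivisie figures from the table above; everything else here is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams, shots = 200, 600

# Each team has a stable shot-type mix, but NO block-rate skill at all:
header_share = rng.uniform(0.1, 0.4, n_teams)  # persistent tactical trait
p_block_header, p_block_open = 0.065, 0.235    # league rates from the table above

def simulate_season(header_share):
    headers = (shots * header_share).astype(int)
    open_play = shots - headers
    blocked = (rng.binomial(headers, p_block_header)
               + rng.binomial(open_play, p_block_open))
    return blocked / shots  # overall block rate per team

s1, s2 = simulate_season(header_share), simulate_season(header_share)
r2 = np.corrcoef(s1, s2)[0, 1] ** 2
print(round(r2, 2))  # clearly above zero, purely through shot-mix bias
```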


In the end

What can we, and our poor Heracles, take from this in-depth analysis of block rates? Simply put, they should be ignored when trying to predict future performances. The repeatability of block rates is very poor in general, and even poorer in the subgroup analysis for different types of play. If your team suffers from very poor block rates, like Heracles, chances are this will regress. Whether this means that block rates represent luck that evens itself out, or managers identifying issues and fixing them rather efficiently, is impossible to tell at this moment.

For Heracles, this brings a touch of optimism, as their horrible block rates, both offensively and defensively, are expected to regress over the season. This should bring their actual outcome closer to their ExpG-ratio of 0.439, for a 1.2 points-per-game pace. This still means they are expected to add just 30 points (25 games to be played * 1.2 PPG) to their current total of 3 points. Relegation territory, but they should be within touch of the pack, rather than trailing miserably as they do now. We will be watching closely!
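The projection in that last paragraph is simple arithmetic, shown here as a quick check (all inputs are the figures quoted in the text):

```python
current_points = 3   # after nine matches, from the text
games_left = 25
ppg = 1.2            # typical pace for a 0.439 ExpG-ratio team (from the text)

projected_total = current_points + games_left * ppg
print(projected_total)  # → 33.0
```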

An early look at performances in La Liga

For the fourth and final part of our miniseries, attention shifts to Spain. Will anyone be able to put up resistance to the picture perfect season opening by Barcelona? Is Valencia really this season’s surprise package? And will Sociedad recover from their disastrous opening?



Using the recently explained Good-Lucky matrix, in a format adopted from Benjamin Pugsley, we can easily scan the league for the best performing teams (horizontal axis) and the most efficient teams (vertical axis). Anyone into football analytics will know that being highly efficient lasts only so long, and PDO levels tend to revert back to normal before you know it. Depending on team quality, normal is a PDO of 980-ish for poor teams and 1020-ish for good teams.

Good - Lucky Matrix La Liga 2014-15 16 oktober 2014

Yes, it’s Barcelona, a bit of nothingness, more nothing, and then the rest. An out-of-this-world ExpG-ratio of 0.799, combined with an extreme PDO wave of over 1175, resulted in six wins and a draw, no goals conceded, and La Liga’s title all but clinched. The PDO will regress and points will be dropped, but hey, no one really looks like catching Barcelona here.

A close bunch of five teams competes for the honours of second place, as it seems. The expected names of Real, Sevilla and Atletico are there, but Celta seem to do very well in just their second season after promotion, as well as those other boys from Barcelona, Espanyol.

In analytics terms, Valencia are an outlier of note. Their PDO has been even higher than Barcelona’s, riding them to 2nd place in the table. Point is, an ExpG-ratio of 0.493 is not going to take them far once this PDO wave runs out of steam. Obviously, the 17 points from seven matches will boost their final standing, but one wouldn’t really expect them to threaten the top three.

Orange is trouble in the Good-Lucky matrix, so Granada, Córdoba and Levante catch the negative light here. Bilbao will improve once their PDO pulls towards the red line, but for now they sit way below mid-table.


Points per Game

If your performance is as elite as Barcelona’s, you won’t drop many points. Their 0.799 ExpG-ratio simply means they are on average four times more likely to score than their opponents. Hard to see them losing more than a handful of points over the season then.
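A quick check on that “four times” claim: an ExpG-ratio r means the team accounts for a share r of all expected goals in its matches, so the for/against odds are r / (1 − r):

```python
expg_ratio = 0.799  # Barcelona, from the text

odds = expg_ratio / (1 - expg_ratio)
print(round(odds, 1))  # → 4.0: roughly four expected goals for every one against
```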

As expected, Valencia are flying high, but don’t have the performance levels to back it up; the same goes for Granada, though at another level. Espanyol will pull up over the coming weeks, as will Sociedad, and Deportivo to an extent.

ExpG-r vs PPG by team La Liga 2014-15 16 oktober 2014


Here’s the ‘sticking my neck out’ part of this mini-series. Using ExpG as a basis, a pretty straightforward model can simulate the remaining part of the season and come to predictions for the final league table. I figured it would be more fun sharing these from time to time, for various leagues, and see what we can learn along the way towards the end of the season.

For this model I’ve limited ExpG to 11v11 or 10v10 situations, filtered out blocked shots (since shot blocking is a skill), filtered out penalties (since they are distributed pretty randomly and skew the numbers a fair bit) and filtered out rebounds. Furthermore, I’ve regressed the ExpG towards last season’s numbers, based on the R2 between ExpG on each particular match day and ExpG at the end of the season.
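The model itself isn’t published here, but the core of such a simulation is straightforward: treat each team’s ExpG in a fixture as the mean of a Poisson goal distribution, draw many simulated scorelines, and tally the points. A sketch of that idea only; the Poisson assumption and the example ExpG values are mine, not necessarily what the actual model uses.

```python
import numpy as np

rng = np.random.default_rng(42)

def expected_points(expg_for, expg_against, n_sims=100_000):
    """Monte Carlo the points a team takes from one fixture, assuming
    goals are Poisson-distributed around each side's ExpG."""
    goals_for = rng.poisson(expg_for, n_sims)
    goals_against = rng.poisson(expg_against, n_sims)
    points = np.where(goals_for > goals_against, 3,
                      np.where(goals_for == goals_against, 1, 0))
    return points.mean()

# A lopsided hypothetical fixture: 2.0 ExpG versus 0.5 ExpG
print(round(expected_points(2.0, 0.5), 2))
```

Repeating this over every remaining fixture and summing the draws per team produces the spread of projected points shown in the box plot below.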

Without further ado, here’s the graph of predicted points, along with a box plot showing the spread and most likely number of points for each particular team. Enjoy!

Boxplot projected league table La Liga 2014-15 16 oktober 2014