Tag Archives: Soccermetrics

Where do the best shots come from?

Some people may say that odd-year summers are a dull moment for soccer fans, but I tend to disagree. Instead of getting us sucked into the maelstrom of actuality, this period allows us to take a step back and see the bigger picture of football matches. Over the course of the summer, this is what 11tegen11 will do. More depth, less actuality! After that, activity will resume as normal, strengthened by our newfound knowledge…

To kick off the summer period, this piece will look at shots and identify four different strike zones on the pitch, where vastly different types of shots are being produced. This may not be all that shocking, as most of it connects with common sense of watching football games, but at the very least, it will provide a nice base for future explorations.

Basically, we could take two different approaches to identify strike zones. Either top-down, by adhering to known areas on the pitch, like the penalty box, the 6-yard box, etc. This obviously has the advantage of easy communication. But the disadvantage would be that the discriminatory power would be less, as shot quality may differ quite a bit within these classical zones.

The alternative, and the method I will use, is a bottom-up approach: use a fine grid to identify where on the pitch the biggest drop-off in shot quality occurs, and then work out the best zones from there.

Here’s an Excel-produced half football pitch, split into small grids, with the conversion being printed in each grid and color marks indicating zones of high and low conversion. Conversion, in this case, is goals per shot and data covers the past two Eredivisie seasons. It’s not intended to be fully readable, but the basic point is that it allows the identification of four different strike zones.
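For readers who would rather script this than build it in Excel, here is a minimal sketch of the grid idea. It assumes each shot is stored as an (x, y, is_goal) tuple with coordinates in metres; that layout, the cell size and the example shots are illustrative assumptions, not the actual data behind the picture.

```python
from collections import defaultdict

def conversion_grid(shots, cell_size=2.0):
    """Bin shots into square grid cells and compute goals per shot per cell.

    `shots` is an iterable of (x, y, is_goal) tuples. The coordinate
    convention (metres from one corner of the half pitch) is an assumption.
    Returns {(col, row): (goals, shots, conversion)}.
    """
    goals = defaultdict(int)
    totals = defaultdict(int)
    for x, y, is_goal in shots:
        cell = (int(x // cell_size), int(y // cell_size))
        totals[cell] += 1
        if is_goal:
            goals[cell] += 1
    return {cell: (goals[cell], n, goals[cell] / n) for cell, n in totals.items()}

# Hypothetical shots, for illustration only
example_shots = [
    (1.0, 0.5, True),
    (1.5, 1.0, True),
    (1.2, 1.9, False),
    (15.0, 3.0, False),
    (15.5, 2.5, True),
    (30.0, 10.0, False),
]
grid = conversion_grid(example_shots, cell_size=2.0)
```

Zones can then be drawn around neighbouring cells with similar conversion. With real data, cells holding only a handful of shots will be noisy, so the cell size is a trade-off between resolution and sample size.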

Conversion pitch

The continuous line marks the borders of the pitch, and for reference, the penalty box. The dotted lines mark zones, where you can see a clear demarcation in shot conversion. We will use these boxes to identify four different strike zones, each with unique characteristics, as we’ll find out later.

Unsurprisingly, right in front of the goal, the large majority of shots are converted into goals. I will therefore refer to this zone as Zone 1. The next most threatening part, Zone 2, covers mostly the central penalty box area, but stretches just beyond the edge of the penalty box. Then comes Zone 3, covering the wider penalty box area as well as longer distance central shots. The rest of the pitch will be Zone 4.

Pitch zones

This table shows the mean conversion rate per zone. Each zone has a distinctly different conversion rate, dropping by roughly a factor of three from one zone to the next. Shots from Zone 1 are rare, but due to their extremely high conversion they deserve their own group. All other zones contain a reasonably high number of shots, indicating a good distribution of shots among these zones.

Zone  Mean conversion rate  Shots
1     0.805                 123
2     0.222                 5435
3     0.082                 4219
4     0.034                 5452
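As a quick check on the table, the sketch below recomputes the step from zone to zone and the overall conversion across all zones. The figures are taken from the table; the variable names are just for illustration.

```python
# (mean conversion rate, shots) per zone, as listed in the table
zones = {1: (0.805, 123), 2: (0.222, 5435), 3: (0.082, 4219), 4: (0.034, 5452)}

total_shots = sum(n for _, n in zones.values())
expected_goals = sum(rate * n for rate, n in zones.values())

# Shot-weighted conversion over the whole pitch
overall_conversion = expected_goals / total_shots

# How much conversion drops when moving one zone further out
ratios = {(z, z + 1): zones[z][0] / zones[z + 1][0] for z in (1, 2, 3)}

for pair, ratio in ratios.items():
    print(pair, round(ratio, 2))
print("overall conversion:", round(overall_conversion, 3))
```

The drop is steepest between Zone 1 and Zone 2 (about 3.6) and somewhat smaller further out (about 2.7 and 2.4), and the shot-weighted overall conversion comes out at roughly 12%.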

These strike zones will prove a useful reference point for assessing differences between teams in terms of strategy and differences between players in terms of quality, and will help us explore our beautiful game in more depth.

What if we played the Eredivisie a million times?

Which team is currently the strongest of the Eredivisie? Put this question to a fan, a manager, a pundit and an analytic football blogger and you will probably get four very different answers…

The fan would probably not understand why you’re even asking, particularly at this time of the year, with the Eredivisie having just finished. Obviously, it would be Ajax, duh. They just won the league for the third time in a row, didn’t they?

The manager would maybe find a bit more nuance and say that this season Ajax was just a bit stronger than PSV, particularly near the season end, and Ajax’ 3-2 win in Eindhoven put them in the driver’s seat towards the title.

The pundit would push Ajax forward as the strongest team, and immediately start feeding you correlations disguised as explanations as to why Ajax is stronger than the other teams. Ajax is building a ‘successful revolution’, the team plays a ‘recognizable style of football’ and they allow home-grown talent a decent shot in their first team.

The analytics blogger would probably take the most time to answer this question, and to be fair, if you asked me, I would not be quite so sure whether to pick Ajax or PSV.

 

Limitations

We’ll now look into the important question of determining the best team in the league, based on a few assumptions.

Firstly, playing 34 football matches is a poor way to determine which team has the most quality, as was demonstrated quite elegantly by @penaltyblog.

Secondly, luck has more influence in football than it’s generally credited for. If skill were the prime factor involved, wouldn’t betting companies be capable of calling the correct winners of football matches in more than the 50% to 55% of cases that they do now?

 

Experiment

Imagine this thought experiment…

It’s August 10, 2012 and the 2012/13 Eredivisie is about to start in Tilburg, where newly promoted Willem II is about to host NAC Breda. Just prior to the kick-off, we press an imaginary ‘save’ button and quickly fast-forward to May 12, 2013, the final Match Day.

Here we reload our ‘save game’ from August 10, 2012 and we run the same season again. And again, and again and again… A million times.

Using bookmaker odds as an estimate for team strength, we can do just that. After all, bookmaker odds can’t be too far off, otherwise we would have an easy opportunity to make some cheap money exploiting them. And if we all did so, the odds would get corrected quickly. Admittedly, the odds are never perfect and there is quite some evidence that strong home teams are overestimated, but only slightly so.
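To make the idea concrete, here is a minimal Python sketch of such a simulation. It is not the R code used for this post. The three-team league, the team names and the decimal odds are all hypothetical, the bookmaker margin is removed by simple normalisation (one of several possible methods), and ties on points are broken at random rather than by goal difference.

```python
import random

def implied_probabilities(home_odds, draw_odds, away_odds):
    """Turn decimal odds into home/draw/away probabilities that sum to 1."""
    raw = [1 / home_odds, 1 / draw_odds, 1 / away_odds]
    total = sum(raw)  # exceeds 1 because of the bookmaker margin
    return [p / total for p in raw]

def simulate_season(fixtures, rng):
    """Play every fixture once and return the team with the most points."""
    points = {team: 0 for match in fixtures for team in match}
    for (home, away), probs in fixtures.items():
        outcome = rng.choices(("H", "D", "A"), weights=probs)[0]
        if outcome == "H":
            points[home] += 3
        elif outcome == "A":
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    best = max(points.values())
    leaders = sorted(team for team, pts in points.items() if pts == best)
    return rng.choice(leaders)  # simplification: random tie-break

# Hypothetical odds (home win, draw, away win) for a double round robin
fixture_odds = {
    ("Alpha", "Beta"): (1.80, 3.60, 4.50),
    ("Alpha", "Gamma"): (1.40, 4.50, 8.00),
    ("Beta", "Alpha"): (3.00, 3.40, 2.30),
    ("Beta", "Gamma"): (1.70, 3.70, 5.00),
    ("Gamma", "Alpha"): (5.50, 4.00, 1.55),
    ("Gamma", "Beta"): (3.20, 3.40, 2.20),
}
fixtures = {match: implied_probabilities(*odds) for match, odds in fixture_odds.items()}

rng = random.Random(11)  # fixed seed so the run is repeatable
n_seasons = 10_000
titles = {team: 0 for match in fixtures for team in match}
for _ in range(n_seasons):
    titles[simulate_season(fixtures, rng)] += 1
print(titles)
```

Each match is drawn independently from its own odds, so the title counts reflect only what the bookmakers implied before kick-off, not anything that happened on the pitch.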

 

The winners

Using some recently acquired skills in R we can run a simulation of one million seasons in just under four minutes, so let’s go straight to the winners…

Team        League wins
PSV         553512
Ajax        355104
Twente      53255
Feyenoord   29889
AZ          4805
Vitesse     3105
Utrecht     121
Heerenveen  115
Heracles    39
N.E.C.      30
Groningen   11
Roda        7
ADO         3
RKC         2
NAC         1
PEC Zwolle  1
VVV         0
Willem II   0

 

The 2012/13 Eredivisie, as we can see, was more of a two-horse race than it was generally taken to be. On the other hand, the simulation also shows that only two teams, the now relegated VVV and Willem II, did not manage to win a single simulation out of a million runs. All other teams had at least a one in a million shot at the Eredivisie title, if we use bookie odds as an indicator for their chance of winning matches.

 

Process versus results

This should be an important piece of information for people in charge of professional football clubs. The old adage has always been that ‘the winning coach is always right’, and going by the recent praise of Frank de Boer as an up-and-coming manager, that adage is presently as strong as ever. Results dominate our image.

In fact, it’s the underlying process that should get our attention. Give me a manager that substantially increases my team’s chances of winning the league, rather than a manager that wins the league without an underlying improvement of performance. Professional football organizations that appoint managers on the basis of results only, run a serious risk of dumping a manager who was developing the underlying process well and trading him for a manager that has experienced just a bit more luck.

The separation of process and results will be an important goal at 11tegen11 during the upcoming season, and using simulations will be a crucial tool to do so.

 

 

Some of you may have noticed a decrease in the frequency of new articles on 11tegen11. I can assure you that this is a temporary thing and activity on 11tegen11 will pick up over the summer. It is in part due to the start of Dutch Volkskrant blog ‘De Zestien’, where you are more than welcome to read articles by @SimonGleave, @Tijsrokers, @MichieldeHoog and myself. I hear Google translate does a decent, and on occasion surprisingly entertaining job in making the Dutch articles accessible to foreign readers.

Ajax is better than PSV in the Game State that matters most

Football analytics is a young business. And as such, it is still a rapidly developing field, where new concepts are launched all around. Some of these concepts are there to stay, others disappear as quickly as they came. For me, Game States definitely belong to that first group.

With Game State we indicate the score differential of the match in-play. Each match opens with both teams at GS 0, and a scoring team moves to GS +1, with the conceding team moving to GS -1. This Game State obviously has a big influence on how teams approach the game at hand. However, traditional football analytics (if I may call it that in such a young business) groups all match events together, regardless of Game State.
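A minimal sketch of how each event can be tagged with its Game State follows. The event format, chronological (minute, team, kind) tuples, and the example match are assumptions for illustration, not the format of the data used here.

```python
def game_states(events):
    """Yield (minute, team, kind, game_state) for each event.

    `events` is a chronological list of (minute, team, kind) tuples with
    team in {"home", "away"} and kind in {"shot", "goal"}. A goal counts
    as a shot taken at the Game State that held just before it went in.
    """
    score = {"home": 0, "away": 0}
    for minute, team, kind in events:
        opponent = "away" if team == "home" else "home"
        gs = score[team] - score[opponent]  # from the shooting team's view
        yield minute, team, kind, gs
        if kind == "goal":
            score[team] += 1

# Hypothetical match, for illustration only
match = [
    (5, "home", "shot"),
    (12, "away", "goal"),
    (30, "home", "shot"),
    (44, "home", "goal"),
    (60, "away", "shot"),
    (75, "home", "goal"),
    (88, "away", "shot"),
]
tagged = list(game_states(match))
for row in tagged:
    print(row)
```

Once every shot carries a Game State, any of the usual metrics can be computed per state instead of over the whole match.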

The best concepts in football analytics make rational sense as well as intuitive sense. And such is the case with Game States. A team chasing a narrow deficit is a different team than a team defending a narrow lead. Obviously, better teams spend more time defending leads than chasing them, but even within teams, the shifts that occur when Game States change are fairly homogeneous. We’ve learned before that moving from GS 0 to GS +1 brings an average team a 10% decrease in Total Shot Rate, while the opponent increases 10% simply because of the shift in Game State.

On this day before the big game, PSV – Ajax, we look at the two best teams of the Eredivisie with a focus on their performance levels at the most crucial Game State: GS 0. The main reason for doing so, and I can safely say this out loud now, is that I have my doubts about the accuracy of the Total Shot Rate model used to predict the final Eredivisie standing. It has significantly overestimated PSV and underestimated Ajax.

The model uses the Relative Shot Rate (RSR) to estimate the total points at the end of the season. RSR is a variation on the Total Shot Rate (TSR). Early in the season, the RSR has advantages over TSR, because teams have encountered a different strength of opposition, but by now those advantages have gone and RSR is nearly equal to TSR. At the moment, PSV’s TSR stands at 0.671 with Ajax at 0.632. Now what does this figure tell?

PSV has a higher ratio of chances created and conceded. Does this single figure make PSV the better team? No, because you may generate all the chances you want, you’ll need conversion as a skill to turn shots into goals.

PSV’s shooting percentage stands at 17.0%, which compares favorably to Ajax’ 15.4%. Does a higher TSR in combination with a higher shooting percentage make PSV the better team? No, because you can score all you want, you’ll need to prevent the opponent from scoring from their shots too, and this is where saving percentage comes into play.

PSV’s saves percentage is 87.4%, compared to Ajax’ 89.5%. But wait, that’s about the same difference as we found at shooting percentage, only this time Ajax comes out on top. That’s true, and so both teams have a comparable PDO, which is the sum of shooting percentage and saves percentage, multiplied by 1000. PSV’s PDO is 1044, and Ajax’ PDO is 1048.

Performance metrics at all Game States
Team  TSR    Sh%    Sv%    PDO
Ajax  0.632  0.154  0.895  1048
PSV   0.671  0.170  0.874  1044
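The PDO column can be reproduced from the two percentages next to it. The sketch below uses the rounded figures from the table; the function name is just for illustration.

```python
def pdo(shooting_pct, saves_pct):
    """PDO as described in the text: (Sh% + Sv%) * 1000, as a whole number."""
    return round((shooting_pct + saves_pct) * 1000)

# (Sh%, Sv%) as printed in the table
teams = {"Ajax": (0.154, 0.895), "PSV": (0.170, 0.874)}
pdo_values = {name: pdo(sh, sv) for name, (sh, sv) in teams.items()}
print(pdo_values)
```

From the rounded inputs Ajax comes out one point higher than the 1048 in the table, presumably because the published percentages are themselves rounded. PSV matches at 1044.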

So, if we wrap these numbers up we can safely say that PSV generates a higher ratio of shots. Taking shooting and saving into account, both teams are roughly equally efficient. Now why doesn’t PSV live up to the expectations of our TSR model?

The answer is to be found in game states. We can repeat the exact same exercise of looking at shot rate, shooting percentage and saves percentage for each game state. I won’t go over every single number, but instead focus on the most crucial Game State: GS 0. The average Eredivisie team plays out nearly 50% of shots at this Game State, but since Ajax and PSV are the two top teams, they can be expected to play out fewer shots at GS 0. Of all shots in Ajax’ matches, 41.9% take place at GS 0. For PSV this number is 35.4%.

Here’s the table for PSV and Ajax in terms of TSR, shooting and saving efficiency, and PDO at GS 0. Note that PDO in this case provides a nice summary of efficiency, wrapping up both offensive (shooting) and defensive (saving) skills.

Performance metrics at GS 0
Team  TSR    Sh%    Sv%    PDO
Ajax  0.668  0.128  0.924  1052
PSV   0.604  0.151  0.867  1018

The TSR tells us that at the most crucial Game State (GS 0), Ajax is by a distance the better team in terms of shot creation. PSV partially makes up for the lower TSR with their shooting percentage of 15.1%, which is higher than Ajax’ 12.8%. However, PSV loses this advantage in saves percentage, because their 86.7% is much lower than Ajax’ 92.4%. The combined efficiency is higher at Ajax, indicated by a PDO at GS 0 of 1052, compared to 1018 for PSV.

So, analyzing all shots in every match in one group, PSV seems the better team.

But at GS 0, the most crucial stage of the match, Ajax creates a better shot ratio, and is more effective. They gain more leads, which is a good thing in itself, but it also allows them to play more time at favorable game states, leading to an even better performance.

 

This post is a translation of yesterday’s article for ‘De Zestien’, the football blog of Dutch national newspaper ‘De Volkskrant’. Admittedly, it turned into a rewrite, more than a translation.

Game states and conversion

Sometimes the easy questions can be the hardest ones to answer correctly. This is true in statistics, and since we apply numbers to football, this is true in football analytics as well. Take the never ending debate around shot conversion. Why are better teams able to convert a higher percentage of their shots into goals?

Providing an answer is not the hard part here. Providing a correct answer is.

Let’s try this often-heard answer. It’s simple. Better teams have the better players. Better players hit more difficult shots, leading to more goals per shot.

Game states

The problem with this answer is not that it isn’t correct. Because it is. Better teams have better players, and these players turn more shots into goals than weaker players.

The problem with this answer is that it stops most people from looking beyond it and considering other factors that come into play here. And you’ve probably guessed from the title of this article already that it’s game states and conversion that I would like to link today. It turns out that game states may well explain more of the variation in shot conversion between better and weaker teams than player quality ever will.

Two weeks ago I wrote about Total Shot Rate (TSR) and Game States. Let’s recall the graph that was central in that piece.

Let’s repeat this exercise for shot conversion. So, here’s the same graph, linking shot conversion and game state. Please note that this graph contains all shots from all Eredivisie teams in the present season until match day 27. This time I’ve concentrated on GS -2 to GS +2, to prevent the low numbers at more extreme Game States from disturbing the picture. The shot numbers at different Game States are 371 shots at GS -2, 1160 shots at GS -1, 2885 shots at GS 0, 922 shots at GS +1 and 363 shots at GS +2.

It turns out that, like TSR, shot conversion is also related to Game State. TSR had a complicated shape, with an inverse correlation at close GS, but shot conversion is a lot easier to digest.

In general, the more favorable the Game State, the better shot conversion is. The only exception is at GS 0, where shot conversion is lower than at GS -1 and GS +1. Overall shot conversion for the league is 11.7%. Shot conversion at favorable Game States is significantly better, with GS +1 at 14.1% and GS +2  at 17.1%. The most interesting observation is that shot conversion at GS 0 (10.4%) is lower than at GS -1 (11.6%).

Things become more interesting when we combine the conclusions from TSR and shot conversion at different game states.

 

GS 0

At GS 0, both teams are by definition balanced in terms of TSR, as both teams are at the same game state and each team’s shot created is a shot conceded by the other team. In terms of conversion, this is not a fruitful game state. This may well be due to the fact that teams are inclined to be more cautious at this score, since they have a point to lose, particularly near the end of games. A further explanation may be that this Game State, by definition, occurs more in the opening stages of matches, and teams may well be more conservative at the start of a match than they are at the end.

 

GS +1

At GS +1, there is an interesting trade off. The TSR declines over 10% to 0.443, while the shot conversion rises to 14.1%. The tricky situation with TSR is that it works two ways. The leading team creates 44.3% of shots, but it concedes 55.7% at this game state. So, overall, the TSR of a chasing team is 26% higher (0.557/0.443) than the TSR of a team defending a single goal lead. Despite the fact that the conversion at GS -1 is better than at GS 0, teams at GS +1 convert 22% (0.141/0.116) better than teams at GS -1.

If we combine the shift in TSR and shot conversion for teams at GS -1 and GS +1, we find that a team at GS +1 pays 26% of TSR to gain 22% in shot conversion.

So, generally speaking, teams have a slightly worse chance of scoring when leading by a single goal than when chasing a single goal.
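The trade-off can be checked by multiplying each side’s share of the shots by its conversion rate. The sketch below uses only the figures quoted above; the variable names are just for illustration.

```python
tsr_leading = 0.443              # share of shots taken by the team at GS +1
tsr_trailing = 1 - tsr_leading   # share taken by the team at GS -1
conv_leading = 0.141             # goals per shot at GS +1
conv_trailing = 0.116            # goals per shot at GS -1

shot_ratio = tsr_trailing / tsr_leading          # chasing team's shot advantage
conversion_ratio = conv_leading / conv_trailing  # leading team's conversion advantage

# Expected goals per shot taken in the match, split by side
goal_rate_leading = tsr_leading * conv_leading
goal_rate_trailing = tsr_trailing * conv_trailing

print(round(shot_ratio, 3), round(conversion_ratio, 3))
print(round(goal_rate_leading, 4), round(goal_rate_trailing, 4))
```

Per shot taken in the match, the leading side is good for about 0.062 goals and the chasing side for about 0.065, so the chasing team’s edge is real but small.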

 

GS +2

At GS +2, there is a whole new world. Teams at GS +2 have restored their TSR to 0.495, while their conversion rises further to 17.1%. Teams at GS -2 fall back in terms of TSR (to 0.505), while their conversion drops to 8.6%. Overall, both teams create a roughly equal amount of chances, but the team leading by two goals converts nearly twice as much.

 

In the end

Let’s turn to the opening question once more. Why are better teams able to convert a higher percentage of their shots into goals? They take a much higher proportion of their shots at favorable game states.

Who are the conversion kings of the Eredivisie? Right, PSV at 16.9%. They took 20.6% of their shots at GS +2 or higher, compared to 8.3% on average for the other Eredivisie teams…
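To illustrate how much the mix of Game States alone can move a team’s overall conversion, the sketch below gives two teams the same league-average conversion at every Game State (the figures quoted earlier for GS -2 to GS +2) and only varies where they take their shots. Both shot distributions are hypothetical and chosen for illustration; they are not the numbers of any actual team.

```python
# League conversion per Game State, as quoted in this post
conversion_by_gs = {-2: 0.086, -1: 0.116, 0: 0.104, 1: 0.141, 2: 0.171}

def overall_conversion(shot_mix):
    """Shot-weighted conversion for a given distribution of shots over Game States."""
    assert abs(sum(shot_mix.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(share * conversion_by_gs[gs] for gs, share in shot_mix.items())

# Hypothetical shot distributions (share of a team's shots per Game State)
often_leading = {-2: 0.02, -1: 0.10, 0: 0.40, 1: 0.25, 2: 0.23}
often_trailing = {-2: 0.20, -1: 0.30, 0: 0.40, 1: 0.08, 2: 0.02}

conv_leading_team = overall_conversion(often_leading)
conv_trailing_team = overall_conversion(often_trailing)
print(round(conv_leading_team, 4), round(conv_trailing_team, 4))
```

With identical finishing at every Game State, the team that is usually ahead ends up at roughly 13% and the team that is usually behind at roughly 11%, purely because of when their shots are taken.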

The next step in football analytics: Game States

One of the most appreciated posts last year on 11tegen11 did not contain any numbers, nor did it contain any tactical analysis. It did contain a picture of a flying pig (really…) to help make the point that analysis without context is pointless, or at least dangerous.

Or, quoted from Nate Silver’s inspiring book ‘The Signal and the Noise’, “a failure to think carefully about causality will lead us up blind alleys”.

 

Football analytics

Too many people think that football analytics revolves around fancy individual player analysis, multivariate scouting models or complex GPS tracking data of on field events. While it may be true that these ‘holy grails’ get most attention, it’s the simplest of questions that yet remain unanswered. With this tendency to run before we can walk, we run the risk of falling and hurting ourselves over and over again.

Think about this very simple question: why do better teams win football matches?

Obviously because they score more goals than weaker teams.

This boils down to two factors involved in goal scoring: creating shots and converting shots. And, of course, to the defensive equivalent: preventing shots and saving shots. This post will focus on shot creation, and a quick follow-up post will take on shot conversion later.

 

Total Shot Rate

In order to assess shot creation and prevention in a single number, we’ve become familiar with the concept of shot rates: Total Shot Rate (TSR) and Shots on Target Rate (SOTR). I’ve made no secret of my preference for TSR over SOTR, simply because it has three times more shots to work with and variation in offensive and defensive shooting accuracy between teams is mainly noise. For this analysis, however, I will also include SOTR to serve the audience of people who still believe certain teams to be substantially better at hitting the target than others.

Let’s look at shot creation and prevention by way of TSR for different game states (GS). By GS, I mean the score differential at the moment the shot is taken. Each match starts with both teams at GS 0, until one team scores a goal and moves to GS +1, with the opposing team moving to GS -1.

This is a very interesting graph, containing a wealth of information in a single line. Broadly, the line moves from the lower left hand corner to the upper right, indicating that either leading teams create more chances, or the reverse causation, teams that create more chances end up leading games.

But there’s much more to this graph than just that. Let’s start with GS 0. This point of the graph will always be 0.500. Football is a closed model, meaning that one team’s created shot is another team’s conceded shot, and at GS 0, both teams are at that same state. So each created shot is automatically a conceded shot in that same category. Likewise, shots created by teams at GS +1 are conceded by teams at GS -1 and vice versa. In short, the graph will always be point-symmetric around GS 0 and 0.500.
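The symmetry follows directly from how league-wide TSR per Game State is counted, as this sketch shows. The shot counts are the ones quoted in the ‘Game states and conversion’ post on this page (up to match day 27), so they may not be the exact data behind this graph; the function name is just for illustration.

```python
def tsr_by_game_state(shots_created):
    """League-wide TSR per Game State.

    `shots_created` maps a Game State to the number of shots taken by teams
    at that state. A shot taken at state g is conceded by a team at state -g,
    so TSR(g) = created(g) / (created(g) + created(-g)).
    """
    return {
        g: shots_created[g] / (shots_created[g] + shots_created.get(-g, 0))
        for g in sorted(shots_created)
    }

# Shots per Game State, as quoted in the 'Game states and conversion' post
shots_created = {-2: 371, -1: 1160, 0: 2885, 1: 922, 2: 363}
tsr = tsr_by_game_state(shots_created)

for g, value in tsr.items():
    print(g, round(value, 3))
```

Because TSR(g) and TSR(-g) share the same denominator, they always add up to exactly 1, which forces the value at GS 0 to 0.500.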

Now, while it’s true that the line roughly indicates that there is a positive correlation between TSR and GS, the catch is that over 80% of shots take place at GS -1, GS 0 or GS +1, the so called close game states (CGS).

And in these CGS, there is a negative correlation between TSR and GS. In simple words: teams that go a goal up create over 10% fewer chances, and allow over 10% more chances at the same time. The shift in TSR is over 25% in favour of the team trailing by a goal.

 

Shots on Target Rate

We can’t really judge this trend without taking the accuracy of shots into account. Therefore, I’ve also included the same graph for SOTR and GS.

To cut a long story short, both graphs are virtually identical. So, the hypothesis that teams at GS -1 take overly hopeful pot-shots does not gain ground from these data. At least, the shot accuracy between GS -1 and GS +1 is virtually identical. This does not mean that the quality of the shots is comparable too, but we’ll go into that when we look at conversion rates.

First, I want to stress the implications of the fact that TSR is negatively correlated with GS for over 80% of shots. This means that teams that spend a lot of time trailing a single goal will have an inflated TSR, while teams that spend a lot of time defending a single goal lead will be underestimated in terms of TSR.


Predicting

In early September, only four matches into the season, we went bold and published a predicted final standing of the Eredivisie based on TSR. There are some interesting over- and underestimations in this prediction to learn from.

Willem II, now bottom of the table and near-certain relegation candidates, had been predicted 14th with 37 points. Of their shots created and conceded, 25% took place at GS -1, compared to a league average of 17%, and only 7.3% at GS +1. This led the model to significantly overestimate their qualities.

Another overestimation is RKC. Top-half in terms of TSR, they are now 14th in the actual league table and are battling to avoid the relegation playoffs. Needless to say they won’t make their predicted 7th spot. Of RKC’s shots, 23.5% took place at GS -1, and a somewhat substandard 15.5% at GS +1.

There are also some interesting underestimations in the model. Feyenoord do better than their TSR would suggest, as they spent over 40% less time at GS -1 than the average team and nearly 30% more at GS +1. Vitesse are another example, with nearly 20% less time at GS -1 and over 25% more time at GS +1.

PSV and Ajax are less affected, because they also spent quite some time at GS +2 and higher, where leading teams create a dominant TSR.

 

In the end

In short, TSR depends on Game State. Over 80% of all shots are contested at Close Game States (GS -1, GS 0 or GS +1), where TSR and GS are negatively correlated. Teams that spend a lot of time trailing single goals are overestimated in terms of TSR, while teams that spend a lot of time defending single goal leads are underestimated.

What is a normal PDO?

The ever present challenge in football analytics is separating luck from skill. In a low scoring sport such as football, there will always be teams that fly high simply due to a spell of good fortune, and teams that find themselves in a hole they did not really deserve to be in on the basis of their displayed skill level. This is particularly true in Cup competitions, which are notorious for their surprise results, but even a common double round robin format that is used in most leagues around the world is too short to consistently identify the truly best team.

The challenge is to tell teams apart according to their true skill level, and not to get sucked into the group of people jumping on the bandwagon of teams that simply owe their ‘great performance’ to a spell of good fortune.
A very helpful tool to separate luck from skill is the concept of PDO. This is a very simple metric that originates from the low-scoring sport of ice hockey but is slowly finding its way into football.
For a detailed description of the concept and the logic behind PDO, please read the introductory post on PDO.
In short, PDO is computed from a team’s saves percentage and the same team’s shot percentage. Start with the total number of goals conceded and scored, collect the total number of shots conceded and created and you’re all set. Simply add the team’s saves percentage and shot percentage together and to get rid of the decimals, multiply by 1000. Since one team’s goals are always another team’s failed saves, the league wide overall PDO will always be 1000.
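The recipe above, and the claim that the league as a whole always lands on 1000, can be checked with a small sketch. The three-team mini league and its totals are hypothetical; the only requirement is that goals and shots are consistent across the league, since every goal scored is a goal conceded by someone else.

```python
def team_pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO from raw totals: (shooting % + saves %) * 1000."""
    shooting_pct = goals_for / shots_for
    saves_pct = 1 - goals_against / shots_against
    return 1000 * (shooting_pct + saves_pct)

# Hypothetical totals: (goals for, shots for, goals against, shots against)
totals = {
    "A": (30, 200, 15, 150),
    "B": (20, 170, 22, 180),
    "C": (15, 150, 28, 190),
}

pdos = {team: team_pdo(*t) for team, t in totals.items()}

# League-wide PDO uses the summed totals, not the average of team PDOs
league_totals = [sum(t[i] for t in totals.values()) for i in range(4)]
league_pdo = team_pdo(*league_totals)

print({team: round(value) for team, value in pdos.items()})
print(round(league_pdo))
```

Note that the identity holds for the pooled league totals. The plain average of the individual team PDOs will usually be close to 1000, but not necessarily exactly on it.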

One of the most fascinating things about PDO is that it seems to revert back towards the mean of 1000 for all teams. Although intuition will tell you that better players will finish a higher rate of chances presented, the raw numbers seem to tell that better teams don’t separate themselves from weaker teams by consistently obtaining higher shots and saves percentages. Much of this boils down to this influential post by the fantastic James Grayson, who analyzed ten seasons worth of EPL data to show that PDO indeed moves ever closer to 1000 as the season progresses.

But the question for this post is: what is a normal level for PDO? Does it continue to regress towards 1000 as matches keep being added, or is there a certain bandwidth of normal PDO levels?

PDO

In this graph, Infostrada Sports data has been used to assess PDO levels since the start of the 2010/11 Eredivisie season. So, the horizontal axis contains two and a half seasons of Eredivisie matches with individual match rounds simply numbered from 1 to 86. The vertical axis presents the PDO levels and the lines represent the four different quartiles, with Q1 being the top-25% and Q4 the bottom-25% teams in terms of PDO levels. In order to smooth the curves, the single highest and lowest PDO teams have been left out, which also helps to create four nice groups of four teams out of all 18 Eredivisie teams.

Essentially, this is a repeat of James’ work on the concept that PDO regresses towards the mean of 1000, but the graph is extended beyond the single season level of 34 matches. And as you can easily see, the regression towards the mean seems to stop around the 40 matches mark.

From there on, a zone of PDO from 980 to 1020 nicely captures the average PDO and it no longer narrows down further. This probably means that superior teams separate themselves from inferior teams in terms of PDO, but only within this range. Teams that find themselves in a nice league position with PDO levels above 1020 will revert back over time, but top teams with PDO levels of 1020 may not necessarily follow that path as this PDO of 1020 may well be that team’s baseline. Also, inferior teams with PDO levels around 980 may not safely assume that fate will revert their lower league standing over time, as 980 may well be their baseline PDO level.

We can produce the same graph for both components of PDO: saves percentage and shots percentage.

saves shooting

 

These graphs tell us that the zone of saves percentage ranges from 87% to 89% and that the zone of shots percentage ranges from 10.5% to 14%. Teams with performances outside these zones have historically been unable to sustain that kind of level and will be expected to regress back to these zones within one season’s time.

 

In the end

PDO is a powerful concept to separate lucky teams from unlucky teams. Due to the low scoring nature of the sport, luck is an essential component of achieving a good league table position, but it is generally neglected when evaluating the performances of football teams. A high PDO refers to teams that stand on a high they did not earn, while low PDO teams find themselves in a hole they did not deserve.
The original thinking involved the notion that PDO will revert back towards 1000 over time, and while this is certainly true for values outside the 980 to 1020 zone, it seems better to rephrase and say that PDO will revert back to the 980 to 1020 zone, rather than towards 1000 for all teams.


Separating home and away strength

Would you rather play Ajax at home or NAC away?

Obviously, Ajax is a lot stronger than NAC, but playing at home is preferable to playing away from home… Difficult choice I would say, and luckily, only a hypothetical one, since managers control a lot, but not which team they play at which venue in competitive matches!

Just like the concept of Relative Shots Rate (RSR) can be used to separate offensive and defensive strength, it can also be used to separate home and away strength. RSR is calculated using a team’s performance relative to the performance of other teams that played the same fixture, and not simply by counting shots alone, like the Total Shots Rate, or TSR, would do. This means that a correction for the strength of opposition and the venue of the fixture is implemented in the measurement of RSR.

The diagram below depicts teams on the basis of their RSR in home (vertical axis) and away (horizontal axis) matches. The fat red line is the trend line that separates teams that are stronger away from home (below the line) from teams that are stronger at home (above the line).

Several obvious things can be noted, before we will go on to discuss most of the teams.

First, as expected, home teams create more shots. The average number of shots for home teams in the 2012/13 Eredivisie so far has been 14.33, while away teams have created an average of 11.11 shots. The average home RSR is therefore 14.33 / (14.33 + 11.11) = 0.563, while the average away RSR is 0.437.

Second, and also not a surprise, teams that have a higher home RSR also tend to have a higher away RSR. In fact, there are no teams with a higher away RSR than home RSR, indicating that there are no teams that are truly stronger away from home than at home. There may be quite some teams that have picked up most points away from home this early in the season (PEC Zwolle, Groningen, ADO, N.E.C., Utrecht and Vitesse), but no teams consistently produce more points on the road than at home.

From the graph it is quite clear that PSV separates itself from the rest of the title contenders, both at home and on the road. They are the best in both RSR’s, and by a distance. The three of Ajax, Feyenoord and Twente are quite close together and are quite close to the trend line, indicating a good balance between home and away strength.

Heracles, Utrecht and RKC are 5th, 6th and 7th in overall RSR, but Utrecht does so in a different style than Heracles and RKC. Utrecht’s home performance is one of the most disappointing ones in the league (with N.E.C. and PEC Zwolle who are also way below the trend line), while away from home Utrecht is right up there with title contenders like Feyenoord and Twente. Heracles and RKC, on the other hand, perform at the level of Feyenoord and Twente when playing at home, but are just plain mediocre on the road with exactly average away RSR’s.

Another outlier is Vitesse. The only foreign-owned Dutch club are presently second in the league table, but their RSR does not indicate they earned that spot through the quality of their play. Their home RSR is 7th in the league, but only league numbers 17 and 18, Roda and Willem II, perform worse than Vitesse does on the road.

 

In the end
In two consecutive posts, we’ve demonstrated how RSR can be used to separate both offensive and defensive strength, and home and away strength. In the next post we will combine these two elements and come up with some interesting food for thought, as not all teams strike the same balance between offense and defense when playing home or away.
Oh, and the answer to the initial question, NAC’s home RSR and Ajax’ away RSR are both 0.48.

 

Data provided by Infostrada Sports


Separating offensive and defensive performances

What makes Heracles and RKC the two extremes of the Eredivisie? What area does Marco van Basten need to work on at Heerenveen? Why does PSV stand out from the rest of the title contenders? And why is Vitesse not a true contender for the title?

The recently introduced method of computing Relative Shot Rates allows us easy answers to all of these questions. This method allows us to separate offensive from defensive performances by looking at relative numbers of shots created and conceded.

 

Why are shots better than goals?

The simple answer is that shots are nearly ten times more frequent than goals. This means that it takes much more time to collect enough data when studying goals instead of shots. The average number of goals scored by an Eredivisie team is around 1.6 per match, while the average number of shots is around 13. In other words, the number of shots a team has taken by the 5th match round roughly equals the number of goals it has scored at the end of a 34-match season.
We know that converting shots into goals, and preventing your opponent from doing so, is a skill that is hard to repeat. Superior teams don’t separate themselves from inferior teams with regard to shot conversion and save percentages, but simply by creating more shots and conceding fewer.
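The sample-size argument can be made concrete with the two per-match averages quoted above. This is a rough sketch using those rounded figures, not exact league data; the variable names are illustrative.

```python
import math

# Rounded per-team, per-match averages quoted in the text.
goals_per_match = 1.6
shots_per_match = 13
season_matches = 34

# Goals a typical team has scored by the end of the season.
season_goals = goals_per_match * season_matches

# Number of match rounds needed before the shot count reaches that total.
rounds_needed = math.ceil(season_goals / shots_per_match)

# How many shots are observed for every goal.
shots_per_goal = shots_per_match / goals_per_match

print(round(season_goals, 1), rounds_needed, round(shots_per_goal, 1))  # 54.4 5 8.1
```

With these rounded inputs the ratio comes out at roughly eight shots per goal, in the same ballpark as the "nearly ten times" rule of thumb.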

 

Why the relative amount of shots?

So, better teams create more shots and concede fewer. It’s no secret that some teams excel by creating shots, while others specialize in conceding fewer, but it’s only the top teams that combine both skills.
However, by only looking at the absolute amount of shots, another problem creeps up.
Suppose that your team is a relegation favorite and you want to compare their total shot rate (TSR) to that of a relegation-threatened rival team. Now, your team faces the league leaders early in the season, while the rivals only meet them near Christmas. It’s easy to imagine that your team’s shot rate over much of the first half of the season looks worse than their performance actually is. Only when the other team has also been thrashed by the league leaders does the comparison become fair again. The Relative Shots Rate (RSR) compares the performance of a team with the performance of other teams that have played the same opponents. Details of this method were explained earlier.

The X-axis of the plot (horizontal) depicts the teams’ offensive performances, while the Y-axis (vertical) depicts the teams’ defensive performances. Note that the offensive performances show a wider range of variety than the defensive performances. For clarity, the defensive record is shown inverted, so that better defenses (that concede fewer shots) are higher in the graph. Better offenses are on the right hand side of the graph, so the upper right hand corner is the area of optimal performance, while the lower left hand corner shows the struggling teams.

It is immediately clear that PSV sticks out positively. They are matched in defensive terms by title rivals Feyenoord, and at a small distance Twente and Ajax, but PSV’s offensive record is second to none and allows them five shots per game more than even the best of the rest. The media runs a nice narrative of PSV’s defensive problems, but their defensive record shows that they are up there with the best of the Eredivisie.

An interesting position in this graph is RKC in the upper left hand corner. Erwin Koeman’s team takes a completely different approach compared to lower right hand corner Heracles. RKC matches the top teams in terms of limiting their opponent’s number of shots, but they pay for it by sacrificing their offensive chances to the level of relegation strugglers Roda, NAC and Willem II. Heracles, on the other hand, are second in terms of shots creation, but concede way too many shots in the process.

Van Basten’s Heerenveen has a situation that is quite comparable to Heracles. They can boast the third best offensive record, but only four Eredivisie teams give up more shots, which illustrates the imbalance in their team. Expect teams like Heerenveen and Heracles to be involved in high scoring matches, like Heerenveen’s 4-4 draw today, and expect RKC to play in low scoring affairs.

So, why is Vitesse not a true title contender? See for yourself: their offensive record is slightly above average (+0.25), but they concede 0.67 shots more than the average Eredivisie level. Expect Vitesse to regress as the 2012/13 Eredivisie nears its end. Their overachievement so far will probably propel them into a play-off position, but be cautious about expecting more on the basis of their present shot records.

 

In the end

We will regularly revisit this Relative Shots Rate concept, as it seems the best way to assess team performances. Studying shots rather than goals provides a much more solid base by generating nearly ten times higher numbers. On top of that, the Relative Shots Rate eliminates the bias that is otherwise always introduced due to differences in strength of schedule.

Next up: separating home and away performances…

Introducing the ‘Relative Shots Rate’

It has all kinds of characteristics to make it both in the wide world of football blogging, and in the even wider world of football journalism. The Total Shots Rate, or TSR, is simple and easy to explain and it requires little data. Yet so far it is the single most powerful predictor of future performance of football teams.

 

TSR

For those not yet aware of the concept, let me explain briefly. TSR is simply the fraction of shots created by a football team in a single match, or over multiple matches. If Feyenoord creates 10 shots against Ajax, while Ajax creates 20 shots in that same match, Feyenoord’s TSR will be 0.333 and Ajax’ TSR will be 0.667. The total TSR over a single match will always be 1 and since two teams divide that total, average TSR’s of all teams in a league will always be 0.500. Over multiple matches, simply add together the number of shots created by your team and divide by the total number of shots in those matches.
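As a minimal sketch of that definition, the function below computes TSR over any number of matches. The function name and the `(shots_for, shots_against)` tuple layout are assumptions made for this example, and the second match in the multi-match call is hypothetical.

```python
def total_shots_rate(matches):
    """TSR over one or more matches.

    `matches` is a list of (shots_for, shots_against) pairs, one per match.
    """
    shots_for = sum(created for created, _ in matches)
    shots_total = sum(created + conceded for created, conceded in matches)
    if shots_total == 0:
        raise ValueError("no shots recorded")
    return shots_for / shots_total


# The single-match example from the text: Feyenoord 10 shots, Ajax 20 shots.
feyenoord_tsr = total_shots_rate([(10, 20)])
ajax_tsr = total_shots_rate([(20, 10)])
print(round(feyenoord_tsr, 3), round(ajax_tsr, 3))  # 0.333 0.667

# Over multiple matches, shots are pooled before dividing.
# The second match (15 shots for, 10 against) is a made-up illustration.
two_match_tsr = total_shots_rate([(10, 20), (15, 10)])
print(round(two_match_tsr, 3))  # 0.455
```

Note that pooling shots is not the same as averaging the per-match TSR values: matches with more shots weigh more heavily in the pooled figure.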

But despite being the most powerful predictor around, TSR has its disadvantages too, with the most obvious one being that it does not correct for strength of schedule. The best teams in a league generally have a TSR of around 0.700, while at the lower end of the table TSR’s of 0.350 are more common. So the better teams seem to be around twice as good as the weaker teams with respect to generating shots. This leads to considerable bias throughout the season, as teams experience a different spread in strength of the opponents they face, but at the end of the season, when all teams have played each other twice, most of this bias has disappeared. The only bias remaining comes from the fact that teams don’t play against themselves, so the best team does not play the best team, while the weakest team does not play the weakest team. So better teams face on average lower TSR opposition compared to weaker teams.

 

Model

At 11tegen11, we’ve introduced a model to predict the final standing of the Eredivisie table based on TSR. Since shots are nearly ten times more frequent than goals, the model identifies better teams much faster than the regular league table does. The main problem with the model is that teams have different strengths of schedule. After fourteen matches have been played, PSV tops the Eredivisie table in terms of TSR with 0.730. But before the first half of the season is over and all teams have faced each other once, they still have to play Ajax (0.560), Twente (0.559) and N.E.C. (0.498). So it’s safe to assume that their TSR of 0.730 will fall over the coming three matches. Meanwhile, Heerenveen (0.469) still has to play Willem II (0.385), Roda (0.353) and Utrecht (0.544). So Heerenveen’s TSR will likely be an underestimation of their true strength.

 

Relative Shots

In order to tackle this problem, we will introduce the ‘Relative Shots Rate’, or RSR. The RSR is computed by comparing the number of shots created by a team in a fixture against the average number of shots created by all teams in the league against that same opponent. This compares the performance of a team in a certain fixture with how all clubs have performed in that same fixture, thereby correcting for the strength of schedule.

So, Ajax concede on average 8.0 shots when playing at home, and VVV created 4 shots in Amsterdam. This gives VVV’s offense a -4.0 for that match. Meanwhile, VVV conceded 17 shots in that same match, against a league average of 13.6, so VVV’s defense record for that match is -3.4. If you do this for every match played, and then add up a team’s offense and defense records separately, you get an overall offense and defense performance that represents the average number of shots that a team creates or concedes compared to league average.
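The per-match step can be sketched as follows. The function names are hypothetical, and the sign convention (positive means better than average on both offense and defense) is inferred from the VVV example. Averaging the per-match records into a season figure is an assumption about how the records are combined, and the second match in the season list is made up purely to show that averaging.

```python
from statistics import mean


def match_record(shots_created, shots_conceded,
                 fixture_avg_created, fixture_avg_conceded):
    """Return (offense, defense) for one match, relative to the fixture.

    fixture_avg_created:  average shots all teams create in this fixture
    fixture_avg_conceded: average shots all teams concede in this fixture
    Positive numbers are better than average for both values.
    """
    offense = shots_created - fixture_avg_created
    defense = fixture_avg_conceded - shots_conceded
    return offense, defense


def season_record(records):
    """Average the per-match (offense, defense) records (assumed aggregation)."""
    offense = mean(r[0] for r in records)
    defense = mean(r[1] for r in records)
    return offense, defense


# VVV away at Ajax, using the figures from the text.
vvv_offense, vvv_defense = match_record(4, 17, 8.0, 13.6)
print(round(vvv_offense, 1), round(vvv_defense, 1))  # -4.0 -3.4

# Hypothetical second match, only to illustrate the averaging step.
season_offense, season_defense = season_record(
    [(vvv_offense, vvv_defense), (3.0, 1.0)]
)
```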

Over fourteen matches, PSV has created on average 5.95 shots more than league average, while they have conceded 3.59 shots less. Now, these numbers can be converted to a single parameter that we will call the ‘Relative Shots Rate’, or RSR.

 

Computing the RSR

The average number of shots per team in a 2012/13 Eredivisie match has been 12.71. So, against the average opponent, one can expect PSV to create 12.71 + 5.95 = 18.66 shots and to concede 12.71 – 3.59 = 9.12 shots. This translates into an RSR for PSV of 18.66 / (18.66 + 9.12) = 0.672.
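The conversion from offense and defense records to a single RSR can be written as a short function. This is a sketch under the sign convention that a positive defense record means conceding fewer shots than average; the function and parameter names are illustrative.

```python
def relative_shots_rate(league_avg_shots, offense, defense):
    """RSR against an average opponent.

    offense: shots created above league average (positive is better)
    defense: shots conceded below league average (positive is better)
    """
    expected_for = league_avg_shots + offense
    expected_against = league_avg_shots - defense
    return expected_for / (expected_for + expected_against)


# PSV after fourteen matches, using the figures from the text:
# 5.95 shots created above average, 3.59 shots conceded below average.
psv_rsr = relative_shots_rate(12.71, offense=5.95, defense=3.59)
print(round(psv_rsr, 3))  # 0.672
```

A team with offense and defense records of exactly zero comes out at 0.500, the same league-wide average as for TSR.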

Now, we’ve learned that PSV’s TSR of 0.730 is quite a lot higher than their RSR of 0.672. This should indicate a strong series of fixtures coming up before the season is at its half-way stage. And indeed, with Ajax (RSR 0.554), Twente (RSR 0.538) and N.E.C. (RSR 0.486) still to play, PSV likely won’t maintain their TSR as high as 0.730.

As mentioned above, Heerenveen have three relatively easy fixtures coming up before the half-way stage of the season, playing Willem II (RSR 0.402), Roda (RSR 0.393) and Utrecht (RSR 0.536). So Heerenveen’s TSR of 0.469 is likely to be a slight underestimation of their strength.

 

In the end

So, while TSR is more straightforward and easier to explain, RSR offers a better representation of a team’s strength. It eliminates the bias of strength of schedule, and also allows us to correct for situations where teams have played more home or away matches. On top of that, it is possible to create separate RSR’s for home and away matches, but we will save that for a later post…

 

Shot data provided by Infostrada Sports.

What if Wilfried Bony was a professional cyclist?

Make no mistake, I think that he is a good player, and that he probably is a very good player, but Vitesse striker Wilfried Bony is not the world-beater that people take him for. The Ivory Coast striker is presently the hottest player in the Eredivisie, thanks to his impressive return of 15 goals in 14 matches. Not bad, is it?

But, join me here on a slightly weird thought experiment.

 

Cycling

Imagine if Wilfried Bony would not have been a professional footballer who scores goals for a living, but a professional cyclist instead. After racing in his home country where his talent was quickly recognized, he transferred to Europe to develop further at continental cycling level for a few years, which represents his years playing football for Sparta Prague between 2008 and 2010. After that, he joined a World Tour team, which represents his transfer to the Eredivisie, and the cyclist Bony had a good first season there.

Now suddenly, in his second season at Vitesse, results have really picked up and Bony seems twice as good as last year, maybe even the best cyclist in the entire peloton. The cyclist Bony would immediately get linked to rumours of illegal substance use or, to say the dreaded word out loud, doping.

Now, as far as we know, performance enhancing drugs don’t exist in football, or at most they play a marginal role. But in our thought experiment they do, and they go by the name of ‘luck’. The doping that caused the cyclist Bony to perform twice as well in his second year at the club is the luck that caused the football player Bony to temporarily perform at the level he does now for Vitesse.

Luck

There are certain parallels between doping in cycling and luck in football, which make it easier to assess the role luck plays in football. Doping has the potential to turn a decent professional cyclist into a good one, or a good one into a true world beater. It won’t turn an amateur racer into a World Champion. Luck has the potential to turn a decent striker into a good one, or a good goal scorer into a world beater. So our cyclist Bony performs significantly above his usual level for as long as the doping effect lasts, and our striker Bony converts way above his usual rate, for as long as his luck lasts.

But taking doping is a deliberate choice, while one can’t control luck. So, think of luck in football as cycling’s doping, but without any control over timing and dosage. Oh, and luck is not illegal either, as all competitors have equal access to it. In the case of Bony, this thought experiment helps to point out why his present goal scoring rate won’t last, as the amount of luck he currently experiences is not going to last.

Prior to this weekend’s match, where he scored the winning goal to hand PSV a rare Eredivisie home defeat, Bony had scored 14 goals in 13 matches, having played a total of 1115 minutes. That averages a goal every 80 minutes, which is a truly elite number.

GP = games played ; Sb = subs ; Min = minutes played ; Gl = Goals ; Sht = shots on target

His 14 goals, however, came from just 20 shots on target, for a conversion rate of shots on target into goals of 70%. Out of this world? Yes! Sustainable? No!

 

25 goals

There are broadly three numbers to compare Bony’s conversion rate with. First, there is the normal conversion rate for all players alike, which stands at 30% over the present Eredivisie season. Second, there is Bony’s individual conversion rate prior to this season, which stands at 40 goals from 104 shots, or 38% for his entire career, including his time at Sparta Prague. And third, we could take his individual Eredivisie conversion history, which stands at 15 goals from 46 shots, or 33%.

None of these standards for comparison is anywhere near Bony’s present luck-infused 70% conversion rate. But how many goals can we expect Bony to score over the remainder of the season?

The best estimate for Bony’s total goals at the end of this season would be to hold his usual Eredivisie conversion rate of 33%, which is well in line with other quality strikers, against the expected number of shots on target that he will take for the remainder of the season.

Extrapolating from the number of matches still to be played, and his (and his team mates’) capacity for generating shots on target, we can expect Bony to take 32 more shots on target. This would most likely give him around 11 more goals, one every 171 minutes, rounding off his season total at a respectable 25 goals.
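The projection above is simple enough to reproduce in a few lines. This sketch uses only figures quoted in the text: the rounded 33% baseline rate, the estimate of 32 further shots on target, and the 14 goals scored before this weekend. The variable and function names are illustrative.

```python
def conversion_rate(goals, shots_on_target):
    """Goals per shot on target."""
    return goals / shots_on_target


# Conversion rates quoted in the text.
current_rate = conversion_rate(14, 20)      # this season, before the weekend
career_rate = conversion_rate(40, 104)      # career prior to this season
eredivisie_rate = conversion_rate(15, 46)   # Eredivisie history

# Projection: rounded Eredivisie baseline rate times expected shots on target.
baseline_rate = 0.33
expected_shots_on_target = 32
goals_so_far = 14

extra_goals = round(baseline_rate * expected_shots_on_target)  # 10.56 rounds to 11
projected_total = goals_so_far + extra_goals

# Scoring rate so far: 1115 minutes for 14 goals.
minutes_per_goal = 1115 / 14

print(round(current_rate, 2), round(career_rate, 2), round(eredivisie_rate, 2))
print(extra_goals, projected_total, round(minutes_per_goal))  # 11 25 80
```

The gap between the current rate of 0.70 and the baseline of 0.33 is what the article attributes to luck.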

Keep this in mind when judging short term performances. Just like doping in cycling, luck catches up quickly, and both of these performance enhancers are never there for the long run!

 

 

Statistics provided by Aaron Nielsen (@ENBSports). Check his site too!