Tag Archives: Soccermetrics

How to scout a striker?

Scouting strikers should not be that hard, right? Their prime responsibility is putting the ball in the back of the net, and goals are one of the few elements of football where traditional fans and nerdy analysts agree. A goal is a goal, so counting goals cannot go wrong. Strikers who score a lot of goals are better than strikers who score fewer goals. Or not?

In our previous piece on scouting offensive talent, we’ve distinguished two elements that constitute a good striker.

  1. The striker has to get into good scoring positions, and accumulate good shots. This is best measured as Expected Goals (ExpG) per 90 minutes, with exclusion of penalties.
  2. The striker has to convert these chances into goals. This can be measured by comparing ExpG and actual non penalty goals.

The previous post on strikers illustrated how we can measure those two elements and judge strikers separately on both of these qualities. Today we will take it a step further and see what scouting implications come from it. We will show that sometimes it is better to buy a lower scoring striker, and which high scoring strikers to avoid. But first, I want you to meet someone.
 

Meet our striker!

He plays in a big league, for a good team, where he has taken 160 non penalty shots in the past season. On average, each shot was good for 0.152 ExpG, so over all shots together we could have expected 24.4 goals from him.

The thing is, our striker is pretty good, so instead of 24.4, he scored 43 non penalty goals for an over performance of 18.6 goals. We can stick an ugly acronym to it and say his non penalty goals above replacement (NPGAR) is 18.6.

NPGAR = Non Penalty Goals – Expected Non Penalty Goals
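As a quick worked example, the formula can be written as a tiny Python function. The function name `npgar` is my own label for illustration, and the inputs are the season totals quoted above.

```python
def npgar(non_penalty_goals: float, expected_non_penalty_goals: float) -> float:
    """Non penalty goals above replacement: actual minus expected."""
    return non_penalty_goals - expected_non_penalty_goals


# Season totals quoted in the text: 43 non penalty goals against 24.4 ExpG.
print(round(npgar(43, 24.4), 1))  # 18.6
```

A positive value means the player scored more than the shots warranted, a negative value means he scored less.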

You’ve probably guessed by now that our striker is Lionel Messi. This season, Messi still plays for Barcelona, where he has taken 75 non penalty shots to date. On average the quality of the chances was comparable to last season, with an ExpG per shot of 0.149. Overall, we should expect 11.1 goals.

The thing is, Messi is suddenly not so excellent at finishing, and he has come up with 9 non penalty goals instead of 11. His NPGAR is now -2.14, which indicates that the average player, not even the average striker, would have scored two more goals with the type and number of shots that Messi has taken this season.

 

Analysis

A story about Messi is not analysis, it’s anecdote. And anecdotal evidence is no evidence. We could ‘prove’ that finishing does stick with a player by simply picking someone else that happened to follow an excellent finishing season with another excellent finishing season and fire that point home.

It makes more sense to repeat this work for all 479 players of the top-5 leagues who took at least 10 non penalty shots in the baseline 2012/13 season. We take separate looks at the creation of goal scoring chances (ExpG per 90) and at the conversion of chances into goals (Goals minus ExpG). Both parameters will be compared over one season and the next.
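A minimal sketch of that season-to-season comparison, assuming each season is available as a dictionary from player name to metric value. The helper names `pearson_r` and `season_to_season_r` and the sample players are hypothetical, and the 10-shot minimum is assumed to have been applied before the data gets here. Only the method (match players across seasons, then correlate) follows the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def season_to_season_r(first, second):
    """Correlate a metric across two seasons for players present in both."""
    players = sorted(set(first) & set(second))
    return pearson_r([first[p] for p in players], [second[p] for p in players])


# Made-up values, purely to show the call; players D and E appear in one season only.
season_1 = {"A": 0.20, "B": 0.45, "C": 0.60, "D": 0.10}
season_2 = {"A": 0.25, "B": 0.40, "C": 0.70, "E": 0.30}
r = season_to_season_r(season_1, season_2)
print(f"r = {r:.2f}, R-squared = {r * r:.2f}")
```

A value of r near 1 means the metric repeats from one season to the next, and a value near 0 means it does not.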

 

ExpG per 90

In the first graph we will look at the repeatability of non penalty Expected Goals per 90 minutes (ExpG NP per90). The horizontal axis shows ExpG NP per 90 for the first season, and the vertical axis shows the same for the next season.

ExpG90 correlation

Excellent! It turns out that players with a high ExpG per 90 in one season are also the players with a high ExpG per 90 in the next season. This is not too surprising, as several factors influencing ExpG per 90 will remain constant over time. Strikers will still be playing as strikers, and most players playing for top teams will still be playing for top teams. More work is needed here, but we’ll leave that for another post, as there is a far more interesting graph coming up.

 

NPGAR

The next graph shows the repeatability of non penalty goals above replacement (NPGAR). This represents the conversion of goal scoring chances into actual goals.

NPGAR correlation

It turns out that if you correct for the quality of goal scoring attempts, there is absolutely no connection between conversion in one season and the next. A high or low NPGAR in one season has zero relation with NPGAR in the next season.

Messi is the dot in the lower right hand corner, who had an unworldly 2012/13 season, with an NPGAR of +18.6, followed by the current season of -2.1.

 

Scouting

This is a shocking conclusion with huge implications for striker scouting. If a striker bases his goal scoring mainly on conversion, he has a good chance of failing in the next season. If a striker bases his goal scoring mainly on good underlying ExpG numbers, he has a good chance of sustaining his level of scoring.

Buying strikers who score their goals due to a high NPGAR is something you should always avoid.

We all know these famous examples of one season wonders, who got transferred for big money, only to disappoint at their new clubs. Usually, loads of soft factors like the higher level of competition, language issues, or playing style are used to explain the disappointing results, while the only thing going on is regression of NPGAR.

Regression does not always occur though, and you can see in the scatter plot that some players do indeed follow a season of high NPGAR with another season with high NPGAR. But just as many players do not, and just as many players with high NPGAR in the second season come off seasons with low NPGAR.

 

Finnbogason

We should use NPGAR as a red flag in striker scouting. A player like Alfred Finnbogason, currently the Eredivisie top scorer with 21 goals in 20 matches, is a nice example. We can put up several red flags.

First, 8 of his 21 goals are penalties. Second, his NPGAR is +2.68, indicating that he is nearly three non penalty goals above expectations. There is no ground at all to assume that he, or any other player, will outperform the ExpG model next year. All in all, Finnbogason’s non penalty ExpG per 90 is 0.51, which is still a good number, but by no means near the present perception of a striker that scores 1.05 goals per 90.

For next season, 0.51 goals per 90 seems a reasonable estimate. The problem is, next season Finnbogason will not be playing at Heerenveen, as he will make the step up to a bigger league, where he won’t contribute the same number as in the Eredivisie. His true level should then be estimated somewhat lower than 0.51 goals per 90 minutes, and we will all start wondering what is going on with all these high scoring strikers who just don’t cut it outside the Eredivisie.

 

Exceptions

Inevitably, though, there will be players who seem to disprove the workings of NPGAR. We can assume that half of all players will have a positive NPGAR and half will have a negative NPGAR. A season later, one quarter of players will have two consecutive positive NPGAR seasons. One eighth will have three consecutive seasons where they outperform ExpG, and so on.

In this study among players from top-5 leagues with at least 10 shots, we find 479 players. With such a big group of players, there will inevitably be some players who consistently outperform ExpG to produce season after season of positive NPGAR. This is a misleading situation, as these players will be credited with finishing skills that are basically the product of an unrepeatable effort.
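The arithmetic behind that claim can be sketched in a few lines, under the coin-flip assumption stated above: every player has a 50% chance of a positive NPGAR each season, independent of earlier seasons. The function name is hypothetical.

```python
def expected_streak_players(n_players: int, seasons: int) -> float:
    """Expected number of players with `seasons` consecutive positive NPGAR
    seasons, if each season is an independent 50/50 coin flip."""
    return n_players * 0.5 ** seasons


for k in range(1, 6):
    print(k, expected_streak_players(479, k))
```

With 479 players, chance alone leaves roughly 60 players with three positive seasons in a row and about 15 with five in a row.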

 

In the end

The message in striker scouting is quite clear. Familiarize yourself with the terms ExpG and NPGAR, and the mistake of buying a striker who flops is generally avoidable. Stay away from strikers with high NPGAR and aim for those with high ExpG numbers, as the latter group will cut it next season, while the former group has every chance of falling back.

Probably, a negative NPGAR in a player with good underlying ExpG numbers is a sign of a bargain buy. The world will see a striker struggling to convert, and it takes some balls to buy him, but the numbers indicate that a return to scoring form is right around the corner.

Putting Expected Goals to the test

After yesterday’s post where Expected Goals was explained in detail, today’s post will put the metric to the test. How good is Expected Goals? And is it better than Total Shots Rate?

We’ll compare ExpG and TSR at several levels as we go along. The dataset used for the first part of this analysis consists of all 98 teams from the 2013/14 season so far, for top-5 leagues. As usual, data comes from Squawka, my go-to-site for OPTA driven football data. All comparisons in this piece are made on team level. We’ll leave the individual player analysis of ExpG for another day.

ExpG is calculated as explained in yesterday’s post, and for comparison with TSR, ExpG ratios (ExpGR) are used. For all behind-the-scenes input in the ExpG formula no data from the 2013/14 season was used. All regression analysis that was needed to determine how to rate different factors that influence ExpG was carried out on earlier data. The risk of over fitting is therefore minimized.

ExpGR = ExpG for / (ExpG for + ExpG against)

TSR = Shots for / (Shots for + Shots against)
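Both ratios share the same shape, so a short sketch can compute them with one helper. The function names and the team totals below are made up for illustration.

```python
def ratio(for_value: float, against_value: float) -> float:
    """Share of the combined total that belongs to the team itself (0 to 1)."""
    total = for_value + against_value
    if total == 0:
        raise ValueError("no shots or ExpG recorded for either side")
    return for_value / total


def tsr(shots_for: float, shots_against: float) -> float:
    """Total Shots Rate."""
    return ratio(shots_for, shots_against)


def expgr(expg_for: float, expg_against: float) -> float:
    """Expected Goals ratio."""
    return ratio(expg_for, expg_against)


# Made-up team totals: many shots, but of modest quality.
print(round(tsr(300, 200), 3))      # 0.6
print(round(expgr(27.0, 23.0), 3))  # 0.54
```

In this invented case the team looks stronger by shot count than by shot quality, which is exactly the kind of gap the comparison below is about.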

 

TSR and outcome

First up, the relation between TSR and the outcome in terms of points per game (PPG) and goal difference (GD). Click on the graph if needed, for a larger version.

TSR and outcome

TSR is a very good metric. It correlates nicely with the two most relevant performance indicators, PPG and GD. The R-squared values of 0.55 and 0.58 correspond to correlation coefficients (R) of roughly 0.75, which indicates that knowing a team’s TSR provides around 75% of the knowledge needed for perfect knowledge of either PPG or GD. For more, and better, explanations of R-squared and R, check Phil Birnbaum. The man really knows his stuff.

In general, R-squared values are higher when leagues have a clear separation into two groups. EPL typically has values over 0.6, while Ligue 1, where the dots are one bunch, generally scores below 0.4.

 

ExpG and outcome

These two plots show the relation between ExpGR and outcome.

ExpGR and outcome

At face value alone, you can tell that ExpGR has a better correlation with outcome than TSR has. The dots are closer to the red regression line, so the R-squared value is a lot higher. For PPG, the R-squared is 0.73, while for GD it is somewhat higher at 0.79.

This is a magnificent correlation between a metric and outcome, but don’t get carried away yet. We would expect ExpGR to do better here, as it carries more detailed information to rate goal scoring chances. The formula behind it is designed to improve the relation with outcome in terms of PPG and GD. It would be a true shock if ExpG did not do a lot better than TSR here. What’s more important is the second half of this piece, looking at repeatability of the metrics.

 

TSR and repeatability

From here on, a different data set is used, as we’ll now compare the same metric over two consecutive seasons. Data consists of season 2012/13 and 2013/14 so far for the top-5 leagues, where obviously relegated sides from the first season did not produce a second season for comparison, as promoted sides in the second season did not have a first season to compare with. This left 84 teams with consecutive seasons.
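Matching teams across seasons comes down to keeping only the sides that appear in both. A minimal sketch, assuming each season is a dictionary from team name to metric value. The function name and the three-team leagues are invented for illustration.

```python
def consecutive_season_pairs(first, second):
    """Return (team, first value, second value) for teams present in both seasons.

    Relegated sides drop out because they are missing from `second`;
    promoted sides drop out because they are missing from `first`.
    """
    return [
        (team, first[team], second[team])
        for team in sorted(first)
        if team in second
    ]


# Hypothetical league, values invented for illustration.
tsr_2012_13 = {"Alpha": 0.61, "Beta": 0.48, "Relegated FC": 0.39}
tsr_2013_14 = {"Alpha": 0.58, "Beta": 0.52, "Promoted FC": 0.44}
print(consecutive_season_pairs(tsr_2012_13, tsr_2013_14))
```

The resulting pairs are what goes on the horizontal and vertical axes of the repeatability plots.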

TSR repeatability

TSR is pretty repeatable, producing an R-squared of 0.51. This indicates that TSR in the first season is a moderately good predictor of TSR in the second season. Most teams are roughly in the same ballpark, but deviations of 0.100 are far from rare.

 

ExpGR and repeatability

The next plot shows ExpGR in the first and second season.

ExpGR repeatability

ExpGR has an even better repeatability than TSR did. With an R-squared of 0.67, this metric carries a good signal over multiple seasons. Stripping a few outliers, teams generally don’t deviate more than 0.050.

 

In the end

This scatter plot heavy piece shows a superior correlation for ExpGR with both outcome and repeatability compared to TSR. To borrow from Nate Silver, ExpGR carries more signal and less noise than TSR.

The first part of this post, relating ExpGR and outcome, shows that in measuring team performance, ExpGR should prevail over TSR. This conclusion was probably known intuitively, but is now illustrated and quantified.

The second part of this post is more revolutionary, as it establishes ExpGR as a more reliable parameter to use for predictions. This means not just fancy, number-heavy predictive models, but also any easily made claims regarding upcoming matches or final league positions.

TSR still holds the quite relevant advantage that counting shots is a lot easier than building an ExpG model. However, with more and more variations of ExpG models around, these numbers will gradually become easier to obtain over time.

 

 

I feel like I could have put a dozen links to James Grayson’s amazing site in this TSR heavy post, but I’d rather urge you to just go to his site and check it thoroughly. It is good.

What is ExpG?

This post will look at the latest love child of the football analytics community, Expected Goals, commonly referred to as ExpG or xG. I’ve noticed a lot of questions via Twitter recently, regarding this relatively new concept. Spread across multiple posts, the concept is mentioned and has been explained on 11tegen11 before, but I felt the need for a comprehensive explanatory piece on ExpG to explain this important concept, and to use it for future reference.

 

ExpG

ExpG stands for Expected Goals. It measures not how many goals a team has scored, but how many goals an average team would have scored with the amount and quality of shots created.

Each goal scoring attempt is assigned a number based on the chance that this attempt produces a goal. Typical parameters to use are shot location and shot type (shot vs header). Some models, including the one I use on 11tegen11, also use assist information to separate through-balls from crosses.

Teams that produce more ExpG than they concede have the best chances of winning football matches.
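In code, the idea comes down to a sum of per-shot scoring probabilities. The sketch below is not the 11tegen11 model: the probability table is invented for illustration, and a real model would estimate these values from historical shot data and use more factors.

```python
# Hypothetical scoring probabilities per (location, shot type); NOT fitted values.
ILLUSTRATIVE_PROBABILITIES = {
    ("six_yard_box", "shot"): 0.35,
    ("six_yard_box", "header"): 0.20,
    ("penalty_area", "shot"): 0.12,
    ("penalty_area", "header"): 0.07,
    ("outside_area", "shot"): 0.03,
}


def expected_goals(shots):
    """Sum the scoring probability of every attempt in `shots`."""
    return sum(ILLUSTRATIVE_PROBABILITIES[shot] for shot in shots)


match_shots = [
    ("outside_area", "shot"),
    ("outside_area", "shot"),
    ("penalty_area", "shot"),
    ("six_yard_box", "header"),
]
print(round(expected_goals(match_shots), 2))  # 0.38
```

Four attempts add up to well under half a goal here, because two of them are low-probability efforts from distance.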

 

Total Shots Rate

ExpG has its roots in another key metric in football, Total Shots Rate, or TSR. Before trying to grasp ExpG, it is important to get familiar with shots rates.

Total Shots Rate = Shots For / (Shots For + Shots Against)

This formula provides TSR on a 0 to 1 scale. If a team takes all shots in a match, or a series of matches, TSR will be 1, and the more shots it has to leave to opponents, the lower TSR gets. Pooled over all teams in the same league, TSR will always be 0.500, since each shot for is a shot against for another team.

TSR is pretty simple, yet it is a powerful predictor for future performance of football teams. Ever since its introduction to football, by James Grayson, TSR has dominated the analytics community. James has shown TSR to have the two qualities that are essential for a powerful team ranking tool.

  1. TSR shows a strong correlation with both points per game, and goal difference.
  2. TSR in one time period shows a strong correlation with TSR in the next time period.

If only the first condition is met, the metric would be strong in telling what has happened, but would not translate into the future. Goal keeper saves percentage is a nice example of a stat that helps explain what has happened, but holds no power for matches still to come.

If only the second condition is met, the metric would be strong in translating into the future, but not correlated to performance. Team shirt color is a nice example, where translation into the future is easy, but a relation to performance does not exist.

 

The problem with TSR

The problem with TSR is that it treats all shots as equal, which does not fit the fluid nature of football, where shots are not equal. Shots may come through a crowd of defenders from 40 yards out, or from the penalty spot in optimal circumstances. For TSR, both shots count as one, and both influence TSR equally.

This induces errors and probably also bias.

Errors arise because some shots are worth more than others. Sometimes a team creating 20 shots did a powerful job, but other days the team was just trigger happy and produced weak quality output. It may sound weird, but errors are not too much of a problem in a predictive model.

Bias is much worse.

If all teams produced and conceded an equal case mix of poor and high quality shots, TSR would, despite its errors, be a perfect tool. However, there is plenty of evidence around that this is not the case. Some teams produce high quality shots, like Barcelona, and other teams produce low quality shots, like Laudrup’s Swansea.

 

Shot quality

Shot quality definitely meets condition one. It is related to performance in terms of points per game and goal difference. However, the evidence that it meets condition two is less clear cut. Data to measure shot quality has only been around since the 2012/13 season, so we don’t have high quality season-to-season correlation measurements. In other words, was Swansea’s recent struggle to produce decent shot quality just a blip that would fix itself, or does it indicate an underlying reason that will cause the team to produce below average quality shots in the near future?

 

In the end

ExpG is hot, and if you asked me now, I’d say ExpG is the next big step being taken in football analytics. Intuitively it makes a lot of sense to separate goal scoring attempts by the odds of scoring from them. However, for a new metric to be accepted as truth, a bit more work is needed. ExpG is a lot more complex than just counting shots. To show that this effort is worthwhile, we should first do a better job of illustrating its supremacy over TSR.

Never judge a goal keeper by his saves

Sometimes analysis and football intuition fit nicely together, and in those situations writing analysis pieces is easy. Sometimes they don’t, and writing gets tougher.

I’ve spent most of the past weeks thinking about goal keeper analysis, a topic that seems as simple as it gets but, as we’ll find out in this post, is actually difficult to get your head around and do properly.

According to the all-knowing Wikipedia, a goal keeper is “a designated player charged with directly preventing the opposing team from scoring by intercepting shots at goal”.

So, what could be more difficult than assessing how many of those shots end up as goals and, voilà, here’s our goal keeper analysis?

Let me start with a poll question. No need to fill out the answer, just take a little bit of time to decide which answer you think is correct.

The best way to identify goal keeping talent is…

  1. Percentage of shots saved
  2. Percentage of shots on target saved
  3. Percentage of shots saved with a correction for shot quality
  4. Other

 

In my personal history in football analysis I’ve gone from option 1 to option 2, back to 1, and then to 3.

I’ve spent most of this season at option 3, but some background work I’ve done these past weeks has moved me further down, to option 4.

Yes, in my view, goal keeper analysis cannot reasonably be done on the basis of analyzing saves.

Now, that statement requires a bit of back up, so here we go. In the remaining part of this article we’ll analyze goal keepers in the top-5 leagues (England, Spain, Italy, Germany and France), who have faced at least 100 shots in two consecutive seasons (2012-13 and 2013-14), with the same club. In my view, this is the best sample to use, as it prevents keepers who switch teams from distorting the sample, and keepers with low shot numbers from doing the same.


Percentage of shots saved

We’ll start with raw saves percentage. This is the easiest parameter to collect, and probably the most used tool to evaluate goal keepers. It also ties in nicely with our intuition that good goal keepers stop a higher proportion of shots than bad goal keepers.

GK save percentage 03 februari 2014

The horizontal axis shows save percentage in the first year, and the vertical axis shows save percentage in the subsequent year. Remember, these are all goal keepers playing two seasons for the same club.

The connection is not very strong, but it’s not totally absent either. Generally, goal keepers who noted good saves percentages in the first year, noted better saves percentages in the second year, but the spread is huge. This makes it unreliable to estimate the second year’s saves percentages on the basis of the first year’s saves percentages. The repeatability of goal keepers saves percentage is poor. In general, if your stat has a poor repeatability, it’s useful to describe what has happened, but very misleading to assume that things will happen along the same lines in the future.

These numbers correspond with the excellent and far too rarely viewed work by James Grayson, who found a similarly poor relation in a much larger set, matching teams in one season and the next.

 

Percentage of shots on target saved

Let’s move a little step forward and isolate shots on target. Some people advocate using this over raw saves percentage, since goal keepers are hardly responsible for off target shots. In theory, though, keepers may take responsibility for some off target shots. By approaching a striker they could disrupt shot placement, or by reputation alone they could force strikers to try and find more difficult corners of the goal. Just raising a few hypotheses here.

GK save percentage SoT 03 februari 2014

Again, first year performance is plotted on the horizontal axis, with second year performance on the vertical axis. The connection is even weaker for saves percentage of on target shots than it is for saves percentage of all shots conceded. Let’s save the debate until after the next plot.

 

Percentage of shots saved with a correction for shot quality

The third analysis uses shot quality. Based on our Expected Goals (ExpG) model, each shot is assigned a chance of ending up in goal, based on shot location, shot type and several other factors. This helps to control for the difficulty goal keepers have to make the save. In theory, this analysis is the best test for shot stopping quality, since it removes the fact that some keepers face tougher shots than others.

Goals conceded above replacement identifies how many goals a keeper conceded above or below the value of Expected Goals per 100 shots faced.
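A minimal sketch of that calculation, assuming the sign convention that a positive number means the keeper conceded more than the shots warranted. The function name and the example keeper are invented for illustration.

```python
def goals_conceded_above_replacement(goals_conceded, expected_goals, shots_faced):
    """Goals conceded minus Expected Goals, scaled to 100 shots faced.

    Positive: conceded more than the shots warranted. Negative: fewer.
    """
    if shots_faced <= 0:
        raise ValueError("shots_faced must be positive")
    return (goals_conceded - expected_goals) / shots_faced * 100


# Invented keeper: 40 goals conceded from 400 shots worth 44.0 ExpG.
print(goals_conceded_above_replacement(40, 44.0, 400))
```

Scaling to 100 shots makes keepers who faced very different shot volumes comparable on one axis.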

GK CAR 03 februari 2014

After correcting for shot quality, all connection between first year performance and second year performance is lost. A goal keeper who over performed in the first year, has an equal chance of over performing in the second year as a goal keeper who under performed in the first year.

The most intriguing part of this rather shocking conclusion is that this knowledge is already out there, yet people continue to analyze goal keepers on the basis of saves. Again, I’m pointing you towards James Grayson, who, with smaller numbers taken from a Paul Riley post, found no correlation between goal keeper saves percentages in one season and the next after correction for shot location.

 

Shot quality

Please allow me to add one more scatter plot. This time, I’ve linked saves percentage and ExpG per shot, to show the strong link between those two.

GK save percentage and ExpG 03 februari 2014

No goal keeper who faced shots averaging more than 0.11 ExpG noted a saves percentage over 92%, and no goal keeper who faced shots averaging less than 0.09 ExpG noted a saves percentage below 90%.

 

In the end

Putting all four plots together, this is compelling evidence to ignore each and every analysis using goal keeper saves percentage. The only, weak, link between goal keeper saves percentages (first graph) is driven by the quality of shots allowed. Some teams tend to face higher quality shots than others, therefore some goal keepers tend to have higher saves percentages than others. Nothing more, nothing less. On top of that, there’s going to be a huge amount of variance in performance.

This does not mean that shot stopping is not a skill. It most definitely is. It just indicates that among all factors that dictate a goal keeper’s saves percentage, the spread of skill level in shot stopping among top level goal keepers is very narrow. Other factors that influence goal keeper saves percentage completely overshadow the effect of skill, most notably shot quality, as indicated by ExpG.

 

Goal keeping talent

So, how to scout for goal keeping talent? Start by ignoring saves percentage and you’ll leave most of the scouting world behind. Scouts will be aiming at goal keepers who’ve had randomly high saves percentages in some season, but those goal keepers stand an equal chance next season compared to all other goal keepers. Goal keepers who’ve had the bad luck of noting a low saves percentage season will probably be undervalued by the market.

What signs to look for, if not saves percentage? This piece shows compelling evidence against saves percentages, but it does not say that all goal keepers are equal. Far from that. It may well be that better goal keepers face fewer shots or shots with a lower ExpG. Better goal keepers will give up fewer rebound chances and fewer corners, claim more crosses, distribute balls better or sweep up nicely behind the defense.

All this stuff can be counted, but it’ll be hard to separate it from the effort of defenders. We’ll get to that in time. In the meantime, don’t let yourself be fooled by saves percentages.

How to scout goal scoring talent?

Strikers are the most sought after commodity in football. Having a player who can put the ball in the back of the net more than others can is a highly valuable asset to a football team. So, how to find one? The easiest and most applied way would be to list names and goals, pick a top name, and bingo!

Now, while semantically it is true without a doubt that a top scorer is the guy who scores the most goals, counting goals seems a poor way to identify goal scoring talent. Let’s walk through some simple improvements to do it better.

 

Traditional

We start with this well-known format of player names and goals scored. Easy, right?

Top scorers - Traditional - Eredivisie 2013-14

This table will always do a good job at the top, since players like Finnbogason and Pellè take a ton of shots, and would never show up that high if they did not have true goal scoring skill. But what about players a little lower down the table? Is a player with 7 goals to his name at this half way point of the season doing a good job, or not?

 

Per 90 (G90)

Time to make our first, and very simple, adjustment: a correction for playing time. Just like the smart people at Statsbomb – do check that site out, it’s amazing – I prefer my goal scoring information as per 90. Just divide goals by playing time, expressed in units of 90 minutes, to arrive at that stat. Here’s the table again. I’ve excluded players who’ve played less than half of the season, to prevent Jaïro Riedewald – 2 goals in an 11-minute sub appearance – from skewing the chart.
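A minimal sketch of the per 90 table with a playing time cut-off. The function names, the players and the 810-minute threshold are all invented for illustration. The actual cut-off used here is "half of the season".

```python
def per_90(count, minutes_played):
    """Scale a season total to a rate per 90 minutes played."""
    return count / minutes_played * 90


def g90_table(players, min_minutes):
    """Goals per 90 for players with at least `min_minutes`, best first.

    `players` maps a name to a (goals, minutes played) pair.
    """
    rows = [
        (name, per_90(goals, minutes))
        for name, (goals, minutes) in players.items()
        if minutes >= min_minutes
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented players; the 11-minute substitute is filtered out by the threshold.
players = {"Regular": (10, 1500), "Rotation": (4, 900), "Cameo": (2, 11)}
print(g90_table(players, min_minutes=810))
```

Without the threshold, the substitute with 2 goals in 11 minutes would top the table at more than 16 goals per 90.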

Top scorers - G90 - Eredivisie 2013-14

It won’t make too much of a change at the top, as those players play nearly every possible minute, but still, subtle changes do occur. Behind the identical top-6, new names appear. Lower down the list, we can expect a bigger impact, since here we find players that may be successful impact subs, have been injured, or youngsters who are not the focal point in their teams yet.

 

Non penalty goals per 90 (NPG90)

With the next improvement also comes the next acronym. In the age of quick, Twitter-centered communication, football analytics can’t do without its acronyms. It is not that we don’t value accessibility, since we really do, but acronyms make talking about these metrics possible.

So, we’ll strip out penalties and then look at the goals per 90 again. A second, simple adjustment that corrects for the fact that not all players take an equal amount of penalties, or even take penalties at all. Penalties are one of the best goal scoring opportunities around, but they are very unevenly distributed among the players. So it makes intuitive sense to strip them out when looking for goal scoring talent.
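Stripping penalties is a one-line change to the per 90 calculation. A small sketch, with a hypothetical function name and an invented player:

```python
def npg90(goals, penalty_goals, minutes_played):
    """Non penalty goals per 90 minutes played."""
    return (goals - penalty_goals) / minutes_played * 90


# Invented player: 12 goals, 3 of them penalties, in 1620 minutes.
g90 = 12 / 1620 * 90
print(round(g90, 2), round(npg90(12, 3, 1620), 2))
```

The gap between the two numbers is the part of a player's scoring rate that comes from being the designated penalty taker.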

Top scorers - NPG90 - Eredivisie 2013-14 

Piazon drops a bit, from 0.72 to 0.52, but the most remarkable drop is Aron Jóhansson, who drops out of the top-10 while holding the fourth spot on the G90 table. Four of his 11 goals are penalties. But, AZ fans, don’t worry, Jóhansson will be back later in this piece.

We can take it a step further, and this is where things may start to look more complicated. Don’t worry, because it isn’t complicated and I’ll walk you through the next level.

The main thing that is wrong with the NPG90 table is that not all players have had an equal amount of goal scoring opportunities.

 

Classroom exam

Imagine yourself sitting in a classroom, taking an important exam. On this exam, only correct answers will be counted, with no penalty for wrong answers, and you get a paper filled with just ten questions. A quick look around tells you that other people have been handed more questions, and some even got multiple papers to fit all the questions in. That doesn’t feel right, does it? How could you show your qualifications if they don’t ask you enough questions in the first place?

Now, in football, strikers are at least partly responsible for creating their own goal scoring opportunities, so the metaphor does not hold 100%, but I guess you get the point. And not only do shot numbers differ between players, each shot also has a unique chance of being converted to a goal. In our metaphor each question is on a unique level of difficulty.

So, you may have been handed just ten questions, but if they were all easy-peasy no-brainers, you would still have a good shot at a good grade. In football, it’s the same. Goal scoring opportunities all have their own different level of quality and should be evaluated as such. Raw conversion rates are useless in a game where some people shoot from 30 yards out and others have a style that relies on short range tap-ins.

 

Expected Goals

This is where the Expected Goals, or ExpG, concept comes in. Based on shot location, shot type, assist information and some other factors, we can assign each goal scoring opportunity the correct odds of being scored if an average player was taking the shot. This brings us two separate qualities to evaluate with respect to goal scoring.

1. Which player creates the most goal scoring threat? Obviously, each player’s ExpG is a combined product of striker skill and team mate skill, and on top of that, playing for a top team will bring you more ExpG, just like it is with the traditional method of counting goals.

2. Which player makes the most of his ExpG? Which player scores more goals than his goal scoring opportunities would have brought at the feet of an average player?

In the following diagrams, just like above, penalties have been stripped out to create a fair picture.

 

Goal scoring threat

In terms of goal scoring threat, Graziano Pellè is good for over 0.8 goals per game. He is the spearhead of Feyenoord’s offense and we learn here that an average Eredivisie player should expect nearly a goal per game with the goal scoring opportunities that Pellè and his team mates create for Pellè.

Heerenveen’s Alfred Finnbogason, who leads the traditional chart with 17 goals, comes in at fourth. Twente striker Castaignos and Vitesse striker Havenaar complete the top-3 behind Pellè, which feeds the theory that playing on a big team, and therefore having good team mates around, is obviously of influence here. Remember, this metric stands for creating goal scoring opportunities, which is a combined effort of both the striker himself and his team.

Top ExpG plot Eredivisie 2013-14

Finishing

The second aspect of scoring goals is converting ExpG into goals.

In terms of finishing, the Eredivisie currently holds no better player than Heerenveen’s Alfred Finnbogason. The Icelandic striker manages 4 more goals than an average player would score with his chances. Finnbogason is closely trailed by Vitesse’s Chelsea loanee Piazón and a bit further by AZ’s American striker Aron Jóhansson.

Top scorers plot Eredivisie 2013-14

Graziano Pellè paints a completely different picture here. The Feyenoord striker does best in terms of fashioning chances, but finishing them is a different story. Even an average player would have scored over three more goals than he did. Pellè is not the worst finisher, though. Imagine what Vitesse could have done with a decent finisher in Havenaar’s position.

The green and red bars represent players whose finishing is more than two standard deviations away from the average.

 

In the end

In this post, we’ve come from a traditional list of names and goals scored, to a sophisticated metric to judge goal scoring talent in its most honest way. It seems creating chances for yourself, or allowing team mates to do so, is a different skill from finishing those chances. Only the true top strikers blend these skills.

This metric may also help explain why Graziano Pellè was disappointingly average at AZ and Cesena, but is now seen as a real top scorer. Feyenoord has developed a playing style that runs its offense for a huge part through him, and uses his skills to create goal scoring threat to its maximum. But finishing chances is not one of Graziano’s skills.

Another nice individual to single out is AZ’s Aron Jóhansson. He is fourth in the G90 list, but drops out of the top-10 if we strip his four penalties. The combined ExpG graphs teach us that he ranks low in terms of goal scoring threat, but whatever chances come his way, he finishes with elite skill for this league. He is like a reverse-Havenaar, who accumulates the third most ExpG, but is the worst finisher identified here.

Title Contenders By The Numbers – Early Days Edition

With five matches played, we’ll look at some shot numbers across the Eredivisie Title favorites. Yes, it’s early days, and a lot of this may look different when, after another five matches, team numbers will start to settle at levels closer to their true values. Also, casually, this post will touch on shot quality a lot more than I did in the past. We’ll slowly work to a way of combining shot quality and quantity. An improved TSR, so to say.

 

Struggles

So far, over the first five matches, in terms of points won, each one of Twente, Ajax, PSV and Feyenoord has already had its struggles and none has won more than three matches yet. A look at the numbers will reveal where each team has failed to live up to expectations.

There’s only one team that owns the Eredivisie right now. The Dusan Tadic show that Twente is, dominates in terms of shot creation (123 shots for) and prevention (34 against). By definition, you’ll have the highest Total Shot Rate (TSR) then: 0.783. If you’re still not familiar with football analysis’ most significant stat, let me explain by saying that Twente creates over three times more shots than they concede. A simple plot of each team’s shots for (horizontal axis) and against (vertical axis) will help illustrate just how far ahead of the rest of the pack Twente is: nearly off the chart!
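For readers who like to see the arithmetic, here is a minimal Python sketch of TSR as described above: shots for, divided by all shots in a team’s matches. The function name `total_shots_ratio` is my own label for illustration, not part of any existing library.

```python
def total_shots_ratio(shots_for: int, shots_against: int) -> float:
    """Fraction of all shots in a team's matches that the team took itself."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded, TSR is undefined")
    return shots_for / total


# Twente after five matches: 123 shots for, 34 against.
twente_tsr = total_shots_ratio(123, 34)
print(f"Twente TSR: {twente_tsr:.3f}")
```

A TSR of 0.5 means a team takes exactly as many shots as it concedes, so anything close to 0.8 is extreme.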

 

So, if Twente owns the Eredivisie, they lead the table, right?

Well, no, or at least, not yet. Oddly enough, Twente had trouble scoring in three of their first five matches, leading to two home draws already, and a 1-0 loss at Vitesse. At least they did win the other two games, to make it a 2-2-1 W-D-L record. Twente’s main problem was clutch scoring: 10 of their 11 goals were scored in the two wins. That will always mess up overall ratings like TSR.

 

Shot Quality

Twente’s struggles to score become apparent when we factor in the quality of the 100+ shots that they created. The inclusion of Eredivisie data on Squawka.com enables us to collect several shot characteristics that reflect shot quality. Shot location is the most important factor here, but shots and headers also need to be separated, as they have different conversion rates.

Overall, we can stratify Twente’s shots by location and shot type in order to compare against league wide conversion. The average team would have scored from around 10.5% of Twente’s 123 shots. With this shot quality for (SQF) of just 0.105, Twente comes in just 15th. By the way, combining shot quality and frequency, the model expects Twente to score 12.9 goals (0.105 * 123), which is somewhat ahead of their actual 11 goals scored.
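The combination of quality and quantity can be sketched in a few lines of Python. This is only an illustration of the multiplication described above; the function name `expected_goals` and the variable names are my own, and the inputs are the Twente figures from this post.

```python
def expected_goals(shot_quality: float, shots: int) -> float:
    """Expected goals = average expected conversion per shot * number of shots."""
    return shot_quality * shots


twente_sqf = 0.105    # shot quality for, as quoted in the text
twente_shots = 123    # shots created over five matches
twente_goals = 11     # actual goals scored

twente_xg = expected_goals(twente_sqf, twente_shots)
difference = twente_goals - twente_xg  # negative means scoring below expectation

print(f"Expected goals: {twente_xg:.1f}, actual minus expected: {difference:+.1f}")
```

A negative difference, as here, says the team scored fewer goals than the model expects from its shots.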

 

Misleading TSR

Behind Twente, it’s the usual suspects that complete the TSR top-3: Ajax (0.591) and PSV (0.578). Ajax, however, is one fine example of a misleading TSR! Their 52 shots conceded is the 2nd lowest number in the Eredivisie, but the quality of those conceded shots is a source of major concern. Of the 52 shots conceded by Ajax, a worrying 37 (71.2%) have come from inside the box, and of those 37, the majority have come from central positions inside the box!

This all leads to a shot quality against (SQA) that is not even close to any other team in the Eredivisie: 0.155. So, despite coming in second in terms of the raw number of shots conceded, Ajax comes in 10th in terms of Expected Goals conceded (8.1), which ties in nicely with their 8 goals conceded!

 

PSV

PSV also deserve a mention in the shot quality column, but for their poor SQF. With an expected conversion of just 0.077 they rank 17th in terms of offensive Shot Quality. They did, however, hide that by significantly outperforming the model in terms of actual goals scored. Despite an expected 6.5 goals scored in the model, they managed 12 in real life.

This chart shows PSV’s shots and goals. At first glance, it’s not too bad, is it? But beware, the golden balls representing goals will soon start to dry up, as too many of their attempts are from outside the box and from wide areas within the box. Yes, they often play against compact and tight defenses, but the lack of central zone shooting will cost PSV dearly at some point in the season.

PSV attempted 85 shots, of which 38 (44.7%) were from outside the box. Those shots resulted in two goals, while PSV’s 10 remaining goals were scored with their 47 attempts from inside the box. Another reason for PSV’s poor offensive Shot Quality is the fact that, of their shots from inside the box, under a quarter were fired in from nice central zones, and the vast majority from lateral shooting positions.
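To make the gap between inside and outside the box explicit, here is a small Python sketch that turns the PSV numbers above into conversion rates. The variable names are just my own labels; the figures are the ones quoted in the text.

```python
shots_total = 85
shots_outside_box = 38
goals_outside_box = 2
goals_inside_box = 10

# Everything that was not taken from outside the box came from inside it.
shots_inside_box = shots_total - shots_outside_box

share_outside = shots_outside_box / shots_total
conversion_outside = goals_outside_box / shots_outside_box
conversion_inside = goals_inside_box / shots_inside_box

print(f"Share of shots from outside the box: {share_outside:.1%}")
print(f"Conversion outside the box: {conversion_outside:.1%}")
print(f"Conversion inside the box:  {conversion_inside:.1%}")
```

Over five matches these are small samples, so the rates are a snapshot rather than a stable estimate.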

 

Feyenoord

Should we mention Feyenoord here? Well, last season’s number three had certainly hoped to be title contenders this time around, but three losses to open the season have led to a 2-0-3 record now. Let’s look one layer deeper…

Shots created: 64 (13th), shots conceded 73 (7th), for a TSR of 0.467 (11th). Not good.

Shot quality for: 0.088 (16th, ouch), shot quality against 0.110 (12th, ouch again).

We can factor that into the TSR by looking at Expected Goals scored (5.6) and conceded (8.0), which gives an Expected Goals Ratio of 5.6 / (5.6 + 8.0) = 0.412 (15th).
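As a sketch of how the Expected Goals Ratio follows from the numbers above, here is the calculation in Python. The function name `expected_goals_ratio` is my own shorthand, and the inputs are Feyenoord’s shot counts and shot qualities as quoted in this post.

```python
def expected_goals_ratio(xg_for: float, xg_against: float) -> float:
    """Same idea as TSR, but with every shot weighted by its expected goals."""
    return xg_for / (xg_for + xg_against)


# Feyenoord: shots and shot quality, for and against.
shots_for, shots_against = 64, 73
sqf, sqa = 0.088, 0.110

xg_for = shots_for * sqf          # Expected Goals scored
xg_against = shots_against * sqa  # Expected Goals conceded
ratio = expected_goals_ratio(xg_for, xg_against)

print(f"xG for {xg_for:.1f}, xG against {xg_against:.1f}, ratio {ratio:.3f}")
```

The ratio comes out below the raw TSR because Feyenoord’s shots are worth less, on average, than the shots they concede.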

You still there? Good. Feyenoord’s 11th place TSR of 0.467 is bad already, but a correction for shot quality drops them down even further, to 15th. One small side note: Feyenoord played part of the match against Twente with nine men, which may skew the numbers. A bit.

 

In the end

Of the title contenders Twente, Ajax, PSV and Feyenoord, who had the best start over five matches? This in depth look at the numbers makes a firm case for Twente, as clutch scoring and a disappointing offensive shot quality are better problems to have than what the other teams are dealing with. Also, what Twente lack in terms of offensive shot quality, they make up for in terms of raw numbers with over 20 shots created per match.

Ajax have a horribly high quality of shots against, which explains their high amount of goals conceded (8 in 5 matches, versus 31 in 34 matches over last season). PSV have the reverse problem: a very disappointing shot quality for, but for the moment it is concealed behind an impossible conversion rate of nearly twice the model’s expectations. Feyenoord are mainly mentioned here for last year’s 3rd place finish, as their numbers indicate mid-table quality so far. Sure, they will regress to their true level a bit, but their disappointing opening is down to more than just bad luck.

 

TSR = Total Shots Rate

SQF = Shot Quality For

SQA = Shot Quality Against

 

data: squawka.com

Where analysis starts to meet tactics: Finishing Quality

It’s not just our goal at 11tegen11 over the summer period to feed you with analytics pieces, although at the present rate you may start thinking otherwise. The aim is to provide more detail in analytics and ultimately to remove the barrier between analytics and tactics.

Long term readers will know that 11tegen11 started out, back in 2010, as a pure tactics oriented blog, but since then, slowly the analytics part has crept in. Pure tactical analysis has become a rare commodity here, after analytics took over. The main reason behind this, and I can safely say that now, is that tactical analysis is all hindsight bias.

Early this year, Richard Whitall put it very nicely in one of his State of Analytics columns:

This is not to say that this kind of subjective interpretation of tactical trends, strengths and weaknesses is without value, but I do think it is subject to abuse. For example, it’s too often the case that some writers will imply a strong causal link between a certain, game-specific formation and a final outcome or set of outcomes.

These two sentences kind of bring together why I gave up writing tactical match reports. I missed the evidence to make my statements and basically, explaining tactics in the context of highly luck-driven outcomes felt like the abuse that I just quoted.

So, let’s move on and hope that analytics and tactics will soon merge. I firmly believe that trend has recently started, and it won’t stop soon. Analysts are nothing without tactical content, just as tactics are empty without empirical evidence to back them up.

This post was intended to tackle the issue of finishing quality, so let’s continue and do a little thought experiment. Try and answer this simple question…

What does it take to score a goal?

Simple, right? Creating a shooting opportunity and finding the back of the net. Correct.

In the previous posts, we’ve focused on the first step: creating shooting opportunities. Teams differ in two respects here: better teams create more shooting opportunities and they create better shooting opportunities. Both the number of shots created and the amount of ‘Expected Goals per Shot Created’ nicely correlate with the final league table.

But what’s up with step two? You may create fewer shooting opportunities, or shots with a lower ‘Expected Goals Scored’ number attached to them, but as long as you make up for it by finishing more chances, you’ll be fine. Today’s post will identify Finishing Quality and come up with a simple parameter to judge teams or players by.

Remember that we’ve recently established a number for Expected Goals for each team. The number of Expected Goals is quite simple. Categorize each shot by strike zone and game state and look up the league average conversion rate for that shot. Add the total for all shots, and here we are, a total number of Expected Goals Scored.

We’re now just one step away from establishing Finishing Quality and that requires a comparison with the actual number of goals scored. Score more than the average Eredivisie team does from your shots (same number, same strike zone, same game state) and you’re an above average team when it comes to Finishing Quality. The next graph ranks all teams by Finishing Quality, defined as the number of Actual Goals Scored divided by the number of Expected Goals Scored.
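Finishing Quality as defined above is a single division, sketched here in Python. The function name `finishing_quality` is my own, and the two examples use figures quoted in the team notes of this post (Ajax from Zone 4, and Groningen from Zone 2, where nine goals behind an expected 32 implies 23 scored).

```python
def finishing_quality(actual_goals: float, expected_goals: float) -> float:
    """FQ = Actual Goals Scored / Expected Goals Scored; 1.0 is league average."""
    if expected_goals <= 0:
        raise ValueError("expected goals must be positive")
    return actual_goals / expected_goals


ajax_zone4_fq = finishing_quality(13, 7)        # 13 goals where 7 were expected
groningen_zone2_fq = finishing_quality(23, 32)  # nine goals short of 32 expected

print(f"Ajax Zone 4 FQ: {ajax_zone4_fq:.2f}")
print(f"Groningen Zone 2 FQ: {groningen_zone2_fq:.2f}")
```

Values above 1 mean a team scores more than the average Eredivisie team would from the same shots, values below 1 mean it scores less.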

Ajax and PSV, unsurprisingly, come up as the best finishing teams in the league. Vitesse and Roda complete the top-4 at quite a distance. Heracles is somewhere behind them in 5th place, but certainly higher than their league ranking would suggest. At the back end, Groningen’s problem is uncovered without shame. Perhaps also surprising is Feyenoord, with an FQ of just 0.84.

We’ll take this one step further and provide split numbers per team per zone. In order to keep the number of graphs within limits, I’ve created this Tableau graph (scroll down) where you can flip through all of the teams and see their performances for yourself. I’ll go over all teams in brief, as I believe there are some very interesting numbers.

Pitch zones

ADO
Overall, ADO has slightly overachieved, with an FQ of 1.05. This has led to two more goals than expected, with the best number from Zone 2. Small differences between Expected and Actual Goals though, so hardly any different from the average Eredivisie team.

Ajax
With an FQ of 1.25 Ajax scores a quarter more than the average team would do from their shots. We can see this mainly comes down to long and medium distance shooting. Ajax’ FQ from Zone 3 (20/13 or 1.54) and Zone 4 (13/7 or 1.86) is downright impressive and indicates that they’re doing things quite right from distance.

AZ
Overall, with an FQ of 1.02 AZ are an average team when it comes to finishing. It’s just from distance (Zone 4) that they seem to overachieve a bit.

Feyenoord
Now here’s an interesting one. Third ranked Feyenoord proves to be one of the worst finishing teams of the Eredivisie. Who’d have thought? The previous post had identified them as creating the best chances in the Eredivisie, but their FQ comes in at just 0.84. We can see that this problem arises in all four zones, but scoring only one goal from Zone 4, where five were expected, is quite poor. They took a total of 159 shots from Zone 4, which deserves a more detailed examination in a later piece.

Groningen
The worst team when it comes to Finishing. Regular followers of the Eredivisie will probably know that already, but Groningen’s strikers really can’t find the back of the net. Their long distance performance is poor, but it’s Zone 2 that really catches the eye. Nine goals behind an expected tally of 32: that’s quite the difference between a firm top-6 spot and mid-table.

Heerenveen
Despite having a top striker in Finnbogason, Heerenveen scores below par with an overall FQ of 0.87. Their deficiencies are spread all over the pitch, as they underperform in each zone apart from the true tap-ins of Zone 1.

Heracles
Here’s another remarkable chart. Heracles ranks 5th overall with an FQ of 1.12, and it’s mainly because of their performance in Zone 3. That’s the area just outside the box, or slightly off-central within the box. Another lead for further investigation is born.

N.E.C.
I can’t keep saying they’re all interesting, right? Overall, N.E.C. is poor at an FQ of 0.84, but their long distance efforts overachieve, while their performance from Zone 2 is dreadful. Saved for later.

NAC
Overall around average at an FQ of 1.05. The spread looks a bit like N.E.C., with an overachievement from distance and underachievement closer by. This time, though, the differences are small, and may not even be significant.

PEC Zwolle
With an overall FQ of 0.93, PEC Zwolle don’t surprise, but the graph points out that their quality was mainly in Zone 2, where this newly promoted team outperformed the average Eredivisie level.

PSV
The top scorers of the Eredivisie with 99 non penalty goals have the second highest FQ, at 1.24, just behind Ajax. But in contrast to their title rivals, PSV overachieved from every zone.

RKC
Overall, RKC came in okay, at 0.97. The graph shows it’s their Zone 2 performance that did the trick, while the medium distance shots were the problem area.

Roda
The sign of a team with an excellent striker. Massive overachievement in Zone 2, while the rest is at par. Will be studied on player level, if only to satisfy Sanharib Malki’s fans.

Twente
With an FQ of 0.92, Twente underperformed. Mainly from distance it seems, although a team that held title ambitions prior to the season start should have a better close range strike force than this sub-par Eredivisie level.

Utrecht
Overachiever in the table with their 5th spot, Utrecht did not do so on the basis of Finishing Quality. At just 0.93 overall, they were on par from Zone 2, but disappointed from further out.

Vitesse
Much like the pattern at Roda, Vitesse overachieved (1.22) and mainly did so from Zone 2 and 3. Wilfried Bony, anyone? Stay tuned.

VVV
At 0.84 an unremarkable overall FQ, and the spread across the pitch is quite even.

Willem II
Their FQ is level with VVV at 0.84. The problem has not been to score from distance, but the close range has let them down.

Forget shot numbers, let’s use expected goals instead

“Evaluating the quality, rather than the absolute number, of chances created seems like a worthwhile effort. And with more detailed Eredivisie data on goal scoring attempts available on, hopefully, short notice, this kind of tool might prove a valuable addition to this season’s match reports on 11tegen11.”

It’s been two years since I wrote these words in an article named ‘A chance is a chance is a chance?’. Unfortunately, breaking down chances into expected goals, rather than simply counting shots has not made it to the Eredivisie, or any other league, yet. But times are about to change…

 

Strike Zones and Game States

Using our recent explorations on strike zones and game states, we can stratify shots according to location and match situation and come up with expected goals per shot. This is much more valuable than simply adding shot numbers, as it removes the basic – and incorrect – assumption that all shots are of equal value.

Shot location may be the most influential factor when it comes to shot quality, as we’ve learned from the days when StatDNA still posted quality analysis pieces on their blog, but location isn’t the only factor involved. The most difficult – and therefore often unmentioned – factor is defensive positioning, or defensive pressure. Measuring this in detail would require GPS tracking of all players on the pitch, which I’m sure is done behind closed doors at present, but it generates huge amounts of data, which complicates the analysis a lot. And more importantly, data at such a level of detail is not widely available yet.

We’ll have to do with what we’ve got, and game states serve as a nice proxy for defensive pressure, as we’ve seen that teams trailing by a single goal give up significantly better chances than teams defending a single goal lead, a gap that measures up to around 25%.

 
Expected Goals

The challenge for this post is now to convert all our recent explorations of strike zones and game states into something handy and simple. We need a single number to indicate the quality of shots that teams create and concede, or at player level, a number that indicates the quality of shots that a player took. Simply said, we should know how many goals the average Eredivisie player would have scored from the attempts that a team, or a player, has had. We shall term this ‘Expected Goals Scored’.

Actually, it is a very simple concept. Let’s take Ajax and examine their shots in detail. In total, Ajax created 544 shots, of which 2 penalties are excluded. Here’s a table of Ajax’ 542 remaining shots created per zone and game state.

GS -2 GS -1 GS 0 GS +1 GS +2
Zone 1 0 0 2 0 1
Zone 2 3 22 82 41 44
Zone 3 3 19 64 29 46
Zone 4 2 19 97 29 39

Our previous explorations have shown how many goals are scored per shot for each combination of strike zone and game state. We can now easily compute the expected amount of goals for Ajax’ 542 shots by multiplying both tables.

GS -2 GS -1 GS 0 GS +1 GS +2
Zone 1 0.800 0.857 0.815 0.833 0.667
Zone 2 0.192 0.179 0.190 0.269 0.274
Zone 3 0.059 0.089 0.063 0.103 0.087
Zone 4 0.028 0.033 0.035 0.022 0.059

For example, from Strike Zone 2 at GS 0, Ajax took 82 shots. The league average conversion rate for shots from Strike Zone 2 at GS 0 is 0.190. Therefore, the total Expected Goals Scored for Ajax from Strike Zone 2 at GS 0 is 82 * 0.190 ≈ 15.6.

We can repeat this exercise for all combinations of Strike Zones and Game States and add all the subtotals. This will show that Ajax had 65.35 Expected Goals Scored with their 542 shots. In other words, the average Eredivisie team would have scored 65.35 goals from Ajax’ shots, if we correct for Strike Zone and Game State. Only one small step remains: divide the Expected Goals Scored by the number of shots, and we know the quality of shots that Ajax created: 65.35 / 542 = 0.121 Expected Goals Scored per shot.
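For those who want to redo the sum, here is a minimal Python sketch that multiplies the two tables above cell by cell. The dictionary layout and the name `expected_goals_scored` are my own choices for illustration. Note that the conversion rates are the three-decimal values printed in the table, so the total lands very slightly away from the 65.35 quoted in the text, which presumably comes from unrounded rates.

```python
# Columns are Game States -2, -1, 0, +1, +2; keys are Strike Zones 1 to 4.
ajax_shots = {
    1: [0, 0, 2, 0, 1],
    2: [3, 22, 82, 41, 44],
    3: [3, 19, 64, 29, 46],
    4: [2, 19, 97, 29, 39],
}

league_conversion = {
    1: [0.800, 0.857, 0.815, 0.833, 0.667],
    2: [0.192, 0.179, 0.190, 0.269, 0.274],
    3: [0.059, 0.089, 0.063, 0.103, 0.087],
    4: [0.028, 0.033, 0.035, 0.022, 0.059],
}


def expected_goals_scored(shots: dict, conversion: dict) -> float:
    """Sum of shots * league conversion over every zone / game state cell."""
    return sum(
        n * rate
        for zone in shots
        for n, rate in zip(shots[zone], conversion[zone])
    )


total_shots = sum(sum(row) for row in ajax_shots.values())
xg = expected_goals_scored(ajax_shots, league_conversion)
xg_per_shot = xg / total_shots

print(f"{total_shots} shots, {xg:.2f} expected goals, {xg_per_shot:.3f} per shot")
```

The same function works for shots conceded: feed it the opponent’s shots per zone and game state instead.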

 

Quality of Shots Created

We can repeat the trick for each team to come up with the following graph. The bars represent the quality of the shots that teams created.

There is a considerable spread in quality of shots created. PSV and Feyenoord may expect 0.129 and 0.128 goals per shot, while Willem II creates chances that result in only 0.101 goals per shot. In other words, the type of shots that PSV and Feyenoord create are generally worth 27% more than shots by Willem II. PSV and Feyenoord are followed by the teams that also complete the top-6 in the final league standing, and Roda. In general, the quality of shots created nicely correlates with the league table, with Roda being the big exception. Roda finished 16th in the table, but comes up 4th in terms of the quality of shots created.  

 

Quality of Shots Conceded

We can do the same thing for shots conceded, and measure the quality of shots conceded. This time, of course, lower bars indicate fewer expected goals per shot conceded, as an indicator of quality defending.

Again, there is a considerable spread when we compare the best team, Groningen, with the worst team, Heerenveen. Groningen earns their top spot in this chart by doing an excellent job in forcing their opponents to shoot from low quality positions (Zone 4), as we’ve seen previously. In contrast to the Offensive Shot Quality, there is no clear correlation between Defensive Shot Quality and the final league positions. It seems that quality of shots created is a better way than quality of shots conceded to tell good and bad teams apart.

 

In the end

It’s always a good thing if analysis and observation start to overlap, and with more detailed information to work with, we’re slowly getting there. The mini-series of posts this past week has now led to a simple parameter called Expected Goals, which we can either express per shot, over a match, or over a series of matches. It has an offensive and a defensive side and the former can be applied to teams and players, while the latter is limited to team level, since shots conceded can’t be linked to single defenders.

Next up will be a series where we will compare the outcome in terms of goals scored and conceded with the Expected Goals scored and conceded. The Expected Goals parameter estimates the quality of the shots, while the difference with the actual outcome is an indicator of Finishing Quality, or its defensive equivalent.

Which team creates the best shots of the Eredivisie?

The secret behind Feyenoord’s successful campaign? Creating high quality shots. The reason VVV and RKC did not live up to the expectations of the predictive models? Too many low quality shots. This post will use shot location to capture a challenging part of performance: differences in shot quality. And yes, some teams create better chances than others…

 

Strike zones

Last week’s post has identified four different strike zones on a football pitch. These zones have been chosen so that Zone 1 identifies near-certain goals: shots with conversion rates of over 80%. Shots from Zone 1 are rare (< 1%), but nevertheless have a huge impact. Zones 2 to 4 all contain over 25% of shots, but with sharply diminishing success rates. Shots from Zone 2 are considered high quality shots, and roughly 1 of 4 shots results in a goal. From Zone 3 this number is 1 of 13 and from Zone 4 just 1 in 30. For convenience, we could say that shots from Zone 1 are worth three shots from Zone 2, which are worth three shots from Zone 3, which in turn are worth three shots from Zone 4.
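The ‘worth three shots’ shorthand is a convenience, and a quick Python sketch shows how rough it is. The rates below are the approximate figures from the paragraph above; the 0.80 for Zone 1 is an assumption on my part, since the text only says ‘over 80%’.

```python
# Approximate conversion rate per zone, taken from the text above.
# Zone 1 is given as "over 80%", so 0.80 is an assumed lower bound.
zone_conversion = {1: 0.80, 2: 1 / 4, 3: 1 / 13, 4: 1 / 30}


def relative_worth(better_zone: int, worse_zone: int) -> float:
    """How many shots from the worse zone equal one shot from the better zone."""
    return zone_conversion[better_zone] / zone_conversion[worse_zone]


ratios = [relative_worth(zone, zone + 1) for zone in (1, 2, 3)]
for zone, ratio in zip((1, 2, 3), ratios):
    print(f"One Zone {zone} shot is worth about {ratio:.1f} Zone {zone + 1} shots")
```

The step from Zone 3 to Zone 4 is a bit smaller than the other two, so the factor of three is best read as a rule of thumb.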

Pitch zones

Using these zones allows us to study the basic fallacy of our beloved shot metric, Total Shots Ratio (TSR). Remember, TSR is a simple, but very successful metric, that uses the fraction of shots that teams take in their matches. TSR has been shown to correlate very well with the number of points that teams obtain and it has proven its usefulness in identifying teams that experience lucky or unlucky streaks.

 

All shots are equal, or not?

The main concern with TSR, however, is that all shots are treated equally. In the simple TSR model, shots from 40 yards out count the same as one yard tap-ins. Over the long run, most of the differences between shots will even out, but only if we assume that shot quality is evenly balanced between teams. But is that really the case? Don’t certain teams create better chances than others? If so, we could better understand why certain teams did not live up to the expectations of the TSR model, while others have seemingly over performed.

The build-up of this post will be to look at all Eredivisie teams, and identify where they took most of their shots. Let’s get started with the raw numbers. I’ve excluded penalties for this analysis.

Zone 1 Zone 2 Zone 3 Zone 4
ADO 2 118 101 159
Ajax 3 192 161 186
AZ 4 161 126 160
Feyenoord 6 206 152 159
Groningen 2 157 93 174
Heerenveen 3 154 148 144
Heracles 3 141 138 176
NAC 3 118 82 131
N.E.C. 1 171 118 164
PEC Zwolle 1 133 99 165
PSV 8 225 172 213
RKC 4 100 122 135
Roda 4 137 89 110
Twente 4 181 134 152
Utrecht 3 178 118 153
Vitesse 2 172 109 146
VVV 2 120 112 147
Willem II 2 122 87 152

Thanks to my friend Benjamin Pugsley, this table format has now been improved a lot, and you can even sort the columns yourself! But from an analytics view, this data is not really accessible, is it? Let’s go through a few numbers in detail and see what we can work out from there…

PSV have dominated the shots department and have created the highest number of shots in each of the three zones. Other outliers are a remarkably high number of shots from Zone 2 by Feyenoord – that’s a good thing – and a remarkably low number of shots from that zone by RKC – that’s a bad thing. Roda tend to shy away from taking long shots, with only 110 attempts from Zone 4, but they tend to shy away from creating shots from any zone, so, alas, not necessarily a good thing for them.

Relativity

Clearly, this table needs relativity. Teams like Roda may do well by avoiding futile shots (Zone 4) at goal, but they hardly create any quality attempts (Zone 2) either.

Zone 2 Zone 3 Zone 4
ADO 0.31 0.27 0.42
Ajax 0.35 0.3 0.34
AZ 0.36 0.28 0.35
Feyenoord 0.39 0.29 0.3
Groningen 0.37 0.22 0.41
Heerenveen 0.34 0.33 0.32
Heracles 0.31 0.3 0.38
NAC 0.35 0.25 0.39
N.E.C. 0.38 0.26 0.36
PEC Zwolle 0.33 0.25 0.41
PSV 0.36 0.28 0.34
RKC 0.28 0.34 0.37
Roda 0.4 0.26 0.32
Twente 0.38 0.28 0.32
Utrecht 0.39 0.26 0.34
Vitesse 0.4 0.25 0.34
VVV 0.31 0.29 0.39
Willem II 0.34 0.24 0.42

This is basically the same information, but presented in a slightly different way. PSV stood out in the previous table, because they had the highest amount of shots in each zone, but the spread across the zones is not remarkable. RKC stand out as a team with big trouble creating quality shots (Zone 2), while Feyenoord, Vitesse, Utrecht and Twente succeeded in creating a high proportion of shots from Zone 2.

Willem II, PEC Zwolle, ADO and Groningen take over 40% of shots from Zone 4, where teams on average need around 30 shots for a goal, while Feyenoord, Heerenveen, Roda and Twente limit themselves to less than 33% of shots from Zone 4.
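The step from raw counts to proportions can be sketched in Python as follows. I have only copied three teams from the raw table as an example, and the function name `zone_shares` is my own. The shares are taken over all non penalty shots, Zone 1 included, which is what reproduces the numbers in the relative table.

```python
# Raw shot counts per team for Zones 1 to 4 (three teams from the table above).
raw_shots = {
    "ADO": [2, 118, 101, 159],
    "Feyenoord": [6, 206, 152, 159],
    "RKC": [4, 100, 122, 135],
}


def zone_shares(counts: list) -> list:
    """Each zone's share of a team's total shots, rounded to two decimals."""
    total = sum(counts)
    return [round(count / total, 2) for count in counts]


for team, counts in raw_shots.items():
    zone1, zone2, zone3, zone4 = zone_shares(counts)
    print(f"{team}: Zone 2 {zone2:.2f}, Zone 3 {zone3:.2f}, Zone 4 {zone4:.2f}")
```

Feyenoord and RKC take the same share of shots from Zone 3 and beyond in very different ways, which is exactly what the raw counts hide.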

 

Defense

We can repeat this trick for shots conceded, to try and identify teams that give up high quality chances and teams that give up mainly low quality chances. Here’s a similar table to the one above, but this time for the defensive side of things.

Zone 2 Zone 3 Zone 4
ADO 0.41 0.3 0.29
Ajax 0.33 0.31 0.36
AZ 0.31 0.29 0.39
Feyenoord 0.36 0.25 0.38
Groningen 0.31 0.24 0.44
Heerenveen 0.45 0.23 0.31
Heracles 0.38 0.31 0.3
NAC 0.36 0.25 0.38
N.E.C. 0.37 0.28 0.36
PEC Zwolle 0.38 0.29 0.33
PSV 0.33 0.28 0.38
RKC 0.4 0.24 0.35
Roda 0.36 0.28 0.35
Twente 0.32 0.23 0.45
Utrecht 0.32 0.26 0.41
Vitesse 0.34 0.29 0.36
VVV 0.41 0.27 0.31
Willem II 0.35 0.31 0.33

Painful news for Heerenveen, who give up a record high 45% of shots from Zone 2, followed at some distance by ADO and VVV at 41% and RKC at 40%. Better numbers in that regard for AZ and Groningen, who lead a bunch of teams with just over 30% of shots from Zone 2.
Remember, you can sort the table yourself by clicking on the header.

Twente and Groningen perform well in terms of forcing their opponents into low quality chances (Zone 4), while ADO, Heracles, Heerenveen and VVV could do a lot better in that regard.

 

In the end

Lots of numbers, but don’t let that scare you away. We’ll wrap the concept of shot quality differences up in a later post, where we’ll fuse a lot of these concepts to make things simpler.

Using shot locations and league wide conversion from our well defined zones, we can see which teams did well in terms of creating shots from quality positions, while preventing their opponents from doing so. At the very least, it seems there are relevant differences between teams in terms of shot quality, which in itself is bad news for models based on TSR. And for the near future, this type of detailed analysis can help us to slowly build towards a better synthesis of analysis and tactics…

A deeper look at shot locations: we still need Game State

Sometimes, what we have intuitively known for a long time makes sense in the numbers too. That’s probably a good sign and it is certainly true for shot location. Basically, the further away from goal and the further from the midline, the worse a shot is. But what about that other new concept, Game State? Can we safely leave that out now that we’ve got shot location? It seems not…

 

Four zones

Yesterday, we saw that we can reasonably split a football pitch into four zones, conveniently named Zone 1 to 4, with each higher numbered zone cutting the conversion chance to about a third. For clarity, here’s the diagram of the zones again.

Pitch zones

Shots from Zone 1 are rare (< 1%), but they do deserve their own category because of their extremely high conversion rate (>80%). I will leave Zone 1 out of the next table, since the low numbers only make things messy without adding value.

 

Conversion

The next table focuses on conversion rates at different Game States in Zones 2 to 4. Note that Game State represents score differential, but Game States -2 and +2 contain all teams trailing or leading by two or more goals.

Overall conversion GS -2 GS -1 GS 0 GS +1 GS +2
Zone 2 0.222 0.212 0.194 0.211 0.251 0.268
Zone 3 0.082 0.056 0.076 0.072 0.118 0.087
Zone 4 0.034 0.016 0.030 0.033 0.030 0.074

As we already know, overall conversion drops sharply as we move up the Zones. However, the most interesting finding is that conversion within a zone is strongly linked to Game State. This means that using shot location is important, but independently, one should also factor in Game State. So, high quality shots are even more high quality if a team is leading a match.

Obviously, shot quality is not directly influenced by the score board, but we consider it a proxy for defensive positioning. As teams chase a goal, they give up defensive effort to gain more offense. Subsequently, as we now find out, their opponents can generate higher quality shots, independent of location.

 

Leading and chasing

So leading teams fire in better shots, but do Game States also influence the amount of shots that teams take from different zones? Do leading teams fire in more close range shots?

To answer that question, we need to study shots, zones and Game States in a slightly different way. The next table shows the relative amount of shots that teams take from Zones 2 to 4 given their particular Game State. Remember, Zone 1 is left out, so numbers in the columns don’t exactly add up to 100%.

Overall shots GS -2 GS -1 GS 0 GS +1 GS +2
Zone 2 0.357 0.360 0.354 0.348 0.360 0.387
Zone 3 0.277 0.270 0.267 0.265 0.296 0.312
Zone 4 0.358 0.361 0.371 0.379 0.335 0.293

Overall, teams take most shots either from Zone 2, which is mainly the central penalty box area, where conversion lies around 22%, or from Zone 4, which is the near hopeless area of 3.4% conversion. But Game States do influence where teams take their shots from.

As expected, teams that defend a lead decrease the frequency of low quality shots (Zone 4): to 33.5% with a one-goal lead, and to below 30% with a lead of two goals or more. Teams leading by two or more also have the highest proportion of high quality shots (Zone 2, > 38%).

Interestingly, the highest proportion of low quality shots (Zone 4) is not fired in by trailing teams, but at an even Game State. This links in well with the earlier observation that conversion rates dip significantly at this Game State. It is true, however, that teams chasing a single goal (GS -1) also fire in a high proportion of low quality shots (Zone 4, > 37%).

Conclusion

If we link these two tables together, we can learn that leading teams take more shots from good positions (Zone 2) and fewer from hopeless positions (Zone 4), but their conversion rate from each zone is also significantly higher. So, it works two ways when teams chase the game: they sacrifice defense by giving up more Zone 2 shots, and those shots also stand a higher chance of finding the net.
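Linking the two tables can be done literally, as in this Python sketch: multiply each zone’s share of shots by its conversion rate and add up per Game State. I read the first value in each table row as the overall figure and the next five as GS -2 to GS +2. Zone 1 is left out, as in the tables, so the result is an approximate goals per shot figure rather than a complete one. The names are my own.

```python
game_states = ["-2", "-1", "0", "+1", "+2"]

# Share of shots per zone at each Game State (shots table, overall column dropped).
shot_share = {
    2: [0.360, 0.354, 0.348, 0.360, 0.387],
    3: [0.270, 0.267, 0.265, 0.296, 0.312],
    4: [0.361, 0.371, 0.379, 0.335, 0.293],
}

# Conversion per zone at each Game State (conversion table, overall column dropped).
conversion = {
    2: [0.212, 0.194, 0.211, 0.251, 0.268],
    3: [0.056, 0.076, 0.072, 0.118, 0.087],
    4: [0.016, 0.030, 0.033, 0.030, 0.074],
}

# Approximate goals per shot at each Game State: sum of share * conversion over zones.
goals_per_shot = {
    gs: sum(shot_share[zone][i] * conversion[zone][i] for zone in shot_share)
    for i, gs in enumerate(game_states)
}

for gs, value in goals_per_shot.items():
    print(f"GS {gs}: {value:.3f} goals per shot (Zones 2 to 4 only)")
```

On these numbers the value of an average shot climbs steadily from teams trailing by two or more to teams leading by two or more, which is the two-way effect described above rolled into one figure.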