How to scout goal scoring talent?

Strikers are the most sought after commodity in football. Having a player who can put the ball in the back of the net more than others can is a highly valuable asset to a football team. So, how to find one? The easiest and most applied way would be to list names and goals, pick a top name, and bingo!

Now, while it is semantically true beyond doubt that a top scorer is the guy who scores the most goals, counting goals is a poor way to identify goal scoring talent. Let’s walk through some simple improvements to do it better.

 

Traditional

We start with this well-known format of player names and goals scored. Easy, right?

Top scorers - Traditional - Eredivisie 2013-14

This table will always do a good job at the top, since players like Finnbogason and Pellè take a ton of shots, and would never show up that high if they did not have true goal scoring skill. But what about players a little lower down the table? Is a player with 7 goals to his name at this half way point of the season doing a good job, or not?

 

Per 90 (G90)

Time to make our first, and very simple, adjustment: a correction for playing time. Just like the smart people at StatsBomb – do check that site out, it’s amazing – I prefer my goal scoring information per 90 minutes. Just divide goals by minutes played and multiply by 90 to arrive at that stat. Here’s the table again. I’ve excluded players who’ve played less than half of the season, to prevent Jaïro Riedewald – 2 goals in an 11-minute sub appearance – from skewing the chart.
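For readers who like to see it spelled out, here is a minimal sketch of the per-90 adjustment and the playing-time filter. The names `PlayerSeason`, `goals_per_90` and `g90_table` are made up for this example, the player numbers are invented, and the filter assumes "half of the season" means half of the minutes available so far.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    name: str
    goals: int
    minutes: int


def goals_per_90(goals: int, minutes: int) -> float:
    """Goals scored per 90 minutes of playing time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return goals * 90 / minutes


def g90_table(players, minutes_available, min_share=0.5):
    """Rank players by G90, keeping only those who played at least
    `min_share` of the minutes available so far (an assumed cutoff)."""
    threshold = min_share * minutes_available
    eligible = [p for p in players if p.minutes >= threshold]
    rows = [(p.name, goals_per_90(p.goals, p.minutes)) for p in eligible]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented numbers, for illustration only (not real Eredivisie data).
players = [
    PlayerSeason("Striker A", goals=12, minutes=1530),
    PlayerSeason("Striker B", goals=7, minutes=900),
    PlayerSeason("Late sub", goals=2, minutes=11),
]

# Assumes 17 matches of 90 minutes have been played so far.
table = g90_table(players, minutes_available=17 * 90)
for name, g90 in table:
    print(f"{name}: {g90:.2f}")
```

Without the filter, the late substitute would top the list with more than 16 goals per 90, which is exactly the kind of skew the cutoff is there to prevent.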

Top scorers - G90 - Eredivisie 2013-14

It won’t make too much of a change at the top, as those players play nearly every possible minute, but still, subtle changes do occur. Behind the identical top-6, new names appear. Lower down the list, we can expect a bigger impact, since here we find players who may be successful impact subs, have been injured, or are youngsters who are not the focal point of their teams yet.

 

Non penalty goals per 90 (NPG90)

With the next improvement also comes the next acronym. In the age of quick, Twitter-centered communication, football analytics can’t do without its acronyms. It is not that we don’t value accessibility, since we really do, but acronyms make talking about these metrics possible.

So, we’ll strip out penalties and then look at the goals per 90 again. A second, simple adjustment that corrects for the fact that not all players take an equal number of penalties, or even take penalties at all. Penalties are among the best goal scoring opportunities around, but they are very unevenly distributed among the players. So it makes intuitive sense to strip them out when looking for goal scoring talent.

Top scorers - NPG90 - Eredivisie 2013-14 

Piazon drops a bit, from 0.72 to 0.52, but the most remarkable drop is Aron Jóhansson, who drops out of the top-10 while holding the fourth spot on the G90 table. Four of his 11 goals are penalties. But, AZ fans, don’t worry, Jóhansson will be back later in this piece.
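The penalty adjustment is a one-line change to the per-90 calculation. A minimal sketch, where the function name `npg90` and the minutes figure are assumptions for illustration:

```python
def npg90(goals: int, penalty_goals: int, minutes: int) -> float:
    """Non-penalty goals per 90 minutes of playing time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    if not 0 <= penalty_goals <= goals:
        raise ValueError("penalty_goals must be between 0 and goals")
    return (goals - penalty_goals) * 90 / minutes


# A player with 11 goals, 4 of them penalties. The 1350 minutes are
# invented for illustration and are not taken from the tables.
with_pens = npg90(11, 0, 1350)
without_pens = npg90(11, 4, 1350)
print(f"G90 {with_pens:.2f} -> NPG90 {without_pens:.2f}")
```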

We can take it a step further, and this is where things may start to look more complicated. Don’t worry, because it isn’t complicated and I’ll walk you through the next level.

The main thing that is wrong with the NPG90 table is that not all players have had an equal number of goal scoring opportunities.

 

Classroom exam

Imagine yourself sitting in a classroom, taking an important exam. On this exam, only correct answers will be counted, with no penalty for wrong answers, and you get a paper filled with just ten questions. A quick look around tells you that other people have been handed more questions; some even got multiple papers to fit all the questions in. That doesn’t feel right, does it? How could you show your qualifications if they don’t ask you enough questions in the first place?

Now, in football, strikers are at least partly responsible for creating their own goal scoring opportunities, so the metaphor does not hold 100%, but I guess you get the point. And not only do shot numbers differ between players, each shot also has a unique chance of being converted to a goal. In our metaphor each question is on a unique level of difficulty.

So, you may have been handed just ten questions, but if they were all easy peasy no-brainers, you would still have a good shot at a good grade. In football, it’s the same. Goal scoring opportunities all have their own level of quality and should be evaluated as such. Raw conversion rates are useless in a game where some people shoot from 30 yards out and others have a style that relies on short range tap-ins.

 

Expected Goals

This is where the Expected Goals, or ExpG, concept comes in. Based on shot location, shot type, assist information and some other factors, we can assign each goal scoring opportunity the odds of being scored if an average player were taking the shot. This gives us two separate qualities to evaluate with respect to goal scoring.

1. Which player creates the most goal scoring threat? Obviously, each player’s ExpG is a combined product of striker skill and team mate skill, and on top of that, playing for a top team will bring you more ExpG, just like it is with the traditional method of counting goals.

2. Which player makes the most of his ExpG? Which player scores more goals than his goal scoring opportunities would have brought at the feet of an average player?

In the following diagrams, just like above, penalties have been stripped out to create a fair picture.
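To make the idea concrete, here is a toy sketch of how a per-shot probability could be computed and summed into a player’s ExpG. It uses a logistic form, the family of model mentioned in the comments below, but it is not the actual model behind these charts: the two features, the function names and every coefficient are invented for illustration.

```python
import math

# Hypothetical coefficients, invented for illustration only. A real model
# would estimate them from a large set of historical shots and would use
# more variables than these two.
INTERCEPT = -0.3
COEF_DISTANCE_M = -0.12  # assumed: odds fall with every metre from goal
COEF_HEADER = -0.8       # assumed: headers go in less often than shots


def shot_expg(distance_m: float, header: bool) -> float:
    """Estimated chance that an average player scores this attempt."""
    z = INTERCEPT + COEF_DISTANCE_M * distance_m + COEF_HEADER * header
    return 1 / (1 + math.exp(-z))


def player_expg(shots) -> float:
    """A player's ExpG is the sum of the chances of all his attempts."""
    return sum(shot_expg(distance_m, header) for distance_m, header in shots)


# Invented attempts: (distance to goal in metres, header or not).
shots = [(6.0, False), (11.0, True), (25.0, False)]
total = player_expg(shots)
print(f"ExpG over {len(shots)} attempts: {total:.2f}")
```

The point of the sketch is the shape of the calculation: every attempt gets its own probability between 0 and 1, and a tap-in counts for far more than a shot from 25 metres.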

 

Goal scoring threat

In terms of goal scoring threat, Graziano Pellè is worth over 0.8 goals per game. He is the spearhead of Feyenoord’s offense, and we learn here that an average Eredivisie player should expect nearly a goal per game from the goal scoring opportunities that Pellè and his team mates create for him.

Heerenveen’s Alfred Finnbogason, who leads the traditional chart with 17 goals, comes in fourth. Twente striker Castaignos and Vitesse striker Havenaar complete the top-3 behind Pellè, which feeds the theory that playing on a big team, and therefore having good team mates around, clearly has an influence here. Remember, this metric stands for creating goal scoring opportunities, which is a combined effort of both the striker himself and his team.

Top ExpG plot Eredivisie 2013-14

Finishing

The second aspect of scoring goals is converting ExpG into goals.

In terms of finishing, the Eredivisie currently holds no better player than Heerenveen’s Alfred Finnbogason. The Icelandic striker manages 4 more goals than an average player would score with his chances. Finnbogason is closely trailed by Vitesse’s Chelsea loanee Piazón and a bit further by AZ’s American striker Aron Jóhansson.
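In code, the finishing measure is simply non-penalty goals minus ExpG. A minimal sketch with invented shot probabilities; `finishing_delta` is a name made up for this example:

```python
def finishing_delta(non_penalty_goals: int, shot_probs) -> float:
    """Goals scored minus the goals an average player would be expected
    to score from the same attempts (the sum of the shot probabilities)."""
    return non_penalty_goals - sum(shot_probs)


# Invented per-shot probabilities that add up to an ExpG of 1.5.
shot_probs = [0.5, 0.25, 0.25, 0.125, 0.125, 0.25]
delta = finishing_delta(3, shot_probs)
print(f"Goals above expectation: {delta:+.1f}")
```

A positive number means the player scored more than an average finisher would have from those chances, a negative number means fewer.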

Top scorers plot Eredivisie 2013-14

Graziano Pellè paints a completely different picture here. The Feyenoord striker does best in terms of fashioning chances, but finishing them is a different story. Even an average player would have scored over three more goals than he did. Pellè is not the worst finisher, though. Imagine what Vitesse could have done with a decent finisher in Havenaar’s position.

The green and red bars represent players whose finishing is more than two standard deviations away from the average.
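How the standard deviation is computed is not spelled out here, so the following is only one possible reading, not necessarily the one used for the chart. It assumes every shot is an independent yes/no event with its own scoring probability, so the variance of a player’s goal total is the sum of p × (1 − p) over his shots. The function names and the example numbers are invented.

```python
import math


def finishing_z_score(goals: int, shot_probs) -> float:
    """Standard deviations between actual goals and ExpG, assuming
    independent shots (an assumption made for this sketch)."""
    expected = sum(shot_probs)
    variance = sum(p * (1 - p) for p in shot_probs)
    if variance == 0:
        raise ValueError("need at least one shot with 0 < p < 1")
    return (goals - expected) / math.sqrt(variance)


def is_outlier(goals: int, shot_probs, threshold: float = 2.0) -> bool:
    """True if finishing is more than `threshold` deviations from ExpG."""
    return abs(finishing_z_score(goals, shot_probs)) > threshold


# Invented example: 40 attempts, each with a 10% chance of going in.
probs = [0.1] * 40
for goals in (0, 6, 8):
    z = finishing_z_score(goals, probs)
    print(f"{goals} goals: z = {z:+.2f}, outlier = {is_outlier(goals, probs)}")
```

With an ExpG of 4 from those 40 attempts, six goals is still within the noise under this reading, while eight goals or none at all would be flagged.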

 

In the end

In this post, we’ve come from a traditional list of names and goals scored to a more sophisticated metric that judges goal scoring talent in a more honest way. It seems creating chances for yourself, or allowing team mates to do so, is a different skill from finishing those chances. Only the true top strikers blend these skills.

This metric may also help explain why Graziano Pellè was disappointingly average at AZ and Cesena, but is now seen as a real top scorer. Feyenoord has developed a playing style that runs a huge part of its offense through him, and makes maximum use of his skill at creating goal scoring threat. But finishing chances is not one of Graziano’s skills.

Another nice individual to single out is AZ’s Aron Jóhansson. He is fourth in the G90 list, but drops out of the top-10 if we strip his four penalties. The combined ExpG graphs show us that he is way too low in terms of goal scoring threat, but what he gets thrown at him, he finishes with elite skill for this league. He is like a reverse-Havenaar, who racks up the third most ExpG, but is the worst finisher identified here.

14 thoughts on “How to scout goal scoring talent?”

  1. Benny

    As always, a very interesting article and well explained. But there was one point where I could not quite follow: “This is where the Expected Goals, or ExpG, concept comes in. Based on shot location, shot type, assist information and some other factors, we can assign each goal scoring opportunity the correct odds of being scored if an average player was taking the shot.”

    Forgive me for my ignorance, but how exactly do you do that?

    Thanks and keep up the great work,
    Benny

    1. 11tegen11 Post author

      Thanks, Benny!

      ExpG means that you assign each shot the odds of resulting in a goal. Variables used here are shot location (most important), shot type (headers versus shots), assists (as they indicate a shot is not a rebound, or any other quick takeover), and so on.
      Behind this concept hides a complex logistic regression analysis to determine if certain variables do influence the odds of a shot resulting in a goal, and to what degree.
      I won’t bore readers with the mathematical side of things, though I think it’s a good sign that you, among other people, are asking this question. Don’t take things for granted!

  2. p

    That’s a good read. Thanks.

    I think one potential problem with the above approach is that it takes performance at face value, rather than attempting to identify the talent behind it. For example, how much of the finishing quality of the players at the top of the last figure was due to their inherent skill and how much was due to factors beyond their control? The answer will determine to what extent we can expect the performance to carry over to future games, which is the only thing that really matters. There is some work out there that attempts to do this, although in a simpler shot generation/conversion setting.

    1. 11tegen11 Post author

      Absolutely agree with you on that one. Repeatability is the one thing that metrics should have. Otherwise we end up assigning value to random stuff and following noise instead of signal.
      You’ll understand that it would take this introductory piece a step too far if I started throwing scatter plots and R-squared values around immediately, but that will come. Soon enough… Important point!

      1. Goalimpact

        I wouldn’t demand repeatability from a metric, but rather that the metric tells us something about the future instead of describing what happened in the past. If we expect the future to equal the past, then we have repeatability, and this is often the case. However, some things change over time, and then we would be more interested in the expected realization of the metric in the next games rather than in the previous ones.

        I just randomly posted this statement under your article. I find your work and analytical thinking great. But I often read the wish for repeatability from many authors, so I felt the urge to clarify this.

  3. Conor

    Great article again, I think this is a much purer way of identifying good strikers. However, one thing that I think could further improve the measure of how good a finisher a player is, is to use Colin Trainor and Constantinos Chappas’ ExpG2 model (or similar) instead of actual goals. For example, if a player takes a shot that is placed right in the top corner it might have an ExpG2 of close to 1, but if the gk somehow manages to save it, then it seems a little unfair to record it as no goal scored. Using ExpG2 should give more accurate results than actual goals scored, remove variance, and thus give you a much better idea of how good a finisher a player actually is.

    1. 11tegen11 Post author

      Thanks, Conor.

      I am well aware of Colin and Constantinos’ ExpG2 model, but I have my doubts about (A) whether placing shots is a skill that can be repeatedly demonstrated by professional strikers in a football match, and (B) whether the influence of that factor is enough to warrant inclusion in an ExpG model. For all I know, strikers may be quite equally skilled in this respect (remember they are all elite professional strikers) and then we allow a lot of noise to enter our model out of fear of missing a little piece of signal…
      For now, no shot placements for me, but I’m willing to change my opinion should data show that I should.

  4. jalal102

    Amazing article!! I wonder what stats would be helpful in identifying midfield talent. E.g. which players are great at making key passes? Is it the influence of the system the team utilizes, with more runs from the strikers? How safe is your defensive midfielder when it comes to controlling the ball? E.g. pass accuracy for Busquets is much higher due to the team he plays in. If you’ve already figured it out, please do write an article on that topic.

    1. 11tegen11 Post author

      Thanks a lot!
      It is no coincidence that I start with the scouting piece for goal scoring talent, since that is probably the easiest skill to isolate. And it isn’t even fully possible to isolate it.
      There is more coming up along the same lines though. And it will indeed be a challenge to separate team and individual efforts without overcomplicating matters.
      Perhaps you have already read the recent piece by Ted Knutson on StatsBomb about defensive midfielders. If not, check this!

  5. Tom

    Thanks for another interesting read, I’m really enjoying your analyses.

    Just a thought: Like any analysis on goal scoring, I would assume that you accept that there is some element of luck in the final ranking of actual goals vs ExpG, because of the small sample size, and because the variation even between (say) position 3 on the chart and position 9 is probably less than one goal. This means that one lucky break, e.g. if a shot that hit the post had gone in instead, would allow someone to jump several places in the table. So given that these limitations already exist in the data, I wonder if excluding players who have played only half of the games so far is a little bit too harsh? I mean, I agree it makes no sense to have Riedewald in there, but the data for someone like Klaassen who has had 10 appearances may not be that much worse than for players who have played the full 17 games so far…

  6. jeremad

    Hi guys, this is a huge and fantastic piece of work. I’m a French engineer, and obviously a French league fan. I would like to do similar work for the French league. Is it possible to access your formula here?

    Anyway, keep on what you’re doing, it’s great!

    Jeremy, from France!

    1. 11tegen11 Post author

      Hi Jeremy,

      Merci beaucoup pour ce mots genereux!
      Thank you for those kind words…

      I’m hoping to do a round across Europe (top 5 leagues at least), come the first international break.
      For now, there’s too much current football going on, but stay tuned and you’ll be served!
