We are busy people. Most of us are in our twenties or thirties, have demanding day jobs, and a partner or family we love to attend to. And, just for fun, a few years back we opened a blog and wrote something about football and numbers. We liked it, so we wrote some more, and kept on doing so. We styled ourselves something of an online community of football analytics bloggers, and by now we’ve been around for years.
But something is changing. Most established football analytics blogs have experienced a severe drop in articles over the past year or so, and 11tegen11 is no exception to that trend. We are busy with our jobs, lives and families. Football writing can wait a moment, and another moment, and another moment. To the point where I started believing my own lie that I couldn’t find the time for writing recently.
Life is no busier now than it was when I started writing, back in the summer of 2010, and blaming time constraints is just the easy way out of a question that deserves an honest answer. A recent piece by @JFFutbol poses the question sharply: “is football blogging dead, dying, or simply changing?”
Author Johnathan Fadugba comes up with three major reasons for the decline in blogging: time constraints, ‘it isn’t going anywhere’, and ‘it isn’t fun anymore’. None of these apply to 11tegen11: time constraints are no different now than they were back then, over the years we have absolutely been going somewhere, and I definitely enjoy writing blog pieces. Yet I do have the feeling that my blogging has been painfully slow recently. So, here’s a personal story about the 11tegen11 blog, and how it has developed over the years.
In the summer of 2010, 11tegen11 started out as a tactics blog with a focus on Dutch football. My aim at 11tegen11 has always been to run an independent, personal blog that provides well-constructed opinions on anything Dutch football related. The use of numbers and analytics was a logical path to take. I figured I’d use numbers and analysis to form an opinion, write about it and be different from just a random guy with an opinion. My writing focuses more on the journey (analytics) than the destination (conclusion).
The biggest problem, back in 2010, was the general lack of access to data. My writing mainly concerned tactical match and team reports. That was hardly data driven at all, but it did help me to get in touch with two data companies: Infostrada Sports and InStat Football. Both of them helped me get access to data I would never have seen otherwise, though, back in 2010, that meant raw shot and possession numbers per match. Which still felt like the bomb, by the way. My football blogging helped me to establish a platform to use this data, which I would not have had if I’d just been the average casual fan.
Oh happy days
Exploring this level of data with our growing football analytics community, we dragged the concept of Total Shots Ratio (TSR) as far as we could. We’ve developed predictive models based on TSR, used it to evaluate manager performances, and successfully identified under- and overachievers at several stages of the season.
Databases were simple two-dimensional spreadsheets, calculations were done within seconds, and the rest of the evening remained for writing. For most of 2011 we had a lot of fun with simple concepts like TSR, which proved a decent performance analysis tool.
In 2012, things started changing. Websites like Squawka and WhoScored filled our desire for more and better data. Both sites bring a wealth of OPTA-fueled data just a few mouse clicks away. Shot charts, minute-by-minute data, individual player actions, you name it.
It wasn’t long before even we, TSR proponents, had to confess the limits of simply counting each and every goal scoring attempt. It took some time to develop, but the invention of ‘Expected Goals’ (ExpG) was inevitable (as can be seen in this philosophical piece from 2011). With ever more refined models, we assign each goal scoring attempt a number between 0 and 1 to reflect the odds of said chance resulting in a goal. ExpG is definitely the eye catcher of football analytics at present, but the possibilities are endless, both on team and player level.
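For readers who never met TSR, here is a minimal sketch in Python of how it is usually computed: a team's shots divided by all shots in its matches. The function name, the team names and the season totals are made up for illustration; they are not figures from the 11tegen11 database.

```python
def total_shots_ratio(shots_for: int, shots_against: int) -> float:
    """Share of all shots in a team's matches that the team itself took."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded, TSR is undefined")
    return shots_for / total


# Hypothetical season totals (shots for, shots against), illustration only.
season = {"Team A": (520, 380), "Team B": (410, 450)}

# One TSR value per team; above 0.5 means the team outshoots its opponents.
tsr_table = {team: total_shots_ratio(f, a) for team, (f, a) in season.items()}

for team, tsr in sorted(tsr_table.items(), key=lambda item: -item[1]):
    print(f"{team}: TSR = {tsr:.3f}")
```

That really is all there is to it, which is exactly why a spreadsheet was enough back then.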
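To make the idea concrete, a small Python sketch: once every attempt carries a scoring probability, a team's ExpG for a match is simply the sum of those probabilities. The shot list and its probabilities are invented for this example, and `expected_goals` is a hypothetical helper name, not part of any published model.

```python
from collections import defaultdict

# Hypothetical shots from one match: (team, estimated scoring probability).
shots = [
    ("Home", 0.08),
    ("Home", 0.35),
    ("Away", 0.76),  # e.g. a penalty-like chance
    ("Home", 0.03),
    ("Away", 0.05),
]


def expected_goals(shot_list):
    """Sum the per-shot scoring probabilities for each team."""
    totals = defaultdict(float)
    for team, probability in shot_list:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("a scoring probability must lie between 0 and 1")
        totals[team] += probability
    return dict(totals)


match_expg = expected_goals(shots)
for team, value in match_expg.items():
    print(f"{team}: ExpG = {value:.2f}")
```

In this made-up match the home side takes three of the five shots, yet the away side ends up with the higher ExpG, which is precisely the sort of thing a plain shot count cannot show.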
Meanwhile, the activity of our football analytics community did not go unnoticed in mainstream media and from 2012 onwards, a significant number of early blog writers got snapped up by established media sources or data companies.
Personally, early in 2013 I was offered the opportunity to join a small group of pioneers and start writing for the website of Dutch national newspaper De Volkskrant. Recently, I added a supporting writing role for digital news medium ‘De Correspondent’, which meant a step up in mainstream media land. The increased attention allowed us to show our work to a bigger crowd of Dutch readers on an established stage, but it also brought along the pressure of deadlines and expectation. All that time, blog writing could wait.
With the introduction of Squawka and WhoScored in 2012, the amount of publicly available data grew exponentially, and so did the complexity of our analytics. Personally, I used some in-between-jobs time to train myself to use R statistical software to make the best use of our newfound wealth, and time investment started shifting from writing to analyzing.
The present ExpG model on 11tegen11 is a self-learning general linear regression stratified for different match situations like open play, corners, free kicks, etcetera. The model uses as much contextual information as possible within the limits of on-ball data. Shot location, shot type, assist information, game state and league effects are all used if appropriate for the match situation at hand. A spare hour is easily spent trying to fine-tune some aspects of the model, or to fix some complicated issues with a large database. Again, blog writing could wait.
On top of that, in the back of our minds, a soft voice kept insisting: “don’t share everything you’re developing now, it might be of competitive advantage”. So far, it’s hard to earn money with football analytics, though that may change in the future. Clubs refrain from hiring analysts en masse for various reasons, and the betting industry is pretty hard to beat over longer periods of time. Personally though, this phenomenon has played a role for a while, and it would be unfair to open up in this piece without mentioning this factor.
Pressed in between work, social life, and new-found deadlines for mainstream work, it was often easier for me to pop out a Twitter shout or a short infographic. R is a great piece of software to create scripted infographics, and potential blog pieces were still half-written when current events overtook them, or never even got further than some pilot data work.
On top of that, blog writing suffered severe competition from the one thing even better than football data. Right, watching football that is. Now that’s where 2010 and 2014 differ hugely. Nearly every day between August and May holds top level league matches that can be found on TV or streaming on the internet. And for those dull months in between there are play-offs, World Cups, friendlies, etcetera. Never an evening without football on your flatscreen. And, with the advent of detailed league data worldwide, the number of leagues to indulge in just keeps on growing. If you can watch the Argentine Super Clasico, blog writing can wait.
Back in the TSR days of 2011, writing about football analytics was easy. In counting shots there isn’t much one can do wrong. But things are different now. Complex scripts contain small errors that need tracking and fixing. The free-flowing game of football needs complicated analysis to be at least somewhat accurate, and complicated analysis needs a lot of words to be explained.
People want to read about football, not about analytical modelling, and it’s a challenge to walk the tightrope between under- and over-explaining analytical methods. On 11tegen11 at times, I’ve avoided this issue by not writing at all, or, in most cases, by focusing on concepts (like scouting or identifying playing style) rather than teams or players. The concepts often didn’t return. Not because they weren’t interesting, but because self-imposed 1000 or 1500 word limits for team or player articles don’t leave room for explaining the concepts enough.
Perhaps that’s wrong, and I should have just used terms like ‘crosses to through ball ratio’ or ‘ExpG over performance’ regularly so that returning readers would familiarize themselves with them. And readers that shy away from terms like that, well, would that be your audience anyway?
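What "stratified for different match situations" could look like is sketched below in plain Python. To be clear about the assumptions: this is not the actual 11tegen11 model (that one lives in R and uses far more context). I have swapped in a simple logistic regression so that the output stays between 0 and 1, reduced the inputs to two hypothetical features (shot distance and a header flag), and made up every training shot.

```python
import math
from collections import defaultdict


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, learning_rate=0.1, epochs=2000):
    """Fit weights (intercept first) by batch gradient descent.

    rows is a list of (features, goal) pairs, with goal equal to 0 or 1.
    """
    weights = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, goal in rows:
            xs = [1.0] + list(features)
            error = sigmoid(sum(w * x for w, x in zip(weights, xs))) - goal
            for j, x in enumerate(xs):
                gradient[j] += error * x
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights


def fit_stratified(shots):
    """Fit one separate model per match situation (the stratification)."""
    by_situation = defaultdict(list)
    for situation, features, goal in shots:
        by_situation[situation].append((features, goal))
    return {s: fit_logistic(rows) for s, rows in by_situation.items()}


def predict(models, situation, features):
    """Scoring probability for one shot, using its situation's model."""
    w = models[situation]
    return sigmoid(w[0] + sum(wi * x for wi, x in zip(w[1:], features)))


# Made-up shots: (situation, [distance in tens of metres, header flag], goal).
training = [
    ("open_play", [0.6, 0.0], 1), ("open_play", [0.8, 0.0], 1),
    ("open_play", [1.0, 0.0], 0), ("open_play", [1.5, 0.0], 1),
    ("open_play", [1.8, 0.0], 0), ("open_play", [2.5, 0.0], 0),
    ("open_play", [0.5, 1.0], 1), ("open_play", [0.9, 1.0], 0),
    ("open_play", [1.2, 1.0], 0),
    ("corner", [0.5, 0.0], 1), ("corner", [0.9, 0.0], 0),
    ("corner", [0.7, 1.0], 1), ("corner", [0.6, 1.0], 0),
    ("corner", [1.0, 1.0], 0), ("corner", [1.1, 1.0], 0),
]

models = fit_stratified(training)
close_shot = predict(models, "open_play", [0.6, 0.0])
far_shot = predict(models, "open_play", [2.5, 0.0])
print(f"open play, close: {close_shot:.2f}, far: {far_shot:.2f}")
```

The point of fitting each situation separately is that distance or a header may matter differently from a corner than from open play. With a handful of invented shots the numbers mean nothing, of course; the structure is what counts.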
In the end
In the four years that 11tegen11 has been around, a lot has changed. We’ve got more detailed data than we can handle, we can see more matches than would actually be healthy, and we have kept writing waiting for too long.
Football analytics blogging may well be at a breaking point in its short life. Investing more in deeper and more complicated, yet more accurate, analysis, without explaining it to a wider audience, would see us dig a hole for ourselves. It would make our little community inaccessible in a few years’ time, and that would not help develop this niche that I don’t think should be a niche.
Writing can make watching and analyzing football more fun. If we made up for lost ground and wrote those unpretentious pieces that we did a few years ago, we’d be better off in the long run. Not all pieces need to be mouth-watering analysis in eloquently written near poetry. Bring back the raw unedited pieces that football blogging should be all about. Bring back the fun!