Monday, February 20, 2012

The Dynamics of Relegation in the Premier League: Early Warning Signs and Seeing the Forest for the Trees


(c) 2011 mirrorfootball.co.uk

In hindsight, relegation often seems inevitable. If you had asked the pundits, Blackpool's demotion to the Championship last year was all but a done deal in August. But do the data agree? And what can they tell us about the inevitability and predictability of relegation ahead of time, rather than after the fact?

It's not an easy question to answer. The trick to avoiding what psychologists call hindsight bias is to spot trends before they become facts. But that's a hard thing to do in the middle of a season when the weekly performance of teams varies for all kinds of reasons and the hoopla and grind of the season make it difficult to see the forest - the real performance of a club - for the trees (some examples are here). Moreover, there are so many different and variable data points to consider - match outcomes, individual player form, injuries, you name it - that normal data analysis techniques aren't always ideal for assessing what is really going on. And finally, to avoid seeing relegation as inevitable requires analysts to be on the lookout for early warning signs - but how would we know what those signs might be and when they might show up?

To explore how these challenges can be dealt with, let’s look at what happens to relegated clubs during the course of an entire season with data from 2010-11. Some obvious questions you might ask of the data are these:
  • How did relegated clubs perform?
  • Were there obvious trends in performance early in the season - did relegated clubs get better or worse over the course of the season?
  • Were the trends in performance radically different between relegated and non-relegated clubs?
Answering these questions means looking at data over time - trends in performance. It also means cutting through the thicket and noise that is inherent in any performance data that vary across teams, and especially from week to week. Analytically, this means that we are interested in both the long-term (season-long) and short-term (week to week) trends in performance. A nifty technique called lowess smoothing - also known as locally weighted polynomial regression - can provide some answers. While it may sound fancy, it's actually quite simple. Lowess smoothing is a regression technique that allows us to drill down to the underlying trends in the data in a way that is sensitive to short-term fluctuations and allows for curvilinear relationships. Put simply, instead of fitting one straight line through the data for, say, a whole season, the technique takes so-called localized subsets of the data (weeks) and runs many regressions (in our case, literally hundreds) to weed out the outliers and identify the shared trends in the short and the longer run.
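For readers who want to see the mechanics, here is a minimal sketch of the idea in Python. It is not the code behind the graphs in this post: it fits a weighted straight line around each week using tricube weights and skips the outlier-robustness passes that full lowess implementations add. The function names, the `frac` parameter, and the goal counts are all assumptions made up for illustration.

```python
def tricube(u):
    """Tricube kernel: weight falls smoothly from 1 at u=0 to 0 at |u|>=1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def lowess(xs, ys, frac=0.5):
    """Return one smoothed value per x, using local weighted linear fits."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours in each local fit
    smoothed = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        w = [tricube((x - x0) / h) for x in xs]
        # Weighted least squares for a straight line around x0.
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed


# Hypothetical goals per week, for illustration only (not real match data).
weeks = list(range(1, 11))
goals = [0, 1, 0, 2, 1, 3, 1, 0, 1, 0]
trend = lowess(weeks, goals, frac=0.5)
```

A smaller `frac` follows the week-to-week swings more closely; a larger one gives a smoother, season-long picture.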

But enough of the econometrics - how does it look in practice?

Friday, February 17, 2012

The Lure of the January Fix: A Data-Based Review of Bolton’s, Everton’s, and QPR’s Transfer Strategies

By Laban Scott Libby

In a world of unrelenting pressure for results, the January transfer window offers the tempting opportunity for the quick fix. For clubs contending for the championship or Europe, it encourages the hunt for that one piece of the puzzle that will make the club complete; and for clubs fighting relegation, that one special player may seem like the difference between another year among the world’s top or a long year of away games at Barnsley and Peterborough. Because of the lure of the fix, January transfer window activity by clubs also provides a window into what management sees as the club’s weaknesses and strengths.

A couple of weeks ago, SBTN provided some benchmarking of clubs’ offensive and defensive performance during the first 20 weeks of the season. Below, I review the transfers clubs actually made to see whether, and what kind of, insight they provide into clubs’ thinking and strategies. To start, total transfer expenditure by Premier League clubs had reached £59m as of February 1st. This made it a window of relative austerity compared to last January’s record £225m. With the sums involved considerably less heady than in 2011, significant outlays by Chelsea (£20.5m), QPR (£10.5m) and Newcastle (£10m) represented well over half of all the money spent during January. Gary Cahill and Papiss Demba Cissé represent the only permanent first-team signings made by clubs in the current top eight, with loan deals the preferred choice of many teams throughout the league.

In the bottom half of the table, clubs employed a variety of transfer approaches. Both West Brom and Wigan apparently decided that Birmingham City was the one-stop shop for survival saviours, plundering the promotion hopefuls for Liam Ridgewell and Jean Beausejour, respectively. But most clubs were more reserved. Stoke chose not to make a single signing; Swansea and Wolves both opted for loan deals and small transfers under £250,000; temporary loanee Robbie Keane was Aston Villa’s sole addition; and Fulham’s main signing, striker Pavel Pogrebnyak, was helped along by Bobby Zamora’s last minute move 3 miles north to QPR. Blackburn, meanwhile, seemed to be hoping that a successful transfer window is just as much about whom you keep as whom you buy, and if they can somehow lift themselves out of the relegation zone come May then holding onto an unhappy Chris Samba may prove a masterstroke.

Of those clubs deciding that significant reinforcements were necessary, Everton and QPR featured heavily in the Deadline Day transfer activity; together with Bolton (£6.5m), the Toffees (£6.5m) and Rangers (£10.5m) spent the most amongst clubs outside the top six.*

These are the basic facts. But what do match data tell us about each side’s performance levels this season and how the performance of the players they brought in may or may not help them improve in the remainder of the 2011/12 season? To get a handle on these questions, I take a look at the three clubs’ and players’ performance stats to diagnose what, specifically, ails the clubs and how the players’ performance profiles may fill the gaps in performance. As you will see below, the numbers and transfers tell very different stories about each of the three clubs.**

Monday, February 13, 2012

How Efficient Are Player Salaries in Major League Soccer? Data From 2011

By Benjamin Leinwand and Chris Anderson

When it comes to pay, not all positions are created equal. In fact, we have long known that strikers command a premium for their services. So it should come as no surprise that Major League Soccer is no exception. In 2011, average earnings were distinctly tilted toward the offensive side of the pitch. Consider this: the average MLS forward earned $183,060, while midfielders made $141,594, defenders $118,558, and goalkeepers $86,208 – less than half of what strikers took home.

Of course we can argue about these numbers. Does it make sense to compare goalkeepers, most of whom ride the bench, to forwards, given that a lower percentage of goalkeepers played real minutes compared to other positions? Others could point out that the averages may be skewed, given that every designated player in the MLS is a midfielder or a forward. But even if we leave goalkeepers to the side and even if we look at median rather than average pay (to eliminate some of the distortions produced at the very high and low ends), the overall pattern holds up. The table below shows some of the details about which positions played and which positions were paid in Major League Soccer.

The data show significant skew by position. Taking median salaries, midfielders earn roughly a $6,000 premium over defenders, and forwards a roughly $8,000 premium over midfielders, and these numbers are significantly higher when we consider averages. Any way we slice it, forwards are the most valued position by compensation, followed, with pretty significant gaps, by midfielders, defenders, and goalkeepers. While this is not surprising, it is made more notable by the fact that forwards played on average the fewest minutes of any position.
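To see why the median and the average can tell different stories, here is a small sketch using Python's standard `statistics` module. The salary figures are made up for illustration and are not the 2011 MLS data; the point is only that one very large contract pulls the average up while leaving the median almost untouched.

```python
from statistics import mean, median

# Hypothetical salaries (USD), for illustration only; not the 2011 MLS data.
salaries = {
    "forward": [42_000, 65_000, 90_000, 150_000, 1_500_000],
    "defender": [42_000, 60_000, 85_000, 120_000, 200_000],
}


def summarize(values):
    """Return the average and the median of a list of salaries."""
    return {"mean": mean(values), "median": median(values)}


summary = {position: summarize(values) for position, values in salaries.items()}

for position, stats in summary.items():
    print(position, stats)
```

In this toy example the single seven-figure forward contract lifts the forwards' average far above their median, which is the kind of distortion the median is meant to remove.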

Is this market efficient? As Billy Beane showed the baseball world, sometimes the players who can create the most wins aren’t paid the most. In fact, the inefficiencies discovered by Beane are what made him and the A’s so successful and his story so interesting. By overpaying for some skills and undervaluing others, is it possible that general managers in the MLS are making the same mistake general managers in baseball were making?  

We thought we’d take a look. 

Tuesday, February 7, 2012

Using Castrol Player Ratings To Predict Team Success in MLS

By Benjamin Leinwand and Chris Anderson

Prior to the start of the 2011 season, Major League Soccer teamed up with the Castrol Index to deliver a more statistically advanced version of the match day player ratings. According to the creators, “the Castrol Index objectively analyses player performance, tracking every move on the field and assessing whether it has a positive or negative impact on a team's ability to score or concede a goal. At the end of each game, players are given a score out of ten.”

This kind of individual player performance indicator has enormous promise. If done right, it holds out the prospect of allowing analysts to argue conclusively that one player is better than another, regardless of team settings, reputation, or any number of other factors that can interfere with objective measurement. And of course, it also can be used to measure how well certain teams are acquiring and managing their talent.

While the Castrol Index has been around for some time, it is new to MLS. It has been measured for players in the top European football leagues for a few years, but there are some subtle differences between the MLS and the European index – for example, the European Castrol Index is based on a more comprehensive database for more players and seasons, and naturally it covers more leagues. On the MLS side, the 2011 ratings for 455 players are currently available online.* So now that the season is over, we thought we would look at the final ratings for individual players for the season as a whole to see how well the Castrol Index has fared.

But how can we know if the Castrol Index is useful for settling arguments with some confidence?

Thursday, February 2, 2012

Norwich, QPR, and Swansea: How Are the Promoted Clubs Faring This Year?

Now that the January transfer window is closed, clubs are gearing up for the rest of the season. It's a particularly intense time for the three promoted clubs. Before the season, many observers had them on their list of likely candidates for relegation. But judging from the league table at the moment, Norwich and Swansea look to be in a good position to stay up. QPR, in contrast, seem to be a more obvious candidate for a prolonged relegation battle. 

But looks and even results can be deceiving sometimes, so I thought I'd take a closer look at some of the underlying trends in performance among the three promoted clubs. First things first: goals for and against. The graph below shows goals for (green) and against (red) for each club; each dot is a match, and the red and green lines indicate what the performance trend has been this season.


The patterns are remarkably different. While Norwich's offensive and defensive production have been roughly equal (with slightly worse defensive performance generally) and consistently so, Swansea's performance early in the season was a bit all over the place and not all that good. However, since the early days, the Swans have done well, and their offensive and defensive performance levels have been better than they had been early in the year. Contrast that with the pattern we see for QPR: The Rangers' defense has been consistently and significantly worse than their offense. What is more, their offensive production has taken a dive in the second ten weeks of the year.

So how did we get here?

Tuesday, January 31, 2012

Everton and Fulham, Quo Vadis? Data From This Season

On this side of the Atlantic, the football news that received quite a bit of attention over the past few days was Friday's FA Cup tie between Everton and Fulham, featuring Clint "Hat Trick" Dempsey, Landon Donovan, and Tim Howard. Everton won, and as SI.com put it:
"Landon Donovan etched his name further into Everton lore with two assists in the Toffees' FA Cup victory over Fulham. The match featured Donovan and fellow U.S. alpha dog Clint Dempsey going head-to-head as opponents for the first time since an MLS match between the Los Angeles Galaxy and New England Revolution on May 6, 2006, and unlike that day, when Dempsey's side rolled to a 4-0 victory, Donovan stole the show with his contributions from the right wing."
Coming into the game, you would have been hard pressed to pick a favorite, however. So far this season, both Fulham and Everton have been solidly mid-table teams, sitting on 26 points to date. And when you look at the trends in goals scored and conceded (in the graph below), the two clubs look eerily similar (dots are individual matches, and the lines are the trends over time). The trend over the first 20 weeks of the season puts both of them at just over one goal per match scored and conceded, and the table seems to reflect their similarities.*


Interestingly, the data show that Everton's defense has tightened up somewhat in the second 10 weeks of the campaign, while Fulham's has stayed mainly level (with the exception of the 0-5 home defeat against Manchester United).

One question, of course, is how these very similar records have been produced, so I took a quick peek at shots taken and conceded, along with one of my favorites, offensive and defensive efficiency (as measured by the goals to shots ratios).
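As a quick illustration of the efficiency measure, here is a minimal Python sketch of a goals to shots ratio. The function name and the totals are hypothetical and only show the arithmetic; they are not Everton's or Fulham's actual numbers.

```python
def goals_to_shots(goals, shots):
    """Goals per shot; returns None when no shots were taken."""
    return goals / shots if shots else None


# Hypothetical season-to-date totals, for illustration only.
scored, shots_taken = 24, 240
conceded, shots_conceded = 26, 260

# Offensive efficiency: share of a club's own shots that end up as goals.
offensive_efficiency = goals_to_shots(scored, shots_taken)
# Defensive efficiency: share of opponents' shots that end up as goals
# (lower is better for the defending side).
defensive_efficiency = goals_to_shots(conceded, shots_conceded)
```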

Monday, January 30, 2012

Zero Sum Games: Some Pictures of Shot and Goal Differentials in the Premier League This Season

Football's a simple game. If you score more or concede fewer than the other side, you win the match. So I thought it would be worth taking a look at which teams have been doing better than their opponents. The simplest way of doing just that is to calculate differentials for the stuff that matters most - goals - and the stuff without which the stuff that matters most usually doesn't happen (that would be shots, of course).
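The calculation itself is nothing more than subtraction, match by match. Here is a small Python sketch; the function name and the numbers are hypothetical and purely for illustration.

```python
def differentials(made, allowed):
    """Per-match differential: what a club produced minus what it allowed."""
    return [m - a for m, a in zip(made, allowed)]


# Hypothetical results for one club over four matches, for illustration only.
goals_for = [2, 0, 1, 3]
goals_against = [1, 1, 1, 0]
shots_for = [14, 9, 11, 18]
shots_against = [10, 15, 11, 7]

goal_diff = differentials(goals_for, goals_against)
shot_diff = differentials(shots_for, shots_against)

# Number of matches with a positive goal differential (i.e. wins).
wins = sum(1 for d in goal_diff if d > 0)
```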

So, here, first of all are goal differentials by match, for the 20 Premier League clubs during the first 20 weeks of this season.


One thing that's immediately noticeable, regardless of name, is how few clubs consistently produce positive goal differentials. The season's top two (Manchester) teams clearly stand out, both for their consistency in producing positive goal differentials and for the size of these differentials. But other trends are quite visible as well - compare, for example, Sunderland and Fulham. Sunderland's matches have been decided by just a goal for most of the season, while Fulham have had massive swings, it seems. Or compare Bolton and Blackburn, two clubs that are stuck in and around the relegation zone. Blackburn's consistent losses have been relatively small compared to Bolton's sizable (but also decreasing) defeats. Finally, Arsenal's terrible early season and recovery by weeks 10-12 show up very nicely, as does their subsequent decline.

Whatever you may read into these goal differentials, they're kind of fun to ponder, as are the following differentials on shots taken by each club in each match.

Friday, January 27, 2012

Manchester City's Offensive Production: A PS

Here's a short PS to my earlier post about Manchester City's offensive performance so far this season. It provides a summary of shot creation and finishing for the first and second 10 weeks of the season. Dots mark the club's performance in an individual match. The number next to the dot indicates the week. And the dark blue lines tell us where half of all clubs (the 50th percentile of the league) stand in shots or goal to shot ratios by Week 10 or 20, respectively.



The pictures tell the story (hence the post). While virtually all of City's performances during the first 10 weeks put them in the upper right hand corner - among teams that took more shots and finished better - the second ten weeks of the season to date paint a more mixed picture.
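The logic of the four corners can be sketched in a few lines of Python: compare a match to the league's 50th percentile on shots and on the goal to shot ratio. The function name and the league values below are hypothetical and only illustrate the classification.

```python
from statistics import median


def quadrant(shots, ratio, league_shots, league_ratios):
    """Place one match relative to the league's 50th percentile on both axes."""
    # Values exactly at the median are counted with the lower group.
    volume = "more shots" if shots > median(league_shots) else "fewer shots"
    finishing = (
        "better finishing" if ratio > median(league_ratios) else "worse finishing"
    )
    return f"{volume}, {finishing}"


# Hypothetical league-wide values, for illustration only.
league_shots = [9, 11, 12, 14, 15, 17]
league_ratios = [0.06, 0.08, 0.09, 0.10, 0.12, 0.15]

upper_right = quadrant(20, 0.20, league_shots, league_ratios)
lower_left = quadrant(10, 0.05, league_shots, league_ratios)
```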

Thursday, January 26, 2012

What's Ailing Arsenal? Diagnosing the Gunners' (Offensive) Weakness With Some Data

Arsenal have had a most unusual season of highs and lows. After a rough start to the campaign, the ship seems to have steadied. Robin van Persie is on pace for a record-setting season; his 19 goals so far are only one fewer than the 20 scored by each of last year's Golden Boot winners, Carlos Tevez and Dimitar Berbatov. And club legend Thierry Henry has been signed on loan to lend some offensive firepower, experience, and spirit. At the same time, the 36 points Arsenal have accumulated at this point in the campaign are the fewest during Arsene Wenger's reign. This Arsenal season has been anything but boring for outside observers and surely nerve-wracking for supporters.

But what do the numbers tell us about what the Gunners are producing on the pitch? Have Arsenal improved? What have they been doing well? Is it offense, defense, both, or neither?

To start, let's take a look at the trends in the stuff that ultimately matters the most: scoring and conceding goals. The graphs below show the numbers of each, with a best-fitting trend line superimposed to see if there is a pattern to Arsenal's performance over the course of the season to date.


It only takes one quick glance to figure out that Arsenal's offensive season has been one of a significant up and a notable down. Arsenal started the season without much offensive success but saw a significant increase in offensive output all the way to Week 10. But that high point didn't last; instead, it gave way to a steady slide in offensive output all the way to the halfway point of the season. As it stands now, Arsenal's offense is not doing nearly as well as it was 8-10 weeks ago.

In some contrast, Arsenal's inconsistent defensive displays that marked the start of the season seem to have been overcome over the course of the last three months. Halfway through, Szczęsny & Co. managed to steady the ship, put in consistent performances, and produce 7 clean sheets in the process.


Taken together, this produces a mixed picture of where Arsenal stand at this point in the year. On the offensive end of the pitch, after improving significantly two months into the season, and despite van Persie's record-setting goalscoring pace, the Gunners' sharpshooters have been silenced somewhat. In contrast, their defense has allowed only 0.7 goals on average in the last 10 matches.

So what has happened to Arsenal's offense? To answer that question, let's look at chance creation and finishing - two of the most critical dimensions of offensive performance for any club.

Tuesday, January 24, 2012

Manchester City's Goal Machine: Is It Slowing Down?

When Manchester City dispatched Manchester United in commanding fashion earlier this season, the City steamroller was powerful, efficient, and impressive. It looked downright unstoppable. As Kevin McCarra noted the day of City's 6-1 trouncing of United in The Guardian, "For the football world at large, it is more significant still that City have 33 goals from nine league games. A club of such means does not usually inspire fondness from neutrals, but only a curmudgeon could fail to appreciate the accomplishment of City." Today, two months later and after beating fellow contenders Spurs 3-2 over the weekend, City still sit atop the Premier League, seemingly the team to beat this year.

So all is well at the Etihad, or so it seems. But are City still in a league of their own? The data below tell us that, despite the impressive goal difference revealed by a glance at the league table, City's offensive steamroller has slowed down. In fact, as a comparison across all 20 clubs shows, City is the club that has seen the most significant deterioration in offensive performance since the early days of the season.

Don't believe me? Take a look at the following graphs of some simple statistics. The first shows clubs' goal production for each week of play, up until Week 20 of the season. The dots show the number of goals; the lines show the trend. 


Clearly, between Weeks 1 and 10, City were on an astonishing run. But, as the numbers reveal, this trend was not sustained. Statistical tests show that the trend in goals for City this year is best approximated by a quadratic function. This tells us that, after a high level of scoring during the first half of the first half of the season, there has been a noticeable decline in City's offensive production.
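For the curious, here is a minimal Python sketch of what fitting a quadratic trend involves. It is not the statistical test used for the graphs above: it simply finds the least-squares curve of the form a + b*week + c*week^2 by solving the normal equations, and a negative c means a rise followed by a decline. The function name and the goal counts are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Build the 3x3 normal equations (X^T X) beta = X^T y.
    s = [sum(x ** p for x in xs) for p in range(5)]  # sums of x^0 .. x^4
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Hypothetical goals per week for one club, for illustration only.
weeks = list(range(1, 11))
goals = [1, 2, 3, 4, 5, 4, 3, 3, 2, 1]

a, b, c = fit_quadratic(weeks, goals)
# With c < 0 the fitted curve is an inverted U that peaks at -b / (2c).
peak_week = -b / (2 * c) if c < 0 else None
```

The fit needs at least three distinct weeks; with fewer, the equations have no unique solution.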

Just in case you want to take a look at City's numbers in greater detail, here they are again for good measure.


Of course, any reasonable person might ask "How could there not have been a decline?" They were off the charts, and this may simply have been unsustainable. So if the team over-performed early in the year, what we are looking at is simply regression to the mean, and we should see a decline in performance across different dimensions of offensive play as the season wore on. So let's take a quick look at two indicators: first, chance creation; second, efficiency.