Monday, February 20, 2012

The Dynamics of Relegation in the Premier League: Early Warning Signs and Seeing the Forest for the Trees


In hindsight, relegation often seems inevitable. If you had asked the pundits, Blackpool's demotion to the Championship last year was all but a done deal in August. But do the data agree? And what can they tell us about the inevitability and predictability of relegation ahead of time, rather than after the fact?

It's not an easy question to answer. The trick to avoiding what psychologists call hindsight bias is to spot trends before they become facts. But that's a hard thing to do in the middle of a season, when the weekly performance of teams varies for all kinds of reasons and the hoopla and grind of the campaign make it difficult to see the forest - the real performance of a club - for the trees. Moreover, there are so many different and variable data points to consider - match outcomes, individual player form, injuries, you name it - that standard data analysis techniques aren't always ideal for assessing what is really going on. And finally, avoiding the sense that relegation was inevitable requires analysts to be on the lookout for early warning signs - but how would we know what those signs might be and when they might show up?

To explore how these challenges can be dealt with, let's look at what happened to relegated clubs during the course of an entire season, using data from the completed 2010-11 campaign. Some obvious questions you might ask of the data are these:
  • How did relegated clubs perform?
  • Were there obvious trends in performance early in the season - did relegated clubs get better or worse over the course of the season?
  • Were the trends in performance radically different between relegated and non-relegated clubs?
Answering these questions means looking at data over time - trends in performance. It also means cutting through the thicket of noise that is inherent in any performance data that vary across teams, and especially from week to week. Analytically, this means we are interested in both the long-term (season-long) and short-term (week-to-week) trends in performance. A nifty technique called lowess smoothing - also known as locally weighted polynomial regression - can provide some answers. While it may sound fancy, it's actually quite simple. Lowess is a regression technique that allows us to drill down to the true underlying trends in the data in a way that is sensitive to short-term fluctuations and allows for curvilinear relationships. Simply put, instead of fitting one straight line through the data for, say, a whole season, the technique takes so-called localized subsets of the data (weeks) and runs many (in our case, literally hundreds of) regressions to weed out the outliers and identify the shared trends in the short and the longer run.
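For readers who want to see the mechanics, here is a minimal sketch of lowess in pure numpy. The 38-week "goals against" series below is invented for illustration, and a real analysis would typically use a library implementation (such as the one in statsmodels) rather than this bare-bones version:

```python
import numpy as np

def lowess(x, y, frac=0.5):
    """Minimal lowess: for each point, fit a weighted straight line to
    the nearest `frac` share of the data (tricube weights) and predict
    at that point."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    k = int(np.ceil(frac * n))
    smoothed = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:k]           # the localized subset of weeks
        d = dist[idx] / dist[idx].max()      # distances scaled to [0, 1]
        w = (1.0 - d**3) ** 3                # tricube weights
        X = np.column_stack([np.ones(k), x[idx]])
        # weighted least squares for intercept and slope
        beta = np.linalg.solve((X.T * w) @ X, (X.T * w) @ y[idx])
        smoothed[i] = beta[0] + beta[1] * x[i]
    return smoothed

# Illustration: a noisy, slowly worsening "goals against" trend over 38 weeks
rng = np.random.default_rng(0)
weeks = np.arange(1, 39, dtype=float)
true_trend = 1.5 + weeks / 40.0
goals_against = true_trend + rng.normal(0, 0.8, size=38)
trend = lowess(weeks, goals_against, frac=0.5)
```

With `frac=0.5`, each local fit uses roughly half a season of matches, which damps week-to-week noise while still letting the trend line bend over the course of the year.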

But enough of the econometrics - how does it look in practice?

Friday, February 17, 2012

The Lure of the January Fix: A Data-Based Review of Bolton’s, Everton’s, and QPR’s Transfer Strategies

By Laban Scott Libby

In a world of unrelenting pressure for results, the January transfer window offers the tempting opportunity for the quick fix. For clubs contending for the championship or Europe, it encourages the hunt for that one piece of the puzzle that will make the club complete; and for clubs fighting relegation, that one special player may seem like the difference between another year among the world's elite and a long year of away games at Barnsley and Peterborough. Because of the lure of the fix, clubs' January transfer activity also provides a window into what management sees as the club's weaknesses and strengths.

A couple of weeks ago, SBTN provided some benchmarking of clubs' offensive and defensive performance during the first 20 weeks of the season. Below, I spend some time reviewing the transfers clubs actually made, to see what insight, if any, they provide into clubs' thinking and strategies. To start with the headline number: total transfer expenditure by Premier League clubs had reached £59m as of February 1st, making this a window of relative austerity compared to last January's record £225m. With the numbers considerably less heady than in 2011, significant outlays by Chelsea (£20.5m), QPR (£10.5m), and Newcastle (£10m) accounted for well over half of all the money spent during January. Gary Cahill and Papiss Demba Cissé represent the only permanent first-team signings made by clubs in the current top eight, with loan deals the preferred choice of many teams throughout the league.

In the bottom half of the table, clubs employed a variety of transfer approaches. Both West Brom and Wigan apparently decided that Birmingham City was the one-stop shop for survival saviours, plundering the promotion hopefuls for Liam Ridgewell and Jean Beausejour, respectively. But most clubs were more reserved. Stoke chose not to make a single signing; Swansea and Wolves both opted for loan deals and small transfers under £250,000; temporary loanee Robbie Keane was Aston Villa's sole addition; and Fulham's main signing, striker Pavel Pogrebnyak, was helped along by Bobby Zamora's last-minute move three miles north to QPR. Blackburn, meanwhile, seemed to be hoping that a successful transfer window is just as much about whom you keep as whom you buy; if they can somehow lift themselves out of the relegation zone come May, holding onto an unhappy Chris Samba may prove a masterstroke.

Of those clubs deciding that significant reinforcements were necessary, Everton and QPR featured heavily in the Deadline Day transfer activity; together with Bolton (£6.5m), the Toffees (£6.5m) and Rangers (£10.5m) spent the most amongst clubs outside the top six.*

These are the basic facts. But what do the match data tell us about each side's performance this season, and how might the performance of the players they brought in help them improve over the remainder of the 2011/12 campaign? To get a handle on this, I take a look at the performance stats of each of the three clubs, and of their new signings, to diagnose what specifically ails the clubs and how the players' performance profiles may fill those gaps. As you will see below, the numbers and the transfers tell very different stories about each of the three clubs.**

Monday, February 13, 2012

How Efficient Are Player Salaries in Major League Soccer? Data From 2011

By Benjamin Leinwand and Chris Anderson

When it comes to pay, not all positions are created equal. In fact, we have long known that strikers command a premium for their services. So it should come as no surprise that Major League Soccer is no exception. In 2011, average earnings were distinctly tilted toward the offensive side of the pitch. Consider this: the average MLS forward earned $183,060, while midfielders made $141,594, defenders $118,558, and goalkeepers $86,208 – less than half of what strikers took home.

Of course we can quibble with these numbers. Does it make sense to compare goalkeepers, most of whom ride the bench, to forwards, given that a lower percentage of goalkeepers played real minutes than players at other positions? Others could point out that the averages may be skewed, given that every designated player in the MLS is a midfielder or a forward. But even if we leave goalkeepers to the side, and even if we look at median rather than average pay (to eliminate some of the distortions produced at the very high and low ends), the overall pattern holds up. The table below shows some of the details about which positions played and which positions were paid in Major League Soccer.

The data show significant skew by position. Taking median salaries, midfielders earn roughly a $6,000 premium over defenders, and forwards a roughly $8,000 premium over midfielders, and these numbers are significantly higher when we consider averages. Any way we slice it, forwards are the most valued position by compensation, followed, with pretty significant gaps, by midfielders, defenders, and goalkeepers. While this is not surprising, it is made more notable by the fact that forwards played on average the fewest minutes of any position.
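To see why the median matters here, consider a quick illustration in Python. The salary figures below are invented, not MLS data, but they show how a single designated-player-sized contract drags the mean far above the median:

```python
from statistics import mean, median

# Hypothetical forward salaries (invented figures): one large
# designated-player contract dominates the average
forwards = [60_000, 75_000, 90_000, 110_000, 150_000, 4_000_000]

print(f"mean:   ${mean(forwards):,.0f}")    # pulled up by the outlier
print(f"median: ${median(forwards):,.0f}")  # unaffected by it
```

Here the mean ($747,500) is more than seven times the median ($100,000), which is why comparing medians across positions gives a cleaner read on what the typical player earns.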

Is this market efficient? As Billy Beane showed the baseball world, sometimes the players who can create the most wins aren’t paid the most. In fact, the inefficiencies discovered by Beane are what made him and the A’s so successful and his story so interesting. By overpaying for some skills and undervaluing others, is it possible that general managers in the MLS are making the same mistake general managers in baseball were making?  

We thought we’d take a look. 

Tuesday, February 7, 2012

Using Castrol Player Ratings To Predict Team Success in MLS

By Benjamin Leinwand and Chris Anderson

Prior to the start of the 2011 season, Major League Soccer teamed up with the Castrol Index to deliver a more statistically advanced version of the match day player ratings. According to the creators, “the Castrol Index objectively analyses player performance, tracking every move on the field and assessing whether it has a positive or negative impact on a team's ability to score or concede a goal. At the end of each game, players are given a score out of ten.”

This kind of individual player performance indicator has enormous promise. If done right, it holds out the prospect of allowing analysts to argue conclusively that one player is better than another, regardless of team setting, reputation, or any number of other factors that can interfere with objective measurement. And of course, it can also be used to measure how well certain teams are acquiring and managing their talent.

While the Castrol Index has been around for some time, it is new to MLS. It has been calculated for players in the top European football leagues for a few years, but there are some subtle differences between the MLS and European versions – for example, the European Castrol Index is based on a more comprehensive database covering more players, more seasons, and, naturally, more leagues. On the MLS side, the 2011 ratings for 455 players are currently available online.* So now that the season is over, we thought we would look at the final ratings for individual players for the season as a whole to see how well the Castrol Index has fared.

But how can we know if the Castrol Index is useful for settling arguments with some confidence?
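One straightforward check, sketched below with invented numbers rather than actual Castrol or MLS data, is to ask whether a team's average player rating lines up with the points it earned: a strong positive correlation would suggest the index is capturing something real about team performance.

```python
import numpy as np

# Invented illustration: team-average rating vs. season points
# for ten hypothetical teams
ratings = np.array([7.1, 6.9, 6.8, 6.7, 6.5, 6.4, 6.2, 6.0, 5.9, 5.8])
points  = np.array([ 55,  51,  49,  47,  45,  42,  38,  36,  35,  30])

# Pearson correlation between the two series
r = np.corrcoef(ratings, points)[0, 1]
print(f"Pearson r = {r:.2f}")
```

In this made-up example the two series move together almost perfectly; with real data, the size of the correlation (and how much of the variation in points the ratings explain) is exactly what we would want to inspect before trusting the index to settle arguments.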

Thursday, February 2, 2012

Norwich, QPR, and Swansea: How Are the Promoted Clubs Faring This Year?

Now that the January transfer window is closed, clubs are gearing up for the rest of the season. It's a particularly intense time for the three promoted clubs. Before the season, many observers had them on their list of likely candidates for relegation. But judging from the league table at the moment, Norwich and Swansea look to be in a good position to stay up. QPR, in contrast, seem to be a more obvious candidate for a prolonged relegation battle. 

But looks and even results can be deceiving sometimes, so I thought I'd take a closer look at some of the underlying trends in performance among the three promoted clubs. First things first: goals for and against. The graph below shows goals for (green) and against (red) for each club; each dot is a match, and the red and green lines indicate what the performance trend has been this season.

The patterns are remarkably different. Norwich's offensive and defensive production have been roughly equal (with slightly worse defensive performance generally) and consistently so. Swansea's performance early in the season was erratic and not all that good; since those early days, however, the Swans have done well, and their offensive and defensive performance levels alike have improved on where they stood early in the year. Contrast that with the pattern we see for QPR: the Rangers' defense has been consistently and significantly worse than their offense. What is more, their offensive production has taken a dive in the second ten weeks of the season.

So how did we get here?