Tuesday, February 7, 2012

Using Castrol Player Ratings To Predict Team Success in MLS

By Benjamin Leinwand and Chris Anderson

Prior to the start of the 2011 season, Major League Soccer teamed up with the Castrol Index to deliver a more statistically advanced version of the match day player ratings. According to the creators, “the Castrol Index objectively analyses player performance, tracking every move on the field and assessing whether it has a positive or negative impact on a team's ability to score or concede a goal. At the end of each game, players are given a score out of ten.”

This kind of individual player performance indicator has enormous promise. If done right, it holds out the prospect of allowing analysts to argue conclusively that one player is better than another, regardless of team settings, reputation, or any number of other factors that can interfere with objective measurement. And of course, it also can be used to measure how well certain teams are acquiring and managing their talent.

While the Castrol Index has been around for some time, it is new to MLS. It has been measured for players in the top European football leagues for a few years, but there are some subtle differences between the MLS and the European index – for example, the European Castrol Index is based on a more comprehensive database for more players and seasons, and naturally it covers more leagues. On the MLS side, the 2011 ratings for 455 players are currently available online.* So now that the season is over, we thought we would look at the final ratings for individual players for the season as a whole to see how well the Castrol Index has fared.

But how can we know if the Castrol Index is useful for settling arguments with some confidence?

The tricky analytical issue in deciding the index's utility is that it is designed to measure individual performance, yet there are very few plausible ways to ascertain whether the individual performance measured by Castrol connects to some other performance indicator or output that is not measured by Castrol and is easily accessible to us. Aside from box score statistics like goals and shots (or shots saved), there is precious little to compare the index to. One potential solution we favor is to see how the index lines up with wins and points. Of course, wins and points are team outcomes, so what these data allow us to do is compare the strength of teams. If the Castrol Index really does reflect how well an individual player performs during his time on the pitch, a team full of players with better Castrol scores should have a better record than one with players with worse Castrol scores. Another way to say this is to ask if the Castrol Index has what scientists typically refer to as construct validity. This is more than just an academic exercise – if an index doesn’t measure what it is intended to measure, we cannot have faith in the conclusions we draw from looking at it.

To conduct this analysis, we multiplied each player’s minutes played over the course of the season by his Castrol Index score to get that player’s Total Castrol Contribution to the team he played on. We then summed these contributions for each team and divided the result by the team's total minutes played. (Total minutes played didn't add up to the same number for every team, most likely because of players being sent off or matches of varying lengths, though it is also possible that the Castrol Index did not include every player.) This allowed us to generate every team’s overall Castrol score, which should indicate how well the team played over the season.
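In other words, the team score is a minutes-weighted average of its players' ratings. A minimal sketch of that calculation follows; the function name, the tuple layout, and the sample rows are our own illustrative assumptions, not actual Castrol data.

```python
from collections import defaultdict


def team_castrol_scores(players):
    """Minutes-weighted average Castrol score per team.

    `players` is an iterable of (team, minutes_played, castrol_score)
    tuples. This layout is an assumption made for illustration.
    """
    contribution = defaultdict(float)   # sum of minutes * score per team
    minutes_total = defaultdict(float)  # sum of minutes per team
    for team, minutes, score in players:
        contribution[team] += minutes * score
        minutes_total[team] += minutes
    # Skip any team with zero recorded minutes to avoid dividing by zero.
    return {
        team: contribution[team] / minutes_total[team]
        for team in contribution
        if minutes_total[team] > 0
    }


# Hypothetical rows, NOT real 2011 ratings.
sample = [
    ("Team A", 2700, 7.5),
    ("Team A", 900, 6.5),
    ("Team B", 1800, 7.0),
    ("Team B", 1800, 8.0),
]
scores = team_castrol_scores(sample)
```

Because each rating is weighted by minutes, a player who was on the pitch all season moves the team score far more than one who made a few substitute appearances.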

Is this what the data tell us?

For starters, the league average team Castrol score is 7.207, with a standard deviation of .355. This suggests that the league is fairly evenly spread out in terms of performance. But more importantly, the Team Castrol Index performs well overall. Teams' Castrol scores have a .76 correlation with season team points, and a .75 correlation with the number of wins. This is exactly what we should expect, as better teams should win more matches and lose (or tie) fewer. This is a very good result for the Castrol Index.
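For readers who want to repeat this kind of check, here is a rough sketch of the correlation calculation (Pearson's r). The toy numbers are invented purely to show the mechanics and are not the 2011 team data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical team scores and points, for illustration only.
toy_scores = [6.8, 7.0, 7.2, 7.4, 7.6]
toy_points = [38, 41, 47, 46, 55]
r = pearson_r(toy_scores, toy_points)
```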

Running a regression with points as the dependent variable, we get the results shown in the picture below.

There is a clear upward trend in points when a team’s Castrol Index increases, with the regression line being the diagonal line going from lower left to upper right. The vertical line is the average Team Castrol Index for the league as a whole (7.207), while the horizontal line is the average number of points in the league table (45.1). If a team performs one full Castrol point better than another, it is expected to gain 22.6(!) points in the league table, according to these results. Using a more realistic example, if your favorite team plays one standard deviation (.355) better than the league average, you can expect to gain about 8 points in the league table.**
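The arithmetic behind these figures can be sketched as follows. The slope, the standard deviation, and the two league averages are the numbers quoted in this post; the helper function is our own illustration, and it assumes the fitted line is an ordinary least squares line with an intercept, which passes through the point formed by the two averages.

```python
SLOPE = 22.6                # points per full Castrol point (from the regression)
LEAGUE_SD = 0.355           # standard deviation of team Castrol scores
LEAGUE_MEAN_SCORE = 7.207   # league average team Castrol score
LEAGUE_MEAN_POINTS = 45.1   # league average points


def expected_points(team_score):
    """Points predicted by the line, assuming it passes through the averages."""
    return LEAGUE_MEAN_POINTS + SLOPE * (team_score - LEAGUE_MEAN_SCORE)


# Gain from playing one standard deviation above the league average.
gain_per_sd = SLOPE * LEAGUE_SD  # roughly 8 points
```

Under that assumption, a team exactly at the league average Castrol score is predicted to finish on the league average of 45.1 points.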

Naturally, the small sample size of 18 teams puts a limit on how far we should push our interpretation of the results – obviously more data would be preferable – but clearly the Castrol Index does a good job measuring effective team play. Mind you, anything else would be odd, given that the index is focused on scoring and preventing goals, and given that we know goals to be strongly associated with points and wins.

So we have seen that the Castrol Index performs very well for the league as a whole. And while correlations of .75 and .76 with wins and points, respectively, sound good, they translate into 56-58 percent of the variance in team outcomes accounted for with the help of the Castrol Index. Looked at from another angle, this means that 42-44 percent of team performance is not accounted for by the indicator.
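The step from correlation to variance explained is simply squaring the correlation. The sketch below also applies the standard adjusted r squared formula, assuming 18 teams and a single predictor, to show how a figure close to the .55 reported in the technical notes can arise. The function names are ours, and because the correlations are rounded, the outputs are approximate.

```python
def r_squared(r):
    """Share of variance explained in a one-predictor linear regression."""
    return r * r


def adjusted_r_squared(r2, n, p=1):
    """Standard adjustment for sample size n and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


r2_wins = r_squared(0.75)     # about 0.56
r2_points = r_squared(0.76)   # about 0.58
unexplained_points = 1 - r2_points  # about 0.42

# Assumes n = 18 teams and one predictor (Team Castrol Score).
adj_r2_points = adjusted_r_squared(r2_points, n=18)  # about 0.55
```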

Why may this be? One reason could be unusual teams – outliers – that over- or under-performed on points and wins, given their measured play.

So let’s examine some specific teams’ performances. The first notable fact is that the LA Galaxy had the best regular season record with a total of 67 points; yet, they were only the 4th best team in MLS according to the Castrol Index. In fact, according to these calculations based on the Castrol Index, Sporting Kansas City was the best team in the league; yet, they only finished 5th in points. Using the Castrol Index, Toronto was by far the worst team in the league, though they outperformed Vancouver and New England on the league table, and significantly outperformed their expected number of points (as indicated by the overall regression line).

There are a couple of possible explanations for these statistical outliers. One is that the Castrol Index isn’t sufficiently precise – that it contains what statisticians call “measurement error” – and only performs well at distinguishing players (and therefore teams) with obvious disparities in talent. But we think this is less likely than the alternative: teams’ inconsistent performances from one match to the next. If LA is always the best team on the pitch, but plays down to the level of its opposition during each match, it may look worse than a Kansas City team that always plays to its full potential. The data used here are pooled over the whole season, whereas points are won one match at a time, so it is likely that comparing teams’ Castrol Index scores at the end of each match would give even better predictions of league standings. At the level of analysis done here, however, this has to remain speculation for the time being.

So where does this leave us? For one, over the course of a season, a team with a better Castrol score does indeed perform better than one with a lesser Castrol score. Fans can safely compare their teams’ performances – though it needs to be said that these analyses do not tell us how valid individual players’ performances are.

To disentangle the data a little bit more, we will take a look at Castrol ratings for players in different positions in the coming days. The results may surprise you.

* It is worth noting that MLS Castrol Index was measured only for regular season matches that occurred in 2011, allowing us to directly compare season results to Castrol Index results without getting extraneous data.

** Some technical notes: When points are regressed on Team Castrol Score, the adjusted r squared for this regression is .55. The Castrol Index also has a lower limit of 3.89, which was hit by some players, so this removes some variation among those players. A separate analysis was performed with those players who had a 3.89 excluded, and the results were nearly identical. Furthermore, though formations have not been taken into account for this analysis, about 75% of starting formations were either 4-4-2 or 4-1-2-1-2 (https://twitter.com/#!/OptaJack/statuses/152466522340720640).