Thursday, June 2, 2011

Comparing the Best Soccer Leagues in the World

This post is a slightly edited reprint of an article in Sports, Inc., a magazine edited by students at Cornell University in Ithaca, New York. To read the full issue, which is full of analytic insights about the major professional sports, click here or on the image below.


The article is a little bit of this, that, and everything from SBTN - some highlights and some lowlights - intended for people with an interest in analytics but not necessarily expertise in soccer. One caveat up front: comparing leagues is inherently tricky business, if for no other reason than that data quality may be uneven and subject to limitations unknown to the analyst. As a result, differences attributed to style of play may reflect data collection techniques rather than real differences on the pitch. In any case, I bet regular readers of this blog will recognize most of it. Enjoy.

------------------------------------------------------------------------------------------------
If you live in the U.S., trying to follow top-level soccer isn’t always so easy. Sure, with MLS we now have a viable and high-quality professional soccer league in the U.S., and it’s lots of fun to go to MLS matches, especially in the new built-for-soccer stadiums. But the truth is that the very best soccer is still played in Europe and will be for some time to come. So one of the perennial questions soccer fans have debated over the years is which leagues are the very best, and how one can tell. To answer that question, UEFA, Europe’s soccer governing body, has been in the business of measuring the quality of leagues. This is meant to take some of the subjective judgments out of the debate, but more importantly, it helps UEFA determine how many teams from each league get a chance to participate in the crown jewel of international soccer competition, the UEFA Champions League.

UEFA does this by calculating a so-called league “coefficient,” which is determined by the results of each league’s clubs in UEFA Champions League and UEFA Europa League games over the past five seasons. UEFA’s most recent (2010) coefficients of the European leagues reveal the following hierarchy: the English Premier League (EPL), Spain’s La Liga, Italy’s Serie A, and the German Bundesliga are currently the top 4 leagues with some distance to spare (with the leagues in France, Russia, Ukraine, Romania, Portugal, and the Netherlands rounding out the top 10). A closer look at the coefficients reveals rough parity between the English and Spanish leagues (with coefficients around 80) followed by Serie A and the Bundesliga (with coefficients around 65). And this sounds about right; if you asked soccer professionals – coaches and players – where they want to work, these are the leagues that would likely rank highest in their minds.

An important and interesting follow-up question for soccer analysts is whether the style and quality of play differ across these four leagues in important ways. At the level of players, the question would be whether moving from one league to another is akin to moving from, say, the AFC East to the NFC West in American football. At the level of teams and managers, the question is whether performance measured in one environment (read: league) is comparable to performance in another – no manager wants to overpay for performance in a league that’s nothing like the one the player is hired into.

One indicator of a league’s quality may be how its teams do in head-to-head competition with teams from other leagues in Champions League or Europa League play. But there is surprisingly little else we know about how leagues compare, and it is difficult to develop very strong prior expectations about what the data might tell us about league differences in style and quality. On one hand, one might expect that leagues’ results reflect different, perhaps national, styles of play and tactics. So, off the bat, one might expect to see fewer shots on goal in countries like Italy and Germany, which are traditionally known for a more defensive style of play, than in countries like England, where teams have traditionally played a more physical game, or Spain, where a more open, offensive, possession-dominated game has predominated. On the other hand, one might argue that these leagues have become so thoroughly internationalized from the youth academies up, with player and manager movement and the diffusion of soccer knowledge across Europe and the globe, that one wouldn’t expect too many differences across the top leagues that could be attributed to “national” styles and soccer cultures.

In what follows, I report some data on league performance on offensive production and fouls and punishment to show that, while soccer at the very highest level follows similar basic patterns, there also are some real differences across the Big Four leagues of soccer.

To make things comparable and recent, I examine data for the last five seasons – that is, from 2005/06 to 2009/10.

Offensive Production

First, here is a look at offensive production across the leagues, measured by the number of goals and shots taken by teams per match. An obvious place to start is to look at the number of goals scored per match.

There is relatively little variation across years and leagues. Statistically speaking, these leagues are extremely “well behaved” and it is difficult to detect trends over time or cross-league differences. Each of the leagues, on average, sees slightly fewer than 3 goals each match each season. We observe the most stability in Serie A, which has only minute variation over time, and in the Bundesliga. The EPL and La Liga have seen slight upward trends in goals, but data for five seasons are probably not sufficient to say if these are long-term trends (the high point came in La Liga’s 2008/09 season at 2.9 goals per match). Overall, virtually without fail, the four big leagues see slightly below 3 goals per average match.

But teams can’t score if they do not shoot, so what do the data reveal about shots taken on goal (SOG) and shots on target (SOT)? One thing to note up front is that, in each of the four leagues, shots on target (SOT) and shots on goal (SOG) are (unsurprisingly) positively correlated with goals and wins. This means that the more teams shoot and the more accurately they shoot, the more they score and the more matches they win. Importantly, shots on target (SOT) are more highly correlated with outcomes than shots on goal (SOG).

Here, again, we see that the leagues are remarkably similar to one another. On average, teams take about 25 shots per match. Over the last five years, the Bundesliga has been the most trigger-happy league with 27.6 SOG, and the EPL the least trigger-happy with 23.2. Serie A and La Liga were in between at 24.4 and 25.2, respectively. And the one notable anomaly seems to be Serie A in the 2005-06 and 2006-07 seasons with only about 20 SOG. Overall, these are small differences around a similar central tendency.

And finally, teams can’t score unless they actually hit the target, so here are the numbers for shots on target (SOT) rather than just shots on goal (the data for the 2005-06 Bundesliga season are missing). Here, we finally see some more distinct differentiation among the leagues, mostly with regard to the English Premier League.

Aside from the one notable and peculiar outlier - Bundesliga clubs were particularly accurate in 2006-07 - the numbers of SOT are quite similar, with one exception: accuracy has gradually and notably gone up in the EPL where it is by now highest among the four leagues. That is, there has been an increase in accuracy in the Premier League, along with the increase in shots taken.

Another way to see this is to calculate the ratio of shots on goal to shots on target - how many shots did teams have to take to yield one shot on target? Here are the ratios, averaged over the past five years:

EPL: 1.87
Bundesliga: 2.46
Serie A: 2.58
La Liga: 2.79
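
One way to read these ratios, sketched here in Python, is to invert them. This assumes that "accuracy" simply means the share of all shots taken that end up on target. The variable and function names are illustrative only, and the figures are the five-year averages listed above.

```python
# Five-year average ratios of shots taken (SOG) to shots on target (SOT),
# copied from the list above.
sog_per_sot = {
    "EPL": 1.87,
    "Bundesliga": 2.46,
    "Serie A": 2.58,
    "La Liga": 2.79,
}


def on_target_share(ratio):
    """Share of shots that are on target, i.e. the inverse of SOG/SOT."""
    return 1.0 / ratio


# Assumption: accuracy is defined as SOT divided by all shots taken.
accuracy = {league: on_target_share(r) for league, r in sog_per_sot.items()}

for league, share in accuracy.items():
    print(f"{league}: {share:.1%} of shots on target")
```

Read this way, a little over half of EPL shots are on target (about 53%), against roughly 36% to 41% in the other three leagues.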

These numbers show that the EPL clearly stands out: the league is more efficient than the others when it comes to shot accuracy, and the difference from the other leagues is distinct. While shooters in the EPL are slightly less trigger-happy than shooters elsewhere, especially in recent years, they need fewer shots to create shots on target. And when we combine the shots on target trend with the accuracy ratio, it is clear that the EPL has outpaced the other leagues in recent years. Is that enough to say that it is different from the other leagues? By and large, the EPL is quite similar to the other leagues - so far as goals and overall shots are concerned - but hitting the target is one of the things that make it distinct.

When we put all these things together in one graph to show the various ratios of goals and shots (overall and on target), the distinctions among the leagues become more obvious (using data from the 2009-10 season).

Across the Big Four, the goal/shot ratios are virtually identical and reminiscent of Charles Reep’s ratio of one goal in nine shots on goal (.111) (Reep and Benjamin 1968). Despite this essential similarity, there are sizable differences in shot accuracy and conversion efficiency across them. In fact, the EPL and La Liga couldn’t be more different despite their virtually identical goal/shot ratios. In the EPL, we see lots of high value shots (the highest SOT/Shots ratios), but low conversion (the lowest goals/SOT ratios). In La Liga, we see the lowest proportion of accurate shots, but the highest conversion rates. Finally, the Bundesliga and Serie A are similar to one another in that they have more accurate shooting than in La Liga, but lower conversion rates than the Spanish league.
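
The arithmetic behind this pattern is that the goal/shot ratio is the product of two parts: the share of shots that are on target, and the share of on-target shots that become goals. The minimal sketch below uses made-up round numbers, not the leagues' actual figures, to show how two leagues can arrive at the same goal/shot ratio by different routes. The function and league names are hypothetical.

```python
def goals_per_shot(sot_per_shot, goals_per_sot):
    """Goal/shot ratio as accuracy (SOT/shots) times conversion (goals/SOT)."""
    return sot_per_shot * goals_per_sot


# Hypothetical leagues, NOT real data: one is accurate but converts few of
# its on-target shots, the other is less accurate but converts more of them.
accurate_league = goals_per_shot(sot_per_shot=0.50, goals_per_sot=0.22)
clinical_league = goals_per_shot(sot_per_shot=0.40, goals_per_sot=0.275)

print(f"accurate league: {accurate_league:.3f} goals per shot")
print(f"clinical league: {clinical_league:.3f} goals per shot")
```

Both hypothetical leagues come out at about 0.11 goals per shot, close to Reep's one in nine, even though one puts half of its shots on target and the other only two fifths.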

These findings suggest that the quality of forward play in the EPL is higher in that teams manage to take more accurate shots (though EPL strikers, on average, take fewer shots overall). At the same time, La Liga play stands out offensively because of the high conversion rate we see in the league. Whether the EPL’s lower conversion rate is due to better goalkeeping in the EPL or to weaker (though accurate in the sense of hitting the goal) shooting there cannot be answered with these data.

Fouls and Cards

Another way to evaluate the style of play is to consider how many fouls teams commit or how much punishment referees have to mete out. These can be taken as indicators of style of defensive play in the case of tactical fouls intended to interrupt the flow of the game, but also of how physically tough and dangerous a league is. When counting up fouls, however, there’s a thorny definitional issue. The official statistics we have from box scores and various other published sources include only fouls that are called by the referee, not necessarily those that were committed. Counting how many times referees blow the whistle for a foul and a card is not the same as counting actual fouls or correct punishment. Assuming that these errors cancel out on average - that too many fouls called on one randomly drawn team are offset by too few called on another - below is the total number of fouls called over the past five seasons.

As the data show, there is quite a range in how busy referees are. The totals range from fewer than 9,000 fouls called in the 2008/09 EPL season to almost 15,000 in the 2005/06 La Liga season and the 2007/08 Serie A season. Among other things, this suggests fewer interruptions to the game in Germany and England than in Italy and Spain or, put differently, a more fluid, continuous style of play. Over the 2005/06-2009/10 seasons as a whole, the average numbers of fouls per match were:

Bundesliga: 36.46
EPL: 24.63
La Liga: 37.41
Serie A: 35.09

Again, the EPL looks distinctly different from the rest of the pack (the low foul totals for the Bundesliga shown in the graph are virtually entirely due to the fact that there are fewer teams [18] and therefore fewer matches played in that league). Clearly, fewer fouls are called in the Premiership. The data show that play is interrupted just for a foul (aside from all the other interruptions that happen in a match) roughly every 3.5 minutes in the Premier League and every 2.5 minutes in the other leagues. At the level of individual teams, this means that teams in the Premiership are called for fouls an average of 12 times per match, while teams in the other three big leagues foul a whopping 50% more at an average of about 18 times per match. This statistic is particularly interesting in light of the fact that commentators commonly talk about the alleged physical play in the EPL. Perhaps by that they mean that fouls are committed as often there as elsewhere but simply not called as much. This could be the case, of course, or there may simply be fewer fouls in the Premiership than anywhere else.
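
The per-team and minutes-per-foul figures can be re-derived from the fouls-per-match averages listed above. The sketch below assumes a 90-minute match with stoppage time ignored, an even split of fouls between the two teams, and a simple unweighted average across the three non-EPL leagues. None of these choices is stated in the article, and the names are illustrative.

```python
MATCH_MINUTES = 90  # assumption: regulation time only, stoppage time ignored

# Average fouls called per match, 2005/06-2009/10, from the list above.
fouls_per_match = {
    "Bundesliga": 36.46,
    "EPL": 24.63,
    "La Liga": 37.41,
    "Serie A": 35.09,
}


def minutes_between_fouls(fouls):
    """Average minutes of play between two fouls called."""
    return MATCH_MINUTES / fouls


def fouls_per_team(fouls):
    """Fouls called on one team, assuming both teams share the total evenly."""
    return fouls / 2


# Assumption: unweighted mean of the three non-EPL leagues.
others = [v for league, v in fouls_per_match.items() if league != "EPL"]
others_avg = sum(others) / len(others)

epl_gap = minutes_between_fouls(fouls_per_match["EPL"])
others_gap = minutes_between_fouls(others_avg)
epl_team = fouls_per_team(fouls_per_match["EPL"])
others_team = fouls_per_team(others_avg)
extra_share = others_team / epl_team - 1

print(f"EPL: one foul every {epl_gap:.1f} min, {epl_team:.1f} per team")
print(f"Others: one foul every {others_gap:.1f} min, {others_team:.1f} per team")
print(f"Teams elsewhere are called for {extra_share:.0%} more fouls")
```

Under these assumptions the sketch gives about 3.7 minutes between fouls in the EPL, a little above the 3.5 minutes quoted above, and about 2.5 minutes elsewhere. It gives roughly 12 fouls per team in the EPL against 18 elsewhere, a gap of about 47%.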

Along with fouls, does football punishment get meted out equally across leagues?

One easy way to see if there are patterns and to quantify their size is to look at yellow cards - a common enough occurrence in a match to yield some interesting and sufficient data. So here are trends in yellow cards since the 2005-06 season per team/match.

Overall, teams see about two yellows per match played. But clearly, referees in some leagues more easily pull out the card than in others. In particular, refs in La Liga give significantly more yellows than refs in the Premier League, but also than in Serie A, a league with similar foul totals. La Liga’s 2.5 yellows per team/match easily dwarf the Premiership’s roughly 1.5 cards. Whether this reflects differences in playing style, instructions from the league, training of refs, or more skillful diving in Spain’s top league is unclear, but punishment is clearly not meted out equally. We see consistently more yellows over the years in Spain and Italy than in England and Germany. We also see the fewest yellow cards in the EPL, consistent with the pattern of fouls called.

They’re the Same, Except When They’re Not, and the English Premier League Really Is Different

The data reviewed above provide some descriptive evidence for two basic conclusions. First, the highest quality soccer leagues in the world are remarkably similar in important ways. On common metrics of offensive production like goals scored, shots on goal, or the goal to shot ratio, the leagues are very similar. Second, lurking underneath these basic metrics, we see that the English Premier League is different from the rest in key ways: play is interrupted less frequently because of fouls, there are fewer delays on the field because of yellow cards awarded, and shots on goal are significantly more likely to be on target, though less likely to be converted into goals when they are, than in the other three leagues. Taken together, this suggests a faster, more continuous, and more exciting pace of play that viewers value. For players coming into the league, this suggests that they cannot count on refs to stop play, and the ability to keep going despite a tackle or challenge from the opposition is a key ingredient for EPL success. As well, EPL managers will be on the lookout for accurate shooters more than managers in other leagues, as well as for defenders and goalkeepers who know how to play together to turn away accurate shots after they’ve been taken (for example, after a set play like a corner or free kick).

Next time you have a chance to watch a Premier League and a Serie A match side by side, see if your own eyes confirm what these data just told you. But the beauty of the game, and whether this is better soccer, lies in the eye of the beholder.


Reference

Reep, C. and Benjamin, B. (1968). "Skill and chance in association football." J. Royal Statistical Society A 131: 581-585.