Sunday, February 27, 2011

Should Arsenal Have Won? Arsenal v. Birmingham City Performance Comparisons

Here's my quick and dirty and not particularly considered post-mortem to the Carling Cup final between Arsenal and Birmingham. Going into the match, Arsene Wenger pretty much guaranteed a victory. It didn't happen, and we can speculate about why. Some will point to the absence of Fabregas and Walcott, some will argue that it's about nerves, some will point the finger at Arsenal's Achilles heels (defending and especially goalkeeping), and so on.

So what do the numbers tell us? Here are some of the key stats from the box score, according to the BBC.

Possession: Arsenal 56% - Birmingham
Shots: Arsenal 20 - Birmingham
Shots on target: Arsenal 12 - Birmingham
Corners: Arsenal 6 - Birmingham
Fouls: Arsenal 11 - Birmingham

As anyone who watched the game saw, Arsenal outplayed Birmingham in virtually every respect, and Alex McLeish owes Ben Foster a huge thanks for his outstanding display. Arsenal were clearly the better team.

None of this came as a surprise. Take a look at the clubs' stats before this match over the course of the Premier League season so far. Below are offensive and defensive stats for the two teams as of February 23, 2011.

Clearly, Arsenal dominate Birmingham on every dimension. They take more shots, more accurate shots, and score more as a result. In fact, they take as many accurate shots in a match as Birmingham takes total shots. 
We can see similar performance differences on defense. Arsenal allow their opponents to take fewer shots, degrade their accuracy to a greater extent, and don't allow their opponents to score as much (1 v. 1.36, respectively). So Arsene Wenger could be forgiven for thinking that, under normal circumstances, Arsenal should win. And they should have.

End of story, right?! Well, not exactly; at least it's not the complete story if you ask me. Here's why:

Thursday, February 24, 2011

Princes and Paupers: How Pay Inequality Makes Teams Worse, Or Why Paying Thierry Henry Lots Of Money Is A Bad Idea

Following the recent debates about the record-setting transfers in the EPL and the Bundesliga, and with the new MLS season about to get under way, I've been thinking about how money relates to success in soccer.

Of course, one big question about transfers is always whether they are worth it; this year is no exception, and the interesting discussions in (and about) Tomkins, Riley, and Fulcher's Pay As You Play* show that there is room for disagreement about how much clubs can buy success, either by buying great players (transfers) or paying great players lots of money (wages).

One way to think about money is to take wages and transfer costs as proxies - as imperfect but reasonable measures - of player (and when you put them all together, team) quality. Turns out, money is a great proxy: as the various analyses have shown, it does an amazing job predicting the success of clubs each season.

But there's another side to money worth thinking about: how much can players be incentivized by monetary compensation for their services? After all, buying players and paying them oodles of cash is one thing, getting them to score on demand (Gomez, Torres?) or to win as a team is another (Man City).

Lots of people assume that money motivates people to perform: paying someone more should lead to better performance. So why doesn't it always work that way? As it turns out, and as psychologists and scholars of organizations have long recognized, this kind of extrinsic motivation - motivation by monetary reward - can only go so far. Ultimately, intrinsic motivation - motivation driven by interest or enjoyment of a task - seems to be the more important factor in driving performance. Sure, money helps, but the effect of wages on performance diminishes at the high end of compensation - paying someone 75k a week instead of 65k is unlikely to make them play harder or run faster.

So if that's true, and if superstars can't win matches all by themselves - that's one of the great and annoying things about team sports if you ask me (or the LA Galaxy) - then what motivates the rest of the team, the guys you need to put the superstars in the best position to help the team win? Turns out, they can be motivated by money, but in a very different way.

Tuesday, February 22, 2011

Who Imports Players? Comparisons Across Leagues

Who imports soccer players? Turns out, the leagues most dependent on foreign labor aren’t necessarily the obvious ones. Here’s a nifty graph from The Economist, showing the percentages and absolute numbers of foreign players in each league. These data come from a report released by the Professional Football Players Observatory, an academic research group.

Cyprus is the biggest importer of football labor: 72.3% of players in the Cypriot first division are foreign. But Cyprus is tiny, with a population of less than 800,000, so maybe there just aren't enough players to field truly competitive teams (and international competitiveness is something that particular country may care about for political and economic reasons). In contrast, the former Yugoslav republics (Croatia and Serbia) have very low levels of imports at 18% and 11.5%, respectively, perhaps reflecting their more general isolation from the rest of Europe.

But back to size. It doesn't seem to explain patterns in footballing imports. We see almost twice the level of imports in England (58.4%) as in France (29.5%), though these two nations are roughly the same size. And clearly, of the top leagues, the Premier League has the largest share of foreigners (even without the 50 or so players from elsewhere in the UK) at almost 60%, followed by Portugal (56.4%), Greece (55.8%), and Belgium (50.9%).

This means that fewer than 50% of players in the EPL are from the UK. Another way to look at these data would be to say that home-grown players get to play the most in Spain’s La Liga (where 62.1% are domestic players) and France's Ligue 1 (where 70.5% are French). Of course, these numbers don't tell us about starters versus benchwarmers, but still.

What's also interesting about the report's figures is that France is the biggest exporter of footballers in Europe, with 261 players. How can this be? For one, I suspect that the number of foreign-born players is deflated by French citizens of African descent from places like Senegal, Cote d'Ivoire, etc., but the exporting number may not be. Lots of French players ply their trade in a number of neighboring countries with excellent leagues (like Belgium, Switzerland, etc.), but also in the top leagues, like Germany and England (where they constitute the single largest import with 49 players, more than Scotland and the Republic of Ireland combined).

Where do most players come from? Brazil serves the world of professional football as the biggest supplier overall: 577 Brazilians are active in Europe’s top divisions - curiously there are more Brazilians playing in Cyprus than Germany. But in terms of overall patterns, you can see linguistic affinities (more than half of all Brazilians play in Portugal and most Argentines go to Spain), but also neighborhood effects: Russians and Ukrainians go to Belarus, Germans to Austria, and Swedes to Denmark.

Is there a connection between league quality and the percentage of foreign-born players? I suspect there is, with the best leagues attracting the highest percentage of the best players from all around the world. Or, conversely, great players, wherever they come from, make leagues better. I'll take a look at that conjecture in a future post. Until then, enjoy the United Nations of Football.

Friday, February 18, 2011

The Goal Value of Corners: Zero

Corners are funny events; not funny as in "ha ha" but funny as in "what do they tell us about the game?".

Let's think of them as measures of offensive production for a moment. Teams playing lots of offense should not see too many corners on their own side of the field. Instead, teams that press, have lots of possession, pass forward a lot and take lots of shots on goal should, by implication, see some of the balls diverted by goalies or defenders to produce corners.

So corners should be correlated with shots on goal, right? And they, in fact, are - across the big leagues and the years. Take a look at this graph of the relationship between the number of corners in a match and the number of shots on goal for the 2005/06 to 2009/10 seasons.

The graph paints just the kind of picture you would expect. Yes, there are some variations across the leagues, but more shots are associated with more corners (and the other way around). The Pearson correlations range from .44 in La Liga to .51 in the Bundesliga. On average, teams that take 10 shots or fewer can expect to see 5 or fewer corners; teams that take about 15 shots get about 10 corners and so on. At the high end, the correlation peters out and becomes more variable.

So this should mean that teams that shoot more and get more corners also manage to score more goals, right? Wrong, it turns out. When we quantify the offensive "value" of corners, we find that corners do not have much of a goal value; in fact, across the leagues, goal totals do not increase with corner totals. The correlation is essentially 0 (it's strongest in the EPL at .06, and weakest in Serie A and La Liga at less than .01). Take a look at a graphical representation of this pattern, keeping in mind that the average number of goals a team scores per match is around 1.3 across leagues and seasons.

Turns out, knowing that a team has generated lots of corners does not improve your prediction of how many goals that team will score (in fact, as the number of corners increases, the goal totals become increasingly variable).
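For anyone who wants to run this kind of check on their own match data, here is a minimal sketch of the Pearson correlation between per-match totals. The rows below are invented placeholders, not the data behind the graphs, and `pearson_r` is just an illustrative helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and the two sums of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical team-match rows: (corners, shots on goal, goals).
matches = [(3, 8, 1), (5, 11, 0), (7, 14, 2), (4, 9, 2),
           (9, 17, 1), (6, 12, 1), (2, 6, 0), (8, 15, 3)]
corners = [m[0] for m in matches]
shots = [m[1] for m in matches]
goals = [m[2] for m in matches]

print("corners vs shots:", round(pearson_r(corners, shots), 2))
print("corners vs goals:", round(pearson_r(corners, goals), 2))
```

A value near 0 for corners and goals means that knowing one total tells you next to nothing about the other, which is the pattern described above.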

Surprising? Well, a little, but not if you consider how teams create corners. Oftentimes, teams are awarded a corner precisely because they did not manage to score (so perhaps the relationship should be negative). But in any case, balls that do not cross the goal line between the posts but next to them turn into corners.

PS: Mind you, these data cannot tell us the odds that any one corner yields a goal - they are simply match totals.

Thursday, February 17, 2011

Firing the Manager: A Study of the Dutch Eredivisie

Here is the promised follow-up to the earlier post about the effectiveness of sacking a manager. Remember that managerial resignations are common - as the study* by Dutch economist Bas ter Weel shows, between 1986 and 2004 in the Eredivisie, about 40-50% of managers left each season, and many of these were forced resignations. Below is a graph from the ter Weel study I mentioned in the earlier post.

Source: ter Weel (2011)
And look around this year (or any given year): so far this season, 5 of 20 managers in the EPL have been sacked, and 5 of 18 Bundesliga managers have (I don't have the La Liga or Serie A numbers in front of me, but I'd be surprised if they were far off).

So, does it help to sack the manager? A number of studies have suggested that it does, but ter Weel disagrees, in large part because prior studies have not systematically accounted for the selection bias involved in managerial sackings: teams that sack their manager aren't selected at random, sackings don't occur at random intervals but because the team's performance has been declining, and bad teams don't get to hire the very best managerial talent.

What does this mean? It means it's important to construct the comparison group - the control group, experimentally speaking - that does not receive the experimental "treatment" of a managerial sacking in a way that makes this group resemble the group that does receive the "treatment" as closely as possible. Using a so-called difference in differences statistical technique, the key is "to examine the effect of some sort of treatment by comparing the treatment group after treatment both to the treatment group before treatment and to some other control group" (Wikipedia).
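In its simplest two-group, two-period form, the difference in differences idea boils down to one subtraction. The sketch below is only that textbook version, with invented numbers; it is not the specification ter Weel estimates in the paper, and `diff_in_diff` is a made-up helper name.

```python
from statistics import mean

def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change

# Hypothetical relative-performance values (1.0 = the team's season average).
sacked_before = [0.6, 0.7, 0.5]   # teams that fired their manager, during the dip
sacked_after = [1.0, 0.9, 1.1]    # the same teams under the new manager
kept_before = [0.6, 0.6, 0.6]     # teams with a similar dip that kept their manager
kept_after = [1.1, 1.0, 1.2]      # the same teams a few matches later

effect = diff_in_diff(sacked_before, sacked_after, kept_before, kept_after)
print(round(effect, 2))
```

In this made-up example the sacked teams improve by 0.4 and the control teams by 0.5, so the estimated "effect" of the sacking is negative even though the sacked teams got better.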

Once you do this, you see some interesting results that run counter to much conventional wisdom. Here's the key graph from the published study. It shows relative performance measured as a moving average of four game results divided by the season’s average to allow for comparisons across teams. It is based on 81 forced resignations, 103 voluntary departures, and 212 performance dips that serve as a control group.
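To get a feel for the performance measure, here is a rough sketch of a four-match moving average divided by the season average. It assumes results are scored as league points (3/1/0) and that the window trails the current match; the paper may define either detail differently, and `relative_performance` is a hypothetical name.

```python
def relative_performance(points, window=4):
    """Trailing moving average of match points divided by the season average."""
    if len(points) < window:
        raise ValueError("need at least one full window of matches")
    season_avg = sum(points) / len(points)
    if season_avg == 0:
        raise ValueError("season average is zero, ratio is undefined")
    ratios = []
    for i in range(window - 1, len(points)):
        window_avg = sum(points[i - window + 1 : i + 1]) / window
        ratios.append(window_avg / season_avg)
    return ratios

# Hypothetical run of results: four losses followed by four wins.
season = [0, 0, 0, 0, 3, 3, 3, 3]
print(relative_performance(season))
```

Values below 1 mean a team is doing worse than its own season average over the last four matches, which is what makes dips comparable across strong and weak clubs.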

Source: ter Weel (2011)
A few results stand out (and if you're interested in all the gory statistical details, the study is full of them): first, in line with conventional wisdom, turnover does happen after teams experience performance dips and teams' performance increases after a forced managerial resignation. But, secondly, and this is the important part, new managers do not do better than managers left in charge after performance dips similar to those experienced by teams who fired their boss. While they do better their very first match out with the new team, they underperform teams that are on a similar performance trajectory but don't sack their manager. To quote from the paper:
"What is clear is that performance increases after one period are significant but that the new manager performs worse compared to the control group in the next three periods he is in charge."
Avram, quo vadis?
The explanations for this are not all that hard to imagine. Sure, new brooms sweep clean: changing the manager produces a (very) short-term bounce, perhaps brought on by the fact that players no longer have an easy alibi for underperforming. But it's important to remember that teams' fortunes aren't unlike the business cycle - there is variation around the overall performance level, and times of playing well are followed by matches that aren't going so well (think of Chelsea early in the season experiencing a boom and later in the season experiencing a recession, so to speak). Since managers are typically fired during a recession, the tendency back toward a team's mean level of performance can easily fall into the first few weeks of a new manager's tenure. But of course, this may have nothing to do with the manager's work, and everything to do with good timing. And, as ter Weel's study shows, it may actually make things worse, compared to what could have been, had the club stuck it out.
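The "recession followed by recovery" point can be illustrated with a toy simulation. This is purely my own sketch, not anything from the study: every match has the same made-up win/draw/loss probabilities, so there is no manager effect at all, and `simulate_bounce` and its parameters are invented for illustration.

```python
import random

def simulate_bounce(n_seasons=2000, n_matches=34, window=4,
                    dip_threshold=0.5, seed=0):
    """Average points per match during a dip and in the matches right after it."""
    rng = random.Random(seed)
    dip_avgs, next_avgs = [], []
    for _ in range(n_seasons):
        # Identical odds for every match: results are pure luck around a fixed level.
        pts = rng.choices([3, 1, 0], weights=[0.35, 0.30, 0.35], k=n_matches)
        for start in range(n_matches - 2 * window + 1):
            dip = sum(pts[start:start + window]) / window
            if dip <= dip_threshold:
                after = sum(pts[start + window:start + 2 * window]) / window
                dip_avgs.append(dip)
                next_avgs.append(after)
                break  # count only the first dip in each simulated season
    if not dip_avgs:
        raise ValueError("no dips found, try more seasons or a higher threshold")
    return sum(dip_avgs) / len(dip_avgs), sum(next_avgs) / len(next_avgs)

dip_avg, next_avg = simulate_bounce()
print("points per match during the dip:", round(dip_avg, 2))
print("points per match right after:   ", round(next_avg, 2))
```

Even though nothing changes between the two windows, the matches after a dip look much better than the dip itself, simply because the dip was selected for being unusually bad.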

There is an additional benefit to keeping a manager: you save severance pay. This seems to be the strategy a team like West Ham is following. But as of today, the bookies have Grant as the odds-on favorite (by a mile) to be the next EPL manager to be fired (followed by David Moyes and Roberto Martinez). Honestly, I don't believe any of them deserve to get the boot, so I hope for all of them that things work out in their favor.

* Bas ter Weel, "Does Manager Turnover Improve Firm Performance? Evidence from Dutch Soccer, 1986–2004." De Economist: Netherlands Economic Review 2011 (in press)

Tuesday, February 15, 2011

Defending Your Territory: Are Hormones (A) Key To Understanding Home Field Advantage?

I thought it was, well, endearing to hear Jack Wilshere's take on how Arsenal can beat Barcelona in their Champions League matchup this week:
"This year we have to get in their faces and show them what we're all about. When we have the ball, we've got to keep it as well as they can. We've got to change our game a bit to play against Barcelona – we'll learn from last year, but we need to get in their faces and, if you like, be a bit nasty, in a footballing sense, to get the ball back."
Clearly, Wilshere has a theory about how Arsenal needs to play to win at home: play aggressively, dominantly, "in their face" as it were. So why did Wilshere's interview make me chuckle? Honestly, I can think of lots of adjectives to describe Arsenal's football, but "macho" or "aggressive" or "dominant" aren't the first ones to come to mind. So heading into Wednesday's match-up, I'm wondering: how much aggressiveness can really be engineered?

One answer to this question can be found in academic research on the home field advantage.* Lots has been written about the reasons teams do better at home than on the road. We see it in every league and every season, virtually without fail (though there are of course cross-team differences in home field advantage).

For example, take a look at the home-away differences in goals per match I pulled together from data for the first half of this year's season. While average goal levels vary across leagues, home teams outscore away teams. And this advantage can be documented on a number of different dimensions (goals, shots, fouls, etc.).

The enduring question has been why we see these patterns. A number of explanations have been proposed, including crowd effects, referee bias, travel effects, rule changes, tactics, and a few more.

While the sources of this effect have been hard to nail down with any degree of certainty, one of the more innovative and interesting explanations I've come across has to do with human physiology, and specifically hormones (testosterone). According to the authors of a study* linking testosterone and home field advantage, the key explanation is that of "territoriality",
"One explanation that has received little attention is that of territoriality, the protective response to an invasion of one’s perceived territory. Territoriality is prevalent amongst many animal species, which typically display agonistic behaviours and attack more readily and with greater vigour when defending a home territory. Several studies have shown a ‘home advantage’ for an animal when its territory is threatened or attacked, even when the defender is smaller than the rival, suggesting an important motivational incentive in territorial defense." (Neave and Wolfson, 2003)
The authors (Neave and Wolfson) set out to study the effect of territorial behavior by investigating "possible changes in testosterone levels of football players dependent upon match venue." The idea is that higher levels of testosterone are linked to more competitive (and perhaps aggressive) behavior on the field.

Clever idea, and the authors come to some interesting findings, based on the testosterone levels of different samples of professional and semi-professional footballers in England before matches at home, away, and during practices. One of the important things to know about this is that testosterone levels, while naturally produced by the body, are subject to environmental influences (like playing at home or away).

Below are a few of the key findings.

Monday, February 14, 2011

Sacking the Manager: Does It Make Sense? And How Would We Know?

It's easy to be distracted by some of the soap opera-like goings-on we see in professional soccer; every year (week, day?), there's plenty of gossip and bad behavior on and off the field. Some of it is even entertaining and interesting. But it's easy to lose track of some of the fundamentals of the game when we pay too much attention to that sort of stuff.

So here's something I didn't realize while busily reading about Steve McClaren's exit from Wolfsburg or the (alleged) fact that Australia's Newcastle Jets will offer Man U a cash plus racehorse (yes, racehorse) deal for Michael Owen this summer.

I don't know about you, but I thought it was remarkable that we headed into last weekend's matches with all three teams in the EPL's relegation zone still managed by the managers who started the season. That's right: despite all the pressure for immediate success, none of these clubs (West Ham, Wolves, and Wigan - the three W's, as it were) have fired their coaches. And of the three teams right above the relegation zone, only one has had a change of manager (WBA), with lots of folks scratching their heads as to why that particular firing occurred. While this particular pattern may not hold across leagues (e.g., Koeln and Stuttgart already fired their coaches earlier this season and Moenchengladbach and Osasuna just this weekend), if you look carefully, you'll find as many underperforming clubs who are sticking it out as you find clubs who filed for a quick divorce.

So who's right? Does it pay to fire the manager mid-season or is it better to stick it out? Underlying all these moves (and non-moves) are acute pressures individual clubs face, but also, I would hazard to guess, theories clubs have about whether sacking managers and installing new ones brings success (or relief from fans and the media). Well, does it?

Turns out, it's really hard to know for sure. Over the years, there has been ongoing, but sporadic interest by academics in understanding the effects of manager hirings and firings. Invariably, these analyses have suffered from a variety of difficulties in nailing down "manager effects." A recent paper* by Dutch economist Bas ter Weel helps us understand why it is so hard to figure out if sacking the manager makes sense and how we could get closer to knowing for sure. ter Weel lays out these analytical challenges very nicely. Among the more interesting ones (to me) are:

First, before we even think about whether firing one manager and hiring another improves a team's chances, why would we think that managers matter in the first place?

An argument can be made that team success over the course of a season is determined primarily by the structure and conditions provided by clubs (things like capital, infrastructure, scouting and development, fanbase and stadium, etc.). Call it the Redknapp theory; in the run-up to the Champions League clash with AC Milan, Harry was quoted yesterday in The Guardian as saying that Tottenham's success isn't really up to him:

"It's like everything, you've got to keep improving if you want to compete with the top clubs ... I think the owners want to compete so it's up to them. They hold the key more than the manager. If the right people come along, I think the club would finance it, and if they do that for the next few years the sky is the limit for Tottenham."

If that is the case, the work of individual managers with particular skills won't matter, unless they are somehow able to impose their management style on the club and use these resources differently than they were used before. Think about the difficulties that managers sometimes have working with a club with a particular structure, culture, and fan base - think Benitez at Inter or Mourinho at Real or Hughton at Newcastle - or how some managers are able to impose their own style - think Coyle at Burnley and now Bolton or van Gaal at Bayern. Only if it is the latter should we be able to see "manager effects" on performance. Perhaps managers know this, and that is why some of them insist on hiring all or most of their own staff. But the bottom line is this: maybe managers don't matter nearly as much as we think, and it's all about the club's infrastructure, investment (think squad costs and transfer markets), and upper level management. In this case, it may still be important to hire the best manager you can get for the money you're willing to pay, but firing them may not make a lot of sense.

But clearly, manager turnover is high, and managers get fired all the time, as the data from ter Weel's study of the Eredivisie show. Below is a graph of the frequency of resignations between 1986 and 2004. It shows that, during the average season, about 50% of teams changed managers, and about 44% of these were forced resignations. So clearly, clubs consistently seem to believe that "new brooms sweep clean." Well, do they? This is where the second analytical challenge comes in.

Source: ter Weel (2011)
Assuming that there could be a manager effect, it is difficult to determine conclusively whether firing a manager mid-season is better than, say, between seasons. It is hard to nail down statistically what exactly these effects may be because of what statisticians call "selection bias." Selection bias, in the words of Wikipedia,

Friday, February 11, 2011

Raking It In: Football Wealth in 2010

As you probably know, I'm a big fan of communicating complex and statistical information graphically. So you won't be surprised to hear that I'm also a big fan of The Economist, the masters of the interesting and thought-provoking chart. Here's a graph from the magazine showing a ranking of "football wealth" based on the just released "Football Money League" report by consultancy Deloitte.

(c) The Economist
It shows the "league table" of Top 12 clubs, based on revenues generated during the 2009-10 season. As in previous years, Real Madrid comes out on top, and the top six clubs were unchanged from the year before. Though 2 La Liga clubs (Real Madrid and Barcelona) sit atop the table, they are the only Spanish clubs on the list. 50% of the clubs in the Top 12 are from the Premiership, 3 (or 25%) are from Serie A, and only one (Bayern Munich) is from the Bundesliga - the league that, overall, seems to be healthiest. Top mover: Manchester City, from 20 to 11.

There are some other noteworthy differences across clubs and leagues that are worth talking about another time - but for today, I'll mention only the significant differences in terms of revenue generated from commercial activities: Bayern leads the pack in terms of revenue generated from this source at over €170 million - and it's the largest percentage of the club's overall revenue (over 50%). One final note: it's interesting to see what's not on the graph: there are no French or Dutch clubs in the Top 12 (though Lyon and Marseille do make the Top 20 at 14 and 15, respectively).

Stay tuned for more detailed analyses before too long.

Wednesday, February 9, 2011

Why England Lose: Not Enough Qualified Coaches?

Here's an interesting (and perhaps useless) factoid of the day. Don't know how and why I missed this last summer, but I was astonished to read just today about the huge discrepancy in the numbers of highly qualified coaches across European soccer nations, and in particular England's relative and absolute backwardness.

According to a story from the Guardian newspaper published right before the World Cup, "UEFA data shows that there are only 2,769 English coaches holding Uefa's B, A and Pro badges, its top qualifications. Spain has produced 23,995, Italy 29,420, Germany 34,970 and France 17,588." I don't know about you, but I find these numbers astonishing. Not only do Germany and Italy have more than ten times as many of these coaches as England, the ratios of coaches to players are just as bad or even worse. Consider these ratios of UEFA-qualified coaches to active players:

Spain 1:17
Italy 1:48
France 1:96
Germany 1:150
Greece 1:135
England 1:812

No, that's not a typo: Spain's ratio is 1 coach to 17 players; England's is an incredible 1 to 812. The Spaniards may be overdoing it (but maybe not, since they're World Champs), but even the Germans have a 1 to 150 ratio.

Or consider this: In 2009, 115 English coaches had UEFA's pro license; in Spain there were 2,140. This translates to ratios of available Pro-licensed coaches to players of 1:190 in Spain, 1:19,565 in England.
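As a quick sanity check on these ratios, you can back out the number of active players each one implies by multiplying coaches by players per coach. The inputs below are the figures quoted above; the implied player counts are my own derivation, not numbers reported in the Guardian story, and `implied_players` is an illustrative helper.

```python
def implied_players(coaches, players_per_coach):
    """Active players implied by a coach count and a 1:N coach-to-player ratio."""
    return coaches * players_per_coach

# (badge-holding coaches, players per coach) as quoted in the post.
badges = {
    "Spain": (23995, 17),
    "Italy": (29420, 48),
    "France": (17588, 96),
    "Germany": (34970, 150),
    "England": (2769, 812),
}
# (Pro-licensed coaches, players per Pro-licensed coach) as quoted in the post.
pro_licences = {"Spain": (2140, 190), "England": (115, 19565)}

for country, (coaches, per_coach) in badges.items():
    print(f"{country}: about {implied_players(coaches, per_coach):,} players")

# Both sets of ratios should point to roughly the same player base per country.
for country, (coaches, per_coach) in pro_licences.items():
    from_badges = implied_players(*badges[country])
    from_pro = implied_players(coaches, per_coach)
    print(f"{country}: {from_badges:,} (all badges) vs {from_pro:,} (Pro licence)")
```

For England and Spain the two calculations land within a fraction of a percent of each other, so the ratios are at least internally consistent with the coach counts.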

The Telegraph, Source: PA
Capello, Mourinho, Martinez, Benitez, Mancini, (Di Matteo - until last week), Ancelotti, Houllier, Grant? Certainly England is an attractive place to coach, but perhaps there is a supply issue with English managers. And come to think of it, before Steve McClaren's recent stint at Wolfsburg, no English coach had ever worked in the German Bundesliga.

There may be reasons why English coaches do not seek UEFA badges that I am unaware of - please enlighten me if you know - or perhaps we are systematically over- or under-counting active players in some of the countries. But assuming they're not far from the truth, these numbers have to matter, and matter at all levels, from youth player development to the professional leagues.

I'm sure folks in the League Managers Association and elsewhere are deeply worried about this. After all, England used to export coaches to far flung corners of the world and helped found and nurture clubs in places as different as Argentina and Austria. Without them, the game wouldn't have spread around the world in the way that it did. So what happened? I'm not sure, but it seems high time to do something about it.

Tuesday, February 8, 2011

The Point Value of Goals: Does It Matter If A Team Is Ahead Or Behind?

Here's another installment of "what is the point value of goals?" It's one thing to see how many points are associated with different numbers of goals, or to see if first or second half goals generate more points (they don't). But these analyses leave open the question of whether goals produce different amounts of points, depending on the situation the team finds itself in - whether it is behind, tied, or ahead.

To see if this is the case, I calculated the point values of goals, depending on the score at halftime. As before, these calculations are based on data from the Big 4 leagues and for the last five seasons (2005/06 to 2009/10).

As they do on many other dimensions of soccer statistics, the leagues look very similar to one another. And at first glance, it appears that goals are most valuable when teams are ahead - that is, scoring no or just one goal when ahead still gives teams an expected point value of 2 to 2.7 points, while scoring no goal or just one when behind produces an expected point value of 0 to about .3 points. But this will hopefully surprise no one - teams that are ahead at the half are more likely to win (about 75% of them do), so these numbers aren't really reflections of the value of the goals but simply a reflection that teams that are behind lose more and teams that are ahead win more.
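Here is a minimal sketch of how a calculation like this can be set up: group each team-match by the halftime situation and the number of goals scored after the break, then average the points won. Reading "goals" as second half goals is my assumption (a team that is ahead at the half has already scored), and the records and function names are invented for illustration.

```python
from collections import defaultdict

def halftime_state(ht_for, ht_against):
    """Classify the team's situation at the break."""
    if ht_for > ht_against:
        return "ahead"
    if ht_for < ht_against:
        return "behind"
    return "tied"

def match_points(ft_for, ft_against):
    """League points for the final score: 3 for a win, 1 for a draw, 0 for a loss."""
    if ft_for > ft_against:
        return 3
    if ft_for == ft_against:
        return 1
    return 0

def point_values(records):
    """Average points by (halftime state, second half goals scored)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [points won, matches]
    for ht_for, ht_against, ft_for, ft_against in records:
        key = (halftime_state(ht_for, ht_against), ft_for - ht_for)
        totals[key][0] += match_points(ft_for, ft_against)
        totals[key][1] += 1
    return {key: pts / n for key, (pts, n) in totals.items()}

# Hypothetical records from one team's perspective:
# (halftime goals for, halftime goals against, final goals for, final goals against)
records = [(1, 0, 1, 0), (1, 0, 1, 1), (0, 1, 0, 1), (0, 0, 1, 0)]
for key, value in sorted(point_values(records).items()):
    print(key, value)
```

With real data, comparing rows within the same halftime state is what separates the value of the goal from the simple fact that teams in front tend to stay in front.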

Across the leagues, the average values look like this:

Monday, February 7, 2011

The Bundesliga v. the Premier League (and the NFL): Attendance Comparisons

I've written about match day attendance and leagues' financial health before, so I thought it was interesting to see these figures from the latest report on the Bundesliga's financial situation. Here's the trend in spectators since the inception of the league in 1963.

While the league's annual attendance ranged between 5 and 7 million per season for almost three decades throughout the 1960s, 1970s, and 1980s, we have seen a rapid and sustained increase since Germany's unification in 1989/90 from about 5 million then to almost 13 million during these last two years. On average, that's over 41,000 folks in the stadium for any given match. And when you add in the 2. Bundesliga - the league's second division - you get almost 17 million fans watching professional soccer matches in Germany's top two leagues.

So how does this compare with other professional leagues? To compare leagues, it is necessary to account for variations in terms of numbers of games/matches played, etc. So a reasonable metric is to focus on and compare match day attendance figures. Take a look.

Turns out, the Bundesliga's average of about 41,000 paying customers per match makes it the second-most attended league in the world of professional sports, behind only the NFL with its astonishing average of over 67,000 paying spectators. This also means that the Bundesliga beats out the Australian Football League, Major League Baseball and, importantly, the English Premier League with an average that is almost 20% lower at slightly over 34,000.

Does this mean that all is well in the Bundesliga? Not by a long shot, and I'll be reporting on the league's financial health before too long.

Saturday, February 5, 2011

First Or Second Half Goals: What's More Valuable?

If you want to know how unusual yesterday's comeback by Newcastle against Arsenal was, consider the expected point value of first and second half goals.

One way to compare the point value of goals is to calculate their values depending on when they were scored. To keep the analysis simple (after today's match), I was wondering simply whether first half goals are more valuable than second half goals.

Here's one possible answer. Below are the average point values of first and second half goals over a span of five seasons (2005/06 - 2009/10) in each of the four big leagues (it's the average number of points a team won in a match when the team scored varying numbers of first and second half goals).
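For anyone who wants to reproduce this kind of tabulation, here is a minimal sketch of the calculation. The record layout (`home_h1`, `home_h2`, `away_h1`, `away_h2`) and the three toy matches are made up for illustration; they are not the data behind the graphs.

```python
from collections import defaultdict


def points(goals_for: int, goals_against: int) -> int:
    """League points for one team from the final score: 3 win, 1 draw, 0 loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0


def point_values_by_half(matches, half):
    """Average points won, keyed by goals a team scored in the given half (1 or 2)."""
    totals = defaultdict(lambda: [0, 0])  # goals in half -> [points sum, team-matches]
    for m in matches:
        home_ft = m["home_h1"] + m["home_h2"]
        away_ft = m["away_h1"] + m["away_h2"]
        for side, gf, ga in (("home", home_ft, away_ft), ("away", away_ft, home_ft)):
            goals_in_half = m[f"{side}_h{half}"]
            totals[goals_in_half][0] += points(gf, ga)
            totals[goals_in_half][1] += 1
    return {k: s / n for k, (s, n) in sorted(totals.items())}


# Toy data (hypothetical): a 4-4 draw, a 1-0 home win, and a 0-0 draw.
toy_matches = [
    {"home_h1": 0, "home_h2": 4, "away_h1": 4, "away_h2": 0},
    {"home_h1": 1, "home_h2": 0, "away_h1": 0, "away_h2": 0},
    {"home_h1": 0, "home_h2": 0, "away_h1": 0, "away_h2": 0},
]

first_half = point_values_by_half(toy_matches, half=1)
second_half = point_values_by_half(toy_matches, half=2)
print(first_half)
print(second_half)
```

Each match contributes two observations, one per team, which is why the averages are taken over team-matches rather than matches.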

On their face, these graphs look very, very similar. The patterns of point values of up to 6 goals scored in each half look similar in that teams win more points when they score more goals, and it doesn't seem to matter too much whether they scored them before or after the halftime whistle.

But looks can be deceiving, if ever so slightly. Turns out there are some small, but noticeable differences, and they occur in each of the leagues. Curiously, these differences show up when we examine the point values of no goal or 1 goal scored in the first and second halves.

No. of goals and point values (by half)
0 goals: .90 v. .73 points (1st v. 2nd half)
1 goal: 1.77 v. 1.67
2 goals: 2.44 v. 2.44
3 goals: 2.84 v. 2.80
4 goals: 2.96 v. 2.96
5 goals: 3

Strangely, scoring no goals in the first half seems slightly more valuable point-wise than not scoring in the second half (.9 v. .73), and scoring a single goal only in the first half will bring more points than scoring a single goal in the second half (1.77 v. 1.67). After that, the values are very similar, if not identical. And while the values vary slightly across the leagues, the general pattern of first half goals (or 0 goals) holds across the leagues.

How do you explain this? I'm not sure, except that many more draws end in 0-0 and 1-1 than in, say, 2-2 or 3-3 or 4-4, so not scoring in the first half (or perhaps scoring just one goal) appears to be a harbinger of an eventual draw. But remember that one of the things we're interested in is what the added value of a goal is; using this metric, scoring a single goal in the second half adds slightly more value than scoring a single goal in the first half (1.77 - .90 = .87 for the first half v. 1.67 - .73 = .94 for the second half), and scoring 2 goals in the second half, relative to only one, adds more points (.77) than scoring 2 goals in the first half relative to just one (.67).
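The added-value arithmetic is just the difference between neighboring entries in the list of point values above. A minimal sketch, using only the figures quoted in this post (the variable and function names are mine):

```python
# Average points by number of goals scored in the half (0 through 5),
# copied from the list of point values above.
POINT_VALUES = {
    "first half": [0.90, 1.77, 2.44, 2.84, 2.96, 3.00],
    "second half": [0.73, 1.67, 2.44, 2.80, 2.96, 3.00],
}


def marginal_values(values):
    """Points added by each extra goal: value at k+1 goals minus value at k goals."""
    return [round(b - a, 2) for a, b in zip(values, values[1:])]


added = {half: marginal_values(vals) for half, vals in POINT_VALUES.items()}
for half, steps in added.items():
    print(half, steps)
```

The first two entries for each half are the .87 v. .94 and .67 v. .77 comparisons discussed above; the later entries show how quickly the added value of a further goal tails off.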

It's easy to make too much of these numbers; to me, they suggest overwhelming similarities across leagues and seasons in terms of first and second half goals. At the end of the day, they seem very much the same, if you ask me.

This brings me back to Newcastle and Arsenal. The score alone was unusual: only .26% of all matches in the past five seasons ended in a 4-4 draw (that's one quarter of one percent of all matches played in these leagues). More importantly, as you can tell from the graph above, not a single team that managed to score 4 goals in the first half failed to take 3 points from the match. So Newcastle's feat to wrest 3 (almost) certain points away from Arsenal was a real accomplishment to be proud of, statistically speaking.

But turn it around, and the day doesn't look so amazing for Newcastle either if you consider this: only one EPL team in the last five years failed to win the match when they scored 4 goals in the second half. That team was Liverpool when they came back from 4 down in the second half against none other than Arsenal on 21 April 2009. What is it with Arsenal blowing 4 goal halftime leads?

Friday, February 4, 2011

Soccer As American Football: The European Super Bowl Champions

If soccer had a Super Bowl, who would win it? Since this year's Super Bowl - the final for the U.S. National Football League (NFL) championship - is happening on Sunday between the Green Bay Packers and the Pittsburgh Steelers, I thought it'd be fun to treat soccer as a version of American football - only for a laugh, of course, lest you think I'm trying to blaspheme the best game in the world.

Ok, so here goes: the goal of American football is to carry the ball into the end zone - the area behind the goal line, so to speak - or to kick it through the uprights. A touchdown (a carry into the end zone) counts for 6 points, while a kick through the uprights counts for 3 points. If you think about it, soccer, too, shares one central goal with American football: after all, the aim is to maneuver the ball behind the line of the opposing team, albeit within a narrowly defined range (between the posts and underneath the crossbar). And in every game, teams also manage to place the ball behind the goal line, though not between the posts, by taking shots and being awarded corner kicks.

Thankfully, box scores allow us to calculate which teams were able to place the ball "in the end zone", so to speak. So, our game statistics allow us to see which teams in soccer (the real football) did best using American football metrics of scoring.

We could do the following: we could simply add up corners, shots, and goals to see which team was best able to place the ball behind the goal line. But this would not take into account the fact that some of these placements are more valuable than others (as they are in American football). So to make things roughly equivalent between soccer and American football, I counted goals as most valuable (6 points - similar to a touchdown); corners as second most valuable since the team retains the ball and is able to take a set piece (3 points - analogous to a field goal in American football); and shots that were off target as least valuable (1 point) since they do cross the line. I did not count shots on target because these either result in a goal (and will therefore be counted) or a corner (and therefore will be counted), or the ball simply remains in play.
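The scoring scheme just described boils down to a weighted sum. Here is a minimal sketch of it; the `SeasonTotals` class, the function name, and the example numbers are hypothetical and only illustrate the weights, not any real team's season.

```python
from dataclasses import dataclass

# Weights taken from the scheme described above.
GOAL_POINTS = 6              # like a touchdown
CORNER_POINTS = 3            # like a field goal
SHOT_OFF_TARGET_POINTS = 1   # ball crosses the line, nothing more


@dataclass
class SeasonTotals:
    goals: int
    corners: int
    shots_off_target: int


def super_bowl_points(totals: SeasonTotals) -> int:
    """American-football-style score for a team's season totals."""
    return (
        GOAL_POINTS * totals.goals
        + CORNER_POINTS * totals.corners
        + SHOT_OFF_TARGET_POINTS * totals.shots_off_target
    )


# Hypothetical season totals, for illustration only.
example_team = SeasonTotals(goals=70, corners=200, shots_off_target=250)
example_score = super_bowl_points(example_team)
print(example_score)
```

Shots on target are deliberately left out of the sum, for the reason given above: they already show up as goals or corners, or the ball stays in play.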

Because champions, by definition, are the best teams in a league, I summed up all points teams achieved last season for each of the Big Four leagues and for all four combined.[1] Based on these calculations, here are the best and worst American football teams in the Bundesliga, EPL, La Liga, and Serie A (drumroll please ...):

The European Soccer Super Bowl winners were:

Bundesliga: Werder Bremen, which narrowly beat out Bayern Munich (1405 v. 1373 points)
EPL: Chelsea, which easily left the rest of the field in the dust (with 1782 points)
La Liga: Real Madrid (1783 points), ahead of Barcelona (1691)
Serie A: Roma (1528), which came out ahead of Fiorentina (1508)

If this really were American football and we were to treat each league as one of the divisions of the NFL, then we could also figure out the overall best team of European "American" football: in that case, Real Madrid would beat out Chelsea by one lousy point (1783 to 1782). And while the top teams last season also come out on top using American football scoring, and the lesser teams bunch at the bottom, we also see that the eventual league winners did not win our imaginary Super Bowl: In fact, with the exception of Chelsea and the EPL, in each of the other leagues the eventual winner comes in second on this score. But I bet you they don't mind a bit!

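1. We could also calculate the average points haul for a team per match, but within a league this would yield the same ranking since all teams play an equal number of matches. (Across leagues it could matter, since Bundesliga teams play 34 matches a season while teams in the other three leagues play 38.)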

Thursday, February 3, 2011

The Point Value of Goals in Soccer: The Big Leagues From 2005/06 to 2009/10

With clubs like Liverpool, Chelsea, and Aston Villa dishing out loads of cash - well, actually recycling it - to move for strikers during the most recent transfer window, I started wondering how valuable goals really are. Obviously, you need to score goals to win matches, but how many points does a first or second or third goal give you? One way to gauge value is to calculate the point value of goals, similar to my earlier calculations of the values of clean sheets or yellow cards; it's calculated simply as the average number of points associated with the number of goals a team scored in a match.

To get a good baseline, I started by looking at the long-run tendencies in soccer generally. An easy way to do this is to summarize how many points a team wins per match in each of the four big leagues across a span of five seasons, depending on how many goals they scored in that match. Take a look.

Starting with the obvious, it won't come as a surprise to anyone that scoring five goals or more guarantees 3 points for the team, no matter what, and this is the case across all four leagues. It also shouldn't come as a surprise - given the most common scores in soccer I've reported previously - that not scoring at all doesn't yield much in terms of points (but the point value of 0 goals is not 0, simply because about 7-8% of all matches end in a 0-0 draw).

It is also interesting to see that the general pattern of points for goals is very similar across the four leagues. The average values of goals (across the four leagues jointly) are as follows:

0 goals: .28 points
1 goal: 1.14 points
2 goals: 2.15 points
3 goals: 2.67 points
4 goals: 2.90 points
5 and more goals: 3 points

So one goal is worth, on average, a little more than a draw, while 2 or more get you closer to a win than a draw.

So far, so good. But one of the things we really want to know is not whether teams win when they score 5 goals, but what the value of each additional goal is, given the number of goals scored so far in a game. Another way to think about this is to ask: how many more points does scoring a goal produce if the team has scored 0, 1, 2, or 3 goals? To get at the "marginal" value of each additional goal, we can simply calculate how much each additional goal moves the number of points for a team. Here's what we get: