Thursday, September 30, 2010

Fake Fouls And The Effective Length Of A Match

Drawing By Lazo
I thought this post was interesting in light of some of my recent posts about fouls suffered and committed in the Premiership. Turns out, researchers at Wake Forest University and the University of Plymouth have been hard at work trying to distinguish between real and fake fouls, and to work out how dives affect the match. Unsurprisingly, fake fouls (dives) are quite frequent and there are ways of categorizing them based on how they occur. But just as importantly (or more importantly), the time spent dealing with fake fouls adds up, so that the real length of a match is consistently less than it should be.

It's not clear whether the pattern of dives differs systematically across teams or how dives are rewarded and punished by referees, but this research gives me yet another reason to call for harsher penalties for diving. In our house, taking a dive is called "doing a Ronaldo" - I bet you know why. And speaking of dives, this clip is pretty funny.

Links to the studies:

Thanks to Jesper S.

Tuesday, September 28, 2010

Clean Sheets and Points: Is There A Connection?

I recently reread Simon Kuper's column about statistics in soccer, in which he quotes Mike Forde, Chelsea's Performance Director, who has worked hard to find ways of making statistical analysis useful for his club. It's an interesting piece, but what struck me as particularly interesting for soccer analysts was this quote:

"The holy grail would be discovering the key to victory. “I do not think we are there yet,” Forde admits. But he says: “If you look at 10 years in the Premier League, there is a stronger correlation between clean sheets and where you finish than goals scored and where you finish.”"

Of course, I was curious to see if there is something to clean sheets that's worth pursuing, as Forde claims. So here we go.

One way to think about it is to see if we can quantify the point value of a clean sheet. It may help to think about it this way: a clean sheet guarantees a team at least one point from a match and potentially gives it three (in case the team scores a goal). So I cranked up the old EPL dataset and calculated the average number of points associated with a clean sheet (and with each number of goals allowed per match more generally). Here's what we see for the 2009-10 EPL season.

Forde is right on the money (kudos to his number crunchers). Clean sheets (denoted by the 0 line, which indicates no goals allowed in a match) on average produce almost 2.5 points per match. And even only 1 goal allowed still gives a team slightly more than 1.5 points on average. But by the time we get to 2 goals allowed, point value rapidly declines. And once the other team scores 3 or 4 on a team, the point value rapidly goes to 0.

Compare this with offensive production (see below). You might think that scoring at least one goal will help you as much as not letting one in. You'd be wrong. Scoring 1 goal only gives you about 1 point from a match, on average. Compare that to 1.5 points for allowing one goal only. So the point value of 1 goal allowed is 50% greater than the point value of 1 goal scored, according to this calculation.

Another way to think about this is to ask how many goals a team needs to score to produce the points produced by a clean sheet. The answer for the 2009-10 EPL season is slightly greater than 3. So a clean sheet produces about as many points for a team as scoring 3 goals on offense.
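For anyone who wants to try this kind of calculation, here is a minimal sketch in Python. It assumes match results are available as simple (goals for, goals against) pairs from each team's point of view; the function names and the handful of scores are made up for illustration and are not the EPL dataset used above.

```python
from collections import defaultdict

def points(goals_for, goals_against):
    """League points for one team in one match: 3 for a win, 1 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0

def average_points_by(matches, key):
    """Average points per match, grouped by goals scored or goals allowed.

    matches: iterable of (goals_for, goals_against) from one team's perspective.
    key: 0 groups by goals scored, 1 groups by goals allowed.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of points, number of matches]
    for match in matches:
        bucket = totals[match[key]]
        bucket[0] += points(*match)
        bucket[1] += 1
    return {goals: total / count for goals, (total, count) in sorted(totals.items())}

# Made-up results (NOT real EPL data); each match is counted once per side.
scores = [(2, 0), (1, 1), (0, 0), (3, 1), (1, 0), (0, 2)]
matches = scores + [(against, scored) for scored, against in scores]

by_allowed = average_points_by(matches, key=1)  # by_allowed[0] is the clean-sheet value
by_scored = average_points_by(matches, key=0)

print("Average points by goals allowed:", by_allowed)
print("Average points by goals scored: ", by_scored)
```

Comparing `by_allowed[1]` with `by_scored[1]` on real data is the comparison made above between allowing one goal and scoring one goal.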

If I'm not completely off the mark (and I could be - wouldn't be the first time), then this raises an interesting question: why are strikers valued significantly more highly than defenders and goalies? Is that about offense being the best defense, or is there more going on? Either way, Mike Forde knows what he's talking about.

Thursday, September 23, 2010

Who Played Tit For Tat in 2009-10? A Postscript on Fouls in the EPL

A quick PS to the earlier post about fouls committed and suffered in the Big Four leagues of European football. After showing that there is a tendency for fouls committed and suffered to go together, I thought it'd be interesting to see whether all teams were equally likely to play tit for tat or whether there were significant differences.

Take a look based on data for the entire 2009-10 EPL season. The graph shows the distribution of fouls committed and suffered by match and team, along with a line indicating the slope of the statistical (linear) relationship. As you can see, a good number of teams did play tit for tat (measured here as committing more fouls when the other team commits more fouls against a team).

For 15 of the 20 EPL teams the slope is positive, indicating that teams that committed more fouls were also more likely to suffer more fouls. The exceptions were Birmingham, Fulham, Wigan, West Ham, and Wolves. This connection was particularly pronounced for teams like Aston Villa, Hull, Man City, and Stoke. The slope for Arsenal is positive, too; and if we were to disregard the match denoted by the dot in the lower right hand quadrant, it would be among the strongest. And notably, Blackburn were not an "eye for an eye" team in 2009-10 (wonder what Arsene Wenger would say about that). Before too long, I'll take a look at other leagues and seasons, too, so stay tuned!
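The per-team slopes, and the point about a single unusual match dragging a slope down, can be sketched roughly like this. The data layout (one row per team and match), the team names, and all of the numbers are invented for illustration; this is plain least squares, not necessarily the exact procedure behind the graph.

```python
from collections import defaultdict

def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def slopes_by_team(rows):
    """rows: (team, fouls_suffered, fouls_committed), one row per team and match."""
    by_team = defaultdict(lambda: ([], []))
    for team, suffered, committed in rows:
        by_team[team][0].append(suffered)
        by_team[team][1].append(committed)
    return {team: slope(xs, ys) for team, (xs, ys) in by_team.items()}

# Invented numbers for two hypothetical teams (NOT real 2009-10 data).
outlier = ("Team A", 18, 10)  # one match: fouled a lot, but committed few fouls
rows = [
    ("Team A", 8, 9), ("Team A", 10, 11), ("Team A", 12, 12), ("Team A", 14, 15),
    outlier,
    ("Team B", 8, 14), ("Team B", 10, 13), ("Team B", 12, 12), ("Team B", 14, 10),
]

all_matches = slopes_by_team(rows)
without_outlier = slopes_by_team([row for row in rows if row != outlier])

for team in all_matches:
    print(team, round(all_matches[team], 2), round(without_outlier[team], 2))
```

In this toy example Team A's slope is positive either way but much steeper once the one odd match is dropped, while Team B's slope is negative, which is the pattern of a team that does not play tit for tat.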

Wednesday, September 22, 2010

An Eye For An Eye? The Connection Between Fouls Committed and Suffered

Where do fouls come from? Listening to Arsene Wenger or reading analyses of the World Cup final, they reflect an inferior team's strategic decision to disrupt the play of superior teams. That may well be, but I bet that's not the end of the story. Anyone who's ever played the game knows that on some days the match just turns out to be a little nastier, tougher, more competitive than on others.

Why would this be? Game theory may offer one possible answer. Game theory is a branch of applied mathematics social scientists use to capture behavior in strategic situations or games, in which an individual's success in making choices depends on the choices of others (Wikipedia). It has famously been applied to study the interaction of goalkeepers and shooters during penalty kicks (for a nice piece on that, see this article in Slate Magazine). But beyond penalty kicks, it has rarely been applied to soccer analyses.

So on the heels of my recent posts about fouls committed and suffered during the course of a match, I started to wonder if game theory could also provide insights into other situations in football. In particular, I remembered there is a famous concept in game theory called "tit for tat", which refers to a particular way of playing the game (a strategy, if you will). Quoting Wikipedia: "an agent using this strategy will initially cooperate, then respond in kind to an opponent's previous action. If the opponent previously was cooperative, the agent is cooperative. If not, the agent is not." The key thing about tit for tat is that it is a famously successful strategy in so-called repeated prisoner's dilemma games (it won Robert Axelrod's well-known computer tournaments), even though it is not guaranteed to be the best response to every possible opponent.
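To make the strategy concrete, here is a small, hedged sketch of tit for tat in a repeated prisoner's dilemma. The payoff numbers are the usual textbook values and the function names are my own; nothing here comes from football data.

```python
# Payoffs for (my_move, their_move); "C" = cooperate, "D" = defect.
# These are standard textbook values, chosen purely for illustration.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    """A purely hostile strategy, used here as a comparison."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return (moves_a, moves_b, score_a, score_b)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each side sees only the other's past moves
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
    return moves_a, moves_b, score_a, score_b

moves_a, moves_b, score_a, score_b = play(tit_for_tat, always_defect, rounds=5)
print("".join(moves_a), "".join(moves_b), score_a, score_b)
```

If you read "defect" as "foul" and "cooperate" as "play clean", a tit-for-tat team plays clean until it gets kicked and then kicks back for as long as the other side keeps at it.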

So where would we look for tit for tat in football matches? Very simply put, more fouls by one team in the course of a single match should beget more fouls by the opposite team. If all the mention of game theory has your eyes glazing over by now, just think of it as sort of an "eye for an eye" strategy. Thus, contrary to the idea that bad teams foul more, both good and bad teams should foul more when the other team fouls more, and they should foul less when the other team fouls less.

To get a quick and dirty handle on this, I collected data on fouls committed and suffered by each side during matches in the Big Four leagues (Bundesliga, EPL, La Liga, and Serie A) for the 2005/06 to 2009/10 seasons (I was able to get these data for 7,219 matches). I then graphed fouls committed against fouls suffered - remember, we would expect to see a positive correlation: as one goes up, so should the other. Here's what we get when we do that:

The scatterplot reveals systematic evidence of tit for tat. As the number of fouls committed goes up in a match, the number of fouls suffered goes up as well (and vice versa, of course). Moreover, this pattern exists in every one of the four leagues. Sure, there is variation around this central tendency - plenty of matches deviate in one way or another - but the fact that we can see it, given such a large sample of matches, and the fact that we do not see the opposite pattern or no pattern are worth remembering.

We can see this pattern most clearly when we use the data to estimate the linear relationship by calculating regressions (where we make one outcome - fouls committed - a linear function of the other - fouls suffered). Here's what the data show when we do that (these graphs strip out the variation around the average in the data):
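For readers who want to try the regression step themselves, here is a minimal sketch of an ordinary least squares fit of fouls committed on fouls suffered. The eight data points are invented (they are not the 7,219 matches), and `fit_line` is just an illustrative helper name.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the fitted line passes through the means
    return intercept, slope

# Invented match data (NOT the data used in this post).
fouls_suffered = [9, 11, 12, 14, 15, 17, 18, 20]
fouls_committed = [11, 10, 13, 13, 16, 15, 18, 19]

intercept, slope = fit_line(fouls_suffered, fouls_committed)
print(f"committed = {intercept:.2f} + {slope:.2f} * suffered")

# Predicted fouls committed in a match where a team suffers 16 fouls.
predicted = intercept + slope * 16
print(round(predicted, 1))
```

A positive slope is what tit for tat would lead us to expect; a slope near zero or below would count against it.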

Monday, September 20, 2010

Do Bad Teams Foul More? Testing the Wenger Hypothesis

Last week, Arsene Wenger yet again accused another team of intentionally going after his players in an attempt to take them out of the game. This time Sam Allardyce and Blackburn Rovers were his foil (at other points Stoke have been one of his targets, too). As I read them, Wenger's comments had the feel of "bad teams aren't good enough to compete with us, so they will just foul us to keep up." All of this reminded me of the aftermath of this year's World Cup final,  when there was a widespread perception that the Dutch intentionally tried to play a more destructive game to slow down and interrupt the Spanish passing game. See the parallel? Arsenal = Spain? Blackburn = Holland?

All this raises the question of whether Wenger's hypothesis is supported by the data. And if you think about it, there really are two parts to Wenger's argument. First, reading between the lines, Wenger expects that bad teams foul more than good teams. Second, good teams are fouled more by other teams.

So let's take a quick look at the 2009-10 EPL season to see if this is what the data tell us. Below are two graphs. First, the average number of fouls committed by a team (per match); second, the average number of fouls suffered by a team (per match).

Looking first at fouls committed, the Wenger hypothesis looks convincing. Teams toward the bottom of the table commit more fouls, on average, than teams toward the top of the table. Allardyce's Blackburn leads this table with almost 14 fouls per match (13.8 to be exact). Compare that to Man City's low of 10.58 or Arsenal's 11.02. But there are exceptions: Birmingham didn't foul very much, and neither did Fulham or Burnley. 

So let's look at fouls suffered to see if the second part of the Wenger hypothesis is supported. This is where the data do not look as favorable for Arsene's argument. To be sure, Wenger's Arsenal was the team that suffered the second most fouls in the league, at 13.63 right behind Everton's 13.76.

But if you look more closely, Hull and West Ham, too, suffered many more fouls, and their seasons weren't nearly as successful. Similarly, at the other end, Man U (11.37) and Aston Villa (11.0) suffered relatively few fouls at the hands, I mean, boots of their opponents.

So, Wenger is right in that less successful teams on average commit more fouls, but more successful teams aren't necessarily fouled more. At the same time, his grievance is based in fact: Arsenal played a really clean game in 2009-10; they didn't foul very much, but were fouled a lot by their opponents. And Blackburn fouled the most.

Sunday, September 19, 2010

The Foul Register: How Many Fouls Do Teams Commit?

In the past few days, there's been yet another dustup between managers about intentional fouls committed against their teams, this time between Arsene Wenger and Sam Allardyce. Wenger and Allardyce are the prototypes for managers with different playing philosophies - with Wenger preferring the short passes, possession dominated game, and Allardyce favoring a more long ball oriented game (I am simplifying greatly here, I know). This time, Wenger accused Allardyce of intentionally targeting key players on his team.

So how many fouls do teams commit? I had to admit that I didn't know, but since it's knowable, I looked it up (well, I let my trusty computer do it for me). When counting up fouls, however, there's a tricky definitional issue we need to get out of the way before looking at the numbers. The official statistics we have from box scores and various other published sources include only fouls that are called, not necessarily those that were committed. Counting how many times refs blow the whistle for a foul is not the same as counting fouls. Anyone who's ever played the game knows that there's a difference between the two and, depending on a variety of circumstances, quite a difference. But for the sake of argument, let's assume for simplicity that the errors are random: too many fouls called on one team drawn from a hat cancel out too few called on another.

So here's the total number of fouls called in the Big Four soccer leagues of Europe over the past five seasons.

There's quite a range in how busy referees are. The totals range from fewer than 9,000 fouls called in the 2008/09 EPL season to almost 15,000 in the 2005/06 La Liga season and the 2007/08 Serie A season. Among other things, this suggests to me many fewer interruptions to the game in Germany and England than in Italy and Spain or, put differently, a more fluid, continuous style of play. It also may imply fewer worries about injuries, or simply imperious referees who like to be the center of attention. But I digress.

So how much do individual teams foul? What do these big totals translate to at the level of individual teams and matches? Here we go.

Thursday, September 16, 2010

Card Games: Reds and Yellows in the Big Leagues, 2009-10 Season

It's been fun comparing various aspects of performance in the big four UEFA soccer leagues. It's clear that in some fundamental ways, the leagues are very similar (think about goal/shot ratios, for example). But in other ways, they're quite different, as you'll see below. So, on the topic of punishment in soccer, I've been looking some more at data on red and yellow cards in the four big leagues. In particular, I've been wondering about getting the baselines right. So, for starters, here are the average numbers of red and yellow cards (per team and match) for the 2009-10 season.  As you can see, there are some significant differences across the leagues. Take a look.

As I've noted previously, the EPL and the Bundesliga are quite different from La Liga and Serie A. Refs in the former two leagues get out their yellow cards much less frequently than refs in the latter two. While teams in the Bundesliga and the Premiership see around 1.6/1.7 yellows and less than .1 reds per match, teams in Serie A and La Liga can expect about 2.5 yellow cards and around .2 reds. These are significant differences that coaches must know and think about when preparing for a match. From an analyst's perspective, an obvious difference is geographic: more yellows as you move South! But it's not clear what is driving these patterns. Are players in Spain and Italy more likely to commit fouls, fall more spectacularly and writhe on the ground more, or are refs just tougher in the southern top divisions of European football? I'd be curious to know.

One question is whether there is a connection between yellow and red cards given. You would imagine that teams that see more yellows also, by extension, see more reds - so long as we think of refereeing as consistent and punishing the more severe and repeated fouls and transgressions on the pitch more harshly than the occasional, less severe ones.

To get a sense of the connection between yellow and red cards, I collected data for all teams and all matches played in 2009-10 in each league and looked at the correlation between yellow and red cards for each team in each match. [A correlation implies a pattern in the data where higher values on one variable (say, yellow cards) go hand in hand with higher values on another (say, red cards) for a positive correlation or lower values (for a negative correlation).] I then graphed these correlations for your viewing pleasure. Take a look.
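If you want to compute such a correlation yourself, here is a minimal sketch of the standard Pearson correlation coefficient. The card counts are invented for illustration (one entry per team and match) and are not the 2009-10 data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Invented card counts per team and match (NOT real data).
yellows = [0, 1, 1, 2, 2, 3, 3, 4, 5, 2]
reds = [0, 0, 0, 0, 1, 0, 1, 1, 1, 0]

r = pearson(yellows, reds)
print(round(r, 2))
```

The coefficient runs from -1 to +1: values near +1 mean more yellows go hand in hand with more reds, values near 0 mean no linear pattern, and negative values mean the opposite pattern.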

Wednesday, September 15, 2010

Unequal Punishment: Trends in Yellow Cards in the Big Leagues of Soccer

Does football punishment get meted out equally across leagues, teams, and over time? That is, are there patterns to penalties in football or are they fairly random? Most coaches will tell you that home teams have an advantage - they often complain about biased referees after the match. And while some of it may be whining, some of it may well reflect a pattern observed over years of watching and playing in soccer matches.

I was reminded of this last weekend when my son's team (which I coach) was consistently called for offside by one of the linesmen (which the ref duly followed), while the other linesman did not. But I digress. Back to punishment: what are the patterns? One way to see if there are trends or cross-league differences is to look at trends in the Big Four leagues of soccer. One easy way to see if there are patterns and to quantify their size is to look at yellow cards - a common enough occurrence in a match to yield some interesting and sufficient data. So here are trends in yellow cards since the 2005-06 season per team/match.

Overall, teams see about two yellows per match played. But clearly, refs in some leagues more easily pull out the card than in others. In particular, refs in La Liga give significantly more yellows than refs in the Premier League. La Liga's 2.5 yellows per team/match easily dwarf the Premiership's roughly 1.5 cards. Whether this reflects differences in playing style, instructions from the league, training of refs, or more skillful diving in Spain's top league is unclear, but punishment is clearly not meted out equally. We see consistently more yellows over the years in Spain and Italy than in England and Germany (perhaps suggesting something about the diving argument). (There also are a couple of interesting trends, with the Bundesliga refs decreasing their enthusiasm for yellow somewhat over time.)

Given these different baselines across leagues, another way to think about inequality in punishment is to see if some teams are systematically more likely to receive yellows than others. Here, the obvious explanation may be home field advantage. As avid readers of this blog know, the home field advantage in offensive and defensive production is a fairly consistent pattern across leagues and over time, but with some clear exceptions for particular teams, as I tried to point out in an earlier post. But does this extend to another kind of home field advantage: namely, an advantage in penalties assessed?

Well, let's look at the data. The story here is very straightforward:

Monday, September 13, 2010

Goals Against, Big Four plus EPL and Bundesliga (2009-10)

While offensive production has been one important story of the Premier League season so far - in particular for leaders Chelsea and Arsenal - matches aren't won with offense alone. At the end of the day, it is the difference in goals that matters, and for that, defensive production is important.  Here's how the Big Four leagues stack up, using data from the 2009-10 season (no good reason to assume this year will be any different).

Of course, this is basically the offensive home field advantage in reverse (home teams are able to force away teams to concede more goals, and away teams are less able to get home teams to concede goals). So league totals are not entirely useful. But bear with me since I think it's instructive to think about the same data by looking at them in a slightly different way.

So back to defensive production. Overall, the big leagues look quite similar. There is a home team defensive advantage in each of the leagues, with home teams giving away fewer goals than away teams. This defensive advantage is least pronounced in the Bundesliga and most pronounced in the Premier League. Moreover, home teams defend least effectively in the Bundesliga, and away teams' defenses are most porous in the Premier League.

These overall figures hide considerable variation, as you can see when you drill down further into the leagues. So let's look at the Bundesliga and EPL to get a sense of this variation by teams (again, for 2009-10).

While the defensive home team advantage holds for most teams in the Bundesliga, it is astonishing that some teams - and even some very good teams like Wolfsburg - defend less successfully at home than on the road. Aside from Wolfsburg, teams in this category include Cologne, Bochum, and Stuttgart (though there's hardly any difference for the VfB). This could be because they play a more open game at home to give their supporters something fun to watch, but I have my doubts. As importantly, the top teams are just outright stingy on defense. Teams like Bayern, Schalke, Stuttgart, Leverkusen, and Bremen have records of around 1 goal against per match at home and away. Top teams defend well, no matter what.

Let's see if that's the case in the Premier League, too. Clearly, it is. Virtually all of the top teams have great home and away defensive production.

Friday, September 10, 2010

Defensive Production: Goals Against in the English Premier League, 2009-10

In the last few days, I've been thinking about ways of measuring and displaying defensive production at the level of teams. It's been helpful to think and read about the performance of individual players, especially in related team sports like hockey, but I figured it'd be nice to get a handle on things just by looking at overall team performance.

Mind you, it doesn't really make sense to do this at the level of the league as a whole since goals scored are also goals not defended, but for each team we can calculate the number of shots they allowed the other team to take, how many they allowed them to make accurate enough to land on goal, and how many goals they ultimately allowed.

So for starters, here's a quick calculation of defensive production, measured as full time goals against (GA), for the 2009-10 season, separately by whether teams played at home or away.

A few things stand out. First, you can see the home team advantage quite clearly, with the distribution of goals allowed at home to the left of goals allowed away. The range is (roughly) between .6 and 1.6 at home and between .75 and almost 3 away.

Second, this means that the distribution is tighter at home than away - there is less variation across teams in terms of goals against at home than away. This is in large part due to the woeful performance of three teams, Wigan, Hull, and Burnley, who managed to allow 2.5 goals or more per away match - most other teams were below 2!

Third, having said all that, there is interesting variation across teams in the league. Generally speaking, the better teams have better defenses, but this is not uniformly the case. In particular, Birmingham and Tottenham defended much better at home than away (this is true, too, of Wigan, Burnley, and Hull). In contrast, the teams at the very top (Chelsea and Man U, for example) defended well, no matter what. And for Portsmouth, too, it didn't seem to matter whether they played at home or away: their defense was equally mediocre in both, suggesting that they did not have much of a home advantage there. 

I'll keep digging into these statistics in the weeks to come. Any suggestions? Let me know.

Expensive Toys: Does It Matter Who Owns Premier League Teams?

In the wake of match fixing allegations involving the Pakistan cricket team, and while I was thinking about the seemingly unrelated topics of who buys EPL teams and match fixing, it occurred to me that you could easily combine the two - corruption and team ownership, that is.

If you really wanted to control outcomes in matches people bet on heavily, why not buy yourself a Premier League team? According to Declan Hill's The Fix, something like this was done on a small scale in the Finnish league (and in Belgium), when the alleged fixer Ye Zheyun (a mysterious Chinese businessman) used the financial problems of the Finnish team AC Allianssi to become its manager through an injection of cash. On the side, Ye Zheyun also helped out a number of Belgian clubs in financial distress by becoming a part owner in them. So, when it turned out that there were unusual betting patterns in the Finnish and Belgian leagues involving teams that Ye had a stake in, no one should have been surprised. They are currently being investigated by Belgian authorities, and pre-trial proceedings are underway.

All of this got me wondering: could this happen in England? And more generally, why would anyone want to own an EPL team in the first place? Aside from graft, I see several possibilities:
  1. A love of the game and a particular team
  2. A chance to make money
  3. Social competition
I happen to think that the odds of match fixing in the EPL are very low, in large part because the incentives for players to agree to throw a match are very low. Yes, assuming that players are human, it comes down to money. Simon Kuper has a nice piece explaining the logic of these incentives. And while we're at it, take a look at the level of corruption in the countries of origin of EPL owners. The higher the value on the scale, the cleaner the country has been judged to be by the good folks at Transparency International, an international organization devoted to fighting corruption and bribery.

(You'll see that I also included potential Liverpool owners in orange.)

But ideally and realistically, the people who can afford to buy EPL teams and wish to manipulate betting markets would want to invest in cheaper and less visible teams in smaller leagues. Seems like a better bet (pun intended). So if not manipulation, what's behind the fact that people from corrupt countries and not very transparently run economies are out to buy Premier League (and other) teams?

Monday, September 6, 2010

Fixing Matches: Mapping Corruption

I just finished reading Declan Hill's The Fix: Soccer and Organized Crime. It's a sobering tale of what can happen at the highest levels of football competition when gangsters and gamblers become involved in the game. Hill writes with the passion of a missionary, and he tells a compelling tale of how dirty money leads to corrupt behavior in football.

And wouldn't you know it, the German football league (the DFL, responsible for running the 1. and 2. Bundesligas) just announced a campaign called "Transparency and Integrity in Football". It seeks to develop a series of measures, including workshops, help desks, and informational campaigns aimed at players and coaches to prevent match fixing. I can't help but applaud the DFL for taking this step and sincerely hope that it produces meaningful results. But so far it's little more than a wish list - though one backed by an organization (Transparency International) that has experience in combating and exposing corruption.

But back to fixing matches. The Fix tells a sad tale; without clean games that we can believe in, people's faith in football dies. This is not altogether different from what happens in other areas of life. Corruption breeds cynicism about politics, too, among other things. And if you know something about corruption and its pernicious effects, you are unlikely to be surprised by Hill's revelations, given the geographic distribution of corruption around the world. Take a look at Transparency International's global map of corruption.

Clearly, corruption is more common the further South and the further East you go on the map. And these tend to be poorer and less democratic countries. All this jibes with Hill's tale of corruption in Asia and Africa.

But this doesn't mean corruption in soccer stops there. And this part is critical.

Thursday, September 2, 2010

Reforming Soccer: Proposals From An Economist

I'm not the biggest fan of economists - as social scientists, they're among the more self-important species, meaning they seem to think that their theories can be applied to anything. Really, anything. The chutzpah is a bit much at times. At the same time, economists do like to think systematically about the world, and that's something I appreciate. So it pains me to admit that I found this article by distinguished behavioral economist Richard Thaler from the New York Times very intriguing.

The short piece is about ways of reforming soccer by making the job of referees easier. The analogy Thaler uses as the basis of the piece is straightforward: referees are kind of like financial regulators (remember the financial crisis?). So what would you do to make the life of a regulator easier and improve the quality of the decisions the regulator makes?

It's certainly worth a read because it's clever and the proposals are worth talking about. I like the ones about adding a referee and redefining the offside rule. But judge for yourself!

(c) David G. Klein, New York Times