Monday, May 30, 2011

How Much Football Is There In A Match?

Talk in the aftermath of the Barcelona - Manchester match about who had more passes and more possession (or more this, that, and the other thing, as is often the case after matches) reminded me of The Guardian's statistical review of the Premier League's season (courtesy of Opta data). I thought one of the more interesting tidbits was that, on average, "the ball was in play for 62.39 minutes this season – more than in the much-vaunted Spanish and German top flights (61.48 minutes and 61.22 minutes respectively), but significantly less than in Serie A (65.15 minutes)."


What the comparison doesn't mention explicitly is the (perhaps obvious) point that the 90 minutes of a football match don't actually give you 90 minutes of football. Of course, we've all known this, at least intuitively, but it's good to know exactly how much or how little football there is in a match, for at least two reasons. First, it's good to separate fact from fiction. Second, it's interesting to think about the implications of the fact that a football match only has about an hour of actual football. Mind you, that's very different from, say, ice hockey or basketball. Hockey games are 60 minutes long and basketball games are 48 minutes long. Every time, to the 10th of a second. The puck or ball leaves the field, the clock stops. Not so in football, and I bet it matters in a number of ways.
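For anyone who wants to put the "about an hour" claim into percentages, here's a minimal sketch in Python. It uses only the four league averages quoted above; the 90-minute denominator is my simplifying assumption (it ignores stoppage time), and the function and variable names are just illustrative.

```python
# Average ball-in-play minutes per match, as quoted above from the
# Guardian/Opta season review.
BALL_IN_PLAY_MINUTES = {
    "Premier League": 62.39,
    "La Liga": 61.48,
    "Bundesliga": 61.22,
    "Serie A": 65.15,
}

# Assumption: a match lasts a nominal 90 minutes (stoppage time ignored).
NOMINAL_MATCH_MINUTES = 90


def in_play_share(minutes_in_play, nominal_minutes=NOMINAL_MATCH_MINUTES):
    """Return the fraction of the nominal match length with the ball in play."""
    return minutes_in_play / nominal_minutes


for league, minutes in BALL_IN_PLAY_MINUTES.items():
    share = in_play_share(minutes)
    print(f"{league}: {minutes:.2f} min in play, {share:.1%} of 90 minutes")
```

Under that assumption, every one of the four leagues lands somewhere between two thirds and three quarters of the nominal match length.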

Wednesday, May 25, 2011

Soccer In America: A Big Four Sport In The Public Mind?

Occasionally, there's a debate among fans about just how far soccer has come in the United States. It's clearly more popular than ever, and the country now boasts a league that, according to some scouts, is on par with the English Championship (Division 2). Another way to chart the progress of soccer in the U.S. is to take a look at the numbers provided by Google's terrific and innovative Ngram viewer tool. In case you haven't been following what Google has been up to, the tool calculates how frequently particular phrases or words have occurred in a particular corpus of books (e.g., "British English", "English Fiction", "American English", "French", etc.) over selected years.

So let's take a look at two things: has the growth of soccer been reflected in the lexicon of American English? And how does soccer compare to the other big professional sports in North America on this score?

Below is a graph of trends in how commonly used various terms are in books in American English, defined as books published in the United States. To compare the use of terms related to the big sports, the graph compares the usage of the terms soccer, football, basketball, hockey, and baseball in the written language. And to have a complete historical record, I thought we should start before their emergence as major sports - so first, here's a graph of these terms since 1850. What do we see?


To begin, the graph shows the historical origins of the major sports in America, with football taking off around 1880, and baseball about ten years later. Both sports had started to enter the dictionary in increasing numbers after 1900 and saw considerable growth throughout the first three decades of the 20th century. Basketball wasn't far behind, historically speaking, increasing its popularity in written American English throughout the 1910s and into the 1920s and 30s. By World War II, football was a clear No. 1, baseball No. 2, and basketball No. 3. Far behind, soccer and hockey brought up the rear at roughly the same level.

In the aftermath of WWII, something curious happened. The popularity of virtually all major professional sports stagnated in published American English. In fact, throughout the 1950s and into the 1960s, there was a decline in football, baseball, and basketball. To get a clearer look at this pattern, here's the graph of the occurrence of the major sports from 1950 onward.

Friday, May 20, 2011

The Uselessness of Free Kicks in the Premier League

If you've been reading this blog these last few weeks, you know that I've been spending way too much time digging through data on shot creation in the Premier League with the help of the Opta/Guardian chalkboards. But I can't quite help myself, so here's yet another installment; this time it's on the (relative) uselessness of free kicks.

Don't get me wrong; I love a beautifully curled shot from 20 yards out into the upper right corner of the goal as much as the next guy. What I don't like is how rare and inefficient free kicks are, relative to other shot situations in a match. How rare and inefficient? To answer that question, we need information about the frequency of shots from free kicks and the efficiency of goal creation.

First, consider the graph below. It shows the overall frequencies of shots generated from free kick situations for the league as a whole, with the y-axis displaying the % of matches with various numbers of shots from free kick situations. For the league as a whole, in 80% of all matches, teams have 0 or 1 shot on goal generated from a free kick. Compared to the average number of 14 overall shots per team and match in the first half of the season, shots from free kicks are very rare at a rate of just .82 per team and match.


Naturally, there are some interesting differences across teams. Take a look.

Tuesday, May 17, 2011

Which Shots Are Most Efficient? Creating Goals From Different Kinds of Shots in the EPL

A few days ago, I took a look at the origins of goals in the first half of this year's Premier League season to see what we can learn about the connection between different match situations (defined as open play, corners, fast breaks, penalties, and free kicks) and goal creation.

In case you didn't have the time or inclination to read the details, here's the upshot: Keeping in mind that teams scored on average 1.35 goals per match, the overall distributions showed that a sizable majority of goals were created from open play at a rate of .9 per match. The remainder (about .45 per match, since 1.35 - .9 = .45, or just under half a goal) were generated from corners (.18), penalties (.095), fast breaks (.074), and free kicks (.038). The relative contributions of each kind of match situation to goal creation were as follows: roughly 70% of goals were scored from open play, 14% from corner situations*, around 7% from penalties, almost 6% from fast breaks, and the remaining 3% from free kicks.
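If you'd like to see how the percentages fall out of the per-match rates, here's a minimal sketch in Python. One caveat: the five rounded rates above add up to about 1.29 rather than 1.35, so the sketch divides by the sum of the five categories. That normalization is my assumption about how the shares were derived, and the names are illustrative only.

```python
# Goals per team and match by situation, as reported above.
GOALS_PER_MATCH = {
    "open play": 0.9,
    "corners": 0.18,
    "penalties": 0.095,
    "fast breaks": 0.074,
    "free kicks": 0.038,
}


def goal_shares(rates):
    """Return each situation's share of the goals covered by `rates`.

    Assumption: shares are taken relative to the sum of the listed
    categories, not relative to the overall 1.35 goals per match.
    """
    total = sum(rates.values())
    return {situation: rate / total for situation, rate in rates.items()}


for situation, share in goal_shares(GOALS_PER_MATCH).items():
    print(f"{situation}: {share:.1%} of goals")
```

Normalized this way, the shares line up with the rounded percentages in the text.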

These averages are interesting as far as they go. But knowing that 70% of all goals were created from open play situations does not necessarily tell you that shots from open play are more likely to produce goals. It simply means that teams spent more of their time and effort on creating goals from open play.

Because these numbers do not tell us anything about conversion or efficiency - that is, the combined odds that shots from different match situations will be accurate and converted to goals - below I calculated the goal creation ratios - goals to shots - from different match situations based on data from the Opta/Guardian Chalkboards. These numbers are shown below; they indicate the odds that shots from different match situations actually end up in goal.


The overall ratio for all goals and shots from all match situations (not shown in the graph) was .097; that's right around 1 goal in 10.25 shots. By far the most efficient shots - unsurprisingly - were penalty kicks with a ratio of .69. This translates into 1 goal for every 1.45 penalty shots taken. In stark contrast, the goal creation ratio for free kicks is a truly lousy .039. That's about 1 goal for every 25 shots from free kick situations. Another way to think about these numbers is that the odds of scoring from a penalty are 18 times better than the odds of scoring from a free kick situation.
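The conversions in that paragraph are just reciprocals and a ratio of ratios. Here's a minimal sketch in Python for checking them; it uses only the three rounded ratios quoted above, so small differences from the figures in the text are to be expected, and the helper names are my own.

```python
# Goals-to-shots ratios quoted above (rounded values).
GOALS_PER_SHOT = {
    "all shots": 0.097,
    "penalties": 0.69,
    "free kicks": 0.039,
}


def shots_per_goal(goals_per_shot):
    """Return how many shots it takes, on average, to produce one goal."""
    if goals_per_shot <= 0:
        raise ValueError("goals_per_shot must be positive")
    return 1 / goals_per_shot


def relative_efficiency(ratio_a, ratio_b):
    """Return how many times more efficient situation A is than situation B."""
    return ratio_a / ratio_b


for situation, ratio in GOALS_PER_SHOT.items():
    print(f"{situation}: 1 goal per {shots_per_goal(ratio):.2f} shots")

multiple = relative_efficiency(
    GOALS_PER_SHOT["penalties"], GOALS_PER_SHOT["free kicks"]
)
print(f"penalties vs. free kicks: about {multiple:.0f} times as efficient")
```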

Clearly, penalties are unusual, and comparing them to other match situations is a bit like comparing apples and oranges. So to get a cleaner comparison of "typical" match situations, below is the same graph but with the bar for penalties removed. Take a look.

Saturday, May 14, 2011

"Soccer" By The Numbers, Literally ...: Historical Trends in the English Language

I thought I'd take "Soccer By The Numbers" literally for a change. How? For years, Google has been busy digitizing books and printed materials, and they've been going back into historical archives. So now, researchers (and, well, anyone with access to a computer, like me) can search for words or phrases that have occurred in print over the past several hundred years, give or take. The so-called Books Ngram Viewer is a tool that displays a graph of how frequently those phrases or words have occurred in a particular corpus of books (e.g., "British English", "English Fiction", "American English", "French", etc.) over the selected years.* This also means we can use the tool to take a look at "soccer" by the numbers.

I thought we'd start with the motherland of the beautiful game. So here's a graph of the popularity of the words "soccer" and "football" over the period between 1850 and 2008 in British English - that is, only books published in Great Britain. Take a look.


To the extent that the words we use reflect what we talk and think about, the graph reveals very nicely the steady rise of football as part of British culture and society. Right around the time of the formation of the English and then Scottish FAs in 1863 and 1873, football becomes part of the (British) English lexicon. Another interesting facet is that there has been a noticeable change in the trajectory in recent history. Since the early 1990s - and the formation of the Premier League - there has been an even steeper increase in how commonly used the word "football" is. Finally, I thought it would be fun to compare the popularity of "football" and "soccer" in British English. While the term "soccer" is indeed a poor cousin to "football", it, too, has seen a noticeable increase, especially since the early 1990s.

And speaking of soccer, much has been made of the American colonies' lack of interest in English-style football, and instead the popularity of American-style rugby (aka "American football"). So what is the trend in the use of "soccer" in books in American English, defined as books published in the United States? See for yourself.

Wednesday, May 11, 2011

Where Do Goals Come From? Shot Creation and Goals in the Premier League

Goals don't just happen - they are made. Both on offense and defense, teams control how they deploy their resources (read: players) on the pitch, and they make tactical choices about how to attack and defend. Some rely more on fast breaks, while others try to create or avoid chances from open play. So this means that goals are created and allowed.

Goals are also rare, and therefore precious. Around 70-75% of matches in the top European leagues see 2 goals or fewer; on average, teams score 1.32 goals per match, and 50% of the time they score 1. And, as we've seen before, being up a goal gives teams considerable leverage. So where do goals come from?

As I've said before, you can't score unless you shoot (forgetting about own goals for a second), so shot creation is a critical ingredient in all this. Shots can be created in any number of ways, but for analytical purposes, I have been focusing on 5 different kinds of situations that can lead to shots: shots created from open play, free kicks, penalties, corners, and fast breaks. In previous posts, I examined shot creation in the first half of this year's Premier League season. The data showed that most shots were created from open play - roughly 75% of them - with the remainder split up as follows: 15% from corners, 5% from free kicks, and another 3.5% from fast breaks, with less than 1% resulting from penalties. But, as we also know, some of these shots were much more likely to be accurate and actually constitute an acute threat to the other team. For example, only about 10% of all accurate shots were created in the aftermath of a corner, while the relative importance of penalty kicks goes up threefold when we look only at accurate shots.

These numbers suggest that the relative levels of accuracy vary by type of shot created. That is, not only do the frequencies of shots created from different match situations differ, but the odds that any one type of shot finds the target differ as well. As it turns out, next to penalty shots (which had .88 odds of being on target), the most accurate shots came from fast breaks - or what we can think of as transition play (at a rate of .43). In contrast, shots created in the aftermath of corners were least likely to be on target at a rate of .2.

These numbers help us understand how often certain kinds of shots were created and which were more likely to find the target. Knowing where shots come from and which shot situations are more likely to yield accurate shots is important. It allows us to understand which efforts on the pitch are most and least promising in terms of giving teams opportunities to score. To put it in more soccer analytic terms, these numbers tell the story of shot creation and accuracy.

So far, so good. But they do not tell us where goals come from. For this, we need to go back to the data. The dataset for the first half of the season collected with the help of the Guardian/Opta Chalkboards revealed that each team scored an average of 1.35 goals per match (all of the statistics reported here are at the team level). So where did these 1.35 goals per team and match originate? Below are the numbers of goals created from different match situations in the Premier League.


Clearly, the vast majority of goals were created from open play at a rate of .9 per match. The remainder (about .45 per match, since 1.35 - .9 = .45) were generated from corners (.18), penalties (.095), fast breaks (.074), and free kicks (.038). Another way to think about these numbers is that it took teams 1.1 matches to score a goal from open play, 5.5 matches to score one from a corner, 10.5 matches to score a goal from a penalty, 13.5 to score from a fast break, and 26.3 matches to score a goal from a free kick.
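The "matches per goal" figures are simply the reciprocals of the per-match rates. Here's a minimal sketch in Python that reproduces them; as an extra, it projects each rate over a 38-match league season, which assumes (and this is purely my assumption) that the first-half rates would hold for the whole campaign. The function names are illustrative.

```python
# Goals per team and match by situation, as reported above
# (first half of the season).
GOALS_PER_MATCH = {
    "open play": 0.9,
    "corners": 0.18,
    "penalties": 0.095,
    "fast breaks": 0.074,
    "free kicks": 0.038,
}

# Each Premier League team plays 38 league matches per season.
MATCHES_PER_SEASON = 38


def matches_per_goal(goals_per_match):
    """Return the average number of matches needed to score one goal."""
    if goals_per_match <= 0:
        raise ValueError("goals_per_match must be positive")
    return 1 / goals_per_match


def projected_season_goals(goals_per_match, matches=MATCHES_PER_SEASON):
    """Project a per-match rate over a season (assumes a constant rate)."""
    return goals_per_match * matches


for situation, rate in GOALS_PER_MATCH.items():
    print(
        f"{situation}: one goal every {matches_per_goal(rate):.1f} matches, "
        f"about {projected_season_goals(rate):.1f} over {MATCHES_PER_SEASON} matches"
    )
```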

Sunday, May 8, 2011

Shots That Matter: Team Differences in Creating Chances From Fast Breaks in the EPL

When Hernandez scored for Manchester United in the first minute of play against Chelsea today on a fast break after a beautiful pass from Park, it reminded me of the high value such shots have for teams. Mind you, these kinds of opportunities don't come along all that much in the average match: I have previously noted that teams actually create relatively few shots from fast breaks. Data from the Opta/Guardian chalkboards show that Premier League teams, on average, took about 15 shots per match in the first half of this season (14.7 to be exact). But by far the most common shots were generated from open play (11). In contrast, shots from fast breaks are pretty rare; on average, teams created only half a shot per match from fast breaks (.52), while they generated about four times as many shots from corners (2.1) and about 50% more from free kicks (.82). The only shots more uncommon than shots from fast breaks are penalty kicks.

While shots from fast breaks may be rare, they also are much more likely to pose a threat to the other side. When we break down the data and look only at those shots that were accurate and thus had a real chance of yielding a goal, we see that the odds of shots created from open play being on target are right around the average in accuracy (around .3), and shots created from corner situations were least likely to be on target (at .2). In stark contrast, shots created from transition play (fast breaks) were the most likely to be accurate (ignoring penalty kicks for a moment), at a rate of .43. Clearly, shots from transition play are particularly valuable for teams because they have a much higher chance of being accurate than the average shot.
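To see how volume and accuracy combine, here's a minimal sketch in Python that multiplies shots per match by the on-target rate for the three situations where both numbers are quoted above. Free kicks and penalties are left out because I don't have both figures for them here, and the names are illustrative only.

```python
# Shots per team and match, and on-target rates, as quoted above.
# Only situations with both figures available are included.
SHOTS_PER_MATCH = {"open play": 11, "corners": 2.1, "fast breaks": 0.52}
ON_TARGET_RATE = {"open play": 0.3, "corners": 0.2, "fast breaks": 0.43}


def accurate_shots_per_match(shots, on_target_rate):
    """Return expected accurate shots per match: volume times accuracy."""
    return shots * on_target_rate


accurate = {
    situation: accurate_shots_per_match(
        SHOTS_PER_MATCH[situation], ON_TARGET_RATE[situation]
    )
    for situation in SHOTS_PER_MATCH
}

for situation, value in accurate.items():
    print(f"{situation}: about {value:.2f} accurate shots per match")
```

The point of the exercise: a fast-break shot is roughly twice as likely to be on target as a shot from a corner, but because corners produce about four times as many shots, they still yield more accurate shots per match in absolute terms.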

These averages help us to benchmark team performance, but of course they also typically disguise differences across teams. When we looked at overall shot creation from fast breaks, the data showed that there was quite a mix across teams with regard to who was able to create shots from transition play. The top five performers in the first half of the season included the top three teams (Arsenal, Chelsea, and Man U), but also teams further down in the table (Sunderland and Birmingham) at over .8 shots from fast breaks per match.


At the other end, Wolves, Fulham, and Newcastle created about a fourth of that (.2 shots) per match. One surprise was how similar the profiles of Wigan and Man U were on this dimension, and how mixed the teams were generally.

Wednesday, May 4, 2011

Why the Goal Value of Corners Is (Almost) Nil: Evidence From the EPL

A while back, I wrote about the goal value of corners. Turns out that more corners don't equal more goals. Across the big leagues, the correlation between corners and goals is essentially 0 (it's strongest in the EPL at .06, and weakest in Serie A and La Liga at less than .01). The graphical representation of this pattern tells the story, keeping in mind that the average number of goals a team scores per match is around 1.3 across leagues and seasons.
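For readers who want to run the same kind of check on their own match data, here's a minimal sketch of a Pearson correlation in Python. The corner and goal counts below are made-up placeholders to show the mechanics, not the data behind the figures above, and `pearson_r` is simply my name for the helper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-match totals for one team (placeholders, not real data).
corners_per_match = [3, 5, 7, 4, 9, 6, 2, 8]
goals_per_match = [1, 2, 1, 0, 1, 3, 2, 1]

r = pearson_r(corners_per_match, goals_per_match)
print(f"correlation between corners and goals: {r:.2f}")
```

A value near 0, like the ones reported above, means that knowing how many corners a team won tells you next to nothing about how many goals it scored.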


So, statistically speaking, the offensive "value" of corners seems to be slim to none. Similarly, match outcomes appear to be unaffected by corners. Assuming that earning corners is an indicator of offensive pressure, shouldn't teams that generate lots of corners also be teams that generate more goals? What gives?

One of the issues with the kind of analysis above, of course, is that these data are based on match totals. As such, they are not designed to tell us the odds that any one corner actually yields a goal - they are simply match totals ("macro-level data", so to speak) that are aggregated over the course of an entire match.

To dig a little more into the goal value of corners, what we ideally would also want is more information about the micro-level of play; that is, what exactly happened after any one corner was taken and whether it yielded a goal. More specifically, we want to know what the odds are that any one corner kick ends up with a team putting the ball in the back of the net.

So to get a better handle on this issue, the good folks at StatDNA generously dug into their treasure trove of in-match data and helped me to calculate shots and goals scored from corners for a reasonably-sized sample of about 12-14 Premier League matches from this season. Shots and goals created from this particular match situation are defined here as occurring within three touches of a corner.

We can think of goals produced from corners as a simple chain of events: corners lead to shots, and shots lead to goals (of course, some corners are direct shots and thus some goals are directly scored from corners). So I wanted to know the following:
  • What proportion of corners actually produces shots on goal?
  • What proportion of shots created from corners produces goals?
  • And overall, what is the ratio of corners to goals?
Of course, we would expect some slippage along the way. Not every corner will produce a shot on goal, and not every shot on goal will go in. As a consequence, the ratio of corners to goals is likely to be smaller than 1. "But how much smaller?", inquiring minds will want to know. And where does most of the slippage occur?
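The chain described above can be written as a simple product: goals per corner equals shots per corner times goals per shot. Here's a minimal sketch in Python that lays out the three ratios; the counts passed in at the bottom are hypothetical placeholders to show how the pieces fit together, not the StatDNA numbers, and the function name is my own.

```python
def corner_funnel(corners, shots_from_corners, goals_from_corners):
    """Return the three ratios in the corner -> shot -> goal chain."""
    if corners <= 0 or shots_from_corners <= 0:
        raise ValueError("corners and shots_from_corners must be positive")
    if not goals_from_corners <= shots_from_corners <= corners:
        raise ValueError("expected goals <= shots <= corners")
    return {
        "shots_per_corner": shots_from_corners / corners,
        "goals_per_shot": goals_from_corners / shots_from_corners,
        "goals_per_corner": goals_from_corners / corners,
    }


# Hypothetical counts (placeholders only, not the actual sample).
funnel = corner_funnel(corners=100, shots_from_corners=20, goals_from_corners=2)

for name, value in funnel.items():
    print(f"{name}: {value:.3f}")

# The overall ratio is the product of the two intermediate steps, so
# comparing them shows where most of the slippage occurs.
product = funnel["shots_per_corner"] * funnel["goals_per_shot"]
print(f"product of the two steps: {product:.3f}")
```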

Enough of the preliminaries. Here's what we see for teams in the Premier League.

Monday, May 2, 2011

Creating Shots From Open Play: Which Premier League Teams Perform Best?

Here's another analysis of shot creation in the Premier League. This time, I'm looking into the creation of accurate shots from one specific kind of situation: open play.

From the earlier analyses, we already know the following basic facts from the first half of this year's EPL season, based on data from the Opta/Guardian chalkboards. First, shots from open play are by far the most common type of shot created; about 75% of all shots teams created came from this source. On average, Premier League teams each create about 4.4 accurate shots on goal per match total, and about 3.4 accurate shots from open play.

As always, these averages can easily disguise differences across teams. So here are accurate shot creation statistics from open play separately by team.


Turns out, there was quite a variety in the numbers of accurate shots created from open play, ranging between roughly 2.5 and 5 accurate shots per team and match. Arsenal led the league in the first half of the season in accurate shots created from open play, followed by Chelsea and Manchester United. Surprisingly strong in this category were Wolves and Everton. At the low end, Birmingham, Blackburn, West Ham, Sunderland, Stoke, and West Brom all managed fewer than 3.