Wednesday, May 11, 2011

Where Do Goals Come From? Shot Creation and Goals in the Premier League

Goals don't just happen - they are made. Both on offense and defense, teams control how they deploy their resources (read: players) on the pitch, and they make tactical choices about how to attack and defend. Some rely more on fast breaks, while others try to create or avoid chances from open play. In other words, goals are both created and allowed.

Goals are also rare, and therefore precious. Around 70-75% of matches in the top European leagues see 2 goals or fewer; on average, teams score 1.32 goals per match, and 50% of the time they score 1. And, as we've seen before, being up a goal gives teams considerable leverage. So where do goals come from?

As I've said before, you can't score unless you shoot (forgetting about own goals for a second), so shot creation is a critical ingredient in all this. Shots can be created in any number of ways, but for analytical purposes, I have been focusing on 5 different kinds of situations that can lead to shots: shots created from open play, free kicks, penalties, corners, and fast breaks. In previous posts, I examined shot creation in the first half of this year's Premier League season. The data showed that most shots were created from open play - roughly 75% of them - with the remainder split up as follows: 15% from corners, 5% from free kicks, and another 3.5% from fast breaks, with less than 1% resulting from penalties. But, as we also know, some of these shots were much more likely to be accurate and actually constitute an acute threat to the other team. For example, only about 10% of all accurate shots were created in the aftermath of a corner, while the relative importance of penalty kicks goes up threefold when we look only at accurate shots.

These numbers suggest that accuracy varies by the type of shot created. That is, not only do the frequencies of shots created from different match situations differ, but the chances that any one type of shot finds the target differ as well. As it turns out, next to penalty shots (which were on target at a rate of .88), the most accurate shots came from fast breaks - or what we can think of as transition play (at a rate of .43). In contrast, shots created in the aftermath of corners were least likely to be on target, at a rate of .2.

These numbers help us understand how often certain kinds of shots were created and which were more likely to find the target. Knowing where shots come from and which shot situations are more likely to yield accurate shots is important. It allows us to understand which efforts on the pitch are most promising and which are most futile in terms of giving teams opportunities to score. To put it in more soccer analytic terms, these numbers tell the story of shot creation and accuracy.

So far, so good. But they do not tell us where goals come from. For this, we need to go back to the data. The dataset for the first half of the season collected with the help of the Guardian/Opta Chalkboards revealed that each team scored an average of 1.35 goals per match (all of the statistics reported here are at the team level). So where did these 1.35 goals per team and match originate? Below are the numbers of goals created from different match situations in the Premier League.


Clearly, the vast majority of goals were created from open play at a rate of .9 per match. The remainder (about .45 per match, since 1.35-.9=.45) were generated from corners (.18), penalties (.095), fast breaks (.074), and free kicks (.038). Another way to think about these numbers is that it took teams 1.1 matches to score a goal from open play, 5.5 matches to score one from a corner, 10.5 matches to score a goal from a penalty, 13.5 to score from a fast break, and 26.3 matches to score a goal from a free kick.
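The "matches per goal" figures are simply the reciprocals of the per-match scoring rates. Here is a minimal sketch of that arithmetic in Python, using the rounded rates quoted above; the names `GOALS_PER_MATCH` and `matches_per_goal` are my own, and because the inputs are rounded, the results only approximately reproduce the figures in the text (corners come out at about 5.6 rather than 5.5, for instance).

```python
# Goals per team per match by situation, as quoted in the post (rounded values).
GOALS_PER_MATCH = {
    "open play": 0.9,
    "corners": 0.18,
    "penalties": 0.095,
    "fast breaks": 0.074,
    "free kicks": 0.038,
}


def matches_per_goal(rate):
    """Average number of matches a team needs to score one goal at this rate."""
    return 1.0 / rate


# Reciprocal of each rate: how many matches it takes, on average, per goal.
MATCHES_PER_GOAL = {
    situation: matches_per_goal(rate)
    for situation, rate in GOALS_PER_MATCH.items()
}

if __name__ == "__main__":
    for situation, n in MATCHES_PER_GOAL.items():
        print(f"{situation:12s} {n:5.1f} matches per goal")
```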

A side note. The numbers on goals from corners are somewhat higher than what I have previously reported on the goal value of corners using StatDNA's data. Without exploring this in great detail, I can see two ready-made explanations for this: (1) the samples of matches are not equivalent - this is the first half of a season, whereas the StatDNA data contain matches spread across the season; as a result, we are comparing apples and oranges; (2) and this is the more likely culprit, if you ask me: Opta's and StatDNA's definitions of shots and goals created from "corner situations" differ in ways that matter.

Recall that StatDNA's definition counted only goals scored within 3 touches of the corner. This may be fairly restrictive, and it treats corner situations like other dead ball situations (like a free kick, for example); it also allows us to clearly differentiate between corners and open play situations. If that is our analytical goal, then this is a useful definition of events. When we expand the definition to include more than 3 touches, the line between open play and corners becomes slightly more blurry.

To get a quick sense of what is going on in the data, I took a look at some of the specific data points in the Opta/Guardian data, and it appears that (2) is at least partly responsible for the difference. Here's an example. Below is a match situation where the data were coded as a goal resulting from a corner situation: a goal by Christopher Samba for Blackburn against Manchester United back in November.

The Samba goal, shown below, demonstrates the occasionally fuzzy line between open play and corners. After an unsuccessful corner, the ball is played out and crossed in again; there are more than three touches on the ball (and a pass, a cross and an unsuccessful shot on target in between), but the goal is counted as resulting from a corner situation. Of course, this is both true and not true at the same time. The point is that it depends, at least in part, on how analysts define and therefore measure these events.


So keeping in mind that more expansive or restrictive definitions of "corner situations" or "open play" will affect the conclusions we draw about the relative importance of different match situations, another way to slice these numbers is to look at what portion of the overall pie each type of goal constitutes. These numbers are shown below.


Clearly, most goals are scored from open play (about 70% of them). In contrast, the smallest source of goals is free kick situations at less than 3%. In between, and keeping in mind the fuzzy line between open play and corners mentioned above, corners account for more goals than either free kicks or fast breaks.
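For readers who want to check the shares, here is a rough Python sketch that converts the per-match rates reported earlier into proportions. It rests on an assumption of mine: that the shares are taken over the five listed categories, whose rounded rates sum to about 1.29 rather than the 1.35 overall average. Under that assumption, open play comes out at roughly 70% and free kicks just under 3%. The function name `goal_shares` is made up for illustration.

```python
# Goals per team per match by situation, as quoted earlier in the post (rounded).
GOALS_PER_MATCH = {
    "open play": 0.9,
    "corners": 0.18,
    "penalties": 0.095,
    "fast breaks": 0.074,
    "free kicks": 0.038,
}


def goal_shares(rates):
    """Share of goals per situation, relative to the sum of the listed rates.

    Assumption: the denominator is the sum of the five listed categories,
    not the 1.35 overall average quoted in the post.
    """
    total = sum(rates.values())
    return {situation: rate / total for situation, rate in rates.items()}


SHARES = goal_shares(GOALS_PER_MATCH)

if __name__ == "__main__":
    # Print the categories from largest to smallest share.
    for situation, share in sorted(SHARES.items(), key=lambda kv: -kv[1]):
        print(f"{situation:12s} {share:6.1%}")
```

Dividing by 1.35 instead would put open play closer to two thirds, so the choice of denominator matters for the headline figure.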

What does it all mean? I am not entirely sure what it means affirmatively, but what it does not suggest to me is that teams should stop practicing free kicks and instead focus on learning how to dive to earn penalties; or that they should devote 70% of practices to creating shots from open play. To make those kinds of recommendations, two essential pieces are missing: First, information about the team's personnel, and therefore their relative strengths and weaknesses in creating shots from different kinds of situations - put simply, if you have the world's best taker of free kicks, you'd be a fool not to use him. Second, what is missing is information about which types of game situations are more likely to pay off. That is, while most goals are indeed scored from open play, it may be much more efficient for teams to create relatively few but high-percentage shots from other kinds of match situations that are more likely to yield goals. For that, we need to look at efficiency ratios - but that's a topic for another day.