Euro 2012: Who will make it to the final?

Author: Vasilis Nikolaou

The Netherlands' Mark van Bommel chases a ball during the match against Germany. The Germans triumphed 2-1. Image by Дмитрий Неймырок/Wikimedia.

As the 2012 UEFA European Football Championship (Euro 2012) approaches its finale, the anticipation among millions of football fans around the world is reaching fever pitch. Four nations remain: the favourites and defending champions Spain, a strong Germany, a good Portugal side and a battling Italy.

After a first round of entertaining and thrilling football, the Netherlands, considered by many to be among the favourites to reach the semi-finals, were eliminated. So were Russia, who were surprisingly beaten by Greece in their last and crucial group game and so failed to advance to the quarter-finals.

Three of the semi-finalists qualified comfortably from their quarter-finals: Germany triumphed against Greece, Portugal overcame the Czech Republic and Spain beat France in an unexciting game. Italy, meanwhile, only just got past England, winning a penalty shootout after a goalless draw. Italy registered 36 shots on target to England’s 9!

Among the fans watching Euro 2012 are statisticians and bookmakers, who will try to give their best odds on the winning team. In this article I will attempt to predict the pair that will make the final with the help of a Bayesian Poisson model and the intuition of Paul the Octopus!

This model was initially developed by Maher (1982) and has been used in the modelling of football scores by other authors including Lee (1997) and Karlis and Ntzoufras (2000). An outline of the model is below.

If $y_{i1}$ and $y_{i2}$ are the goals scored in the $i$th game by the home and away teams respectively, the model is: $Y_{ij} \sim \text{Poisson}(\lambda_{ij})$, $j = 1, 2$, with $\log(\lambda_{i1}) = \mu + \text{home} + a_{HT_i} + d_{AT_i}$ and $\log(\lambda_{i2}) = \mu + a_{AT_i} + d_{HT_i}$ for $i = 1, 2, \ldots, n$, where $n$ is the number of games; $\mu$ is a constant parameter; home is the home effect; $HT_i$ and $AT_i$ are the home and away teams competing in the $i$th game; and $a_k$ and $d_k$ are the attacking and defensive abilities of team $k$, for $k = 1, 2, \ldots, K$, where $K$ is the number of teams competing.
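As a rough illustration of how the two log-linear equations turn parameters into expected goals, here is a minimal Python sketch. The function name `expected_goals`, the team labels "A" and "B" and all parameter values are hypothetical, chosen only to show the arithmetic; they are not estimates from the article's model.

```python
import math

def expected_goals(mu, home, attack, defence, home_team, away_team):
    """Return (lambda_home, lambda_away) under the log-linear Poisson model.

    attack and defence map each team to its attacking (a_k) and
    defensive (d_k) parameter.
    """
    # Home side: constant + home effect + own attack + opponent's defence
    log_lam_home = mu + home + attack[home_team] + defence[away_team]
    # Away side: constant + own attack + opponent's defence (no home effect)
    log_lam_away = mu + attack[away_team] + defence[home_team]
    return math.exp(log_lam_home), math.exp(log_lam_away)

# Hypothetical parameter values, for illustration only
mu = 0.1
home = 0.2
attack = {"A": 0.3, "B": -0.1}
defence = {"A": -0.2, "B": 0.1}

lam_home, lam_away = expected_goals(mu, home, attack, defence, "A", "B")
print(f"Expected goals: home {lam_home:.2f}, away {lam_away:.2f}")
```

In a Bayesian fit these parameters would carry posterior distributions rather than single values; the sketch uses fixed numbers purely to keep the calculation visible.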

Based on the above, $\mu$ corresponds to the overall level of log-expected goals scored in away games, and home is the home effect: the difference in log-expected goals between the home and away sides when two teams of equal strength compete against each other. Moreover, the attacking and defensive parameters $a_k$ and $d_k$ represent the deviations of each team's attacking and defensive abilities from the average level. This means that a positive attacking parameter shows that a team has a better offensive performance than average, while a negative defensive parameter indicates that a team has a better defensive performance than the average level of the competing teams.
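Because the model is linear on the log scale, each parameter acts multiplicatively on the expected number of goals. The short worked example below makes this concrete; the values 0.2 and -0.2 are hypothetical and serve only as illustration.

```latex
\lambda_{i1} = e^{\mu} \cdot e^{\text{home}} \cdot e^{a_{HT_i}} \cdot e^{d_{AT_i}}

% Hypothetical values, for illustration only:
a_k = 0.2 \;\Rightarrow\; e^{0.2} \approx 1.22
  \quad \text{(team } k \text{ scores about 22\% more goals than an average attack)}

d_k = -0.2 \;\Rightarrow\; e^{-0.2} \approx 0.82
  \quad \text{(opponents of team } k \text{ score about 18\% fewer goals than against an average defence)}
```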

Using the model above, Table 1 shows the probabilities of a home win, a draw and an away win for each of the two semi-finals, where the home and away labels are only nominal.

Table 1. Probabilities of match result


As the table shows, Spain are the team more likely to qualify from the first game, while Germany are the more likely to qualify from the second. The chance of a draw after 90 minutes of normal time is almost 30% for the first pair and 26% for the second. Figures 1 and 2 display the probabilities of the number of goals scored by each team.
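Win, draw and loss probabilities of this kind can be obtained from two scoring rates by summing over all possible scorelines. The Python sketch below assumes the two goal counts are independent Poisson variables with fixed rates, and truncates the sum at `max_goals`. The rates 1.4 and 1.0 are hypothetical and are not the article's estimates; a full Bayesian analysis would also average over the posterior distribution of the rates rather than plug in single values.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def outcome_probabilities(lam1, lam2, max_goals=15):
    """Return (P(team 1 wins), P(draw), P(team 2 wins)) after normal time.

    Assumes independent Poisson goal counts; scorelines with more than
    max_goals for either side are ignored (their probability is negligible
    for realistic football scoring rates).
    """
    win1 = draw = win2 = 0.0
    for g1 in range(max_goals + 1):
        for g2 in range(max_goals + 1):
            p = poisson_pmf(g1, lam1) * poisson_pmf(g2, lam2)
            if g1 > g2:
                win1 += p
            elif g1 == g2:
                draw += p
            else:
                win2 += p
    return win1, draw, win2

# Hypothetical scoring rates, for illustration only
p_win1, p_draw, p_win2 = outcome_probabilities(1.4, 1.0)
print(f"win: {p_win1:.3f}, draw: {p_draw:.3f}, loss: {p_win2:.3f}")
```

The same `poisson_pmf` function, evaluated at 0, 1, 2, ... goals, gives the kind of goal-count distribution plotted in the figures.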

Figure 1. Predicted number of goals scored by Portugal and Spain


Portugal's Nani is hoping to work with Ronaldo to overcome the Spanish. Image by Ilya Khokhlov/Wikimedia.


The graph above indicates that the probability of Spain scoring more than one goal is higher than Portugal’s. In particular, the estimated probability of a one-goal margin in favour of Spain is 17%, while the corresponding probability for Portugal is 11%.

On the other hand, a tighter match is to be expected between Germany and Italy, with the Germans slightly more likely than the Italians to score more than two goals. However, Italy seem more likely than Germany to score exactly one goal. Here the chance of a one-goal margin is almost the same for each team (14% in favour of Germany and 13% in favour of Italy).

Figure 2. Predicted number of goals scored by Germany and Italy
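The one-goal margins quoted for the two semi-finals are probabilities of a specific goal difference. A minimal Python sketch of that calculation follows, again assuming independent Poisson goal counts with fixed rates. The function name `margin_probability` and the rates 1.4 and 1.0 are hypothetical, so the printed numbers will not match the article's percentages.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def margin_probability(lam_for, lam_against, margin=1, max_goals=15):
    """P(a team scores exactly `margin` more goals than its opponent).

    Sums over the opponent's goal count g, pairing it with g + margin
    goals for the team; the sum is truncated at max_goals.
    """
    return sum(
        poisson_pmf(g + margin, lam_for) * poisson_pmf(g, lam_against)
        for g in range(max_goals + 1)
    )

# Hypothetical scoring rates, for illustration only
p_team1_by_one = margin_probability(1.4, 1.0)
p_team2_by_one = margin_probability(1.0, 1.4)
print(f"team 1 by one goal: {p_team1_by_one:.3f}")
print(f"team 2 by one goal: {p_team2_by_one:.3f}")
```

Under these assumptions the ratio of the two one-goal probabilities equals the ratio of the scoring rates, which is why two closely matched sides produce nearly identical margins.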


From the evidence above, it appears that we should expect two very exciting games, with every team having a good chance of winning. The most likely pairing for next Sunday’s final is Spain versus Germany, although any other combination is possible. At the end of the day, it’s all about luck!

References

  • Maher, M. (1982), “Modelling association football scores”, Statistica Neerlandica 36, 109-118
  • Lee, A. (1997), “Modeling scores in the premier league: Is Manchester United really the best?”, Chance 10, 15-19
  • Karlis, D. and Ntzoufras, I. (2000), “On modelling soccer data”, Student 3, 229-244

Comments

Meic Goodyear

In the figures, what does the smooth interpolation convey when the number of goals can only take integer values?


Antonio

Should I infer from the two charts that one of the teams can score 2.78182818 or 3.141592653 goals?


Gianluca Baio

Conflict of interest declaration (and may be I should point out to this in disguise: http://www.tandfonline.com/doi/abs/10.1080/02664760802684177 ), but, in our experience, these models do quite well. However, I do hope the model got it wrong this time ;-)

