Our predictions come from running thousands of simulations that are only partly random. In each match, the probability that one team or the other wins depends on the teams' statistics. For example, if France plays Albania, the probability that France wins is around 83%.
The model has three parts:
1. How strong is each team? To capture this we use two metrics: recent results (measured with an Elo ranking, a method that originated in chess) and the quality of the players (measured by their market value in euros, with data from the Transfermarkt website).
2. Who wins each match? We have trained a model on thousands of matches so that, given two teams and their strength metrics, it estimates how likely each outcome is. The model gives the probability of a win, a draw and a defeat, and even of each score. For example, in that hypothetical match between France and Albania, the most probable results are 2-0 and 3-0, with around 14% each.
3. And how do we predict the entire Euro Cup? We simulate it match by match, knockout tie by knockout tie. We repeat this many thousands of times, until we have 200,000 possible Euro Cups, and can thus estimate the probability of each event.
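The third step can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the four teams, their strength scores and the rule in `win_prob` are invented for the example, and the bracket is a plain knockout with no group stage. The real model derives its match probabilities from Elo points and squad value.

```python
import random
from collections import Counter

# Hypothetical strength scores, for illustration only (not the article's data).
STRENGTH = {"A": 2.0, "B": 1.5, "C": 1.0, "D": 0.5}

def win_prob(team, rival):
    """Toy rule (an assumption): chance of advancing is proportional to strength."""
    return STRENGTH[team] / (STRENGTH[team] + STRENGTH[rival])

def play_tie(team, rival, rng):
    """Draw one random winner of a knockout tie."""
    return team if rng.random() < win_prob(team, rival) else rival

def simulate_cup(teams, rng):
    """Play a single-elimination bracket round by round until one team is left."""
    while len(teams) > 1:
        teams = [play_tie(teams[i], teams[i + 1], rng)
                 for i in range(0, len(teams), 2)]
    return teams[0]

def title_odds(teams, n_sims=20_000, seed=0):
    """Repeat the tournament many times and count how often each team wins it."""
    rng = random.Random(seed)
    wins = Counter(simulate_cup(list(teams), rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in teams}

# Bracket order: A plays D and B plays C in the semifinals.
odds = title_odds(["A", "D", "B", "C"])
print(odds)
```

The share of simulated tournaments that a team wins is the estimate of its probability of lifting the trophy; more simulations give a more stable estimate.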
Frequently asked questions
So you say France is going to win? No, no. Our model says that France is the most likely winner, but also that it has only one chance in five of winning. It is important to interpret this correctly: in reality, its victory is no more likely than seeing a goalkeeper save a penalty.
These figures show that a Euro Cup is difficult to predict. And that is no surprise. First, the tournament's design leaves plenty of room for luck: it is not a regular league, and it has neither playoffs nor two-legged ties. Second, national teams play few important games and their performance is more uncertain than that of a club. And third, this is football, a sport that is exciting because it is full of surprises. Almost no one would sit down to watch a game if the result were already decided.
Have you done this before? Yes. We used a similar model in the 2018 and 2022 World Cups. Here you can read how each worked. The model has proven to be well calibrated: outcomes to which we gave a probability between 0% and 15% occurred 4% of the time, for example, and those with a probability between 85% and 100% occurred 94% of the time. In 2018, we did much better than chance, and better than the FIFA ranking and two big banks (UBS and Goldman Sachs), even though the winner, France, was only our sixth favorite at the start. In 2022, the final was played by two of the three teams that we rated as the best, and we even beat the betting markets.
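A calibration check of this kind can be sketched as follows. The forecasts in `toy_forecasts` are made up for illustration, and the bin edges are an assumption chosen to echo the 0%-15% and 85%-100% ranges mentioned above; this is not the evaluation code used for the World Cups.

```python
def calibration_table(forecasts, edges=(0.0, 0.15, 0.5, 0.85, 1.0)):
    """Group forecasts by predicted probability and report how often they came true.

    forecasts: list of (predicted_probability, happened) pairs.
    Returns rows of (bin_low, bin_high, number_of_forecasts, observed_frequency).
    """
    rows = []
    for i, (low, high) in enumerate(zip(edges, edges[1:])):
        is_last_bin = i == len(edges) - 2
        # Bins are [low, high), except the last one, which also includes 1.0.
        hits = [happened for p, happened in forecasts
                if low <= p < high or (is_last_bin and p == high)]
        if hits:
            rows.append((low, high, len(hits), sum(hits) / len(hits)))
    return rows

# Made-up forecasts: (probability given to an outcome, whether it occurred).
toy_forecasts = [
    (0.05, False), (0.10, False), (0.12, True), (0.08, False),
    (0.50, True),
    (0.90, True), (0.95, True), (1.00, True), (0.88, False),
]

for low, high, n, freq in calibration_table(toy_forecasts):
    print(f"{low:.0%}-{high:.0%}: {n} forecasts, occurred {freq:.0%} of the time")
```

A model is well calibrated when, in each bin, the observed frequency falls close to the probabilities it assigned.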
Should I bet using your predictions? No. Our model is relatively sophisticated and can work well. But betting odds have proven very difficult to beat. Furthermore, to avoid losing money it is not enough to beat them; you must also make up for the margin that bookmakers build into their prices.
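The effect of that margin can be shown with a little arithmetic. The decimal odds below are invented for the example, and normalizing the implied probabilities proportionally is just one common convention, assumed here for simplicity.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds into probabilities and the bookmaker's margin.

    The raw implied probabilities (1 / odds) add up to more than 1;
    the excess is the margin. Dividing by the total is one simple way
    (an assumption here) to recover probabilities that sum to 1.
    """
    raw = [1 / odds for odds in decimal_odds]
    total = sum(raw)
    fair = [p / total for p in raw]
    return fair, total - 1

def expected_profit(my_probability, decimal_odds, stake=1.0):
    """Average profit per bet if my_probability were exactly right."""
    return stake * (my_probability * decimal_odds - 1)

# Hypothetical odds for a win, a draw and a defeat of the favorite.
fair, margin = implied_probabilities([1.20, 6.50, 13.00])
print([round(p, 3) for p in fair], round(margin, 3))

# Betting at the bookmaker's own "fair" probability loses money on average.
print(expected_profit(fair[0], 1.20))
```

Even a bettor whose probabilities matched the bookmaker's exactly would lose on average, because the prices are set below fair value; a model has to be better than the odds by more than the margin.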
Statistical models are useful as a reference, for transparency, and because they let us calculate details that betting odds do not cover. But when it comes to getting it right, bookmakers use a hybrid approach: they combine their own models with fine adjustments made by their experts, who take into account the extra information they have (such as details of playing styles, current form or injuries).
More technical details
What an Elo ranking is. It is a metric that captures the strength of each team based on its results. Each team has a certain number of points (its Elo points), and with each match there is an exchange: the winner takes points from the loser. If the victory is a surprise (because the weaker team wins), the teams exchange more points. Elo rankings work well and are increasingly used in sports and in video games, for example to set up games between players of a similar level. Our model uses the Elo ranking from the Eloratings website.
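The point exchange can be illustrated with the classic Elo formula. This is a minimal sketch: the weight `k=20` and the ratings are assumptions for the example, and the Eloratings website applies its own weights and adjustments, which are not reproduced here.

```python
def expected_score(rating, rival_rating):
    """Classic Elo expectation: 0.5 for equal teams, close to 1 for a clear favorite."""
    return 1 / (1 + 10 ** ((rival_rating - rating) / 400))

def update(rating, rival_rating, score, k=20):
    """Return both teams' new ratings after a match.

    score: 1 for a win, 0.5 for a draw, 0 for a defeat (from the first team's view).
    k: how many points are at stake; 20 is an illustrative choice.
    """
    delta = k * (score - expected_score(rating, rival_rating))
    return rating + delta, rival_rating - delta

# The favorite wins: few points change hands.
favorite_after, _ = update(2000, 1800, score=1)
# The weaker team wins: the exchange is larger.
underdog_after, _ = update(1800, 2000, score=1)

print(round(favorite_after - 2000, 1), round(underdog_after - 1800, 1))
```

Whatever one team gains the other loses, so the total number of points in the ranking stays constant.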
Why we use squad values. Because national teams play few competitive matches, which makes it difficult to measure their performance before a big event. One way to add information to the model is to use the transfer market value of the players that make up the squad. We take the data from the popular website Transfermarkt, adjusted for the age of the players (those over thirty are cheaper because they have fewer years of career ahead of them).
How the match simulator works. We have fitted a simple model that estimates the goals each team will score depending on whether it is better or worse than its rival in Elo points and squad value. The model also takes into account whether a match has a home team or is played on neutral ground, but that has little influence on the tournament. We use a Poisson distribution, which fits goals in football reasonably well and has been used in statistical models and academic studies.
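A sketch of how goal expectations turn into match probabilities is shown below. It assumes that the two teams' goals are independent Poisson variables, and it takes the expected goals as given: the values 2.4 and 0.6 are hypothetical, since the exact formula that links Elo points and squad value to expected goals is not detailed here.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k goals when lam goals are expected."""
    return exp(-lam) * lam ** k / factorial(k)

def match_probabilities(lam_team, lam_rival, max_goals=15):
    """Probabilities of win, draw and defeat, plus every scoreline up to max_goals."""
    win = draw = loss = 0.0
    scores = {}
    for goals in range(max_goals + 1):
        for rival_goals in range(max_goals + 1):
            # Independence between the two teams' goals is an assumption.
            p = poisson_pmf(goals, lam_team) * poisson_pmf(rival_goals, lam_rival)
            scores[(goals, rival_goals)] = p
            if goals > rival_goals:
                win += p
            elif goals == rival_goals:
                draw += p
            else:
                loss += p
    return win, draw, loss, scores

# Hypothetical expected goals for a strong favorite and its rival.
win, draw, loss, scores = match_probabilities(2.4, 0.6)
most_likely = max(scores, key=scores.get)
print(round(win, 3), round(draw, 3), round(loss, 3), most_likely)
```

Because the model produces a probability for every scoreline, the same table yields both the chance of each result and the most likely scores.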
In the end, for our model, the probability of a team winning a match depends on two numbers: its differences from the rival in Elo points and in squad value. The graph summarizes it:
A summary of the model. The graph shows the probability that a team wins a match based on two parameters: its differences from the rival in Elo points and in the economic value of the squads.
A team whose squad is worth 160 million euros more than its rival's and that has 200 more Elo points wins 62% of its matches.
Why do we use a model that predicts goals rather than wins directly? There are two advantages to doing it this way: it helps us resolve the group stage and predict extra time (treated as 30-minute matches). Some studies (starting with Dixon & Coles, 1997) say that models that predict goals underestimate the draws that actually occur. We have verified that this is the case in the Champions League and in the major European leagues, but the effect is smaller in World Cups, and we have decided not to make that adjustment for the Euro Cup.
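One way to treat extra time as a 30-minute match is to scale each team's expected goals by 30/90. The sketch below does that under two assumptions of ours: goals arrive at a constant rate through the match, and the two teams score independently. The expected goals used are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k goals when lam goals are expected."""
    return exp(-lam) * lam ** k / factorial(k)

def extra_time_probabilities(lam_team_90, lam_rival_90, minutes=30, max_goals=10):
    """Win, level and defeat probabilities for a short period of play.

    lam_team_90, lam_rival_90: expected goals over a full 90 minutes.
    The rates are scaled linearly with time (an assumption).
    """
    scale = minutes / 90
    lam_team, lam_rival = lam_team_90 * scale, lam_rival_90 * scale
    win = level = 0.0
    for goals in range(max_goals + 1):
        for rival_goals in range(max_goals + 1):
            p = poisson_pmf(goals, lam_team) * poisson_pmf(rival_goals, lam_rival)
            if goals > rival_goals:
                win += p
            elif goals == rival_goals:
                level += p
    return win, level, 1 - win - level

# Hypothetical 90-minute expectations of 1.5 and 1.0 goals.
win_et, level_et, loss_et = extra_time_probabilities(1.5, 1.0)
print(round(win_et, 3), round(level_et, 3), round(loss_et, 3))
```

With so few goals expected in 30 minutes, the period often ends level, which is when a tie would go to penalties.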
What accuracy can we expect from the model? To calibrate it, we used a database of 18,000 national team matches played since 2004, including hundreds of World Cup and European Championship matches.
On the training data, the model correctly predicts the result of 60% of the matches that have a home team and 56% of those played on neutral ground. Evaluated in probabilistic terms, the model's predictions score between 0.176 and 0.186 (expressed as a Ranked Probability Score, as explained here or here). That is reasonable accuracy compared with other models or with betting odds.
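For reference, the Ranked Probability Score of a single match can be computed as in this sketch, which uses the standard definition for ordered outcomes (win, draw, defeat). Lower is better, and 0 is a perfect forecast. The forecast in the example is hypothetical.

```python
def ranked_probability_score(probs, outcome_index):
    """Ranked Probability Score for one match.

    probs: forecast probabilities in order, e.g. (win, draw, defeat).
    outcome_index: position of the outcome that actually happened.
    Compares cumulative forecast and cumulative outcome, so a forecast
    that was 'close' (a draw instead of a win) is penalized less.
    """
    cumulative_forecast = cumulative_outcome = 0.0
    total = 0.0
    for i, p in enumerate(probs[:-1]):
        cumulative_forecast += p
        cumulative_outcome += 1.0 if i == outcome_index else 0.0
        total += (cumulative_forecast - cumulative_outcome) ** 2
    return total / (len(probs) - 1)

# Hypothetical forecast: 60% win, 25% draw, 15% defeat.
forecast = (0.60, 0.25, 0.15)
for index, label in enumerate(("win", "draw", "defeat")):
    print(label, round(ranked_probability_score(forecast, index), 4))
```

A model's overall score is the average of this value over all the matches evaluated.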
Model training. The dots represent the results of real matches and the lines the model's predictions. For matches between teams with a given difference in Elo ranking (x-axis), the graph shows the percentage that end in defeat, draw or victory (y-axis).
You can also see the evaluations we made of our 2018 and 2022 models after each World Cup ended.