This is post is part of a series of posts describing a predictive model for the Eurovision Song Contest. The full set of posts can be found here.
Previously on the Eurovision Song Contest
Last year around this time, I wrote a series of blog posts outlining a Bayesian predictive model for the Eurovision Song Contest, and making a set of predictions about last year’s contest. Apart from a few hiccups relating to Malta, the model was a fairly qualified success. This year, by “popular” demand, I’ve revisited the model and brought it up to date for 2013, ahead of Saturday’s showdown in Malmö.
For those just coming in, I’m going to give a quick recap of the model that I used last year. If you’re already familiar, feel free to skip ahead to the next section, where I’ll talk about what’s new for this year. If you’re looking for a more detailed look at last year’s contest, take a peek at the series of posts from the start
Essentially, we can look at people’s voting preferences in the Eurovision Song Contest as composed of two components: song quality, and a “friendship” score, which takes into account how much the voting country likes or dislikes1 the country being voted on. If we want to know whether a voter V will rank country A or country B higher, we add up the song quality and the friendship score in each case, and subtract the two. Then we can take this difference, feed it through a logistic curve and use the result as a probability.
I’ve taken voting results from both the Eurovision finals (going back to the introduction of televoting in 1998) and the semi-finals (going back to their introduction in 2004). I’ve then used a Markov Chain Monte Carlo sampler2 to calculate the song qualities and friendship scores, assuming that they’re both normally distributed.
Once I’ve got the parameters, it’s relatively straightforward to run a simulation of this year’s contest, including the semifinals and all the voting procedures. Last year I ran 10,000 simulations like this, and looked at both who the likely qualifiers from the semifinals were, and the overall winner. The model managed fifteen out of twenty of the qualifiers, as well as the eventual winner. Although this level of success, particularly predicting the winner, was probably mostly luck, this year I’d like to do the same.
The biggest change to the contest since last year is that four countries have pulled out (Bosnia-Herzegovina, Portugal, Slovakia, Turkey), and one has returned (welcome back, Armenia!). This obviously has effects on the voting dynamics, both directly (e.g. all the Balkan countries no longer get votes from Bosnia-Herzegovina) and indirectly (e.g. Germany will have an extra voting slot free, rather than spending it on Turkey). Overall, we should expect a slight rebalancing of the votes, although the changes are fairly evenly spread across Europe, so it’s unclear what the overall effect should be.
In terms of the model, I’ve added two terms to try to increase the accuracy. The first of these is a term which sets the average song quality for a given country. If we look at countries’ past record in Eurovision, some stick out as more consistently successful (or unsuccessful) than others. For example, Azerbaijan have never finished outside the top ten, and only once outside the top five. On the other hand, Switzerland seem to have trouble even qualifying for the final, and on one occasion even scored “nul points” in a semi-final, which is quite an achievement, given the competition. This leads us to an idea that some countries might, through greater enthusiasm for the contest, a larger talent pool, or some quirk of their selection process, just be plain better at producing Eurovision entries than others.
The term I’ve added simply affects the mean song quality for the country: all countries still have the same variance, and they all still have the potential to produce songs of any quality. However, on average, some countries do end up better than others. The effect here is smaller than the general variation in song quality, but still relatively large overall: it ends up explaining about a third to a half of the variation in song quality. I’ve plotted the value of this term below for all the countries which have competed since 1998. This year’s entrants are in green, and the bars show one standard deviation of song quality: roughly speaking the song quality should be inside the bars about two-thirds of the time.
Countries near the top of the list (e.g. Azerbaijan, Russia) tend to be those which put a lot of resources into the contest, and see it as a way of promoting their culture throughout Europe. We also see high places for countries such as Sweden and Italy which have very well-developed national song competitions, which gives a strong talent pool to draw on. Near the bottom of the list we see a lot of smaller countries (Andorra, Monaco, San Marino) which simply don’t have the resources or the talent pool to compete successfully.
It’s also fairly notable that (barring Turkey and Bosnia-Herzegovina), the countries which are not competing this year are largely those which typically produce low-quality songs. It’s hard to tell which way the relationship works here. It’s possible that these countries have low enthusiasm for Eurovision, and thus have small talent pools to pick from, and pulling out is less of a big deal. It’s also possible that a string of poor performances could lead to a country becoming disillusioned with the competition.
The second change I’ve made to the model is to introduce a term accounting for the gender of the performers3. There’s a definite effect there: all-female entries are slightly better than all-male entries, which are a lot better than mixed entries. However, the overall magnitude is fairly small, around 0.3 quality units, roughly equal to the quality bonus a song gets for being from Malta.
This year, I’ve run 100,000 simulations of the full contest. Looking just at the first semi-final for now, there are 10 qualification places available, from a field of 16 countries. From a completely naïve standpoint, this means each country has a basline qualification probability of 62.5%. In reality though, some countries are more likely than others. For the first semi-final, this is what the model says.
In general, this year’s model is more certain about its predictions than last year’s; time will tell if it’s any more accurate. Anyway, as I think most people would predict, Russia and Serbia are dead certs for qualification. Montenegro, on the other hand, have a mountain to climb. There aren’t enough Montenegrins for the diaspora to have any significant effect on most countries’ voting patterns (although Serbia, Croatia and Slovenia will probably give them a few).
The interesting stuff is in the middle of the table. Ireland benefit from having the UK and Denmark voting in their semifinal, while Slovenia will have a boost from having three other Balkan countries in the mix. I’m not sure whose idea it was to put Netherlands and Belgium in the same semifinal, but it’ll be interesting to see if it gives either of them a big enough boost.
Overall, I think Russia (94%), Serbia (91%) and Ukraine (86%) are safe. Denmark (76%), Croatia (74%), Estonia (69%) and Moldova (66%) are also pretty good bets. Beyond that, it looks like Lithuania (60%), Cyprus (58%) and Slovenia (58%) but I don’t think it’s safe to rule out Ireland (55%).
Back in Baku?
Looking on to the final then, once again the model is more confident than last year (although possibly no more accurate). The top predictions are similar, but not identical, to the list of countries with highest average song quality. Quirks of the voting have boosted Serbia’s relative chances, for example. Overall though, the lists are very similar.
Overall, we’d be foolish not to plump for either Azerbaijan or Russia as winner. The bookies, on the other hand, are heavily pushing the Danish entry. At time of writing, the available odds on Betfair are around 2.5, implying a win probability of 40%. I’d take this with a note of caution though. While the wisdom of the crowds is often right, Eurovision betting has something of an echo chamber effect, with lots of people piling on the perceived favourite, and driving the odds downwards. Two years ago, the favourite was France which finished in an ignominious 15th place, losing out to… Azerbaijan.
Let’s ignore for now what it means for a country to “like” another country. Countries that like each other vote for each other. ↩
Gender is performance, and Eurovision doubly so. In each case I’ve assigned people the gender that they appear to be performing as. For avoidance of dount, Dustin the Turkey is male, and Verka Serduchka is female. ↩