What is the World Football Elo Rating System?

I have seen the World Football Elo rating system referred to a couple of times during the World Cup. People seem to think that it is much better than the Fifa ranking system, although I am not sure that is a high bar. Nevertheless, I became curious, so I decided to look it up. That was easier said than done, and this post became much longer than planned, but here is the result.

According to the football Elo ratings website and Wikipedia, the Elo rating system was developed by Arpad Elo to rank chess players, and adapted to football by Bob Runyan in 1997. The website gives clear formulas for computing the ratings, but lacks information about the motivation behind the choice of parameter values. A key idea is that it is a zero-sum game where points are distributed on the basis of both the actual result and the strength difference between the teams. Here is how it works:

  • Start from some rating R0. After a match, the new ratings are calculated as:

R = R0 + K G (W - We)

  • Results in more important matches count more, parameterized by K:

K is the weight constant for the tournament played:
60 for World Cup finals;
50 for continental championship finals and major intercontinental tournaments;
40 for World Cup and continental qualifiers and major tournaments;
30 for all other tournaments;
20 for friendly matches.

K governs both the weights given to different matches, and how much the outcome of the last match counts relative to the previous rating.

  • Goal difference also counts, and is taken into account in the following way:

G = 1 if the match is a draw or is won by one goal
G = 3/2 if the match is won by two goals
G = (11+N)/8 if the match is won by goal difference N and N is three or more

  • W is the result of the game (1 for a win, 0.5 for a draw, and 0 for a loss).
  • We is the expected result (win expectancy), either from the chart or the following formula: We = 1 / (10^(-dr/400) + 1), where dr equals the difference in ratings plus 100 points for a team playing at home.
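
The rules above can be put together in a few lines of Python. This is only a minimal sketch of my reading of the formulas, not the website's own code: the function names, the neutral_venue flag, and the choice to compute everything from the home team's point of view are my own assumptions.

```python
HOME_ADVANTAGE = 100  # rating points added to dr for the team playing at home


def goal_factor(goal_diff):
    """Goal-difference multiplier G."""
    n = abs(goal_diff)
    if n <= 1:
        return 1.0           # draw or one-goal win
    if n == 2:
        return 1.5           # two-goal win
    return (11 + n) / 8      # win by three goals or more


def win_expectancy(dr):
    """Expected result We for a rating difference dr."""
    return 1 / (10 ** (-dr / 400) + 1)


def update_ratings(home, away, home_goals, away_goals, k, neutral_venue=False):
    """Return the new (home, away) ratings after one match.

    k is the tournament weight (60 for World Cup finals, 20 for friendlies, ...).
    neutral_venue is a hypothetical flag that switches off the home bonus.
    """
    dr = home - away + (0 if neutral_venue else HOME_ADVANTAGE)
    we_home = win_expectancy(dr)

    if home_goals > away_goals:
        w_home = 1.0
    elif home_goals == away_goals:
        w_home = 0.5
    else:
        w_home = 0.0

    change = k * goal_factor(home_goals - away_goals) * (w_home - we_home)
    # Zero-sum: whatever the home team gains, the away team loses.
    return home + change, away - change


# Two equally rated teams meet on neutral ground in a World Cup finals match
# (K = 60); the first wins 1-0, so We = 0.5 and 30 points change hands.
print(update_ratings(1800, 1800, 1, 0, k=60, neutral_venue=True))  # (1830.0, 1770.0)
```

Since the away team's W and We are just one minus the home team's, the two rating changes always cancel, which is the zero-sum property mentioned above.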

The last one gave me a bit of a hard time. After consulting the book Who’s #1?: The Science of Rating and Ranking (2012) by Langville and Meyer and the paper The predictive power of ranking systems in association football (ungated) (2013) by Lasek, Szlávik and Bhulai, I learnt the following:

We is the expected win probability and results from assumptions about the performance distributions of the two teams, which themselves depend on the prior ratings. It appears that, to make things a little simpler, what is often assumed is a distribution of the performance differential directly. The chess system for some reason assumes that the performance difference follows a logistic distribution (to the base 10), so that the win probability is a logistic function of the difference in ratings. Since that function is 1 / (10^(-x) + 1), with x = dr, one can see that we are approaching the expression we are aiming at. But where does the constant 400 come from? Also from the chess rating system. The value of that constant governs the variance of the distribution. If it is “really small” there is little variance and the player/team with the higher rating will almost always win, while if it is “really big” the lower rated team will be lucky and win more often. (In the limit as it approaches infinity, any match will be a 50-50 toss-up, no matter the rating difference.) The home team advantage of 100 rating points looks a little like it is taken out of thin air.
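
To see what the scale constant does, here is a small Python sketch that evaluates the win expectancy for a fixed rating difference of 200 points under three different scales. The function name and the scale parameter are my own; only the value 400 comes from the football Elo formula, while 100 and 1600 are arbitrary values picked for illustration.

```python
def expected_result(dr, scale=400):
    """Win expectancy for rating difference dr, with an adjustable scale constant."""
    return 1 / (10 ** (-dr / scale) + 1)


# A team rated 200 points above its opponent, under three different scales.
for scale in (100, 400, 1600):
    print(scale, round(expected_result(200, scale), 3))
# Roughly 0.99 with scale 100, 0.76 with scale 400, and 0.57 with scale 1600.
```

With the small scale the stronger team is almost certain to win, and with the large scale the match is close to a coin flip, in line with the reasoning above.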

Langville and Meyer praise the Elo system for the flexibility it gives in the choice of the K parameter and the scale (variance) parameter in the win probability distribution, since it means users can tailor the system to their particular purpose. It therefore seems a little peculiar that the football system has simply copied most of the chess system.

Without any documentation, the parameter values seem like a haphazard mix of some things kept from the chess rating system and some just made up. That is not to criticize the creator Runyan or people who use it, and apparently the system is performing quite well, but it would be nice to know if there was some specific thought behind the choices made.

For those interested to learn more, I can recommend the two references above.

Let us not make such a fuss over a bite

People should not bite each other. Neither should they push, kick or stamp on each other. That is why we have regulations against these sorts of behavior, both on and off the football pitch. When someone breaks these regulations, (s)he is punished. The punishment is presumably related to the severity of the offense. Not so. Luis Suarez’ bites are less dangerous than many of the tackles that occur in almost every football match (and that are deservedly punished with yellow and red cards when the referee sees them, as biting should be). Yet the previous time Suarez bit, he was given a 10 match ban, making it to no. 5 on Top 10 longest Premier League bans. Joey Barton is no. 3 on that list with a 12 match ban, because he “elbowed Carlos Tevez, kicked Kun Aguero and attempted to headbutt Vincent Kompany.” Comparable offenses?

Besides, Suarez’ acts are so irrational and non-self serving as to be covered by an insanity defense.

“The law, in its majestic equality, forbids the rich as well as the poor to sleep under bridges, to beg in the streets, and to steal bread.”

Tyler Cowen refers to the great quote by Anatole France in a discussion of “anti-homeless spikes” in the UK. It is highly relevant also in Norway, as the government recently struck a deal in parliament to ban begging, and here too we are trying the “raise the cost of being homeless” approach to homelessness.

“[…] by pushing his jam always forward into the future, [the “purposive” man] strives to secure for his act of boiling it an immortality”

John M. Keynes thus criticized an excessive preoccupation with the future in his essay Economic possibilities for our grandchildren (1930). I was a bit puzzled by this, and the full quote does not really help:

The “purposive” man is always trying to secure a spurious and delusive immortality for his acts by pushing his interest in them forward into time. He does not love his cat, but his cat’s kittens; nor, in truth, the kittens, but only the kittens’ kittens, and so on forward forever to the end of cat-dom. For him jam is not jam unless it is a case of jam to-morrow and never jam to-day. Thus by pushing his jam always forward into the future, he strives to secure for his act of boiling it an immortality.

Helpfully, there is a Wikipedia page on “jam tomorrow”. It turns out that the jam reference comes from Lewis Carroll’s Through the Looking-Glass (1871), in which the White Queen offers Alice work in exchange for jam that she (Alice) will always receive tomorrow, i.e. never. Back to Keynes: such forward-looking behavior helps to solve the economic problem, but as soon as that is done (it will take at least 100 years), we can stop pushing the jam into the future.

(And presumably start eating it, though Alice says she does not care for jam, but perhaps that is another story.)

Monthly book roundup – 2014 May

Books finished in May:
(Warning: reviews are unpolished and quickly written.)

Junkyard Planet: Travels in the Billion-Dollar Trash Trade (2013) by Adam Minter. A fascinating account of the globalized trade in junk. Illustrates how trade connects parts of the world with different specializations. Repeatedly comes back to the fact that the trash trade has an undeservedly bad reputation: Minter several times acknowledges that there are problems with pollution and a lack of labor regulations in many places, but emphasizes that the trade allows materials to be used again rather than end up in landfills. If the trade in junk were not there, we would see a lot more environmentally harmful mining to extract these materials. That is something to think about for greens denouncing the garbage trade. Recommended.


A Troublesome Inheritance: Genes, Race and Human History (2014) by Nicholas Wade. An interesting book. Wade argues that genetic factors are seriously undervalued and indeed repressed as an explanation for human societal diversity. He claims that different social tendencies at the race level have evolved fairly recently and explain much of today’s economic world. His view is a subtle one: these tendencies are not god-given, but have evolved in response to different societies’ needs (“human evolution has been recent, copious and regional”). However, I think he should have gone more deeply into the point that, as in the past, whether traits are good or bad depends on the context, both today and in the future. There was an interesting discussion about the book on Andrew Gelman’s blog.

Predictably Irrational, Revised and Expanded Edition: The Hidden Forces That Shape Our Decisions (2010) by Dan Ariely. Ok. What is often lacking is a discussion of how various seemingly irrational behaviors may not be so dumb in a larger context, but the book is fine enough, and a good popularization of many findings.

Red April (Vintage International) (2006) by Santiago Roncagliolo. Set in Perú in 2000. Prosecutor Félix Chacaltana Saldívar investigates murders purportedly carried out by the Maoist terrorist group Sendero Luminoso. Meets and creates several difficulties. The conflict and violence in the novel are modelled on the real world. Despite this, the book was not really my style.

The Atrocity Archives (A Laundry Files Novel) (2007) by Charles Stross. Occult IT expert Bob Howard starts his journeys in the British intelligence organization “the Laundry”. Charlie Stross’ blog is here.

The Jennifer Morgue (A Laundry Files Novel) (2009) by Charles Stross. Second book about Bob Howard working in the intelligence organization the Laundry. This time he outlandishly finds himself in a literal James Bond plot, the idea behind which is a little difficult to follow at times.

How Pleasure Works: The New Science of Why We Like What We Like (2010) by Paul Bloom. “[…] people naturally assume that things in the world – including other people – have invisible essences that make them what they are. Experimental psychologists have argued that this essentialist perspective underlies our understanding of the physical and social worlds, and developmental and cross-cultural psychologists have proposed that it is instinctive and universal. We are natural-born essentialists. (p xii)” Evolution moulded us this way, and our essentialism determines much of how we experience pleasure from food (how old we believe a wine to be), sex, art (the real painting, not a fake); even if many pleasures evolved as by-products. Maybe, but much essentialism still seems quite silly. It was interesting to learn about an experiment by McClure et al. (2004) which showed that different areas in the brain lit up in fMRI scans when people knew, as opposed to did not know, whether they drank Coke or Pepsi.

The Secret Agent: A Simple Tale (1907) by Joseph Conrad. Disappointing. About anarchist terrorists in London around the end of the 19th century, but one hears little concrete of either anarchism or terrorism, only about the not too interesting characters. One of the characters is supposed to have been an inspiration for the “Unabomber” Ted Kaczynski.

Enigma (1996) by Robert Harris. Picked up this novel, set in the codebreaking center Bletchley Park during World War II, as a follow-up to reading Stephenson’s Cryptonomicon. I learnt less than I had hoped about cryptography. And I do not find historical fiction in which the protagonists contribute major efforts to historical episodes that interesting. Not recommended.

Cryptonomicon by Neal Stephenson. We follow two groups of people, one that is attempting to conceal the fact that the Allied powers have broken the German code system Enigma during World War II, and some of their relatives who try to launch a digital currency in the 1990s. Cryptography plays important roles in both storylines. In contrast to the books of Ramez Naam, which I recently read, the lack of threats of torture is conspicuous and made the plot seem less realistic. The book is good enough, but one should note that it is so long that one can read several other books in the time that it takes to read it.

