The SimpElo Team Ratings: A simple Elo model for rating teams in the AFL

Posted in Ratings

Developing an accurate and realistic rating system is often a primary goal for a sports analyst. Just about every organised sporting competition in the world has its own implicit rating system in which we expect "good" teams to be rated higher than poorer teams. In AFL footy we call this the ladder. The problem with using a team's position on the ladder to infer how well it plays is that the ladder is sorted primarily by wins. While winning lots of games is important (#analysis), how many games a team has won previously is not always the best indicator of how many it will win in the future. This is especially true of a competition like the AFL, which uses an uneven draw. A team towards the top of the ladder that has yet to face any other strong teams has obviously not yet proven itself to be a strong side.

A "true" rating system provides us with a wonderful descriptive and predictive tool. We can compare teams over time. (Just how does this year's Hawthorn team hold up against Brisbane of the early 2000s?) We can map changes in team rating after notable player and administrative changes. (How important will Patrick Dangerfield's move from Adelaide to Geelong be for both sides?) And perhaps most tantalising for some, we can calculate implied probabilities for upcoming matches and even seasons, and make a profit betting against inefficiencies in sports-betting odds. (What is a fair price for Hawthorn to make it 4 in a row next season?)

Given this motivation, I have created a few different types of rating systems that I have been testing out over the last season. Today I'll introduce you to the simplest of these, a basic Elo model which I have dubbed "SimpElo"1, and show you the impressive results that can be achieved with just a few basic principles.


What's Elo


Elo Rating System creator, Arpad Elo.

Elo is a rating system developed by Arpad Elo for use in determining the relative skill of players in international chess competitions. It has since found a footing as a simple rating system in many player v player and team v team sports including football, NBA, NFL and even video gaming.2 A number of Elo-based rating systems for the AFL already exist, but as you will see, I think improvements can be made. Tony Corke at MatterOfStats uses an Elo system for his weekly ChiPS rankings and predictions, and he frequently discusses his methodology in informative posts. The Footy Maths Institute also provides season-long commentary on their "modified Elo" system.

The beauty of the Elo rating system lies in its simplicity. At its heart, all the computation is done by one simple formula. All teams start the season with an initial rating (more on this later). After each match, the change in each team's rating is calculated as so.

Change\ in\ rating = k \times (Result - Probability\ of\ Win)

Where Result is either 1, 0 or 0.5 depending on whether the team wins, loses or draws. The great thing about this is that the team that wins the match gains exactly as many points as the losing team loses. More points are gained by winning a match in which you were given a low chance of success than by beating a team you were expected to beat. Which leads us on to how win probability is calculated. Luckily, it's not much harder. We first calculate the probability of the home team winning. The probability of the away team winning is simply 1 minus this number. (The model does not assign a probability to a draw.)

The probability of a home win is calculated as so.

Probability\ of\ Win = \frac{1}{10^{\frac{InitOppRat-InitRat-HGA}{400}}+1}

Where InitOppRat is the initial rating of the opponent, InitRat is the initial rating of the home team and HGA is the home-ground advantage that the home team enjoys. This formula has the nice property that an HGA-inclusive difference of 400 between the team ratings equals odds of exactly 10:1.
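To make the formula concrete, here is a minimal Python sketch of the home-team win probability. It assumes the standard Elo form, in which the + 1 is added to the power of ten in the denominator (this is what the 10:1 property requires). The function and argument names are my own, chosen to mirror the symbols in the formula.

```python
def win_probability(init_rat, init_opp_rat, hga=0.0):
    """Probability that the home team wins.

    init_rat      -- pre-match rating of the home team (InitRat)
    init_opp_rat  -- pre-match rating of the opponent (InitOppRat)
    hga           -- home-ground advantage, in rating points (HGA)
    """
    exponent = (init_opp_rat - init_rat - hga) / 400
    return 1 / (10 ** exponent + 1)


# Two evenly rated teams on a neutral ground are a coin flip.
p_even = win_probability(1500, 1500)

# A 400-point edge (here supplied entirely by HGA) gives 10:1 odds,
# i.e. a probability of 10/11.
p_big_edge = win_probability(1500, 1500, hga=400)

# The away team's probability is simply the complement.
p_away = 1 - p_big_edge
```

Note that the rating gap and the HGA are interchangeable inside the exponent: a 400-point HGA between equal teams produces the same probability as a 400-point rating gap on a neutral ground.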

At its core it's as simple as that. After every game we add/subtract points from the winner/loser and calculate their new rating dependent only on their results and the implied difficulty of the schedule they have faced. Of course, there are some parameters in all this that we need to fix before we can do that.
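The update step can be sketched in a few lines of Python. The value of k used here is a placeholder for illustration only, not one of SimpElo's fitted values, and the function name is hypothetical.

```python
def rating_change(k, result, p_win):
    """Elo update: k * (Result - Probability of Win).

    result -- 1 for a win, 0 for a loss, 0.5 for a draw
    p_win  -- the pre-match win probability for this team
    """
    return k * (result - p_win)


k = 20          # placeholder k-factor, for illustration only
p_home = 0.75   # suppose the home team was a 75% favourite

home_delta = rating_change(k, 1, p_home)       # favourite wins: +5.0
away_delta = rating_change(k, 0, 1 - p_home)   # underdog loses: -5.0

# Had the underdog won instead, it would have gained three times as much.
upset_delta = rating_change(k, 1, 1 - p_home)  # +15.0
```

Because the two teams' probabilities sum to 1 and their results sum to 1, the two changes always cancel, which is why the league-wide average rating never moves.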
 


Home-Ground Advantage


Home-Ground Advantage is without doubt a factor. Home teams win significantly more often than would be expected if all games were played at a neutral venue. The average score for an away team is around a goal and a half lower than the average home team score. I think it's important to understand this and account for it, which is why, even in its quest for simplicity, SimpElo recognises an HGA variable while forgoing many other game variables.

Subiaco Oval provides one of the more formidable away challenges in domestic sport. Away teams often travel over 3000km to play West Coast and Fremantle here.

The reasons for Home-Ground Advantage are numerous and not all that well studied. However, we do know that some form of HGA exists in almost every sport in the world. Many theories for why this is so have been suggested. Ground familiarity, crowd intimidation, travel benefits, referee bias, even the psychological benefit that comes with "defending" one's "home" have all been among those posited theories.

In SimpElo, I include only two main factors when assessing HGA for a specific game: the amount of experience each team has at the venue (counted as the number of games played at the venue up until that point in both the current season and the previous one)3, and the distance that each team must travel to make the game (measured as the plane would fly).4 Together, these two factors can model most of the suggested reasons for HGA. For example, a team with greater ground experience and less travel distance can reasonably be expected to have a greater proportion of the crowd barracking for them.

The formula for HGA looks like this.

HGA= \alpha \times LogDiffDist + \beta \times \frac{HTGroundExp}{HTGroundExp-ATGroundExp}


Where,

LogDiffDist= ln(Dist\ Travelled\ by\ AT) - ln(Dist\ Travelled\ by\ HT)

α and β are constants that we will optimise later. The benefit of using the difference of natural logarithms (ln) of the distance travelled by both teams is twofold. Firstly, it recognises that the effect of travelling a little bit further when you are already travelling a long way is negligible (the difference between a flight from Brisbane to Melbourne and a flight from Perth probably isn't that significant). Secondly, it minimises the travel-related Home-Ground Advantage in those games where both teams have to travel (think North and Hawthorn in Tassie, GWS in Canberra, etc.).5
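A small Python sketch of the LogDiffDist term shows both properties. The distances below are rough illustrative figures in kilometres (the unit is my assumption), and the function name is hypothetical. Zero travel is replaced with a distance of 1, as described in footnote 5.

```python
import math


def log_diff_dist(away_dist, home_dist):
    """ln(distance travelled by away team) - ln(distance travelled by home team).

    A team that does not travel is given a distance of 1 so that its
    log-distance is ln(1) = 0 (footnote 5).
    """
    away = away_dist if away_dist > 0 else 1.0
    home = home_dist if home_dist > 0 else 1.0
    return math.log(away) - math.log(home)


# Illustrative distances only.
short_trip = log_diff_dist(700, 0)    # away side flies 700, home side stays put
double_trip = log_diff_dist(1400, 0)  # twice as far
long_trip = log_diff_dist(2700, 0)
longer_trip = log_diff_dist(3400, 0)  # another 700 on top of an already long trip

# Going from 700 to 1400 adds ln(2) to the term; going from 2700 to 3400
# adds far less, even though the extra distance is the same.
extra_when_short = double_trip - short_trip
extra_when_long = longer_trip - long_trip

# When both teams travel, the term shrinks towards zero.
both_travel = log_diff_dist(600, 500)
```

In the full HGA formula this term is multiplied by α, so the sketch covers only the travel component, not the ground-experience component.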
 


Starting Ratings


I'll run the model starting from the 2003 season as this gives a nice little snapshot of what can be considered the "modern game".6 The beauty of Elo models is that they can very quickly adjust and reflect the quality of the teams, but we still need to give the model some initial values to work from. In the tradition of Elo, I choose to let every team start the 2003 season on 1500 points. As Elo rating changes are zero-sum (the winner wins what the loser loses), 1500 remains the average rating no matter what time we look at the model. A rating of 1500 suggests a team completely in the middle of the pack.

Starting subsequent seasons with each team rated 1500 would be silly. We now have prior information! Why would you knowingly consider Hawthorn and Carlton to be evenly rated teams next season? Similarly, giving each team the starting rating that they finished last season with would be just as nonsensical. Given the AFL's "equalisation policies" (draft picks and concessions), champion teams ageing, and what we know about regression toward the mean (lucky teams are unlikely to be as lucky next year, and vice versa), on average we can expect the bad teams to get better and the good teams to come down to earth a bit.

Port Adelaide are probably 2015's best example of “coming back down to earth”.

Because of this, I choose to (somewhat arbitrarily) regress each team's end-of-previous-year rating to the mean by 40% when calculating start-of-season ratings. For example, if a team finishes with a rating of 1600, they'll start next year with a rating of 1600 * 0.6 + 1500 * 0.4 = 1560. This brings all teams back to a more level playing field, giving any off-season changes a chance to be quickly reflected in the Elo ratings.7
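As a quick sketch in Python, the carry-over rule is a weighted average of last season's final rating and the mean. The names are my own; the 60% carry-over and the 1500 mean come from the text.

```python
MEAN_RATING = 1500
CARRY_OVER = 0.6  # keep 60% of last season's rating, regress 40% to the mean


def start_of_season_rating(end_rating, carry_over=CARRY_OVER, mean=MEAN_RATING):
    """Regress an end-of-season rating toward the league mean."""
    return carry_over * end_rating + (1 - carry_over) * mean


strong_side = start_of_season_rating(1600)  # the 1600 -> 1560 example above
weak_side = start_of_season_rating(1400)    # a below-average side moves up instead
average_side = start_of_season_rating(1500) # an average side is unchanged
```

Because every team is pulled toward the same mean by the same proportion, the league-wide average stays at 1500 after the adjustment.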

There is also the slight problem of the two expansion sides entering the competition in 2011 and 2012. Rather than give them starting ratings of 1500 I give Gold Coast and GWS starting ratings of 1400 and 1347 respectively and share the other points equally among every other team in order to keep the average at 1500. This reflects their starting skill level a bit more fairly and prevents the teams that played them early on in the season before their ratings readjusted from racking up too many undeserved points.8
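The redistribution can be sketched as follows in Python. The function name is hypothetical and the existing ratings are placeholders (every team on 1500) purely to keep the arithmetic visible. The post does not say exactly when in the pre-season calculation the adjustment is applied, so treat this as one plausible reading.

```python
def add_expansion_team(ratings, new_team, new_rating, mean=1500):
    """Add a new team below the mean and share its shortfall among the rest.

    ratings -- dict of existing team ratings, assumed to average `mean`
    Returns a new dict; the input is left untouched.
    """
    shortfall = mean - new_rating
    bonus = shortfall / len(ratings)  # equal share for every existing team
    updated = {team: rating + bonus for team, rating in ratings.items()}
    updated[new_team] = new_rating
    return updated


# Placeholder league: 16 existing teams, all on 1500.
league = {f"Team {i}": 1500.0 for i in range(1, 17)}

league_2011 = add_expansion_team(league, "Gold Coast", 1400)
average_2011 = sum(league_2011.values()) / len(league_2011)
```

With 16 existing teams, Gold Coast's 100-point shortfall works out to 6.25 points per team, and the 17-team average remains 1500.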
 


The K-Factor


If you've forgotten what the k-factor is, then scroll up to have a look at the formula again. The k-factor basically determines how drastically the team ratings respond to the week-to-week results. If the k-factor is too large we get many drastic swings: teams can go from very strong to very poor in a matter of weeks, there is no stability and the ratings are meaningless. Conversely, if the k-factor is too small we see almost no movement in team ratings after favourable or unfavourable results, and the ratings reflect little more than the starting ratings that we chose. Choosing an appropriate k-factor is the key to any good Elo model.

In SimpElo I choose to have a variable k-factor that can take on one of two values: a higher value for all competitive games, where we can assume that both teams are playing to their full ability and the result gives us a good indication of each team's strength, and a lower value for those end-of-season games where at least one team has already shut up shop and is trialling new players. These games don't really tell us as much about the underlying strength of either team.

I choose to assign the lower k value to any game played in the last 5 weeks of the regular season that features a team out of contention for making the top 8. All other games, including finals, get the higher value.
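The rule above can be written as a small Python function. This is only a sketch: it treats rounds as weeks, takes "still in contention" as a yes/no input (the post does not say how that is determined), and uses placeholder k values rather than the fitted ones.

```python
def choose_k(round_number, total_rounds, is_final,
             home_in_contention, away_in_contention, k_high, k_low):
    """Pick the k-factor for a match.

    The low value applies only to regular-season games in the last five
    rounds that feature at least one team out of top-8 contention.
    """
    if is_final:
        return k_high
    in_last_five = round_number > total_rounds - 5
    someone_out = not (home_in_contention and away_in_contention)
    return k_low if (in_last_five and someone_out) else k_high


K_HIGH, K_LOW = 30, 10  # placeholder values, not SimpElo's optimised ones

# Round 20 of 23, away side can no longer make the eight: low k.
late_dead_rubber = choose_k(20, 23, False, True, False, K_HIGH, K_LOW)

# Same match-up in round 10: still treated as fully competitive.
mid_season = choose_k(10, 23, False, True, False, K_HIGH, K_LOW)
```

Passing the contention flags in from outside keeps the k-factor rule separate from whatever ladder arithmetic is used to decide whether a team can still make the eight.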
 


Optimisation


We now have four parameters to optimise: α and β (from the HGA calculation), high-k and low-k. I use complete data from seasons 2005-2013 as training data and optimise these parameters against their Logarithmic Probability Score. This leaves seasons 2014 and 2015 free for out-of-sample testing, as an indication of the validity of SimpElo in prediction.
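For readers unfamiliar with log-probability scoring, here is a rough Python sketch of the idea. The exact form of the score used for SimpElo is not spelled out here, so this assumes the simplest version: the average natural log of the probability the model gave to the team that actually won, with draws left out. Changing the log base or adding a constant would rescale the score but would not change which parameter set scores best.

```python
import math


def mean_log_score(predictions):
    """Average log-probability assigned to the actual winner.

    predictions -- iterable of (p_home_win, home_won) pairs, where
                   p_home_win is strictly between 0 and 1 and home_won is
                   True or False. Draws are assumed to be excluded.
    Scores are <= 0; closer to 0 is better.
    """
    total = 0.0
    count = 0
    for p_home_win, home_won in predictions:
        p_actual = p_home_win if home_won else 1 - p_home_win
        total += math.log(p_actual)
        count += 1
    return total / count


# Made-up predictions to show the behaviour.
coin_flip = mean_log_score([(0.5, True)])
confident_and_right = mean_log_score([(0.9, True)])
confident_and_wrong = mean_log_score([(0.9, False)])
```

A confident correct tip beats a coin flip, and a confident wrong tip is punished much more heavily than the correct one is rewarded, which discourages overconfident parameter choices. An optimiser that minimises would be handed the negative of this score.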

All optimisation is done inside R using the optim function from the stats package.
 


Results


I'll go into more detail about the results and the overall usefulness of SimpElo in a follow-up post, but for now I'll leave you with the end-of-season team ratings for 2015 and you can make of them what you will. Keep in mind that this is a self-correcting model that knew absolutely nothing about the teams in 2003. There are still a lot of tests to conduct, but it certainly has potential.

SimpElo final ladder 2015


  1. Rhymes with tangelo.
  2. Expanding Elo from player v player to team v team requires a few restrictive assumptions. Perhaps the most unrealistic of these is that the playing group remains constant, especially in the short term.
  3. Looking at only two seasons is arbitrary but works well in ensuring the bulk of the team is pretty much the same. The make-up of many teams from 3 years ago is quite different from today. We shouldn't be calculating that past team's experience.
  4. For simplification, Geelong are counted as a Melbourne team, Gold Coast a Brisbane team, etc.
  5. If there is no distance travelled we set the distance to 1, because ln(1)=0.
  6. "Zone football" came in sometime in the mid-2000s. Neil Craig's Adelaide and Paul Roos' Sydney were some of the best early proponents. This is roughly the time I consider "modern day" footy to have started.
  7. I could try to optimise the carry-over figure rather than just make it 60%, but for simplicity I have decided not to.
  8. Both teams actually fell much lower than this in their first season, but I wanted to be at least somewhat fair with the redistribution of points.
