In brief, the proposed system gave a more reasonable estimate of grade, and converged faster than the current system. In addition, a rating never drops for a win against a lower-rated player.
I do not propose to do any more on this unless there is support from others (Craigy, XanthosNZ, whoever - criticism is welcome) and a reasonable chance that it would be implemented (Russ - I am sure you have loads of free time...). The main benefit would be to new players, but it would also remove the penalty a provisionally rated player currently pays for winning against someone rated 200 points lower.
For more detail, read on.
Yours,
Gezza
I have put together a spreadsheet to calculate ratings under a proposed new rating calculation system. The results follow.
The base for the calculation is work by Professor Mark Glickman (see http://math.bu.edu/people/mg/ratings.html) and the description of the FICS implementation of the Glicko system (http://www.freechess.org/Help/HelpFiles/glicko.html).
The difference from the Glicko system as described and implemented on FICS is that I do not propose to implement it wholesale (I can imagine some resistance), which would mean changing all rating calculations. I propose using it only for the first 20 games, where the rating is totally unknown. Implementing the complete system could be a solution to the problems caused by people who, for whatever reason, disappear from the site, lose lots of games, and come back some time later with a low (but established) rating - but that is another discussion.
The system relies on a Rating Deviation value, RD, which measures the uncertainty in the rating. For established ratings, I chose a value of 40, pretty much at random, because it is half the value FICS uses to indicate an active player. For new players, I chose an RD of 350 and a rating of 1200. It might be interesting to use the average value of all established players as an initial rating instead.
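For reference, the single-game update this relies on follows Professor Glickman's published Glicko formulas. A minimal Python sketch (the function names and structure are mine, not from the spreadsheet):

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant, ~0.00575646


def g(rd):
    """Attenuation factor: discounts results against uncertain opponents."""
    return 1 / math.sqrt(1 + 3 * (Q * rd / math.pi) ** 2)


def expected(r, r_opp, rd_opp):
    """Expected score against an opponent with rating r_opp, deviation rd_opp."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko_update(r, rd, r_opp, rd_opp, score):
    """One-game Glicko update; score is 1 (win), 0.5 (draw) or 0 (loss)."""
    e = expected(r, r_opp, rd_opp)
    d2 = 1 / (Q ** 2 * g(rd_opp) ** 2 * e * (1 - e))
    denom = 1 / rd ** 2 + 1 / d2
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / denom)
    return new_r, new_rd
```

Because the expected score is always strictly below 1, a win always moves the rating up - which is where the "never drops for a win" property comes from.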
I chose a random opponent rating within 100 points of the player's current rating, and then calculated the game result based on a known player strength (different from the rating, as even Mr. Kasparov would start at 1200 here). I did the same for the current system (average of opponents' ratings plus results over the first 20 games).
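For the curious, that test setup can be sketched as below. The logistic win-probability model and the uniform opponent sampling are my way of modelling "known player strength"; they are assumptions about the simulation, not part of Glicko itself:

```python
import math
import random


def win_prob(true_strength, opp_rating):
    """Elo-style expected score, driven by true strength rather than rating."""
    return 1 / (1 + 10 ** ((opp_rating - true_strength) / 400))


def play_one_game(true_strength, current_rating, rng):
    """Pick an opponent within 100 points of the current rating and
    decide the result from the player's true strength."""
    opp_rating = current_rating + rng.uniform(-100, 100)
    score = 1.0 if rng.random() < win_prob(true_strength, opp_rating) else 0.0
    return opp_rating, score
```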
The results were (C = current system, P = proposed system, after 6 and 20 games):

True strength    C6     C20    P6     P20
600              807    777    671    655
1000             940    920    990    1038
1400             1474   1414   1422   1397
1800             1474   1770   1602   1722
2200             1607   1937   1750   2008
The odd early result for the 1800 player appears to be due to randomness in the test.
The K factor fell from 350 for game 1 to 34 for game 20, on what looks like an exponential decay curve - K was 90 for game 7.
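That falling K is not a separate table: it drops out of the Glicko update itself, because the coefficient applied to (score - expected) shrinks as the player's own RD shrinks. A sketch using Glickman's single-game formulas, evaluated against an equally rated established opponent (my framing of "K", not a term Glicko defines):

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant


def g(rd):
    """Attenuation factor for an opponent's rating deviation."""
    return 1 / math.sqrt(1 + 3 * (Q * rd / math.pi) ** 2)


def k_factor(rd, rd_opp=40.0):
    """Multiplier on (score - expected) for a game against an equally
    rated opponent with deviation rd_opp; a proxy for the 'K factor'."""
    e = 0.5  # expected score versus an equally rated opponent
    d2 = 1 / (Q ** 2 * g(rd_opp) ** 2 * e * (1 - e))
    denom = 1 / rd ** 2 + 1 / d2
    return (Q / denom) * g(rd_opp)
```

With the starting RD of 350 this gives K of about 350, and K falls steadily as the RD tightens with each game - consistent with the curve observed above.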
I did not calculate the effect on opponents' ratings. The calculation is similar, but I would rather spend the time playing than doing more of this.
Nor did I include provisionally rated players among the opponents. That would require accounting for their RD, and the tables become messy for little gain.
In Professor Glickman's system, RD increases with the time since the last game played, so if someone drops off the site for a while, their RD rises. RD thus indicates how active a player is, and therefore how reliable their rating estimate is. Too high an RD would indicate that a player's rating is unreliable, so they should not enter banded tournaments - in the same way a provisional rating is used now.
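Glickman's suggested form for this growth is RD' = min(sqrt(RD^2 + c^2 * t), 350), where t is the number of rating periods of inactivity and c is tuned per site. A sketch (the value of c below is purely illustrative, not a recommendation):

```python
import math

MAX_RD = 350.0  # a brand-new player's RD; a rating never gets less certain than this


def inflate_rd(rd, periods_inactive, c=34.8):
    """Grow RD with inactivity, capped at MAX_RD.  The constant c controls
    how fast certainty decays; c=34.8 takes an RD of 40 back to 350 in
    roughly 100 rating periods (illustrative only - c is a site choice)."""
    return min(math.sqrt(rd ** 2 + c ** 2 * periods_inactive), MAX_RD)
```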
Professor Glickman has also developed a Glicko-2 system (I had trouble reading the file, so did not try it), which addresses step changes in ability by increasing the effective RD. This may help deal with players who start to study, or who simply stop playing: their RD goes up, reducing the effect on other players' grades.
It would be some work to implement these systems, including ensuring that no intellectual property is used without permission. It is not worth doing unless there is a need and a commitment to implement the change.