Are rating systems the best way to fairly assess skills, or do they reveal the limits of quantification?

Rating systems are used in a variety of fields as a tool to objectively evaluate skills, but the process raises the question of whether every skill can really be quantified.

Skill is the ability to do something. That “something” can be anything: studying, competitions, board games, online games, and more. People often call themselves “good” at something by comparing themselves with the people around them. However, skill is hard to measure objectively. Doing so requires a metric that lets anyone compare their own skill with someone else’s on the same scale, and this is where the difficulty comes in. In most fields, there is no simple way to line up different people’s skills.
In addition, the concept of skill is inherently subjective, so comparing yourself to others isn’t necessarily accurate. For example, musicianship isn’t just about technical proficiency; it’s also about emotional expression and creativity. Similarly, in sports, in addition to physical strength and skill, mental strength, game management, and other factors affect performance. It’s always going to be controversial to reduce the complexity of “skill” to a single number.
However, the benefits of quantifying skill are clear. In an environment where competition is inevitable, it’s necessary to provide a fair standard. Whether someone is a better student, a better chess or Go player, or a better gamer is not something that can be determined after just one test or one game. Dozens, hundreds, or even thousands of data points must be accumulated before we can compare two people’s skills, and it is difficult to take that many tests or matches.
In order to solve this problem, attempts have been made to objectively measure skill in various fields. One such example is the “Rating System,” which aims to turn the ambiguous metric of “skill” into a “number” or “rating” that anyone can objectively recognize. This number is adjusted to reflect the appropriate level of skill by increasing as you perform well against others and decreasing as you perform poorly. The way ratings are scaled varies from field to field, and as a result, there are many different names and different types of rating systems. In this blog post, we’ll take a closer look at the history, principles, and uses of rating systems.
The first rating system was introduced in chess. In the early 1900s and before, chess players were judged simply by their tournament results, so even a very strong player who was eliminated early in a tournament would be regarded as a lesser player, or would have to enter larger tournaments to be evaluated properly. To solve these problems, the first rating system was introduced in 1950. The idea was to compare players’ abilities objectively by factoring “every” game a player played into a single skill rating scale.
However, the first rating systems had many problems with their formulas and criteria, often failing to provide an objective comparison of skill, and many people still stuck to the classic method. It was Dr. Arpad Elo, an American physics professor and chess player, who solved this problem and ensured that the ratings accurately represented each player’s skill. In honor of his work, the current rating system for chess is called the Elo Rating, and it’s used in much the same way for other competitive board games and online games.
There are two main principles underlying the rating system. First, all players start at the same rating, and their rating increases or decreases as they win or lose matches. This makes sense, as ratings are a measure of skill. Second, the size of the increase or decrease depends on the difference between your rating and your opponent’s rating, i.e. the difference in skill. If you win against a player rated lower than you, your rating increases only slightly, and if you lose, it decreases sharply. Conversely, if you win against a higher-rated player, your rating increases sharply, and if you lose, it decreases only slightly. The reason is that an expected result says little new about either player, while an unexpected result is strong evidence that the ratings were off.
A world-class chess player’s victory over a mediocre student is simply taken for granted, so it should barely improve the player’s rating. The method described above reflects this. If the chess player beats the student, neither rating changes much. However, if the student defeats the chess player, the student’s rating rises dramatically and the chess player’s rating plummets. This is because such an upset is strong evidence that the student was underrated and the chess player overrated. These two basic principles are the foundation of the rating system.
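The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the exact procedure of any federation: it assumes the commonly published Elo expected-score formula, and the function names, the example ratings (2400 and 1400), and the step size `k=32` are all assumptions chosen for the example.

```python
def expected_score(rating, opponent_rating):
    """Expected score under the usual Elo formula (a draw counts as half a win)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def update(rating_a, rating_b, score_a, k=32):
    """Return both players' new ratings after one game.

    score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss.
    k (assumed to be 32 here) caps how far a single game can move a rating.
    """
    change = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total number of points is conserved.
    return rating_a + change, rating_b - change


master, student = 2400, 1400  # hypothetical ratings

# Expected result: the master wins, and both ratings barely move.
print(update(master, student, 1))

# Upset: the student wins, and both ratings move by almost the full k.
print(update(master, student, 0))
```

With these assumed numbers, the expected win shifts each rating by a fraction of a point, while the upset shifts each by nearly 32 points.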
In addition to these two basic principles, the Elo rating system introduces a third: “relative odds are always constant for equal rating differences”. For example, the expected win rate in a match between players rated 1200 and 1300 is the same as in a match between players rated 1900 and 2000, because in both cases the rating difference is the same, 100. The win rate as a function of this difference follows a curve built on a base-10 logarithmic scale: in theory, every 400 points of rating difference multiplies the odds in favor of the higher-rated player by 10, so a player rated 400 points lower is expected to score about 1 point for every 10 scored by the opponent. After each game, the players’ ratings change according to the gap between this expected result and the actual result.
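To make this principle concrete, here is a small Python sketch of the commonly published Elo expected-score formula, E = 1 / (1 + 10^((opponent − own) / 400)). The function names are hypothetical, and the formula is the standard textbook form rather than anything specified in this post.

```python
import math


def expected_score(rating, opponent_rating):
    """Expected score for the first player (a draw counts as half a win)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def odds(rating, opponent_rating):
    """Expected points won per point conceded."""
    p = expected_score(rating, opponent_rating)
    return p / (1.0 - p)


# Equal rating differences give equal expectations, wherever they sit on the scale.
low_pair = expected_score(1300, 1200)
high_pair = expected_score(2000, 1900)
print(math.isclose(low_pair, high_pair))

# A 400-point gap corresponds to odds of approximately 10 to 1.
print(odds(1600, 1200))
```

Only the difference between the two ratings enters the formula, which is why the 1200 vs. 1300 and 1900 vs. 2000 matchups behave identically.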
Let’s look at the chess example again. Suppose all players are given an Elo rating of 1200 before their first game. As they play more games, their ratings change and eventually settle at values that reasonably reflect their skill. Statistics show that with a 100-point difference in Elo rating, the higher-rated player wins about 64% of the time on average. For a difference of 200, the win rate is 76%; for 366, it’s 90%; and for 677, it’s 99%, which is pretty much in line with the theoretical win rate based on the difference in ratings. Quantifying skill this way proved successful, and the approach came to be widely used in both theory and practice.
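The way ratings drift from a shared starting value toward each player’s actual level can be illustrated with a toy simulation. Everything here is an assumption made for the sketch: the two hidden “true” strengths (1500 and 1200), the step size `k=8`, the number of games, the absence of draws, and the use of the standard Elo expected-score formula to decide who wins each game.

```python
import random


def expected_score(rating, opponent_rating):
    """Expected score under the usual Elo formula."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))


def simulate(true_a=1500, true_b=1200, games=5000, k=8, seed=0):
    """Play repeated games between two players who both start at 1200.

    true_a and true_b are hidden strengths (hypothetical values) that decide
    how often A really wins; the rating system only ever sees the results.
    """
    rng = random.Random(seed)
    rating_a = rating_b = 1200.0
    win_probability_a = expected_score(true_a, true_b)
    for _ in range(games):
        score_a = 1.0 if rng.random() < win_probability_a else 0.0
        change = k * (score_a - expected_score(rating_a, rating_b))
        rating_a += change
        rating_b -= change
    return rating_a, rating_b


final_a, final_b = simulate()
print(round(final_a), round(final_b), round(final_a - final_b))
```

In a run like this, the gap between the two ratings should end up fluctuating around the hidden 300-point difference, even though both players began with the same number. A larger `k` would reach that region faster but fluctuate more once there.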
In addition to leveling the competitive playing field in a given field, rating systems have the advantage of providing constant motivation for participants. As ratings rise, participants can feel that their skills are improving, and conversely, as ratings fall, they realize that they need to work on their deficiencies. In this sense, a rating system can be more than just a skill assessment tool; it can be a catalyst for continuous learning and growth.
Organized rating systems like this are used today in a variety of fields. On the competition side, informal rating systems are typically used in programming competitions that adopt Olympiad-like problems. In this case, users’ ratings are measured based on the results of online mock competitions organized by different sites. Based on these ratings, the organizers of the competition provide questions at the level of the contestants’ ratings, and the contestants are able to solve the questions at their level. In addition, many students practice on online sites outside of the competition to improve their ratings. These examples show that rating systems are not just limited to one field, but have a wide range of applications in many different fields.
Rating systems also play a very important role in online games. Variations of the Elo rating are used as ranking systems in games like League of Legends, Dota 2, and StarCraft 2. By matching players with opponents of similar skill, they keep games fun and reduce the unfairness that arises in matches with large skill gaps, which goes a long way toward improving the user experience.
Finally, rating systems aren’t just limited to the competitive arena; they can also be used in education. For example, a rating system can be used to evaluate students’ academic performance and provide them with learning materials accordingly. This way, you can tailor instruction to each student’s level, which will help maximize learning.
As you can see, rating systems are useful in many fields, and their applications are likely to grow in the future. When you look at how the system originated in chess and how it has been adapted and used in many different fields today, you can see the importance of rating systems in fairly assessing skill and providing the right challenge.

About the author

Blogger
