I’m reposting this here so it might be easier to find. The original thread has an excellent discussion and is here: http://forum.wotblitz.com/index.php?/topic/112723-development-of-matchmaking/page__p__1771556#entry1771556
———
On July 26, 2019 Stubbo66 posted on the EU forum a link to a Russian technical article on the development of matchmaking in Blitz. http://forum.wotblitz.eu/index.php?/topic/52541-how-team-balancing-works-in-blitz-from-wg-themselves/. The discussion in that thread includes a useful summary of the article.
The article itself is here: https://habr.com/ru/company/wargaming/blog/458866/. It gives some idea where RNG may actually be at work and perhaps some information on Ratings MM. WG undoubtedly has continued to work on improvements and may have changed some items since 2019.
There is a Google translation below, but it lacks the formatting and illustrations. (A Safari translation was similar and did include the formatting and graphics.)
————
bak
July 23, 2019 at 06:13 PM
How the team balancer works in World of Tanks Blitz
Wargaming blog,
Programming,
Game development,
Algorithms
WoT Blitz is a mobile tank shooter in which players fight in a 7v7 format.
A matchmaker, or balancer, is a mechanism that forms the composition of teams based on the queue of players who want to get into battle.
Tanks have the following important parameters for matchmaking:
Level. A tank's characteristics (for example, speed and armor penetration) depend on its level. Level 1 tanks are the weakest; level 10 tanks are the strongest.
Type. There are 4 types of tanks in WoT Blitz: light, medium, heavy, and tank destroyers (anti-tank self-propelled guns).
The simplest possible matchmaker would throw players into teams at random. But then low-level players would stand no chance of inflicting any damage on high-level opponents, and the game would become uninteresting.
Requirements
The list of requirements for the balancer was generated based on feedback from the gaming community and is periodically updated to this day.
At the time of this writing, the list for regular battles consisted of the following items:
Balancer Requirements List
The difference between the maximum and minimum tank level must not exceed one, except for platoons (that is, if there are level 10s in a battle, there should be no 5s or 7s, only 9s and 10s);
Teams must be 7v7, except when few players are online (in that case smaller battles, such as 5v5 or 3v3, may be created);
Teams must be mirror-balanced by tank level (if one team has 3 tanks of level ten and 4 of level nine, the other must also have 3 tens and 4 nines);
The level of platoons must be the same in both teams;
Teams must have no more than 3 tanks of each type (for example, no more than 3 heavy tanks and no more than 3 tank destroyers). This rule applies from level 5 upward;
The difference in the number of tanks of the same type between the two teams must not exceed one (for example, if one team has 1 tank destroyer, the other may have at most 2 tank destroyers);
Teams must be balanced in the number of identical tanks, with a difference of no more than one (if one team has 1 IS-7, the other may have at most 2 IS-7s);
Players must only get into the battle modes they have selected (if a player has enabled only "Encounter Battle", he must not end up in "Supremacy");
If a player has bought a new tank, then for the first N battles on it the levels of the other tanks must not exceed the level of the new tank (that is, if the player has a level 5 tank, there should be no level 6 tanks in the battle);
Players must only be sent to maps they have already downloaded (some content is downloaded after the game is installed);
If a player has enabled the "Same control type" option, he must only play with players using the same control type (if he plays on a tablet or phone, he must not be matched with mouse players, and vice versa).
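A few of these rules can be checked mechanically. The sketch below is illustrative only: the `Tank` class and the subset of rules it covers are assumptions for this post, not WG's actual code.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Tank:
    tier: int   # 1..10
    kind: str   # "light", "medium", "heavy", "td"

def violated_rules(team_a, team_b):
    """Return a list of violated rules (an empty list means valid)."""
    problems = []
    tiers = [t.tier for t in team_a + team_b]
    # Max/min tank level in the battle must not differ by more than one.
    if max(tiers) - min(tiers) > 1:
        problems.append("tier spread > 1")
    # Teams must mirror each other by tier composition.
    if Counter(t.tier for t in team_a) != Counter(t.tier for t in team_b):
        problems.append("tier composition not mirrored")
    # At most 3 tanks of each type per team (applies from level 5 up).
    if min(tiers) >= 5:
        for team in (team_a, team_b):
            if any(n > 3 for n in Counter(t.kind for t in team).values()):
                problems.append("more than 3 tanks of one type")
    # Per-type difference between the teams must not exceed one.
    ca, cb = Counter(t.kind for t in team_a), Counter(t.kind for t in team_b)
    for kind in set(ca) | set(cb):
        if abs(ca[kind] - cb[kind]) > 1:
            problems.append(f"type '{kind}' differs by more than 1")
    return problems
```

A validator like this only reports violations; the interesting part, covered below, is how to search for compositions that produce none.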
Developing a matchmaker with this many constraints is a very interesting task, and there are quite a few possible approaches to solving it.
Balancer forming pairs of players
Initially, the mobile game used a balancer inherited from its big brother, desktop World of Tanks. It generally worked well, but it had two problems: first, it gave no firm guarantee that the requirements were met; second, adding new requirements was rather difficult.
Therefore, another balancer was written that worked according to the following algorithm:
Divide the players into groups by tank level and type;
Form pairs of players within each group;
Scatter the pairs across the two teams: take each pair and put the first player in the first team and the second player in the second;
Perform a final rebalance of the resulting teams: to satisfy most of the requirements, swap some players between the two teams.
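Under stated assumptions (players as plain dictionaries with hypothetical `tier` and `kind` keys), the first three steps might be sketched like this:

```python
import random
from collections import defaultdict

def pair_balance(players):
    """Group players by (tier, kind), pair them up within each group,
    and split every pair across the two teams. Unpaired players are
    returned so they can go back to the queue."""
    groups = defaultdict(list)
    for p in players:
        groups[(p["tier"], p["kind"])].append(p)
    team_a, team_b, leftovers = [], [], []
    for group in groups.values():
        random.shuffle(group)
        while len(group) >= 2:
            team_a.append(group.pop())   # first of the pair -> team A
            team_b.append(group.pop())   # second of the pair -> team B
        leftovers.extend(group)          # odd player out
    return team_a, team_b, leftovers
```

Pairing guarantees the two teams mirror each other by construction; it is the fourth step, the rebalancing passes, that carried all the remaining rules.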
The resulting balancer worked 5-10 times faster than the previous version and at first assembled teams that met all the requirements that existed at the time. New rules were added by writing additional rebalancing passes.
Everything worked well at first, but the more rules were added, the harder it became to write rebalancing passes: each new pass had to avoid undoing the work of the previous ones. It became clear that this was a road to nowhere.
Balancer based on simulated annealing
In the final version, we settled on a balancer that works as follows. Every player who presses the "Join Battle" button is added to a common queue. The balancer runs an infinite loop that:
Selects random battle parameters (a random battle level from 1 to 10, a random mode, a random map);
Finds all players in the queue who match the selected criteria (queued on a tank of a suitable level, have the selected mode enabled, have the selected map downloaded);
Attempts to form teams that satisfy all of the requirements listed above (described below);
If it manages to form the teams, it removes those players from the queue and the battle starts.
To form teams from the list of eligible players, the simulated annealing method is used. You can read more about the method itself here: https://habr.com/ru/post/209610/
Applied to team formation, the algorithm looks like this:
It starts with two empty teams;
At each iteration, it randomly mutates the state of the teams by performing one of the following operations:
Adds a random player from the list of eligible players to a randomly chosen team;
Removes a random player from a random team;
Swaps a random player from the list of eligible players with a random player already in one of the teams;
Swaps a random player from the first team with a random player from the second team.
It then scores the resulting state by calling the evaluation function. The function walks the list of requirements and adds a penalty for each violated item; the more severely an item is violated, the higher the penalty (for example, a 2v2 battle is penalized more heavily than a 6v6 one);
Depending on the change in the value of the evaluation function and the current temperature, the probability of transitioning to the new state is determined;
The process continues until either the temperature reaches the specified minimum or the value of the evaluation function reaches zero (in which case all requirements are satisfied and the battle can start).
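The loop above can be sketched as follows. This is a toy version under stated assumptions: the evaluation function only penalizes wrong team sizes and broken tier mirroring, and the temperature schedule and move set are illustrative guesses, not WG's actual parameters.

```python
import math
import random
from collections import Counter

TEAM_SIZE = 7

def evaluate(team_a, team_b):
    """Toy evaluation function: 0 means every requirement is met.
    The real function walks the whole requirements list, weighting
    each violation (a 2v2 battle scores worse than a 6v6)."""
    p = abs(TEAM_SIZE - len(team_a)) + abs(TEAM_SIZE - len(team_b))
    ca = Counter(t["tier"] for t in team_a)
    cb = Counter(t["tier"] for t in team_b)
    p += sum(abs(ca[k] - cb[k]) for k in set(ca) | set(cb))  # mirror balance
    return p

def anneal(candidates, t_start=5.0, t_min=0.01, cooling=0.9995, rng=random):
    pool, team_a, team_b = list(candidates), [], []
    t, cur = t_start, evaluate(team_a, team_b)
    while t > t_min and cur > 0:
        a2, b2, p2 = list(team_a), list(team_b), list(pool)
        move = rng.randrange(4)
        if move == 0 and p2:                    # add from pool to a random team
            (a2 if rng.random() < 0.5 else b2).append(
                p2.pop(rng.randrange(len(p2))))
        elif move == 1 and (a2 or b2):          # remove a random player
            team = a2 if a2 and (not b2 or rng.random() < 0.5) else b2
            p2.append(team.pop(rng.randrange(len(team))))
        elif move == 2 and p2 and (a2 or b2):   # swap pool member <-> team member
            team = a2 if a2 and (not b2 or rng.random() < 0.5) else b2
            i, j = rng.randrange(len(p2)), rng.randrange(len(team))
            p2[i], team[j] = team[j], p2[i]
        elif move == 3 and a2 and b2:           # swap players between the teams
            i, j = rng.randrange(len(a2)), rng.randrange(len(b2))
            a2[i], b2[j] = b2[j], a2[i]
        new = evaluate(a2, b2)
        # Accept improvements always; accept a worse state with
        # probability exp(-delta / temperature), which shrinks as we cool.
        if new <= cur or rng.random() < math.exp((cur - new) / t):
            team_a, team_b, pool, cur = a2, b2, p2, new
        t *= cooling
    return team_a, team_b, cur
```

Returning the final penalty alongside the teams lets the caller discard a non-zero result and retry with different battle parameters, which matches the outer loop the article describes.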
The main advantage of this approach is that adding a new requirement only means modifying the evaluation function. There is no need to write code describing exactly how to obtain what we want; it is enough to add a rule that looks at the formed teams and says how well balanced they are.
Rating battles are a good example of adding such rules; several new requirements appeared in the matchmaker at once:
Teams must be balanced by rating (the difference in the players' total rating between the teams must not exceed a specified value);
The rating difference between players should be minimal (players from the Bronze League should not end up in battles against players from the Diamond League).
The first rule was implemented by modifying the evaluation function: a penalty was added for exceeding the maximum allowed rating difference. The second was implemented by changing the function that pulls eligible players out of the queue.
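The first of these rules amounts to a single extra term in the evaluation function. A sketch, with a made-up `max_diff` threshold and a hypothetical `rating` field:

```python
def rating_penalty(team_a, team_b, max_diff=500):
    """Extra penalty term for rated battles: punish only the part of
    the total-rating gap that exceeds the allowed threshold.
    max_diff is an illustrative value, not WG's actual setting."""
    gap = abs(sum(p["rating"] for p in team_a) -
              sum(p["rating"] for p in team_b))
    return max(0, gap - max_diff)
```

The annealer then simply adds this term to its score; none of the team-assembly code has to change.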
The disadvantage of this approach is its speed: the current version runs roughly 10 times slower than the previous one, even after a number of optimizations. Speaking of optimization: most of the game server (apart from networking and physics) is written in Python, but the balancer was rewritten in C++ and made multi-threaded. A request to form teams arrives from Python to the C++ code; each thread then independently runs the annealing method, and as soon as one thread finds a solution, the other threads stop searching and the found solution is returned to Python.
As the player count grew, so did the load on the balancer. In the fall, when concurrent players on the RU server reached 120 thousand (during the Mad Games event), the balancer stopped coping. As a temporary measure, we disabled some of the rules, which reduced the load. To avoid similar problems in the future, we made the matchmaker distributed.
Rating system
Many MMO games have, in addition to random battles, rating (ranked) battles. The main idea of this mode is that opponents are matched not at random but by skill: skilled players play against equally skilled players, and newcomers against other newcomers.
At the beginning of a season, the player goes through a series of calibration battles, the results of which determine his starting position. Then, depending on his further success, the player rises or falls in the ranking. The rating system in Blitz was created first and foremost for correct balancing: it is tailored to player skill and depends very little on the number of games played.
To implement rating battles, two problems had to be solved at once. First, a rating system had to be chosen (on what principle to award rating to players). Second, the balancer had to be improved so that it could assemble battles subject to rating restrictions.
The main requirement for the rating system is the ability to determine a player's level as accurately as possible. To assess how accurately a particular system works, we built a simulator that takes a battle history and a chosen rating system as input and produces the system's accuracy as output.
The accuracy was calculated as follows:
All players were assigned a starting rating;
Before each battle, the system tried to predict which team would win;
After each battle, the ratings of the participating players were recalculated according to the selected system;
Finally, the percentage of battles for which the system gave a correct prediction was computed.
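In outline, such a simulator is a replay loop. The callback shapes (`predict`, `update`) are assumptions for this sketch; any of the rating systems discussed below could be plugged in:

```python
def rating_system_accuracy(battles, predict, update):
    """Replay a battle history: before each battle, ask the rating
    system to predict the winner; after it, let the system update its
    ratings. Returns the fraction of battles predicted correctly."""
    ratings = {}            # player id -> current rating
    correct = 0
    for battle in battles:
        if predict(ratings, battle["team_a"], battle["team_b"]) == battle["winner"]:
            correct += 1
        update(ratings, battle)
    return correct / len(battles)
```

Because both callbacks are parameters, comparing two rating systems means swapping one pair of functions and re-running the same history.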
The most popular rating systems are winrate, Elo, Glicko, and TrueSkill. Winrate is simply the percentage of battles won. Elo is a rating system originally created for two-player games (chess and the like); a player gains or loses a certain number of points for a victory or defeat, depending on the opponent's rating. Glicko is broadly similar to Elo but also takes into account how long a player has been inactive. TrueSkill is a proprietary rating system from Microsoft in which each player has two parameters: a rating and the system's confidence in that rating.
For the first version of rating battles, we considered winrate and Elo (several variants adapted for team play), as well as a simple Score system in which players always gained a fixed number of points for a victory and lost a fixed number for a defeat.
The best results were shown by an adapted Elo system in which Ra is the player's own rating and Rb is the difference between the total rating of the opposing team and the total rating of the player's team excluding the player himself.
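Assuming the classical Elo expected-score formula with the substitution just described (Rb built from team totals), the per-battle update would look like the sketch below; the K-factor and 400-point scale are the traditional chess values, not figures confirmed by the article.

```python
def elo_update(r_a, r_b, won, k=32, scale=400):
    """Classic Elo update. Per the article's adaptation, r_a is the
    player's own rating and r_b is the opposing team's total rating
    minus the player's team total excluding the player himself.
    k and scale are conventional chess values (assumptions here)."""
    expected = 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))
    return r_a + k * ((1.0 if won else 0.0) - expected)
```

With equal ratings the expected score is 0.5, so a win moves the player up by k/2 points and a loss moves him down by the same amount.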
The main difficulties we encountered after launch:
too large a spread in rating between players in the same battle;
a poorly predictable rate at which players gain rating (reach the leagues).
The first problem could not be fully solved: there are too few highly skilled players, so they have to wait a long time for a battle and very often see weaker players on their teams. To mitigate the effect, we made rated battles available only during prime time (that is, when the maximum number of people are playing on the servers).
We were able to build a good model that performed significantly better than the current formula, but one difficulty arose (one that quite often spoils everything in machine-learning tasks): the model was good on average, but it sometimes made mistakes that a human can easily spot. From time to time, a player who had clearly performed well from a human point of view received a low bonus from the model, and vice versa.
As a result, we abandoned the ML model in favor of a simpler hand-tuned formula. The formula takes into account only battle experience, excluding win bonuses, x2 multipliers, and the like. It gives a very decent result, though slightly worse than the ML model's.
Conclusion
The balancer based on simulated annealing allowed us to move from describing the solution (how to assemble a team) to describing the requirements (which conditions must not be violated);
In team rating battles, the modified Elo system, which takes into account the player's individual actions in battle, has proven itself well;
Complex machine-learning methods are not always worth using, especially when human interpretability and comprehensibility of the result matter.
We continue to develop and improve the balancer. We have practically eliminated the negative impressions caused by class imbalance. The main problems our players now point to are skill imbalance, one-sided "turbo" battles, and AFK players. This is a serious challenge, and we continue to work on the balancer in these areas.