

Why the WG balancing method is bad statistically.

balance

32 replies to this topic

wannabeunicum #1 Posted 30 July 2019 - 04:12 AM

    First Sergeant

  • Players
  • 43622 battles
  • 5,783
  • [-MM]
  • Member since:
    10-25-2015

As we know, WG balances on 55 to 65% players. I generally disagree with the idea of using only 5% of the player base to balance, but I am not going to argue about that. I am also not going to argue about controversial tanks that may or may not be considered OP, such as the VK 100 P or the Charioteer. That is a waste of time. Rather, I will argue from regular stats and how WG uses that 5% of the player base.

 

The problem is something similar to sampling bias in statistics. It is true that this is every single 55 to 65% player, so it isn't strictly a sample, but the problem lies in the fact that WG's data doesn't distinguish between players within the 55 to 65% range. Comparing a 55% player to a 65% player is like comparing someone who earns 100k per year to someone who earns 1 million per year. Yes, the first is still comfortable in life and above average, but there are still huge differences.

Say, for example, we take 1000 random players in the 55 to 65% range. I would guess 300 would be at 55%, 250 at 56%, 150 at 57% or so, and the numbers keep getting lower the further you move from the mean win rate of 48%. This means we get a median win rate in this group of around 56-57%, if I had to guess. However, when a new tank comes out, we generally find the unicums, or players closer to 65%, getting excited about it. This group joins the rush to play it, and the tank's stats are thereby dominated by players closer to 65% than to 55% win rate.

The most infamous example I can find is the 4.5 Ferdi buff. This was also the Tiger P/Tiger buff, but we can agree the Tiger P was super OP and the Tiger was still quite OP in 4.5. However, a few minutes of searching hasn't yielded a single complaint about the Ferdi from 4.5 to 4.7, so it seems nobody thought it was OP. Rather, unicums were likely rehabbing their stats in it, and so it had super OP stats comparable to the Charioteer's (aka very high). Therefore it was nerfed. The way WG balances using 55 to 65% players should really be fixed so that new tanks can properly enter the game, rather than the current cycle of nerf batting new tanks 1 or 2 updates after they are released.
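To make the argument concrete, here is a minimal sketch of the effect, not WG's actual method. The first three head-counts are the guesses above; the remaining counts and the `adoption_rate` function are made up for illustration. Both tanks are assumed to be equally strong, so every player wins at their career rate in either one.

```python
# Hypothetical head-count per career win rate inside the 55-65% band.
# The first three values follow the guess in the post; the rest are made up.
BAND = {55: 300, 56: 250, 57: 150, 58: 100, 59: 70, 60: 50,
        61: 30, 62: 20, 63: 15, 64: 10, 65: 5}


def adoption_rate(win_rate):
    """Assumed share of a win-rate bucket that plays the new tank early."""
    return 0.10 + 0.05 * (win_rate - 55)


def band_average(weights):
    """Average win rate of a group, given {win_rate: number_of_players}."""
    total = sum(weights.values())
    return sum(wr * n for wr, n in weights.items()) / total


# Established tank: everyone in the band plays it, each at their career rate.
established = band_average(BAND)

# New tank of identical strength, but early adopters skew towards 65%.
early_adopters = {wr: n * adoption_rate(wr) for wr, n in BAND.items()}
new_tank = band_average(early_adopters)

print(f"established tank: {established:.2f}%")
print(f"new tank:         {new_tank:.2f}%")
print(f"gap from player mix alone: {new_tank - established:.2f} points")
```

With these made-up numbers the new tank shows a band average a little over one point higher, even though it is no stronger than the established one. The size of the gap depends entirely on the assumed adoption skew.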

 

If anyone is confused just ask questions.


Edited by wannabeunicum, 30 July 2019 - 04:16 AM.


Absolute_Sniper #2 Posted 30 July 2019 - 04:33 AM

    First Sergeant

  • Players
  • 45571 battles
  • 3,926
  • [CRU2L]
  • Member since:
    11-05-2015
I’m not a math wizard, but aren’t they balancing the tanks against other tanks and not against how the players use them? They pick the best players because the best can make any tank “work”. If a 65%er can do 3k average damage in an IS-7 but can only manage 2k in an IS-4, is it because of the tank?

Not saying this is why they balance the way they do but it makes sense if I look at it that way.

I used to have a picture here but tiny pic took an arrow to the knee....

 

https://discord.gg/twdc4Dq

 


rosgrim #3 Posted 30 July 2019 - 04:34 AM

    Senior Sergeant

  • Players
  • 98255 battles
  • 589
  • [COD]
  • Member since:
    04-23-2016

@wannabeunicum

Well, I think it is a little bit more complicated.

I mean, the point is also the sample size (considering they monitor all the tanks).

Then also: a 60% player who plays 90% of their battles at tiers 8-10 is one thing; a 60% player who plays mostly tiers 3-7 is another (there are a lot of those).

Keeping your example: it is like comparing one person earning $3000/month in NYC with another earning the same in Nigeria.

So I guess they try to get the best from the tanker population.

What do you suggest? I missed this point.

To be honest, I don't have any better solution that convinces me.

Also, the balance situation, all things considered, is not that bad, and they do adjust the grossly failed super OP tanks (with few exceptions).

It is not my first concern in this game.

Maybe a more accurate panel would be 55% and over by tier: if we are talking about a tier 10 and you are a unicum playing tier 8, how accurate could your contribution be?

Also, they are balancing the tanks against other tanks, as @Absolute_Sniper said, so you need a panel (remember, a consistent number of players) that basically knows how to play, to see how tank vs tank (that is "balance") performs in terms of WR and DMG.

 


Edited by rosgrim, 30 July 2019 - 04:39 AM.


wannabeunicum #4 Posted 30 July 2019 - 04:37 AM

    First Sergeant

  • Players
  • 43622 battles
  • 5,783
  • [-MM]
  • Member since:
    10-25-2015

Absolute_Sniper, on 30 July 2019 - 04:33 AM, said:

I’m not a math wizard but aren’t they balancing the tanks against other tanks and not how the players use them. They pick the best players because the best can make any tanks “work”. If a 65%er can do 3k average damage in an IS-7 but can only manage 2k in an IS-4, is it because of the tank?

Not saying this is why they balance the way they do but it makes sense if I look at it that way.

They aren't seeing that, though. They are lumping all 55 to 65% players together as a group and comparing tanks on that basis. So they take all the 55/65ers in the IS-4 and all the 55/65ers in the IS-7 and compare the stats. Say the stats show 55/65ers get 2k dmg in the IS-7 and 2.5k dmg in the IS-4. The problem comes from self-selection bias. Let's say the IS-4 is slightly better than the IS-7, but unicums still rush to it more: in an IS-4 they would do 2.8k dmg, but in an IS-7 they would do 2.7k. They keep self-selecting into this tank, increasing the proportion of unicum rather than blueunicum players playing the IS-4.
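A rough sketch of that IS-4 / IS-7 example: the 2,800 and 2,700 figures are the ones above, while the damage for players near 55% and the player counts per tank are assumptions picked only to show the mechanism. The second comparison re-weights both tanks to the same player mix (a hypothetical `COMMON_MIX`), which is one simple way to take self-selection out.

```python
# Average damage by skill group. The 2,800 / 2,700 figures come from the post;
# the figures for players near 55% are assumed for illustration.
DAMAGE = {
    "IS-4": {"near_65": 2800, "near_55": 2100},
    "IS-7": {"near_65": 2700, "near_55": 2000},
}

# Assumed number of players per group in each tank (unicums flock to the IS-4).
PLAYERS = {
    "IS-4": {"near_65": 60, "near_55": 40},
    "IS-7": {"near_65": 20, "near_55": 80},
}

# Hypothetical common mix used to compare both tanks on equal terms.
COMMON_MIX = {"near_65": 40, "near_55": 60}


def pooled_average(tank, mix):
    """Average damage in `tank` for a given {group: player_count} mix."""
    total = sum(mix.values())
    return sum(DAMAGE[tank][group] * n for group, n in mix.items()) / total


# What a single pooled 55-65% figure would show.
observed_gap = (pooled_average("IS-4", PLAYERS["IS-4"])
                - pooled_average("IS-7", PLAYERS["IS-7"]))

# Same tanks, but both evaluated with the same player mix.
adjusted_gap = (pooled_average("IS-4", COMMON_MIX)
                - pooled_average("IS-7", COMMON_MIX))

print(f"observed gap: {observed_gap:.0f} dmg")
print(f"adjusted gap: {adjusted_gap:.0f} dmg")
```

Under these assumptions the pooled figures are 2,520 vs 2,140, a 380 dmg gap, while every individual group only does 100 dmg more in the IS-4. Most of the apparent gap is the player mix, not the tank.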



wannabeunicum #5 Posted 30 July 2019 - 04:39 AM

    First Sergeant

  • Players
  • 43622 battles
  • 5,783
  • [-MM]
  • Member since:
    10-25-2015

rosgrim, on 30 July 2019 - 04:34 AM, said:

@wannabeunicum

well I think it is a little bit more complicated.

I mean the point is also the sample number (considering they monitor all the tanks).

Then also: 1 thing is a 60% that play 90% tier 8-10 another is a 60% playing mostly 3-7 (there a lot of those).

Keeping your example: is like comparing one earning $3000/month in NYC and the other in Nigeria.

So I guess they try to get the best from the tankers population.

What do you suggest ? I missed this point.

To be honest I don't have any better solution that convince myself.

Also the balance situation - considering all - is not that bad and they adjust macro failed super OP tank (with few exceptions).

It is not my 1st concern in this game.

May be a much accurate panel would be 55 ad over by tier: if you are talking about a tier 10 and you are an unicom playing tier 8 how accurate could be your contribution ?

http://wot-news.com/game/tankinfo/en/eu/ussr/R134_Object_252K

 

Simple. World of Tanks has winrate curves. They show how much players at a specific win rate over-perform in a given tank. For example, on PC WoT there are major complaints about the wheeled vehicles being super OP, but if you actually look at the stats, a 60% player won't really do better than 60% in most of the wheelies. Blitz WG should use data like this rather than giving us data that is massively skewed.
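One way to read "winrate curve" in code, as a sketch only: group players by career win rate, pool their results in the tank, and compare the two. The records below are invented, and `winrate_curve` and `mean_overperformance` are hypothetical helper names, not anything WG or a stats site is known to use.

```python
from collections import defaultdict

# Hypothetical records: (career win rate %, battles in the tank, wins in the tank).
RECORDS = [
    (55, 200, 112),
    (55, 100, 55),
    (60, 150, 93),
    (60, 50, 30),
    (65, 300, 198),
]


def winrate_curve(records):
    """Map each career win rate to the pooled win rate achieved in the tank."""
    battles = defaultdict(int)
    wins = defaultdict(int)
    for career_wr, n_battles, n_wins in records:
        battles[career_wr] += n_battles
        wins[career_wr] += n_wins
    return {wr: 100 * wins[wr] / battles[wr] for wr in sorted(battles)}


def mean_overperformance(records):
    """Battle-weighted average of (win rate in tank - career win rate)."""
    total_battles = sum(n for _, n, _ in records)
    # Wins above what each player's career rate alone would predict.
    extra_wins = sum(w - n * wr / 100 for wr, n, w in records)
    return 100 * extra_wins / total_battles


curve = winrate_curve(RECORDS)
for career_wr, tank_wr in curve.items():
    print(f"{career_wr}% players win {tank_wr:.1f}% in this tank")
print(f"mean over-performance: {mean_overperformance(RECORDS):+.2f} points")
```

The point of the second function is that it does not care whether the tank is played mostly by 55% or 65% players: each player is measured against their own baseline, so a skewed player mix does not inflate the result.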

 

What really made me mad about the balance is the 14 cm Conway nerf. I can understand the nerf to the 120 mm. Some players said it was OP and I can semi-agree. But the 14 cm was actually underpowered, and now it will be a worse WT Pz IV in every way.



rosgrim #6 Posted 30 July 2019 - 04:43 AM

    Senior Sergeant

  • Players
  • 98255 battles
  • 589
  • [COD]
  • Member since:
    04-23-2016

wannabeunicum, on 30 July 2019 - 04:39 AM, said:

http://wot-news.com/game/tankinfo/en/eu/ussr/R134_Object_252K

 

Simple. World of tanks has winrate curves. They show how much players at a specific winrate over perform. For example on PC wot there are major complaints about the wheeled vehicles being super OP but in reality if you actually look at the stats a 60% player wont really do better than 60% in most of the wheelies. Blitz WG should use data like this rather than giving us data that is massively skewed. 

 

What really made me mad about the balance is the 14 cm Conway nerf. I can understand the nerf to the 120 cm. Some players said it was OP and I can semi agree. but the 14 cm was actually underpowered and now it will be a worse wt pz iv in every manner.


That could be interesting. But do we have enough players in WOTB at each WR playing a given tank to give us consistent numbers? This is a genuine question, to myself as much as anyone.

 



wannabeunicum #7 Posted 30 July 2019 - 04:56 AM

    First Sergeant

  • Players
  • 43622 battles
  • 5,783
  • [-MM]
  • Member since:
    10-25-2015

rosgrim, on 30 July 2019 - 04:43 AM, said:


that could be interesting. But do we have enough players in WOTB for each WR playing a certain tank to give us consistent numbers ? this a genuine question to myself

 

The Russian server definitely does. If need be, WG could shift the band to 54 to 64% players, as the upper end is probably a bit thin.



ARandomGenieGuy #8 Posted 30 July 2019 - 05:50 AM

    First Sergeant

  • Players
  • 18088 battles
  • 1,884
  • Member since:
    06-26-2014
I don’t really understand all of this about how tanks are balanced; however, I suppose that whatever meta is currently dominant on the most populous servers is what determines the buffs or nerfs for the whole player base. Of course there are some exceptions, like crazy OP tanks that dominate on every single server, as some British TDs did when introduced, but most of the fine tuning is not even determined by NA statistics, since we are the smallest server of them all.

__V_O_P__ #9 Posted 30 July 2019 - 11:33 AM

    First Sergeant

  • Players
  • 48070 battles
  • 1,278
  • [LAP]
  • Member since:
    09-06-2016

wannabeunicum, on 30 July 2019 - 04:12 AM, said:

As we know WG balances on 55 to 65% players. I generally disagree with the idea of only using 5% of the playerbase to balance but am not argue about that. I am not gonna argue about controversial tanks here that may or may not be considered OP such as the vk 100p or charioteer or similar. That is a waste of time in arguing. Rather I will argue using regular stats and how they use that 5% of the player base.

 

The problem comes from something similar in statistics or sampling bias. It is true this is every single 55 to 65% player so it isn't necessarily a sample but the problem lies in the fact WG's data doesn't tell them the difference between 55 to 65% players. Comparing a 55% percent player to a 65% player is like comparing someone who earns 100k per year to someone who earns 1 million per year. Yes the 1st is still comfortable in life and above average but there are still huge differences. Say for example within 1000 random players in the 55 to 65 percent range. I would say 300 would be 55 , 250, would be 56, 150 would be 57 or something and the numbers keep getting lower because of the larger deviation from the mean win rate of 48%. This means we get a median win rate in this group of around 56-57% if I had to guess. However when a new tank comes up we generally find the unicums or closer to 65% players getting excited with it. This group joins in the rush to play it and the tank is thereby over represented in the charts by players closer to 65% rather than 55% winrate. For example the most infamous example I can find is the 4.5 Ferdi buff. This was also the tiger P/Tiger buff but we can agree the tiger P was super OP and the tiger was still quite OP in 4.5. However literally a few minutes of searching hasn't yielded a single complaint about the Ferdi from 4.5 to 4.7 so not a single player thought it was OP. Rather unicums were likely rehabbing their stats in it and thereby it had super OP stats which are comparable to the Charioteer stats(aka very high). Therefore it was nerfed. The way WG balances using 55 to 65% players should really be fixed to properly allow for new tanks to enter the game rather than the current cycle of nerf batting new tanks 1 or 2 updates after they are released.

 

If anyone is confused just ask questions.

 

I’m genuinely interested in what you would suggest as an alternative. I assume that WG tries to identify tanks that are “over performing” as opposed to overpowered. And if that’s the case, then WG needs a stable baseline to measure against. I imagine WG finds a lot of variability in the sub-55% crowd.

And the nerf bat issue? I still wonder whether WG actively allows tanks to be OP at the start to encourage uptake. It’s the same point someone made about supertesting: a unicum in a cool-looking OP tank racing around the map is a pretty good marketing tool.



Morphman11 #10 Posted 30 July 2019 - 12:33 PM

    First Sergeant

  • Players
  • 32 battles
  • 1,247
  • Member since:
    04-12-2011
60%ers aren’t the main balancing pool. Which is why mediums were nerfed so hard at tier X.
OFFFICIAL PUBLIC RELATIONS for the Morphman YouTube Channel

VelociTanker #11 Posted 30 July 2019 - 12:43 PM

    Senior Sergeant

  • Players
  • 16226 battles
  • 568
  • [III]
  • Member since:
    05-21-2017

Note that also WG often introduces tanks as slightly or significantly OP, for marketing reasons, then nerfs them.

 

I wonder how many players on the NA server manage to get over 55%. I see maybe 2 per game. And lots of long-term players at 55% are much better than 65%ers like me. There's no ideal way of using this data to balance tanks. It's a best-guess deal backed up by gut feeling from players.



Bias Much? KV-1 blocks 5,525 damage / 6 kills  / 
T34-1 2809 dmg Kolobanov replay on Mines

5.5 aka "The Great Nerfening" © VelociTanker LLC

 

 


acrisis #12 Posted 30 July 2019 - 12:49 PM

    Captain

  • Players
  • 21314 battles
  • 12,797
  • [III-C]
  • Member since:
    12-27-2014

Who says they have not been / are not doing extra comparisons?

 

When it comes to double checking whether all tanks get a fair shake; they need to look at a slice of the population that can handle themselves and knows how to use their tanks. 

 

So, where to look for checking on tank performance? 

The overall average player base, while interesting in and of itself, is quite broad.

 

Which would be the most relevant group ... 

Low end ( <42% )

Median slice of ( 42-53% )

Good players and up ( 55-65% )

Super high end ( > 65% ) 

 

Could they perhaps check a couple of narrower bands, let’s say 55-60, 60-65 and 65+? Sure. Who says they haven’t? Who says the data is significantly different for three small percentile slices vs the bigger slice we get now? As data driven as WG is, I’m sure they did not settle on their method as the only idea, totally out of the blue. At some point one needs to decide which group to look at in game as far as tank performance goes, and the current selection surely represents capable and motivated tankers.

Can there be an initial ramp-up with a new tank, where it is unicums first, pushing numbers up? Can there be stat rehabbers? Sure. Who is to say that WG does not compare that initial ramp-up against earlier tanks? And, still, is it meaningful across many thousands of players? Who’s to say they don’t have an isolated data set for the first week or two, or that they don’t take weekly or bi-weekly samples for new vehicles and lines to help make their decision(s)?

 

Just because WG is presenting us one broad data set, that does not mean that is the only one. 

 

When it comes to new vehicles. WG tries to see how well they will do with internal testing and small pre-releases to selected tankers and I believe community contributors. Could they do a better job, not releasing a new vehicle in an OP state ... sure. 

 

Nerf buff nerf cycles should be avoided, I think we all can agree on that. 

I am also sure that WG is aware that releasing something OP and needing to turn it down after a few updates, does not look good. 

 

Still, after all the bad and wrong you stated, what is the alternative? What is your solution?


 


Gavidoc01 #13 Posted 30 July 2019 - 12:54 PM

    Platinum Card Wallet Warrior

  • Players
  • 49477 battles
  • 4,223
  • [III]
  • Member since:
    10-12-2014

What we don't know is how many 55-65% players there are on the Russian server, which is what they use for the balancing. It is possible that, due to sheer quantity, the number of those players is equal to the entire NA server size. As an example, here are the most recent peak hourly numbers of players online, on 7/29:

 

Russia: 42,250

NA: 5,332

 

If we apply the same 5% figure for 55-65% players to RU, that's about 2,100 players, roughly 40% of the entire peak NA population that was online in that hour on 7/29.
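The comparison is simple enough to check directly. This is only a back-of-the-envelope sketch: the two peak figures are the ones quoted above, and applying the 5% share to peak-hour players (rather than to all accounts) is an assumption.

```python
RU_PEAK = 42_250    # peak hourly players online, Russia, 7/29 (from the post)
NA_PEAK = 5_332     # peak hourly players online, NA, 7/29 (from the post)
BAND_SHARE = 0.05   # assumed share of players in the 55-65% band

# RU players in the balancing band during the peak hour.
ru_band_players = RU_PEAK * BAND_SHARE

# The same number expressed as a fraction of everyone online on NA at peak.
ratio_to_na = ru_band_players / NA_PEAK

print(f"RU players in the band at peak: about {ru_band_players:,.1f}")
print(f"share of NA peak population: {ratio_to_na:.1%}")
```

So the RU balancing band alone comes out at a bit under 40% of the whole NA peak population, which supports the point that RU has far more data to work with.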

 

IMO they should use all servers for balancing and lower it to 55-60% players.


My Blitzstars

 

I'm a Platinum Card Wallet Warrior.

You’re welcome for supporting the game. 


_UrgleMcPurfle_ #14 Posted 30 July 2019 - 03:19 PM

    First Sergeant

  • Players
  • 17130 battles
  • 2,791
  • [HATED]
  • Member since:
    07-19-2016

Instead of only looking at a tank's performance stats, they should compare them to the stats of their players. That would fix your issue. The problem isn't that they only look at a specific demographic within the playerbase. 



wannabeunicum #15 Posted 30 July 2019 - 03:56 PM

    First Sergeant

  • Players
  • 43622 battles
  • 5,783
  • [-MM]
  • Member since:
    10-25-2015

acrisis, on 30 July 2019 - 12:49 PM, said:

Who says they have not been / are not doing extra comparisons?

 

When it comes to double checking whether all tanks get a fair shake; they need to look at a slice of the population that can handle themselves and knows how to use their tanks. 

 

So, where to look for checking on tank performance? 

Overall average player base, while interesting in and off itself is quite broad

 

Which would be the most relevant group ... 

Low end ( <42% )

Median slice of ( 42-53% )

Good players and up ( 55-65% )

Super high end ( > 65% ) 

 

Could they perhaps check a couple narrower bands; let’s say 55-60 and 60-65 and 65+?  Sure. Who says they haven’t? Who says the data is significantly different for 3 small % or percentile slices vs a the bigger slice we get now?  As data driven as WG is, I’m sure they did not settle on their method as the only idea, totally out of the blue. At some point one needs to make a decision, which group to look at in game as far as tank performance. And the current selection surely represents capable and motivated tankers. Can there be an initial ramp up with a new tank, where it is unicums first, pushing numbers up? Can there be a stat rehabber ... sure. Who is to say that WG does not compare that initial ramp up compared to other tanks before? And, still, Is it meaningful across many thousands of players? Who’s to say they don’t have an isolated data set for the first week, two weeks ... that they don’t do weekly, bi-weekly samples for new vehicles and lines, to help makes their decision(s)? 

 

Just because WG is presenting us one broad data set, that does not mean that is the only one. 

 

When it comes to new vehicles. WG tries to see how well they will do with internal testing and small pre-releases to selected tankers and I believe community contributors. Could they do a better job, not releasing a new vehicle in an OP state ... sure. 

 

Nerf buff nerf cycles should be avoided, I think we all can agree on that. 

I am also sure that WG is aware that releasing something OP and needing to turn it down after a few updates, does not look good. 

 

Still, after all the bad and wrong, you stated, what is the alternative? what is your solution? 

 

 

 

 

 

 

 

 

 

Winrate curves, as I stated. Comparing mean over-performance per player within this range would be much better. It's clear WG doesn't use this approach, as shown by the Ferdinand nerf.



The_Flop #16 Posted 30 July 2019 - 05:11 PM

    Junior Sergeant

  • Players
  • 53546 battles
  • 237
  • [-OMG-]
  • Member since:
    10-08-2016
I think your conclusion is wrong. The higher the win rate, the better the numbers will represent the true performance cap of the tank. The WR spread is merely there to get a good sample size. Even if your assumptions about early tank usage were correct, and you provide nothing but your own opinion and no facts for them, it would still be a valid analysis of the tank. The flaw lies in the testing, not the post-release statistics. The VK 100 is the only tank I think they should not have released as is; the British nerf occurred because of the inclusion of the boosters, which were not present during testing. On top of all that, they were extremely light on releasing new tanks last year, so I'd rather have them release these new lines and adjust them than only have the 2 Chinese lines like we got last year.

wwing #17 Posted 30 July 2019 - 06:54 PM

    Either We Win Or They Do

  • Players
  • 39478 battles
  • 252
  • Member since:
    08-31-2014

“the mean win rate of 48%”

@wg

My 2 cents:  Nerf every tank that has wr of over this number!



__V_O_P__ #18 Posted 30 July 2019 - 07:00 PM

    First Sergeant

  • Players
  • 48070 battles
  • 1,278
  • [LAP]
  • Member since:
    09-06-2016

wannabeunicum, on 30 July 2019 - 03:56 PM, said:

Winrate curves as I stated. Comparing mean overperformance per player within this range would be much better. Its clear wg doesnt use this format as shown by the Ferdinand nerf.

 

I’m unclear as to why you are clear that it is clear. WG has the same data as Blitzstars and more I imagine. How are you so sure they don’t use win rate curves? Sure they publish the average but who knows what they actually have. And I read and reread your original post and I still don’t understand the significance of the Ferd nerf.  


Edited by __V_O_P__, 30 July 2019 - 07:01 PM.


wannabeunicum #19 Posted 30 July 2019 - 07:25 PM

    First Sergeant

  • Players
  • 43622 battles
  • 5,783
  • [-MM]
  • Member since:
    10-25-2015

__V_O_P__, on 30 July 2019 - 07:00 PM, said:

 

I’m unclear as to why you are clear that it is clear. WG has the same data as Blitzstars and more I imagine. How are you so sure they don’t use win rate curves? Sure they publish the average but who knows what they actually have. And I read and reread your original post and I still don’t understand the significance of the Ferd nerf.  

Because it was uncontroversially not OP at all, and every player said don't nerf it, but WG nerfed it because of their stats.



wannabeunicum #20 Posted 30 July 2019 - 07:26 PM

    First Sergeant

  • Players
  • 43622 battles
  • 5,783
  • [-MM]
  • Member since:
    10-25-2015
I still don't think the VK 100 P was that OP. But I'm not going to argue here about whether or not a tank was OP, besides the Ferdi, which clearly wasn't.




