Nerf Herders?

Against Statistical Design

Statistical Design

I suggested that a good way of improving one’s design sense is by staring at Rorschach Tests, and here is a practical example of the importance of practicing pattern-avoidance.

To me it looks like a designer's brain in a vice

Stop seeing patterns!

This image is a heatmap showing where people most often die on Assembly, a Halo multiplayer map.  These heatmaps were first used by the Halo design team to analyze maps during testing, but were so interesting looking they became part of the statistics pages.  This data is so rich — so detailed and specific — it must be useful to a designer in some way, right?  The problem-loving brain of the game designer latches on to this as The Solution and immediately starts searching for The Problem.  It is tempting, given a powerful tool like statistical analysis, to incorporate it into the design process somehow — especially since design is often stranded in a world of abstraction and uncertainty.  Having concrete numbers is a rare treat.

However, what does this data mean?  Are red areas bad?  Should dark areas be eliminated?  Does a well-designed multiplayer map have a symmetrical shape?  What percentage of a map should be yellow?  Something about high-contrast feels unbalanced, so perhaps the map should be revised so that the gradient from safe to dangerous is more continuous.  And areas where nobody dies seem wasteful, maybe they should be removed.  And obviously the red areas will be frustrating, so they should be made safer by limiting line-of-sight and adding cover.  Pretty soon we have a completely yellow multiplayer map, that we have tricked ourselves into believing is balanced because our data looks pretty.  We have fallen victim to statistical design.

Players Aren’t Statistical

Statistics are powerful tools because they aggregate a large number of unique instances into a manageable form so it can be analyzed.  It would be impossible to watch every death of every player across thousands of games and have any cohesive understanding of how often players were dying in a given area.  Given enough examples, we would develop an emotional feeling of dread or security associated with certain spots, but the brain uses a very unscientific method to determine these attachments.  Exciting experiences are weighted much too heavily, which is why the impartiality of statistics is useful in discovering imbalances.  Using statistics to find problems is fine; designers go wrong when they use statistics to evaluate solutions.

Players don’t engage with the game statistically — they experience it personally.  It doesn’t matter if more players are killed standing in a specific spot than anywhere else on the map, what matters is the unique experience of a player killed in that spot.  If they realize that they shouldn’t have crested the hill with no cover that is right below where the Sniper Rifle spawns, vow not to do that again and move on, there is nothing wrong with the map.  Even if they do it over and over, growing more and more frustrated at their repeated mistake and creating a bright red dot on the heatmap, the map is not unbalanced.  However, if players are forced to expose themselves at a single chokepoint, or get sniped through a hidden line-of-sight in an otherwise safe area, it doesn’t matter if it is a rare experience and there is no red, the map ought to be fixed.  Neither of these situations can be found through statistical analysis, and neither of them are fixed by a solution that merely addresses the probability of being killed in a given area.

Avoiding Statistical Design

Some systems can only be balanced statistically.  If there are three factions in the game, and one faction wins 43% of the time, the factions are not balanced.  If a map is intended to be used for two-flag CTF, but the bases aren’t mirror images of one another, then the two sides had better be perfectly fair.  The necessity of reverting to statistical methods is inherent in the design of the system itself.  The designer will be forced to make changes that do not change the unique player experience — or may even harm it– in order to fix a statistical imbalance.  Worse still, players are skilled at detecting when a system must be balanced statistically, but since they do not have access to hard numbers their personal experience will tell them that it isn’t balanced — even when the data says that it is!

Nerf Herders?

Nerf Paladin?

Well-designed systems do not need to be balanced through data-manipulation.  If there are 10 weapons in the game, and one weapon is responsible for 20% of the kills, there is probably not a problem.  If the unique player experience isn’t negatively impacted, the statistical difference isn’t a balance issue.  So, the easiest way to avoid the trap of statistical design is to avoid systems that must be balanced mathematically in favor of those that can be balanced behaviorally.  If a system requires a large amount of instrumentation and is extremely sensitive to tiny value changes, instead of obsessing over statistical patterns, try revisiting the system’s design and making it less brittle.