Against Statistical Design

Statistical Design

I suggested that a good way of improving one’s design sense is by staring at Rorschach Tests, and here is a practical example of the importance of practicing pattern-avoidance.

To me it looks like a designer's brain in a vice

Stop seeing patterns!

This image is a heatmap showing where people most often die on Assembly, a Halo multiplayer map.  These heatmaps were first used by the Halo design team to analyze maps during testing, but were so interesting looking they became part of the statistics pages.  This data is so rich — so detailed and specific — it must be useful to a designer in some way, right?  The problem-loving brain of the game designer latches on to this as The Solution and immediately starts searching for The Problem.  It is tempting, given a powerful tool like statistical analysis, to incorporate it into the design process somehow — especially since design is often stranded in a world of abstraction and uncertainty.  Having concrete numbers is a rare treat.

However, what does this data mean?  Are red areas bad?  Should dark areas be eliminated?  Does a well-designed multiplayer map have a symmetrical shape?  What percentage of a map should be yellow?  Something about high contrast feels unbalanced, so perhaps the map should be revised so that the gradient from safe to dangerous is more continuous.  And areas where nobody dies seem wasteful, so maybe they should be removed.  And obviously the red areas will be frustrating, so they should be made safer by limiting line-of-sight and adding cover.  Pretty soon we have a completely yellow multiplayer map that we have tricked ourselves into believing is balanced because our data looks pretty.  We have fallen victim to statistical design.

Players Aren’t Statistical

Statistics are powerful tools because they aggregate a large number of unique instances into a manageable form so they can be analyzed.  It would be impossible to watch every death of every player across thousands of games and have any cohesive understanding of how often players were dying in a given area.  Given enough examples, we would develop an emotional feeling of dread or security associated with certain spots, but the brain uses a very unscientific method to determine these attachments.  Exciting experiences are weighted much too heavily, which is why the impartiality of statistics is useful in discovering imbalances.  Using statistics to find problems is fine; designers go wrong when they use statistics to evaluate solutions.
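As a rough illustration of the aggregation described above, here is a minimal sketch of how individual deaths could be binned into heatmap cells. The function name, the coordinate format, and the cell size are all assumptions made for the example; this is not how the Halo team's tool worked.

```python
from collections import Counter

def death_heatmap(deaths, cell_size=5.0):
    """Count deaths per grid cell.

    deaths: iterable of (x, y) map coordinates (hypothetical units).
    cell_size: width of a square grid cell in those same units (assumed).
    """
    counts = Counter()
    for x, y in deaths:
        # Floor-divide each coordinate to find the cell it falls in.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Made-up sample data: three deaths near one spot, one elsewhere.
sample = [(1.0, 2.0), (3.5, 4.9), (4.0, 0.1), (12.0, 7.5)]
heat = death_heatmap(sample)
hottest_cell, hottest_count = heat.most_common(1)[0]
```

Every death becomes one anonymous increment to a cell count. That is what makes the data manageable, and it is also why the counts say nothing about what any single death felt like.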

Players don’t engage with the game statistically; they experience it personally.  It doesn’t matter if more players are killed standing in a specific spot than anywhere else on the map; what matters is the unique experience of a player killed in that spot.  If they realize that they shouldn’t have crested the hill with no cover that is right below where the Sniper Rifle spawns, vow not to do that again, and move on, there is nothing wrong with the map.  Even if they do it over and over, growing more and more frustrated at their repeated mistake and creating a bright red dot on the heatmap, the map is not unbalanced.  However, if players are forced to expose themselves at a single chokepoint, or get sniped through a hidden line-of-sight in an otherwise safe area, it doesn’t matter if it is a rare experience and there is no red; the map ought to be fixed.  Neither of these situations can be found through statistical analysis, and neither of them is fixed by a solution that merely addresses the probability of being killed in a given area.

Avoiding Statistical Design

Some systems can only be balanced statistically.  If there are three factions in the game, and one faction wins 43% of the time, the factions are not balanced.  If a map is intended to be used for two-flag CTF, but the bases aren’t mirror images of one another, then the two sides had better be perfectly fair.  The necessity of reverting to statistical methods is inherent in the design of the system itself.  The designer will be forced to make changes that do not change the unique player experience (or may even harm it) in order to fix a statistical imbalance.  Worse still, players are skilled at detecting when a system must be balanced statistically, but since they do not have access to hard numbers, their personal experience will tell them that it isn’t balanced, even when the data says that it is!
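For the three-faction case, a quick way to see how far a 43% win rate sits from an even split is a one-sample z-score. The sketch below assumes every game has exactly one winner among the three factions, so a fair share is one third. The sample sizes are invented, and the function name is hypothetical.

```python
import math

def win_rate_z_score(wins, games, expected_rate):
    """Z-score of an observed win rate against an expected one.

    Uses the normal approximation to the binomial, which is only
    reasonable when `games` is fairly large.
    """
    observed = wins / games
    standard_error = math.sqrt(expected_rate * (1 - expected_rate) / games)
    return (observed - expected_rate) / standard_error

# Three factions, so a balanced faction should win about 1/3 of its games
# (assuming one winner per game).
expected = 1 / 3

# Hypothetical sample sizes; the 43% figure is the one from the text.
z_large = win_rate_z_score(wins=430, games=1000, expected_rate=expected)
z_small = win_rate_z_score(wins=13, games=30, expected_rate=expected)
```

Under these assumptions the 1000-game sample gives a z-score above 6, far outside anything noise would explain, while the 30-game sample stays near 1. A system like this can only be judged with a lot of data, which is part of what makes it a burden to balance.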

Nerf Herders?

Nerf Paladin?

Well-designed systems do not need to be balanced through data-manipulation.  If there are 10 weapons in the game, and one weapon is responsible for 20% of the kills, there is probably not a problem.  If the unique player experience isn’t negatively impacted, the statistical difference isn’t a balance issue.  So, the easiest way to avoid the trap of statistical design is to avoid systems that must be balanced mathematically in favor of those that can be balanced behaviorally.  If a system requires a large amount of instrumentation and is extremely sensitive to tiny value changes, instead of obsessing over statistical patterns, try revisiting the system’s design and making it less brittle.


GDC 2010: Design in Detail XV

Without anyone getting kicked in the face…

You always need to listen when people don’t like something. You are too close to the game; you probably already fixed all the things you didn’t like, so you should value a fresh perspective. Keep in mind that you can always trust someone’s emotional reactions, because they are always authentic and valuable, but never just blindly take their advice. The designer’s job is to separate emotional feedback from thoughtful suggestions and treat them appropriately.

Before you can interpret someone’s feedback, you need to understand the source. Feedback means “the game in my head is different,” and often your response to feedback should be to probe about what kind of game they are imagining. You don’t necessarily need to agree on the game you are making to benefit from their feedback; they probably represent some portion of your audience.

You see Development Bias a lot with the public when the development process is very open. Playtesters know the game isn’t finished, they know you expect them to provide constructive criticism, so they become a lot more sensitive and more likely to complain. Once the game is on the shelves, those small problems fade into the background and players rarely notice them.

You also need to understand the source of feedback; if you can categorize someone’s play style, it will help you understand how to react to their feedback. You can weight their comments appropriately.
Here are some examples:
(The names have been changed to protect the guilty)

I used to balance “Easy” by playing with my nose (true story) but Steve still couldn’t beat it. I miss that guy, he was incredibly useful for balancing.

Even more important than categorizing other players, you need to understand your own playstyle. For instance, I’m a “role-player”, so I tend to ignore small balance problems if the results are still dramatic. I have to recruit “pros” that are more sensitive to useless or underpowered elements.

GDC 2010: Design in Detail XIV

If you were disciplined in writing your paper design, and stayed firm while setting up the rough balance, this stage should be very rewarding and exciting.  If not, it is going to be disappointing and frustrating.

The timing for this stage is tricky.  If you start too early, your balance changes will be swallowed up by the churn of new features coming online.  If you wait too long, the rough balance will become entrenched and the team will object to changes.  Generally, this coincides with a “First Playable” build where everything is at least in the game and functioning.

It’s crucially important to communicate this new phase to the rest of the team, so they know what to expect and understand that now is the time for them to give the feedback they have been patiently waiting to deliver. One way to do that is to implement a controlled opportunity for them to play the latest build and provide their feedback in a structured format.  Make sure you tell them what you are currently working on, so their responses will be relevant, but don’t tell them exactly what has changed or you may bias their opinions.

So how do you balance a Sniper Rifle? It is not by adding weaknesses!  Don’t undo the work you did in making it powerful!  Balance it by narrowing its role through limitations.

The best way to detect which elements need to be limited is by watching for the game to become predictable.  If the same strategy is being used in a variety of different situations, to the point where players are no longer required to think about which strategy to choose, it means an element is too useful outside of its designated role.  If the Sniper Rifle is not only the best weapon at long range, but players are carrying it indoors and using it against vehicles, it needs to be constrained.  Give it some time first, because the playtesters might just not have figured out the new balance yet, but if it is consistent for a few tests, start looking for ways to limit the dominant element.
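If playtest picks happen to be logged, the predictability check above could be roughed out as below. This is only a sketch: the `(situation, element)` log format, the 60% cutoff, and the function names are all assumptions, and it counts how many situations an element dominates rather than knowing which role was intended for it.

```python
from collections import Counter, defaultdict

def dominant_choices(picks, threshold=0.6):
    """Return {situation: element} where one element exceeds `threshold` share.

    picks: iterable of (situation, element) pairs from playtest logs
           (a hypothetical log format).
    threshold: usage share that counts as "dominant" (an arbitrary cutoff).
    """
    by_situation = defaultdict(Counter)
    for situation, element in picks:
        by_situation[situation][element] += 1

    dominant = {}
    for situation, counts in by_situation.items():
        element, uses = counts.most_common(1)[0]
        if uses / sum(counts.values()) > threshold:
            dominant[situation] = element
    return dominant

def overused_elements(picks, min_situations=2, threshold=0.6):
    """Elements that dominate at least `min_situations` different situations."""
    tally = Counter(dominant_choices(picks, threshold).values())
    return sorted(e for e, n in tally.items() if n >= min_situations)

# Made-up playtest picks.
log = (
    [("long_range", "sniper_rifle")] * 9 + [("long_range", "smg")] * 1
    + [("indoors", "sniper_rifle")] * 7 + [("indoors", "shotgun")] * 3
    + [("anti_vehicle", "rocket_launcher")] * 5
    + [("anti_vehicle", "sniper_rifle")] * 5
)
flagged = overused_elements(log)
```

In this invented log the Sniper Rifle dominates both long range and indoors, so it is the only element flagged. A tally like this can point at a candidate, but deciding whether to constrain it still takes watching a few playtests.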

On the other hand, if the game is completely unpredictable, it is a sign that the elements are not effective enough at their roles.  A truly random strategy should never be as good as intentionally selecting an element that is strong in the desired role.  It may also be a symptom of a role going unfulfilled.  If there is no Sniper Rifle, the Shotgun and the SMG are equally terrible at long range combat, so it doesn’t matter which one you choose.

GDC 2010: Design in Detail XII

Notice that these guys are getting stronger and stronger as we go?

I actually got this bug. Not only is it balance feedback presented with the authority of a bug report, it’s so incredibly early in the process, there is no way to know if the Sniper Rifle was balanced or not, since most of the game didn’t even work! Ideally, production would help shield a designer from this kind of inappropriate feedback, but in all likelihood they are the ones filing it in the first place.  [Pause for laugh]

Remember, you are getting paid to be the designer; it is your duty to use your best judgment and not swing back and forth based on the latest feedback, especially at this early stage. Hopefully the team will understand that and you will get to see it through.

If you design by committee, you end up like these guys.

Rewarding Play

Previously, we’ve defined games as “interactive experiences constrained by mechanics designed to reliably satisfy common exotelic aspirations”.  In other words, humans have needs.  Some of those needs are aspirational, meaning they aren’t necessary, but produce a positive emotion when met.  And some of those aspirations are exotelic, meaning they are not fulfilled by getting something, but by being used for some other good.  Games are the ideal source for meeting these needs because they are entertaining (required for aspirations) and interactive (required for exotelic experiences).  But how do games meet this sort of need?  What is the process by which a game is fun?  And by understanding this process, can we make them more fun?

Lacks and Goods

Every need consists of two complementary halves, an internal lack and an external good.  If someone has absolutely no desires or requirements, or those desires and requirements can be met completely internally, then they never suffer from a lack and therefore never have any needs.  Those people do not play games… because they do not exist!  The human condition is rooted in our imperfections and our inability to make ourselves whole.  This means that needs can only be met by some external source.

On the other hand, most external objects do not meet any sort of need; a specific external good is necessary to fill a given internal lack.  That is why games are particularly suited to a certain type of need, and totally useless when it comes to others.  If a pet rock could really meet the need for affection and companionship, all of our needs would be instantly met by a pile of gravel.

You guys rock

Actually, I do feel a little better...

When an internal lack is paired with the right external good, the need is met. When the need is a requirement, we refer to the feeling as “satisfaction” or “contentment” because it eliminates a source of negative emotion and leaves us feeling neutral. But when the need is an aspiration we commonly describe the experience as “gratifying” or “rewarding” because it is generally more positive and leaves us uplifted. Gratification and reward are the two most important tools in a game designer’s repertoire for meeting a player’s aspirations.

[To be continued…]