Monday, April 4, 2005

Design Testing: The use of addiction metrics to force rapid evolution of innovative game designs

One of my goals with this blog is to formulate a 'new game development methodology' that empowers the little guy and helps the growth of innovation in the game industry. How do we build innovative, highly addictive games more quickly and with lower risk? Part of the answer is the rigid application of gaming metrics to the process of improving player addiction.


The Legacy of Cowboy Designers
The traditional designer is a cowboy designer. Modern game designs are the result of the messy, content dependent process a cowboy designer intuitively follows when building a game.

Cowboy Programmers
The term comes from the land of programming where early programmers would whip out l33t code in as little time as possible. Cowboy programmers were lone guns, experts in their field who possessed a deeply intuitive understanding of what works and what doesn't.

Coding standards, methodologies, even teamwork were taboo for the ancient cowboy programmers. Their code was inscrutable and many decisions seemed arbitrary. Troubles inevitably arose as the industry matured. Projects didn't scale and many failed as they were bogged down in a plague of bugs. Eventually the world figured out better ways of programming that were more reliable, less risky, and produced better results. God bless process advancement.

Cowboy Designers: Copy cats with a hip attitude
Cowboy Designers are similar in many ways. They shoot from the hip when it comes to decisions, relying on their own finely tuned sense of 'fun' to design systems and create requirement docs. This sense of 'fun' is typically built up after internalizing the game play of dozens of similar game titles.

Such expertise works well when you are creating a clone or focusing on the later layers of the game design where you can't do much damage. Adding the 101st 'designer inspired' Pokemon is about as risk free as adding the 100th one. Subtle, oh so well crafted variations on existing themes are the bread and butter of a cowboy designer. The 'I could build it better' syndrome that drives many game designers is not only a contributing factor to the stagnation of innovation, but is actively encouraged by most game publishers as a means of reducing risk.

Cowboy designers stifle innovation
The big problem is that intuitive cowboy designers have high failure rates when it comes to inventing core game mechanics. Experience in pre-existing genres is a poor guide for success when your job is to put together new rules that result in dynamically different psychological scenarios. When cowboy designers attempt to define a new genre, one of two things typically occurs.
  • The half-breed design: The designer mixes two well-defined genres. Since their decisions are informed by experience and not psychology, the result is rarely enjoyable to play.
  • The mush design: The designer mixes multiple genres together with rules that are untested and arbitrary in nature. Randomly designed games tend to have a remarkably low success rate.

Either way, the result is ruined teams and failed games. It is no wonder that publishers are averse to large investments in innovation.

Bad process, not bad people
It doesn't have to be this way. We are simply using the wrong design methodology to build innovative games. The old cowboy method only works with Shooter Clone #64. It fails miserably when attempting something new. With the right design methodology, built around the concept of making risky game design decisions painless, innovative games have the opportunity to prosper.


Introducing feedback: A miracle design tool
What we really need is a reliable feedback mechanism that lets us reduce investment risk in order to create a safety net for innovation.

The modern feedback desert
Consider the traditional feedback cycle in game design. You spend 12 to 18 months building a game. You receive the majority of your feedback from traditional game testers and internal 'team testing'. The information is very useful, but runs into several difficulties:

  • The information is subjective: Most feedback is qualitative and is filtered through a pre-biased team of hardcore game developers.
  • Feedback is not statistically valid: Testing occurs with small numbers of testers who do not accurately represent the target market, nor can their opinions be verified to match those of the larger market.

At this stage you still don't get a chance to react. Almost all gameplay fixes occur in the higher layers of the game design because changes there carry less risk. You can change some art or a few engine variables, but rarely is there time to alter core game mechanics due to the exponential cost of change in a content-heavy system.

Once the game is released and in the public's hands, you finally get your first pieces of accurate feedback on your design decisions. You either sell a lot of copies, or you don't. If your game happens to end up a failure, there is no second chance. You didn't get it right the first time and now your entire team will be culled in a grand bloodletting by your disappointed publisher.

This isn't a healthy feedback cycle. The opportunity to make meaningful changes is limited from an early stage. Those who make mistakes are punished dramatically. Those who survive see their lifeless brethren on the roadside and learn that risk taking is dangerous.

Specifying a useful feedback mechanism
We need a tool that:

  • Rapidly informs us when design decisions unbalance the game
  • Lets us test multiple variations on a rule without risk
  • Allows us to see the effects of a change before we invest heavily in expensive, difficult-to-change content.
Such a design tool would allow incremental investments in new game designs. If you make a mistake, you can back that change out without putting the entire project at risk. Since the cycle time between a change, its feedback, and the acceptance or rejection of the change is short, the team can iterate through a series of changes quickly.


Enter the Metrics

There are a wide variety of testing systems available that give us interesting feedback.

  • Unit testing
  • Market testing
  • Design testing

I'll briefly describe the first two and then explain how design testing can radically change how you go about game design.

Unit testing
The most common testing in game development is the unit test, borrowed from Agile programming methodologies. Unit tests are covered extensively in a wide variety of books and websites and deal primarily with code integrity and refactorability. This is good, important stuff that is essential to creating an agile game design methodology. However, unit tests address only programmer risk, not design risk.

Market testing (aka Market Research)
Another common method of product testing involves giving a product sample to users and having them rate how likely they would be to purchase it. Market testing is a huge field and contains everything from focus groups, to concept testing, to full-on market testing with a wide-scale deployment of the finished product.

Traditional market testing has some fundamental problems when applied directly to game design.

  • Expensive: This restricts its use to only the biggest of game developers and publishers.
  • Provides limited insight: Second, and most damning, it is nearly impossible to tell anything about the addictive qualities of a game without actually playing it. I can show potential players a box with a guy with a gun on it and ask if they would buy it. But such a survey gives me no meaningful information on whether I have the next Halo or the next Daikatana on my hands. "How does it play?" is critical competitive information.
Games, as a testable product, exist in a market research vacuum. Many of the traditional techniques honed over decades of consumer product research simply do not apply. They don't capture 'addiction', the competitive essence of games.

Design testing
We need tests and metrics that capture such ephemeral qualities as 'fun' and 'addiction'.

What makes me think we can test 'fun' and 'addiction'? I believe that core game mechanics rely on relatively simple psychological reward schedules. A successfully addicted player exhibits easily identifiable behavioral symptoms. By tracking these symptoms in a statistically valid manner, the designer gains useful feedback on the addictive properties of their gaming system.


Common Metrics for Design Testing

Testing for addiction is easier than you might imagine. The following are easily gathered metrics for measuring system-wide addictive behavior.
  • Length of playtime
  • Intensity of play time
  • Willingness to play again
  • Length between play times
  • Number of play times
  • Spot exit surveys
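
As a rough illustration, several of the metrics above (number of play times, length of playtime, length between play times) fall out of a plain log of session start and end times. The sketch below assumes a hypothetical Session record with times in seconds; the field names and sample values are invented for the example and are not from any real tracking system.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One play session. Hypothetical record layout, times in seconds."""
    player_id: str
    start: float
    end: float


def session_metrics(sessions):
    """Return per-player play count, mean session length, and mean gap between sessions."""
    by_player = {}
    for s in sessions:
        by_player.setdefault(s.player_id, []).append(s)

    result = {}
    for player, played in by_player.items():
        played.sort(key=lambda s: s.start)
        lengths = [s.end - s.start for s in played]
        # Gap = time from the end of one session to the start of the next.
        gaps = [b.start - a.end for a, b in zip(played, played[1:])]
        result[player] = {
            "play_count": len(played),
            "mean_length": mean(lengths),
            "mean_gap": mean(gaps) if gaps else None,  # None: only one session so far
        }
    return result


# Made-up sample data for illustration only.
sample = [
    Session("p1", 0, 600),
    Session("p1", 4200, 5400),
    Session("p2", 100, 400),
]
metrics = session_metrics(sample)
```

Intensity of play and willingness to play again are harder to read off a log like this and would likely need the spot exit surveys.
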
Game Token Metrics
You can also get more atomic and measure metrics for each game token in order to dig down into why a particular pattern of addiction is occurring.

  • Use time of each token
  • Frequency of use for each token
  • Gap between token use
  • Spot surveys of user's token enjoyment
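
The first three token metrics can be sketched the same way. This example assumes the game logs each token use as a hypothetical (token, start, end) tuple within a single session; the token names and numbers are made up.

```python
from collections import defaultdict


def token_metrics(uses, session_length):
    """Summarize token use within one session.

    uses: list of (token, start, end) tuples, times in seconds (assumed log format).
    session_length: total length of the session in seconds.
    """
    grouped = defaultdict(list)
    for token, start, end in uses:
        grouped[token].append((start, end))

    report = {}
    for token, spans in grouped.items():
        spans.sort()
        use_time = sum(end - start for start, end in spans)
        # Gap between the end of one use and the start of the next use of the same token.
        gaps = [nxt[0] - cur[1] for cur, nxt in zip(spans, spans[1:])]
        report[token] = {
            "use_time": use_time,
            "uses_per_minute": len(spans) / (session_length / 60),
            "mean_gap": sum(gaps) / len(gaps) if gaps else None,
        }
    return report


# Made-up two-minute session for illustration only.
uses = [("sword", 0, 30), ("potion", 30, 32), ("sword", 60, 100), ("sword", 110, 120)]
report = token_metrics(uses, session_length=120)
```
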
ROI metrics
Finally, you can gather ROI metrics by combining the information from metrics above with cost of production information from your project tracking. This gives you some interesting information on where your development investments are paying off.

  • ROI of each token: Calculated by use time / development cost.
  • ROI of each game system
Intriguing results immediately pop to the surface once you implement ROI metrics. Additional level design has a marginal ROI. Character art is the same. You can add a new monster or a new level to the game and the addictive qualities of the title don't budge an inch. On the other hand, add a new powerup system and watch the addiction rise.
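
Here is a minimal sketch of the token ROI calculation, use time divided by development cost. The token names, the hours, and the choice of hours as the unit for both quantities are assumptions made for the example, not measurements from a real project.

```python
def token_roi(use_time_hours, dev_cost_hours):
    """ROI of each token = total player use time / development cost."""
    roi = {}
    for token, cost in dev_cost_hours.items():
        if cost <= 0:
            raise ValueError(f"development cost for {token!r} must be positive")
        # Tokens nobody used get an ROI of zero rather than being dropped.
        roi[token] = use_time_hours.get(token, 0.0) / cost
    return roi


# Invented numbers, chosen only to show the shape of the calculation.
use_time_hours = {"powerup_system": 1200.0, "new_monster": 90.0, "extra_level": 150.0}
dev_cost_hours = {"powerup_system": 80.0, "new_monster": 40.0, "extra_level": 120.0}

roi = token_roi(use_time_hours, dev_cost_hours)
ranked = sorted(roi.items(), key=lambda item: item[1], reverse=True)
```

Sorting by ROI gives a quick ranking of where the next unit of development time is likely to pay off.
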

Control Charts
You can track these metrics on control charts. This simple charting method tracks changes in specific metrics over time. When a system is changed, you can usually see the results immediately in the control chart.

In general there will be one or two key metrics (Key Performance Indicators, or KPIs) that give you a strong indication of the addiction of your gaming system. Other metrics will be secondary factors that influence your KPIs. For example, the reuse time of the powerup system is not the single most important factor in the game, but it influences total session playing time, which is your primary indicator of addiction.
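
A control chart can be approximated in a few lines. The sketch below is a simplified variant: it derives the center line and limits from the sample standard deviation of a baseline period and flags later points that fall more than three standard deviations away. Production control charts often estimate the spread differently (for example from moving ranges), and the session lengths here are invented.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower limit, center line, upper limit) from baseline KPI samples."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline period
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(points, limits):
    """Indexes of points that fall outside the control limits."""
    low, _, high = limits
    return [i for i, value in enumerate(points) if value < low or value > high]


# Invented daily mean session lengths in minutes.
baseline = [30, 32, 29, 31, 28, 30, 31, 29]   # before the design change
after_change = [31, 30, 38, 39]               # after the design change

limits = control_limits(baseline)
flagged = out_of_control(after_change, limits)
```

Points flagged above the upper limit after a change suggest the change moved the KPI rather than the KPI drifting through ordinary day-to-day noise.
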

Using the data
Once you have the control charts populated with data, it is a simple matter of following a clearly defined change regimen:

  1. Create a design change
  2. Propagate that change to your game players
  3. Measure the results
  4. If the change is positive compared to the previous baseline, keep the functionality.
  5. If the change is negative or mixed, create a new set of changes
  6. Track the key metrics over time to ensure that there is a steady improvement.
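
Steps 4 and 5 can be sketched as a small decision function. This version assumes higher is better for every tracked metric and treats a change as positive only if every metric improves by more than a chosen margin; anything else counts as negative or mixed. It compares plain means and does no significance testing, which a real pipeline would want. Metric names and samples are invented.

```python
from statistics import mean


def evaluate_change(baseline, candidate, min_gain=0.0):
    """Compare candidate metrics to the baseline and return a verdict plus per-metric deltas.

    baseline, candidate: dicts mapping metric name to a list of samples.
    Assumption: higher is better for every metric.
    """
    deltas = {
        name: mean(candidate[name]) - mean(baseline[name])
        for name in baseline
    }
    if all(delta > min_gain for delta in deltas.values()):
        verdict = "keep"      # step 4: positive against the baseline
    else:
        verdict = "revise"    # step 5: negative or mixed, try a new set of changes
    return verdict, deltas


# Invented samples for illustration only.
baseline = {"session_minutes": [30, 28, 32], "sessions_per_week": [4, 5, 3]}
change_a = {"session_minutes": [35, 36, 34], "sessions_per_week": [5, 5, 5]}
change_b = {"session_minutes": [36, 37, 35], "sessions_per_week": [3, 3, 3]}

verdict_a, deltas_a = evaluate_change(baseline, change_a)
verdict_b, deltas_b = evaluate_change(baseline, change_b)
```

Change B lengthens sessions but players come back less often, so it is classed as mixed and sent back for revision.
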
Other areas of future exploration
This is a very rough overview of the techniques involved in design testing. It is both a broad and deep field that borrows from many well-developed ideas in the world of market research and process engineering and applies them to the problem of game design. Other topics include:
Batch testing: Test a wide number of variations in game design mechanics simultaneously. Take the best results and explore them further.
Tie KPI to financially meaningful results: Use regression analysis to link key statistics to financially meaningful results. For online games, measure re-up rate on subscriptions. For ad-based games, measure customer referral rates and number of impressions. For shareware games, measure initial purchase rates.
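
As a sketch of the regression idea, the example below fits a straight line by ordinary least squares between a KPI (mean session minutes) and a financial outcome (subscription re-up rate). The data points are invented and deliberately fall on an exact line to keep the arithmetic easy to follow; real data would be noisy and would call for a proper statistics package and significance checks.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented per-build figures: KPI on the x axis, financial result on the y axis.
session_minutes = [20, 30, 40, 50]
reup_rate = [0.50, 0.60, 0.70, 0.80]

slope, intercept = fit_line(session_minutes, reup_rate)

# Predicted re-up rate for a hypothetical build averaging 45 minutes per session.
predicted = slope * 45 + intercept
```
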


Design Testing Limitations

Design testing is the core of a rather radical new game design methodology. Let's take a look at some of the limitations.

Not every game can be design tested
Design testing is not for every team nor is it for every type of game. To borrow a term from the agile programmer world, most modern game designs are poorly refactored. They are clumsy, non object-oriented messes of content spaghetti strung together by cowboy designers and their complacent teams of artists and programmers. The typical modern game design has the following attributes:

  • Change is expensive.
  • Testing takes forever.
  • Development cycles are long
  • Static content is king

None of this is conducive to an effective design testable system. I think of applying design testing to an adventure game and shudder. The sheer mass of the static level content combined with the linear sequencing of content results in a system where a change to one location has no effect on any other location. Players are likely to play the game only once and it will take them 40 hours to complete. Good luck getting any timely feedback.

Requirements for a design-testable game
To use design testing as part of our process, we need a game design that is amenable to thorough application of the technique. The following are some key characteristics of a design-testable game.

  • Refactored Design: The game is composed of highly reusable object-oriented elements. Changes to these elements propagate throughout the entire game system.
  • Game Mechanics focused, not Content focused: Static content in the form of level designs, sequenced boss attacks, fixed plot points, etc. is rarely used. Instead the focus is on interesting game rules, meta game rules, and dynamically generated levels that create an enjoyable game experience.
  • Automated update mechanism: The game designer can rapidly push changes out to a population of game players.
  • Real-time metrics: When a change is made, statistics on current player usage are immediately sent back to a central electronic dashboard. Most commonly this will be through an internet-based tracking system connected directly into the game.
  • Large population of game players: Statistics are worse than meaningless if you do not have a pertinent population to survey. Testable games require a large standing population of active game players. This suggests extensive open betas and other mechanisms to encourage player interaction before the game is finished. Subscription-based models also work well with this requirement.
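
To make the real-time metrics requirement a little more concrete, here is one possible shape for the events a game might send back to the dashboard. Everything in it (the PlayEvent record, its field names, the JSON encoding, the build identifier) is a hypothetical format invented for this sketch, not an existing protocol. The transport to the server is left out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PlayEvent:
    """One tracked gameplay event. Hypothetical format for illustration."""
    player_id: str
    build_id: str     # which design variant the player is running
    event: str        # e.g. "session_start", "token_used"
    timestamp: float  # seconds since the Unix epoch
    detail: dict      # event-specific data


def encode_event(event):
    """Serialize an event to one JSON line, ready to send to the tracking server."""
    return json.dumps(asdict(event), sort_keys=True)


def decode_event(line):
    """Rebuild an event on the server side from a JSON line."""
    return PlayEvent(**json.loads(line))


line = encode_event(
    PlayEvent("p1", "build-042", "token_used", 1112572800.0, {"token": "powerup"})
)
restored = decode_event(line)
```

Tagging each event with the build it came from is what lets the dashboard attribute a shift in the metrics to a specific design change.
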
Markets ripe for design testing
Online games have a clear advantage. Many of the tracking systems are already built into the web and you already have logs and a database ready to receive the data. You are guaranteed 100% correct information since you see everything that occurs. MMOGs are already doing many of the things outlined in this essay. Their success is readily apparent and I challenge you to find a more addictive genre.

Consoles are moving online in the next generation and most gaming PCs are online already. The technological infrastructure is certainly possible. All it takes are teams innovative enough to improve their development process. Indie games, Nintendo DS titles, and mass market consumer titles are all places where new methodologies might blossom.


Is design testing worth it?

If you want to make a design testable game, you need to throw out decades of highly polished game design experience and theory. You need to rely on cold metrics instead of your warm fuzzy 'I could do it better' cowboy designer instincts. You need to shun common content heavy genres that you grew up with and love in the deepest core of your gamer heart.

What you lose
Design testing fundamentally changes how games are developed.

  • Long development schedules cloistered away from the public: Feedback is critical to the product's final success. Alphas, closed betas, and open betas become essential tools for releasing a polished game.
  • Offline games: If you aren't online, you've got no way of gathering data about gameplay.
  • Static level design: You need a refactored game design that allows changes to be made quickly.
  • Plot: If the ROI isn't there, kill it. You've got data that proves you have better things to spend your development time on.
What you gain
What you get in return is the ability to make radically addictive, highly competitive games with limited risk of failure.

  1. Increase your competitive advantage: The other guy is spending all his effort just to maintain his position at the top of the king-of-the-genre battle. He invests in mature genres and every game burns out his hardcore audience a little more. You can come onto the scene with a fresh new title that is more addictive than his current offering. While his FPS is merely one of many such competing titles, your title is a one-of-a-kind 'must have'.
  2. Reduce your costs: Instead of spending millions on movie level content, you gain your addictive rush from intelligent, informed game mechanics. The result is a lower cost structure. Because you have ROI metrics built into your design
  3. Reduce product failure risk: From the very beginning of development you know, to the decimal point, how addictive your game is with your target market. This lets you cull the bad games early, and focus your efforts on the winners.
If I can make a game that does the three things listed above, I'm willing to give up all the game design traditions that don't work for design-testable games.

The result is a refactored, innovation-friendly game design methodology. You can take risks and succeed. You can spend less money and still beat the big guy. As the next generation titles come upon us, the smaller game developers have a choice. They can either work smarter or they can die. Design testing is a great tool for avoiding the latter.

-Danc.