Saturday, November 17, 2007

Project Horseshoe 2007 slides: Smashing the game design atom

Here are my slides (with talking notes) from Project Horseshoe. I blazed through this in about 30 minutes since dinner was waiting and there is nothing more ornery than a crew of wild-haired game designers in a complete glucose crash. See if you can spot the source of the infamous '8mm' meme that stalked the conference.

Since it was Halloween yesterday, let's start with a tale of horror. Not so long ago there was an experienced team working with a known platform and a known engine. They had just scored a popular girl-friendly license valued at roughly $160 million. Their game had the potential to hit as big as Pokemon or Nintendogs.

The designer ignored all this. You see, he had always wanted to make a Zelda with the critical element that has always been missing from every Zelda game: hardcore jumping puzzles. The designer thought, "Nintendo is smart, but how could they have missed such an obvious improvement?" Sure, the license was targeted at tween girls, but tweens like big swords, don't they? This was the game he personally had always wanted to play.

The team was contractually obligated to go along with the design. It had been greenlit. There were milestones tied to the completion of each voluminous chapter of the tome-like design script. If the team missed the milestones, the penalties were extreme. So they crunched in happy little silos of artists, level designers and programmers, all in accordance with a strict production schedule. It was the only possible way to get all the specified content done in time for the looming holiday ship date.

Finally, as the end drew near, they sent it off to testing. Early reports came back that the jumps in the first few levels were rather clumsy. The designer relied on his gut and sent forth an email containing a new set of parameters intended to polish the jump mechanics.

Eventually, a month later, someone got around to playing the last few levels. Uh oh. They relied heavily on laboriously constructed jumping puzzles tuned to the old jump mechanics. The last few levels of the game were massively broken.

It was too late to fix all the early levels so they entered into a death march to rework the last few levels. Lacking time or resources, they busted their budget hiring extra crew to take up the extra workload. The game was still delayed by several months.

Surprise, surprise, the end result wasn't a very good game. It received miserable scores, but even worse, the core audience who would have bought it on the license alone was completely alienated by the design decisions. They wanted to befriend and grow cute little animals. They didn't want to die repeatedly, attacked by spiky monsters as they scrambled through military obstacle courses.

When the licensee pulled out of the sequel, the team collapsed. The human cost was immense. Homes were lost, families relocated, and many were so burnt out on game development that they left the industry permanently, their passion crushed.

There were a lot of problems in this tale, but the primary one was a blatant failure of the game design process at almost every level. The game designer really didn’t know what he was doing. He thought he was writing a movie script. He thought he was making a game for himself. He had no idea that the game systems he was dabbling in were deeply interconnected.

Most game design that occurs isn't much better. It is a combination of craft (such as cloning Zelda) and intuition (such as hoping that tweaking the jumping mechanics will fix all your problems). There is no science here. No predictability.

We have the ability to do so much better. We can create a science of game design.

If we want to modernize game design and move beyond the land of craft and intuition, we need to face and conquer four great challenges. These challenges will define game design for at least the next twenty years and perhaps longer.

  • Modeling our players: What do our players really want?
  • Generating new game systems: How do we create new mechanics that serve those player needs?
  • Metrics: How and what do we test to make sure we are doing the right thing?
  • Results: How do we get results quickly and iterate on them so we can evolve the game?
If we solve these issues and start spreading the resulting practices across the industry, horror stories like the one I just told will become mostly a thing of the past. The ultimate promise of a deep, pragmatic science of game design is better games, happier teams and fewer human tragedies.

We are starting to see a smattering of theorists and teams with money to burn tackling these problems. They are creating systems that save their butts from expensive design mistakes. This is damned exciting.
  • You’ve got the Halo folks tracking heat maps of where players die. Valve has been relying on metrics for years. Nintendo builds early player tests and Kleenex testers right into their dev process.
  • On the game systems side you’ve got Raph’s game grammar.
  • We are starting to rely on real data to model players’ moods and reactions, thanks to Chris Bateman and Nicole Lazzaro’s work.
The work is still at the stage where most pragmatic folks think of these systems as the domain of eccentrics. Yet, each isolated advance reminds me of the turn of the century when physics was being cracked. Brilliant theorists. Great experiments. World changing results.

All these systems are being developed in parallel. You can measure things, but you don’t know what you are supposed to measure. You can write about game grammar, but it is never anything more than a loosely applied system of egghead analysis.

Maybe, just maybe, we can come up with a unified system that tries to answer multiple challenges simultaneously. The connections are all there. We just need to put them together.

In my Seattle laboratory, I've been working on one attempt. It mixes game grammar, player models and measurement systems into one delightfully unified game design process. I’ve got 10 minutes left to share it with you. Think I can do it?

I started with a player model. Let's assume for a moment that players are naturally inclined to wander about, sucking up tidbits of info in the hope of learning interesting new skills.

From the player model we can construct an atomic feedback loop that describes how they learn all the new skills. This basic atomic loop includes all the fundamental pieces of a game. We are taking the deconstructed analytic elements described in so many books and tying them back together into a functional system.

  • You’ve got your game system, that black box in the center of the loop.
  • You’ve got your player input.
  • You've got feedback to the player.
  • You have the player's cognitive model of the game.

We’ve reduced the scope to a single atom in order to make things manageable. Press button, character jumps. That’s a skill atom.
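To make the pieces of the loop concrete, here is a minimal sketch of how a skill atom might be represented in code. This is purely illustrative; the class name, fields and example atoms are my own assumptions, not anything from the talk.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a skill atom: an action the player performs,
# the feedback the game returns, and the earlier atoms it builds on.
@dataclass
class SkillAtom:
    name: str                 # e.g. "jump"
    action: str               # player input that exercises the skill
    feedback: str             # signal the game sends back to the player
    prerequisites: list = field(default_factory=list)  # earlier atoms

jump = SkillAtom(name="jump", action="press A",
                 feedback="character leaves the ground")
stomp = SkillAtom(name="stomp koopa", action="land on enemy",
                  feedback="enemy flattens, points awarded",
                  prerequisites=[jump])
```

"Press button, character jumps" is the `jump` atom above; the player's cognitive model is the part the code cannot hold, which is exactly why we need to measure it.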

Once we have a skill atom we can say interesting things about how the player interacts with it. First, skill atoms have states.
  • The player can figure out how to jump. They can exercise that skill by jumping on things. That is mastery. We can record this.
  • The player can fail to figure out how to jump. They never touch the button again. That’s early burnout.
  • They can get bored with jumping and never jump again. That is late burnout. We can measure this as well.
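The three states above can be classified mechanically from play data. Here is a hedged sketch of one way to do it; the function name, the exact state labels and the idea of a fixed burnout window are assumptions for illustration.

```python
# Illustrative sketch: classify a skill atom's state from usage timestamps.
# first_use / last_use are when the player first and most recently
# exercised the skill; None means the skill was never figured out.
def atom_state(first_use, last_use, now, burnout_window):
    if first_use is None:
        return "early burnout"   # player never figured the skill out
    if now - last_use > burnout_window:
        return "late burnout"    # learned it, then stopped using it
    return "mastery"             # learned and in active use

print(atom_state(None, None, now=100, burnout_window=30))  # early burnout
print(atom_state(10, 95, now=100, burnout_window=30))      # mastery
print(atom_state(10, 40, now=100, burnout_window=30))      # late burnout
```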

Skill atoms are chained together. You can visualize them as a directed graph. Later stage atoms depend heavily on early stage atoms.

Want to kill a Koopa? You need to jump on him. Better hope you mastered the jump skill. We can now represent that classic relationship created by Miyamoto ages ago in a visual model. The theory is slowly catching up with the experimentalists.

You can turn these little chains into big chains that describe the entire game. Here’s a skill chain of Tetris.
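A skill chain is just a directed graph of prerequisites, which makes it trivial to represent and query in code. A minimal sketch, assuming a Mario-style chain (the atom names and the adjacency-list encoding are my own illustration):

```python
# A skill chain as a directed graph: atom -> list of prerequisite atoms.
chain = {
    "move": [],
    "jump": ["move"],
    "stomp koopa": ["jump"],
    "kick shell": ["stomp koopa"],
}

def learnable(atom, mastered, chain):
    """A later-stage atom is only learnable once all the earlier-stage
    atoms it depends on have been mastered."""
    return all(dep in mastered for dep in chain[atom])

print(learnable("stomp koopa", {"move", "jump"}, chain))  # True
print(learnable("kick shell", {"move", "jump"}, chain))   # False
```

Want to kill a Koopa without having mastered the jump? The graph says no, which is the Miyamoto relationship in data-structure form.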

Skill chains are remarkably flexible and rather easy to apply to almost any game. You look for the actions the user is performing, the skills they are learning and the positive / negative feedback you’ve built into the game. Any game with explicit rewards can be diagrammed.

There are probably a goodly number of you rolling your eyes at this point. You can create pretty diagrams to analyze anything. Here we've got someone who has created a very lovely and descriptive diagram of a penguin defecating. This is not a helpful diagram.

We ultimately need pragmatic everyday tools, not egghead analytics. The primary reason we create skill chains is to help solve two of our outstanding challenges:
  • Get real results quickly
  • Choose the right metrics so we aren't wading through huge quantities of data.

Skill chains can be used to create a rapid, iterative test driven game design process.

If we really want rapid feedback, let’s build the feedback system into the game from the very beginning. Skip the giant paper tome phase. Start with a playable system that gives you meaningful reports.

The nice thing about skill atoms is that they are eminently testable. When you write code that is going to be put in front of a player, define your skill atoms. It's the same conceptual idea behind writing unit tests.
  • You have a test framework.
  • You write the tests when you write game logic.
  • You run the test suite when you run the game logic.
  • You get a clean simple report when someone plays the game.
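The unit-test analogy can be sketched directly. Here is an assumed miniature of such a framework: you register a check per atom when you write the game logic, then get a pass/fail report from a play session. The class and method names are invented for illustration.

```python
# Sketch of an atom test suite, analogous to a unit test framework:
# register per-atom checks alongside game logic, report after play.
class AtomTestSuite:
    def __init__(self):
        self.checks = {}

    def expect(self, atom, predicate):
        """Register a check: predicate(session_stats) -> bool."""
        self.checks[atom] = predicate

    def report(self, session):
        """Run every registered check against one play session."""
        return {atom: ("PASS" if pred(session) else "FAIL")
                for atom, pred in self.checks.items()}

suite = AtomTestSuite()
suite.expect("jump", lambda s: s.get("jumps", 0) > 0)
suite.expect("clear line", lambda s: s.get("lines_cleared", 0) > 0)

print(suite.report({"jumps": 12, "lines_cleared": 0}))
# {'jump': 'PASS', 'clear line': 'FAIL'}
```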

When you write your game systems, you can instrument each and every atom. It is a relatively inexpensive process.
  • You label the rewards
  • You label the actions

You know when an atom is touched. You know when it is inactive. All those states (burnout, inactive, etc.) can be recorded.
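Instrumenting an atom can be as cheap as logging a labeled, timestamped event from the game logic. A minimal sketch of what that might look like; the event schema and helper name are assumptions:

```python
import time

# Illustrative instrumentation: every time an atom is touched, append a
# labeled, timestamped event. Inactivity falls out of the log for free.
event_log = []

def touch_atom(atom, kind):
    """kind is 'action' or 'reward'; call this from the game logic."""
    event_log.append({"atom": atom, "kind": kind, "t": time.time()})

# In the jump handler:
touch_atom("jump", "action")
# When the jump lands on an enemy and pays off:
touch_atom("jump", "reward")

# Last time each atom was touched; inactivity = now - last_touch[atom].
last_touch = {e["atom"]: e["t"] for e in event_log}
```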

Remember burnout? The next time someone plays the game, we can visualize burnout directly on our skill chain diagram. You see instantly what atoms folks are missing. Here is someone failing to figure out how to complete a single line in Tetris.

You can also look at the data in terms of feedback channels and activity graphs.

Either way you get quick, easy to decipher feedback.
  • Instead of having a team that creates customized visualizations tailored to your game, you can use a more generalized system.
  • Instead of sorting through dozens or hundreds of badly organized logs, you can see in a glance where problems are occurring.

This requires a change in your development methodology. You want people to play your game as early as possible and as often as possible. Luckily automated testing of skill atoms reduces the cost substantially compared to traditional manual tests.
  • Anytime that anyone, anywhere in the world runs your game, you get valuable play balancing information.
  • Build up a database of a thousand players and release your daily builds to three people a day for every single day of your dev cycle.
In this day of web 2.0 and connected consoles this is now a broadly accessible practice.

Once you have rapid, daily feedback in place, you can use the resulting reports to evolve your design iteratively. All this analytical game grammar silliness becomes a foundational feedback system.
  • We can regression test game designs now.
  • We can fix busted skill atoms and see how things improve the next day.
  • What happens when we refactor our designs to make them more testable? I have no idea, but it excites me.
Imagine if a system like this had been in place when the 'designer' in our horror story made his jumping tweaks. The dashboard would have gone dark almost instantly with burnout spreading across the screen.

The systems I've described today are just the beginning; a rough sketch of the future, if you will.
  • Our player models are primitive.
  • Our metrics can advance dramatically in their sophistication. We are just starting to tap into biometrics.
  • Our player testing systems are still expensive to run.
  • There are amazing new games waiting to be designed and evolved into stunning experiences.
The great challenges are still out there. Both the theory and practice of our science are still being born. Sometimes I wonder, "Who is going to take game design to the next level?"

I love this picture. The 1927 5th Solvay Conference. Einstein, Bohr, Curie. 17 out of 29 attendees went on to win the Nobel Prize.

The first conference was in 1911, almost a hundred years ago. Einstein was the youngest present. Who is the youngest person here? These quirky, brilliant people revolutionized our understanding of physics. Without their work, we wouldn’t have semiconductors, computers or video games. They were theorists and experimenters not so different from what we have in our industry today. A small group of eggheads changed the world.

I look out at this group and I see the same potential. We’ve got the brains. We’ve got the time.

Let’s make this an amazing weekend.

take care

PS: There was one more group photo shown immediately after the Solvay photo. It however, has been redacted due to national security concerns.


  1. Interesting. I wonder how "encodable" an Atom really is: whether it's really possible to describe a complete game design in concrete terms that an automated system can process. Even if there is nontrivial design work that can't be encoded as Atoms, it seems like there could still be value in an Atom-like language. It doesn't have to solve all the problems at once to be useful! :)

    (Also: your images link to copies of the same size rather than more-helpful larger versions; this means some of the slides are completely unreadable!)

  2. Most of the images are just for flavor. Let me know if there are any specific ones that you want to see more clearly.

    As you say it doesn't need to solve all the problems at once to be useful! This is a key insight. :-)

    Something I need to make clear in a more complete writeup of the whole test driven game design concept is that skill atoms are not intended to be an alternate description of the game. I believe that the game is ultimately the code. Anyone who attempts to break that golden rule by creating a parallel all-encompassing game grammar has permanently entered ivory tower land, IMHO.

    Instead, think of skill atoms as a useful view on your game that provides insight into the player experience. They can let you know when the player does the things you hope he/she will do. They can let you know when the player doesn't do those things. And due to their relatively simple structure, they perform this service in a manner that is highly amenable to automation, remote play testing and regression testing.

    Such activities are all useful and worth doing as a matter of course. Wax on, wax off. There is also another benefit hidden away in the mundane practice of testing your designs. It requires you to be thoughtful and exacting about how you define and think about skills, feedback and learning within your game. In the long run, this is the difference between the naive writer who merely jots down what 'sounds right' and the writer who knows how to wield the vocabulary of their trade with penetrating precision.

    take care

  3. This is interesting, but I'm unsure how useful it all is. Of course designers should keep the player in mind, run experiments to see what works, and refine the core system to make the game accessible to many. But I wonder how useful this way of explaining it all is.

    From your description, it seems the problem with the designer in the example story wasn't that he was trying to make a game ill-suited to the license, but that he eventually made a late change without fully considering the ramifications. That no one would have noticed that it broke the later levels until a whole month later seems odd, like it speaks of other, unmentioned problems with the team.

    I'd also take issue with the structure of the team. No one likes to be chained to a project they know is going off the rails. Why didn't anyone tell this to the designer instead of just blindly walking off the cliff? If they were afraid for their jobs, if they weren't listened to, if they were listened to but overruled peremptorily, or if no one had the capacity to see the game was in trouble, those are also problems that could be solved in ways far different than inventing a new way of thinking about the process of game design.

    I dunno. I could well be wrong I suppose. It's certainly a jarring story.

  4. Small build cycles with tons of play testing is just about always going to yield better results.

    A large part of the benefit from the practice of egg head analysis is simply that someone is paying attention to design and test results.

    Having said all that, the skill atom idea sounds really neat, I've been watching some of the "player stats" things that appear in Team Fortress 2 along with some of the achievements that happen in XBox, Steam, Battlefield etc.

    I think this skill atom thing could be really interesting for the players themselves to look at if it can be well visualized.

    Obviously not for all users, but being able to see what you're doing well and where you are on the "skill curve" could be quite a motivator for the hardcore crowd.

    In other news, like JohnH said, your story about the designer and his team of slave monkeys has a whole lot of things wrong with it along with the bad design choice and late play testing.

    I think if the same designer was given tight play testing and feedback on who gets the skill atoms when you'd probably end up with a higher quality Zelda clone with well tuned jump puzzles that the target audience still wouldn't buy.

    It sounds like what was required was a jump to a completely new game idea, which I think would be even less likely with small iterations and lots of tester feedback.

    Failure to understand your target audience is a real killer, because your target audience generally isn't going to explain themselves to you, you've just got to intuitively 'get them'.

    Analysis is just a dissection of your understanding once you've got it, it won't deliver your initial understanding.

  5. Your idea of including a feedback-response system from the beginning reminds me of a similar view taken by 37signals' book Getting Real.

    Using interactable, usable prototypes from an early stage to get a true feel for what the system does when a user meets it. Instead of writing up countless documents that only prove to further clarify (and then muddy) what the goal of the project is, create what the user sees. Start with the point at which users interact with the software and then work on building functionality into this shell.

    As for creating meaningful metrics from your skill atoms, I don't see how it would be any more difficult than how some groups currently analyse log reports. In most cases I'm sure it would be simpler to build a framework around the feedback from atoms and this definitely excites me.

    More agile ideas coming together, bridging design and development? This is what brings people forward, the aspect of groups working towards a common goal.

    As Einstein once said, "Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius -- and a lot of courage -- to move in the opposite direction."

  6. @JohnH: Certainly there were more issues with the given example team and game than simply a flawed game design core, however given the talk and the audience the flawed game design is certainly the highlight for the crowd.

    And the ultimate problem is that the example situation isn't unheard of. While I wouldn't call it common, it likely happens far more frequently than anyone would like to own up to.

    Dan's whole thing here is to get a unified mindset going, and he is modestly proposing it be his system, simply because there is no unified process in the industry yet. Things that we do where I work that seem like common sense don't happen at other places. Even the things that Dan has in his presentation about getting lots of feedback and tweaking the system to accommodate the feedback, which I'm sure strike many people as "duh", just aren't happening as much as they should.

    Of course, the factors for the lack of adoption are as varied as the reasons why the example team failed. But you have to start somewhere.

    Dan - I specifically would like to see larger versions of the skill-tree of Tetris, and the two slides on test feedback results. Also, being in the Seattle area, I was wondering if there were any informal game design gatherings you attend where it would be possible to meet and/or pick your brain.

  7. I'm quite interested in how a state like Burnout is judged. It's not so difficult to imagine the atoms and make the connections, nor to measure their usage, but finding the dividing line between Active and Burnout seems arbitrary.

    Also, it's difficult to see how the system would yield useful results for the designer given in the case study. Comparing a situation with no testing to a system both with testing and a design visualisation aid such as these atoms changes 2 factors in the experiment, thereby failing to comprehensively demonstrate the benefit of either.

  8. I stumbled onto this site (I forget how). It looks interesting; I'll have to read up on it. The tile art is what caught my eye. Anyway, I apologize for the off topic post, but I have a proposal for you: it is a paid artist position for a project I am working on. Please email me at ssc_at_smithcosoft dot com if you are curious at all.

    And just so you know said project is a ways from finished so you could work at your own pace (within reason).

    I look forward to hearing back from you, even if the answer is a "no thank you" or "not interested"

  9. I'm not in the industry, just a lowly computer science grad student. When I try putting on the game designer hat and thinking deep thoughts, the hat falls off. Still: it seems to me behavioral extinction and burnout can be observed in an automated way by measuring player behavior in terms of known addictive behavior models.

    Think back to reward schedules, fixed and random reward intervals, and behavioral extinction. If your game session analyzer can detect when certain skills can be usefully applied by the player (known/learned/mastered or not), you should be able to detect when skills are available but aren't being used, when skills are being used successfully/unsuccessfully, or when skills are being used at inappropriate times.

    From there you should be able to detect early player experimentation, skill usage and mastery, the presence or absence of overt reward or punishment, and even behavioral extinction.

    I like Danc's use of Super Mario Brothers (SMB) as an example. Suppose we are trying to model the effects of "turtle shell kicking" in SMB. What does player frustration and behavioral extinction look like, as applied to that specific atom? Imagine the game notices the first appearance of a turtle, and logs the player's behavior with regard to that turtle. Now suppose the player starts to frequently die from kicking the shell at a wall, only to have the shell bounce back and damage the player. The player becomes frustrated with the effects of this atom, and stops using it.

    What pattern of events would you see on an event log, if you were manually reviewing the "atom hits" from a gameplay session? Now how would you design an automated system to detect those patterns and label them?

  10. heh, I hardly ever used the turtle shell kicks in mario. I'd generally be much happier to just slowly jump on everything's head and explore the level at a slower pace. (Yes you can explore a side scrolling platformer level)

    Would my behavior pattern be described as skill burn out? Should it be described as such? Should the game change because of it?

  11. I believe in that situation, the automated game log analyzer would observe that nearly every time you were presented with a turtle, of the actions available to you (avoid turtle's area of influence; jump over; stomp and leave; stomp and kick at unbounded area; stomp and kick at area bounded on one side; stomp and kick at area bounded on two close-together sides; etc) you rarely chose to stomp or stomp/kick the turtle.

    A human surveilling a gameplay session in progress would comment that the player never noticed you could kick turtle shells, or never seemed to want to use that ability. An automated parser should detect the same thing.

    Before you can have behavioral extinction you must have a behavior. I believe a system can easily detect when a mechanic is being used inconsistently across the player population (take it or leave it), or when the mechanic is consistently being tried, used for a while, and then abandoned. If it's being abandoned, what is it being abandoned in favor of?
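A hedged sketch of the detection the comment describes: a mechanic that is genuinely tried, used for a while, and then abandoned. The input is a sorted list of timestamps when the mechanic was used; the idle factor, the minimum-use threshold and the median-gap heuristic are all my assumptions, not an established algorithm.

```python
# Detect "tried, used for a while, then abandoned": the mechanic was used
# at a fairly steady rhythm, and has now been idle far longer than that
# rhythm would predict.
def abandoned(use_times, session_end, idle_factor=5.0, min_uses=3):
    if len(use_times) < min_uses:
        return False                    # never really adopted, so not abandoned
    gaps = [b - a for a, b in zip(use_times, use_times[1:])]
    typical_gap = sorted(gaps)[len(gaps) // 2]   # median gap while in use
    return (session_end - use_times[-1]) > idle_factor * typical_gap

# Kicked shells steadily early on, then stopped for the rest of the session:
print(abandoned([10, 20, 30, 40], session_end=500))  # True
print(abandoned([10, 20, 30, 40], session_end=55))   # False
```

The follow-on question ("abandoned in favor of what?") is answerable with the same log: look at which other mechanics' usage rates rise after the abandonment point.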

  12. "Small build cycles with tons of play testing is just about always going to yield better results."

    Only if the following are true:

    - Meaningful changes take short amounts of time to make.
    - Only the truly important aspects are reworked between iterations.

    Otherwise the developer may be either walking in circles, or walking too slowly. It's not enough to go in the right direction, you have to get there.

  13. Danc,

    As always, your latest post is filled with wonderful ideas and inspiration. I can't help but think about the number of day to day problems developers have that could be avoided using the techniques you suggest. Do you have any advice for how to set up the technology to properly record the data, as well as aggregate it all into something meaningful?

    Also, I was browsing some of your older stuff and ran into this:

    and couldn't help but notice the similarities to Viva Pinata. Incidentally, what is your opinion on Viva Pinata, and how do you think it compares to your original idea of Neighbors?

    Finally, I would love to hear your thoughts on Expression Design. What went right, what went wrong. How would you review it in comparison to your competitors?


  14. I am not only grateful for the insights of this article, but intend to be mildly evangelical about it in my workplace. So frequently do production concerns dictate the design process and allocated resources that the horror story outlined doesn't sound unfamiliar - and many industry veterans would be lying if they said otherwise.

    I would be most obliged if you could post higher resolution images of both the skill chain and the activity graphs.

  15. Great comments! I'm glad the idea resonates.

    Do small feedback cycles help if your initial concept is completely off base? JohnH and Travis both wondered if the feedback system would have been enough when there were obviously numerous dysfunctions within the team. Feedback systems are by no means a panacea, but they can help signal failure early in the process.

    For example, in the horror story above, early testing with target customers might have revealed that the fundamental concept was incorrect. All it takes is a few users early in the process to say "You know, this whole jumping idea is kind of stupid. What I really want are ponies!" The red flag is raised by an external source, one that ultimately pays the team's bills. This doesn't always get through, but at least the right signals are being sent into the team so they have an opportunity to act in an informed manner.

    I disagree that you only need to 'intuitively get' your user. It is certainly important to attempt to build up empathy, but even the most tuned-in developers find themselves in trouble when they rely solely on mental models instead of concrete customer evidence. Great design means repeatedly putting your product in front of the customer and observing how they react.

    The line between Active and Burnout: The lag time since an atom was last activated is the variable you'd use to determine burnout. As the designer, you need to set a baseline of what the expected periodicity might be and then test to see what actually happens. If it is a larger meta-game atom that comes into play every 20 hours, you might allow a long time before you mark the atom as burned out. If it is a core atom such as movement or a primary attack, you might set a very short burnout fuse.
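    The burnout fuse described above is easy to sketch in code: compare the time since an atom was last activated against a per-atom expected periodicity. A minimal sketch; the fuse multiplier and function name are assumptions for illustration.

```python
# Burnout test: an atom is burned out when it has been idle for more
# than `fuse` times its expected activation period. Core atoms get
# short fuses by virtue of their short periods; meta-game atoms get
# long ones automatically.
def is_burned_out(last_activated, now, expected_period, fuse=3.0):
    return (now - last_activated) > fuse * expected_period

# Core atom (jump fires roughly every 5 seconds) vs. a meta-game atom
# (expected roughly every 20 hours = 72000 seconds):
print(is_burned_out(last_activated=0, now=60, expected_period=5))      # True
print(is_burned_out(last_activated=0, now=60, expected_period=72000))  # False
```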

    It is worth noting that you will very likely need to adjust your skill atoms as player data is collected. Players will figure out new skills and learn existing skills in ways that you didn't initially imagine were possible.

    Skill atoms are your theory. The play test is the experiment. This often results in revising your original theory. :-)

    Setting up the system: This is going to depend highly on your game. The most obvious candidates are web-based projects, but you can achieve similar results if you can push updates to users and you have a net connection. The details of exactly where you add your instrumentation, how often you collect the data, how you store it and how you display it are all wide open problems. Go forth and try something! :-)

    Meeting in Seattle: T.carl, I'm always up for meeting folks in Seattle. Drop me a note at danc [at] lostgarden [dot] com and we'll figure something out.

    Bigger images: Here are a couple:

    take care

  16. You shouldn't have posted that last picture. It's completely pretentious. How can you compare a bunch of game designers to a collection of Nobel prize winners, the greatest minds of their time? "I see the same potential." Maybe it was meant light-heartedly, etc., but that's context and we [reading] do not have it.

    I'm sure you are all intelligent, but keep your feet on the ground, for God's sake. If you were comparable to the likes of Einstein, you wouldn't be designing games, that's for sure.

  17. In reply to the post above

    All religions, arts and sciences are branches of the same tree.
    -Albert Einstein

  18. In Reply to the 2nd to last comment ... relax, it was supposed to be inspiring :)

    In reply to the comment directly above me ... **SNAP** ;)

  19. Nice post. One would think it an almost tautological proposition, but since no one is doing it, it's past time to be saying 'let's model what experience the player has'!

    I haven't much to add today, I am working on a player modelling PhD but the doing takes precedence over the talking :D
    Still, I would like to say thanks for an earlier post that alerted me to work by Biederman and Vessel, this has been most useful to validate some of my theory. I'll send you a free copy of the (long, dull) paper :D You don't have to read it!
    Also, I was curious - why pick a diagram of the Shotokan karate kata Empi in your opening slide? Personal curiosity...

  20. Once upon a time a man had a computer game, and he wanted to make it better, so he added a little process that would take everything that wasn't fun out of the game. But when he started it up, nothing seemed to happen, the little process had removed itself.
    I have a little theory about burnout: people play a game for a reason, but as they play, the game encourages them not to play that way, and they keep going out of habit. Eventually they twig this and stop, but they may not know what it was that originally encouraged them to play.

    I still play tetris quite happily and have never moved onto the "advanced" high scores. I play it because I have a love of arranging and of feeling my own improvement; tetris allows both, as the slow speed-increase allows it to get harder as I get better. I stopped playing my old favourite clone when I was faster than the control mechanism and the ability to improve maxed out.

    Finally, your skill atoms remind me of levels: just as you might play the same level of a game without progressing, you might play the same skill set over and over for some other part of the experience. Your skill atom system is centred around the ideal of achievement; there are others.

  21. Re: Josh
    Actually, the skill atom idea is built around the concept of learning, which comes in many forms, not only achievement. Your love of arranging and feeling your own improvement fits quite nicely into the model. I recommend you read the Chemistry of Game Design article located here:

    Re: zenBen
    Send me your paper! My email is danc [at] lostgarden [dot] com.

    take care

  22. Hi,
    Em, probably should've said "I'll send it when it's published!"

    Still curious about the karate diagram...