Authors: Chris Crawford
This edifice was built to solve engineering problems. Computer scientists have tackled a wide range of problems, but they have veered toward more mechanical problems because these problems are the ones they can solve. When they have attempted to tackle softer problems, such as natural language, they have encountered frustrating obstacles and made less progress.
The perfect example of computer science at its best is the programming of the Mars robots Spirit and Opportunity. These two robots operate millions of miles away from their human controllers, in a harsh and alien environment, with absolutely no opportunity for assistance should they err. They use their cameras to view their surroundings, and their software analyzes the images in 3D to figure out locations and sizes of rocks, the slope and texture of the ground, and so forth. They plan safe routes over rocky and uneven ground. They have to be able to carry out minor repairs on their own software and evaluate their own mechanical problems. This is complicated stuff, yet they have performed magnificently. These machines are a triumph of computer science.
But all these tasks are engineering problems, expressible in precise mathematical form. There are no uncertainties, no probabilities, no soft factors. Everything the robot needs to know it can calculate, and the results of its computations are black and white. These robots aren’t programmed to make judgment calls based on hunches or soft factors. That’s not their department—computer science doesn’t do that kind of thing well.
What about fuzzy logic and other methodologies for handling uncertainty?
True, computer scientists have made many stabs at handling uncertainty, but these efforts have not enjoyed the triumphs that hard-core approaches can claim. Many interesting and impressive results have been obtained using fuzzy logic, but so far this area of effort is more a field of opportunity than a spire of triumph.
The problem with the most common computer science technologies is their emphasis on boolean data structures. Fundamentally, a boolean value is a stupid number. The numerical value of pi is 3.14159265358979, but the boolean value of pi is 1, as distinct from 0. Seeing the world through boolean eyes eliminates all color and all shades, leaving just two values: black and white (see Figure 19.1).
FIGURE 19.1: On the left, arithmetic imagery; on the right, boolean imagery.
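The flattening that boolean eyes perform can be shown in a few lines. The Python below is my own illustrative sketch, not code from the book:

```python
import math

# The numerical value of pi carries many digits of gradation...
pi = math.pi
print(pi)                            # 3.141592653589793

# ...but viewed through "boolean eyes," every nonzero number collapses
# to the same value: true. All shading between values is lost.
print(bool(pi))                      # True
print(bool(0.0001) == bool(1000.0))  # True: a whisper and a shout look identical
```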
This black-and-white vision is what computer scientists tend to do with dramatic reality. They build large datasets containing statements like this:
An apple is a fruit.
An orange is a fruit.
A banana is a fruit.
All fruit is edible.
Only edible things can be eaten.
All Actors have the boolean attribute Hungry.
If the Hungry flag is true, then the Actor sets a goal “to eat.”
The Hungry flag is set to true eight hours after an Actor last ate.
After a thing is eaten, it no longer exists.
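To make the flavor of such a rule base concrete, here is a hypothetical Python sketch of these statements. Names such as `Actor`, `EDIBLE`, and `HOURS_UNTIL_HUNGRY` are my own inventions, not part of any system the book describes:

```python
HOURS_UNTIL_HUNGRY = 8

FRUITS = {"apple", "orange", "banana"}   # an apple/orange/banana is a fruit
EDIBLE = set(FRUITS)                     # all fruit is edible

class Actor:
    def __init__(self):
        self.hungry = False              # the boolean attribute Hungry
        self.hours_since_meal = 0
        self.goal = None

    def tick(self, hours):
        self.hours_since_meal += hours
        if self.hours_since_meal >= HOURS_UNTIL_HUNGRY:
            self.hungry = True           # flag set eight hours after last meal
        if self.hungry:
            self.goal = "to eat"         # the Hungry flag triggers the goal

    def eat(self, thing, world):
        if thing not in EDIBLE:          # only edible things can be eaten
            return False
        world.discard(thing)             # after being eaten, it no longer exists
        self.hungry = False
        self.hours_since_meal = 0
        self.goal = None
        return True

world = {"apple", "rock"}
bob = Actor()
bob.tick(8)                              # eight hours pass; bob becomes hungry
target = next(t for t in world if t in EDIBLE)  # search for something edible
bob.eat(target, world)                   # eat it; the apple is gone
```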
This kind of dataset makes it possible for actors to determine that they are hungry, decide that they need to eat, search for something that’s edible, and then eat it. By expanding this pile of statements, computer scientists believe that they can attain any level of dramatic fidelity they want.
Figure 19.2 illustrates the way they see it in their mind’s eye.
FIGURE 19.2: Every day, in every way, better and better!
The problem with this thinking is that it underestimates the number of logical statements needed to make “someday” happen. The most ambitious effort in this direction is the CYC project (www.cyc.com), which attempts to capture all the basic information necessary for what we call “common sense.” The CYC team has already compiled millions—I do not exaggerate—of such statements, and it hasn’t reached completion. Yet common sense is only a subset of what’s needed to express all the laws of drama. Obviously, the task of applying this boolean methodology to interactive storytelling is far beyond our reach.
Does that not suggest the task is impossible? How do you know that a numerical approach will be any less humongous?
Consider a drawing, shown in Figure 19.3, based on the previous images.
At heart, Figure 19.3 is a collection of curving lines, not a collection of pixels. The images in Figure 19.2 are composed of pixels, all black and white; the more pixels, the better the image. Figure 19.3’s drawing is not composed of pixels, however; its core data structure is a curve, which is a numeric data structure, not a boolean one. Note how the drawing captures the human essence of the images in Figure 19.2 without having to store thousands of pixels: it consists of only a few dozen curves. The drawing is a more artistic construction because it emphasizes the parts of the image you’re interested in (eyes, mouth, and hand) and suppresses those you’re not interested in (clothing, details of the hat, wrinkles). Sure, the best bitmaps look more realistic, but they come at a high cost. The drawing in Figure 19.3 requires less information than the first-try bitmap, yet it is clearly superior. In other words, for a certain amount of work on your part, the numeric approach gives you more fidelity than the boolean approach.
FIGURE 19.3: A drawing, not a bitmap.
Scott McCloud makes this point powerfully in his book Understanding Comics. He discusses it in terms of the level of abstraction at which an image is presented, arguing that greater abstraction gives an image wider applicability. He offers a visual representation of the concept at www.scottmccloud.com/inventions/triangle/triangle.html.
This is what the current emphasis on boolean thinking misses. At the lowest levels of effort, the boolean approach is more readily controllable and easier to work with, but in tackling big complex problems such as interactive storytelling, the boolean approach requires more true/false combinations than can be realistically assembled.
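The scale of the problem can be stated arithmetically: n independent true/false facts define 2^n distinct world states that a complete rule base must, in principle, cover. A tiny back-of-the-envelope sketch (my own illustration):

```python
# n independent boolean facts yield 2**n possible world states.
# Even modest fact counts put exhaustive rule coverage out of reach.
for n in (10, 40, 100):
    print(f"{n} boolean facts -> {2 ** n} possible world states")
```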
The following sections are a synopsis of the various research reports I have been able to discover.
The late 1980s saw lots of excitement in the computer science community over an idea referred to as “intelligent agents.” The concept is to create a body of software that addresses a particular task and give that software some kind of human face. Instead of designing everything in machine-like terms, agent technology approached the problem as a matter of teaching a computer agent to behave properly. This approach was largely a matter of changing the software designer’s point of view, but much of this thinking bled over into other areas. The most prominent example is the “help wizard” who appeared onscreen to give advice, answer questions, and offer help. These wizards didn’t work out because they were much dumber than any wizard, so user-wizard interaction was largely frustrating for users. Eventually designers of these systems laid off most of their user interface wizards and shipped the survivors to a back room.
A number of researchers, however, saw merit in the idea of applying agent technology to entertainment software. Dr. Joseph Bates led the first effort in this direction, the Oz Project, at Carnegie Mellon University in the early 1990s¹.
The agents, called “woggles,” were balls with eyes that could move around their environment by bouncing. Most of the research effort went into designing good personality models for the woggles, which could in turn be translated into interesting behavior.
The crucial flaw in this effort, in my opinion, lay in the emphasis on the character model at the expense of the verb set. The woggles are interesting creatures, but they can’t do very much, because verbs were built into the system only to demonstrate woggle personality. My definition of interactivity places all the emphasis on the verbs in the design. By giving short shrift to verbs, the designers created a completely computable but ultimately moribund virtual universe.
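The complaint can be restated computationally: however rich the personality model, the player's meaningful options each turn are bounded by the size of the verb set. The toy sketch below is entirely my own illustration; the attribute names and verb lists are invented, not taken from the Oz Project:

```python
# An elaborate character model does not, by itself, create interactivity...
woggle_personality = {
    "timidity": 0.8, "curiosity": 0.6, "aggression": 0.1,
    "sociability": 0.9, "playfulness": 0.7,
}

def choices_per_turn(verbs, targets):
    # Each turn the player picks one verb and one target, so the breadth
    # of play is the product of the two counts.
    return len(verbs) * len(targets)

targets = ["woggle_a", "woggle_b", "woggle_c"]
print(choices_per_turn(["bounce"], targets))                             # 3
print(choices_per_turn(["bounce", "greet", "chase", "flee", "trade"],
                       targets))                                         # 15
```

However finely tuned the personality numbers, a one-verb world offers the player only a shallow tree of choices; enriching the verb set multiplies it directly.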