How to Build a Virtual Human
Virtual Humans is the first book with instructions on designing a "V-human," or synthetic person. Using the programs on the included CD, you can create animated computer characters who can speak, dialogue intelligently, show facial emotions, have a personality and life story, and be used in real business projects. These excerpts explain how to get started.
To be published in Virtual Humans, AMACOM, November 2003. Published on
KurzweilAI.net October 20, 2003.
About 30% of building a virtual human is in the engine. A good
engine will make it easy for you to create a believable personality.
It provides functions that allow things like handling complex sentences,
bringing up the past and learning better responses if one doesn't
work. But in the end, it's your artistry that gives the entity its
charm.
There are many natural language approaches that can handle the
job. Simple pattern matching engines are the least sophisticated
and most useful of them all. With the rash of recent interest,
I'm not going to pretend I know all the nuances of all the engines
out there. Instead, I'll concentrate on using simple software to
build complex personalities. Together we will build a clever virtual
person using a mind engine kindly supplied by Yapanda Intelligence,
Inc. of Chickasha, Oklahoma. I selected this one because it can drive
a real-time 3D head animation with lip-synch. Nevertheless, the
basic steps in creating a virtual personality are platform independent.
I've included some additional engines to play with. The most powerful
is ALICE. She's an implementation of Artificial Intelligence Markup
Language (AIML). Alice source code is available to those of you
who want to modify it and build your own Virtual Human engine, adding
your own special features. I've also included a copy of Jacco Bikker's
WinAlice for PC users. It demonstrates some unique features such
as the ability to bring up ancient history and to learn new responses
from you.
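To make the idea of a simple pattern matching engine concrete, here is a minimal sketch in Python. It is not the Yapanda engine, ALICE, or WinAlice; the class name PatternEngine and its methods are made up for illustration. It only shows the general shape: a list of patterns, each with a canned response, plus a way to teach the engine a better response when one doesn't work.

```python
import re


class PatternEngine:
    """Toy pattern-matching responder (illustrative only, not a real engine)."""

    def __init__(self):
        # Each rule pairs a compiled pattern with a response template.
        self.rules = []

    def add_rule(self, pattern, response):
        self.rules.append((re.compile(pattern, re.IGNORECASE), response))

    def learn(self, pattern, response):
        # A newly taught rule goes first, so it overrides older rules.
        self.rules.insert(0, (re.compile(pattern, re.IGNORECASE), response))

    def reply(self, text):
        for pattern, response in self.rules:
            match = pattern.search(text)
            if match:
                # Captured groups fill {0}, {1}, ... in the template.
                return response.format(*match.groups())
        # Fallback when no rule matches.
        return "Tell me more."


engine = PatternEngine()
engine.add_rule(r"\bmy name is (\w+)", "Nice to meet you, {0}.")
engine.add_rule(r"\bhello\b", "Hello there.")
```

Calling engine.reply("My name is Joanne") picks the first rule and fills in the captured name. Everything that makes the character charming lives in the rules you write, not in the loop that matches them.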
I'll talk more about the actual engines in chapter three. But
it's important to realize that the software you use to build your
virtual human is just a tool for expressing your artistry.
The most important and least understood part of virtual humans
-- their personalities -- is our focus. We are going to have some
serious fun. Let's look at some uses for virtual people.
Good For Business
From a business perspective virtual humans with a personality are
a major boon. Imagine a person signing onto your web page. There's
already a cookie that contains significant information about them,
gathered by your virtual host on the guest's first visit. The encounter
might go a bit like this:
Host: "Hey, Joanne, it's nice to see you again." <smile>
Joanne: "You remember me?"
Host: "Of course I do. But it's been a while. I missed you." <expression>
Joanne: "Sorry about that, I've been really busy."
Host: "So did you read 'The Age of Spiritual Machines'?"
Joanne: "Yeah, it was really interesting. <beat> Are you one
of them?"
Host: "Not yet, I'm afraid, but I'm working on it. <beat> Before
I forget, you should know about Greg Stock's new book on how to live
to be 200 plus years old!"
Joanne: "I read his last book and liked it. Can you send me a copy?"
Host: "Sure, we have it in stock <grin>. Same charge, same
place?"
Joanne: "Yup. Also, do you have any books on Freestyle Landscape
Quilting?"
Host: "I'll check. <beat> Hold on a few more seconds. Okay,
I found two…"
And so forth. You can see that Virtual humans bring back that personal
touch so sorely missing in commerce today. Believe it or not, I've
observed people from every level of sophistication and background
respond positively to personal attention from a Virtual Human. It
feels good.
Your marketing software can be made to generate marketing variables
that can be fed to your virtual human host: Joanne's buying patterns,
personal information like her date of birth, etc. Trust is a big
issue, so such data must be handled with respect for the client
and used in clever ways. Imagine when Joanne comes online within
a week of her birthday and Host sings happy birthday to her. Hokey?
Yes. Appealing? You bet. I've also discovered that many people
tolerate hokey behavior from V-people. It's a bit like the way
we tolerate, even appreciate, the squash-and-stretch exaggeration
in animated film characters. Of course Host would not want to sing
happy birthday to every customer. She has to know how to tell which
is which. Later in the book we'll look into using unobtrusive personality
assessment to provide those cues. This is one of the most important
and most neglected tools you have. You'll see why later.
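The birthday example can be sketched as a small check that the host runs when a visitor logs in. This is only an illustration: the function names and the tolerates_hokey flag are hypothetical stand-ins for whatever your marketing software and personality assessment actually supply.

```python
from datetime import date


def days_from_birthday(today, birth_date):
    """Days between today and the nearest occurrence of the birthday."""
    def occurrence(year):
        try:
            return birth_date.replace(year=year)
        except ValueError:
            # February 29 in a non-leap year: fall back to February 28.
            return birth_date.replace(year=year, day=28)

    # Check last year, this year, and next year so that visits around
    # New Year's still find the closest birthday.
    candidates = [occurrence(today.year + offset) for offset in (-1, 0, 1)]
    return min(abs((today - c).days) for c in candidates)


def should_sing(today, birth_date, tolerates_hokey, window_days=7):
    """Sing only for visitors who would enjoy it and are near their birthday."""
    return tolerates_hokey and days_from_birthday(today, birth_date) <= window_days
```

The tolerates_hokey argument is where the personality cues come in: the date arithmetic is the easy part, and deciding who wants to be sung to is the hard part.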
An advantage of rule based approaches is that you can have multiple
sets of rules, each one with responses specifically honed to a specific
task or person or language. For example when Joanne logs in, her
cookie can initiate the uploading of a rule database tailored specifically
to her general personality and buying patterns. That means that
when a rule triggers, it will respond in a way likely to make Joanne
comfortable while meeting her needs. Next a person from Korea logs
on and the host switches to a Korean intelligencebase, greeting
the client in that language. One well designed host can handle
orders in more than 20 languages. This clearly presents opportunities
for small companies to expand internationally.
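Here is a rough sketch of how a host might swap rule sets based on what the visitor's cookie says. The cookie is modeled as a plain dictionary, and the rule sets hold a single greeting each; the names RULE_SETS, select_rules, and greet are invented for this example and do not come from any particular engine.

```python
# Hypothetical rule sets keyed by language code. A real system would
# load a full rule database per language or per customer profile.
RULE_SETS = {
    "en": {"greeting": "Hey, {name}, it's nice to see you again."},
    "ko": {"greeting": "{name} 님, 다시 만나서 반갑습니다."},
}
DEFAULT_LANGUAGE = "en"


def select_rules(cookie):
    """Pick the rule set named by the cookie, falling back to the default."""
    language = cookie.get("language", DEFAULT_LANGUAGE)
    return RULE_SETS.get(language, RULE_SETS[DEFAULT_LANGUAGE])


def greet(cookie):
    rules = select_rules(cookie)
    return rules["greeting"].format(name=cookie.get("name", "friend"))
```

The same lookup could key on a personality profile instead of a language code. The point is that the host stays the same while the rules behind her change.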
Depending on your type of business or usage, Virtual Human needs
will vary. For example, voice-only virtual humans are already very
active in phone information and ordering systems. They don't have
much personality yet, but we're going to work on that. In fact there
are a number of different types of virtual humans and we'll be building
one up from the simplest to one of the more complex with a 3D animated
talking head. By taking it step by step you'll be amazed at your
own ability to master Virtual Human design.
A good Virtual Human should be able to cope with language. Changing
language should be as easy as switching databases and voice engines.
Monica Lamb, a Native American scientist and V-person developer,
has used Alice to build a V-person that teaches and speaks Mohawk.
At a minimum, your V-person will be able to handle general conversational
input by voice or keyboard, parse that input to arrive at appropriate
behaviors, and output behavior as text or speech, on-screen information,
and/or machine commands to software or external devices. It should
also have a face display capable of at least minimal emotional expression
such as smile, frown and neutral. I prefer a 3D face capable of
complex emotional expression that is part of the communication system.
This is a tall order, but I believe we can handle it. Here's an
interesting example of how one creative company has used this technology
in a mechanical robot:
Redgate Technologies is a company that thrives on invention. They
became interested in Natural Language Processing (NLP) early on.
They had invented a new chip technology to monitor and control complex
technical systems. NLP was useful for interpreting the complex codes
generated by their chips. Just for fun, they expanded their NLP
engine to represent several personalities. They quickly discovered
that a virtual human hooked into their system became a super-capable
assistant to a human supervisor. Imagine one on a space station,
keeping track of all mechanical systems and keeping the inhabitants
company with casual conversation. For luck we won't name her HAL.
A wonderful example of this V-person species is Redgate's Sarha.
She's an innovative virtual human interface for industrial monitoring
and control. Sarha stands for "Smart Anthropomorphic Robotic Hybrid
Agent." Redgate has used NLP pattern matching to monitor an entire
industrial complex. The Virtual Human system they devised sends
out queries to specialized monitoring modules using the special
Redgate chips. She then reads and interprets the encoded feedback
in spoken English, issuing warnings when conditions warrant. She
can also take emergency action on her own, if necessary. Her supervisor
communicates with her in spoken English, asking her to start processes
or check specific conditions. In a demonstration of Sarha's application
to home security, she reported "Anthony, someone left the garage
door open." Anthony replied "Close it for me will you please, Sarha?"
And of course she does.
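The garage door exchange shows the whole loop in miniature: monitor something, report in plain English, parse the spoken reply, and act on it. The sketch below is a guess at that shape only. It says nothing about how Redgate actually built Sarha; the Monitor class, its device names, and its crude keyword parsing are all assumptions made for illustration.

```python
class Monitor:
    """Toy monitoring agent: reports open devices and obeys open/close requests."""

    def __init__(self, devices):
        self.devices = dict(devices)   # device name -> "open" or "closed"
        self.last_mentioned = None     # lets "it" refer to the last device reported

    def report(self, operator):
        """Return a spoken warning for the first device found open, or None."""
        for name, state in self.devices.items():
            if state == "open":
                self.last_mentioned = name
                return f"{operator}, someone left the {name} open."
        return None

    def handle(self, text):
        """Crude command parser: looks for 'close' or 'open' plus a target."""
        words = text.lower()
        verb = "close" if "close" in words else "open" if "open" in words else None
        if verb is None:
            return "I'm not sure what you want me to do."
        # Use a device named in the request, else whatever was mentioned last.
        target = next((n for n in self.devices if n in words), self.last_mentioned)
        if target is None:
            return "Which one do you mean?"
        self.devices[target] = "closed" if verb == "close" else "open"
        self.last_mentioned = target
        return f"Okay, the {target} is now {self.devices[target]}."
```

Remembering the last thing mentioned is what lets "Close it for me" work without the operator repeating "garage door." Small touches like that go a long way toward making the exchange feel like conversation.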
The thing I like most about Sarha is her personality. She makes
personal comments; even chides her operator, whom she knows by name.
As a demonstration, Sarha was installed into a fully robotic interface
that could move around, point to objects and complain about and
avoid objects in her path. She was linked by microwave to a control
computer she used to monitor her charges. She even gave a brief
talk on those special chips Redgate designed to transmit monitoring
data back to her. She reached into a bowl, pulled out a chip, pointed
at it with a metal finger and started her spiel. Later she took
questions. All the while she was monitoring various systems. She
even brought a loud monster generator on-line in another room during
the demonstration.
Perhaps one of the most important applications for Virtual Human
technology is in teaching. I've found that young people have trust
issues with the educational system. I can't blame them when administrators
waste millions on bad decisions but there aren't enough books to
go around. Virtual teachers seem separated from all this. It's
hard to attribute ulterior motives to an animated character, even
if she is smart and talkative and knows you by name. Properly scripted,
a V-teacher can get to know a student on a personal basis. The
real human teacher can feed her personal tidbits she can bring up
during a lesson:
"So Bill, is it true you threw the winning touchdown in Saturday's
game?"
"Yeah, how'd you know about that?"
"Hey, I keep on top of things. Congratulations. Now let's teach
you how to estimate the diameter of an oleic acid molecule."
Young children can be fascinated by virtual people. I got a call
from a retired engineer from rural New Mexico. He had spent a lot
of time tweaking the voice input on his V-person so that she would
understand his very bright 3-year-old granddaughter, and had a
story to tell me. He'd been remarkably successful and the little
girl spent hours in happy conversation with her virtual friend.
One evening a few neighbors came by to play Canasta. While they
were playing, the little girl came into the adjoining room and fired
up her computer. In moments an animated conversation ensued.
One of the neighbors, a devout fundamentalist Christian, became terrified
and insisted that the engineer smash the girl's computer immediately,
saying it was inhabited by the devil. The engineer refused, of course.
He told me he'd been using the virtual character to teach his
granddaughter everything from her ABCs to simple math. I gave him some
unpublished information on how to get the V-person to record the
granddaughter's responses to questions, so he could check on them later.
The point is, in creative hands virtual humans already have enormous
potential and the platforms are constantly improving.
Blending art, technology and a little psychology allows us to take
a functional leap, decades ahead of pure artificial intelligence.
Although the simple VH software of today will eventually be replaced
by highly sophisticated neural nets or entirely new kinds of computing,
it will be a long time before they'll have unique human-like personalities… if
ever. Meanwhile let's give the evolution of technology a kick in
the butt by building really smart, personable virtual people today.
Because creating a believable synthetic personality is more of
an art than science, it's important that we get a feel for how we
humans handle our conscious lives. It's part philosophy, part psychology
and believe it or not, part quantum physics. We'll start by comparing
people and computers, without getting too philosophically crazed.
Any discussion of the human mind must consider consciousness. It's
a danger zone and I already know the discussions to follow will
dump me smack into the boiling kettle. I'll walk you through the
important parts. Disagree and send me nice email if you like.
Coming up in chapter two we'll explore the nature of consciousness
and why it's an essential consideration in virtual human design.
Synthespians: Virtual Acting (Chapter 13)
with Ed Hooks
Virtual people have to convince us they have wheels spinning inside.
They do, of course, have electrons spinning in service of the plot,
but if they don't show it on their faces, we just don't buy it.
We're used to seeing people think. It's true; thought is conveyed
through action.
Although I'm remarkably opinionated about acting in animation,
I'm not a certified expert on the subject--Ed Hooks is. He teaches
acting classes for animators internationally, and has held workshops
for companies such as Disney Animation (Sydney), Tippett Studio
(Berkeley), Microsoft (Redmond, Washington), Electronic Arts (Los
Angeles), BioWare (Edmonton, Canada), and PDI (Redwood City, California).
Among his five books, Acting for Animators: The Complete Guide to
Performance Animation (Heinemann, revised edition, September 2003)
has been a major hit.
The Seven Essential Concepts in Face Acting
The following concepts are interpretations of Ed Hooks' "Seven
Essential Acting Concepts." We've adapted them here to focus
on the V-people and their faces.
1. The face expresses thoughts beneath. The brain, real or artificial,
is the most alive part of us. Thinking, awareness, and reasoning
are active processes that affect what's on our face. Emotion happens
as a result of thinking. Because these characters don't have a natural
link between thinking and facial expression, your job as animator
is to create those links. In effect, you want your synthetic brain
to emulate recognizable human cognition on the face, which leads
to the illusion of real and appropriate emotions.
2. Acting is reacting. Every facial expression is a reaction to
something. Even the slightest head and hand movement in reaction
to what's happening can be most convincing. If the character tilts
its head as you begin to speak to it, or nods on occasion in agreement,
you get the distinct feeling of a living person paying attention.
A double take shows surprise. Because you have very few body parts
to work with, you have a superb challenge in front of you.
3. Know your character's objective. Your character is never static.
He is always moving, even if the movement is the occasional twitch,
a shift of the eye, or a blink. Your objective is to endow your
character with the illusion of life. As such, it is wise to follow
Shakespeare's advice, "Hold the mirror up to nature" (Hamlet,
III.ii.17-21). Notice that when a person listens, she may tilt
her head to the side or glance off in the distance as she contemplates
and integrates new information. When she smiles and says nice things
to you, her objective is to please. Always know what your character's
objective is because it is the roadmap linking behaviors to their
goals. Knowing her personality and history are essential here.
4. Your character moves continuously from action to action. Your
character is doing something 100 percent of the time. There must
always be life! Even if she appears to be waiting, things are going
on mentally. Make a list of boredom behaviors and use them. When
people talk, a good emotion extraction engine will feed her cues
on how to react to what's being said. Her actions expressing emotional
responses are fluid. They flow into each other forming a face story.
You should be able to tell from the character's expression how she's
reacting to what you're saying. Say she takes a deep breath and
you see the cords on her neck tighten. They then relax. Her body
slumps a bit and perhaps she nods. Always in motion, she maintains
the illusion of life.
5. All action begins with movement. You can't even do math without
your face moving, exposing wheels spinning beneath. Your eyes twitch.
You glance at the ceiling, pondering. Your brow furrows as you struggle
with the solution. Try this experiment: Ask a friend to lie as still
as possible on the floor. No movement at all. Then, when he is absolutely
stone still, ask him to multiply 36 by 38. Pay close attention to
his eyes. You will note that they immediately begin to shift and
move. It is impossible to carry out a mental calculation without
the eyes moving. Sometimes movement on the screen needs to be a
bit more overt than in real life. That's okay, even essential. It
nails down the emotion. Done right, people won't notice the exaggeration,
but will get the point.
6. Empathy is audience glue. The main transaction between humans
and Virtual humans has to be emotion, not words. Words alone will
lose them. You will catch a viewer's attention if your character
appears to be thinking, but you will engage your viewer emotionally
if your character appears to be feeling. You must get across how
this V-person feels about what's going on. If you do it successfully,
the audience will care about (empathize with) those feelings. I
promise you it can be done. A great autonomous character can addict
an audience in ways a static animation cannot. The transaction between
audience and character is in real-time and directly motivated, much
as it is on stage. This is a unique acting medium, which is part
live performance and part animation. It's an opportunity for you
to push things--experiment with building empathy pathways.
7. Interaction requires negotiation. You want a little theatrical
heat in any discourse with a V-person. To accomplish this, remember
that your character always has choices. We all do, in every waking
moment. The character has to decide when and whether to answer or
initiate a topic. If your character is simply mouthing words, your
audience response will be boredom. Whether they know it or not,
people want to be entertained by your character. Antonin Artaud
famously observed that "actors are athletes of the heart."
Dead talk is not entertaining. There must be emotion. Recognize
that you're working with a theatrical situation and that the viewer
will crave more than a static picture.
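Concepts 2 and 4 above suggest a simple control rule: react when there is something to react to, and otherwise keep the face alive with small idle movements. Here is a minimal sketch of that rule. The emotion labels, expression names, and idle behaviors are placeholders I made up, and the emotion cue is assumed to come from whatever emotion extraction your engine provides.

```python
import random

# Hypothetical mapping from an extracted emotion cue to a face expression.
EXPRESSIONS = {"joy": "smile", "sadness": "frown", "surprise": "raised_brows"}

# Small "boredom behaviors" that keep the character from ever freezing.
IDLE_BEHAVIORS = ["blink", "eye_shift", "head_tilt", "slow_breath"]


def next_action(emotion_cue=None, rng=random):
    """React to a recognized cue; otherwise pick an idle behavior."""
    if emotion_cue in EXPRESSIONS:
        return EXPRESSIONS[emotion_cue]
    return rng.choice(IDLE_BEHAVIORS)
```

Call next_action on every animation tick and the character is always doing something, which is the illusion of life in its simplest form. Blending one action smoothly into the next is where the real animation work begins.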
Sure, there are loads more acting concepts we could talk about,
but these seven are the hard-rock core of it. You're faced with
a unique acting challenge because you have an animated character
that is essentially alive. If that character is a cartoon or anime
design and personality, you'll have to read Preston Blair, for
example, to learn the principles of exaggerated cartoon acting,
and then incorporate these squash and stretch type actions into
your character's personality. If you take the easier road and use
a photorealistic human actor, you still must make their actions
a bit larger than life, but not as magnified as cartoons demand.
The stage you set will depend on the Virtual actor's intention.
If he's there to guide a person around a no-nonsense corporate Web
site, you'll need to think hard about how much entertainment to
inject. Certainly you need some. Intelligent Virtual actors in games
situations--especially full-bodied ones--present marvelous opportunities
to expand this new field of acting. You'll know their intentions.
Let them lead you to design their actions. Embellish their personalities,
embroider their souls, and decorate their actions. Making them bigger
than life will generally satisfy.
Synthespians: The Early Years
Next I want to tell you about the clever term "Synthespian,"
which unfortunately I didn't coin. I do believe it should become
a part of our language.
Diana Walczak and Jeff Kleiser produced some early experimental
films featuring excellent solo performances by digital human characters.
For example, Nestor Sextone for President premiered at SIGGRAPH
in 1988. About a year later, Kleiser and Walczak presented the female
Synthespian, Dozo, in a music video: "Don't Touch Me."
These were not intelligent agents, but they were good actors. "It
was while we were writing Nestor's speech to an assembled group
of 'synthetic thespians' that we coined the term 'Synthespian,'"
explains Jeff Kleiser. Nestor Sextone had to be animated from digitized
models sculpted by Diana Walczak.
As history will note, the field of digital animation is a close,
almost incestuous one. Larry Weinberg, the fellow who later created
Poser, worked out some neat software that allowed Jeff and Diana
to link together digitized facial expressions created from multiple
maquettes she'd sculpted to define visemes. That same software allowed
them to animate Nestor's emotional expression. I've put a copy of
this wonderful classic bit of animation on the CD-ROM, with their
blessing.
Note that this viseme-linking was an early part of the development
chain leading to the morph targets you see in Poser and all the
high-end animation suites today. Getting your digitized character
to act was difficult in those days before bones, articulated joints,
and morphing skin made movement realistic. Nestor was made up of
interpenetrating parts that had to be cleverly animated to look
like a gestalt character without any obvious cracks or breaks or
parts sticking out.
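For readers who haven't met morph targets, the usual idea is that each target stores the same vertices as the neutral face in a different pose, and the animated face is the neutral face plus a weighted sum of the differences. The sketch below shows that arithmetic only. It is not how Poser or the original Kleiser-Walczak software was written, and the function name and data layout are assumptions for illustration.

```python
def blend_morph_targets(base, targets, weights):
    """Blend vertex positions: base + sum(weight * (target - base)).

    base    -- list of (x, y, z) tuples for the neutral face
    targets -- dict mapping a target name to a list of (x, y, z) tuples
    weights -- dict mapping a target name to a blend weight, usually 0.0 to 1.0
    """
    blended = []
    for i, base_vertex in enumerate(base):
        vertex = list(base_vertex)
        for name, weight in weights.items():
            target_vertex = targets[name][i]
            for axis in range(len(vertex)):
                # Add this target's offset from neutral, scaled by its weight.
                vertex[axis] += weight * (target_vertex[axis] - base_vertex[axis])
        blended.append(tuple(vertex))
    return blended
```

A weight of 0.0 leaves the neutral face untouched and 1.0 reproduces the target exactly. Mixing a viseme target with an emotion target at partial weights is what lets a character speak and smile at the same time.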
In most cases, V-people don't have a full body to work with, just
a face, and perhaps hands. Body language is such an effective communications
tool, but when we just don't have it we end up putting twice as
much effort into face and upper body acting. Fortunately a properly
animated face can be wonderfully expressive, as shown in Figure
13-1.
Figure 13-1: Virtual actors can really show emotion
Synthespians All Have a Purpose
A Synthespian playing a living person is probably the trickiest
circumstance you'll encounter. Depending on the situation, you want
to emulate that person's real personality closely, or exaggerate
it for comedic impact or political statement. If you exaggerate
features and behavior heavily you've entered a new art form: interactive
caricature or parody.
Let's say we've built a synthetic Secretary of Defense Donald
Rumsfeld. The interactive theatrical situation is that we are interrupting
him while he is hectically planning an attack somewhere in the world.
He might be impatient and have an attitude regarding our utter stupidity
and lack of patriotism for bothering him at a time like this. His
listening skills might be shallow. He might continually give off
the dynamic that he has better things to do. By thus exaggerating
his personality, we create interest and humor. As a user, you want
to interact because you feel something interesting is happening.
There is comic relief, and all the while this character is making
a political statement. I suspect Rumsfeld would get a kick out of
such a representation, as long as it's done in good taste.
Action conveys personality, and you can't set up a virtual actor
without knowing the character well. For example, Kermit the Frog
has a definite psychology behind him. As a Web host, he is just
very happy to be there. He enjoys being in the spotlight, and his
behavior strongly implies he doesn't want to be any place else.
He's happy to show you around his Web site, and he might even break
out in song along the way. Occasionally he'll complain about Miss
Piggy's lack of attention or the disadvantages of his verdant complexion.
Think first about your intention and then the character's intention.
Mae West and Will Rogers wanted to make 'em laugh. No matter what
your purpose for a Synthespian, you want it to entertain. Sometimes
it may be understated. Remember that cleverness is always in style.
Notice the look people get on their faces when they think they're
being clever. It's usually an understated cockiness that shows around
the eyes. The intention is to be clever, the words are smart, but
remember to add that subtle touch of smugness or self-satisfaction
around the eyes and the corners of the mouth.
Note: There is a new book titled Emotions Revealed: Recognizing
Faces and Feelings to Improve Communication and Emotional Life,
by Paul Ekman (Times Books, 2003), which is well worth your time
to read. Ekman, who is professor of psychology in the department
of psychiatry at the University of California Medical School, San
Francisco, is one of the world's great geniuses on the subject of
the expression of emotion in the human face. His new book has more
than one hundred photographs of nuanced facial expression, complete
with explanations for the variances.
As an aside, I used to train counter-terrorist agents in psychological
survival. One way to spot a terrorist in a crowd is that they often
have facial expressions that are inappropriate to the situation.
I used Ekman's work as a reference to help my agents recognize when
facial expression and body language don't match up, an indication
often exhibited by potential terrorists. You can use Ekman's work
to make sure your V-human agents have appropriate expressions for
the situation.
You Are the Character
When you've done your homework, you'll know your character like
you know yourself. You'll identify with the character so intensely
you will have the sensation of being that character. Stage actors
learn to create characters by shifting from the third person to
the first person reference. Instead of saying, "My character
would be afraid in this situation," a stage actor might say,
while portraying the character, "I feel afraid." In your
case, you are creating a second-party character, but you're empathizing
personally with the emotions of your own creation. There is an
identity between the two of you that will be both fun and compelling.
Designing animation elements for the character requires feeling
them. I remember watching my daughter as she animated a baby dragon
early in her career. Her natural instinct was to get inside that
baby dragon and be it. I smiled as I watched her body and face contort
as she acted out each part of the sequence. Her instruction had
not come from me…it was intuitive. At Disney, I've watched
animators making faces in little round mirrors dangling from extension
arms above their desks. They glance in the mirror, make a face and
then look at the cel and try to capture what they've seen. That
part hasn't changed. For us it's glance at the mirror, glance at
the screen, and then tweak a spline or morph setting. You won't
be able to do all this with the simple animation tools I've given
you for free. Those are just to get you hooked. If you intend to
learn this stuff, get ready to invest heavily in time and commitment
and a fair amount in coin as well. A small investment considering
the return.
If You Want to Go Further
There are great animation schools, and this continent has some
of the best. My favorite is at Sheridan College in Oakville, Ontario.
But there are many good schools here in the United States as well.
A few years ago, most of them were a waste of money. But things
have improved. Do some Web research and find which school can best
help you meet your goals. There is a long-term need for talented,
well-trained character animators, and in general the pay for the
talented is phenomenal.
If you're a developer, you have to be familiar with all this stuff
to manage it effectively. You're responsible for the final product.
If you have animators working for you, believe in them, give them
freedom, but guide them toward your vision as well. The best animated
characters reflect the wisdom, vision, and artistry of their prime
artists and the producers behind them. A great producer is an artist,
a business person, and a technician. It's not easy to get there,
and too many producers only have the business end down. As a producer,
you have to understand the artistry of production. You have to feel
the emotion of good animation. How else will you know what to approve
and not approve? So learn it and you'll be way above the crowd.
I want to thank Ed Hooks for contributing his wisdom to this chapter.
Remember, what you've read here is just a taste of what you need
to learn. If you're lucky, you'll find a way to take a live class
with Ed, who now lives in the Chicago area. It will change your
perspective forever.
In the upcoming chapter, I'm going to kick it up a notch with
ways to give your character true awareness of his surroundings.
Imagine your well-developed character, now able not only to listen
and talk, but actually to see you, look you in the eyes, and recognize
you without asking. You don't want to miss this one.
Ed Hooks, author of Acting for Animators (Heinemann, Revised Second
Edition 2003), has been a theatre professional for three decades and
has taught acting to both animators and actors for PDI, Lucas Learning,
Microsoft, Disney Animation, and other leading companies.
© 2004 Peter Plantec