Chapter 9: Bit Beliefs
Originally published by Henry Holt and Company 1999. Published on KurzweilAI.net May 15, 2003.
For as long as people have been making machines, they have been
trying to make them intelligent. This generally unsuccessful effort
has had more of an impact on our own ideas about intelligence and
our place in the world than on the machines' ability to reason.
The few real successes have come about either by cheating, or by
appearing to. In fact, the profound consequences of the most mundane
approaches to making machines smart point to the most important
lesson that we must learn for them to be able to learn: intelligence
is connected with experience. We need all of our senses to make
sense of the world, and so do computers.
A notorious precedent was set in Vienna in 1770, when Wolfgang
von Kempelen constructed an automaton for the amusement of Empress
Maria Theresa. This machine had a full-size, mustachioed, turbaned
Turk seated before a chess board, and an ingenious assortment of
gears and pulleys that enabled it to move the chess pieces. Incredibly,
after von Kempelen opened the mechanism up for the inspection and
admiration of his audiences and then wound it up, it could play
a very strong game of chess. His marvel toured throughout Europe,
amazing Benjamin Franklin and Napoleon.
After von Kempelen's death the Turk was sold in 1805 to Johann
Maelzel, court mechanic for the Habsburgs, supporter of Beethoven,
and inventor of the metronome. Maelzel took the machine further
afield, bringing it to the United States. It was there in 1836 that
a budding newspaper reporter, Edgar Allan Poe, wrote an exposé that
duplicated an earlier London analysis of how it worked. The base
of the machine was larger than it appeared; there was room for a
small (but very able) chess player to squeeze in and operate the
machine.
While the Turk might have been a fake, the motivation behind it
was genuine. A more credible attempt to build an intelligent machine
was made by Charles Babbage, the Lucasian Professor of Mathematics
at Cambridge from 1828 to 1839. This is the seat that was held by
Sir Isaac Newton, and is now occupied by Stephen Hawking. Just in
case there was any doubt about his credentials, his full title was
"Charles Babbage, Esq., M.A., F.R.S., F.R.S.E., F.R.A.S., F. Stat.
S., Hon. M.R.I.A., M.C.P.S., Commander of the Italian Order of St.
Maurice and St. Lazarus, Inst. Imp. (Acad. Moral.) Paris Corr.,
Acad. Amer. Art. et Sc. Boston, Reg. Oecon. Boruss., Phys. Hist.
Nat. Genev., Acad. Reg. Monac., Hafn., Massil., et Divion., Socius.
Acad. Imp. et Reg. Petrop., Neap., Brux., Patav., Georg. Floren.,
Lyncei. Rom., Mut., Philomath. Paris, Soc. Corr., etc." No hidden
compartments for him.
Babbage set out to make the first digital computer, inspired by
the Jacquard looms of his day. The patterns woven into fabrics by
these giant machines were programmed by holes punched in cards that
were fed to them. Babbage realized that such cards could just
as well encode the sequence of instructions needed to perform
a mathematical calculation. His first machine was the Difference
Engine, intended to evaluate quantities such as the trigonometric
functions used by mariners to interpret their sextant readings.
At the time these were laboriously calculated by hand and collected
into error-prone tables. His machine mechanically implemented all
of the operations that we take for granted in a computer today,
reading in input instructions on punched cards, storing variables
in the positions of wheels, performing logical operations with gears,
and delivering the results on output dials and cards. Because the
mechanism was based on discrete states, some errors in its operation
could be tolerated and corrected. This is why we still use digital
computers today.
Babbage oversaw the construction of a small version of the Difference
Engine before the project collapsed due to management problems,
lack of funding, and the difficulty of fabricating such complex
mechanisms to the required tolerances. But these mundane details
didn't stop him from turning to an even more ambitious project,
the Analytical Engine. This was to be a machine that could reason
with abstract concepts and not just numbers. Babbage and his accomplice,
Lady Ada Lovelace, realized that an engine could just as well manipulate
the symbols of a mathematical formula. Its mechanism could embody
the rules for, say, calculus and punch out the result of a derivation.
As Lady Lovelace put it, "the Analytical Engine weaves algebraical
patterns just as the Jacquard Loom weaves flowers and leaves."
Although Babbage's designs were correct, following them went well
beyond the technological means of his day. But they had an enormous
impact by demonstrating that a mechanical system could perform what
appear to be intelligent operations. Darwin was most impressed by
the complex behavior that Babbage's engines could display, helping
steer him to the recognition that biological organization might
have a mechanistic explanation. In Babbage's own memoirs, Passages
from the Life of a Philosopher, he made the prescient observation
that
It is impossible to construct machinery occupying unlimited
space; but it is possible to construct finite machinery, and to
use it through unlimited time. It is this substitution of the infinity
of time for the infinity of space which I have made use of, to limit
the size of the engine and yet to retain its unlimited power.
The programmability of his engines would permit them, and their
later electronic brethren, to perform many different functions with
the same fixed mechanism.
It is fitting that the first computer designer was also the
first to underestimate the needs of the market, saying, "I propose
in the Engine I am constructing to have places for only a thousand
constants, because I think it will be more than sufficient." Most
every computer since has run into the same limit: its users wanted
to add more memory than its designers thought they would ever need.
He was even the first programmer to complain about lack of standardization:
I am unwilling to terminate this chapter without reference
to another difficulty now arising, which is calculated to impede
the progress of Analytical Science. The extension of analysis is
so rapid, its domain so unlimited, and so many inquirers are entering
into its fields, that a variety of new symbols have been introduced,
formed on no common principles. Many of these are merely new ways
of expressing well-known functions. Unless some philosophical principles
are generally admitted as the basis of all notation, there appears
a great probability of introducing the confusion of Babel into the
most accurate of all languages.
Babbage's frustration was echoed by a major computer company years
later in a project that set philosophers to work on coming up with
a specification for the theory of knowledge representation, an ontological
standard, to solve the problem once and for all. This effort was
as unsuccessful, and interesting, as Babbage's engines.
Babbage's notion of one computer being able to compute anything
was picked up by the British mathematician Alan Turing. He was working
on the "Entscheidungsproblem," one of a famous group of open mathematical
questions posed by David Hilbert in 1900. This one, the tenth, asked
whether a mathematical procedure could exist that could decide the
validity of any other mathematical statement. Few questions have
greater implications. If the answer is yes, then it could be possible
to automate mathematics and have a machine prove everything that
could ever be known. If not, then it would always be possible that
still greater truths could lie undiscovered just beyond current
knowledge.
In 1936 Turing proved the latter. To do this, he had to bring some
kind of order to the notion of a smart machine. Since he couldn't
anticipate all the kinds of machines people might build, he had
to find a general way to describe their capabilities. He did this
by introducing the concept of a Universal Turing Machine. This was
a simple machine that had a tape (possibly infinitely long), and
a head that could move along it, reading and writing marks based
on what was already on the tape. Turing showed that this machine
could perform any computation that could be done by any other machine,
by preparing it first with a program giving the rules for interpreting
the instructions for the other machine. With this result he could
then settle the Entscheidungsproblem for his one machine
and have the answer apply to all of the rest. He did this by showing that
it was impossible for a program to exist that could determine whether
another program would eventually halt or keep running forever.
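The contradiction at the heart of that proof is compact enough to sketch in a few lines of modern code. What follows is an illustrative Python sketch, not Turing's own construction; the function names are hypothetical placeholders.

```python
# Suppose, for the sake of argument, that a perfect halting oracle existed.
def halts(program, data):
    """Hypothetical: True if program(data) eventually stops, False if not."""
    raise NotImplementedError("Turing showed no such program can exist")

def contrary(program):
    # Do the opposite of whatever the oracle predicts the program will do
    # when fed its own text as input.
    if halts(program, program):
        while True:       # predicted to halt, so loop forever
            pass
    else:
        return            # predicted to run forever, so halt immediately

# Asking halts(contrary, contrary) traps the oracle: if it answers "halts,"
# contrary loops forever; if it answers "loops," contrary halts at once.
# Either way the oracle is wrong, so no such oracle can be written.
```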
Although a Turing machine was a theoretical construction, in the
period after World War II a number of laboratories succeeded in
making electronic computers to replace the human "computers" who
followed written instructions to carry out calculations. These machines
prompted Turing to pose a more elusive question: could a computer
be intelligent? Just as he had to quantify the notion of a computer
to answer Hilbert's problem, he had to quantify the concept of intelligence
to even clearly pose his own question. In 1950 he connected the
seemingly disparate worlds of human intelligence and digital computers
through what he called the Imitation Game, and what everyone else
has come to call the Turing test. This presents a person with two
computer terminals. One is connected to another person, and the
other to a computer. By typing questions on both terminals, the
challenge is to determine which is which. This is a quantitative
test that can be run without having to answer deep questions about
the meaning of intelligence.
Armed with a test for intelligence, Turing wondered how to go about
developing a machine that might display it. In his elegant essay
"Computing Machinery and Intelligence," he offers a suggestion for
where to start:
We may hope that machines will eventually compete with
men in all purely intellectual fields. But which are the best ones
to start with? Even this is a difficult decision. Many people think
that a very abstract activity, like the playing of chess, would
be best.
Turing thought so; in 1947 he was able to describe a chess-playing
computer program. Since then computer chess has been studied by
a who's who of computing pioneers who took it to be a defining challenge
for what came to be known as Artificial Intelligence. It was thought
that if a machine could win at chess it would have to draw on fundamental
insights into how humans think. Claude Shannon, the inventor of
Information Theory, which provides the foundation for modern digital
communications, designed a simple chess program in 1949 and was
able to get it running to play endgames. The first program that
could play a full game of chess was developed at IBM in 1957, and
an MIT computer won the first tournament match against a human player
in 1967. The first grandmaster lost a game to a computer in 1977.
A battle raged among computer chess developers between those who
thought that it should be approached from the top down, studying
how humans are able to reason so effectively with such slow processors
(their brains), and those who thought that a bottom-up approach
was preferable, simply throwing the fastest available hardware at
the problem and checking as many moves as possible into the future.
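The bottom-up recipe is easy to state in code. Here is a minimal, hypothetical sketch of that kind of fixed-depth lookahead, not Deep Blue's actual program; legal_moves, apply_move, and evaluate are stand-ins for a real engine's move generator and position scorer.

```python
def legal_moves(position):
    raise NotImplementedError("a real engine's move generator goes here")

def apply_move(position, move):
    raise NotImplementedError("returns the position after the move is played")

def evaluate(position):
    raise NotImplementedError("a static score: material, mobility, and so on")

def search(position, depth, maximizing):
    """Score a position by exhaustive lookahead to the given depth,
    assuming both sides always pick their best reply."""
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)          # static score of a leaf position
    scores = [search(apply_move(position, m), depth - 1, not maximizing)
              for m in moves]
    return max(scores) if maximizing else min(scores)
```

Each extra move of lookahead multiplies the number of positions to score, which is why this approach rewards raw hardware speed above almost everything else.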
The latter approach was taken in 1985 by a group of graduate students
at Carnegie Mellon who were playing hooky from their thesis research
to construct a computer chess machine. They used a service just
being made available through the Defense Advanced Research Projects
Agency (DARPA) to let researchers design their own integrated circuits;
DARPA would combine these into wafers that were fabricated by silicon
foundries. The Carnegie Mellon machine was called "Deep Thought"
after the not-quite-omniscient supercomputer in Douglas Adams's
Hitchhiker's Guide to the Galaxy.
In 1988 the world chess champion Garry Kasparov said there was "no
way" a grandmaster would be defeated by a computer in a tournament
before 2000. Deep Thought did that just ten months later. IBM later
hired the Carnegie Mellon team and put a blue suit on the machine,
renaming it "Deep Blue." With a little more understanding of chess
and a lot faster processors, Deep Blue was able to evaluate 200
million positions per second, letting it look fifteen to thirty
moves ahead. Once it could see that far ahead, its play occasionally took
on spookily human characteristics. Deep Blue beat Kasparov in 1997.
Among the very people you would expect to be most excited about
this realization of Turing's dream, the researchers in artificial
intelligence, the reaction to the victory has been curiously muted.
There's a sense that Deep Blue is little better than von Kempelen's
Turk. Nothing was learned about human intelligence by putting a
human inside a machine, and the argument holds that nothing has
been learned by putting custom chips inside a machine. Deep Blue
is seen as a kind of idiot savant, able to play a good game of chess
without understanding why it does what it does.
This is a curious argument. It retroactively adds a clause to the
Turing test, demanding that not only must a machine be able to match
the performance of humans at quintessentially intelligent tasks
such as chess or conversation, but the way that it does so must
be deemed satisfactory. Implicit in this is a strong technological
bias, favoring a theory of intelligence appropriate for a particular
kind of machine. Although brains can do many things in parallel,
they do any one thing slowly; therefore human reasoning must use
these parallel pathways to best advantage. Early computers were
severely limited by speed and memory, and so useful algorithms had
to be based on efficient insights into a problem. More recent computers
relax these constraints so that a brute-force approach to a problem
can become a viable solution. No one of these approaches is privileged—each
can lead to a useful kind of intelligence in a way that is appropriate
for the available means. There's nothing fundamental about the constraints
associated with any one physical mechanism for manipulating information.
The question of machine intelligence is sure to be so controversial
because it is so closely linked with the central mystery of human
experience, our consciousness. If a machine behaves intelligently,
do we have to ascribe a kind of awareness to it? If we do, then
the machine holds deep lessons about the essence of our own experience;
if not, it challenges the defining characteristic of being human.
Because our self-awareness is simultaneously so familiar and so
elusive, most every mechanism that we know of gets pressed into
service in search of an explanation. One vocal school holds that
quantum mechanics is needed to explain consciousness. Quantum mechanics
describes how tiny particles behave. It is a bizarre world, remote
from our sensory experience, in which things can be in many places
at the same time, and looking at something changes it. As best I've
been able to reconstruct this argument, the reasoning is that (1)
consciousness is mysterious, (2) quantum mechanics is mysterious,
(3) nonquantum attempts to explain consciousness have failed, therefore
(4) consciousness is quantum mechanical. This is a beautiful belief
that is not motivated by any experimental evidence, and does not
directly lead to testable experimental predictions. Beliefs about
our existence that are not falsifiable have a central place in human
experience—they're called religion.
I spent an intriguing and frustrating afternoon running over this
argument with an eminent believer in quantum consciousness. He agreed
that the hypothesis was not founded on either experimental evidence
or testable predictions, and that beliefs that are matters of faith
rather than observation are the domain of religion rather than science,
but then insisted that his belief was a scientific one. This is
the kind of preordained reasoning that drove Turing to develop his
test in the first place; perhaps an addendum is needed after all
to ask the machine how it feels about winning or losing the test.
The very power of the machines that we construct turns them into
powerful metaphors for explaining the world. When computing was
done by people rather than machines, the technology of reasoning
was embodied in a pencil and a sheet of paper. Accordingly, the
prevailing description of the world was matched to that representation.
Newton and Leibniz's theory of calculus, developed around 1670,
provided a notation for manipulating symbols representing the value
and changes of continuous quantities such as the orbits of the planets.
Later physical theories, like quantum mechanics, are based on this
notation.
At the same time Leibniz also designed a machine for multiplying
and dividing numbers, extending the capabilities of Pascal's 1645
adding machine. These machines used gears to represent numbers as
discrete rather than continuous quantities because otherwise errors
would inevitably creep into their calculations from mechanical imperfections.
While a roller could slip a small amount, a gear is much
less likely to slip. When Babbage started building machines to evaluate
not just arithmetic but more complex functions he likewise used
discrete values. This required approximating the continuous changes
by small differences, hence the name of the Difference Engine. These
approximations have been used ever since in electronic digital computers
to allow them to manipulate models of the continuous world.
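The trick Babbage mechanized, the method of differences, still fits in a few lines. As a rough illustration (in Python rather than brass), a polynomial's table can be extended using nothing but repeated addition once its leading differences are known:

```python
def difference_table(first_values):
    """Seed the engine: the leading value of each level of differences."""
    levels = [list(first_values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])
    return [level[0] for level in levels]

def extend(diffs, count):
    """Generate table entries by repeated addition alone, engine-style."""
    values = []
    work = list(diffs)
    for _ in range(count):
        values.append(work[0])
        # add each difference into the level above it; the last level stays constant
        for i in range(len(work) - 1):
            work[i] += work[i + 1]
    return values

# f(x) = x**2 takes the values 0, 1, 4 at x = 0, 1, 2; from those three seeds
# the table continues 9, 16, 25, ... without a single multiplication.
print(extend(difference_table([0, 1, 4]), 6))   # [0, 1, 4, 9, 16, 25]
```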
Starting in the 1940s with John von Neumann, people realized that
this practice was needlessly circular. Most physical phenomena start
out discrete at some level. A fluid is not actually continuous;
it is just made up of so many molecules that it appears to be continuous.
The equations of calculus for a fluid are themselves an approximation
of the rules for how the molecules behave. Instead of approximating
discrete molecules with continuous equations that then get approximated
with discrete variables on a computer, it's possible to go directly
to a computer model that uses discrete values for time and space.
Like the checkers on a checkerboard, tokens that each represent
a collection of molecules get moved among sites based on how the
neighboring sites are occupied.
This idea has come to be known as Cellular Automata (CAs). From
the 1970s onward, the group of Ed Fredkin, Tomaso Toffoli, and Norm
Margolus at MIT started to make special-purpose computers designed
for CAs. Because these machines entirely dispense with approximations
of continuous functions, they can be much simpler and faster. And
because a Turing machine can be described this way, a CA can do
anything that can be done with a conventional computer.
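As a rough illustration of the idea (far simpler than the lattice-gas machines the MIT group built), here is a one-dimensional cellular automaton in a few lines of Python. The update rule chosen, Rule 110, is known to be computationally universal, which is the sense in which a CA can do anything a conventional computer can:

```python
# Space is a row of cells, time advances in discrete steps, and each cell's
# next state depends only on itself and its two neighbors.
RULE = 110  # the 8 bits of this number define the update table

def step(cells):
    n = len(cells)
    nxt = []
    for i in range(n):
        # read the three-cell neighborhood (wrapping at the edges)
        pattern = (cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        nxt.append((RULE >> pattern) & 1)
    return nxt

# Start with a single occupied site and watch structure emerge.
cells = [0] * 40 + [1] + [0] * 40
for _ in range(20):
    print("".join(".#"[c] for c in cells))
    cells = step(cells)
```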
A cellular automaton model of the universe is no less fundamental
than one based on calculus. It's a much more natural description
if a computer instead of a pencil is used to work with the model.
And the discretization solves another problem: a continuous quantity
can represent an infinite amount of information. All of human knowledge
could be stored in the position of a single dot on a piece of paper,
where the exact location of the dot is specified to trillions of
digits and the data is stored in the values of those digits. Of
course a practical dot cannot be specified that precisely, and in
fact we believe that the amount of information in the universe is
finite so that there must be a limit. This is built into a CA from
the outset.
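A toy example makes the point concrete: any text can be packed into the digits of a single coordinate, so a position specified with unlimited precision would carry unlimited information. The encoding below is purely illustrative; a physical dot could never be placed or read this precisely.

```python
def encode(text):
    # pack each byte of the text into three decimal digits of a "position"
    digits = "".join(f"{b:03d}" for b in text.encode("utf-8"))
    return "0." + digits

def decode(position):
    # read the digits back out, three at a time
    digits = position[2:]
    chunks = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    return bytes(int(c) for c in chunks).decode("utf-8")

position = encode("all of human knowledge")
print(position)          # a single coordinate whose digits carry the text
print(decode(position))  # recovers: all of human knowledge
```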
Because of these desirable features, CAs have grown from a computational
convenience into a way of life. For the true believers they provide
an answer to the organization of the universe. We've found that
planets are made of rocks, that rocks are made up of atoms, that
atoms are made up of electrons and a nucleus, that the nucleus is made
up of protons and neutrons, and that these in turn are made up of quarks.
I spent another puzzling afternoon with a CA guru, who was explaining
to me that one needn't worry about this subdivision continuing indefinitely
because it must ultimately end with a CA. He agreed that experiments
might show behavior that appears continuous, but said that would just mean
they hadn't looked finely enough. In other words, his belief
was not testable. In other words, it was a matter of personal faith.
The architecture of a computer becomes a kind of digital deity that
brings order to the rest of the world for him.
Like any religion, these kinds of beliefs are enormously important
in guiding behavior, and like any religion, dogmatic adherence to
them can obscure alternatives. I spent one more happily exasperating
afternoon debating with a great cognitive scientist how we will
recognize when Turing's test has been passed. Echoing Kasparov's
"no way" statement, he argued that it would be a clear epochal event,
and certainly is a long way off. He was annoyed at my suggestion
that the true sign of success would be that we cease to find the
test interesting, and that this is already happening. There's a
practical sense in which a modern version of the Turing test is
being passed on a daily basis, as a matter of some economic consequence.
A cyber guru once explained to me that the World Wide Web had no
future because it was too hard to figure out what was out there.
The solution to his problem has been to fight proliferation with
processing. People quickly realized that machines instead of people
could be programmed to browse the Web, collecting indices of everything
they find to automatically construct searchable guides. These search
engines multiplied because they were useful and lucrative. Their
success meant that Web sites had to devote more and more time to
answering the rapid-fire requests coming to them from machines instead
of the target human audience. Some Web sites started adding filters
to recognize the access patterns of search engines and then deny
service to those addresses. This started an arms race. The search engines
responded by programming in behavior patterns that were more like
humans. To catch these, the Web sites needed to refine their tests
for distinguishing between a human and machine. A small industry
is springing up to emulate, and detect, human behavior.
Some sites took the opposite tack and tried to invite the search
engines in to increase their visibility. The simplest way to do
this was to put every conceivable search phrase on a Web page so
that any query would hit that page. This led to perfectly innocent
searches finding shamelessly pornographic sites that just happen
to mention airline schedules or sports scores. Now it was the search
engine's turn to test for human behavior, adding routines to check
whether the word usage on a page reflects language or just a list.
Splitting hairs, they had to further decide if the lists of words
they did find reflected reasonable uses such as dictionaries, or
just cynical Web bait.
Much as Garry Kasparov might feel that humans can still beat computers
at chess in a fairer tournament, or my colleague thinks that
the Turing test is a matter for the distant future, these kinds
of reasoning tasks are entering into an endgame. Most computers
can now beat most people at chess; programming human behavior has
now become a job description. The original goals set for making
intelligent machines have been accomplished.
Still, smart as computers may have become, they're not yet wise.
As Marvin Minsky points out, they lack the common sense of a six-year-old.
That's not too surprising, since they also lack the life experience
of a six-year-old. Although Marvin has been called the father of
artificial intelligence, he feels that that pursuit has gotten stuck.
It's not because the techniques for reasoning are inadequate; they're
fine. The problem is that computers have access to too little information
to guide their reasoning. A blind, deaf, and dumb computer, immobilized
on a desktop, following rote instructions, has no chance of understanding
its world.
The importance of perception to cognition can be seen in the wiring
of our brains. Our senses are connected by two-way channels: information
goes in both directions, letting the brain fine-tune how we see
and hear and touch in order to learn the most about our environment.
This insight takes us back to Turing. He concludes "Computing Machinery
and Intelligence" with an alternative suggestion for how to develop
intelligent machines:
It can also be maintained that it is best to provide the
machine with the best sense organs that money can buy, and then
teach it to understand and speak English. This process could follow
the normal teaching of a child. Things would be pointed out and
named, etc.
The money has been spent on the computer's mind instead. Perhaps
it's now time to remember that they have bodies, too.
WHEN THINGS START TO THINK by Neil Gershenfeld. ©1998 by
Neil A. Gershenfeld. Reprinted by arrangement with Henry Holt and
Company, LLC.