
Chapter 11: The Nature of Computation
by Neil Gershenfeld

Originally published by Henry Holt and Company 1999. Published on KurzweilAI.net May 15, 2003.


. . . will things that think be developed?

by taking advantage of what nature

already knows how to do

through research that maximizes

contact, not constraints

with education that happens as it is needed,

rather than in advance

by interconnecting systems of people and

things to solve hard problems


Information technology can be thought of as a pyramid. At the top are the supercomputers, the biggest, fastest machines, bought by the biggest, richest institutions. Then come the large mainframes that run companies. After that are the servers for a department, high-powered workstations for demanding applications, and then the common PC. As the cost goes down, the number of these things in use goes up.

Below the PC are emerging devices that promise to be used in still larger numbers.

A few hundred dollars buys a Personal Digital Assistant, a pocket-sized computer handy for mobile information access. For tens of dollars, the wristwatch has for many years been the one processor and display that people carry around with them. It is now acquiring new capabilities, such as a pager, or a GPS receiver, or the Swatch Access watches used as electronic ski-lift tickets. Europe is also leading the way with smart cards, computers the size of a credit card that are used much the same way, but that store electronic cash instead of requiring access to a remote computer. They were first used to eliminate the need for change to use pay phones, and are now spreading to all areas of electronic commerce. Finally, for tens of cents come the RFID chips that are used to tag things with an electronic identity. These are used in the badges that control access to a workplace, and in tracking laboratory animals. And that's where it stops. No one's been able to fabricate, package, and sell chips for less than about ten cents. While that might not appear to be a severe limit, think about what it leaves off: just about everything!

One of the most frequent challenges presented to me is finding a way to compute for pennies. Companies have a need to track the identity, location, and environment of the things they make, but in competitive markets they can't afford to add more than a few cents per item. While it would be nice to be able to walk out of a supermarket without waiting in line because the groceries come tagged with chips that can automatically be scanned by the door, or have your cupboard at home contact the grocery store when it needs restocking, these things won't happen if the silicon chips add more than a few cents to the cost of the potato chips. A business card that can call up a business's Web page would be convenient, but there wouldn't be a business left if its cards cost more than a few cents.

This is such an intensely commercial question that it becomes an interesting scientific one. It's not possible to reach one-cent computers just by cutting corners in the fabrication of conventional chips; the whole industry has been trying unsuccessfully to do that for many years. The only way to get there is to be smarter about taking advantage of what nature already knows how to do.

A clue is provided by the cheapest digital device of all, the shoplifting tag. This is a one-cent one-bit remotely read memory device, storing the status of a product (purchased vs. stolen). Many of these, such as the white rectangular strip that you find on a compact disk, are made by a company called Sensormatic. These contain a metallic glass, a thin sheet of a metal alloy that was cooled so quickly when it formed that the atoms were frozen in a random arrangement. It looks like a piece of aluminum foil, but when it is exposed to an oscillating magnetic field it flexes slightly and then rings much like a tuning fork. This resulting oscillation is much too fast to be audible, but it in turn creates an oscillating magnetic field that can be detected. The portals around the door at a CD store contain the coils for exciting and detecting these oscillations.

My student Rich Fletcher was working with Sensormatic on squeezing more functionality into these materials by trimming them to varying lengths so that they would oscillate at different frequencies, and sensing the motion of magnets near them by how they shift the frequency. This let him store a few bits in the tag, and use it as a passive wireless user-interface controller. By putting the coils in a table instead of a door, Rich could create a work space that could distinguish among objects brought in, and detect how they were moved. Doing that isn't remarkable, but doing it for a penny is. At that cost this capability could be put into most anything, suggesting that smart materials might be able to replace dumb chips. Indeed, Rich has since found many other materials that can be remotely sensed to determine their identity, track their position, and detect properties such as pressure and temperature.

This approach rapidly runs into a limit on the amount of information that can be stored in a tag because there are not enough distinguishable frequencies. Very little data is contained in the call sign of a radio station; the interesting part is the signal the station broadcasts. For a smart material to be able to send out a more complex signal it needs to be nonlinear. If you hit a tuning fork twice as hard it will ring twice as loud but still at the same frequency. That's a linear response. If you hit a person twice as hard they're unlikely just to shout twice as loud. That property lets you learn more about the person than the tuning fork.

I set out to find a nonlinearity that we could use to store more data in smart material tags. To keep the cost down to a penny it had to be a natural property of the material rather than something that would require elaborate fabrication. An intriguing candidate is the interaction between the atoms in a material. Chemists study these with the technique of nuclear magnetic resonance (NMR). Magnetic resonance imaging (MRI) uses exactly the same mechanism, minus the word "nuclear," which might scare patients.

NMR is performed in much the same way as shoplifting tags are detected, applying an oscillating magnetic field and then listening for the resulting oscillating field. In the tag the oscillation came from a mechanical vibration of the material; in NMR the oscillation arises from the nuclei of the atoms. In a magnetic field these precess like tiny spinning tops. The rate at which they precess depends on the orientation of neighboring nuclei, spinning faster if they point in the same direction for example. By probing these neighboring interactions with NMR, chemists are able to deduce the structure of molecules. MRI applies a magnetic field that varies over space, so that the oscillation frequency of a nucleus becomes a function of its position.
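The precession rates involved are easy to estimate. Here is a minimal sketch, assuming the textbook gyromagnetic ratio for a hydrogen nucleus; the numbers are illustrative, not tied to any particular spectrometer:

```python
import math

# Gyromagnetic ratio of the proton (1H), in rad/(s*T); a standard
# reference value, used here purely for illustration.
GAMMA_PROTON = 2.675e8

def larmor_frequency_hz(b_field_tesla):
    """Precession frequency f = gamma * B / (2 * pi)."""
    return GAMMA_PROTON * b_field_tesla / (2 * math.pi)

# A proton in a 1 tesla field precesses at roughly 42.6 MHz, which is why
# NMR drives and detects the nuclei with radio-frequency coils. MRI adds
# a field gradient, so the frequency becomes a function of position.
print(round(larmor_frequency_hz(1.0) / 1e6, 1))
```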

The more I learned about NMR, the more puzzled I became. It usually is performed by putting a tube full of a liquid in a strong magnetic field. A coil wrapped around the tube sends in an oscillating magnetic field to make the nuclei precess, and then detects the resulting magnetic field from the response of the nuclei. The chemists analyze NMR by using the language of quantum mechanics. Quantum mechanics describes the bizarre behavior of very small things, not a macroscopic test tube. Quantum mechanically there is nothing preventing this book from splitting into ten copies and leaping to the moon, but the interactions with you and the rest of the world force it to remain sensibly classical. To understand the paradox of how an apparently classical test tube can behave quantum mechanically, I started lugging around a distinctly macroscopic NMR bible that chemists use, a book by Ernst.

I was not getting anywhere until I was forced to do jury duty. For months I had been trying to avoid serving, but I ran out of excuses and had to clear a few days and report to the Cambridge courthouse. There I was locked in the jury room for a day with nothing to do but read Ernst. Freed from all my usual distractions, I finally was able to understand just how elegant the apparently impenetrable and self-referential chemists' language is.

In NMR each molecule in the liquid behaves as an independent but identical quantum system. The experiments treat the liquid as a single effective molecule that in reality is represented by an enormous number of redundant copies. All of the classical interactions needed to hold and interact with the liquid do ruin some of those copies, but there are so many of them that the quantum mechanical properties of the rest can persist for many seconds, plenty of time to perform an experiment. This means that NMR provides a convenient way to access a nonlinear quantum mechanical system.

Those attributes happen to be exactly what's needed for the future of high-performance computing. The search for cheap chips was leading to something much more. We've come to expect that every year or so our computers will double in speed and capacity. This trend was first noticed by Gordon Moore of Intel in 1965, and has continued for the last three decades. It's not a law of nature; it's an observation about remarkable engineering progress. From the very beginning it was clear where this was heading. Each year it appears that it will be hard to continue improving performance, and each year some new tricks are found to squeeze out the next round. But some time around 2020 or so everything will hit bottom. At the rate of current progress, the wires will be one atom wide, the memory cells will have one electron, and the fab plant will cost the GNP of the planet so that no one can afford to build it anyway. Further improvements cannot come from the current path of shrinking silicon circuits.

In part, this doesn't matter because most of the current constraints on using a computer have nothing to do with the speed of the processor; they come from the user interface, or the power requirement, or the software. A few of the billion dollars spent on building chip fabs could profitably be used elsewhere. But in part it does matter, because there are some hard problems that always demand faster computers.

Finding prime factors, for example, is the mathematical problem that underpins the cryptography used for electronic commerce. The difficulty of factoring is what lets you send a credit card number from a Web browser without worrying about an eavesdropper obtaining it. A desktop computer can multiply together the factors of a four-hundred-digit number in an instant, but finding them would take today's fastest computer the age of the universe. That is because the best known factoring algorithms take exponential time, meaning that doubling the size of the number to be factored squares instead of doubles the time needed to do it. And beyond factoring, it's also true that builders of each generation of computers have confidently predicted that they were sufficient for most applications, only to find entirely new uses for new machines that weren't anticipated by extrapolating the old ones. I don't feel limited because my desktop computer isn't one thousand times faster, but that reflects a poverty of my imagination as much as an observation about what I think I need a computer for.
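The asymmetry between multiplying and factoring shows up even at toy scale. A minimal sketch (the primes are arbitrary, and trial division stands in for far more sophisticated factoring algorithms):

```python
# Multiplying two known primes is one machine operation; recovering them
# by trial division takes on the order of 10**(d/2) candidate divisors
# for a d-digit product, i.e. exponentially many in the digit count.
def trial_division(n):
    """Return (smallest prime factor of n, candidates tried)."""
    tried = 0
    d = 2
    while d * d <= n:
        tried += 1
        if n % d == 0:
            return d, tried
        d += 1
    return n, tried  # n itself is prime

p, q = 10007, 10009       # arbitrary small primes
n = p * q                 # the "easy" direction: instant
factor, tried = trial_division(n)
print(factor, tried)      # the "hard" direction: thousands of trials
```

Each extra pair of digits in the product multiplies the search space by about ten, which is why four hundred digits put the search beyond any classical machine.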

There aren't many places to turn for alternatives to speed up computing. Light travels a foot in a billionth of a second, the cycle time of today's fastest chips. This means that they cannot be clocked much faster because there isn't time for the signals to go anywhere.

Another option is to keep the clock speed but use more computers in parallel. Not only would this require an annual doubling of the number of computers to keep pace with Moore's law, but parallelism alone does not scale very well. Scientists have managed to perform logical functions with DNA molecules, turning a tube full of DNA into an enormous number of computers, about 10^23 of them. That's a lot. Because they communicate with each other slowly, the easiest way to use them is like a lottery, preparing all possible answers to a problem and then checking to see which molecule has the winning solution. But since the number of possible factors of a cryptographic key is exponential in the size of the key, this means that 10^23 computers could simultaneously check for the factors of just a twenty-three-digit number.

There's only one thing in our universe that is exponentially big, but is not used for computing: quantum mechanics. A classical bit can be either a 0 or a 1. A bit governed by the laws of quantum mechanics can simultaneously be a 0 and a 1. To describe it you need to specify the probability that you'll find a 0 if you measure it, and the probability that you'll find a 1. The state of a classical computer with N bits is specified by giving those bits, but for a quantum computer it requires giving 2^N coefficients, one for each possible combination of the bits. For just fifty quantum bits, 2^50 ≈ 10^15, storing more information than the largest classical computers. Because a quantum computer could perform all possible calculations at the same time, it might be the key to building more powerful computers.
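The exponential bookkeeping is easy to make concrete. A small sketch of the count (notation only, no quantum library assumed):

```python
# A classical N-bit register holds one of 2**N values at a time, but
# writing down an N-qubit state takes 2**N amplitudes, one coefficient
# for each possible combination of the bits.
def amplitudes_needed(n_qubits):
    return 2 ** n_qubits

for n in (1, 2, 10, 50):
    print(n, amplitudes_needed(n))
# Fifty qubits already need 2**50, a bit over 10**15 coefficients.
```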

This question was first asked in the early 1980s by Richard Feynman at Caltech, Paul Benioff at Argonne National Laboratory, and David Deutsch at Oxford University. Feynman was struck by the recognition that it takes a classical computer an exponential amount of time to model a quantum system, because it has to keep track of all of the possible states of the system in parallel. He wondered if a computer governed by the laws of quantum mechanics might be able to efficiently model other quantum systems. Feynman's conjecture, recently supported by Seth Lloyd at MIT, had broader implications than he realized.

The study of quantum computing remained a theoretical curiosity through the 1980s, pursued by an unusual collection of people who had to know a great deal about both physics and computer science, and also be slightly strange. Meetings of this community were a delight because the people involved were so unusual, and the discussion so irrelevant to any practical concerns. That began to change in the early 1990s. A series of results proved by David Deutsch, Richard Jozsa of Plymouth University, and Dan Simon, now at Microsoft, showed that a quantum computer is more powerful than a classical computer for increasingly less trivial problems. In 1994 Peter Shor of AT&T was able to use these techniques to show that a quantum computer could find prime factors in polynomial instead of exponential time. This made factoring a four-hundred-digit number almost as easy as multiplying its factors.

These results drew on a second aspect of quantum mechanics that is even more bizarre than the ability to be in many states simultaneously. If a quantum bit is in a superposition of 0 and 1, and it interacts with a second bit, the value of the second bit will depend on the state of the first. If they're now separated to opposite ends of the universe and the first bit is measured, it is forced to decide between being a 0 or a 1. At that instant the value of the second bit is determined. This appears to be an instantaneous action at a distance, something that made Einstein very unhappy. It is called entanglement, and it serves in effect to wire together the bits in a quantum computer. Without entanglement a quantum computer would have the same problem as a DNA computer, trying many answers at the same time and then having to locate the correct one, like trying to find a needle in a haystack. With entanglement, a single quantum computer can be certain to solve a factoring problem.

Naturally, the three-letter agencies (NSA, CIA, . . . ) panicked. Here was a very real threat to information security, coming from an entirely unexpected quarter. Since by that time the result was already widely known they couldn't cover it up, but they could try to keep ahead of the competition. So they started showing up at meetings, in effect offering a purchase order to anyone who would build them a quantum computer. They started to relax when no one serious would take one. In the next year many people realized just how hard it was going to be to build a practical quantum computer. It would need to be carefully isolated from the rest of the world so that it could stay quantum, but would need to be accessible to the outside world so that the initial conditions could be loaded in, the computation run forward, and the answer read out. Papers started to appear saying that quantum computers could not be built.

Then, in 1996, two things happened to scare the three-letter agencies again. The first was the discovery by Peter Shor, Charles Bennett of IBM, and Andy Steane of Oxford, that quantum error correction is possible. Just as classical computers add extra bits to memories to catch mistakes, they realized that quantum computers can use extra quantum bits to fix their errors. This means that an imperfect quantum computer could perform a perfect computation, as long as it is just good enough to be able to implement the error correction.

Here's where NMR comes in. After my day of jury duty I understood that it might be possible to solve the paradox of connecting the classical and quantum worlds with the chemist's trick. Instead of working with a single computer, if a tube full of quantum computers in the form of a liquid sample in an NMR apparatus was used, then all of the external interactions might ruin a few of them, but since each effective quantum bit would be represented in an enormous number of molecules it would retain its quantum coherence for a much longer time. For this idea to work it would have to use natural molecules, since it's not feasible to individually fabricate that many computers. So the question then became whether a molecule could compute.

Classically, a digital computer must be able to flip a bit (the NOT gate), and perform a logical operation on two bits (such as the AND gate). Any other logical operation can be assembled from these components. Similarly, a quantum computer must be able to rotate a quantum bit to any angle between 0 and 1, and must be able to perform a nonlinear operation between two bits. These are easy to do with NMR. The precession of a nucleus in a magnetic field is exactly the kind of rotation that is needed, and the interaction between the nuclei provides the nonlinearity.
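These two ingredients can be sketched with a toy state-vector simulation. The gate pair below (an arbitrary-angle rotation plus a controlled-NOT) is a standard universal choice and stands in for the precession and spin-spin coupling the text describes; it is a sketch, not an NMR implementation:

```python
import math

def rotate(theta, amp0, amp1):
    """Rotate a single qubit: mix the 0 and 1 amplitudes by angle theta.
    theta=0 leaves |0> alone; theta=pi turns |0> into |1> (a NOT)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return c * amp0 - s * amp1, s * amp0 + c * amp1

def cnot(state):
    """Two-qubit controlled-NOT on amplitudes [a00, a01, a10, a11]:
    flip the second bit exactly when the first bit is 1."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]

# Rotate the first qubit halfway between 0 and 1, second qubit in |0> ...
a0, a1 = rotate(math.pi / 2, 1.0, 0.0)
state = [a0, 0.0, a1, 0.0]
# ... then couple them: all the weight ends up on |00> and |11|.
state = cnot(state)
print([round(x, 3) for x in state])  # [0.707, 0.0, 0.0, 0.707]
```

The final state is entangled: measuring either qubit instantly fixes the other, which is the wiring-together property the next paragraphs rely on.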

I arrived for a visit at the Institute for Theoretical Physics in Santa Barbara, armed with this knowledge of how to represent quantum information in what appears to be a classical system, and how to apply magnetic fields to manipulate the quantum information. On the first day I stopped by to see Isaac Chuang (now at IBM), and emerged from his office a week later. He had just finished working out how to reduce a quantum computer to a series of equivalent operations. The operations that he needed and that I knew how to do matched up perfectly.

By the end of the week we had worked out the details of programming a molecular quantum computer. We had an eerie feeling that we weren't really discovering anything, just lifting a curtain to reveal a beautiful structure that had been there all along waiting for someone to notice it.

A liquid is in fact the most perfect system that could be built for quantum computing. It's easy to obtain in huge quantities, the molecules all come perfectly assembled without any manufacturing imperfections, and the cloud of electrons around the nucleus and the rapid tumbling of the molecules in the liquid protect the quantum coherence of the nuclei. Once we understood the connection between liquid NMR and quantum computing we didn't even need to do a demonstration experiment, because the chemists had been doing them for years. Their molecular studies were showing all of the working elements of a quantum computer, without their realizing that's what they were doing.

Richard Feynman gave a seminal talk in 1959 entitled "There's Plenty of Room at the Bottom." He foresaw the problems coming in shrinking computers, and wondered if we could jump to making molecular computers. This inspired the field of nanotechnology that sought to do just that. Unfortunately, the study of nanotechnology has produced thoughtful analyses and compelling visions, but very little in the way of working machines. It now turns out that ordinary molecules all along have known how to compute—we just weren't asking them the right questions.

This was one of those ideas that was ready to be thought. An MIT/Harvard group found a related way to manipulate quantum information in a liquid, one at Los Alamos found still another, and many groups around the world are now using these techniques to make small quantum computers. The first experimental computation we did that required fewer steps than on a classical computer was a search problem, using the molecule chloroform. Given no advance knowledge, unlocking a classical padlock with N settings requires about N/2 attempts. Lov Grover at Bell Labs has shown that this can be done in √N steps on a quantum computer. Our chloroform computer had four states; our quantum search was able to find in a single step the answer that took a few tries classically.
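The four-state search can be followed numerically. This is a plain state-vector sketch of one Grover step (sign-flip the marked answer, then invert every amplitude about the mean), with the marked state chosen arbitrarily:

```python
def grover_step(amps, marked):
    """One Grover iteration: oracle sign-flip, then inversion about the mean."""
    amps = list(amps)
    amps[marked] = -amps[marked]          # the oracle marks the answer
    mean = sum(amps) / len(amps)
    return [2 * mean - a for a in amps]   # reflect every amplitude about mean

marked = 2                     # the unknown "padlock setting", arbitrary here
amps = [0.5, 0.5, 0.5, 0.5]    # equal superposition over four states
amps = grover_step(amps, marked)
print(amps)                    # [0.0, 0.0, 1.0, 0.0]: certainty in one step
```

With four states a single iteration is exact, matching the chloroform experiment; for larger N, about √N iterations are needed before measuring.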

Daunting challenges remain if these demonstration experiments are ever to grow up to exceed the performance of today's fastest classical computers, although these are increasingly technological rather than conceptual problems. The most severe limit is a rapid decrease in sensitivity for larger molecules, but this might be solved by using lasers to align the nuclei. The quantum coherence lasts for thousands of operations, approaching the limit needed to be able to use quantum error correction. This also needs a technique such as the optical nuclear alignment, to prepare the extra quantum bits used in the error correction. Finally, as the size of a molecule is increased, the interactions between nuclei at the far ends of a molecule become too weak to use for computing, but Seth Lloyd has shown that a polymer with a simple repeating atomic pattern is almost as useful and requires that the nuclei interact only with their nearest neighbors.

These scaling constraints offer a glimpse of an ideal molecule for quantum computing. It should have a regular structure, so that Seth's technique can be used. Better still, the pattern should be 2D or 3D rather than a linear chain, to reduce the time to pass messages around the molecule. It should have a rigid configuration, so that all of the local atomic environments are identical, and it should have a special structure at the end that can be used to load in data and read out results. This sounds disturbingly like a microtubule.

Microtubules are a kind of molecular scaffolding in neurons. While there's no compelling reason to believe that quantum mechanics has anything to do with consciousness, those who do think that it does have come to identify microtubules as the seat of quantum consciousness. I couldn't begin to guess how the brain could perform the kinds of molecular manipulations that we do in NMR, although it is amusing to observe that a head in an MRI magnet is a perfectly good medium in which to perform a quantum computation.

The brain may not be the best place to look for guidance in how to build a quantum computer, but it does have a valuable lesson to teach us about how to build more powerful computers. Moore's law demands exponential improvements in performance. Quantum mechanics promises one way to do that; analog circuitry offers another.

The neurons in a brain are about a million times slower than the transistors in a computer, yet they can instantly perform tasks that tax the fastest computers. There are many more neurons in the brain than transistors in a computer, about 10^11 of them, and even more connections among them, roughly 10^15 synapses. These synapses are the key.

Many computing problems reduce to finding good ways to approximate complicated functions. For example, for a computer to recognize faces it must have a function that can take as an input the pixels in an image and provide as an output the name associated with the face. There are many techniques for doing this, but they all reduce in the end to some kind of function that relates inputs to outputs. In traditional engineering practice unknown functions are represented by using some family of basis functions, such as a collection of polynomial powers or trigonometric functions, combined with a set of weighting coefficients. There are relatively straightforward techniques for finding the best coefficients to match a given set of data.

This isn't how the brain does it. In the brain the coefficients correspond to the synaptic connection strengths, and the basis functions are the S-shaped response of the neurons. The brain uses the coefficients in forming the arguments of the basis functions, rather than using them just to combine the results. Engineers have historically stayed away from doing this because it can be shown that there's no longer a practical procedure to find the best values for the coefficients. All that can be done is some kind of search that learns reasonable ones.

The study of neural networks was inspired by the goal of emulating the organization of the brain. While the brain is still far too complex to model in any detail, simple mathematical networks that put the unknown coefficients before the nonlinear basis functions were surprisingly successful. It is now understood why.

A characteristic feature of difficult recognition problems is that there may be a huge number of inputs, such as all of the pixels in an image. In a traditional model, one coefficient is needed for each possible combination of inputs and basis functions. This rapidly explodes, so that if ten are needed for one input, one hundred are needed for two, one thousand for three, and so forth. This is called the "curse of dimensionality." A neural network model is much more flexible. The adjustable coefficients inside the nonlinear functions are far more powerful; the number required is a polynomial function instead of an exponential function of the number of inputs.
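The counting argument can be made concrete. A sketch of the two parameter counts (ten basis functions per input and ten hidden units are arbitrary illustrative sizes):

```python
# Tensor-product basis: one coefficient per combination of basis functions,
# so ten per input explodes as 10**d in the number of inputs d.
def tensor_product_coeffs(n_inputs, per_input=10):
    return per_input ** n_inputs

# Neural network: the coefficients sit inside the nonlinearities, so the
# count grows only polynomially (here, weights plus a bias into each of
# `hidden` units, plus one output weight per unit).
def neural_net_coeffs(n_inputs, hidden=10):
    return hidden * (n_inputs + 1) + hidden

for d in (1, 2, 3, 6):
    print(d, tensor_product_coeffs(d), neural_net_coeffs(d))
# At six inputs: a million coefficients versus eighty.
```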

Even so, there's still the problem of finding their values. A second result comes to the rescue here. It turns out that these models are so flexible, almost anywhere they start out there is a good solution nearby. A search procedure does not need to find the single best solution, just an acceptable one, and there is a huge number of them. This is ensured by using a large number of neurons, far more than would be considered sane in a conventional model. This property has come to be called the "blessing of dimensionality." It helps explain why learning is so essential to how the brain functions.

So here is the second exponential insight: use models that require learning the values of coefficients that appear inside nonlinear functions. While this is not applicable to all computational problems, it works for just the sorts of things that people do well, such as recognizing patterns and prioritizing tasks. A digital computer can simulate this kind of architecture, but it's a misplaced application of the digital precision. Simpler analog circuits can do the same thing with far fewer parts.

Analog circuits were used in the early days of computing, but they fell out of favor because they accumulate errors. Digital computers can correct their mistakes and hence perform arbitrarily long computations. However, this is not an essential distinction between analog and digital systems. Ultimately everything is analog; the apparently digital values are a consequence of carefully designed analog circuits. We're now learning how to marry the most desirable features of both worlds, correcting analog errors while using the analog flexibility to squeeze much more performance out of a simple physical system.

For example, Geoff Grinstein of IBM and I studied what are called spread-spectrum coding techniques, which are used to intentionally make a signal appear to be more random than it is. While this may appear to be a perverse aim, it turns out that almost every property of engineered systems is improved when they work with effectively random signals (more people can share a communications channel with less interference, more data can be stored on a disk, and so forth). There is a beautiful theory for how to generate bits for use in spread spectrum that appear to be completely random but in fact are entirely predictable. A transmitter uses such a sequence to spread its signal and the receiver then uses an identical copy to recover the message (this idea can be traced back to the actress Hedy Lamarr and composer George Antheil using piano rolls in World War II). This is what you hear when your modem hisses. The catch is that if the receiver is remote from the transmitter it has a difficult computing job to do to figure out the spreading sequence that the transmitter used. This is part of what a GPS receiver must do to lock onto a satellite and find its position.
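The predictable-but-random-looking sequences described above can be sketched with a linear feedback shift register (LFSR), the standard way to generate them; the register size, taps, and message below are illustrative, and the receiver is assumed already aligned with the transmitter:

```python
def lfsr_bits(seed, n):
    """4-bit Fibonacci LFSR with feedback polynomial x^4 + x + 1, which is
    primitive, so the output repeats only every 2**4 - 1 = 15 bits."""
    state, out = seed, []
    for _ in range(n):
        out.append(state & 1)
        feedback = ((state >> 3) ^ state) & 1
        state = (state >> 1) | (feedback << 3)
    return out

message = [1, 0, 1, 1]             # illustrative data bits
chips = 15                         # spreading chips per data bit
key = lfsr_bits(0b1001, chips)     # pseudorandom but fully reproducible

# Transmitter: XOR each data bit with the whole chip sequence.
spread = [bit ^ c for bit in message for c in key]

# Receiver: regenerate the same chips and XOR them off again.
recovered = []
for i in range(len(message)):
    chunk = spread[i * chips:(i + 1) * chips]
    despread = [c ^ k for c, k in zip(chunk, key)]
    recovered.append(despread[0])  # every despread chip equals the data bit
print(recovered)                   # [1, 0, 1, 1]
```

The hard acquisition problem in the text is exactly what this sketch assumes away: a remote receiver must first search for where in the sequence the transmitter is before the XOR can be undone.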

Two copies of almost any analog system, if allowed to interact in almost any way, have a very interesting property: they synchronize. This used to be seen in the simultaneous ticking of mechanical clocks on the wall of a clock shop. Geoff and I wondered if we could use this property to solve the problem of spread-spectrum acquisition. Much to our pleasant surprise, we discovered that we could design simple continuous systems that exactly matched the response of the familiar digital ones for digital signals, but if they started in the wrong state they could use their analog freedom to synchronize onto the digital signal. Instead of splitting the job into an analog detector and then a digital search program as is done now, a physical analog part can do everything.

The digital world has gone on refining Turing's machine, even though Turing himself understood its limits. Another area to which he made a seminal contribution was explaining how biology can form patterns by using chemical reactions. This useful behavior is easy to find in nature but hard to model on a digital computer. Now that we're approaching the end of the era of rapid increases in the performance of digital computers, like Turing we have no choice but to pay more attention to information processing in the analog world.

When Feynman said that there's plenty of room at the bottom, he meant that there was a long way to go to shrink computers down to atomic sizes. That's no longer true. With difficulty we are approaching that limit with carefully engineered circuits, and can now glimpse how to eliminate all that effort by using natural materials. Although he wasn't thinking about economics, that may be the most interesting implication of his challenge. There's still plenty of room at the bottom to make much cheaper and more widely accessible computers. Nature is not only good at designing computers, it has a few lessons left to teach us about manufacturing them.

WHEN THINGS START TO THINK by Neil Gershenfeld. ©1998 by Neil A. Gershenfeld. Reprinted by arrangement with Henry Holt and Company, LLC.






