Chapter 10: Seeing Through Windows
 
 
 
 Originally published by Henry Holt and Company 1999. Published on KurzweilAI.net May 15, 2003.

              I vividly recall a particular car trip from my childhood because
              it was when I invented the laptop computer. I had seen early teletype 
              terminals; on this trip I accidentally opened a book turned on its 
              side and realized that there was room on the lower page for a small 
              typewriter keyboard, and on the upper page for a small display screen. 
              I didn't have a clue how to make such a thing, or what I would do 
              with it, but I knew that I had to have one. I had earlier invented 
              a new technique for untying shoes, by pulling on the ends of the 
              laces; I was puzzled and suspicious when my parents claimed prior 
              knowledge of my idea. It would take me many more years to discover 
              that Alan Kay had anticipated my design for the laptop and at that 
              time was really inventing the portable personal computer at Xerox's 
              Palo Alto Research Center (PARC).

              Despite the current practice of putting the best laptops in the
              hands of business executives rather than children, my early desire 
              to use a laptop is much closer to Alan's reasons for creating one. 
              Alan's project was indirectly inspired by the work of the Swiss 
              psychologist Jean Piaget, who from the 1920s onward spent years 
              and years studying children. He came to the conclusion that what 
              adults see as undirected play is actually a very structured activity. 
              Children work in a very real sense as little scientists, continually 
              positing and testing theories for how the world works. Through their 
              endless interactive experiments with the things around them, they 
              learn first how the physical world works, and then how the world 
              of ideas works. The crucial implication of Piaget's insight is that 
              learning cannot be restricted to classroom hours, and cannot be 
              encoded in lesson plans; it is a process that is enabled by children's 
              interaction with their environment.

              Seymour Papert, after working with Piaget, brought these ideas
              to MIT in the 1960s. He realized that the minicomputers just becoming 
              available to researchers might provide the ultimate sandbox for 
              children. While the rest of the world was developing programming 
              languages for accountants and engineers, Seymour and his collaborators 
              created LOGO for children. This was a language that let kids express 
              abstract programming constructs in simple intuitive terms, and best 
              of all it was interfaced to physical objects so that programs could 
              move things outside of the computer as well as inside it. The first 
              one was a robot "turtle" that could roll around under control of 
              the computer, moving a pen to make drawings.

              Infected by the meme of interactive technology for children, Alan
              Kay carried the idea to the West Coast, to Xerox's Palo Alto Research 
              Center. In the 1970s, he sought to create what he called a Dynabook, 
              a portable personal knowledge navigator shaped like a notebook, 
              a fantasy amplifier. The result was most of the familiar elements 
              of personal computing.

              Unlike early programming languages that required a specification
              of a precise sequence of steps to be executed, modern object-oriented 
              languages can express more complex relationships among abstract 
              objects. The first object-oriented programming language was Smalltalk, 
              invented by Alan to let children play as easily with symbolic worlds 
              as they do with physical ones. He then added interface components 
              that were being developed by Doug Engelbart up the road at the Stanford 
              Research Institute.

              Doug was a radar engineer in World War II. He realized that a computer
              could be more like a radar console than a typewriter, interactively 
              drawing graphics, controlled by an assortment of knobs and levers. 
              Picking up a theme that had been articulated by Vannevar Bush (the 
              person most responsible for the government's support of scientific 
              research during and after the war) in 1945 with his proposal for 
              a mechanical extender of human memory called a Memex, Doug understood 
              that such a machine could help people navigate through the increasingly 
              overwhelming world of information. His colleagues thought that he 
              was nuts. Computers were specialized machines used for batch processing, 
              not interactive personal appliances. Fortunately, Engelbart was 
              able to attract enough funding to set up a laboratory around the 
              heretical notion of studying how people and computers might better 
              interact. These ideas had a coming out in a rather theatrical demo 
              he staged in San Francisco in 1968, showing what we would now recognize 
              as an interactive computer with a mouse and multiple windows on 
              a screen.

              In 1974 these elements came together in the Xerox Alto prototype,
              and reached the market in Xerox's Star. The enormous influence of 
              this computer was matched by its enormous price tag, about $50,000. 
              This was a personal computer that only big corporations could afford. 
              Windows and mice finally became widely available and affordable 
              in Apple's Macintosh, inspired by Steve Jobs's visit to PARC in 
              1979, and the rest of personal computing caught up in 1990 when 
              Microsoft released Windows 3.0.

              The prevailing paradigm for how people use computers hasn't really
              changed since Engelbart's show in 1968. Computers have proliferated,
              their performance has improved, but we still organize information 
              in windows and manipulate it with a mouse. For years the next big 
              interface has been debated. There is a community that studies such 
              things, called Human-Computer Interaction (HCI). To give you an idea
              of the low level of that discussion, one of the most thoughtful 
              HCI researchers, Bill Buxton (chief scientist at Silicon Graphics), 
              is known for the insight that people have two hands. A mouse forces 
              you to manipulate things with one hand alone; Bill develops interfaces 
              that can use both hands.

              A perennial contender on the short list for the next big interface
              is speech recognition, promising to let us talk to our computers 
              as naturally as we talk to each other. Appealing as that is, it 
              has a few serious problems. It would be tiring if we had to spend 
              the day speaking continuously to get anything done, and it would 
              be intrusive if our conversations with other people had to be punctuated 
              by our conversations with our machines. Most seriously, even if 
              speech recognition systems worked perfectly (and they don't), the 
              result is no better than if the commands had been typed. So much 
              of the frustration in using a computer is not the effort to enter 
              the commands, it's figuring out how to tell it to do what you want, 
              or trying to interpret just what it has done. Speech is a piece 
              of the puzzle, but it doesn't address the fundamental mysteries 
              confronting most computer users.

              A dream interface has always been dreams, using mind control to
              direct a computer. There is now serious work being done on making 
              machines that can read minds. One technique used is magnetoencephalography 
              (MEG), which places sensitive detectors of magnetic fields around 
              a head and measures the tiny neural currents flowing in the brain. 
              Another technique, functional magnetic resonance imaging, uses MRI 
              to make a 3D map of chemical distributions in the brain to locate 
              where metabolic activity is happening. Both of these can, under 
              ideal conditions, deduce something about what is being thought, 
              such as distinguishing between listening to music and looking at 
              art, or moving one hand versus the other. The problem that both 
              struggle with is that the brain's internal representation is not 
              designed for external consumption.

              Early programmers did a crude form of MEG by placing a radio near
              a computer; the pattern of static could reveal when a program got 
              stuck in a loop. But as soon as video displays came along it became 
              much easier for the computer to present the information in a meaningful 
              form, showing just what it was doing. In theory the same information 
              could be deduced by measuring all of the voltages on all of the 
              leads of the chips; in practice this is done only by hardware manufacturers 
              in testing new systems, and it takes weeks of effort.

              Similarly, things that are hard to measure inside a person are
              simple to recognize on the outside. For example, hold your finger 
              up and wiggle it back and forth. You've just performed a brain control 
              task that the Air Force has spent a great deal of time and money 
              trying to replicate. They've built a cockpit that lets a pilot control 
              the roll angle by thinking; trained operators on a good day can 
              slowly tilt it from side to side. They're a long way from flying 
              a plane that way.

              In fact, a great deal of the work in developing thought interfaces
              is actually closer to wiggling your finger. It's much easier to 
              accidentally measure artifacts that come from muscle tension in 
              your forehead or scalp than it is to record signals from deep in 
              the brain. Instead of trying to teach people to do the equivalent 
              of wiggling their ears, it's easier to use the parts of our bodies 
              that already come wired for us to interact with the world.

              Another leading contender for the next big interface is 3D graphics.
              Our world is three-dimensional; why limit the screen to two dimensions? 
              With advances in the speed of graphical processors it is becoming 
              possible to render 3D scenes as quickly as 2D windows are now drawn. 
              A 3D desktop could present the files in a computer as the drawers 
              of a file cabinet or as a shelf of books, making browsing more intuitive. 
              If you're willing to wear special glasses, the 3D illusion can be 
              quite convincing.

              A 3D display can even be more than an illusion. My colleague Steve
              Benton invented the reflection holograms on your credit cards; his 
              group is now developing real-time holographic video. A computer 
              calculates the light that would be reflected from a three-dimensional 
              object, and modulates a laser beam to produce exactly that. Instead 
              of tricking the eyes by using separate displays to produce an illusion 
              of depth, his display actually creates the exact light pattern that 
              the synthetic object would reflect.

              Steve's system is a technological tour de force, the realization
              of a long-standing dream in the display community. It's also slightly 
              disappointing to many people who see it, because a holographic car 
              doesn't look as good as a real car. The problem is that reality 
              is just too good. The eye has the equivalent of many thousands of 
              lines of resolution, and a refresh rate of milliseconds. In the 
              physical world there's no delay between moving an object and seeing 
              a new perspective. Steve may someday be able to match those specifications 
              with holographic video, but it's a daunting challenge.

              Instead of struggling to create a computer world that can replace
              our physical world, there's an alternative: augment it. Embrace 
              the means of interaction that we've spent eons perfecting as a species, 
              and enhance them with digital content.

              Consider Doug Engelbart's influential mouse. It is a two-dimensional
              controller that can be moved left and right, forward and backward, 
              and intent is signaled by pressing it. It was preceded by a few 
              centuries by another two-dimensional controller, a violin bow. That, 
              too, is moved left and right, forward and backward, and intent is 
              communicated by pressing it. In this sense the bow and mouse are 
              very similar. On the other hand, while a good mouse might cost $10, 
              a good bow can cost $10,000. It takes a few moments to learn to 
              use a mouse, and a lifetime to learn to use a bow. Why would anyone 
              prefer the bow? Because it lets them do so much more. Consider the differences 
              between the bow technique and the mouse technique:

Bow Technique

Sul ponticello (bowing close to the bridge)
Spiccato (dropping the bow)
Martelé (forcefully releasing the stroke)
Jeté (bouncing the bow)
Tremolo (moving back and forth repeatedly)
Sul tasto (bowing over the fingerboard)
Arpeggio (bouncing on broken chords)
Col legno (striking with the stick)
Viotti (unaccented then accented note)
Staccato (many martelé notes in one stroke)
Staccato volante (slight spring during rapid staccato)
Détaché (vigorous articulated stroke)
Legato (smooth stroke up or down)
Sautillé (rapid strike in middle of bow)
Louré (separated slurred notes)
Ondulé (tremolo between two strings)

Mouse Technique

 There's much more to the bow than a casual marketing list of features 
              might convey. Its exquisite physical construction lets the player 
              perform a much richer control task, relying on the intimate connection 
              between the dynamics of the bow and the tactile interface to the 
              hand manipulating and sensing its motion. Compare that nuance to 
              a mouse, which can be used perfectly well while wearing mittens.

              When we did the cello project I didn't want to ask Yo-Yo to give
              up this marvelous interface; I retained the bow and instead asked 
              the computer to respond to it. Afterward, we found that the sensor 
              I developed to track the bow could respond to a hand without the 
              bow. This kind of artifact is apparent any time a radio makes static 
              when you walk by it, and was used back in the 1930s by the Russian 
              inventor Lev Termen in his Theremin, the musical staple of science-fiction 
              movies that makes eerie sounds in response to a player waving their 
              arms in front of it.

              My student Josh Smith and I found that lurking behind this behavior
              was a beautiful mathematical problem: given the charges measured 
              on two-dimensional electrodes, what is the three-dimensional distribution
              of material that produced it? As we made headway with the problem 
              we found that we could make what looks like an ordinary table, but 
              that has electrodes in it that create a weak electric field that 
              can find the location of a hand above it. It's completely unobtrusive, 
              and responds to the smallest motions a person can make (millimeters) 
              as quickly as they can make them (milliseconds). Now we don't need 
              to clutter the desk with a rodent; the interface can disappear into 
              the furniture. There's no need to look for a mouse since you always 
              know where to find your hand.

              The circuit board that we developed to make these measurements
              ended up being called a "Fish," because fish swim in 3-D instead of
              mice crawling in 2-D, and some fish that live in murky waters use
              electric fields to detect objects in their vicinity just as we were 
              rediscovering how to do it. In retrospect, it's surprising that 
              it has taken so long for such an exquisite biological sense to get 
              used for computer interfaces. There's been an anthropomorphic tendency 
              to assume that a computer's senses should match our own.

              We had trouble keeping the Fish boards on hand because they would
              be carried off around the Media Lab by students who wanted to build 
              physical interfaces. More recently, the students have been acquiring 
              as many radio-frequency identification (RFID) chips as they can 
              get their hands on. These are tiny processors, small enough even 
              to be swallowed, that are powered by an external field that can 
              also exchange data with them. They're currently used in niche applications, 
              such as tracking laboratory animals, or in the key-chain tags that 
              enable us to pump gas without using a credit card. The students 
              use them everywhere else. They make coffee cups that can tell the 
              coffeemaker how you like your coffee, shoes that can tell a doorknob 
              who you are, and mouse pads that can read a Web URL from an object 
              placed on it.

              You can think of this as a kind of digital shadow. Right now objects
              live either in the physical world or as icons on a computer screen. 
              User interface designers still debate whether icons that appear 
              to be three-dimensional are better than ones that look two-dimensional. 
              Instead, the icons can really become three-dimensional; physical 
              objects can have logical behavior associated with them. A business 
              card should contain an address, but also summon a Web page if placed 
              near a Web browser. A pen should write in normal ink, but also remember 
              what it writes so that the information can be recalled later in 
              a computer, and it should serve as a stylus to control that computer. 
              A house key can also serve as a cryptographic key. Each of these 
              things has a useful physical function as well as a digital one.

              My colleague Hiroshi Ishii has a group of industrial designers,
              graphical designers, and user interface designers studying how to 
              build such new kinds of environmental interfaces. A recurring theme 
              is that interaction should happen in the context that you, rather 
              than the computer, find meaningful. They use video projectors so 
              that tables and floors and walls can show relevant information; 
              since Hiroshi is such a good Ping-Pong player, one of the first 
              examples was a Ping-Pong table that displayed the ball's trajectory 
              in a fast-moving game by connecting sensors in the table to a video 
              projector aimed down at the table. His student John Underkoffler 
              notes that a lamp is a one-bit display that can be either on or 
              off; John is replacing lightbulbs with combinations of computer 
              video projectors and cameras so that the light can illuminate ideas 
              as well as spaces.

              Many of the most interesting displays they use are barely perceptible,
              such as a room for managing their computer network that maps the 
              traffic into ambient sounds and visual cues. A soothing breeze indicates 
              that all is well; the sights and sounds of a thunderstorm are a sign
              of an impending disaster that needs immediate attention. This information 
              about their computer network is always available, but never demands 
              direct attention unless there is a problem.

              Taken together, ambient displays, tagged objects, and remote sensing
              of people have a simple interpretation: the computer as a distinguishable 
              object disappears. Instead of a fixed display, keyboard, and mouse, 
              the things around us become the means we use to interact with electronic
information as well as the physical world. Today's battles between 
              competing computer operating systems and hardware platforms will 
              literally vanish into the woodwork as the diversity of the physical 
              world makes control of the desktop less relevant.

              This is really no more than Piaget's original premise of learning
              through manipulation, filtered through Papert and Kay. We've gotten 
              stuck at the developmental stage of early infants who use one hand 
              to point at things in their world, a decidedly small subset of human
experience. Things we do well rely on all of our senses.

              Children, of course, understand this. The first lesson that any
              technologist bringing computers into a classroom gets taught by 
              the kids is that they don't want to sit still in front of a tube. 
              They want to play, in groups and alone, wherever their fancy takes 
              them. The computer has to tag along if it is to participate. This 
              is why Mitch Resnick, who has carried on Seymour's tradition at 
              the Media Lab, has worked so hard to squeeze a computer into a Lego 
              brick. These bring the malleability of computing to the interactivity 
              of a Lego set.

              Just as Alan's computer for kids was quickly taken over by the
              grown-ups, Lego has been finding that adults are as interested as 
              kids in their smart bricks. There's no end to the creativity that's 
              found expression through them; my favorite is a descendant of the
              old LOGO turtle, a copier made from a Lego car that drives over 
              a page with a light sensor and then trails a pen to draw a copy 
              of the page.

              A window is actually an apt metaphor for how we use computers now.
              It is a barrier between what is inside and what is outside. While 
              that can be useful at times (such as keeping bugs where they belong), 
              it's confining to stay behind it. Windows also open to let fresh
              air in and let people out.

              All along the coming interface paradigm has been apparent. The
              mistake was to assume that a computer interface happens between 
              a person sitting at a desk and a computer sitting on the desk. We 
              didn't just miss the forest for the trees, we missed the earth and 
              the sky and everything else. The world is the next interface.

WHEN THINGS START TO THINK by Neil Gershenfeld. ©1998 by Neil A. Gershenfeld. Reprinted by arrangement with Henry Holt and Company, LLC.