Essentials of General Intelligence: The direct path to AGI
General intelligence comprises the essential, domain-independent skills necessary for acquiring a wide range of domain-specific knowledge -- the ability to learn anything. Achieving this with "artificial general intelligence" (AGI) requires a highly adaptive, general-purpose system that can autonomously acquire an extremely wide range of specific knowledge and skills and can improve its own cognitive ability through self-directed learning. This chapter in the forthcoming book, Real AI: New Approaches to Artificial General Intelligence, describes the requirements and conceptual design of a prototype AGI system.
1. Introduction
This paper explores the concept of "artificial general intelligence" (AGI) -- its
nature, importance, and how best to achieve it. Our[1]
theoretical model posits that general intelligence comprises a limited number
of distinct, yet highly integrated, foundational functional components.
Successful implementation of this model will yield a highly adaptive,
general-purpose system that can autonomously acquire an extremely wide range of
specific knowledge and skills. Moreover, it will be able to improve its own
cognitive ability through self-directed learning. We believe that, given the
right design, current hardware/ software technology is adequate for engineering
practical AGI systems. Our current implementation of a functional prototype is
described below.
The idea of "general intelligence" is quite controversial; I
do not substantially engage this debate here but rather take the existence of
such non-domain-specific abilities as a given (Gottfredson 1998). It must also be noted that this essay focuses
primarily on low-level (i.e., roughly animal-level) cognitive ability.
Higher-level functionality, while an integral part of our model, is only
addressed peripherally. Finally, certain algorithmic details are omitted for
reasons of proprietary ownership.
Intelligence can be defined simply as an entity's ability to
achieve goals -- with greater intelligence coping with more complex and novel
situations. Complexity ranges from the trivial -- thermostats and mollusks (which
in most contexts don't even justify the label "intelligence") -- to the
fantastically complex: autonomous flight control systems and humans.
Adaptivity, the ability to deal with changing and novel
requirements, also covers a wide spectrum: from rigid, narrowly domain-specific
to highly flexible, general purpose. Furthermore, flexibility can be defined in
terms of scope and permanence -- how much changes, and how lasting those changes
are. Imprinting is an example of limited scope and high permanence, while
innovative, abstract problem solving is at the other end of the spectrum. While
entities with high adaptivity and flexibility are clearly superior -- they can
potentially learn to achieve any possible goal -- there is a hefty efficiency
price to be paid: For example, had Deep Blue also been designed to learn
language, direct airline traffic, and do medical diagnosis, it would not have
become chess champion (all other things being equal).
2. General Intelligence
General intelligence comprises the essential,
domain-independent skills necessary for acquiring a wide range of domain-specific
knowledge (data & skills) -- i.e. the ability to learn anything (in
principle). More specifically, this learning ability needs to be autonomous,
goal-directed, and highly adaptive:
Autonomous -- Learning occurs both automatically, through exposure to sense
data (unsupervised), and through bi-directional interaction with the environment,
including exploration/ experimentation (self-supervised).
Goal-directed -- Learning is directed (autonomously) towards achieving varying
and novel goals and sub-goals -- be they "hard-wired," externally specified,
or self-generated. Goal-directedness also implies very selective learning
and data acquisition (from a massively data-rich, noisy, complex environment).
Adaptive -- Learning is cumulative, integrative, contextual and adjusts to
changing goals and environments. General adaptivity not only copes with gradual
changes, but also seeds and facilitates the acquisition of totally novel abilities.
General cognitive ability stands in sharp contrast to inherent specializations
such as speech- or face-recognition, knowledge databases/ ontologies, expert
systems, or search, regression or optimization algorithms. It allows an entity
to acquire a virtually unlimited range of new specialized abilities. The mark
of a generally intelligent system is not having a lot of knowledge
and skills, but being able to acquire and improve them -- and
to be able to appropriately apply them. Furthermore, knowledge must
be acquired and stored in ways appropriate both to the nature of the data,
and to the goals and tasks at hand.
For example, given the correct set of basic core
capabilities, an AGI system should be able to learn to recognize and categorize
a wide range of novel perceptual patterns that are acquired via different
senses, in many different environments and contexts. Additionally, it should be
able to autonomously learn appropriate, goal-directed responses to such input
contexts (given some feedback mechanism).
We take this concept to be valid not only for high-level
human intelligence, but for lower-level animal-like ability. The degree of
"generality" (i.e., adaptability) varies along a continuum from genetically
"hard-coded" responses (no adaptability), to high-level animal flexibility
(significant learning ability as in, say, a dog), and finally to self-aware
human general learning ability.
Core Requirements for General Intelligence
General intelligence, as described above, demands a number
of irreducible features and capabilities. In order to proactively accumulate
knowledge from various (and/ or changing) environments, it requires:
1. Senses to obtain features from "the world" (virtual or actual),
2. A coherent means for storing knowledge obtained this way, and
3. Adaptive output/ actuation mechanisms (both static and dynamic).
Such knowledge also needs to be automatically adjusted and
updated on an ongoing basis; new knowledge must be appropriately related to
existing data. Furthermore, perceived entities/ patterns must be stored in a
way that facilitates concept formation and generalization. An effective way to
represent complex feature relationships is through vector encoding (Churchland
1995).
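As a rough illustration of vector encoding (this is a toy sketch, not our implementation; the feature names and values are purely illustrative), consider the following Python fragment:

    from math import sqrt

    # Each perceived entity becomes a point in feature space; feature
    # relationships then reduce to geometry (distance = dissimilarity).
    def encode(redness, size, brightness):
        return (redness, size, brightness)   # values normalized to [0, 1]

    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    apple = encode(0.90, 0.30, 0.70)
    cherry = encode(0.95, 0.10, 0.80)
    print(distance(apple, cherry))   # small distance -> similar entities

Because similar entities cluster together in such a space, concept formation and generalization become geometrically tractable.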
Any practical applications of AGI (and certainly any
real-time uses) must inherently be
able to process temporal data as patterns in time -- not just as static patterns
with a time dimension. Furthermore, AGIs must cope with data from different
sense probes (e.g., visual, auditory, and raw data), and deal with such attributes
as: noisy, scalar, unreliable, incomplete, multi-dimensional (both space/ time
dimensional, and having a large number of simultaneous features), etc. Fuzzy
pattern matching helps deal with pattern variability and noise.
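A minimal sketch of how fuzzy matching might absorb such noise (assuming a single per-dimension tolerance; a real system would adapt tolerances per pattern):

    def fuzzy_match(stored, sensed, tolerance=0.1):
        # Accept a noisy input if every feature lies within tolerance of
        # the stored pattern; return a graded degree of fit (1.0 = exact).
        worst = max(abs(s - x) for s, x in zip(stored, sensed))
        return 0.0 if worst > tolerance else 1.0 - worst / tolerance

    print(fuzzy_match((0.90, 0.30, 0.70), (0.87, 0.32, 0.71)))  # ~0.7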
Another essential requirement of general intelligence is to
cope with an overabundance of data. Reality presents massively more features
and detail than is (contextually) relevant, or that can be usefully processed.
This is why the system needs to have some control over what input data is
selected for analysis and learning -- both which data is chosen, and in what degree of detail. Senses ("probes") are
needed not only for selection and focus, but also in order to ground concepts --
to give them (reality-based) meaning.
While input data needs to be severely limited by focus and
selection, it is also extremely important to obtain multiple views of reality --
data from different feature extractors or senses. Provided that these different
input patterns are properly associated, they can help to provide context for
each other, aid recognition, and add meaning.
In addition to being able to sense via its multiple,
adaptive input groups and probes, the AGI must also be able to act on the world
-- be it for exploration, experimentation, communication, or to perform useful
actions. These mechanisms need to provide both static and dynamic output
(states and behavior). They, too, need to be adaptive and capable of learning.
Underlying all of this functionality is pattern processing.
What is more, not only are sensing and action based on generic patterns, but so
is internal cognitive activity. In fact, even high-level abstract thought,
language, and formal reasoning -- abilities outside the scope of our current
project -- are "just" higher-order elaborations of this (Margolis 1987).
Advantages of Intelligence being General
The advantages of general intelligence are almost too
obvious to merit listing; how many of us would dream of giving up our ability
to adapt and learn new things? In the context of artificial intelligence this
issue takes on a new significance.
There exists an inexhaustible demand for computerized
systems that can assist humans in complex tasks that are highly repetitive,
dangerous, or that require knowledge, senses or abilities that its users may
not possess (e.g., expert knowledge, "photographic" recall, overcoming
disabilities, etc.). These applications stretch across almost all domains of
human endeavor.
Currently, these needs are filled primarily by systems
engineered specifically for each domain and application (e.g., expert systems).
Problems of cost, lead-time, reliability, and the lack of adaptability to new
and unforeseen situations, severely limit market potential. Adaptive AGI technology,
as described in this paper, promises to significantly reduce these limitations
and to open up these markets. It specifically implies --
That systems can learn (and be taught) a wide spectrum of data and functionality
They can adapt to changing data, environments and uses/ goals
This can be achieved without program changes -- capabilities are learned,
not coded.
More specifically, this technology can potentially:
Significantly reduce system "brittleness"[2]
through fuzzy pattern matching and adaptive learning -- increasing robustness
in the face of changing and unanticipated conditions or data.
Learn autonomously, by automatically accumulating knowledge about new environments
through exploration.
Allow systems to be operator-trained to identify new objects and patterns;
to respond to situations in specific ways, and to acquire new behaviors.
Eliminate programming in many applications. Systems can be employed in many
different environments, and with different parameters simply through self-training.
Facilitate easy deployment in new domains. A general intelligence engine
with pluggable custom input/ output probes allows rapid and inexpensive implementation
of specialized applications.
From a design perspective, AGI offers the advantage that all
effort can be focused on achieving the best general
solutions -- solving them once, rather than once for each particular domain. AGI
obviously also has huge economic implications: because AGI systems acquire most
of their knowledge and skills (and adapt to changing requirements) autonomously,
programming lead times and costs can be dramatically reduced, or even eliminated.
The fact that no (artificial!) systems with these
capabilities currently exist seems to imply that it is very hard (or
impossible) to achieve these objectives. However, I believe that, as with other
examples of human discovery and invention, the solution will seem rather
obvious in retrospect. The trick is correctly choosing a few critical
development options.
3. Shortcuts to AGI
When explaining Artificial General Intelligence to the
uninitiated one often hears the remark that, surely, everyone in AI is working
to achieve general intelligence. This indicates how deeply misunderstood
intelligence is. While it is true that eventually
conventional (domain-specific) research efforts will converge with those of
AGI, without deliberate guidance this is likely to be a long, inefficient
process. High-level intelligence must
be adaptive, must be general -- yet very little work is being done to
specifically identify what general intelligence is, what it requires, and
how to achieve it.
In addition to understanding general intelligence, AGI
design also requires an appreciation of the differences between artificial (synthetic) and biological
intelligence, and between designed
and evolved systems.
Our particular approach to achieving AGI capitalizes on
extensive analysis of these issues, and on an incremental development path that
aims to minimize development effort (time and cost), technical complexity, and
overall project risks. In particular, we are focusing on engineering a series
of functional (but low-resolution/ capacity) proof-of-concept prototypes.
Performance issues specifically related to commercialization are assigned to
separate development tracks. Furthermore, our initial effort concentrates on
identifying and implementing the most general and foundational components
first, leaving high-level cognition such as abstract thought, language, and
formal logic for later development (more on that later). We also focus more on
selective, unsupervised, dynamic, incremental, interactive learning; on noisy,
complex, analog data; and on integrating entity features and concept attributes
in one comprehensive network.
While our project may not be the only one proceeding on this
particular path, it is clear that by far the majority of AI work being done
today follows a substantially different overall approach. Our work focuses on:
General rather than domain-specific cognitive ability
Acquired knowledge and skills, versus loaded databases and coded skills
Bi-directional, real-time interaction, versus batch processing
Adaptive attention (focus & selection), versus human pre-selected
data
Core support for dynamic patterns, versus static data
Unsupervised and self-supervised, versus supervised learning
Adaptive, self-organizing data structures, versus fixed neural nets
or databases
Contextual, grounded concepts, versus hard-coded, symbolic concepts
Explicitly engineering functionality, versus evolving it
Conceptual design, versus reverse-engineering
General proof-of-concept, versus specific real applications development
Animal level cognition, versus abstract thought, language, and formal
logic.
Let's look at each of these choices in greater detail.
General rather than
domain-specific cognitive ability. The advantages listed in the previous
section flow from the fact that generally intelligent systems can ultimately
learn any specialized knowledge and skills possible -- human
intelligence is the proof! The reverse is obviously not true.
A complete, well-designed AGI's ability to acquire
domain-specific capabilities is limited only by processing and storage
capacity. What is more, much of its learning will be autonomous -- without
teachers, and certainly without explicit programming. This approach implements
(and capitalizes on) the essence of "Seed AI" -- systems with a limited, but
carefully chosen set of basic, initial capabilities that allow them (in a
"bootstrapping" process) to dramatically increase their knowledge and skills
through self-directed learning and adaptation. By concentrating on carefully
designing the seed of intelligence, and then nursing it to maturity, one essentially
bootstraps intelligence. In our AGI design this self-improvement takes two
distinct forms/ phases:
1. Coding the basic skills that allow the system to acquire a large amount
of specific knowledge.
2. The system reaching sufficient intelligence and conceptual understanding
of its own design, to enable it to deliberately improve its own design.
Acquired knowledge and skills, versus loaded databases and
coded skills. One crucial measure of general intelligence is its ability
to acquire knowledge and skills, not how much it possesses. Many AI efforts
concentrate on accumulating huge databases of knowledge and coding massive amounts
of specific skills. If AGI is possible -- and evidence presented here and elsewhere
seems overwhelming -- then much of this effort will be wasted. Not only will
an AGI be able to acquire these additional smarts (largely) by itself, but moreover,
it will also be able to keep its knowledge up-to-date, and to improve it. Not
only will this save initial data collection and preparation as well as programming,
it will also dramatically reduce maintenance.
An important feature of our design is that there are no
traditional databases containing knowledge, nor programs encoding learned
skills: All acquired knowledge is integrated into an adaptive central
knowledge/ skills network. Patterns representing knowledge are associated in a
manner that facilitates conceptualization and sensitivity to context. Naturally,
such a design is potentially far less prone to brittleness, and more
resiliently fault-tolerant.
Bi-directional,
real-time interaction, versus batch processing. Adaptive learning systems
must be able to interact bi-directionally with the environment -- virtual or
real. They must both sense data and act/ react on an ongoing basis. Many AI
systems do all of their learning in batch mode and have little or no ability to
learn incrementally. Such systems
cannot easily adjust to changing environments or requirements -- in many cases
they are unable to adapt beyond the initial training set without reprogramming
or retraining.
In addition to real-time perception and learning, intelligent
systems must also be able to act. Three distinct areas of action capability are
required:
1. Acting on the "world" -- be it to communicate, to navigate or
explore, or to manipulate some external function or device in order
to achieve goals.
2. Controlling or modifying the system's internal parameters
(such as learning rate or noise tolerance, etc.) in order to set or improve
functionality.
3. Controlling the system's sense input parameters such as focus,
selection, resolution (granularity) as well as adjusting feature extraction
parameters.
Adaptive attention
(focus & selection), versus human pre-selected data. As mentioned
earlier, reality presents far more sense data, detail, and complexity
than is required for any given task -- or than can be processed. Traditionally,
this problem has been dealt with by carefully selecting and formatting data
before feeding it to the system. While this human assistance can improve
performance in specific applications, it is often not realized that this additional
intelligence resides in the human, not the software.
Outside guidance and training can obviously speed learning;
however, AGI systems must inherently
be designed to acquire knowledge by themselves. In particular, they need to
control what input data is processed -- where specifically to obtain data, in
how much detail, and in what format. Absent this capability the system will
either be overwhelmed by irrelevant data or, conversely, be unable to obtain
crucial information, or get it in the required format. Naturally, such data
focus and selection mechanisms must themselves be adaptive.
Core support for
dynamic patterns, versus static data. Temporal pattern processing is
another fundamental requirement of interactive intelligence. At least three
aspects of AGI rely on it: perception
needs to learn/ recognize dynamic entities and sequences, action usually comprises complex behavior, and cognition (internal processing) is inherently temporal. In spite of
this obvious need for intrinsic support for dynamic patterns, many AI systems
only process static data; temporal sequences, if supported at all, are often
converted ("flattened") externally to eliminate the time dimension. Real-time
temporal pattern processing is technically quite challenging, so it is not surprising
that most designs try to avoid it.
Unsupervised and
self-supervised, versus supervised learning. Auto-adaptive systems such as
AGIs require comprehensive capabilities to learn without supervision. Such
teacher-independent knowledge and skill acquisition falls into two broad
categories: unsupervised (data-driven, bottom-up), and self-supervised
(goal-driven, top-down). Ideally these two modes of learning should seamlessly
integrate with each other -- and of course, also with other, supervised methods.
Here, as in other design choices, general adaptive systems are harder to design
and tune than more specialized, unchanging ones. We see this particularly clearly
in the overwhelming focus on back-propagation[3]
in artificial neural network (ANN) development. Relatively little research aims
at better understanding and improving incremental, autonomous learning. Our
own design places heavy emphasis on these aspects.
Adaptive, self-organizing data structures, versus fixed
neural nets or databases. Another core requirement imposed by data/
goal-driven, real-time learning is having a flexible, self-organizing data
structure. On the one hand, knowledge representation must be highly integrated,
while on the other hand it must be able to adapt to changing data densities
(and other properties), and to varying goals or solutions. Our AGI encodes all
acquired knowledge and skills in one integrated network-like structure. This
central repository features a flexible, dynamically self-organizing topology.
The vast majority of other AI designs rely either on loosely-coupled data
objects or agents, or on fixed network topologies and pre-defined ontologies,
data hierarchies or database layouts. This often severely limits their
self-learning ability, adaptivity and robustness, or creates massive
communication bottlenecks or other performance overhead.
Contextual, grounded
concepts, versus hard-coded, symbolic concepts. Concepts are probably the
most important design aspect of AGI; in fact, one can say that "high-level intelligence
is conceptual intelligence." Core
characteristics of concepts include their ability to represent
ultra-high-dimensional fuzzy sets that are grounded in reality, yet fluid with
regard to context. In other words, they encode related sets of complex, coherent,
multi-dimensional patterns that represent features of entities. Concepts obtain
their grounding (and thus their meaning) by virtue of patterns emanating from
features sensed directly from entities that exist in reality. Because concepts
are defined by value ranges within each feature dimension (sometimes in
complex relationships), some kind of fuzzy pattern matching is essential. In
addition, the scope of concepts must be fluid; they must be sensitive
and adaptive to both environmental and goal contexts.
Autonomous concept formation is one of the key tests of
intelligence. The many AI systems based on hard-coded or human-defined concepts
fail this fundamental test. Furthermore, systems that do not derive their
concepts via interactive perception are unable to ground their knowledge in
reality, and thus lack crucial meaning. Finally, concept structures whose
activation cannot be modulated by context and degree of fit are unable to
capture the subtlety and fluidity of intelligent generalization. In combination,
these limitations will cripple any aspiring AGI.
Explicitly
engineering (and learning) functionality, versus evolving it. Design by
evolution is extremely inefficient -- whether in nature or in computer science.
Moreover, evolutionary solutions are generally opaque; optimized only to some
specified "cost function," not comprehensibility, modularity, or
maintainability. Furthermore, evolutionary learning also requires more data or
trials than are available in everyday problem solving.
Genetic and evolutionary programming do have their uses -- they are powerful
tools for solving very specific problems, such as the optimization of large
sets of variables; however, they are generally not appropriate for creating
large systems or infrastructures. Artificially
evolving general intelligence directly
seems particularly problematic because there is no known function measuring
such capability along a single continuum -- and absent such direction, evolution
doesn't know what to optimize. One approach to deal with this problem is to try
to coax intelligence out of a complex ecology of competing agents -- essentially
replaying natural evolution.
Overall, it seems that genetic programming techniques are
appropriate when one runs out of specific engineering ideas. Here is a short
summary of advantages of explicitly engineered functionality:
Designs can directly capitalize on and encode the designer's knowledge and
insights.
Designs have comprehensible design documentation.
Designs can be far more modular -- with less of the multiple functionality
and high inter-dependency of sub-systems found in evolved systems.
Systems can have a more flow-chart-like, logical design -- evolution has
no foresight.
They can be designed with debugging aids -- evolution didn't need that.
These features combine to make systems easier to understand, debug, interface,
and -- importantly -- for multiple teams to simultaneously work on the design.
Conceptual design,
versus reverse-engineering. In addition to avoiding the shortcomings of
evolutionary techniques, there are also numerous advantages to designing and engineering
intelligent systems based on functional
requirements rather than trying to copy evolution's design of the brain. As
aviation has amply demonstrated, it is much easier to build planes than it is
to reverse-engineer birds -- much easier to achieve flight via thrust than
flapping wings.
Similarly, in creating artificial intelligence it makes
sense to capitalize on our human intellectual
and engineering strengths -- to ignore design parameters unique to
biological systems, instead of struggling to copy nature's designs. Designs explicitly engineered to achieve desired
functionality are much easier to understand, debug, modify, and enhance.
Furthermore, using known and existing technology allows us to best leverage
existing resources. So why limit ourselves to the single solution to
intelligence created by a blind, unconscious Watchmaker with his own agenda
(survival in an evolutionary environment very different from that of today)?
Intelligent machines
designed from scratch carry neither the evolutionary baggage, nor the
additional complexity for epigenesis, reproduction, and integrated self-repair
of biological brains. Obviously this doesn't imply that we can learn nothing
from studying brains, just that we don't have to limit ourselves to biological
feasibility in our designs. Our (currently) only working example of high-level
general intelligence (the brain) provides a crucial conceptual model of cognition, and can clearly inspire numerous
specific design features.
Here are some
desirable cognitive features that can be included in an AGI design that would
not (and in some cases, could not) exist in a reverse-engineered brain:
More effective control of neurochemistry ("emotional states")
Selecting the appropriate degree of logical thinking versus intuition
More effective control over focus and attention
Being able to learn instantly, on demand
Direct and rapid interfacing with databases, the Internet, and other machines
-- potentially having instant access to all available knowledge
Optional "photographic" memory and recall ("playback") on all senses!
Better control over remembering and forgetting (freezing important knowledge,
and being able to unlearn)
The ability to accurately backtrack and review thought and decision processes
(retrace and explore logic pathways)
Patterns, nodes and links can easily be tagged (labeled) and categorized
The ability to optimize the design for the available hardware instead of
being forced to conform to the brain's requirements
The ability to utilize the best existing algorithms and software techniques
-- irrespective of whether they are biologically plausible
Custom designed AGI (unlike brains) can have a simple speed/ capacity upgrade
path
The possibility of comprehensive integration with other AI systems (like
expert systems, robotics, specialized sense pre-processors, and problem solvers)
The ability to construct AGIs that are highly optimized for specific domains
Node, link, and internal parameter data is available as "input data" (full
introspection)
Design specifications are available (to the designer and to the AGI itself!)
Seed AI design: A machine can inherently be designed to more easily understand
and improve its own functioning -- thus bootstrapping intelligence to ever
higher levels.
General
proof-of-concept, versus specific real applications development. Applying
given resources to minimalist proof-of-concept designs improves the likelihood
of cutting a swift, direct path towards an ultimate goal. Having identified
high-level artificial general intelligence as our goal, it makes little sense
to squander resources on inessentials. In addition to focusing our efforts on
the ability to acquire knowledge
autonomously, rather than capturing or coding it, we further aim to speed
progress towards full AGI by reducing cost and complexity through --
Concentrating on proof-of-concept prototypes, not commercial performance.
This includes working at low data resolution and volume, and putting aside
optimization. Scalability is addressed only at a theoretical level, and not
necessarily implemented.
Working with radically-reduced sense and motor capabilities. The fact that
deaf, blind, and severely paralyzed people can attain high intelligence (Helen
Keller, Stephen Hawking) indicates that these are not essential to developing
AGI.
Coping with complexity through a willingness to experiment and implement
poorly understood algorithms -- i.e. using an engineering approach. Using
self-tuning feedback loops to minimize free parameters.
Not being sidetracked by attempting to match the performance of domain-specific
designs -- focusing more on how capabilities are achieved (e.g. learned
conceptualization, instead of programmed or manually specified concepts) rather
than raw performance.
Developing and testing in virtual environments, not physical implementations.
Most aspects of AGI can be fully evaluated without the overhead (time, money,
and complexity) of robotics.
Animal level
cognition, versus abstract thought, language, and formal logic. There is
ample evidence that achieving high-level cognition requires only modest
structural improvements over animal capability. Discoveries in cognitive
psychology point towards generalized
pattern processing being the foundational mechanism for all higher level
functioning. On the other hand, relatively small differences between higher
animals and humans are also witnessed by studies of genetics, the evolutionary
timetable, and developmental psychology.
The core challenge
of AGI is achieving the robust, adaptive conceptual learning ability of higher
primates or young children. If human level intelligence is the goal, then
pursuing robotics, language, or formal logic (at this stage) is a costly
sideshow -- whether motivated by misunderstanding the problem, or by commercial
or "political" considerations.
Summary.
While our project leans heavily on research done in many specialized disciplines,
it is one of the few efforts dedicated to integrating such interdisciplinary
knowledge with the specific goal of developing general artificial intelligence. We firmly believe that many of the
issues raised above are crucial to the early achievement of truly intelligent
adaptive learning systems.
4. Foundational Cognitive Capabilities
General intelligence requires a number of foundational
cognitive abilities. At a first approximation, it must be able to --
Remember and recognize patterns representing coherent features of reality
Relate such patterns by various similarities, differences, and associations
Learn and perform a variety of actions
Evaluate and encode feedback from a goal system
Autonomously adjust its system control parameters.
As mentioned earlier, this functionality must handle a very
wide variety of data types and characteristics (including temporal), and must
operate interactively, in real-time. The expanded description below is based on
our particular implementation; however, the features listed would generally be
required (in some form) in any
implementation of artificial general intelligence.
Pattern learning,
matching, completion, and recall. The primary method of pattern acquisition
consists of a proprietary adaptation of lazy learning (Aha 1997, Yip 1997). Our
implementation stores feature patterns (static and dynamic) with adaptive fuzzy
tolerances that subsequently determine how similar patterns are processed. Our
recognition algorithm matches patterns on a competitive winner-take-all basis,
as a set or aggregate of similar patterns, or by forced choice. It also offers
inherent support for pattern completion, and recall (where appropriate).
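The algorithmic details are proprietary, but the general flavor of lazy, instance-based learning with competitive winner-take-all matching can be suggested by a toy sketch (the tolerance value is an assumption):

    from math import sqrt

    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    class PatternStore:
        # Lazy learner: exemplars are stored verbatim and generalization
        # is deferred to match time (cf. Aha 1997).
        def __init__(self, tolerance=0.15):
            self.patterns = []          # (vector, label) pairs
            self.tolerance = tolerance  # adaptive in a real system

        def learn(self, vector, label):
            self.patterns.append((vector, label))

        def recognize(self, sensed):
            # Competitive winner-take-all: the nearest stored pattern wins,
            # provided it falls inside the fuzzy tolerance.
            vector, label = min(self.patterns, key=lambda p: dist(p[0], sensed))
            return label if dist(vector, sensed) <= self.tolerance else None

    store = PatternStore()
    store.learn((0.9, 0.3), "apple")
    print(store.recognize((0.85, 0.32)))   # -> "apple"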
Data accumulation and
forgetting. Because our system learns patterns incrementally, mechanisms are
needed for consolidating and pruning excess data. Sensed patterns (or
sub-patterns) that fall within a dynamically set noise/ error tolerance of
existing ones are automatically consolidated by a Hebbian-like mechanism that
we call "nudging." This algorithm also accumulates certain statistical
information. On the other hand, patterns that turn out not to be important (as
judged by various criteria) are deleted.
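"Nudging" itself is proprietary; the sketch below only illustrates the general idea of tolerance-based consolidation with usage counts that later inform pruning (all constants assumed):

    from math import sqrt

    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def consolidate(store, sensed, tolerance=0.1):
        # If the new pattern falls within tolerance of a stored one, merge
        # it in (running average) rather than storing a near-duplicate.
        for entry in store:
            if dist(entry["center"], sensed) <= tolerance:
                n = entry["count"]
                entry["center"] = tuple((c * n + s) / (n + 1)
                                        for c, s in zip(entry["center"], sensed))
                entry["count"] += 1      # statistic used when pruning
                return
        store.append({"center": sensed, "count": 1})

    def forget(store, min_count=2):
        # Delete patterns that never recurred (one crude "importance" test).
        store[:] = [e for e in store if e["count"] >= min_count]

    store = []
    consolidate(store, (0.90, 0.30))
    consolidate(store, (0.88, 0.31))   # merged into the first entry
    print(store[0]["count"])           # -> 2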
Categorization and
clustering. Vector-coded feature patterns are acquired in real-time and
stored in a highly adaptive network structure. This central self-organizing
repository automatically clusters data in hyper-dimensional vector-space. Our
matching algorithm's ability to recall patterns by any dimension provides
inherent support for flexible, dynamic categorization. Additional
categorization mechanisms facilitate grouping patterns by additional
parameters, associations, or functions.
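Recall by any dimension can be pictured as querying the store along a single feature axis; a toy version (dimension indices and tolerance assumed):

    def recall_by_dimension(store, dim, value, tolerance=0.1):
        # Retrieve every stored pattern whose value along one chosen
        # feature axis lies near the query value.
        return [p for p in store if abs(p[dim] - value) <= tolerance]

    store = [(0.90, 0.30, 0.70), (0.20, 0.30, 0.10), (0.90, 0.80, 0.50)]
    print(recall_by_dimension(store, dim=0, value=0.9))   # both "red" items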
Pattern hierarchies
and associations. Patterns of perceptual features do not stand in isolation
-- they are derived from coherent external reality. Encoding relationships between
patterns serves the crucial functions of added meaning, context, and
anticipation. Our system captures low-level, perception-driven pattern
associations such as: sequential or coincidental in time, nearby in space,
related by feature group or sense modality. Additional relationships are
encoded at higher levels of the network, including actuation layers. This
overall structure somewhat resembles the "dual network" described by Goertzel
(1993).
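In outline, such typed association links might look like this (the relation name and node contents are invented for illustration):

    class Node:
        def __init__(self, pattern):
            self.pattern = pattern
            self.links = []   # (relation, target, weight) triples

    def associate(a, b, relation, weight=1.0):
        # Typed, weighted links record how two patterns were related when
        # perceived: in time, in space, by feature group or sense modality.
        a.links.append((relation, b, weight))
        b.links.append((relation, a, weight))

    lightning = Node("lightning-flash")
    thunder = Node("thunder-sound")
    associate(lightning, thunder, relation="sequential_in_time")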
Pattern priming and
activation spreading. The core function of association links is to prime[4]
related nodes. This helps to disambiguate pattern matching, and to select contextual
alternatives. In the case where activation is particularly strong and
perceptual activity is low, stored patterns will be "recognized" spontaneously.
Both the scope and decay rate of such activation spreading are controlled
adaptively. These dynamics combine with the primary, perception-driven
activation to form the system's short-term memory.
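A minimal sketch of activation spreading with decay (the decay and cutoff values are fixed assumptions here; in our system they are controlled adaptively):

    class Node:
        def __init__(self):
            self.activation = 0.0
            self.links = []            # (target, weight) pairs

    def spread(node, energy, decay=0.5, floor=0.05):
        # Propagate activation along association links, attenuated by link
        # weight and a global decay; active nodes are "primed" and together
        # form a crude short-term memory.
        if energy < floor or node.activation >= energy:
            return                     # too weak, or already primed harder
        node.activation = energy
        for target, weight in node.links:
            spread(target, energy * weight * decay, decay, floor)

    a, b = Node(), Node()
    a.links.append((b, 1.0))
    spread(a, 1.0)
    print(b.activation)   # 0.5 -- b is primed for recognition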
Action patterns.
Adaptive action circuits are used to control parameters in the following three
domains:
1) Senses, including adjustable feature extractors, focus and
selection mechanisms
2) Output actuators for navigation and manipulation
3) Meta-cognition and internal controls.
Different action states and behaviors (action sequences)
for each of these control outputs can be created at design time (using a
configuration script) or acquired interactively. Real-time learning occurs
either by means of explicit teaching, or autonomously through random exploration.
Once acquired, these actions can be tied to specific perceptual stimuli or
whole contexts through various stimulus-response mechanisms. These S-R links
(both activation and inhibition) are dynamically modified through ongoing
reinforcement learning.
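These reinforcement dynamics can be caricatured as follows (the delta-rule update is an illustrative stand-in, not the actual proprietary mechanism):

    class SRLink:
        # Stimulus-response link whose strength is adjusted by ongoing
        # reinforcement; punishment drives it toward inhibition.
        def __init__(self, stimulus, response, strength=0.5):
            self.stimulus, self.response = stimulus, response
            self.strength = strength

        def reinforce(self, reward, rate=0.1):
            # Push strength toward 1 on success, toward 0 on failure.
            target = 1.0 if reward > 0 else 0.0
            self.strength += rate * (target - self.strength)

    link = SRLink("obstacle-ahead", "turn-left")
    link.reinforce(+1)    # behavior succeeded: strengthen the link
    link.reinforce(-1)    # behavior failed: weaken (inhibit) it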
Meta-cognitive
control. In addition to adaptive perception and action functionality, an
AGI design must also allow for extensive monitoring and control of overall
system parameters and functions. Any complex interactive learning system
contains numerous crucial control parameters such as noise tolerance, learning
and exploration rates, priorities and goal management, and a myriad others. Not
only must the system be able to adaptively control these many interactive
vectors, it must also appropriately manage its various cognitive functions
(such as recognition, recall, action, etc.). Our design deals with these
requirements by means of a highly adaptive introspection/ control "probe."
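As one concrete (and greatly simplified) example of such adaptive control, a feedback loop might tune the global noise tolerance toward a target recognition rate; all numbers here are assumptions:

    def tune_tolerance(tolerance, recognition_rate,
                       target=0.8, gain=0.05, lo=0.01, hi=0.5):
        # Widen the fuzzy tolerance when too few inputs are recognized;
        # narrow it when matching becomes indiscriminate.
        tolerance += gain * (target - recognition_rate)
        return max(lo, min(hi, tolerance))

    tol = 0.10
    for rate in (0.40, 0.55, 0.70, 0.85):   # observed recognition rates
        tol = tune_tolerance(tol, rate)
    print(tol)   # tolerance drifts up while recognition is below target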
High-level
intelligence. Our AGI model posits that no additional foundational functions are necessary for higher-level cognition.
Abstract thought, language, and logical thinking are all elaborations of core
abilities. This controversial point is taken up again below.
5. An AGI in the making
The functional prototype currently under development at
Adaptive A.I. Inc. aims to embody all the abovementioned choices, requirements,
and features. Our development path is as follows:
1) Development framework
2) Memory core and interface structure
3) Individual foundational cognitive components
4) Integrated low-level cognition
5) Increasing level of functionality.
The software comprises an AGI engine framework with the
following basic components:
A set of pluggable, programmable (virtual) sensors and actuators (called
"probes")
A central pattern store/ engine including all data and cognitive algorithms
A configurable, dynamic 2D virtual world, plus various training and diagnostic
tools.
The AGI engine design is based on, and embodies insights
from a wide range of research in cognitive science -- including computer
science, neuroscience, epistemology (Rand 1990, Kelley 1986), and psychology
(Margolis 1987). Particularly strong influences include: embodied systems
(Brooks 1994), vector encoded representation (Churchland 1995), adaptive
self-organizing neural nets (esp. Growing Neural Gas, Fritzke 1995),
unsupervised and self-supervised learning, perceptual learning (Goldstone 1998), and fuzzy logic
(Kosko 1997).
While our design
includes several novel, and proprietary algorithms, our key innovation is the
particular selection and integration of established technologies and prior
insights.
AGI Engine Architecture & Design Features
Our AGI engine (which provides this foundational cognitive ability) can logically
be divided into three parts:
Cognitive core
Control/ interface logic
Input/ output probes
This "situated agent architecture" reflects the importance
of having an AGI system that can dynamically and adaptively interact with the
environment. From a theory-of-mind perspective it acknowledges both the crucial
need for concept grounding (via senses), plus the absolute need for
experiential, self-supervised learning.
The components listed below have been specifically designed
with features required for adaptive general intelligence in (ultimately) real
environments. Among other things, they deal with a great variety and volume of
static and dynamic data, cope with fuzzy and uncertain data and goals, foster
coherent integrated representations of reality, and -- most of all -- promote
adaptivity.
Cognitive Core:
This is the central repository of all static and dynamic data patterns --
including all learned cognitive and behavioral states and sequences. All data
is stored in a single, integrated node-link structure. One design innovation is
the specific encoding of pattern "fuzziness" (in addition to other attributes). The
core allows for several node/ link types with differing dynamics to help define
the network's cognitive structure.
The network's topology is dynamically self-organizing -- a
feature inspired by "Growing Neural Gas" design (Fritzke 1995). This allows
network density to adjust to actual data feature and/ or goal requirements.
Various adaptive local and global parameters further define network structure
and dynamics in real time.
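A drastically simplified flavor of the Growing Neural Gas idea (after Fritzke 1995; edge creation and edge aging are omitted, and all constants are illustrative):

    from math import sqrt

    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def move(w, x, eps):
        return tuple(wi + eps * (xi - wi) for wi, xi in zip(w, x))

    def gng_step(units, x, eps=0.2):
        # The best-matching unit accumulates error and moves toward the
        # input, so unit density comes to track data density.
        winner = min(units, key=lambda u: dist(u["w"], x))
        winner["error"] += dist(winner["w"], x) ** 2
        winner["w"] = move(winner["w"], x, eps)

    def grow(units, jitter=0.05):
        # Periodically insert a new unit beside the worst-fitting one
        # (full GNG places it between that unit and its worst neighbor).
        worst = max(units, key=lambda u: u["error"])
        units.append({"w": tuple(wi + jitter for wi in worst["w"]),
                      "error": 0.0})
        worst["error"] *= 0.5

    units = [{"w": (0.2, 0.2), "error": 0.0}, {"w": (0.8, 0.8), "error": 0.0}]
    for x in [(0.1, 0.3), (0.9, 0.7), (0.85, 0.9)]:
        gng_step(units, x)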
Control and Interface
Logic: An overall control system coordinates the network's execution cycle,
drives various cognitive and housekeeping algorithms, and controls/ adapts
system parameters. Via an Interface Manager, it also communicates data and
control information to and from the probes.
Probes: The
Interface Manager provides for dynamic addition and configuration of probes.
Key design features of the probe architecture include the ability to have programmable
feature extractors, variable data resolution, and focus & selection mechanisms.
Such mechanisms for data selection are imperative for general intelligence:
even moderately complex environments have a richness of data that far exceeds
any system's ability to usefully process.
The system handles a very wide variety of data types and
control signal requirements -- including those for visual, sound, and raw data
(e.g., database, internet, keyboard), as well as various output actuators. A
novel "system probe" provides the system with monitoring and control of its
internal states (a form of meta-cognition). Additional probes -- either custom
interfaces with other systems or additional real-world sensors/ actuators -- can
easily be added to the system.
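In spirit, the probe architecture resembles a plug-in interface of the following kind (all names here are hypothetical; the actual probe API is not published):

    from abc import ABC, abstractmethod

    class Probe(ABC):
        # Pluggable sensor/actuator: the engine sees only feature data plus
        # a uniform way to steer focus, selection, and resolution.
        @abstractmethod
        def sense(self):
            """Return the current feature data."""

        @abstractmethod
        def configure(self, focus, resolution):
            """Adjust what is sampled, and at what level of detail."""

    class TextProbe(Probe):
        def __init__(self, source):
            self.source, self.focus, self.resolution = source, 0, 1

        def sense(self):
            return self.source[self.focus:self.focus + self.resolution]

        def configure(self, focus, resolution):
            self.focus, self.resolution = focus, resolution

    probes = {"text": TextProbe("raw sense data")}   # Interface Manager registry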
Development
Environment/ Language/ Hardware. The complete AGI engine plus associated
support programs are implemented in (Object Oriented) C# under Microsoft's .NET
framework. The system is designed for optional remoting of various components,
thus allowing for some distributed processing. Current tests show that
practical (proof-of-concept) prototype performance can be achieved on a single,
conventional PC (2 GHz, 512 MB). Even a non-performance-tuned implementation
can process several complex patterns per second on a database of well over a
million stored features.
6. From Algorithms to General Intelligence
This section covers some of our near-term research and
development; it aims to illustrate our expected path toward meaningful general
intelligence. While this work barely approaches higher-level animal cognition (exceeding it in some
aspects, but falling far short in others such as sensory-motor skills), we take
it to be a crucial step in proving the validity and practicality of our model.
Furthermore, the actual functionality achieved should be highly competitive, if
not unique, in applications where significant autonomous adaptivity and data
selection, lack of brittleness, dynamic pattern processing, flexible actuation,
and self-supervised learning are central requirements.
General intelligence doesn't comprise one single, brilliant
knock-out invention or design feature; instead, it emerges from the synergetic
integration of a number of essential fundamental components. On the structural
side, the system must integrate sense inputs, memory, and actuators, while on
the functional side various learning, recognition, recall and action
capabilities must operate seamlessly on a wide range of static and dynamic
patterns. In addition, these cognitive abilities must be conceptual and
contextual -- they must be able to generalize knowledge, and interpret it
against different backgrounds.
A key milestone in our project is testing the integrated functionality of the basic
cognitive components within our overall AGI framework. A number of
custom-developed, highly-configurable test utilities are used to test the
cohesive functioning of the whole system. This automated training and
evaluation is supplemented by manual experimentation in numerous different
environments and applications. Experience gained by these tests helps to refine
the complex dynamics of interacting algorithms and parameters.
One of the general difficulties with AGI development is
determining absolute measures of success. Part of the reason is that this field
is still nascent, and thus no agreed definitions -- let alone tests or measures --
of low-level general intelligence exist. As we proceed with our project we expect
to develop ever more effective protocols and metrics for assessing cognitive
ability. Our system's performance evaluation is guided by this description:
"General intelligence comprises the ability to acquire (and adapt) the
knowledge and skills required for achieving a wide range of goals in a variety
of domains."
In this context, "acquisition" includes all of the
following: automatic, via sense inputs (feature/ data driven); explicitly
taught; discovered through exploration or experimentation; internal processes
(e.g., association, categorization, statistics, etc.).
"Adaptation" implies that new knowledge is integrated
appropriately.
"Knowledge and skills" refer to all kinds of data and
abilities (states and behaviors) that the system acquires for the short or long
term.
Our initial protocol for evaluating AGIs aims to cover a
wide spectrum of domains and goals by simulating sample applications in 2D
virtual worlds. In particular, these tests should assess the degree to which
the foundational abilities operate as an integrated, mutually supportive whole
-- and without programmer intervention! Here are three examples:
Sample Test Domains for Initial Performance Criteria
Adaptive Security
Monitor. This system scans video monitors and alarm panels that oversee a
secure area (say, factory, office building, etc.), and responds appropriately
to abnormal conditions. Note, this is somewhat similar to a site monitoring
application at MIT (Grimson 1998).
This simulation calls for a visual environment that contains
a lot of detail but has only limited dynamic activity -- this is its normal
state (green). Two levels of abnormality exist: (i) minor, or known disturbance
(yellow); (ii) major, or unknown disturbance (red).
The system must initially learn the normal state by simple
exposure (automatically scanning the environment) at different resolutions
(detail). It must also learn "yellow" conditions by being shown a number of
samples (some at high resolution). All other states must output "red."
Standard operation is to continuously scan the environment
at low resolution. If any abnormal condition is detected the system must learn
to change to higher resolution in order to discriminate between "yellow" and
"red."
The system must adapt to changes in the environment (and
totally different environments) by simple exposure training.
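The intended control logic can be sketched as follows (the scene encoding and tolerance are illustrative stand-ins; in the actual test these states are learned by exposure, not listed by a programmer):

    def classify(scene, normals, yellows, tol=0.1):
        # "Green" if near a learned normal state, "yellow" if near a
        # learned minor disturbance, otherwise "red".
        near = lambda a, b: all(abs(x - y) <= tol for x, y in zip(a, b))
        if any(near(scene, n) for n in normals):
            return "green"
        if any(near(scene, y) for y in yellows):
            return "yellow"
        return "red"

    def monitor_step(scan, normals, yellows):
        # Scan cheaply at low resolution; escalate to high resolution only
        # when something deviates from the learned normal state.
        if classify(scan("low"), normals, yellows) == "green":
            return "green"
        return classify(scan("high"), normals, yellows)

    scan = lambda res: (0.50, 0.50) if res == "low" else (0.52, 0.48)
    print(monitor_step(scan, normals=[(0.5, 0.5)], yellows=[(0.7, 0.3)]))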
Sight Assistant.
The system controls a movable "eye" (by voice command) that enables the
identification (by voice output) of at least a hundred different objects in the
world. A trainer will dynamically teach the system new names, associations, and
eye movement commands.
The visual probe can select among different scenes
(simulating rooms) and focus on different parts of each scene. The scenes
depict objects of varying attributes: color, size, shape, various dynamics,
etc. (and combinations of these), against different backgrounds.
Initial training will be to attach simple sound commands to
maneuver the "eye," and to associate word labels with selected objects. The
system must then reliably execute voice commands and respond with appropriate
identification (if any). Additional functionality could be to have the system
scan the various scenes when idle, and to automatically report selected
important objects.
Object identification must cover a wide spectrum of
different attribute combinations and tolerances. The system must easily learn
new scenes, objects, words and associations, and also adapt to changes in any
of these variables.
Maze Explorer. A
(virtual) entity explores a moderately complex environment. It discovers what
types of objects aid or hinder its objectives, while learning to navigate this
dynamic world. It can also be trained to perform certain behaviors.
The virtual world is filled with a great number of different
objects (see previous example). In addition, some of these objects move in
space at varying speeds and dynamics, and may be solid and/ or immovable.
Groups of different kinds of objects have pre-assigned attributes that indicate
negative or positive. The AGI engine controls the direction and speed of an
entity in this virtual world. Its goal is to learn to navigate around immovable
and negative objects to reliably reach hidden positives.
The system can also be trained to respond to operator
commands to perform behaviors of varying degrees of complexity (for example,
actions similar to "tricks" one might teach a dog). This "Maze Explorer" can easily
be set up to deal with fairly complex tasks.
Clearly, the tasks described above do not by themselves
represent any kind of breakthrough in artificial intelligence research. They
have been achieved many times before. However, what we do believe to be significant and unique is the achievement of these
various tasks without any task-specific programming or parameterization. It is
not what is being done, but how it is done.
Development beyond these basic proof-of-concept tests will
advance in two directions: 1) to significantly increase resolution, data
volume, and complexity in applications similar to the tests; 2) to add
higher-level functionality. In addition to work aimed at further developing and
proving our general intelligence model, there are also numerous practical
enhancements that can be done. These would include implementing multi-processor
and network versions, and integrating our system with databases or with other
existing AI technology such as expert systems, voice recognition, robotics, or
sense modules with specialized feature extractors.
By far the most important of these future developments
concern higher-level ability. Here is a partial list of action items, all of
which are derived from lower-level foundations:
Spread activation and retain context over extended period
Support more complex internal temporal patterns, both for enhanced recognition
and anticipation, and for cognitive and action sequences
Internal activation feedback for processing without input
Deduction, achieved through selective concept activation
Advanced categorization by arbitrary dimensions
Learning of more complex behavior
Abstract and merged concept formation
Structured language acquisition
Increased awareness and control of internal states (introspection)
Learning logic and other problem-solving methodologies.
7. Other Research
Many different approaches to AI exist; some of the
differences are straightforward while others are subtle and hinge on difficult
philosophical issues. As such, the exact placement of our work relative to that
of others is difficult and, indeed, open to debate. Our view that "intelligence
is a property of an entity that engages in two-way interaction with an external
environment" technically puts us in the area of "agent systems" (Russell 1995).
However, our emphasis on a connectionist rather than classical approach to
cognitive modeling, places our work in the field of "embodied cognitive
science." (See Pfeifer and Scheier 1999 for a comprehensive overview.)
While our approach
is similar to other research in embodied cognitive science, in some respects
our goals are substantively
different. A key difference is our belief that a core set of cognitive
abilities working together is sufficient to produce general intelligence. This
is in marked contrast to others in embodied cognitive science who consider
intelligence to be necessarily specific to a set of problems within a given
environment. In other words, they believe that autonomous agents always exist
in ecological niches. As such they focus their research on building very
limited systems that effectively deal with only a small number of problems
within a specific limited environment. Almost all work in the area follows this
pattern -- see Braitenberg (1984), Brooks (1994) or Arbib (1992) for just a few
well-known examples. Their stance contradicts the fact that humans possess general
intelligence; we are able to effectively deal with a wide range of problems
that are significantly beyond anything that could be called our "ecological
niche."
Perhaps the closest project to ours that is strictly in the
area of embodied cognitive science is the Cog project at MIT (Brooks 1993). The
project aims to understand the dynamics of human interaction by the
construction of a human-like robot complete with upper torso, a head, eyes,
arms and hands. While this project is significantly more ambitious than other
projects in terms of the level and complexity of the system's dynamics and
abilities, the system is still essentially niche focused (elementary human
social and physical interaction) when compared to our own efforts at general
intelligence.
Probably the closest work to ours in the sense that it also
aims to achieve general rather than niche intelligence is the Novamente project
under the direction of Ben Goertzel. (The project was formerly known as Webmind
-- see Goertzel 1997, 2001.) Novamente relies on a hybrid of low-level neural
net-like dynamics for activation spreading and concept priming, coupled with
high-level semantic constructs to represent a variety of logical, causal and
spatial-temporal relations. While the semantics of the system's internal state
are relatively easy to understand compared to a strictly connectionist
approach, the classical elements in the system's design open the door to many
of the fundamental problems that have plagued classical AI over the last fifty
years. For example, high-level semantics require a complex meta-logic contained
in hard-coded high-level reasoning and other high-level cognitive systems.
These high-level systems contain significant implicit semantics that may not be
grounded in environmental interaction but are rather hard-coded by the designer
-- thus causing symbol grounding problems (Harnad 1990). The relatively fixed,
high-level methods of knowledge representation and manipulation that this
approach entails are also prone to the "frame problem" (McCarthy and Hayes 1969;
Pylyshyn 1987) and to "brittleness" problems. In a strictly embodied cognitive science approach,
as we have taken, all knowledge is derived from agent-environment interaction
thus avoiding these long-standing problems of classical AI.
Andy Clark (1997) is another
researcher whose model closely resembles our own, but there are no
implementations specifically based on his theoretical work. Igor Aleksander's
(now dormant) MAGNUS project (1996) also incorporated many key AGI concepts
that we have identified, but it was severely limited by a classical AI,
finite-state machine approach. Valeriy Nenov and Michael Dyer of UCLA (1994)
used "massively" parallel hardware (a CM-2 Connection Machine) to implement a
virtual, interactive perceptual design close to our own, but with a more rigid,
pre-programmed structure. Unfortunately, this ambitious, ground-breaking work
has since been abandoned. The project was probably severely hampered by limited
(at the time) hardware.
Moving further away from embodied cognitive science to
purely classical research in general intelligence, perhaps the best known
system is the Cyc project being pursued by Lenat (1990). Essentially Lenat sees
general intelligence as being "common sense." He hopes to achieve this goal by
adding many millions of facts about the world into a huge database. After many
years of work and millions of dollars in funding there is still a long way to
go as the sheer number of facts that humans know about the world is truly staggering.
We doubt that a very large database of basic facts is enough to give a computer
much general intelligence -- the mechanisms for autonomous knowledge acquisition
are missing. Being a classical approach, Cyc also suffers from the
fundamental problems of classical AI listed above -- for example, the symbol
grounding problem: if facts about cats and dogs are just added to a
database that the computer can use even though it has never seen or interacted
with an animal, are those concepts really meaningful to the system? While his
project also claims to pursue "general intelligence," it is really very
different from our own, both in its approach and in the difficulties it faces.
Analysis of AI's ongoing failure to overcome its
long-standing limitations reveals that it is not so much that Artificial
General Intelligence has been tried and that it has failed, but rather that the
field has largely been abandoned -- be it for theoretical, historic, or commercial
reasons. Certainly, our particular type of approach, as detailed in previous sections,
is receiving scant attention.
8. Fast-track AGI -- Why so Rare?
Widespread application of AI has been hampered by a number
of core limitations that have plagued the field since the beginning, namely:
The expense and delay of custom programming individual applications
Systems' inability to automatically learn from experience, or to be user
teachable/ trainable
Reliability and performance issues caused by "brittleness" (the inability
of systems to automatically adapt to changing requirements, or data outside
of a predefined range)
Their limited intelligence and common sense.
The most direct path to solving these long-standing problems is to conceptually identify the fundamental characteristics common to all high-level intelligence, and to engineer systems with this basic functionality, in a manner that capitalizes on human and technological strengths.
General intelligence is the key to achieving robust
autonomous systems that can learn and adapt to a wide range of uses. It is also
the cornerstone of self-improving, or Seed AI -- using basic abilities to
bootstrap higher-level ones. This essay identified foundational components of
general intelligence, as well as crucial considerations particular to the
effective development of the artificial variety. It highlighted the fact that
very few researchers are actually following this most direct route to AGI.
If the approach outlined above is so promising, then why has it received so little attention? Why is hardly anyone actually working on it?
A short answer: Of all the people working in the field called "AI":
80% don't believe in the concept of General Intelligence (believing instead in a large collection of specific skills and knowledge)
Of those that do, 80% don't believe that artificial, human-level intelligence is possible -- either ever, or for a long, long time
Of those that do, 80% work on domain-specific AI projects for commercial or academic-political reasons (results are more immediate)
Of those left, 80% have a poor conceptual framework...
Even though the above is a caricature, it contains more than a grain of truth.
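Taken at face value, the caricature compounds quickly. A back-of-the-envelope calculation (ours, purely illustrative) shows why the pool of fast-track AGI researchers ends up so small when each filter removes 80% of those remaining:

    # Each stage of the caricature retains only 20% of the remaining researchers.
    fraction = 1.0
    stages = ["believe in general intelligence",
              "believe human-level AI is achievable",
              "work on non-domain-specific projects",
              "have a sound conceptual framework"]
    for stage in stages:
        fraction *= 0.2
        print(f"{fraction:.4%} of AI researchers {stage}")
    # After four filters: 0.2**4 = 0.0016, i.e. about 0.16% -- roughly 1 in 600.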
A great number of researchers reject the validity or
importance of "general intelligence." For many, controversies in psychology
(such as those stoked by The Bell Curve)
make this an unpopular, if not taboo, subject. Others, conditioned by decades of domain-specific work, simply do not see the benefits of Seed AI -- of solving the problems only once.
Of those who do not object to general intelligence in principle, many don't believe that AGI is possible -- in their lifetime, or
ever. Some hold this position because they themselves tried and failed "in
their youth." Others believe that AGI is not the best approach to achieving "AI," or are at a total loss on how to
go about it. Very few researchers have actually studied the problem from our
(the general intelligence/ Seed AI) perspective. Some are actually trying to
reverse-engineer the brain -- one function at a time. There are also those who
have moral objections, or who are afraid of it.
Of course, a great many are so focused on particular, narrow
aspects of intelligence that they simply don't get around to looking at the big
picture -- they leave it to others to make it happen. It is also important to
note that there are often strong financial and institutional pressures to
pursue specialized AI.
All of the above combine to create a dynamic where Real AI
is not "fashionable" -- getting little respect, funding, and support -- further
reducing the number of people drawn into it!
These should be more than enough reasons to account for the
dearth of AGI progress. But it gets worse. Researchers actually trying to build
AGI systems are further hampered by a myriad of misconceptions, poor choices,
and lack of resources (funding and research). Many of the technical issues were
explored previously (see Sections 3 and 7), but a few others are worth
mentioning:
Epistemology.
Models of AGI can only be as good as their underlying theory of knowledge -- the
nature of knowledge, and how it relates to reality. The realization that
high-level intelligence is based on conceptual
representation of reality underpins design decisions such as adaptive, fuzzy
vector encoding, and an interactive, embodied approach. Other consequences are
the need for sense-based focus and selection, and contextual activation. The
central importance of a highly integrated pattern network -- especially one including dynamic patterns -- becomes obvious once one understands the relationship between entities, attributes, concepts, actions, and thoughts. These and several other insights lay the foundation for solving problems related to grounding, brittleness, and common sense. Finally, there is still a great deal of unnecessary confusion about the relationship between concepts and symbols. A dynamic that continues to handicap AI is the lingering schism between traditionalists and connectionists, which unfortunately helps to perpetuate a false dichotomy between explicit symbols/ schemas and incomprehensible patterns.
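To make the notion of adaptive, fuzzy vector encoding less abstract, here is a minimal sketch (our own illustration; the encoding actually used in our system is proprietary and considerably richer). It treats an entity as a point in feature space and a concept as a fuzzy region around a prototype, so that concept membership is graded rather than a brittle yes/no symbol:

    from dataclasses import dataclass

    @dataclass
    class Concept:
        """A concept as a prototype vector with per-feature tolerances."""
        name: str
        prototype: list  # central feature values, each in [0, 1]
        tolerance: list  # allowed deviation before membership reaches 0

        def membership(self, features):
            # Per-feature fuzzy membership: 1 at the prototype, falling
            # linearly to 0 at the tolerance boundary.
            degrees = [max(0.0, 1.0 - abs(f - p) / t)
                       for f, p, t in zip(features, self.prototype, self.tolerance)]
            # Fuzzy AND: the weakest feature limits overall membership.
            return min(degrees)

    # Hypothetical features: size, furriness, agility (all normalized).
    cat = Concept("cat", prototype=[0.2, 0.9, 0.7], tolerance=[0.3, 0.4, 0.5])
    print(cat.membership([0.25, 0.8, 0.6]))  # ~0.75: a graded, not symbolic, match

Because membership is continuous, prototypes and tolerances can be nudged by experience -- the "adaptive" part -- and partially matching concepts can be contextually activated instead of failing outright.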
Theory of Mind.
Another area of concern is sloppy formulation and poor understanding of several
key concepts: consciousness, intelligence, volition, meaning, emotions, common
sense, and "qualia." The fact that hundreds of AI researchers attend conferences every year where key speakers proclaim that "we don't understand consciousness (or qualia, or whatever), and will probably never understand it" indicates just how pervasive this problem is.
Marvin Minsky's characterization of consciousness as a "suitcase word"[6] is correct. Let's just unpack it!
Errors like these are often behind research going off at a
tangent relative to stated long-term goals. Two examples are an undue emphasis
on biological feasibility, and the belief that embodied intelligence cannot be virtual but must be implemented in physical robots.
Cognitive psychology.
It goes without saying that a proper understanding of the concept "intelligence"
is key to engineering it. In addition to epistemology, several areas of cognitive
psychology are crucial to unraveling its meaning. Misunderstanding intelligence
has led to some costly disappointments, such as manually accumulating huge
amounts of largely useless data (knowledge without meaning), efforts to achieve
intelligence by combining masses of dumb agents, or trying to obtain meaningful
conversation from an isolated network of symbols.
Project focus.
The few projects that do pursue AGI
based on relatively sound models run yet another risk: they can easily lose
focus. Sometimes commercial considerations hijack a project's direction, while
others get sidetracked by (relatively) irrelevant technical issues, such as
trying to match an unrealistically high level of performance, fixating on
biological feasibility of design, or attempting to implement high-level
functions before their time. A clearly mapped-out developmental path to
human-level intelligence can serve as a powerful antidote to losing sight of
"the big picture." A vision of how to get from "here" to "there" also helps to
maintain motivation in such a difficult endeavor.
Research support.
AGI utilizes -- or, more precisely, integrates -- a large number of existing AI technologies. Unfortunately, many of the most crucial areas are
sadly under-researched. They include:
Incremental, real-time, unsupervised/ self-supervised learning (vs. back-propagation)
Integrated support for temporal patterns
Dynamically-adaptive neural network topologies
Self-tuning of system parameters, integrating bottom-up (data driven) and
top-down (goal/ meta-cognition driven) auto-adaptation
Sense probes with auto-adaptive feature extractors.
Naturally, these very limitations feed back to reduce
support for AGI research.
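As one example of what these under-researched areas could look like in practice, here is a heavily simplified sketch in the spirit of Fritzke's (1995) growing neural gas: a network that adapts its own topology, adding units where incoming data is poorly covered, and learning incrementally without supervision. (Our illustration only, with invented names such as GrowingCodebook; the full algorithm also maintains edges between units, edge aging, and accumulated-error statistics.)

    import random

    class GrowingCodebook:
        def __init__(self, dim, spawn_threshold=0.3, learn_rate=0.05):
            # Start with a single randomly placed unit.
            self.units = [[random.random() for _ in range(dim)]]
            self.spawn_threshold = spawn_threshold
            self.learn_rate = learn_rate

        def observe(self, x):
            # Find the best-matching unit for this input.
            best = min(self.units,
                       key=lambda u: sum((a - b) ** 2 for a, b in zip(u, x)))
            distance = sum((a - b) ** 2 for a, b in zip(best, x)) ** 0.5
            if distance > self.spawn_threshold:
                # Poorly covered region: grow a new unit at the input.
                self.units.append(list(x))
            else:
                # Well-covered region: nudge the winner toward the input.
                for i, v in enumerate(x):
                    best[i] += self.learn_rate * (v - best[i])

    net = GrowingCodebook(dim=2)
    for _ in range(1000):
        net.observe([random.random(), random.random()])
    print(len(net.units), "units grown to cover the input space")

The design point is that the architecture is an output of learning, not an input to it -- in contrast to fixed-topology networks trained by back-propagation.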
Cost and difficulty.
Achieving high-level AGI will be hard. However, it will not be nearly as
difficult as most experts think. A key element of "Real AI" theory (and its
implementation) is to concentrate on the essentials of intelligence. Seed AI
becomes a manageable problem -- in some respects much simpler than other
mainstream AI goals -- by eliminating huge areas of difficult but inessential
AI complexity. Once we get the crucial fundamental functionality working, much
of the additional "intelligence" (ability) required is taught or learned, not
programmed. Having said this, I do believe that very substantial resources will
be required to scale up the system to human-level storage and processing
capacity. However, the far more moderate initial prototypes will serve as
proof-of-concept for AGI while potentially seeding a large number of practical
new applications.
9. Conclusion
Understanding general intelligence and identifying its
essential components are key to building next-generation AI systems -- systems
that are far less expensive, yet significantly more capable. In addition to
concentrating on general learning
abilities, a fast-track approach should also seek a path of least resistance --
one that capitalizes on human engineering strengths and available technology.
Sometimes, this involves selecting the AI road less traveled.
We believe that the theoretical model, cognitive components,
and framework described above, joined with our other strategic design decisions
provide a solid basis for achieving practical AGI capabilities in the
foreseeable future. Successful implementation will significantly address many
traditional problems of AI. Potential benefits include:
Minimizing initial environment-specific programming (through self-adaptive
configuration)
Substantially reducing ongoing software changes, because a large amount
of additional functionality and knowledge will be acquired autonomously via
self-supervised learning
Greatly increasing the scope of applications, as users teach and train
additional capabilities
Improving flexibility and robustness, as systems adapt to changing data patterns, environments, and goals.
AGI promises to make an important contribution toward realizing software and
robotic systems that are more usable, intelligent, and human-friendly. The time
seems ripe for a major initiative down this new path of human advancement that
is now open to us.
References
Aha, D.W. (Ed.) (1997). Lazy Learning. Artificial Intelligence Review, 11(1-5). Kluwer Academic Publishers.
Aleksander, I. (1996). Impossible Minds. Imperial College Press.
Arbib, M.A. (1992). Schema theory. In S.C. Shapiro (Ed.), Encyclopedia of Artificial Intelligence, 2nd ed. (pp. 1427-1443). John Wiley.
Braitenberg, V. (1984). Vehicles: Experiments in synthetic psychology. MIT Press.
Brooks, R.A., and Stein, L.A. (1993). Building brains for bodies. Memo 1439, Artificial Intelligence Lab, Massachusetts Institute of Technology.
Brooks, R.A. (1994). Coherent behavior from many adaptive processes. In D. Cliff, P. Husbands, J.A. Meyer, and S.W. Wilson (Eds.), From animals to animats: Proceedings of the Third International Conference on Simulation of Adaptive Behavior (pp. 421-430). MIT Press.
Churchland, P.M. (1995). The Engine of Reason, the Seat of the Soul: A Philosophical Journey into the Brain. MIT Press.
Clark, A. (1997). Being There: Putting Brain, Body and World Together Again. MIT Press.
Fritzke, B. (1995). A growing neural gas network learns topologies. In G. Tesauro, D.S. Touretzky, and T.K. Leen (Eds.), Advances in Neural Information Processing Systems 7 (pp. 625-632). MIT Press.
Goertzel, B. (1997). From complexity to creativity: Explorations in evolutionary, autopoietic, and cognitive dynamics. Plenum Press.
Goertzel, B. (2001). Creating internet intelligence: Wild computing, distributed digital consciousness, and the emerging global brain. Plenum Press.
Goldstone, R.L. (1998). Perceptual Learning. Annual Review of Psychology, 49, 585-612.
Gottfredson, L.S. (1998). The general intelligence factor. [Special Issue]. Scientific American, 9(4), 24-29.
Grimson, W.E.L., Stauffer, C., Lee, L., and Romano, R. (1998). Using Adaptive Tracking to Classify and Monitor Activities in a Site. Proc. IEEE Conf. on Computer Vision and Pattern Recognition (pp. 22-31).
Harnad, S. (1990). The symbol grounding problem. Physica D, 42, 335-346.
Kelley, D. (1986). The Evidence of the Senses. Louisiana State University Press.
Kosko, B. (1997). Fuzzy Engineering. Prentice Hall.
Lenat, D.B., and Guha, R.V. (1990). Building Large Knowledge Based Systems. Addison-Wesley.
Margolis, H. (1987). Patterns, Thinking, and Cognition: A Theory of Judgment. University of Chicago Press.
McCarthy, J., and Hayes, P.J. (1969). Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence, 4, 463-502.
Nenov, V.I., and Dyer, M.G. (1994). Language Learning via Perceptual/Motor Association: A Massively Parallel Model. In H. Kitano and J.A. Hendler (Eds.), Massively Parallel Artificial Intelligence (pp. 203-245). AAAI Press/The MIT Press.
Pfeifer, R., and Scheier, C. (1999). Understanding intelligence. MIT Press.
Pylyshyn, Z.W. (Ed.) (1987). The Robot's Dilemma: The frame problem in A.I. Ablex.
Rand, A. (1990). Introduction to Objectivist Epistemology. Meridian.
Russell, S.J., and Norvig, P. (1995). Artificial Intelligence: A modern approach. Prentice Hall.
Wang, P. (1995). Non-axiomatic reasoning system: Exploring the essence of intelligence. PhD thesis, Indiana University.
Yip, K., and Sussman, G.J. (1997). Sparse Representations for Fast, One-shot Learning. Proc. of National Conference on Artificial Intelligence, July 1997.
Footnotes
[1] Intellectual property is owned by Adaptive A.I. Inc.
[2] "Brittleness" in AI refers to a system's inability to automatically adapt to changing requirements, or to cope with data outside of a predefined range -- thus "breaking."
[3] Back-propagation is one of the most powerful supervised training algorithms; it is, however, not particularly amenable to incremental learning.
[4] "Priming," as used in psychology, refers to an increase in the speed or accuracy of a decision that occurs as a consequence of prior exposure or activation.
[5] This section was co-authored with Shane Legg.
[6] Meaning that many different meanings are thrown together in a jumble -- or at least packaged together in one "box," under one label.