1. Introduction
This paper explores the concept of ‘artificial general intelligence’ (AGI)
– its nature, importance, and how best to achieve it. Our[1]
theoretical model posits that general intelligence comprises a limited number
of distinct, yet highly integrated, foundational functional components.
Successful implementation of this model will yield a highly adaptive,
general-purpose system that can autonomously acquire an extremely wide range of
specific knowledge and skills. Moreover, it will be able to improve its own
cognitive ability through self-directed learning. We believe that, given the
right design, current hardware/ software technology is adequate for engineering
practical AGI systems. Our current implementation of a functional prototype is
described below.
The idea of ‘general intelligence’ is quite
controversial; I do not substantially engage with this debate here but rather take
the existence of such non-domain-specific abilities as a given (Gottfredson 1998). It must also be noted that
this essay focuses primarily on low-level (i.e. roughly animal level) cognitive
ability. Higher-level functionality, while an integral part of our model, is
only addressed peripherally. Finally, certain algorithmic details are omitted
for reasons of proprietary ownership.
2. General Intelligence
Intelligence can be defined simply as an entity’s
ability to achieve goals – with greater intelligence coping with more
complex and novel situations. Complexity ranges from the trivial –
thermostats and mollusks (that in most contexts don’t even justify the
label ‘intelligence’) – to the fantastically complex:
autonomous flight control systems and humans.
Adaptivity, the ability to deal
with changing and novel requirements, also covers a wide spectrum: from rigid,
narrowly domain-specific to highly flexible, general purpose. Furthermore, flexibility
can be defined in terms of scope and permanence – how much, and how
often it changes. Imprinting is an example of limited scope and high permanence,
while innovative, abstract problem solving is at the other end of the spectrum.
While entities with high adaptivity and flexibility
are clearly superior – they can potentially learn to achieve any possible
goal – there is a hefty efficiency price to be paid. For example, had
Deep Blue also been designed to learn language, direct airline traffic, and do
medical diagnosis, it would not have become Chess champion (all other things
being equal).
General Intelligence comprises the essential,
domain-independent skills necessary for acquiring a wide range of domain-specific
knowledge (data & skills) – i.e. the ability to learn anything (in
principle). More specifically, this learning ability needs to be autonomous,
goal-directed, and highly adaptive:
·
Autonomous – Learning occurs both automatically,
through exposure to sense data (unsupervised), and through bi-directional
interaction with the environment, including exploration/ experimentation
(self-supervised).
·
Goal-directed – Learning is directed
(autonomously) towards achieving varying and novel goals and sub-goals – be
they ‘hard-wired’, externally specified, or self-generated.
Goal-directedness also implies very selective learning and data acquisition
(from a massively data-rich, noisy, complex environment).
·
Adaptive – Learning is cumulative,
integrative, contextual and adjusts to changing goals and environments. General
adaptivity not only copes with gradual changes, but
also seeds and facilitates the acquisition of totally novel abilities.
General cognitive ability stands in sharp contrast to
inherent specializations such as speech- or face-recognition, knowledge
databases/ ontologies, expert systems, or search,
regression or optimization algorithms. It allows an entity to acquire a
virtually unlimited range of new specialized abilities. The mark of a generally intelligent system is not having a lot of knowledge and skills,
but being able to acquire and improve them – and to be able to
appropriately apply them.
Furthermore, knowledge must be acquired and stored in ways appropriate both to
the nature of the data, and to the goals and tasks at hand.
For example, given the correct set of basic core capabilities,
an AGI system should be able to learn to recognize and categorize a wide range
of novel perceptual patterns that are acquired via different senses, in many
different environments and contexts. Additionally, it should be able to
autonomously learn appropriate, goal-directed responses to such input contexts
(given some feedback mechanism).
We take this concept to be valid not only for high-level
human intelligence, but for lower-level animal-like ability. The degree of
‘generality’ (i.e., adaptability) varies along a continuum from
genetically ‘hard-coded’ responses (no adaptability), to high-level
animal flexibility (significant learning ability as in, say, a dog), and
finally to self-aware human general learning ability.
Core Requirements for General Intelligence
General intelligence, as described above, demands a number
of irreducible features and capabilities. In order to proactively accumulate
knowledge from various (and/ or changing) environments, it requires:
1. Senses to obtain features from ‘the world’ (virtual or actual),
2. A coherent means for storing knowledge obtained this way, and
3. Adaptive output/ actuation mechanisms (both static and dynamic).
Such knowledge also needs to be automatically adjusted and
updated on an ongoing basis; new knowledge must be appropriately related to
existing data. Furthermore, perceived entities/ patterns must be stored in a way
that facilitates concept formation and generalization. An effective way to
represent complex feature relationships is through vector encoding (Churchland 1995).
Any practical applications of AGI (and certainly any
real-time uses) must inherently be
able to process temporal data as patterns in time – not just as static
patterns with a time dimension. Furthermore, AGIs must cope with data from
different sense probes (e.g., visual, auditory, and data), and with input that
is noisy, scalar, unreliable, incomplete, and multi-dimensional (both in its
space/ time dimensions and in its large number of simultaneous features).
Fuzzy pattern matching helps deal with pattern variability and noise.
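As a minimal illustration of fuzzy matching over vector-encoded features (not the implementation described in this paper), the Python sketch below scores an observed feature vector against a stored one. The function name `fuzzy_match`, the root-mean-square distance, the linear score, and the use of `None` for missing features are all assumptions made for this example.

```python
import math

def fuzzy_match(stored, observed, tolerance):
    """Return a match score in [0, 1]; 0.0 means outside the tolerance.

    Features missing from `observed` are given as None and skipped, so an
    incomplete reading can still match (an illustrative choice).
    """
    pairs = [(s, o) for s, o in zip(stored, observed) if o is not None]
    if not pairs:
        return 0.0
    # Root-mean-square difference over the features that are present.
    distance = math.sqrt(sum((s - o) ** 2 for s, o in pairs) / len(pairs))
    return max(0.0, 1.0 - distance / tolerance)

stored = [0.2, 0.8, 0.5]
noisy = [0.25, 0.75, None]   # noisy and incomplete reading of the same entity
unrelated = [0.9, 0.1, 0.0]

print(fuzzy_match(stored, noisy, tolerance=0.2))      # high score
print(fuzzy_match(stored, unrelated, tolerance=0.2))  # 0.0
```

A larger `tolerance` accepts more variability at the cost of discrimination; in an adaptive system that value would itself be learned rather than fixed.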
Another essential requirement of general intelligence is to
cope with an overabundance of data. Reality presents massively more features
and detail than is (contextually) relevant, or that can be usefully processed.
This is why the system needs to have some control over what input data is
selected for analysis and learning – both in terms of which data, and also the degree of
detail. Senses (‘probes’) are needed not only for selection and
focus, but also in order to ground concepts – to give them
(reality-based) meaning.
While input data needs to be severely limited by focus and
selection, it is also extremely important to obtain multiple views of reality
– data from different feature extractors or senses. Provided that these
different input patterns are properly associated, they can help to provide
context for each other, aid recognition, and add meaning.
In addition to being able to sense via its multiple,
adaptive input groups and probes, the AGI must also be able to act on the world
– be it for exploration, experimentation, communication, or to perform
useful actions. These mechanisms need to provide both static and dynamic output
(states and behavior). They too, need to be adaptive and capable of learning.
Underlying all of this functionality is pattern processing.
What is more, not only are sensing and action based on generic patterns, but so
is internal cognitive activity. In fact, even high-level abstract thought,
language, and formal reasoning – abilities outside the scope of our
current project – are ‘just’ higher-order elaborations of
this (Margolis 1987).
Advantages of Intelligence being General
The advantages of general intelligence are almost too
obvious to merit listing; how many of us would dream of giving up our ability
to adapt and learn new things? In the context of artificial intelligence this
issue takes on a new significance.
There exists an inexhaustible demand for computerized
systems that can assist humans in complex tasks that are highly repetitive,
dangerous, or that require knowledge, senses or abilities that their users may
not possess (e.g., expert knowledge, ‘photographic’ recall,
overcoming disabilities, etc.). These applications stretch across almost all domains
of human endeavor.
Currently, these needs are filled primarily by systems
engineered specifically for each domain and application (e.g., expert systems).
Problems of cost, lead-time, reliability, and the lack of adaptability to new
and unforeseen situations, severely limit market potential. Adaptive AGI
technology, as described in this paper, promises to significantly reduce these
limitations and to open up these markets. It specifically implies –
·
That systems can learn (and be taught) a wide
spectrum of data and functionality
·
They can adapt to changing data, environments
and uses/ goals
·
This can be achieved without program changes
– capabilities are learned, not coded.
More specifically, this technology can potentially:
·
Significantly reduce system
‘brittleness’[2]
through fuzzy pattern matching and adaptive learning – increasing
robustness in the face of changing and unanticipated conditions or data.
·
Learn autonomously, by automatically
accumulating knowledge about new environments through exploration.
·
Allow systems to be operator-trained to identify
new objects and patterns; to respond to situations in specific ways, and to
acquire new behaviors.
·
Eliminate programming in many applications.
Systems can be employed in many different environments, and with different
parameters simply through self-training.
·
Facilitate easy deployment in new domains. A
general intelligence engine with pluggable custom input/ output probes allows
rapid and inexpensive implementation of specialized applications.
From a design perspective, AGI offers the advantage that all
effort can be focused on achieving the best general
solutions – solving them once, rather than once for each particular
domain. AGI obviously also has huge economic implications: because AGI systems
acquire most of their knowledge and skills (and adapt to changing requirements)
autonomously, programming lead times and costs can be dramatically reduced, or
even eliminated.
The fact that no (artificial!) systems with these
capabilities currently exist seems to imply that it is very hard (or
impossible) to achieve these objectives. However, I believe that, as with other
examples of human discovery and invention, the solution will seem rather
obvious in retrospect. The trick is correctly choosing a few critical
development options.
3. Shortcuts to AGI
When explaining Artificial General Intelligence to the
uninitiated one often hears the remark that, surely, everyone in AI is working
to achieve general intelligence. This indicates how deeply misunderstood
intelligence is. While it is true that eventually
conventional (domain-specific) research efforts will converge with those of
AGI, without deliberate guidance this is likely to be a long, inefficient
process. High-level intelligence must
be adaptive, must be general – yet very little work is being done to
specifically identify what general intelligence is, what it requires, and
how to achieve it.
In addition to understanding general intelligence, AGI
design also requires an appreciation of the differences between artificial (synthetic) and biological
intelligence, and between designed
and evolved systems.
Our particular approach to achieving AGI capitalizes on
extensive analysis of these issues, and on an incremental development path that
aims to minimize development effort (time and cost), technical complexity, and
overall project risks. In particular, we are focusing on engineering a series
of functional (but low-resolution/ capacity) proof-of-concept prototypes.
Performance issues specifically related to commercialization are assigned to
separate development tracks. Furthermore, our initial effort concentrates on
identifying and implementing the most general and foundational components
first, leaving high-level cognition such as abstract thought, language, and
formal logic for later development (more on that later). We also focus more on
selective, unsupervised, dynamic, incremental, interactive learning; on noisy,
complex, analog data; and on integrating entity features and concept attributes
in one comprehensive network.
While our project may not be the only one proceeding on this
particular path, it is clear that by far the majority of AI work being done
today follows a substantially different overall approach. Our work focuses on:
·
General rather than domain-specific cognitive ability
·
Acquired knowledge and skills, versus loaded databases and coded skills
·
Bi-directional, real-time interaction, versus batch processing
·
Adaptive attention (focus & selection), versus human pre-selected data
·
Core support for dynamic patterns, versus static data
·
Unsupervised and self-supervised, versus supervised learning
·
Adaptive, self-organizing data structures, versus fixed neural nets or databases
·
Contextual, grounded concepts, versus hard-coded, symbolic concepts
·
Explicitly engineering functionality, versus evolving it
·
Conceptual design, versus reverse-engineering
·
General proof-of-concept, versus specific real applications development
·
Animal level cognition, versus abstract thought, language, and formal logic.
Let’s look at each of these choices in greater detail.
General rather than
domain-specific cognitive ability. The advantages listed in the previous
section flow from the fact that generally intelligent systems can ultimately
learn any specialized knowledge and skills possible – human
intelligence is the proof! The reverse is obviously not true.
A complete, well-designed AGI’s ability to acquire domain-specific
capabilities is limited only by processing and storage capacity. What is more,
much of its learning will be autonomous – without teachers, and certainly
without explicit programming. This approach implements (and capitalizes on) the
essence of ‘Seed AI’ – systems with a limited, but carefully
chosen set of basic, initial capabilities that allow them (in a
‘bootstrapping’ process) to dramatically increase their knowledge
and skills through self-directed learning and adaptation. By concentrating on
carefully designing the seed of intelligence, and then nursing it to maturity,
one essentially bootstraps intelligence. In our AGI design this
self-improvement takes two distinct forms/ phases:
1. Coding the basic skills that allow the system to acquire a large amount of specific
knowledge.
2. The system reaching sufficient intelligence and conceptual understanding of its own
design, to enable it to deliberately improve its own design.
Acquired knowledge
and skills, versus loaded databases and coded skills. One crucial measure
of general intelligence is its ability to acquire
knowledge and skills, not how much it possesses. Many AI efforts concentrate on
accumulating huge databases of knowledge and coding massive amounts of specific
skills. If AGI is possible – and evidence presented here and elsewhere seems
overwhelming – then much of this effort will be wasted. Not only will an
AGI be able to acquire these additional smarts (largely) by itself, but
moreover, it will also be able to keep its knowledge up-to-date, and to improve
it. Not only will this save initial data collection and preparation as well as
programming, it will also dramatically reduce maintenance.
An important feature of our design is that there are no
traditional databases containing knowledge, nor programs encoding learned
skills: All acquired knowledge is integrated into an adaptive central
knowledge/ skills network. Patterns representing knowledge are associated in a
manner that facilitates conceptualization and sensitivity to context. Naturally,
such a design is potentially far less prone to brittleness, and more
resiliently fault-tolerant.
Bi-directional,
real-time interaction, versus batch processing. Adaptive learning systems
must be able to interact bi-directionally with the environment – virtual
or real. They must both sense data and act/ react on an ongoing basis. Many AI
systems do all of their learning in batch mode and have little or no ability to
learn incrementally. Such systems
cannot easily adjust to changing environments or requirements – in many
cases they are unable to adapt beyond the initial training set without
reprogramming or retraining.
In addition to real-time perception and learning,
intelligent systems must also be able to act. Three distinct areas of action capability
are required:
1. Acting on the ‘world’ – be it to communicate, to navigate or
explore, or to manipulate some external function or device in order to achieve
goals.
2. Controlling or modifying the system’s internal parameters (such as learning rate or
noise tolerance, etc.) in order to set or improve functionality.
3. Controlling the system’s sense input parameters such as focus, selection, resolution
(granularity) as well as adjusting feature extraction parameters.
Adaptive attention
(focus & selection), versus human pre-selected data. As mentioned
earlier, reality presents far more sense data, detail, and complexity
than is required for any given task – or than can be processed.
Traditionally, this problem has been dealt with by carefully selecting and
formatting data before feeding it to the system. While this human assistance
can improve performance in specific applications, it is often not realized that
this additional intelligence resides in the human, not the software.
Outside guidance and training can obviously speed learning;
however, AGI systems must inherently
be designed to acquire knowledge by themselves. In particular, they need to
control what input data is processed – where specifically to obtain data,
in how much detail, and in what format. Absent this capability the system will
either be overwhelmed by irrelevant data or, conversely, be unable to obtain
crucial information, or get it in the required format. Naturally, such data
focus and selection mechanisms must themselves be adaptive.
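To make the idea of system-controlled input selection concrete, here is a small Python sketch under stated assumptions. It treats the environment as a one-dimensional list of readings and lets the system choose where to look (`start`, `width`) and in how much detail (`step`). The names `sense` and `adapt_focus`, the variation threshold, and the widen-or-refine rule are hypothetical and only illustrate the principle.

```python
def sense(signal, start, width, step):
    """Return a sub-sampled window: which data, and at what resolution."""
    window = signal[start:start + width]
    return window[::step]

def adapt_focus(signal, start, width, step, min_variation=0.5):
    """Return new (start, width, step) focus settings based on what was sensed."""
    sample = sense(signal, start, width, step)
    variation = max(sample) - min(sample)
    if variation < min_variation:
        # Nothing of interest: take a broader view at the same resolution.
        return start, min(width * 2, len(signal) - start), step
    if step > 1:
        # Something is changing: inspect the same window in finer detail.
        return start, width, step // 2
    return start, width, step

signal = [0, 0, 0, 0, 0, 1, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0]
focus = (0, 4, 2)
for _ in range(3):
    focus = adapt_focus(signal, *focus)
    print(focus)
```

In this run the focus first widens over a flat region, then refines once the window contains a change, so only a fraction of the available readings is ever processed.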
Core support for
dynamic patterns, versus static data. Temporal pattern processing is
another fundamental requirement of interactive intelligence. At least three
aspects of AGI rely on it: perception
needs to learn/ recognize dynamic entities and sequences, action usually comprises complex behavior, and cognition (internal processing) is inherently temporal. In spite of
this obvious need for intrinsic support for dynamic patterns, many AI systems
only process static data; temporal sequences, if supported at all, are often
converted (‘flattened’) externally to eliminate the time dimension.
Real-time temporal pattern processing is technically quite challenging, so it
is not surprising that most designs try to avoid it.
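The cost of removing the time dimension can be shown with a short Python sketch. It compares two event sequences once after discarding order (an extreme form of flattening, chosen here for illustration) and once with an order-sensitive measure based on the longest common subsequence. The function names and the choice of similarity measure are assumptions for this example, not part of the design described in this paper.

```python
from collections import Counter

def flattened_equal(a, b):
    """Compare two sequences after discarding order (time dimension removed)."""
    return Counter(a) == Counter(b)

def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def temporal_similarity(a, b):
    """Order-sensitive similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 1.0

rising = ["low", "mid", "high"]
falling = ["high", "mid", "low"]

print(flattened_equal(rising, falling))      # True: the two look identical
print(temporal_similarity(rising, falling))  # low: the order differs
print(temporal_similarity(rising, rising))   # 1.0
```

A rising and a falling sequence are indistinguishable once flattened this way, which is exactly the information a system needs to tell approach from retreat.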
Unsupervised and
self-supervised, versus supervised learning. Auto-adaptive systems such as
AGIs require comprehensive capabilities to learn without supervision. Such
teacher-independent knowledge and skill acquisition falls into two broad
categories: unsupervised (data-driven, bottom-up), and self-supervised
(goal-driven, top-down). Ideally these two modes of learning should seamlessly
integrate with each other – and of course, also with other, supervised
methods.
Here, as in other design choices, general adaptive systems
are harder to design and tune than more specialized, unchanging ones. We see
this particularly clearly in the overwhelming focus on back-propagation[3] in
artificial neural network (ANN) development. Relatively little research aims at
better understanding and improving incremental, autonomous learning. Our own
design places heavy emphasis on these aspects.
Adaptive, self-organizing data structures, versus fixed
neural nets or databases. Another core requirement imposed by data/ goal-driven,
real-time learning is having a flexible, self-organizing data structure. On the
one hand, knowledge representation must be highly integrated, while on the
other hand it must be able to adapt to changing data densities (and other
properties), and to varying goals or solutions. Our AGI encodes all acquired
knowledge and skills in one integrated network-like structure. This central
repository features a flexible, dynamically self-organizing topology. The vast
majority of other AI designs rely either on loosely-coupled data objects or
agents, or on fixed network topologies and pre-defined ontologies,
data hierarchies or database layouts. This often severely limits their
self-learning ability, adaptivity and robustness, or
creates massive communication bottlenecks or other performance overhead.
Contextual, grounded
concepts, versus hard-coded, symbolic concepts. Concepts are probably the
most important design aspect of AGI; in fact, one can say that
‘high-level intelligence is
conceptual intelligence’. Core characteristics of concepts include their
ability to represent ultra-high-dimensional fuzzy sets that are grounded in
reality, yet fluid with regard to context. In other words, they encode related
sets of complex, coherent, multi-dimensional patterns that represent features
of entities. Concepts obtain their grounding (and thus their meaning) by virtue
of patterns emanating from features sensed directly from entities that exist in
reality. Because concepts are defined by value ranges within each
feature dimension (sometimes in complex relationships), some kind of fuzzy
pattern matching is essential. In addition, the scope of concepts must
be fluid; they must be sensitive and adaptive to both environmental and goal
contexts.
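The description of concepts as value ranges per feature dimension, with fuzzy boundaries and context-dependent scope, can be sketched in Python as follows. This is an illustrative model only: the `cup` concept, its ranges, the linear fall-off, the use of the weakest dimension as overall fit, and the `fuzz_scale` parameter standing in for context are all assumptions.

```python
def membership(value, low, high, fuzz):
    """1.0 inside [low, high], falling linearly to 0.0 over `fuzz` outside it."""
    if low <= value <= high:
        return 1.0
    gap = low - value if value < low else value - high
    return max(0.0, 1.0 - gap / fuzz)

def concept_fit(concept, features, fuzz_scale=1.0):
    """Degree of fit is the weakest dimension; fuzz_scale loosens or tightens scope."""
    return min(
        membership(features[name], low, high, fuzz * fuzz_scale)
        for name, (low, high, fuzz) in concept.items()
    )

# Hypothetical concept: (low, high, fuzz) per feature dimension.
cup = {"height_cm": (7.0, 12.0, 3.0), "diameter_cm": (6.0, 10.0, 2.0)}
mug = {"height_cm": 13.5, "diameter_cm": 9.0}

print(concept_fit(cup, mug))                  # partial fit in a strict context
print(concept_fit(cup, mug, fuzz_scale=2.0))  # better fit in a looser context
```

The same entity fits the concept to different degrees depending on context, which is the fluidity that a fixed symbolic definition cannot express.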
Autonomous concept formation is one of the key tests of
intelligence. The many AI systems based on hard-coded or human-defined concepts
fail this fundamental test. Furthermore, systems that do not derive their
concepts via interactive perception are unable to ground their knowledge in
reality, and thus lack crucial meaning. Finally, concept structures whose
activation cannot be modulated by context and degree of fit are unable to
capture the subtlety and fluidity of intelligent generalization. In
combination, these limitations will cripple any aspiring AGI.
Explicitly
engineering (and learning) functionality, versus evolving it. Design by
evolution is extremely inefficient – whether in nature or in computer
science. Moreover, evolutionary solutions are generally opaque; optimized only
to some specified ‘cost function’, not comprehensibility,
modularity, or maintainability. Furthermore, evolutionary learning also
requires more data or trials than are available in everyday problem solving.
Genetic and evolutionary programming do have their uses – they are powerful tools that can be used to solve very specific problems, such as
optimization of large sets of variables; however, they generally are not
appropriate for creating large systems or infrastructures. Artificially evolving
general intelligence directly seems
particularly problematic because there is no known function measuring such
capability along a single continuum – and absent such direction,
evolution doesn’t know what to optimize. One approach to deal with this problem
is to try to coax intelligence out of a complex ecology of competing agents
– essentially replaying natural evolution.
Overall, it seems that genetic programming techniques are
appropriate when one runs out of specific engineering ideas. Here is a short
summary of advantages of explicitly engineered functionality:
·
Designs can directly capitalize on and encode the designer’s knowledge and
insights.
·
Designs have comprehensible design documentation.
·
Designs can be far more modular – less need for multiple functionality and
high inter-dependency of sub-systems than found in evolved systems.
·
Systems can have a more flowchart-like, logical design – evolution has no
foresight.
·
They can be designed with debugging aids – evolution didn’t need that.
·
These features combine to make systems easier to understand, debug, interface, and
– importantly – for multiple teams to simultaneously work on the
design.
Conceptual design,
versus reverse-engineering. In addition to avoiding the shortcomings of
evolutionary techniques, there are also numerous advantages to designing and engineering
intelligent systems based on functional
requirements rather than trying to copy evolution’s design of the brain.
As aviation has amply demonstrated, it is much easier to build planes than it
is to reverse-engineer birds – much easier to achieve flight via thrust
than flapping wings.
Similarly, in creating artificial intelligence it makes
sense to capitalize on our human intellectual
and engineering strengths – to ignore design parameters unique to
biological systems, instead of struggling to copy nature’s designs.
Designs explicitly engineered to
achieve desired functionality are much easier to understand, debug, modify, and
enhance. Furthermore, using known and existing technology allows us to best
leverage existing resources. So why limit ourselves to the single solution to
intelligence created by a blind, unconscious Watchmaker with his own agenda
(survival in an evolutionary environment very different from that of today)?
Intelligent machines
designed from scratch carry neither the evolutionary baggage, nor the additional
complexity for epigenesis, reproduction, and
integrated self-repair of biological brains. Obviously this doesn’t imply
that we can learn nothing from studying brains, just that we don’t have
to limit ourselves to biological feasibility in our designs. Our (currently)
only working example of high-level general intelligence (the brain) provides a
crucial conceptual model of
cognition, and can clearly inspire numerous specific design features.
Here are some
desirable cognitive features that can be included in an AGI design that would
not (and in some cases, could not) exist in a reverse-engineered brain:
·
More effective control of neurochemistry
(‘emotional states’)
·
Selecting the appropriate degree of logical
thinking versus intuition
·
More effective control over focus and attention
·
Being able to learn instantly, on demand
·
Direct and rapid interfacing with databases, the
Internet, and other machines – potentially having instant access to all
available knowledge
·
Optional ‘photographic’ memory and
recall (‘playback’) on all senses!
·
Better control over remembering and forgetting
(freezing important knowledge, and being able to unlearn)
·
The ability to accurately backtrack and review
thought and decision processes (retrace and explore logic pathways)
·
Patterns, nodes and links can easily be tagged
(labeled) and categorized
·
The ability to optimize the design for the available
hardware instead of being forced to conform to the brain’s requirements
·
The ability to utilize the best existing
algorithms and software techniques – irrespective of whether they are
biologically plausible
·
Custom designed AGI (unlike brains) can have a
simple speed/ capacity upgrade path
·
The possibility of comprehensive integration
with other AI systems (like expert systems, robotics, specialized sense
pre-processors, and problem solvers)
·
The ability to construct AGIs that are highly
optimized for specific domains
·
Node, link, and internal parameter data is
available as ‘input data’ (full introspection)
·
Design specifications are available (to the designer
and to the AGI itself!)
·
Seed AI design: A machine can inherently be
designed to more easily understand and improve its own functioning – thus
bootstrapping intelligence to ever higher levels.
General proof-of-concept,
versus specific real applications development. Applying given resources to
minimalist proof-of-concept designs improves the likelihood of cutting a swift,
direct path towards an ultimate goal. Having identified high-level artificial
general intelligence as our goal, it makes little sense to squander resources
on inessentials. In addition to focusing our efforts on the ability to acquire knowledge autonomously, rather
than capturing or coding it, we further aim to speed progress towards full AGI
by reducing cost and complexity through –
·
Concentrating on proof-of-concept prototypes,
not commercial performance. This includes working at low data resolution and
volume, and putting aside optimization. Scalability is addressed only at a
theoretical level, and not necessarily implemented.
·
Working with radically-reduced sense and motor
capabilities. The fact that deaf, blind, and severely paralyzed people can
attain high intelligence (Helen Keller, Stephen Hawking) indicates that these
are not essential to developing AGI.
·
Coping with complexity through a willingness to
experiment and implement poorly understood algorithms – i.e. using an
engineering approach. Using self-tuning feedback loops to minimize free
parameters.
·
Not being sidetracked by attempting to match the
performance of domain-specific designs – focusing more on how capabilities are achieved (e.g.
learned conceptualization, instead of programmed or manually specified concepts)
rather than raw performance.
·
Developing and testing in virtual environments,
not physical implementations. Most aspects of AGI can be fully evaluated
without the overhead (time, money, and complexity) of robotics.
Animal level
cognition, versus abstract thought, language, and formal logic. There is
ample evidence that achieving high-level cognition requires only modest structural improvements from animal
capability. Discoveries in cognitive psychology point towards generalized
pattern processing being the foundational mechanism for all higher level
functioning. The relatively small differences between higher
animals and humans are likewise attested by studies of genetics, the evolutionary
timetable, and developmental psychology.
The core challenge
of AGI is achieving the robust, adaptive conceptual learning ability of higher
primates or young children. If human level intelligence is the goal, then
pursuing robotics, language, or formal logic (at this stage) is a costly
sideshow – whether motivated by misunderstanding the problem, or by commercial
or ‘political’ considerations.
Summary.
While our project leans heavily on research done in many specialized disciplines,
it is one of the few efforts dedicated to integrating such interdisciplinary
knowledge with the specific goal of developing general artificial intelligence. We firmly believe that many of the
issues raised above are crucial to the early achievement of truly intelligent adaptive
learning systems.
4. Foundational Cognitive Capabilities
General intelligence requires a number of foundational
cognitive abilities. At a first approximation, it must be able to –
·
Remember and recognize patterns representing
coherent features of reality
·
Relate such patterns by various similarities,
differences, and associations
·
Learn and perform a variety of actions
·
Evaluate and encode feedback from a goal system
·
Autonomously adjust its system control
parameters.
As mentioned earlier, this functionality must handle a very
wide variety of data types and characteristics (including temporal), and must
operate interactively, in real-time. The expanded description below is based on
our particular implementation; however, the features listed would generally be
required (in some form) in any
implementation of artificial general intelligence.
Pattern learning,
matching, completion, and recall. The primary method of pattern acquisition
consists of a proprietary adaptation of lazy learning (Aha 1997, Yip 1997). Our
implementation stores feature patterns (static and dynamic) with adaptive fuzzy
tolerances that subsequently determine how similar patterns are processed. Our
recognition algorithm matches patterns on a competitive winner-take-all basis,
as a set or aggregate of similar patterns, or by forced choice. It also offers
inherent support for pattern completion, and recall (where appropriate).
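Since the adaptation itself is proprietary and not described here, the following is only a minimal sketch of the general idea: instance-based (lazy) storage in which each pattern carries its own tolerance, with the three matching modes and pattern completion mentioned above. All names (`Pattern`, `fuzzy_score`, `match_winner`, and so on), the Euclidean metric, and the linear fall-off are assumptions made for illustration, not the authors' algorithm.

```python
import math
from dataclasses import dataclass


@dataclass
class Pattern:
    label: str
    features: list      # stored feature vector (list of floats)
    tolerance: float    # hypothetical per-pattern fuzzy tolerance, must be > 0


def distance(a, b, dims=None):
    """Euclidean distance over the selected dimensions (all by default)."""
    dims = range(len(a)) if dims is None else dims
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in dims))


def fuzzy_score(pattern, query, dims=None):
    """1.0 for an exact match, falling linearly to 0.0 at the tolerance radius."""
    return max(0.0, 1.0 - distance(pattern.features, query, dims) / pattern.tolerance)


def match_winner(store, query):
    """Winner-take-all: best-scoring pattern, or None if nothing is within tolerance."""
    best = max(store, key=lambda p: fuzzy_score(p, query), default=None)
    return best if best is not None and fuzzy_score(best, query) > 0.0 else None


def match_set(store, query):
    """Aggregate match: every pattern whose tolerance region contains the query."""
    return [p for p in store if fuzzy_score(p, query) > 0.0]


def match_forced(store, query):
    """Forced choice: the nearest pattern, regardless of tolerance."""
    return min(store, key=lambda p: distance(p.features, query))


def complete(store, partial):
    """Pattern completion: match on the known dimensions, fill None from the best match."""
    known = [i for i, v in enumerate(partial) if v is not None]
    best = min(store, key=lambda p: distance(p.features, partial, known))
    return [best.features[i] if v is None else v for i, v in enumerate(partial)]
```

Because nothing is generalized at storage time, the cost of learning is a single append; the work is deferred to matching, which is the defining trait of lazy learning.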
Data accumulation and
forgetting. Because our system learns patterns incrementally, mechanisms are
needed for consolidating and pruning excess data. Sensed patterns (or
sub-patterns) that fall within a dynamically set noise/ error tolerance of
existing ones are automatically consolidated by a Hebbian-like
mechanism that we call ‘nudging’. This algorithm also accumulates
certain statistical information. On the other hand, patterns that turn out not
to be important (as judged by various criteria) are deleted.
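The consolidation and pruning steps can be illustrated with a small sketch. It assumes one plausible reading of 'nudging' (moving the stored pattern a fraction of the way toward the new sample), and it assumes age and hit count as the pruning criteria; the paper does not specify either. `Node`, `observe`, `prune`, and their parameters are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    features: list       # stored feature vector (list of floats)
    hits: int = 1        # statistic: number of samples consolidated into this node
    last_seen: int = 0   # statistic: time step of the latest consolidation


def observe(store, sample, tolerance, step, rate=0.2):
    """Consolidate `sample` into the nearest node within `tolerance`, else store it."""
    def dist(node):
        return sum((f - s) ** 2 for f, s in zip(node.features, sample)) ** 0.5

    nearest = min(store, key=dist, default=None)
    if nearest is not None and dist(nearest) <= tolerance:
        # 'Nudge': shift the stored pattern a fraction `rate` toward the sample.
        nearest.features = [f + rate * (s - f) for f, s in zip(nearest.features, sample)]
        nearest.hits += 1
        nearest.last_seen = step
        return nearest
    node = Node(list(sample), last_seen=step)
    store.append(node)
    return node


def prune(store, step, max_age, min_hits):
    """Forget nodes that are both stale and rarely reinforced (assumed criteria)."""
    store[:] = [n for n in store if step - n.last_seen <= max_age or n.hits >= min_hits]
```

In this sketch the tolerance is passed in as a fixed argument; in the system described above it would be set dynamically.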
Categorization and
clustering. Vector-coded feature patterns are acquired in real-time and
stored in a highly adaptive network structure. This central self-organizing
repository automatically clusters data in hyper-dimensional vector-space. Our
matching algorithm’s ability to recall patterns by any dimension provides
inherent support for flexible, dynamic categorization. Additional
categorization mechanisms facilitate grouping patterns by additional
parameters, associations, or functions.
Pattern hierarchies
and associations. Patterns of perceptual features do not stand in isolation
– they are derived from coherent external reality. Encoding relationships
between patterns serves the crucial functions of added meaning, context, and
anticipation. Our system captures low-level, perception-driven pattern
associations such as: sequential or coincidental in time, nearby in space,
related by feature group or sense modality. Additional relationships are
encoded at higher levels of the network, including actuation layers. This
overall structure somewhat resembles the ‘dual network’ described
by Goertzel (1993).
Pattern priming and
activation spreading. The core function of association links is to prime[4]
related nodes. This helps to disambiguate pattern matching, and to select contextual
alternatives. In the case where activation is particularly strong and perceptual
activity is low, stored patterns will be ‘recognized’
spontaneously. Both the scope and decay rate of such activation spreading are
controlled adaptively. These dynamics combine with the primary,
perception-driven activation to form the system’s short-term memory.
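As a rough illustration of priming, the sketch below spreads activation over weighted association links, with `decay` standing in for the decay rate and `max_hops` for the scope. Both are fixed arguments here, whereas the system described above adapts them. The function names, the link representation, and the additive use of priming in `disambiguate` are assumptions for illustration only.

```python
def spread_activation(links, sources, decay=0.5, max_hops=2, threshold=0.05):
    """Return the priming level reached by each node from the `sources`.

    links:    {node: [(neighbor, weight), ...]} with weights in (0, 1]
    sources:  {node: initial activation}, e.g. perception-driven activation
    decay:    fraction of activation that survives each hop
    max_hops: how far activation may travel from a source
    """
    activation = dict(sources)
    frontier = dict(sources)
    for _ in range(max_hops):
        next_frontier = {}
        for node, level in frontier.items():
            for neighbor, weight in links.get(node, []):
                passed = level * weight * decay
                if passed < threshold:
                    continue  # too weak to prime anything
                # Keep only the strongest priming seen for each node.
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier[neighbor] = passed
        frontier = next_frontier
    return activation


def disambiguate(match_scores, activation):
    """Pick the candidate whose match score plus priming is highest."""
    return max(match_scores, key=lambda n: match_scores[n] + activation.get(n, 0.0))
```

For example, if hearing a bark primes 'dog' more strongly than 'cat', an ambiguous visual match that slightly favors 'cat' can still resolve to 'dog'.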
Action patterns.
Adaptive action circuits are used to control parameters in the following three
domains:
1) Senses,
including adjustable feature extractors, focus and selection mechanisms
2) Output
actuators for navigation and manipulation
3) Meta-cognition
and internal controls.
Different action states and behaviors (action sequences)
for each of these control outputs can be created at design time (using a
configuration script) or acquired interactively. Real-time learning occurs
either by means of explicit teaching, or autonomously through random
exploration. Once acquired, these actions can be tied to specific perceptual
stimuli or whole contexts through various stimulus-response mechanisms. These
S-R links (both activation and inhibition) are dynamically modified through
ongoing reinforcement learning.
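A minimal sketch of such stimulus-response links follows, assuming a simple scheme in which each link strength moves toward the most recent reward and actions are occasionally chosen at random for exploration. The class name, the update rule, and the parameters are illustrative assumptions; the paper does not specify the reinforcement mechanism.

```python
import random


class StimulusResponse:
    """Signed S-R link strengths: positive values excite an action, negative inhibit it."""

    def __init__(self, actions, learning_rate=0.3, explore=0.1, seed=None):
        self.actions = list(actions)
        self.learning_rate = learning_rate
        self.explore = explore        # probability of random exploration
        self.weights = {}             # (stimulus, action) -> link strength
        self.rng = random.Random(seed)

    def choose(self, stimulus):
        """Explore at random with probability `explore`, else take the strongest link."""
        if self.rng.random() < self.explore:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.weights.get((stimulus, a), 0.0))

    def reinforce(self, stimulus, action, reward):
        """Move the link strength toward `reward` (expected in [-1, 1])."""
        key = (stimulus, action)
        old = self.weights.get(key, 0.0)
        self.weights[key] = old + self.learning_rate * (reward - old)
```

Explicit teaching fits the same interface: a trainer calls `reinforce` directly instead of waiting for the agent to stumble on the action.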
Meta-cognitive
control. In addition to adaptive perception and action functionality, an
AGI design must also allow for extensive monitoring and control of overall
system parameters and functions. Any complex interactive learning system
contains numerous crucial control parameters such as noise tolerance, learning
and exploration rates, priorities and goal management, and myriad others. Not
only must the system be able to adaptively control these many interactive
vectors, it must also appropriately manage its various cognitive functions
(such as recognition, recall, action, etc.). Our design deals with these
requirements by means of a highly adaptive introspection/ control
‘probe’.
High-level
intelligence. Our AGI model posits that no additional foundational functions are necessary for higher-level cognition.
Abstract thought, language, and logical thinking are all elaborations of core
abilities. This controversial point is elaborated later in this paper.
5. An
AGI in the making
The functional prototype currently under development at
Adaptive A.I. Inc. aims to embody all the above-mentioned
choices, requirements, and features. Our development path is as follows:
1) Development
framework
2) Memory
core and interface structure
3) Individual
foundational cognitive components
4) Integrated
low-level cognition
5) Increasing
level of functionality.
The software comprises an AGI engine framework with the
following basic components:
·
A set of pluggable, programmable (virtual)
sensors and actuators (called ‘probes’)
·
A central pattern store/ engine including all
data and cognitive algorithms
·
A configurable, dynamic 2D virtual world, plus
various training and diagnostic tools.

The AGI engine design is based on, and embodies insights
from, a wide range of research in cognitive science – including computer
science, neuroscience, epistemology (Rand 1990, Kelley 1986), and psychology
(Margolis 1987). Particularly strong influences include: embodied systems
(Brooks 1994), vector encoded representation (Churchland
1995), adaptive self-organizing neural nets (esp. Growing Neural Gas, Fritzke 1995), unsupervised and self-supervised learning,
perceptual learning (Goldstone 1998), and
fuzzy logic (Kosko 1997).
While our design
includes several novel and proprietary algorithms, our key innovation is the
particular selection and integration of established technologies and prior
insights.
AGI Engine
Architecture & Design Features
Our AGI engine (which provides this foundational cognitive
ability) can logically be divided into three parts (See figure above.):
·
Cognitive core
·
Control/ interface logic
·
Input/ output probes
This ‘situated agent architecture’ reflects the
importance of having an AGI system that can dynamically and adaptively interact
with the environment. From a theory-of-mind perspective it acknowledges both
the crucial need for concept grounding (via senses), plus the absolute need for
experiential, self-supervised learning.
The components listed below have been specifically designed
with features required for adaptive general intelligence in (ultimately) real
environments. Among other things, they deal with a great variety and volume of
static and dynamic data, cope with fuzzy and uncertain data and goals, foster
coherent integrated representations of reality, and – most of all –
promote adaptivity.
Cognitive Core:
This is the central repository of all static and dynamic data patterns –
including all learned cognitive and behavioral states and sequences. All data
is stored in a single, integrated node-link structure. A design innovation is the
specific encoding of pattern ‘fuzziness’ (in addition to other
attributes). The core allows for several node/ link types with differing
dynamics to help define the network’s cognitive structure.
The network’s topology is dynamically self-organizing
– a feature inspired by ‘Growing Neural Gas’ design (Fritzke 1995). This allows network density to adjust to
actual data feature and/ or goal requirements. Various adaptive local and
global parameters further define network structure and dynamics in real time.
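For readers unfamiliar with Growing Neural Gas, the sketch below shows a simplified version of the published algorithm (Fritzke 1995): the two nodes nearest each input are linked, stale links and isolated nodes are removed, and new nodes are periodically inserted where accumulated error is highest, so that node density follows the data. This is not the system's own topology mechanism, which is only described as 'inspired by' GNG; the class name and the default parameter values are illustrative assumptions.

```python
import math


class GrowingGas:
    """Simplified Growing-Neural-Gas-style network (after Fritzke 1995)."""

    def __init__(self, a, b, eps_winner=0.2, eps_neighbor=0.006,
                 max_age=50, insert_every=100, alpha=0.5, error_decay=0.995):
        self.nodes = {0: list(a), 1: list(b)}   # node id -> position vector
        self.error = {0: 0.0, 1: 0.0}           # node id -> accumulated error
        self.edges = {}                         # frozenset({i, j}) -> age
        self.eps_winner, self.eps_neighbor = eps_winner, eps_neighbor
        self.max_age, self.insert_every = max_age, insert_every
        self.alpha, self.error_decay = alpha, error_decay
        self.next_id, self.steps = 2, 0

    def neighbors(self, i):
        return [j for edge in self.edges if i in edge for j in edge if j != i]

    def _move(self, i, x, eps):
        self.nodes[i] = [w + eps * (v - w) for w, v in zip(self.nodes[i], x)]

    def adapt(self, x):
        """Present one input vector and update positions and topology."""
        ranked = sorted(self.nodes, key=lambda i: math.dist(self.nodes[i], x))
        s1, s2 = ranked[0], ranked[1]
        for edge in self.edges:                 # age the winner's edges
            if s1 in edge:
                self.edges[edge] += 1
        self.error[s1] += math.dist(self.nodes[s1], x) ** 2
        for n in self.neighbors(s1):
            self._move(n, x, self.eps_neighbor)
        self._move(s1, x, self.eps_winner)
        self.edges[frozenset((s1, s2))] = 0     # create or refresh the winner edge
        self.edges = {e: a for e, a in self.edges.items() if a <= self.max_age}
        linked = set().union(*self.edges)
        for i in [i for i in self.nodes if i not in linked]:
            del self.nodes[i], self.error[i]    # drop nodes left without edges
        self.steps += 1
        if self.steps % self.insert_every == 0:
            self._insert()
        for i in self.error:
            self.error[i] *= self.error_decay

    def _insert(self):
        """Add a node halfway between the worst node and its worst neighbor."""
        q = max(self.error, key=self.error.get)
        f = max(self.neighbors(q), key=self.error.get)
        r, self.next_id = self.next_id, self.next_id + 1
        self.nodes[r] = [(u + v) / 2 for u, v in zip(self.nodes[q], self.nodes[f])]
        del self.edges[frozenset((q, f))]
        self.edges[frozenset((q, r))] = 0
        self.edges[frozenset((r, f))] = 0
        self.error[q] *= self.alpha
        self.error[f] *= self.alpha
        self.error[r] = self.error[q]
```

Calling `adapt` once per sensed feature vector is all that is needed; the network grows by one node every `insert_every` inputs and shrinks wherever links go unused.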
Control and Interface
Logic: An overall control system coordinates the network’s execution
cycle, drives various cognitive and housekeeping algorithms, and controls/
adapts system parameters. Via an Interface Manager, it also communicates data
and control information to and from the probes.
Probes: The
Interface Manager provides for dynamic addition and configuration of probes.
Key design features of the probe architecture include the ability to have programmable
feature extractors, variable data resolution, and focus & selection mechanisms.
Such mechanisms for data selection are imperative for general intelligence:
even moderately complex environments have a richness of data that far exceeds
any system’s ability to usefully process.
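To make the idea of focus and variable resolution concrete, here is a minimal sketch of a visual probe read-out that selects a square window of a frame and averages it down to a chosen number of cells. The function name `sense`, the frame representation (a 2D list of numbers), and block averaging as the feature extractor are all assumptions made for illustration.

```python
def sense(frame, focus, size, resolution):
    """Read a size x size window of `frame` at the given resolution.

    frame:      2D list of numbers (rows of pixel intensities)
    focus:      (row, col) of the window's top-left corner
    size:       side length of the window in pixels
    resolution: output cells per side; assumed to divide `size` evenly
    """
    row0, col0 = focus
    cell = size // resolution           # pixels per output cell, per side
    out = []
    for r in range(resolution):
        out_row = []
        for c in range(resolution):
            block = [
                frame[row0 + r * cell + i][col0 + c * cell + j]
                for i in range(cell)
                for j in range(cell)
            ]
            out_row.append(sum(block) / len(block))   # average the block
        out.append(out_row)
    return out
```

A low `resolution` over a large window gives a cheap overview; a small window at full resolution gives detail for one region. Either way the volume of data passed to the core stays bounded.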
The system handles a very wide variety of data types and
control signal requirements – including those for visual, sound, and raw
data (e.g., database, internet, keyboard), as well as various output actuators.
A novel ‘system probe’ provides the system with monitoring and
control of its internal states (a form of meta-cognition). Additional probes
– either custom interfaces with other systems or additional real-world
sensors/ actuators – can easily be added to the system.
Development
Environment/ Language/ Hardware. The complete AGI engine plus associated
support programs are implemented in (Object Oriented) C# under Microsoft’s
.NET framework. The system is designed for optional remoting
of various components, thus allowing for some distributed processing. Current
tests show that practical (proof-of-concept) prototype performance can be
achieved on a single, conventional PC (2 GHz, 512
MB). Even a non-performance-tuned implementation can process several complex
patterns per second on a database of well over a million stored features.
6. From
Algorithms to General Intelligence
This section covers some of our near-term research and
development; it aims to illustrate our expected path toward meaningful general
intelligence. While this work barely approaches higher-level animal cognition (exceeding it in some
aspects, but falling far short in others such as sensory-motor skills), we take
it to be a crucial step in proving the validity and practicality of our model.
Furthermore, the actual functionality achieved should be highly competitive, if
not unique, in applications where significant autonomous adaptivity
and data selection, lack of brittleness, dynamic pattern processing, flexible
actuation, and self-supervised learning are central requirements.
General intelligence doesn’t comprise one single,
brilliant knock-out invention or design feature; instead, it emerges from the synergetic
integration of a number of essential fundamental components. On the structural
side, the system must integrate sense inputs, memory, and actuators, while on
the functional side various learning, recognition, recall and action
capabilities must operate seamlessly on a wide range of static and dynamic
patterns. In addition, these cognitive abilities must be conceptual and
contextual – they must be able to generalize knowledge, and interpret it
against different backgrounds.
A key milestone in our project is testing the integrated functionality of the basic
cognitive components within our overall AGI framework. A number of custom-developed,
highly-configurable test utilities are used to test the cohesive functioning of
the whole system. This automated training and evaluation is supplemented by
manual experimentation in numerous different environments and applications.
Experience gained by these tests helps to refine the complex dynamics of
interacting algorithms and parameters.
One of the general difficulties with AGI development is to
determine absolute measures of success. Part of the reason is that this field
is still nascent, and thus no agreed definitions, let alone tests or measures
of low-level general intelligence exist. As we proceed with our project we expect
to develop ever more effective protocols and metrics for assessing cognitive
ability. Our system’s performance evaluation is guided by this
description: ‘General intelligence comprises the ability to acquire (and
adapt) the knowledge and skills required for achieving a wide range of goals in
a variety of domains.’
·
In this context, ‘acquisition’
includes all of the following: automatic, via sense inputs (feature/ data
driven); explicitly taught; discovered through exploration or experimentation;
internal processes (e.g., association, categorization, statistics, etc.).
·
‘Adaptation’ implies that new
knowledge is integrated appropriately.
·
‘Knowledge and skills’ refer to all
kinds of data and abilities (states and behaviors) that the system acquires for
the short or long term.
Our initial protocol for evaluating AGIs aims to cover a
wide spectrum of domains and goals by simulating sample applications in 2D
virtual worlds. In particular, these tests should assess the degree to which
the foundational abilities operate as an integrated, mutually supportive whole
– and without programmer intervention! Here are three examples:
Sample Test
Domains for Initial Performance Criteria
Adaptive Security
Monitor. This system scans video monitors and alarm panels that oversee a
secure area (say, factory, office building, etc.), and responds appropriately
to abnormal conditions. Note, this is somewhat similar to a site monitoring
application at MIT (Grimson 1998).
This simulation calls for a visual environment that contains
a lot of detail but has only limited dynamic activity – this is its
normal state (green). Two levels of abnormality exist: (i)
minor, or known disturbance (yellow); (ii) major, or unknown disturbance (red).
The system must initially learn the normal state by simple
exposure (automatically scanning the environment) at different resolutions
(detail). It must also learn ‘yellow’ conditions by being shown a
number of samples (some at high resolution). All other states must output
‘red’.
Standard operation is to continuously scan the environment
at low resolution. If any abnormal condition is detected the system must learn
to change to higher resolution in order to discriminate between
‘yellow’ and ‘red’.
The system must adapt to changes in the environment (and
totally different environments) by simple exposure training.
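The decision logic of this test can be sketched as follows. Note that the sketch hard-codes the low-to-high resolution escalation, whereas the test requires the system to learn that behavior; it also assumes a simple per-feature tolerance check. The names `monitor`, `scan`, `normal`, and `yellow_samples` are hypothetical.

```python
def distance(a, b):
    """Largest per-feature difference between two feature vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def monitor(scan, normal, yellow_samples, tolerance):
    """Classify the current scene as 'green', 'yellow', or 'red'.

    scan:           callable taking "low" or "high" and returning a feature vector
    normal:         learned normal state, {"low": vector, "high": vector}
    yellow_samples: high-resolution vectors of known, minor disturbances
    tolerance:      maximum per-feature deviation still counted as a match
    """
    # Standard operation: a cheap low-resolution scan against the normal state.
    if distance(scan("low"), normal["low"]) <= tolerance:
        return "green"
    # Something is abnormal: rescan in detail to tell known from unknown.
    detail = scan("high")
    if any(distance(detail, sample) <= tolerance for sample in yellow_samples):
        return "yellow"
    return "red"
```

Adapting to a changed environment then amounts to relearning `normal` (and, where needed, `yellow_samples`) by exposure, with no change to the logic itself.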
Sight Assistant.
The system controls a movable ‘eye’ (by voice command) that enables
the identification (by voice output) of at least a hundred different objects in
the world. A trainer will dynamically teach the system new names, associations,
and eye movement commands.
The visual probe can select among different scenes
(simulating rooms) and focus on different parts of each scene. The scenes
depict objects of varying attributes: color, size, shape, various dynamics,
etc. (and combinations of these), against different backgrounds.
Initial training will be to attach simple sound commands to
maneuver the ‘eye’, and to associate word labels with selected
objects. The system must then reliably execute voice commands and respond with
appropriate identification (if any). Additional functionality could be to have
the system scan the various scenes when idle, and to automatically report selected
important objects.
Object identification must cover a wide spectrum of
different attribute combinations and tolerances. The system must easily learn
new scenes, objects, words and associations, and also adapt to changes in any
of these variables.
Maze Explorer. A
(virtual) entity explores a moderately complex environment. It discovers what
types of objects aid or hinder its objectives, while learning to navigate this
dynamic world. It can also be trained to perform certain behaviors.
The virtual world is filled with a great number of different
objects (see previous example). In addition, some of these objects move in
space at varying speeds and dynamics, and may be solid and/ or immovable.
Groups of different kinds of objects have pre-assigned attributes that indicate
negative or positive. The AGI engine controls the direction and speed of an
entity in this virtual world. Its goal is to learn to navigate around immovable
and negative objects to reliably reach hidden positives.
The system can also be trained to respond to operator
commands to perform behaviors of varying degrees of complexity (for example,
actions similar to ‘tricks’ one might teach a dog). This
‘Maze Explorer’ can easily be set up to deal with fairly complex
tasks.
Towards Increased
Intelligence
Clearly, the tasks described above do not by themselves
represent any kind of breakthrough in artificial intelligence research. They
have been achieved many times before. However, what we do believe to be significant and unique is the achievement of these
various tasks without any task-specific programming or parameterization. It is
not what is being done, but how it is done.
Development beyond these basic proof-of-concept tests will
advance in two directions: 1) to significantly increase resolution, data
volume, and complexity in applications similar to the tests; 2) to add
higher-level functionality. In addition to work aimed at further developing and
proving our general intelligence model, there are also numerous practical
enhancements that can be done. These would include implementing multi-processor
and network versions, and integrating our system with databases or with other
existing AI technology such as expert systems, voice recognition, robotics, or
sense modules with specialized feature extractors.
By far the most important of these future developments
concerns higher-level ability. Here is a partial list of action items, all of
which are derived from lower-level foundations:
·
Spread activation and retain context over
extended period
·
Support more complex internal temporal patterns,
both for enhanced recognition and anticipation, and for cognitive and action
sequences
·
Internal activation feedback for processing
without input
·
Deduction, achieved through selective concept
activation
·
Advanced categorization by arbitrary dimensions
·
Learning of more complex behavior
·
Abstract and merged concept formation
·
Structured language acquisition
·
Increased awareness and control of internal
states (introspection)
·
Learning logic and other problem-solving
methodologies.
7.
Other Research[5]
Many different approaches to AI exist; some of the
differences are straightforward while others are subtle and hinge on difficult
philosophical issues. As such, the exact placement of our work relative to that
of others is difficult and, indeed, open to debate. Our view that
‘intelligence is a property of an entity that engages in two-way
interaction with an external environment’ technically puts us in the
area of ‘agent systems’ (Russell 1995).
However, our emphasis on a connectionist rather than classical approach to
cognitive modeling places our work in the field of ‘embodied cognitive
science’. (See Pfeifer and Scheier 1999 for a
comprehensive overview.)
While our approach
is similar to other research in embodied cognitive science, in some respects
our goals are substantively
different. A key difference is our belief that a core set of cognitive
abilities working together is sufficient to produce general intelligence. This
is in marked contrast to others in embodied cognitive science who consider
intelligence to be necessarily specific to a set of problems within a given
environment. In other words, they believe that autonomous agents always exist
in ecological niches. As such they focus their research on building very
limited systems that effectively deal with only a small number of problems
within a specific limited environment. Almost all work in the area follows this
– see Braitenberg (1984), Brooks (1994) or Arbib (1992) for just a few well-known examples. Their
stance contradicts the fact that humans possess general intelligence; we are
able to effectively deal with a wide range of problems that are significantly
beyond anything that could be called our ‘ecological niche’.
Perhaps the closest project to ours that is strictly in the
area of embodied cognitive science is the Cog project at MIT (Brooks 1993). The
project aims to understand the dynamics of human interaction by the
construction of a human-like robot complete with upper torso, a head, eyes,
arms and hands. While this project is significantly more ambitious than other
projects in terms of the level and complexity of the system's dynamics and
abilities, the system is still essentially niche focused (elementary human
social and physical interaction) when compared to our own efforts at general
intelligence.
Probably the closest work to ours in the sense that it also
aims to achieve general rather than niche intelligence is the Novamente project under the direction of Ben Goertzel. (The project was formerly known as Webmind – see Goertzel 1997,
2001.) Novamente relies on a hybrid of low-level
neural net-like dynamics for activation spreading and concept priming, coupled
with high-level semantic constructs to represent a variety of logical, causal
and spatial-temporal relations. While the semantics of the system's internal
state are relatively easy to understand compared to a strictly connectionist
approach, the classical elements in the system's design open the door to many
of the fundamental problems that have plagued classical AI over the last fifty
years. For example, high-level semantics require a complex meta-logic contained
in hard coded high-level reasoning and other high-level cognitive systems.
These high-level systems contain significant implicit semantics that may not be
grounded in environmental interaction but are rather hard coded by the designer
– thus causing symbol grounding problems (Harnad
1990). The relatively fixed, high-level methods of knowledge representation and
manipulation that this approach entails are also prone to ‘frame of
reference’ (McCarthy and Hayes
1969; Pylyshyn 1987) and
‘brittleness’ problems. In a strictly embodied cognitive science
approach, as we have taken, all knowledge is derived from agent-environment
interaction thus avoiding these long-standing problems of classical AI.
Andy Clark (1997) is another
researcher whose model closely resembles our own, but there are no
implementations specifically based on his theoretical work. Igor Aleksander’s (now dormant) MAGNUS project (1996) also
incorporated many key AGI concepts that we have identified, but it was severely
limited by a classical AI, finite-state machine approach. Valeriy
Nenov and Michael Dyer of UCLA (1994) used ‘massively’
parallel hardware (a CM-2 Connection Machine) to implement a virtual,
interactive perceptual design close to our own, but with a more rigid,
pre-programmed structure. Unfortunately, this ambitious, ground-breaking work
has since been abandoned. The project was probably severely hampered by limited
(at the time) hardware.
Moving further away from embodied cognitive science to
purely classical research in general intelligence, perhaps the best known
system is the Cyc project being pursued by Lenat (1990). Essentially Lenat
sees general intelligence as being ‘common sense’. He hopes to
achieve this goal by adding many millions of facts about the world into a huge
database. After many years of work and millions of dollars in funding there is
still a long way to go as the sheer number of facts that humans know about the
world is truly staggering. We doubt that a very large database of basic facts
is enough to give a computer much general intelligence – the mechanisms
for autonomous knowledge acquisition are missing. Being a classical approach to
AI this also suffers from the fundamental problems of classical AI listed
above. For example the symbol grounding problem again: if facts about cats and
dogs are just added to a database that the computer can use even though it has
never seen or interacted with an animal, are those concepts really meaningful
to the system? While his project also claims to pursue ‘general
intelligence’, it is really very different from our own, both in its
approach and in the difficulties it faces.
Analysis of AI’s ongoing failure to overcome its
long-standing limitations reveals that it is not so much that Artificial
General Intelligence has been tried and that it has failed, but rather that the
field has largely been abandoned – be it for theoretical, historic, or commercial
reasons. Certainly, our particular type of approach, as detailed in previous sections,
is receiving scant attention.
8.
Fast-track AGI – Why so Rare?
Widespread application of AI has been hampered by a number
of core limitations that have plagued the field since the beginning, namely:
·
The expense and delay of custom programming
individual applications
·
Systems’ inability to automatically learn
from experience, or to be user teachable/ trainable
·
Reliability and performance issues caused by
‘brittleness’ (the inability of systems to automatically adapt to
changing requirements, or data outside of a predefined range)
·
Their limited intelligence and common sense.
The most direct path to solving these long-standing problems
is to conceptually identify the fundamental characteristics common to all
high-level intelligence, and to engineer systems with this basic functionality,
in a manner that capitalizes on human and technological strengths.
General intelligence is the key to achieving robust
autonomous systems that can learn and adapt to a wide range of uses. It is also
the cornerstone of self-improving, or Seed AI – using basic abilities to
bootstrap higher-level ones. This essay identified foundational components of
general intelligence, as well as crucial considerations particular to the
effective development of the artificial variety. It highlighted the fact that
very few researchers are actually following this most direct route to AGI.
If the approach outlined above is so promising, then why has
it received so little attention? Why is hardly anyone actually working on it?
A short answer: Of all the people working in the field
called 'AI',
·
80% don't believe in the concept of General
Intelligence (but instead, in a large collection of specific skills and knowledge)
·
Of those that do, 80% don't believe that
artificial, human-level intelligence is possible – either ever, or for a long,
long time
·
Of those that do, 80% work on domain-specific AI
projects for commercial or academic-political reasons (results are more
immediate)
·
Of those left, 80% have a poor conceptual
framework...
Even though the above is a caricature, it contains more than
a grain of truth.
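Taken literally, the four successive 80% cuts leave only a small remainder. As a back-of-the-envelope illustration, assuming the four filters apply exactly as stated and one after another:

```latex
\[
(1 - 0.8)^4 = 0.2^4 = 0.0016 = 0.16\%
\]
```

That is, about 16 of every 10,000 people working in the field.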
A great number of researchers reject the validity or
importance of ‘general intelligence’. For many, controversies in
psychology (such as those stoked by The
Bell Curve) make this an unpopular, if not taboo subject. Others,
conditioned by decades of domain-specific work, simply do not see the benefits
of Seed AI – solving the problems only once.
Of those that do not in principle object to general
intelligence, many don’t believe that AGI is possible – in their
life-time, or ever. Some hold this position because they themselves tried and
failed ‘in their youth’. Others believe that AGI is not the best approach to achieving
‘AI’, or are at a total loss on how to go about it. Very few
researchers have actually studied the problem from our (the general
intelligence/ Seed AI) perspective. Some are actually trying to reverse-engineer
the brain – one function at a time. There are also those who have moral
objections, or who are afraid of it.
Of course, a great many are so focused on particular, narrow
aspects of intelligence that they simply don’t get around to looking at
the big picture – they leave it to others to make it happen. It is also
important to note that there are often strong financial and institutional
pressures to pursue specialized AI.
All of the above combine to create a dynamic where Real AI
is not ‘fashionable’ – getting little respect, funding, and
support – further reducing the number of people drawn into it!
These should be more than enough reasons to account for the
dearth of AGI progress. But it gets worse. Researchers actually trying to build
AGI systems are further hampered by a myriad of misconceptions, poor choices,
and lack of resources (funding and research). Many of the technical issues were
explored previously (See sections 3 and 7.), but a few others are worth
mentioning:
Epistemology.
Models of AGI can only be as good as their underlying theory of knowledge
– the nature of knowledge, and how it relates to reality. The realization
that high-level intelligence is based on conceptual
representation of reality underpins design decisions such as adaptive, fuzzy
vector encoding, and an interactive, embodied approach. Other consequences are
the need for sense-based focus and selection, and contextual activation. The
central importance of a highly-integrated pattern network – especially
including dynamic ones – becomes obvious on understanding the
relationship between entities, attributes, concepts, actions, and thoughts.
These and several other insights lay the foundation for solving problems
related to grounding, brittleness, and common sense. Finally, there is still a
lot of unnecessary confusion about the relationship between concepts and
symbols. A dynamic that continues to handicap AI is the lingering schism between
traditionalists and connectionists. This unfortunately helps to perpetuate a
false dichotomy between explicit symbols/ schema, and incomprehensible
patterns.
Theory of Mind.
Another area of concern is sloppy formulation and poor understanding of several
key concepts: consciousness, intelligence, volition, meaning, emotions, common sense,
and ‘qualia’. The fact that hundreds of
AI researchers attend conferences every year where key speakers proclaim that
‘we don’t understand consciousness (or qualia,
or whatever), and will probably never
understand it’ indicates just how pervasive this problem is. Marvin Minsky’s characterization of consciousness as a
‘suitcase word’[6] is
correct. Let’s just unpack it!
Errors like these are often behind research going off at a
tangent relative to stated long-term goals. Two examples are an undue emphasis
on biological feasibility, and the belief that embodied intelligence cannot be
virtual, that it has to be implemented in physical robots.
Cognitive psychology.
It goes without saying that a proper understanding of the concept
‘intelligence’ is key to engineering it. In addition to
epistemology, several areas of cognitive psychology are crucial to unraveling
its meaning. Misunderstanding intelligence has led to some costly
disappointments, such as manually accumulating huge amounts of largely useless
data (knowledge without meaning), efforts to achieve intelligence by combining
masses of dumb agents, or trying to obtain meaningful conversation from an
isolated network of symbols.
Project focus.
The few projects that do pursue AGI
based on relatively sound models run yet another risk: they can easily lose
focus. Sometimes commercial considerations hijack a project’s direction,
while others get sidetracked by (relatively) irrelevant technical issues, such
as trying to match an unrealistically high level of performance, fixating on
biological feasibility of design, or attempting to implement high-level
functions before their time. A clearly mapped-out developmental path to
human-level intelligence can serve as a powerful antidote to losing sight of
‘the big picture’. A vision of how to get from ‘here’
to ‘there’ also helps to maintain motivation in such a difficult
endeavor.
Research support.
AGI utilizes, or more precisely is an integration of, a large number of
existing AI technologies. Unfortunately, many of the most crucial areas are
sadly under-researched. They include:
·
Incremental, real-time, unsupervised/ self-supervised
learning (vs. back-propagation)
·
Integrated support for temporal patterns
·
Dynamically-adaptive neural network topologies
·
Self-tuning of system parameters, integrating
bottom-up (data driven) and top-down (goal/ meta-cognition driven) auto-adaptation
·
Sense probes with auto-adaptive feature
extractors.
Naturally, these very limitations feed back to reduce
support for AGI research.
Cost and difficulty.
Achieving high-level AGI will be hard. However, it will not be nearly as
difficult as most experts think. A key element of ‘Real AI’ theory
(and its implementation) is to concentrate on the essentials of intelligence.
Seed AI becomes a manageable problem – in some respects much simpler than
other mainstream AI goals – by eliminating huge areas of difficult, but
inessential AI complexity. Once we get the crucial fundamental functionality
working, much of the additional ‘intelligence’ (ability) required
is taught or learned, not programmed. Having said this, I do believe that very
substantial resources will be required to scale up the system to human-level
storage and processing capacity. However, the far more moderate initial
prototypes will serve as proof-of-concept for AGI while potentially seeding a
large number of practical new applications.
9. Conclusion
Understanding general intelligence and identifying its
essential components are key to building next-generation AI systems –
systems that are far less expensive, yet significantly more capable. In
addition to concentrating on general
learning abilities, a fast-track approach should also seek a path of least
resistance – one that capitalizes on human engineering strengths and
available technology. Sometimes, this involves selecting the AI road less traveled.
We believe that the theoretical model, cognitive components,
and framework described above, together with our other strategic design decisions,
provide a solid basis for achieving practical AGI capabilities in the
foreseeable future. Successful implementation will significantly address many
traditional problems of AI. Potential benefits include:
· Minimizing initial environment-specific programming (through self-adaptive configuration)
· Substantially reducing ongoing software changes, because a large amount of additional functionality and knowledge will be acquired autonomously via self-supervised learning
· Greatly increasing the scope of applications, as users teach and train additional capabilities
· Improving flexibility and robustness through systems’ ability to adapt to changing data patterns, environments and goals.
AGI promises to make an important contribution toward
realizing software and robotic systems that are more usable, intelligent, and
human-friendly. The time seems ripe for a major initiative down this new path
of human advancement that is now open to us.
References
· Aha, D.W. (Ed.) (1997). Lazy Learning. Artificial Intelligence Review, 11:1-5. Kluwer Academic Publishers.
· Aleksander,
· Arbib, M.A. (1992). Schema theory. In S.C. Shapiro (Ed.), Encyclopedia of Artificial Intelligence, 2nd ed. (pp. 1427-1443). John Wiley.
· Braitenberg, V. (1984). Vehicles: Experiments in synthetic psychology. MIT Press.
· Brooks, R.A., and Stein, L.A. (1993). Building brains for bodies. Memo 1439, Artificial Intelligence Lab, Massachusetts Institute of Technology.
· Brooks, R.A. (1994). Coherent behavior from many adaptive processes. In D. Cliff, P. Husbands, J.A. Meyer, and S.W. Wilson (Eds.), From animals to animats: Proceedings of the third International Conference on Simulation of Adaptive Behavior (pp. 421-430). MIT Press.
· Churchland, P.M. (1995). The Engine of Reason, the Seat of the Soul: A Philosophical Journey into the Brain. MIT Press.
· Clark, A. (1997). Being There: Putting Brain, Body and World Together Again. MIT Press.
· Fritzke, B. (1995). A growing neural gas network learns topologies. In Tesauro, G., Touretzky, D.S., and Leen, T.K. (Eds.), Advances in Neural Information Processing Systems 7 (pp. 625-632). MIT Press.
· Goertzel, B. (1997). From complexity to creativity: Explorations in evolutionary, autopoietic, and cognitive dynamics. Plenum Press.
· Goertzel, B. (2001). Creating internet intelligence: Wild computing, distributed digital consciousness, and the emerging global brain. Plenum Press.
· Goldstone, R.L. (1998). Perceptual Learning. Annual Review of Psychology, 49, 585-612.
· Gottfredson, L.S. (1998). The general intelligence factor. [Special Issue]. Scientific American, 9(4), 2, 24-29.
· Grimson, W.E.L., Stauffer, C., Lee, L., Romano, R. (1998). Using Adaptive Tracking to Classify and Monitor Activities in a Site. Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 22-31, 1998.
· Harnad, S. (1990). The symbol grounding problem. Physica D, 42, 335-346.
· Kelley, D. (1986). The Evidence of the
· Kosko, B. (1997). Fuzzy Engineering. Prentice Hall.
· Lenat, D.B., Guha, R.V. (1990). Building Large Knowledge Based Systems. Addison-Wesley.
· Margolis, H. (1987). Patterns, Thinking, and Cognition: A Theory of Judgment.
· McCarthy, J. and Hayes, P.J. (1969). Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence, 4, 463-502.
· Nenov, V.I. and Dyer, M.G. (1994). Language Learning via Perceptual/Motor Association: A Massively Parallel Model. In Kitano, H., Hendler, J.A. (Eds.), Massively Parallel Artificial Intelligence (pp. 203-245). AAAI Press/The MIT Press.
· Pfeifer, R., and Scheier, C. (1999). Understanding intelligence. MIT Press.
· Pylyshyn, Z.W. (Ed.) (1987). The Robot’s Dilemma: The frame problem in A.I. Ablex.
· Rand, A. (1990). Introduction to Objectivist Epistemology.
· Russell, S.J., Norvig, P. (1995). Artificial Intelligence: A modern approach. Prentice Hall.
· Wang, P. (1995). Non-axiomatic reasoning system: Exploring the essence of intelligence. PhD thesis,
· Yip, K., and Sussman, G.J. (1997). Sparse Representations for Fast, One-shot learning. Proc. of National Conference on Artificial Intelligence, July 1997.
[1] Intellectual property is owned by Adaptive A.I. Inc.
[2] ‘Brittleness’ in AI refers to a system’s inability to automatically
adapt to changing requirements, or to cope with data outside of a predefined
range – thus ‘breaking’.
[3] Back-propagation is one of the most powerful supervised training
algorithms; it is, however, not particularly amenable to incremental learning.
[4] ‘Priming’, as used in psychology, refers to an increase in the
speed or accuracy of a decision that occurs as a consequence of prior exposure
or activation.
[5] This section was co-authored with Shane Legg.
[6] Meaning that many different meanings are thrown together in a jumble – or at
least packaged together in one ‘box’, under one label.