Artificial General Intelligence

From Scholarpedia
Ben Goertzel (2015), Scholarpedia, 10(11):31847. doi:10.4249/scholarpedia.31847 revision #154015

Curator: Ben Goertzel



The term Artificial General Intelligence (often abbreviated "AGI") has no broadly accepted precise definition, but has multiple closely related meanings, e.g.

  • the capacity of an engineered system to
    • display the same rough sort of general intelligence as humans; or,
    • display intelligence that is not tied to a highly specific set of tasks; or
    • generalize what it has learned, including generalization to contexts qualitatively very different from those it has seen before; or,
    • take a broad view, and interpret its tasks at hand in the context of the world at large and its relation thereto
  • an engineered system displaying the property of artificial general intelligence, to a significant degree
  • the theoretical and practical study of artificial general intelligence systems and methods of creating them

AGI is part of the broader fields of Artificial Intelligence (AI) and Cognitive Science. It is also closely related to other areas such as Metalearning and Computational Neuroscience.

AGI versus Narrow AI

The original founders of the AI field, in the 1950s and 60s, were largely concerned with the creation of hardware or software emulating human-like general intelligence. Since that time, the field has come to focus instead largely on the pursuit of discrete capabilities or specific practical tasks. This approach has yielded many interesting technologies and theoretical results, yet has proved relatively unsuccessful so far in terms of the original central goals of the field. Thus, some researchers have come to prefer the term and concept of "AGI", in order to distinguish the pursuit of general intelligence from more narrowly focused associated pursuits (Goertzel and Pennachin, 2005).

A dichotomy has sometimes been drawn between AGI and "narrow AI" (Goertzel and Pennachin, 2005). For example, Kurzweil (1999) contrasted "narrow AI" with "strong AI" -- using the former to refer to the creation of systems that carry out specific "intelligent" behaviors in specific contexts, and the latter to refer essentially to what is now called AGI. For a narrow AI system, if one changes the context or the behavior specification even a little bit, some level of human reprogramming or reconfiguration is generally necessary to enable the system to retain its level of intelligence. Qualitatively, this seems quite different from natural generally intelligent systems like humans, which have a broad capability to self-adapt to changes in their goals or circumstances, performing "transfer learning" to generalize knowledge from one goal or context to others.

The precise definition or characterization of AGI is one of the subjects of study of the AGI research field. However, it is broadly accepted that, given realistic space and time resource constraints, human beings do not have indefinite generality of intelligence; and for similar reasons, no real-world system is going to have indefinite generality. Human intelligence combines a certain generality of scope, with various highly specialized aspects aimed at providing efficient processing of pragmatically important problem types; and real-world AGI systems are going to mix generality and specificity in their own ways.

The Emergence of an AGI Community

The emergence of a distinct community focused on AGI has been a gradual process that has largely coincided with an increase in the legitimacy accorded to explicitly AGI-focused research within the AI community as a whole. During the early 2000s, interest in the field's original grand goals began to rise in various research centers around the world, including IDSIA in Switzerland, RPI and Carnegie Mellon in the US, and many others.

In 2005, Springer published an edited volume titled "Artificial General Intelligence" (Goertzel and Pennachin, 2005). In 2006, the first formal research workshop on "AGI" was held, in Bethesda, Maryland (Goertzel, 2013). In the subsequent years, a broad community of researchers united by the explicit pursuit of AGI and related concepts has emerged, as evidenced e.g. by conference series such as Artificial General Intelligence (AGI), Biologically Inspired Cognitive Architectures (BICA), and Advances in Cognitive Systems; and by numerous special tracks and symposia at major conferences such as AAAI and IEEE, focused on closely allied topics such as Human-Level Intelligence and Integrated Intelligence. There is also a Journal of Artificial General Intelligence.

The AGI community, from the start, has involved researchers following a number of different directions, including some building cognitive architectures inspired by cognitive psychology and neurobiology; and also some focused on deriving mathematical results regarding formalizations of general intelligence (thus, among other things, building bridges between AGI and other formal pursuits such as theoretical computer science and statistical decision theory). Each of the subcommunities involved has brought its own history, e.g. some AGI cognitive architecture work extends ideas from classic AI cognitive architectures such as SOAR (Laird, 2012) and GPS (Newell et al, 1959), some extends work from evolutionary computing, etc. The mathematical side of contemporary AGI draws heavily on foundational work by Ray Solomonoff (1964) and other early pioneers of formal intelligence theory.

While the qualitative commonality among the various research directions pursued in the AGI community is relatively clear, there have not yet been any broadly successful attempts to clarify core hypotheses or conclusions binding the various threads of the AI field. As one effort along these lines, Goertzel (2014) articulated a "core AGI hypothesis", namely that "the creation and study of synthetic intelligences with sufficiently broad (e.g. human-level) scope and strong generalization capability, is at bottom qualitatively different from the creation and study of synthetic intelligences with significantly narrower scope and weaker generalization capability." This hypothesis was intended as a statement on which nearly all researchers in the AGI community would agree, regardless of their different conceptualizations of the AGI concept and their different architectural, theoretical, technical and engineering approaches. However, much more precise propositions than this will be needed to attain broad agreement among researchers, for the AGI field to be considered theoretically unified.

AGI and Related Concepts

AGI is related to many other commonly used terms and concepts. Joscha Bach has characterized AGI in terms of the quest to create "synthetic intelligence" (Bach, 2009). One also finds communities of researchers working toward AGI-related goals under the labels "computational intelligence", "natural intelligence", "cognitive architecture", "biologically inspired cognitive architecture", and many others.

AGI is related to, yet far from identical to, "human-level AI" (Cassimatis, 2006) -- a term which is usually used to mean, in effect, "human-level, reasonably human-like AGI". AGI is a fairly abstract notion, which is not intrinsically tied to any particular characteristics of human beings beyond their general intelligence. On the other hand, the concept of "human-level AI" is openly anthropomorphic, and seeks to compare synthetic intelligences to human beings along an implicit linear scale, a notion that introduces its own special complexities. If a certain AGI system is very different from humans, it may not be easy to assess in what senses it resides on the same level as humans, versus above or below. On the other hand, if one's goal is to create AGI systems that resemble humans, it could be argued that thinking about hypothetical radically different AGI systems is mainly a distraction. The narrower focus of the "human-level AI" concept, as opposed to AGI, seems to have positives and negatives, which are complex to disentangle given the current state of knowledge.

Perspectives on General Intelligence

The AGI field contains a number of different, largely complementary approaches to understanding the "general intelligence" concept. While the bulk of the AGI community's effort is devoted to devising and implementing designs for AGI systems, and developing theories regarding the best way to do so, the formulation of a detailed and rigorous theory of "what AGI is" also constitutes a small but significant part of the community's ongoing research.

The lack of a clear, universally accepted definition is not unique to "AGI." For instance, "AI" also has many different meanings within the AI research community, with no clear consensus on the definition. "Intelligence" is also a fairly vague concept; Legg and Hutter wrote a paper summarizing and organizing over 70 different published definitions of "intelligence", most oriented toward general intelligence, emanating from researchers in a variety of disciplines (Legg and Hutter, 2007).

Four key approaches to conceptualizing the nature of GI and AGI are outlined below.

The Pragmatic Approach to Characterizing General Intelligence

The pragmatic approach to conceptualizing general intelligence is typified by the AI Magazine article "Human Level Artificial Intelligence? Be Serious!", written by Nils Nilsson, one of the early leaders of the AI field (Nilsson, 2005). Nilsson's view is

... that achieving real Human Level artificial intelligence would necessarily imply that most of the tasks that humans perform for pay could be automated. Rather than work toward this goal of automation by building special-purpose systems, I argue for the development of general-purpose, educable systems that can learn and be taught to perform any of the thousands of jobs that humans can perform. Joining others who have made similar proposals, I advocate beginning with a system that has minimal, although extensive, built-in capabilities. These would have to include the ability to improve through learning along with many other abilities.

In this perspective, once an AI system can perform most of the practical tasks that humans do, it should be understood to possess human-level general intelligence. The implicit assumption here is that humans are the generally intelligent systems we care about, so that the best practical way to characterize general intelligence is via comparison with human capabilities.

The classic Turing Test for machine intelligence (Turing, 1950) – simulating human conversation well enough to fool human judges – is pragmatic in a sense similar to Nilsson's perspective. But the Turing Test has a different focus: emulating humans. Nilsson is not interested in whether an AI system can fool people into thinking it is human, but rather in whether an AI system can do the useful and important practical things that people can do.

Psychological Characterizations of General Intelligence

The psychological approach to characterizing general intelligence also focuses on human-like general intelligence; but rather than looking directly at practical capabilities, it tries to isolate deeper underlying capabilities that enable these practical capabilities. In practice it encompasses a broad variety of sub-approaches, rather than presenting a unified perspective.

Viewed historically, efforts to conceptualize, define, and measure intelligence in humans reflect a distinct trend from the general to the specific (it is interesting to note the similarity to historical trends in AI). Thus, early work in defining and measuring intelligence was heavily influenced by Spearman, who in 1904 proposed the psychological factor g (the "g factor") for general intelligence. Spearman argued that g was biologically determined, and represented the overall intellectual skill level of an individual. In 1916, Terman introduced the notion of an intelligence quotient or IQ.

In subsequent years, though, psychologists began to question the concept of intelligence as a single, undifferentiated capacity. There emerged a number of alternative theories, definitions, and measurement approaches, which share the idea that intelligence is multifaceted and variable both within and across individuals. Of these approaches, a particularly well-known example is Gardner’s (1983) theory of multiple intelligences, which proposes eight distinct forms or types of intelligence: (1) linguistic, (2) logical-mathematical, (3) musical, (4) bodily-kinesthetic, (5) spatial, (6) interpersonal, (7) intrapersonal, and (8) naturalist.

A Mathematical Approach to Characterizing General Intelligence

In contrast to approaches focused on human-like general intelligence, some researchers have sought to understand intelligence in general. One underlying intuition here is that

  • Truly, absolutely general intelligence would only be achievable given infinite computational ability. For any computable system, there will be some contexts and goals for which it is not very intelligent.
  • However, some finite computational systems will be more generally intelligent than others, and it is possible to quantify the extent of this generality

This approach is typified by the recent work of Legg and Hutter (2007a), who give a formal definition of general intelligence based on the Solomonoff-Levin prior, building heavily on the foundational work of Hutter (2005). Put very roughly, they define intelligence as the average reward-achieving capability of a system, calculated by averaging over all possible reward-summable environments, where each environment is weighted in such a way that more compactly describable environments have larger weights.

According to this sort of measure, humans are nowhere near the maximally generally intelligent system. However, intuitively, such a measure would seem to suggest that humans are more generally intelligent than, say, rocks or worms. While the original form of Legg and Hutter’s definition of intelligence is impractical to compute, there are also more tractable approximations.
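Schematically, the Legg-Hutter measure can be written as follows (a simplified rendering for illustration; the notation varies slightly across their papers):

```latex
% Universal intelligence \Upsilon of an agent \pi: expected performance
% over all reward-summable computable environments \mu in the class E,
% weighted by the Solomonoff-Levin prior 2^{-K(\mu)}, where K(\mu) is the
% Kolmogorov complexity of (a shortest program describing) \mu.
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
% Here V^{\pi}_{\mu} is the expected total reward that agent \pi obtains
% in environment \mu; compactly describable environments dominate the sum.
```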

The Adaptationist Approach to Characterizing General Intelligence

Another perspective views general intelligence as closely tied to the environment in which it exists. Pei Wang has argued carefully for a conception of general intelligence as "adaptation to the environment using limited resources" (Wang, 2006). A system may be said to have greater general intelligence, if it can adapt effectively to a more general class of environments, within realistic resource constraints.

Broadly Suspected Aspects of General Intelligence

Variations in perspective aside, there is reasonably broad agreement in the AGI community on some key likely features of general intelligence, e.g.:

  • General intelligence involves the ability to achieve a variety of goals, and carry out a variety of tasks, in a variety of different contexts and environments
  • A generally intelligent system should be able to handle problems and situations quite different from those anticipated by its creators
  • A generally intelligent system should be good at generalizing the knowledge it has gained, so as to transfer this knowledge from one problem or context to others
  • Arbitrarily general intelligence is likely not possible given realistic resource constraints
  • Real-world systems may display varying degrees of limited generality, but are inevitably going to be a lot more efficient at learning some sorts of things than others; and for any given real-world system, there will be some learning tasks on which it is unacceptably slow. So real-world general intelligences are inevitably somewhat biased toward certain sorts of goals and environments.
  • Humans display a higher level of general intelligence than existing AI programs do, and apparently also a higher level than other animals
  • According to our observations of humans and various theoretical perspectives, the following traits, among many others, are typically associated with general intelligence: reasoning, creativity, association, generalization, pattern recognition, problem solving, memorization, planning, achieving goals, learning, optimization, self-preservation, sensory data processing, language processing, classification, induction, deduction and abduction
  • It seems quite unlikely that humans happen to manifest a maximal level of general intelligence, even relative to the goals and environment for which they have been evolutionarily adapted

There is also a common intuition in much of the AGI community that various real-world general intelligences will tend to share certain common properties; though there is less agreement on what these properties are. A 2008 workshop on Human-Level AI resulted in a paper by Laird and Wray enumerating one proposed list of such properties (Laird et al, 2008); a 2009 workshop on AGI resulted in an alternative, more extensive list, articulated in a multi-author paper published in AI Magazine (Adams et al, 2012).

Current Scope of the AGI Field

Wlodek Duch, in his survey paper (Duch, 2008), divided existing approaches to AI into three paradigms – symbolic, emergentist and hybrid. To this trichotomy we here add one additional category, "universal." Due to the diversity of AGI approaches, it is difficult to find truly comprehensive surveys; Samsonovich (2010) is perhaps the most thorough but is by no means complete.

Universal AI

In the universal approach, one starts with AGI algorithms or agents that would yield incredibly powerful general intelligence if supplied with unrealistically massive amounts of computing power, and then views practically feasible AGI systems as specializations of these powerful theoretic systems.

The path toward universal AI began in earnest with Solomonoff's (1964) universal predictors, which provide a rigorous and elegant solution to the problem of sequence prediction, founded in the theory of algorithmic information, also known as Kolmogorov complexity (Kolmogorov, 1965; Li and Vitanyi, 2008). The core idea here (setting aside certain technicalities) is that the shortest program computing a sequence provides the best predictor regarding the continuation of the sequence.
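This idea is often summarized via Solomonoff's universal prior, shown here in a common simplified form:

```latex
% Universal prior probability of a binary string x: the summed weight of
% all programs p that cause a universal prefix machine U to output a
% sequence beginning with x; l(p) is the length of p in bits.
M(x) = \sum_{p \,:\, U(p) = x*} 2^{-l(p)}
% Short programs contribute exponentially more weight, so prediction by M
% favors continuations computed by the shortest consistent programs.
```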

Hutter's (2000, 2005, 2012) work on AIXI extends this approach, applying the core idea of Solomonoff induction to the problem of controlling an agent carrying out actions in, and receiving reinforcement signals from, a computable environment. In an abstract sense, AIXI is the optimally intelligent agent in computable environments.

In a bit more detail, what AIXI does is to maximize expected reward over all possible future perceptions created by all possible environments $q$ that are consistent with past perceptions. The expectation over environments is weighted, such that the simpler an environment, the higher its weight $2^{-l(q)}$, where simplicity is measured by the length $l$ of program $q$. AIXI effectively learns by eliminating Turing machines $q$ once they become inconsistent with the progressing history.

That is to say: Solomonoff's theoretically optimal universal predictors and their Bayesian learning algorithms assume only that the reactions of the environment are sampled from an unknown probability distribution $\mu$ contained in a set $\cal M$ of all enumerable distributions (Solomonoff, 1964). Since we typically do not know the program computing $\mu$, Solomonoff predicts the future in a Bayesian framework by using a mixture distribution $\xi= \sum_{i} w_i \mu_i$, a weighted sum of all distributions $\mu_i \in \cal M$, $i=1, 2, \ldots$, where $\sum_i w_i \leq 1$. Hutter used $\xi$ to create the theoretically optimal yet uncomputable RL algorithm AIXI (Hutter, 2005). In cycle $t+1$, given a history of past inputs and actions $h(\leq t)$, AIXI selects as its next action the first action of an action sequence maximizing $\xi$-predicted reward up to some horizon, typically $2t$. It turns out that the Bayes-optimal policy $p^\xi$ based on $\xi$ is self-optimizing in the sense that its average utility value converges asymptotically for all $\mu \in \cal M$ to the optimal value achieved by the (infeasible) Bayes-optimal policy $p^\mu$ which knows $\mu$ in advance. The necessary condition that $\cal M$ admits self-optimizing policies is also sufficient.
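The Bayesian-mixture idea can be illustrated with a toy, purely hypothetical model class: a handful of biased-coin environments standing in for the full class $\cal M$ of enumerable distributions. This is a sketch of the mixture $\xi = \sum_i w_i \mu_i$, not AIXI itself:

```python
# Toy Bayesian mixture prediction over a tiny hypothetical model class
# M = {Bernoulli(theta)}, with uniform prior weights w_i, mimicking the
# mixture xi = sum_i w_i mu_i used by Solomonoff/Hutter (illustrative only).

def mixture_predict(history, thetas, weights):
    """Posterior-weighted probability that the next bit is 1."""
    # Reweight each model by its likelihood of the observed history.
    posts = []
    for theta, w in zip(thetas, weights):
        like = 1.0
        for bit in history:
            like *= theta if bit == 1 else (1.0 - theta)
        posts.append(w * like)
    z = sum(posts)
    posts = [p / z for p in posts]
    # Mixture prediction: sum_i w_i(history) * mu_i(next bit = 1)
    return sum(p * theta for p, theta in zip(posts, thetas))

thetas = [i / 10 for i in range(1, 10)]          # candidate environments
weights = [1.0 / len(thetas)] * len(thetas)      # uniform prior

p1 = mixture_predict([1] * 20, thetas, weights)  # after twenty 1s
p2 = mixture_predict([0] * 20, thetas, weights)  # after twenty 0s
print(p1, p2)                                    # p1 near 0.9, p2 near 0.1
```

As data accumulates, posterior mass concentrates on the environments consistent with the history, echoing how AIXI effectively discards inconsistent environments.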

AIXI is uncomputable, but Hutter's algorithm AIXItl is a computable approximation that involves, at each step in its "cognitive cycle", a search over all programs of length less than $l$ and runtime less than $t$. Conceptually, AIXItl may be understood roughly as follows:

  • An AGI system is going to be controlled by some program
  • Instead of trying to figure out the right program via human wizardry, we can just write a "meta-algorithm" to search program space, and automatically find the best program for making the AGI smart, and then use that program to operate the AGI.
  • We can then repeat this meta-algorithm over and over, as the AGI gains more data about the world, so it will always have the operating program that’s best according to all its available data.
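The steps above can be sketched with a toy program search, where a tiny hypothetical space of next-bit predictors stands in for the space of all programs of bounded length and runtime:

```python
# Toy sketch of the "meta-algorithm" idea behind AIXItl (illustrative
# only): enumerate a small, hypothetical space of candidate programs,
# score each against the agent's history, and adopt the best scorer as
# the current operating program. Real AIXItl searches all programs of
# length < l and runtime < t; here the "programs" are simple predictors.

CANDIDATES = {
    "always-0": lambda h: 0,
    "always-1": lambda h: 1,
    "repeat-last": lambda h: h[-1] if h else 0,
    "alternate": lambda h: 1 - h[-1] if h else 0,
}

def score(program, history):
    """How many next-bit predictions the program would have gotten right."""
    return sum(program(history[:i]) == history[i] for i in range(len(history)))

def best_program(history):
    """Re-run the search as data accumulates to pick the operating program."""
    return max(CANDIDATES, key=lambda name: score(CANDIDATES[name], history))

print(best_program([0, 1, 0, 1, 0, 1]))   # an alternating world
print(best_program([1, 1, 1, 1, 1, 1]))   # a constant world
```

Repeating `best_program` after each new observation realizes the third bullet: the agent always operates with the program that best fits all available data.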

AIXItl is a precisely defined "meta-algorithm" of this nature. Related systems have also been formulated, including one due to Schmidhuber (2002) that is based on the Speed Prior, which takes program runtime into account in a way that is optimal in a certain sense.

The Universal AI research program also involves blueprints of universal problem solvers for arbitrary computable problems, that are time-optimal in various theoretical senses. These include Levin's (1973) asymptotically optimal Universal Search, which has constant multiplicative overhead; and its incremental extension, the Optimal Ordered Problem Solver, which can greatly reduce the constant overhead by re-using previous successful programs (Schmidhuber et al, 2004); as well as Hutter's (2002) asymptotically optimal method, which will solve any well-defined problem as quickly as the unknown fastest way of solving it, save for an additive constant overhead that becomes negligible as problem size grows (this method is related to AIXItl).

Self-improving universal methods have also been defined, including some that justify self-changes (including changes of the learning algorithm) through empirical evidence in a lifelong learning context (Schmidhuber et al, 1997). The self-referential, recursively self-improving "Goedel Machine" (Schmidhuber, 2006) proves theorems about itself. It can be implemented on practical general computers and may improve any part of its software (including its proof searcher and the possibly suboptimal initial learning algorithm itself) in a way that is provably time-optimal in a certain sense that takes constant overheads into account and goes beyond asymptotic optimality (Schmidhuber, 2006). It can be initialized by an asymptotically optimal meta-method (Hutter, 2002) which will solve any well-defined problem as quickly as the unknown fastest way of solving it, save for an additive constant overhead that becomes negligible as problem size grows.

In the perspective of universal AI, the vast majority of computationally feasible problems are "large" in the sense that they exist in the regime where asymptotic optimality is relevant; the other "small" problems are relatively few in number. However, it seems that many (perhaps all) of the problems of practical everyday interest to humans are "small" in this sense, which would imply that reduction in the overhead of the universal methods mentioned above is critical for practical application of universal AI. There has been work in this direction, dating back at least to (Schmidhuber et al, 1991), and including recent work such as (Schmidhuber et al, 2013a; Veness et al, 2011).

Symbolic AGI

Attempts to create or work toward AGI using symbolic reasoning systems date back to the 1950s and continue to the current day, with increasing sophistication. These systems tend to be created in the spirit of the "physical symbol system hypothesis" (Newell and Simon, 1976), which states that minds exist mainly to manipulate symbols that represent aspects of the world or themselves. A physical symbol system has the ability to input, output, store and alter symbolic entities, and to execute appropriate actions in order to reach its goals.

In 1956, Newell and Simon (1956) built a program, Logic Theorist, that discovers proofs in propositional logic. This was followed up by the General Problem Solver (Newell, 1963) that attempted to extend Logic Theorist type capabilities to commonsensical problem-solving. At this early stage, it became apparent that one of the key difficulties facing symbolic AI was how to represent the knowledge needed to solve a problem. Before learning or problem solving, an agent must have an appropriate symbolic language or formalism for the learned knowledge. A variety of representations were proposed, including complex logical formalisms (McCarthy and Hayes, 1969), semantic frames as proposed by Minsky (1975), and simpler feature-based representations.

Early symbolic AI work led to a number of specialized systems carrying out practical functions. Winograd's SHRDLU system (1972) could, using restricted natural language, discuss and carry out tasks in a simulated blocks world. CHAT-80 could answer geographical questions posed to it in natural language (Warren and Pereira, 1982). DENDRAL, developed from 1965 to 1983 in the field of organic chemistry, proposed plausible structures for new organic compounds (Buchanan and Feigenbaum, 1978). MYCIN, developed from 1972 to 1980, diagnosed infectious diseases of the blood, and prescribed appropriate antimicrobial therapy (Buchanan and Shortliffe, 1984). However, these systems notably lacked the ability to generalize, performing effectively only in the narrow domains for which they were engineered.

Modern symbolic AI systems seek to achieve greater generality of function and more robust learning ability via sophisticated cognitive architectures. Many such cognitive architectures focus on a "working memory" that draws on long-term memory as needed, and utilize centralized control over perception, cognition and action. Although in principle such architectures could be arbitrarily capable (since symbolic systems have universal representational and computational power, in theory), in practice symbolic architectures tend to be less developed in learning, creativity, procedure learning, and episodic memory. Leading examples of symbolic cognitive architectures include ACT-R (Anderson et al, 2004), originally founded on a model of human semantic memory; Soar (Laird, 2012), which is based on the application of production systems to solve problems defined as residing in various problem spaces, and which has recently been extended to include perception, episodic memory, and a variety of other cognitive functions; and Sigma, which applies many of Soar's architectural ideas using a probabilistic network based knowledge representation (Rosenbloom, 2013).
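The production-system core of such architectures can be sketched as a match-fire loop over a working memory of facts. This is a schematic toy, not Soar's actual algorithm, and the rules and facts are invented for illustration:

```python
# Minimal production-system sketch: rules match against a working memory
# of facts and fire to add new facts, with the rule store playing the
# role of long-term memory. Loop until quiescence (no rule changes WM).

working_memory = {("block", "A"), ("block", "B"), ("on", "A", "B")}

# Each rule: (name, condition over WM, facts to add when it fires)
rules = [
    ("mark-covered", lambda wm: ("on", "A", "B") in wm, {("not-clear", "B")}),
    ("infer-above", lambda wm: ("on", "A", "B") in wm, {("above", "A", "B")}),
]

def run(wm, rules):
    """Fire all matching rules until working memory stops changing."""
    changed = True
    while changed:
        changed = False
        for name, cond, additions in rules:
            if cond(wm) and not additions <= wm:
                wm |= additions
                changed = True
    return wm

final = run(set(working_memory), rules)
print(sorted(final))
```

Real architectures add conflict resolution (choosing among competing matched rules), variable binding, and subgoaling on impasses, which this sketch omits.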

Emergentist AGI

Another species of AGI design expects abstract symbolic processing – along with every other aspect of intelligence – to emerge from lower-level “subsymbolic” dynamics, which sometimes (but not always) are designed to simulate neural networks or other aspects of human brain function. Today’s emergentist architectures are sometimes very strong at recognizing patterns in high-dimensional data, reinforcement learning and associative memory; but no one has yet shown how to achieve high-level functions such as abstract reasoning or complex language processing using a purely subsymbolic, emergentist approach.

The broad concepts of emergentist AI can be traced back to Norbert Wiener's Cybernetics (1948), and more directly to the 1943 work of McCulloch and Pitts (1943), which showed how networks of simple thresholding "formal neurons" could be the basis for a Turing-complete machine. In 1949, Donald Hebb wrote The Organization of Behavior (Hebb, 1949), hypothesizing that neural pathways are strengthened each time they are used, a concept now called "Hebbian learning", conceptually related to long-term potentiation in the brain and to a host of more sophisticated reinforcement learning techniques (Sutton and Barto, 1998; Wiering and van Otterlo, 2012).
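The basic Hebbian rule can be stated in one line: the weight between two units grows in proportion to the product of their activities. A toy sketch (real models add decay or normalization to keep weights bounded):

```python
# Toy Hebbian update ("neurons that fire together wire together"):
# delta w_ij = eta * x_i * y_j, for pre-synaptic activity x and
# post-synaptic activity y. Illustrative only.

def hebbian_step(w, x, y, eta=0.1):
    """One Hebbian update of weight matrix w."""
    return [[w[i][j] + eta * x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))]

w = [[0.0, 0.0], [0.0, 0.0]]
x, y = [1, 0], [1, 1]            # pre- and post-synaptic activity
for _ in range(5):
    w = hebbian_step(w, x, y)

print(w)   # pathways from the active input unit are strengthened
```

After five co-activations, only the weights from the active input unit have grown, mirroring the use-dependent strengthening Hebb hypothesized.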

In the 1950s, practical learning algorithms for formal neural networks were articulated by Marvin Minsky (1952) and others. Rosenblatt (1958) designed "Perceptron" neural networks, and Widrow and Hoff (1962) presented a systematic neural net learning procedure that was later labeled "back-propagation." These early neural networks showed some capability to learn and generalize, but were not able to carry out practically impressive tasks. A comprehensive account of the early and recent history of the neural network field is given in (Schmidhuber, 2014).
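A minimal Rosenblatt-style perceptron, learning the linearly separable OR function with the classic error-correction rule, gives the flavor of these early systems:

```python
# Perceptron sketch: threshold unit trained with the classic update
# w <- w + eta * (target - prediction) * x on the OR function.

def predict(w, b, x):
    """Threshold activation: fire iff the weighted sum exceeds zero."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train(samples, epochs=20, eta=0.5):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            err = target - predict(w, b, x)
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
            b += eta * err
    return w, b

or_samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train(or_samples)
print([predict(w, b, x) for x, _ in or_samples])   # [0, 1, 1, 1]
```

On non-separable tasks like XOR this single-layer procedure fails, a limitation famously analyzed by Minsky and Papert and only overcome by multi-layer networks.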

An alternate approach to emergentist AI that emerged in the late 1960s and 1970s was evolutionary computing, centered on the genetic algorithm, a computational model of evolution by natural selection. John Holland's learning classifier system combined reinforcement learning and genetic algorithms into a cognitive architecture with complex, self-organizing dynamical properties (Holland, 1975). A learning classifier system consists of a population of binary rules on which a genetic algorithm (roughly simulating an evolutionary process) alters and selects the best rules. Rule fitness is based on a reinforcement learning technique.
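The genetic-algorithm core of such systems can be sketched in a few lines. This toy (not Holland's full learning classifier system) evolves bitstrings toward all-ones, with fitness equal to the number of ones; the population size, mutation rate, and truncation selection scheme are arbitrary choices for illustration:

```python
# Toy genetic algorithm: selection of the fitter half, one-point
# crossover, and per-bit point mutation, iterated over generations.
import random

random.seed(0)
L, POP, GENS = 12, 20, 60

def fitness(bits):
    return sum(bits)                     # number of ones

def mutate(bits, rate=0.05):
    return [b ^ (random.random() < rate) for b in bits]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]             # keep the fitter half unchanged
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children

best = max(pop, key=fitness)
print(fitness(best))                     # typically reaches or nears L
```

In a learning classifier system, the evolved bitstrings are condition-action rules and fitness comes from reinforcement signals rather than a fixed objective.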

In 1982, broad interest in neural net based AI began to resume, triggered partly by a paper by John Hopfield of Caltech (Hopfield, 1982), explaining how completely connected symmetric neural nets could be used to store associative memories. In 1986, psychologists Rumelhart and McClelland (1986) popularized the extension of the Widrow-Hoff learning rule to neural networks with multiple layers (a method that was independently discovered by multiple researchers).
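A tiny Hopfield-style associative memory illustrates the 1982 result: store a bipolar pattern with the outer-product (Hebbian) rule, then recover it from a corrupted cue. This sketch uses synchronous updates for brevity, whereas Hopfield's analysis assumed asynchronous updates:

```python
# Hopfield-network sketch: symmetric weights from the outer-product rule,
# recall by repeated thresholding of the weighted input sums.

def train(patterns, n):
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:               # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, steps=5):
    n = len(state)
    s = list(state)
    for _ in range(steps):               # synchronous updates, for brevity
        s = [1 if sum(w[i][j] * s[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return s

stored = [1, -1, 1, -1, 1, -1]
w = train([stored], 6)
noisy = [1, -1, 1, -1, -1, -1]           # cue with one flipped bit
print(recall(w, noisy))                  # recovers the stored pattern
```

The stored pattern acts as an attractor: states near it are pulled back, which is what makes the network a content-addressable (associative) memory.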

Currently neural networks are an extremely popular machine learning technique with a host of practical applications. Multilayer networks of formal neurons or other conceptually similar processing units have become known by the term "deep learning", and have proved highly successful in multiple areas including image classification, object detection, handwriting recognition, speech recognition, and machine translation (e.g., Schmidhuber, 2015; Bengio, 2014).

An important subset of emergentist cognitive architectures, still at an early stage of advancement, is developmental robotics, which is focused on controlling robots without significant “hard-wiring” of knowledge or capabilities, allowing robots to learn (and learn how to learn etc.) via their engagement with the world. A significant focus is often placed here on “intrinsic motivation,” wherein the robot explores the world guided by internal goals like novelty or curiosity, forming a model of the world as it goes along, based on the modeling requirements implied by its goals. Some of the foundations of this research area were laid by Juergen Schmidhuber’s work in the 1990s (Schmidhuber, 1991), but now with more powerful computers and robots the area is leading to more impressive practical demonstrations.

Hybrid AGI

In response to the complementary strengths and weaknesses of the other existing approaches, a number of researchers have turned to integrative, hybrid architectures, which combine subsystems operating according to the different paradigms. The combination may be done in many different ways, e.g. connection of a large symbolic subsystem with a large subsymbolic system, or the creation of a population of small agents each of which is both symbolic and subsymbolic in nature. One aspect of such hybridization is the integration of neural and symbolic components (Hammer and Hitzler, 2007). Hybrid systems are quite heterogeneous in nature, and here we will mention three that are relatively representative; a longer list is reviewed in (Goertzel, 2014).

A classic example of a hybrid system is the CLARION (Connectionist Learning with Adaptive Rule Induction On-line) cognitive architecture created by Ron Sun (2002), whose design focuses on explicitly distinguishing implicit from explicit processes, and on capturing the interaction between these two process types. Implicit processes are modeled as neural networks, whereas explicit processes are modeled as formal symbolic rules. CLARION involves an action-centered subsystem whose job is to control both external and internal actions; its implicit layer is made up of neural networks called Action Neural Networks, while its explicit layer is made up of action rules. It also involves a non-action-centered subsystem whose job is to maintain general knowledge; its implicit layer is made up of associative neural networks, while its explicit layer is made up of associative rules. The learning dynamics of the system involves ongoing coupling between the neural and symbolic aspects.
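The implicit/explicit division can be sketched schematically (an illustrative toy with invented names, not CLARION's actual code or API): a weight-based implicit layer and a rule-based explicit layer jointly score candidate actions, and a successful implicit choice can be "extracted" bottom-up into a new explicit rule:

```python
# Implicit layer: graded condition-action strengths (stand-in for a network).
implicit_weights = {("wet", "umbrella"): 0.9, ("wet", "walk"): 0.1,
                    ("dry", "umbrella"): 0.2, ("dry", "walk"): 0.8}
# Explicit layer: crisp IF-condition-THEN-action rules.
explicit_rules = {"wet": "umbrella"}

def choose_action(condition, actions=("umbrella", "walk"), rule_weight=0.5):
    """Combine the two layers' votes and pick the highest-scoring action."""
    def score(a):
        s = implicit_weights.get((condition, a), 0.0)
        if explicit_rules.get(condition) == a:  # explicit layer casts its vote
            s += rule_weight
        return s
    return max(actions, key=score)

def extract_rule(condition, action, success):
    # bottom-up learning: a successful implicit choice becomes an explicit rule
    if success and condition not in explicit_rules:
        explicit_rules[condition] = action

extract_rule("dry", "walk", success=True)
```

The real architecture learns the implicit strengths by reinforcement in neural networks and refines extracted rules over time; the point here is only the two-way coupling between layers.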

The LIDA architecture (Faghihi and Franklin, 2012), developed by Stan Franklin and his colleagues, is closely based on cognitive psychology and cognitive neuroscience, particularly on Bernard Baars' Global Workspace Theory and Baddeley's model of working memory. LIDA's dynamics are based on the principles that: 1) much of human cognition functions by means of frequently iterated (~10 Hz) interactions, called cognitive cycles, between conscious contents, the various memory systems and action selection; 2) these cognitive cycles serve as the “atoms” of cognition of which higher-level cognitive processes are composed. LIDA contains components corresponding to different processes known to be associated with working and long-term memory (e.g. an episodic memory buffer, a sensory data processing module, etc.), and utilizes different AI algorithms within each of these components.
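A cognitive cycle of this general flavor can be caricatured as a perceive/compete/broadcast/act loop. The sketch below is schematic, with invented names, and is not the actual LIDA implementation:

```python
def cognitive_cycle(percepts, memory, broadcast_log):
    # 1. perception: percepts activate candidate coalitions, boosted by memory
    coalitions = {p: memory.get(p, 0.0) + 1.0 for p in percepts}
    # 2. competition for consciousness: the strongest coalition wins the
    #    global workspace "spotlight" and is broadcast system-wide
    winner = max(coalitions, key=coalitions.get)
    broadcast_log.append(winner)
    # 3. learning: the broadcast content is reinforced in memory
    memory[winner] = memory.get(winner, 0.0) + 1.0
    # 4. action selection conditioned on the conscious broadcast
    return "attend:" + winner

memory, log = {}, []
actions = [cognitive_cycle(ps, memory, log)
           for ps in (["door"], ["door", "bell"], ["bell"])]
```

Note how in the second cycle the previously broadcast "door" percept wins the competition because earlier cycles strengthened it in memory, illustrating how the cycles compose into longer-running cognitive processes.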

The CogPrime architecture (Goertzel et al, 2013), implemented in the OpenCog AI software framework, represents symbolic and subsymbolic knowledge together in a single weighted, labeled hypergraph representation called the Atomspace. Elements in the Atomspace are tagged with probabilistic or fuzzy truth values, and also with short- and long-term-oriented "attention values." Working memory is associated with the subset of Atomspace elements possessing the highest short-term importance values. A number of cognitive processes, including a probabilistic logic engine, an evolutionary program learning framework and a neural-net-like associative and reinforcement learning system, are configured to concurrently update the Atomspace, and designed to aid each other's operation.
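A minimal sketch of such a store might look as follows (illustrative only; OpenCog's real Atomspace API differs): atoms are nodes or links over other atoms, each carrying a truth value and an attention value, and "working memory" is simply the highest-importance atoms:

```python
class Atom:
    """A node, or a labeled hyperedge (link) over other atoms."""

    def __init__(self, label, outgoing=(), truth=1.0, sti=0.0):
        self.label = label                # e.g. "cat", "Inheritance"
        self.outgoing = tuple(outgoing)   # atoms this link connects (empty for a node)
        self.truth = truth                # probabilistic truth value in [0, 1]
        self.sti = sti                    # short-term importance (attention value)

class AtomSpace:
    def __init__(self):
        self.atoms = []

    def add(self, label, outgoing=(), truth=1.0, sti=0.0):
        atom = Atom(label, outgoing, truth, sti)
        self.atoms.append(atom)
        return atom

    def working_memory(self, k=2):
        # working memory = the k atoms with highest short-term importance
        return sorted(self.atoms, key=lambda a: a.sti, reverse=True)[:k]

space = AtomSpace()
cat = space.add("cat", sti=0.7)
animal = space.add("animal", sti=0.3)
# a labeled hyperedge asserting "cat inherits from animal" with truth 0.95
link = space.add("Inheritance", outgoing=(cat, animal), truth=0.95, sti=0.9)
```

Because links are themselves atoms, the different cognitive processes can all read and write the same structures: a logic engine updates truth values while an attention-allocation process updates importance values on the very same atoms.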

Future of the AGI Field

The field of AGI is still at a relatively early stage of development, in the sense that nobody has yet demonstrated a software or hardware system that is broadly recognized as displaying a significant degree of general intelligence, or as being near general-purpose human-level AI. No one has yet even demonstrated a compelling "proto-AGI" system, such as a robot that can do a variety of preschool-type activities in a flexible and adaptive way, or a chatbot that can hold an hour's conversation without sounding bizarre or resorting to repeated catch-phrases.

Furthermore, there has not yet emerged any broadly accepted theory of general intelligence. Such a theory might be expected to include a characterization of what general intelligence is, and a theory of what sorts of architecture can be expected to work for achieving human-level AGI using realistic computational resources.

However, a significant plurality of experts believes there is a possibility of dramatic, interlinked progress in AGI design, engineering, evaluation and theory in the relatively near future. For example, in a survey of researchers at the AGI-2010 conference, the majority of respondents felt that human-level AGI was likely to arise before 2050, and some were much more optimistic (Baum et al, 2011). Similarly, a 2014 poll among AI experts (Müller and Bostrom, 2014) at various conferences showed broad agreement that AGI systems will likely reach overall human ability (defined as "ability to carry out most human professions at least as well as a typical human") around the middle of the 21st century. The years 2013 and 2014 also saw sharply heightened commercial activity in the AI space, which is difficult to evaluate in terms of its research implications, but indicates a general increase in interest in the field.

The possibility of a relatively near-term advent of advanced AGI has led some researchers and other observers to express concern about the ethics of AGI development and the possibility of "existential risks" associated with AGI. A number of recently-formed research institutes have emerged, placing significant focus on this topic, e.g. the Machine Intelligence Research Institute (formerly the Singularity Institute for AI), Oxford University's Future of Humanity Institute, and Cambridge University's Center for the Study of Existential Risk (CSER).

The dramatic potential benefits of AGI, once it is achieved, have been explored by a variety of thinkers during the past decades. I.J. Good (1965) famously pointed out that "the first ultraintelligent machine is the last invention that man need ever make." Hans Moravec (1988), Vernor Vinge (1993), Ray Kurzweil (1999, 2006), and many others have highlighted the potential of AGI to effect radical, perhaps sudden changes in human society.

These thinkers view AGI as one of a number of emerging transformational technologies, including nanotechnology, genetic engineering, brain-computer interfacing, mind uploading and others, and focus on the potential synergies between AGI and these other technologies once further advances in various relevant directions occur.


References

  • Adams, Sam, Itamar Arel, Joscha Bach, Robert Coop, Rod Furlan, Ben Goertzel, J. Storrs Hall, Alexei Samsonovich, Matthias Scheutz, Matthew Schlesinger, Stuart C. Shapiro and John Sowa (2012). "Mapping the Landscape of Human-Level Artificial General Intelligence," AAAI Artificial Intelligence Magazine, Vol. 33, 25-42
  • Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y . (2004). An integrated theory of the mind. Psychological Review 111, (4). 1036-1060.
  • Arel, I., D. Rose, T. Karnowski (2010). Deep Machine Learning - A New Frontier in Artificial Intelligence Research. IEEE Computational Intelligence Magazine, Vol. 14, pp. 12-18, November, 2010
  • Baum, Seth, Ben Goertzel, and Ted G. Goertzel (2011). “How long until human-level AI? Results from an expert assessment. “ Technological Forecasting & Social Change, vol. 78, no. 1, pages 185-195.
  • Bengio, Yoshua, Ian Goodfellow and Aaron Courville (2014). Deep Learning. Book in preparation for MIT Press.
  • Buchanan, Bruce G. and Edward H. Shortliffe (1984). Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Addison-Wesley. Reading MA.
  • Buchanan, B.G. and Feigenbaum, E.A. (1978). Dendral and meta-dendral: Their applications dimension. Artificial Intelligence, 11: 5-24
  • Cassimatis, Nick (2006), Editor. “Human-Level Intelligence.” Special Issue of Artificial Intelligence Magazine.
  • Duch, Wlodzislaw and Richard Oentaryo and Michel Pasquier (2008). “Cognitive Architectures: Where Do We Go From Here?” Proceedings of AGI-08.
  • Faghihi, U., & Franklin, S. (2012). The LIDA Model as a Foundational Architecture for AGI. In P. Wang & B. Goertzel (Eds.), Theoretical Foundations of Artificial General Intelligence (pp. 105-123). Paris: Atlantis Press
  • Gardner, Howard (1983). Frames of Mind: The Theory of Multiple Intelligences. Basic Books.
  • Goertzel and Pennachin (2005). Artificial General Intelligence. Springer.
  • Goertzel, Ben, Nil Geisweiller and Cassio Pennachin (2013). Engineering General Intelligence. Atlantis Press.
  • Goertzel, Ben (2013). Ben Goertzel on AGI as a Field. Interview with Machine Intelligence Research Institute.
  • Goertzel, Ben (2014). Artificial General Intelligence: Concept, State of the Art, and Future Prospects. Journal of Artificial General Intelligence.
  • Good, I.J. (1965). "Speculations Concerning the First Ultraintelligent Machine" (HTML), Advances in Computers, vol. 6
  • Hammer, Barbara and Pascal Hitzler (Eds), Perspectives of Neural-Symbolic Integration. Studies in Computational Intelligence, Vol. 77. Springer, 2007
  • Hebb, D.O. (1949). "The Organization of Behavior". New York: Wiley & Sons.
  • Holland, John (1975). Adaptation in Natural and Artificial Systems. U. Michigan Press.
  • Hopfield, J. J. (1982) Neural networks and physical systems with emergent collective computational properties. Proc. Nat. Acad. Sci. (USA) 79, 2554-2558.
  • Hutter, Marcus (2000) A Theory of Universal Artificial Intelligence based on Algorithmic Complexity, arXiv:cs.AI/0004001
  • Hutter, M. (2002). The Fastest and Shortest Algorithm for All Well-Defined Problems. International Journal of Foundations of Computer Science 13(3):431-443
  • Hutter, Marcus (2005). Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer
  • Izhikevich, Eugene M. and Gerald M. Edelman (2008). Large-Scale Model of Mammalian Thalamocortical Systems. PNAS (2008) 105:3593-3598
  • Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information and Transmission, 1(1):1--7
  • Kurzweil, Ray (1999). The Age of Spiritual Machines. Penguin.
  • Kurzweil, Ray (2006). The Singularity is Near. Penguin.
  • Laird, John and Robert Wray and Robert Marinier and Pat Langley (2009). “Claims and Challenges in Evaluating Human-Level Intelligent Systems.” Proceedings of AGI-09.
  • Laird, John (2012). The Soar Cognitive Architecture. MIT Press.
  • Legg, Shane and Marcus Hutter (2007) . “A Collection of Definitions of Intelligence,” in Advances in Artificial General Intelligence, Ed. by Ben Goertzel, Pei Wang and Stan Franklin. IOS Press.
  • Legg, Shane and Marcus Hutter (2007a). Universal Intelligence: A Definition of Machine Intelligence. Minds and Machines 17(4), 391-444.
  • Levin, Leonid (1973). "Universal search problems." Problems of Information Transmission 9(3): 115–116
  • Li, Ming and Paul Vitanyi (2008). An Introduction to Kolmogorov Complexity and its Applications. Springer. 3rd Ed.
  • McCarthy, J. and P.J. Hayes (1969). "Some philosophical problems from the standpoint of artificial intelligence". Machine Intelligence 4: 463–502.
  • McCulloch, Warren S and Walter Pitts (1943). A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics. December 1943, Volume 5, Issue 4, pp 115-133
  • Minsky, Marvin (1975). "A Framework for Representing Knowledge," in The Psychology of Computer Vision," P. Winston (Ed.), McGraw-Hill 1975
  • Moravec, Hans (1988). Mind Children. Harvard University Press.
  • Müller, Vincent C. and Bostrom, Nick (2014). "Future progress in artificial intelligence: A poll among experts," in Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence. Springer.
  • Newell, A. and H. Simon (1976). Computer science as empirical inquiry: symbols and search. Communications of the ACM 19-3.
  • Newell, A.; Shaw, J.C.; Simon, H.A. (1959). Report on a general problem-solving program. Proceedings of the International Conference on Information Processing. pp. 256–264.
  • Nilsson, Nils (2005). ”Human Level Artificial Intelligence? Be Serious!” AI Magazine. 26, Winter, 68–75
  • Rosenblatt, Frank (1958), The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain, Cornell Aeronautical Laboratory, Psychological Review, v65, No. 6, pp. 386–408
  • Rosenbloom, P. S. (2013). The Sigma cognitive architecture and system. AISB Quarterly, 136, 4-13.
  • Rumelhart, David and James McClelland (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Bradford.
  • Samsonovich, Alexei V. (2010). Toward a Unified Catalog of Implemented Cognitive Architectures. Biologically Inspired Cognitive Architectures 2010. IOS Press.
  • Schmidhuber, J. (1991). Curious Model-Building Control Systems. Proceedings of the International Joint Conference on Neural Networks, Singapore, vol. 2, pp. 1458-1463. IEEE Press.
  • Schmidhuber, J. (1995). Reinforcement-driven information acquisition in non-deterministic environments. Proc. ICANN’95.
  • Schmidhuber, J, J. Zhao, and M. Wiering. (1997). Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28:105--130
  • Schmidhuber, J. (2002). The Speed Prior: a new simplicity measure yielding near-optimal computable predictions. In J. Kivinen and R. H. Sloan, editors, Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002) , Lecture Notes in Artificial Intelligence, 216--228. Springer, Sydney, Australia,
  • Schmidhuber, J. (2004). Optimal Ordered Problem Solver. Machine Learning 54-3, 211-254
  • Schmidhuber, J. (2006). Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements, in "Artificial General Intelligence", Ed. Ben Goertzel and Cassio Pennachin, Springer
  • Schmidhuber, J. (2013). New Millennium AI and the Convergence of History: Update of 2012. In Ed. Amnon Eden and James Moor, The Singularity Hypothesis, Springer.
  • Schmidhuber, J. (2013a). POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem. Frontiers in Cognitive Science.
  • Schmidhuber, J. (2015). Deep Learning in Neural Networks: An Overview. Neural Networks 61, pp. 85-117
  • Solomonoff, Ray (1964). A Formal Theory of Inductive Inference, Parts I and II. Part I: Information and Control Vol 7, No. 1, pp. 1-22. Part II: Information and Control, Vol. 7, No. 2, pp. 224-254,
  • Sun, Ron (2002). Duality of the Mind: A Bottom-up Approach Toward Cognition. Mahwah, NJ: Lawrence Erlbaum Associates
  • Sutton, Richard and Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press. Cambridge, MA.
  • Taigman, Yaniv, Ming Yang, Marc'Aurelio Ranzato and Lior Wolf (2014). DeepFace: Closing the Gap to Human-Level Performance in Face Verification. Conference on Computer Vision and Pattern Recognition (CVPR), June 24 2014
  • Turing, Alan (1950). Computing machinery and intelligence. Mind 59.
  • Veness, J.; Ng, K. S.; Hutter, M.; Uther, W. T. B.; and Silver, D. (2011) . A Monte-Carlo AIXI Approximation. J. Artif. Intell. Res. 40, 95–142.
  • Vinge, Vernor (1993). "The Coming Technological Singularity: How to Survive in the Post-Human Era", originally in Vision-21: Interdisciplinary Science and Engineering in the Era of Cyberspace, G. A. Landis, ed., NASA Publication CP-10129, pp. 11–22
  • Wang, Pei (2006). Rigid Flexibility: The Logic of Intelligence. Springer.
  • Warren, D.H.D. and Pereira, F.C.N. (1982). An efficient easily adaptable system for interpreting natural language queries. Computational Linguistics, 8(3-4)
  • Widrow, B. and M.E. Hoff (1962). Associative Storage and Retrieval and Digital Information in networks of adaptive neurons. In EE Bernard and MR Kare, Ed., “Biological prototypes and synthetic systems”, v.1 p. 160. New York, Plenum press.
  • Wiener, Norbert (1948). Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press.
  • Winograd, Terry (1972). Understanding Natural Language. Academic Press, New York.
