In search of a model for modeling intelligence

True artificial intelligence will require rich models that incorporate real-world phenomena.


An orrery, a runnable model of the solar system that allows us to make predictions. Photo: Wikimedia Commons.

Editor’s note: this post is part of our Intelligence Matters investigation.

In my last post, we saw that AI means a lot of things to a lot of people. These dueling definitions each have a deep history — ok fine, baggage — that has massed and layered over time. While they’re all legitimate, they share a common weakness: each one can apply perfectly well to a system that is not particularly intelligent. As just one example, the chatbot that was recently touted as having passed the Turing test is certainly an interlocutor (of sorts), but it was widely criticized as not containing any significant intelligence.

Let’s ask a different question instead: What criteria must any system meet in order to achieve intelligence — whether an animal, a smart robot, a big-data cruncher, or something else entirely?

To answer this question, I want to explore a hypothesis that I’ve heard attributed to the cognitive scientist Josh Tenenbaum (who was a member of my thesis committee). He has not, to my knowledge, unpacked this deceptively simple idea in detail (though see his excellent and accessible paper How to Grow a Mind: Statistics, Structure, and Abstraction), and he would doubtless describe it quite differently from my attempt here. Any foolishness which follows is therefore most certainly my own, and I beg forgiveness in advance.

I’ll phrase it this way:

Intelligence, whether natural or synthetic, derives from a model of the world in which the system operates. Greater intelligence arises from richer, more powerful, “runnable” models that are capable of more accurate and contingent predictions about the environment.

What do I mean by a model? After all, people who work with data are always talking about the “predictive models” that are generated by today’s machine learning and data science techniques. While these models do technically meet my definition, it turns out that the methods in wide use capture very little of what is knowable and important about the world. We can do much better, though, and the key prediction of this hypothesis is that systems will gain intelligence proportionate to how well the models on which they rely incorporate additional aspects of the environment: physics, the behaviors of other intelligent agents, the rewards that are likely to follow from various actions, and so on. And the most successful systems will be those whose models are “runnable,” able to reason about and simulate the consequences of actions without actually taking them.
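
To make “runnable” a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption invented for illustration: the one-dimensional world, the goal position, the fuel rule, and the names `predict`, `simulate`, `score`, and `choose_plan`. It is not drawn from any particular system. It shows only the core idea: the agent evaluates candidate plans by running its model forward, and nothing happens in the world until it commits to one.

```python
from dataclasses import dataclass
from itertools import product

# Everything here is a made-up toy: a one-dimensional world, a goal at
# position 3, and an agent that burns one unit of fuel per move.
GOAL = 3
ACTIONS = (-1, 0, 1)  # step left, stay put, step right


@dataclass(frozen=True)
class State:
    position: int
    fuel: int


def predict(state, action):
    """Forward model: the state we expect to follow from an action."""
    if action == 0 or state.fuel == 0:
        return state  # staying put, or out of fuel: nothing changes
    return State(state.position + action, state.fuel - 1)


def simulate(state, plan):
    """Run the model forward over a whole plan without acting in the world."""
    for action in plan:
        state = predict(state, action)
    return state


def score(state):
    """Higher is better: negative distance from the goal."""
    return -abs(GOAL - state.position)


def choose_plan(state, horizon=3):
    """Imagine every plan of the given length and keep the best-scoring one."""
    plans = product(ACTIONS, repeat=horizon)
    return max(plans, key=lambda plan: score(simulate(state, plan)))


start = State(position=0, fuel=5)
best = choose_plan(start)
print(best, simulate(start, best))
```

With these toy settings the chosen plan is three steps to the right, and `start` itself is never modified. A richer model would replace `predict` with something that knows about physics, other agents, or uncertain outcomes; the exhaustive search would not survive that change, but the separation between imagining and acting would.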

Let’s look at a few examples.

  • Single-celled organisms use a behavior called chemotaxis to swim toward food and away from toxins; they do this by detecting the relevant chemical concentration gradients in their liquid environment. The organism is thus acting on a model of the world: one that, while devastatingly simple, usually serves it well.
  • Mammalian brains have a region known as the hippocampus that contains cells that fire when the animal is in a particular place, as well as cells that fire at regular intervals on a hexagonal grid. While we don’t yet understand all of the details, these cells form part of a system that models the physical world, doubtless to aid in important tasks like finding food and avoiding danger — not so different from the bacteria.
  • While humans also have a hippocampus, which probably performs some of these same functions, we also have overgrown neocortexes that model many other aspects of our world, including, crucially, our social environment: we need to be able to predict how others will act in response to various situations.
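
The first of these examples is simple enough to sketch in a few lines of Python. This is a deliberately crude toy under stated assumptions, not a biological simulation: the world is a single line, the concentration profile is an invented function, and the random "tumble" of real bacteria is reduced to a plain reversal of direction. The function names are hypothetical. What it illustrates is how little the organism needs to carry around: one remembered concentration reading.

```python
def concentration(x, food_at=0.0):
    """Attractant concentration: highest at the food, falling off with distance."""
    return 1.0 / (1.0 + abs(x - food_at))


def run_and_tumble(x, direction=1, steps=50, step_size=1.0):
    """Swim along a line, reversing whenever the last step made things worse."""
    last = concentration(x)
    path = [x]
    for _ in range(steps):
        x += direction * step_size
        now = concentration(x)
        if now < last:
            direction = -direction  # "tumble": in 1D, simply turn around
        last = now  # the whole world model is this one remembered number
        path.append(x)
    return path


path = run_and_tumble(x=10.0, direction=1)  # starts out swimming the wrong way
print(path[0], path[-1])
```

Even though the organism starts out heading away from the food, it corrects after one step and ends up hovering within a step of the source. It cannot predict anything it has not just sensed, which is exactly what separates this kind of model from the richer ones further down the list.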

The scientists who study these and many other examples have solidly established that naturally occurring intelligences rely on internal models. The question, then, is whether artificial intelligences must rely on the same principles. In other words, what exactly did we mean when we said that intelligence “derives from” internal models? Just how strong is the causal link between a system having a rich world model and its ability to possess and display intelligence? Is it an absolute dependency, meaning that a sophisticated model is a necessary condition for intelligence? Are good models merely very helpful in achieving intelligence, and therefore likely to be present in the intelligences that we build or grow? Or is a model-based approach but one path among many in achieving intelligence? I have my hunches — I lean toward the stronger formulations — but I think these need to be considered open questions at this point.

The next thing to note about this conception of intelligence is that, bucking a long-running trend in AI and related fields, it is not a behaviorist measure. Rather than evaluating a system based on its actions alone, we are deliberately piercing the veil in order to make claims about what is happening on the inside. This is at odds with the most famous machine intelligence assessment, the Turing test; it also contrasts with another commonly referenced measure of general intelligence, “an agent’s ability to achieve goals in a wide range of environments”.

Of course, the reason for a naturally evolving organism to spend significant resources on a nervous system that can build and maintain a sophisticated world model is to generate actions that promote reproductive success: big brains are energy hogs, and they need to pay rent. So, it’s not that behavior doesn’t matter, but rather that the strictly behavioral lens might be counterproductive if we want to learn how to build generally intelligent systems. A focus on the input-output characteristics of a system might suffice when its goals are relatively narrow, such as medical diagnoses, question answering, and image classification (though each of these domains could benefit from more sophisticated models). But this black-box approach is necessarily descriptive, rather than normative: it describes a desired endpoint, without suggesting how this result should be achieved. This devotion to surface traits leads us to adopt methods that do not scale to harder problems.

Finally, what does this notion of intelligence say about the current state of the art in machine intelligence as well as likely avenues for further progress? I’m planning to explore this more in future posts, but note for now that today’s most popular and successful machine learning and predictive analytics methods — deep neural networks, random forests, logistic regression, Bayesian classifiers — all produce models that are remarkably impoverished in their ability to represent real-world phenomena.

In response to these shortcomings, there are several active research programs attempting to bring richer models to bear, including but not limited to probabilistic programming and representation learning. By now, you won’t be surprised that I think such approaches represent our best hope for building intelligent systems that can truly be said to understand the world they live in.


  • Max Shron

    An excellent point. If you’re interested in this kind of thing, the model of intelligence you’re talking about has a solid history in the philosophy of mind. Daniel Dennett, for example, is famous in philosophy circles for defining intelligence and consciousness in terms of models the organism has of its environment and of itself. I seem to remember some version of this idea going back to the late 19th century from the psychologist Wilhelm Wundt, but obviously not nearly as operational as what you’re talking about.

    • beaucronin

      Yes, absolutely – it’s been a long while, but Dennett in particular really helped me to understand the power of this approach. Now, let’s not be afraid to apply those ideas to the intelligent systems that we’re trying to build today…

      • Rowena

        You are absolutely right. This article is very informative and very well explained. I love this type of blog.

    • Max, thanks for the books. Definitely buying them (and reading!)

      Beau, good points. We haven’t reached the stage of modeling intelligence; we are way far off. Even with deep learning et al., we are still solving point problems, viz. recognizing a cat, differentiating pictures, and so on. AFAIK, Jeff Hawkins had come the closest to modeling the essentials of biological intelligence in a series of lectures, which I had chronicled in my blogs. Do you know where they are at now?

  • Dan Marthaler

    This is basically the mindset behind a lot of the work of Pierre-Yves Oudeyer (itself an extension of Juergen Schmidhuber’s work) and his team. They want to build these models you are writing about. They use RL to do so (with an objective function that maximizes not predictability or interestingness, but learning rate).

  • floatingbones

    I find it interesting that a mechanical orrery was used as the example for a model of a physical system. In the mid-1980s, Professors Gerry Sussman and Jack Wisdom of MIT created the Digital Orrery and published the paper “Numerical Evidence that the Motion of Pluto is Chaotic” (1988). Among other things, this paper shows that no mechanical orrery can ultimately predict the motions of the planets, dwarf planets, moons, etc. The horizon of predictability is reportedly a few tens of millions of years, but the fundamental dynamic of chaotic motion must be noted in any accurate model.

  • Sheldon Rampton

    Have you read Read Montague’s book, “Your Brain is (Almost) Perfect”? I read part of it and thought he made some interesting observations.

    First: You write that “big brains are energy hogs.” This may be true relative to small brains, but one of the points Montague makes is that biological brains are actually remarkably energy-efficient compared to electronic computers:

    Brains have to be energy-efficient because they evolved under conditions of scarcity, in which it was not always easy to acquire enough energy to sustain life at all, let alone thought. From an evolutionary perspective, selection pressure favored the development of brains that produce a net positive in terms of an organism’s ability to acquire and retain calories — for example, by helping them become more effective at obtaining food, or by helping them avoid becoming someone else’s food.

    Second: He makes the point that the brain achieves energy efficiency through some tradeoffs. First, brains transmit messages much more slowly than electronic computers. They are also structured in a way that minimizes the need to transmit signals from one part of the brain (or one part of the body) to another part. Brains accomplish this by creating multiple copies of the information they need to manage. Given the brain’s low energy requirement and slow processing speed, it needs to minimize bandwidth usage by performing operations locally using the nearest available copy rather than by needing to transmit everything back and forth through a central processing unit. Many of our ostensibly conscious behaviors (breathing, walking, etc.) are actually semi-autonomous. Moreover, our notion that we perceive the world through sensory detection is only partially true. The brain uses a model of our foot to anticipate sensations that we expect to feel there, which enables us to anticipate pain or other sensations and therefore react to them more quickly than if we had to wait for the foot’s nerve impulses to be transmitted to the brain. The existence of multiple models of the body within the brain also accounts for phenomena such as phantom limb pain.

    You say that our brains have “a system that models the physical world.” I think it would be a little more accurate to say that brains have MULTIPLE such systems that sometimes only interact very loosely with one another. In addition to modeling the physical world outside itself, these systems also model our own bodies and our own minds. Our mental model of ourselves even understands to some degree that it is modeling a collection of processes that are only loosely connected. This is why we are able to understand phrases that would otherwise be nonsensical, such as, “I’m getting in touch with my feelings.” (If the brain’s processes were closely connected, the “I” in that sentence would be identical to “my feelings” and would therefore have no need to get in touch with them.)

    Our brain’s ability to model itself is also the basis for important human attributes such as empathy — emotional intelligence that uses our model of our own feelings as a basis for understanding and imagining that we experience the feelings of others.

    The other thing worth noticing about the intelligence of actual organic brains is that it is remarkably fault-tolerant. Humans routinely misremember and misperceive things, and our perceptions are often shaped heavily by crude instincts, stereotypes and oversimplifications of complex information. Moreover, huge blocks of brain function can be knocked out entirely without fatally compromising the brain’s ability to function. This sloppy approach to cognition would be difficult to program into a computer and would be unacceptable behavior from computer software even if we could program it easily. From an evolutionary perspective, though, there are obvious advantages to being able to make quick, sloppy decisions in real time, even if they only loosely approximate “correct” responses to the situation in which those decisions must be made.