ENTRIES TAGGED "intelligence matters"

In search of a model for modeling intelligence

True artificial intelligence will require rich models that incorporate real-world phenomena.


An orrery, a runnable model of the solar system that allows us to make predictions. Photo: Wikimedia Commons.

Editor’s note: this post is part of our Intelligence Matters investigation.

In my last post, we saw that AI means a lot of things to a lot of people. These dueling definitions each have a deep history — ok fine, baggage — that has massed and layered over time. While they’re all legitimate, they share a common weakness: each one can apply perfectly well to a system that is not particularly intelligent. As just one example, the chatbot that was recently touted as having passed the Turing test is certainly an interlocutor (of sorts), but it was widely criticized as not containing any significant intelligence.

Let’s ask a different question instead: What criteria must any system meet in order to achieve intelligence — whether an animal, a smart robot, a big-data cruncher, or something else entirely?

To answer this question, I want to explore a hypothesis that I’ve heard attributed to the cognitive scientist Josh Tenenbaum (who was a member of my thesis committee). He has not, to my knowledge, unpacked this deceptively simple idea in detail (though see his excellent and accessible paper How to Grow a Mind: Statistics, Structure, and Abstraction), and he would doubtless describe it quite differently from my attempt here. Any foolishness which follows is therefore most certainly my own, and I beg forgiveness in advance.

I’ll phrase it this way:

Intelligence, whether natural or synthetic, derives from a model of the world in which the system operates. Greater intelligence arises from richer, more powerful, “runnable” models that are capable of more accurate and contingent predictions about the environment.

What do I mean by a model? After all, people who work with data are always talking about the “predictive models” that are generated by today’s machine learning and data science techniques. While these models do technically meet my definition, it turns out that the methods in wide use capture very little of what is knowable and important about the world. We can do much better, though, and the key prediction of this hypothesis is that systems will gain intelligence proportionate to how well the models on which they rely incorporate additional aspects of the environment: physics, the behaviors of other intelligent agents, the rewards that are likely to follow from various actions, and so on. And the most successful systems will be those whose models are “runnable,” able to reason about and simulate the consequences of actions without actually taking them.
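To make "runnable" concrete, here is a minimal sketch in Python. It is my own toy illustration, not anything from Tenenbaum's work: the function names, the flat-ground projectile formula used as the "world model," and the candidate angles are all assumptions chosen for brevity. The point is only that the system picks an action by simulating outcomes, without ever throwing anything.

```python
import math

def simulate_throw(speed, angle_deg, g=9.81):
    """Toy world model (assumed): predicted landing distance on flat ground, no air resistance."""
    angle = math.radians(angle_deg)
    return speed ** 2 * math.sin(2 * angle) / g

def choose_angle(target_distance, speed, candidate_angles):
    """Run the model on each candidate action and keep the one whose
    simulated outcome lands closest to the target."""
    return min(
        candidate_angles,
        key=lambda a: abs(simulate_throw(speed, a) - target_distance),
    )

# Hypothetical scenario: hit a target 10 m away with a 10 m/s throw.
best = choose_angle(target_distance=10.0, speed=10.0, candidate_angles=range(5, 50, 5))
print(best)
```

A richer model (wind, drag, a moving target) would slot into simulate_throw without changing how the choice is made, which is roughly what the hypothesis means by richer models supporting more accurate and contingent predictions.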

Let’s look at a few examples.

  • Single-celled organisms leverage a simple behavior called chemotaxis to swim toward food and away from toxins; they do this by detecting the relevant chemical concentration gradients in their liquid environment. The organism is thus acting on a simple model of the world – one that, while devastatingly simple, usually serves it well.
  • Mammalian brains have a region known as the hippocampus that contains cells that fire when the animal is in a particular place, as well as cells that fire at regular intervals on a hexagonal grid. While we don’t yet understand all of the details, these cells form part of a system that models the physical world, doubtless to aid in important tasks like finding food and avoiding danger — not so different from the bacteria.
  • While humans also have a hippocampus, which probably performs some of these same functions, we also have overgrown neocortexes that model many other aspects of our world, including, crucially, our social environment: we need to be able to predict how others will act in response to various situations.
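The first example, chemotaxis, is simple enough to caricature in a few lines of Python. This is a deliberately crude sketch under my own assumptions: a one-dimensional world, a made-up concentration function, and an organism that compares the concentration on either side of itself. Real bacteria do something subtler (they compare concentrations over time while alternating runs and tumbles), so treat this as an illustration of "acting on a simple model," not as biology.

```python
def concentration(x, food_at=10.0):
    """Hypothetical food concentration that peaks at position food_at."""
    return 1.0 / (1.0 + (x - food_at) ** 2)

def chemotaxis_step(x, step=0.5):
    """Move one step toward whichever neighboring position has more food."""
    left = concentration(x - step)
    right = concentration(x + step)
    return x + step if right > left else x - step

def swim(x, n_steps):
    """Repeat the local rule; the organism never sees the whole gradient."""
    for _ in range(n_steps):
        x = chemotaxis_step(x)
    return x

final_position = swim(0.0, 40)
print(final_position)
```

The organism's entire "model" is the assumption that more of the chemical nearby means food in that direction. That is enough to reach the peak here, and it would fail in exactly the cases where the assumption is wrong.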

The scientists who study these and many other examples have solidly established that naturally occurring intelligences rely on internal models. The question, then, is whether artificial intelligences must rely on the same principles. In other words, what exactly did we mean when we said that intelligence “derives from” internal models? Just how strong is the causal link between a system having a rich world model and its ability to possess and display intelligence? Is it an absolute dependency, meaning that a sophisticated model is a necessary condition for intelligence? Are good models merely very helpful in achieving intelligence, and therefore likely to be present in the intelligences that we build or grow? Or is a model-based approach but one path among many in achieving intelligence? I have my hunches — I lean toward the stronger formulations — but I think these need to be considered open questions at this point.

The next thing to note about this conception of intelligence is that, bucking a long-running trend in AI and related fields, it is not a behaviorist measure. Rather than evaluating a system based on its actions alone, we are deliberately piercing the veil in order to make claims about what is happening on the inside. This is at odds with the most famous machine intelligence assessment, the Turing test; it also contrasts with another commonly referenced measure of general intelligence, “an agent’s ability to achieve goals in a wide range of environments”.

Of course, the reason for a naturally evolving organism to spend significant resources on a nervous system that can build and maintain a sophisticated world model is to generate actions that promote reproductive success: big brains are energy hogs, and they need to pay rent. So, it’s not that behavior doesn’t matter, but rather that the strictly behavioral lens might be counterproductive if we want to learn how to build generally intelligent systems. A focus on the input-output characteristics of a system might suffice when its goals are relatively narrow, such as medical diagnoses, question answering, and image classification (though each of these domains could benefit from more sophisticated models). But this black-box approach is necessarily descriptive, rather than normative: it describes a desired endpoint, without suggesting how this result should be achieved. This devotion to surface traits leads us to adopt methods that do not scale to harder problems.

Finally, what does this notion of intelligence say about the current state of the art in machine intelligence as well as likely avenues for further progress? I’m planning to explore this more in future posts, but note for now that today’s most popular and successful machine learning and predictive analytics methods — deep neural networks, random forests, logistic regression, Bayesian classifiers — all produce models that are remarkably impoverished in their ability to represent real-world phenomena.

In response to these shortcomings, there are several active research programs attempting to bring richer models to bear, including but not limited to probabilistic programming and representation learning. By now, you won’t be surprised that I think such approaches represent our best hope at building intelligent systems that can truly be said to understand the world they live in.


How to build and run your first deep learning network

Step-by-step instruction on training your own neural network.


When I first became interested in using deep learning for computer vision, I found it hard to get started. Only a couple of open source projects were available; they had little documentation, were very experimental, and relied on a lot of tricky-to-install dependencies. A lot of new projects have appeared since, but they’re still aimed at vision researchers, so you’ll still hit a lot of the same obstacles if you’re approaching them from outside the field.

In this article — and the accompanying webcast — I’m going to show you how to run a pre-built network, and then take you through the steps of training your own. I’ve listed the steps I followed to set up everything toward the end of the article, but because the process is so involved, I recommend you download a Vagrant virtual machine that I’ve pre-loaded with everything you need. This VM lets us skip over all the installation headaches and focus on building and running the neural networks. Read more…


What is deep learning, and why should you care?

Announcing a new series delving into deep learning and the inner workings of neural networks.


Editor’s note: this post is part of our Intelligence Matters investigation.

When I first ran across the results in the Kaggle image-recognition competitions, I didn’t believe them. I’ve spent years working with machine vision, and the reported accuracy on tricky tasks like distinguishing dogs from cats was beyond anything I’d seen, or imagined I’d see anytime soon. To understand more, I reached out to one of the competitors, Daniel Nouri, and he demonstrated how he used the Decaf open-source project to do so well. Even better, he showed me how he was quickly able to apply it to a whole bunch of other image-recognition problems we had at Jetpac, and produce much better results than my conventional methods.

I’ve never encountered such a big improvement from a technique that was largely unheard of just a couple of years before, so I became obsessed with understanding more. To be able to use it commercially across hundreds of millions of photos, I built my own specialized library to efficiently run prediction on clusters of low-end machines and embedded devices, and I also spent months learning the dark arts of training neural networks. Now I’m keen to share some of what I’ve found, so if you’re curious about what on earth deep learning is, and how it might help you, I’ll be covering the basics in a series of blog posts here on Radar, and in a short upcoming ebook. Read more…


AI’s dueling definitions

Why my understanding of AI is different from yours.


SoftBank’s Pepper, a humanoid robot that takes its surroundings into consideration.

Editor’s note: this post is part of our Intelligence Matters investigation.

Let me start with a secret: I feel self-conscious when I use the terms “AI” and “artificial intelligence.” Sometimes, I’m downright embarrassed by them.

Before I get into why, though, answer this question: what pops into your head when you hear the phrase artificial intelligence?

For the layperson, AI might still conjure HAL’s unblinking red eye, and all the misfortune that ensued when he became so tragically confused. Others jump to the replicants of Blade Runner or more recent movie robots. Those who have been around the field for some time, though, might instead remember the “old days” of AI — whether with nostalgia or a shudder — when intelligence was thought to primarily involve logical reasoning, and truly intelligent machines seemed just a summer’s work away. And for those steeped in today’s big-data-obsessed tech industry, “AI” can seem like nothing more than a high-falutin’ synonym for the machine-learning and predictive-analytics algorithms that are already hard at work optimizing and personalizing the ads we see and the offers we get — it’s the term that gets trotted out when we want to put a high sheen on things. Read more…


Streamlining feature engineering

Researchers and startups are building tools that enable feature discovery.

Why do data scientists spend so much time on data wrangling and data preparation? In many cases it’s because they want access to the best variables with which to build their models. These variables are known as features in machine-learning parlance. For many data applications, feature engineering and feature selection are just as important as (if not more important than) the choice of algorithm:

Good features allow a simple model to beat a complex model.
(to paraphrase Alon Halevy, Peter Norvig, and Fernando Pereira)

The terminology can be a bit confusing, but to put things in context one can simplify the data science pipeline to highlight the importance of features:

Feature engineering and discovery pipeline

Feature Engineering or the Creation of New Features
A simple example to keep in mind is text mining. One starts with raw text (documents) and extracted features could be individual words or phrases. In this setting, a feature could indicate the frequency of a specific word or phrase. Features are then used to classify and cluster documents, or extract topics associated with the raw text. The process usually involves the creation of new features (feature engineering) and identifying the most essential ones (feature selection).
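As a rough illustration of the text-mining example, the sketch below turns raw documents into word-frequency features using only the Python standard library. The two sample sentences, the tokenizing regular expression, and the helper names are assumptions made for the example; a real pipeline would likely add steps such as stop-word removal or phrase extraction.

```python
import re
from collections import Counter

def word_counts(document):
    """Extract features: how often each lowercase word appears in the document."""
    return Counter(re.findall(r"[a-z']+", document.lower()))

def to_feature_vector(document, vocabulary):
    """Represent a document as word frequencies in a fixed vocabulary order."""
    counts = word_counts(document)
    return [counts[word] for word in vocabulary]

# Hypothetical raw text.
docs = ["The cat sat on the mat.", "The dog chased the cat."]

# The vocabulary is every word seen in any document, sorted for a stable order.
vocabulary = sorted(set().union(*(word_counts(d) for d in docs)))
vectors = [to_feature_vector(d, vocabulary) for d in docs]

print(vocabulary)
print(vectors)
```

Each position in a vector is one feature. Feature selection would then amount to keeping only the positions that help with the task at hand, for instance dropping "the," which appears twice in both documents and so tells them apart not at all.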

Read more…


Untapped opportunities in AI

Some of AI's viable approaches lie outside the organizational boundaries of Google and other large Internet companies.

Editor’s note: this post is part of an ongoing series exploring developments in artificial intelligence.

Here’s a simple recipe for solving crazy-hard problems with machine intelligence. First, collect huge amounts of training data — probably more than anyone thought sensible or even possible a decade ago. Second, massage and preprocess that data so the key relationships it contains are easily accessible (the jargon here is “feature engineering”). Finally, feed the result into ludicrously high-performance, parallelized implementations of pretty standard machine-learning methods like logistic regression, deep neural networks, and k-means clustering (don’t worry if those names don’t mean anything to you — the point is that they’re widely available in high-quality open source packages).

Google pioneered this formula, applying it to ad placement, machine translation, spam filtering, YouTube recommendations, and even the self-driving car, creating billions of dollars of value in the process. The surprising thing is that Google isn’t made of magic. Instead, mirroring Bruce Schneier’s surprised conclusion about the NSA in the wake of the Snowden revelations, “its tools are no different from what we have in our world; it’s just better funded.” Read more…


“It works like the brain.” So?

There are many ways a system can be like the brain, but only a fraction of these will prove important.

Editor’s note: this post is part of an ongoing series exploring developments in artificial intelligence.

Here’s a fun drinking game: take a shot every time you find a news article or blog post that describes a new AI system as working or thinking “like the brain.” Here are a few to start you off with a nice buzz; if your reading habits are anything like mine, you’ll never be sober again. Once you start looking for this phrase, you’ll see it everywhere — I think it’s the defining laziness of AI journalism and marketing.

Surely these claims can’t all be true? After all, the brain is an incredibly complex and specific structure, forged in the relentless pressure of millions of years of evolution to be organized just so. We may have a lot of outstanding questions about how it works, but work a certain way it must. Read more…


Welcome to Intelligence Matters

Casting a critical eye on the exciting developments in the world of AI.

Editor’s note: this post was co-authored by Ben Lorica and Roger Magoulas


Siri screenshot.

Today we’re kicking off Intelligence Matters (IM), a new series exploring current issues in artificial intelligence, including the connection between artificial intelligence, human intelligence and the brain. IM offers a thoughtful take on recent developments, including a critical, and sometimes skeptical, view when necessary.

True AI has been “just around the corner” for 60 years, so why should O’Reilly start covering AI in a big way now? As computing power catches up to scientific and engineering ambitions, and as our ability to learn directly from sensory signals (i.e., big data) increases, intelligent systems are having a real and widespread impact. Every Internet user benefits from these systems today: they sort our email, plan our journeys, answer our questions, and protect us from fraudsters. And, with the Internet of Things, these systems have already started to keep our houses and offices comfortable and well-lit, our data centers running more efficiently, our industrial processes humming, and are even driving our cars. Read more…
