Solutions to a number of problems must be found to unlock PAPI value.
In November, the first International Conference on Predictive APIs and Apps will take place in Barcelona, just ahead of Strata Barcelona. This event will bring together those who are building intelligent web services (sometimes called Machine Learning as a Service) with those who would like to use these services to build predictive apps, which, as defined by Forrester, deliver “the right functionality and content at the right time, for the right person, by continuously learning about them and predicting what they’ll need.”
This is a very exciting area. Machine learning of various sorts is revolutionizing many areas of business, and predictive services like the ones at the center of predictive APIs (PAPIs) have the potential to bring these capabilities to an even wider range of applications. I co-founded one of the first companies in this space (acquired by Salesforce in 2012), and I remain optimistic about the future of these efforts. But the field as a whole faces a number of challenges, for which the answers are neither easy nor obvious, that must be addressed before this value can be unlocked.
In the remainder of this post, I’ll enumerate what I see as the most pressing issues. I hope that the speakers and attendees at PAPIs will keep these in mind as they map out the road ahead. Read more…
True artificial intelligence will require rich models that incorporate real-world phenomena.
In my last post, we saw that AI means a lot of things to a lot of people. These dueling definitions each have a deep history — ok fine, baggage — that has massed and layered over time. While they’re all legitimate, they share a common weakness: each one can apply perfectly well to a system that is not particularly intelligent. As just one example, the chatbot that was recently touted as having passed the Turing test is certainly an interlocutor (of sorts), but it was widely criticized as not containing any significant intelligence.
Let’s ask a different question instead: What criteria must any system meet in order to achieve intelligence — whether an animal, a smart robot, a big-data cruncher, or something else entirely? Read more…
Why my understanding of AI is different from yours.
Editor’s note: this post is part of our Intelligence Matters investigation.
Let me start with a secret: I feel self-conscious when I use the terms “AI” and “artificial intelligence.” Sometimes, I’m downright embarrassed by them.
Before I get into why, though, answer this question: what pops into your head when you hear the phrase artificial intelligence?
For the layperson, AI might still conjure HAL’s unblinking red eye, and all the misfortune that ensued when he became so tragically confused. Others jump to the replicants of Blade Runner or more recent movie robots. Those who have been around the field for some time, though, might instead remember the “old days” of AI — whether with nostalgia or a shudder — when intelligence was thought to primarily involve logical reasoning, and truly intelligent machines seemed just a summer’s work away. And for those steeped in today’s big-data-obsessed tech industry, “AI” can seem like nothing more than a high-falutin’ synonym for the machine-learning and predictive-analytics algorithms that are already hard at work optimizing and personalizing the ads we see and the offers we get — it’s the term that gets trotted out when we want to put a high sheen on things. Read more…
Some of AI's viable approaches lie outside the organizational boundaries of Google and other large Internet companies.
Editor’s note: this post is part of an ongoing series exploring developments in artificial intelligence.
Here’s a simple recipe for solving crazy-hard problems with machine intelligence. First, collect huge amounts of training data — probably more than anyone thought sensible or even possible a decade ago. Second, massage and preprocess that data so the key relationships it contains are easily accessible (the jargon here is “feature engineering”). Finally, feed the result into ludicrously high-performance, parallelized implementations of pretty standard machine-learning methods like logistic regression, deep neural networks, and k-means clustering (don’t worry if those names don’t mean anything to you — the point is that they’re widely available in high-quality open source packages).
Google pioneered this formula, applying it to ad placement, machine translation, spam filtering, YouTube recommendations, and even the self-driving car — creating billions of dollars of value in the process. The surprising thing is that Google isn’t made of magic. Instead, mirroring Bruce Scheneier’s surprised conclusion about the NSA in the wake of the Snowden revelations, “its tools are no different from what we have in our world; it’s just better funded.” Read more…
There are many ways a system can be like the brain, but only a fraction of these will prove important.
Editor’s note: this post is part of an ongoing series exploring developments in artificial intelligence.
Here’s a fun drinking game: take a shot every time you find a news article or blog post that describes a new AI system as working or thinking “like the brain.” Here are a few to start you off with a nice buzz; if your reading habits are anything like mine, you’ll never be sober again. Once you start looking for this phrase, you’ll see it everywhere — I think it’s the defining laziness of AI journalism and marketing.
Surely these claims can’t all be true? After all, the brain is an incredibly complex and specific structure, forged in the relentless pressure of millions of years of evolution to be organized just so. We may have a lot of outstanding questions about how it works, but work a certain way it must. Read more…
We should raise our collective expectations of what they should provide
There is a tremendous amount of commercial attention on machine learning (ML) methods and applications. This includes product and content recommender systems, predictive models for churn and lead scoring, systems to assist in medical diagnosis, social network sentiment analysis, and on and on. ML often carries the burden of extracting value from big data.
But getting good results from machine learning still requires much art, persistence, and even luck. An engineer can’t yet treat ML as just another well-bahaved part of the technology stack. There are many underlying reasons for this, but for the moment I want to focus on how we measure or evaluate ML systems.
Reflecting their academic roots, machine learning methods have traditionally been evaluated in terms of narrow quantitative metrics: precision, recall, RMS error, and so on. The data-science-as-competitive-sport site Kaggle has adopted these metrics for many of its competitions. They are objective and reassuringly concrete.
Our tools should make common cases easy and safe, but that's not the reality today.
Recently, the Mathbabe (aka Cathy O’Neil) vented some frustration about the pitfalls in applying even simple machine learning (ML) methods like k-nearest neighbors. As data science is democratized, she worries that naive practitioners will shoot themselves in the foot because these tools can offer very misleading results. Maybe data science is best left to the pros? Mike Loukides picked up this thread, calling for healthy skepticism in our approach to data and implicitly cautioning against a “cargo cult” approach in which data collection and analysis methods are blindly copied from previous efforts without sufficient attempts to understand their potential biases and shortcomings.
Well, arguing against greater understanding of the methods we apply is like arguing against motherhood and apple pie, and Cathy and Mike are spot on in their diagnoses of the current situation. And yet …
There is so much value to be gained if we can put the power of learning, inference, and prediction methods into the hands of more developers and domain experts. But how can we avoid the pitfalls that Cathy and Mike are rightly concerned about? If a seemingly simple method like k-nearest neighbors classification is dangerous in unskilled hands (and it certainly is), then what hope is there? Well, I would argue that all ML methods are not created equal with regard to their safety. In fact, it is exactly some of the simplest (and most widely used) methods that are the most dangerous.
Why? Because these methods have lots of hidden assumptions. Well, maybe the assumptions aren’t so much hidden as nodded-at-but-rarely-questioned. A good analogy might be jumping to the sentencing phase of a criminal trial without first assessing guilt: asking “What is the punishment that best fits this crime?” before asking “Did the defendant actually commit a crime? And if so, which one?” As another example of a simple-yet-dangerous method, k-means clustering assumes a value for k, the number of clusters, even though there may not be a “good” way to divide the data into this many buckets. Maybe seven buckets provides a much more natural explanation than four. Or maybe the data, as observed, is truly undifferentiated and any effort to split it up will result in arbitrary and misleading distinctions. Shouldn’t our methods ask these more fundamental questions as well? Read more…
Probabilistic languages can free developers from the complexities of high-performance probabilistic inference.
Probabilistic programming languages are in the spotlight. This is due to the announcement of a new DARPA program to support their fundamental research. But what is probabilistic programming? What can we expect from this research? Will this effort pay off? How long will it take?
A probabilistic programming language is a high-level language that makes it easy for a developer to define probability models and then “solve” these models automatically. These languages incorporate random events as primitives and their runtime environment handles inference. Now, it is a matter of programming that enables a clean separation between modeling and inference. This can vastly reduce the time and effort associated with implementing new models and understanding data. Just as high-level programming languages transformed developer productivity by abstracting away the details of the processor and memory architecture, probabilistic languages promise to free the developer from the complexities of high-performance probabilistic inference. Read more…