"Big Data and Artificial Intelligence: Intelligence Matters" entries
Tips on how to build effective human-machine hybrids, from crowdsourcing expert Adam Marcus.
In a recent O’Reilly webcast, “Crowdsourcing at GoDaddy: How I Learned to Stop Worrying and Love the Crowd,” Adam Marcus explains how to mitigate common challenges of managing crowd workers, how to make the most of human-in-the-loop machine learning, and how to establish effective and mutually rewarding relationships with workers. Marcus is the director of data on the Locu team at GoDaddy, where the “Get Found” service provides businesses with a central platform for managing their online presence and content.
In the webcast, Marcus uses practical examples from his experience at GoDaddy to reveal helpful methods for how to:
- Offset the inevitability of wrong answers from the crowd
- Develop and train workers through a peer-review system
- Build a hierarchy of trusted workers
- Make crowd work inspiring and enable upward mobility
What to do when humans get it wrong
It turns out there is a simple way to offset human error: redundantly ask people the same questions. Marcus explains that when you ask five different people the same question, there are some creative ways to combine their responses, and use a majority vote. Read more…
Practical machine-learning applications and strategies from experts in active learning.
What do you call a practice that most data scientists have heard of, few have tried, and even fewer know how to do well? It turns out, no one is quite certain what to call it. In our latest free report Real-World Active Learning: Applications and Strategies for Human-in-the-Loop Machine Learning, we examine the relatively new field of “active learning” — also referred to as “human computation,” “human-machine hybrid systems,” and “human-in-the-loop machine learning.” Whatever you call it, the field is exploding with practical applications that are proving the efficiency of combining human and machine intelligence.
Learn from the expertsThrough in-depth interviews with experts in the field of active learning and crowdsource management, industry analyst Ted Cuzzillo reveals top tips and strategies for using short-term human intervention to actively improve machine models. As you’ll discover, the point at which a machine model fails is precisely where there’s an opportunity to insert — and benefit from — human judgment.
- When active learning works best
- How to manage crowdsource contributors (including expert-level contributors)
- Basic principles of labeling data
- Best practice methods for assessing labels
- When to skip the crowd and mine your own data
Explore real-world examples
This report gives you a behind-the-scenes look at how human-in-the-loop machine learning has helped improve the accuracy of Google Maps, match business listings at GoDaddy, rank top search results at Yahoo!, refer relevant job postings to people on LinkedIn, identify expert-level contributors using the Quizz recruitment method, and recommend women’s clothing based on customer and product data at Stitch Fix. Read more…
If what we are trying to build is artificial minds, intelligence might be the smaller, easier part.
When we talk about artificial intelligence, we often make an unexamined assumption: that intelligence, understood as rational thought, is the same thing as mind. We use metaphors like “the brain’s operating system” or “thinking machines,” without always noticing their implicit bias.
But if what we are trying to build is artificial minds, we need only look at a map of the brain to see that in the domain we’re tackling, intelligence might be the smaller, easier part.
Maybe that’s why we started with it.
After all, the rational part of our brain is a relatively recent add-on. Setting aside unconscious processes, most of our gray matter is devoted not to thinking, but to feeling.
There was a time when we deprecated this larger part of the mind, as something we should either ignore or, if it got unruly, control.
But now we understand that, as troublesome as they may sometimes be, emotions are essential to being fully conscious. For one thing, as neurologist Antonio Damasio has demonstrated, we need them in order to make decisions. A certain kind of brain damage leaves the intellect unharmed, but removes the emotions. People with this affliction tend to analyze options endlessly, never settling on a final choice. Read more…
A look at a few ways humans mesh with the rest of our data systems.
Here’s a look at a few of the ways that humans — still the ultimate data processors — mesh with the rest of our data systems: how computational power can best produce true cognitive augmentation.
Deciding betterOver the past decade, we fitted roughly a quarter of our species with sensors. We instrumented our businesses, from the smallest market to the biggest factory. We began to consume that data, slowly at first. Then, as we were able to connect data sets to one another, the applications snowballed. Now that both the front office and the back office are plugged into everything, business cares. A lot.
While early adopters focused on sales, marketing, and online activity, today, data gathering and analysis is ubiquitous. Governments, activists, mining giants, local businesses, transportation, and virtually every other industry lives by data. If an organization isn’t harnessing the data exhaust it produces, it’ll soon be eclipsed by more analytical, introspective competitors that learn and adapt faster.
Whether we’re talking about a single human made more productive by a smartphone-turned-prosthetic-brain, or a global organization gaining the ability to make more informed decisions more quickly, ultimately, Strata + Hadoop World has become about deciding better.
What does it take to make better decisions? How will we balance machine optimization with human inspiration, sometimes making the best of the current game and other times changing the rules? Will machines that make recommendations about the future based on the past reduce risk, raise barriers to innovation, or make us vulnerable to improbable Black Swans because they mistakenly conclude that tomorrow is like yesterday, only more so? Read more…
The history of computing has been a constant pendulum — that pendulum is now swinging back toward distribution.
The trifecta of cheap sensors, fast networks, and distributing computing are changing how we work with data. But making sense of all that data takes help, which is arriving in the form of machine learning. Here’s one view of how that might play out.
Clouds, edges, fog, and the pendulum of distributed computingThe history of computing has been a constant pendulum, swinging between centralization and distribution.
The first computers filled rooms, and operators were physically within them, switching toggles and turning wheels. Then came mainframes, which were centralized, with dumb terminals.
As the cost of computing dropped and the applications became more democratized, user interfaces mattered more. The smarter clients at the edge became the first personal computers; many broke free of the network entirely. The client got the glory; the server merely handled queries.
Once the web arrived, we centralized again. LAMP (Linux, Apache, MySQL, PHP) buried deep inside data centers, with the computer at the other end of the connection relegated to little more than a smart terminal rendering HTML. Load-balancers sprayed traffic across thousands of cheap machines. Eventually, the web turned from static sites to complex software as a service (SaaS) applications.
Then the pendulum swung back to the edge, and the clients got smart again. First with AJAX, Java, and Flash; then in the form of mobile apps, where the smartphone or tablet did most of the hard work and the back end was a communications channel for reporting the results of local action. Read more…
We need to understand our own intelligence is competition for our artificial, not-quite intelligences.
A few days ago, Elon Musk likened artificial intelligence (AI) to “summoning the demon.” As I’m sure you know, there are many stories in which someone summons a demon. As Musk said, they rarely turn out well.
There’s no question that Musk is an astute student of technology. But his reaction is misplaced. There are certainly reasons for concern, but they’re not Musk’s.
The problem with AI right now is that its achievements are greatly over-hyped. That’s not to say those achievements aren’t real, but they don’t mean what people think they mean. Researchers in deep learning are happy if they can recognize human faces with 80% accuracy. (I’m skeptical about claims that deep learning systems can reach 97.5% accuracy; I suspect that the problem has been constrained some way that makes it much easier. For example, asking “is there a face in this picture?” or “where is the face in this picture?” is much different from asking “what is in this picture?”) That’s a hard problem, a really hard problem. But humans recognize faces with nearly 100% accuracy. For a deep learning system, that’s an almost inconceivable goal. And 100% accuracy is orders of magnitude harder than 80% accuracy, or even 97.5%. Read more…
Solutions to a number of problems must be found to unlock PAPI value.
In November, the first International Conference on Predictive APIs and Apps will take place in Barcelona, just ahead of Strata Barcelona. This event will bring together those who are building intelligent web services (sometimes called Machine Learning as a Service) with those who would like to use these services to build predictive apps, which, as defined by Forrester, deliver “the right functionality and content at the right time, for the right person, by continuously learning about them and predicting what they’ll need.”
This is a very exciting area. Machine learning of various sorts is revolutionizing many areas of business, and predictive services like the ones at the center of predictive APIs (PAPIs) have the potential to bring these capabilities to an even wider range of applications. I co-founded one of the first companies in this space (acquired by Salesforce in 2012), and I remain optimistic about the future of these efforts. But the field as a whole faces a number of challenges, for which the answers are neither easy nor obvious, that must be addressed before this value can be unlocked.
In the remainder of this post, I’ll enumerate what I see as the most pressing issues. I hope that the speakers and attendees at PAPIs will keep these in mind as they map out the road ahead. Read more…
How neuroscience is benefiting from distributed computing — and how computing might learn from neuroscience.
When we think about big data, we usually think about the web: the billions of users of social media, the sensors on millions of mobile phones, the thousands of contributions to Wikipedia, and so forth. Due to recent innovations, web-scale data can now also come from a camera pointed at a small, but extremely complex object: the brain. New progress in distributed computing is changing how neuroscientists work with the resulting data — and may, in the process, change how we think about computation. Read more…