- Flux: New Approach to System Intuition (LinkedIn) — In general, we assume that if anything is best represented numerically, then we don’t need to visualize it. If the best representation is a numerical one, then a visualization could only obscure a quantifiable piece of information that can be measured, compared, and acted upon. Anything that we can wrap in alerts or some threshold boundary should kick off an automated process; no point in ruining a perfectly good system by introducing a human into the mix. Instead of numerical information, we want a tool that surfaces relevant information to a human in situations where creating a heuristic would be too onerous. These situations require an intuition that we can’t codify.
- Jumping to the End: Practical Design Fiction (Vimeo) — “Magic is a power relationship” — Matt Jones on the flipside of hiding complex behaviours from users and making stuff “work like magic.” (via Richard Pope)
- Predicting Daily Activities from Egocentric Images Using Deep Learning — aka “people wear cameras and we can figure out what they’re going to do next.”
- 402: Payment Required (David Humphrey) — The ad blocking discussion highlights our total lack of imagination, where a browser’s role is reduced to “render” or “don’t render.” There is a whole world of options in between that we should be exploring.
"machine learning" entries
The O'Reilly Radar Podcast: Rajiv Maheswaran on the science of moving dots, and Claudia Perlich on big data in advertising.
Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.
In this week’s Radar Podcast episode, O’Reilly’s Mac Slocum chats with Rajiv Maheswaran, CEO of Second Spectrum. Maheswaran talks about machine learning applications in sports, the importance of context in measuring stats, and the future of real-time, in-game analytics.
Here are some highlights from their chat:
There’s a lot of parts of the game of basketball — pick and rolls, dribble hand-offs — that coaches really care about, about analyzing how it works on offense, how to guard them. Before big data and machine learning, people basically watched the games and marked them. It turns out that people are pretty bad at marking them accurately, and they also miss a ton of stuff. Right now, machine learning tells coaches, ‘This is how many pick and rolls these two players have had over the course of the season, how often they do all the different variations, what they’re good at, what they’re bad at.’ Coaches can really find tendencies that can help them play offense, play defense, far more efficiently, based off of machine learning.
What we’re doing is having the machine match human intuition. If I’m watching a game, I know that the shot is harder if I’m farther away, if I have multiple defenders, if they’re close, if they’re closing in on me, if I’m dribbling, the type of shot I’m taking. As a human, I watch this and I have an intuition about it. Now, by giving all that data to the machine, it can make a predictor that actually matches our intuition, and goes beyond it because it can put a number onto what our intuition tells us.
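Maheswaran’s description of a predictor that numerically matches shot intuition can be sketched as a logistic regression over shot features. The features (distance to the basket, distance to the nearest defender, dribbles before the shot), their coefficients, and the data below are all hypothetical stand-ins for illustration, not Second Spectrum’s actual model or tracking data.

```python
import math
import random

random.seed(0)

def synth_shot():
    """Generate one synthetic shot: features plus a made/missed label.

    Assumed pattern: longer, more contested shots go in less often.
    """
    dist = random.uniform(1, 30)       # feet to the basket
    defender = random.uniform(0, 10)   # feet to nearest defender
    dribbles = random.randint(0, 10)   # dribbles before the shot
    logit = 2.0 - 0.12 * dist + 0.25 * defender - 0.05 * dribbles
    made = 1 if random.random() < 1 / (1 + math.exp(-logit)) else 0
    return [dist, defender, dribbles], made

data = [synth_shot() for _ in range(2000)]

# Plain logistic regression fit with stochastic gradient descent.
w = [0.0, 0.0, 0.0]
b = 0.0
lr = 0.01
for _ in range(200):
    for x, y in data:
        p = 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
        err = p - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err

def shot_probability(dist, defender, dribbles):
    """Put a number on the intuition: estimated chance the shot goes in."""
    z = w[0] * dist + w[1] * defender + w[2] * dribbles + b
    return 1 / (1 + math.exp(-z))

# An open layup should score as easier than a contested deep three.
print(shot_probability(3, 8, 0), shot_probability(28, 1, 5))
```

The payoff is the last two lines: instead of a coach’s gut feeling that one shot is harder than another, the fitted model assigns each a probability that can be compared and tracked over a season.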
A beginner's guide to evaluating your machine learning models.
Everything today is being quantified, measured, and tracked — everything is generating data, and data is powerful. Businesses are using data in a variety of ways to improve customer satisfaction. For instance, data scientists are building machine learning models to generate intelligent recommendations to users so that they spend more time on a site. Analysts can use churn analysis to predict which customers are the best targets for the next promotional campaign. The possibilities are endless.
However, there are challenges in the machine learning pipeline. Typically, you build a machine learning model on top of your data. You collect more data. You build another model. But how do you know when to stop?
When is your smart model smart enough?
Evaluation is a key step when building intelligent business applications with machine learning. It is not a one-time task, but must be integrated with the whole pipeline of developing and productionizing machine learning-enabled applications.
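The core of that evaluation step is measuring a model on data it never trained on. The sketch below uses a toy one-dimensional task and a deliberately simple nearest-centroid “model” (both invented here for illustration) to show the holdout pattern: split, train on one part, report the score on the other.

```python
import random

random.seed(1)

# Synthetic 1-D task: positives cluster near 1.0, negatives near 0.0.
points = [(random.gauss(1.0, 0.4), 1) for _ in range(500)] + \
         [(random.gauss(0.0, 0.4), 0) for _ in range(500)]
random.shuffle(points)

# Hold out 20% of the data; the model never sees it during training.
split = int(0.8 * len(points))
train, test = points[:split], points[split:]

# "Model": classify by the nearer of the two class means from training data.
pos_mean = sum(x for x, y in train if y == 1) / sum(y for _, y in train)
neg_mean = sum(x for x, y in train if y == 0) / sum(1 - y for _, y in train)

def predict(x):
    return 1 if abs(x - pos_mean) < abs(x - neg_mean) else 0

def accuracy(dataset):
    return sum(predict(x) == y for x, y in dataset) / len(dataset)

# The test-set number, not the training-set number, estimates how the
# model will behave on data it has never seen in production.
print(f"train accuracy: {accuracy(train):.2f}")
print(f"test accuracy:  {accuracy(test):.2f}")
```

The same pattern answers the “when do you stop?” question above: as long as the held-out score keeps improving with more data or a new model, keep going; when it plateaus, more training data is no longer the bottleneck.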
In a new free O’Reilly report Evaluating Machine Learning Models: A Beginner’s Guide to Key Concepts and Pitfalls, we cut through the technical jargon of machine learning, and elucidate, in simple language, the processes of evaluating machine learning models. Read more…
The O'Reilly Data Show Podcast: Alice Zheng on feature representations, model evaluation, and machine learning models.
Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science.
As tools for advanced analytics become more accessible, data scientists’ roles will evolve. Most media stories emphasize a need for expertise in algorithms and quantitative techniques (machine learning, statistics, probability), and yet the reality is that expertise in advanced algorithms is just one aspect of industrial data science.
During the latest episode of the O’Reilly Data Show podcast, I sat down with Alice Zheng, one of Strata + Hadoop World’s most popular speakers. She has a gift for explaining complex topics to a broad audience, through presentations and in writing. We talked about her background, techniques for evaluating machine learning models, how much math data scientists need to know, and the art of interacting with business users.
Making machine learning accessible
People who work at getting analytics adopted and deployed learn early on the importance of working with domain/business experts. As excited as I am about the growing number of tools that open up analytics to business users, the interplay between data experts (data scientists, data engineers) and domain experts remains important. In fact, human-in-the-loop systems are being used in many critical data pipelines. Zheng recounts her experience working with business analysts:
It’s not enough to tell someone, “This is done by boosted decision trees, and that’s the best classification algorithm, so just trust me, it works.” As a builder of these applications, you need to understand what the algorithm is doing in order to make it better. As a user who ultimately consumes the results, it can be really frustrating to not understand how they were produced. When we worked with analysts in Windows or in Bing, we were analyzing computer system logs. That’s very difficult for a human being to understand. We definitely had to work with the experts who understood the semantics of the logs in order to make progress. They had to understand what the machine learning algorithms were doing in order to provide useful feedback. Read more…
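One concrete way to move past “just trust me, it works” is permutation importance: shuffle one input column at a time and watch how much the model’s accuracy drops. The toy model and two-feature data set below are invented for illustration (they are not the log-analysis system Zheng describes), but the technique itself is model-agnostic and works for boosted decision trees too.

```python
import random

random.seed(2)

# Two features; only the first ("signal") actually determines the label.
rows = []
for _ in range(1000):
    signal = random.random()
    noise = random.random()
    rows.append(([signal, noise], 1 if signal > 0.5 else 0))

def model(x):
    # Stand-in for a trained classifier; here it thresholds feature 0.
    return 1 if x[0] > 0.5 else 0

def accuracy(data):
    return sum(model(x) == y for x, y in data) / len(data)

base = accuracy(rows)

def permutation_importance(data, feature):
    """Shuffle one feature across rows, breaking its tie to the label;
    the resulting accuracy drop shows how much the model relies on it."""
    shuffled = [x[feature] for x, _ in data]
    random.shuffle(shuffled)
    permuted = [(x[:feature] + [v] + x[feature + 1:], y)
                for (x, y), v in zip(data, shuffled)]
    return base - accuracy(permuted)

print("importance of signal feature:", permutation_importance(rows, 0))
print("importance of noise feature: ", permutation_importance(rows, 1))
```

An analyst does not need to know what a boosted tree is to read this output: a large drop means the model leans heavily on that field of the logs, and a near-zero drop means it ignores it, which is exactly the kind of feedback loop with domain experts the passage describes.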