"With relative accuracy, we can predict 33 days out what song will go to No. 1 on the Billboard charts in the U.S.," says Cait O'Riordan, VP of product for music and platforms at Shazam. See more signals from Strata + Hadoop World 2015 in London ...
The O'Reilly Radar Podcast: Cait O'Riordan on Shazam's predictive analytics, and Francine Bennett on using data for evil.
Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.
In this week’s Radar Podcast, I chat with Cait O’Riordan, VP of product, music and platforms at Shazam. She talks about the current state of predictive analytics and how Shazam is able to predict the success of a song, often in the first few hours after its release. We also talk about the Internet of Things and how products like the Apple Watch affect Shazam’s product life cycles as well as the behaviors of their users.
Predicting the next pop hit
Shazam has more than 100 million monthly active users, and its users Shazam more than 20 million times per day. This, of course, generates a ton of data that Shazam uses in myriad ways, not the least of which is to predict the success of a song. O’Riordan explained how they approach their user data and how they’re able to accurately predict pop hits (and misses):
What’s interesting from a data perspective is when someone takes their phone out of their pocket, unlocks it, finds the Shazam app, and hits the big blue button, they’re not just saying, “I want to know the name of this song.” They’re saying, “I like this song sufficiently to do that.” There’s an amount of effort there that implies some level of liking. That’s really interesting, because you combine that really interesting intention on the part of the user plus the massive data set, you can cut that in lots and lots of different ways. We use it for lots of different things.
At the most basic level, we’re looking at what songs are going to be popular. We can predict, with a relative amount of accuracy, what will hit the Top 100 Billboard Chart 33 days out, roughly. We can look at that in lots of different territories as well. We can also look and see, in the first few hours of a track, whether a big track is going to go on to be successful. We can look at which particular part of the track is encouraging people to Shazam and what makes a popular hit. We know that, for example, for a big pop hit, you’ve got about 10 seconds to convince somebody to find the Shazam app and press that button. There are lots of different ways that we can look at that data, going right into the details of a particular song, zooming out worldwide, or looking in different territories just due to that big worldwide and very engaged audience.
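Shazam has not published how its prediction models work, but the idea described above — that early tag volume and its growth rate in the first few hours carry predictive signal — can be sketched as a simple classifier. The feature choices, weights, and function names below are hypothetical, purely for illustration:

```python
# Hypothetical sketch: scoring a track as a likely hit from its first
# few hours of Shazam counts. This is NOT Shazam's model; it is a
# minimal logistic-regression illustration of using early tag volume
# and growth as predictive features.
import math

def extract_features(hourly_counts):
    """Turn a track's first-hours tag counts into two simple features:
    total volume (log-scaled) and average hour-over-hour growth."""
    total = sum(hourly_counts)
    growths = [
        (later - earlier) / earlier
        for earlier, later in zip(hourly_counts, hourly_counts[1:])
        if earlier > 0
    ]
    avg_growth = sum(growths) / len(growths) if growths else 0.0
    return [math.log1p(total), avg_growth]

def predict_hit_probability(hourly_counts, weights, bias):
    """Logistic model: sigmoid(w . x + b). In practice the weights
    would be learned from historical tracks labeled by eventual
    chart success."""
    x = extract_features(hourly_counts)
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative (made-up) weights favoring high, fast-growing volume.
WEIGHTS, BIAS = [0.9, 2.0], -9.0

surging = [500, 1200, 2600, 5400]   # tag counts doubling hour over hour
flat    = [500, 520, 510, 530]      # similar start, little momentum

p_surging = predict_hit_probability(surging, WEIGHTS, BIAS)
p_flat = predict_hit_probability(flat, WEIGHTS, BIAS)
```

With these made-up weights, the surging track scores well above the flat one, reflecting the intuition in the interview that momentum in the first few hours separates eventual hits from misses.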
The O'Reilly Data Show Podcast: Gary Kazantsev on how big data and data science are making a difference in finance.
Having started my career in industry, working on problems in finance, I’ve always appreciated how challenging it is to build consistently profitable systems in this extremely competitive domain. When I was a quant at a hedge fund in the late 1990s and early 2000s, I worked primarily with price data (time series). I quickly learned how difficult it was to find and sustain profitable trading strategies that leveraged data sources everyone else in the industry examined exhaustively. In the early-to-mid 2000s, the hedge fund industry began incorporating many more data sources, and today you’re likely to find many finance industry professionals at big data and data science events like Strata + Hadoop World.
During the latest episode of the O’Reilly Data Show Podcast, I had a great conversation with one of the leading data scientists in finance: Gary Kazantsev runs the R&D Machine Learning group at Bloomberg LP. As a former quant, I wanted to know the types of problems Kazantsev and his group work on, and the tools and techniques they’ve found useful. We also talked about data science, data engineering, and recruiting data professionals for Wall Street. Read more…
Notification centers and Apple Watches beg the question: what’s the best way to interrupt us properly?
We’ve been complaining of information overload for decades, if not centuries. As a species, we’re pretty good at inventing new tools to deal with the problems of increasing information: language, libraries, broadcast, search, news feeds. A digital, always-on lifestyle certainly presents new challenges, but we’re quickly creating prosthetic filters to help us cope.
Now there’s a new generation of information management tools, in the form of wearables and watches. But notification centers and Apple Watches beg the question: what’s the best way to interrupt us properly? Already, tables of friends take periodic “phone breaks” to check in on their virtual worlds, something that might have been considered unthinkably gauche a few years ago.
Since the first phone let us ring a bell, uninvited, in a far-off house, we’ve been dealing with interruption. Smart interruption is useful: Stewart Brand said that the right information at the right time just changes your life; it follows, then, that the perfect interface is one that’s invisible until it’s needed, the way Google inserts hotel dates on a map, or flight times in your calendar, or reminders when you have to leave for your next meeting.
But all of this technology is interfering with reflection, introspection, and contemplation. In Alone Together, Sherry Turkle observes that it’s far easier to engage with tools like Facebook than it is to connect with actual humans, because interactive technology’s constant availability makes it a junk-food substitute for real interaction. My friend Hugh McGuire recently waxed rather poetically on the risks of constant interruption, and how he’d forgotten how to read because of it.
At work, modern productivity tools like Slack might do away with email conventions, encouraging better collaboration, but they do so at a cost because they work in a way that demands immediate attention, and that interrupts the natural rhythm we all need to write, to read, and to immerse ourselves in our surroundings. It’s hard to marinate when you’re being interrupted. Read more…
A profile of Dr. Renetta Garrison Tull, from our latest report on women in the field of data.
Download our updated report, “Women in Data: Cutting-Edge Practitioners and Their Views on Critical Skills, Background, and Education,” by Cornelia Lévy-Bencheton and Shannon Cutt, featuring four new profiles of women across the European Union. Editor’s note: this is an excerpt from the free report.

Dr. Renetta Garrison Tull is a recognized expert in women and minorities in education, and in the STEM gender gap — both within and outside the academic environment. Dr. Tull is also an electrical engineer by training and is passionate about bringing more women into the field.
From her vantage point at the University of Maryland Baltimore County (UMBC) as associate vice provost for graduate student development and postdoctoral affairs, Dr. Tull concentrates on opportunities for graduate and postdoctoral professional development. As director of PROMISE: Maryland’s Alliance for Graduate Education and the Professoriate (AGEP) program for the University System of Maryland (USM), Dr. Tull also has a unique perspective on the STEM subjects that students cover prior to attending the university, within academia and as preparation for the workforce beyond graduation.
Dr. Tull has been writing code since the seventh grade. Fascinated by the Internet, she “learned HTML before there were WYSIWYGs,” and remains heavily involved with the online world. “I’ve been politely chided in meetings for pulling out my phones (yes plural), sending texts, and updating our organization’s professional Twitter and Facebook status, while taking care of emails from multiple accounts. I manage several blogs, each for different audiences … friends, colleagues, and students.” Read more…
The "six C's": understanding the health data terrain in the era of precision medicine.
Ian Eslick, Tuhin Sinha, and Rob Rustad contributed to this post.
Download a free copy of “Navigating the Health Data Ecosystem,” the first in a series of reports covering our recent investigation into the health data ecosystem, funded by the Robert Wood Johnson Foundation.

A few years ago, O’Reilly became interested in health topics, running the Strata RX conference, writing a report on How Data Science is Transforming Health Care: Solving the Wanamaker Dilemma, and publishing Hacking Healthcare. Our social network grew to include people in the health care space, informing our nascent thoughts about data in the age of the Affordable Care Act and the problems and opportunities facing the health care industry. We had the notion that aggregating data from traditional and new device-based sources could change much of what we understand about medicine — thoughts now captured by the concept of “precision medicine.”
From that early thinking, we developed the framework for a grant with the Robert Wood Johnson Foundation (RWJF) to explore the technical, organizational, legal, privacy, and other issues around aggregating health-related data for research — to provide empirical lessons for organizations also interested in pushing for data in health care initiatives. Our new free report, Navigating the Health Data Ecosystem, begins the process of sharing what we’ve learned.
After decades of maturing in more aggressive industries, data-driven technologies are being adopted, developed, funded, and deployed throughout the health care market at an unprecedented scale. February 2015 marked the inaugural working group meeting of the newly announced NIH Precision Medicine Initiative, designed to assemble a million-person cohort with dense longitudinal genotype and phenotype data, whose donors provide researchers with the raw epidemiological evidence to develop better decision-making, treatments, and potential cures for diseases like cancer. In the past several years, many established companies and new startups have also started to apply collective intelligence and “big data” platforms to health and health care problems. All these efforts encounter a set of unique challenges that experts coming from other disciplines do not always fully appreciate. Read more…
How data-driven tech toys are — and aren’t — changing the nature of play.
Sign up to be notified when the new free report Data, Technology & The Future of Play becomes available. This post is part of a series investigating the future of play that will culminate in a full report.
When I was in first grade, I cut the fur pom-poms off of my dad’s mukluks. (If you didn’t grow up in the Canadian North and you don’t know what mukluks are, here’s a picture.) My dad’s mukluks were specially made for him, so he was pretty sore. I cut the pom-poms off because I had just seen The Trouble With Tribbles at a friend’s house, and I desperately wanted some Tribbles. I kept them in a shoebox, named them, brought them to show-and-tell, and pretended they were real.
It’s exactly this kind of imaginative play that a lot of parents are afraid is being lost as toys become smarter. And in exchange for what? There isn’t any real evidence yet that smart toys genuinely make kids smarter.
I tell this story not to emphasize what a terrible vandal I was as a child; rather, I tell it to show how irrepressible children’s imaginations are, and to explain why technological toys are not going to kill that imagination. Today’s “smart” toys are no different from dolls and blocks, or in my case, a pair of mukluks. By nature, all toys have affordances that imply how they should be used. The more complex the toy, the more focused the affordances are. Consider a stick: it can be a weapon, a mode of transport, or a magic wand. But an app that is designed to do a thing guides users toward that use case, just as a door handle suggests that you should grasp and turn it. Design has opinions. Read more…