Signals from Strata + Hadoop World New York 2014

From unique data applications to factories of the future, here are key insights from Strata + Hadoop World New York 2014.

Experts from across the data world came together in New York City for Strata + Hadoop World New York 2014. Below we’ve assembled notable keynotes, interviews, and insights from the event.

Unusual data applications and the correct way to say “Hadoop”

Hadoop creator and Cloudera chief architect Doug Cutting discusses surprising data applications — from dating sites to premature babies — and he reveals the proper (but in no way required) pronunciation of “Hadoop.”

We are no longer accidentally well

We’re confronting a technology paradox, says PeaceHealth Labs’ chief medical officer Brigitte Piniewski. Our technology increases exponentially, yet people today aren’t as healthy as past generations — largely because we no longer simply encounter the nutrition and exercise we need. “It’s as if our biology has no respect for our ability to digitize our world,” Piniewski says. Reversing the trend will require community data projects that optimize our health.

Deep understanding leads to better visualizations

Miriah Meyer, assistant professor of computer science at the University of Utah, helped a scientist make a breakthrough with a visualization tool. The key? An inherent sense of what the scientist needed. “Designing visualizations for scientists is more than just about creating images,” Meyer says. “It’s about deeply understanding their problems and their experiments and their mental models.”

Empathy also has practical applications, as noted in a tweet from Farrah Bostic’s session.

20 years from now, factories will look more like data centers

Nathan Oostendorp, Sight Machine’s chief product architect and co-founder, discusses the influence data and technical infrastructure will have on manufacturing. Oostendorp says that while many companies are already gathering data, “the next Industrial Revolution is about automating the collection of that data, it’s about being able to get insights out of that data that you wouldn’t be able to without a computer, and it’s about being able to do predictive and prescriptive adjustments to a manufacturing process.”

What is the data lake?

Edd Dumbill, vice president of strategy at Silicon Valley Data Science, talks about the potential of the “data lake” and what it will take for it to become a large-scale reality. He also compares Hadoop’s trajectory to the evolution of Linux, and he reveals the most important shift he’s seen in the data space over the last three years.

On a related note: It’s interesting to chart the maturation of ideas discussed at Strata from year to year. The data lake is an excellent example: it is now being defined by the execution, not just the concept.

Visualizations: Edge cases vs best case scenarios

Trina Chiasson, Infoactive co-founder and CEO, discusses an essential but often overlooked part of visualizations: typography. She also offers an interesting take on the idea that everyone should learn to code.

Big data and your brain are inconsistent allies

Big data would appear to offer a rational approach to problem solving, but NPR social science correspondent Shankar Vedantam says vast amounts of information do not always yield honest conclusions. Our brains make sense of the world by mapping data to theories, and when those theories are wrong we’ll sometimes disregard contrarian data instead of changing the underlying conclusions.

You can see more keynotes and interviews in our Strata + Hadoop World 2014 playlist.

