In the next decade, Year Zero will be how big data reaches everyone and will fundamentally change how we live.
Editor’s note: this post originally appeared on the author’s blog, Solve for Interesting. This lightly edited version is reprinted here with permission.
In 10 years, every human connected to the Internet will have a timeline. It will contain everything we’ve done since we started recording, and it will be the primary tool with which we administer our lives. This will fundamentally change how we live, love, work, and play. And we’ll look back at the time before our feed started — before Year Zero — as a huge, unknowable black hole.
This timeline — beginning for newborns at Year Zero — will be so intrinsic to life that it will quickly be taken for granted. Those without a timeline will be at a huge disadvantage. Those with a good one will have the tricks of a modern mentalist: perfect recall, suggestions for how to curry favor, ease maintaining friendships and influencing strangers, unthinkably higher Dunbar numbers — now, every interaction has a history.
This isn’t just about lifelogging health data, like your Fitbit or Jawbone. It isn’t about financial data, like Mint. It isn’t just your social graph or photo feed. It isn’t about commuting data like Waze or Maps. It’s about all of these, together, along with the tools and user interfaces and agents to make sense of it.
Every decade or so, something from military or enterprise technology finds its way, bent and twisted, into the mass market. The client-server computer gave us the PC; wide-area networks gave us the consumer web; pagers and cell phones gave us mobile devices. In the next decade, Year Zero will be how big data reaches everyone.
The Strata + Hadoop World 2015 Startup Showcase highlighted four important trends in the big data world.
At Strata + Hadoop World 2015 in San Jose last week, we ran an event for data-driven startups. This is the fourth year for the Startup Showcase, and it’s become a fixture of the conference. One of our early winners, MemSQL, has since raised $50 million in financing, and the showcase is a good way for companies to get visibility with investors, analysts, and attendees.
This year’s winners underscore several important trends in the big data space at the moment: the maturity of management tools; the deployment of machine learning in other verticals; an increased focus on privacy and permissions; and the convergence of enterprise languages like SQL with distributed, schema-less data stacks.
A Call for Proposals for Strata Conference + Hadoop World 2014
When we launched Strata a few years ago, our original focus was on how big data, ubiquitous computing, and new interfaces change the way we live, love, work and play. In fact, here’s a diagram we mocked up back then to describe the issues we wanted the new conference to tackle:
Such lists might mean we miss the truly great breakthroughs, inspirations, and leaps of faith necessary to evolve.
Editor’s note: this post originally appeared on Tilt the Windmill; it is republished here with permission.
First: it’s an excellent post. You should read it. I’ll wait.
Every enterprise decision-maker will soon be running their business according to the lists Barry envisions, as the power of big data and analytics finds its way into every boardroom and dashboard. Society will soon demand them, too. But while such analysis is tremendously valuable, it carries two dangers: the politics of setting criteria, and the trap of relying on data for inspiration.
The harsh light of data
Barry is right: rather than using our precious time and resources to make yet another linkbait list of the 50 cutest kittens, or the seven people I’ll try to avoid at SXSW, we should use abundant data and a connected world to build lists that matter: lying politicians, bad cars, lousy doctors. Then we can use these lists to change policy and behaviour because we’ll make things transparent. Shining the harsh light of data on something can improve it.
As society becomes increasingly data driven, it's critical to remember big data isn't a magical tool for predicting the future.
If you eat ice cream, you’re more likely to drown.
That’s not true, of course. It’s just that both ice cream and swimming happen in the summer. The two are correlated — and ice cream consumption is a good predictor of drowning fatalities — but ice cream hardly causes drowning.
These kinds of correlations are all around us, and big data makes them easy to find. We can correlate childhood trauma with obesity, nutrition with crime rates, and how toddlers play with future political affiliations.
Just as we wouldn’t ban ice cream in the hopes of preventing drowning, we wouldn’t preemptively arrest someone because their diet wasn’t healthy. But a quantified society, awash in data, might be tempted to do so because overwhelming correlation looks a lot like causality. And overwhelming correlation is what big data does best.
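To make the ice cream example concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the temperature range, the coefficients, and the noise levels are invented, and the helper names `pearson` and `residuals` are hypothetical. Temperature drives both ice cream sales and drownings, and neither affects the other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented model: temperature is the hidden common cause.
temps = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [10 * t + rng.gauss(0, 40) for t in temps]    # sales depend only on temperature
drownings = [0.2 * t + rng.gauss(0, 1.5) for t in temps]  # drownings depend only on temperature

# Correlation as a naive analysis would see it.
raw = pearson(ice_cream, drownings)

# Correlation once the effect of temperature is removed from both series.
adjusted = pearson(residuals(ice_cream, temps), residuals(drownings, temps))

print(f"raw correlation:      {raw:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

In this toy model the raw correlation is strong, yet it all but vanishes once temperature is accounted for. The catch is that real data rarely tells you what the hidden variable is.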
It’s getting easier than ever to find correlations. Parallel computing, advances in algorithms, and the inexorable crawl of Moore’s Law have dramatically reduced how much it costs to analyze a data set. Consider an activity we do dozens of times a day, without thinking: a Google search. The search is farmed out to thousands of machines, and often returns hundreds of answers in less than a second. Big data might seem esoteric, but it’s already here.
Eleven areas of focus for deeper investigation.
Conferences like Strata are planned a year in advance. The logistics and coordination required for an event of this magnitude take a lot of planning, but it also takes a decent amount of prediction: Strata needs to skate to where the puck is going.
While Strata New York + Hadoop World 2013 is still a few months away, we’re already guessing at what next year’s Santa Clara event will hold. Recently, the team got together to identify some of the hot topics in big data, ubiquitous computing, and new interfaces. We selected eleven big topics for deeper investigation.
- Deep learning
- Time-series data
- The big data “app stack”
- Cultural barriers to change
- Design patterns
- Laggards and Luddites
- The convergence of two databases
- The other stacks
- Mobile data
- The analytic life-cycle
- Data anthropology
Here’s a bit more detail on each of them.
It’s been a weird couple of weeks for the Internet of Things. As we connect everything to everything else, we inadvertently create a huge attack surface for hackers, and we’re starting to see the chinks in the armor.
Let’s say you fancy a fast car. Flavio Garcia, a University of Birmingham computer scientist, discovered the algorithm that verifies the ignition key for luxury cars like Porsches, Audis, Bentleys, and Lamborghinis. He was slapped with an injunction banning him from disclosing his findings at the Usenix Security Symposium, in order to prevent sophisticated criminal gangs from having the analytics tools for widespread car theft.
You might need Garcia’s algorithm to steal a car, but soon, with an entirely different algorithm, you may be able to crash one into a tree or disable its brakes from a distance. Or maybe it’s a fast boat you’re after. Mess with its GPS, and you can steer it where you want without the crew noticing.
Learn to resist vanity metrics
One of the things we preach in Lean Analytics is that entrepreneurs should avoid vanity metrics: numbers that make you feel good but ultimately don’t change your behavior. Vanity metrics (such as “total visitors”) tend to go “up and to the right” but don’t tell you much about how you’re doing.
Many people find solace in graphs that go up and to the right. The metric “Total number of people who have visited my restaurant” will always increase; but on its own it doesn’t tell you anything about the health of the business. It’s just head-in-the-sand comforting.
A good metric is often a comparative rate or ratio. Consider what happens when you put the word “per” before or after a metric. “Restaurant visitors per day” is vastly more meaningful. Time is the universal denominator, since the universe moves inexorably forwards. But there are plenty of other good ratios. For example, “revenue per restaurant visitor” matters a lot, since it tells you what each diner contributes.
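Here is a small sketch of the difference, using invented numbers. The week of restaurant figures below is hypothetical, chosen only to show how a cumulative total can hide a decline that the “per” metrics expose.

```python
from itertools import accumulate

# Hypothetical figures for one week at a restaurant (made up for illustration).
daily_visitors = [60, 55, 48, 40, 35, 30, 25]
daily_revenue = [1200.0, 1100.0, 1008.0, 880.0, 770.0, 690.0, 600.0]

# Vanity metric: the running total of visitors can only go up.
total_visitors = list(accumulate(daily_visitors))

# Comparative metrics: visitors per day is the daily series itself,
# and revenue per visitor divides each day's revenue by that day's visitors.
revenue_per_visitor = [rev / n for rev, n in zip(daily_revenue, daily_visitors)]

rows = zip(total_visitors, daily_visitors, revenue_per_visitor)
for day, (total, per_day, per_head) in enumerate(rows, start=1):
    print(f"day {day}: total={total:4d}  visitors/day={per_day:3d}  "
          f"revenue/visitor=${per_head:.2f}")
```

The total climbs every day even though daily visitors are falling, which is exactly why it is comforting and useless. The two ratios tell you something you can act on: fewer people are coming in, but each one is spending a little more.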
What’s an active user, anyway?
For many businesses, the go-to metric revolves around “active users.” In a mobile app or software-as-a-service business, only some percentage of people are actively engaged. In a media site, only some percentage uses the site each day. And in a loyalty-focused e-commerce company, only some buyers are active.
This is true of more traditional businesses, too. Only a percentage of citizens are actively engaged in local government; only a certain number of employees are using the Intranet; only a percentage of coffee shop patrons return daily.
Unfortunately, saying “measure active users” raises the question: What’s active, anyway?
To figure this out, you need to look at your business model. Not your business plan, which is a hypothetical projection of how you’ll fare, but your business model. If you’re running a lemonade stand, your business model likely has a few key assumptions:
- The cost of lemonade;
- The amount of foot traffic past your stand;
- The percent of passers-by who will buy from you;
- The price they are willing to pay.
Our Lean lemonade stand would then set about testing and improving each metric, running experiments to find the best street corner, or determine the optimal price.
Lemonade stands are wonderfully simple; your business may have many more assumptions. But it is essential that you state and quantify them so you can focus on improving them, one by one, until your business model and reality align. In a restaurant, for example, these assumptions might be “we will have at least 50 diners a day” or “diners will spend on average $20 a meal.”
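One way to state the lemonade stand’s assumptions explicitly is to write them down as a tiny model. This is only a sketch: the class name `LemonadeModel`, its field names, and every number are hypothetical, and it assumes the cost of lemonade is a fixed cost per cup with no other expenses.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class LemonadeModel:
    """The four assumptions of the lemonade stand, stated as numbers."""
    cost_per_cup: float      # cost of lemonade (assumed to be per cup)
    foot_traffic: int        # passers-by per day
    conversion_rate: float   # fraction of passers-by who buy
    price: float             # price each buyer pays

    def cups_sold(self) -> float:
        return self.foot_traffic * self.conversion_rate

    def daily_profit(self) -> float:
        return self.cups_sold() * (self.price - self.cost_per_cup)

# Invented starting assumptions.
baseline = LemonadeModel(cost_per_cup=0.50, foot_traffic=200,
                         conversion_rate=0.10, price=2.00)

# Each experiment changes one assumption and leaves the rest alone.
better_corner = replace(baseline, foot_traffic=400)
higher_price = replace(baseline, price=2.50, conversion_rate=0.08)

for name, model in [("baseline", baseline),
                    ("better corner", better_corner),
                    ("higher price", higher_price)]:
    print(f"{name}: {model.cups_sold():.0f} cups, "
          f"${model.daily_profit():.2f} profit per day")
```

Writing the model this way forces each assumption to be a number you can test. When the real stand sells fewer cups than `cups_sold()` predicts, you know which assumption to go and measure.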
The activity you want changes
We believe most new companies and products go through five distinct stages of growth:
- Empathy, where you figure out what problem you’re solving and what solution people want;
- Stickiness, where you measure how many people adopt your solution rather than trying it and leaving;
- Virality, where you maximize word-of-mouth and references;
- Revenue, where you pour some part of your revenues back into paid acquisition or advertising;
- Scale, where you grow the business through automation, delegation, and process.
Submit your suggestions for videos that make us think about how data, visualizations, and technology are changing us
Each year at Strata, we warm up the crowd in the main keynote sessions with short videos that will make people think. These videos demonstrate the ways that data, technology, and visualization are changing us. Some are funny; some are clever; some are downright disturbing.
For Strata New York + Hadoop World in October, we’re hoping you’ll join in and suggest some videos for us. If you’ve got something you feel captures the zeitgeist of technology at the fringes, then complete this form, and we’ll check it out. We’ll choose some of them as we kick off the event this fall.