Gosier will be speaking at next week’s Strata conference on “The Democratization of Data Platforms.” In the following interview, he discusses the challenges and opportunities data democratization creates.
Your keynote is going to be about “everyone’s” big data problems. If everyone really does have their own big data problem, how are we going to democratize big data tools and processes? It seems that our various problems would require many different solutions.
Jonathan Gosier: It’s a problem for everyone because data problems can manifest in a multitude of ways: too much email, too many passwords to remember, a deluge of legal documents related to a mortgage, or simply knowing where to look online for the answers to simple questions.
You’re absolutely correct in noting that each of these problems requires a different solution. However, many of these solutions tend not to be accessible to the average person, whether because of price or because of the level of expertise required to use the available tools.
There is a lot of talk about a “digital divide,” but there’s a growing “data divide” as well. It’s no longer about having basic computer literacy skills. Being able to understand what data is available, how it can be manipulated, and how it can be used to actually improve one’s life is a skill that not everyone possesses.
There’s an opportunity here for growth as well. If you look at the market, there are tools for visualizing personal finance (think Mint.com or HelloWallet), personal health (23andMe), personal productivity (Basecamp), etc. But the overarching trend is that there is a growing need for products that simplify the wealth of information around people. The simplest way to do this is often through visuals.
Why are visualizations so important to a better understanding of data?
Jonathan Gosier: Visualizations are only “better” in that they can relate complex ideas to a general audience. Visualization is by no means a replacement for expertise and research. It simply represents a method for communicating across barriers of knowledge.
But beyond that, the problem with a lot of the data visuals on the web is that they are static, pre-constructed, and vague about their data sources. This means the general public either has to take what’s presented at face value and agree or disagree, or they have to conduct their own research.
There’s a need for “living infographics” — visualizations that are inviting and easy to understand, but are shared with the underlying data used to create them. This allows the casual consumer to simply admire the visual while the more discerning audience can actually analyze the underlying data to see if the message being presented is consistent with their findings.
It’s far more transparent and credible to reveal, versus conceal, one’s sources.
One of the pushbacks to data democratization efforts is that people might not know how to use these tools correctly and/or they might use them to further their own agendas. How do you respond to that?
Jonathan Gosier: The question illustrates the point, actually. It wasn’t so long ago that the same could be said about the printing press. It was an innovation, but initially it was so expensive that it was available only to the elite and wealthy. Now it’s common (at least in the Western world) for any given middle-class household to contain an inexpensive printing device. The web radicalized things even more, essentially turning anyone with access into a publisher.
So the question becomes, was it good or bad that publishing became something that anyone could do, rather than a select few? I’d argue that, ultimately, the pros have outweighed the cons by orders of magnitude.
Right now data can be thought of as an asset of the elite and privileged. Those with wealth pay a lot for it, and those who are highly skilled can charge a great deal for their services around it. But the reality is, there is a huge portion of the market that has a legitimate need for data solutions that aren’t currently available to them.