Before I came to O’Reilly I was building the “big data and disruptive analytics practice” at a major systems integrator. It was a blast to spend every week talking to customers in different industries who were waking up to the possibilities that technologies like Hadoop offered their businesses. Many of these businesses are going to fundamentally change as they embrace this stuff (or be replaced by those that do). But there’s a catch.
Twenty years or so ago large integrators made big business building applications on the then-new relational paradigm. They put in Oracle databases with custom code, wrote PowerBuilder apps on Sybase, and of course lots of businesses rolled their own with VB and SQL Server. It was an era of custom coding where Oracle, Sybase, SQL Server, Informix, etc. were thought of as platforms to build stuff on.
Then the market matured and shifted to package solution implementation: ERP, CRM, etc. The big guys focused on integrating again and told their clients there was no ROI in building custom stuff. ROI would come from integrating best-of-breed solutions. Databases became commodity back ends to the applications that were always the real focus.
Now along come big data, NoSQL, data science, and all that stuff, and it seems like we’re starting the cycle over again. But this time clients, having been well trained over the last decade or so, aren’t having any of that “build it from scratch” mentality. They know that Hadoop and other new technologies can be transformative to their business, but they want it packaged up and solution’ified like they are used to. I heard a lot of “let us know when you have a solution already built or available to buy that does X” in the last year.
Also, lots of the shops that do this stuff at scale are built and staffed around the package implementation model and have shed many of the skills they used to have for custom work. Everything from staffing models to methodologies is oriented toward package installation.
So, here we are with all of this disruptive technology, but in a lot of large companies we seem to have lost the institutional wherewithal to do anything with it. Of course that fact was hard on my numbers. I had a great pipeline of companies with pain to solve, and great technologies to solve it, but too often it was hard to close deals without ready-made solutions.
Every week I talked to the companies building these new platforms to share leads and talk about their direction. After a while I started cutting them off when they wanted to talk about the features of their next release. I just got to the point where I didn’t really care; it just wasn’t all that relevant to my customers. I mean, it’s important that they are making the platforms more manageable and building bridges to traditional BI, ETL, RDBMS, and the like. But the focus was too much on platforms and tools.
I wanted to know “What are you doing to encourage solution development? Are you staffing a support system for ISVs? What startups and/or established players are you aware of that are building solutions on this platform?” So when I saw this tweet I let out a little yelp. Awesome! The lack of ready-to-install solutions was getting attention, and from Mike Olson.
Cloudera CEO wants startups to build Hadoop apps. He will connect you to funding. #dataconf
— MJHarkins (@uberjake42) March 21, 2012
You can watch the rest of what Mike Olson said here and you’ll find he tells a similar story about the RDBMS historical parallel.
I talked to Mike a few weeks ago to find out what was behind his comment and explore what else they are doing to support solution development. It boils down to what he said — he will help connect you with money — plus a newly launched partner program designed to provide better support to ISVs among others. Also, the continued attention to APIs and tools like Pig and Hive should make it easier for the solution ecosystem to develop. It can only be good for his business to have lots of other companies directly solving business problems, and simply pulling in his platform.
Hortonworks also started a partner program in the fall and I think we’ll see a lot more emphasis on this across the space this year. However, wherever I look at the moment (Hortonworks partners, Cloudera partners, Accel big data portfolio) the focus remains firmly on platform and tools or partnering with integrators. Tresata, a startup focused on financial risk management, pops up in a lot of lists as the obvious odd one out: an actual domain-specific solution.
What about other people that could be building solutions? Is it the maturity level of the technology, the lack of penetration of Hadoop etc. into your customers’ data centers, or some combination of other factors that is slowing things down?
Of course, during the RDBMS adoption it took a lot of years before the custom era was over and thoroughly replaced by the era of package implementation. The question I’m pondering is whether customer expectations and the pace of technology will make it happen faster this time. Or is the disruptive value of big data going to continue to accrue only to risk-taking early adopters for the foreseeable future?
If you are building a startup based on a solution or application that leverages big data technology, and you aren’t being stealthy, I’d love to hear about it in the comments.