With HBaseCon right around the corner, I wanted to take stock of one of the more popular (1) components in the Hadoop ecosystem. Over the last few years, many more companies have come to rely on HBase to run key products and services. The conference will showcase a wide variety of such examples, and highlight some of the new features that HBase developers have added over the past year. In the meantime, here are some things (2) you may not have known about HBase:
Many companies have had HBase in production for 3+ years: Large technology companies including Trend Micro, eBay, Yahoo! and Facebook, and analytics companies RocketFuel and Flurry, depend on HBase for many mission-critical services.
There are many use cases beyond advertising: Examples include communications (Facebook messages, Xiaomi), security (Trend Micro), measurement (Nielsen), enterprise collaboration (Jive Software), digital media (OCLC), DNA matching (Ancestry.com), and machine data analysis (Box.com). In particular, Nielsen uses HBase to track media consumption patterns and trends, mobile handset company Xiaomi uses HBase for messaging and other consumer mobile services, and OCLC runs the world’s largest online database of library resources on HBase.
Flurry has the largest contiguous HBase cluster: Mobile analytics company Flurry has an HBase cluster with 1,200 nodes (replicating into another 1,200-node cluster). Flurry plans to significantly expand this cluster in the near future.
Cell-level security: Many databases implement security by imposing access control at the column or row level. Intel developers have contributed several features that implement security at the cell level. This brings HBase closer to Apache Accumulo, a project that originated at the NSA. But note that compared to Accumulo, where cell-level security was built in from the outset, this is still a relatively new feature (it was only introduced to HBase in late 2013).
HBase has a diverse and vibrant community: No single company controls HBase. In fact, HBaseCon’s program committee draws from many companies, not just (conference organizer) Cloudera. It’s also worth noting that HBase has made inroads in companies across many countries. For example, Chinese companies Huawei, Taobao, and Xiaomi all employ HBase committers.
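To make the cell-level security item above a bit more concrete, here is a toy sketch of the general idea: each cell carries its own set of required labels, and a reader sees only the cells whose labels are covered by that reader's authorizations. This is not the HBase or Accumulo API. The `Cell` class, the `visible_cells` helper, the label names, and the sample rows are all made up for illustration, and the sketch assumes the simplest possible rule (every label on a cell is required), whereas real systems support richer label expressions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cell:
    """A single (row, column) value with its own access labels (hypothetical model)."""
    row: str
    column: str
    value: str
    # Labels a reader must hold to see this cell; an empty set means public.
    required_labels: frozenset = field(default_factory=frozenset)


def visible_cells(cells, authorizations):
    """Return only the cells whose required labels are all held by the reader."""
    auths = frozenset(authorizations)
    return [c for c in cells if c.required_labels <= auths]


# Three cells in the same row, each with a different sensitivity.
cells = [
    Cell("patient-1", "info:name", "Alice"),
    Cell("patient-1", "info:diagnosis", "flu", frozenset({"medical"})),
    Cell("patient-1", "info:ssn", "000-00-0000", frozenset({"medical", "admin"})),
]

# A reader holding only the "medical" label sees two of the three cells.
nurse_view = visible_cells(cells, {"medical"})
print([c.column for c in nurse_view])  # ['info:name', 'info:diagnosis']
```

The point of the sketch is the granularity: with column- or row-level access control, all three cells above would be either visible or hidden together, whereas per-cell labels let one row expose different cells to different readers.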
HBaseCon is one of my favorite events: the sessions are a great mix of deep dives, use cases, and introductory material. The two times I’ve attended, I found the energy to be great and the atmosphere very relaxed. One of the highlights of this year’s conference is a keynote presentation by Google’s BigTable team. BigTable was the inspiration for HBase, Accumulo, and many other databases. HBaseCon attendees are in for another great event.
(1) All the major Hadoop vendors are enthusiastic supporters of HBase.
(2) Huge thanks to Justin Kestelyn and St.Ack for walking me through the conference program.