FEATURED STORY

Scaling NoSQL databases: 5 tips for increasing performance

How NoSQL databases scale vertically and horizontally, and what you should consider when building a DB cluster.

Editor’s note: this post is a follow-up to a recent webcast, “Getting the Most Out of Your NoSQL DB,” by the post author, Alex Bordei.

As product manager for Bigstep’s Full Metal Cloud, I work with a lot of amazing technologies. Most of my work actually involves pushing applications to their limits. My mission is simple: make sure we get the highest performance possible out of each setup we test, then use that knowledge to constantly improve our services.

Here are some of the things I’ve learned along the way about how NoSQL databases scale vertically and horizontally, and what you should consider when building a DB cluster. Some of these findings apply to RDBMSs as well, so read on even if you’re still a devoted SQL fan. You might just get up to 60% more performance out of that database soon enough. Read more…

The promise of Promise Theory: The O’Reilly Radar Podcast

Mark Burgess chats about Promise Theory, and Geoffrey Moore discusses a modern approach to his Crossing the Chasm theory.

Editor’s note: you can subscribe to the O’Reilly Radar Podcast through iTunes, SoundCloud, or directly through our podcast’s RSS feed.

As systems become increasingly distributed and complex, it’s more important than ever to find ways to accurately describe and analyze those systems, and to formalize intent behind processes, workflows, and collaboration.

In this podcast episode, I chat with Mark Burgess, founder and CTO of CFEngine, about the origins of Promise Theory and its connection to DevOps. Read more…

Announcing Spark Certification

A new partnership between O’Reilly and Databricks offers certification and training in Apache Spark.

Editor’s note: full disclosure — Ben is an advisor to Databricks.

I am pleased to announce a joint program between O’Reilly and Databricks to certify Spark developers. O’Reilly has long been interested in certification, and with this inaugural program, we believe we have the right combination: an ascendant framework and a partnership with the team behind the technology. The founding team of Databricks comprises members of the UC Berkeley AMPLab team that created Spark.

The certification exam will be offered at Strata events, through Databricks’ Spark Summits, and at training workshops run by Databricks and its partner companies. A variety of O’Reilly resources will accompany the certification program, including books, training days, and videos targeted at developers and companies interested in the Apache Spark ecosystem. Read more…

One man willingly gave Google his data. See what happened next.

Google requires quid for its quo, but it offers something many don’t: user data access.

Despite some misgivings about the company’s product course and service permanence (I was an early and fanatical user of Google Wave), my relationship with Google is a symbiotic one. Its “better mousetrap” approach to products and services and the breadth of its online, mobile, and behind-the-scenes offerings save me countless hours every week in exchange for a slice of my private life, laid bare before its algorithms and analyzed for marketing purposes.

I am writing this on a Chromebook by a lake, using Google Docs and images in Google Drive. I found my way here, through the thick underbrush along a long since forgotten former fishmonger’s trail, on Google Maps after Google Now offered me a glimpse of the place as one of the recommended local attractions.

The lake I found via Google Maps and a recommendation from Google Now.

Admittedly, having my documents, my photos, my to-do lists, contacts, and much more on Google, and depending on it as a research tool and mail client, map provider and domain host, is scary. And as much as I understand that my dependence on Google carries the potential for problems, the fact remains that none of those dependencies, not one shred of data, and certainly not one iota of my private life, is known to the company without my explicit, active consent. Read more…

Small brains, big data

How neuroscience is benefiting from distributed computing — and how computing might learn from neuroscience.

When we think about big data, we usually think about the web: the billions of users of social media, the sensors on millions of mobile phones, the thousands of contributions to Wikipedia, and so forth. Due to recent innovations, web-scale data can now also come from a camera pointed at a small but extremely complex object: the brain. New progress in distributed computing is changing how neuroscientists work with the resulting data and may, in the process, change how we think about computation. Read more…

How Flash changes the design of database storage engines

High-performing memory throws many traditional decisions overboard.

Over the past decade, SSDs (popularly known as Flash) have radically changed computing at both the consumer level, where USB sticks have effectively replaced CDs for transporting files, and the server level, where they offer a price/performance ratio radically different from both RAM and disk drives. But databases have only begun to catch up during the past few years. Most still depend on internal data structures and storage management fine-tuned for spinning disks.

Citing price and performance, one author advised a wide range of database vendors to move to Flash. Certainly, a database administrator can speed up old databases just by swapping out disk drives and inserting Flash, but doing so captures just a sliver of the potential performance improvement promised by Flash. For this article, I asked several database experts — including representatives of Aerospike, Cassandra, FoundationDB, RethinkDB, and Tokutek — how Flash changes the design of storage engines for databases. The various ways these companies have responded to its promise in their database designs are instructive to readers designing applications and looking for the best storage solutions.

Read more…
