Mark Burgess chats about Promise Theory, and Geoffrey Moore discusses a modern approach to his Crossing the Chasm theory.
As systems become increasingly distributed and complex, it’s more important than ever to find ways to accurately describe and analyze those systems, and to formalize intent behind processes, workflows, and collaboration.
A new partnership between O’Reilly and Databricks offers certification and training in Apache Spark.
Editor’s note: full disclosure — Ben is an advisor to Databricks.
I am pleased to announce a joint program between O’Reilly and Databricks to certify Spark developers. O’Reilly has long been interested in certification, and with this inaugural program, we believe we have the right combination — an ascendant framework and a partnership with the team behind the technology. The founding team of Databricks comprises members of the UC Berkeley AMPLab team that created Spark.
The certification exam will be offered at Strata events, through Databricks’ Spark Summits, and at training workshops run by Databricks and its partner companies. A variety of O’Reilly resources will accompany the certification program, including books, training days, and videos targeted at developers and companies interested in the Apache Spark ecosystem. Read more…
Google requires quid for its quo, but it offers something many don’t: user data access.
Despite some misgivings about the company’s product course and service permanence (I was an early and fanatical user of Google Wave), my relationship with Google is a symbiotic one. Its “better mousetrap” approach to products and services and the breadth of its online, mobile, and behind-the-scenes offerings save me countless hours every week in exchange for a slice of my private life, laid bare before its algorithms and analyzed for marketing purposes.
I am writing this on a Chromebook by a lake, using Google Docs and images in Google Drive. I found my way here, through the thick underbrush along a long since forgotten former fishmonger’s trail, on Google Maps after Google Now offered me a glimpse of the place as one of the recommended local attractions.
Admittedly, having my documents, my photos, my to-do lists, contacts, and much more on Google, and depending on it as a research tool and mail client, map provider and domain host, is scary. And as much as I understand that my dependence on Google carries the potential for problems, the fact remains that none of those dependencies, not one shred of data, and certainly not one iota of my private life, is known to the company without my explicit, active consent. Read more…
How neuroscience is benefiting from distributed computing — and how computing might learn from neuroscience.
When we think about big data, we usually think about the web: the billions of users of social media, the sensors on millions of mobile phones, the thousands of contributions to Wikipedia, and so forth. Due to recent innovations, web-scale data can now also come from a camera pointed at a small, but extremely complex object: the brain. New progress in distributed computing is changing how neuroscientists work with the resulting data — and may, in the process, change how we think about computation. Read more…
High-performing memory throws many traditional decisions overboard.
Over the past decade, SSDs (popularly known as Flash) have radically changed computing at both the consumer level, where USB sticks have effectively replaced CDs for transporting files, and the server level, where they offer a price/performance ratio radically different from both RAM and disk drives. But databases have only started to catch up during the past few years. Most still depend on internal data structures and storage management fine-tuned for spinning disks.
Citing price and performance, one author advised a wide range of database vendors to move to Flash. Certainly, a database administrator can speed up old databases just by swapping out disk drives and inserting Flash, but doing so captures just a sliver of the potential performance improvement promised by Flash. For this article, I asked several database experts — including representatives of Aerospike, Cassandra, FoundationDB, RethinkDB, and Tokutek — how Flash changes the design of storage engines for databases. The various ways these companies have responded to its promise in their database designs are instructive to readers designing applications and looking for the best storage solutions.
OSM is moving out of its awkward adolescence and into its mature, young adult phase.
Next to GPS, the most significant development in the Open Geo Data movement is OpenStreetMap (OSM), a community-driven mapping project whose goal is to create the most detailed, correct, and current open map of the world. This week, OSM celebrates its 10th birthday, which provides a convenient excuse to highlight why its achievements to date are amazing, unusual, and promising in equal parts.
When Steve Coast began the project in 2004, map data sources were few, and largely controlled by a small collection of private and governmental players. The scarcity of map data ensured that it remained both expensive and highly restrictive, and no one but the largest navigation companies could use it. Steve changed the rules by creating a wiki-like resource of the entire globe, which everyone could use without hindrance. Read more…