Digits of pi, extruding images with iPads, and mapping the past on top of the present
In this edition of Strata Week: The 2,000,000,000,000,000th digit of pi is calculated with an assist from Hadoop and MapReduce; a new technique uses iPads to extrude light paintings across a long exposure shot; Historypin links historical photos to Google Street View shots; and this is the last week for Strata Conference proposal submissions.
Storage, MapReduce and Query are ushering in data-driven products and services.
We're at the beginning of a revolution in data-driven products and services, driven by a software stack that enables big data processing on commodity hardware. Learn about the SMAQ stack, and where today's big data tools fit in.
Blue is the color, getting help with email overload.
In the latest edition of Strata Week: Google's introduction of a new search-indexing system highlights an important limitation of MapReduce and Hadoop. Can MapReduce adapt to real-time needs or will others follow Google in creating new architectures for real-time analytics?
Some organizations create their own real-time analysis tools, while others turn to specialized solutions. In a previous post, I highlighted SQL-based real-time analytic tools that can handle large amounts of data. I noted that other big data management systems, such as MPP databases and MapReduce/Hadoop, were too batch-oriented to deliver analysis in near real-time. At least for MapReduce/Hadoop systems, things may have changed slightly. A group of researchers from UC Berkeley and Yahoo recently modified MapReduce to allow for pipelining between operators.
The growing need to manage and make sense of Big Data has led to a surge in demand for analytic databases, which many companies are attempting to fill. As an alternative to current shared-nothing analytic databases, HadoopDB is a hybrid that combines parallel databases with scalable and fault-tolerant Hadoop/MapReduce systems.
Our belief that proficiency in managing and analyzing large amounts of data distinguishes market-leading companies led to a recent report designed to help users understand the different large-scale data management techniques. Our report on Big Data Technologies was the result of interviews with over thirty experts, including research scientists, (open-source) hackers, vendors, data analysts, and entrepreneurs. I recently sat down with my co-author, Roger Magoulas (Director of Research at O’Reilly), who agreed to talk about our report and Big Data in general.