FEATURED STORY

It’s not just about Hadoop core anymore

For maximum business value, big data applications have to involve multiple Hadoop ecosystem components.

Data is deluging today’s enterprise organizations from ever-expanding sources and in ever-expanding formats. To gain insight from this valuable resource, organizations have been adopting Apache Hadoop with increasing momentum. Now, the most successful players in big data enterprise are no longer only utilizing Hadoop “core” (i.e., batch processing with MapReduce), but are moving toward analyzing and solving real-world problems using the broader set of tools in an enterprise data hub (often interactively) — including components such as Impala, Apache Spark, Apache Kafka, and Search. With this new focus on workload diversity comes an increased demand for developers who are well-versed in using a variety of components across the Hadoop ecosystem.

Due to the size and variety of the data we’re dealing with today, a single use case or tool, no matter how robust, can obscure the full, game-changing potential of Hadoop in the enterprise. Instead, developing end-to-end applications that incorporate multiple tools from the Hadoop ecosystem, not just the Hadoop core, is the first step toward unlocking the range of use cases and analytic capabilities an enterprise data hub can support. Whereas MapReduce code primarily draws on Java skills, developers who want to work on full-scale big data engineering projects need to be able to work with multiple tools, often simultaneously. A well-rounded big data applications developer can ingest and transform data with the Kite SDK, write SQL queries with Impala and Hive, and create an application GUI with Hue. Read more…
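
To make the SQL piece of that workflow concrete, here is a minimal sketch of querying Impala from Python with the open source impyla client. The hostname, port, and the sales table it queries are hypothetical placeholders rather than details from the article; in practice you would point the connection at your own Impala daemon and data.

```python
# A minimal sketch of running an Impala SQL query from Python via impyla.
# The host, port, and "sales" table below are placeholders, not details
# taken from the article.
from impala.dbapi import connect


def top_products_by_revenue(host="impala.example.com", port=21050):
    """Return the ten highest-revenue products from a hypothetical sales table."""
    conn = connect(host=host, port=port)
    cursor = conn.cursor()
    cursor.execute(
        """
        SELECT product_id, SUM(price) AS revenue
        FROM sales
        GROUP BY product_id
        ORDER BY revenue DESC
        LIMIT 10
        """
    )
    rows = cursor.fetchall()  # list of (product_id, revenue) tuples
    cursor.close()
    conn.close()
    return rows


if __name__ == "__main__":
    for product_id, revenue in top_products_by_revenue():
        print(product_id, revenue)
```

The same kind of query could be pointed at Hive through its own driver; the larger point is that the SQL layer is just one component in an application that also spans ingestion and the GUI.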

Now available: Big Data Now, 2014 edition

Our wrap-up of important developments in the big data field.

In the four years we’ve been producing Big Data Now, our year-end wrap-up of important developments in the big data field, we’ve seen tools and applications mature, multiply, and coalesce into new categories. This year’s free wrap-up of Radar coverage is organized around seven themes:

  • Cognitive augmentation: As data processing and data analytics become more accessible, jobs that can be automated will go away. But to be clear, there are still many tasks where the combination of humans and machines produces superior results.
  • Intelligence matters: Artificial intelligence is now playing a bigger and bigger role in everyone’s lives, from sorting our email to rerouting our morning commutes, from detecting fraud in financial markets to predicting dangerous chemical spills. The computing power and algorithmic building blocks to put AI to work have never been more accessible.
Read more…

Four short links: 26 January 2015

Coding in VR, Git Workflows, Programming as Bookkeeping, and Valuing People

  1. How Might We Code in VR? — caught my eye because I’m looking for ideas on how to think about interaction design in the holoculus world.
  2. Git Workflows for Pros — non-developers don’t understand how important this is to productivity.
  3. All Programming is Bookkeeping — approach programming as a bookkeeping problem: checks and balances.
  4. Why I Am Not a Maker (Deb Chachra) — The problem is the idea that the alternative to making is usually not doing nothing — it’s almost always doing things for and with other people, from the barista to the Facebook community moderator to the social worker to the surgeon. Describing oneself as a maker — regardless of what one actually or mostly does — is a way of accruing to oneself the gendered, capitalist benefits of being a person who makes products.

Designing on a system level

Andy Goodman on service design, embeddables, and predictive analytics.

I recently sat down with Andy Goodman, designer and group director of Fjord’s US studios, who has been designing and managing design teams around the globe for the past 20 years. Goodman is a contributor to Designing for Emerging Technologies, and our conversation covers embeddables, wearables, and predictive analytics. To kick things off, I asked him to define “service design”:

“It’s well-known that if you ask a service designer to define “service design,” you get 10 different answers. For me, it’s really about thinking on a system level about design … It’s thinking about how systems, and not just computer systems, but how human systems and computer systems and physical systems all interact with each other. You need to be thinking not about individual moments; you need to be thinking about journeys and flows, and thinking about how a human being will naturally, without even thinking about it, move from one context to another using different devices, using physical objects, being in physical spaces. For me, it was very appealing, this idea that you can design more than just interactions in a way, more than just interactions on a screen. You can actually design other things that are more about the way we live and work and play.”

Read more…

Bitcoin is just the first app to use blockchain technology

Understanding the value of the blockchain above and beyond bitcoin.

Editor’s note: Lorne Lantz is a program co-chair for our O’Reilly Radar Summit: Bitcoin & the Blockchain on January 27, 2015, in San Francisco. For more on the program and for registration information, visit the Bitcoin & the Blockchain event website.

I remember the first time I heard about bitcoin. It was June 2012, and I had been invited to a bitcoin meetup. The whole time I was sitting there, I thought I was watching a bunch of computer geeks playing around with nerd money.

At the same time, I felt excited about the possibilities. If what the bitcoin believers were saying was true, it could become something very big. When I took a closer look, I realized why it could be so groundbreaking: decentralization.

Unlike other currencies and payment networks, bitcoin is not controlled by a bank, government, or financial institution. Instead, thousands of computers around the world verify transactions and manage a global decentralized ledger. This innovative technology is called the blockchain, and it provides a unique pathway that allows — for the first time — many computers that don’t trust each other to achieve consensus. In bitcoin’s case, they are achieving consensus on updates to the global ledger. Read more…
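
For readers who want to see the ledger idea in code, here is a deliberately simplified, single-machine sketch of a hash-linked chain of blocks. It is only a teaching toy, not the bitcoin protocol: it omits proof-of-work, peer-to-peer networking, and the consensus rules real nodes follow, but it does show why rewriting history is detectable once each block commits to the hash of the block before it.

```python
# A toy, single-machine illustration of the core blockchain idea: each block
# commits to the previous block's hash, so tampering with an earlier entry
# invalidates everything after it. This omits proof-of-work, networking,
# and the real consensus rules bitcoin nodes follow.
import hashlib
import json
import time


def block_hash(block):
    """Hash a block's contents deterministically."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def new_block(prev_hash, transactions):
    return {
        "timestamp": time.time(),
        "transactions": transactions,
        "prev_hash": prev_hash,
    }


# Build a tiny three-block chain.
chain = [new_block("0" * 64, ["genesis"])]
chain.append(new_block(block_hash(chain[-1]), ["alice pays bob 1"]))
chain.append(new_block(block_hash(chain[-1]), ["bob pays carol 2"]))


def is_valid(chain):
    """Every block must reference the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


print(is_valid(chain))                                # True
chain[1]["transactions"] = ["alice pays bob 1000"]    # tamper with history
print(is_valid(chain))                                # False: the link to block 2 breaks
```

In the real network, the hard part is not linking the blocks but getting thousands of mutually distrustful machines to agree on which chain of blocks is the valid one; that is what bitcoin’s consensus mechanism provides.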

Blockchain scalability

A look at the stumbling blocks to blockchain scalability and some high-level technical solutions.

Author note: Vitalik Buterin contributed to this article.

Editor’s note: Kieren James-Lubin is a program co-chair for our O’Reilly Radar Summit: Bitcoin & the Blockchain on January 27, 2015, in San Francisco. For more on the program and for registration information, visit the Bitcoin & the Blockchain event website.

In a talk at CoinJar last fall, well-known bitcoin expert Andreas Antonopoulos made the following comment:

“I have no worries that bitcoin can scale, and the simple reason for that is that I know that IPv4 can’t, and yet I use it every day.”

Bitcoin scalability, and the phrase “blockchain scalability,” come up often in technical discussions of the bitcoin protocol. Will the requirement of recording every bitcoin transaction in the blockchain compromise its security (because fewer users will keep a copy of the whole blockchain) or its ability to handle a large number of transactions (because new blocks, on which transactions are recorded, are produced only at limited intervals)? In this article, we’ll explore several meanings of “blockchain scalability” and some high-level technical solutions to the issue.

The three main stumbling blocks to blockchain scalability are:

  1. The tendency toward centralization as the blockchain grows: the larger the blockchain becomes, the more storage, bandwidth, and computational power “full nodes” in the network must spend to process it, creating a risk of much greater centralization if the blockchain grows so large that only a few nodes are able to process a block.
  2. The bitcoin-specific issue that the blockchain has a built-in hard limit of 1 megabyte per block, with new blocks produced roughly every 10 minutes, and removing this limit requires a “hard fork” (i.e., a backward-incompatible change) to the bitcoin protocol; see the back-of-the-envelope sketch after this list for what the limit implies for throughput.
  3. The high processing fees currently paid for bitcoin transactions, and the potential for those fees to increase as the network grows. We won’t discuss this too much, but see here for more detail.
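
To get a feel for what the 1 MB limit in item 2 implies, here is a rough back-of-the-envelope throughput estimate. The average transaction size used below is an assumption (figures in the 250-500 byte range were commonly cited at the time), not a number taken from this article.

```python
# Back-of-the-envelope estimate of bitcoin's maximum throughput under the
# 1 MB block-size limit. The average transaction size is an assumption,
# not a figure from the article.
BLOCK_SIZE_BYTES = 1_000_000      # hard limit per block
BLOCK_INTERVAL_SECONDS = 600      # new blocks arrive roughly every 10 minutes
AVG_TX_SIZE_BYTES = 500           # assumed average transaction size

tx_per_block = BLOCK_SIZE_BYTES / AVG_TX_SIZE_BYTES
tx_per_second = tx_per_block / BLOCK_INTERVAL_SECONDS

print(f"~{tx_per_block:.0f} transactions per block")    # ~2000
print(f"~{tx_per_second:.1f} transactions per second")  # ~3.3
```

Even under generous assumptions, that works out to a few transactions per second, orders of magnitude below what mainstream payment networks handle, which is why the limit, and the hard fork required to lift it, matters.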

We’ll consider these first two issues in detail. Read more…
