ENTRIES TAGGED "Hadoop"

Strata Week: Big data boom and big data gaps

One report says the Hadoop market is booming while another says federal data usage isn't.

In this week's big data news, an IDC report points to the booming market for Hadoop and MapReduce (and if proposals for Strata are any indication, this is indeed a good time for big data).

Microsoft opens up

How Microsoft is contributing to and benefiting from open source.

Microsoft seems to be embracing open source more and more. What does this tell us about the company's near-term future?

Now available: "Planning for Big Data"

A free handbook for anybody wanting to understand and use big data.

"Planning for Big Data" is a new book that helps you understand what big data is, why it matters, and where to get started.

O'Reilly Radar Show 3/12/12: Best data interviews from Strata California 2012

Doug Cutting on Hadoop, Max Gadney on video data graphics, Jeremy Howard on big data and analytics.

Hadoop creator Doug Cutting discusses the similarities between Linux and the big data world, Max Gadney from After the Flood explains the benefits of video data graphics, and Kaggle's Jeremy Howard looks at the difference between big data and analytics.

Four short links: 13 February 2012

Indie Businesses, Frontend Sluggards, Beautiful Graphics, and Big Data Patterns

  1. Rise of the Independents (Bryce Roberts) — companies that don’t take VC money and instead choose to grow organically: indies. +1 for having a word for this.
  2. The Performance Golden Rule (Steve Souders) — 80-90% of the end-user response time is spent on the frontend. Check out his graphs showing where load times come from for various popular sites. The backend responds quickly, but loading all the JavaScript, images, CSS, embedded autoplaying videos, and all that kerfuffle takes much, much longer.
  3. Starry Night Comes to Life — wow, beautiful, must-see.
  4. MapReduce Patterns, Algorithms, and Use Cases: "In this article I digest a number of MapReduce patterns and algorithms to give a systematic view of the different techniques that can be found in the web or scientific articles. Several practical case studies are also provided. All descriptions and code snippets use the standard Hadoop’s MapReduce model with Mappers, Reduces, Combiners, Partitioners, and sorting."
Four short links: 8 February 2012

Text Mining, Unstoppable Sociality, Unicode Fun, and Scholarly Publishing

  1. Mavuno: an open source, modular, scalable text mining toolkit built upon Hadoop. (Apache-licensed)
  2. Cow Clicker — Wired profile of Cow Clicker creator Ian Bogost. I was impressed by Cow Clickers [...] have turned what was intended to be a vapid experience into a source of camaraderie and creativity. People create communities around social activities, even when they are antisocial. (via BoingBoing)
  3. Unicode Has a Pile of Poo Character (BoingBoing) — this is perfect.
  4. The Research Works Act and the Breakdown of Mutual Incomprehension (Cameron Neylon) — an excellent summary of how researchers and publishers view each other and their place in the world.
Top stories: January 30-February 3, 2012

Hadoop deconstructed, the value of unstructured data, and a Moneyball approach to software teams.

This week on O'Reilly: Edd Dumbill examined the components and functions of the Hadoop ecosystem, Pete Warden gave a big thumbs-up to unstructured data, and Jonathan Alexander looked at how a Moneyball approach could help software teams.

What is Apache Hadoop?

A look at the components and functions of the Hadoop ecosystem.

Apache Hadoop has been the driving force behind the growth of the big data industry. But what does it do, and why do you need all its strangely named friends, such as Oozie, ZooKeeper, and Flume?

Why Hadoop caught on

Doug Cutting on Hadoop's rise and why he's surprised at its growth.

Doug Cutting discusses Hadoop's current and near-term role, and the factors that made it a central part of data processing.
