Strata Week: Running the numbers

IA Ventures' success, MathJax display engine, statistical literacy, and making big data more human

Here’s what caught my attention in the data world this week.

It all comes down to funding

Fifty million. That’s the number of dollars investors have committed to IA Ventures, a New York City-based fund dedicated to big data tools and technology start-ups. It’s quite an impressive number for a first-time fund in any economic conditions, let alone the current climate.

So how did they do it? Check out founder Roger Ehrenberg’s recent blog post, in which he provides a behind-the-scenes look at his experience, including things he wishes he’d done differently. It’s a nice picture of what it’s like to change careers, start a fund, and learn from experience.

MathJax: Delicious and nutritious

Ever had a thought that couldn’t be expressed in words? Wanted to put that thought on the web? MathJax, an open source JavaScript display engine for mathematical equations, makes that easier (and much more beautiful) across most browsers.

A project of the American Mathematical Society, Design Science, Inc., and the Society for Industrial and Applied Mathematics, MathJax provides top-notch mathematical typesetting without the need for special downloads or plugins. Authors can submit math content in a variety of formats (such as MathML or LaTeX), and feel confident of its proper display even in browsers that don’t have native MathML support.
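As a rough illustration of the LaTeX route, an author might write something like the snippet below and let MathJax render it in the browser. The quadratic formula is just a stand-in chosen here, and the \[ ... \] display delimiters are an assumption about how the page is configured.

```latex
% Hypothetical sample input: the quadratic formula as display math.
% Assumes the page's MathJax setup recognizes \[ ... \] delimiters.
\[
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\]
```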

TeX samples, MathML samples, and scaling samples can be found on the MathJax demo page. There’s also a neat screencast showing how users can copy and paste equations into various applications (such as Mathematica) using MathJax.

The MathJax source code is here, and further documentation can be found here.


Not to put too fine a point on it, but Kevin Drum’s “Statistical Zombies” post should be required reading for anyone who has ever picked up a newspaper, or ever will. In it, he deftly highlights “the top ten mistakes that infest day-to-day reporting of numerical and statistical information.”

Error rates, inflation adjustment, and the distinction between correlation and causation are just some of the important data literacy principles Drum points out. Think you’re pretty statistics savvy? Take a read and see if you don’t learn (or recall) something.

For more fun, check out Lori Alden’s example set of 12 misleading charts and statistics. Can you identify the blunders?

The fine line between tragedy and numbers

Unless you’ve been living in a cave for the last few weeks (and, given the madness of the holiday season, I wouldn’t blame you), you’ve probably been following the WikiLeaks excitement in the news. The abundance of commentary on that issue need not be rehashed here, but Paul Bradshaw’s take bears mentioning.

In his Online Journalism Blog, Bradshaw explores the difficulty of bringing big datasets to a human scale in journalistic terms, and explains, “when you move beyond scales we can deal with on a human level, you struggle to engage people in the issue you are covering.”

His proposed solution is a kind of non-visual visualization, otherwise known as the anecdote. Human narratives can help us connect to data, to see it in a sympathetic way. Bradshaw stresses that personal stories must be carefully selected so they remain representative of the larger trend. He cautions that the intricacies of a larger dataset may not be revealed in the tales of individuals.

Industrial scale journalism using “big data” in a networked age raises new problems and new opportunities: we need to humanise and personalise big datasets in a way that does not detract from the complexity or scale of the issues being addressed; and we need to think about what happens after someone reads a story online and whether online publishers have a role in that.

Sometimes, it’s about more than just the numbers.

The Strata Conference is coming

Fifty-three: that’s how many days are left before the inaugural Strata Conference!

Save 30% on Strata registration with the code STR11RAD.

