The bond between data and journalism grows stronger

Liliana Bounegru discusses the state of data journalism and its growing influence.

While reporters and editors have been the traditional vectors for information gathering and dissemination, the flattened information environment of 2012 now has news breaking first online, not on the newsdesk.

That doesn’t mean that the integrated media organizations of today don’t play a crucial role. Far from it. In the information age, journalists are needed more than ever to curate, verify, analyze and synthesize the wash of data.

To learn more about the shifting world of data journalism, I interviewed Liliana Bounegru (@bb_liliana), project coordinator of SYNC3, the first international Data Journalism Awards, and Data Driven Journalism at
the European Journalism Centre.

What’s the difference between the data journalism of today and the computer-assisted reporting (CAR) of the past?

Liliana Bounegru: There is a “continuity and change” debate going on around the label “data journalism” and its relationship with previous journalistic practices that employ
computational techniques to analyze datasets.

Some argue that there is a difference between CAR and data
journalism. They say that CAR is a technique for gathering and analyzing data as a way of enhancing (usually investigative) reportage, whereas data journalism pays attention to the way that data
sits within the whole journalistic workflow. In this sense, data journalism pays equal attention to finding stories and to the data itself. Hence, we find the Guardian Datablog
or the Texas Tribune publishing datasets
alongside stories, or even just datasets by themselves for people to
analyze and explore.

Another difference is that in the past, investigative reporters
would suffer from a poverty of information relating to a question they
were trying to answer or an issue that they were trying to address.
While this is, of course, still the case, there is also an overwhelming
abundance of information that journalists don’t necessarily know what
to do with. They don’t know how to get value out of data. As Philip Meyer recently wrote to
me: “When information was scarce, most of our efforts were devoted to
hunting and gathering. Now that information is abundant, processing is
more important.”

On the other hand, some argue that there is no difference between
data journalism and computer-assisted reporting. It is by now common
sense that even the most recent media practices have histories as
well as something new in them. Rather than debating whether or not
data journalism is completely novel, a more fruitful position would be
to consider it as part of a longer tradition but responding to new
circumstances and conditions. Even if there might not be a difference
in goals and techniques, the emergence of the label “data journalism”
at the beginning of the century indicates a new phase wherein the
sheer volume of data that is freely available online combined with
sophisticated user-centric tools enables more people to work with more
data more easily than ever before. Data journalism is about mass data

What does data journalism mean for the future of journalism? Are
there new business models here?

Liliana Bounegru: There are all kinds of
interesting new business models emerging with data journalism. Media
companies are becoming increasingly innovative with the way they
produce revenues, moving away from subscription-based models and
advertising to offering consultancy services, as in the case of the
German award-winning OpenDataCity.

Digital technologies and the web are fundamentally changing the way
we do journalism. Data journalism is one part in the ecosystem of
tools and practices that have sprung up around data sites and
services. Quoting and sharing source materials (structured data) is in
the nature of the hyperlink structure of the web and in the way we are
accustomed to navigating information today. By enabling anyone to
drill down into data sources and find information that is relevant to
them as individuals or to their community, as well as to do fact
checking, data journalism provides a much needed service coming from a
trustworthy source. Quoting and linking to data sources is specific
to data journalism at the moment, but seamless integration of data into
the fabric of media is increasingly the direction journalism is heading.
As Tim Berners-Lee says, “data-driven journalism is the future.”

What data-driven journalism initiatives have caught your attention?

Liliana Bounegru: The data journalism project is one of my
favorites. It addresses a real problem: The European Union (EU) is
spending 48% of its budget on agriculture subsidies, yet the money
doesn’t reach those who need it.

Tracking payments and recipients of agriculture subsidies from the
European Union to all member states is a difficult task. The data is
scattered in different places in different formats, with some missing
and some scanned in from paper records. It is hard to piece it
together to form a comprehensive picture of how funds are distributed.
The project not only made the data available to anyone in an
easy-to-understand way, but it also advocated for policy changes and better
transparency laws.

Another of my favorite examples is the LRA Crisis Tracker, a
real-time crisis mapping platform and data collection system. The
tracker makes information about the attacks and movements of the Lord’s
Resistance Army (LRA) in Africa publicly available. It helps to inform
local communities, as well as the organizations that support
the affected communities, about the activities of the LRA through an
early-warning radio network in order to reduce their response time to

I am also a big fan of much of the work done by the Guardian Datablog.
You can find lots of other examples featured on,
along with interviews, case studies and tutorials.

I’ve talked to people like Chicago Tribune news app developer
Brian Boyer about the emerging “newsroom
.” What do you feel are the key tools of the data

Liliana Bounegru: Experienced data journalists
list spreadsheets as a top data journalism tool. Open source tools and
web-based applications for data cleaning, analysis and visualization
play very important roles in finding and presenting data stories. I
have been involved in organizing several workshops on ScraperWiki and Google Refine for
data collection and analysis. We found that participants were quite
able to quickly ask and answer new kinds of questions with these

How does data journalism relate to open data and open government?

Liliana Bounegru: Open government data means that
more people can access and reuse official information published by
government bodies. This in itself is not enough. It is increasingly
important that journalists can keep up and are equipped with skills
and resources to understand open government data. Journalists need to
know what official data means, what it says and what it leaves out.
They need to know what kind of picture is being presented of an

Public bodies are very experienced in presenting data to the public
in support of official policies and practices. Journalists, however,
will often not have this level of literacy. Only by equipping
journalists with the skills to use data more effectively can we break
the current asymmetry, where our understanding of the information that
matters is mediated by governments, companies and other experts. In a
nutshell, open data advocates push for more data, and data journalists
help the public to use, explore and evaluate it.

This interview has been edited and condensed for clarity.

Photo on associated home and category pages: NYTimes: 365/360 – 1984 (in color) by blprnt_van, on Flickr.



  • gregorylent

    no way do i want journalists to “curate, verify, analyze and synthesize the wash of data.”

    they aren’t conscious enough, they are beholden to the status quo, they are part of the problem!


    pay attention to the self and the stuff will line up as needed.

    you really need to talk to some yogis, people who know about awareness, consciousness, and the mind.

    as to big data, consciousness is still the best tool, unless you like life by algorithm.

    big data has value, but that stops the moment you outsource your own intuition.

    and it don’t mean neuroscientists, they are on the outside looking in just like journalists, but desperate to be thought of as in.

  • Another noted difference between CAR and DJ beyond the sheer volumes of data and “user-centric tools” available today is access to the online discussions occurring between people on the data that offer diverse perspectives and POVs that shape and influence a journalist’s interpretation of the data.

    gregorylent – I hear your concern; however, “somebody” will interpret and publish the data. There is a wide chasm between government and citizens. If DJs can provide some third-party “mediation,” or help the public take on that role by prepping the data (e.g., curating, verifying and analyzing it), then perhaps we’re a step closer to the open dialog that must accompany open data.

  • @Daniel: Good point; the open, ongoing dialogue on social media platforms can add more context and sentiment to the mix.

    @Gregory: “no way do i want journalists to ‘curate, verify, analyze and synthesize the wash of data.’”

    I suppose that means you would prefer that I — along with thousands of other people working in new media — stopped editing and sharing journalism on Twitter, Facebook, newspapers and blogs? Because that’s precisely what’s been happening for many, many decades. With new tools and technologies come new terms — but the practice is hardly new. You are, of course, welcome to the perspective that you don’t want journalists to be infomediaries but my sense is that the 4th estate is a vital part of a modern democratic state — and indeed newspapers were hailed by Thomas Jefferson and other founding fathers in that context.

    >”they aren’t conscious enough, they are beholden to the status quo, they are part of the problem!”

    While some journalists may indeed not be fully “conscious,” particularly after working 14-hour days, I think many others — including this writer — try quite hard to keep their eyes, ears and mind open to new ideas. Curiosity is an important quality.

    >”pay attention to the self and the stuff will line up as needed. you really need to talk to some yogis, people who know about awareness, consciousness, and the mind.”

    My father teaches meditation, has written extensively and is a lay ordained Buddhist minister. I grew up as a Quaker, going to silent meeting. I practice mindful meditation in many contexts. I assure you, Gregory, that I “really have” talked to people who know about such things.

    >”as to big data, consciousness is still the best tool, unless you like life by algorithm. big data has value, but that stops the moment you outsource your own intuition.”

    Being “conscious” will not help a data scientist who needs to gain insight from petabytes of, say, genome or customer or financial data. The formulation that Robert Kirkpatrick of UN Global Pulse recently used is that applying big data for good requires the power of algorithms, the wisdom of crowds and the instinct of experts. A good doctor, then, might make data-driven decisions, informed by the experiences of e-patients and research, but will also rely on intuition.

    >”and it don’t mean neuroscientists, they are on the outside looking in just like journalists, but desperate to be thought of as in”

    I’m not sure how many neuroscientists you’ve met. The small sample I’ve talked to, however, have not met the criteria for being “desperate” for anything, save perhaps for better tools, time and insight into the cause of diseases that were debilitating or killing friends, family and patients.

  • Thanks for the share. This is a useful and sensible article. I will share this to my colleagues.