Profile of the Data Journalist: The Storyteller and The Teacher

Sarah Cohen and Anthony DeBarros teach and show how to use data in the service of storytelling.

Around the globe, the bond between data and journalism is growing stronger. In an age of big data, the growing importance of data journalism lies in the ability of its practitioners to provide context and clarity and, perhaps most important, to find truth in the expanding amount of digital content in the world. In that context, data journalism has profound importance for society.

To learn more about the people who are doing this work and, in some cases, building the newsroom stack for the 21st century, I conducted in-person and email interviews during the 2012 NICAR Conference and published a series of data journalist profiles here at Radar.

Sarah Cohen (@sarahduke), the Knight professor of the practice of journalism and public policy at Duke University, and Anthony DeBarros (@AnthonyDB), the senior database editor at USA Today, were both important sources of historical perspective for my feature on how data journalism is evolving from “computer-assisted reporting” (CAR) to a powerful Web-enabled practice that uses cloud computing, machine learning and algorithms to make sense of unstructured data.

The latter halves of our interviews, which focused upon their personal and professional experience, follow.

What data journalism project are you the most proud of working on or creating?

DeBarros: “In 2006, my USA TODAY colleague Robert Davis and I built a database of 620 students killed on or near college campuses and mined it to show how freshmen were uniquely vulnerable. It was a heart-breaking but vitally important story to tell. We won the 2007 Missouri Lifestyle Journalism Awards for the piece, and followed it with an equally wrenching look at student deaths from fires.”

Cohen: “I’d have to say the Pulitzer-winning series on child deaths in DC, in which we documented that children were dying in predictable circumstances after key mistakes by people who knew that their
agencies had specific flaws that could let them fall through the cracks.

I liked working on the Post’s POTUS Tracker and Head Count. Those were Web projects that were geared at accumulating lots of little bits about Obama’s schedule and his appointees, respectively, that we could share with our readers while simultaneously building an important dataset for use down the road. Some of the Post’s Solyndra and related stories, I have heard, came partly from studying the president’s trips in POTUS Tracker.

There was one story, called “Misplaced Trust,” on DC’s guardianship
system, that created immediate change in Superior Court, which was
gratifying. “Harvesting Cash,” our 18-month project on farm subsidies, also helped point out important problems in that system.

The last one, I’ll note, is a piece of a project I worked on,
in which the DC water authority refused to release the results of a
massive lead testing effort, which in turn had shown widespread
contamination. We got the survey from a source, but it was on paper.

After scanning, parsing, and geocoding, we sent out a team of reporters to
neighborhoods to spot check the data, and also do some reporting on the
neighborhoods. We ended up with a story about people who didn’t know what
was near them.

We also had an interesting experience: the water
authority called our editor to complain that we were going to put all of
the addresses online. They felt that it was violating people’s privacy,
even though we weren’t identifying the owners or the residents. It was more
important to them that we keep people in the dark about their blocks. Our
editor at the time, Len Downie, said, “You’re right. We shouldn’t just put
it on the Web.” He also ordered up a special section to put them all in
print.”

Where do you turn to keep your skills updated or learn new things?

Cohen: “It’s actually a little harder now that I’m out of the newsroom,
surprisingly. Before, I would just dive into learning something when I’d
heard it was possible and I wanted to use it to get to a story. Now I’m
less driven, and I have to force myself a little more. I’m hoping to start
doing more reporting again soon, and that the Reporters’ Lab will help
there too.

Lately, I’ve been spending more time with people from other
disciplines to understand better what’s possible, like machine learning
and speech recognition at Carnegie Mellon and MIT, or natural language
processing at Stanford. I can’t DO them, but getting a chance to
understand what’s out there is useful. NewsFoo, SparkCamp and NICAR are
the three places that had the best bang this year. I wish I could have
gone to Strata, even if I didn’t understand it all.”

DeBarros: “For surveillance, I follow really smart people on Twitter and have several key Google Reader subscriptions.

To learn, I spend a lot of time training after work hours. I’ve really been pushing myself in the last couple of years to up my game and stay relevant, particularly by learning Python, Linux and web development. Then I bring it back to the office and use it for web scraping and app building.”

Why are data journalism and “news apps” important, in the context of the contemporary digital environment for information?

Cohen: “I think anything that gets more leverage out of fewer people is
important in this age, because fewer people are working full time holding
government accountable. The news apps help get more eyes on what the
government is doing by getting out more of what we work with and letting them see
it. I also think it helps with credibility — the ‘show your work’ ethos –
because it forces newsrooms to be more transparent with readers / viewers.

For instance, now, when I’m judging an investigative prize, I am quite
suspicious of any project that doesn’t let you see each item, i.e., when
they say, “there were 300 cases that followed this pattern,” I want to see
all 300 cases, or all cases with the 300 marked, so I can see whether I
agree.”

DeBarros: “They’re important because we’re living in a data-driven culture. A data-savvy journalist can use the Twitter API or a spreadsheet to find news as readily as he or she can use the telephone to call a source. Not only that, we serve many readers who are accustomed to dealing with data every day — accountants, educators, researchers, marketers. If we’re going to capture their attention, we need to speak the language of data with authority. And they are smart enough to know whether we’ve done our research correctly or not.

As for news apps, they’re important because — when done right — they can make large amounts of data easily understood and relevant to each person using them.”

These interviews were edited and condensed for clarity.
