"data journalism" entries

Making corrections to data stories, a Brazilian hackday, and the ‘truth’ about big data and agnostic storytelling.

A few weeks ago in this space, I wrote about efforts to create a corrections policy for data journalists. As it turns out, the Toronto Star needed such a policy sooner rather than later: a summer intern pitched and created a project featuring a searchable database of banned license plates, which included material from another Star reporter’s article published three years earlier. The Star published a public editor’s note about the issue. Does the problem of plagiarism become more complicated when it involves previously reported data?

For those who are interested in the field of data journalism but unsure of where to start, The Data Journalism Heist offers a quick introduction. The e-book’s tagline: How to get in, get the data, and get the story out – and make sure nobody gets hurt.

Read more…

Finding data after the shutdown, workarounds for reporters, and teaching a journalism MOOC.

When a government shutdown renders government data websites useless, what’s a data journalist to do? This week, reporters hoping to gather data from sites like the US Census Bureau, the USDA’s Food and Nutrition Service, and the Bureau of Economic Analysis were out of luck, as the shutdown blocked access to most online government data.

The Pew Research Center offered a mostly comprehensive list of the data casualties of the shutdown.

Read more…

Ethics in data journalism, the Geojournalism Handbook, and the very first Nate Silver.

As is the case with practitioners of most emerging and rapidly expanding fields, data journalists are finding it increasingly necessary to generate a code of sorts to deal with ethical issues and problems. In The Times Regrets the Programmer Error, a newsroom developer at the New York Times asks whether it’s time to create a detailed and explicit corrections policy for data.

And Paul Bradshaw of Birmingham City University imagines what a code of ethics for data journalism would look like. Ethical guidelines are necessary because of the sheer volume of data available in public databases, he says.

Finalists for the Gannett Award for Technical Innovation in Digital Journalism were announced by the Online News Association this week. They were the data visualization tool D3.js; Quartz, a digitally native news site for business people; and Tarbell, a content management system created by the Chicago Tribune News Applications Team (and named after muckraking journalist Ida Tarbell).

Read more…

Data journalism’s ‘secret weapon’, data newswires, and the newest data-scraping tools for journalists.

When investigative reporter and journalism instructor Chad Skelton needed help writing a curriculum for a data journalism course, he turned to NICAR-L, the email listserv for the National Institute for Computer-Assisted Reporting, for advice. Skelton says that virtually every data journalist in North America is plugged in to the NICAR listserv, making it data journalism’s “secret weapon.”

In 5 tips for a data journalism workflow, the Online Journalism Blog advises newsrooms to find and tap into “data newswires” in the same way newsrooms have used traditional newswires like AP and Reuters.

Read more…

Latin America’s Media Party, Tow/Knight Research Projects, and why necessity is the mother of invention.

Fun fact: Over the last two years, the Buenos Aires delegation of Hacks/Hackers has grown to be the second-largest chapter in the world, with more than 2,200 members. (New York City is the largest.) This weekend, the city hosts the second annual Media Party, one of the biggest events in the Americas for newsroom programmers and data journalists. Featured guests include NPR news apps editor Brian Boyer; Jacqui Maher, assistant editor for interactive news at The New York Times; and Media Factory, Latin America’s first venture capital fund for emerging news organizations.

In fact, developing countries around the world have been hosting a brand new crop of data journalism initiatives. Most recently, the Canada-based nonprofit Journalists for Human Rights collaborated with media outlets in Ghana to report several data-driven stories, like one that examined the frequency of paying bribes in Ghana. The Data Driven Journalism blog reported on the progress of the Kenya Open Data Initiative (KODI), two years after its launch. And 12 reporters from seven nations are learning data journalism as part of ‘Flag It’, a training course designed by Ecolab in partnership with the European Youth Press.

Read more…

Understanding education data, A/B testing in the newsroom, and ProPublica’s sister site in Thailand.

The 2013 Excellence in Journalism conference kicks off this Friday in Anaheim, California. Sessions of interest to data journalists include best practices for pulling diversity data from census figures, journalists and coding, and storytelling with Google Maps, which provides an introduction to Google Earth and Google Fusion Tables. SPJ’s Journalist’s Toolbox (@journtoolbox) will be tweeting live from several of the sessions for those who can’t attend.

The annual PDK/Gallup education poll was published on Wednesday, making this a busy week for data journalists on the education beat. Each year, the poll provides grist for education reporters looking to glean insights about the nation’s public schools. But the Educated Reporter, a blog of the Education Writers Association, warns data journalists to proceed with caution when using polling data in education reporting.

Read more…

Citizen Bezos, ultralight content management systems, and byline analysis at the New York Times.

If last week’s news belonged to Nate Silver and his transformation of journalism, this week belongs to Jeff Bezos, and the hopeful speculation from many corners about his ability to revamp newspapers’ struggling business model. Slate concludes that If Anyone Can Save the Washington Post, It’s Jeff Bezos, pointing to his uncanny ability to find “new ways of selling old things.” Blogger Walter Russell Mead says that Bezos and the Post are part of a larger trend towards the marriage of tech and state. The Street says Bezos is not saving journalism, he’s saving Amazon, contending that the purchase is simply a power grab to preserve Amazon’s media and retail dominance. Meanwhile, Mashable calls Bezos Journalism’s New Best Friend.

A third installment of the TechRaking conference series produced by The Center for Investigative Reporting began on Wednesday. TechRaking III, “Mining the News,” is an invite-only event for journalists and data professionals, co-hosted by Google. Visual.ly will donate $10,000 in development time to help produce the winning project.

Read more…

Data science in the public interest, ‘digital media data gurus’, and a comic about dirty data.

Insights and links from the data journalism beat

Data science in the public interest is en vogue, as collaborations between data scientists, nonprofits and human rights groups are springing up everywhere. Journalists at the Knight Foundation are following suit. This week, the foundation gave details about its $2 million Knight News Challenge for health-related data projects. The “inspiration phase” launching next month invites citizens, journalists, and community groups anywhere in the world to dream up ideas about how to turn public data sets into useful information that could improve the health of communities.

Over at the Nieman Journalism Lab, a journalism professor writes that we are now entering the age of the “Digital Media Data Guru,” a person with a hybrid of computer science and journalism skills who is able to “do it all” in the newsroom, and recommends that journalism schools prepare students for the data-centered work ahead of them.

Read more…

Nate Silver’s big move, tips for journalists at hackathons, and the limitations of polling Twitter

Tidbits from the data journalism beat

The big news in data journalism this week was Nate Silver’s announcement that he’s leaving the New York Times and taking his FiveThirtyEight franchise to ESPN. The chatterati immediately weighed in: TIME credits Nate Silver with elevating data journalism to the level of “real reporting”, The Washington Post says that his genius lies in journalism, not math, and Salon asks whether Silver will be able to predict Oscar winners in the same way he predicted a presidential campaign.

The news app editors at ProPublica have developed another digital tool for your data journalism kit. Upton is a new open-source web scraping framework that makes web scraping easier by providing reusable components. (And it’s named after the great muckraking journalist Upton Sinclair!)

Read more…

Protecting US reporters’ records, data mining tools, and congressional acronym abuse

Notes and links from the data journalism beat

New data journalism tools seem to be released every day. The latest include CivOmega, a modular prototype for government data that allows developers to plug in their own APIs, and Fact Tank, a new data journalism platform from the Pew Research Center. Also, for journalists in the US concerned about protecting their own personal data, government investigators now face more hurdles when seeking a reporter’s records. And for a little data journalism levity, check out the latest project from Noah Veltman, a data journalism fellow at the BBC. Veltman used the GovTrack bulk data API, SQL and Python to conduct a self-described “overly in-depth analysis” of Congressional Acronym Abuse from 1973 to the present.

Your links for the week:

  • The alpha of CivOmega: A hack-day tool to parse civic data and tell you more about Beyoncé’s travels (Nieman Lab)
    The idea of “a Siri or Wolfram Alpha for government data” – something that can connect natural language queries with multifaceted datasets – had been kicking around in the mind of MIT Media Lab and Knight-Mozilla veteran Dan Schultz ever since a Knight Foundation-sponsored election-year brainstorming session in 2011.
  • Introducing Fact Tank: An Interview with Pew Research Center President Alan Murray (Data Driven Journalism)
    “Obviously, we collect vast amounts of data, about demographics, about a variety of issues – we are basically a data shop. In the past, most of the dissemination of our data has been done through existing media. But we also felt it was important for us to get our own data relating to news events out to the public more quickly and more directly. Additionally, we also felt it was important for us to play a role in aggregating data sets which we can then present ourselves.”
  • Read more…