Profile of the Data Journalist: The Visualizer

Michelle Minkoff is adding context and clarity to data with interactive features at the Associated Press.

Around the globe, the bond between data and journalism is growing stronger. In an age of big data, the growing importance of data journalism lies in the ability of its practitioners to provide context and clarity and, perhaps most important, to find truth in the expanding amount of digital content in the world. In that context, data journalism has profound importance for society.

To learn more about the people who are doing this work and, in some cases, building the newsroom stack for the 21st century, I conducted a series of email interviews during the 2012 NICAR Conference.

Michelle Minkoff (@MichelleMinkoff) is an investigative developer/journalist based in Washington, D.C. Our interview follows.

Where do you work now? What is a day in your life like?

I am an Interactive Producer at the Associated Press’ Washington DC bureau, where I focus on news applications related to politics and the election, as well as general mapping for our interactives on the Web. While my days pretty much always involve sitting in front of a computer, the actual tasks themselves can vary wildly. I may be chatting with reporters and editors in politics, environment, education, national security or myriad other beats about upcoming stories and how to use data to support reporting or create interactive stories. I might be gathering data, reformatting it or crafting Web applications. I spend a great deal of time creating interactive mapping systems, working a lot with geographic data, and collaborating with cartographers, editors and designers to decide how best to display it.

I split my time between working closely with my colleagues in the Washington bureau on the reporting/editing side, and my fellow interactive team members, only one of whom is also in DC. Our team is global, headquartered in New York, but with members spanning the globe from Phoenix to Bangkok.

It’s a question of striking a balance between what needs to be done on daily deadlines for breaking news, longer-term stories, which are often investigative, and creating frameworks that help The Associated Press to make the most of the Web’s interactive nature in the long run.

How did you get started in data journalism? Did you get any special degrees or certificates?

I caught the bug when I took a computer-assisted reporting class from Derek Willis, a member of the New York Times’ Interactive News Team, at Northwestern’s journalism school where I was a grad student. I was fascinated by the role that technology could play in journalism for reporting and presentation, and very quickly got hooked. I also quickly discovered that I could lose track of hours playing with these tools, and that what came naturally to me was not as natural to others. I would spend days reporting for class, on and off Capitol Hill, and nights exchanging gchats with Derek and other data journalists he introduced me to. I started to understand SQL, advanced Excel, and fairly quickly thereafter, Python and Django.

I followed this up with an independent study in data visualization back at Medill’s Chicago campus, under Rich Gordon. I practiced making Django apps and played with the Processing visualization language. I voraciously read through all the Tufte books. As a final project, I created a package about the persistence of Chicago art galleries that encompasses text, Flash visualization and a searchable database.

I have a concentration in Interactive Journalism with my Medill master’s degree, but the courses mentioned above are only part of that concentration.

Did you have any mentors? Who? What were the most important resources they shared with you?

The question here is in the wrong tense. I currently “do” have many mentors, and I don’t know how I would do my job without what they’ve shared in the past, and in the present. Derek, mentioned above, was the first. He introduced me to his friend Matt [Waite], and then he told me there was a whole group of people doing this work at NICAR. Literally hundreds of people from that organization have helped me at various places on my journey, and I believe strongly in the mantra of “paying it forward” as they have — no one can know it all, so we pass on what we’ve learned, so more people can do even better work.

Other key folks I’ve had the privilege to work with include all the members of the Los Angeles Times’ Data Desk, which includes reporters, editors and Web developers. I worked most closely with Ben Welsh and Ken Schwencke, who answered many questions and were extremely encouraging when I was at the very beginning of my journey.

At my current job at The Associated Press, I’m lucky to have teammates who mentor me in design, mapping and various Washington-based beats. Each is helpful in his or her own way.

Special attention deserves to be called to Jonathan Stray, who’s my official boss, but also a fantastic mentor who enables me to do what I do. He’s helping me to learn the appropriate technical skills to execute what I see in my head, as well as learn how to learn. He’s not just teaching me the answers to the problems we encounter in our daily work, but also helping me learn how to better solve them, and work this whole “thing I do” into a sustainable career path. And all with more patience than I have for myself.

What does your personal data journalism “stack” look like? What tools could you not live without?

No matter how advanced our tools get, I always find myself coming back to Excel first to do simple work. It helps us get an overall handle on a data set. I also will often quickly bring data into SQLite, using a Firefox extension that allows a user to run SQL queries with no database setup. I’m more comfortable asking complicated questions of data that way. I also like to use Google’s Chart Tools to create quick visualizations for myself to better understand a story.
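As a minimal sketch of the load-then-query workflow described above, the Python snippet below uses the standard library's sqlite3 module with an in-memory database, so no database setup is needed. It is not the Firefox extension mentioned in the interview, and the table name, column names and figures are made up purely for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a spreadsheet export.
# The population figures are invented for illustration only.
SAMPLE_CSV = """county,state,population
Cook,IL,5200000
Los Angeles,CA,9800000
Maricopa,AZ,3800000
San Diego,CA,3100000
"""


def load_csv_into_sqlite(csv_text):
    """Load CSV text into an in-memory SQLite table named 'counties'."""
    conn = sqlite3.connect(":memory:")  # lives only for this session
    conn.execute(
        "CREATE TABLE counties (county TEXT, state TEXT, population INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["county"], r["state"], int(r["population"])) for r in reader]
    conn.executemany("INSERT INTO counties VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def population_by_state(conn):
    """Return (state, total population) pairs, largest total first."""
    query = """
        SELECT state, SUM(population) AS total
        FROM counties
        GROUP BY state
        ORDER BY total DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = load_csv_into_sqlite(SAMPLE_CSV)
    for state, total in population_by_state(connection):
        print(state, total)
```

Once the data is in a table, questions that are awkward in a spreadsheet, such as grouping, joining or filtering on several conditions at once, become a single query.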

When it comes to presentation, since I’ve been doing a lot with mapping recently, I don’t know what I’d do without my favorite open source tools, TileMill and Leaflet. Building a map stack is hard work, but the work that others have done before has made it a lot easier.

If we consider programming languages to be tools (which I do), JavaScript is my new Swiss army knife. Prior to coming to the AP, I did a lot with Python and Django, but I’ve learned a lot about what I like to call “Really Hard JavaScript.” It’s not just about manipulating the colors of a background on a Web page, but parsing, analyzing and presenting data. When I need to do more complex work to manipulate data, I use a combination of Ruby and Python, depending on which has better tools for the job. For XML parsing, I like Ruby more. For simplifying geo data, I prefer Python.

What data journalism project are you the most proud of working on or creating?

That would be “Road to 270,” a project we did at the AP that allows users to test out hypothetical “what-if” scenarios for the national election, painting states to define to which candidate a state’s delegates could go. It combines demographic and past election data with the ability for users to make a choice and deeply engage with the interactive. It’s not just telling the user a story, but informing the user by allowing him or her to be part of the story. That, I believe, is when data journalism becomes its most compelling and informative.

It also uses some advanced technical mapping skills that were new to me. I greatly enjoyed the thrill of learning how to structure a complex application, and add new tools to my toolkit. Now, I don’t just have those new tools, but a better understanding of how to add other new tools.
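The core “what-if” idea behind a tool like this can be sketched in a few lines of Python: each painted state adds its votes to a candidate's total, and the first candidate to reach 270 of the 538 electoral votes wins. This is only an illustration and not the AP's implementation. The four regions and their vote counts below are hypothetical stand-ins for real states, and the function names are invented for this example.

```python
# Electoral votes needed to win a U.S. presidential election (of 538 total).
VOTES_TO_WIN = 270

# Hypothetical regions standing in for states. The counts are invented
# and chosen only so that they add up to 538.
ELECTORAL_VOTES = {"North": 150, "South": 140, "East": 128, "West": 120}


def tally(assignments, electoral_votes=ELECTORAL_VOTES):
    """Sum electoral votes per candidate for the states painted so far.

    `assignments` maps a state name to a candidate name. States that the
    user has not painted yet are simply absent from the mapping.
    """
    totals = {}
    for state, candidate in assignments.items():
        totals[candidate] = totals.get(candidate, 0) + electoral_votes[state]
    return totals


def winner(totals, threshold=VOTES_TO_WIN):
    """Return the candidate at or above the threshold, or None if nobody is."""
    for candidate, votes in totals.items():
        if votes >= threshold:
            return candidate
    return None


if __name__ == "__main__":
    # One scenario: the user paints three regions and leaves one undecided.
    scenario = {"North": "Candidate A", "South": "Candidate B", "East": "Candidate A"}
    totals = tally(scenario)
    print(totals)
    print(winner(totals))
```

Keeping the tally separate from the map drawing means the same function can be rerun every time a user repaints a state.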

Where do you turn to keep your skills updated or learn new things?

I look at other projects, both within the journalism industry and in general visualization communities. The Web inspector is my best friend. I’m always looking to see how people did things. I read blogs voraciously, and have a fairly robust Google Reader set of people whose work I follow closely. I also use lynda.com frequently (I tend to learn best from video tutorials). Hanging out on listservs for free tools I use (such as Leaflet), programming languages I care about (Python), or projects whose mission our work is related to (Sunlight Foundation) helps me engage with a community that cares about similar issues.

Help sites like Stack Overflow, and pretty much anything I can find on Google, are my other best friends. The not-so-secret secret of data journalism: we’re learning as we go. That’s part of what makes it so fun.

Really, the learning is not about paper or electronic resources. Like so much of journalism, this is best conquered, I argue, with persistence and stick-to-it-ness. I approach the process of data journalism and Web development as a beat. We attend key meetings. Instead of city council, it’s NICAR. We develop vast rolodexes. I know people who have myriad specialties and feel comfortable calling on them. In return, I help people all over the world with this sort of work whenever I can, because it’s that important. While we may work for competing places, we’re really working toward the same goal: improving the way we inform the public about what’s going on in our world. That knowledge matters a great deal.

Why are data journalism and “news apps” important, in the context of the contemporary digital environment for information?

More and more information is coming at us every day. The deluge is so vast that we need to not just say things are true, but prove those truths with verifiable facts. Data journalism allows for great specificity, and truths based in the scientific method. Using computers to commit data journalism allows us to process great amounts of information much more efficiently, and make the world more comprehensible to a user.

Also, while we are working with big data, often only a subset of that data is valuable to a specific user. Data journalism and Web development skills allow us to customize those subsets for our various users, such as by localizing a map. That helps us give a more relevant and useful experience to each individual we serve.

Perhaps most importantly, more and more information is digital, and is coming at us through the Internet. It simply makes sense to display that information in an environment similar to the one in which it’s provided. Information is dispensed in a different way now than it was five years ago. It will be totally different in another five years. So, our explanations of that environment should match. We must make the most of the Internet to tell our stories differently now than we did before, and differently than we will in the future.

Knowing things are constantly changing, being at the forefront of that change, and enabling the public to understand and participate in that change, is a large part of what makes data journalism so exciting and fundamentally essential.

This interview has been edited and condensed for clarity.
