Science at the speed of light

Featured Strata Community Profile on Analytics Manager Kim Stedman

When Kim Stedman starts talking about the science of asking questions, I am all ears. As a reporter, I make a living asking questions. She goes on to explain the potential of data science to nudge us all in the direction of thinking about whether we could be asking better questions or making better use of the answers.

Photo of Kim Stedman provided courtesy of Jenny Jimenez

Stedman is an Analytics Manager at Meteor Entertainment, and hers is the first in a series of featured profiles of the Strata community we’ll be publishing. A self-described “data geek,” Stedman arrived at Meteor, which produces the free-to-play game “Hawken,” after spending years doing field work in South America and Africa. Observing the way humans behave across cultures led to an interest in parsing the ways we behave online. Essentially, Stedman says, data science is “social science at the speed of light.”

She took some time out recently to talk with me about collaboration, the tools she uses, and the future of data science.

What are your day-to-day responsibilities at Meteor Entertainment?

Kim Stedman: With free-to-play games, you are only able to monetize if you can keep people engaged in the game, and that means staying constantly dialed in to the zeitgeist of your user base. Data science is one of the primary ways to do something like that.

On a day-to-day basis, my job is to explore the data that we have. My emphasis is on gameplay. So I look at the way people are interacting within the game, I look at retention and matchmaking, and I look at whether the game is challenging and fun and sticky for people. If it isn’t, I try to identify factors that could be related to a negative experience of the game. There’s no point in having a game that’s no fun!

We just went into open beta, so we have a large number of people playing the game for the first time. When I came to the company, it was before we released; we spent a lot of time sizing up and ramping up, and testing the platform and the process we were going to use to capture data and ingest it. We are now there.

What tools are you using?

Kim Stedman: Our data pipeline is based on saving our data into gigantic JSON-formatted log files, which we then ingest using Pig (we’re on a Hadoop structure) into a structured SQL database. We do our analysis primarily out of that SQL database and we do a lot of Tableau visualizations. Tableau talks to SQL; it doesn’t talk to unstructured data.

So I would say that the Tableau/SQL conversation is the primary way we do analysis. Sometimes we’ll break into the raw log files if there’s something we haven’t ingested in the pipeline.
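For readers who want a concrete picture of that log-to-SQL step, here is a minimal sketch. It is not Meteor's pipeline: Python stands in for Pig, an in-memory SQLite database stands in for the SQL database, and the log format (one JSON object per line) along with the field names `player_id`, `event`, and `ts` are invented for illustration.

```python
import json
import sqlite3

# Hypothetical raw log: one JSON object per line. The field names are
# assumptions made for this sketch, not taken from the interview.
RAW_LOG = """\
{"player_id": "p1", "event": "match_start", "ts": 1356998400}
{"player_id": "p1", "event": "match_end", "ts": 1356999000}
{"player_id": "p2", "event": "match_start", "ts": 1356998460}
not valid json
"""


def ingest(lines, conn):
    """Load JSON log lines into a structured table, skipping bad lines."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(player_id TEXT, event TEXT, ts INTEGER)"
    )
    loaded = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed line: leave it in the raw log only
        if not isinstance(record, dict):
            continue  # valid JSON, but not an event object
        conn.execute(
            "INSERT INTO events VALUES (?, ?, ?)",
            (record.get("player_id"), record.get("event"), record.get("ts")),
        )
        loaded += 1
    conn.commit()
    return loaded


conn = sqlite3.connect(":memory:")
loaded = ingest(RAW_LOG.splitlines(), conn)

# Once the data is structured, analysis is an ordinary SQL query.
rows = conn.execute(
    "SELECT player_id, COUNT(*) FROM events "
    "GROUP BY player_id ORDER BY player_id"
).fetchall()
print(loaded, rows)
```

The point of the structured table is the last query: a visualization tool that speaks SQL can work against `events` directly, while anything that was never ingested still requires going back to the raw log.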

What is your background, and how did you arrive in the field of data science?

Kim Stedman: I started out as a field anthropologist. I’ve spent a lot of my life in the developing world, doing field studies in Bolivia and the Ivory Coast in Africa. I’m fascinated by social systems and the way that we use reputation systems, incentives and disincentives to create collaborative behavior. I found that sociology did not move at the pace I wanted it to.

After school, I wound up doing qualitative research in the tech industry. I then discovered web analytics and knew that I was at the very beginning of something, but I didn’t know data science existed. I just had an instinct that there was a ton of potential to being able to mine data to see how people behaved on a website.

When I began networking and talking to people about data science, I sort of fell off a cliff into it! I was so exhilarated to discover that data science is the science of asking questions, usually about people, as quickly as you can and as rigorously as you can.

You attended Strata + Hadoop World in NYC for the first time last year. What were your impressions?

Kim Stedman: When I discovered Strata, I felt that I was swimming in the primordial soup of a new highly interdisciplinary discipline. I can see that it’s just beginning to codify itself and to create structure for itself, and I’m really excited to be at the very beginning of that movement.

At first I was dizzied by the scope of the discussion. The technology gets very deep. The analytical techniques go quite far. You can be anyone — from a person like me who is just learning, to someone who has a PhD in machine learning. The difference in expertise is great, and people are coming from all kinds of different spaces with different research agendas. I found that both dizzying and fascinating. Strata helped me to develop a vocabulary for what I was getting into. It helped me to see where the horizons were, how I could grow in a highly technical direction or machine learning direction, and how to map out the space in front of me.

Once I began to understand that most of the people I was talking to had a real variety of backgrounds, I came to realize that the initially overwhelming space that I had seen was also a welcoming one that was looking for a wide variety of skill sets.

I feel that, in our generation, all of the next great innovations will come from interdisciplinary spaces.

Speaking of interdisciplinary spaces, how important is collaboration to the field of data science, and what are the challenges of collaborating across disciplines?

Kim Stedman: In data science, we’re all such curiosity-driven, geeky, passionate people, and we all have perspectives on the world. One of the things I’ve become more aware of as I’ve gotten involved in data science is the subjectivity of the question. I ask questions, and my approach is entirely based on my preconceived ideas of how it might go, or what the research space is defined by. The person who is sitting next to me will define and approach it in a completely different way, and will also establish the parameters or the limitations of the search space in an entirely different way. I describe data science as the science of asking meaningful questions, and we all have a different value that we bring to asking those resonant questions. None of us can be the gatekeeper.

How do you see the discipline of data science evolving or changing in the future?

Kim Stedman: I would like to see data science move in a direction where the work we are doing is integrated into our daily lives, and not a curious add-on to a business that has been running without us for a long time. We would instead be an organ in the organism, a fundamental part, and if we failed, the decision-making process would be disrupted. I think it’s going to take a while for data science to be integrated in that way, but I think it’s up to us — data scientists — to get us there.

This interview was edited and condensed.

Strata Conference Santa Clara, being held Feb. 26-28, 2013 in California, gives you the skills, tools, and technologies you need to make data work today.