Dating with data

OkCupid CEO Sam Yagan on how data shapes the dating business.

OkCupid is a free dating site with seven million users. The site’s blog, OkTrends, mines data from those users to tackle important subjects like “The case for an older woman” and “The REAL ‘stuff white people like’.”

Beyond clever headlines, OkCupid also uses an unusual pedigree to separate itself from the dating site pack: The business was founded by four Harvard-educated mathematicians.

“It probably scared people when they first heard that four math majors were starting a dating site,” said CEO Sam Yagan during a recent interview. But the founders’ backgrounds greatly influenced how they approached the problem of dating.

“A lot of other dating sites are based on psychology,” Yagan said. “The fundamental premise of a site like eHarmony is that they know the answer. Our approach to dating isn’t that there’s some psychological theory that will be the answer to all your problems. We think that dating is a problem to be solved using data and analytics. There is no magic formula that can help everyone to find love. Instead, we bring value by building a decent-sized platform that allows people to provide information that helps us to customize a match algorithm to each person’s needs.”

OkCupid works by having users state basic preferences and answer questions like “Is it wrong to spank a child who’s been bad?” Users are matched based on how much their answers overlap and how important each question is to both users.
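To make the idea of importance-weighted overlap concrete, here is a minimal Python sketch. It is not OkCupid's actual algorithm, which the article does not describe in detail. The importance weights, the data layout, the function names `satisfaction` and `match_score`, and the choice to combine the two directions with a geometric mean are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical importance weights; the article gives no real values.
IMPORTANCE = {"irrelevant": 0, "somewhat": 1, "very": 10}


def satisfaction(rater, other):
    """Weighted share of the rater's questions that the other person answers acceptably."""
    earned = possible = 0
    for question, (accepted, importance) in rater["wants"].items():
        if question not in other["answers"]:
            continue  # skip questions the other person has not answered
        weight = IMPORTANCE[importance]
        possible += weight
        if other["answers"][question] in accepted:
            earned += weight
    return earned / possible if possible else 0.0


def match_score(a, b):
    """Combine both directions so that one-sided enthusiasm scores low (assumed design)."""
    return sqrt(satisfaction(a, b) * satisfaction(b, a))


# Illustrative users, invented for this sketch.
alice = {
    "answers": {"spank": "no", "smoke": "no"},
    "wants": {"spank": ({"no"}, "very"), "smoke": ({"no", "yes"}, "somewhat")},
}
bob = {
    "answers": {"spank": "no", "smoke": "yes"},
    "wants": {"spank": ({"no"}, "somewhat"), "smoke": ({"yes"}, "very")},
}

print(satisfaction(alice, bob))  # Bob satisfies everything Alice cares about
print(satisfaction(bob, alice))  # Alice misses the question Bob weights most
print(match_score(alice, bob))
```

In this toy case Bob fully satisfies Alice's preferences, but Alice gives the "wrong" answer to the one question Bob marks as very important, so the combined score is low even though one direction is perfect.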

Yagan said data was built into the business model from the beginning. “We knew from the time we started the company that the data we were generating would have three purposes: helping us match people up, attracting advertisers since that was the core of our revenue model, and that the data would also be interesting socially.”

In 2007, the company hired a PR firm to publicize some of its findings, such as the fact that when gas prices rise, users narrow the search radius for matches. “We called dozens of reporters and nobody cared,” Yagan said. So OkCupid fired the PR firm and started publishing its findings on the OkTrends blog. The blog has thus far doubled traffic to the site.

“The blog is partly an advice column, but instead of being written by a psychologist, the data writes itself,” Yagan said. “For example, we don’t tell you that you should or should not use a flash for your profile photo. We just tell you that if you use a flash you’ll look seven years older.”


I asked Yagan about the data on which OkTrends draws. “We have people’s registration data,” he said. “Then we have stated preferences; the answers that people give to the questions we ask them. We use that kind of data occasionally, but it’s not the core difference that we have. The core difference is in the category of revealed preferences. Imagine if you had a video camera in every bar and you could observe every interaction between two people and see the success rate of that interaction. We essentially have that video camera on our site.”

The reason revealed preferences are so important is that they track real-world behavior — what people really want rather than what they say they want. “When you get 12 messages and you only reply to three of them, you are voting with your time,” Yagan said. “Or when a guy is shorter than you, you don’t reply.”
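A revealed preference of this kind reduces to simple arithmetic on behavior logs. The sketch below uses Yagan's own example of 12 messages and three replies; the `reply_rate` function and the message record layout are hypothetical and say nothing about how OkCupid stores or queries its data.

```python
def reply_rate(messages):
    """Fraction of received messages that got a reply."""
    if not messages:
        return 0.0
    return sum(1 for m in messages if m["replied"]) / len(messages)


# Yagan's example: 12 messages received, 3 of them answered.
inbox = [{"replied": i < 3} for i in range(12)]
print(reply_rate(inbox))  # 0.25
```

Grouping the same calculation by a sender attribute, such as height relative to the recipient, would show which attributes people act on, as opposed to which ones they say matter.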

Mobile adds a new revealed preferences dimension for OkCupid. “As our product gets more mobile and location-aware, we are more likely to be on that date with them,” Yagan said. “Then we can model the kinds of conversations on the site that lead to an in-person meeting.” OkCupid can currently track the five million messages sent every week on the site as well as other revealed preferences, like ratings of profiles.

According to Yagan, OkCupid doesn’t use sophisticated data mining or analytics tools: “Most of it can be done by querying the database and crunching numbers in Excel. The fact that we have four math majors and a full-time statistician means that we take that number crunching very seriously.”
