A new twist on "data-driven site"

How a billion points of app data shape TripAdvisor's website.

The distinction between platform applications and websites doesn’t need to be as defined as it often is. Companies simply choose to develop apps and sites as separate products.

TripAdvisor is pursuing a different model. The company has an established travel website and a popular Facebook app, but it’s gone a step further and constructed an umbilical between these platforms. Data from the app shapes the site.

Sanjay Vakil, technical manager for apps at TripAdvisor, discusses the inner workings of this mutually beneficial relationship in the following Q&A.

What is Trip Friends? How is it connected to Facebook?

Sanjay Vakil: TripAdvisor has 35 million reviews of various locations, but the reviews are fundamentally anonymous. What Trip Friends attempts to do is to pull together information from your social graph and identify friends who can tell you about the places you’re researching.

So as an example, I can navigate to the Los Angeles page on TripAdvisor. That page will show me people that I know who live there and people who have visited there. I can ask these people questions, and responses are curated on my Facebook wall.

What we’re actually doing here is pulling data from Cities I’ve Visited, which is a Facebook application that we’ve had for about three years. When people come to Cities I’ve Visited, they put pins in a map. We have over a billion pins’ worth of data. Data from Cities I’ve Visited is augmented with Facebook information users have opted to share publicly. That combination powers Trip Friends.

Given how widely Cities I’ve Visited is used, a typical Facebook user has between eight and 10 friends who are using the application.

Screenshot of TripAdvisor's Trip Friends functionality
TripAdvisor’s Trip Friends tool (top right corner) hooks into Facebook’s social graph.

Screenshot of Cities I've Visited Facebook application
Data from the Cities I’ve Visited application (above) is augmented with public Facebook data. That combination powers Trip Friends.

It sounds like you’re dealing mostly with structured data. Is that the case?

SV: The information that’s going on the Cities I’ve Visited map is entirely on the TripAdvisor side. We’ve worked hard to make that specifically structured.

Facebook is actually going through a process now of making their location information more structured. If you looked at Facebook three years ago, when we were starting this stuff out, the “current location” and “hometown” fields were just free text entry. You’d see entries like “Bat Cave” in the current location field.

That’s not helpful to us if we want to generate a lat-long from it. But recently — and I’m guessing this was part of the work involved in Facebook Places — Facebook went through and structured that information. Now, if you update your Facebook profile and type in a current city, you have to select from an auto-completed list. That’s helpful to us.

How do you manage the data?

SV: Some of this predates me, but when Cities I’ve Visited was first taking off, they were watching the users ramp up exponentially and trying to buy computers fast enough to keep up with them. During the nine months I’ve been here, Cities I’ve Visited has more than doubled its usage base.

We’re tackling this growth in a number of ways. One is, internally, we spend a lot of time minimizing the number of hits we make to our database. We have a large Memcached cluster that all of this stuff runs against. We also took our single database that was holding all the pin information and federated it across 12 different databases so we could keep up with the huge amount of data coming in.

Trip Friends is a very strategic product for us, so we’re building this infrastructure to deal with it.
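
As a rough illustration of the two techniques Vakil mentions (caching reads to spare the database, and federating pin data across 12 databases), here is a minimal Python sketch. It is not TripAdvisor's implementation: the `PinStore` class, the `shard_for` function, the hash-by-user-ID routing rule, and the in-memory dictionaries standing in for Memcached and for the databases are all assumptions made for the example. Only the count of 12 comes from the interview.

```python
import hashlib

NUM_SHARDS = 12  # the interview mentions 12 databases; all else here is hypothetical


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash (assumed routing rule)."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class PinStore:
    """Cache-aside reads in front of a set of in-memory 'shards'."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        # Each dict stands in for one federated database.
        self.shards = [dict() for _ in range(num_shards)]
        # This dict stands in for a Memcached cluster.
        self.cache = {}
        # Counts how often a read had to go to a shard instead of the cache.
        self.db_reads = 0

    def add_pin(self, user_id: int, city: str) -> None:
        """Write the pin to the user's shard and drop any stale cache entry."""
        shard = self.shards[shard_for(user_id, len(self.shards))]
        shard.setdefault(user_id, []).append(city)
        self.cache.pop(user_id, None)

    def get_pins(self, user_id: int) -> list:
        """Return the user's pins, consulting the cache before the shard."""
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        shard = self.shards[shard_for(user_id, len(self.shards))]
        pins = list(shard.get(user_id, []))
        self.cache[user_id] = pins
        return pins
```

In this sketch, repeated calls to `get_pins` for the same user touch a shard only once until that user adds another pin, which is the kind of database-hit reduction described above. A production system would also need cache expiry and a plan for moving data when the number of shards changes.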

Could Trip Friends tap into other platforms?

SV: It’s certainly possible. We’re not really in a position to go into much detail on it, but there are a lot of data sources out there.

The one thing I will say is that a lot of the information in the dataset we’re gathering is historical. People are pinning the last 25 years of their lives to the Cities I’ve Visited map. That’s really exciting to me. It’s a very broad swath with global coverage, as opposed to something that people are collecting over GPS.

What advice would you give to companies that are developing their own data products?

SV: One of the top problems to deal with when you’re gathering data is building it in a way where the data all maps down to a single set. You don’t want people having seven different copies of the same thing and having that information spread across seven different locations. For example, when someone says, “I’m visiting Boston,” do they actually mean Cambridge? Do they mean the greater Boston area? We’ve done a lot of mapping, so we know exactly what it is they’re talking about.

When you come in out of the blue, especially as a small player, it’s easy to just use Google’s database or just use Facebook’s database. The reality is none of the players — especially in the travel space — have a single database that is the de facto standard. You’re going to be doing a bunch of mapping back and forth, so you might as well go in with your eyes open.
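
The idea of mapping everything down to a single set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the canonical IDs, the alias table, and the `normalize` and `resolve_place` functions are invented for the example and do not reflect TripAdvisor's actual data or code. Whether "Boston" should resolve to the city or the greater metro area is exactly the kind of decision the alias table has to encode.

```python
# Hypothetical canonical place IDs; a real system would hold these in a database.
CANONICAL_PLACES = {
    "boston-ma": "Boston, Massachusetts",
    "cambridge-ma": "Cambridge, Massachusetts",
}

# Hypothetical alias table mapping cleaned-up user text to one canonical ID.
ALIASES = {
    "boston": "boston-ma",
    "boston, ma": "boston-ma",
    "beantown": "boston-ma",
    "cambridge": "cambridge-ma",
    "cambridge, ma": "cambridge-ma",
}


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def resolve_place(text: str):
    """Return the canonical place ID for free-text input, or None if unknown."""
    return ALIASES.get(normalize(text))
```

With this table, `resolve_place("  Boston,  MA ")` and `resolve_place("Beantown")` both return `"boston-ma"`, while an entry such as "Bat Cave" returns `None` and can be flagged for review instead of creating another copy of a place.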

Also, I think we tend to get caught up only looking forward. But there’s an opportunity to go in and gather information based on what a person has already done. Providing a mechanism around that jumpstarts the data collection.

Finally, companies that make pulling data out of people like pulling teeth are barking up the wrong tree. As a consumer, I want companies to make the data collection interesting to me and offer a clear benefit. That creates a virtuous feedback loop where people really like doing it. Getting people to use stuff is the hardest thing, so building something with a clear benefit helps a lot down the road.

This interview was edited and condensed.
