Why I'm so excited about Spock

Note: Spock is among the companies launching at the Web 2.0 Expo on Monday.

Michael Arrington wrote the other day about Spock, the new people search engine, but I have to say that I don’t think he did it justice. Spock is really cool, and performs a unique function that is well outside the range of capabilities of current search engines. What’s more, it’s got a fabulous interface for harvesting user contributions to improve its results.

You can search for a specific person, but you can do that on Google. More importantly, you can search for a class of person, say politicians, or for people associated with a topic, say Ruby on Rails. The Spock robot automatically creates tags for any person it finds (it gathers information on people from Wikipedia and from social networking sites like LinkedIn and Facebook), but it also lets users add tags of their own and vote existing tags up or down to strengthen the associations between people and topics. Users can also identify relationships between people (friend, co-worker, etc.), upload pictures, and provide other types of information. This is definitely a site that will get better as more people use it, which is one of my key tests for Web 2.0. It also illustrates the heart of a new development paradigm: using programs to populate a database, and people to improve it.
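
To make the "programs populate, people improve" idea concrete, here is a minimal Python sketch of how tag voting could work. It is purely illustrative: the PeopleIndex class, its method names, and the simple net-vote scoring are my own assumptions, not a description of how Spock actually stores or ranks anything.

```python
from collections import defaultdict


class PeopleIndex:
    """Toy index (hypothetical): a crawler seeds tags, users vote on them."""

    def __init__(self):
        # scores[tag][person] -> net score for that person/tag association
        self.scores = defaultdict(lambda: defaultdict(int))

    def add_tag(self, person, tag, weight=1):
        # Called by a crawler, or by a user adding a tag of their own.
        self.scores[tag.lower()][person] += weight

    def vote(self, person, tag, up=True):
        # A vote strengthens or weakens an association that already exists.
        tag = tag.lower()
        if person not in self.scores.get(tag, {}):
            raise KeyError(f"{person!r} is not tagged {tag!r}")
        self.scores[tag][person] += 1 if up else -1

    def search(self, tag):
        # People with a positive net score, strongest association first;
        # ties are broken alphabetically so the order is deterministic.
        ranked = sorted(
            self.scores.get(tag.lower(), {}).items(),
            key=lambda item: (-item[1], item[0]),
        )
        return [person for person, score in ranked if score > 0]


# Example: the crawler seeds three tags, then users vote.
index = PeopleIndex()
for name in ("Eric Schmidt", "Larry Page", "John Battelle"):
    index.add_tag(name, "Google")
index.vote("Larry Page", "Google")
index.vote("John Battelle", "Google", up=False)
print(index.search("google"))
```

The point of the sketch is only that each vote nudges a score, so the ordering for a tag improves as more people use the site. A real system would presumably also weigh who is voting and guard against spam.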

Let’s start with a search for a specific person — say, Eric Schmidt.

[Screenshot: search for Eric Schmidt on Spock]

You’ll notice that there are 45 Eric Schmidts in total, and the number will grow as Spock expands its reach. Still, I’m pretty sure that most people would expect the CEO of Google to be the top-ranked result for “Eric Schmidt.” He’s top ranked on Google, too, but if you look at the Google search results page, you’ll see an important difference:

[Screenshot: search for Eric Schmidt on Google]

Here, because Eric’s an important guy, he dominates the search results. We don’t find an entry about a second Eric Schmidt, the professor of medicinal chemistry at the University of Utah, until the middle of the second page of search results, and I didn’t click through enough pages to find a third Eric Schmidt.

Disambiguating people and then collapsing multiple sources of information into a single entry (a process known as entity resolution) is part of the secret sauce of a people search engine. (More on that in a follow-up post, since Spock wants your help in making this aspect of their software even better.) Mechanisms for ranking people are also going to be critical.
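
For readers who haven't run into entity resolution before, here is a deliberately naive Python sketch of the idea. Everything in it is an assumption made for illustration: the record layout, the "affiliations" field, the "faculty_page" source name, and the rule that two records describe the same person when they share a name and at least one affiliation. I have no knowledge of how Spock does this, and real systems are far more sophisticated.

```python
def resolve(records):
    """Merge records that share a normalized name and at least one affiliation."""
    entities = []
    for rec in records:
        name = rec["name"].strip().lower()
        affiliations = set(rec["affiliations"])
        # Existing entities this record appears to describe.
        matches = [
            e for e in entities
            if e["name"] == name and e["affiliations"] & affiliations
        ]
        merged = {
            "name": name,
            "affiliations": affiliations,
            "sources": [rec["source"]],
        }
        # Collapse every matching entity into a single entry.
        for e in matches:
            merged["affiliations"] |= e["affiliations"]
            merged["sources"] = e["sources"] + merged["sources"]
            entities.remove(e)
        entities.append(merged)
    return entities


# Illustrative records only; the field values are made up for this example.
records = [
    {"source": "wikipedia", "name": "Eric Schmidt", "affiliations": ["Google"]},
    {"source": "linkedin", "name": "eric schmidt ", "affiliations": ["Google", "Novell"]},
    {"source": "faculty_page", "name": "Eric Schmidt", "affiliations": ["University of Utah"]},
]
for entity in resolve(records):
    print(entity["name"], sorted(entity["affiliations"]), entity["sources"])
```

Here the first two records collapse into one entry while the professor stays separate. The hard part in practice is that names, employers, and URLs are messy, so an exact-match rule like this one would both miss real matches and create false ones.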

Now obviously, we can find out more about Eric with a Google search, but Spock collects a very nice top-level summary in one place and, most importantly, helps you find the other people named Eric Schmidt who are not this particularly high-profile person.

There’s a reasonable amount of detail, including a picture, in the search results list, but clicking on Eric’s name shows even more detail on the information Spock has collected about him and also gives a chance for you, the user, to improve the information that’s already there:

[Screenshot: Eric Schmidt's detail page on Spock]

This is a pretty good summary of Eric’s vital statistics, including a Wikipedia widget, a picture, tags describing his career, links to web sites associated with him, “related people,” and so on.

But I notice a couple of things that are missing. The list of known web sites associated with Eric includes neither his personal home page nor the Google corporate information site, so I add links to both. I also see that he’s not tagged in association with Sun Microsystems, where he was formerly the CTO, or Novell, where he was the CEO. So I add these as tags. In the screen shot below, you can catch me in the act of tagging Eric with Sun Microsystems. The new web links have already been added.

[Screenshot: Eric Schmidt's entry after I've updated it]

Why, you might ask, will people go to the trouble of updating people’s pages on Spock? First off, individuals can claim their own page, and clearly have an interest in it. (It will be interesting to see how Spock balances people’s desire to manage their own image with the public data the search engine finds. It will also be very interesting to see how successfully they manage spamming of tags, websites associated with people, and other user-contributed data. They do allow users to vote information up or down, but that may or may not be enough. I’ll bet that entries on prominent people end up needing to be closed. There are also issues with the semantics of related people. I was able to add Larry and Sergey as co-workers, but is that really the right way to describe their relationship? As with tags, there’s a huge amount of room for nuance, disagreement, and outright error. This private beta of Spock exposes the tips of many icebergs, some of which have the power to sink one feature or another.)

Back to the question of motivation for user contribution: because of Spock’s tagging features, the engine will become a really useful tool for finding people at companies, in particular locations, or with common interests. Here, for example, is what I find if I click on the tag “Google” that is listed under Eric’s name:

[Screenshot: people associated with the Google tag on Spock]

Spock already has 1425 people associated with Google in one way or another — and I’ll bet a lot of them aren’t in LinkedIn or other social networks that require people to build out their own network. (Spock’s relevance ranking clearly has room for improvement, though. John Battelle is an important guy with key insights into Google, but I wouldn’t put him ahead of Larry Page!)

What really gets me excited is that I’m told that Spock plans to support private tags, so you can manage your own people information spaces. This will also have a powerful network effect, in that people will be motivated to upload their address books and other lists. How much more useful to me would a Spock-ified list of O’Reilly authors, or of our conference speakers, be than the simple database we now keep them in? In a lot of ways, my business is based on the ability to find the right person, the person who knows the most about a given topic and can write about it, or present about it at a conference, or point to other interesting people. It’s also based on keeping track of people. When we’re planning the invitation list for an event, we’re often poring over a spreadsheet and asking ourselves, who was that again? Spock pulls together a relevant summary for each person, making it a great outboard memory connecting names, faces, topics and companies.

What’s more, Spock is still in its infancy. It has only a fraction of the people it will have once it gets out of private beta, and only a fraction of the features. This is definitely a product and a company to watch.

(It’s also a lot of fun, but that’s a subject for another post, once the product is live and it won’t just be a tease to talk about all the cool things you can do with it!)