Fri, Oct 27, 2006

Tim O'Reilly

Search Startups Are Dead, Long Live Search Startups

Echoing Debra Chrapaty's comment that "in the future, being on someone's platform will mean being hosted on their infrastructure", Bill Burnham has an interesting entry on why search is dead as a category for VCs:

A couple of months ago I had the pleasure of moderating a panel at TIECon on the Search Industry. Peter Norvig, Google’s Director of Research, made one comment in particular that stood out in my mind at the time. In response to a question about the prospects for the myriad of search start-ups looking for funding, Peter basically said, and I am paraphrasing somewhat, that search start-ups in the vein of Google, Yahoo, Ask, etc. are dead. Not because search isn’t a great place to be or because they can’t create innovative technologies, but because the investment required to build and operate an Internet-scale, high performance crawling, indexing, and query serving farm was now so great that only the largest Internet companies had a chance of competing....
 

So where does this leave search start-ups? For many of them it will undoubtedly leave them in the Web 2.0 Deadpool, but some of them may thrive if they adapt to the new reality. This new reality is that search innovation will increasingly be about applications and not about the core infrastructure. In fact, there’s a good chance that much of the core infrastructure will be available as a service to search-based applications. Amazon is pioneering this “search as a service” with its opening up of the Alexa crawling and indexing APIs.

I came to that same conclusion several years ago, when I was on the board of Nutch, the open source search engine based on Lucene. Great software, but without the funding to take the operations to scale, it could never become more than a research platform.

In my talks on Web 2.0, I always end with the point that "a platform beats an application every time." We're entering the platform phase of Web 2.0, in which first generation applications are going to turn into platforms, followed by a stage in which the leaders use that platform strength to outperform their application rivals, eventually closing them out of the market. And that platform is not enforced by control over proprietary APIs, as it was in the Windows era, but by the operational infrastructure, and perhaps even more importantly, by the massive databases (with network effects creating increasing returns for the database leaders) that are at the heart of Web 2.0 platforms.

But as Bill notes, this doesn't mean the end of the application category. It just means that developers need to move up the stack, adding value on top of the new platform, rather than competing to become it.


tags: web 2.0

Comments: 21

  josiah [10.27.06 11:49 AM]

It's a good point that we're entering the platform phase. The only question I have is whether the network effects are real or imagined. Some of these massive databases contain dated information that is useless. At the end of the day what matters is the number of users creating current content. However, it's evident that these users are willing to abandon ship at startling speed. The obvious example is Friendster.

  Greg Linden [10.27.06 02:55 PM]

Good point, Tim. In fact, it isn't limited to startups. Even some of the big guys have decided they cannot compete on core search. For example, A9 was layered on top of Google and now is on top of MSN Search. AOL Search uses Google underneath. In both cases, they are trying to compete by adding features on top of web search, not on the core of the crawling and indexing.

Findory's personalized web search is also layered on top of Google search. Findory reorders Google web search results depending on what you and others have done in the past.
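The kind of reranking Greg describes can be sketched in a few lines. This is a hypothetical illustration, not Findory's actual algorithm: boost results from domains the user has clicked before, and rely on a stable sort so that everything else keeps the underlying engine's order.

```python
def rerank(results, history):
    """results: list of (url, domain) in the engine's original order.
    history: hypothetical map of domain -> past click count for this user.
    Python's sort is stable, so ties keep the engine's own ranking."""
    return sorted(results, key=lambda r: -history.get(r[1], 0))

# Hypothetical user who has clicked nytimes.com and espn.com before.
user_history = {"nytimes.com": 5, "espn.com": 2}

results = [
    ("espn.com/nba", "espn.com"),
    ("example.com/nba", "example.com"),
    ("nytimes.com/sports", "nytimes.com"),
]

reranked = rerank(results, user_history)
# Familiar domains float to the top; the unknown one stays last.
```

A real system would of course blend the history signal with the original relevance score rather than sorting on it outright, but the shape of the idea is the same: the value is added on top of someone else's index.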

  Daniel Rueda [10.27.06 11:43 PM]

Search is Dead....Location LIVES! King of Location is MyLocator.com. The finest Strategic Multichannel KeyPhrase location based Network available. The world never needed search engines. What we needed was Location Engines.

  Jeff Chan [10.28.06 12:43 AM]

It appears to be trendy to point to infrastructure as a barrier to entry for almost any new (or existing) category of services. A few hundred servers suffice for a small number of copies of a web-scale crawl and index. Google's scale is necessary for the volume of queries it serves, not the size of the corpus.



The real problems potential entrants have to deal with are technology and distribution. It is an open question whether anyone can significantly improve on the existing approach (perhaps mining and summarizing facts?), and even if they could, whether they can acquire a following before an existing service replicates it or acquires them.

  Gopi [10.28.06 10:09 AM]

Search is dead not because of an infrastructure barrier (as someone noted, it doesn't require that many servers to crawl and index; more servers are needed only for query result generation, and that's a variable cost).

Rather, search is dead because of distribution issues: it's really, really hard to change the current mindshare for search engines and get people to try a new one.

  markl [10.28.06 12:26 PM]

Applications of search are obviously something that I am very interested in. It's fun watching these evolve in real time. At Google we launched Custom Search Engines the other day, and in my team, we supported this effort by making CSEs accessible to the mashup developer. We launched two little samples to showcase CSEs:

http://www.google.com/uds/samples/cse/index.html
http://www.google.com/uds/samples/cse/csearch.html

This enabled the launch of "vulnpedia," a security vulnerability site:
http://vulnpedia.com/ (search for worm, xss, rootkit, etc.). This site was built literally the day we enabled CSE refinements.

Another guy I know blogs about the TV Show Lost and he has built a little vertical:
http://searchlost.blogspot.com/

These are all little examples that back up many of your points. There are countless opportunities to embed search in vertical sites like this and people are actively working on this, refining their ideas, etc.

Here is a site that's virtually all dynamic content, much of it search oriented:
http://www.cybersmusic.com/cybersearch.php?a=Pearl%20Jam

The videobar down the side is based on this: http://www.google.com/uds/solutions/videobar/index.html

The Tabbed Search in the center is tabbed mode of Google AJAX Search, and note the last Tab is a custom search engine the guy built this week. Wrap this with some ads and art and some original content in the form of navigation and you have a useful vertical site where a bunch of content is search based. This particular guy has similar sites dedicated to Nascar racing, Gadgets/Electronics, and even programming ( http://www.ajaxcoded.com/ajaxsearch.php?a=JavaScript )

  Joseph Hunkins [10.28.06 02:32 PM]

Hmmm - I started to think "reasonable" but realized this is Search baloney. An argument similar to Norvig's could have been made back in 1999, when fledgling Google faced the staggering and spectacular infrastructures of MS and Yahoo.

Google's vastly superior search algorithm won the day and they scaled up to accommodate increasing use. If, for example, Powerset comes up with a superior algorithm, why wouldn't searchers gravitate there, and the additional revenues allow them to scale up in size as needed (JUST as Google has done over the past few years)?

  Tim O'Reilly [10.28.06 05:52 PM]

Joe -- I think the scale of the big guys now is several orders of magnitude greater than in 1999. I'll not say it's impossible, just that the barriers to entry are much higher than they used to be -- unless you use someone else's infrastructure, which is just the point.

It's not to say that a deep-pocketed VC couldn't fund something like this, but the risks are so high that it's unlikely. Remember, you can't "scale up" from zero. At minimum, you need to have accumulated billions of pages. Nutch's 100 million was only sufficient for research purposes.

  Yakov [10.29.06 01:49 AM]

Tim, your article is just in time. Please check early next week how we are going to add value on top of the existing web platform.

  Otis Gospodnetic [10.29.06 11:05 PM]

As one of the Lucene developers, this is clearly a topic right up my alley. I run Simpy, and I'm not afraid of Google, Yahoo and other gorillas! :)

Gopi: that has always been and always will be hard, but it also happens _all_ the time. What did you use before Google?

Tim: yes, the barrier is higher now in absolute terms (e.g. Google is indexing billions of pages, while AltaVista, Excite, and others indexed only a few hundred million). But all the other scales moved, too. $1000 (your own or VCs') gets you a lot more hardware than it did when Google started circa 8 years ago. Hosting and bandwidth are cheaper. There are more available open-source solutions and tools. Sure, Google looks super-scary, but so did Microsoft, AOL, Yahoo, etc.

  Davide [10.30.06 04:13 AM]

I'm afraid of this concentration of power in the hands of a few giants. If there is no room for an alternative, we are doomed to have an Internet that is no longer free and distributed. We all considered the sentence "No one controls the Internet" a positive value of the Internet, but if Google and Yahoo ARE the table of contents then this sentence is no longer true. I would expect some revolutionary approach to indexing and search, based on p2p. But even here I don't see any new step ahead.

  Tim O'Reilly [10.30.06 08:03 AM]

Davide -- I don't disagree. But remember this: just as the internet arose as an alternative to Microsoft's concentration of power, creating a whole new game, so too will new technologies route around whatever concentration of power happens as the internet matures. It's the nature of technology. We get bursts of innovation when barriers to entry are low; the innovators become the incumbents, and raise barriers to new innovation; eventually, the walls crumble not because of direct assault but because the innovation has moved elsewhere, where the barriers are lower, and that eventually takes the energy away from the old market.

It's a bit like a new road bypassing a town. Eventually the old town dies.

  pwb [10.30.06 12:28 PM]

Google is vulnerable as long as it continues to disregard the single most telling data point: what result did the searcher click on. And better, which result did the searcher click on and not return to the results.
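pwb's second signal — the click the searcher did *not* come back from — is what is sometimes called a "long click," and it can be sketched in a few lines. This is a hypothetical illustration of the metric itself (invented log format, arbitrary satisfaction threshold), not a claim about what Google does or doesn't measure:

```python
from collections import defaultdict

# Hypothetical click log: (query, result_url, dwell_seconds).
# dwell_seconds is how long before the searcher returned to the
# results page; None means they never came back.
click_log = [
    ("jaguar", "wikipedia.org/wiki/Jaguar", 95),
    ("jaguar", "jaguar.com", None),
    ("jaguar", "jaguar.com", 4),
    ("jaguar", "wikipedia.org/wiki/Jaguar", None),
]

LONG_CLICK_SECONDS = 30  # arbitrary threshold for "satisfied"

def long_click_rate(log):
    """Per (query, url): the fraction of clicks that look satisfied,
    i.e. the user never returned or stayed past the threshold."""
    clicks = defaultdict(int)
    satisfied = defaultdict(int)
    for query, url, dwell in log:
        clicks[(query, url)] += 1
        if dwell is None or dwell >= LONG_CLICK_SECONDS:
            satisfied[(query, url)] += 1
    return {key: satisfied[key] / clicks[key] for key in clicks}

rates = long_click_rate(click_log)
```

A result with a high long-click rate is one searchers click and stay on; a result that is clicked and quickly abandoned is arguably a worse answer than its rank suggests, which is exactly the gap pwb is pointing at.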

  Tim O'Reilly [10.30.06 01:13 PM]

pwb -- do you have data points that show that Google is in fact ignoring this data? If so, links please!

  Bill de hOra [10.30.06 05:29 PM]

"Joe -- I think the scale of the big guys now is several orders of magnitude greater than in 1999"

I remember a guy telling me back in 2000 you'd have to be crazy to take on the search engines. I don't think that included Google at the time.


So, I don't buy the barrier argument - it's the same kind of reasoning that has Java stalwarts claiming Ruby will never be more than a fringe curiosity, while ignoring how Java itself used to be a fringe curiosity - if it's ever true, it's only a contingent truth (this class of argument is so common in our industry we should really have a name for it by now).


Plus it's only one particular search/storage model - download the web into a database, and pay through the nose to stop the database melting down. You can talk up the ops and algorithms but when you get down to it, web 2.0 platforms are just client server grande.

Speaking of ghost towns - The assumption that seems to keep the centralised model afloat for search is that no user is willing to take the latency hit for collating distributed/peered queries - but perhaps that only really holds true for browser users (who have to wait due to the way the web medium works), or where the results aren't worth the wait. IM and mobile phones (even email) suggest there can be alternatives.

  Tim O'Reilly [10.30.06 08:15 PM]

Bill --

I'm not suggesting that there isn't the possibility of radical disruption to the current model. But I still believe that we're entering the centralization phase of the web, in which the big get bigger, and put up barriers to entry to the new guys.

Already in Silicon Valley, the goal of most startups is to get acquired rather than to build standalone companies.

It's a lot like what happened midway through the PC revolution.

It doesn't mean the game is over. It does mean that it will need to move to new ground for radical disruption.

  Ken Krugler [10.30.06 08:37 PM]

Hi Tim,

The mention of Nutch in your blog (and in response to a comment) caught my eye. From what you say here, I'm guessing you don't know about the current version of Nutch, which is 0.8. Nutch 0.7 had limitations that made it impractical to use for more than very small, vertical/intranet crawls, but 0.8 (after getting past some teething problems) is a very different story. See http://blog.krugle.com/?p=194 for a few more details.

  Nik Cubrilovic [11.01.06 06:31 AM]

I agree that some established markets such as search have become so complex that they are not only very hard to attack, but become harder and harder as time goes on.

A day doesn't pass that you don't see an article about how much cheaper it is to start a business and build technology today; the costs of computing power, storage, bandwidth and development time are now much lower. But there are obviously markets where, despite this trend, it is still resource- and cost-prohibitive for a new entrant to compete. Search is a good example of this (even blog search: many entrants, and nobody seems to have really nailed it yet). My question would be: which other segments have the same traits and would not be worth pursuing?

Also - which products get more and more complex and require more resources as time goes on? Search is one because the web continues to get bigger which makes organizing the content and finding what is relevant harder (especially with sophisticated SEO), blog search is another, what else?

  Daniel Rueda [11.14.06 05:57 PM]

Algorithms are over. User generated location based content on a custom multichannel content distribution grid is the future. 100 million domains with content on them; most people won't remember more than 20-40 different sites. The world never needed search engines; locator engines would have made cyberspace....locationspace.

  Bob Pack [12.27.06 07:03 PM]

Oh, O'Reilly... you are so wrong! Search start-ups aren't even close to being dead. The search market will go through many more iterations over the next 5-10 years; it won't even look like its former self. There will be many more new companies advancing search, both in terms of indexing results and the UI of how results will be displayed. Costs will continue to go down: costs for memory, bandwidth, CPU, etc. Open source makes platform development easier. Google has actually stagnated the search market. Albeit they have made it grow, the market is at a standstill because it greatly benefits Google to keep it where it is: people humming around lost... clicking on ads... making Google money. The more searching, the more money? The future is search engine start-ups. Always has been... always will be.

