
Fri

08.31.07

Tim O'Reilly


Google to Host AP News

I do a lot of reasoning by analogy. I try to learn from history and from other situations with parallels to the present. (See, for example, the logical track leading from my paper The Open Source Paradigm Shift to What is Web 2.0?)

As readers of the Release 2.0 newsletter know, I recently became fascinated by parallels between Web 2.0 and financial markets. Here is one of the conclusions I drew from that study, as I wrote in an article entitled "An Unorthodox History of Wall Street" for that issue of Release 2.0:

Bill Janeway, former Vice-Chairman of private equity firm Warburg Pincus and a "recovering economist," gave us a long term perspective that provided very useful context for consideration of possible futures for Web 2.0 markets as they achieve the scale of financial markets.

Bill points out that thirty years ago, the price of a trade was regulated by the exchanges: it cost approximately 22 cents to trade a share on the New York Stock Exchange. Unable to compete on price, firms competed on the quality of their investment research, and brokers' relations to clients were based on the information and insights they could provide.

Once the exchanges no longer regulated the price of a trade, prices fell over time to current levels of a fraction of a cent per share, or for large trades, effectively zero. As a result, sell-side firms could no longer afford to do fundamental research. Two things happened: independent research firms grew up that charge directly for research, and more importantly, firms began to trade against their clients for their own account, such that now, the direct investment activities of a firm like Goldman Sachs dwarf their activities on behalf of outside customers....

This historical perspective provides thought-provoking fodder for speculation about the future of other networked information markets, including Google Adwords.

  1. Might we see Search Engine Optimization and Search Engine Marketing as the equivalent of Wall Street investment research? Will SEO/SEM firms evolve into research firms, or will they start "trading for their own account"?
  2. Might link farms that harvest search engine results to automatically build pages that will rank high in search engine results and thus collect a disproportionate share of adword clicks, be seen as the equivalent of program trading? If so, we can expect this type of programmatically created page to represent a larger and larger share of results volume unless search engines get smarter about regulating them.
  3. Might the direct inclusion of data such as weather and stock prices into search engine results pages (rather than links to external resources) be seen as the equivalent of Wall Street firms "trading for their own account"? If so, does Wall Street suggest to us that the future of search engines is to produce increasing amounts of their own content, and to consume a greater direct share of the available clicks, rather than passing the clicks off to their external advertisers?

Well, this morning's Chronicle brings the latest news on this front, with Google "trading for its own account" as opposed to playing its traditional role of handing off traffic to the web sites it searches. From Google to host AP News:

"Internet search leader Google Inc. on Friday began hosting material produced by The Associated Press and three other news services on its own Web site instead of only sending readers to other destinations.

"The change affects hundreds of stories and photographs distributed each day by the AP, Agence France-Presse, The Press Association in the United Kingdom and The Canadian Press. It could diminish Internet traffic to newspaper and broadcast companies' Web sites where those stories and photos are also found -- a development that could reduce those companies' revenue from online advertising."

I've written several related posts on this issue, including whether Google's future will be as a switchboard or a repository, and why Book search should work like web search.

What other areas do you see (besides weather, stocks, directions, phone numbers, books, and now AP news) where Google might start delivering results directly, thus potentially drying up traffic to the specialized sites that once provided that information? (Note that this is not necessarily a bad thing -- like many possible changes in the web ecosystem, it could go either way, depending on the revenue split paid to the licensed provider of the information. If that split is greater than the value of the ad revenue or other business model under the current system, everyone is happy. In the end, though, it's unlikely that everyone will be happy. But as long as the value created by the system is greater than the value destroyed, value reallocation between parties tends to work itself out over time.)


1 TrackBacks


» Google for news from Random Mumblings

Lots of people chiming in on Google's use today of full-text Associated Press stories instead of linking to AP member Web sites. People had been wondering whatever happened to the AP-Google agreement announced a year or so ago and now we know...

Comments: 16

Bill   [08.31.07 02:17 PM]

I don't really sympathize with the loss of traffic to a story that is an exact duplicate of all other reprints of the same AP story on other news sites. Google is just reducing the amount of confusion and redundancy for its users. As the Google News blog put it: "of course, if you want to see all the duplicates on other publisher websites with additional analysis and context, they’re only a click away."

Simon   [08.31.07 02:56 PM]

Tim O'Reilly asked:

"What other areas do you see (besides weather, stocks, directions, phone numbers, books, and now AP news) where Google might start delivering results directly..."

"Dating" leaps immediately to mind.

Probably working on it already. Not quite "all the world's information", but who's counting? Plus, the trust factor, and the interface, oh, and obviously the search matching technology.

Oh.

Probably working on it already!

/s

Michael R. Bernstein   [08.31.07 03:26 PM]

Science.

By which I do not mean the relatively tepid 'Google Scholar' offering, but the (frequently unpublished) huge datasets that are often both the requirement (e.g., genomics) and the result (e.g., High Energy Physics) of many modern experiments, and possibly impressive on-demand number-crunching capabilities to apply to these datasets.

This applies even more to disciplines such as economics, sociology, psychology, medicine, and demographics.

Michael R. Bernstein   [08.31.07 03:30 PM]

A general-purpose product rating and comparison service. Something like Amazon meets consumer reports.

Michael R. Bernstein   [08.31.07 04:31 PM]

Yahoo never truly capitalized on its Viaweb acquisition (now Yahoo stores). Google could make a play for hosted e-commerce solutions including digital content fulfillment. Digital River might be a possible acquisition in this space, but there are others.

If this seems outside of Google's stated mandate, note that tier-one players in this space must provide information on cross-jurisdictional commerce as a set of services, i.e., "am I allowed to sell this software product to this customer in jurisdiction X?", not to mention local information on taxes, etc., and tracking fulfillment. These services are deeply information-intensive.

Michael R. Bernstein   [08.31.07 04:58 PM]

Also, you might take a look at where Google is *already* providing data directly via Google Base. Real-Estate comes to mind. Here is an example real-estate search. (Full disclosure, one of the houses for sale in this result set is mine).

Tim O'Reilly   [08.31.07 05:04 PM]

Michael -- you're on a real tear here. Thanks for all the great feedback. I hope someone from Google is reading all your smart ideas.

Simon -- you too. It's pretty clear that Google can't have Orkut as its only social networking play...

Outtanames999   [08.31.07 09:24 PM]

Yellow pages business. I won't be surprised to see Google buy one or more YP publishers by 2010, or partner with them in some way.

Search ☸Engines ☸Web   [08.31.07 10:36 PM]

In case others have forgotten, FOR YEARS, Google has been sending FREE TRAFFIC to BILLIONS of Webpages courtesy of their organic SERPs.

These include the news links that are now incorporated into the SERPs as well as the links to Digg, Amazon, blogs, etc.

Michael R. Bernstein   [09.01.07 12:47 PM]

Tim, thanks for the compliment, though I'm not sure my ideas qualify as 'smart'. Maybe slightly non-obvious.

When their mandate is to 'organize the world's information', and there are so many information businesses that are only likely to get 'organized' if they take them over themselves, the trick really isn't to guess which information businesses they'll go into, but in what order. I don't really think I have any special insight into that order.

I definitely wouldn't have guessed 'full text scanning and indexing of books' before they did it.

OTOH, reading the tea leaves, we already do know that one of the areas they are about to enter is Health and Medicine information.

Michael R. Bernstein   [09.01.07 03:36 PM]

Regarding Simon's suggestion, you can search personal ads through Google Base too. In fact it is generally amenable to rectangular data storage and retrieval. I would be very surprised if Base isn't monitored to detect new and interesting uses by alpha geeks.

BTW, speaking of smart ideas, it would be interesting if Google introduced something that was the general equivalent of Google Base for non-rectangular data (IOW, something like Freebase), but I am unsure to what *specific* uses it would be put first.

Claus   [09.02.07 02:51 AM]

Note how the "trading for their own account" remark also fits in with the "more J.P. Morgan than Microsoft" line from your update to the previous Radar post on Google.

Alex Tolley   [09.02.07 06:39 AM]

"Two things happened: independent research firms grew up that charge directly for research, and more importantly, firms began to trade against their clients for their own account, such that now, the direct investment activities of a firm like Goldman Sachs dwarf their activities on behalf of outside customers...."

"Might link farms that harvest search engine results to automatically build pages that will rank high in search engine results and thus collect a disproportionate share of adword clicks, be seen as the equivalent of program trading?"

Independent research is tiny - the real growth was the trading on their own accounts.

Let me suggest that the link farms are not like program trading, but currency trading. Currency trading doesn't have much in the way of fundamentals to anchor prices, so prices move much more on the basis of patterns - I believe technical trading is still widely used in this domain. But the idea I want to convey is that currency trading is more like a game - suckering a counter-party to make a losing trade by exploiting information asymmetry. That is the analogy I want to connect to link farms and.

Let me make another analogy in this domain. Market data used to be the preserve of the major vendors like Reuters. It used to be delivered as pages. Then the individual instruments (currencies, stocks, bonds, and their derivatives) became itemized and delivered separately. About that time other vendors entered to manipulate that data. Tibco (nee Teknekron) created feedhandlers to grab Reuters and Dow Jones Markets data, created a common data item naming schema, and provided sophisticated, customizable UIs to deliver the data to traders. Google could do something like that in classifying page/data items so that the search and display UIs could be built on top of common data naming (i.e., beyond URLs). For a taste, just think of a dynamic "scrolling news" RSS feed delivering the latest, hottest pages based on some search criteria. Search would be transformed from snapshots to a tracking and monitoring function. Just think of the opportunities based on that.

Michael Bernstein/Tim O'Reilly: Didn't Google already announce their interest in hosting data at the 2007 SciFoo? Also, I certainly felt that the new Google Sky is a niftier version of the Sloan Sky survey DB that Jim Gray and colleague demonstrated at SciFoo 2006.

Nat   [09.03.07 07:49 AM]

Google earns money thanks to ads (AdWords, AdSense...) because it targets the banners, by knowing what is of interest for a given visitor. It may enhance this knowledge by tracking visitors' actions on a site considered by a fair and growing fraction of Web users as a repository of useful generic information, a site where many of them immediately search for information on their topic du jour, a site where Google, for now, cannot track what they do.


This site is Wikipedia, and it needs hosting.


Here is my piece about this.

DaveN   [09.03.07 12:07 PM]

I think that in the back of Google's mind, they are still only a website that depends on their biggest competitor, Microsoft, to get their product to users. So my money is on Google developing something to break away from the http:// model, something that connects to a Google central core.

Simon   [09.13.07 02:59 PM]

Follow up to dating online post.

I was quite unclear in my first post; I apologize it has taken me so long to respond. I also apologize for this long post, but this is a complicated issue.

At the risk of repeating myself (again!) what I was trying to articulate was not so much smart as imaginative, perhaps, or at least speculative.

Some of the speculation below seems similar to Alex's post, but this seems different enough to still warrant posting, since dating and relationships are interesting to every human on earth.

Um, no. Orkut, as successful as it is, especially internationally in places like Brazil, is (as far as I understand it; I’m not currently a member) just one of many similar social networking sites.

Everybody is reasonably familiar with these sites, I think, MySpace and Facebook being the leaders, with millennials, anyway. They are sites for finding friends, dates and relationships.

Michael Bernstein makes a very good point: you could do a dating system with Base. But the interesting question comes when you consider the cost to Google, or any company, of storing that information on their servers.

Storing the info costs a lot of money! Serving it up costs money. There has to be a compelling profit motive for them to store that data, and that means some kind of useful service that helps people connect to like-minded individuals for dating and relationships, in this case.

The service would need to be very good indeed at delivering customer value, as dating is a crowded market. But the Big G. is very good at matching people with things they like, connecting people with information and services. That’s kinda what they do, and what we trust them to do for us: Help people find information or "things", things we really want, whether we know it or not, I guess. To surprise us with information we didn't know we needed! One of the things they are good at, anyway.

The system I’m thinking of, and this is just Gibson-esque sci-fi speculation, would get data from all over the web, from many different social sites and dating services: your clicks, tags, links, bookmarks, contacts, flickr account etc. It would tie it all to a personal identifier, like an email address in the case of Rapleaf (who are already doing something along these lines) or a central phone number, or both.

This imaginary service could result in really quite extraordinary matching services for singletons! I'd sign up in a heartbeat, so to speak.

And all based on real information, the actual things you like, and click on, even in real time, maybe, as they change with your changing interests and passions; change, or not, as the case may be. Wow!

(Link above courtesy of ZDNet)

It might result in something like a quality score for active daters, quality score in the sense used by the AdSense system as part of Google’s algorithm (as I understand it) for rating the "importance" or value of advertisers’ web sites.

Part of the economics of that system is that Google gets more value from storing a successful, useful or quality site on their servers, and your CPC goes down, in return.

So, in this imaginary dating service, people, or daters, who opt-in, would have their own rating or "quality" score, which might be based on all kinds of criteria: how long does the guy take to call back; how many active female single contacts does he have in his phone book?

Such a service might identify "players" from all genders. It would collect and "understand" data far beyond the usual lists of criteria given by each sex for seeking a mate such as height, weight, physical attractiveness etc., currently used by most people on active dating sites like Craigslist, which is free, btw!

Other information that could be used to match: aesthetic preferences (flickr); maybe even the type of face (or body) a person is actually attracted to, (through some system of analysis of photographs clicked on, and the kind of facial symmetry work being done in attractiveness studies by psychologists?)

It could go on and on. Predictive scores, accounting for changes in personality over time, etc. The sky’s the limit.

People with high quality scores would be pre-selected not only as a perfect match, or a darn good match, but also, based on past behavior, as matched in belief, or dating ethicality, so to speak. This would all be based on actual data, not on the sets of psychological tests or questions that are used by the best matching sites at the moment.

As good as those tests may be, and no matter how clever the tests are, there is always going to be a difference between what a person self-reports, and how they actually behave, especially some people. That’s why real psychopaths never go to psychiatrists.

Data from high-quality people would be worth far, far more to store on servers than any social utility. Actually, hmm, data from very active daters would also be worth storing on their servers.

Of course, for such a system to work the data would have to be available from these social utilities, but just a day or two ago we learned that Facebook plans to make this kind of information public: USA Today.

Of course, as all readers of this blog know, many of these sites have open APIs, including del.icio.us, which would make collecting the data just a straight programming challenge.

(Link above courtesy of Sonja Thompson at TechRepublic)


What would you pay (or exchange) for that kind of information: for real love, from good people, matched with real world data, from all over the world; or for the perfect quick fling, right when you want it?

In this imaginary world, you are using your online desktop or mail service, and up pops a sponsored link on the right hand side that says something like:

"... based on your Personals Profile and your current interest in green cotton linen, we have determined that person X living in country Y has an 87% chance of being your perfect match, your "soul mate". In addition, they just broke up with somebody and are actively looking for somebody who looks just like you. Click here to send them an anonymous ping through the privacy system."


Frankly, I believe people would give just about anything for that kind of information: real love, soul mates?

If you are a very active dater, up might pop something like:

"...based on your profile a VGL married man is currently trolling in a coffee shop two blocks from your current location. He fits your One Night Stand Profile (ONSP) 92%. Click here for a photograph and his exact location on a blinking map sent to your cell phone, so you can go check him out. This will cost you a buck, and have fun!"

Getting scary! But it's just science fiction. I'm not an engineer or database expert; I know comparatively little about the actual programming mechanics of this stuff.

Why is it scary? Well, there are obvious moral questions about privacy here! Matrix-like oppressive universes spring to mind, all kinds of nightmare scenarios: Soylent green is made of people.

But technology itself is morally agnostic. It can be used for good or bad purposes.


 


Privacy and Choice


 

If this kind of scenario makes your "right to privacy nerve" twitch, remember that it is always a choice, here in the USA, anyway.

Choice, as they also say in the Matrix movie series, is what it is all about when it comes to our personalities, network or not, imho. We always exchange this information about ourselves for a service we want, in this case an actual soul mate you can trust!

We have a choice to never log in again, should we wish. We certainly have a choice as to whether we will sign up for MySpace or other social networks.

Since these technologies are so powerful there is little one can do now, let alone in the future, to defend against them. Our “privacy” is now a choice we make, by our deeds and actions, every day. It is by these choices, just as it ever was, that you can either choose to be a good person, or not, depending.

Maybe there will be super-hackers, as there are in many of William Gibson’s novels, who can navigate all this, but for a vast majority of us, if we value privacy at all (and we do have a right to privacy, for whatever reason we choose), we may be reduced to two choices: don't exist on the grid, or expect to be completely transparent.

There is one thing my dad taught me long ago and I always try to remember: No matter how smart you are, there are always people who are smarter than you, always a bigger fish in the sea.

When technology like this is used to take away choice, that's when it becomes worrisome. When it is done without your consent (and Google have shown no signs of ever engaging in that kind of practice), then it becomes totalitarian, as it seems to be rapidly becoming, perhaps, in the UK and other countries, for example.

The only obligations (I believe) we have with regard to privacy are when we, personally, take away another's choice, perhaps without even realizing. For example: email a sexy picture of your ex-girlfriend, or store photos and personal replies from people you exchanged perhaps one or two emails with on Craigslist, and you are potentially choosing for them, too.

If your online email were to be hacked by somebody who wants to cause you harm for any reason, then by having stored that information about others you will have jeopardized their privacy.

By forgetting about others who trusted us, we potentially take away the choice of people who have nothing to do with whatever issue is currently threatening our privacy, or our choices in online privacy.

It is not just corporations and governments that can take away this choice. As individuals, if we are cavalier with the information sent to us by others, we are as culpable as any company or government.

The privacy of others, the people we interact with, should be our prime privacy concern on the network, imho.

Oh, and no worries about Google reading this; if I can imagine it, I'm absolutely certain they can too :)

