Sun, Mar 25, 2007

Tim O'Reilly

From Subprime Loans to Failing Newspapers

In response to my piece about troubles at the Chronicle, Peter Wayner sent me the following note:

Apropos your posting on the SF Chronicle, I'm passing along a talk I gave at Google last year:

Google's Role in Paying for Shoe Leather

It wasn't particularly well received, I think, because the place still has the opinion that it is doing a favor for the content creators by pointing to them and wrapping ads around their work. That may be changing because one person told me last year, "Some think of Google as selling search. Some business types think it sells ads. I think it needs to be in the business of ensuring there's something to sell ads around."

Hence the connection between the piece on the Chronicle and the piece I just did on subprime loans. In his talk at Google, Peter noted:

There is the general sense that the Internet is slowly eroding the value of any of the traditional techniques used to pay for gathering the news. The classic newspaper is a bundle of products designed to make it easy for people who each want one particular stream of information to pool their support with others. Everyone can spend a small amount, and the aggregated support pays for gathering the news. Some want the sports scores, others want an update on new shows, and some want to keep an eye on what happens to their tax dollar. Most people have wandering interests in all three. Each of these interests found its answer in the general pile of information called the newspaper.

Bundling the information gave the editors the freedom to spend vastly different amounts to gather information and that allowed them to deliver relatively expensive information as long as it was balanced with relatively inexpensive facts....

Adam Penenberg analyzed the NY Times financial statements and found that in 2004 the dead tree subscribers generated $900 of revenue apiece. The website produced $11 per visitor. The gap between the two numbers has probably tightened since then because print advertising continues to become less popular while web advertising is booming....

This brings us to the question of shoe leather and what Google can do to support those who want to produce original content. If you had asked me five to ten years ago, I would have thought that a search engine for the web was all that was needed. Helping the content consumer meet the right content creator is a marvelous gift to the world and something that is continuing to have amazing effects on almost every part of human life. Innovation is easier than ever before and magical creations are coming faster than ever before.

Today, I'm not as certain, in part because I think the ecology of free information has serious limitations. The Internet isn't supporting the shoe leather. I can't be certain of this, but my guess is that the blogs won't be able to replace the 8,000 lost stories from the Washington Post. Yes, the blogs will replace some of them, perhaps as many as 7,500 of the 8,000. Yes, the blogs will offer a wider range of voices from a wider range of society. But I just don't see the same amount of serious journalism appearing. I can't quantify this and I don't know if anyone will ever be able to know. You just can't measure the depth of coverage very easily.

But even if I'm wrong, I'm beginning to see serious problems with the free information ecology. At first glance, free information seems like a great gift for the world. It's kind of like the mythical frictionless economy, at least in ideas.

The danger is that the free information ecology will drown out the paid information ecology. Incidentally, this often happens in the world of money. People instinctively hoard solid metal cash and spend paper money first. The economists call this Gresham's Law and summarize it with the phrase "the bad money drives out the good money". But in cyberspace, there's no such thing as money, just information, and so it's no surprise that we could have a similar thing occur with bits. The cheap bits drive out the dear ones.

The important takeaway is that we aren't in the end-state of the information economy. In the course of its evolution, there may be unintended consequences, such that Craigslist, a boon for people wanting to find an apartment or get rid of their stuff, or Google, a boon for people wanting a quick summary of the latest news, undercuts the ability of newspapers to fund coverage of expensive topics. User-generated content may replace many of the types of content that newspapers (or, for that matter, book publishers) used to provide, at lower cost and perhaps even with greater relevance and quality, but it may not fund the stuff that's hard to do.

Now, I'm not saying that Google or Craigslist has unleashed a genie similar to the one that came out of the subprime loans bottle, but I am suggesting that giving thought to long-term consequences can lead to different strategic choices. In his talk, Peter focused on a number of possible ways that Google could help to encourage more of a paid content economy:

There's no reason why Google can't put paid information on the same level as free information. I think it can be done without harming the free information and without reversing the bias to put the free information at a disadvantage. Here are five suggestions that might help the content creators out there make enough money to pay for more than bandwidth.

Solution 1: End the Bias Against Walled Gardens

...Make it simpler for publishers to get their information into the index even if they're not as wide open as we would like.

Solution 2: Tilt the Table Against the Copyists

Let me say that I'm a big believer in fair use. I think it's very important for people to be able to quote frequently and liberally. But some blogs take this to an extreme. It's easy to find blogs that are 80, 90, even 95 percent borrowed text. Some frequently cut huge chunks out of an article and then wrap them with the thinnest amount of comment....

So why not add another term to the exponentially growing PageRank equation? Declan McCullagh suggested this during dinner last night. Why not compute the fraction of the text that's original and the fraction that's borrowed? ...Let's call this LeechRank. If 20% of the text is borrowed, let's do nothing to the PageRank. If 50% is borrowed, we bump them down a few notches. If 80% is borrowed, let's send them down 20 to 30 notches. And if 100% is borrowed, as some pirates do, well, let's just knock them straight out to the bottom of the listings, sort of a way station on their trip to the circle in hell reserved for people who steal and destroy a person's livelihood....
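
To make the idea concrete, here is a minimal sketch in Python of how such a LeechRank penalty might be applied. The thresholds come from the paragraph above; the exact notch counts, the function names, and the toy result list are purely illustrative assumptions, not a description of anything Google actually does.

```python
def leechrank_penalty(borrowed_fraction: float) -> int:
    """Return how many notches to demote a page, given the fraction
    of its text that is borrowed from elsewhere (0.0 to 1.0).

    Buckets follow the suggestion above; the specific notch counts
    are illustrative only.
    """
    if borrowed_fraction >= 1.0:   # pure copies go straight to the bottom
        return 10_000
    if borrowed_fraction >= 0.8:   # mostly borrowed: drop 20 to 30 notches
        return 25
    if borrowed_fraction >= 0.5:   # half borrowed: bump down a few notches
        return 3
    return 0                       # light borrowing (the 20% case): no penalty


def rerank(results):
    """Re-sort search results after applying the LeechRank penalty.

    `results` is assumed to be a list of (url, rank_position, borrowed_fraction)
    tuples, where a lower rank_position is better.
    """
    penalized = [
        (url, position + leechrank_penalty(frac))
        for url, position, frac in results
    ]
    return sorted(penalized, key=lambda item: item[1])


if __name__ == "__main__":
    demo = [
        ("original-reporting.example", 1, 0.10),
        ("mostly-copied.example", 2, 0.85),
        ("pure-scraper.example", 3, 1.00),
    ]
    for url, position in rerank(demo):
        print(position, url)
```

A real system would presumably fold the penalty into the ranking score itself rather than shuffling positions after the fact, but the bucketed thresholds are the point of the sketch.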

Solution 5: Micropayments

It's time to open up the index to articles that are kept behind a wall of pay. I imagine a system with three different columns of results from a search. The first would be pointers to articles from the free ecology. The last would be paid advertisements. In the middle could be articles from web sites that charge people to read the text.

Google could either help collect the payment or leave that marketplace to another company. Both have their advantages and limitations.

Despite what Peter said about Google not being that receptive to his talk, they are clearly moving in the directions he suggested (albeit with more sophisticated thinking on possible interfaces). Google Scholar now indexes paid content, as does Google News. Google Book Search is working to build a paid search economy for books (and hopefully, that will include books on publishers' sites, and not just in Google's repository). And with Google Checkout, they now have a payment mechanism in place.

Nonetheless, Peter's basic point is a good one. An economy is also an ecology. It's possible to seriously damage an ecology by exploiting it too heavily. Smart players create sustainable ecologies.


tags: web 2.0  | comments: 11

1 TrackBack

The discussion today about whether newspapers are dead or merely have a flesh wound, and if only wounded what they must do to survive, reminds me of the “bring out your dead” scene in Monty Python’s The Holy Grail…. I never tou...

Comments: 11

  Scott Carpenter [03.25.07 07:50 AM]

If we're going to rank based on percent of copied material, I hope someone figures out how to account for all the splogs out there.

Although I think that metric may not be as useful as hoped. That thin slice of commentary over the copied text can be valuable, and it's a convenience to read it in one place.

And I think Google already looks at duplicate content in determining rankings.

Money will be made, somehow. We're not in the end-state, and I think it's difficult if not impossible to predict that end-state, so I'm leery of any model or plan that tries too hard to hang on to what we've known and used to date.

  Nicholas Dronen [03.25.07 09:44 AM]

It's possible that a hybrid news style will emerge from the rubble. The family of TalkingPointsMemo.com sites could be an exemplar. They blend blogs, community, and original reporting. A while ago they hired a reporter or two to work on TPMMuckraker.com, which was at the vanguard of reporting on the US Attorney scandal. They're now hiring someone to cover Capitol Hill on Capitol Hill. Real shoe leather.

Here's a recent NPR story on them.

  Brij [03.25.07 01:34 PM]

Tim,

One can draw an interesting analogy between mainstream media/bloggers and proprietary/open source software.

Somehow, Peter's assertion is missing actual data points. This is a start, so we have to wait to see the emergence of "professional" bloggers:

Today, I'm not as certain, in part because I think the ecology of free information has serious limitations. The Internet isn't supporting the shoe leather. I can't be certain of this, but my guess is that the blogs won't be able to replace the 8,000 lost stories from the Washington Post. Yes, the blogs will replace some of them, perhaps as many as 7,500 of the 8,000. Yes, the blogs will offer a wider range of voices from a wider range of society. But I just don't see the same amount of serious journalism appearing. I can't quantify this and I don't know if anyone will ever be able to know. You just can't measure the depth of coverage very easily.

It's the same kind of argument I have seen in the open source context, where the proprietary software camp makes a case for more robust code by claiming more dedicated effort. With the overall success of the open source ecosystem, that doesn't seem to be the case anymore.


Thanks for sharing your insights.

  Bob [03.25.07 01:34 PM]

Seems to me that this article mostly quoted another source...yet I found it interesting still...

  Thomas Lord [03.25.07 10:34 PM]

I think the problem with "shoe leather" is a drop off in demand -- not a problem with medium of delivery or business model. The ads that directly surround the newshole still make perfect sense -- if there is widespread popular interest in newshole content. Classifieds: well, craigslist has its advantages for this and that but, frankly, a nice full page spread of ads also has *its* unique advantages and if craigslist were the only problem we'd probably see more innovation in newspaper classifieds.

Why do people care less, these days, about newshole content? I'm sure I'm not sure, but among my top-level guesses are (a) more widespread economic and political alienation; (b) advances in web 2.0-driven marketing of propaganda as a substitute for hard news.

Which brings us back, once again, to the investment strategies of the well off....

-t

  pwb [03.26.07 01:13 PM]

"found that in 2004 the dead tree subscribers generated $900 of revenue apiece. The website produced $11 per visitor"

Econ 101: profit matters more than revenue.

  Peter Wayner [03.26.07 05:52 PM]

"This is a start, so we have to wait to see the emergence of 'professional' bloggers"

We already have some people who make a fair bit of money blogging. Some of them are great, but I'm just as cynical today as I was then. There are some bright shining stars in the blogging world, but I don't see the same amount of deep, investigative work from the semi-pro bloggers.

Consider some of the most successful:

  • Slashdot culls submissions from readers and organizes their comments. The ads support a staff that's around 24 hours a day. But aside from a few columns written by the editors, I don't think anyone gets paid for generating any of the content.

  • Boing Boing has a great collection of material, but much of it is borrowed. I did a calculation once and found that about 50% of the raw text on the page was between blockquote tags (a rough sketch of that calculation appears after this list). Some of the pieces are great, but many are a couple of sentences saying, essentially, "This is cool." Their culling is useful, but it's not much more than press release journalism.

  • Fark: Pure editing.

  • Gawker, Treehugger, Valleywag, etc. --- These are some of the best examples of well-run, professional blogs. Yet their pieces are often single-source snippets. There's nothing wrong with that. Perhaps we intuitively like to snack on things in little pieces. A few of their pieces can be chained together to build something pretty amazing. I think, for instance, that you could learn just as much about pre-fab houses from a dozen Treehugger posts as you could from a traditional story in the paper. That being said, I'm struggling to be positive. I just don't see the same kind of volume that even an average midsized newspaper will bring.
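
As mentioned in the Boing Boing item above, here is a rough sketch of that kind of blockquote calculation, using only Python's standard-library HTML parser. The class and function names are made up for illustration, and a real measurement would also need to cope with pages that copy text without using blockquote tags at all.

```python
from html.parser import HTMLParser


class BlockquoteRatio(HTMLParser):
    """Rough measure of how much of a page's visible text sits inside
    <blockquote> tags. Only a sketch of the calculation described above."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # nesting level of currently open <blockquote> tags
        self.quoted_chars = 0   # characters of text seen inside blockquotes
        self.total_chars = 0    # all visible text characters

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        self.total_chars += len(text)
        if self.depth > 0:
            self.quoted_chars += len(text)

    @property
    def ratio(self):
        return self.quoted_chars / self.total_chars if self.total_chars else 0.0


def borrowed_fraction(html: str) -> float:
    parser = BlockquoteRatio()
    parser.feed(html)
    return parser.ratio


if __name__ == "__main__":
    sample = ("<p>This is cool.</p>"
              "<blockquote>Three paragraphs lifted from the original article...</blockquote>")
    print(f"{borrowed_fraction(sample):.0%} of the text is quoted")
```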



I know that some people think that blogs are kicking the pants off of the old style press, but I think this is only in niches. Of course a blog is going to have better coverage of some odd corner of the tech world than a general newspaper intended for a general audience. The real question is whether blogs can provide the same good quality content.


I would be interested to see some pointers to the best blogs that are also self-supporting. (There are fabulous ones written by people with day jobs. Lawyers, for instance, often blog to get clients.) Are there good examples of blogs that can stand alone? Please send them along.

  daVe =P [03.26.07 08:32 PM]

How would Google determine who leeched from whom?

How can any "neutral" automated system determine with a reasonable level of certainty who published what data first?

Would such a new criterion in PageRank bias the system against smaller content providers who would not have the resources to "stand up" for their intellectual property rights? Isn't it a big thing about blogs that the smaller voices get a more "equal" footing?

  Peter Wayner [03.27.07 05:00 AM]

There are several techniques that Google might use. None are foolproof, but they can work.

1) If page A has a link to page B and some shared content, then there's a good chance that A is borrowing from B.

2) If page A puts shared text between blockquote tags or between standard quotes, then it's borrowing from B.

3) Imagine Google spiders A and B on one day and finds the text on B alone. If it respiders a few days later and finds it on both pages, then A is borrowing from B.

(1) and (2) can't detect outright plagiarism, but that's not really the issue here. The real goal is to reward people who are doing real research versus those who borrow the core of the reporting and take all of the advertising revenue.
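
Here is a small Python sketch of how heuristics (1) and (3) might be combined. The Page structure, the crude windowed shared-text check, and all of the names are assumptions made for illustration; heuristic (2) would be the blockquote measurement sketched earlier in the thread, and, as noted above, none of this is foolproof.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, Set


@dataclass
class Page:
    url: str
    text: str
    outbound_links: Set[str] = field(default_factory=set)
    first_seen: Optional[datetime] = None  # when the crawler first saw this text here


def share_content(a: Page, b: Page, window: int = 120) -> bool:
    """Very crude shared-text check: look for any long chunk of B's text
    inside A's text. A real system would use shingling or fingerprinting."""
    chunks = (b.text[i:i + window]
              for i in range(0, max(1, len(b.text) - window), window))
    return any(chunk in a.text for chunk in chunks)


def likely_borrower(a: Page, b: Page) -> Optional[str]:
    """Guess which page borrowed from the other, per the heuristics above.

    Returns the URL of the suspected borrower, or None if we can't tell.
    """
    if not share_content(a, b):
        return None

    # Heuristic 1: a page that links to the other while sharing its text
    # is probably the one doing the quoting.
    if b.url in a.outbound_links and a.url not in b.outbound_links:
        return a.url
    if a.url in b.outbound_links and b.url not in a.outbound_links:
        return b.url

    # Heuristic 3: if the crawler saw the text on one page first,
    # the page where it showed up later is probably the borrower.
    if a.first_seen and b.first_seen and a.first_seen != b.first_seen:
        return a.url if a.first_seen > b.first_seen else b.url

    return None
```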

  Ciaran [03.28.07 11:42 PM]

What are all the professional bloggers going to write about when there are no NYT stories for them to link to and comment on?

  michael holloway [03.29.07 09:49 AM]

Filter blogs, or the 'Copyists' as Peter Wayner calls them: I agree with Scott Carpenter, "...it's a convenience to read it in one place."


I think it'll be one of the 'shapes' that hangs on over the years. For example, I have a favourite filter right now; it's like getting branded with the NBC logo. My cerebral pathways have become accustomed to the look of that page. Every morning, between 7:00 and 9:00, the editor is online, responding to comments.
