Fri, Feb 8, 2008

Tim O'Reilly

Reuters CEO sees "semantic web" in its future


At Money:Tech yesterday, I did an on-stage interview with Devin Wenig, the charismatic CEO-to-be of Reuters (pending the still-uncompleted merger with Thomson). Devin highlighted what he considers two big trends hitting financial (and other professional) data:

  1. The impact of consumer media on professional media. As young people who grew up on the web hit the trading floor, they aren't going to be satisfied with text. Reuters needs to combine text, video, photos, internet, and mobile into a rich, interactive information flow. However, he doesn't see direct competition from consumer media (including Google), arguing that professionals need richer, more curated information sources.

  2. The end of benefits from decreasing the time it takes for news to hit the market. He traced the quest for zero latency in news, from the telegraph, the early stock tickers, and the news business that Reuters pioneered, through to today's electronic trading systems. (Dale Dougherty wrote about this yesterday, in a story about the history of the Associated Press.) As we reach the end of that trend, with information disseminated instantly to the market via the internet, he increasingly sees Reuters' job as making connections, going from news to insight. He sees semantic markup that makes it easier to follow paths of meaning through the data as an important part of Reuters' future.

Devin's point about the semantic web was thought-provoking. Ultimately, Reuters' news is the raw material for analysis and application by investors and downstream news organizations. Adding metadata to make that job of analysis easier for those building additional value on top of your product is a really interesting way to view the publishing opportunity. If you don't think of what you produce as the "final product" but rather as a step in an information pipeline, what do you do differently to add value for downstream consumers? In Reuters' case, Devin thinks you add hooks to make your information more programmable. This is a really important insight, and one I'm going to be chewing on for some time.
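
To make the "hooks" idea concrete, here's a minimal sketch (in Python, with entirely made-up field names rather than any actual Reuters schema) of what a downstream consumer might do with news that ships with publisher-supplied entity metadata:

    # News items carrying publisher-supplied entity metadata ("hooks"),
    # plus a downstream consumer that filters on those hooks.
    # Field names (entities, ticker, topics) are hypothetical.

    news_items = [
        {
            "headline": "Drugmaker reports positive Phase III results",
            "body": "Full story text...",
            "entities": [{"type": "Company", "name": "Example Pharma", "ticker": "EXPH"}],
            "topics": ["clinical-trials", "healthcare"],
        },
        {
            "headline": "Chipmaker guides revenue lower",
            "body": "Full story text...",
            "entities": [{"type": "Company", "name": "Example Semi", "ticker": "EXSM"}],
            "topics": ["earnings", "technology"],
        },
    ]

    def items_mentioning(ticker, items):
        """Yield stories whose metadata tags the given ticker -- no text parsing needed."""
        for item in items:
            if any(entity.get("ticker") == ticker for entity in item["entities"]):
                yield item

    for story in items_mentioning("EXPH", news_items):
        print(story["headline"])

The point isn't this particular format; it's that the publisher, not the downstream programmer, does the work of saying what the story is about.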

That's a really good case for the Semantic Web, and one that I hadn't understood before. It's not about having end users add semantics for the love of it. That's just overhead, which is why I've always argued against it, preferring the kind of implicit semantics that come from applications that harness user self-interest. But professional publishers definitely have an incentive to add semantics if their ultimate consumer is not just reading what they produce, but processing it in increasingly sophisticated ways.

But even if Devin is right about one role of a publisher being to add value via metadata, I don't think he should discount the statistical, computer-aided curation that has proven so powerful on the consumer internet. (Curation is that part of the publisher's job that consists of choosing and arranging the content that is presented to the ultimate consumer -- reading the slushpile, if you will, so that others don't have to, and making sure that the most important material gets its day in the sun.)

Explicit semantic markup has thus far not proven anywhere near as powerful as techniques for mining implicit semantics, or as the design of applications in which implicit semantics are created by users simply by "living as and where they live." (Facebook's "social graph" is the latest example of this kind of implicit semantic application.) Much of the success of the consumer internet has resulted from innovations in curation. After all, PageRank is a kind of automated curation via collective intelligence, as is Flickr's interestingness algorithm, user voting on Slashdot and Digg stories, and even community editing of Wikipedia.
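
For readers who haven't looked under the hood, here's a toy version of the power-iteration idea behind PageRank (a rough Python sketch with a made-up graph, not Google's actual implementation) -- the "curation" emerges purely from who links to whom:

    # Toy PageRank: rank pages by the link structure of who points to whom.
    # The graph and damping factor below are made up for illustration.

    def pagerank(links, damping=0.85, iterations=50):
        """links: dict mapping each page to the list of pages it links to."""
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
            for page, outlinks in links.items():
                if not outlinks:                      # dangling page: spread its rank evenly
                    for p in pages:
                        new_rank[p] += damping * rank[page] / len(pages)
                else:
                    share = damping * rank[page] / len(outlinks)
                    for target in outlinks:
                        new_rank[target] += share
            rank = new_rank
        return rank

    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a", "c"]}
    print(sorted(pagerank(links).items(), key=lambda kv: -kv[1]))

No editor ever looks at the pages; the collective linking behavior of millions of authors does the curating.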

Devin is completely right, though, about consumer media changing expectations for professional media. I see a lot of future upset in enterprise software as well as in media, as consumer phenomena like mashups and social networking change ease-of-use expectations of applications like CRM and business reporting. But it seems to me that mainstream media needs to learn not just about multimedia, but also about new sources of information.

A huge part of the generational change is a change in expectations of transparency, informality, and sources of authority. So when Devin says that Google isn't terribly useful for professional uses like financial research, I think he misses just how much authority bloggers are getting as reliable news sources, and how people are using tools like iGoogle to pull together targeted RSS data feeds. Raw Google results may be less useful than Reuters-filtered results, but how about community or expert-curated Google results? Just as Reuters' customers are adding value to the Reuters data stream, they are capable of adding value to the Google data stream. And there are increasingly powerful tools for managing that stream.
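
As a sketch of what "community or expert-curated" results might look like in practice: a hand-picked list of trusted feeds plus a watchlist. This uses the third-party feedparser package, and the feed URLs and keywords are placeholders, not real sources:

    # Pull a hand-picked list of trusted RSS feeds and surface only the entries
    # that match a watchlist. Feed URLs and keywords are placeholders.

    import feedparser

    TRUSTED_FEEDS = [
        "http://example.com/finance-blog/rss",       # hypothetical expert blogger
        "http://example.com/sector-analysis/rss",    # hypothetical analyst feed
    ]
    WATCHLIST = ["cancer vaccine", "clinical trial", "merger"]

    def curated_entries(feeds, watchlist):
        for url in feeds:
            for entry in feedparser.parse(url).entries:
                text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
                if any(term in text for term in watchlist):
                    yield entry.get("title"), entry.get("link")

    for title, link in curated_entries(TRUSTED_FEEDS, WATCHLIST):
        print(title, "->", link)

Swap in a community's shared feed list (or a colleague's OPML file) and you have a crude but genuinely useful filter sitting on top of the open web.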

What I think does ultimately matter is the ability of professional media to build specialized interfaces and vertical data stores that are suited to their niche, hopefully harnessing data and services from the consumer internet, and mashing them up with specialized, perhaps private, data stores. Put that together with metadata for programmable re-use, and you may really have something.

On the end of timing arbitrage due to zero latency in information distribution, I have to disagree with Devin. There's still a huge amount of information that never hits the market because people aren't paying attention to the right things. David Leinweber, who spoke at the conference yesterday, gave a great example of a huge price move in the stock of a pharmaceutical company as a result of news of successful clinical trials for a cancer vaccine. Every single news story on the subject resulted from the company's press release, yet there was information available on the web months in advance of the story. Leinweber's point: it isn't about speeding up news distribution, but about getting better access to the sources of the news. As he said: "You have to get the news before the news people get there."
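
In the spirit of Leinweber's point, a bare-bones sketch of what "getting there before the news people" might look like: watch the primary sources directly and flag signals for a human analyst. The URL and keyword list here are placeholders, not a real registry:

    # Poll primary sources directly and flag keyword hits before a press release appears.
    # SOURCES and SIGNALS are placeholders for illustration only.

    import urllib.request

    SOURCES = ["http://example.org/trial-registry/company-xyz"]
    SIGNALS = ["phase iii", "met primary endpoint", "statistically significant"]

    def scan_sources(sources, signals):
        for url in sources:
            try:
                page = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
            except OSError:
                continue                              # source unreachable; try again next pass
            hits = [s for s in signals if s in page.lower()]
            if hits:
                yield url, hits                       # hand off to a human analyst

    for url, hits in scan_sources(SOURCES, SIGNALS):
        print("possible early signal at", url, ":", hits)

The hard part, of course, is knowing which sources to watch -- which is exactly where human judgment still earns its keep.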

Even in areas like company financials, the bread and butter of equity analysis, is there any reason why earnings are still reported only quarterly? There may one day be a company that breaks ranks and shows its data in real time (as it increasingly does to company executives). The expectation of radical transparency may one day reach even into this relic of the 19th century.

I also think there's a huge opportunity to get to data sooner via the sensor revolution. When phones report location (disclosure: O'Reilly AlphaTech Ventures investment), when phones listen to ambient sound, when credit cards report spending patterns (disclosure: OATV investment), when cars report their miles traveled, when we're increasingly turning every device into a sensor for the global brain, there will be more and more sources of data to be mined.

And yes, we'll need humans (at least for a while) as the last mile to extract meaning from all that data. But we'll want those humans to be augmented with tools that notice patterns and exceptions as they happen, not just after the "news" hits.


Comments: 13

  Tommi [02.08.08 12:28 PM]

Great post, Tim! If you're going to keep at this, I'd really love to hear more about how you think non-profits could better fit into this conglomeration of new trends...

  Kevin Farnham [02.08.08 12:58 PM]

1. I found Devin's comments about focusing on the expectations of the new generation of traders very interesting as well. The late Thursday afternoon panel "What Do Hedge Fund Managers Really Want?" talked about the huge generational expectations gap between institutional money managers in their 50s and incoming professionals in their 20s. It sounds like Reuters sees the necessity for revolutionary change in this area -- opening of the trading desk to the rich content available on the Web (whereas many established institutional investment firms block access even to Facebook from within the corporate walls).

2. I loved Devin's comment about Reuters gaining their first latency competitive advantage 150 years ago through the use of carrier pigeons to transport information.

The semantic web seems to have fallen somewhat off the radar screen over the past few years -- the Web is simply growing so fast in other ways (for example, Web 2.0). But, one of the things Money:Tech highlighted for me is that the Web is not a one-size-fits-all entity. What's great for normal users, and even basic business use, does not by any means meet the needs of institutional traders. And, clearly, what we saw at Money:Tech represents the heart of the global financial system. Their success or failure affects us all.

The semantic web may not be critical for normal users; but for Wall Street, appropriate semantic tagging of content is significantly advantageous. Just as we have special purpose corporate VPNs, encrypted networks, etc., along with the open Internet channels we all use, financial professionals really do need a specialized semantic web for their own use. Searching for information at Google speed and manually sifting through the results simply won't accomplish the task at hand.

Another point from the conference: we'll always need humans to vet the results of automation. I've worked on this stuff for decades, and you always have to monitor the results. The simplest way to do this is using visualization (we saw a lot of examples of this at Money:Tech as well).

As Richard Bookstaber said, the algorithms make assumptions that may be true 99.9% of the time, but they're not true 100% of the time. This is why we get computer-generated crises like 1987, Long Term Capital Management (I've always laughed at that name, since their strategy was very short-term trades based on modeling a tiny slice of historical data), and whatever crisis happened this past August with the quant funds (apparently they all lined up in the same direction and something went awry). In the case of financial algorithms, one assumption is that the markets have essentially unlimited liquidity: it will always be possible to actualize the trades the model wants to make in the real marketplace. This simply isn't true!

Anyway, it was a great conference. Thanks for putting it together, and thanks for writing this post, letting people know that the Semantic Web is still needed, its value continues to grow (at least for specialized audiences), and its framework is being constructed today by Reuters.

  Jonathan C. Hall [02.09.08 12:05 PM]

Tim, I deeply appreciate the follow-up to your fireside chat with Devin Wenig, especially because I was the guy standing at the microphone with a question when the session ended. (Note: This is entirely my fault -- being shy, I hesitated to step up and got there too late.)

If I had asked my question, it would have gone something like this: Let's suppose Reuters is successful at building the content-rich, multimedia and marked-up *professional* Web Wenig described. And let's suppose -- notwithstanding your suggestion of an even more valuable data mine to come -- that the Reuters Web shows some promise of delivering real value to traders and entrepreneurs (i.e. professionals) in the form of high-quality, structured textual and multimedia data with zero latency that goes beyond its *descriptive* meaning and yields some *predictive* meaning, as Wenig suggests it must to be competitive. Is Reuters thinking about ways of creatively monetizing its asset for professionals as its counterpart Google has for consumers? And if not, should they?

I only ask because it may be that the kind of industry "innovation" and "cross-pollination" that business leaders idealize (and which conferences like Money:Tech are intended to spur) can sometimes be spurred by scrappy young start-ups and hackers who will prefer to use the free Web rather than pay high subscription fees to create their new, game-changing applications, leaving Reuters somewhat behind the curve. (We saw plenty of examples of this at the conference: FirstRain, SkyGrid, etc.) In online services, "professional" is often synonymous with "expensive." And it should not escape Reuters' attention that many folks in financial services who pay the fees have grown up with Reuters feeds at their fingertips and don't necessarily see in them the kind of potential Wenig (or a salivating hacker) does.

Has Reuters considered these possibilities? I would have loved to hear Wenig talk about what Reuters is doing to broaden its market share, besides marking up its content and evangelizing about it. I wonder if alternative business models are even possible, and if there are any such models on Reuters' radar.

  Tim O'Reilly [02.09.08 02:58 PM]

Jonathan,

As you know, I sent your question on to Devin, and he replied:

" Thanks for the comment and I'm glad this post and my session has been getting so much commentary. The short and simple answer to your question is that we now (it is only over the last few years) have a mix of business models and will continue to diversify what we do and how we make money. Subsciptions for high end professionals are a big part of our revneue and will continue to be, as that is the model that is most preferred by our clients (in particular high end finance and media professionals). But the fastest growing parts of our business are completely advertising supported (reuters.com, all its international variants and all our mobile properties) and transaction based services where we get paid only when people take action like trading.

For us, the key is to segment our services and provide horses for courses. There are some services, like our Calais tagging service which we just announced, that have no fee at all. Anyone is free to use it, develop on it, etc., and our motivation is that by marking up rich blogs with our language, we can incorporate that content into our services and provide a better content experience up and down the stack.

So in summary, we are broadening our customers (7.5 mm uniques last month on .com, and the web's third largest business news site now) and we have diversified our model and will continue to do it, although for now we do so in a measured way so that we can provide the services that high end professionals require while also serving a new "consumer" audience in the way that the web demands.

Hope that answers the question."

  Mahesh CR [02.09.08 09:26 PM]

Wow!! An excellent post followed up by insight from Devin himself.

Let me confess, I started out being a little skeptical. In an age of Twitter, Digg, and Delicious, it's easier and easier to keep one's ear to the ground and catch the news. Latency, as Devin rightly points out, is irrelevant when citizen journalists, equipped with GPS- and camera-enabled mobiles, are transmitting news and gossip in real time.

Providers of news will have to evolve from being channels of news to catalysts for insight and action.

What is very impressive though is the turnaround for the query and response that just occurred here. Imagine, a real world conference, a post about an interview with Devin, a question that was not asked during the session followed up and answered within comments!

  Andrew Walkingshaw [02.10.08 05:57 PM]

Thanks for these posts - the news from Money:Tech's been fascinating. I've written about a couple of the recent posts on Radar here, but one of your phrases above really struck a chord:

(on adding value to data) In Reuters' case, Devin thinks you add hooks to make your information more programmable. This is a really important insight, and one I'm going to be chewing on for some time.

It's a problem in, or at least around, science too - or, at least, it's a preoccupation of mine. How do you make it easier for scientists to write programs to analyse the results of their experiments or simulations? This idea - adding hooks - is a really pithy way of putting it. Thank you!

  Jeff [02.11.08 07:42 AM]

Excellent post. The portion of the post that struck me the most was the decreasing time for information to impact a market. As we are seeing typical business cycles (shopping, reading information, etc.) shrink in time, the access to information (and urgency attached to it), becomes paramount.

How can companies assist those that want instant information, flavored just the way they want it? Devin hits on this clearly: content producers can originate content and allow it to travel into these customized, time sensitive channels.

Will this new way (and speed) of accessing information create new markets? I think it will. And with our devices learning and advocating for our likes and dislikes, we will soon be able to obtain any piece of information instantly, without even requesting it. How will that change the way we go about our lives?

Now that is an opportunity......

  Alex Tolley [02.11.08 08:44 AM]

"So when Devin says that Google isn't terribly useful for professional uses like financial research, I think he misses just how much authority bloggers are getting as reliable news sources, and how people are using tools like iGoogle to pull together targeted RSS data feeds. Raw Google results may be less useful than Reuters-filtered results, but how about community or expert-curated Google results? Just as Reuters' customers are adding value to the Reuters data stream, they are capable of adding value to the Google data stream"

This reminds me of Dow Jones Markets before they were bought out. The senior executives believed that their curated, "noise free" data and proprietary network would be unassailable. They were dead wrong. Reuters and other organizations put great stock in the value of their organizations to deliver high quality data (prices and news) that is already tagged with stock symbols. Yet Wikipedia showed that Britannica's expert curation was no better than the public contributors and less timely, and we know that the great news organizations are not necessarily as good as selected bloggers in specific domains. I suspect Reuters will find that out too.

It seems clear to me that once the barrier to entry by cost is removed, the actual content quality barrier is quite low and easily hurdled. Furthermore, coverage is always going to be narrower than that which could be provided by the interested population.

In the financial markets, it is information you have that others don't that is valuable. When everyone receives the same data feed, latency is an issue. But when information flows are much greater and patchy, the gains will go to those who can integrate disparate data and connect the dots.
This suggests to me that the future lies in the public domain, with value accruing to those who can offer better ways to extract useful information from the noise.

  Tom Wilde [02.11.08 12:47 PM]

Great post Tim. There's a broader question at play here as well. Google has dominated the text space by creating an asymmetrical view of the metadata regarding a piece of content (PageRank, etc.). With this view they have been able to establish themselves as the gateway to the content. As Devin describes, creating the authoritative meta record for a piece of content puts the content producer in the advantaged position of being able to leverage that information to create complex syndication, distribution, access, and advertising rules. This enhanced programming will ensure the content is delivered with the brand, context, and advertising opportunity intact. This will also further enhance the ability of specific communities to add the social graphing overlay to a piece of media, in a way that is relevant to and reflective of that particular community.

Specifically regarding video and audio content, content producers *must* put the requisite capabilities in place to ensure that this information asymmetry cuts in their favor if they are to fully leverage the assets they have.

Tom Wilde
CEO
EveryZing

  Terry Jones [02.12.08 05:27 AM]

Hi Tim


You touch again on why I think it's so important that we move towards an architecture in which objects are not owned. One of my examples is about mashups. Mashups are cool and valuable, as we all know. But those who write them have a problem: where should they put the data they create? Sure, it can be delivered in HTML to a browser, but what if you want to permanently put the new information somewhere so that others can access it and build on it? Today that would require you to run your own server, with your own DB, to create your own API, and to document it all. Others wanting to use your information for further mashing will then be able to, but you'll have created yet another hoop to jump through to get your data.


Of course I argue that the "place" to put the additional data is with the original data. So financial services, or semantic web applications, or just normal people can simply tack their additional information onto the appropriate existing objects. Then that new data can be accessed in a uniform way by future agents, who can in turn add new information. If you or a program you write wants to add value to this system, all you need do is alert people to the new attributes (or others can find them simply by examining the attributes on the original objects and looking at their descriptions).


I know, you already get it! I see so many people banging their heads against these sorts of issues. As you put it to Ray Ozzie, changing the model of ownership & control really does cut the Gordian knot for things like this. I feel like I'm forever pointing this sort of thing out.

  Chris Vail [02.12.08 11:31 PM]

As I think about the content of newspapers in the US back in the 20th century, I am struck by how much cultural leadership the New York Times had. The new technologies you mention will completely change that cultural relationship; the "newspaper of record" will be an archaism. What will this do to nationalism, when information about the rest of the world is no longer being filtered by elite national institutions?

  Jack Culver [10.14.08 11:40 AM]

The ideas that Devin and his team came up with sound good, but I see two problems. The semantic data is also created by humans, in the news or feeds that Reuters has, so Google or any other corporation can do the same thing, and actually is doing it. Reuters' focus on reaching professional users with hundreds of different products will fail at some point, and the semantic web will not work. The program Reuters came up with, Calais, can be useful, but they are late to the game.

I would not be surprised to see Google, Wikipedia, or any major blog system become an important competitor of Reuters in 5-8 years, even on the professional usage side.

  Krista Thomas [01.14.09 09:06 PM]

To follow up, one year later, the value the Thomson Reuters Calais initiative offers to publishers has come into focus.

We are now going beyond tagging to make it easier for publishers to enhance the value of their content, improve the reader experience and connect to the emerging linked content economy.

We automatically connect publishers to the exploding ecosystem of Linked Data assets, and help them syndicate their metadata to downstream readers via search engines, news aggregators, ‘related stories’ recommendation services, etc.

See more details on OpenCalais.com.
