Jul 3

Peter Brantley

Science Direct-ly into Google

ScienceDirect (SD) is a compendium of scientific, technical, and medical (STM) literature from Reed Elsevier, one of the world's largest publishers. SD, most often made available on a subscription or licensing basis to large institutions like universities, biopharmaceutical firms, and other research or health-related companies, claims to contain approximately "25% of the world's science, technology and medicine full text and bibliographic information." SD is an expensive and often contentious product in higher education because of steep year-on-year price increases, but it is nonetheless highly desirable.

Its absence from Google Scholar, Google's search interface for scholarly material, was therefore notable. Scholar has become tremendously popular for focused searches of the scholarly literature, not only among academics and students but also among seekers of health information and other science-based data. Elsevier has long supported its own search interface for scholarly literature, Scopus, so it was no surprise to many that the company stayed out of Scholar. However, it doubtless lost eyeballs as more and more of this traffic migrated to the freely available Scholar product.

Elsevier has now undertaken to have the majority of its SD journals (those for which it holds or can obtain the copyrights) crawled and indexed by Google. Both Google and Google Scholar are gradually incorporating more of this content, which will appear in their search results.

Ale de Vries, the SD product manager, informs me in an email:

About Google/Google Scholar: we're making good progress. As you may be aware, we did a pilot with some journals on SD first, and now we are working to get them all indexed. We're making good progress there - it's a lot of content to be crawled, but going along nicely. Both Google Scholar and main Google are gradually covering more and more of our journals.

This is notable for a wide range of reasons. One of the most prominent is that Elsevier clearly feels comfortable having its core intellectual property crawled and analyzed by Google to augment discovery. In contrast to the publishers behind the various European newspaper lawsuits, Elsevier evidently believes that even with only the basic, essential tools available today (robot exclusions, sitemaps, and business agreements), encouraging greater content exposure does not impede its ability to execute business strategy.

While this type of scholarly literature is often more opaque to the public than publisher- or library-based digitization programs, it is at least as important, if not more so, given the number and relevance of searches in the critically important fields of science, technology, and health. Google's ability to index this massive quantity of information will bring it significant benefits; Elsevier will obviously profit as well.

For Google, as with its Books program, the gains from indexing the world's STM information revolve not merely around the data itself, but around the linkages it can form between that data and the other information to which it has access, including geospatial information, data from books, historical/timeline data, biographical data, government documents, and so forth. Clearly the rewards from this mass of material for searchers are tremendous, almost overwhelming.

Both information seekers and publishers bear the responsibility of remembering that the Lens of Google through which we increasingly view the world is only one lens, albeit one that sees farther and farther.

tags: publishing

1 TrackBack

» ScienceDirect/GoogleScholar from pintiniblog

Well, well. We learn from O'Reilly Radar that the content of ScienceDirect (Elsevier) will gradually be integrated into both Google and Google Scholar. And where does Scopus fit into all this?

Comments: 6

  Ale [07.04.07 04:21 AM]

Hi Peter - that loss of eyeballs is not as big as people would probably think. SD is by far the largest STM content platform in the marketplace and we have been able to run true A-B tests for inclusion and exclusion of content in Google. While there is an increase in usage in most content areas, it is nowhere near as large as some would expect it to be. For example, in the Health areas the net increased usage is only 5%. Although 5% is only a slight increase, we do think it is important to make content discoverable in the places users are starting their searches - which, nowadays, also includes Google Scholar and "main" Google.

  Don Marti [07.04.07 12:57 PM]

"Its" core intellectual property? Taxpayer money, tuition, and donations pay for the actual research and writing the actual papers. Editorial board members paid by institutions do the review. Elsevier is just a pirate.

  WoW!ter [07.04.07 03:49 PM]

Interesting. Thanks for sharing. But how will it affect Scopus and Scirus in the first place? Your guess is as good as mine. And will it affect Thomson too?

  Jeroen Bosman [07.05.07 03:20 AM]

Elsevier journals and authors will gain from this extra exposure. The added value of Scirus will diminish, although many people use that for searching content other than Elsevier journals, such as patents and science websites. Institutions licensing Scopus will not have done so because of coverage of Elsevier journals but because of its advanced functionality and references and citation information, so I do not expect too much change there because of this deal. Scholar's big advantage and minor disadvantage is its simplicity and full text search capability.

  Dennis McDonald in Alexandria, Virginia USA [07.05.07 04:23 AM]

Ale's comment makes a great deal of sense. In highly specialized areas there is a limit to the number of specialists that are interested in specific topics. So massive growth in demand for Elsevier content through improved Google based findability is unlikely.

Also I would think that specialized services that provide additional content and searchability will still appeal to niche markets (that can afford to pay).

Where I would hope to see significant impacts would be on the "edges" of specialized areas where cross-discipline communication opportunities exist. Discipline A may be unwilling or unable to subscribe to highly specialized indexing services in Discipline B, and as a result they may remain unaware of potentially useful or important topics in Discipline B. But a Discipline A researcher might be more likely to use a low-cost Google service that helps locate the occasional Discipline B article. So even though Elsevier gives up some hypothetical "findability" revenue via Google indexing, Elsevier benefits by generating demand for Discipline B articles among Discipline A.

Dennis D. McDonald
Alexandria, Virginia USA

  David [07.06.07 06:55 PM]

Not that anyone writing here seems confused about this, but I'd still like to note that Ale's 5% statistic represents the increase in traffic due to improved indexing of ScienceDirect, not the entire role that Google and Google Scholar play in bringing people to Elsevier publications or even ScienceDirect itself. Google and Google Scholar search results include Elsevier publications not only due to what Google's machines find on ScienceDirect, but also due to indexing of directories such as PubMed, citation analysis (in the case of Google Scholar), and people placing individual full text papers on the public web. I have sometimes used the Google Scholar "library links" feature to click through to ScienceDirect via my library in cases where Google Scholar did not directly link to ScienceDirect.
