Distributed Proofreaders Completes 10,000th Book

Juliet Sutherland wrote in email on Friday:

Today Distributed Proofreaders (DP)
posted a package of texts that takes us over 10,000 completed titles.
I’m very proud of our community of volunteers who have accomplished
this beautiful number. The 10K package (listed below) showcases the
wide range of our volunteers’ interests and talents. For those of you
who may not know, DP makes accurate
e-text transcriptions of public domain printed material. The results of
our efforts are available in HTML and plain text formats, primarily
through Project Gutenberg (PG).
Scanning and raw OCR are the first step in digitizing printed material.
DP takes the next step to make accurate versions that can be
reflowed, re-sized, cut and pasted, accurately searched, analyzed and
so on.

For those who like numbers, the 10,000 titles represent around 3
million pages. We have produced books in over 20 languages and just
under 15% of our production is in languages other than English. About
700 volunteers log in each day, and about 3000 different volunteers
log in over a 30-day period. We have approximately 5600 projects in some
stage of preparation.

…DP is one of the oldest examples of peer
production on the ‘net.

This last point is important. People are all excited about Amazon’s Mechanical Turk, but Distributed Proofreaders pioneered that same methodology of breaking up a large task into small units that are farmed out to large numbers of volunteers. (Volunteer here.)

Juliet continued:

Please have a look at these ebooks. I think you’ll be impressed.

Plantarum: Monandria, Diandria and Triandria
  by Carolus Linnaeus
(Carl von Linné) 1753   Latin  first three sections of the classic
botanical reference

for beginners, Rev. ed. by Charles William Burkett, Frank Lincoln
Stevens, and Daniel Harvey Hill. 1914   a textbook-JS

de Venise
by Shakespeare, trans. by M. Guizot. original 1821,
transcribed edition 1862   French

Shanty Book, Part I, Sailor Shanties
by Richard Runciman Terry
(1864-1938)  1921  includes music to listen to-JS

annals of the Cakchiquels: The original text, with a translation,
notes, and introduction
by Francisco Ernantez Arana (fl. 1582),
trans. by and edit. by Daniel G. Brinton (1837-1899)  1885  
English/Cakchiquel Mayan  check out the side-by-side translation-JS

Encyclopedia of Needlework, by Therese de Dillmont  originally from 1884  still
in print although we worked from an old version. Amazing illustrations,
huge file-JS

R. Caldecott’s First Collection of Pictures and Songs by Randolph
Caldecott.  [1900-1909?]  8 of his most popular books-JS

or, A Discourse of Forest-Trees
by John Evelyn  (1620-1706)  1664

or the London Charivari, Volume 159, October 27, 1920  the
280th issue DP has completed.-JS

by Johanna Spyri, 1890 German

by Johanna Spyri, trans. Elisabeth P. Stork, with an intro by Charles
Wharton Stork, A.M. PhD, Illustrations by Maria L. Kirk. Gift edition.
1919  great illustrations-JS

Triplanetary by E.E. Smith  1934  this version has not been reprinted-JS

atravessei Àfrica (v. II), by Alexandre Alberto da Rocha de Serpa
Pinto (aka Serpa Pinto)  1881 Portuguese

Annual Report of the Bureau of Ethnology to the Secretary of the
Smithsonian Institution, 1886-1887, ed. John Wesley Powell  part
of an ongoing effort to transcribe all of these important works about
Native Americans-JS

Narratives, Oklahoma
(A Folk History of Slavery in the United
States From Interviews with Former Slaves)  Works Project
Administration Federal Writers' Project 1936-1938  Part of
another ongoing project which is now over half complete-JS

Note that some of these books are still in print, but out of copyright. This is a good example of the nuance with which we need to approach discussions of book digitization. There are many works that are still useful that are out of copyright. But what DP and Project Gutenberg are doing is a great service even if these books are still in print, since they are now computer-accessible. The digital version is another valuable manifestation, just like you might have a hardback, a paperback, and an audio version of the same book.

(The variety in the list also shows the range of what’s now available in Gutenberg. There are two books I own in there: the Encyclopedia of Needlework (which is as amazing as Juliet says) and Triplanetary, the first of E.E. “Doc” Smith’s Lensman series. I loved these books as a kid; turgid and jingoistic as they are, they sure fired the imagination of a twelve-year-old, and I’m now the proud owner of a signed first edition. (That is, of the first book edition, as these stories were originally published serially in pulp magazines.))

