A National Scan Center: A Public Works Project

In the course of doing research for some recent testimony before Congress on the National Archives
and Records Administration, I was struck by several facts about how our first National Archivist,
Robert D.W. Connor, met some seemingly insurmountable challenges when he took office in the
mid-1930s.

The biggest challenge was the deluge of paperwork, a situation not very different from what our
national institutions face today. Instead of simply bemoaning the impossibility of swallowing
all the records he would need to establish the National Archives, Connor thought nonlinearly. The
result was the invention of several key technologies: the airbrush to clean paper, the laminator
to protect it, and of course, the microphotograph (now known as microfilm or microfiche), a technology so
successful it reduced incoming paper needs by 95%.

The other challenge that Connor faced with the National Archives, a situation again not very different
from what our national institutions face today, was a paucity of skilled labor. Lucky for Connor
though, the National Archives was born in the middle of the last great depression. Connor went
to Harry Hopkins, and together they went to President Roosevelt, and the result was a Works Progress
Administration program that ran until 1942 to survey federal archives. The program put
3,171 people to work in 1,057 communities and created two important reference aids still in
use today, the Historical Records Survey and the Inventory of Federal Archives.

Just before I testified, I read in the New York Times that the President of France had just
announced a stimulus package of $50 billion. President Sarkozy pledged roughly 2% of that stimulus package,
a full $1.1 billion, towards scanning and digitizing a national archive. I didn’t use the term
Freedom Scans in my testimony, but the fact that the French were far ahead of the U.S. in putting
paperwork into cyberspace seemed a political opportunity.

In the U.S., we face a deluge of paperwork similar to the one we faced in the 1930s. A huge backlog
of paper, microfiche, audio, video, and other materials is located throughout the federal government.
Congress has provided little money for digitization, and bureaucracies have resorted to a series
of questionable private-public partnerships as a way of digitizing their materials. For example,
the Government Accountability Office shipped 60 million pages of our Federal Legislative Histories
(the record of each law from the initial bill through the hearings and conference reports) off to
Thomson West, but didn’t even get digital copies back. Another example is the recent failed effort
by the Government Printing Office to digitize 60 million pages of the Federal Depository Library
Program, an effort they tried to get through as a “zero dollar cost to the government” effort with
the private sector.

There are no free lunches and there are no “no cost to the government” deals. The costs
involve the government effort to supervise the contract, prepare the materials, and ship them, and
in both the GAO and GPO cases, the government wasn’t getting much back for its effort. What the
government and the people usually get is a lien on the public domain, preventing the public
from accessing these vital materials. Similar efforts are sprinkled throughout the government.
I testified to Congress that I had learned that the
National Archives was contemplating a scan of congressional hearings with LexisNexis under
similar circumstances, and many may be aware of the questionable deal the Archives cut with
Amazon where my favorite online superstore got de facto exclusive rights to 1,899 wonderful
pieces of video.

We can learn much from the French leadership on this issue. After my testimony, I went and visited
senior officials at the Library of Congress and the Smithsonian. They all said that while they
had tried to get more congressional interest in digitization, and had tried to go after stimulus
money, so far nobody had had much success. I asked if they had gone hand-in-hand with their
sister institutions to ask for this money, and it was pretty clear that they had not.
Each institution went in one at a time pleading their own special case to congressional staffers
and to officials at the Office of Management and Budget.

There was one more thing I learned about our first National Archivist, which was that he had
backing where he needed it and the political skills to use that backing. One of the big challenges
Archivist Connor faced was getting the agencies to cooperate with him in giving the National
Archives their records. His solution
was leadership: President Roosevelt agreed to host a meeting of a newly-formed National
Archives Council in the Cabinet Room. That, needless to say, got the department secretaries and
agency chiefs to show up, and they elected the Secretary of State as head of the Council. The
Council only met a few times, but that was all it took, and the result was new federal policies
about how agencies should dispose of their records.

There are several agencies in the government that face huge digitization and scanning backlogs,
including the Library of Congress, the Smithsonian Institution, the Government Printing Office,
the National Archives and Records Administration, and the National Technical Information
Service. In addition, there are agencies such as the Government Accountability Office and
the Defense Visual Information Directorate that have valuable archives.

Chairman Wm. Lacy Clay of the Information Policy, Census and National Archives Subcommittee
asked many very informed questions of the panelists, and one that came my way was about costs
for digitization. Today, the widely accepted cost for scanning a piece of paper and running it
through OCR is about 10 cents per page. These are the numbers that you hear from places like
the Internet Archive and Google Book Search, and that’s what I told the Chairman. But I also told
the Chairman that it was my belief that if the government started scanning at volume, those costs
could go down
by half. I also testified about the vastly reduced costs of digitizing video, a task I perform
under a joint venture with the National Technical Information Service using less than $10,000 in
hardware.

If the government invested a mere $100 million of our stimulus package (we’ve already spent over
$72.6 billion), that would mean 2 billion pages of paper or microfiche scanned at the halved
rate of 5 cents per page. For $500 million, we’re talking a huge chunk of our national backlog
being digitized, a task that would result in an enduring digital public work for our modern era,
something that would prove of immense use to future generations, and would also save the
government tremendous amounts of money in storage costs and other facilities expenses.
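
The arithmetic behind these figures can be checked with a short sketch. The Python below is only a
back-of-the-envelope illustration: the function name pages_scanned and the constants are made up
for this example, the 10-cent figure is the one quoted above, and the 5-cent figure assumes the
halving at volume actually materializes.

```python
# Back-of-the-envelope check of the scanning estimates in the text.
# Costs are kept in whole cents so the arithmetic stays exact.

CENTS_PER_PAGE_TODAY = 10     # figure quoted in the testimony
CENTS_PER_PAGE_AT_VOLUME = 5  # assumption: costs "go down by half" at volume


def pages_scanned(budget_dollars: int, cents_per_page: int) -> int:
    """Return how many whole pages a budget covers at a given per-page cost."""
    return (budget_dollars * 100) // cents_per_page


if __name__ == "__main__":
    for budget in (100_000_000, 500_000_000):
        today = pages_scanned(budget, CENTS_PER_PAGE_TODAY)
        at_volume = pages_scanned(budget, CENTS_PER_PAGE_AT_VOLUME)
        print(f"${budget:,}: {today:,} pages at 10 cents, "
              f"{at_volume:,} pages at 5 cents")
```

At 10 cents a page, $100 million covers 1 billion pages; the 2 billion figure depends on the
halved rate.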

What would it take to get the Library of Congress, the Smithsonian Institution, the Government
Printing Office, the National Archives and Records Administration, and the National Technical
Information Service all singing off the same page and working together? There is a tremendous
opportunity for White House leadership here, bringing the parties together and creating a
compelling case on why we should launch and fund a 5-year $500 million effort to create a
National Scan Center. Both the CIO and the CTO in the Executive Office of the President have
talked about the tremendous “moral authority and convening power” of the White House, and I
believe that this issue is of sufficient importance that it would be worthwhile to pursue.
