The Digital Public Library of America opened shop this week, offering access to a rich repository of books, photos and more. The collection, drawn from libraries and archives across the country, also contains rare historical footage like Kentucky women marching for the vote and news clips of civil rights Freedom Riders.
Is there anything the DPLA doesn’t have? Well, we could start with the 20 million or so digital books that Google has scanned at hundreds of libraries in the last decade. This collection — which is much larger than the 2.4 million records at the DPLA — has gone unmentioned in the course of the new library’s launch this week. Instead, Google’s trove is gathering digital dust as a ceaseless copyright case between the company and the Authors Guild grinds on.
This is a shame. While the DPLA is a beautiful and important endeavor, it also feels woefully incomplete. Its library collection doesn’t include major institutions like Stanford and Michigan (alma mater of Google CEO Larry Page), which have enormous digital resources, so it can hardly be described as “of America.” More seriously, the holes in the DPLA’s catalog show how a once-unified effort to digitize the country’s knowledge has become a patchwork affair.
As I described in Battle for the Books, Harvard librarian Robert Darnton not only pulled the university out of its one-time partnership with Google, but also led an intellectual campaign to stop the company’s scanning plans. Darnton’s actions not only drove legal opposition to a proposed settlement, but also produced lasting resentment within the librarian community — some of whom regard Darnton as a spoilsport and a demagogue.
The result is that the U.S. now has multiple, unconnected repositories that represent a digital fracturing of its cultural heritage. It’s worth noting that the DPLA also comes in addition to the Internet Archive, a long-time pioneer in digital scanning (I found the image at right by searching the DPLA; the results led me to a Connecticut library and then the Internet Archive).
This is in part due to copyright issues. While Google has made a strong case that its scanning activities are fair use, authors fear they will lose control over their works — and many people oppose granting Google or any other private company a role in the country’s library systems.
For now, the DPLA, which is funded by governments and foundations, is not in discussion with Google Books. It has, however, been talking to HathiTrust, a network of university libraries that have connected their digital collections (the libraries obtained the collection as part of the agreement under which Google scanned their books).
“We’re just getting started and are in talks with many large content hubs; yes, we have spoken to HathiTrust and can imagine a very complementary collaboration with them,” wrote Executive Director Dan Cohen.
Google did not return a request for comment.
Correction: An earlier version of this story said that the DPLA partnered with libraries in only six states. According to the official DPLA release, the organization “has partnered with six state and regional digital libraries and an equal number of large cultural heritage institutions— including the National Archives and Records Administration (NARA), the Smithsonian Institution, the New York Public Library, and Harvard University.”