After a 16-year, University-led effort, the earliest printed texts of modern-era writers will now be available online for free.

The University Library, the University of Oxford’s Bodleian Libraries and the company ProQuest have collaborated to make more than 25,000 texts printed between 1473 and 1700 available through the University of Michigan library’s website.

According to a press release, this effort is only the first phase of the Early English Books Online Text Creation Partnership, which began in 1999.

In an interview Thursday, Aaron McCollough, editorial director for Michigan Publishing, said the texts that will be available include works by Shakespeare, Chaucer and Homer.

“The selection process focused on books that were already believed to be very important, for which high demand would exist,” McCollough said. “The works of famous 17th-century playwrights, prominent philosophers, sermon literature: there are around 25,000 of the ‘Greatest Hits,’ in a way, of the 17th century.”

ProQuest, an Ann Arbor-based company, created scanned images of these texts in 1999 and published them as a database called Early English Books Online, but was unable to convert the images into searchable digital text.

The effort across the libraries relied on a process called double-keying, in which two different people type each text, character by character, from the print documents. Optical character recognition software can transcribe modern printed works, but older texts use typefaces that the software can’t recognize.
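The article does not say how the two typed versions are reconciled. A common approach in double-keying is to compare the two keyings and send any disagreement to a reviewer. The Python sketch below illustrates that comparison under this assumption; the function name and the sample line are hypothetical and are not taken from the project.

```python
from difflib import SequenceMatcher


def find_disagreements(keying_a: str, keying_b: str) -> list[tuple[int, str, str]]:
    """Return (position_in_a, text_in_a, text_in_b) for each span where the keyings differ."""
    # autojunk=False keeps the matcher from ignoring frequent characters in long texts.
    matcher = SequenceMatcher(None, keying_a, keying_b, autojunk=False)
    disagreements = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            disagreements.append(
                (a_start, keying_a[a_start:a_end], keying_b[b_start:b_end])
            )
    return disagreements


# Hypothetical sample: two typists key the same line, and one of them
# modernizes the long s that early printers used.
typist_one = "To be, or not to be, that is the queſtion"
typist_two = "To be, or not to be, that is the question"

for position, text_a, text_b in find_disagreements(typist_one, typist_two):
    print(f"position {position}: {text_a!r} vs {text_b!r}")
```

Spans where both typists agree are accepted as they stand, so a human reviewer only needs to check the flagged positions against the page image.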

The Council on Library and Information Resources; Jisc, a digital solutions charity; and more than 160 other libraries also partnered in the project. McCollough said the international collaboration is one of the first of its kind.

“It also is a kind of revolutionary funding model for a big knowledge project in humanities — this idea of (sic) consortial collaboration, of multiple libraries contributing a feasible amount of money to a project so that something is made possible that wouldn’t be possible for any single library to fund,” McCollough said. “That kind of model is being replicated in other contexts, but this is one of the first big projects like that.”

Overall, McCollough said the project will be vitally important for English culture as a whole.

“This was a very important thing in terms of free culture and preserving what is a fundamental set of texts in the history of English-language culture,” McCollough said. “It’s also a very large set of humanities data that can be mined and repurposed for digital humanities projects, and this is one of the great promises of the material at this point: to see what scholars will do with these texts in the digital realm.”

The partnership hopes to release an additional 40,000 texts to the public by 2020.
