By Yngvil Beyer, National Library of Norway
Digitising large collections of manuscripts and letters as digital images offers the opportunity to convert these sources into machine-readable text and hence make them available for full-text search.
In 2006, the National Library of Norway started an ambitious project: making all printed material available for full-text search through extended use of OCR (optical character recognition). As of April this year, 563,744 books, 3,240,856 newspapers and 75,686 journals are available from our website. Handwritten documents, however, have until now only been digitised as facsimiles. As we all know, handwritten documents vary much more in layout, writing style and spelling, and it has not been possible to use OCR on them.
At the National Library of Norway, the collection of handwritten manuscripts and letters comprises hundreds of thousands of items. At present, close to 26,000 of these items…