A Complete Approach to the Conversion of Typewritten Historical Documents for Digital Archives

A. Antonacopoulos, D. Karatzas

Document Analysis Systems VI: Proceedings of the 6th International Association for Pattern Recognition (IAPR) Workshop on Document Analysis Systems (DAS2004), S. Marinai and A.R. Dengel (Eds.), Springer Lecture Notes in Computer Science, LNCS 3163, Florence, Italy, September 2004, pp. 90-101

Abstract

This paper presents a complete system that historians/archivists can use to digitize whole collections of documents relating to personal information. The system integrates tools and processes that facilitate scanning, image indexing, document (physical and logical) structure definition, document image analysis, recognition, proofreading/correction and semantic tagging. The system is described in the context of different types of typewritten documents relating to prisoners in World-War II concentration camps and is the result of a multinational collaboration under the MEMORIAL project funded (€1.5M) by the European Union (www.memorial-project.info). Results on a representative selection of documents show a significant improvement not only in terms of OCR accuracy but also in terms of overall time/cost involved in converting these documents for digital archives. This work is supported by the European Union grant IST-2001-33441.

Citation

A. Antonacopoulos, D. Karatzas, "A Complete Approach to the Conversion of Typewritten Historical Documents for Digital Archives", Document Analysis Systems VI: Proceedings of the 6th International Association for Pattern Recognition (IAPR) Workshop on Document Analysis Systems (DAS2004), S. Marinai and A.R. Dengel (Eds.), Springer Lecture Notes in Computer Science, LNCS 3163, Florence, Italy, September 2004, pp. 90-101

DOI

10.1007/978-3-540-28640-0_9

Related Projects

MEMORIAL