Word-Based Adaptive OCR for Historical Books

V. Kluzner, A. Tzadok, Y. Shimony, E. Walach, A. Antonacopoulos

Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), Barcelona, Spain, July 2009, pp. 501-505

Abstract

This work proposes a new approach to recognizing historical texts: an adaptive mechanism that automatically tunes itself to a specific book. The system clusters all similar words in a book or text and handles each resulting class as a whole. The paper describes the architecture of such a system and the new algorithms developed for robust word image comparison (including registration, optical-flow-based distortion compensation, and adaptive binarization). Results for a large dataset are also presented, demonstrating a recognition improvement of over 23%.
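
The word-clustering idea in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the paper compares word images using registration, optical-flow-based distortion compensation, and adaptive binarization, whereas the sketch below substitutes a plain pixel-agreement score on already-binarized, equally sized images. The names `similarity`, `cluster_words`, `label_classes`, and the `threshold` parameter are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

# A binary word image: rows of 0/1 pixels (assumed already binarized).
Image = Sequence[Sequence[int]]


def similarity(a: Image, b: Image) -> float:
    """Fraction of matching pixels; 0.0 when the shapes differ.

    Stand-in for the paper's robust word image comparison.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def cluster_words(images: List[Image], threshold: float = 0.9) -> List[List[int]]:
    """Greedy clustering: an image joins the first cluster whose
    representative (its first member) is similar enough, else starts a new one."""
    clusters: List[List[int]] = []
    for idx, img in enumerate(images):
        for members in clusters:
            if similarity(images[members[0]], img) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


def label_classes(clusters: List[List[int]],
                  labels_by_rep: Dict[int, str]) -> Dict[int, str]:
    """Propagate the label of each cluster representative to every member,
    so that a whole class of word images is handled at once."""
    out: Dict[int, str] = {}
    for members in clusters:
        label = labels_by_rep.get(members[0])
        if label is not None:
            for i in members:
                out[i] = label
    return out
```

Under these assumptions, recognizing or correcting one representative per cluster and passing it to `label_classes` assigns the same text to every occurrence of that word in the book, which is the sense in which an entire class is handled together.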

Citation

V. Kluzner, A. Tzadok, Y. Shimony, E. Walach, A. Antonacopoulos, "Word-Based Adaptive OCR for Historical Books", Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), Barcelona, Spain, July 2009, pp. 501-505

DOI

10.1109/ICDAR.2009.133

Related Projects

IMPACT - Improving Access to Text