A robust hybrid approach for text line segmentation in historical documents

C. Clausner, A. Antonacopoulos, S. Pletschacher

Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), Tsukuba, Japan, November 11-15, 2012, IEEE-CS Press, pp. 335-338

Abstract

Large-scale digitisation of historical documents demands robust methods that cope with the presence of frequent distortions and noisy artefacts. This paper presents a hybrid text line segmentation method that uses a novel data structure and a rule base to combine the strengths of top-down and bottom-up approaches while minimising their weaknesses. The effectiveness of the proposed approach has been methodically evaluated in the context of large-scale digitisation using a standardised framework. Results on a diverse dataset show improved performance over top-down and bottom-up approaches as well as over a leading commercially available system.

Citation

C. Clausner, A. Antonacopoulos, S. Pletschacher, "A robust hybrid approach for text line segmentation in historical documents", Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), Tsukuba, Japan, November 11-15, 2012, IEEE-CS Press, pp. 335-338

Related Projects

IMPACT - Improving Access to Text