The Significance of Reading Order in Document Recognition and its Evaluation

C. Clausner, S. Pletschacher, A. Antonacopoulos

Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR2013), Washington DC, USA, August 2013, pp. 688-692

Abstract

Detecting and representing reading order is an important task in many digitisation scenarios that involve preserving the logical structure of a document. Evaluating the reading order results produced by layout analysis methods poses a particular challenge, because the detected segmentation of a page can deviate from the ground truth. To this end, a novel evaluation approach is proposed that addresses the problem by incorporating region correspondence analysis. Furthermore, a sophisticated reading order representation scheme is presented and used by the system; it allows objects to be grouped with ordered and/or unordered relations, a typical requirement for documents with complex layouts such as magazines and newspapers. The evaluation method has been validated using the results of two state-of-the-art OCR / layout analysis systems and a basic top-to-bottom reading order detection algorithm, applied to representative samples from the PRImA contemporary and the IMPACT historical document datasets.
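
To make the grouping idea concrete, the following is a minimal illustrative sketch and not the representation scheme or evaluation method from the paper. It assumes a hypothetical data model (`Region`, `Group`, and the helper names are invented for this example) in which a reading order is a tree of groups, each marked as ordered or unordered, and it includes a naive top-to-bottom baseline that sorts regions by the top-left corner of their bounding boxes.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Region:
    """A layout region (hypothetical model): an id plus its top-left corner."""
    region_id: str
    x: int
    y: int


@dataclass
class Group:
    """Regions and/or nested groups; `ordered` says whether member order matters."""
    ordered: bool
    members: List[Union["Group", str]] = field(default_factory=list)


def top_to_bottom_order(regions: List[Region]) -> Group:
    """Naive baseline: one ordered group sorted by y, then by x."""
    ranked = sorted(regions, key=lambda r: (r.y, r.x))
    return Group(ordered=True, members=[r.region_id for r in ranked])


def flatten(group: Group) -> List[str]:
    """List region ids depth-first. For unordered groups the resulting
    sequence is just one of several equally valid orders."""
    ids: List[str] = []
    for member in group.members:
        if isinstance(member, Group):
            ids.extend(flatten(member))
        else:
            ids.append(member)
    return ids


# Hypothetical newspaper page: a headline followed by two independent
# articles. Each article is ordered internally, but the two articles
# have no fixed order relative to each other.
page = Group(ordered=True, members=[
    "headline",
    Group(ordered=False, members=[
        Group(ordered=True, members=["a1_para1", "a1_para2"]),
        Group(ordered=True, members=["a2_para1", "a2_para2"]),
    ]),
])
```

In this sketch, an evaluator comparing a detected order against `page` would not penalise placing the second article before the first, since their enclosing group is unordered, whereas swapping paragraphs within an article would count as an error.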

Citation

C. Clausner, S. Pletschacher, A. Antonacopoulos, "The Significance of Reading Order in Document Recognition and its Evaluation", Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR2013), Washington DC, USA, August 2013, pp. 688-692

DOI

10.1109/ICDAR.2013.141

Related Projects

Europeana Newspapers