Proceedings of 7th International Conference on Image Processing and its Applications (IPA1999), Manchester, UK, July 1999, pp. 417-420
The need for objective evaluation of the performance of image analysis algorithms is now widely acknowledged, and a number of techniques have been devised for various subsystems. In the field of document image analysis (DIA), significant activity has concentrated on evaluating OCR results. In the case of OCR, the comparison of experimental results with ground truth is straightforward (ASCII characters) and lends itself to more elaborate analysis using string-matching theory to calculate errors and associated costs. Consequently, it is possible to automate OCR evaluation using large-scale test databases. Large-scale testing and evaluation is essential not only for OCR but also for each of the other subsystems involved in DIA. For instance, the identification of regions of interest in the document page image (page segmentation) and of the type of their content (page classification) are significant stages that seriously affect the performance of subsequent DIA stages (e.g. OCR, document image understanding, etc.). The work described focuses on the subsystems of the layout analysis stage. The most important subsystems in this stage are page segmentation and classification. The framework described in this paper is focused mainly on performance analysis. A scoring system is also used to provide developers with a higher-level view of the performance of a method in particular aspects. Furthermore, a global score can easily be produced for benchmarking purposes if required.
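As an illustration of the string-matching approach to OCR evaluation mentioned above, the following is a minimal sketch, not the method of this paper. It assumes a standard Levenshtein edit distance with configurable unit costs; the function names (edit_distance, character_error_rate) and the cost parameters are hypothetical and chosen only for this example.

```python
def edit_distance(truth: str, ocr: str,
                  sub_cost: int = 1, ins_cost: int = 1, del_cost: int = 1) -> int:
    """Minimum total cost of edits turning the ground truth into the OCR output.

    Assumption: classic Levenshtein dynamic programming with per-operation
    costs; real OCR evaluation tools may weight errors differently.
    """
    # prev[j] = cost of matching an empty ground-truth prefix to ocr[:j]
    prev = [j * ins_cost for j in range(len(ocr) + 1)]
    for i, t in enumerate(truth, start=1):
        curr = [i * del_cost]  # ground-truth prefix vs. empty OCR output
        for j, o in enumerate(ocr, start=1):
            curr.append(min(
                prev[j] + del_cost,                       # character missed by OCR
                curr[j - 1] + ins_cost,                   # spurious character in OCR output
                prev[j - 1] + (0 if t == o else sub_cost)  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(truth: str, ocr: str) -> float:
    """Edit distance normalised by ground-truth length (hypothetical helper)."""
    if not truth:
        return 0.0 if not ocr else float("inf")
    return edit_distance(truth, ocr) / len(truth)
```

For example, comparing the ground truth "document" with an OCR output of "docurnent" (the common "m" read as "rn" confusion) yields a distance of 2 under unit costs: one substitution and one insertion. Because both inputs are plain character strings, this kind of comparison can be run unattended over a large test database, which is what makes OCR evaluation straightforward to automate.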
A. Antonacopoulos, A. Brough, "A New Framework for Efficient and Flexible Analysis of the Performance of Document Image Analysis Subsystems", Proceedings of 7th International Conference on Image Processing and its Applications (IPA1999), Manchester, UK, July 1999, pp. 417-420