Flexible Page Segmentation Using the Background
A. Antonacopoulos, R.T. Ritchings
Proceedings of the IAPR 12th International Conference on Pattern Recognition (ICPR'94), Jerusalem, Israel, October 9-12, 1994, IEEE-CS Press, pp. 339-344
This paper introduces a new method for document page segmentation, based on the analysis of the background white space that surrounds the printed regions on the page. Unlike most earlier approaches, which assume that printed regions are rectangular, it makes no assumptions about the shape of the regions. It is capable of identifying and describing regions of complex shape more accurately than existing methods, and it requires no a priori knowledge. The background white space is covered with tiles, and the contour of each region is identified by tracing through the white tiles that encircle it. The method can segment page images with severe skew without skew correction. The white tiles can also be used in subsequent document analysis processes, such as the classification of the image regions.
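
To make the idea of covering the background with tiles concrete, the following is a minimal sketch of one possible tiling scheme. It is not taken from the paper: the tile construction (maximal horizontal white runs, merged vertically when consecutive rows share the same extent), the pixel convention (0 for white, non-zero for ink), and the names `white_runs` and `white_tiles` are all assumptions made for illustration only. Contour tracing through the tiles is not attempted here.

```python
from typing import Dict, List, Tuple

# Hypothetical tile representation: (top, left, bottom, right), all inclusive.
Tile = Tuple[int, int, int, int]


def white_runs(row: List[int]) -> List[Tuple[int, int]]:
    """Return maximal horizontal runs of white pixels (value 0) as (left, right)."""
    runs = []
    start = None
    for x, pixel in enumerate(row):
        if pixel == 0 and start is None:
            start = x                      # a white run begins
        elif pixel != 0 and start is not None:
            runs.append((start, x - 1))    # an ink pixel ends the run
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))  # run reaches the right edge
    return runs


def white_tiles(image: List[List[int]]) -> List[Tile]:
    """Cover all white pixels with rectangular tiles (assumed scheme, see lead-in).

    A tile grows downward while the next row contains a white run with
    exactly the same horizontal extent; otherwise it is closed.
    """
    open_tiles: Dict[Tuple[int, int], int] = {}  # (left, right) -> top row
    tiles: List[Tile] = []
    for y, row in enumerate(image):
        runs = set(white_runs(row))
        # Close every open tile whose extent does not continue on this row.
        for span in list(open_tiles):
            if span not in runs:
                top = open_tiles.pop(span)
                tiles.append((top, span[0], y - 1, span[1]))
        # Start a new tile for each run that is not already open.
        for span in runs:
            open_tiles.setdefault(span, y)
    # Close tiles that reach the bottom of the image.
    for span, top in open_tiles.items():
        tiles.append((top, span[0], len(image) - 1, span[1]))
    return sorted(tiles)


# A tiny page: a 2x2 printed region surrounded by white space.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
tiles = white_tiles(page)
```

In this toy example the printed block is enclosed by four white tiles (above, left, right, and below). A tracing step of the kind the abstract describes would then follow such encircling tiles to recover the region's contour. Because the tiles are built from the background rather than from an assumed region shape, nothing in the sketch requires the printed region to be rectangular.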