Proceedings of the 3rd International Conference on Document Analysis and Recognition (ICDAR1995), Montreal, Canada, August 1995, pp. 1132-1135
There is an increasingly pressing need for document analysis methods that can cope with images of documents containing printed regions of complex shapes. In contrast to the bounding-box representation used in most past page segmentation and classification approaches, which assume rectangular regions, a more flexible description is needed that still retains most of the functionality of the representation by rectangles. In the first part of this paper, the practical considerations of describing and handling complex-shaped regions are examined and an appropriate representation scheme is proposed. For page classification, a new approach based on the description of white space inside regions is presented. In contrast to previous page classification approaches, skewed and complex-shaped regions are handled efficiently, and the features are derived with no need for time-consuming accesses of the pixel-based image data.
A. Antonacopoulos, R.T. Ritchings, "Representation and Classification of Complex-Shaped Printed Regions Using White Tiles", Proceedings of the 3rd International Conference on Document Analysis and Recognition (ICDAR1995), Montreal, Canada, August 1995, pp. 1132-1135