Computer Vision and Image Understanding, Special Issue on Document Analysis and Retrieval, Volume 70, Issue 3, June 1998, pp. 350-369
There is an ever-increasing number of publications that do not have the “traditional” layout in which printed regions are rectangular. Text paragraphs and graphic areas may be of any shape, individually rotated, and in any arrangement. Previous document analysis techniques are not well suited to such complex layouts. This paper introduces a new method for segmenting images of document pages that have either traditional or complex layouts. The underlying idea is to efficiently produce a flexible description (by means of tiles) of the background space that surrounds the printed regions in the page image under all the above conditions. Using this description of the space, the contours of printed regions are identified with significant accuracy. The new approach is fast because there is no need for skew detection and correction, and only a few simple operations are performed on the description of the background (not on the pixel-based data).
A. Antonacopoulos, "Page Segmentation Using the Description of the Background", Computer Vision and Image Understanding, Special Issue on Document Analysis and Retrieval, Volume 70, Issue 3, June 1998, pp. 350-369