ICDAR2009 Page Segmentation Competition
The evaluation dataset has now been released for those who registered for the competition. Note that the deadline for submitting results and executables has been extended to 15 April. Registration for the competition is now closed.
Page Segmentation is of fundamental importance among Layout Analysis steps and has been (and continues to be) relatively well researched. The motivation of the competition is to evaluate existing approaches using a realistic dataset and an objective performance analysis system.
The ICDAR2009 Page Segmentation Competition follows the successful running of all previous ICDAR Page Segmentation competitions (2001, 2003, 2005 and 2007). The current competition benefits from an extended dataset and a new (and more detailed) evaluation approach.
The dataset used in this competition is realistic in the sense that it represents a wide variety of layouts, reflecting documents that are likely to be of broad interest for digitisation. It contains images and ground truth for a variety of layouts (both Manhattan and irregular ones), mainly pages from magazine articles and scientific publications. While the majority of regions on each page are textual, graphic regions are also present. Textual regions with fonts of varying sizes may also be present on each page.
The competition will use a new evaluation approach which takes into account a wide range of situations and provides considerable detail on the performance of layout analysis methods. The system performs a geometric comparison between the regions detected by a segmentation method and the ground-truth regions in order to identify erroneous mergers between regions, as well as split, missed, partially missed or misclassified regions. Each type of error is weighted according to the types of regions involved and the situation in which they are found. An overview of the evaluation methodology can be found in the following publication:
A. Antonacopoulos and D. Bridson, "Performance Analysis Framework for Layout Analysis Methods", Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR2007), Curitiba, Brazil, September 2007, pp. 1258-1262.
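As a rough illustration of the kind of geometric comparison described above, the following minimal sketch classifies merge, split, missed, partially missed and misclassified regions and sums a weighted error. It is not the framework from the publication: it assumes axis-aligned rectangular regions that do not overlap within one segmentation result, and the names Region, classify_errors and weighted_error, the two overlap thresholds and all weight values are hypothetical choices made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    kind: str  # e.g. "text" or "graphic"
    x0: int
    y0: int
    x1: int
    y1: int

    def area(self):
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return max(0, w) * max(0, h)


def classify_errors(ground_truth, detected, min_overlap=0.1, full_cover=0.9):
    """Map each ground-truth region name to a set of error labels.

    min_overlap and full_cover are assumed thresholds, expressed as
    fractions of the ground-truth region's area.
    """
    errors = {g.name: set() for g in ground_truth}
    # Detected regions that overlap each ground-truth region significantly.
    hits = {
        g.name: [d for d in detected
                 if overlap_area(g, d) >= min_overlap * g.area()]
        for g in ground_truth
    }
    for g in ground_truth:
        # Valid as a coverage measure only if detected regions are disjoint.
        covered = sum(overlap_area(g, d) for d in detected)
        if covered == 0:
            errors[g.name].add("missed")
        elif covered < full_cover * g.area():
            errors[g.name].add("partially missed")
        if len(hits[g.name]) > 1:
            errors[g.name].add("split")
        if any(d.kind != g.kind for d in hits[g.name]):
            errors[g.name].add("misclassified")
    for d in detected:
        # One detected region spanning several ground-truth regions.
        touched = [g for g in ground_truth if d in hits[g.name]]
        if len(touched) > 1:
            for g in touched:
                errors[g.name].add("merge")
    return errors


def weighted_error(errors, ground_truth, weights, default=1.0):
    """Sum per-error weights keyed by (error label, region kind)."""
    kinds = {g.name: g.kind for g in ground_truth}
    return sum(weights.get((label, kinds[name]), default)
               for name, labels in errors.items() for label in labels)


# Made-up page: four text regions and one graphic region.
GT = [
    Region("T1", "text", 0, 0, 100, 50),
    Region("T2", "text", 0, 60, 100, 110),
    Region("G1", "graphic", 0, 120, 100, 200),
    Region("T3", "text", 0, 210, 100, 250),
    Region("T4", "text", 0, 260, 100, 300),
]
DET = [
    Region("D1", "text", 0, 0, 100, 110),    # spans T1 and T2
    Region("D2", "text", 0, 120, 100, 200),  # G1 labelled as text
    Region("D3", "text", 0, 260, 100, 279),  # upper part of T4
    Region("D4", "text", 0, 281, 100, 300),  # lower part of T4
]
# Hypothetical weights; unlisted combinations fall back to 1.0.
WEIGHTS = {("missed", "text"): 2.0, ("misclassified", "graphic"): 0.5}

errors = classify_errors(GT, DET)
total = weighted_error(errors, GT, WEIGHTS)
```

In this made-up page, T1 and T2 are merged by D1, G1 is misclassified, T3 is missed and T4 is split. Weighting by region kind only is a simplification; the approach described above also takes into account the situation in which each error is found.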
If you have any queries regarding the running of the competition, please contact Dr. Apostolos Antonacopoulos at A <dot> Antonacopoulos <at> primaresearch <dot> org.
The creation of the dataset used for this competition has been supported in part by:
EU 7th Framework Programme grant IMPACT (Ref: 215064)