University of Salford
PRImA - Pattern Recognition & Image Analysis Group

PRImA Layout Analysis Dataset

Welcome to the PRImA Layout Analysis Dataset.

Visit our website for software tools, more datasets, and much more.

Note: You can download the dataset from the dataset section on our main website.


This dataset has been created primarily for the evaluation of layout analysis (physical and logical) methods. It contains realistic documents with a wide variety of layouts, reflecting the various challenges in layout analysis. Particular emphasis is placed on magazines and technical/scientific publications which are likely to be the focus of digitisation efforts.

Each image in the dataset has associated comprehensive and detailed ground truth enabling in-depth evaluation.

In addition to the information provided, the dataset is presented through this interactive interface. The interface, together with the flexible structure of the database behind it, allows easy browsing, searching, and selection of subsets (e.g. for evaluation on specific layout conditions).

For a detailed overview of the dataset, please refer to the following paper:

A. Antonacopoulos, D. Bridson, C. Papadopoulos, and S. Pletschacher, "A Realistic Dataset for Performance Evaluation of Document Layout Analysis", Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), Barcelona, Spain, July 2009, pp. 296-300.

Supported by

PRImA would like to thank the following for their support in creating and maintaining this dataset:

EU 7th Framework Programme grant IMPACT (Ref: 215064)

Maintained by: Christian Clausner, Apostolos Antonacopoulos - © 2009-2022