
Aletheia Document Analysis System


Aletheia is an advanced system for accurate yet cost-effective analysis, recognition and annotation of scanned documents. It aids the user with a number of automated and semi-automated tools, which were developed and fine-tuned based on feedback from major libraries across Europe and from their digitisation service providers, who use it in a production environment.

Cutting-edge features include support for top-down ground truthing with sophisticated split and shrink tools, as well as bottom-up ground truthing, which aggregates lower-level elements into more complex structures. The integrated rules and guidelines validator, in combination with powerful correction tools, enables efficient production of highly accurate ground truth as well as standardised electronic renditions of digitised documents.

In addition, special features such as a customisable virtual keyboard and the Aletheia Sans font with extensive coverage of special characters in Unicode have been developed to support working with the complexities of historical documents.

The Aletheia licensing model has changed. Read more »
Information for Franken+ users Read more »


Aletheia is available in two editions:

  • Lite – offering essential features with a free licence
  • Pro – a complete document analysis system (full trial available)



There are numerous useful resources freely available, such as:

  • Video tutorials of Aletheia
  • Documentation and Example files
  • Aletheia Sans font


Use cases

Read about use cases of Aletheia:

  • Wellcome Digital Library
  • Training Tesseract using Aletheia and Franken+
  • FactMiners


Download the latest version

Aletheia Sans Font

Aletheia Sans is a font derived from DejaVu Sans. It has been enriched with characters that were required by several large-scale European and American digitisation projects for historical documents. Where possible, code points recommended by the Medieval Unicode Font Initiative (MUFI) were used.

Important note: Installing the font permanently will prevent automatic font updates in the Aletheia tool. Future changes will not be visible until the new version of the font is installed manually.

Download the latest version

Alternative download

Related Publications

A survey of OCR evaluation tools and metrics

C. Neudecker, K. Baierer, C. Clausner, A. Antonacopoulos, S. Pletschacher

In The 6th International Workshop on Historical Document Imaging and Processing (HIP '21). Association for Computing Machinery, New York, NY, USA, 13–18.


Efficient and Effective OCR Engine Training

C. Clausner, A. Antonacopoulos, S. Pletschacher

International Journal on Document Analysis and Recognition (IJDAR), 23(1), 73–88


Quality Prediction System for Large-Scale Digitisation Workflows

C. Clausner, S. Pletschacher, A. Antonacopoulos

Proceedings of the 12th IAPR International Workshop on Document Analysis Systems (DAS2016), Santorini, Greece, April 11-14, 2016


Efficient OCR Training Data Generation with Aletheia

C. Clausner, S. Pletschacher, A. Antonacopoulos

Short Paper Booklet of the 11th International Association for Pattern Recognition (IAPR) Workshop on Document Analysis Systems (DAS2014), Tours, France, April 2014, pp. 19-20


Aletheia - An Advanced Document Layout and Text Ground-Truthing System for Production Environments

C. Clausner, S. Pletschacher, A. Antonacopoulos

Proceedings of the 11th International Conference on Document Analysis and Recognition (ICDAR2011), Beijing, China, September 2011, pp. 48-52
