Tutorials

Tutorial 1 (morning)

"Scene-Text Localization, Recognition, and Understanding"

Albert Gordo (Xerox Research Centre Europe) and Lluís Gómez i Bigordà (Computer Vision Center, Universitat Autònoma de Barcelona)

During the last few years, the computer vision and document analysis communities have turned their attention to tasks related to text localization and recognition in natural images (also referred to as scene text or text in the wild), particularly after the seminal works of Wang et al. More recently, driven by the current deep learning renaissance, architectures based on convolutional neural networks and recurrent neural networks have shown outstanding results on localization and recognition tasks, and have allowed researchers to approach more challenging problems such as text understanding in natural images.

This tutorial has three main objectives: first, to familiarize the audience with the problems of text localization, recognition, and understanding in natural images, highlighting the similarities and differences between these tasks and their counterparts on document images; second, to provide details about the techniques showing the greatest potential for current and future research on the topic, which could easily be transferred or adapted back to the document analysis domain; and third, to present open-source libraries that implement some of the current state-of-the-art methods.

Tutorial 2 (afternoon)

"Tesseract Blends Old and New OCR Technology"

Ray Smith (Google Inc)

This tutorial will cover the algorithms, design, and implementation of the open-source OCR engine known as Tesseract.

Developed largely in secret, Tesseract uses methods that are not widely known, yet it remains a formidable force in OCR and continues to improve. Its layout analysis placed second in the 2009 ICDAR competition; it supports more than 100 languages, including Chinese and several Indic languages; and recent changes make it easy to plug in new classifiers, including a new deep LSTM addition.

For general enquiries, please contact us at das2016@primaresearch.org