Turning Text Soup into Smart Data in Newspaper and Magazine Archives

 (17/02/2016)

PRImA and Factminers have joined forces to raise awareness of, and to address, the lack of structure in vast amounts of OCRed documents. Without accurate physical and logical layout structure, the “text soup” generated by large-scale digitisation efforts is increasingly inadequate for any meaningful resource discovery or analysis.

Our goal is to jointly develop software and a crowdsourcing platform to turn the “text soup” into “smart data” which researchers, citizen scientists, and students can use to frame questions that cannot be explored or answered by today’s automated text recognition.

For more information and further news see: 

Article: Ground Truth & the Knight Prototype Fund

#TextSoup2SmartData