Multi-Frontier Quality Optimisation for Old Document OCR

The recent IMPACT project was a scientific research project funded by the European Commission. For four years, multiple organisations worked on a broad set of technologies to “Improve Access to historical Texts”.

When looking at the "natural challenges" of digitising and OCRing old documents, it becomes clear that no single, simple approach will dramatically improve the results.

To get the best quality within a project, all processing steps have to be optimised:

  • Achieve the best scanning quality
  • Clean/optimise the existing images so that they deliver optimal results during OCR
  • Make sure that the correct regions for recognition are found/defined
  • Use a good/suitable OCR technology/engine
  • Use external resources like dictionaries to improve the recognition results
  • … in Progress
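
As an illustration of the dictionary step in the list above, the following is a minimal sketch of lexicon-based post-correction of OCR output. It is not the method used by IMPACT: the word list, the function names and the similarity cutoff of 0.8 are assumptions made for this example, and the matching relies only on Python's standard difflib module. A real project would load a lexicon suited to the period and language of the documents.

```python
import difflib

# Hypothetical mini-lexicon for illustration only; a real project would
# load a word list matching the period and language of the documents.
LEXICON = ["the", "historical", "text", "access", "improve", "library"]


def correct_token(token, lexicon=LEXICON, cutoff=0.8):
    """Return the closest lexicon entry for an OCR token.

    Tokens already in the lexicon, and tokens with no sufficiently
    similar entry, are returned unchanged. The cutoff value is an
    assumption and would need tuning on real data.
    """
    lowered = token.lower()
    if lowered in lexicon:
        return token
    # get_close_matches ranks candidates by difflib's similarity ratio.
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    # Note: a corrected token comes back in the lexicon's (lower) case.
    return matches[0] if matches else token


def correct_line(line):
    """Correct each whitespace-separated token of one OCR text line."""
    return " ".join(correct_token(t) for t in line.split())


if __name__ == "__main__":
    print(correct_line("improve access to hist0rical text"))
```

Leaving unknown tokens untouched is a deliberate choice in this sketch: historical spellings and proper names are often absent from a lexicon, and forcing them onto the nearest entry would introduce new errors.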

