24 January 2011 Ancient documents bleed-through evaluation and its application for predicting OCR error rates
V. Rabeux, N. Journet, J. P. Domenger
Proceedings Volume 7874, Document Recognition and Retrieval XVIII; 78740Q (2011) https://doi.org/10.1117/12.873368
Event: IS&T/SPIE Electronic Imaging, 2011, San Francisco Airport, California, United States
Abstract
This article presents a way to evaluate the bleed-through defect in images of very old documents. We design measures to quantify and evaluate the verso ink bleeding through the paper onto the recto side. Measuring the bleed-through defect allows us to perform statistical analyses that can predict the feasibility of different post-scan tasks. We illustrate our measures by building two models that predict OCR error rates from the bleed-through evaluation: one for Abbyy FineReader, a very powerful commercial OCR engine, and one for OCRopus, which is sponsored by Google. Both prediction models appear to be very accurate according to various statistical indicators.
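The abstract does not state which statistical model or accuracy indicators the authors use. As a rough illustration only, the sketch below assumes a single hypothetical bleed-through score per page, a simple least-squares line as the prediction model, and R² as the indicator. The names `fit_line`, `r_squared`, `bleed_scores` and `ocr_error_rates` are invented for this example, and the numbers are synthetic, not results from the paper.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x: List[float], y: List[float],
              slope: float, intercept: float) -> float:
    """Coefficient of determination of the fitted line on (x, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic, purely illustrative values (NOT data from the paper):
# a hypothetical bleed-through score per page and the OCR error rate
# measured on that page.
bleed_scores = [0.05, 0.10, 0.20, 0.35, 0.50]
ocr_error_rates = [0.02, 0.04, 0.09, 0.15, 0.22]

slope, intercept = fit_line(bleed_scores, ocr_error_rates)
fit_quality = r_squared(bleed_scores, ocr_error_rates, slope, intercept)

# Predict the OCR error rate for an unseen page with score 0.30.
predicted = slope * 0.30 + intercept
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={fit_quality:.3f}")
print(f"predicted error rate at score 0.30: {predicted:.3f}")
```

Under these assumptions, one such model would be fitted per OCR engine, since each engine is likely to react differently to the same amount of bleed-through.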
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
V. Rabeux, N. Journet, and J. P. Domenger "Ancient documents bleed-through evaluation and its application for predicting OCR error rates", Proc. SPIE 7874, Document Recognition and Retrieval XVIII, 78740Q (24 January 2011); https://doi.org/10.1117/12.873368
KEYWORDS
Optical character recognition
Error analysis
Image quality
Data modeling
Statistical modeling
Scanners
Bromine
