Do Thesauri enhance rule-based categorization for OCR text?
13 January 2003
Proceedings Volume 5010, Document Recognition and Retrieval X; (2003) https://doi.org/10.1117/12.472835
Event: Electronic Imaging 2003, Santa Clara, CA, United States
Abstract
A rule-based automatic text categorizer was tested to determine whether two types of thesaurus expansion, query expansion and Junker expansion, would improve categorization. The thesauri used were domain-specific to an OCR test collection focused on a single topic. The results show that neither type of expansion significantly improved categorization.
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kazem Taghva and Jeffrey Coombs "Do Thesauri enhance rule-based categorization for OCR text?", Proc. SPIE 5010, Document Recognition and Retrieval X, (13 January 2003); https://doi.org/10.1117/12.472835
KEYWORDS
Optical character recognition
Associative arrays
Chemical species
Computer programming
Information science
Knowledge acquisition
Lanthanum
