ABBYY FineReader Engine
The most comprehensive OCR SDK for software developers
Recognition
Comprehensive set of recognition technologies
For the actual text recognition step, ABBYY FineReader Engine offers a comprehensive set of recognition technologies. These cover machine-printed text (OCR), hand-printed text (ICR) and barcodes (OBR). As a market leader, ABBYY offers the highest number of OCR languages, which can be combined individually. The available technologies and processing options are listed below.
Optical Character Recognition (OCR)
OCR technology is available for more than 200 languages:
- European languages (Latin, Cyrillic, Armenian, Greek alphabets)
- Non-European languages: Chinese, Japanese, Korean, Arabic, Farsi, Thai, Vietnamese, Hebrew, Burmese (preview)
- FineReader XIX – for old documents, books and newspapers published from 1600 until 1937 in English, French, German, Italian and Spanish and set in old typefaces such as Fraktur, Schwabacher and Gothic
- Recognition of OCR-A, OCR-B, MICR (E13B) and CMC7 fonts and documents printed by dot-matrix printers or typed on typewriters
Intelligent Character Recognition (ICR)
ICR technology is available for more than 120 languages:
- European and non-European languages
- 22 regional styles of hand-printing
- Recognition of hand-printed characters in fields and frames
- ICR for Indian digits used in Arab states
Recognition of hand-printed information in different languages (multilingual ICR) is possible.
Optical Barcode Recognition (OBR)
- 1D and 2D barcode types
- Fast barcode extraction. This feature enables automated detection and recognition of barcodes at any angle on a document.
Accurate recognition mode
Fast recognition mode
Full text recognition & field-level recognition
In general, two types of recognition are possible: full text and field-level recognition. Full text recognition is used for document conversion and usually relies on OCR technology. Field-level recognition is used to extract particular data and combines OCR, ICR and other technologies.
The following table shows the differences:
Specification | Full text recognition | Field-level recognition |
---|---|---|
Used for: | Document conversion, book archiving | Data capture / data extraction |
Document analysis: | General document analysis, document analysis for invoices, document analysis for full-text indexing | Manual blocks specification for field-level recognition |
Recognition technologies: | OCR with up to 99% accuracy | OCR, ICR, OMR and barcode recognition with predefined data types and value ranges; 99.99% accuracy |
Verification: | Recommended (for content reuse) | Obligatory (as accuracy is a critical issue in most cases) |
Synthesis: | Used for document reconstruction | Not used |
Export format: | Document files (RTF, DOC, PDF, etc.) | Export to XML file or database |
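The "Export to XML file" entry for field-level recognition can be illustrated with a minimal Python sketch. It uses only the standard library and is not the FineReader Engine export API; the element names `document` and `field` and the sample field names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def fields_to_xml(fields):
    """Serialize a {field name: recognized text} mapping into an XML string.

    The <document>/<field> layout is a hypothetical schema, not a format
    defined by FineReader Engine.
    """
    root = ET.Element("document")
    for name, value in fields.items():
        field = ET.SubElement(root, "field", name=name)
        field.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical recognition results for two fields on a form.
xml_text = fields_to_xml({"zip": "10234", "invoice_no": "INV-0042"})
print(xml_text)
```

The same mapping could just as well be written to a database row; XML is used here only because it keeps the example self-contained.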
Full text recognition
Full text recognition is the basic recognition type for tasks such as:
- Document and book conversion for archiving
- Document conversion for content reuse
- Extraction of the underlying text for field detection and document classification
All of these require recognition (OCR) of the whole text. The resulting text is exported as plain text or as a complete document in the requested format.
Field-level recognition
To support key business processes such as forms processing, keyword classification, machine vision or robotic process automation, ABBYY FineReader Engine extracts text from fields or zones. Key functionality includes multilingual OCR and ICR, OMR, barcode recognition and a range of specific functions, such as:
- Data extraction from fields with various borders and frames
- Definition of field content by setting alphabets, dictionaries, regular expressions, handwriting styles, etc.
- Detection of in-field spacing
- Intelligent processing of blocks with intersecting parts and lines
- Text block despeckling, with the ability to specify the size of "garbage"
Field-level recognition is also supported by special tools for developers, such as the Voting API and "On-the-Fly" Recognition Tuning.
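To make the idea of defining field content by an alphabet and a regular expression more concrete, here is a minimal Python sketch. It also applies a simple majority vote across several recognition results for the same field. This is a generic illustration and not the FineReader Engine Voting API: the names `ZIP_ALPHABET`, `ZIP_PATTERN`, `matches_field` and `vote` are hypothetical, and the candidate strings stand in for results that a recognizer might return under different settings.

```python
import re
from collections import Counter

# Hypothetical field definition for a five-digit zip code.
ZIP_ALPHABET = set("0123456789")
ZIP_PATTERN = re.compile(r"\d{5}")


def matches_field(text, alphabet, pattern):
    """Return True if text uses only allowed characters and fits the pattern."""
    return set(text) <= alphabet and pattern.fullmatch(text) is not None


def vote(candidates, alphabet, pattern):
    """Pick the most frequent candidate that satisfies the field definition.

    Returns None when no candidate is valid, which can be used as a signal
    to send the field to manual verification.
    """
    valid = [c for c in candidates if matches_field(c, alphabet, pattern)]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]


# Hypothetical results for one field; "1O234" contains the letter O.
best = vote(["1O234", "10234", "10234", "10284"], ZIP_ALPHABET, ZIP_PATTERN)
print(best)
```

Filtering before voting means a frequent but impossible reading (for example, a letter in a numeric field) can never win, which is the practical benefit of restricting field content up front.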
User languages
To increase the recognition quality, ABBYY FineReader Engine provides an API for creating and editing recognition languages, adjusting predefined recognition languages, and adding new words to user languages. Examples:
- To improve the quality of ICR in forms, you can use user languages to describe the type of information that may be entered in each field (zip codes, product codes, numbers).
- To improve the recognition of product codes, telephone numbers or passport numbers, you can create a new recognition language that helps the program read specific types of data.
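The effect of adding known words to a user language can be approximated with a small Python sketch: a recognized string is snapped to the closest entry in a custom dictionary. This uses the standard `difflib` module and is only an analogy for how a restricted vocabulary helps; it does not use or reproduce the FineReader Engine language API. The product codes and the function name `snap_to_dictionary` are hypothetical.

```python
import difflib

# Hypothetical dictionary of valid product codes.
PRODUCT_CODES = ["AX-1001", "AX-1007", "BX-2040"]


def snap_to_dictionary(recognized, dictionary, cutoff=0.8):
    """Return the dictionary entry closest to the recognized text.

    Returns None if no entry is at least `cutoff` similar, so that
    unrelated text is not silently replaced by a dictionary word.
    """
    matches = difflib.get_close_matches(recognized, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# The letter "O" was read in place of the digit "0".
corrected = snap_to_dictionary("BX-2O40", PRODUCT_CODES)
print(corrected)
```

The cutoff is a design choice: a higher value rejects more borderline readings, and a lower value corrects more aggressively at the risk of mapping a genuinely different code onto a dictionary entry.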
Pattern training
In the vast majority of cases, FineReader Engine can successfully read texts without any prior training. However, when recognizing decorative or outlined fonts or documents with low print quality, you can train custom patterns to increase the recognition quality.