ABBYY

API

Integrate reliable Document AI in your automation workflows with just a few lines of code

Easily convert unstructured documents into clean, structured data with a high-performance OCR and document processing API—built for speed, accuracy, and seamless integration into your stack.

Pick the OCR service you can rely on

Eliminate inconsistent OCR results, complex integrations, and fragmented tools. ABBYY Document AI API is purpose-built to deliver reliable and consistent data extraction for enterprises. Backed by over 35 years of expertise and trusted by organizations worldwide, we provide the dependable results your essential business processes demand.

Seamless APIs, built for developers and business-critical document automation

  • Precision you can count on
  • Built for developers
  • Automation at scale
  • Zero hallucinations
  • Ready to scale
  • Built by experts, trusted by engineers

Key capabilities

Image-to-text conversion

Extract searchable text from documents using state-of-the-art OCR technology. Supports multiple languages including English, German, French, Japanese, and Chinese, as well as multilingual documents. Text is delivered in structured JSON or text-only JSON formats.

Pre-trained field extraction

Access pre-trained extraction models designed for critical documents, including invoices, receipts, waybills, "proof-of" documents, and tax forms. Simplify integration into downstream workflows without additional training.

Document conversion

Transform scans and images of documents into searchable formats like PDF, PDF/A-3a, or HTML for greater versatility and usability.

Developer-friendly integration

Leverage SDKs available in Python, C#, TypeScript, and Java. Our intuitive API and developer-friendly documentation ensure smooth setup and easy collaboration.

Data consistency and compliance

Benefit from a platform designed to preserve data integrity and support regulatory compliance, ensuring process transparency.

Purpose-built for business process automation

Build workflows tailored to your business needs. Designed specifically for automation teams, our API handles multi-language, handwritten, and complex document layouts with ease.

How to use the API

  • Sign up and create an API key
  • Upload and process documents
  • Output structured data

Sign up and create an API key

Register for free, generate your unique API key, and start developing immediately without lengthy onboarding.


Upload and process documents

Send your business documents to the API for processing using SDKs or direct API calls.
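
As a rough illustration of a direct API call, the sketch below prepares an upload request using only the Python standard library. The URL, the bearer-token header, and the content type are placeholders assumed for this example, not the documented ABBYY endpoint; check the official API documentation or use an SDK for the real values.

```python
import urllib.request

# Hypothetical endpoint: replace with the URL from the official API docs.
API_URL = "https://api.example.com/v1/documents"


def build_upload_request(api_key: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying one document."""
    return urllib.request.Request(
        API_URL,
        data=pdf_bytes,
        method="POST",
        headers={
            # Assumed auth scheme; the real header name may differ.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/pdf",
        },
    )


request = build_upload_request("YOUR_API_KEY", b"%PDF-1.7 placeholder bytes")
# Sending is deliberately left out of this sketch;
# urllib.request.urlopen(request) would perform the actual call.
```

Keeping request construction separate from sending makes the call easy to inspect and unit-test before any document leaves your system.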


Output structured data

Receive reliable and structured data as JSON to feed directly into your automation or AI workflows.
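
To show what feeding JSON into a workflow might look like, here is a minimal sketch that flattens a result into a name-to-value mapping. The response shape (`documentType`, `fields`, `name`, `value`, `confidence`) is invented for illustration; the actual schema may differ.

```python
import json

# Hypothetical response body; real field names may differ.
SAMPLE_RESPONSE = """
{
  "documentType": "invoice",
  "fields": [
    {"name": "InvoiceNumber", "value": "INV-1001", "confidence": 0.98},
    {"name": "Total", "value": "118.00", "confidence": 0.91}
  ]
}
"""


def fields_to_dict(response_text: str) -> dict[str, str]:
    """Flatten the list of extracted fields into a name -> value mapping."""
    payload = json.loads(response_text)
    return {field["name"]: field["value"] for field in payload["fields"]}


invoice = fields_to_dict(SAMPLE_RESPONSE)
print(invoice["InvoiceNumber"])  # prints INV-1001
```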


Intelligent document processing pipeline

Document input

Ingest documents from multiple channels—mobile devices, email, shared folders, network scanners, and direct connections to business systems via API or pre-built connectors—ensuring seamless integration into your workflows, no matter how documents enter your organization. This flexibility empowers you to efficiently support diverse business processes, adapting to your specific needs and streamlining operations from every entry point.


Image enhancement

Document images vary significantly in quality because of issues like poor lighting and distortions from mobile cameras. They may also contain auxiliary elements such as patterned backgrounds, protection marks, field markings, lines, and guides that obscure important information.

ABBYY’s AI-powered image enhancement algorithms optimize each image for accurate data extraction. The AI corrects distortions and separates text from the background, cleaning up even the most complex and visually busy documents—such as IDs, birth certificates, and forms—to achieve reliable results and high straight-through processing rates.


OCR / ICR

AI has transformed the ability to read and interpret content previously deemed impossible to process, dramatically expanding the use cases for automation. ABBYY IDP uses advanced AI-based optical character recognition (OCR) and intelligent character recognition (ICR) technologies to digitize printed and handwritten text, preparing it for further processing. These technologies are able to recognize the logical structure of the whole document, including complex elements such as tables, enabling document classification, data extraction, and high-quality export to digital formats.


Document classification & assembly

Automate document classification and routing with AI classification models that analyze both text and image features through multimodal learning to recognize and organize documents. Once classified, documents are automatically assigned an AI extraction model for processing. By incorporating human-in-the-loop input, the models learn from user corrections and automatically adjust, continuously improving their performance over time.


Data extraction & validation

Extract data from structured, semi-structured, or unstructured business documents using advanced AI and machine learning that mimic human understanding. ABBYY IDP reads and understands documents in over 200 languages and effortlessly handles complex tables, handwriting, checkmarks, barcodes, signatures, and more.

Automatic validation cross-checks information against databases and ensures compliance with built-in validation rules. Our low-code design approach gives you the flexibility to use pre-trained models available in the ABBYY Marketplace, tweak these ready-to-use models for the unique needs of your organization, or train custom models tailored to your specific documents.
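
The idea of validation rules can be sketched in a few lines of Python. The two rules below (a vendor lookup against a known list and an arithmetic check on the totals) and all field names are assumptions chosen for illustration; they are not ABBYY's built-in rules.

```python
from decimal import Decimal


def validate_invoice(fields: dict[str, str], known_vendors: set[str]) -> list[str]:
    """Return a list of rule violations; an empty list means the document passed."""
    errors = []

    # Rule 1: cross-check the vendor against reference data (stand-in for a database).
    if fields.get("Vendor") not in known_vendors:
        errors.append("unknown vendor")

    # Rule 2: the amounts must add up. Decimal avoids float rounding surprises.
    subtotal = Decimal(fields["Subtotal"])
    tax = Decimal(fields["Tax"])
    total = Decimal(fields["Total"])
    if subtotal + tax != total:
        errors.append("total does not equal subtotal + tax")

    return errors


vendors = {"Acme GmbH"}
good = {"Vendor": "Acme GmbH", "Subtotal": "100.00", "Tax": "18.00", "Total": "118.00"}
bad = {"Vendor": "Unknown Ltd", "Subtotal": "100.00", "Tax": "18.00", "Total": "120.00"}

good_errors = validate_invoice(good, vendors)
bad_errors = validate_invoice(bad, vendors)
```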


Human in the Loop (HITL) & continuous learning

Keep refining your processes through human-in-the-loop (HITL) review, which lets subject matter experts step in to manually check and correct document classes as well as extracted data through a convenient interface. This optional step is crucial when 100% accuracy is required or when a document doesn’t meet the specific validation rules established for each AI model. Each time a correction is made, the AI models improve through continuous learning and get more accurate.
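
One way to picture the routing decision is the sketch below: a document goes to manual review if it failed validation or, as an extra assumption not stated above, if any field falls under a confidence threshold. The threshold value and the field structure are hypothetical.

```python
# Assumed cut-off for this sketch; a real deployment would tune it per model.
REVIEW_THRESHOLD = 0.90


def needs_review(fields: list[dict], validation_errors: list[str]) -> bool:
    """Route to a human when validation failed or any field looks uncertain."""
    low_confidence = any(f["confidence"] < REVIEW_THRESHOLD for f in fields)
    return low_confidence or bool(validation_errors)


confident = [{"name": "Total", "value": "118.00", "confidence": 0.97}]
uncertain = [{"name": "Total", "value": "118.00", "confidence": 0.62}]

straight_through = not needs_review(confident, [])
flagged_by_confidence = needs_review(uncertain, [])
flagged_by_rules = needs_review(confident, ["unknown vendor"])
```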


Quality analytics

The advanced quality analytics in ABBYY Document AI give you a clear view of your document processing performance and track improvements in straight-through processing rates over time. With actionable insights and tailored recommendations, you can pinpoint the root causes of problems and improve the data extraction quality of your models within your IDP workflow.
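
For readers unfamiliar with the metric, a straight-through processing rate is simply the share of documents that needed no manual intervention. The sketch below computes it from a list of outcomes; the input format is an assumption for illustration, not the analytics product's data model.

```python
def stp_rate(outcomes: list[bool]) -> float:
    """Share of documents processed without manual review.

    Each entry is True when a document went straight through.
    Returns 0.0 for an empty batch to avoid dividing by zero.
    """
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


week_one = [True, False, False, True]
week_two = [True, True, False, True]

rates = [stp_rate(week_one), stp_rate(week_two)]
print(rates)  # prints [0.5, 0.75]
```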


Data output

ABBYY Document AI automatically exports data in the format you need, whether JSON, CSV, XML, or others. The data is then sent to your automation systems, business applications, and downstream processes through a simple REST API or pre-built connectors.
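
If a downstream system expects CSV while you hold extracted records as JSON-style dictionaries, a conversion along these lines is one option. This is a generic standard-library sketch with made-up field names, not the product's export feature.

```python
import csv
import io


def records_to_csv(records: list[dict[str, str]]) -> str:
    """Serialize a list of flat records to CSV text with a header row."""
    if not records:
        return ""
    buffer = io.StringIO()
    # Column order follows the keys of the first record.
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = records_to_csv(
    [
        {"InvoiceNumber": "INV-1001", "Total": "118.00"},
        {"InvoiceNumber": "INV-1002", "Total": "59.50"},
    ]
)
```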


Learn more about IDP and OCR

Blog

Code Smarter, Not Harder: Document Processing Is Solved

Get the right tools for faster development and greater success. Benefit from ABBYY’s pre-trained models, intuitive APIs, and detailed documentation.

Blog

Choosing OCR Technology: Key Considerations for Software Developers

Not all OCR tools are created equal, so choosing the right one can still be a headache. Discover key points to keep in mind, including considerations for open source models, limitations of LLMs, and pricing.

Intelligent Enterprise

Beyond Basic OCR: What Developers Really Need

Choosing the right OCR solution comes down to what your work demands: accuracy, scalability, affordability, and dependable developer support.


Document AI API—frequently asked questions

What makes ABBYY Document AI API different?
Which document types are supported?
How can I test the API?
Can I rely on ABBYY for long-term solutions?
Where can I find the API documentation?
How can I join the community?
What SDKs are available with the Document AI API?