ABBYY Tech Terms

Definitions and terms about Process AI,
Document AI, and everything in between.


Agentic AI

An advanced form of AI-driven automation where systems not only complete tasks but also make autonomous decisions without human intervention. Unlike traditional automation, which follows set rules, agentic AI adapts to changing conditions and handles dynamic, unpredictable tasks.

AI OCR

Optical character recognition (OCR) that uses artificial intelligence (AI) and machine learning (ML) to improve text recognition, particularly in complex documents, poor-quality scans, or handwritten text. AI OCR also applies natural language processing (NLP) to understand document structure and context.

Algorithmic fairness

A principle to ensure that AI decision-making is unbiased and equitable. This involves techniques to detect, reduce, and prevent bias in AI training data and algorithmic outcomes.

Artificial intelligence (AI)

Technology that simulates human cognitive functions like learning, reasoning, and decision-making. AI helps businesses run more efficiently by automating processes, improving data management, and enabling intelligent decision-making.

Business process management (BPM)

A system for designing and automating business processes to streamline workflows across departments and improve efficiency and consistency.

Computer vision

An AI-powered technology that allows machines to interpret and understand images. Computer vision can be trained to recognize objects, patterns, and features in visual content, and can be used to automate tasks like quality control and facial detection.

Deep machine learning (deepML)

A subset of machine learning (ML) that uses artificial neural networks to analyze vast amounts of data and improve predictions over time. Also known as deep learning, deepML models can be pre-trained for specific tasks, so they are accurate out of the box and continue to improve with use.

Digital twin

A virtual version of a physical object, system, or process that updates in real time. It uses smart sensors to track performance, detect issues, and improve efficiency. In process simulation, digital twins let businesses test and refine workflows before making real-world changes.

Document skills

Pre-trained AI-powered extraction models that automate document-related tasks such as data extraction, classification, validation, and exception handling. Also called document models, these skills typically use ML, NLP, OCR, and other AI technologies to process documents with high accuracy and efficiency. Because document skills can be pre-trained, or trained by customers for a custom fit, businesses can quickly integrate them into workflows without extensive AI expertise.

Enterprise automation

The use of AI-driven technologies to improve business processes at scale. Enterprise automation allows businesses to reduce manual work, improve accuracy, and make workflows more efficient.

Enterprise content management (ECM)

A system that helps businesses store, organize, and manage digital content securely. ECM typically includes tools for content collaboration, workflow automation, and record-keeping.

Enterprise resource planning (ERP)

A system that integrates core business functions into a unified platform to improve data flow and decision-making across an organization.

Ethical AI

A framework for developing and using AI responsibly, ensuring fairness, transparency, accountability, and alignment with human values. Key aspects include algorithmic fairness, privacy protection, and explainable AI (XAI), all aimed at minimizing bias and promoting trustworthy AI systems.

Explainable AI (XAI)

AI systems designed to provide clear, interpretable explanations for their decisions, so users can understand why an AI model made a specific choice.

Extract, load, and transform (ELT)

A data integration process that extracts data from different sources, then loads the raw data into a data warehouse or data lake, where it is cleaned and organized as needed. ELT takes advantage of cloud storage for faster, more flexible data handling and access.

Extract, transform, and load (ETL)

A data integration process that extracts data from different sources, cleans and organizes it, and then stores it in a database for analysis. ETL helps businesses turn raw data into useful insights for reporting and analytics.
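As a rough illustration of the extract, transform, load order, the sketch below uses Python's built-in sqlite3 module. The sample rows, the `orders` table, and the function names are hypothetical and chosen only for this example; a real pipeline would read from actual source systems.

```python
import sqlite3

# Hypothetical raw records, standing in for data extracted from source systems.
RAW_ROWS = [
    {"id": "1", "amount": " 19.90 ", "currency": "usd"},
    {"id": "2", "amount": "5", "currency": "EUR "},
]

def extract():
    """Extract: return raw records as they arrive from the source."""
    return list(RAW_ROWS)

def transform(rows):
    """Transform: clean and type the records before they are stored."""
    return [
        (int(r["id"]), float(r["amount"].strip()), r["currency"].strip().upper())
        for r in rows
    ]

def load(conn, rows):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

def run_etl(conn):
    """Run the three steps in ETL order."""
    load(conn, transform(extract()))

conn = sqlite3.connect(":memory:")
run_etl(conn)
```

In an ELT pipeline the same steps would run in a different order: the raw strings would be loaded first and cleaned later inside the warehouse.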

Fast machine learning (FastML)

A lighter, more adaptive approach to ML that, even with minimal data, quickly fine-tunes AI models for improved accuracy.

Generative AI (genAI)

A type of AI that is trained to create new content, such as text, images, audio, video, and code, by learning from vast amounts of data. Instead of just analyzing and processing information, genAI generates new data based on probability and patterns it has learned.

Human-centric AI

AI designed with human values at its core. The goal of human-centric AI is to enhance user experience and support ethical decision-making that is in alignment with human and social needs.

Human-in-the-loop (HITL)

Also known as “manual verification,” HITL is a hybrid AI approach where humans step in to review, validate, or correct AI-driven processes when needed for quality control. Over time, AI learns from these interventions to improve accuracy.
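One common way to decide when a human steps in is a confidence threshold. The sketch below is a minimal illustration under that assumption; the `Extraction` class, the `route` function, and the 0.85 cutoff are hypothetical and not taken from any specific product.

```python
from dataclasses import dataclass

# Hypothetical cutoff; real systems usually tune this per field or document type.
REVIEW_THRESHOLD = 0.85

@dataclass
class Extraction:
    field: str
    value: str
    confidence: float  # model's self-reported score in [0, 1]

def route(extractions, threshold=REVIEW_THRESHOLD):
    """Split results into auto-accepted items and items queued for human review."""
    accepted, needs_review = [], []
    for item in extractions:
        if item.confidence >= threshold:
            accepted.append(item)
        else:
            needs_review.append(item)
    return accepted, needs_review

results = [
    Extraction("invoice_number", "INV-1001", 0.98),
    Extraction("total", "1,250.00", 0.62),
]
accepted, needs_review = route(results)
```

Here the low-confidence `total` field would go to a reviewer, and the corrected value could later be fed back as training data.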

Hyperautomation

The integration of multiple automation technologies, tools, and platforms to identify, evaluate, and automate as many business and IT processes as possible. Hyperautomation typically combines AI, RPA, IDP, process mining, and other automation solutions to reduce manual effort and improve efficiency.

Identity proofing

The process of confirming a user’s identity by checking their information against trusted data or documents. Unlike authentication, which checks credentials such as passwords to grant access, identity proofing verifies who the user is before any credentials are issued.

Intelligent character recognition (ICR)

An advanced form of OCR that recognizes handwritten characters. Using AI, it improves its accuracy over time and can now process cursive handwriting as well.

Intelligent document processing (IDP)

AI-powered technology that reads, extracts, organizes, and interprets process-driving data from documents. IDP works with structured, semi-structured, and unstructured formats, going beyond simple character recognition to understand meaning through NLP, ML, and other AI-based technologies.

Intelligent process automation (IPA)

An advanced form of automation that combines AI technologies like ML, NLP, and computer vision with RPA to streamline complex business processes, handle unstructured data, and mimic human decision-making.

Key-value pair extraction

A technique in document processing where AI identifies and extracts structured data from forms and documents by recognizing key terms (e.g., "Date") and their corresponding values (e.g., 11/1/2024).
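To make the idea of keys and values concrete, here is a deliberately simplified sketch. It uses a regular expression rather than an AI model, so it only handles clean "Key: value" lines; the key list and the sample text are hypothetical.

```python
import re

# Hypothetical keys to look for; an AI-based system would learn these from data.
KEYS = ["Date", "Invoice No", "Total"]

def extract_pairs(text, keys=KEYS):
    """Return {key: value} for each 'Key: value' line found in the text."""
    pairs = {}
    for key in keys:
        # Match the key at the start of a line, then capture the rest of that line.
        pattern = rf"^{re.escape(key)}\s*:\s*(.+)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            pairs[key] = match.group(1).strip()
    return pairs

sample = """Invoice No: 4821
Date: 11/1/2024
Total: 310.50"""

pairs = extract_pairs(sample)
```

A rule like this breaks as soon as the layout changes, which is the gap that model-based extraction is meant to close.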

Know Your Customer (KYC)

A set of regulations requiring financial institutions to verify identities, assess risk, and prevent fraud, money laundering, and financial crime. KYC includes a Customer Identification Program (CIP), Customer Due Diligence (CDD), Enhanced Due Diligence (EDD), and ongoing monitoring of transactions.

Language model (LM)

An AI system designed to understand, generate, and predict human language. Trained on vast amounts of text, it learns grammar, patterns, and context to generate sentences, answer questions, translate languages, and more.

Large language model (LLM)

A language model that understands and generates human language with impressive accuracy. It predicts patterns and nuances in text based on input prompts, delivering responses tailored to user needs. Its outputs can sometimes be inaccurate or unreliable; such errors are known as “hallucinations.”

Low-code/no-code

A software development approach that lets users build and automate workflows with little to no coding by using a drag-and-drop approach or natural language inputs.

Machine learning (ML)

A type of AI that allows systems to recognize patterns and learn from data to improve performance over time without being explicitly programmed. ML powers tasks like data extraction, NLP, and OCR.

Natural language processing (NLP)

A branch of AI that allows systems to extract, understand, and generate human language. NLP can be used to analyze and extract key data from documents to improve process automation.

NeoML

An open-source machine learning framework developed by ABBYY. NeoML provides the tools and infrastructure needed to develop, train, and run AI models for data analytics, computer vision, deepML, and NLP.

Neural networks

Machine learning models inspired by the human brain, used for tasks like image recognition and NLP.

OCR Container

ABBYY OCR Container is a pre-packaged, ready-to-deploy OCR solution that lets businesses run OCR in cloud or on-premises environments without extensive configuration.

Optical character recognition (OCR)

Technology that extracts text from scanned documents or images and converts it into machine-readable format. OCR enables text to be edited, searched, and stored electronically, and is often an essential part of IDP and other automation workflows.

Privacy by design

A principle that integrates data protection and privacy safeguards into software and system development from the start to protect user information and comply with regulations.

Process analysis

A method for examining and understanding workflows by visualizing and examining process data. A core component of process intelligence, process analysis helps organizations identify inefficiencies and areas for improvement.

Process discovery

A technique for understanding how business workflows function in their current state. As a core component of process intelligence, process discovery combines process mining and task mining to identify inefficiencies, bottlenecks, and opportunities for optimization.

Process excellence (PEX)

A strategy for continuously improving business processes to maximize efficiency and effectiveness. PEX involves process assessment, modification, and testing to meet defined goals.

Process intelligence

A combination of data-driven capabilities focused on analyzing and improving business operations. Built on five key pillars—process discovery, analysis, monitoring, prediction, and simulation—process intelligence goes beyond traditional process mining to provide deeper, end-to-end insights for continuous improvement.

Process mining

A data-driven technique that analyzes event logs from business systems to visualize and optimize workflows. By tracking timestamps, activities, and case IDs, process mining reconstructs process flows to uncover inefficiencies and deviations.
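The reconstruction step can be sketched in a few lines. The example below assumes a tiny in-memory event log with a case ID, an activity name, and an ISO-formatted timestamp per event; the log contents and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity, ISO timestamp). Rows may be unordered.
EVENT_LOG = [
    ("c1", "Receive invoice", "2024-11-01T09:00"),
    ("c1", "Approve", "2024-11-01T10:30"),
    ("c1", "Pay", "2024-11-02T08:00"),
    ("c2", "Pay", "2024-11-03T14:00"),
    ("c2", "Receive invoice", "2024-11-01T09:15"),
]

def reconstruct_traces(events):
    """Group events by case ID and order each case by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((timestamp, activity))
    # ISO timestamps in the same format sort correctly as plain strings.
    return {case: [a for _, a in sorted(evts)] for case, evts in by_case.items()}

def directly_follows(traces):
    """Count how often one activity is immediately followed by another."""
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts

traces = reconstruct_traces(EVENT_LOG)
edges = directly_follows(traces)
```

In this sample, case `c2` goes from "Receive invoice" straight to "Pay", so the edge counts expose a path that skips approval.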

Process monitoring

A method and system for tracking workflows in real time to ensure they follow the correct steps. As one of the five pillars of process intelligence, process monitoring can also send alerts or trigger automated corrective actions in the case of a delay or a deviation.

Process prediction

A technique that uses past data and AI models to forecast how a process will likely turn out before it’s completed. Process prediction plays a key part in process intelligence by helping businesses anticipate problems and make proactive decisions.

Process simulation

A technique that creates digital twins or models of business processes to test scenarios and see potential outcomes. A core component of process intelligence, process simulation lets organizations simulate changes and gauge their impacts before making them.

Purpose-built AI

ABBYY’s approach to AI: developing AI-driven solutions that are specifically engineered to address the unique needs of an organization or a particular use case. Compared to general-purpose AI models, purpose-built AI solutions are more customized and focused and, as a result, smaller, more accurate, and quicker to deploy.

REST API

A method for applications to communicate with each other over the internet using simple web (HTTP) requests. A REST API allows businesses to connect existing systems with intelligent automation solutions for data sharing and workflow automation.

Retrieval-augmented generation (RAG)

An AI technique that improves language models by allowing them to retrieve and integrate external knowledge in real time. Instead of relying only on pre-trained data, RAG pulls relevant information from a database or knowledge source to produce more accurate, context-aware responses.
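The retrieval half of this technique can be illustrated without a language model. The sketch below ranks passages by simple word overlap, which is a stand-in for the vector search most systems use, and stops at building the prompt; the knowledge base and function names are hypothetical.

```python
import re

# Hypothetical knowledge base; in practice this would be a document store or index.
KNOWLEDGE_BASE = [
    "Invoices over 10,000 EUR require approval from two managers.",
    "Purchase orders are archived for seven years.",
    "Expense reports are due on the fifth day of each month.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question and keep the best top_k."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Prepend the retrieved passages to the question for a language model."""
    context = "\n".join(f"- {p}" for p in retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How long are purchase orders archived?", KNOWLEDGE_BASE)
```

The resulting prompt would then be passed to a language model, which answers from the retrieved passage rather than from its training data alone.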

Robotic process automation (RPA)

A technology that uses software bots to handle repetitive digital tasks like data entry and processing.

Semi-structured documents

Documents that have some structure but are not rigidly formatted. Invoices are a typical example: they mostly contain the same components, but those components can appear in different places from one invoice to the next.

Small language model (SLM)

A language model designed for efficiency and precision in specific tasks. Faster, more affordable, and easier to train, SLMs excel in targeted applications while using less computing power and reducing environmental impact.

Software development kit (SDK)

A collection of tools, documentation, and sample code that allows developers to integrate technology when building applications for specific platforms, frameworks, or hardware systems. SDKs simplify software development by providing pre-built components and guidelines that help developers create software more efficiently.

Straight-through processing (STP)

The processing of documents by a system automatically, without exceptions or human-in-the-loop review. The STP rate measures the portion of documents that are handled this way. STP rates are improved through the use of AI and ML to extract, classify, and validate data with minimal manual touchpoints.
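As a worked example, the STP rate is the number of documents processed with no manual touch divided by the total number of documents. The function and parameter names below are hypothetical.

```python
def stp_rate(total_documents, documents_with_manual_touch):
    """Return the share of documents processed with no human intervention."""
    if total_documents <= 0:
        raise ValueError("total_documents must be positive")
    if not 0 <= documents_with_manual_touch <= total_documents:
        raise ValueError("documents_with_manual_touch must be between 0 and total")
    automatic = total_documents - documents_with_manual_touch
    return automatic / total_documents

# 1,000 documents, 150 of which needed an exception or a human review.
rate = stp_rate(total_documents=1000, documents_with_manual_touch=150)
print(f"STP rate: {rate:.0%}")
```

With these sample numbers, 850 of 1,000 documents pass straight through, giving a rate of 85%.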

Structured data

Data that is organized in a predefined format and ready to be processed by systems in a workflow. IDP enables organizations to turn the content of all document types (structured, semi-structured, and unstructured) into structured data for processing, in the form of JSON files.
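For illustration, the snippet below shows what structured output for a single invoice might look like when serialized as JSON. The field names and layout are hypothetical and do not reflect any particular product's schema.

```python
import json

# Hypothetical fields extracted from one invoice.
extracted = {
    "document_type": "invoice",
    "fields": {
        "invoice_number": "4821",
        "date": "2024-11-01",
        "total": 310.50,
    },
}

# Serialize to JSON so downstream systems can consume the data.
payload = json.dumps(extracted, indent=2)
print(payload)
```

Because every value sits under a named key, a downstream system can read `fields.total` directly instead of searching the page for it.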

Task mining

A technique that captures and analyzes user desktop activities to uncover how tasks are performed within a process. Task mining fills in process gaps by tracking human interactions with applications to uncover inefficiencies and deviations.

Unstructured data

Data without a predefined format, such as information in scanned files, handwritten documents, and videos.

Zero trust security

A cybersecurity framework where no user, device, or system is automatically trusted. Instead, access is continuously verified through authentication and encryption to reduce security risks.

Zero-shot learning

An AI solution's ability to process and understand new data or tasks without requiring prior training on specific examples.
