Form Recognizer connector



Navigation menu

Overview

How does Form Recognizer work

Types of data extraction models in Form Recognizer

Requirements

How do you start using the Form Recognizer connector

 

Overview

Optical Character Recognition (OCR) is a technology widely used in digital transformation strategies. It lets you convert images, handwriting, or printed documents into encoded text, so you can use this information digitally in IT systems. Form Recognizer is a solution that provides swift and easy-to-use OCR capabilities. Bizagi provides a native connector to integrate your automated processes with Form Recognizer, so you can get information from scanned documents, credit cards, or receipts, and use it in your processes. This section describes how you can use the Form Recognizer connector.

 

[Image: Form Recognizer connector overview]

 

How does Form Recognizer work

Form Recognizer is a cognitive service offered by Microsoft Azure that uses advanced machine learning technology to build automated data extraction models. These models identify and extract text, key/value pairs, selection marks, tables, and structures from documents, for you to integrate in your applications and enhance your business processes. Using any of the three types of data extraction models Form Recognizer offers, you can quickly obtain precise results without substantial manual efforts or considerable data science knowledge.

 

For detailed information about the Form Recognizer cognitive service, refer to What is Form Recognizer?

 

Types of data extraction models in Form Recognizer

Form Recognizer has three different types of data extraction models you can invoke using a REST API or client library SDKs:

1. Layout API: The Layout API model detects and extracts data from documents with a specific layout (e.g., tables and selection marks). Both the text and the layout structure are extracted using high-definition optical character recognition (OCR) tailored to this kind of document.

2. Prebuilt models: Prebuilt models are pretrained models for common document types that extract text, key/value pairs, and line items from documents. They are ready to use and therefore do not require any training samples. The three prebuilt APIs currently available in the Form Recognizer cognitive service are Prebuilt Invoice, Prebuilt Receipt, and Prebuilt Business Card.

3. Custom models: Custom models are data extraction models that you create and train yourself, so that they learn the structure of your organization's specific documents in an intelligent way. For adequate training, these models use five input samples to learn the forms' structure and intelligently extract data tailored to those documents.
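To give a feel for the REST API mentioned above, the following is a rough sketch (not Bizagi's internal implementation) of submitting a document to a custom model using the Form Recognizer v2.1 Analyze Form operation. The endpoint, subscription key, and model ID are placeholders you would take from your own Azure resource; analysis is asynchronous, so the service replies with an Operation-Location URL that is polled for the result:

```python
import urllib.request

API_VERSION = "v2.1"

def analyze_url(endpoint, model_id):
    """Build the Analyze Form URL for a custom model (v2.1 REST API)."""
    return f"{endpoint}/formrecognizer/{API_VERSION}/custom/models/{model_id}/analyze"

def analyze_document(endpoint, key, model_id, pdf_bytes):
    """Submit a document for analysis. On HTTP 202 the service returns an
    Operation-Location header, which must be polled for the extraction result."""
    req = urllib.request.Request(
        analyze_url(endpoint, model_id),
        data=pdf_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": key,   # your Azure resource key
            "Content-Type": "application/pdf",  # or image/jpeg, image/png, image/tiff
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.headers["Operation-Location"]

# URL built for a hypothetical resource and model ID (no network call here):
url = analyze_url("https://westus2.api.cognitive.microsoft.com",
                  "3f2504e0-4f89-11d3-9a0c-0305e82c3301")
```

The Bizagi connector wraps this request/poll cycle for you; the sketch only illustrates what the underlying service call looks like.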

 

Note: Bizagi's native Form Recognizer connector is built around custom data extraction models. The Layout API and prebuilt models are described only to give you a better understanding of the Form Recognizer cognitive service.

 

There are two ways a user can train a custom model: unsupervised learning and supervised learning.

i.Unsupervised learning: Unsupervised learning means that you train the data extraction model using unlabeled data. Form Recognizer uses this type of training by default to understand the layout and relationships between fields and entries in forms. The service's internal algorithm clusters the forms by type, discovers what keys and tables are present, and associates values to keys and entries to tables.

ii. Supervised learning: Contrary to unsupervised training, supervised training uses labeled data to extract values of interest. This type of training might demand previous work to label the data (either manually or with intensive coding) and to maintain the labeled forms, but it results in better-performing models that adapt to complex forms (i.e., forms that contain values without keys). In supervised training, Form Recognizer uses the Layout API model to learn the expected sizes and positions of printed and handwritten text elements, and then uses user-specified labels to learn the key/value associations in the documents.
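In the v2.1 REST API, the choice between the two training modes comes down to a single flag in the Train Custom Model request body: `useLabelFile` tells the service whether to read label files stored alongside the training documents. A minimal sketch of building that body (the SAS URL for the training container is a placeholder):

```python
import json

def train_request_body(sas_url, supervised=False, prefix=""):
    """Build the Train Custom Model request body (v2.1 REST API).

    supervised=True makes the service use the label files stored next to the
    training documents (supervised learning); supervised=False lets it cluster
    the unlabeled forms on its own (unsupervised learning)."""
    return {
        "source": sas_url,  # SAS URL of the Azure Blob container holding the samples
        "sourceFilter": {"prefix": prefix, "includeSubFolders": False},
        "useLabelFile": supervised,
    }

body = train_request_body("https://myaccount.blob.core.windows.net/forms?sv=PLACEHOLDER",
                          supervised=True)
print(json.dumps(body, indent=2))
```

Either way, the same five-sample minimum described above applies; only the presence of labels changes how the service learns.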

 

Requirements

The Form Recognizer cognitive service operates with documents that fulfill the following conditions:

Format must be JPG, PNG, PDF (text or scanned), or TIFF.

File size must be less than 50 MB.

Image dimensions must be between 50 x 50 pixels and 10000 x 10000 pixels.

PDF dimensions must be at most 17 x 17 inches, corresponding to Legal or A3 paper sizes and smaller.

For PDF and TIFF, only the first 200 pages are processed (with a free tier subscription, only the first two pages are processed).

The total size of the training data set must be 500 pages or less.

If PDFs are password-locked, the lock must be removed before submitting them.

If scanned from paper documents, forms should be high-quality scans.

Text must use the Latin alphabet (English characters).

For unsupervised learning (without labeled data), data must contain keys and values. Keys must appear above or to the left of the values.

Form Recognizer does not support the following types of input data: complex tables (nested tables, combined headers, merged cells, etc.), check boxes, and radio buttons.
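Several of the conditions above can be verified locally before a document is ever submitted. The following pre-flight check covers the file format and the 50 MB size limit; checks that need to open the file (image dimensions, PDF page count, password locks) would require an image/PDF library and are omitted from this sketch:

```python
import os

# Formats listed in the requirements above; the .jpeg/.tif variants are
# assumed to be accepted alongside .jpg/.tiff.
ACCEPTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf", ".tiff", ".tif"}
MAX_SIZE_BYTES = 50 * 1024 * 1024  # file size must be less than 50 MB

def preflight_check(path):
    """Return a list of requirement violations for the given file.

    An empty list means the file passes the locally verifiable checks
    (format and size); it does not guarantee the service will accept it."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if os.path.getsize(path) >= MAX_SIZE_BYTES:
        problems.append("file is 50 MB or larger")
    return problems
```

Running such a check before invoking the connector avoids burning a service call on a document the service would reject anyway.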

 

How do you start using the Form Recognizer connector

To start using the Form Recognizer connector in Bizagi, follow these steps:

1. Create the required Azure resources

2. Create a Form Recognizer connector in Bizagi Studio

3. Execute Form Recognizer from an activity action