Form OCR Recognizer Component
The Novacura AI Form Recognizer provides the capability to OCR PDF invoices using custom models and layouts, delivering reliable value outputs within Novacura Flow.

This flow is the starting point of the application; it ties the fragments together. The processes run in the following order:
1. Upload a PDF document to the BLOB storage
2. Download the FOTT file if it exists
3. If none exists, generate a new FOTT file and download it
4. If the FOTT file contains a trained model ID, analyze the uploaded document against that model
5. If the FOTT file does not contain a trained model ID, ask the user to open the online Form OCR Testing Tool (FOTT), label the documents, and train the AI to create a model

If no files were uploaded in the BLOB Handling fragment, this screen appears, and users can restart the whole process or quit.

Before the analysis, the application checks whether the project’s FOTT file contains a trained model ID. If the selected customer project does not have one yet, this screen is shown. Clicking the “Form Recognizer Tool” link opens the OCR Tool website, where an administrator can label the uploaded documents and train a model.

This screen appears at the end of the application; clicking the Exit button ends the flow.

This fragment handles the file upload process, splits files that contain multiple pages, and checks whether the file already exists in the BLOB storage. If the file exists, users can decide to overwrite it.
In the first step, the users select the customer company from a list (fetched from IFS), upload a single .pdf file, and, if necessary, give a page range. Only the given pages are split out of the original document and uploaded.
To upload the whole document, leave the Page Range field blank; otherwise ‘ ‘ (space), ‘,’ (comma), or ‘;’ (semicolon) can be used as a separator, e.g.: “1,3-5,11” or “4-6;8;10;20-25”.
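As an illustration, the snippet below shows how such a range string can be parsed and the matching pages split out. It is a minimal sketch in Python, assuming pypdf as the PDF library; the function names are hypothetical and not part of the actual Flow connector.

```python
import re
from pypdf import PdfReader, PdfWriter

def parse_page_range(spec: str) -> list[int]:
    """Parse a range string such as "1,3-5,11" or "4-6;8;10;20-25".

    Space, comma, and semicolon all work as separators; an empty
    string means "upload the whole document" and yields no filter.
    """
    pages: set[int] = set()
    for token in re.split(r"[ ,;]+", spec.strip()):
        if not token:
            continue
        if "-" in token:
            start, end = token.split("-", 1)
            pages.update(range(int(start), int(end) + 1))
        else:
            pages.add(int(token))
    return sorted(pages)

def split_pages(source_pdf: str, pages: list[int]) -> list[PdfWriter]:
    """Produce one single-page document per requested page (1-based)."""
    reader = PdfReader(source_pdf)
    writers = []
    for page_no in pages or range(1, len(reader.pages) + 1):
        writer = PdfWriter()
        writer.add_page(reader.pages[page_no - 1])
        writers.append(writer)
    return writers
```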
The application checks whether the document already exists in Azure BLOB Storage. If it exists, the user is prompted to overwrite or skip the file.
After the file(s) are uploaded successfully, a message on the screen shows which (and how many) files were uploaded; otherwise an error message is shown.
The files are stored under a folder in the BLOB storage container. The naming structure is:
<customer_id>_<address_id>/<document_name>.pdf
E.g.: 1000_5/714570.pdf; 10_1/708623_2.pdf
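A minimal sketch of the naming and the existence check, assuming the azure-storage-blob Python SDK; the connection string, container name, and function name are placeholders, not the actual Flow configuration:

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = service.get_container_client("<container-name>")

def upload_invoice(customer_id: str, address_id: str, document_name: str,
                   data: bytes, overwrite: bool = False) -> bool:
    """Upload a PDF as <customer_id>_<address_id>/<document_name>.pdf.

    Returns False without uploading when the blob already exists and
    overwrite was not requested, mirroring the overwrite/skip prompt.
    """
    blob = container.get_blob_client(
        f"{customer_id}_{address_id}/{document_name}.pdf")
    if blob.exists() and not overwrite:
        return False
    blob.upload_blob(data, overwrite=True)
    return True
```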

Select a customer from the list, then select a single .pdf document to upload and provide a page number or range if the document contains multiple pages.

When users try to upload a document whose name already exists in the Blob Storage, they get this warning screen and can decide to overwrite the file or go back to the previous screen to upload another document.

When users try to upload a multi-page document and some of the resulting files are already uploaded, the application shows this screen.
On this screen, users can check which files already exist in the Blob Storage and which do not. They can decide to go back to the previous step to upload another document (or other pages of the same document), overwrite all the files, or upload only those files that have not been uploaded to the Storage yet.

If users type in a page range in the wrong format, they see this message screen.

This fragment lists the files that belong to the selected customer in the BLOB storage, counts them, and downloads the .fott file of the current project.
If the application finds a .fott file, it provides the content of the file in a flow variable, which is needed to get the trained model ID for analyzing the uploaded document. If there is no .fott file for the project yet, the return variable is empty, the main flow continues, and a .fott file is generated.
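A sketch of what this fragment does, again assuming the azure-storage-blob SDK; the exact blob path of the .fott file is an assumption based on the naming described in the next fragment:

```python
import json
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = service.get_container_client("<container-name>")

def load_project(customer_id: str, address_id: str):
    """List the customer's uploaded files and fetch the project's .fott file.

    Returns the file names plus the parsed .fott content, or None when
    no project file exists yet (the main flow then generates one).
    """
    prefix = f"{customer_id}_{address_id}/"
    files = [b.name for b in container.list_blobs(name_starts_with=prefix)]
    fott = container.get_blob_client(f"{customer_id}_{address_id}.fott")
    if not fott.exists():
        return files, None
    return files, json.loads(fott.download_blob().readall())
```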

This fragment creates a new OCR project file/folder in the Azure Storage Account. The workflow takes the customer information and creates a <customer_id>_<address_id>.fott file and folder structure within the storage account. A shared project key is then generated and provided in a single user step.
A user step presents the shared security key used to import the project into the OCR Tool. The shared key must be saved by the user (or email functionality added to the flow), as the shared key is not visible after this step. It is impossible to load the project into the OCR Tool without the shared secure key.
This workflow uses the OCR-Crypto REST connector, which uses WebCrypto methods to encrypt specific parts of the FOTT project file and to generate an encrypted shared key.
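For illustration only, the snippet below sketches an AES-GCM encrypt step comparable to what a WebCrypto-based connector performs (AES-GCM is WebCrypto’s standard AEAD cipher). The key handling, the choice of encrypted value, and the function names are assumptions, not the OCR-Crypto connector’s actual implementation:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_project_field(plaintext: str, key: bytes) -> bytes:
    """Encrypt one project-file value with AES-GCM.

    The 12-byte nonce is prepended so the ciphertext is self-contained;
    this mirrors a common WebCrypto pattern but is only an illustration.
    """
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, plaintext.encode("utf-8"), None)

# A random 256-bit key stands in for the generated shared project key.
shared_key = AESGCM.generate_key(bit_length=256)
token = encrypt_project_field("<value from the FOTT project file>", shared_key)
```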

If there is no existing project to upload the file to, the application creates a new project and places the uploaded document in a new folder in the Blob Storage.
To open the project, users can follow the steps below. To open the OCR Tool, click the “OCR Tool website” link; it opens the OCR Tool in a new tab.
- Click “Open Cloud Project”

- Paste the Shared Key and click “Open”

- The new project opens


This flow can copy an existing fields.json file to the new project’s folder. This file contains the predefined labels that can be used for labeling documents in the OCR Tool; this is important for consistent variable naming between models. A new model can be trained after at least 5 documents are labeled. Users can decide not to copy this file and create their own labelling in the OCR Tool.
This fragment runs only when a new project is created (i.e., when a new .fott file is generated); otherwise this step is skipped.
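Conceptually the copy is a single server-side blob copy. A sketch with the azure-storage-blob SDK (the project folder names and the container are placeholders):

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = service.get_container_client("<container-name>")

def copy_fields_json(source_project: str, target_project: str) -> None:
    """Copy fields.json from an existing project folder to the new one,
    so both projects label documents with the same tag names."""
    source = container.get_blob_client(f"{source_project}/fields.json")
    target = container.get_blob_client(f"{target_project}/fields.json")
    # Server-side copy within the same storage account.
    target.start_copy_from_url(source.url)
```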

From the dropdown list, select an existing fields.json file and click “Next” to copy it into the new project.
If the users copy a fields.json file to the new project, the result is visible in the OCR Tool: the labels (tags) appear in the right pane.


This fragment handles the analysis of the PDF document using a previously trained model.
The document is uploaded to the Form Recognizer service, and a URL for retrieving the result is returned. The workflow checks for the result and, if it is not ready, waits and tries again until the analysis is done. When the results are returned from the service, the workflow parses all the labels and tables that are defined in the trained model.
The connector is configured to be generic with respect to labels, so you do not need to modify the connector if your model uses different labels. That said, the Flow script embedded in the workflow needs to be adjusted to handle specific labeled fields.
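The underlying call pattern looks roughly like the sketch below, written against the Form Recognizer v2.1 REST API (the version FOTT targets); the endpoint, key, and model ID are placeholders read from configuration and the project’s .fott file:

```python
import time
import requests

ENDPOINT = "https://<resource>.cognitiveservices.azure.com"
KEY = "<form-recognizer-key>"
MODEL_ID = "<trained-model-id>"  # taken from the project's .fott file

def analyze(pdf_bytes: bytes) -> dict:
    """Submit a PDF to a custom model and poll until the analysis is done."""
    resp = requests.post(
        f"{ENDPOINT}/formrecognizer/v2.1/custom/models/{MODEL_ID}/analyze",
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/pdf"},
        data=pdf_bytes,
    )
    resp.raise_for_status()
    result_url = resp.headers["Operation-Location"]  # URL for retrieving the result

    while True:
        result = requests.get(
            result_url, headers={"Ocp-Apim-Subscription-Key": KEY}).json()
        if result["status"] in ("succeeded", "failed"):
            return result
        time.sleep(2)  # not ready yet: wait and try again
```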
All labels are returned in a ‘fields’ table:

All tables are included in another table called ‘tables’:

The workflow uses a Flow script to parse the labels and tables and returns a complete Flow object, e.g. an Invoice object (a parsing sketch follows below).
Each object is presented in a user step as a checklist sub-task.
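A sketch of that parsing step, assuming the v2.1 response shape (labels under documentResults[].fields, tables under pageResults[].tables); the output shape is illustrative, not the actual Flow script’s contract:

```python
def parse_invoice(result: dict) -> dict:
    """Flatten the analysis result into a simple invoice-like object."""
    doc = result["analyzeResult"]["documentResults"][0]
    fields = {
        name: {"text": value.get("text"), "confidence": value.get("confidence")}
        for name, value in doc["fields"].items()
        if value is not None  # unlabeled fields come back as null
    }
    tables = [
        table
        for page in result["analyzeResult"].get("pageResults", [])
        for table in page.get("tables", [])
    ]
    return {"fields": fields, "tables": tables}
```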

After the application finishes the analysis successfully, the above screen appears.
This checklist can contain a single row or multiple rows, depending on the number of documents uploaded. Clicking the ‘>’ symbol at the top corner of a line shows a preview of the parsed document. This is typically the step where the user can modify the parsed information before saving it to IFS.

This screen shows all the data that was read and processed by the OCR Tool.
The confidence level in the header shows the trained model’s reading accuracy, where 1.000 is the maximum (100%) and 0.000 is the minimum (0%).
On this screen, users can check whether the values are correct and match the original document. If the lines contain differences, users can edit the cells and correct the values.
In addition, users can download the uploaded document to quickly review it against the AI’s response. This can be done in the FileGallery user step called “Uploaded file”.
When the users confirm that the values are correct, the values are transferred to IFS to generate an Instant Invoice.