Form OCR Recognizer Component
The Novacura AI Form Recognizer provides the capability to OCR PDF invoices using custom models and layouts, delivering reliable value outputs within Novacura Flow.
This flow is the starting point of the application; it ties the fragments together. The processes run in the following order (a minimal sketch follows the list):
Upload a PDF document to the BLOB storage
Check if a trained model exists
If no model exists, the user is asked to open the online Form Recognizer Studio, create a new project and train a model
Analyze the uploaded document against the model
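For orientation, here is a minimal Python sketch of this orchestration. The helper functions are hypothetical placeholders for the fragments described in the following sections, not actual Flow or connector APIs.

```python
# Hypothetical orchestration sketch of the main flow; the three helpers
# below are stand-ins for the fragments described in this document.

from typing import Optional

def upload_to_blob(pdf_path: str, customer_id: str) -> str:
    """Placeholder for the BLOB Handling fragment."""
    return f"{customer_id}/{pdf_path}"

def find_trained_model(customer_id: str) -> Optional[str]:
    """Placeholder for the trained-model lookup fragment."""
    return None  # pretend no model is trained yet

def analyze_document(blob_name: str, model_id: str) -> None:
    """Placeholder for the analysis fragment."""
    print(f"Analyzing {blob_name} with model {model_id}")

def run_main_flow(pdf_path: str, customer_id: str) -> None:
    blob_name = upload_to_blob(pdf_path, customer_id)
    model_id = find_trained_model(customer_id)
    if model_id is None:
        # The user is directed to Form Recognizer Studio to create a
        # project, label documents, and train a model.
        print("No trained model found; open Form Recognizer Studio.")
        return
    analyze_document(blob_name, model_id)
```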
If no files were uploaded in the BLOB Handling fragment, this screen appears, and users can restart the whole process or quit.
Before the analysis process, the application checks whether a trained model exists. If the selected customer project does not have a trained model, this screen is shown. Clicking the "Form Recognizer Studio" link opens the Studio website, where an administrator can label the uploaded documents and train a new model.
This fragment handles the file uploading process, splits files into pages (if a file contains multiple pages), and checks whether the file already exists in the Blob Storage. If it exists, users can decide to override the existing file.
In the first step, users select the customer company from a list (fetched from IFS), upload a single .pdf file, and, if necessary, give a page range. Only the given pages are split from the original document and uploaded.
To upload the whole document, leave the Page Range field blank; otherwise ‘ ’ (space), ‘,’ (comma), or ‘;’ (semicolon) can be used as a separator, e.g.: “1,3-5,11” or “4-6;8;10;20-25”
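As an illustration, a page-range specification like this could be parsed as follows. This is a hypothetical sketch; the function name and behavior are assumptions, not part of the product (an invalid format raises an error, which corresponds to the wrong-format message screen described below).

```python
import re

def parse_page_range(spec: str) -> list[int]:
    """Parse a page-range string such as "1,3-5,11" or "4-6;8;10;20-25".

    Space, comma, and semicolon are accepted as separators; an empty
    string means the whole document.
    """
    pages: set[int] = set()
    for token in re.split(r"[ ,;]+", spec.strip()):
        if not token:
            continue  # empty spec -> whole document
        if "-" in token:
            start, end = token.split("-", 1)
            pages.update(range(int(start), int(end) + 1))
        else:
            pages.add(int(token))
    return sorted(pages)

print(parse_page_range("1,3-5,11"))        # [1, 3, 4, 5, 11]
print(parse_page_range("4-6;8;10;20-25"))  # [4, 5, 6, 8, 10, 20, ..., 25]
```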
The application checks whether the document already exists in Azure Blob Storage. If it exists, the user is prompted to override or skip the file.
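A minimal sketch of this existence check with the azure-storage-blob Python SDK (the connection string and container name are assumptions; the actual flow performs this check through a Novacura connector):

```python
from azure.storage.blob import BlobServiceClient

def blob_exists(conn_str: str, container: str, blob_name: str) -> bool:
    """Return True if the named blob is already in the container."""
    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    return blob.exists()
```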
After the file(s) are uploaded successfully, a message appears on the screen showing how many files were uploaded; otherwise, an error message is shown.
The files are stored under a folder in the Blob Storage container with the following naming structure:
<customer_id>_<address_id>/<file_name>.pdf
E.g.: 1000_5/714570.pdf; 10_1/708623_2.pdf
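A hypothetical sketch of how a file could be stored under this naming structure with the azure-storage-blob SDK; the function and parameter names are assumptions, and the overwrite flag corresponds to the user's override decision described below:

```python
from azure.storage.blob import BlobServiceClient

def upload_invoice(conn_str: str, container: str,
                   customer_id: str, address_id: str,
                   file_name: str, data: bytes, override: bool) -> None:
    """Store the PDF under the <customer_id>_<address_id>/ folder."""
    blob_name = f"{customer_id}_{address_id}/{file_name}.pdf"
    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    # overwrite=True corresponds to the user's "override" choice
    blob.upload_blob(data, overwrite=override)
```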
Select a customer from the list, then select a single .pdf document to upload and provide a page number or range if the document contains multiple pages.
When users try to upload a document with the same name as one that already exists in the Blob Storage, they get this warning screen and can decide to override the file or go back to the previous screen to upload another document.
When users try to upload multi-page documents and some of these documents are already uploaded, the application shows this screen.
On this screen, users can check which files exist in the Blob Storage and which ones do not. They can decide to go back to the previous step to upload another document (or other pages of the same document), override all the files, or upload only those files that have not been uploaded to the Storage yet.
If users type in a page range in the wrong format, they see this message screen.
This fragment lists the files in the Blob Storage that belong to the selected customer and counts them.
If the application finds a trained model for the selected customer, the model details are provided in a flow variable; the trained model id from this variable is needed to analyze the uploaded document. If there is no trained model for the customer yet, the return variable lacks the model id information.
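One way to verify such a model id against the service, sketched with the azure-ai-formrecognizer Python SDK (endpoint and key are assumptions; the actual flow performs this lookup through a connector):

```python
from azure.core.credentials import AzureKeyCredential
from azure.core.exceptions import ResourceNotFoundError
from azure.ai.formrecognizer import DocumentModelAdministrationClient

def trained_model_exists(endpoint: str, key: str, model_id: str) -> bool:
    """Return True if the Form Recognizer service knows this model id."""
    admin = DocumentModelAdministrationClient(endpoint,
                                              AzureKeyCredential(key))
    try:
        admin.get_document_model(model_id)
        return True
    except ResourceNotFoundError:
        return False
```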
This fragment lists the files in the Blob Storage that belong to the selected customer's current project and counts them.
This flow can copy an existing fields.json file to the new project's folder. This file contains the predefined labels that can be used for labeling documents in the OCR Tool, which is important for keeping variable naming consistent between models. A new model can be trained after at least 5 documents are labeled. Users can decide not to copy this file and create their own labels in the OCR Tool.
This fragment runs only when a new project is created; otherwise, this step is skipped.
From the dropdown list, select an existing fields.json file and click “Next” to copy and add it to the new project.
If users copy a fields.json file to the new project, the result is visible in Form Recognizer Studio: the labels (tags) appear on the right pane.
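A minimal sketch of such a copy, assuming fields.json lives directly in each project's folder within the same Blob Storage container (both assumptions):

```python
from azure.storage.blob import BlobServiceClient

def copy_fields_json(conn_str: str, container: str,
                     source_project: str, new_project: str) -> None:
    """Copy an existing fields.json into the new project's folder."""
    service = BlobServiceClient.from_connection_string(conn_str)
    src = service.get_blob_client(container, f"{source_project}/fields.json")
    dst = service.get_blob_client(container, f"{new_project}/fields.json")
    # Download the predefined labels and re-upload them to the new folder.
    data = src.download_blob().readall()
    dst.upload_blob(data, overwrite=True)
```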
This fragment handles the analysis of the PDF document using a previously trained model.
The document is uploaded to the Form Recognizer Service and a URL for retrieving the result is returned. The workflow checks for the result and, if the results are not ready, it waits and tries again until the analysis is done. When the results are returned from the service, the workflow parses all the labels and tables that are defined in the trained model.
The connector is configured to be generic when it comes to the labels, which means that you do not need to modify the connector if you have a model with different labels. That being said, the Flow script embedded in the workflow needs to be adjusted to handle specific labeled fields.
All labels are returned in a ‘fields’ table:
All tables are included in another table called ‘tables’:
The workflow uses a Flow script to parse labels and tables and returns a complete Flow object e.g., an Invoice object.
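A rough equivalent of this analyze-poll-parse sequence, sketched with the azure-ai-formrecognizer Python SDK (endpoint, key, and the returned dictionary shape are assumptions; the SDK's poller encapsulates the result URL and the wait-and-retry loop described above):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient

def analyze_invoice(endpoint: str, key: str,
                    model_id: str, pdf_path: str) -> dict:
    """Analyze a PDF with a custom trained model and collect its fields."""
    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    with open(pdf_path, "rb") as f:
        # begin_analyze_document returns a poller that keeps checking
        # the result URL until the service reports the analysis is done.
        poller = client.begin_analyze_document(model_id=model_id, document=f)
    result = poller.result()

    invoice = {}
    # Labeled fields come back per analyzed document, with a confidence
    # score between 0.0 and 1.0 for each value.
    for doc in result.documents:
        for name, field in doc.fields.items():
            invoice[name] = (field.value, field.confidence)
    # Tables are returned separately, cell by cell.
    for table in result.tables:
        for cell in table.cells:
            print(cell.row_index, cell.column_index, cell.content)
    return invoice
```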
Each object is presented in a user step as a checklist sub-task.
After the application finishes the analysis successfully, the above screen appears.
This checklist can contain one or more rows, depending on the number of documents uploaded. Clicking the ‘>’ symbol at the top corner of a line shows a preview of the parsed document. This is typically the step where the user can modify the parsed information before saving it to IFS.
This screen shows all the data that was read and processed by the OCR Tool.
The confidence level on the header shows the trained model's reading accuracy, where 1.000 is the maximum (100%) and 0.000 is the minimum (0%) value.
On this screen, users can check if the values are correct and equal to the values of the original document. If the lines contain differences, users can edit the cells and correct the values.
In addition, users can download the uploaded document and quickly review it against the AI’s response. This can be done in the FileGallery user step called “Uploaded file”.
When users confirm that the values are correct, they are transferred to IFS to generate an Instant Invoice.
This flow lists all current entities and trained models that exist in the current environment; with these models you can add or edit the models assigned to selected customers.
Customer-Model Table
On this screen, the user can either create a new Customer-Model entity or update an existing one. The screen shows the create-new-entity option ("*NEW") as well as all existing trained Customer-Model entities. To update an existing entity, click a record; on the next screen you will be prompted to update the model for that project/customer.
On this screen, the user can create a new Customer-Model entity by selecting a customer and an existing model.
On this screen, the user can edit an existing Customer-Model entity by selecting a model, or delete the current Customer-Model entity by clicking the "Delete" button.