Convincing clients to start using new software features can be challenging, and OCR solutions are no exception. Every customer wants an OCR solution that is precise out of the box. So how do you achieve this? This guide will show you:

  • How to use the AI’s pre-learning ability.
  • How to use rows from your database to achieve precise data extraction from day one.

Why should you use the AI’s pre-learning ability?

In the previous guide, we implemented typless AI’s continuous learning ability and saw how to integrate it into an existing workflow to support the customer’s work. However, one problem remained: at the beginning, extraction did not return any data. So how do you improve the first experience for your customers? This is where pre-learning comes in: it lets you train the AI before your customer has any interaction with it. As a result, the extracted data is accurate enough to satisfy your customers from the start.

What do I need to do?

If you have read the guide on how to choose an OCR solution for ERP, you know that classic AI pre-learning can be costly and time-consuming. You want to invest as few resources as possible in training the AI, so it would be a lifesaver if you could use the data that is already in your database. This is something typless can do for you. So let’s take a look.


How to use AI’s pre-learning ability?

Prerequisites

  1. Register at typless
  2. Log in
  3. Click the Settings tab in the side navigation bar to get an API key
  4. The pre-learning ability is not enabled on the free plan – contact us to enable it for testing
  5. Create a new document type named “pre-learning-example”:
    • Click on the Documents tab in the side navigation bar
    • Click on the +New button
    • For the name set “pre-learning-example”
    • For the OCR language select “English”
    • Click on the Next button
    • Leave fields as they are and click on the Confirm button
  6. Get the code of the example project from GitHub.

Set the API key:

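The example project’s own way of supplying the key is not reproduced here. As a minimal sketch, assuming you keep the key in an environment variable (the name TYPLESS_API_KEY is a hypothetical choice, not something typless prescribes), you could load it in Python like this:

```python
import os


def load_api_key(env_var: str = "TYPLESS_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending unauthenticated requests.
        raise RuntimeError(f"Set the {env_var} environment variable to your typless API key.")
    return api_key
```

Keeping the key out of the source code means you can share the scripts without leaking your credentials.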

To install requirements:


Using AI’s pre-learning ability

In the examples directory, you will find example invoices and a database. The example database contains tables for received invoices and suppliers. Each invoice record contains the following fields:

  • supplier – ID of the supplier,
  • invoice_number – invoice number,
  • issue_date – issue date of the invoice,
  • total_amount – the total amount on the invoice,
  • file_path – path to the file inside the files directory.
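
To get a feel for these records, here is a minimal sketch that builds a throwaway in-memory SQLite table with the same fields and reads it back. The table name received_invoices, the column types, and the sample values are assumptions made for illustration; the example project’s actual database may be organized differently.

```python
import sqlite3

# Hypothetical schema: the column names come from the field list above,
# while the table name and column types are assumptions for this sketch.
SCHEMA = """
CREATE TABLE received_invoices (
    supplier INTEGER,
    invoice_number TEXT,
    issue_date TEXT,
    total_amount REAL,
    file_path TEXT
)
"""


def read_invoice_rows(connection: sqlite3.Connection) -> list:
    """Return every invoice record as a plain dictionary keyed by field name."""
    connection.row_factory = sqlite3.Row  # allows access to columns by name
    cursor = connection.execute(
        "SELECT supplier, invoice_number, issue_date, total_amount, file_path "
        "FROM received_invoices"
    )
    return [dict(row) for row in cursor]


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory database holding one made-up invoice record."""
    connection = sqlite3.connect(":memory:")
    connection.execute(SCHEMA)
    connection.execute(
        "INSERT INTO received_invoices VALUES (?, ?, ?, ?, ?)",
        (1, "INV-001", "2020-01-15", 120.5, "invoice_1.pdf"),  # invented sample values
    )
    return connection


if __name__ == "__main__":
    for record in read_invoice_rows(build_demo_database()):
        print(record)
```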

To use typless AI’s pre-learning ability, we need to take the following steps:

  • read rows from the database,
  • for each row:
    • upload the file together with the data from that row.

The Python code that does this with the example database looks like this:

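The steps above can be sketched roughly as follows. This is not the example project’s script: the payload keys (document_type_name, file_name, file, learning_fields) and the send callable are hypothetical placeholders, so check the typless API documentation for the real request format before adapting it.

```python
import base64
from pathlib import Path
from typing import Callable, Iterable, Mapping

DOCUMENT_TYPE = "pre-learning-example"  # document type created in the prerequisites
FIELD_NAMES = ("supplier", "invoice_number", "issue_date", "total_amount")


def build_training_payload(row: Mapping, files_dir: str) -> dict:
    """Pair one invoice file with the known field values from its database row.

    The payload layout is an assumption for this sketch, not the documented API.
    """
    file_path = Path(files_dir) / row["file_path"]
    encoded_file = base64.b64encode(file_path.read_bytes()).decode("ascii")
    return {
        "document_type_name": DOCUMENT_TYPE,
        "file_name": file_path.name,
        "file": encoded_file,
        "learning_fields": [
            {"name": name, "value": str(row[name])} for name in FIELD_NAMES
        ],
    }


def pre_learn(rows: Iterable[Mapping], files_dir: str, send: Callable[[dict], object]) -> int:
    """Build a payload for every row, hand it to `send`, and return the number sent.

    `send` is a hypothetical callable that would perform the actual upload.
    """
    sent_count = 0
    for row in rows:
        send(build_training_payload(row, files_dir))
        sent_count += 1
    return sent_count
```

Passing send as a parameter keeps the loop independent of any particular HTTP client, so you can collect the payloads in a list and inspect them before wiring in a real request.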

Run it:


Congratulations, you’ve just used typless AI’s pre-learning ability.

You can use extract_data.py to try extracting data from one of the example invoices:


That’s it! Happy OCRing!