AI pre-learning ability – use SQL database

Convincing clients to start using new software features can be challenging, and OCR solutions are no exception. Every customer wants an OCR solution that is precise out of the box. So how do you achieve this? This guide will show you:

How to use AI pre-learning ability.
How to use rows from your database to achieve precise data extraction from day one.

Why should you use AI pre-learning ability?

In the previous guide, we implemented the continuous learning ability of typless AI and saw how to integrate it into an existing workflow to support the customer’s work. However, one problem remained: extraction did not return any data at the beginning. So how do you improve the first user experience for your customers? This is why the AI needs a pre-learning ability, which makes it trainable before your customer has any interaction with it. As a result, the accuracy of the extracted data is high enough to satisfy your customers from the beginning.

What do I need to do?

If you have read the guide on how to choose an OCR solution for ERP, you know that classic AI pre-learning can be costly and time-consuming. In other words, you want to invest as few resources as possible in training the AI. It would therefore be a lifesaver if you could use the data that is already in your database. This is something typless can do for you, so let’s take a look.

How to use AI’s pre-learning ability?

Prerequisites

Register at typless
Log in
Click the Settings tab in the side navigation bar to get an API key
The pre-learning ability is not enabled on the free plan – contact us to enable it for testing
Create a new document type named “pre-learning-example”:
- Click on the Documents tab in the side navigation bar
- Click on the +New button
- For the name set “pre-learning-example”
- For the OCR language select “English”
- Click on the Next button
- Leave fields as they are and click on the Confirm button
Get the code of the example project from GitHub:

git clone https://github.com/typless/pre_learning.git

Set the API key:

export API_KEY=YOUR_API_KEY_FROM_SETTINGS_TAB

Install the requirements:

pip install -r requirements.txt
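Before running the example, it can help to confirm that the exported key is visible to Python. The following is only a minimal sketch: the helper name build_auth_headers is hypothetical and not part of the example project, while the "Token" header format is the one the example script in this guide sends.

```python
import os


def build_auth_headers(environ=os.environ):
    """Build the Authorization header from API_KEY (hypothetical helper)."""
    api_key = environ.get("API_KEY")
    if not api_key:
        # Stop early instead of sending requests that cannot be authorized.
        raise RuntimeError("API_KEY is not set; export it before running the examples.")
    return {"Authorization": f"Token {api_key}"}


# Demonstration with a placeholder key instead of the real environment.
print(build_auth_headers({"API_KEY": "example-key"}))
```

Passing the environment as an argument keeps the check easy to try without touching your real key.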

Using AI’s pre-learning ability

In the examples directory, you will find example invoices and the database. The example database contains tables for received invoices and suppliers. Each received invoice record contains the following fields:

supplier – ID of the supplier,
invoice_number – invoice number,
issue_date – issue date of invoice,
total_amount – the total amount on an invoice,
file_path – path to the file inside the files directory.
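To make the record layout concrete, here is a small in-memory sketch of such a table. Only the five field names come from this guide. The column types, the extra id column, the layout of the suppliers table, and the sample values are assumptions for illustration, so the real examples.db may differ.

```python
import sqlite3

# Assumed schema: only the five field names of received_invoices come from the guide.
SCHEMA = """
CREATE TABLE suppliers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE received_invoices (
    id INTEGER PRIMARY KEY,
    supplier INTEGER NOT NULL REFERENCES suppliers(id),
    invoice_number TEXT NOT NULL,
    issue_date TEXT NOT NULL,     -- assumed to be stored as YYYY-MM-DD
    total_amount REAL NOT NULL,
    file_path TEXT NOT NULL       -- relative to the files directory
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up supplier and invoice."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO suppliers (id, name) VALUES (?, ?)", (1, "Demo Supplier"))
    conn.execute(
        "INSERT INTO received_invoices "
        "(supplier, invoice_number, issue_date, total_amount, file_path) "
        "VALUES (?, ?, ?, ?, ?)",
        (1, "INV-001", "2020-01-31", 120.5, "invoice_1.pdf"),
    )
    conn.commit()
    return conn


demo_conn = build_demo_db()
demo_row = demo_conn.execute("SELECT * FROM received_invoices").fetchone()
print(dict(demo_row))
```

If your own database uses different column names, only the mapping from columns to field names has to change.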

To use typless AI’s pre-learning ability, we need to do the following steps:

read rows from the database,
for each row:
- upload the file together with the data from that row.

The Python code that does this with the example database looks like this:

import json
import os
import sqlite3

import requests

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

# fail early if the API key is missing, before any file is read or uploaded
API_KEY = os.getenv('API_KEY')
if API_KEY is None:
    raise Exception('YOU MUST SET API KEY!')

# open connection; sqlite3.Row allows access to columns by name
conn = sqlite3.connect(os.path.join(BASE_DIR, 'examples', 'examples.db'))
conn.row_factory = sqlite3.Row

cur = conn.cursor()
cur.execute("SELECT * FROM received_invoices")

# get all received invoices
rows = cur.fetchall()
conn.close()

for row in rows:
    # read the invoice file that belongs to this row
    with open(os.path.join(BASE_DIR, 'examples', 'files', row['file_path']), 'rb') as invoice_file:
        files = {
            "file": (os.path.basename(row['file_path']), invoice_file.read()),
        }
    # the values already stored in the database become the training data
    request_data = {
        "document_type_name": 'pre-learning-example',
        "customer": 'me',
        "learning_fields": json.dumps(
            [
                {'name': 'supplier', 'value': row['supplier']},
                {'name': 'invoice_number', 'value': row['invoice_number']},
                {'name': 'issue_date', 'value': row['issue_date']},  # convert to YYYY-MM-DD string if your database has datetime type
                {'name': 'total_amount', 'value': '%.2f' % row['total_amount']},
            ]
        )
    }

    # upload the file together with its known values
    response = requests.post(
        'https://developers.typless.com/api/document-types/learn/',
        files=files,
        data=request_data,
        headers={'Authorization': f'Token {API_KEY}'}
    )
    print(response.text)

Run it:

python pre_learning.py

Congratulations, you’ve just used typless AI’s pre-learning ability.
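The example script notes that issue_date must be converted to a YYYY-MM-DD string if your database has a datetime type. The sketch below shows one way this could be done for your own data. The helper names format_issue_date, format_amount, and to_learning_fields are hypothetical, and the handling of string dates assumes ISO-like values such as "2020-01-31 00:00:00".

```python
import json
from datetime import date, datetime
from decimal import Decimal


def format_issue_date(value):
    """Return a YYYY-MM-DD string for datetime, date, or ISO-like string input."""
    if isinstance(value, datetime):  # check datetime first, it is a subclass of date
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    return str(value)[:10]  # assumption: the string already starts with YYYY-MM-DD


def format_amount(value):
    """Return the amount as a string with two decimal places."""
    return format(Decimal(str(value)), ".2f")


def to_learning_fields(row):
    """Build the learning_fields JSON string from a row-like mapping."""
    return json.dumps(
        [
            {"name": "supplier", "value": row["supplier"]},
            {"name": "invoice_number", "value": row["invoice_number"]},
            {"name": "issue_date", "value": format_issue_date(row["issue_date"])},
            {"name": "total_amount", "value": format_amount(row["total_amount"])},
        ]
    )


# Made-up row, as it might come from a database with a native datetime column.
example_row = {
    "supplier": 1,
    "invoice_number": "INV-001",
    "issue_date": datetime(2020, 1, 31, 14, 30),
    "total_amount": 120.5,
}
print(to_learning_fields(example_row))
```

The result can be used as the learning_fields value in the request data, in place of the inline list in the script.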

You can use extract_data.py to try extracting data from one of the example invoices:

python extract_data.py

That’s it! Happy OCRing!
