How to Extract Items of Invoice Using Form Recognizer API
In this article, we will look at how to use the Azure Form Recognizer API to extract items from an invoice.
Download sample code from Form Recognizer Studio
After recognizing a sample invoice in Azure Form Recognizer Studio (see the article below), we can download the sample code.
How to Extract Items From Invoices Using Azure Form Recognizer
https://thats-it-code.com/azure/how-to-use-form-recognizer-to-extract-items-from-receipt/
Download sample code
After downloading the sample code (sample_analyze_invoices.py), move it to a new folder that will hold the project for extracting invoice items with the Form Recognizer API.
Move the sample code to the project folder
Open the project (the above folder) with Visual Studio Code
Click [Open Folder] in VS Code and select the above folder.
You can read the article below to learn how to create a local development environment.
Lets Create a Programming Environment
https://thats-it-code.com/programming/lets-create-a-programming-environment/
Set up the endpoint and key for calling the API
Open the sample code and you will see that the endpoint and key variables are not set.
Let’s go to the Azure portal to get the two values.
First, enter “cognitive services” in the search bar at the top of the Azure portal.
Then click “Cognitive services multi-service account” in the result list.
Click the service name.
Click [Keys and Endpoint] in the left-side menu.
Click the Copy button next to [KEY 1] and [Endpoint] respectively, and paste the values into the key and endpoint variables of the sample code.
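Pasting the key directly into the source file works for a quick test, but it is easy to commit the key by accident. As a minimal sketch of one alternative, the two values could be read from environment variables instead. The variable names FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY and the load_credentials helper are assumptions made for this example; they are not part of the downloaded sample code.

```python
import os

# Hypothetical variable names chosen for this sketch; any names would work.
ENDPOINT_VAR = "FORM_RECOGNIZER_ENDPOINT"
KEY_VAR = "FORM_RECOGNIZER_KEY"


def load_credentials(env=None):
    """Return (endpoint, key) read from environment variables."""
    env = os.environ if env is None else env
    # Report every missing variable at once instead of failing on the first.
    missing = [name for name in (ENDPOINT_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[ENDPOINT_VAR], env[KEY_VAR]
```

In the sample code, the two assignments could then become `endpoint, key = load_credentials()`, provided both variables are set in the terminal before the script runs.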
Install the necessary libraries
To avoid polluting the global Python environment, we can use a virtual environment.
I use the pipenv library to create and manage Python virtual environments.
First, let’s install the pipenv library using the pip command.
pip install pipenv
In VS Code, press Ctrl+@ (Ctrl+` on a US keyboard layout) to open the terminal panel at the bottom of the editor.
Then select the Git Bash shell.
Execute the following command to create a new virtual environment based on Python 3.
pipenv --python 3
And use the command below to enter the virtual environment.
pipenv shell
Now let’s see which libraries are used in the sample code.
As you can see, the Azure Form Recognizer library is imported.
Let’s install the Azure Form Recognizer library.
pip install azure-ai-formrecognizer==3.2.0b2
However, after the installation finishes, the import warnings are still shown.
We have to change the Python interpreter in VS Code to the one in the virtual environment we just created.
After that, the import warnings are gone.
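If it is unclear whether the terminal is using the virtual environment, a quick check from Python can help. The sketch below prints the interpreter path and reports whether the Form Recognizer package can be found; the is_importable helper is a name invented for this example.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate the module."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "azure") is itself missing.
        return False


# The path should point inside the pipenv virtual environment.
print("Interpreter:", sys.executable)
print("azure.ai.formrecognizer importable:", is_importable("azure.ai.formrecognizer"))
```

If the second line prints False, the package was probably installed into a different environment than the one the interpreter belongs to.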
Execute sample code
Next, let’s execute the sample code.
Execute the command below in the terminal panel at the bottom of VS Code.
python sample_analyze_invoices.py
The result below is shown in the terminal.
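The sample prints each field one by one. If you would rather collect the line items into plain Python data, something like the following sketch may be useful. It assumes the result shape of the azure-ai-formrecognizer 3.2 SDK, where each document has a fields dictionary, the "Items" field holds a list, and each list entry holds a dictionary of sub-fields such as "Description" or "Quantity". The extract_items helper is an illustrative name, not part of the sample code.

```python
def extract_items(result):
    """Collect invoice line items from an analysis result into plain dictionaries."""
    rows = []
    for document in result.documents:
        items = document.fields.get("Items")
        if items is None:
            # This document has no line items.
            continue
        for item in items.value or []:
            # Each item is a field whose value maps sub-field names to fields.
            row = {name: field.value for name, field in item.value.items()}
            rows.append(row)
    return rows
```

After `invoices = poller.result()` in the sample code, calling `extract_items(invoices)` would return one dictionary per line item. Some values, such as amounts, may be SDK objects rather than plain numbers.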
Extract items from local sample invoice
We can also extract invoice items from local files by modifying some lines.
Firstly, let’s prepare the local sample invoice file.
Let’s place the image below into the data folder of the project.
This image is from the Internet.
Let’s comment out formUrl and replace it with code that opens the local sample invoice file and reads its contents.
# formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/invoice_sample.jpg"
with open("data/sample-invoice.png", "rb") as f:
    formData = f.read()
Then replace the begin_analyze_document_from_url method with the begin_analyze_document method, which takes the file contents instead of a URL.
# poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-invoice", formUrl)
poller = document_analysis_client.begin_analyze_document("prebuilt-invoice", formData)
invoices = poller.result()
Let’s execute our code again. The invoice items are extracted successfully from the local file as well.
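The two changes above can also be wrapped in small helper functions, which makes it easier to analyze several local files. This is only a sketch under a few assumptions: build_client and analyze_local_invoice are names invented here, and the model name and document are passed positionally because the keyword name of the first parameter differs between SDK versions.

```python
def build_client(endpoint, key):
    """Create a DocumentAnalysisClient (imports are local so this file loads without the SDK)."""
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    return DocumentAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))


def analyze_local_invoice(client, path):
    """Send a local file to the prebuilt invoice model and wait for the result."""
    # Read the whole file as bytes, as in the modified sample code.
    with open(path, "rb") as f:
        form_data = f.read()
    poller = client.begin_analyze_document("prebuilt-invoice", form_data)
    return poller.result()
```

With the endpoint and key from the earlier step, usage could look like `invoices = analyze_local_invoice(build_client(endpoint, key), "data/sample-invoice.png")`.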
Conclusion
In this article, we used the Form Recognizer API to extract invoice items from invoice files. We modified the sample code downloaded from the Form Recognizer Studio page and successfully extracted items from a local invoice file.