Form Recognizer Solution Accelerator

Accelerate your Form Recognizer solution to production with this Solution Accelerator. It uses an Azure Function and a set of Logic Apps to split multi-page PDF files into single-page PDF files, then sends each single-page file to the REST API endpoint of a trained custom document model in Form Recognizer.

Architecture

This solution implements two capabilities that are commonly required when working with a trained custom document model:

Splitting multi-page PDF documents into individual, single-page PDF documents
Analyzing the results of documents sent to the Form Recognizer REST API endpoint of a trained custom document model

Please refer to this blog post for detailed, step-by-step instructions on implementing this solution. We are also working on adding the same step-by-step instructions to this repository.

Step 1: Deploy core resources to Azure

The button below deploys six Azure resources:

Storage account
Function app
App Service plan
Form Recognizer
Logic app (x2)

Step 2: Create containers & upload data

Create the new storage containers, then download the sample data from this repository and upload it to them.

Step 3: Train custom document model

Open the Form Recognizer Studio and train a custom document model.

Step 4: Deploy open-source Python code to split PDFs

Deploy open-source Python code to your Function App to split multi-page PDF files.
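
The splitting step can be sketched as follows. This is a minimal illustration, not the accelerator's actual function code: it assumes the open-source pypdf library (the source does not name the library used), and the helper names `single_page_name` and `split_pdf`, along with the `_page_N` naming scheme, are hypothetical.

```python
import io


def single_page_name(source_name: str, page_number: int) -> str:
    """Build an output name such as 'invoice_page_2.pdf' (hypothetical scheme)."""
    stem = source_name[:-4] if source_name.lower().endswith(".pdf") else source_name
    return f"{stem}_page_{page_number}.pdf"


def split_pdf(pdf_bytes: bytes, source_name: str) -> dict[str, bytes]:
    """Return {output_name: single_page_pdf_bytes} for every page of the input."""
    # Assumption: pypdf is installed (pip install pypdf). It is imported here
    # so the naming helper above stays usable without the dependency.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(io.BytesIO(pdf_bytes))
    pages = {}
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()          # one writer per page gives one file per page
        writer.add_page(page)
        buffer = io.BytesIO()
        writer.write(buffer)
        pages[single_page_name(source_name, number)] = buffer.getvalue()
    return pages
```

Inside a Function App, a handler along these lines would read the multi-page PDF from the request or from blob storage, call `split_pdf`, and return or store each single-page file.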

Step 5: Configure Logic App to split multi-page PDF documents to single-page PDF documents

Configure the first Logic App to call your Azure Function App and save an individual PDF file for each page of a multi-page PDF input.

Step 6: Configure Logic App to send single-page PDF document data to REST API endpoint of trained custom document model

Configure the second Logic App to send each single-page PDF file to the REST API endpoint of your trained custom document model in Form Recognizer and save the resulting JSON.
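
For reference, the call that the Logic App makes can be reproduced outside Logic Apps with a short script. The sketch below uses only the Python standard library and assumes the v3.0 REST API (`api-version=2022-08-31`). The function names, polling interval, and placeholder values are illustrative and not part of the accelerator.

```python
import json
import time
import urllib.parse
import urllib.request


def build_analyze_request(endpoint, model_id, api_key, pdf_bytes,
                          api_version="2022-08-31"):
    """Build the POST request that submits one PDF to a custom model."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{urllib.parse.quote(model_id, safe='')}:analyze"
        f"?api-version={api_version}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/pdf",
    }
    return urllib.request.Request(url, data=pdf_bytes, headers=headers, method="POST")


def analyze_pdf(endpoint, model_id, api_key, pdf_bytes, poll_seconds=2.0):
    """Submit a PDF, then poll the operation URL until analysis finishes."""
    request = build_analyze_request(endpoint, model_id, api_key, pdf_bytes)
    with urllib.request.urlopen(request) as response:
        # The service answers 202 and returns the URL to poll in this header.
        operation_url = response.headers["Operation-Location"]

    poll = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": api_key}
    )
    while True:
        with urllib.request.urlopen(poll) as response:
            result = json.load(response)
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)  # status is still "notStarted" or "running"
```

The analysis is asynchronous, which is why the submit call and the result retrieval are separate requests; a Logic App needs the same two-stage pattern.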

Step 7: Verify the results

Upload a multi-page PDF file and verify that the first Logic App produces single-page PDF files. Then, verify that the second Logic App sends each file to the custom model endpoint in Form Recognizer and saves the resulting JSON.
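
Checking the saved JSON by hand gets tedious with many pages, so a small script can help. The sketch below assumes the saved file has the v3.0 analyze result shape (`status`, then `analyzeResult.documents[].fields`, each field carrying `content` and `confidence`). The function names and the 0.8 threshold are illustrative choices, not values from the accelerator.

```python
def extract_fields(result: dict) -> dict:
    """Map each extracted field name to a (content, confidence) pair."""
    if result.get("status") != "succeeded":
        raise ValueError(f"analysis did not succeed: status={result.get('status')!r}")
    fields = {}
    for document in result.get("analyzeResult", {}).get("documents", []):
        for name, field in document.get("fields", {}).items():
            fields[name] = (field.get("content"), field.get("confidence"))
    return fields


def low_confidence(fields: dict, threshold: float = 0.8) -> list:
    """Return the names of fields whose confidence is missing or below threshold."""
    return sorted(
        name
        for name, (_, confidence) in fields.items()
        if confidence is None or confidence < threshold
    )
```

To use it, load each saved result with `json.load` and pass the dictionary to `extract_fields`; any names returned by `low_confidence` are worth reviewing manually.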
