ibm-cloud-functions-serverless-ocr-openchecks
Serverless bank check deposit processing with object storage and optical character recognition using Apache OpenWhisk powered by IBM Cloud Functions. See the Tech Talk replay for a demo.
Serverless check deposit processing with Apache OpenWhisk on IBM Cloud Functions
This project demonstrates serverless technology - powered by Apache OpenWhisk with IBM Cloud Functions - in the context of a retail banking scenario where deposited checks are processed digitally (such as through a mobile banking app) using optical character recognition (OCR).
This use case suits a serverless architecture because payday deposit processing is compute-intensive and highly elastic: the workload spikes during one particular timeframe every two weeks.
You can learn more about this scenario in the IBM Code blog post and associated Code Pattern. You can also see a deeper dive and extended demo with the Tech Talk replay.
The project also highlights a few OpenWhisk development concepts, including rules, triggers, sequences, and invoking actions directly from other actions. Furthermore, it demonstrates polyglot development by packaging a third-party C library for OCR as an action.
You should have a basic understanding of the OpenWhisk programming model. If not, try the action, trigger, and rule demo first. You'll also need a Bluemix account and the latest OpenWhisk command line tool (wsk) installed and on your PATH.
It is currently built on the public Cloud Functions service and relies on Cloudant and SoftLayer Object Storage. On-premises, it could use Apache CouchDB and OpenStack Swift. Other storage services could include FileNet or Cleversafe. Tesseract provides the OCR library.

Components
- Apache OpenWhisk via IBM Cloud Functions
- Object Storage on Bluemix
- Apache CouchDB (IBM Cloudant on Bluemix)
- SendGrid on Bluemix
- Docker Hub
Retail bank deposit processing with OpenWhisk
This project uses a set of actions and triggers linked by rules to process images that are added to an object storage service. When new checks are detected, a workflow downloads, resizes, archives, and reads the checks, then invokes an external system to handle the transaction.
Most of the actions are written in JavaScript using the default Node.js version 6 environment on Cloud Functions. One of the actions is written as a shell script and packaged in a Docker container. This shows both the polyglot nature of OpenWhisk and its ability to package an arbitrary program, which is needed in this case to use an OCR library.
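As a rough sketch of how a JavaScript action might fan out work, the function below starts one asynchronous invocation per file, in the spirit of the find-new-checks step in the workflow. The injected invokeAction function, the fileName parameter, and the shape of the returned object are assumptions made for illustration; a real action would call an OpenWhisk client instead of the stub shown here.

```javascript
// Sketch only: `invokeAction` is a hypothetical stand-in for an OpenWhisk
// client call. It is injected so the fan-out logic can run anywhere.
function findNewChecks(fileNames, invokeAction) {
  // Start one invocation per file without waiting for the previous one,
  // so the downstream action can process the images in parallel.
  const invocations = fileNames.map(function (fileName) {
    return invokeAction('save-check-images', { fileName: fileName });
  });

  // Resolve once every invocation has been accepted.
  return Promise.all(invocations).then(function (activations) {
    return { found: fileNames.length, activations: activations };
  });
}

// Stub invoker for local experimentation (assumed response shape).
function stubInvoke(actionName, params) {
  return Promise.resolve({ action: actionName, fileName: params.fileName });
}

findNewChecks(['check-1.jpg', 'check-2.jpg'], stubInvoke).then(function (result) {
  console.log(JSON.stringify(result));
});
```

Injecting the invoker keeps the sketch independent of any particular client library and makes the fan-out easy to exercise locally.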
Primary workflow
- A mobile app user or teller at a bank branch scans a check and puts the image into an object storage service (the incoming container) named with the customer email, deposit to account, amount of the check, and timestamp encoded in the file name, for example, [email protected]^12345679^19.99^1475597757.jpg
- A poll-for-incoming-checks trigger invokes the find-new-checks action every 20 seconds to poll the object storage service for new check images. (An alternative implementation should use an OpenStack Swift webhook to push this event instead of polling).
- This find-new-checks action queries the object storage service. For each file found, it invokes the save-check-images action asynchronously, which processes them in parallel.
- The save-check-images action downloads the check image and puts two resized copies (50% and 25% scaled) into an archive CouchDB database and the original in an audit database. When all inserts have completed successfully, the files are deleted from the object storage service.
- A check-ready-to-scan change trigger on the audit CouchDB database invokes a parse-check-data action to process the full-size image.
- This parse-check-data action retrieves the image, then calls the parse-check-with-ocr Docker action to read the payer account information and routing number. It stores the results into a parsed CouchDB database. If it can't read this information, the check is flagged as needing additional human review and stored in a rejected database.
- A check-ready-for-deposit trigger is then fired by that change to the parsed database and invokes another action, record-check-deposit.
- This record-check-deposit action retrieves the account details from the parsed record, logs the transaction in the processed database and sends an email with SendGrid (simulating connectivity to an external system).
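The file naming convention in the first step (email, deposit to account, amount, and timestamp separated by ^) could be decoded along these lines. This is a minimal sketch, not the project's own code: the function name parseCheckFileName, the returned field names, and the example address user@example.com are hypothetical.

```javascript
// Sketch: decode "<email>^<toAccount>^<amount>^<timestamp>.<ext>".
// Function and field names are illustrative assumptions.
function parseCheckFileName(fileName) {
  // Strip a trailing alphabetic extension such as ".jpg"; the dot inside
  // the amount (19.99) is not at the end, so it is left untouched.
  const base = fileName.replace(/\.[A-Za-z]+$/, '');
  const parts = base.split('^');

  if (parts.length !== 4 || parts.some(function (p) { return p === ''; })) {
    throw new Error('Unexpected check file name: ' + fileName);
  }

  const amount = Number(parts[2]);
  const timestamp = Number(parts[3]);
  if (!isFinite(amount) || !Number.isInteger(timestamp)) {
    throw new Error('Amount or timestamp is not numeric: ' + fileName);
  }

  return {
    email: parts[0],
    toAccount: parts[1], // kept as a string to preserve leading zeros
    amount: amount,
    timestamp: timestamp // seconds since the Unix epoch, as in the example
  };
}

const parsed = parseCheckFileName('user@example.com^12345679^19.99^1475597757.jpg');
console.log(parsed.email, parsed.toAccount, parsed.amount, parsed.timestamp);
```

Rejecting malformed names early gives the workflow a clear point at which to route a file for human review instead of failing later in the OCR step.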
Sample check images
There are three checks in the images directory that the OCR action can read reliably and two that will fail in order to show the alternate workflows (and to provide an exercise for improving the image recognition capabilities).
Notice the MICR data at the bottom of the check representing the routing number and payer (deposit from) account number.
The amount is not currently parsable, nor is the deposit to account information, as these are handwritten on most checks. These values need to be passed as metadata (and later the algorithm may be improved to handle this case with a more powerful service).
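One inexpensive sanity check on OCR output, offered here as a suggestion and not as something this project is known to do, is to validate the routing number read from the MICR line against the standard ABA check-digit rule (weights 3, 7, 1 repeated across the nine digits, with the weighted sum divisible by 10). The function name isValidRoutingNumber is hypothetical.

```javascript
// Sketch: ABA routing number check-digit validation for OCR results.
// Not part of the original project; the function name is an assumption.
function isValidRoutingNumber(routing) {
  // A routing number is exactly nine digits.
  if (!/^\d{9}$/.test(routing)) {
    return false;
  }
  const d = routing.split('').map(Number);

  // Weighted sum with weights 3, 7, 1 repeated over the nine digits.
  const sum =
    3 * (d[0] + d[3] + d[6]) +
    7 * (d[1] + d[4] + d[7]) +
    (d[2] + d[5] + d[8]);

  // All zeros passes the arithmetic but is not a usable number, so reject it.
  return sum > 0 && sum % 10 === 0;
}

console.log(isValidRoutingNumber('021000021')); // true
console.log(isValidRoutingNumber('123456789')); // false
```

A failed check would be one more signal that a scan belongs in the rejected database for human review.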

Running the sample application
- Set up the Bluemix services (Cloudant, Object Storage, Docker Hub).
- Set up the OpenWhisk actions, triggers, and rules.
License
This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.