openfoodfacts-ai
Extract data from receipts/bills and enrich OFF with price information
I read somewhere that table recognition is on the roadmap... When this is ready, it could be used to scan "bills" or invoices and extract product prices by brand/store/date.
With shared price information, comparators and other apps would become possible...
- See price over time of a product
- Compare store margins
- Detect price fluctuations
- Compare average prices across categories
- Compare average category prices across countries
- An app to work out the best pair of stores for the grocery list you have to buy for your next dinner, built from the recurring products in your last bill scans ... and so on
Privacy would need to be discussed, though! Perhaps a mix of anonymous price data and a way to keep the pictures/OCR/data local on the user's device (i.e. let app owners run OCR locally).
My 2 cents
I developed a draft bill data extractor: starting from a PDF, it detects header and footer info (store, address, date, SIRET, etc.), then parses the table rows to CSV.
The PDF files are currently generated by an external app, TextFairy, which uses Tesseract to extract text and positioning.
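To make the extraction steps above concrete, here is a minimal sketch, not the actual draft. It assumes the OCR output is already available as plain text, that the first non-empty line is the store name, that dates look like DD/MM/YYYY, and that each product row ends with a price. The names `parse_receipt`, `rows_to_csv` and `SAMPLE` are hypothetical, and real receipts will need far more robust rules.

```python
import csv
import io
import re

# Assumed formats (illustrative only): DD/MM/YYYY dates, 14-digit SIRET,
# and product rows that end with a price such as "1,85" or "2.10 €".
DATE_RE = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")
SIRET_RE = re.compile(r"\b(\d{3}) ?(\d{3}) ?(\d{3}) ?(\d{5})\b")
ROW_RE = re.compile(r"^(?P<name>.+?)\s+(?P<price>\d+[.,]\d{2})\s*(?:€|EUR)?$")

SAMPLE = """SUPERMARCHE EXEMPLE
12 RUE DE LA PAIX 75002 PARIS
SIRET 123 456 789 00012
15/03/2024
YAOURT NAT X4 1,85
PAIN COMPLET 2,10 €
TOTAL 3,95
"""


def parse_receipt(text):
    """Split OCR text into header fields and (product, price) rows."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Assumption: the store name is the first non-empty line.
    header = {"store": lines[0] if lines else None, "date": None, "siret": None}
    rows = []
    for ln in lines:
        if header["date"] is None and (m := DATE_RE.search(ln)):
            day, month, year = m.groups()
            header["date"] = f"{year}-{month}-{day}"
            continue
        if header["siret"] is None and (m := SIRET_RE.search(ln)):
            header["siret"] = "".join(m.groups())
            continue
        m = ROW_RE.match(ln)
        # Skip the total line so it is not counted as a product.
        if m and not ln.upper().startswith("TOTAL"):
            rows.append((m["name"], float(m["price"].replace(",", "."))))
    return header, rows


def rows_to_csv(header, rows):
    """Serialize the parsed rows to CSV, one line per product."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["store", "date", "product", "price"])
    for name, price in rows:
        writer.writerow([header["store"], header["date"], name, f"{price:.2f}"])
    return buf.getvalue()


header, rows = parse_receipt(SAMPLE)
print(rows_to_csv(header, rows))
```

Working from text plus positioning (as Tesseract provides) would allow column detection instead of relying on a trailing-price regex; the sketch ignores positions for brevity.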
I found OpenFoodFacts while searching for a way to get product info from a "partial general name" + store.
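As an illustration of the "partial general name" problem, here is a rough sketch of one possible matching heuristic: receipt labels are often abbreviated, so each label token is treated as a prefix of a word in the full product name. It assumes the candidate product names have already been narrowed down (for example by store) through some other means; `match_score` and `best_matches` are hypothetical helpers, not part of any OpenFoodFacts API.

```python
def match_score(label, product_name):
    """Fraction of receipt-label tokens that are a prefix of some word in the name."""
    label_tokens = label.lower().split()
    name_words = product_name.lower().split()
    if not label_tokens:
        return 0.0
    hits = sum(any(w.startswith(t) for w in name_words) for t in label_tokens)
    return hits / len(label_tokens)


def best_matches(label, candidates, threshold=0.5):
    """Return candidates scoring at least `threshold`, best first."""
    scored = [(match_score(label, c), c) for c in candidates]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [c for score, c in scored if score >= threshold]


# Made-up candidate names, for illustration only.
candidates = ["Yaourt nature 4x125g", "Lait demi-écrémé", "Yaourt aux fruits"]
print(best_matches("YAOURT NAT", candidates))
```

Prefix matching is only one option; edit-distance or embedding-based matching may cope better with OCR errors.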
@devingfx We had made this prototype during a hackathon: https://github.com/openreceipts/openreceipts-server
@devingfx I'm not sure whether you're still interested in the subject, but we've launched Open Prices (https://prices.openfoodfacts.org), a crowdsourced database of food product prices around the world. Having ML to automatically extract data from receipts/price tags would help tremendously.