openfoodfacts-ai

Extract data from receipts/bills and enrich OFF with prices notion

Open devingfx opened this issue 4 years ago • 3 comments

I read somewhere that table recognition is on the roadmap... When this is ready, it could be used to scan "bills" or invoices and extract product prices by brand/store/date.

With shared price information, comparators and other apps would be possible...

  • See the price of a product over time
  • Compare store margins
  • Detect price fluctuations
  • Compare average prices across categories
  • Compare average category prices across countries
  • An app to calculate the best pair of stores for the grocery list you have to buy for your next dinner, built from the recurring products in your last bill scans ... and so on
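
As a rough illustration of the first few ideas above, here is a minimal sketch in Python. The `PriceRecord` shape and the function names are hypothetical, made up for this example; they are not an Open Food Facts schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class PriceRecord:
    # Hypothetical record shape, assumed for illustration only.
    product: str
    store: str
    day: date
    price: float


def price_history(records, product):
    """Return (day, price) pairs for one product, oldest first."""
    return sorted((r.day, r.price) for r in records if r.product == product)


def average_price_by_store(records, product):
    """Average the recorded prices of one product per store."""
    by_store = defaultdict(list)
    for r in records:
        if r.product == product:
            by_store[r.store].append(r.price)
    return {store: mean(prices) for store, prices in by_store.items()}


def largest_change(history):
    """Return the biggest relative change between consecutive prices, or None."""
    changes = [(b - a) / a for (_, a), (_, b) in zip(history, history[1:]) if a]
    return max(changes, key=abs) if changes else None
```

Anonymous records like these would be enough for the history and comparison use cases; margins would need extra data (purchase costs) that receipts do not contain.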

Privacy may need to be discussed, though! Perhaps a mix of anonymous price data and a way to keep the pictures/OCR/data local on the user's device (i.e. let app owners run OCR locally).

My 2 cents

devingfx avatar Jan 20 '21 16:01 devingfx

I developed a draft of a bill data extractor: from PDF > detect header/footer info (like store, address, date, SIRET, etc.) > parse table rows to CSV.

The PDF files are currently generated by an external app, TextFairy, which uses Tesseract to extract text and positioning.

I found OpenFoodFacts while searching for a way to get product info from a "partial general name" + store.
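
The pipeline above could be sketched roughly as follows. This is not the draft extractor itself: it assumes the OCR text is already available as plain lines, and the regular expressions, the `parse_receipt` and `rows_to_csv` names, and the French-style receipt layout (dd/mm/yyyy dates, 14-digit SIRET, price at the end of each item line) are all assumptions made for illustration.

```python
import csv
import io
import re

# Assumed patterns for a French-style receipt; real receipts vary a lot.
DATE_RE = re.compile(r"\b(\d{2}/\d{2}/\d{4})\b")
SIRET_RE = re.compile(r"\b(\d{14})\b")
ROW_RE = re.compile(r"^(?P<name>.+?)\s+(?P<price>\d+[.,]\d{2})$")


def parse_receipt(lines):
    """Split OCR text lines into header fields and (name, price) item rows."""
    header = {"store": None, "date": None, "siret": None}
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        row = ROW_RE.match(line)
        if row:
            # Skip total lines so they are not counted as products.
            if not row["name"].upper().startswith("TOTAL"):
                rows.append((row["name"], float(row["price"].replace(",", "."))))
            continue
        if header["date"] is None and (m := DATE_RE.search(line)):
            header["date"] = m.group(1)
        elif header["siret"] is None and (m := SIRET_RE.search(line)):
            header["siret"] = m.group(1)
        elif header["store"] is None:
            # Assumption: the first unrecognized line is the store name.
            header["store"] = line
    return header, rows


def rows_to_csv(header, rows):
    """Serialize item rows to CSV, repeating store and date on each row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["store", "date", "product", "price"])
    for name, price in rows:
        writer.writerow([header["store"], header["date"], name, f"{price:.2f}"])
    return out.getvalue()
```

A real extractor would also use the word positions that Tesseract provides to find table columns, rather than relying on line-end prices alone.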

devingfx avatar Jan 20 '21 16:01 devingfx

@devingfx We made this prototype during a hackathon: https://github.com/openreceipts/openreceipts-server

teolemon avatar Sep 09 '21 11:09 teolemon

@devingfx I'm not sure whether you're still interested in the subject, but we've launched Open Prices (https://prices.openfoodfacts.org), a crowdsourced database of food product prices around the world. Having ML automatically extract data from receipts/price tags would help tremendously.

raphael0202 avatar Mar 08 '24 09:03 raphael0202