open-prices Price tags: levels of difficulty (ex: AI found an existing product or category)

Story

Currently, all the Price Tags (with status=None) are visible in the Price Validation Assistant (PVA, web frontend). But some Price Tags (PTs) are very hard to decipher, or simply not well detected, and new users find the PVA too complex.

In order to build a simplified PVA, we need to isolate simple PTs from difficult ones.

How

An idea is to sort them by level of difficulty:

level 1 : found a barcode corresponding to an existing product
level 2 : found an existing category
level 3 : found a valid barcode, but no product exists yet
level 4 : the rest 😅
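
The levels above can be sketched as a small function. This is only an illustration: `PredictionFlags` and `difficulty_level` are hypothetical names, the three booleans are assumed to be computed elsewhere, and the sketch assumes the first matching rule wins, in the order of the list.

```python
from dataclasses import dataclass


@dataclass
class PredictionFlags:
    # Hypothetical container: these flags are assumed to be computed elsewhere.
    barcode_valid: bool
    product_exists: bool
    category_found: bool


def difficulty_level(flags: PredictionFlags) -> int:
    """Return 1 (easiest) to 4 (hardest); the first matching rule wins."""
    if flags.product_exists:
        return 1  # barcode corresponds to an existing product
    if flags.category_found:
        return 2  # an existing category was found
    if flags.barcode_valid:
        return 3  # valid barcode, but no product exists yet
    return 4  # the rest
```

With this ordering, a price tag with a valid barcode, no product and a known category lands in level 2 rather than level 3.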

We could add a field difficulty_level that would be calculated similarly to #656

The tricky bit is that this field would be stored on the prediction. So we then need to be able to filter the API on the predictions...

Todo

[x] first list of rules
[x] new PriceTag.tags field to store the results: #806
[x] new PriceTagPrediction helper methods to tell if the predicted barcode/product/category_tag is valid or not: #838
[x] new signal to set the PriceTag tags after its PriceTagPrediction creation: #843
[ ] run a script on the PriceTag history

Jan 26 '25 18:01 raphodn

With the help of #806 (new PriceTag.tags field), we can label the PriceTags with the above rules

| Rule | Tag label | Implemented? |
| --- | --- | --- |
| prediction found a valid barcode | ~`prediction-found-valid-barcode`~ `prediction-barcode-valid` | ✅ |
| prediction found a barcode corresponding to an existing product | ~`prediction-found-existing-product`~ `prediction-product-exists` | ✅ |
| prediction found a valid barcode corresponding to an existing product | `prediction-barcode-valid-and-product-exists` | |
| prediction found a valid category_tag | ~`prediction-found-category`~ `prediction-category-tag-valid` | ✅ |
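
As a rough sketch of how these rules could map to tag labels: `compute_tags` is a hypothetical helper, not the actual backend code, and the three booleans are assumed to come from the PriceTagPrediction helper methods. The combined tag is included for illustration even though it is not marked as implemented above.

```python
def compute_tags(
    barcode_valid: bool, product_exists: bool, category_tag_valid: bool
) -> list[str]:
    """Build the list of tag labels for one prediction (illustrative only)."""
    tags = []
    if barcode_valid:
        tags.append("prediction-barcode-valid")
    if product_exists:
        tags.append("prediction-product-exists")
    if barcode_valid and product_exists:
        # Combined rule, not marked as implemented in the table
        tags.append("prediction-barcode-valid-and-product-exists")
    if category_tag_valid:
        tags.append("prediction-category-tag-valid")
    return tags
```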

@TTalex sounds good ? ideas of extra rules, or better naming ?

May 09 '25 09:05 raphodn

Looks good !

May 09 '25 16:05 TTalex

ok I merged 2 PRs. and updated my comments above! and tested on staging ✅

May 11 '25 16:05 raphodn

todo

[x] refactoring : move the logic to ml.py
[x] manage changes due to schema v2
- following #885
[ ] manage specific barcode rules (e.g. Carrefour) currently done in the frontend
- dedicated issue : https://github.com/openfoodfacts/open-prices/issues/1020
[ ] run a script on the history (and the changes post-v2)
[x] add a filter in the frontend UI

Aug 06 '25 15:08 raphodn

I'm a bit stuck on the shop-specific rules. I'd like to implement them in the backend, but I don't want to change the output of Gemini itself. So it would probably require adding a new field, such as PriceTagPrediction.data.barcode_cleaned, but maybe it would be better to add it in a separate field/dict?

The logic would be to implement a v1 without this, but Carrefour has such particular price tags it would suck to not have them managed ^^
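
A minimal sketch of the "new field" option, to make it concrete. Everything here is an assumption: the `barcode` key, the `add_cleaned_barcode` helper, and especially `strip_non_digits`, which is only a placeholder and not the actual Carrefour rule.

```python
import copy
from typing import Callable


def strip_non_digits(barcode: str) -> str:
    # Placeholder cleaning rule for the example; NOT the real shop-specific rule.
    return "".join(ch for ch in barcode if ch.isdigit())


def add_cleaned_barcode(data: dict, clean: Callable[[str], str]) -> dict:
    """Return a copy of the prediction data with an extra barcode_cleaned key.

    The original "barcode" value (the model output) is left untouched.
    """
    result = copy.deepcopy(data)
    barcode = result.get("barcode")
    if barcode is not None:
        result["barcode_cleaned"] = clean(barcode)
    return result
```

This keeps the raw prediction and the cleaned value side by side, so the cleaning rule can change later without losing what the model returned.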

Aug 27 '25 07:08 raphodn

I remember a previous conversation about this.

To me, it's acceptable to edit the Gemini prediction with additional rulings, such as the Carrefour one, and simply replace the prediction.

It would be the same if we edited the prompt, asking the model explicitly to handle barcode with slashes differently. What matters is that in the end the prediction is as good as possible, even if we had to fix a few things along the way.

No ?

Aug 27 '25 18:08 TTalex

I started some frontend integration here : https://github.com/openfoodfacts/open-prices-frontend/pull/1702

And I moved the "shop-specific" discussion to a dedicated issue : https://github.com/openfoodfacts/open-prices/issues/1020

Closing :)

Sep 26 '25 20:09 raphodn