Price tags: levels of difficulty (ex: AI found an existing product or category)
Story
Currently, all the Price Tags (with status=None) are visible in the Price Validation Assistant (web frontend). But some PTs are very hard to decipher, or simply poorly detected, and new users find the PVA too complex.
In order to build a simplified PVA, we need to isolate simple PTs from difficult ones.
How
One idea is to sort them by level of difficulty:
- level 1 : found a barcode corresponding to an existing product
- level 2 : found an existing category
- level 3 : found a valid barcode, but no product exists yet
- level 4 : the rest
We could add a field difficulty_level that would be calculated similarly to #656
The tricky bit is that this field would be stored on the prediction. So we then need to be able to filter the API on the predictions...
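A minimal sketch of how the four levels above could be derived, assuming the prediction exposes three boolean checks. The `PredictionChecks` class and `difficulty_level` function are hypothetical names for illustration, not the actual open-prices code:

```python
from dataclasses import dataclass


@dataclass
class PredictionChecks:
    """Hypothetical summary of what a prediction found (not the real model)."""
    barcode_valid: bool = False
    product_exists: bool = False
    category_tag_valid: bool = False


def difficulty_level(checks: PredictionChecks) -> int:
    """Map the checks to the levels listed above (1 = easiest, 4 = the rest)."""
    if checks.product_exists:
        return 1  # barcode corresponds to an existing product
    if checks.category_tag_valid:
        return 2  # an existing category was found
    if checks.barcode_valid:
        return 3  # valid barcode, but no product exists yet
    return 4      # everything else


print(difficulty_level(PredictionChecks(barcode_valid=True)))
```

The checks are evaluated in level order, so a price tag that satisfies several rules gets the lowest (easiest) matching level.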
Todo
- [x] first list of rules
- [x] new PriceTag.tags field to store the results: #806
- [x] new PriceTagPrediction helper methods to tell if the predicted barcode/product/category_tag is valid or not: #838
- [x] new signal to set the PriceTag tags after its PriceTagPrediction creation: #843
- [ ] run a script on the PriceTag history
With the help of #806 (new PriceTag.tags field), we can label the PriceTags with the above rules
| Rule | tag label | implemented? |
|---|---|---|
| prediction found a valid barcode | ~prediction-found-valid-barcode~ prediction-barcode-valid | ✅ |
| prediction found a barcode corresponding to an existing product | ~prediction-found-existing-product~ prediction-product-exists | ✅ |
| prediction found a valid barcode corresponding to an existing product | prediction-barcode-valid-and-product-exists | |
| prediction found a valid category_tag | ~prediction-found-category~ prediction-category-tag-valid | ✅ |
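For illustration, the rules in the table could be turned into tags roughly like this. It is only a sketch: the `compute_tags` and `filter_by_tag` helpers and the dict shape of a price tag are assumptions, and only the tag labels come from the table:

```python
def compute_tags(barcode_valid: bool, product_exists: bool,
                 category_tag_valid: bool) -> list[str]:
    """Return the tag labels from the table that apply to one prediction."""
    tags = []
    if barcode_valid:
        tags.append("prediction-barcode-valid")
    if product_exists:
        tags.append("prediction-product-exists")
    if barcode_valid and product_exists:
        tags.append("prediction-barcode-valid-and-product-exists")
    if category_tag_valid:
        tags.append("prediction-category-tag-valid")
    return tags


def filter_by_tag(price_tags: list[dict], tag: str) -> list[dict]:
    """Keep only the price tags carrying the given tag (hypothetical dict shape)."""
    return [pt for pt in price_tags if tag in pt.get("tags", [])]


# Two made-up price tags, labelled with the rules above
price_tags = [
    {"id": 1, "tags": compute_tags(True, True, False)},
    {"id": 2, "tags": compute_tags(False, False, True)},
]
print(filter_by_tag(price_tags, "prediction-product-exists"))
```

Storing the result as plain tags on the PriceTag means the frontend can filter on one field instead of querying the predictions.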
@TTalex sounds good ? ideas of extra rules, or better naming ?
Looks good !
OK, I merged 2 PRs, updated my comments above, and tested on staging ✅
todo
- [x] refactoring : move the logic to ml.py
- [x] manage changes due to schema v2
- following #885
- [ ] manage specific barcode rules (e.g. Carrefour) currently done in the frontend
- dedicated issue : https://github.com/openfoodfacts/open-prices/issues/1020
- [ ] run a script on the history (and the changes post-v2)
- [x] add a filter in the frontend UI
I'm a bit stuck on the shop-specific rules.
I'd like to implement them in the backend, but I still don't want to change the output of Gemini.
So it would probably require adding a new field, such as PriceTagPrediction.data.barcode_cleaned, but maybe it's better to add it in a separate field/dict?
The logical approach would be to implement a v1 without this, but Carrefour has such particular price tags that it would suck not to have them managed ^^
I remember a previous conversation about this.
To me, it's acceptable to edit the Gemini prediction with additional rules, such as the Carrefour one, and simply replace the prediction.
It would be the same if we edited the prompt, asking the model explicitly to handle barcodes with slashes differently. What matters is that in the end the prediction is as good as possible, even if we had to fix a few things along the way.
No ?
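To make the two options concrete, here is a rough sketch of the "extra field" variant, which leaves the raw Gemini output untouched. The `add_cleaned_barcode` helper and the dict shape are hypothetical, and the digits-only cleaning is just a placeholder: the actual Carrefour rule is not described in this thread.

```python
import copy


def add_cleaned_barcode(data: dict) -> dict:
    """Return a copy of the prediction data with an extra barcode_cleaned key.

    Placeholder rule (assumption): keep only the digits of the raw barcode,
    which drops slashes and spaces. The original "barcode" value is preserved.
    """
    result = copy.deepcopy(data)
    raw = result.get("barcode")
    if isinstance(raw, str):
        result["barcode_cleaned"] = "".join(ch for ch in raw if ch.isdigit())
    return result


prediction_data = {"barcode": "3560/0709/76867"}  # made-up example value
cleaned = add_cleaned_barcode(prediction_data)
print(cleaned["barcode"], cleaned["barcode_cleaned"])
```

The "replace the prediction" variant would simply overwrite `barcode` instead of adding a second key; the cleaning rule itself would be the same.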
I started some frontend integration here : https://github.com/openfoodfacts/open-prices-frontend/pull/1702
And I moved the "shop-specific" discussion to a dedicated issue : https://github.com/openfoodfacts/open-prices/issues/1020
Closing :)