[Metrics] Panoptic Quality
🚀 Feature
Implement the Panoptic Quality (PQ) metric.
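For context, the definition from the Panoptic Segmentation paper (Kirillov et al., CVPR 2019): predicted and ground-truth segments are matched when their IoU exceeds 0.5 (a threshold that makes the matching unique), and PQ is

```math
\mathrm{PQ} \;=\; \frac{\sum_{(p,\,g)\,\in\,\mathit{TP}} \mathrm{IoU}(p, g)}{|\mathit{TP}| \;+\; \tfrac{1}{2}\,|\mathit{FP}| \;+\; \tfrac{1}{2}\,|\mathit{FN}|}
```

which the paper also factors as PQ = SQ × RQ, the average IoU over true positives times an F1-style recognition term.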
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi @ananyahjha93, are you working on this? If you're busy with other things, I could take a look at this metric :).
@ddrevicky I think you can give it a shot, thanks!
cc @teddykoker
I will most likely not have time to look at this now; if anyone else would like to take a look, feel free to do so :)
Hi! Thanks for your contribution, great first issue!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
A polite request to reopen the issue: PQ is an important metric and it would be great to have it.
Hello, I will give it a try this week, taking inspiration from the COCO implementation: https://github.com/cocodataset/panopticapi/blob/master/panopticapi/evaluation.py
Any preliminary comments are most welcome, especially on the signature that the methods should have. In any case, I will submit a draft PR soon.
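To make the discussion concrete, here is a minimal functional sketch of the core computation under strong simplifying assumptions: one image per call, a single class, and 2D integer segment-id maps with id 0 treated as void. The names `pq_stats` and `panoptic_quality` are placeholders, not a proposed final API, and the real metric would also need per-class handling as in panopticapi.

```python
import torch


def pq_stats(pred: torch.Tensor, target: torch.Tensor):
    """Match the segments of one image and return (iou_sum, tp, fp, fn).

    Sketch assumptions: ``pred`` and ``target`` are 2D integer maps of
    segment ids, id 0 is void, and all segments belong to one class.
    """
    iou_sum, tp = 0.0, 0
    pred_ids = [p for p in pred.unique().tolist() if p != 0]
    target_ids = [t for t in target.unique().tolist() if t != 0]
    matched_preds = set()
    for t in target_ids:
        t_mask = target == t
        for p in pred_ids:
            p_mask = pred == p
            inter = (t_mask & p_mask).sum().item()
            union = (t_mask | p_mask).sum().item()
            iou = inter / union if union else 0.0
            # IoU > 0.5 guarantees the match is unique, so we can stop
            # scanning predictions once this target is matched
            if iou > 0.5:
                iou_sum += iou
                tp += 1
                matched_preds.add(p)
                break
    fp = len(pred_ids) - len(matched_preds)  # unmatched predicted segments
    fn = len(target_ids) - tp                # unmatched ground-truth segments
    return iou_sum, tp, fp, fn


def panoptic_quality(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    iou_sum, tp, fp, fn = pq_stats(pred, target)
    denom = tp + 0.5 * fp + 0.5 * fn
    return torch.tensor(iou_sum / denom if denom else 0.0)
```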
Regarding the spirit of the implementation to adopt, I do have a few questions, since this is my first contribution to PL:
- Should the metric return a single float (the actual panoptic quality), or should it return a dict of detailed intermediate results like the reference implementation in the COCO API does?
- If I see small bugs/differences between the reference implementation and the reference paper, which one should I follow?
Answer from @justusschock on Discord, transcribed here for visibility: Regarding your questions:
- Metrics (after the full computation, i.e. after `compute` has been called) usually return a single float/scalar tensor so that the value can easily be logged to a logger of your choice (see the sketch after this list). Sometimes (as for a PR curve) this is not feasible because the result cannot be reduced to a single scalar, but where possible we should aim for a scalar. Note that if `reduction` is `None`, we should get one scalar per sample of the current batch.
- That's a very good question. I'm not sure how much this potential difference impacts the overall value. Usually I'd go with the paper, but in your specific case I'd opt for the reference implementation: COCO is the established de facto standard, and for comparability and consistency I feel we should match it.
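To illustrate the first point, here is a sketch of how a stateful version could follow the usual TorchMetrics pattern: accumulate matching statistics in `update` and reduce them to a single scalar tensor in `compute`. It reuses the hypothetical `pq_stats` helper from the functional sketch above; the class name and arguments are placeholders, not a committed design.

```python
import torch
from torchmetrics import Metric


class PanopticQuality(Metric):
    """Sketch of the update/compute pattern; ``pq_stats`` is the
    hypothetical per-image matcher from the functional sketch above."""

    def __init__(self):
        super().__init__()
        # scalar states, summed across batches and across processes
        self.add_state("iou_sum", default=torch.tensor(0.0), dist_reduce_fx="sum")
        self.add_state("tp", default=torch.tensor(0.0), dist_reduce_fx="sum")
        self.add_state("fp", default=torch.tensor(0.0), dist_reduce_fx="sum")
        self.add_state("fn", default=torch.tensor(0.0), dist_reduce_fx="sum")

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        # one image per call in this sketch; batching is left out
        iou_sum, tp, fp, fn = pq_stats(preds, target)
        self.iou_sum += iou_sum
        self.tp += tp
        self.fp += fp
        self.fn += fn

    def compute(self) -> torch.Tensor:
        # a single scalar tensor, so the value can be logged directly
        denom = self.tp + 0.5 * self.fp + 0.5 * self.fn
        return self.iou_sum / denom if denom > 0 else torch.tensor(0.0)
```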