geo-deep-learning
Benchmark: need to have per-image metrics and detailed report
Currently, when running inference.py with ground-truth files, metrics are averaged over all predictions.
For benchmarking purposes, it would be important to also keep per-image metrics. For example, this could help us identify whether the model performs better on images from a particular ecozone, season, time of day, etc.
This points to the need for a benchmarking platform that could output a complete report on a given model's performance.
This has been resolved in the newly merged benchmarking implementation!
Reference: the segmentation_models_pytorch iou_score metric — https://smp.readthedocs.io/en/latest/metrics.html#segmentation_models_pytorch.metrics.functional.iou_score
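For reference, here is a minimal sketch of how per-image IoU could be collected with the linked segmentation_models_pytorch metrics and written out as a report. The `predictions` iterable, tile names, class count, and CSV path are illustrative placeholders, not part of the merged implementation:

```python
import pandas as pd
import torch
import segmentation_models_pytorch as smp

num_classes = 5  # assumption: adjust to the model's class count

# Hypothetical stand-in for inference outputs: (image name, predicted mask, ground truth).
predictions = [
    (f"tile_{i}.tif",
     torch.randint(0, num_classes, (256, 256)),
     torch.randint(0, num_classes, (256, 256)))
    for i in range(3)
]

records = []
for image_name, pred, target in predictions:
    # Per-image confusion-matrix statistics (batch of 1, one column per class).
    tp, fp, fn, tn = smp.metrics.get_stats(
        pred.unsqueeze(0), target.unsqueeze(0),
        mode="multiclass", num_classes=num_classes,
    )
    # Macro-averaged IoU for this single image, via the linked iou_score.
    iou = smp.metrics.iou_score(tp, fp, fn, tn, reduction="macro")
    records.append({"image": image_name, "iou": iou.item()})

# Per-image rows make it possible to slice results by ecozone, season, etc.
report = pd.DataFrame(records)
report.to_csv("benchmark_report.csv", index=False)
print(report.describe())
```

Keeping the per-image rows (rather than only the dataset-wide average) is what allows the report to be grouped by image attributes afterwards.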