Numerical Instability in metrics.py
When I use metrics.py to evaluate a model with the same weights, I get different mIoU values across runs.
I am using your DeepLab implementation as the backbone of another network, along with your evaluation code.
Below are three such runs, where metrics.py was used to evaluate the model on the same validation set with the same weights.
RUN 1
> 'Pixel Accuracy': 0.891,
> 'Mean Accuracy': 0.755,
> 'Frequency Weighted IoU': 0.810,
> 'Mean IoU': 0.615,
RUN 2
> 'Pixel Accuracy': 0.896,
> 'Mean Accuracy': 0.761,
> 'Frequency Weighted IoU': 0.819,
> 'Mean IoU': 0.622,
RUN 3
> "Pixel Accuracy": 0.882
> "Mean Accuracy": 0.748,
> "Frequency Weighted IoU": 0.798,
> "Mean IoU": 0.609,
It seems like an issue of numerical instability.
In particular, I suspect that either the _fast_hist function or the division in the scores function in utils/metric.py is the root cause.
I would greatly appreciate any help here. Thank you!
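For reference, here is a minimal sketch of how such histogram-based metrics are typically computed (this assumes an implementation along the lines of the common _fast_hist / scores pattern; the actual utils/metric.py may differ). Given identical predictions and labels, these operations are integer accumulation followed by a single float division, so they should be bit-identical across runs, which is why the network output itself is worth checking first.

```python
import numpy as np

def _fast_hist(label_true, label_pred, n_class):
    # Build an n_class x n_class confusion matrix from flattened label arrays,
    # ignoring pixels whose ground-truth label falls outside [0, n_class).
    mask = (label_true >= 0) & (label_true < n_class)
    hist = np.bincount(
        n_class * label_true[mask].astype(int) + label_pred[mask],
        minlength=n_class ** 2,
    ).reshape(n_class, n_class)
    return hist

def scores(label_trues, label_preds, n_class):
    # Accumulate the confusion matrix over the whole validation set,
    # then derive pixel accuracy, mean accuracy, FW IoU and mean IoU.
    hist = np.zeros((n_class, n_class))
    for lt, lp in zip(label_trues, label_preds):
        hist += _fast_hist(lt.flatten(), lp.flatten(), n_class)
    acc = np.diag(hist).sum() / hist.sum()
    acc_cls = np.nanmean(np.diag(hist) / hist.sum(axis=1))
    iou = np.diag(hist) / (hist.sum(axis=1) + hist.sum(axis=0) - np.diag(hist))
    freq = hist.sum(axis=1) / hist.sum()
    fwavacc = (freq[freq > 0] * iou[freq > 0]).sum()
    return {
        'Pixel Accuracy': acc,
        'Mean Accuracy': acc_cls,
        'Frequency Weighted IoU': fwavacc,
        'Mean IoU': np.nanmean(iou),
    }
```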
Is the output of your network consistent for each run? Let's first make sure the inference itself is deterministic.
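A quick way to isolate this is to fix all random seeds, disable non-deterministic cuDNN kernels, and compare a checksum of the predictions across two identical inference passes. The sketch below assumes a PyTorch model and data loader; make_deterministic and checksum_predictions are hypothetical helper names, not functions from this repo.

```python
import random
import numpy as np
import torch

def make_deterministic(seed=0):
    # Fix all RNG seeds and disable non-deterministic cuDNN autotuning/kernels.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

@torch.no_grad()
def checksum_predictions(model, loader, device='cuda'):
    # Sum the argmax predictions over the whole validation set.
    # Run this twice with the same weights: if the two checksums differ,
    # the variation comes from the network / data pipeline, not from the
    # metric code.
    model.eval()
    total = 0
    for images, _ in loader:
        preds = model(images.to(device)).argmax(dim=1)
        total += int(preds.sum().item())
    return total
```

If the checksums do differ, the usual culprits are a model left in training mode (dropout or batch-norm updates active), a shuffled or randomly augmented validation loader, or non-deterministic GPU ops, rather than _fast_hist or the division in scores.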