segmentation_metrics
Computing values on ground truth does not give perfect scores
Following one of the examples in the README file, a quick test that passes a ground truth image as the predicted image does not give perfect scores:

import numpy as np
import seg_metrics.seg_metrics as sg

labels = [0, 1, 2]
gdth_img = np.array([[0, 0, 1],
                     [0, 1, 2]])
# Compare the ground truth against itself; every metric should be perfect.
metrics = sg.write_metrics(labels=labels[1:], gdth_img=gdth_img, pred_img=gdth_img)

The Dice score, Jaccard index, precision, and recall are not 1.0. Although they are close (0.999), they should be exactly 1.0.
It seems to be related to the smooth parameter: changing it to smooth=0.0 gives a Dice of 1.0. There is some discussion about that here: https://github.com/Jingnan-Jia/segmentation_metrics/blob/df9a231275decb7803ed8e16792ccd3ab9300d28/seg_metrics/seg_metrics.py#L81-L89
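For context, a minimal sketch of how a denominator-only smoothing term of this kind produces the effect; the function below is illustrative, not the package's actual implementation (see the linked lines for that):

import numpy as np

def dice_with_smooth(gdth, pred, smooth=0.001):
    # Flatten the binary masks and count the overlapping pixels.
    gdth = np.asarray(gdth).ravel().astype(bool)
    pred = np.asarray(pred).ravel().astype(bool)
    intersection = np.logical_and(gdth, pred).sum()
    # Adding smooth to the denominator avoids a 0/0 division when both
    # masks are empty, but it also pulls the score slightly below 1.0
    # even when the masks match exactly.
    return 2.0 * intersection / (gdth.sum() + pred.sum() + smooth)

mask = np.array([[0, 0, 1], [0, 1, 0]])
print(dice_with_smooth(mask, mask))              # ~0.9998, not 1.0
print(dice_with_smooth(mask, mask, smooth=0.0))  # exactly 1.0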
@agarcia-ruiz thanks for the answer. I understand that it may be useful to prevent a division by zero, but I am not convinced that it should be there outside the scope of training a DL model (I have gone through the discussion you pointed to, and even there it seems that for some cases training does not work unless it is set to 0).
In this context, the case where pred_sum is 0 should probably be dealt with in another way. At the very least, documenting the behavior may be worthwhile, or exposing it as a parameter of the method.
@jhlegarreta Thank you very much for raising this question. We have actually had some discussion on this issue, and we decided to use 0.001 as the default smooth. But indeed we found that many users may be confused by it, especially when running our examples. So we will have another, deeper discussion and see how to solve it. One possible solution, as you mentioned, is to make it a parameter. Another solution is to make it 0 and raise an exception when a division-by-zero error occurs.
Anyway, we will solve it soon. And if you have a better solution, please let us know.
Best, Jingnan
@jhlegarreta Another solution could be to keep a default smooth value of 0 and wrap the metrics calculation in a try/except block. If we hit a division-by-zero exception in the try block, we reset the smooth value to 0.001, recalculate the metrics, and possibly show a warning to the user about this case.
What do you think of this solution?
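For illustration, a rough sketch of that idea for the Dice score alone (names are hypothetical, not the package's API; note the int() casts, since NumPy scalar division by zero returns inf/nan instead of raising):

import warnings
import numpy as np

def dice_try_except(gdth, pred):
    gdth = np.asarray(gdth).ravel().astype(bool)
    pred = np.asarray(pred).ravel().astype(bool)
    tp = int(np.logical_and(gdth, pred).sum())
    denom = int(gdth.sum() + pred.sum())  # plain ints, so / 0 raises
    try:
        # First attempt: the exact metric, with no smoothing.
        return 2.0 * tp / denom
    except ZeroDivisionError:
        # Fall back to a smoothed computation and warn the user.
        warnings.warn("Zero denominator (both masks are empty); "
                      "recomputing with smooth=0.001.")
        return 2.0 * tp / (denom + 0.001)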
@Jingnan-Jia it looks like one possible workaround; I am not sure what the best approach is. It may be worthwhile to look at how other scientific tools that compute at least the DSC handle this. In my mind, when given the ground truth, the method should clearly report a perfect score. Providing tests on well-known cases would probably be enlightening.
@jhlegarreta Thank you for your reply. I have updated the package. By installing the latest version (pip install seg-metrics==1.2.3) you can now get accurate, perfect metrics.
I removed the use of smooth=0.001. Instead, I added an if/else condition that checks whether the denominator is 0 and returns the perfect metrics for the different conditions.
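In simplified form, the new logic amounts to something like the following sketch (Dice only, and only a sketch; the actual code in seg_metrics.py covers all the reported metrics):

import numpy as np

def dice_exact(gdth, pred):
    gdth = np.asarray(gdth).ravel().astype(bool)
    pred = np.asarray(pred).ravel().astype(bool)
    denom = int(gdth.sum() + pred.sum())
    if denom == 0:
        # Both masks are empty: define the score as perfect
        # instead of smoothing the division.
        return 1.0
    return 2.0 * int(np.logical_and(gdth, pred).sum()) / denom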
By running the code you provided, you get the following results.
[{'label': [1, 2],
'dice': [1.0, 1.0],
'jaccard': [1.0, 1.0],
'precision': [1.0, 1.0],
'recall': [1.0, 1.0],
'fpr': [0.0, 0.0],
'fnr': [0.0, 0.0],
'vs': [0.0, 0.0],
'hd': [0.0, 0.0],
'msd': [0.0, 0.0],
'mdsd': [0.0, 0.0],
'stdsd': [0.0, 0.0],
'hd95': [0.0, 0.0]}]
With v1.2.3 or later, perfect metrics can be obtained.
@Jingnan-Jia thanks for the effort.