
Change 'dice_coefficient' to calculate Dice score rather than pixel accuracy

Open ethanbb opened this issue 8 years ago • 13 comments

I am a student who's relatively new to the field, but:

I was inspecting this code for use in a project, and I noticed that the dice_coefficient calculation sums the prediction-y intersection over all dimensions, including class, before dividing by the "union." This seems incorrect, because the Dice score should be calculated by taking the intersection/union ratio separately for each class, and then summing the results (see e.g. the first answer here: https://stats.stackexchange.com/questions/255465/accuracy-vs-jaccard-for-multiclass-problem). This weights the predictions properly according to the frequency of the class in the image.

In contrast, the current code weights each prediction by the scalar "union," which will actually always equal 2 * height * width * batch because of the softmax. This is equivalent to pixel accuracy. I've changed the code to wait to reduce across the class/channel dimension until after the division.
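To illustrate the point (this is a NumPy sketch with made-up shapes, not the repository's actual code): with a softmax prediction and one-hot labels, the pooled denominator is a constant, while the per-class version divides before reducing over the class axis.

```python
import numpy as np

# Hypothetical shapes: (batch, height, width, n_class).
rng = np.random.default_rng(0)
pred = rng.random((2, 4, 4, 3))
pred /= pred.sum(axis=-1, keepdims=True)           # softmax-like: channels sum to 1
y = np.eye(3)[rng.integers(0, 3, size=(2, 4, 4))]  # one-hot ground truth

# Because of the softmax, the pooled denominator is constant:
# sum(pred) + sum(y) == 2 * batch * height * width.
assert np.isclose(pred.sum() + y.sum(), 2 * 2 * 4 * 4)

# Original: reduce over every axis (including class) before dividing.
pooled = 2 * (pred * y).sum() / (pred.sum() + y.sum())

# Proposed: divide per class first, then reduce over the class axis.
inter = (pred * y).sum(axis=(0, 1, 2))
sums = pred.sum(axis=(0, 1, 2)) + y.sum(axis=(0, 1, 2))
per_class = (2 * inter / sums).sum()
```

Since the pooled denominator is constant, `pooled` reduces to the mean per-pixel probability assigned to the true class, i.e. a soft pixel accuracy.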

If this is wrong I apologize, but I thought this was serious enough that I should bring it to attention.

ethanbb avatar May 11 '17 08:05 ethanbb

Sweet, that looks nice. I will play around a little bit to get a feeling for it, but at first sight this looks good. Thanks a lot!

jakeret avatar May 15 '17 19:05 jakeret

@ethanbb I looked at your PR and quickly ran into some issues. I was using the launcher.py example. Any thoughts?

2017-06-07 21:41:40,671 Layers 3, features 16, filter size 3x3, pool size: 2x2
2017-06-07 21:41:43,343 Removing 'jakeret/workspace/tf_unet/prediction'
2017-06-07 21:41:43,344 Removing 'jakeret/workspace/tf_unet/unet_trained'
2017-06-07 21:41:43,345 Allocating 'jakeret/workspace/tf_unet/prediction'
2017-06-07 21:41:43,345 Allocating 'jakeret/workspace/tf_unet/unet_trained'
2017-06-07 21:41:52,098 Verification error= 19.3%, loss= -0.9108
2017-06-07 21:41:55,556 Start optimization
2017-06-07 21:42:01,622 Iter 0, Minibatch Loss= -0.8897, Training Accuracy= 0.8661, Minibatch error= 13.4%
2017-06-07 21:42:11,388 Iter 2, Minibatch Loss= -0.9438, Training Accuracy= 0.8686, Minibatch error= 13.1%
2017-06-07 21:42:21,351 Iter 4, Minibatch Loss= -1.0127, Training Accuracy= 0.7953, Minibatch error= 20.5%
2017-06-07 21:42:29,899 Iter 6, Minibatch Loss= -1.0798, Training Accuracy= 0.8195, Minibatch error= 18.1%
2017-06-07 21:42:38,333 Iter 8, Minibatch Loss= -1.2307, Training Accuracy= 0.8281, Minibatch error= 17.2%
2017-06-07 21:42:47,531 Iter 10, Minibatch Loss= -1.4287, Training Accuracy= 0.8143, Minibatch error= 18.6%
2017-06-07 21:42:56,059 Iter 12, Minibatch Loss= -1.2956, Training Accuracy= 0.8034, Minibatch error= 19.7%
2017-06-07 21:43:05,129 Iter 14, Minibatch Loss= -1.4939, Training Accuracy= 0.9566, Minibatch error= 4.3%
2017-06-07 21:43:13,368 Iter 16, Minibatch Loss= -1.5857, Training Accuracy= 0.9590, Minibatch error= 4.1%
2017-06-07 21:43:21,670 Iter 18, Minibatch Loss= nan, Training Accuracy= 0.0195, Minibatch error= 18.8%
2017-06-07 21:43:25,199 Epoch 0, Average loss: nan, learning rate: 0.2000
2017-06-07 21:43:32,609 Verification error= 19.3%, loss= -0.8961
jakeret/workspace/tf_unet/tf_unet/util.py:74: RuntimeWarning: invalid value encountered in true_divide
  img /= np.amax(img)
W tensorflow/core/framework/op_kernel.cc:993] Invalid argument: Nan in summary histogram for: norm_grads

jakeret avatar Jun 07 '17 19:06 jakeret

Sorry, forgot about this - not sure what's causing the NaN(s). We ended up using a different loss function entirely (computing cross entropy for the pixels in each ground-truth class separately and then averaging across classes), so I haven't tested this version extensively. I still think this is closer to a real Dice score than the original, but the "true" Dice score we used for evaluation treats predictions as binary rather than adding up the probability of each class at each pixel. (That version can't be used as a loss function, though, since it doesn't have a smooth gradient.)
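For concreteness, the binarized evaluation metric mentioned above could look like the following NumPy sketch (names, shapes, and the `eps` guard are my assumptions, not the code we actually used):

```python
import numpy as np

def hard_dice(pred, y, eps=1e-7):
    """Per-class Dice on argmax-binarized predictions.

    Evaluation only -- the argmax makes it non-differentiable,
    so it can't serve as a training loss. Shapes: (batch, h, w, n_class).
    """
    n_class = pred.shape[-1]
    hard = np.eye(n_class)[pred.argmax(axis=-1)]  # one-hot of predicted class
    inter = (hard * y).sum(axis=(0, 1, 2))
    sums = hard.sum(axis=(0, 1, 2)) + y.sum(axis=(0, 1, 2))
    return 2 * inter / (sums + eps)
```

A perfect prediction scores (approximately) 1 for every class that appears in the ground truth.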

ethanbb avatar Jul 01 '17 06:07 ethanbb

Hi @ethanbb , thanks for your pull request. I have a question, in your code, the following line

loss = tf.reduce_sum(-(2 * intersection / union))

why not

loss = tf.reduce_mean(-(2 * intersection / union))?

glhfgg1024 avatar Apr 18 '18 09:04 glhfgg1024

@glhfgg1024 Hey, it's been a while. I'm not sure whether summing or averaging the individual per-class ratios gives a more "canonical" Dice coefficient, but there should be no difference in training, since they differ only by a constant factor (the number of classes).
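A quick sanity check of that constant-factor claim, with illustrative numbers only:

```python
import numpy as np

# Hypothetical per-class 2*I/(|A| + |B|) ratios for a 3-class problem.
ratios = np.array([0.8, 0.6, 0.9])

# reduce_sum and reduce_mean differ only by the (constant) class count,
# so gradients differ by the same factor and training is equivalent
# up to a rescaled learning rate.
assert np.isclose(ratios.sum(), ratios.mean() * len(ratios))
```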

ethanbb avatar Apr 24 '18 04:04 ethanbb

Hi @ethanbb, thanks a lot for your kind comment!

glhfgg1024 avatar Apr 24 '18 04:04 glhfgg1024

Be careful with this calculation... Intersection over union (IoU) is not the same as the dice coefficient. Besides the factor of 2 in the dice coefficient (which is accounted for in the code), the denominator is also different: for IoU it is simply |A u B|, whereas for the dice coefficient it is |A| + |B|. The main difference is that the intersect is double-counted in the dice coefficient, since both the |A| and |B| terms include it, whereas in IoU, |A u B| includes the intersect only once.

To fix this, we could change the denominator to either |A| + |B| or to (union + intersect), as those are equivalent. Since the dice coefficient is usually defined with |A| + |B|, I would personally suggest that for the sake of clarity, but both are mathematically correct.
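A small numeric check of the distinction, with arbitrarily chosen binary masks:

```python
import numpy as np

a = np.array([1, 1, 1, 0, 0], dtype=bool)
b = np.array([0, 1, 1, 1, 1], dtype=bool)

inter = (a & b).sum()         # |A n B| = 2
union = (a | b).sum()         # |A u B| = 5 (intersect counted once)
a_plus_b = a.sum() + b.sum()  # |A| + |B| = 7 (intersect counted twice)

assert a_plus_b == union + inter              # the denominators differ by |A n B|
dice = 2 * inter / a_plus_b                   # 4/7
iou = inter / union                           # 2/5
assert np.isclose(dice, 2 * iou / (1 + iou))  # standard Dice-IoU identity
```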

A further suggestion would be to add IoU as a third option for the cost parameter as this may be a function others would like to use in the future.
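A hedged sketch of what such an "iou" cost option might look like, written in NumPy for brevity (a real implementation would use the corresponding tf ops; the `eps` guard and shapes are assumptions):

```python
import numpy as np

def iou_loss(pred, y, eps=1e-7):
    """Soft per-class IoU, negated for use as a cost.

    union = |A| + |B| - |A n B|, so the intersect is counted only once.
    Shapes assumed: (batch, h, w, n_class).
    """
    inter = (pred * y).sum(axis=(0, 1, 2))
    union = pred.sum(axis=(0, 1, 2)) + y.sum(axis=(0, 1, 2)) - inter
    return -(inter / (union + eps)).sum()
```

With a perfect one-hot prediction over n classes, each per-class IoU is (approximately) 1, so the loss approaches -n.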

wkeithvan avatar Jun 04 '18 19:06 wkeithvan

@wkeithvan Good point. Actually, I believe that what is currently called union in the code is actually the sum |A| + |B| as you suggest - is it not? So to be more accurate, we should rename that to sum or something, and then also include intersection over union as another option.

ethanbb avatar Jun 04 '18 19:06 ethanbb

I agree... I was just going through and thinking the same thing. I would rename the intersection variable to A_intersect_B and the union variable to A_plus_B. Additionally, I would move the eps into the final calculation, so that intermediate variables aren't 'fuzzed' by adding an epsilon in case they get used later.

Perhaps this?

A_intersect_B = tf.reduce_sum(prediction * self.y, axis=[0, 1, 2])
A_plus_B = tf.reduce_sum(prediction, axis=[0, 1, 2]) + tf.reduce_sum(self.y, axis=[0, 1, 2])
loss = tf.reduce_sum(-(2 * A_intersect_B / (eps + A_plus_B)))

wkeithvan avatar Jun 04 '18 20:06 wkeithvan

Make sure to update the documentation/comment on line 207 to note that intersection over union ("iou") is a valid choice for the cost function. Otherwise looks good.

wkeithvan avatar Jun 06 '18 14:06 wkeithvan

First of all thanks for your contribution. I just found some time to have a look.

I noticed that there are syntax errors in the commit. When I fixed them and ran the demo code, I was getting the old NaN errors (see above). Something isn't quite right. @wkeithvan does it work on your machine?

jakeret avatar Jun 08 '18 17:06 jakeret

Sorry about the syntax errors - I don't use Python often. I'm still not sure about the NaN problem.

ethanbb avatar Jun 08 '18 17:06 ethanbb

It's running, although I'm getting that black output again... (#182)

wkeithvan avatar Jun 11 '18 18:06 wkeithvan