BinaryCrossentropy should support a way to reduce over NO axes
Currently, we do not support a way to reduce over no axes in backend.mean.
Because of https://github.com/keras-team/keras/blob/c9068087d9142bab573e0c300bf9874a957accff/keras/losses.py#L2162, we cannot use the Keras BinaryCrossentropy loss for the YOLOX model.
@quantumalvya needs a way to avoid reducing over any axis; let's support this in the backend.mean() function: https://github.com/keras-team/keras/blob/c9068087d9142bab573e0c300bf9874a957accff/keras/backend.py#L2899
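A minimal sketch of what such an escape hatch could look like; the sentinel name NO_REDUCTION and this signature are purely hypothetical, not the actual Keras API:

import tensorflow as tf

# Hypothetical sketch: a sentinel that lets mean() skip reduction entirely.
# axis=None keeps its existing meaning ("reduce over all axes").
NO_REDUCTION = object()

def mean(x, axis=None, keepdims=False):
    if axis is NO_REDUCTION:
        return x  # reduce over no axes: return the elementwise values untouched
    return tf.reduce_mean(x, axis=axis, keepdims=keepdims)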
Under the hood it uses TF reduce_mean: https://www.tensorflow.org/api_docs/python/tf/math/reduce_mean
Where the axis=None semantics mean "all the axes".
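For example, a quick illustration of that default:

import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(tf.reduce_mean(x))          # axis=None: reduces over all axes -> 2.5
print(tf.reduce_mean(x, axis=0))  # reduces over axis 0 only -> [2.0, 3.0]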
So what do you mean by "avoid reducing over any axis"?
@LukeWood can you expand on the requirement here?
Perhaps some example code demonstrating this need would be useful. I'm curious why we would need to use binary_crossentropy as a loss without any reduction. Can the use case be solved using keras.backend.binary_crossentropy, which doesn't perform reduction?
Sarvagya, can you comment here? (I pinged him with a link as I can't tag him yet.)
> Where the axis=None semantics mean "all the axes".
> So what do you mean by "avoid reducing over any axis"?
As in, the loss should return without reducing the mean over any axis at all. axis=None means reducing over all axes, but we require no reduction for YOLOX.
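To illustrate the desired output, a quick sketch using the existing unreduced backend op:

import tensorflow as tf

y_true = tf.constant([[0.0, 1.0], [1.0, 0.0]])
y_pred = tf.constant([[0.1, 0.9], [0.8, 0.2]])

bce = tf.keras.backend.binary_crossentropy(y_true, y_pred)
print(bce.shape)  # (2, 2): one loss value per element, nothing reduced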
> Perhaps some example code demonstrating this need would be useful.
I'm currently implementing this in the YOLOX implementation like so:
if self.axis is not None:
    return tf.reduce_mean(
        tf.keras.backend.binary_crossentropy(
            y_true, y_pred, from_logits=self.from_logits
        ),
        axis=self.axis,
    )
return tf.keras.backend.binary_crossentropy(
    y_true, y_pred, from_logits=self.from_logits
)
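For context, a self-contained sketch of how that snippet could sit inside a custom Loss subclass; the class name UnreducedBinaryCrossentropy is hypothetical, and treating axis=None as "no reduction" is this sketch's convention, not standard Keras semantics:

import tensorflow as tf

class UnreducedBinaryCrossentropy(tf.keras.losses.Loss):
    # Hypothetical: axis=None here means "do not reduce at all", unlike
    # the usual TF convention where axis=None means "reduce everything".
    def __init__(self, from_logits=False, axis=-1, **kwargs):
        super().__init__(**kwargs)
        self.from_logits = from_logits
        self.axis = axis

    def call(self, y_true, y_pred):
        bce = tf.keras.backend.binary_crossentropy(
            y_true, y_pred, from_logits=self.from_logits
        )
        if self.axis is not None:
            return tf.reduce_mean(bce, axis=self.axis)
        return bce  # elementwise loss, no reduction applied

Note that when such a loss is passed through compile(), the base Loss class still applies its own reduction unless it is constructed with reduction=tf.keras.losses.Reduction.NONE.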
> I'm curious why we would need to use binary_crossentropy as a loss without any reduction. Can the use case be solved using keras.backend.binary_crossentropy, which doesn't perform reduction?
YOLOX sums over the axis internally and then takes the mean based on the number of positive boxes (as calculated by SimOTA). If we took the mean over the number of examples, it would be inaccurate. keras.backend.binary_crossentropy can achieve it, yes. However, we want to offer support for losses passed through compile(). I have currently implemented an internal loss as part of my implementation.
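Roughly what that normalization looks like (a sketch; num_positive_boxes stands in for the SimOTA-assigned count and is an illustrative name, not a variable from the code above):

import tensorflow as tf

def yolox_style_bce(y_true, y_pred, num_positive_boxes, from_logits=True):
    # Elementwise BCE, summed, then normalized by the number of positive
    # boxes rather than by the number of examples.
    bce = tf.keras.backend.binary_crossentropy(
        y_true, y_pred, from_logits=from_logits
    )
    return tf.reduce_sum(bce) / num_positive_boxes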
> Sarvagya, can you comment here? (I pinged him with a link as I can't tag him yet.)
It was just misspelled.
> YOLOX sums over the axis internally and then takes the mean based on the number of positive boxes (as calculated by SimOTA). If we took the mean over the number of examples, it would be inaccurate. keras.backend.binary_crossentropy can achieve it, yes. However, we want to offer support for losses passed through compile(). I have currently implemented an internal loss as part of my implementation.
I think it makes more sense to create another BinaryCrossentropy variant like BinaryCrossentropyFocal, or to add an extra parameter that makes the mean optional in BinaryCrossentropy, instead of modifying mean.
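A rough sketch of that second option; the apply_mean flag is purely illustrative, not an existing Keras parameter:

import tensorflow as tf

def binary_crossentropy(y_true, y_pred, from_logits=False, axis=-1,
                        apply_mean=True):
    # Hypothetical flag: when apply_mean=False, skip the final reduce_mean
    # and return the elementwise values.
    bce = tf.keras.backend.binary_crossentropy(
        y_true, y_pred, from_logits=from_logits
    )
    if apply_mean:
        return tf.reduce_mean(bce, axis=axis)
    return bce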
> YOLOX sums over the axis internally
Does this mean that you could reduce_sum and then divide by positive_boxes?
> I think it makes more sense to create another BinaryCrossentropy variant like BinaryCrossentropyFocal, or to add an extra parameter that makes the mean optional in BinaryCrossentropy, instead of modifying mean.
> Does this mean that you could reduce_sum and then divide by positive_boxes?
Yes; as of now, I have implemented an internal loss specifically for YOLOX which does a reduce_sum followed by division by the number of positive boxes.
Yes, and in the official reference implementation the reduction lives in the BinaryCrossentropy API rather than in mean:
https://github.com/Megvii-BaseDetection/YOLOX/blob/main/yolox/models/yolo_head.py#L493-L495