tf.keras.metrics.MeanIoU API is practically unusable without a threshold
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
- TensorFlow installed from (source or binary): pip3
- TensorFlow version (use command below): 2.1.0
- Python version: 3.6
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version:
- GPU model and memory:
Describe the current behavior
tf.keras.metrics.MeanIoU's constructor does not take a threshold or list of thresholds as an input argument. This is not only inconsistent with the API used by other metrics (e.g. tf.keras.metrics.TruePositives, tf.keras.metrics.FalseNegatives), it also renders the metric practically unusable, because the outputs (i.e. predictions) from a network are generally probability values in the range 0 to 1, not perfect 0 or 1 values. Unless the constructor takes thresholds as an argument and applies them to the predictions before computing the IoU, the metric is practically useless: it always ends up showing 0.5, 0.25, or whatever the baseline random-guess IoU happens to be in a given problem.
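A minimal sketch of the failure mode (illustrative values; the result assumes TF truncates float predictions to integer class ids when building the internal confusion matrix):

import tensorflow as tf

m = tf.keras.metrics.MeanIoU(num_classes=2)
# Sigmoid-style probabilities instead of hard 0/1 class labels:
m.update_state([0, 0, 1, 1], [0.2, 0.4, 0.6, 0.8])
# Every probability below 1.0 is truncated to class 0, so the metric
# collapses to the baseline value (0.25 here) no matter how good the model is.
print(m.result().numpy())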
Describe the expected behavior
tf.keras.metrics.MeanIoU's constructor should take threshold values as input and apply them to the predictions before computing the IoU.
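If the constructor accepted thresholds, usage would look something like the sketch below (hypothetical API; the threshold argument is the proposed addition and does not exist in TF 2.1):

import tensorflow as tf

m = tf.keras.metrics.MeanIoU(num_classes=2, threshold=0.5)  # hypothetical argument
m.update_state([0, 0, 1, 1], [0.2, 0.4, 0.6, 0.8])  # would be binarized to [0, 0, 1, 1] first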
Standalone code to reproduce the issue: none required, because the docs (https://www.tensorflow.org/api_docs/python/tf/keras/metrics/MeanIoU) prove the point by themselves; the only example they show uses predictions that are already binary values.
Other info / logs: none.
@dd1923, on running the usage example given in the MeanIoU documentation, the output I got matched the example. Please find the gist of it here.
Could you please provide a minimal code sample to reproduce the issue reported here? Thanks!
@amahendrakar That was my point; did you read my post above? MeanIoU, implemented as a metric, is pretty much unusable without a threshold. Try replacing the 1 values in pred with 0.9 in your example and see the output.
Was able to reproduce the issue with TF v2.1, TF v2.2.0-rc4 and TF-nightly. Please find the attached gist. Thanks!
@pavithrasv any idea when this will be addressed? It seems like a minor change on the API side, but it carries a big impact on usability.
@dd1923 do you need a threshold, or something like argmax that chooses the index with the maximum probability?
A threshold is needed to stay consistent with the rest of the API. See the TruePositives, FalsePositives, etc. implementations: https://www.tensorflow.org/api_docs/python/tf/keras/metrics/TruePositives?hl=TR
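For comparison, a minimal sketch of the thresholded API those metrics already expose:

import tensorflow as tf

tp = tf.keras.metrics.TruePositives(thresholds=0.5)
tp.update_state([0, 1, 1, 1], [0.1, 0.8, 0.6, 0.4])
print(tp.result().numpy())  # 2.0 -- only the 0.8 and 0.6 predictions clear the threshold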
I'm not sure this can be consistent -- it looks like TruePositives requires y_pred to be probabilities, i.e. [batch_size, HW, n_classes], which is where a threshold makes sense to label the output as either 0 or 1. Meanwhile, MeanIoU requires y_pred to be predicted class ids, i.e. [batch_size, HW].
I'd rather think the right way is to allow this metric to accept y_pred as probabilities and do an argmax under the hood.
First, at least the docs of both TruePositives and MeanIoU describe y_pred simply as "the predicted values".
Second, the docs define IoU as follows: IOU = true_positive / (true_positive + false_positive + false_negative). So it seems logical that if all the inputs of the IoU formula (namely the true positives, false positives and false negatives) work on thresholds, then the resulting metric should also work on thresholds.
Third, how would argmax handle the case of having a 1 in more than one output class?
I think something along the lines of the following in the update_state method would do the trick, applied right before the current update_state implementation runs (at least for the case where threshold is a single value):
y_pred = tf.where(condition=tf.math.greater(y_pred, tf.cast(threshold, y_pred.dtype)), x=tf.cast(1.0, y_pred.dtype), y=y_pred)
y_pred = tf.where(condition=tf.math.less_equal(y_pred, tf.cast(threshold, y_pred.dtype)), x=tf.cast(0.0, y_pred.dtype), y=y_pred)
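For a single scalar threshold, I believe those two tf.where calls collapse to one cast:

y_pred = tf.cast(y_pred > threshold, y_pred.dtype)  # True/False -> 1.0/0.0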
OK, you want multilabel support; in that case having a threshold makes sense.
@dd1923 Sorry about the delay. Would you be interested in sending a PR for this, with test cases? I'll be happy to review and merge the change.
A potential workaround I found on Stack Overflow (https://stackoverflow.com/questions/60507120/how-to-correctly-use-the-tensorflow-meaniou-metric):
class MyMeanIOU(tf.keras.metrics.MeanIoU):
    def update_state(self, y_true, y_pred, sample_weight=None):
        # Convert per-class probabilities to class ids before delegating to MeanIoU
        return super().update_state(y_true, tf.argmax(y_pred, axis=-1), sample_weight)
In this case my y_true was a mask of shape (batch, 256, 256, 1), where the pixel values in the last dimension were 0, 1 or 2, and my y_pred was of shape (batch, 256, 256, 3). This way the argmax maps probabilities to a class value. Hope that helps!
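A usage sketch under the same assumptions (3 classes, sparse integer masks in y_true, and a model variable holding the segmentation network):

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=[MyMeanIOU(num_classes=3)])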
@pavithrasv I've opened #47410, could you take a look and share your thoughts on the best way to make the changes to the MeanIoU interface?
Hi guys,
I have run into a new issue with MeanIoU.
I am using TF 2.8 and Python 3.9 on Windows 10.
I am running a simple semantic-segmentation model (3 classes; background: 0, green objects: 1, red objects: 2).
I also used y_train = to_categorical(y_train, num_classes=3).
The metrics show that the model is learning, but mean_io_u is always 0.3342 (0.3333 on validation), even though the reported tp, fp and fn counts imply the IoU should be different.
Could you please let me know what I am missing?
Epoch 1/20
47/47 - 39s - loss: 1.1312 - tp: 15429676.0000 - fp: 14302729.0000 - tn: 84006816.0000 - fn: 34110008.0000 - mean_io_u: 0.3342 - val_loss: 1.0479 - val_tp: 93273.0000 - val_fp: 13260.0000 - val_tn: 42716212.0000 - val_fn: 21271462.0000 - val_mean_io_u: 0.3333 - 39s/epoch - 820ms/step
Epoch 2/20
47/47 - 27s - loss: 0.8477 - tp: 24225640.0000 - fp: 8459446.0000 - tn: 88540744.0000 - fn: 24657304.0000 - mean_io_u: 0.3342 - val_loss: 0.9835 - val_tp: 375734.0000 - val_fp: 24419.0000 - val_tn: 42705056.0000 - val_fn: 20989004.0000 - val_mean_io_u: 0.3333 - 27s/epoch - 575ms/step
Epoch 3/20
47/47 - 25s - loss: 0.6144 - tp: 33600268.0000 - fp: 4546788.0000 - tn: 92456352.0000 - fn: 15279731.0000 - mean_io_u: 0.3342 - val_loss: 0.7825 - val_tp: 3682112.0000 - val_fp: 62991.0000 - val_tn: 42666480.0000 - val_fn: 17682624.0000 - val_mean_io_u: 0.3333 - 25s/epoch - 531ms/step
Epoch 4/20
47/47 - 23s - loss: 0.4546 - tp: 39932232.0000 - fp: 2521113.0000 - tn: 94480768.0000 - fn: 8949028.0000 - mean_io_u: 0.3342 - val_loss: 0.5884 - val_tp: 16064998.0000 - val_fp: 105517.0000 - val_tn: 42623952.0000 - val_fn: 5299738.0000 - val_mean_io_u: 0.3333 - 23s/epoch - 484ms/step
Epoch 5/20
47/47 - 28s - loss: 0.3440 - tp: 43877272.0000 - fp: 1494296.0000 - tn: 95507416.0000 - fn: 5004135.0000 - mean_io_u: 0.3342 - val_loss: 0.3385 - val_tp: 20455492.0000 - val_fp: 135860.0000 - val_tn: 42593616.0000 - val_fn: 909244.0000 - val_mean_io_u: 0.3333 - 28s/epoch - 605ms/step
Epoch 6/20
47/47 - 27s - loss: 0.2719 - tp: 45860256.0000 - fp: 975254.0000 - tn: 96026592.0000 - fn: 3021046.0000 - mean_io_u: 0.3342 - val_loss: 0.2615 - val_tp: 20886848.0000 - val_fp: 140094.0000 - val_tn: 42589384.0000 - val_fn: 477889.0000 - val_mean_io_u: 0.3333 - 27s/epoch - 567ms/step
Here is the code I am running:
# Imports assumed by the snippet (not shown in the original comment)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split

# X_train, y_train, args and DeepLabV3Plus are defined elsewhere
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.3, random_state=0)

data_gen_args = dict(horizontal_flip=True,
                     rotation_range=20,
                     vertical_flip=True)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

seed = 1
bs = 16
image_generator = image_datagen.flow(X_train, seed=seed, batch_size=bs, shuffle=True)
mask_generator = mask_datagen.flow(y_train, seed=seed, batch_size=bs, shuffle=True)
train_generator = zip(image_generator, mask_generator)

METRICS = [
    keras.metrics.TruePositives(name='tp'),
    keras.metrics.FalsePositives(name='fp'),
    keras.metrics.TrueNegatives(name='tn'),
    keras.metrics.FalseNegatives(name='fn'),
    keras.metrics.MeanIoU(num_classes=3),
]

input_shape = (args.input_size, args.input_size, 3)
model = DeepLabV3Plus(input_shape)
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=METRICS)

history = model.fit(train_generator,
                    verbose=2,
                    epochs=args.epochs,
                    steps_per_epoch=(len(X_train) // bs),
                    validation_data=(X_val, y_val),
                    shuffle=False)
I have solved this problem by replacing MeanIoU with OneHotIoU.
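For reference, a minimal sketch of that replacement (OneHotIoU ships in newer TF releases and accepts one-hot y_true with probability y_pred directly; this assumes the 3-class setup and the opt optimizer from the code above):

metric = tf.keras.metrics.OneHotIoU(num_classes=3, target_class_ids=[0, 1, 2])
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=[metric])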
Would you mind contributing this in KerasCV instead?
How can I do that?
Make a PR on https://github.com/keras-team/keras-cv adding a MeanIoU that corrects this.
Hi,
Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information in it may no longer be relevant to the current state of the code base.
The TensorFlow team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration, which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings and all the debugging information that could help us investigate.
Please follow the release notes to stay up to date with the latest developments happening in the TensorFlow space.
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.