
No automatic artifact upload with TensorFlow 2.13

Open Kaczmarekrr opened this issue 1 year ago • 6 comments

Hi! First of all, thanks for a really good tool!

I'm having trouble with logging on newer TensorFlow versions. On older versions (tested on 2.8, but I think anything below 2.11 behaves the same) everything works: tf.keras.callbacks.ModelCheckpoint() saves the model, and the model is automatically uploaded to the ClearML server (a local one in our case).

With the older version, the artifacts contain "outputs models" with an entry for each saved checkpoint. With the newer version (2.13), only a "variables file" is uploaded. It is not the whole model, it is overwritten on every save, and it is not usable at all. The same goes for MODEL CONFIGURATION, which is uploaded with the older version, but here nothing happens. I have already tested every possible saving format and it still does not work.

I noticed a TensorFlow warning saying that an import needs to be fixed:

WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.util has been moved to tensorflow.python.checkpoint.checkpoint. The old module will be deleted in version 2.11.

TrackableSaver, which ClearML uses, was moved from tensorflow.python.training.tracking.util to tensorflow.python.checkpoint.checkpoint.

I hoped that fixing the import would do the job. I tried it, but it is not enough: the warning is gone, but the models are still not logged correctly.
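For reference, the import fix I tried can be sketched roughly as a fallback lookup. This is only an illustration, not ClearML's actual patching code: the helper name `first_available` and the `CANDIDATES` list are made up for this example, and the two module paths are the ones named in the warning above. As noted, resolving the import this way was not enough to restore the uploads.

```python
import importlib

# Candidate locations of TrackableSaver, newest first. Both module paths come
# from the TensorFlow warning; the list itself is a hypothetical helper.
CANDIDATES = [
    ("tensorflow.python.checkpoint.checkpoint", "TrackableSaver"),
    ("tensorflow.python.training.tracking.util", "TrackableSaver"),
]


def first_available(candidates):
    """Return (module_name, attribute) for the first candidate that resolves.

    Returns (None, None) when no candidate can be imported or none of the
    imported modules has the requested attribute.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module is missing in this environment; try the next location.
            continue
        attr = getattr(module, attr_name, None)
        if attr is not None:
            return module_name, attr
    return None, None


if __name__ == "__main__":
    module_name, saver = first_available(CANDIDATES)
    print("TrackableSaver found in:", module_name)
```

Trying the new location first avoids the deprecation warning on versions where both paths still exist.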

To reproduce

Run any training loop on a newer TensorFlow version and compare the logging results with those from an older one.

For testing this issue I used the basic TensorFlow classification tutorial with the code needed by ClearML added.
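A minimal sketch of such a run is below. It is not the exact script I used: it assumes the Fashion MNIST setup from the classification tutorial, and the project name, task name, and checkpoint path are placeholders. TensorFlow and ClearML are imported inside the function so the file can be loaded even where they are not installed.

```python
def run_repro(project_name="tf-checkpoint-repro", epochs=2):
    """Train a small classifier and save a checkpoint after every epoch."""
    # Imported here so that defining the function does not require TF/ClearML.
    import tensorflow as tf
    from clearml import Task

    # Task.init enables ClearML's automatic framework logging for this run.
    task = Task.init(project_name=project_name, task_name="ModelCheckpoint upload")

    # Assumption: the tutorial's Fashion MNIST data, scaled to [0, 1].
    (x_train, y_train), _ = tf.keras.datasets.fashion_mnist.load_data()
    x_train = x_train / 255.0

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

    # One checkpoint per epoch. The extension-less path targets the SavedModel
    # format on TF 2.13; other formats can be tried by changing the suffix.
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        filepath="checkpoints/epoch_{epoch:02d}",
        save_freq="epoch",
    )
    model.fit(x_train, y_train, epochs=epochs, callbacks=[checkpoint])
    task.close()


if __name__ == "__main__":
    run_repro()
```

Running this once on an older TensorFlow version and once on 2.13, then comparing the artifacts of the two tasks, should show the difference described above.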

Environment

  • Server type: self hosted
  • ClearML SDK Version: 1.12.2
  • ClearML Server Version: WebApp: 1.12.1-397 • Server: 1.12.1-397 • API: 2.26
  • Python Version: 3.10.12
  • OS: Linux 22

Kaczmarekrr avatar Sep 05 '23 17:09 Kaczmarekrr

Hi @Kaczmarekrr ! Thank you for letting us know. We will fix this ASAP.

Thanks @eugen-ajechiloae-clearml, that would help a lot. Keras introduced the "Keras v3" format with the .keras extension, which is the recommended format as of TF 2.13. I'm not sure whether this is related to this issue, but it would be nice if ClearML worked with both SavedModel and Keras v3.

niemiaszek avatar Sep 06 '23 10:09 niemiaszek

@niemiaszek Can you please post an example of "Keras v3"? We would like to look into it as well

Sure @eugen-ajechiloae-clearml . It was introduced in the 2.13 release as the default format in place of SavedModel. It can be created by following the example in the Keras documentation. Here is an output model generated from that example: example.keras.zip. I had to zip it to be able to upload it directly here. On closer inspection it contains 3 files: "config.json", "metadata.json" and "model.weights.h5".
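Since a .keras file is itself a zip archive, its contents can be checked with the standard library alone. A small sketch follows; the function names are mine, and the expected member list is simply the three files found in this one example, so other models may contain more.

```python
import zipfile

# The three members observed in the example archive from this comment.
EXPECTED_MEMBERS = {"config.json", "metadata.json", "model.weights.h5"}


def keras_archive_members(path):
    """Return the sorted member names of a .keras archive."""
    with zipfile.ZipFile(path) as archive:
        return sorted(archive.namelist())


def looks_like_keras_v3(path):
    """Check that the file is a zip containing all the expected members."""
    if not zipfile.is_zipfile(path):
        return False
    return EXPECTED_MEMBERS.issubset(keras_archive_members(path))
```

For example, after unzipping the attachment, `keras_archive_members("example.keras")` should list the three files named above.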

niemiaszek avatar Sep 07 '23 13:09 niemiaszek

Hey @Kaczmarekrr! Just letting you know that this issue has been resolved in v1.13.0. Let us know if there are any issues :)

pollfly avatar Oct 02 '23 08:10 pollfly

@pollfly Thanks for letting me know! Already tested. At this moment it works as intended. :))

Kaczmarekrr avatar Oct 02 '23 08:10 Kaczmarekrr