Some custom objects are not being serialized with push_to_hub_keras
Self-contained code example:
```python
from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("keras-io/vit-small-ds")
```
Error
```
/usr/local/lib/python3.7/dist-packages/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
    560   if cls is None:
    561     raise ValueError(
--> 562       f'Unknown {printable_module_name}: {class_name}. Please ensure this '
    563       'object is passed to the `custom_objects` argument. See '
    564       'https://www.tensorflow.org/guide/keras/save_and_serialize'

ValueError: Unknown optimizer: Addons>AdamW. Please ensure this object is passed to the `custom_objects` argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.
```
Similarly with:

```python
from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("carlosaguayo/vit-base-patch16-224-in21k-euroSat")
```
Hey @osanseviero @merveenoyan
I was successful in uploading the custom objects with the `push_to_hub_keras` API. The main steps are:
- Have a `get_config` method for all the custom layers.
- Serialize the tensors that are used in the custom layers.
```python
import math

import tensorflow as tf

NUM_PATCHES = 144  # set to the number of patches used by the model (illustrative placeholder)

class MultiHeadAttentionLSA(tf.keras.layers.MultiHeadAttention):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The trainable temperature term. The initial value is
        # the square root of the key dimension.
        self.tau = tf.Variable(
            math.sqrt(float(self._key_dim)),
            trainable=True,
        )
        # Build the diagonal attention mask.
        diag_attn_mask = 1 - tf.eye(NUM_PATCHES)
        self.diag_attn_mask = tf.cast([diag_attn_mask], dtype=tf.int8)

    def get_config(self):
        config = super().get_config()
        config.update({
            "tau": self.tau.numpy(),  # <---- IMPORTANT
            "diag_attn_mask": self.diag_attn_mask.numpy(),  # <---- IMPORTANT
        })
        return config
```
- Not using augmentation layers inside the model. TensorFlow 2.7 has a problem with serializing the augmentation layers. Here I have used the `map` function on the `tf.data` pipeline and applied the augmentation as a preprocessing step rather than inside the model (see the sketch below).
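For illustration, a minimal sketch of that preprocessing approach; the augmentation layers and the in-memory `images`/`labels` arrays are placeholders, not the exact pipeline from the notebook:

```python
import tensorflow as tf

# Illustrative augmentation pipeline (placeholder layers).
train_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.02),
])

train_ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))  # assumed in-memory arrays
    .shuffle(1024)
    .batch(32)
    # Augment in the input pipeline, not inside the model, so the
    # saved model has no augmentation layers to serialize.
    .map(
        lambda x, y: (train_augmentation(x, training=True), y),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    .prefetch(tf.data.AUTOTUNE)
)
```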
Colab Notebook used
Usage of the pre-trained vit-small-ds model
```python
from huggingface_hub import from_pretrained_keras

loaded_model = from_pretrained_keras("keras-io/vit-small-ds")

# test_ds is the preprocessed test dataset (see the note below).
_, accuracy, top_5_accuracy = loaded_model.evaluate(test_ds)
print(f"Test accuracy: {round(accuracy * 100, 2)}%")
print(f"Test top 5 accuracy: {round(top_5_accuracy * 100, 2)}%")
```
Note: You have to use the test augmentation to preprocess the input images before sending them to the model.
Hi @ariG23498! This is great and very insightful! :hugs:
The augmentation layer issue is TF 2.7 specific and hopefully it will work with upcoming versions. That way the end user does not need to know any pre/post-processing steps and everything is within the saved model.
I wonder if there's any way to achieve this programmatically instead of expecting users to implement it themselves (cc @Rocketknight1 or @gante might have some ideas). Worst case, we could throw a warning when custom layers are being saved and point to the documentation. WDYT?
@osanseviero I feel like this might not be specific to 2.7. I saw @ariG23498 downgraded TF to 2.6 to save the model, so the model was still saved (his first issue was related to that and we fixed it that way), but the custom layer still needed to be registered for us to load the model. The error we got about AdamW was related to that (see below) and has nothing to do with 2.7. If you have a custom object, you need to register it using the methods he mentioned. See here. We might indeed ask the user to register the custom object with a warning if they'd like to host their model on the Hub. Regardless of this, I'm looking for ways to see if we can infer it from the SavedModel format.
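As a concrete illustration of one registration route from the linked TF guide, a hedged sketch using `tf.keras.utils.register_keras_serializable` (the package name `"vit"` is arbitrary):

```python
import tensorflow as tf

# Registering the class lets Keras resolve it by name at load time,
# without passing custom_objects explicitly.
@tf.keras.utils.register_keras_serializable(package="vit")
class MultiHeadAttentionLSA(tf.keras.layers.MultiHeadAttention):
    # ... same implementation as in the snippet above ...
    pass
```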
> Worst case, we could throw a warning when custom layers are being saved and point to the documentation. WDYT?
Yep. This is the only way to go. Even TensorFlow throws an error while trying to save a custom model. It is a platform-specific error that needs to be checked by the user and not the HF team, IMO.
I agree with what @ariG23498 wrote about custom layers: their flexibility (which makes them hard to automate) can be a boon. And the creators of custom layers are mostly power users anyway, in my experience :D
For completeness, there is a point yet to be addressed in this discussion. The error that @osanseviero originally pointed at can be avoided by importing `tensorflow_addons`, which is needed to load the optimizer. In other words:
- this works

```python
import tensorflow_addons as tfa  # importing tfa registers Addons>AdamW with Keras
from huggingface_hub import from_pretrained_keras

loaded_model = from_pretrained_keras("keras-io/vit-small-ds")
loaded_model.summary()
```
- this doesn't work

```python
from huggingface_hub import from_pretrained_keras

loaded_model = from_pretrained_keras("keras-io/vit-small-ds")  # raises ValueError: Unknown optimizer: Addons>AdamW
loaded_model.summary()
```
Digging deeper, we can see that `push_to_hub_keras` uses `tf.keras.models.save_model` under the hood (here). We might want to set its `include_optimizer` argument to `False`, which removes the optimizer object before serialization, preventing errors like this from optimizers that are not in the standard TensorFlow library.
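A minimal sketch of the proposed change, assuming `model` is the compiled Keras model and `save_directory` is the target path inside `push_to_hub_keras`:

```python
import tensorflow as tf

tf.keras.models.save_model(
    model,
    save_directory,
    include_optimizer=False,  # drop the optimizer so loading does not
                              # require resolving e.g. Addons>AdamW
)
```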
What do you think?
EDIT: at the very least, we can throw a warning when the user is pushing a model to the Hub with this kind of optimizer.
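Purely illustrative, a sketch of what such a check might look like; the module-prefix heuristic is my assumption, not anything in `huggingface_hub`:

```python
import warnings

def warn_on_custom_optimizer(model):
    """Warn when a compiled model uses an optimizer from outside core Keras/TF."""
    optimizer = getattr(model, "optimizer", None)
    if optimizer is None:
        return
    module = type(optimizer).__module__
    # "tensorflow." (with the dot) deliberately excludes e.g.
    # "tensorflow_addons.optimizers...".
    if not module.startswith(("keras.", "tensorflow.")):
        warnings.warn(
            f"Model is compiled with a non-standard optimizer "
            f"({type(optimizer).__name__}). Loading it from the Hub will "
            "require importing the defining library or passing it via "
            "custom_objects."
        )
```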
Hey @gante
I love the insights that you bring to the table.
I think https://github.com/huggingface/huggingface_hub/issues/598 covers the issue that you are talking about. Do let me know what you think.
@gante @ariG23498 @osanseviero
`save_traces` registers every custom layer in the SavedModel by default, so we don't need to register custom objects; my bad. I thought the error @ariG23498 got was related to that because the error message indicated this.
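For reference, a small sketch of the behavior being described, with `model` as a placeholder for a built Keras model containing custom layers:

```python
import tensorflow as tf

# save_traces=True is the default: the SavedModel keeps the traced
# forward passes of custom layers, so they can be reloaded without
# being registered as custom objects.
model.save("saved_model_dir", save_traces=True)

# Reload without custom_objects; Keras falls back to the saved traces.
restored = tf.keras.models.load_model("saved_model_dir")
```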
Now, we only need to change `include_optimizer`, and maybe we could change `signatures` for TF Lite users, related to #598.
For this one I'm planning to test `from_pretrained_keras()` on models with custom objects and see the error pattern; if it's raised, prompt the user to pass the custom objects through `custom_objects` in `**kwargs` when loading instead. It seems the only reasonable way, given the user has to implement things.
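A hedged sketch of what that workaround would look like for the user, assuming `**kwargs` are forwarded to `tf.keras.models.load_model` as suggested above:

```python
from huggingface_hub import from_pretrained_keras

# Assumes MultiHeadAttentionLSA is defined locally
# (see the snippet earlier in the thread).
loaded_model = from_pretrained_keras(
    "keras-io/vit-small-ds",
    custom_objects={"MultiHeadAttentionLSA": MultiHeadAttentionLSA},
)
```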
BTW, weirdly enough, when you import AdamW without actually recompiling the model, the error `ValueError: Unknown optimizer: Addons>AdamW. Please ensure this object is passed to the custom_objects argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.` goes away.