Some custom objects are not being serialized with push_to_hub_keras
Self-contained code example:
```python
from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("keras-io/vit-small-ds")
```
Error
```
/usr/local/lib/python3.7/dist-packages/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
    560   if cls is None:
    561     raise ValueError(
--> 562       f'Unknown {printable_module_name}: {class_name}. Please ensure this '
    563       'object is passed to the `custom_objects` argument. See '
    564       'https://www.tensorflow.org/guide/keras/save_and_serialize'

ValueError: Unknown optimizer: Addons>AdamW. Please ensure this object is passed to the `custom_objects` argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.
```
Similarly with:

```python
from huggingface_hub import from_pretrained_keras

model = from_pretrained_keras("carlosaguayo/vit-base-patch16-224-in21k-euroSat")
```
Hey @osanseviero @merveenoyan
I was successful in uploading the custom objects with the `push_to_hub_keras` API. The main steps are:
- Have a `get_config` method for all the custom layers.
- Serialize the tensors that are used in the custom layers.
```python
import math

import tensorflow as tf

NUM_PATCHES = 144  # set to the number of patches used by the model (illustrative placeholder)

class MultiHeadAttentionLSA(tf.keras.layers.MultiHeadAttention):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # The trainable temperature term. The initial value is
        # the square root of the key dimension.
        self.tau = tf.Variable(
            math.sqrt(float(self._key_dim)),
            trainable=True,
        )
        # Build the diagonal attention mask.
        diag_attn_mask = 1 - tf.eye(NUM_PATCHES)
        self.diag_attn_mask = tf.cast([diag_attn_mask], dtype=tf.int8)

    def get_config(self):
        config = super().get_config()
        config.update({
            "tau": self.tau.numpy(),  # <---- IMPORTANT
            "diag_attn_mask": self.diag_attn_mask.numpy(),  # <---- IMPORTANT
        })
        return config
```
- Not using augmentation layers inside the model. TensorFlow 2.7 has a problem with serializing the augmentation layers. Here I have used the `map` function on the `tf.data` pipeline and applied the augmentation as a preprocessing step rather than inside the model (see the sketch below).
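For illustration, a minimal sketch of that preprocessing approach; the augmentation layers and the in-memory `images`/`labels` arrays are placeholders, not the exact pipeline from the notebook:

```python
import tensorflow as tf

# Illustrative augmentation pipeline (placeholder layers).
train_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.02),
])

train_ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))  # assumed in-memory arrays
    .shuffle(1024)
    .batch(32)
    # Augment in the input pipeline, not inside the model, so the
    # saved model has no augmentation layers to serialize.
    .map(
        lambda x, y: (train_augmentation(x, training=True), y),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    .prefetch(tf.data.AUTOTUNE)
)
```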
Colab Notebook used
Usage of the pre-trained vit-small-ds model
```python
from huggingface_hub import from_pretrained_keras

loaded_model = from_pretrained_keras("keras-io/vit-small-ds")

# test_ds is the preprocessed test dataset (see the note below).
_, accuracy, top_5_accuracy = loaded_model.evaluate(test_ds)
print(f"Test accuracy: {round(accuracy * 100, 2)}%")
print(f"Test top 5 accuracy: {round(top_5_accuracy * 100, 2)}%")
```
Note: You have to use the test augmentation to preprocess the input images before sending them to the model.
Hi @ariG23498! This is great and very insightful! :hugs:
The augmentation layer issue is TF 2.7 specific and hopefully it will work with upcoming versions. That way the end user does not need to know any pre/post-processing steps and everything is within the saved model.
I wonder if there's any way to achieve this programmatically instead of expecting users to implement it themselves (cc @Rocketknight1 or @gante might have some ideas). Worst case, we could throw a warning when custom layers are being saved and point to the documentation. WDYT?
@osanseviero I feel like this might not be specific to 2.7. I saw @ariG23498 downgraded TF to 2.6 to save the model, so the model was still saved (his first issue was related to that and we fixed it that way), but the custom layer still needed to be registered for us to load the model. The error we got about AdamW was related to that (see below) and has nothing to do with 2.7. If you have a custom object, you need to register it using the methods he mentioned. See here. We might indeed ask the user to register the custom object with a warning if they'd like to host their model on the Hub. Regardless of this, I'm looking for ways to see if we can infer it from the SavedModel format.
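As a concrete illustration of one registration route from the linked TF guide, a hedged sketch using `tf.keras.utils.register_keras_serializable` (the package name `"vit"` is arbitrary):

```python
import tensorflow as tf

# Registering the class lets Keras resolve it by name at load time,
# without passing custom_objects explicitly.
@tf.keras.utils.register_keras_serializable(package="vit")
class MultiHeadAttentionLSA(tf.keras.layers.MultiHeadAttention):
    # ... same implementation as in the snippet above ...
    pass
```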
> Worst case, we could throw a warning when custom layers are being saved and point to the documentation. WDYT?
Yep. This is the only way to go. Even TensorFlow throws an error while trying to save a custom model. It is a platform-specific error that needs to be checked by the user and not the HF team, IMO.
I agree with what @ariG23498 wrote about custom layers: their flexibility (which makes them hard to automate) can be a boon. And the creators of custom layers are mostly power users anyway, in my experience :D
For completeness, there is a point yet to be addressed in this discussion. The error that @osanseviero originally pointed at can be avoided by importing `tensorflow_addons`, which is needed to load the optimizer. In other words:
- this works

```python
import tensorflow_addons as tfa  # importing tfa registers Addons>AdamW with Keras
from huggingface_hub import from_pretrained_keras

loaded_model = from_pretrained_keras("keras-io/vit-small-ds")
loaded_model.summary()
```
- this doesn't work

```python
from huggingface_hub import from_pretrained_keras

loaded_model = from_pretrained_keras("keras-io/vit-small-ds")  # raises ValueError: Unknown optimizer: Addons>AdamW
loaded_model.summary()
```
Digging deeper, we can see that `push_to_hub_keras` uses `tf.keras.models.save_model` under the hood (here). We might want to set its `include_optimizer` argument to `False`, which removes the optimizer object before serialization, preventing errors like this from optimizers that are not in the standard TensorFlow library.
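A minimal sketch of the proposed change, assuming `model` is the compiled Keras model and `save_directory` is the target path inside `push_to_hub_keras`:

```python
import tensorflow as tf

tf.keras.models.save_model(
    model,
    save_directory,
    include_optimizer=False,  # drop the optimizer so loading does not
                              # require resolving e.g. Addons>AdamW
)
```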
What do you think?
EDIT: at the very least, we can throw a warning when the user is pushing a model to the Hub with this kind of optimizer.
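Purely illustrative, a sketch of what such a check might look like; the module-prefix heuristic is my assumption, not anything in `huggingface_hub`:

```python
import warnings

def warn_on_custom_optimizer(model):
    """Warn when a compiled model uses an optimizer from outside core Keras/TF."""
    optimizer = getattr(model, "optimizer", None)
    if optimizer is None:
        return
    module = type(optimizer).__module__
    # "tensorflow." (with the dot) deliberately excludes e.g.
    # "tensorflow_addons.optimizers...".
    if not module.startswith(("keras.", "tensorflow.")):
        warnings.warn(
            f"Model is compiled with a non-standard optimizer "
            f"({type(optimizer).__name__}). Loading it from the Hub will "
            "require importing the defining library or passing it via "
            "custom_objects."
        )
```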
Hey @gante
I love the insights that you bring to the table.
I think https://github.com/huggingface/huggingface_hub/issues/598 covers the issue that you are talking about. Do let me know what you think.
@gante @ariG23498 @osanseviero
`save_traces` registers every custom layer in the SavedModel by default, so we don't need to register custom objects; my bad. I thought the error @ariG23498 got was related to that because the error message indicated this.
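For reference, a small sketch of the behavior being described, with `model` as a placeholder for a built Keras model containing custom layers:

```python
import tensorflow as tf

# save_traces=True is the default: the SavedModel keeps the traced
# forward passes of custom layers, so they can be reloaded without
# being registered as custom objects.
model.save("saved_model_dir", save_traces=True)

# Reload without custom_objects; Keras falls back to the saved traces.
restored = tf.keras.models.load_model("saved_model_dir")
```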
Now, we only need to change `include_optimizer`, and maybe we could change `signatures` for TF Lite users, related to #598.
For this one I'm planning to test `from_pretrained_keras()` on models with custom objects and see the error pattern; if it's raised, prompt the user to pass the custom objects through `custom_objects` in `**kwargs` when loading instead. It seems the only reasonable way, given the user has to implement things.
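A hedged sketch of what that workaround would look like for the user, assuming `**kwargs` are forwarded to `tf.keras.models.load_model` as suggested above:

```python
from huggingface_hub import from_pretrained_keras

# Assumes MultiHeadAttentionLSA is defined locally
# (see the snippet earlier in the thread).
loaded_model = from_pretrained_keras(
    "keras-io/vit-small-ds",
    custom_objects={"MultiHeadAttentionLSA": MultiHeadAttentionLSA},
)
```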
BTW, weirdly enough, when you import AdamW without actually recompiling the model, the error `ValueError: Unknown optimizer: Addons>AdamW. Please ensure this object is passed to the custom_objects argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.` goes away.