Katherine Wu

12 comments by Katherine Wu

This fails because the MHA layer runs extra instructions when it is deserialized with `from_config`, and those instructions are not run when the layer is initialized with `num_heads` and `key_dim`: https://github.com/keras-team/keras/blob/v2.9.0/keras/layers/attention/multi_head_attention.py#L303 Without this...

+@rchao This kind of issue is somewhat common: anyone who tries to create a subclassed MHA layer will run into it. The new idempotent saving format will also see it,...

@SirDavidLudwig In the code snippet in my previous comment, you can pass either `embed_dim` and `num_heads`, or the MHA layer itself, into the constructor. The `mha` argument is needed only for `from_config()`.
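
The snippet from the previous comment is not shown here, so the following is only a framework-free sketch of the constructor pattern described above, not the original code. `StandInAttention` and `AttentionBlock` are hypothetical names, the stand-in class merely imitates a layer's `get_config`/`from_config` pair, and mapping `embed_dim` to `key_dim` is an assumption made for illustration.

```python
class StandInAttention:
    """Hypothetical stand-in for an attention layer (not the Keras class)."""

    def __init__(self, num_heads, key_dim):
        self.num_heads = num_heads
        self.key_dim = key_dim

    def get_config(self):
        return {"num_heads": self.num_heads, "key_dim": self.key_dim}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


class AttentionBlock:
    """Hypothetical wrapper built from (embed_dim, num_heads) or from `mha`."""

    def __init__(self, embed_dim=None, num_heads=None, mha=None):
        if mha is None:
            if embed_dim is None or num_heads is None:
                raise ValueError("Pass either embed_dim and num_heads, or mha.")
            # Assumption: embed_dim is used as the inner layer's key_dim.
            mha = StandInAttention(num_heads=num_heads, key_dim=embed_dim)
        self.mha = mha

    def get_config(self):
        # Store the inner layer's own config instead of the raw arguments.
        return {"mha": self.mha.get_config()}

    @classmethod
    def from_config(cls, config):
        # Rebuild the inner layer through its own from_config, then hand the
        # finished layer to the constructor via the `mha` argument.
        mha = StandInAttention.from_config(config["mha"])
        return cls(mha=mha)


block = AttentionBlock(embed_dim=64, num_heads=4)
restored = AttentionBlock.from_config(block.get_config())
```

In this sketch, user code only ever needs the `embed_dim`/`num_heads` form; the `mha` form exists so that deserialization goes through the inner layer's `from_config`.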

Tagging @rchao

The best way to avoid this issue is to disable layer tracing when creating the SavedModel, but you'll then have to define the `serving_default` function manually (this is the default...

@gcunhase Are you getting the same error even with `save_traces=False`?

Thanks for the PR! I think this counts as a shallow copy, since the values aren't being copied before being added to the new dict. You could replace the loop...
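
To illustrate the shallow-copy point in general terms, here is a small standard-library sketch. The dictionary contents are made up for the example and are not taken from the PR under review.

```python
import copy

# Hypothetical data: one mutable value (a list) and one immutable value.
original = {"layers": [1, 2, 3], "name": "model"}

# Copying key by key creates a new dict but reuses the same value objects,
# so this is a shallow copy.
shallow = {}
for key, value in original.items():
    shallow[key] = value

# copy.deepcopy recursively copies the values as well.
deep = copy.deepcopy(original)

# Mutating a value through the original is visible through the shallow copy
# but not through the deep copy.
original["layers"].append(4)

shallow_shares_list = shallow["layers"] is original["layers"]
deep_shares_list = deep["layers"] is original["layers"]
```

After the `append`, `shallow["layers"]` reflects the change because both dicts point at the same list, while `deep["layers"]` keeps its own independent copy.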