Cannot save model as specified in the documentation

RaphaelRobidas opened this issue 1 year ago • 9 comments

The documentation proposes saving models using either tf.saved_model or tf.keras. The first approach works, but (to my understanding) it saves a model that cannot be trained further. The second approach crashes with the following error:

Traceback (most recent call last):
  File "/home/raphael/.../example.py", line 86, in <module>
    model.save("model0.keras")
  File "/home/raphael/.../python3.10/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/raphael/.../python3.10/site-packages/keras/src/saving/serialization_lib.py", line 395, in _get_class_or_fn_config
    raise TypeError(
TypeError: Cannot serialize object ImmutableDict({}) of type <class 'tensorflow.python.framework.immutable_dict.ImmutableDict'>. To be serializable, a class must implement the `get_config()` method.

Here is an example based on the GAT tutorial that reproduces the error above:

from molgraph.chemistry import datasets
from molgraph.chemistry import features
from molgraph.chemistry import Featurizer
from molgraph.chemistry import MolecularGraphEncoder

from tensorflow import keras
import tensorflow as tf

atom_encoder = Featurizer([
    features.Symbol(),
    features.Hybridization(),
])

bond_encoder = Featurizer([
    features.BondType(),
    features.Conjugated(),
])

encoder = MolecularGraphEncoder(
    atom_encoder,
    bond_encoder,
    positional_encoding_dim=16,
    self_loops=False
)

esol = datasets.get('esol')

x_train = encoder(esol['train']['x'])
y_train = esol['train']['y']

x_val = encoder(esol['validation']['x'])
y_val = esol['validation']['y']

x_test = encoder(esol['test']['x'])
y_test = esol['test']['y']

type_spec = x_train.spec

from molgraph.layers import GATConv
from molgraph.layers import LaplacianPositionalEncoding
from molgraph.layers import Readout
from molgraph.layers import MinMaxScaling

node_preprocessing = MinMaxScaling(
    feature='node_feature', feature_range=(0, 1), threshold=True)
edge_preprocessing = MinMaxScaling(
    feature='edge_feature', feature_range=(0, 1), threshold=True)

train_ds = (
    tf.data.Dataset.from_tensor_slices((x_train, y_train))
    .shuffle(1024)
    .batch(32)
    .prefetch(-1)
)

val_ds = (
    tf.data.Dataset.from_tensor_slices((x_val, y_val))
    .batch(32)
    .prefetch(-1)
)

test_ds = (
    tf.data.Dataset.from_tensor_slices((x_test, y_test))
    .batch(32)
    .prefetch(-1)
)

node_preprocessing.adapt(train_ds.map(lambda x, *args: x))
edge_preprocessing.adapt(train_ds.map(lambda x, *args: x))

model = keras.Sequential([
    keras.layers.Input(type_spec=type_spec),
    node_preprocessing,
    edge_preprocessing,
    LaplacianPositionalEncoding(),
    GATConv(normalization='batch_norm'),
    GATConv(normalization='batch_norm'),
    GATConv(normalization='batch_norm'),
    Readout(),
    keras.layers.Dense(1024, 'relu'),
    keras.layers.Dense(1024, 'relu'),
    keras.layers.Dense(y_train.shape[-1])
])

model.compile(optimizer='adam', loss='mae')
model.predict(x_test)

model.save("model0.keras")

TensorFlow version: 2.15.1
Keras version: 2.15.0
Python version: 3.10.12
MolGraph version: 0.6.6 (10143c6)

RaphaelRobidas avatar Apr 16 '24 17:04 RaphaelRobidas

Thanks for the observation @RaphaelRobidas, I will check it out.

akensert avatar Apr 17 '24 07:04 akensert

It seems to be an issue related to TF's ExtensionType API. The 'auxiliary' field of the GraphTensor is a Mapping[str, tf.Tensor], which internally creates an immutable mapping (ImmutableDict), and that cannot be serialized in the .keras format.
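For illustration, here is a minimal, hypothetical ExtensionType (not molgraph's actual GraphTensor) that shows how a Mapping field ends up as an ImmutableDict:

from typing import Mapping

import tensorflow as tf

class Graph(tf.experimental.ExtensionType):
    node_feature: tf.Tensor
    auxiliary: Mapping[str, tf.Tensor]  # stored internally as an ImmutableDict

g = Graph(node_feature=tf.constant([[1.0]]), auxiliary={})
print(type(g.auxiliary))  # ImmutableDict, which has no get_config()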

So the current solution is to switch from the .keras format to the SavedModel format, by omitting the .keras extension from the filename passed to model.save.
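As a minimal sketch of that workaround (reusing the model and x_test from the example above):

# Writes a TF SavedModel directory instead of a .keras archive.
model.save("model0")

# Reload the SavedModel for further inference.
restored = keras.models.load_model("model0")
restored.predict(x_test)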

akensert avatar Apr 18 '24 13:04 akensert

@akensert Thanks for looking into this.

Your solution does solve the issue in the example perfectly, thanks a lot! It seems to be problematic for some kinds of layers, though. With a model using GATv2Conv layers, I get the following error:

Traceback (most recent call last):
  File "/home/raphael/<path>/gat.py", line 415, in <module>
    gnn_model2 = keras.models.load_model("model0")
  File "/home/<path>/python3.10/site-packages/keras/src/saving/saving_api.py", line 262, in load_model
    return legacy_sm_saving_lib.load_model(
  File "/home/<path>/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filecz6jiapd.py", line 37, in tf__call
    ag__.if_stmt(ag__.converted_call(ag__.ld(graph_tensor).is_ragged, (), None, fscope), if_body, else_body, get_state, set_state, ('graph_tensor',), 1)
AttributeError: Exception encountered when calling layer 'gat_conv_1' (type GATv2Conv).

in user code:

    File "/home/raphael/<path>/molgraph/molgraph/layers/gnn_layer.py", line 205, in call  *
        if graph_tensor.is_ragged():

    AttributeError: 'SymbolicTensor' object has no attribute 'is_ragged'


Call arguments received by layer 'gat_conv_1' (type GATv2Conv):
  • graph_tensor=tf.Tensor(shape=(None, None, 114), dtype=float32)

RaphaelRobidas avatar Apr 19 '24 13:04 RaphaelRobidas

Actually, that's not quite true. The problem occurs when the input specification is not defined explicitly in the model via keras.layers.Input(type_spec=x_train.spec). Adding this seems to solve the issue as far as I can tell!
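As a minimal sketch of that point (reusing the imports and the x_train encoding from the repro script above; the layer stack here is just illustrative):

model = keras.Sequential([
    keras.layers.Input(type_spec=x_train.spec),  # explicit GraphTensor spec
    GATConv(normalization='batch_norm'),
    Readout(),
    keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mae')

model.save("model0")  # SavedModel format (no .keras extension)
reloaded = keras.models.load_model("model0")  # loads without the 'is_ragged' error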

RaphaelRobidas avatar Apr 19 '24 13:04 RaphaelRobidas

@RaphaelRobidas Okay interesting, good to know. And good to know you could save your models eventually, although not in .keras format.

akensert avatar Apr 22 '24 07:04 akensert

Btw, I would like to migrate to Keras 3 (and TF>=2.16); however, Keras 3 does not yet support extension types.

akensert avatar May 03 '24 10:05 akensert

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Jun 03 '24 02:06 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Jun 17 '24 02:06 github-actions[bot]

@RaphaelRobidas: Quite late for a fix, but I've now implemented a (temporary) fix for this (see version 0.6.8). You should now be able to save a model in the .keras format:

from molgraph import GraphTensor
from molgraph import layers
from tensorflow import keras

g = GraphTensor(node_feature=[[4.], [2.]], edge_src=[0], edge_dst=[1])

model = keras.Sequential([
    layers.GNNInput(type_spec=g.spec), # or layers.GNNInputLayer(type_spec)
    layers.GINConv(units=32),
    layers.GINConv(units=32),
    layers.Readout(),
    keras.layers.Dense(units=1),
])

pred = model(g)

model.save('/tmp/tmp_model.keras')
loaded_model = keras.models.load_model('/tmp/tmp_model.keras')
assert pred == loaded_model(g)
loaded_model.summary()

akensert avatar Jul 04 '24 17:07 akensert