Parametric UMAP Model Broken on Latest Tensorflow Platform?
Hello, let me begin by saying I really appreciate UMAP, and have used the non-parametric version in the past. I recently learned about Parametric UMAP and wanted to give it a shot for a project. I was able to define my encoder/decoder and train a model quickly at first on an EC2 instance without a GPU. However, when trying to save the model for later use, I was unable to do so.
pumap_model.save('../models/pumap_v1')
triggered the following.
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[44], line 1
----> 1 pumap_model.save('../models/pumap')
File ~/miniconda3/envs/art/lib/python3.10/site-packages/umap/parametric_umap.py:494, in ParametricUMAP.save(self, save_location, verbose)
492 if self.encoder is not None:
493 encoder_output = os.path.join(save_location, "encoder")
--> 494 self.encoder.save(encoder_output)
495 if verbose:
496 print("Keras encoder model saved to {}".format(encoder_output))
File ~/miniconda3/envs/art/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File ~/miniconda3/envs/art/lib/python3.10/site-packages/tensorflow/python/trackable/data_structures.py:823, in _DictWrapper.__getattribute__(self, name)
821 return object.__getattribute__(self, name)
822 else:
--> 823 return super().__getattribute__(name)
TypeError: this __dict__ descriptor does not support '_DictWrapper' objects
This is without using a custom encoder or decoder. I believe the same issue appears when trying to save a Parametric UMAP model trained on the MNIST dataset with the latest versions of the libraries. For reference:
$ python
Python 3.10.9 | packaged by conda-forge | (main, Feb 2 2023, 20:20:04) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import umap
>>> umap.__version__
'0.5.3'
>>>
>>> import tensorflow as tf
>>> tf.__version__
'2.12.0-rc0'
>>>
>>> import tensorflow_probability as tfp
>>> tfp.__version__
'0.19.0'
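For anyone trying to reproduce this, the same version information can be collected in one go with a small helper like the sketch below. The package list is an assumption about what is relevant here, not something the maintainers asked for; `wrapt` is included only because the `_DictWrapper` class in the traceback is built on it, so its version might matter. `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Distribution names as published on PyPI. Which packages matter for this
# report is an assumption; adjust the list as needed.
PACKAGES = ["umap-learn", "tensorflow", "tensorflow-probability", "keras", "wrapt"]


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting that output alongside a traceback makes it easier to compare a working environment against a failing one.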
Later, when moving to a GPU instance with the same package versions, the model.fit(..) step fails. Here's some output.
ParametricUMAP(autoencoder_loss=True, decoder=<keras.engine.sequential.Sequential object at 0x7fc1f7f89ab0>, dims=(16233,), encoder=<keras.engine.sequential.Sequential object at 0x7fc1f7ce6590>, optimizer=<keras.optimizers.adam.Adam object at 0x7fc1ea30b3a0>, parametric_reconstruction=True, run_eagerly=True)
Sun Mar 5 06:00:46 2023 Construct fuzzy simplicial set
Sun Mar 5 06:00:47 2023 Finding Nearest Neighbors
Sun Mar 5 06:00:49 2023 Finished Nearest Neighbor Search
Sun Mar 5 06:00:50 2023 Construct embedding
Epoch 1/10
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[56], line 1
----> 1 pumap_model.fit(data[:10, :]);
File ~/proj/notebook/pumap.py:203, in ParametricUMAP.fit(self, X, y, precomputed_distances)
201 return super().fit(precomputed_distances, y)
202 else:
--> 203 return super().fit(X, y)
File ~/miniconda3/envs/art/lib/python3.10/site-packages/umap_learn-0.5.3-py3.10.egg/umap/umap_.py:2736, in UMAP.fit(self, X, y)
2734 if self.transform_mode == "embedding":
2735 epochs = self.n_epochs_list if self.n_epochs_list is not None else self.n_epochs
-> 2736 self.embedding_, aux_data = self._fit_embed_data(
2737 self._raw_data[index],
2738 epochs,
2739 init,
2740 random_state, # JH why raw data?
2741 )
2743 if self.n_epochs_list is not None:
2744 if "embedding_list" not in aux_data:
File ~/proj/notebook/pumap.py:463, in ParametricUMAP._fit_embed_data(self, X, n_epochs, init, random_state)
460 validation_data = None
462 # create embedding
--> 463 history = self.parametric_model.fit(
464 edge_dataset,
465 epochs=self.loss_report_frequency * self.n_training_epochs,
466 steps_per_epoch=steps_per_epoch,
467 max_queue_size=100,
468 validation_data=validation_data,
469 **self.keras_fit_kwargs
470 )
471 # save loss history dictionary
472 self._history = history.history
File ~/miniconda3/envs/art/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
File ~/miniconda3/envs/art/lib/python3.10/site-packages/keras/engine/training.py:1697, in Model.fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1695 logs = tf_utils.sync_to_numpy_or_python_type(logs)
1696 if logs is None:
-> 1697 raise ValueError(
1698 "Unexpected result of `train_function` "
1699 "(Empty logs). Please use "
1700 "`Model.compile(..., run_eagerly=True)`, or "
1701 "`tf.config.run_functions_eagerly(True)` for more "
1702 "information of where went wrong, or file a "
1703 "issue/bug to `tf.keras`."
1704 )
1705 # Override with model metrics instead of last step logs
1706 logs = self._validate_and_get_metrics_result(logs)
ValueError: Unexpected result of `train_function` (Empty logs). Please use `Model.compile(..., run_eagerly=True)`, or `tf.config.run_functions_eagerly(True)` for more information of where went wrong, or file a issue/bug to `tf.keras`.
Thanks in advance for your help.
I haven't tried installing the pre-release of tensorflow that you are using, but everything works fine in the current version.
If this is an unstable pre-release they might fix things on their end.
As of now with the latest Tensorflow 2.16.1 I get the above crash as soon as I import umap :-/
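If the failure happens at import time, the full traceback can be captured without killing the session, which makes it easier to post here. The following is only a rough sketch: `try_import` is a hypothetical helper, and it assumes the error is an ordinary Python exception rather than a hard crash of the interpreter.

```python
import importlib
import traceback


def try_import(module_name):
    """Import a module by name; return (module, None) or (None, traceback text)."""
    try:
        return importlib.import_module(module_name), None
    except Exception:  # import-time failures are not always ImportError
        return None, traceback.format_exc()


if __name__ == "__main__":
    # Check the base package and the parametric submodule separately.
    for name in ("umap", "umap.parametric_umap"):
        module, error = try_import(name)
        if error is None:
            print(f"{name}: imported OK")
        else:
            print(f"{name}: failed to import\n{error}")
```

Checking `umap` and `umap.parametric_umap` separately shows whether the base package itself is affected or only the TensorFlow-dependent part.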