
Parametric UMAP Model Broken on Latest Tensorflow Platform?

Open ayaanhossain opened this issue 2 years ago • 2 comments

Hello, let me begin by saying I really appreciate UMAP, and have used the non-parametric version in the past. I recently learned about Parametric UMAP and wanted to give it a shot for a project. I was able to define my encoder/decoder and train a model quickly at first on an EC2 instance without a GPU. However, when I tried to save the model for later use, I was unable to do so.

pumap_model.save('../models/pumap_v1')

triggered the following.

WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[44], line 1
----> 1 pumap_model.save('../models/pumap')

File ~/miniconda3/envs/art/lib/python3.10/site-packages/umap/parametric_umap.py:494, in ParametricUMAP.save(self, save_location, verbose)
    492 if self.encoder is not None:
    493     encoder_output = os.path.join(save_location, "encoder")
--> 494     self.encoder.save(encoder_output)
    495     if verbose:
    496         print("Keras encoder model saved to {}".format(encoder_output))

File ~/miniconda3/envs/art/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~/miniconda3/envs/art/lib/python3.10/site-packages/tensorflow/python/trackable/data_structures.py:823, in _DictWrapper.__getattribute__(self, name)
    821   return object.__getattribute__(self, name)
    822 else:
--> 823   return super().__getattribute__(name)

TypeError: this __dict__ descriptor does not support '_DictWrapper' objects

This is without using a custom encoder or decoder. I believe the same issue appears when saving a Parametric UMAP model trained on the MNIST dataset with the latest versions of the libraries. For reference:

$ python
Python 3.10.9 | packaged by conda-forge | (main, Feb  2 2023, 20:20:04) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import umap
>>> umap.__version__
'0.5.3'
>>>
>>> import tensorflow as tf
>>> tf.__version__
'2.12.0-rc0'
>>>
>>> import tensorflow_probability as tfp
>>> tfp.__version__
'0.19.0'
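
For anyone trying to reproduce this, the same version information can be gathered in one step. This is a minimal sketch using only the standard library; the helper name `collect_versions` is hypothetical, and the distribution names are assumed to be the usual PyPI ones (`umap-learn`, `tensorflow`, `tensorflow-probability`).

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI (assumed; adjust if your
# environment installs them under different names).
PACKAGES = ("umap-learn", "tensorflow", "tensorflow-probability")


def collect_versions(packages=PACKAGES):
    """Return {distribution name: installed version or 'not installed'}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is absent from the current environment.
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in collect_versions().items():
        print(f"{name}=={ver}")
```

Reading the installed metadata avoids importing the packages themselves, which helps when the import is the step that crashes.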

Later, after moving to a GPU instance with the same package versions, the model.fit(..) step fails. Here is the output:

ParametricUMAP(autoencoder_loss=True, decoder=<keras.engine.sequential.Sequential object at 0x7fc1f7f89ab0>, dims=(16233,), encoder=<keras.engine.sequential.Sequential object at 0x7fc1f7ce6590>, optimizer=<keras.optimizers.adam.Adam object at 0x7fc1ea30b3a0>, parametric_reconstruction=True, run_eagerly=True)
Sun Mar  5 06:00:46 2023 Construct fuzzy simplicial set
Sun Mar  5 06:00:47 2023 Finding Nearest Neighbors
Sun Mar  5 06:00:49 2023 Finished Nearest Neighbor Search
Sun Mar  5 06:00:50 2023 Construct embedding
Epoch 1/10
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[56], line 1
----> 1 pumap_model.fit(data[:10, :]);

File ~/proj/notebook/pumap.py:203, in ParametricUMAP.fit(self, X, y, precomputed_distances)
    201     return super().fit(precomputed_distances, y)
    202 else:
--> 203     return super().fit(X, y)

File ~/miniconda3/envs/art/lib/python3.10/site-packages/umap_learn-0.5.3-py3.10.egg/umap/umap_.py:2736, in UMAP.fit(self, X, y)
   2734 if self.transform_mode == "embedding":
   2735     epochs = self.n_epochs_list if self.n_epochs_list is not None else self.n_epochs
-> 2736     self.embedding_, aux_data = self._fit_embed_data(
   2737         self._raw_data[index],
   2738         epochs,
   2739         init,
   2740         random_state,  # JH why raw data?
   2741     )
   2743     if self.n_epochs_list is not None:
   2744         if "embedding_list" not in aux_data:

File ~/proj/notebook/pumap.py:463, in ParametricUMAP._fit_embed_data(self, X, n_epochs, init, random_state)
    460     validation_data = None
    462 # create embedding
--> 463 history = self.parametric_model.fit(
    464     edge_dataset,
    465     epochs=self.loss_report_frequency * self.n_training_epochs,
    466     steps_per_epoch=steps_per_epoch,
    467     max_queue_size=100,
    468     validation_data=validation_data,
    469     **self.keras_fit_kwargs
    470 )
    471 # save loss history dictionary
    472 self._history = history.history

File ~/miniconda3/envs/art/lib/python3.10/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~/miniconda3/envs/art/lib/python3.10/site-packages/keras/engine/training.py:1697, in Model.fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1695 logs = tf_utils.sync_to_numpy_or_python_type(logs)
   1696 if logs is None:
-> 1697     raise ValueError(
   1698         "Unexpected result of `train_function` "
   1699         "(Empty logs). Please use "
   1700         "`Model.compile(..., run_eagerly=True)`, or "
   1701         "`tf.config.run_functions_eagerly(True)` for more "
   1702         "information of where went wrong, or file a "
   1703         "issue/bug to `tf.keras`."
   1704     )
   1705 # Override with model metrics instead of last step logs
   1706 logs = self._validate_and_get_metrics_result(logs)

ValueError: Unexpected result of `train_function` (Empty logs). Please use `Model.compile(..., run_eagerly=True)`, or `tf.config.run_functions_eagerly(True)` for more information of where went wrong, or file a issue/bug to `tf.keras`.

Thanks in advance for your help.

ayaanhossain avatar Mar 05 '23 01:03 ayaanhossain

I haven't tried installing the pre-release of tensorflow that you are using, but everything works fine in the current version.

If this is an unstable pre-release they might fix things on their end.
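
A rough way to tell whether an installed version string is a pre-release, such as the `2.12.0-rc0` reported above, is sketched below. The helper `is_prerelease` is hypothetical and uses a simple regular-expression heuristic rather than a full PEP 440 parser, so treat it as an approximation.

```python
import re

# Heuristic: a dotted numeric release followed by a pre-release marker
# (a, b, rc, alpha, beta, dev), optionally separated by '-', '.', or '_'.
_PRERELEASE = re.compile(
    r"^\d+(\.\d+)*[-._]?(a|b|rc|alpha|beta|dev)\d*", re.IGNORECASE
)


def is_prerelease(version_string):
    """Return True if the version string looks like a pre-release."""
    return bool(_PRERELEASE.match(version_string))


if __name__ == "__main__":
    for v in ("2.12.0-rc0", "2.16.1", "0.5.3"):
        print(v, is_prerelease(v))
```

If the check returns True for the installed TensorFlow, reproducing the problem on a stable release first would help separate a UMAP bug from a pre-release regression.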

timsainb avatar Mar 11 '23 22:03 timsainb

As of now, with the latest TensorFlow (2.16.1), I get the above crash as soon as I import umap :-/

sbrl avatar Jun 01 '24 02:06 sbrl