
Bug: ValueError when loading model previously cached but now missing

Open sammlapp opened this issue 1 year ago • 5 comments

What happened?

Loading a TensorFlow Hub model that I previously loaded results in an error. As described in https://github.com/tensorflow/hub/issues/575, TF Hub looks for the model in an existing temp directory: the model files are gone, but the cache folder still exists. If I manually delete the folder and re-run tensorflow_hub.load(url), it works properly. It therefore seems there is a bug in the way TF Hub checks for cached models.

The correct behavior would be to re-download the model if it does not exist locally, or to use the cached version if it does.

It seems as if this PR should have resolved the issue: https://github.com/tensorflow/hub/pull/602

A related issue: https://stackoverflow.com/questions/63078695/savedmodel-file-does-not-exist-when-using-tensorflow-hub
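
For reference, the manual workaround boils down to deleting the stale cache folder before reloading. A minimal sketch, assuming the default cache location (a tfhub_modules folder under the system temp directory, as in the traceback below):

import os
import shutil
import tempfile
import tensorflow_hub

url = 'https://tfhub.dev/google/yamnet/1'
# Default TF Hub cache location when TFHUB_CACHE_DIR is not set.
stale_cache = os.path.join(tempfile.gettempdir(), 'tfhub_modules')
# Remove the stale cache so the next load re-downloads the model.
shutil.rmtree(stale_cache, ignore_errors=True)
model = tensorflow_hub.load(url)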

Relevant code

import tensorflow_hub
url = 'https://tfhub.dev/google/yamnet/1'
tensorflow_hub.load(url)

Relevant log output

ValueError: Trying to load a model of incompatible/unknown type. '/var/folders/d8/265wdp1n0bn_r85dh3pp95fh0000gq/T/tfhub_modules/9616fd04ec2360621642ef9455b84f4b668e219e' contains neither 'saved_model.pb' nor 'saved_model.pbtxt'.

tensorflow_hub Version

other (please specify)

TensorFlow Version

other (please specify)

Other libraries

TensorFlow: 2.13.0, TensorFlow Hub: 0.14.0

Python Version

3.x

OS

macOS

sammlapp avatar Oct 06 '23 17:10 sammlapp

@sammlapp,

To avoid TF-Hub looking in the temp directory for cached models, you can customise the download location by setting the environment variable TFHUB_CACHE_DIR (recommended) or by passing the command-line flag --tfhub_cache_dir. Users who prefer persistent caching across system reboots can set TFHUB_CACHE_DIR to a location in their home directory; be aware that there is no automatic cleanup of a persistent cache location.
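
For example, a minimal sketch of setting the environment variable from Python before tensorflow_hub is imported (the cache path here is just an illustrative choice):

import os
# Point TF Hub at a persistent cache directory instead of the system temp dir.
os.environ['TFHUB_CACHE_DIR'] = os.path.expanduser('~/.cache/tfhub_modules')

import tensorflow_hub as hub
model = hub.load('https://tfhub.dev/google/yamnet/1')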

I would also recommend downloading the model from tfhub.dev (with the assets and variables folders and the .pb file), saving it to your home directory, and passing the downloaded model folder path to hub.load() to load the model from local storage instead of the temp directory. Example below:

model = hub.load("/Users/singhniraj/Downloads/universal-sentence-encoder-multilingual-large_3/")

Another option would be to instruct the tensorflow_hub library to read models directly from remote storage (GCS) instead of downloading them locally. That way, no caching directory is needed.
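
A sketch of that approach, using the load-format switch described in the caching guide referenced below:

import os
# Read the uncompressed SavedModel directly from remote storage,
# skipping the local download/cache step entirely.
os.environ['TFHUB_MODEL_LOAD_FORMAT'] = 'UNCOMPRESSED'

import tensorflow_hub as hub
model = hub.load('https://tfhub.dev/google/yamnet/1')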

Ref: Caching model downloads from TF Hub. Thank you!

singhniraj08 avatar Oct 11 '23 05:10 singhniraj08

Thanks for the workaround. In my case, I am developing a package where the tensorflow hub usage is "under the hood" and not part of the user's experience. I feel it should be possible to avoid this buggy default download behavior without requiring the user to make changes to their TF Hub environment variables.
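
One possible library-side stopgap until this is fixed upstream is to catch the error, clear the stale cache entry, and retry. A hedged sketch (load_hub_model is a hypothetical helper, and the path assumes the default temp-directory cache):

import os
import shutil
import tempfile
import tensorflow_hub as hub

def load_hub_model(url):
    # Hypothetical helper: retry once after clearing a stale cache.
    try:
        return hub.load(url)
    except ValueError:
        # The cached folder exists but the SavedModel files are gone;
        # remove the default cache directory and download again.
        shutil.rmtree(os.path.join(tempfile.gettempdir(), 'tfhub_modules'),
                      ignore_errors=True)
        return hub.load(url)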

sammlapp avatar Oct 11 '23 18:10 sammlapp

@sammlapp, Thank you for the feedback. Let us discuss this feature implementation internally and we will update this thread. Thanks!

singhniraj08 avatar Oct 12 '23 08:10 singhniraj08

Are there any plans to address this issue? Thanks

sammlapp avatar Jan 12 '24 23:01 sammlapp

@sammlapp,

tfhub.dev has been merged into the Kaggle Models hub. You can refer to this for updates. Future improvements will be driven by the Kaggle team. Thank you!

singhniraj08 avatar Jan 18 '24 04:01 singhniraj08

Closing this due to inactivity. Please take a look at the answers provided above, and feel free to reopen and post your comments if you still have queries on this. Thank you!

singhniraj08 avatar Mar 08 '24 04:03 singhniraj08