transformers
CLIP default download location is /root/.cache/..., not current working dir like other models
System Info
- `transformers` version: 4.27.3
- Platform: Linux-5.10.147+-x86_64-with-glibc2.31
- Python version: 3.9.16
- Huggingface_hub version: 0.13.3
- PyTorch version (GPU?): 1.13.1+cu116 (False)
- Tensorflow version (GPU?): 2.11.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.6.4 (cpu)
- Jax version: 0.3.25
- JaxLib version: 0.3.25
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
flax: @sanchit-gandhi
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
```python
clip_model = jax.device_get(FlaxCLIPModel.from_pretrained('openai/clip-vit-large-patch14'))
```

This downloads the weights by default if they don't already exist locally:

```
Downloading flax_model.msgpack: 100% 1.71G/1.71G [00:08<00:00, 210MB/s]
```
BUT, unlike all other models [that I'm using in the HF pipelines for SD-Flax], the file download location is far away from the working directory:

```shell
find / -iname 'flax_model.msgpack'
```

shows that the SD weights are where they should be, but CLIP's weights are off in some hidden, hashed directory:

```
/root/.cache/huggingface/hub/models--openai--clip-vit-large-patch14/snapshots/8d052a0f05efbaefbc9e8786ba291cfdf93e5bff/flax_model.msgpack
```
Is this intended? If so, why break from the pattern of other models that download to the current working directory?
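For reference, that "hashed" cache layout is deterministic rather than random: the top-level folder name is derived directly from the repo id, and the snapshot subfolder is the commit hash of the revision that was downloaded. A minimal sketch of the naming rule (the helper function here is hypothetical, written to mirror the layout shown in the path above):

```python
# Sketch of how the Hub cache derives its top-level folder name from a
# repo id: "{type}s--" prefix, with "/" in the repo id replaced by "--".
# The snapshot subdirectory underneath it is the revision's commit hash.

def repo_cache_folder(repo_id: str, repo_type: str = "model") -> str:
    """Return the cache folder name for a given Hub repo id."""
    return f"{repo_type}s--" + repo_id.replace("/", "--")

print(repo_cache_folder("openai/clip-vit-large-patch14"))
# models--openai--clip-vit-large-patch14
```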
Expected behavior
Files would download to the current working directory, e.g. something like `/content/openai/clip-vit-large-patch14/`, and by extension the `_name_or_path` value of `'openai/clip-vit-large-patch14'` would be one and the same as the file location as well as the Hub's catalogue name (i.e. can I confidently put in a different path that I saved the weights to manually?)
Hi @krahnikblis, thanks for raising this issue!
In the transformers library, `from_pretrained` can be used to load a model from the Hub or from a local file. When `from_pretrained(path)` is called, if `path` is a local folder, those weights are loaded. If it's a checkpoint on the Hub, e.g. `openai/clip-vit-large-patch14`, then the checkpoint is downloaded to the cache directory, as you've correctly noticed. If `from_pretrained(path)` is called again, the weights are loaded from the cache. This happens for all frameworks: PyTorch, TensorFlow and Flax.
For SD-Flax, am I correct in understanding this as the Stable Diffusion pipeline from the diffusers library? Could you share a more detailed snippet showing exactly what is being run? For the diffusers pipelines, if you use the `pipeline.from_pretrained(model_weights)` API, the same behaviour applies (download to cache, can load from local), as noted in the documentation.
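If the goal is simply to keep downloads near the working directory, there are two knobs for that (a sketch, assuming the documented `HF_HOME` environment variable and the `cache_dir` keyword argument; the download call itself is commented out):

```python
import os

# Option 1: redirect the whole Hugging Face cache before any
# transformers import, so all downloads land under ./hf_cache.
local_cache = os.path.join(os.getcwd(), "hf_cache")
os.environ["HF_HOME"] = local_cache

# Option 2: override the cache location per call instead.
# from transformers import FlaxCLIPModel
# clip_model = FlaxCLIPModel.from_pretrained(
#     "openai/clip-vit-large-patch14", cache_dir=local_cache
# )
```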
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.