
[Feature Request]: Use the `huggingface_hub` library for downloading checkpoints from the HF Hub

Open sayakpaul opened this issue 1 year ago • 6 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do?

Currently, model checkpoints from the Hugging Face (HF) Hub are downloaded using the following:

https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/0cc0ee1bcb4c24a8c9715f66cede06601bfc00c8/modules/modelloader.py#L62

The HF team provides a separate library called huggingface_hub that lets anyone seamlessly interact with the HF Hub and its files. It also comes with built-in support for caching. So, I was wondering if we should consider refactoring this to use huggingface_hub.

Proposed workflow

If a user enters, say, andite/anything-v4.0 (a repository on the HF Hub), we would automatically download this checkpoint and cache it. An option for downloading a specific checkpoint can also be added.

The code for this is quite simple:

import huggingface_hub as hub

repo_id = "andite/anything-v4.0"  # repository on the HF Hub
filename = "anything-v4.0.ckpt"   # checkpoint file inside that repository

# Downloads the file into the local HF cache on the first call, reuses the
# cached copy afterwards, and returns the local file path.
file_path = hub.hf_hub_download(repo_id, filename)

Here are some numbers on loading a checkpoint:

Without caching:

CPU times: user 14.3 s, sys: 14.2 s, total: 28.5 s
Wall time: 1min 59s

With caching:

CPU times: user 380 ms, sys: 48.7 ms, total: 428 ms
Wall time: 682 ms
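
The numbers above look like IPython %time output; a minimal wall-clock sketch to reproduce them (same repo and filename as above, assuming an empty cache before the first call):

import time
import huggingface_hub as hub

repo_id = "andite/anything-v4.0"
filename = "anything-v4.0.ckpt"

# The first call downloads from the Hub; the second finds the file in the
# local cache and returns almost immediately.
for label in ("without caching", "with caching"):
    start = time.perf_counter()
    hub.hf_hub_download(repo_id, filename)
    print(f"{label}: {time.perf_counter() - start:.2f} s")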

Additional information

Cc: @patrickvonplaten

sayakpaul avatar Mar 01 '23 10:03 sayakpaul

Maybe cc @ClashSAN @AUTOMATIC1111 as well here

patrickvonplaten avatar Mar 01 '23 10:03 patrickvonplaten

Hi @sayakpaul, @patrickvonplaten, appreciate and thank you for your input!

There are some third-party extensions for HF downloads (not caching-related):
https://github.com/camenduru/stable-diffusion-webui-huggingface
https://github.com/etherealxx/batchlinks-webui

Q: Do other webuis cache the stable-diffusion format models? If the speed when loading and switching between models is improved, that is great; it could help users with weaker hardware. If users are loading models in 2 minutes, that is truly tragic; I only experience a 3.5 s loading time with .safetensors!

I don't know the average model loading time for users; maybe @vladmandic, who runs an extension with opt-in benchmark data sharing (https://github.com/vladmandic/sd-extension-system-info), has thoughts on this.

Q: Does this built-in cache system only work with HF-downloaded models?

Q: When using this system, would it bring telemetry after download?

ClashSAN avatar Mar 01 '23 22:03 ClashSAN

i don't track model load time, but looking at different issues reported around that, extreme times are almost always related to the initial model hash calculation when the model resides on a slow disk (or, in one extreme case, on a filesystem backed by an s3 bucket). in reality, model load is limited by i/o rate (either from disk or network), and hash calculation is just the worst-case example of it. and given network latency, no matter how high your throughput is, it's always going to be much slower than disk. so yes, caching models on local storage should help.
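
to illustrate (a generic sketch, not webui's actual hashing code; the local path is hypothetical): hashing a multi-gigabyte checkpoint means reading every byte from disk, so elapsed time tracks i/o rather than cpu:

import hashlib
import time

def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    # reads the whole file in 1 MiB chunks; on a slow disk this full
    # read, not the hashing itself, dominates the elapsed time
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

start = time.perf_counter()
digest = file_sha256("anything-v4.0.ckpt")  # hypothetical local path
print(f"sha256={digest[:10]}... in {time.perf_counter() - start:.1f}s")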

vladmandic avatar Mar 01 '23 22:03 vladmandic

Q: Do other webuis cache the stable-diffusion format models? If the speed when loading and switching between models is improved, that is great; it could help users with weaker hardware. If users are loading models in 2 minutes, that is truly tragic; I only experience a 3.5 s loading time with .safetensors!

I don't really know how other UIs are dealing with this.

I don't know the average model loading time for users; maybe @vladmandic, who runs an extension with opt-in benchmark data sharing (https://github.com/vladmandic/sd-extension-system-info), has thoughts on this.

Q: Does this built-in cache system only work with HF-downloaded models?

The cache would work with any model hosted on hf.co.
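
For example, recent versions of huggingface_hub ship a scan_cache_dir helper to inspect what is stored locally (a minimal sketch):

import huggingface_hub as hub

# Summarizes everything in the local HF cache, regardless of which
# library or script downloaded it.
cache_info = hub.scan_cache_dir()
print(f"cache size on disk: {cache_info.size_on_disk / 1e9:.2f} GB")
for repo in cache_info.repos:
    print(f"{repo.repo_id} ({repo.repo_type}): {repo.size_on_disk / 1e9:.2f} GB")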

Q: When using this system, would it bring telemetry after download?

I don't think telemetry is sent by default (except for the HEAD request to the file in question, just like when using wget). You could, however, add telemetry via the user_agent attribute if you like.
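
A minimal sketch of what that could look like (hf_hub_download accepts a user_agent argument; the identifiers shown here are made up):

import huggingface_hub as hub

# The user_agent value is only attached to the request headers sent to the
# Hub; the app name and version below are hypothetical examples.
file_path = hub.hf_hub_download(
    "andite/anything-v4.0",
    "anything-v4.0.ckpt",
    user_agent={"app": "stable-diffusion-webui", "version": "1.0"},
)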

patrickvonplaten avatar Mar 02 '23 00:03 patrickvonplaten

great, thanks!

the caching could be useful when users switch models mid-sampling as experiments, with deforum and other extensions too.

fortunately or unfortunately, AUTOMATIC1111 doesn't always read the issues page, so a discussion on a barebones draft PR may be the way to get his opinion on this.

ClashSAN avatar Mar 02 '23 03:03 ClashSAN

Thanks for your input. I will open a draft PR soon.

sayakpaul avatar Mar 02 '23 03:03 sayakpaul