huggingface_hub

Throttle download speed

Open arch-btw opened this issue 1 year ago • 4 comments

Is your feature request related to a problem? Please describe.

When downloading a model, huggingface-cli opens many connections and completely maxes out the connection's bandwidth. Because of this, every other process doesn't have any bandwidth left.

Describe the solution you'd like

A command line flag that would allow you to set the maximum download speed. For example:

huggingface-cli download --downrate 8M

To limit download speed to 8MB/sec.

Describe alternatives you've considered

I've tried bandwidth throttling apps such as trickle, but they don't work with huggingface-cli.

Additional context

I'm using Linux.

arch-btw avatar Mar 15 '24 11:03 arch-btw

> When downloading a model, huggingface-cli opens many connections and completely maxes out the connection's bandwidth. Because of this, every other process doesn't have any bandwidth left.

Hi @arch-btw, this is the case only if you have hf_transfer installed and enabled. hf_transfer is a Rust-based package used to maximize throughput by exploiting many processes in parallel, as you've seen. By default, huggingface-cli uses the single-threaded requests library, which doesn't lead to the situation you've described. I suspect that you have the HF_HUB_ENABLE_HF_TRANSFER environment variable set to 1. See these docs for more details. If you want to disable it for a single command, you can do

HF_HUB_ENABLE_HF_TRANSFER=0 huggingface-cli download ...

Wauplin avatar Mar 15 '24 13:03 Wauplin

My env doesn't have HF_HUB_ENABLE_HF_TRANSFER=1, and huggingface-cli hogs all my bandwidth, which is quite disruptive. A download rate flag would be hugely appreciated.

I'd work around this by using a proxy to rate-limit, but that option doesn't appear to be available in the CLI either.

altaic avatar Jun 10 '24 00:06 altaic

@altaic what you can do to limit the number of concurrent connections is to set the number of workers to 1. This cannot be done with the CLI (yet), but it can be done with a script:

from huggingface_hub import snapshot_download

snapshot_download(repo_id, ..., max_workers=1)

Default value is 5. I hope this can help you handle the load correctly. This parameter could be added to the CLI quite easily if you want to open a PR for it :)
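(For illustration only: a hypothetical sketch of what wiring such a flag into a CLI with argparse might look like. The --max-workers flag name and the plumbing below are assumptions, not part of the current huggingface-cli.)

```python
import argparse

# Hypothetical sketch of a --max-workers flag for a download subcommand;
# the flag name and default are assumptions, not part of huggingface-cli.
parser = argparse.ArgumentParser(prog="huggingface-cli download")
parser.add_argument("repo_id", help="Repository to download, e.g. 'gpt2'")
parser.add_argument(
    "--max-workers",
    type=int,
    default=None,
    help="Maximum number of concurrent download threads",
)

args = parser.parse_args(["gpt2", "--max-workers", "1"])
# The parsed value would then be forwarded on to
# snapshot_download(args.repo_id, max_workers=args.max_workers).
```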

Regarding throttling the connection, I don't think we want to officially support and maintain such a feature, as it depends a lot on the user's setup. What you can do is have a look at requests-ratelimited and add a custom Adapter to the requests Session used by huggingface_hub. More details on how to do that here.

Wauplin avatar Jun 10 '24 07:06 Wauplin

Did you mean "can NOT be done with the CLI?"

Ah yes, sorry for the typo (corrected it in my initial post). Using snapshot_download with a limited max_workers is not bad advice. The rest is more hacky, yes.

Wauplin avatar Jun 10 '24 10:06 Wauplin