
Slow download from HuggingFace Hub (capped at 10.5 MB/s)

Open Andrei-Aksionov opened this issue 11 months ago • 6 comments

Bug description

Sometimes the download speed is pretty slow, no more than 10.5 MB/s. For example, gemma-2-2b takes a lot of time to download.

In download.py, the code sets the hf_transfer variable via constants: https://github.com/Lightning-AI/litgpt/blob/fe96c6366ad8fd20a632673bd6f344bd9b18ca04/litgpt/scripts/download.py#L79-L82

But it doesn't help, even if the value is set through os.environ["HF_HUB_ENABLE_HF_TRANSFER"]="1".

However, if the env variable is exported before running the script, there is a significant speed-up:

export HF_HUB_ENABLE_HF_TRANSFER=1

Most likely, HF Hub checks the env variable upon initialization. The code needs to be fixed.
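
If that hypothesis holds, the behavior can be reproduced without HF Hub at all. Below is a minimal sketch, assuming the library copies the variable into a module-level constant when it is first imported. `fake_hub` and `ENABLE_TRANSFER` are hypothetical stand-ins made up for illustration, not real huggingface_hub names.

```python
import os
import types

# Hypothetical stand-in for a library that snapshots an env variable
# into a module-level constant at import time (an assumption, see above).
FAKE_SOURCE = '''
import os
ENABLE_TRANSFER = os.environ.get("HF_HUB_ENABLE_HF_TRANSFER", "0") == "1"
'''


def import_fake_hub():
    """Create the stand-in module and run its top-level code once."""
    module = types.ModuleType("fake_hub")
    exec(FAKE_SOURCE, module.__dict__)
    return module


# Case 1: the variable is set only after the "import", so the snapshot misses it.
os.environ.pop("HF_HUB_ENABLE_HF_TRANSFER", None)
late = import_fake_hub()
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
print(late.ENABLE_TRANSFER)   # False

# Case 2: the variable is already set when the "import" happens.
early = import_fake_hub()
print(early.ENABLE_TRANSFER)  # True
```

This would explain why `export` in the shell works: the variable is in place before any Python import runs, whereas an assignment inside the script may come too late.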

What operating system are you using?

macOS

LitGPT Version

Version: 0.5.4.dev1

Andrei-Aksionov avatar Dec 23 '24 16:12 Andrei-Aksionov

I think this is fixed now via #1899

rasbt avatar Jan 08 '25 21:01 rasbt

Nope :) Still the same: the download speed is capped at 10.5 MB/s, but if I set HF_HUB_ENABLE_HF_TRANSFER before downloading, it is superfast.

Andrei-Aksionov avatar Jan 09 '25 09:01 Andrei-Aksionov

So weird, I wasn't able to reproduce!

rasbt avatar Jan 09 '25 14:01 rasbt

The issue is still not fixed. I am downloading the Qwen2 VL 2b instruct model in a lightning.ai studio and it takes a lot of time: downloading is 3x slower than on Kaggle.

ahsannawazch avatar Jan 10 '25 11:01 ahsannawazch

It goes beyond these models; I believe it's an HF thing and not LitGPT.

Respaired avatar Jan 16 '25 17:01 Respaired

> It goes beyond these models; I believe it's an HF thing and not LitGPT.

Well, I tried the following and the speed went up to 500-600 MB/s. It took a few seconds to download the same model.

  1. export HF_HUB_ENABLE_HF_TRANSFER=1
  2. pip install hf_transfer
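
For scripts where exporting the variable in the shell is inconvenient, the same two steps can be approximated in Python. This is only a sketch, assuming the import-order explanation from the issue description is correct: the flag has to be set before huggingface_hub (or anything that imports it, such as litgpt) is imported for the first time. `enable_hf_transfer` is a hypothetical helper name, not part of litgpt or huggingface_hub.

```python
import importlib.util
import os


def enable_hf_transfer() -> bool:
    """Set HF_HUB_ENABLE_HF_TRANSFER only if the hf_transfer package is installed.

    Returns True if the flag was set, False if hf_transfer is missing.
    """
    if importlib.util.find_spec("hf_transfer") is None:
        # Nothing to enable; install it first with `pip install hf_transfer`.
        return False
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
    return True


enabled = enable_hf_transfer()
print("hf_transfer flag set:", enabled)

# Import huggingface_hub or litgpt only after this point (assumption: the
# flag is read when the library is first imported).
```

Checking for the package first avoids enabling the flag when hf_transfer is not available.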

ahsannawazch avatar Jan 18 '25 15:01 ahsannawazch