text-generation-inference
# HF_TRANSFER is not working for the model CalderaAI/30B-Lazarus
## System Info
I'm on an Ubuntu server from https://console.paperspace.com/ with two A100 GPUs, but when I run the model CalderaAI/30B-Lazarus I cannot use HF Transfer, even with `--net=host` (a workaround that works with other models, found in another issue).
Full description of my GPUs:

```
paperspace@pse55xf0v:~/text-generation-inference$ nvidia-smi
Thu Jun 15 10:00:25 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.105.01   Driver Version: 515.105.01   CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  Off  | 00000000:00:05.0 Off |                    0 |
| N/A   29C    P0    53W / 400W |    184MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  Off  | 00000000:00:06.0 Off |                    0 |
| N/A   26C    P0    53W / 400W |      4MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1497      G   /usr/lib/xorg/Xorg                 74MiB |
|    0   N/A  N/A      2389      G   /usr/bin/gnome-shell              102MiB |
```
I run:

```bash
sudo docker run --net=host --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data -e HF_HUB_ENABLE_HF_TRANSFER=1 ghcr.io/huggingface/text-generation-inference:0.8 --model-id CalderaAI/30B-Lazarus --num-shard 1 --env --disable-custom-kernels
```
My error (HF_TRANSFER for the Lazarus model):
```
2023-06-15T10:56:36.180341Z ERROR download: text_generation_launcher: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling.
2023-06-15T10:56:36.180378Z  INFO download: text_generation_launcher: Retry 4/4
2023-06-15T10:56:36.180458Z  INFO download: text_generation_launcher: Download file: pytorch_model-00003-of-00007.bin
2023-06-15T10:57:00.401367Z ERROR text_generation_launcher: Download encountered an error: Traceback (most recent call last):

  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 486, in http_get
    download(url, temp_file.name, max_files, chunk_size, headers=headers)

Exception: Error while downloading: Os { code: 28, kind: StorageFull, message: "No space left on device" }

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 137, in download_weights
    local_pt_files = utils.download_weights(pt_filenames, model_id, revision)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 167, in download_weights
    file = download_file(filename)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 159, in download_file
    raise e
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 147, in download_file
    local_file = hf_hub_download(
  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1347, in hf_hub_download
    http_get(
  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 495, in http_get
    raise RuntimeError(
RuntimeError: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling.

Error: DownloadError
```
## Information
- [X] Docker
- [ ] The CLI directly
## Tasks
- [X] An officially supported command
- [ ] My own modifications
## Reproduction
Run:

```bash
sudo docker run --net=host --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data -e HF_HUB_ENABLE_HF_TRANSFER=1 ghcr.io/huggingface/text-generation-inference:0.8 --model-id CalderaAI/30B-Lazarus --num-shard 1 --env --disable-custom-kernels
```

on a Paperspace Ubuntu server with 2 A100 GPUs.
## Expected behavior
The Lazarus model downloads successfully with HF Transfer enabled.
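Note that the traceback above already contains the underlying failure: `Os { code: 28, kind: StorageFull, message: "No space left on device" }`. The `hf_transfer` wrapper only re-raises it as a generic `RuntimeError`, which is why the message points at HF_TRANSFER rather than at the disk. A minimal sketch for checking whether the mounted volume can hold the weights (the `/data` path comes from the `-v $PWD/data:/data` mount above; the ~65 GB figure is a rough estimate for a 30B fp16 checkpoint split across seven shards, not a number from this thread):

```python
import shutil

# /data is where TGI downloads the weights (mounted from $PWD/data on the host).
# A 30B fp16 checkpoint in seven .bin shards needs on the order of 65 GB free.
total, used, free = shutil.disk_usage("/data")
print(f"total: {total / 1e9:.0f} GB  used: {used / 1e9:.0f} GB  free: {free / 1e9:.0f} GB")
```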
Try disabling it? It should still download the model, just a bit slower.

hf_transfer is really barebones, and any flaky network can trigger issues for you (or, because you're using much more bandwidth, other parts of the infrastructure, not necessarily yours, might start to lag, making the network flaky overall). The raw Python download is definitely recommended for stable downloading.
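As a concrete illustration of the fallback recommended here, a minimal sketch (the repo id and shard name are taken from the issue; the rest is plain `huggingface_hub` usage). One gotcha worth noting: `huggingface_hub` reads the flag into a module-level constant, so it has to be set before the library is first imported.

```python
import os

# Must be set before huggingface_hub is imported: the flag is read into a
# module-level constant at import time, so setting it later has no effect.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"

from huggingface_hub import hf_hub_download

# With hf_transfer disabled, the plain requests-based path is used, and a
# full disk surfaces as a clear OSError instead of a generic RuntimeError.
path = hf_hub_download(
    repo_id="CalderaAI/30B-Lazarus",
    filename="pytorch_model-00003-of-00007.bin",  # shard name from the log above
)
print(path)
```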
This works for me:

```bash
docker run --shm-size 1g --net=host -p 8080:80 -v $PWD/data:/data -e HUGGING_FACE_HUB_TOKEN=$token -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:latest --model-id TheBloke/Llama-2-13B-chat-GGML --quantize bitsandbytes
```
How can I avoid this error? I am using AWS SageMaker.
Isn't there a way for you to provide environment variables?

`HF_HUB_ENABLE_HF_TRANSFER=0` is what you are looking for.
> Isn't there a way for you to provide environment variables? `HF_HUB_ENABLE_HF_TRANSFER=0` is what you are looking for.
Hello Narsil,
I have a similar problem to majidbhatti. I am on a SageMaker notebook and I get the same error. I tried your solution by setting `os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"`, but it did not change anything; I still get the exact same error: "RuntimeError: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling."
Do you have any suggestions?
I avoided this error using:

```python
hub = {"HF_HUB_ENABLE_HF_TRANSFER": 0}

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
)
```
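This works because the `env` dict is injected into the TGI serving container's environment, which is where the download actually runs; setting `os.environ` in the notebook kernel never reaches the container, which would explain the earlier report that it had no effect. A fuller, hedged sketch of the same fix; the role, model id, and instance type are placeholder assumptions, not values from this thread:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes this runs inside SageMaker

hub = {
    "HF_MODEL_ID": "CalderaAI/30B-Lazarus",  # placeholder model id
    "HF_HUB_ENABLE_HF_TRANSFER": "0",        # disable hf_transfer in the container
}

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
    role=role,
)

# The instance type is only an example; a 30B model needs a large GPU instance.
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
)
```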
> I avoided this error using `hub = {'HF_HUB_ENABLE_HF_TRANSFER': 0}` with `HuggingFaceModel(image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"), env=hub)`.
Thank you so much!! It worked for me.
Thanks for sharing the solution!

Closing this, then.
You can also add `HF_HUB_ENABLE_HF_TRANSFER=0` in the docker command:

```bash
docker run --shm-size 1g --env HF_HUB_ENABLE_HF_TRANSFER=0 .......
```