text-generation-inference
# HF_TRANSFER is not working for the model CalderaAI/30B-Lazarus
## System Info
I'm on an Ubuntu server from https://console.paperspace.com/ with two A100 GPUs, but when I run the model CalderaAI/30B-Lazarus I cannot use HF Transfer, even with `--net=host` (a workaround that works with other models, found in another issue).
Full description of my GPUs:

```
paperspace@pse55xf0v:~/text-generation-inference$ nvidia-smi
Thu Jun 15 10:00:25 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.105.01   Driver Version: 515.105.01   CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  Off  | 00000000:00:05.0 Off |                    0 |
| N/A   29C    P0    53W / 400W |    184MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  Off  | 00000000:00:06.0 Off |                    0 |
| N/A   26C    P0    53W / 400W |      4MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1497      G   /usr/lib/xorg/Xorg                 74MiB |
|    0   N/A  N/A      2389      G   /usr/bin/gnome-shell              102MiB |
```
I run:

```bash
sudo docker run --net=host --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data -e HF_HUB_ENABLE_HF_TRANSFER=1 ghcr.io/huggingface/text-generation-inference:0.8 --model-id CalderaAI/30B-Lazarus --num-shard 1 --env --disable-custom-kernels
```
My error (HF_TRANSFER for the Lazarus model):
```
2023-06-15T10:56:36.180341Z ERROR download: text_generation_launcher: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling.
2023-06-15T10:56:36.180378Z  INFO download: text_generation_launcher: Retry 4/4
2023-06-15T10:56:36.180458Z  INFO download: text_generation_launcher: Download file: pytorch_model-00003-of-00007.bin
2023-06-15T10:57:00.401367Z ERROR text_generation_launcher: Download encountered an error: Traceback (most recent call last):

  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 486, in http_get
    download(url, temp_file.name, max_files, chunk_size, headers=headers)

Exception: Error while downloading: Os { code: 28, kind: StorageFull, message: "No space left on device" }

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 137, in download_weights
    local_pt_files = utils.download_weights(pt_filenames, model_id, revision)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 167, in download_weights
    file = download_file(filename)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 159, in download_file
    raise e
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 147, in download_file
    local_file = hf_hub_download(
  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1347, in hf_hub_download
    http_get(
  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 495, in http_get
    raise RuntimeError(
RuntimeError: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling.

Error: DownloadError
```
## Information
- [X] Docker
- [ ] The CLI directly
## Tasks
- [X] An officially supported command
- [ ] My own modifications
## Reproduction
Run:

```bash
sudo docker run --net=host --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data -e HF_HUB_ENABLE_HF_TRANSFER=1 ghcr.io/huggingface/text-generation-inference:0.8 --model-id CalderaAI/30B-Lazarus --num-shard 1 --env --disable-custom-kernels
```

on a Paperspace Ubuntu server with 2 A100 GPUs.
## Expected behavior
The Lazarus model downloads successfully with HF Transfer enabled.
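Note that the traceback above already contains the underlying failure: `Os { code: 28, kind: StorageFull, message: "No space left on device" }`. The `hf_transfer` wrapper only re-raises it as a generic `RuntimeError`, which is why the message points at HF_TRANSFER rather than at the disk. A minimal sketch for checking whether the mounted volume can hold the weights (the `/data` path comes from the `-v $PWD/data:/data` mount above; the ~65 GB figure is a rough estimate for a 30B fp16 checkpoint split across seven shards, not a number from this thread):

```python
import shutil

# /data is where TGI downloads the weights (mounted from $PWD/data on the host).
# A 30B fp16 checkpoint in seven .bin shards needs on the order of 65 GB free.
total, used, free = shutil.disk_usage("/data")
print(f"total: {total / 1e9:.0f} GB  used: {used / 1e9:.0f} GB  free: {free / 1e9:.0f} GB")
```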
Try disabling it? It should still download the model, just a bit slower.

hf_transfer is really barebones, and any flaky network can trigger issues for you (or, because you're using much more bandwidth, other parts of the infrastructure, not necessarily yours, might start to lag, making the network flaky overall). The raw Python download is definitely recommended for stable downloading.
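As a concrete illustration of the fallback recommended here, a minimal sketch (the repo id and shard name are taken from the issue; the rest is plain `huggingface_hub` usage). One gotcha worth noting: `huggingface_hub` reads the flag into a module-level constant, so it has to be set before the library is first imported.

```python
import os

# Must be set before huggingface_hub is imported: the flag is read into a
# module-level constant at import time, so setting it later has no effect.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"

from huggingface_hub import hf_hub_download

# With hf_transfer disabled, the plain requests-based path is used, and a
# full disk surfaces as a clear OSError instead of a generic RuntimeError.
path = hf_hub_download(
    repo_id="CalderaAI/30B-Lazarus",
    filename="pytorch_model-00003-of-00007.bin",  # shard name from the log above
)
print(path)
```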
This works for me:

```bash
docker run --shm-size 1g --net=host -p 8080:80 -v $PWD/data:/data -e HUGGING_FACE_HUB_TOKEN=$token -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:latest --model-id TheBloke/Llama-2-13B-chat-GGML --quantize bitsandbytes
```
How can I avoid this error? I am using AWS SageMaker.
Isn't there a way for you to provide environment variables?

`HF_HUB_ENABLE_HF_TRANSFER=0` is what you are looking for.
> Isn't there a way for you to provide environment variables? `HF_HUB_ENABLE_HF_TRANSFER=0` is what you are looking for.
Hello Narsil,
I have a similar problem to majidbhatti. I am on a SageMaker notebook and I get the same error. I tried your solution by setting `os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"`, but it did not change anything; I still get the exact same error: "RuntimeError: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling."
Do you have any suggestions?
I avoided this error using:

```python
hub = {"HF_HUB_ENABLE_HF_TRANSFER": 0}

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
)
```
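This works because the `env` dict is injected into the TGI serving container's environment, which is where the download actually runs; setting `os.environ` in the notebook kernel never reaches the container, which would explain the earlier report that it had no effect. A fuller, hedged sketch of the same fix; the role, model id, and instance type are placeholder assumptions, not values from this thread:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes this runs inside SageMaker

hub = {
    "HF_MODEL_ID": "CalderaAI/30B-Lazarus",  # placeholder model id
    "HF_HUB_ENABLE_HF_TRANSFER": "0",        # disable hf_transfer in the container
}

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
    role=role,
)

# The instance type is only an example; a 30B model needs a large GPU instance.
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
)
```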
> I avoided this error using `hub = {'HF_HUB_ENABLE_HF_TRANSFER': 0}` with `HuggingFaceModel(image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"), env=hub)`.
Thank you so much!! It worked for me.
Thanks for sharing the solution!

Closing this, then.
You can also add `HF_HUB_ENABLE_HF_TRANSFER=0` in the docker command:

```bash
docker run --shm-size 1g --env HF_HUB_ENABLE_HF_TRANSFER=0 .......
```