"Unauthorized for url: https://huggingface.co/api/models/bigcode/starcoder"
System Info
The API works with Bloom 560M, but when I try the model "bigcode/starcoder" I get this error:
2023-06-08T09:42:38.813488Z ERROR text_generation_launcher: Download encountered an error: Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
response.raise_for_status()
File "/opt/conda/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/bigcode/starcoder
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 96, in download_weights
utils.weight_files(model_id, revision, extension)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 92, in weight_files
filenames = weight_hub_files(model_id, revision, extension)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 25, in weight_hub_files
info = api.model_info(model_id, revision=revision)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1604, in
model_info
hf_raise_for_status(r)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 291, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-6481a28e-096788e502c5cd136b7ef37a)
Repository Not Found for url: https://huggingface.co/api/models/bigcode/starcoder.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [x] An officially supported command
- [ ] My own modifications
Reproduction
- I ran:
sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:0.8 --model-id bigcode/starcoder --num-shard 1 --env --disable-custom-kernels
Expected behavior
Download the bigcode/starcoder model and get an API serving this model.
I think you need to pass your token, since this model is gated behind an access request (essentially, you cannot download it as a non-logged-in user).
docker run .... -e HUGGING_FACE_HUB_TOKEN=$MYTOKEN...
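For instance, adapting the reproduction command above (a sketch; `$MYTOKEN` stands for your Hugging Face access token):

```shell
sudo docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $volume:/data \
  -e HUGGING_FACE_HUB_TOKEN=$MYTOKEN \
  ghcr.io/huggingface/text-generation-inference:0.8 \
  --model-id bigcode/starcoder --num-shard 1
```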
@OlivierDehaene Shouldn't we add that to the README by default (since there are a few of those)?
Ah ok, thanks @Narsil. And is authentication required only for the download?
- Is it needed for every model? How can I know in advance whether a model requires it?
> Ah ok, thanks @Narsil. And is authentication required only for the download?

Yes.

> Is it needed for every model? How can I know in advance whether a model requires it?

No, you need to check whether the model is gated on the Hub (usually it's a LICENSE acceptance).
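If you want to check up front, one rough way is to hit the model's API route anonymously (a sketch, assuming `curl` and the same API route as in the traceback above; gated or private repos answer 401 without credentials, public ones 200):

```shell
# Anonymous request: returns 401 for gated/private repos (as in the traceback above)
curl -s -o /dev/null -w "%{http_code}\n" \
  https://huggingface.co/api/models/bigcode/starcoder

# Authenticated request: returns 200 once your account has been granted access
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer $MYTOKEN" \
  https://huggingface.co/api/models/bigcode/starcoder
```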
Thanks a lot @Narsil
Passing HUGGING_FACE_HUB_TOKEN as in
docker run .... -e HUGGING_FACE_HUB_TOKEN=$MYTOKEN...
results in:
error: unexpected argument 'HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxx
I start the service like this:
model=meta-llama/Llama-2-7b-chat-hf
num_shard=1
volume=./model_cache/
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:0.9.3 --model-id $model --num-shard $num_shard --quantize=bitsandbytes -e HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxxxxx
The `-e X=Y` needs to happen before the docker image `ghcr:....`. This is how the docker CLI works (currently the `-e` is sent directly to `text-generation-launcher`, which indeed doesn't have this flag).
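So a working version of the command above would look something like this (a sketch; the token value is a placeholder):

```shell
model=meta-llama/Llama-2-7b-chat-hf
num_shard=1
volume=./model_cache/

# -e must come before the image name so docker (not the launcher) parses it
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $volume:/data \
  -e HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxxxxx \
  ghcr.io/huggingface/text-generation-inference:0.9.3 \
  --model-id $model --num-shard $num_shard --quantize=bitsandbytes
```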