"Unauthorized for url: https://huggingface.co/api/models/bigcode/starcoder"
System Info
The API works with Bloom 560M, but when I try the model "bigcode/starcoder" I get this error:
2023-06-08T09:42:38.813488Z ERROR text_generation_launcher: Download encountered an error: Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
response.raise_for_status()
File "/opt/conda/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/bigcode/starcoder
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 96, in download_weights
utils.weight_files(model_id, revision, extension)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 92, in weight_files
filenames = weight_hub_files(model_id, revision, extension)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 25, in weight_hub_files
info = api.model_info(model_id, revision=revision)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1604, in
model_info
hf_raise_for_status(r)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 291, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-6481a28e-096788e502c5cd136b7ef37a)
Repository Not Found for url: https://huggingface.co/api/models/bigcode/starcoder.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [x] An officially supported command
- [ ] My own modifications
Reproduction
- I ran:
sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:0.8 --model-id bigcode/starcoder --num-shard 1 --env --disable-custom-kernels
Expected behavior
Download the bigcode/starcoder model and get an API serving this model.
I think you need to pass your token, since this model is gated behind an access request (essentially, you cannot download it as a non-logged-in user).
docker run .... -e HUGGING_FACE_HUB_TOKEN=$MYTOKEN...
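For instance, adapting the reproduction command above (a sketch; `$MYTOKEN` stands for your Hugging Face access token):

```shell
sudo docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $volume:/data \
  -e HUGGING_FACE_HUB_TOKEN=$MYTOKEN \
  ghcr.io/huggingface/text-generation-inference:0.8 \
  --model-id bigcode/starcoder --num-shard 1
```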
@OlivierDehaene Shouldn't we add that to the README by default (since there are a few of those)?
Ah ok, thanks @Narsil. And is authentication required only for the download?
- Is it needed for every model? How can I know in advance whether a model requires it?
> Ah ok, thanks @Narsil. And is authentication required only for the download?

Yes.

> Is it needed for every model? How can I know in advance whether a model requires it?

No, you need to check whether the model is gated on the Hub (usually it's a LICENSE acceptance).
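If you want to check up front, one rough way is to hit the model's API route anonymously (a sketch, assuming `curl` and the same API route as in the traceback above; gated or private repos answer 401 without credentials, public ones 200):

```shell
# Anonymous request: returns 401 for gated/private repos (as in the traceback above)
curl -s -o /dev/null -w "%{http_code}\n" \
  https://huggingface.co/api/models/bigcode/starcoder

# Authenticated request: returns 200 once your account has been granted access
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer $MYTOKEN" \
  https://huggingface.co/api/models/bigcode/starcoder
```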
Thanks a lot @Narsil
Passing HUGGING_FACE_HUB_TOKEN as in
docker run .... -e HUGGING_FACE_HUB_TOKEN=$MYTOKEN...
results in:
error: unexpected argument 'HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxx
I start the service like this:
model=meta-llama/Llama-2-7b-chat-hf
num_shard=1
volume=./model_cache/
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:0.9.3 --model-id $model --num-shard $num_shard --quantize=bitsandbytes -e HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxxxxx
The `-e X=Y` needs to happen before the docker image `ghcr:....`. This is how the docker CLI works (currently the `-e` is sent directly to `text-generation-launcher`, which indeed doesn't have this flag).
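So a working version of the command above would look something like this (a sketch; the token value is a placeholder):

```shell
model=meta-llama/Llama-2-7b-chat-hf
num_shard=1
volume=./model_cache/

# -e must come before the image name so docker (not the launcher) parses it
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v $volume:/data \
  -e HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxxxxx \
  ghcr.io/huggingface/text-generation-inference:0.9.3 \
  --model-id $model --num-shard $num_shard --quantize=bitsandbytes
```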