
Allow HUGGINGFACE_HUB_TOKEN to be passed in as a container parameter/argument

Open · ssmi153 opened this issue 2 years ago · 0 comments

Feature request

I'm running TGI on Runpod and am trying to load a model from a private Huggingface repository. Despite setting HUGGINGFACE_HUB_TOKEN in Runpod's Environment Variables, TGI appears unable to read it and fails with "401 Client Error: Unauthorized for url" (trace below).

For context, the following variations DO work:

  • Loading a public model - works fine
  • Using this same token to access and download this repository via the Python huggingface_hub library's snapshot_download function - works fine

... so I'm fairly confident that TGI is simply not picking up the Huggingface authentication token that I'm passing in.

I can see why the code base doesn't support this: https://github.com/huggingface/text-generation-inference/blob/2b53d71991e8fe975be41a82ffe3b52b0bcd40a3/router/src/main.rs#L126C27-L126C27

On this line you can see that TGI loads the token via std::env::var("HUGGING_FACE_HUB_TOKEN").ok(); — note the extra underscore in the variable name, and note that it is read directly from the environment, unlike all the other parameters, which are passed through as Args. I understand that this is the desired behaviour in some other environments, but would it be possible to check both this and the Args and take whichever one is given? I can't seem to set the value where it's currently being read from in Runpod.
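To illustrate the kind of fallback I have in mind, here is a minimal Rust sketch. `resolve_token` is a hypothetical helper, not a function that exists in TGI; the idea is simply to prefer an explicitly passed value (e.g. from a CLI argument) and otherwise try both spellings of the environment variable:

```rust
use std::env;

// Hypothetical helper sketching the requested behaviour: an explicit
// CLI/argument value wins, otherwise fall back to either spelling of
// the environment variable.
fn resolve_token(cli_token: Option<String>) -> Option<String> {
    cli_token
        // what TGI reads today (router/src/main.rs)
        .or_else(|| env::var("HUGGING_FACE_HUB_TOKEN").ok())
        // what Runpod's Environment Variables panel lets me set
        .or_else(|| env::var("HUGGINGFACE_HUB_TOKEN").ok())
}

fn main() {
    // An explicitly supplied token always takes precedence.
    let token = resolve_token(Some("hf_example".to_string()));
    assert_eq!(token.as_deref(), Some("hf_example"));
    println!("ok");
}
```

If the token were also exposed as a real Args field (clap supports binding an argument to an environment variable), both the argument and either env-var spelling would work without breaking existing deployments.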

Here's my exception:

2023-07-01T10:11:36.032120596-07:00 Error: DownloadError
2023-07-01T10:11:36.032189599-07:00 {"timestamp":"2023-07-01T17:11:36.031831Z","level":"ERROR","fields":{"message":"Download encountered an error: Traceback (most recent call last):\n\n  File \"/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py\", line 259, in hf_raise_for_status\n    response.raise_for_status()\n\n  File \"/opt/conda/lib/python3.9/site-packages/requests/models.py\", line 1021, in raise_for_status\n    raise HTTPError(http_error_msg, response=self)\n\nrequests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/ssmi153/student-feedback-llama-30b-instruct-2023-06-27-step2400\n\n\nThe above exception was the direct cause of the following exception:\n\n\nTraceback (most recent call last):\n\n  File \"/opt/conda/bin/text-generation-server\", line 8, in <module>\n    sys.exit(app())\n\n  File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py\", line 109, in download_weights\n    utils.weight_files(model_id, revision, extension)\n\n  File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py\", line 96, in weight_files\n    filenames = weight_hub_files(model_id, revision, extension)\n\n  File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py\", line 25, in weight_hub_files\n    info = api.model_info(model_id, revision=revision)\n\n  File \"/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py\", line 120, in _inner_fn\n    return fn(*args, **kwargs)\n\n  File \"/opt/conda/lib/python3.9/site-packages/huggingface_hub/hf_api.py\", line 1604, in model_info\n    hf_raise_for_status(r)\n\n  File \"/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py\", line 291, in hf_raise_for_status\n    raise RepositoryNotFoundError(message, response) from e\n\nhuggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. 
(Request ID: Root=1-64a05e47-78fbc0ee167461415e040537)\n\nRepository Not Found for url: https://huggingface.co/api/models/ssmi153/student-feedback-llama-30b-instruct-2023-06-27-step2400.\nPlease make sure you specified the correct `repo_id` and `repo_type`.\nIf you are trying to access a private or gated repo, make sure you are authenticated.\nInvalid username or password.\n\n"},"target":"text_generation_launcher"}

Motivation

It'd be great to be able to download and run models from private Huggingface repos on Runpod. This is not currently possible.

Your contribution

I can read Rust, but I'm not confident enough with its syntax to write additional code, so I'm not sure I can fix this myself. I can, however, point out exactly where the change would need to happen: https://github.com/huggingface/text-generation-inference/blob/2b53d71991e8fe975be41a82ffe3b52b0bcd40a3/router/src/main.rs#L126C27-L126C27

ssmi153 · Jul 01 '23 17:07