
[Bug]: Enabling OpenAI-compatible mapping for restricted HuggingFace models leads to unexpected server error

Open r4victor opened this issue 9 months ago • 0 comments

Steps to reproduce

If I run a service with model mapping without specifying chat_template, dstack fetches the tokenizer config from Hugging Face. But if the model is restricted (requires accepting an agreement), the unauthenticated request fails. I ran into the problem when running the mistralai/Mistral-7B-Instruct-v0.1 example from our docs:

type: service

image: ghcr.io/huggingface/text-generation-inference:latest
env:
  - MODEL_ID=mistralai/Mistral-7B-Instruct-v0.1
commands:
  - text-generation-launcher --port 8000 --trust-remote-code
port: 8000

resources:
  gpu: 24GB

# Enable the OpenAI-compatible endpoint   
model:
  type: chat
  name: mistralai/Mistral-7B-Instruct-v0.1
  format: tgi

It worked before. Apparently, mistralai/Mistral-7B-Instruct-v0.1 has started requiring users to accept an agreement.
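For reference, the underlying failure can be reproduced outside dstack with a plain unauthenticated request to the same tokenizer config URL that shows up in the server logs below (a minimal sketch using requests; the actual fetch inside dstack may be done differently):

import requests

# The same URL that appears in the 401 error in the server logs below.
url = (
    "https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1"
    "/resolve/main/tokenizer_config.json"
)
resp = requests.get(url)   # no Authorization header, i.e. an unauthenticated request
print(resp.status_code)    # 401 for gated models, 200 for public ones
resp.raise_for_status()    # raises requests.HTTPError: 401 Client Error: Unauthorized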

Actual behaviour

I'm not sure how to get the tokenizer info of a restricted model without auth. In any case, if it's not possible, dstack should report an appropriate configuration error instead of an unexpected server error.
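One possible shape of the fix, sketched below: detect the 401/403 from Hugging Face and raise a ConfigurationError with an actionable message. This is only a rough sketch; the function name and error class follow the traceback in the server logs, but the real helper in options.py likely has a different signature, and the chat_template/token suggestions are just examples of what the message could say.

import requests

from dstack._internal.core.errors import ConfigurationError


def get_tokenizer_config(model_id: str) -> dict:
    # Hypothetical sketch; the actual function in
    # src/dstack/_internal/server/services/gateways/options.py differs.
    url = f"https://huggingface.co/{model_id}/resolve/main/tokenizer_config.json"
    resp = requests.get(url, timeout=10)
    if resp.status_code in (401, 403):
        # Gated/restricted repo: surface an actionable configuration error
        # instead of an unexpected server error.
        raise ConfigurationError(
            f"Cannot access the tokenizer config of {model_id}: the model repo is gated. "
            "Set chat_template explicitly in the model mapping or provide a Hugging Face token."
        )
    try:
        resp.raise_for_status()
    except requests.HTTPError as e:
        raise ConfigurationError(f"Failed to get tokenizer info: {e}")
    return resp.json()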

Expected behaviour

No response

dstack version

master

Server logs

File "/Users/r4victor/Projects/dstack/dstack/src/dstack/_internal/server/services/gateways/options.py", line 31, in get_tokenizer_config
    raise ConfigurationError(f"Failed to get tokenizer info: {e}")
dstack._internal.core.errors.ConfigurationError: Failed to get tokenizer info: 401 Client Error: Unauthorized for url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1/resolve/main/tokenizer_config.json

Additional information

No response

r4victor · Apr 24 '24 10:04