
Passing a `--revision` causes failure in loading tokenizer config

Open chiragjn opened this issue 1 year ago • 0 comments

System Info

docker run --gpus all -it --rm -e HF_TOKEN=... ghcr.io/predibase/lorax:ea5d74b lorax-launcher --source hub --model-id meta-llama/Llama-2-7b-chat-hf --default-adapter-source local --revision f5db02db724555f92da89c216ac04704f23d4590

We notice the following in the logs:

2024-08-01T13:25:45.718148Z  INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"    
2024-08-01T13:25:46.569450Z  INFO lorax_router: router/src/main.rs:534: Serving revision f5db02db724555f92da89c216ac04704f23d4590 of model meta-llama/Llama-2-7b-chat-hf
2024-08-01T13:25:46.569516Z  WARN lorax_router: router/src/main.rs:337: Could not find tokenizer config locally and no API specified
2024-08-01T13:25:46.569522Z  WARN lorax_router: router/src/main.rs:358: Could not find a fast tokenizer implementation for meta-llama/Llama-2-7b-chat-hf
2024-08-01T13:25:46.569527Z  WARN lorax_router: router/src/main.rs:359: Rust input length validation and truncation is disabled

Any chat completions request then fails because the chat template cannot be found.

When the `--revision` flag is omitted, the router finds tokenizer_config.json without issue.
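For what it's worth, the file does exist at that revision on the Hub. A minimal sketch (the `hub_file_url` helper is hypothetical, not part of lorax) that builds the standard huggingface.co `resolve` URL for a file pinned to a revision, which can be fetched with an `Authorization: Bearer $HF_TOKEN` header to confirm the config is reachable independently of the router:

```python
def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the huggingface.co 'resolve' URL for a repo file at a given revision."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# URL for tokenizer_config.json at the exact revision from the repro command.
print(hub_file_url(
    "meta-llama/Llama-2-7b-chat-hf",
    "tokenizer_config.json",
    "f5db02db724555f92da89c216ac04704f23d4590",
))
```

This suggests the failure is in how the router resolves the tokenizer config when a revision is set, not a missing file on the Hub.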

Information

  • [X] Docker
  • [ ] The CLI directly

Tasks

  • [X] An officially supported command
  • [ ] My own modifications

Reproduction

docker run --gpus all -it --rm -e HF_TOKEN=... ghcr.io/predibase/lorax:ea5d74b lorax-launcher --source hub --model-id meta-llama/Llama-2-7b-chat-hf --default-adapter-source local --revision f5db02db724555f92da89c216ac04704f23d4590

Expected behavior

Passing a `--revision` should not affect loading the tokenizer config.

chiragjn · Aug 01 '24 13:08