text-generation-inference
How can I use the model's checkpoint from a local folder?
System Info
- Image: ghcr.io/huggingface/text-generation-inference:2.0.4
- Platform: Windows 10
- Docker version: 27.0.3
- LLM model: lllyasviel/omost-llama-3-8b-4bits
- CUDA: 12.3
- GPU: NVIDIA RTX A6000
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [ ] An officially supported command
- [ ] My own modifications
Reproduction
```
C:\Users\Administrator>docker run --gpus all -p 8080:80 -v ./data:/data ghcr.io/huggingface/text-generation-inference:2.0.4 --model-id "F:\Omost-main\checkpoints\models--lllyasviel--omost-llama-3-8b-4bits" --max-total-tokens 9216 --cuda-memory-fraction 0.8
```
Expected behavior
Even though I set the model-id =
Did you try removing the double dashes in the model name `models--lllyasviel--omost-llama-3-8b-4bits`, as suggested in the error?
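For reference, the usual pattern for serving a local checkpoint with the Docker image is to mount the host folder that contains the model files and point `--model-id` at the path *inside* the container, not at the Windows host path (the container cannot see `F:\...`). A hedged sketch, reusing the paths from this report; whether this exact folder loads directly depends on it containing the model files (e.g. `config.json`) at the top level rather than the Hugging Face cache layout:

```shell
REM Mount the host checkpoint directory into the container at /data,
REM then reference the model by its path inside the container.
REM Assumption: the folder below holds the model files directly.
docker run --gpus all -p 8080:80 ^
  -v F:\Omost-main\checkpoints:/data ^
  ghcr.io/huggingface/text-generation-inference:2.0.4 ^
  --model-id /data/models--lllyasviel--omost-llama-3-8b-4bits ^
  --max-total-tokens 9216 --cuda-memory-fraction 0.8
```

If the directory is instead a Hugging Face cache (with `refs/` and `snapshots/` subfolders under `models--...`), an alternative is to mount the cache root at `/data` and pass the plain repo id `lllyasviel/omost-llama-3-8b-4bits` as `--model-id`, so the server resolves the cached snapshot itself.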
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.