[HF task support] Unexpected output with a Meta-Llama-3.1-8B-Instruct based model
What huggingface task type would you like to support? text-genaration
Specifically, what model are you interested in under this task type? NeverSleep/Lumimaid-v0.2-8B
Would you mind sharing a bit more, like use cases and errors you encountered?
I deployed the model using the Lepton LLM engine. The only information provided on this page was MODEL_PATH, and the deployment succeeded.
I can create a chat completion request using the OpenAI SDK. However, the results I receive are unexpected. The snapshot below is what I obtained from the webpage, and I received a similar result using the OpenAI SDK.
I also attempted deployment on my own server using transformers, and the results appeared satisfactory.
As shown in the config.json, it is identical to the Llama-3.1-8B-Instruct's configuration. However, the results suggest there may be a tokenization error.
Please assist me in ensuring the deployment functions correctly in chat mode, thank you!