text-generation-inference
Responses with unusual content.
System Info
Hugging Face Inference API (hosted text-generation-inference)
Information
- [ ] Docker
- [ ] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
I'm using the Inference API (https://api-inference.huggingface.co/v1/chat/completions) with the nvidia/Llama-3.1-Nemotron-70B-Instruct-HF model.
I send the same message with the "user" role, and the model produces different results: most of the time it gives normal answers, but occasionally it generates responses with strange content.
I temporarily stopped calling the API for a short period. When I called it again with the same message as before, the model returned a normal response.
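For reference, a minimal sketch of the kind of request I am making (the bearer token and message content below are placeholders; I do not set temperature, seed, or other sampling parameters, so the server defaults apply):

```python
import requests

API_URL = "https://api-inference.huggingface.co/v1/chat/completions"
HEADERS = {"Authorization": "Bearer hf_xxx"}  # placeholder token

payload = {
    "model": "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
    "messages": [
        # the same user message is sent on every call
        {"role": "user", "content": "<same user message every time>"},
    ],
    # no temperature / top_p / seed set, so server defaults are used
}

resp = requests.post(API_URL, headers=HEADERS, json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```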
Expected behavior
Is this issue caused by the model itself? Is there any way to prevent it from generating such strange responses?