Python client: Extra slash in base_url leads to failures in chat endpoint
System Info
If you create a Client with a server URI ending in a slash, the generation endpoint works fine but the chat endpoint fails without a clear error. With the Python client the failure surfaces as a JSONDecodeError, because the server returns a 404 with an empty body and an empty string isn't valid JSON (that's a separate bug, though).
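To illustrate why the 404 shows up as a JSONDecodeError rather than an HTTP error, here is a minimal sketch using only the standard library. `parse_body` is a hypothetical helper, not part of the text_generation client; it just shows one way an empty body could be handled before decoding.

```python
import json


def parse_body(body: str):
    """Hypothetical helper: return None for an empty body instead of raising."""
    if not body.strip():
        return None
    return json.loads(body)


# An empty string is not valid JSON, so decoding it raises JSONDecodeError.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    error_name = type(exc).__name__

print(error_name)
print(parse_body(""))
print(parse_body('{"ok": true}'))
```

Checking the HTTP status before decoding the body would probably give a more useful error here, but that belongs to the separate bug mentioned above.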
(I'm also confused about the intended relationship between the Python client library in this repo and the InferenceClient one in huggingface_hub. The docs refer to both, in different places.)
Information
- [ ] Docker
- [X] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
from text_generation import Client

client = Client('http://localhost:3000/')
client.chat(messages=[
    {
        "role": "system",
        "content": "You answer in bulleted lists."
    },
    {
        "role": "user",
        "content": "Why is the sky blue?"
    }
], max_tokens=100)
Expected behavior
The chat endpoint should behave the same as when the Client URI has no trailing /.
This could be as simple as self.base_url = base_url.rstrip('/') on:
https://github.com/huggingface/text-generation-inference/blob/e9f03f822a766f071620457bd977f7987e65b20e/clients/python/text_generation/client.py#L62
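A rough sketch of what that normalization would do, under the assumption that the chat URL is built by appending a path such as /v1/chat/completions to base_url while the generation request goes to base_url itself. `SketchClient` and its methods are hypothetical stand-ins that only model URL handling; they are not the real Client.

```python
class SketchClient:
    """Hypothetical stand-in for the real Client; only URL handling is modeled."""

    def __init__(self, base_url: str):
        # Proposed fix: drop any trailing slashes once, at construction time.
        self.base_url = base_url.rstrip("/")

    def generate_url(self) -> str:
        # Assumption: the generation request is sent to the base URL itself.
        return self.base_url

    def chat_url(self) -> str:
        # Assumption: the chat path is appended with a leading slash, which is
        # what would produce "//v1/..." if base_url kept its trailing slash.
        return f"{self.base_url}/v1/chat/completions"


with_slash = SketchClient("http://localhost:3000/")
without_slash = SketchClient("http://localhost:3000")
print(with_slash.chat_url())
print(with_slash.chat_url() == without_slash.chat_url())
```

With the rstrip in place, both ways of writing the URI end up with the same chat URL, which is the expected behavior described above.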