Python client: Extra slash in base_uri leads to failures in chat endpoint

Open kcarnold opened this issue 1 year ago • 0 comments

System Info

If you create a Client with a server URI ending in a slash, the generation endpoint works fine but the chat endpoint fails. The server returns a 404 with an empty body, and the Python client then raises a JSONDecodeError because an empty string isn't valid JSON (the unhelpful error is a separate bug, though).
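
To illustrate the separate JSONDecodeError problem, here is a minimal sketch of checking the HTTP status before decoding the body. The helper name decode_response and the RuntimeError are hypothetical, chosen only for illustration; this is not how the text_generation client is actually written.

```python
import json


def decode_response(status_code: int, body: str):
    """Hypothetical helper: decode a JSON body only after checking the status.

    Not part of text_generation; it only illustrates the idea.
    """
    if status_code != 200:
        # Surface the HTTP status instead of letting json.loads fail on "".
        raise RuntimeError(f"server returned HTTP {status_code} with body {body!r}")
    return json.loads(body)


# Decoding the empty 404 body directly raises JSONDecodeError.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    print("undecodable body:", exc)

# Checking the status first reports the actual cause.
try:
    decode_response(404, "")
except RuntimeError as exc:
    print("clear error:", exc)
```

With a check like this, a wrong URL would show up as an HTTP 404 rather than as a JSON parsing failure.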

(I'm also confused about the intended relationship between the Python client library in this repo and the InferenceClient one in huggingface_hub. The docs refer to both, in different places.)

Information

[ ] Docker
[X] The CLI directly

Tasks

[X] An officially supported command
[ ] My own modifications

Reproduction

from text_generation import Client
client = Client('http://localhost:3000/')
client.chat(messages=[
    {
        "role": "system",
        "content": "You answer in bulleted lists."
    },
    {
        "role": "user",
        "content": "Why is the sky blue?"
    }
], max_tokens=100)

Expected behavior

The chat endpoint should behave the same as when the Client URI has no trailing /.

This could be as simple as setting self.base_url = base_url.rstrip('/') at:

https://github.com/huggingface/text-generation-inference/blob/e9f03f822a766f071620457bd977f7987e65b20e/clients/python/text_generation/client.py#L62
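
A minimal sketch of that normalization, assuming the client builds the chat URL by appending a path such as /v1/chat/completions to base_url. The path constant and both helper names are illustrative assumptions, not taken from client.py.

```python
# Assumed chat path, for illustration only.
CHAT_PATH = "/v1/chat/completions"


def naive_chat_url(base_url: str) -> str:
    """Hypothetical: plain concatenation, which keeps any trailing slash."""
    return f"{base_url}{CHAT_PATH}"


def normalized_chat_url(base_url: str) -> str:
    """Hypothetical: strip trailing slashes before appending the path."""
    return f"{base_url.rstrip('/')}{CHAT_PATH}"


for uri in ("http://localhost:3000", "http://localhost:3000/"):
    print(uri, "->", naive_chat_url(uri), "|", normalized_chat_url(uri))
```

With plain concatenation the trailing-slash URI yields a path starting with //, which would explain the 404. After rstrip('/') both forms of the URI produce the same chat URL.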

Apr 27 '24 13:04 kcarnold