feat: Document usage of request_id

Open AlexanderFillbrunn opened this issue 1 year ago • 0 comments

Feature request

Hi, the /v1/generate endpoint returns a request_id as part of its JSON response. I assume that when finished is set to false, I can use this request ID to query for the rest of the output later. However, the OpenAPI documentation served at http://127.0.0.1:3000/ does not seem to document which endpoint to use for that. Or am I mistaken, and this is not possible?

I am using the latest ghcr.io/bentoml/openllm Docker image like this:

    docker run --rm -it -p 3000:3000 --platform linux/x86_64 ghcr.io/bentoml/openllm start facebook/opt-1.3b --backend pt
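To make the question concrete, here is a minimal Python sketch of how the two fields could be inspected. The request body shape (a single "prompt" field) and the helper names `generate` and `summarize_response` are assumptions for illustration, not documented OpenLLM API. Only the /v1/generate path, the request_id and finished fields, and the local address come from the setup above.

```python
import json
from urllib import request

# Address published by the docker command above (-p 3000:3000).
BASE_URL = "http://127.0.0.1:3000"


def summarize_response(body: dict) -> tuple:
    """Return (request_id, finished) from a decoded /v1/generate response.

    Missing keys come back as None instead of raising.
    """
    return body.get("request_id"), body.get("finished")


def generate(prompt: str, base_url: str = BASE_URL) -> dict:
    """POST a prompt to /v1/generate and return the decoded JSON body.

    Assumption: the endpoint accepts a JSON object with a "prompt" field.
    The real schema may require more fields; check the OpenAPI page.
    """
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    req = request.Request(
        f"{base_url}/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires the container from the docker command to be running.
    request_id, finished = summarize_response(generate("Hello, my name is"))
    print("request_id:", request_id)
    print("finished:", finished)
```

If finished comes back as false, the open question is which endpoint, if any, accepts the printed request_id to fetch the remaining output.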

Kind regards, Alexander

Motivation

This feature would allow me to find out how the request_id can be used to follow up on incomplete queries, if this is at all possible.

Other

No response

AlexanderFillbrunn, Feb 12 '24 10:02