
[Bug]: Predibase Support is Lacking

Open colegottdank opened this issue 3 months ago • 3 comments

What happened?

Within the worker, we map Predibase requests to a base URL. The mapping currently uses https://api.app.predibase.com, but it should be https://serving.app.predibase.com

Also, Predibase returns the model and usage as response headers. When we extract the model, we currently check the response body, then the request body, then the path. We should check the header first and, if it does not exist, continue with the existing flow.
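A minimal sketch of the proposed lookup order, not the worker's actual code. `HeaderReader`, `ModelFallback`, and `resolveModel` are hypothetical names; the fallback callback stands in for the existing response body -> request body -> path flow, whatever it is called in the worker.

```typescript
// Minimal structural type so the sketch does not depend on a fetch runtime.
// The Fetch API `Headers` class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical stand-in for the existing lookup
// (response body -> request body -> path).
type ModelFallback = () => string | null;

// Prefer the x-model-id response header; if it is missing or blank,
// continue with the existing flow.
function resolveModel(
  headers: HeaderReader,
  fallback: ModelFallback
): string | null {
  const fromHeader = headers.get("x-model-id");
  if (fromHeader !== null && fromHeader.trim() !== "") {
    return fromHeader.trim();
  }
  return fallback();
}
```

Passing the existing flow as a callback means it only runs when the header is absent, so the current behavior for other providers should be unaffected.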

For the usage, we currently grab it from the response body. We need to grab it from the response headers instead. Here is what they will look like:

Response Headers

Info: These headers should be considered a beta feature, and are subject to change in the future.

- x-total-tokens: The number of tokens in both the input prompt and the output.
- x-prompt-tokens: The number of tokens in the prompt.
- x-generated-tokens: The number of generated tokens.
- x-total-time: The total time the request took in the inference server, in milliseconds.
- x-time-per-token: The average time it took to generate each output token, in milliseconds.
- x-queue-time: The time the request was in the internal inference server queue, in milliseconds.
- x-model-id: predibase/Meta-Llama-3.1-8B-Instruct-dequantized (example model name)
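As a rough sketch of reading usage from these headers: the header names come from the list above, while `HeaderReader`, `PredibaseUsage`, `parseCount`, and `usageFromHeaders` are hypothetical names, and the assumption that token counts arrive as non-negative integer strings is not confirmed here.

```typescript
// Minimal structural type; the Fetch API `Headers` class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical shape for the parsed usage; field names are illustrative.
interface PredibaseUsage {
  promptTokens: number | undefined;
  completionTokens: number | undefined;
  totalTokens: number | undefined;
}

// Parse a non-negative integer header value; anything else is treated as absent.
function parseCount(value: string | null): number | undefined {
  if (value === null) return undefined;
  const trimmed = value.trim();
  return /^\d+$/.test(trimmed) ? Number(trimmed) : undefined;
}

// Returns null when no usage headers are present, so the caller can
// fall back to reading usage from the response body.
function usageFromHeaders(headers: HeaderReader): PredibaseUsage | null {
  const promptTokens = parseCount(headers.get("x-prompt-tokens"));
  const completionTokens = parseCount(headers.get("x-generated-tokens"));
  let totalTokens = parseCount(headers.get("x-total-tokens"));

  // Derive the total if only the two parts were sent.
  if (
    totalTokens === undefined &&
    promptTokens !== undefined &&
    completionTokens !== undefined
  ) {
    totalTokens = promptTokens + completionTokens;
  }

  if (
    promptTokens === undefined &&
    completionTokens === undefined &&
    totalTokens === undefined
  ) {
    return null;
  }
  return { promptTokens, completionTokens, totalTokens };
}
```

Since the headers are described as a beta feature, treating malformed or missing values as absent rather than throwing keeps the body-based path available as a fallback.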

Lastly, add docs for Predibase support. Use our existing integration docs, such as those for DeepInfra and Fireworks AI, as a reference.

Relevant log output

No response

Twitter / LinkedIn details

No response

colegottdank • Oct 28 '24 23:10