'details' in /v1/chat/completions endpoint missing
System Info
This works:
stream_url = "http://localhost:8000/generate_stream"
payload = {
    "inputs": prompt,
    "parameters": {
        "stream": True,
        "details": True,
    },
}
and correctly returns 'details' in the final chunk:
data:{"index":49,"token":{"id":32000,"text":"<|im_end|>","logprob":-0.91845703,"special":true},"generated_text":"Example answer","details":{"finish_reason":"eos_token","generated_tokens":49,"seed":3169457846579174189}}
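For reference, the final chunk can be decoded like any other SSE data line. This is a minimal sketch that parses the exact sample line shown above (no live server involved; `parse_sse_chunk` is an illustrative helper name, not part of TGI's client API):

```python
import json

# The final-chunk SSE line from /generate_stream, copied from the report above
line = 'data:{"index":49,"token":{"id":32000,"text":"<|im_end|>","logprob":-0.91845703,"special":true},"generated_text":"Example answer","details":{"finish_reason":"eos_token","generated_tokens":49,"seed":3169457846579174189}}'

def parse_sse_chunk(line: str) -> dict:
    """Strip the SSE 'data:' prefix and decode the JSON payload."""
    return json.loads(line.removeprefix("data:"))

chunk = parse_sse_chunk(line)
details = chunk.get("details")  # only present on the final chunk of the stream
print(details["finish_reason"], details["generated_tokens"])  # eos_token 49
```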
but this endpoint does not:
stream_url = "http://localhost:8000/v1/chat/completions"
payload = {
    "messages": prompt,
    "details": True,  # neither of these works
    "parameters": {
        "details": True,  # neither of these works
    },
}
Reviewing server.rs at line 675, I can see that `details` is set to `true` by default, so in theory it should be included?
// build the request passing some parameters
let generate_request = GenerateRequest {
    inputs: inputs.to_string(),
    parameters: GenerateParameters {
        best_of: None,
        temperature: req.temperature,
        repetition_penalty,
        frequency_penalty: req.frequency_penalty,
        top_k: None,
        top_p: req.top_p,
        typical_p: None,
        do_sample: true,
        max_new_tokens,
        return_full_text: None,
        stop: Vec::new(),
        truncate: None,
        watermark: false,
        details: true,
        decoder_input_details: !stream,
        seed,
        top_n_tokens: None,
        grammar: tool_grammar.clone(),
    },
};
TGI version: 1.4.3, via the official Docker image.
`details` is an additional, separate field, so returning it would not interfere with the OpenAI standard format.
Information
- [X] Docker
- [X] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
stream_url = "http://localhost:8000/v1/chat/completions"
payload = {
    "messages": prompt,
    "details": True,  # neither of these works
    "parameters": {
        "details": True,  # neither of these works
    },
}
Expected behavior
Expected `details` (including `generated_tokens`) in the response:
data:{"index":49,"token":{"id":32000,"text":"<|im_end|>","logprob":-0.91845703,"special":true},"generated_text":"Example answer","details":{"finish_reason":"eos_token","generated_tokens":49,"seed":3169457846579174189}}
Hi @daz-williams, thank you for using TGI and opening this issue. However, this is the intended behavior, since details are not a concept in the chat API.
The /v1/chat/completions endpoint returns a ChatCompletionChunk or a ChatCompletion response, depending on whether you're streaming.
The ChatCompletion response includes choices.logprobs, choices.finish_reason, and usage, while the ChatCompletionChunk includes a finish_reason and logprobs; all of these are populated from details.
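In other words, the information the reporter wants is already exposed through the OpenAI-format fields. A minimal sketch of where it lives in a non-streaming ChatCompletion response (the JSON below is illustrative sample data in the OpenAI chat format, not real server output):

```python
import json

# Hypothetical non-streaming ChatCompletion response; values are illustrative
resp = json.loads("""{
  "choices": [
    {"index": 0, "finish_reason": "stop",
     "message": {"role": "assistant", "content": "Example answer"}}
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 49, "total_tokens": 61}
}""")

# usage.completion_tokens plays the role of details.generated_tokens,
# and choices[].finish_reason plays the role of details.finish_reason
generated_tokens = resp["usage"]["completion_tokens"]
finish_reason = resp["choices"][0]["finish_reason"]
print(generated_tokens, finish_reason)  # 49 stop
```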
Is there specific data you need from the chat endpoint?
Closing this issue, as this is the expected behavior (described above).