FinishReason enum not compatible with OpenAI API
System Info
macOS, stable Rust, master version of TGI
Reproduction
Use the async-openai crate to make a call whose generation ends on the server with eos_token (a finish reason that doesn't exist in the OpenAI API), and you'll get:
JSONDeserialize(Error("unknown variant eos_token, expected one of stop, length, tool_calls, content_filter, function_call", line: 1, column: 209))
Expected behavior
The response should deserialize successfully in strict, strongly typed clients.
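For reference, you can observe the offending value without the typed client by calling the OpenAI-compatible endpoint directly and inspecting the raw JSON. This is a minimal diagnostic sketch, assuming the reqwest (with the json feature) and serde_json crates and a local TGI server; it is not part of async-openai:

use serde_json::{json, Value};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Call the OpenAI-compatible completions endpoint directly.
    let body = json!({
        "model": "meta-llama/Meta-Llama-3-8B-Instruct",
        "prompt": "What is Hugging Face and what does it do?",
        "max_tokens": 2000
    });
    let resp: Value = reqwest::Client::new()
        .post("http://localhost:3000/v1/completions")
        .json(&body)
        .send()
        .await?
        .json()
        .await?;
    // Prints "eos_token" when generation ends on the end-of-sequence token,
    // which is the value async-openai's FinishReason enum cannot parse.
    println!("{}", resp["choices"][0]["finish_reason"]);
    Ok(())
}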
Hi @jondot, this appears to be an issue with the async-openai crate. The error thrown is JSONDeserialize, which is defined in the client library.
Additionally, for reference, I just tested the library locally and was able to get a response without any issues (not including the eos_token, as the library doesn't seem to expose that value):
use async_openai::config::OpenAIConfig;
use async_openai::types::CreateCompletionRequestArgs;
use async_openai::Client;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Point the client at the local TGI server's OpenAI-compatible base URL.
    let client =
        Client::with_config(OpenAIConfig::new().with_api_base("http://localhost:3000/v1"));

    let request = CreateCompletionRequestArgs::default()
        .model("meta-llama/Meta-Llama-3-8B-Instruct")
        .prompt("What is Hugging Face and what does it do?")
        .max_tokens(40_u32)
        .build()?;

    let response = client.completions().create(request).await?;

    println!("\nResponse (single):\n");
    for choice in response.choices {
        println!("{}", choice.text);
    }

    Ok(())
}
Response (single):
🤗
Hugging Face is an AI technology company that specializes in natural language processing (NLP) and artificial intelligence (AI) research. The company has developed a range of AI models and
I understand that, yes. But look at it differently: async-openai is strongly typed, and its serde infrastructure is strict. That makes it the perfect detector for finding out what isn't adhering to the OpenAI spec. With a Python or Node.js library, you'd more than likely never bump into this until some user-side logic tried to make sense of the return values.
Now, if you're not completely compatible with the strict OpenAI API, that's a different story. But if you are, you should either:
- Be fully type-compatible with it, or
- Expose a wider API of which the OpenAI API is a non-breaking subset
In this case, returning enum values that don't exist in the original OpenAI API makes the surface a breaking non-subset, as the sketch below shows.
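To make the subset argument concrete, here is a minimal sketch (hypothetical types, not async-openai's actual definitions) of why extra response fields are non-breaking under serde's defaults while extra enum values are not:

use serde::Deserialize;

#[derive(Deserialize)]
#[allow(dead_code)]
struct Choice {
    finish_reason: FinishReason,
}

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum FinishReason {
    Stop,
    Length,
}

fn main() {
    // An unknown *field* is ignored by default: a wider API that only adds
    // fields is a non-breaking superset of the spec.
    let extra_field = r#"{"finish_reason":"stop","extra":42}"#;
    assert!(serde_json::from_str::<Choice>(extra_field).is_ok());

    // An unknown enum *value* is rejected: adding finish reasons breaks
    // every strict client that enumerates the spec's variants.
    let extra_value = r#"{"finish_reason":"eos_token"}"#;
    assert!(serde_json::from_str::<Choice>(extra_value).is_err());
}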
Thanks for the quick response @jondot. I apologize, but I'm not sure I fully understand the issue or can reproduce it.
Would you be able to share an example of a request you're making that fails to be parsed by async-openai? Thank you!
Update
I've been able to reproduce this; my understanding of the issue is that the returned finish_reason is the value "eos_token" rather than "stop".
For reference:
use async_openai::config::OpenAIConfig;
use async_openai::types::CreateCompletionRequestArgs;
use async_openai::Client;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let client =
        Client::with_config(OpenAIConfig::new().with_api_base("http://localhost:3000/v1"));

    let request = CreateCompletionRequestArgs::default()
        .model("meta-llama/Meta-Llama-3-8B-Instruct")
        .prompt("What is Hugging Face and what does it do?")
        .max_tokens(2000_u32) // <- increased so generation ends on the EOS token
        .seed(1337)
        .build()?;

    let response = client.completions().create(request).await?;

    println!("\nResponse (single):\n");
    for choice in response.choices {
        println!("{}", choice.text);
    }

    Ok(())
}
Will follow up soon with changes to align the response with the values OpenAI expects. Thanks for noting this issue!
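For anyone following along, a rough sketch of what that alignment could look like (hypothetical type names, not TGI's actual code): map the internal finish reason onto the spec's values at the OpenAI-compatibility boundary instead of leaking eos_token.

use serde::Serialize;

// Internal reasons the server tracks (hypothetical names).
#[allow(dead_code)]
enum BackendFinishReason {
    EndOfSequenceToken,
    Length,
    StopSequence,
}

// The values the OpenAI completions spec allows for these cases.
#[derive(Serialize)]
#[serde(rename_all = "snake_case")]
enum OpenAiFinishReason {
    Stop,
    Length,
}

impl From<BackendFinishReason> for OpenAiFinishReason {
    fn from(reason: BackendFinishReason) -> Self {
        match reason {
            // Hitting an EOS token or a stop sequence both count as a
            // natural "stop" in OpenAI terms.
            BackendFinishReason::EndOfSequenceToken
            | BackendFinishReason::StopSequence => OpenAiFinishReason::Stop,
            BackendFinishReason::Length => OpenAiFinishReason::Length,
        }
    }
}

Serializing the converted enum then yields "stop", which strict clients like async-openai parse without issue.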