Support for structured output like OpenAI's response format or tools from Claude
起始日期 | Start Date
No response
实现PR | Implementation PR
No response
相关Issues | Reference Issues
No response
摘要 | Summary
In many cases, structured output can streamline a workflow. As far as I understand, MiniCPM currently only supports prompt engineering to shape its output; however, this adds considerable computational overhead. In my testing, I had to send 4 samples to get a usable result. Would it be possible to support predefined output schemas, like Pydantic models, so that the output format is guaranteed to be predictable?
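For context, the current workaround (re-prompt and validate until the reply parses) can be sketched as follows. This is a dependency-free illustration: `generate_fn`, `parse_with_retries`, and the plain-`json` field check are all stand-ins (a Pydantic model would do the validation declaratively).

```python
import json

# Stand-in schema check; a Pydantic model would express this declaratively.
REQUIRED_FIELDS = {"name": str, "date": str, "participants": list}

def validate_event(raw: str) -> dict:
    data = json.loads(raw)  # raises ValueError on malformed JSON
    for field, typ in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"field {field!r} missing or not {typ.__name__}")
    return data

def parse_with_retries(generate_fn, prompt, max_retries=4):
    """Re-prompt until the reply validates; mirrors the '4 samples' workaround."""
    last_error = None
    for _ in range(max_retries):
        raw = generate_fn(prompt)
        try:
            return validate_event(raw)
        except ValueError as exc:
            last_error = exc
            prompt += "\nReturn only a JSON object with keys name, date, participants."
    raise ValueError(f"no valid output after {max_retries} attempts: {last_error}")
```

Each failed attempt burns a full generation, which is exactly the compute overhead that native structured-output support would avoid.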
基本示例 | Basic Example
This is the example from OpenAI's documentation:
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed
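As I understand it, this kind of structured output is typically implemented with constrained decoding: at each step, tokens that would break the target schema are masked out before sampling. A toy character-level sketch of the idea (the date template, `model_dist`, and all names here are illustrative, not any real API):

```python
import random

# Toy constrained decoding: sample character by character, masking out any
# character that would violate a YYYY-MM-DD date template.
TEMPLATE = "NNNN-NN-NN"  # N = digit, '-' = literal dash

def allowed_chars(pos):
    if pos >= len(TEMPLATE):
        return set()            # template exhausted: generation must stop
    if TEMPLATE[pos] == "N":
        return set("0123456789")
    return {TEMPLATE[pos]}

def constrained_sample(model_dist, rng):
    """model_dist(prefix) returns {char: weight}; illegal chars are masked."""
    out = ""
    while True:
        legal = allowed_chars(len(out))
        if not legal:
            return out
        dist = {c: w for c, w in model_dist(out).items() if c in legal}
        if not dist:  # model put no mass on legal chars: fall back to uniform
            dist = {c: 1.0 for c in legal}
        chars, weights = zip(*sorted(dist.items()))
        out += rng.choices(chars, weights=weights)[0]
```

However badly the underlying distribution behaves, the output always matches the template, which is why this approach needs no retries.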
缺陷 | Drawbacks
I don't see any obvious drawbacks.
未解决问题 | Unresolved questions
No response
Perhaps you could attempt to preserve the KV cache to avoid the re-prefill computation.
Even if it can be cached, inference time will still increase because of the extra example prompt, right?
If it is newly added prompt input, it does require additional inference time, but caching still saves the prefill overhead of the few-shot prefix prompt.
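The trade-off above can be made concrete with back-of-the-envelope arithmetic (all token counts below are made up for illustration): caching the few-shot prefix's KV entries removes its prefill cost from every request after the first, while the decode-time cost of attending over the longer context remains.

```python
def prefill_tokens(prefix_len, request_len, prefix_cached):
    """Tokens that must be (re-)prefilled for one request."""
    return request_len if prefix_cached else prefix_len + request_len

# Hypothetical numbers: a 600-token few-shot prefix, 50-token requests.
PREFIX, REQUEST, N_REQUESTS = 600, 50, 100

# Without caching, every request re-prefills the full prefix.
without_cache = N_REQUESTS * prefill_tokens(PREFIX, REQUEST, False)

# With caching, the prefix is prefilled once and reused.
with_cache = PREFIX + N_REQUESTS * prefill_tokens(PREFIX, REQUEST, True)
```

With these numbers the cached variant prefills 5,600 tokens instead of 65,000, roughly a 12x reduction in prefill work, even though per-request decode time is unchanged.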