Support for structured output like OpenAI's response format or tools from Claude
起始日期 | Start Date
No response
实现PR | Implementation PR
No response
相关Issues | Reference Issues
No response
摘要 | Summary
In many cases, structured output can streamline a workflow. As far as I understand, MiniCPM currently only supports prompt engineering to shape its output; however, this adds considerable computational overhead. In my testing, I had to send 4 samples to get a usable result. Would it be possible to support predefined output schemas, like Pydantic models, so that the output format is guaranteed to be predictable?
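For context, the current workaround (re-prompt and validate until the reply parses) can be sketched as follows. This is a dependency-free illustration: `generate_fn`, `parse_with_retries`, and the plain-`json` field check are all stand-ins (a Pydantic model would do the validation declaratively).

```python
import json

# Stand-in schema check; a Pydantic model would express this declaratively.
REQUIRED_FIELDS = {"name": str, "date": str, "participants": list}

def validate_event(raw: str) -> dict:
    data = json.loads(raw)  # raises ValueError on malformed JSON
    for field, typ in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"field {field!r} missing or not {typ.__name__}")
    return data

def parse_with_retries(generate_fn, prompt, max_retries=4):
    """Re-prompt until the reply validates; mirrors the '4 samples' workaround."""
    last_error = None
    for _ in range(max_retries):
        raw = generate_fn(prompt)
        try:
            return validate_event(raw)
        except ValueError as exc:
            last_error = exc
            prompt += "\nReturn only a JSON object with keys name, date, participants."
    raise ValueError(f"no valid output after {max_retries} attempts: {last_error}")
```

Each failed attempt burns a full generation, which is exactly the compute overhead that native structured-output support would avoid.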
基本示例 | Basic Example
This is the example from OpenAI's documentation:
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed
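As I understand it, this kind of structured output is typically implemented with constrained decoding: at each step, tokens that would break the target schema are masked out before sampling. A toy character-level sketch of the idea (the date template, `model_dist`, and all names here are illustrative, not any real API):

```python
import random

# Toy constrained decoding: sample character by character, masking out any
# character that would violate a YYYY-MM-DD date template.
TEMPLATE = "NNNN-NN-NN"  # N = digit, '-' = literal dash

def allowed_chars(pos):
    if pos >= len(TEMPLATE):
        return set()            # template exhausted: generation must stop
    if TEMPLATE[pos] == "N":
        return set("0123456789")
    return {TEMPLATE[pos]}

def constrained_sample(model_dist, rng):
    """model_dist(prefix) returns {char: weight}; illegal chars are masked."""
    out = ""
    while True:
        legal = allowed_chars(len(out))
        if not legal:
            return out
        dist = {c: w for c, w in model_dist(out).items() if c in legal}
        if not dist:  # model put no mass on legal chars: fall back to uniform
            dist = {c: 1.0 for c in legal}
        chars, weights = zip(*sorted(dist.items()))
        out += rng.choices(chars, weights=weights)[0]
```

However badly the underlying distribution behaves, the output always matches the template, which is why this approach needs no retries.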
缺陷 | Drawbacks
I don't see any obvious drawbacks.
未解决问题 | Unresolved questions
No response
Perhaps you could attempt to preserve the KV cache to avoid the re-prefill computation.
Even if it can be cached, inference time will still increase because of the extra example prompt, right?
If it is newly added prompt input, it does require additional inference time, but caching still saves the prefill overhead of the few-shot prefix prompt.
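The trade-off above can be made concrete with back-of-the-envelope arithmetic (all token counts below are made up for illustration): caching the few-shot prefix's KV entries removes its prefill cost from every request after the first, while the decode-time cost of attending over the longer context remains.

```python
def prefill_tokens(prefix_len, request_len, prefix_cached):
    """Tokens that must be (re-)prefilled for one request."""
    return request_len if prefix_cached else prefix_len + request_len

# Hypothetical numbers: a 600-token few-shot prefix, 50-token requests.
PREFIX, REQUEST, N_REQUESTS = 600, 50, 100

# Without caching, every request re-prefills the full prefix.
without_cache = N_REQUESTS * prefill_tokens(PREFIX, REQUEST, False)

# With caching, the prefix is prefilled once and reused.
with_cache = PREFIX + N_REQUESTS * prefill_tokens(PREFIX, REQUEST, True)
```

With these numbers the cached variant prefills 5,600 tokens instead of 65,000, roughly a 12x reduction in prefill work, even though per-request decode time is unchanged.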