
Enforcing structured output (cf. Pydantic) and offloading to temporary variables? [design patterns?]

Open · davidbernat opened this issue 2 weeks ago · 1 comment

Question

The OpenCode team is doing such a fantastic job that I want to begin building a new class of agents for myself. This is, again, a design pattern which I simply have not seen in my organic exploration or in the OpenCode tutorials.

For example, suppose I want to extract each proper name and profession from a page of text (or products and UPCs from a receipt). The "software engineer" solution would use a package such as Pydantic to create a structured class that describes the JSON result we want, and use PydanticAI to feed that into a client API call to the LLM backend. Pydantic's non-magic then happens behind the scenes to modify and coerce the prompts and output, initiating retries and so on, so that the API call either fails or returns JSON only, in the requested format. There are issues with small LLMs, but by and large this works easily for an engineer.

import asyncio

from pydantic import BaseModel, Field
from pydantic_ai import Agent

# this defines the structure of the output
class Employee(BaseModel):
    name: str = Field(description="the name of the individual listed on their employee card")
    role: str = Field(description="the functions the individual has permission to perform for salary")

# we then create a constricted agent which exercises prompt manipulation behind the scenes
agent = Agent(
    "my.favorite.ollama.model",
    output_type=list[Employee],
    instructions=(
        "You perform information retrieval to extract verbatim text phrases from unstructured text."
    ),
)

async def main():
    result = await agent.run("Here is a long document of text containing..... .....and in the end the love you make.")
    # result.output is guaranteed to be a list of Employee data objects, and we can do with it as we wish
    print(result.output)

asyncio.run(main())

The underpinnings of PydanticAI are internal to their package, but broadly speaking there is nothing particularly complicated here, certainly not as a developer pattern. In fact, much of this is so straightforward that in a pinch a developer could make an MCP server which accepts an at-runtime schema for a generic data model (BaseModel inheritance) and use that MCP server with any LLM to enforce structured data return.
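To make that concrete, here is a minimal sketch of the generic MCP tool I have in mind, written against the official mcp Python SDK (FastMCP) and the jsonschema package. The server name, tool name, and behaviour are my own assumptions, not anything OpenCode or PydanticAI ships today.

# hypothetical_structured_output_mcp.py (a sketch, not an existing OpenCode feature)
import json

import jsonschema                        # pip install jsonschema
from mcp.server.fastmcp import FastMCP   # pip install mcp

mcp = FastMCP("structured-output")       # hypothetical server name

@mcp.tool()
def validate_structured(payload: str, schema: str) -> str:
    """Validate an LLM's JSON output against a JSON Schema supplied at runtime.

    Returns the normalized JSON on success; raises (so the caller can retry) on failure.
    """
    data = json.loads(payload)                                      # fails loudly on non-JSON
    jsonschema.validate(instance=data, schema=json.loads(schema))   # fails loudly on schema mismatch
    return json.dumps(data)

if __name__ == "__main__":
    mcp.run()   # stdio transport, so an agent client can be pointed at this server

The open question, for me, is whether a tool like this can also drive the retry loop itself, or whether the agent has to re-prompt on every failure.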

OK. How do I do the same thing using OpenCode, as a vibe-analyst, without writing any of the above code? I hope this question makes sense to the junior developers reading this paragraph as they arrive from other sources.

In practice, from an agent prompt in OpenCode, I can ask for the data to be structured, but I have no guarantee that the data is structured (ignoring the retry mechanisms which PydanticAI also implements). Is this correct? This is a very simple "vibe pattern" which I presume users are often requesting, so again I am somewhat befuddled that I have not found an industry-standard solution yet. It makes me feel as though I am missing out.
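For reference, the validate-and-retry mechanism I keep hand-waving about is roughly the loop below (a sketch using plain Pydantic v2; call_llm is a hypothetical stand-in for whatever client actually talks to the model):

from pydantic import BaseModel, ValidationError

class Employee(BaseModel):
    name: str
    role: str

def extract_with_retries(call_llm, prompt: str, max_retries: int = 3) -> Employee:
    """Ask the model for JSON and coerce it into an Employee, retrying on validation failure."""
    last_error = None
    for _ in range(max_retries):
        raw = call_llm(prompt)                         # call_llm: (str) -> str, hypothetical client
        try:
            return Employee.model_validate_json(raw)   # parse and validate in one step
        except ValidationError as exc:
            last_error = exc
            # feed the validation error back so the model can correct itself on the next attempt
            prompt = f"{prompt}\n\nYour last answer was invalid: {exc}. Return only valid JSON."
    raise RuntimeError(f"extraction failed after {max_retries} attempts: {last_error}")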

Let me split this ticket here into two different considerations:

  1. What is the expected "vibe analyst" pattern for, say, "Hey, extract the employees from this text, and store this as an offline temporary variable called myEmployees," such that OpenCode will
     a. guarantee it throws an error if the extraction does not validate,
     b. store that data in an offline file which OpenCode knows to reference again later, and
     c. interact with that file from context or prompts (similar to how the @ symbol works across .md design files)?
     If I were teaching a first-year graduate course on vibe-analysts at a university, this would be our week-3 assignment, so I am confused that none of this has come up in several dozen hours of organic reading. If this does not exist it would be a profitable feature to engineer (as an MCP server, presumably), and please reach out. A sketch of what I mean follows after this list.

  2. Repeatability is hugely important in industry, and yet vibe-analyst work is its specific antithesis. In other words, any industrialized vibe-analyst is surely creating a result through vibes first, then implementing some structured version of its execution. If this happens at the OpenCode level, it would be as simple as recording "the sequence of MCP executions made throughout the session" (plus reduced prompts) and exporting that as deterministic(-ish) code into a disk-saved cartridge library of what would essentially be MCP servlet cartridges with dependency requirements. In fact, from an industrialization perspective, the OpenCode "goal" is to produce these sequences as coerced by the capabilities of one LLM or another. Notice the huge security vulnerabilities of "distillation" here too: create the MCP sequence once, then repeat the MCP sequence without an LLM or with tiny LLMs. Yet I do not see these capabilities as "native" in OpenCode (they should be!), and I have not heard any discussion of design patterns for them. Again, this is a working group I would join because I will be doing it anyway; but I cannot yet understand how every vibe-analyst is not also looking to do this from day three. (Nobody said I was smart.) A sketch of the recording/replay idea also follows below.
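For (1), the closest thing I can picture is an MCP server that owns the "offline temporary variables". A minimal sketch follows, again written against the mcp Python SDK (FastMCP); the tool names, the .opencode-vars/ directory, and the behaviour are all hypothetical, not existing OpenCode features.

# hypothetical_variable_store_mcp.py (a sketch of "offline temporary variables")
import json
from pathlib import Path

from mcp.server.fastmcp import FastMCP   # pip install mcp

STORE = Path(".opencode-vars")            # hypothetical on-disk home for named variables
STORE.mkdir(exist_ok=True)

mcp = FastMCP("variable-store")           # hypothetical server name

@mcp.tool()
def store_variable(name: str, payload: str) -> str:
    """Persist JSON under a name (e.g. myEmployees) so later prompts can reference it."""
    data = json.loads(payload)             # raise if the agent hands us something that is not JSON
    (STORE / f"{name}.json").write_text(json.dumps(data, indent=2))
    return f"stored {name}"

@mcp.tool()
def load_variable(name: str) -> str:
    """Read a previously stored variable back into the conversation."""
    path = STORE / f"{name}.json"
    if not path.exists():
        raise FileNotFoundError(f"no stored variable named {name}")
    return path.read_text()

if __name__ == "__main__":
    mcp.run()   # stdio transport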

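For (2), the "cartridge" idea reduces to recording the tool-call sequence and replaying it later without an LLM (or with a tiny one). A minimal sketch of what I mean, where invoke_tool is a hypothetical callable that dispatches a single MCP call:

# hypothetical_cartridge.py (record a session's tool calls, replay them later)
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Callable

@dataclass
class Cartridge:
    """A replayable record of the tool calls an agent made during one session."""
    steps: list[dict[str, Any]] = field(default_factory=list)

    def record(self, tool: str, arguments: dict[str, Any], result: Any) -> None:
        self.steps.append({"tool": tool, "arguments": arguments, "result": result})

    def save(self, path: str) -> None:
        with open(path, "w") as fh:
            json.dump(asdict(self), fh, indent=2)

    @classmethod
    def load(cls, path: str) -> "Cartridge":
        with open(path) as fh:
            return cls(**json.load(fh))

    def replay(self, invoke_tool: Callable[[str, dict[str, Any]], Any]) -> list[Any]:
        """Re-run the recorded sequence deterministically(-ish), no LLM in the loop."""
        return [invoke_tool(step["tool"], step["arguments"]) for step in self.steps]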
Comments? I am cross-linking https://github.com/anomalyco/opencode/issues/6665 because the hallucination problems of LLMs not knowing their own data-type limitations, and of executing odd runtime-created code, are both examples of a workflow that could be bottled, offloaded to an MCP server, and referenced in future sessions (if the solutions worked).

I am really excited to solve these issues this week as the entire year is ahead of us all. You are invited to email me if you want to discuss this in ways not appropriate for a public issue comment. If we were co-located I would say we should pop the champagne on this one.

davidbernat · Jan 03 '26 15:01