
Literal while streaming does not work

Open ahuang11 opened this issue 1 year ago • 2 comments

  • [x] This is actually a bug report.
  • [ ] I am not getting good LLM Results
  • [ ] I have tried asking for help in the community on discord or discussions and have not received a response.
  • [ ] I have tried searching the documentation and have not found an answer.

What Model are you using?

  • [x] gpt-3.5-turbo
  • [ ] gpt-4-turbo
  • [ ] gpt-4
  • [ ] Other (please specify)

Describe the bug

While streaming a Literal, it crashes with a validation error:

ValidationError: 1 validation error for PartialUserInfo
name
  Input should be 'Bob', 'Alice' or 'John' [type=literal_error, input_value='', input_type=str]
    For further information visit https://errors.pydantic.dev/2.8/v/literal_error

To Reproduce

import instructor
from typing import Literal
from instructor import Partial
from pydantic import BaseModel
from openai import AsyncOpenAI


# Define your desired output structure
class UserInfo(BaseModel):
    name: Literal["Bob", "Alice", "John"]
    age: int


# Patch the OpenAI client
client = instructor.from_openai(AsyncOpenAI())

# Extract structured data from natural language
user_info = await client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=Partial[UserInfo],
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
    stream=True
)

async for chunk in user_info:
    print(chunk)

Expected behavior

Validation should happen once the stream is finished.

Related: https://github.com/jxnl/instructor/pull/362


ahuang11 avatar Jul 24 '24 22:07 ahuang11

What's the ideal check? I don't think there's a good answer unless you want to add a custom before validator.
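For illustration, a before validator along those lines could treat a not-yet-complete value as missing instead of raising. This is a hypothetical sketch (not part of instructor), reusing the field names from the repro above:

```python
from typing import Literal, Optional
from pydantic import BaseModel, field_validator


class UserInfo(BaseModel):
    # Optional so the model still validates before the field has fully streamed in
    name: Optional[Literal["Bob", "Alice", "John"]] = None
    age: Optional[int] = None

    @field_validator("name", mode="before")
    @classmethod
    def _drop_partial_name(cls, v):
        # Treat anything that isn't yet a complete allowed value as "missing"
        return v if v in ("Bob", "Alice", "John") else None


print(UserInfo(name="Jo"))             # partial chunk -> name=None age=None
print(UserInfo(name="John", age=30))   # name='John' age=30
```

The downside is that every Literal field in the model needs this treatment by hand.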

jxnl avatar Jul 25 '24 21:07 jxnl

I might be missing something, but during the JSON parsing in the PartialBase class, from_json is called with partial_mode="trailing-strings". If we change it to "on" instead, incomplete strings will be discarded. Something like this:

obj = from_json(
    (potential_object or "{}").encode(), partial_mode="on"
)

I'll be happy to tackle it if I didn't miss anything :D

roeybc avatar Jul 31 '24 01:07 roeybc

@roeybc That would be very helpful, thanks. I am facing the same issue at the moment.

widike avatar Aug 23 '24 12:08 widike

I might be missing something, but during the JSON parsing in the PartialBase class, from_json is called with partial_mode="trailing-strings". If we change it to "on" instead, incomplete strings will be discarded. Something like this:

obj = from_json(
    (potential_object or "{}").encode(), partial_mode="on"
)

I'll be happy to tackle it if I didn't miss anything :D

this would be a great PR cc @ivanleomk

jxnl avatar Aug 24 '24 02:08 jxnl

Awesome, on it 😀

roeybc avatar Aug 24 '24 03:08 roeybc

Just saw a PR @roeybc , lemme go test it out. Thanks for the submission!

ivanleomk avatar Aug 24 '24 06:08 ivanleomk

Hmm, ok this is a good solution but it also breaks our ability to stream partial string fields. Is this a trade off we want to make @jxnl?

import instructor
from pydantic import BaseModel
from openai import AsyncOpenAI
import asyncio


# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Patch the OpenAI client
client = instructor.from_openai(AsyncOpenAI())


async def generate():
    # Extract structured data from natural language
    user_info = client.chat.completions.create_partial(
        model="gpt-3.5-turbo",
        response_model=UserInfo,
        messages=[{"role": "user", "content": "John Doe is 30 years old."}],
        stream=True,
    )

    async for chunk in user_info:
        print(chunk)


if __name__ == "__main__":
    asyncio.run(generate())

We can see this when the completion below is being generated.

name=None age=None
name=None age=None
name=None age=None
name=None age=None
name=None age=None
name=None age=None
name='John Doe' age=None
name='John Doe' age=None
name='John Doe' age=None
name='John Doe' age=30
name='John Doe' age=30

Any suggestions for getting around this @roeybc ?

ivanleomk avatar Aug 24 '24 06:08 ivanleomk

Fair point. Maybe set it to "on" automatically if Literals are present in the class? For my own use I created a monkey patch for the case where I have Literals in my model.

from typing import Literal
from jiter import from_json
from instructor.dsl.partial import PartialBase


def monkey_patch_instructor_partial(
    partial_mode: Literal["off", "on", "trailing-strings"] = "trailing-strings"
):

    def patched_model_from_chunks(cls, json_chunks, **kwargs):
        potential_object = ""
        partial_model = cls.get_partial_model()
        for chunk in json_chunks:
            potential_object += chunk
            obj = from_json(
                (potential_object or "{}").encode(),
                partial_mode=partial_mode,  # Change "trailing-strings" to "on" for Literal support
            )
            obj = partial_model.model_validate(obj, strict=None, **kwargs)
            yield obj

    async def patched_model_from_chunks_async(cls, json_chunks, **kwargs):
        potential_object = ""
        partial_model = cls.get_partial_model()
        async for chunk in json_chunks:
            potential_object += chunk
            obj = from_json(
                (potential_object or "{}").encode(),
                partial_mode=partial_mode,  # Change "trailing-strings" to "on" for Literal support
            )
            obj = partial_model.model_validate(obj, strict=None, **kwargs)
            yield obj

    # Replace the original methods with the patched ones
    PartialBase.model_from_chunks = classmethod(patched_model_from_chunks)
    PartialBase.model_from_chunks_async = classmethod(patched_model_from_chunks_async)


def enable_instructor_literal_patch():
    monkey_patch_instructor_partial(partial_mode="on")


def disable_instructor_literal_patch():
    monkey_patch_instructor_partial(partial_mode="trailing-strings")

widike avatar Aug 24 '24 13:08 widike

I think this is still broken:

import instructor
from typing import Literal
from instructor import Partial
from pydantic import BaseModel
from openai import AsyncOpenAI


# Define your desired output structure
class UserInfo(BaseModel):
    name: Literal["Bob", "Alice", "John"]
    age: int


# Patch the OpenAI client
client = instructor.from_openai(AsyncOpenAI())

# Extract structured data from natural language
user_info = await client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Partial[UserInfo],
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
    stream=True
)

async for chunk in user_info:
    print(chunk)

ValidationError: 1 validation error for PartialUserInfo
name
  Input should be 'Bob', 'Alice' or 'John' [type=literal_error, input_value='', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/literal_error

ahuang11 avatar Jan 28 '25 16:01 ahuang11

hey @ahuang11, you'll need to update your base model definition with this mixin (https://python.useinstructor.com/concepts/partial/):

from instructor.dsl.partial import PartialLiteralMixin

class UserInfo(BaseModel, PartialLiteralMixin):
    name: Literal["Bob", "Alice", "John"]
    age: int

The reason for this change is that we wanted to preserve both field-level streaming and Literal support, so we opted for a mixin as the best intermediate workaround.

ivanleomk avatar Jan 30 '25 00:01 ivanleomk