instructor
feat: update and allow strict mode
addresses #612
:rocket: This description was created by Ellipsis for commit 291e3e59f937417d5dae99b3786850a40d091e41
Summary:
The pull request introduces a new `strict` parameter, defaulting to `True`, to several methods in the `Instructor` and `AsyncInstructor` classes and to the `new_create_async` and `new_create_sync` functions.
Key points:
- Added a `strict` parameter to several methods in the `Instructor` and `AsyncInstructor` classes in `instructor/client.py`.
- Added a `strict` parameter to the `new_create_async` and `new_create_sync` functions in `instructor/patch.py`.
- The `strict` parameter is a boolean that defaults to `True`.
Deploying instructor with Cloudflare Pages
Latest commit: 291e3e5
Status: ✅ Deploy successful!
Preview URL: https://6128e958.instructor.pages.dev
Branch Preview URL: https://allow-strict-in-create.instructor.pages.dev
I think this is a good change and should be merged but it doesn't fulfill my intent behind #612.
#612 is about allowing control characters in JSON strings because this happens so commonly with Claude's models.
Pydantic's `model_validate_json(..., strict=False)` does not allow control characters in strings, but it does enable Pydantic's non-strict (lax) type coercion, which might be desirable to clients in some cases.
The standard library's `json.loads(..., strict=False)` does one thing: it allows control characters in JSON strings, which is what I want in #612.
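To make the standard-library half of that distinction concrete, here is a minimal sketch using only `json`. The sample payload and the helper name `try_loads` are made up for illustration; only `json.loads` and its `strict` flag come from the discussion above.

```python
import json

# A JSON document whose string value contains a raw, unescaped newline.
# In this Python literal, \n becomes a real newline character inside the JSON string.
raw = '{"note": "line one\nline two"}'


def try_loads(text: str, *, strict: bool):
    """Return the parsed object, or the decoder's error message on failure."""
    try:
        return json.loads(text, strict=strict)
    except json.JSONDecodeError as exc:
        return f"JSONDecodeError: {exc.msg}"


# Default (strict) decoding rejects the raw control character.
strict_result = try_loads(raw, strict=True)

# strict=False accepts it and keeps the newline in the decoded string.
lenient_result = try_loads(raw, strict=False)

print(strict_result)
print(lenient_result)
```

Note that `strict=False` here does nothing about types: every value comes back exactly as the JSON encodes it, which is why it is a different switch from Pydantic's `strict`.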
If you want to merge these non-strict semantics, the change looks like this for the JSON-parsing functions in `function_calls.py`:
@classmethod
def parse_anthropic_json(
    cls: Type[BaseModel],
    completion,
    validation_context: Optional[Dict[str, Any]] = None,
    strict: Optional[bool] = None,
) -> BaseModel:
    from anthropic.types import Message
    assert isinstance(completion, Message)
    text = completion.content[0].text
    extra_text = extract_json_from_codeblock(text)
    if strict:
        return cls.model_validate_json(
            extra_text, context=validation_context, strict=strict
        )
    else:
        # Allow control characters.
        parsed = json.loads(extra_text, strict=False)
        # Pydantic non-strict: https://docs.pydantic.dev/latest/concepts/strict_mode/
        return cls.model_validate(parsed, context=validation_context, strict=strict)
Maybe you don't want to merge these semantics in instructor's `strict`, in which case there would need to be two separate arguments to toggle these different capabilities.
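For discussion, the two-argument alternative could look roughly like the sketch below. Everything named here is hypothetical: `parse_json_response`, the `allow_control_characters` flag, and the stand-in `demo_validate` are not part of instructor or Pydantic. The validator is passed in as a plain callable so the sketch runs with the standard library alone; in instructor it would presumably be something like `cls.model_validate(parsed, strict=strict)`.

```python
import json
from typing import Any, Callable, Optional


def parse_json_response(
    text: str,
    validate: Callable[[Any, Optional[bool]], Any],
    *,
    strict: Optional[bool] = None,
    allow_control_characters: bool = False,
) -> Any:
    """Hypothetical helper: decode leniency and validation strictness are separate knobs."""
    # Only allow_control_characters affects how the JSON text is decoded.
    parsed = json.loads(text, strict=not allow_control_characters)
    # strict is forwarded untouched to the validation step.
    return validate(parsed, strict)


def demo_validate(obj: Any, strict: Optional[bool]) -> dict:
    """Stand-in validator (assumption): coerces age to int unless strict is truthy."""
    age = obj["age"]
    if strict and not isinstance(age, int):
        raise TypeError("age must be an int in strict mode")
    return {"name": obj["name"], "age": int(age)}


# The name contains a raw tab character and the age is a string.
payload = '{"name": "A\tB", "age": "3"}'

lenient = parse_json_response(
    payload, demo_validate, strict=False, allow_control_characters=True
)
print(lenient)
```

With this split, a caller can accept control characters while still asking for strict validation, or the reverse, which the single merged `strict` flag cannot express.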
If this is functionality you want in `instructor`, I'm happy to submit a PR subject to however you want to design this.
This functionality was available at some point; I'm not sure when it was removed: https://github.com/jxnl/instructor/pull/75
Let's allow this too; I'll merge this first. Sorry for the delay, I was on vacation!