Fix JSON schema $ref resolution in nested Pydantic models
Tracking issue
Why are the changes needed?
Pydantic v2 generates JSON schemas with $ref references for nested models (e.g., {"$ref": "#/$defs/SingleObj"}). The schema parsing logic in type_engine.py was attempting to access property_val["type"] before resolving these references, causing KeyError: 'type'.
from typing import Optional

from pydantic import BaseModel

class SingleObj(BaseModel):
    a: str

class TestDatum(BaseModel):
    b: SingleObj  # Direct ref: {"$ref": "#/$defs/SingleObj"}
    d: list[SingleObj]  # Array items ref: {"items": {"$ref": ...}}
    e: Optional[list[SingleObj]]  # anyOf with ref: {"anyOf": [{"items": {"$ref": ...}}]}
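The failure mode can be reproduced without Pydantic or flytekit. The sketch below uses a hand-written schema fragment shaped like the one described above and a hypothetical `naive_type_of` helper that stands in for an unguarded `property_val["type"]` access. Both are assumptions made for illustration, not the code in type_engine.py.

```python
# Hand-written fragment mimicking the schema shape described above.
schema = {
    "$defs": {
        "SingleObj": {
            "type": "object",
            "properties": {"a": {"type": "string"}},
        }
    },
    "properties": {
        "b": {"$ref": "#/$defs/SingleObj"},
        "d": {"type": "array", "items": {"$ref": "#/$defs/SingleObj"}},
    },
}


def naive_type_of(property_val: dict) -> str:
    """Hypothetical stand-in: reads "type" without resolving $ref first."""
    return property_val["type"]


def try_naive(property_val: dict) -> str:
    """Return the type, or a description of the failure."""
    try:
        return naive_type_of(property_val)
    except KeyError as exc:
        return f"KeyError: {exc}"


direct = try_naive(schema["properties"]["b"])          # $ref dict, no "type" key
array = try_naive(schema["properties"]["d"])           # has "type": "array"
items = try_naive(schema["properties"]["d"]["items"])  # $ref dict again
print(direct, array, items, sep="\n")
```

The array property itself is readable, but both the direct reference and the array's `items` fail, which matches the `KeyError: 'type'` reported here.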
What changes were proposed in this pull request?
Added $ref resolution logic:
- _resolve_json_schema_ref() dereferences schema paths like #/$defs/ModelName with proper error handling
- Resolves references before type access, preventing KeyError
Updated schema processing functions:
- _handle_json_schema_property() now accepts full schema and resolves $ref before processing
- _get_element_type() handles resolved object types by converting them to dataclasses
- Fixed type annotation: Dict[str, str] → Dict[str, Any] for schema properties
Propagated schema context:
- generate_attribute_list_from_dataclass_json_mixin() passes schema to helper functions
- All recursive calls maintain schema context for nested reference resolution
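As a rough illustration of the dereferencing step, here is a minimal sketch of a local `$ref` resolver. The names `resolve_ref` and `resolved` are hypothetical and the error handling is one possible choice; this is not the body of `_resolve_json_schema_ref()`.

```python
from typing import Any, Dict


def resolve_ref(schema: Dict[str, Any], ref: str) -> Dict[str, Any]:
    """Follow a local reference such as "#/$defs/SingleObj" inside `schema`.

    Hypothetical helper. JSON Pointer escapes (~0, ~1) are not handled.
    """
    if not ref.startswith("#/"):
        raise ValueError(f"Only local references are supported, got {ref!r}")
    node: Any = schema
    for part in ref[2:].split("/"):
        if not isinstance(node, dict) or part not in node:
            raise ValueError(f"Cannot resolve {ref!r}: missing segment {part!r}")
        node = node[part]
    if not isinstance(node, dict):
        raise ValueError(f"{ref!r} does not point at a schema object")
    return node


def resolved(schema: Dict[str, Any], property_val: Dict[str, Any]) -> Dict[str, Any]:
    """Return property_val, with a top-level $ref replaced by its target."""
    if "$ref" in property_val:
        return resolve_ref(schema, property_val["$ref"])
    return property_val


schema = {
    "$defs": {
        "SingleObj": {"type": "object", "properties": {"a": {"type": "string"}}}
    }
}
target = resolved(schema, {"$ref": "#/$defs/SingleObj"})
plain = resolved(schema, {"type": "string"})
print(target["type"], plain["type"])
```

Raising a descriptive `ValueError` for unknown or non-local references keeps a malformed schema from surfacing later as an unrelated `KeyError`.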
How was this patch tested?
Added test_nested_pydantic_model_with_list covering:
- Direct nested models with $ref
- Lists of nested models with $ref in items
- Optional lists with anyOf containing $ref
All existing pydantic transformer tests (30/30) and dataclass tests (38/38) pass.
Setup process
N/A
Screenshots
N/A
Check all the applicable boxes
- [ ] I updated the documentation accordingly.
- [x] All new and existing tests passed.
- [ ] All commits are signed-off.
Related PRs
Docs link
Original prompt
Fix handling of JSON schema $ref references in nested Pydantic models
Problem
When Pydantic v2 generates JSON schemas for nested models (especially in lists like list[NestedModel]), it uses $ref references to definitions. The _handle_json_schema_property function in type_engine.py fails with KeyError: 'type' because it tries to access the "type" key before resolving the $ref.
Example that fails:
from pydantic import BaseModel

class SingleObj(BaseModel):
    a: str

class TestDatum(BaseModel):
    a: str
    b: SingleObj
    c: list[str]
    d: list[SingleObj]  # This fails - list of nested objects
This generates a schema like:
{
  "properties": {
    "d": {
      "anyOf": [
        { "type": "array", "items": { "$ref": "#/$defs/SingleObj" } },
        { "type": "null" }
      ]
    }
  }
}
The error occurs because:
- _handle_json_schema_property processes the anyOf and recursively calls itself for each item
- For the array item, it encounters {"type": "array", "items": {"$ref": "#/$defs/SingleObj"}}
- When processing the items, it tries to access property_val["type"] on the $ref dict
- This fails because $ref dicts only have a "$ref" key, not a "type" key
Solution
The _handle_json_schema_property function needs to resolve $ref references before attempting to access any schema properties. This should be done:
- At the beginning of the function (before any property access)
- Pass the full schema as a parameter to enable reference resolution
- Handle the reference path format #/$defs/ModelName or #/definitions/ModelName
The existing generate_attribute_list_from_dataclass_json function already has logic to handle $ref for nested dataclasses, and we should apply similar logic to generate_attribute_list_from_dataclass_json_mixin.
Also need to handle $ref in array items and other nested structures.
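The three points in the solution (resolve first, pass the full schema along, accept both reference prefixes) can be sketched as a small recursive walker. `describe` is a hypothetical function that returns a readable type string. It only illustrates the control flow and is not `_handle_json_schema_property`, which builds real types.

```python
from typing import Any, Dict


def describe(schema: Dict[str, Any], prop: Dict[str, Any]) -> str:
    """Return a readable type description for `prop`, resolving $ref first."""
    # Step 1: resolve the reference before touching any other key.
    if "$ref" in prop:
        prefix, _, name = prop["$ref"].rpartition("/")
        if prefix not in ("#/$defs", "#/definitions"):
            raise ValueError(f"Unsupported reference: {prop['$ref']!r}")
        prop = schema[prefix[2:]][name]
    # Step 2: unions recurse with the same full schema.
    if "anyOf" in prop:
        return " | ".join(describe(schema, option) for option in prop["anyOf"])
    # Step 3: array items may themselves be references.
    if prop["type"] == "array":
        return f"list[{describe(schema, prop['items'])}]"
    if prop["type"] == "object":
        fields = ", ".join(
            f"{key}: {describe(schema, val)}"
            for key, val in prop["properties"].items()
        )
        return f"{{{fields}}}"
    return prop["type"]


single_obj = {"type": "object", "properties": {"a": {"type": "string"}}}
schema = {
    "$defs": {"SingleObj": single_obj},
    "properties": {
        "d": {
            "anyOf": [
                {"type": "array", "items": {"$ref": "#/$defs/SingleObj"}},
                {"type": "null"},
            ]
        }
    },
}
result = describe(schema, schema["properties"]["d"])
print(result)  # list[{a: string}] | null

# The older "definitions" prefix resolves the same way.
legacy = describe({"definitions": {"SingleObj": single_obj}},
                  {"$ref": "#/definitions/SingleObj"})
```

Because every recursive call receives the full `schema`, a reference nested inside `anyOf` or `items` can still be looked up at the document root.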
This pull request was created as a result of the prompt above, which came from Copilot chat.