datamodel-code-generator
oneOf with subschema array items not incorporated/generated as Any for pydantic.v2
Describe the bug
When using a JSON schema as input with a oneOf construct where one option is an array whose items are defined in a subschema, the resulting pydantic v2 model does not incorporate the subschema definition and instead generates list[Any].
To Reproduce
The following JSON schema snippet:
"SpatialPlan": {
  "type": "object",
  "properties": {
    "officialDocument": {
      "title": "officialDocument",
      "description": "Link to the official documents that relate to the spatial plan.",
      "oneOf": [
        {
          "$ref": "definitions/voidable.json#/definitions/Voidable"
        },
        {
          "type": "array",
          "minItems": 1,
          "items": {
            "$ref": "definitions/ref.json#/definitions/FeatureRef"
          },
          "uniqueItems": true
        }
      ]
    }
leads to the pydantic v2 model:
class OfficialDocument(RootModel[list[Any]]):
    root: Annotated[
        list[Any],
        Field(
            description='Link to the official documents that relate to the spatial plan.',
            min_length=1,
            title='officialDocument',
        ),
    ]


class SpatialPlan(BaseModel):
    officialDocument: Annotated[
        Voidable | OfficialDocument,
        Field(
            description='Link to the official documents that relate to the spatial plan.',
            title='officialDocument',
        ),
    ]
Used commandline:
$ datamodel-codegen --target-python-version 3.10 --use-union-operator --use-standard-collections --use-schema-description --use-annotated --collapse-root-models --output-model-type pydantic_v2.BaseModel --input input.json --output output.py
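For anyone who wants to try the command without the external definition files, a single-file variant of the schema can be generated with a short script. This is only a sketch: the shapes of Voidable and FeatureRef below are placeholders invented for illustration (the real ones live in definitions/voidable.json and definitions/ref.json), the $ref targets are rewritten to local ones, and it has not been confirmed that the inlined version triggers the same output.

```python
import json


def build_schema() -> dict:
    """Build a self-contained schema mirroring the oneOf construct from the report."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "definitions": {
            # Placeholder definition (assumption, not the real Voidable).
            "Voidable": {
                "type": "object",
                "properties": {"nilReason": {"type": "string"}},
            },
            # Placeholder definition (assumption, not the real FeatureRef).
            "FeatureRef": {"type": "string", "format": "uri"},
            "SpatialPlan": {
                "type": "object",
                "properties": {
                    "officialDocument": {
                        "title": "officialDocument",
                        "description": "Link to the official documents that relate to the spatial plan.",
                        "oneOf": [
                            {"$ref": "#/definitions/Voidable"},
                            {
                                "type": "array",
                                "minItems": 1,
                                "items": {"$ref": "#/definitions/FeatureRef"},
                                "uniqueItems": True,
                            },
                        ],
                    }
                },
            },
        },
    }


if __name__ == "__main__":
    # Print the schema so it can be redirected into input.json.
    print(json.dumps(build_schema(), indent=2))
```

Saved under a name such as make_schema.py (hypothetical), running `python make_schema.py > input.json` produces a file that can be passed to the command above.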
Expected behavior
The resulting pydantic model should look like this:
class OfficialDocument(RootModel[list[FeatureRef]]):
    root: Annotated[
        list[FeatureRef],
        Field(
            description="Link to the official documents that relate to the spatial plan.",
            min_length=1,
            title="officialDocument",
        ),
    ]
Or, perhaps even better, the additional RootModel definition could be dropped entirely:
class SpatialPlan(BaseModel):
    officialDocument: Annotated[
        list[FeatureRef] | Voidable,
        Field(
            description="Link to the official documents that relate to the spatial plan.",
            min_length=1,
            title="officialDocument",
        ),
    ]
Version:
- OS: Ubuntu 22.04 (WSL)
- Python version: 3.10
- datamodel-code-generator version: 0.25.5
Additional context
Is this related to: https://github.com/koxudaxi/datamodel-code-generator/blob/fcab9a4d555d4b96d64bb277f974bb7507982fb2/datamodel_code_generator/parser/jsonschema.py#L681-L694
If so - or if you can provide another hint - maybe we can have a look and work on a PR. This issue is really hampering our use case.
I've been looking into a similar issue on my project - so far I think it may be related to enabling the --field-constraints option, which is also enabled by using the --use-annotated option.
I'm working off of a very slightly modified version of the CycloneDX 1.5 schema, where the licenses field here is changed from an array to object type (due to some other issue with datamodel-code-generator parsing the schema). I expect to get a Python class somewhere that includes the expression and bom-ref fields. Here's what I'm seeing using datamodel-codegen 0.25.6, with the command
datamodel-codegen --input ~/temp/modified-bom-1.5.schema.json --output output-license-obj-annotated --use-annotated:
class LicenseChoice1(BaseModel):
    __root__: Annotated[
        List[Any],
        Field(
            description='A tuple of exactly one SPDX License Expression.',
            max_items=1,
            min_items=1,
            title='SPDX License Expression',
        ),
    ]


class LicenseChoice(BaseModel):
    __root__: Annotated[
        Union[List[LicenseChoiceItem], LicenseChoice1],
        Field(
            description='EITHER (list of SPDX licenses and/or named licenses) OR (tuple of one SPDX License Expression)',
            title='License Choice',
        ),
    ]
When I remove --use-annotated, I get something more like what I expect:
class LicenseChoiceItem1(BaseModel):
    class Config:
        extra = Extra.forbid

    expression: str = Field(
        ...,
        examples=[
            'Apache-2.0 AND (MIT OR GPL-2.0-only)',
            'GPL-3.0-only WITH Classpath-exception-2.0',
        ],
        title='SPDX License Expression',
    )
    bom_ref: Optional[RefType] = Field(
        None,
        alias='bom-ref',
        description='An optional identifier which can be used to reference the license elsewhere in the BOM. Every bom-ref MUST be unique within the BOM.',
        title='BOM Reference',
    )


class LicenseChoice(BaseModel):
    __root__: Union[List[LicenseChoiceItem], List[LicenseChoiceItem1]] = Field(
        ...,
        description='EITHER (list of SPDX licenses and/or named licenses) OR (tuple of one SPDX License Expression)',
        title='License Choice',
    )
I'll keep digging, but for now it appears that using annotations/field constraints ends up dropping type information somewhere down that path.
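When comparing flag combinations, it can help to scan the generated file for lists that fell back to Any instead of eyeballing the output. The following is a rough sketch using only the standard library; the function name find_untyped_lists is made up for this example, it needs Python 3.9+ (for ast.unparse and the simplified subscript nodes), and it only catches the literal forms list[Any] and List[Any].

```python
import ast


def find_untyped_lists(source: str) -> list[tuple[int, str]]:
    """Return (line number, expression) for every list[Any] / List[Any] in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Subscript):
            continue
        base, arg = node.value, node.slice
        # Match a bare list/List name subscripted with a bare Any name.
        if (
            isinstance(base, ast.Name)
            and base.id in {"list", "List"}
            and isinstance(arg, ast.Name)
            and arg.id == "Any"
        ):
            hits.append((node.lineno, ast.unparse(node)))
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Usage: python check_any.py output.py  (script name is hypothetical)
    with open(sys.argv[1], encoding="utf-8") as fh:
        for lineno, expr in find_untyped_lists(fh.read()):
            print(f"{sys.argv[1]}:{lineno}: {expr}")
```

Running it against the output generated with and without --use-annotated should make it easy to see which invocation loses the item type.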
I can confirm that dropping --field-constraints could be considered a workaround - thanks for the hint! However, this limits the possibilities in the model, e.g. pattern constraints cannot be used anymore.
I see you already provided a PR - great :)