datamodel-code-generator
Code generation crashes with "black.parsing.InvalidInput: Cannot parse"
Describe the bug
datamodel-codegen crashed with an error instead of generating code:
% poetry run datamodel-codegen --input openapi.yml --input-file-type openapi --output models.py
Traceback (most recent call last):
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/__main__.py", line 447, in main
    generate(
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/__init__.py", line 468, in generate
    results = parser.parse()
              ^^^^^^^^^^^^^^
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/parser/base.py", line 1304, in parse
    body = code_formatter.format_code(body)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/format.py", line 226, in format_code
    code = self.apply_black(code)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/format.py", line 234, in apply_black
    return black.format_str(
           ^^^^^^^^^^^^^^^^^
  File "src/black/__init__.py", line 1225, in format_str
  File "src/black/__init__.py", line 1239, in _format_str_once
  File "src/black/parsing.py", line 90, in lib2to3_parse
black.parsing.InvalidInput: Cannot parse: 12:0: class NullEnum(BaseModel):
To Reproduce
The OpenAPI schema used:
openapi: 3.0.1
components:
  schemas:
    NullEnum:
      # Yes, this is a weird type. It has a reason to exist in this form.
      type: string
      enum:
        - null
      nullable: true
Command line used:
$ datamodel-codegen --input openapi.yml --input-file-type openapi --output models.py
Expected behavior
I'd expect the code to be generated successfully.
Version:
- OS: macOS 13.4
- Python version: 3.12.3
- datamodel-code-generator version: 0.25.6
% pip freeze
annotated-types==0.7.0
argcomplete==3.3.0
black==24.4.2
click==8.1.7
coverage==7.5.1
datamodel-code-generator==0.25.6
dnspython==2.6.1
email_validator==2.1.1
genson==1.3.0
idna==3.7
inflect==5.6.2
iniconfig==2.0.0
isort==5.13.2
Jinja2==3.1.4
MarkupSafe==2.1.5
mypy==1.10.0
mypy-extensions==1.0.0
packaging==24.0
pathspec==0.12.1
platformdirs==4.2.2
pluggy==1.5.0
pydantic==2.7.1
pydantic_core==2.18.2
pytest==8.2.1
pytest-cov==5.0.0
PyYAML==6.0.1
ruff==0.4.4
typing_extensions==4.11.0
Additional context
If I modify datamodel-code-generator locally to skip the formatting step, this is the code it generates:
# generated by datamodel-codegen:
# filename: openapi.yml
# timestamp: 2024-05-23T11:52:03+00:00
from __future__ import annotations
from enum import Enum
from typing import Optional
from pydantic import BaseModel
class NullEnumEnum(Enum):
class NullEnum(BaseModel):
    __root__: Optional[NullEnumEnum] = None
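The unformatted output above is not valid Python, because `NullEnumEnum` has no members and therefore an empty class body. That is presumably why black refuses to parse it. A minimal sketch to confirm this with the standard library only (the blank lines in the `generated` string are my assumption, since only the statements are known from the output above):

```python
import ast

# The statements are copied from the unformatted generator output;
# the exact blank-line layout is assumed.
generated = '''\
from __future__ import annotations

from enum import Enum
from typing import Optional

from pydantic import BaseModel


class NullEnumEnum(Enum):


class NullEnum(BaseModel):
    __root__: Optional[NullEnumEnum] = None
'''

# ast.parse only parses the source; it does not import pydantic.
try:
    ast.parse(generated)
    error = None
except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
    error = exc

print(type(error).__name__, error)
```

So the formatter is only the messenger here: the enum with a single `null` member is rendered as a class without a body before black ever sees it.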
I have the same issue:
black.parsing.InvalidInput: Cannot parse: 107:55: title: constr(pattern=r'[\s\w\{\}\$\-\(\)\.\[\]"\\'_/\\,\*\+\#:@!?;=]*') = Field(..., description='Human readable title of the case enquiry')
In this case, it is caused by regexes containing escape sequences and quote characters that are not properly escaped in the generated code.
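To illustrate the regex case: column 55 in the error message is the `_` right after `\\'`, which suggests the single quote ends the `r'...'` literal early. The sketch below compares a naive way of emitting the pattern with `repr()`. The helper names `naive_literal`, `safe_literal`, and `parses` are hypothetical, and I am only assuming the generator wraps the pattern in `r'...'` as the error output suggests, not quoting its actual code:

```python
import ast

# Pattern taken from the error message above. It contains a double quote
# and a single quote that directly follows an escaped backslash.
pattern = r"""[\s\w\{\}\$\-\(\)\.\[\]"\\'_/\\,\*\+\#:@!?;=]*"""


def naive_literal(p: str) -> str:
    # Assumed behaviour: wrap the raw pattern in r'...' without escaping.
    return "r'" + p + "'"


def safe_literal(p: str) -> str:
    # repr() yields a literal that evaluates back to the same string.
    return repr(p)


def parses(source: str) -> bool:
    # True if the source is syntactically valid Python.
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(parses("x = " + naive_literal(pattern)))
print(parses("x = " + safe_literal(pattern)))
print(ast.literal_eval(safe_literal(pattern)) == pattern)
```

The `repr()` form is less readable than a raw string, but it round-trips for any pattern, which is what matters for generated code.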