datamodel-code-generator

Code generation crashes with "black.parsing.InvalidInput: Cannot parse"

Open: jstasiak opened this issue 1 year ago • 1 comment

Describe the bug

datamodel-codegen crashed with an error instead of generating code:

% poetry run datamodel-codegen --input openapi.yml --input-file-type openapi --output models.py
Traceback (most recent call last):
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/__main__.py", line 447, in main
    generate(
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/__init__.py", line 468, in generate
    results = parser.parse()
              ^^^^^^^^^^^^^^
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/parser/base.py", line 1304, in parse
    body = code_formatter.format_code(body)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/format.py", line 226, in format_code
    code = self.apply_black(code)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Library/Caches/pypoetry/virtualenvs/spreadsheet-offset-tool-L93AmRO5-py3.12/lib/python3.12/site-packages/datamodel_code_generator/format.py", line 234, in apply_black
    return black.format_str(
           ^^^^^^^^^^^^^^^^^
  File "src/black/__init__.py", line 1225, in format_str
  File "src/black/__init__.py", line 1239, in _format_str_once
  File "src/black/parsing.py", line 90, in lib2to3_parse
black.parsing.InvalidInput: Cannot parse: 12:0: class NullEnum(BaseModel):

To Reproduce

The OpenAPI schema used:

openapi: 3.0.1

components:
  schemas:
    NullEnum:
      # Yes, this is a weird type. It has a reason to exist in this form.
      type: string
      enum:
        - null
      nullable: true

Used commandline:

$ datamodel-codegen --input openapi.yml --input-file-type openapi --output models.py

Expected behavior

I'd expect the code to be generated successfully.

Version:

  • OS: macOS 13.4
  • Python version: 3.12.3
  • datamodel-code-generator version: 0.25.6

% pip freeze
annotated-types==0.7.0
argcomplete==3.3.0
black==24.4.2
click==8.1.7
coverage==7.5.1
datamodel-code-generator==0.25.6
dnspython==2.6.1
email_validator==2.1.1
genson==1.3.0
idna==3.7
inflect==5.6.2
iniconfig==2.0.0
isort==5.13.2
Jinja2==3.1.4
MarkupSafe==2.1.5
mypy==1.10.0
mypy-extensions==1.0.0
packaging==24.0
pathspec==0.12.1
platformdirs==4.2.2
pluggy==1.5.0
pydantic==2.7.1
pydantic_core==2.18.2
pytest==8.2.1
pytest-cov==5.0.0
PyYAML==6.0.1
ruff==0.4.4
typing_extensions==4.11.0

Additional context

If I modify datamodel-code-generator locally to skip the formatting step this is the code it generates:

# generated by datamodel-codegen:
#   filename:  openapi.yml
#   timestamp: 2024-05-23T11:52:03+00:00

from __future__ import annotations

from enum import Enum
from typing import Optional

from pydantic import BaseModel


class NullEnumEnum(Enum):


class NullEnum(BaseModel):
    __root__: Optional[NullEnumEnum] = None
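
The output above is not valid Python, because `class NullEnumEnum(Enum):` has no body, so any formatter that parses its input first will reject it. The following is a minimal sketch that shows the same failure with only the standard-library `ast` module, without involving black. The helper name `first_syntax_error` is hypothetical, and the `pass` substitution is only there to confirm that the empty class body is the culprit; it is not a suggestion for what the generator should emit.

```python
import ast

# The generated module from above, minus the header comments.
generated = '''\
from __future__ import annotations

from enum import Enum
from typing import Optional

from pydantic import BaseModel


class NullEnumEnum(Enum):


class NullEnum(BaseModel):
    __root__: Optional[NullEnumEnum] = None
'''


def first_syntax_error(source):
    """Return the SyntaxError raised while parsing `source`, or None."""
    try:
        ast.parse(source)  # parses only; nothing is imported or executed
    except SyntaxError as exc:
        return exc
    return None


err = first_syntax_error(generated)
print(type(err).__name__, err.msg)

# Giving the empty class any body at all makes the module parse again.
patched = generated.replace(
    "class NullEnumEnum(Enum):\n",
    "class NullEnumEnum(Enum):\n    pass\n",
)
print(first_syntax_error(patched))
```

A check like this could be run on the unformatted output to separate "the generator produced invalid code" from "black itself failed".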

jstasiak, May 23 '24 11:05

I have the same issue:

black.parsing.InvalidInput: Cannot parse: 107:55:     title: constr(pattern=r'[\s\w\{\}\$\-\(\)\.\[\]"\\'_/\\,\*\+\#:@!?;=]*') = Field(..., description='Human readable title of the case enquiry')

In this case, it is caused by regexes containing escape sequences that are not properly escaped in the generated code.
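
To illustrate, here is a minimal sketch of how this kind of failure can arise. It assumes the pattern is pasted between `r'` and `'` as-is, which is a guess based on the error output and not a confirmed description of the generator. The pattern is a shortened stand-in for the one in the error message, and `is_valid_python` is a hypothetical helper. The single quote inside the pattern ends the raw string early, while `repr()` chooses quoting and escaping that always yields a parseable literal.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Shortened stand-in for the pattern in the error message: a character class
# containing a double quote, two backslashes and a single quote.
pattern = '["' + "\\\\" + "'_]*"

# Naive embedding: the single quote inside the pattern ends the r'...' literal.
naive = "title = r'" + pattern + "'"

# repr() picks the quoting and escaping itself, so the literal parses.
safe = "title = " + repr(pattern)

print(is_valid_python(naive))
print(is_valid_python(safe))

# The repr() form also round-trips to the original pattern text.
print(ast.literal_eval(repr(pattern)) == pattern)
```

The naive form fails to parse and the `repr()` form parses and round-trips, which is consistent with the error position in the message above falling right around the single quote in the pattern.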

iodbh, May 29 '24 13:05