Bug: Path Handling Error - Schema File Paths as Strings - Pydantic

Open Christian-Blank opened this issue 7 months ago • 0 comments

Bug Report: datamodel-code-generator Path Handling Error

Describe the bug

datamodel-code-generator raises AttributeError: 'str' object has no attribute 'get' when the input schema file path is provided as a string instead of a pathlib.Path object. Internally, the generator treats the path string as JSON content instead of reading the file it points to, so the process fails.

This issue was identified during the integration of the generator into a schema-first architecture workflow, where schemas are managed and referenced systematically.

To Reproduce

Example Schema (example-schema.json):

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Example",
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    }
  }
}

Command Used (string input causing error):

datamodel-codegen --input "/path/to/example-schema.json" --input-file-type jsonschema --output example_model.py

Error Output:

AttributeError: 'str' object has no attribute 'get'
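
The error message is consistent with a path string being parsed as if it were schema text. The sketch below illustrates that failure mode using only the standard library. It is not the generator's actual code: load_document and read_title are hypothetical names, and the fallback that returns a bare string for non-JSON text is an assumption about how a permissive loader might behave.

```python
import json


def load_document(text: str):
    """Hypothetical loader: parse JSON, else fall back to the raw scalar."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumption: a permissive loader returns plain text as a str.
        return text


def read_title(document):
    # Schema-handling code expects a mapping and calls .get on it.
    return document.get("title")


schema_text = '{"title": "Example", "type": "object"}'
path_text = "/path/to/example-schema.json"

print(read_title(load_document(schema_text)))  # Example

try:
    read_title(load_document(path_text))
except AttributeError as exc:
    message = str(exc)
    print(message)
```

If the generator follows a similar flow, the path never reaches any file-reading code, which would explain why the error says nothing about the file.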

Expected behavior

The generator should handle schema file paths provided as strings by properly reading and processing the file contents, resulting in correctly generated Pydantic models. For example:

from pydantic import BaseModel

class Example(BaseModel):
    name: str

Workaround

The issue does not occur when explicitly using a pathlib.Path object:

datamodel-codegen --input $(python -c 'from pathlib import Path; print(Path("/path/to/example-schema.json"))') --input-file-type jsonschema --output example_model.py

Or equivalently, in Python-based invocations:

from pathlib import Path
from datamodel_code_generator import InputFileType, generate

# Successful invocation using Path object
generate(
    Path("/path/to/example-schema.json"),
    input_file_type=InputFileType.JsonSchema,
    output=Path("example_model.py"),
)
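
Until the behavior changes, callers can normalize paths before invoking the generator. The following is a minimal sketch of such a wrapper, not part of datamodel-code-generator: to_schema_path and generate_models are hypothetical helper names, and the existence check is an added assumption meant to surface a clearer error than the AttributeError.

```python
import tempfile
from pathlib import Path


def to_schema_path(value) -> Path:
    """Hypothetical helper: normalise a str or Path to an existing file Path."""
    path = Path(value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Schema file not found: {path}")
    return path


def generate_models(schema, output) -> None:
    # Imported lazily so the helper above works without the package installed.
    from datamodel_code_generator import InputFileType, generate

    generate(
        to_schema_path(schema),
        input_file_type=InputFileType.JsonSchema,
        output=Path(output),
    )


# Demonstrate the normalisation step on a throwaway schema file.
with tempfile.TemporaryDirectory() as tmp:
    schema_file = Path(tmp) / "example-schema.json"
    schema_file.write_text("{}")
    resolved = to_schema_path(str(schema_file))
    print(type(resolved).__name__, resolved.name)
```

With this wrapper, pipeline code can keep passing string paths, and a missing file fails early with a message that names the path.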

Why this matters

  • Developer Experience: Users naturally provide file paths as strings, and the unclear error message misleads developers and increases debugging time.
  • Workflow Reliability: Automated workflows and code-generation pipelines commonly pass paths as strings, so requiring an explicit conversion to Path objects adds unnecessary friction.
  • Best Practices: JSON Schema-driven development benefits from intuitive, robust tooling. Handling both strings and Path objects gracefully would match what users expect.

How we discovered this issue

We discovered this issue while implementing a schema-first architecture in an internal project that aims for consistent, schema-driven code generation. We initially suspected a problem with local $ref resolution within array items. Systematic testing showed that the root cause was instead the handling of input paths provided as strings, not local reference handling.

Testing Matrix

| Input Type | Result | Description |
|---|---|---|
| Schema path as string | ❌ FAIL | 'str' object has no attribute 'get' error |
| Schema path as pathlib.Path object | ✅ PASS | Successfully generates the expected models |
| Direct schema content as string | ✅ PASS | Successfully generates the expected models |
| Parsed schema content passed directly | ✅ PASS | Successfully generates the expected models |

Suggested Solutions

  • Update internal handling of the input parameter to accept both strings and pathlib.Path objects, detecting input types and performing appropriate file handling.
  • Alternatively, clearly document the requirement for pathlib.Path objects to avoid confusion.
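
The first suggestion could be sketched as a small dispatch step that covers the four input kinds from the testing matrix. This is only an illustration under stated assumptions: classify_input is a hypothetical function, not the generator's API, and the rule "no braces or newlines, and the file exists" is a heuristic chosen for the example.

```python
import tempfile
from collections.abc import Mapping
from pathlib import Path


def classify_input(value):
    """Hypothetical dispatch: return (kind, payload) for one input value."""
    if isinstance(value, Mapping):
        return "parsed", value
    if isinstance(value, Path):
        return "path", value
    if isinstance(value, str):
        # Heuristic: schema text usually contains braces or newlines.
        if "{" not in value and "\n" not in value:
            try:
                if Path(value).is_file():
                    return "path", Path(value)
            except OSError:
                pass  # e.g. a string too long to be a file name
        return "text", value
    raise TypeError(f"Unsupported input type: {type(value).__name__}")


with tempfile.TemporaryDirectory() as tmp:
    schema_file = Path(tmp) / "example-schema.json"
    schema_file.write_text('{"type": "object"}')
    kinds = [
        classify_input(str(schema_file))[0],      # path given as string
        classify_input(schema_file)[0],           # pathlib.Path object
        classify_input('{"type": "object"}')[0],  # schema content as string
        classify_input({"type": "object"})[0],    # parsed schema content
    ]

print(kinds)  # ['path', 'path', 'text', 'parsed']
```

One trade-off of this heuristic is that a string naming a file that does not exist still falls through to "text", so a real fix would probably also need a clearer error for that case.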

Environment

  • Operating System: macOS 14.4
  • Python Version: 3.11
  • datamodel-code-generator Version: v0.30.0

Additional context

This issue was consistent across multiple versions tested (including versions 0.21.5 through 0.30.0). Clarifying this behavior or enhancing the tool to accommodate both input types will greatly improve usability and reduce friction in adoption and integration processes.

Please let me know if further details or additional tests would be helpful.

Thank you for your work, Christian

Christian-Blank, Apr 19 '25 05:04