[BUG: Breaking] Ollama invalid JSON schema in format

Open • xiaoyao9184 opened this issue 6 months ago • 1 comment

🧨 Describe the Bug

When Marker runs against an Ollama backend with a Pydantic v2 schema (for example, the output of .model_json_schema() on SectionHeaderSchema or PageSchema), it constructs a format schema that discards $defs. Every $ref inside properties then points at a definition that no longer exists, and Ollama returns:

{"error":"invalid JSON schema in format"}

This is a breaking bug: $ref is required for any schema with nested models (e.g., a list of sub-objects), and Marker's current format_schema logic does not carry over the $defs block from the original model_json_schema() output.
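To make the failure mode concrete, here is a minimal sketch in plain Python. The schema is a hand-written stand-in for what Pydantic v2 typically emits for a model holding a list of sub-models; the field names, the `find_dangling_refs` helper, and the shape of the trimmed dict are all assumptions for illustration, not code taken from Marker.

```python
# Hand-written stand-in for a Pydantic v2 model_json_schema() result for a
# model with a list of sub-models. Field names are illustrative only.
page_schema = {
    "$defs": {
        "SectionHeader": {
            "type": "object",
            "properties": {
                "id": {"type": "string"},
                "level": {"type": "integer"},
            },
            "required": ["id", "level"],
        }
    },
    "type": "object",
    "properties": {
        "headers": {
            "type": "array",
            "items": {"$ref": "#/$defs/SectionHeader"},
        }
    },
    "required": ["headers"],
}


def find_dangling_refs(schema):
    """Return each local "#/$defs/..." reference with no matching definition."""
    prefix = "#/$defs/"
    defs = schema.get("$defs", {})
    dangling = []

    def walk(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith(prefix):
                if ref[len(prefix):] not in defs:
                    dangling.append(ref)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(schema)
    return dangling


# Assumed shape of the trimmed schema: top-level keys copied, $defs dropped.
trimmed = {
    "type": "object",
    "properties": page_schema["properties"],
    "required": page_schema["required"],
}
```

Running `find_dangling_refs` on `page_schema` finds nothing unresolved, while on `trimmed` it reports the `SectionHeader` reference, which is the kind of broken schema the error above complains about.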

📄 Input Document

N/A — bug is schema-level.

📤 Output Trace / Stack Trace

{"error":"invalid JSON schema in format"}

⚙️ Environment

Please fill in all relevant details:

  • Marker version: 1.8.0
  • Surya version:
  • Python version: 3.11
  • PyTorch version:
  • Transformers version:
  • Operating System (incl. container info if relevant):

✅ Expected Behavior

Add "$defs": schema["$defs"] if "$defs" in schema else {} to the format schema built in https://github.com/datalab-to/marker/blob/edbcb8cf4ae31da9bdf57f0486b558f56e0a1484/marker/services/ollama.py#L39-L44 so that $ref entries inside properties can still be resolved.
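A rough sketch of what that change could look like follows. The function name `build_format_schema` is hypothetical, and the surrounding dict (type, properties, required) is an assumption about the linked lines rather than a copy of them; only the `$defs` entry comes from the suggestion above.

```python
def build_format_schema(schema: dict) -> dict:
    """Build the dict passed as Ollama's `format`, keeping `$defs`.

    Hypothetical helper: the key set is an assumed approximation of what
    Marker builds, with the proposed `$defs` entry added.
    """
    return {
        "type": "object",
        "properties": schema.get("properties", {}),
        "required": schema.get("required", []),
        # Proposed fix: carry nested model definitions over so that
        # "#/$defs/..." references inside properties still resolve.
        "$defs": schema["$defs"] if "$defs" in schema else {},
    }


# Illustrative input shaped like a nested Pydantic v2 schema.
nested = {
    "$defs": {"Item": {"type": "object", "properties": {}}},
    "type": "object",
    "properties": {"items": {"type": "array", "items": {"$ref": "#/$defs/Item"}}},
    "required": ["items"],
}
flat = {"type": "object", "properties": {"x": {"type": "string"}}, "required": ["x"]}

nested_format = build_format_schema(nested)
flat_format = build_format_schema(flat)
```

With this shape, a flat schema simply gets an empty `$defs`, so the same code path serves models with and without nested types.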

xiaoyao9184, Jun 30 '25 07:06

https://github.com/datalab-to/marker/issues/907#issuecomment-3384031679

kipavy, Nov 04 '25 15:11