ontology-development-kit

Automatic generation of the schema documentation is broken

Open gouttegd opened this issue 1 month ago • 8 comments

The dump-schema command (used to dump the schema of the configuration object, which is then used to automatically produce the documentation of said schema) is broken:

$ odkrun -l python ./odk/odk.py dump-schema
Traceback (most recent call last):
  File "/work/./odk/odk.py", line 1605, in <module>
    cli()
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1830, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/work/./odk/odk.py", line 1394, in dump_schema
    print(json.dumps(clazz.json_schema(), sort_keys=True, indent=4))
                     ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/dataclasses_jsonschema/__init__.py", line 941, in json_schema
    properties[f.mapped_name], is_required = cls._get_field_schema(f.field, schema_options)
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/dataclasses_jsonschema/__init__.py", line 772, in _get_field_schema
    field_meta, required = cls._get_field_meta(field, schema_options.schema_type)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/dataclasses_jsonschema/__init__.py", line 745, in _get_field_meta
    field_meta.default = cls._encode_field(field.type, default_value, omit_none=False)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/dataclasses_jsonschema/__init__.py", line 443, in _encode_field
    return encoder(field_type, value, omit_none)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/dataclasses_jsonschema/__init__.py", line 209, in _encoder_is_json_schema_subclass
    return v.to_dict(omit_none=o, validate=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: DataClassJsonMixin.to_dict() got an unexpected keyword argument 'omit_none'

Bisecting identifies commit aff9e65f76f6d48eb0c43c5b0d1352784bce87bc as introducing the issue.

gouttegd, Dec 07 '25 20:12

More precisely this line:

robot : RobotOptionsGroup = field(default_factory=lambda: RobotOptionsGroup())
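The traceback shows the schema library encoding a field's default value by calling `to_dict(omit_none=..., validate=False)` on it, and the `to_dict` it reaches belongs to `DataClassJsonMixin`, which does not accept those keywords. The sketch below reproduces that kind of clash with stand-in classes only; it does not use either real library. `SchemaMixin`, `serializer`, `RobotOptions`, `Project`, and the `memory` field are all hypothetical names, and the assumption that a decorator replaces `to_dict` is just one plausible way the mismatch could arise.

```python
from dataclasses import MISSING, dataclass, field, fields


class SchemaMixin:
    """Stand-in for a schema-generating mixin (hypothetical, not the real library)."""

    def to_dict(self, omit_none=True, validate=False):
        return dict(vars(self))

    @classmethod
    def encode_defaults(cls):
        """Encode each field's default, the way a schema dump might."""
        encoded = {}
        for f in fields(cls):
            if f.default_factory is not MISSING:
                default = f.default_factory()
            elif f.default is not MISSING:
                default = f.default
            else:
                continue
            if isinstance(default, SchemaMixin):
                # The schema side assumes its own to_dict() signature here.
                default = default.to_dict(omit_none=False, validate=False)
            encoded[f.name] = default
        return encoded


def serializer(cls):
    """Stand-in for a second library's decorator that installs its own to_dict()."""

    def to_dict(self, encode_json=False):
        return dict(vars(self))

    cls.to_dict = to_dict  # shadows SchemaMixin.to_dict on this class
    return cls


@serializer
@dataclass
class RobotOptions(SchemaMixin):
    memory: str = "4G"  # hypothetical field, for illustration only


@dataclass
class Project(SchemaMixin):
    # A default that is itself an instance of a "schema" class triggers the call.
    robot: RobotOptions = field(default_factory=lambda: RobotOptions())


def try_dump():
    """Return the encoded defaults, or the error message if encoding fails."""
    try:
        return Project.encode_defaults()
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_dump())
```

In this sketch a field with a plain default (a string, a number, `None`) would never hit the incompatible `to_dict()`; the failure only appears once a default is an instance of a class carrying both behaviours, which is consistent with a single `default_factory` line being enough to break the dump.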

gouttegd, Dec 07 '25 20:12

In fact, the second step of the documentation generation process (the odk/schema_documentation.py script) is broken as well:

$ odkrun -l python ./odk/schema_documentation.py
INFO:root:Target is: /work/docs/project-schema.md
Traceback (most recent call last):
  File "/work/./odk/schema_documentation.py", line 383, in <module>
    generate()
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1442, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1363, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1226, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 794, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/work/./odk/schema_documentation.py", line 378, in generate
    print_element(element, md_out, plain_doc, nesting_list=[])
  File "/work/./odk/schema_documentation.py", line 216, in print_element
    lines = plain_doc[element]
            ~~~~~~~~~^^^^^^^^^
KeyError: 'allow_equivalents'

And that breakage is even older: it was already present in 1.5.
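The `KeyError` comes from indexing `plain_doc` with an element that has no entry in it. A minimal sketch of how such drift between the schema and the hand-written documentation could be detected up front, or tolerated during generation, is given below. The helper names, the shape of the two mappings, and every key other than `allow_equivalents` are assumptions made for illustration; this is not the actual code of schema_documentation.py.

```python
def missing_doc_entries(schema_properties, plain_doc):
    """Return the schema property names that have no documentation entry."""
    return sorted(name for name in schema_properties if name not in plain_doc)


def lookup_doc(plain_doc, element, placeholder="(undocumented)"):
    """Return the documentation lines for an element, or a placeholder line."""
    return plain_doc.get(element, [placeholder])


# Hypothetical inputs: only 'allow_equivalents' is taken from the traceback.
schema_properties = {
    "title": {"type": "string"},
    "allow_equivalents": {"type": "string"},
}
plain_doc = {
    "title": ["Title of the project."],
}

if __name__ == "__main__":
    print(missing_doc_entries(schema_properties, plain_doc))
    print(lookup_doc(plain_doc, "allow_equivalents"))
```

Reporting the missing names before rendering would turn a crash in the middle of the run into a list of fields that still need documenting.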

gouttegd, Dec 07 '25 21:12

Thank you! I guess using a relatively complex system of data classes and defaults as a blueprint to generate a JSON schema was always a little risky. I see you added this to the 1.7 release. I might take a stab at this just before we want to go for 1.7.

matentzn, Dec 08 '25 12:12

I added this to the 1.7 milestone before I realised it was more broken than I thought.

The problem is that this system, as it is currently designed, will be incompatible with the switch to ODK-Core, because with ODK-Core the actual model will live elsewhere (for now there), and the schema_documentation.py script not only uses the output of the dump-schema command, it also gets some additional information directly from the odk.py file.

Not sure yet how best to handle the documentation in the new ODK-Core setup that I envision. Ideas are welcome.

gouttegd, Dec 08 '25 13:12

Unless it burns under your fingers right now, I would suggest we wait a little, meet and brainstorm. My preference of course would be to curate the odk-yaml file schema separately, but that would duplicate effort, since you would curate the schema twice, the second time for a questionable ROI (documentation). Right now I also can't think of much else. I am nearly certain that the documentation was hardly ever used by anyone and 99% of ODK users looked at other projects for how to enable an ODK feature in their setup.

matentzn, Dec 08 '25 13:12

I would suggest we wait a little, meet and brainstorm.

Fine with me.

I am nearly certain that the documentation was hardly ever used by anyone.

That’s my feeling as well, which is why I am not keen on having the switch to ODK-Core blocked by this issue. I’d rather drop the auto-generated schema documentation (we can keep the current one with a warning that it may no longer be accurate).

gouttegd, Dec 08 '25 13:12

The switch to ODK-Core should absolutely NOT be blocked by this! Just do not support auto-generation of documentation for now.

matentzn, Dec 08 '25 13:12

OK, glad we agree. ;)

Keeping the issue open as a reminder that we will still need to devise a new way to generate the documentation at some point.

gouttegd, Dec 08 '25 14:12