Added pydantic backend for serialization, and other updates.
Purpose
The MDO community at NASA uses XDSMs pretty extensively to convey ideas, but we need a better way to turn OpenMDAO models into XDSM diagrams. The intent of this update to pyXDSM is to allow it to ingest XDSM information in a declarative format and then produce the XDSM in PDF format. This means that tools like OpenMDAO would only need to generate a compatible JSON file rather than interact directly via the API.
This PR does the following:
- Changes setup.py to pyproject.toml for PEP 518 compatibility
- Objects in the API are now built on top of Pydantic BaseModel
In particular, this leverages Pydantic for serialization/deserialization and validation. This means that, in addition to the existing imperative API, an XDSM model can now be created with a declarative syntax via a dictionary or JSON (or any other serialization that can be represented as a dictionary, such as YAML).
The key methods here are .to_json() and .from_json().
The MDF and kitchen sink examples now have corresponding .json files. Tests have been added to validate the serialization and deserialization of XDSM objects.
- A `__main__.py` has been added to provide a command-line interface.
This allows the user to quickly build a tikz, pdf, or json output file from a given json file as input. The JSON output should be the equivalent of a copy but it seemed appropriate to include it, in case we ever support any other type of file format (yaml, toml, etc).
This utility allows the user to specify the JSON file to be converted. If no output is specified, it will assume that a PDF is to be generated. It supports the --cleanup and --quiet options of the build method as arguments.
- The latex generation of XDSM was moved to a separate `xdsm_latex_writer.py` file.
- Similar changes are not yet made to matrix equations, though I plan to work on that as a follow-up.
- Docs have been updated.
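To give a feel for the declarative round trip described above, here is a minimal sketch using Pydantic v2 directly. The `System`, `Connection`, and `Diagram` classes and their field names are hypothetical stand-ins for illustration, not the actual pyXDSM classes; the PR's own entry points are `.to_json()` and `.from_json()`, while the sketch uses the underlying Pydantic calls.

```python
from typing import List

from pydantic import BaseModel


# Hypothetical stand-ins for illustration; not the actual pyXDSM classes or fields.
class System(BaseModel):
    node_name: str
    style: str
    label: str


class Connection(BaseModel):
    src: str
    target: str
    label: str


class Diagram(BaseModel):
    systems: List[System] = []
    connections: List[Connection] = []


# A declarative description, e.g. what another tool might emit as JSON.
spec = {
    "systems": [
        {"node_name": "opt", "style": "Optimization", "label": "Optimizer"},
        {"node_name": "D1", "style": "Function", "label": "D_1"},
    ],
    "connections": [{"src": "opt", "target": "D1", "label": "x"}],
}

diagram = Diagram.model_validate(spec)        # dict -> validated objects
text = diagram.model_dump_json(indent=2)      # objects -> JSON string
restored = Diagram.model_validate_json(text)  # JSON string -> objects

# Pydantic models compare by type and field values.
assert restored == diagram
```

A malformed entry (for example a system without a `label`) would raise a `pydantic.ValidationError` at the `model_validate` step, which is the validation benefit mentioned above.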
Expected time until merged
This is not urgent. While I'm going to base some future work on this change, it's a somewhat large PR and I understand if it takes some time to fully review it. I can work from my fork in the meantime.
Type of change
- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (non-backwards-compatible fix or feature)
- [x] Code style update (formatting, renaming)
- [ ] Refactoring (no functional changes, no API changes)
- [x] Documentation update
- [x] Maintenance update
- [ ] Other (please describe)
Testing
I've added tests to test_xdsm.py that check that a deserialized XDSM is equivalent to the original.
The example XDSMs have been added in JSON format and tests of the CLI have been implemented.
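As a rough picture of the kind of behavior the CLI tests exercise, the sketch below mimics the argument handling described in this PR with `argparse`. Only the `--cleanup` and `--quiet` flags and the PDF default come from the description; the positional `input` argument, the `-o/--output` option, the `.tikz` suffix, and both function names are assumptions made for illustration and may not match the real `__main__.py`.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    # --cleanup and --quiet are named in the PR text; "input" and
    # "-o/--output" are assumed names used only for this sketch.
    parser = argparse.ArgumentParser(description="Convert an XDSM JSON file.")
    parser.add_argument("input", type=Path, help="JSON file describing the XDSM")
    parser.add_argument("-o", "--output", type=Path, default=None)
    parser.add_argument("--cleanup", action="store_true")
    parser.add_argument("--quiet", action="store_true")
    args = parser.parse_args(argv)
    if args.output is None:
        # No output given: assume a PDF named after the input file.
        args.output = args.input.with_suffix(".pdf")
    return args


def output_kind(path):
    # Map the output suffix to one of the three output kinds (suffixes assumed).
    kinds = {".pdf": "pdf", ".tikz": "tikz", ".json": "json"}
    suffix = path.suffix.lower()
    if suffix not in kinds:
        raise ValueError(f"unsupported output type: {suffix!r}")
    return kinds[suffix]


# Passing an explicit argv list makes the parser easy to test without a shell.
args = parse_args(["mdf.json", "--quiet"])
assert output_kind(args.output) == "pdf"
```

Keeping the parsing in a function that accepts an `argv` list is one way to test a CLI without spawning a subprocess.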
Checklist
- [x] I have run `ruff check` and `ruff format` to make sure the Python code adheres to PEP-8 and is consistently formatted
- [ ] I have formatted the Fortran code with `fprettify` or C/C++ code with `clang-format` as applicable
- [x] I have run unit and regression tests which pass locally with my changes
- [x] I have added new tests that prove my fix is effective or that my feature works
- [x] I have added necessary documentation
This is pretty great, thanks for putting it together Rob! I think it has been a longstanding goal to have a stable declarative syntax for XDSM diagrams, and the lack of one is a major shortcoming of the current Python API. It will take me some time to review this PR, but I do want to pose a question initially: how do we feel about a Pydantic-based JSON representation vs. something a bit more custom? I understand that from an ease-of-development perspective Pydantic is great and very stable, but this effectively locks in the Pydantic class definitions. Any future changes such as adding/removing/renaming fields will mean that previously generated JSON files are invalid, and shims (likely in the form of model_validators) have to be added to maintain compatibility.
On the other hand, a custom format may be more compact/readable, and we could have a bit more flexibility in serialization/deserialization, though we lose out on all the great Pydantic features. And I suppose some compat layer has to be created no matter what we do. Just curious about others' thoughts here.
Lastly, I want to mention that XDSMjs exists as a JavaScript library, which of course has its own JSON representation of the XDSM diagram. Maybe there are opportunities to standardize and define a community-driven JSON schema that could be used by various tools/backends. CC @relf @eirikurj and others; if there is a lot of discussion we can move this to a dedicated thread.
I've been diving into pydantic during this government shutdown. It seems like it has fairly broad adoption so I'm not concerned with it losing support.
I think the serialization and validation it provides are really worthwhile compared with a custom format. As far as merging with XDSMjs, that would be a discussion worth having.
Sorry, my concern is not about pydantic as a package. It is that the particular schema in this repo gets locked in by any serialized JSON files, so changes to those pydantic class definitions will not be backwards compatible. It's something we can work around with validators and such, but it is cumbersome.
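To make the kind of shim concrete, here is a minimal sketch assuming Pydantic v2. The `System` model and the rename from `name` to `node_name` are hypothetical, made up for illustration; they are not taken from the classes in this PR.

```python
from pydantic import BaseModel, model_validator


# Hypothetical model: suppose a field once called "name" was later renamed "node_name".
class System(BaseModel):
    node_name: str
    label: str

    @model_validator(mode="before")
    @classmethod
    def _accept_legacy_name(cls, data):
        # Runs on the raw input before field validation.
        if isinstance(data, dict) and "name" in data and "node_name" not in data:
            data = dict(data)  # copy so the caller's dict is not mutated
            data["node_name"] = data.pop("name")
        return data


legacy = {"name": "opt", "label": "Optimizer"}        # written by an older schema
current = {"node_name": "opt", "label": "Optimizer"}  # written by the newer schema

old = System.model_validate(legacy)
new = System.model_validate(current)
assert old == new
```

Each rename or removal needs its own branch like this, which is where the maintenance cost comes from.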
I'll try to review the actual code, and I don't have any fundamental objections to using the class definitions as-is, but I was just wondering if there are any improvements we want to make. It would be best to make them now, before they are fixed by the JSON.
Since there's interest in this, I'll clean this up when I get a chance. With the furlough ending things will be a bit busy for a few days :)