PSA: Pydantic v1/v2 and the QCArchive + Psi4 stack
| software | pydantic v1 only (API v1) [^1] | pydantic v1/v2 tolerant (API v1) | pydantic v2 only (API v2) |
|---|---|---|---|
| QCElemental | thru 0.25.1 | 0.26.0 thru 0.27.1 | WIP #321 |
| QCEngine | thru 0.26.0 | 0.27.0 thru 0.29.0 | WIP https://github.com/MolSSI/QCEngine/pull/425 |
| QCFractal next | 0.5 beta13 thru 0.51 | WIP https://github.com/MolSSI/QCFractal/pull/787 (upcoming v0.52) | |
| Psi4 [^2] | v1.6 thru v1.8.0 | v1.8.1 _2, v1.8.2 | https://github.com/psi4/psi4/pull/3034 |

[^1]: "v1 only" describes the state of the code. conda packages may not be constrained to only solve with pydantic v1.
[^2]: psi4 before v1.6 didn't use pydantic directly
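
Since an environment may solve with either pydantic major version (footnote 1), it can help to check which one is actually installed. The following is a minimal sketch using only the standard library; the helper names `pydantic_major` and `v1_api_module` are made up for illustration and are not part of any package in the table. It relies on pydantic v2 shipping the old API under the `pydantic.v1` namespace.

```python
from importlib import metadata
from typing import Optional


def pydantic_major() -> Optional[int]:
    """Return the installed pydantic major version, or None if pydantic is absent."""
    try:
        return int(metadata.version("pydantic").split(".")[0])
    except metadata.PackageNotFoundError:
        return None


def v1_api_module(major: Optional[int]) -> Optional[str]:
    """Name the module that provides the pydantic v1 API for a given major version.

    Hypothetical helper for illustration only.
    """
    if major is None:
        return None
    # pydantic v2 keeps the v1 API importable from the pydantic.v1 namespace.
    return "pydantic.v1" if major >= 2 else "pydantic"


installed = pydantic_major()
print("pydantic major:", installed, "| v1 API lives in:", v1_api_module(installed))
```

Code in the "v1/v2 tolerant" column typically does the equivalent with a `try: from pydantic.v1 import BaseModel` / `except ImportError: from pydantic import BaseModel` pair.
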
EDIT 19 Jun 2024: The plan is to update schema v2 and pydantic v2 at the same time at 0.70
Fall 2024
QCSchema v2
- schema expressed in pydantic v2 API
- schema layout rearranged to facilitate composability (big changes only to procedures schema; small to AtomicInput/Result; none to Molecule)
- no new features (maybe one)
Planned version targets for QCElemental and QCEngine:
- v0.50 — QCSchema v2 available. QCSchema v1 unchanged (files moved but imports will work w/o change). There will be beta releases.
- v0.70 — QCSchema v2 will become the default. QCSchema v1 will remain available, but it will require specific import paths (available as soon as v0.50).
- v1.0 — QCSchema v2 unchanged. QCSchema v1 dropped.
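
For downstream code that wants to keep using QCSchema v1 across these releases, one option is to probe for the explicit `qcelemental.models.v1` path and fall back to the long-standing `qcelemental.models`. This is only a sketch under the assumption that the import paths land as described above; `find_v1_models` is a hypothetical helper, and it returns `None` when QCElemental is not installed at all.

```python
import importlib
from typing import Optional, Sequence

# Most specific path first; plain qcelemental.models is the long-standing fallback.
CANDIDATES = ("qcelemental.models.v1", "qcelemental.models")


def find_v1_models(candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the first module path that imports cleanly, or None if none does."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            # Path not available in this environment; try the next candidate.
            continue
        return name
    return None


print("QCSchema v1 models importable from:", find_v1_models())
```

Pinning to the `models.v1` path as soon as it exists means the v0.70 default switch should not change which schema version the code receives.
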
Relevant PRs (accumulating into next2024 branches of QCElemental and QCEngine) for Pydantic v2 API
- MolSSI/QCElemental#345: set up pre-commit. move `models` to `models.v1`
- MolSSI/QCElemental#346: adapt all tests to run parametrized through QCSchema v1 or v2. (needs 345)
- MolSSI/QCElemental#347: translate `models.v2` to pydantic v2 API (Levi's #321). (needs 346)
- MolSSI/QCElemental#348: convert internal classes (non-QCSchema like `Datum`) to pydantic v2 API. (needs 347)
- MolSSI/QCElemental#349: add back dummy files to `qcelemental/models/*py` so it issues a warning if you try to import from files, rather than properly from `models` or `models.v1` or `models.v2`. Run with `PYTHONWARNINGS=all` to see warnings. (needs 348)
- MolSSI/QCEngine#452: set up pre-commit
- MolSSI/QCEngine#453: requires pydantic=2 dependency. convert internal classes (non-QCSchema classes like `TaskConfig`) to pydantic v2 API. (needs 452 and qcel 349)
- MolSSI/QCElemental#352: top-level models (and `FailedOperation`) learned a `convert_v()` function to return alternate versions of QCSchema. QCSchema v1 models learned `model_dump` etc. so easier to write unified code. (needs 349)
- MolSSI/QCEngine#454: adapt all the tests to run parameterized QCSchema v1 or v2 input and output checks. (needs 453 and qcel 352)
- MolSSI/QCEngine#455: consolidate `qcengine.compute` and `qcengine.compute_procedure` into the former and add a `return_version={1, 2, -1}` argument, where the default `-1` returns the input version (v1 if it can't be determined).
- MolSSI/QCElemental#354: fine tuning of converter functions, warnings fixes, docs. (needs 352)
- MolSSI/QCEngine#456: translate harnesses to use pyd v2. (needs 455 and qcel 354)
- MolSSI/QCElemental#355: holder for Mol testing
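
The `return_version` rule described for MolSSI/QCEngine#455 can be sketched in plain Python. This is not QCEngine's code: both function names are hypothetical, and detecting the input version from a `schema_version` key in a dict is an assumption made only so the example runs. What it illustrates is the stated behaviour: `1` or `2` forces that version, and `-1` echoes the input version, falling back to v1 when it can't be determined.

```python
from typing import Any, Mapping


def detect_schema_version(input_data: Mapping[str, Any]) -> int:
    """Hypothetical detection: trust a `schema_version` key, otherwise assume v1."""
    version = input_data.get("schema_version")
    return version if version in (1, 2) else 1


def resolve_return_version(input_data: Mapping[str, Any], return_version: int = -1) -> int:
    """Pick the QCSchema version to return, following the rule described for #455."""
    if return_version not in (1, 2, -1):
        raise ValueError(f"return_version must be 1, 2, or -1, not {return_version!r}")
    if return_version == -1:
        # Default: hand back whichever version came in.
        return detect_schema_version(input_data)
    return return_version


print(resolve_return_version({"schema_version": 2}))     # echoes the input version
print(resolve_return_version({"driver": "energy"}))      # undetermined, so v1
print(resolve_return_version({"schema_version": 2}, 1))  # explicit request wins
```
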
Relevant PRs (accumulating into next2024 branches of QCElemental and QCEngine) for data layout rearrangement
- MolSSI/QCElemental#357: remove `Results.error` field, enforce `.success` field.
- MolSSI/QCEngine#458: make `FailedOp.input_data` object where possible (needs qcel 357)
- MolSSI/QCElemental#361: interloper (py38+ and poetry -> setuptools to help CI)
- MolSSI/QCElemental#358: `AtomicResult.input_data` field (needs qcel 357)
- MolSSI/QCEngine#459: implement `AtomicResult.input_data` field (needs qcel 358 and qcng 458)
- MolSSI/QCElemental#359: `AtomicInput.specification` field (needs qcel 358)
- MolSSI/QCEngine#460: implement `AtomicInput.specification` field (needs qcel 358 and qcng 459)
- MolSSI/QCElemental#363: `Opt` and `TD` schema (needs qcel 359)
- MolSSI/QCEngine#461: implement `Opt` and `TD` v2 schema (needs qcel 363 and qcng 460)
- ~MolSSI/QCElemental#364: standardize model names, rational locations, protocols (needs qcel 363)~ NOW 366
- ~MolSSI/QCEngine#462: implement standardize model names, etc. (needs qcel 364 and qcng 461)~ NOW 468
Spring 2025
Relevant PRs (accumulating into next2025 branches of QCElemental and QCEngine)
- `next2025` branches are `next2024` rebased atop QCElemental v0.29 and QCEngine v0.31

v0.50a1 Release Tag
- MolSSI/QCElemental#366: standardize model names, rational locations, protocols (needs qcel 363)
- MolSSI/QCEngine#468: standardize model fields, esp. TD (needs qcel 366 and qcng 461)
- MolSSI/QCElemental#367 (needs qcel 366)
- MolSSI/QCEngine#469: new buildsys (needs qcel 367 and qcng 468)
- MolSSI/QCEngine#471
Thanks to the Center for Scientific Software Engineering at Georgia Tech, I'm going to be able to tackle this pydantic v1/v2 problem. General strategy is:
- Pydantic
  - you will be required to use pydantic v2 with subsequent releases of qcel/qcng. (don't panic, you won't need to use the v2 API, just the v2 package, from which the v1 API can be imported)
  - from qcelemental, the existing QCSchema models (mostly `schema_version=1` except Mol that's `2`) will continue to be available as `from qcelemental.models import AtomicInput`, etc., and will continue to use pydantic v1 API. To fend off pydantic v2 API for a while, you can also `from qcelemental.models.v1 import AtomicInput`
  - a new "v2" version of QCSchema that is written based on Pydantic v2 API according to the PRs listed in the headmatter will be available as `from qcelemental.models.v2 import AtomicInput`, etc.
  - QCEngine will run either the longstanding QCSchema v1 or the newfangled QCSchema v2. you'll get back whichever version you put in.
- Layout
  - Since pydantic has necessitated the upheaval above, we also want to take the opportunity to do some of the layout rearrangements that have been long in discussion (see `next` branch, #264, mini meeting at MQM 2022 at Virginia Tech). In particular:
    - (a) storing `<CalcType>Input` on `<CalcType>Result` rather than the latter inheriting from the former. This helps restart and more explicitly handles programs messing with Molecule orientation.
    - (b) having a few base classes (probably `BaseInput`, `BaseSpecification`, `BaseResult`, `BaseProperties`) so that misc. fields don't get left out (e.g., `id`, `protocols`, `return_result`, `return_gradient`, respectively).
    - (c) separate "what molecule" from "how to compute" so there aren't largely redundant schema (`AtomicInput`/`QCInputSpecification` and `OptimizationInput`/`OptimizationSpecification`) and they become more composable
  - See image below for anticipated form for `AtomicInput`/`AtomicResult`
  - These changes are deliberately layout-only rather than adding features
- this plan is as undisruptive as I can devise while still breaking the impasse. I'm glad to hear opinions or concerns or show sketches for the other models or discuss further.
- there may be another normal release of qcel/qcng before the above take effect
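
To make layout points (a) and (c) concrete, here is a rough sketch using standard-library dataclasses as stand-ins for the pydantic models. It is not the QCSchema v2 definition: the class name `AtomicSpecification` and the exact field sets are assumptions for illustration, and only `input_data`, `specification`, `success`, `id`, and `return_result` are names taken from the discussion above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Molecule:
    """'What molecule': kept separate from the computation settings (point c)."""
    symbols: List[str]
    geometry: List[float]


@dataclass
class AtomicSpecification:
    """'How to compute'. Hypothetical class name; reusable by other inputs."""
    driver: str
    model: Dict[str, str]
    keywords: Dict[str, Any] = field(default_factory=dict)


@dataclass
class AtomicInput:
    molecule: Molecule
    specification: AtomicSpecification
    id: Optional[str] = None


@dataclass
class AtomicResult:
    """Stores the input instead of inheriting from it (point a)."""
    input_data: AtomicInput
    molecule: Molecule  # as the program left it, possibly reoriented
    return_result: float
    success: bool = True


he = Molecule(symbols=["He"], geometry=[0.0, 0.0, 0.0])
spec = AtomicSpecification(driver="energy", model={"method": "hf", "basis": "sto-3g"})
inp = AtomicInput(molecule=he, specification=spec)
# return_result is a placeholder number, not a computed energy.
res = AtomicResult(input_data=inp, molecule=he, return_result=-2.8)
print(res.input_data.specification.driver, res.success)
```

Because the result carries the full input, a restart can reuse `res.input_data` as-is, and the result's own `molecule` is free to differ from the input molecule when a program reorients it.
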
@Lnaden @coltonbh @mattwthompson @berquist @awvwgk @bennybp
Hi @loriab thanks for doing this work. I built out the models from the next branch for my own use and use in our lab some time ago, used them extensively, then updated the designs based on our collective experience. My thoughts are now baked into the qcio (Quantum Chemistry Input/Output) package. You can find documentation here: https://qcio.coltonhicks.com. Feel free to use any of the ideas you find useful. Happy to chat more details if at all helpful.
Hi @coltonbh, yes, we've been sadly slow at proceeding on this, and I'm glad you've got a working relative of QCSchema so as to not impede research. I'll reach out further in a couple weeks to make sure I understand some of your design decisions. I think you're right that the layout (flat) that makes a convenient API interface doesn't necessarily make compact and descriptive models for schema.
Just so I understand correctly, the `from qcelemental.models.v1 import *` hatch isn't live yet, but would be in the first release (or at least the first one associated with this effort)?
> Just so I understand correctly, the `from qcelemental.models.v1 import *` hatch isn't live yet, but would be in the first release (or at least the first one associated with this effort)?

Correct, the `qcelemental.models` bifurcation is in a local branch at present. Expected in a release in late October.
FYI, latest QCSchema v2 timetable posted at https://github.com/MolSSI/QCElemental/pull/377#issuecomment-3380191869