
PSA: Pydantic v1/v2 and the QCArchive + Psi4 stack

Open loriab opened this issue 2 years ago • 6 comments

| software | pydantic v1 only (API v1) [^1] | pydantic v1/v2 tolerant (API v1) | pydantic v2 only (API v2) |
|---|---|---|---|
| QCElemental | thru 0.25.1 | 0.26.0 thru 0.27.1 | WIP #321 |
| QCEngine | thru 0.26.0 | 0.27.0 thru 0.29.0 | WIP https://github.com/MolSSI/QCEngine/pull/425 |
| QCFractal | next | 0.5 beta13 thru 0.51 | WIP https://github.com/MolSSI/QCFractal/pull/787 (upcoming v0.52) |
| Psi4 [^2] | v1.6 thru v1.8.0 | v1.8.1 _2, v1.8.2 | https://github.com/psi4/psi4/pull/3034 |

[^1]: "v1 only" describes the state of the code. conda packages may not be constrained to only solve with pydantic v1.

[^2]: psi4 before v1.6 didn't use pydantic directly.

EDIT 19 Jun 2024: The plan is to update schema v2 and pydantic v2 at the same time at 0.70


Fall 2024

QCSchema v2

  • schema expressed in pydantic v2 API
  • schema layout rearranged to facilitate composability (big changes only to procedures schema; small to AtomicInput/Result; none to Molecule)
  • no new features (maybe one)

Planned version targets for QCElemental and QCEngine:

  • v0.50 — QCSchema v2 available. QCSchema v1 unchanged (files moved but imports will work w/o change). There will be beta releases.
  • v0.70 — QCSchema v2 will become the default. QCSchema v1 will remain available, but it will require specific import paths (available as soon as v0.50).
  • v1.0 — QCSchema v2 unchanged. QCSchema v1 dropped.
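
As a reading aid, the version targets above can be written as a small lookup. This is only a sketch of the plan as stated in this post, not code from QCElemental or QCEngine: `parse_major_minor` and `schema_availability` are hypothetical helper names, and the pre-0.50 branch assumes those releases ship QCSchema v1 only.

```python
import re
from typing import Dict, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '0.50.0', 'v0.70' or '0.50a1'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def schema_availability(version: str) -> Dict[str, object]:
    """Which QCSchema generations a qcel/qcng release offers, per the plan above.

    'default' is what a plain `from qcelemental.models import ...` is planned
    to give; 'available' lists every generation that can still be imported.
    """
    major_minor = parse_major_minor(version)
    if major_minor < (0, 50):
        # Assumption: releases before v0.50 carry QCSchema v1 only.
        return {"default": 1, "available": (1,)}
    if major_minor < (0, 70):
        # v0.50: QCSchema v2 available, v1 still the default.
        return {"default": 1, "available": (1, 2)}
    if major_minor < (1, 0):
        # v0.70: QCSchema v2 is the default, v1 needs its specific import path.
        return {"default": 2, "available": (1, 2)}
    # v1.0: QCSchema v1 dropped.
    return {"default": 2, "available": (2,)}
```

For example, `schema_availability("0.50a1")` reports v1 as the default with both generations importable, which matches the beta-release stage described above.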

Relevant PRs (accumulating into next2024 branches of QCElemental and QCEngine) for Pydantic v2 API

  • MolSSI/QCElemental#345 — set up pre-commit. move models to models.v1
  • MolSSI/QCElemental#346 — adapt all tests to run parametrized through QCSchema v1 or v2. (needs 345)
  • MolSSI/QCElemental#347 — translate models.v2 to pydantic v2 API (Levi's #321). (needs 346)
  • MolSSI/QCElemental#348 — convert internal classes (non-QCSchema like Datum) to pydantic v2 API. (needs 347)
  • MolSSI/QCElemental#349 — add back dummy files to qcelemental/models/*py so it issues a warning if you try to import from files, rather than properly from models or models.v1 or models.v2. Run with PYTHONWARNINGS=all to see warnings. (needs 348)
  • MolSSI/QCEngine#452 — set up pre-commit
  • MolSSI/QCEngine#453 — requires pydantic=2 dependency. convert internal classes (non-QCSchema classes like TaskConfig) to pydantic v2 API. (needs 452 and qcel 349)
  • MolSSI/QCElemental#352 — top-level models (and FailedOperation) learned a convert_v() function to return alternate versions of QCSchema. QCSchema v1 models learned model_dump etc. so easier to write unified code. (needs 349)
  • MolSSI/QCEngine#454 — adapt all the tests to run parameterized QCSchema v1 or v2 input and output checks. (needs 453 and qcel 352)
  • MolSSI/QCEngine#455 — consolidate qcengine.compute and qcengine.compute_procedure into the former and add a return_version={1, 2, -1} argument, where the default -1 returns the input version (v1 if it can't be determined).
  • MolSSI/QCElemental#354 — fine tuning of converter functions, warnings fixes, docs. (needs 352)
  • MolSSI/QCEngine#456 — translate harnesses to use pyd v2. (needs 455 and qcel 354)
  • MolSSI/QCElemental#355 — holder for Mol testing
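
The return_version={1, 2, -1} rule from QCEngine#455 can be illustrated with a toy resolver. This is a sketch of the described behaviour only, not QCEngine's implementation: `detect_schema_version` and `resolve_return_version` are hypothetical names, and reading the version from a `schema_version` key of a dict is an assumption (the PR may inspect model classes instead).

```python
from typing import Any, Mapping


def detect_schema_version(input_data: Mapping[str, Any]) -> int:
    """Guess the QCSchema generation of an input.

    Assumption: the input is a dict carrying a 'schema_version' key.
    Falls back to 1 when the version can't be determined.
    """
    version = input_data.get("schema_version")
    return version if version in (1, 2) else 1


def resolve_return_version(input_data: Mapping[str, Any], return_version: int = -1) -> int:
    """Pick the QCSchema generation of the returned result."""
    if return_version == -1:
        # Default: return whichever version was put in.
        return detect_schema_version(input_data)
    if return_version in (1, 2):
        # Explicit request overrides the input version.
        return return_version
    raise ValueError(f"return_version must be 1, 2, or -1, not {return_version!r}")
```

With the default, a v2 input resolves to 2 and an input with no recognizable version resolves to 1; passing return_version=2 forces a v2 result regardless of the input.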

Relevant PRs (accumulating into next2024 branches of QCElemental and QCEngine) for data layout rearrangement

  • MolSSI/QCElemental#357 — remove Results.error field, enforce .success field.
  • MolSSI/QCEngine#458 — make FailedOp.input_data object where possible (needs qcel 357)
  • MolSSI/QCElemental#361 — interloper: py38+ and poetry -> setuptools to help CI
  • MolSSI/QCElemental#358 — AtomicResult.input_data field (needs qcel 357)
  • MolSSI/QCEngine#459 — implement AtomicResult.input_data field (needs qcel 358 and qcng 458)
  • MolSSI/QCElemental#359 — AtomicInput.specification field (needs qcel 358)
  • MolSSI/QCEngine#460 — implement AtomicInput.specification field (needs qcel 358 and qcng 459)
  • MolSSI/QCElemental#363 — Opt and TD schema (needs qcel 359)
  • MolSSI/QCEngine#461 — implement Opt and TD v2 schema (needs qcel 363 and qcng 460)
  • ~MolSSI/QCElemental#364 — standardize model names, rational locations, protocols (needs qcel 363)~ NOW 366
  • ~MolSSI/QCEngine#462 — implement standardize model names, etc. (needs qcel 364 and qcng 461)~ NOW 468
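
The "needs" annotations in the layout list above form a dependency graph, so one valid merge order can be derived mechanically. A minimal sketch using the standard-library graphlib (Python 3.9+); the short labels such as qcel357 are shorthand for the PRs listed above, and the two struck-through PRs are left out.

```python
from graphlib import TopologicalSorter

# Each PR maps to the set of PRs it "needs", as annotated in the list above.
needs = {
    "qcel357": set(),
    "qcng458": {"qcel357"},
    "qcel361": set(),  # interloper, no stated prerequisite
    "qcel358": {"qcel357"},
    "qcng459": {"qcel358", "qcng458"},
    "qcel359": {"qcel358"},
    "qcng460": {"qcel358", "qcng459"},
    "qcel363": {"qcel359"},
    "qcng461": {"qcel363", "qcng460"},
}

# static_order() yields every PR after all of its prerequisites.
merge_order = list(TopologicalSorter(needs).static_order())
print(merge_order)
```

More than one order satisfies the constraints, so the printed sequence is just one admissible choice.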

Spring 2025

Relevant PRs (accumulating into next2025 branches of QCElemental and QCEngine)

  • next2025 branches are next2024 rebased atop QCElemental v0.29 and QCEngine v0.31
  • v0.50a1 Release Tag
  • MolSSI/QCElemental#366 — standardize model names, rational locations, protocols (needs qcel 363)
  • MolSSI/QCEngine#468 — standardize model fields, esp. TD (needs qcel 366 and qcng 461)
  • MolSSI/QCElemental#367 — (needs qcel 366)
  • MolSSI/QCEngine#469 — new buildsys (needs qcel 367 and qcng 468)
  • MolSSI/QCEngine#471

loriab, Aug 30 '23 15:08

Thanks to the Center for Scientific Software Engineering at Georgia Tech, I'm going to be able to tackle this pydantic v1/v2 problem. General strategy is:

  • Pydantic
    • you will be required to use pydantic v2 with subsequent releases of qcel/qcng. (don't panic: you won't need to use the v2 API, just the v2 package, from which the v1 API can be imported.)
    • from qcelemental, the existing QCSchema models (mostly schema_version=1, except Mol, which is 2) will continue to be available as from qcelemental.models import AtomicInput, etc., and will continue to use the pydantic v1 API. To fend off the pydantic v2 API for a while, you can also write: from qcelemental.models.v1 import AtomicInput
    • a new "v2" version of QCSchema, written against the pydantic v2 API according to the PRs listed in the headmatter, will be available as from qcelemental.models.v2 import AtomicInput, etc.
    • QCEngine will run either the longstanding QCSchema v1 or the newfangled QCSchema v2. you'll get back whichever version you put in.
  • Layout
    • Since pydantic has necessitated the upheaval above, we also want to take the opportunity to do some of the layout rearrangements that have been long in discussion (see next branch, #264, mini meeting at MQM 2022 at Virginia Tech). In particular:
      • (a) storing <CalcType>Input on <CalcType>Result rather than the latter inheriting from the former. This helps restart and more explicitly handles programs messing with Molecule orientation.
      • (b) having a few base classes (probably BaseInput, BaseSpecification, BaseResult, BaseProperties) so that misc. fields don't get left out (e.g., id, protocols, return_result, return_gradient, respectively).
      • (c) separating "what molecule" from "how to compute" so there aren't largely redundant schemas (AtomicInput/QCInputSpecification and OptimizationInput/OptimizationSpecification) and they become more composable
    • See image below for anticipated form for AtomicInput/AtomicResult
    • These changes are deliberately layout-only rather than adding features
  • this plan is as undisruptive as I can devise while still breaking the impasse. I'm glad to hear opinions or concerns or show sketches for the other models or discuss further.
  • there may be another normal release of qcel/qcng before the above take effect
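
For downstream code that wants to stay on QCSchema v1 across these releases, one possible pattern is to try the explicit v1 path first and fall back to the longstanding location. The helper below is a generic sketch with a hypothetical name (`import_first`); it relies only on the standard-library importlib, and the qcelemental paths in the trailing comment are the ones named in this post.

```python
import importlib
from types import ModuleType
from typing import List, Sequence


def import_first(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported."""
    errors: List[str] = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Intended use (left as a comment so the sketch runs without qcelemental installed):
#   models = import_first(["qcelemental.models.v1", "qcelemental.models"])
#   AtomicInput = models.AtomicInput
```

Listing the v1 path first matters: once QCSchema v2 becomes the default, a plain qcelemental.models import would no longer give the v1 models.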

[Image: AtInRes_v2, anticipated form of AtomicInput/AtomicResult]
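
Layout points (a) through (c) can be roughed out with plain dataclasses. This is an illustrative stand-in, not a transcription of the image and not the actual pydantic models: the class name AtomicSpecification, the exact field sets, and the presence of a separate molecule on the result are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Molecule:
    symbols: List[str]
    geometry: List[float]  # flat list of Cartesian coordinates


@dataclass
class AtomicSpecification:
    """'How to compute' (point c), kept apart from the molecule."""
    driver: str
    model: Dict[str, str]
    keywords: Dict[str, Any] = field(default_factory=dict)
    protocols: Dict[str, Any] = field(default_factory=dict)


@dataclass
class AtomicInput:
    """'What molecule' plus a reusable specification."""
    molecule: Molecule
    specification: AtomicSpecification
    id: Optional[str] = None


@dataclass
class AtomicResult:
    """Stores the input (point a) rather than inheriting from AtomicInput."""
    input_data: AtomicInput
    molecule: Molecule  # assumption: the molecule as the program oriented it
    success: bool
    return_result: Any = None


helium = Molecule(symbols=["He"], geometry=[0.0, 0.0, 0.0])
neon = Molecule(symbols=["Ne"], geometry=[0.0, 0.0, 0.0])
spec = AtomicSpecification(driver="energy", model={"method": "hf", "basis": "sto-3g"})

inp = AtomicInput(molecule=helium, specification=spec)
other = AtomicInput(molecule=neon, specification=spec)  # same spec, different molecule

# return_result is a placeholder value, not a computed energy.
res = AtomicResult(input_data=inp, molecule=helium, success=True, return_result=0.0)
```

Because the result carries input_data, the original request can be recovered for a restart, and one specification object can be paired with any number of molecules.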

@Lnaden @coltonbh @mattwthompson @berquist @awvwgk @bennybp

loriab, Aug 19 '24 21:08

Hi @loriab thanks for doing this work. I built out the models from the next branch for my own use and use in our lab some time ago, used them extensively, then updated the designs based on our collective experience. My thoughts are now baked into the qcio (Quantum Chemistry Input/Output) package. You can find documentation here: https://qcio.coltonhicks.com. Feel free to use any of the ideas you find useful. Happy to chat more details if at all helpful.

coltonbh, Aug 19 '24 22:08

Hi @coltonbh, yes, we've been sadly slow at proceeding on this, and I'm glad you've got a working relative of QCSchema so as to not impede research. I'll reach out further in a couple weeks to make sure I understand some of your design decisions. I think you're right that the layout (flat) that makes a convenient API interface doesn't necessarily make compact and descriptive models for schema.

loriab, Aug 27 '24 07:08

Just so I understand correctly, the from qcelemental.models.v1 import * hatch isn't live yet, but would be in the first release (or at least the first one associated with this effort)?

mattwthompson, Aug 27 '24 18:08

> Just so I understand correctly, the from qcelemental.models.v1 import * hatch isn't live yet, but would be in the first release (or at least the first one associated with this effort)?

Correct, the qcelemental.models bifurcation is in a local branch at present. Expected in a release in late October.

loriab, Aug 27 '24 18:08

FYI, latest QCSchema v2 timetable posted at https://github.com/MolSSI/QCElemental/pull/377#issuecomment-3380191869

loriab, Oct 08 '25 08:10