flopy refactor(createpackages): use jinja for mf6 module code generation

Carved out of #2317 — just move to Jinja as an initial step, type hints can come afterwards as they will take some effort to get right. The aim here is to match the old createpackages.py capabilities without relying on mfstructure.py. This is an initial step to unplug mfstructure.py and move to a leaner representation of the input spec, but mfstructure.py is still used at runtime which will need to be unraveled in followup work.

Introduces a flopy.mf6.utils.codegen module with some new machinery for loading an input specification from DFN files into an intermediate representation, and generating source code with Jinja. I tried to keep this fairly general, with special handling for quirks of the existing framework in a "shim" which transforms the template context before rendering. The shim still writes some Python by hand (__init__ method body and class attributes section). The templates are also more complicated than should later be necessary.

The DFN load logic can be simplified when the flat DFN moves to a structured format (e.g. TOML). It becomes unnecessary at that point to inflate composite variables. For now the load routine uses a data structure from the boltons library to represent a DFN before structural parsing as a map of variables with possibly duplicate names. I'm fairly comfortable bringing boltons in as it has no dependencies and may be handy in the future. If we could guarantee all variable names were unique by changing a few DFNs this wouldn't be needed. It could also be done by hand which @mjreno has prototyped.

This PR uses dataclasses on a few classes to get init methods and nested dict conversion for free. This is a nod towards attrs which we have been prototyping with and which is a more powerful superset of the dataclasses module.

Miscellaneous:

add some mermaid diagrams to the MF6 module dev guide
minor fix in flopy/mf6/data/mfdatastorage.py to avoid referencing a variable before it exists
update DFNs as per https://github.com/MODFLOW-USGS/modflow6/pull/2031

Oct 10 '24 17:10 wpbonelli

Codecov Report

Attention: Patch coverage is 4.91803% with 406 lines in your changes missing coverage. Please review.

Project coverage is 75.9%. Comparing base (bb9824e) to head (ad9e2e2). Report is 67 commits behind head on develop.

Files with missing lines	Patch %	Lines
flopy/mf6/utils/codegen/filters.py	0.0%	308 Missing :warning:
flopy/mf6/utils/codegen/__init__.py	0.0%	70 Missing :warning:
flopy/mf6/utils/codegen/component.py	0.0%	20 Missing :warning:
flopy/mf6/utils/generate_classes.py	40.0%	6 Missing :warning:
flopy/mf6/utils/createpackages.py	50.0%	2 Missing :warning:

Additional details and impacted files

@@            Coverage Diff            @@
##           develop   #2333     +/-   ##
=========================================
+ Coverage     68.4%   75.9%   +7.5%     
=========================================
  Files          294     296      +2     
  Lines        59390   63918   +4528     
=========================================
+ Hits         40652   48577   +7925     
+ Misses       18738   15341   -3397

Files with missing lines	Coverage Δ
flopy/mf6/data/mfdatastorage.py	`75.4% <ø> (+6.5%)`	:arrow_up:
flopy/mf6/data/mfdatautil.py	`77.9% <ø> (+23.3%)`	:arrow_up:
flopy/mf6/data/mfstructure.py	`73.6% <100.0%> (+9.5%)`	:arrow_up:
flopy/mf6/mfbase.py	`87.6% <100.0%> (+12.1%)`	:arrow_up:
flopy/mf6/mfpackage.py	`81.9% <ø> (+17.3%)`	:arrow_up:
flopy/mf6/mfsimbase.py	`75.6% <100.0%> (+13.4%)`	:arrow_up:
flopy/mf6/utils/createpackages.py	`60.0% <50.0%> (+53.5%)`	:arrow_up:
flopy/mf6/utils/generate_classes.py	`18.6% <40.0%> (+0.8%)`	:arrow_up:
flopy/mf6/utils/codegen/component.py	`0.0% <0.0%> (ø)`
flopy/mf6/utils/codegen/__init__.py	`0.0% <0.0%> (ø)`
... and 1 more

... and 249 files with indirect coverage changes

🚀 New features to boost your workflow:

❄ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Oct 18 '24 00:10 codecov[bot]

This PR adds a script to convert DFNs to a tentative TOML format: python -m flopy.mf6.utils.codegen.dfn2toml. This script builds on a lot of initial work done by @mjreno. It will populate the flopy/mf6/data/toml folder by default. This folder is ignored by git for now.

The TOML format is visually similar to DFNs. The main differences are it has an internal structure with top-level attributes name and vars (also solution, subpackage, package_type, and multi attributes, metadata info which was previously in #-prefixed comments at the top of DFN files), and composite variables have children. In future we can load a complex data structure directly into the form used for code generation instead of reading it off the type attribute (correspondingly, composites' type attribute now no longer enumerates their children).

For instance, this

# --------------------- prt oc options ---------------------

block options
name budget_filerecord
type record budget fileout budgetfile
shape
reader urword
tagged true
optional true
longname
description

block options
name budget
type keyword
shape
in_record true
reader urword
tagged true
optional false
longname budget keyword
description keyword to specify that record corresponds to the budget.

block options
name fileout
type keyword
shape
in_record true
reader urword
tagged true
optional false
longname file keyword
description keyword to specify that an output filename is expected next.

block options
name budgetfile
type string
preserve_case true
shape
in_record true
reader urword
tagged false
optional false
longname file keyword
description name of the output file to write budget information.

...

becomes this

name = "prt-oc"
multi = false

[vars]
[vars.budget_filerecord]
name = "budget_filerecord"
type = "record"
block = "options"

[vars.budget_filerecord.children]
[vars.budget_filerecord.children.budget]
block = "options"
name = "budget"
type = "keyword"
in_record = "true"
reader = "urword"
tagged = "true"
optional = "false"
longname = "budget keyword"
description = "keyword to specify that record corresponds to the budget."

[vars.budget_filerecord.children.fileout]
block = "options"
name = "fileout"
type = "keyword"
in_record = "true"
reader = "urword"
tagged = "true"
optional = "false"
longname = "file keyword"
description = "keyword to specify that an output filename is expected next."

[vars.budget_filerecord.children.budgetfile]
block = "options"
name = "budgetfile"
type = "string"
preserve_case = "true"
in_record = "true"
reader = "urword"
tagged = "false"
optional = "false"
longname = "file keyword"
description = "name of the output file to write budget information."

...

I think we could experiment with the TOML format for some time, see which attributes we really need to define variables, and make revisions as needed.

In the meantime we can unhook mfstructure.py at runtime, or at least remove the need to reproduce the original dfn format verbatim in the generated component classes. (If we don't do this soon, we need to preserve the flat dfn format and make sure it's identical to maintain the current functionality, which may be painful.)

When we are happy with the TOML format, we could move the conversion script to devtools so flopy and mf6 could both use it, and begin to consume TOML definitions here and mf6. Then all the complicated load logic here could be thrown away.

Nov 07 '24 19:11 wpbonelli

@jdhughes-usgs @langevin-usgs I think this is about ready, but maybe it should wait til after the class if the generate classes script will be used there

Nov 15 '24 11:11 wpbonelli

After https://github.com/MODFLOW-USGS/modflow-devtools/pull/167 and https://github.com/MODFLOW-USGS/modflow-devtools/pull/173 this ~~could now be~~ has now been reworked to run the toml conversion script, load toml directly into jinja contexts, and dispense with dfn.py, which ~~would make~~ makes the PR much smaller. The conversion script is an adapter which we can eventually get rid of also, once DFN files go to TOML upstream in MF6. (And at that point we might also want to consider eliminating devtools from the process, if we don't mind duplicating some http client boilerplate to download and unzip the dfns?)

In any case this will have to wait for 3.10 as 3.9 is imminent

Dec 15 '24 00:12 wpbonelli

outdated

@deltamarnix after considering I think your work in progress (removing the legacy `dfn` attr on the generated classes) should go before this. This PR has shrunk a lot with the dfn -> toml conversion provided by devtools. It would be messy to try to preserve or reconstruct the old unstructured format when the whole point is to drop it and clean things up. So I took the `dfn` attr out of the templates in the last commit here. Let's discuss.

Jan 14 '25 22:01 wpbonelli

IMO this should wait til this week's release goes out, then it can get some exercise on the develop branch for a while

Feb 04 '25 13:02 wpbonelli

Thanks. Yeah, this is why I want to wait to merge til after release. I think it'll take some time to ferret out all the minute differences and make sure no behavior has changed, even after merge.

Feb 06 '25 10:02 wpbonelli

here goes nothing..

Mar 07 '25 19:03 wpbonelli

flopy flopy copied to clipboard

refactor(createpackages): use jinja for mf6 module code generation

Codecov Report

flopy
flopy copied to clipboard