idaes-pse
idaes-pse copied to clipboard
Spring cleaning for dependencies
Goals
- [ ] Get rid of dependencies that are not needed anymore
- [ ] Separate remaining dependencies in two broad categories:
- core: used extensively throughout the IDAES codebase
- optional: only used in a limited subset of submodules and/or applications, i.e. not all IDAES users will need them
- [ ] Divide optional dependencies in a reasonable amount (
2 <= n < 8
) of categories to be used asextras_require
, i.e.pip install idaes-pse[my-dependencies]
Current status
Requirement | Imported in modules | Notes | Effort to make optional |
---|---|---|---|
backports.shutil_get_terminal_size | Only needed for Python <3.3 | 0 | |
bunch | 0 | ||
click | >9000 | ||
colorama | idaes.dmf.util |
1 | |
flask | UI, but probably not needed anymore? | 0 | |
flask-cors | Same | 0 | |
jupyter | Used for the examples | 2 | |
pywin32 | Only needed for version constraint | 2 | |
lxml | idaes.dmf.util |
1 | |
matplotlib | ~30 | 3 | |
nbconvert | idaes.commands.examples |
1 | |
nbformat | ideas.commands.examples , idaes.dmf |
1 | |
numpy | ~90 | 5 | |
networkx | pyomo sequential modular tool, and by extension many flowsheet examples | Easy to make optional, but used fairly extensively | 1 |
omlt | idaes.surrogate.keras_surrogate |
1 | |
pandas | ~50 | 4 | |
pint | all IDAES models, idaes.dmf.model_data |
>9000 | |
psutil | 0 | ||
pyomo | ~500 | >9000 | |
pytest | ~300 test_* and conftest.py |
2 | |
pyyaml | idaes.generic_models.properties.core.coolprop , idaes.config , idaes.dmf |
3 | |
requests | idaes.ui.fsvis |
1 | |
python-slugify | idaes.ui.fsvis |
1 | |
scipy | ~20 | 3 | |
seaborn | idaes.dmf.model_data, idaes.core.base.costing_base(future) | Already optional manual install | 1 |
sympy | idaes.core.util.expr_doc , idaes.surrogate.helmet |
Helmet should not be considered too heavily in this, it will soon end up in the "contrib" section | 2 |
tinydb | idaes.dmf |
1 | |
rbfopt | idaes.surrogate.ripe |
This is contrib code, and this should definitely be an optional dependency | 1 |
ipython<8.0.0 | Only needed for version constraint | 2 | |
tensorflow | idaes.surrogate.keras_surrogate |
Already has conditional import | 1 |
coolprop | idaes.generic_models.properties.core.coolprop + test |
Already has conditional import | 1 |
(table generated with the useful tool https://www.tablesgenerator.com/markdown_tables)
Effort to make optional:
- 0: can be removed
- 1:
extras_require
+ conditional import - 2: Like 1, with caveats
- 3: Possible but non-negligible amount of work required; probably not worth it
- 4: Possible, but a lot of work; most likely not worth it
- 5: Not impossible, but realistically not feasible
- >5: "Let's rewrite Pyomo from scratch"
Proposed grouping
Option A
Core dependencies
-
pyomo
-
pint
-
numpy
-
pandas
-
matplotlib
-
networkx
-
pyyaml
-
scipy
-
click
Optional dependencies in extras_require
-
testing
-
pytest
-
-
examples
ORjupyter
ORnotebooks
-
jupyter
-
nbconvert
-
nbformat
-
ipython
(version constraint only) -
pywin32
(version constraint only)
-
-
dmf
-
colorama
-
lxml
-
tinydb
-
nbformat
-
-
ui
ORfsvis
-
requests
-
python-slugify
-
-
omlt
ORkeras
ORmlai
etc-
omlt
-
tensorflow
-
Optional dependencies (standalone)
-
sympy
-
coolprop
- this is also useful for verification testing of property models. -
rbfopt
@lbianchi-lbl Do you want us to update the table above if we have comments to make? E.g. pint
is basically required for all IDAES models through our use of pyomo
's unit conversion tools.
@lbianchi-lbl Do you want us to update the table above if we have comments to make? E.g.
pint
is basically required for all IDAES models through our use ofpyomo
's unit conversion tools.
@andrewlee94 that's an excellent point and yes, please feel free to add corrections. I've started by looking at modules that are directly imported by IDAES code, but this ignores cases such as pint
where the module is used extensively even though it's not imported.
Also, the table might be finicky to edit by hand, so you can write corrections in a comment and I can then amend the table if that seems easier.
On the topic of things like pytest
- these are only used by the tests, an thus many users probably aren't that concerned. However, we do (or at least did) tell users to run tests as part of the installation.
Personally, I would be inclined to make things like this a developer-only
dependency to keep the burden on the user low, but we would need to come up with an alternative for testing an IDAES install - running a prebuilt example notebook would probably be a good alternative, and honestly more useful to a general user.
@lbianchi-lbl I would also suggest that we might also want to have some further "use-case" only dependencies - i.e. things that are only used by one (sub-)package in the code. These we would just have a check for in the code, and tell the user that they need to get the dependency themselves.
I can see a few cases above where a "contrib-quality" package is bringing in a new dependency (e.g. rbfopt
), and I don't think we should go out of our way to support these cases.
@andrewlee94 - thanks for the corrections for pint
, networkx
, and pyyaml
. It seems that these should be part of the core dependencies, also considering that (AFAIK) the payoff from making them optional doesn't seem very substantial.
For pytest
, I get what you're saying in that its primary use is by IDAES contributors, which would make it part of the separate dev dependencies in requirements-dev.txt
. However, being able to use the test suite to verify that IDAES has been set up correctly is useful not only for end users, but also for automated testing of any "site-packages" (non-editable) installation, including when cutting a release. We've been using this extensively with @ksbeattie, and makes us inclined towards considering pytest
as an optional dependency (installable as e.g. idaes-pse[testing]
), rather than an exclusively dev-only requirement.
@lbianchi-lbl I would also suggest that we might also want to have some further "use-case" only dependencies - i.e. things that are only used by one (sub-)package in the code. These we would just have a check for in the code, and tell the user that they need to get the dependency themselves.
I can see a few cases above where a "contrib-quality" package is bringing in a new dependency (e.g.
rbfopt
), and I don't think we should go out of our way to support these cases.
@andrewlee94 I agree, that's another good point. As I see it, the main criteria for deciding whether a dependency mypkg
would be "worthy" of an extras_require
tag are:
- Is
mypkg
only used/useful as part of a "bundle" with other dependencies? - Is
mypkg
required by substantial parts of the codebase such as not including it (and therefore, let the tests that require it be skipped) is going to affect the test coverage significantly? - Does
mypkg
have any particular version and/or platform-specific constraints, such that it would make sense to encode them in a requirement directive withinsetup.py
?
For any of the dependencies that we're classifying as optional, we could use these criteria to decide the extras_require
tag (if any) that it will be grouped into.
I've edited the original comment adding a first proposed grouping for the dependencies, as "Option A". Feel free to add your thoughts, either in a comment to the issue or by copying "Option A", making the changes, and posting it under a different identifier.
Once this is sorted out, we need to update this page in our docs: https://idaes-pse.readthedocs.io/en/stable/tutorials/getting_started/opt_dependencies.html#optional-dependencies
I added seaborn to the table. I used it in one place and you currently need to manually install if you want it. Someone also commented that they may use it in the future. It's definitely very optional.
This might be a good candidate for making at least some of these changes immediately after the Aug release.
Summarizing the next steps for this:
- [x] As a low-hanging fruit, make a first pass and get rid of all dependencies that can be removed with no or minimal changes to the code (e.g. something is imported but not actually used)
- [x] @lbianchi-lbl will open a PR with these changes
- [ ] The biggest payoff would probably be separating the Jupyter stack, which can be done once
idaes get-examples
and the DMF are removed and/or adapted- [ ] @lbianchi-lbl and @dangunter will discuss and come up with a plan