idaes-pse icon indicating copy to clipboard operation
idaes-pse copied to clipboard

Spring cleaning for dependencies

Open lbianchi-lbl opened this issue 3 years ago • 10 comments

Goals

  • [ ] Get rid of dependencies that are not needed anymore
  • [ ] Separate remaining dependencies in two broad categories:
    • core: used extensively throughout the IDAES codebase
    • optional: only used in a limited subset of submodules and/or applications, i.e. not all IDAES users will need them
  • [ ] Divide optional dependencies in a reasonable amount (2 <= n < 8) of categories to be used as extras_require, i.e. pip install idaes-pse[my-dependencies]

Current status

Requirement Imported in modules Notes Effort to make optional
backports.shutil_get_terminal_size Only needed for Python <3.3 0
bunch 0
click >9000
colorama idaes.dmf.util 1
flask UI, but probably not needed anymore? 0
flask-cors Same 0
jupyter Used for the examples 2
pywin32 Only needed for version constraint 2
lxml idaes.dmf.util 1
matplotlib ~30 3
nbconvert idaes.commands.examples 1
nbformat ideas.commands.examples, idaes.dmf 1
numpy ~90 5
networkx pyomo sequential modular tool, and by extension many flowsheet examples Easy to make optional, but used fairly extensively 1
omlt idaes.surrogate.keras_surrogate 1
pandas ~50 4
pint all IDAES models, idaes.dmf.model_data >9000
psutil 0
pyomo ~500 >9000
pytest ~300 test_* and conftest.py 2
pyyaml idaes.generic_models.properties.core.coolprop, idaes.config, idaes.dmf 3
requests idaes.ui.fsvis 1
python-slugify idaes.ui.fsvis 1
scipy ~20 3
seaborn idaes.dmf.model_data, idaes.core.base.costing_base(future) Already optional manual install 1
sympy idaes.core.util.expr_doc, idaes.surrogate.helmet Helmet should not be considered too heavily in this, it will soon end up in the "contrib" section 2
tinydb idaes.dmf 1
rbfopt idaes.surrogate.ripe This is contrib code, and this should definitely be an optional dependency 1
ipython<8.0.0 Only needed for version constraint 2
tensorflow idaes.surrogate.keras_surrogate Already has conditional import 1
coolprop idaes.generic_models.properties.core.coolprop + test Already has conditional import 1

(table generated with the useful tool https://www.tablesgenerator.com/markdown_tables)

Effort to make optional:

  • 0: can be removed
  • 1: extras_require + conditional import
  • 2: Like 1, with caveats
  • 3: Possible but non-negligible amount of work required; probably not worth it
  • 4: Possible, but a lot of work; most likely not worth it
  • 5: Not impossible, but realistically not feasible
  • >5: "Let's rewrite Pyomo from scratch"

Proposed grouping

Option A

Core dependencies

  • pyomo
  • pint
  • numpy
  • pandas
  • matplotlib
  • networkx
  • pyyaml
  • scipy
  • click

Optional dependencies in extras_require

  • testing
    • pytest
  • examples OR jupyter OR notebooks
    • jupyter
    • nbconvert
    • nbformat
    • ipython (version constraint only)
    • pywin32 (version constraint only)
  • dmf
    • colorama
    • lxml
    • tinydb
    • nbformat
  • ui OR fsvis
    • requests
    • python-slugify
  • omlt OR keras OR mlai etc
    • omlt
    • tensorflow

Optional dependencies (standalone)

  • sympy
  • coolprop - this is also useful for verification testing of property models.
  • rbfopt

lbianchi-lbl avatar Feb 11 '22 00:02 lbianchi-lbl

@lbianchi-lbl Do you want us to update the table above if we have comments to make? E.g. pint is basically required for all IDAES models through our use of pyomo's unit conversion tools.

andrewlee94 avatar Feb 11 '22 14:02 andrewlee94

@lbianchi-lbl Do you want us to update the table above if we have comments to make? E.g. pint is basically required for all IDAES models through our use of pyomo's unit conversion tools.

@andrewlee94 that's an excellent point and yes, please feel free to add corrections. I've started by looking at modules that are directly imported by IDAES code, but this ignores cases such as pint where the module is used extensively even though it's not imported.

Also, the table might be finicky to edit by hand, so you can write corrections in a comment and I can then amend the table if that seems easier.

lbianchi-lbl avatar Feb 11 '22 14:02 lbianchi-lbl

On the topic of things like pytest - these are only used by the tests, an thus many users probably aren't that concerned. However, we do (or at least did) tell users to run tests as part of the installation.

Personally, I would be inclined to make things like this a developer-only dependency to keep the burden on the user low, but we would need to come up with an alternative for testing an IDAES install - running a prebuilt example notebook would probably be a good alternative, and honestly more useful to a general user.

andrewlee94 avatar Feb 11 '22 14:02 andrewlee94

@lbianchi-lbl I would also suggest that we might also want to have some further "use-case" only dependencies - i.e. things that are only used by one (sub-)package in the code. These we would just have a check for in the code, and tell the user that they need to get the dependency themselves.

I can see a few cases above where a "contrib-quality" package is bringing in a new dependency (e.g. rbfopt), and I don't think we should go out of our way to support these cases.

andrewlee94 avatar Feb 11 '22 15:02 andrewlee94

@andrewlee94 - thanks for the corrections for pint, networkx, and pyyaml. It seems that these should be part of the core dependencies, also considering that (AFAIK) the payoff from making them optional doesn't seem very substantial.

For pytest, I get what you're saying in that its primary use is by IDAES contributors, which would make it part of the separate dev dependencies in requirements-dev.txt. However, being able to use the test suite to verify that IDAES has been set up correctly is useful not only for end users, but also for automated testing of any "site-packages" (non-editable) installation, including when cutting a release. We've been using this extensively with @ksbeattie, and makes us inclined towards considering pytest as an optional dependency (installable as e.g. idaes-pse[testing]), rather than an exclusively dev-only requirement.

lbianchi-lbl avatar Feb 11 '22 15:02 lbianchi-lbl

@lbianchi-lbl I would also suggest that we might also want to have some further "use-case" only dependencies - i.e. things that are only used by one (sub-)package in the code. These we would just have a check for in the code, and tell the user that they need to get the dependency themselves.

I can see a few cases above where a "contrib-quality" package is bringing in a new dependency (e.g. rbfopt), and I don't think we should go out of our way to support these cases.

@andrewlee94 I agree, that's another good point. As I see it, the main criteria for deciding whether a dependency mypkg would be "worthy" of an extras_require tag are:

  • Is mypkg only used/useful as part of a "bundle" with other dependencies?
  • Is mypkg required by substantial parts of the codebase such as not including it (and therefore, let the tests that require it be skipped) is going to affect the test coverage significantly?
  • Does mypkg have any particular version and/or platform-specific constraints, such that it would make sense to encode them in a requirement directive within setup.py?

For any of the dependencies that we're classifying as optional, we could use these criteria to decide the extras_require tag (if any) that it will be grouped into.

lbianchi-lbl avatar Feb 11 '22 15:02 lbianchi-lbl

I've edited the original comment adding a first proposed grouping for the dependencies, as "Option A". Feel free to add your thoughts, either in a comment to the issue or by copying "Option A", making the changes, and posting it under a different identifier.

lbianchi-lbl avatar Feb 11 '22 15:02 lbianchi-lbl

Once this is sorted out, we need to update this page in our docs: https://idaes-pse.readthedocs.io/en/stable/tutorials/getting_started/opt_dependencies.html#optional-dependencies

ksbeattie avatar Apr 28 '22 18:04 ksbeattie

I added seaborn to the table. I used it in one place and you currently need to manually install if you want it. Someone also commented that they may use it in the future. It's definitely very optional.

eslickj avatar Apr 30 '22 14:04 eslickj

This might be a good candidate for making at least some of these changes immediately after the Aug release.

ksbeattie avatar Jul 28 '22 18:07 ksbeattie

Summarizing the next steps for this:

  • [x] As a low-hanging fruit, make a first pass and get rid of all dependencies that can be removed with no or minimal changes to the code (e.g. something is imported but not actually used)
    • [x] @lbianchi-lbl will open a PR with these changes
  • [ ] The biggest payoff would probably be separating the Jupyter stack, which can be done once idaes get-examples and the DMF are removed and/or adapted
    • [ ] @lbianchi-lbl and @dangunter will discuss and come up with a plan

lbianchi-lbl avatar Mar 02 '23 19:03 lbianchi-lbl