
[WIP] Initial implementation of IDAES Performance Testing Suite

Open andrewlee94 opened this issue 2 years ago • 11 comments

Fixes None

Summary/Motivation:

This PR adds the first content for the IDAES performance testing suite. Work still needs to be done to determine how these tests will be run as part of the CI suite.

Changes proposed in this PR:

  • Add main.py and compare.py files from the Pyomo performance testing suite. These should be removed once Pyomo makes its code easily available.
  • Refactor the HC_PR tests to create general functions for the performance testing steps.
  • Add a general performance test runner/template for IDAES models.
  • Implement HC_PR as the first performance test.
  • Add a new pytest mark for performance tests and config methods so that these tests run only on demand.

How to Run Performance Tests

In the idaes/tests/performance folder, run: python main.py -v -d [output_dir] . --performance (the . is the path to the performance testing folder). A JSON log file containing the performance record will be created in [output_dir].

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the license terms described in the LICENSE.txt file at the top level of this directory.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

andrewlee94 avatar Aug 05 '22 18:08 andrewlee94

I intend to try this out and give it a thorough review next week

Robbybp avatar Aug 05 '22 20:08 Robbybp

FYI: Pyomo/pyomo#2476 was merged and will be in the next Pyomo release (supporting the next IDAES release)

jsiirola avatar Aug 08 '22 17:08 jsiirola

Also, see my PR https://github.com/andrewlee94/idaes-pse/pull/16 into this branch for a couple of performance tests with the moving bed model.

Robbybp avatar Aug 12 '22 18:08 Robbybp

@Robbybp The _run_test method and its associated class are not intended to be universal - they are meant to be a helper for common tests based on existing IDAES test examples (where the build, initialize, and solve methods more or less already exist). For cases where this is not convenient or appropriate, we can write custom test methods that record the necessary data.

This also addresses the concerns about dynamic examples and unit consistency - these examples could skip the unit consistency tests for now (until Pyomo fixes that problem).

andrewlee94 avatar Aug 15 '22 13:08 andrewlee94

@andrewlee94 So for a different type of test, we should implement a new class, perhaps TestDynamicModel, with its own _run_test method?

Robbybp avatar Aug 15 '22 15:08 Robbybp

I'm also thinking about an application where, for all models in the IDAES performance suite, we collect some statistics, then initialize and construct a PyomoNLP and collect some more statistics. For an application like this, it would be necessary to have a standard way to access the build_model, initialize_model, and solve_model functions. I think an IdaesPerformanceTest base class that performance test cases must inherit from would be one way to do this. Does this idea of a common API for "performance test cases" seem like a good idea that would be worth supporting?

Robbybp avatar Aug 15 '22 15:08 Robbybp

@Robbybp The first thing to note is that Pyomo is actually collecting a lot of information in the background that is not immediately apparent. E.g., we get construction times for every Pyomo component in the model and every transformation performed automatically. So, chances are most of the statistics you are thinking about are already being captured.

The things called out in the _run_test method are just there to give us some high-level summaries.

andrewlee94 avatar Aug 15 '22 16:08 andrewlee94

@Robbybp Re: different types of tests; for now I would hope that we can make a general testing class that covers most cases (once the unit consistency issue is resolved - I have a workaround for that which I will do soon). However, any test is free to use its own test class if it has things it wants to log separately (e.g. a complex flowsheet might want to log build and initialization times by sub-section). TL;DR: the test class here is intended to be a helper class for cases where it is applicable, not to enforce an arbitrary standard form (I'll add comments to this effect).

andrewlee94 avatar Aug 15 '22 16:08 andrewlee94

I am thinking more about statistics related to the algebraic/NLP models, rather than timing statistics. For instance, tracking the frequency and types of nonlinearities, size of expressions, condition numbers of various matrices, etc. This doesn't really have anything to do with performance in the sense we're discussing here, but I think it would be nice to use this collection of IDAES models as a set of fairly large, fairly practical algebraic/NLP models to help motivate research into preprocessing/solution algorithms for these types of problems. And I think it would be fairly easy/nonintrusive to require a common API to build, initialize, and solve our performance test models, at least those that use this standard workflow.
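For instance, the kind of script I have in mind might look like this - a rough sketch only, assuming the PyNumero ASL extensions are available and the model has an objective (the function name is just illustrative):

import numpy as np
from pyomo.contrib.pynumero.interfaces.pyomo_nlp import PyomoNLP


def collect_nlp_statistics(model):
    # Wrap the Pyomo model in PyNumero's NLP interface
    nlp = PyomoNLP(model)
    # Constraint Jacobian evaluated at the model's current variable values
    jac = nlp.evaluate_jacobian()
    return {
        "n_variables": nlp.n_primals(),
        "n_constraints": nlp.n_constraints(),
        # Dense conversion is fine for a one-off analysis, not for very large models
        "jacobian_condition_number": np.linalg.cond(jac.toarray()),
    }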

Robbybp avatar Aug 15 '22 16:08 Robbybp

@Robbybp Ah - that was part of the intent behind the three methods (build, initialize and solve) - so that we could reuse the same code between multiple tests. I would not make it a strict requirement, but I would encourage it wherever possible - there will inevitably be cases where it doesn't work.

I am going to rework things to have a single common class for each test case, with standard names for the three sub-methods, so I think that will address a large part of what you are suggesting.

andrewlee94 avatar Aug 15 '22 16:08 andrewlee94

@Robbybp I've added some changes so that you only need to provide a single test class for each case, which is expected to have the methods to build, initialize and solve the model. I also added a flag to allow for tests which we know fail unit consistency - we still do the assertion, but in these cases we check that it fails (so we know to go back and fix this later).
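Roughly, the flag works along these lines (a simplified sketch of the idea, not the exact code; assert_units_consistent is Pyomo's checker):

from pyomo.util.check_units import assert_units_consistent
from pyomo.core.base.units_container import UnitsError


def check_units(model, consistent_units=True):
    if consistent_units:
        # Normal case: the model must pass the unit consistency check
        assert_units_consistent(model)
    else:
        # Known-inconsistent model: assert that the check still fails,
        # so we remember to come back and fix it later
        try:
            assert_units_consistent(model)
        except UnitsError:
            pass
        else:
            raise AssertionError("Expected unit inconsistency was not detected")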

andrewlee94 avatar Aug 15 '22 18:08 andrewlee94

I just played around with importing and operating on all IDAES performance suite models. Long comment incoming. Recall that the purpose of this is to be able to answer questions like:

  • What is the average number of variables/constraints in an IDAES performance suite model?
  • What percentage of constraints in each model are nonlinear?
  • What is the condition number after initialization of each constraint Jacobian?
  • etc.

If these models all inherit from some base class like IdaesPerformanceTest that I suggest above, this amounts to importing everything from every module in IDAES and checking if it is an instance of IdaesPerformanceTest. This seems slow and cumbersome and I don't really know how to do it. If there is a good way to do this, I would be interested to learn.

An alternative I have come up with is to have the performance test base class inherit from unittest.TestCase:

import pyomo.common.unittest as unittest

class IdaesPerformanceTest(unittest.TestCase):
    def build_model(self):
        raise NotImplementedError()
    def initialize_model(self, model):
        raise NotImplementedError()
    def solve_model(self, model):
        raise NotImplementedError()
    def test_dummy(self):
        pass

Then we can iterate over performance tests with a small script using TestLoader.discover from unittest:

import os
from idaes.tests.performance.idaes_performance_base import IdaesPerformanceTest
import pyomo.common.unittest as unittest
import idaes


def generate_test_cases(test_suite, discovered=None):
    if discovered is None:
        discovered = set()
    for test in test_suite:
        if isinstance(test, unittest.TestCase):
            if test.__class__ not in discovered:
                # Sometimes we find the same test class twice for some reason...
                discovered.add(test.__class__)
                yield test
        elif isinstance(test, unittest.TestSuite):
            yield from generate_test_cases(test, discovered=discovered)


def flatten_test_suite(test_suite):
    return list(generate_test_cases(test_suite))


def main():
    loader = unittest.defaultTestLoader
    idaes_dir = os.path.dirname(idaes.__path__[0])
    test_suite = loader.discover(idaes_dir)
    test_cases = flatten_test_suite(test_suite)
    for test in test_cases:
        if isinstance(test, IdaesPerformanceTest):
            print(test)


if __name__ == "__main__":
    main()

which displays:

<a bunch of deprecation warnings>
test_dummy (idaes.models.properties.modular_properties.examples.tests.test_HC_PR.HC_PR_Model)
test_dummy (idaes.tests.performance.idaes_performance_base.IdaesPerformanceTest)
test_dummy (idaes.models.unit_models.tests.test_heat_exchanger_1D.HX1D_Model)

Note that the test_dummy method is required, otherwise these test cases will have no tests, and discover will just pick up an empty TestSuite rather than the TestCases we're looking for.

We can get around this with a slight change to the test infrastructure:

  • TestModel now serves as a base class and implements a test that calls _run_test
  • Specific test classes inherit from both TestModel and TestCase

In test_models.py:

class TestModel(object):
    def recordData(self, name, value):
        ...
    def _run_test(self, consistent_units=True):
        ...
    def build_model(self):
        raise NotImplementedError()
    def initialize_model(self, model):
        raise NotImplementedError()
    def solve_model(self, model):
        raise NotImplementedError()
    def test_build_initialize_solve(self):
        self._run_test()

In the modules for the specific tests (test_heat_exchanger_1D.py and test_HC_PR.py, shown back to back):

import pytest
import pyomo.common.unittest as unittest
from idaes.tests.performance.test_models import TestModel

@pytest.mark.performance
class HX1D_Model(TestModel, unittest.TestCase):
    ...

import pytest
import pyomo.common.unittest as unittest
from idaes.tests.performance.test_models import TestModel

@pytest.mark.performance
class HC_PR_Model(TestModel, unittest.TestCase):
    ...

Then the same script as above (with TestModel instead of IdaesPerformanceTest) can iterate over TestCases and identify those that are also instances of TestModel, with output:

test_build_initialize_solve (idaes.models.properties.modular_properties.examples.tests.test_HC_PR.HC_PR_Model)
test_build_initialize_solve (idaes.models.unit_models.tests.test_heat_exchanger_1D.HX1D_Model)

To run the tests from the idaes/tests/performance/ directory, we now need to import them into a test file, e.g. test_performance_suite.py:

from idaes.models.properties.modular_properties.examples.tests.test_HC_PR import (
    HC_PR_Model,
)
from idaes.models.unit_models.tests.test_heat_exchanger_1D import HX1D_Model

and the test suite can be run from main.py as before. I have implemented this in my perf_testing-discoverable branch, for what it's worth.

Robbybp avatar Aug 18 '22 23:08 Robbybp

@Robbybp So, a few comments in response (there is a lot there, so I will probably miss something).

  1. First, I think we are talking about different things here. The purpose of these classes is not to be the tests themselves but to hold methods that are common to other tests. The tests can then leverage these methods rather than duplicating code. Note that in the examples I have so far, the build_model methods, etc. are used in the unit tests for each model as well as in the performance tests.
  2. The actual test methods are then written elsewhere and have pytest marks that can be used for test discovery. pytest has ways of marking tests to be skipped by default (see the conftest.py sketch after this list), so we can set up tests that will only be run if called explicitly (which I have done for the performance tests). So, to run the performance tests you run pytest --performance from the idaes-pse folder. My thought is that you would want to write a separate set of tests to count constraints, etc., which leverage these common methods.
  3. Finally, have we considered how we would use the tests you are proposing in the overall CI environment? These are definitely things of interest, but I am wondering what we would use them for.
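For reference, the sketch mentioned in point 2 - the standard pytest pattern for hiding marked tests behind an opt-in flag (the exact option and mark handling in idaes/conftest.py may differ slightly):

# conftest.py
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--performance",
        action="store_true",
        default=False,
        help="run performance tests",
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--performance"):
        # Flag given: run everything, including performance tests
        return
    skip_perf = pytest.mark.skip(reason="need --performance option to run")
    for item in items:
        if "performance" in item.keywords:
            item.add_marker(skip_perf)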

andrewlee94 avatar Aug 19 '22 13:08 andrewlee94

@andrewlee94 My goal is to be able to iterate over all Performance Test Suite models in a Python script and use a common API to build, initialize, and solve. Ideally, I would be able to do this automatically, without needing to know what the test suite models are or where they are located. This is accomplished with the script I show above.

Responding to your points:

  1. In my proposal, a performance test model doubles as a test and utility class whose methods can still be used in other tests. The motivation for making these classes tests is so they are easily discoverable (for iteration in a Python script) with unittest.TestLoader.discover
  2. Rather than discovering just the test methods, I would like to discover the underlying Performance Test Models that implement build and initialize methods. The tests are still decorated with @performance and can be discovered with pytest
  3. I do not plan to iterate over Performance Test Suite models in any tests or CI environment. I would like to do so to generate statistics that justify the use of these models to benchmark optimization solvers, as I think "models like IDAES models" are under-represented as benchmark problems. I would like to collect statistics that help me define more precisely what I mean by "models like IDAES models"

Robbybp avatar Aug 19 '22 15:08 Robbybp

@Robbybp OK, I think I see the source of the problem here - the real issue is that we do not have a formal, easily discoverable definition of what constitutes the performance test suite. At the moment, the list of models is defined implicitly as part of the performance test runner, which makes it hard to repurpose for other uses.

I think the immediate solution is fairly straightforward, and it actually makes your life easier too - we need to create a separate module which defines the models that make up the test suite, and then have the test runners work from that (whether it is the timing tools in this PR or your tools for counting constraints, etc.). The only catch I see is that it does start to push us towards a standard form for all of these to take (i.e. the performance test base class), which I had been hoping to avoid (to allow for more custom performance tests if required), but we can deal with those cases when they arise.

My first thoughts are thus:

  1. Define a standard form for performance tests via the base class.
  2. Have a "performance test suite module" that imports these classes from their various locations and puts them in a dict for easy discovery. This will give us a neat, unambiguous definition of what constitutes the performance test suite.
  3. The test runner can then use the above dict to parameterize the performance tests (sketched below), which saves us having to write test methods for each model in the test suite as a bonus.
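A minimal sketch of what that could look like (the dict, module layout, and run_test API below are assumptions for illustration, not the final implementation):

# idaes/tests/performance/test_performance_suite.py (illustrative)
import pytest

from idaes.models.properties.modular_properties.examples.tests.test_HC_PR import (
    HC_PR_Model,
)
from idaes.models.unit_models.tests.test_heat_exchanger_1D import HX1D_Model

# Single, unambiguous definition of what constitutes the performance test suite
PERFORMANCE_TEST_SUITE = {
    "HC_PR": HC_PR_Model,
    "HX1D": HX1D_Model,
}


@pytest.mark.performance
@pytest.mark.parametrize(
    "test_class",
    list(PERFORMANCE_TEST_SUITE.values()),
    ids=list(PERFORMANCE_TEST_SUITE.keys()),
)
def test_performance(test_class):
    # Assumes each registered class exposes the common run_test() API
    test_class().run_test()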

This way, we don't have to do things like import all of idaes and search for instances of specific classes.

However, as I said this does force us into a standard form for the performance test models, which might limit us in future. Notably, I wanted to leave things open for us to write more granular performance logging in cases where it is appropriate (e.g. flowsheets). However, there is probably a way to make this work - I just need to think a bit longer.

andrewlee94 avatar Aug 19 '22 15:08 andrewlee94

@Robbybp I think I have the solution: the PerformanceTest base class should be merged with the existing TestModel class, with only two methods that must be defined: build_model and run_test. This way the performance test suite can just grab the run_test method, which can be customized for a specific model or left as is, while build_model should return an instance of the model to be tested (for use in your tools or others).

The run_test method may or may not make use of the build_model method, as required by the test case (i.e. simple cases will just call the build_model, initialize_model and solve_model methods, but more complex cases might have fully customized run_test methods).
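For concreteness, a rough sketch of the merged base class (the default run_test body shown here is an assumption about how the simple case would look):

class PerformanceTest:
    def build_model(self):
        # Must return the model instance to be tested
        raise NotImplementedError()

    def initialize_model(self, model):
        # Only needed if the default run_test is used
        raise NotImplementedError()

    def solve_model(self, model):
        # Only needed if the default run_test is used
        raise NotImplementedError()

    def run_test(self):
        # Default workflow; complex cases can override this entirely
        model = self.build_model()
        self.initialize_model(model)
        self.solve_model(model)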

Can you see any holes in this idea?

andrewlee94 avatar Aug 19 '22 15:08 andrewlee94

@andrewlee94 My main question is about how you envision the "performance test suite" to look. A test class that iterates over a dict, creating a new test_* method for every item in the dict? And these test methods just call run_test?

My other comments are:

  • I would like the base class for the standard performance test to have methods for initialization and solve as well.
  • Different types of models can have different classes for their performance tests. E.g. DynamicPerformanceTest or FlowsheetPerformanceTest
  • I would actually prefer to not have some global dict that defines the "IDAES Performance Suite". My main reason is so that, to implement a performance test, you only have to touch one file (the test file where you subclass PerformanceTest and TestCase for a specific model). Having a global data structure to define the performance tests strikes me as "storing the same piece of data in two locations"

Robbybp avatar Aug 19 '22 18:08 Robbybp

@Robbybp I think I have something working that doesn't require the dict of test classes and just uses a common base class and pytest discovery. This does mean however that you need to do some sort of search to find the performance tests.

I envision most performance test classes having initialize_model and solve_model methods (I actually have a default solve_model method in the base class), but I was looking at them as only being required if you use the default run_test method (i.e. if you write your own, you might not need the initialize and solve methods).
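The default solve_model is essentially this (sketch only - the class name and the get_solver import path here are assumptions):

from idaes.core.solvers import get_solver


class PerformanceBaseClass:
    def solve_model(self, model):
        # Default IDAES solver; derived classes can override as needed
        solver = get_solver()
        return solver.solve(model)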

Finally, there is nothing to stop you from having multiple performance tests for a given model - just create as many performance test classes as you want for it (or, if you want to be fancy, you can embed multiple test runs in a single class).

andrewlee94 avatar Aug 19 '22 20:08 andrewlee94

Have you looked at what I implemented in https://github.com/andrewlee94/idaes-pse/pull/17 ? It is a working implementation of what I've described here. I'm interested to see your implementation as well.

Robbybp avatar Aug 19 '22 22:08 Robbybp

@Robbybp What you have is very similar to what I have implemented, the main difference being that I am also working on the issue you had with being able to run tests in isolation.

andrewlee94 avatar Aug 22 '22 12:08 andrewlee94

@Robbybp Here is my latest version. You should be able to use the following from the idaes-pse folder to run the performance tests alone:

python idaes/tests/performance/main.py -v -d [output_dir] . --performance -m performance

I still need to remove the local copy of main.py (as it is now in the base Pyomo package), but I am having some issues with the pytest arguments and am waiting for some help from @jsiirola.

andrewlee94 avatar Aug 22 '22 14:08 andrewlee94

Codecov Report

Merging #926 (af41349) into main (6295724) will increase coverage by 2.60%. The diff coverage is 45.09%.

@@            Coverage Diff             @@
##             main     #926      +/-   ##
==========================================
+ Coverage   70.32%   72.93%   +2.60%     
==========================================
  Files         576      577       +1     
  Lines       64614    64664      +50     
  Branches    12138    12142       +4     
==========================================
+ Hits        45443    47165    +1722     
+ Misses      16871    15246    -1625     
+ Partials     2300     2253      -47     
Impacted Files Coverage Δ
idaes/core/util/performance.py 40.90% <40.90%> (ø)
idaes/conftest.py 87.17% <71.42%> (-3.73%) ↓
idaes/ver.py 61.53% <0.00%> (-4.62%) ↓
...ls_extra/power_generation/unit_models/downcomer.py 90.90% <0.00%> (ø)
...ls_extra/power_generation/unit_models/waterpipe.py 92.79% <0.00%> (ø)
..._extra/power_generation/unit_models/steamheater.py 93.07% <0.00%> (ø)
...daes/core/surrogate/pysmo/polynomial_regression.py 83.27% <0.00%> (+0.33%) ↑
idaes/core/util/initialization.py 75.78% <0.00%> (+0.44%) ↑
idaes/apps/uncertainty_propagation/sens.py 69.84% <0.00%> (+0.61%) ↑
...xtra/power_generation/properties/flue_gas_ideal.py 83.17% <0.00%> (+0.62%) ↑
... and 39 more


codecov[bot] avatar Aug 22 '22 21:08 codecov[bot]

@andrewlee94 Out of curiosity, why are -m performance and --performance both required?

Robbybp avatar Aug 24 '22 19:08 Robbybp

@Robbybp They might not be at this point - I am still working on those details. I suspect (but have not tried it) that -m performance might be sufficient.

andrewlee94 avatar Aug 24 '22 19:08 andrewlee94

@Robbybp I've addressed all your comments, although I am still looking to see if there is a way to get the base class to inherit from unittest.TestCase but not get run by pytest (so that we don't need to remember to have all the derived classes do multiple inheritance).

andrewlee94 avatar Aug 25 '22 14:08 andrewlee94

@andrewlee94 I believe inheriting from TestCase doesn't change what pytest will run, but it does change what unittest will run. The only reason this is an issue is that I can't figure out test discovery with pytest.

Robbybp avatar Aug 25 '22 17:08 Robbybp

Another comment is that the HC_PR test takes about 5 minutes to run on my laptop. Is this too long?

Robbybp avatar Aug 25 '22 17:08 Robbybp

@Robbybp 5 minutes is the expected time for the HC_PR test, which is why it was included. It involves a lot of deep expressions and unit conversions, so it hopefully represents one of our best cases for performance improvement.

andrewlee94 avatar Aug 25 '22 17:08 andrewlee94

@Robbybp @jsiirola I have implemented the __init_subclass__ method to force inheritance from TestCase and hopefully fixed the last lingering test failures. This should be ready for review.
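Roughly, one way __init_subclass__ can enforce this is as follows (a simplified sketch; the actual implementation in the PR may differ):

import unittest


class PerformanceBaseClass:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Every concrete performance test must also be a unittest.TestCase,
        # so forgetting the multiple inheritance fails loudly at class definition
        if not issubclass(cls, unittest.TestCase):
            raise TypeError(
                f"{cls.__name__} must also inherit from unittest.TestCase"
            )


# Usage: derived classes still do the multiple inheritance explicitly
class MyPerformanceTest(PerformanceBaseClass, unittest.TestCase):
    pass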

andrewlee94 avatar Aug 25 '22 19:08 andrewlee94

Also, the latest (final) way to run the performance tests is (updated in original comment too):

python [pyomo]\scripts\performance\main.py --project idaes -v -d [output_dir] . --performance

andrewlee94 avatar Aug 25 '22 19:08 andrewlee94