EMAworkbench icon indicating copy to clipboard operation
EMAworkbench copied to clipboard

Progress bar and run stats

Open steipatr opened this issue 3 years ago • 9 comments

As requested in #101, I'm filing a separate issue for each improvement. I have some initial work on progress bar and run stats in https://github.com/steipatr/EMAworkbench/tree/experiment-stats. I focused on sequential and multiprocessing evaluation. Is the IpyparallelEvaluator still relevant?

For sequential experiments, tqdm is executed directly in SequentialEvaluator ( https://github.com/steipatr/EMAworkbench/blob/6e548ef4b64594475a8b7203aa34d9dc0f4f8bbb/ema_workbench/em_framework/evaluators.py#L234 ) and should automatically detect whether it's in console or notebook.

For parallel experiments (wow this was a rabbit hole), I added tqdm to the ExperimentFeeder class (https://github.com/steipatr/EMAworkbench/blob/6e548ef4b64594475a8b7203aa34d9dc0f4f8bbb/ema_workbench/em_framework/ema_multiprocessing.py#L224).

Both work roughly as expected (demo files to follow).

Question: Is there an easy way to get the total number of experiments to be performed? There is nr_of_exp in evaluators.py: https://github.com/quaquel/EMAworkbench/blob/160ef047a8593a341ac9c51f7ca206c3e56c5b52/ema_workbench/em_framework/evaluators.py#L444 but it does not seem to consider the factorial/zipover/jointly options for experimental design.

Next steps:

  • add accurate number of experiments for seq and parallel evaluators.
  • add end of run stats
  • look at tqdm parameters (https://github.com/tqdm/tqdm#parameters) to see if there is something useful there. Might make sense to calculate the run stats directly in tqdm and show them with the postfix parameter.

steipatr avatar Apr 21 '21 14:04 steipatr

Progress bar demo files: pbar-demo.zip

steipatr avatar Apr 21 '21 14:04 steipatr

Currently, the logging is handled in the call method of AbstractCallback (https://github.com/quaquel/EMAworkbench/blob/2.1-inprogress/ema_workbench/em_framework/callbacks.py). Would it be possible to put the progress bar in this location?

quaquel avatar Apr 21 '21 15:04 quaquel

pbar.update() could probably go in the AbstractCallback, but the main tqdm call has to wrap around whatever is being iterated over. From my understanding, the AbstractCallback is called after every case (so as part of the iteration), so the main tqdm call would have to be higher up.

steipatr avatar Apr 21 '21 15:04 steipatr

not sure what you mean with the call having to be higher up. I guess you mean because of the with statement? This activates a context manager. A context manager is any object with an __enter__ and __exit___ dunder method. Looking at the source code for tqdm.tqdm, you can see that __enter__ does not do anything, so no need to invoke it. __exit__ does need to be called for clean up purposes, so we can hack this by calling it if self.i equals self.pbar.totals.

Below is a dirty hack that seems to work fine with running .py files (need to test it still in notebooks and jupyter lab.


class AbstractCallback(object):
    """
    Abstract base class from which different call back classes can be derived.
    Callback is responsible for storing the results of the runs.

    Parameters
    ----------
    uncs : list
            a list of the parameters over which the experiments
            are being run.
    outcomes : list
               a list of outcomes
    nr_experiments : int
                     the total number of experiments to be executed
    reporting_interval : int, optional
                         the interval at which to provide progress
                         information via logging.
    reporting_frequency: int, optional
                         the total number of progress logs


    Attributes
    ----------
    i : int
        a counter that keeps track of how many experiments have been
        saved
    reporting_interval : int,
                         the interval between progress logs

    """
    __metaclass__ = abc.ABCMeta

    i = 0

    def __init__(self, uncertainties, outcomes, levers,
                 nr_experiments, reporting_interval=None,
                 reporting_frequency=10):
        if reporting_interval is None:
            reporting_interval = max(
                1, int(round(nr_experiments / reporting_frequency)))

        self.reporting_interval = reporting_interval

        self.pbar = tqdm.tqdm(nr_experiments)

    @abc.abstractmethod
    def __call__(self, experiment, outcomes):
        """
        Method responsible for storing results. The implementation in this
        class only keeps track of how many runs have been completed and
        logging this. Any extension of AbstractCallback needs to implement
        this method. If one want to use the logging provided here, call it via
        super.

        Parameters
        ----------
        experiment: Experiment instance
        outcomes: dict
                the outcomes dict

        """
        #
        # TODO:: https://github.com/alexanderkuk/log-progress
        # can we detect whether we are running within Jupyter?
        # yes:
        # https://stackoverflow.com/questions/15411967/how-can-i-check-if-code-is-executed-in-the-ipython-notebook
        self.i += 1
        _logger.debug(str(self.i) + " cases completed")
        #
        # if self.i % self.reporting_interval == 0:
        #     _logger.info(str(self.i) + " cases completed")
        self.pbar.update()

        if self.i == self.pbar.total:
            self.pbar.__exit__()

    @abc.abstractmethod
    def get_results(self):
        """
        method for retrieving the results. Called after all experiments
        have been completed. Any extension of AbstractCallback needs to
        implement this method.
        """

quaquel avatar Apr 21 '21 16:04 quaquel

minor additional point. __exit__ simply calls close so we could call that directly as well.

the run stats info is a separate issue. But since the tqdm instance is an attribute on the callback object, this should be easy to handle at a higher level.

quaquel avatar Apr 21 '21 16:04 quaquel

the progress bar stuff now works.

this issue however started with a request for run stats, so do you still want that in addition to the tqdm based progress bar? And if so, what do you want to show in this?

quaquel avatar May 02 '21 12:05 quaquel

Wow, that is way more structured than anything I could have come up with. Mixins look useful, never heard of that.

Regarding run stats, Jason and I used:

  • number of cores used
  • number of runs completed
  • elapsed wall clock time
  • wall clock time per run ( runs/time)
  • average duration of a single run (runs/(time/cores) )

We just hardcoded this into perform_experiments because we're hacks. In EMA, I would propose extending BaseEvaluator with a report_stats method, and then calling that method after the "experiments finished" notification: https://github.com/quaquel/EMAworkbench/blob/160ef047a8593a341ac9c51f7ca206c3e56c5b52/ema_workbench/em_framework/evaluators.py#L476

Something like:

    def report_stats(self, callback):
        """send summary statistics about experiments to logger."""

        elapsed = callback.pbar.format_dict['elapsed'] 
        runs = callback.pbar.format_dict['n']    

        _logger.info(f"experiments completed: {runs}")
        _logger.info(f"elapsed wall clock time: {elapsed} s")
    
        if isinstance(self, SequentialEvaluator):
            _logger.info(f"mean run duration: {elapsed / runs} s")
        
        else:
            if isinstance(self, MultiprocessingEvaluator):
                cores = self.n_processes
                
            elif isinstance(evaluator, IpyparallelEvaluator):
                cores =  len(self.client.ids)
                
            _logger.info(f"used cores: {cores}") 
            _logger.info(f"wall clock per run: {elapsed / runs} s")
            _logger.info(f"mean run duration: {elapsed / (runs / cores)} s")
        
        return None

Above code is tested with sequential evaluation in a notebook, and multiprocessing. Would that work for you? Another option would be to use tqdm's postfix argument to put this stuff directly in the progress bar.

steipatr avatar May 03 '21 11:05 steipatr

if it is a log message, the sensible thing would be to have it printed by default in which case it can be tied to close method on ProgressBarMixin or postfix as you suggested. This might also enable us to reuse the stats collected already by the progress bar. Drawback of this is that it only works if you use progress bar which is controllable through the log_progress kwarg on perform_experiments.

The alternative would be to write a custom helper class to track relevant runtime statistics, log a default subset while having the full data available as a data frame or something. In particular when using optimisation, you might want more runtime statistics.

The mixin stuff, I learned from this book: https://effectivepython.com which is among the most useful programming books I know.

quaquel avatar May 10 '21 19:05 quaquel

I think it makes sense to re-use the stats already collected by tqdm. So I will take a look at what can be done with postfix. I don't feel comfortable with the second option yet. Effective Python really looks nice, already requested from the library.

steipatr avatar May 19 '21 08:05 steipatr