Improved Printouts + Algorithm Refactor
In #50 I designed a Printer class that managed verbose printouts during an algorithm's execution. I realized I was able to remove a lot of duplicated code if this Printer class also kept track of certain algorithm state variables, such as the index of the best solution among others.
Ultimately the Printer class became responsible for two things: 1. managing verbose printouts and 2. tracking certain aspects of the algorithm state. A class with two different purposes is bad design, and suggests it should be split into two different classes, each with a singular purpose. This PR does that.
Instead of a Printer class, this PR introduces a Logger class that solely handles printing at different verbosity levels.
To remove the duplicated code among the different algorithms, I now have them subclass from a generic Iterative class which is itself a subclass of Algorithm. An Iterative algorithm is any algorithm that proceeds in an iterative fashion to produce policy iterates (all our algorithms behave this way).
Each subclass of Iterative need only to define how to move from one iterate to the next, while the superclass handles logging and the main iteration loop.
This does not change the experience from the user-end. To create a custom algorithm, you still subclass Algorithm.
All the algorithms except MFOMO have been reimplemented as subclasses of Iterative. MFOMO can be ported as well, but will require a little cleanup.
I will share some numerics to show the new implementations don't differ from the old implementations. But in the meantime, the PR is available for review.
I've confirmed that the changes in this PR do not change the output of the algorithms. Below are charts for each algorithm, across a variety of environments, that plot the exploitability on this branch and on main. The blue lines are entirely hidden under the red lines.
And here's a screenshot of the new printouts.
Helpful comments all around. They've been addressed.
@junziz @AnranHu A friendly ping. I'm just waiting for approval on the changes to merge.
Everything looks good here except for two very minor items: 1) For MFOMO, let's display the full optimizer dict including config, not just name; 2) let's just make INFO_PANEL_WIDTH and MAX_TABLE_LENGTH configurable so that it would be more convenient in case there is any need to adjust them? Thanks! @matteosantama
The table now displays the full optimizer dict.
I'd really prefer to not support configurable INFO_PANEL_WIDTH and MAX_TABLE_LENGTH values. Allowing them to be configurable would force us to handle formatting across a range of widths (line-wrapping vs. truncation, etc.), and that is no small task. I also don't think "someday somebody might want to do this" is a strong enough reason. If we encounter a convincing, practical reason for needing configurable widths, we can revisit this.
I'd really prefer to not support configurable INFO_PANEL_WIDTH and MAX_TABLE_LENGTH values. Allowing them to be configurable would force us to handle formatting across a range of widths (line-wrapping vs. truncation, etc.), and that is no small task. I also don't think "someday somebody might want to do this" is a strong enough reason. If we encounter a convincing, practical reason for needing configurable widths, we can revisit this.
I'm not sure if I missed anything, but why would there be need to handle line-wrapping vs. truncation here? I think rich.panel would do automatic line wrapping, and MAX_TABLE_LENGTH is merely splitting the tables over rows so also don't see why there would be line-wrapping vs. truncation, etc. And we can just live with these (already pretty nice) behavior even if make the two parameters configurable, and nothing to handle in terms of formatting across a range of widths but just let rich do the line-wrapping and resizing automatically as is, no?
But otoh after playing a bit with rich print/table/panel I realized that it automatically does resizing and line wrapping already, so there wouldn't be any truncation anyway. Was mainly worried e.g., with large iteration numbers, large algorithm hyperparameters or large environment dimensions, there would still be some truncation happening. But since this is not the case then there is no need to configure them.