stable-baselines3

Added missing metrics when logging on tensorboard (#1298)


Co-authored-by: Riccardo Sepe [email protected] Co-authored-by: Francesco Scalera [email protected]


Description

With this change, both the hparam_dict and the metric_dict are stored in TensorBoard.

Motivation and Context

The TensorBoard library allows storing both a hparam_dict and a metric_dict; this PR changes the SB3 implementation so as to be compatible with TensorBoard. The problem is described in issue #1298. Closes #1298.
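
For reference, here is a minimal sketch of the behaviour being matched, using PyTorch's own SummaryWriter.add_hparams (the log dir, hyperparameter names, and values below are illustrative, not part of this PR):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/hparams_demo")  # illustrative log dir

hparam_dict = {"learning_rate": 3e-4, "gamma": 0.99}
metric_dict = {"train/value_loss": 0.0, "custom_metric": 1.0}

# add_hparams writes the hyperparameter summaries *and* logs each metric
# value as a scalar, so both dicts end up visible in TensorBoard.
writer.add_hparams(hparam_dict, metric_dict)
writer.close()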

  • [x] I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation (update in the documentation)

Checklist

  • [ ] I've read the CONTRIBUTION guide (required)
  • [ ] I have updated the changelog accordingly (required).
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the tests accordingly (required for a bug fix or a new feature).
  • [ ] I have updated the documentation accordingly.
  • [ ] I have opened an associated PR on the SB3-Contrib repository (if necessary)
  • [ ] I have opened an associated PR on the RL-Zoo3 repository (if necessary)
  • [ ] I have reformatted the code using make format (required)
  • [ ] I have checked the codestyle using make check-codestyle and make lint (required)
  • [ ] I have ensured make pytest and make type both pass. (required)
  • [ ] I have checked that the documentation builds using make doc (required)

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

rogierz avatar Jan 26 '23 12:01 rogierz

@timothe-chaumont could you review/test this one?

araffin avatar Feb 15 '23 12:02 araffin

@timothe-chaumont could you review/test this one?

Yep I'll have a look at it by the end of the week :)

timothe-chaumont avatar Feb 16 '23 00:02 timothe-chaumont

Thank you @rogierz.

The changes proposed in this PR follow the implementation in the PyTorch TensorBoard writer, and if added to SB3 they would have the following impact:

  • When custom metrics are passed through metric_dict and are not logged from anywhere else, they will also be displayed in the TIME SERIES and SCALARS tabs (as a graph with a single point, see the screenshot below).

    [Screenshot 2023-02-20 at 00:39:57: a custom metric displayed as a single-point graph in the SCALARS tab]
  • For metrics that are logged separately (e.g. rollout/ep_len_mean) and mentioned in HParam's metric_dict, the value given will be added to the metric data (i.e. adding a fake data point; see the callback sketch after this list).

    For example, in the screenshot below, the value 0 in dark blue has been logged when calling logger.record("hparams", HParam(...)) and does not correspond to a real value.

    [Screenshot 2023-02-19 at 23:49:22: a fake 0 data point (dark blue) added to the rollout/ep_len_mean curve]
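
To make these side-effects concrete, here is a sketch of a typical callback that logs an HParam object (the pattern follows the SB3 docs; the hyperparameter names and values are illustrative):

from stable_baselines3.common.callbacks import BaseCallback
from stable_baselines3.common.logger import HParam

class HParamCallback(BaseCallback):
    def _on_training_start(self) -> None:
        hparam_dict = {
            "algorithm": self.model.__class__.__name__,
            "gamma": self.model.gamma,
        }
        # With this PR, these placeholder zeros are also written as scalar
        # points under each metric tag (the "fake data point" above).
        metric_dict = {
            "rollout/ep_len_mean": 0,
            "train/value_loss": 0.0,
        }
        self.logger.record(
            "hparams",
            HParam(hparam_dict, metric_dict),
            exclude=("stdout", "log", "json", "csv"),
        )

    def _on_step(self) -> bool:
        return True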

Given these two side-effects, we can either:

  1. Use @rogierz's proposition and add a warning in the documentation,
  2. Update the documentation to make it clearer, but don't modify the logger code: e.g. add a warning saying that custom metrics must be logged explicitly, and add self.logger.record("custom_metric", 1.0) to the example (in _on_training_start or _on_step).
  3. Modify the HParam class to ask for metric names only (without values):
HParam(hparam_dict, metric_list=['train/value_loss', 'custom_metric'])

hence forcing users to log custom metrics separately with:

logger.record("custom_metric", 1.0)

timothe-chaumont avatar Feb 20 '23 01:02 timothe-chaumont

@timothe-chaumont thanks for reviewing.

Modify the HParam class to ask for metric names only (without values):

Yes, that looks like a better fix, but HParam is asking for a non-empty metric dict (according to the SB3 docstring), so would you assign zero values to all of them? (I'm also wondering how a metric can be non-zero/have a meaningful value at the very start of training.)

araffin avatar Mar 02 '23 12:03 araffin

Yes, that looks like a better fix, but HParam is asking for a non-empty metric dict (according to the SB3 docstring), so would you assign zero values to all of them?

Yes, I think the value given here has no impact on the logged data, so we can use 0.

(I'm also wondering how a metric can be non-zero/have a meaningful value at the very start of training.)

I guess some metrics (e.g. ep_len_mean) wouldn't be exactly zero at the very start of training, but I agree that a meaningful use of this value must be rare. As it might be confusing, we might as well remove it :)

timothe-chaumont avatar Mar 24 '23 21:03 timothe-chaumont

@rogierz do you want to add this change:

Modify the HParam class to ask for metric names only (without values)

HParam(hparam_dict, metric_list=['train/value_loss', 'custom_metric'])

Or should I do it?

timothe-chaumont avatar Mar 24 '23 21:03 timothe-chaumont

I think I won’t be able to do it in the near future

rogierz avatar Mar 27 '23 15:03 rogierz