Improve the logging mechanism during training

Currently, each policy or learner allocates temporary memory to record intermediate data, and an extra hook is needed to collect it. There are at least three problems:

  1. Each policy or learner must define extra fields solely to cache this intermediate data.
  2. In some policies the intermediate data is updated several times within a single update step (PPO, for example), so hooks only ever see the statistics from the last inner iteration.
  3. There is no way to filter the data early, before it is computed and recorded.

To improve this, the idea is simple: leverage Julia's built-in logging system together with some utilities from LoggingExtras.jl.

Basically, we can replace all the existing logging lines with `@debug "index/name" x=y ... _group=DEFAULT_GROUP` and provide a filter (see the concepts in LoggingExtras.jl) to extract all log records whose `_group` is `DEFAULT_GROUP`. Then any log sink can be used to write them out. A minimal sketch follows.
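Something like this, where `DEFAULT_GROUP` and the `ppo/loss` record are illustrative names rather than existing package API:

```julia
using Logging, LoggingExtras

# Illustrative group tag shared by all training-statistics records.
const DEFAULT_GROUP = :rl_training

# Inside a policy/learner update, replace the cached fields and the extra
# hook with a plain log statement. `_group` tags the record for routing.
function report_loss(actor_loss, critic_loss)
    @debug "ppo/loss" actor_loss critic_loss _group = DEFAULT_GROUP
end

# Early filtering: the predicate runs before the log record is materialized,
# so records no sink cares about are dropped cheaply.
is_training_log(log) = log.group === DEFAULT_GROUP
training_sink(sink) = EarlyFilteredLogger(is_training_log, sink)
```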

Note that:

  1. `@debug` is preferred here so that these statistics are not printed by the default logger.
  2. `_group` is required so that these records can easily be distinguished from other logs.
  3. A default filter must be provided (maybe simply reuse `EarlyFilteredLogger`?).
  4. Wandb.jl and TensorBoardLogger.jl should be supported out of the box as sinks; see the sketch below. (TODO: add some examples in docs.)
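For instance, wiring TensorBoardLogger.jl up as a sink might look like the following sketch; `run(agent, env, stop_condition, hook)` stands in for whatever the actual training entry point is, and `DEFAULT_GROUP` is the illustrative tag from above:

```julia
using Logging, LoggingExtras, TensorBoardLogger

# TensorBoard sink that accepts debug-level records.
tb = TBLogger("logs/experiment"; min_level = Logging.Debug)

# Route only the tagged training statistics to TensorBoard, while everything
# else keeps flowing to the current global logger.
stats = EarlyFilteredLogger(log -> log.group === DEFAULT_GROUP, tb)

with_logger(TeeLogger(global_logger(), stats)) do
    run(agent, env, stop_condition, hook)  # hypothetical training call
end
```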

findmyway · Jul 06 '22 16:07