speechbrain
Fix tensorboard logger
Our current TensorBoard logger is not DDP compliant, and logging cannot be resumed if the run crashes or if we restart from a different epoch. This PR fixes both problems.
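For readers unfamiliar with the two issues, here is a minimal sketch of the general idea, not the code in this PR. `RankZeroLogger` is a hypothetical name, and reading the `RANK` environment variable assumes a `torchrun`-style launcher. Only the main process writes events, and the step counter is exposed through `state_dict` / `load_state_dict` so a checkpoint can restore it.

```python
import os


class RankZeroLogger:
    """Hypothetical wrapper: writes on rank 0 only and keeps a resumable step."""

    def __init__(self, writer, rank=None):
        # `writer` is any object with an add_scalar(tag, value, step) method,
        # for example torch.utils.tensorboard.SummaryWriter.
        self.writer = writer
        # Assumption: the launcher exports RANK; default to 0 otherwise.
        self.rank = int(os.environ.get("RANK", "0")) if rank is None else rank
        self.global_step = 0

    def log_scalar(self, tag, value):
        # Every process advances the counter so the saved state is identical
        # across ranks, but only rank 0 touches the event file.
        if self.rank == 0:
            self.writer.add_scalar(tag, value, self.global_step)
        self.global_step += 1

    def state_dict(self):
        # Saved alongside the model checkpoint.
        return {"global_step": self.global_step}

    def load_state_dict(self, state):
        # Restoring the counter lets curves continue from the resumed epoch
        # instead of restarting at step 0.
        self.global_step = int(state["global_step"])
```

After a restart, calling `load_state_dict` with the saved dictionary before the first `log_scalar` makes new points continue from the saved step.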
@Gastron, may I ask you to review this PR? It is linked to the checkpointing of one of the loggers. However, I don't really know how to test all the impacted recipes, as I don't have the data. I think @mravanelli has some? Could you please try it on UrbanSound or Voicebank only? The changes are the same for all recipes, so if it fails on one, it will fail on all of them.
@TParcollet, can we merge it, or is it better to do another review here?
This changes quite a lot of recipes. It needs to be tested again on all of them (I did mine, but it must be reviewed).
@anautsch, could you do it when you have time? It would be great to include this fix in the new minor version.
Works with an extra dependency: protobuf==3.20.1
protobuf is Google's library for serializing and deserializing data structures (in use at Google since 2008; open source). Going through the logs of the crashed runs, the error messages explicitly asked to move to a 3.20.x protobuf version or earlier.
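A quick check along these lines could confirm that an environment satisfies that constraint. `protobuf_is_compatible` is a hypothetical helper, not part of SpeechBrain or protobuf, and the 3.20 ceiling is taken from the crash logs mentioned above.

```python
def protobuf_is_compatible(version, ceiling=(3, 20)):
    """Return True if `version` (e.g. "3.20.1") is at or below major.minor `ceiling`."""
    # Compare only the major and minor components; the patch level is ignored.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) <= ceiling
```

The installed version string can be obtained with `importlib.metadata.version("protobuf")` and passed to the helper.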
note: when heading for v0.6 we have a dependency & extra_dependency 'theme park' ahead of us...