stable-diffusion-webui

Add support for Tensorboard (training)

Open Melanpan opened this issue 2 years ago • 2 comments

Describe what this pull request is trying to achieve.

This allows Tensorboard to be used to track the quality of a training run. It is essentially an upgraded version of the CSV writer: Tensorboard provides the visualization and tooling needed for machine learning experimentation.

Additional notes and description of your changes

Three additional options have been added under Options > Training:

  • Enable tensorboard logging.
  • Save generated images within tensorboard.
  • How often, in seconds, to flush the pending tensorboard events and summaries to disk.

The following data is logged to tensorboard:

  • The loss and learning rate per epoch
  • The loss and learning rate over the course of the whole training session
  • Validation images per epoch
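
The options and logged values listed above could be wired up roughly as in the sketch below. It is not the code from this pull request: the helper names (`make_writer`, `log_step`, `log_image`) and the tag names are hypothetical, chosen only for illustration. It assumes PyTorch and the `tensorboard` package are importable, and uses only `SummaryWriter`, `add_scalar`, and `add_image` from `torch.utils.tensorboard`.

```python
import os


def make_writer(project_dir, flush_secs=120):
    """Create a SummaryWriter that logs to <project_dir>/tensorboard."""
    # Imported lazily so the logging helpers below work with any writer-like object.
    from torch.utils.tensorboard import SummaryWriter

    log_dir = os.path.join(project_dir, "tensorboard")
    # flush_secs controls how often pending events are flushed to disk.
    return SummaryWriter(log_dir=log_dir, flush_secs=flush_secs)


def log_step(writer, loss, learn_rate, global_step, epoch_num, epoch_step):
    """Log loss and learning rate for the whole session and for the current epoch."""
    # Whole training session: one curve indexed by the global step.
    writer.add_scalar("loss/train", loss, global_step)
    writer.add_scalar("learn_rate/train", learn_rate, global_step)
    # Per epoch: one curve per epoch, indexed by the step within that epoch.
    # (Hypothetical tag names; the PR may use different ones.)
    writer.add_scalar(f"loss/train/epoch_{epoch_num}", loss, epoch_step)
    writer.add_scalar(f"learn_rate/train/epoch_{epoch_num}", learn_rate, epoch_step)


def log_image(writer, image_chw, global_step, epoch_num, save_images=True):
    """Store a validation image (channels-first tensor or array) if enabled."""
    if save_images:
        writer.add_image(f"validation/epoch_{epoch_num}", image_chw, global_step)


# Example usage (requires PyTorch and tensorboard):
# writer = make_writer("my_project", flush_secs=30)
# log_step(writer, loss=0.12, learn_rate=5e-3, global_step=100, epoch_num=2, epoch_step=10)
# writer.close()
```

Keeping the writer as a parameter means the "enable tensorboard logging" option can be honored simply by not creating a writer and skipping these calls.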

The events are saved to a tensorboard folder within the training project's folder. No extra packages are required, as pytorch supports tensorboard out of the box. One small change I made that is not directly related to tensorboard was fixing a typo in the code (ititial_step > intitial_step).

To view a training project with tensorboard (provided it is installed: pip install tensorboard), go to the project's folder and run tensorboard --logdir=tensorboard, then visit http://localhost:6006 (tensorboard's default port; the exact URL is printed on startup) to view the board.

Melanpan, Oct 20 '22 21:10

Can't wait to see this land, much better solution than rolling our own summaries.

dfaker, Oct 24 '22 01:10

Not sure if this will ever make it in, but if it does, you may want to take a look at my project over here, where I figured out how to take the logged data from tensorboard and automagically generate graphs using Pandas, then output them along with the generated images.

https://github.com/d8ahazard/sd_dreambooth_extension/blob/main/dreambooth/utils.py#L321

d8ahazard, Jan 06 '23 18:01

What types of loss are used to measure results? Also, can I specify the loss type when I train a model?

robinoud, Mar 27 '23 15:03