Tensorboard
This PR sets up DeLFT to use TensorBoard: logs, callbacks, etc.
The idea is to cover the interesting metrics so that everything can be visualized in TensorBoard.
How to use: nothing special, just start a training and launch:
> tensorboard --logdir logs/summaries
(the tensorboard package is already installed)
Then open http://localhost:6006/
TODO:
- add more metrics
- see non-scalar views
TensorBoard generates a ridiculous amount of event logs: more than 1.2GB of logs per epoch for just a basic transformer model. For instance, after 4 epochs:
$ du -sh logs/summaries/20220502-160051/train
5.1G logs/summaries/20220502-160051/train
Given this effect (amplified further with, e.g., a 10-fold cross-validation training), the TensorBoard callbacks need to be defined at the application level, so that enabling them is a user choice rather than something the core library does systematically.
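One way to make this opt-in at the application level is to build the callback list conditionally, importing TensorFlow's TensorBoard callback lazily and turning down the options that inflate the event files (histogram and graph writing). A sketch under these assumptions; build_callbacks is a hypothetical helper, not DeLFT's actual API:

```python
from datetime import datetime
from pathlib import Path

def build_callbacks(use_tensorboard=False, log_root="logs/summaries"):
    """Hypothetical application-level helper: TensorBoard logging is
    only enabled when the user explicitly asks for it."""
    callbacks = []
    if use_tensorboard:
        import tensorflow as tf  # imported lazily, only when requested
        run_dir = Path(log_root) / datetime.now().strftime("%Y%m%d-%H%M%S")
        callbacks.append(
            tf.keras.callbacks.TensorBoard(
                log_dir=str(run_dir),
                histogram_freq=0,     # skip weight histograms (large)
                write_graph=False,    # skip graph dump (large)
                update_freq="epoch",  # log scalars once per epoch
            )
        )
    return callbacks
```

The application (or a CLI flag) then passes the result to model.fit(..., callbacks=build_callbacks(use_tensorboard=...)), and the core library stays TensorBoard-free by default.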
Looks good.
Indeed, this tool can fill up the disk space faster than docker... 🎉
FYI, one run of 10-fold cross-validation weighs... 82GB 😂
(base) [lfoppian0@sakura02 summaries]$ ls -lh
total 4.0K
drwxr-sr-x 4 lfoppian0 tdm 4.0K May 10 12:17 20220510-121634
(base) [lfoppian0@sakura02 summaries]$ du -hs
82G .