
Tensorboard

Open kermitt2 opened this issue 2 years ago • 3 comments

This PR sets up DeLFT to use TensorBoard: logs, callbacks...

The idea is to cover the interesting metrics so that everything can be visualized in TensorBoard.

How to use: nothing special is needed. Start a training, then launch:

> tensorboard --logdir logs/summaries

(tensorboard is already installed)

Then open http://localhost:6006/

Screenshot from 2022-05-02 14-59-03

TODO:

  • add more metrics
  • see non-scalar views

kermitt2 avatar May 02 '22 13:05 kermitt2

TensorBoard generates a ridiculous amount of event logs... more than 1.2GB of logs per epoch, just for a basic transformer model. For instance, after 4 epochs:

$ du -sh logs/summaries/20220502-160051/train
5.1G	logs/summaries/20220502-160051/train

Given that effect (think of a 10-fold cross-validation training), the tensorflow callbacks need to be defined at the application level, so that using them is a user choice and not something the core library does systematically.
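To make the "user choice" idea concrete, here is a minimal sketch of an application-level helper. The names `build_callbacks`, `use_tensorboard` and `log_root` are hypothetical and not part of DeLFT; only `tf.keras.callbacks.TensorBoard` and its arguments come from TensorFlow. The conservative settings are an assumption about what keeps event files small; I have not measured their effect on the sizes reported above.

```python
import os
from datetime import datetime


def build_callbacks(use_tensorboard=False, log_root="logs/summaries"):
    """Hypothetical application-level helper (not DeLFT's API).

    Returns an empty list unless the user explicitly opts in, so the
    core training code never writes TensorBoard logs on its own.
    """
    if not use_tensorboard:
        return []

    # Imported lazily so the opt-out path does not need TensorFlow at all.
    import tensorflow as tf

    # One sub-directory per run, named like 20220502-160051.
    run_dir = os.path.join(log_root, datetime.now().strftime("%Y%m%d-%H%M%S"))

    return [
        tf.keras.callbacks.TensorBoard(
            log_dir=run_dir,
            update_freq="epoch",  # write scalars once per epoch, not per batch
            histogram_freq=0,     # no weight histograms
            write_graph=False,    # do not serialize the model graph
            profile_batch=0,      # no profiling traces
        )
    ]
```

The application script would then pass the result to training, for example `model.fit(..., callbacks=build_callbacks(use_tensorboard=True))`, with the flag coming from a command-line option.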

kermitt2 avatar May 02 '22 14:05 kermitt2

Looks good.

Indeed, this tool can fill up the disk space faster than docker... 🎉

lfoppiano avatar May 10 '22 02:05 lfoppiano

FYI, one run of 10-fold cross-validation weighs ... 82GB 😂

(base) [lfoppian0@sakura02 summaries]$ ls -lh
total 4.0K
drwxr-sr-x 4 lfoppian0 tdm 4.0K May 10 12:17 20220510-121634
(base) [lfoppian0@sakura02 summaries]$ du -hs
82G	.
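For checking this from a script rather than with `du`, a rough standard-library sketch is below. The helper names `dir_size_bytes` and `human` are made up for illustration. It sums file sizes rather than allocated disk blocks, so the figures can differ a little from what `du` reports.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def human(num_bytes):
    """Format a byte count with binary units, roughly like `du -h`."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


if __name__ == "__main__":
    summaries = "logs/summaries"  # same directory as passed to --logdir
    if os.path.isdir(summaries):
        # Print one line per run directory, e.g. 20220510-121634.
        for run in sorted(os.listdir(summaries)):
            run_path = os.path.join(summaries, run)
            if os.path.isdir(run_path):
                print(human(dir_size_bytes(run_path)), run_path, sep="\t")
```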

lfoppiano avatar May 16 '22 06:05 lfoppiano