
Reducing the overfit training time by avoiding repetition of the validation step

Open • RR-28023 opened this issue 1 year ago • 1 comment

Hi! Thanks for sharing this amazing work. This is a minor issue, but fixing it reduced the overfit training time significantly in my case:

Currently, the validation step is executed 64 times at every val_check_interval (with the same data each time, generating 64 visualizations, 63 of which are overwritten). This is because num_workers is set to 64 for the dummy validation DataLoader.

This can be avoided by setting num_workers=1 in the val_dataloader method of DataModuleOverfit. Alternatively, limit_val_batches=1 can be passed to the Trainer in overfit.py, but this may be sub-optimal because I think it would still launch the 64 worker subprocesses.
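
To illustrate the duplication, here is a minimal sketch. It assumes the dummy validation dataset is an iterable-style dataset, which the thread does not confirm; `DummyValDataset` and `count_val_batches` are hypothetical names, not code from the repository. In PyTorch, each DataLoader worker receives its own copy of an `IterableDataset`, so a dataset that yields one item produces one batch per worker unless the copies are sharded.

```python
from torch.utils.data import DataLoader, IterableDataset


class DummyValDataset(IterableDataset):
    """Hypothetical stand-in for the dummy validation dataset.

    Yields a single placeholder item per pass over the data.
    """

    def __iter__(self):
        yield 0


def count_val_batches(num_workers: int) -> int:
    """Count how many batches one pass over the loader produces."""
    loader = DataLoader(DummyValDataset(), batch_size=1, num_workers=num_workers)
    return sum(1 for _ in loader)
```

With `num_workers=0` the dataset is iterated once in the main process, so a single validation batch is produced. With `num_workers=N` and no per-worker sharding, each of the N workers yields its own copy of the item. On platforms that spawn worker processes, calls with `num_workers > 0` should sit under an `if __name__ == "__main__":` guard.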

RR-28023, Aug 01 '24 07:08

Is this an issue with the code at head? It seems the number of workers is hard-coded to 0:

https://github.com/dcharatan/flowmap/blob/2be1b9ef8a22513da99b1d7611362f33f9f6481a/flowmap/dataset/data_module_overfit.py#L29

dcharatan, Aug 15 '24 21:08