Support all iterator modes for fit/validate/test/predict

Open · carmocca opened this issue 2 years ago · 12 comments

Description & Motivation

trainer.fit only works with CombinedLoader(..., mode="max_size_cycle"|"min_size")

trainer.{validate,test,predict} only work with CombinedLoader(..., mode="sequential")

This constraint is checked in the top-level loops:

https://github.com/Lightning-AI/lightning/blob/0009cde1db1a9ab4e2f1e0a9f69a4affb59d5134/src/lightning/pytorch/loops/fit_loop.py#L351-L354
https://github.com/Lightning-AI/lightning/blob/0009cde1db1a9ab4e2f1e0a9f69a4affb59d5134/src/lightning/pytorch/loops/evaluation_loop.py#L182-L183
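
For illustration, a minimal sketch of the two situations (assuming Lightning 2.x, where CombinedLoader is importable from lightning.pytorch.utilities; the dataset sizes are made up):

from torch.utils.data import DataLoader
from lightning.pytorch.utilities import CombinedLoader

loaders = {
    "a": DataLoader(range(8), batch_size=4),
    "b": DataLoader(range(12), batch_size=4),
}

# accepted by trainer.fit: the shorter iterable is cycled ("max_size_cycle")
# or iteration stops with the shortest one ("min_size"); each batch is a
# dict of per-dataloader batches
train_loader = CombinedLoader(loaders, mode="max_size_cycle")

# accepted by trainer.{validate,test,predict}: the iterables are consumed
# one after the other, so each batch comes from a single dataloader
eval_loader = CombinedLoader(loaders, mode="sequential")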

Pitch

Have all trainer functions support all modes

TODO:

  • [ ] FitLoop
  • [x] EvaluationLoop (#17163)
  • [ ] PredictionLoop

Alternatives

Not do it

Additional context

This builds on top of https://github.com/Lightning-AI/lightning/pull/16726

cc @borda @justusschock @awaelchli

carmocca avatar Feb 21 '23 16:02 carmocca

I am migrating my code to PL 2, and it seems that getting a val-dataloader batch of the form {"key_a": batch_dataloader_a, "key_b": batch_dataloader_b} is not implemented in PL 2 yet. Here is my old code for reference:

# DataLoader comes from torch.utils.data; in PL 1.x CombinedLoader lived in
# pytorch_lightning.trainer.supporters (it is lightning.pytorch.utilities in 2.x)
from torch.utils.data import DataLoader
from pytorch_lightning.trainer.supporters import CombinedLoader

def val_dataloader(self):
    # one DataLoader per validation dataset, keyed by dataset name
    val_dataloaders = {
        key: DataLoader(
            dataset,
            batch_size=dataset.batch_size,
            shuffle=False,
            num_workers=dataset.num_workers,
            pin_memory=False,
        )
        for key, dataset in self.val_datasets.items()
    }
    # yields batches of the form {"key_a": batch_a, "key_b": batch_b}
    combined_val_loaders = CombinedLoader(val_dataloaders, "max_size_cycle")
    return combined_val_loaders
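
For context, this is the kind of validation_step the pattern above implies (a sketch; the "key_a"/"key_b" keys are illustrative), since "max_size_cycle" delivers one dict of batches per step:

def validation_step(self, batch, batch_idx):
    # one batch per dataset key, with shorter loaders cycled to match the longest
    batch_a = batch["key_a"]
    batch_b = batch["key_b"]
    ...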

mees avatar Mar 20 '23 19:03 mees

@mees I added support for that in #17163, if you want to give it a try. The PR only implements it for validation and testing.

carmocca avatar Mar 28 '23 23:03 carmocca

> @mees I added support for that in #17163, if you want to give it a try. The PR only implements it for validation and testing.

Really helpful! I hope this gets into "stable" soon, or even the next release!

bkmi avatar May 15 '23 15:05 bkmi

I really wish there were sequential support in the training loop. Right now, it's not clear how one should handle batches of potentially different sizes in training_step. We'd have to collate inside training_step and ensure the given batch size is divided by the number of dataloaders to keep gradient accumulation consistent, and so on. It gets pretty ugly. @carmocca Thank you for your work on this issue. Not to rush you, but is there any update on sequential support in the training loop? Thanks again!
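
For illustration, a rough sketch of the workaround described above (the "a"/"b" keys and the compute_loss helper are hypothetical), assuming training runs with mode="max_size_cycle" so that every batch is a dict:

def training_step(self, batch, batch_idx):
    # each key holds a full batch from one dataloader; the step must combine
    # them itself, e.g. by summing per-dataloader losses
    loss_a = self.compute_loss(batch["a"])
    loss_b = self.compute_loss(batch["b"])
    return loss_a + loss_b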

FarzanT avatar May 15 '23 17:05 FarzanT

Unfortunately, I don't have the bandwidth to work on this now. If somebody wants to try, I can help get the PR merged. You can follow the structure in the EvaluationLoop. The training hooks will need an optional dataloader_idx argument.
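
For anyone picking this up, a sketch of what that change might look like (not implemented; the signature mirrors how validation_step and test_step already accept the argument):

def training_step(self, batch, batch_idx, dataloader_idx=0):
    # with mode="sequential", batches would arrive from one dataloader at a
    # time, and dataloader_idx would identify the source loader
    ...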

carmocca avatar May 17 '23 13:05 carmocca

> @mees I added support for that in #17163, if you want to give it a try. The PR only implements it for validation and testing.

> Really helpful! I hope this gets into "stable" soon, or even the next release!

Me too! Is there any release timeline or a nightly version with this supported? I can't use Lightning without this and would really like to leverage its other features!

surya-narayanan avatar May 25 '23 21:05 surya-narayanan

Ditto! FYI for others: pulling the nightly will get you the feature: https://github.com/Lightning-AI/lightning/pull/17163

spfrommer avatar Aug 17 '23 23:08 spfrommer

Thanks! I also need this great feature.

chenhaomingbob avatar Oct 06 '23 07:10 chenhaomingbob

+1, please release this feature asap!

johnathanchiu avatar Oct 31 '23 03:10 johnathanchiu

Is this feature currently being worked on?

lukas-folle-snkeos avatar Jul 30 '24 07:07 lukas-folle-snkeos

As far as I know, nobody is currently working on it, Lukas.

carmocca avatar Jul 30 '24 13:07 carmocca