Add support for step-based validation and checkpointing

Open julian-tonita opened this issue 2 years ago • 2 comments

🚀 Feature Request

Much of the library's core functionality, including the checkpoint callback and the validation metrics, is built around epochs. It would be very useful to let people validate and save checkpoints every fixed number of steps instead of every epoch.

Motivation

For large datasets, saving only once per epoch is of little use. Even for smaller datasets, being able to run validation metrics more frequently can be very helpful for seeing how the model is performing.

Proposal

For all core functionality that is currently tied to epochs, add a mode parameter that can be either epoch or step. If step is selected, a second parameter, num_steps (or something similarly named), controls how many steps pass between runs of the given functionality (e.g. steps between validation runs). I think this should be done for all epoch-tied features, but the two most pressing are validation runs and checkpoints.
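To make the proposal concrete, here is a minimal, framework-agnostic sketch of the triggering logic. The `IntervalTrigger` class and the `fired_steps` helper are hypothetical names used only for illustration; `mode` and `num_steps` follow the names suggested above, and none of this is existing library API.

```python
from dataclasses import dataclass


@dataclass
class IntervalTrigger:
    """Decides when a periodic action (validation, checkpoint) should run.

    Hypothetical helper: `mode` and `num_steps` mirror the proposed
    parameter names and are not part of any existing library API.
    """

    mode: str = "epoch"  # "epoch" or "step"
    num_steps: int = 1   # steps between runs; only used when mode == "step"

    def __post_init__(self) -> None:
        if self.mode not in ("epoch", "step"):
            raise ValueError(f"mode must be 'epoch' or 'step', got {self.mode!r}")
        if self.num_steps < 1:
            raise ValueError("num_steps must be a positive integer")

    def should_fire(self, global_step: int, is_epoch_end: bool) -> bool:
        """Return True if the action should run after this training step.

        `global_step` counts completed training steps, starting at 1.
        """
        if self.mode == "epoch":
            return is_epoch_end
        return global_step % self.num_steps == 0


def fired_steps(trigger: IntervalTrigger, steps_per_epoch: int, num_epochs: int) -> list:
    """Simulate a training loop and collect the global steps where the trigger fires."""
    fired = []
    global_step = 0
    for _ in range(num_epochs):
        for batch_idx in range(steps_per_epoch):
            global_step += 1
            is_epoch_end = batch_idx == steps_per_epoch - 1
            if trigger.should_fire(global_step, is_epoch_end):
                fired.append(global_step)
    return fired
```

With 5 steps per epoch and 2 epochs, `IntervalTrigger(mode="epoch")` fires at the end of each epoch, while `IntervalTrigger(mode="step", num_steps=4)` fires every fourth global step regardless of epoch boundaries. A checkpoint or validation callback could consult such a trigger at the end of every batch.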

Alternatives

It is likely possible to add this functionality with custom callbacks, but that seems like a lot of work, especially given how common this request likely is.

Additional context

Checklist

  • [x] feature proposal description
  • [x] motivation
  • [x] extra proposal context / proposal alternatives review

julian-tonita, Jun 07 '22 15:06

Hi! Thank you for your contribution! Please re-check all issue template checklists - unfilled issues would be closed automatically. And do not forget to join our slack for collaboration.

github-actions[bot], Jun 07 '22 15:06

Hi,

Is it possible to use a Sampler with the train Dataset (DataLoader(dataset=dataset, sampler=sampler)) so that each epoch contains the pre-defined number of batches you want?
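A rough sketch of that idea, assuming it is acceptable to draw indices with replacement: the sampler below yields exactly `num_batches * batch_size` indices per pass, so an "epoch" becomes a fixed number of steps and the existing per-epoch validation and checkpointing run at that interval. `FixedLengthSampler` is a hypothetical name, not something shipped by PyTorch or Catalyst.

```python
import random


class FixedLengthSampler:
    """Yields a fixed number of dataset indices per pass.

    Hypothetical helper for illustration. Indices are drawn uniformly with
    replacement, so one pass is not a full sweep over the dataset.
    """

    def __init__(self, dataset_size, num_batches, batch_size, seed=None):
        if dataset_size < 1:
            raise ValueError("dataset_size must be positive")
        self.dataset_size = dataset_size
        self.num_samples = num_batches * batch_size
        self._rng = random.Random(seed)

    def __iter__(self):
        # Each pass over the sampler corresponds to one shortened "epoch".
        for _ in range(self.num_samples):
            yield self._rng.randrange(self.dataset_size)

    def __len__(self):
        return self.num_samples


# Example: 3 batches of 4 samples per "epoch" from a dataset of 1000 items.
sampler = FixedLengthSampler(dataset_size=1000, num_batches=3, batch_size=4, seed=0)
indices = list(sampler)
```

Since the DataLoader `sampler` argument accepts any iterable that implements `__len__`, this object should be usable as `DataLoader(dataset=dataset, sampler=sampler, batch_size=4)`. PyTorch's built-in `RandomSampler` with `replacement=True` and `num_samples` set should give a similar effect. Note that this redefines what an epoch means rather than adding true step-based triggers.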

Scitator, Jun 10 '22 18:06

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Aug 12 '22 05:08