
Added a new `without_checkpoint_model: bool` param to `Trainer.train()`

Open · stevemadere opened this pull request 11 months ago · 3 comments

Added a new `without_checkpoint_model: bool` option to the `Trainer.train()` params.

This prevents the Trainer from re-loading the model supplied to its constructor during a `train(resume_from_checkpoint=...)` operation, so that models the Trainer does not know how to load itself can still be resumed from a checkpoint.
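For illustration, a minimal sketch of how the new flag would be used. `MyCustomModel` is a hypothetical stand-in for a model the Trainer cannot re-instantiate from checkpoint files on its own, the checkpoint path assumes a prior training run, and `without_checkpoint_model` is the parameter proposed by this PR, not part of a released transformers API:

```python
import torch
from transformers import Trainer, TrainingArguments


class MyCustomModel(torch.nn.Module):
    """Hypothetical model built by code the Trainer knows nothing about."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 2)

    def forward(self, input_ids=None, labels=None, **kwargs):
        logits = self.linear(input_ids)
        loss = None
        if labels is not None:
            loss = torch.nn.functional.cross_entropy(logits, labels)
        return {"loss": loss, "logits": logits}


# Tiny dummy dataset so the sketch runs with the default data collator.
train_dataset = [
    {"input_ids": torch.randn(8), "labels": torch.tensor(0)} for _ in range(16)
]

model = MyCustomModel()  # loaded/constructed by custom code, not by the Trainer
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,
)

# With the proposed flag, train() keeps the model instance passed to the
# constructor instead of re-loading it from the checkpoint directory; the
# rest of the training state (optimizer, scheduler, RNG) is restored as usual.
trainer.train(
    resume_from_checkpoint="out/checkpoint-500",
    without_checkpoint_model=True,
)
```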

Fixes #29740

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [x] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

@muellerzr, @pacman100, @amyeroberts

stevemadere · Mar 19, 2024

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

hmmm, I'm not sure about this. This seems like a workaround that avoids addressing a problem with the trainer.

~~Could you give an example of a model that's saved out with trainer but then can't resume from checkpoint? Ideally as a small script to reproduce~~ Never mind - I just saw the details in the issue :)

amyeroberts · Mar 20, 2024