lightning-flash
[RFC] Deprecate the `unfreeze_milestones` finetuning strategy?
Motivation
The unfreeze_milestones finetuning strategy is confusing:
- how is the layer number interpreted? Does the count include e.g. batch norm and non-linearity layers? Not documented
- what's the use case? Not aware of any time this would be recommended (also not documented)
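
To make the first question concrete, here is a minimal sketch of why the counting convention matters. It is not taken from the Flash implementation: the `BACKBONE` list and the `last_n_layers` helper are hypothetical, and the two conventions shown are only candidate interpretations of "the last N layers".

```python
# Hypothetical flattened backbone as (name, kind) pairs. This is an
# illustration only and does not reflect how Flash enumerates layers.
BACKBONE = [
    ("conv1", "conv"), ("bn1", "batch_norm"), ("relu1", "activation"),
    ("conv2", "conv"), ("bn2", "batch_norm"), ("relu2", "activation"),
]


def last_n_layers(layers, n, count_kinds=None):
    """Return the names of the last `n` layers.

    If `count_kinds` is None, every layer counts towards `n`; otherwise only
    layers whose kind is in `count_kinds` are counted.
    """
    counted = [
        name for name, kind in layers
        if count_kinds is None or kind in count_kinds
    ]
    return counted[-n:] if n > 0 else []


# Convention A: batch norm and activations count as layers.
print(last_n_layers(BACKBONE, 2))            # ['bn2', 'relu2']
# Convention B: only convolutional layers count.
print(last_n_layers(BACKBONE, 2, {"conv"}))  # ['conv1', 'conv2']
```

With the same "number of layers" setting, convention A would unfreeze no convolution weights at all, while convention B would unfreeze both, which is why the docs need to say which one applies.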
Alternatives
At least document the answers to the above questions if there are any.
Hello @ethanwharris ,
I think the UnfreezeMilestones strategy is somewhat close to the Gradual Unfreezing idea from the paper Universal Language Model Fine-tuning for Text Classification by Jeremy Howard and Sebastian Ruder.
We might need to change how it works or add more arguments so users can customize it, and it definitely needs more documentation.
And we can also change the name to GradualUnfreezing if that is okay.
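
For reference, a rough sketch of the schedule that paper describes: unfreeze the last layer group first, then one more group per epoch until everything is trainable. The function name `trainable_groups` and the group names are hypothetical, and this is not how the current strategy is implemented.

```python
def trainable_groups(groups, epoch):
    """Return the layer groups that are trainable at `epoch` (0-based).

    `groups` is ordered from input to output. Epoch 0 trains only the last
    group; each later epoch unfreezes the next lower group until all groups
    are trainable.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    n_unfrozen = min(epoch + 1, len(groups))
    return groups[len(groups) - n_unfrozen:]


# Hypothetical layer groups for illustration.
groups = ["embedding", "block1", "block2", "head"]
for epoch in range(5):
    print(epoch, trainable_groups(groups, epoch))
```

In a real callback the returned groups would be the ones whose parameters get `requires_grad = True` at the start of that epoch; how groups are defined is exactly the part that would need documenting.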
Hey @karthikrangasai, yes that could work. If we re-implement that, then we can change the name and cite their paper in the code / docs so people can read about why they may want to do it 😃
... plus documenting the usage of course
Great. Sounds good. I will try to get some work done on this then.
Hi, my 2 cents on this:
- As part of this issue, let's only document this further, as @ethanwharris rightly pointed out. We definitely need to show the usage and explain a little about how it works. If we are sure that an exact reference exists, it would be great to cite it.
- I would prefer not to modify the current strategy as long as it works and does its job (unfreezing layers). But if there is some motivation, please share a reference implementation here (if any library already implements it), and we can take a quick look and go ahead if all sounds good.
Hope it sounds good! :)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@ethanwharris let's do it... :otter: