
[RFC] Deprecate the `unfreeze_milestones` finetuning strategy?

Open ethanwharris opened this issue 3 years ago • 6 comments

Motivation

The unfreeze_milestones finetuning strategy is confusing:

  • how is the layer number interpreted? Does this include e.g. batch norm and non-linearity layers? Not documented
  • what's the use case? Not aware of any time this would be recommended (also not documented)

Alternatives

At least document the answers to the above questions if there are any.

ethanwharris avatar Mar 29 '22 18:03 ethanwharris

Hello @ethanwharris ,

I think the UnfreezeMilestones strategy is somewhat close to the Gradual Unfreezing idea from the paper "Universal Language Model Fine-tuning for Text Classification" by Jeremy Howard and Sebastian Ruder.

We might need to change how it works or add more arguments for users to customize it, and it definitely needs more documentation.

And we can also change the name to GradualUnfreezing if that is okay.
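
For reference, a minimal sketch of what gradual unfreezing means in that paper: the last layer group is unfrozen first, and one more group (moving towards the input) is unfrozen each epoch until everything is trainable. The helper name `gradual_unfreeze_schedule` and the idea of "layer groups" indexed from input to output are assumptions for illustration only. This is not the current Flash API and says nothing about how `unfreeze_milestones` counts layers today.

```python
from typing import List


def gradual_unfreeze_schedule(num_groups: int, num_epochs: int) -> List[List[int]]:
    """Return, for each epoch, the indices of the trainable layer groups.

    Hypothetical helper: group 0 is closest to the input and group
    num_groups - 1 is closest to the output. One additional group is
    unfrozen per epoch, starting from the output side.
    """
    schedule = []
    for epoch in range(num_epochs):
        # Never unfreeze more groups than exist.
        num_unfrozen = min(epoch + 1, num_groups)
        schedule.append(list(range(num_groups - num_unfrozen, num_groups)))
    return schedule


if __name__ == "__main__":
    for epoch, groups in enumerate(gradual_unfreeze_schedule(num_groups=4, num_epochs=5)):
        print(f"epoch {epoch}: trainable groups {groups}")
```

In a real model each "group" would have to be defined explicitly (for example, whether batch norm layers count as their own group), which is exactly the part that needs documenting.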

karthikrangasai avatar Mar 30 '22 13:03 karthikrangasai

Hey @karthikrangasai, yes that could work. If we re-implement that then we can change the name and cite their paper in the code / docs so people can read about why they may want to do it 😃

... plus documenting the usage of course

ethanwharris avatar Mar 30 '22 13:03 ethanwharris

Great. Sounds good. I will try to get some work done on this then.

karthikrangasai avatar Mar 30 '22 13:03 karthikrangasai

Hi, my 2 cents on this:

  1. As a part of this issue, let's only document this further, as @ethanwharris rightly pointed out. We definitely need to show the usage and explain a little about how it works. If we are sure that an exact reference exists, it would be great to cite it.
  2. I would prefer not to modify/add/edit the current strategy, as long as it works and does the job (of unfreezing layers). But if there is some motivation, please share a reference implementation here (if any library already implements it), and we can (very shortly) take a look and go ahead if all sounds good.

Hope it sounds good! :)

krshrimali avatar Apr 01 '22 05:04 krshrimali

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 05 '22 20:06 stale[bot]

@ethanwharris let's do it... :otter:

Borda avatar Jan 05 '23 02:01 Borda