Morgon Kanter
> @mx any opinion on this implementation

Seems fine to me, I'll withdraw my PR. Do we even need the additional setting though? The user already has to opt-in to...
Could this be considered again for inclusion? It would be really helpful to be able to stop training the text encoder at a certain point but continue to train the...
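A hedged sketch of what the requested behavior could look like, using tiny stand-in modules for the text encoder and U-Net; `stop_text_encoder_at` is a hypothetical name for illustration, not an existing option in this repo:

```python
import torch
from torch import nn

# Stand-ins for the real text encoder / U-Net, just to make this runnable.
text_encoder = nn.Linear(8, 8)
unet = nn.Linear(8, 8)
opt = torch.optim.AdamW(
    list(text_encoder.parameters()) + list(unet.parameters()), lr=1e-3
)
stop_text_encoder_at = 100  # hypothetical cutoff step

for step in range(200):
    if step == stop_text_encoder_at:
        # Freeze the text encoder; with grads cleared to None below, the
        # optimizer skips its parameters while the U-Net keeps training.
        text_encoder.requires_grad_(False)
    x = torch.randn(4, 8)
    loss = unet(text_encoder(x)).pow(2).mean()
    loss.backward()
    opt.step()
    opt.zero_grad(set_to_none=True)
```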
@DevArqSangoi If you do it that way, could regularization effects that are still active (e.g. weight decay) keep reducing the weights, depending on the optimizer in use? That would be a problem if they could.
It shouldn't matter for Adam or AdamW, or for the optimizers based on them; I'm just not sure what weirdness is out there. I suppose it should really work fine for...
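A quick standalone check of the concern in plain PyTorch (a hypothetical snippet, not code from any PR here): whether AdamW still touches a "stopped" parameter depends on how its gradient is cleared. A grad that is a zero tensor still gets the decoupled weight decay applied on every step, while a grad of `None` makes the optimizer skip the parameter entirely.

```python
import torch

p = torch.nn.Parameter(torch.ones(4))
opt = torch.optim.AdamW([p], lr=1e-2, weight_decay=1e-2)

# "Stopping" training by zeroing the gradient: decay still applies.
p.grad = torch.zeros_like(p)
opt.step()
print(p.data)  # tensor([0.9999, ...]) -- shrunk by lr * weight_decay

# Clearing the gradient to None instead: the parameter is left untouched.
p.grad = None
before = p.detach().clone()
opt.step()
print(torch.equal(p.data, before))  # True
```

Any momentum accumulated from earlier non-zero gradients would also keep nudging the weights for a while under the zero-tensor approach, so freezing via `requires_grad_(False)` plus `zero_grad(set_to_none=True)` seems like the safer route.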
> It is a brutish way.

I didn't say it wouldn't work, just that it is a sledgehammer for the lazy. The real way no current "dev" wants to tackle if...