OneTrainer
Explanation of Masked Training
Describe your use-case.
To better understand the concept, so that I can use it when it is needed and avoid it when it is not.
What would you like to see as a solution?
A detailed explanation of how you've implemented it, how it affects training, and what happens to the background. Does the background go fully "nan", or does it remain but at a much lower "weight"?
Have you considered alternatives? List them here.
None; only the author's explanation can answer this.