sparseml
sparseml copied to clipboard
Let modifiers implement epoch adjustment when composing recipes
This change simplifies the manager's recipe combination, also fixes a bug that the current code at the manager level does not account for all epoch related attributes from the quantization modifiers.
@eldarkurtic @bfineran @rahul-tuli @dbogunowicz assigned for review
I like the idea of defining the advance_epochs method in the modifier class. However, note that the new method does not try to update all attributes w/ "epoch" in the name, as the previous implementation did. Are you sure you're not breaking other flows (not quantization)? What testing was done?
I like the idea of defining the advance_epochs method in the modifier class. However, note that the new method does not try to update all attributes w/ "epoch" in the name, as the previous implementation did. Are you sure you're not breaking other flows (not quantization)? What testing was done?
I thought it updates all epoch attributes (start, end, observer, freeze bn), no? Afaik there're no other scheduled modifiers having extra epoch related attributes (except for start and end); let me know otherwise. I also verify the composed recipe after quantization from a pruned model.
I agree this is a definite improvement over the current approach. I believe the only missing coverage (which isn't properly covered by the current state either) is
update_epochsinmodifier_thinning.py. This is a list of epochs which needs to be iterated over and updated as well
@KSGulin Thanks for pointing out. I'll update the PR to account for that modifier.
I like the new approach, but shouldn't the standard advance_epochs in modifier.py reproduce the old behavior to be safe?
@anmarques from my inspection this PR seems to cover all the desirable old behavior and avoids the error prone behavior of trying to catch attributes with "epoch" in their name that meet certain conditions. Do you see some potential use cases that aren't covered?
I like the new approach, but shouldn't the standard advance_epochs in modifier.py reproduce the old behavior to be safe?
The impl in modifier.py is for the abstract scheduled modifier and therefore is about the start and end epoch. Previously we had to change these two and all epoch based attributes in derived modifiers at this level of hierarchy which is not a good design.