Add missing get_velocity function to DEISMultistepScheduler
Add the missing get_velocity function (required for v-prediction) from DDPMScheduler (see #2351).
This allows DEISMultistepScheduler to be a drop-in replacement for DDPMScheduler.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Thanks for adding this @saunderez. I think it looks good - do you want to run `make quality` so the code quality checks can pass?
@patrickvonplaten @williamberman can we merge this in? Otherwise we can't use v_prediction in DEISMultistepScheduler during training.
I've tried, but unfortunately I'm having a little bit of trouble with that... will try again when I get home from work.
All ready to go @yiyixuxu !
Happy to be a contributor for once, hopefully there'll be more to come :)
Hey @saunderez,
Sorry, could you clarify the use case a bit?
There really shouldn't be a use case where one needs to use the DEISMultistepScheduler instead of DDPMScheduler during training - e.g. we always use the DDPMScheduler in our examples: https://github.com/huggingface/diffusers/blob/e3ddbe25edeadaa5afc3f8f5bb0d645098a8b26a/examples/dreambooth/train_dreambooth.py#L617 and all those trained unets are then compatible with the DEIS scheduler during inference.
We have been testing the usage of DEIS as an alternative scheduler for training in the dev branch of the Dreambooth extension for the Automatic1111 WebUI. It is currently a user-selectable toggle.
https://github.com/d8ahazard/sd_dreambooth_extension
While I wouldn't describe the extension as cutting edge, the sd_dreambooth_extension community does not shy away from testing implementations with potential benefit, such as LoRA and DEIS. Lion was pushed to dev as a potential replacement for AdamW today and early results are extremely promising, and I believe we can improve training quality and the process in general by using state-of-the-art tech when possible.
Back to the pull request itself...
DEIS was implemented in dev for testing not long after release, and several members of the Discord had success with 1.5-based models; opinion was favourable. As someone who trains 2.1 768 almost exclusively, I was the first to encounter the missing function, and it was immediately obvious that a simple copy and paste would allow me to continue testing.
Since implementing it in my codebase I have trained several models successfully. No A/B analysis has been done so far, but I have definitely noticed higher-quality samples at low steps, and I think that alone warrants implementing get_velocity for training.
To further add color to @saunderez's use case: we've found that using DEIS in place of DDIM/DDPM for the noise sampler improves training speed without any effect on output quality, but without v-pred it breaks v2.1 models.
I see, would love a second opinion here from @pcuenca @patil-suraj @williamberman
Thanks for the extra detail :)
Gentle ping again @patil-suraj @williamberman and @pcuenca
I have no experience training with DEIS, but the PR looks fine to me. The change is isolated to this use case and shouldn't impact anything, unless the user manually configures their training scripts to use it.
I can try to set up a couple of dreambooth runs today.
I'm a bit confused, I believe the "noise schedule" is the same regardless of the Scheduler class. I don't think we get anything out of adding this method to other schedulers.
This is where the diffusers naming can get confusing, sorry about this. We have several Scheduler classes which each contain a "noise schedule" and a set of methods for solving for the next step in the denoising process given what the model outputs. In the training scripts, the `noise_scheduler` is only used for its noise schedule and not for any of the solving methods. This makes sense as there's no iterative denoising process during training.
The noise schedule is configured by `num_train_timesteps`, `beta_start`, `beta_end`, `beta_schedule`, and `trained_betas`, which are always used the same way.
The "solving methods" are configured by `variance_type` and `prediction_type`.
Maybe there's something about DEISMultistepScheduler that I'm missing? Also, using an alternative `noise_scheduler` for training scripts would mean we'd have to make the class configurable somehow, i.e. just adding the `get_velocity` method isn't sufficient.
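To illustrate the point above, here is a minimal sketch of what the training loop actually uses from the scheduler - just the schedule itself, not the solver. It mirrors what `noise_scheduler.add_noise` does in the diffusers training scripts, written in plain PyTorch; the schedule values and function names here are illustrative, not the library's exact API:

```python
import torch

# Linear beta schedule -- the part shared by DDPM, DEIS, UniPC, etc.
# (mirrors the num_train_timesteps / beta_start / beta_end config fields)
num_train_timesteps, beta_start, beta_end = 1000, 0.0001, 0.02
betas = torch.linspace(beta_start, beta_end, num_train_timesteps)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def add_noise(sample: torch.Tensor, noise: torch.Tensor, t: int) -> torch.Tensor:
    # Forward process q(x_t | x_0) -- the only scheduler call the training loop needs
    sqrt_alpha = alphas_cumprod[t] ** 0.5
    sqrt_one_minus_alpha = (1.0 - alphas_cumprod[t]) ** 0.5
    return sqrt_alpha * sample + sqrt_one_minus_alpha * noise

x0 = torch.randn(1, 3, 8, 8)     # clean latents
eps = torch.randn_like(x0)       # sampled noise
noisy = add_noise(x0, eps, 500)  # noisy input fed to the unet during training
```

Nothing here depends on which solver (DDPM step, DEIS multistep, ...) is used at inference time, which is why the schedulers only differ during sampling.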
@williamberman @saunderez As I understand this, the `get_velocity` function should indeed be the same for all `Scheduler` (sampler) classes given the same noise schedule parameters, but must be implemented to yield a training target to train models that use v-prediction. In fact, it may be good to:
- Mark `get_velocity` with the `# Copied from` mechanism to propagate the implementation to all `Scheduler` classes that define `self.alphas_cumprod` consistent with the original DDPM stochastic process formulation for the time being. Same goes for the `Scheduler` classes that define `self.sigmas` consistent with the SDE/ODE formulation in Karras et al. 2022.
- Refactor noise schedules into a separate class that is then passed as a `Scheduler` constructor argument, as a long-term goal.
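For reference, this is essentially the `get_velocity` implementation the PR copies from DDPMScheduler, shown here as a standalone function for illustration (the real method lives on the scheduler and reads `self.alphas_cumprod`). The broadcasting loops just reshape the per-timestep scalars to match the sample tensor:

```python
import torch

def get_velocity(alphas_cumprod: torch.Tensor, sample: torch.Tensor,
                 noise: torch.Tensor, timesteps: torch.Tensor) -> torch.Tensor:
    alphas_cumprod = alphas_cumprod.to(device=sample.device, dtype=sample.dtype)

    sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5
    sqrt_alpha_prod = sqrt_alpha_prod.flatten()
    while len(sqrt_alpha_prod.shape) < len(sample.shape):
        sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1)

    sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5
    sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten()
    while len(sqrt_one_minus_alpha_prod.shape) < len(sample.shape):
        sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1)

    # v-prediction target: v_t = sqrt(alpha_bar_t) * eps - sqrt(1 - alpha_bar_t) * x_0
    return sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample
```

Since it only depends on `alphas_cumprod`, propagating it via `# Copied from` to every schedule-compatible scheduler is mechanical.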
Exactly right, and great idea. Without it, schedulers aren't fully compatible with v-prediction, and it's such an easy win.
We've monkey-patched it in for our purposes to implement DEIS and have since done the same to implement UniPCMultistepScheduler, which is also lacking get_velocity (and is now my go-to for training because of the speed!).
Considering the outstanding work done to provide standardised interfaces in diffusers, the inconsistency here is a bit out of place.
@d8ahazard could you help explain where it breaks? Are you suggesting our `text_to_image` example breaks if one uses the v2-1 checkpoint as the base model?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
When training, it breaks if we use DEIS for the noise sampler without monkey-patching in v-prediction.
Here, I believe:
https://github.com/d8ahazard/sd_dreambooth_extension/blob/926ae204ef5de17efca2059c334b6098492a0641/dreambooth/train_dreambooth.py#L1151
BUMP.
Can we please merge this already? I'm tired of having to monkey-patch it in my code. Lol.
We will need to solve the quality check here before merging this. Nevertheless, I also recommend simply using the DDPMScheduler for training (`add_noise`) and then using the DEISMultistepScheduler for inference. Schedulers can be switched in and out on the fly:

```python
# training
scheduler = DDPMScheduler.from_config(pipeline.scheduler.config)
# inference
scheduler = DEISMultistepScheduler.from_config(pipeline.scheduler.config)
```