InvokeAI
[bug]: DPM Samplers Seem To Never Converge On MPS?
Is there an existing issue for this?
- [X] I have searched the existing issues
OS
macOS
GPU
mps
VRAM
16GB
What happened?
Running the prompt 'banana sushi' with -S42 using the k_dpm_2 and k_dpm_2_a samplers at 17, 27, and 37 steps suggests that SD will never converge (decide?) on a particular image.
k_dpm_2 at 17 steps

k_dpm_2 at 27 steps

k_dpm_2 at 37 steps

Screenshots
No response
Additional context
Seeing this on my system since the 2.0 release, including the 2.1 release candidate I cloned yesterday on 11/1. Anyone else? I can reproduce this with as many as 70 steps (making my poor M1 Pro MacBook very melty). k_heun sampler does not exhibit this behavior, BTW.
Contact Details
No response
Thanks for the report. I’ll try to reproduce the error and suggest a fix.
+1 I see the same behavior.
@lstein Is this possibly related to the recently-discovered regression in reproducibility?
A datapoint - I took the Automatic1111 distro for a brief spin, and DPM2 and 2a behave the same there, so I wonder if the issue has more to do with dependencies (PyTorch?) + my hardware than InvokeAI (i.e., a me problem).
I'm pretty sure I've been hearing about this for a long time, maybe since the beginning of the k_diffuser import. Someone with an MPS system should check out an old version and see if it was there.
Lincoln
I think I follow. Do you mean check out an old version of InvokeAI from before you started using k_diffuser? Or an older version of k_diffuser itself? Sorry, machine learning isn't my background and this is all quite... exotic to me 😳
I looked into this and I have a fix that restores the four k_dpm* samplers currently broken on Apple Silicon. The fix is simple, but I think input from the project maintainers is needed first, which is why I am not sending a PR yet.
The root cause: PyTorch 1.12.1 has a bug where indexing into a tensor on the 'mps' device and then applying a function returns a wrong result.
```python
>>> torch.tensor([1., 5.], device='mps').log()
tensor([0.0000, 1.6094], device='mps:0')
>>> torch.tensor([1., 5.], device='mps')[1].log()
tensor(0., device='mps:0')  # should be 1.6094
```
This bug is fixed in PyTorch 1.13, but that version has a more serious issue that makes InvokeAI completely unusable for me (I suspect memory management: constant swapping). Related: https://github.com/pytorch/pytorch/issues/89784
Back to 1.12.1: the bug affects InvokeAI through the 'log' call on a single 'sigma': https://github.com/Birch-san/k-diffusion/blob/mps/k_diffusion/sampling.py#L590 The sigmas array is a one-dimensional mps tensor, and thus suffers from the PyTorch 1.12.1 bug.
My suggested solution: move this tensor to 'cpu'. It is small, but more importantly its dimension ('time') is unrelated to any dimension of the actual image tensors at play: there is no real reason for it to live on the GPU (or even be a tensor, for that matter). For example, add .to('cpu') here: https://github.com/invoke-ai/InvokeAI/blob/main/ldm/models/diffusion/ksampler.py#L196
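A minimal sketch of the idea (illustrative names, not the actual ksampler.py code): keep the small 1-D sigma schedule on the CPU, so the per-step indexing followed by .log() never goes through the buggy MPS path.

```python
import math
import torch

# Illustrative sketch, assuming a stand-in linear schedule: keep the
# 1-D sigma schedule on CPU so per-step indexing + .log() avoids the
# PyTorch 1.12.1 MPS indexing bug described above.
sigmas = torch.linspace(1.0, 0.01, 10).to('cpu')  # hypothetical schedule

sigma_i = sigmas[3]        # indexing a 1-D CPU tensor is safe
log_sigma = sigma_i.log()  # correct on CPU; wrong on mps in 1.12.1
```

Since the schedule is tiny and only indexed once per step, keeping it on CPU should cost essentially nothing.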
The complication: the 'to_d' function used in several samplers treats a scalar 'sigma' value as a tensor, which causes an error for mixed GPU/CPU tensor computation: https://github.com/Birch-san/k-diffusion/blob/mps/k_diffusion/sampling.py#L48 (I don't understand why the dimension-adding before the division is needed; can someone please elucidate?) The workaround there is to move the single sigma back to 'mps' by adding .to('mps').
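On the dimension-adding question, my reading (an assumption, not confirmed by the k-diffusion authors) is that 'sigma' can be a per-batch vector of shape [B], while x and denoised are [B, C, H, W]; PyTorch broadcasting aligns trailing dimensions, so the vector must become [B, 1, 1, 1] before the division. A self-contained sketch of that mechanism, with a local stand-in for the k-diffusion append_dims helper:

```python
import torch

def append_dims(t: torch.Tensor, target_ndim: int) -> torch.Tensor:
    # Append trailing singleton dims so `t` broadcasts against a
    # higher-rank tensor: shape [B] becomes [B, 1, 1, 1] for target_ndim=4.
    return t[(...,) + (None,) * (target_ndim - t.ndim)]

x = torch.ones(2, 3, 4, 4)          # a batch of two "images"
sigma = torch.tensor([2.0, 4.0])    # one sigma per batch item
d = x / append_dims(sigma, x.ndim)  # each item divided by its own sigma
```

Without the reshape, `x / sigma` would try to broadcast [2, 3, 4, 4] against [2] along the last dimension and raise a shape error.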
(PS: There's already an MPS workaround in 'to_d', very possibly related: https://github.com/Birch-san/k-diffusion/blob/mps/k_diffusion/utils.py#L46 )
Hope this helps!