Birch-san

175 comments of Birch-san

I have a repro (from stable-diffusion's attention forward-pass): https://github.com/CompVis/stable-diffusion/blob/69ae4b35e0a0f6ee1af8bb9a5d0016ccb27e36dc/ldm/modules/attention.py#L180

```python
from torch import einsum, ones
# crashes with "product of dimension sizes > 2**31"
# this is equivalent to invoking...
```
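The snippet above is cut off; as a hedged reconstruction, here is a minimal sketch of the failing call on the MPS backend. The einsum subscripts come from the linked attention.py line; the tensor shapes (batch×heads=16, tokens=4096, dim_head=40, i.e. SD v1 self-attention at 512×512) are my assumption, not the original repro's exact values:

```python
from torch import einsum, ones

# assumed shapes: SD v1 self-attention at 512x512
# (batch*heads = 16, tokens = 64*64 = 4096, dim_head = 40);
# the original repro's exact sizes are truncated above
q = ones(16, 4096, 40, device='mps')
k = ones(16, 4096, 40, device='mps')

# the einsum from ldm/modules/attention.py#L180; on MPS this crashed with
# "product of dimension sizes > 2**31", presumably (my reading, not
# confirmed) because einsum materializes an intermediate whose element
# count overflows a 32-bit size
sim = einsum('b i d, b j d -> b i j', q, k)
```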

Can you give some examples? Because I've done stable-diffusion inference and TI (textual inversion) training just fine on Python 3.10 with the mainline master branch, and have done inference just fine with...

@patrickvonplaten rather than having everybody save & re-upload their weights: can diffusers intercept the weights during model load and map them to different parameter names? Apple uses PyTorch's `_register_load_state_dict_pre_hook()` idiom...
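For illustration, a minimal sketch of that pre-hook idiom, renaming a legacy checkpoint key on the fly during `load_state_dict`; the module and the old/new key names here are hypothetical, not diffusers' actual parameters:

```python
import torch
from torch import nn

class Attention(nn.Module):
    def __init__(self):
        super().__init__()
        self.to_q = nn.Linear(320, 320, bias=False)
        # rewrite incoming state_dict keys at load time, so existing
        # uploads keep working without anyone re-saving their weights
        self._register_load_state_dict_pre_hook(self._remap_legacy_keys)

    def _remap_legacy_keys(self, state_dict, prefix, local_metadata, strict,
                           missing_keys, unexpected_keys, error_msgs):
        # hypothetical rename: suppose old checkpoints called to_q "query"
        old_key, new_key = f'{prefix}query.weight', f'{prefix}to_q.weight'
        if old_key in state_dict:
            state_dict[new_key] = state_dict.pop(old_key)

attn = Attention()
legacy_ckpt = {'query.weight': torch.randn(320, 320)}
attn.load_state_dict(legacy_ckpt)  # loads despite the legacy key name
```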

https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/b34b25b4c941819d34f29be6c4c1ec01e64585b4#commitcomment-86212295

I found that this mitigation works, but I assume it only works if you're consistent in the size of `x` and `context` that you submit to `CrossAttention` (e.g. same image...

personally I run pytorch stable 1.12.1 (because it's faster than 1.13RC or the nightlies https://github.com/pytorch/pytorch/issues/85297, ~https://github.com/pytorch/pytorch/issues/87010~), so I don't encounter the einsum reproducibility problem. my use-case is almost always "launch...

wasn't this problem due to this einsum bug: https://github.com/pytorch/pytorch/issues/85224? it's been solved since at least `1.13.0.dev20220928` (so it should be in the latest stable, 1.13.1). in any case: diffusers CrossAttention doesn't use einsum any...
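for reference, a sketch of the matmul-style formulation that replaced the einsum; the shapes are arbitrary, and the `baddbmm`-with-`beta=0` pattern is my rendering of the approach rather than a verbatim excerpt from diffusers:

```python
import torch

b, i, j, d = 2, 64, 77, 40  # arbitrary batch*heads, query tokens, key tokens, dim_head
q = torch.randn(b, i, d)
k = torch.randn(b, j, d)
scale = d ** -0.5

# einsum formulation (as in the old CrossAttention):
sim_einsum = torch.einsum('b i d, b j d -> b i j', q, k) * scale

# batched-matmul formulation, no einsum; with beta=0 the (uninitialized)
# input tensor is ignored, so this computes scale * (q @ k^T) per batch
sim_bmm = torch.baddbmm(
    torch.empty(b, i, j, dtype=q.dtype, device=q.device),
    q,
    k.transpose(-1, -2),
    beta=0,
    alpha=scale,
)

assert torch.allclose(sim_einsum, sim_bmm, atol=1e-5)
```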

hard disagree that a CoreML model is a substitute for having a working PyTorch MPS model. but I do think [diffusers is deterministic on MPS](https://github.com/huggingface/diffusers/issues/372#issuecomment-1374846894) anyway.

> Since those are the same I think you need to set eta to something other than the default of 0, because it isn't using the random noise at all....
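for context, this is the shape of the ancestral step calculation as I understand it from k-diffusion's `get_ancestral_step` (paraphrased from memory, so treat the exact formula as an assumption): with eta=0, `sigma_up` is 0, so no fresh noise is ever injected and the sampler is fully deterministic.

```python
def get_ancestral_step(sigma_from, sigma_to, eta=1.):
    # eta=0 short-circuits: step straight down to sigma_to, add no noise
    if not eta:
        return sigma_to, 0.
    # how much noise to re-inject after the step (capped at sigma_to)
    sigma_up = min(sigma_to, eta * (sigma_to ** 2 * (sigma_from ** 2 - sigma_to ** 2) / sigma_from ** 2) ** 0.5)
    # the lower noise level to step down to, chosen so that adding
    # sigma_up of fresh noise lands the sample back at level sigma_to
    sigma_down = (sigma_to ** 2 - sigma_up ** 2) ** 0.5
    return sigma_down, sigma_up
```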

oh wow, halving `rtol` to 0.025 **does** help `sample_dpm_adaptive` produce big sleeves similar to the ones `sample_dpmpp_2s_ancestral` converged on. target we're trying to converge on (`sample_dpmpp_2s_ancestral`, 100 steps): `sample_dpm_adaptive` eta=0.75...
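for anyone reproducing this, a sketch of the invocation with the tightened tolerance, assuming k-diffusion's `sample_dpm_adaptive` (whose default I believe is rtol=0.05); the denoiser and latent setup here are placeholders, not the actual experiment:

```python
import torch
from k_diffusion.sampling import sample_dpm_adaptive

# placeholder model: a real run would wrap the SD UNet in a k-diffusion
# denoiser; this stub exists only to make the call self-contained
class DummyDenoiser(torch.nn.Module):
    def forward(self, x, sigma):
        return torch.zeros_like(x)

x = torch.randn(1, 4, 64, 64) * 14.6  # start from roughly SD's max sigma
samples = sample_dpm_adaptive(
    DummyDenoiser(), x,
    sigma_min=0.03, sigma_max=14.6,
    rtol=0.025,  # halved from the (assumed) 0.05 default: tighter error
                 # tolerance, so the adaptive controller takes smaller steps
    eta=0.75,    # same eta as in the comparison above
)
```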