
Running on M1 Mac

Open albertvaka opened this issue 1 year ago • 4 comments

Can we expect this to be ported from CUDA to the Apple Accelerate framework (or something else) so it can run on Mac laptops?

albertvaka avatar Oct 11 '22 18:10 albertvaka

It looks like PyTorch already supports the M1 chip, so it might be enough to use torch.device('mps'). I'll give it a try.
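A minimal sketch of that device selection, assuming a PyTorch build recent enough to expose torch.backends.mps. The helper name pick_device_name is hypothetical and is not part of discoart; it falls back to "cpu" when neither CUDA nor MPS is usable, or when PyTorch is not installed.

```python
import importlib.util


def pick_device_name() -> str:
    """Return "cuda", "mps" or "cpu", whichever backend is usable.

    Hypothetical helper for illustration; discoart does not ship it.
    """
    # Degrade to "cpu" when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is missing in older builds, hence getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device_name())
```

The returned name can be passed straight to torch.device(...). Selecting the device is only the first step: individual operators may still be unsupported on MPS.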

albertvaka avatar Oct 11 '22 18:10 albertvaka

Using torch.device('mps') I get:

The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
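The fallback named in that message can be exported in the shell or set from Python. The sketch below sets it from Python and assumes the variable has to be in place before torch is first imported, which is the usual advice but is not confirmed anywhere in this thread.

```python
import os

# Assumption: PyTorch reads this variable when it is first loaded,
# so set it before any `import torch` in the process.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

print(os.environ["PYTORCH_ENABLE_MPS_FALLBACK"])
```

As the warning says, any operator that falls back runs on the CPU, so this trades speed for coverage.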

Then after setting PYTORCH_ENABLE_MPS_FALLBACK=1 I get:

Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.

This comes from https://github.com/jina-ai/guided-diffusion... next I will try to monkey-patch that to use float32 (since it seems there's no global way to tell NumPy to only use float32) 😅
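A rough sketch of the kind of cast such a patch needs, assuming the float64 values start life as NumPy arrays that are later turned into tensors. The helper as_float32 is hypothetical and is not taken from guided-diffusion.

```python
import numpy as np


def as_float32(arr):
    """Downcast float64 arrays to float32; leave other dtypes untouched.

    Hypothetical helper: apply it to an array before it is converted
    to a tensor and moved to the MPS device.
    """
    arr = np.asarray(arr)
    if arr.dtype == np.float64:
        return arr.astype(np.float32)
    return arr


# np.linspace produces float64 by default.
betas = np.linspace(1e-4, 2e-2, 5)
print(betas.dtype, "->", as_float32(betas).dtype)
```

Casting at the point where arrays become tensors avoids depending on a global NumPy default dtype, at the cost of having to find every such conversion.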

albertvaka avatar Oct 11 '22 20:10 albertvaka

By making all arrays in guided-diffusion of type float32 I got the code to continue until it reaches this repo's cond_fn. However, something seems to go wrong when cond_fn calls MakeCutouts, which in turn calls PyTorch's RandomAffine, causing the program to crash (it triggers an assertion in Apple's code):

-:27:11: error: invalid input tensor shapes, indices shape and updates shape must be equal
-:27:11: note: see current operation: %25 = "mps.scatter_along_axis"(%23, %arg4, %24, %1) {mode = 6 : i32} : (tensor<150528xf32>, tensor<224xf32>, tensor<50176xi32>, tensor<i32>) -> tensor<150528xf32>
/AppleInternal/Library/BuildRoots/a0876c02-1788-11ed-b9c4-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:1267: failed assertion `Error: MLIR pass manager failed'

I've tried commenting out the T.RandomAffine(...) transformation. It continues further, but fails again when cond_fn calls model_stat['clip_model'].encode_image(...), which ends up calling PyTorch's layer_norm and crashing with:

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
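For context, this message is PyTorch's generic complaint about calling .view on a non-contiguous tensor. The sketch below reproduces it on a small transposed tensor and shows that .reshape succeeds on the same input. It illustrates the general cause only; it does not reproduce the layer_norm path on MPS, and flatten_transposed is a hypothetical name.

```python
def flatten_transposed():
    """Show .view failing and .reshape working on a non-contiguous tensor.

    Returns (view_failed, flattened_values).
    """
    # Imported here so that defining the helper does not require PyTorch.
    import torch

    # Transposing changes the strides, so memory is no longer contiguous.
    x = torch.arange(6).reshape(2, 3).t()
    assert not x.is_contiguous()

    try:
        x.view(-1)  # needs a compatible memory layout
        view_failed = False
    except RuntimeError:
        view_failed = True

    # .reshape copies the data when a view is not possible.
    flat = x.reshape(-1)
    return view_failed, flat.tolist()
```

Calling x.contiguous().view(-1) would work as well. In the crash above the failing call is inside library code, so a fix would have to be made there rather than in user code.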

I give up 😞

albertvaka avatar Oct 11 '22 22:10 albertvaka

"I give up 😞"

never give up

sascha1337 avatar Oct 17 '22 17:10 sascha1337