
Metal iOS

Open soldelacroix opened this issue 11 months ago • 5 comments

Great framework!

Is the usage of Metal already possible on iOS? I'm trying to run the Phi example on iOS and I can only get it to work with a CPU device but not with Metal. MTLStorageModeManaged isn't available on iOS.

soldelacroix avatar Mar 13 '24 13:03 soldelacroix

I've never tried compiling for iOS, but issue #1759 seems related.

LaurentMazare avatar Mar 18 '24 22:03 LaurentMazare

I've had a try at it here, but something based on candle-examples/examples/phi is generating mostly blank tokens on my iPad. Not sure what the issue is at this point. TBC

ssoudan avatar Mar 26 '24 18:03 ssoudan

Thanks a lot @LaurentMazare for the great work.

I tried running it on iOS and as the OP noted, it works well but only on CPU. When using the GPU, the Metal code crashes because candle is explicitly using MTLResourceOptions::StorageModeManaged in a few places. The managed mode is not available on iOS and tvOS so the code panics.

I tried simply changing it to shared everywhere, but that also panics because iOS has a buffer size limit of 256MB (see: https://github.com/gfx-rs/metal-rs/blob/master/src/device.rs#L712-L718), and running Phi, at least, attempts to allocate more than that.
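To make the size constraint concrete, here is a minimal sketch of splitting one large allocation request into pieces that each fit under a per-buffer limit. It is not candle's or metal-rs's API: `split_allocation` and `MIB` are hypothetical names used only for illustration, and the 256MB figure is the limit quoted above, which in practice would be queried from the device rather than hard-coded. Whether candle's kernels could operate across split buffers is a separate question that this sketch does not address.

```rust
/// One mebibyte, used to express the sizes below (illustrative constant).
const MIB: usize = 1024 * 1024;

/// Hypothetical helper: split a request of `total_bytes` into chunk sizes
/// that never exceed `max_buffer_len`. Returns the size of each chunk in
/// allocation order; an empty request yields no chunks.
fn split_allocation(total_bytes: usize, max_buffer_len: usize) -> Vec<usize> {
    assert!(max_buffer_len > 0, "max_buffer_len must be non-zero");
    let mut chunks = Vec::new();
    let mut remaining = total_bytes;
    while remaining > 0 {
        // Take as much as the limit allows, or whatever is left.
        let len = remaining.min(max_buffer_len);
        chunks.push(len);
        remaining -= len;
    }
    chunks
}
```

For example, a 600 MiB request against a 256 MiB limit would need three buffers of 256, 256 and 88 MiB.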

Is there somewhere we can control the buffer size that candle uses?

filipw avatar Apr 10 '24 07:04 filipw

ref https://github.com/huggingface/candle/issues/2322 ... @filipw I've stumbled upon the same issue, did you eventually manage to make it work on iOS?

evilsocket avatar Jul 08 '24 11:07 evilsocket

I'll repost my comment from #2322 here because I think it's super relevant and might be a very good lead for fixing the issue:

After a bit of digging into Apple MLX and especially how they handle buffer allocation on both macOS and iOS, I found this https://github.com/ml-explore/mlx/blob/main/mlx/backend/metal/allocator.cpp#L207

You will see that all allocations are centralized there and they only ever use ResourceStorageModeShared (you won't find references to other storage modes in their Metal backend). So it seems that, on iOS at least, managed buffers are not needed. It makes sense if we think about it: Metal on macOS must support both Intel (where GPU memory and RAM are not unified) and Apple Silicon, while Metal on iOS only deals with unified memory, hence no syncing is needed or supported.
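As a rough sketch of that reasoning, the storage mode could be decided in one place from the platform and memory architecture. Everything here is hypothetical: `StorageMode` is a local stand-in for Metal's storage modes (not the metal-rs type), and `Target`, `storage_mode_for` and `current_target_is_macos` are illustrative names, not candle or MLX code. MLX, as noted above, simply uses shared mode everywhere; the managed branch below is only an assumption about what a macOS build on non-unified hardware might prefer.

```rust
/// Local stand-in for the two Metal storage modes discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StorageMode {
    Shared,
    Managed,
}

/// Hypothetical description of the device being targeted.
#[derive(Debug, Clone, Copy)]
struct Target {
    /// True when building for macOS (managed mode does not exist on iOS/tvOS).
    is_macos: bool,
    /// True when CPU and GPU share the same physical memory.
    has_unified_memory: bool,
}

/// Pick a storage mode in one central place.
/// Managed is only chosen on macOS without unified memory; every other
/// combination, including all iOS targets, falls back to shared.
fn storage_mode_for(target: Target) -> StorageMode {
    if target.is_macos && !target.has_unified_memory {
        StorageMode::Managed
    } else {
        StorageMode::Shared
    }
}

/// Compile-time check that a real build could use to fill in `is_macos`.
fn current_target_is_macos() -> bool {
    cfg!(target_os = "macos")
}
```

Keeping the decision in a single function mirrors the centralized allocator described above, so no call site ever requests a mode the platform does not support.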

Looking forward to your thoughts.

evilsocket avatar Jul 10 '24 12:07 evilsocket