objc2 icon indicating copy to clipboard operation
objc2 copied to clipboard

Metal: Misc non-`unsafe` translation fixes

Open MarijnS95 opened this issue 5 months ago • 6 comments

Draft until this becomes more complete, and while I figure out the correct semantics.

Obvious things that should remain unsafe:

  • CPU-side indexing operations (that are not known to be bounds-checked internally).
  • Functions that involve raw pointers and/or data ranges (not translated to slices yet).

Things that are dubious:

  • GPU indices: sampleCountersInBuffer:atSampleIndex:withBarrier:, and also all the calls that set resources on encoders and argument buffers like setBuffer:offset:atIndex:.
  • GPU ranges: executeCommandsInBuffer:withRange:.
  • GPU buffer offsets: executeCommandsInBuffer:indirectBuffer:indirectBufferOffset: (note that MTLBuffer newTextureWithDescriptor:offset:bytesPerRow: was marked safe).

Here I am not sure if Rust's safety model should extend to the GPU, if at least CPU-side encoding is fully safe. It would be more convenient to not have to wrap them in unsafe on the call-side.

Things that seem obviously safe to me:

  • Various object getters and setters with high-level safe types.
  • Especially when they return Option to handle nullability?

MarijnS95 avatar Jul 23 '25 19:07 MarijnS95

CPU-side indexing operations (that are not known to be bounds-checked internally).

Pretty sure these are bounds-checked internally. I'll try to add a test that they are.

I am not sure if Rust's safety model should extend to the GPU, if at least CPU-side encoding is fully safe.

What do you think @cwfitzgerald? Should methods that may cause the GPU to execute an out-of-bounds operation be considered UB in objc2-metal?

I'm somewhat leaning towards "yes", but it might not actually be a problem if Metal internally ensures that such accesses cannot touch the data of other applications using the GPU? I.e. if all accesses are sandboxed anyhow?

madsmtm avatar Aug 24 '25 12:08 madsmtm

I guess my problem here is a lack of understanding of how Metal works.

As I understand it, if we are to map safety concerns in normal CPU code to safety concerns when interfacing with the GPU, it would be:

  • Type safety: MTLBuffer seems to be untyped, how do you ensure that you don't cast e.g. MTLPackedFloat3 to a MTLPackedFloatQuaternion (which would be invalid in Rust, since MTLPackedFloat3 has a possibly uninitialized padding field)? Or is this just a valid thing to do everywhere in Metal?
  • Initialization safety: I'm guessing that you're never working with uninitialized memory, but that newly allocated memory is automatically zeroed? Though if that's the case, what happens if you don't initialize e.g. a pointer, and just leave it at 0? And does Metal have the equivalent of NonNull<u32>?
  • Bounds safety: Discussed above.
  • Lifetime safety: If you release a MTLHeap or MTLBuffer before the GPU is done using them, do you get use-after-free? or does Metal hold on to these buffers long enough by itself?
  • Thread safety:
    • To/from-CPU: Accessing the contents of a buffer has to be synchronized. And if using MTLStorageModeShared, reads and writes should probably be volatile?
    • To/from-GPU: Do GPUs have atomics? Would you get torn values if you tried to read something that another GPU "thread" (or whatever it is called) was writing to?
  • Invariants violated by shaders themselves: Should methods like newLibraryWithSource:options:error: even be safe? The shader itself can do most of the above things, right?

Maybe one of you could help clarify these?

madsmtm avatar Aug 24 '25 13:08 madsmtm

Practically, I think we should be able to mark most descriptor methods as safe.

madsmtm avatar Aug 25 '25 14:08 madsmtm

I believe that https://github.com/madsmtm/objc2/pull/792 mostly replaces this, apart from the bounds stuff, do you want to close this PR, or use it as a starting point to work on https://github.com/madsmtm/objc2/issues/793?

madsmtm avatar Sep 29 '25 17:09 madsmtm

Assuming we didn't yet race on #793, is it safe to start working on factoring out the false-positives from unsafe-default-safety.not-bounds-affecting = true and set them to .unsafe = false?


Regarding the points above:

  • Type safety: this is most likely not very clearly handled, especially as buffers can be bound at arbitrary offsets [^1] and reused in various ways. In addition to being accessible on the CPU via contents: but at least that raw pointer is unsafe;

  • Initialization safety: this seems to be a better place to address your comment about type-safety:

    don't cast e.g. MTLPackedFloat3 to a MTLPackedFloatQuaternion (which would be invalid in Rust, since MTLPackedFloat3 has a possibly uninitialized padding field)?

    I don't think there's actual padding in the type there but a full-on size mismatch which makes any such casting extra unsafe if not for carefully crafted and controlled situations.

    IIRC reading uninitialized memory mostly all returns 0, but this may be more clearly defined somewhere in the spec.

  • Bounds safety: very non-trivial, as brought up in https://github.com/madsmtm/objc2/issues/793#issuecomment-3352262358 we also have to take into account functions that change element type and thus size (of index / vertex buffers for example);

  • Lifetime safety: This used to be tracked by Metal if not opted out but that already wasn't possible when manually uploading gpuReourceID() values, and is removed now in Metal 4.

    • Thus, Drop is unsafe :sweat_smile:
    • Closely related to this: also consider residency.
  • Thread safety: Yes GPUs have atomics, but I'm not sure what would happen if you tried to read/write them from the CPU concurrently without appropriate synchronization. I don't expect torn values if they're cache-aligned, which they might anyway have to be?

  • Shader invariants: Perhaps functions and libraries should be safe, but executing them should not? That's a combination between setPipelineState: and dispatchThreads:/draw...:.

All in all, mostly guesses and "it's probably unsafe in some way anyway" :grimacing:

[^1]: Fun coincidence: draw messages that already receive a vertex or index count plus type (to know the total size of bytes read) receive a separate BufferLength for "validation" (??) yet binding a MTLBuffer to a MTL4ArgumentTable or encoding a descriptor in a heap does not receive any length, thus not having any validation on whichever subregion of a MTLBuffer or larger MTLHeap you're intending to specify.

MarijnS95 avatar Sep 30 '25 14:09 MarijnS95

Assuming we didn't yet race on #793, is it safe to start working on factoring out the false-positives from unsafe-default-safety.not-bounds-affecting = true and set them to .unsafe = false?

I won't be touching it yet, and I'll let you know if I want to. I have assigned you to the issue to hopefully make this clear.

  • Type safety: this is most likely not very clearly handled, especially as buffers can be bound at arbitrary offsets 1 and reused in various ways. In addition to being accessible on the CPU via contents: but at least that raw pointer is unsafe;

In https://github.com/madsmtm/objc2/pull/792, I've marked MTLBuffer as unsafe in argument positions for this reason. This is the only type which does weird things with types, right? E.g. MTLTextures are "type-safe" in the sense that you can safely read and modify the texture contents as a &[u8] (assuming you do proper synchronization, and assuming you don't do weird things to the underlying buffer, if it has one).

  • Initialization safety: this seems to be a better place to address your comment about type-safety:

    don't cast e.g. MTLPackedFloat3 to a MTLPackedFloatQuaternion (which would be invalid in Rust, since MTLPackedFloat3 has a possibly uninitialized padding field)? I don't think there's actual padding in the type there but a full-on size mismatch which makes any such casting extra unsafe if not for carefully crafted and controlled situations. IIRC reading uninitialized memory mostly all returns 0, but this may be more clearly defined somewhere in the spec.

There's two sides to this:

  • Uninitialized state in CPU buffers.
  • Uninitialized state in shader code.

I was mostly worried about it from the CPU side, but I think that the bytes of MTLBuffers are guaranteed to be initialized (I'm reading -[MTLDevice newBufferWithLength:options:] which says that they're zero-initialized), which at least means we won't have issues with people reading uninitialized memory (which, now that I'm thinking about it, is probably mostly a concept in the Rust or C Abstract Machine, and doesn't actually apply to things like this).

Re uninitialized state in shader code, I think we can mostly ignore it. For example, writing an uninitialized struct to a buffer is technically UB, but on the CPU side, we don't really care, we'd just observe one of two things:

  • The struct was initialized to all zeroes, and these were written.
  • Nothing was written (because e.g. the shader compiler could prove that no write needed to happen).

I don't think there's actual padding in the type there

Pretty sure packed_float3/MTLPackedFloat3 has a stride of 16 bytes, even though it only has a size of 12 bytes?

Assuming that's the case, when writing a packed_float3 into a buffer on the GPU side, I think the shader compiler could similarly avoid copying the padding bytes.

  • Lifetime safety: [...]

Moved to https://github.com/madsmtm/objc2/pull/792#discussion_r2396029579.

  • Thread safety: Yes GPUs have atomics, but I'm not sure what would happen if you tried to read/write them from the CPU concurrently without appropriate synchronization. I don't expect torn values if they're cache-aligned, which they might anyway have to be?

I think I've handled that by marking all resources as unsafe because they need synchronization, see https://github.com/madsmtm/objc2/pull/792#issuecomment-3358234339.

  • Shader invariants: Perhaps functions and libraries should be safe, but executing them should not? That's a combination between setPipelineState: and dispatchThreads:/draw...:.

I guess we have a choice between:

  • Mark MTLLibrary creation / getting as unsafe.
  • Mark MTLFunction usage as unsafe (e.g. storing it inside a MTLComputePipelineStateDescriptor).
  • Mark encoding or similar as unsafe (e.g. setPipelineState:).
  • Mark submission of the commands to the GPU as unsafe (e.g. commit).

I feel like making MTLFunction usage unsafe was the most "local" of these (which is why that's what I've gone with in https://github.com/madsmtm/objc2/pull/792), but I don't actually have the experience here to back that up, so I might be wrong here?

All in all, mostly guesses and "it's probably unsafe in some way anyway" 😬

Well, much more informed guesses than mine, so definitely appreciated!

madsmtm avatar Oct 01 '25 22:10 madsmtm