objc2 icon indicating copy to clipboard operation
objc2 copied to clipboard

Figure out bounds-safety in `objc2-metal`

Open madsmtm opened this issue 3 months ago • 9 comments

Figure out which bounds-affecting methods are safe to call (because they are checked on the CPU side).

See https://github.com/madsmtm/objc2/pull/792 and https://github.com/madsmtm/objc2/pull/773 for context.

madsmtm avatar Sep 29 '25 17:09 madsmtm

For context, you set unsafe-default-safety.not-bounds-affecting = true in #792. Which way do you expect to address all the false-positives? Should we set all those to .unsafe = false, or should we instead aim for not-bounds-affecting = false and set all the actual possible out-of-bounds (either CPU-side that were tested to fail, and anything that happens on the GPU) to .unsafe = true?

MarijnS95 avatar Sep 30 '25 13:09 MarijnS95

That depends on what the pattern for what's safe here actually is? Maybe you could write down some examples of methods that index on the CPU and methods that index on the GPU, and we can try to figure it out?

Btw, in response to:

not sure if Rust's safety model should extend to the GPU, if at least CPU-side encoding is fully safe.

I think we should extend it to that, in part because I suspect that'd be more useful for wgpu, in part because the CPU and the GPU can share memory nowadays, and that makes it increasingly important that code on the GPU is sound too.

madsmtm avatar Sep 30 '25 13:09 madsmtm

But I think it's likely we either aim for not-bounds-affecting = false, or we change the check itself to have fewer false positives.

madsmtm avatar Sep 30 '25 13:09 madsmtm

That depends on what the pattern for what's safe here actually is?

That was mostly concerning false-positives like setIndexType receiving a # Safety comment about bounds-checking, despite taking an enum constant:

https://github.com/madsmtm/objc2-generated/compare/8bb1a3b5096cc34d8889d8ddf1016434a770fe9d...a6560b215be4350e1327ea9793c1ed9a1dee5610#diff-480db922f3ee8593666ed843d3933b3111114d631e0458bcff48f66bab106931R475

setVertexFormat for example is marked safe in the same diff.

(Yes both enumerations affect the number of bytes that will be read from a buffer on the GPU, but by that logic pretty much all GPU state is unsafe 😬. Unless Metal validates it before submitting, creating an object from a descriptor, or something like that)


Maybe you could write down some examples of methods that index on the CPU and methods that index on the GPU, and we can try to figure it out?

Sure. Let's take setBuffer:offset:index::

  • offset is likely only applied to the buffer VA on the GPU (but there could be CPU-side validation);
  • atIndex likely updates a linear array on the CPU that encodes the descriptor for that virtual address, before later uploading it to the GPU?

resolveCounters receives specific indexes/ranges that are going to be "read" for their values to be written to a buffer, on the GPU. And it has a CPU equivalent.


Btw, in response to:

not sure if Rust's safety model should extend to the GPU, if at least CPU-side encoding is fully safe.

I think we should extend it to that, in part because I suspect that'd be more useful for wgpu, in part because the CPU and the GPU can share memory nowadays, and that makes it increasingly important that code on the GPU is sound too.

If someone can guarantee safety on the shader side, and we can guarantee state-safety on the encoding side (referring back to changing the index type on various acceleration structure descriptors for example), that make sense.


But I think it's likely we either aim for not-bounds-affecting = false, or we change the check itself to have fewer false positives.

So that implies already looking for and listing all .unsafe = true, rather than its opposite set (minus everything that wasn't a false-positive)?

MarijnS95 avatar Sep 30 '25 13:09 MarijnS95

So that implies already looking for and listing all .unsafe = true, rather than its opposite set (minus everything that wasn't a false-positive)?

Maybe? But the check could also fairly easily be changed to "only consider functions which take NS[U]Integer and NSRange, or something similar to those", if things like setIndexType/setVertexFormat are the only false positives?

Both are equivalent, it's only in terms of "what would make it the most maintainable". So I'd also be fine with listing everything manually with .unsafe = true, and then figuring out some better rules afterwards.

madsmtm avatar Sep 30 '25 13:09 madsmtm

Can't conclusively say, but might be interesting to test the type-check and see the diff? Right now it's too large to manually read all false-positives (by means of searching for # Safety).

it's only in terms of "what would make it the most maintainable".

Yes, explicitly listing everything that's unsafe seems like a lot of tracking/manual overhead if not-bounds-affecting = true already handles most of them. Then we can flag only false-positives and methods that were tested to have internal validation (also without validation layer?) to be .unsafe = false?

MarijnS95 avatar Sep 30 '25 13:09 MarijnS95

Can't conclusively say, but might be interesting to test the type-check and see the diff? Right now it's too large to manually read all false-positives (by means of searching for # Safety).

I think you can search for might not be bounds-checked, but yeah. I'll see if I can whip something up, gimme a minute.

Then we can flag only false-positives and methods that were tested to have internal validation (also without validation layer?) to be .unsafe = false?

I'd probably be more comfortable with that, yeah.

madsmtm avatar Sep 30 '25 13:09 madsmtm

Yeah that's what I meant, but it's already 121 hits with most files collapsed because of the large diff. IIRC it's also where most of the 4255-1865 additions come from?

Note again that the functions I just mentioned affect GPU buffer reads (for bounds checking) just as much as an offset or count/size 😬

MarijnS95 avatar Sep 30 '25 14:09 MarijnS95

I pushed a commit to my PR with this fix.

madsmtm avatar Sep 30 '25 14:09 madsmtm