ponyc
ponyc copied to clipboard
Pointers to uninitialized memory via Array.cpointer and String.cpointer
cpointer can be called with an offset. This isn't checked to see if it is beyond the valid range for the array. I believe we should be checking and then return an error if the offset is outside the range.
Making cpointer partial is going to be a painful breaking change but I believe needs to be done.
A slightly less painful option would be to make cpointer not take an offset and leave it as non-partial and introduce a new partial method for getting cpointer's that does take an offset, checks to make sure it is valid and is partial.
I think the latter is preferable as making cpointer partial is going to wreck havoc everywhere.
There's no place in the standard library that uses cpointer with an offset other than 0, so I think splitting that functionality off is the best idea.
@SeanTAllen - I don't understand - maybe I'm missing something.
To elucidate, are you able to produce a code sample that reads from uninitialized memory without using FFI?
Since .cpointer() is only for use in the FFI, what would be the benefit of making the method partial? Even if it produced problems or exceptions in a C library, would that exception even be caught by the Pony runtime in a try .. end block?
I don't think that the .cpointer function attempts to read at that offset supplied to it, but rather probably just does simple pointer math and returns a "number" that is supposed to go into the FFI.
Right, that matches my understanding. And if that's true, then this is not a safety issue, in my opinion.
Though if Sean has something else to add for consideration, that's why I asked for a code sample that demonstrates unsafety.
I think this is problematic as I could disallow FFI to a 3rd party module and it could still use cpointer with an offset to get access to arbitrary memory to attempt to use in an attack.
Footguns like this aren't just footguns, they are vectors for staging novel attacks that we might not be able to imagine.
I could disallow FFI to a 3rd party module and it could still use cpointer with an offset to get access to arbitrary memory to attempt to use in an attack.
How could it "get access to" (which I'm interpreting as "read from and or write to") arbitrary memory if FFI is disabled?
I don't know but we've allowed a pointer to arbitrary memory from pony code so the first step is done. The point is, allowing access to arbitrary memory from pony is a first step in possible unknown exploits.
So, 1, we remove a foot gun. Two, we remove a point of incorrectness that open up avenues to try to exploit.
I don't know but we've allowed a pointer to arbitrary memory from pony code so the first step is done.
But no access to arbitrary memory is produced from Pony. The computed pointer is just a USize number that a user might pass into the FFI. They could also take an actual Pony USize and declare an FFI signature with a USize even though the C function expects a pointer. This kind of "possible unknown exploit" has imho nothing to do with Pony.
I think this is problematic as I could disallow FFI to a 3rd party module and it could still use cpointer with an offset to get access to arbitrary memory to attempt to use in an attack.
How would this work -- how could someone without access to FFI exploit the fact that they were able to compute a Pony Pointer object that would point to a wrong location? How would that "access" be achieved without FFI?
There would be a million Pony ways to generate pointers that when used with the FFI are unsafe. On the Pony side, it would be as easy as:
use "random"
actor Main
new create(e: Env) =>
let unsafe_ffi_pointer = Rand.usize()
but I do not see how that could be "exploited" in any further "steps" without the FFI. It does of course trivially become unsafe if I declare use @printf[None](fmt: USize, ...) and call @printf(unsafe_ffi_pointer, ...).
I think @jemc's question is reasonable since a change to builtin should in my opinion address a demonstrable risk or produce a demonstrable benefit.
@stefandd I've stated my case. You are asking me to make a different case. I'm not going to make a different case.
I mentioned this to Sean in a DM conversation just now, but for the record here:
I'd be okay with removing all forms of pure-Pony pointer offset math, as long as we provide builtin/intrinsic FFI functions that an FFI-enabled package can use to do zero-overhead pointer math for legitimate FFI purposes.
I think that would solve the perceived security issue (even if it currently is a bit too hypothetical to have me convinced yet at this time), while still retaining performance for legitimate (i.e. FFI-oriented) use cases of this feature.
What would this look like?
Something like this:
@pony_pointer_offset(array.cpointer(), some_number)
At any rate, I do want to change the title of this ticket a bit to make it more accurate - so far I think we don't know of any way to "read from" uninitialized memory using this technique - only "point to".
@stefandd I've stated my case. You are asking me to make a different case. I'm not going to make a different case.
@SeanTAllen In any case, thanks for reading my concerns. ..maybe a missed opportunity for me to learn something about Pony from any answers you might have given.
Just FYI, this also applies to Pointer.offset, which is also a public function.
(Reopened because I believe there's a good conversation to be had about this issue, even if it doesn't happen right now)
I'm in two minds about this one. For the specific cases of Array and String, we do know where the end of the allocation happens, and so we're in the right position to prevent handling a pointer to unitialized memory.
Leaving aside the question of what an attacker can do with a Pointer tag without FFI, the problem, in my view, is that we can't guarantee that a pointer will ever point to initalized memory, even if we place restrictions on the cpointer functions.
For example, an attacker can get hold of a pointer through the use of cpointer, then use the reserve or undefined functions to trigger a reallocation, which invalidates the previous memory allocation (right now Pony doesn't return any memory to the OS, so this pointer will for sure be reused in the future for another allocation)
actor Main
new create(env: Env) =>
let arr = Array[U8].create(8)
env.out.print(arr.cpointer().usize().string()) // 139673816024640
arr.reserve(arr.space() + 1)
env.out.print(arr.cpointer().usize().string()) // 139673816024736
Thinking about it more, I might be wrong in my previous comment: if my understating of the GC is correct, holding the previous pointer will prevent the previous allocation from being freed, so it won't be reused for future allocations until we drop our reference to it.
Your GC understanding is correct
We had a conversation about this after sync today (unfortunately it wasn't recorded). I agree with @jemc 's view of delegating this to an FFI call, if we make cpointer partial.
To answer @stefandd, my view on the difference between Pointer and a bare USize is an aesthetic one: excluding FFI, we can consider that, right now, a Pointer is either null, or a pointer to Pony-allocated memory. The current issue with cpointer messes with this view. With an USize, you don't have that limitation (and also you start to get into other issues where you assume that a pointer is the same as an integer, or that it has the same size as an USize, but that's a different concern).
It would be good to get @ergl, @jemc and I on a call to discuss this.
I think this is ready for someone to try implementing the idea
https://github.com/ponylang/ponyc/issues/4175#issuecomment-1212210469
in the standard library and see if it works and if anything weird arises. If anything feels wrong, we should discuss as soon as something starts to feel wrong.