pgrx
pgrx copied to clipboard
Incorrect size assumptions in `Array::as_slice()` mean it must be destroyed
Arrays currently can be converted into &[T] using as_slice, but in actuality their repr is boxed in a usize. Thus for types which are smaller than the machine's pointer width, this results in exposing various zero-initialized instances of the types in question, with tragic results when this is interacted with further.
This can be fixed by ceasing to use deconstruct_array to generate them, which resizes types like i32 into Datums.
To fix:
- [x] land tcdi/pgx#636
- [x] followup PR introducing null bitmap handling into Array, to allow not using
deconstruct_array - [ ] remove it
In the process of trying to creating a diff that goes from #662 to solving this, I have discovered that we can create slices with Rust types that exceed the size of their Postgres layout, and exceed 8 bytes, the maximum Datum size, at 16 bytes, while themselves representing "unboxed" datums. These include the recently introduced HeapTuple types, but there are almost certainly more.
This means there is currently no safe way to offer this function. In practice, even an unsafe version that mimicked this interface would not be correct, and would instead be a giant footgun. So I will instead use the recently introduced low-level interface tools to offer increasingly safe ways to do random access into the "flat array" representation and reduce the overhead of Array<'_, T>.
Hopefully in offering these tools it can make up for removing Array::as_slice entirely.
Nice detective work, @workingjubilee :)
Because this function is common in Rust code, it is usually sound, and there are probably some users of this function "in the wild", I have opened https://github.com/tcdi/pgx/pull/672 to begin a deprecation process rather than simply removing it outright.
- The bad news: It will now also panic if any of the values are considered "null" by Postgres, as it is quite probable that if a value is null it is not something that should be witnessed by Rust code.
- The good news: It now handles data types sized as 1, 2, and 4 bytes correctly, not just ones with 8 bytes! They actually expose a correctly-sized slice.
- The future news: I will work on an alternative function at some future point which describes a
NullableSlice<'_, T>that can be used to interact with data structures in Postgres with associated null mappings.
I think I was a bit panicky with "no safe way", but I think it's still a massive footgun. I think it might be safe for most cases at the moment, and if the bounds were tightened, I might be confident in its soundness. But as that will be a breaking change anyways, it's still going away, I just need something I can feel good about using to rewrite the existing as_slice tests with first, since they've unfortunately caught quite a few "improvements" to Array I attempted experimentally.
Followups in this vein:
- https://github.com/tcdi/pgx/pull/672
- https://github.com/tcdi/pgx/pull/881
- https://github.com/tcdi/pgx/pull/882