memoffset
const_fn usage
I know it's hard/impossible right now, but are you tracking the ongoing `const_fn` changes to ensure that `offset_of!` can be used in constants at some point? It's usually very useful in tandem with `size_of()`, which is already a `const fn`.
I sure am.
IIRC `const fn`s cannot receive pointers at the moment and it might block this feature, but I have yet to experiment with this.
Status update: doing `offset_of` in `const` was possible until very recently, but only in a way that triggered UB. Now the const engine got better at finding UB, and that trick does not work any more.
We'll keep this issue open to track adding `const` support here once that's possible.
Also see the discussion starting here.
I just bumped into this myself. I would love to have a solution.
Cc https://github.com/rust-lang/rust/pull/63810
https://github.com/rust-lang/rust/pull/63810 has landed; with that we should be able to allow const-fn usage on nightly. :)
Should we feature-gate it further than just on nightly to avoid future breakages? Do you have an ETA for when it'll be stable?
> Should we feature-gate it further than just on nightly to avoid future breakages?

That's probably a good idea.
> Do you have an ETA for when it'll be stable?

No.
A'right, I'll try working on it tomorrow
Lol, I was going to do the same. ;)
@Gilnaa Go ahead then, I'll review your PR. ;)
See this for one possible option; we'd have to adjust it a bit, but I like putting the entire computation into an intermediate `const`. That way we can be sure that anything weird we do only happens at compile time, not at run time.
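(The "intermediate `const`" idea can be sketched in isolation: wrapping an expression in a `const` item inside a block forces it through the const engine, so any UB or overflow becomes a compile error rather than a runtime hazard. The `force_const!` macro name below is made up for illustration.)

```rust
// Hypothetical helper illustrating the pattern: the inner `const`
// guarantees compile-time evaluation of the expression.
macro_rules! force_const {
    ($e:expr) => {{
        const VALUE: usize = $e;
        VALUE
    }};
}

fn main() {
    // Evaluated by the const engine, not at run time.
    let x = force_const!(3 * 7);
    assert_eq!(x, 21);
}
```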
I would be interested in using `memoffset` in a `const fn` context without having to use unstable features. Do you have a sense of how far along these features are in the stabilization process, i.e. how much additional time and effort will be required to stabilize them?
I'm happy to ask on the individual tracking bugs for the features and do some work to get them stabilized, I just figured I should ask here first.
I am not quite sure, @oli-obk might know better.
We currently need the following features: `const_ptr_offset_from`, `const_maybe_uninit_as_ptr`, `const_raw_ptr_deref`, and `const_refs_to_cell` (the latter only if the struct you use with `offset_of` contains interior mutability).
This is all still blocked on me finishing https://github.com/rust-lang/nomicon/pull/221, after which we can finally start stabilizing lots of unsafe const things.
I think I have a way to calculate this in stable CTFE that doesn't require pointer offsets; it's just more expensive at compile time: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2910d9be2d2b7a1c9e584bde946306ee. The key here is that you can create a buffer that holds the bytes of the offset, `ptr::addr_of!` into the buffer as if it were a `T`, and do a `*const u8` deref. No pointer math, but expensive-ish setup.
What are your thoughts, other than `Deref` coercion?
What happens if the offset does not fit into a `u8`?
It's calculated one byte at a time; see the current playground link:
```rust
const LEN: usize = size_of::<$t>();

// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u8; LEN],
}

// Calculate the offset from the least-significant to the most-significant byte.
// There are many areas for optimization here; this is the simplest code for demo.

// Which byte of the output are we calculating?
let mut byte_index = 0;
// How many bytes we need to consider: [1, 2^8) => 1, [2^8, 2^16) => 2, [2^16, 2^24) => 3, etc.
let num_bytes = ((usize::BITS - LEN.leading_zeros() - 1) / 8 + 1) as usize;
// Create a buffer long enough to store a `$t`
let mut offsets = Offsets { buffer: [0u8; LEN] };
// These are the raw bytes of the output offset, ordered from least to most significant.
let mut field_offset_bytes = [0u8; size_of::<usize>()];

// Loop through each byte of output to consider
while byte_index < num_bytes {
    // Fill the buffer with the considered byte for each index.
    // For example, an index of 2093 is the little-endian bytes [45, 8, 0, 0, 0, 0, 0, 0]:
    // - When byte_index == 0, offsets.buffer[2093] = 45.
    // - When byte_index == 1, offsets.buffer[2093] = 8.
    // - For every other byte_index, offsets.buffer[2093] = 0.
    let mut i = 1;
    while i < LEN {
        let byte = i.to_le_bytes()[byte_index];
        unsafe { offsets.buffer[i] = byte };
        i += 1;
    }
    // Treat our buffer as a *const $t, and project to the field.
    let invalid = unsafe { addr_of!(offsets.uninit) } as *const $t;
    let field = unsafe { addr_of!((*invalid).$field) };
    // Retrieve the byte at the position of the field in the buffer.
    // That value is the little-endian `byte_index`th byte of the output.
    let field_offset_byte = unsafe { *(field as *const u8) };
    field_offset_bytes[byte_index] = field_offset_byte;
    byte_index += 1;
}

// Build the full field_offset from the little-endian bytes we just determined.
let field_offset = usize::from_le_bytes(field_offset_bytes);
let invalid = unsafe { addr_of!(offsets.uninit) } as *const $t;
let field = unsafe { addr_of!((*invalid).$field) };
let field_length = size_of_ptr(field);
field_offset..(field_offset + field_length)
```
Okay... I'm sorry, I have no idea how this code works, and it is quite hard to decipher. An English description of the algorithm and some comments would be really helpful. :)
That said, it doesn't seem like `const_ptr_offset_from` is terribly far from stabilization, so I don't think we should switch to such a complicated alternative.
It just writes the offsets ([0, 1, 2, ...]) into a byte array that overlaps with the struct, and then casts a pointer to a field to a `u8` pointer to read the byte there, which gives the offset.
But that breaks down for offsets of 256 or more, so it repeats the process a few times to get the higher bytes of the offset. The second time, the array holds 256 zeros, followed by 256 ones, etc.; that gives the second byte of the offset. And so on. (With a small optimization to not repeat more than necessary, using the base-2 log of the length of the struct, computed from the number of leading zeros.)
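(To make the single-byte version of the trick concrete, here is a non-macro sketch on a made-up `Demo` struct; it needs a recent stable toolchain, since raw-pointer derefs in `const` were only stabilized in Rust 1.58.)

```rust
use core::mem::{size_of, ManuallyDrop, MaybeUninit};
use core::ptr::addr_of;

#[repr(C)]
struct Demo {
    a: u16,
    b: u32,
}

const B_OFFSET: usize = {
    const LEN: usize = size_of::<Demo>();
    // A buffer with the size and alignment of `Demo`.
    #[repr(C)]
    union Offsets {
        uninit: ManuallyDrop<MaybeUninit<Demo>>,
        buffer: [u8; LEN],
    }
    let mut offsets = Offsets { buffer: [0u8; LEN] };
    // Write each byte's own index into it (only valid while the offset fits in a u8).
    let mut i = 0;
    while i < LEN {
        unsafe { offsets.buffer[i] = i as u8 };
        i += 1;
    }
    // Project to the field without creating a reference, then read one byte.
    let invalid = unsafe { addr_of!(offsets.uninit) } as *const Demo;
    let field = unsafe { addr_of!((*invalid).b) };
    unsafe { *(field as *const u8) as usize }
};

fn main() {
    assert_eq!(B_OFFSET, 4); // u32 field after a u16 under repr(C)
}
```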
Ah, apologies, the multi-byte version was written in 10 minutes from the original single-byte version, so the names and structure are non-ideal and intended for demo only 😅 Personally, I don't think it's that complicated; it's just non-idiomatic because of the restrictions on CTFE.
@m-ou-se's reading is correct; regardless, I added some comments to the sample. The algorithm is O(N) (N = `size_of`), as opposed to the `pointer_offset_from` version, which is O(1). My thought is that we could switch to this expensive-at-compile-time version behind an `expensive_const` flag that would work on stable for now. `const_ptr_offset_from` won't be available on stable (it's not even stabilized on nightly) for at least 4-6 months from now, right?
(It might be a bit confusing and easy to miss that "bytes" refers to the bytes of the offset `usize` (so 8 bytes on 64-bit platforms), rather than the bytes of the struct.)
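(In other words, iteration `byte_index` of the outer loop recovers byte `byte_index` of the offset's own little-endian representation. For the offset 2093 used in the sample's comments:)

```rust
fn main() {
    // 2093 = 8 * 256 + 45, so its little-endian usize bytes start [45, 8, 0, ...].
    let bytes = 2093usize.to_le_bytes();
    assert_eq!(bytes[0], 45);
    assert_eq!(bytes[1], 8);
    // Reassembling the bytes round-trips back to the offset.
    assert_eq!(usize::from_le_bytes(bytes), 2093);
}
```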
Ah, I see. Wow! That's clever.
How does this approach handle the situation where the `$field` type is a ZST and it is the last field of the struct (and there is no trailing padding)? If I understood things correctly, that requires using `buffer: [u8; LEN + 1]`?
> My thought is that we could switch to this expensive-at-compile-time version with an `expensive_const` flag that would work on stable for now. `const_ptr_offset_from` won't be available in stable (not stabilized in nightly) for at least 4-6 months from now, right?

We would, however, be required to keep that version around even after `const_ptr_offset_from` is stabilized, due to backwards compatibility. I am not sure we want to maintain this code for so long, so I'd rather spend my energy on getting `const_ptr_offset_from` stabilized quickly. I think we could totally make the May 19 release (that requires getting the stabilization PR landed by April 7).
> Ah, I see. Wow! That's clever.

Thanks, that means a lot ❤️
> How does this approach handle the situation where the `$field` type is a ZST and it is the last field of the struct (and there is no trailing padding)? If I understood things correctly, that requires using `buffer: [u8; LEN + 1]`?
You're correct; I've updated the playground link. The final algorithm will be more rigorously tested and will likely need further adjustments for edge cases I've not yet considered.
> We would however be required to keep that version around even after `const_ptr_offset_from` is stabilized, due to backwards compatibility

Why? There's no change in behavior between this and a pointer-offset version (excepting compile time). My proposal:
- Add an improved version of this algorithm, gated behind an `expensive_const` feature
- Stabilize `const_ptr_offset_from` and wait until it's landed in stable (May 19 at the earliest)
- Keep the expensive algorithm and the `expensive_const` feature until you update the Minimum Supported Rust Version of `memoffset`
- Remove the expensive algorithm and deprecate the `expensive_const` feature (leaving it with no effect), now that the non-`const` code works in `const`
- Remove the `expensive_const` feature entirely
My team's not on a strict timeline for the const-time offset-of, so we can definitely wait or fork until May, if the maintainers would rather not spend the energy on reviewing/maintaining a temporary workaround.
> until you update the Minimum Supported Rust Version of memoffset

I don't see us doing that, though. `memoffset` is kind of a polyfill crate, so we want to be extremely conservative here. There's a reason our MSRV is still Rust 1.19 (and our code is littered with `cfg`s to make use of newer features when available).
If @Gilnaa is fine landing this I won't object, but I think in the time it takes me to thoroughly review this, I can resolve https://github.com/rust-lang/miri/issues/1950, and then const_ptr_offset_from is just a stabilization PR away from being done.
Hey @kupiakos, first of all, thank you; that's a very clever solution.
I admit I'm finding it a bit hard to accept, though, seeing as `const_ptr_offset_from` is around the corner.
Hi @Gilnaa, thanks for the response. I understand if you'd rather not upstream and maintain this; after all, the solution only started working in Rust 1.58, when `*const T` dereference was stabilized in CTFE, so it would only be a working solution for three or four stable versions.
I'd still like to offer this as an interim solution for those stable versions until `const_ptr_offset_from` is stabilized, even if it's just a short-lived link to this issue and demo code on the crates.io page, maybe with some cleanup of the demo code. Would you be up for that?
@kupiakos, that's a very interesting idea! I might be missing something obvious, but I think your code can be simplified to this:
```rust
use core::mem::{ManuallyDrop, MaybeUninit, size_of};
use core::ptr::addr_of;

const LEN: usize = size_of::<$t>();

// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u8; LEN + 1],
}

let mut offsets = Offsets { buffer: [0u8; LEN + 1] };
let invalid = unsafe { addr_of!(offsets.uninit) as *const $t };
let field = unsafe { addr_of!((*invalid).$field) };

// Set all bytes sequentially until we bump into the field pointer
let mut field_offset = 0;
while field_offset < LEN {
    unsafe { offsets.buffer[field_offset] = 1 };
    if unsafe { *(field as *const u8) } == 1 {
        break;
    }
    field_offset += 1;
}

field_offset..(field_offset + size_of_ptr(field))
```
EDIT: This can be enhanced further to do 8 bytes at once (at the cost of using up to 7 more bytes and doing some unnecessary work for offsets smaller than 8), but it doesn't seem to have any significant effect, at least for simple projects.
```rust
use core::mem::{size_of, ManuallyDrop, MaybeUninit};
use core::ptr::addr_of;

const LEN: usize = size_of::<$t>() / size_of::<u64>();

// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u64; LEN + 1],
}

let mut offsets = Offsets {
    buffer: [0u64; LEN + 1],
};
let field = unsafe {
    let invalid = addr_of!(offsets.uninit) as *const $t;
    addr_of!((*invalid).$field)
};

// Construct a native representation of the sequence [1, 2, 3, ...] that
// fits in a u64
let v = {
    let mut bytes = [0u8; size_of::<u64>()];
    let mut i = 1;
    while i <= bytes.len() {
        bytes[i - 1] = i as u8;
        i += 1;
    }
    u64::from_ne_bytes(bytes)
};

// Set 8-byte chunks sequentially until we bump into the field pointer
let mut buf_offset = 0;
while buf_offset <= LEN {
    unsafe { offsets.buffer[buf_offset] = v };
    if unsafe { *(field as *const u8) } != 0 {
        break;
    }
    buf_offset += 1;
}

// Calculate the final field offset by multiplying by the chunk size and
// adding the in-chunk offset of the field
let in_chunk_offset = unsafe { *(field as *const u8) as usize } - 1;
let field_offset = buf_offset * size_of::<u64>() + in_chunk_offset;
field_offset..(field_offset + size_of_ptr(field))
```
At least in terms of the Rust aliasing rules this is invalid: when you write to `offsets.buffer[buf_offset]`, all references and pointers to that memory become invalid.
Would getting the pointer after the assignment make it valid? i.e.:
```rust
// Set 8-byte chunks sequentially until we bump into the field pointer
let mut buf_offset = 0;
while buf_offset <= LEN {
    unsafe { offsets.buffer[buf_offset] = v };
    // Re-derive the field pointer after each write to the buffer
    let field = unsafe {
        let invalid = addr_of!(offsets.uninit) as *const $t;
        addr_of!((*invalid).$field)
    };
    if unsafe { *(field as *const u8) } != 0 {
        break;
    }
    buf_offset += 1;
}
```
Yes, that would work.
@iscgar I like that idea better! It should have far fewer reads/writes on average. I think it can be simplified even further:
```rust
use core::mem::{size_of, ManuallyDrop, MaybeUninit};
use core::ptr::addr_of;

const LEN: usize = size_of::<$t>() / size_of::<u64>() + 1;

// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u64; LEN],
}

let mut offsets = Offsets {
    buffer: [0u64; LEN],
};

// Construct a native representation of the sequence [1, 2, 3, ...] that
// fits in a u64
let v = u64::from_ne_bytes(0x0807060504030201u64.to_le_bytes());

// Set 8-byte chunks sequentially until we bump into the field pointer
let mut buf_offset = 0;
loop {
    unsafe { offsets.buffer[buf_offset] = v };
    let field = unsafe {
        let invalid = addr_of!(offsets.uninit) as *const $t;
        addr_of!((*invalid).$field)
    };
    let in_chunk_position = unsafe { *(field as *const u8) as usize };
    if in_chunk_position != 0 {
        // We have hit the target chunk.
        // Calculate the final field offset by multiplying by the chunk size
        // and adding the in-chunk position of the field, subtracting 1
        // because in_chunk_position is 1-indexed.
        let field_offset = buf_offset * size_of::<u64>() + in_chunk_position - 1;
        break field_offset..(field_offset + size_of_pointee(field));
    }
    buf_offset += 1;
    if buf_offset >= LEN {
        panic!("could not locate field offset");
    }
}
```
Another nice advantage of this is that it doesn't need a special case for `size_of::<$t>() == 0`, so the macro is simpler.
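(Putting the chunked algorithm together on a concrete, made-up `Demo` struct, with the field pointer re-derived after each write as discussed above. This sketch needs a recent stable toolchain, since raw-pointer derefs in `const` were only stabilized in Rust 1.58.)

```rust
use core::mem::{size_of, ManuallyDrop, MaybeUninit};
use core::ptr::addr_of;

#[repr(C)]
struct Demo {
    a: u8,
    b: u64,
    c: u16,
}

const C_OFFSET: usize = {
    const LEN: usize = size_of::<Demo>() / size_of::<u64>() + 1;
    #[repr(C)]
    union Offsets {
        uninit: ManuallyDrop<MaybeUninit<Demo>>,
        buffer: [u64; LEN],
    }
    let mut offsets = Offsets { buffer: [0u64; LEN] };
    // A chunk whose bytes (in memory order) are [1, 2, ..., 8].
    let v = u64::from_ne_bytes([1, 2, 3, 4, 5, 6, 7, 8]);
    let mut buf_offset = 0;
    loop {
        unsafe { offsets.buffer[buf_offset] = v };
        // Re-derive the field pointer after the write, to satisfy
        // the aliasing rules.
        let field = unsafe {
            let invalid = addr_of!(offsets.uninit) as *const Demo;
            addr_of!((*invalid).c)
        };
        let in_chunk_position = unsafe { *(field as *const u8) as usize };
        if in_chunk_position != 0 {
            // in_chunk_position is 1-indexed within the chunk.
            break buf_offset * size_of::<u64>() + in_chunk_position - 1;
        }
        buf_offset += 1;
        if buf_offset >= LEN {
            panic!("could not locate field offset");
        }
    }
};

fn main() {
    assert_eq!(C_OFFSET, 16); // repr(C): u8 at 0, u64 at 8, u16 at 16
}
```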