
const_fn usage

birkenfeld opened this issue 6 years ago • 34 comments

I know it's hard/impossible right now, but are you tracking the ongoing const_fn changes to ensure that offset_of! can be used in constants at some point? It's usually very useful in tandem with size_of(), which is already const fn.
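[Editor's note: the desired end state can be sketched with a hypothetical struct; `std::mem::offset_of!`, stabilized much later in Rust 1.77, is used here as a stand-in for memoffset's macro.]

```rust
use std::mem::size_of;

// Hypothetical struct for illustration.
#[repr(C)]
struct Header {
    magic: u32,
    len: u16,
}

// The goal: an offset_of! usable in a const, in tandem with size_of.
// std::mem::offset_of! (stable since Rust 1.77) stands in for memoffset's macro.
const PAYLOAD_START: usize = std::mem::offset_of!(Header, len) + size_of::<u16>();

fn main() {
    // repr(C): magic at offset 0, len at offset 4; 4 + 2 = 6.
    assert_eq!(PAYLOAD_START, 6);
}
```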

birkenfeld avatar Oct 15 '18 07:10 birkenfeld

I sure am.

IIRC, const fns cannot receive pointers at the moment, which might block this feature, but I have yet to experiment with this.

Gilnaa avatar Oct 15 '18 07:10 Gilnaa

Status update: doing offset_of in const was possible until very recently, but only in a way that triggered UB. The const engine has since gotten better at detecting UB, so that trick no longer works.

We'll keep this issue open to track adding const support here once that's possible.

Also see the discussion starting here.

RalfJung avatar Aug 16 '19 10:08 RalfJung

I just bumped into this myself. I would love to have a solution.

thedrow avatar Sep 30 '19 10:09 thedrow

Cc https://github.com/rust-lang/rust/pull/63810

RalfJung avatar Oct 01 '19 01:10 RalfJung

https://github.com/rust-lang/rust/pull/63810 has landed; with that we should be able to allow const-fn usage on nightly. :)

RalfJung avatar Nov 03 '19 16:11 RalfJung

Should we feature-gate it further than just on nightly to avoid future breakages? Do you have an ETA for when it'll be stable?

Gilnaa avatar Nov 03 '19 17:11 Gilnaa

Should we feature-gate it further than just on nightly to avoid future breakages?

That's probably a good idea.

Do you have an ETA for when it'll be stable?

No.

RalfJung avatar Nov 03 '19 17:11 RalfJung

A'right, I'll try working on it tomorrow

Gilnaa avatar Nov 03 '19 17:11 Gilnaa

Lol, I was going to do the same. ;)

RalfJung avatar Nov 03 '19 17:11 RalfJung

@Gilnaa Go ahead then, I'll review your PR. ;) See this for one possible option; we'd have to adjust this a bit but I like putting the entire computation into an intermediate const. That way we can be sure that anything weird we do only happens at compile-time, not at run-time.
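[Editor's note: the intermediate-const pattern mentioned here can be sketched as follows. This is a minimal illustration, not memoffset's actual code; `offset_of_via_const!` is a hypothetical name, and `std::mem::offset_of!` (stable since Rust 1.77) stands in for the real computation.]

```rust
// Minimal sketch of the intermediate-const pattern: forcing the whole
// computation into a `const` item guarantees it happens at compile time,
// so anything "weird" it does never runs at run time.
macro_rules! offset_of_via_const {
    ($t:ty, $field:tt) => {{
        const OFFSET: usize = ::std::mem::offset_of!($t, $field);
        OFFSET
    }};
}

#[repr(C)]
struct Pair {
    a: u8,
    b: u32,
}

fn main() {
    // repr(C): a at offset 0, padding to align 4, b at offset 4.
    assert_eq!(offset_of_via_const!(Pair, b), 4);
}
```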

RalfJung avatar Nov 03 '19 18:11 RalfJung

I would be interested in using memoffset in a const fn context without having to use unstable features. Do you have a sense of how far along these features are in the stabilization process -- i.e. how much additional time and effort will be required to stabilize them?

I'm happy to ask on the individual tracking bugs for the features and do some work to get them stabilized, I just figured I should ask here first.

martinboehme avatar Sep 22 '21 09:09 martinboehme

I am not quite sure, @oli-obk might know better.

We currently need the following features: const_ptr_offset_from, const_maybe_uninit_as_ptr, const_raw_ptr_deref, const_refs_to_cell (the latter only if the struct you use with offset_of contains interior mutability).

RalfJung avatar Sep 23 '21 15:09 RalfJung

This is all still blocked on me finishing https://github.com/rust-lang/nomicon/pull/221, after which we can finally start stabilizing lots of unsafe const things

oli-obk avatar Sep 28 '21 13:09 oli-obk

I think I have a way to calculate this in stable CTFE that doesn't require pointer offset, it's just more expensive at compile-time: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=2910d9be2d2b7a1c9e584bde946306ee. The key here is that you can create a buffer that holds the bytes of the offset, ptr::addr_of! into the buffer as if it were a T, and do a *const u8 deref. No pointer math, but expensive-ish setup.

What are your thoughts, other than Deref coercion?

kupiakos avatar Mar 07 '22 18:03 kupiakos

What happens if the offset does not fit into a u8?

RalfJung avatar Mar 07 '22 19:03 RalfJung

It's calculated one byte at a time; see the current playground link:

const LEN: usize = size_of::<$t>();
// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u8; LEN],
}
// Calculate the least-significant to most-significant byte.
// There are many areas for optimization here.
// This is the simplest code for demo.

// Which byte of the output are we calculating?
let mut byte_index = 0;

// How many bytes we need to consider: [1, 2^8) = 1, [2^8, 2^16) = 2, [2^16, 2^24) = 3, etc.
let num_bytes = (usize::BITS - LEN.leading_zeros() - 1) / 8 + 1;

// Create a buffer long enough to store a `$t`
let mut offsets = Offsets { buffer: [0u8; LEN] };

// This is the raw bytes of the output offset, ordered from least to most significant byte.
let mut field_offset_bytes = [0u8; size_of::<usize>()];

// Loop through each byte of output to consider
while byte_index < num_bytes {
    // Fill the buffer with the considered byte for each index.
    // For example, an index of 2093 is the little-endian bytes [45, 8, 0, 0, 0, 0, 0, 0].
    // - When byte_index == 0, offsets.buffer[2093] = 45.
    // - When byte_index == 1, offsets.buffer[2093] = 8.
    // - For every other byte_index, offsets.buffer[2093] = 0.
    let mut i = 1;
    while i < LEN {
        let byte = i.to_le_bytes()[byte_index];
        unsafe { offsets.buffer[i] = byte };
        i += 1;
    }
    // Treat our buffer as a *const T, and offset into the field.
    let invalid = unsafe { addr_of!(offsets.uninit) } as *const $t;
    let field = unsafe { addr_of!((*invalid).$field) };

    // Retrieve the byte at the position of the field into the buffer.
    // That value will be the little-endian `byte_index`th byte of the output.
    let field_offset_byte = unsafe { *(field as *const u8) };
    field_offset_bytes[byte_index] = field_offset_byte;
    byte_index += 1;
}

// Build the full field_offset from the little-endian bytes we just determined.
let field_offset = usize::from_le_bytes(field_offset_bytes);
let invalid = unsafe { addr_of!(offsets.uninit) } as *const $t;
let field = unsafe { addr_of!((*invalid).$field) };
let field_length = size_of_ptr(field);
field_offset..(field_offset + field_length)

kupiakos avatar Mar 07 '22 20:03 kupiakos

Okay... I'm sorry, I have no idea how this code works, and it is quite hard to decipher. An English description of the algorithm and some comments would be really helpful. :)

That said, it doesn't seem like const_ptr_offset_from is terribly far from stabilization, so I don't think we should switch to such a complicated alternative.

RalfJung avatar Mar 07 '22 20:03 RalfJung

It just writes the offsets ([0,1,2,...]) in a byte array that overlaps with the struct, and then casts a pointer to a field to a u8 pointer to read that byte to get the offset.

But that breaks down for offsets above 255, so it repeats the process a few times to get the higher bytes of the offset. On the second pass the array holds 256 zeros, followed by 256 ones, etc., which yields the second byte of the offset. And so on. (With a small optimization that avoids unnecessary passes by using the base-2 log of the struct's length, computed from the number of leading zeros.)
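[Editor's note: a tiny worked example of the per-pass byte decomposition being described, using the offset 2093 that appears in the sample code's comments.]

```rust
// The offset 2093 equals 8 * 256 + 45, so in little-endian bytes the
// first pass of the algorithm recovers 45 and the second pass recovers 8.
fn main() {
    let offset: usize = 2093;
    let bytes = offset.to_le_bytes();
    assert_eq!(bytes[0], 45); // what the first pass reads out of the buffer
    assert_eq!(bytes[1], 8);  // what the second pass reads out of the buffer
    // Reassembling the per-pass bytes yields the original offset.
    assert_eq!(usize::from_le_bytes(bytes), 2093);
}
```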

m-ou-se avatar Mar 07 '22 21:03 m-ou-se

Ah, apologies, the multi-byte version was written in 10 minutes from the original single-byte version, so names and structure are non-ideal and intended for demo only 😅 Personally, I don't think it's that complicated; it's just non-idiomatic because of the restrictions on CTFE.

@m-ou-se's reading is correct; regardless, I added some comments in the sample. The algorithm is O(N) (N = size_of) as opposed to the pointer_offset_from version, which is O(1). My thought is that we could switch to this expensive-at-compile-time version behind an expensive_const flag that would work on stable for now. const_ptr_offset_from won't be available on stable (it's not yet stabilized on nightly) for at least 4-6 months from now, right?

kupiakos avatar Mar 07 '22 21:03 kupiakos

(It might be a bit confusing and easy to miss that 'bytes' refers to the bytes of the offset usize (so 8 bytes on 64-bit platforms), rather than the bytes of the struct.)

m-ou-se avatar Mar 07 '22 22:03 m-ou-se

Ah, I see. Wow! That's clever.

How does this approach handle the situation where the field type is a ZST and that is the last field of the struct (and there is no trailing padding)? If I understood things correctly, that requires using buffer: [u8; LEN+1]?

My thought is that we could switch to this expensive-at-compile time version with an expensive_const flag that would work on stable for now. const_ptr_offset_from won't be available in stable (not stabilized in nightly) for at least 4-6 months from now, right?

We would however be required to keep that version around even after const_ptr_offset_from is stabilized, due to backwards compatibility. I am not sure we want to maintain this code for so long, so I'd rather spend my energy on getting const_ptr_offset_from stabilized quickly. I think we could totally make it for the May 19 release (that requires getting the stabilization PR landed by April 7).

RalfJung avatar Mar 07 '22 22:03 RalfJung

Ah, I see. Wow! That's clever.

Thanks, that means a lot ❤️

How does this approach handle the situation where the field type is a ZST and that is the last field of the struct (and there is no trailing padding)? If I understood things correctly, that requires using buffer: [u8; LEN+1]?

You're correct - I've updated the playground link. The final algorithm will be more rigorously tested and will likely need further adjustments for edge cases I've not yet considered.
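[Editor's note: the edge case in question can be seen with a concrete struct. This is a hypothetical example; `std::mem::offset_of!`, stable since Rust 1.77, is used here to check the offset.]

```rust
use std::mem::{offset_of, size_of};

// Hypothetical struct whose last field is a ZST with no trailing padding:
// the field's offset equals the struct's size, so a marker buffer of exactly
// size_of::<Tail>() bytes has no index to write for it; hence `LEN + 1`.
#[repr(C)]
struct Tail {
    a: u32,
    z: [u8; 0],
}

fn main() {
    assert_eq!(size_of::<Tail>(), 4);
    // The ZST field sits one-past-the-end of the struct's bytes.
    assert_eq!(offset_of!(Tail, z), size_of::<Tail>());
}
```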

We would however be required to keep that version around even after const_ptr_offset_from is stabilized, due to backwards compatibility

Why? There's no change in behavior between this and a pointer-offset version (excepting compile time). My proposal:

  1. Add an improved version of this algorithm, gated behind an expensive_const feature
  2. Stabilize const_ptr_offset_from, wait until it's landed in stable (May 19 at the earliest)
  3. Keep the expensive algorithm and expensive_const feature until you update the Minimum Supported Rust Version of memoffset
  4. Remove the expensive algorithm and deprecate the expensive_const feature as a no-op, now that the non-const code works in const
  5. Remove the expensive_const feature entirely

My team's not on a strict timeline for the const-time offset-of, so we can definitely wait or fork until May, if the maintainers would rather not spend the energy on reviewing/maintaining a temporary workaround.

kupiakos avatar Mar 07 '22 23:03 kupiakos

until you update the Minimum Supported Rust Version of memoffset

I don't see us doing that though. memoffset is kind of a polyfill crate, so we want to be extremely conservative here. There's a reason our MSRV is still Rust 1.19 (and our code is littered with cfg to make use of newer features when available).

If @Gilnaa is fine landing this I won't object, but I think in the time it takes me to thoroughly review this, I can resolve https://github.com/rust-lang/miri/issues/1950, and then const_ptr_offset_from is just a stabilization PR away from being done.

RalfJung avatar Mar 07 '22 23:03 RalfJung

Hey @kupiakos , First of all, thank you, that's a very clever solution.

I admit I'm finding it a bit hard to accept it though, seeing as const_ptr_offset_from is around the corner.

Gilnaa avatar Mar 08 '22 18:03 Gilnaa

Hi @Gilnaa, thanks for the response. I understand if you'd rather not upstream and maintain this - after all, the solution only starts working in Rust 1.58 when *const T dereference was stabilized in CTFE, so it would only be a working solution for like, 3 or 4 stable versions.

I'd still like to offer this as an interim solution for those stable versions until const_ptr_offset_from is stabilized, even if it's just a short-lived link to this issue and demo code on the crates.io page. Maybe with some cleanup of the demo code. Would you be up for that?

kupiakos avatar Mar 09 '22 20:03 kupiakos

@kupiakos, that's a very interesting idea! I might be missing something obvious, but I think your code can be simplified to this:

use core::mem::{ManuallyDrop, MaybeUninit, size_of};
use core::ptr::addr_of;
const LEN: usize = size_of::<$t>();
// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u8; LEN + 1],
}

let mut offsets = Offsets { buffer: [0u8; LEN + 1] };
let invalid = unsafe { addr_of!(offsets.uninit) as *const $t };
let field = unsafe { addr_of!((*invalid).$field) };

// Set all bytes sequentially until we bump into the field pointer
let mut field_offset = 0;
while field_offset < LEN {
    unsafe { offsets.buffer[field_offset] = 1 };
    if unsafe { *(field as *const u8) } == 1 {
        break;
    }
    field_offset += 1;
}

field_offset..(field_offset + size_of_ptr(field))

EDIT: This can be enhanced further to do 8 bytes at once (at the cost of using up to 7 bytes more and doing some unnecessary work for offsets that are smaller than 8), but it doesn't seem to have any significant effect, at least for simple projects.

use core::mem::{size_of, ManuallyDrop, MaybeUninit};
use core::ptr::addr_of;

const LEN: usize = size_of::<$t>() / size_of::<u64>();
// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u64; LEN + 1],
}

let mut offsets = Offsets {
    buffer: [0u64; LEN + 1],
};

let field = unsafe {
    let invalid = addr_of!(offsets.uninit) as *const $t;
    addr_of!((*invalid).$field)
};

// Construct a native representation of the sequence [1,2,3...] that
// fits in a u64
let v = {
    let mut bytes = [0u8; size_of::<u64>()];
    let mut i = 1;
    while i <= bytes.len() {
        bytes[i - 1] = i as u8;
        i += 1;
    }
    u64::from_ne_bytes(bytes)
};

// Set 8 byte chunks sequentially until we bump into the field pointer
let mut buf_offset = 0;
while buf_offset <= LEN {
    unsafe { offsets.buffer[buf_offset] = v };
    if unsafe { *(field as *const u8) } != 0 {
        break;
    }
    buf_offset += 1;
}

// Calculate the final field offset by multiplying by the chunk size and
// adding the in-chunk offset of the field
let in_chunk_offset = unsafe { *(field as *const u8) as usize } - 1;
let field_offset = buf_offset * size_of::<u64>() + in_chunk_offset;

field_offset..(field_offset + size_of_ptr(field))

iscgar avatar Mar 10 '22 10:03 iscgar

At least in terms of the Rust aliasing rules, this is invalid: when you write to offsets.buffer[buf_offset], all references and pointers to that memory become invalid.
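[Editor's note: a minimal generic illustration of the rule being invoked, under the Stacked Borrows model as checked by Miri; this is a sketch, not the thread's code.]

```rust
// Under Stacked Borrows (as checked by Miri), a raw pointer derived from
// `buf` before a write through `buf` is invalidated by that write.
fn read_after_write() -> u8 {
    let mut buf = [0u8; 4];
    let stale = buf.as_ptr(); // derived before the write...
    buf[0] = 1;               // ...this write invalidates `stale`
    // unsafe { *stale }      // would be UB under Stacked Borrows
    let _ = stale;
    let fresh = buf.as_ptr(); // re-derive after the write instead
    unsafe { *fresh }
}

fn main() {
    assert_eq!(read_after_write(), 1);
}
```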

RalfJung avatar Mar 10 '22 18:03 RalfJung

Would getting the pointer after the assignment make it valid? i.e.:

// Set 8 byte chunks sequentially until we bump into the field pointer
let mut buf_offset = 0;
while buf_offset <= LEN {
    unsafe { offsets.buffer[buf_offset] = v };
    let field = unsafe {
        let invalid = addr_of!(offsets.uninit) as *const $t;
        addr_of!((*invalid).$field)
    };
    if unsafe { *(field as *const u8) } != 0 {
        break;
    }
    buf_offset += 1;
}

iscgar avatar Mar 10 '22 23:03 iscgar

Yes, that would work.

RalfJung avatar Mar 11 '22 13:03 RalfJung

@iscgar I like that idea better! It should have far fewer reads/writes on average. I think it can be simplified even further:

use core::mem::{size_of, ManuallyDrop, MaybeUninit};
use core::ptr::addr_of;

const LEN: usize = size_of::<$t>() / size_of::<u64>() + 1;
// Ensure our offsets buffer is aligned to `$t`
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<$t>>,
    buffer: [u64; LEN],
}

let mut offsets = Offsets {
    buffer: [0u64; LEN],
};

// Construct a native representation of the sequence [1,2,3...] that
// fits in a u64
let v = u64::from_ne_bytes(0x0807060504030201u64.to_le_bytes());

// Set 8 byte chunks sequentially until we bump into the field pointer
let mut buf_offset = 0;
loop {
    unsafe { offsets.buffer[buf_offset] = v };
    let field = unsafe {
        let invalid = addr_of!(offsets.uninit) as *const $t;
        addr_of!((*invalid).$field)
    };
    let in_chunk_position = unsafe { *(field as *const u8) as usize };
    if in_chunk_position != 0 {
        // We have hit the target chunk.
        // Calculate the final field offset by multiplying by the chunk size and
        // adding the in-chunk position of the field, subtracting by 1 because
        // in_chunk_position is 1-indexed.
        let field_offset = buf_offset * size_of::<u64>() + in_chunk_position - 1;
        break field_offset..(field_offset + size_of_pointee(field));
    }
    buf_offset += 1;
    if buf_offset >= LEN {
        panic!("could not locate field offset");
    }
}

Another nice advantage of this is that it doesn't need a special case for size_of::<$t>() == 0, so the macro is simpler.
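[Editor's note: the final technique above can be exercised at run time with a concrete struct. This is a sketch: `Sample` and `locate_field_offset` are hypothetical stand-ins for `$t`, `$field`, and the macro body, and run-time execution only approximates the intended CTFE behavior. `size_of_pointee` from the original is omitted; only the offset is computed.]

```rust
use core::mem::{size_of, ManuallyDrop, MaybeUninit};
use core::ptr::addr_of;

// Hypothetical concrete struct standing in for `$t`; `b` is the field
// whose offset we locate. repr(C): a at 0, b at 4, c at 8, size 12.
#[repr(C)]
struct Sample {
    a: u16,
    b: u32,
    c: u8,
}

const LEN: usize = size_of::<Sample>() / size_of::<u64>() + 1;

// Ensure the marker buffer is aligned to and overlaps a `Sample`.
#[repr(C)]
union Offsets {
    uninit: ManuallyDrop<MaybeUninit<Sample>>,
    buffer: [u64; LEN],
}

fn locate_field_offset() -> usize {
    let mut offsets = Offsets { buffer: [0u64; LEN] };
    // Native representation of the byte sequence [1, 2, ..., 8].
    let v = u64::from_ne_bytes(0x0807060504030201u64.to_le_bytes());
    let mut buf_offset = 0;
    loop {
        unsafe { offsets.buffer[buf_offset] = v };
        // Re-derive the field pointer after the write so it stays
        // valid under the aliasing rules.
        let field = unsafe {
            let invalid = addr_of!(offsets.uninit) as *const Sample;
            addr_of!((*invalid).b)
        };
        let in_chunk_position = unsafe { *(field as *const u8) as usize };
        if in_chunk_position != 0 {
            // in_chunk_position is 1-indexed, hence the `- 1`.
            break buf_offset * size_of::<u64>() + in_chunk_position - 1;
        }
        buf_offset += 1;
        assert!(buf_offset < LEN, "could not locate field offset");
    }
}

fn main() {
    assert_eq!(locate_field_offset(), 4);
}
```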

kupiakos avatar Mar 11 '22 17:03 kupiakos