rfc: deal with offsetof in some better ways

Open stsp opened this issue 3 years ago • 10 comments

The members of our template classes need to know their offset within the enclosing class. That offset should be passed as a template parameter, like this (non-working example):

template <int O>
struct B {
    static constexpr int off = O;
};

struct A {
    char a;
    B<offsetof(A, b)> b;
};

Which gives:

error: 'struct A' has no member named 'b'
   10 |     B<offsetof(A, b)> b;

Ah, obviously... Let's work around that, should be quite straight-forward:

template <int O>
struct B {
    static constexpr int off = O;
};

struct A {
    char a;
    char _mark[0];
    B<offsetof(A, _mark)> b;
};

Now we get that:

error: invalid use of incomplete type 'struct A'
   11 |     B<offsetof(A, _mark)> b;

Much worse... but not giving up just yet. Maybe the work-around is still possible if we implement offsetof() ourselves? Let's try:

template <typename T, char (T::*M)[0]>
struct offset_of {
    constexpr operator size_t() const {
        return (std::uintptr_t)&(((T*)nullptr)->*M);
    }
};
template <typename T, char (T::*M)[0]>
struct B {
    static const int off = offset_of<T, M>();
};

struct A {
    char a;
    char _mark[0];
    B<A, &A::_mark> b;
};

This actually worked until gcc-9, but now gives:

error: 'reinterpret_cast<char (*)[0]>(1)' is not a constant expression
   12 |     static const int off = offset_of<T, M>();

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49171 It also started to produce bad code with clang at around the same time support for it was removed from gcc...

Are we out of ideas? Hell no. :) This works:

template <int (*O)()>
struct B {
    static constexpr int off = O();
};

struct A {
    char a;
    static constexpr int off_b() { return offsetof(A, b); }  // this works even though b is declared after
    B<off_b> b;
};

Unfortunately it's a bit sub-optimal, as it requires creating a static member function for every member whose offset we want. The same was true of all the previously tried work-arounds, where we had to insert a dummy mark instead. But how about simplifying it a bit:

template <size_t (*O)(void)>
struct B {
    static constexpr int off = O();
};

struct A {
    char a;
    B<+[](){ return offsetof(A, b); }> b;
};

This should be equivalent to the previous work-around; it just exploits a C++20 feature, a lambda as a template argument (not supported in clang). Unfortunately we get this again:

error: 'struct A' has no member named 'b'
   11 |     B<[](){ return offsetof(A, b); }> b;

So offsetof works in a static constexpr member function, but not in a lambda... and that is quite sad.
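A likely explanation, sketched here under my own reading of the rules rather than anything confirmed in this thread: the body of a member function is a complete-class context, so the class is treated as complete inside off_b(), whereas a lambda written in a member declaration is a member of its own closure type and gets no such treatment for the enclosing class. The names Demo, size() and off_b() below are made up for illustration.

```cpp
#include <cstddef>

struct Demo {
    char a;
    // static constexpr std::size_t bad = sizeof(Demo);  // error: Demo is incomplete here

    // Member function bodies are compiled as if they were placed after the
    // closing brace, so Demo is complete here even though b comes later.
    static constexpr std::size_t size() { return sizeof(Demo); }
    static constexpr std::size_t off_b() { return offsetof(Demo, b); }

    char b;
};

static_assert(Demo::size() == sizeof(Demo), "size() sees the complete class");
static_assert(Demo::off_b() == offsetof(Demo, b), "offsetof works in the body");
```

This only shows the function-body half of the observation; the lambda half is the error quoted above.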

I've found this paper: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1278r0.html that seems to be aiming to solve a similar problem. But I am not sure how well it does so. In particular, it doesn't seem to consider offsetof in template arguments. It seems to use a runtime pointer to member instead, which looks surprising to me. Also, I don't quite understand what this means:

namespace std {
  template <class T>
  ptrdiff_t offset (T const& pmd) noexcept;
}

Why is the argument of offset() a reference to T and not to the member of T?

Anyway, I wonder if @slurps-mad-rips can give some insight/advice here. Overall, the state of offsetof() in C++ seems to be in very bad shape, so let's just see what can be done.

stsp avatar Nov 05 '20 16:11 stsp

@stsp hi! std::offset (whose name has been changed but I'll be fighting to have it be something better) won't be added to the standard until reflection is available as it'll be more powerful then. Regarding the paper, we weren't sure if all implementations would be able to give us a constexpr form, and if I ever get around to writing an R1 of the paper it'll be marked as constexpr at that time.

That said, when you get a pointer to member, it's always stored as either a ptrdiff_t or an int (under MSVC, which I believe isn't an issue here 🙂) internally, and these offsets are known at compile time anyhow.

The reason we take a reference to T is because you're expected to pass the entire pointer to member in one go, i.e.

auto x = std::offset(&A::b);

This also means that

auto x = &A::b;
auto y = std::offset(x);

is valid and thus we can also dynamically (in addition to statically) get the value of the offset of a member object pointer.
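To make the static/dynamic distinction concrete, here is a minimal sketch. It is not the proposed std::offset: member_offset is a hypothetical stand-in that needs an object and only works at run time, and the struct layout is an assumption chosen for illustration.

```cpp
#include <cstddef>

struct A {
    char a;
    int b;
};

// Hypothetical run-time helper (an assumption, not part of P1278): measures
// the distance between an object and one of its members.
template <class C, class M>
std::ptrdiff_t member_offset(const C& obj, M C::*pm) {
    return reinterpret_cast<const char*>(&(obj.*pm)) -
           reinterpret_cast<const char*>(&obj);
}

// A pointer to member is itself a constant expression...
constexpr int A::*pm = &A::b;
constexpr A sample{'x', 42};
static_assert(sample.*pm == 42, "member access through pm at compile time");

// ...and the same value can be stored and passed around at run time.
inline std::ptrdiff_t offset_of_b_at_runtime() {
    A obj{};
    int A::*dynamic_pm = pm;
    return member_offset(obj, dynamic_pm);
}
```

The point of the proposal, as I understand it, is to get the same number without the object and without the reinterpret_cast.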

That said if you're willing to rely on clang and gcc intrinsics and ignore other compilers like MSVC or intel, you can recreate P1278 via __builtin_memcpy. This isn't 100% accurate (i.e., play around with this) but you should be able to do (via C++17)

template <class> struct class_of;
template <class S, class T> struct class_of<S T::*> { using type = T; };
template <class T> using class_of_t = typename class_of<T>::type;

// Enabled only for pointers to data members of standard-layout classes.
template <class T, class = std::enable_if_t<std::conjunction_v<std::is_member_object_pointer<T>, std::is_standard_layout<class_of_t<T>>>>>
constexpr ptrdiff_t offset (T const& pmd) noexcept {
  static_assert(sizeof(pmd) == sizeof(int) or sizeof(pmd) == sizeof(ptrdiff_t));
  std::conditional_t<sizeof(pmd) == sizeof(int), int, ptrdiff_t> target { };
  // Copy the raw representation of the member pointer into an integer.
  __builtin_memcpy(std::addressof(target), std::addressof(pmd), sizeof(pmd));
  return target;
}

Given your attempts to implement offsetof above, I do hope this helps (and if not, at least you know about __builtin_memcpy being constexpr-capable now :P)

bruxisma avatar Nov 08 '20 02:11 bruxisma

@slurps-mad-rips thanks for the reply, I've got your idea now. But it doesn't seem to work for me:

template <typename T>
constexpr ptrdiff_t offset (T const& pmd) noexcept {
  static_assert(sizeof(pmd) == sizeof(int) or sizeof(pmd) == sizeof(ptrdiff_t));
  std::conditional_t<sizeof(pmd) == sizeof(int), int, ptrdiff_t> target { };
  __builtin_memcpy(std::addressof(target), std::addressof(pmd), sizeof(pmd));
  return target;
}

template <typename T, char (T::*M)[0]>
struct B {
    static const int off = offset(M);
};

struct A {
    char a;
    char _mark[0];
    B<A, &A::_mark> b;
};

c++ -std=c++20 -Wall offs6.cpp
offs6.cpp: In instantiation of 'const int B<A, &A::_mark>::off':
offs6.cpp:26:57:   required from here
offs6.cpp:14:34:   in 'constexpr' expansion of 'offset<char (A::*)[0]>(&A::_mark)'
offs6.cpp:8:19: error: '__builtin_memcpy(((void*)(& target)), ((const void*)(&<anonymous>)), 8)' is not a constant expression
    8 |   __builtin_memcpy(std::addressof(target), std::addressof(pmd), sizeof(pmd));
      |   ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Can you change my example in some way to get it working? :)

I think your idea works only in a run-time context. And in a run-time context the normal offsetof() also seems to work, so what are the advantages? Also, it seems that in a run-time context I can do return *reinterpret_cast<ptrdiff_t const*>(&pmd); instead of memcpy. But how do I get my constexpr example above to work with your technique?

stsp avatar Nov 08 '20 10:11 stsp

I think this might be a language version or compiler version issue. I copy pasted the code you showed above into godbolt and it compiled without issues from gcc 6.4-10.2 under C++17

https://godbolt.org/z/3bW6o4

bruxisma avatar Nov 08 '20 11:11 bruxisma

I think gcc just optimized things away as unused. Here is the full example (sorry, dunno how to create the godbolt URL):

#include <cstddef>
#include <iostream>
#include <memory>       // std::addressof
#include <type_traits>  // std::conditional_t

template <typename T>
constexpr ptrdiff_t offset (T const& pmd) noexcept {
  static_assert(sizeof(pmd) == sizeof(int) or sizeof(pmd) == sizeof(ptrdiff_t));
  std::conditional_t<sizeof(pmd) == sizeof(int), int, ptrdiff_t> target { };
  __builtin_memcpy(std::addressof(target), std::addressof(pmd), sizeof(pmd));
  return target;
}

template <typename T, char (T::*M)[0]>
struct B {
    static const int off = offset(M);
};

struct A {
    char a;
    char _mark[0];
    B<A, &A::_mark> b;
};
int main()
{
    A a;
    std::cout << "size " << sizeof(A) << " off " << a.b.off << std::endl;
    return 0;
}

stsp avatar Nov 08 '20 11:11 stsp

@slurps-mad-rips OK, got it to URL. :) https://godbolt.org/z/T6rPYa

So what do you think? Btw, I assume you are subscribed to that thread, so I don't need to always mention you. :)

stsp avatar Nov 08 '20 11:11 stsp

@slurps-mad-rips ping. :)

There are a few possibilities here:

  1. your proposal can work with my test-case, and the only thing needed is to show how exactly
  2. your proposal does not work in constexpr context, and so R1 is needed that does.

In case of 2 we can cooperate, to make sure all my needs are covered.

stsp avatar Nov 17 '20 12:11 stsp

Hi, apologies. Life got in the way. Thanks for the ping :)

Regarding your previous message it looks like they fixed the memcpy compile time that used to work. That's disappointing to be quite honest. Additionally clang seems to only support __builtin_bit_cast under C++20 mode :(

Regarding your statements however

  1. The proposal is most likely not going to work with your use case and I don't think we can backport this to C++17 (which means that my paper is probably needed even more).
  2. The current example I gave used to work in a constexpr context (I wouldn't have been able to approach several vendors to discuss it with them otherwise 😆), but it seems they've clamped down on the hack since. An R1 is not needed as the proposal is intended to change behavior to permit the desired behavior.

At this stage and without a proper implementation available, I think you have the following options

  1. File a bug with clang and gcc to get the old behavior back
  2. Wait for C++23 (this will most likely not be the path you take :P)
  3. Find a different way to architect this within the entire project which will be a massive refactor.

I sadly don't think I can help with this further :(

bruxisma avatar Nov 18 '20 17:11 bruxisma

Regarding your previous message it looks like they fixed the memcpy compile time that used to work.

In what gcc? On godbolt I wasn't able to find any gcc or clang where that works.

Additionally clang seems to only support __builtin_bit_cast under C++20 mode :(

In constexpr context?

An R1 is not needed as the proposal is intended to change behavior to permit the desired behavior.

But is your assumption documented in the standard, I mean this: in practice, all current vendors simply place the offset to said member inside its pointer to member as either a ptrdiff_t or an int. You formulate it as an implementation detail, not as something defined by the standard. If this is not defined in the standard, then why do you think they will permit the offsetof() impl to rely on this?

File a bug with clang and gcc to get the old behavior back

I'd like to, but on godbolt I haven't found "how old". :)

Wait for C++23 (this will most likely not be the path you take :P)

I am perfectly fine with this if there is a paper with the solution I should wait for. Which is why I do propose R1, because I think your current proposal will not work for my case, but I may be wrong on that.

I sadly don't think I can help with this further :(

Well, the idea here is not (only) to help me, but rather to have a proposal for c++23 that will help me in 2023. My current reservations are:

  1. The offset-storing behaviour you rely on, may not be standard-defined. For example you can't reinterpret_cast the member pointer to ptrdiff_t or whatever alike, even in runtime context (not to speak about constexpr), and to me that says a lot. Why bit_cast will?
  2. bit_cast (or memcpy or whatever you rely on) may not work in a constexpr, and I don't see your paper asking otherwise. It rather says: a discussion is needed on whether std::offset should be constexpr, and whether std::bit_cast should permit constexpr casting of pointer to member data from Standard Layout classes to a ptrdiff_t.

And here you are, a discussion whether it should be a constexpr. :) But why do you think that std::bit_cast will be permitted to apply to the member pointers at all?

stsp avatar Nov 18 '20 19:11 stsp

Hey apologies, it's been a minute since I responded.

You formulate it as an implementation detail, not as something defined by the standard. If this is not defined in the standard, then why do you think they will permit the offsetof() impl to rely on this?

They won't for offsetof(), as that's part of the C standard and we happen to import it. For a new function however (and because C++ implementations all do the same thing for their ABI), we can guarantee this behavior. What the standard lets us do is take existing implementation details and (if they are "the same") word the standard so that these implementation details become standard. This is why integers weren't technically two's complement until recently, but when a survey was done we found that all other implementations of integers had in fact been so bad for performance that they were never tried again, or no one is supporting them at the software level these days.

I'd like to, but on godbolt I haven't found "how old". :)

Now I'm wondering if my constexpr example I had in 2018 was ever "correct". 🤔

The offset-storing behaviour you rely on, may not be standard-defined. For example you can't reinterpret_cast the member pointer to ptrdiff_t or whatever alike, even in runtime context (not to speak about constexpr), and to me that says a lot. Why bit_cast will?

You can "cast" it, but you use memcpy instead of a cast. A cast is a lightweight "reinterpret this data as X" request to the compiler to move around the type system. A memcpy is "in a non-aliasing way, please write these bytes to this other representation". It's small loopholes like these that allow us to do such things. However you need to verify with a given implementation that what you're doing is "correct". Implementations won't change their ABI but if a new platform is released with a new ABI (in this case ABI refers to a calling convention), you'd have to check. As for why bit_cast will, it's because the committee said so and implementations will follow it.

bit_cast (or memcpy or whatever you rely on) may not work in a constexpr

bit_cast is already in the standard for C++20 (but can't be used for member pointers at this time). It's just a way to codify some forms of memcpy into a "reinterpret these bytes as some other bytes" with little to no overhead. My paper doesn't mention it in the R0 because we weren't sure yet. However, after a discussion in both SG7 and LEWG (the notes of which require password access that I am not at liberty to give), we did realize it could be a constexpr function. That will be in the R1 as I stated above.

But why do you think that std::bit_cast will be permitted to apply to the member pointers at all?

If I ever expressed that, I apologize. It's quite the opposite. The wording requires that no member pointers are usable with it, hence why the "modern offset" paper exists.

bruxisma avatar Dec 14 '20 18:12 bruxisma

However, after a discussion in both SG7 and LEWG (the notes of which require password access that I am not at liberty to give), we did realize it could be a constexpr function.

OK, so if I understand you correctly, you make the following claims:

  • bit_cast will be allowed in constexpr (when?)
  • memcpy will be allowed in constexpr (when? or when was it, since you implied it was in the past)
  • pointers to data members will be standardized to be represented as an offset (when?)
  • pointers to data members will be represented as an offset (and standardized that way) even in constexpr context

This is just the list of requirements that seems to be sufficient for your approach to work, or at least that's my understanding. It would be cool if you could clarify which of these are valid, and when they are expected to be standardized (or maybe already are). Or do you mean they will all materialize in C++23? If so, are there pending proposals for every statement, or will all (or part) of these be in your R1? I am asking simply because I see no evidence that this will materialize. So if you have the URLs to the proposals, please let me know.

If I ever expressed that, I apologize. It's quite the opposite.

No, it was just my misunderstanding. Am I right that you are saying only memcpy() would be required for that trick, and bit_cast won't work that way? If it's not too difficult, could you clarify why bit_cast is any worse than memcpy for that task? I mean, right now it's quite obvious why, but if pointers to data members are standardized to be represented as offsets, then why not just allow bit_cast or even static_cast to do the work?

stsp avatar Dec 14 '20 22:12 stsp

I tried to language-lawyer the constexpr uses of offsetof() here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111923 Unfortunately, in the end they just opened an "unfriendly" DR against the C++ standard to disallow that. :( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111923#c9 Of course I went to that DR and tried to block it there too. Not sure why they are trying so hard to disallow every constexpr use of offsetof() that I find.

stsp avatar Oct 28 '23 07:10 stsp

@zygoloid has informed me that we can now write:

template <typename T, auto O>
struct B {
        static constexpr int off = O.template operator()<T>();
};
struct A {
        char a;
        B<A, []<typename T>() constexpr { return offsetof(T, b); }> b;
};

Which basically gives a working offsetof() in templates.

@AntonBikineev You might be interested to look at that gem too. :)

stsp avatar Feb 05 '24 09:02 stsp

And that doesn't work on Focal, which has clang-10. So I disabled the Focal build for now...

stsp avatar Feb 05 '24 13:02 stsp

You can make the technique a little cleaner by using a default template argument:

template <auto O>
struct B {
        static constexpr int off = O();
};
struct A {
        char a;
        B<[]<typename T = A>() constexpr { return offsetof(T, b); }> b;
};

zygoloid avatar Feb 05 '24 19:02 zygoloid

Done, thanks! This way we can have a macro:

#define offset_of(p, n) []<typename T = p>() constexpr { return offsetof(T, n); }

which can be part of some library (like boost), because I don't think too many people can come up with this solution themselves. Not sure if it can be wrapped into something other than a macro (e.g. a helper class) - probably not.

stsp avatar Feb 05 '24 20:02 stsp

I wonder what the future of offsetof() in C++ will be. Even if @zygoloid can find sophisticated work-arounds, it's not for mere mortals. Has anything come out of this thread? https://groups.google.com/a/isocpp.org/g/std-proposals/c/e7eWt79103g It seems to propose valuable offsetof() extensions for C++, like getting an offset via the member pointer, which is also proposed in this paper: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1278r0.html

But even with that extension, we'd still need to write something like this:

struct A {
    char a;
    B<A, []<typename T = A>() constexpr { return &T::b; }> b;
};

because otherwise you can't easily pass the pointer to member without extending the complete-class context rules... So all the extension proposals I could find, including this one: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0908r0.html are insufficient. Does this mean offsetof() will stay in the dust forever, or will there be efforts, including this one: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2883r0.pdf to get it into a somewhat functional state?

stsp avatar Feb 06 '24 11:02 stsp