cista icon indicating copy to clipboard operation
cista copied to clipboard

`cista::to_tuple` Fails to properly decompose Object With Array Type

Open TheFloatingBrain opened this issue 1 year ago • 3 comments

cista::to_tuple is giving this error

/root/workdir/Include/ThirdParty/cista/cista.h:2005:11: error: 13 names provided for structured binding
 2005 |     auto& [p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13] = t;
      |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/root/workdir/Include/ThirdParty/cista/cista.h:2005:11: note: while ‘const al’ decomposes into 4 elements

with this struct:

struct al {
  float a;
  alignas(8) char b;
  char bb;
  char arr[10];
};

Note: Struct was taken from here (I am trying to implement offsetof for all members in a struct).

P.s There are other cases that this does not work work with (base structs, structs with #pragma (pack, N) etc. with similar problems, do you want bug reports on those?

TheFloatingBrain avatar Aug 13 '22 02:08 TheFloatingBrain

In C++ it's better to use something like std::array<T, Size> instead of C-arrays. Cista offers cista::array<T, Size> for this purpose if you want deterministic serialization behavior.

It's a known issue that cista requires custom serialization or cista_members member function for structs that contain C-arrays because the aggregate initialization trick allows you to initialize each array entry separately, but it's still only one member variable.

I think if your struct looks like this, it should have the same properties (memory layout and performance wise) but it should work with all cista functionality such as cista::to_tuple:

struct al {
  float a;
  alignas(8) char b;
  char bb;
  cista::array<char, 10> arr;
};

I never tried using alignas in combination with cista. However since cista makes use of offsetof for serialization, it should work as long as offsetof returns the correct address.

felixguendling avatar Aug 13 '22 10:08 felixguendling

In C++ it's better to use something like std::array<T, Size> instead of C-arrays. Cista offers cista::array<T, Size> for this purpose if you want deterministic serialization behavior.

It's a known issue that cista requires custom serialization or cista_members member function for structs that contain C-arrays because the aggregate initialization trick allows you to initialize each array entry separately, but it's still only one member variable.

I think if your struct looks like this, it should have the same properties (memory layout and performance wise) but it should work with all cista functionality such as cista::to_tuple:

Okay, I may have to report back on this one, my use case involves computing the offset of members of a struct (POD'ish) so I can send interleaved vertex data to the GPU in OpenGL. If it does in fact have the same memory layout properties, I can certainly recommend to anyone using the library to use std::array or explicitly serialize their object type (or perhaps specialize for std::array type things, I am not sure I could go to &myStdArray and get the first element, as I cant call .data() GPU side).

I never tried using alignas in combination with cista. However since cista makes use of offsetof for serialization, it should work as long as offsetof returns the correct address.

On a personal note, as someone who spent a few days looking into this problem of how to implement a "cleaner" offsetof and looking at a lot of references, I am intrigued by cista'suse of offsetof/cista_member_offset, looking at a chunk like the one below

  auto const start = origin->empty()
                         ? NULLPTR_OFFSET
                         : c.write(static_cast<T const*>(origin->el_), size,
                                   std::alignment_of_v<T>);

  c.write(pos + cista_member_offset(Type, el_),
          convert_endian<Ctx::MODE>(
              start == NULLPTR_OFFSET
                  ? start
                  : start - cista_member_offset(Type, el_) - pos));

I did read the article on how cista basically works, but it looks like instead of trying to make a function that can take any member and compute offset of (at compile time?), you alias the members into a certain set of names while some how preserving memory layout?

Your work is very impressive by the way :v:

TheFloatingBrain avatar Aug 15 '22 15:08 TheFloatingBrain

Basically, what cista does is to first memcpy the data structure 1:1 into the serialization buffer (file / memory / mmap). This is done for every "entry point" - the root data structure as well as everything that's behind a heap allocated pointer (like vector, unique_ptr). You then always have a pair of pointer and offset: the origin of your data (original data structure you had in-memory and want to serialize) and the offset to the memcpy-ed data in the buffer. After that, cista iterates recursively through each member and does what's needed:

  • Base case (recursion ends): scalar values (only endianness conversion)
  • Recursive case struct: recursively call serialize(origin->member_, pos + offsetof(my_struct, member_))
  • Recursive case "container" (unique_ptr, vector, etc.): memcpy allocated contents to serialization buffer, call serialize(&vec[i], pos + i * sizeof(T))

Here, the offsetof(my_struct, member_) is required to compute the corresponding offset pos + offsetof(...) of the member variable in the buffer: pos corresponds to the start address where the struct was serialized, so pos + offsetof(my_struct, member_) yields the offset of the member in the serialization buffer which is required for the recursive serialize call or to apply endian conversion directly as shown in your example code.

&myStdArray

Maybe something like &my_array[0] could work. Since the data structures have the same memory layout, you could probably also try to use the C-array in OpenGL code. I only have experience with CUDA where you can use C++ data structures.

Thank you! :-)

felixguendling avatar Aug 16 '22 07:08 felixguendling