Reflecting array members of aggregate structs

Open willwray opened this issue 6 years ago • 3 comments

Reflection of member array types is possible by refining magic_get methods.

Recursive scheme

Here's a scheme for finding an aggregate's member types one at a time:

Given an aggregate-initializable struct S:

  1. Assume initial member types T... of S are known (none to start) => S can be aggregate initialized as S { T_init... } with T_init braced or unbraced initialization as needed by T (scalars need unbraced init, arrays need braced init => awkward).
  2. Attempt aggregate initialization S { T_init..., ubiq_init } where ubiq is the usual convert-to-anything type. If this fails then T... is the complete list of members: Done.
  3. Else redo initialization (2) but capture the type L that ubiq is converted to (either by the 'Great Type Loophole' or by a typeid memo). If the member is an array type then L is its base (element) type. (On gcc, ubiq actually sees the full array type: a tempting anomaly.)
  4. Introspect the rank of the member whose base type L was found in (3): attempt initializations of S { T_init..., {L_init} } with increasing depth of nested braces => deduce the rank.
  5. For non-zero rank, i.e. array types, find the array bounds: e.g. for an array of rank 2, attempt initializations S { T_init..., {{Li...}} } and S { T_init..., {{Li}...} } with increasing numbers of Li => the maxima are the array bounds.
  6. Append the new type to the known initial member types T...
  7. Goto (1)
  • This scheme is not fully generic because of the need for both braced and unbraced init syntax in the comma-separated init list (variadic expansion of differing syntax in that context seems impossible).
  • Also, array rank deduction has an implementation-defined depth limit.
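
Steps (2) and (5) can be sketched in plain C++17 as below. This is a minimal illustration, not the proof-of-concept code mentioned in this issue: the names `ubiq_at`, `elem_at`, `inits_with`, `member_count`, `first_inits_with` and `first_member_bound` are hypothetical, the element type is passed in by hand instead of being captured via the Loophole, only the first member and rank 1 are handled, and the search is linear for brevity.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Convert-to-anything probe. Declared only: it is used purely in
// unevaluated contexts, so no definition is needed.
struct ubiq {
    template <class T> constexpr operator T() const noexcept;
};

template <std::size_t> using ubiq_at = ubiq;        // one ubiq per index
template <class L, std::size_t> using elem_at = L;  // one L per index

// Step 2: is S{ubiq, ubiq, ...} well-formed with sizeof...(I) initializers?
template <class S, class Seq, class = void>
struct inits_with : std::false_type {};
template <class S, std::size_t... I>
struct inits_with<S, std::index_sequence<I...>,
                  std::void_t<decltype(S{ubiq_at<I>{}...})>> : std::true_type {};

// Add one ubiq at a time until aggregate initialization fails.
template <class S, std::size_t N = 0>
constexpr std::size_t member_count() {
    if constexpr (inits_with<S, std::make_index_sequence<N + 1>>::value)
        return member_count<S, N + 1>();
    else
        return N;
}

// Step 5, rank 1 only: is S{ {L{}, L{}, ...} } well-formed when the braced
// list for the first member holds sizeof...(I) elements?
template <class S, class L, class Seq, class = void>
struct first_inits_with : std::false_type {};
template <class S, class L, std::size_t... I>
struct first_inits_with<S, L, std::index_sequence<I...>,
                        std::void_t<decltype(S{{elem_at<L, I>{}...}})>>
    : std::true_type {};

// Grow the braced list until it overflows; the last size that fits is the bound.
template <class S, class L, std::size_t N = 0>
constexpr std::size_t first_member_bound() {
    if constexpr (first_inits_with<S, L, std::make_index_sequence<N + 1>>::value)
        return first_member_bound<S, L, N + 1>();
    else
        return N;
}

struct Plain     { int a; double b; };     // no array members
struct WithArray { int v[3]; double d; };  // rank-1 array as first member

static_assert(member_count<Plain>() == 2);
static_assert(first_member_bound<WithArray, int>() == 3);
```

The linear recursion instantiates one template per probed size, so a large array would hit the compiler's template depth limit; a binary search over the bound avoids that at the cost of more involved code.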

Array vs Scalar init

Arrays and scalar types don't mix well in braced-init lists: they need non-uniform initialization, so supporting both means expanding both braced and unbraced initialization syntax. Again, given an aggregate-initializable struct S with known initial member type sequence T...:

  • Expanding unbraced S{ T{}... } fails for T = L[N] array type
    • array must be initialized with a brace-enclosed initializer.
  • Expanding braced S{ {T{}}... } fails for scalar
    • error: braces around scalar initializer

This is a 'Catch 22' situation in generating a suitable init-list.

Workarounds for non-uniform initialization syntax

For the initial member type initializations:

  • Separate T... into runs requiring braced or unbraced initialization:
  • Support pairs of unbraced runs Ui and braced runs Bi to some limit N:
    • T... -> Ts<U0...>, Ts<B0...>, ..., Ts<UN-1...>, Ts<BN-1...>
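
As a rough illustration of the run splitting, the sketch below counts how many (unbraced, braced) run pairs a member type sequence needs. The helper name `run_pairs` is hypothetical, and it assumes that only built-in array types need braces and that a run in a pair may be empty (for example, U0 is empty when the first member is an array).

```cpp
#include <cstddef>
#include <type_traits>

// Number of (unbraced run, braced run) pairs needed to cover T...
// Assumption: a member needs braced init exactly when it is a C array.
template <class... T>
constexpr std::size_t run_pairs() {
    // The trailing 'false' keeps the array non-empty when T... is empty.
    constexpr bool braced[] = {std::is_array_v<T>..., false};
    constexpr std::size_t n = sizeof...(T);
    std::size_t pairs = (n > 0) ? 1 : 0;   // the first pair starts at index 0
    for (std::size_t i = 1; i < n; ++i)
        if (braced[i - 1] && !braced[i])   // braced -> unbraced opens a new pair
            ++pairs;
    return pairs;
}

static_assert(run_pairs<bool, char[4]>() == 1);  // U0 = {bool}, B0 = {char[4]}
static_assert(run_pairs<char[4], bool>() == 2);  // U0 empty, B0, then U1
static_assert(run_pairs<bool, char[4], int, int[2][3], double>() == 3);
```

A type sequence is then within the limit when `run_pairs<T...>() <= N`.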

A different workaround would be to flatten all arrays and do all unbraced inits. For large arrays this explodes compile time and memory usage. Or, for a mixed workaround, fall back to flattening arrays only when necessary: if more than N pairs of runs are needed, selectively flatten smaller arrays to remove braced runs until only N pairs of runs remain.
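
Flattening relies on brace elision: the elements of an array member can be listed unbraced in the enclosing init-list. A small sketch with a hand-written example struct (the variable names are illustrative only):

```cpp
struct S { bool b; char c[4]; };

// Mixed syntax: unbraced scalar, braced array.
constexpr S mixed{true, {'a', 'b', 'c', 'd'}};

// Flattened: brace elision lets every array element appear unbraced,
// so the whole init-list uses a single, uniform syntax.
constexpr S flat{true, 'a', 'b', 'c', 'd'};

static_assert(mixed.b == flat.b);
static_assert(mixed.c[0] == flat.c[0] && mixed.c[3] == flat.c[3]);
```

The cost is visible even here: the flattened list needs one initializer per array element, so a member such as `int big[1000]` would need a thousand of them. Some compilers may also emit a missing-braces warning for the flattened form.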

Proof of Concept code

I have code for this scheme, written in C++17 plus -fconcepts for ease (so currently limited to gcc, though it does not use gcc's direct extraction of the array type). It is prototype / experimental code that has not been much exercised yet. It is around 500 LOC including comments. It uses the Loophole ubiq. It expands up to 4 array dimensions. It expands up to 4 pairs of unbraced, braced runs (the array-flattening fallback has not been implemented, so 4 is a hard limit). In principle, it should be possible to backport to C++14. I'm not motivated to backport or test with Clang / MSVC until they have concepts. Compile time for large arrays gets noticeable, likely because of the binary search for bounds.

Contribute to pfr / magic_get?

I'm happy to provide the code in current form, to see if it is suitable for inclusion. There seem to be more cons than pros though...

Cons: Arrays are awkward types, so they still need special-casing in generic client code. Don't want to encourage C-array usage - std::array is better if it can be used. Consumes more compile-time resources than the current 'precise' and 'flat' methods. It comes burdened with implementation-defined limits. The code is involved, so there's a maintenance cost. The backport is not trivial.

Pros: It's the only way I know to reflect array members. Because it's there.

Note on structured bindings and 'structured reflection'

Structured bindings get tantalisingly close to complete 'precise' struct reflection; if you know the member count in a struct then you can bind to and get all types. If you don't know then you have to count, but you can't directly count because you can't SFINAE on a structured binding failing, so you have to count indirectly. For an aggregate, this means counting via the init-list. We're back to init, innit. Egg-bound by the chickens.
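
For instance, once the member count is known to be 2, a structured binding recovers the member types directly, array member included. A minimal C++17 sketch, where `type_list` and `members_of_2` are hypothetical names and the count is hard-coded:

```cpp
#include <type_traits>
#include <utility>

template <class...> struct type_list {};

// Bind exactly two members and report their declared types.
// The count (2) has to be known up front: that is the part that
// cannot be discovered with SFINAE on the binding itself.
template <class A>
auto members_of_2(A& a) {
    auto& [m0, m1] = a;
    // decltype on a structured binding yields the referenced member type.
    return type_list<decltype(m0), decltype(m1)>{};
}

struct S { bool b; char c[4]; };

using S_member_types = decltype(members_of_2(std::declval<S&>()));
static_assert(std::is_same_v<S_member_types, type_list<bool, char[4]>>);
```

A full solution would need one such function per member count, selected by a count obtained indirectly through the init-list.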

I floated this template-matching syntax for 'structured reflection':

template <class A> struct destructure;

template <class A, auto... mp>
struct destructure<A::[mp...]> {
     using member_pointers = Members<mp...>;
};

struct S { bool b; char c[4]; };
using S_members = typename destructure<S>::member_pointers;

If only it were this simple.

willwray avatar Mar 12 '18 21:03 willwray

@willwray, I have some questions about your approach. Could we discuss them (by email or here, for example)? Could you show me your C++17 implementation of this scheme?

glebushka98 avatar May 22 '18 14:05 glebushka98