flatbuffers icon indicating copy to clipboard operation
flatbuffers copied to clipboard

Feature request, C++, support for std::variant / std::visit

Open ulfhammarqvist-st opened this issue 1 month ago • 6 comments

Parsing unions in c++ can be error prone, in the sense there's no compile time guarantee to handle all cases unless using some combination of the enum and corresponding x_as_y.

A more modern alternative is to use std::variants and std::visit, however flatc is not producing the required variants, or any means to convert to such (afaik).

... is this something that was ever discussed?

ulfhammarqvist-st avatar Nov 06 '25 12:11 ulfhammarqvist-st

there is a c++17 support flag for flatc c++ generation that this feature could be added to.

I've pulled the relevant generated functions from the Monster fbs file for disussion:

  MyGame::Sample::Equipment equipped_type() const {
    return static_cast<MyGame::Sample::Equipment>(GetField<uint8_t>(VT_EQUIPPED_TYPE, 0));
  }
  const void *equipped() const {
    return GetPointer<const void *>(VT_EQUIPPED);
  }
  template<typename T> const T *equipped_as() const;
  const MyGame::Sample::Weapon *equipped_as_Weapon() const {
    return equipped_type() == MyGame::Sample::Equipment_Weapon ? static_cast<const MyGame::Sample::Weapon *>(equipped()) : nullptr;
  }
  void *mutable_equipped() {
    return GetPointer<void *>(VT_EQUIPPED);
  }

...

template<> inline const MyGame::Sample::Weapon *Monster::equipped_as<MyGame::Sample::Weapon>() const {
  return equipped_as_Weapon();
}

I definitely agree that this portion of the interface feels a bit clunky.

maybe one possible implementation would be to generate the variant type for the union, and add a function for returning that directly

  using EquippedVariant =
      std::variant<std::monostate, const MyGame::Sample::Weapon*, const void*>;
  using MutableEquippedVariant =
      std::variant<std::monostate, MyGame::Sample::Weapon*, void*>;

  EquippedVariant equipped_variant() const {
    switch (equipped_type()) {
      case MyGame::Sample::Equipment_NONE:
        return std::monostate{};
      case MyGame::Sample::Equipment_Weapon:
        return equipped_as<MyGame::Sample::Weapon>();
      default:
        return equipped();
    }
  }

  MutableEquippedVariant mutable_equipped_variant() {
    switch (equipped_type()) {
      case MyGame::Sample::Equipment_NONE:
        return std::monostate{};
      case MyGame::Sample::Equipment_Weapon:
        return mutable_equipped_as<MyGame::Sample::Weapon>();  // it seems this currently isn't generated - need to investigate
      default:
        return mutable_equipped();
    }
  }

A couple of thoughts I have:

  1. we wouldn't want to remove the current functionality, so would need to name these functions accordingly to not collide
  2. we may want to support non standard variant/monostate types, similar to what is done for std::string
  3. I'm curious if we would want to have this generated by default with --cpp-std c++17 or if we would want to add an additional argument for variants --cpp-gen-variant

Would this fit your needs?

jtdavis777 avatar Nov 11 '25 17:11 jtdavis777

I don't use std::variant myself (find its usage too clumsy) but not against supporting it.. that said:

My main beef would be its inefficiency, as you're doing a double type switch to read the value. First a switch which essentially serves to translate from FlatBuffers enum to whatever value std::variant uses internally, and then you're going to have to select on that again.

Second, you are producing variants that rely on FlatBuffer pointer types. Using std::variant may be attractive if a code base already uses std::variant extensively, but raw FlatBuffer types tend to not travel far into a codebase typically, so I question how useful this is. I'd almost think std::variant would be more useful in the object API, since the intent of that API is for FlatBuffers generated objects to be longer lived and integrate more deeply with a codebase. But there it would require a special flag to not be a breaking change.

If this is to be added, I'd thing just adding to --cpp-std c++17 makes more sense to me, as its cost is mostly just more code.. people that don't call equipped_variant() don't pay for its cost.

Is the addition of void * in the variant purely to accomodate older code reading newer serialized data (that may have new unknown union members)? I'd think that having both this and the monostate doesn't help with the stated desire of "compile time guarantee to handle all cases".

aardappel avatar Nov 18 '25 22:11 aardappel

I was just looking for this as well, for the object API. I once did this conversion in a capnproto c++ object generator (repo), converting capnproto union types to an std::variant between the builder and object.

By matching the order of the discriminator enum and variant types, the variant index() will equal the enum integer value, so another field for the discriminator enum value is not needed. Type matchers are typically used on the variants anyway so the enum is not generally used.

nickolasrossi avatar Nov 18 '25 23:11 nickolasrossi

My main beef would be its inefficiency

sure. I think the intent with providing a variant is to provide a "flatbuffers canonical" method of going from the (seemingly internal) enumeration value to the contained object. While this does add in the inefficiency, its an 'easy button' for use cases where convenience trumps efficiency. end users can write a visitor with the overload pattern that looks reasonably clean. I think this would also pave the way for more fully constexpr uses of flatbuffers, (std::visit is fully constexpr capable in c++20) which is a feature set I personally think would be very interesting to pursue.

Second, you are producing variants that rely on FlatBuffer pointer types... Is the addition of void * in the variant purely to accommodate older code reading newer serialized data?

these two do go hand in hand, I was thinking that it was important to distinguish between "present but unknown" and "not present". But, old code can't really act on the pointer itself at all. So here's my attempt two, which aims to address some of these concerns.

  struct UnknownType {};

  temaplate<typename... Ts>
  using FlatbuffersVariant =
      std::variant<std::monostate, UnknownType, std::reference_wrapper<Ts>...>;
  
  using EquippedVariant = FlatbuffersVariant<const MyGame::Sample::Weapon>;
  using MutableEquippedVariant = FlatbuffersVariant<MyGame::Sample::Weapon>;

  EquippedVariant equipped_variant() const {
    switch (equipped_type()) {
      case MyGame::Sample::Equipment_NONE:
        return std::monostate{};
      case MyGame::Sample::Equipment_Weapon:
        return *equipped_as<MyGame::Sample::Weapon>();
      default:
        return UnknownType{};
    }
  }

  MutableEquippedVariant mutable_equipped_variant() {
    switch (equipped_type()) {
      case MyGame::Sample::Equipment_NONE:
        return std::monostate{};
      case MyGame::Sample::Equipment_Weapon:
        return *mutable_equipped_as<MyGame::Sample::Weapon>();  // it seems this currently isn't generated - need to investigate
      default:
        return UnknownType{};
    }
  }

As you said, the only real cost here is code, so while I think the spirit of this addition does fit the Object API better, I don't think it detracts from the original API for those who don't want to commit to the Object API.

jtdavis777 avatar Nov 18 '25 23:11 jtdavis777

The big upside to std::visit is a (better) chance to catch missing mappings at compile time, and the fix is to just to complete the visitor. Quite neat pattern when mutating a wide set of union members.

As far as my current use case is concerned, extending the object api with std::variant flavors for union access is perfectly acceptable. I think this could be either be achieved by a new c++ specific flag, or extending the native type approach to unions, but not sure exactly what that'd look like. (see https://github.com/google/flatbuffers/commit/599847236c35fa3802ea4e46e20e93a55d3a4a94.)

ulfhammarqvist-st avatar Nov 20 '25 08:11 ulfhammarqvist-st

I think using a flatbuffers::UnknownType would indeed be nicer than void *, then we can document what that stands for, it will be clearer to any programmer that encounters it, and force them to think about what they want to do in that situation (some code may want to ignore them, others log, yet others error..).

aardappel avatar Nov 23 '25 20:11 aardappel

alrighty - I'm getting close(ish) to having a branch up for this.

I still need to see why the mutable union functions aren't generated, and need to write tests/docs.

Here is some sample generated code:

tests/union_vector/union_vector.fbs

...

table Attacker {
  sword_attack_damage: int;
}

struct Rapunzel {
  hair_length: int;
}

struct BookReader {
  books_read: int;
}

union Character {
  MuLan: Attacker,  // Can have name be different from type.
  Rapunzel,         // Or just both the same, as before.
  Belle: BookReader,
  BookFan: BookReader,
  Other: string,
  Unused: string
}

...

generates

...

using CharacterVariant = ::flatbuffers::FlatbuffersVariant<const Attacker, const Rapunzel, const BookReader, const BookReader, const ::flatbuffers::String, const ::flatbuffers::String>;
using MutableCharacterVariant = ::flatbuffers::FlatbuffersVariant<Attacker, Rapunzel, BookReader, BookReader, ::flatbuffers::String, ::flatbuffers::String>;

...

  CharacterVariant main_character_variant() const {
    switch (main_character_type()) {
      case Character::NONE:
        return std::monostate{};
      case Character::MuLan:
        return CharacterVariant{std::in_place_index<1>, *main_character_as_MuLan()};
      case Character::Rapunzel:
        return CharacterVariant{std::in_place_index<2>, *main_character_as_Rapunzel()};
      case Character::Belle:
        return CharacterVariant{std::in_place_index<3>, *main_character_as_Belle()};
      case Character::BookFan:
        return CharacterVariant{std::in_place_index<4>, *main_character_as_BookFan()};
      case Character::Other:
        return CharacterVariant{std::in_place_index<5>, *main_character_as_Other()};
      case Character::Unused:
        return CharacterVariant{std::in_place_index<6>, *main_character_as_Unused()};
      default:
        return UnknownUnionType{};
    }
  }

  MutableCharacterVariant mutable_main_character_variant() {
    switch (main_character_type()) {
      case Character::NONE:
        return std::monostate{};
      case Character::MuLan:
        return MutableCharacterVariant{std::in_place_index<1>, *mutable_main_character_as_MuLan()};
      case Character::Rapunzel:
        return MutableCharacterVariant{std::in_place_index<2>, *mutable_main_character_as_Rapunzel()};
      case Character::Belle:
        return MutableCharacterVariant{std::in_place_index<3>, *mutable_main_character_as_Belle()};
      case Character::BookFan:
        return MutableCharacterVariant{std::in_place_index<4>, *mutable_main_character_as_BookFan()};
      case Character::Other:
        return MutableCharacterVariant{std::in_place_index<5>, *mutable_main_character_as_Other()};
      case Character::Unused:
        return MutableCharacterVariant{std::in_place_index<6>, *mutable_main_character_as_Unused()};
      default:
        return UnknownUnionType();
    }
  }

jtdavis777 avatar Dec 09 '25 14:12 jtdavis777

this code does make the assumption of a correctly formed flatbuffer -- if you have a type but no value this will try to create a reference of a nullptr.

jtdavis777 avatar Dec 09 '25 14:12 jtdavis777