zserio icon indicating copy to clipboard operation
zserio copied to clipboard

Missing map support

Open nlohmann opened this issue 6 years ago • 12 comments

Protobuf has a map type, see https://developers.google.com/protocol-buffers/docs/proto#maps. I have not found a similar mechanism in zserio. Though serialization should be easy (e.g., an array of pairs), I would also expect a nice API with some complexity guarantees such as logarithmic lookup. Did I overlook a similar language construct? If not, would this be on the roadmap?

nlohmann avatar Mar 12 '19 09:03 nlohmann

Yes, you are right. There is no map type or something similar in zserio.

Regarding roadmap, we are just about planning of new Milestone 1.3. If there is any particular need to have such map type, I would say, we can put this issue there.

mikir avatar Mar 12 '19 09:03 mikir

It would be valuable to have this modeling construct - especially since it is so common in similar languages like JSON or Protobuf.

nlohmann avatar Mar 12 '19 10:03 nlohmann

OK. It's in Milestone 1.3.

mikir avatar Mar 12 '19 10:03 mikir

We have a RnD CPP-only implementation of a generic template mechanism in our zserio fork. So you could easily generate a map as described above, but also leverage the functionality to create even more powerful structures.

There is also a smaller tutorial repo where we have checked-in a built version of the zserio parser already. Feel free to play around with it here: https://github.com/Klebert-Engineering/zserio-template We are currently evaluating the general approach of having generic templates in zserio, so if you are also interested it may make sense to merge this into the main zserio development after some testing. We currently still have some limitations but these could be eliminated once it gets accepted in the main zserio development.

fklebert avatar Mar 14 '19 08:03 fklebert

But when playing with it, keep in mind that a templated list of pairs does not sort the items thus lookup is still O(n) but (because of zserio runtime internals) insert is O(1)

0xDEAD avatar Mar 14 '19 09:03 0xDEAD

Maybe we should think about adding a registered callback which can be called before writing which could be used to resort, but still you need to implement the optimized lookup yourself. On the other hand, is it really worth the effort, I guess you won't be able to measure the lookup-time compared to deserializing a stream.

0xDEAD avatar Mar 14 '19 09:03 0xDEAD

Taking into account that people may want to have complete map functionality, including sorting and lookup, we may want to introduce a dedicated map still. Question would be what kind of key we would want to allow. Protobuf for example only allows ints and strings as keys, which makes sense in terms of keeping things simple.

fklebert avatar Mar 14 '19 11:03 fklebert

Allowing basetypes as keys only is not only "simpler" but also the only possible thing to do. Structs and/or choices are not guaranteed to be sortable in any way...

0xDEAD avatar Mar 14 '19 11:03 0xDEAD

... unless we introduce other means (which I would never recommend) which indicate sortable/comparable elements of a struct. But this is way beyond what zserio actually should be.

fklebert avatar Mar 14 '19 12:03 fklebert

Constraining the key type to strings rule out a lot of use cases, for instance mapping an enum to values.

As you already generate code for a hash code and a comparison operator, a naive implementation using std::unordered_map should be possible. (Naive in the sense that calculating the hash code could be expensive.)

Furthermore, you could generate an operator< for all types, so using std::map should be possible, too. Ordering primitive types should be straightforward, and complex types can be defined recursively, for instance like std::tuple::operator< is doing it.

nlohmann avatar Mar 14 '19 17:03 nlohmann

Side note: Allowing base types and enums as key types would already provide more than protobuf does.

And as far as I understood enums would not be that big of a deal? Maybe split this issue into two parts? First basetypes + enums, then structs where surely there will be a bit more discussion.

PeachOS avatar Sep 12 '19 11:09 PeachOS

Yes, this could be feasible. Thanks for a note. We will keep it in mind.

mikir avatar Sep 12 '19 12:09 mikir