zserio
Add less than operator to c++ emitted code
Request: Add a 'less than' operator to every generated C++ class. The comparison should compare the fields in the order in which they are declared.
Example:
struct A
{
T1 a;
T2 b;
T3 c;
};
bool A::operator<(const A& other) const
{
return std::tie(a, b, c) < std::tie(other.a, other.b, other.c);
}
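To make the request concrete, here is a minimal, self-contained sketch of the same idea. The struct below is a hypothetical stand-in with public members and made-up field types; it is not actual zserio output, which would go through generated accessors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a generated class (not real zserio output).
struct A
{
    int32_t a;
    std::string b;
    bool c;

    // Compare field by field, in declaration order.
    bool operator<(const A& other) const
    {
        return std::tie(a, b, c) < std::tie(other.a, other.b, other.c);
    }
};

// Sorts in place using A::operator<.
inline void sortByFields(std::vector<A>& items)
{
    std::sort(items.begin(), items.end());
}

// Small usage example.
inline void example()
{
    std::vector<A> items{{2, "x", false}, {1, "z", true}, {1, "y", true}};
    sortByFields(items);
    assert(items[0].b == "y"); // (1, "y", ...) precedes (1, "z", ...)
    assert(items[2].a == 2);   // the element with a == 2 sorts last
}
```

With such an operator in place, the class can also be used directly as a key in std::set or std::map.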
Background:
- when generating deltas between data for different versions of the same input, we want to keep the same sort order in the structures in order to keep the delta small. Since delta support is part of the standard, it makes sense to also support this in zserio.
- there are specific requirements on the order of elements in lists in the standard (for example, sorting by feature). The operator can help in this case as well.
- in manual testing it is helpful that data always appears in the same order (even across different versions of the same input data).
Without the support requested in this ticket, such a comparison must be implemented manually in code that uses the zserio generated classes, which takes considerable coding and maintenance effort.
Scope: This ticket is about the specific request for 'less than' in C++ emitted code. While working on it, you could also have a look at the following related items. They should not, however, block this issue or add much effort. If they do, I would advise handling them in a separate ticket.
- It may make sense to also support all other operators like >, !=, <=> (in case we compile for C++20). Note: for my use cases 'less than' would suffice. But some coding standards dictate that all or none of the comparison operators should be defined.
- '==' and hashCode() functions are already generated. But maybe have a look at changing the interface to allow direct use in std::unordered_map and std::unordered_set.
- add similar support for other languages supported by zserio
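As an illustration of the std::unordered_set item above, here is one possible way an application could bridge the gap today with a small adapter. This is only a sketch: the class Point, the adapter HashCodeAdapter, and the assumed signature `uint32_t hashCode() const` are hypothetical and stand in for whatever the generated code actually provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical stand-in for a generated class that already provides
// operator== and hashCode() (signature assumed for illustration).
class Point
{
public:
    Point(int32_t x, int32_t y) : m_x(x), m_y(y) {}

    bool operator==(const Point& other) const
    {
        return m_x == other.m_x && m_y == other.m_y;
    }

    uint32_t hashCode() const
    {
        return 31u * static_cast<uint32_t>(m_x) + static_cast<uint32_t>(m_y);
    }

private:
    int32_t m_x;
    int32_t m_y;
};

// Hypothetical adapter: forwards the standard hasher call to hashCode().
template <typename T>
struct HashCodeAdapter
{
    std::size_t operator()(const T& value) const
    {
        return static_cast<std::size_t>(value.hashCode());
    }
};

using PointSet = std::unordered_set<Point, HashCodeAdapter<Point>>;

// Small usage example.
inline void hashExample()
{
    PointSet points;
    points.insert(Point(1, 2));
    points.insert(Point(1, 2)); // duplicate, rejected via operator==
    points.insert(Point(3, 4));
    assert(points.size() == 2);
}
```

"Direct use" in the sense of the request would presumably mean that no such adapter has to be named, for example through a std::hash specialization emitted next to each class.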
OK. Good idea. Thanks for the comprehensive description.
After deeper investigation, we have two open points for discussion regarding this issue:
- Zserio supports non-integer types like string, bool, extern... These types do not fit a less than operator well. So the question is what Zserio should generate for these types. In other words, would a less than operator be intuitive for users at all?
- The less than operator will compare lexicographically: it compares the first fields; if they are equivalent, it compares the second fields; if those are equivalent, it compares the third fields, and so on. This does not help for delta compression in most cases, for example if the first field is a non-integer or if multiple fields are integers.
Notes to the previous comment:
- Technically, the less than operator can be implemented for all zserio types.
- Even if we implement it, it will not help for delta compression in most cases. Examples:
struct Foo1
{
string text;
int32 value;
};
The sorting will be done according to the first field in the structure, which is a string.
struct Foo2
{
int32 valuesWithBigDeltas;
int32 valuesWithSmallDeltas;
};
The sorting will be done according to the first field, but we want to sort according to the second field. Such information is not available to the zserio compiler; only the user application knows which field is more suitable for delta compression.
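The Foo2 case can be sketched from the application side as follows. The struct here is a hypothetical plain C++ mirror of the schema above (real generated code would use accessors), and the helper name sortBySmallDeltas is made up for illustration. It shows the application choosing the sort key itself instead of relying on a generated declaration-order operator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical plain mirror of the Foo2 schema (not real zserio output).
struct Foo2
{
    int32_t valuesWithBigDeltas;
    int32_t valuesWithSmallDeltas;
};

// Application-chosen order: second field first, first field as tie-breaker.
inline void sortBySmallDeltas(std::vector<Foo2>& items)
{
    std::sort(items.begin(), items.end(), [](const Foo2& lhs, const Foo2& rhs) {
        return std::tie(lhs.valuesWithSmallDeltas, lhs.valuesWithBigDeltas) <
               std::tie(rhs.valuesWithSmallDeltas, rhs.valuesWithBigDeltas);
    });
}

// Small usage example.
inline void foo2Example()
{
    std::vector<Foo2> items{{100, 3}, {-500, 1}, {900, 2}};
    sortBySmallDeltas(items);
    assert(items[0].valuesWithSmallDeltas == 1);
    assert(items[1].valuesWithSmallDeltas == 2);
    assert(items[2].valuesWithSmallDeltas == 3);
}
```

A generated operator< would instead order these elements by valuesWithBigDeltas, which is the mismatch described above.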
What's the latest status of adding a 'less than' operator to every generated C++ class? Is there any release plan for this?
Unfortunately, because of other important tasks, there is no plan for this.
But we can raise the priority of this topic and bring it up with others if you need it implemented within a reasonable time.
Thank you, mikir. It would be great if this could be moved forward quickly. If possible, we would like this task to be finished within a month or two.
- It may make sense to also support all other operators like >, !=, <=> (in case we compile for C++20). Note: for my use cases 'less than' would suffice. But some coding standards dictate that all or none of the comparison operators should be defined.
Because the other C++ operators (!=, >, <=, >=) were not directly requested, we have decided not to implement them, in order to keep the generated objects as simple as possible (they are already complicated enough). C++ users can, for example, use std::rel_ops::operator!=, >, <=, >= to overcome this limitation. In case of any strong need, please create a new issue.
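A minimal sketch of the std::rel_ops workaround, assuming a class that provides only operator== and operator<. The struct Key and the helper functions are hypothetical and stand in for a generated class. Note that std::rel_ops is deprecated as of C++20, where operator<=> is the usual replacement.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for a generated class with only == and <.
struct Key
{
    int32_t id;

    bool operator==(const Key& other) const { return id == other.id; }
    bool operator<(const Key& other) const { return id < other.id; }
};

// std::rel_ops derives operator> from operator<.
inline bool isGreater(const Key& lhs, const Key& rhs)
{
    using namespace std::rel_ops;
    return lhs > rhs;
}

// std::rel_ops derives operator<= from operator<.
inline bool isLessOrEqual(const Key& lhs, const Key& rhs)
{
    using namespace std::rel_ops;
    return lhs <= rhs;
}

// Small usage example.
inline void relOpsExample()
{
    const Key one{1};
    const Key two{2};
    assert(isGreater(two, one));
    assert(isLessOrEqual(one, two));
    assert(isLessOrEqual(one, one));
}
```

Keeping the using-directive inside a function body limits the generic operator templates to that scope.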
- '==' and hashCode() functions are already generated. But maybe have a look at changing the interface to allow direct use in std::unordered_map and std::unordered_set.
We have created separate issue for this request: #543.
- add similar support for other languages supported by zserio
We have created separate issues for them: #541 and #542.