simple-binary-encoding
[C#,C++] Generate DTOs for non-perf-sensitive use cases.
Overview
In some applications, performance is not critical. Some users would like to use SBE across their whole "estate" but don't want the "sharp edges" associated with flyweight codecs, e.g., usage not aligning with data lifetimes.
In this PR, I've added DTO generation for C# and C++.
I'm using property-based tests to gain confidence that the DTOs are working correctly. In particular, I'm checking the following property (albeit not exhaustively):
∀ msg ∈ MessageSchemas,
∀ encoding ∈ EncodingsOf(msg),
encoding = dtoEncode(dtoDecode(encoding))
I.e., for any message schema, `dtoEncode` is the inverse of `dtoDecode`, and the "round trip" preserves all information in the original encoding.
These tests run periodically rather than on every commit; however, I've tested out the CI job using a PR hook here.
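As a rough illustration of the property above, the following C++ sketch checks the round trip for one toy fixed-length message. `OrderDto`, `dtoDecode`, `dtoEncode`, and `roundTripHolds` are hypothetical stand-ins, not the generated code or the actual test harness. The sketch also fixes a single schema and uses a plain random loop in place of a property-based testing library, so it covers only the inner quantifier.

```cpp
#include <cassert>  // used by the usage line shown after this block
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <vector>

// Hypothetical message: a fixed 12-byte encoding holding a 64-bit id
// followed by a 32-bit quantity (host byte order, for brevity).
struct OrderDto {
    std::uint64_t id;
    std::uint32_t quantity;
};

constexpr std::size_t kEncodedLength = 12;

// Stand-in for decoding an encoding into a DTO.
OrderDto dtoDecode(const std::vector<std::uint8_t>& encoding) {
    OrderDto dto{};
    std::memcpy(&dto.id, encoding.data(), sizeof dto.id);
    std::memcpy(&dto.quantity, encoding.data() + sizeof dto.id, sizeof dto.quantity);
    return dto;
}

// Stand-in for encoding a DTO back into bytes.
std::vector<std::uint8_t> dtoEncode(const OrderDto& dto) {
    std::vector<std::uint8_t> encoding(kEncodedLength);
    std::memcpy(encoding.data(), &dto.id, sizeof dto.id);
    std::memcpy(encoding.data() + sizeof dto.id, &dto.quantity, sizeof dto.quantity);
    return encoding;
}

// Checks encoding == dtoEncode(dtoDecode(encoding)) for `trials` random encodings.
bool roundTripHolds(unsigned seed, int trials) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> byteValue(0, 255);
    for (int i = 0; i < trials; ++i) {
        std::vector<std::uint8_t> encoding(kEncodedLength);
        for (auto& b : encoding) {
            b = static_cast<std::uint8_t>(byteValue(rng));
        }
        if (dtoEncode(dtoDecode(encoding)) != encoding) {
            return false;  // the round trip lost or changed information
        }
    }
    return true;
}
```

A caller could run it as `assert(roundTripHolds(42, 1000));`. Fixing the seed keeps any failure reproducible.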
Implementation notes
The DTOs support encoding and decoding via the generated codecs using `static void EncodeWith(CodecT codec, DtoT dto)` and `static DtoT DecodeWith(CodecT codec)` methods.
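A minimal C++ sketch of the shape this takes, assuming a hand-written flyweight codec over a caller-supplied buffer. `CarCodec`, `CarDto`, `demo`, and the lower-camel-case `encodeWith`/`decodeWith` names are illustrative assumptions; the generated code may differ in naming and signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical flyweight codec: it owns no data and only reads and writes
// a caller-supplied buffer (4-byte serial number, then 2-byte model year).
class CarCodec {
public:
    explicit CarCodec(std::uint8_t* buffer) : buffer_(buffer) {}

    std::uint32_t serialNumber() const { return read<std::uint32_t>(0); }
    CarCodec& serialNumber(std::uint32_t v) { write(0, v); return *this; }

    std::uint16_t modelYear() const { return read<std::uint16_t>(4); }
    CarCodec& modelYear(std::uint16_t v) { write(4, v); return *this; }

private:
    template <typename T> T read(std::size_t offset) const {
        T v;
        std::memcpy(&v, buffer_ + offset, sizeof v);
        return v;
    }
    template <typename T> void write(std::size_t offset, T v) {
        std::memcpy(buffer_ + offset, &v, sizeof v);
    }
    std::uint8_t* buffer_;
};

// Hypothetical DTO: it owns its values, so it can outlive the buffer.
struct CarDto {
    std::uint32_t serialNumber;
    std::uint16_t modelYear;

    // Copies every field from the DTO into the codec's buffer.
    static void encodeWith(CarCodec& codec, const CarDto& dto) {
        codec.serialNumber(dto.serialNumber).modelYear(dto.modelYear);
    }

    // Copies every field out of the codec's buffer into a fresh DTO.
    static CarDto decodeWith(const CarCodec& codec) {
        CarDto dto{};
        dto.serialNumber = codec.serialNumber();
        dto.modelYear = codec.modelYear();
        return dto;
    }
};

// Usage: encode into a stack buffer, then decode into an independent value.
void demo() {
    std::uint8_t buffer[6] = {};
    CarCodec codec(buffer);
    CarDto::encodeWith(codec, CarDto{1234, 2013});
    const CarDto decoded = CarDto::decodeWith(codec);
    assert(decoded.serialNumber == 1234u);
    assert(decoded.modelYear == 2013);
}
```

The decoded `CarDto` is a plain value, so it stays valid after `buffer` is reused or goes out of scope, which is the lifetime "sharp edge" the DTOs are meant to remove.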
C# Representations
- Messages and composites are represented as immutable records.
  - `init` accessors are provided so that record expressions may be used, e.g., `x with { Y = Z }`.
  - An all-args constructor is defined to prevent the construction of records with missing fields.
  - The compiler-generated `ToString()` does not show what is inside groups etc.; therefore, we provide `ToSbeString()` as well.
- Groups are represented as `IReadOnlyList<GroupT>`.
- Added/optional primitives are represented as nullable types. `null` indicates the value is not filled. The reserved null value defined explicitly in the schema or implicitly by the SBE specification is not permitted for use within the DTOs, as this would lead to multiple representations of `null` in consuming application code. Both constructors and `init` accessors validate that values are in the allowed range.
- Added fixed-length data is represented through nullable reference types, e.g., `string?` and `IReadOnlyList<byte>?`. Missing data, e.g., due to the encoding version, is represented as `null`.
- Missing, added variable-length data is represented as an empty string or array, similarly to the codecs.
- Enums and bitsets use the existing codec representations, i.e., generated enums.
  - Missing bitset fields are represented as `0`.
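The optional-primitive rule can be sketched in C++ as follows; the C# DTOs express the same idea with nullable types and validating `init` accessors. `ModelYearField`, `toWire`, and `fromWire` are hypothetical names, and using `std::optional` is an assumption about how a C++ analogue might look. The sketch relies on the implicit SBE null value for `uint16`, which is 65535.

```cpp
#include <cassert>  // used by the usage lines shown after this block
#include <cstdint>
#include <optional>
#include <stdexcept>

constexpr std::uint16_t kNullValue = 65535;  // implicit SBE null for uint16
constexpr std::uint16_t kMaxValue = 65534;   // largest value a caller may set

// Hypothetical optional uint16 field as a DTO might hold it.
class ModelYearField {
public:
    ModelYearField() = default;  // starts out "not filled"
    explicit ModelYearField(std::optional<std::uint16_t> value) { set(value); }

    // Rejects the reserved null value so that "absent" has exactly one
    // representation in application code: an empty optional.
    void set(std::optional<std::uint16_t> value) {
        if (value && *value > kMaxValue) {
            throw std::out_of_range("value is outside the allowed range");
        }
        value_ = value;
    }

    std::optional<std::uint16_t> get() const { return value_; }

    // Maps "absent" to the reserved null value when encoding.
    std::uint16_t toWire() const { return value_.value_or(kNullValue); }

    // Maps the reserved null value back to "absent" when decoding.
    static ModelYearField fromWire(std::uint16_t raw) {
        return raw == kNullValue
            ? ModelYearField{}
            : ModelYearField{std::optional<std::uint16_t>{raw}};
    }

private:
    std::optional<std::uint16_t> value_;
};
```

With this shape, `ModelYearField::fromWire(65535).get()` is empty, while `set(65535)` throws, so the reserved value never reaches consuming code as an ordinary number.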
Other changes
- Use .NET `6.0` (LTS) rather than `3.x` for CI build and tests.
  - `sbe-dll` still targets the (quite ancient, ~2017) `.NET Standard 2.0` but no longer the (very ancient, ~2012) `.NET Framework 4.5`.
  - Consumers may need to upgrade from `.NET Framework 4.5` to a minimum of `.NET Framework 4.6.1`.
- Add daily slow build that incorporates (expensive) property-based tests.
- Property-based tests for JSON generation.
  - Shows problems relating to string escaping, unless we constrain the inputs (which we do).
- Adopt jvm-test-suites Gradle plugin.
  - This is the Gradle 9 compatible way of introducing new source sets, e.g., for property-based tests.
  - The "traditional" mechanisms for introducing source sets give deprecation warnings in Gradle 8.
- Fix what I believe is a bug where the default minValue for `float` and `double` is the minimum +ve representable value rather than the minimum -ve value.
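The floating-point pitfall in the last item has a direct C++ analogue, sketched below. This is not the generator code from this PR; it only shows why a "minimum" constant that is actually the smallest positive value makes a poor default lower bound. `inRange` and the constant names are hypothetical.

```cpp
#include <cassert>  // used by the usage lines shown after this block
#include <limits>

// Smallest positive normal float: greater than zero, so it is the wrong
// choice for a default lower bound.
constexpr float kSmallestPositive = std::numeric_limits<float>::min();

// Most negative finite float: the lower bound a default minValue needs.
constexpr float kMostNegative = std::numeric_limits<float>::lowest();

constexpr float kMostPositive = std::numeric_limits<float>::max();

// Range check with the lower bound as a parameter, so both choices can be compared.
constexpr bool inRange(float value, float minValue, float maxValue) {
    return value >= minValue && value <= maxValue;
}
```

With `kSmallestPositive` as the lower bound, `inRange(-1.0f, kSmallestPositive, kMostPositive)` is `false`, so every negative value (and zero) is rejected. With `kMostNegative` the same call returns `true`.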
Feedback to consider:
- Use records (C# 9+ only) for built-in equality and comparison operations.
- Be more idiomatic, e.g., `int?` rather than `int` and `NullValue`, but protect against having two `null` values in the setters.
@kieranelby, please can someone kick the tyres on your side and let me know if you'd prefer different representations?
To use it, you can supply `-Dsbe.csharp.generate.dtos=true` to the code generator.