Use OwnedBinaryData rather than vector<char> for encryption keys
What, How & Why?
This is a better semantic fit for the data, and will make it easier to automatically generate bindings.
☑️ ToDos
- [x] 📝 Changelog update
- ~~[ ] 🚦 Tests (or not relevant)~~
- ~~[ ] C-API, if public C++ API changed.~~
Marking as ready for review since I finally fixed all of the apple compile failures that I couldn't test locally. I'm pretty sure the failure in evergreen is unrelated to this change.
Something like
value_ptr<std::array<char, 64>>seems like it'd best express the type?
Perhaps. But std::array is still semantically a sequence type, which I would like to map to the SDK's Array type when generating bindings. For languages like JS that have separate types for Binary data vs sequences of bytes , I want this to clearly be the former (actually there isn't even a byte type in JS, so it would instead map to string[] or double[], both of which are clearly wrong).
I suppose I could make a dedicated type in the binding generator which maps to that C++ type, but maps to some other type in the SDK language, but that seems like overkill here. We previously were fine with having this be a variable-length container, why would that change now? Also, variable seems nice from a future-proofing standpoint, since it wouldn't need to change if we ever supported different key lengths.
We previously were fine with having this be a variable-length container, why would that change now?
Well this currently appears to be changing code for the sake of changing code. The current type is clearly not semantically ideal but works, and switching it to a different non-ideal type that happens to work is just pointless code churn. I'm not convinced that OwnedBinaryData is even a step in the correct direction. Encoding the required input size in the static type rather than having to document it would be one way to make it actually an improvement.
Based on the comments from @tgoyne, I am not entirely convinced that this PR adds value. @RedBeard0531 What is your current position here?
Sorry, I've been meaning to revive this, but it has been low on my priority list since I ended up working around it in the generator with a dedicated EncryptionKey type.
I think there were two issues 1) should we use a fixed-size type, and 2) how should we model the lack of a key.
For 1, I'm now more convinced that it would be wrong to use a fixed-size type, at least in the configuration struct. Using a variable-length type allows us to validate the length when doing the rest of config validation. That isn't possible when using a fixed-size type. That would be fine if all consumers were internal C++ code, but they aren't. I checked a few SDKs (JS, Swift, .Net), and in all of them, the user passes an encryption key as a variable-length byte array that we just propagate to the config. If we were to use a fixed-size type in the config struct, then we would force the SDKs to do their own length checks correctly prior to passing to core. That is both more work, and more room for missed checks.
For 2, I'm not convinced that it is worth having a distinction between null vs empty here. Most containers and views (including std::span and std::string_view) do not have this distinction. I can understand why it could be an issue to conflate them in C# because a null string really means that you don't have a string, so you can't do myNullString.IsEmpty(). But that isn't the case in C++ even with our BinaryData type. If you have a null BinaryData, you do still have a BinaryData object so you can call .empty() on it. Ideally, we wouldn't need the encryption_key_ptr and the lower levels of core would take a BinaryData rather than a blind, unsized char*, but I didn't want to push the refactor down further. Note that the c-api already treats empty keys as null. Going back to the user experience, why should we error if the user passes an empty Binary buffer in their config?
@jedelbo @RedBeard0531 @tgoyne Should this be moved forward, go into Draft or be closed?