feat(serde_with): make `BytesOrString` adjustable
At the moment, any Vec<u8> annotated with #[serde(as = "BytesOrString")] will always be serialised as an array of integers in the range 0–255. However, often these bytes are valid strings, the internal representation as Vec<u8> is often just an implementation detail. This PR adds a generic type parameter PREFERENCE, which must implement the new trait TypePreference, which is a subtrait of SerializeAs<[u8]>. This parameter defaults to the new marker-strut PreferBytes, which serialises Vec<u8> as an array like before. Two additional marker-structs are PreferString, which tries to convert &[u8] to &str first and will fallback to serialising as array only if the bytes do not represent a valid string, as well as PreferAsciiString, which additionally checks that all characters of the string are valid ASCII-characters, otherwise falling back to the array as well.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 67.59%. Comparing base (
2d70b70) to head (e030aa0). Report is 15 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #793 +/- ##
==========================================
+ Coverage 67.34% 67.59% +0.25%
==========================================
Files 40 40
Lines 2468 2509 +41
==========================================
+ Hits 1662 1696 +34
- Misses 806 813 +7
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Given this PR is #794 really necessary? The Bytes type does not really have any string association, but is primarily to enable calling the (potentially) optimized serialize_bytes function.
Given this PR is #794 really necessary? The
Bytestype does not really have any string association, but is primarily to enable calling the (potentially) optimizedserialize_bytesfunction.
It was more like an obvious addition to this PR. However, making this trait work for Bytes as well somewhat changed the serialisation behaviour of BytesOrString for the PreferBytes case: It will call serialize_bytes. Is this a breaking change?
It will call
serialize_bytes. Is this a breaking change?
Oh yeah, that is changing the behavior. In JSON it is not visible, as there is no efficient way to store binary data. There are some tests for Bytes using assert_tokens.
It will call
serialize_bytes. Is this a breaking change?Oh yeah, that is changing the behavior. In JSON it is not visible, as there is no efficient way to store binary data. There are some tests for
Bytesusingassert_tokens.
It’s not breaking for Bytes, I adjusted the SerdeAs<[u8]> implementation of PreferBytes to match the current behaviour. It has however changed it for BytesOrString though, which used source.serialize(serializer) before and uses serializer.serialize_bytes(source) now.
The suggested changes (serialize_bytes → serialize) would break #794. However, as you do not seem to be convinced of that PR anyway and neither is that PR very important to me – it just seemed reasonable to me to apply this to Bytes as well, but I guess, closing it and just proceeding with this PR here is the way to go?