add encode/decode methods that work with [UInt8]
I am using this library in my Vapor backend app, which communicates with a Swift frontend app (not on any Apple platform). It is working great so far and I love it. However, it is a common pattern to use raw [UInt8] in these environments. I can go through Data, but that incurs an unnecessary performance penalty. Would it be possible to add a way to avoid creating a Data instance on every encode/decode?
Bonus points for adding support for Span as well :)
Thanks for your interest in the library.
There are a few relevant points regarding your suggestion:
API
- The API of BinaryCodable is aligned with existing encoders like JSONEncoder
- Convenience functions could be added, but they reduce clarity and simplicity
Performance
- Convenience functions that return and consume [UInt8] instead of Data would just convert between the two, creating the same performance penalty
- I assume (but have not measured) that the internal copying and concatenating of chunks is the major driver for encoding inefficiencies. I guess that an additional "copy" to [UInt8] will not make a large difference
- BinaryCodable is much slower for encoding and decoding than e.g. JSONEncoder. If computational performance is a bigger issue than storage efficiency, then it may be worth looking into other options.
- Switching the internals to [UInt8] would be feasible, but the public API still needs to provide Data functions, shifting the performance penalty to all other users.
- Rewriting the internals to use [UInt8] is some effort, and I'm not sure that there will be a measurable increase in performance from that alone.
Internals
Internally, the encoder works with small chunks of Data while constructing the binary, instead of a single continuous storage that is filled. This is due to several reasons:
- Codable allows quite some flexibility when defining the encoding process. E.g. it's possible to create multiple (nested) containers and write to them in arbitrary order.
- Some elements have variable lengths, so it's not possible to determine in advance how many bytes are needed. E.g. the size of the length indicator for an encoded array can only be determined once all elements are encoded. For few elements, it will be a single byte, but larger arrays require a larger size encoding.
- There are advanced options like sorting the encoded values by their keys, which require rearranging of the encoded chunks.
This means there is quite some internal data manipulation happening (depending on the complexity of the encoded types), which is hard to get rid of.
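To make the variable-length point concrete, here is a minimal sketch of a length-prefixed array encoding. It assumes a generic base-128 varint for the length indicator; the helpers `varintLength` and `encodeArray` are hypothetical and are not taken from BinaryCodable's source, whose actual wire format may differ. The point is only that the prefix size depends on the finished payload, so elements have to be encoded into separate chunks before the final buffer can be assembled.

```swift
import Foundation

// Hypothetical helper: encodes a length as a base-128 varint
// (7 payload bits per byte, high bit set when more bytes follow).
func varintLength(_ value: Int) -> Data {
    precondition(value >= 0, "lengths are non-negative")
    var remaining = UInt64(value)
    var bytes = Data()
    repeat {
        var byte = UInt8(remaining & 0x7F)
        remaining >>= 7
        if remaining != 0 { byte |= 0x80 } // continuation bit
        bytes.append(byte)
    } while remaining != 0
    return bytes
}

// Hypothetical helper: the elements arrive as already-encoded chunks,
// because the prefix size is only known once the payload size is known.
func encodeArray(_ elements: [Data]) -> Data {
    let payload = elements.reduce(into: Data()) { $0.append($1) }
    var result = varintLength(payload.count)
    result.append(payload)
    return result
}

let small = encodeArray([Data([1]), Data([2, 3])])
let large = encodeArray([Data(repeating: 0, count: 200)])
print(small.count) // 4: one prefix byte plus 3 payload bytes
print(large.count) // 202: two prefix bytes plus 200 payload bytes
```

Each call to `encodeArray` copies every chunk at least once, which is the kind of internal data manipulation described above.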
Wrapping up
I'm hesitant to add additional functions to the API, since it's unclear if there is sufficient public interest at the moment. Regarding the performance penalty, I would base any implementation changes on measurements, to ensure that the effort is justified. If you have any concrete data showing where the actual bottlenecks are, that would be very helpful.
Side note: I want to optimise the internal data handling in the future, but just haven't had the time. There are a few things that could be improved, but due to the requirements of the Codable implementation, this may get rather complicated.
For the time being, I would suggest that you simply write some extensions for your use case:
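import Foundation
import BinaryCodable

extension BinaryEncoder {
    /// Encodes a value and copies the resulting Data into a byte array.
    func encode<T: Encodable>(_ value: T) throws -> [UInt8] {
        // The type annotation selects the library's Data-returning overload.
        let data: Data = try encode(value)
        return [UInt8](data)
    }
}

extension BinaryDecoder {
    /// Wraps the byte array in Data and forwards to the library's decoder.
    func decode<T: Decodable>(_ type: T.Type, from data: [UInt8]) throws -> T {
        try decode(T.self, from: Data(data))
    }
}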
If you gather any data showing that this is unreasonably inefficient compared to the actual encoding/decoding, then we can think about alternative options.
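One rough way to gather such data is to time the encoding step and the Data to [UInt8] conversion separately. The sketch below is an assumption-laden starting point, not a proper benchmark: `Message` and `measure` are hypothetical names, JSONEncoder is used only as a stand-in so the snippet runs without extra dependencies, and wall-clock timing with Date is coarse. Swapping in BinaryEncoder and a realistic payload would give numbers that matter for this discussion.

```swift
import Foundation

// Hypothetical payload type, used only for this measurement sketch.
struct Message: Codable {
    var id: Int
    var text: String
    var values: [Double]
}

// Runs `body` the given number of times and returns elapsed wall-clock seconds.
func measure(iterations: Int, _ body: () throws -> Void) rethrows -> TimeInterval {
    let start = Date()
    for _ in 0..<iterations { try body() }
    return Date().timeIntervalSince(start)
}

let message = Message(id: 1, text: "hello", values: Array(repeating: 1.5, count: 100))
let encoder = JSONEncoder() // stand-in; replace with BinaryEncoder() to measure the library
let encoded = try encoder.encode(message)

let iterations = 10_000
var sink = 0 // accumulates sizes so the measured work is actually used

// Time the encoding step on its own.
let encodeOnly = try measure(iterations: iterations) {
    let data = try encoder.encode(message)
    sink &+= data.count
}

// Time only the copy from Data into [UInt8].
let convertOnly = measure(iterations: iterations) {
    let bytes = [UInt8](encoded)
    sink &+= bytes.count
}

print("encode: \(encodeOnly) s, Data -> [UInt8]: \(convertOnly) s (sink: \(sink))")
```

If the conversion time turns out to be a small fraction of the encoding time in a release build, the extra copy is unlikely to be the bottleneck.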
Thanks for the detailed answer. I have not done any measurements yet, as I only recently started using BinaryCodable. Based on what you said, it looks like the single conversion will not make a difference, and when I get to the point where coding is the bottleneck, I will probably need to come up with an altogether different implementation.
I will make sure to share my findings if/when I get to that point.