CBOR Feature Drop: COSE
This PR obsoletes #2371 and #2359 as it contains the features of both PRs and many more.
Specifically, this PR contains all feature required to serialize and parse COSE-compliant CBOR (thanks to @nodh). While some canonicalization steps (such as sorting keys) still needs to be performed manually. It does get the job done quite well. Namely, we have successfully used the features introduced here to create and validate ISO/IEC 18013-5:2021-compliant mobile driving license data.
This PR introduces the following features to the CBOR format:
- Serial Labels
- Tagging of keys and values
- Definite length encoding (this is the largest change, as it effectively makes the cbor encoder two-pass)
- Globally preferring major type 2 for byte array encoding
- Encodes
nullcomplex objects as empty map (to be COSE-compliant)
Hi, thanks for your PR! Just as a quick heads-up so you know I've seen it — I'm going on vacation soon, so I'll be able to properly review it in September.
Globally preferring major type 2 for byte array encoding
It would be awesome, as currently it's somewhat surprising that the default encoded size of byte arrays is nearly twice as large.
WASM still fails, because even runCatching does not catch a runtimeException for an expected IndexOutOfBoundsException when checking for tags. Help would be much appreachiated.
WASM still fails, because even
runCatchingdoes not catch a runtimeException for an expected IndexOutOfBoundsException when checking for tags. Help would be much appreachiated.
Here are the details:
Current test: testOpenPolymorphism
Process output:
wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:1
RuntimeError: array element access out of bounds
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlin.Array.get (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[2207]:0x4247e)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.descriptors.SerialDescriptorImpl.getElementAnnotations (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[3974]:0x595a3)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.getKeyTags (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5369]:0x6b6e2)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.Preamble.encode (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5180]:0x68fa4)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.Token.encode (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5173]:0x68d56)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.CborWriter.encode (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5294]:0x6a0c3)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.Cbor.encodeToByteArray (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5042]:0x6743b)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.CborPolymorphismTest.testOpenPolymorphism (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[6111]:0x79d2a)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlin.wasm.internal.testContainer$kotlinx.serialization.cbor test fun$CborPolymorphismTest test fun$testOpenPolymorphism test fun.invoke (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[6715]:0x8440e)
at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlin.test.TeamcityAdapter$test$lambda.invoke (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5605]:0x6f4a9)
EDIT: seems it's already been reported and there's really not much we can do: https://youtrack.jetbrains.com/issue/KT-59081/WASM-Cant-catch-index-out-of-bounds-and-divide-by-0-exceptions
For those who willing to test: we've got a custom 1.6.3-SNAPSHOT version for Kotlin 1.9 with full COSE and I've just published 1.7.1-SNAPSHOT with COSE support to the same maven repo, which you can just add to your gradle builds. Notably, the JS targets are missing due to what seems to be a compiler bug. For the targets that do build, everything looks good with our internal test suites, but external feedback is of course appreciated.
@sandwwraith Pining you as requested. knit and apiDocs run, though I fear I forgot something again ;-)
WASM still fails, because even
runCatchingdoes not catch a runtimeException for an expected IndexOutOfBoundsException when checking for tags. Help would be much appreachiated.Here are the details:
Current test: testOpenPolymorphism Process output: wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:1 RuntimeError: array element access out of bounds at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlin.Array.get (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[2207]:0x4247e) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.descriptors.SerialDescriptorImpl.getElementAnnotations (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[3974]:0x595a3) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.getKeyTags (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5369]:0x6b6e2) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.Preamble.encode (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5180]:0x68fa4) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.Token.encode (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5173]:0x68d56) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.internal.CborWriter.encode (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5294]:0x6a0c3) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.Cbor.encodeToByteArray (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5042]:0x6743b) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlinx.serialization.cbor.CborPolymorphismTest.testOpenPolymorphism (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[6111]:0x79d2a) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlin.wasm.internal.testContainer$kotlinx.serialization.cbor test fun$CborPolymorphismTest test fun$testOpenPolymorphism test fun.invoke (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[6715]:0x8440e) at <org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>.kotlin.test.TeamcityAdapter$test$lambda.invoke (wasm://wasm/<org.jetbrains.kotlinx:kotlinx-serialization-cbor_test>-00616f4e:wasm-function[5605]:0x6f4a9)EDIT: seems it's already been reported and there's really not much we can do: https://youtrack.jetbrains.com/issue/KT-59081/WASM-Cant-catch-index-out-of-bounds-and-divide-by-0-exceptions
That one's fixed upstream already, btw since the correct compiler flags are being passed
For those who willing to test: we've got a custom 1.6.3-SNAPSHOT version for Kotlin 1.9 with full COSE and I've just published 1.7.1-SNAPSHOT with COSE support to the same maven repo, which you can just add to your gradle builds. Notably, the JS targets are missing due to what seems to be a compiler bug. For the targets that do build, everything looks good with our internal test suites, but external feedback is of course appreciated.
@sandwwraith Pining you as requested.
knitandapiDocsrun, though I fear I forgot something again ;-)
Fixed with Kotlin 2.0.20-Beta1
Hi, I've looked into your new approach, and while it doesn't create a new array for each write operation, creating and storing a lambda in each encodeInt/String/etc is also slow, unfortunately:
# JMH version: 1.37
# VM version: JDK 11.0.14.1, OpenJDK 64-Bit Server VM, 11.0.14.1+10-LTS
# VM invoker: /usr/lib/jvm/java-11-amazon-corretto/bin/java
(your PR)
Benchmark Mode Cnt Score Error Units
CborBaseline.fromBytes thrpt 10 1657.190 ± 5.512 ops/ms
CborBaseline.toBytes thrpt 10 925.624 ± 1.491 ops/ms
(dev)
Benchmark Mode Cnt Score Error Units
CborBaseline.fromBytes thrpt 10 1939.893 ± 2.799 ops/ms
CborBaseline.toBytes thrpt 10 2628.659 ± 5.105 ops/ms
Following up on our previous discussion, the cornerstone of this problem, if I remember correctly, is that we need to write regular classes as definite-length maps — using length as the prefix (and beginStructure doesn't have collectionSize parameter). Is this the only reason to use DeferredEncoding?
I came up with several ideas on what can be done to improve this:
1. Reduce number of allocations
You can create intermediate storage per beginStructure, not per each encodeXxx. This code can look as following:
internal class Encoder {
class Data(val byteArray: ByteArray, var elementCount: Int)
val stack = mutableListOf<Data>()
fun beginStructure(descriptor: SerialDescriptor) {
val current = Data(byteArrayOf(), 0)
current.writeTags(descriptor)
stack.add(current)
}
fun encodeElement(descriptor: SerialDescriptor, index: Int) {
val current = stack.peek()
current.writeTagsAndName(descriptor.getElementDescriptor(index), descriptor.getElementName(index))
current.elementCount++
}
fun encodeString(s: String) = writeString(s, stack.peek().byteArray)
fun encodeInt(i: Int) = writeInt(i, stack.peek().byteArray)
fun endStructure() {
val completedCurrent = stack.pop()
completedCurrent.writeBreak() // if needed for indefinite length
val outer = stack.peek()
outer.writeSize(completedCurrent.elementCount) // if needed
outer.write(completedCurrent.byteArray)
}
}
By doing so, you will greatly reduce allocations for most used operations (encodeString, encodeInt, etc). It will also make code much simpler without the need for storing lambdas, additional classes besides Data/Token, and queue (I still don't get why we need one tbh. encodeStructure is already last call in chain, you do not need to delay writing BREAK byte).
2. Reserve space for size beforehand
Alternatively, we can try to do even more aggressive optimization: write everything to a single array, leaving gaps in it for size writing afterward. I.e. stack would contain not byte arrays, but positions, where we are supposed to insert the size. However, it would mean that size will always be written as at least 2 bytes*, even if it can be represented as a smaller number of bytes. IIRC it is still a valid CBOR encoding (although non-canonical), but I don't know about COSE requirements.
* numbers are encoded using 0 additional bytes if x <= 23 and 1 additional byte if 24 <= x =< 255. Since classes can't have more than
Byte.MAX_VALUEproperties, we can always encode any number ofelementCountas 2 bytes[24, elementCount]. Or, we can optimistically reserve 1 byte in array for classes with <= 23 properties and then if we have a bigger class, shift array usingArrays.copyOf. Such encoding provides smaller output, but potentially slower performance due to excessive array copying — in case we have a lot of classes that have more than 23 properties.
3. Split implementation
While writing over approaches above I noticed that for indefinite-length encoding, we don't need any of this mechanisms. Therefore, maybe it would be simpler just to keep old implementation as e.g. class CborIndefiniteLengthTaglessWriter when combination of flags indicates that we don't need to create new CborWriter. I'm not sure what other features we should backport to it (likely none, to keep it simple). It will allow supporting tags/definite lengths as well as keeping runtime performance for current users who do not need new features.
4. Try using kotlinx-io
This approach still requires reducing number of allocations (1), but also provides additional benefit: unlike ByteArray.append(otherByteArray), kotlinx-io Buffers can be joined with zero copying (because of internal segment mechanism). Therefore, same stack structure theoretically should be faster with Buffers than with ByteArrays. However, it would be hard to compare them directly, because output as a ByteArray means that we still need to copy all Buffers into it one by one. Perhaps this is more of a future point of improvement.
Hope this will help you.
Also, as a side note: it seems that Encoding.kt file size grew out of hand (>1000 lines). Initially it contained both CborEncoder and CborDecoder since they were small (my mistake anyway), but since they're bigger now, would you mind splitting them up?
Regarding public annotation class CborArray(@OptIn(ExperimentalUnsignedTypes::class) vararg val tag: ULong) — it seems that issue is still actual for JS even on 2.0.20-Beta1. Would it be problematic to use regular Long instead? On the bright side, it will also make @ExperimentalUnsignedTypes unnecessary.
Regarding
public annotation class CborArray(@OptIn(ExperimentalUnsignedTypes::class) vararg val tag: ULong)— it seems that issue is still actual for JS even on 2.0.20-Beta1. Would it be problematic to use regularLonginstead? On the bright side, it will also make@ExperimentalUnsignedTypesunnecessary.
The RFC defines tags as positive 64 bit numbers, so transitioning to regular Long will result in a counter-intuitive user experience, but I'll revamp the ByteString annotation to encapsulate the tag values into something that makes sense semantically. This would just leave illegal constructs of empty ValueTags and empty KeyTags, which should be disallowed anyway (but this checking this at compile-time would probably require KSP).
Thanks for the detailed feedback, much appreciated! I'll be replying inline.
Hi, I've looked into your new approach, and while it doesn't create a new array for each write operation, creating and storing a lambda in each
encodeInt/String/etcis also slow, unfortunately:# JMH version: 1.37 # VM version: JDK 11.0.14.1, OpenJDK 64-Bit Server VM, 11.0.14.1+10-LTS # VM invoker: /usr/lib/jvm/java-11-amazon-corretto/bin/java (your PR) Benchmark Mode Cnt Score Error Units CborBaseline.fromBytes thrpt 10 1657.190 ± 5.512 ops/ms CborBaseline.toBytes thrpt 10 925.624 ± 1.491 ops/ms (dev) Benchmark Mode Cnt Score Error Units CborBaseline.fromBytes thrpt 10 1939.893 ± 2.799 ops/ms CborBaseline.toBytes thrpt 10 2628.659 ± 5.105 ops/ms
I agree, still not good enough!
Following up on our previous discussion, the cornerstone of this problem, if I remember correctly, is that we need to write regular classes as definite-length maps — using length as the prefix (and
beginStructuredoesn't havecollectionSizeparameter). Is this the only reason to useDeferredEncoding?
Yes, that is the reason for this whole mess. CBOR, and especially COSE, is borderline insane. It's like X.509 but orders of magnitude worse because the known shortcomings weren't improved, but amplified.
I came up with several ideas on what can be done to improve this:
1. Reduce number of allocations
You can create intermediate storage per
beginStructure,not per eachencodeXxx.This code can look as following:internal class Encoder { class Data(val byteArray: ByteArray, var elementCount: Int) val stack = mutableListOf<Data>() fun beginStructure(descriptor: SerialDescriptor) { val current = Data(byteArrayOf(), 0) current.writeTags(descriptor) stack.add(current) } fun encodeElement(descriptor: SerialDescriptor, index: Int) { val current = stack.peek() current.writeTagsAndName(descriptor.getElementDescriptor(index), descriptor.getElementName(index)) current.elementCount++ } fun encodeString(s: String) = writeString(s, stack.peek().byteArray) fun encodeInt(i: Int) = writeInt(i, stack.peek().byteArray) fun endStructure() { val completedCurrent = stack.pop() completedCurrent.writeBreak() // if needed for indefinite length val outer = stack.peek() outer.writeSize(completedCurrent.elementCount) // if needed outer.write(completedCurrent.byteArray) } }By doing so, you will greatly reduce allocations for most used operations (encodeString, encodeInt, etc). It will also make code much simpler without the need for storing lambdas, additional classes besides
Data/Token, and queue (I still don't get why we need one tbh.encodeStructureis already last call in chain, you do not need to delay writing BREAK byte).
That seems like a viable solution, but I somehow have the feeling that some roadblocks will come up. I'll go for this option, because it looks like low hanging fruit. Fingers crossed!
2. Reserve space for size beforehand
Alternatively, we can try to do even more aggressive optimization: write everything to a single array, leaving gaps in it for size writing afterward. I.e. stack would contain not byte arrays, but positions, where we are supposed to insert the size. However, it would mean that size will always be written as at least 2 bytes*, even if it can be represented as a smaller number of bytes. IIRC it is still a valid CBOR encoding (although non-canonical), but I don't know about COSE requirements.
COSE requires canonocalization (because how elese is it supposed to work cryptographically), so this approach is not viable.
- numbers are encoded using 0 additional bytes if x <= 23 and 1 additional byte if 24 <= x =< 255. Since classes can't have more than
Byte.MAX_VALUEproperties, we can always encode any number ofelementCountas 2 bytes[24, elementCount]. Or, we can optimistically reserve 1 byte in array for classes with <= 23 properties and then if we have a bigger class, shift array usingArrays.copyOf. Such encoding provides smaller output, but potentially slower performance due to excessive array copying — in case we have a lot of classes that have more than 23 properties.
This is the killer application for COSE support . As you can see, lots of properties, so I'd optimize for the generic case, and not for specifics.
3. Split implementation
While writing over approaches above I noticed that for indefinite-length encoding, we don't need any of this mechanisms. Therefore, maybe it would be simpler just to keep old implementation as e.g.
class CborIndefiniteLengthTaglessWriterwhen combination of flags indicates that we don't need to create newCborWriter. I'm not sure what other features we should backport to it (likely none, to keep it simple). It will allow supporting tags/definite lengths as well as keeping runtime performance for current users who do not need new features.
Tagging should go in there in any case, IMHO, since it is independent of definite/indefinite length. In general I agree that better encapsulation and separation of concerns would benefit the code. I really dislike the idea of maintaining two codebases that are somewhat similar, but not quite the same. Seems like a mess waiting to happen in the long run.
4. Try using kotlinx-io
This approach still requires reducing number of allocations (1), but also provides additional benefit: unlike
ByteArray.append(otherByteArray), kotlinx-ioBuffers can be joined with zero copying (because of internal segment mechanism). Therefore, same stack structure theoretically should be faster with Buffers than with ByteArrays. However, it would be hard to compare them directly, because output as aByteArraymeans that we still need to copy allBuffers into it one by one. Perhaps this is more of a future point of improvement.
If you are fine with kotlinx.io, I'm all for it! Incorporating it should not be too much of a hassle.
Hope this will help you.
Mos def did!
Also, as a side note: it seems that
Encoding.ktfile size grew out of hand (>1000 lines). Initially it contained bothCborEncoderandCborDecodersince they were small (my mistake anyway), but since they're bigger now, would you mind splitting them up?
Sure thing!
I'll ping you once I'm done.
@sandwwraith I Implemented your suggested simplified approach according to your snippet. This PR is ready for the next review round.
TL;DR: t is working perfectly well and the code is much simpler. Also performs much faster! But the original composeNumber was abysmal to begin with! Deserializing performance is as you mentioned, but that is fine IMHO.
Long story:
Simplifying the code as you suggested had no impact at all it was exactly as slow as the convoluted approach and when switching out ByteArrayOutput for kotlinx-io Buffer the performance tanked completely (third result) unless write is used instead of copyTo (fourth result).
It got even worse: I've removed all the teeny tiny byteArrays that are allocated during tag encoding to operate in-place on the buffer, only to be back to the very same performance numbers as with my original approach. So I got rid of the buffer again.
And now for the kicker: I went ahead split the indefinite length encoder from the definite length encoder and the speed-up was only 10%.
What finally made a huge impact was to removing all the inefficient creation of teeny tiny ByteArrays from encodeByteArray. Fully optimizing the number encoding to always operate directly on the buffer makes things 5% slower, so I did not go further.
The current state of the PR
* CborBaseline.toBytes thrpt 10 1125.402 ± 2.414 ops/ms
Your suggestion (no more deferred encoding)
* CborBaseline.toBytes thrpt 10 1107.893 ± 5.853 ops/ms
CopyTo Buffer
* CborBaseline.toBytes thrpt 10 456.103 ± 4.931 ops/ms
Optimized Buffer
* CborBaseline.toBytes thrpt 10 983.492 ± 6.253 ops/ms
In-Place Number encoding
* CborBaseline.toBytes thrpt 10 1160.831 ± 8.129 ops/ms
Indefinite length only encoder:
* CborBaseline.toBytes thrpt 10 1257.923 ± 6.029 ops/ms
Fastest with the mostest
* CborBaseline.toBytes thrpt 10 1827.695 ± 6.384 ops/ms
I'm also happy to report that the JS compiler bug can now only be triggered when using empty @ValueTags or empty @KeyTags. This is nonsensical anyhow and translates to a NOOP, so everything is usable in practice
Hi,
I stumbled upon this great MR while trying to work with some CBOR structures and I have two questions:
-
The ByteStringWrapper seems to create it's own internal Cbor instance and doesn't follow the settings used to serialize/deserialize the outer structures. Am I overlooking a way to influence the behaviour of ByteStringWrapper?
-
I see examples to tag classes when I want them to be arrays with @CborArray(Tag()). Can I tag classes when they are maps somehow? And if I use Map or List to work in a more dynamic way can I somehow verify/write tags for the whole map/array and/or the respective elements?
Hi,
I stumbled upon this great MR while trying to work with some CBOR structures and I have two questions:
1. The ByteStringWrapper seems to create it's own internal Cbor instance and doesn't follow the settings used to serialize/deserialize the outer structures. Am I overlooking a way to influence the behaviour of ByteStringWrapper?
Nope, that was just an oversight and needs fixing.
2. I see examples to tag classes when I want them to be arrays with @CborArray(Tag()). Can I tag classes when they are maps somehow? And if I use Map or List to work in a more dynamic way can I somehow verify/write tags for the whole map/array and/or the respective elements?
I don't see how that would work. You would need to individually tag specific keys and values, which is just impossible
I obviously can't run the CI myself, but I tried to address everything. Ready for the next round of reviews. Thanks everyone for the detailed feedback! It really helped to clear up some things that were not working the way they are supposed to.
In my browser, this PR is now behaves uber-glitchy when all comments are shown, maybe it is just too large. The page jumps around when scrolling and is utterly unusable for me. I really tried to address all comments and i sincerely hope I did, because every single one was very valuable and improved the overall quality. If something slipped through, I'm really sorry! If this page still renders fine for you, hopefully everything shows up as outdated. The gliches are also the reason I stopped accepting suggestion through here and simply applied them and committed them regularly.
I have to say, I now understand why kotlinx.serialization works as well and reliably as it does. you really do take QA seriously!
I want to thank you again for this huge amount of work and your time. Well done!