kotlinx.serialization Q: How to not write a CBOR tag when value is null?

For CBOR serialization, I assign a tag like so:

@Serializable
data class Something(
    @ValueTags(MY_TAG) @Contextual val myObject: MyObject?
)

and I have a serializer like

object MyObjectAsByteArraySerializer : KSerializer<MyObject> {
    override val descriptor: SerialDescriptor
        get() = SerialDescriptor("MyObject", ByteArraySerializer().descriptor)

    override fun deserialize(decoder: Decoder): MyObject = MyObject(decoder.decodeSerializableValue(ByteArraySerializer()))

    override fun serialize(
        encoder: Encoder,
        value: MyObject,
    ) = encoder.encodeSerializableValue(ByteArraySerializer(), value.toBytes())
}

It works fine for non-null values, but when null, I don't want the tag written to the byte stream. I need to do this to be in compliance with another project's CBOR protocol expectations.

For example, right now, when the value is null, I'm writing a tag,

      70726576                          # "prev"
   D8 2A                                # tag(42)
      F6                                # primitive(22)

but I need it not to write the tag

      70726576                          # "prev"
   F6                                   # primitive(22)

what can I do to control this?
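
To make the desired output concrete, here is a minimal hand-rolled sketch of the two encodings, using only the Kotlin standard library. It is not the kotlinx.serialization API; the function name `encodeOptionalTagged` and the restriction to one-byte tag numbers and short byte strings are assumptions made for illustration.

```kotlin
// CBOR initial bytes used in this sketch (RFC 8949).
const val CBOR_NULL = 0xF6      // simple value 22 (null)
const val TAG_ONE_BYTE = 0xD8   // major type 6, one-byte tag number follows
const val BYTES_MAJOR = 0x40    // major type 2 (byte string), length in low 5 bits

/** Hypothetical helper: tag + byte string when present, bare null when absent. */
fun encodeOptionalTagged(tag: Int, payload: ByteArray?): ByteArray {
    require(tag in 24..255) { "sketch only handles one-byte tag numbers" }
    // Null case: write the null on its own, with no tag header in front of it.
    if (payload == null) return byteArrayOf(CBOR_NULL.toByte())
    require(payload.size < 24) { "sketch only handles short byte strings" }
    val header = byteArrayOf(
        TAG_ONE_BYTE.toByte(),
        tag.toByte(),
        (BYTES_MAJOR or payload.size).toByte(),
    )
    return header + payload
}

/** Upper-case hex dump, two digits per byte. */
fun ByteArray.toHex(): String =
    joinToString("") { (it.toInt() and 0xFF).toString(16).padStart(2, '0').uppercase() }

fun main() {
    println(encodeOptionalTagged(42, null).toHex())                 // F6
    println(encodeOptionalTagged(42, byteArrayOf(1, 2, 3)).toHex()) // D82A43010203
}
```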

Apr 01 '25 00:04 travishaagen

@travishaagen Writing of null values is managed by the container of the value (Something), not by the serializer of the value. If you specify null as the default, that will likely work. There may also be other CBOR configuration options that help.

Apr 01 '25 08:04 pdvrieze

@pdvrieze thanks for the reply, but setting the default to null didn't change the outcome.

@ardune @JesusMcCloud @sandwwraith

^ I see that you folks have discussed CBOR and null handling in the past (e.g., https://github.com/Kotlin/kotlinx.serialization/pull/2952, https://github.com/Kotlin/kotlinx.serialization/issues/2848 )

Do you have any insights on making CBOR tags optional for null values with this library? In other words, I want the tag to appear when the value is non-null, but no tag to appear in the binary CBOR serialization when the value is null.

Apr 01 '25 18:04 travishaagen

@travishaagen I'll preface my response with: the following is a bad approach and smarter people should give you a better one.

I personally use a custom build that exposes functions I need. Specific example: https://github.com/ardune/kotlinx.serialization/commit/50c9a263f87e9a506da9df3a27102b667c480ec5

Look at "TaggedByteStringSerializer" - it directly encodes and checks for tags

I did this approach as I wanted to be strict about tags in some cases and more lenient in others.

Ideally, I would use a custom Descriptor of some kind that adds, say, an ObjectTags annotation properly instead of making custom edits. But I didn't figure out how to do that properly and decided to go with the hacky approach.

So, I'd also love to know the better way to deal with this.

Apr 01 '25 21:04 ardune

Thanks @ardune ! Being able to access the tags like that seems like it would be useful.

I'm going to dig in a bit more. I was also wondering if the annotation itself could have an optional parameter to change this behavior.

Apr 01 '25 22:04 travishaagen

@travishaagen I am assuming configuring the CBOR encoder with encodeDefaults=false and setting the default value to null introduces undesired behaviour, as you need the null to be there?
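
For reference, a minimal sketch of that configuration, assuming the kotlinx-serialization-cbor dependency and the serialization compiler plugin are set up. The class `Link`, its `String?` property, and the tag number 42 are hypothetical stand-ins for the types in the question. Note that with this setup the whole key/value pair is skipped when the value equals its default, which is different from writing an untagged null.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.cbor.Cbor
import kotlinx.serialization.cbor.ValueTags
import kotlinx.serialization.encodeToByteArray

// Hypothetical container: "prev" defaults to null so it can be treated as a default value.
@OptIn(ExperimentalSerializationApi::class)
@Serializable
data class Link(
    @ValueTags(42uL) val prev: String? = null,
)

@OptIn(ExperimentalSerializationApi::class)
fun encodeWithoutDefaults(value: Link): ByteArray {
    val cbor = Cbor {
        encodeDefaults = false   // properties equal to their default are not written
        encodeValueTags = true   // @ValueTags is honoured for properties that are written
    }
    return cbor.encodeToByteArray(value)
}

fun main() {
    // Lower-case hex dump of both cases, for comparison by eye.
    fun ByteArray.hex() = joinToString("") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') }
    println(encodeWithoutDefaults(Link()).hex())          // "prev" omitted entirely
    println(encodeWithoutDefaults(Link("abc")).hex())     // "prev" written with tag 42
}
```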

~~There might be a dirty hack, in case encodeDefaults=false is an option for you by introducing an additional property capturing the null.~~

Currently, the only proper way of doing this would be a custom serialiser and deserialiser, of course. A bit tedious, but not rocket science. Just manually encode and decode as you like: delegate to the default generated serialiser for all properties except the one you need custom handling for. For that one, check whether its value is null. If so, just encode a null; if not, delegate to the default serialiser for that property.
Decoding is a tad trickier and you'll need a try-catch: try to decode a tag for the last property, and if that fails, try to decode a null (or the other way around: try to decode a null, and if that fails, try to decode a tag; YMMV). Why does this work? The decoder has a lookahead of a single byte, and if this single byte does not match the expected first byte of whatever you are trying to decode, the decoder throws but does not advance. We've been using this property to great success, most recently here, as part of a large behemoth of a custom serialiser that previously did not exploit this property enough.
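
The single-byte lookahead idea can be sketched with the standard library alone. This is a hand-rolled stand-in, not the library's decoder: `ByteCursor` and `decodeOptionalTagged` are hypothetical names, and the sketch peeks at the next byte explicitly rather than relying on try-catch. It only handles one-byte tag numbers and short byte strings.

```kotlin
/** Minimal cursor over a byte array; peek() inspects the next byte without consuming it. */
class ByteCursor(private val bytes: ByteArray) {
    var pos = 0
        private set

    fun peek(): Int = bytes[pos].toInt() and 0xFF
    fun read(): Int = peek().also { pos++ }
    fun read(n: Int): ByteArray = bytes.copyOfRange(pos, pos + n).also { pos += n }
}

/** Accepts either a bare null (F6) or tag(expectedTag) followed by a short byte string. */
fun decodeOptionalTagged(cursor: ByteCursor, expectedTag: Int): ByteArray? {
    // Lookahead: decide which branch to take before consuming anything.
    if (cursor.peek() == 0xF6) {
        cursor.read()
        return null
    }
    check(cursor.read() == 0xD8) { "expected a one-byte tag header" }
    check(cursor.read() == expectedTag) { "unexpected tag number" }
    val head = cursor.read()
    check(head in 0x40..0x57) { "expected a short byte string" }
    return cursor.read(head - 0x40)
}

fun main() {
    val untaggedNull = byteArrayOf(0xF6.toByte())
    val tagged = byteArrayOf(0xD8.toByte(), 0x2A, 0x43, 1, 2, 3)
    println(decodeOptionalTagged(ByteCursor(untaggedNull), 42))            // null
    println(decodeOptionalTagged(ByteCursor(tagged), 42)?.toList())        // [1, 2, 3]
}
```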

Inside the serialize and deserialize functions, you can actually check the type of the encoder/decoder, and if it is a CBOR one, toggle custom encoding/decoding based on whether you are actually doing CBOR or some other format.

Apr 02 '25 06:04 JesusMcCloud

I discovered that I don't need this anymore for my use case, but others may encounter this, because RFC 8949 states that an untagged null can be valid CBOR:

As a matter of convention, many tags do not accept null or undefined values as tag content; 
instead, the expectation is that a null or undefined value can be used in place of the entire tag; 
Section 3.4.2 provides some further considerations for one specific tag about the handling of this
convention in application protocols and in mapping to platform types.

...
using untagged null or undefined avoids the use of non-finites and results in a shorter encoding

Apr 13 '25 17:04 travishaagen

@sandwwraith @fzhinkin Hello, I put up a PR to address this issue https://github.com/Kotlin/kotlinx.serialization/pull/3074

Sep 08 '25 18:09 travishaagen