serde Error deserializing the unit type from an empty sequence

I stumbled across the following inconsistency in serde and wondered if this is a bug or not:

    let _: (i32,) = serde_json::from_value(json!([42])).unwrap(); // works fine
    let _: () = serde_json::from_value(json!([])).unwrap(); // panics

AFAIK the unit type can be thought of as a tuple of length 0, and there is no other way to express an empty tuple in Rust. But while deserializing tuples with elements works fine, deserializing an empty sequence into the unit type fails. I know that I could work around this in various ways, but this behaviour does not seem correct.

I found out about this behaviour while writing a macro where I deserialize function arguments into tuples.

Any thoughts?

Dec 14 '22 23:12 nicolaiunrein

Actually, the ability to deserialize () from an empty sequence was removed in #839 (and the ability to deserialize unit structs from empty sequences was removed in #857).

However, there is still some inconsistency. The internal ContentDeserializer allows deserialization of () from empty maps / sequences: https://github.com/serde-rs/serde/blob/05a5b7e3c6de502d45597cbc083f28bc1d4f4626/serde/src/private/de.rs#L1297-L1345

This is required, because this deserializer is used for internally tagged enums:

    #[derive(Deserialize)]
    #[serde(tag = "tag")]
    enum InternallyTagged {
        Unit,
    }

where the InternallyTagged::Unit variant would be represented as ["Unit"] in JSON (for example). Because this is a sequence, it is deserialized using visit_seq, and once the deserializer has consumed the tag, the sequence left for the value is empty. To be able to read () (which is the representation of unit variants for internally tagged enums), the deserializer implements such conversions.
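To illustrate why the leftover sequence forces this conversion, here is a std-only toy model. It is a sketch, not serde's code: the Content enum and the functions unit_strict, unit_lenient and read_tagged_unit are hypothetical names invented for this example, and the real Content / ContentDeserializer are considerably more involved.

```rust
/// Toy buffered value; a hypothetical, much smaller cousin of serde's `Content`.
#[derive(Debug, PartialEq)]
enum Content {
    Null,
    Str(String),
    Seq(Vec<Content>),
}

/// Strict rule: only `Null` counts as a unit value.
fn unit_strict(c: &Content) -> Result<(), String> {
    match c {
        Content::Null => Ok(()),
        other => Err(format!("invalid type: {:?}, expected unit", other)),
    }
}

/// Lenient rule: an empty sequence is accepted as a unit value too.
fn unit_lenient(c: &Content) -> Result<(), String> {
    match c {
        Content::Null => Ok(()),
        Content::Seq(items) if items.is_empty() => Ok(()),
        other => Err(format!("invalid type: {:?}, expected unit", other)),
    }
}

/// Reads a sequence such as `["Unit"]`: the first element is taken as the tag,
/// and whatever remains is handed to `read_unit` as the variant's value.
fn read_tagged_unit(
    input: Content,
    read_unit: fn(&Content) -> Result<(), String>,
) -> Result<String, String> {
    let mut items = match input {
        Content::Seq(items) => items.into_iter(),
        other => return Err(format!("expected a sequence, got {:?}", other)),
    };
    let tag = match items.next() {
        Some(Content::Str(tag)) => tag,
        _ => return Err("missing string tag".to_string()),
    };
    // After the tag is consumed, the value is whatever is left: here, `[]`.
    let rest = Content::Seq(items.collect());
    read_unit(&rest)?;
    Ok(tag)
}
```

In this model, read_tagged_unit succeeds on ["Unit"] only with unit_lenient; with unit_strict the empty remainder is rejected, which is the situation the conversions in the internal deserializer avoid.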

However, ContentDeserializer is used not only for internally tagged enums but also for flattened types, and using it there creates an inconsistency: you can deserialize () from an empty sequence when it is in a flattened structure, but not when it is in an ordinary type.

Aug 11 '23 15:08 Mingun

If I understand you correctly, you argue that () should never be deserialized from an empty sequence because it would be a type conversion? I actually see it the other way around. In my view it should be possible, because () is just an empty tuple that happens to have some special meaning in Rust (e.g. it is the implicit return type). The inconsistency I see is that if a tuple of length n >= 1 can be deserialized/serialized from/to a sequence, the same should be possible for tuples of length 0. Of course, serialization is trickier, because () can also be thought of as null in other languages, and it makes sense that the default serialization of () in e.g. JSON is null. I think the essence of the problem is that () has more than one semantic meaning. The best solution (the one with the highest level of correctness) I can think of at the moment would be an attribute like #[serde(serialize_as_tuple)], combined with allowing deserialization of empty sequences to (). If that is not feasible, maybe #[serde(as_tuple)] could be a good compromise.

Aug 14 '23 06:08 nicolaiunrein

If I understand you correctly, you argue that () should never be deserialized from an empty sequence because it would be a type conversion?

I would not say that I argue for it; this is simply the current behavior, which was implemented this way with that motivation. I actually agree with you that it would be feasible to have #[serde(as_tuple)] or similar.

Aug 14 '23 12:08 Mingun

I will draft a PR if I find time.

Aug 15 '23 22:08 nicolaiunrein