serde icon indicating copy to clipboard operation
serde copied to clipboard

Fix incorrect representation of tuples with skipped fields

Open Mingun opened this issue 11 months ago • 10 comments

This PR fixes #2105 and give some insights about the current situation with skipped fields.

You can see what changed in the generated code by running the following commands:

  • place skip.txt into test_suite/tests folder and change it extension to .rs
  • run the following command:
    ...\serde\test_suite> cargo expand --all-features --test skip > skip-N.rs
    
    where N is some number.

When commit message contains words "Changes in generated code" that means that output of the command above was changed. I recommend to diff it against the previous result and explore the difference. There will be 4 files -- baseline and three changed.

Overall changes over all commits:

  • Tuple1as0, Tuple1as0Default, Tuple1as0With:
    • removed visit_newtype_struct
    • changed *_newtype_struct -> *_tuple_struct(0)
  • Tuple2as1, Tuple2as1Default, Tuple2as1With:
    • added visit_newtype_struct

The table below summarizes the current behavior and the new behavior, introduced in this PR. Empty cells in the last column means that the behavior wasn't changed.

The behavior columns shows which methods are used to serialize or deserialize a type. The number in parenthesis is a count of fields which passed to the corresponding method.

CodeCurrent behaviorFixed behavior
Unit representations
struct Unit;

serialize_unit_struct
deserialize_unit_struct
visit_unit

Tuple representations
struct Tuple0();

serialize_tuple_struct(0)
deserialize_tuple_struct(0)
visit_seq

:information_source: It would be better to use Unit representation (see below)

// Can be named
// Newtype(u8)
struct Tuple1(u8);

serialize_newtype_struct
deserialize_newtype_struct
visit_newtype_struct
visit_seq

struct Tuple2(u8, u8);

serialize_tuple_struct(2)
deserialize_tuple_struct(2)
visit_seq

// The same as Tuple0
struct Tuple1as0(
    #[serde(skip)] u8,
);

#2105

serialize_newtype_struct
deserialize_newtype_struct
visit_newtype_struct
visit_seq

serialize_tuple_struct(0)
deserialize_tuple_struct(0)
visit_seq
:information_source: It would be better to use Unit representation (see below)

// The same as Tuple0
struct Tuple2as0(
    #[serde(skip)] u8,
    #[serde(skip)] u8,
);

serialize_tuple_struct(0)
deserialize_tuple_struct(0)
visit_seq

:information_source: It would be better to use Unit representation (see below)

// The same as Tuple1
struct Tuple2as1(
    #[serde(skip)] u8,
    u8,
);

serialize_tuple_struct(1)
deserialize_tuple_struct(1)
visit_seq

serialize_tuple_struct(1)
deserialize_tuple_struct(1)
visit_newtype_struct
visit_seq
:information_source: It would be better to use Newtype representation (see below)

Struct representations
struct Struct0 {}

serialize_struct
deserialize_struct
visit_seq
visit_map

struct Struct1 {
    a: u8,
}

serialize_struct
deserialize_struct
visit_seq
visit_map

struct Struct2as0 {
    #[serde(skip)]
    a: u8,
    #[serde(skip)]
    b: u8,
}

serialize_struct
deserialize_struct
visit_seq
visit_map

struct Struct2as1 {
    #[serde(skip)]
    a: u8,
    b: u8,
}

serialize_struct
deserialize_struct
visit_seq
visit_map

Enum variant representations

The following changes also was implemented by I plan to ship them in separate PR due to some debating changes. I left my finding here to have the overall picture about skipped fields in one place.

The first debating change is to how to represent tuple(0) tuples? Should we represent them as a sequences with zero fields, or as a unit structs? Motivation for the unit structs is that that is the only way to skip all fields in a tuple and still represent it as a unit. This is currently done for Enum::Tuple1as0, but only for it. Enum::Tuple2as0 no longer follows this agreement as well as all tuple structs, which introduces more confusion. I am inclined to this option, since to have an always empty sequence is somewhat strange.

The second debating change, maybe we should make the behavior configurable? Introduce a new attribute like #[serde(tuple)] to produce tuple(0) and tuple(1) forms from TupleXas0 and TupleXas1 tuples?

CodeCurrent behaviorSuggested behavior
Enum::Unit representations
enum Enum {
  Unit,
}

serialize_unit_variant
unit_variant

Enum::Tuple representations
enum Enum {
  Tuple0(),
}

serialize_tuple_variant(0)
tuple_variant(0)
visit_seq

serialize_unit_variant
unit_variant

// Can be called
// Enum::Newtype(u8)
enum Enum {
  Tuple1(u8),
}

serialize_newtype_variant
newtype_variant

enum Enum {
  Tuple2(u8, u8),
}

serialize_tuple_variant(2)
tuple_variant(2)
visit_seq

// The same as
// Enum::Tuple0
enum Enum {
  Tuple1as0(
    #[serde(skip)] u8,
  ),
}

serialize_unit_variant
unit_variant

// The same as
// Enum::Tuple0
enum Enum {
  Tuple2as0(
    #[serde(skip)] u8,
    #[serde(skip)] u8,
  ),
}

serialize_tuple_variant(0)
tuple_variant(0)
visit_seq

serialize_unit_variant
unit_variant

// The same as
// Enum::Tuple1
enum Enum {
  Tuple2as1(
    #[serde(skip)] u8,
    u8,
  ),
}

#2224

serialize_tuple_variant(1)
tuple_variant(1)
visit_seq

serialize_newtype_variant
newtype_variant

Enum::Struct representations
enum Enum {
  Struct0 {},
}

serialize_struct_variant(0)
struct_variant
visit_seq
visit_map

enum Enum {
  Struct1 {
    a: u8,
  },
}

serialize_struct_variant(1)
struct_variant
visit_seq
visit_map

enum Enum {
  // The same as
  // Enum::Struct0
  Struct2as0 {
    #[serde(skip)]
    a: u8,
    #[serde(skip)]
    b: u8,
  },
}

serialize_struct_variant(0)
struct_variant
visit_seq
visit_map

enum Enum {
  // The same as
  // Enum::Struct1
  Struct2as1 {
    #[serde(skip)]
    a: u8,
    b: u8,
  },
}

serialize_struct_variant(1)
struct_variant
visit_seq
visit_map

List of inconsistencies

As I said before, the current behavior contains a bunch of inconsistencies.

For example, an enum tuple variant with zero fields represented as a tuple with zero fields, but tuple variant with one skipped field represented as a unit. At the same time, tuple variant with 2 or more fields, all skipped, again represented as a tuple with zero fields. The same is applied to tuples with one effective field -- sometimes it is represented as a newtype, sometimes as a tuple with one field. I suggest to unify behavior across all cases that I summarized in the following table:

ValueSerialized formSuggested form
Types with zero fields in serialized form

Tuple0()

Represented as a tuple with zero fieldsunit

Tuple1as0(Skipped)

:warning: Incorrectly represented as a newtype struct (bug #2105)unit

Tuple2as0(Skipped, Skipped)

Represented as a tuple with zero fieldsunit
Types with 1 field in serialized form

Tuple1(x)

Represented as a newtype x

Tuple2as1(Skipped, x)

:exclamation: Represented as a tuple with 1 fieldnewtype
Enum variants with zero fields in serialized form

Enum::Tuple0()

Represented as a tuple with zero fieldsunit

Enum::Tuple1as0(Skipped)

:exclamation: Represented as a unit struct

Enum::Tuple2as0(Skipped, Skipped)

Represented as a tuple with zero fieldsunit
Enum variants with 1 field in serialized form

Enum::Tuple1(x)

Represented as a newtype x

Enum::Tuple2as1(Skipped, x)

:exclamation: Represented as a tuple with 1 field (#2224)newtype

Legend:

  • :warning: Bugs that leads to compilation errors
  • :exclamation: Inconsistency in a block of similar table rows

Mingun avatar Jul 21 '23 17:07 Mingun