serde icon indicating copy to clipboard operation
serde copied to clipboard

Allow rename with str|bool|int type for internally tagged enums

Open NotBad4U opened this issue 5 years ago • 18 comments

This is the WIP for #745 (my work is based on #973). I have some troubles to understand what I have to do to complete the serde_derive/de.rs and serde_derive/se.rs modifications. I don't understand what I've to do in the method deserialize_identifier: serde_derive/src/de.rs#L2016 and if I have to modify other deserialize_* methods in serde_derive/de.rs. Does the serde_derive/de.rs part seem correct ?

For now the unit test doesn't compile and I get this error. Do I have to modify something in serde_derive/src/internals/ast.rs to pass the compilation ?

   Compiling serde_test_suite v0.0.0 (file:///home/escanor/Junk/serde/test_suite)
error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:32
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                ^^^^^^^^^
     |                                |
     |                                expected reference, found u64
     |                                help: consider borrowing here: `&Serialize`
     |
     = note: expected type `&_`
                found type `u64`

error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:32
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                ^^^^^^^^^
     |                                |
     |                                expected reference, found bool
     |                                help: consider borrowing here: `&Serialize`
     |
     = note: expected type `&_`
                found type `bool`

error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:43
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                           ^^^^^^^^^^^ expected str, found u64
     |
     = note: expected type `str`
                found type `u64`

error[E0308]: mismatched types
    --> test_suite/tests/test_macros.rs:1048:43
     |
1048 |     #[derive(Debug, PartialEq, Serialize, Deserialize)]
     |                                           ^^^^^^^^^^^ expected str, found bool
     |
     = note: expected type `str`
                found type `bool`

error: aborting due to 4 previous errors

NotBad4U avatar Sep 17 '18 20:09 NotBad4U

Nice, this is on the right track. Regarding the test failure, you can use cargo-expand and the command cargo expand --test test_macros in the test_suite directory to see the generated code. I would recommend copying one of the non-working generated impls into a new minimal project, patching it up manually to compile and behave as intended, and then once the handwritten one works figuring out what modifications are needed in serde_derive to generate the modifications you had to make by hand.

dtolnay avatar Sep 19 '18 08:09 dtolnay

@dtolnay Thx for the tips :) And for the work on https://github.com/serde-rs/serde/blob/3f0f739e17dde92be582917dbed3d5b09fc4177d/serde_derive/src/de.rs#L2016 I'm still blocking on what I have to do in deserialize_identifier method.

NotBad4U avatar Sep 19 '18 09:09 NotBad4U

That deserialize_identifier function is generating the code that identifies which field of a struct or which variant of an enum we are looking at in the input. For example if the input contains a JSON string then we have a match expression like the following to map that string to a field or variant, taking into account any serde(rename = "...") attributes.

https://github.com/serde-rs/serde/blob/3f0f739e17dde92be582917dbed3d5b09fc4177d/serde_derive/src/de.rs#L2244-L2252

The visit_other case is relevant when the user has something like:

#[derive(Deserialize)]
struct Demo {
    a: A,
    b: B,
    #[serde(flatten)]
    more: HashMap<Value, Value>,
}

where any map entries that are not one of the strings "a" or "b" need to be captured regardless of type and stored in more.

dtolnay avatar Sep 19 '18 17:09 dtolnay

Thx @dtolnay for your cargo expand tips I was able to progress on this PR.

I set a unit test of a internally tagged enums with a lot of variant types to help me to work on this feature: test_internally_tagged_enum_renamed. For now, the serialization pass all my assert_tokens but not the deserialization for variant with "complex" type like:

#[serde(rename=1)]
F(BTreeMap<String, String>),

Can you make a review of my deserialize_identifier ? I don't know if its the correct way to modify this method and I don't know if I have to deal with visit_u64 as I dealt with visit_bool.

Could you tell me where I have to put the unit tests for this feature (I saw that you have test_de.rs, test_ser.rs and test_annotations.rs) ?

NotBad4U avatar Oct 03 '18 21:10 NotBad4U

Looks good so far!

I tried this against serde_json using the following code, but got an error during deserialization.

extern crate serde;
extern crate serde_derive;
extern crate serde_json;

use serde_derive::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(tag = "success")]
enum Outcome {
    #[serde(rename = true)]
    Success,
    #[serde(rename = false)]
    Failure,
}

fn main() {
    let j = r#" {"success":true} "#;
    match serde_json::from_str::<Outcome>(j) {
        Ok(outcome) => println!("{:?}", outcome),
        Err(err) => eprintln!("{}", err),
    }
}
invalid type: boolean `true`, expected variant identifier at line 1 column 16

Some comments inline in the generated code:

    impl<'de> _serde::Deserialize<'de> for Outcome {
        fn deserialize<__D>(__deserializer: __D) -> _serde::export::Result<Self, __D::Error>
        where
            __D: _serde::Deserializer<'de>,
        {
            #[allow(non_camel_case_types)]
            enum __Field {
                __field0,
                __field1,
            }
            struct __FieldVisitor;

// I would omit visit_u64, visit_str, visit_bytes and only keep
// visit_bool. That way we get the Visitor trait's default invalid type
// error message, since those really correspond to an invalid type
// rather than an invalid value for the variant tag.
            impl<'de> _serde::de::Visitor<'de> for __FieldVisitor {
                type Value = __Field;
                fn expecting(
                    &self,
                    __formatter: &mut _serde::export::Formatter,
                ) -> _serde::export::fmt::Result {
                    _serde::export::Formatter::write_str(__formatter, "variant identifier")
                }
                fn visit_u64<__E>(self, __value: u64) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        _ => _serde::export::Err(_serde::de::Error::invalid_value(
                            _serde::de::Unexpected::Unsigned(__value),
                            &"variant index 0 <= i < 2",
                        )),
                    }
                }
                fn visit_bool<__E>(self, __value: bool) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        true => _serde::export::Ok(__Field::__field0),
                        false => _serde::export::Ok(__Field::__field1),

// This pattern is unreachable and should be left out.
                        _ => {
                            let __value = &_serde::export::from_bool(__value);
                            _serde::export::Err(_serde::de::Error::unknown_variant(__value, VARIANTS))
                        }
                    }
                }
                fn visit_str<__E>(self, __value: &str) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        _ => _serde::export::Err(_serde::de::Error::unknown_variant(__value, VARIANTS)),
                    }
                }
                fn visit_bytes<__E>(self, __value: &[u8]) -> _serde::export::Result<Self::Value, __E>
                where
                    __E: _serde::de::Error,
                {
                    match __value {
                        _ => {
                            let __value = &_serde::export::from_utf8_lossy(__value);
                            _serde::export::Err(_serde::de::Error::unknown_variant(__value, VARIANTS))
                        }
                    }
                }
            }
            impl<'de> _serde::Deserialize<'de> for __Field {
                #[inline]
                fn deserialize<__D>(__deserializer: __D) -> _serde::export::Result<Self, __D::Error>
                where
                    __D: _serde::Deserializer<'de>,
                {

// This line is why the serde_json snippet above is failing. JSON
// represents identifiers as strings, so we are telling it to expect a
// string but the input contains a boolean. This line should say
// Deserializer::deserialize_bool. If the user's enum mixes different
// types of tags, this should say Deserializer::deserialize_any.
                    _serde::Deserializer::deserialize_identifier(__deserializer, __FieldVisitor)
                }
            }
            const VARIANTS: &'static [&'static str] = &["true", "false"];
            let __tagged = try!(_serde::Deserializer::deserialize_any(
                __deserializer,
                _serde::private::de::TaggedContentVisitor::<__Field>::new("success"),
            ));
            match __tagged.tag {
                __Field::__field0 => {
                    try!(_serde::Deserializer::deserialize_any(
                        _serde::private::de::ContentDeserializer::<__D::Error>::new(__tagged.content,),
                        _serde::private::de::InternallyTaggedUnitVisitor::new("Outcome", "Success"),
                    ));
                    _serde::export::Ok(Outcome::Success)
                }
                __Field::__field1 => {
                    try!(_serde::Deserializer::deserialize_any(
                        _serde::private::de::ContentDeserializer::<__D::Error>::new(__tagged.content,),
                        _serde::private::de::InternallyTaggedUnitVisitor::new("Outcome", "Failure"),
                    ));
                    _serde::export::Ok(Outcome::Failure)
                }
            }
        }
    }

dtolnay avatar Nov 25 '18 01:11 dtolnay

Could you tell me where I have to put the unit tests for this feature (I saw that you have test_de.rs, test_ser.rs and test_annotations.rs) ?

Where you currently have it is fine. Those files are a bit of a mess -- at some point we will need to tidy up and organize them in a more sensible way.

dtolnay avatar Nov 25 '18 01:11 dtolnay

Hi @dtolnay :smiley: , the code you tried with serde_json now work fine. I made some tests on my side (deserialization + serialization on complex enum) and It worked. Now I have to fix some tests which fails:

---- test_flatten_struct_enum stdout ----
thread 'test_flatten_struct_enum' panicked at 'tokens failed to deserialize: invalid type: string "insert_integer", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_enum_newtype stdout ----
thread 'test_flatten_enum_newtype' panicked at 'tokens failed to deserialize: invalid type: string "Q", expected field identifier', serde_test/src/assert.rs:193:19
note: Run with `RUST_BACKTRACE=1` for a backtrace.

---- test_flatten_internally_tagged stdout ----
thread 'test_flatten_internally_tagged' panicked at 'tokens failed to deserialize: invalid type: string "typeX", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_option stdout ----
thread 'test_flatten_option' panicked at 'tokens failed to deserialize: invalid type: string "inner1", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_map_twice stdout ----
thread 'test_flatten_map_twice' panicked at 'tokens failed to deserialize: invalid type: string "x", expected field identifier', serde_test/src/assert.rs:193:19

---- test_flatten_untagged_enum stdout ----
thread 'test_flatten_untagged_enum' panicked at 'tokens failed to deserialize: invalid type: string "a", expected field identifier', serde_test/src/assert.rs:193:19

---- test_lifetime_propagation_for_flatten stdout ----
thread 'test_lifetime_propagation_for_flatten' panicked at 'tokens failed to deserialize: invalid type: string "x", expected field identifier', serde_test/src/assert.rs:193:19

I do not understand what expected field identifier means in the error messages. Could you clarify this for me ?

My second point is on the behavior of the API. What have to be done when we have 2 variants with the same type and value ? e.g.: 2 * rename=true

#[derive(Deserialize, Debug)]
#[serde(tag = "success")]
enum Outcome {
    #[serde(rename = true)]  // first occurs
    A,
    #[serde(rename = true)] // secund occurs
    B,
}

For know, the build pass :disappointed: and the first variant is select all the time (A). If we allow only one rename=true where should I put this guard ?

NotBad4U avatar Dec 10 '18 16:12 NotBad4U

I do not understand what expected field identifier means in the error messages. Could you clarify this for me ?

In Serde, a field identifier is a value that can be used to distinguish fields. It's used for key-value based formats that might not keep the original order, like JSON.

If we allow only one rename=true where should I put this guard ?

The attribute-checking machinery resides in https://github.com/serde-rs/serde/blob/e1edb0282ad353fd5f4539d2456f3a0ad90f7e54/serde_derive/src/internals/attr.rs where Attr::set already handles duplicate attributes if the value field already contains one.

Adding new checks is done by adding new match arms in Container::from_ast, Variant::from_ast and/or Field::from_ast.

hcpl avatar Dec 10 '18 18:12 hcpl

So I don't understand why a structure like this one (from test_flatten_internally_tagged) with a flatten variant attributes encounter a problem with deserialization

---- test_flatten_internally_tagged stdout ----
thread 'test_flatten_internally_tagged' panicked at 'tokens failed to deserialize: invalid type: string "typeX", expected field identifier', serde_test/src/assert.rs:193:19
note: Run with `RUST_BACKTRACE=1` for a backtrace.
#[derive(Serialize, Deserialize, PartialEq, Debug)]
struct S {
    #[serde(flatten)]
    x: X,
    #[serde(flatten)]
    y: Y,
}

#[derive(Serialize, Deserialize, PartialEq, Debug)]
#[serde(tag = "typeX")]
enum X {
    A { a: i32 },
    B { b: i32 },
}

#[derive(Serialize, Deserialize, PartialEq, Debug)]
#[serde(tag = "typeY")]
enum Y {
    C { c: i32 },
    D { d: i32 },
}

This is the list of tokens expected:

assert_tokens(
     &s,
     &[
         Token::Map { len: None },
         Token::Str("typeX"),
         Token::Str("B"),
         Token::Str("b"),
         Token::I32(1),
         Token::Str("typeY"),
         Token::Str("D"),
         Token::Str("d"),
         Token::I32(2),
         Token::MapEnd,
     ],
);

This tokens list match with the ouput of value.serialize() (from assert_ser_tokens) but not the ouput of T::deserialize from (assert_de_tokens). How can I debug this further ? :smile:

NOTE: All the tests that do not pass are related to the flatten variant attributes.

NotBad4U avatar Dec 17 '18 10:12 NotBad4U

@hcpl would you be able to help get this reviewed and ready to land?

dtolnay avatar Jan 13 '19 21:01 dtolnay

This would be a handy feature! Any chance the PR could be revived?

kpozin avatar Sep 08 '19 05:09 kpozin

I need this quite badly (API I'm writing client for uses this pattern), badly enough to volunteer for updating this PR if it would mean it getting merged. Would just need guidance on what to do, since I don't have much in the way of practical Rust experience yet.

jaen avatar Sep 17 '19 21:09 jaen

@kpozin @jaen, I am still interested in landing this feature but I have been bottlenecked on review bandwidth for many months, and this feature has been relatively low value for how much review effort it demanded so I haven't been able to give it more attention.

Here are some things that would help drive this forward:

  • Get the PR rebased, passing tests, and rustfmt'd;
  • Assess whether it is possible for this PR to affect generated impls of any type that does not use bool or int renames; I would hope that this is not the case;
  • Assess whether the test suite is adequate or are there edge cases that are not covered;
  • A round of code review from someone who is not me but knows the Deserialize api, on whether this implementation roughly makes sense.

dtolnay avatar Oct 05 '19 01:10 dtolnay

@dtolnay I actually got the PR rebased (well, squashed and rebased, it was simpler that way) and compiling, but I had problems getting tests to work.

I ended up writing the impl manually instead, because I discovered #1221 will be blocking me anyway. I can publish the rebased code somewhere and/or try to fix the tests in my spare time (not much of that though, considering it's not blocking me at work for now) if that helps.

jaen avatar Oct 08 '19 13:10 jaen

@jaen that's awesome -- could you share your rebased code in a new PR? It doesn't have to be perfect.

dtolnay avatar Oct 08 '19 15:10 dtolnay

@dtolnay It's what I have right now – https://github.com/jaen/serde/pull/1/files. It compiles, but has some issues with deserialization tests – as I understand the deserializer produces a stream of tokens that are used to validate the deserializations and they do not match. These are the failing tests I didn't have time to debug yet.

jaen avatar Oct 09 '19 07:10 jaen

Thanks! I copied it over to https://github.com/serde-rs/serde/pull/1644 so we don't lose the work if you decide to delete your fork later.

dtolnay avatar Oct 09 '19 13:10 dtolnay

I spent a few hours doing this so I thought it may be useful to other people in the future since this PR has not received any modification for almost a year.

I generated the code for a tagged enum and adapted it to work with integers (it can also be modified to work with booleans). However I'm using two structs from private API ContentDeserializer and TaggedContentVisitor, to achieve this, should I worry it might break in a future serde patch? Is there a better way to do it? https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ace054282891ad2e26a184f6487dc88a

mbenoukaiss avatar Aug 04 '20 15:08 mbenoukaiss

I think this PR is obsolete in favor of #2056

kangalio avatar Sep 27 '22 11:09 kangalio