serde icon indicating copy to clipboard operation
serde copied to clipboard

Struct with named fields can be deserialized from sequence

Open ghost opened this issue 6 years ago • 4 comments

When I derive Deserialize for a struct containing named fields, I expect that in any format that has a way of expressing named fields (i.e. pretty much every ordinary format except Bincode), it will always be expressed as a map containing named fields. To my surprise, it can also be deserialized from a sequence of field values without field names, even in formats where this is not correct, such as JSON and YAML.

Accepting invalid data is a security risk as well as risk of causing people to rely on an unintended implementation details. This also risks having different implementations that differ in what data they accept, which is also a security risk. Further it bloats programs with visitor methods for parsing invalid data, which would otherwise automatically be removed by the compiler.

In this example, I'd like deserialization to fail, but it incorrectly parses invalid data into a structure. (Playground)

use serde_derive::Deserialize;

#[derive(Debug, Deserialize)]
struct Person {
    first_name: String,
    last_name: String,
}

fn main() {
    eprintln!("{:#?}", serde_json::from_str::<Person>(r#"["John","Doe"]"#));
}

Maybe this is due to implementations of various formats not taking proper care in what data they accept after deserialize_struct is called, or maybe it is because of Serde itself having an ambiguous purpose of the visit_seq method, and there needs to be a different visitor method that is used only for formats that don't use field names (e.g. Bincode), with a default implementation that delegates to visit_seq. Deriving Deserialize for structs would then implement this method, but not the visit_seq method which also applies to JSON sequences.

ghost avatar Jul 27 '19 09:07 ghost

Do we have any solution for this yet? I also landed into this problem today where by my json is getting wrongly deserialized.

I have 2 types of values in a Container enum. The first is a struct containing String as field and the second is a Vector of String values.

pub enum Container { String(Wrapper), StringList(Vec<String>), }

pub struct Wrapper { value: String }

While deserializing json!(["abc"]) I get result as String(Wrapper { value: "abc" }) instead of StringList(["abc"])

Here is the code link: Playground

nupurbaghel avatar Jul 31 '19 18:07 nupurbaghel

Would a #[serde(strict)] attribute be an acceptable solution for you? I imagine it would omit all the fallback implementations from the Visitor implentation the macro generates.

Fwiw, there's nothing stopping you from writing a Deserialize implementation that does this yourself: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=dc9fbffd94d47da2153a6308a30f5c88

Plecra avatar May 12 '20 12:05 Plecra

Is there any way to enforce this without giving up #[derive(Deserialize)] at the present moment? What would it take to implement a #[serde(strict)] attribute like that?

indiv0 avatar May 15 '22 05:05 indiv0

I implemented the serde(strict) attribute as I interpreted it: #2639 The naming is intentionally left undecided. I would appreciate some feedback :)

Toromyx avatar Oct 28 '23 00:10 Toromyx