rust-csv Add support for `#[serde(flatten)]`

trafficstars

Hi!

We have a use case for serializing maps using your library.

Consider the following structure:

struct Inner {
    inner_a: i32,
    inner_b: i32,
}

struct Middle {
    #[serde(flatten)]
    inner: Inner,
    middle_a: i32,
    middle_b: i32,
}

struct Outer {
    #[serde(flatten)]
    middle: Middle,
    outer_a: i32,
    outer_b: i32,
}

initialized as

let outer = Outer {
    middle: Middle {
        inner: Inner {
            inner_a: 0,
            inner_b: 1,
        },
        middle_a: 2,
        middle_b: 3,
    },
    outer_a: 4,
    outer_b: 5,
};

The expected serialized data is

inner_a,inner_b,middle_a,middle_b,outer_a,outer_b
0,1,2,3,4,5

I.e. a completely flat structure. There is no limit to the depth/width or anything else using the current solution. This particular example is added as a unit test in the PR. Additionally, serialization inside the "flattened" structures works as expected - for example, we can achieve the desired behavior mentioned in https://github.com/BurntSushi/rust-csv/pull/197#issuecomment-741645650. We also avoid the workaround mentioned in https://github.com/BurntSushi/rust-csv/issues/98#issuecomment-439733007, as it is very tedious, especially if the structure is big and/or deep.

Serializing fields which are HashMaps will still lead to a runtime error, however serializing HashMaps which are marked by #[serde(flatten)] will result in a flat structure (as long as both the key and the value in the HashMap are scalars).

If there is any room for improvement and/or changes that are needed to be done -- please let me know, I'll be happy to update the PR.

Mar 03 '21 14:03 gootorov

cc @amrhassan, have a look if this covers your use cases. This works as expected for the use case you mentioned in https://github.com/BurntSushi/rust-csv/pull/197#issuecomment-741645650, but perhaps you have other use cases you could try this on.

Mar 04 '21 17:03 gootorov

@BurntSushi, have you had a chance to have a look at this? Is there anything stopping from merging?

Mar 17 '21 16:03 gootorov

@gootorov No, I haven't had a chance, sorry. The problem is that I'm short on time and serde changes to this library are incredibly complicated and usually have unintended effects. So in order me to accept this, I'd probably have to spend at least a couple of hours re-contextualizing serde semantics and making sure that the behavior we add here is something that we can live with going forward. If a mistake in the public API gets made, then we either have to live with it or release a csv 2, and I don't really want to release a csv 2 for at least a few more years (if that).

And on top of all of that, serde(flatten) is particularly complex.

Mar 17 '21 16:03 BurntSushi

I am really looking forward to have #[serde(flatten)] support added to this crate. And I really appreciate you taking a jab at it :) Though, for a robust solution and with pedantry aside, I think I should mention this:

I might be mistaken, but this doesn't seem to "add support for #[serde(flatten)]" as it would not simply work as expected. What I mean is that a Reader without headers would not be able to deserialize entries.

For example:

#[derive(Debug, serde::Serialize, serde::Deserialize)]
struct Inner {
    inner: u32,
}

#[derive(Debug, serde::Serialize, serde::Deserialize)]
struct Outer {
    #[serde(flatten)]
    inner: Inner,
    outer: u32,
}

let input = "0,2\n4,7";

let mut reader = csv::ReaderBuilder::new()
    .has_headers(false)
    .from_reader(input.as_bytes());

for r in reader.deserialize::<Outer>() {
    println!("{:?}", r);
}

The same would work with has_headers(true), though. Please, correct me if I'm wrong.

I know that you focused on the serializer part. Maybe, then, this PR should re-scope itself to reflect that? There's more to go for flatten support, IMO. And scoping it to what it is trying to achieve may get other contributors to chip in with the other facets that need to be tackled.

Oct 05 '21 17:10 m-lima

I encountered this today and would love to see this implemented, and I support re-scoping the PR to serializing only (and return an error on the non-header deserialize example)

Jan 15 '22 10:01 SuperCuber

I tried the version in the head of the branch, and it worked swimmingly for serialization (my only requirement).

Mar 25 '22 07:03 mhvelplund

Tried to use 1.1.6 with nested records but failed with value type error ("1" was in a string field), will try to check HEAD and get a test case later

Mar 25 '22 08:03 qrilka

I too tried the version at the head of this branch, and it worked perfectly for my use case. Furthermore, I couldn't have accomplished what I needed without it, because serialize_struct (which would otherwise work) requires the keys to be &'static str, and I compute my key names at runtime.

Given that this PR is 13 months old, it'd sure be nice to see it in the "true" crate.

Apr 05 '22 19:04 lukesneeringer

This works for our use case as well

May 19 '22 22:05 silverjam

https://github.com/qrilka/rust-csv/commit/01837248517dc28862d2e11afa7911a1745cb58d shows the problem we've seen with the current master, this results in

---- deserializer::tests::flatten_mix_types stdout ----
thread 'deserializer::tests::flatten_mix_types' panicked at 'called `Result::unwrap()` on an `Err` value: Error(Deserialize { pos: None, err: DeserializeError { field: None, kind: Message("invalid type: integer `1`, expected a string") } })', src/deserializer.rs:1254:58
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

May 20 '22 14:05 qrilka

rust-csv rust-csv copied to clipboard

Add support for `#[serde(flatten)]`

rust-csv
rust-csv copied to clipboard