serde-yaml
serde-yaml copied to clipboard
Incorrect error position inside tagged enums
use serde::Deserialize;
#[serde(tag = "Type")]
#[derive(Deserialize, Debug)]
pub enum Node {
One {
},
Two {
value1: u32,
value2: u32,
},
}
fn main() {
let text = "\
---
- Type: One
- Type: Two
value1: 1
value2: text
";
let nodes: Vec<Node> = match serde_yaml::from_str(text) {
Ok(v) => v,
Err(e) => {
eprintln!("{}", e);
return;
}
};
dbg!(nodes);
}
It prints:
invalid type: string "text", expected u32 at line 2 column 1
Which is completely incorrect. The line is 5 and column is 11.
Same issue for me, it seems to be caused by (tagged) enums.
Just hit this issue too, which was quite confusing
versions: rust: 1.55 serde: 1.0.130 serde_yaml: 0.8.21
PS: please ask if anything else needs to be provided
I also have hit this. Here is another test that shows it:
#[test]
fn test_tagged_location() {
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
struct S {
#[allow(dead_code)]
a: usize,
#[allow(dead_code)]
b: bool,
#[allow(dead_code)]
c: E,
}
#[derive(Debug, Deserialize)]
#[serde(tag = "type")]
pub enum E {
A {
e: String,
f: bool,
g: f32,
}
}
let yaml = indoc! {"
---
a: 2
b: true
c:
type: A
e: \"hello\"
f: fase
g: 1.0
"};
let expected =
"invalid type: string \"fase\", expected a boolean at line 7 column 6";
test_error::<S>(yaml, expected);
}
Output:
---- test_tagged_location stdout ----
thread 'test_tagged_location' panicked at 'assertion failed: `(left == right)`
left: `"invalid type: string \"fase\", expected a boolean at line 7 column 6"`,
right: `"invalid type: string \"fase\", expected a boolean at line 2 column 2"`',
I've done a little investigation. It appears that the problem is common for other formats. I did two almost identical tests for serde-yaml and serde-json and put panics at creation of the error to see the backtrace:
The test for serde-json
#[test]
fn test_enum_serde_tag() {
#[derive(Deserialize, PartialEq, Debug)]
#[serde(tag = "kind")]
enum E {
Circle { radius: u8 },
Dot { x: u8, y: u8 },
}
#[derive(Deserialize, PartialEq, Debug)]
struct Data {
figures: Vec<E>,
}
let source = r#"{ "figures": [ { "kind": "Dot", "y": 22, "x": true } ] }"#;
let deserialized = from_str::<Data>(source);
assert!(deserialized.is_err());
let err = deserialized.unwrap_err();
assert_eq!(1, err.line(), "{:?}", err);
assert_eq!(47, err.column(), "{:?}", err);
}
The actual value of the column is 54.
The test for serde-yaml
#[test]
fn test_enum_serde_tag() {
#[derive(Deserialize, PartialEq, Debug)]
#[serde(tag = "kind")]
enum E {
Circle { radius: u8 },
Dot { x: u8, y: u8 },
}
#[derive(Deserialize, PartialEq, Debug)]
struct Data {
figures: Vec<E>,
}
let yaml = indoc! {"
---
figures:
- kind: Dot
x: 21
y: true
"};
let deserialized = serde_yaml::from_str::<Data>(yaml);
assert!(deserialized.is_err());
let err = deserialized.unwrap_err();
let loc = err.location().unwrap();
assert_eq!(5, loc.line(), "{:?}", err);
assert_eq!(8, loc.column());
}
The actual value of the line is 3.
So let's take a look at the tracebacks.
The traceback for serde-json
thread 'test_enum_serde_tag' panicked at 'OMG!!, invalid type: boolean `true`, expected u8', src/error.rs:374:13
stack backtrace:
0: backtrace::backtrace::libunwind::trace
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
1: backtrace::backtrace::trace_unsynchronized
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
2: std::sys_common::backtrace::_print_fmt
at src/libstd/sys_common/backtrace.rs:78
3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
at src/libstd/sys_common/backtrace.rs:59
4: core::fmt::write
at src/libcore/fmt/mod.rs:1076
5: std::io::Write::write_fmt
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/io/mod.rs:1537
6: std::io::impls::<impl std::io::Write for alloc::boxed::Box<W>>::write_fmt
at src/libstd/io/impls.rs:176
7: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:62
8: std::sys_common::backtrace::print
at src/libstd/sys_common/backtrace.rs:49
9: std::panicking::default_hook::{{closure}}
at src/libstd/panicking.rs:198
10: std::panicking::default_hook
at src/libstd/panicking.rs:214
11: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:526
12: rust_begin_unwind
at src/libstd/panicking.rs:437
13: std::panicking::begin_panic_fmt
at src/libstd/panicking.rs:391
14: <serde_json::error::Error as serde::de::Error>::invalid_type
at src/error.rs:374
15: serde::__private::de::content::ContentDeserializer<E>::invalid_type
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1007
16: serde::__private::de::content::ContentDeserializer<E>::deserialize_integer
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1023
17: <serde::__private::de::content::ContentDeserializer<E> as serde::de::Deserializer>::deserialize_u8
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1156
18: serde::de::impls::<impl serde::de::Deserialize for u8>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/impls.rs:129
19: <core::marker::PhantomData<T> as serde::de::DeserializeSeed>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:787
20: <serde::de::value::MapDeserializer<I,E> as serde::de::MapAccess>::next_value_seed
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/value.rs:1181
21: serde::de::MapAccess::next_value
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1866
22: <&mut A as serde::de::MapAccess>::next_value
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1947
23: <test::test_enum_serde_tag::_::<impl serde::de::Deserialize for test::test_enum_serde_tag::E>::deserialize::__Visitor as serde::de::Visitor>::visit_map
at tests/test.rs:1377
24: serde::__private::de::content::visit_content_map
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1071
25: <serde::__private::de::content::ContentDeserializer<E> as serde::de::Deserializer>::deserialize_any
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1110
26: test::test_enum_serde_tag::_::<impl serde::de::Deserialize for test::test_enum_serde_tag::E>::deserialize
at tests/test.rs:1377
27: <core::marker::PhantomData<T> as serde::de::DeserializeSeed>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:787
28: <serde_json::de::SeqAccess<R> as serde::de::SeqAccess>::next_element_seed
at ./src/de.rs:1943
29: serde::de::SeqAccess::next_element
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1727
30: <serde::de::impls::<impl serde::de::Deserialize for alloc::vec::Vec<T>>::deserialize::VecVisitor<T> as serde::de::Visitor>::visit_seq
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/impls.rs:1038
31: <&mut serde_json::de::Deserializer<R> as serde::de::Deserializer>::deserialize_seq
at ./src/de.rs:1736
32: serde::de::impls::<impl serde::de::Deserialize for alloc::vec::Vec<T>>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/impls.rs:1049
33: <core::marker::PhantomData<T> as serde::de::DeserializeSeed>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:787
34: <serde_json::de::MapAccess<R> as serde::de::MapAccess>::next_value_seed
at ./src/de.rs:2002
35: serde::de::MapAccess::next_value
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1866
36: <test::test_enum_serde_tag::_::<impl serde::de::Deserialize for test::test_enum_serde_tag::Data>::deserialize::__Visitor as serde::de::Visitor>::visit_map
at tests/test.rs:1384
37: <&mut serde_json::de::Deserializer<R> as serde::de::Deserializer>::deserialize_struct
at ./src/de.rs:1835
38: test::test_enum_serde_tag::_::<impl serde::de::Deserialize for test::test_enum_serde_tag::Data>::deserialize
at tests/test.rs:1384
39: serde_json::de::from_trait
at ./src/de.rs:2412
40: serde_json::de::from_str
at ./src/de.rs:2613
41: test::test_enum_serde_tag
at tests/test.rs:1391
42: test::test_enum_serde_tag::{{closure}}
at tests/test.rs:1375
43: core::ops::function::FnOnce::call_once
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libcore/ops/function.rs:233
44: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/liballoc/boxed.rs:1081
45: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panic.rs:318
46: std::panicking::try::do_call
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panicking.rs:348
47: std::panicking::try
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panicking.rs:325
48: std::panic::catch_unwind
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panic.rs:394
49: test::run_test_in_process
at src/libtest/lib.rs:541
50: test::run_test::run_test_inner::{{closure}}
at src/libtest/lib.rs:450
The traceback for serde-yaml
thread 'test_enum_serde_tag' panicked at 'OMG!! invalid type: boolean `true`, expected u8', src/error.rs:190:9
stack backtrace:
0: backtrace::backtrace::libunwind::trace
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
1: backtrace::backtrace::trace_unsynchronized
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
2: std::sys_common::backtrace::_print_fmt
at src/libstd/sys_common/backtrace.rs:78
3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
at src/libstd/sys_common/backtrace.rs:59
4: core::fmt::write
at src/libcore/fmt/mod.rs:1076
5: std::io::Write::write_fmt
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/io/mod.rs:1537
6: std::io::impls::<impl std::io::Write for alloc::boxed::Box<W>>::write_fmt
at src/libstd/io/impls.rs:176
7: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:62
8: std::sys_common::backtrace::print
at src/libstd/sys_common/backtrace.rs:49
9: std::panicking::default_hook::{{closure}}
at src/libstd/panicking.rs:198
10: std::panicking::default_hook
at src/libstd/panicking.rs:214
11: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:526
12: rust_begin_unwind
at src/libstd/panicking.rs:437
13: std::panicking::begin_panic_fmt
at src/libstd/panicking.rs:391
14: <serde_yaml::error::Error as serde::de::Error>::custom
at src/error.rs:190
15: serde::de::Error::invalid_type
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:209
16: serde::__private::de::content::ContentDeserializer<E>::invalid_type
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1007
17: serde::__private::de::content::ContentDeserializer<E>::deserialize_integer
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1023
18: <serde::__private::de::content::ContentDeserializer<E> as serde::de::Deserializer>::deserialize_u8
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1156
19: serde::de::impls::<impl serde::de::Deserialize for u8>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/impls.rs:129
20: <core::marker::PhantomData<T> as serde::de::DeserializeSeed>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:787
21: <serde::de::value::MapDeserializer<I,E> as serde::de::MapAccess>::next_value_seed
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/value.rs:1181
22: serde::de::MapAccess::next_value
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1866
23: <&mut A as serde::de::MapAccess>::next_value
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1947
24: <test_de::test_enum_serde_tag::_::<impl serde::de::Deserialize for test_de::test_enum_serde_tag::E>::deserialize::__Visitor as serde::de::Visitor>::visit_map
at tests/test_de.rs:183
25: serde::__private::de::content::visit_content_map
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1071
26: <serde::__private::de::content::ContentDeserializer<E> as serde::de::Deserializer>::deserialize_any
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/private/de.rs:1110
27: test_de::test_enum_serde_tag::_::<impl serde::de::Deserialize for test_de::test_enum_serde_tag::E>::deserialize
at tests/test_de.rs:183
28: <core::marker::PhantomData<T> as serde::de::DeserializeSeed>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:787
29: <serde_yaml::de::SeqAccess as serde::de::SeqAccess>::next_element_seed
at ./src/de.rs:754
30: serde::de::SeqAccess::next_element
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1727
31: <&mut A as serde::de::SeqAccess>::next_element
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1756
32: <serde::de::impls::<impl serde::de::Deserialize for alloc::vec::Vec<T>>::deserialize::VecVisitor<T> as serde::de::Visitor>::visit_seq
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/impls.rs:1038
33: serde_yaml::de::DeserializerFromEvents::visit_sequence::{{closure}}
at ./src/de.rs:589
34: serde_yaml::de::DeserializerFromEvents::recursion_check
at ./src/de.rs:679
35: serde_yaml::de::DeserializerFromEvents::visit_sequence
at ./src/de.rs:587
36: <&mut serde_yaml::de::DeserializerFromEvents as serde::de::Deserializer>::deserialize_seq
at ./src/de.rs:1331
37: serde::de::impls::<impl serde::de::Deserialize for alloc::vec::Vec<T>>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/impls.rs:1049
38: <core::marker::PhantomData<T> as serde::de::DeserializeSeed>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:787
39: <serde_yaml::de::MapAccess as serde::de::MapAccess>::next_value_seed
at ./src/de.rs:818
40: serde::de::MapAccess::next_value
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1866
41: <&mut A as serde::de::MapAccess>::next_value
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:1947
42: <test_de::test_enum_serde_tag::_::<impl serde::de::Deserialize for test_de::test_enum_serde_tag::Data>::deserialize::__Visitor as serde::de::Visitor>::visit_map
at tests/test_de.rs:190
43: serde_yaml::de::DeserializerFromEvents::visit_mapping::{{closure}}
at ./src/de.rs:607
44: serde_yaml::de::DeserializerFromEvents::recursion_check
at ./src/de.rs:679
45: serde_yaml::de::DeserializerFromEvents::visit_mapping
at ./src/de.rs:601
46: <&mut serde_yaml::de::DeserializerFromEvents as serde::de::Deserializer>::deserialize_struct
at ./src/de.rs:1385
47: <serde_yaml::de::Deserializer as serde::de::Deserializer>::deserialize_struct::{{closure}}
at ./src/de.rs:424
48: serde_yaml::de::Deserializer::de
at ./src/de.rs:114
49: <serde_yaml::de::Deserializer as serde::de::Deserializer>::deserialize_struct
at ./src/de.rs:424
50: test_de::test_enum_serde_tag::_::<impl serde::de::Deserialize for test_de::test_enum_serde_tag::Data>::deserialize
at tests/test_de.rs:190
51: <core::marker::PhantomData<T> as serde::de::DeserializeSeed>::deserialize
at /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/de/mod.rs:787
52: serde_yaml::de::from_str_seed
at ./src/de.rs:1496
53: serde_yaml::de::from_str
at ./src/de.rs:1477
54: test_de::test_enum_serde_tag
at tests/test_de.rs:210
55: test_de::test_enum_serde_tag::{{closure}}
at tests/test_de.rs:182
56: core::ops::function::FnOnce::call_once
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libcore/ops/function.rs:233
57: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/liballoc/boxed.rs:1081
58: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panic.rs:318
59: std::panicking::try::do_call
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panicking.rs:348
60: std::panicking::try
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panicking.rs:325
61: std::panic::catch_unwind
at /rustc/04488afe34512aa4c33566eb16d8c912a3ae04f9/src/libstd/panic.rs:394
62: test::run_test_in_process
at src/libtest/lib.rs:541
63: test::run_test::run_test_inner::{{closure}}
at src/libtest/lib.rs:450
The lines are slightly different because I've left some comment, but you
can find the places in the code by the names of the backtrace frames.
From both of backtraces you can see that the last point where an error
is able to get the location in the file is in deserialize_seq
31: <&mut serde_json::de::Deserializer<R> as serde::de::Deserializer>::deserialize_seq
36: <&mut serde_yaml::de::DeserializerFromEvents as serde::de::Deserializer>::deserialize_seq
Further code visits internals of elements and delegates the deserialization of them to
serde::__private::de::content::ContentDeserializer
I didn't dive so deep, so I can't say what is the entity for. By this
moment I have an idea to handle an error in MapAccess
and SeqAccess
.
They know about marker which carries the location, but I don't know if
it's right design solution. Any hints are appreciated.
heya all, I've been investigating this, and have gotten this far:
- For internally tagged enums,
serde
will use thisContentDeserializer
, in particularvisit_content_map
. - For the example in the first post,
serde_yaml
will useDeserializerFromEvents::deserialize_any
,visit_scalar
on thetext
value in the yaml to deserialize. - Though it should eventually be a
u32
, serde needs to first store the entire map in memory so that it can look for the internal tag. - For this to work,
serde_yaml
deserializes thetext
as a str invisit_untagged_scalar
, andserde
stores it as aContent::Str
inTaggedContentVisitor::visit_map
-- thev
there isContent::Str("text")
. - Note that the
libyaml::Mark
containing the correct position information oftext
is actually available inDeserializerFromEvents::deserialize_any:1201
(as seen in step 2). - However, since
serde
needs to save all values and later check for the internal tag, we have lost that position information when we store just the "successfully" parsed str. - Later on, when
serde
figures out thatvalue2
should be au32
, it eventually callsContentDeserializer::deserialize_integer
, which fails becausecontent
is aContent::Str
(as seen in step 4). - Then there is no way for
serde_yaml
to correctly augment the error with the position, andDeserializerFromEvents::deserialize_seq
only has theMark
(i.e. position) of the data that surrounds the internally tagged enum.
I suspect fixing this may need some changes in serde
to store the position of each value in TaggedContentVisitor::visit_map
's vec
, and include it in Error::custom
?
Though that looks like a breaking change, perhaps there's another way.
Investigation notes:
- A new
DeserializerFromEvents
is instantiated to deserialize the inner map. - I've experimented with passing through the
Mark
of the outerDeserializerFromEvents
to the inner one, but can't get the information through (it won't get past step 4). -
serde::de::Error
is a trait, andError::custom<T: Display>
is generic. It is possible to misuse the display implementation to include theMark
information, and we parse it back out.