serde
serde copied to clipboard
Tagged enums should support #[serde(other)]
I think tagged enums should also allow #[serde(other)]
to mark a enum field as default. This will make the tagged enums much more useful.
Or is there already a way to do this that I don know?
Could you give some example of how you would expect this to work? Please show the enum definition, the input data, and the output value you expect to get from deserializing.
Say I have an enum
enum Target {
A(TargetA),
B(TargetB),
Others(TargetOthers)
}
When deserializing, { "tag": "A" }
will be deserialized to Target::A
, { "tag": "blahblah" }
should be deserialized to Target::Others
As the deserialization input could be indeed anything, it would be nice to have a default (or fallback) choice.
Cool, I am on board with this. I would love a PR to implement it!
This would be really great for reaching feature-parity with protocol buffers. In proto3 Java, unknown variants deserialize to a special UNRECOGNIZED
variant. This allows new variants to be added backwards-compatibly. Without this feature, it's pretty dangerous to use enums at all in more complex distributed systems.
Example
enum TrafficClass {
#[serde(other)]
Unknown,
Undesired,
BestEffort,
Sensitive,
}
In the future I might want to add TrafficClass::Critical
, in which case I wouldn't want to break existing systems that haven't been redeployed. In this case those systems would see TrafficClass::Unknown
, and they could then decide to handle that however they need to -- maybe defaulting to TrafficClass::BestEffort
, or maybe replying with an application-level error that the new traffic class isn't supported yet.
Open Questions
I would assume the most sensible thing would be to only allow this annotation on data-less variants; can you think of a way it would work otherwise? I don't think it'd be wise to try to deserialize the unknown tag's data into some other variant's data.
I would assume the most sensible thing would be to only allow this annotation on data-less variants
I don't know if it makes sense for all deserializers, but for JSON at least, it would be nice to be able to capture either the raw JSON (as a string), or better yet, the serde_json::value::Value
not-strongly-typed but still deserialized data.
My use case is something like this:
#[serde(rename_all = "snake_case")]
pub enum FieldType {
Noop,
Hash,
MaskHost,
MaskAgent,
DropField,
Plugin(String),
}
Where, when parsing JSON, the unrecognised variant would be parsed as Plugin
, and the text would be the value of the field.
Also, now that I think about it, a stretch goal would be as parsing with many unrecognised variants, deciding between them using similar, structure and type-based logic as #[serde(untagged)]
parsing.
Any news on that? This would be extremely useful. I have needed that for such a long time already.
We are using this workaround until it's supported
#[derive(Serialize, Deserialize, Debug)]
pub enum FieldKind {
date,
email,
menu,
#[serde(skip_deserializing)]
unknown,
}
fn deserialize_field_kind<'de, D>(deserializer: D) -> Result<FieldKind, D::Error>
where
D: Deserializer<'de>,
{
Ok(FieldKind::deserialize(deserializer).unwrap_or(FieldKind::Unknown))
}
#[derive(Serialize, Deserialize, Debug)]
pub struct Field {
#[serde(deserialize_with = "deserialize_field_kind")]
kind: FieldKind,
}
hope that helps
@galich thanks! I'd try implementing it in serde-aux crate.
@dtolnay: I might have some bandwidth to implement this.
After a quick look through the serde source, some questions:
-
There's already a
serde(other)
from cdfd4455288930ff5cc43033b9d8795758f49563 that's minimally documented. Could you explain what it's used for along with whatfield_identifier
is for? https://github.com/serde-rs/serde-rs.github.io/issues/45 -
Since
serde(other)
is taken, I propose usingserde(open)
to indicate an open union/enum:
enum E {
A,
B(u32),
#[serde(open)]
C,
}
Are you onboard with that name?
-
A field identifier is the
enum Field
shown in https://serde.rs/deserialize-struct.html i.e. some Deserialize impl for deserializing keys in a map when deserializing that map as a struct. When there is another
variant it matches any other map key that is not matched to a field of the struct being deserialized. This is the default behavior of derived Deserialize impls for structs with named fields when not usingserde(deny_unknown_fields)
. -
Use
serde(other)
because there is no conflict, the attribute means the same thing in this context as it does in the existing implemented context.
I am fighting this too.
I have:
#[derive(Debug, Eq, PartialEq, Clone, Deserialize, Serialize)]
#[serde(tag = "kind", rename_all="lowercase" )]
pub enum DocumentChangeOperation {
Create(CreateFile),
Rename(RenameFile),
Delete(DeleteFile),
#[serde(other)]
Edit(TextDocumentEdit),
}
Where the Edit variant doesn't actually have a kind
tag so it fails to deserialize properly.
I am also interested in this feature. I would like this to choose a common default variant when no tag
field is available, to only require users to specify a type when using a non-common variant.
I would want the following:
[[binds]]
address = "0.0.0.0:80"
[[binds]]
type = "tls"
address = "0.0.0.0:443"
cert = "/some/file/path"
#[derive(Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum Bind {
#[serde(other)]
Http {
address: SocketAddr,
},
#[serde(alias = "tls")]
HttpsTls {
address: SocketAddr,
cert: PathBuf,
}
}
but with #[serde(other)]
it doesn't currently compile with the error message: #[serde(other)] must be on a unit variant
Is there any progress on this issue?
Hmm, maybe #[serde(tag = "type", rename_all = "snake_case", default = "http")]
would be a better location for that?
any suggestion how I would do workaround for a Vec<FieldKind>
?
I tried:
fn deserialize_vec_field_kind<'de, D>(deserializer: D) -> Result<Vec<FieldKind>, D::Error>
where
D: Deserializer<'de>,
{
struct VecFieldKind(PhantomData<fn() -> Vec<FieldKind>>);
impl<'de> Visitor<'de> for VecFieldKind {
type Value = Vec<FieldKind>;
fn expecting(&self, formatter: &mut Formatter) -> fmt::Result {
formatter.write_str("Array of field kind.")
}
fn visit_seq<S>(self, mut seq: S) -> Result<Vec<FieldKind>, S::Error>
where
S: SeqAccess<'de>,
{
let mut field_kinds: Vec<FieldKind> = Vec::new();
while let element = seq.next_element().unwrap_or(Some(FieldKind::Unknown)) {
field_kinds.push(element.unwrap())
}
Ok(field_kinds)
}
}
deserializer.deserialize_seq(VecFieldKind(PhantomData))
}
but I end up in an infinite loop.
while let element = seq.next_element().unwrap_or(Some(FieldKind::Unknown)) {
field_kinds.push(element.unwrap())
}
You should match on the error state, not just the Option
-ness. Something like this structure is what you want:
loop {
match seq.next_element() {
Ok(Some(element)) => field_kinds.push(element),
Ok(None) => break, // end of sequence
Err(_) => field_kinds.push(FieldKind::Unknown),
}
}
My bad didn't read that documentation correctly, but even if I add a break on None
I still have an infinite loop
Hmm. Well, something is making next_element
not end on its own. I suggest debugging what all of the next_element
returns are. Something has to change when it gets to the end of the element list (assuming there is one, but then, there's your problem :) ).
What is this issue still waiting on? It appears that the original request works today:
#[derive(Deserialize, PartialEq)]
#[serde(tag = "tag")]
enum Target {
A(()),
B(()),
#[serde(other)]
Others
}
fn main() {
assert_eq!(Target::Others, from_str::<Target>(r#"{ "tag": "blablah" }"#).unwrap());
}
While we could allow the Others
variant to hold data in the future, I think this issue has been addressed.
It might need an issue on its own, but this is still not working for externally tagged enums:
#[derive(Deserialize, PartialEq, Debug)]
enum Target {
A(String),
B(()),
#[serde(other)]
Others
}
fn main() {
assert_eq!(Target::A(String::from("Somestring")), from_str::<Target>(r#"{ "A": "Somestring" }"#).unwrap());
assert_eq!(Target::Others, from_str::<Target>(r#"{ "blablah": "blablah" }"#).unwrap());
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=68369e9f98d7f85961f5919dc4ff826e
Is there any way to get this fixed for Externally tagged enums too?
While we could allow the Others variant to hold data in the future
That's one of the most important points in this issue.
Here's a workaround to capture unknown enum variants without needing serde(with)
annotations
use std::str::FromStr;
use serde::Deserialize;
use serde::de::{value, Deserializer, IntoDeserializer};
#[derive(Deserialize, Debug)]
#[serde(rename_all = "kebab-case")]
#[serde(remote = "MaybeUnknown")]
enum MaybeUnknown {
Foo,
Bar,
#[serde(skip_deserializing)]
Unknown(String),
}
impl FromStr for MaybeUnknown {
type Err = value::Error;
fn from_str(s: &str) -> Result<Self, Self::Err> {
Self::deserialize(s.into_deserializer())
}
}
impl<'de> Deserialize<'de> for MaybeUnknown {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>
{
let s = String::deserialize(deserializer)?;
let deserialized = Self::from_str(&s).unwrap_or_else(|_| {
tracing::warn!(message = "unknown-variant", variant = ?s);
Self::Unknown(s)
});
Ok(deserialized)
}
}
#
I'm just a newbie but it seems to me https://github.com/serde-rs/serde/issues/912#issuecomment-803097289 will use the default enum even when the input is clearly tagged for another branch of the enum, and just failed to parse, e.g. due to trying to deserialize a string into an integer.
Instead of talking about workarounds: what would be a viable path forward to get this feature into serde itself?
I don't think there is one. If you have a look at unmerged PRs and activitiy on those and on recently-closed ones you will see that that serde is effectively feature-frozen.
Another featureful workaround: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=aabb4e581775d2f3b82d4725ab85448f
In my case, I'm hitting this same issue with an Option<SomeEnumType>
where the deserialized value doesn't match an enum constant. It'd be nice to tell Serde just to default to None
.
I'm hitting up against this and have tried:
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "kebab-case")]
pub enum ValueEntry {
A(Vec<String>),
B(Vec<String>),
// Provide fallback in case we're deserialising from a newer release
// of a subcommand that include json that we don't recognise
//
// 'other' means that we use this variant if none of the others match.
// 'ignore_contents' is used as 'other' only works if the entry deserializes to
// a unit value so we override the deserialization to ignore everything and
// return a unit value.
#[serde(other, deserialize_with = "ignore_contents")]
Unknown,
}
fn ignore_contents<'de, D>(deserializer: D) -> Result<(), D::Error>
where
D: Deserializer<'de>,
{
// Ignore any content at this part of the json structure
deserializer.deserialize_ignored_any(serde::de::IgnoredAny);
// Return unit as our 'Unknown' variant has no args
Ok(())
}
I am new-ish to rust and serde so I've no reason to have confidence in this but it seems promising from some early tests so I thought I'd record it before I forget. I only have one kind of entry to test it on at the moment so it might fail for other variations.
IgnoredAny
only works for structured input like JSON where it is possible to tell where entries end without needing to know the exact details and expected content of the values themselves.
In my case, I'm getting this problem deserializing to HashMap<MyEnum, f64>
. Ideally, if there's not a matching variant for MyEnum
, it just wouldn't get loaded into the map.