bevy_reflect: Add `ReflectDeserializerProcessor`
NOTE: Also see https://github.com/bevyengine/bevy/pull/15548 for the serializer equivalent
Objective
The current ReflectDeserializer and TypedReflectDeserializer use the TypeRegistration and/or ReflectDeserialize of a given type in order to determine how to deserialize a value of that type. However, there is currently no way to statefully override deserialization of a given type when using these two deserializers - that is, to have some local data in the same scope as the ReflectDeserializer, and make use of that data when deserializing.
The motivating use case for this came up when working on bevy_animation_graph, when loading an animation graph asset. The AnimationGraph stores Vec<Box<dyn NodeLike>>s which we have to load in. Those Box<dyn NodeLike>s may store Handles to e.g. Handle<AnimationClip>. I want to trigger a load_context.load() for that handle when it's loaded.
#[derive(Reflect)]
struct Animation {
clips: Vec<Handle<AnimationClip>>,
}
(
clips: [
"animation_clips/walk.animclip.ron",
"animation_clips/run.animclip.ron",
"animation_clips/jump.animclip.ron",
],
)
Currently, if this were deserialized from an asset loader, this would be deserialized as a vec of Handle::default()s, which isn't useful since we also need to load_context.load() those handles for them to be used. With this processor field, a processor can detect when Handle<T>s are being loaded, then actually load them in.
Solution
trait ReflectDeserializerProcessor {
fn try_deserialize<'de, D>(
&mut self,
registration: &TypeRegistration,
deserializer: D,
) -> Result<Result<Box<dyn PartialReflect>, D>, D::Error>
where
D: serde::Deserializer<'de>;
}
- pub struct ReflectDeserializer<'a> {
+ pub struct ReflectDeserializer<'a, P = ()> { // also for ReflectTypedDeserializer
registry: &'a TypeRegistry,
+ processor: Option<&'a mut P>,
}
impl<'a, P: ReflectDeserializerProcessor> ReflectDeserializer<'a, P> { // also for ReflectTypedDeserializer
pub fn with_processor(registry: &'a TypeRegistry, processor: &'a mut P) -> Self {
Self {
registry,
processor: Some(processor),
}
}
}
This does not touch the existing fn news.
This processor field is also added to all internal visitor structs.
When TypedReflectDeserializer runs, it will first try to deserialize a value of this type by passing the TypeRegistration and deserializer to the processor, and fallback to the default logic. This processor runs the earliest, and takes priority over all other deserialization logic.
Testing
Added unit tests to bevy_reflect::serde::de. Also using almost exactly the same implementation in my fork of bevy_animation_graph.
Migration Guide
(Since I added P = (), I don't think this is actually a breaking change anymore, but I'll leave this in)
bevy_reflect's ReflectDeserializer and TypedReflectDeserializer now take a ReflectDeserializerProcessor as the type parameter P, which allows you to customize deserialization for specific types when they are found. However, the rest of the API surface (new) remains the same.
Original implementation
Add ReflectDeserializerProcessor:
struct ReflectDeserializerProcessor {
pub can_deserialize: Box<dyn FnMut(&TypeRegistration) -> bool + 'p>,
pub deserialize: Box<
dyn FnMut(
&TypeRegistration,
&mut dyn erased_serde::Deserializer,
) -> Result<Box<dyn PartialReflect>, erased_serde::Error>
+ 'p,
}
Along with ReflectDeserializer::new_with_processor and TypedReflectDeserializer::new_with_processor. This does not touch the public API of the existing new fns.
This is stored as an Option<&mut ReflectDeserializerProcessor> on the deserializer and any of the private -Visitor structs, and when we attempt to deserialize a value, we first pass it through this processor.
Also added a very comprehensive doc test to ReflectDeserializerProcessor, which is actually a scaled down version of the code for the bevy_animation_graph loader. This should give users a good motivating example for when and why to use this feature.
Why Box<dyn ..>?
When I originally implemented this, I added a type parameter to ReflectDeserializer to determine the processor used, with () being "no processor". However when using this, I kept running into rustc errors where it failed to validate certain type bounds and led to overflows. I then switched to a dynamic dispatch approach.
The dynamic dispatch should not be that expensive, nor should it be a performance regression, since it's only used if there is Some processor. (Note: I have not benchmarked this, I am just speculating.) Also, it means that we don't infect the rest of the code with an extra type parameter, which is nicer to maintain.
Why the 'p on ReflectDeserializerProcessor<'p>?
Without a lifetime here, the Boxes would automatically become Box<dyn FnMut(..) + 'static>. This makes them practically useless, since any local data you would want to pass in must then be 'static. In the motivating example, you couldn't pass in that &mut LoadContext to the function.
This means that the 'p infects the rest of the Visitor types, but this is acceptable IMO. This PR also elides the lifetimes in the impl<'de> Visitor<'de> for -Visitor blocks where possible.
Future possibilities
I think it's technically possible to turn the processor into a trait, and make the deserializers generic over that trait. This would also open the door to an API like:
type Seed;
fn seed_deserialize(&mut self, r: &TypeRegistration) -> Option<Self::Seed>;
fn deserialize(&mut self, r: &TypeRegistration, d: &mut dyn erased_serde::Deserializer, s: Self::Seed) -> ...;
A similar processor system should also be added to the serialization side, but that's for another PR. Ideally, both PRs will be in the same release, since one isn't very useful without the other.
Testing
Added unit tests to bevy_reflect::serde::de. Also using almost exactly the same implementation in my fork of bevy_animation_graph.
Migration Guide
bevy_reflect's ReflectDeserializer and TypedReflectDeserializer now take a second lifetime parameter 'p for storing the ReflectDeserializerProcessor field lifetimes. However, the rest of the API surface (new) remains the same, so if you are not storing these deserializers or referring to them with lifetimes, you should not have to make any changes.
Thinking about the current functions that the processor stores, maybe having two separate can_deserialize and deserialize is not a good idea.
Originally I wrote it as one function, but had issues where I would have to pass ownership of the deserializer into the processor, which means I couldn't use it for default deserialization logic later. I then split this into two, and if the first passes, only then does ownership of the deserializer get moved to the processor (and at that point, we ourselves don't need it anymore for default deser logic).
It might be possible to change this to just use one function, which would help ergonomics. Maybe the processor can pass ownership of the deserializer back, if it refuses to deserialize this value? Although then we'd be getting back an erased deserializer, which likely causes problems.
Marking this as C-Breaking-Change due to the added lifetimes in the public API.
It looks like your PR is a breaking change, but you didn't provide a migration guide.
Could you add some context on what users should update when this change get released in a new version of Bevy?
It will be used to help writing the migration guide for the version. Putting it after a ## Migration Guide will help it get automatically picked up by our tooling.
Thinking even harder about the API, what if should_deserialize returned a Box<dyn FnOnce(..)> which performed the actual deserialization?
Benefits:
- any state you compute in
should_deserializecan be passed immediately to the deserialization fn - if you never make a deserializer fn, you never have to define what it does (I guess this isn't a big deal since you can just
unimplemented!()) - if you have a processor which does different things for different type registrations, it makes it much cleaner to implement the logic. just return two different fns instead of running the check logic again
- this is the most compelling reason to do it this way
Drawbacks:
- you would have to allocate a new box for this fn - if your processor is valid for a lot of fields, this might be bad for performance
- although realistically this is reflect deserializer code which I intend to be used mainly in asset loading - dynamism and flexibility is more important imo
After a lot more thinking, I've decided to just use static dispatch for the processor, and add a P type parameter:
- This is already a breaking change with the
'plifetime - The
'plifetime was already infecting the rest of the visitor types, so now it's swapped out for aP - This means we don't have to erase the deserializer when we pass it to the processor
- The processor can compute state when figuring out if it should actually deserialize or not, and then use that state in the actual deserialization
- A tiny performance improvement probably? Since we're not using dyn dispatch
- The processor trait is implemented for
()when you don't want any processor
Ok, I'm finally in a state where I'm a lot more happy with the public API of the processor. This took a lot of wrangling with the type system, since I kept running into infinite recursion like &mut &mut &mut &mut... P. In summary:
-
ReflectDeserializerProcessoronly has a singlefn try_deserializewhich either returns a value, ownership of theD: Deserializer, or an error- Cleans up the
should_deserializevsdeserializeissues that I complained about before
- Cleans up the
- The
ReflectDeserializer,TypedReflectDeserializerand all internal visitor classes take a type parameterP - This is used in
processor: Option<&'a mut P>, reusing the existing lifetime'aalready used for the&'a TypeRegistry(you'll see why we need anOptionand can't ownP...) - We have a no-op
impl ReflectDeserializerProcessor for ()- The actual logic of this doesn't matter, because whenever we have
P = (), theprocessorfield will beNone. It could evenpanic!().
- The actual logic of this doesn't matter, because whenever we have
-
fn new(..)returns a deserializer withP = (), but crucially,processor: None- So that when a user uses the standard
newfn, they don't even have to think about thePtype at all
- So that when a user uses the standard
-
fn with_processor(..)(renamed fromnew_with_processor) takes in aprocessor: &'a mut P
So why can't we use processor: P and take ownership? Let's assume we did...
- The visitor structs may construct new
TypedReflectDeserializers when traversing - e.g. aListVisitorwill need to construct multiple typed deserializers to deserialize every value under it. - This means we need some way to hand out the same value of
Pmultiple times - that's a borrow, so let's just pass inprocessor: &mut self.processor - Then we
impl<T: Processor + ?Sized> Processor for &mut T - But what if
Pis itself a&mut Q? Now we've made a&mut &mut Q - Now since the new
TypedReflectDeserializermight instantiate more visitor types, which might instantiate more deserializers, rustc overflows when instantiating types -&mut &mut &mut &mut &mut ... - Therefore, if we store a
&'a mut P, we can just reborrow when constructing the child deserializer, and passprocessor: self.processor. We're still passing in a&mut P, so no rustc overflows.
What about that Option?
- Since the deserializer stores a
&mut P, what if we don't want a processor? I.e., we want to give itP = ()which has a no-op impl? - Then
fn new()wouldn't work:
pub fn new(registration: &'a TypeRegistration, registry: &'a TypeRegistry) -> Self {
Self {
registration,
registry,
processor: &mut (), // error: this temporary `()` will be dropped
}
}
- So,
fn newjust setsprocessor: None - Note that the reborrow
processor: self.processorfrom before won't work withOption. Instead, we useprocessor: self.processor.as_deref_mut()to reborrow theOption<&mut P>.
Benefits:
-
try_deserializeis a single fn and much cleaner now - No
erased_serde::Deserializer
Drawbacks:
- More monomorphisations of the deserializer and visitor types, bigger binary size
The generated examples/README.md is out of sync with the example metadata in Cargo.toml or the example readme template. Please run cargo run -p build-templated-pages -- update examples to update it, and commit the file change.
Note about the landscape.png: I don't know if I need to add a license for this asset or something, I noticed that some assets have a License.txt or similar, but this is my own photo that I took, so I guess I'll license it under CC0 if that's suitable for Bevy?
Also Nikko is a really nice place :)