RFC: Add Support for Symfony JsonStreamer
Summary
This RFC proposes adding support for Symfony's new JsonStreamer component as an alternative serialization strategy in Elastically. The JsonStreamer component, introduced in Symfony 7.3, encodes JSON significantly faster than the traditional Serializer component, which matters most when serializing large volumes of documents during bulk indexing operations.
Pros:
- Serialization is significantly faster with JsonStreamer (up to 10x faster on a real-life project)
Cons:
- The component is experimental, so there is no BC promise
- Adds a bit more complexity, as Elastically needs to decide which serializer to use for each model
Proposed Solution
Design Principles
- Backward compatible: Existing code continues to work without changes
- Opt-in: Users explicitly enable JsonStreamer with an attribute
- Coexistence: Both Serializer and JsonStreamer can be used simultaneously for different indexes
- Simple migration path: Minimal changes required to adopt JsonStreamer
Implementation Overview
The easiest way would be to check for the #[JsonStreamable] attribute on the model class (caching the result per class) and use JsonStreamer when it is present, or the regular serializer otherwise.
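A minimal sketch of that attribute check with a per-class cache could look like the following. The `StreamableSupportChecker` class and the two model classes are hypothetical names made up for illustration, and the attribute is a local stand-in so the snippet runs without Symfony installed; a real implementation would presumably import `Symfony\Component\JsonStreamer\Attribute\JsonStreamable` instead.

```php
<?php

// Local stand-in for Symfony\Component\JsonStreamer\Attribute\JsonStreamable,
// declared here only so the sketch runs without Symfony (assumption).
#[Attribute(Attribute::TARGET_CLASS)]
final class JsonStreamable
{
}

// Hypothetical helper: decides once per class whether a model opted in.
final class StreamableSupportChecker
{
    /** @var array<class-string, bool> */
    private array $cache = [];

    public function supports(object $model): bool
    {
        $class = $model::class;

        // Reflection only runs the first time a class is seen;
        // a cached `false` is reused because ??= only reacts to null/unset.
        return $this->cache[$class] ??= [] !== (new ReflectionClass($class))->getAttributes(JsonStreamable::class);
    }
}

// Hypothetical model that opted in.
#[JsonStreamable]
final class Product
{
    public function __construct(public string $name = '')
    {
    }
}

// Hypothetical model that keeps using the regular serializer.
final class LegacyDocument
{
}

$checker = new StreamableSupportChecker();
var_dump($checker->supports(new Product()));        // bool(true)
var_dump($checker->supports(new LegacyDocument())); // bool(false)
```

Caching by class name keeps the reflection cost out of the hot path during bulk indexing, where the same few classes are serialized many times.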
👍🏻 or 👎🏻 ?
That's a great idea, but I'm wondering three things:
- would it be both ways? Probably needs to be done deep in Elastica then?
- BULK are not using JSON but NDJSON - does JsonStreamer support it?
- would object serialization / deserialisation with Jane PHP still be possible? 🤔 I have no idea how JsonStreamer works 🤣
Anyway, anything that makes working with huge JSON easier is worth considering 👍
* would it be both ways? Probably needs to be done deep in Elastica then?
I think the best place to implement this would be in the Indexer, maybe as a serializer decorator that checks for the attribute on the model and falls back to the regular serializer otherwise. That decorator would only be created if the JsonStreamer class exists.
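To make the decorator idea concrete, here is a rough sketch under stated assumptions. `DocumentSerializer`, `StreamerFallbackSerializer`, `createSerializer()` and the two model classes are all hypothetical and do not reflect Elastically's actual interfaces; the anonymous classes merely stand in for the JsonStreamer-backed and Serializer-backed implementations.

```php
<?php

// Hypothetical minimal contract; Elastically's real serializer wiring differs.
interface DocumentSerializer
{
    public function serialize(object $model): string;
}

// Decorator: routes opted-in models to the streamer, everything else to the fallback.
final class StreamerFallbackSerializer implements DocumentSerializer
{
    /** @param Closure(object): bool $supports */
    public function __construct(
        private DocumentSerializer $streamer,
        private DocumentSerializer $fallback,
        private Closure $supports,
    ) {
    }

    public function serialize(object $model): string
    {
        $target = ($this->supports)($model) ? $this->streamer : $this->fallback;

        return $target->serialize($model);
    }
}

// Only wrap the regular serializer when the optional component is available.
function createSerializer(
    DocumentSerializer $fallback,
    callable $streamerFactory,
    Closure $supports,
    string $requiredClass = Symfony\Component\JsonStreamer\JsonStreamWriter::class,
): DocumentSerializer {
    if (!class_exists($requiredClass)) {
        return $fallback;
    }

    return new StreamerFallbackSerializer($streamerFactory(), $fallback, $supports);
}

// Hypothetical models and stand-in serializers for the demo.
final class FastModel { public int $id = 1; }
final class SlowModel { public int $id = 2; }

$regular = new class implements DocumentSerializer {
    public function serialize(object $model): string { return 'serializer:' . json_encode($model); }
};
$fast = fn () => new class implements DocumentSerializer {
    public function serialize(object $model): string { return 'streamer:' . json_encode($model); }
};
$supports = fn (object $model): bool => $model instanceof FastModel;

// stdClass always exists, so it simulates "JsonStreamer is installed" here.
$serializer = createSerializer($regular, $fast, $supports, stdClass::class);
echo $serializer->serialize(new FastModel()), "\n"; // streamer:{"id":1}
echo $serializer->serialize(new SlowModel()), "\n"; // serializer:{"id":2}
```

Passing the streamer as a factory means nothing JsonStreamer-related is instantiated when the component is missing, which keeps existing installations untouched.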
* BULK are not using JSON but NDJSON - does JsonStreamer support it?
AFAIK it does not, but that's not an issue for serialization, since it is done per document (unless I'm mistaken?)
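To illustrate why per-document serialization sidesteps the NDJSON question: each document is encoded to a single JSON string on its own, and the bulk body is just those strings joined with action lines and newlines. The `buildBulkBody()` function below is a hypothetical illustration only; in practice Elastica assembles the bulk payload itself.

```php
<?php

/**
 * Hypothetical helper: builds an NDJSON bulk body from documents that were
 * already serialized one by one (by JsonStreamer or the regular Serializer).
 *
 * @param iterable<int|string, string> $documents document id => JSON source
 */
function buildBulkBody(string $index, iterable $documents): string
{
    $body = '';

    foreach ($documents as $id => $json) {
        // Action line describing what to do with the document that follows.
        $action = json_encode(
            ['index' => ['_index' => $index, '_id' => (string) $id]],
            JSON_THROW_ON_ERROR,
        );

        // One JSON value per line; the body must end with a newline.
        $body .= $action . "\n" . $json . "\n";
    }

    return $body;
}

echo buildBulkBody('products', [
    '1' => '{"name":"Chair"}',
    '2' => '{"name":"Table"}',
]);
```

The serializer never needs to know about NDJSON: as long as it emits one compact JSON document without raw newlines, the framing is a concern of whoever builds the request.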
* would object serialization / deserialisation with Jane PHP still be possible? 🤔 I have no idea how JsonStreamer works 🤣
It's basically the same as the Serializer, but without context and way faster, so it should still work the same. That said, I have never tried Jane PHP, so I can't say for sure.
Right 👍
Looks good to me. You would have JsonStreamer responsible for building the bulk then.
Is there a way it could be used to read the response as well? That's also critical for performance.
As for JanePHP, it's a set of Normalizers/Denormalizers added to the Serializer, so it should be compatible 👍
Should work for deserialization as well, I'll look into it
I think this would be a great addition
I'll wait for the 7.4 release, as the component is no longer experimental in that version