fast_jsonapi
fast_jsonapi copied to clipboard
Polymorphic associations and Serializer DSL
After documenting #64, I felt a little odd about the polymorphic
option I added to the has_many
| belongs_to
| has_one
methods:
class ProjectSerializer
include FastJsonapi::ObjectSerializer
belongs_to :owner, polymorphic: { User => :user, Administrator => :administrator }
end
I think the polymorphic
option describes an ORM feature (specially when working with relational databases) rather than an object serialization aspect.
Actually, I think the jsonapi spec doesn't even care if an association is polymorphic or not, as long as the associated object has the type
key in it, it's all good. I think this should be reflected in the serializer DSL.
So, how about replacing the polymorphic
option with being able to use the record_type
option with either a symbol (as it is right now) or the Hash
object polymorphic
receives now?
class ProjectSerializer
include FastJsonapi::ObjectSerializer
belongs_to :owner, record_type: { User => :user, Administrator => :administrator }
end
As a matter of fact, since we're memoizing the associated object => record_type with #64 when the Hash
isn't provided (Thanks @christophersansone for the tip), I think we can make pretty much the same (performance wise) to not need to specify a record_type
at all:
class ProjectSerializer
include FastJsonapi::ObjectSerializer
belongs_to :owner # The auto-detect fallback + memoization added in #64 would kick in
end
(You could still define a Hash
for record_type
, but that would give just a very marginal performance gain when serializing 1 object, I think)
Your thoughts @shishirmk @christophersansone @AtomGrafiks ?
@vovimayhem this feels like a more intuitive representation of polymorphic relationships in the serializer. Do you mind updating your implementation to use this representation?
Yes! I'll work it out
@shishirmk On my first go, I broke the original usage, as lots of tests depends on the object having the association_ids
method, so let me know if you guys are OK with this:
- Without the
record_type
option, the serialization core can't take advantage of thetask_ids
method to generate the associations hash, since it needs to iterate overrecord.tasks
to find out which record types are in the association:
class ProjectSerializer
include FastJsonapi::ObjectSerializer
has_many :tasks
end
- Giving a
Symbol
as therecord_type
option means all associated objects will be of the same type, and the serialization core CAN use thetask_ids
method (or any other given*_ids
method) to generate the associations hash IF theProject
class has the instance methodtask_ids
defined at the time of the Serializer definition OR if aid_method_name
(which can betask_ids
) was given. If neither is the case, it falls back to callingrecord.tasks
:
class ProjectSerializer
include FastJsonapi::ObjectSerializer
has_many :tasks, record_type: :task
end
- Giving a
Hash
as therecord_type
option, the serialization core will always callrecord.tasks
to sort out which record_type we're talking about:
class ProjectSerializer
include FastJsonapi::ObjectSerializer
has_many :tasks, record_type: { SmallTask => :small_task, ComplexTask => :difficult_task }
end
In all cases, when calling record.tasks
, the serialization core memoizes any record_class => record_type occurrence, making the performance impact small if not negligible. If record.tasks
has an unexpected type, it will memoize and go on as usual.
@vovimayhem Thanks for keeping at this! My opinion is that this should be the supported syntax:
class ActorSerializer
include FastJsonapi::ObjectSerializer
# inferred using the property name (movies), and singularizing when necessary
has_many :movies
# an easy way to override the inferred value
belongs_to :spouse, record_type: :person
# infers the name from the class name of each individual object
has_many :awards, polymorphic: true
end
From there, there is a method whose sole job is to return the polymorphic record type, given an object, e.g.:
def polymorphic_record_type_for(object)
return run_key_transform(object.class.name)
end
... and a second method to memoize them...
def cached_polymorphic_record_type_for(object)
@polymorphic_record_types ||= {}
@polymorphic_record_types[object.class] ||= polymorphic_record_type_for(object)
end
This way, a developer can easily override polymorphic_record_type_for
if the inferred polymorphic record types are incorrect.
I would much prefer this solution than requiring the developer to list every polymorphic record type... and remember to do so every time a new polymorphic class is added. Inferring from the class name will work most of the time. In those rare cases where this is insufficient, overriding a method would be preferred to building the DSL to accommodate it.
In general, I would like to see many of the methods in FastJsonapi to be split into bite-sized methods. I loved seeing the recent addition of run_key_transform
to standardize how keys become underscored, hyphenated, etc., and would love to see more of that so that it is easier to accommodate edge cases without having to bake in every possibility up front.
@christophersansone I'd prefer to stay away from the polymorphic
word in the DSL. "Polymorphic" only takes true meaning when dealing with an object-relational mapper (i.e. ActiveRecord), but means nothing for users using an object-document mapper instead (think mongoid), because nested documents can be (and will be) of any type!
Also, JSONAPI doesn't care if the association is polymorphic or not (and neither the direction of the association, see #73), but actually forces you to specify each associated object's type. I don't think the Serialization DSL should deviate from what's in the actual JSONAPI spec, and neither should add concepts not in the spec.
The current solution (#64) doesn't require the developer stating each polymorphic record type at all - that's what I used the memoization for in the first place: If it wasn't stated, then infer it from the object's class name, then save it for later reference. This proposal is not about changing this either.
I totally agree about splitting the current code into single-responsibility "bite-sized" methods. However this proposal is more about the DSL and how does the info you define or omit affects the serialization process, specially in terms of performance.
...I'd also change record_type
for just type
if I had it my way :)
...BTW @christophersansone I like your example of Actor, awards and spouse for the test suite :) I'll bring the concept of "mechanophilia" in the test examples 👿
@vovimayhem Got it, thanks for the thorough explanation. I live in ActiveRecord world and am used to both AR's and AMS's polymorphic: true
. Inferring from the class name if not otherwise specified by record_type
seems like the way to go.
@vovimayhem I would have chosen just type
instead of record_type
but I am generally scared of using type
as it can be a reserve word and cause some confusion.
@vovimayhem I think when we release polymorphic models support it should really use record_type
and not the polymorphic
as the option. Are you planning to work on this change? If not I can take it up and try to implement it too.
@shishirmk got stuck on #74 ... I mean, Travis CI shows all green, but running the performance tests on my lap fails randomly.
@vovimayhem can you share the performance numbers you see. Especially for the tests that are failing?
Serializes 1 records 20.48 times faster than AMS
Serialize to JSON string
| AMS serializer | 32.52 ms | | jsonapi-rb serializer | 0.17 ms | | Fast serializer | 0.92 ms |
Serialize to Ruby Hash
| AMS serializer | 0.62 ms | | jsonapi-rb serializer | 0.09 ms | | Fast serializer | 0.21 ms |
Serializes 25 records 16.22 times faster than AMS
Serialize to JSON string
| AMS serializer | 8.7 ms | | jsonapi-rb serializer | 0.94 ms | | Fast serializer | 0.56 ms |
Serialize to Ruby Hash
| AMS serializer | 7.78 ms | | jsonapi-rb serializer | 0.91 ms | | Fast serializer | 0.72 ms |
Serializes 25 records 14.43 times faster than AMS
Serialize to JSON string
| AMS serializer | 11.39 ms | | jsonapi-rb serializer | 1.86 ms | | Fast serializer | 1.6 ms |
Serialize to Ruby Hash
| AMS serializer | 10.79 ms | | jsonapi-rb serializer | 1.88 ms | | Fast serializer | 1.15 ms |
With includes and meta serializes 250 records 13.75 times faster than AMS
Serialize to JSON string
| AMS serializer | 109.12 ms | | jsonapi-rb serializer | 15.36 ms | | Fast serializer | 7.75 ms |
Serialize to Ruby Hash
| AMS serializer | 102.6 ms | | jsonapi-rb serializer | 14.24 ms | | Fast serializer | 7.66 ms |
New Tests
With heterogeneous has many associations serializes 25 records 21.78 times faster than AMS
Serialize to JSON string
| AMS serializer | 6.85 ms | | jsonapi-rb serializer | 0.56 ms | | Fast serializer | 0.41 ms |
Serialize to Ruby Hash
| AMS serializer | 5.88 ms | | jsonapi-rb serializer | 0.53 ms | | Fast serializer | 0.28 ms |
With heterogeneous has many associations serializes 250 records 21.89 times faster than AMS
Serialize to JSON string
| AMS serializer | 62.89 ms | | jsonapi-rb serializer | 4.3 ms | | Fast serializer | 2.7 ms |
Serialize to Ruby Hash
| AMS serializer | 60.21 ms | | jsonapi-rb serializer | 3.88 ms | | Fast serializer | 2.29 ms |
With homogeneous has many associations serializes 25 records 23.14 times faster than AMS
Serialize to JSON string
| AMS serializer | 6.86 ms | | jsonapi-rb serializer | 0.65 ms | | Fast serializer | 1.57 ms |
Serialize to Ruby Hash
| AMS serializer | 5.73 ms | | jsonapi-rb serializer | 0.53 ms | | Fast serializer | 0.23 ms |
I've got an idea... I'll try it again
I'll do some stuff, then ask for forgiveness later :)
OK @shishirmk I got it this far. Some noteworthy changes:
-
The serialization code doesn't assume the association is homogenous, unless you specify the
record_type: :some_type
with aSymbol
. That's why I added therecord_type
configuration on the test serializers. This in effect renders the "Homogeneous Associations" optimization an "opt-in" option. -
The serialization code doesn't assume the
association_ids/association_id
method exists, but checks if the method exists in the record class for any given association. The result is memoized in the relationship hash, so the actual check won't be repeated for the same combination of record class + relationship. This will only kick in for "Homogeneous Associations" opt-in when the method exists.
After repeating the performance test a bunch of times, these two are the only tests not passing at all:
Serialize 25 records with includes & meta is 21.66 faster than AMS
To JSON string
Serializer | Time |
---|---|
AMS serializer | 43.51 ms |
jsonapi-rb serializer | 2.26 ms |
Fast serializer | 1.65 ms |
To Ruby Hash
Serializer | Time |
---|---|
AMS serializer | 11.58 ms |
jsonapi-rb serializer | 1.84 ms |
Fast serializer | 0.63 ms |
Serialize 250 records with includes and meta is 19.60 faster than AMS
To JSON string
Serializer | Time |
---|---|
AMS serializer | 110.81 ms |
jsonapi-rb serializer | 16.49 ms |
Fast serializer | 5.82 ms |
To Ruby Hash
Serializer | Time |
---|---|
AMS serializer | 112.68 ms |
jsonapi-rb serializer | 16.55 ms |
Fast serializer | 5.58 ms |
The weird thing is that they pass for 1 records and 1000 records...
@vovimayhem Looking up record type of each object is cheap in the performance tests. These are already in memory. So no need to create these objects. I am not sure it will be the case in real world use cases. It would create so many unnecessary objects just because record_type
is not specified. This makes me think we should add performance tests to cover this too.
I think we should not change the default assumption to homogeneous and expect the developer to provide a hash to the record_type
option when the relationship is heterogeneous. Which means by default if you specify record_type: :tasks
or not, serializer is optimized to work with only ids and type derived using the relationship name (same behavior as it was in 1.0.17). When it is a heterogeneous relationship we should expect the developer to provide a hash always. That way we would be giving better performance and developer experience in the common case (homogeneous relationships). In case of heterogenous relationships it will still work but the developer has to do a bit more work.
I realize you mentioned this as something you were planning to do but thinking about it some more i feel it's not the best default assumption especially when performance is important. I can help you fix that part if you would like so that we can still get it done in a day or two. Let me know.
@shishirmk @vovimayhem Agreed that assuming homogenous is the proper default, as this is the most common and most performant. If record_type
is unspecified, it will infer the type by the relationship name. If record_type
is declared as a symbol, it uses the specified value. Both of those will be lightning fast.
For heterogeneous lists, providing a hash is a decent solution. It will be performant, but at the expense of the developer always needing to adjust the serializer each time there is a new type added in the relationship. What is the expected behavior if it attempts a type that is not defined in the hash? An exception? If so, let's at least make sure the error message is descriptive.
Another possible issue is this. What happens if record_type
cannot be determined by the class? Maybe there is a property on the object that should determine the JSON:API type, rather than the Ruby class.
As a result, I do think it will be super helpful, as part of this feature, to offer a way to specify a method to determine the record_type
. Two possible options:
- Build a "bite-sized" method to determine the type, and let it be overridable, e.g.
def self.heterogeneous_type_for(object, relationship_definition)
...
end
- Allowing
record_type
to also be a Proc, e.g.:
has_many :awards, record_type: -> { |object| object.class.name.underscore }
This way, FastJsonapi defaults to static definitions, makes no assumptions about how to infer the type, and it maximizes performance. When the developer wants or needs to dynamically determine the record type, we offer a solution.
I'd like to continue working ASAP, but this week looks pretty dense for me... I expect to be available in about 2 more days...
Thanks @vovimayhem ! Having re-read my last comment, I think a Symbol, a Hash, and a Proc should be supported. The Proc is nice because we are still in the middle of determining naming conventions and deciding which methods belong on the class vs. the instance. We decided on a similar initial implementation for custom attributes for this same reason.
Hey, all! Thank you for the great gem, I am looking forward to use it into my projects instead of AMS.
@vovimayhem @christophersansone could you please tell what if you plan to implement this feature or do you have any help with it? This looks as a good fit for STI models and that's something I am looking for right now.
@allomov Thanks for reaching out! This project is abandoned, but work is continuing in a fork over here. This week, I just submitted an issue and PR for better STI support. I'd love to get some feedback on it from you.
@christophersansone thank you very much for this note. That's good to know about the fork.
While your PR is related to polymorphic relationships, so I will not be able to use this approach at the moment and will need to implement it in my project.
@allomov Would you mind commenting on the current issue describing why the approach won't work for your situation? We should be able to help one way or another. Thanks!