opentelemetry-ruby-contrib
opentelemetry-ruby-contrib copied to clipboard
ActiveRecord `.eager_load` creates a lot of spans
Using ActiveRecord .eager_load creates a span for each object it instantiates and also a span for each child object.
Example:
describe 'eager loading' do
it 'traces' do
users = 3.times.map { User.create! }
users.each do |user|
user.articles.create!(title: 'test article1')
user.articles.create!(title: 'test article2')
end
exporter.reset
User
.eager_load(:articles)
.map { |u| [u.id, u.articles.length] }
puts spans.map(&:name).inspect
# => [
# "User.instantiate",
# "Article.instantiate",
# "Article.instantiate",
# "User.instantiate",
# "Article.instantiate",
# "Article.instantiate",
# "User.instantiate",
# "Article.instantiate",
# "Article.instantiate"
# ]
end
end
In this example I have 3 users in the database, each having 2 articles. Fetching all users with eager loading their articles creates a total of 9 spans even though there's only one SQL query. This gets worse when the number of records grows.
Compare this to fetching all objects without eager_load:
it 'traces' do
users = 3.times.map { User.create! }
exporter.reset
User.all.to_a
puts spans.map(&:name).inspect
# => ["User.find_by_sql"]
Here it only creates one span.
One idea I had was to just remove tracing from .instantiate method. But in that case, there wouldn't be any spans created when using eager_load.
I would be fine submitting a PR, but I'm not really sure how to improve this.
cc @robertlaurin - as you have worked on AR instrumentation before
I experimented with a different approach which creates spans like User.where(...).order(...) and User.eager_load(:articles). Code can be seen here and tests to show what is happening are here.
The approach allowed quite nice span names which provide a lot more information. The problematic part however is that there are methods in AR that call other AR query methods (e.g. .first calls .order and .limit). I'm currently getting duplicate spans for them. There doesn't seem to be an easy way to fix this (unless we start parsing caller or define each method separately). The other issue is user-defined named scopes in which case we don't know the scope name (there we could use the methods that user is actually calling but not sure how useful those are).
Tbh, I'm not really sure how much and what should this gem be tracing.
Tbh, I'm not really sure how much and what should this gem be tracing.
I share a similar sentiment with you because generating spans for every use case generated by AR Scopes starts to fall into the realm of Profiling.
Maybe we should think about this in a different way, perhaps coming up with a list of use cases?
What are the common problems Rails developers run into that AR Instrumentation could help them find? N+1?
What information are they missing in their current tooling that only tracing could provide?
Hey @indrekj thanks for opening the issue.
We're looking at noisy/long traces internally as well and we can point at instantiation as one of the culprits.
How receptive would you and @arielvalentin be to removing that patched code entirely. Or at very least putting it behind a default to off config option.
For reference I would remove this block of code https://github.com/open-telemetry/opentelemetry-ruby/blob/666b81204fc5d0a34744d9e88788127897becd01/instrumentation/active_record/lib/opentelemetry/instrumentation/active_record/patches/persistence_class_methods.rb#L33-L37
I'm wondering how much value this span provides application owners, I think it's definitely an interesting concept but I wonder if we could look at doing some sort of higher level aggregation where we keep track of the instantiation counts elsewhere.
Tbh, I'm not really sure how much and what should this gem be tracing.
I agree, I don't think it's entirely clear what the right level of detail is here. I wanted to take a different approach from the active support notifications as they were just slightly higher level duplicates of the mysql/postgres spans. The intention with this gem was to instrument the ORM and attempt to capture some higher level details, but I certainly think it needs some refinement.
Just circle back to the original point, I am totally for dropping the patch on the instantiate method until we have a better idea of how this information should be tracked.
I'm not using AR instrumentation at all.
+1 in favor of removing the patch
@robertlaurin I'd be also fine dropping it. But just to mention. If we drop this then User.all.to_a would still generate ["User.find_by_sql"] span, but User.eager_load(:articles).to_a wouldn't generate any spans which may seem odd.
As we're currently most interested in the SQL queries and are already using pg instrumentation then we'll likely skip AR gem for now as well. Though we should still think about how to improve it as it's part of the rails instrumentation package and likely most users will just include the rails package.
@robertlaurin I'd be also fine dropping it. But just to mention. If we drop this then User.all.to_a would still generate ["User.find_by_sql"] span, but User.eager_load(:articles).to_a wouldn't generate any spans which may seem odd.
Hmm, that's not good. I'll need to dig into that more.
As we're currently most interested in the SQL queries and are already using pg instrumentation then we'll likely skip AR gem for now as well. Though we should still think about how to improve it as it's part of the rails instrumentation package and likely most users will just include the rails package.
That's fair, if you already know what would need to change so that you would keep it/add the instrumentation back to your application please let us know. Feedback like you provided on this issue is really beneficial to the improvement of this instrumentation gem.
👋 This issue has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this issue will be closed eventually by the stale bot.