sentry-ruby
Implement local variable capture
On 1.9.2 and 1.9.3 this can be done easily using the binding_of_caller gem.
Example monkey patch:
```ruby
require 'binding_of_caller'

module Kernel
  # Keep a handle to the unpatched raise so we can delegate to it.
  original_raise = method(:raise)

  define_method(:raise) do |*args|
    begin
      original_raise.call(*args)
    rescue Exception => e
      # Attach the caller bindings so locals can be read later, then drop
      # the two extra frames this patch adds to the backtrace.
      e.instance_variable_set(:@stack_info, binding.callers)
      e.set_backtrace(e.backtrace.drop(2))
      original_raise.call(e)
    end
  end
end
```
On 1.8.7 this could possibly be done using the continuation tracing hack outlined in http://rubychallenger.blogspot.com/2011/07/caller-binding.html
Some quick benchmarking, per-raise:

```
             user     system      total        real
Clean    0.830000   0.030000   0.860000 (  0.864414)
Sentry   3.630000   0.130000   3.760000 (  3.751515)
diff ms  0.028000   0.001000   0.029000 (  0.028871)
```
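For reference, a rough harness along these lines (just a sketch; the exact script used for the numbers above isn't included here) is enough to compare the per-raise cost with and without the patch loaded:

```ruby
require 'benchmark'

N = 100_000

# Sketch: time N raise/rescue cycles. Run once without the patch ("Clean")
# and once with it required ("Sentry"), then divide by N for a per-raise figure.
Benchmark.bm(8) do |x|
  x.report('raise:') do
    N.times do
      begin
        raise ArgumentError, 'boom'
      rescue ArgumentError
        # swallowed on purpose; we only care about the cost of raising
      end
    end
  end
end
```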
I wonder if the code here could be a guide for implementing this: https://github.com/ko1/pretty_backtrace
The better_errors gem also gets local vars, so it could be used as a guide as well: https://github.com/charliesome/better_errors
What about TracePoint for Ruby 2.0+?
@thedrow I'm unfamiliar with TracePoint (will google), but if you (or anyone interested) are around RailsConf, I'd love to chat about how we make this a reality.
Quick skim of TracePoint suggests it's normal eval tracing, which is generally far too expensive for what we'd want.
I think a first pass would be to optionally support binding_of_caller. This would, for example, allow you to at least enable it in staging environments.
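Purely as a sketch of that opt-in idea (the environment check and wiring here are hypothetical, not an existing Sentry/Raven API):

```ruby
# Hypothetical opt-in: only pull in binding_of_caller where the risk and
# cost are acceptable, e.g. a staging environment.
if Rails.env.staging?
  begin
    require 'binding_of_caller'
    # ...install the Kernel#raise patch from the example at the top of the thread...
  rescue LoadError
    # gem not bundled; fall back to plain backtraces
  end
end
```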
I'm looking into this now as it's sort of our last "big feature" that we're missing in the Ruby client.
Here's what I've decided:
- TracePoint is inappropriate because it will attach to all raises, even those which get rescued before reaching Sentry (see the sketch after this list). So we'll be going with the `binding.of_caller` approach, which checks already-existing frame bindings whenever we need them.
- This feature will almost definitely be limited to MRI, possibly MRI 2.0+. JRuby support looks possible (1.7 looks easy and 9.0 probably isn't difficult), but only with `compile_mode` set to `OFF` (e.g. when using the `--dev` flag).
- This feature may or may not be expensive, so depending on benchmark results it may be behind a config flag. It may end up being fast only on MRI or fast only on JRuby, etc.
- `Binding.of_caller` and similar approaches generally say "DON'T USE ON PRODUCTION!!111" without really saying why, so maybe someone can comment. If it's just performance, well, that's not that important to us so long as it doesn't slow down our exception capture by 50% or more.
- We should probably report arguments as well as locals.
- Part of the performance impact here will be the increase in total event size and the impact on data sanitization times.
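To illustrate the first point above, here's a minimal, self-contained demonstration (not implementation code) that a `:raise` TracePoint fires for every raise, including exceptions that are rescued long before they could reach Sentry:

```ruby
# Count :raise events while an exception is raised and immediately rescued.
rescued_raises = 0

tp = TracePoint.new(:raise) { rescued_raises += 1 }

tp.enable do
  begin
    raise 'handled internally'
  rescue RuntimeError
    # swallowed; an error reporter would never see this exception
  end
end

puts rescued_raises # => 1, even though nothing escaped
```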
@nateberkopec The "do not use in prod" is not (just) because of perf: binding_of_caller is effectively manually parsing the Ruby stack, and if there is anything unexpected it's almost a guaranteed segfault.
> and if there is anything unexpected it's almost a guaranteed segfault.
Got it. Can you give me an idea of what sort of circumstances that could happen under?
That's harder to answer. Usually these days it would just be new versions of Ruby, but in the days of REE and whatnot there were often patches floating around that would sometimes re-arrange the stack frame.
OK, good to know. It looks like most of the segfaults surrounding binding_of_caller's approach have been solved since 1.9.3, but this does seem like a "high-danger" area where even raising in the wrong place can cause a segfault. So we may have to put it behind a config flag just for safety's sake.
Yeah, on the plus side I don't think there are any runtime perf problems with BoC so the only thing it would slow down is the actual error reporting.
Any progress on this?
Every time I look at it I don't like the performance impact, and there's still not really a good, stable API for it. Not seeing this being implemented in the near future.
:ping_pong: Any chance of this being implemented, maybe as an opt-in feature?
> Every time I look at it I don't like the performance impact, and there's still not really a good, stable API for it. Not seeing this being implemented in the near future.
This is Ruby's fault. Maybe we should open a ticket on their issue tracker?
I've been using Sentry with Python projects for years and getting the values of local variables logged is one of my favorite features. The stack trace is useful, but frequently knowing what the actual state was at the time is critical to actually solving an issue quickly.
I've got a few Ruby projects using Sentry as well now and while still useful for other reasons, it is significantly less useful than it is on my Python projects.
Does anyone happen to know if the performance impact of `binding_of_caller` is on all code, or only on exception handling? My use case is totally OK with slow exception handling, but it's not worth it if it affects successful code.
I was thinking of hacking in a version of this to my code if it's just exception handling.
@coffenbacher I've done a similar feature in my gem. The differences are:
- My gem collects all frames' info right away, instead of just storing the callers. Because the callers (call frames) will change as the program continues, storing the `callers` reference isn't equivalent to storing all the call-frame info (see the sketch below).
- My gem uses `TracePoint` to capture the raise event, instead of patching the `raise` method.
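A minimal sketch of that first point (this is not the gem's actual code): snapshot each raise-site local into a plain hash at raise time, because the bindings themselves go stale once the frames unwind.

```ruby
# Sketch only: eagerly copy the raise-site locals into a hash and stash them
# on the exception, so the values survive after the stack unwinds.
collector = TracePoint.new(:raise) do |tp|
  b = tp.binding
  locals = b.local_variables.each_with_object({}) do |name, snapshot|
    snapshot[name] = b.local_variable_get(name).inspect
  end
  tp.raised_exception.instance_variable_set(:@captured_locals, locals)
end
collector.enable
```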
And based on the benchmark results, I wouldn't recommend anyone run similar patches in production: many libraries, or even your own code, might use raise/rescue as a flow-control mechanism, which means this could still significantly slow your app down even though you don't see any exceptions raised.
You can also give it a try by installing my gem and putting this in your `config/application.rb`:

```ruby
Bundler.require(*Rails.groups)

PowerTrace.replace_backtrace = true # needs to be placed after the Bundler require
```
Any chance Ruby 3 would fix the Ruby side of this issue?
I wouldn't say it's Ruby's "issue". It's just a feature that Ruby doesn't support, so you need to pay an extra cost for the additional implementation.
IMO, the best way to make this happen is to propose it (embed local variables in backtrace) on Ruby's issue tracker, have a thorough discussion, and implement it with the Ruby core team. This way, we can expect the performance impact to be minimum with a sound implementation.
However, given the impact of this feature, it will likely take another 1 or 2 years even if you start today. And it's also possible that it'll be rejected right away.
So another way is to implement this feature as an additional gem, like `sentry-locals-collector`. I can probably make a prototype in a month (I did one months ago). But in a small poll I did a while ago, not many people were interested in this feature.
fyi, I'm adding partial support for this feature in https://github.com/getsentry/sentry-ruby/pull/1580
> fyi, I'm adding partial support for this feature in #1580
GO STAN, GO! ❤️
> fyi, I'm adding partial support for this feature in #1580
>
> GO STAN, GO! ❤️
How did it GO, @st0012? 😆
@mecampbellsoup it's available on master and you can try it whenever you want to 😄 I've had it activated in my work project for 2 weeks and haven't seen any issues. I'm still waiting for other PRs for the official `4.8.0` release though.
> @mecampbellsoup it's available on master and you can try it whenever you want to 😄 I've had it activated in my work project for 2 weeks and haven't seen any issues. I'm still waiting for other PRs for the official `4.8.0` release though.
Ah okay, so this will be included in `4.8.0`? We're using `4.7.3`.
Yeah, it'll be included in `4.8.0`. You can check the changelog for usage.
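If I'm remembering the changelog entry correctly (please double-check it; the option name may change in later versions), enabling it looks roughly like this:

```ruby
Sentry.init do |config|
  config.dsn = 'https://examplePublicKey@o0.ingest.sentry.io/0'
  # opt-in: capture local variables from exception frames (added in #1580)
  config.capture_exception_frame_locals = true
end
```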
@st0012 can we close this issue now? The feature seems to be stable.
Hmm, not really, because:
- It's not enabled by default, so many users may not even be aware of it.
- ATM, `TracePoint` cancels YJIT's optimizations, so users on newer Rubies who rely on YJIT shouldn't enable it.

I'm still not sure how to resolve that restriction, as it'll require more exploration of alternatives.