sentry-ruby
Implement local variable capture
On 1.9.2 and 1.9.3 this can be done easily using the binding_of_caller gem.
Example monkey patch:
```ruby
require 'binding_of_caller'

module Kernel
  # Keep a handle to the unpatched raise so we can delegate to it.
  original_raise = method(:raise)

  define_method(:raise) do |*args|
    begin
      original_raise.call(*args)
    rescue Exception => e
      # Attach the caller bindings so locals can be read later, then drop
      # the two extra frames this patch adds to the backtrace.
      e.instance_variable_set(:@stack_info, binding.callers)
      e.set_backtrace(e.backtrace.drop(2))
      original_raise.call(e)
    end
  end
end
```
On 1.8.7 this could possibly be done using the continuation tracing hack outlined in http://rubychallenger.blogspot.com/2011/07/caller-binding.html
Some quick benchmarking, per-raise:

```
             user     system      total        real
Clean    0.830000   0.030000   0.860000 (  0.864414)
Sentry   3.630000   0.130000   3.760000 (  3.751515)
diff ms  0.028000   0.001000   0.029000 (  0.028871)
```
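For reference, a rough harness along these lines (just a sketch; the exact script used for the numbers above isn't included here) is enough to compare the per-raise cost with and without the patch loaded:

```ruby
require 'benchmark'

N = 100_000

# Sketch: time N raise/rescue cycles. Run once without the patch ("Clean")
# and once with it required ("Sentry"), then divide by N for a per-raise figure.
Benchmark.bm(8) do |x|
  x.report('raise:') do
    N.times do
      begin
        raise ArgumentError, 'boom'
      rescue ArgumentError
        # swallowed on purpose; we only care about the cost of raising
      end
    end
  end
end
```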
I wonder if the code here could be a guide for implementing this: https://github.com/ko1/pretty_backtrace
The better_errors gem also gets local vars, so it could be used as a guide as well: https://github.com/charliesome/better_errors
What about TracePoint for Ruby 2.0+?
@thedrow I'm unfamiliar with TracePoint (will google), but if you (or anyone interested) are around RailsConf, I'd love to chat about how we make this a reality.
Quick skim of TracePoint suggests it's normal eval tracing, which is generally far too expensive for what we'd want.
I think a first pass would be to optionally support binding_of_caller. This would, for example, allow you to at least enable it in staging environments.
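Purely as a sketch of that opt-in idea (the environment check and wiring here are hypothetical, not an existing Sentry/Raven API):

```ruby
# Hypothetical opt-in: only pull in binding_of_caller where the risk and
# cost are acceptable, e.g. a staging environment.
if Rails.env.staging?
  begin
    require 'binding_of_caller'
    # ...install the Kernel#raise patch from the example at the top of the thread...
  rescue LoadError
    # gem not bundled; fall back to plain backtraces
  end
end
```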
I'm looking into this now as it's sort of our last "big feature" that we're missing in the Ruby client.
Here's what I've decided:
- TracePoint is inappropriate because it will attach to all raises, even those which get rescued before reaching Sentry (see the sketch after this list). So we'll be going with the `binding.of_caller` approach, which checks already-existing frame bindings whenever we need them.
- This feature will almost definitely be limited to MRI, possibly MRI 2.0+. JRuby support looks possible (1.7 looks easy and 9.0 probably isn't difficult), but only with `compile_mode` set to `OFF` (e.g. when using the `--dev` flag).
- This feature may or may not be expensive, so depending on benchmark results it may be behind a config flag. It may end up being fast only on MRI or fast only on JRuby, etc.
- `Binding.of_caller` and similar approaches generally say "DON'T USE ON PRODUCTION!!111" without really saying why, so maybe someone can comment. If it's just performance, well, that's not that important to us so long as it doesn't slow down our exception capture by 50% or more.
- We should probably report arguments as well as locals.
- Part of the performance impact here will be the increase in total event size and the impact on data sanitization times.
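To illustrate the first point above, here's a minimal, self-contained demonstration (not implementation code) that a `:raise` TracePoint fires for every raise, including exceptions that are rescued long before they could reach Sentry:

```ruby
# Count :raise events while an exception is raised and immediately rescued.
rescued_raises = 0

tp = TracePoint.new(:raise) { rescued_raises += 1 }

tp.enable do
  begin
    raise 'handled internally'
  rescue RuntimeError
    # swallowed; an error reporter would never see this exception
  end
end

puts rescued_raises # => 1, even though nothing escaped
```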
@nateberkopec The "do not use in prod" is not (just) because of perf: binding_of_caller is effectively manually parsing the Ruby stack, and if there is anything unexpected it's almost a guaranteed segfault.
> and if there is anything unexpected it's almost a guaranteed segfault.
Got it. Can you give me an idea of what sort of circumstances that could happen under?
That's harder to answer. Usually these days it would just be new versions of Ruby, but in the days of REE and whatnot there were often patches floating around that would sometimes re-arrange the stack frame.
OK, good to know. It looks like most of the segfaults surrounding binding_of_caller's approach have been solved since 1.9.3, but this does seem like a "high-danger" area where even raising in the wrong place can cause a segfault. So we may have to put it behind a config flag just for safety's sake.
Yeah, on the plus side I don't think there are any runtime perf problems with BoC so the only thing it would slow down is the actual error reporting.
Any progress on this?
Every time I look at it I don't like the performance impact, and there's still not really a good, stable API for it. Not seeing this being implemented in the near future.
:ping_pong: Any chance of this being implemented, maybe as an opt-in feature?
> Every time I look at it I don't like the performance impact, and there's still not really a good, stable API for it. Not seeing this being implemented in the near future.
This is Ruby's fault. Maybe we should open a ticket on their issue tracker?
I've been using Sentry with Python projects for years and getting the values of local variables logged is one of my favorite features. The stack trace is useful, but frequently knowing what the actual state was at the time is critical to actually solving an issue quickly.
I've got a few Ruby projects using Sentry as well now and while still useful for other reasons, it is significantly less useful than it is on my Python projects.
Does anyone happen to know if the performance impact of `binding_of_caller` is on all code, or only on exception handling? My use case is totally OK with slow exception handling, but it's not worth it if it affects successful code.
I was thinking of hacking in a version of this to my code if it's just exception handling.
@coffenbacher I've done a similar feature in my gem. The differences are:
- My gem collects all frames' info right away, instead of just storing the callers. Because the callers (call frames) will change as the program continues, storing the `callers` reference isn't equivalent to storing all the call-frame info (see the sketch below).
- My gem uses `TracePoint` to capture the raise event, instead of patching the `raise` method.
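A minimal sketch of that first point (this is not the gem's actual code): snapshot each raise-site local into a plain hash at raise time, because the bindings themselves go stale once the frames unwind.

```ruby
# Sketch only: eagerly copy the raise-site locals into a hash and stash them
# on the exception, so the values survive after the stack unwinds.
collector = TracePoint.new(:raise) do |tp|
  b = tp.binding
  locals = b.local_variables.each_with_object({}) do |name, snapshot|
    snapshot[name] = b.local_variable_get(name).inspect
  end
  tp.raised_exception.instance_variable_set(:@captured_locals, locals)
end
collector.enable
```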
And based on the benchmark results, I wouldn't recommend anyone run similar patches in production: many libraries, or even your own code, might use raise/rescue as a flow-control mechanism, which means this could still significantly slow your app down even though you don't see any exceptions raised.
You can also give it a try by installing my gem and putting this in your `config/application.rb`:

```ruby
Bundler.require(*Rails.groups)

PowerTrace.replace_backtrace = true # needs to be placed after the Bundler require
```
Any chance Ruby 3 would fix the Ruby side of this issue?
I wouldn't say it's Ruby's "issue". It's just a feature that Ruby doesn't support, so you need to pay an extra cost for the additional implementation.
IMO, the best way to make this happen is to propose it (embed local variables in backtrace) on Ruby's issue tracker, have a thorough discussion, and implement it with the Ruby core team. This way, we can expect the performance impact to be minimum with a sound implementation.
However, given the impact of this feature, it will likely take another 1 or 2 years even if you start today. And it's also possible that it'll be rejected right away.
So another way is to implement this feature as an additional gem, like `sentry-locals-collector`. I can probably make a prototype in a month (I did one months ago). But in a small poll I did a while ago, not many people were interested in this feature.
fyi, I'm adding partial support for this feature in https://github.com/getsentry/sentry-ruby/pull/1580
> fyi, I'm adding partial support for this feature in #1580
GO STAN, GO! ❤️
> fyi, I'm adding partial support for this feature in #1580
>
> GO STAN, GO! ❤️
How did it GO, @st0012? 😆
@mecampbellsoup it's available on master and you can try it whenever you want to 😄 I've had it activated in my work project for 2 weeks and haven't seen any issues. I'm still waiting for other PRs for the official `4.8.0` release though.
> @mecampbellsoup it's available on master and you can try it whenever you want to 😄 I've had it activated in my work project for 2 weeks and haven't seen any issues. I'm still waiting for other PRs for the official `4.8.0` release though.
Ah okay, so this will be included in `4.8.0`? We're using `4.7.3`.
Yeah, it'll be included in `4.8.0`. You can check the changelog for usage.
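If I'm remembering the changelog entry correctly (please double-check it; the option name may change in later versions), enabling it looks roughly like this:

```ruby
Sentry.init do |config|
  config.dsn = 'https://examplePublicKey@o0.ingest.sentry.io/0'
  # opt-in: capture local variables from exception frames (added in #1580)
  config.capture_exception_frame_locals = true
end
```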
@st0012 can we close this issue now? The feature seems to be stable.
Hmm, not really, because:
- It's not enabled by default, so many users may not even be aware of it.
- ATM, `TracePoint` cancels YJIT's optimizations, so users on newer Rubies who rely on YJIT shouldn't enable it.

I'm still not sure how to resolve that restriction, as it'll require more exploration of alternatives.