sentry-ruby icon indicating copy to clipboard operation
sentry-ruby copied to clipboard

RuntimeError: profile not started from a Sidekiq worker when profiling is enabled (Vernier)

Open vittorius opened this issue 1 year ago • 14 comments

Issue Description

I have a project that uses sentry-ruby, sentry-rails, and sentry-sidekiq, all of version 5.21.0. When I enable Profiling according to the docs and deploy the application, I start getting the RuntimeError: profile not started from all of my Sidekiq workers

Reproduction Steps

  1. Have a project with Sidekiq workers
  2. Enable profiling according to the docs using the Vernier profiler
  3. Try running workers

Expected Behavior

No errors, profile samples from workers are visible in Sentry UI

Actual Behavior

RuntimeError: profile not started

Ruby Version

3.3.4

SDK Version

5.21.0

Integration and Its Version

Rails 7.1.4.1, Sidekiq 7.1.6, Vernier 1.3.1

Sentry Config

# frozen_string_literal: true

if Global.sentry.enabled?
  Sentry.init do |config|
    config.dsn = Global.sentry.rails
    config.environment = Global.sentry.environment
    config.release = ENV['DEPLOY_SHA'] || 'no-release'

    # post body, cookies, user IP is not sent to Sentry (the default, but we explicitly agree)
    config.send_default_pii = false

    Warden::Manager.after_set_user do |user|
      Sentry.set_user(id: user&.id)
    end

    config.breadcrumbs_logger = %i[sentry_logger http_logger redis_logger active_support_logger]

    config.traces_sample_rate = 1.0
    config.profiles_sample_rate = 1.0
    config.profiler_class = Sentry::Vernier::Profiler

    config.sidekiq.report_after_job_retries = true
  end
end

vittorius avatar Nov 07 '24 12:11 vittorius

Thanks for the report. I'll look into this!

solnic avatar Nov 15 '24 12:11 solnic

I'm having trouble reproducing this exactly, but I did manage to see Vernier breaking when there's more than one worker running. I still need to investigate what exactly causes this. Will report back when I learn more!

solnic avatar Nov 20 '24 12:11 solnic

Hey @jhawthorn and @mperham - any chance you could help me out here? What I discovered is that when there’s more than one worker running, profiling via Vernier is choking on either “Profile already started” or “profile not started” errors.

On Sentry side, we simply start a transaction which eventually calls Vernier::Profiler#start, then once transaction ends we stop it which eventually calls Venier::Profile#stop.

These two start/stop methods is where we get our crashes about profile either not started yet or already stopped. It seems like when multiple workers are running Vernier is silently failing maybe? Or maybe we’re not handling Vernier start/stop correctly?

I’d appreciate any guidance here 🙏🏻

solnic avatar Nov 28 '24 10:11 solnic

Are you trying to profile two jobs concurrently in the same process? That sounds like a bad time, which is why the Sidekiq wiki page notes that you really should only profile manually, to keep contention down. You could also target different processes via queue selection.

Retries may automatically deal with this.

mperham avatar Nov 28 '24 23:11 mperham

I missed the context here. Sidekiq will support Vernier natively in 8.0; it's already on the main branch. I have no knowledge of Sentry's implementation.

https://github.com/sidekiq/sidekiq/wiki/Profiling

PS "getsantry" above ^^^^^ looks very suspicious.

mperham avatar Nov 29 '24 20:11 mperham

@mperham thank you!

@vittorius we should wait until sidekiq 8.0 is released and try it out, until then there's not much we can do on Sentry side.

solnic avatar Dec 05 '24 20:12 solnic

My understanding is that Vernier is only designed to run globally. You can’t run it on multiple threads concurrently so I don’t think Sentry’s attempt to automatically profile every job will work. Sidekiq’s documentation explicitly says to only use this feature manually for this reason. Am I misunderstanding Vernier or your usage?

mperham avatar Dec 05 '24 20:12 mperham

@solnic it is fine to profile only one thread at a time and document that limitation. However, we should fail gracefully in this case and not throw an exception (if we're doing so).

sl0thentr0py avatar Dec 05 '24 21:12 sl0thentr0py

@solnic thank you for the effort. This doesn't block me actually since it was only an experiment on my project. Good to know anyway.

vittorius avatar Dec 06 '24 10:12 vittorius

@sl0thentr0py OK so just catch these errors and log them, yeah?

solnic avatar Dec 06 '24 10:12 solnic

Hi everyone! I'm having some issues too, when trying out Vernier in a Rails app that uses Sidekiq (Runtime error: profile not started).

Are we waiting for Sidekiq 8 in this case?

Thanks!

lucashungaro avatar Jan 16 '25 01:01 lucashungaro

Sidekiq 8.0+ supports Vernier, try using the main branch.

mperham avatar Jan 16 '25 19:01 mperham

Sidekiq 8.0+ supports Vernier, try using the main branch.

Thanks @mperham!

lucashungaro avatar Jan 16 '25 19:01 lucashungaro

If possible, please try out master branch where I fixed an issue where multiple vernier profiles would be started in nested transactions. This should help with this issue too.

solnic avatar Feb 14 '25 13:02 solnic