Implement Auto Session Tracking

Open antonpirker opened this issue 2 years ago • 15 comments

Sentry can monitor the health of releases by checking the session data it receives from the SDK. Other SDKs already collect this session data automatically: the Python SDK does, and the Ruby SDK has just added it. We want the same for PHP.

For reference see:

  • https://docs.sentry.io/product/releases/health/
  • https://develop.sentry.dev/sdk/sessions/

We want to implement request mode sessions, which are aggregated in the SDK (as opposed to application mode sessions, which are sent as soon as they finish). The main reason is scale: sending every session individually from most PHP servers out there would overload the Sentry ingestion pipelines.

In the Ruby and Python implementations, a session is one request-response cycle, and a SessionFlusher running in a separate thread collects the session data and sends it to the server in bulk once a minute.

Basically, this is what the feature should do when enabled:

  • Decide when a session should be created and start it in memory: set the status of the ongoing session to ok, and set release, environment, user, and session_mode.
  • Decide when a session should end and, when it does, update the status of the current session to exited.
  • When an error is raised, set the status of the current session to crashed or errored.
  • Every 60 seconds, aggregate all current sessions into one JSON payload as described here: https://develop.sentry.dev/sdk/sessions/#session-aggregates-payload
  • Once the aggregation is done, send the JSON payload to the Sentry server.
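
To make the aggregation step above concrete, here is a minimal sketch in Python (the language of one of the reference implementations mentioned above). The class name SessionAggregator and its methods are hypothetical and not part of any Sentry SDK. The field names (aggregates, started, exited, errored, crashed, attrs) follow my reading of the linked payload description and should be checked against it. The sketch only covers bucketing finished sessions per minute; it does not address where a PHP process would keep this state between requests.

```python
from collections import defaultdict
from datetime import datetime, timezone


class SessionAggregator:
    """Hypothetical helper: counts finished request-mode sessions per minute."""

    STATUSES = ("exited", "errored", "crashed")

    def __init__(self, release, environment):
        # Attributes shared by every session in the payload.
        self.attrs = {"release": release, "environment": environment}
        # Maps a minute-truncated UTC timestamp to per-status counters.
        self._buckets = defaultdict(lambda: dict.fromkeys(self.STATUSES, 0))

    def add(self, started, status):
        """Record one finished session that started at `started`."""
        if status not in self.STATUSES:
            raise ValueError(f"unknown session status: {status!r}")
        minute = started.astimezone(timezone.utc).replace(second=0, microsecond=0)
        self._buckets[minute][status] += 1

    def flush(self):
        """Return the aggregated payload and reset the in-memory buckets."""
        aggregates = []
        for minute in sorted(self._buckets):
            entry = {"started": minute.strftime("%Y-%m-%dT%H:%M:%SZ")}
            # Only include counters that are non-zero.
            entry.update({k: v for k, v in self._buckets[minute].items() if v})
            aggregates.append(entry)
        self._buckets.clear()
        return {"aggregates": aggregates, "attrs": dict(self.attrs)}
```

A flusher would call flush() once a minute and send the returned dictionary as JSON; in Ruby and Python that caller is the background thread, which is exactly the piece PHP lacks.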

Because PHP is single-threaded, this issue should be the start of a discussion on how this can be achieved, if it can be achieved at all.

You can also look at how this was done in Ruby: https://github.com/getsentry/sentry-ruby/pull/1715

antonpirker avatar Feb 21 '22 12:02 antonpirker

For reference: #1254 is related to the problem of having a storage/buffer that spans multiple requests, plus a flusher task, in PHP. This is probably impractical without external (buffer) storage and a task runner.

stayallive avatar Feb 21 '22 13:02 stayallive

Exactly. In PHP we do not have any native way to batch that information somewhere, because we cannot assume any external infrastructure or thread that we could leverage to do something like this out of the box.

Maybe something is feasible in the framework integrations, but that too would require a couple of assumptions or something that has to be manually enabled.

Jean85 avatar Feb 21 '22 13:02 Jean85

Do Laravel or Symfony have something that can be used out of the box? Having this only in one framework is also a possibility.

antonpirker avatar Feb 21 '22 13:02 antonpirker

I wonder if we are building up a number of use cases for a sidecar approach. What that might be is certainly open for discussion. But let's say a self-hosted Relay that the SDK sends such information to, and that handles the complexities needed for client reports, session tracking, and other future features for gathering performance-type data to aggregate and send to Sentry.

edit: not a real agent but a sidecar service

smeubank avatar Feb 21 '22 13:02 smeubank

For apps that have a queue worker (i.e. a lot of apps, though not all), the app could give the SDK a callback for adding items (JSON, or JSON plus any other metadata needed to send the request) to the queue, and then the queue worker needs one or more SDK methods to aggregate and send off the items. This logic could actually be used for all Sentry events, so it's possible to send them separately from the request process (which you often want to keep free for handling requests), and aggregated together for efficiency. This is basically the app bringing its own agent, I guess.
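
A rough sketch of that callback idea, written in Python only to keep it short. Everything here is hypothetical: DeferredClient, flush_batch, and the item shapes are illustration names, not sentry-php APIs, and an in-memory deque stands in for whatever queue the app actually runs.

```python
import json
from collections import deque
from typing import Callable, Iterable, List


class DeferredClient:
    """Hypothetical SDK front end: hands items to an app-supplied enqueue callback."""

    def __init__(self, enqueue: Callable[[str], None]) -> None:
        self._enqueue = enqueue

    def capture(self, item: dict) -> None:
        # Serialize in the request process; no network call happens here.
        self._enqueue(json.dumps(item))


def flush_batch(raw_items: Iterable[str], send: Callable[[List[dict]], None]) -> int:
    """Hypothetical SDK method for the queue worker: decode items and send them together."""
    batch = [json.loads(raw) for raw in raw_items]
    if batch:
        send(batch)
    return len(batch)


# Request side: the app wires its queue's "push" operation into the SDK.
queue = deque()
client = DeferredClient(queue.append)
client.capture({"type": "session", "status": "exited"})
client.capture({"type": "event", "message": "boom"})

# Worker side: drain whatever is pending and send it in one batch.
sent = []
pending = [queue.popleft() for _ in range(len(queue))]
count = flush_batch(pending, sent.append)
```

The request process only pays for serialization and an enqueue; batching and the network call happen in the worker.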

mfb avatar Feb 21 '22 14:02 mfb

Sounds like an idea for how to do this. Could the SDK discover the queue worker by itself and hook into it, so the SDK can use the queue worker without the user needing to set anything up by hand? And could you estimate how many PHP projects have this queue worker? Is it something you set up right when you do your first "hello world", or is it something you only have when you have millions of users and a team of >5 programmers working on a project?

antonpirker avatar Feb 22 '22 10:02 antonpirker

I don't think there is any commonality in how frameworks set up queues, and the SDK is pretty abstract, so I think this would have to happen at the level of integration plugins/libraries that have more awareness of the particular app/framework they are running in.

mfb avatar Feb 22 '22 19:02 mfb

But if the SDK made it possible, then that integration could wire it up.

mfb avatar Feb 22 '22 19:02 mfb

And could you estimate how many PHP projects have this queue worker? Is it something you set up right when you do your first "hello world", or is it something you only have when you have millions of users and a team of >5 programmers working on a project?

I would say that it's something that happens a lot with bigger dev teams and apps. Having background workers is becoming more common thanks to libraries like Symfony Messenger, but it's still something that you add to your app long after launch, and it's set up manually.

Auto-discovery is probably partially doable in the Symfony integration, but you would still inject workload into the users' queues, and that could be troublesome; I would strongly object to having that as opt-out.

Jean85 avatar Feb 23 '22 09:02 Jean85

I think that in order to make this applicable/accessible in most cases, we might need to go down the route that @smeubank mentioned briefly -> Agent (Relay). While we already support Relay acting as a kind of agent, there is still a lot of room for improvement to make the experience more seamless. For example, we could do something like Scout APM, which downloads the agent in the background on first-time use and then runs it on the side: https://github.com/scoutapp/scout-apm-php/blob/eaf275883dd2640ea2ad9ed6e568314554e334f0/src/CoreAgent/Downloader.php#L100

I am not saying this is the way, I am just not sure if adding support for Sessions only for Laravel and only if you run background queues/workers makes sense.

HazAT avatar Feb 23 '22 10:02 HazAT

For what it's worth, as a user I would never ever want something to be downloaded in the background on my behalf and run without me knowing about it. I would much rather have a real agent (as a PHP extension or as an external dependency), even if it means that out of the box I have one more manual step to do to set it up.

ste93cry avatar Feb 23 '22 12:02 ste93cry

I agree with @ste93cry; in fact, every other performance and monitoring service that I have tried (Blackfire.io, NewRelic, DataDog) goes with an extension (that eventually spawns a background process) or with a clear dedicated agent to be deployed.

Jean85 avatar Feb 23 '22 14:02 Jean85

Yeah, I think this would have to be something the SDK provides infrastructure for, not fully automatic functionality. I maintain the Sentry integration for Drupal, and if it were possible to add Sentry events to Drupal's queue subsystem, I'd definitely have to provide various opt-in configurations around that, e.g. to make sure that someone who didn't understand what was going on didn't flood their mission-critical queue with unexpected stuff :)

A PHP extension definitely makes sense to make performance tracing instrumentation easier; if it existed, it could be leveraged for other functionality as well, such as this.

mfb avatar Feb 23 '22 22:02 mfb

Thanks everyone for the input. Really amazing!

tl;dr: The agent is probably the best way to go.

I will close this issue now and we will start discussions about the agent approach in Sentry. When we have any news, you will be the first to know! Thanks again!

antonpirker avatar Feb 24 '22 15:02 antonpirker

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

github-actions[bot] avatar Mar 18 '22 00:03 github-actions[bot]

Closing this for now; we might revisit this in the future.

cleptric avatar Oct 20 '22 15:10 cleptric