Event filtration on the Sentry side (i.e. a server-side analogue of sample_rate)
Problem Statement
The main purpose of Sentry is to capture application errors together with contextual information that helps us solve the problem. But do we really need every error event to understand the context?
Solution Brainstorm
Setting sample_rate at the application level limits the number of events sent to Sentry, but in doing so we might drop some important rare issues. To keep the volume of unnecessary events in Sentry low while still capturing rare problems, we need to always send the first occurrence of an issue and filter out some percentage of the events that follow.
We can implement this in a few ways:
- client-side: store events in some temporary storage to check whether they have occurred before.
  - cumbersome to implement across many projects
- server-side: check an event before ingestion: if it already belongs to an issue, apply sample-rate filtration; if it doesn't, save it.
  - The downside of a straightforward server-side implementation is that ingestion will be noticeably slower. With preloaded fingerprints, or by using some fast key-value storage, we can regain some speed.
- a cronjob that removes unnecessary events from the DBs every day.
  - easiest solution, essentially an improved version of the `cleanup` command
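The client-side option above can be sketched with the Sentry Python SDK's `before_send` hook. This is a minimal illustration only: the in-memory set and the naive fingerprint function are assumptions for the sketch (real grouping is more sophisticated, and a real deployment would need shared or persistent storage across processes), and `SAMPLE_RATE` is an arbitrary example value.

```python
import random

SAMPLE_RATE = 0.1            # fraction of repeat events to keep (example value)
_seen_fingerprints = set()   # temporary storage for issues already sent once

def fingerprint(event):
    # Naive grouping key for the sketch; Sentry's real issue grouping
    # uses stack traces and configurable fingerprint rules.
    exc = (event.get("exception", {}).get("values") or [{}])[0]
    return (exc.get("type"), exc.get("value"))

def before_send(event, hint):
    fp = fingerprint(event)
    if fp not in _seen_fingerprints:
        _seen_fingerprints.add(fp)
        return event  # always send the first occurrence of an issue
    # For repeats, keep only a sampled fraction; returning None drops the event.
    return event if random.random() < SAMPLE_RATE else None

# Hooked up via: sentry_sdk.init(dsn=..., before_send=before_send)
```

This keeps the "first occurrence always gets through" guarantee while cutting repeat volume, but it illustrates the per-process limitation: each process has its own `_seen_fingerprints`, which is exactly why doing this server-side is attractive.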
Product Area
Ingestion and Filtering
Assigning to @getsentry/support for routing ⏲️
Routing to @getsentry/product-owners-ingestion-and-filtering for triage ⏲️
@vanyakosmos thanks for the feature request! We do automatically sample and prioritize infrequent occurrences for transactions, but not for errors (see https://docs.sentry.io/product/performance/retention-priorities/#low-volume-transactions).
Server-side sampling for error events is currently not planned, but we will record it as a feature request. cc @ale-cota