Event filtration on the Sentry side (i.e. a server-side analogue of sample_rate)
Problem Statement
The main purpose of Sentry is to capture application errors together with contextual information that helps us solve the problem. But do we really need every error event to understand the context?
Solution Brainstorm
Setting sample_rate at the application level limits the number of events sent to Sentry, but in doing so we might drop some important rare issues. To keep the volume of unnecessary events in Sentry low while still capturing rare problems, we need to always send the first occurrence of an issue and filter out some percentage of the events that follow.
We can implement this in a few ways:
- client-side: store events in some temporary storage to check whether they have occurred before.
  - cumbersome to implement across many projects
- server-side: check an event before ingestion: if it already belongs to an issue, apply sample-rate filtration; if it doesn't, save it.
  - The downside of a straightforward server-side implementation is that ingestion will be noticeably slower. With preloaded fingerprints, or by using some fast key-value storage, we can regain some speed.
- a cronjob that removes unnecessary events from the DBs every day.
  - easiest solution, essentially an improved version of the `cleanup` command
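The client-side option above can be sketched with the Sentry Python SDK's `before_send` hook. This is a minimal illustration only: the in-memory set and the naive fingerprint function are assumptions for the sketch (real grouping is more sophisticated, and a real deployment would need shared or persistent storage across processes), and `SAMPLE_RATE` is an arbitrary example value.

```python
import random

SAMPLE_RATE = 0.1            # fraction of repeat events to keep (example value)
_seen_fingerprints = set()   # temporary storage for issues already sent once

def fingerprint(event):
    # Naive grouping key for the sketch; Sentry's real issue grouping
    # uses stack traces and configurable fingerprint rules.
    exc = (event.get("exception", {}).get("values") or [{}])[0]
    return (exc.get("type"), exc.get("value"))

def before_send(event, hint):
    fp = fingerprint(event)
    if fp not in _seen_fingerprints:
        _seen_fingerprints.add(fp)
        return event  # always send the first occurrence of an issue
    # For repeats, keep only a sampled fraction; returning None drops the event.
    return event if random.random() < SAMPLE_RATE else None

# Hooked up via: sentry_sdk.init(dsn=..., before_send=before_send)
```

This keeps the "first occurrence always gets through" guarantee while cutting repeat volume, but it illustrates the per-process limitation: each process has its own `_seen_fingerprints`, which is exactly why doing this server-side is attractive.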
Product Area
Ingestion and Filtering
Assigning to @getsentry/support for routing ⏲️
Routing to @getsentry/product-owners-ingestion-and-filtering for triage ⏲️
@vanyakosmos thanks for the feature request! We do automatically sample and prioritize infrequent occurrences for transactions, but not for errors (see https://docs.sentry.io/product/performance/retention-priorities/#low-volume-transactions).
Server-side sampling for error events is currently not planned, but we will record it as a feature request. cc @ale-cota