sentry
sentry copied to clipboard
Optimize transitions between grouping configs
Planning
- [x] Write initial tech spec - https://www.notion.so/sentry/Tech-Spec-Optimizing-transitions-between-grouping-configs-39e562cf8bec4aefaa79bbbd320b4aec?pvs=4
- [ ] Figure out how to store non-rule changes so they can be walked back
Milestone 1: Only calculate old hash if new hash not found
Prep
- [x] Move all helper functions to grouping module, add missing tests (Done in https://github.com/getsentry/sentry/pull/62067, https://github.com/getsentry/sentry/pull/62203, https://github.com/getsentry/sentry/pull/62888, https://github.com/getsentry/sentry/pull/62889, https://github.com/getsentry/sentry/pull/62856, and https://github.com/getsentry/sentry/pull/62887)
- [x] Update background grouping to make a copy of the event, the same way secondary grouping does (done in https://github.com/getsentry/sentry/pull/62856)
- [x] Only ever run background grouping before main grouping (done in https://github.com/getsentry/sentry/pull/63715, option removed in https://github.com/getsentry/sentry-options-automator/pull/622 and https://github.com/getsentry/sentry/pull/63793)
Refactor
- [x] Pull all grouping logic besides
migrate_off_hierarchicalandupdate_grouping_config_if_neededfrom_save_error_eventsinto new helperget_hash_values(done in https://github.com/getsentry/sentry/pull/63731) - [x] Move all grouping code from
save_error_eventsto_save_aggregate(Done in https://github.com/getsentry/sentry/pull/63805)
Set up feature flagging
- [x] Introduce feature flag for using new logic (flag added in https://github.com/getsentry/sentry/pull/64128, handler added in https://github.com/getsentry/getsentry/pull/12753)
- [x] Wrap
_save_aggregatein helperassign_to_groupwhich includes logic for which function to call (both branches call_save_aggregatefor now) (done in https://github.com/getsentry/sentry/pull/64312)
Write initial tests
- [ ] Identify/write tests for the obvious happy path(s) with feature flag off, make sure they pass (test helper added in https://github.com/getsentry/sentry/pull/64856, tests themselves added in https://github.com/getsentry/sentry/pull/64689)
- [ ] Adjust tests so they test both the flag-on and flag-off branches of
assign_to_group(Done in https://github.com/getsentry/sentry/pull/64851)
Prep for updated logic
- [ ] Create copies of
_save_aggregateandfind_existing_grouphashand call them when feature flag is on (Done in https://github.com/getsentry/sentry/pull/64851 and https://github.com/getsentry/sentry/pull/64857) - [ ] Remove hierarchical code from new copies of
_save_aggregateandfind_existing_grouphash(Done for_save_aggregate_newin https://github.com/getsentry/sentry/pull/64858 and forfind_existing_grouphash_newin https://github.com/getsentry/sentry/pull/64970) - [ ] Do other general cleanup of
_save_aggregate_new(Done in https://github.com/getsentry/sentry/pull/64974 and https://github.com/getsentry/sentry/pull/64975) - [ ] Update
_save_aggregatetests to run on both versions of_save_aggregate
Update logic
- [ ] Pull logic from
get_hash_values_newback into_save_aggregate_newso order can be mixed - [ ] Change order so only primary hashes are calculated at first, and secondary hashes are only calculated if primary hash not found
Increase coverage
- [ ] Add tests for likely edge cases (are there any?), make sure they pass both ways
Add/update metrics
- [ ] Add metrics to track difference in outcome IRL and in a hypothetical transition period (specific metrics TBA)
- [ ] Add metrics to dashboard
Internal Rollout
- [ ] Turn feature flag on for sentry org
- [ ] Turn on side effects for
_save_aggregate_newand adjust logic in_assign_to_groupto only call_save_aggregate_newinstead of calling both_save_aggregateand_save_aggregate_newwhen flag and non-mobile config conditions are met (this means we're actually using the new values internally, rather than just calculating them to compare)
Wider Rollout
- [ ] Introduce settable constant for sample rate of testing (0 by default)
- [ ] Increase sample rate to > 0
- [ ] Is there anything to do besides adjusting sample rate and options?
Force orgs off of mobile config (can be done later)
- [ ] Introduce feature flag for force-upgrading people (off by default)
- [ ] Introduce settable constant for sample rate of upgrading so not all transitions run at once (0 by default)
- [ ] Change logic for upgrading so that it ignores do-not-upgrade option if sample rate says yes and either config is mobile