Make `DedupeIntegration` more memory efficient.
Users reported that the `DedupeIntegration` can use a lot of memory, because it keeps the full exception object in memory to check whether it has already seen that exception.
Depending on the user's code, these exception objects can be large, because they also carry the traceback and its local variables (which can be huge).
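To make the cost concrete, here is a small self-contained snippet (illustration only, not SDK code; the `payload` buffer is a made-up example) showing how a retained exception pins its frame locals via the traceback:

```python
def handler():
    payload = bytearray(50 * 1024 * 1024)  # a large local variable
    raise RuntimeError("boom")

try:
    handler()
except RuntimeError as exc:
    # exc.__traceback__ references handler()'s frame, whose locals still
    # hold `payload`; storing `exc` for dedupe keeps all 50 MB alive too.
    frame = exc.__traceback__.tb_next.tb_frame
    print(len(frame.f_locals["payload"]))  # 52428800
```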
The idea is to no longer store the whole exception, but only a hash of its important parts, and to use that hash to decide whether we have seen the exception before.
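A minimal sketch of that idea, assuming the fingerprint is built from the exception type, message, and traceback locations; `exception_fingerprint` and `is_duplicate` are hypothetical names for illustration, not the actual code in `sentry_sdk/integrations/dedupe.py`:

```python
import hashlib

def exception_fingerprint(exc):
    # Reduce an exception to a small, stable digest of its identifying
    # parts instead of keeping the (potentially huge) object itself.
    parts = [type(exc).__qualname__, str(exc)]
    tb = exc.__traceback__
    while tb is not None:
        code = tb.tb_frame.f_code
        parts.extend((code.co_filename, code.co_name, str(tb.tb_lineno)))
        tb = tb.tb_next
    return hashlib.sha1("\x00".join(parts).encode("utf-8")).hexdigest()

# Dedupe then only needs to remember short hex strings:
seen = set()

def is_duplicate(exc):
    digest = exception_fingerprint(exc)
    if digest in seen:
        return True
    seen.add(digest)
    return False
```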
fixes https://github.com/getsentry/sentry-python/issues/3165 fixes https://github.com/getsentry/sentry-python/issues/4327
Codecov Report
✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.58%. Comparing base (9001126) to head (d6ed7a8).
Additional details and impacted files
Coverage Diff

| | master | #4446 | +/- |
|---|---|---|---|
| Coverage | 84.60% | 84.58% | -0.02% |
| Files | 158 | 158 | |
| Lines | 16463 | 16463 | |
| Branches | 2850 | 2850 | |
| Hits | 13928 | 13926 | -2 |
| Misses | 1694 | 1696 | +2 |
| Partials | 841 | 841 | |
| Files with missing lines | Coverage Δ |
|---|---|
| sentry_sdk/integrations/dedupe.py | 87.50% <100.00%> (ø) |
@antonpirker can't you just use `id(exc)`?
I thought about that too, but garbage collection moves objects around, so `id(exc)` could be different when we save it and when we compare it...
The `id()` of an object is guaranteed never to change during the object's lifetime. So I'm using it.
I will close this in favor of https://github.com/getsentry/sentry-python/pull/4809
Reason: Python reuses memory addresses after garbage collection, which makes the `id()`-based approach unreliable.
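A quick demonstration of that address reuse (the outcome depends on CPython's allocator, so this is illustrative rather than guaranteed):

```python
class Boom(Exception):
    pass

first = Boom("original")
first_id = id(first)
del first  # once freed, CPython may hand the same address to a new object

second = Boom("unrelated")
# If the allocator reuses the slot, id(second) == first_id, and an
# id()-based cache would wrongly flag the new exception as a duplicate.
print(id(second) == first_id)  # often True in CPython, but not guaranteed
```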