sentry
sentry copied to clipboard
Large events strategy
Problem Statement
https://develop.sentry.dev/sdk/envelopes/#size-limits Reading the docs on envelope size limits has me worried: what is the action taken by Sentry should my events reach the limit? The docs do not specify. As such, I need to assume the worst-case scenario: the server will drop the envelope and all events will be lost.
Is this correct, or will the Sentry server trim the envelope intelligently until it reaches the limit? Seeing as the network data was already consumed at this point anyway — it seems like a blind drop doesn't benefit the server but in storage.
If a drop /is/ what happens, will I learn about it somehow, such as through a drop event recorded in its stead?
Since I attach runtime logs to the event as breadcrumbs, there is a real possibility that a long or heavy runtime might lead to hitting the limit. It's quite important that I can be aware of this rather than having a blind-spot to it and silently building in a mechanism that ignores crashes from a certain segment of the user-base.
Solution Brainstorm
We do have SentryOptions.maxBreadcrumbs, an imperfect guard. The default maximum is set to 100, which feels low as a count of log statements. Of course, it is also no guarantee that it will keep the envelope within the server-defined limits. Furthermore, it is undefined by the documentation what happens when the limit is hit. Will new breadcrumbs drop old ones (sounds good) or will new breadcrumbs be dropped once the limit is reached (sounds bad).
We can use SentryOptions.beforeSend to introspect the event before send-out and assess its size. Theoretically this might be an opportunity for us to vet it against the limit and trim it manually before sending, but we have a few issues: [1] we do not know what the current limits are (we can hard-code the 1MB event limit from the doc), [2] we do not know the byte-size of the event. The best proxy is event.serialize() which yields us a dictionary, but AFAIK Sentry itself uses an internal MsgPack implementation that we cannot access, so at best we can JSON serialize or over-count with something like event.serialize().description.utf8.count.
If I had to assess who is best equipped to deal with the problem, then my list from best-to-worst would be:
- The Sentry server
- The Sentry SDK
- The app developer
Where the strategy from best-to-worst would be:
- Trim the envelope intelligently/incrementally, dropping oldest/least relevant data points first.
- Replace with a "envelope limit reached" event that contains some basic details.
- Drop the envelope silently, ignoring the events with no recourse.
Are you willing to submit a PR?
No response
Worth mentioning https://github.com/getsentry/sentry-java/issues/910
What's curious to note is that when I add > 100 breadcrumbs to the scope, Sentry stops invoking beforeSend:
[Sentry] [debug] [SentryCrashStackCursor_SelfThread:61] Retrieving backtrace with async swift stitching...
[Sentry] [debug] [SentryCrashStackCursor_SelfThread:78] Finished retrieving backtrace.
At this point, adding < 100 breadcrumbs halts the code in the beforeSend call-back.
Adding > 100 breadcrumbs skips the callback and proceeds directly to sending the event off:
[Sentry] [debug] [SentryFileManager:321] Writing envelope to path: /Users/hubstaff/Library/Developer/CoreSimulator/Devices/🆔/data/Containers/Data/Application/🆔/Library/Caches/io.sentry/a999994c4ff1a25927f709eb50500057d6cdae95/envelopes/1714509134.833000-00000-🆔.json
[...]
I should note that I have SentryOptions.maxBreadcrumbs set to 500, and still I am seeing this odd behaviour.
That clearly suggests a bug to me, my beforeSend should not be skipped, it does important things.
As a result, it is now also impossible to use the beforeSend call-back to intelligently trim the size of the event to remain within the server-documented size limit.
Hey @lhunath thanks for reaching out - the product docs on size limits offer more details about the handling. In short, there's an overall size limit for the whole compressed/uncompressed event, which causes events to be dropped. After that, individual fields are trimmed as applicable.
The number of events that are dropped because they exceed the overall size limit are shown as part of the "Dropped" stat on the Stats page.
Wrt. the breadcrumbs limit - yes old breadcrumbs will be removed as new ones are added (LIFO).
What's curious to note is that when I add > 100 breadcrumbs to the scope, Sentry stops invoking
beforeSend:
This is a bug - thanks for opening a separate issue, we'll investigate!
Thank you, I regret I was unable to find this information on my own from reading the docs. Perhaps some additional pointers could be added with respect to the server behaviour & dropped events statistics, in the places where it is missing.
Unfortunately, dropped events is rather ambiguous and there's very little clarity in what the cause was for the dropped event, or what it was about. It would appear that SentrySDK is in a good position to attempt a drop recovery of some sort by uploading a minimal stub with some minimally safe information for the event, should this occur as a result of a known and recoverable situation such as envelope size limits.
At a minimum, it might be valuable to expose to the developer an onError callback to inform him of the circumstance and give him an opportunity to accommodate.
Thanks for the feedback about the docs - I'll see what we can do to make this info easier to find 👍
The overall topic would involve all SDKs, API, Ingestion, etc. so I'll cycle back to the larger team and may move this issue to a different repo later
related: https://github.com/getsentry/sentry/issues/59320
Assigning to @getsentry/support for routing ⏲️
Routing to @getsentry/product-owners-stats for triage ⏲️