Unreal Engine Crash Reports Being Dropped Again

Open rayanami opened this issue 3 years ago • 11 comments

Environment

SaaS (https://sentry.io/)

Version

No response

Steps to Reproduce

Unreal Engine 4.27.2

  1. Set up Unreal Crash Reporter to send events to Sentry.io (we are using the older method, not yet using the prototype Sentry UE4 Plugin).
  2. Trigger a crash*
  3. Observe on Sentry.io that the crash is received but dropped
  4. HTTP response is 200 and we receive a GUID which looks like a crash event

image

  • We have noticed that some Sentry events are processed correctly and others are dropped. We've checked the data on our side: everything is properly formatted JSON and within the payload size limits, and tag names and values are within their size limits and do not contain the '\n' character. We get a 100% repro rate on the dropped event for certain data input. We had this same problem in April 2022. It was fixed for a while with a change to zlib, and seems to have regressed in July 2022.
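The client-side checks described above could be automated with something like the following. This is a minimal sketch, not the reporter's actual tooling: the limits of 32 characters for tag keys and 200 for tag values are assumptions taken from memory of Sentry's documentation and should be verified, and `find_tag_problems` is a hypothetical helper name.

```python
import json

# Assumed limits (verify against current Sentry documentation).
MAX_TAG_KEY_LEN = 32
MAX_TAG_VALUE_LEN = 200


def find_tag_problems(sentry_json: str) -> list:
    """Return human-readable problems found in the 'tags' of a JSON payload."""
    try:
        payload = json.loads(sentry_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    if not isinstance(payload, dict):
        return ["top-level JSON value is not an object"]

    tags = payload.get("tags", {})
    if not isinstance(tags, dict):
        return ["'tags' is not an object"]

    problems = []
    for key, value in tags.items():
        text = str(value)
        if len(key) > MAX_TAG_KEY_LEN:
            problems.append(f"tag key too long: {key!r}")
        if len(text) > MAX_TAG_VALUE_LEN:
            problems.append(f"tag value too long for key {key!r}")
        if "\n" in key or "\n" in text:
            problems.append(f"newline in tag {key!r}")
    return problems
```

Running this over the `__sentry` JSON of each crash before upload would at least rule out the tag-related causes on the client side.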

Related: https://github.com/getsentry/sentry/issues/34131

Zip folder of a crash that is reliably 100% dropped by the Sentry endpoint is attached.

crashinfo-Loki-pid-15-6EE0DBD370D24F7CAE5EA7B3B36409DC (2).zip

Expected Result

We expect that if we receive an HTTP response code of 200 and a GUID, the Sentry event is created and not dropped by the endpoint.

If the payload is malformed in some way, we expect to receive an error message such as HTTP 413 Payload Too Large or other descriptive error that lets us know what the issue might be from the client side.

Actual Result

We receive an HTTP response code of 200 and the Sentry endpoint drops our event.

rayanami avatar Sep 07 '22 23:09 rayanami

Routing to @getsentry/ecosystem for triage. ⏲️

getsentry-release avatar Sep 08 '22 09:09 getsentry-release

Hi @ReneGreen27, I think this issue should be routed to the appropriate SDK team. Ecosystem does not have the context to resolve this issue.

brianthi avatar Sep 08 '22 15:09 brianthi

Routing to @getsentry/owners-native for triage. ⏲️

getsentry-release avatar Sep 09 '22 01:09 getsentry-release

Hi @rayanami, it looks like you can attach to the upload step directly from Visual Studio, which is perfect!

Can you capture the raw request (upload) stream bytes for me? Unreal uses its own custom zip-like format that we need to parse (and we have had bugs doing so in the past). Uploading the "real" zip file is not helpful in this case, as it has a different container format.
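As a quick sanity check on captured bytes, one could test whether the payload inflates as a zlib stream at all. This is only a sketch under the assumption that the upload body is zlib-compressed (suggested by the zlib fix mentioned earlier in this thread); it says nothing about the container layout inside, and `try_inflate` is a hypothetical helper name.

```python
import zlib


def try_inflate(raw: bytes):
    """Return the decompressed bytes, or None if raw is not a valid zlib stream."""
    try:
        return zlib.decompress(raw)
    except zlib.error:
        return None
```

For example, reading the dumped request body with `open(path, "rb").read()` and passing it to `try_inflate` shows whether the capture is at least a well-formed compressed stream before it is attached here.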

Swatinem avatar Sep 12 '22 09:09 Swatinem

CrashPayloadRaw.zip

Please find attached a ZIP with the HTTP request payload binary data. You can see the source code used to output the binary file in the attached screenshot.

image

rayanami avatar Sep 12 '22 21:09 rayanami

Hi @rayanami

The attachment was helpful. It seems to parse correctly when inspected manually, but it only has a context, and no minidump or crash report attached.

What are your sentry stats saying about dropped events? Can you share your project url so we can also take a closer look?

Swatinem avatar Sep 14 '22 14:09 Swatinem

Hi @Swatinem,

We are seeing ~20% of our Sentry events being dropped. I've manually run the Crash Reporter Client to upload this crash and reliably see it being dropped by Sentry.

This crash was generated on Linux, it doesn't have a minidump but does have a log file and Diagnostics.txt with call stack.

Project URL: https://sentry.io/organizations/theorycraft-games/projects/theorycraft-games/?project=5710262

image

rayanami avatar Sep 14 '22 17:09 rayanami

Okay, I think we were investigating in the wrong direction from the start.

It looks like there is nothing wrong with the raw payloads themselves. Looking into the more detailed stats I can see that the events were rejected because of Invalid JSON. Looking at the first zip file you attached, it looks like the problem is here:

	<GameData>
		<__sentry>{&quot;tags&quot;:{&quot;ComputerName&quot;:

The CrashContext.runtime-xml GameData.__sentry tag has some HTML entities in it, which are obviously invalid as raw JSON.

We are still investigating whether this is a recent regression, or whether we indeed need to parse that XML taking HTML entities into account.

Swatinem avatar Sep 16 '22 11:09 Swatinem

Those HTML entities are also not the problem. The event data from within the CrashContext.runtime-xml file parses just fine 🤷🏻‍♂️

Your more recent attachment with the raw payload seems to process fine when we reproduce it manually. For the zip file in the first comment, at least the context file seems to parse fine as well.
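To illustrate why the entities are harmless: a standard XML parser decodes `&quot;` back to `"` before the element text is handed to the JSON parser. The sketch below uses Python's standard library rather than Sentry's actual ingestion code; the root element name and the `build-01` value in the sample are illustrative assumptions, not taken from the attached crash.

```python
import json
import xml.etree.ElementTree as ET

# Illustrative sample; the root element name and tag value are assumptions.
SAMPLE = """<FGenericCrashContext>
  <GameData>
    <__sentry>{&quot;tags&quot;:{&quot;ComputerName&quot;:&quot;build-01&quot;}}</__sentry>
  </GameData>
</FGenericCrashContext>"""


def extract_sentry_json(xml_text: str):
    """Parse the crash context XML and return the decoded __sentry JSON, if any."""
    root = ET.fromstring(xml_text)
    node = root.find("./GameData/__sentry")
    if node is None or node.text is None:
        return None
    # node.text already has XML entities decoded, so it is plain JSON here.
    return json.loads(node.text)
```

Only if the XML were read as raw text, without entity decoding, would the `__sentry` content fail to parse as JSON.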

Swatinem avatar Sep 16 '22 12:09 Swatinem

Thanks for the update @Swatinem - with that info, I was just able to discover the root cause of this issue, which turned out to be a user config error on our side. This crash is actually uploading successfully, but it was being grouped into an issue that had been set to 'ignore'. As a result, the event was not showing up in the default view (which filters out ignored issues) and was not triggering our Slack alert rule. After removing the 'ignore' status on that issue, I now see the server crash uploading as expected and reporting through Slack.

We still see ~20% of our events being dropped by Sentry. I will work on finding a raw payload / zip that reliably generates the drop in that case. Is there a way that I can see metadata about an event that Sentry drops, so that I'm better able to identify repro steps?

rayanami avatar Sep 16 '22 17:09 rayanami

Unfortunately not. In the employee dashboard I can see that most of your events are indeed dropped because of "invalid json", and a couple because of "payload too large" every now and then. But that is all the info we see.

Swatinem avatar Sep 19 '22 07:09 Swatinem

@rayanami I'm closing this issue as there has been no recent activity. If you still have issues, please don't hesitate to reopen it.

ashwoods avatar Mar 06 '23 15:03 ashwoods