sentry-unreal icon indicating copy to clipboard operation
sentry-unreal copied to clipboard

Support non-crashing aborts

Open kristjanvalur opened this issue 2 years ago • 2 comments

UnrealEngine sometimes decides to exit with an error, even without crashing. Such crashes are not caught by Sentry. We have been using our own plugin for years, (https://github.com/mainframeindustries/UE4SentryClientPlugin) but decided to move to this standard one, also to try to address this issue.

The problem is probably not currently solvable, but might be, with some work.

The issue is code like this:

// Check for GPU completion
			uint64 CompletedFenceValue = CurrentQueue.Fence.D3DFence->GetCompletedValue();

			if (CompletedFenceValue == UINT64_MAX)
			{
				// If the GPU crashes or hangs, the driver will signal all fences to UINT64_MAX. If we get an error code here, we can't pass it directly to 
				// VERIFYD3D12RESULT, because that expects DXGI_ERROR_DEVICE_REMOVED, DXGI_ERROR_DEVICE_RESET etc. and wants to obtain the reason code itself
				// by calling GetDeviceRemovedReason (again).
				HRESULT DeviceRemovedReason = CurrentQueue.Device->GetDevice()->GetDeviceRemovedReason();
				if (DeviceRemovedReason != S_OK)
				{
					VerifyD3D12Result(DXGI_ERROR_DEVICE_REMOVED, "CurrentQueue.Fence.D3DFence->GetCompletedValue()", __FILE__, __LINE__, CurrentQueue.Device->GetDevice());
				}
			}

The VerifyD3D12Result will, if failing, perform this:

	void VerifyD3D12Result(HRESULT D3DResult, const ANSICHAR* Code, const ANSICHAR* Filename, uint32 Line, ID3D12Device* Device, FString Message)
	{
		check(FAILED(D3DResult));
		
		GD3DCallFailedCS.Lock();

		const FString& ErrorString = GetD3D12ErrorString(D3DResult, Device);
		UE_LOG(LogD3D12RHI, Error, TEXT("%s failed \n at %s:%u \n with error %s\n%s"), ANSI_TO_TCHAR(Code), ANSI_TO_TCHAR(Filename), Line, *ErrorString, *Message);
		
		if (D3DResult == E_OUTOFMEMORY)
		{
			TerminateOnOutOfMemory(Device, D3DResult, false);
		}
		else
		{
			TerminateOnGPUCrash(Device, nullptr, 0);
		}

		// Make sure the log is flushed!
		GLog->Panic();

		UE_LOG(LogD3D12RHI, Fatal, TEXT("%s failed \n at %s:%u \n with error %s\n%s"), ANSI_TO_TCHAR(Code), ANSI_TO_TCHAR(Filename), Line, *ErrorString, *Message);

		// Force shutdown, we can't do anything useful anymore.
		FPlatformMisc::RequestExit(true);
	}

A bunch of stuff is logged, the log is flushed, and then an exit is performed.

Now, in this particular case, I could just make some custom engine changes, set up a hook and initi it from our game. But I was thinking that perhaps we could come up with a suggested engine PR where an exit handler hook can be registered.

The engine already has a delegate for exit, but not for "abnormal" exit:

void FWindowsPlatformMisc::RequestExitWithStatus(bool Force, uint8 ReturnCode)
{
	UE_LOG(LogWindows, Log, TEXT("FPlatformMisc::RequestExitWithStatus(%i, %i)"), Force, ReturnCode);

#if ENABLE_PGO_PROFILE
	// save current PGO profiling data and terminate immediately
	PGO_WriteFile();
	TerminateProcess(GetCurrentProcess(), 0);
	return;
#endif

	RequestEngineExit(TEXT("Win RequestExit"));
	FCoreDelegates::ApplicationWillTerminateDelegate.Broadcast();

If we create some sort of PR to UE, where we could register a callback for abnormal exits, would it be possible to have Sentry catch that, and issue a corresponding event?

kristjanvalur avatar Sep 13 '23 11:09 kristjanvalur

@kristjanvalur thanks for reaching out and doubly so for sharing with us! If there were a hook Sentry could subscribe to it during startup. We can definitely set something up to capture the event.

bitsandfoxes avatar Sep 18 '23 14:09 bitsandfoxes

Yes, well, we will most likely do experiments in our custom engine and I'll report back to this defect.

kristjanvalur avatar Sep 18 '23 17:09 kristjanvalur