go icon indicating copy to clipboard operation
go copied to clipboard

runtime: use MiniDumpWriteDump for GOTRACEBACK=crash on Windows

Open zx2c4 opened this issue 4 years ago • 36 comments
trafficstars

CL 307372 was committed, which contained a nice change to produce crashdumps under certain circumstances. However, several questions were proposed about when those crashdumps would be automatically uploaded to Microsoft in the context of using Insider Builds. In the absence of good answers to that, the otherwise okay CL was reverted (by me) with CL 362454. This proposal is essentially about reverting the revert, and having an extended discussion about under which circumstances these should be enabled and whether we need a new flag for it, as suggested by @bufflig.

Before this proposal moves to an "under consideration" step, we should get a clearer idea about what we're proposing. I'll defer to @zhizheng052 and @bhiggins on this, as they wrote the original CL.

CC @zhizheng052 @bhiggins @bufflig @ianlancetaylor @alexbrainman @aarzilli @aclements

zx2c4 avatar Nov 09 '21 02:11 zx2c4

In the CL discussion, there was some question as to when "optional" diagnostics gets enabled. I was just installing a fresh Windows 11 board a few minutes ago and noticed that this was checked by default:

image

zx2c4 avatar Nov 09 '21 03:11 zx2c4

It seems to me that if you opt to send you crash dumps to microsoft during install (by not opting out) and then opt into creating a crash dump by setting GOTRACEBACK=crash, then having your crash dump uploaded to microsoft is the expected behavior. Given that this only happens on insider builds, it isn't even that sketchy that the option is preselected. Since this is already gated by two options I don't think adding a third opt-in (through a new flag) is going to make a difference. IMO it's either:

  • add a courtesy reminder that windows does this in the documentation of GOTRACEBACK (and maybe the release notes), or
  • do not implement GOTRACEBACK=crash on windows and document that this is the reason.

aarzilli avatar Nov 09 '21 07:11 aarzilli

Given that this only happens on insider builds, it isn't even that sketchy that the option is preselected.

This wasn't an insider build; I was installing the Windows 11 final release.

zx2c4 avatar Nov 09 '21 12:11 zx2c4

Given that this only happens on insider builds, it isn't even that sketchy that the option is preselected.

This wasn't an insider build; I was installing the Windows 11 final release.

That sucks. Still, I don't think it changes the rest of my opinion.

aarzilli avatar Nov 09 '21 12:11 aarzilli

I assume there's something in the registry that enables the "sending to MS" behavior. I wonder if we could detect that and either:

a) Not enable WER or b) Enable WER but warn the user or c) Simply refuse to run when you have that combination (GOTRACEBACK=crash and uploading is enabled)?

Enabling GOTRACEBACK=crash even when you don't actually want the file on disk is not an unreasonable functionality, and was there before this change (it gives you some additional crash information). To get the added behavior that you upload crashdumps to a third party does not seem expected in those cases. At least a stern warning seems mandated, but I'd actually prefer to disable/revert the change for 1.18 and decide exactly how we want to proceed.

bufflig avatar Nov 09 '21 20:11 bufflig

I'd actually prefer to disable/revert the change for 1.18 and decide exactly how we want to proceed.

This has already happened, FYI.

There's also the WER option to pop up a UI asking for consent. I suspect this won't work in services, though.

Also, I wonder if we could ditch this whole WER set of options and just create a minidump ourselves? https://docs.microsoft.com/en-us/windows/win32/api/minidumpapiset/nf-minidumpapiset-minidumpwritedump

zx2c4 avatar Nov 09 '21 20:11 zx2c4

To clarify: this is only when GOTRACEBACK=crash is set, right?

elagergren-spideroak avatar Nov 09 '21 22:11 elagergren-spideroak

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc avatar Dec 01 '21 20:12 rsc

From the top post:

Before this proposal moves to an "under consideration" step, we should get a clearer idea about what we're proposing.

For the record, I still don't quite know what we're proposing here.

zx2c4 avatar Dec 01 '21 20:12 zx2c4

At least the proposal discussion will get more eyes on it. What are the options? Is it possible to write a crash dump that doesn't get collected and sent off?

rsc avatar Dec 08 '21 18:12 rsc

It sounds like we are still waiting to find out what exactly is being proposed here. Does anyone know?

rsc avatar Jan 05 '22 18:01 rsc

We are still waiting to find out exactly what is being proposed here. If we can't figure that out, it seems like this would be a likely decline.

rsc avatar Jan 19 '22 18:01 rsc

To clarify: this is only when GOTRACEBACK=crash is set, right?

Correct.

It sounds like we are still waiting to find out what exactly is being proposed here. Does anyone know?

I reviewed CL 307372. The change makes Go runtime create process crash dumps that can be used by Delve to debug crashes.

The feature is enabled when you set GOTRACEBACK=crash environment variable.

There are also other things that you need to do to enable this feature. Here are the instructions

https://docs.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps

If we can't figure that out, it seems like this would be a likely decline.

Hopefully my explanation is good enough for you to reconsider.

I am not planing to use this feature today. But I can see how it can be life saving for me and others when they have nothing but crashing program. And crash dump is not good enough.

Also these crash dumps are standard on Windows. Other vendors use them for exactly the same purpose. So I don't see why Go should not implement features available to other tool developers.

Alex

alexbrainman avatar Jan 20 '22 11:01 alexbrainman

What I also want to add to my comment above is that this feature was requested nearly 5 years ago in #20498. Brad labeled it as NeedsFix and Ian as help wanted, so assumed it was OK to implement this feature.

Different people tried to implement this feature. But the implementation required quite a bit of effort and manual testing.

What made me put extra effort with CL 307372 (most recent attempt) is that I discovered that Delve can now actually use the dump files generated by this change. Which makes this feature quite more useful to more people.

I understand concerns that Jason raised about his crash dumps sent to Microsoft. But he can use GOTRACEBACK environment variable or some settings from

https://docs.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps

to stop that.

This feature is what Windows does. If we want crash dumps, we have to accept whatever policies Microsoft enforce onto that feature. We can make it clear in the docs what happens when user sets GOTRACEBACK=crash.

Alex

alexbrainman avatar Jan 21 '22 05:01 alexbrainman

Hi @alexbrainman, thanks for the information. This sounds like a very useful, important feature for users.

I tried to read https://docs.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps and am confused about the implications with sending data to Microsoft.

It says "This feature is not enabled by default." Does that mean that you have to edit a registry setting to make GOTRACEBACK=crash work? Does it mean that MS doesn't collect the crashes by default? Or if not, what exactly is the registry setting that has to be changed to stop uploads of GOTRACEBACK=crash-generated files to MS?

Thanks.

rsc avatar Jan 26 '22 18:01 rsc

It says "This feature is not enabled by default." Does that mean that you have to edit a registry setting to make GOTRACEBACK=crash work?

GOTRACEBACK=crash will still output appropriate stack trace. But GOTRACEBACK=crash will not create crash dump files until you edit your registry. That is how I read Microsoft doco. And that is what I see on my PC.

Does it mean that MS doesn't collect the crashes by default?

I don't believe so. But don't take my word for it. Reading

https://docs.microsoft.com/en-us/windows/win32/wer/using-wer

it talks about Windows OS crashes - see WER sends the report to Microsoft (Watson Server) if the user consented. bit. But we are talking about our program crash dumps. That would be too wasteful for Microsoft to collect all our program dumps.

what exactly is the registry setting that has to be changed to stop uploads of GOTRACEBACK=crash-generated files to MS?

I looked at my friend's PC (with registry untouched by hackers like me), and it does have HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting key, but there is no LocalDumps key inside. See https://docs.microsoft.com/en-us/windows/win32/wer/collecting-user-mode-dumps for details.

This registry config does not create any crash dumps regardless of GOTRACEBACK=crash setting.

If you have any Windows PC, you can check its registry yourself. You can use regedit program, if you have GUI access to your PC (direct console or Remote Desktop).

Alternatively you can write Go program to list that registry key. If you start with internal/syscall/windows/registry.TestReadSubKeyNames and adjust couple of lines, you should be able to run that test on Go builders to see what their regstry settings are. If you care.

Alex

alexbrainman avatar Jan 29 '22 04:01 alexbrainman

@zx2c4 's suggestion to create a minidump should be considered also as a more privacy focused alternative.

beoran avatar Feb 02 '22 13:02 beoran

It sounds like if we make this change, then developers who have not edited the registry will not see any change at all. And developers who did edit the registry will get exactly what those edits imply. That sounds OK too. So it sounds like we should do this?

Does anyone object to accepting this proposal?

rsc avatar Feb 02 '22 18:02 rsc

@zx2c4 's suggestion to create a minidump should be considered also as a more privacy focused alternative.

I don't know what you mean by "minidump". But I tested CL 307372 with DumpType registry value set to 2. I also tried dumps created with DumpType registry value of 1, but these are not recognised by Delve.

You can see CL 307372 commit message explaining DumpType values that we tried. Just like I explained it here.

Alex

alexbrainman avatar Feb 03 '22 08:02 alexbrainman

@alexbrainman @beoran It seems to me that there is some confusion brewing here around the term "minidump" stemming from microsoft decision to use "minidump" and "mini dump" to mean two different things:

  • minidump is a file format used on windows for userspace core dumps, roughly equivalent to ET_CORE elf files. The term is used in contrast to crashdump, which is the format of operating system core dumps. The contents of a minidump are determined by the MINIDUMP_TYPE enumeration
  • mini dump is a minidump with MINIDUMP_TYPE set to MiniDumpNormal. This is in contrast with a "full dump" which includes the full memory of the process.

@zx2c4 was suggesting above to call MiniDumpWriteDump from dbghelp.dll directly, instead of letting WER do it. Either way a minidump is produced and either way the result can be a mini dump or a full dump.

AFAIK mini dumps are recognized by Delve but there is nothing useful in them (because of how goroutines are implemented), which is why it might seem that they aren't.

aarzilli avatar Feb 03 '22 08:02 aarzilli

It sounds like if we make this change, then developers who have not edited the registry will not see any change at all. And developers who did edit the registry will get exactly what those edits imply. That sounds OK too. So it sounds like we should do this?

Does anyone object to accepting this proposal?

Didn't we learn at some point that Windows beta users (called "insiders") get this registry key turned on?

zx2c4 avatar Feb 03 '22 10:02 zx2c4

@zx2c4 was suggesting above to call MiniDumpWriteDump from dbghelp.dll directly, instead of letting WER do it. Either way a minidump is produced and either way the result can be a mini dump or a full dump.

Indeed this still seems like the best solution.

zx2c4 avatar Feb 03 '22 10:02 zx2c4

Indeed this still seems like the best solution.

In theory it makes no difference, if microsoft wants to upload your core dumps they can hook MiniDumpWriteDump just like they can hook WER. The consent they ask is broad. In practice I don't know if there is any difference, has anyone checked that WER actually uploads anything by default and MiniDumpWriteDump doesn't?

aarzilli avatar Feb 03 '22 10:02 aarzilli

WER means we hand the state of a crashing app to the WER API and say "do something with it." As an app developer, I can actually register to receive those dumps on Microsoft Partner Portal. With MiniDumpWriteDump, the third param is a file handle, which means we're in control. I mean sure, Microsoft could shim this and intercept everything; it could do all this behind the scenes too without any API call. But that seems a bit much tinfoil right? That'd be a Big Deal if they were doing that, whereas WER is kind of already marked as part of Microsoft's "crash into this blackbox" apparatus.

zx2c4 avatar Feb 03 '22 11:02 zx2c4

@aarzilli Thanks for the tutorial on "minidumps" vs "mini dumps". You said that "mini dumps" are recognized by Delve but basically useless because they don't have memory contents. What about "minidumps" as generated by MiniDumpWriteDump? Do they work with Delve?

If the WER API and MiniDumpWriteDump both write files that work with Delve, and if we have uncertainty about the WER API and uploads to MS but have no uncertainty about MiniDumpWriteDump, then it seems like the choice is clear: use MiniDumpWriteDump.

But are both of those if statements true?

rsc avatar Feb 09 '22 18:02 rsc

What about "minidumps" as generated by MiniDumpWriteDump? Do they work with Delve?

As far as I can tell procdump.exe uses MiniDumpWriteDump to write its minidumps and Delve works with those.

aarzilli avatar Feb 09 '22 18:02 aarzilli

OK, well if MiniDumpWriteDump works with Delve, then my previous complex conditional reduces to "the choice is clear: use MiniDumpWriteDump".

Does anyone object to using MiniDumpWriteDump instead of the WER API?

rsc avatar Feb 16 '22 18:02 rsc

Retitled to what I believe we converged on. Any objections?

rsc avatar Feb 23 '22 18:02 rsc

Seems okay to me.

zx2c4 avatar Feb 23 '22 18:02 zx2c4

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

rsc avatar Mar 02 '22 21:03 rsc