Randomly occurring Fatal Exception

Open MailYouLater opened this issue 5 years ago • 12 comments

So, first a little background. I've been experimenting with MeshCentral off and on for some months now, and just the other day I updated to one of the new versions (0.4.x), which Ylianst said has new agents. I've been updating pretty frequently during my testing, so I'm not sure which exact version of MeshCentral I was running when each of these crashes occurred (except the most recent one, where I was running MeshCentral 0.4.1-p). However, I do know that all three occurred while I was running a 0.4.x version. In each case there was only one instance of MeshCentral on the network and only one agent operating, communicating with a MeshCentral server instance running on the same machine (both operating as the same user). The agent was in temporary connection mode (not installed; I just ran the meshagent64.exe file and clicked "Connect"). All this was on Windows 10 v1903, build 18362.295.

Sorry, I haven't done any advanced debugging yet. I've just now updated to MeshCentral 0.4.1-q and added controlChannelDebug=1 to the .msh file, so if it happens again, hopefully I'll have more details for you.

Anyway, the first time this happened, I was messing around with agent console commands, and ... I don't think I did anything particularly unusual, but right after I ran a command (sorry, I don't remember which one) it disconnected, and I realised it had crashed. The other 2 times have both happened while I wasn't doing anything with MeshCentral, or the MeshAgent. It was all just idling (presumably doing something occasionally, to keep itself connected) in the background.

meshagent64.log

[2019-09-30 12:37:59 PM] FATAL EXCEPTION [meshagent64_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff72acf4d7a / BaseAddr: 0x00007ff72ad496dc / Delta: 346466]

[2019-10-01 03:23:11 PM] FATAL EXCEPTION [meshagent64_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff743ae4d7a / BaseAddr: 0x00007ff743b396dc / Delta: 346466]

[2019-10-02 01:43:30 PM] FATAL EXCEPTION [meshagent64_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff743ae4d7a / BaseAddr: 0x00007ff743b396dc / Delta: 346466]
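For anyone comparing their own logs against these, here is a minimal Python sketch that parses FATAL EXCEPTION lines and groups them by executable and Delta. It assumes the line layout is exactly what is quoted above. The relationship Delta = BaseAddr - FuncAddr is only something observed in these log lines, not documented agent behaviour. The names `parse_fatal_lines` and `crash_sites` are made up for illustration.

```python
import re
from collections import Counter

# Pattern for the log lines quoted in this issue (layout assumed from those lines only).
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+FATAL EXCEPTION\s+\[(?P<exe>[^\]]+)\]\s+@\s+"
    r"\[FuncAddr:\s*(?P<func>0x[0-9a-fA-F]+)\s*/\s*"
    r"BaseAddr:\s*(?P<base>0x[0-9a-fA-F]+)\s*/\s*"
    r"Delta:\s*(?P<delta>\d+)\]"
)

def parse_fatal_lines(text):
    """Return one dict per FATAL EXCEPTION line found in text."""
    records = []
    for m in LINE_RE.finditer(text):
        func = int(m["func"], 16)
        base = int(m["base"], 16)
        delta = int(m["delta"])
        records.append({
            "timestamp": m["ts"],
            "exe": m["exe"],
            "func": func,
            "base": base,
            "delta": delta,
            # Observed in this thread: Delta equals BaseAddr - FuncAddr.
            "delta_matches": base - func == delta,
        })
    return records

def crash_sites(records):
    """Count crashes per (executable, delta) pair."""
    return Counter((r["exe"], r["delta"]) for r in records)

SAMPLE = """
[2019-09-30 12:37:59 PM] FATAL EXCEPTION [meshagent64_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff72acf4d7a / BaseAddr: 0x00007ff72ad496dc / Delta: 346466]
[2019-10-01 03:23:11 PM] FATAL EXCEPTION [meshagent64_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff743ae4d7a / BaseAddr: 0x00007ff743b396dc / Delta: 346466]
[2019-10-02 01:43:30 PM] FATAL EXCEPTION [meshagent64_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff743ae4d7a / BaseAddr: 0x00007ff743b396dc / Delta: 346466]
"""

if __name__ == "__main__":
    for site, count in crash_sites(parse_fatal_lines(SAMPLE)).items():
        print(site, count)
```

The absolute addresses change between runs, so the Delta is the more useful value to compare. Identical Deltas for the same executable hash suggest the crashes hit the same spot in the binary.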

meshagent64.msh

MeshName=.
MeshType=2
MeshID=0xE5A0289D08630C767D1706C1205D81EC2F654295ABA272264272C0D7715CA781154C0F92F50B8F74C60AE34370A88A23
ServerID=818BB4D83D9012496A575327DE5AA2E078538FA2E9E64C6B2F9A4B3736FBA95755F00BF664DEE0A97C935B339758B58F
MeshServer=local
ignoreProxyFile=1

In case anything about the .db or .exe files will tell you anything important, I've zipped them up along with the .log and .msh files here: meshagent64.zip

MailYouLater avatar Oct 02 '19 21:10 MailYouLater

If you edit the .msh file and add: coreDumpEnabled=1

to the end of it, and restart the agent, it will generate a dump file whenever it crashes. If you could send that file, it would make debugging crashes way easier.
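If you need to make that change on several machines, a small script can do it. This is a rough Python sketch, assuming the .msh file is plain `key=value` text with one setting per line (as in the file posted above). The helper name `set_msh_option` is hypothetical and not part of MeshAgent. The agent still has to be restarted afterwards.

```python
from pathlib import Path

def set_msh_option(path, key, value):
    """Set key=value in a .msh file, replacing an existing entry or appending one.

    Assumes a plain text file with one key=value pair per line.
    """
    p = Path(path)
    lines = p.read_text().splitlines() if p.exists() else []
    out = []
    replaced = False
    for line in lines:
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            # Keep only one entry for this key, carrying the new value.
            if not replaced:
                out.append(f"{key}={value}")
                replaced = True
        else:
            out.append(line)
    if not replaced:
        # Key was not present: add it to the end of the file.
        out.append(f"{key}={value}")
    p.write_text("\n".join(out) + "\n")

if __name__ == "__main__":
    # Example path only; point this at your own .msh file.
    set_msh_option("meshagent64.msh", "coreDumpEnabled", "1")
```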

krayon007 avatar Oct 03 '19 00:10 krayon007

Thanks, I've added coreDumpEnabled=1 to my .msh file and restarted the agent. Should I disable controlChannelDebug? or keep it enabled too?

MailYouLater avatar Oct 03 '19 16:10 MailYouLater

The control channel debug thing is good for figuring out connection problems. The core dump thing is the most useful for tracking crashes. If you don't have any connection issues, I'd put controlChannelDebug= into the .msh, because otherwise it's just writing stuff into the log for not much benefit.

krayon007 avatar Oct 03 '19 23:10 krayon007

Where does it save the core dump to? I had a different (unrelated) crash happen while I was messing with some things, but I don't see a core dump. Did it fail to dump the core? or did it just put it somewhere I wasn't expecting?

Also, I just noticed today that one of the computers I have a normal, non-temporary installation on (which connects to a different server, on a different network) is also crashing. Here's its log:

[2019-09-28 04:37:34 AM] FATAL EXCEPTION [MeshAgent_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff667bf4d7a / BaseAddr: 0x00007ff667c496dc / Delta: 346466]

[2019-10-05 06:28:47 AM] FATAL EXCEPTION [MeshAgent_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff667bf4d7a / BaseAddr: 0x00007ff667c496dc / Delta: 346466]

I've added coreDumpEnabled=1 to its .msh file too.

MailYouLater avatar Oct 08 '19 21:10 MailYouLater

On most platforms it will be in the same folder as the executable. On Windows, it will be called meshagent.dmp. On Linux and FreeBSD, it is simply called core. On macOS, it's in /cores, where the filename is the pid of the crashed process.
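As a quick way to check where to look, here is a small Python sketch that turns the description above into a lookup. It only encodes what is stated in this comment; the actual file name on a given agent build may differ, so treat the result as a starting point. The function name `expected_dump_path` is hypothetical.

```python
import os
import sys

def expected_dump_path(exe_path, pid=None, platform=sys.platform):
    """Guess where the agent's crash dump lands, per the description in this thread."""
    exe_dir = os.path.dirname(exe_path)
    if platform.startswith("win"):
        # Windows: meshagent.dmp next to the executable.
        return os.path.join(exe_dir, "meshagent.dmp")
    if platform.startswith(("linux", "freebsd")):
        # Linux and FreeBSD: a file named "core" next to the executable.
        return os.path.join(exe_dir, "core")
    if platform == "darwin":
        # macOS: under /cores, named after the pid of the crashed process.
        if pid is None:
            raise ValueError("pid is required to locate the dump on macOS")
        return f"/cores/{pid}"
    raise ValueError(f"no known dump location for platform {platform!r}")

if __name__ == "__main__":
    print(expected_dump_path(sys.argv[0], pid=os.getpid()))
```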

krayon007 avatar Oct 09 '19 00:10 krayon007

Sorry it took me a while to get back to this. My local test computer hasn't crashed again recently, but the other computer (where I have a permanent-style installation) crashed again the other day. Here's the new portion of the .log file:

[2019-10-12 09:05:37 PM] FATAL EXCEPTION [MeshAgent_0E6B60297D2ADB72.exe] @ [FuncAddr: 0x00007ff658724d7a / BaseAddr: 0x00007ff6587796dc / Delta: 346466]

And here's the .dmp file it made: MeshAgent_2019-10-12_210537.zip
It's in a .zip because GitHub wouldn't accept it otherwise.

MailYouLater avatar Oct 16 '19 00:10 MailYouLater

@krayon007: Did you get a chance to take a look at this?

MailYouLater avatar Oct 22 '19 16:10 MailYouLater

I must've missed the zip file. I'll take a look at it when I get back in the office later today.

krayon007 avatar Oct 29 '19 19:10 krayon007

Your machines that are crashing... Do they have AMT? That dump shows the crash is occurring when a JavaScript object is garbage collected, but it doesn't tell me which type of object was collected. A few days ago I fixed a crash related to HECI on Windows that was caused when an AMT response was received after the JavaScript object was already garbage collected. They may be related.

krayon007 avatar Oct 29 '19 21:10 krayon007

No. I have some machines that are AMT capable, but I'm not using it, and as far as I'm aware, the machines that are crashing are not AMT capable.

MailYouLater avatar Oct 29 '19 21:10 MailYouLater

Quick update: @krayon007 is able to make a similar error happen on the agent and is looking into it. That said, I have only one more work day tomorrow and will then be traveling a lot for 3 weeks. I don't want to risk releasing a new MeshAgent unless it's going to get a lot of testing. So, I will probably wait until I get back to publish a new agent.

Also, this is a duplicate of an issue reported on MeshCentral #556. But let's keep both open for now, I don't go in the MeshAgent GitHub much.

Ylianst avatar Oct 30 '19 00:10 Ylianst

This has happened to me as well:

[2020-01-08 11:47:36 AM] FATAL EXCEPTION [MeshAgent_E7BC4772A5B427C7.exe] @ [FuncAddr: 0x00007ff7112b507e / BaseAddr: 0x00007ff71130b6ac / Delta: 353838]

[2020-01-09 09:28:22 AM] FATAL EXCEPTION [MeshAgent_E7BC4772A5B427C7.exe] @ [FuncAddr: 0x00007ff7112b507e / BaseAddr: 0x00007ff71130b6ac / Delta: 353838]

vitko-bg avatar Jan 27 '20 17:01 vitko-bg