Sysinfo causing program to crash on m1 mac

Open SpyMachine opened this issue 1 year ago • 2 comments

Describe the bug

We wrap sysinfo inside a Ruby gem, and our process crashes fairly regularly on M1 Macs. I am using sysinfo 0.33, rustc 1.73 and Ruby 3.1.3.

Here's the header for the crash report:

-------------------------------------
Translated Report (Full Report Below)
-------------------------------------

Process:               ruby [42126]
Path:                  /Users/USER/*/ruby
Identifier:            ruby
Version:               ???
Code Type:             ARM-64 (Native)
Parent Process:        ruby [32878]
Responsible:           rubymine [76206]
User ID:               1051487424

Date/Time:             2024-01-23 13:33:06.4356 -0500
OS Version:            macOS 14.2.1 (23C71)
Report Version:        12
Anonymous UUID:        46169751-138F-4F35-B67F-1E0619B7A5F4

Sleep/Wake UUID:       4720FC63-FDF4-486A-A035-F6355830E554

Time Awake Since Boot: 110000 seconds
Time Since Wake:       3333 seconds

System Integrity Protection: enabled

Crashed Thread:        6  diagnostic_context.rb:471

Exception Type:        EXC_GUARD (SIGKILL)
Exception Codes:       GUARD_TYPE_MACH_PORT
Exception Codes:       0x0000000000000000, 0x0000000000000000

Termination Reason:    Namespace GUARD, Code 2305843022098595840 

Application Specific Information:
crashed on child side of fork pre-exec
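
For anyone reading the Termination Reason code above, here is a minimal decoding sketch. It assumes the 64-bit EXC_GUARD code follows the field layout of the decode macros in XNU's exc_guard.h (top 3 bits guard type, next 29 bits flavor, low 32 bits target). The `GuardCode` struct and `decode_guard_code` function are hypothetical names used only for illustration.

```rust
/// Fields packed into a 64-bit EXC_GUARD code.
/// Assumed layout: bits 61..=63 guard type, bits 32..=60 flavor, bits 0..=31 target.
#[derive(Debug, PartialEq)]
pub struct GuardCode {
    pub guard_type: u32,
    pub flavor: u32,
    pub target: u32,
}

/// Split a raw guard code into its three fields (hypothetical helper).
pub fn decode_guard_code(code: u64) -> GuardCode {
    GuardCode {
        // Top three bits select the guard type.
        guard_type: ((code >> 61) & 0x7) as u32,
        // The next 29 bits carry the type-specific flavor.
        flavor: ((code >> 32) & 0x1fff_ffff) as u32,
        // The low 32 bits are the target (truncating cast keeps only those bits).
        target: code as u32,
    }
}
```

Under that assumption, 2305843022098595840 is 0x2000000300000000, which decodes to guard type 1, flavor 3 and target 0. Guard type 1 agrees with the GUARD_TYPE_MACH_PORT line in the report.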

The crash sometimes looks different but always comes from a call to host_processor_info.

Thread 10 Crashed:: diagnostic_context.rb:471
0   libsystem_kernel.dylib        	       0x18be79874 mach_msg2_trap + 8
1   libsystem_kernel.dylib        	       0x18be8bcf0 mach_msg2_internal + 80
2   libsystem_kernel.dylib        	       0x18be9a84c host_processor_info + 148
3   libsysinfo_gem.dylib          	       0x105300e14 sysinfo::unix::apple::cpu::CpusWrapper::refresh::hdc4c784ed3a862dc + 616
4   libsysinfo_gem.dylib          	       0x105250288 sysinfo_gem::cpu_usage::h3281bfbfd9d49cd8 + 208

Another example:

Thread 6 Crashed:: diagnostic_context.rb:471
0   libsystem_kernel.dylib        	       0x180a29874 mach_msg2_trap + 8
1   libsystem_kernel.dylib        	       0x180a3bcf0 mach_msg2_internal + 80
2   libsystem_kernel.dylib        	       0x180a4a84c host_processor_info + 148
3   libsysinfo_gem.dylib          	       0x10736e068 0x1072fc000 + 467048
4   libsysinfo_gem.dylib          	       0x107301500 0x1072fc000 + 21760
5   libruby.3.1.dylib             	       0x100b66628 vm_call_cfunc_with_frame + 232 (vm_insnhelper.c:3037)
6   libruby.3.1.dylib             	       0x100b68d4c vm_sendish + 1336
7   libruby.3.1.dylib             	       0x100b4b680 vm_exec_core + 8128 (insns.def:778)
8   libruby.3.1.dylib             	       0x100b5de70 rb_vm_exec + 2212

Honestly, I'm not sure whether the problem is in sysinfo or possibly in the libc crate, but I figured I'd start here.

To Reproduce

Unfortunately, I haven't found an easy way to reproduce this yet. I can continue working on that.

SpyMachine avatar Jan 30 '24 01:01 SpyMachine

I was playing with this some more this morning and was able to drastically simplify the reproduction case and remove the Ruby dependency. I'm re-uploading it here if you wouldn't mind taking a look. What stands out most is that the issue only seems to occur if I call refresh() from the parent process. If I move that call into the child process, it does not seem to reproduce.
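
To make the parent/child observation concrete, here is a rough sketch of the arrangement that did not reproduce for me: fork first, then refresh only in the child. This is not the code from the zip. `refresh_cpu_stats` is a hypothetical placeholder standing in for the real sysinfo refresh call, and `fork`, `waitpid` and `_exit` are declared by hand so the snippet needs no crates. It is an illustration of the ordering, not a confirmed fix.

```rust
use std::os::raw::c_int;

// POSIX calls declared by hand (pid_t is a 32-bit int on macOS and Linux).
extern "C" {
    fn fork() -> c_int;
    fn waitpid(pid: c_int, status: *mut c_int, options: c_int) -> c_int;
    fn _exit(status: c_int) -> !;
}

/// Hypothetical placeholder for the sysinfo CPU refresh.
/// In the real program this is where the refresh call would go.
fn refresh_cpu_stats() -> bool {
    true
}

/// Fork, run the refresh only on the child side, and report whether
/// the child exited cleanly. The parent never calls the refresh itself.
pub fn refresh_in_child() -> bool {
    let pid = unsafe { fork() };
    match pid {
        // fork failed
        -1 => false,
        // Child: refresh here, after the fork, then leave without unwinding.
        0 => {
            let ok = refresh_cpu_stats();
            unsafe { _exit(if ok { 0 } else { 1 }) }
        }
        // Parent: only wait for the child.
        child => {
            let mut status: c_int = 0;
            let waited = unsafe { waitpid(child, &mut status, 0) };
            // A raw wait status of 0 means a normal exit with code 0.
            waited == child && status == 0
        }
    }
}
```

Calling `refresh_in_child()` returns true when the child ran the refresh and exited with status 0. The crashing variant is the same flow with the refresh moved before the `fork` call in the parent.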

I'm wondering whether something asynchronous is going on with refresh_cpu, with libcurl and sysinfo both trying to access the same resource at the same time. I don't know; I'm just speculating.

The libcurl documentation on curl_global_init does indicate it should be thread-safe, but I'm not sure if it's process-safe 🙃 https://curl.se/libcurl/c/curl_global_init.html

reproduce2.zip

SpyMachine avatar Jan 31 '24 15:01 SpyMachine

I'll try to take a look next week or so (and hope that this issue is also triggered on non-M1 macs).

GuillaumeGomez avatar Jan 31 '24 21:01 GuillaumeGomez

It seems the main branch has fixed this problem, so you can now try the reproduce2 code against it. @SpyMachine

I tested on my M1 Mac.

gongzhengyang avatar Jul 13 '24 02:07 gongzhengyang

Closing then. Please don't hesitate to reopen if the bug is still there.

GuillaumeGomez avatar Jul 22 '24 19:07 GuillaumeGomez