Occasional crash when using sensors on macOS arm64
Describe the bug
I get random crashes when calling SensorsTemperatures() on macOS 15.3.2 (Darwin 24.3.0, arm64).
The error looks like a race condition, since it only occurs when multiple goroutines call SensorsTemperatures concurrently.
In real usage, I think my actual issue is a concurrent call between SensorsTemperatures and another use of IOKit and/or CoreFoundation. But I was not able to produce a reproducible code sample that makes only one call per subsystem (sensors, disk, cpu, mem...).
To Reproduce
package main

import (
	"log/slog"
	"sync"

	"github.com/shirou/gopsutil/v4/sensors"
)

func main() {
	var wg sync.WaitGroup
	for range 30 { // The higher this number, the more likely the issue is to occur. Empirically, 30 seems a good value.
		wg.Add(1)
		go func() {
			defer wg.Done()
			r, err := sensors.SensorsTemperatures()
			if false {
				// The log itself isn't required to reproduce the bug, but without
				// assigning the SensorsTemperatures result to a variable the bug doesn't
				// seem to occur, maybe due to compiler optimization?
				slog.Info("sensors", slog.Any("r", r), slog.Any("err", err))
			}
		}()
	}
	wg.Wait()
}
Run the program (possibly multiple times, since the race condition is only hit occasionally):
go build sensors_bug.go
while ./sensors_bug ; do echo "Sucess"; done 2>&1 | tee large_error_message.log
It results in an error like:
unexpected fault address 0x100921808
fatal error: fault
[signal SIGBUS: bus error code=0x1 addr=0x100921808 pc=0x10092181c]
goroutine 39 gp=0x14000106c40 m=28 mp=0x140000ee008 [running]:
runtime.throw({0x100923b7e?, 0x0?})
/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/panic.go:1101 +0x38 fp=0x14000297a70 sp=0x14000297a40 pc=0x1008d0fe8
runtime.sigpanic()
/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/signal_unix.go:922 +0x170 fp=0x14000297ad0 sp=0x14000297a70 pc=0x1008d2800
github.com/shirou/gopsutil/v4/internal/common.NewLibrary({0x0, 0x0})
/Users/pierref/go/pkg/mod/github.com/shirou/gopsutil/v4@v4.25.3/internal/common/common_darwin.go:97 +0x9c fp=0x14000297b20 sp=0x14000297ae0 pc=0x10092181c
github.com/shirou/gopsutil/v4/sensors.TemperaturesWithContext({0x0?, 0x0?})
/Users/pierref/go/pkg/mod/github.com/shirou/gopsutil/v4@v4.25.3/sensors/sensors_darwin_arm64.go:54 +0x6d4 fp=0x14000297fc0 sp=0x14000297b20 pc=0x100922144
created by main.main in goroutine 1
/Users/pierref/tmp/20250403-1426/sensors_bug.go:16 +0x38
goroutine 1 gp=0x140000021c0 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x100a29680?, 0x1008d1310?, 0x0?, 0x40?, 0x100da7f28?)
/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/proc.go:435 +0xc8 fp=0x1400006de50 sp=0x1400006de30 pc=0x1008d10c8
runtime.goparkunlock(...)
/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/proc.go:441
runtime.semacquire1(0x140001140b8, 0x0, 0x1, 0x0, 0x18)
/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/sema.go:188 +0x204 fp=0x1400006dea0 sp=0x1400006de50 pc=0x1008b4604
sync.runtime_SemacquireWaitGroup(0x140000021c0?)
/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/sema.go:110 +0x2c fp=0x1400006dee0 sp=0x1400006dea0 pc=0x1008d24ac
sync.(*WaitGroup).Wait(0x140001140b0)
[... truncated, since I don't believe it matters for this bug]
Expected behavior
No crash :)
Environment (please complete the following information):
- [x] Mac OS: [paste the result of sw_vers and uname -a]
$ sw_vers
ProductName: macOS
ProductVersion: 15.3.2
BuildVersion: 24D81
$ uname -a
Darwin mbp-de-pierre.bleemeo.work 24.3.0 Darwin Kernel Version 24.3.0: Thu Jan 2 20:24:16 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T6000 arm64 arm Darwin
gopsutil version:
$ cat go.mod
module test
go 1.24.2
require github.com/shirou/gopsutil/v4 v4.25.3
require (
github.com/ebitengine/purego v0.8.2 // indirect
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
golang.org/x/sys v0.28.0 // indirect
)
Additional context
I think the bug is due to the IOKit and/or CoreFoundation library being closed by another goroutine while still being used by the one that crashes.
To experiment with this, I modified TemperaturesWithContext (go mod vendor, then edit "vendor/github.com/shirou/gopsutil/v4/sensors/sensors_darwin_arm64.go").
The idea is to make TemperaturesWithContext itself do concurrent calls (like the minimal reproduction above), but this time the IOKit and CoreFoundation libraries are shared between goroutines.
func TemperaturesWithContext(_ context.Context) ([]TemperatureStat, error) {
	var wg sync.WaitGroup
	var (
		globalResult []TemperatureStat
		globalErr    error
		l            sync.Mutex
	)

	// Open both libraries once; every goroutine below shares these handles.
	ioKit, err := common.NewLibrary(common.IOKit)
	if err != nil {
		return nil, err
	}
	defer ioKit.Close()

	coreFoundation, err := common.NewLibrary(common.CoreFoundation)
	if err != nil {
		return nil, err
	}
	defer coreFoundation.Close()

	for range 30 { // Once more, the higher this number, the more likely the bug is to occur
		wg.Add(1)
		go func() {
			defer wg.Done()
			r, err := temperaturesWithContext(ioKit, coreFoundation)
			l.Lock()
			defer l.Unlock()
			globalResult = r
			globalErr = err
		}()
	}
	wg.Wait()
	return globalResult, globalErr
}

func temperaturesWithContext(ioKit *common.Library, coreFoundation *common.Library) ([]TemperatureStat, error) {
	ta := &temperatureArm{
		ioKit: ioKit,
		cf:    coreFoundation,
[... the rest of the original TemperaturesWithContext, unmodified]
With this change, calling TemperaturesWithContext no longer crashes:
$ cat single_call.go
package main

import (
	"log/slog"

	"github.com/shirou/gopsutil/v4/sensors"
)

func main() {
	r, err := sensors.SensorsTemperatures()
	slog.Info("sensors", slog.Any("r", r), slog.Any("err", err))
}
$ go build single_call.go; while ./single_call ; do echo "Sucess"; done 2>&1 | tee large_error_message.log
If you move the ioKit and coreFoundation setup inside the go func() { } (i.e. open and close the libraries per goroutine), it crashes.
Very final note: only sensors seems affected by this bug (maybe because sensors makes the most complex use of the IOKit/CF libraries?). The following code doesn't exhibit the crash even though it uses IOKit/CF concurrently for cpu/disk/mem: https://gist.github.com/PierreF/dd5864811ef6de22bfcb431810fe4f4f
The reproduction conditions seem a bit extreme: I ran the code you provided 102 times (judging by the length of large_error_message.log) before it crashed. That implies an unreasonable number of IOKit / Core Foundation calls, and since the sensors package is more complex, it may reach system limits more easily.
In my tests, I usually hit it in fewer than 10 tries :/ That probably means the race condition isn't tied only to calling sensors concurrently, and might even depend on something running elsewhere (another process on the system? other goroutines or the GC?).
If I can find some time, I'll try to come up with a more realistic way to reproduce it. In real usage I don't call sensors concurrently (only concurrently with disk/cpu/mem), and I do hit the bug "fast" (within a few hundred calls to sensors, i.e. about 1 hour with one call to sensors every 10 seconds).
In my environment, no panic occurred with your first code sample after more than 500 "Success" runs. gopsutil version is v4.25.3.
go version go1.24.3 darwin/arm64
ProductName: macOS
ProductVersion: 15.4.1
BuildVersion: 24E263
Darwin mypc 24.4.0 Darwin Kernel Version 24.4.0: Fri Apr 11 18:33:46 PDT 2025; root:xnu-11417.101.15~117/RELEASE_ARM64_T8112 arm64
My Mac Studio M1 Ultra w/ 128GB ram and macOS 15.4.1 ran once then seg faulted on the second run. Not 10, not 102, not 500. Two. I've been running into a lot of seg faults that all seem to originate from this package so I started digging more and found this issue. I do not think this is the bug I'm running into, but it's definitely similar. Attaching the log file from the crash (using the code in the OP).
EDIT: To test I just made a new folder named i inside an existing project (Notifiarr) with a go.mod that imports this package. You'll see that in the output file, but nothing from Notifiarr was used here.
Turns out this is the problem my app has been running into. I've been testing exclusively on macOS, and every time the app calls sensors it's a crapshoot on the outcome. Sometimes it's fine. Sometimes I get a full segmentation violation. I could attach that stack trace, but there's nothing in it specific to this package. I was only able to narrow it down by removing calls to sensors and watching the problem go away.
EDIT: I should also point out that I generally stick the sensors output into json.Marshal. The more common problem was the marshaller throwing errors about trying to stick strings into structs, or calling IsNil on non-nullable types. These errors only happen when there's a race condition and the data the marshaller is reading is also being written at the same time. tl;dr: this is almost certainly a data race.