[Bug]: Mac agent crashes with SIGBUS in sensors_darwin_arm64

Open mmisiewicz opened this issue 8 months ago • 24 comments

Description

While running the beszel agent on Darwin, after about an hour I'm observing crashes with a stack trace like this:

<snip>
2025/05/04 02:11:59 DEBUG Docker stats err="Get \"http://localhost/containers/json\": dial unix /var/run/docker.sock: connect: no such file or directory"
2025/05/04 02:11:59 DEBUG Extra filesystems data=map[]
2025/05/04 02:12:59 DEBUG New session client=192.168.1.30:48500
2025/05/04 02:12:59 DEBUG Cached stats session=6f20e5726c873ff1ba88e0196828406e94880ddc1fd3d6eae64e3c35974f6049
2025/05/04 02:13:59 DEBUG New session client=192.168.1.30:48576
unexpected fault address 0x104c0b000
fatal error: fault
[signal SIGBUS: bus error code=0x1 addr=0x104c0b000 pc=0x104c0b008]

goroutine 1850 gp=0x14000846e00 m=10 mp=0x14000282008 [running]:
runtime.throw({0x104e040f9?, 0x0?})
	/home/runner/go/pkg/mod/golang.org/[email protected]/src/runtime/panic.go:1101 +0x38 fp=0x140007f2ff0 sp=0x140007f2fc0 pc=0x104b49d68
runtime.sigpanic()
	/home/runner/go/pkg/mod/golang.org/[email protected]/src/runtime/signal_unix.go:922 +0x170 fp=0x140007f3050 sp=0x140007f2ff0 pc=0x104b4bdb0
github.com/shirou/gopsutil/v4/internal/common.NewLibrary({0x10000000000, 0x1000000d2})
	/home/runner/go/pkg/mod/github.com/shirou/gopsutil/[email protected]/internal/common/common_darwin.go:91 +0x48 fp=0x140007f30a0 sp=0x140007f3060 pc=0x104c0b008
github.com/shirou/gopsutil/v4/sensors.TemperaturesWithContext({0x104b529c0?, 0x140007f3588?})
	/home/runner/go/pkg/mod/github.com/shirou/gopsutil/[email protected]/sensors/sensors_darwin_arm64.go:54 +0x6c4 fp=0x140007f3540 sp=0x140007f30a0 pc=0x104d4f764
created by github.com/gliderlabs/ssh.(*session).handleRequests in goroutine 1849
	/home/runner/go/pkg/mod/github.com/gliderlabs/[email protected]/session.go:260 +0x504

And there are a thousand more goroutine stacks listed, many of them mentioning ssh.

Using homebrew as the install method.

Expected Behavior

No crash

Steps to Reproduce

  1. Invoke agent like so: LOG_LEVEL=debug KEY="ssh-ed25519 ..." /opt/homebrew/Cellar/beszel-agent/0.11.1/bin/beszel-agen
  2. Wait for crash, in this case about 3h 45m later.

OS / Architecture

Darwin/macOS 15.4.1

Beszel version

0.11.1

Installation method

Binary

Configuration


Hub Logs


Agent Logs


mmisiewicz avatar May 04 '25 17:05 mmisiewicz

Thanks for the report, I think this might be related to the sensors package. There have been a few different issues with that on darwin.

Please try adding SENSORS="" to ~/.config/beszel/beszel-agent.env and restart the agent with brew services restart beszel-agent.

Let me know if you get the same error, otherwise please let it run for a day or so. I just want to narrow down possible causes before looking further into it.
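
For readers wondering why an empty value matters here: the workaround relies on the agent telling "variable not set" apart from "variable set to an empty string". The following is a minimal sketch of that distinction in Go, not the agent's actual code. The function name sensorsDisabled and the rule "set but empty means skip collection" are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// sensorsDisabled reports whether sensor collection should be skipped.
// Assumption (hypothetical rule, not taken from the beszel source): a variable
// that is set but empty (SENSORS="") disables collection, while an unset
// variable leaves the default behavior in place.
func sensorsDisabled(lookup func(string) (string, bool)) bool {
	val, isSet := lookup("SENSORS")
	return isSet && val == ""
}

func main() {
	// os.LookupEnv distinguishes an unset variable from an empty one,
	// which os.Getenv alone cannot do.
	fmt.Println("sensors disabled:", sensorsDisabled(os.LookupEnv))
}
```

Passing the lookup function in as a parameter keeps the rule easy to exercise without touching the real environment.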

henrygd avatar May 04 '25 22:05 henrygd

Check that - spent a bit more time looking into it. I think it may be related to this purego issue: https://github.com/ebitengine/purego/issues/309

Not 100% sure on that, but a fix has been merged on their end for 0.9.0, so we'll upgrade when that's released.

If you want to try yourself you will need to compile the agent.

Before running make build-agent, run go get github.com/ebitengine/[email protected] to upgrade to their latest alpha release.

You can change the binary location for the brew service by editing the last line in /usr/local/bin/beszel-agent-launcher to point to your compiled binary, then restart the service with brew services restart beszel-agent.

I'm not asking you to do this, to be clear. Just providing the option if you want to try it.

Edit: This is still likely related to the sensors package, so SENSORS="" may keep it running for the time being.

henrygd avatar May 04 '25 22:05 henrygd

Upgrading the purego package (by running go get github.com/ebitengine/[email protected] before running make build-agent and then invoking the compiled binary directly with KEY and DEBUG) did not fix the problem. Same crash. The last goroutine in the stack supports the sensors theory:

goroutine 4096 gp=0x140010f4380 m=nil [runnable]:
runtime.gcTrigger.test({0x0?, 0x0?, 0x0?})
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/mgc.go:602 +0x140 fp=0x14000ff6e80 sp=0x14000ff6e80 pc=0x100a3bb30
runtime.mallocgcSmallNoscan(0x0?, 0x0?, 0x0?)
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/malloc.go:1333 +0x2a4 fp=0x14000ff6ee0 sp=0x14000ff6e80 pc=0x100a34164
runtime.mallocgc(0x400, 0x0, 0x0)
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/malloc.go:1055 +0xa8 fp=0x14000ff6f10 sp=0x14000ff6ee0 pc=0x100a8c0d8
runtime.growslice(0x140010ee000, 0x14000b01618?, 0x71?, 0x100d54153?, 0x0?)
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/slice.go:264 +0x4c8 fp=0x14000ff6f70 sp=0x14000ff6f10 pc=0x100a90ca8
github.com/shirou/gopsutil/v4/sensors.(*temperatureArm).getThermalValues(0x14000ff7198, 0x1286041c0)
	/Users/mike/go/pkg/mod/github.com/shirou/gopsutil/[email protected]/sensors/sensors_darwin_arm64.go:157 +0x278 fp=0x14000ff7070 sp=0x14000ff6f70 pc=0x100c982b8
github.com/shirou/gopsutil/v4/sensors.TemperaturesWithContext({0x18?, 0x78b9abc8459fc84c?})
	/Users/mike/go/pkg/mod/github.com/shirou/gopsutil/[email protected]/sensors/sensors_darwin_arm64.go:51 +0x3a4 fp=0x14000ff7510 sp=0x14000ff7070 pc=0x100c974f4
beszel/internal/agent.(*Agent).updateTemperatures(0x140001243c0, 0x14000ff76d8)
	/Users/mike/src/golang/beszel/beszel/internal/agent/sensors.go:82 +0x48 fp=0x14000ff75f0 sp=0x14000ff7510 pc=0x100cfdbb8
beszel/internal/agent.(*Agent).getSystemStats(0x140001243c0)
	/Users/mike/src/golang/beszel/beszel/internal/agent/system.go:206 +0xb08 fp=0x14000ff7a10 sp=0x14000ff75f0 pc=0x100cffb98
beszel/internal/agent.(*Agent).gatherStats(0x140001243c0, {0x14000d01e40, 0x40})
	/Users/mike/src/golang/beszel/beszel/internal/agent/agent.go:93 +0x164 fp=0x14000ff7e50 sp=0x14000ff7a10 pc=0x100cf6d44
beszel/internal/agent.(*Agent).handleSession(0x140001243c0, {0x100ec04d8, 0x1400017e000})
	/Users/mike/src/golang/beszel/beszel/internal/agent/server.go:55 +0xd4 fp=0x14000ff7f70 sp=0x14000ff7e50 pc=0x100cfe624
beszel/internal/agent.(*Agent).handleSession-fm({0x100ec04d8?, 0x1400017e000?})
	<autogenerated>:1 +0x3c fp=0x14000ff7fa0 sp=0x14000ff7f70 pc=0x100d00a6c
github.com/gliderlabs/ssh.(*session).handleRequests.func1()
	/Users/mike/go/pkg/mod/github.com/gliderlabs/[email protected]/session.go:261 +0x34 fp=0x14000ff7fd0 sp=0x14000ff7fa0 pc=0x100cc4e64
runtime.goexit({})
	/opt/homebrew/Cellar/go/1.24.2/libexec/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000ff7fd0 sp=0x14000ff7fd0 pc=0x100a962d4
created by github.com/gliderlabs/ssh.(*session).handleRequests in goroutine 4095

mmisiewicz avatar May 06 '25 13:05 mmisiewicz

SENSORS="" seems to have caused the crashing to stop. Though of course there's no sensor data now.

mmisiewicz avatar May 06 '25 22:05 mmisiewicz

@henrygd just to give you an additional data point: I'd been having these same crashes with my macOS agent instance as well, usually 2-3 per day. I set SENSORS="" late yesterday and haven't had a crash so far, which is probably the longest uptime I've had with the agent on macOS. :)

Awesome project BTW!!

dlhall111 avatar May 07 '25 00:05 dlhall111

Thanks, the additional stack trace is helpful. I'll fork gopsutil and see if I can figure out a fix.

I run macOS in a VM with no access to sensors, so I will need others to test.

henrygd avatar May 07 '25 00:05 henrygd

I can send along a full stack trace if that helps, though there are ~4000 stacks in the full trace.
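
With thousands of stacks, a per-state tally is often easier to share than the raw dump. Below is a rough sketch of a summarizer, assuming the dump uses the goroutine header format seen in the traces in this thread (for example "goroutine 18 gp=0x14000102380 m=nil [force gc (idle)]:"). The name countStates is hypothetical, and the program reads the dump from standard input.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"regexp"
	"sort"
	"strings"
)

// headerRe matches goroutine header lines and captures the bracketed state,
// with or without the extra gp=/m= fields that some dumps include.
var headerRe = regexp.MustCompile(`^goroutine \d+ .*\[([^\]]+)\]:$`)

// countStates tallies goroutines by state in a Go crash dump.
func countStates(dump string) map[string]int {
	counts := make(map[string]int)
	for _, line := range strings.Split(dump, "\n") {
		m := headerRe.FindStringSubmatch(strings.TrimRight(line, "\r"))
		if m == nil {
			continue
		}
		// Drop a trailing wait duration such as ", 5 minutes".
		state, _, _ := strings.Cut(m[1], ",")
		counts[state]++
	}
	return counts
}

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		os.Exit(1)
	}
	counts := countStates(string(data))
	states := make([]string, 0, len(counts))
	for s := range counts {
		states = append(states, s)
	}
	sort.Strings(states)
	for _, s := range states {
		fmt.Printf("%6d  %s\n", counts[s], s)
	}
}
```

A large count in a single state would suggest where goroutines are piling up, though the tally says nothing about why.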

mmisiewicz avatar May 07 '25 01:05 mmisiewicz

Here's a binary to test.

If you want to build yourself, swap github.com/shirou/gopsutil/v4 for github.com/henrygd/gopsutil/v4. Make sure the latter is on v4.25.6.

https://static.beszel.dev/issues/796/beszel-agent_darwin_arm64

sha256sum beszel-agent_darwin_arm64
46d84d1809c7338ecfdd462f944a16d2441eb4075635f93e5898d40c4974cb92  beszel-agent_darwin_arm64
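
On macOS, sha256sum may not be installed by default, so here is a small Go sketch that performs the same comparison. It is an illustration only: the helper name digestMatches and the command-line usage are assumptions, and the expected digest is whatever was published alongside the binary.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"strings"
)

// digestMatches reports whether the SHA-256 of data equals expectedHex.
// The comparison ignores letter case and surrounding whitespace.
func digestMatches(data []byte, expectedHex string) bool {
	sum := sha256.Sum256(data)
	return strings.EqualFold(hex.EncodeToString(sum[:]), strings.TrimSpace(expectedHex))
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: verify <file> <expected-sha256>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		os.Exit(1)
	}
	if !digestMatches(data, os.Args[2]) {
		fmt.Fprintln(os.Stderr, "checksum mismatch")
		os.Exit(1)
	}
	fmt.Println("checksum OK")
}
```

Reading the whole file into memory is fine for a binary of this size; a streaming hash would be the better choice for very large files.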

henrygd avatar May 09 '25 01:05 henrygd

This binary crashed for me:

$  LOG_LEVEL=debug KEY="..../" ./beszel-agent_darwin_arm64
2025/05/09 10:24:26 DEBUG 0.11.1
2025/05/09 10:24:26 DEBUG Not monitoring ZFS ARC err="open /proc/spl/kstat/zfs/arcstats: no such file or directory"
...
2025/05/09 10:24:26 INFO Root disk mountpoint=/ io=disk6
2025/05/09 10:24:26 INFO Detected network interface name=en0 sent=6138258972361 recv=2317691518083
2025/05/09 10:24:26 INFO Detected network interface name=awdl0 sent=39620700 recv=78348221
2025/05/09 10:24:26 INFO Detected network interface name=utun4 sent=15198 recv=18062
2025/05/09 10:24:26 INFO Detected network interface name=utun6 sent=210667 recv=157992
2025/05/09 10:24:26 INFO Detected network interface name=ipsec0 sent=569812 recv=7269771
2025/05/09 10:24:26 INFO Detected network interface name=utun8 sent=10738 recv=7232
2025/05/09 10:24:26 DEBUG GPU err="no GPU found - install nvidia-smi, rocm-smi, or tegrastats"
SIGTRAP: trace trap
PC=0x18a67c5e0 m=3 sigcode=0
signal arrived during cgo execution

goroutine 1 gp=0x140000021c0 m=3 mp=0x14000083008 [syscall]:
runtime.cgocall(0x104b57630, 0x140002ad790)
	/usr/lib/go/src/runtime/cgocall.go:167 +0x44 fp=0x14000036470 sp=0x14000036430 pc=0x104a9f1a4
github.com/ebitengine/purego.RegisterFunc.func4({0x14000292b88?, 0x1?, 0x1?})
	/home/hank/go/pkg/mod/github.com/ebitengine/[email protected]/func.go:320 +0x904 fp=0x140000368f0 sp=0x14000036470 pc=0x104b541b4
reflect.callReflect(0x14000250cc0, 0x14000036e28, 0x14000036c88, 0x14000036c90)
	/usr/lib/go/src/reflect/value.go:770 +0x3e8 fp=0x14000036c30 sp=0x140000368f0 pc=0x104aeae78
reflect.callReflect(0x14000250cc0, 0x14000036e28, 0x14000036c88, 0x14000036c90)
	<autogenerated>:1 +0x28 fp=0x14000036c60 sp=0x14000036c30 pc=0x104af4808
reflect.makeFuncStub()
	/usr/lib/go/src/reflect/asm_arm64.s:48 +0x58 fp=0x14000036e20 sp=0x14000036c60 pc=0x104af3ec8
github.com/henrygd/gopsutil/v4/sensors.(*temperatureArm).getThermalValues.deferwrap2()
	/home/hank/go/pkg/mod/github.com/henrygd/gopsutil/[email protected]/sensors/sensors_darwin_arm64.go:157 +0x30 fp=0x14000036e40 sp=0x14000036e20 pc=0x104ca8380
runtime.deferreturn()
	/usr/lib/go/src/runtime/panic.go:610 +0x60 fp=0x14000036ed0 sp=0x14000036e40 pc=0x104a6ba10
github.com/henrygd/gopsutil/v4/sensors.(*temperatureArm).getThermalValues(0x14000037128, 0x117e04580)
	/home/hank/go/pkg/mod/github.com/henrygd/gopsutil/[email protected]/sensors/sensors_darwin_arm64.go:169 +0x38c fp=0x14000037000 sp=0x14000036ed0 pc=0x104ca831c
github.com/henrygd/gopsutil/v4/sensors.TemperaturesWithContext({0x18?, 0x6357d0b7e405b6fc?})
	/home/hank/go/pkg/mod/github.com/henrygd/gopsutil/[email protected]/sensors/sensors_darwin_arm64.go:51 +0x3a4 fp=0x140000374a0 sp=0x14000037000 pc=0x104ca7444
beszel/internal/agent.(*Agent).updateTemperatures(0x1400018a3c0, 0x14000037668)
	/home/hank/projects/beszel-changes/beszel/internal/agent/sensors.go:82 +0x48 fp=0x14000037580 sp=0x140000374a0 pc=0x104d0dbc8
beszel/internal/agent.(*Agent).getSystemStats(0x1400018a3c0)
	/home/hank/projects/beszel-changes/beszel/internal/agent/system.go:206 +0xb08 fp=0x140000379a0 sp=0x14000037580 pc=0x104d0fc98
beszel/internal/agent.(*Agent).gatherStats(0x1400018a3c0, {0x0, 0x0})
	/home/hank/projects/beszel-changes/beszel/internal/agent/agent.go:93 +0x164 fp=0x14000037de0 sp=0x140000379a0 pc=0x104d06d54
beszel/internal/agent.NewAgent()
	/home/hank/projects/beszel-changes/beszel/internal/agent/agent.go:67 +0x398 fp=0x14000037e90 sp=0x14000037de0 pc=0x104d06ae8
main.main()
	/home/hank/projects/beszel-changes/beszel/cmd/agent/agent.go:118 +0xc8 fp=0x14000037f40 sp=0x14000037e90 pc=0x104d11618
runtime.main()
	/usr/lib/go/src/runtime/proc.go:283 +0x284 fp=0x14000037fd0 sp=0x14000037f40 pc=0x104a6fa04
runtime.goexit({})
	/usr/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000037fd0 sp=0x14000037fd0 pc=0x104aa9b44

goroutine 18 gp=0x14000102380 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000078790 sp=0x14000078770 pc=0x104aa1e88
runtime.goparkunlock(...)
	/usr/lib/go/src/runtime/proc.go:441
runtime.forcegchelper()
	/usr/lib/go/src/runtime/proc.go:348 +0xb8 fp=0x140000787d0 sp=0x14000078790 pc=0x104a6fd58
runtime.goexit({})
	/usr/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000787d0 sp=0x140000787d0 pc=0x104aa9b44
created by runtime.init.7 in goroutine 1
	/usr/lib/go/src/runtime/proc.go:336 +0x24

goroutine 19 gp=0x14000102540 m=nil [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000078f60 sp=0x14000078f40 pc=0x104aa1e88
runtime.goparkunlock(...)
	/usr/lib/go/src/runtime/proc.go:441
runtime.bgsweep(0x14000112000)
	/usr/lib/go/src/runtime/mgcsweep.go:276 +0xa0 fp=0x14000078fb0 sp=0x14000078f60 pc=0x104a5b140
runtime.gcenable.gowrap1()
	/usr/lib/go/src/runtime/mgc.go:204 +0x28 fp=0x14000078fd0 sp=0x14000078fb0 pc=0x104a4efb8
runtime.goexit({})
	/usr/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000078fd0 sp=0x14000078fd0 pc=0x104aa9b44
created by runtime.gcenable in goroutine 1
	/usr/lib/go/src/runtime/mgc.go:204 +0x6c

goroutine 20 gp=0x14000102700 m=nil [GC scavenge wait]:
runtime.gopark(0x14000112000?, 0x104def230?, 0x1?, 0x0?, 0x14000102700?)
	/usr/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000079760 sp=0x14000079740 pc=0x104aa1e88
runtime.goparkunlock(...)
	/usr/lib/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x10514efa0)
	/usr/lib/go/src/runtime/mgcscavenge.go:425 +0x5c fp=0x14000079790 sp=0x14000079760 pc=0x104a58c4c
runtime.bgscavenge(0x14000112000)
	/usr/lib/go/src/runtime/mgcscavenge.go:653 +0x44 fp=0x140000797b0 sp=0x14000079790 pc=0x104a59184
runtime.gcenable.gowrap2()
	/usr/lib/go/src/runtime/mgc.go:205 +0x28 fp=0x140000797d0 sp=0x140000797b0 pc=0x104a4ef58
runtime.goexit({})
	/usr/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000797d0 sp=0x140000797d0 pc=0x104aa9b44
created by runtime.gcenable in goroutine 1
	/usr/lib/go/src/runtime/mgc.go:205 +0xac

goroutine 2 gp=0x14000003a40 m=nil [finalizer wait]:
runtime.gopark(0x1400007c5b8?, 0x104aa2ac4?, 0x1?, 0xc5?, 0x104ad3ba4?)
	/usr/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007c590 sp=0x1400007c570 pc=0x104aa1e88
runtime.runfinq()
	/usr/lib/go/src/runtime/mfinal.go:196 +0x108 fp=0x1400007c7d0 sp=0x1400007c590 pc=0x104a4dfb8
runtime.goexit({})
	/usr/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007c7d0 sp=0x1400007c7d0 pc=0x104aa9b44
created by runtime.createfing in goroutine 1
	/usr/lib/go/src/runtime/mfinal.go:166 +0x80

goroutine 3 gp=0x14000003c00 m=nil [chan receive]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007cef0 sp=0x1400007ced0 pc=0x104aa1e88
runtime.chanrecv(0x140000ba0e0, 0x0, 0x1)
	/usr/lib/go/src/runtime/chan.go:664 +0x42c fp=0x1400007cf70 sp=0x1400007cef0 pc=0x104a40e7c
runtime.chanrecv1(0x0?, 0x0?)
	/usr/lib/go/src/runtime/chan.go:506 +0x14 fp=0x1400007cfa0 sp=0x1400007cf70 pc=0x104a40a14
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	/usr/lib/go/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/usr/lib/go/src/runtime/mgc.go:1799 +0x3c fp=0x1400007cfd0 sp=0x1400007cfa0 pc=0x104a521dc
runtime.goexit({})
	/usr/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007cfd0 sp=0x1400007cfd0 pc=0x104aa9b44
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/usr/lib/go/src/runtime/mgc.go:1794 +0x78

r0      0x0
r1      0x60000255c000
r2      0x0
r3      0x0
r4      0x0
r5      0x0
r6      0x0
r7      0x0
r8      0x0
r9      0xbd424b73c000
r10     0x7ffffffffffff8
r11     0x27
r12     0x14000292ba0
r13     0x16c3d6ef0
r14     0x140001ffbe0
r15     0x104e2dac0
r16     0xbd424b73c000
r17     0x3d424b73c000
r18     0x0
r19     0x60000255c000
r20     0x14000036780
r21     0x140002ad850
r22     0x0
r23     0x0
r24     0x0
r25     0x0
r26     0x0
r27     0x820
r28     0x14000002e00
r29     0x16c3d6e90
lr      0x18a536b44
sp      0x16c3d6e80
pc      0x18a67c5e0
fault   0x18a67c5e0

mmisiewicz avatar May 09 '25 14:05 mmisiewicz

Linking #531 for reference, adding SENSORS="" to ~/.config/beszel/beszel-agent.env also stopped my crashes. So, definitely something there.

luckman212 avatar May 09 '25 17:05 luckman212

Thanks, here's a new binary for testing:

https://beszel.b-cdn.net/issues/796/v2/beszel-agent_darwin_arm64

sha256sum beszel-agent_darwin_arm64
46d84d1809c7338ecfdd462f944a16d2441eb4075635f93e5898d40c4974cb92  beszel-agent_darwin_arm64

If you want to build yourself, please see if specifying CGO_ENABLED=0 or CGO_ENABLED=1 makes any difference:

go get github.com/henrygd/gopsutil/[email protected]
go mod tidy
make build-agent OS=darwin ARCH=arm64 CGO_ENABLED=0
make build-agent OS=darwin ARCH=arm64 CGO_ENABLED=1

henrygd avatar May 09 '25 21:05 henrygd

I've got this new agent installed on a test system, and removed the SENSORS="" for now. Will report back.

luckman212 avatar May 10 '25 12:05 luckman212

Started producing bad JSON after ~2 hrs.

Log and copy of the invalid JSON attached. I reverted to the previous build and put SENSORS="" back for now.

luckman212 avatar May 10 '25 15:05 luckman212

@henrygd Does it help if I have a spare old Mac laptop I can ship to you so you have hardware to test with?

luckman212 avatar May 14 '25 16:05 luckman212

@luckman212 That's very generous and would help a ton. But only if you have absolutely no use for it.

Shoot me an email at [email protected] for my address. I'm in the US and can cover any costs for shipping / time spent.

Thanks!

henrygd avatar May 14 '25 18:05 henrygd

I added panic recovery in the next release. You should (hopefully) be able to remove SENSORS="" and still get sensor data most of the time.
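
For context, panic recovery in Go generally looks like the sketch below. This is not the code from the release; collectSafely and the flaky collector are hypothetical stand-ins. One caveat worth stating: recover only intercepts Go panics. Crashes reported as "fatal error: fault" or as a signal arriving during cgo execution, like the ones earlier in this thread, are not panics by default. The standard library's runtime/debug.SetPanicOnFault can turn some unexpected-address faults into panics on the calling goroutine, but whether that would help with this particular crash is untested here.

```go
package main

import "fmt"

// collectSafely runs collect and converts a panic into an error, so that one
// failing collector does not take down the whole process.
func collectSafely(collect func() (map[string]float64, error)) (temps map[string]float64, err error) {
	defer func() {
		if r := recover(); r != nil {
			temps = nil
			err = fmt.Errorf("sensor collection panicked: %v", r)
		}
	}()
	return collect()
}

func main() {
	// flaky is a hypothetical stand-in for a sensors call that panics.
	flaky := func() (map[string]float64, error) {
		var m map[string]float64
		m["PMU tdie1"] = 34.69 // writing to a nil map panics
		return m, nil
	}
	_, err := collectSafely(flaky)
	fmt.Println("recovered:", err)
}
```

The named return values are what let the deferred function replace the result after a panic.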

henrygd avatar Jun 23 '25 23:06 henrygd

Ok, I will surely try and report back. Thank you for the update!

luckman212 avatar Jun 24 '25 00:06 luckman212

We are also running into this issue on macOS. Waiting for the new release including panic recovery.

Thanks for all of your work 🥇

bartvdbraak avatar Jun 29 '25 15:06 bartvdbraak

> I added panic recovery in the next release. You should (hopefully) be able to remove SENSORS="" and still get sensor data most of the time.

@henrygd I've been running a build that includes 0b04f60b6c7edde99b2410e0b083894982a8f2c3 (build was based on 46316ebffa75e41bf649376672f19ea8349a991a) for about five days and removed SENSORS="" again.
The same problem has now recurred, and the system is permanently shown as offline even though I can reach the agent over SSH. Is the hostname used for mapping? It should be HomeboxM4, but it is corrupted.

2025/06/29 21:37:21 DEBUG Temperature sensors="[{\x00sensorKey\":\"PMU2 tdie10\",\"temperature\":33.171142578125,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie3\",\"temperature\":35.17010498046875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie7\",\"temperature\":35.809783935546875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev7\",\"temperature\":36.02253723144531,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie7\",\"temperature\":32.85130310058594,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev2\",\"temperature\":33.77372741699219,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie8\",\"temperature\":33.57093811035156,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tcal\",\"temperature\":51.82000732421875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie10\",\"temperature\":35.64985656738281,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie5\",\"temperature\":32.85130310058594,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie2\",\"temperature\":33.65089416503906,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdev4\",\"temperature\":33.14599609375,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie4\",\"temperature\":35.48994445800781,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev4\",\"temperature\":34.0401611328125,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdev1\",\"temperature\":34.029205322265625,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie9\",\"temperature\":32.69139099121094,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie12\",\"temperature\":35.969696044921875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie8\",\"temperature\":36.28953552246094,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 
tdie3\",\"temperature\":32.93125915527344,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdev5\",\"temperature\":34.116790771484375,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev3\",\"temperature\":36.058563232421875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie5\",\"temperature\":35.48994445800781,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie1\",\"temperature\":34.69035339355469,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev8\",\"temperature\":33.33576965332031,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie9\",\"temperature\":35.48994445800781,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev6\",\"temperature\":34.50730895996094,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie1\",\"temperature\":33.65089416503906,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev5\",\"temperature\":33.6824951171875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"NAND CH0 temp\",\"temperature\":32,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie4\",\"temperature\":32.371551513671875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tcal\",\"temperature\":51.82000732421875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie13\",\"temperature\":35.40998840332031,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie14\",\"temperature\":36.44944763183594,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdev1\",\"temperature\":34.93431091308594,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdie6\",\"temperature\":32.451507568359375,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdev2\",\"temperature\":31.361328125,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie2\",\"temperature\":34.69035339355469,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU 
tdie6\",\"temperature\":35.809783935546875,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU tdie11\",\"temperature\":36.049652099609375,\"sensorHigh\":0,\"sensorCritical\":0} {\x00sensorKey\":\"PMU2 tdev3\",\"temperature\":34.56205749511719,\"sensorHigh\":0,\"sensorCritical\":0}]"
2025/06/29 21:37:21 DEBUG sysinfo data="{Hostname:\x00omeboxM4 KernelVersion:15.5 Cores:10 Threads:10 CpuModel:Apple M4 Uptime:768103 Cpu:1.64 MemPct:71.36 DiskPct:81.18 Bandwidth:0 AgentVersion:0.11.1 Podman:false GpuPct:0 DashboardTemp:51.82000732421875 Os:1}"
2025/06/29 21:37:21 DEBUG System stats data="&{Stats:{Cpu:1.64 MaxCpu:0 Mem:16 MemUsed:11.42 MemPct:71.36 MemBuffCache:4.21 MemZfsArc:0 Swap:0 SwapUsed:0 DiskTotal:228.27 DiskUsed:185.32 DiskPct:81.18 DiskReadPs:0.01 DiskWritePs:0.05 MaxDiskReadPs:0 MaxDiskWritePs:0 NetworkSent:0 NetworkRecv:0 MaxNetworkSent:0 MaxNetworkRecv:0 Temperatures:map[NAND CH0 temp:32 PMU tcal:51.82 PMU tdev1:34.93 PMU tdev2:33.77 PMU tdev3:36.06 PMU tdev4:34.04 PMU tdev5:33.68 PMU tdev6:34.51 PMU tdev7:36.02 PMU tdev8:33.34 PMU tdie1:34.69 PMU tdie10:35.65 PMU tdie11:36.05 PMU tdie12:35.97 PMU tdie13:35.41 PMU tdie14:36.45 PMU tdie2:34.69 PMU tdie3:35.17 PMU tdie4:35.49 PMU tdie5:35.49 PMU tdie6:35.81 PMU tdie7:35.81 PMU tdie8:36.29 PMU tdie9:35.49 PMU2 tcal:51.82 PMU2 tdev1:34.03 PMU2 tdev2:31.36 PMU2 tdev3:34.56 PMU2 tdev4:33.15 PMU2 tdev5:34.12 PMU2 tdie1:33.65 PMU2 tdie10:33.17 PMU2 tdie2:33.65 PMU2 tdie3:32.93 PMU2 tdie4:32.37 PMU2 tdie5:32.85 PMU2 tdie6:32.45 PMU2 tdie7:32.85 PMU2 tdie8:33.57 PMU2 tdie9:32.69] ExtraFs:map[] GPUData:map[]} Info:{Hostname:\x00omeboxM4 KernelVersion:15.5 Cores:10 Threads:10 CpuModel:Apple M4 Uptime:768103 Cpu:1.64 MemPct:71.36 DiskPct:81.18 Bandwidth:0 AgentVersion:0.11.1 Podman:false GpuPct:0 DashboardTemp:51.82000732421875 Os:1} Containers:[]}"
2025/06/29 21:37:21 DEBUG Docker stats data="[0x1400050e070 0x1400015ea10 0x14000399260 0x14000286f50 0x1400015e5b0 0x14000398e00 0x140000d37a0 0x140000d3810 0x1400015eb60 0x140004284d0 0x140000d3260 0x14000398d20 0x14000399340 0x14000286e70 0x1400015e4d0 0x14000399420]"

a-mnich avatar Jul 01 '25 20:07 a-mnich

@a-mnich Thanks for testing. Is the hostname displayed correctly with SENSORS="" set?

If so, it sounds like the sensors package may be returning corrupted data that breaks the JSON encoding. There's a similar issue some people have with the processor model name (#837) so I asked there if setting SENSORS="" fixes it.
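
In the debug output above, the corruption shows up as a NUL byte (\x00) where a real character should be, both in the hostname (\x00omeboxM4) and at the start of each sensor object ({\x00sensorKey). A cheap guard is to check for NUL bytes before sending data. This is a sketch only, with hypothetical helper names, and it detects the symptom rather than fixing the cause.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// hasNUL reports whether a field value such as a hostname contains a NUL byte.
func hasNUL(s string) bool {
	return strings.IndexByte(s, 0) >= 0
}

// looksCorrupted reports whether an encoded payload contains a raw NUL byte
// or is not valid JSON.
func looksCorrupted(payload []byte) bool {
	return bytes.IndexByte(payload, 0) >= 0 || !json.Valid(payload)
}

func main() {
	good := []byte(`{"sensorKey":"PMU tdie1","temperature":34.69}`)
	bad := []byte("{\x00sensorKey\":\"PMU tdie1\",\"temperature\":34.69}")
	fmt.Println(looksCorrupted(good)) // false
	fmt.Println(looksCorrupted(bad))  // true
	fmt.Println(hasNUL("\x00omeboxM4")) // true
}
```

Checking the field values as well as the encoded payload matters because a JSON encoder escapes a NUL inside a string as \u0000, which leaves the payload technically valid.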

We're switching from JSON to CBOR in the next version so we'll see if that makes any difference. Should have a beta version out in the next couple days.

henrygd avatar Jul 01 '25 21:07 henrygd

@henrygd Yes, with SENSORS="" the hostname is correct. #837 also occurred on my machine beforehand. I've worked around both issues (which I think have the same root cause) by setting SENSORS="" before trying out the recent fix. Worked fine for over a week.

with SENSORS="" set:

2025/07/02 00:06:15 DEBUG sysinfo data="{Hostname:HomeboxM4 KernelVersion:15.5 Cores:10 Threads:10 CpuModel:Apple M4 Uptime:949836 Cpu:3.93 MemPct:72.31 DiskPct:81.27 Bandwidth:0 AgentVersion:0.11.1 Podman:false GpuPct:0 DashboardTemp:0 Os:1}"

a-mnich avatar Jul 01 '25 22:07 a-mnich

I saw https://github.com/henrygd/beszel/commit/3586f73f306a7d18fc9176be70dcadbef742e9fa but wasn't quite sure - does that mean we no longer need SENSORS=""?

luckman212 avatar Jul 25 '25 15:07 luckman212

I'm not sure either since I don't have an affected system. Most likely it will not fix it, but I figured it was worth a try.

Here's the upstream issue if anyone wants to investigate further: https://github.com/shirou/gopsutil/issues/1832

henrygd avatar Jul 25 '25 16:07 henrygd

After a short while, the agent started producing the invalid JSON again. This is as of 0.12.1. So for now I've reverted to adding SENSORS="" back into my env file. I know this is an upstream issue @henrygd - just noting.

luckman212 avatar Jul 27 '25 12:07 luckman212