heartbeat
heartbeat copied to clipboard
Undiagnosed rare bug will set `heartbeatStats.LongestMissingBeat` to a much higher number than is accurate
Note that this bug does not affect uptime (nor device.LongestMissingBeat). This is only an issue with where time.Now() is used, I think.
Bumping priority, I can't tell why but this issue seemingly still happens after rebooting heartbeat. It appears to always set around ~447 hours. I don't know if this number has any significance (447 - 12 (my initial hours) = 435 hours).
This seems to come back even after the system time has been set and heartbeat has been restarted.
Oh, it's ALWAYS 1610833 seconds
I don't think it's caused by malformed data as my devices look fine

I suspect this might have just been caused by erroneous data in the DB, I fixed it manually and will monitor for further changes.
Can't reproduce. This issue has not happened since April 8th.
Reopening this as some sort of weird Redis behavior. Here's the appendonly.aof from the last time this happened on production, which is fairly recently:
{"last_beat_formatted":"2 minutes, 4 seconds","total_devices_formatted":"3","total_visits_formatted":"903,733","total_uptime_formatted":"11,773 hours, 27 minutes, 8 seconds"{"device_name":"mango","timestamp":1673974698}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$46
{"device_name":"mango","timestamp":1673974698}
*4
$8
JSON.SET
$5
stats
$1
.
$303
{"last_beat_formatted":"30 seconds","total_devices_formatted":"3","total_visits_formatted":"900,588","total_uptime_formatted":"11,599 hours, 47 minutes, 8 seconds","total_beats_formatted":"376,612","total_visits":900588,"total_uptime_milli":41759228479,"total_beats":376612,"longest_missing_beat":63127}
*4
$8
JSON.SET
$7
devices
$1
.
$399
[{"device_name":"lychee","last_beat":{"device_name":"lychee","timestamp":1673928714},"total_beats":4951,"longest_missing_beat":219481},{"device_name":"mango","last_beat":{"device_name":"mango","timestamp":1673974698},"total_beats":15873,"longest_missing_beat":40847},{"device_name":"blueberry","last_beat":{"device_name":"blueberry","timestamp":1671482520},"total_beats":1,"longest_missing_beat":0}]
*4
$8
JSON.SET
$9
last_beat
$1
.
$46
{"device_name":"mango","timestamp":1673974758}
*4
$14
This time it is 692 hours (exactly 2492358 seconds in the DB). Unrelated to system timezone updates, I believe this issue might either be correlated to restarting Docker or adding a new device??? idfk I will solve this later, it happens so rarely that switching to the Rust rewrite might just be the way to go
Happened again. 1681153 seconds, or 466 hours. This HAS to be fixed ASAP because of canary / dead man switch reasons.
{"last_beat_formatted":"11 seconds","total_devices_formatted":"3","total_visits_formatted":"912,240","total_uptime_formatted":"12,354 hours, 39 minutes, 8 seconds","total_beats_formatted":"397,665","total_visits":912240,"total_uptime_milli":44476748495,"total_beats":397665,"longest_missing_beat":63127}
*4
$8
JSON.SET
$7
devices
$1
.
$407
[{"device_name":"lychee","last_beat":{"device_name":"lychee","timestamp":1676693440},"total_beats":10839,"longest_missing_beat":219481},{"device_name":"mango","last_beat":{"device_name":"mango","timestamp":1676693414},"total_beats":31029,"longest_missing_beat":47177},{"device_name":"blueberry","last_beat":{"device_name":"blueberry","timestamp":1675012321},"total_beats":10,"longest_missing_beat":2499120}]
*4
$8
JSON.SET
$9
last_beat
$1
.
$46
{"device_name":"mango","timestamp":1676693474}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$46
{"device_name":"mango","timestamp":1676693474}
*4
$8
JSON.SET
$9
last_beat
$1
.
$47
{"device_name":"lychee","timestamp":1676693500}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$47
{"device_name":"lychee","timestamp":1676693500}
*4
$8
JSON.SET
$5
stats
$1
.
$305
{"last_beat_formatted":"11 seconds","total_devices_formatted":"3","total_visits_formatted":"912,241","total_uptime_formatted":"12,354 hours, 40 minutes, 8 seconds","total_beats_formatted":"397,667","total_visits":912241,"total_uptime_milli":44476808495,"total_beats":397667,"longest_missing_beat":1681153}
For future reference, if this happens again before I fix it, this is handy.
curl -X POST -H "Auth: $HEARTBEAT_AUTH" -i https://frog.icu/api/update/stats -d "$(curl -s https://frog.icu/api/stats | jq '.longest_missing_beat = 63127')"
This is happening more frequently, and I need to fix it.
{"last_beat_formatted":"28 seconds","total_devices_formatted":"3","total_visits_formatted":"919,081","total_uptime_formatted":"12,806 hours, 35 minutes, 8 seconds","total_beats_formatted":"412,428","total_visits":919081,"total_uptime_milli":46103708499,"total_beats":412428,"longest_missing_beat":1618410}