[VM] Dynamic RAM allocation going haywire.

Open haldi4803 opened this issue 8 months ago • 6 comments

Describe the issue you are experiencing

I'm running HAOS in a TrueNAS Scale VM. I changed the VM from 6 GB RAM to 12 GB RAM in the settings while the HAOS machine was down. Now I'm stuck with 95% RAM usage and 90% swap usage and have absolutely NO idea why. Glances doesn't show anything abnormal, and htop doesn't show unusual memory usage either. Obviously I tried rebooting the VM, but that didn't change anything.

  ~ free -h
               total        used        free      shared  buff/cache   available
Mem:            11Gi        11Gi       109Mi       4.0Mi       540Mi       526Mi
Swap:          3.9Gi       3.5Gi       363Mi

Image

Image

Image

Using top -b -o +%MEM | head -n 22 in the terminal on the Host does not show anything using too much RAM. Using free -h says everything is full.

Image

What other ways do I have to debug/log RAM usage?
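One way to narrow this down is to compare what the kernel reports as used with what the processes themselves account for. The following is a minimal sketch, assuming a Linux guest where `/proc` is readable; the function names are hypothetical and not part of any Home Assistant tooling. Summed `VmRSS` counts shared pages more than once and ignores kernel memory, so the result is only a rough indicator: a gap of several GiB suggests that the memory is held by something that is not a process.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value}; sizes are in KiB."""
    fields: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def sum_process_rss_kib(proc_root: str = "/proc") -> int:
    """Add up VmRSS over all processes (overcounts shared pages)."""
    total = 0
    for status in Path(proc_root).glob("[0-9]*/status"):
        try:
            for line in status.read_text().splitlines():
                if line.startswith("VmRSS:"):
                    total += int(line.split()[1])
                    break
        except (OSError, ValueError, IndexError):
            # Process exited while we were reading, or it has no VmRSS line.
            continue
    return total


def unattributed_kib(meminfo: dict[str, int], rss_kib: int) -> int:
    """Memory the kernel counts as used that process RSS does not explain."""
    used = meminfo["MemTotal"] - meminfo["MemAvailable"]
    return used - rss_kib


def report(proc_root: str = "/proc") -> str:
    """Build a one-line summary from the live /proc tree."""
    info = parse_meminfo((Path(proc_root) / "meminfo").read_text())
    gap = unattributed_kib(info, sum_process_rss_kib(proc_root))
    return f"unattributed: {gap / 1024 / 1024:.1f} GiB"
```

Calling `print(report())` in a terminal on the guest prints the gap in GiB. Running it once with the problematic RAM setting and once with a working one makes the difference easy to compare.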

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

15.0

Did the problem occur after upgrading the Operating System?

No

Hardware details

TrueNAS Scale: ElectricEel-24.10.2, 64 GB RAM, 12 GB given to the VM. 5 cores / 10 threads given to the VM.

Steps to reproduce the issue

No Clue... seriously.

Anything in the Supervisor logs that might be useful for us?

nope, nothing RAM related.
OR should there be?

Anything in the Host logs that might be useful for us?

nope, nothing RAM related.
OR should there be?

System information

System Information

version core-2025.3.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.13.2
os_name Linux
os_version 6.12.18-haos
arch x86_64
timezone Europe/Zurich
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
HACS Data ok
GitHub API Calls Remaining 5000
Installed Version 2.0.5
Stage running
Available Repositories 1604
Downloaded Repositories 14
Home Assistant Cloud
logged_in false
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 15.0
update_channel stable
supervisor_version supervisor-2025.03.3
agent_version 1.7.2
docker_version 28.0.1
disk_total 33.7 GB
disk_used 24.4 GB
healthy true
supported true
host_connectivity true
supervisor_connectivity true
ntp_synchronized true
virtualization kvm
board ova
supervisor_api ok
version_api ok
installed_addons Whisper (2.4.0), Piper (1.5.2), ESPHome Device Builder (2025.2.2), openWakeWord (1.10.0), File editor (5.8.0), Advanced SSH & Web Terminal (20.0.2), Studio Code Server (5.18.3), Glances (0.21.1), ESPHome (dev) (dev), InfluxDB (5.0.2)
Dashboards
dashboards 2
resources 9
views 6
mode storage
Network Configuration
adapters lo (disabled), enp0s3 (enabled, default, auto), docker0 (disabled), hassio (disabled), vethd7d7598 (disabled), vetha3bda91 (disabled), veth2705565 (disabled), veth891784e (disabled), veth541da13 (disabled), vethf3731d4 (disabled), veth4623118 (disabled), vethec3950d (disabled), vethce4c7ef (disabled), veth1e53068 (disabled), veth1afdad4 (disabled), veth4ed9ef2 (disabled)
ipv4_addresses lo (127.0.0.1/8), enp0s3 (192.168.1.10/24), docker0 (172.30.232.1/23), hassio (172.30.32.1/23), vethd7d7598 (), vetha3bda91 (), veth2705565 (), veth891784e (), veth541da13 (), vethf3731d4 (), veth4623118 (), vethec3950d (), vethce4c7ef (), veth1e53068 (), veth1afdad4 (), veth4ed9ef2 ()
ipv6_addresses lo (::1/128), enp0s3 (fddf:3940:83b2::dc7/128, 2001:4060:c00b:dd30::dc7/128, 2001:4060:c00b:dd30:5696:ac91:d59:be47/64, fddf:3940:83b2:0:18d5:ab4b:45b9:2e08/64, fe80::aa8a:2217:6ead:feb2/64), docker0 (fe80::d0e9:1fff:fe7a:36e/64), hassio (fe80::875:2bff:feb2:466f/64), vethd7d7598 (fe80::cc6f:caff:fe80:f940/64), vetha3bda91 (fe80::387a:2fff:fe50:e303/64), veth2705565 (fe80::c4c3:92ff:fe32:6c19/64), veth891784e (fe80::c093:78ff:fe54:f2fd/64), veth541da13 (fe80::c70:7aff:fee8:ef76/64), vethf3731d4 (fe80::888e:99ff:fe96:c291/64), veth4623118 (fe80::4c02:4ff:fe40:cdd1/64), vethec3950d (fe80::fce1:2dff:fe5b:8c81/64), vethce4c7ef (fe80::c4fa:27ff:fe64:1317/64), veth1e53068 (fe80::8411:7aff:fe31:34cf/64), veth1afdad4 (fe80::5491:9eff:fe97:b8f9/64), veth4ed9ef2 (fe80::148b:15ff:fe44:9c17/64)
announce_addresses 192.168.1.10, fddf:3940:83b2::dc7, 2001:4060:c00b:dd30::dc7, 2001:4060:c00b:dd30:5696:ac91:d59:be47, fddf:3940:83b2:0:18d5:ab4b:45b9:2e08, fe80::aa8a:2217:6ead:feb2
Recorder
oldest_recorder_run 8 March 2025 at 08:37
current_recorder_run 19 March 2025 at 19:45
estimated_db_size 251.07 MiB
database_engine sqlite
database_version 3.48.0

Additional information

Switching back to 6 GB RAM does solve the issue...

Image

But WTF happened here?

Increasing the RAM again, this time to 20 GB, shows 90% full once more.

Image

haldi4803 avatar Mar 19 '25 19:03 haldi4803

It might be related to ballooning. total only showing 11G also hints towards that. Can you share free -h from your TrueNAS host while this is happening as well?
I'm curious why you allocate so much to HAOS.

Impact123 avatar Mar 20 '25 08:03 Impact123

My personal experience on an Intel NUC (no VM, only HAOS) with 4 GB: since OS 15, I've been experiencing multiple spontaneous reboots, database corruption, and slow operation with several integration timeouts.

It seems there are some concurrent tasks that make the system crash. I'm investigating now. At first I suspected a hardware issue, but I still have to check and restore the previous backup.

I hope it helps.

WladyTee avatar Mar 20 '25 10:03 WladyTee

> It might be related to ballooning.

That might be a good lead. I saw something similar in Proxmox when "Minimum memory" was different from the configured "Memory": once the VM started to eat more memory and the allocation changed, it went haywire. With fixed allocation, the problem went away.
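To check the ballooning hypothesis from inside the guest, one option is to see whether a virtio balloon device is bound at all. This is a minimal sketch, assuming a Linux guest with sysfs mounted at `/sys` and the standard `virtio_balloon` kernel driver; the function name is hypothetical. It only shows that a balloon device is attached, not how much memory the host has currently reclaimed through it.

```python
from pathlib import Path


def balloon_devices(sysfs_root: str = "/sys") -> list[str]:
    """Return the virtio devices bound to the virtio_balloon driver."""
    driver_dir = Path(sysfs_root) / "bus" / "virtio" / "drivers" / "virtio_balloon"
    if not driver_dir.is_dir():
        # Driver not loaded or not built in: no balloon in this guest.
        return []
    # Bound devices appear as entries named virtio0, virtio1, ...;
    # other entries such as bind/unbind are control files and are skipped.
    return sorted(e.name for e in driver_dir.iterdir() if e.name.startswith("virtio"))
```

An empty list means the guest has no balloon device, so ballooning cannot explain the usage. A non-empty list is consistent with the hypothesis, and setting a fixed memory size on the hypervisor side is then the thing to try.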

sairon avatar Mar 20 '25 12:03 sairon

> I'm curious why you allocate so much to HAOS.

I was running Whisper with the tiny-int8 model and it couldn't get a single sentence right. That's why I thought I'd use a bigger model, but then the RAM was not enough.

Image Image Image Image

Ballooning might be a good explanation! I set a minimum of 4 GB and increased only the maximum.

haldi4803 avatar Mar 20 '25 19:03 haldi4803

Yeah. Would you look at that? I've set RAM fixed to 12 GiB and it works!

Image Image

*takes off pirate hat* Let's change the title accordingly...

haldi4803 avatar Mar 20 '25 19:03 haldi4803

Same here. Since the update to OS 15, several integrations stop and lose their connections after a while. Others, like CompreFace, stop working entirely.

chheiss avatar Mar 21 '25 12:03 chheiss

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jun 22 '25 05:06 github-actions[bot]