
Macos M1 storage lima "rancher-desktop-bridged_en0_vmnet.stderr.log" is 264GB and "diffdisk" is 102GB

Open danielatjumo opened this issue 2 years ago • 17 comments

Actual Behavior

My M1 Pro Macbook filled up storage with over 360GB of rancher-desktop related data in a couple weeks.

The two really large files are:
/Users//Library/Application Support/rancher-desktop/lima/0/diffdisk
/Users//Library/Application Support/rancher-desktop/lima/_networks/rancher-desktop-bridged_en0_vmnet.stderr.log

The log file contains only the following error, repeated until the file reached 264GB:
_on_vmnet_packets_available(): vmnet_return_t VMNET_TOO_MANY_PACKETS
vmnet_read: No buffer space available

Steps to Reproduce

No clue, check if your storage is full and track down the culprit.

Result

Files are too large and fill up storage. This probably shouldn't be happening.

Expected Behavior

Smaller files

Additional Information

No response

Rancher Desktop Version

1.2.1

Rancher Desktop K8s Version

1.22.7

Which container runtime are you using?

moby (docker cli)

What operating system are you using?

macOS

Operating System / Build Version

macOS 12.3 M1

What CPU architecture are you using?

arm64 (Apple Silicon)

Linux only: what package format did you use to install Rancher Desktop?

No response

Windows User Only

No response

danielatjumo avatar Mar 31 '22 19:03 danielatjumo

The diffdisk size is limited to 100GB and should not grow any larger. It is mostly used to store images for the container runtime.

The log file size is certainly unexpected; is there anything in particular that may have caused a lot of traffic to your containers?

I think the log file should get deleted (and recreated) when you restart Rancher Desktop. Otherwise you can also delete it manually. But please don't delete the diffdisk, it will break your VM.

jandubois avatar Mar 31 '22 19:03 jandubois

The log file size is certainly unexpected; is there anything in particular that may have caused a lot of traffic to your containers?

Not really? I have mostly been building images using buildx that download a lot of dependencies. But nothing crazy.

edit: I have also been doing a lot of terraform in the containers.

danielatjumo avatar Apr 01 '22 06:04 danielatjumo

I think I got the same error today. Rancher is eating up all available disk space: more than 750 GB. I don't remember doing anything at all with Rancher today; it was just idling in the background.

SjoenH avatar Apr 25 '22 11:04 SjoenH

Same error; most lines are vmnet_read: No buffer space available and _on_vmnet_packets_available(): vmnet_return_t VMNET_TOO_MANY_PACKETS

$ pwd
/Users/user1/Library/Application Support/rancher-desktop/lima/_networks
$ tail -n100000 rancher-desktop-bridged_en0_vmnet.stderr.log | sort | uniq -c | sort -g
  50000 _on_vmnet_packets_available(): vmnet_return_t VMNET_TOO_MANY_PACKETS
  50000 vmnet_read: No buffer space available
$ ll rancher-desktop-bridged_en0_vmnet.stderr.log
-rw-r--r-- 1 user1 128099798874  5 30 14:09 rancher-desktop-bridged_en0_vmnet.stderr.log
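
For anyone who wants the same tally without the shell pipeline, here is a minimal Python sketch of the idea. The function name `tally_tail` is made up for illustration, and unlike `tail` it reads the file from the start, so it will be slow on a log of this size.

```python
from collections import Counter, deque


def tally_tail(path, max_lines=100_000):
    """Count how often each distinct line occurs in the last max_lines lines.

    Hypothetical helper, roughly equivalent to
    `tail -n N file | sort | uniq -c`.
    """
    with open(path, "r", errors="replace") as f:
        # A deque with maxlen keeps only the most recent lines while
        # streaming, so the whole log is never held in memory. It still
        # reads the file sequentially from the beginning, unlike tail.
        tail = deque(f, maxlen=max_lines)
    return Counter(line.rstrip("\n") for line in tail)


# Example use (path is whatever log you want to inspect):
#   for line, count in tally_tail(log_path).most_common():
#       print(count, line)
```

`Counter.most_common()` lists the most frequent lines first, which is the reverse of the `sort -g` order in the transcript above.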

ddb4github avatar May 30 '22 06:05 ddb4github

Just ran into the same issue; 490GB disk space used by the vmnet err log file. I had Rancher Desktop running in the background (both moby and kubernetes activated), but I had no containers running yet afaik, so my guess is it doesn't have anything to do with network traffic related to user deployed workloads / containers.

RobinVanCauter avatar Jun 24 '22 07:06 RobinVanCauter

I have also noticed that diffdisk continues to grow, and doing regular commands to prune dockerd (moby) usage has no effect on reducing diffdisk size.

aminmkhan avatar Jul 14 '22 08:07 aminmkhan

I have also noticed that diffdisk continues to grow, and doing regular commands to prune dockerd (moby) usage has no effect on reducing diffdisk size.

There is no way to shrink the diffdisk file size on the host; it will grow until it reaches the max size (100GB), and then you will get out-of-disk errors from inside the VM. Pruning images helps to reclaim space inside the volume, so it can be reused instead of committing additional host disk space, so it is still good practice.

The diffdisk is conceptually a sparse file, so blocks that have never been written to inside the VM are not allocated on the host. But once the block has been created, there is no way to release it back to the host.
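
To see how much of a sparse file is actually allocated on the host, you can compare its logical length with its allocated blocks. A small sketch, assuming a Unix-like host (macOS or Linux); the function name `disk_usage` is only illustrative:

```python
import os


def disk_usage(path):
    """Return (apparent_size, allocated_bytes) for a file.

    Illustrative helper; works on macOS and Linux, where os.stat()
    exposes st_blocks.
    """
    st = os.stat(path)
    # st_size is the logical length of the file. st_blocks counts the
    # 512-byte blocks actually allocated on disk, so for a sparse file
    # the second value can be much smaller than the first.
    return st.st_size, st.st_blocks * 512
```

For a sparse file the first number is what `ls -l` reports and the second is roughly what `du` reports. Per the explanation above, the second number for diffdisk only ever goes up.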

jandubois avatar Jul 14 '22 16:07 jandubois

Hmm, we do ship qemu-img as part of what we supply for lima; it might be possible to use qemu-img convert to copy the existing image to a new sparse image, and then replace it. (The whole thing would have to be done offline, while RD isn't running, of course.)

This would be a totally manual process of course.

mook-as avatar Jul 22 '22 23:07 mook-as

it might be possible to use qemu-img convert to copy the existing image to a new sparse image

I believe I tried this before, and it didn't work on macOS. I think I saw some comments that this only works on Linux.

It should be possible though to create a second volume, that will be mounted into the VM, and then copy the filesystem over, so it will only grow to the size required by the current files. Then you can reboot with just the new disk mounted as the data volume and delete the old one.

But the other problem is that most likely the user will not have enough space to even create a sparse copy of the existing disk; the time when you investigate pruning the image is when the host disk has almost run out, so even creating a 50GB sparse disk from a full 100GB disk might be impossible.

jandubois avatar Jul 28 '22 07:07 jandubois

But the other problem is that most likely the user will not have enough space to even create a sparse copy of the existing disk

Especially if their log files (rancher-desktop-bridged_en0_vmnet.stderr.log) fill up the whole drive ;) Regardless of the usage of diffdisk in rancher/docker is it reasonable to say that these *.stderr.log files should not exceed a reasonable size for logs that are repeated ad nauseam?

danielatjumo avatar Jul 28 '22 09:07 danielatjumo

Especially if their log files (rancher-desktop-bridged_en0_vmnet.stderr.log) fill up the whole drive ;)

Yes, but that is a separate issue. The log files can be deleted at any time to make space; they will be re-created automatically.

Regardless of the usage of diffdisk in rancher/docker is it reasonable to say that these *.stderr.log files should not exceed a reasonable size for logs that are repeated ad nauseam?

Yes, that is correct. I don't know why they grow that fast for you; I can't remember seeing anybody else reporting this.

When you restart the machine, and then restart Rancher Desktop, are all the excessive logs gone?

Side note: once the diagnostics system is implemented, we should have a check for total log size and provide an automatic fix to delete or rotate them.
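
As a rough illustration of what such a check could look like (this is not the Rancher Desktop diagnostics code; the function names and the size limit are assumptions made up for this sketch):

```python
from pathlib import Path


def oversized_logs(log_dir, limit_bytes):
    """Return [(path, size)] for *.log files above limit_bytes, biggest first."""
    found = []
    for p in Path(log_dir).rglob("*.log"):
        if p.is_file():
            size = p.stat().st_size
            if size > limit_bytes:
                found.append((p, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


def truncate_logs(entries):
    """Empty each reported file in place; return the total bytes cleared.

    Assumption: the process writing the log is stopped. Truncating a
    file that is still being written to may not behave as expected.
    """
    cleared = 0
    for path, size in entries:
        with open(path, "w"):
            pass  # opening in "w" mode truncates the file to zero length
        cleared += size
    return cleared
```

Reporting and truncating are kept separate so the check can run on its own and the fix only happens when the user asks for it.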

jandubois avatar Jul 28 '22 19:07 jandubois

The diffdisk file just seems to grow in size over time. I hardly have any images and volumes, and also use docker system prune to clean up unnecessary build caches and unused containers, but the file is still huge. I am also using M1.

maxsokolovsky avatar Oct 04 '22 23:10 maxsokolovsky

The diffdisk file just seems to grow in size over time.

Yes, diffdisk is supposed to grow until it reaches the max size (100GB).

There is one option suggested above to reduce this maximum size in the very beginning when you set up Rancher Desktop.

aminmkhan avatar Oct 07 '22 14:10 aminmkhan

@aminmkhan, is it safe to just delete the diffdisk file and thus restart its slow growth?

maxsokolovsky avatar Oct 11 '22 18:10 maxsokolovsky

is it safe to just delete the diffdisk file and thus restart its slow growth?

It is safer to do a "Factory Reset" (on the Troubleshooting page). You will lose all your settings and images though.

jandubois avatar Oct 13 '22 00:10 jandubois

Got affected by this as well

Some statistics for a 140GB log: there were only 3 kinds of lines present

  "_on_vmnet_packets_available(): vmnet_return_t VMNET_TOO_MANY_PACKETS\n" => 1052173666,
  "vmnet_read: No buffer space available\n" => 1052173666,
  "vmnet_read: No space left on device\n" => 1044973802

The first two kinds alternate (1,2,1,2,1,2...), and most of the "No space left on device" lines are just spammed at the end of the log.

My hunch is that Rancher doesn't handle out-of-memory/out-of-storage issues very gracefully, and so gets stuck in an endless cycle of

  1. Try to allocate storage/memory
  2. Get rejected (not enough of free space)
  3. GOTO 1 (to try again)

Nonetheless, log rotation should be enough to at least limit this problem.
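
To illustrate why rotation caps the damage, here is a minimal sketch using Python's standard `logging.handlers.RotatingFileHandler`. This only demonstrates the idea; the vmnet helper presumably writes straight to a redirected stderr file, and the function name and size limits here are made up.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_bounded_logger(name, path, max_bytes=1_000_000, backups=2):
    """Return a logger whose files on disk stay within a fixed budget.

    Hypothetical helper: at most (backups + 1) files are kept, each
    smaller than max_bytes, no matter how many lines are logged.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the messages out of the root logger
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger
```

Even if the same error is logged in a tight loop, the oldest backup is discarded on each rollover, so the total stays around `max_bytes * (backups + 1)` instead of growing to hundreds of GB.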

jrogov avatar Nov 25 '22 10:11 jrogov

I'm also affected by this, multiple times now. M1 Mac on macOS 14.5, Rancher Desktop v1.14.1, K8s 1.29.6 (stable).

I really like to use Rancher Desktop, but this is a real bummer. Do you guys know any nice alternatives as long as this is not fixed?

Every time I do a “Factory Reset” to get diffdisk smaller, I lose my local volumes, including all local databases. I don't want to re-import my local databases every time that file grows to infinity. Using Docker Desktop before, I never had those issues.

seisenreich avatar Jul 01 '24 16:07 seisenreich

@seisenreich I think you can safely delete any large log files when restarting Rancher Desktop.

How much free space do you have on your M1 Mac after a “Factory Reset” of Rancher Desktop?

I ask because you can use the option in the configuration file to reduce the maximum size of diffdisk (100 GB by default) at the very beginning, when you set up Rancher Desktop.

aminmkhan avatar Jul 07 '24 09:07 aminmkhan