btrbk
Problem: Out-of-memory for many snapshot deletions -> Feature Request: use "btrfs subvolume sync"
I am using an SSH-based backup of many subvolumes to a remote ARM-based NAS device. This device was offline for a long time. Now, when running the usual btrbk routine, the device's kernel runs into out-of-memory errors during the subvolume/snapshot deletion phase of btrbk.
I assume this is due to the fact that the btrfs subvolume delete command instantly returns, and btrbk just issues the next one.
--> Would it be possible to add the "btrfs subvolume sync" command in between subvolume deletions? This would make sure that the current delete operation(s) are finished before issuing the next operations - and thus hopefully avoid the OOM.
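To illustrate the request, here is a minimal shell sketch of the "delete, then wait" sequence. It is not btrbk code: the function name delete_and_wait and the BTRFS variable (a hook for dry runs) are made up for this example, and the paths are placeholders.

```shell
#!/bin/sh
# Sketch only: delete snapshots one at a time and wait for each deletion
# to complete before issuing the next one.
# BTRFS is a hypothetical hook for this example (set BTRFS=echo for a dry run).
BTRFS="${BTRFS:-btrfs}"

delete_and_wait() {
    # $1: path on the btrfs filesystem (e.g. its mount point)
    # remaining arguments: snapshot paths to delete
    mnt="$1"
    shift
    for snap in "$@"; do
        # "subvolume delete" returns before the subvolume is actually cleaned up
        $BTRFS subvolume delete "$snap" || return 1
        # without a subvolume id, this waits for all deletions requested so far
        $BTRFS subvolume sync "$mnt" || return 1
    done
}
```

With BTRFS=echo, calling delete_and_wait /mnt/backup /mnt/backup/snap.1 only prints the commands it would run.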
Thank you very much in advance!
Setting "btrfs_commit_delete each" in your btrbk.conf should help. This will call "btrfs subvolume delete --commit-each", which should "wait for transaction commit after deleting each subvolume" (see btrfs-subvolume(8)).
(sorry for the late reply, I've been super busy lately...)
Hello @digint ,
thank you for answering! I was very busy myself. I wanted to learn Perl so I could modify btrbk to include the "btrfs subvolume sync" command, but did not find the time to try it.
Referring to your answer:
I have already tried "btrfs_commit_delete each", to no avail: there was no waiting.
I really think that in my case, I need "btrfs subvolume sync".
The btrfs wiki entry for this command reads:
... sync [subvolid…]: Wait until given subvolume(s) are completely removed from the filesystem after deletion. If no subvolume id is given, wait until all current deletion requests are completed, but do not wait for subvolumes deleted in the meantime.
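A rough sketch of the variant that waits for one specific subvolume, as described in the wiki text above. This is an illustration under assumptions, not btrbk code: the function name delete_and_wait_for_id and the BTRFS variable are hypothetical, and the subvolume id is looked up with "btrfs inspect-internal rootid" before the delete, since the path is gone afterwards.

```shell
#!/bin/sh
# Sketch only: wait for the deletion of one specific subvolume.
# BTRFS is a hypothetical hook for this example (defaults to the real tool).
BTRFS="${BTRFS:-btrfs}"

delete_and_wait_for_id() {
    # $1: path on the btrfs filesystem (e.g. its mount point)
    # $2: snapshot path to delete
    mnt="$1"
    snap="$2"
    # look up the subvolume id while the snapshot path still exists
    id="$($BTRFS inspect-internal rootid "$snap")" || return 1
    $BTRFS subvolume delete "$snap" || return 1
    # wait only until this subvolume id is completely removed
    $BTRFS subvolume sync "$mnt" "$id" || return 1
}
```

Passing the id restricts the wait to that one subvolume; omitting it waits for all deletion requests that are pending at the time of the call.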
Would you be able to tell me where I would have to modify btrbk to include this command after every subvolume delete action?
Thank you very much in advance for your help!
I've added a quick'n'dirty patch in the sync-after-delete branch: https://github.com/digint/btrbk/tree/sync-after-delete commit f78fd742e442491f5767b4c519f19dc8c4dc35dd
It adds a "btrfs_commit_delete sync" config option. I have not tested it much, but it should work:
btrbk run --override=btrfs_commit_delete=sync -v -v
I'm not sure if this is the right thing to do though; trying to outsmart the kernel is usually not the best idea. I think the best solution would be to post a bug report on the btrfs mailing list, and have OOM bugs fixed in the kernel.
Hello @digint ,
thank you! I just got around to trying out your branch.
Here's what I found:
The "btrfs subvolume sync" command actually does wait until the subvolume is deleted.
The problem is that the kernel apparently does not get around to performing this deletion very often; hence, the command sometimes takes minutes to complete.
Therefore, I have created a local modification that inserts a "btrfs filesystem sync" in between the "btrfs subvolume delete" and the "btrfs subvolume sync" commands (inserted at line 1481 in btrbk).
"btrfs filesystem sync" triggers the deletion actively, and "btrfs subvolume sync" then waits until the deletion is finished.
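The three-step sequence described above could look roughly like this in shell. This is a sketch of the idea, not the actual modification to btrbk: the function name delete_flush_wait and the BTRFS variable are invented for the example.

```shell
#!/bin/sh
# Sketch only: delete, force a filesystem sync, then wait for the cleanup.
# BTRFS is a hypothetical hook for this example (set BTRFS=echo for a dry run).
BTRFS="${BTRFS:-btrfs}"

delete_flush_wait() {
    # $1: path on the btrfs filesystem (e.g. its mount point)
    # $2: snapshot path to delete
    mnt="$1"
    snap="$2"
    # 1. request the deletion (returns immediately)
    $BTRFS subvolume delete "$snap" || return 1
    # 2. force a sync of the filesystem, which in the observation above
    #    gets the pending deletion going
    $BTRFS filesystem sync "$mnt" || return 1
    # 3. block until the pending deletions have completed
    $BTRFS subvolume sync "$mnt" || return 1
}
```

Whether step 2 reliably shortens the wait is based only on the behaviour observed on this one device.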
Although this seems to work now, I fully agree that this might not be the right way to go.
I was assuming that the out-of-memory errors arise because the "tiny" backup PC receives a lot of btrfs subvolume delete requests via SSH and has to confirm all of them. This is why I did not think it was a kernel problem, but rather a consequence of btrfs subvolume delete returning immediately, with btrbk sending the next delete request right after.