zfs-inplace-rebalancing

Stop after given number of files processed (or better, bytes processed)

Open spotcatbug opened this issue 10 months ago • 4 comments

TL;DR: What I need is to be able to tell the script to rebalance a directory, but stop after X number of files or, better yet, stop after Y number of bytes. Another command line switch, perhaps?

Read on, for my specific situation that prompted this request...

I just doubled the size of my main pool by adding a second, same-sized vdev to it. As part of the process, the first vdev ended up at a little more than 94% utilization. Not ideal, but hopefully temporary. So, after adding the new vdev, I was sitting at 94% usage on the first vdev and 0% on the second.

After using du a few times to look for good candidates for the script to rebalance, I'm currently sitting at a 76% / 20% balance. That's certainly good enough. I'm not looking for a 47% / 47% balance here. I know this whole rebalancing process is a bit of a feel-good measure, but 94% / 0% just feels icky. What I'm looking for is more like 70% / 24%.

So now I'm facing a dilemma. The only directory I have left that's big enough is my "Movies" directory, at 13 TB. That directory is filled with movies, each in its own subdirectory one level down. I must either rebalance the entire Movies directory (13 TB is far more than necessary) or find some other large directory to rebalance (there isn't one).

I need what the TL;DR says.

spotcatbug avatar Apr 11 '24 17:04 spotcatbug

Hey @spotcatbug :)

What about watching the balance status while the script is running and cancelling it when the desired balance level is reached? Cancelling the script is fine; you just have to be mindful of the leftover files it will (very likely) leave behind and clean them up manually. See https://github.com/markusressel/zfs-inplace-rebalancing#things-to-consider
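For the manual cleanup, a minimal sketch of how the leftovers could be located. It assumes the interrupted copies carry a `.balance` suffix, as the ".balance" file mentioned later in this thread suggests; the function name `list_balance_leftovers` is made up for illustration and is not part of the script. It only lists files and deletes nothing.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of zfs-inplace-rebalancing.
# Assumption: leftover copies from a cancelled run end in ".balance".
# Prints every regular file under the given directory with that suffix.
list_balance_leftovers() {
  find "$1" -type f -name '*.balance'
}
```

Review each listed file against its original before removing anything, since the listing alone cannot tell whether the original or the copy is the complete one.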


markusressel avatar Apr 11 '24 18:04 markusressel

> What about watching the balance status while having the script running and cancel it when the desired balance level is reached?

I hadn't really considered that, even though I had cancelled the script a couple of times on my way to 76% / 20%. I even had to clean up a ".balance" file. I guess it just feels wrong to force it to stop as an official mode of use, so it didn't enter my mind.

So, yeah, you have the answer. I don't actually need this feature. I'm going to start the script and periodically check the pool's balance.

I still think it would be cool to automate this, though. Hunting for the right directory to run the script on was a little fiddly. If you could set a (tera/giga)byte count bail-out, you could just run the script on the pool's root directory.
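To illustrate the byte-count idea, here is a rough sketch of the selection step only. It is not a feature of the script: `files_within_budget` is a hypothetical name, and the sketch assumes file names contain no newlines. It walks a directory in sorted order and prints files until their cumulative size reaches the budget, so the file that crosses the budget is still included.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of zfs-inplace-rebalancing.
# Usage: files_within_budget DIR BUDGET_BYTES
# Prints files under DIR (sorted by path) until the running total of their
# sizes reaches BUDGET_BYTES. Assumes no newlines in file names.
# The body runs in a subshell so its variables do not leak to the caller.
files_within_budget() (
  dir=$1
  budget=$2
  total=0
  find "$dir" -type f | sort | while IFS= read -r file; do
    # wc -c reports the size in bytes; skip files that cannot be read.
    size=$(wc -c <"$file") || continue
    printf '%s\n' "$file"
    total=$((total + size))
    if [ "$total" -ge "$budget" ]; then
      break
    fi
  done
)
```

The printed list would then have to be fed to whatever does the actual rewriting of each file; how to wire that into the script is left open here.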

spotcatbug avatar Apr 11 '24 22:04 spotcatbug

I just realized: the script could even parse the output of "zpool list -v" and bail out when it's balanced. No need for the user to figure out what directory to balance or how many files or bytes. Just keep balancing until ZFS says it's balanced.
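A rough sketch of what that parsing might look like, under several assumptions: the input comes from `zpool list -v -H -p <pool>` (scripted, exact byte values), field 2 is SIZE and field 3 is ALLOC, the first line is the pool itself, and rows without a numeric ALLOC (such as the individual disks inside a mirror) can be skipped. The function name `vdev_spread` is made up, and log, cache, or special vdevs are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of zfs-inplace-rebalancing.
# Reads `zpool list -v -H -p <pool>` output on stdin and prints the difference,
# in whole percentage points, between the fullest and the emptiest vdev.
# Assumptions: field 2 = SIZE and field 3 = ALLOC (bytes); line 1 is the pool;
# rows with a non-numeric SIZE or ALLOC are skipped.
vdev_spread() {
  awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $2 > 0 {
         pct = 100 * $3 / $2
         if (n == 0 || pct > max) max = pct
         if (n == 0 || pct < min) min = pct
         n++
       }
       END { if (n > 0) printf "%d\n", max - min; else print 0 }'
}

# Hypothetical usage ("tank" is a placeholder pool name):
#   zpool list -v -H -p tank | vdev_spread
```

A wrapper could poll this value and stop the rebalancing run once the spread falls below some threshold, with the same caveat about leftover files as any other cancelled run.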

spotcatbug avatar Apr 11 '24 22:04 spotcatbug

This tool was not meant to be used regularly, and not on a production system either. It was created for your exact use case: expanding a pool and "kind of balancing" it. Adding convenience features like that would add a lot of complexity to this script, which I am simply not capable of properly supporting, and would only benefit an edge case within a scenario that is an edge case to begin with. I built this script, used it once, and have not used it since. So I am sorry to say this, but building such a feature would take more time than is justified. If you really want to balance your pool, there are more expensive options available (involving more spare disks); this is the "cheap" option.

markusressel avatar Apr 12 '24 00:04 markusressel