
Exclude can fail if data is not distributed evenly

Open vishesh opened this issue 2 years ago • 5 comments

Currently, exclusion looks at the storage server with the least available space and uses that ratio to calculate the total cluster usage. This can cause exclusion to fail if you add a bunch of new storage processes to a cluster where one or more existing SSes have very high utilization and DD hasn't caught up yet. https://github.com/apple/foundationdb/blob/bc47f90aff019467bc63f184a073a97e0657b294/fdbclient/SpecialKeySpace.actor.cpp#L1036-L1044
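To illustrate the skew, here is a minimal sketch with made-up numbers. It is not the code linked above: the server names, the `total`/`available` fields, and both helper functions are hypothetical, and it only assumes that the check extrapolates from the single worst available-space ratio.

```python
# Hypothetical per-storage-server disk stats in bytes (not real FDB status output).
servers = [
    {"id": "ss-old", "total": 1000, "available": 100},    # nearly full, DD hasn't moved data off yet
    {"id": "ss-new-1", "total": 1000, "available": 950},  # freshly added
    {"id": "ss-new-2", "total": 1000, "available": 950},
    {"id": "ss-new-3", "total": 1000, "available": 950},
]


def worst_server_ratio(servers):
    """Available-space ratio of the single fullest server."""
    return min(s["available"] / s["total"] for s in servers)


def aggregate_ratio(servers):
    """Available-space ratio computed over the whole set of servers."""
    return sum(s["available"] for s in servers) / sum(s["total"] for s in servers)


worst = worst_server_ratio(servers)
overall = aggregate_ratio(servers)
print(f"worst server: {worst:.4f}, aggregate: {overall:.4f}")
```

With these numbers the fullest server reports 10% available while the set as a whole has about 74% available, so a check based on the worst server would treat the cluster as far fuller than it is.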

vishesh avatar Sep 08 '22 18:09 vishesh

This is related to #4693 (see in particular https://github.com/apple/foundationdb/issues/4693#issuecomment-823579455)

sfc-gh-abeamon avatar Sep 08 '22 18:09 sfc-gh-abeamon

We discussed this in the meeting. The metric should be used_bytes for all machines divided by total_bytes of non-excluded SSes.

jzhou77 avatar Sep 12 '22 17:09 jzhou77

I think you have to be careful about dividing by total_bytes. At the very least, you'll need to account for the possibility that storage servers are sharing disks or that the disks have other data on them.

It would probably be better to add up the used_bytes of all non-excluded storage servers and the free_bytes of all non-excluded storage disks. Even better might be to account for available bytes in the used_bytes numerator, since these don't count toward our actual usage, though this may just be a nice-to-have improvement.

sfc-gh-abeamon avatar Sep 12 '22 17:09 sfc-gh-abeamon

Which used_bytes metric are we talking about here? Currently, I see used_bytes for memory but not for disk. There are stored_bytes and kvstore_used_bytes inside the roles section, which I can probably look into. I was thinking of computing

```
(ForAll(NonExcludedServers) [ sum(free_bytes) ] - ForAll(ExcludedServers) [ sum(used_bytes) ]) / ForAll(NonExcludedServers) [ sum(total_bytes) ]
```

to get the free space ratio.

vishesh avatar Sep 14 '22 16:09 vishesh

I think we need to use something more like:

1 - sum(kvstore_used_bytes, all processes) / 
     (sum(kvstore_used_bytes, non-excluded processes) + sum(kvstore_free_bytes, non-excluded disks))

Using total bytes has problems with sharing disks with other FDB processes or other data, and when using free bytes you will similarly need to count that once per disk rather than once per process. To account for available space in the numerator, you could also do this:

1 - sum(kvstore_used_bytes + kvstore_free_bytes - kvstore_available_bytes, all processes) / 
     (sum(kvstore_used_bytes, non-excluded processes) + sum(kvstore_free_bytes, non-excluded disks))

Used (per process) + free (per disk) gives you the total disk space either available for the FDB processes to use or already in use by FDB across the processes/disks being checked, so it becomes our denominator. The numerator gives either the total size of the files across all processes (in the first case) or the total amount of space actually used across all processes (in the second case).
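As a rough sketch of the two formulas above (not FDB code), the calculation could look like the following. The `Process` class, its `disk_id` and `excluded` fields, and the `free_space_ratio` function are hypothetical names for illustration; only the `kvstore_*` metric names come from this thread. It assumes every process on the same disk reports the same `kvstore_free_bytes`.

```python
from dataclasses import dataclass


@dataclass
class Process:
    disk_id: str                  # hypothetical: identifies the physical disk
    excluded: bool                # True if the process is being excluded
    kvstore_used_bytes: int
    kvstore_free_bytes: int
    kvstore_available_bytes: int


def free_space_ratio(processes, account_for_available=False):
    """Free-space ratio left on the non-excluded processes after exclusion."""
    if account_for_available:
        # Second formula: space actually in use, across all processes.
        numerator = sum(
            p.kvstore_used_bytes + p.kvstore_free_bytes - p.kvstore_available_bytes
            for p in processes
        )
    else:
        # First formula: total file size, across all processes.
        numerator = sum(p.kvstore_used_bytes for p in processes)

    remaining = [p for p in processes if not p.excluded]
    used_remaining = sum(p.kvstore_used_bytes for p in remaining)

    # Count free bytes once per disk, not once per process.
    free_by_disk = {}
    for p in remaining:
        free_by_disk.setdefault(p.disk_id, p.kvstore_free_bytes)

    denominator = used_remaining + sum(free_by_disk.values())
    if denominator == 0:
        raise ValueError("no capacity left on non-excluded processes")
    return 1 - numerator / denominator


# Made-up example: one excluded process on disk A, two processes sharing disk B.
procs = [
    Process("A", True, 300, 700, 700),
    Process("B", False, 200, 600, 650),
    Process("B", False, 200, 600, 600),
]
ratio_files = free_space_ratio(procs)
ratio_actual = free_space_ratio(procs, account_for_available=True)
```

In this example disk B's 600 free bytes are counted once even though two processes report them, which is the per-disk accounting described above.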

sfc-gh-abeamon avatar Sep 14 '22 16:09 sfc-gh-abeamon