stratisd
Handle error case when device(s) show up that belong to an existing activated pool
It is theoretically possible that the following scenario could occur:
- The pool is modified, and the metadata on only a subset of the disks is updated.
- `stratisd` is started and only a subset of the disks is available, all of which have the outdated metadata.
- At some point later, the device(s) that have the newer metadata become available.
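The scenario can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `Dev`, `metadata_gen`, `newest_visible`, and `pool_is_stale` are hypothetical names, not stratisd's actual types or functions, and a plain integer stands in for whatever ordering the real metadata carries.

```rust
/// Hypothetical stand-in for a block device belonging to a pool.
/// `metadata_gen` is an assumed monotonically increasing metadata version.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Dev {
    pub id: u32,
    pub metadata_gen: u64,
}

/// The newest metadata generation among the devices visible at startup.
/// A pool set up from these devices can only know about this generation.
pub fn newest_visible(devs: &[Dev]) -> Option<u64> {
    devs.iter().map(|d| d.metadata_gen).max()
}

/// True if a device that shows up later carries metadata newer than the
/// generation the active pool was set up with, i.e. the pool is stale.
pub fn pool_is_stale(active_gen: u64, late: &Dev) -> bool {
    late.metadata_gen > active_gen
}
```

With two devices at generation 4 visible at startup, the pool comes up at generation 4; a third device arriving later at generation 5 makes `pool_is_stale` return true, which is exactly the case the daemon currently cannot act on.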
Potential outcome(s):
- If `stratisd` doesn't have udev add support, it is blissfully unaware of the issue and continues to operate under the false assumption that all is good.
- If `stratisd` does have udev add support, it gets the udev add, determines that the device's pool id is already active, and ignores it, yielding the same outcome as if `stratisd` didn't have udev add support.
Ultimately we need to be able to identify this error case and correctly figure out what actions to take to correct it. At the moment we could identify it with udev add support, but we don't know what action(s) we should take. As one previous colleague of mine stated, "Don't check for error conditions you don't know how to handle" :-)
Note that stratisd can currently distinguish between two cases: the device is already in the pool but was found again, or the device is not yet in the pool and has been found for the first time.
- The device is already in the pool.
  a. All our data about the device matches the data we already have. Maybe this means everything is fine. Or maybe not.
  b. Something doesn't match between the blockdev we have now and the one in the complete pool. This is a definite panic, I would think.
- The device is not already in the pool.
  a. It has metadata newer than the pool's. In that case the pool must be wrong. We should remove the pool from pools, reinsert it into incomplete pools, and rerun the whole setup thing to see if it is now fine.
  b. Its metadata is older than the pool's metadata. So it belonged to the pool and was removed. Currently, that is impossible, so a panic is the right choice. Later, it may become possible, as we will be able to remove cache devs from the pool.
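The four cases above can be written down as a small decision function. This is a sketch only: `Known`, `Action`, and `classify` are hypothetical names invented for illustration, not part of stratisd, and metadata age is modelled as a plain integer timestamp. The case of equal timestamps on a device that is not in the pool is not covered by the discussion, so the sketch marks it as unspecified rather than guessing.

```rust
use std::cmp::Ordering;

/// What is known about the device a udev event refers to (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Known {
    /// Already part of the active pool; `matches` says whether the freshly
    /// read data agrees with what the pool has recorded for it.
    InPool { matches: bool },
    /// Not part of the active pool; carries the age of its metadata.
    NotInPool { metadata_time: u64 },
}

/// The response proposed for each case (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Case 1a: nothing obviously wrong, carry on.
    Ignore,
    /// Cases 1b and 2b: an inconsistency that should currently be impossible.
    Fatal,
    /// Case 2a: move the pool back to the incomplete pools and rerun setup.
    Resetup,
    /// Not addressed in the discussion (equal timestamps, device not in pool).
    Unspecified,
}

pub fn classify(known: Known, pool_metadata_time: u64) -> Action {
    match known {
        Known::InPool { matches: true } => Action::Ignore,
        Known::InPool { matches: false } => Action::Fatal,
        Known::NotInPool { metadata_time } => {
            match metadata_time.cmp(&pool_metadata_time) {
                Ordering::Greater => Action::Resetup,
                Ordering::Less => Action::Fatal,
                Ordering::Equal => Action::Unspecified,
            }
        }
    }
}
```

Returning a value instead of panicking directly keeps the policy in one place, so a `Fatal` result could later be relaxed, for example once cache devs can be removed from a pool.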
> The device is not already in the pool. a. It has metadata newer than the pool's. In that case the pool must be wrong. We should remove the pool from pools, reinsert it into incomplete pools, and rerun the whole setup thing to see if it is now fine.
Once a pool is up and active and IO has been done on it, there is a good chance that data loss/corruption has already occurred. One cannot simply go from one state to another and back without ramifications. IMHO we need to prevent this case from happening, not try to deal with it when it does. We should know without any ambiguity whether a pool is complete or not.
If I understand this correctly, this all stems from an optimization where we choose not to write the latest metadata to all disks, to speed up metadata updates. IMHO we need to revisit that discussion and determine how we can close this error case.
OK, so can we just write metadata to all blockdevs and worry about this at the time we officially start supporting enough blockdevs to make writing to all blockdevs a bad idea? Such a time might never come...
I think we can theorize a scenario where we are writing to all storage devices and still end up in this situation. This is akin to solving the RAID5 write hole.
Now that stratisd is responding to Change as well as Add events, `block_evaluate` is encountering quite benign instances where the block device being evaluated is already in a complete and fully functioning pool. I think that it's time to address this bug, in order to avoid pumping out warn messages about an utterly normal situation.