feat: yield when serialization is in progress
@chakaz draft so we can discuss :)
@chakaz Regarding the comment that `CVCUponInsert` calls `WriteBucket` multiple times, I think that's OK, because what we care about is that we don't break a bucket into multiple pieces. I don't think we should care if, while `CVCUponInsert` serializes one bucket, we preempt, serialize another bucket from a different flow, and then go back to continue serializing the next buckets from `CVCUponInsert`. Is that right?
I'm afraid of some edge case that will cause us to lose data being sent to the replica/RDB file. For example, touching `k1` triggers `OnDbChange` and also causes a bucket/segment split. We start serializing the bucket, and yield to `k2` (same bucket/segment), which splits the bucket and moves keys around, and when we return to `k1`'s `OnDbChange` we've missed stuff.
Also, when do we set the bucket version? If before we start serializing the bucket in `OnDbChange`, then a parallel `OnDbChange` on another key in the same bucket will be missed; but if after, then we will process the bucket twice (which is also not good, because we can't save the same key twice, as it produces corrupted RDB files).
I think that locking the entire operation instead of individual keys can solve these (and potentially other, more complex issues we didn't think about), and really there's no price to pay for it.
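A minimal sketch of what locking the entire operation could look like, assuming a fiber-aware `util::fb2::Mutex` from helio; the names here (`SnapshotSketch`, `serialization_mu_`, `SerializeBucket`) are illustrative, not the actual PR code. The point is that the critical section spans the whole callback, not each key:

```cpp
#include <cstdint>
#include <mutex>  // std::lock_guard

#include "util/fibers/synchronization.h"  // util::fb2::Mutex (helio)

class SnapshotSketch {
 public:
  // Change callback: the lock is held across the whole bucket
  // serialization, so a preemption inside SerializeBucket() cannot
  // interleave with another OnDbChange on the same (or a split) bucket.
  void OnDbChange(uint64_t bucket_version) {
    std::lock_guard lk(serialization_mu_);
    if (bucket_version < snapshot_version_) {
      // The real code would bump the bucket's version here; under the
      // lock, before-vs-after no longer matters, since no parallel
      // OnDbChange can run against this bucket mid-serialization.
      SerializeBucket();  // may yield/preempt on big values
    }
  }

 private:
  void SerializeBucket() {}  // placeholder for the real serialization

  util::fb2::Mutex serialization_mu_;
  uint64_t snapshot_version_ = 1;
};
```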
I think I agree with what you wrote, but in that case we have a problem: for example, the `CVCUponBump` flow calls `OnDbChange` in a loop, so we can preempt between transactions, and this will cause a bug.
@kostasrim I believe we can use `CondVarAny` here instead of the mutex. This will be faster, without locks, and in most use cases we will not have any preemptions, as the values will not be big, so we will not need to preempt. Take as an example `util::fb2::CondVarAny pipeline_cnd;` in our code.
I think this won't work, because once we unlock the mutex we will schedule the waiting fiber for execution (make it active), and if we preempt on the next round it will interleave, bypassing the atomicity the lock provides. This can be fixed with a predicate and a boolean, using `wait_until` on the condition variable, but IMO there is not much gain here other than making the code a little more verbose.
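For completeness, the more verbose predicate-based variant would look roughly like the sketch below (using the plain predicate overload of `wait` for simplicity). It assumes `util::fb2::CondVarAny` and `util::fb2::NoOpLock` from helio behave like their `std` counterparts, and the flag and function names are hypothetical. Since everything runs on the same shard thread, a plain bool is enough; the condition variable only parks and wakes fibers:

```cpp
#include "util/fibers/synchronization.h"  // fb2::CondVarAny, fb2::NoOpLock (helio)

bool serializing = false;                 // hypothetical in-progress flag
util::fb2::CondVarAny serialization_cnd;  // hypothetical condition variable

void BeginSerialization() {
  util::fb2::NoOpLock noop;
  // The predicate is re-checked after every wakeup: being notified does
  // not mean this fiber can proceed; another fiber may have set the flag
  // again before we were actually resumed.
  serialization_cnd.wait(noop, [] { return !serializing; });
  serializing = true;
}

void EndSerialization() {
  serializing = false;
  serialization_cnd.notify_one();
}
```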
Regarding the `CVCUponBump` loop: after we preempt on the first call of the cb and unlock the mutex, it will schedule the waiting fiber for execution but won't switch to it, so we will call the next cb. If that cb preempts, the next fiber (the one scheduled above) will try to lock the mutex and fail (it got notified, but that doesn't mean it can lock the mutex, so it has to try, and if it fails it needs to fall back and wait again to be notified). So each cb will be applied in order, without interleavings.
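A sketch of that argument, again with illustrative names and assuming a fiber-aware `util::fb2::Mutex`: each cb re-acquires the mutex, and a woken waiter must still win the lock before it can run, so the callbacks execute in order even if any one of them preempts mid-serialization:

```cpp
#include <functional>
#include <mutex>
#include <vector>

#include "util/fibers/synchronization.h"  // util::fb2::Mutex (helio)

util::fb2::Mutex serialization_mu;  // illustrative, shared with OnDbChange

// Rough shape of a CVCUponBump-style loop over change callbacks.
void RunChangeCallbacks(const std::vector<std::function<void()>>& cbs) {
  for (const auto& cb : cbs) {
    // Unlocking at the end of an iteration schedules a waiting fiber but
    // does not switch to it; if this fiber re-locks first, the waiter
    // fails to take the lock on wakeup and goes back to waiting.
    std::lock_guard lk(serialization_mu);
    cb();  // may yield/preempt while serializing a big value
  }
}
```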
what is the related issue for this PR?
It's related to huge entries serialization: https://github.com/dragonflydb/dragonfly/issues/2585
So probably worth updating both the commit and PR description.
Agreed. I'll let @kostasrim do that though :)