Tomasz Grabiec
OK, I see, it's because CL=3 is supposed to fail during bootstrap of the third node, unlike RF=3, which works fine. In the test, you could wait on node2 for topology...
@kostja Sounds reasonable to me. Except maybe for the 5 sec timeout, which may not be enough in our CI environment. It could be a regular barrier, we do several of them...
This has the potential to cause regressions compared to vnodes, so I don't think it should be enterprise-only.
> @bhalevy, @tgrabiec: Can we use the shuffle API as a valid workaround here, and push this from the 6.0 release blocker list? I don't think we can...
> @tgrabiec, why did you remove the 'release blocker' label? Can we live without it on 6.1? It didn't block the 6.0 release, why should it block 6.1?
> > I don't understand. Aren't they both a single add_entry() call? > > The call in this case is `add_entries`, which, as the name implies, is adding a vector...
> > Scenario 1 (changing the replica set): > > ``` > > 1. RF=3, tablet replicas={A, B, C} > > > > 2. CL=QUORUM write is ACKed on {A,...
> > > > Scenario 1 (changing the replica set): > > > > ``` > > > > 1. RF=3, tablet replicas={A, B, C} > > > > >...
Another reason why running repair concurrently with migration is not safe is that repair may have been started when tablet migration was in such a stage that reads...
> @tgrabiec doesn't this fix #17658?

Yes, updated description.