couchdb Reduce intra cluster conflicts

Overview

CouchDB issues all write requests in parallel without coordination, applying a quorum on the results of those independent actions. When updating a document concurrently this can lead to the introduction of a stored conflict if two different writes reach separate nodes first. This is undesirable.

This patch changes fabric_doc_update in the following ways;

Workers are no longer started immediately, but are given a unique reference each.
For each range in the write request, one node is chosen to "lead" the write decision (calculated as the lowest live node that hosts the shard range)
"Leader" workers are started.
Any doc update that receives "conflict" from a Leader is added to the reply dict W times and the doc updates are removed from the other (unstarted) workers. If that leaves the worker with nothing to do, it is removed entirely.

Testing recommendations

There is some existing coverage in the module itself but more testing is needed before this can be merged.

Related Issues or Pull Requests

Checklist

[x] Code is written and works correctly
[ ] Changes are covered by tests
[ ] Any new configurable parameters are documented in rel/overlay/etc/default.ini
[ ] Documentation changes were made in the src/docs folder
[ ] Documentation changes were backported (separated PR) to affected branches

Dec 29 '24 18:12 rnewson

noting need for a liveness check at the start (so the 'leader' is always a live node as of the invocation) and also to consider maintenance mode (receipt of that message from a 'leader' specifically)

Jan 01 '25 11:01 rnewson

I put the acc introduction in its own commit for exactly that reason and intend to keep those separate commits when merging.

Jan 06 '25 18:01 rnewson

I put the acc introduction in its own commit for exactly that reason and intend to keep those separate commits when merging.

That would look nice when merging, however since a bunch of notes and comments are related to just the acc change may still look nice to do a preliminary cleanup / #acc{} record PR and keep the the overall discussion concerning it in that PR then move on the main one.

Jan 06 '25 23:01 nickva

it's hard enough to get CI to run for one pull request. I'll do it this time but I really don't like this idea at all. I separated out the refactoring from the functional change within the PR and it makes little sense to commit one without the other.

Jan 07 '25 09:01 rnewson

https://github.com/apache/couchdb/pull/5385 for acc record

Jan 07 '25 09:01 rnewson

it's hard enough to get CI to run for one pull request. I'll do it this time but I really don't like this idea at all. I separated out the refactoring from the functional change within the PR and it makes little sense to commit one without the other.

I think it's worthwhile. At least in this case the test failures do seem related to bulk_docs:

07:52:20  7) test bulk docs emits conflict error for duplicate doc `_id`s (BulkDocsTest)
07:52:20       test/elixir/test/bulk_docs_test.exs:124
07:52:20       Expected 201 and the same number of response rows as in request, but got
07:52:20       %HTTPotion.Response{
07:52:20         status_code: 500,
07:52:20         body: [
07:52:20           %{
07:52:20             "error" => "conflict",
07:52:20             "id" => "0",
07:52:20             "reason" => "Document update conflict."
07:52:20           },
07:52:20           %{
07:52:20             "error" => "conflict",
07:52:20             "id" => "1",
07:52:20             "reason" => "Document update conflict."
07:52:20           },
07:52:20           %{
07:52:20             "error" => "conflict",
07:52:20             "id" => "1",
07:52:20             "reason" => "Document update conflict."
07:52:20           },
07:52:20           %{
07:52:20             "error" => "error",
07:52:20             "id" => "3",
07:52:20             "reason" => "internal_server_error"
07:52:20           }
07:52:20         ],

I keep an eye on flaky failures in the CI and this not a test that usually fails. internal_server_error seems worrying there.

Jan 07 '25 16:01 nickva

for sure, that indicates a bug here. It took 8 runs locally to get an internal_server_error so this will be fun to track down, but it needs to be found

Jan 07 '25 23:01 rnewson