ARTSynchronized icon indicating copy to clipboard operation
ARTSynchronized copied to clipboard

obsoleteness detection in lookupRange() of ARTOLC causes an infinite loop

Open chahk0129 opened this issue 4 years ago • 1 comments

Hi, I found an issue with getChillren() function in line 103 of Tree.cpp inside lookupRange() of ARTOLC implementation. https://github.com/flode/ARTSynchronized/blob/849ef27e7e15ec32cca5dd368570aafe4535fbee/OptimisticLockCoupling/Tree.cpp#L103 When it is directed to each specific node type, e.g., Node4, Node16, Node48, Node256, it checks if it needs to restart upon read-write conflicts.

When conflict is detected, it restarts locally, but this can cause an infinite loop when running with write operations. The dynamic node adjustment replaces a node with a larger/smaller node, which marks the original obsolete and reclaimed. However, when running concurrent queries of range lookups with a mixture of write operations such as insert and remove, threads executing range queries keep trying locally to collect child pointers due to obsoleteness detection which will never succeed.

I think the getChildren() in each node type should either

  • distinguish the write latch acquisition state and obsolete state, and when obsolete, retry from parent (may recursively backtrack up).
  • abort upon conflicts, and retry globally (from the root).

chahk0129 avatar Oct 05 '21 22:10 chahk0129

Hey, yes good catch. And it's not only the getChildren function that has this issue, in the findStart and findEnd helpers there are also local retries that can run into an infinite loop. Both of the proposed solutions would solve this.

flode avatar Oct 09 '21 08:10 flode