BUG: Fix for DocumentsWriter concurrency (fixes #935, closes #886)

Open NightOwl888 opened this issue 1 month ago • 6 comments

  • [x] You've read the Contributor Guide and Code of Conduct.
  • [x] You've included unit or integration tests for your change, where applicable.
  • [x] You've included inline docs for your change, where applicable.
  • [x] There's an open issue for the PR that you are making. If you'd like to propose a change, please open an issue to discuss the change or find an existing issue.

Summary of the changes (Less than 80 chars)

Lucene.Net.Support.Threading.ReentrantLock: Fixed the implementation so it prioritizes new threads obtaining the lock over waiting threads.

Fixes #935. Closes #886.

Description

This has been a known issue for some time. However, because the DocumentsWriter has worked reliably on a single thread, most users did not worry about it until it was reported in #935.

Two issues were causing test failures, as explained in https://github.com/apache/lucenenet/issues/935#issuecomment-2118228819. The second turned out to be much more involved to address, even though the most recent solution is actually very simple and lightweight. Instead of calling UninterruptableMonitor.Enter() in the Lock() method, we call UninterruptableMonitor.TryEnter() in a loop that calls Thread.Yield(). This allows other threads to acquire the lock even though there are waiting threads.
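For illustration, the TryEnter-and-yield loop described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: `YieldingLock` and its members are hypothetical names, and `System.Threading.Monitor` stands in for `UninterruptableMonitor` so the example is self-contained.

```csharp
using System.Threading;

// Hypothetical illustration class; not the actual
// Lucene.Net.Support.Threading.ReentrantLock implementation.
public sealed class YieldingLock
{
    private readonly object syncLock = new object();

    public void Lock()
    {
        // Monitor.TryEnter(object) returns immediately: true if the calling
        // thread now owns the lock, false if another thread holds it.
        // (Assumption: System.Threading.Monitor is used here in place of
        // UninterruptableMonitor.)
        while (!Monitor.TryEnter(syncLock))
        {
            // Give up the rest of this time slice, then retry. The thread
            // never blocks inside the monitor, so it never queues there.
            Thread.Yield();
        }
    }

    public void Unlock()
    {
        // Must be called once per successful Lock() by the owning thread.
        Monitor.Exit(syncLock);
    }

    // True if the calling thread currently owns the lock.
    public bool IsHeldByCurrentThread => Monitor.IsEntered(syncLock);
}
```

A caller would pair `Lock()` with `Unlock()` in a `try`/`finally` block. Because `Monitor` is reentrant, the same thread can call `Lock()` more than once as long as it calls `Unlock()` the same number of times. The trade-off in this sketch is that contended threads spin and yield instead of blocking.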

Granted, while this approach seems to pass the tests reliably, it may be a somewhat naïve implementation. It doesn't peg my CPU and seems to run well on Azure DevOps, but I am not sure whether it is the appropriate solution for production scenarios. Suggestions on how to improve it are welcome. Note that there were 2 prior attempts:

  • https://github.com/apache/lucenenet/commit/6f2e1290ac067299c6e9b216649a852ade0a07f8 - Uses a timeout to ensure the lock is acquired, but the thread still has to wait until the queue schedules it before it can run.
  • https://github.com/apache/lucenenet/commit/89b01e625116afd5ccba9457271a36524dc75ca1 - Uses ManualResetEventSlim and a Queue<T> to manage the waiting threads. This is a more complete implementation and even passed many of the Apache Harmony tests, but it comes at a pretty steep performance cost. Maybe there is a way to improve this, though.

NightOwl888, May 23 '24 12:05