fdb-record-layer Deal with asynchrony issues in FDBDirectory

Open scottfines opened this issue 3 years ago • 4 comments

See #1322 for more information about the bug. This fix is somewhat temporary and accomplishes three things:

Resolves some spotty blocking calls in FDBDirectory so that they block less
Ensures that the forced synchronous methods in FDBDirectory are not called on the context's executor, to avoid deadlocks (there are still potential performance problems from waiting for an open FJ thread, but at least it's less likely to actually break)
Adds the ability to ignore blocking in asynchronous methods when we know that they are safe (for now)
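
To illustrate the second point, here is a minimal sketch of keeping a forced synchronous call off the executor's own threads. It is not the code in this PR: the class `BlockingGuardSketch`, the helpers `newContextExecutor` and `blockingGet`, and the thread-local flag are all hypothetical names, and the sketch simply rejects a blocking join when it is attempted from a worker of the executor that would have to complete the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingGuardSketch {
    // Hypothetical marker: true only on threads owned by the "context" executor.
    private static final ThreadLocal<Boolean> ON_CONTEXT_EXECUTOR =
            ThreadLocal.withInitial(() -> false);

    // Hypothetical stand-in for the context's executor; each worker flags itself on startup.
    public static ExecutorService newContextExecutor(int threads) {
        return Executors.newFixedThreadPool(threads, runnable -> {
            Thread t = new Thread(() -> {
                ON_CONTEXT_EXECUTOR.set(true);
                runnable.run();
            });
            t.setDaemon(true);
            return t;
        });
    }

    // Blocks for the result, but refuses to do so on a context executor thread,
    // where waiting could starve the very pool that must complete the future.
    public static <T> T blockingGet(CompletableFuture<T> future) {
        if (ON_CONTEXT_EXECUTOR.get()) {
            throw new IllegalStateException("blocking call on context executor thread");
        }
        return future.join();
    }

    public static void main(String[] args) {
        ExecutorService executor = newContextExecutor(1);

        // Safe: the calling thread is not one of the executor's workers.
        int value = blockingGet(CompletableFuture.supplyAsync(() -> 42, executor));
        System.out.println(value);

        // Unsafe: the same call made from inside a worker is rejected instead of deadlocking.
        String outcome = CompletableFuture.supplyAsync(() -> {
            try {
                blockingGet(CompletableFuture.supplyAsync(() -> 1, executor));
                return "blocked";
            } catch (IllegalStateException e) {
                return "rejected";
            }
        }, executor).join();
        System.out.println(outcome);

        executor.shutdown();
    }
}
```

With a single worker thread, the second call would otherwise wait forever on a task queued behind itself, which is the deadlock shape described above. Failing fast is only one possible policy; hopping to another thread before blocking would be another.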

A very good question would be: why do we have to be synchronous here? The answer is that Lucene requires us to, and I'm not in a position to rewrite Lucene's entire API in order to support asynchronous method execution (I'm not even really sure that Lucene's thread model would support it). So here we are.

In the long term, we'll need to move blocking operations to a separate blocking thread pool in order to avoid potential performance bottlenecks, but this should serve in the short term.
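
A rough sketch of what that longer-term shape could look like, assuming two pools: one for asynchronous continuations and a separate one reserved for calls that must block. Everything here is hypothetical and for illustration only (`BlockingPoolSketch`, `ASYNC_POOL`, `BLOCKING_POOL`, `readAsync`, `readBlocking`, `offload`); none of these names come from the Record Layer or Lucene.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class BlockingPoolSketch {
    // Daemon threads so the sketch never keeps the JVM alive.
    static ExecutorService daemonPool(int threads) {
        return Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
    }

    // Hypothetical pools: a deliberately tiny async pool and a separate pool for blocking work.
    public static final ExecutorService ASYNC_POOL = daemonPool(1);
    public static final ExecutorService BLOCKING_POOL = daemonPool(4);

    // Stand-in for an asynchronous read that completes on the async pool.
    public static CompletableFuture<String> readAsync(String key) {
        return CompletableFuture.supplyAsync(() -> "value-of-" + key, ASYNC_POOL);
    }

    // Stand-in for a synchronous API (as Lucene requires) that has to wait for the read.
    public static String readBlocking(String key) {
        return readAsync(key).join();
    }

    // Runs a blocking operation on the dedicated pool so async workers never park.
    public static <T> CompletableFuture<T> offload(Supplier<T> blockingOp) {
        return CompletableFuture.supplyAsync(blockingOp, BLOCKING_POOL);
    }

    public static void main(String[] args) {
        String result = offload(() -> readBlocking("segments_1"))
                .thenApplyAsync(String::toUpperCase, ASYNC_POOL)
                .join();
        System.out.println(result);
    }
}
```

Because the join happens on a `BLOCKING_POOL` thread, the single `ASYNC_POOL` worker stays free to complete the read. Running `readBlocking` on `ASYNC_POOL` itself would wait on a task that can never be scheduled.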

Jul 22 '21 14:07 scottfines

test this please

Jul 22 '21 15:07 scottfines

Hi, we encountered deadlocks when using Record Layer Lucene indexes. Is there a plan to finish this PR? Thanks.

Sep 23 '22 07:09 ngbinh

Do you attribute these deadlocks to something like fork-join pool exhaustion? I'd like to confirm that the bugs addressed here will fix your cases.

Sep 23 '22 18:09 MMcM

@MMcM Yes. Under heavy load on records with Lucene indexes enabled, all the workers on the FJ pool are exhausted.

Sep 24 '22 01:09 ngbinh
