
Long-running operations triggered via HTTP are blocking other HTTP requests.

Open maochongxin opened this issue 1 year ago • 1 comments

Describe the bug (required)

When a compact operation is triggered over HTTP, other HTTP requests are blocked. For example, the status and stats requests sent by the Nebula Exporter process are blocked as well, which leads to false alarms.

Your Environments (required)

  • Commit id: master

How To Reproduce (required)

Steps to reproduce the behavior:

Step 1:

curl "http://127.0.0.1:19779/admin?space=test_space&op=compact"

Step 2:

curl http://127.0.0.1:19779/stats

Run this command multiple times; one of the requests will get stuck. How many requests it takes to reproduce depends on the value of FLAGS_ws_threads.

Expected behavior

All stats requests should return successfully.

Additional context

Currently, operations such as compact and flush are performed in the I/O thread, and the IOThreadPoolExecutor does not support work stealing, so requests queued behind them on the same thread can get blocked. The webWorker is defined but never used. I think it would be possible to offload time-consuming tasks such as compact to that thread pool and wait for the results there instead of waiting in the I/O thread.
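The offloading idea can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not Nebula's or folly's actual code: WorkerPool, slowCompact, handleStats and handleAdminCompact are hypothetical names, the pool only plays the role proposed for webWorker, and the compact operation is simulated with a sleep. How the result is eventually written back to the HTTP client is left out.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dedicated worker pool (the role the issue
// proposes for webWorker). This is NOT Nebula's or folly's real API.
class WorkerPool {
 public:
  explicit WorkerPool(std::size_t numThreads) {
    for (std::size_t i = 0; i < numThreads; ++i) {
      workers_.emplace_back([this] { run(); });
    }
  }

  ~WorkerPool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_all();
    for (auto& t : workers_) {
      t.join();
    }
  }

  // Queue a task and return a future for its result; the caller does not wait.
  std::future<std::string> submit(std::function<std::string()> fn) {
    auto task =
        std::make_shared<std::packaged_task<std::string()>>(std::move(fn));
    std::future<std::string> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result;
  }

 private:
  void run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) {
          return;  // stopping was requested and the queue is drained
        }
        job = std::move(tasks_.front());
        tasks_.pop();
      }
      job();  // run outside the lock so other submits are not held up
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool stopping_ = false;
  std::vector<std::thread> workers_;
};

// Simulated long-running admin operation (stands in for compact).
std::string slowCompact(std::chrono::milliseconds duration) {
  std::this_thread::sleep_for(duration);
  return "compact done";
}

// Cheap handler (stands in for /stats); runs directly on the calling thread.
std::string handleStats() { return "stats ok"; }

// I/O-thread side of the admin handler: hand the work off and return at once.
std::future<std::string> handleAdminCompact(WorkerPool& pool,
                                            std::chrono::milliseconds duration) {
  return pool.submit([duration] { return slowCompact(duration); });
}
```

With this shape, the thread that calls handleAdminCompact gets a future back immediately and can go on to serve handleStats while the simulated compact is still running on the pool.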

maochongxin, Jan 25 '24 09:01

@wey-gu Can you take a look at this issue, please?

QingZ11, Feb 01 '24 08:02