electrs icon indicating copy to clipboard operation
electrs copied to clipboard

Async mempool scanning

Open conduition opened this issue 2 years ago • 17 comments

Fixes https://github.com/romanz/electrs/issues/968. Mempool scanning is now done asynchronously in a separate thread called the mempool_sync thread.

Method

My approach was to split the Mempool::sync method into two parts: (1) fetching the updated mempool transactions [slow, read-only] and (2) applying those transactions [fast, write].

  • Mempool scanning is first triggered when the main thread sends a HashSet<Txid> to the mempool_sync thread. This set represents the current set of TXIDs in the mempool which we already know about.
  • The mempool_sync thread proceeds to scan the mempool from the remote node. While the scan is ongoing (could take a long time) the main thread can still process block updates, script notifications, and electrum client requests.
  • Once the mempool scan finishes, the mempool_sync thread blocks until it can send a MempoolSyncUpdate to the main thread.
  • When the main thread has a spare moment to receive this packaged update, it synchronously applies the update to the Mempool, via Rpc::mempool_apply. This adds new transactions, removes outdated ones, and updates the fee histogram.

Advantages

Scanning is asynchronous and read-only, while writing the update to the mempool's in-memory state is synchronous WRT the main-thread. This way, we don't introduce any mutexes because we don't need concurrent read/writes of Mempool. Electrum client requests are always processed with a consistent view of the mempool, even if a scan is ongoing and could finish at any time.

Histogram Perf

While i was in there, I improved the performance of applying mempool updates, by avoiding needless recomputation of the FeeHistogram from scratch every time. Instead we dynamically update fee histogram bins according to which transactions have been added or removed. I added unit tests to demonstrate the logic is sound. If preferable, I can separate the FeeHistogram changes and package it into another PR (although it will probably conflict with this one).

conduition avatar Dec 08 '23 22:12 conduition

Many thanks! Could you please open a separate PR for the first two commits? (You can keep these commits also in this PR, to prevent merge conflicts)

romanz avatar Dec 09 '23 13:12 romanz

Requested changes force pushed, amending the existing commits. @romanz you can view the diffs here. If you feel comfortable with that, i'll submit https://github.com/romanz/electrs/pull/970/commits/9b5f925934de1986c803881cb866f3770a872175 and https://github.com/romanz/electrs/pull/970/commits/ed42680992994488961bbf5aa3028a7a3d4b5dcc as a new PR.

conduition avatar Dec 10 '23 03:12 conduition

If you feel comfortable with that, i'll submit 9b5f925 and ed42680 as a new PR.

Please do :)

romanz avatar Dec 10 '23 18:12 romanz

PR open here, this PR is now dependent on that one

conduition avatar Dec 11 '23 05:12 conduition

Please rebase over master (since #971 is merged).

romanz avatar Dec 11 '23 19:12 romanz

Rebase complete :+1:

conduition avatar Dec 12 '23 04:12 conduition

Seems that the CI failure is not related to https://github.com/romanz/electrs/pull/970/commits/06303d90d87100163f05b1bf2f0a72a7d280eb4f - I'll merge it via https://github.com/romanz/electrs/pull/973

romanz avatar Dec 12 '23 19:12 romanz

Please rebase over latest master.

romanz avatar Dec 12 '23 19:12 romanz

Could you please take a look at the failure? You can reproduce it locally by running:

$ docker build -f Dockerfile.ci . --rm -t electrs:tests
$ docker run -v $PWD/contrib/:/contrib -v $PWD/tests/:/tests --rm electrs:tests bash -x /tests/run.sh
...
+ TXID=20521100abc8df819901ccc5f964059ffc02303c4b421dccbf43322928221a2b
+ echo ' * get_tx_status'
 * get_tx_status
++ electrum --regtest --wallet=data/electrum/wallet get_tx_status 20521100abc8df819901ccc5f964059ffc02303c4b421dccbf43322928221a2b
++ jq -c .
Transaction not in wallet.
+ test '' == '{"confirmations":0}'
...

IIUC, it seems that the broadcasted transaction is not being reported back to Electrum client.

romanz avatar Dec 12 '23 19:12 romanz

You can modify tests/run.sh to debug this issue, e.g.:

diff --git a/tests/run.sh b/tests/run.sh
index e19b4d2..b19d8cd 100755
--- a/tests/run.sh
+++ b/tests/run.sh
@@ -13,6 +13,7 @@ cleanup() {
   	kill $j
   	wait $j
   done
+  cat data/electrs/regtest-debug.log
 }
 trap cleanup SIGINT SIGTERM EXIT
 
@@ -68,6 +69,8 @@ NEW_ADDR=`$EL getunusedaddress`
 echo " * payto & broadcast"
 TXID=$($EL broadcast $($EL payto $NEW_ADDR 123 --fee 0.001 --password=''))
 
+sleep 20
+
 echo " * get_tx_status"
 test "`$EL get_tx_status $TXID | jq -c .`" == '{"confirmations":0}'

romanz avatar Dec 12 '23 19:12 romanz

Rebased :white_check_mark:

I'm now looking into the mempool missing TX error. My guess is that async mempool scanning results in a delayed reaction where electrs only receives the newly broadcasted TX back from bitcoind after the current mempool scan iteration runs once or twice.

Prior to this PR, the synchronous nature of mempool scanning meant that after broadcasting a transaction, the next iteration of the main thread's event loop triggered a mempool scan, and further client requests (like the subsequent TX lookup) weren't processed until the mempool scan was complete, so the request always got the most up-to-date view of the mempool possible after broadcasting.

I see two options to address this:

  1. After a TX broadcast, immediately execute a synchronous mempool poll/update cycle in the main thread (blocking other client requests)
  2. After a TX broadcast, add the new transaction into the Mempool state manually (using Mempool::apply_sync_update).

I think option 2 is the more appropriate choice. I don't see any obvious associated negatives. The mempool sync thread should remove the entry later if the broadcasted TX we add manually is evicted for some reason.

Still waiting for the dockerfile to build for the moment. i'll return tomorrow with an update once I've verified my suspicions

conduition avatar Dec 14 '23 03:12 conduition

@romanz sorry for the delay. My suspicions were correct, and I applied a fix which automatically appends the newly broadcasted TX to the mempool immediately upon broadcast.

conduition avatar Dec 17 '23 22:12 conduition

@conduition Could you please review https://github.com/romanz/electrs/pull/979 which should also help when syncing large mempools?

romanz avatar Dec 20 '23 19:12 romanz

@romanz I rebased this now that https://github.com/romanz/electrs/pull/979 has been merged. I think this PR still has merit, because even if mempool scans are now much faster, there's still no need to block client requests on them. For my use case, where i have bitcoind on a separate machine, this PR is a must-have

conduition avatar Dec 24 '23 07:12 conduition

Putting this briefly into Draft state; i found I was getting spurious logs after a07b70d.

[2023-12-24T18:21:13.575Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.583Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.621Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.675Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:13.682Z DEBUG electrs::p2p] loading 1 blocks
[2023-12-24T18:21:15.677Z DEBUG electrs::status] 1 transactions from 1 blocks
[2023-12-24T18:21:15.677Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:15.684Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:15.759Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed
[2023-12-24T18:21:15.764Z DEBUG electrs::mempool] 82650 mempool txs: 0 added, 0 removed

will fix

conduition avatar Dec 24 '23 18:12 conduition

OK should be fixed now

conduition avatar Dec 25 '23 18:12 conduition

Hey @romanz, just wanted to check in and see if I should take the time to rebase this or?

conduition avatar Jun 20 '24 17:06 conduition