
[$500 Bounty] Slow disk writes

Open coval3nte opened this issue 2 years ago • 20 comments

We’re using electrum to generate addresses on-demand for thousands of customers. We’re a SaaS eCommerce platform (https://sellix.io) and provide our own infrastructure for cryptocurrencies.

We already have an address reusability system (the same address is re-used multiple times when possible); however, our electrum wallet currently counts over 35,000 addresses. Generating a new one takes as much as 20 seconds, whilst on electrum-ltc and electron-bch it takes less than a second with the same number of addresses in the wallet.

We’d like a hand figuring it out and solving it ASAP. Thank you.

coval3nte avatar Jun 09 '22 16:06 coval3nte

Generating a new one takes as much as 20 seconds

How specifically are you generating a new address? What is it that you have timed to take that long, is it the createnewaddress RPC command?

SomberNight avatar Jun 09 '22 16:06 SomberNight

yes, exactly. We have an electrum daemon, and both jsonrpc (http) and the electrum client (I guess it uses jsonrpc too) take 20s.

coval3nte avatar Jun 09 '22 16:06 coval3nte

electrum client (i guess it uses jsonrpc too)

yes, the CLI uses jsonrpc too.

Have you increased the gap limit for this wallet? If so, what value is it set to?

SomberNight avatar Jun 09 '22 16:06 SomberNight

Never increased the gap limit.

coval3nte avatar Jun 09 '22 16:06 coval3nte

Generating a new one takes as much as 20 seconds, whilst on electrum-ltc and electron-bch it takes less than a second with the same number of addresses in the wallet.

electrum-ltc follows us pretty closely. What version of it are you using?

Please enable debug logging, and grep for lines starting with D | util.profiler | WalletDB._write. The number that follows is the time taken in seconds by the call. How long does that take?
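For example, assuming the log format stays `timestamp | level | logger | message` (the sample line below is made up), the timings can be pulled out with a quick pipeline:

```shell
# Hypothetical sample log line; real ones come from the daemon's verbose output.
echo '20220609T120000.000000Z | DEBUG | util.profiler | WalletDB._write 12.3456' \
  | grep 'util.profiler | WalletDB._write' \
  | awk '{print $NF}'   # prints the duration in seconds: 12.3456
```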

SomberNight avatar Jun 09 '22 17:06 SomberNight

20220609T180804.215979Z | DEBUG | util.profiler | WalletDB._write 24.0569

coval3nte avatar Jun 09 '22 18:06 coval3nte

Right... so the root cause seems to be the db writes being slow. This is unfortunately an architectural problem that is hard to fix. The wallet db is backed by a (potentially encrypted) json file. As it is json, if you want any change persisted, the whole file has to be rewritten to disk. For large wallets, this is unsurprisingly very slow. See https://github.com/spesmilo/electrum/issues/4823
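The cost can be illustrated with a standalone sketch (plain `json` module, toy data in a hypothetical shape, not Electrum's real schema): a single small change still forces re-serializing and rewriting the whole document, so each write costs O(wallet size).

```python
import json
import os
import tempfile

# Toy stand-in for a wallet db: one JSON document holding all state.
db = {
    "addresses": {f"addr_{i}": {"index": i} for i in range(20_000)},
    "transactions": {f"txid_{i}": "00" * 250 for i in range(5_000)},
}

path = os.path.join(tempfile.mkdtemp(), "wallet")

def persist_one_change(db, path):
    # Adding ONE address still rewrites the ENTIRE document to disk.
    db["addresses"]["addr_new"] = {"index": len(db["addresses"])}
    with open(path, "w") as f:
        f.write(json.dumps(db))

persist_one_change(db, path)
print(f"bytes rewritten for one new address: {os.path.getsize(path)}")
```

A database with per-record writes (e.g. anything backed by an append log or B-tree) would only touch the changed record; the monolithic-JSON design is what makes the cost proportional to total wallet size.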

As to why the BCH and LTC forks don't exhibit the behaviour... they should in theory suffer from the same fundamental issue, so I am unsure. One thing that comes to mind is that we added an extra write-to-disk call to wallet.set_up_to_date() around version 4.0. This gets called every time the wallet finishes syncing, which in your case gets triggered soon after every createnewaddress call. https://github.com/spesmilo/electrum/blob/839db6ee9c696a9cc5157bf225e750a124c4cdbb/electrum/wallet.py#L382-L384

Previously we used to not do this, and would only persist the wallet file when the wallet was closed gracefully. This could mean losing state, although if it's only HD addresses they would likely be regenerated next time (+gap limit shenanigans). So with that in mind, I guess you could experiment with removing that line (i.e. the call to self.save_db() inside wallet.set_up_to_date).

However, if you are using a recent version of Electrum-LTC, they should have the same code, in which case I don't know. So again, please state exact version.


The largest wallet I have to test with has 250k addresses, with a file size of ~531 MB. WalletDB._write takes ~26 seconds with that.

>>> len(wallet.get_addresses())
250657
>>> len(wallet.db.transactions)
140985

You said yours has 35k addresses and takes ~24 seconds, which is weird as I would expect linear scaling. How large is your wallet file on disk?

SomberNight avatar Jun 10 '22 19:06 SomberNight

btw, do you have wallet file encryption enabled? If you use the CLI, this is the encrypt_file option for the password command. That has a huge effect on the wallet file size -- although maybe not so much on the db write time. My numbers are for an encrypted wallet file.
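For intuition on why encryption changes the file size at all: to my understanding the encrypted storage format compresses the JSON before encrypting (worth verifying in electrum/storage.py), and wallet JSON, being dominated by repetitive hex strings and keys, deflates very well. A standalone sketch with toy data:

```python
import json
import zlib

# Toy wallet-like JSON (hypothetical shape): lots of repetitive hex data.
raw = json.dumps({
    "transactions": {f"txid_{i}": "ab" * 300 for i in range(5_000)},
}).encode()

compressed = zlib.compress(raw)
# For data like this the ratio is far below 1.0.
print(f"compression ratio: {len(compressed) / len(raw):.3f}")
```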

SomberNight avatar Jun 10 '22 19:06 SomberNight

So with that in mind, I guess you could experiment with removing that line (i.e. the call to self.save_db() inside wallet.set_up_to_date)

Another trick you could do, is open the wallet with --offline, generate a few thousand addresses, and then close it, and reopen it normally. When you are offline, set_up_to_date is not used, as it is meaningless, so this process would only result in a single db write, when the wallet is closed.

SomberNight avatar Jun 10 '22 19:06 SomberNight

What about making the wallet save function async? I think that without saving the wallet file we would lose incoming transactions etc... Moreover, we can't generate addresses offline, because electrum's daemon starts online automatically to receive txs etc.

coval3nte avatar Jun 11 '22 13:06 coval3nte

I think that without saving wallet file we would lose incoming transaction etc...

Barring gap limit issues, on-chain state cannot really be lost. (as you would just resync the same state from the electrum server the next time)

Moreover, we can't generate addresses offline because electrum's daemon starts online automatically to receive txs etc etc.

You can start the daemon in offline mode (which is not well supported, the flag mainly exists for the GUI and for no-daemon CLI commands), as follows:

$ ./run_electrum --testnet daemon --offline -v
$ ./run_electrum --testnet load_wallet -w ~/.electrum/testnet/wallets/9dk
$ ./run_electrum --testnet createnewaddress -w ~/.electrum/testnet/wallets/9dk

I've just tested and this works. You can batch address-generation this way.

what about doing async the wallet save function?

hmm.. I think that might make some things much harder to reason about. :/

SomberNight avatar Jun 11 '22 22:06 SomberNight

we use createnewaddress as a jsonrpc call... can we use two daemons that use the same wallet (online/offline) without running into any issues?

coval3nte avatar Jun 11 '22 22:06 coval3nte

we use createnewaddress as a jsonrpc call...

Just because my example is not using jsonrpc, do not presume it would not work like that :P It should.

can we use two daemons that use the same wallet (online/offline) without running into any issues?

Do not open the same wallet file in multiple processes simultaneously. But it is safe to have the same logical wallet (seed/xpub/etc) open in multiple processes simultaneously, each process handling a separate wallet file. So: same HD keys OK, same file NOT OK.

having two daemons, one offline, one online, with two wallet files (same seed) and using the offline to generate addresses is very similar to what I've suggested with batched pre-generation of addresses. It should work.

SomberNight avatar Jun 11 '22 22:06 SomberNight

we use extended private key importing (it allows us to generate the 3 address versions)... so would importing the same private key into another electrum daemon work as a fix? Would the online electrum daemon recognize the incoming transactions/inputs? Moreover, the latest electrum tar.gz (from the downloads page on the electrum website) doesn't have the protobuf requirement; this was fixed in a recent commit that isn't included in the release, and it causes the application to fail.

coval3nte avatar Jun 11 '22 22:06 coval3nte

Would the online electrum daemon recognize the incoming transactions/inputs?

Barring gap limit issues, yes. That is, if the offline daemon is generating addresses faster than they are getting used, the online daemon will fall behind and if a new tx arrives beyond the gap limit of the online daemon that tx will not be seen. It will get discovered once the gap is rolled forward (assuming the preceding addresses become used). Not sure how much I need to explain this -- are you familiar with the gap limit concept?
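The gap-limit scan described above can be sketched roughly like this (a simplified model with hypothetical names, not Electrum's actual code):

```python
def last_derived_index(used_indices, gap_limit=20):
    """Derive addresses until `gap_limit` consecutive unused ones are seen.

    `used_indices` is the set of derivation indices that have received funds.
    Returns the index up to which a syncing wallet derives (exclusive).
    """
    i, unused_run = 0, 0
    while unused_run < gap_limit:
        unused_run = 0 if i in used_indices else unused_run + 1
        i += 1
    return i

# With addresses 0..9 used and a gap limit of 20, the wallet derives
# indices 0..29. A payment arriving at index 35 stays invisible until
# enough preceding addresses become used and the gap rolls forward.
print(last_derived_index(set(range(10))))  # → 30
```

This is why an offline daemon that races ahead with createnewaddress can outpace an online daemon: the online one only rolls its gap forward as addresses actually get used.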

SomberNight avatar Jun 11 '22 23:06 SomberNight

Moreover, the latest electrum tar.gz (from the downloads page on the electrum website) doesn't have the protobuf requirement; this was fixed in a recent commit that isn't included in the release, and it causes the application to fail.

Indeed the latest release does not have that commit. Anyway, that's a separate issue. (https://github.com/spesmilo/electrum/issues/7833)

can we use two daemons that use the same wallet (online/offline) without running into any issues?

having two daemons, one offline, one online, with two wallet files (same seed) and using the offline to generate addresses is very similar to what I've suggested with batched pre-generation of addresses. It should work.

so would importing the same private key into another electrum daemon work as a fix? Would the online electrum daemon recognize the incoming transactions/inputs?

Ah wait, I am wrong actually. I mean, the two-daemon approach works as a mode of operation, but it does not solve the performance issue. The online daemon would still end up generating the addresses for its own wallet file, except it would do that automatically as new transactions are discovered. Every time it did it, you would see the same slowness.

The offline address pre-generation into the same wallet file would work though.

SomberNight avatar Jun 11 '22 23:06 SomberNight

But in fact even if you pre-generate the addresses, when a new tx arrives, momentarily the wallet sync status can become not up_to_date, in which case after the sync is done, set_up_to_date gets called, and the db write executes... Basically, the issue is not address generation being slow.

SomberNight avatar Jun 11 '22 23:06 SomberNight

How do I generate a sufficiently large wallet file? Generating 100,000 addresses with createnewaddress yields a ~9 MB file, which is handled quite fast by the client. In other words, how do I reproduce this bug?

ValdikSS avatar Jul 04 '22 15:07 ValdikSS

We have a large wallet file with tons of txs apart from the generated addresses (currently it's 3 GB).

coval3nte avatar Jul 04 '22 15:07 coval3nte

How do I generate sufficiently large wallet file?

I have a testnet wallet with master pubkey:

vpub5VfkVzoT7qgd5gUKjxgGE2oMJU4zKSktusfLx2NaQCTfSeeSY3S723qXKUZZaJzaF6YaF8nwQgbMTWx54Ugkf4NZvSxdzicENHoLJh96EKg

though this wallet is not that large (Qt Console):

>>> len(wallet.get_addresses())
10536
>>> len(wallet.db.transactions)
11012
>>> import os
>>> os.path.getsize(wallet.storage.path) / 1024**2
32.91964912414551

but you can e.g. set long labels for each tx to make it large:

>>> import os
>>> prng = electrum.coinchooser.PRNG(os.urandom(32))
>>> [wallet.set_label(txid, prng.get_bytes(50000).hex()) for txid in wallet.db.transactions.keys()]

SomberNight avatar Jul 04 '22 16:07 SomberNight

@coval3nte, not addressing your issue directly, but out of curiosity, may I ask why you are using so many addresses? Is it for a single address per order? If so, have you considered any alternative approaches? If you have, which ones, and what were their pros/cons?

meglio avatar Mar 19 '23 16:03 meglio

no, we rotate addresses across shops. The issue is from a bunch of months ago; during this time we've had the chance to see things from a different perspective.

  1. Rather than address generation, it's the loading/"scraping" of utxos which is slow [payto, also when specifying the utxos].
  2. We have noticed that as the daemon runs for days or hours, the endpoints [such as payto, listunspent] become slower [>360s for a jsonRPC request] as time passes. After a restart the walletdb read takes 7s max, whilst electrumx or its Rust alternative takes 5s max.

coval3nte avatar Mar 23 '23 17:03 coval3nte

What do you mean by "the walletdb read"? Are you comparing the time to run listunspent in both cases? in other words, is it faster after a restart?

ecdsa avatar Mar 23 '23 18:03 ecdsa

yes, it's significantly faster after a restart. Both listunspent and payto suffer from this. By "walletdb read" I'm referring to WalletDB._load_transactions (or some other function inside this class), which takes roughly constant time. So the issue lies somewhere between this and the RPC call to electrumx.

coval3nte avatar Mar 23 '23 21:03 coval3nte

listunspent and payto are not RPC calls to electrumx

ecdsa avatar Mar 24 '23 04:03 ecdsa

@ecdsa then there's something else which slows down the function; -vDEBUG doesn't provide more information apart from the walletdb line and, after that, the resulting json...

coval3nte avatar Mar 24 '23 14:03 coval3nte

Maybe a cluttered cache state (or no cache) of the parsed JSON wallet file? Or something that causes its full re-parse on every listunspent?

meglio avatar Mar 25 '23 04:03 meglio

I don't know electrum internals in depth, but I can imagine the problem isn't the cache itself, given that after a restart things get considerably better. Maybe for some reason the cache gets invalidated and never rebuilt, and a restart resets this?

coval3nte avatar Mar 25 '23 19:03 coval3nte

Do you maybe have rough instructions how to reproduce? Or just a description of what you are doing to the wallet where it happens, e.g. how long it is open, which commands you are calling and how, and how many times, etc.

the endpoint [such as payto, listunspent] becomes slower [360s> for a jsonRPC request] as time passes

Do you mean to say that e.g. the payto command takes 6 minutes to complete? How long does it take right after a restart?

SomberNight avatar Mar 27 '23 14:03 SomberNight

brief description:

  • the wallet runs for approx 1 day
  • both listunspent and payto are affected (no info about broadcast); the wallet's load-transactions step is 8 sec, something else takes more
  • the commands which I/cronjobs call are listunspent, createnewaddress, broadcast, payto and estimatefee. I don't know precisely how many times per day
  • a restart greatly decreases RPC execution times, from minutes to seconds
  • the issue happens when I want to send funds from the wallet

I don't know precisely what electrum [4.3.4] does under the hood when calling RPCs, but it's something shared by both listunspent and payto. Moreover, I've checked both electrumX and Fulcrum metrics; request times are good enough that they are not the reason for the issue.

coval3nte avatar Apr 03 '23 13:04 coval3nte