Memory leak?

Open IntinteDAO opened this issue 3 years ago • 13 comments

Debian Testing (updated today). The newest version of Elements: 0.18.1.9

I am not an expert on C++, but I think elementsd has a problem: it keeps consuming more and more swap and RAM.

Here is data from a synchronised Liquid node. I set up 2 GB of ZRAM and Liquid has taken virtually all of it for itself. Below it are java (an RSK node) and bitcoind, next to which Elements is really huge!

It seems to me that Elements shouldn't take so many resources for itself.

elementsd 1787508 kB
java 143848 kB
bitcoind 140728 kB
unattended-upgr 6124 kB
tor 4748 kB
fail2ban-server 3464 kB
for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | less
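
For anyone who prefers a script to the one-liner above, here is a rough Python sketch of the same idea: read each process's /proc/<pid>/status, pull out the Name and VmSwap fields, and sort by swap usage. The function names are made up for illustration, and it assumes a Linux /proc layout.

```python
import glob


def parse_status(text):
    """Return (name, swap_kb) from the contents of a /proc/<pid>/status file."""
    name, swap_kb = None, 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmSwap:"):
            # The field looks like "VmSwap:   1787508 kB".
            swap_kb = int(line.split()[1])
    return name, swap_kb


def swap_by_process():
    """Collect (name, swap_kb) for every readable process, largest first."""
    rows = []
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path) as fh:
                rows.append(parse_status(fh.read()))
        except OSError:
            continue  # the process exited or is not readable
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, swap_kb in swap_by_process():
        if swap_kb:
            print(f"{name} {swap_kb} kB")
```

Kernel threads have no VmSwap line, so they are reported as 0 and skipped when printing.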

IntinteDAO avatar Dec 24 '20 21:12 IntinteDAO

Can you describe on a high level what you're doing with your node? Lots of RPC calls? Creating new transactions?

There are definitely memory leaks in the RPC interface, which I fixed in #935 (which includes, among other things, much more aggressive CI tools). But my impression was that they were pretty small and unlikely to add up for most use cases.

Elementsd also uses a fair bit more resident memory than Bitcoind -- about 100 MB extra by default, if I remember right. But that wouldn't explain the numbers you're seeing!

If this is an urgent problem for you I can backport my fixes, since #935 will be under review for a few more weeks at least.

apoelstra avatar Dec 24 '20 23:12 apoelstra

I simply started the Elements node (to synchronize) with transaction verification via the Bitcoin Core RPC. I synchronized the node myself just today.

Do I need to build the master branch of Elements?

Config:

mainchainrpcuser=<removed>
mainchainrpcpassword=<removed>
daemon=1

IntinteDAO avatar Dec 24 '20 23:12 IntinteDAO

Which version of Elements are you running? This is a surprising amount of memory usage after less than a day.

apoelstra avatar Dec 24 '20 23:12 apoelstra

You shouldn't need to compile a new version of Elements, but if you can run the binary with daemon=0 inside valgrind --tool=massif that would be super helpful.

apoelstra avatar Dec 24 '20 23:12 apoelstra

Elements Core RPC client version elements-0.18.1.9

https://pastebin.com/pf0sffgq

It is very strange to me that it has started downloading blocks, because it has been synchronised and running as a daemon for a few hours now. Maybe there is a bug in this function?

A few minutes (~5) of elementsd running: https://pastebin.com/sUguejua

IntinteDAO avatar Dec 25 '20 00:12 IntinteDAO

Longer time:

massif.out.11199.txt

IntinteDAO avatar Dec 25 '20 00:12 IntinteDAO

Ah! So, Elements has a very weird initial block download pattern because its block headers are significantly more expensive to validate than bitcoin blocks (11 sig checks vs a couple hashes, a roughly 1000x difference). When it is initially syncing it downloads and validates all the headers, and shows no progress even in the logs (since it is a fork of Bitcoin for which this is a super fast process). Then it finishes, starts downloading blocks, and you suddenly get a slew of UpdateTip log messages, hours after the sync actually started.

So that explains the weird "why is it all of a sudden downloading blocks" behavior, but the memory usage is still a problem.

apoelstra avatar Dec 25 '20 02:12 apoelstra

My guess, by the way, is that this isn't actually a memory leak, but us just incorrectly enforcing memory limits (allocating extra Elements-specific things and then not scaling our limits down correspondingly).

apoelstra avatar Dec 25 '20 02:12 apoelstra

To reduce some of the block header overhead, it might not be a bad idea to bring back checkpoints for Liquid. It's a federated chain anyway.

Or we could make it a bit more advanced. Since a trusted signature on a block automatically approves all earlier blocks, you'd only have to verify the last signature. And with dynafed, you'd only have to verify the blocks where the consensus parameters change. I.e. the last block of each consensus set.

stevenroose avatar Dec 25 '20 13:12 stevenroose

I agree, I think during IBD we should (by default) validate only every 2000th block header, and dynafed transition blocks, or something like that.

In principle we could get away without validating any blocks except the last (and dynafed transitions), but it's useful for checksumming/sanity checking purposes to still validate a lot of them.
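
To make the idea concrete, here is a minimal Python sketch of picking which header heights to fully validate during IBD. It is not Elements code: the function name, the choice to include height 0, and the assumption that the dynafed transition heights are already known as a list are all made up for illustration. The stride of 2000 is the figure suggested above.

```python
def heights_to_validate(tip_height, transition_heights, stride=2000):
    """Return the sorted header heights whose signatures would be checked.

    Selected heights are: every `stride`-th height (starting at 0, an
    assumption), every known dynafed transition height, and the tip.
    """
    selected = set(range(0, tip_height + 1, stride))
    # Transition heights are assumed to be supplied by the caller.
    selected.update(h for h in transition_heights if 0 <= h <= tip_height)
    # The last header is always validated, since it vouches for the chain.
    selected.add(tip_height)
    return sorted(selected)


if __name__ == "__main__":
    print(heights_to_validate(5000, [1234]))
```

With a tip at height 5000 and one transition at height 1234, only five headers would be signature-checked instead of 5001.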

apoelstra avatar Dec 25 '20 15:12 apoelstra

If you have any questions, please let me know, because I don't want to have to kill elementsd and restart it every 10 minutes...

I have set the swap to 4 GB, but that does not change much.

IntinteDAO avatar Dec 27 '20 00:12 IntinteDAO

I reinstalled the system and gave it 8 GB of swap, and it no longer clogs up the RAM at all.

IntinteDAO avatar Jan 06 '21 12:01 IntinteDAO

After doing some investigation I think we do actually have an unreasonable memory spike during startup, which on my system hits 5-6 GB. The reason is that when loading the block index from disk, we de/serialize entire block headers. See this loop.

On Bitcoin these are 80 bytes every ten minutes, and the loop is written with that in mind (so e.g. all 700k bitcoin block headers are a bit over 50MB) but on Elements our blocks are 15 times the size and ten times the frequency. We could probably improve this.
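
As a back-of-the-envelope check of those numbers, here is a small Python sketch. The 80-byte header, the 700k header count and the 15x / 10x factors come from the paragraph above; treating them as exact, and the variable names, are assumptions for illustration only.

```python
BITCOIN_HEADER_BYTES = 80
BITCOIN_HEADERS = 700_000
BITCOIN_HEADERS_PER_DAY = 24 * 6          # one block every ten minutes

SIZE_FACTOR = 15                          # rough figure quoted above
FREQUENCY_FACTOR = 10                     # rough figure quoted above

# All Bitcoin headers together: 700k * 80 bytes.
bitcoin_total_bytes = BITCOIN_HEADERS * BITCOIN_HEADER_BYTES

# Header data produced per day on each chain.
bitcoin_bytes_per_day = BITCOIN_HEADERS_PER_DAY * BITCOIN_HEADER_BYTES
elements_bytes_per_day = (
    BITCOIN_HEADERS_PER_DAY * FREQUENCY_FACTOR
    * BITCOIN_HEADER_BYTES * SIZE_FACTOR
)

growth_ratio = elements_bytes_per_day // bitcoin_bytes_per_day

if __name__ == "__main__":
    print(f"Bitcoin headers total: {bitcoin_total_bytes / 1e6:.0f} MB")
    print(f"Bitcoin per day:  {bitcoin_bytes_per_day} bytes")
    print(f"Elements per day: {elements_bytes_per_day} bytes")
    print(f"Growth ratio: {growth_ratio}x")
```

So under these rough factors, header data accumulates about 150 times faster per unit of wall-clock time than on Bitcoin, which is why a loop sized for 80-byte headers hurts much more here.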

Elements also uses more memory than Bitcoin while sitting idle, because its transactions are much larger (even ignoring the extra proofs, which are witness data) thanks to CT. Improving this would be harder and less valuable IMO.

apoelstra avatar Sep 27 '21 14:09 apoelstra