electrumx icon indicating copy to clipboard operation
electrumx copied to clipboard

electrumx memory leak

Open github12101 opened this issue 5 years ago • 19 comments

Another memory leak has been found. Screenshot at 2020-11-03 01-20-45

I am using older, stable version on electrumx, since problems with memory leak last time, I settled on older, stable version and stayed there.

commit 835aa492e9a3563d89d99c35168378d65321b849 (HEAD, tag: 1.15.0)
Author: Neil Booth <[email protected]>
Date:   Wed May 27 11:23:38 2020 +0100

    Prepare 1.15

This memory leak is a lot slower. Process has reached 5 GB RAM after 6 days:

 electrumx_rpc getinfo
{
    "coin": "BitcoinSegwit",
    "daemon": "localhost:8332/",
    "daemon height": 655159,
    "db height": 655159,
    "db_flush_count": 1648,
    "groups": 3115,
    "history cache": "3,243,657 lookups 536,633 hits 1,000 entries",
    "merkle cache": "22,953 lookups 10,313 hits 1,000 entries",
    "peers": {
        "bad": 6,
        "good": 71,
        "never": 12,
        "stale": 14,
        "total": 103
    },
    "pid": 501797,
    "request counts": {
        "blockchain.block.header": 70370,
        "blockchain.block.headers": 2660,
        "blockchain.estimatefee": 23735419,
        "blockchain.headers.subscribe": 135572,
        "blockchain.relayfee": 24953,
        "blockchain.scripthash.get_balance": 283748,
        "blockchain.scripthash.get_history": 155753,
        "blockchain.scripthash.listunspent": 165276,
        "blockchain.scripthash.subscribe": 2920814,
        "blockchain.transaction.broadcast": 1511,
        "blockchain.transaction.get": 43903,
        "blockchain.transaction.get_merkle": 23266,
        "getinfo": 41,
        "mempool.get_fee_histogram": 1897426,
        "server.add_peer": 1146,
        "server.banner": 24969,
        "server.donation_address": 24962,
        "server.features": 10299,
        "server.peers.subscribe": 78146,
        "server.ping": 1464476,
        "server.version": 3696202
    },
    "request total": 34760912,
    "sessions": {
        "count": 931,
        "count with subs": 248,
        "errors": 21,
        "logged": 0,
        "pending requests": 37,
        "subs": 114081
    },
    "tx hashes cache": "23,266 lookups 9,745 hits 1,000 entries",
    "txs sent": 1400,
    "uptime": "6d 12h 41m",
    "version": "ElectrumX 1.15.0"
}

Electrum is running perfectly fine, serving a ton of clients, but memory is slowly creeping, 788 MB per day.

OS: Debian GNU/Linux 10 (buster) x86_64 Kernel: 5.8.0-0.bpo.2-amd64

$ python3 --version
Python 3.7.3
$ pip list |grep aio
aiohttp             3.7.1
aiorpcX             0.18.4

I'd appreciate any help debugging this. What should I upgrade/downgrade first? Waiting for suggestions. I want to make accurate and worthwile test runs, as each test takes few days.

github12101 avatar Nov 03 '20 01:11 github12101

So this is on the 1.15 tag (835aa492e9a3563d89d99c35168378d65321b849) now? Then it's been there for multiple months -- or did it only start happening recently?

You can try running something like this (just apply the diff): https://github.com/spesmilo/electrumx/commit/53fbeb7241c8466645153890f719ab45ed1624ed It just periodically logs most common object types and how many there are of them in memory. For the previous leak, this rudimentary technique was enough to find it. Anyway, it's a good start; see how the numbers change over time.

SomberNight avatar Nov 03 '20 01:11 SomberNight

So this is on the 1.15 tag (835aa49) now? Then it's been there for multiple months -- or did it only start happening recently?

Not necessarily this is electrumx problem - could be aio packages this time. Previously I definitely had some other random versions of aio packages. I also had some other electrumx versions too - in the past, I just did git fetch from time to time and installed it, same with pip packages, just updated it from time to time when I feel like it. Today, I have versions as I quoted above, electrumx on 1.15 tag and aiohttp=3.7.1, aiorpcX=0.18.4 versions. Very slow memory leak is present. Could be something else than you fixed last time.

So if you want me to keep aio untouched and move to https://github.com/spesmilo/electrumx/commit/53fbeb7241c8466645153890f719ab45ed1624ed, I will do that in the morning.

github12101 avatar Nov 03 '20 02:11 github12101

$ git checkout 53fbeb7241c8466645153890f719ab45ed1624ed
fatal: reference is not a tree: 53fbeb7241c8466645153890f719ab45ed1624ed

It's not in any branch or repository. How can I checkout to it, or apply diff?

github12101 avatar Nov 03 '20 12:11 github12101

It's in my personal fork https://github.com/sombernight/electrumx/commit/53fbeb7241c8466645153890f719ab45ed1624ed

SomberNight avatar Nov 03 '20 13:11 SomberNight

Been running your version since yesterday. I am on:

commit 53fbeb7241c8466645153890f719ab45ed1624ed (HEAD, origin/202010_memleak)
Author: SomberNight <[email protected]>
Date:   Tue Nov 3 02:48:08 2020 +0100

    memleak
$ electrumx_rpc getinfo
{
    "coin": "BitcoinSegwit",
    "daemon": "localhost:8332/",
    "daemon height": 655368,
    "db height": 655368,
    "db_flush_count": 1856,
    "groups": 3081,
    "history cache": "739,047 lookups 74,973 hits 1,000 entries",
    "merkle cache": "3,463 lookups 1,024 hits 1,000 entries",
    "peers": {
        "bad": 4,
        "good": 70,
        "never": 4,
        "stale": 10,
        "total": 88
    },
    "pid": 1622244,
    "request counts": {
        "blockchain.block.header": 10318,
        "blockchain.block.headers": 523,
        "blockchain.estimatefee": 2471602,
        "blockchain.headers.subscribe": 18014,
        "blockchain.relayfee": 3976,
        "blockchain.scripthash.get_balance": 41628,
        "blockchain.scripthash.get_history": 20716,
        "blockchain.scripthash.listunspent": 3,
        "blockchain.scripthash.subscribe": 704668,
        "blockchain.transaction.broadcast": 233,
        "blockchain.transaction.get": 3064,
        "blockchain.transaction.get_merkle": 3501,
        "getinfo": 6,
        "mempool.get_fee_histogram": 143873,
        "server.add_peer": 136,
        "server.banner": 3976,
        "server.donation_address": 3976,
        "server.features": 1359,
        "server.peers.subscribe": 5335,
        "server.ping": 152758,
        "server.version": 618149
    },
    "request total": 4207814,
    "sessions": {
        "count": 982,
        "count with subs": 155,
        "errors": 10,
        "logged": 0,
        "pending requests": 5,
        "subs": 26301
    },
    "tx hashes cache": "3,501 lookups 859 hits 1,000 entries",
    "txs sent": 217,
    "uptime": "19h 17m 03s",
    "version": "ElectrumX 1.15.0"
}
![Screenshot at 2020-11-04 10-45-35](https://user-images.githubusercontent.com/1646393/98102022-f3d3db80-1e8a-11eb-8b6b-4dad95c9fc7c.png)

It's running perfectly fine, sessions at the limit (1000), but memory creeps up. 3.76GB RAM in 19 hours.

github12101 avatar Nov 04 '20 10:11 github12101

My variables - just in case you found something interesting:

export COIN=BitcoinSegwit
export DAEMON_URL=http://xxx:xxx@localhost:8332
export NET=mainnet
export DB_ENGINE=rocksdb
export CACHE_MB=1200
export DB_DIRECTORY=$datadir
export SSL_CERTFILE=$datadir/fullchain.pem
export SSL_KEYFILE=$datadir/privkey.pem
export LOG_SESSIONS=300
export LOG_LEVEL=INFO
export LOG_FORMAT="%(asctime)s:%(levelname)s:%(name)s:%(message)s"
export BANDWIDTH_UNIT_COST=5000
export REQUEST_TIMEOUT=30
export DROP_CLIENT="3\.[0-4]\.\d+"
export DROP_CLIENT_UNKNOWN=YES
export SERVICES=ssl://:50002,rpc://:8000
export REPORT_SERVICES=ssl://(mydomainname):50002
export EVENT_LOOP_POLICY=uvloop
export MAX_SESSIONS=1000
ulimit -n 10000

And aio versions:

$ pip list |grep aio
aiohttp             3.7.1
aiorpcX             0.18.4

github12101 avatar Nov 04 '20 11:11 github12101

Could you grep the logs for hellothere?

... | grep "hellothere" -A 20

SomberNight avatar Nov 04 '20 12:11 SomberNight

result.txt

github12101 avatar Nov 04 '20 12:11 github12101

Memory still grows, 4.24GB in 1d 22 hours. So that's bigger leak than tag 1.15.

Screenshot at 2020-11-05 13-45-32

$ electrumx_rpc getinfo
{
    "coin": "BitcoinSegwit",
    "daemon": "localhost:8332/",
    "daemon height": 655536,
    "db height": 655536,
    "db_flush_count": 2023,
    "groups": 1139,
    "history cache": "1,629,378 lookups 211,669 hits 1,000 entries",
    "merkle cache": "9,877 lookups 6,001 hits 1,000 entries",
    "peers": {
        "bad": 7,
        "good": 72,
        "never": 7,
        "stale": 13,
        "total": 99
    },
    "pid": 1622244,
    "request counts": {
        "blockchain.block.header": 27423,
        "blockchain.block.headers": 998,
        "blockchain.estimatefee": 6992991,
        "blockchain.headers.subscribe": 39519,
        "blockchain.relayfee": 9827,
        "blockchain.scripthash.get_balance": 131045,
        "blockchain.scripthash.get_history": 53801,
        "blockchain.scripthash.listunspent": 15846,
        "blockchain.scripthash.subscribe": 1518702,
        "blockchain.transaction.broadcast": 533,
        "blockchain.transaction.get": 11753,
        "blockchain.transaction.get_merkle": 9966,
        "getinfo": 13,
        "mempool.get_fee_histogram": 467051,
        "server.add_peer": 334,
        "server.banner": 9830,
        "server.donation_address": 10046,
        "server.features": 3166,
        "server.peers.subscribe": 12993,
        "server.ping": 432234,
        "server.version": 1435547
    },
    "request total": 11183618,
    "sessions": {
        "count": 956,
        "count with subs": 329,
        "errors": 62,
        "logged": 0,
        "pending requests": 405,
        "subs": 122620
    },
    "tx hashes cache": "9,966 lookups 5,037 hits 1,000 entries",
    "txs sent": 510,
    "uptime": "1d 22h 13m",
    "version": "ElectrumX 1.15.0"
}

github12101 avatar Nov 05 '20 13:11 github12101

5.02 GB in 3d 04h. 1623MB per day. Screenshot at 2020-11-06 19-33-22

Any suggestions? Can someone reproduce this? Should I downgrade some pip packages, or is it in electrumx code?

github12101 avatar Nov 06 '20 19:11 github12101

I have 3316 MB RES usage on mainnet, after 10 days of uptime (running ~master. self-note: https://github.com/spesmilo/electrumx/commits/1d7b38d35bf475e9651ae470ff8d94d76f70375d) So no, not sure there is a leak here.

Maybe try without uvloop?

SomberNight avatar Nov 07 '20 04:11 SomberNight

but before you restart, could you grep the logs for hellothere again?

SomberNight avatar Nov 07 '20 04:11 SomberNight

result.txt

I stopped it last night, as over 5GB was too much. There you have grep results from 3d 04h uptime.

github12101 avatar Nov 07 '20 10:11 github12101

Restarted without uvloop:

2020-11-07 10:48:56,169:INFO:electrumx:ElectrumX server starting 2020-11-07 10:48:56,170:INFO:electrumx:logging level: INFO 2020-11-07 10:48:56,170:INFO:Controller:Python version: 3.7.3 (default, Jul 25 2020, 13:03:44) [GCC 8.3.0] 2020-11-07 10:48:56,175:INFO:Controller:software version: ElectrumX 1.15.0 2020-11-07 10:48:56,176:INFO:Controller:aiorpcX version: 0.18.4 2020-11-07 10:48:56,176:INFO:Controller:supported protocol versions: 1.4-1.4.2 2020-11-07 10:48:56,176:INFO:Controller:event loop policy: None 2020-11-07 10:48:56,176:INFO:Controller:reorg limit is 200 blocks 2020-11-07 10:48:56,176:INFO:Daemon:daemon #1 at localhost:8332/ (current) 2020-11-07 10:48:56,177:INFO:DB:switching current directory to /home/nv01/.electrumx 2020-11-07 10:48:56,184:INFO:DB:using rocksdb for DB backend 2020-11-07 10:48:57,277:INFO:DB:opened UTXO DB (for sync: True) 2020-11-07 10:48:57,277:INFO:DB:UTXO DB version: 8 2020-11-07 10:48:57,277:INFO:DB:coin: BitcoinSegwit 2020-11-07 10:48:57,277:INFO:DB:network: mainnet 2020-11-07 10:48:57,277:INFO:DB:height: 655,817 2020-11-07 10:48:57,277:INFO:DB:tip: 0000000000000000000a303c9cf1eb078e114afb651de1f65476c5fca7839a8b 2020-11-07 10:48:57,277:INFO:DB:tx count: 584,617,498 2020-11-07 10:48:57,277:INFO:DB:flushing DB cache at 1,200 MB 2020-11-07 10:48:58,549:INFO:History:history DB version: 1 2020-11-07 10:48:58,549:INFO:History:flush count: 2,302 2020-11-07 10:48:58,601:INFO:BlockProcessor:caught up to height 655817 2020-11-07 10:48:58,602:INFO:Prefetcher:caught up to daemon height 655,817 2020-11-07 10:48:58,602:INFO:SessionManager:RPC server listening on localhost:8000 2020-11-07 10:48:58,602:INFO:DB:closing DBs to re-open for serving

github12101 avatar Nov 07 '20 10:11 github12101

Result:

ElectrumX stats:
Servers: 82 
Connections: 955 
Uptime: 6d:07h:11m 
Process memory: 4555 MB

And growing everyday, but slower than before, 724 MB per day.

github12101 avatar Nov 13 '20 19:11 github12101

result.txt

github12101 avatar Nov 13 '20 19:11 github12101

Same issue, tried with and without uvloop. Lowered MAX_SESSIONS from 1500 to 800. Still no change. Heading for another restart soon.

Name            : electrumx-git
Version         : 1.15.0.r1947.9ca59fa-1

Linux bitcoin 5.8.10-arch1-1 #1 SMP PREEMPT Thu, 17 Sep 2020 18:01:06 +0000 x86_64 GNU/Linux
{
    "coin": "BitcoinSegwit",
    "daemon": "127.0.0.1:8332/",
    "daemon height": 656813,
    "db height": 656813,
    "db_flush_count": 37618,
    "groups": 1880,
    "history cache": "2,466,351 lookups 966,760 hits 1,000 entries",
    "merkle cache": "9,952 lookups 4,310 hits 1,000 entries",
    "peers": {
        "bad": 7,
        "good": 92,
        "never": 5,
        "stale": 5,
        "total": 109
    },
    "pid": 1975409,
    "request counts": {
        "blockchain.block.header": 221177,
        "blockchain.block.headers": 7117,
        "blockchain.estimatefee": 8735638,
        "blockchain.headers.subscribe": 289418,
        "blockchain.relayfee": 36802,
        "blockchain.scripthash.get_balance": 5584,
        "blockchain.scripthash.get_history": 8079,
        "blockchain.scripthash.get_mempool": 29,
        "blockchain.scripthash.listunspent": 1797,
        "blockchain.scripthash.subscribe": 2453649,
        "blockchain.transaction.broadcast": 677,
        "blockchain.transaction.get": 19988,
        "blockchain.transaction.get_merkle": 10333,
        "getinfo": 73511,
        "invalid method": 20,
        "mempool.get_fee_histogram": 95931,
        "peers": 2050,
        "server.add_peer": 1092,
        "server.banner": 36836,
        "server.donation_address": 38373,
        "server.features": 10978,
        "server.peers.subscribe": 6769476,
        "server.ping": 505073,
        "server.version": 5733907,
        "sessions": 6502
    },
    "request total": 25064037,
    "sessions": {
        "count": 509,
        "count with subs": 14,
        "errors": 92,
        "logged": 0,
        "pending requests": 49,
        "subs": 588
    },
    "tx hashes cache": "10,333 lookups 4,320 hits 1,000 entries",
    "txs sent": 543,
    "uptime": "5d 19h 51m",
    "version": "ElectrumX 1.15.0"
} 

Mainnet image image

I have a second server running testnet with same conf with no issues. Testnet image image

mercurytoxic avatar Nov 14 '20 00:11 mercurytoxic

After updating to master, I believe the commit fix memory leak https://github.com/spesmilo/electrumx/commit/8204829e7c15e288bf620c2295cb6495314f162b fixed my issues

image Thanks!

mercurytoxic avatar Dec 04 '20 00:12 mercurytoxic

I wish this was resolved for me... It never was. I kept restarting Electrum every now and then since opening this issue many months ago.

Right now using straight from git, currently:

commit 1c680d362455687db42c2498f9ff6cf1bf8eda91 (HEAD -> master, origin/master, origin/HEAD)
Author: John Studnicka <[email protected]>
Date:   Sat May 1 08:46:43 2021 -0700

    Add Electra Protocol (XEP) support (#124)

Result is (I wrote script to output this data):

Servers: 96 
Connections: 595 
Uptime: 23d:04h:44m 
Process memory: 13601 MB

electrumx getinfo: $ electrumx_rpc getinfo

{
    "coin": "BitcoinSegwit",
    "daemon": "localhost:8332/",
    "daemon height": 685856,
    "db height": 685856,
    "db_flush_count": 31650,
    "groups": 674,
    "history cache": "14,726,487 lookups 2,106,399 hits 1,000 entries",
    "merkle cache": "124,223 lookups 80,847 hits 1,000 entries",
    "peers": {
        "bad": 23,
        "good": 95,
        "never": 3,
        "stale": 6,
        "total": 127
    },
    "pid": 1489,
    "request counts": {
        "blockchain.block.header": 400481,
        "blockchain.block.headers": 7930,
        "blockchain.estimatefee": 67250739,
        "blockchain.headers.subscribe": 408924,
        "blockchain.relayfee": 78746,
        "blockchain.scripthash.get_balance": 289798,
        "blockchain.scripthash.get_history": 328228,
        "blockchain.scripthash.listunspent": 16014,
        "blockchain.scripthash.subscribe": 14297247,
        "blockchain.transaction.broadcast": 1292,
        "blockchain.transaction.get": 103351,
        "blockchain.transaction.get_merkle": 136462,
        "getinfo": 97,
        "invalid method": 6,
        "mempool.get_fee_histogram": 560387,
        "server.add_peer": 10020,
        "server.banner": 78780,
        "server.donation_address": 83851,
        "server.features": 57979,
        "server.peers.subscribe": 137586,
        "server.ping": 3926283,
        "server.version": 12543334
    },
    "request total": 100717535,
    "sessions": {
        "count": 580,
        "count with subs": 1,
        "errors": 1126,
        "logged": 0,
        "pending requests": 4,
        "subs": 49
    },
    "tx hashes cache": "136,462 lookups 80,192 hits 1,000 entries",
    "txs sent": 1069,
    "uptime": "23d 05h 16m",
    "version": "ElectrumX 1.16.0"

Peek 2021-06-02 00-35

ElectrumX is working fine and it's serving thousands of wallets, busy on CPU and HDD as well, but I am happy to contribute. However, memory leak is always there. Maybe some dev would want to take a look at this once again? Thanks.

github12101 avatar Jun 01 '21 23:06 github12101