How to reduce disk load
Hi. How can I reduce disk load on read for thor? Can I reduce the thread count, for example? We use a fast NVMe SSD on the server.
/opt/vechain/bin/thor --network main --data-dir /opt/vechain/data --api-addr 0.0.0.0:8669 --verbosity 9 --max-peers 50 --cache 4096
/opt/vechain/bin/thor --version
Thor version 1.3.3-419e32e-release
thor process:
# ps auxf| grep thor
vechain 19062 0.0 55.5 6645136 4338496 ? Ssl Apr22 7558:48 /opt/vechain/bin/thor
--network main --data-dir /opt/vechain/data --api-addr 0.0.0.0:8669 --verbosity 9
--max-peers 50 --cache 4096
217 threads (!)
# ps -e -T | grep 19062| wc -l
217
[screenshot: iotop]

iostat -x 1
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme0n1 29072.00 88.00 156716.00 21902.00 0.00 265.00 0.00 75.07 0.10 2.14 2.13 5.39 248.89 0.03 82.80
nvme1n1 24594.00 88.00 132324.00 21902.00 0.00 265.00 0.00 75.07 0.11 2.09 1.78 5.38 248.89 0.03 78.00
md2 53658.00 336.00 289096.00 21892.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.39 65.15 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
14.85 0.00 20.35 1.33 0.00 63.47
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme0n1 34277.00 13.00 190460.00 23.50 0.00 0.00 0.00 0.00 0.09 3.38 2.16 5.56 1.81 0.03 88.80
nvme1n1 29253.00 13.00 163220.00 23.50 0.00 0.00 0.00 0.00 0.09 3.38 1.66 5.58 1.81 0.03 87.20
md2 63534.00 4.00 353592.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.57 2.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
14.65 0.00 20.88 1.26 0.00 63.22
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme0n1 36900.00 12.00 202444.00 20.00 0.00 0.00 0.00 0.00 0.09 2.67 2.27 5.49 1.67 0.02 91.60
nvme1n1 31646.00 12.00 176776.00 20.00 0.00 0.00 0.00 0.00 0.09 2.67 1.89 5.59 1.67 0.03 88.40
md2 68545.00 4.00 379216.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.53 2.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
17.14 0.00 17.90 6.30 0.00 58.66
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme0n1 28694.00 110.00 161896.00 21780.00 0.00 207.00 0.00 65.30 0.11 2.00 2.26 5.64 198.00 0.03 85.60
nvme1n1 24235.00 110.00 135280.00 21780.00 0.00 207.00 0.00 65.30 0.11 2.04 1.86 5.58 198.00 0.03 78.80
md2 52931.00 277.00 297208.00 21704.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.62 78.35 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 14.00 0.00 56.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00
There are some scenarios that will cause high disk load (read operations, in this case):
- re-synchronization
- ongoing pruning
- intensive block/tx/state queries through the RESTful API
(A few checks for narrowing this down are sketched right after this list.)
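A rough way to check which of these applies (only a sketch; the API address, service name, and nginx log path are taken from this thread and may differ on other setups):

# Still syncing? Compare the best block's timestamp with the current time:
curl -s http://localhost:8669/blocks/best | grep -o '"timestamp":[0-9]*'
date +%s

# Pruner active? Filter the node's logs for pruner messages:
journalctl -u thor | grep -i pruner | tail -n 20

# API queries coming in? Watch the request rate on the reverse proxy:
tail -f /var/log/nginx/vechain.access.log | pv -r -l > /dev/null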
Thank you for the response. thor is on the latest block:
root@vet-02:~# curl http://localhost:8669/blocks/best
{"number":5722736,"id":"0x0057527088bc55a52622f8e9492fcea93567c89596cf5d1eb183181ad661c465",
There weren't any blocks/txs/state requests.
Can I set a thread limit? I think that is too many for 4-8 vCPUs.
# ps -e -T | grep 19062| wc -l
217
thor debug logs (last 1000 lines): https://pastebin.com/raw/D78yRuym
Usually the node does not generate such high disk load. Can you check whether another program is making requests to the RESTful API, and whether the high I/O load happens all the time?
It's not recommended to limit the thread count, although you can call runtime.GOMAXPROCS to achieve this, unless RAM pressure is a concern.
BTW, it would be helpful to tail lines from main.db/LOG when you notice the high disk load.
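Since thor is a Go program, the prebuilt binary also honors the GOMAXPROCS environment variable, so the scheduler can be capped without touching the code. This is only a sketch, and note that it limits the number of CPUs executing Go code, not the total OS thread count (threads blocked in syscalls are created on top of it):

# Hypothetical example: start thor with the Go scheduler capped at 4 CPUs
GOMAXPROCS=4 /opt/vechain/bin/thor --network main --data-dir /opt/vechain/data --api-addr 0.0.0.0:8669 --verbosity 9 --max-peers 50 --cache 4096

# Locate and tail the LevelDB log while the load is high (the exact location of
# main.db under the data dir can vary, hence the find):
tail -f "$(find /opt/vechain/data -path '*main.db/LOG' | head -n 1)"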
When I stop the thor process, the disk load is okay. I have resynced the blockchain and now everything works fine. I will close the issue; if the problem happens again I will reopen it. Thanks for your help.
Hi. I have this problem again. Very high load on the disk, 685 MB/s. iotop shows that thor is writing to disk very intensively. Can I set limits for thor?
root@vet-01:~# curl http://localhost:8669/blocks/best
{"number":5800992,"id":"0x005884200841de3cbf66f1e41b2e7884d9ab42ce1cbac4ce9534dd4c1ab1488d","size":243,"parentID":"0x0058841f2ce30529bf0fdae20456a6e0604db5066ab14bc6c4de45e4bf5c7238","timestamp":1588496820,"gasLimit":31355345,"beneficiary":"0x7b683b0f5078a850b685b0c2f14262c86ea99b8f","gasUsed":0,"totalScore":558429382,"txsRoot":"0x45b0cfc220ceec5b7c1c62c4d4193d38e4eba48e8815729ce75f9c0ab0e4c1c0","txsFeatures":1,"stateRoot":"0x4748f7e3b68d4e000f94935e5a15143b3ce1fdb975a9082543f302ca103a9c62","receiptsRoot":"0x45b0cfc220ceec5b7c1c62c4d4193d38e4eba48e8815729ce75f9c0ab0e4c1c0","signer":"0x871ea386b71cbc16b7e21357de08778cf4e341a2","isTrunk":true,"transactions":[]}
Stdout Logs
root@vet-01:~# journalctl -fu thor
-- Logs begin at Wed 2020-04-22 18:28:05 UTC. --
May 03 09:07:26 vet-01 thor[22157]: t=2020-05-03T09:07:26+0000 lvl=info msg="imported blocks (1)" pkg=node txs=0 mgas=0.000 et=2.703ms|45.710ms mgas/s=0.000 id=[#5800995…f13016d4]
May 03 09:07:26 vet-01 thor[22157]: t=2020-05-03T09:07:26+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer cab862fef37c048d 128.1.48.167:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:26 vet-01 thor[22157]: t=2020-05-03T09:07:26+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer d1c780cdfbd83db4 152.32.188.92:11235" dir=outbound msg=MsgNewBlock
May 03 09:07:26 vet-01 thor[22157]: t=2020-05-03T09:07:26+0000 lvl=dbug msg="wash done" pkg=txpool len=1 removed=0 elapsed=1.765ms
May 03 09:07:28 vet-01 thor[22157]: t=2020-05-03T09:07:28+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 88cf00e83688e933 152.32.140.65:11235" dir=outbound msg=MsgGetBlockIDByNumber
May 03 09:07:28 vet-01 thor[22157]: t=2020-05-03T09:07:28+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 88cf00e83688e933 152.32.140.65:11235" dir=outbound msg=MsgGetBlocksFromNumber
May 03 09:07:31 vet-01 thor[22157]: t=2020-05-03T09:07:31+0000 lvl=dbug msg="synchronization start" pkg=comm
May 03 09:07:31 vet-01 thor[22157]: t=2020-05-03T09:07:31+0000 lvl=dbug msg="start to process block stream" pkg=node
May 03 09:07:31 vet-01 thor[22157]: t=2020-05-03T09:07:31+0000 lvl=dbug msg="process block stream done" pkg=node err=nil
May 03 09:07:31 vet-01 thor[22157]: t=2020-05-03T09:07:31+0000 lvl=dbug msg="synchronization done" pkg=comm peer="Peer dfbb840bc8324809 51.91.82.15:2600" dir=outbound
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer a434245367373fde 128.1.42.238:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 380738d424b19f39 104.218.165.203:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer dfbb840bc8324809 51.91.82.15:2600" dir=outbound msg=MsgNewBlockID
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 03b606278b875bd2 128.14.231.4:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer d1c780cdfbd83db4 152.32.188.92:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer cab862fef37c048d 128.1.48.167:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=info msg="imported blocks (1)" pkg=node txs=1 mgas=0.021 et=9.318ms|64.541ms mgas/s=0.284 id=[#5800996…f932e309]
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 88cf00e83688e933 152.32.140.65:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="tx washed out" pkg=txpool id=0x5bc9956d572e44af005800af8f64e350bbe233bfd0cad2b79ff938f592d4e07d err="known tx"
May 03 09:07:36 vet-01 thor[22157]: t=2020-05-03T09:07:36+0000 lvl=dbug msg="wash done" pkg=txpool len=1 removed=1 elapsed=3.895ms
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 03b606278b875bd2 128.14.231.4:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer a434245367373fde 128.1.42.238:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer dfbb840bc8324809 51.91.82.15:2600" dir=outbound msg=MsgNewBlockID
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer d1c780cdfbd83db4 152.32.188.92:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="skip new block ID announcement" pkg=comm peer="Peer d1c780cdfbd83db4 152.32.188.92:11235" dir=outbound
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=info msg="imported blocks (1)" pkg=node txs=0 mgas=0.000 et=3.165ms|2.161ms mgas/s=0.000 id=[#5800997…d0d3fceb]
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 380738d424b19f39 104.218.165.203:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer cab862fef37c048d 128.1.48.167:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:45 vet-01 thor[22157]: t=2020-05-03T09:07:45+0000 lvl=dbug msg="received RPC call" pkg=comm peer="Peer 88cf00e83688e933 152.32.140.65:11235" dir=outbound msg=MsgNewBlockID
May 03 09:07:46 vet-01 thor[22157]: t=2020-05-03T09:07:46+0000 lvl=dbug msg="wash done" pkg=txpool len=0 removed=0 elapsed=1.479ms
[screenshot: iostat]
The Nginx request rate is not high:
# tail -f /var/log/nginx/vechain.access.log| pv -r -l >/dev/null
[14.5 /s]
I can set CPU/memory/network limits, but I can't set I/O limits for the thor process on my server.
How long did it last? If it was only for several minutes, it was most likely the state pruner running.
You can filter the logs with the keyword 'pruner'.
It started around 2020-05-01, ~16:00 UTC.
@qianbin
Last 20 records with keyword 'pruner'.
root@vet-01:~# grep pruner /var/log/syslog| tail -n 20
Apr 23 04:46:49 vet-01 thor[3920]: t=2020-04-23T04:46:49+0000 lvl=info msg="archiving account trie..." pkg=pruner
Apr 23 04:48:36 vet-01 thor[3920]: t=2020-04-23T04:48:36+0000 lvl=info msg="archived account trie" pkg=pruner nodes=18274 entries=6777 storageNodes=773300 storageEntries=234992
Apr 23 04:52:12 vet-01 thor[3920]: t=2020-04-23T04:52:12+0000 lvl=info msg="sweeping stale nodes..." pkg=pruner
Apr 23 04:52:20 vet-01 thor[3920]: t=2020-04-23T04:52:20+0000 lvl=info msg="swept stale nodes" pkg=pruner count=3364134
Apr 23 04:52:21 vet-01 thor[3920]: t=2020-04-23T04:52:21+0000 lvl=info msg=initiated pkg=pruner range="[5623959, 5691750]"
Apr 23 04:52:21 vet-01 thor[3920]: t=2020-04-23T04:52:21+0000 lvl=info msg="archiving index trie..." pkg=pruner
Apr 23 04:52:49 vet-01 thor[3920]: t=2020-04-23T04:52:49+0000 lvl=info msg="archived index trie" pkg=pruner nodes=252631 entries=120139
Apr 23 04:52:49 vet-01 thor[3920]: t=2020-04-23T04:52:49+0000 lvl=info msg="archiving account trie..." pkg=pruner
Apr 23 04:54:27 vet-01 thor[3920]: t=2020-04-23T04:54:27+0000 lvl=info msg="archived account trie" pkg=pruner nodes=17082 entries=6274 storageNodes=803692 storageEntries=248991
Apr 24 08:04:12 vet-01 thor[3920]: t=2020-04-24T08:04:12+0000 lvl=info msg="stopping pruner..."
Apr 24 08:04:16 vet-01 thor[22157]: t=2020-04-24T08:04:16+0000 lvl=info msg="pruner started" pkg=pruner range="[5623959, 5691750]" step=dropStale
Apr 28 07:58:36 vet-01 thor[22157]: t=2020-04-28T07:58:36+0000 lvl=info msg="sweeping stale nodes..." pkg=pruner
Apr 28 07:58:45 vet-01 thor[22157]: t=2020-04-28T07:58:45+0000 lvl=info msg="swept stale nodes" pkg=pruner count=3447391
Apr 28 08:00:36 vet-01 thor[22157]: t=2020-04-28T08:00:36+0000 lvl=info msg=initiated pkg=pruner range="[5691750, 5757425]"
Apr 28 08:00:36 vet-01 thor[22157]: t=2020-04-28T08:00:36+0000 lvl=info msg="archiving index trie..." pkg=pruner
Apr 28 08:01:01 vet-01 thor[22157]: t=2020-04-28T08:01:01+0000 lvl=info msg="archived index trie" pkg=pruner nodes=265525 entries=122717
Apr 28 08:01:01 vet-01 thor[22157]: t=2020-04-28T08:01:01+0000 lvl=info msg="archiving account trie..." pkg=pruner
Apr 28 08:02:47 vet-01 thor[22157]: t=2020-04-28T08:02:47+0000 lvl=info msg="archived account trie" pkg=pruner nodes=20711 entries=7786 storageNodes=697653 storageEntries=218045
May 3 10:25:10 vet-01 thor[22157]: t=2020-05-03T10:25:10+0000 lvl=info msg="stopping pruner..."
May 3 10:25:13 vet-01 thor[32273]: t=2020-05-03T10:25:13+0000 lvl=info msg="pruner started" pkg=pruner range="[5691750, 5757425]" step=dropStale
root@vet-01:~#
I have restarted thor at 10:25 today.
[screenshot: after restarting thor]
So it's nothing related to the pruner. I'm wondering what the disk load is like if all HTTP requests are stopped.
What about main.db/LOG? I can't find the reason there.
that looks all right.
I have changed --cache 4096 to --cache 1024 and restarted. I will watch whether it helps.
/opt/vechain/bin/thor --network main --data-dir /opt/vechain/data --api-addr 0.0.0.0:8669 --verbosity 9 --max-peers 50 --cache 1024
Oh, I forgot to ask about the RAM limit. What is it set to now? 16 GB is recommended. If a low RAM limit is set, pages may go to swap and the disk load will rise.
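A quick way to check whether the node is actually hitting that limit and swapping (a sketch; the PID and unit name are taken from this thread):

# Swap used by the thor process (VmSwap should stay near zero):
grep VmSwap /proc/22157/status

# System-wide swap-in/swap-out rates (the si/so columns should be ~0):
vmstat 1 5

# Current memory usage vs. the configured ceiling for the systemd unit:
systemctl show thor -p MemoryCurrent -p MemoryMax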
The limit is 8 GB now; I will set it to 16 GB.
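If the 8 GB ceiling is enforced through systemd, one way to raise it (a sketch; it assumes the unit is named thor and the limit is controlled by MemoryMax):

# Hypothetical: raise the unit's memory ceiling to 16G (persists across restarts)
systemctl set-property thor.service MemoryMax=16G
systemctl restart thor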