
I built my own node and it has not been able to catch up with the latest height

Open COLUD4 opened this issue 2 years ago • 29 comments

Version: 1.6.5.1, Ubuntu 20.04

COLUD4 avatar Nov 17 '22 10:11 COLUD4

same here

always out of sync

sing1ee avatar Nov 17 '22 11:11 sing1ee

+1, still slow syncing even after upgrading CPU, memory, and disk

macrocan avatar Nov 17 '22 11:11 macrocan

same here

always out of sync

hot-westeros avatar Nov 17 '22 12:11 hot-westeros

+1, still slow syncing even after upgrading CPU, memory, and disk

It doesn't seem to be a hardware problem, just a network issue

sing1ee avatar Nov 17 '22 13:11 sing1ee

Same issue. CPU and RAM peak due to this issue too.

dandavid3000 avatar Nov 17 '22 14:11 dandavid3000

Same issue, even after pulling the new version and running repair-state. Still slow syncing.

kaelabbott avatar Nov 17 '22 15:11 kaelabbott

Try this.

If you are using leveldb:

exchaind start \
  --pruning everything \
  --chain-id exchain-66 \
  --mempool.sort_tx_by_gp \
  --iavl-enable-async-commit=true \
  --iavl-cache-size=10000000 \
  --mempool.recheck=0 \
  --mempool.force_recheck_gap=2000 \
  --disable-abci-query-mutex=1 \
  --mempool.size=200000 \
  --mempool.max_gas_used_per_block=120000000 \
  --home /data_folder/ \
  --fast-query=1 \
  --enable-bloom-filter=1

If you are using rocksdb:

exchaind start \
  --pruning everything \
  --chain-id exchain-66 \
  --db_backend rocksdb \
  --mempool.sort_tx_by_gp \
  --iavl-enable-async-commit=true \
  --iavl-cache-size=10000000 \
  --mempool.recheck=0 \
  --mempool.force_recheck_gap=2000 \
  --disable-abci-query-mutex=1 \
  --mempool.size=200000 \
  --mempool.max_gas_used_per_block=120000000 \
  --home /data_folder/ \
  --fast-query=1 \
  --enable-bloom-filter=1

Switching to rocksdb is recommended, as leveldb will no longer be maintained.

sing1ee avatar Nov 17 '22 15:11 sing1ee

@sing1ee thanks for the response. I tried the commands above and my node is still syncing slowly. I'm using rocksdb and version 1.6.5.1 as well.

kaelabbott avatar Nov 17 '22 15:11 kaelabbott

Notice about OKC Network.

Today, OKC has onboarded XEN successfully. However, due to our low gas fees and XEN's popularity, there were many users who attempted to mint XEN through scripts. These scripts consume a huge amount of gas in a single transaction which consequently filled the Mempool, resulting in temporary congestion.

We have since increased the capacity of our RPC Mempool and the gas limit of each block to alleviate the congestion.

To prevent similar incidents, we will launch a proposal for the community to vote on the option to limit excessive consumption of resources by single transactions. Updates will follow soon! Stay tuned

https://t.me/XENCryptoTalk/367273

In my personal opinion, this was caused by the wrong gas policy: the block gas limit was even increased, and the gas price cannot be adjusted correctly (https://www.oklink.com/en/okc/block/15416037).

The execution pressure of the blocks during this period is too high, causing most nodes to take a long time to sync state.

There are some ways to help you sync faster:

  1. upgrade the machine
  2. use rocksdb
  3. turn on the asynchronous commit option --iavl-enable-async-commit=true --iavl-cache-size=10000000

But these may not help much. Just let the node sync, even if it is very slow. You can also wait for the official release of a new data snapshot to skip synchronizing today's block data.
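For reference, a minimal start command combining points 2 and 3 above could look like the sketch below (the chain-id and the /data_folder path are simply carried over from the earlier commands in this thread, not requirements):

exchaind start \
  --chain-id exchain-66 \
  --db_backend rocksdb \
  --iavl-enable-async-commit=true \
  --iavl-cache-size=10000000 \
  --home /data_folder/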

cwbhhjl avatar Nov 17 '22 16:11 cwbhhjl

Same issue since 17/11. Can anyone help?

jackie2022tec avatar Nov 18 '22 03:11 jackie2022tec

Notice about OKC Network. Today, OKC has onboarded XEN successfully. However, due to our low gas fees and XEN's popularity, there were many users who attempted to mint XEN through scripts. These scripts consume a huge amount of gas in a single transaction which consequently filled the Mempool, resulting in temporary congestion. We have since increased the capacity of our RPC Mempool and the gas limit of each block to alleviate the congestion. To prevent similar incidents, we will launch a proposal for the community to vote on the option to limit excessive consumption of resources by single transactions. Updates will follow soon! Stay tuned

https://t.me/XENCryptoTalk/367273

In my personal opinion, this was caused by the wrong gas policy: the block gas limit was even increased, and the gas price cannot be adjusted correctly (https://www.oklink.com/en/okc/block/15416037).

The execution pressure of the blocks during this period is too high, causing most nodes to take a long time to sync state.

There are some ways to help you sync faster:

  1. upgrade the machine
  2. use rocksdb
  3. turn on the asynchronous commit option --iavl-enable-async-commit=true --iavl-cache-size=10000000

But these may not help much. Just let the node sync, even if it is very slow. You can also wait for the official release of a new data snapshot to skip synchronizing today's block data.

I have tried this, but it seems to be getting slower; only 30 blocks are synchronized in 10 minutes.

sing1ee avatar Nov 18 '22 04:11 sing1ee

@sing1ee The block gas limit is now 120 million, which I think is still too high. There is a lot of 'scripted' XEN minting with a very low gas price in each block.

cwbhhjl avatar Nov 18 '22 16:11 cwbhhjl

same problem here, using rocksdb and the iavl flags mentioned in this discussion

0xChupaCabra avatar Dec 01 '22 13:12 0xChupaCabra

same problem here, using rocksdb and the iavl flags mentioned in this discussion

Hi @stepollo2, can you share your start command, version, and logs? We suggest using v1.6.5.9.

If you run the node as RPC, please start exchaind with:

exchaind start --home $your_home

If you run the node as a validator, please start exchaind with:

exchaind start --node-mode=val --home $your_home

We benefited from improving our disk's IOPS (16000) and throughput (1000M). Hope it helps.
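If you want to measure your own disk's IOPS and throughput before tuning, one possible check is sketched below (it assumes fio is installed and that the chain data lives under /data_folder):

# random 4k reads, reports IOPS
fio --name=randread --filename=/data_folder/fio-test --size=4G --direct=1 \
    --ioengine=libaio --rw=randread --bs=4k --iodepth=64 --runtime=60 \
    --time_based --group_reporting

# sequential 1M writes, reports throughput
fio --name=seqwrite --filename=/data_folder/fio-test --size=4G --direct=1 \
    --ioengine=libaio --rw=write --bs=1M --iodepth=16 --runtime=60 \
    --time_based --group_reporting

rm /data_folder/fio-test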

giskook avatar Dec 01 '22 14:12 giskook

same problem here, using rocksdb and the iavl flags mentioned in this discussion

Hi @stepollo2, can you share your start command, version, and logs? We suggest using v1.6.5.9.

If you run the node as RPC, please start exchaind with:

exchaind start --home $your_home

If you run the node as a validator, please start exchaind with:

exchaind start --node-mode=val --home $your_home

We benefited from improving our disk's IOPS (16000) and throughput (1000M). Hope it helps.

Here are the details requested:

exchaind version
v1.6.5.8

cat /etc/systemd/system/exchain.service
[Unit]
Description=OKX service
After=network.target
StartLimitIntervalSec=0
[Service]
Type=simple
Restart=always
RestartSec=1
User=XXXX
ExecStart=/usr/local/bin/exchaind start --chain-id exchain-66 --home /data1/exchain/data --rest.laddr "tcp://0.0.0.0:10998" --cors "*" --iavl-enable-async-commit=true --iavl-cache-size=10000000 --max-open=1024 --rocksdb.opts max_open_files=100

[Install]
WantedBy=multi-user.target

Snippet from the logs:

Dec 01 17:47:35 ovh-1 exchaind[1710424]: I[2022-12-01|17:47:35.365][1710424] Height<15425676>, Tx<6>, BlockSize<4854>, GasUsed<94425787>, InvalidTxs<0>, lastRun<11275ms>, RunTx<ApplyBlock<13096ms>, abci<11275ms>, persist<1819ms>>, MempoolTxs<0>, Workload<1.00|1.00|1.00|1.00>, MempoolTxs[0], Iavl[getnode<340987>, rdb<30504>, rdbTs<52429ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<11255ms>, refund<0ms>]. module=main
Dec 01 17:47:48 ovh-1 exchaind[1710424]: I[2022-12-01|17:47:48.792][1710424] Height<15425677>, Tx<6>, BlockSize<16069>, GasUsed<78695088>, InvalidTxs<0>, lastRun<11914ms>, RunTx<ApplyBlock<13426ms>, abci<11914ms>, persist<1510ms>>, MempoolTxs<0>, Workload<1.00|1.00|1.00|1.00>, MempoolTxs[0], Iavl[getnode<278986>, rdb<24139>, rdbTs<44188ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<11827ms>, refund<0ms>]. module=main
Dec 01 17:48:08 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:08.769][1710424] Height<15425678>, Tx<7>, BlockSize<2973>, GasUsed<113176930>, InvalidTxs<0>, lastRun<17851ms>, RunTx<ApplyBlock<19975ms>, abci<17851ms>, persist<2122ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<402953>, rdb<36778>, rdbTs<72608ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<17824ms>, refund<0ms>]. module=main
Dec 01 17:48:24 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:24.074][1710424] Height<15425679>, Tx<6>, BlockSize<3061>, GasUsed<90351907>, InvalidTxs<0>, lastRun<13627ms>, RunTx<ApplyBlock<15303ms>, abci<13627ms>, persist<1674ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<326127>, rdb<29052>, rdbTs<51812ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<13598ms>, refund<0ms>]. module=main
Dec 01 17:48:42 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:42.926][1710424] Height<15425680>, Tx<8>, BlockSize<2998>, GasUsed<113197930>, InvalidTxs<0>, lastRun<16965ms>, RunTx<ApplyBlock<18851ms>, abci<16966ms>, persist<1884ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<403674>, rdb<36418>, rdbTs<63500ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<16927ms>, refund<0ms>]. module=main
Dec 01 17:48:57 ovh-1 exchaind[1710424]: I[2022-12-01|17:48:57.021][1710424] Height<15425681>, Tx<7>, BlockSize<15861>, GasUsed<101354033>, InvalidTxs<0>, lastRun<12198ms>, RunTx<ApplyBlock<14093ms>, abci<12198ms>, persist<1893ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<356042>, rdb<31227>, rdbTs<57318ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<12152ms>, refund<0ms>]. module=main
Dec 01 17:49:11 ovh-1 exchaind[1710424]: I[2022-12-01|17:49:11.397][1710424] Height<15425682>, Tx<7>, BlockSize<3813>, GasUsed<141031927>, InvalidTxs<0>, lastRun<12808ms>, RunTx<ApplyBlock<14375ms>, abci<12809ms>, persist<1564ms>>, MempoolTxs<0>, Workload<1.00|1.01|1.00|1.00>, MempoolTxs[0], Iavl[getnode<492964>, rdb<45232>, rdbTs<48284ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<12776ms>, refund<1ms>]. module=main

0xChupaCabra avatar Dec 01 '22 17:12 0xChupaCabra

Hi @stepollo2 ,

exchaind version v1.6.5.8

Version is OK

ExecStart=/usr/local/bin/exchaind start --chain-id exchain-66 --home /data1/exchain/data --rest.laddr "tcp://0.0.0.0:10998" --cors "*" --iavl-enable-async-commit=true --iavl-cache-size=10000000 --max-open=1024 --rocksdb.opts max_open_files=100

Does your machine have a memory problem? I saw you set --rocksdb.opts max_open_files=100; if your machine has enough memory you can increase the value, or just remove this flag. If not, keep the flag.

Workload<1.00|1.00|1.00|1.00>

The workload is pretty heavy.

BTW, what is your machine's configuration? And what are the disk's IOPS and throughput?
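If memory allows and you want to raise or drop that flag, a minimal sketch (assuming the systemd unit shown above at /etc/systemd/system/exchain.service) is to adjust the ExecStart line and reload:

# edit the unit: raise max_open_files or delete the --rocksdb.opts flag entirely
sudo nano /etc/systemd/system/exchain.service
sudo systemctl daemon-reload
sudo systemctl restart exchain.service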

giskook avatar Dec 02 '22 00:12 giskook

Hi @stepollo2 ,

exchaind version v1.6.5.8

Version is OK

ExecStart=/usr/local/bin/exchaind start --chain-id exchain-66 --home /data1/exchain/data --rest.laddr "tcp://0.0.0.0:10998" --cors "*" --iavl-enable-async-commit=true --iavl-cache-size=10000000 --max-open=1024 --rocksdb.opts max_open_files=100

Does your machine have a memory problem? I saw you set --rocksdb.opts max_open_files=100; if your machine has enough memory you can increase the value, or just remove this flag. If not, keep the flag.

Workload<1.00|1.00|1.00|1.00>

The workload is pretty heavy.

BTW, what is your machine's configuration? And what are the disk's IOPS and throughput?


I updated the binaries to the latest release anyway. exchaind currently runs on sdc.

xxx@ovh-1:~$ iostat
Linux 5.15.0-52-generic (ovh-1)         12/03/22        _x86_64_        (40 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          20.99    0.00    4.42    2.08    0.00   72.51

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
loop0             0.00         0.00         0.00         0.00       5325          0          0
loop1             0.00         0.00         0.00         0.00       3759          0          0
loop2             0.00         0.00         0.00         0.00       1123          0          0
loop3             0.00         0.02         0.00         0.00      59533          0          0
loop4             0.00         0.00         0.00         0.00       5234          0          0
loop5             0.00         0.00         0.00         0.00       1143          0          0
loop6             0.00         0.00         0.00         0.00         28          0          0
md2               0.00         0.01         0.14         0.33      17333     383844     881164
md3             317.50     17259.33       925.71       527.47 46763173981 2508163852 1429154104
sda             186.64      8679.09       929.73       528.57 23515499410 2519037601 1432125892
sdb             190.49      8926.61       929.99       527.80 24186140809 2519756111 1430035268
sdc             320.91      6305.24     14369.75     18280.88 17083678957 38934005708 49531003968
sdd             528.96      7125.34     31270.58     24830.51 19305695953 84725839552 67276842052
sde             503.94      6517.02     16490.06     23203.49 17657495117 44678861056 62868519456
sdf             369.31      4330.12     43404.36     23085.22 11732223141 117601608456 62548089384
sdg             396.14      5029.14     12995.20     16875.20 13626171025 35209747336 45722384544
sdh            1391.41     74930.11     84132.24     19543.80 203018824977 227951480988 52952797804


xxx@ovh-1:~$ free -g
               total        used        free      shared  buff/cache   available
Mem:             754         353           8           0         391         395
Swap:              0           0           0

The disks are 14 TB SSDs. I can also test on another machine with 7 TB NVMe drives.

0xChupaCabra avatar Dec 03 '22 14:12 0xChupaCabra

Same issue. CPU and RAM peak due to this issue too.

Since the gas incident, I've never been able to run the node again. I tried multiple times with the latest snapshots. The sync is pretty slow, and after half a day RAM usage can reach 40-50 GB. Linux has killed the node multiple times.

I'm pretty sure nothing is wrong with the hardware here: a 2 TB NVMe and 64 GB RAM machine.

dandavid3000 avatar Dec 03 '22 22:12 dandavid3000

Hi @stepollo2, it seems your node runs on your own machine rather than a cloud service provider. I suggest dropping the flag --rocksdb.opts max_open_files=100.

Do you run the node as RPC or as a validator? If you want to run it as a validator, please set the flag --node-mode=val with the start command.

giskook avatar Dec 04 '22 00:12 giskook

Hi @dandavid3000 ,

I tried multiple times with the latest snapshots

Could you please provide exchaind's version information?

The sync is pretty slow and RAM usage after a half of a day can reach to 40-50 GB of RAM. Linux kills the node multiple times

Could you provide exchaind's start command? Do you run the node as RPC, validator, or archive node?

I'm pretty sure that nothing is wrong with hardware here: 2TB NMVe and 64 GB RAM machine

This machine seems good enough. Does your node run on a cloud service provider? If so, please check the disk's IOPS and throughput.

giskook avatar Dec 04 '22 00:12 giskook

Hi @stepollo2, it seems your node runs on your own machine rather than a cloud service provider. I suggest dropping the flag --rocksdb.opts max_open_files=100.

Do you run the node as RPC or as a validator? If you want to run it as a validator, please set the flag --node-mode=val with the start command.

I run the node for RPC only

0xChupaCabra avatar Dec 04 '22 01:12 0xChupaCabra

Hi @stepollo2 ,

I run the node for RPC only

Let's drop the flag --rocksdb.opts max_open_files=100

giskook avatar Dec 04 '22 01:12 giskook

Hi @stepollo2 ,

I run the node for RPC only

Let's drop the flag --rocksdb.opts max_open_files=100

Dropped already but is not helping much :/

0xChupaCabra avatar Dec 04 '22 01:12 0xChupaCabra

Hi @stepollo2 ,

Dropped already but is not helping much :/

  1. Could you send me the whole log? If the log is too large, you can send it to [email protected]

  2. And it seems your exchaind is located at /usr/local/bin/exchaind. Could you run /usr/local/bin/exchaind version --long | grep -E "version|commit" to confirm the version?

giskook avatar Dec 04 '22 01:12 giskook

@stepollo2 I recommend using the latest snapshot for syncing.

https://static.okex.org/cdn/oec/snapshot/index.html

The height you are currently synchronizing is where the blocks were most congested; each block consumes hundreds of millions of gas, so synchronization will definitely be slow.
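Roughly, using a snapshot looks like the sketch below (take the actual filename and download URL from the index page; the filename here is only an example, and extracting into /data_folder/ is an assumption about your home directory layout):

wget https://static.okex.org/cdn/oec/snapshot/mainnet-s0-fss-20221205-15769737-rocksdb.tar.gz
tar -xzf mainnet-s0-fss-20221205-15769737-rocksdb.tar.gz -C /data_folder/
exchaind start --db_backend rocksdb --chain-id exchain-66 --home /data_folder/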

cwbhhjl avatar Dec 04 '22 01:12 cwbhhjl

@dandavid3000 Can you provide a few lines of the node log for the latest height?

cwbhhjl avatar Dec 04 '22 02:12 cwbhhjl

Hi @dandavid3000 ,

I tried multiple times with the latest snapshots

Could you please provide exhaind's version information?

The sync is pretty slow and RAM usage after a half of a day can reach to 40-50 GB of RAM. Linux kills the node multiple times

Could you provide the exchaind's start command? do you run the node as rpc, validator or archive node?

I'm pretty sure that nothing is wrong with hardware here: 2TB NMVe and 64 GB RAM machine

This machine seems good enough. Is your node runs on cloud service provider? If so please check the disk's IOPS and throughput

I ran the node on a local PC.

Linux 5.19.5-051905-generic (precision-3460) 	04/12/2022 	_x86_64_(24 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5,11    0,00    1,84    2,93    0,00   90,12

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
loop0             0,00         0,00         0,00         0,00         21          0          0
loop1             0,00         0,00         0,00         0,00        360          0          0
loop10            0,00         0,15         0,00         0,00     177328          0          0
loop11            0,03         1,97         0,00         0,00    2374569          0          0
loop12            0,00         0,00         0,00         0,00        431          0          0
loop13            0,01         0,38         0,00         0,00     460149          0          0
loop14            0,00         0,00         0,00         0,00         67          0          0
loop15            0,00         0,00         0,00         0,00        233          0          0
loop16            0,00         0,00         0,00         0,00        649          0          0
loop17            0,00         0,00         0,00         0,00         18          0          0
loop2             0,01         0,51         0,00         0,00     613403          0          0
loop3             0,00         0,00         0,00         0,00       1297          0          0
loop4             0,16         9,40         0,00         0,00   11335279          0          0
loop5             0,00         0,00         0,00         0,00       1083          0          0
loop6             0,06         2,14         0,00         0,00    2578811          0          0
loop7             0,00         0,00         0,00         0,00       1076          0          0
loop8             0,00         0,00         0,00         0,00        453          0          0
loop9             0,01         0,05         0,00         0,00      57515          0          0
nvme0n1          46,99       375,54       492,72       327,43  452981416  594329933  394947352
nvme1n1        1868,08     42470,52     40381,53         0,00 51228512692 48708749400          0
nvme2n1          84,15      1527,69       921,23      1413,32 1842719093 1111201988 1704765472
sda             174,56      2264,85      3999,26         0,00 2731890600 4823959116          0

I tested multiple times with different exchaind versions. The latest one is v1.6.5.10 with mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz.

Start cmd

export EXCHAIND_PATH=/mnt/980/okex/.exchaind/mainnet-s0-fss-20221127-15594367-rocksdb

exchaind start --rest.laddr "tcp://localhost:38345" --wsport 38346 --db_backend rocksdb --chain-id exchain-66 --home ${EXCHAIND_PATH}
I[2022-12-04|21:50:24.851][1007860] Height<15617983>, Tx<11>, BlockSize<22333>, GasUsed<34817639>, InvalidTxs<0>, lastRun<607ms>, RunTx<ApplyBlock<1352ms>, abci<607ms>, persist<744ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<204865>, rdb<38993>, rdbTs<7606ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<603ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:25.542][1007860] Stopping peer for error. module=p2p peer="Peer{MConn{175.41.191.69:26656} 7fa5b1d1f1e48659fa750b6aec702418a0e75f13 out}" err=EOF
E[2022-12-04|21:50:25.608][1007860] dialing failed (attempts: 2): dial tcp 8.130.29.139:46966: i/o timeout. module=pex [email protected]:46966
E[2022-12-04|21:50:25.608][1007860] dialing failed (attempts: 5): dial tcp 13.228.20.99:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:25.608][1007860] dialing failed (attempts: 1): dial tcp 47.91.245.244:33254: i/o timeout. module=pex [email protected]:33254
I[2022-12-04|21:50:25.727][1007860] Height<15617984>, Tx<2>, BlockSize<9677>, GasUsed<13670144>, InvalidTxs<0>, lastRun<252ms>, RunTx<ApplyBlock<861ms>, abci<253ms>, persist<606ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<88273>, rdb<15420>, rdbTs<2369ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<248ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 3): dial tcp 18.192.220.49:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 3.64.37.17:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 35.74.98.204:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 13.213.145.109:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 5): dial tcp 13.213.117.128:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 54.248.224.222:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 3.37.121.32:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 5): dial tcp 13.250.251.11:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 3): dial tcp 13.125.38.24:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:26.543][1007860] dialing failed (attempts: 4): dial tcp 52.221.126.186:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:26.812][1007860] Height<15617985>, Tx<11>, BlockSize<21692>, GasUsed<28449129>, InvalidTxs<1>, lastRun<463ms>, RunTx<ApplyBlock<1072ms>, abci<463ms>, persist<607ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<168172>, rdb<31068>, rdbTs<6280ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<458ms>, refund<1ms>]. module=main 
I[2022-12-04|21:50:27.834][1007860] Height<15617986>, Tx<7>, BlockSize<18198>, GasUsed<27690757>, InvalidTxs<0>, lastRun<456ms>, RunTx<ApplyBlock<1004ms>, abci<457ms>, persist<546ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<166789>, rdb<30781>, rdbTs<5014ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<453ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:27.894][1007860] dialing failed (attempts: 6): auth failure: secret conn failed: read tcp 192.168.0.12:41982->54.249.109.150:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:28.641][1007860] Height<15617987>, Tx<6>, BlockSize<14695>, GasUsed<21094604>, InvalidTxs<0>, lastRun<357ms>, RunTx<ApplyBlock<792ms>, abci<358ms>, persist<432ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<128990>, rdb<23130>, rdbTs<3728ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<353ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:28.830][1007860] dialing failed (attempts: 4): auth failure: secret conn failed: read tcp 192.168.0.12:46820->35.72.176.238:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:29.693][1007860] Height<15617988>, Tx<8>, BlockSize<20362>, GasUsed<28042968>, InvalidTxs<0>, lastRun<482ms>, RunTx<ApplyBlock<1033ms>, abci<483ms>, persist<548ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167322>, rdb<30908>, rdbTs<5062ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<478ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:30.510][1007860] Height<15617989>, Tx<7>, BlockSize<16132>, GasUsed<21271813>, InvalidTxs<1>, lastRun<348ms>, RunTx<ApplyBlock<802ms>, abci<348ms>, persist<451ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<129387>, rdb<23665>, rdbTs<5196ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<344ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:30.828][1007860] Height<15617990>, Tx<1>, BlockSize<5721>, GasUsed<6835078>, InvalidTxs<0>, lastRun<153ms>, RunTx<ApplyBlock<303ms>, abci<153ms>, persist<149ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<47150>, rdb<7850>, rdbTs<1673ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<150ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:31.336][1007860] Height<15617991>, Tx<3>, BlockSize<9718>, GasUsed<13822929>, InvalidTxs<0>, lastRun<190ms>, RunTx<ApplyBlock<493ms>, abci<190ms>, persist<302ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<88530>, rdb<15567>, rdbTs<2546ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<187ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:33.262][1007860] Height<15617992>, Tx<12>, BlockSize<29757>, GasUsed<48598568>, InvalidTxs<0>, lastRun<805ms>, RunTx<ApplyBlock<1912ms>, abci<807ms>, persist<1104ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<278964>, rdb<54028>, rdbTs<9928ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<803ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:33.333][1007860] Height<15617993>, Tx<0>, BlockSize<1934>, GasUsed<0>, InvalidTxs<0>, lastRun<2ms>, RunTx<ApplyBlock<54ms>, abci<2ms>, persist<50ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<1510>, rdb<60>, rdbTs<22ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<0ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:34.443][1007860] Height<15617994>, Tx<10>, BlockSize<19576>, GasUsed<28286245>, InvalidTxs<1>, lastRun<455ms>, RunTx<ApplyBlock<1097ms>, abci<455ms>, persist<641ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167825>, rdb<31291>, rdbTs<6553ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<451ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:35.519][1007860] Height<15617995>, Tx<6>, BlockSize<17281>, GasUsed<27514085>, InvalidTxs<0>, lastRun<419ms>, RunTx<ApplyBlock<1060ms>, abci<420ms>, persist<639ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167015>, rdb<31363>, rdbTs<6780ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<415ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:36.552][1007860] Height<15617996>, Tx<8>, BlockSize<20339>, GasUsed<27940433>, InvalidTxs<1>, lastRun<444ms>, RunTx<ApplyBlock<1017ms>, abci<444ms>, persist<570ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167053>, rdb<31044>, rdbTs<5290ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<439ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:37.346][1007860] Height<15617997>, Tx<8>, BlockSize<14719>, GasUsed<21417054>, InvalidTxs<0>, lastRun<350ms>, RunTx<ApplyBlock<778ms>, abci<351ms>, persist<425ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<129900>, rdb<23353>, rdbTs<3773ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<347ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:38.422][1007860] Height<15617998>, Tx<9>, BlockSize<20073>, GasUsed<28139571>, InvalidTxs<1>, lastRun<450ms>, RunTx<ApplyBlock<1059ms>, abci<451ms>, persist<607ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<167223>, rdb<31521>, rdbTs<5098ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<447ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:38.945][1007860] Height<15617999>, Tx<3>, BlockSize<9730>, GasUsed<13822929>, InvalidTxs<0>, lastRun<196ms>, RunTx<ApplyBlock<509ms>, abci<196ms>, persist<311ms>>, MempoolTxs<0>, Workload<0.87|0.87|0.87|0.87>, Iavl[getnode<88809>, rdb<15426>, rdbTs<2497ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<192ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:40.378][1007860] Height<15618000>, Tx<6>, BlockSize<18090>, GasUsed<27766776>, InvalidTxs<0>, lastRun<468ms>, RunTx<ApplyBlock<1419ms>, abci<468ms>, persist<949ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<166775>, rdb<30710>, rdbTs<6448ms>, savenode<4071>], DeliverTxs[RunAnte<0ms>, RunMsg<465ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:40.785][1007860] Height<15618001>, Tx<2>, BlockSize<5875>, GasUsed<6987863>, InvalidTxs<0>, lastRun<180ms>, RunTx<ApplyBlock<392ms>, abci<180ms>, persist<210ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<47118>, rdb<7699>, rdbTs<1275ms>, savenode<316>], DeliverTxs[RunAnte<0ms>, RunMsg<177ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:41.177][1007860] CommitSchedule. module=iavl Height=15618000 Tree=acc IavlHeight=30 NodeNum=32980 trc="commitSchedule<800ms>, cacheNode<15ms>, Pruning<515ms>, batchSet<28ms>, batchCommit<240ms>"
I[2022-12-04|21:50:43.708][1007860] Height<15618002>, Tx<10>, BlockSize<27219>, GasUsed<41692972>, InvalidTxs<0>, lastRun<740ms>, RunTx<ApplyBlock<2907ms>, abci<741ms>, persist<2163ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<242802>, rdb<47575>, rdbTs<9205ms>, savenode<32980>], DeliverTxs[RunAnte<0ms>, RunMsg<735ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:44.944][1007860] Height<15618003>, Tx<6>, BlockSize<14091>, GasUsed<37481454>, InvalidTxs<0>, lastRun<488ms>, RunTx<ApplyBlock<1219ms>, abci<489ms>, persist<728ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<168253>, rdb<31326>, rdbTs<5197ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<484ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:46.436][1007860] Height<15618004>, Tx<9>, BlockSize<26244>, GasUsed<28850050>, InvalidTxs<1>, lastRun<535ms>, RunTx<ApplyBlock<1444ms>, abci<535ms>, persist<907ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<168321>, rdb<32125>, rdbTs<5040ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<531ms>, refund<1ms>]. module=main 
I[2022-12-04|21:50:47.873][1007860] Height<15618005>, Tx<5>, BlockSize<17169>, GasUsed<27493073>, InvalidTxs<0>, lastRun<515ms>, RunTx<ApplyBlock<1421ms>, abci<515ms>, persist<905ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<166770>, rdb<31545>, rdbTs<6736ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<512ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:48.939][1007860] Height<15618006>, Tx<6>, BlockSize<13966>, GasUsed<20945233>, InvalidTxs<0>, lastRun<399ms>, RunTx<ApplyBlock<1052ms>, abci<399ms>, persist<651ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128533>, rdb<23743>, rdbTs<5791ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<395ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:55.426][1007860] CommitSchedule. module=iavl Height=15618000 Tree=evm IavlHeight=33 NodeNum=988161 trc="commitSchedule<15058ms>, cacheNode<966ms>, Pruning<7562ms>, batchSet<1180ms>, batchCommit<5349ms>"
I[2022-12-04|21:50:55.426][1007860] Height<15618007>, Tx<13>, BlockSize<32461>, GasUsed<42751153>, InvalidTxs<0>, lastRun<703ms>, RunTx<ApplyBlock<6471ms>, abci<704ms>, persist<5767ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<243135>, rdb<47694>, rdbTs<8619ms>, savenode<988161>], DeliverTxs[RunAnte<1ms>, RunMsg<699ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 2): dial tcp 111.200.241.59:7270: i/o timeout. module=pex [email protected]:7270
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 3): dial tcp 47.90.29.31:56925: i/o timeout. module=pex [email protected]:56925
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 1): dial tcp 43.129.73.94:43756: i/o timeout. module=pex [email protected]:43756
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 1): dial tcp 8.218.77.5:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:55.629][1007860] dialing failed (attempts: 1): dial tcp 3.135.138.205:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:55.786][1007860] Stopping peer for error. module=p2p peer="Peer{MConn{35.74.8.189:26656} c8f32b793871b56a11d94336d9ce6472f893524b out}" err=EOF
I[2022-12-04|21:50:56.257][1007860] Height<15618008>, Tx<7>, BlockSize<15705>, GasUsed<21262131>, InvalidTxs<1>, lastRun<366ms>, RunTx<ApplyBlock<814ms>, abci<367ms>, persist<446ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<129071>, rdb<24110>, rdbTs<5017ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<362ms>, refund<0ms>]. module=main 
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 3.37.121.32:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 4): dial tcp 54.150.183.225:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 2): dial tcp 13.214.12.163:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 4): dial tcp 3.37.251.158:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 52.221.126.186:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 35.74.98.204:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 3.64.37.17:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 54.151.166.67:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 13.213.145.109:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 4): dial tcp 52.78.236.126:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 5): dial tcp 54.248.224.222:26656: i/o timeout. module=pex [email protected]:26656
E[2022-12-04|21:50:56.787][1007860] dialing failed (attempts: 2): dial tcp 54.180.61.142:26656: i/o timeout. module=pex [email protected]:26656
I[2022-12-04|21:50:57.113][1007860] Height<15618009>, Tx<4>, BlockSize<13486>, GasUsed<20657995>, InvalidTxs<0>, lastRun<325ms>, RunTx<ApplyBlock<798ms>, abci<325ms>, persist<472ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128139>, rdb<23678>, rdbTs<3930ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<322ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:58.148][1007860] Height<15618010>, Tx<5>, BlockSize<10180>, GasUsed<14305694>, InvalidTxs<0>, lastRun<262ms>, RunTx<ApplyBlock<1020ms>, abci<263ms>, persist<756ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<89250>, rdb<15676>, rdbTs<11898ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<259ms>, refund<0ms>]. module=main 
I[2022-12-04|21:50:58.968][1007860] Height<15618011>, Tx<4>, BlockSize<13494>, GasUsed<20657995>, InvalidTxs<0>, lastRun<302ms>, RunTx<ApplyBlock<806ms>, abci<303ms>, persist<502ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<127891>, rdb<23747>, rdbTs<5538ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<300ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:00.013][1007860] Height<15618012>, Tx<9>, BlockSize<21490>, GasUsed<28213316>, InvalidTxs<1>, lastRun<445ms>, RunTx<ApplyBlock<1031ms>, abci<446ms>, persist<584ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<167136>, rdb<31408>, rdbTs<6283ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<442ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:00.961][1007860] Height<15618013>, Tx<7>, BlockSize<14439>, GasUsed<37739646>, InvalidTxs<0>, lastRun<419ms>, RunTx<ApplyBlock<934ms>, abci<419ms>, persist<513ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<168684>, rdb<31110>, rdbTs<6028ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<414ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:02.302][1007860] Height<15618014>, Tx<8>, BlockSize<21935>, GasUsed<34491069>, InvalidTxs<0>, lastRun<606ms>, RunTx<ApplyBlock<1327ms>, abci<607ms>, persist<719ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<204902>, rdb<39193>, rdbTs<7290ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<603ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:03.131][1007860] Height<15618015>, Tx<9>, BlockSize<18821>, GasUsed<21305044>, InvalidTxs<1>, lastRun<414ms>, RunTx<ApplyBlock<814ms>, abci<414ms>, persist<397ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128472>, rdb<23260>, rdbTs<3548ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<411ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:04.174][1007860] Height<15618016>, Tx<12>, BlockSize<19537>, GasUsed<45179910>, InvalidTxs<0>, lastRun<437ms>, RunTx<ApplyBlock<1028ms>, abci<438ms>, persist<587ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<206645>, rdb<38982>, rdbTs<7202ms>, savenode<0>], DeliverTxs[RunAnte<1ms>, RunMsg<433ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:04.948][1007860] Height<15618017>, Tx<6>, BlockSize<16968>, GasUsed<21270964>, InvalidTxs<0>, lastRun<346ms>, RunTx<ApplyBlock<757ms>, abci<346ms>, persist<409ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128894>, rdb<23534>, rdbTs<3522ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<342ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:05.722][1007860] Height<15618018>, Tx<7>, BlockSize<16313>, GasUsed<21245151>, InvalidTxs<1>, lastRun<335ms>, RunTx<ApplyBlock<760ms>, abci<335ms>, persist<424ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<128857>, rdb<23399>, rdbTs<3755ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<332ms>, refund<0ms>]. module=main 
I[2022-12-04|21:51:06.016][1007860] Height<15618019>, Tx<2>, BlockSize<5964>, GasUsed<6987851>, InvalidTxs<0>, lastRun<98ms>, RunTx<ApplyBlock<280ms>, abci<98ms>, persist<179ms>>, MempoolTxs<0>, Workload<0.88|0.88|0.88|0.88>, Iavl[getnode<47309>, rdb<7737>, rdbTs<1249ms>, savenode<0>], DeliverTxs[RunAnte<0ms>, RunMsg<95ms>, refund<0ms>]. module=main

dandavid3000 avatar Dec 04 '22 20:12 dandavid3000

Hi @dandavid3000

I tested multiple times with different exchaind. The latest one is v1.6.5.10 and mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz

Could you use the latest data snapshot, mainnet-s0-fss-20221205-15769737-rocksdb.tar.gz? It will help with syncing.

Start cmd

export EXCHAIND_PATH=/mnt/980/okex/.exchaind/mainnet-s0-fss-20221127-15594367-rocksdb

exchaind start --rest.laddr "tcp://localhost:38345" --wsport 38346 --db_backend rocksdb --chain-id exchain-66 --home ${EXCHAIND_PATH}

If you run your node as a validator, you should set --node-mode=val. If you run the node as RPC, we'd better use the s1 data: mainnet-s1-fss-20221204-15764063-rocksdb.tar.gz. If you want to lower your memory usage, there are some ways to help:

  1. lower --rocksdb.opts max_open_files; currently there is no limit, so I think we can try setting it to 30000 first;
  2. lower --iavl-cache-size and --iavl-fast-storage-cache-size; the default value is 10000000;
  3. lower --commit-gap-height; the default value is 100;
  4. using tcmalloc may help:

how to install tcmalloc with OKC:

1. cd exchain
2. make tcmalloc
3. make mainnet OKCMALLOC=tcmalloc
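Putting the flag suggestions together, a lower-memory start command could be sketched as follows (the specific values below are only illustrative starting points, not tested recommendations, and the home path is an assumption):

exchaind start \
  --db_backend rocksdb \
  --chain-id exchain-66 \
  --rocksdb.opts max_open_files=30000 \
  --iavl-cache-size=1000000 \
  --iavl-fast-storage-cache-size=1000000 \
  --commit-gap-height=50 \
  --home /data_folder/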

giskook avatar Dec 05 '22 00:12 giskook

Hi @dandavid3000

I tested multiple times with different exchaind. The latest one is v1.6.5.10 and mainnet-s0-fss-20221127-15594367-rocksdb.tar.gz

Could you use the latest data snapshot, mainnet-s0-fss-20221205-15769737-rocksdb.tar.gz? It will help with syncing.

Start cmd

export EXCHAIND_PATH=/mnt/980/okex/.exchaind/mainnet-s0-fss-20221127-15594367-rocksdb

exchaind start --rest.laddr "tcp://localhost:38345" --wsport 38346 --db_backend rocksdb --chain-id exchain-66 --home ${EXCHAIND_PATH}

If you run your node as a validator, you should set --node-mode=val. If you run the node as RPC, we'd better use the s1 data: mainnet-s1-fss-20221204-15764063-rocksdb.tar.gz. If you want to lower your memory usage, there are some ways to help:

  1. lower --rocksdb.opts max_open_files; currently there is no limit, so I think we can try setting it to 30000 first;
  2. lower --iavl-cache-size and --iavl-fast-storage-cache-size; the default value is 10000000;
  3. lower --commit-gap-height; the default value is 100;
  4. using tcmalloc may help:

how to install tcmalloc with OKC:

1. cd exchain
2. make tcmalloc
3. make mainnet OKCMALLOC=tcmalloc

Thanks for your help. I can confirm that the node is now working well on my side. Here is what I noticed during the sync: memory usage is high and can climb to 40-50 GB of RAM. I tried all the suggested flags above and confirmed they did not help. However, after the sync finished, memory usage dropped back to a reasonable level. Some suggestions for those who have this issue:

  • Download the latest snapshot to shorten the sync time.
  • Use at least NVMe drives for a fast sync; SATA SSDs are really slow with IAVL.

dandavid3000 avatar Dec 17 '22 13:12 dandavid3000