Full node takes up more space than archive node

Open pingxab opened this issue 1 year ago • 10 comments

Description

We deployed two nodes for high availability: one archive node and one full node (snap sync; we tried both Forest and Bonsai). The archive node's data takes up only 3.9G, while the full node takes up 8.7G with Forest and 5.6G with Bonsai. This is very odd, since we expected a full node to save space.

Acceptance Criteria

  • full node should take up less space

Steps to Reproduce (Bug)

  1. Configure an archive node and a full node (snap sync, Bonsai format).
  2. Let the full node connect to the archive node to sync, while the archive node connects to external validators to sync (we may later change both to sync directly from outside).
  3. Run du on the data directory of each node.
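
For anyone who wants to repeat the size comparison in step 3 in a scriptable way, here is a minimal Python sketch. It is not part of Besu; the function names are made up for illustration, and the data directory paths passed on the command line are whatever your nodes use. Note that it sums apparent file sizes, which can differ somewhat from the on-disk block usage that du reports by default.

```python
import os
import sys


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under path.

    Symlinks are skipped so that linked files are not counted twice.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def human(n):
    """Format a byte count using binary units (KiB, MiB, ...)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024


if __name__ == "__main__":
    # Hypothetical usage: python dirsize.py /path/to/archive-data /path/to/full-data
    for p in sys.argv[1:]:
        print(p, human(dir_size_bytes(p)))
```

Running it against both data directories at the same block height gives a like-for-like comparison; comparing nodes at different heights, or while one is still syncing, can be misleading.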

Expected behavior: The full node data directory should be smaller.

Actual behavior: The full node data directory is bigger.

Frequency: Always.

Logs (if a bug)

Please post relevant logs from Besu (and the consensus client, if running proof of stake) from before and after the issue.

Versions (Add all that apply)

  • Software version: 23.10.1
  • Java version: 17.0.6
  • OS Name & Version: Red Hat Enterprise Linux 8.9
  • Kernel Version: 4.18.0-513.11.1.el8_9.x86_64
  • Virtual Machine software & version:
  • Docker Version:
  • Cloud VM, type, size:
  • Consensus Client & Version if using Proof of Stake:

Additional Information (Add any of the following or anything else that may be relevant)

  • Besu setup info - QBFT consensus
  • System info - memory, CPU

pingxab • Jun 18 '24 16:06

Hi there - can you try updating the nodes and seeing if anything changes? We have made some improvements to the database in subsequent versions.

My hunch is that over time, the Archive node will absolutely be larger. We keep more data around in Full nodes to help with block processing performance like caches. Over time, this will not increase linearly, but the Archive node will.

@matkt might also have some insight into this, and also commands you can run to give the size of your database, perhaps.

non-fungible-nelson • Jun 24 '24 23:06

Could you share your configuration (flags, etc.) for each Bonsai test?

matkt • Jun 25 '24 08:06

Could you also run

./bin/besu --data-path=/data/besu storage rocksdb usage

to get more information on your database at each step?

matkt • Jun 25 '24 08:06

@matkt

> Could you also run
>
> ./bin/besu --data-path=/data/besu storage rocksdb usage
>
> to get more information on your database at each step?

Hi, I've uploaded screenshots from the two nodes, along with the configuration.

Full node: [screenshot: full-node-storage]
Archive node: [screenshot: archive-node-storage]

#Sync
sync-mode="X_SNAP"
data-storage-format="BONSAI"
#bonsai-historical-block-limit=256
fast-sync-min-peers=1

pingxab • Jun 28 '24 02:06

> Hi there - can you try updating the nodes and seeing if anything changes? We have made some improvements to the database in subsequent versions.
>
> My hunch is that over time, the Archive node will absolutely be larger. We keep more data around in Full nodes to help with block processing performance like caches. Over time, this will not increase linearly, but the Archive node will.
>
> @matkt might also have some insight into this, and also commands you can run to give the size of your database, perhaps.

Hi @non-fungible-nelson, which version should we update to? Also, do you have any formula or ratio for estimating full node storage versus archive node storage?

pingxab • Jun 28 '24 03:06

> @matkt
>
> > Could you also run
> >
> > ./bin/besu --data-path=/data/besu storage rocksdb usage
> >
> > to get more information on your database at each step?
>
> Hi, I've uploaded screenshots from the two nodes, along with the configuration.
>
> Full node: [screenshot: full-node-storage]
> Archive node: [screenshot: archive-node-storage]
>
> #Sync
> sync-mode="X_SNAP"
> data-storage-format="BONSAI"
> #bonsai-historical-block-limit=256
> fast-sync-min-peers=1

Your screenshots seem to be invalid. The full node doesn't have any state; only the blockchain is saved. The archive node has the columns of a Forest node, and its size seems really small. Is your node syncing?

matkt • Jun 28 '24 09:06

It would be helpful if you shared the logs from when your Bonsai nodes start up, so we can be sure you have the right configuration.

matkt • Jun 28 '24 09:06

> It would be helpful if you shared the logs from when your Bonsai nodes start up, so we can be sure you have the right configuration.

Hi matkt, thanks for responding. According to the eth_syncing API call, the archive node returns false, but in fact it is always importing blocks from an external source. The full node (which follows the archive node) shows that it is always syncing, with start, current, and highest blocks. So in our scenario, the archive node is always ahead of the full node.
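
As a rough illustration of the eth_syncing check described above, the following Python sketch builds the JSON-RPC request and turns the result into a readable status. The helper names are invented for this example, and the URL you pass to query_sync is an assumption about where your node exposes HTTP RPC. The field names startingBlock, currentBlock, and highestBlock are the ones defined for eth_syncing in the Ethereum JSON-RPC API; a node may return additional fields, which this sketch ignores.

```python
import json
import urllib.request


def eth_syncing_request(request_id=1):
    """Build the JSON-RPC 2.0 request body for eth_syncing."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": "eth_syncing", "params": [], "id": request_id}
    )


def describe_sync(result):
    """Summarise an eth_syncing result.

    The result is either False (not syncing) or an object whose block
    numbers are hex-encoded strings.
    """
    if result is False:
        return "not syncing"
    start = int(result["startingBlock"], 16)
    current = int(result["currentBlock"], 16)
    highest = int(result["highestBlock"], 16)
    return (
        f"syncing: block {current} of {highest} "
        f"(started at {start}, {highest - current} behind)"
    )


def query_sync(url):
    """POST the request to a node's HTTP RPC endpoint and return the result."""
    req = urllib.request.Request(
        url,
        data=eth_syncing_request().encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["result"]
```

For example, describe_sync(query_sync(url)) could be run against both nodes in turn. A result of false only means the node does not consider itself to be syncing, so it is worth also comparing block heights between the two nodes.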

I've uploaded the full node log. We configured log rolling, so this is the current log file: besu.log

pingxab • Jun 28 '24 10:06

Thanks, but I need more logs. When you restart your node, you should see something like this:

####################################################################################################
#                                                                                                  #
# Besu version 24.6.0                                                                              #
#                                                                                                  #
# Configuration:                                                                                   #
# Network: Mainnet                                                                                 #
# Network Id: 1                                                                                    #
# Data storage: Bonsai                                                                             #
# Sync mode: Checkpoint                                                                            #
# RPC HTTP APIs: FLEET,TRACE,ADMIN,DEBUG,NET,ETH,WEB3,TXPOOL                                       #
# RPC HTTP port: 8545                                                                              #
# Engine APIs: ENGINE,ETH                                                                          #
# Engine port: 8551                                                                                #
# Engine JWT: /etc/jwt-secret.hex                                                                  #
# Using LAYERED transaction pool implementation                                                    #
# Using STACKED worldstate update mode                                                             #
# Limit trie logs enabled: retention: 512; prune window: 30000                                     #
#                                                                                                  #
# Host:                                                                                            #
# Java: openjdk-java-21                                                                            #
# Maximum heap size: 3.90 GB                                                                       #
# OS: linux-x86_64                                                                                 #
# glibc: 2.35                                                                                      #
# jemalloc: 5.2.1-0-gea6b3e973b477b8061e0076bb257dbd7f3faa756                                      #
# Total memory: 15.60 GB                                                                           #
# CPU cores: 4                                                                                     #
#                                                                                                  #
# Plugin Registration Summary:                                                                     #
####################################################################################################

Also, judging from the log you shared, your node isn't syncing at all.

Are you running a QBFT network? If you want to use snap sync with a QBFT network, there is a PR to enable that: https://github.com/hyperledger/besu/pull/7140

For the moment, you can use fast sync if you want to sync quickly.

matkt • Jun 28 '24 12:06

Hi @matkt, sorry for the late update. As mentioned, we used the two nodes above to follow a block-producing network of 4 QBFT nodes: one is an archive node, and the other uses the snap sync Bonsai configuration. We want to compare an archive node with a full node on private chains, to see how much storage a full node can save versus an archive node. I've attached the full node start log for diagnosis, thanks. I also tried fast sync; it did sync fast, but its storage was also a little higher than the archive node's. Also, what's the difference between fast sync and snap sync? Thanks!

Attachment: full node start log.txt

pingxab • Jul 02 '24 15:07

This issue is stale because it has been open for 6 months with no activity.

github-actions[bot] • Dec 30 '24 02:12

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] • Jan 13 '25 02:01