Excessive resource utilization when ethstats reporting is enabled
Describe the bug Nethermind is using excessive resources
To Reproduce Steps to reproduce the behavior:
- Setup an AuRa network such xdaichain/posdao-testnet-prepare
- Before launching the nodes for the first time (
./run_all.sh), segregate the nodes into a different server for each one.- Modify each node's
docker-compose.ymland updateNETHERMIND_NETWORKCONFIG_EXTERNALIPandNETHERMIND_ETHSTATSCONFIG_SERVERenvironment variables - Optionally modify
run_all.shandstop_all.shto only run/stop the specific node or ethstats for each server
- Modify each node's
- Start nodes
- On archive server, deploy staking contract
cd .. # go back to the `compose` directoryanddocker-compose start && docker-compose logs -f - Observe system load. This is the baseline with ethstats reporting enabled
- Stop the nodes
- Modify each node's
docker-compose.ymland updateNETHERMIND_ETHSTATSCONFIG_ENABLEDto false - Observe system load. This is with ethstats reporting disabled
Screenshots
Here you can see some system metrics with ethstats reporting enabled
Focus on system load average
The only outlier is validator1 which seems to report the wrong information yet has the least resource utilization
Here you can see some system metrics with ethstats reporting disabled

System Information:
- Docker host: Ubuntu 20.04
- Docker container images: nethermind/nethermind:1.12.6
Additional notes I cannot really explain the discrepancy between validator1 and the other validators in the ethstats enabled scenario. All validators have almost identical environment variables set. They only differ on node name, keys and external ip. Each node lives on it own VM, each on a different host machine, some host machines on a different datacenters. All VMs have identical specs.
In the ethstats enabled scenario, validators are starting to struggle to parse/produce blocks on time and this results in lots of benign AuRa reports
Logs in the ethstats disabled scenario:

Logs in the ethstats enabled scenario

I forgot to mention the VM orientation shown in my screenshots:
| archive | validator1 | validator2 |
|---|---|---|
| validator3 | validator4 | validator5 |
@nicexe Did you try that with v1.12.3?
Hi @varasev thanks for the reply. I just tried 1.12.3 but the nodes cannot start anymore and I'm getting those 2 errors:
2022-03-24 13:55:05.4288|ERROR|13|Step InitializeBlockchainAuRa failed after 151ms Nethermind.Trie.TrieException: Node 0x3adfe602f80cfa36db0ea3b41a64819071037a685399bf9381508d255e8d9f71 is missing from the DB
at Nethermind.Trie.Pruning.TrieStore.LoadRlp(Keccak keccak, IKeyValueStore keyValueStore) in /src/Nethermind/Nethermind.Trie/Pruning/TrieStore.cs:line 331
at Nethermind.Trie.Pruning.ReadOnlyTrieStore.LoadRlp(Keccak hash) in /src/Nethermind/Nethermind.Trie/Pruning/ReadOnlyTrieStore.cs:line 41
at Nethermind.Trie.TrieNode.ResolveNode(ITrieNodeResolver tree) in /src/Nethermind/Nethermind.Trie/TrieNode.cs:line 234
at Nethermind.Trie.PatriciaTree.Run(Span`1 updatePath, Int32 nibblesCount, Byte[] updateValue, Boolean isUpdate, Boolean ignoreMissingDelete, Keccak startRootHash) in /src/Nethermind/Nethermind.Trie/PatriciaTree.cs:line 401
at Nethermind.Trie.PatriciaTree.Get(Span`1 rawKey, Keccak rootHash) in /src/Nethermind/Nethermind.Trie/PatriciaTree.cs:line 319
at Nethermind.State.StateTree.Get(Address address, Keccak rootHash) in /src/Nethermind/Nethermind.State/StateTree.cs:line 49
at Nethermind.State.StateProvider.GetState(Address address) in /src/Nethermind/Nethermind.State/StateProvider.cs:line 659
at Nethermind.State.StateProvider.GetAndAddToCache(Address address) in /src/Nethermind/Nethermind.State/StateProvider.cs:line 674
at Nethermind.State.StateProvider.GetThroughCache(Address address) in /src/Nethermind/Nethermind.State/StateProvider.cs:line 695
at Nethermind.State.StateProvider.GetCodeHash(Address address) in /src/Nethermind/Nethermind.State/StateProvider.cs:line 311
at Nethermind.State.IReadOnlyStateProvider.IsContract(Address address) in /src/Nethermind/Nethermind.State/IReadOnlyStateProvider.cs:line 39
at Nethermind.Evm.TransactionProcessing.ReadOnlyTransactionProcessor.IsContractDeployed(Address address) in /src/Nethermind/Nethermind.Evm/TransactionProcessing/ReadOnlyTransactionProcessor.cs:line 55
at Nethermind.Blockchain.Contracts.Contract.ConstantContract.CallRaw(CallInfo callInfo, IReadOnlyTransactionProcessor readOnlyTransactionProcessor) in /src/Nethermind/Nethermind.Blockchain/Contracts/Contract.ConstantContract.cs:line 111
at Nethermind.Blockchain.Contracts.Contract.ConstantContract.Call(CallInfo callInfo) in /src/Nethermind/Nethermind.Blockchain/Contracts/Contract.ConstantContract.cs:line 104
at Nethermind.Blockchain.Contracts.Contract.IConstantContract.Call[T](CallInfo callInfo) in /src/Nethermind/Nethermind.Blockchain/Contracts/Contract.ConstantContract.cs:line 43
at Nethermind.Blockchain.Contracts.Contract.IConstantContract.Call[T](BlockHeader parentHeader, String functionName, Address sender, Object[] arguments) in /src/Nethermind/Nethermind.Blockchain/Contracts/Contract.ConstantContract.cs:line 52
at Nethermind.Consensus.AuRa.Contracts.ValidatorContract.GetValidators(BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Contracts/ValidatorContract.cs:line 104
at Nethermind.Consensus.AuRa.Validators.ContractBasedValidator.LoadValidatorsFromContract(BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/ContractBasedValidator.cs:line 246
at Nethermind.Consensus.AuRa.Validators.ContractBasedValidator.SetFinalizationManager(IAuRaBlockFinalizationManager finalizationManager, BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/ContractBasedValidator.cs:line 84
at Nethermind.Consensus.AuRa.Validators.ContractBasedValidator..ctor(IValidatorContract validatorContract, IBlockTree blockTree, IReceiptFinder receiptFinder, IValidatorStore validatorStore, IValidSealerStrategy validSealerStrategy, IAuRaBlockFinalizationManager finalizationManager, BlockHeader parentHeader, ILogManager logManager, Int64 startBlockNumber, Int64 posdaoTransition, Boolean forSealing) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/ContractBasedValidator.cs:line 71
at Nethermind.Consensus.AuRa.AuRaValidatorFactory.<CreateValidatorProcessor>g__GetContractBasedValidator|19_2(<>c__DisplayClass19_0& ) in /src/Nethermind/Nethermind.Consensus.AuRa/AuRaValidatorFactory.cs:line 104
at Nethermind.Consensus.AuRa.AuRaValidatorFactory.CreateValidatorProcessor(Validator validator, BlockHeader parentHeader, Nullable`1 startBlock) in /src/Nethermind/Nethermind.Consensus.AuRa/AuRaValidatorFactory.cs:line 131
at Nethermind.Consensus.AuRa.Validators.MultiValidator.CreateValidator(Int64 finalizedAtBlockNumber, Validator validatorPrototype, BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/MultiValidator.cs:line 222
at Nethermind.Consensus.AuRa.Validators.MultiValidator.SetCurrentValidator(Int64 finalizedAtBlockNumber, Validator validatorPrototype, BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/MultiValidator.cs:line 204
at Nethermind.Consensus.AuRa.Validators.MultiValidator.SetCurrentValidator(KeyValuePair`2 validatorInfo, BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/MultiValidator.cs:line 196
at Nethermind.Consensus.AuRa.Validators.MultiValidator.InitCurrentValidator(Int64 blockNumber, BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/MultiValidator.cs:line 75
at Nethermind.Consensus.AuRa.Validators.MultiValidator.SetFinalizationManager(IAuRaBlockFinalizationManager finalizationManager, BlockHeader parentHeader) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/MultiValidator.cs:line 186
at Nethermind.Consensus.AuRa.Validators.MultiValidator..ctor(Validator validator, IAuRaValidatorFactory validatorFactory, IBlockTree blockTree, IValidatorStore validatorStore, IAuRaBlockFinalizationManager finalizationManager, BlockHeader parentHeader, ILogManager logManager, Boolean forSealing) in /src/Nethermind/Nethermind.Consensus.AuRa/Validators/MultiValidator.cs:line 66
at Nethermind.Consensus.AuRa.AuRaValidatorFactory.CreateValidatorProcessor(Validator validator, BlockHeader parentHeader, Nullable`1 startBlock) in /src/Nethermind/Nethermind.Consensus.AuRa/AuRaValidatorFactory.cs:line 145
at Nethermind.Consensus.AuRa.InitializationSteps.InitializeBlockchainAuRa.CreateAuRaValidator(IBlockProcessor processor, IReadOnlyTxProcessorSource readOnlyTxProcessorSource) in /src/Nethermind/Nethermind.Consensus.AuRa/InitializationSteps/InitializeBlockchainAuRa.cs:line 131
at Nethermind.Consensus.AuRa.InitializationSteps.InitializeBlockchainAuRa.CreateBlockProcessor() in /src/Nethermind/Nethermind.Consensus.AuRa/InitializationSteps/InitializeBlockchainAuRa.cs:line 94
at Nethermind.Init.Steps.InitializeBlockchain.InitBlockchain() in /src/Nethermind/Nethermind.Init/Steps/InitializeBlockchain.cs:line 220
at Nethermind.Init.Steps.InitializeBlockchain.Execute(CancellationToken _) in /src/Nethermind/Nethermind.Init/Steps/InitializeBlockchain.cs:line 69
2022-03-24 13:55:05.4476|ERROR|10|Error during ethereum runner start Nethermind.Init.Steps.StepDependencyException: BlockchainProcessor
at Nethermind.Init.Steps.StartBlockProcessor.Execute(CancellationToken _) in /src/Nethermind/Nethermind.Init/Steps/StartBlockProcessor.cs:line 37
at Nethermind.Init.Steps.EthereumStepsManager.RunOneRoundOfInitialization(CancellationToken cancellationToken) in /src/Nethermind/Nethermind.Init/Steps/EthereumStepsManager.cs:line 140
at Nethermind.Init.Steps.EthereumStepsManager.InitializeAll(CancellationToken cancellationToken) in /src/Nethermind/Nethermind.Init/Steps/EthereumStepsManager.cs:line 108
at Nethermind.Runner.Ethereum.EthereumRunner.Start(CancellationToken cancellationToken) in /src/Nethermind/Nethermind.Runner/Ethereum/EthereumRunner.cs:line 51
I'll try keeping all nodes to latest except one where I would configure it to use 1.12.3 and sync from 0
@varasev 1.12.3 definitely helped
system load

ethstats

I restored the archive node and validators from 1 to 4 and configured them to use 1.12.6. I removed all data for validator 5 and configured it to use 1.12.3.
After validator 5 synced with the rest of the nodes I let it rest for a few minutes and then checked the system load as seen in the screenshot.
Validator 5 has a system load average of about 1 on a 4 core VM and the other nodes have a system load average of about 8-10 on a 4 core VM each.
Again, the only outlier is validator 1 which has significantly lower CPU utilization and system load average of about 2 on a 4 core VM but it is also the only node reporting invalid stats to the ethstat server.
Hi @nicexe looking into your issue, if 1.12.3 solves it then there is slight chance that the actual reason is different and related to: https://github.com/NethermindEth/nethermind/pull/3901
@nicexe Can you try to run the setup using the nethermindeth/nethermind:local_txs_broadcasting docker image? It is the build from https://github.com/NethermindEth/nethermind/pull/3901
Also pushed #3906
I tried to use nethermindeth/nethermind:local_txs_broadcasting on validator 5. This brings the VM that validator 5 runs on to a system load of about 9 and 100% CPU utilization just like the rest of the nodes as opposed to a system load of about 1 when using 1.12.3
Also pushed #3906
@LukaszRozmej Can you please make a docker image for this?
@nicexe I fail to actually reproduce this:

Would you be able to provide a performance trace from one of the misbehaving nodes? I am inquiring our devops on how to do that on nodes running inside docker as that is more complicated.
@matilote would you be able to provide a docker image + instructions on how to use dottrace here?
@nicexe can you help us gather data here?
- Please use image nethermindeth/nethermind:test
- After running it on one or multiple misbehaving nodes please do:
sudo docker exec -it xdaitestnet-validatorX /bin/bashmkdir dumps/opt/tools/dottrace attach PID --save-to=dumps - After some time like a minute or two close dottrace with Ctrl+C, it will take the snapshot and save it to dumps folder
- Now you need to send us the dump, you can do it different ways. one potential is:
apt updateapt install curlapt install zipzip -9 -r perf.zip dumpcurl --upload-file ./perf.zip https://transfer.sh/perf.zipThe last one would give you url with the file being available for download.
@nicexe we found one potential issue with our ethstats integration. It could be connected with your problem. We will release the fix soon. The fix is merged to master
#4249