hedera-services

Performance regression in transaction handling

Open OlegMazurov opened this issue 10 months ago • 2 comments

Description

Transaction handling has always been the main bottleneck under the heavy load generated by the NFT transfer benchmark. The handling thread, consensusRoundHandler, now spends significant time sleeping in BackpressureObjectCounter.onRamp() and even polling in SequentialThreadTaskScheduler.run() (i.e., idle). The latter may be a consequence of the client's reaction to the low throughput caused by the former. A likely cause of the regression is #12744, which introduced the wiring framework into transaction handling.
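To illustrate the kind of stall described above, here is a minimal sketch of a capacity-bounded counter whose onRamp() sleeps while the downstream queue is full. It is not the platform's BackpressureObjectCounter: the class name SketchBackpressureCounter, the 1 ms sleep interval, and the returned wait time are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch only; NOT the real BackpressureObjectCounter.
final class SketchBackpressureCounter {
    private final long capacity;
    private final AtomicLong count = new AtomicLong();

    SketchBackpressureCounter(long capacity) {
        this.capacity = capacity;
    }

    /** Reserves one slot, sleeping while the counter is full. Returns nanoseconds spent waiting. */
    long onRamp() {
        final long start = System.nanoTime();
        while (true) {
            final long current = count.get();
            if (current < capacity) {
                if (count.compareAndSet(current, current + 1)) {
                    return System.nanoTime() - start;
                }
                continue; // lost a race with another thread; retry immediately
            }
            try {
                Thread.sleep(1); // assumed back-off interval while at capacity
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for capacity", e);
            }
        }
    }

    /** Releases one slot; called by the consumer when it finishes a task. */
    void offRamp() {
        count.decrementAndGet();
    }

    long inFlight() {
        return count.get();
    }

    public static void main(String[] args) throws InterruptedException {
        SketchBackpressureCounter counter = new SketchBackpressureCounter(1);
        counter.onRamp(); // fills the only slot
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate a slow downstream component
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            counter.offRamp();
        });
        consumer.start();
        long waitedNanos = counter.onRamp(); // the caller stalls here until offRamp()
        consumer.join();
        System.out.println("caller waited ~" + waitedNanos / 1_000_000 + " ms in onRamp()");
    }
}
```

In this sketch, any time the calling thread spends inside onRamp() is time it cannot spend handling transactions, which is the pattern a profiler would report as sleeping.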

Steps to reproduce

Run the NFT transfer benchmark for a day. During the run, the platform reports PLATFORM_NOT_ACTIVE (new behavior) and TPS drops below 1K.

Additional context

No response

Hedera network

other

Version

develop

Operating system

Linux

OlegMazurov avatar Apr 29 '24 15:04 OlegMazurov

Another candidate for the regression is #12139, which added new code to HandleWorkflow.validate(): calls to TransactionDispatcher.dispatchComputeFees() and NetworkUtilizationManagerImpl.trackTxn(). These calls account for an additional >3% of transaction handling time.

OlegMazurov avatar May 07 '24 04:05 OlegMazurov

After #12744, ConsensusRoundHandler.updateRunningEventHash() accounts for 5% of transaction handling time (it used to be negligible). Most of that time is spent in ConsensusRound.getRunningEventHash(), waiting for runningEventHashFuture. Computing the hash synchronously is faster.
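The difference between the two approaches can be sketched as follows. This is an illustration under stated assumptions, not the platform code: the class RunningHashSketch, its methods, and the choice of SHA-384 are hypothetical. The asynchronous path produces the same bytes, but the caller pays for a thread handoff and blocks until the future completes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch only; names and hash algorithm are assumptions.
final class RunningHashSketch {

    /** Computes hash(previous || event) directly on the calling thread. */
    static byte[] hashSync(byte[] previous, byte[] event) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-384");
            digest.update(previous);
            digest.update(event);
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Schedules the same computation on another thread and returns a future. */
    static CompletableFuture<byte[]> hashAsync(byte[] previous, byte[] event, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> hashSync(previous, event), pool);
    }

    public static void main(String[] args) {
        byte[] previous = new byte[48];
        byte[] event = "event".getBytes(StandardCharsets.UTF_8);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            long t0 = System.nanoTime();
            byte[] sync = hashSync(previous, event);
            long t1 = System.nanoTime();
            // join() blocks the caller until the worker thread finishes.
            byte[] async = hashAsync(previous, event, pool).join();
            long t2 = System.nanoTime();
            System.out.println("same result: " + Arrays.equals(sync, async));
            System.out.println("sync ns: " + (t1 - t0) + ", async ns: " + (t2 - t1));
        } finally {
            pool.shutdown();
        }
    }
}
```

Timings from a single call like this are noisy and only hint at the handoff cost; profiling under the benchmark load, as done above, is the meaningful measurement.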

OlegMazurov avatar May 10 '24 16:05 OlegMazurov

@Cody, please disable the debug feature (ConsensusRoundHandler.updateRunningEventHash())

poulok avatar May 20 '24 16:05 poulok

@poulok

@cody, please disable the debug feature (ConsensusRoundHandler.updateRunningEventHash())

This is currently disabled in 0.50 and develop. @OlegMazurov, is it safe to close this ticket?

cody-littley avatar May 27 '24 18:05 cody-littley