conflux-rust icon indicating copy to clipboard operation
conflux-rust copied to clipboard

Investigate storage performance regression.

Open peilun-conflux opened this issue 4 years ago • 7 comments

Now we can only reach 4000 tps compared with the previous 7000 tps, even after the quick fix #1241.

The profiling shows that it's mostly limited by the storage accesses (get_nonce_and_balance_from_storage) in the transaction pool.

peilun-conflux avatar Apr 12 '20 19:04 peilun-conflux

@yangzhe1990, we should investigate this when we have time. Probably after the first phase main-net release

fanlong avatar Apr 12 '20 23:04 fanlong

#1261 fixes one locking issue.

peilun-conflux avatar Apr 21 '20 05:04 peilun-conflux

@yangzhe1990 Another performance bottleneck is the BTreeMap in node_ref_maps. It consumes more CPU resources in get_cache_info now for a state containing 1 million accounts.

I recall that before adding DeltaMptId we use a Vec and maintain the indices ourselves, so can we still apply this method back to fix this performance issue?

peilun-conflux avatar Apr 21 '20 05:04 peilun-conflux

BTW, changing BTreeMap to hashbrown::HashMap increases the tps from 5500 to 6000, and still introduces a significant overhead.

peilun-conflux avatar Apr 21 '20 05:04 peilun-conflux

The average TPS of the storage benchmark from 10M to 100M on my last run at end Feburary is more than 25000.

But on master now it seems to be less than 23000.

I reverted the most recent changes within storage itself, but don't see too much difference.

yangzhe1990 avatar Apr 21 '20 11:04 yangzhe1990

Oh, the performance degrade of the benchmark is due to debug validation code here https://github.com/Conflux-Chain/conflux-rust/commit/27fb468481e1008c729f78c17f4aa87fa7bf180a

But it wasn't enabled in test actually..

yangzhe1990 avatar Apr 21 '20 11:04 yangzhe1990

Related work: #1365 should improve the TPS. I wonder wnat's the new TPS number?

yangzhe1990 avatar Apr 28 '20 07:04 yangzhe1990