[BUG] Out of memory exception
OS and Environment
Linux, AWS, k8s
GIT commit hash
f348b9a8
Minimum working example / Steps to reproduce
The perf report includes the full list of preconditions.
### 0. Test objective:
- Apply the load in the required volume.
- Apply the load over an extended period of time.
#### 1. Infrastructure
- iroha version: version="2.0.0-rc.1" git_commit_sha="f348b9a8"
- java sdk version: commit: efeb5a233e
- iroha2-perf version: commit: 041736f
- 5 peers
### SPECIAL CONDITION FOR STAND PREPARATION
- We increased the disks for Longlive to 20 GB.
- Ingress initially had more resources (2x horizontally) at the start of the load test to handle the peak load during the sudden start.
- Iroha has priority; Kubernetes relocates pods within the cluster based on priorities. By default, Iroha has priority over other applications. However, services like NGINX have a higher priority, which makes sense. For that test, I increased the priority for our iroha2-test.
#### 2. images/config
- image - testnet-2.0.0-rc.1.f348b9a8 - harbor image
- config - config.txt
- genesis
### PREPARATION LONGEVITY ENV
Access to standard monitoring tools
On the perf generator
git clone https://github.com/soramitsu/iroha2-perf.git &&
cd iroha2-perf &&
git checkout iroha/2_0_0-rc_1/keypair &&
cd performance-generator/ &&
mvn -N io.takari:maven:wrapper &&
./mvnw gatling:test -Dgatling.simulationClass=simulation.transactions.rampConstant.TransferAssetSimulation -DtargetURL= -DremoteLogin= -DremotePassword= -DstartLevelUsers=0 -DendLevelUsers=234 -DrampDuring=4500 -DstageDuration=86400 -DmaxDuration=86401
Actual result
Out of memory exception (OOM)
Resources utilization
performance metrics
Expected result
The load is applied evenly throughout the entire test. There is no excessive CPU or memory utilization.
Logs
Who can help to reproduce?
@timofeevmd @RamilMus
Notes
No response
The issue is that Iroha consumes ~6 GB of memory after 20 million transactions.
This matches the current implementation (https://github.com/hyperledger-iroha/iroha/issues/5083#issuecomment-2379804636).
More than 80% of that memory is consumed by State::transactions, which maps transaction hashes to the height of the block in which they are stored (basically Map<Hash, usize>). State::transactions is a multi-version map with transactional behaviour; currently we use the mv crate. Memory usage could potentially be improved by using a specialized implementation for the transactions map. Here is a comparison of memory usage for various Map<Hash, usize> implementations:
| Map | Potential memory usage, bytes per transaction |
|---|---|
| mv::Storage | 270 |
| mv::Storage with HashMap | 286 |
| rpds::RedBlackTreeMapSync | 112 |
| rpds::HashTrieMapSync | 168 |
| dashmap::DashMap | 69 |
| chashmap::CHashMap | 88 |
| concurrent_map::ConcurrentMap | 64 |
| std::collections::BTreeMap | 64 |
| std::collections::HashMap | 69 |
Only the mv maps can be used directly for our needs. Maps from the rpds crate would give about a 2x memory improvement and can be adapted to our use case relatively easily (since they provide persistent behaviour). The other maps could potentially give a ~3x memory improvement, but require a custom implementation of the multi-version and transactional logic.
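As a rough cross-check, the per-transaction figures from the table can be scaled to the 20 million transactions of this test. The sketch below is only arithmetic on the numbers quoted above; the function names are made up for illustration and are not part of Iroha. At 270 bytes per transaction, mv::Storage comes to about 5.4 GB, which is in line with State::transactions accounting for 80%+ of the ~6 GB observed.

```rust
/// Bytes-per-transaction figures copied from the comparison table above.
const BYTES_PER_TX: [(&str, u64); 9] = [
    ("mv::Storage", 270),
    ("mv::Storage with HashMap", 286),
    ("rpds::RedBlackTreeMapSync", 112),
    ("rpds::HashTrieMapSync", 168),
    ("dashmap::DashMap", 69),
    ("chashmap::CHashMap", 88),
    ("concurrent_map::ConcurrentMap", 64),
    ("std::collections::BTreeMap", 64),
    ("std::collections::HashMap", 69),
];

/// Projected size of the transactions map, in bytes, for `tx_count` transactions.
/// (Hypothetical helper, for illustration only.)
fn projected_bytes(bytes_per_tx: u64, tx_count: u64) -> u64 {
    bytes_per_tx * tx_count
}

/// Improvement factor relative to the current mv::Storage figure (270 bytes).
fn improvement_over_mv(bytes_per_tx: u64) -> f64 {
    270.0 / bytes_per_tx as f64
}

/// Prints, for each map, the projected size in GB and the factor vs mv::Storage.
fn print_projection(tx_count: u64) {
    for &(name, bytes) in BYTES_PER_TX.iter() {
        let gb = projected_bytes(bytes, tx_count) as f64 / 1e9;
        println!("{:<32} {:>5.2} GB  {:.1}x", name, gb, improvement_over_mv(bytes));
    }
}
```

Calling print_projection(20_000_000) prints one line per map. These are raw map figures; a custom multi-version layer on top of a plain map would add some overhead of its own.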
So the plan is to implement a custom solution based on a concurrent map with low memory usage (I think dashmap::DashMap is a good choice) and, if that turns out not to be possible, to implement a simpler solution using rpds.
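To make the "custom solution" idea a bit more concrete, here is a minimal sketch of transactional behaviour layered over a plain map. It is not the Iroha implementation: all type and method names are hypothetical, and RwLock<HashMap> is used only to keep the example std-only where the plan above would use a concurrent map such as dashmap::DashMap. It covers a single uncommitted block on top of the committed state (commit or revert), not full multi-version snapshots for readers.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for a 32-byte transaction hash.
type Hash = [u8; 32];

/// Committed view: transaction hash -> height of the block that stores it.
/// `RwLock<HashMap>` is an assumption made to stay std-only.
#[derive(Default)]
struct Transactions {
    committed: RwLock<HashMap<Hash, usize>>,
}

/// Uncommitted writes of one block, layered over the committed view.
struct TransactionsBlock<'a> {
    base: &'a Transactions,
    pending: HashMap<Hash, usize>,
}

impl Transactions {
    /// Starts a block-scoped write set; nothing is visible to readers yet.
    fn block<'a>(&'a self) -> TransactionsBlock<'a> {
        TransactionsBlock { base: self, pending: HashMap::new() }
    }

    /// Reader view: sees only committed entries.
    fn get(&self, hash: &Hash) -> Option<usize> {
        self.committed.read().unwrap().get(hash).copied()
    }
}

impl<'a> TransactionsBlock<'a> {
    /// Records a transaction in the pending set only.
    fn insert(&mut self, hash: Hash, height: usize) {
        self.pending.insert(hash, height);
    }

    /// Writer view: pending entries shadow committed ones.
    fn get(&self, hash: &Hash) -> Option<usize> {
        self.pending.get(hash).copied().or_else(|| self.base.get(hash))
    }

    /// Publishes all pending entries at once under the write lock.
    fn commit(self) {
        let TransactionsBlock { base, pending } = self;
        base.committed.write().unwrap().extend(pending);
    }
    // Dropping the block without calling `commit` discards `pending` (revert).
}
```

With this layout the memory cost per committed transaction is that of the underlying map alone; the pending set only ever holds the transactions of the block being applied.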