Performance optimization low-hanging fruit
I'm creating this issue to list multiple easy refactors that could be implemented to improve the performance of the VM. Some of them come from EthWorks' involvement. Others are mine.
- [ ] `Common#param` should be an O(1) operation. It's called an incredible amount of times. Analysis in https://github.com/ethereumjs/ethereumjs-monorepo/issues/775#issuecomment-860545856
- [x] The VM's `Memory` class should be implemented with a Buffer
- [x] `jumpIsValid` is a linear operation. It can be constant. Update: after talking with Jochem and reading evmone's code, I'm not sure if this is possible nor beneficial.
- [x] There's a huge number of unnecessary calls to `toString()`. Sometimes it is used instead of `Buffer#equals`. Sometimes it's called to initialize a `BN` from a `Buffer` instead of just passing the `Buffer`. There are obvious instances of this in these functions:
  - `Block#validateTransactionsTrie`
  - `Block#validateUnclesHash`
  - `Account#isContract`
  - `Account#setCode`
  - `Account#isEmpty`
  - `FakeTransaction#hash`
  - `Transaction#toCreationAddress`
  - `EEI#getExternalBalance`
  - `PUSH`'s handler: creates a string to create a `BN` instead of just creating the `BN`
  - `VM#runBlock`: multiple comparisons of Buffers by string
  - `VM#runCall`
  - `VM#runTx`: calls `toString` on a Buffer instead of just checking its size
  - `DefaultStateManager#accountIsEmpty`
- [x] The opcodes `handlers` map is indexed by string; it should be indexed by number.
- [x] The opcodes `handlers` map and the `opcodes` `OpcodeList` one should probably be `Map`s. The semantics of `Map` are way simpler than accessing an object with `object[key]`. Done in #852 (see the sketch after this list)
- [x] `Transaction#getDataFee` is called very frequently and is super slow:
  - [x] It should be cached
  - [x] It calls `Common#param` (which is really slow) within a loop; this could be done outside of it.
  - [x] It uses `BN`s in a loop while it could use normal numbers and only transform the result to a `BN`. `BN` operations allocate new objects.
- [x] `Transaction#hash` should be cached
- [x] `BlockHeader#hash` should be cached
- [ ] The VM's `getOpcodesForHF` can be cached. It's called a lot and iterates over a 256-element map.
- [ ] The VM `ERROR` enum should not have string values, as those are compared very frequently.
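To make the opcode-dispatch and `ERROR` items above concrete, here is a minimal sketch of number-keyed dispatch combined with a numeric error enum. The types, handlers, and names below are invented for illustration and are not the actual VM code:

```ts
// Hypothetical minimal types, only to show the shape of the change.
type RunState = { stack: bigint[] }
type OpcodeHandler = (runState: RunState) => void

// Keyed by the numeric opcode instead of its string name, and stored in a
// Map so every dispatch is a single numeric lookup.
const handlers = new Map<number, OpcodeHandler>([
  [0x01, (rs) => { const b = rs.stack.pop()!; const a = rs.stack.pop()!; rs.stack.push(a + b) }], // ADD
  [0x50, (rs) => { rs.stack.pop() }], // POP
])

// A numeric error enum: comparisons become integer comparisons, not string ones.
enum VmError {
  OUT_OF_GAS,
  STACK_UNDERFLOW,
  INVALID_OPCODE,
}

function step(opcode: number, runState: RunState): void {
  const handler = handlers.get(opcode)
  if (handler === undefined) {
    throw new Error(`VM error ${VmError.INVALID_OPCODE}`)
  }
  handler(runState)
}
```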
> `Common#param` should be an O(1) operation. It's called an incredible amount of times.
My intuition is that Common configs should be written in JS data structures, not JSON. You can then more easily arrange them to allow easy access. Generally I think Common is a bit over-abstracted and it would be better to break out the configs related to each package and move them into the package directly. Whatever config has to be used by multiple packages (if any) can remain in Common.
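As a rough sketch of that direction, assuming configs defined directly in TypeScript (the parameter names, values, and `param` helper below are illustrative, not the actual Common layout):

```ts
// Hypothetical shape: params grouped by topic, defined directly in TypeScript
// instead of JSON, so they can be precomputed into flat lookup structures.
const gasConfig = {
  minGasLimit: 5000,
  gasLimitBoundDivisor: 1024,
} as const

const gasPrices = {
  base: 2,
  exp: 10,
} as const

// Flatten once at module load into a single Map so that a lookup like
// param('gasConfig', 'minGasLimit') is a single Map.get call.
const paramLookup = new Map<string, number>()
for (const [topic, params] of Object.entries({ gasConfig, gasPrices })) {
  for (const [name, value] of Object.entries(params)) {
    paramLookup.set(`${topic}/${name}`, value)
  }
}

function param(topic: string, name: string): number {
  const value = paramLookup.get(`${topic}/${name}`)
  if (value === undefined) throw new Error(`Missing parameter: ${topic}/${name}`)
  return value
}

// e.g. param('gasConfig', 'minGasLimit') === 5000
```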
> There's a huge number of unnecessary calls to `toString()`
This is very prevalent as you mention. In the VM one valid use-case is using them as keys for Maps and Sets, etc. Do you think we should adopt alternatives that support Buffers directly?
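For context: a plain `Map` or `Set` compares object keys by reference, so two equal `Buffer`s would be treated as different keys; the usual workaround is to convert to a string key exactly once behind a thin wrapper. A sketch (this `BufferMap` is hypothetical, not an existing utility in the repo):

```ts
// A minimal Buffer-keyed map: converts each key to a hex string exactly once
// at the API boundary, so call sites never have to sprinkle toString('hex').
class BufferMap<V> {
  private readonly map = new Map<string, V>()

  set(key: Buffer, value: V): this {
    this.map.set(key.toString('hex'), value)
    return this
  }

  get(key: Buffer): V | undefined {
    return this.map.get(key.toString('hex'))
  }

  has(key: Buffer): boolean {
    return this.map.has(key.toString('hex'))
  }

  get size(): number {
    return this.map.size
  }
}

// Usage sketch (the address value here is made up):
// const touched = new BufferMap<boolean>()
// touched.set(Buffer.from('00'.repeat(20), 'hex'), true)
```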
A major performance drain when running mainnet blocks seems to be emitting the step event. I've even seen the time for running a block get reduced by 50% after disabling this event in the VM.
As a temporary solution we can add an option to the VM to disable emitting events for users who don't need them. We should also investigate the async-eventemitter library to see if there is room for optimization there. I'm not aware of the main reason for using this library over the stock Node.js EventEmitter.
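A rough sketch of the opt-out idea, using the stock Node.js `EventEmitter` for simplicity (the `emitStepEvents` option and the `runStep` shape are hypothetical, not the current VM API):

```ts
import { EventEmitter } from 'events'

// Hypothetical opt-out: skip building and emitting step events entirely
// when the user opted out or nobody is listening.
interface InterpreterOpts {
  emitStepEvents?: boolean
}

class Interpreter extends EventEmitter {
  constructor(private readonly opts: InterpreterOpts = {}) {
    super()
  }

  runStep(pc: number, opcode: number): void {
    // Only pay the cost of assembling the (fairly large) step event object
    // if it will actually be consumed.
    if (this.opts.emitStepEvents !== false && this.listenerCount('step') > 0) {
      this.emit('step', { pc, opcode /* , gasLeft, stack, memory, ... */ })
    }
    // ... execute the opcode ...
  }
}
```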
Reminder for myself from @alcuadrado's comment here:
> I think we can also take advantage of the immutable interface to more aggressively cache some common operations. When profiling the VM I've seen things like hashing and validating txs take a considerable part of the execution time. We can have a constant public interface, but keep some mutable/lazily-initialized private properties to cache those.
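A minimal sketch of that pattern, assuming the object is immutable after construction and using `keccak256` from `ethereumjs-util` (the class itself is illustrative, not the actual Transaction implementation):

```ts
import { keccak256 } from 'ethereumjs-util'

// Constant public interface with a lazily-initialized private cache,
// so the hash is computed at most once per object.
class HashedThing {
  private _hash?: Buffer

  constructor(private readonly serialized: Buffer) {}

  hash(): Buffer {
    if (this._hash === undefined) {
      this._hash = keccak256(this.serialized)
    }
    return this._hash
  }
}

// const h = new HashedThing(tx.serialize())
// h.hash(); h.hash() // second call is a cache hit
```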
Update: someone at some point should have another look at the initial list here and maybe (directly in the comment) transform to a check list (and check the things already done), otherwise we won't get hold of this.
Some of these have been resolved thanks to this issue, so I will convert it to a checklist so we can have a clear picture of the remaining items. I can also tag on a per-item basis whether it would be a bug fix or a breaking change, so we can prioritize what can be done now versus later.
I think `Common#param` is an important one that is called a lot. @holgerd77 do you have an intuition whether we would be able to make it an O(1) operation? If it needs code restructuring, perhaps we can implement it as a new method and mark the old one as deprecated.
One idea would be to add a simple dict (or similar) cache along Common usage, add values on first access, and erase the cache on HF or chain switches (which I would very much assume occur seldomly).
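A sketch of what such a cache could look like (the `ParamCache` wrapper and the `slowLookup` callback are placeholders for the existing lookup, not real Common methods):

```ts
// Hypothetical wrapper around the existing (slow) lookup: cache results per
// (topic, name) key and throw the cache away whenever the HF or chain changes.
class ParamCache {
  private cache: { [key: string]: number } = {}

  constructor(private readonly slowLookup: (topic: string, name: string) => number) {}

  param(topic: string, name: string): number {
    const key = `${topic}/${name}`
    let value = this.cache[key]
    if (value === undefined) {
      value = this.slowLookup(topic, name)
      this.cache[key] = value
    }
    return value
  }

  // Called from the (rarely used) hardfork/chain switch paths.
  invalidate(): void {
    this.cache = {}
  }
}
```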
Did some simple tests on this, might be worth it. Script I used:

```ts
import Common from './src'

let c = new Common({ chain: 'mainnet' })

// Dummy object lookup to compare against Common#param
let dummyCache = {
  'value1': 15,
  'value2': 15,
  'value3': 15,
  'value4': 15,
  'value5': 15,
  'value6': 15,
  'value7': 15,
  'value8': 15,
  'value9': 15,
  'value10': 15,
  'value11': 15,
  'value12': 15,
  'minGasLimit': 500,
}

for (const iterations of [1000, 10000, 100000, 1000000, 10000000]) {
  let newA, newB
  console.log(`Testing with ${iterations} iterations.`)

  // Current Common#param lookup
  console.time('T')
  for (let i = 0; i < iterations; i++) {
    newA = c.param('gasConfig', 'minGasLimit')
  }
  console.timeEnd('T')

  // Plain object property access for comparison
  console.time('T')
  for (let i = 0; i < iterations; i++) {
    newB = dummyCache['minGasLimit']
  }
  console.timeEnd('T')
}
```
Results (the first `T` in each block is `Common#param`, the second is the plain object lookup):

```
Testing with 1000 iterations.
T: 5.489ms
T: 0.053ms
Testing with 10000 iterations.
T: 8.378ms
T: 0.421ms
Testing with 100000 iterations.
T: 52.756ms
T: 0.116ms
Testing with 1000000 iterations.
T: 285.341ms
T: 0.824ms
Testing with 10000000 iterations.
T: 2.788s
T: 7.871ms
```
Did another small test to put things in perspective and added a global param counter to Common which does:

```ts
paramCounter += 1
if (paramCounter % 100 == 0) {
  console.log(paramCounter)
}
```
For most single state tests this goes into the hundreds, for some into the thousands, and for selected ones into the tens of thousands. The whole state test suite comes to ~8,579,700 calls. So this would be roughly a 2s win for the tests (under the assumption that the cache works well, i.e. most reads would be served from the cache once it is filled).
On the benchmark run (so for the 12 or so blocks, with a single iteration) this comes to ~1,130,300 calls, so roughly 200-300ms.
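For reference, these estimates follow from the timings above, under the assumption that nearly every read would be a cache hit with negligible cost:

```ts
// Per-call cost of Common#param, from the 10,000,000-iteration run above.
const perCallMs = 2788 / 10_000_000 // ≈ 0.00028 ms per call

// Call counts observed with the paramCounter experiment.
const stateTestCalls = 8_579_700
const benchmarkCalls = 1_130_300

console.log(`state tests: ~${Math.round(perCallMs * stateTestCalls)} ms`) // ≈ 2392 ms, i.e. ~2s
console.log(`benchmark:   ~${Math.round(perCallMs * benchmarkCalls)} ms`) // ≈ 315 ms
```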
Hmm, no clear indication for me at first sight that this is worth the effort of introducing a cache or something similar. But it's also not completely out of the question, I guess? 🤔
> Update: someone at some point should have another look at the initial list here and maybe (directly in the comment) transform to a check list (and check the things already done), otherwise we won't get hold of this.
I have converted the opening post to a check list.
Nice! 😄
Linking #865