go-ethereum
core/vm: add micro-optimization
This PR adds Uint256() to the Address type, to create a uint256 from an address.
Since we know that the address type is exactly 20 bytes, we can use SetBytes20 instead of SetBytes, which saves one switch statement.
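For illustration, a minimal sketch of the idea (not the PR's actual diff; written here as a free function, whereas the PR adds a method on the Address type):

```go
package example

import (
	"github.com/ethereum/go-ethereum/common"
	"github.com/holiman/uint256"
)

// addressToUint256 converts a 20-byte address into a uint256.Int. SetBytes20
// can skip the length switch inside the generic SetBytes because an Address
// is always exactly 20 bytes.
func addressToUint256(addr common.Address) uint256.Int {
	var z uint256.Int
	z.SetBytes20(addr[:])
	return z
}
```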
Benchmarks:

Master:

```text
BenchmarkOpAddress-24    31233150    38.22 ns/op    183 B/op    0 allocs/op
```

PR:

```text
BenchmarkOpAddress-24    33228558    35.39 ns/op    172 B/op    0 allocs/op
```
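For reference, a standalone micro-benchmark isolating just the SetBytes vs SetBytes20 difference (hypothetical; not the PR's BenchmarkOpAddress) might look like:

```go
package example

import (
	"testing"

	"github.com/holiman/uint256"
)

func BenchmarkSetBytesGeneric(b *testing.B) {
	buf := make([]byte, 20)
	var z uint256.Int
	for i := 0; i < b.N; i++ {
		z.SetBytes(buf) // generic path: switches on len(buf)
	}
}

func BenchmarkSetBytes20(b *testing.B) {
	buf := make([]byte, 20)
	var z uint256.Int
	for i := 0; i < b.N; i++ {
		z.SetBytes20(buf) // fixed-size path: no length switch
	}
}
```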
I don't think it's worth adding it like this. There might be a way to avoid the allocation though: since the stack is just []uint256.Int and may have capacity left over from earlier extensions, we could perform a check like this:
```go
if cap(stack) > len(stack) {
	// Spare capacity available: extend the length without appending a zero value.
	stack = stack[:len(stack)+1]
} else {
	stack = append(stack, uint256.Int{})
}
last := &stack[len(stack)-1]
last.SetBytes20(address[:])
```
Maybe this optimization also generalizes to other stack push operations.
The code above could probably be simplified to just use append instead of checking for available capacity explicitly. However, when spare capacity exists, append still writes a zero Int into the slot, which we don't need because we overwrite the value ourselves anyway.
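In other words, something like this sketch (hypothetical pushAddress helper; assumes github.com/holiman/uint256):

```go
// Append-based variant of the push above: when spare capacity exists, append
// still writes a zero uint256.Int into the new slot, which SetBytes20 then
// immediately overwrites.
func pushAddress(stack []uint256.Int, address [20]byte) []uint256.Int {
	stack = append(stack, uint256.Int{})
	stack[len(stack)-1].SetBytes20(address[:])
	return stack
}
```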
I think it's a pretty solid idea. We always initialize the stack with a pre-allocated uint256.Int slice of capacity 16. Since the EVM limits the stack depth to 1024 (perhaps we should pre-allocate a larger slice for the stack?), I think this optimization could be applied to most of the push operations.
```go
New: func() interface{} {
	return &Stack{data: make([]uint256.Int, 0, 16)}
},
```
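For context, that New function is (roughly) the allocator of the stack's sync.Pool in core/vm/stack.go:

```go
// Paraphrased from go-ethereum's core/vm/stack.go; details may have drifted.
var stackPool = sync.Pool{
	New: func() interface{} {
		// Every fresh stack starts with room for 16 items; deeper stacks grow
		// via append, which is what the expansion concern below is about.
		return &Stack{data: make([]uint256.Int, 0, 16)}
	},
}

// newstack hands out a pooled stack for a call frame.
func newstack() *Stack {
	return stackPool.Get().(*Stack)
}
```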
Btw, the slice expansion might be a performance bottleneck to some extent. The stack can theoretically grow to 1024 elements, which involves 7 slice expansions in total.
Maybe we could increase the pre-allocated size to 256.
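A throwaway check of that expansion count (illustrative; the exact number depends on the Go runtime's growth policy):

```go
package main

import (
	"fmt"

	"github.com/holiman/uint256"
)

// Count how many times append reallocates when a stack slice grows from the
// pre-allocated capacity of 16 up to the EVM limit of 1024 elements.
func main() {
	s := make([]uint256.Int, 0, 16)
	grows, lastCap := 0, cap(s)
	for i := 0; i < 1024; i++ {
		s = append(s, uint256.Int{})
		if cap(s) != lastCap {
			grows++
			lastCap = cap(s)
		}
	}
	fmt.Println("reallocations:", grows)
}
```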
The pre-allocated stack size was previously reduced from a default of 1024 to the current 16 in https://github.com/ethereum/go-ethereum/pull/21222. Maybe a pre-allocated size of 256 is a reasonable compromise between memory usage and performance.
The 1024 was for the RStack, which has been removed (although I have no idea what RStack is). The default size for the stack has always been 16.
> The pre-allocated stack size was previously reduced from a default of 1024 to the current 16
I think not. Using a pre-allocated 1024 in a somewhat deep call chain that only ever uses a few elements is a waste of time. There's another thing we can do, which I have higher hopes for, theoretically:
Use one contiguous stack. Then, at call sites, keep a reference (e.g. this scope's "stack zero" is at index 28). So all scopes use the same stack, but they never pop past their "stack zero", and they are not allowed to push more than 1024 elements. This scheme makes it so that all execution can operate on pre-allocated elements, but if a call scope doesn't use any elements, it does not "hog" pre-allocated items.
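A rough sketch of that scheme (illustrative only; the names, limit handling and error values here are made up, this is not the actual branch):

```go
package example

import (
	"errors"

	"github.com/holiman/uint256"
)

var (
	errStackOverflow  = errors.New("stack limit reached")
	errStackUnderflow = errors.New("stack underflow")
)

// sharedStack is one contiguous backing slice used by every call scope.
type sharedStack struct {
	data []uint256.Int
}

// scopeStack is a call scope's view into the shared stack. zero is this
// scope's "stack zero": the scope never pops below it and may not grow more
// than 1024 elements above it.
type scopeStack struct {
	shared *sharedStack
	zero   int
}

func (s *scopeStack) push(v *uint256.Int) error {
	if len(s.shared.data)-s.zero >= 1024 {
		return errStackOverflow
	}
	s.shared.data = append(s.shared.data, *v)
	return nil
}

func (s *scopeStack) pop() (uint256.Int, error) {
	if len(s.shared.data) <= s.zero {
		return uint256.Int{}, errStackUnderflow
	}
	top := len(s.shared.data) - 1
	v := s.shared.data[top]
	s.shared.data = s.shared.data[:top]
	return v, nil
}

// enterCall creates a child scope for a sub-call: it shares the same backing
// slice and places its own stack zero at the current top.
func (s *scopeStack) enterCall() *scopeStack {
	return &scopeStack{shared: s.shared, zero: len(s.shared.data)}
}
```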
I once implemented this, but for some reason it didn't seem to pay off enough to motivate its merging. The branch is here: https://github.com/ethereum/go-ethereum/compare/master...holiman:go-ethereum:contiguous_stack (it also contains some other stack stuff, and is very old, 2017, still based on uint256)
I would say let's try to get @holiman's contiguous_stack branch up to date. It's a very promising approach.
Probably better to pick it up from scratch. Back then, I had to add typed accessor methods like popAddress, since the stack held pointers and I couldn't let those escape.
It should be simpler now. I'll give it a spin tomorrow.
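For illustration, the kind of typed accessor meant above (hypothetical, not the 2017 branch's code): with a pointer-based stack, copying the popped value out as a fixed-size Address keeps the pointer itself from escaping the interpreter.

```go
package example

import (
	"github.com/ethereum/go-ethereum/common"
	"github.com/holiman/uint256"
)

// ptrStack stands in for the old pointer-based stack described above.
type ptrStack struct {
	data []*uint256.Int
}

func (st *ptrStack) pop() *uint256.Int { // no underflow check, for brevity
	v := st.data[len(st.data)-1]
	st.data = st.data[:len(st.data)-1]
	return v
}

// popAddress copies the popped value into a fixed-size common.Address, so the
// *uint256.Int never leaves the stack's internals.
func (st *ptrStack) popAddress() common.Address {
	return common.Address(st.pop().Bytes20())
}
```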
Hmm, I always thought we pre-allocate a 1024 stack and use it contiguously across all subcalls. That seems like a no-brainer thing we should do. Then we can make all the opcodes 0-alloc afterwards.
> That seems like a no-brainer thing we should do. Then we can make all the opcodes 0-alloc afterwards.
Yeah, well, we don't :shrug:. Here's a sketch of how we might go about it though: https://github.com/ethereum/go-ethereum/pull/30362
Yeah, I think we should move forward with Martin's idea of never de-allocating the items; closing this for now.