solidity
solidity copied to clipboard
Add switch type to CFG
Adds a new switch type to the CFG to facilitate the implementation of #12978 in the optimizer
It might need a little more work, but the last case optimization was a major improvement:
ir-optimize-evm+yul
project | bytecode_size | deployment_gas | method_gas |
---|---|---|---|
bleeps | +0.128605% ❌ |
0% |
-1.5e-05% ✅ |
brink | +0.342493% ❌ |
||
chainlink | +0.170116% ❌ |
+0.147089% ❌ |
-0.00032% ✅ |
colony | -0.018442% ✅ |
||
elementfi | +0.199935% ❌ |
||
ens | 0% |
+0.341644% ❌ |
+0.008292% ❌ |
euler | !V |
!V |
!V |
gnosis | +0.113951% ❌ |
+0.114305% ❌ |
-0.036179% ✅ |
gp2 | +0.31025% ❌ |
+0.294627% ❌ |
-0.004169% ✅ |
perpetual-pools | +0.385289% ❌ |
+0.356846% ❌ |
+0.023492% ❌ |
pool-together | +0.336411% ❌ |
+0.32835% ❌ |
-0.001011% ✅ |
prb-math | !V |
!V |
!V |
trident | +0.340512% ❌ |
+0.298538% ❌ |
+0.222231% ❌ |
uniswap | +0.223121% ❌ |
+0.215931% ❌ |
+0.163952% ❌ |
yield_liquidator | +0.454005% ❌ |
+0.505831% ❌ |
-0.003383% ✅ |
zeppelin | !V |
!V |
!V |
Just getting back to this now - the last runtime gas diff actually doesn't look too bad - I'm a bit surprised about the code size increase that remains. Out of my head, I don't have a good direction in mind for improving things, but I'll try to have a closer look soon and see if I can find something.
If we can't reduce this further, we could still (I think I saw that as a suggestion somewhere while I was off) consider moving the decision for jump tables to an early stage and keep translating all other switches to nested if
s...
If we can't reduce this further, we could still (I think I saw that as a suggestion somewhere while I was off) consider moving the decision for jump tables to an early stage and keep translating all other switches to nested
if
s...
This is the direction I was thinking of. I wrote some reasons for it at https://github.com/ethereum/solidity/pull/12978#issuecomment-1183954016 and I moved the decision to the CFG builder step. Currently, the simple heuristic is that a jump table is used for a switch statement if there are at least three cases and they are compatible with the jump table (having a range of <= 16). If all cases are used uniformly this should be an improvement when using Ethereum opcode prices.
Yeah - I'd be alright with starting with that approach.
On the other hand, I'm still actually a bit surprised by the changes in bytecode size. I'm a bit swamped myself at the moment, so it may take a bit until I could look into it in more detail myself - but did you encounter some prototypical cases in which the size increases the most? I'd still be interested in looking at some assembly diff for those (to confirm that it's really cases in which nothing can be done if we force a uniform stack layout across switch cases - or whether the current stack layout generation just produces suboptimal layouts in those cases).
The "ghost variable" mechanism was actually never meant as an optimization, but rather as a simplification that merely reduces the number of cases to handle :-). It does make sense that it can handle some cases slightly better, but the codesize effect is larger than I'd have anticipated.
In any case, if you're up for it and want to make progress on this soon, feel free to start working on a version with an "early decision" for jump tables.
I implemented the "early decision" for jump tables in my most recent commits to #12978. It uses the CFG switch type for jump tables only, and seems to have favorable gas benchmarks for the one external project that could make use of the jump table, with no difference for the other ones as expected.
When I have some time, I will find the prototypical examples of the assembly differences. Comparing large contracts was a little difficult using a regular diff because the code sometimes is moved around. I can try to find differences in smaller contracts.
I did an opcode count comparison a while ago of Uniswap and found a lot more pop
s (57% more) and, interestingly, an increased count of others I would not expect, like codesize
. The pop
count probably went down considerably after b4e76f8, but maybe codesize
shows that some code is being duplicated. I will re-run this on the latest commit sometime.
Ah, nice, I hadn't realized, I'll try to have a look at the latest version of https://github.com/ethereum/solidity/pull/12978 then!
And yeah, codesize
may seem weird, but it's actually used specially by the optimized code transform:
codesize
is what the optimized code transform uses to fill in "junk" slots, i.e. whenever it is required to enlarge the stack just in order to jump to a common stack layout, but the actual value pushed does not matter, since it will be popped again anyways.
This should actually rarely occur and may indeed indicate that we can improve the stack layout generation to get rid of them.
I.e. the codesize
opcodes are probably coming from here:
https://github.com/ethereum/solidity/blob/6b60524cfe4186eb7d22d80ca67b9554902d8fb3/libyul/backends/evm/OptimizedEVMCodeTransform.cpp#L348
Oh, that makes sense. I now remember seeing that when I was fiddling with the stack generation.
Here is an overall asm comparison in the latest version, for UniswapV3Factory.sol
from the external test:
Instruction | Ghost variable switch | CFG switch | Change |
---|---|---|---|
calldataload(0x24) | 14 | 32 | +124% ❌ |
codesize | 60 | 86 | +43% ❌ |
dup1 | 844 | 793 | -6.04% ✅ |
dup2 | 1061 | 1110 | +4.62% ❌ |
dup3 | 1022 | 1020 | -0.20% ✅ |
dup4 | 581 | 579 | -0.34% ✅ |
dup5 | 346 | 345 | -0.28% ✅ |
dup6 | 297 | 297 | 0% |
dup7 | 262 | 258 | -1.5% ✅ |
dup8 | 224 | 218 | -2.68% ✅ |
dup9 | 157 | 157 | 0% |
dup10 | 114 | 110 | -3.51% ✅ |
dup11 | 78 | 78 | 0% |
dup12 | 70 | 70 | 0% |
dup13 | 31 | 29 | -6.45% ✅ |
dup14 | 60 | 54 | -10% ✅ |
dup15 | 10 | 10 | 0% |
dup16 | 10 | 8 | -20% ✅ |
iszero | 523 | 541 | +3.44% ❌ |
jump | 1227 | 1227 | 0% |
jumpi | 1159 | 1159 | 0% |
pop | 787 | 922 | +17.15% ❌ |
The one interesting difference here is that there are more calldataload
s.
I still haven't narrowed down any specific instances showing why this code change happens. Some small test cases that I made using yul resulted in nearly equivalent code, except the cases were in a different order.
@jaa2 Are you still working on this ? Or shall we takeover
@jaa2 Are you still working on this ? Or shall we takeover
I would love to see this completed. Technically, I am waiting for feedback on #12978. I could not figure out how to get the CFG switch gas execution cost low enough to outperform the ghost variables style, since the ghost variables get a pass through the rest of the optimizer. I still needed access to the CFG switch type to implement the jump tables in the optimizer, so this change is included in #12978 and the CFG switch type is used only when jump tables can be used.
Ideally, we would find a way to get the CFG switch to become as performant as the ghost variable method.
Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.
If you haven't read our contributing guidelines and our review checklist before, please do it now, this makes the reviewing process and accepting your contribution smoother.
If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.
Hi @jaa2 ! 👋 are you planning to come back soon to this PR, or should we close it for now?
Closing as stale.