fuel-specs Add PUSH and POP instructions.

Use case

When creating a local function call frame (as opposed to a more global contract call frame) the compiler must preserve the value of usually several general purpose registers to avoid clobbering values in the caller.

It is currently (not merged yet, but soon) doing this manually by copying $SP, using CFEI and performing several STOREs, potentially up to ~40 in the worst case at the start of a function, and reversing the process with up to ~40 LOADs at the end.

Even fairly small functions may use around 8 GP regs which would equate to 20 instructions to save and restore them, or 80 bytes of binary executable. Per function.

Proposal

I propose we introduce PUSH and POP to do all this with single instructions, saving a lot of wasted bytecode. 80 bytes becomes 8 bytes. A difference this significant could determine whether a function is inlined or not and so has further implications for optimisation.

The arguments could be a range of registers, first to last inclusive. PUSH $r0 $r3 would save all 4 of $r0, $r1, $r2 and $r3 to the stack. POP $r0 $r3 would restore them correctly.

I assume it's possible for the gas cost to just remain the equivalent of a CFEI followed by n STORE instructions.

Sep 15 '22 02:09 otrho

The gas cost should definitely be better than chaining CFEI with n STORE ops, since each instruction has overhead such as updating $pc, $cgas and $ggas.

How would the VM know the memory location of the registers to POP? Since the registers like fp, sp and ssp are all relative to contract CALL instructions, I don't think they could be relied on for POP. We could make a stack internal to the VM to track this, but that would violate our convention of confining all execution state to registers & memory.

Sep 15 '22 23:09 Voxelot

How would the VM know the memory location of the registers to POP?

It shouldn't, it's up to the compiler or dev to do it. So POP should just LOAD relative to $sp into the registers.

And conceivably they could be used to batch move a bunch of values between registers -- PUSH $r0 $r3 followed by POP $r1 $r4 would move a bunch of registers all at once. The compiler wouldn't usually do this though, and would work hard to keep everything symmetrical.

Sep 16 '22 00:09 otrho

I guess the implications of PUSH $sp ... or PUSH $pc ... need to be explicit, whether it's pre- or post-increment. POP into a constant reg should obviously be disallowed, so the utility of saving the 'constant' regs to the stack is dubious.

Sep 16 '22 00:09 otrho

@otrho This seems related to a suggestion you made in this other issue about using push and pop "if we had them". https://github.com/FuelLabs/fuel-specs/issues/346

Sep 29 '22 17:09 nfurfaro

Yeah, similar if we allow popping into reserved registers, although it sounds dangerous, as I kinda mentioned above your comment. I think maybe a special instruction which just creates a new stack at the end of the current one would be safer for #346, and these push and pop instructions would be only for general purpose registers, r16 and above.

Sep 30 '22 00:09 otrho

I have some suggestions on possible improvements. Some of these are mutually exclusive, and are mostly included to show what alternatives should be considered.

Since any register is readable, I don't think is makes much sense to disallow PUSHing those if we're taking a register-range input. The concerns about $sp and $pc and be simply defined to be the values before the PUSH instruction is executed. On the other hand, it likely wont hurt if pushing those registers requires an extra MOVE to a writable register first, as I assume that wont be very common.

However, there is one interesting optimization opportunity that I would like to consider. Instead of PUSH start stop we would have PSHL bitmask and PSHH bitmask where bitmask is a 24-bit bitmask for the registers to push. The corresponding pop instructions are POPL bitmask and POPH bitmask. Since there are only 48 non-reserved-registers, we only need two bitmasked operations to push or pop any set of writable registers. This would make register allocation easier for the Sway complier and likely has only minimal performance loss compared to pushing a range of registers. This is unless it's very common to push/pop all user registers, in which case even having PUSH start1 stop1 start2 stop2 would be a definite improvement if we special case it so that the second range can be empty. Also if full-register push is often needed in a specific scenarion (maybe CALL/RET) then those instruction could have variants or flags for pushing all the registers instead.

Also introducing a heap push (but not pop) operation might be a good to have, although at least the heap has a nice property that after ALOC the $hp already points to the allocations, but appending a register value to the heap still takes three opcodes (MOVI $tmp, size; ALOC $tmp; SW $hp, $value) instead of one.

One more idea would be to streamline in-context subroutine calls using "push $pc and jump to $reg` and "pop into $pc" -style instructions. This is definitely a separate feature, but I feel like there are not many reasons not to do this in case we add PUSH and POP.

Going even further, we could consider completely removing the idea of user writable data in the stack, and use it only for loaded code and call frames. Then, what's currently called the heap, would be the stack, and heap as a concept would simply consist of the Sway compiler emulating heap using the stack. This makes operations involving dynamicly-sized runtime allocation somewhat harder/constly, but I don't feel think code is common in smart contracts anyways.

Jun 20 '23 22:06 Dentosal

This makes operations involving dynamicly-sized runtime allocation somewhat harder/constly, but I don't feel think code is common in smart contracts anyways.

I don't think this is a reasonable assumption to make. To my knowledge, people commonly use dynamically sized types in Solidity. And devs are already complaining about our lackluster support for them as it is.

Jun 22 '23 15:06 IGI-111