
EOF

Open wjmelements opened this issue 1 year ago • 25 comments

EOF adds more immediates. I plan to utilize syntactic sugar for them.

My current idea for EIP-663 is like this:

  1. DUP33 encodes as e6 20 while DUP3 still encodes as 82
  2. SWAP17 encodes as e7 10 while SWAP16 still encodes as 9f
  3. SWAP2&3 encodes as e8 00 (EXCHANGE 0x00)
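A minimal sketch of how the first two rules could assemble, assuming the sugar picks the legacy one-byte opcode whenever the index fits and falls back to the EIP-663 form otherwise. The function names `encode_dup` and `encode_swap` are hypothetical and not taken from any existing assembler.

```python
def encode_dup(n: int) -> bytes:
    """Encode DUPn: legacy one-byte opcode when it fits, otherwise DUPN."""
    if 1 <= n <= 16:
        return bytes([0x80 + n - 1])   # DUP1..DUP16 occupy 0x80..0x8f
    if 17 <= n <= 256:
        return bytes([0xE6, n - 1])    # DUPN immediate stores n - 1
    raise ValueError(f"DUP{n} is out of range")


def encode_swap(n: int) -> bytes:
    """Encode SWAPn: legacy one-byte opcode when it fits, otherwise SWAPN."""
    if 1 <= n <= 16:
        return bytes([0x90 + n - 1])   # SWAP1..SWAP16 occupy 0x90..0x9f
    if 17 <= n <= 256:
        return bytes([0xE7, n - 1])    # SWAPN immediate stores n - 1
    raise ValueError(f"SWAP{n} is out of range")


print(encode_dup(3).hex(), encode_dup(33).hex())      # 82 e620
print(encode_swap(16).hex(), encode_swap(17).hex())   # 9f e710
```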

wjmelements avatar Dec 13 '24 15:12 wjmelements

e6 03 can be disambiguated from 82 like DUPN3 so that the worse way can still be represented unambiguously.

wjmelements avatar Dec 13 '24 16:12 wjmelements

I'm implementing an assembler and the syntax I'm using is:

  • DUPN 33 encodes as e6 20 (note the difference with your example, see below)
  • SWAPN 17 encodes as e7 10
  • EXCHANGE 1 2 encodes as e8 00

The reasoning is explained in https://github.com/ipsilon/eof/pull/174.

  • SWAPn should be equivalent to SWAPN n
  • EXCHANGE n m should be equivalent to SWAPN n SWAPN m SWAPN n

Same reasoning applies to DUPN:

  • DUPn should be equivalent to DUPN n
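The EXCHANGE equivalence can be checked on a toy stack. This is only a sketch: the stack is modeled as a Python list with index 0 as the top, and `swapn` and `exchange` are hypothetical helpers that take SWAP-style arguments (argument n refers to the (n + 1)th item).

```python
def swapn(stack, n):
    """SWAPN n: swap the top item with the (n + 1)th item (1-based from the top)."""
    s = list(stack)
    s[0], s[n] = s[n], s[0]
    return s


def exchange(stack, n, m):
    """EXCHANGE n m with SWAP-style arguments: swap items n + 1 and m + 1."""
    s = list(stack)
    s[n], s[m] = s[m], s[n]
    return s


stack = list("abcdefgh")  # index 0 is the top of the stack
for n in range(1, 7):
    for m in range(n + 1, 8):
        # EXCHANGE n m behaves like SWAPN n, SWAPN m, SWAPN n
        assert exchange(stack, n, m) == swapn(swapn(swapn(stack, n), m), n)

print(exchange(stack, 1, 2))  # ['a', 'c', 'b', 'd', 'e', 'f', 'g', 'h']
```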

I think we're aligned on this principle but your example is wrong. DUP33 should encode as e6 20.

I'd like us to have a standard assembly representation to avoid these off-by-one differences across different tools. This is what I was trying with https://github.com/ipsilon/eof/pull/174 but I think a separate document for standard notation might be a better idea? What do you think?

frangio avatar Dec 13 '24 19:12 frangio

  • DUPN 33 encodes as e6 20 (note the difference with your example, see below)
  • SWAPN 17 encodes as e7 10
  • EXCHANGE 1 2 encodes as e8 00

These collide with our push syntax. Likely yours looks like PUSH2 300 while ours looks like 300.

wjmelements avatar Dec 13 '24 19:12 wjmelements

I think we're aligned on this principle but your example is wrong. DUP33 should encode as e6 20.

You are right. edited in description (previously e6 21)

wjmelements avatar Dec 13 '24 19:12 wjmelements

That's fine, I think the use of spacing can vary but it seems important to have numbers with a consistent meaning across tools.

By the way I think your other example should also be changed to be consistent with non-exchange swaps:

  • SWAP1&3 encodes as e8 00

frangio avatar Dec 13 '24 19:12 frangio

How are you planning to represent RJUMP and RJUMPV?

I think you will run into issues there because they can't be made into a single word?

If you're not able to use space-separated arguments to avoid confusion with push, have you considered using delimiters [ ] for immediates? RJUMP[label], SWAPN[3], and so on.

In my case I'm doing:

  • RJUMP label
  • RJUMPV label1, label2 or RJUMPV [label1, label2] (both valid, the second one allows newlines within delimiters)
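For the bracketed form, a parser could look roughly like the sketch below. It handles only the `MNEMONIC[arg, arg]` shape suggested above, with optional whitespace and newlines inside the delimiters. The name `parse_instruction` and the regular expression are assumptions for illustration, not either assembler's actual grammar.

```python
import re

# Mnemonic, then an optional [ ... ] group; DOTALL lets the group span lines.
_INSTR = re.compile(r"^\s*([A-Z][A-Z0-9]*)\s*(?:\[(.*)\])?\s*$", re.DOTALL)


def parse_instruction(text: str):
    """Split 'MNEMONIC[a, b]' into (mnemonic, [immediates])."""
    match = _INSTR.match(text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    mnemonic, body = match.groups()
    if body is None:
        return mnemonic, []            # no brackets: no immediates
    args = [arg.strip() for arg in body.split(",")]
    if any(not arg for arg in args):
        raise ValueError(f"empty immediate in {text!r}")
    return mnemonic, args


print(parse_instruction("SWAPN[3]"))                    # ('SWAPN', ['3'])
print(parse_instruction("RJUMPV [label1,\n  label2]"))  # ('RJUMPV', ['label1', 'label2'])
```

Because the immediates sit inside delimiters, a bare number elsewhere on the line can still be read as a push.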

frangio avatar Dec 13 '24 19:12 frangio

By the way I think your other example should also be changed to be consistent with non-exchange swaps:

  • SWAP1&3 encodes as e8 00

(I'm guessing what you mean here) I want the code to read the same as ~SWAP and~ DUP even though the EXCHANGE operand is compressed. So SWAP2&3 swaps items 2 and 3 and is encoded e800.

wjmelements avatar Dec 13 '24 19:12 wjmelements

So SWAP2&3 swaps items 2 and 3 and is encoded e800.

The thing is that with this syntax this operation SWAP2&3 would be equivalent to SWAP1 SWAP2 SWAP1, it seems weird to shift the indexing by one when you're swapping two items.

frangio avatar Dec 13 '24 19:12 frangio

You're right; it is weird that SWAP3 swaps 1 and 4 while SWAP2&3 would swap 2&3.

wjmelements avatar Dec 13 '24 19:12 wjmelements

If you're not able to use space-separated arguments to avoid confusion with push, have you considered using delimiters [ ] for immediates? RJUMP[label], SWAPN[3], and so on.

I think this makes a lot of sense for arrays (RJUMPV). I've been trying to think of a way to avoid it but haven't thought of anything good yet. Array delimiters make sense for immediates and seem to be sustainable for future opcodes. I'm likely to support them therefore, unless I can think of something better.

wjmelements avatar Dec 13 '24 20:12 wjmelements

I have verified that I'm not currently using [] for anything else.

wjmelements avatar Dec 13 '24 20:12 wjmelements

You're right; it is weird that SWAP3 swaps 1 and 4 while SWAP2&3 would swap 2&3.

I think the real problem is the official opcode name, being 2-indexed.

wjmelements avatar Dec 13 '24 20:12 wjmelements

I think the real problem is the official opcode name, being 2-indexed.

If you mean the original SWAPn series of opcodes, yes, maybe SWAPn and DUPn should've been numbered to reflect the actual stack height that is affected. But at this point I expect anyone dealing with assembly to be used to this, and I think the expectation would be that arguments to EXCHANGE have the same meaning as arguments to SWAP.

frangio avatar Dec 13 '24 20:12 frangio

With SWAPn&m I'm inventing syntactic sugar for EXCHANGE so I'm likely to use the DUP indexing. I'll still support EXCHANGE[,] if I use [] for RJUMPV and I want those operands to match yours.

EXCHANGE 1 2 encodes as e8 00

@frangio Would you be interested in using the DUP indexes instead of the SWAP indices for EXCHANGE? Like EXCHANGE[2,3] and EXCHANGE 2 3 for e800?

wjmelements avatar Dec 13 '24 20:12 wjmelements

~I made a Twitter poll~

(deleted, ~going to remake~)

wjmelements avatar Dec 13 '24 20:12 wjmelements

Rereading it again I think e800 swaps items 1 and 2.

wjmelements avatar Dec 13 '24 20:12 wjmelements

So I think we agree on the DUP indexing for the representation of EXCHANGE but I was previously mistaken about e800.

wjmelements avatar Dec 13 '24 20:12 wjmelements

Editing my previous posts now to fix the mistake makes the discussion even more confusing so I'm only going to edit the OP. I added notes to the other posts.

wjmelements avatar Dec 13 '24 20:12 wjmelements

I think the expectation would be that arguments to EXCHANGE have the same meaning as arguments to SWAP.

It's possible based on this that you actually want the SWAP indexes though. I think you advocated the DUP indexes here:

EXCHANGE 1 2 encodes as e8 00

wjmelements avatar Dec 13 '24 20:12 wjmelements

Yes I do want the SWAP indices. I believe EXCHANGE 1 2 = e8 00 is consistent with that?

frangio avatar Dec 13 '24 21:12 frangio

Yes I do want the SWAP indices. I believe EXCHANGE 1 2 = e8 00 is consistent with that?

Goodness it's different every time I read it.

wjmelements avatar Dec 13 '24 21:12 wjmelements

There are two separate instances of + 1.

From the spec:

  • EXCHANGE (0xe8) instruction
    • read uint8 operand imm
    • n = (imm >> 4) + 1, m = (imm & 0x0F) + 1
    • the (n + 1)th stack item is swapped with the (n + m + 1)th stack item (1-based).

If the first nibble is 0 you will be swapping the second stack element (closest one that isn't the top of the stack) with some other element (deeper in the stack).
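The two `+ 1` steps are easier to see when the immediate is decoded explicitly. The sketch below follows the spec text quoted above; `decode_exchange` is a hypothetical helper name.

```python
def decode_exchange(imm: int) -> tuple[int, int]:
    """Return the 1-based stack positions (top = 1) swapped by EXCHANGE imm."""
    if not 0 <= imm <= 0xFF:
        raise ValueError("EXCHANGE immediate must fit in one byte")
    n = (imm >> 4) + 1        # first + 1: high nibble 0 means n = 1
    m = (imm & 0x0F) + 1      # low nibble 0 means m = 1
    return n + 1, n + m + 1   # second + 1: the top of the stack is never touched


positions = decode_exchange(0x00)
print(positions)                        # (2, 3)
print(tuple(p - 1 for p in positions))  # (1, 2)
```

For `e8 00` the 1-based positions are 2 and 3, which are the DUP-style numbers. Subtracting one from each gives 1 and 2, the SWAP-style numbers used in `EXCHANGE 1 2`.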

frangio avatar Dec 13 '24 21:12 frangio

So we are in disagreement then. That's too bad. I think https://github.com/ipsilon/eof/pull/174 is a good place to continue that discussion.

wjmelements avatar Dec 13 '24 21:12 wjmelements

I think we might have a disagreement about the meaning of EXCHANGE[n,m] but the choice seems pretty clear for SWAPn&m right?

frangio avatar Dec 13 '24 21:12 frangio

I think we might have a disagreement about the meaning of EXCHANGE[n,m] but the choice seems pretty clear for SWAPn&m right?

Possibly not, because I would want that to also use DUP indexes, even though it has the SWAP prefix. We agree that SWAP2 being equivalent to SWAP1&3 is not ideal. If it's too confusing I'd rather use a different prefix than SWAP, like XCHG. By using DUP indexes the representation can assemble to SWAP, SWAPN, or EXCHANGE appropriately, according to which is possible and then according to which has the least codesize.
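A rough sketch of that selection, assuming DUP-style positions where 1 is the top of the stack. The name `encode_swap_pair` is hypothetical, and the EXCHANGE immediate is derived from the spec formulas quoted earlier in the thread.

```python
def encode_swap_pair(a: int, b: int) -> bytes:
    """Encode 'swap item a with item b' (1-based, top = 1) as compactly as possible."""
    if a == b or min(a, b) < 1:
        raise ValueError("positions must be distinct and at least 1")
    a, b = min(a, b), max(a, b)
    if a == 1:
        depth = b - 1                      # legacy SWAP index
        if depth <= 16:
            return bytes([0x90 + depth - 1])   # one byte: SWAP1..SWAP16
        if depth <= 256:
            return bytes([0xE7, depth - 1])    # two bytes: SWAPN
    else:
        n, m = a - 1, b - a                # spec: swaps items n + 1 and n + m + 1
        if n <= 16 and m <= 16:
            return bytes([0xE8, ((n - 1) << 4) | (m - 1)])  # two bytes: EXCHANGE
    raise ValueError(f"no single-instruction encoding for swapping {a} and {b}")


print(encode_swap_pair(1, 3).hex())    # 91
print(encode_swap_pair(1, 18).hex())   # e710
print(encode_swap_pair(2, 3).hex())    # e800
```

Pairs that fall outside all three ranges would need a multi-instruction sequence, which this sketch does not attempt.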

wjmelements avatar Dec 13 '24 22:12 wjmelements

EOF is not likely to be included in the near future. I will reopen this task if that changes.

wjmelements avatar May 15 '25 02:05 wjmelements