huff-rs

Bytecode padding directive

Open Philogy opened this issue 2 years ago • 3 comments

I'd love a directive that lets me specify a size n and that, at compile time, would:

  1. Check the resulting size of the section of bytecode
  2. Pad with 0x00 (STOP) bytes up to size n
  3. Give me a compile-time error if the section is larger than n
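
The three steps above can be sketched outside the compiler as follows. This is a minimal Python illustration of the requested semantics, not huff-rs code. `pad_section` and `PaddingError` are hypothetical names, and the section is assumed to be available as already-assembled bytes.

```python
class PaddingError(Exception):
    """Raised when a section does not fit its padded size (hypothetical name)."""


def pad_section(code: bytes, n: int) -> bytes:
    """Pad `code` with 0x00 (STOP) bytes so the result is exactly `n` bytes."""
    size = len(code)  # step 1: check the resulting size of the section
    if size > n:
        # step 3: fail loudly instead of silently shifting later code
        raise PaddingError(f"section is {size} bytes, exceeds padded size {n}")
    # step 2: fill the remainder with STOP bytes
    return code + b"\x00" * (n - size)


# Example: a 3-byte section (JUMPDEST, PUSH1 0x01) padded to 8 bytes.
padded = pad_section(bytes.fromhex("5b6001"), 8)
```

A section that is already exactly n bytes long passes through unchanged; only a section longer than n is an error.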

The use case for this is efficient function dispatchers or internal switch-like statements that require code sections to be padded up to a consistent size. In METH this leads not only to relatively ugly code:

    dest_0x18:
        // 0x18160ddd
        __FUNC_SIG(totalSupply)
        __NON_PAYABLE_SELECTOR_CHECK()
        TOTAL_SUPPLY(callvalue)
        /* padding (45) */ stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop
    dest_0x19: __NO_MATCH()
    dest_0x1a: __NO_MATCH()
    dest_0x1b: __NO_MATCH()
    dest_0x1c: __NO_MATCH()
    dest_0x1d: __NO_MATCH()
    dest_0x1e: __NO_MATCH()
    dest_0x1f: __NO_MATCH()
    dest_0x20:
        // 0x205c2878
        __FUNC_SIG(withdrawTo)
        INVALID_NON_PAYABLE()
        WITHDRAW_TO(callvalue)
        /// @dev Selectors 0x21000000 - 0x21ffffff will exceptionally revert.
        /* padding (38) */ stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop stop stop stop stop
                           stop stop stop stop stop stop stop stop stop stop

But it is also very error-prone: if I optimize a macro so that it produces smaller or larger bytecode and forget to adjust the padding, all subsequent functions break because the block boundaries are corrupted. I've had to write a somewhat janky script that helps me identify where the initial block boundary violation happened.
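
A check of that kind might look like the following sketch. It is not the script mentioned above. It assumes the compiler output has already been reduced to an ordered list of (label, byte offset) pairs, and `first_boundary_violation`, `base`, and `block_size` are hypothetical names chosen for illustration.

```python
from typing import Optional, Sequence, Tuple


def first_boundary_violation(
    labels: Sequence[Tuple[str, int]], base: int, block_size: int
) -> Optional[Tuple[str, int, int]]:
    """Return (label, expected_offset, actual_offset) for the first misplaced label.

    `labels` lists the dispatcher's jump labels in order with their byte
    offsets; label i is expected at base + i * block_size.
    Returns None when every label sits on its boundary.
    """
    for index, (name, actual) in enumerate(labels):
        expected = base + index * block_size
        if actual != expected:
            return name, expected, actual
    return None


# Hypothetical layout: the block before dest_0x1a came out one byte short,
# so dest_0x1a and everything after it are shifted.
layout = [("dest_0x18", 0), ("dest_0x19", 64), ("dest_0x1a", 127), ("dest_0x1b", 191)]
violation = first_boundary_violation(layout, base=0, block_size=64)
```

Reporting only the first violation is deliberate: every later label is shifted by the same mistake, so the first mismatch points at the block whose padding is wrong.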

I'd imagine the syntax for this to be something like this:

#define macro MAIN() = takes(0) returns(0) {
     // ... other logic
     dest_0x18: #define padded(0x3f) {
          // 0x18160ddd
          __FUNC_SIG(totalSupply)
          __NON_PAYABLE_SELECTOR_CHECK()
          TOTAL_SUPPLY(callvalue)
          // padding directive implicitly pads block to be 0x3f (63) bytes large
     }    
}

I'm offering a $100 bounty (to be paid in mainnet ETH) to the contributor who implements this once merged.

Philogy avatar Oct 02 '23 15:10 Philogy

[GIF: "Challenge accepted"]

lmanini avatar Oct 12 '23 14:10 lmanini

What about making this a new jump table type instead of a new #define token? It seems like a convenient addition alongside the "regular" and "packed" ones. Is there any case where we wouldn't be able to determine the size of the "stop wall" at compile time? Also, does that mean every jump location has to be less than x bytes long, which might be constraining at some point?

iFrostizz avatar Oct 30 '23 22:10 iFrostizz

To construct a more efficient function dispatcher you want to directly convert a function selector into a jump destination while avoiding having to do a lookup in some table. This typically requires the entry JUMPDESTs for your functions to be at equally spaced intervals e.g.:

jump_offset = (selector % 16) * 64 // implies JUMPDESTs in 64-byte increments
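
As a rough illustration of the formula above, the following Python sketch maps a 4-byte selector to a jump offset. The 16 slots and 64-byte spacing come from the formula; the `slots`, `spacing`, and `base` parameters are assumptions added for illustration.

```python
def jump_offset(selector: int, slots: int = 16, spacing: int = 64, base: int = 0) -> int:
    """Map a 4-byte selector to the offset of its equally spaced JUMPDEST."""
    return base + (selector % slots) * spacing


# Selectors quoted earlier in this issue.
total_supply = jump_offset(0x18160DDD)  # low nibble 0xd -> slot 13
withdraw_to = jump_offset(0x205C2878)   # low nibble 0x8 -> slot 8
```

Under this formula, two selectors with the same remainder land in the same slot, so each function needs a distinct remainder, and every slot must start exactly `spacing` bytes after the previous one.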

However, your functions will rarely all be exactly the length you need, so you need to add some padding, e.g. (visualizations, not actual Huff):

dispatcher()
[ dest ]  [            body            ]
  fn1:     <logic> <logic> <pad> <pad>
  fn2:     <logic>  <pad>  <pad> <pad>

The issue is that if you change the logic of one of your functions, because you found an optimization or want to add a feature, the padding is now invalid:

dispatcher()
[ dest ]  [            body            ]
  fn1:     <logic>  <pad> <pad>   fn2:
 <logic>    <pad>   <pad> <pad>    -

Ideally I'd have some directive I can wrap my functions in, so that the padding is adjusted automatically and I get a helpful error message if I happen to exceed the set size, e.g.:

dispatcher()
padded (64) { fn1:     <logic> <logic> }
padded (64) { fn2:     <logic>              }

Not quite sure how a new jump table type would achieve this. I guess it would change how you define the constraint on the distance between labels. Do you mean something like this?

#define macro MAIN() = takes(0) returns(0) {
    __DISPATCHER()

    fn1: FN1()
    fn2: FN2()
    no3: REVERT()
    no4: REVERT()
    fn5: FN5()
}

#define jumptable fixed_size(64) {
    fn1 fn2 no3 no4 fn5
}

I like this less as it feels less direct. Having padding be defined in macros seems cleaner and would allow you to reuse such blocks via macros.

Philogy avatar Oct 30 '23 22:10 Philogy