sway icon indicating copy to clipboard operation
sway copied to clipboard

Fix bytecode printing mistaking the data section for actual instructions

Open mohammadfawaz opened this issue 3 years ago • 4 comments

Consider

fn main() -> b256 {
    0x3333333333333333333333333333333333333333333333333333333333333333
}

The generated bytecode, accroding to forc parse-bytecode is

  half-word   byte   op                    raw           notes
          0   0      JI(4)                 90 00 00 04   jump to byte 16
          1   4      NOOP                  47 00 00 00
          2   8      Undefined             00 00 00 00   data section offset lo (0)
          3   12     Undefined             00 00 00 28   data section offset hi (40)
          4   16     LW(63, 12, 1)         5d fc c0 01
          5   20     ADD(63, 63, 12)       10 ff f3 00
          6   24     LW(17, 63, 5)         5d 47 f0 05
          7   28     ADD(17, 17, 12)       10 45 13 00
          8   32     LW(16, 63, 4)         5d 43 f0 04
          9   36     RETD(17, 16)          25 45 00 00
         10   40     LOG(12, 51, 12, 51)   33 33 33 33
         11   44     LOG(12, 51, 12, 51)   33 33 33 33
         12   48     LOG(12, 51, 12, 51)   33 33 33 33
         13   52     LOG(12, 51, 12, 51)   33 33 33 33
         14   56     LOG(12, 51, 12, 51)   33 33 33 33
         15   60     LOG(12, 51, 12, 51)   33 33 33 33
         16   64     LOG(12, 51, 12, 51)   33 33 33 33
         17   68     LOG(12, 51, 12, 51)   33 33 33 33
         18   72     Undefined             00 00 00 00
         19   76     Undefined             00 00 00 20
         20   80     Undefined             00 00 00 00
         21   84     Undefined             00 00 00 28

Note that we're mistaking the data section (the 33s starting byte 40) as actual instructions (LOG), which is misleading.

mohammadfawaz avatar Dec 24 '22 04:12 mohammadfawaz

To know where the data section is, is non-trivial. The best we can do is assume that the word at byte offset 8 is the offset to it, which is what the disassembly does in comments above. But that is entirely by convention and could change in the future. forc-dis actually tracks what the value of $ds is to know where the data section is, which is itself slightly by convention, since $ds is reserved by the compiler and not the VM.

To get around the blown out data section problem in #3161 I was considering putting some values in a local data section, per function, rather than having a single monolithic data section. We're also considering putting const-ish data at the start of the binary for load-type constants. These are more potential examples of where the simple output from parse-bytecode would get confused.

otrho avatar Dec 24 '22 07:12 otrho

@otrho why don't we port forc-dis and make it an official plugin in the Sway repository, and have it replace forc parse-bytecode? I mean if it's much better and more accurate, might as well deprecate the old approach and not have to worry about fixing this. forc-dis is probably a better name too 🤷

mohammadfawaz avatar Jan 29 '23 14:01 mohammadfawaz

Sure. We need to add finish it off, there's still some todos in there and it needs tests, or at least checks to make sure it's up to date. Occasionally we still update fuel-asm (though I guess not so much after beta4) and forc-dis can get out of date.

otrho avatar Jan 30 '23 00:01 otrho

Moving this to the Sway project so that we can actually give it some attention.

mohammadfawaz avatar Feb 01 '23 10:02 mohammadfawaz