sway icon indicating copy to clipboard operation
sway copied to clipboard

Aggregates and memory management in the IR is generally broken.

Open otrho opened this issue 3 years ago • 2 comments

Although the distinction between copy and reference types is more explicit within the compiler now, it wasn't always the case and this is apparent in the IR design.

E.g.,

  • Operations like store and load are used on entire aggregates as a blanket 'write' or 'read'.
  • insert_value and extract_value were attempting to simplify using structs but just ambiguate them.
  • Arrays and structs are treated collectively as 'aggregates' but they actually don't share any utility.
  • Constant 'aggregates' are created out of thin air where necessary and just permanently stored (and leaked) on the stack.
  • The use of pointers is confused and inconsistent. get_ptr was initially only for getting the address of locals but now it's for arguments too and should support globals.

I propose at least the following fixes:

  • The removal of insert_value, extract_value, insert_element and extract_element and instead use get_ptr, load and store.
  • load and store are only for writing copy type values to a pointer.
  • Pointers are opaque, a la LLVM.
  • get_ptr is improved to be more like GEP from LLVM.
  • A global data section is introduced and mutability be strictly specified and adhered to. Reference typed constants may then be constructed there at compile time.

otrho avatar Sep 21 '22 04:09 otrho

Also currently get_ptr is compiled to an address and stored in a register and is then ignored. And the address calculations in ASM gen are extremely complicated and still have the potential to just fall over for large values that might not fit in an immediate and/or addressing is often quite redundant, calc'ing the same offsets repeatedly.

So, the ASM gen around memory needs a lot of work.

otrho avatar Sep 21 '22 07:09 otrho

I'm also finding now that managing enums, which are tagged unions in IR and ASM, to be extremely painful in the current ASM gen. In particular, ensuring the padding around the union to make sure the memory accesses are aligned and correct in a union of types of different sizes is very clumsy and should be made explicit in IR with decent memory management.

The workarounds in sway_core/src/asm_generation/data_section.rs need to be removed.

otrho avatar Sep 24 '22 10:09 otrho

Also thinking that in IR there's no point in making distinctions between mutable pointers and immutable. That sort of correctness can be guaranteed by the compiler and so all pointers might as well be mutable. If we create code which tries to write to the data section then that's a compiler bug, not an issue with IR.

otrho avatar Oct 21 '22 00:10 otrho

Summary of discussion on slack:

  • We'll introduce Alloca instructions. Loads and stores directly use Alloca as their pointer operand. (i.e., the Alloca Value has type "Pointer").
  • Rename get_ptr to get_element_ptr and change its semantics to take a value of "any pointer" and do offset computation.
  • The existing Pointer data structure will go away. Pointers will just be any other Value with a "pointer" type.
  • insert/extract value/element instructions will be removed. Since aggregates are always "in-memory", we don't deal with SSA values of aggregates, so these instructions aren't needed. Accessing elements of aggregates will be through get_element_ptr for offset computation and a load/store on that.
  • We will retain a per-function database of locals, and update it when creating Alloca instructions. This is for convenience (such as printing all locals at the beginning etc).
  • All of this, along with the recent block args and mem2reg changes have made it more difficult to map IR vregs to source variables. So we need a name field for every Value. If that's doable in this PR (not necessarily populating the name field for every Value, but at least introducing it and populating it only for Alloca insts), good. If it isn't doable, add a "name" field to the Alloca instruction temporarily so we know what we're referring to when printing a Load/Store/Alloca.

vaivaswatha avatar Oct 26 '22 13:10 vaivaswatha