sway icon indicating copy to clipboard operation
sway copied to clipboard

Implement Scalar Replacement of Aggregates

Open vaivaswatha opened this issue 2 years ago • 3 comments

fn main() -> u64 {
    let s = S { 
        x: 2,
        y: 3,
        z: 4,
        w: 5
    };
    s.x
}

must become

fn main() -> u64 {
    2
}

Another (C-like) example where we can do SROA. This is from a GCC experience report.. It's an interesting read and likely a guidance for us.

struct S
{
   int i;
    int j;
};
int foo1 (bool b)
{
    struct S s;
    s.i = 2;
    if (b)
      s.i = 1;
    else
      s.i--;
    return s.i;
}

Tracked in: https://github.com/FuelLabs/sway/issues/2383

vaivaswatha avatar Oct 25 '22 14:10 vaivaswatha

I'm hoping that this optimization will be safe to do in every case for us, at least for structs, because structs are always accessed statically and we don't have to worry much about confusing pointer math (unless there are asm blocks). Arrays, on the other hand, can be accessed with a dynamic index.

  • For structs, I imagine the flow being: replace the local corresponding to the struct with a list of locals of its individual elements (has to be done recursively), and then update uses everywhere. Then run mem2reg on the result. I think LLVM runs mem2reg from within SROA which we can also do if we like.
  • For Arrays, this optimization is only valid if all accesses to the array are provably static. If so, the same algorithm can be used.

An interesting situation arises when you have nested structs and arrays (e.g. an array of structs containing other arrays, and so on). So the static checks need to be recursive.

One thing to note is that the current implementation of SROA in LLVM does not break up aggregates in the expected way. Instead, it looks at byte slices within an aggregate and analyzes whether different slices are statically accessed. Those slices may not correspond to any recognizable data fields but eventually things just work out. I think this method is an overkill for us because we don't have to worry much about unsafe pointer math. In contrast, the older implementation, called ScalarReplAggregates is probably a better and easier model to follow.

As always, @vaivaswatha you're free to do whatever you want here. Just spitballing random thoughts 😄

mohammadfawaz avatar Oct 25 '22 15:10 mohammadfawaz

Depends on #2819

vaivaswatha avatar Oct 25 '22 15:10 vaivaswatha

@mohammadfawaz Thank you for what you wrote. As far as I can tell, My plan concurs with what you've said. I'll respond back in more detail (if necessary) after I do a bit of reading on the topic.

vaivaswatha avatar Oct 25 '22 15:10 vaivaswatha