ScratchABlock icon indicating copy to clipboard operation
ScratchABlock copied to clipboard

SSA is upon us...

Open pfalcon opened this issue 5 years ago • 6 comments

So, I pushed initial support for conversion to SSA, e.g. tests: https://github.com/pfalcon/ScratchABlock/commit/72a047e4c10aea2f6b5a2782a38cb57a2737ef0e . To remind, I wanted to postpone introduction of SSA, to not fall into common trap when people treat as some magic device, without good understanding of the frontier where it really becomes needed, how to deal with it, what drawbacks it has, etc.

All this time I (in the background) pumped up my understanding and well, now I hoped I grasped the essence of its construction, that the one true SSA is the maximal SSA, its construction is simple and devoid of any arcane knowledge, and any other form is nothing but implementation-level optimizations, and they're thus strictly optional.

Going beyond construction is definitely on next TODOs, but first syntax for phi functions need to be set. How it's done currently (examples above) is in a textbook way, where phi just takes list of values, in the same order as predecessors (and that order is of course stable in ScratchABlock, partly exactly because of anticipation of phi() introduction).

That should work pretty well for intermediate output, but of course barely suitable as input, where predecessors aren't so explicit, and their order shouldn't have to be clear/stable either. And the input case is of course important (interoperability with LLVM is definitely on the table).

So, there're two facets of it:

  1. External syntax, i.e. how it's rendered in PseudoC source program. This can be as simple as normal C fuction syntax, e.g. phi(label1, $r0_1, label2, $r0_2), or something fancier, e.g. phi(label1: $r0_1, label2: $r0_2). Of course, something fancier is definitely not compatible with C syntax, and staying within its bounds is the aim of PseudoC. But then phi() is not an executable function, so having not a C syntax is somehow not completely crazy idea.
  2. Internal syntax, i.e. should there be bblock addresses retained as arguments of phi's EXPR, or not? Common sense says it's superfluous, because in internal representation, there's no ambiguity. But perhaps as a debugging measure still may be useful?

For reference, LLVM's syntax:

%indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]

That order of bblock addr vs value I don't like. The most important thing in phi is not value, but the fact that it depends on predecessor block, so it would rather go first.

$indvar = phi(LoopHeader, 0, Loop, $nextindvar)
$indvar = phi(LoopHeader: 0, Loop: $nextindvar)

Well, to make it more C-like, and still preserve pair-ness, could do:

$indvar = phi((LoopHeader, 0), (Loop, $nextindvar))

Well, that's still not really a valid C syntax. Then could do:

$indvar = phi(_(LoopHeader, 0), _(Loop, $nextindvar))

But that's really going down the rabbit hole...

pfalcon avatar Nov 16 '18 21:11 pfalcon

@maximumspatium, Comments are welcome.

pfalcon avatar Nov 16 '18 21:11 pfalcon

@maximumspatium, Comments are welcome.

Paul, there is not much I can add to your elaboration. As you said, the LLVM syntax is incompatible with C syntax and rather cumbersome to read. The other suggestions look better, although each of them has its own pros and cons:

$indvar = phi(LoopHeader, 0, Loop, $nextindvar) doesn't show clearly that even and odd parameters belong together

$indvar = phi(LoopHeader: 0, Loop: $nextindvar) is better but that key:val syntax is typical for script languages, not C

$indvar = phi((LoopHeader, 0), (Loop, $nextindvar)) that tuple notation looks rather Python-like than C-like

$indvar = phi(_(LoopHeader, 0), _(Loop, $nextindvar)) is the best option IMHO, besides the attempt to force tuples into C-syntax by using "_".

What's about this:

$indvar = phi(LoopHeader[0], Loop[$nextindvar])

Anyway, it fully conforms to the C-syntax and doesn't introduce anything superfluous. The notion of location:variable is represented as arrays with subscripts. They can be interpreted as dict[val], too.

maximumspatium avatar Nov 17 '18 19:11 maximumspatium

Thanks for comments!

$indvar = phi(LoopHeader[0], Loop[$nextindvar])

Well, that's creative idea, but I'm not sure how that fares on "ease of understanding" scale ;-).

I guess, so far I lean towards just having a normal C argument syntax (comma-separated).

pfalcon avatar Nov 17 '18 20:11 pfalcon

$indvar = phi(LoopHeader[0], Loop[$nextindvar]) arrays with subscripts.

Thinking about it more, "subscript" is a keyword. But what is usually used as a subscript? Well, address is. Indeed, why scheme for naming SSA vars is subscripting them with address of instructions which defined them, e.g. var4f0 is var defined at 0x4f0.

But then, the order of subscription in the example above should be the opposite:

$indvar = phi(0[LoopHeader], $nextindvar[Loop])

Now, that's more understandable. For me. May be still a quiz for outsiders. But I like it, it's now really creative with a logic behind it. Still more time to think about it of course.

pfalcon avatar Nov 18 '18 09:11 pfalcon

Another matter for own reference: the whole idea of introducing SSA was of course to track defs/uses precisely. Well, it now needs to be done. The "rename_ssa_vars" pass should be augmented to create sparse use-def, def-use chains, and, the whole point - other xform passes should be augmented to update that info as needed.

The latter may be not as easy as it seems. Or rather, previously passes didn't try to update this info because heck, in non-SSA form it's not that easy (maybe). Now it should be easier with SSA, but the question if existing passes are changed to update that info only for SSA case, is it sane? Or maybe it will be possible to update them for the general case after all? Who will write tests for all that?

There's of course Gordian cut "solution" to this issue - the whole reason why SSA is usually introduced is to have more efficient passes. So, just rewrite (i.e. duplicate) all existing passes in their SSA variant, problem solved, lol.

pfalcon avatar Nov 18 '18 10:11 pfalcon

Still didn't get it, lol. SSA construction is broken, need to restart from the beginning...

pfalcon avatar Nov 18 '18 20:11 pfalcon