Experiment: Alternative SSA construction algorithm
Hi,
this is mainly an experiment; I wanted to play around with the API and see how an alternative SSA construction algorithm can be implemented. There are a few notable problems I encountered:
- The original algorithm is supposed to work on-the-fly in a single step. This does not work with the current API. `Value`s (operands) and `Reference`s (successor arguments) are bound eagerly. I assume terminating ops could be emitted only after everything else was done, but that's not easy to achieve currently either. For operands, I don't see any way to say "oh, please use a different value" after the operation is passed to the builder. I'm not sure how relevant this is for other transformations, but if we find a simple solution to this without giving up immutability, I could imagine it to be very powerful.
- I'm currently using more data structures to cope with the two-step variant than the original algorithm itself requires:
  - The `predecessors` map could be avoided at the cost of additional computation whenever only filled blocks should be considered.
  - The `loads` map is required as we can't replace the loads directly during the analysis step.
  - The `additionalParameters` map could probably be removed if `Block.Builder` allowed removing parameters.
  - The `deletedPhis` set is also needed to cope with the fact that we look up current defs in both steps.
- I noticed that for the method in `TestTraverse`, this implementation gets rid of one parameter that the existing implementation does not get rid of. I'm not sure if it is designed to produce minimal SSA, but that looks odd. (`c` is passed to the while body, but it is never read there before being written.)
I think the algorithm itself is pretty straightforward, but the extra steps needed to adapt to the API somewhat nullify that. I tried to document the implementation details that differ from the paper, but if there are any questions or suggestions on how this could be improved, please let me know.
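For context, here is a minimal, self-contained sketch of the core read/write recursion this kind of on-the-fly construction is built around (in the style of Braun et al.'s SSA construction paper, going by the sealing/phi terminology used here); every type below is an illustrative stand-in, not the code model's or this PR's actual API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: on-the-fly SSA def lookup with incomplete phis.
final class OnTheFlySsa {
    static final class Block {
        final List<Block> predecessors = new ArrayList<>();
        boolean sealed; // true once all predecessors are known
    }
    sealed interface Value permits Def, Phi {}
    record Def(String name) implements Value {}
    static final class Phi implements Value {
        final Block block;
        final List<Value> operands = new ArrayList<>();
        Phi(Block block) { this.block = block; }
    }

    final Map<String, Map<Block, Value>> currentDef = new HashMap<>();
    final Map<Block, Map<String, Phi>> incompletePhis = new HashMap<>();

    void writeVariable(String variable, Block block, Value value) {
        currentDef.computeIfAbsent(variable, v -> new HashMap<>()).put(block, value);
    }

    Value readVariable(String variable, Block block) {
        Value local = currentDef.getOrDefault(variable, Map.of()).get(block);
        return local != null ? local : readVariableRecursive(variable, block);
    }

    private Value readVariableRecursive(String variable, Block block) {
        Value value;
        if (!block.sealed) {
            // Not all predecessors known yet: create an operand-less phi
            // and complete it when the block is sealed (sealing omitted here).
            Phi phi = new Phi(block);
            incompletePhis.computeIfAbsent(block, b -> new HashMap<>()).put(variable, phi);
            value = phi;
        } else if (block.predecessors.size() == 1) {
            value = readVariable(variable, block.predecessors.get(0));
        } else {
            // Break cycles: record the phi as the current def before recursing.
            Phi phi = new Phi(block);
            writeVariable(variable, block, phi);
            for (Block pred : block.predecessors) {
                phi.operands.add(readVariable(variable, pred));
            }
            value = phi; // the paper also removes trivial phis here
        }
        writeVariable(variable, block, value);
        return value;
    }
}
```

The eager-binding problem described above shows up in `phi.operands`: this style of algorithm mutates a phi's operand list after the phi has already been handed out as a value, which the immutable builder API does not allow.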
Progress
- [x] Change must not contain extraneous whitespace
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/babylon.git pull/214/head:pull/214
$ git checkout pull/214
Update a local copy of the PR:
$ git checkout pull/214
$ git pull https://git.openjdk.org/babylon.git pull/214/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 214
View PR using the GUI difftool:
$ git pr show -t 214
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/babylon/pull/214.diff
:wave: Welcome back hgreule! A progress list of the required criteria for merging this PR into code-reflection will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@SirYwell This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
Experiment: Alternative SSA construction algorithm
Reviewed-by: psandoz
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 3 new commits pushed to the code-reflection branch:
- f69e732433c903febc2ef647c5459139130b9acc: Dont use hashbang for launcher scripts
- cb52322017ca649c25c40b27dd9478419521f8f3: Fixed env.bash issue found on Linux (Mac ok?)
- 94b5149779bed3715885955a9d9677f9e9d97d21: bld script and bldr tools cleanup
Please see this link for an up-to-date comparison between the source branch of this pull request and the code-reflection branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@PaulSandoz) but any other Committer may sponsor as well.
➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).
This is a good experiment. The Cytron paper is quite old and I did not check to see if there were more recent alternative algorithms that were simpler. It will take me some time to fully digest the paper and your implementation. My gut feeling is this might be a good default replacement for the current implementation. I am curious as to how easily you found the adaptation from phi nodes to block parameters. Generally I have found it easy to adapt and it can often simplify things.
You noticed a deficiency in the Cytron implementation (and possibly the algorithm), regarding a variable that is just stored to. We have noticed that too.
I don't currently see a way to retain immutability and post-adapt what is currently built before building completes. Maybe as I look more deeply a few ideas will emerge. One approach I have used when transforming is, instead of building up data structures, to build up (capturing) functions on the first pass to apply when transforming on the second pass. I don't yet know if this would make your implementation simpler.
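As an illustration of that idea (hypothetical names throughout, not an actual API), the first pass would capture closures over the not-yet-existing output instead of recording facts in maps:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of "capturing functions on the first pass": record
// deferred fix-ups during analysis and replay them during transformation.
// C stands in for whatever the transform threads through, e.g. a Block.Builder.
final class DeferredActions<C> {
    private final List<Consumer<C>> pending = new ArrayList<>();

    // analysis pass: capture what should happen later
    void defer(Consumer<C> fixup) {
        pending.add(fixup);
    }

    // transformation pass: apply all captured fix-ups to the real output
    void replay(C context) {
        pending.forEach(f -> f.accept(context));
    }
}
```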
I am curious as to how easily you found the adaptation from phi nodes to block parameters. Generally I have found it easy to adapt and it can often simplify things.
Conceptually, I think block parameters are easy to understand. One hurdle I could imagine when doing global transformation is the "when do I need them", or rather the "why do I sometimes need them". I'm not sure how I would explain that without explaining phis first. But for local transformations, that doesn't really matter at all, reducing the complexity in that aspect.
It also fits in well with the design of the API (I think it wouldn't fit into a sea-of-nodes style API), and transformations between the representations seem pretty simple too.
I don't currently see a way to retain immutability and post-adapt what is currently built before building completes. Maybe as I look more deeply a few ideas will emerge. One approach I have used when transforming is, instead of building up data structures, to build up (capturing) functions on the first pass to apply when transforming on the second pass. I don't yet know if this would make your implementation simpler.
I'm not sure how useful that is in general, but I think terminating ops could be restricted to only be added to the block after everything else is done. Combined with the ability to remove block parameters again, it would potentially make some things a little easier. However, that also feels like odd special casing that might be more confusing than helpful.
(BTW the changes from `Set` to `SequencedSet` are nice, feel free to split that out into a separate PR.)
(Still working my way through the paper + code, I'll send some further thoughts later today.)
Explaining phis at that level is, I would imagine, a proxy for explaining control flow join points. If one thinks of blocks as little functions that jump to other functions as a tail call, it might actually be easier to explain.
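To make the analogy concrete, here is a made-up example (plain Java, nothing from the code model): the join block is just a function whose parameter plays the role of the phi, and each branch reaches it via a tail call.

```java
// A made-up illustration of "blocks as little functions": b3's parameter x
// is exactly the value a phi / block parameter would merge at the join point.
final class JoinAsFunction {
    static int entry(boolean cond) {
        return cond ? b3(42) : b3(-42); // each branch "jumps" to b3 with an argument
    }

    static int b3(int x) { // x plays the role of the phi / block parameter
        return x + 1;
    }
}
```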
Combined with the ability to remove block parameters again, it would potentially make some things a little easier.
This should be possible to support; `Block.Builder::parameters` currently returns an unmodifiable list. I was hesitant about exposing a mutable list, since removing a parameter that is used will create an invalid model. But we could support removing block parameters that are not currently used and have no successor arguments associated with them. Although, that probably means it is not very useful.
In general my approach to the design has been to lean into multiple passes (analysis/transformation) rather than try to optimize for single passes. That's a trade-off. We are not trying to build a framework for compiler pipelines like, say, LLVM, MLIR, or C2, although there are obvious similarities to key aspects.
I have a better grasp now (not complete). This algorithm really leans on the mutability of a phi node's operand list and user list, and on replacement of uses of a phi with another value. I don't see how we can make it work in one pass with the current code model design. However, it's still rather good in two passes, better than the Cytron implementation IMO.
My suggestion would be to lean into the two passes and make the second pass simpler for the resolution of values, for load operation results and additional successor arguments. It's either replacement with another `Value` or, say, a `BlockParameterHolder` that holds the actual `Block.Parameter` to use. When a `Block` is processed to append the additional block parameters, given the list of holders, it will update the holders.
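A minimal sketch of that holder idea, assuming `Block.Parameter` from the code model and inventing everything else:

```java
import java.lang.reflect.code.Block;

// Hypothetical holder, as suggested above: second-pass consumers hold a
// reference to the holder, and it is filled in once the block's extra
// parameter has actually been created.
final class BlockParameterHolder {
    private Block.Parameter parameter;

    void fill(Block.Parameter parameter) {
        if (this.parameter != null) throw new IllegalStateException("already filled");
        this.parameter = parameter;
    }

    Block.Parameter get() {
        if (parameter == null) throw new IllegalStateException("parameter not yet appended");
        return parameter;
    }
}
```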
In our current approach it is possible to SSA transform models with nested operations, but the constraint is that variables declared outside a body can only be loaded from within a body. This obviously works well for lambdas. But it means we cannot support, say, the SSA transformation of the high-level model while retaining the structure, e.g. the model for this code snippet:
```java
int x = 0;
if (...) {
    x = 42;
} else {
    x = -42;
}
```
We would have to transform the operation modeling the if statement into one that yields the resulting value of x. I explored that and it quickly became complex. So instead we throw an exception.
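In other words, the structure-retaining form would have to turn the statement into something that behaves like the expression below (with `cond` standing in for the elided condition):

```java
// Roughly what the if operation would have to express: yield the merged
// value instead of storing to the variable x.
int x = cond ? 42 : -42;
```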
There is also another case, not currently supported, where we cannot fully SSA transform: if a variable is updated in a try block and accessed in a catch or finally block, or updated in a catch block and accessed in a finally block. We should either fail in such cases or leave the variable intact (perhaps via some configuration parameter). This requires analysis of stores within exception regions. We might be able to transform and split the code, but that is harder and may not be worth the effort given the likely edge-case nature.
This should be possible to support; `Block.Builder::parameters` currently returns an unmodifiable list. I was hesitant about exposing a mutable list, since removing a parameter that is used will create an invalid model. But we could support removing block parameters that are not currently used and have no successor arguments associated with them. Although, that probably means it is not very useful.
Yes, that's a restriction that is probably also hard to track, and not restricting removal wouldn't be a good API, I think.
This algorithm really leans on the mutability of a phi node's operand list and user list, and on replacement of uses of a phi with another value. I don't see how we can make it work in one pass with the current code model design. However, it's still rather good in two passes, better than the Cytron implementation IMO.
I agree. I also think if we want to properly support try-catch-finally constructs with it (as you mentioned), it wouldn't be one pass even if we had mutability. So two passes are probably fine.
My suggestion would be to lean into the two passes and make the second pass simpler for the resolution of values, for load operation results and additional successor arguments. It's either replacement with another `Value` or, say, a `BlockParameterHolder` that holds the actual `Block.Parameter` to use. When a `Block` is processed to append the additional block parameters, given the list of holders, it will update the holders.
I'll try to improve on that.
In our current approach it is possible to SSA transform models with nested operations, but the constraint is that variables declared outside a body can only be loaded from within a body. This obviously works well for lambdas. But it means we cannot support, say, the SSA transformation of the high-level model while retaining the structure, e.g. the model for this code snippet
I think this is a general problem that comes with the (very powerful) flexibility of the model. It's generally difficult to set boundaries on what a valid input code model is for a specific transformation operation, and then there might still be two options: 1) in such a case, do not touch that variable and only transform the rest, or 2) throw an exception if an unsupported code element is encountered. Both might make sense in specific scenarios, or both variants could be combined depending on what kind of unsupported elements are encountered.
Without trying hard, I also noticed that the current transform API isn't well suited to cleaning up e.g. empty blocks or combining blocks. Especially with the SSA transform, we end up with unneeded blocks in many places, so I thought that could be a potential improvement, but I didn't find a way to do it without a lot of additional work.
I think this is a general problem that comes with the (very powerful) flexibility of the model. It's generally difficult to set boundaries on what a valid input code model is for a specific transformation operation, and then there might still be two options: 1) in such a case, do not touch that variable and only transform the rest, or 2) throw an exception if an unsupported code element is encountered. Both might make sense in specific scenarios, or both variants could be combined depending on what kind of unsupported elements are encountered.
Yes, since it is specific to the operations present in the model. I used the current SSA implementation carefully in the Triton example, when translating the operation modeling the Java for statement, modeling a counted loop, into a Triton SSA counted loop expression.
Without trying hard, I also noticed that the current transform API isn't well suited to cleaning up e.g. empty blocks or combining blocks. Especially with the SSA transform, we end up with unneeded blocks in many places, so I thought that could be a potential improvement, but I didn't find a way to do it without a lot of additional work.
Indeed, it's often easier not to do that, and to apply it as a separate transformation if needed. When a body is built, any empty blocks with no predecessors are removed, and then the blocks are topologically sorted. This will retain non-empty blocks with no predecessors, occurring at the end, and I am unsure whether it is appropriate to remove them or not -- is it intentional or an error? It was intentional, for simplicity, to retain blocks that unconditionally branch to other blocks that could otherwise be merged; lowering will likely do this.
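For what it's worth, such a merging pass would boil down to a check along these lines. This is a sketch only: the accessor names below are my assumptions about the code model API (`terminatingOp`, `predecessors`, `CoreOp.BranchOp`, `targetBlock`), not verified against it.

```java
import java.lang.reflect.code.Block;
import java.lang.reflect.code.op.CoreOp;

// Sketch only; accessor names are assumptions about the code model API.
// A block can be folded into its successor when it ends in an unconditional
// branch and is that successor's sole predecessor.
static boolean mergeableIntoSuccessor(Block block) {
    if (!(block.terminatingOp() instanceof CoreOp.BranchOp branch)) {
        return false; // conditional or exceptional terminator: keep the block
    }
    Block target = branch.branch().targetBlock();
    return target.predecessors().size() == 1;
}
```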
@SirYwell This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
I experimented with moving around a few things, mainly the code in the `public void apply(Block.Builder block, Block b)` method (dealing with new block parameters) and the load situation in the second pass. Somehow I always ended up confusing myself about what's mapped and what isn't. I didn't have the impression things got clearer (but maybe I went in a wrong direction). Maybe you have more concrete ideas that I missed?
I also noticed that CopyContextImpl works recursively (in the sense of looking at parent contexts) for values but not for blocks and successors. Is that intended?
I experimented with moving around a few things, mainly the code in the `public void apply(Block.Builder block, Block b)` method (dealing with new block parameters) and the load situation in the second pass. Somehow I always ended up confusing myself about what's mapped and what isn't. I didn't have the impression things got clearer (but maybe I went in a wrong direction). Maybe you have more concrete ideas that I missed?
Not at the moment. How about we do the following? Switch to use the new implementation, ensuring tests pass. Keep the old one too and ensure it is explicitly tested, since it is a good test case. And incrementally, if needed, update the new implementation.
I also noticed that `CopyContextImpl` works recursively (in the sense of looking at parent contexts) for values but not for blocks and successors. Is that intended?
Yes, blocks can only reference other blocks from within the same body. Do you need access to a parent context?
Not at the moment. How about we do the following? Switch to use the new implementation, ensuring tests pass. Keep the old one too and ensure it is explicitly tested, since it is a good test case. And incrementally, if needed, update the new implementation.
Okay, I'll do that. Is there a way in jtreg to run tests with different system properties? This way we could just check for the property in one of the transform methods (in one of the classes) and delegate to the other depending on what was passed. This is how I tested it locally, but running the tests for both variants this way (where they don't have different results that are relevant to the test) might make sense. WDYT?
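The delegation I have in mind would look roughly like this (a hypothetical sketch: the property name is made up, and it assumes the new `SSAConstruction` exposes the same static `transform` shape as the existing `SSA`):

```java
import java.lang.reflect.code.Op;
import java.lang.reflect.code.analysis.SSA;

// Hypothetical switch between the two implementations, driven by a system
// property so tests can exercise both. Property name is made up.
final class SsaDispatch {
    static <T extends Op> T transform(T op) {
        return Boolean.getBoolean("babylon.ssa.construction")
                ? SSAConstruction.transform(op)
                : SSA.transform(op);
    }
}
```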
Yes, blocks can only reference other blocks from within the same body. Do you need access to a parent context?
I was mainly confused by the difference, without any need for it. But I also experimented with "preparing" a copy context that I then pass to the `Op#transform(CopyContext, OpTransformer)` call. That also didn't work well for other reasons, and it might not be an intended use at all, but in such a scenario it would matter.
Not at the moment. How about we do the following? Switch to use the new implementation, ensuring tests pass. Keep the old one too and ensure it is explicitly tested, since it is a good test case. And incrementally, if needed, update the new implementation.
Okay, I'll do that. Is there a way in jtreg to run tests with different system properties? This way we could just check for the property in one of the transform methods (in one of the classes) and delegate to the other depending on what was passed. This is how I tested it locally, but running the tests for both variants this way (where they don't have different results that are relevant to the test) might make sense. WDYT?
Yes, see here
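(For reference, jtreg supports this via multiple `@run` actions in `othervm` mode with `-D` flags; the property name below is again made up for illustration:)

```java
/*
 * @test
 * @run testng/othervm -Dbabylon.ssa.construction=false TestSSA
 * @run testng/othervm -Dbabylon.ssa.construction=true TestSSA
 */
```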
Yes, blocks can only reference other blocks from within the same body. Do you need access to a parent context?
I was mainly confused by the difference, without any need for it. But I also experimented with "preparing" a copy context that I then pass to the `Op#transform(CopyContext, OpTransformer)` call. That also didn't work well for other reasons, and it might not be an intended use at all, but in such a scenario it would matter.
Ok. For some wider context, I am wondering if it is possible to place all the transform logic within `OpTransformer`. What's missing is the code to transform an input body into output operations added to a block builder. `OpTransformer` should only use public APIs. This may make the transformation seem less special; it's just a particular composition of traversal and building with some bookkeeping between input and output code items. (Further, `CopyContext` is a terrible name!)
Yes, see here
Thanks, I adjusted some tests where it seemed to make sense to test both variants. Let me know if there's anything else to do there. The current tests all pass; only one needed changes due to different numbering (I wonder why we don't have a block 0 anymore?).
Ok. For some wider context, I am wondering if it is possible to place all the transform logic within `OpTransformer`. What's missing is the code to transform an input body into output operations added to a block builder. `OpTransformer` should only use public APIs. This may make the transformation seem less special; it's just a particular composition of traversal and building with some bookkeeping between input and output code items. (Further, `CopyContext` is a terrible name!)
That sounds interesting and might be worth further investigation.
This looks good. I verified it fixes the uninitialized pattern variables issue mentioned in #264.
Here's a patch that I believe fixes the algorithm to support uninitialized variables.
diff --git a/src/java.base/share/classes/java/lang/reflect/code/analysis/SSAConstruction.java b/src/java.base/share/classes/java/lang/reflect/code/analysis/SSAConstruction.java
index e5133490128..25de41e9691 100644
--- a/src/java.base/share/classes/java/lang/reflect/code/analysis/SSAConstruction.java
+++ b/src/java.base/share/classes/java/lang/reflect/code/analysis/SSAConstruction.java
@@ -70,8 +70,12 @@ private void prepare(Op nestedOp) {
}
case CoreOp.VarAccessOp.VarStoreOp store ->
writeVariable(store.varOp(), store.parentBlock(), new Holder(store.storeOperand()));
- case CoreOp.VarOp initialStore ->
- writeVariable(initialStore, initialStore.parentBlock(), new Holder(initialStore.initOperand()));
+ case CoreOp.VarOp initialStore -> {
+ Val val = initialStore.isUninitialized()
+ ? Uninitialized.VALUE
+ : new Holder(initialStore.initOperand());
+ writeVariable(initialStore, initialStore.parentBlock(), val);
+ }
case Op.Terminating _ -> {
Block block = op.parentBlock();
// handle the sealing, i.e. only now make this block a predecessor of its successors
@@ -183,6 +187,7 @@ private void sealBlock(Block block) {
private Value resolveValue(CopyContext context, Val val) {
return switch (val) {
+ case Uninitialized _ -> throw new IllegalStateException("Uninitialized variable");
case Holder holder -> context.getValueOrDefault(holder.value(), holder.value());
case Phi phi -> {
List<Phi> phis = this.additionalParameters.get(phi.block());
@@ -253,6 +258,10 @@ sealed interface Val {
record Holder(Value value) implements Val {
}
+ enum Uninitialized implements Val {
+ VALUE;
+ }
+
record Phi(CoreOp.VarOp variable, Block block, List<Val> operands, Set<Object> users) implements Val {
Phi(CoreOp.VarOp variable, Block block) {
this(variable, block, new ArrayList<>(), new HashSet<>());
Using `IllegalStateException` might be too broad in general. `TestUninitializedVariable` checks that it is encountered, but it might be thrown by something else. The test passed without these changes because the SSA algorithm fails for different reasons, on the call to `initialStore.initOperand()`.
The patch looks good and simple. I assume exceptions will generally need to be refined for the whole API.
/integrate
I'll try to find some time to experiment with bailing out on variables in try/catch/finally blocks where needed.
@SirYwell Your change (at version 9458769a2674bfe851958249fe3ad8781ef45591) is now ready to be sponsored by a Committer.
/sponsor
Going to push as commit a9fbdee81066fd5f840afd537005e04e44ceedef.
Since your change was applied there have been 3 commits pushed to the code-reflection branch:
- f69e732433c903febc2ef647c5459139130b9acc: Dont use hashbang for launcher scripts
- cb52322017ca649c25c40b27dd9478419521f8f3: Fixed env.bash issue found on Linux (Mac ok?)
- 94b5149779bed3715885955a9d9677f9e9d97d21: bld script and bldr tools cleanup
Your commit was automatically rebased without conflicts.
@PaulSandoz @SirYwell Pushed as commit a9fbdee81066fd5f840afd537005e04e44ceedef.
:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.