jdk
jdk copied to clipboard
8326306: RISC-V: Re-structure MASM calls and jumps
Hi, please consider.
We have code that directly use the asm for call/jumps instead masm. Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. Suggested by riscv-asm-manual
j offset jal x0, offset Jump
jal offset jal x1, offset Jump and link
jr rs jalr x0, rs, 0 Jump register
jalr rs jalr x1, rs, 0 Jump and link register
ret jalr x0, x1, 0 Return from subroutine
call offset auipc x1, offset[31:12]; jalr x1, x1, offset[11:0] Call far-away subroutine
tail offset auipc x6, offset[31:12]; jalr x0, x6, offset[11:0] Tail call far-away subroutine
But these can only be implemented like this if you have small enough application. The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). We don't have GOT, instead we materialize, so there is still differences between these and ours.
This patch:
- Tries to follow these suggested mappings as good we can.
- Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention)
- To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'.
- I enabled c.j, but right now we never generate it.
- As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags))
I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) While looking into our calls it was a bit confusing, this helps.
Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) Re-running tests, had some last minute changes.
Thanks, Robbin
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8326306: RISC-V: Re-structure MASM calls and jumps (Enhancement - P4)
Reviewers
- Fei Yang (@RealFYang - Reviewer)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18942/head:pull/18942
$ git checkout pull/18942
Update a local copy of the PR:
$ git checkout pull/18942
$ git pull https://git.openjdk.org/jdk.git pull/18942/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18942
View PR using the GUI difftool:
$ git pr show -t 18942
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18942.diff
Webrev
:wave: Welcome back rehn! A progress list of the required criteria for merging this PR into master
will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@robehn This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8326306: RISC-V: Re-structure MASM calls and jumps
Reviewed-by: fyang, luhenry
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 33 new commits pushed to the master
branch:
- 3d511ff63e59f542ae20c722bfef1c867cd1da0e: 8329748: Change default value of AssertWXAtThreadSync to true
- 67f03f2a4f5ac12748ffbf5c04f248a60869e180: 8332533: RISC-V: Enable vector variable shift instructions for machines with RVV
- 5f804b2ec12627b593353ceeab881187b0bb5cd6: 8329825: Clarify the value type for java.net.SocketOptions.SO_LINGER
- 52eda79522a5bd71b527e5946b654a331b021473: 8332538: Switch off JIT memory limit check for TestAlignVectorFuzzer.java
- d999b81e7110751be402012e1ed41b3256f5895e: 8331572: Allow using OopMapCache outside of STW GC phases
- 8291c94bcdbb01beddc94f290f2749841404cc0c: 8331224: ClassCastException in ObjectInputStream hides ClassNotFoundException
- 42e3c842ae2684265c794868fc76eb0ff2dea3d9: 8332086: Remove the usage of ServiceLoader in j.u.r.RandomGeneratorFactory
- 5cf8288b8071bdcf0c923dd7ba36f91bc7594ef3: 8332153: RISC-V: enable tests and add comment for vector shift instruct (shared by vectorization and Vector API)
- ae9ad862ee54e119553efec919f1061dca36b954: 8331934: [s390x] Add support for primitive array C1 clone intrinsic
- 3479b46c5bea3afd92b6ab4acd2fe7f274df38aa: 8332595: Serial: Remove unused TenuredGeneration::should_collect
- ... and 23 more: https://git.openjdk.org/jdk/compare/9bb6169a1cba900fa79d63119696efe265762083...master
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
➡️ To integrate this PR with the above commit message to the master
branch, type /integrate in a new comment.
@robehn The following labels will be automatically applied to this pull request:
-
hotspot
-
shenandoah
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.
/label remove shenandoah
Webrevs
- 12: Full - Incremental (d882cd59)
- 11: Full - Incremental (b663e872)
- 10: Full - Incremental (c9b59d93)
- 09: Full - Incremental (d53e9694)
- 08: Full - Incremental (d08afa51)
- 07: Full - Incremental (8408c027)
- 06: Full - Incremental (38bd4187)
- 05: Full - Incremental (d8fbb00b)
- 04: Full - Incremental (6b3e4c47)
- 03: Full - Incremental (cb5ec446)
- 02: Full - Incremental (e9bd4d6b)
- 01: Full - Incremental (31361202)
- 00: Full (72c3b0bd)
@robehn
The shenandoah
label was successfully removed.
My merge with master was hit by: https://bugs.openjdk.org/browse/JDK-8331546 Re-merge when it's fixed.
Updated change looks great! Thank you.
Thanks for sticking with it, and thanks for a good review!
So now MacroAssembler::rt_call
is effectively this after this PR change:
void MacroAssembler::rt_call(address dest, Register tmp) {
RuntimeAddress target(dest);
relocate(target.rspec(), [&] {
int32_t offset;
la(tmp, target.target(), offset);
jalr(tmp, offset);
});
}
Here are some of my thoughts on the next step/cleanup for reference:
-
For
MacroAssembler::far_call
andMacroAssembler::far_jump
, I would suggest we use directauipc
instead ofla
for them as the destination is expected to be in code cache. This will help distinguish these two functions fromMacroAssembler::rt_call
. -
Since there are only a few uses of
MacroAssembler::call
, we might want to remove this function replacing its callsites withMacroAssembler::rt_call
. This would help avoid possible confusion betweencall
andrt_call
. I don't think the relocation added byrt_call
would make a difference. (Our aarch64 counterpartlea + blr
always emits relocation for address literal: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/assembler_aarch64.cpp#L141) -
As we discussed in this PR, we also want to fix Shenandoah related callsites to
call_VM_leaf
. And letcall_VM_leaf
to usert_call
so that we emitauipc
when possible, which should be better in performance. -
We might want to turn some other places in the code where we do
mv + jalr
into art_call
.
Yes, in general agreed.
One issue is that RV MASM often have lower-case address, where other platforms use Address, i.e. AddressLiteral on x86. I think we should use upper case Address as then we can check for relocation and produce all different code patterns with same function, li/auipc/movptr.
While looking at this it seem like we have some funny things:
src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp: __ rt_call(Runtime1::entry_for(stub_id), ra);
src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp: __ far_call(RuntimeAddress(Runtime1::entry_for(_stub_id)));
Note that 'mv reg, imm' is not a recognized mnemonic (should be 'li reg, imm') AFAIK.
RuntimeAddress target(dest);
Yeah, I know what you mean. As you see, the address
passed to rt_call
is used to construct a RuntimeAddress
in this function which is also an Address
. It seems that it will be more consistent to build a RuntimeAddress
and pass that to rt_call
, which is similar with others like far_call
. So let's do it :-)
(I don't have a strong opinion about mv reg, imm
. I am OK if others prefer something like li reg, imm
)
Passed t1 with RCC 2047M, t1 with default RCC rolling on fine. (we have very noise t1 at the moment)
Any other takers?
Thank you @luhenry.
I did a full t1-t3 (t1 also with 2047) (VF2) I have seen no issues with this, I'll integrate later today.
/integrate
Going to push as commit c3bc23fe48ca1603afe68a6ac4aaa523a1edbb41.
Since your change was applied there have been 34 commits pushed to the master
branch:
- 8a9d77d58de259b6b2bdc2cc9e7bfdc28dcf7165: 8320622: [TEST] Improve coverage of compiler/loopopts/superword/TestMulAddS2I.java on different platforms
- 3d511ff63e59f542ae20c722bfef1c867cd1da0e: 8329748: Change default value of AssertWXAtThreadSync to true
- 67f03f2a4f5ac12748ffbf5c04f248a60869e180: 8332533: RISC-V: Enable vector variable shift instructions for machines with RVV
- 5f804b2ec12627b593353ceeab881187b0bb5cd6: 8329825: Clarify the value type for java.net.SocketOptions.SO_LINGER
- 52eda79522a5bd71b527e5946b654a331b021473: 8332538: Switch off JIT memory limit check for TestAlignVectorFuzzer.java
- d999b81e7110751be402012e1ed41b3256f5895e: 8331572: Allow using OopMapCache outside of STW GC phases
- 8291c94bcdbb01beddc94f290f2749841404cc0c: 8331224: ClassCastException in ObjectInputStream hides ClassNotFoundException
- 42e3c842ae2684265c794868fc76eb0ff2dea3d9: 8332086: Remove the usage of ServiceLoader in j.u.r.RandomGeneratorFactory
- 5cf8288b8071bdcf0c923dd7ba36f91bc7594ef3: 8332153: RISC-V: enable tests and add comment for vector shift instruct (shared by vectorization and Vector API)
- ae9ad862ee54e119553efec919f1061dca36b954: 8331934: [s390x] Add support for primitive array C1 clone intrinsic
- ... and 24 more: https://git.openjdk.org/jdk/compare/9bb6169a1cba900fa79d63119696efe265762083...master
Your commit was automatically rebased without conflicts.
@robehn Pushed as commit c3bc23fe48ca1603afe68a6ac4aaa523a1edbb41.
:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.