jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8326306: RISC-V: Re-structure MASM calls and jumps

Open robehn opened this issue 10 months ago • 12 comments

Hi, please consider.

We have code that directly use the asm for call/jumps instead masm. Our masm have a bit odd naming, and we don't use 'proper' pseudoinstructions/mnemonics. Suggested by riscv-asm-manual

j offset	jal x0, offset	Jump
jal offset	jal x1, offset	Jump and link
jr rs	        jalr x0, rs, 0	Jump register
jalr rs	        jalr x1, rs, 0	Jump and link register
ret	        jalr x0, x1, 0	Return from subroutine
call offset	auipc x1, offset[31:12]; jalr x1, x1, offset[11:0]	Call far-away subroutine	
tail offset	auipc x6, offset[31:12]; jalr x0, x6, offset[11:0]	Tail call far-away subroutine

But these can only be implemented like this if you have small enough application. The fallback of these is to use GOT (your C compiler should place a copy of GOT every 2G so it's always reachable). We don't have GOT, instead we materialize, so there is still differences between these and ours.

This patch:

  • Tries to follow these suggested mappings as good we can.
  • Make sure all jumps/calls go through MASM. (so we get control and can easily change for sites using a certain calling convention)
  • To avoid confusion between MASM public/private methods and ASM methods and the mnemonics there are some renaming. E.g. the mnemonics jal means call offset, as we can't use that so there is no 'jal'.
  • I enabled c.j, but right now we never generate it.
  • As always the macro does no good and are legacy from when code base did not use templates. (also the x-macros screws up my IDE (vim+rtags))

I started down this path due to I have followup patch on top of this which removes trampoline in favor for load-n-jump. (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1) While looking into our calls it was a bit confusing, this helps.

Done a couple of t1-3 slightly different version of this patch, and as part of the followup, no issues found. (VF2, qemu, LP4) Re-running tests, had some last minute changes.

Thanks, Robbin


Progress

  • [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8326306: RISC-V: Re-structure MASM calls and jumps (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18942/head:pull/18942
$ git checkout pull/18942

Update a local copy of the PR:
$ git checkout pull/18942
$ git pull https://git.openjdk.org/jdk.git pull/18942/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 18942

View PR using the GUI difftool:
$ git pr show -t 18942

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18942.diff

Webrev

Link to Webrev Comment

robehn avatar Apr 25 '24 07:04 robehn

:wave: Welcome back rehn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Apr 25 '24 07:04 bridgekeeper[bot]

@robehn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8326306: RISC-V: Re-structure MASM calls and jumps

Reviewed-by: fyang, luhenry

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 33 new commits pushed to the master branch:

  • 3d511ff63e59f542ae20c722bfef1c867cd1da0e: 8329748: Change default value of AssertWXAtThreadSync to true
  • 67f03f2a4f5ac12748ffbf5c04f248a60869e180: 8332533: RISC-V: Enable vector variable shift instructions for machines with RVV
  • 5f804b2ec12627b593353ceeab881187b0bb5cd6: 8329825: Clarify the value type for java.net.SocketOptions.SO_LINGER
  • 52eda79522a5bd71b527e5946b654a331b021473: 8332538: Switch off JIT memory limit check for TestAlignVectorFuzzer.java
  • d999b81e7110751be402012e1ed41b3256f5895e: 8331572: Allow using OopMapCache outside of STW GC phases
  • 8291c94bcdbb01beddc94f290f2749841404cc0c: 8331224: ClassCastException in ObjectInputStream hides ClassNotFoundException
  • 42e3c842ae2684265c794868fc76eb0ff2dea3d9: 8332086: Remove the usage of ServiceLoader in j.u.r.RandomGeneratorFactory
  • 5cf8288b8071bdcf0c923dd7ba36f91bc7594ef3: 8332153: RISC-V: enable tests and add comment for vector shift instruct (shared by vectorization and Vector API)
  • ae9ad862ee54e119553efec919f1061dca36b954: 8331934: [s390x] Add support for primitive array C1 clone intrinsic
  • 3479b46c5bea3afd92b6ab4acd2fe7f274df38aa: 8332595: Serial: Remove unused TenuredGeneration::should_collect
  • ... and 23 more: https://git.openjdk.org/jdk/compare/9bb6169a1cba900fa79d63119696efe265762083...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Apr 25 '24 07:04 openjdk[bot]

@robehn The following labels will be automatically applied to this pull request:

  • hotspot
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Apr 25 '24 07:04 openjdk[bot]

/label remove shenandoah

robehn avatar Apr 25 '24 07:04 robehn

@robehn The shenandoah label was successfully removed.

openjdk[bot] avatar Apr 25 '24 07:04 openjdk[bot]

My merge with master was hit by: https://bugs.openjdk.org/browse/JDK-8331546 Re-merge when it's fixed.

robehn avatar May 02 '24 08:05 robehn

Updated change looks great! Thank you.

Thanks for sticking with it, and thanks for a good review!

robehn avatar May 13 '24 11:05 robehn

So now MacroAssembler::rt_call is effectively this after this PR change:

void MacroAssembler::rt_call(address dest, Register tmp) {
  RuntimeAddress target(dest);
  relocate(target.rspec(), [&] {
    int32_t offset;
    la(tmp, target.target(), offset);
    jalr(tmp, offset);
  });
}

Here are some of my thoughts on the next step/cleanup for reference:

  1. For MacroAssembler::far_call and MacroAssembler::far_jump, I would suggest we use direct auipc instead of la for them as the destination is expected to be in code cache. This will help distinguish these two functions from MacroAssembler::rt_call.

  2. Since there are only a few uses of MacroAssembler::call, we might want to remove this function replacing its callsites with MacroAssembler::rt_call. This would help avoid possible confusion between call and rt_call. I don't think the relocation added by rt_call would make a difference. (Our aarch64 counterpart lea + blr always emits relocation for address literal: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/assembler_aarch64.cpp#L141)

  3. As we discussed in this PR, we also want to fix Shenandoah related callsites to call_VM_leaf. And let call_VM_leaf to use rt_call so that we emit auipc when possible, which should be better in performance.

  4. We might want to turn some other places in the code where we do mv + jalr into a rt_call.

RealFYang avatar May 14 '24 02:05 RealFYang

Yes, in general agreed.

One issue is that RV MASM often have lower-case address, where other platforms use Address, i.e. AddressLiteral on x86. I think we should use upper case Address as then we can check for relocation and produce all different code patterns with same function, li/auipc/movptr.

While looking at this it seem like we have some funny things:

src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp:  __ rt_call(Runtime1::entry_for(stub_id), ra);
src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp:  __ far_call(RuntimeAddress(Runtime1::entry_for(_stub_id)));

Note that 'mv reg, imm' is not a recognized mnemonic (should be 'li reg, imm') AFAIK.

robehn avatar May 14 '24 06:05 robehn

RuntimeAddress target(dest);

Yeah, I know what you mean. As you see, the address passed to rt_call is used to construct a RuntimeAddress in this function which is also an Address. It seems that it will be more consistent to build a RuntimeAddress and pass that to rt_call, which is similar with others like far_call. So let's do it :-)

(I don't have a strong opinion about mv reg, imm. I am OK if others prefer something like li reg, imm)

RealFYang avatar May 14 '24 07:05 RealFYang

Passed t1 with RCC 2047M, t1 with default RCC rolling on fine. (we have very noise t1 at the moment)

Any other takers?

robehn avatar May 15 '24 06:05 robehn

Thank you @luhenry.

I did a full t1-t3 (t1 also with 2047) (VF2) I have seen no issues with this, I'll integrate later today.

robehn avatar May 22 '24 10:05 robehn

/integrate

robehn avatar May 22 '24 11:05 robehn

Going to push as commit c3bc23fe48ca1603afe68a6ac4aaa523a1edbb41. Since your change was applied there have been 34 commits pushed to the master branch:

  • 8a9d77d58de259b6b2bdc2cc9e7bfdc28dcf7165: 8320622: [TEST] Improve coverage of compiler/loopopts/superword/TestMulAddS2I.java on different platforms
  • 3d511ff63e59f542ae20c722bfef1c867cd1da0e: 8329748: Change default value of AssertWXAtThreadSync to true
  • 67f03f2a4f5ac12748ffbf5c04f248a60869e180: 8332533: RISC-V: Enable vector variable shift instructions for machines with RVV
  • 5f804b2ec12627b593353ceeab881187b0bb5cd6: 8329825: Clarify the value type for java.net.SocketOptions.SO_LINGER
  • 52eda79522a5bd71b527e5946b654a331b021473: 8332538: Switch off JIT memory limit check for TestAlignVectorFuzzer.java
  • d999b81e7110751be402012e1ed41b3256f5895e: 8331572: Allow using OopMapCache outside of STW GC phases
  • 8291c94bcdbb01beddc94f290f2749841404cc0c: 8331224: ClassCastException in ObjectInputStream hides ClassNotFoundException
  • 42e3c842ae2684265c794868fc76eb0ff2dea3d9: 8332086: Remove the usage of ServiceLoader in j.u.r.RandomGeneratorFactory
  • 5cf8288b8071bdcf0c923dd7ba36f91bc7594ef3: 8332153: RISC-V: enable tests and add comment for vector shift instruct (shared by vectorization and Vector API)
  • ae9ad862ee54e119553efec919f1061dca36b954: 8331934: [s390x] Add support for primitive array C1 clone intrinsic
  • ... and 24 more: https://git.openjdk.org/jdk/compare/9bb6169a1cba900fa79d63119696efe265762083...master

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar May 22 '24 11:05 openjdk[bot]

@robehn Pushed as commit c3bc23fe48ca1603afe68a6ac4aaa523a1edbb41.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar May 22 '24 11:05 openjdk[bot]