rust icon indicating copy to clipboard operation
rust copied to clipboard

x86-32 float return for 'Rust' ABI: treat all float types consistently

Open RalfJung opened this issue 1 year ago • 6 comments

This helps with https://github.com/rust-lang/rust/issues/131819: for our own ABI on x86-32, we want to never use the float registers. The previous logic only considered F32 and F64, but skipped F16 and F128. So I made the logic just apply to all float types.

RalfJung avatar Oct 18 '24 05:10 RalfJung

r? @cjgillot

rustbot has assigned @cjgillot. They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot avatar Oct 18 '24 05:10 rustbot

Cc @tgross35

RalfJung avatar Oct 18 '24 05:10 RalfJung

Huh. I hadn't noticed a bunch of target-specific code has escaped rustc_target.

That will be fixed.

workingjubilee avatar Oct 18 '24 06:10 workingjubilee

@RalfJung just to confirm: This winds up applying to both i586 and i686, yes?

workingjubilee avatar Oct 18 '24 06:10 workingjubilee

This winds up applying to both i586 and i686, yes?

I assume so. I didn't change that part of the logic.

RalfJung avatar Oct 18 '24 07:10 RalfJung

Huh. I hadn't noticed a bunch of target-specific code has escaped rustc_target.

Yeah, rustc_target does the non-Rust-ABI handling, but at some point we had to start making Rust ABI handling target-specific...

RalfJung avatar Oct 18 '24 07:10 RalfJung

👍 for making this change

tgross35 avatar Oct 18 '24 08:10 tgross35

While I do need to be able to read this code to refactor it, I do intend to bulldoze it.

And since this seems fine-ish, assuming CI doesn't find a reason to take issue with it, and I would hate to make RalfJung rebase this a bunch:

@bors r+

workingjubilee avatar Oct 19 '24 00:10 workingjubilee

:pushpin: Commit a9f6fd16c07d78386e726946bfe73c1c2a18102e has been approved by workingjubilee

It is now in the queue for this repository.

bors avatar Oct 19 '24 00:10 bors

@bors r- failed in https://github.com/rust-lang/rust/pull/131924#issuecomment-2423908020 i think

matthiaskrgr avatar Oct 19 '24 14:10 matthiaskrgr

@bors rollup=never

RalfJung avatar Oct 19 '24 14:10 RalfJung

@workingjubilee I had to update the codegen tests, please have a look.

I decided that it's not worth testing the ABI generation for the same return type 20 times, so I made most of these cases match against just the function name, only a few also check the return type to check the ABI. Argument types weren't covered before so I added them as well to ensure they are checked.

RalfJung avatar Oct 19 '24 15:10 RalfJung

:umbrella: The latest upstream changes (presumably #131211) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Oct 19 '24 17:10 bors

Go figure, those tests have hardly been touched since I added them but they just happen to conflict with https://github.com/rust-lang/rust/pull/131211 that landed a couple hours ago :)

Their intent was mostly to test that we lower to the correct llvm calls and not so much ABI, for when I was wiring things up in the compiler without having too much exposed in the library. We probably could use a separate test just for ABI at some point so that check not repeated so many times here, and to cover different signatures and architectures (same for the other float types).

tgross35 avatar Oct 19 '24 19:10 tgross35

note to self, PR review checklist should include:

  • if this PR affects codegen, examine the codegen tests under multiple different optimization levels

workingjubilee avatar Oct 20 '24 02:10 workingjubilee

Argh that's a painful conflict. :(

RalfJung avatar Oct 20 '24 09:10 RalfJung

Ah but most of the diff in f128.rs can actually be removed.

RalfJung avatar Oct 20 '24 09:10 RalfJung

@rustbot ready

RalfJung avatar Oct 20 '24 09:10 RalfJung

@bors r+

workingjubilee avatar Oct 21 '24 20:10 workingjubilee

:pushpin: Commit 57d5f864e3ff87d7e5ba06f6175c9aa50d576848 has been approved by workingjubilee

It is now in the queue for this repository.

bors avatar Oct 21 '24 20:10 bors

:hourglass: Testing commit 57d5f864e3ff87d7e5ba06f6175c9aa50d576848 with merge 8b863d7d722c8655bc8604ed986d15353f3d5428...

bors avatar Oct 22 '24 01:10 bors

The job i686-gnu failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
failures:

---- [assembly] tests/assembly/x86-return-float.rs#normal stdout ----

error in revision `normal`: verification with 'FileCheck' failed
status: exit status: 1
command: "/checkout/obj/build/i686-unknown-linux-gnu/llvm/build/bin/FileCheck" "--input-file" "/checkout/obj/build/i686-unknown-linux-gnu/test/assembly/x86-return-float.normal/x86-return-float.s" "/checkout/tests/assembly/x86-return-float.rs" "--check-prefix=CHECK" "--check-prefix" "normal" "--check-prefix" "NONMSVC" "--allow-unused-prefixes" "--dump-input-context" "100" "--implicit-check-not" "fld" "--implicit-check-not" "fst"
--- stderr -------------------------------
/checkout/tests/assembly/x86-return-float.rs:308:12: error: CHECK: expected string not found in input
/checkout/tests/assembly/x86-return-float.rs:308:12: error: CHECK: expected string not found in input
 // CHECK: pinsrw $0, {{.*}}(%ebp), %xmm0
           ^
/checkout/obj/build/i686-unknown-linux-gnu/test/assembly/x86-return-float.normal/x86-return-float.s:621:12: note: scanning from here
return_f16:
           ^
/checkout/obj/build/i686-unknown-linux-gnu/test/assembly/x86-return-float.normal/x86-return-float.s:628:6: note: possible intended match here
 movzwl 8(%ebp), %eax


Input file: /checkout/obj/build/i686-unknown-linux-gnu/test/assembly/x86-return-float.normal/x86-return-float.s


-dump-input=help explains the following input dump.
Input was:
<<<<<<
             .
             .
             .
             .
           521:  calll get_f64_other@PLT 
           522:  subl $4, %esp 
           523:  movsd -24(%ebp), %xmm0 
           524:  movl -16(%ebp), %eax 
           525:  movsd %xmm0, (%esi) 
           526:  movl %eax, 8(%esi) 
           527:  addl $32, %esp 
           528:  popl %esi 
           529:  popl %ebx 
           530:  popl %ebp 
           531:  .cfi_def_cfa %esp, 4 
           532:  retl 
           533: .Lfunc_end17: 
           534:  .size call_f64_other, .Lfunc_end17-call_f64_other 
           535:  .cfi_endproc 
           536:  
           537:  .section .text.call_other_f32,"ax",@progbits 
           538:  .globl call_other_f32 
           539:  .p2align 4, 0x90 
           540:  .type call_other_f32,@function 
           541: call_other_f32: 
           542:  .cfi_startproc 
           543:  pushl %ebp 
           544:  .cfi_def_cfa_offset 8 
           545:  .cfi_offset %ebp, -8 
           546:  movl %esp, %ebp 
           547:  .cfi_def_cfa_register %ebp 
           548:  pushl %ebx 
           549:  pushl %esi 
           550:  subl $16, %esp 
           551:  .cfi_offset %esi, -16 
           552:  .cfi_offset %ebx, -12 
           553:  movl 8(%ebp), %esi 
           554:  calll .L18$pb 
           555: .L18$pb: 
           556:  popl %ebx 
           557:  leal -16(%ebp), %eax 
           558: .Ltmp8: 
           559:  addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp8-.L18$pb), %ebx 
           560:  movl %eax, (%esp) 
           561:  calll get_other_f32@PLT 
           562:  subl $4, %esp 
           563:  movl -16(%ebp), %eax 
           564:  movss -12(%ebp), %xmm0 
           565:  movl %eax, (%esi) 
           566:  movss %xmm0, 4(%esi) 
           567:  addl $16, %esp 
           568:  popl %esi 
           569:  popl %ebx 
           570:  popl %ebp 
           571:  .cfi_def_cfa %esp, 4 
           572:  retl 
           573: .Lfunc_end18: 
           574:  .size call_other_f32, .Lfunc_end18-call_other_f32 
           575:  .cfi_endproc 
           576:  
           577:  .section .text.call_other_f64,"ax",@progbits 
           578:  .globl call_other_f64 
           579:  .p2align 4, 0x90 
           580:  .type call_other_f64,@function 
           581: call_other_f64: 
           582:  .cfi_startproc 
           583:  pushl %ebp 
           584:  .cfi_def_cfa_offset 8 
           585:  .cfi_offset %ebp, -8 
           586:  movl %esp, %ebp 
           587:  .cfi_def_cfa_register %ebp 
           588:  pushl %ebx 
           589:  pushl %esi 
           590:  subl $16, %esp 
           591:  .cfi_offset %esi, -16 
           592:  .cfi_offset %ebx, -12 
           593:  movl 8(%ebp), %esi 
           594:  calll .L19$pb 
           595: .L19$pb: 
           596:  popl %ebx 
           597:  leal -20(%ebp), %eax 
           598: .Ltmp9: 
           599:  addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp9-.L19$pb), %ebx 
           600:  movl %eax, (%esp) 
           601:  calll get_other_f64@PLT 
           602:  subl $4, %esp 
           603:  movl -20(%ebp), %eax 
           604:  movsd -16(%ebp), %xmm0 
           605:  movl %eax, (%esi) 
           606:  movsd %xmm0, 4(%esi) 
           607:  addl $16, %esp 
           608:  popl %esi 
           609:  popl %ebx 
           610:  popl %ebp 
           611:  .cfi_def_cfa %esp, 4 
           612:  retl 
           613: .Lfunc_end19: 
           614:  .size call_other_f64, .Lfunc_end19-call_other_f64 
           615:  .cfi_endproc 
           616:  
           617:  .section .text.return_f16,"ax",@progbits 
           618:  .globl return_f16 
           619:  .p2align 4, 0x90 
           620:  .type return_f16,@function 
           621: return_f16: 
check:308'0                X error: no match found
           622:  .cfi_startproc 
check:308'0     ~~~~~~~~~~~~~~~~
           623:  pushl %ebp 
check:308'0     ~~~~~~~~~~~~
           624:  .cfi_def_cfa_offset 8 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~~
           625:  .cfi_offset %ebp, -8 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~
           626:  movl %esp, %ebp 
check:308'0     ~~~~~~~~~~~~~~~~~
           627:  .cfi_def_cfa_register %ebp 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           628:  movzwl 8(%ebp), %eax 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~
check:308'1          ?                 possible intended match
           629:  popl %ebp 
check:308'0     ~~~~~~~~~~~
           630:  .cfi_def_cfa %esp, 4 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~
           631:  retl 
check:308'0     ~~~~~~
           632: .Lfunc_end20: 
check:308'0     ~~~~~~~~~~~~~~
           633:  .size return_f16, .Lfunc_end20-return_f16 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           634:  .cfi_endproc 
check:308'0     ~~~~~~~~~~~~~~
check:308'0     ~
check:308'0     ~
           636:  .section .text.return_f128,"ax",@progbits 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           637:  .globl return_f128 
check:308'0     ~~~~~~~~~~~~~~~~~~~~
           638:  .p2align 4, 0x90 
check:308'0     ~~~~~~~~~~~~~~~~~~
           639:  .type return_f128,@function 
check:308'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           640: return_f128: 
check:308'0     ~~~~~~~~~~~~
           641:  .cfi_startproc 
           642:  pushl %ebp 
           643:  .cfi_def_cfa_offset 8 
           644:  .cfi_offset %ebp, -8 
           645:  movl %esp, %ebp 
           646:  .cfi_def_cfa_register %ebp 
           647:  pushl %edi 
           648:  pushl %esi 
           649:  .cfi_offset %esi, -16 
           650:  .cfi_offset %edi, -12 
           651:  movl 8(%ebp), %eax 
           652:  movl 12(%ebp), %ecx 
           653:  movl 16(%ebp), %edx 
           654:  movl 20(%ebp), %esi 
           655:  movl 24(%ebp), %edi 
           656:  movl %edi, 12(%eax) 
           657:  movl %esi, 8(%eax) 
           658:  movl %edx, 4(%eax) 
           659:  movl %ecx, (%eax) 
           660:  popl %esi 
           661:  popl %edi 
           662:  popl %ebp 
           663:  .cfi_def_cfa %esp, 4 
           664:  retl $4 
           665: .Lfunc_end21: 
           666:  .size return_f128, .Lfunc_end21-return_f128 
           667:  .cfi_endproc 
           668:  
           669:  .ident "rustc version 1.84.0-nightly (8b863d7d7 2024-10-22)" 
           670:  .section ".note.GNU-stack","",@progbits 
------------------------------------------



rust-log-analyzer avatar Oct 22 '24 02:10 rust-log-analyzer

:broken_heart: Test failed - checks-actions

bors avatar Oct 22 '24 02:10 bors

profoundly cursed.

workingjubilee avatar Oct 22 '24 03:10 workingjubilee

That's about these tests:


// The "C" ABI for `f16` and `f128` on x86 has never used the x87 floating point stack. Do some
// basic checks to ensure this remains the case for the "Rust" ABI.

// CHECK-LABEL: return_f16:
#[no_mangle]
pub fn return_f16(x: f16) -> f16 {
    // CHECK: pinsrw $0, {{.*}}(%ebp), %xmm0
    // CHECK-NOT: xmm0
    // CHECK: retl
    x
}

// CHECK-LABEL: return_f128:
#[no_mangle]
pub fn return_f128(x: f128) -> f128 {
    // CHECK: movl [[#%d,OFFSET:]](%ebp), %[[PTR:.*]]
    // CHECK-NEXT: movl [[#%d,OFFSET+4]](%ebp), %[[VAL1:.*]]
    // CHECK-NEXT: movl [[#%d,OFFSET+8]](%ebp), %[[VAL2:.*]]
    // CHECK-NEXT: movl [[#%d,OFFSET+12]](%ebp), %[[VAL3:.*]]
    // CHECK-NEXT: movl [[#%d,OFFSET+16]](%ebp), %[[VAL4:.*]]
    // CHECK-NEXT: movl %[[VAL4:.*]] 12(%[[PTR]])
    // CHECK-NEXT: movl %[[VAL3:.*]] 8(%[[PTR]])
    // CHECK-NEXT: movl %[[VAL2:.*]] 4(%[[PTR]])
    // CHECK-NEXT: movl %[[VAL1:.*]] (%[[PTR]])
    // CHECK: retl
    x
}

We don't actually want the Rust ABI to use the floating point stack here either. So... should the test just be removed? If not, I'd appreciate advice on how to update them, as I can't read the assembly. Cc @beetrees

RalfJung avatar Oct 22 '24 07:10 RalfJung

pinsrw

So, that is saying "insert a 'word' (really, 2 bytes) into a SIMD vector register", which is relevant for x86 if we use the vector registers...

The new instructions are this:

           pushl %ebp 
           movl %esp, %ebp 
           movzwl 8(%ebp), %eax 
           popl %ebp 
           retl

...AT&T syntax...

...anyways, it is:

  • push 32-bit value in ebp to stack
  • move 32-bit value from esp to ebp
  • take the address you get from offsetting ebp by 8 and then from that, move a 16-bit value into eax, zero extending it to 32 bits
  • pop 32-bit value from stack to ebp
  • return

...isn't that a lot of work...? shouldn't it be just something like

           movzwl %esp, %eax 
           retl

maybe with an offset...? is it because of the frame pointers?

anyway it should be fine to set the test to be

// CHECK: pushl [[PUSHED:%{.*}]]
// CHECK: movl %esp, [[PUSHED]]
// CHECK: movzwl 8([[PUSHED]]), %eax
// CHECK: popl [[PUSHED]]
// CHECK-NEXT: retl

workingjubilee avatar Oct 22 '24 08:10 workingjubilee

That doesn't quite work, I had to remove the NEXT (and fix a typo in the pushl line, the curly braces there are wrong). Otherwise I get:

/home/r/src/rust/rustc.2/tests/assembly/x86-return-float.rs:312:17: error: CHECK-NEXT: is not on the line after the previous match
 // CHECK-NEXT: retl
                ^
/home/r/src/rust/rustc.2/build/x86_64-unknown-linux-gnu/test/assembly/x86-return-float.normal/x86-return-float.s:631:2: note: 'next' match was here
 retl
 ^
/home/r/src/rust/rustc.2/build/x86_64-unknown-linux-gnu/test/assembly/x86-return-float.normal/x86-return-float.s:629:11: note: previous match ended here
 popl %ebp
          ^
/home/r/src/rust/rustc.2/build/x86_64-unknown-linux-gnu/test/assembly/x86-return-float.normal/x86-return-float.s:630:1: note: non-matching line after previous match is here
 .cfi_def_cfa %esp, 4
^

RalfJung avatar Oct 22 '24 08:10 RalfJung

The relevant part of the asm looks like this:

        629:  popl %ebp 
        630:  .cfi_def_cfa %esp, 4 
        631:  retl 
next:312      !~~~  error: match on wrong line
        632: .Lfunc_end20: 

So I guess this cfi_def_cfa thing is in the way.

RalfJung avatar Oct 22 '24 08:10 RalfJung

Ah, apologies, I hadn't actually CHECK-NEXTed it.

workingjubilee avatar Oct 22 '24 08:10 workingjubilee

Yes.

Those are non-effectual directives... they may affect unwinding, or debugging, but they sure don't affect what we care about, at least... but right now I can't remember how to coax rustc or FileCheck into skipping those.

workingjubilee avatar Oct 22 '24 08:10 workingjubilee