Some IR tests still fail on MacOS/AArch64

Hi @dstogov,

I noticed that you fixed the ir_add_veneer() assertion failure in https://github.com/dstogov/ir/commit/1671b3de78742b85611f0e84f805141608730dfc. After your fix there are no more assertion failures, but the IR tests below still fail on MacOS/AArch64.

001: Argument Passing [./tests/debug.aarch64/args_001.irt]
002: Argument Passing [./tests/debug.aarch64/args_002.irt]
Simple CALL -O0 [./tests/debug.aarch64/call-O0.irt]
Simple CALL [./tests/debug.aarch64/call.irt]
CALL with parallel argument passing [./tests/debug.aarch64/call2.irt]
Simple CALL with ALLOCA [./tests/debug.aarch64/call_alloca.irt]
Simple CALL with VADDR [./tests/debug.aarch64/call_vaddr.irt]
003: Parameter Loading and argument passing [./tests/debug.aarch64/params_003.irt]
Fib [./tests/debug.aarch64/regset-fib.irt]
Fib2 [./tests/debug.aarch64/regset-fib2.irt]
FibI [./tests/debug.aarch64/regset-fibi.irt]
Simple CALL [./tests/debug.aarch64/tailcall_001.irt]
FibI [./tests/fibi_min.irt]
CTLZ 001: [./tests/run/ctlz_001.irt]
CTPOP 001: [./tests/run/ctpop_001.irt]
CTTZ 001: [./tests/run/cttz_001.irt]
Floating Point number comparison (001: CMP edge cases) [./tests/run/fcmp_001.irt]
Floating Point number comparison (002: CMP+COND edge cases) [./tests/run/fcmp_002.irt]
VA_ARG 001: va_arg(int) [./tests/run/vaarg_001.irt]
VA_ARG 002: va_arg(double) [./tests/run/vaarg_002.irt]
VA_ARG 003: va_arg(float) [./tests/run/vaarg_003.irt]
VA_ARG 004: va_arg() expanded on AArch64 [./tests/run/vaarg_004_aarch64.irt]

The failures are caused by differences between the .exp and .out files. For example, for ./tests/debug.aarch64/call.irt:

call.out

main:
	stp x29, x30, [sp, #-0x10]!
	mov x29, sp
	adr x0, .L1
	movz w1, #0x2a
	movz x17, #0x9f54
	movk x17, #0x8d06, lsl #16
	movk x17, #0x1, lsl #32
	blr x17
	ldp x29, x30, [sp], #0x10
	ret
.rodata
.L1:
	.db 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x25, 0x64, 0x21, 0x0a, 0x00, 0x00

hello 1799882064!

exit code = 18

call.exp

main:
	stp x29, x30, [sp, #-0x10]!
	mov x29, sp
	adr x0, .L1
	movz w1, #0x2a
	bl printf
	ldp x29, x30, [sp], #0x10
	ret
.rodata
	.db 0x1f, 0x20, 0x03, 0xd5
.L1:
	.db 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x25, 0x64, 0x21, 0x0a, 0x00, 0x00

hello 42!

exit code = 10

It looks like the issues are still related to calling convention differences between Linux/AArch64 and MacOS/AArch64. Perhaps you are still working on a complete fix, but I'd like to file this issue here to track it.

Nov 27 '23 08:11 pfustc

Thanks for the report. The vararg passing on MacOS/AArch64 is not fixed yet.

However, this failure is not related to the calling convention and it is not a serious problem. The difference is the use of blr instead of bl, which is caused by the larger distance between the JIT code buffer and printf.

Probably we may always generate bl and rely on an additional veneer when necessary. I'll think about this.
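
To illustrate the bl/blr distinction, here is a minimal sketch in C, not taken from the IR sources. It assumes only the architectural fact that AArch64 bl encodes a signed 26-bit word offset (about +/-128 MiB). The helper names bl_in_range and compose_movz_movk are hypothetical. The second helper reassembles the 64-bit address that the movz/movk sequence in call.out loads into x17.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AArch64 B/BL encode a signed 26-bit word offset, so the reachable
 * byte offset is [-2^27, 2^27 - 4], i.e. roughly +/-128 MiB. */
#define BL_MIN_OFFSET (-(INT64_C(1) << 27))
#define BL_MAX_OFFSET ((INT64_C(1) << 27) - 4)

/* Hypothetical helper (not part of the IR API): returns true when a
 * direct BL located at `from` can reach the code at `to`. */
static bool bl_in_range(uint64_t from, uint64_t to)
{
    /* A64 instructions are 4-byte aligned. */
    assert((from & 3) == 0 && (to & 3) == 0);
    int64_t offset = (int64_t)(to - from);
    return offset >= BL_MIN_OFFSET && offset <= BL_MAX_OFFSET;
}

/* Hypothetical helper: combines the four 16-bit immediates of a
 * MOVZ + MOVK (lsl #16, #32, #48) sequence into the final value. */
static uint64_t compose_movz_movk(uint16_t h0, uint16_t h1,
                                  uint16_t h2, uint16_t h3)
{
    return (uint64_t)h0 | ((uint64_t)h1 << 16) |
           ((uint64_t)h2 << 32) | ((uint64_t)h3 << 48);
}
```

With the immediates from call.out, compose_movz_movk(0x9f54, 0x8d06, 0x1, 0) yields 0x18d069f54. When such a target lies outside the bl range of the JIT buffer, the generator has to fall back to an indirect blr (or to bl plus a veneer).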

Nov 27 '23 21:11 dstogov

I was wrong. This test uses an expectation that is incorrect on MacOS. The third argument must be passed on the stack, so the result of the execution is incorrect. This is directly related to vararg passing.
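
As a small cross-check of the two listings, the following C sketch assumes that the test's main returns the value returned by printf, which is what the bl printf / ret sequence suggests. On Apple's AArch64 ABI variadic arguments are read from the stack, so a value left only in w1 is ignored and printf formats whatever happens to be on the stack. The helper name hello_length is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Same format string as the .L1 bytes in the listings: "hello %d!\n". */
static const char HELLO_FMT[] = "hello %d!\n";

/* Hypothetical helper: returns the number of characters printf would
 * write for the given value, which is also printf's return value. */
static int hello_length(int value)
{
    /* With a NULL buffer and size 0, snprintf only reports the length. */
    int n = snprintf(NULL, 0, HELLO_FMT, value);
    assert(n >= 0);
    return n;
}
```

hello_length(42) is 10 and hello_length(1799882064) is 18, which matches the two exit codes shown above: the exit code differs only because a different number was printed.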

Nov 27 '23 21:11 dstogov

@pfustc could you please re-run the tests on MacOS/AArch64 once again? I hope I fixed vararg passing via https://github.com/dstogov/ir/commit/c5bf8003c0f3aeaeef0700e17c44e137627758c8. Tests that verify assembler code are still going to fail, but the output generated by the JIT code must be the same (e.g. hello 42!).

Nov 30 '23 20:11 dstogov

Thanks for your quick fix. We now have 12 failures, 11 of which are just caused by the diff in assembly code.

-------------------------------
FAILED TESTS
-------------------------------
001: Argument Passing [./tests/debug.aarch64/args_001.irt]
002: Argument Passing [./tests/debug.aarch64/args_002.irt]
Simple CALL -O0 [./tests/debug.aarch64/call-O0.irt]
Simple CALL [./tests/debug.aarch64/call.irt]
CALL with parallel argument passing [./tests/debug.aarch64/call2.irt]
Simple CALL with ALLOCA [./tests/debug.aarch64/call_alloca.irt]
Simple CALL with VADDR [./tests/debug.aarch64/call_vaddr.irt]
003: Parameter Loading and argument passing [./tests/debug.aarch64/params_003.irt]
Fib [./tests/debug.aarch64/regset-fib.irt]
Fib2 [./tests/debug.aarch64/regset-fib2.irt]
FibI [./tests/debug.aarch64/regset-fibi.irt]
Simple CALL [./tests/debug.aarch64/tailcall_001.irt]
-------------------------------

But the last one, ./tests/debug.aarch64/tailcall_001.irt, still has an output difference.

$ diff tailcall_001.out tailcall_001.exp
2d1
< 	sub sp, sp, #0x10
4,16c3,4
< 	movz w16, #0x2a
< 	str x16, [sp]
< 	sub sp, sp, #0x10
< 	adr x0, .L1
< 	movz w16, #0x2a
< 	str x16, [sp]
< 	movz x17, #0x9f54
< 	movk x17, #0x8857, lsl #16
< 	movk x17, #0x1, lsl #32
< 	blr x17
< 	add sp, sp, #0x10
< 	add sp, sp, #0x10
< 	ret
---
> 	movz w1, #0x2a
> 	b printf
21a10,12
> hello 42!
>
> exit code = 10

There is no hello 42! printed in tailcall_001.out.

Dec 01 '23 03:12 pfustc

Great! Thanks!

Regarding ./tests/debug.aarch64/tailcall_001.irt: we probably can't use TAILCALL because printf() accepts stack parameters. Actually, the generated code tries to convert the TAILCALL to a regular CALL, but it seems to be done improperly:

the link register is not saved/restored
42 is passed on the stack twice
add sp, sp, #0x10 is done twice
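
The intended behaviour can be sketched in C as follows. This is a conservative illustration under stated assumptions, not the code from the IR backend: the names call_kind, choose_call_kind and aligned_stack_args are hypothetical, and the rule "no tail call when the callee needs outgoing stack arguments" is the simplification suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of how a call site is emitted. */
typedef enum { EMIT_TAILCALL, EMIT_CALL } call_kind;

/* AArch64 requires sp to stay 16-byte aligned, so the outgoing
 * argument area is rounded up to a multiple of 16 bytes. */
static size_t aligned_stack_args(size_t bytes)
{
    assert(bytes <= SIZE_MAX - 15); /* guard against wrap-around */
    return (bytes + 15) & ~(size_t)15;
}

/* Conservative rule: a tail call (plain `b target`) reuses the caller's
 * frame, so it is only chosen when nothing is passed on the stack.
 * Otherwise a regular call is emitted: save the link register, reserve
 * the argument area once, store each argument once, call, then release
 * the area once and restore the link register. */
static call_kind choose_call_kind(bool tail_position, size_t stack_arg_bytes)
{
    if (tail_position && stack_arg_bytes == 0) {
        return EMIT_TAILCALL;
    }
    return EMIT_CALL;
}
```

For the single 8-byte variadic argument in this test, aligned_stack_args(8) is 16, which corresponds to one sub sp, sp, #0x10 and one matching add sp, sp, #0x10.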

The other test failures most probably are false positives and may be avoided by separating target apple-aarch64. The other problem is bl/blr mismatch caused by different code layout.

Dec 01 '23 06:12 dstogov

The output for ./tests/debug.aarch64/tailcall_001.irt should be fixed by https://github.com/dstogov/ir/commit/81fab9a2cbaaff0fdef7c4c93f5ccf4154278a43. I can't test this myself.

Dec 04 '23 08:12 dstogov

I just checked that there are no other failures except those caused by the assembly difference. Thanks for the fix.

> The other test failures most probably are false positives and may be avoided by separating target apple-aarch64. The other problem is bl/blr mismatch caused by different code layout.

Is the assembly difference ultimately caused by different behavior of DynASM on Linux and MacOS? I'd like to know if it can be avoided by replacing the current IR backend with something else.

Dec 04 '23 09:12 pfustc

> Is the assembly difference ultimately caused by different behavior of DynASM on Linux and MacOS? I'd like to know if it can be avoided by replacing the current IR backend with something else.

There are two reasons:

the difference in calling conventions (e.g. vararg passing), where different code is generated on purpose. We can't avoid this (for a similar reason we distinguish between the x86_64 and Windows-x86_64 targets). Probably, we will have to introduce apple-aarch64.
the difference in jump distance, which determines the use of the bl or blr instruction (this might be avoided by using reallocations or veneers).
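
As an illustration of the first point, a separate test target could be selected roughly like this. The sketch is in C and is only an assumption about how such a split might look: test_target, select_target, target_name and host_target are hypothetical names, apple-aarch64 is the target proposed above and does not exist yet, and only the predefined compiler macros (__aarch64__, _M_ARM64, __APPLE__, _WIN32) are real.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list of test targets; apple-aarch64 is the proposed one. */
typedef enum {
    TARGET_X86_64,
    TARGET_WINDOWS_X86_64,
    TARGET_AARCH64,
    TARGET_APPLE_AARCH64,
    TARGET_COUNT
} test_target;

static const char *const target_names[TARGET_COUNT] = {
    "x86_64", "Windows-x86_64", "aarch64", "apple-aarch64"
};

/* Maps a target to the name used to pick a set of expectations. */
static const char *target_name(test_target t)
{
    assert((unsigned)t < TARGET_COUNT);
    return target_names[t];
}

/* Pure selection logic, kept separate so it can be checked on any host.
 * This sketch only distinguishes the four targets listed above. */
static test_target select_target(bool is_aarch64, bool is_apple, bool is_windows)
{
    if (is_aarch64) {
        return is_apple ? TARGET_APPLE_AARCH64 : TARGET_AARCH64;
    }
    return is_windows ? TARGET_WINDOWS_X86_64 : TARGET_X86_64;
}

/* Detects the host from predefined compiler macros. */
static test_target host_target(void)
{
#if defined(__aarch64__) || defined(_M_ARM64)
    const bool is_aarch64 = true;
#else
    const bool is_aarch64 = false;
#endif
#if defined(__APPLE__)
    const bool is_apple = true;
#else
    const bool is_apple = false;
#endif
#if defined(_WIN32)
    const bool is_windows = true;
#else
    const bool is_windows = false;
#endif
    return select_target(is_aarch64, is_apple, is_windows);
}
```

With expectations keyed by target_name(host_target()), the tests that compare assembler output would no longer report false positives on MacOS/AArch64. The bl/blr mismatch from the second point would still need relocations or veneers.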

Dec 04 '23 09:12 dstogov

Oh yes, I understand that. Thanks for your explanation.

Dec 04 '23 09:12 pfustc
