ARCEB: linux kernel build error: operand out of range (-132 is not between -128 and 127)
I've reproduced the Linux kernel build issue with the arceb compiler, described here:
http://lists.infradead.org/pipermail/linux-snps-arc/2023-August/007525.html
To reproduce this issue, I used gcc 12.2 from the arc-2023.03 toolchain release (Linux/glibc ARC HS Big Endian).
With a certain set of options, I observe that the compiler emits a short branch instruction (bne_s) with an out-of-range relative offset, and the assembler then prints the following error:
refscale.s: Assembler messages:
refscale.s:1917: Error: operand out of range (-132 is not between -128 and 127)
The refscale.zip archive contains the pre-processed source (refscale.i). The issue can be reproduced with this file using the following command:
arceb-linux-gcc -save-temps -mlock -mno-ll64 -mmedium-calls -mcpu=hs38 -fno-inline-functions-called-once -fconserve-stack -Os -mlong-calls -c ./refscale.i -o refscale.o
The little-endian compiler with -mbig-endian also shows this issue.
A reduced test case would be:
int m(void);
int n(long long, int);
void p(int, int, int, int);
void o(void);
void q(const char *);
void r(char *);
static int a, b = 30, c, e, g;
int d = (int) &c;
long long *h;
long long k();
int l() {
  long long j;
  int f;
  char i;
  f = 0;
  for (; f < b && m(); f++) {
    g = 0;
    for (; a;)
      for (;;)
        ;
    j = k();
    h[f] = n(1000 * j, a * 10000);
  }
  f = 0;
  for (; f < b; f++)
    o();
  if (c)
    p(e, 0, 1, 0);
  q("");
  r(&i);
}
s()
{
  a = b = 0;
}
and build with:
$ arc-elf32-gcc -mbig-endian \
-mno-ll64 \
-mcpu=hs38 \
-fno-inline-functions-called-once \
-fconserve-stack \
-Os \
-mlong-calls \
-c \
./test.c \
-o /dev/null
The relevant assembly output:
.L8:
mov_s r13,0
.align 2
.L6:
ld_s r0,[r14]
brgt r0, r13, @.L9
mov_s r0,@.LANCHOR0
ld_s r0,[r0,4]
breq_s r0, 0, @.L10
mov_s r3,0
mov_s r2,1
mov_s r1,0
mov_s r0,0
jl @p
.align 2
.L10:
mov_s r0,@.LC0
jl @q
add r0,sp,3
jl @r
add_s sp,sp,4
leave_s {r13-r14, blink, pcl}
.align 2
.L3:
jl @k
mpy r3,r0,1000
ld_s r2,[r14]
mpydu r0,r1,1000
add_s r0,r3,r0
mpy r2,r2,10000
jl @n
ld_s r1,[gp,@h@sda]
add3_s r1,r1,r13
add_s r13,r13,1
st_s r0,[r1,4]
asr_s r0,r0,31
st_s r0,[r1]
b_s @.L2
.align 2
.L5:
jl @m
breq_s r0, 0, @.L8 <----- The problematic range
The relevant diff between the little-endian and big-endian assembly output:
.-----------------------.-----------------------.
| little endian | big endian |
|-----------------------+-----------------------|
| mpy r3,r1,1000 | mpy r3,r0,1000 |
| ld_s r2,[r14] | ld_s r2,[r14] |
| mpydu r0,r0,1000 | mpydu r0,r1,1000 |
| add_s r1,r3,r1 | add_s r0,r3,r0 |
| mpy r2,r2,10000 | mpy r2,r2,10000 |
| jl @n | jl @n |
`-----------------------^-----------------------'
The problem is the mpydu instruction. In little-endian form, it's mpydu b,b,s12, which is 4 bytes long. However, in big-endian output, it's mpydu a,b,limm, which is 8 bytes long. Fixing the length costs in GCC should fix this issue.
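To make the arithmetic concrete, here is a minimal C sketch of how a 4-byte length underestimate can push a backward short branch out of range. The range is the one quoted in the assembler error. The estimated offset of -128 is a hypothetical value chosen for illustration (the offsets GCC actually computed are not shown in this thread), and the helper names are made up for this example.

```c
#include <stdbool.h>

/* Range quoted in the assembler error for the short-branch operand. */
enum { SHORT_BRANCH_MIN = -128, SHORT_BRANCH_MAX = 127 };

/* True if the offset can be encoded in the short branch. */
static bool fits_short_branch(int offset)
{
    return offset >= SHORT_BRANCH_MIN && offset <= SHORT_BRANCH_MAX;
}

/*
 * Offset of a backward branch once one instruction between the target
 * and the branch turns out to be longer than the compiler assumed.
 * A backward offset is negative, so extra bytes make it more negative.
 */
static int actual_backward_offset(int estimated_offset,
                                  int assumed_len, int real_len)
{
    return estimated_offset - (real_len - assumed_len);
}
```

With a hypothetical estimate of -128, `actual_backward_offset(-128, 4, 8)` gives -132, which `fits_short_branch` rejects, matching the value in the error message. This ignores any PC alignment details of the real encoding.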
Thanks @claziss for pointing me in the right direction.
gcc indeed considers the mpydu r0, r1, ... instruction to be 4 bytes long:
$ arc-elf32-gcc ... -mbig-endian ... -dp ...
...
mpydu r0,r1,1000 # 29 [c=4 l=4] mpydu_imm_arcv2hs/1
...
Variant 1 of mpydu_imm_arcv2hs corresponds to the r, 0, I constraint, which has a length of 4.
$ cat /src/gcc/gcc/config/arc/arc.md
...
(define_insn "mpyd<su_optab>_imm_arcv2hs"
[(set (match_operand:DI 0 "even_register_operand" "=r,r, r")
(mult:DI (SEZ:DI (match_operand:SI 1 "register_operand" "r,0, r"))
(match_operand 2 "immediate_operand" "L,I,Cal")))
...
[(set_attr "length" "4,4,8")
At first glance, the real problem here is that mpydu r0, r1, 1000 shouldn't have been mapped to r, 0, I, but to r, r, Cal instead. The tricky part is that for big endian, gcc considers this form of assignment:
(r0)r1 = mpydu(r1, 1000)
So, in its eyes, the source and destination registers are the same (r1). This would be fine if the ARC ISA allowed r1 to be encoded as the indicator for the destination register pair in big endian (r0r1). However, the ISA still expects the same indicator as for the little-endian register pair (r1r0), which is r0.
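As a rough illustration of the mismatch, the following C sketch models which of the three alternatives an assembler could actually encode, judging by the register that names the destination pair (always the even one, r0 here). This is a simplified model, not GCC code: the function name is hypothetical, and the immediate ranges assume that L means an unsigned 6-bit constant and I a signed 12-bit constant, as I understand the ARC constraints.

```c
/*
 * Encoded length in bytes of "mpydu <pair>, <src>, <imm>" in this model.
 * dest_pair_reg: register that names the destination pair in the encoding.
 * src_reg:       register holding the 32-bit source operand.
 */
static int mpyd_imm_length(int dest_pair_reg, int src_reg, long imm)
{
    /* Alternative 0 (r, r, L): a,b,u6 works for any source register. */
    if (imm >= 0 && imm <= 63)
        return 4;

    /* Alternative 1 (r, 0, I): b,b,s12 needs source == pair indicator. */
    if (src_reg == dest_pair_reg && imm >= -2048 && imm <= 2047)
        return 4;

    /* Alternative 2 (r, r, Cal): a,b,limm carries a 32-bit immediate. */
    return 8;
}
```

In this model, the little-endian case `mpyd_imm_length(0, 0, 1000)` yields 4, while the big-endian case `mpyd_imm_length(0, 1, 1000)` yields 8, because the source is r1 but the pair is still named by r0. gcc, matching on r1 for both operands, picks alternative 1 and assumes 4.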
Proposed fix here: https://github.com/foss-for-synopsys-dwc-arc-processors/gcc/commit/b974ff374d51e03d99271d6adde8eb39490a0185
The proposed fix disables the b,b,... format when dealing with a big-endian target for the following instructions:
macd(u)
mpyd{s,u}_arcv2hs
vmac2h(u)
vmpy2h(u)