riscv-assembler
Incorrect translation of `BEQ`?
Assembly code
addi s0 x0 10
addi s1 x0 10
loop:
addi s1 x0 -1 ; I know it's the same as s1 = -1
beq s1 x0 out
beq x0 x0 loop
out:
addi s1 s0 -32
Assembling it
from pathlib import Path
from riscv_assembler.convert import AssemblyConverter
BASE_PATH = Path("code")
conv = AssemblyConverter(output_type="bt")
conv.convert(str(BASE_PATH / "loop.s"))
Instructions in binary
This is the `.txt` file produced by `convert`:
00000000101000000000010000010011
00000000101000000000010010010011
11111111111100000000010010010011
00000000000001001000010001100011
01111100000000000000101011100011
11111110000001000000010010010011
Issue
The `beq x0 x0 loop` instruction seems to be encoded incorrectly. According to chapter 19 "RV32/64G Instruction Set Listings" of the spec, this is a B-type instruction whose encoding is as follows:
0 111110 00000 00000 000 1010 1 1100011
^ ^^^^^^ ^^^^^ ^^^^^     ^^^^ ^
| |      |     rs1       |    |
| |      rs2             |    imm[11]
| imm[10:5]              imm[4:1]
imm[12]
Thus, the immediate bits are:
- `imm[4:1] = 1010`
- `imm[10:5] = 111110`
- `imm[11] = 1`
- `imm[12] = 0`
According to section 2.3 "Immediate Encoding Variants", `imm[12]` is the sign of the immediate, so since `imm[12] = 0`, we have a positive offset: `beq x0 x0 loop` will jump forward, even though it's supposed to jump backward, to the `loop` label.
The immediate is:
0000 1 111110 1010 0 = 4052
So we'll jump 4052 bytes forward???
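To double-check the arithmetic, here is a minimal sketch that reassembles the B-type immediate from a 32-bit word, following the field layout from the spec. The helper name `decode_b_imm` is my own and is not part of `riscv_assembler`. The second word is the RARS encoding of the same instruction, included for comparison.

```python
def decode_b_imm(word: int) -> int:
    """Return the signed branch offset (in bytes) stored in a B-type word."""
    imm12 = (word >> 31) & 0x1      # sign bit
    imm10_5 = (word >> 25) & 0x3F
    imm4_1 = (word >> 8) & 0xF
    imm11 = (word >> 7) & 0x1
    # imm[0] is not stored; it is always 0.
    imm = (imm12 << 12) | (imm11 << 11) | (imm10_5 << 5) | (imm4_1 << 1)
    if imm12:
        imm -= 1 << 13              # sign-extend the 13-bit value
    return imm


def parse(bits: str) -> int:
    """Parse a binary string, ignoring the spaces used to separate fields."""
    return int(bits.replace(" ", ""), 2)


assembler_word = parse("0 111110 00000 00000 000 1010 1 1100011")
rars_word = parse("1 111111 00000 00000 000 1100 1 1100011")

print(decode_b_imm(assembler_word))  # 4052
print(decode_b_imm(rars_word))       # -8
```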
Furthermore, RARS provides a different encoding for this instruction:
v--- different leading bit
1 111111 00000 00000 000 1100 1 1100011 <- RARS
0 111110 00000 00000 000 1010 1 1100011 <- this assembler
RARS's immediate is `1111 1 111111 1100 0 = -8`. The reassembled immediate is already a byte offset (`imm[0]` is an implicit 0), so that's a jump 8 bytes back, i.e. 8 / 4 = 2 instructions back, which leads to the `loop` label, which makes sense.
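For reference, this is roughly what I would expect the encoder to do. It is only a sketch under my own assumptions, not the library's code: `encode_beq` is a hypothetical helper, instructions are assumed to be 4 bytes each starting at address 0, so the branch sits at byte address 16 and `loop` at 8, giving an offset of 8 - 16 = -8.

```python
def encode_beq(rs1: int, rs2: int, offset: int) -> str:
    """Encode `beq rs1, rs2, offset` as a 32-bit binary string (B-type)."""
    if offset % 2 != 0 or not -4096 <= offset < 4096:
        raise ValueError("offset must be even and fit in 13 signed bits")
    imm = offset & 0x1FFF                 # 13-bit two's complement
    word = (
        ((imm >> 12) & 0x1) << 31         # imm[12]
        | ((imm >> 5) & 0x3F) << 25       # imm[10:5]
        | (rs2 & 0x1F) << 20
        | (rs1 & 0x1F) << 15
        | 0b000 << 12                     # funct3 for BEQ
        | ((imm >> 1) & 0xF) << 8         # imm[4:1]
        | ((imm >> 11) & 0x1) << 7        # imm[11]
        | 0b1100011                       # BRANCH opcode
    )
    return format(word, "032b")


branch_pc = 16   # address of `beq x0 x0 loop` (fifth instruction)
loop_pc = 8      # address of the `loop` label (third instruction)

print(encode_beq(rs1=0, rs2=0, offset=loop_pc - branch_pc))
# 11111110000000000000110011100011
```

This matches the RARS encoding. The same helper with `rs1=9, rs2=0, offset=8` reproduces the word the assembler emitted for `beq s1 x0 out`, so the forward branch looks fine and only the backward one is affected.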