Aarch64 : ADD/SUB, SP/ZR confusion when Rd=31
Hi !
Currently in Aarch64 architecture, if Rd=31 in ADD(S) and SUB(S) instructions (and so the alias CMP), Miasm translates the destination register with the 'SP' register.
https://github.com/cea-sec/miasm/blob/master/miasm/arch/aarch64/arch.py#L1803
aarch64op("addsub", [sf, bs_adsu_name, modf, bs('10001'), shift, imm_sft_12, rn, rd], [rd, rn, imm_sft_12])
However, according to the ARM specification, results are not always the same and can be SP or ZR. http://shell-storm.org/armv8-a/ISA_v85A_A64_xml_00bet8_OPT/xhtml/subs_addsub_ext.html#commonps
- For ANDS(imm, ext, shifted) and SUBS (imm, ext, shifted) Rd=31 refers to ZR register.
- For ADD/SUB(shift), Rd=31 refers to ZR register
// Instruction pseudo code
result = ....
X[d] = result;
- For ADD/SUB (imm, ext) Rd=31 refers to SP register
// Instruction pseudo code
result = ....
if d == 31 then
SP[] = result;
else
X[d] = result;
Can you be a bit more precise on the bug (I mean give opcode and what are you expecting for?)
It seems the adds with r31 as destination are 'aliased' in miasm with CMP:
aarch64op("cmp", [sf, bs('1'), bs('1'), bs('10001'), shift, imm_sft_12, rn, bs('11111')], [rn, imm_sft_12], alias=True)
aarch64op("cmn", [sf, bs('0'), bs('1'), bs('10001'), shift, imm_sft_12, rn, bs('11111')], [rn, imm_sft_12], alias=True)
(same with ands/tst)
And the disasm of such an instruction generates CMP/TST respectively (see regression tests in tests/aarch64/arch.py)
Yep sorry :S
For instance with cmp w1, w2, uxtb (alias for SUBS )
from miasm.analysis.machine import Machine
machine = Machine('aarch64l')
# cmp w1, w2, uxtb
instr = machine.mn.dis("\x3f\x00\x22\x6b",'l')
print(instr)
The example prints SUBS WSP, W1, W2 UXTB 0x0 instead of SUBS WZR, W1, W2 UXTB 0x0 http://shell-storm.org/armv8-a/ISA_v85A_A64_xml_00bet8_OPT/xhtml/cmp_subs_addsub_imm.html
So internal expressions of sub/add instructions can be wrong.
I used a CMP extended register because aliases hide the real Rd destination register parsed by Miasm and we cannot see the issue with a simple CMP
Hey, can you try by decommenting the line: https://github.com/cea-sec/miasm/blob/master/miasm/arch/aarch64/arch.py#L1815
#aarch64op("cmp", [sf, bs('1'), bs('1'), bs('01011'), bs('00'), bs('1'), rm_ext, option, imm3, rn, bs('11111')], [rn, rm_ext], alias=True)
I don't remember why I commented out this one :disappointed:
By commenting the line it is ok for this instruction :+1: I also checked the IR, and it is ok too (SP is not modified)
zf = FLAG_EQ_CMP(X1[0:32], zeroExt_32(X2[0:32][0:8]) << (0x0 & 0x1F))
of = FLAG_SUB_OF(X1[0:32], zeroExt_32(X2[0:32][0:8]) << (0x0 & 0x1F))
nf = FLAG_SIGN_SUB(X1[0:32], zeroExt_32(X2[0:32][0:8]) << (0x0 & 0x1F))
cf = FLAG_SUB_CF(X1[0:32], zeroExt_32(X2[0:32][0:8]) << (0x0 & 0x1F)) ^ 0x1
However with cmn w0, w1 => ADDS WZR, W0, W1 there is the same issue:
from miasm.analysis.machine import Machine
from miasm.core.locationdb import LocationDB
machine = Machine('aarch64l')
# cmn w0, w1
instr = machine.mn.dis("\x1f\x00\x01\x2b",'l')
print(instr)
loc_db = LocationDB()
ira = machine.ira(loc_db)
ircfg = ira.new_ircfg()
ira.add_instr_to_ircfg(instr, ircfg)
for lbl, irblock in ircfg.blocks.items():
print(irblock.to_string(loc_db))
Miasm uses WSP instead of WZR
ADDS WSP, W0, W1
loc_0:
zf = FLAG_EQ_CMP(X0[0:32], -X1[0:32])
of = FLAG_ADD_OF(X0[0:32], X1[0:32])
nf = FLAG_SIGN_SUB(X0[0:32], -X1[0:32])
SP = zeroExt_64(X0[0:32] + X1[0:32])
cf = FLAG_ADD_CF(X0[0:32], X1[0:32])
I think it misses aliases for add/sub instructions and I wonder if it is better to add missing aliases or to split the "addsub" aarch64op definition and use either 'rd' or 'rdz' depending on the opcode.
I agree with you.
I will make the adds/ands using a rd_nosp, and a separate cmp/tst (with the bs('11111') for sp)