dynamorio
AArch64 codec is diabolically sensitive to the width of an integer
I spent a long time wondering why the following (trying to get a UMOV <Wd>, <Vn>.<Ts>[<index>]) didn't work:
instr_create_1dst_3src(drcontext, OP_umov,
                       opnd_create_reg(wreg),
                       opnd_create_reg(qreg),
                       opnd_create_immed_int(0, OPSZ_1),
                       opnd_create_immed_int(2, OPSZ_1));
The helpful error message was "internal crash at PC" whatever.
To make it work I had to do the following instead:
instr_create_1dst_3src(drcontext, OP_umov,
                       opnd_create_reg(wreg),
                       opnd_create_reg(qreg),
                       opnd_create_immed_int(0, OPSZ_3b),
                       opnd_create_immed_int(2, OPSZ_1));
In other words, a 3-bit zero is different from an 8-bit zero!
But it's even worse than that: when I used OPSZ_4b it didn't crash but I got a different instruction, which was a total surprise because I thought the last operand was specifying the width of the field explicitly. Why was it ignored?
My immediate thought was that the AArch64 codec (at least) should not be sensitive to the width of an integer operand, and that this should be enforced by making some of the tests (for example dis-a64) replace all integer operands with an integer of a particular standard width before reencoding an instruction.
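To make the distinction concrete, here is a self-contained toy model in C. It is not DynamoRIO code: `toy_opnd_t`, `encode_index_strict` and `encode_index_by_value` are hypothetical names invented for this sketch. It contrasts a check that matches on the declared width of an immediate with one that only asks whether the value fits in the field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an immediate operand: a value plus the
 * width (in bits) that the caller declared for it. */
typedef struct {
    int64_t value;
    unsigned bits;
} toy_opnd_t;

/* Width-sensitive check: the operand is accepted only if its declared
 * width equals the width of the encoding field. An 8-bit zero is
 * rejected for a 3-bit field even though the value would fit. */
bool
encode_index_strict(toy_opnd_t op, unsigned field_bits, uint32_t *out)
{
    if (op.bits != field_bits)
        return false;
    *out = (uint32_t)op.value;
    return true;
}

/* Width-insensitive check: the declared width is ignored and the
 * operand is accepted whenever its value fits in the field. */
bool
encode_index_by_value(toy_opnd_t op, unsigned field_bits, uint32_t *out)
{
    if (op.value < 0 || op.value >= ((int64_t)1 << field_bits))
        return false;
    *out = (uint32_t)op.value;
    return true;
}
```

In this model an 8-bit zero fails the strict check against a 3-bit field but passes the by-value check, which is the behaviour I would have expected from the codec.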
But that prompts the question of why we have all these different widths of immediate integer operand in the first place. Perhaps they are useful for some other architecture? It also prompts the question of whether we should accept sensitivity to different widths of immediate float operand.
Currently there are only a few types of AArch64 operand that are sensitive to the width of an immediate integer and would have to be changed, but one of them is imm13_const, which is a lot more sensible than the example above because it represents a repeating bit pattern rather than the index of a vector element.
See #6448, where I claim that the width of a vector element should be specified by an immediate integer source operand with 0 meaning b, 1 meaning h, and so on. In the example above 2 means h: a perfectly reasonable convention, perhaps, but painfully inconsistent with other instructions.
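The convention from #6448 could be written as a lookup like the one below. This is an illustrative sketch rather than codec code: `elem_width_suffix` is a hypothetical name, and reading "and so on" as 2 for s and 3 for d is an assumption.

```c
/* Hypothetical helper: maps a #6448-style element-width index to its
 * suffix letter. Returns '\0' for an index outside 0..3.
 * Assumption: "and so on" continues as 2 -> s, 3 -> d. */
char
elem_width_suffix(int width_index)
{
    static const char suffixes[] = { 'b', 'h', 's', 'd' };
    int count = (int)(sizeof(suffixes) / sizeof(suffixes[0]));
    if (width_index < 0 || width_index >= count)
        return '\0';
    return suffixes[width_index];
}
```

Under this mapping an operand value of 2 selects s, not the h that the UMOV example above needs it to mean.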