Add new record filter tool for public release of traces
We want to create a new tool to filter traces of Google workloads for public release. The new public Google workload traces will contain more information compared to the previous version, while still preserving confidentiality of Google's IP.
Main features of new public traces:
- Instruction categories will be used instead of the original instruction opcode;
- Operand dependencies will be preserved using virtual registers + register size;
- Syscall numbers will be removed as before, but we'll keep the blocking attribute, if a syscall is blocking;
- Branch target addresses will be embedded in the trace, instead of being in a separate file;
- Provide a map of all virtual addresses used in the trace and a tool to perform the virtual-to-physical mapping.
We'd like to raise a point of discussion about the use of DR_REG_V as a prefix for virtual registers.
Most Arm programmers' documentation uses V for vector registers, e.g.ADD Vd.4S, Vn.4S, Vm.4S.
The DynamoRIO user documentation uses R for reasons of convention and generality, e.g.INSTR_CREATE_add_vector(dc, Rd, Rm, Rn, width), see:
https://dynamorio.org/dr__ir__macros__aarch64_8h.html#ad6fa6d2ab7764783481efd209cf11b76
A new or relatively inexperienced user may probably and intuitively but wrongly try something like this first:
INSTR_CREATE_add_vector(dc, opnd_create_reg(DR_REG_V0), opnd_create_reg(DR_REG_V1), opnd_create_reg(DR_REG_V2), OPND_CREATE_SINGLE())
Rather than the correct:
INSTR_CREATE_add_vector(dc, opnd_create_reg(DR_REG_Q0), opnd_create_reg(DR_REG_Q1), opnd_create_reg(DR_REG_Q2), OPND_CREATE_SINGLE())
or e.g.
INSTR_CREATE_add_vector(dc, opnd_create_reg(DR_REG_D20), opnd_create_reg(DR_REG_D10), opnd_create_reg(DR_REG_D14), OPND_CREATE_HALF())
This possible stumbling block is probably less of a problem for scalable vectors, e.g. INSTR_CREATE_add_sve(dc, Zd, Zn, Zm), because the documentation implies DR_REG_Z, see
https://dynamorio.org/dr__ir__macros__aarch64_8h.html#a14d0f0b7fad176b301539c3e9254771b
What do you think? Are we worrying unnecessarily? Is the level of DR knowledge required to use instruction macros in e.g. clients, enough for developers to know which register names/IDs are correct?
s/DR_REG_V0/DR_REG_VIRT0/?
s/DR_REG_V0/DR_REG_VIRT0/?
That's fine.
Thank you for pointing this out @AssadHashmi !
Will change from DR_REG_V to DR_REG_VIRT as @derekbruening suggested.