mcsema
mcsema-dyninst-disass cannot handle some AVX-512 instructions
Hi team,
I have a standalone binary, tested_api.zip, generated from the assembly source code shown below:
movl $15, %eax
kmovb %eax,%k1
vpshufb %xmm2, %xmm3, %xmm0{%k1}{z}
pmovmskb %xmm0, %eax
ret
I then used mcsema-dyninst-disass to do the control flow recovery:
/home/suzixin/code/remill/scripts/remill-build/tools/mcsema/tools/mcsema_disass/dyninst/mcsema-dyninst-disass --binary tested_api.o --output tested_api.cfg --pie_mode --dump_cfg
The CFG I got did not match the binary:
name: "tested_api.o"
funcs {
  ea: 0
  blocks {
    ea: 0
    instructions {
      ea: 0
      bytes: "\270\017\000\000\000"
    }
    instructions {
      ea: 5
      bytes: "\305\371\222"
    }
  }
  is_entrypoint: true
  name: "mystrchr"
}
segments {
  ea: 0
  data: "\024\000\000\000\000\000\000\000\001zR\000\001x\020\001\033\014\007\010\220\001\000\000\024\000\000\000\034\000\000\000\000\000\000\000\024\000\000\000\000\000\000\000\000\000\000\000"
  read_only: true
  is_external: false
  name: ".eh_frame"
  is_exported: false
  is_thread_local: false
}
segments {
  ea: 0
  data: "\270\017\000\000\000\305\371\222\310b\362e\211\000\302f\017\327\300\303"
  read_only: false
  is_external: false
  name: ".text"
  is_exported: false
  is_thread_local: false
}
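To make the mismatch concrete, here is a minimal Python sketch that compares the instruction bytes recovered in the dump against the .text segment bytes. The byte strings are copied from the dump; the helper names (`recovered_end`, `unrecovered_tail`) are made up for illustration and are not part of McSema or Dyninst.

```python
# Bytes of the .text segment, copied from the dump
# (protobuf text format prints non-printable bytes as octal escapes).
TEXT = b"\270\017\000\000\000\305\371\222\310b\362e\211\000\302f\017\327\300\303"

# (ea, bytes) pairs for the two instructions that appear in the recovered block.
RECOVERED = [
    (0, b"\270\017\000\000\000"),
    (5, b"\305\371\222"),
]


def recovered_end(instructions):
    """Return the offset one past the last byte covered by the instructions."""
    return max(ea + len(raw) for ea, raw in instructions)


def unrecovered_tail(text, instructions):
    """Return the bytes of the segment that no recovered instruction covers."""
    return text[recovered_end(instructions):]


end = recovered_end(RECOVERED)
tail = unrecovered_tail(TEXT, RECOVERED)
print(f"recovered {end} of {len(TEXT)} bytes")
print("unrecovered:", tail.hex(" "))  # bytes.hex(sep) needs Python 3.8+
```

Only 8 of the 20 bytes are covered. The first uncovered byte, 0xc8, looks like the ModRM byte of the kmovb instruction, so recovery seems to stop partway through it, before the EVEX-encoded vpshufb (which starts with 0x62) is ever reached.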
Most likely DynInst doesn't support those AVX-512 instructions or mask registers. Neither does Remill, though.
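For reference, the mask register and zeroing flag live in the EVEX prefix, which is the part a disassembler has to understand to handle these instructions. The sketch below decodes a few EVEX fields from the six vpshufb bytes in the .text segment. It is a hand-rolled illustration based on the EVEX layout in the Intel manual, not code from Dyninst or Remill; `decode_evex_prefix` is a hypothetical helper and it ignores fields that are not needed here (for example the vector length bits and the high bit of the vvvv register).

```python
def decode_evex_prefix(insn):
    """Split a 4-byte EVEX prefix (0x62 P0 P1 P2) into a few of its fields."""
    if len(insn) < 4 or insn[0] != 0x62:
        raise ValueError("not an EVEX-encoded instruction")
    p0, p1, p2 = insn[1], insn[2], insn[3]
    return {
        "map": p0 & 0x03,             # opcode map: 2 selects the 0F 38 map
        "w": p1 >> 7,                 # operand-size / opcode extension bit
        "vvvv": ~(p1 >> 3) & 0x0F,    # extra source register, stored inverted
        "pp": p1 & 0x03,              # implied legacy prefix: 1 means 0x66
        "z": p2 >> 7,                 # 1 = zeroing-masking, 0 = merging
        "aaa": p2 & 0x07,             # opmask register number (k0..k7)
    }


# vpshufb %xmm2, %xmm3, %xmm0{%k1}{z}, as it appears in the .text bytes.
VPSHUFB = bytes.fromhex("62f2658900c2")
fields = decode_evex_prefix(VPSHUFB)
print(fields)
```

The decoded fields line up with the assembly source: `aaa` is 1 (mask register k1), `z` is 1 (the `{z}` suffix), and `vvvv` is 3 (xmm3).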
If we use Capstone instead, as PR #638 does (though it seems to be blocked on aquynh/capstone#1604), we might solve the disassembler problem.
BTW, if we solve the disassembler problem, is it difficult to add support for those AVX-512 instructions or mask registers in Remill?
Yup, looks like the problem is indeed with those AVX instructions, as the function is "cut in the middle". It is not a hard obstacle and can be solved, but it would take some unknown amount of time (I already have some ideas).
(Also, I have never really tested Dyninst on a .o file instead of a fully linked ELF, but that should not play much of a role here.)
As for how much work it would be to add to remill, I will leave that answer to @pgoodman
Yeah, adding AVX512 is likely to be a bunch of work. We could possibly make use of one of those tools that converts the Intel manual to text documents and extracts the code, then programmatically translate that to our semantics. There's also some work by Sandeep Dasgupta (@sdasgup3) related to x86-64 semantics, and they might have AVX semantics that we can use for generating things more compatible with remill.
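As a rough idea of what such semantics would have to express for the example in this issue, here is a small Python reference model of the masked vpshufb followed by pmovmskb. It follows the behavior described in the Intel manual for the 128-bit forms; it is only an executable sketch, not Remill's semantics format, and the function names and test vectors are made up for illustration.

```python
def vpshufb_xmm_zmask(src1, src2, k):
    """Model VPSHUFB xmm{k}{z}: src1 holds the data, src2 the shuffle control."""
    assert len(src1) == len(src2) == 16
    out = bytearray(16)  # every lane starts at zero
    for i in range(16):
        if not (k >> i) & 1:
            continue  # zeroing-masking: a masked-off lane stays zero
        ctrl = src2[i]
        if ctrl & 0x80:
            continue  # a control byte with its top bit set zeroes the lane
        out[i] = src1[ctrl & 0x0F]  # low 4 bits select the source byte
    return bytes(out)


def pmovmskb_xmm(src):
    """Model PMOVMSKB: pack the top bit of each byte into an integer."""
    return sum(((b >> 7) & 1) << i for i, b in enumerate(src))


# Illustrative inputs: every data byte has its top bit set, identity shuffle.
data = bytes(range(0x80, 0x90))
control = bytes(range(16))
shuffled = vpshufb_xmm_zmask(data, control, 15)  # k1 = 15, as in the source
print(shuffled.hex(" "))
print(pmovmskb_xmm(shuffled))
```

With k1 = 15 only the low four byte lanes are written, so the resulting byte mask can never have bits set above bit 3, whatever the inputs are.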
@adahsuzixin Hello, which Dyninst version did you use, please?