remill
Inconsistencies between lifting IRs and physical CPU
Hi all, I found several inconsistencies between Remill's lifted IR and the behavior of a physical CPU while using it.
1. For the imul instruction, Remill resets both the AF and ZF flags to zero and sets the PF and SF flags according to the result of the calculation. The physical CPU, by contrast, does not alter these four flags in that way; it keeps the values established by the preceding add %r11d, %ecx instruction.
2. For the sar, sal, shr, and shl instructions, Remill does not model any effect on the AF flag, whereas the physical CPU does change it.
The following is the assembly code.
0000000000400504 <Block_1>:
400504: 41 c1 fa 1f sar $0x1f,%r10d
400508: 44 01 d9 add %r11d,%ecx
000000000040050b <Block_2>:
40050b: 48 0f af d0 imul %rax,%rdx
40050f: 48 c1 ea 1f shr $0x1f,%rdx
The following is the lifted IR for the instruction at 0x40050b, imul %rax, %rdx.
%80 = call %struct.Memory* @breakpoint_40050b(%struct.Memory* %79)
call void @__mcsema_pc_tracer(i64 4195595)
store i64 add (i64 ptrtoint (i32 (i32, i8**, i8**)* @main to i64), i64 27), i64* @RIP_2472_2ba84c8, align 8
%81 = load i64, i64* @RDX_2264_2ba84c8, align 8
%82 = load i64, i64* @RAX_2216_2ba84c8, align 8
%83 = ashr i64 %81, 63
%84 = ashr i64 %82, 63
%L.sroa.2.0.insert.ext.i.i49 = zext i64 %83 to i128
%L.sroa.2.0.insert.shift.i.i50 = shl nuw i128 %L.sroa.2.0.insert.ext.i.i49, 64
%L.sroa.0.0.insert.ext.i.i51 = zext i64 %81 to i128
%L.sroa.0.0.insert.insert.i.i52 = or i128 %L.sroa.2.0.insert.shift.i.i50, %L.sroa.0.0.insert.ext.i.i51
%R.sroa.2.0.insert.ext.i.i53 = zext i64 %84 to i128
%R.sroa.2.0.insert.shift.i.i54 = shl nuw i128 %R.sroa.2.0.insert.ext.i.i53, 64
%R.sroa.0.0.insert.ext.i.i55 = zext i64 %82 to i128
%R.sroa.0.0.insert.insert.i.i56 = or i128 %R.sroa.2.0.insert.shift.i.i54, %R.sroa.0.0.insert.ext.i.i55
%mul.i.i57 = mul nsw i128 %R.sroa.0.0.insert.insert.i.i56, %L.sroa.0.0.insert.insert.i.i52
%retval.sroa.0.0.extract.trunc.i.i58 = trunc i128 %mul.i.i57 to i64
store i64 %retval.sroa.0.0.extract.trunc.i.i58, i64* @RDX_2264_2ba84c8, align 8, !tbaa !1219
%conv4.i.i.i59 = sext i64 %retval.sroa.0.0.extract.trunc.i.i58 to i128
%cmp.i.i.i60 = icmp ne i128 %mul.i.i57, %conv4.i.i.i59
%frombool.i.i61 = zext i1 %cmp.i.i.i60 to i8
store i8 %frombool.i.i61, i8* @CF_2065_2ba8480, align 1, !tbaa !1221
%x.sroa.0.0.insert.ext.i.i.i63 = trunc i128 %mul.i.i57 to i32
%conv.i.i.i.i64 = and i32 %x.sroa.0.0.insert.ext.i.i.i63, 255
%85 = call i32 @llvm.ctpop.i32(i32 %conv.i.i.i.i64) #16, !range !1235
%86 = trunc i32 %85 to i8
%87 = and i8 %86, 1
%88 = xor i8 %87, 1
store i8 %88, i8* @PF_2067_2ba8480, align 1, !tbaa !1236
store i8 0, i8* @AF_2069_2ba8480, align 1, !tbaa !1237
store i8 0, i8* @ZF_2071_2ba8480, align 1, !tbaa !1238
%res_trunc.lobit.i.i69 = lshr i64 %retval.sroa.0.0.extract.trunc.i.i58, 63
%89 = trunc i64 %res_trunc.lobit.i.i69 to i8
store i8 %89, i8* @SF_2073_2ba8480, align 1, !tbaa !1239
store i8 %frombool.i.i61, i8* @OF_2077_2ba8480, align 1, !tbaa !1240
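For readers who find the IR hard to follow, here is a minimal C++ sketch of what it appears to compute for a 64-bit two-operand imul. The names `LiftedImulFlags` and `ModelLiftedImul64` are hypothetical and are not Remill code; the sketch also assumes a compiler with `__int128` (GCC or Clang).

```cpp
#include <bitset>
#include <cstdint>

// One byte per flag, matching the i8 stores in the IR above.
struct LiftedImulFlags {
  uint8_t cf, pf, af, zf, sf, of;
};

// Hypothetical helper (not Remill code) that mirrors the IR for a 64-bit
// two-operand imul. Relies on __int128, a GCC/Clang extension.
inline LiftedImulFlags ModelLiftedImul64(int64_t lhs, int64_t rhs,
                                         uint64_t *result) {
  // Sign-extend both operands to 128 bits and multiply.
  const __int128 wide =
      static_cast<__int128>(lhs) * static_cast<__int128>(rhs);

  // Truncate the product to 64 bits; this is what gets stored to RDX.
  const uint64_t low =
      static_cast<uint64_t>(static_cast<unsigned __int128>(wide));
  *result = low;

  // CF and OF: set when sign-extending the truncated result does not
  // reproduce the full 128-bit product.
  const bool overflow =
      wide != static_cast<__int128>(static_cast<int64_t>(low));

  LiftedImulFlags flags{};
  flags.cf = overflow ? 1 : 0;
  flags.of = overflow ? 1 : 0;

  // PF: 1 when the low byte of the product has an even number of set bits.
  flags.pf = (std::bitset<8>(low & 0xFF).count() % 2 == 0) ? 1 : 0;

  // AF and ZF: the IR stores constant zero, regardless of the result.
  flags.af = 0;
  flags.zf = 0;

  // SF: the top bit of the truncated result.
  flags.sf = static_cast<uint8_t>(low >> 63);
  return flags;
}
```

Note that in this model `ModelLiftedImul64(0, 7, &r)` yields a zero result with ZF still 0, which is the constant store visible in the IR.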
This is probably the cause: https://github.com/lifting-bits/remill/blob/269e61a601a399229d8d8deb8fc00cb4def69038/lib/Arch/X86/Semantics/BINARY.cpp#L246-L256
Adding some context to Peter's comments: According to the Intel Processor Manual (https://cdrdv2.intel.com/v1/dl/getContent/671110), for IMUL:
The SF, ZF, AF, and PF flags are undefined. (Page 3-503)
Where this becomes confusing is that on real, physical CPUs the values left in undefined flags often look very much defined in practice. The problem is that because these flags are officially documented as undefined, the observed behavior is inconsistent across CPU generations and across manufacturers (e.g., AMD).
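One practical consequence for anyone diffing lifted code against hardware: undefined flags probably should be masked out before comparing. Below is a minimal sketch of that idea. The bit positions are the standard EFLAGS layout; the names `kImulUndefined`, `kShiftUndefinedNonZeroCount`, and `FlagsAgree` are hypothetical, and the shift mask reflects my reading of the Intel manual (AF undefined for a non-zero count) rather than anything stated in this thread.

```cpp
#include <cstdint>

// Standard EFLAGS bit positions for the arithmetic flags.
constexpr uint64_t kCF = 1ull << 0;
constexpr uint64_t kPF = 1ull << 2;
constexpr uint64_t kAF = 1ull << 4;
constexpr uint64_t kZF = 1ull << 6;
constexpr uint64_t kSF = 1ull << 7;
constexpr uint64_t kOF = 1ull << 11;

// IMUL: SF, ZF, AF, and PF are undefined per the manual quoted above.
constexpr uint64_t kImulUndefined = kSF | kZF | kAF | kPF;

// Assumption: for SAR/SAL/SHR/SHL with a non-zero count, AF is undefined.
// OF and CF can also be undefined for some counts; that is not modeled here.
constexpr uint64_t kShiftUndefinedNonZeroCount = kAF;

// Hypothetical helper: true when the two flag words agree on every bit
// that is not covered by undefined_mask.
constexpr bool FlagsAgree(uint64_t lifted, uint64_t native,
                          uint64_t undefined_mask) {
  return ((lifted ^ native) & ~undefined_mask) == 0;
}
```

With this mask, a ZF or AF difference after imul would not be reported as a mismatch, while a CF or OF difference still would.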
We should really have a one-argument form of __remill_undefined_8 that takes in a concrete value.
It looks like with the P4 core, IMUL started preserving some of the flags: https://www.sandpile.org/x86/flags.htm
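To make the one-argument idea above concrete, here is a rough sketch of how it might look. Everything in it is hypothetical: `UndefinedWithHint8`, `ArithFlags`, and `MarkImulUndefinedFlags` are illustrative names, not part of Remill, and the default behavior (return the hint unchanged) is just one possible policy.

```cpp
#include <cstdint>

// Hypothetical one-argument counterpart to an "undefined 8-bit value"
// intrinsic. The caller passes the concrete value it would otherwise have
// written; a consumer of the lifted code could keep it, or replace it with
// a truly undefined value. This default simply returns the hint.
inline uint8_t UndefinedWithHint8(uint8_t concrete_hint) {
  return concrete_hint;
}

// Illustrative flag state, one byte per flag.
struct ArithFlags {
  uint8_t pf, af, zf, sf;
};

// Sketch of IMUL flag handling that marks SF, ZF, AF, and PF as undefined
// while hinting "keep the previous value", which matches the preserving
// behavior reported for the physical CPU in this thread.
inline void MarkImulUndefinedFlags(ArithFlags &flags) {
  flags.pf = UndefinedWithHint8(flags.pf);
  flags.af = UndefinedWithHint8(flags.af);
  flags.zf = UndefinedWithHint8(flags.zf);
  flags.sf = UndefinedWithHint8(flags.sf);
}
```

The appeal of this shape is that the semantics still say "undefined", but tools that compare against a specific CPU get a concrete value to work with.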