Missing opcodes for wine, in particular atomic increment
wget https://dl.winehq.org/wine-builds/ubuntu/dists/focal/main/binary-i386/wine-devel-i386_7.9~focal-1_i386.deb
wget https://dl.winehq.org/wine-builds/ubuntu/dists/focal/main/binary-i386/wine-devel_7.9~focal-1_i386.deb
dpkg-deb -xv wine-devel-i386_*~focal-1_i386.deb wine-installer
dpkg-deb -xv wine-devel_*~focal-1_i386.deb wine-installer
rm -rf ~/wine/ && mv wine-installer/opt/wine* ~/wine
WINEARCH=win32 box86 ~/wine/bin/wine wineboot
Debug level is 1
Dynarec log level is 1
Dynarec for ARMv8, with extension: HALF FAST_MULT EDSP NEON VFPv4 IDIVA PageSize:4096
Box86 with Dynarec v0.2.5 2d7e3877 built on Mar 27 2022 17:08:23
0xeab2382d: Dynarec stopped because of Opcode 0F C7 61 40 EB 18 0F AE 61 40 EB 12 F7 01 04 # xsavec [ecx+0x40]
0x7bc5a436: Dynarec stopped because of Opcode 66 F0 83 07 01 8D 75 E6 8D 5F 02 8D B4 26 00 # lock add WORD PTR [edi],0x1
4265|0x3d0052: Unimplemented Opcode (05) 65 05 80 05 6B 05 00 00 # gs add eax,0x56b0580
lock add gets generated by __sync_add_and_fetch (interestingly only optimized to lock inc with -Os)
https://godbolt.org/z/hxGosYWdj
It's used to implement wine's InterlockedIncrement.
https://github.com/wine-mirror/wine/blob/0de8d01b09b1cc7ca34f7ae3890b4a416ff801fe/dlls/ntdll/sync.c#L543
Adding this to dynarec_arm_66f0 seems to fix the lock add case. But it's mostly copy-paste from dynarec_arm_66:
case 0x81:
case 0x83:
nextop = F8;
switch((nextop>>3)&7) {
case 0: //ADD
if(opcode==0x81) {
INST_NAME("LOCK ADD Ew, Iw");
} else {
INST_NAME("LOCK ADD Ew, Ib");
}
SETFLAGS(X_ALL, SF_SET_PENDING);
DMB_ISH();
if((nextop&0xC0)==0xC0) {
if(opcode==0x81) i16 = F16S; else i16 = F8S;
ed = xEAX+(nextop&7);
emit_add16c(dyn, ninst, ed, i16, x3, x14);
} else {
addr = geted(dyn, addr, ninst, nextop, &wback, x2, &fixedaddress, 0, 0, 1);
if(opcode==0x81) i16 = F16S; else i16 = F8S;
if(!fixedaddress) {
TSTS_IMM8(wback, 0x3);
B_MARK(cNE);
}
if(!fixedaddress || (fixedaddress && !(fixedaddress&3))) {
MARKLOCK;
LDREX(x1, wback);
emit_add16c(dyn, ninst, x1, i16, x3, x14);
STREX(x3, x1, wback);
CMPS_IMM8(x3, 0);
B_MARKLOCK(cNE);
}
if(!fixedaddress) {
DMB_ISH();
B_NEXT(c__);
}
if(!fixedaddress || (fixedaddress && (fixedaddress&3))) {
MARK; // unaligned! also, not enough
LDR_IMM9(x1, wback, 0);
LDREXB(x14, wback);
BFI(x1, x14, 0, 8); // re-inject
emit_add16c(dyn, ninst, x1, i16, x3, x14);
STREXB(x3, x1, wback);
CMPS_IMM8(x3, 0);
B_MARK(cNE);
STR_IMM9(x1, wback, 0); // put the whole value
}
DMB_ISH();
}
break;
default:
DEFAULT;
}
break;
Btw this proved useful:
diff --git a/src/dynarec/dynarec_arm_pass0.h b/src/dynarec/dynarec_arm_pass0.h
index 71148a8..7924822 100755
--- a/src/dynarec/dynarec_arm_pass0.h
+++ b/src/dynarec/dynarec_arm_pass0.h
@@ -41,8 +41,8 @@
--dyn->size; \
*ok = -1; \
if(box86_dynarec_log>=LOG_INFO || box86_dynarec_dump) {\
- dynarec_log(LOG_NONE, "%p: Dynarec stopped because of Opcode %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X", \
- (void*)ip, PKip(0), \
+ dynarec_log(LOG_NONE, "%p: Dynarec stopped @ %s (%s:%u) because of Opcode %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X", \
+ (void*)ip, __FUNCTION__, __FILE__, __LINE__, PKip(0), \
PKip(1), PKip(2), PKip(3), \
PKip(4), PKip(5), PKip(6), \
PKip(7), PKip(8), PKip(9), \
Saw a new one:
Dynarec stopped @ dynarec000 (dynarec_arm_00.c:2311) because of Opcode F3 66 A7 5F 5E 74 0C 89 D9 EB EA 5A 31 C0 EB
https://github.com/ptitSeb/box86/blob/3c8b70f224565d3350c0348a47c1566db5b0122e/src/dynarec/dynarec_arm_00.c#L2311
Will you do a pullrequest with the 66 F0 81/83 opcode? (if yes, the test for alignment is wrong, as a 2bytes check is enough, not a 4bytes check).
The F3 66 A7 is another special case. I probably have to change things here, because 66 F3 A7 is handled already, but F3 66 A7, which is the same opcode, is not handled.
It was more of a proof of concept, just copied stuff around in an uninformed way.
Ah ok. Then I'll add the opcode in the dynarec later.
lock addgets generated by__sync_add_and_fetch(interestingly only optimized to lock inc with-Os) https://godbolt.org/z/hxGosYWdj
Oddly now it generates
; 66 b8 01 00
mov ax, 0x1
; 66 f0 0f c1 01
lock xadd WORD PTR [edx], ax
It's used to implement wine's
InterlockedIncrementused @ https://github.com/wine-mirror/wine/blob/0de8d01b09b1cc7ca34f7ae3890b4a416ff801fe/dlls/ntdll/sync.c#L543
Source for all the Interlocked* functions:
https://github.com/wine-mirror/wine/blob/4312be1646cad32548f855e25823857092bf31dc/include/winnt.h#L6660-L6663
The F3 66 A7 is another special case. I probably have to change things here, because 66 F3 A7 is handled already, but F3 66 A7, which is the same opcode, is not handled.
Code reuse ftw :)