box86 icon indicating copy to clipboard operation
box86 copied to clipboard

Missing opcodes for wine, in particular atomic increment

Open Trass3r opened this issue 3 years ago • 8 comments

wget https://dl.winehq.org/wine-builds/ubuntu/dists/focal/main/binary-i386/wine-devel-i386_7.9~focal-1_i386.deb
wget https://dl.winehq.org/wine-builds/ubuntu/dists/focal/main/binary-i386/wine-devel_7.9~focal-1_i386.deb
dpkg-deb -xv wine-devel-i386_*~focal-1_i386.deb wine-installer
dpkg-deb -xv wine-devel_*~focal-1_i386.deb wine-installer
rm -rf ~/wine/ && mv wine-installer/opt/wine* ~/wine
WINEARCH=win32 box86 ~/wine/bin/wine wineboot
Debug level is 1
Dynarec log level is 1
Dynarec for ARMv8, with extension: HALF FAST_MULT EDSP NEON VFPv4 IDIVA PageSize:4096
Box86 with Dynarec v0.2.5 2d7e3877 built on Mar 27 2022 17:08:23
0xeab2382d: Dynarec stopped because of Opcode 0F C7 61 40 EB 18 0F AE 61 40 EB 12 F7 01 04 # xsavec [ecx+0x40]
0x7bc5a436: Dynarec stopped because of Opcode 66 F0 83 07 01 8D 75 E6 8D 5F 02 8D B4 26 00 # lock add WORD PTR [edi],0x1
4265|0x3d0052: Unimplemented Opcode (05) 65 05 80 05 6B 05 00 00 # gs add eax,0x56b0580

Trass3r avatar Mar 28 '22 22:03 Trass3r

lock add gets generated by __sync_add_and_fetch (interestingly only optimized to lock inc with -Os) https://godbolt.org/z/hxGosYWdj

It's used to implement wine's InterlockedIncrement. https://github.com/wine-mirror/wine/blob/0de8d01b09b1cc7ca34f7ae3890b4a416ff801fe/dlls/ntdll/sync.c#L543

Trass3r avatar Apr 02 '22 11:04 Trass3r

Adding this to dynarec_arm_66f0 seems to fix the lock add case. But it's mostly copy-paste from dynarec_arm_66:

        case 0x81:
        case 0x83:
            nextop = F8;
            switch((nextop>>3)&7) {
                case 0: //ADD
                    if(opcode==0x81) {
                        INST_NAME("LOCK ADD Ew, Iw");
                    } else {
                        INST_NAME("LOCK ADD Ew, Ib");
                    }
                    SETFLAGS(X_ALL, SF_SET_PENDING);
                    DMB_ISH();
                    if((nextop&0xC0)==0xC0) {
                        if(opcode==0x81) i16 = F16S; else i16 = F8S;
                        ed = xEAX+(nextop&7);
                        emit_add16c(dyn, ninst, ed, i16, x3, x14);
                    } else {
                        addr = geted(dyn, addr, ninst, nextop, &wback, x2, &fixedaddress, 0, 0, 1);
                        if(opcode==0x81) i16 = F16S; else i16 = F8S;
                        if(!fixedaddress) {
                            TSTS_IMM8(wback, 0x3);
                            B_MARK(cNE);
                        }
                        if(!fixedaddress || (fixedaddress && !(fixedaddress&3))) {
                            MARKLOCK;
                            LDREX(x1, wback);
                            emit_add16c(dyn, ninst, x1, i16, x3, x14);
                            STREX(x3, x1, wback);
                            CMPS_IMM8(x3, 0);
                            B_MARKLOCK(cNE);
                        }
                        if(!fixedaddress) {
                            DMB_ISH();
                            B_NEXT(c__);
                        }
                        if(!fixedaddress || (fixedaddress && (fixedaddress&3))) {
                            MARK;   // unaligned! also, not enough 
                            LDR_IMM9(x1, wback, 0);
                            LDREXB(x14, wback);
                            BFI(x1, x14, 0, 8); // re-inject
                            emit_add16c(dyn, ninst, x1, i16, x3, x14);
                            STREXB(x3, x1, wback);
                            CMPS_IMM8(x3, 0);
                            B_MARK(cNE);
                            STR_IMM9(x1, wback, 0);    // put the whole value
                        }
                        DMB_ISH();
                    }
                    break;
                default:
                  DEFAULT;
            }
            break;

Btw this proved useful:

diff --git a/src/dynarec/dynarec_arm_pass0.h b/src/dynarec/dynarec_arm_pass0.h
index 71148a8..7924822 100755
--- a/src/dynarec/dynarec_arm_pass0.h
+++ b/src/dynarec/dynarec_arm_pass0.h
@@ -41,8 +41,8 @@
         --dyn->size;                    \
         *ok = -1;                       \
         if(box86_dynarec_log>=LOG_INFO || box86_dynarec_dump) {\
-        dynarec_log(LOG_NONE, "%p: Dynarec stopped because of Opcode %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X", \
-        (void*)ip, PKip(0),             \
+        dynarec_log(LOG_NONE, "%p: Dynarec stopped @ %s (%s:%u) because of Opcode %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X %02X", \
+        (void*)ip, __FUNCTION__, __FILE__, __LINE__, PKip(0), \
         PKip(1), PKip(2), PKip(3),      \
         PKip(4), PKip(5), PKip(6),      \
         PKip(7), PKip(8), PKip(9),      \

Trass3r avatar Apr 02 '22 13:04 Trass3r

Saw a new one:

Dynarec stopped @ dynarec000 (dynarec_arm_00.c:2311) because of Opcode F3 66 A7 5F 5E 74 0C 89 D9 EB EA 5A 31 C0 EB

https://github.com/ptitSeb/box86/blob/3c8b70f224565d3350c0348a47c1566db5b0122e/src/dynarec/dynarec_arm_00.c#L2311

Trass3r avatar May 25 '22 21:05 Trass3r

Will you do a pullrequest with the 66 F0 81/83 opcode? (if yes, the test for alignment is wrong, as a 2bytes check is enough, not a 4bytes check).

The F3 66 A7 is another special case. I probably have to change things here, because 66 F3 A7 is handled already, but F3 66 A7, which is the same opcode, is not handled.

ptitSeb avatar May 26 '22 07:05 ptitSeb

It was more of a proof of concept, just copied stuff around in an uninformed way.

Trass3r avatar May 27 '22 12:05 Trass3r

Ah ok. Then I'll add the opcode in the dynarec later.

ptitSeb avatar May 27 '22 12:05 ptitSeb

lock add gets generated by __sync_add_and_fetch (interestingly only optimized to lock inc with -Os) https://godbolt.org/z/hxGosYWdj

Oddly now it generates

; 66 b8 01 00 
mov    ax, 0x1
; 66 f0 0f c1 01
lock xadd WORD PTR [edx], ax

It's used to implement wine's InterlockedIncrement used @ https://github.com/wine-mirror/wine/blob/0de8d01b09b1cc7ca34f7ae3890b4a416ff801fe/dlls/ntdll/sync.c#L543

Source for all the Interlocked* functions: https://github.com/wine-mirror/wine/blob/4312be1646cad32548f855e25823857092bf31dc/include/winnt.h#L6660-L6663

Trass3r avatar Oct 12 '22 09:10 Trass3r

The F3 66 A7 is another special case. I probably have to change things here, because 66 F3 A7 is handled already, but F3 66 A7, which is the same opcode, is not handled.

Code reuse ftw :)

Trass3r avatar Oct 12 '22 23:10 Trass3r