msys2-runtime icon indicating copy to clipboard operation
msys2-runtime copied to clipboard

Segfaults if CPU's PKU/PKRU feature is enabled

Open pipcet opened this issue 6 months ago • 11 comments

Hello!

This is about a bug affecting the msys2-docker-experimental images, reported at https://github.com/msys2/msys2-docker/issues/18 . I've now found this bug and think it should be fixed in the gendef file.

I'm not sure whether this bug applies to Cygwin, too, but it's about this code in gendef:

	movl	\$0x0d,%eax
	xorl	%ecx,%ecx
	cpuid	# get necessary space for xsave
	movq	%rbx,%rcx
	addq	\$0x48,%rbx # 0x18 for alignment, 0x30 for additional space
	subq	%rbx,%rsp
	movl	%ebx,0x24(%rsp)
	xorq	%rax,%rax
	shrq	\$3,%rcx
	leaq	0x30(%rsp),%rdi
	rep	stosq
	xgetbv	# get XCR0 (ecx is 0 after rep)
	movl	%eax,0x28(%rsp)
	movl	%edx,0x2c(%rsp)
	notl	%ecx # set ecx non-zero
	movl	%ecx,0x20(%rsp)
	xsave64	0x30(%rsp)

When running in Wine, I'm seeing segfaults caused by the final xsave64 instruction. These are because if the CPU's PKU/PKRU flag is enabled by Linux (disabling it by passing nopku on the kernel command line prevents the bug), it is also enabled in wine binaries, and that will make the cpuid instruction report a value of 2440 in rbx and rcx. This value is not a multiple of 64.

This results in a misaligned stack pointer when we hit the xsave64 instruction; xsave64 requires a 64-byte aligned storage area, and in an actual run, %rsp is 0x7ffffb648. This means the store goes to 0x7ffffb678, which isn't 64-byte aligned, causing a segfault.

However, even if the xsave64 instruction were to succeed, we'd crash a bit later because the stack pointer isn't aligned to the 16-byte boundary required by the ABI.

Unfortunately, I don't know how to rebuild msys2 to fix this. I can work around it by using Wine-gdb (winedbg --gdb bash.exe -c ... is what I'm running) , though:

For me, the cpuid instruction above is at 0x18019c645, so the following instruction is at 0x18019c647. After the cpuid instruction, we inject a breakpoint which increases %rbx by 56, making it 64-byte aligned again:

b *0x18019c647
Breakpoint 1 at 0x18019c647
Wine-gdb> command 1
command 1
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>p $rbx += 56
>c
>end
Wine-gdb> c

This allows the program to execute normally even if the CPU's PKU/PKRU feature flag is in use. This suffices to convince me the bug is as I've described; sorry if the description is a bit terse, I'm happy to answer questions!

The proposed fix is to add something to the effect of %rbx += 63; %rbx &= -64; to gendef after the cpuid instruction but before the following mov: this will increase the reported size of the xsave area to the next 64-byte multiple, which will have the same effect as the GDB script above.

pipcet avatar Jun 27 '25 12:06 pipcet

brilliant, did you use wine-gdb or gdb? or both?

Kreijstal avatar Jun 27 '25 12:06 Kreijstal

cygwin source: https://github.com/cygwin/cygwin/blob/5c8475417bc3b5c93c6c9b23f245211828cde50c/winsup/cygwin/scripts/gendef#L232-L248

@tyan0 pinging you since you wrote/changed that code from what I see :)

lazka avatar Jun 27 '25 12:06 lazka

brilliant, did you use wine-gdb or gdb? or both?

Thanks. I must confess I found the bug by grepping the source code for "xsave", then studying that code. Using winedbg from the start would have saved me the time. The proof-of-concept workaround above uses winedbg --gdb, which runs a version of GDB, IIUC.

(The msys2-docker-experimental image does not include gdb, so winedbg --gdb doesn't work in that image; even after installing gdb, there appears to be some trouble with console IO)

cygwin source: https://github.com/cygwin/cygwin/blob/5c8475417bc3b5c93c6c9b23f245211828cde50c/winsup/cygwin/scripts/gendef#L232-L248

@tyan0 pinging you since you wrote/changed that code from what I see :)

Thanks, reported to the Cygwin mailing list as it's an upstream bug. That email includes a suggested patch:

diff --git a/winsup/cygwin/scripts/gendef b/winsup/cygwin/scripts/gendef
index 861a2405b..d681fde3f 100755
--- a/winsup/cygwin/scripts/gendef
+++ b/winsup/cygwin/scripts/gendef
@@ -232,6 +232,8 @@ sigdelayed:
 	movl	\$0x0d,%eax
 	xorl	%ecx,%ecx
 	cpuid	# get necessary space for xsave
+	addq	\$63, %rbx
+	andq	\$-64, %rbx # align to next 64-byte multiple
 	movq	%rbx,%rcx
 	addq	\$0x48,%rbx # 0x18 for alignment, 0x30 for additional space
 	subq	%rbx,%rsp

I tried testing it, and while it worked in a simple test (build new DLL; replace DLL; run bash) it wasn't a clean build (building msys2 under wine appears to require additional trickery, or more experience than I have), so this might need further discussion.

pipcet avatar Jun 27 '25 14:06 pipcet

Thanks! https://cygwin.com/pipermail/cygwin/2025-June/258368.html

lazka avatar Jun 27 '25 14:06 lazka

@lazka do we want to apply the patch included in https://inbox.sourceware.org/cygwin/[email protected]/ early?

dscho avatar Jun 30 '25 10:06 dscho

I wouldn't mind at least. It's over my head though.

lazka avatar Jun 30 '25 21:06 lazka

To be honest, I'd suggest waiting until the patch is applied to Cygwin:

  • this is exclusively a Linux/Wine issue: there is no indication Windows systems are going to support PKU/PKRU anytime soon.
  • the patch might not be applied as-is, so there may be merge conflicts in the future
  • if I messed up, that might result in subtle bugs rather than loud ones

(I also feel quite strongly that the Linux kernel should allow disabling the PKU feature on a per-process basis, even if there is a small performance penalty for mixed pku/nopku systems.)

pipcet avatar Jun 30 '25 22:06 pipcet

Fair enough, I'll stop myself from opening that PR, then 😀

dscho avatar Jul 01 '25 07:07 dscho

Maybe it's time to revisit this. The Cygwin patch hasn't been applied and there was no response to a recent ping. I'm still using a horrible LD_PRELOAD-based workaround in my CI setup. CPUs with this problem are becoming more common...

pipcet avatar Nov 08 '25 13:11 pipcet

Maybe it's time to revisit this. The Cygwin patch hasn't been applied and there was no response to a recent ping. I'm still using a horrible LD_PRELOAD-based workaround in my CI setup. CPUs with this problem are becoming more common...

can you share those CI scripts if they are public, are you using the msys2 docker?

Kreijstal avatar Nov 08 '25 13:11 Kreijstal

Maybe it's time to revisit this. The Cygwin patch hasn't been applied and there was no response to a recent ping. I'm still using a horrible LD_PRELOAD-based workaround in my CI setup. CPUs with this problem are becoming more common...

can you share those CI scripts if they are public, are you using the msys2 docker?

Absolutely. (Sorry this is a bit offtopic: these are CI/CD scripts for Emacs, not for msys2, but they use msys2).

This is CI/CD for cross-building GNU Emacs targeting Win64 machines (there are similar jobs for 32-bit mingw-w64 and 32-bit "mingw.org" mingw):

The Dockerfile is here: https://codeberg.org/pipcet/emacs-forge/src/branch/main/ci/x86_64-w64-ucrt64/msys2/Dockerfile

The relevant lines are:

FROM ghcr.io/msys2/msys2-docker-experimental AS wine
RUN apt -y update
RUN apt -y install wget curl unzip git make psmisc
RUN sed -i -e 's/Cygwin WARNING/\x00ygwin WARNING/g' /root/.wine/drive_c/msys64/usr/bin/msys-2.0.dll

(The sed command silences an fprintf that I'm pretty sure is harmless, but I don't recall the details).

The rest just starts an entrypoint script (https://codeberg.org/pipcet/emacs-forge/src/branch/main/ci/x86_64-w64-ucrt64/msys2/entrypoint), the relevant bit of which reads:

    if cp /public/xsave_preload.so /xsave_preload.so; then export LD_PRELOAD=/xsave_preload.so; fi

The xsave_preload.so is put in a shared directory accessible to the Docker container as /public; its source code is https://codeberg.org/pipcet/emacs-forge/src/branch/ci-x86_64-w64-ucrt64-msys2/ci/x86_64-w64-ucrt64/msys2/xsave_preload.c. In short, it detects the specific instruction sequence that causes the problem after the segfault, but before the "real" SIGSEGV handler runs (which it doesn't if the handler succeeds). It hooks into sigaction so the handler can install itself, and it hooks into execve to ensure that any child that has been spawned also preloads the library. (I believe the execve thing was necessary).

This works very well and will turn into a harmless nop when the code is fixed upstream, but it's quite annoying when running 32-bit binaries: those don't need the preload, but it's still in the environment variable, so they try to load it anyway and an error message is produced. For every single exec.

pipcet avatar Nov 08 '25 13:11 pipcet

This works very well and will turn into a harmless nop when the code is fixed upstream, but it's quite annoying when running 32-bit binaries: those don't need the preload, but it's still in the environment variable, so they try to load it anyway and an error message is produced. For every single exec.

In man ld-linux it mentions a "dynamic string token" $PLATFORM. So the workaround could probably be extended by using LD_PRELOAD='/xsave_preload_$PLATFORM.so', and placing files for both, x86_64 and ~i386~ i686?

EDIT: Seems to work for 32-bit - at least in this test here, thanks for creating this workaround.

bernhardu avatar Jan 01 '26 23:01 bernhardu