rr icon indicating copy to clipboard operation
rr copied to clipboard

Replaying a Wine recording triggers `Assertion 'ticks_now == trace_ticks'`, mmap with MAP_PRIVATE

Open bernhardu opened this issue 6 months ago • 14 comments

After @Keno 's patch finally went in some form into Wine I found there still happen diversions. So I tried to bisect and found Wine was changing how a shared memory between wineserver and the gui processes get mapped. Unfortunately there were a few more commits making this one not alone revertable.

But I could strip it down to this modification to wine to make replaying work again by using the previous mapping flag MAP_SHARED instead of MAP_PRIVATE:

--- a/dlls/ntdll/unix/virtual.c
+++ b/dlls/ntdll/unix/virtual.c
@@ -2351,7 +2351,13 @@ static NTSTATUS map_file_into_view( struct file_view *view, int fd, size_t start
         /* changes to the file are not guaranteed to be visible in read-only MAP_PRIVATE mappings,
          * but they are on Linux so we take advantage of it */
 #ifdef __linux__
+      if (vprot & SEC_FILE) {
         flags |= MAP_PRIVATE;
+      } else {
+        flags |= MAP_SHARED;
+        prot &= ~PROT_WRITE;
+        //ERR("fd=%d flags=0x%x prot=0x%x vprot=0x%x:%d\n", fd, flags, prot, vprot, __LINE__);
+      }
 #else
         flags |= MAP_SHARED;
         prot &= ~PROT_WRITE;

This is what I got, but there was no relation to a shared memory visible:

[FATAL src/ReplaySession.cc:1278:check_ticks_consistency()] 
 (task 1131280 (rec:1131219) at time 151589)
 -> Assertion `ticks_now == trace_ticks' failed to hold. ticks mismatch for 'SYSCALL: read'; expected 24071991, got 24071980

So I tried prove this shared memory being the issue by spraying such calls syscall(-1, 0, 0); in the functions shown by the emergency shell and reached that way function shared_object_get_id in Wine, with a better to follow register mismatch:

[FATAL src/ReplayTask.cc:130:validate_regs()] 
 (task 1123897 (rec:1123824) at time 170377)
 -> Assertion `!comparison.mismatch_count' failed to hold. Mismatched registers, replay vs rec: r12 (nil) != 0x18
104             lock.id = object->id;
   0x00007f7e41638440 <+127>:   mov    0x8(%rbx),%r12
   0x00007f7e41638444 <+131>:   mov    %r12,-0x40(%rbp)

There the memory in question looks like this:

(gdb) print/x *(const shared_object_t *) 0x7ffffe9e0ef8
$5 = {seq = 0x6, id = 0x0, shm = {desktop = {flags = 0xfefefefe, cursor = {x = 0xfefefefe, y = 0xfefefefe, last_change = 0xfefefefe, clip = {left = 0xfefefefe, top = 0xfefefefe, right = 0xfefefefe, 
          bottom = 0xfefefefe}}, keystate = {0xfe <repeats 256 times>}, monitor_serial = 0xfefefefefefefefe}, queue = {hooks_count = {0xfefefefe <repeats 17 times>}, wake_mask = 0xfefefefe, wake_bits = 0xfefefefe, 
      changed_mask = 0xfefefefe, changed_bits = 0xfefefefe}, input = {foreground = 0xfefefefe, active = 0xfefefefe, focus = 0xfefefefe, capture = 0xfefefefe, menu_owner = 0xfefefefe, move_size = 0xfefefefe, 
      caret = 0xfefefefe, caret_rect = {left = 0xfefefefe, top = 0xfefefefe, right = 0xfefefefe, bottom = 0xfefefefe}, cursor = 0xfefefefe, cursor_count = 0xfefefefe, keystate = {0xfe <repeats 256 times>}, 
      keystate_lock = 0xfefefefe}}}

When I replay the wineserver to short before the diversion I see this state of the mapping:

(rr) print/x *(const shared_object_t *)0x7f872e6d0ef8
$1 = {seq = 0xe, id = 0x18, shm = {desktop = {flags = 0x0, cursor = {x = 0x0, y = 0x0, last_change = 0x0, clip = {left = 0x0, top = 0x0, right = 0x0, bottom = 0x0}}, keystate = {0x0 <repeats 41 times>, 0x80, 0x0, 
        0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x55 <repeats 204 times>}, monitor_serial = 0x5555555555555555}, queue = {hooks_count = {0x0 <repeats 17 times>}, wake_mask = 0x0, wake_bits = 0x8000, 
      changed_mask = 0x0, changed_bits = 0x8000}, input = {foreground = 0x0, active = 0x0, focus = 0x0, capture = 0x0, menu_owner = 0x0, move_size = 0x0, caret = 0x0, caret_rect = {left = 0x0, top = 0x0, 
        right = 0x0, bottom = 0x0}, cursor = 0x0, cursor_count = 0x0, keystate = {0x0 <repeats 21 times>, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x55 <repeats 224 times>}, 
      keystate_lock = 0x55555555}}}

So it looks like this mapping was not propagating changes to the memory in wineserver to the wine process, which seems to show some initial state of it.

To make things more complicated this shared memory is created by wineserver by creating a temporary file and sending the fd through a socket to the wine process.

I tried to replicate this in a rr testcase, but unfortunately this is not showing the issue, so there is maybe more to it: mmap_private_from_fd_via_socket.c.txt

I had made some tests a while ago for this issue and was seeing it at Intel hardware too, so it seems no AMD related thing.

Is this probably a known issue in rr? Any hints how to continue from here?

PS.: To speed up things I am just recording wineserver and the wine process I am interested in. But the issue appears also when recording the whole tree.

rr record bash -c "wineserver -k; wineserver -p60; rr record --nested=release wine wineconsole cmd /c \"timeout /t 10\" & sleep 8s; echo start; WINEDLLOVERRIDES=winemenubuilder.exe=n WINEDEBUG=+pid,+timestamp wine winemine; sleep 1s; wineserver -k; sleep 1s"```

bernhardu avatar May 31 '25 09:05 bernhardu

This comment looks suspicious:

         /* changes to the file are not guaranteed to be visible in read-only MAP_PRIVATE mappings,
          * but they are on Linux so we take advantage of it */

https://gitlab.winehq.org/wine/wine/-/blob/master/dlls/ntdll/unix/virtual.c?ref_type=heads#L2351 In rr, changes to the file behind a MAP_PRIVATE mapping will generally not be reflected in a MAP_PRIVATE mapping even if it's read-only. For MAP_PRIVATE mappings we generally make some kind of copy of the file data at mapping time. As the comment hints, applications aren't really supposed to depend on readonly MAP_PRIVATE mappings tracking updates to the underlying file.

rocallahan avatar May 31 '25 11:05 rocallahan

I don't know what that commit was really trying to do.

rr has MmappedFileMonitor to track file writes that should be echoed to MAP_SHARED mappings. Extending it to MAP_PRIVATE as well might work but would be pretty expensive :-(.

rocallahan avatar May 31 '25 11:05 rocallahan

I mixed up the flags when I tried to create a rr test. This version shows a similar assertion as Wine:

mmap_private_from_fd_via_socket.c.txt

bernhardu avatar May 31 '25 13:05 bernhardu

Thanks for the hints. Unfortunately all my tests to get into the MmappedFileMonitor path did still show a register diversion up to now.

bernhardu avatar Jun 01 '25 08:06 bernhardu

MmappedFileMonitor won't help us here. The testcase does

  • Create a a shared memory segment
  • Create a MAP_PRIVATE mapping of it
  • Read but don't write to the private mapping
  • Create a MAP_SHARED mapping of it
  • Write to the shared mapping
  • Read from the private mapping and observe the previous write

(The socket stuff is not relevant.)

During replay, when rr creates the private mapping, we create anonymous pages and copy the recorded initial contents of the mapping into those pages. So those pages aren't connected to the shared mapping.

We can probably fix this by making non-anonymous private mappings use the shared-mapping path, i.e. we create a sharable file for the mapped pages, and create a private mapping for it during replay just like we do during recording.

rocallahan avatar Jun 02 '25 04:06 rocallahan

Unfortunately, making MAP_PRIVATE behave more like MAP_SHARED means making replay a lot slower in general because when we clone sessions we have to replicate the structure of shared mappings in the cloned processes :-(. That seems like a sad tradeoff to support Wine depending on behaviour that Linux doesn't even guarantee.

rocallahan avatar Jun 02 '25 10:06 rocallahan

In this case, would it be something to give a warning about this private mapping which may cause issues? Can this situation safe enough detected?

So this is then maybe the point to involve Wine developer, if this is really the intended behaviour? Maybe instead of slowing down rr in general, it might be possible to get a "switch" into Wine to avoid this, something like in my opening comment?

Thanks for you time.

bernhardu avatar Jun 02 '25 13:06 bernhardu

If you can figure out which file(s) this matters for and why they are being treated this way, that would be very useful.

It's not obvious to me why you would want to do this instead of just mapping them shared but readonly, like Wine used to do.

rocallahan avatar Jun 02 '25 20:06 rocallahan

Will try to get some more infos.

About the files - if I understand it right, this situation is currently just for one mapping between wineserver and a .exe process, when it creates some "shared session block". And for this the file is just the vehicle to have both processes having the same memory mapped - the content of the file is after the mapping not of interest. (Would it be easier to lookup for the private mapping, if that file is mapped in some other process and leave the file with the inital content, but "connect" the two memory areas?)

wineserver, file creation:
(rr) bt
#0  __libc_open64 (file=file@entry=0x7ffe0495b2a0 "tmpmap-8c213959", oflag=oflag@entry=194) at ../sysdeps/unix/sysv/linux/open64.c:30
#1  0x000055d5dc0d8e0f in make_temp_file (name=name@entry=0x7ffe0495b2a0 "tmpmap-8c213959") at .../wine/server/mapping.c:331
#2  0x000055d5dc0d9060 in create_temp_file (size=1208320) at .../wine/server/mapping.c:377
#3  0x000055d5dc0d93da in create_mapping (root=root@entry=0x55d5e24c8190, name=name@entry=0x55d5dc13f490 <session_str>, attr=attr@entry=16, size=1208320, flags=flags@entry=134217728, handle=handle@entry=0, file_access=<optimized out>, sd=<optimized out>) at .../wine/server/mapping.c:1091
#4  0x000055d5dc0da926 in create_session_mapping (root=root@entry=0x55d5e24c8190, name=name@entry=0x55d5dc13f490 <session_str>, attr=attr@entry=16, sd=sd@entry=0x0) at .../wine/server/mapping.c:1310
#5  0x000055d5dc0ce36c in init_directories (intl_fd=0x55d5e24c6e50) at .../wine/server/directory.c:499
#6  0x000055d5dc0d8a67 in main (argc=2, argv=0x7ffe0495b928) at .../wine/server/main.c:236
(rr) when
Completed event: 810

wineserver map shared:
(rr) bt
#0  __GI___mmap64 (addr=addr@entry=0x0, len=len@entry=1208320, prot=prot@entry=3, flags=flags@entry=1, fd=fd@entry=20, offset=offset@entry=0) at ../sysdeps/unix/sysv/linux/mmap64.c:50
#1  0x000055d5dc0da979 in set_session_mapping (mapping=mapping@entry=0x55d5e24cbdd0) at .../wine/server/mapping.c:1321
#2  0x000055d5dc0ce377 in init_directories (intl_fd=0x55d5e24c6e50) at .../wine/server/directory.c:500
#3  0x000055d5dc0d8a67 in main (argc=2, argv=0x7ffe0495b928) at .../wine/server/main.c:236
(rr) when
Completed event: 811

.exe process, map private:
(rr) bt
#0  __GI___mmap64 (addr=addr@entry=0x7ffffe8e0000, len=len@entry=1208320, prot=prot@entry=1, flags=flags@entry=17, fd=fd@entry=8, offset=offset@entry=0) at ../sysdeps/unix/sysv/linux/mmap64.c:50
#1  0x00007fdc46435e56 in map_file_into_view (view=view@entry=0x7fffffdb0818, fd=<optimized out>, start=start@entry=0, size=size@entry=1208320, offset=offset@entry=0, vprot=vprot@entry=134217761, removable=<optimized out>) at .../wine/dlls/ntdll/unix/virtual.c:2546
#2  0x00007fdc4643ab1d in virtual_map_section (handle=handle@entry=0x3c, addr_ptr=addr_ptr@entry=0x55555ab1d0d0, limit_low=limit_low@entry=0, limit_high=limit_high@entry=0, commit_size=commit_size@entry=0, offset_ptr=offset_ptr@entry=0x7ffffe0ffad0, size_ptr=<optimized out>, alloc_type=<optimized out>, protect=<optimized out>, machine=<optimized out>) at .../wine/dlls/ntdll/unix/virtual.c:3727
#3  0x00007fdc4643f3b3 in NtMapViewOfSection (handle=<optimized out>, process=process@entry=0xffffffffffffffff, addr_ptr=addr_ptr@entry=0x55555ab1d0d0, zero_bits=zero_bits@entry=0, commit_size=commit_size@entry=0, offset_ptr=offset_ptr@entry=0x7ffffe0ffad0, size_ptr=<optimized out>, inherit=<optimized out>, alloc_type=<optimized out>, protect=<optimized out>) at .../wine/dlls/ntdll/unix/virtual.c:6254
#4  0x00007fdc4593809a in map_shared_session_block (ret=0x7ffffe0ffb68, size=1047808, offset=0) at .../wine/dlls/win32u/winstation.c:129
#5  find_shared_session_block (offset=offset@entry=0, size=size@entry=1047808, ret=ret@entry=0x7ffffe0ffb68) at .../wine/dlls/win32u/winstation.c:165
#6  0x00007fdc45938395 in shared_session_init () at .../wine/dlls/win32u/winstation.c:197
#7  0x00007fdc45866252 in init_user () at .../wine/dlls/win32u/class.c:255
#8  0x00007fdc4652a057 in __pthread_once_slow (once_control=0x7fdc4598bd2c <init_once>, init_routine=0x7fdc4586622e <init_user>) at ./nptl/pthread_once.c:116
#9  0x00007fdc4652a135 in ___pthread_once (once_control=<optimized out>, init_routine=<optimized out>) at ./nptl/pthread_once.c:143
#10 0x00007fdc45864c51 in NtUserInitializeClientPfnArrays (client_procsA=<optimized out>, client_procsW=<optimized out>, client_workers=<optimized out>, user_module=<optimized out>) at .../wine/dlls/win32u/class.c:280
#11 0x00007fdc46415753 in __wine_syscall_dispatcher () from .../obj/dlls/ntdll/ntdll.so
(rr) when
Completed event: 173405

bernhardu avatar Jun 03 '25 18:06 bernhardu

Ok.

The commit message of https://gitlab.winehq.org/wine/wine/-/commit/07160a23fa2139c2dc9cad92158ddf0b64a3e75e says "So that PAGE_WRITECOPY can be set after the fact". That makes sense. But I would have expected that you can switch the mapping from shared to private by re-mapping the file object, although that's more complicated and probably a bit slower.

rocallahan avatar Jun 03 '25 20:06 rocallahan

I guess a possible workaround might be to have map_shared_session_block use mmap directly instead of NtMapViewOfSection?

rocallahan avatar Jun 03 '25 20:06 rocallahan

I guess a possible workaround might be to have map_shared_session_block use mmap directly instead of NtMapViewOfSection?

As a workaround the modification from my initial comment is working for me, which is more or less just for this single mapping at .exe side using map shared instead map private, like the compile switch would do for all non-linux systems.

bernhardu avatar Jun 03 '25 20:06 bernhardu

Do you want to try to get that upstream?

rocallahan avatar Jun 04 '25 08:06 rocallahan

Do you want to try to get that upstream?

I am not sure, I tried to bring it to Wine developers attention: https://bugs.winehq.org/show_bug.cgi?id=58332

Let's see where this gets us.

bernhardu avatar Jun 05 '25 17:06 bernhardu