Serious Sam: glitching during the name entry screen (due to race condition)

Open abaire opened this issue 2 years ago • 8 comments

Title

https://xemu.app/titles/54540004/#Serious-Sam

Bug Description

The player name screen when starting a new single player game experiences unpredictable glitching.

E.g., Screenshot_20220318_212258

The screen flashes between glitches and correct display and the glitches do not appear to be consistent (in timing or in content).

The glitching seems to be significantly reduced in frequency after the name entry but still occurs on the difficulty selection screen.

After the game sits for a while on the loading screen, the glitching happens with increasing frequency and eventually occurs extremely often:

![Screenshot_20220318_212934](https://user-images.githubusercontent.com/448413/159106793-077ffe5d-1706-425d-844c-cc0d466e4729.png)

Expected Behavior

The player name screen should have no glitches.

xemu Version

Version: 0.6.2-82-g2ff5f23235
Branch: master
Commit: 2ff5f23 (https://github.com/mborgerson/xemu/commit/2ff5f23235387a296fc9a5f11943b4dd29bd2837)
Date: Sat Mar 19 03:47:53 AM UTC 2022

System Information

"cpu": "Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz",
"gl_renderer": "NVIDIA GeForce GTX 1070/PCIe/SSE2",
"gl_shading_language_version": "4.00 NVIDIA via Cg compiler",
"gl_vendor": "NVIDIA Corporation",
"gl_version": "4.0.0 NVIDIA 470.103.01",
"os_platform": "Linux",
"os_version": "Ubuntu 21.10",

Additional Context

No response

abaire avatar Mar 19 '22 04:03 abaire

This may be fixed by one (or a combination) of my unmerged PRs. I can still reproduce with a master build of 0.7.3 but my work branch (which has all of my PRs applied) does not seem to suffer this problem anymore.

abaire avatar May 10 '22 16:05 abaire

This still appears to happen in 0.7.15.

abaire avatar May 19 '22 16:05 abaire

The glitching seems to consistently get worse the longer the game is left running. I let it run on the main menu for ~30 minutes, then got into the loading screen, and it glitched more or less non-stop.

I was able to get a few renderdoc captures that exhibit the issue. One thing of interest is that the game does not appear to ever fully clear the surface, so it is possible that this is just a case of memory not being flagged as dirty correctly, leading to compounding errors over time.

UPDATE: It looks like the very first draw is intended to put the background into place, but it ends up mixing it against the diffuse color which is completely black + partially transparent. The game uses the fixed function pipeline and is not passing in any value for the diffuse color.

UPDATE: According to the pgraph, it should be passing diffuse values:

nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C84000)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__DIFFUSE<0x172C> (0x3CA8920)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__TEX0<0x1744> (0x3CC3FF8)

so it is not clear why these are not showing up in renderdoc. Renderdoc itself shows that the attribute is enabled (along with v0 and v9) but the values are unused:

3 Enabled Attribute 3 B8G8R8A8_UNORM 3 0

It looks like the color target is partially cleared at the end of each frame and the content is allowed to carry over into the subsequent frame, though it also looks like the color surface alternates across frames (between 0x3E30000 and 0x3EC6000):

nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_CLEAR_RECT_HORIZONTAL<0x1D98> (0x2490037 {Min:55, Max:585})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_CLEAR_RECT_VERTICAL<0x1D9C> (0x1900172 {Min:370, Max:400})
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_ZSTENCIL_CLEAR_VALUE<0x1D8C> (0x0)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_COLOR_CLEAR_VALUE<0x1D90> (0x300)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_CLEAR_SURFACE<0x1D94> (0xF0)

EDIT: I think this is how they're drawing the progress bar, given the location on screen.
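
As a quick sanity check of the clear rectangle above, here is a minimal Python sketch that unpacks the two parameters. It assumes the minimum sits in the low half-word and the maximum in the high half-word, which is consistent with the decoded values the log prints but is not taken from the xemu source. The helper name `decode_clear_rect` is hypothetical.

```python
def decode_clear_rect(value: int) -> tuple[int, int]:
    """Split a packed clear-rect parameter into (min, max).

    Assumption: min is in the low 16 bits and max in the high 16 bits,
    matching the {Min:..., Max:...} annotations in the pgraph log.
    """
    minimum = value & 0xFFFF
    maximum = (value >> 16) & 0xFFFF
    return minimum, maximum


x_min, x_max = decode_clear_rect(0x2490037)  # NV097_SET_CLEAR_RECT_HORIZONTAL
y_min, y_max = decode_clear_rect(0x1900172)  # NV097_SET_CLEAR_RECT_VERTICAL

# Extent of the cleared region in pixels (max minus min on each axis).
print(f"x: {x_min}..{x_max} (span {x_max - x_min})")
print(f"y: {y_min}..{y_max} (span {y_max - y_min})")
```

The result is a wide, short strip (a span of 530 by 30) in the lower part of the frame, which fits the progress bar guess.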

Comparing a glitched draw call to a non-glitched draw, the input vertices for some of the elements seem wildly different/wrong. For example, in one glitched draw the down arrow on the loading screen is drawn in the wrong place with the wrong size; the vertex data given to the shader is already completely different from the input in a comparable pass.

Normal: normal

Glitched: corrupted

abaire avatar May 19 '22 19:05 abaire

Looking back in the renderdoc trace for a non-glitched draw, I see the exact same vertices being passed for an entirely different draw. Spot checking a few other obviously incorrect renders, I find the same situation: I can always find a draw pass in the non-glitched capture that takes the same set of vertices.

I suspect the issue here is a GL buffer; it looks like the game is using the "Inline elements" path.

Specifically, I suspect the LRU caching when committing the draw may be causing stale indices to be reused, since the incorrect values always match previous values and it appears that the same GL_ELEMENT_ARRAY_BUFFER is used between the incorrect pairings.

UPDATE: It looks like both the older and reused element array buffers reference indices 0-3, so it may be a problem with the actual data in the array buffer. Looking at the pgraph log, I see that the draws seem to always reference new vram addresses for their data arrays. E.g., looking at the first draw (whose data is erroneously reused):

nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C90FC0)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__DIFFUSE<0x172C> (0x3CACE60)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__TEX0<0x1744> (0x3CCCA78)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_BEGIN_END<0x17FC> (NV097_SET_BEGIN_END_OP_TRIANGLES<0x5>)
frame_draw 10   
--
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C87420)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__DIFFUSE<0x172C> (0x3CA9A80)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__TEX0<0x1744> (0x3CC62B8)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_BEGIN_END<0x17FC> (NV097_SET_BEGIN_END_OP_TRIANGLES<0x5>)
frame_draw 10   
--
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C8B2F0)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__DIFFUSE<0x172C> (0x3CAAF70)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__TEX0<0x1744> (0x3CC8C98)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_BEGIN_END<0x17FC> (NV097_SET_BEGIN_END_OP_TRIANGLES<0x5>)
frame_draw 10   
--
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C8F1C0)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__DIFFUSE<0x172C> (0x3CAC460)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__TEX0<0x1744> (0x3CCB678)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_BEGIN_END<0x17FC> (NV097_SET_BEGIN_END_OP_TRIANGLES<0x5>)
frame_draw 10   
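
For anyone who wants to check a longer log for address reuse, here is a rough Python sketch that groups the vertex array offsets by draw. It assumes the log format shown above and treats each `NV097_SET_BEGIN_END` line as the end of a draw's setup; the function names and the two-draw sample are only for illustration.

```python
import re
from collections import Counter

# Matches e.g. "... NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C90FC0)"
OFFSET_RE = re.compile(
    r"NV097_SET_VERTEX_DATA_ARRAY_OFFSET__(\w+)<0x[0-9A-Fa-f]+> \((0x[0-9A-Fa-f]+)\)"
)


def offsets_per_draw(log_text):
    """Return one {attribute: address} dict per draw found in the log."""
    draws, current = [], {}
    for line in log_text.splitlines():
        match = OFFSET_RE.search(line)
        if match:
            current[match.group(1)] = int(match.group(2), 16)
        elif "NV097_SET_BEGIN_END<" in line and current:
            # Assumption: a BEGIN_END line closes the setup of one draw.
            draws.append(current)
            current = {}
    return draws


def reused_addresses(draws):
    """Return the (attribute, address) pairs that occur in more than one draw."""
    counts = Counter((attr, addr) for d in draws for attr, addr in d.items())
    return {key: n for key, n in counts.items() if n > 1}


SAMPLE = """\
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C90FC0)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__DIFFUSE<0x172C> (0x3CACE60)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__TEX0<0x1744> (0x3CCCA78)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_BEGIN_END<0x17FC> (NV097_SET_BEGIN_END_OP_TRIANGLES<0x5>)
frame_draw 10
--
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__POS<0x1720> (0x3C87420)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__DIFFUSE<0x172C> (0x3CA9A80)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_VERTEX_DATA_ARRAY_OFFSET__TEX0<0x1744> (0x3CC62B8)
nv2a_pgraph_method 0: NV20_KELVIN_PRIMITIVE<0x97> -> NV097_SET_BEGIN_END<0x17FC> (NV097_SET_BEGIN_END_OP_TRIANGLES<0x5>)
frame_draw 10
"""

draws = offsets_per_draw(SAMPLE)
print(len(draws), "draws;", "reused addresses:", reused_addresses(draws))
```

An empty result from `reused_addresses` matches the observation that each draw points at fresh vram addresses, which would mean the stale data comes from somewhere other than the guest reusing a buffer.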

UPDATE: Disabling the dirty check in pgraph_update_memory_buffer appears to prevent the glitching, so I suspect the issue is that there's some path that is failing to mark the memory region as dirty. (Removing the check presumably has a significant negative impact on performance and isn't a viable fix on its own.)

Keeping the check but ignoring the value also seems to prevent the glitches (this rules out the check in update_memory_buffer causing issues with update_surface_part, which also checks the DIRTY_MEMORY_NV2A log).
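
To illustrate the suspected failure mode, here is a toy Python model of a host-side copy guarded by a dirty flag. This is not xemu's implementation: the class, its methods, and the `check_dirty` switch are all hypothetical and only mimic the behavior described above, where a write that never gets flagged leaves the cached copy stale.

```python
class CachedBuffer:
    """Toy model: host copy of guest memory, refreshed only when flagged dirty."""

    def __init__(self, guest_memory: bytearray):
        self.guest = guest_memory
        self.dirty = True      # force an upload on first use
        self.host_copy = None

    def write_tracked(self, data: bytes) -> None:
        """Guest write that correctly marks the region dirty."""
        self.guest[:] = data
        self.dirty = True

    def write_untracked(self, data: bytes) -> None:
        """Guest write whose dirty notification is lost (the suspected bug)."""
        self.guest[:] = data

    def update(self, check_dirty: bool = True) -> bytes:
        """Refresh the host copy if dirty, or always when check_dirty is False."""
        if self.dirty or not check_dirty:
            self.host_copy = bytes(self.guest)
            self.dirty = False
        return self.host_copy


buf = CachedBuffer(bytearray(b"draw-1"))
first = buf.update()                     # initial upload
buf.write_untracked(b"draw-2")           # new vertex data, flag never set
stale = buf.update()                     # still returns the old draw's data
fresh = buf.update(check_dirty=False)    # ignoring the flag picks up the write
print(first, stale, fresh)
```

In this model the stale result is always data from an earlier draw, which lines up with the glitched draws reusing vertices from a previous pass. Ignoring the flag fixes the output at the cost of re-uploading on every call.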

abaire avatar May 20 '22 20:05 abaire

Note: @mborgerson mentioned that there is a race condition with a WIP fix.

abaire avatar May 21 '22 02:05 abaire

Hey mate, I'm really interested in playing this game on xemu, so I want to ask: have you had any luck with your research or a fix? If not, how could I contribute to a faster resolution?

esau817 avatar Jun 07 '23 22:06 esau817

I stopped looking into this when @mborgerson mentioned that there was a known race condition (which would neatly explain the observed behavior). If you'd like to contribute, your best bet is to jump on the Discord channel and see if you can get some pointers there towards the relevant parts of pgraph.c and related files.

abaire avatar Jun 10 '23 05:06 abaire

This also happens when in-game.

Triticum0 avatar Oct 22 '23 16:10 Triticum0