GS: Refactor pre-calculation of texture coordinates.
Description of Changes
Refactor the calculations of texture coordinates in GS before rendering (e.g., like in the vertex trace).
Rationale behind Changes
Functionality should be the same but there is some duplicated code that could be removed. Hopefully, the code is more readable also. Some parts are hopefully more accurate (though the difference may not be always visible) and/or more efficient.
Suggested Testing Steps
So far I've run this on GS dumps and done comparisons with master. Needs more testing and on actual games.
Did you use AI to help find, test, or implement this issue or feature?
Yes, I used GitHub Copilot.
GS dump run results (using gspack-10-july-2024)
Fixes in SW mode (PR left, master right)
Car appears to be textured better. Dump: "Destruction_Derby_Arenas_SLUS-20855_20240120182644.gs.xz"
Character is chopped in master. Dump: "EternalPoisonSubs.gx.xz"
Menu not fully displayed in master. Dump: "Final_Fantasy_XI_Disc_2_of_2_SLPS-20200_20240228004030.gs.xz"
Menu not fully dusplayed in master: Dump: "Final_Fantasy_XI_Disc_2_of_2_SCUS-97269_20230113152502.gs.xz"
Fixes in HW/GL mode (PR left, master right)
Cloud missing in master (actually not 100% sure about this...). Dump: "Ace Combat 5 clouds missing in HW.gs.xz"
Strange texture in master. Dump: "Bge.gs.xz"
This breaks software rendering. Check my dump: Tekken 5_SLUS-21059_20250608143444.gs.zip
It breaks Jak 3 on hardware renderers. Jak3.zip
Breaks the nametags on Samurai Warriors 2 (software)
The bottom half of Hitman 2's screen is black (SW)
@Slayer0fRA Thanks for the testing. I fixed a bug that seems to have fixed Samurai and Jak3. The rest I'll try to retest soon.
@refractionpcsx2 Thanks for the reviews. Fixed the easy ones. About the boundary check thing, I don't understand why the min coordinate being 0 should be treated differently (though I don't argue that in practice it may fix games). According to the GS manual "Operations such as repetition and clamping are performed to iu0, iv0, iu1, and iv1 depending on the wrapping mode." So in theory it should be more accurate to consider the whole texture used when linear interpolation and repeat mode is used and some iu0 or iv0 is -1 (i.e., [iu0, iv0, iu1, iv1] = st.floor() in the above code).
@Slayer0fRA Thanks for the testing. I fixed a bug that seems to have fixed Samurai and Jak3. The rest I'll try to retest soon.
@refractionpcsx2 Thanks for the reviews. Fixed the easy ones. About the boundary check thing, I don't understand why the min coordinate being 0 should be treated differently (though I don't argue that in practice it may fix games). According to the GS manual "Operations such as repetition and clamping are performed to iu0, iv0, iu1, and iv1 depending on the wrapping mode." So in theory it should be more accurate to consider the whole texture used when linear interpolation and repeat mode is used and some iu0 or iv0 is -1 (i.e., [iu0, iv0, iu1, iv1] = st.floor() in the above code).
The biggest problem is for hardware mode texture lookups.
say for example they are copying the framebuffer with bilinear filtering enabled and the framebuffer is 640x480.
Now the hardware renderer will know that that it has a 640x480 (buffer width 10) texture, but to access this, the TEX0 setting will have to be TW 10 and TH 9, aka 1024x512 with a TBW of 10. If the min/max doesn't use boundary, this will return 640x480 (maybe 641x481, but close enough), but if it wraps, it will demand the entire size of the texture, and now it wants 1024x512, so when it comes to the texture lookup, this is going to fail, because the HW renderer doesn't know about any texture this size, or it will resize it to be 1024x512 and pull garbage from GS memory to cover it, then the end address will be wrong and it could get the wrong target when it tries to get something which was originally butted up against it for either drawing or reading as a texture, and this makes a huge mess of things.
sorry for the wall of text, but I hope that explains the situation a little better.
@refractionpcsx2 Thanks for the expanation! Yes, I think I roughly understand the situation now. I need to do some more testing with the HW renderer.
The GS dumps linked above by Slayer0fRA now seems to be fine on both SW and HW (OpenGL). There are some small differences in the way textures are loaded by this PR (often less texture is loaded in the edges compared to master because this PR has slightly more accurate calculation for sprites and right triangles), but none of this has an impact on the RTs.
Right now the commit history is a mess and needs significant rebasing. However, this should be in a state that can be used for testing.
@refractionpcsx2 I have looked through the code and I can't see where the ‘uses_boundary’ field in the return value of GetTextureMinMax() is used for determining the texture bounds to be loaded. In the case where iu0 or iv0 = -1, I think the full 2^TW or 2^TH of the texture will be loaded if the wrapping mode is ‘repeat’ (or ‘region repeat’ with certain values for ‘msk’ and ‘fix’).
For reference, the lines where 'uses_boundary' are used in GSRendererHW.cpp (around line 3775):
if (!is_shuffle && m_cached_ctx.CLAMP.WMS == CLAMP_REPEAT && (tmm.uses_boundary & TextureMinMaxResult::USES_BOUNDARY_U) && unscaled_size.x != tw)
{
// Our shader-emulated region repeat doesn't upscale :(
// Try to avoid it if possible
// TODO: Upscale-supporting shader-emulated region repeat
if (unscaled_size.x < tw && m_vt.m_min.t.x > -(tw - unscaled_size.x) && m_vt.m_max.t.x < tw)
{
// Game only extends into data we don't have (but doesn't wrap around back onto good data), clamp seems like the most reasonable solution
m_cached_ctx.CLAMP.WMS = CLAMP_CLAMP;
}
else
{
m_cached_ctx.CLAMP.WMS = CLAMP_REGION_REPEAT;
m_cached_ctx.CLAMP.MINU = (1 << m_cached_ctx.TEX0.TW) - 1;
m_cached_ctx.CLAMP.MAXU = 0;
}
}
if (!is_shuffle && m_cached_ctx.CLAMP.WMT == CLAMP_REPEAT && (tmm.uses_boundary & TextureMinMaxResult::USES_BOUNDARY_V) && unscaled_size.y != th)
{
if (unscaled_size.y < th && m_vt.m_min.t.y > -(th - unscaled_size.y) && m_vt.m_max.t.y < th)
{
m_cached_ctx.CLAMP.WMT = CLAMP_CLAMP;
}
else
{
m_cached_ctx.CLAMP.WMT = CLAMP_REGION_REPEAT;
m_cached_ctx.CLAMP.MINV = (1 << m_cached_ctx.TEX0.TH) - 1;
m_cached_ctx.CLAMP.MAXV = 0;
}
}
I still need to do a full dump run in HW mode though .
Edit: My bad. I had different bilinear filtering settings on both PCSX2'S. Hehe.
I've always wondered why the software mode looked strangely pixelated. People would always gaslight me with the "That's how it looked like on the PS2." This PR made HW mode indistinguishable from SW. Images are now rendered with silky softness on SW. Just look at this, it's so beautiful!!
PR:
Master:
Dump:
Castlevania - Curse of Darkness_SLUS-21168_20250610232843.gs.zip
Keep in mind that Software Renderer will be most accurate and if you would display it on a CRT and not a progressive monitor then it should be pretty close to how the physical PS2 would display it.
@Slayer0fRA Thanks for testing nonetheless, I was able to find some bugs because of your earlier GS dumps.
@refractionpcsx2 You were right about the wrapping of 0 -> -1 with linear interpolation which was breaking some textures in HW mode. I put the fix in GSState::GetTextureMinMaxAxisAligned(). Do you think its safe to assume that this issue affects only sprites and triangles that are axis-aligned?
Did a full dump run and updated OP some of the results. After fixing some bugs, nothing seems to be "obviously" broken though any more help testing is very much appreciated.
Yes it generally affects anything that reads from the base of the texture, usually with bilinear filtering enabled.
Added optimized right triangle check. Thanks @TellowKrinkle for the original code of GSUtil::AreTrianglesRight/IsTriangleRight.
can this be rebased :)
can this be rebased :)
Yep, all done.
Will have to redo dump runs because there were significant merge conflicts with the rebase.
Converting to a draft for now. This PR should probably be split into two since the vertex trace modifications are independent of the texture min/max calucaltions and affect the core functionality less.

