Devil's Third graphical bugs and hangs
Current Behavior
Under both Vulkan and OpenGL there are various graphical bugs, along with hangs during cutscenes. With OpenGL, the worst of these hangs can last several minutes.
Vulkan:
OpenGL:
Hangs occur frequently during the airplane cutscene at the start of mission 2:
Be prepared for a very long stare-off with Stella if you choose to play this using OpenGL. :)
Vulkan also struggles with this scene, but not as much.
Expected Behavior
Smooth sailing.
Steps to Reproduce
Play until bugs occur.
System Info (Optional)
Device: Steam Deck
OS: SteamOS 3.7.8 Stable
Version: Cemu 2.6 x86_64 AppImage
Emulation Settings (Optional)
No response
Logs (Optional)
No response
Duplicate of #1317
I took a stab at this and have some workarounds. First, on the issue of the black character under Vulkan: I noticed the validation error "VK_SHADER_STAGE_FRAGMENT_BIT declared input at Location 0 Component 0 but it is not an Output declared in VK_SHADER_STAGE_VERTEX_BIT". Looking at the dumped shaders, some pixel shaders had an unmatched input named passParameterSem64. Through testing I found that special-casing a psSemanticId of 64 to LATTE_ANALYZER_IMPORT_INDEX_SPIPOSITION seemed to resolve the issue. A few psInputControl values in this game have the 7th bit (0x40) set, so I used that bit as the check. Mapping them to SPIPOSITION is a complete guess, but it keeps the shader from breaking. Also worth noting: many of the other psInputControl values in this game have the 8th bit (0x80) set, so I wonder whether that bit is a flag rather than part of the semanticId.
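To make the bit numbering concrete, here is a minimal sketch of how the fields of a psInputControl value are split in the patch. The helper names are hypothetical (the patch inlines these expressions), and treating bits 6 and 7 as separate flags is the guess described above, not confirmed register documentation.

```cpp
#include <cstdint>

// Hypothetical helpers mirroring the expressions used in LatteShader_UpdatePSInputs.
// The low 8 bits are currently treated as the semantic id by the existing code.
constexpr uint32_t psSemanticId(uint32_t psInputControl)
{
    return psInputControl & 0xFF;
}

// "7th bit" counting from one, i.e. bit 6 counting from zero (mask 0x40).
// This is the check the patch uses to force SPIPOSITION (an assumption, not a known flag).
constexpr bool hasSeventhBit(uint32_t psInputControl)
{
    return ((psInputControl >> 6) & 1) != 0;
}

// "8th bit" counting from one, i.e. bit 7 counting from zero (mask 0x80).
// Possibly a flag rather than part of the semantic id; unverified.
constexpr bool hasEighthBit(uint32_t psInputControl)
{
    return ((psInputControl >> 7) & 1) != 0;
}

// Bits 8-9, read as the default value by the existing code.
constexpr uint32_t psDefaultValue(uint32_t psInputControl)
{
    return (psInputControl >> 8) & 3;
}
```

A semantic id of 64 is exactly 0x40, which is why it trips the bit 6 check. The same check also matches any other value with bit 6 set (for example 0x41), so it is broader than a strict comparison against 64.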
Then for the issue of glitching textures: I think it's related to surface copy handling. The game triggers constant "HLECopySurface(): Source texture is not in list of dynamic textures" warnings, and discarding all surface copies seems to mitigate the glitches. However, the game also uses a surface copy when drawing dynamic shadows, so as a workaround I tried discarding every copy unless its dimensions were 1280x720. This seems likely to cause other, less obvious graphical errors, though.
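The filter itself boils down to a single predicate. A minimal sketch, with a hypothetical helper name (the patch inlines the condition in LatteSurfaceCopy_copySurfaceNew), and with 1280x720 hard-coded only because that is the size of the shadow copy seen in this game:

```cpp
#include <cstdint>

// Hypothetical helper: returns true only for the one copy size the workaround keeps.
// Every other surface copy is dropped, which is a game-specific hack, not a general fix.
constexpr bool shouldKeepSurfaceCopy(uint32_t srcWidth, uint32_t srcHeight)
{
    return srcWidth == 1280 && srcHeight == 720;
}
```

Because the test is on dimensions alone, any unrelated copy that happens to be 1280x720 is kept as well, and any shadow copy at a different size would be discarded.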
I reproduced the cutscene hang once while debugging. The only thing I noticed was a much higher rate of "itHLEBeginOcclusionQuery: Query 0x%08x is already active" warnings in the console. Those messages appear during almost every cutscene, but they were being spammed while the game hung for a couple of seconds.
Various crashes were also caused by writing over the GX2 command buffer and by hitting Vulkan descriptor limits. I just increased their sizes until it stopped crashing, so it's probably wrong in multiple ways. Writing over the GX2 command buffer happens pretty consistently when returning to the menu from a stage. Edit: I should also note that I can't remember whether writing over the command buffer ever caused anything other than a debug assert.
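For the command buffer part, the change amounts to raising the minimum request size from 0x100 to 512 U32s. A minimal sketch of that clamp, using hypothetical names (the patch uses a DEFAULT_COMMAND_BUF_SIZE macro and std::max directly):

```cpp
#include <cstdint>

// Hypothetical constant matching the value picked in the patch; found by trial and error.
constexpr uint32_t kMinCommandBufferU32s = 512;

// Never hand out a command buffer smaller than the minimum, whatever the caller asked for.
constexpr uint32_t clampCommandBufferU32s(uint32_t requestedU32s)
{
    return requestedU32s < kMinCommandBufferU32s ? kMinCommandBufferU32s : requestedU32s;
}
```

One caveat: the patch also substitutes the same constant for the 0x100 passed as the second argument of _weak_MEMAllocFromDefaultHeapEx. As far as I can tell that argument is an alignment rather than a size, so that particular substitution may not be doing what was intended.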
I played through the first 2 levels with these changes and noticed only one glaring issue with the shadows/lighting that I didn't see when cross-referencing playthroughs on YouTube. In the video below, the truck flickers between lit and dark, and the shadows of the container in the upper right flicker in and out, as do the shadows on the containers in the ship. The triangular shapes breaking up the shadow in front of the truck do seem to be a game bug, though.
https://github.com/user-attachments/assets/d21c4b99-4af3-4bc8-a37e-b7b4ffcbdae9
System Info:
OS: Linux
GPU: RX 6600 (RADV, Mesa 25.1.4)
Cemu: git-e91740cf29248bfbf2f059ac7e42159e8e7e9e9a
diff --git a/src/Cafe/HW/Latte/Core/LatteShader.cpp b/src/Cafe/HW/Latte/Core/LatteShader.cpp
index d9f0a5d..ee5adbc 100644
--- a/src/Cafe/HW/Latte/Core/LatteShader.cpp
+++ b/src/Cafe/HW/Latte/Core/LatteShader.cpp
@@ -257,6 +257,7 @@ void LatteShader_UpdatePSInputs(uint32* contextRegisters)
{
uint32 psInputControl = contextRegisters[mmSPI_PS_INPUT_CNTL_0 + f];
uint32 psSemanticId = (psInputControl & 0xFF);
+ uint32 insertSpiPos = (psInputControl>>6) & 1;
uint8 defaultValue = (psInputControl>>8)&3;
// default:
@@ -275,7 +276,7 @@ void LatteShader_UpdatePSInputs(uint32* contextRegisters)
key += (uint64)psInputControl;
key = std::rotl<uint64>(key, 7);
- if (spi0_positionEnable && f == spi0_positionAddr)
+ if ((spi0_positionEnable && f == spi0_positionAddr) || insertSpiPos)
{
_activePSImportTable.import[f].semanticId = LATTE_ANALYZER_IMPORT_INDEX_SPIPOSITION;
_activePSImportTable.import[f].isFlat = false;
diff --git a/src/Cafe/HW/Latte/Core/LatteSurfaceCopy.cpp b/src/Cafe/HW/Latte/Core/LatteSurfaceCopy.cpp
index 45be684..00b2ef1 100644
--- a/src/Cafe/HW/Latte/Core/LatteSurfaceCopy.cpp
+++ b/src/Cafe/HW/Latte/Core/LatteSurfaceCopy.cpp
@@ -21,6 +21,8 @@ void LatteSurfaceCopy_copySurfaceNew(MPTR srcPhysAddr, MPTR srcMipAddr, uint32 s
debug_printf("HLECopySurface(): Source texture is not in list of dynamic textures\n");
return;
}
+ else if (!(srcWidth == 1280 && srcHeight == 720))
+ return;
sourceTexture = sourceView->baseTexture;
if (sourceTexture->reloadFromDynamicTextures)
{
diff --git a/src/Cafe/HW/Latte/Renderer/Vulkan/VulkanRenderer.cpp b/src/Cafe/HW/Latte/Renderer/Vulkan/VulkanRenderer.cpp
index aed0db2..c3bd840 100644
--- a/src/Cafe/HW/Latte/Renderer/Vulkan/VulkanRenderer.cpp
+++ b/src/Cafe/HW/Latte/Renderer/Vulkan/VulkanRenderer.cpp
@@ -3083,19 +3083,19 @@ void VulkanRenderer::CreateDescriptorPool()
{
std::array<VkDescriptorPoolSize, 4> poolSizes = {};
poolSizes[0].type = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
- poolSizes[0].descriptorCount = 1024 * 128;
+ poolSizes[0].descriptorCount = 1024 * 512;
poolSizes[1].type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
- poolSizes[1].descriptorCount = 1024 * 1;
+ poolSizes[1].descriptorCount = 1024 * 4;
poolSizes[2].type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC;
- poolSizes[2].descriptorCount = 1024 * 128;
+ poolSizes[2].descriptorCount = 1024 * 512;
poolSizes[3].type = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
- poolSizes[3].descriptorCount = 1024 * 4;
+ poolSizes[3].descriptorCount = 1024 * 16;
VkDescriptorPoolCreateInfo poolInfo = {};
poolInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
poolInfo.poolSizeCount = poolSizes.size();
poolInfo.pPoolSizes = poolSizes.data();
- poolInfo.maxSets = 1024 * 256;
+ poolInfo.maxSets = 1024 * 1024;
poolInfo.flags = VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT;
if (vkCreateDescriptorPool(m_logicalDevice, &poolInfo, nullptr, &m_descriptorPool) != VK_SUCCESS)
diff --git a/src/Cafe/OS/libs/gx2/GX2_Command.cpp b/src/Cafe/OS/libs/gx2/GX2_Command.cpp
index d12bf21..551a1e4 100644
--- a/src/Cafe/OS/libs/gx2/GX2_Command.cpp
+++ b/src/Cafe/OS/libs/gx2/GX2_Command.cpp
@@ -17,6 +17,8 @@ namespace GX2
GX2PerCoreCBState s_perCoreCBState[Espresso::CORE_COUNT];
}
+#define DEFAULT_COMMAND_BUF_SIZE 512
+
void gx2WriteGather_submitU32AsBE(uint32 v)
{
uint32 coreIndex = PPCInterpreter_getCoreIndex(PPCInterpreter_getCurrentInstance());
@@ -79,7 +81,7 @@ namespace GX2
}
else
{
- s_commandState->commandPoolBase = (uint32be*)coreinit::_weak_MEMAllocFromDefaultHeapEx(poolSize, 0x100);
+ s_commandState->commandPoolBase = (uint32be*)coreinit::_weak_MEMAllocFromDefaultHeapEx(poolSize, DEFAULT_COMMAND_BUF_SIZE);
s_cbBufferIsInternallyAllocated = true;
}
if (!s_commandState->commandPoolBase)
@@ -96,7 +98,7 @@ namespace GX2
s_perCoreCBState[i].currentWritePtr = nullptr;
}
// start first command buffer for main core
- GX2Command_StartNewCommandBuffer(0x100);
+ GX2Command_StartNewCommandBuffer(DEFAULT_COMMAND_BUF_SIZE);
}
void GX2Shutdown_commandBufferPool()
@@ -151,7 +153,7 @@ namespace GX2
uint32 coreIndex = coreinit::OSGetCoreId();
auto& coreCBState = s_perCoreCBState[coreIndex];
- numU32s = std::max<uint32>(numU32s, 0x100);
+ numU32s = std::max<uint32>(numU32s, DEFAULT_COMMAND_BUF_SIZE);
// grab space from command buffer pool and if necessary wait for it
uint32be* bufferPtr = nullptr;
uint32 bufferSizeInU32s = 0;
@@ -299,7 +301,7 @@ namespace GX2
void GX2Flush()
{
- GX2Command_Flush(256, true);
+ GX2Command_Flush(DEFAULT_COMMAND_BUF_SIZE, true);
}
uint64 GX2GetLastSubmittedTimeStamp()
@@ -442,7 +444,7 @@ namespace GX2
if (!s_perCoreCBState[coreIndex].isDisplayList)
{
// make sure any preceeding commands are submitted first
- GX2Command_Flush(0x100, false);
+ GX2Command_Flush(DEFAULT_COMMAND_BUF_SIZE, false);
}
GX2Command_SubmitCommandBuffer(static_cast<uint32be*>(addr), size / 4, nullptr, false);
}
Edit: Fixed != 1280x720 logic.
I noticed another issue. With OpenGL, it looks like the shadow positions are inverted. This happens with or without the changes I posted. They are positioned correctly under Vulkan.
https://github.com/user-attachments/assets/764a9e1f-b767-485c-ac81-737c2dd43a5a