Raspberry Pi 4 graphics issues
My RPi4 arrived today, 9 months after ordering, so obviously I needed to know whether it can compile and run OpenGothic :rocket:
OpenGothic compiles, starts, loads the main menu, starts a new game, shows the intro, but then it only draws the UI and nothing else. If I wait long enough, I can see Xardas's chat appearing, so the game does move, just slowly:
I've checked that the RPi4 currently exposes Vulkan 1.2.246. The outcome does not change even when I disable some of the graphics settings via -rt 0 -ms 0 and in the extended configuration. Is this simply because Vulkan 1.3 is not supported by the GPU and the stuff cannot be rendered, or can it be caused by something else such as low VRAM?
Lowering the system resolution (= also the OpenGothic render resolution) helps with the framerate as far as I could observe in the main menu, but the in-game outcome is the same.
Here's the game log when testing on a95b7e9b:
OpenGothic v1.0 dev
no *.ini file in path - using default settings
GPU = V3D 4.2
Depth format = Depth32F Shadow format = Depth16
[ALSOFT] (EE) available update failed: Broken pipe
[phoenix] world: parsing object [MeshAndBsp % 0 0]
[phoenix] bsp_tree: parsing chunk c000
[phoenix] bsp_tree: parsing chunk c010
[phoenix] bsp_tree: parsing chunk c040
[phoenix] bsp_tree: parsing chunk c045
[phoenix] bsp_tree: parsing chunk c050
[phoenix] bsp_tree: parsing chunk c0ff
[phoenix] mesh: 1 bytes remaining in section b020
[phoenix] world: parsing object [VobTree % 0 0]
[phoenix] world: parsing object [WayNet % 0 0]
[phoenix] world: parsing object [EndMarker % 0 0]
invalid particle system: "INVISIBLE_VOBBOX.3DS"
unable to load sound fx: ENV_NIGHT_TONSOFINSECTS
unable to load sound fx: OW_BIRD11
unable to load mesh "_1HST1.MDS"
unable to load mesh "_2HST1.MDS"
unable to load mesh "_BOWT1.MDS"
unable to load mesh "_CBOWT1.MDS"
[phoenix] model_script: unexpected value for event_tag_type: " "
[phoenix] model_script: 4 bytes remaining in section f5a3
unable to play video: "Addon_Title.BIK"
[phoenix] vm: accessing member "C_NPC.AIVAR" without an instance set
[phoenix] vm: accessing member "C_NPC.AIVAR" without an instance set
[phoenix] vm: accessing member "C_NPC.AIVAR" without an instance set
[ALSOFT] (EE) available update failed: Broken pipe
Hi @Nindaleth, interesting report :)
disable some of the graphic settings via -rt 0 -ms 0
You don't need to do this part; if RT/Mesh is not supported, it won't be enabled anyway.
supported by the GPU and the stuff cannot be rendered, or can it be caused by something else such as low VRAM
The baseline is still Vulkan 1.0, and the engine will fail fast with an exception if it runs out of memory. It could be the Raspberry Pi driver; AFAIK it's incomplete.
For debugging I would suggest to look into Renderer::dbgDraw to display intermediate attachments. Interesting to see would be: sceneLinear, shadowMap[].
gbufDiffuse and gbufNormal are also useful, but water rendering will overwrite them, so drawGWater needs to be commented out and the brush set to Painter::NoBlend.
The baseline is still Vulkan 1.0, and the engine will fail fast with an exception if it runs out of memory. It could be the Raspberry Pi driver; AFAIK it's incomplete.
While I agree that the RPi Vulkan Mesa driver may not be as thoroughly maintained as those of AMD and Intel, the V3DV driver is still officially Vulkan 1.2 conformant: https://www.raspberrypi.com/news/vulkan-update-version-1-2-conformance-for-raspberry-pi-4/
For debugging I would suggest to look into
Thanks, I'll see what I can do.
BTW, I can only run OpenGothic on RPi4 once per boot. All following attempts crash with something like Tempest::Swapchain::reset() in the backtrace (I think; I don't have the backtrace available at the moment). Is this definitely a driver issue, or could it be a platform-dependent problem in Tempest?
BTW, I can only run OpenGothic on RPi4 once per boot. All following attempts crash .... platform-dependent problem in Tempest?
Unlikely: there is nothing an application can do to affect the next run in such a way. Vulkan is a user-space API after all.
I'm trying to get the proper debug info format on my desktop first, before I go the much slower way on RPi4.
Is the following approach correct?
I've done the following patch:
diff --git a/game/graphics/renderer.cpp b/game/graphics/renderer.cpp
index c127b453..9b9d4747 100644
--- a/game/graphics/renderer.cpp
+++ b/game/graphics/renderer.cpp
@@ -352,20 +352,23 @@ void Renderer::draw(Encoder<CommandBuffer>& cmd, uint8_t cmdId, size_t imgId,
}
void Renderer::dbgDraw(Tempest::Painter& p) {
- static bool dbg = false;
+ static bool dbg = true;
if(!dbg)
return;
std::vector<const Texture2d*> tex;
- tex.push_back(&textureCast(hiz.hiZ));
+ //tex.push_back(&textureCast(hiz.hiZ));
//tex.push_back(&textureCast(hiz.smProj));
//tex.push_back(&textureCast(hiz.hiZSm1));
- //tex.push_back(&textureCast(shadowMap[1]));
- //tex.push_back(&textureCast(shadowMap[0]));
+ tex.push_back(&textureCast(sceneLinear));
+ tex.push_back(&textureCast(shadowMap[1]));
+ tex.push_back(&textureCast(shadowMap[0]));
+ tex.push_back(&textureCast(gbufDiffuse));
+ tex.push_back(&textureCast(gbufNormal));
int left = 10;
for(auto& t:tex) {
- p.setBrush(Brush(*t,Painter::Alpha,ClampMode::ClampToBorder));
+ p.setBrush(Brush(*t,Painter::NoBlend,ClampMode::ClampToBorder));
auto sz = Size(p.brush().w(),p.brush().h());
if(sz.isEmpty())
continue;
@@ -429,7 +432,7 @@ void Renderer::draw(Tempest::Attachment& result, Tempest::Encoder<CommandBuffer>
stashSceneAux(cmd,fId);
- drawGWater(cmd,fId,*wview);
+ // drawGWater(cmd,fId,*wview);
cmd.setFramebuffer({{sceneLinear, Tempest::Preserve, Tempest::Preserve}}, {zbuffer, Tempest::Preserve, Tempest::Preserve});
cmd.setDebugMarker("Sun&Moon");
Which results in this debugging output:
Yep, looks correct to me
OK, this is how it looks on RPi4 - the dialog boxes show up normally so I was able to finish talking to Xardas and save.
It's in 1440x900 so that I can get at least that 1 FPS while keeping most of the screen real estate.
OK, so GBuffer and shadows are fine, but sceneLinear is not. In principle, commenting out stuff in the rendering code should be the next step:
...
cmd.setFramebuffer({{sceneLinear, Tempest::Discard, Tempest::Preserve}}, {zbuffer, Tempest::Readonly});
drawShadowResolve(cmd,fId,*wview); // Xardas' tower is indoors, so this will be black
drawAmbient(cmd,*wview); // Ambient light to see something at least
... can cut stuff from here
drawLights(cmd,fId,*wview);
drawSky(cmd,fId,*wview);
stashSceneAux(cmd,fId);
drawGWater(cmd,fId,*wview);
cmd.setFramebuffer({{sceneLinear, Tempest::Preserve, Tempest::Preserve}}, {zbuffer, Tempest::Preserve, Tempest::Preserve});
cmd.setDebugMarker("Sun&Moon");
wview->drawSunMoon(cmd,fId);
cmd.setDebugMarker("Translucent");
wview->drawTranslucent(cmd,fId);
cmd.setFramebuffer({{sceneLinear, Tempest::Preserve, Tempest::Preserve}});
drawReflections(cmd,fId);
if(camera->isInWater()) {
cmd.setDebugMarker("Underwater");
drawUnderwater(cmd,fId);
} else {
cmd.setDebugMarker("Fog");
wview->drawFog(cmd,fId);
}
------- this part must be kept, to write at least something into result
cmd.setFramebuffer({{result, Tempest::Discard, Tempest::Preserve}});
cmd.setDebugMarker("Tonemapping");
drawTonemapping(cmd);
Also, in general: don't forget to disable SSAO/Radial fog and such to keep things simpler
don't forget to disable SSAO/Radial fog
the 1 FPS is with all this already disabled :)
I've applied your suggested patch and then tried enabling the drawing commands one by one. In the end, the only thing that causes issues is the wview->drawFog(cmd,fId); call. With that one commented out, the game renders properly. EDIT: Tested with d47987cd63d099670503e72be895e2536ddf0bf2.
That neatly narrows it down :)
Thanks for investigating!
Can you please check c349d30? My assumption is that they do not support storage images for 16-bit and 11-bit formats.
Unfortunately these conditions never match - I've verified with a log that both of these tested storage formats are present within the device properties.
Hm, Vulkan DB says that they are not supported... http://www.vulkan.gpuinfo.org/displayreport.php?id=18029#formats
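For reference, a capability check of this kind boils down to testing one bit in the per-format feature flags. The sketch below is not the code from c349d30. It mirrors the value of `VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT` (0x2) as a local constant so that it compiles without the Vulkan headers; `FormatInfo` and `pickStorageFormat` are hypothetical names. In a real program the mask would be the `optimalTilingFeatures` field filled in by `vkGetPhysicalDeviceFormatProperties`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Mirrors VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT from the Vulkan headers.
constexpr uint32_t kStorageImageBit = 0x00000002;

// Hypothetical stand-in for one entry of the device's format table.
struct FormatInfo {
  std::string name;                   // illustrative label, e.g. "RGBA8"
  uint32_t    optimalTilingFeatures;  // feature mask as reported by the driver
};

// Returns the first candidate usable as a storage image, or "" if none is.
// Candidates are expected to be ordered from most to least preferred.
std::string pickStorageFormat(const std::vector<FormatInfo>& candidates) {
  for(const auto& f : candidates) {
    assert(!f.name.empty()); // the empty name is reserved for "nothing found"
    if((f.optimalTilingFeatures & kStorageImageBit) != 0)
      return f.name;
  }
  return "";
}
```

Logging the raw mask for each candidate next to the chosen format would make it easy to compare what the installed driver reports with the gpuinfo entry.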
In Low-Q fog there is not a lot going on:
// fog.frag
vec4 fog(vec2 uv, float z) {
float dMin = 0;
float dMax = 0.9999;
float dZ = linearDepth( z, push.clipInfo);
float d0 = linearDepth(dMin, push.clipInfo);
float d1 = linearDepth(dMax, push.clipInfo);
float d = (dZ-d0)/(d1-d0);
// return vec4(debugColors[min(int(d*textureSize(fogLut,0).z), textureSize(fogLut,0).z-1)%MAX_DEBUG_COLORS], 1);
vec4 val = textureLod(fogLut, vec3(uv,d), 0);
vec3 trans = vec3(1.0-val.w);
float fogDens = (trans.x+trans.y+trans.z)/3.0;
vec3 lum = val.rgb;
return vec4(lum, fogDens);
}
One idea is to replace textureLod(fogLut, vec3(uv,d), 0); with the constant vec4(0,0,0,1).
Also, maybe the capability check is off; how about setting lutRGBAFormat to RGBA8 unconditionally on R-PI?
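To make the depth lookup in fog() easier to reason about, here is a CPU-side C++ sketch of the same normalization, plus the slice index computed in the commented-out debug line. The function names are hypothetical, and the inputs are assumed to be already-linearized depths (the shader's linearDepth is not reproduced here).

```cpp
#include <algorithm>
#include <cassert>

// Normalized position of dZ between the near (d0) and far (d1) linear depths,
// i.e. the same d = (dZ-d0)/(d1-d0) as in fog.frag.
float fogCoord(float dZ, float d0, float d1) {
  assert(d1 != d0);
  return (dZ - d0) / (d1 - d0);
}

// Index of the LUT depth slice, clamped to the last slice, mirroring
// min(int(d*size.z), size.z-1) from the debug line in the shader.
int fogSlice(float d, int slices) {
  assert(slices > 0);
  return std::min(int(d * float(slices)), slices - 1);
}
```

With 32 slices, for example, a fragment halfway between d0 and d1 lands in slice 16, and anything at or beyond d1 is clamped to slice 31.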
Forcing both unconditionally to RGBA8 doesn't look good:
EDIT: Should I have set only the RGBA lut?
But only putting that constant vector into the fog shader does work, the game then renders properly.
Xardas' tower should look like this with 8-bit fog (almost the same as 16-bit)
So, my assumption is that fog_view_lut.comp produces wrong results (nan/inf/crap). It can be due to a lack of FPU precision, or due to how imageStore is working (or not working) on R-PI.
One way to debug it further can be to hack imageStore in that shader to store (0,0,0,1)
Another idea is to flip it back to use RGBA32F float format instead of RGBA8.
Should I have set only the RGBA lut?
I'm assuming both formats are problematic
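If the LUT contents can be read back to the CPU, the nan/inf hypothesis can be checked directly. A minimal sketch, assuming the texels are already available as a flat array of floats; how to do the readback in Tempest is not shown here, and `countNonFinite` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Counts the components in a readback buffer that are NaN or +/-Inf.
std::size_t countNonFinite(const float* data, std::size_t count) {
  assert(data != nullptr || count == 0);
  std::size_t bad = 0;
  for(std::size_t i = 0; i < count; ++i) {
    if(!std::isfinite(data[i]))
      ++bad;
  }
  return bad;
}
```

A non-zero count would point at the compute shader's math, while an all-finite but obviously wrong LUT would point more towards imageStore or the format.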
Thanks for the further suggestions!
One way to debug it further can be to hack imageStore in that shader to store (0,0,0,1)
That didn't work, still drawing black instead of sceneLinear.
Another idea is to flip it back to use RGBA32F float format instead of RGBA8.
That didn't work either, I've forced it right away. The constant vector in fog.frag is the only successful workaround at the moment.
BTW, I get that this is taking away development effort from more useful things. Even if it worked out in the end, the game would still run at ~1 FPS, unless I've commented out reflections and improved perf multiple times to ~4 FPS on an already reduced resolution... or unless you had an ace up your sleeve for more perf optimizations on low-end machines :)
So I understand getting RPi4 to render properly is not exactly the top priority when it won't really run nicely either way.
TBH it's good that someone tries to run this on slower hardware like the Raspberry Pi. The original game ran on a 3dfx Voodoo 3 just fine. The GPU of a Raspberry Pi 4 should be several orders of magnitude faster than that.
OpenGothic always felt kind of slow, but I didn't report any issues, because on modern desktop hardware it's plenty fast enough.
That didn't work, still drawing black instead of sceneLinear. BTW, I get that this is taking away development effort from more useful things.
Thanks for testing! Well now we know that imageStore is broken on Broadcom :D
Well, worth checking still - just to be sure, that it's not UB/bug on our end.
I've commented out reflections and improved perf multiple times to ~4 FPS
(facepalm) So, they can't even do empty-pass elimination - great...
Raspberry Pi 4 should be several orders of magnitude faster than that.
Actually no, it doesn't work like that. The silicon is better, sure, but a GPU is more than that. Currently I'd say we have 2 classes of GPUs to work with: NVidia/AMD, and the broken ones
On NV, there seem to be only 2 issues on our end: draw-context-limit and slow memory.
On AMD: minor HW issues (gl_LocalInvocationIndex issue)
On M1 (closest to R-PI, so far): Metal3 is full of emulation, and due to no communication from Apple I can only guess what is going on with draw calls. One thing that I'm aware of: one mobile vendor (good that we do not do Android!) wasn't able to implement the indexed geometry call right
It turns out RPi lead times have improved a lot since summer (and before). I ordered RPi5 at the beginning of October and here I am, testing OPG on RPi5 already! This time I'm only curious about the perf uplift between RPi4 and RPi5 and intend to sell the RPi5 almost immediately (I don't really need the extra power of RPi5 for my headless server usage while I do care about all the stuff already upstreamed fully for RPi4), so my invested time and effort will be more limited in comparison to the above.
In a nutshell, RPi5 can finally be used as a cheap and slow desktop computer (instead of previous garbage desktop computer) and while I've ordered a starter set with an active cooler, to my surprise it was inaudible even when compiling the full project.
Here's the default log output when starting the game, in case something useful can be seen there:
pi@raspberrypi:~/src/OpenGothic/build/opengothic $ ./Gothic2Notr.sh -g ~/Downloads/G2G/
OpenGothic v1.0 dev
WARNING: v3dv support for hw version 71 is neither a complete nor a conformant Vulkan implementation. Testing use only.
GPU = V3D 7.1.7
Depth format = Depth32F Shadow format = Depth16
And here are two out of the box screenshots! The first one is the new game talking to Xardas:
The other one is standing in front of Khorinis Hotel:
While this does look pretty bad (and you don't even see this shit moving), it's much better than RPi4's no picture, also it matches the reported >3x uplift in GPU performance (0.44 FPS vs 1.90 FPS). Lowering the render resolution further improves the performance appropriately to somewhere around 8 FPS. I could have tested a few more things but I guess this is still too low FPS for any serious gaming and is not necessary to explore yet.
This means that RPi6 probably will be able to run OpenGothic at playable framerates! :slightly_smiling_face: