WIP Optimisation: Use all texture units, bindless texture support
Makes the engine use all texture units: rather than rebinding textures each time, it sets the correct uniform instead. Shaders have been modified to return the location of the correct uniform (this is needed because normal and bindless textures use different functions, so doing it any other way would be too convoluted).
Adds support for bindless textures, which control texture residency instead of binding textures to texture units.
Both approaches use a texture priority that depends on the texture's overall usage and its usage during the last frame. Texture priority is currently calculated as: frame bind percentage (binds of this texture last frame / total texture binds last frame) * 0.5 + total bind percentage (binds of this texture in total / total texture binds) * 1.5.
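For clarity, the priority formula can be sketched as a standalone function. The struct and counter names here are illustrative, not the engine's actual bookkeeping:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical per-texture bind counters; only an illustration of the formula.
struct TextureStats {
	uint32_t bindsLastFrame; // times this texture was bound last frame
	uint32_t bindsTotal;     // times this texture was bound in total
};

// priority = frameBindPercentage * 0.5 + totalBindPercentage * 1.5
float TexturePriority( const TextureStats &tex,
                       uint32_t totalBindsLastFrame,
                       uint32_t totalBindsOverall ) {
	float frameBindPercentage = totalBindsLastFrame
		? float( tex.bindsLastFrame ) / float( totalBindsLastFrame ) : 0.0f;
	float totalBindPercentage = totalBindsOverall
		? float( tex.bindsTotal ) / float( totalBindsOverall ) : 0.0f;
	return frameBindPercentage * 0.5f + totalBindPercentage * 1.5f;
}
```

The 1.5 weight on the lifetime percentage means long-term usage dominates, while the 0.5 weight still lets a texture that suddenly became hot last frame climb in priority.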
If there are no available texture units ~~or a bindless texture is not made resident~~, it will replace the lowest-priority texture unit ~~or make the lowest-priority texture non-resident~~. This is currently disabled for bindless, as making whatever texture was last added non-resident performs better than sorting the textures every time. Another planned change might get rid of these priorities anyway.
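The replacement step for the bound-texture path amounts to a linear scan for the lowest-priority occupant; a minimal sketch, assuming priorities are tracked per texture unit:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given the priorities of the textures currently occupying texture units,
// pick the unit holding the lowest-priority texture as the replacement
// target. Illustrative only; the engine's bookkeeping differs.
size_t PickUnitToReplace( const std::vector<float> &unitPriorities ) {
	size_t lowest = 0;
	for ( size_t i = 1; i < unitPriorities.size(); i++ ) {
		if ( unitPriorities[ i ] < unitPriorities[ lowest ] ) {
			lowest = i;
		}
	}
	return lowest;
}
```

This scan is why the bindless path skips priorities entirely: evicting the most recently added texture is O(1), while finding the minimum costs a pass over every unit on each replacement.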
This currently doesn't do anything for bound textures and even slightly lowers their performance when there are a lot of texture bindings. Using bindless textures, however, gives a slight performance improvement (that is why this PR is a draft). I'll likely have to retain some of the old functionality for this reason. It is possible that using a UBO would still improve performance even without bindless textures, but that requires GLSL 4.00, so it still won't be supported by all devices.
I tried to run it but I got:
Warn: Compile log:
0:13(8): error: qualifier `bindless_sampler` requires ARB_bindless_texture
Warn: Unhandled exception (15ShaderException): Couldn't compile fragment shader: generic2D
I assume the extensions supported by your machine include ARB_bindless_texture? Maybe addExtension() is not working properly...
> I assume the extensions supported by your machine include ARB_bindless_texture? Maybe addExtension() is not working properly...
It looks like addExtension() is just a function that sets defines if the extension is loaded, but you need to load the extension first (in sys/sdl_glimp.cpp). The code was also misleading because it assumed GLSL 4.00 includes ARB_bindless_texture, which doesn't seem to be true.
Our engine requires OpenGL 3.1 Core by default and then cherry-picks extensions, but some drivers still provide OpenGL 4.5 with all extensions available. I guess this is why it worked on your end; by luck it may also have enabled the define because your GLSL version is high enough, not because the test was correct.
Here is a patch that makes your code run on my end. Notice I added an r_arb_bindless_texture cvar to make it easy to disable the extension for debugging purposes, and I modified addExtension() to receive -1 when an extension is not part of any GLSL version, so we can't assume any version brings it.
diff --git a/src/engine/renderer/gl_shader.cpp b/src/engine/renderer/gl_shader.cpp
index a81c4f049..3234113ab 100644
--- a/src/engine/renderer/gl_shader.cpp
+++ b/src/engine/renderer/gl_shader.cpp
@@ -379,7 +379,7 @@ static void addExtension( std::string &str, int enabled, int minGlslVersion,
int supported, const char *name ) {
if( !enabled ) {
// extension disabled by user
- } else if( glConfig2.shadingLanguageVersion >= minGlslVersion ) {
+ } else if( minGlslVersion != -1 && glConfig2.shadingLanguageVersion >= minGlslVersion ) {
// the extension is available in the core language
str += Str::Format( "#define HAVE_%s 1\n", name );
} else if( supported ) {
@@ -421,7 +421,7 @@ static std::string GenVersionDeclaration() {
GLEW_ARB_gpu_shader5, "ARB_gpu_shader5" );
addExtension( str, r_arb_uniform_buffer_object->integer, 140,
GLEW_ARB_uniform_buffer_object, "ARB_uniform_buffer_object" );
- addExtension( str, glConfig2.bindlessTexturesAvailable, 400,
+ addExtension( str, glConfig2.bindlessTexturesAvailable, -1,
GLEW_ARB_bindless_texture, "ARB_bindless_texture" );
return str;
diff --git a/src/engine/renderer/tr_init.cpp b/src/engine/renderer/tr_init.cpp
index 288ad7af4..7ae12c846 100644
--- a/src/engine/renderer/tr_init.cpp
+++ b/src/engine/renderer/tr_init.cpp
@@ -100,6 +100,7 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
cvar_t *r_arb_uniform_buffer_object;
cvar_t *r_arb_texture_gather;
cvar_t *r_arb_gpu_shader5;
+ cvar_t *r_arb_bindless_texture;
cvar_t *r_checkGLErrors;
cvar_t *r_logFile;
@@ -1094,6 +1095,7 @@ ScreenshotCmd screenshotPNGRegistration("screenshotPNG", ssFormat_t::SSF_PNG, "p
r_arb_uniform_buffer_object = Cvar_Get( "r_arb_uniform_buffer_object", "1", CVAR_CHEAT | CVAR_LATCH );
r_arb_texture_gather = Cvar_Get( "r_arb_texture_gather", "1", CVAR_CHEAT | CVAR_LATCH );
r_arb_gpu_shader5 = Cvar_Get( "r_arb_gpu_shader5", "1", CVAR_CHEAT | CVAR_LATCH );
+ r_arb_bindless_texture = Cvar_Get( "r_arb_bindless_texture", "1", CVAR_CHEAT | CVAR_LATCH );
r_picMip = Cvar_Get( "r_picMip", "0", CVAR_LATCH | CVAR_ARCHIVE );
r_imageMaxDimension = Cvar_Get( "r_imageMaxDimension", "0", CVAR_LATCH | CVAR_ARCHIVE );
diff --git a/src/engine/renderer/tr_local.h b/src/engine/renderer/tr_local.h
index cb5257637..ac269fd85 100644
--- a/src/engine/renderer/tr_local.h
+++ b/src/engine/renderer/tr_local.h
@@ -2992,6 +2992,7 @@ enum class deluxeMode_t { NONE, GRID, MAP };
extern cvar_t *r_arb_uniform_buffer_object;
extern cvar_t *r_arb_texture_gather;
extern cvar_t *r_arb_gpu_shader5;
+ extern cvar_t *r_arb_bindless_texture;
extern cvar_t *r_nobind; // turns off binding to appropriate textures
extern cvar_t *r_singleShader; // make most world faces use default shader
diff --git a/src/engine/sys/sdl_glimp.cpp b/src/engine/sys/sdl_glimp.cpp
index 7d555c845..c4c322870 100644
--- a/src/engine/sys/sdl_glimp.cpp
+++ b/src/engine/sys/sdl_glimp.cpp
@@ -1933,6 +1933,9 @@ static void GLimp_InitExtensions()
// made required in OpenGL 3.2
glConfig2.syncAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_CORE, ARB_sync, r_arb_sync->value );
+ // not required by any OpenGL version
+ glConfig2.bindlessTexturesAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_CORE, ARB_bindless_texture, r_arb_bindless_texture->value );
+
GL_CheckErrors();
}
Ahh, that makes sense. I assumed that minGlslVersion was the minimal GLSL version required to support the extension, not that the extension would necessarily be included in it. Thanks for the patch, I'll add it!
I'm still not sure whether the cvar should even be there, because switching back and forth might produce issues if the texture parameters are changed while the extension is disabled. It is, however, a debug cvar, so not a big deal.
Also, how does it work performance-wise on your end, compared to not using this change?
> Also, how does it work performance-wise on your end, compared to not using this change?
| map | viewpos | legacy | bindless |
|---|---|---|---|
| plat23 | 0 2688 192 -90 13 | 250fps | 380fps |
| plat23 | 1890 1920 0 0 -10 | 160fps | 220fps |
This is an AMD Radeon PRO W7600.
Though, with your code but r_arb_bindless_texture 0, many 2D textures are now broken.
I get the same 2D texture glitches when disabling bindless texture with a Mesa environment variable (in case you distrust the cvar):
MESA_EXTENSION_OVERRIDE='-GL_ARB_bindless_texture'
That is a nice FPS boost!
> Though, with your code but r_arb_bindless_texture 0, many 2D textures are now broken.
Can you send some example screenshots (here or on IRC)? I thought I made all of those work, but maybe I missed something. I'll need to make some changes to resolve conflicts with current master, but other than that it should work™. (Also, make sure you set the cvar from the terminal or in the config before starting the game; otherwise I think it's going to take an incorrect code path. vid_restart might work too.)
> I get the same 2D texture glitches when disabling bindless texture with a Mesa environment variable (in case you distrust the cvar):
Hmm, that's weird...
I've added your patch and updated this branch to current master.
Levelshot blended with main menu background, garbage progression bar color:
Missing minimap black background:
Missing console background:
I get the exact same bugs whether I disable bindless texture in the Mesa driver directly or in the engine using the cvar.
Oh, that's new.
Weirdly enough, it's working on my end.
Forget about what I said about the Mesa environment variable; I made a mistake writing it, so I had only tested the cvar.
Disabling the extension gets me a segfault. Here is the proper writing of the Mesa variable (I also edited my previous comment):
MESA_EXTENSION_OVERRIDE='-GL_ARB_bindless_texture'
For some unknown reason the game still detects it:
...using GL_ARB_bindless_texture
[…]
Using OpenGL extensions: GL_ARB_half_float_pixel GL_ARB_texture_float GL_EXT_gpu_shader4 GL_EXT_texture_integer GL_ARB_texture_rg GL_ARB_bindless_texture GL_ARB_texture_gather GL_EXT_texture_compression_s3tc GL_ARB_texture_compression_rgtc GL_EXT_texture_filter_anisotropic GL_ARB_half_float_vertex GL_ARB_framebuffer_object GL_ARB_get_program_binary GL_ARB_buffer_storage GL_ARB_uniform_buffer_object GL_ARB_map_buffer_range GL_ARB_sync GL_ARB_bindless_texture
But on the other hand:
$ glxinfo | grep -i bindless
GL_ARB_arrays_of_arrays, GL_ARB_base_instance, GL_ARB_bindless_texture,
GL_ARB_arrays_of_arrays, GL_ARB_base_instance, GL_ARB_bindless_texture,
$ MESA_EXTENSION_OVERRIDE='-GL_ARB_bindless_texture' glxinfo | grep -i bindless
(nothing)
I also made a mistake in my patch:
diff --git a/src/engine/sys/sdl_glimp.cpp b/src/engine/sys/sdl_glimp.cpp
index c4c322870..8ee521c39 100644
--- a/src/engine/sys/sdl_glimp.cpp
+++ b/src/engine/sys/sdl_glimp.cpp
@@ -1784,7 +1784,7 @@ static void GLimp_InitExtensions()
// made required in OpenGL 3.0
glConfig2.textureRGAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_CORE, ARB_texture_rg, r_ext_texture_rg->value );
- glConfig2.bindlessTexturesAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_CORE, ARB_bindless_texture, r_ext_bindless_textures->value );
+ glConfig2.bindlessTexturesAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_NONE, ARB_bindless_texture, r_ext_bindless_textures->value );
{
/* GT218-based GPU with Nvidia 340.108 driver advertising
Now I get as expected:
Missing OpenGL extensions: GL_ARB_bindless_texture
But I still get a segfault.
Well, this is wrong:
Using OpenGL extensions: GL_ARB_half_float_pixel GL_ARB_texture_float GL_EXT_gpu_shader4
GL_EXT_texture_integer GL_ARB_texture_rg GL_ARB_texture_gather GL_EXT_texture_compression_s3tc
GL_ARB_texture_compression_rgtc GL_EXT_texture_filter_anisotropic GL_ARB_half_float_vertex
GL_ARB_framebuffer_object GL_ARB_get_program_binary GL_ARB_buffer_storage
GL_ARB_uniform_buffer_object GL_ARB_map_buffer_range GL_ARB_sync GL_ARB_bindless_texture
Missing OpenGL extensions: GL_ARB_bindless_texture
It is said to be both used and missing… 🤔️
I've removed the r_ext_bindless_textures cvar since the extension is, after all, ARB_bindless_texture.
> It is said to be both used and missing… 🤔️
I think it's because there were 2 cvars for it. Can you try the updated version?
Right, we were loading the extension twice; now you miss that the extension is not implied by the core profile:
diff --git a/src/engine/sys/sdl_glimp.cpp b/src/engine/sys/sdl_glimp.cpp
index 03ded78b2..1b4bb29d0 100644
--- a/src/engine/sys/sdl_glimp.cpp
+++ b/src/engine/sys/sdl_glimp.cpp
@@ -1932,7 +1932,7 @@ static void GLimp_InitExtensions()
glConfig2.syncAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_CORE, ARB_sync, r_arb_sync->value );
// not required by any OpenGL version
- glConfig2.bindlessTexturesAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_CORE, ARB_bindless_texture, r_arb_bindless_texture->value );
+ glConfig2.bindlessTexturesAvailable = LOAD_EXTENSION_WITH_TEST( ExtFlag_NONE, ARB_bindless_texture, r_arb_bindless_texture->value );
GL_CheckErrors();
}
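The ExtFlag_CORE vs ExtFlag_NONE distinction can be summarized in a small sketch of the presumed semantics; this is not the actual LOAD_EXTENSION_WITH_TEST implementation, just the decision it appears to encode:

```cpp
#include <cassert>

// Presumed semantics: with ExtFlag_CORE the extension counts as available
// if either the core GL version implies it or the driver advertises it;
// with ExtFlag_NONE only the driver's extension string counts.
// Illustrative sketch only.
enum class ExtFlag { NONE, CORE };

bool ExtensionAvailable( ExtFlag flag, bool impliedByCoreVersion,
                         bool advertisedByDriver, bool enabledByCvar ) {
	if ( !enabledByCvar ) {
		return false; // disabled by user
	}
	if ( flag == ExtFlag::CORE && impliedByCoreVersion ) {
		return true; // implied by the core profile
	}
	return advertisedByDriver; // must be in the extension string
}
```

Under ExtFlag_CORE, a driver that hides GL_ARB_bindless_texture from its extension string but reports a high core version would still wrongly enable it; ExtFlag_NONE avoids exactly that.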
Once the latest patch is applied, the game runs while the extension is disabled in the driver, but reproduces the errors seen in the above screenshots.
Great, now I get:
...GL_ARB_bindless_texture not found.
[…]
Using OpenGL extensions: GL_ARB_half_float_pixel GL_ARB_texture_float GL_EXT_gpu_shader4
GL_EXT_texture_integer GL_ARB_texture_rg GL_ARB_texture_gather GL_EXT_texture_compression_s3tc
GL_ARB_texture_compression_rgtc GL_EXT_texture_filter_anisotropic GL_ARB_half_float_vertex
GL_ARB_framebuffer_object GL_ARB_get_program_binary GL_ARB_buffer_storage
GL_ARB_uniform_buffer_object GL_ARB_map_buffer_range GL_ARB_sync
Missing OpenGL extensions: GL_ARB_bindless_texture
but the game runs (with the said bugs).
Unfortunately I don't reproduce the bug with Mesa llvmpipe, I reproduce it with Mesa radeonsi…
We may have a shader cache issue, deleting the glsl/ folder fixes the bug for me.
Hmm, that might indeed be the case.
Not your fault:
- https://github.com/DaemonEngine/Daemon/pull/1054
Now that I delete my shader cache between every test run to avoid running wrong GLSL code because of the shader cache issue, the bindless branch shows me a huge performance drop instead.
Fastest is the master branch, then this branch with bindless disabled, then this branch with bindless enabled.
| preset | map | viewpos | master | bindless off | bindless on |
|---|---|---|---|---|---|
| ultra | plat23 | 0 2688 192 -90 13 | 610fps | 480fps | 380fps |
| ultra | plat23 | 1890 1920 0 0 -10 | 490fps | 350fps | 240fps |
| ultra | dretchstorm | 120 -504 248 135 45 | 500fps | 350fps | 300fps |
| ultra | dretchstorm | 2007 -412 196 11 16 | 390fps | 270fps | 230fps |
| ultra | nova | -1896 336 248 0 0 | 520fps | 320fps | 280fps |
| lowest | plat23 | 0 2688 192 -90 13 | 830fps | 830fps | 780fps |
| lowest | plat23 | 1890 1920 0 0 -10 | 620fps | 620fps | 615fps |
| lowest | dretchstorm | 120 -504 248 135 45 | 650fps | 650fps | 630fps |
| lowest | dretchstorm | 2007 -412 196 11 16 | 430fps | 430fps | 450fps |
| lowest | nova | -1896 336 248 0 0 | 600fps | 450fps | 500fps |
- GPU: AMD Radeon W7600 PRO
- Driver: Mesa radeonsi
- Resolution: 2560×1440
Thread 1 "daemon" received signal SIGSEGV, Segmentation fault.
RB_RenderMotionBlur () at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:3186
3186 GL_BindToTMU( gl_motionblurShader->GetUniformLocation_ColorMap(), tr.currentRenderImage[backEnd.currentMainFBO] );
Thread 1 (Thread 0x7ffff5515a80 (LWP 2370458) "daemon"):
#0 RB_RenderMotionBlur () at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:3186
#1 RB_RenderMotionBlur () at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:3165
#2 0x0000555555695171 in RB_RenderView (depthPass=false) at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:4783
#3 DrawViewCommand::ExecuteSelf (this=0x7fffd6e92364) at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:5683
#4 0x000055555568d689 in RB_ExecuteRenderCommands (data=<optimized out>) at Unvanquished/daemon/src/engine/renderer/tr_backend.cpp:5927
#5 0x00005555556a8a62 in RE_EndFrame (frontEndMsec=0x0, backEndMsec=0x0) at Unvanquished/daemon/src/engine/renderer/tr_cmds.cpp:889
#6 0x0000555555625850 in SCR_UpdateScreen () at Unvanquished/daemon/src/engine/client/cl_scrn.cpp:327
#7 0x0000555555619d1d in CL_Frame (msec=msec@entry=3) at Unvanquished/daemon/src/engine/client/cl_main.cpp:2094
#8 0x00005555555ae46f in Com_Frame () at Unvanquished/daemon/src/engine/qcommon/common.cpp:1025
#9 0x00005555555a78bd in Application::ClientApplication::Frame (this=0x555555875f00 <Application::GetApp()::app>) at Unvanquished/daemon/src/engine/client/ClientApplication.cpp:96
#10 0x00005555555a71e5 in main (argc=<optimized out>, argv=<optimized out>) at Unvanquished/daemon/src/engine/framework/System.cpp:755
How I reproduce:
- load plat23 map
- enable noclip
- press the [crouch] button until it crashes
It doesn't crash anymore but the motion blur is now garbage:
> the motion blur is now garbage:
Yet another false positive: this disappears after cleaning the GLSL cache.
Yep, I just checked and on my end it works fine.
If you can rebase over current master to get this:
- https://github.com/DaemonEngine/Daemon/pull/1054
It will make testing your branch easier! 😉️