Lag on maps with lots of bots

Open dflat2 opened this issue 2 years ago • 8 comments

When joining a multiplayer map with more than ~40 bots, FPS drops from about 120 to less than 20 (and it keeps getting lower as the number of bots increases). FPS seems to drop by the same amount whether the quality of the skins is standard or HD. FPS is back to normal when I'm not facing the bots (i.e. when they are out of my FOV) or when they are in the fog.

I'm playing on MacBook Pro 2020 with an M1 chip.

dflat2 avatar Jul 30 '23 08:07 dflat2

Are you using the regular macOS ClassiCube.net build, or a manually compiled build for ARM?

UnknownShadow200 avatar Aug 01 '23 11:08 UnknownShadow200

I'm using the regular macOS ClassiCube.net build.

dflat2 avatar Aug 01 '23 13:08 dflat2

Can you please compile and then run ClassiCube with

  1. Graphics_GL1.c replaced with the one from https://gist.github.com/UnknownShadow200/8dc9baa18b4084f5ff68066bf1495a5f
  2. Gfx_EndFrame in _GLShared.h either renamed or deleted (otherwise it conflicts with the new one in Graphics_GL1.c)

And then note what the timing values are when:

  • looking at all the bots
  • not looking at any bots
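
The patched Graphics_GL1.c from the gist is not reproduced in this thread. As a rough sketch only, and assuming the timing values are gathered once per frame as a running total plus the longest single vertex buffer update, the bookkeeping could look something like this. The VbTiming struct and both function names are hypothetical and are not taken from ClassiCube or the gist:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-frame accumulator; NOT the code from the gist. */
struct VbTiming {
    uint64_t total_us; /* sum of all vertex buffer update durations this frame */
    uint64_t max_us;   /* longest single update seen this frame */
};

/* Record one vertex buffer update that took elapsed_us microseconds. */
static void VbTiming_Add(struct VbTiming* t, uint64_t elapsed_us) {
    assert(t != NULL);
    t->total_us += elapsed_us;
    if (elapsed_us > t->max_us) t->max_us = elapsed_us;
}

/* Format this frame's figures into out, then reset for the next frame.
   Returns the snprintf result (characters that were or would be written). */
static int VbTiming_EndFrame(struct VbTiming* t, char* out, size_t out_len) {
    int n;
    assert(t != NULL && out != NULL);
    n = snprintf(out, out_len, "VB timing: %llu us total, %llu us max",
                 (unsigned long long)t->total_us,
                 (unsigned long long)t->max_us);
    t->total_us = 0;
    t->max_us   = 0;
    return n;
}
```

In such a scheme, VbTiming_Add would be called around each buffer update and VbTiming_EndFrame once when the frame is presented.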

UnknownShadow200 avatar Aug 03 '23 13:08 UnknownShadow200

Here's a sample of the timing values when looking at all the bots (there are about 80 bots in the FOV):

VB timing: 30418 us total, 562 us max
VB timing: 34668 us total, 781 us max
VB timing: 35060 us total, 526 us max
VB timing: 35419 us total, 796 us max
VB timing: 35756 us total, 784 us max
VB timing: 35849 us total, 868 us max
VB timing: 35398 us total, 686 us max
VB timing: 36481 us total, 601 us max
VB timing: 37087 us total, 1216 us max
VB timing: 35655 us total, 866 us max
VB timing: 34804 us total, 747 us max
VB timing: 34812 us total, 838 us max

When looking at no bots:

VB timing: 57 us total, 50 us max
VB timing: 68 us total, 60 us max
VB timing: 73 us total, 67 us max
VB timing: 87 us total, 76 us max
VB timing: 91 us total, 81 us max
VB timing: 83 us total, 77 us max
VB timing: 115 us total, 108 us max
VB timing: 81 us total, 71 us max
VB timing: 69 us total, 61 us max
VB timing: 85 us total, 75 us max
VB timing: 96 us total, 82 us max
VB timing: 91 us total, 83 us max
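
As a rough sanity check on these figures, assuming each line covers one frame: if vertex buffer updates alone take about 35,000 us, no frame can finish faster than that, which caps the frame rate near 28 FPS before any other work is counted. That is in the same range as the roughly 20 FPS I see. The helper name below is hypothetical:

```c
#include <assert.h>

/* Upper bound on frames per second when a single stage of the frame
   takes stage_us microseconds on its own (1 second = 1,000,000 us). */
static double FpsCeiling(double stage_us) {
    assert(stage_us > 0.0);
    return 1000000.0 / stage_us;
}
```

FpsCeiling(35000.0) comes out at about 28.6, while the ~85 us measured when no bots are visible puts no practical limit on the frame rate.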

dflat2 avatar Aug 03 '23 15:08 dflat2

Yikes.. those are some very concerning timing values

I guess my choice to render all entities using the same dynamic vertex buffer, with that vertex buffer being constantly partially updated using glBufferSubData (each rendered entity calls glBufferSubData at least once), is causing constant GPU stalling in this case

Going to need some time to think about an alternative approach to use for rendering entities


For comparison, these are the timing values from my 2010 Mac mini with a GeForce 320M GPU:

VB timing: 314 us total, 57 us max
VB timing: 199 us total, 7 us max
VB timing: 187 us total, 5 us max
VB timing: 262 us total, 38 us max
VB timing: 219 us total, 8 us max
VB timing: 211 us total, 6 us max
VB timing: 223 us total, 30 us max
VB timing: 183 us total, 7 us max
VB timing: 212 us total, 6 us max
VB timing: 250 us total, 22 us max
VB timing: 23 us total, 5 us max
VB timing: 24 us total, 4 us max
VB timing: 24 us total, 5 us max
VB timing: 26 us total, 5 us max
VB timing: 26 us total, 6 us max
VB timing: 64 us total, 40 us max
VB timing: 24 us total, 6 us max
VB timing: 25 us total, 6 us max
VB timing: 24 us total, 5 us max
VB timing: 23 us total, 5 us max

UnknownShadow200 avatar Aug 03 '23 22:08 UnknownShadow200

@dflat2 Sorry for the long delay in addressing this issue

If you

  1. Download latest source code
  2. Go to Gfx_SetDynamicVbData in Graphics_GL1.c
  3. Change _glBufferSubData(GL_ARRAY_BUFFER, 0, size, vertices) to _glBufferData(GL_ARRAY_BUFFER, size, vertices, GL_DYNAMIC_DRAW);

does that solve the severe performance problem?
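
To make the difference between the two calls concrete, here is a minimal sketch of both variants. It is not the actual Gfx_SetDynamicVbData body: the GL typedefs are stand-ins so it compiles without OpenGL headers, and the GlFuncs table, the two wrapper names and the recording stubs are all hypothetical. Both variants assume the dynamic vertex buffer is already bound to GL_ARRAY_BUFFER. Respecifying the whole store with glBufferData is commonly described as letting the driver hand out fresh memory instead of waiting for draws that still read the old contents, though whether a given driver does so is an assumption here:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins so the sketch compiles without OpenGL headers. */
typedef unsigned int GLenum;
typedef ptrdiff_t    GLintptr;
typedef ptrdiff_t    GLsizeiptr;
#define GL_ARRAY_BUFFER 0x8892 /* value as in the OpenGL headers */
#define GL_DYNAMIC_DRAW 0x88E8 /* value as in the OpenGL headers */

/* Hypothetical table holding the two entry points. */
struct GlFuncs {
    void (*BufferData)(GLenum target, GLsizeiptr size,
                       const void* data, GLenum usage);
    void (*BufferSubData)(GLenum target, GLintptr offset,
                          GLsizeiptr size, const void* data);
};

/* Before: overwrite the start of the existing data store in place. */
static void SetDynamicVbData_SubData(const struct GlFuncs* gl,
                                     const void* vertices, GLsizeiptr size) {
    assert(gl != NULL && vertices != NULL);
    gl->BufferSubData(GL_ARRAY_BUFFER, 0, size, vertices);
}

/* After: respecify the whole data store with the new vertices. */
static void SetDynamicVbData_Respecify(const struct GlFuncs* gl,
                                       const void* vertices, GLsizeiptr size) {
    assert(gl != NULL && vertices != NULL);
    gl->BufferData(GL_ARRAY_BUFFER, size, vertices, GL_DYNAMIC_DRAW);
}

/* Recording stubs: stand in for a real driver so the sketch can be
   exercised without a GL context. */
static int        last_call; /* 0 = none, 1 = BufferData, 2 = BufferSubData */
static GLsizeiptr last_size;

static void Stub_BufferData(GLenum target, GLsizeiptr size,
                            const void* data, GLenum usage) {
    (void)target; (void)data; (void)usage;
    last_call = 1;
    last_size = size;
}

static void Stub_BufferSubData(GLenum target, GLintptr offset,
                               GLsizeiptr size, const void* data) {
    (void)target; (void)offset; (void)data;
    last_call = 2;
    last_size = size;
}
```

One thing to keep in mind with the second variant is that it replaces the entire store, so size has to cover everything that will be drawn from the buffer afterwards.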

UnknownShadow200 avatar Nov 03 '23 11:11 UnknownShadow200

On the latest commit (f63018b05):

Without the change: 20 FPS for 80 bots in the field of view. With the change (_glBufferSubData replaced by _glBufferData): 120 FPS for the same number of bots in the FOV.

Yes, it solves the problem, thanks.

dflat2 avatar Nov 05 '23 12:11 dflat2

Looks like it might be more challenging than just switching from glBufferSubData to glBufferData. I did some performance testing with the Intel HD Graphics 2000 on Windows and found that:

old: 26 FPS (before #1904 was merged)

new (shared VB, bufferSubData):  42 FPS
new (shared VB, bufferData):    122 FPS
new (unique VB, bufferSubData): 130 FPS
new (unique VB, bufferData):    124 FPS

shared VB = all entities reusing the same single vertex buffer
unique VB = each entity uses a separate vertex buffer

So for this case, although switching to glBufferData is a better baseline, the fastest method does end up using glBufferSubData. Guess I'll have to perform much more testing on this
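
To spell out the shared/unique distinction, here is a minimal sketch of which vertex buffer an entity would upload into under each scheme. The VbMode enum and VbIndexFor function are hypothetical names, not ClassiCube code. One plausible reading of the numbers, stated as an assumption, is that with unique buffers an upload never touches a buffer that a draw call from the same frame has just used:

```c
#include <assert.h>
#include <stddef.h>

enum VbMode { VB_SHARED, VB_UNIQUE };

/* Index of the vertex buffer that entity_index uploads into.
   Shared: every entity maps to buffer 0.
   Unique: entity i maps to buffer i, so buffer_count must cover all entities. */
static size_t VbIndexFor(enum VbMode mode, size_t entity_index,
                         size_t buffer_count) {
    assert(buffer_count > 0);
    if (mode == VB_SHARED) return 0;
    assert(entity_index < buffer_count);
    return entity_index;
}
```

With 80 bots in view, VB_SHARED sends all 80 uploads per frame to buffer 0, while VB_UNIQUE spreads them over 80 buffers that are each written once per frame.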

UnknownShadow200 avatar Nov 11 '23 23:11 UnknownShadow200