Optimized Render code for TextBuffers
This PR replaces the existing TextBuffer rendering infrastructure with a texture sheet.
Benefits:
- Significantly fewer OpenGL calls
- Dramatic FPS increase on certain GPUs (see below for stats)
- Only re-draws the parts of the screen that need it (see the sketch after the TODO list)
- Viewing the screen both in-world and in a window draws a single quad each
TODO:
- Changing the screen resolution does not stretch the contents as expected. I would argue that this is a feature, but it could be considered a bug
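A minimal sketch of the idea, as I understand it (identifiers here are illustrative, not the PR's actual code): the whole text buffer is rasterized into one texture, only the dirty rectangle is re-uploaded, and the visible screen is a single quad.

import java.nio.ByteBuffer;
import org.lwjgl.opengl.GL11;

class ScreenTextureSketch {
    private final int textureId = GL11.glGenTextures();

    // Allocate backing storage once, sized to the buffer in pixels.
    void allocate(int width, int height) {
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, textureId);
        GL11.glTexImage2D(GL11.GL_TEXTURE_2D, 0, GL11.GL_RGBA8, width, height,
                0, GL11.GL_RGBA, GL11.GL_UNSIGNED_BYTE, (ByteBuffer) null);
    }

    // Re-upload only the rectangle that changed since the last frame,
    // instead of re-issuing draw calls for every glyph.
    void uploadDirtyRegion(int x, int y, int w, int h, ByteBuffer pixels) {
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, textureId);
        GL11.glTexSubImage2D(GL11.GL_TEXTURE_2D, 0, x, y, w, h,
                GL11.GL_RGBA, GL11.GL_UNSIGNED_BYTE, pixels);
    }

    // Whether rendered in-world or in the GUI, the screen is one quad.
    void drawQuad(float w, float h) {
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, textureId);
        GL11.glBegin(GL11.GL_QUADS);
        GL11.glTexCoord2f(0, 0); GL11.glVertex2f(0, 0);
        GL11.glTexCoord2f(1, 0); GL11.glVertex2f(w, 0);
        GL11.glTexCoord2f(1, 1); GL11.glVertex2f(w, h);
        GL11.glTexCoord2f(0, 1); GL11.glVertex2f(0, h);
        GL11.glEnd();
    }
}

The win over per-glyph rendering is that a changed character costs one small glTexSubImage2D upload rather than re-drawing everything.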
Stress Test Program (Courtesy of Ristelle):
local component = require("component")
local gpu = component.gpu
local r = {"00","33","66","99","cc","ff"}
local g = {"00","24","49","6d","92","b6","db","ff"}
local b = {"00","40","80","c0","ff"}
local x,y = gpu.getResolution()
local rr = 1
local rg = 1
local rb = 1
while true do
  rr = rr + 1
  if rr == 6 then
    rr = 1
    rg = rg + 1
  end
  if rg == 8 then
    rg = 1
    rb = rb + 1
  end
  if rb == 5 then
    rb = 1
  end
  local rgb = string.format("0x%s%s%s", r[rr], g[rg], b[rb])
  gpu.setBackground(tonumber(rgb))
  print("Hello Cam asjdkfasdhjkhfakh For example, this.12839054812908902!")
  os.sleep(0)
end
Test Results:
- AMD R9 380: 5 FPS vs 60 FPS (Linux, vsync locked)
- NVIDIA GTX 650: 80 FPS vs 80 FPS
- Intel Celeron 847 HD Graphics (1.1 GHz): 10 FPS vs 30 FPS

More coming soon.
So if I understand correctly, this replaces display lists with textures? In which case, cool, this is something I would have liked to try out (back when I was more active on OC dev :P) but never got around to, in part because I wasn't certain it'd actually be feasible, GPU memory-wise.
And that's one of the two points where you could make me a lot more at ease with merging this: would you mind running a comparison of the new rendering code and the old one with a superflat (less noise) scene that has a whole bunch (*) of individual (actively rendering) T3 screens in it, and see how GPU memory usage reads? GPU-Z should probably work, otherwise there's an OpenGL callback to get memory usage IIRC. I have no idea how memory-(in)efficient display lists are, so this might even be better with textures, but still, I'd like to see actual numbers.
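For what it's worth, the "OpenGL callback" is presumably one of the vendor memory-info extensions; as far as I know there is no vendor-neutral query in core GL. A rough sketch, with the raw enum values inlined in case the bundled LWJGL doesn't expose them as constants:

import java.nio.IntBuffer;
import org.lwjgl.BufferUtils;
import org.lwjgl.opengl.GL11;

class VramProbe {
    // From GL_NVX_gpu_memory_info (NVIDIA only); reported in KiB.
    static final int GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX = 0x9049;
    // From GL_ATI_meminfo (AMD only); query returns four values, [0] = free KiB.
    static final int TEXTURE_FREE_MEMORY_ATI = 0x87FC;

    // Returns free video memory in KiB, or -1 if no known extension is present.
    static int freeVramKiB() {
        String ext = GL11.glGetString(GL11.GL_EXTENSIONS);
        if (ext.contains("GL_NVX_gpu_memory_info")) {
            return GL11.glGetInteger(GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX);
        }
        if (ext.contains("GL_ATI_meminfo")) {
            IntBuffer buf = BufferUtils.createIntBuffer(16);
            GL11.glGetInteger(TEXTURE_FREE_MEMORY_ATI, buf);
            return buf.get(0);
        }
        return -1;
    }
}

Logging that value before and after placing the 32 screens and taking the difference should give a usable number even without GPU-Z.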
Second, I'd very much like for the old method to be available via an option, still. Just so we can tell people in case they should run into any kind of issue to use that until it can get fixed. Once it's proven itself, we can kill the old codepath (say in half a year/with MC 1.13 or so).
I'll be gone for most of the week, but I expect to be able to look at the code more in-depth next weekend.
However this turns out, thanks in advance for taking a shot at this!
(*) Very specific, I know. I'd go with 32 or so, just to be able to see a significant difference and avoid noise messing up results.
I'd like to mention some more details as well.
Some users may experience poorer performance.
Cam Test: [screenshot]

Main Branch: [screenshot]

Same exact settings for both.
Oh wow, I can upload the test world... Sure!
Testworld.zip
First thing I noticed: this does not fix issue #779. That's more of a separate thing focused on the Lua side; this PR is about rendering efficiency (and for rendering it's good! 👍)
Secondly, the stretching would be fine if it only stretched by integer multiples, which it doesn't when the target size wouldn't otherwise fit, leading to it looking wrong when a high resolution is set.
A big change I'd like to introduce with this is making linear scaling the min filter for screens - which was impossible in the old system, but would definitely help performance.
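For illustration, a minimal sketch of what that would look like, assuming the screen contents end up in a single texture (texId is a placeholder, not an identifier from this PR):

import org.lwjgl.opengl.GL11;

class ScreenFilterSketch {
    // With per-glyph quads there was no single texture to set this on,
    // which is why it was impossible in the old system.
    static void enableLinearMinFilter(int texId) {
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, texId);
        // Linear minification: distant screens blend instead of shimmering.
        GL11.glTexParameteri(GL11.GL_TEXTURE_2D, GL11.GL_TEXTURE_MIN_FILTER, GL11.GL_LINEAR);
        // Nearest magnification keeps close-up text crisp (my choice, not stated in this thread).
        GL11.glTexParameteri(GL11.GL_TEXTURE_2D, GL11.GL_TEXTURE_MAG_FILTER, GL11.GL_NEAREST);
    }
}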
I have my own concerns over VRAM usage at scale, though.
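To put a rough number on that (my own back-of-envelope estimate, not from the PR): a T3 screen is 160×50 characters, and at 8×16 px per glyph that's a 1280×800 texture; at 4 bytes per pixel (RGBA8) that's roughly 4 MB per screen, so the 32-screen test discussed above would sit somewhere around 128 MB of VRAM, plus about a third more if mipmaps are generated for linear minification.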
If we've added this to our 1.7.3 milestone, we need the PR author to address the concerns from the review.
@cam72cam before we can merge this, we would need all of the following addressed:
1. How does GPU memory usage read when comparing 32 screens in a superflat scene? GPU-Z should probably work.
2. Keep the old method available via config until it's proven.
3. Ristelle demonstrated on an NVIDIA 650 that we drop from 91 FPS to 52 FPS? That seems bad, doesn't it?
4. asie would like to see linear scaling made the min filter for screens.
5. asie is concerned about VRAM usage at scale. @asiekierka: what scale value matters?
1. I am on Linux and don't have GPU-Z.
2. I'll try to see if I can figure out a way to do this.
3. That's a fair comparison; a lot of the performance of this code depends on the GPU you are using. Intel and AMD seem to work better with my PR, NVIDIA is a toss-up IIRC.
4. I'll see what I can do.
5. asie?
For what it's worth, for NVIDIA GPUs on Linux (with the official drivers), a very useful tool is running nvidia-smi -l to see basically all relevant data at a glance. I know nothing about other GPUs, or about vendor-independent ways to view this data, unfortunately.
@cam72cam I can confirm there is a drop in FPS to half on integrated graphics. This will affect a good number of users, unfortunately. The speedup elsewhere, however, would be a welcome improvement. The right thing going forward is to provide a configuration option to enable this.
We are very close to releasing our 1.7.3 patch. My decision on this PR is three-fold:
- I'm removing the 1.7.3 milestone from this PR.
- I think this change in behavior should be configurable, but off by default. We should continue using the old code by default.
- I think it is okay to only upgrade our 1.12 branch with this feature; I'm not interested in backporting these improvements to older versions.
Sorry I have not been able to work on this PR much. Work has kept me busy and all my current free time goes into IR.
@cam72cam Any updates on this?