xeokit-sdk [DataTexturePerformanceModel]: 80% GPU RAM savings!

Open tmarti opened this issue 2 years ago • 4 comments

⚙️ [WIP] Will keep updating this PR description ⚙️

Introducing `DataTexturePerformanceModel` 🚀

This is a variation of PerformanceModel where all the GPU data is stored in data textures.

GPU memory savings exceed 80% (with both instanced and non-instanced geometries) when using the DataTexturePerformanceModel.

💡This is achieved by using a combination of creative geometry management techniques 💡

TODO: document the techniques introduced in this PR

It requires WebGL2 support in the browser

https://caniuse.com/webgl2

Feature toggled! ⚡

A toggle is introduced in XKTLoaderPlugin to enable the new techniques that optimize GPU RAM usage.

To enable the GPU optimizations, load your XKT files with:

xktLoaderPlugin.load ({
    // Same arguments you would use today
    ...,

    // This toggles on the GPU optimizations
    useDataTextures: true,
});

This should be able to load any XKT file from version v1 to v9 (latest today).
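Since data textures require WebGL2, an application may want to switch the toggle on only when the browser supports it. Here is a minimal sketch of that check. The helper names `supportsWebGL2` and `buildLoadParams` are hypothetical and not part of xeokit; only the `useDataTextures` flag comes from this PR.

```javascript
// Returns true when the given canvas can provide a WebGL2 context.
// `canvas` is any object exposing the standard getContext() method.
function supportsWebGL2(canvas) {
    try {
        return !!canvas.getContext("webgl2");
    } catch (e) {
        return false;
    }
}

// Builds load() arguments, enabling data textures only when WebGL2 is available.
// `baseParams` stands for whatever arguments you already pass to load().
function buildLoadParams(baseParams, canvas) {
    return { ...baseParams, useDataTextures: supportsWebGL2(canvas) };
}
```

In a browser, pass a throwaway canvas such as `document.createElement("canvas")` rather than the Viewer's own canvas, because requesting a context fixes the context type of that canvas.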

Toggle-able extensions 🔧

Dynamic Level Of Detail mechanism

When enabled, this extension will try to match a target frame-rate in the Viewer and, if that is not possible, it will keep progressively hiding the most complex objects until the target frame-rate is reached.

When the frame-rate recovers (e.g. when the camera remains steady), it will keep unhiding the previously hidden objects.

The heuristic to know which objects should be hidden first is based on the number of triangles of each object.

By default, the following triangle limits are used:

[ 2000, 600, 150, 80, 20 ]

1st, objects with more than 2000 triangles will be hidden

2nd, objects with more than 600 triangles

...

but objects with 20 triangles or fewer will always remain visible
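The threshold scheme above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the function names are hypothetical, and stepping exactly one level per update is a guess at how the "progressive" hiding and unhiding might work.

```javascript
const DEFAULT_TRIANGLE_LIMITS = [2000, 600, 150, 80, 20];

// Level 0 hides nothing; level k hides objects with more than limits[k - 1] triangles.
function isHiddenAtLevel(numTriangles, level, limits = DEFAULT_TRIANGLE_LIMITS) {
    if (level <= 0) return false;
    const limit = limits[Math.min(level, limits.length) - 1];
    return numTriangles > limit;
}

// Moves one level per update: hides more when below the target frame-rate,
// restores objects when the target is met again.
function nextLevel(level, measuredFps, targetFps, limits = DEFAULT_TRIANGLE_LIMITS) {
    if (measuredFps < targetFps) return Math.min(level + 1, limits.length);
    return Math.max(level - 1, 0);
}
```

With the default limits, an object with 1000 triangles stays visible at level 1 and is hidden from level 2 onwards, while an object with 20 triangles is never hidden.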

This is adjustable on a per-model basis:

xktLoaderPlugin.load ({
    // Same arguments you would use today
    ...,

    // This toggles on the GPU optimizations
    useDataTextures: {
        // For this model, set the target frame-rate to 15 fps for LOD mechanism
        targetLodFps: 15,
    },
});

View Frustum Culling mechanism

When enabled, this extension pre-culls all objects outside the Camera Frustum (that is, objects outside the Field of View of the Camera).

This extension causes no visible difference in the render or its quality. When the camera is close to or inside a model, usually only a fraction of the objects fall within the Camera Field of View, and in those cases this extension can boost the frame-rate, in some cases by up to 2x.

As a technical note, the implementation is built on an efficient R*-tree based data structure that speeds up the Frustum Containment calculations and minimizes the number of calculations.

See the Wikipedia article on R*-trees: https://en.wikipedia.org/wiki/R*-tree

In cases where all objects are visible by the Camera, this mechanism adds no noticeable overhead to the frame-rate.
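For readers unfamiliar with frustum culling, the basic per-object test can be sketched as below. This is a brute-force illustration only: it does not use an R*-tree, and the plane representation (inward-facing `normal` plus `offset`) and the function name are assumptions made for this example, not the PR's actual data structures.

```javascript
// aabb: [xmin, ymin, zmin, xmax, ymax, zmax]
// planes: [{ normal: [x, y, z], offset: d }], where a point p is on the
// inner side of a plane when dot(normal, p) + offset >= 0.
function isAabbOutsideFrustum(aabb, planes) {
    const [xmin, ymin, zmin, xmax, ymax, zmax] = aabb;
    for (const { normal, offset } of planes) {
        // Pick the box corner that lies furthest along the plane normal.
        const px = normal[0] >= 0 ? xmax : xmin;
        const py = normal[1] >= 0 ? ymax : ymin;
        const pz = normal[2] >= 0 ? zmax : zmin;
        // If even that corner is behind the plane, the whole box is outside.
        if (normal[0] * px + normal[1] * py + normal[2] * pz + offset < 0) {
            return true;
        }
    }
    return false;
}
```

The test is conservative: a box it reports as not outside may still lie just beyond a frustum corner, which only costs some extra rendering. A spatial index such as an R*-tree lets whole groups of boxes be accepted or rejected with a single test of their shared bounding box.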

This extension is toggle-able on a per-model basis:

xktLoaderPlugin.load ({
    // Same arguments you would use today
    ...,

    // This toggles on the GPU optimizations
    useDataTextures: {
        // For this model, enable View Frustum Culling
        enableViewFrustumCulling: true,
    },
});

Use them together!

xktLoaderPlugin.load ({
    // Same arguments you would use today
    ...,

    // This toggles on the GPU optimizations
    useDataTextures: {
        // For this model, enable both LOD and VFC mechanisms
        targetLodFps: 15,
        enableViewFrustumCulling: true,
    },
});

What is working:

non-instanced geometry rendering (triangles/edges) and picking
instanced geometry rendering (triangles/edges) and picking
full illumination support
out-of-the-box support for per-object XYZ offsets

What is missing in the `Renderers`

Section Plane support is not yet done
SAO rendering
Not all shader renderers are migrated. The following ones are missing:
- ...DepthRenderer.js
- ...EdgesRenderer.js
- ...NormalsRenderer.js
- ...OcclusionRenderer.js
- ...ShadowRenderer.js
- ...SilhouetteRenderer.js
- ...ColorQualityRenderer.js
even though camera-view ray picking works well, arbitrary ray-origin-plus-direction picking does not work yet

What is not covered by the GPU optimizations (yet; to be decided whether these will also be optimized)

Lines-based models
Point-Set-based models

Some GPU statistics

Today, xeokit uses around 67 bytes/triangle on the GPU

55.8 M triangles model not using instancing

+200k objects
655.23 MB of GPU-RAM
11.15 GPU-bytes/triangle

32 M triangles model using instancing

+100k objects
125.79 MB of GPU-RAM
3.69 GPU-bytes/triangle
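As a quick check of the headline figure, the reported bytes-per-triangle values can be compared against the 67 bytes/triangle baseline. The function name is made up for this example; the numbers are the ones quoted above.

```javascript
const BASELINE_BYTES_PER_TRIANGLE = 67;

// Percentage of GPU memory saved per triangle relative to the baseline.
function savingsPercent(optimizedBytesPerTriangle, baseline = BASELINE_BYTES_PER_TRIANGLE) {
    return (1 - optimizedBytesPerTriangle / baseline) * 100;
}

console.log(savingsPercent(11.15).toFixed(1)); // non-instanced model
console.log(savingsPercent(3.69).toFixed(1));  // instanced model
```

This gives roughly 83.4% for the non-instanced model and 94.5% for the instanced one, consistent with the "more than 80%" claim.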

Mar 25 '22 08:03 tmarti

🎬 A couple demo videos on what a data-texture based approach allows, performance wise.

The tests are done in this model:

https://github.com/xeokit/xeokit-sdk/tree/master/assets/models/xkt/v7/Lyon
+73k objects in the scene

Notice how the frame-rate holds steady even while per-object properties are continuously modified 10 times per second!

Dynamic object coloring 🌈

73k objects colored each individually 10 times per second!

more than 700k colorings per second

https://user-images.githubusercontent.com/2405414/160140308-59a7e7d8-0398-45c2-a557-ccb5a722efc6.mov

Dynamic object coloring and offset 🌈 + 🌊

73k objects colored each individually 10 times per second!

more than 700k colorings per second
more than 700k offsetings per second

https://user-images.githubusercontent.com/2405414/160140790-41cb4caa-0bef-40f4-8ef1-ac8d47dd3d2b.mov
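The kind of update loop shown in these videos could look roughly like the sketch below. It assumes the per-object `colorize` and `offset` properties that xeokit entities expose; the helper names, the random colors and the sine-wave offset are illustrative choices, not the demo's actual code.

```javascript
// Assigns a random RGB color (components in [0, 1]) to every object.
// `objects` is a map of object ID to entity, as in viewer.scene.objects.
function colorizeAll(objects, random = Math.random) {
    let count = 0;
    for (const id in objects) {
        objects[id].colorize = [random(), random(), random()];
        count++;
    }
    return count;
}

// Applies a vertical sine-wave offset based on the elapsed time in seconds.
function offsetAll(objects, timeSeconds, amplitude = 1) {
    let i = 0;
    for (const id in objects) {
        objects[id].offset = [0, amplitude * Math.sin(timeSeconds + i), 0];
        i++;
    }
    return i;
}
```

Calling `colorizeAll(viewer.scene.objects)` from a `setInterval` with a 100 ms period gives 10 updates per second; with about 73k objects that works out to about 730k colorings per second, in line with the figure quoted above.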

Mar 25 '22 14:03 tmarti

   useDataTextures: {
        // For this model, enable both LOD and VFC mechanisms
        targetLodFps: 15,
        enableViewFrustumCulling: true,
    },

Does this mean LOD and Culling are dependent on data textures?

Mar 31 '22 08:03 Amoki

@Amoki Does this mean LOD and Culling are dependent on data textures?

At the moment yes, but it would be quite easy to also support them on the current PerformanceModel w/out data textures 🙂

That said, a really nice thing about using them with data textures is that the culling operations involved are really fast (the two videos above give an idea of how performant it is to update per-entity attributes)

Mar 31 '22 13:03 tmarti

@Amoki @xeolabs here is a branch that introduces the VFC and LOD mechanisms to vanilla PerformanceModel (w/out data-textures):

https://github.com/tribiahq/xeokit-sdk/tree/test-xeokit-master-vfc-lod

The mechanisms there don't need WebGL2 and are toggled a bit differently:

const model = xktLoader.load({
    // as today's xeokit-master
    ...,
    // enable VFC
    enableViewFrustumCulling: true,
    // enable LOD targeting 15 frames per second
    targetLodFps: 15,
});

Apr 02 '22 15:04 tmarti

Just found one issue @tmarti

In this test, I'm loading three copies of the Holter Tower, then double-clicking on a roof to fly to it, then when I back away again, the culling does not seem to restore all objects - see example & video:

https://xeokit.github.io/xeokit-sdk/examples/#BIMOffline_XKT_IFC2glTFConverter_RevitSamples_HolterTower_useDataTextures

Screencast from 21.04.2023 00:08:11.webm

Apr 21 '23 09:04 xeolabs

Super thanks for reporting @xeolabs!

Will check it soon: ~I have the impression that it could be caused by repeated IDs, if the same file is loaded thrice 🙂.~

Apr 21 '23 17:04 tmarti

Just found one issue @tmarti

In this test, I'm loading three copies of the Holter Tower, then double-clicking on a roof to fly to it, then when I back away again, the culling does not seem to restore all objects - see example & video:

@xeolabs, this PR solves it: https://github.com/xeokit/xeokit-sdk/pull/1026

Apr 24 '23 21:04 tmarti

@xeolabs please consider also merging this one 🙂

https://github.com/xeokit/xeokit-sdk/pull/1027

Apr 25 '23 08:04 tmarti

@tmarti Performance is very slow on Android 11, Galaxy Tab - testing with this example

May 12 '23 10:05 xeolabs

@tmarti Performance is very slow on Android 11, Galaxy Tab - testing with this example

A Samsung Galaxy S10+ phone (released in 2019) runs the sample link at >30 fps.

@xeolabs What is the concrete model of the Galaxy Tab you used and what was its release date?

Ref link: https://en.m.wikipedia.org/wiki/Samsung_Galaxy_Tab_series

May 12 '23 10:05 tmarti

It's a Moto G10 Dual SIM Smartphone (XT2127-2), sorry not a Tab

May 12 '23 11:05 xeolabs

Then, if you agree, I will not do anything about this problem in the data-tex mechanisms, as the GPU of the G10 (Adreno 610 within the Snapdragon 460 SoC) is considered low-end by today's standards.

Ref: https://nanoreview.net/en/soc/qualcomm-snapdragon-460

May 12 '23 15:05 tmarti

No problem @tmarti - I think it makes sense to expect users to have decent hardware, for our sort of use cases.

May 12 '23 17:05 xeolabs

Please add this to the list @xeolabs 🎉.

https://github.com/xeokit/xeokit-sdk/pull/1044

This seems to solve the last known problem for data-textures! 🏄

May 14 '23 17:05 tmarti

Nice one @tmarti , merged, thanks!

May 14 '23 17:05 xeolabs
