Hardware instancing improvements
- We have a basic GPU instancing system, which allows world matrices of meshes to be specified in a vertex buffer, and a mesh is instance-rendered on the GPU at each of those locations.
We could do with some improvements:
- currently, only a Matrix4 can be stored in the VB. Ideally we allow a custom VB format to be used, with additional data (colors or anything else), and make that data accessible in the vertex shader code.
- currently the Matrix4 (4 x Vec4) is mapped to these attributes: SEMANTIC_ATTR12 .. 15. Ideally this is flexible, as is the location of additional attributes in the instancing VB.
- currently, when instancing is set on a MeshInstance, frustum culling is turned off for it, and it gets submitted to the GPU at all times. We could:
  - allow the user to specify, per frame, a single AABB for all instances, and use it for culling
  - even automatically calculate the AABB in some cases (add up the AABBs of meshes at their locations).
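For the automatic case, here is a rough sketch of what "adding up" the AABBs could look like. It assumes instances are only translated (no rotation or scale), so the combined box is just the mesh's local AABB offset by every instance position and merged. `Aabb` and `unionInstanceAabb` are made-up names for illustration, not engine API.

```typescript
// Hypothetical axis-aligned box type; not part of the engine.
interface Aabb {
    min: [number, number, number];
    max: [number, number, number];
}

// Merge the mesh's local AABB placed at every instance position.
// `positions` is a flat array of x, y, z triples, one per instance.
// Assumption: translation only; rotated or scaled instances would need
// their local AABB transformed before merging.
function unionInstanceAabb(positions: Float32Array, meshAabb: Aabb): Aabb | null {
    const count = Math.floor(positions.length / 3);
    if (count === 0) {
        return null; // nothing to render, nothing to cull
    }
    const min: [number, number, number] = [Infinity, Infinity, Infinity];
    const max: [number, number, number] = [-Infinity, -Infinity, -Infinity];
    for (let i = 0; i < count; i++) {
        for (let axis = 0; axis < 3; axis++) {
            const p = positions[i * 3 + axis];
            min[axis] = Math.min(min[axis], p + meshAabb.min[axis]);
            max[axis] = Math.max(max[axis], p + meshAabb.max[axis]);
        }
    }
    return { min, max };
}
```

The result is a single box that could be tested against the camera frustum once per frame, instead of skipping culling for the whole MeshInstance.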
Related issue: https://github.com/playcanvas/engine/issues/3485
Regarding the culling, this might work, but ideally we have a nicer solution: https://forum.playcanvas.com/t/how-to-check-which-meshes-are-getting-rendered/28519/10
A few other features that would be great for instancing solutions based on some use cases:
- Dynamic quantity vs Static quantity - sometimes the number of instances is static, but sometimes it is desirable to have a dynamic number of instances which can increase/decrease per frame.
- Instance Object - would be great to have a very slim instance object that stores its associated data based on the instance VB attribute mapping, in an engine-friendly format. For example, it could just have one vec3 for position and one vec3 for color, so the instance object would have two accessors for these properties in the form of Vec3 and Color.
- Dynamic pool - when using instancing with dynamically created/deleted instances, we had to create a pool of IDs, sorted incrementally, from which we would always try to re-use the smallest free instance ID. Periodically we would compare how many instances are active against how many we are drawing, and re-sort them once a threshold is reached. This dynamic re-sorting allowed us to have a large buffer with only a small portion of it rendered when needed, which dramatically reduces the number of triangles. It can also be combined with culling techniques.
- More abstractions - currently creating a custom instanced mesh requires definition of: TypedArray, VertexFormat, VertexBuffer, Shader, Material and MeshInstance. It would be great to have some abstractions to reduce number of low-level preparations, e.g. InstanceObject would simplify data setting.
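To make the Instance Object idea a bit more concrete, here is a minimal sketch of a slim view over the raw instance data. The layout (three floats of position followed by three floats of color) and the names `InstanceView`, `FLOATS_PER_INSTANCE` and `COLOR_OFFSET` are assumptions for illustration only. Plain number triples stand in for the Vec3 and Color accessors to keep the sketch free of engine imports.

```typescript
// Hypothetical per-instance layout: position (x, y, z) then color (r, g, b).
const FLOATS_PER_INSTANCE = 6;
const COLOR_OFFSET = 3;

type Triple = [number, number, number];

// A thin view over one instance's slice of the shared buffer.
// It owns no data; reads and writes go straight to the Float32Array.
class InstanceView {
    private data: Float32Array;
    private base: number;

    constructor(data: Float32Array, index: number) {
        if (index < 0 || (index + 1) * FLOATS_PER_INSTANCE > data.length) {
            throw new RangeError(`instance ${index} is outside the buffer`);
        }
        this.data = data;
        this.base = index * FLOATS_PER_INSTANCE;
    }

    get position(): Triple {
        const b = this.base;
        return [this.data[b], this.data[b + 1], this.data[b + 2]];
    }

    set position(value: Triple) {
        this.data.set(value, this.base);
    }

    get color(): Triple {
        const b = this.base + COLOR_OFFSET;
        return [this.data[b], this.data[b + 1], this.data[b + 2]];
    }

    set color(value: Triple) {
        this.data.set(value, this.base + COLOR_OFFSET);
    }
}

// Usage: write the third instance of a 1000-instance buffer.
const instanceData = new Float32Array(1000 * FLOATS_PER_INSTANCE);
const third = new InstanceView(instanceData, 2);
third.position = [1, 2, 3];
third.color = [1, 0.5, 0];
```

After writing through such views, the typed array would still have to be uploaded to the instancing vertex buffer. That step is left out here because it depends on how the buffer and format were created.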
- Dynamic quantity vs Static quantity - sometimes the number of instances is static, but sometimes it is desirable to have a dynamic number of instances which can increase/decrease per frame.
As long as you allocate a large enough buffer, you can change the count that is used via https://api.playcanvas.com/classes/Engine.MeshInstance.html#instancingCount - is this what you need, or something more?
Yes, that is what we do also, but here is a challenge:
- Let's say you have 8 instances (a small number for example clarity).
- You want instances 1, 2, 3 and 4 to be rendered, so set instancingCount to 4 - all good.
- Then you need to disable rendering of 2 and 3. There are 2 ways you can do this:
  a. Set instances 2 and 3 to be rendered outside of the camera. They waste GPU time, as they still go through the vertex shader pipeline.
  b. Copy the 4th instance's data into the 2nd instance's place, and set instancingCount to 2.
So option b is good, but with large quantities of instances (the whole point of using them), the sorting can become somewhat complex. We've implemented it using a binary-sorted pool of indices: when we want to create a new instance we take the smallest index from the pool, and when removing an instance we push its index back into the pool at the right location. We then set instancingCount to the last index among the active instance indices. Instances inside that range that have to be hidden are additionally rendered outside of the camera, and the data is occasionally re-sorted.
Would be great to have this challenge solved by the engine, as I assume this is a pretty common use case.
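For reference, the index pool described above can be sketched roughly like this. It is a simplified illustration under assumptions, not our production code and not engine API: `InstanceIndexPool` and its members are made-up names, indices are 0-based, and the draw count is taken as the highest active index plus one. The periodic re-sort that compacts hidden instances is left out.

```typescript
// Hands out the smallest free index and tracks how many leading
// instances must be drawn to cover every active one.
class InstanceIndexPool {
    private capacity: number;
    private free: number[] = [];   // kept sorted descending, smallest at the end
    private used: boolean[] = [];
    private high = 0;              // one past the highest active index

    constructor(capacity: number) {
        this.capacity = capacity;
        for (let i = capacity - 1; i >= 0; i--) {
            this.free.push(i);
            this.used.push(false);
        }
    }

    // Take the smallest free index for a new instance.
    acquire(): number {
        const index = this.free.pop();
        if (index === undefined) {
            throw new Error("instance pool exhausted");
        }
        this.used[index] = true;
        if (index + 1 > this.high) {
            this.high = index + 1;
        }
        return index;
    }

    // Return an index to the pool, keeping the free list sorted.
    release(index: number): void {
        if (!this.used[index]) {
            return; // out of range or already free
        }
        this.used[index] = false;
        // Binary search for the insertion point in the descending list.
        let lo = 0;
        let hi = this.free.length;
        while (lo < hi) {
            const mid = (lo + hi) >>> 1;
            if (this.free[mid] > index) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        this.free.splice(lo, 0, index);
        // Shrink the drawn range past any trailing free slots.
        while (this.high > 0 && !this.used[this.high - 1]) {
            this.high--;
        }
    }

    // Value a caller would assign to instancingCount.
    get drawCount(): number {
        return this.high;
    }

    get activeCount(): number {
        return this.capacity - this.free.length;
    }
}
```

The gap between `drawCount` and `activeCount` is the number of hidden instances still going through the vertex shader, so it is a natural threshold for deciding when a re-sort is worth it.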
The intended changes are done here: https://github.com/playcanvas/engine/pull/6867
I'll close this issue, as I believe the features @Maksims is asking for do not belong in the core engine, but rather in some library built on top of it.