Custom objects, step #5: vertex extensions and instancing
This is a major departure from PR 5 as proposed here.
It does, however, build on the previous PRs. In particular, it expands significantly on the geometry-customization and shader-patching features introduced in PR 3.
Specifically, it allows one to write “vertex extensions”, which add per-point information (attributes) to any display object’s fill or stroke, and to read that information from vertex kernels.
Where available, hardware instancing is also allowed. This can be done via attributes—single- or multi-instance—or with instance IDs, according to what is supported.
Some lower-level APIs have also seen some streamlining.
=========================
This adds a few new APIs and removes some defunct ones, so it changes both Rtt_ApplePlatform.mm and Main.cpp, as well as CoronaGraphics.* and CoronaObjects.h.
=========================
PR 3 mentioned that Solar’s geometry only honors one vertex format. That is no longer true.
The following paragraphs describe the situation:
Geometry is either added to a large CPU-side stream or, in the case of indexed meshes, into a GPU-side object. The format is either contained in a VAO or handled in a Bind() event, where the vertex size is locked in.
On the CPU side, the Bind() tends to stay in effect for a long time, only being changed out when the last stream is exhausted or when switching out an indexed mesh. Provided space exists, new geometry is batched right into the stream without incident.
Some operations also comb through the geometry before it gets submitted to a batch.
=========================
The new changes were designed to work with, or adapt to, these facts.
For compatibility, any non-standard vertex data is stored in a separate channel in the Geometry objects. This is also a Vertex array: the extra data size is rounded up to a multiple of sizeof(Vertex), and laid out accordingly. This will usually waste some space, but makes splicing the new and original channels together fairly straightforward.
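The rounding described above can be sketched as follows. This is only an illustration: the Vertex struct here is a hypothetical stand-in for the engine's standard vertex, and the helper names are made up for this example rather than taken from the PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the standard vertex; the engine's real layout may differ.
struct Vertex {
    float x, y, z;          // position
    float u, v, q;          // texture coordinates
    unsigned char rgba[4];  // color
    float ux, uy, uz, uw;   // user data
};

// Number of Vertex-sized slots needed to hold one point's extension data.
inline std::size_t ExtraSlotsPerVertex(std::size_t extraBytes) {
    assert(extraBytes > 0);  // an extension adds at least one attribute
    return (extraBytes + sizeof(Vertex) - 1) / sizeof(Vertex);  // round up
}

// Bytes actually reserved per point in the extra channel.
inline std::size_t PaddedExtraSize(std::size_t extraBytes) {
    return ExtraSlotsPerVertex(extraBytes) * sizeof(Vertex);
}

// Bytes left unused per point because of the rounding.
inline std::size_t WastedBytes(std::size_t extraBytes) {
    return PaddedExtraSize(extraBytes) - extraBytes;
}
```

Under this assumed layout, an extension with two vec2 attributes (16 bytes) fits in a single slot, and WastedBytes() reports the remainder of that slot as padding.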
The CPU- and GPU-side splicing differ slightly, but preserve the original "comb through the geometry" logic. When batching, we simply combine elements as we go, since we are moving into separate memory anyway. On the GPU, meanwhile, we want to keep the data together—so one upload—but also avoid intermediate allocations. The extra channel thus also includes space for the standard data; it is updated with the current contents and then we upload using this array.
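A minimal sketch of the CPU-side case, combining the two channels element by element while copying into the batch stream. The simplified Vertex type, the SpliceIntoBatch name, and the choice to place each point's extra slots directly after its standard vertex are assumptions made for illustration; the GPU-side path is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, hypothetical vertex; only here to give the channels a common element type.
struct Vertex {
    float x, y, z, w;
};

// Appends every point to the batch stream as its standard vertex followed by
// its extra slots. Both channels are Vertex arrays, so this is a plain copy.
void SpliceIntoBatch(std::vector<Vertex>& stream,
                     const std::vector<Vertex>& standard,
                     const std::vector<Vertex>& extra,
                     std::size_t slotsPerVertex) {
    assert(extra.size() == standard.size() * slotsPerVertex);
    stream.reserve(stream.size() + standard.size() * (1 + slotsPerVertex));
    for (std::size_t i = 0; i < standard.size(); ++i) {
        stream.push_back(standard[i]);
        for (std::size_t s = 0; s < slotsPerVertex; ++s) {
            stream.push_back(extra[i * slotsPerVertex + s]);
        }
    }
}
```

Because the extra channel is padded to whole Vertex slots, the spliced stride is always (1 + slotsPerVertex) * sizeof(Vertex), which keeps the copy loop free of byte-level arithmetic.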
The revised batching calls for some extra bookkeeping, thus the new “extra” and “offset” fields in the renderer.
The Bind()
logic has gone through some rearrangement, since it must accommodate changes in the vertex size and format. This also avoids some redundant attribute bindings.
Instanced attributes mirror a lot of what their normal cousins do, including the CPU-side geometry pool. They also figure into how the binding logic was rearranged, since all attributes use a similar process.
Shaders are key to using vertex extensions. The idea behind them is to allow more elaborate types than vanilla built-in display objects, in turn giving shaders more power.
This is available in Lua through the graphics.defineVertexExtension() function. The more low-level CoronaGeometryRegisterVertexExtension() is also available.
After an extension is defined, we can provide its name to graphics.defineEffect() and it will be used during compilation.
=========================
In the previous PRs, non-draw operations would use CoronaRendererDo() to inject an action. This worked to some extent, but required some awkward steps to interact with batching, probably incurring some major memory costs.
Some of these actions would also require cleanup, thus the end-of-frame ops.
This PR adds a new approach: registering a data blob with some default state, possibly all 0s. We can call CoronaRendererWriteStateBlock() to edit this blob’s contents. If any changes are detected, a dirty handler will be called when the current geometry is batched. At the end of the frame, a similar dirty handler is used to restore to default.
Writing appropriate dirty state handlers gives us both the CoronaRendererDo() behavior (with better timing, at that) and the end-of-frame logic. As such, the state block APIs replace those bits.
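The state block life cycle described above might be modeled roughly like this. The StateBlock class and its method names are hypothetical and do not mirror the native API. The sketch also assumes a single handler serves both the batch-time and end-of-frame cases, whereas the PR only says the end-of-frame handler is "similar".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of a state block: a blob with default contents, pending
// edits, and a handler that fires only when the contents actually change.
class StateBlock {
public:
    using Blob = std::vector<unsigned char>;
    using DirtyHandler = std::function<void(const Blob& from, const Blob& to)>;

    StateBlock(Blob defaults, DirtyHandler onDirty)
        : fDefaults(std::move(defaults)),
          fCommitted(fDefaults),
          fPending(fDefaults),
          fOnDirty(std::move(onDirty)) {}

    // Edits the blob; nothing is applied until the next batch.
    void Write(const void* data, std::size_t size) {
        assert(size == fPending.size());
        std::memcpy(fPending.data(), data, size);
    }

    // Called when the current geometry is batched.
    void OnBatch() { Commit(fPending); }

    // Called at the end of the frame: restore the default contents.
    void OnEndOfFrame() {
        fPending = fDefaults;
        Commit(fDefaults);
    }

private:
    void Commit(const Blob& target) {
        if (target != fCommitted) {
            fOnDirty(fCommitted, target);  // handler sees old and new contents
            fCommitted = target;
        }
    }

    Blob fDefaults;   // state restored at end of frame
    Blob fCommitted;  // state last applied by the handler
    Blob fPending;    // state as edited since the last batch
    DirtyHandler fOnDirty;
};
```

In this model, writes are cheap and deferred: several Write() calls between batches collapse into at most one handler call, and a frame that never touches the blob triggers no handler at all.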
PR 4 mentioned using a dummy stage bounds to keep 3D objects visible despite abusing their x and y properties. We also used a global clear op to wipe the depth buffer. These and similar things were rife with problems.
Some of the logic here has been restructured, but the major improvement has been moving the work they do into display.setDefault(). Clearing the depth and stencil buffers is now recognized by the core and done if certain defaults are active. In that same vein, flags may be temporarily enabled to create display objects with behaviors baked in, akin to what is done with textures. Some groundwork has also been laid to incorporate depth and stencil buffers into snapshots and textures. (An earlier PR played with this idea.)
=========================
As before, the comments in the native API should roughly represent the future docs. Attached are some preliminary docs covering the Lua and shader APIs. These cover PRs 1-5.
=========================
I tested using this sample.
It does different things according to the instancing capabilities available: if replication or instancing IDs are available, it can use certain vertex kernels, or a fallback otherwise.
The “testVec2s” shader uses a “Vec2s” extension. This adds a couple of vec2 attributes, and these are in turn used to produce a tinted, distorted quad. (If instancing is available, it is also replicated, gradually, in a few positions.) These emulate features already available in Solar, of course; it’s merely a test.
If replication support is available, the “testWindow” shader is demonstrated, using the “ColorMix” extension. Five color attributes are loaded: one rect is then drawn, using the window over attributes 1-3; another uses 2-4; the last uses 3-5.
If multi-instance replication is available, a y-offset is used, with a 2-instance replication: the first two objects have one y-offset, the third has another.
Some circles are shown below those rects, with the same colors, for reference.
The “test” shader expands on the 3D example from PR 4’s sample, adding a “normal” attribute. The normals are dotted with a fixed up vector to shadow the object a bit.
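The shading idea can be illustrated outside the shader with a small CPU-side sketch. The sample does this in a vertex kernel; the up vector value, the ambient floor, and the clamping below are all assumptions picked for the example, not taken from the sample.

```cpp
#include <algorithm>
#include <cassert>

struct Vec3 {
    float x, y, z;
};

inline float Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Brightness factor for a unit normal: full brightness when facing "up",
// falling off to the ambient floor as the normal turns away.
inline float Shade(const Vec3& normal,
                   const Vec3& up = Vec3{0.0f, 1.0f, 0.0f},
                   float ambient = 0.5f) {
    assert(ambient >= 0.0f && ambient <= 1.0f);
    float facing = std::max(0.0f, Dot(normal, up));  // ignore back-facing contribution
    return ambient + (1.0f - ambient) * facing;
}
```

Multiplying a vertex color by this factor darkens faces that point away from the up vector, which is enough to give a flat-colored model some visible relief.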
Instancing is also used when possible, using another offset attribute to separate them. This demonstrates that instancing works with indexed geometry. (There is also an “unused” attribute.)
Finally, a “normalsOnlyTest” is used on the same 3D model. This shader uses an extension that only adds the “normal” attribute. This demonstrates use of a subset-targeted shader on a broader geometry: it skips the irrelevant “xOffset” and “unused” attributes.
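One way to picture the subset case: walk the geometry's full attribute layout, bind only what the shader's extension declares, and still advance the offset past everything else. The types, the BindSubset name, matching by attribute name, and the component counts in the usage note are assumptions for illustration; the PR does not spell out how the engine matches attributes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one float-based attribute in the geometry's extra data.
struct Attribute {
    std::string name;
    std::size_t floatCount;
};

// Hypothetical result: where a shader-visible attribute lives in the extra data.
struct Binding {
    std::string name;
    std::size_t byteOffset;
    std::size_t floatCount;
};

// Binds only the attributes the shader asks for. Skipped attributes still
// occupy space, so the offset advances for every entry in the layout.
std::vector<Binding> BindSubset(const std::vector<Attribute>& geometryLayout,
                                const std::vector<std::string>& shaderAttributes) {
    std::vector<Binding> bindings;
    std::size_t offset = 0;
    for (const Attribute& attr : geometryLayout) {
        assert(attr.floatCount >= 1 && attr.floatCount <= 4);
        bool wanted = std::find(shaderAttributes.begin(), shaderAttributes.end(),
                                attr.name) != shaderAttributes.end();
        if (wanted) {
            bindings.push_back({attr.name, offset, attr.floatCount});
        }
        offset += attr.floatCount * sizeof(float);
    }
    return bindings;
}
```

With a layout of “normal” (3 floats), “xOffset” (1) and “unused” (1), asking only for “normal” yields a single binding, while asking only for “xOffset” yields a binding whose offset skips past the normal data.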