cesium Separate pass for billboards

There are several interrelated billboard issues that could be solved by rendering billboards in their own pass. It may be a requirement for https://github.com/CesiumGS/cesium/issues/6840 and is necessary architecture for https://github.com/CesiumGS/cesium/issues/4235.

@YoussefV you should first check out https://github.com/CesiumGS/cesium/issues/6840 which links to https://github.com/CesiumGS/cesium/pull/6802#issuecomment-405415474 which links to https://github.com/CesiumGS/cesium/issues/5121. When billboard.disableDepthTestDistance is infinity, the billboard gets clamped to the near plane so it renders above everything else, but since it still writes depth it interferes with the camera controller. Disabling depth writes on the billboard is not enough: primitives rendered after the billboard will overwrite the billboard's color, which is not desired. However, disabling depth writes could work if the billboard were rendered after everything else in the scene.

Now comes the tie-in for https://github.com/CesiumGS/cesium/issues/4235. There's some discussion in that issue about rendering billboards in a separate pass in order to improve rendering quality. The two things that really hurt billboard (and label) quality at the moment are viewer.useBrowserRecommendedResolution and FXAA. We set viewer.useBrowserRecommendedResolution to true by default because it significantly improves performance on high-DPI displays, but it tends to hurt the quality of text. FXAA is disabled by default, but when it's enabled it also tends to blur text. MSAA is a better alternative to FXAA, but it neither helps nor hurts billboard quality since it's geometric anti-aliasing. This sandcastle compares how these options affect billboard quality: sandcastle. If we want to render the crispest text possible, we will likely need to render billboards to a native-res framebuffer in a separate pass from the rest of the scene.

@YoussefV for now let's concentrate on fixing https://github.com/CesiumGS/cesium/issues/6840, but make the right architectural decisions for https://github.com/CesiumGS/cesium/issues/4235. Roughly what this involves is:

Render billboards in their own Pass after the translucent pass. It will consist of three "subpasses":
- Render opaque billboards with depth test/write enabled
- Render translucent billboards with depth test enabled, depth write disabled
- Render billboards with disableDepthTestDistance === Number.POSITIVE_INFINITY with depth test/write disabled
If billboards aren't rendered in the translucent OIT pass anymore, we may start to see alpha blending problems. Billboards may need to be sorted by depth every frame, which could be prohibitively expensive, though sorting would also help fix https://github.com/CesiumGS/cesium/issues/6838.
Don't worry about rendering to a different framebuffer or anything yet, but I think we'll need to do it eventually.
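
To make the sub-pass ordering above concrete, here is a minimal sketch in TypeScript. It is not CesiumJS internals: `BillboardInfo`, `classify`, and `buildDrawOrder` are hypothetical names, and using a single `alpha` value to decide translucency is a simplifying assumption. It only shows how billboards could be bucketed into the three sub-passes, with the translucent bucket sorted back to front.

```typescript
// Hypothetical types for illustration only; these are not CesiumJS internals.
type SubPass = "opaque" | "translucent" | "alwaysOnTop";

interface BillboardInfo {
  id: string;
  alpha: number;                    // assumption: 1.0 means fully opaque
  disableDepthTestDistance: number; // mirrors the public Billboard property
  distanceToCamera: number;         // assumed to be computed elsewhere each frame
}

interface DepthState {
  depthTest: boolean;
  depthWrite: boolean;
}

// Depth state per sub-pass, as listed in the plan above.
const DEPTH_STATE: Record<SubPass, DepthState> = {
  opaque: { depthTest: true, depthWrite: true },
  translucent: { depthTest: true, depthWrite: false },
  alwaysOnTop: { depthTest: false, depthWrite: false },
};

function classify(b: BillboardInfo): SubPass {
  if (b.disableDepthTestDistance === Number.POSITIVE_INFINITY) {
    return "alwaysOnTop";
  }
  return b.alpha < 1.0 ? "translucent" : "opaque";
}

function buildDrawOrder(
  billboards: BillboardInfo[],
): { subPass: SubPass; state: DepthState; ids: string[] }[] {
  const buckets: Record<SubPass, BillboardInfo[]> = {
    opaque: [],
    translucent: [],
    alwaysOnTop: [],
  };
  for (const b of billboards) {
    buckets[classify(b)].push(b);
  }
  // Without OIT, translucent billboards need back-to-front order (farthest first).
  buckets.translucent.sort((a, b) => b.distanceToCamera - a.distanceToCamera);

  const order: SubPass[] = ["opaque", "translucent", "alwaysOnTop"];
  return order.map((subPass) => ({
    subPass,
    state: DEPTH_STATE[subPass],
    ids: buckets[subPass].map((b) => b.id),
  }));
}
```

The per-frame sort is the part flagged above as potentially expensive; a real implementation would probably want to sort only when the camera or the billboards have moved.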

Apr 28 '20 21:04 lilleyse

Any progress on this?

Apr 20 '21 21:04 Nadav42

Has the option of rendering billboards as HTML+CSS3D overlays been considered? Very similar advantages to rendering in a separate pass, with a few other benefits:

Crisp text, no high-resolution framebuffer, let the browser handle antialiasing and outlines
- https://github.com/CesiumGS/cesium/issues/4235
- https://github.com/CesiumGS/cesium/issues/8155
- https://github.com/CesiumGS/cesium/issues/11377
Support text selection / highlighting (can be disabled)
Support more languages
- https://github.com/CesiumGS/cesium/issues/2521
Support emoji
Display billboards with custom CSS styling, form elements, and interactivity
- https://github.com/CesiumGS/cesium/issues/7525
- https://github.com/CesiumGS/cesium/issues/1247
Perhaps an easier path to supporting annotation leader lines? (using SVG)

For examples of the technique, see:

https://modelviewer.dev/examples/annotations/index.html
https://doc.babylonjs.com/features/featuresDeepDive/babylonViewer/hotspots/#annotations
https://threejs.org/manual/examples/align-html-elements-to-3d-globe.html

Perhaps less obviously, CSS3D transforms can transform HTML with a view matrix, as if it were part of the 3D scene:

https://threejs.org/examples/?q=css3d#css3d_periodictable
https://codesandbox.io/p/sandbox/crazy-germain-6oei7?file=%2Fsrc%2FApp.js

The main downside of the technique is that HTML content doesn't participate in the depth buffer for occlusion, and >1000 HTML overlays would likely reduce performance. Occlusion can be "faked" by tracking an oriented anchor point for each billboard, and reducing billboard opacity when that anchor point is occluded or facing away from the viewer.

The techniques are complementary — there certainly are times we'd need to render SDF/MSDF/... text in WebGL, with full participation in the 3D scene! — but my hunch is that >50% of billboard examples in the Sandcastle gallery might benefit from HTML+CSS3D overlays. Notable exceptions might include very large numbers of labels, labels wrapped onto terrain or winding roads, etc.

Nov 12 '25 21:11 donmccurdy

Another benefit — for either a separate pass in WebGL, or an HTML overlay — would be to avoid issues in "HDR" mode like ...

https://github.com/CesiumGS/cesium/issues/7957
https://github.com/CesiumGS/cesium/issues/8180

... for billboards and labels at least, because the overlay pass could be drawn after tone mapping.

Nov 13 '25 16:11 donmccurdy

Taking a look at some of the examples above, I actually quite like this as a solution. Let the browser handle text elements instead of us! If we were designing Cesium from the ground up, I would be in favor of starting with this approach. But since we want to respect backwards compatibility, maybe this would serve as a new pipeline entirely (with an easy way for users to opt-in / migrate existing billboard/label workflows).

Is it fair to say that, in addition to not really participating in occlusion, these html overlay elements couldn't really participate in transparency blending, either?
What would Cesium have to do each frame for one of these overlays? Update its anchor point's position and orientation to determine if it's facing the camera? (More specifically, what sort of globe math do we need to do? A terrain pick? (Very slow))
Has anyone written up about their performance anywhere? I suspect it's fine to be honest, maybe even at large scales.

Nov 14 '25 16:11 mzschwartz5

Just to point out for the sake of discussion, we do "support" HTML overlays, but the API is very manual and does not align with existing CesiumJS workflows. And it doesn't support occlusion or depth testing without a user manually implementing it.

Nov 14 '25 17:11 ggetz

There still are many details to be sorted out (and I certainly didn't read through all related issues, and even less the code). But I remember trying to use "billboards" once, and eventually resorted to some manual HTML twiddling for "mouse tooltips" (in the 3d-tiles-samples examples).

Having a more streamlined way for doing ~"something like this" (in addition to any existing billboard functionality) could certainly be helpful for many use-cases (and avoid many of the issues that are caused by people assuming that billboards might be the easiest or most idiomatic solution for their case, even though in some cases, they aren't...)

Nov 14 '25 17:11 javagl

This could be more powerful as a general HTML overlay API rather than something just for billboards. A similar API exists in MapLibre: https://www.maplibre.org/maplibre-gl-js/docs/API/classes/Popup/

Nov 14 '25 19:11 Beilinson

> Is it fair to say that, in addition to not really participating in occlusion, these html overlay elements couldn't really participate in transparency blending, either?

Agreed! Fair to say elements in the scene couldn't blend with, or "properly" occlude, HTML overlays. Occlusion can be faked, to some extent, by fading the entire overlay out based on some precomputed visibility-by-orientation data, or raycasting or GPU picking to the overlay's anchor point, but ... there are definitely limits to this.

HTML overlays could be transparent, and blend on top of the scene or other overlays, however. This should make it easier to avoid cases where overlays clip or z-fight with one another.

> What would Cesium have to do each frame for one of these overlays? Update its anchor point's position and orientation to determine if it's facing the camera? (More specifically, what sort of globe math do we need to do? A terrain pick? (Very slow))

The ModelViewer approach is, approximately, to store a position and (optional) normal vector, projecting the normal into screen space and fading the annotation out when the vector no longer faces the camera. If annotations are attached to a surface, it's likely best to use the normal vector of that surface; they provide an editor (https://modelviewer.dev/editor) to help with manual annotation.
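
As a rough illustration of the fade-by-orientation idea, here is a sketch in TypeScript. It is not ModelViewer's actual code: it compares the normal with the direction to the camera in world space rather than projecting the normal into screen space, and `facingOpacity`, `fadeStart`, and `fadeEnd` are hypothetical names chosen for this example.

```typescript
type Vec3 = [number, number, number];

function sub(a: Vec3, b: Vec3): Vec3 {
  return [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
}

function dot(a: Vec3, b: Vec3): number {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

function normalize(v: Vec3): Vec3 {
  const len = Math.hypot(v[0], v[1], v[2]);
  // A zero-length vector cannot be normalized; return it unchanged.
  return len === 0 ? v : [v[0] / len, v[1] / len, v[2] / len];
}

// Returns an opacity in [0, 1]: 1 while the anchor's normal faces the camera,
// fading linearly to 0 as the normal turns away from it.
// fadeStart and fadeEnd are cosines of the angle between the normal and the
// direction to the camera (assumed defaults, tune to taste).
function facingOpacity(
  anchor: Vec3,
  normal: Vec3,
  cameraPosition: Vec3,
  fadeStart = 0.2,
  fadeEnd = 0.0,
): number {
  const toCamera = normalize(sub(cameraPosition, anchor));
  const facing = dot(normalize(normal), toCamera);
  const t = (facing - fadeEnd) / (fadeStart - fadeEnd);
  return Math.min(1, Math.max(0, t));
}
```

This costs one subtraction, two normalizations, and a dot product per visible marker per frame on the CPU, with no picking or raycasting involved.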

Possible scenarios ...

Simple: Marker is anchored to a surface with a known normal vector, or we do a one-time raycast into terrain to compute that when the marker is initialized, and transform the vector to screen space (on CPU) each frame for visible markers.

Complex: Use GPU picking to check visibility of each marker's anchor point. I believe it should be possible to do only one GPU multi-picking pass, regardless of the number of annotations, but I don't know if I see that as part of an MVP.

Finally the CSS transform and visibility for each HTML overlay must be updated.
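
The final per-frame update could look roughly like the sketch below. It assumes a screen-space position has already been obtained, for example from something like CesiumJS's `SceneTransforms.worldToWindowCoordinates`, which can return `undefined` when a point cannot be projected. `overlayStyle` and `OverlayStyle` are hypothetical names, and the function only computes style values so that it stays independent of the DOM.

```typescript
interface ScreenPoint {
  x: number;
  y: number;
}

interface OverlayStyle {
  transform: string;
  opacity: string;
  display: string;
}

// Computes CSS values for one overlay.
// `screen` is the anchor's position in CSS pixels, or undefined when the
// anchor could not be projected (assumption: the caller handles projection).
function overlayStyle(screen: ScreenPoint | undefined, opacity: number): OverlayStyle {
  const clamped = Math.min(1, Math.max(0, opacity));
  if (screen === undefined || clamped === 0) {
    // Hide entirely so the browser can skip painting the element.
    return { transform: "none", opacity: "0", display: "none" };
  }
  // Round to whole pixels to keep text crisp; the second translate shifts the
  // element so its bottom center sits on the anchor point.
  const x = Math.round(screen.x);
  const y = Math.round(screen.y);
  return {
    transform: `translate(${x}px, ${y}px) translate(-50%, -100%)`,
    opacity: String(clamped),
    display: "block",
  };
}
```

In a browser, the result could be applied with `Object.assign(element.style, overlayStyle(point, opacity))`, assuming the element is absolutely positioned at the top-left corner of the canvas container.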

> Has anyone written up about their performance anywhere? I suspect it's fine to be honest, maybe even at large scales.

Not that I'm aware of, but probably it will mean this isn't a complete replacement for existing labels/billboards (even if there were no other reasons). As a ballpark estimate let's assume each overlay costs the browser the equivalent of a draw call, so if the total is ~1000+, it becomes important to do frustum culling, occlusion testing, and/or thinning to avoid having that many in the DOM at once.
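
One way to keep the DOM count bounded is sketched below. The `Candidate` shape, `selectOverlays`, and the "nearest first" priority are assumptions made for illustration; the culling and occlusion results are taken as inputs rather than computed here.

```typescript
interface Candidate {
  id: string;
  inFrustum: boolean;       // result of frustum culling, computed elsewhere
  opacity: number;          // 0 when occluded or facing away
  distanceToCamera: number;
}

// Returns the ids of the overlays that should exist in the DOM this frame:
// visible candidates only, nearest first, capped at maxOverlays.
function selectOverlays(candidates: Candidate[], maxOverlays: number): string[] {
  return candidates
    .filter((c) => c.inFrustum && c.opacity > 0) // filter() copies, so the input is not reordered
    .sort((a, b) => a.distanceToCamera - b.distanceToCamera)
    .slice(0, maxOverlays)
    .map((c) => c.id);
}
```

Overlays that drop out of the selection could be detached or pooled rather than merely hidden, so the browser has fewer elements to lay out.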

An API similar to MapLibre's Popup might be a nice approach — leaning into the strengths of HTML/CSS while setting expectations that creating 100M of these might not be the right tool for the job. And let the WebGL-based billboards lean into what they're better at, without so much pressure to handle things like rounded corners, larger font and image sizes at high resolution, etc.

Nov 17 '25 17:11 donmccurdy