bevy icon indicating copy to clipboard operation
bevy copied to clipboard

Retain asset without data for RENDER_WORLD-only assets

Open robtfm opened this issue 1 month ago • 5 comments

Objective

when RenderAssets with RenderAssetUsages::RENDER_WORLD and without RenderAssetUsages::MAIN_WORLD are extracted, the asset is removed from the assets collection. this causes some issues:

  • systems which rely on the asset, like picking with meshes, fail with "asset not found" errors which are unintuitive.
  • loading the asset by path a second time results in the asset being reloaded from storage, re-extracted and re-transferred to gpu, replacing the existing asset
  • knowledge about the asset state is lost, we cannot tell if an asset is already loaded with AssetServer::get_handle
  • metadata (image size, e.g.) is no longer available for the asset

Solution

extraction:

  • add take_gpu_data to the RenderAsset trait. use it to pull the data out of the asset for transfer, and leave the empty asset in the collection. default implementation just clones the asset.
  • if the data has already been taken, ~~panic. this follows from modifying an asset after extraction, which is always a code error, so i think panic here makes sense~~ log an error

Mesh/RenderMesh:

  • make Mesh::attributes and Mesh::indices options
  • take them on extraction
  • expect operations which access or modify the vertex data or indices if it has been extracted. accessing the vertex data after extraction is always a code error. fixes #19737 by resulting in the error Mesh has been extracted to RenderWorld. To access vertex attributes, the mesh must have RenderAssetUsages::MAIN_WORLD
  • provide try_xxx operations which allow users to handle the access error gracefully if required (no usages as part of this pr, but provided for future)
  • compute the mesh Aabb when gpu data is taken and store the result. this allows extracted meshes to still use frustum culling (otherwise using multiple copies of an extracted mesh now panics as compute_aabb relied on vertex positions). there's a bit of a tradeoff here: users may not need the Aabb and we needlessly compute it. but i think users almost always do want them, and computing once (for extracted meshes) is cheaper than the alternative, keeping position data and computing a fresh Aabb every time the mesh is used on a new entity.

Image/GpuImage:

images are a little more complex because the data can be deliberately None for render-targets / GPU-written textures where we only want an uninitialized gpu-side texture.

  • take Image::data on extraction
  • record on the resulting GpuImage whether any data was found initially
  • on subsequent modifications with no data, panic if there was data previously

corner case / issue: when used with RenderAssetBytesPerFrameLimiter there may be no previous gpu asset if it is still queued pending upload due to the bandwidth limit. this can result in a modified image with initial data skipping the had_data check, resulting in a blank texture. i think this is sufficiently rare that it's not a real problem, users would still hit the panic if the asset is transferred in time and the problem/solution should be clear when they do hit it.

ShaderStorageBuffer/GpuShaderStorageBuffer

follows the same pattern as Image/GpuImage:

  • take ShaderStorageBuffer::data on extraction
  • record on the resulting GpuShaderStorageBuffer whether any data was found initially
  • on modifications with no data, panic if there was data previously

we don't have the queue issue here because GpuShaderStorageBuffer doesn't implement byte_len so we can't end up queueing them.

other RenderAssets

i didn't modify the other RenderAsset types (GpuAutoExposureCompensationCurve, GpuLineGizmo, RenderWireframeMaterial, PreparedMaterial, PreparedMaterial2d, PreparedUiMaterial) on the assumption that ~~cloning these is cheap enough anyway~~ the asset usages are not exposed so we should never call take_gpu_data. the default implementation panics with a message directing users to implement the method if required

Testing

only really tested within my work project. i can add some explicit tests if required.

robtfm avatar Nov 03 '25 13:11 robtfm

I wonder if it would be better having an enum like this just to have some more information

enum RenderAssetData {
    Available(Vec<u8>),
    Unavailable,
    SentToRenderWorld
}

hukasu avatar Nov 03 '25 14:11 hukasu

maybe ... i tried to avoid changing the api of the main-world types (particularly Image::data which is pub) but maybe it's worthwhile

robtfm avatar Nov 03 '25 14:11 robtfm

  • added MeshExtractableData enum to avoid the nasty nested options / make the intent clearer.
  • added try_xxx functions to avoid 100s of unwraps in the cases where we know the access is valid (and reduce the breaking changes to the public api).
  • changed the extract functions to return an error, and the default impl to return AssetExtractionError::NoExtractionImplementation.

i still need to do a pass through to fix up docs for all the copy/pasted functions, but the logic should be complete now i think.

robtfm avatar Nov 16 '25 18:11 robtfm

it would be nice to have a MeshBuilder with infallible access methods. then all the Mesh accessors could be result-based without scattering 100s of unwraps about.

i don't think i want to do it as part of this PR though.

robtfm avatar Nov 26 '25 11:11 robtfm

Ah, my bad, I should have been clearer that I think this PR is fine as it is. It would have been simpler if MeshBuilder already existed, but c'est la vie.

greeble-dev avatar Nov 26 '25 11:11 greeble-dev

Code seems reasonable - just got a couple of requests to avoid some corner cases. I also tested a variety of examples, and hacked in a few extra tests (including glTFs with RENDER_WORLD only, and making some examples deliberately trigger errors).

I don't like how the Mesh interface has ended up, but I'm not sure that's a problem with this PR. My hot take is that it's a problem with Mesh trying to be three things at once:

  1. A finished, immutable asset that can be efficiently shared and rendered.
  2. A way to incrementally build a mesh by adding/modifying attributes and suchlike.
  3. A way to animate meshes by mutating an asset in-place and re-extracting.

I think the solution could be:

  • Add a new MeshBuilder struct and move all the Mesh mutation methods to MeshBuilder.

    • This struct wouldn't have to deal with MeshAccess errors since it's entirely CPU side, so a load of try_xxx methods disappear.
    • A finish function would consume the builder and return a Mesh.
  • Change Mesh to be immutable, except by take_gpu_data.

    • Animation that wants to mutate the mesh now has to create a fresh Mesh and overwrite the entire asset.

That gives as clear separation between 1) and 2), but makes 3) more awkward and inefficient. I'd argue that's ok in the short-term since it still basically works and mutating the asset is always going to be somewhat inefficient. In the long-term there could be a separate way of efficiently sending fine-grained mesh updates to the GPU.

The Mesh/MeshBuilder separation might also make compressed and derived data (like AABBs) more robust and efficient. So MeshBuilder::finish would automatically calculate AABBs and optionally apply compression.

I totally agree with @greeble-dev. Related to https://github.com/bevyengine/bevy/issues/17000. My quick thoughts are moving all the current Mesh methods to something like InfallibleMesh, too and making Mesh only contains packed vertex+indices data for GPU and necessary info about attributes layout, morph targets, aabb, etc.

So copying vertex and indices to MeshAllocator is fast and no longer needs repacking data. This opens a door to direct constructing Mesh and mutating Mesh in-place if you have the raw vertex and indices data and info and want to avoid repacking.

I think the current approach of extracting vertex attributes is suboptimal. And Mesh and InfallibleMesh have many identical methods, making things confusing.

Edit: Oh, think again, I can see the reason to store attributes on the Mesh since mesh_picking or physics engine may need to read the Mesh’s vertex positions so we can’t just store interleaved vertex attributes. In that case, the above is less relevant and I think the current PR is fine. See discussion in #21986

beicause avatar Dec 08 '25 10:12 beicause

Regarding the Mesh interface, I‘d prefer to remove these try_xxx methods and let them panic. Adding a is_extracted_to_render_world method for checking is sufficient IMO, these try_xxx methods are rarely useful.

beicause avatar Dec 11 '25 05:12 beicause

Regarding the Mesh interface, I‘d prefer to remove these try_xxx methods and let them panic. Adding a is_extracted_to_render_world method for checking is sufficient IMO, these try_xxx methods are rarely useful.

since the change to results was at request of @andriyDev (and @alice-i-cecile) i'll defer to them. i agree fwiw

robtfm avatar Dec 11 '25 10:12 robtfm

And yeah, I prefer keeping the try_ function that return results. We should generally try to avoid panics as much as possible. I've hit way too many unexpected panics in rendering code. Arguably, we shouldn't even have a panicky variant but that's a completely different discussion.

IceSentry avatar Dec 12 '25 19:12 IceSentry

I much prefer making users implement this explicitly - pretty much every render asset should want to handle this correctly I think, especially since the default is just breaking.

there are several RenderAssets that don't implement it and don't need to because their RenderAssetUsages are not exposed. it's not useful to make authors add it to types that don't support it, as i mentioned last time.

robtfm avatar Dec 12 '25 23:12 robtfm

And yeah, I prefer keeping the try_ function that return results. We should generally try to avoid panics as much as possible. I've hit way too many unexpected panics in rendering code. Arguably, we shouldn't even have a panicky variant but that's a completely different discussion.

My concern is that if we add more extractable resources (like RawMesh, etc.), we will also follow a similar pattern of providing try_xxx functions, which feels really bad. Also, If the extractable data isn’t consolidated, we have to check each field individually to determine if a resource data is complete, it's inconvenient.

beicause avatar Dec 12 '25 23:12 beicause

we will also follow a similar pattern of providing try_xxx functions

you've seen the InfallibleMesh pr i made on top of this, right? there it would still be an extra unwrap() on the base function, but wouldn't have any duplicated try_ functions.

robtfm avatar Dec 12 '25 23:12 robtfm

Another possible approach I consider is to consolidate the extractable data and access it through something similar to entry API, thereby avoiding multiple unwrap operations.

beicause avatar Dec 12 '25 23:12 beicause

we will also follow a similar pattern of providing try_xxx functions

you've seen the InfallibleMesh pr i made on top of this, right? there it would still be an extra unwrap() on the base function, but wouldn't have any duplicated try_ functions.

That is unrelated. I mean if more extractable assets are added in the future, or if crate authors want to add their extractable assets, should the interfaces also follow the try_xxx+xxx pattern? I think this is an area that could be improved.

beicause avatar Dec 13 '25 01:12 beicause