bevy icon indicating copy to clipboard operation
bevy copied to clipboard

Add `EncasedBufferVec`, an higher-performance alternative to `StorageBuffer`, and make `GpuArrayBuffer` use it.

Open pcwalton opened this issue 1 year ago • 7 comments

EncasedBufferVec is like BufferVec, but it doesn't require that the type be Pod. Alternately, it's like StorageBuffer<Vec<T>>, except it doesn't allow CPU access to the data after it's been pushed. GpuArrayBuffer already doesn't allow CPU access to the data, so switching it to use EncasedBufferVec doesn't regress any functionality and offers higher performance.

Shutting off CPU access eliminates the need to copy to a scratch buffer, which results in significantly higher performance. Note that this needs https://github.com/teoxoy/encase/pull/65 from @james7132 to achieve end-to-end performance benefits, because encase is rather slow at encoding data without that patch, swamping the benefits of avoiding the copy. With that patch applied, and #[inline] added to encase's derive implementation of write_into on structs, this results in a 16% overall speedup on many_cubes --no-frustum-culling.

I've verified that the generated code is now close to optimal. The only reasonable potential improvement that I see is to eliminate the zeroing in push. This requires unsafe code, however, so I'd prefer to leave that to a followup.

Here's write_batched_instance_buffer before-and-after (yellow = after, red = before) for many_cubes --no-frustum-culling:

Screenshot 2024-03-23 130623

pcwalton avatar Mar 23 '24 20:03 pcwalton

I'm not wedded to the name EncasedBufferVec. Alternative suggestions are welcome.

pcwalton avatar Mar 23 '24 20:03 pcwalton

Nice performance improvements in conjunction with James’ encase changes. I don’t know what the name should be but I don’t think it’s quite right.

I also want us to check thoroughly that this will never produce misaligned data. I think it will be fine but I’m not certain. For a wgsl binding containing only an array<T> the alignment of each element is supposed to be the alignment of T and if T is a struct then it is the max of the alignments of the members of the struct type. And the size of a struct type is the offset of the last member plus the size of the last member rounded up to the alignment. So I think it will be fine… There is one case where it might break - WebGL2 where 16-byte aligned sizes are always needed or something like that.

superdump avatar Mar 23 '24 20:03 superdump

How about renaming BufferVec to RawBufferVec and EncasedBufferVec to BufferVec?

pcwalton avatar Mar 23 '24 21:03 pcwalton

I did that renaming since nobody objected.

pcwalton avatar Mar 25 '24 19:03 pcwalton

It looks like your PR is a breaking change, but you didn't provide a migration guide.

Could you add some context on what users should update when this change get released in a new version of Bevy? It will be used to help writing the migration guide for the version. Putting it after a ## Migration Guide will help it get automatically picked up by our tooling.

github-actions[bot] avatar Mar 25 '24 19:03 github-actions[bot]

Look's great, but I'm marking this as blocked until the encase patch goes through.

NthTensor avatar Apr 01 '24 16:04 NthTensor

Encase released a new version a few days ago.

I think this just needs https://github.com/bevyengine/bevy/pull/12757 now?

Elabajaba avatar Apr 28 '24 03:04 Elabajaba

Closed as I am completely burned out and have no motivation to continue working on this.

pcwalton avatar May 03 '24 08:05 pcwalton