aframe-gaussian-splatting

Performance ideas / benchmarking

Open hybridherbst opened this issue 2 years ago • 14 comments

Some ideas regarding performance. Ultimately it would be nice to get (a subset of) this to work on a Quest 2 / 3; currently it runs at 5-10 fps there and is very choppy. The same goes for the other three.js implementations, though!

  • using compressed data at runtime. Seems you started on this already! There are some ideas regarding compression formats here: https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/
  • using better interleaved data so data fetching on the GPU is more localized (same link above has some info)
  • using alpha hashing instead of transparency, and then rendering front-to-back instead to get some early Z cutoff
  • some kind of LOD system - not sure if splats could be sorted by "importance" (e.g. less transparent ones are more important?) at runtime, or if the calculations would need to be done with fewer splats in the first place.

Regarding loading behaviour, I've dabbled a bit with creating splats while the data is still loading, and will see if I can make a PR for parts of that.

It would also be interesting to load compressed data; again, Aras (link above) has some ideas around that and tooling to generate byte buffers that are already optimized (10-20x size reduction).

hybridherbst, Oct 10 '23 13:10

Thank you for sharing interesting information. Also, thanks for the pull requests. I'll check them later.

I bought a Quest 3 today and modified the code to support VR mode. It is very intriguing. The performance still needs improvement, though. I'll look into the information you provided and think it over.

quadjr, Oct 10 '23 15:10

Please pleaseee let me know too !

electrum-bowie, Oct 10 '23 20:10

> using compressed data at runtime. Seems you started on this already! There are some ideas regarding compression formats here: https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/

I've read Aras's impressive work! In my current implementation, each splat uses 256 bits. This means a 7.8x size reduction. I haven't evaluated the image quality yet, but I may be able to implement some of Aras's methods.
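
For readers wondering where a figure like 7.8x could come from, here is a small sketch of the arithmetic. It assumes the baseline is the standard 3D Gaussian Splatting .ply layout with 62 float32 properties per splat; the comment above does not state the baseline, so treat that as an assumption.

```javascript
// Assumption (not stated in the thread): the uncompressed baseline is the
// standard 3DGS .ply layout, which stores 62 float32 values per splat.
const FLOAT_BYTES = 4;
const plyFloatsPerSplat =
  3 +  // position x, y, z
  3 +  // normals nx, ny, nz
  3 +  // f_dc: base color (degree-0 spherical harmonics)
  45 + // f_rest: higher-order spherical harmonics
  1 +  // opacity
  3 +  // scale
  4;   // rotation quaternion

const plyBitsPerSplat = plyFloatsPerSplat * FLOAT_BYTES * 8; // 62 * 32 = 1984 bits
const packedBitsPerSplat = 256; // the per-splat size quoted in the comment above

const reductionFactor = plyBitsPerSplat / packedBitsPerSplat;
console.log(reductionFactor.toFixed(2)); // 7.75
```

Under that assumption the ratio is 7.75, which rounds to the quoted 7.8x.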

> using alpha hashing instead of transparency, and then rendering front-to-back instead to get some early Z cutoff

Could you elaborate on this idea?

quadjr, Oct 11 '23 12:10

I’ve studied alpha hashing. I’ll test it later. 🤓

quadjr, Oct 11 '23 23:10

I've tested alpha hashing, but it didn't improve performance. I think it won't reduce memory traffic because contiguous pixel values are read in one operation, so discarding pixels on an individual basis won't affect memory traffic. Here is the code I tested: https://github.com/quadjr/aframe-gaussian-splatting/tree/feature/alpha-hashing
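
For context, a minimal CPU-side sketch of what alpha hashing means: instead of blending, each pixel is either kept or discarded by comparing the splat's alpha against a per-pixel pseudo-random threshold. This is only an illustration, not the code in the linked branch; in a real renderer the test would live in the fragment shader, and the hash function and its constants here are arbitrary choices.

```javascript
// Illustrative only: hash2D and alphaHashKeep are hypothetical helpers,
// not functions from aframe-gaussian-splatting.

// Map integer pixel coordinates to a pseudo-random value in [0, 1).
function hash2D(x, y) {
  let h = Math.imul(x, 374761393) + Math.imul(y, 668265263);
  h = Math.imul(h ^ (h >>> 13), 1274126177);
  h ^= h >>> 16;
  return (h >>> 0) / 4294967296;
}

// Keep the fragment only if its alpha exceeds the hashed threshold.
// A kept fragment is drawn fully opaque (so depth writes work);
// a rejected one is discarded entirely.
function alphaHashKeep(alpha, x, y) {
  return alpha > hash2D(x, y);
}

// Over many pixels, the fraction of kept fragments approximates alpha.
function keptFraction(alpha, size) {
  let kept = 0;
  for (let y = 0; y < size; y++) {
    for (let x = 0; x < size; x++) {
      if (alphaHashKeep(alpha, x, y)) kept++;
    }
  }
  return kept / (size * size);
}

console.log(keptFraction(0.3, 256)); // close to 0.3
```

Note that every fragment still has to be shaded before it can be discarded, which is consistent with the observation above that per-pixel discards do not cut memory traffic.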

quadjr, Oct 12 '23 14:10

@quadjr the alpha-hashing branch is 100% identical to the main branch

electrum-bowie, Oct 12 '23 17:10

@electrum-bowie Sorry, I pushed the code. https://github.com/quadjr/aframe-gaussian-splatting/tree/feature/alpha-hashing

quadjr, Oct 14 '23 07:10

@hybridherbst I've made several improvements based on your ideas.

For the LOD system, small splats with high transparency at a distance will be removed during the sorting process. This method has significantly improved performance. The threshold for removal requires further theoretical consideration.
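
A rough sketch of that kind of culling pass, for illustration. The splat shape, the `importance` formula, and the threshold value are all hypothetical; the actual criterion used by the component is not described in this thread.

```javascript
// Hypothetical splat shape: { position: [x, y, z], radius, alpha }.
// cullAndSortSplats is an illustrative helper, not the component's sortSplats.
function cullAndSortSplats(splats, cameraPosition, cameraForward, importanceThreshold) {
  const kept = [];
  for (const splat of splats) {
    // Depth along the view direction (cameraForward is assumed normalized).
    const depth =
      (splat.position[0] - cameraPosition[0]) * cameraForward[0] +
      (splat.position[1] - cameraPosition[1]) * cameraForward[1] +
      (splat.position[2] - cameraPosition[2]) * cameraForward[2];
    if (depth <= 0) continue; // behind the camera

    // Small, faint, distant splats get a low score and are dropped.
    const importance = (splat.radius / depth) * splat.alpha;
    if (importance < importanceThreshold) continue;

    kept.push({ splat, depth });
  }
  // Far-to-near order; flip the comparison if the renderer blends front-to-back.
  kept.sort((a, b) => b.depth - a.depth);
  return kept.map((entry) => entry.splat);
}

const demoSplats = [
  { position: [0, 0, -1], radius: 0.05, alpha: 0.9 },   // near and opaque: kept
  { position: [0, 0, -50], radius: 0.01, alpha: 0.05 }, // far, tiny, faint: culled
  { position: [0, 0, -10], radius: 0.5, alpha: 0.8 },   // far but large: kept
  { position: [0, 0, 5], radius: 1.0, alpha: 1.0 },     // behind the camera: skipped
];
const lodResult = cullAndSortSplats(demoSplats, [0, 0, 0], [0, 0, -1], 0.001);
console.log(lodResult.length); // 2
```

Doing the cull inside the sort loop means the depth is computed only once per splat and the draw call gets a shorter index list.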

Data compression might enhance performance. I need to set up image quality evaluation programs. Alpha hashing and data localization might not boost the performance.

I've also implemented incremental loading.

I've done almost everything I can think of for now. I'll shift my focus to the generation software. I believe I can make further improvements to it. 🤓

quadjr, Oct 14 '23 07:10

Thank you, those do sound like great improvements!

The current threshold of -0.001 did have a very noticeable quality impact on my "FH Portrait" dataset though; I've set it to -0.0001 as a quick test which looks fine, but haven't looked for a proper upper bound. I'll do some more testing with your updates. EDIT: On Quest -0.001 looks fine actually, so the number may need to be FOV-based.

One question out of curiosity, the sortSplats method currently allocates new arrays on each run – doesn't that have a performance impact and/or would it be better to cache those instead?

hybridherbst, Oct 15 '23 18:10

Thank you for the reports. The threshold should be determined by the size of the splat on the screen, and it can be calculated using FOV and resolution. I will work on implementing the threshold calculation later.
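
As a sketch of that calculation, assuming a standard perspective projection: the on-screen size of a splat follows from its world-space size, its depth, the vertical FOV, and the viewport height. The function name and parameters are hypothetical, not part of the component.

```javascript
// Hypothetical helper: approximate on-screen size (in pixels) of a splat
// under a perspective projection.
function projectedSizeInPixels(worldSize, depth, fovYRadians, viewportHeightPx) {
  // Focal length expressed in pixels for the given vertical FOV.
  const focalLengthPx = viewportHeightPx / (2 * Math.tan(fovYRadians / 2));
  return (worldSize / depth) * focalLengthPx;
}

// The same splat covers a different number of pixels on different displays,
// so a fixed threshold behaves differently per device.
const degToRad = Math.PI / 180;
const narrowFov = projectedSizeInPixels(0.01, 5, 50 * degToRad, 1080); // illustrative desktop-like values
const wideFov = projectedSizeInPixels(0.01, 5, 100 * degToRad, 1080);  // illustrative headset-like values
console.log(narrowFov > wideFov); // true: a wider FOV shrinks the splat on screen
```

A culling rule could then be phrased as "drop splats smaller than N pixels", which stays consistent across FOVs and resolutions.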

> One question out of curiosity, the sortSplats method currently allocates new arrays on each run – doesn't that have a performance impact and/or would it be better to cache those instead?

Yeah, there are some unnecessary allocations during the loading and sorting processes. I'm currently focused on the generation side of the model and am prioritizing that. Once I've addressed that, I will optimize the memory usage and allocations of this viewer.
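
One possible way to cache the buffers, as a sketch. `DepthSorter` is a hypothetical class and not the component's actual sortSplats; it only shows the idea of reusing typed arrays between runs and growing them when the splat count increases.

```javascript
// Hypothetical sorter that reuses its typed arrays across runs.
class DepthSorter {
  constructor() {
    this.depths = new Float32Array(0);
    this.indices = new Uint32Array(0);
  }

  // Reallocate only when more capacity is needed than currently cached.
  ensureCapacity(count) {
    if (this.depths.length < count) {
      this.depths = new Float32Array(count);
      this.indices = new Uint32Array(count);
    }
  }

  // positions: flat Float32Array [x0, y0, z0, x1, y1, z1, ...]
  // viewDirection: [x, y, z]; returns splat indices ordered by ascending depth.
  sort(positions, count, viewDirection) {
    this.ensureCapacity(count);
    const depths = this.depths;
    // subarray creates a lightweight view on the cached buffer, not a copy.
    const indices = this.indices.subarray(0, count);
    for (let i = 0; i < count; i++) {
      depths[i] =
        positions[3 * i] * viewDirection[0] +
        positions[3 * i + 1] * viewDirection[1] +
        positions[3 * i + 2] * viewDirection[2];
      indices[i] = i;
    }
    indices.sort((a, b) => depths[a] - depths[b]);
    return indices;
  }
}

const sorter = new DepthSorter();
const demoPositions = new Float32Array([0, 0, 3, 0, 0, 1, 0, 0, 2]);
const order = sorter.sort(demoPositions, 3, [0, 0, 1]);
console.log(Array.from(order)); // [1, 2, 0]
```

Whether this helps in practice depends on how much time goes to allocation and garbage collection versus the sort itself, so it would need measuring.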

quadjr, Oct 16 '23 11:10

Maybe consider integrating https://github.com/mkkellogg/GaussianSplats3D, which uses a wasm module for sorting. The author has also done some other interesting optimizations.

JiamingSuen, Oct 20 '23 16:10

Hi, will the mkkellogg .splat format (which adds further optimizations), https://github.com/mkkellogg/GaussianSplats3D/issues/28, be compatible with the .splat format of this implementation?

softyoda, Nov 05 '23 12:11

@quadjr @hybridherbst We've already done the work on making splats smaller!

I made this repo to share our small splats for renderer testing. We have these running in our renderer at 90 FPS on Quest 3 in the browser. I'd love to help out so we can get something more shareable. https://github.com/gmix-tech/small_splats

I tested this branch out with our small splat and I'm only pulling 45 FPS from AFrame in VR mode on Quest 3. It's unclear to me whether it's something to do with AFrame itself or with this component implementation.

Feel free to ping me at [email protected] if you wanna chat more about this

dlazares, Nov 29 '23 19:11

@dlazares 90fps sounds great! How can I give your renderer a try? Couldn't find the repo. I'm working on integrating a component on A-Frame core. Thanks so much

dmarcos, Mar 08 '24 05:03