aframe-gaussian-splatting
Performance ideas / benchmarking
Some ideas regarding performance. Ultimately it would be nice to get (a subset of) this to work on a Quest 2 / 3; currently that's running at 5-10 fps and very choppy. So are the other three.js implementations though!
- using compressed data at runtime. Seems you started on this already! There are some ideas regarding compression formats here: https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/
- using better interleaved data so data fetching on the GPU is more localized (same link above has some info)
- using alpha hashing instead of transparency, and then rendering front-to-back instead to get some early Z cutoff
- some kind of LOD system - not sure if splats could be sorted by "importance" (e.g. less transparent ones are more important?) at runtime, or if the calculations would need to be done with fewer splats in the first place.
Regarding loading behaviour, I've dabbled a bit with creating splats while loading is still in progress; I'll see if I can make a PR for parts of that.
And it would be interesting to load compressed data; again, Aras (link above) has some ideas around that and tooling to generate byte buffers that are already optimized (10-20x size reduction).
Thank you for sharing interesting information. Also, thanks for the pull requests. I'll check them later.
I bought the Quest 3 today and made code modifications to support VR mode. It is very intriguing. The performance still needs improvement, though. I'll look into the information you provided and consider it.
Please pleaseee let me know too !
using compressed data at runtime. Seems you started on this already! There are some ideas regarding compression formats here: https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/
I've read Aras's impressive work! In my current implementation, each splat uses 256 bits. This means a 7.8x size reduction. I haven't evaluated the image quality yet, but I may be able to implement some of Aras's methods.
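For anyone checking the 7.8x figure, here is a small arithmetic sketch. It assumes the uncompressed baseline is the commonly used 3D Gaussian Splatting .ply layout with 62 float32 properties per splat; the thread does not state the baseline, so treat that as an assumption. The constant names are made up for illustration.

```typescript
// Assumed baseline (not stated in the thread): the common 3DGS .ply layout,
// which stores 62 float32 values per splat.
const FLOATS_PER_PLY_SPLAT =
  3 +  // position
  3 +  // normal
  3 +  // spherical harmonics, DC term
  45 + // spherical harmonics, higher-order terms
  1 +  // opacity
  3 +  // scale
  4;   // rotation quaternion

const plyBitsPerSplat = FLOATS_PER_PLY_SPLAT * 32; // 62 * 32 = 1984 bits
const packedBitsPerSplat = 256;                    // figure quoted in the thread

const reduction = plyBitsPerSplat / packedBitsPerSplat;
console.log(reduction.toFixed(2)); // prints "7.75", i.e. roughly 7.8x

// Hypothetical helper: estimated payload size in megabytes for a scene.
function estimateMegabytes(splatCount: number, bitsPerSplat: number): number {
  return (splatCount * bitsPerSplat) / 8 / (1024 * 1024);
}

console.log(estimateMegabytes(1_000_000, packedBitsPerSplat).toFixed(1));
```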
using alpha hashing instead of transparency, and then rendering front-to-back instead to get some early Z cutoff
Could you elaborate on this idea?
I’ve studied alpha hashing. I’ll test it later. 🤓
I've tested alpha hashing, but it didn't improve the performance. I think it won't reduce memory traffic because contiguous pixel values are read in one operation, so discarding pixels on an individual basis won't impact memory traffic. Here is the code I tested: https://github.com/quadjr/aframe-gaussian-splatting/tree/feature/alpha-hashing
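For readers who haven't seen the technique: alpha hashing replaces blending with a per-pixel keep-or-discard test against a pseudo-random threshold, so that on average a fragment with opacity alpha survives in a fraction alpha of the pixels. The sketch below illustrates that test on the CPU. It is not the code from the linked branch; in a real renderer the comparison would live in the fragment shader, and the hash function and names here are hypothetical choices for illustration.

```typescript
// Hypothetical integer hash mapping pixel coordinates to a value in [0, 1).
// Any reasonably well-distributed hash would do; this one is only for illustration.
function hash01(x: number, y: number): number {
  let h = Math.imul(x | 0, 374761393) ^ Math.imul(y | 0, 668265263);
  h = Math.imul(h ^ (h >>> 13), 1274126177);
  h ^= h >>> 16;
  return (h >>> 0) / 4294967296; // unsigned 32-bit value divided by 2^32
}

// Alpha-hashed test: keep the fragment only if its opacity exceeds the
// per-pixel threshold. Kept fragments are drawn fully opaque with depth
// writes enabled, which is what makes depth-based rejection possible.
function keepFragment(alpha: number, x: number, y: number): boolean {
  return alpha > hash01(x, y);
}

// Measure the fraction of pixels kept on a size x size grid.
function coverage(alpha: number, size: number): number {
  let kept = 0;
  for (let y = 0; y < size; y++) {
    for (let x = 0; x < size; x++) {
      if (keepFragment(alpha, x, y)) kept++;
    }
  }
  return kept / (size * size);
}

console.log(coverage(0.3, 256)); // expected to be close to 0.3
```

The trade-off is noise: the result is stochastic per pixel, so it usually needs temporal or spatial filtering to look smooth.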
@quadjr the alpha-hashing branch is 100% identical to the main branch
@electrum-bowie Sorry, I've pushed the code now. https://github.com/quadjr/aframe-gaussian-splatting/tree/feature/alpha-hashing
@hybridherbst I've made several improvements based on your ideas.
For the LOD system, small splats with high transparency at a distance will be removed during the sorting process. This method has significantly improved performance. The threshold for removal requires further theoretical consideration.
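As a rough illustration of culling during the sort pass, here is a minimal sketch. The exact criterion used in the repository is not described in this thread, so the `Splat` fields, the size estimate, and both thresholds below are hypothetical.

```typescript
// Hypothetical per-splat data for illustration only.
interface Splat {
  depth: number;  // distance from the camera along the view direction
  radius: number; // world-space extent of the splat
  alpha: number;  // opacity in [0, 1]
}

// Drop splats that are both tiny on screen and nearly transparent,
// then sort the survivors back-to-front for alpha blending.
function cullAndSort(
  splats: Splat[],
  minApparentSize: number, // assumed threshold on radius / depth
  maxAlphaToCull: number   // assumed opacity below which a tiny splat is dropped
): Splat[] {
  const kept = splats.filter((s) => {
    const apparentSize = s.radius / Math.max(s.depth, 1e-6);
    const negligible = apparentSize < minApparentSize && s.alpha < maxAlphaToCull;
    return !negligible;
  });
  kept.sort((a, b) => b.depth - a.depth); // farthest first
  return kept;
}

const example: Splat[] = [
  { depth: 10, radius: 0.001, alpha: 0.05 }, // far, tiny, faint: culled
  { depth: 1, radius: 0.001, alpha: 0.05 },  // near, so it appears larger: kept
  { depth: 10, radius: 0.001, alpha: 0.9 },  // far but opaque: kept
];
console.log(cullAndSort(example, 0.001, 0.1).length);
```

Doing the test inside the sort pass means culled splats never reach the GPU, which would be consistent with the improvement reported above.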
Data compression might enhance performance. I need to set up image quality evaluation programs. Alpha hashing and data localization might not boost the performance.
I've also implemented incremental loading.
I've done almost everything I can think of for now. I'll shift my focus to the generation software. I believe I can make further improvements to it. 🤓
Thank you, those do sound like great improvements!
The current threshold of -0.001 did have a very noticeable quality impact on my "FH Portrait" dataset though; I've set it to -0.0001 as a quick test which looks fine, but haven't looked for a proper upper bound. I'll do some more testing with your updates. EDIT: On Quest -0.001 looks fine actually, so the number may need to be fov-based.
One question out of curiosity, the sortSplats method currently allocates new arrays on each run – doesn't that have a performance impact and/or would it be better to cache those instead?
Thank you for the reports. The threshold should be determined by the size of the splat on the screen, and it can be calculated using FOV and resolution. I will work on implementing the threshold calculation later.
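One way to express "size of the splat on the screen" is the standard pinhole projection, sketched below. This is not the implementation planned above, only the textbook relation between world-space size, distance, vertical FOV, and viewport height; the function and parameter names are hypothetical.

```typescript
// Approximate on-screen radius in pixels of a splat, using a pinhole camera model.
function projectedRadiusPx(
  worldRadius: number,
  distance: number,
  verticalFovDeg: number,
  viewportHeightPx: number
): number {
  const fovRad = (verticalFovDeg * Math.PI) / 180;
  // Focal length in pixels: half the viewport height over tan(half the FOV).
  const focalPx = viewportHeightPx / (2 * Math.tan(fovRad / 2));
  return (worldRadius * focalPx) / distance;
}

// Same splat, same distance, two cameras: a wider FOV yields fewer pixels.
const narrow = projectedRadiusPx(0.01, 1, 60, 1000);
const wide = projectedRadiusPx(0.01, 1, 90, 1000);
console.log(narrow.toFixed(2), wide.toFixed(2));
```

A cull threshold expressed in pixels would then adapt automatically when the FOV or resolution changes, instead of relying on a fixed constant.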
One question out of curiosity, the sortSplats method currently allocates new arrays on each run – doesn't that have a performance impact and/or would it be better to cache those instead?
Yeah, there are some unnecessary allocations during the loading and sorting processes. I'm currently focused on the generation side of the model and am prioritizing that. Once I've addressed that, I will optimize the memory usage and allocations of this viewer.
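A minimal sketch of the caching idea from the question above: keep the scratch arrays between sort runs and reallocate only when the splat count grows. The class and method names are hypothetical and do not mirror the repository's sortSplats; the comparator-based sort is only a stand-in for whatever sort the viewer uses.

```typescript
// Reusable scratch buffers for depth sorting (illustrative, not the repo's code).
class SortScratch {
  private depths = new Float32Array(0);
  private indices = new Uint32Array(0);

  // Grow the buffers only when the current capacity is too small.
  private ensure(count: number): void {
    if (this.depths.length < count) {
      this.depths = new Float32Array(count);
      this.indices = new Uint32Array(count);
    }
  }

  // positions: xyz triples; viewDir: direction the camera looks along.
  // Returns splat indices ordered back-to-front (largest depth first).
  sortByDepth(positions: Float32Array, viewDir: [number, number, number]): Uint32Array {
    const count = positions.length / 3;
    this.ensure(count);
    const depths = this.depths;
    const indices = this.indices;
    for (let i = 0; i < count; i++) {
      depths[i] =
        positions[3 * i] * viewDir[0] +
        positions[3 * i + 1] * viewDir[1] +
        positions[3 * i + 2] * viewDir[2];
      indices[i] = i;
    }
    // subarray creates a lightweight view over the same buffer; no data is copied.
    const view = indices.subarray(0, count);
    view.sort((a, b) => depths[b] - depths[a]);
    return view;
  }
}

const scratch = new SortScratch();
const pts = new Float32Array([0, 0, 1, 0, 0, 3, 0, 0, 2]);
const first = scratch.sortByDepth(pts, [0, 0, 1]);
console.log(Array.from(first).join(","));
```

Note that the returned view is overwritten by the next call, so callers need to consume or copy it before sorting again.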
Maybe consider integrating https://github.com/mkkellogg/GaussianSplats3D, which uses a wasm module for sorting. The author has also done some other interesting optimizations.
Hi, will the mkkellogg .splat format (which adds further optimizations, see https://github.com/mkkellogg/GaussianSplats3D/issues/28) be compatible with the .splat format of this implementation?
@quadjr @hybridherbst We've already done the work on making splats smaller!
I made this repo to share our small splats for renderer testing. We have these running in our renderer at 90 FPS on Quest 3 in the browser. I'd love to help out so we can get something more shareable. https://github.com/gmix-tech/small_splats
I tested this branch out with our small splat and I'm only pulling 45 FPS from AFrame in VR mode on Quest 3. It's unclear to me whether it's something to do with AFrame itself or with this component implementation.
Feel free to ping me at [email protected] if you wanna chat more about this
@dlazares 90fps sounds great! How can I give your renderer a try? Couldn't find the repo. I'm working on integrating a component on A-Frame core. Thanks so much