HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting
This PR implements the HoGS paper: compared to training without homogeneous coordinates, it improves background reconstruction while preserving foreground fidelity.
I’d appreciate community support to verify this, but based on my tests, the results should be identical to the reference implementation. I haven't run the benchmarks yet.
You can find explanations and more results on their project page: https://kh129.github.io/hogs/
The comparison is certainly not fair (30k vs. 50k iterations). However, the default configuration won't get there even with 50k iterations. I will hopefully find some free cycles to run the benchmarks.
https://github.com/user-attachments/assets/b909adcb-c5d2-459c-92f6-dedc81894546
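For anyone skimming, here is a rough sketch of the core idea as I read the paper; the tensor names and the activation on w are my assumptions, not this PR's actual code. Gaussian centers are optimized in homogeneous coordinates (x, y, z, w) and divided by w when converted to world space, so very distant background points stay numerically well-behaved instead of needing huge xyz values.
import torch

# Hypothetical parametrization: per-Gaussian (x, y, z, w_raw). Not the repo's exact tensors.
hom_means = torch.nn.Parameter(torch.randn(1000, 4))

def to_world(hom: torch.Tensor) -> torch.Tensor:
    # Homogeneous -> Cartesian: (x, y, z, w) maps to (x, y, z) / w.
    w = torch.exp(hom[..., 3:])   # keep w positive; small w => point far from the scene center
    return hom[..., :3] / w

world_means = to_world(hom_means)  # what the rasterizer would consume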
Wow, this is pretty cool! Yeah, we should get the benchmarks (maybe at 30k, since everything else in this repo is run for 30k?) so we can be confident about how much it improves, and also for future reference.
This week is kinda crazy for me. Happy to review the PR more closely next week!
I found a bug when saving a PLY.
Code used:
RESULT_DIR="results/benchmark_HoGS-BS1"
SCENE="garden"
RENDER_TRAJ_PATH="ellipse"
ITERS='50000'
BS='1'
CUDA_VISIBLE_DEVICES=0 python3 examples/simple_trainer.py mcmc --eval_steps $ITERS --disable_viewer --data_factor 2 \
--render_traj_path $RENDER_TRAJ_PATH --max-steps $ITERS --batch-size $BS \
--data_dir data/360_v2/$SCENE/ --use-hom-coords \
--save-ply --ply-steps 30000 50000 \
--result_dir $RESULT_DIR/$SCENE/
Running garden
Warning: image_path not found for reconstruction
[Parser] 185 images, taken by 1 cameras.
Downscaling images by 4x from data/360_v2/garden/images to data/360_v2/garden/images_4_png.
100%|██████████| 185/185 [00:00<00:00, 125720.39it/s]
Scene scale: 1.2265263065808911
Model initialized. Number of GS: 138766
...
...
...
Step 24800: Added 0 GSs. Now having 1000000 GSs.
Step: 29999 {'mem': 1.4372062683105469, 'ellipse_time': 591.6495430469513, 'num_GS': 1000000}
Traceback (most recent call last):
File "/workspace/GSPLAT_160a1_HoGS/gsplat-mcmc/examples/simple_trainer.py", line 1276, in <module>
cli(main, cfg, verbose=True)
File "/usr/local/lib/python3.10/dist-packages/gsplat/distributed.py", line 360, in cli
return _distributed_worker(0, 1, fn=fn, args=args)
File "/usr/local/lib/python3.10/dist-packages/gsplat/distributed.py", line 295, in _distributed_worker
fn(local_rank, world_rank, world_size, args)
File "/workspace/GSPLAT_160a1_HoGS/gsplat-mcmc/examples/simple_trainer.py", line 1204, in main
runner.train()
File "/workspace/GSPLAT_160a1_HoGS/gsplat-mcmc/examples/simple_trainer.py", line 832, in train
means *= w_inv
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
Additional info: Ubuntu 22.04 LTS, Python 3.10, Torch 2.1.2+cu118, gsplat commit 2455a5f.
EDIT #1
If I remove the --save-ply --ply-steps 30000 50000 args, the code works, but then I don't get the splats PLY.
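For reference, a minimal sketch of the kind of change that avoids this error (the actual fix in this branch may differ; the tensors below are stand-ins): the export path only needs values, so the rescaled means can be computed out of place instead of mutating the trainable leaf parameter in place.
import torch

means = torch.nn.Parameter(torch.randn(100, 3))  # stand-in for the trainable means parameter
w_inv = torch.rand(100, 1)                        # stand-in for the homogeneous rescale factor

# means *= w_inv  # fails: in-place op on a leaf tensor that requires grad
with torch.no_grad():
    export_means = means * w_inv                  # out-of-place copy, safe to write into the PLY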
Here is the benchmark of HoGS-MCMC-1M for 50k iters.
Here it is. For comparison, gsplat MCMC-1M with 30k iters only.
~MCMC-1M with 50k iters will be reported later in my next free time (hopefully)~
EDIT #2
The result of HoGS-MCMC-1M for 50k iters is strange...
EDIT #3
MCMC-1M with 50k iters with Gsplat 1.5.0
So I suspect HoGS is good because it uses as many Gaussians as possible for reconstructing the background area. But for method efficiency, I think MCMC is better.
Thx @ichsan2895 for testing and evaluating. Maybe the degraded rendering is caused by the bug you reported. I fixed it, but lemme check if the rendering looks correct.
Btw, yes, HoGS primarily excels at background reconstruction, so you need to look at the output visually; the difference is not necessarily reflected in the overall PSNR. Also, 1M Gaussians is a bit too low in that case. Anyway, it is optional and can be activated depending on the situation.
Yep, looks like the PLY file is fixed.
Oh no... Left is the PLY opened in the Mkkellogg Gaussian 3D viewer, and right is the gsplat viewer.
Why does it still happen? I have modified the code and rerun the experiment...
FYI, the error is gone, but the shapes in the PLY and in the viewer are strange.
@ichsan2895 I am not 100% sure what you are doing? Is this the simple_viewer.py? Then it is loading the ckpts which are not correctly scaled.
From @ichsan2895's results, it seems HoGS does not really win on metrics.
But I'm curious about the visuals! If this paper really shifts more GSs from foreground to background and improves background quality, that would also be pretty nice. At least it provides this flexibility. But if the visual difference is minor as well, then I'm not sure we want to add this complexity to the code base, which we would have to maintain going forward.
BTW, we should really compare in a fair setting tho -- 30k vs. 50k doesn't make sense. (And considering HoGS runs slower per iteration, maybe it's actually fair to compare MCMC 50k vs. HoGS 30k if we aim for the same wall time.)
Yes, I have run gsplat MCMC-1M with 50k iters. Here is the benchmark:
@ichsan2895 I am not 100% sure what you are doing? Is this the simple_viewer.py? Then it is loading the ckpts which are not correctly scaled.
Yeah, that is simple_viewer.py (right) and the Mkkellogg Gaussian Viewer (left). I will try to check and recheck that I have pointed to the correct path.
@ichsan2895 once you have figured out the viewer stuff, would you mind sharing some comments on how you feel about the visuals? Especially how much the background got improved by this approach.
@ichsan2895 I never use the simple_viewer so that's why I was a bit puzzled. But I fixed it now. Please pull the latest commit.
@liruilong940607 The visual difference is not minor; it can be quite dramatic with respect to the background. It maybe shows best with the default strategy, but it certainly consumes more Gaussians. I doubt that running MCMC with 1M is a good benchmark here.
@MrNeRF From the visual you shared initially, the difference indeed seems big. But I'm curious why it is not reflected in the metrics. Or is it just because that was 30k vs. 50k that the visual difference seems big? Visuals with the same number of iterations would be very helpful for comparison.
Just trying to convince myself to be more confident about this paper. (For the record, I read this paper before and I really like it! But I'm always cautious about results in papers, because nowadays many papers don't do comparisons properly ... until you actually try it out.)
@liruilong940607 I pretty much know what you mean.
The paper explicitly reports separate "near vs. far" values. I experimented quite a bit with it. In my experience this method can be a bit worse on the foreground, especially when there are lots of fine details. However, it pretty much always outperforms 3DGS significantly in the background. So there is certainly a trade-off. That's also why I made it optional in the first place.
I will try to prepare some side-by-side comparisons to visualize that.
@ichsan2895 I never use the simple_viewer so that's why I was a bit puzzled. But I fixed it now. Please pull the latest commit.
Thanks, that is fixed now
I don't have the complete benchmark yet, but this is the default strategy at 30k steps, main vs. this branch. I will also run both later under the HoGS settings with 50k iterations, as specified in the paper.
edit: left is main, right is this branch.
https://github.com/user-attachments/assets/6e172d5e-bf80-4fcc-b63a-68bfe80596a5
https://github.com/user-attachments/assets/f513ac45-053c-4a65-85e4-b99ed4b29c3d
30k steps default strategy on main
30k steps default strategy HoGS (this branch):
Next, I will also run the benchmark for the default strategy with the proposed 50k steps, for both main and this branch!
Nice work! I noticed a discrepancy in how scales are handled between two files:
In simple_viewer.py (loading ckpts):
scales = torch.exp(ckpt["scales"]) * w_inv
In simple_trainer.py (export_splats):
scales = torch.log(torch.exp(scales) * w_inv)
Why is there an extra torch.log() in export_splats?
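If it helps, here is how I read the two code paths (a guess, not a definitive answer): the viewer needs activated (linear) scales for rendering, while the PLY written by export_splats follows the usual 3DGS convention of storing scales in log space, so the exporter rescales in linear space and then goes back to log space. A tiny sketch of the equivalence:
import torch

log_scales = torch.randn(5, 3)    # raw (pre-activation) scales, as stored in the ckpt
w_inv = torch.rand(5, 1) + 0.5    # hypothetical per-Gaussian homogeneous factor

viewer_scales = torch.exp(log_scales) * w_inv          # viewer: linear scales for rendering
ply_scales = torch.log(torch.exp(log_scales) * w_inv)  # export: back to log space for the PLY

# The extra log just restores the stored convention; equivalently, add log(w_inv) in log space.
assert torch.allclose(ply_scales, log_scales + torch.log(w_inv))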