simple_trainer.py RGBA dataset
It looks like simple_trainer does not take transparency in the input images into account, even though it has a random_bkgd field.
Is this as simple as bringing in the 4th channel and matching the random color of the background? Or is the "transparency carving" technique in splatfacto more complex than this?
If you have access to the alpha channel of the input images, there are actually two ways to use it.
- Option 1 is super simple: just apply an L1/L2 loss between the rendered alpha map and the ground-truth alpha channel.
- Option 2 is like transparency carving: generate a random color at each iteration, then composite both the ground-truth RGB image and the rendered image over this random color using their alpha channels:
```python
rng_rgb = ...  # a fresh random background color each iteration
gt_rgb = gt_rgb * gt_alpha + rng_rgb * (1 - gt_alpha)
pred_rgb = pred_rgb + rng_rgb * (1 - pred_alpha)
loss = L2(gt_rgb, pred_rgb)
```
Actually, for option 2, if the loss converges to zero you get pred_rgb = gt_rgb * gt_alpha and pred_alpha = gt_alpha, which is exactly what you want. The same holds for option 1.
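The two options above can be sketched as a minimal runnable PyTorch snippet. The tensor shapes and random inputs here are hypothetical stand-ins for the rasterizer outputs and the RGBA ground truth; note that pred_rgb is assumed to already be premultiplied by alpha (as the rendered output is), which is why it is not multiplied by pred_alpha in the composite:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: H x W x 3 RGB images, H x W x 1 alpha maps in [0, 1].
H, W = 64, 64
gt_rgb = torch.rand(H, W, 3)
gt_alpha = torch.rand(H, W, 1)
pred_rgb = torch.rand(H, W, 3, requires_grad=True)  # premultiplied render
pred_alpha = torch.rand(H, W, 1)

# Option 1: direct supervision on the rendered alpha map.
alpha_loss = F.l1_loss(pred_alpha, gt_alpha)

# Option 2: composite both images over the same random background color,
# drawn fresh at each iteration, then compare in RGB space.
rng_rgb = torch.rand(3)
gt_comp = gt_rgb * gt_alpha + rng_rgb * (1 - gt_alpha)
pred_comp = pred_rgb + rng_rgb * (1 - pred_alpha)
rgb_loss = F.mse_loss(pred_comp, gt_comp)
rgb_loss.backward()
```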
The random_bkgd argument we have in our training script is not for transparency carving; it encourages pred_alpha to be one everywhere, as it should be for in-the-wild images.
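To see why a random background pushes pred_alpha toward one, here is a hedged sketch (shapes and inputs are hypothetical): only the rendered image is composited over the random color, while the opaque ground truth is left untouched, so any pred_alpha below one leaks a changing background color into the loss.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes; gt_rgb is a fully opaque in-the-wild image.
H, W = 64, 64
gt_rgb = torch.rand(H, W, 3)
pred_rgb = torch.rand(H, W, 3)  # premultiplied render
pred_alpha = torch.rand(H, W, 1, requires_grad=True)

# Composite only the render over a random background. Since gt_rgb is not
# augmented, the loss can only stay small across iterations if
# pred_alpha -> 1 everywhere.
rng_rgb = torch.rand(3)
pred_comp = pred_rgb + rng_rgb * (1 - pred_alpha)
loss = F.mse_loss(pred_comp, gt_rgb)
loss.backward()
```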
Hey @liruilong940607, awesome work! Thank you so much for your continuous gsplat/nerfstudio work!
I was wondering whether the step you mentioned above is enough to match how nerfstudio handles transparency with "--pipeline.model.background_color random".
Do we need to pass the alpha into the rasterizer for training?
Thanks a lot in advance!
Never mind, I didn't notice that pred_alpha was returned from the rasterize_splats function, so using your formula (similar to the one in nerfstudio) actually made it work.
Would you like me to create a pull request to make this available in gsplat?
@AlexMorgand A PR for this would be greatly appreciated. I think transparency background handling is the biggest difference gsplat is missing compared to splatfacto.