gaussian-splatting
How to get the gaussian points for each pixel
Hi,
When rasterizing the rendered image, how to get the list of gaussian points that are used for rendering each pixel?
I'm also not quite sure about the meaning of the "radii" variable. Is that the largest radius of the ellipse after splatting gaussian on the screen? the number unit is pixel?
Thank you very much!
- It's in the variable binningState.point_list here. However, the way it's currently written, it is very hard to extract only that variable (binning_state has many preceding buffers of different sizes, so you have to derive the correct offset). radii is the long axis of each gaussian projected onto the image; the unit is pixels.
Hi there and thanks for the great work!
Following up on that: do we know the size of binningState.point_list beforehand? I would really like to expose it to the Python side for experimentation purposes. For now, I just return sizeof(binningState.point_list) as an integer to the Python side to get some intuition about it, but I would really like to return it as a tensor.
Thanks for your time.
Am I correct in assuming that binningBuffer is a torch.Tensor version of the BinningState in bytes?
Yes, so you need to chop that tensor into the correct sizes for each of its contents, then do .to(...) to convert each piece to the dtype that it is supposed to be. But I imagine it's easier to create as many tensors as needed (in the correct dtype) and pass them to the functions, instead of going through binningState; that way it is much clearer...
OK, thanks for that. But do we know the size of the tensors beforehand (i.e. before running the rasterization kernel)? Or do we not need to care about it, meaning that it can be "solved" by implementing lambdas and using resizeFunctional?
binningState.list_sorting_space is something that is decided by cub::DeviceRadixSort::SortPairs (please look at its documentation). The others are of size num_rendered and the dtype is int32 for ids and int64 for keys. With that you should be able to figure out how to create the tensors and pass in the pointers.
Thank you very much! indeed with this info everything should more or less work out.
resizeFunctional exists only because there are so many things packed into one buffer that the buffer has to be resized to accommodate all of them. In practice the sizes are all known in advance, and you can manually create tensors for all of them, which allows easy access later.
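To make the "chop the byte buffer" idea above concrete, here is a minimal sketch in plain Python. It is not the repository's code. It assumes the buffer has already been copied to the CPU as bytes, that the sub-buffers are laid out in the order point_list, point_list_unsorted, point_list_keys, point_list_keys_unsorted (followed by the sort scratch space, which is ignored here), that each sub-buffer starts on a 128-byte boundary relative to the start of the buffer, and that ids are 32-bit and keys 64-bit unsigned little-endian integers. Please check all of these against rasterizer_impl.cu before relying on them; `split_binning_buffer` and `align_up` are hypothetical helper names.

```python
import struct

ALIGNMENT = 128  # assumed alignment (in bytes) of every sub-buffer


def align_up(offset, alignment=ALIGNMENT):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def split_binning_buffer(buf, num_rendered):
    """Decode the id/key arrays from a raw binning buffer (bytes).

    The field order and dtypes below are assumptions, not a documented API.
    """
    layout = [
        ("point_list", "I"),                # 32-bit gaussian ids, sorted
        ("point_list_unsorted", "I"),       # 32-bit gaussian ids, unsorted
        ("point_list_keys", "Q"),           # 64-bit keys, sorted
        ("point_list_keys_unsorted", "Q"),  # 64-bit keys, unsorted
    ]
    arrays = {}
    offset = 0
    for name, code in layout:
        offset = align_up(offset)
        fmt = f"<{num_rendered}{code}"  # little-endian, num_rendered items
        arrays[name] = list(struct.unpack_from(fmt, buf, offset))
        offset += struct.calcsize(fmt)
    return arrays
```

With NumPy, the `struct.unpack_from` call could be replaced by `np.frombuffer(buf, dtype=..., count=num_rendered, offset=offset)`, which avoids building Python lists for large buffers.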
Thanks again for the info so far.
Now let's say that I got the buffers from binning_state. Actually I found out that numpy is more useful for converting from byte arrays to the correct dtype.
Now, in a comment above you mentioned that binning_state.point_list holds the gaussians that were aggregated to render each pixel. I assume that the integers in point_list are the indices of the means (_xyz) in GaussianModel.
But how can one associate a gaussian's id with a pixel? Is this the association (or mapping) made by point_list_keys, i.e. are point_list_keys pixel ids?
Please correct me if I am not thinking correctly here
Bump @kwea123
@ankarako Were you able to extract this mapping?
Hey, I have the same question. Did anyone find a way to get the list of gaussian points that are used for rendering each pixel?
So the unit of radii is pixels. Also, if a radii value is 0, does that mean the Gaussian is not visible from the viewpoint (i.e. it is not in the viewing frustum)? And do the Gaussians whose radii value is not 0 include all Gaussians in the viewing frustum, or only a kind of first layer of visible Gaussians?
@ankarako The point_list_keys correspond to [ tile | depth ] (see rasterizer_impl.cu). The IDs are in point_list. (see rasterizer_impl.cu, comment in duplicateWithKeys)
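A small sketch of how the [ tile | depth ] keys could be turned into a per-pixel candidate list on the Python side. It assumes the upper 32 bits of each key hold the tile index (row-major over the tile grid), the lower 32 bits hold the raw bits of the float32 depth, and tiles are 16x16 pixels; these should be checked against duplicateWithKeys and the block-size constants in the rasterizer. All function names are hypothetical. Note that the result is the list of gaussians binned to the pixel's tile, so it is a superset: as far as I understand, the forward pass can still skip gaussians with negligible alpha at that pixel, or stop early once the pixel is saturated.

```python
import struct
from collections import defaultdict

BLOCK_X = BLOCK_Y = 16  # assumed tile size in pixels


def decode_key(key):
    """Split a 64-bit key into (tile_id, depth)."""
    tile_id = key >> 32
    depth_bits = key & 0xFFFFFFFF
    # Reinterpret the low 32 bits as a float32 depth value.
    (depth,) = struct.unpack("<f", struct.pack("<I", depth_bits))
    return tile_id, depth


def gaussians_per_tile(point_list, point_list_keys):
    """Group gaussian ids by tile, keeping the (depth-sorted) input order."""
    tiles = defaultdict(list)
    for gaussian_id, key in zip(point_list, point_list_keys):
        tile_id, _ = decode_key(key)
        tiles[tile_id].append(gaussian_id)
    return tiles


def candidates_for_pixel(px, py, image_width, tiles):
    """Return the ids binned to the tile that contains pixel (px, py)."""
    tiles_x = (image_width + BLOCK_X - 1) // BLOCK_X  # tiles per row
    tile_id = (py // BLOCK_Y) * tiles_x + (px // BLOCK_X)
    return tiles.get(tile_id, [])
```

Here point_list and point_list_keys are the sorted arrays, so the ids inside each tile come out front to back. The same gaussian id can appear under several tiles when its splat overlaps more than one of them.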
@zhangfuyang @kwea123 @ankarako does anyone have a working codebase that can achieve this? Thanks so much!
I don't, for one. But you can check NeRFStudio's gsplat implementation.
@zhangfuyang @kwea123 @ankarako @LoickCh Hi, does anyone have working code to get the list of gaussian points that are used for rendering each pixel? Thank you!